Commun. Math. Phys. 235, 1–45 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0778-0
Communications in
Mathematical Physics
Passive Advection and the Degenerate Elliptic Operators Mn Ville Hakulinen Department of Mathematics, P.O. Box 4, 00014 University of Helsinki, Finland. E-mail:
[email protected] Received: 25 August 2001 / Accepted: 30 September 2002 Published online: 20 January 2003 – © Springer-Verlag 2003
Abstract: We prove estimates for the stationary state n-point functions at zero molecular diffusivity in the Kraichnan model [13]. This is done by proving upper bounds for the heat kernels and Green’s functions of the degenerate elliptic operators Mn that occur in the Hopf equations for the n-point functions. 1. Introduction The Kraichnan model of passive advection is an exactly solvable model that has a very similar phenomenology to the full Navier-Stokes turbulence, but is much simpler in many respects. I’ll only give a very short summary for the reader. More detailed introductions to the problem we are addressing can be found e.g. in [9] and [14]. See also [7, 15 and 16]. Let T (t, x) ∈ R, x ∈ Rd be a scalar quantity satisfying ∂t T = κT − v · ∇T + f.
(1.1)
In the Kraichnan model we take v and f random, decorrelated in time, independent and Gaussian with mean zero and covariances v α (t1 , x1 )v β (t2 , x2 ) = D αβ (x1 − x2 )δ(t1 − t2 ) and
(1.2)
f (t1 , x1 )f (t2 , x2 ) = C(x1 − x2 )δ(t1 − t2 ).
(1.3)
Here the v ·∇T should be interpreted in the Stratonovich sense. The incompressibility of the velocity field v is guaranteed by taking kα kβ αβ −ik·x αβ D (x) = e dk, (1.4) D(|k|) δ − 2 k
2
V. Hakulinen
where D is smooth, nonnegative and of compact support in (0, ∞). A D that mimics turbulent velocities is 1 −(d+ξ ) D(|k|) = |k| χ |k|η + (1.5) |k|
with χ smooth, χ = 1 in a neighbourhood of the origin and χ (x) = 0 for x > 1. The idea is that D behaves like |x|ξ in the so-called inertial range η << |x| << . The number η is called the Kolmogorov scale and is called the inertial scale. We let ˜ C˜ ∈ C0∞ (Rd ) with a nonnegative Fourier transform and C := C(·/L), with L > 0. One is interested in the statistics of T (t, x) as t → ∞. Let Fn (t, x1 , . . . , xn ) := T (t, x1 ) . . . T (t, xn ).
(1.6)
Given (1.2) and (1.3) the n-point functions Fn of the scalar T obey the so-called Hopf equations (see [16]): ∂t Fn (t, x1 , . . . , xn ) = −Mn Fn (t, x1 , . . . , xn ) + Fn−2 (t, x1 , . . ., xn )C(xi − xj ), iˆjˆ
1≤i<j ≤n
(1.7)
with
Mn := −
D αβ (xi − xj )
1≤i<j ≤n 1≤α,β≤d
∂2 β
∂xiα ∂xj
−κ
i .
(1.8)
1≤i≤n
The fact that the Hopf equation for Fn does not contain Fm with m > n makes it easy to solve these equations inductively. The situation here differs drastically from full Navier-Stokes turbulence, where the Hopf equation for Fn contains also Fn+1 . Mn is an elliptic operator and in terms of its heat kernel Fn (with zero initial condition for simplicity) is given by t F2 (t, x) = ds dy e−(t−s)M2 (x, y)C(y1 − y2 ), (1.9) t0
F2n (t, x) =
t
ds
dy e−(t−s)M2n (x, y)
1≤i<j ≤2n t0
·F2n−2 (s, y1 , . . ., y2n )C(yi − yj ) dy iˆjˆ
with vanishing odd correlators. As t0 → −∞ these have the stationary limit F2 = dy (M2 )−1 (x, y)C(y1 − y2 ),
F2n =
1≤i<j ≤2n
(M2n )−1 (x, y)F2n−2 (y1 , . . ., y2n )C(yi − yj ) dy. iˆjˆ
(1.10)
(1.11)
(1.12)
Passive Advection and the Degenerate Elliptic Operators Mn
3
One is interested in the study of F2n for η small, large, κ small and L large. In this paper we prove bounds for these directly in the limit η = 0, = ∞ and κ = 0 with fixed L, say L = 1. Our methods also allow the study of the limit η → 0, → ∞ and κ → 0 [12]. A comment on D is now in place. While sending η → 0 and → ∞ in D, we get into trouble with , since D diverges as → ∞. Fortunately it doesn’t matter: Let kα kβ d αβ (x) := dk (1 − eik·x )D(|k|) δ αβ − 2 . (1.13) k Now (1.8) can be written in the following form: Mn :=
d αβ (xi − xj )
1≤i<j ≤n 1≤α,β≤d
∂2 β
∂xiα ∂xj
− κ − D0
1≤i≤n 1≤α≤d
∂ ∂xiα
2 . (1.14)
In (1.7) Mn acts on translationally invariant functions, so the last term drops out and the rest has a limit as → ∞. Finally, here’s our main theorem, proved directly at η = 0, = ∞ and κ = 0: Theorem 1.1. F2n (x) ≤ Cn
(1 + |xi − xj |)2−ξ −d ,
(1.15)
π {i,j }∈π
where the sum is over pairings of {1, . . . , 2n}. 2. Preliminaries This section fixes the notation and discusses the results from other papers ([3, 6, 11, 19]) used in this paper. There is an overview of this paper in Sect. 3, so the reader might want to start there. 2.1. Degenerate elliptic operators in divergence form. Let ⊂ Rn be a domain. We shall be interested in second order differential operators in divergence form, i.e. in operators H of the form H = −∇ · A∇, where A is a locally square integrable function from to real symmetric positive n × n matrices with locally square integrable distributional 1,2 derivative, i.e. A ∈ Wloc ( , Mn ). One can make sense of more general operators, but this is not relevant to the results presented in this paper. Definition 2.1. Let H and A be as above. The matrix A is called the symbol of H , and we denote σ (H ) := A. The function w1H (x) := inf v∈Sn−1 v, σ (H )(x)v (resp. w2H (x) := supv∈Sn−1 v, σ (H )(x)v) is called the greatest lower bound (resp. least upper bound) of the symbol. We shall also use σ (H ) to denote the quadratic form v, A(x)v. The usage will be clear from the context. We often speak loosely and forget the attributes “greatest” and “lowest” from the bounds.
4
V. Hakulinen
If A and B are two symbols and U ⊆ Rm , we shall denote A ∼λ B on U , if λA ≤ B ≤ λ−1 A a.e. on U . If there is λ > 0 so that A ∼λ B on U we also say A ∼ B on U . If “on U ” is dropped, we refer to whole Rm . We shall use ? to denote the identity matrix. Thus a symbol A is uniformly elliptic iff A ∼ ? . Moreover, if A and B are symbols on Rn1 and Rn2 , then A ⊕ B is just the natural symbol on Rn1 +n2 . Definition 2.2. Let w be a nonnegative locally integrable function (a weight) defined on Rn . We denote w(A) := A w(x)dx. The function w is called a doubling weight (resp. an A2 -weight), if there is a constant C such that for every ball B ⊂ Rn we have 1 −1 (B) ≤ C). w(2B) ≤ Cw(B) (resp. |B| 2 w(B)w Since by Schwartz inequality |B|2 ≤ w(B)w −1 (B), we have |2B|2 = 22n |B|2 ≤ ≤ 22n w(B)w −1 (2B), so we can conclude that an A2 -weight is also a doubling weight. Definition 2.3. Denote uB := |B|−1 B u(x) dx and let w1 , w2 be weights on Rn and let q > 2. We say that the Poincar´e inequality (resp. Sobolev inequality) holds for w1 , w2 with q, if there is C < ∞ so that for every ball B ⊆ Rn and u ∈ W 1,2 (B) (resp. u ∈ W01,2 (B)) we have 1/q 1/2 |u − uB |q w2 dx ≤ C|B|1/n w1 (B)−1 |∇u|2 w1 dx , w2 (B)−1 22n w(B)w −1 (B)
B
(resp.
w2 (B)−1
B
1/q
|u|q w2 dx B
1/2 ≤ C|B|1/n w1 (B)−1 |∇u|2 w1 dx ).
(2.1)
(2.2)
B
Theorem 2.4 (Harnack inequality). Suppose H := −∇ · A∇ is a divergence form operator with w1 ≤ A ≤ w2 and suppose that the weights w1 and w2 satisfy the following: 1. w1 and w2 are in A2 , 2. The Poincar´e inequality holds for w1 , w2 with some q > 2 and 3. The Poincar´e inequality holds for w1 , 1 with some q > 2. Let t0 , . . . , t4 ∈ R with t0 < . . . < t4 , ⊆ Rn open and K ⊆ compact and connected. Let u be a strictly positive solution to ut + H u = 0 in × (t0 , t4 ). Then there is a constant C < ∞ depending on , K and t0 , . . . , t4 , but on A only through the bounds w1 and w2 so that ess supK×(t1 ,t2 ) u ≤ C ess inf K×(t3 ,t4 ) u.
(2.3)
Proof. This is just Theorem A of [11] supplemented with a covering argument from [17], pp. 734–736. Remark 2.5. For the purposes of Theorem 2.4 the concept of u being a solution of ut + H u = 0 on Q := × (t0 , t4 ) means exactly the following: 1. u ∈ L2 (Q), 2. ut ∈ L2 (Q),
Passive Advection and the Degenerate Elliptic Operators Mn
5
3. |∇u|2 w2 ∈ L1 (Q), and 4. For all φ ∈ C01 (Q) we have Q
ut φ + A∇u, ∇φ dx dt = 0.
(2.4)
We are going to apply the Harnack inequality only to heat kernels of some degenerate elliptic operators. In particular as long as t0 > 0 all the above items will hold. Since the heat kernel is a positive distribution, it is a measure and (4) follows from the fact that the heat kernel is a distributional solution of the corresponding degenerate heat equation. First of all (1) holds because for t0 > 0 the heat kernel is a bounded function on × (t0 , t4 ) (by Corollary 4.22). Secondly (2) holds because of the following computation which is justified by Remark 2.8: (∂t K)(s, ·, y) = −H K(s, ·, y) = −e−(s−t0 )H H e−t0 H /2 K
t0 , ·, y . 2
(2.5)
Now since by Remark 2.7 e−tH is a contraction on L2 , sup ||∂t K(s, ·, y)||2 < ∞.
(2.6)
s∈(t0 ,t4 )
Let A be the symbol of H . To prove (3) it suffices to show that |∇K| is locally in L2 , since w2 is locally bounded. Since |∇K|2 dx dt ≤ w1−1 A∇K, ∇K dx dt. (2.7) Q
Q
Since w1 is in A2 (by Lemma A.2), w1−1 is locally integrable, so it suffices to prove that A∇K, ∇K is essentially bounded on Q. We show that for any 0 ≤ φ ∈ C0∞ (Q) we have φA∇K, ∇K ≤ C φ, (2.8) Q
Q
with C not depending on φ. So we compute using the facts that K and ∇ · A∇K are locally bounded: φA∇K, ∇K = K∇ · φA∇K Q Q ≤ C A∇φ, ∇K + C φ∇ · A∇K Q Q ≤ 2C φ∇ · A∇K ≤ C φ. Q
(2.9)
Q
It follows from the results in Sect. 4.2 and Appendix A that this Harnack inequality holds for the operators Mn , which will be our main interest and will be defined in Sect. 2.3.
6
V. Hakulinen
2.2. Gaussian upper bounds for heat kernels. The material in this section is mostly taken from [6]. For more information, see Sects. 1.3, 2.4 and 3.2 there. See also [4 and 19]. Definition 2.6. Let H ≥ 0 be a real self-adjoint operator on L2 (Rn ). We call the semigroup e−H t a symmetric Markov semigroup, if it is positivity-preserving and a contraction on L∞ (Rn ). Remark 2.7. By saying that e−H t is a contraction on Lp with p = 2 we mean that e−H t is a contraction on Lp ∩ L2 and can be extended to a unique contraction on Lp . In the case of L∞ we have to impose the extra condition of weak* continuity to achieve uniqueness since L∞ ∩ L2 is not norm dense in L∞ . Remark 2.8. A symmetric Markov semigroup is strongly continuous on Lp with 1 ≤ p < ∞, see Theorem 1.4.1 of [6]. This in particular implies that the generator H commutes with the semigroup e−H t (see [4]). By Theorem 1.3.5 of [6], any self-adjoint divergence form operator with non-negative symbol and core C0∞ (Rn ) gives rise to a symmetric Markov semigroup. The theorem there is stated for “elliptic” operators, but the proof works for any non-negative symbol. The keywords here are self-adjointness and core C0∞ . Both follow for Mn from the fact 1,2 (R(n−1)d ) (Proposition 4.3). See Theorem 1.2.5 of [6]. that σ (Mn ) ∈ Wloc Definition 2.9. Let e−H t be a symmetric Markov semigroup on L2 (Rn ). We say that e−H t is ultracontractive if the map e−H t is bounded from L2 to L∞ for every t > 0. Definition 2.10. Suppose that C0∞ (Rn ) ⊆ Dom(H ). Let e−H t be a symmetric Markov semigroup on L2 (Rn ). We say that e−H t (or H or σ (H )) is of dimension µ if there is C2 < ∞ such that for all t > 0 and f ∈ L2 (Rn ) we have ||e−H t f ||∞ ≤ C2 t −µ/4 ||f ||2 .
(2.10)
Note that the dimension of a semigroup need not be unique. There is a standard method for obtaining global Gaussian upper bounds for heat kernels of divergence form operators with nonnegative symbols using global spaceindependent bounds. A good reference for this is [4]. Definition 2.11. Let A be a symbol on Rn . The function dA (x, y) := sup |φ(x) − φ(y)| : φ is C ∞ and bounded with ∇φ, A∇φ ≤ 1 on Rn
(2.11)
is called the metric associated with A (or H , if H := −∇ · A∇ or e−tH or the heat kernel of H ). The following theorem was proved by Davies [4]. Theorem 2.12. Let µ be a positive real number. Suppose H := −∇ · A∇ ≥ 0 is a positive self-adjoint divergence form operator with e−H t a symmetric Markov semigroup of dimension µ. Then for each δ > 0 there is Cδ < ∞ such that the heat kernel K of e−H t satisfies
dA (x, y)2 −µ/2 exp − (2.12) 0 ≤ K(t, x, y) ≤ Cδ t 4(1 + δ)t for all 0 < t < ∞ and x, y ∈ Rn . Besides δ, Cδ depends only on µ and the constant C2 of Definition 2.10.
Passive Advection and the Degenerate Elliptic Operators Mn
Proof. See [4].
7
We shall use the following theorem later to get the dimension of Mn in Corollary 4.22. It was proved by Varopoulos [19]. John Nash [18] also proved a similar result. Theorem 2.13. Suppose C0∞ (Rn ) ⊆ Dom(H ). Let e−H t be a symmetric Markov semigroup on L2 (Rn ) and let µ > 2 be given. Then there is C1 < ∞ such that for all f ∈ C0∞ (Rn ) we have ||f ||22µ/(µ−2) ≤ C1 f, Hf ,
(2.13)
if and only if there is C2 < ∞ such that for all t > 0 and f ∈ L2 (Rn ) we have ||e−H t f ||∞ ≤ C2 t −µ/4 ||f ||2 .
(2.14)
Here the constants C1 and C2 depend only on each other and the number µ. Proof. This is just Theorem 2.4.2 of [6].
Remark 2.14. One can show using the Schwartz Kernel and Radon-Nikodym Theorems that a bounded linear map L : L1 → L∞ has an integral kernel that is a function in L∞ whose L∞ -norm equals the operator norm of L. Since our e−H t is self-adjoint, boundedness of e−H t : L2 → L∞ implies boundedness of e−H t : L1 → L2 , so in this case we have a heat kernel that is a genuine function. Finally, we give a nice way to estimate heat kernels of operators H that have symbols satisfying σ (H ) ∼ A1 ⊕ A2 . Theorem 2.15. Suppose that for i = 1, 2, Ai is a symbol on Rni such that et∇·Ai ∇ is a symmetric Markov semigroup on L2 (Rni ) and B ∼λ A1 ⊕ A2 . Suppose also that the heat kernels of Ai ’s satisfy
µi dA (x, y)2 . (2.15) KAi (t, x, y) ≤ Ci t − 2 exp − i Ci t Then there is C < ∞ depending only on C1 , C2 , µ1 , µ2 and λ so that the heat kernel of B satisfies
µ +µ dA1 (x, y)2 + dA2 (x, y)2 − 12 2 exp − . (2.16) KB (t, x, y) ≤ Ct Ct Proof. Since KA1 ⊕A2 (t, (x1 , x2 ), (y1 , y2 )) = KA1 (t, x1 , y1 )KA2 (t, x2 , y2 ), we can conclude that KA1 ⊕A2 ≤ C1 C2 t −
µ1 +µ2 2
,
(2.17)
which by the Riesz-Thorin interpolation theorem and the fact that et∇·A1 ⊕A2 ∇ is a contraction L∞ imply (2.14) for H = −∇ · A1 ⊕ A2 ∇. Therefore by Theorem 2.13 ||f ||22µ/(µ−2) ≤ C3 ∇f, (A1 ⊕ A2 )∇f
(2.18)
for any f ∈ C0∞ (Rn1 +n2 ) with C3 depending only on C1 C2 and µ1 +µ2 . Since A1 ⊕A2 ≤ λ−1 B, we have KB ≤ C4 t −
µ1 +µ2 2
,
(2.19)
with C4 depending only on C1 C2 , µ1 + µ2 and λ. We now apply Theorem 2.12 to conclude the claim.
8
V. Hakulinen
2.3. The definition of the operators Mn . For the rest of the paper, we fix a constant ξ , 0 < ξ < 2 and an integer d ≥ 2. Here d is the dimension of the “physical” space. Next, we overload the symbol d immediately and let d be the map from R d to d × d matrices defined by 1 − cos(k · x) ˆ dk, d(x) := C (1 − kˆ ⊗ k) (2.20) |k|d+ξ Rd with C :=
(4π)d/2 2ξ ξ ((d + ξ + 2)/2) . (d − 1)((2 − ξ )/2)
(2.21)
A computation (see e.g. [8]) shows that ξ ξ ? − xˆ ⊗ xˆ . d(x) = |x|ξ 1 + d −1 d −1
(2.22)
In the following definition, we denote vectors in Rnd by {vi }ni=1 , where each vi is a vector in Rd . sc Definition 2.16. Let n ≥ 2. The operator Msc n := −∇ · σ (Mn )∇ is the one with the symbol σ (Msc vi , d(xi − xj )vj . (2.23) n ) := − 1≤i<j ≤n
If a ∈ Rd , we denote the vector (xi +a)ni=1 by x+a. We call a function f : Rnd → R translationally invariant, if for every a ∈ Rd and x ∈ Rnd we have f (x) = f (x + a). We shall be interested in Msc n acting on translationally invariant functions, so we need to reduce the number of total space dimensions to (n − 1)d. In other words, we set xi := yi − yi+1 for 1 ≤ i ≤ n − 1, so ∂ if i = 1, α ∂x 1 ∂ ∂ ∂ if 2 ≤ i ≤ n − 1 and = (2.24) α − ∂x α α ∂x ∂yi i i−1 ∂ if i = n. − α ∂xn−1 Denote the symbol obtained in this way by σ (Mn ). A simple calculation shows that σ (Mn ) equals j j −1 j j −1 n−1 n−1 vi , d xk − d xk − d xk + d xk vj . i=1 j =i
k=i
k=i
k=i+1
k=i+1
(2.25) In particular, σ (M2 ) = v1 , d(x1 )v1 ,
(2.26)
Passive Advection and the Degenerate Elliptic Operators Mn
9
σ (M3 ) = v1 , d(x1 )v1 + v2 , d(x2 )v2 +v1 , (d(x1 + x2 ) − d(x1 ) − d(x2 ))v2 ,
(2.27)
and σ (M4 ) = v1 , d(x1 )v1 + v2 , d(x2 )v2 + v3 , d(x3 )v3 v1 , (d(x1 + x2 ) − d(x1 ) − d(x2 ))v2 v2 , (d(x2 + x3 ) − d(x2 ) − d(x3 ))v3 v1 , (d(x1 + x2 + x3 ) − d(x1 + x2 ) − d(x2 + x3 ) + d(x2 ))v3 .
(2.28)
3. Overview Our intent here is to give some intuition on the arguments of this paper and how they lead to the proof of Theorem 1.1. What is obvious at first sight, is that if Theorem 1.1 is to hold, the Green’s functions of the operators M2n should be locally integrable in the sense that for every n ≥ 2 there is C < ∞ so that for every x ∈ R(2n−1)d we have d (2n−1)d y GM2n (x, y) < C. (3.1) B(x,1)
One might hope to get (3.1) to hold using the heat kernel estimate of Theorem 2.12, but unfortunately this direct approach fails. First of all we see that σ (M2 ) ∼ | · |ξ . Applying Definition 2.11, Corollary 4.22 and Theorem 2.12 to this, we find a C < ∞ such that
d |x − y|2 − 2−ξ KM2 (t, x, y) ≤ Ct (3.2) exp − Ct for |x| = 1 and |x − y| ≤ 21 . Integrating with respect to t from 0 to ∞ we get 2d
GM2 (x, y) ≤ C |x − y|2− 2−ξ .
(3.3)
2d 4 This estimate yields (3.1) only when 2 − 2−ξ > −d, that is ξ < d+2 . We might be satisfied with the fact that (3.1) holds only for small ξ , but there is worse to come: For each σ (Mn ) will have points x ∈ S(n−1)d−1 so that σ (Mn ) ∼ ? in a neighbourhood U of x. A similar argument as above now yields
GM2n (x, y) ≤ C |x − y|2−
2(n−1)d 2−ξ
(3.4)
4 for y ∈ U . This yields (3.1) for Mn only when ξ < (n−1)d+2 , which means trouble: Given ξ with 0 < ξ < 2, there will always be some N so that our argument above fails to give local integrability for Mn with n ≥ N. On the other hand, since M2 is uniformly elliptic in a neighbourhood U of x, the heat kernel of M2 should behave like the heat kernel of the Laplacian for small times and small distances from x. Turning this analysis into formulas let’s suppose
|x − y|2 − d2 KM2 (t, x, y) ≤ C2 t exp − (3.5) C2 t
10
V. Hakulinen
for |x| = 1, |x − y| ≤ ≤ t
d − 2−ξ
≤ C3 t
− d2
1 2
and 0 < t ≤ t0 . Since there is C3 < ∞ so that
for t ≥ t0 , we can combine (3.2) with (3.5) and conclude that
d |x − y|2 KM2 (t, x, y) ≤ C4 t − 2 exp − C4 t
(3.6)
for |x| = 1, |x − y| ≤ and 0 < t < ∞. Now an integration w.r.t. t from 0 to ∞ yields GM2 (x, y) ≤ C5 |x − y|2−d
(3.7)
for |x| = 1 and |x − y| ≤ . The same holds for Mn with n > 2. This leads us to a further twist: for n > 2, σ (Mn ) has degeneracies also outside of the origin, but fortunately in the end these turn out not to be problematic. A few words on the structure of the rest of the paper. In Sect. 4 the symbols of Mn are analyzed in detail. The local analysis of the heat kernels is done in Sect. 5. Theorem 1.1 is proved in Sect. 6. Finally, there are three appendices containing technicalities. 4. The Operators Mn From now on, we live in R(n−1)d and denote vectors of R(n−1)d with v = (vi )n−1 i=1 and n−1 d x = (xi )i=1 , where vi , xi ∈ R . The symbol of Mn has a bunch of useful symmetries, inherited from Msc n . For k l k L : R → R a surjective linear mapping and A a symbol on R which for all x ∈ Rk is constant on {x} + ker L denote AL (x) := LA(L−1 x)LT , where L−1 is some rightinverse of L. Let Ln : Rnd → R(n−1)d be given by the matrix (Ln )ij := δij − δi+d,j , so that σ (Mn ) = σ (Msc )Ln We let nd Ln = {Ln LL−1 n : L is a permutation of the coordinate axes of R }.
(4.1)
Now σ (Mn )L = σ (Mn ) for every L ∈ Ln . Remark 4.1. Let A1 and A2 be two symbols on Rk and let G1 , G2 ⊆ GL(Rk ) be their respective symmmetry groups, i.e Gi := {L ∈ GL(Rk ) : AL i = Ai },
(4.2)
for i ∈ {1, 2}. Now if A1 ∼ A2 on U , then A1 ∼ A2 on LU for any L ∈ G1 ∩ G2 . Remark 4.2. A simple calculation shows that Mn is degenerate, whenever bi=a xi = 0, where 1 ≤ a ≤ b ≤ n − 1. In fact these are the only points where Mn degenerates, as we see in Theorem 4.7. To avoid lengthy statements in the rest of the paper, we denote {x ∈ R(n−1)d : xi = 0} by {xi = 0} and similarly for the other sets. Proposition 4.3. 1,2 (R(n−1)d ). σ (Mn ) ∈ Wloc
(4.3)
Passive Advection and the Degenerate Elliptic Operators Mn
11
Proof. The case 1 < ξ < 2 is an easy computation, since then σ (Mn ) is continuously differentiable. In case 0 < ξ ≤ 1, we let
F :=
b
1≤a≤b
xi = 0 .
(4.4)
i=a
A relatively simple calculation shows that there is C < ∞ such that |∇(σ (Mn ))(x)| ≤ Cd(x, F )ξ −1 .
(4.5)
Since F is a finite union of vector subspaces of codimension d ≥ 2, we can conclude that d(x, F )ξ −1 is a locally square integrable function. Remark 4.4. It is trivial to get an upper bound for Mn : σ (Mn ) ≤
sup w, σ (Mn )(y)w |x|ξ |v|2 .
(4.6)
|y|=|w|=1
We obtain a better upper bound in Sect. 4.2. Proposition 4.5. For any ∈ (0, 1) there is C < ∞ such that dσ (Mn ) (x, y) ≤ C|x − y|1−ξ/2 ,
(4.7)
when |x − y| ≥ |x|. Proof. By Definition 2.11 and Remark 4.4 it suffices to show that there is C < ∞ such that d|·|ξ (x, y)2 ≤ C|x − y|2−ξ , when |x − y| ≥ |x|. Trivial dimensional analysis gives y ˆ |x| ). Therefore we may assume |x| = 1. By rotational d|·|ξ (x, y) = |x|1−ξ/2 d|·|ξ (x, symmetry, we may fix x. By scaling, there is C < ∞ so that C |y|1−ξ/2 = d|·|ξ (0, y). Since now d|·|ξ (x, y)2 |x − y|2−ξ
= C
d|·|ξ (x, y)2 d|·|ξ (0, x − y)2
(4.8)
,
it suffices to show that f (R) := sup|x−y|=R d|·|ξ (x, y)/d|·|ξ (0, x − y) is a bounded function of R for R ∈ [, ∞). Obviously f is continuous. By continuity of d|·|ξ we have d|·|ξ (x, y)2 d|·|ξ (0, x − y)2 as |x − y| → ∞.
=
d|·|ξ
y x |x−y| , |x−y|
d|·|ξ (0, x − y)2
2 →
ˆ 2 d|·|ξ (0, y) d|·|ξ (0, y) ˆ 2
=1
(4.9)
12
V. Hakulinen
4.1. Fourier integral representation and the degeneration set. Definition 4.6. Let A be a symbol. We call the set Dgn(A) := {x ∈ R n : A(x) is not invertible}
(4.10)
the degeneration set of A. The following Fourier integral representation of the symbol is crucial for the computation of the degeneration sets of Mn (which then implies corresponding properties for the operators Mn to be introduced later). Theorem 4.7. The degeneration set of Mn is {x ∈ R (n−1)d : |xi + . . . + xj | = 0}. Dgn(Mn ) =
(4.11)
1≤i≤j
n nd Proof. By Remark 4.2, it suffices to show that for every v ∈ R with i=1 vi = 0 we have − 1≤i<j ≤n vi , d(xi − xj )vj > 0 whenever xi = xj for all 1 ≤ i < j ≤ n. We have vi , d(xi − xj )vj − 1≤i<j ≤n
=−
1 2
vi , d(xi − xj )vj
1≤i,j ≤n
1 − eik·(xi −xj ) C =− Re vi , (? − k ⊗ k)vj dk 2 Rd |k|d+ξ n 1≤i,j ≤n n C ik·xi ? − k ⊗ k ik·xi = dk. (4.12) Re vi e , vi e 2 Rd |k|d+ξ i=1
i=1
The rest goes as in Proposition 1 of [8]: For the integral to be zero, we have to have n
vi eik·xi = α(k)k
(4.13)
i=1
almost everywhere for some scalar function α. Taking the exterior product (i.e. the antisymmetric part of the tensor product) with respect to k and Fourier transforming in the sense of distributions we arrive at n
vi ∧ ∇δ(x − xn ) = 0.
(4.14)
i=1
Thus for any smooth test function φ n
vi ∧ ∇φ(xn ) = 0.
(4.15)
i=1
This is a contradiction since the values of ∇φ can be arbitrarily specified on a discrete set and the xn ’s are all distinct.
Passive Advection and the Degenerate Elliptic Operators Mn
13
4.2. Estimates for the symbol of Mn . We shall now show that the symbol of Mn can be estimated using the symbols of Mm , m ∈ {2, . . . , n − 1}. Definition 4.8. Let x ∈ R(n−1)d . The dimension of the zero eigenspace of σ (Mn ) at x divided by d is called the rank of the point x and denoted rk(x). In particular x is a degeneration point of σ (Mn ) iff rk(x) > 0. Below, for a symbol A and invertible linear transformation L we define the symbol AL by the formula AL (x) := LA(L−1 x)LT . Theorem 4.9. Let n ≥ 2 and x ∈ S(n−1)d−1 . Then either Mn is uniformly elliptic in some neighbourhood of x or there is a invertible linear transformation L of R(n−1)d , a neighbourhood U of Lx so that σ (Mn )L ∼ ki=1 σ (Mni ) ⊕ ? on U with k ≥ 1, each k nk ≥ 2, rk(x) = i=1 (nk − 1) < n − 1 and (Lx)i = 0 for 1 ≤ i ≤ kj =1 (nk − 1). Let’s introduce some convenient notation at this point. First of all [i, j ] := {i, . . . , j }. Let A ⊆ [1, n]. Then we write xi , xA := i∈A
γA := vmin A , (d(xA ) − d(xA\{min A} ) − d(xA\{max A} ) + d(xA\{min A,max A} ))vmax A , γA∩[i,j ] . σA :=
(4.16)
i,j ∈A;i≤j
Moreover xi,j := x[i,j ] , σi,j := σ[i,j ] , γi,j := γ[i,j ] and σi := γi := γ{i} . 4.3. Two propositions for the proof of Theorem 4.9. Our purpose here is to prove Proposition 4.10 and Proposition 4.17. Let us illustrate what we’re going to do by studying σ (M3 ) in some detail. Let x ∈ S2d−1 be such that x1 = 0, i.e. x = (x1 , x2 ) with x2 ∈ Sd−1 . We’ll show that there is a neighbourhood U of x and C < ∞ so that for every y ∈ U we have 1 |y1 |ξ |v1 |2 + |y2 |ξ |v2 |2 ≤ σ (M3 )(y) ≤ C |y1 |ξ |v1 |2 + |y2 |ξ |v2 |2 . C Let E be given by Lemma 4.11 and let ∈ 0, 41 be such that 1 E (2)1−ξ/2 + (2)ξ/2 ≤ 2 and let
(4.17)
(4.18)
3 1 < |y2 | < . 2 2
(4.19)
1 (|y1 |ξ |v1 |2 + |y2 |ξ |v2 |2 ) 2
(4.20)
U := B(0, ) ×
By our choice of we have |γ1,2 | ≤
in U . In other words (4.17) holds and thus σ (M3 ) ∼ σ (M2 ) ⊕ ? on U .
14
V. Hakulinen
Proposition 4.10 will be used when we have several (or all) coordinates away from the degeneration set. As might be guessed from our calculation with σ (M3 ), the point of Lemmata 4.11–4.14 is that in the proof of Theorem 4.9 we need to have estimates for the crossterms with the flavor |γi,j | ≤ something · (|xi |ξ |vi |2 + |xj |ξ |vj |2 ).
(4.21)
We have neatly blackboxed all this mess into Proposition 4.17; the lemmata of this section are not directly used in the proof of Theorem 4.9. The proofs can be found in Appendix B. Proposition 4.10. Suppose n ≥ 1, ∈ (0, 1) and let A := {x ∈ Rnd : max{|xi,j | : 1 ≤ i ≤ j ≤ n} ≤ min{|xi,j | : 1 ≤ i ≤ j ≤ n}}. (4.22) Then there is C < ∞ so that for every x ∈ A we have n n 1 |xi |ξ |vi |2 ≤ σ (Mn+1 ) ≤ C |xi |ξ |vi |2 . C i=1
(4.23)
i=1
Lemma 4.11. There is a constant E < ∞ such that if 1 ≤ i < n and |xi | < 21 |xi+1 |, then |vi , (d(xi + xi+1 ) − d(xi ) − d(xi+1 ))vi+1 | 1−ξ/2 ξ/2 |xi | |xi | ≤E (|xi |ξ |vi |2 + |xi+1 |ξ |vi+1 |2 ). (4.24) + |xi+1 | |xi+1 | Lemma 4.12. There is a constant E < ∞ such that if 1 ≤ i < i + 1 < j ≤ n, |xi | < 21 min{|xi+1,j |, |xi+1,j −1 |} and |xj | > 0, then |vi , (d(xi,j ) − d(xi+1,j ) − d(xi,j −1 ) + d(xi+1,j −1 ))vj | 1−ξ/2 1−ξ/2 |xi+1,j | ξ/2 |xi+1,j −1 | ξ/2 |xi | |xi | + ≤E |xi+1,j | |xi+1,j −1 | |xj | |xj | ·(|xi |ξ |vi |2 + |xj |ξ |vj |2 ).
(4.25)
Lemma 4.13. There is a constant E < ∞ such that if 1 ≤ i < i + 1 < j ≤ n, 1 1 2 |xi+1,j −1 | ≤ |xi | < 2 |xi+1,j | and |xj | > 0, then |vi , (d(xi,j ) − d(xi+1,j ) − d(xi,j −1 ) + d(xi+1,j −1 ))vj | 1−ξ/2 |xi+1,j | ξ/2 |xi | ξ/2 |xi | (|xi |ξ |vi |2 + |xj |ξ |vj |2 ). + ≤E |xi+1,j | |xj | |xj | (4.26) Lemma 4.14. There is E < ∞ so that if 1 ≤ i < i + 1 < j ≤ n and max{|xi |, |xj |} < 1 3 {|xi+1,j −1 |}, we have |vi , (d(xi,j ) − d(xi+1,j ) − d(xi,j −1 ) + d(xi+1,j −1 ))vj | 1−ξ/2 1−ξ/2 |xi | |xi | (|xi |ξ |vi |2 + |xj |ξ |vj |2 ). ≤E |xi+1,j −1 | |xi+1,j −1 | (4.27)
Passive Advection and the Degenerate Elliptic Operators Mn
15
We still have one more lemma to go before we can start proving Proposition 4.17. We’ll illustrate it with σ (M6 ). Let x ∈ S5d−1 with |x1 | = |x3 | = |x5 | = 0 and |x2 |, |x4 |, |x2,4 | > 0. By Proposition 4.10 σ{2,4} (y2 , y4 ) behaves like |y2 |ξ |v2 |2 + |y4 |ξ |v4 |2 in a neighbourhood of (x2 , x4 ). Unfortunately the relevant part of σ (M6 ) is γ2 + γ4 + γ2,4 , but at least we would have some hope, if we could get an estimate of the form |γ2,4 − γ{2,4} | ≤ something · (|y2 |ξ |v2 |2 + |y4 |ξ |v4 |2 )
(4.28)
for y in a neighbourhood of x. This is the point of Lemma 4.15. More precisely, let µ := min{|y2 |, |y4 |, |y2,4 |} ≤ max{|y2 |, |y4 |, |y2,4 |} =: ν µ 2
and let C < ∞ be such that if
(4.29)
< |y2 |, |y4 |, |y2 + y4 | < 2ν we have
1 (|y2 |ξ |v2 |2 + |y4 |ξ |v4 |2 ) ≤ σ{2,4} ≤ C(|y2 |ξ |v2 |2 + |y4 |ξ |v4 |2 ). C
(4.30)
Let ∈ (0, µ6 ) be such that E
2 µ
1−ξ/2
4ν µ
ξ/2
+
2 µ
ξ/2
≤
1 . 2C
(4.31)
Let µ U := |y3 | < and < |y2 |, |y4 |, |y2,4 |, |y2 + y4 | < 2ν . 2
(4.32)
By Lemma 4.15 for y ∈ U we have |γ2,4 − γ{2,4} | ≤
1 (|y2 |ξ |v2 |2 + |y4 |ξ |v4 |2 ). 2C
(4.33)
Combining (4.33) with (4.30) we conclude that γ2 + γ4 + γ2,4 behaves like |y2 |ξ |v2 |2 + |y4 |ξ |v4 |2 in U . Again, the proof of the following lemma can be found in Appendix B. Lemma 4.15. There is E < ∞ such that if 1 ≤ i < j ≤ n and {i, j } ⊆ A ⊆ [i, j ] and if k∈[i,j ]\A |xk | ≤ 21 min{|xk,l | : k, l ∈ A, k ≤ l}, then |γi,j − γA | ≤ E +
k∈[i,j ]\A |xk |
1−ξ/2
min{|xk,l | : k, l ∈ A, k ≤ l} ξ/2 k∈[i,j ]\A |xk | min{|xk,l | : k, l ∈ A, k ≤ l}
k∈A |xk |
ξ/2
|xj |
(|xi |ξ |vi |2 + |xj |ξ |vj |2 ). (4.34)
If L ∈ GL(R(n−1)d ), we shall use the following somewhat weird notation: If x ∈ R(n−1)d , we let Lxi := (Lx)i for 1 ≤ i ≤ n − 1. Similarly, we let Lxi,j := (Lx)i,j for 1 ≤ i ≤ j ≤ n − 1.
16
V. Hakulinen
Remark 4.16. Let x be a degeneration point of σ (Mn ). We claim that there is a symmetry L ∈ Ln and A {1, . . . , n − 1} so that |Lxi | = 0 if i ∈ A and Lxi,j > 0 if {i, . . . , j } ⊆ A. This is easy to see, if we look at the original symbol σ (Msc n ). Then the claim above simply says that if we have points y1 , . . . , yn ∈ Rd , then there is a permutation π ∈ Sn so that if yπ(i) = yπ(j ) with π(i) ≤ π(j ), then yπ(i) = yk with every k with π(i) ≤ k ≤ π(j ). Still in other words: if we pick n possibly coinciding points from Rd , we can label them with numbers 1, . . . , n so that the coinciding points get consecutive numbers as labels. Given x and A as above, write A as {i1 , . . . , j1 } ∪ . . . ∪ {im , . . . , jm }
(4.35)
with i1 ≤ j1 < j1 + 1 < i2 ≤ . . . < im ≤ jm and write σ (Mn ) as σ (Mn ) =
m l=1
σil ,jl + σAc +
γi,j − σAc + the rest.
(4.36)
i,j ∈Ac
Let µ := min{|xi,j | : {i, . . . , j } ⊆ A} and ν := max{|xi,j | : {i, . . . , j } ⊆ A}. Proposition 4.17. For any C > 0 there is a neighbourhood U of x so that n 1 c + the rest ≤ γ − σ |yi |ξ |vi |2 i,j A 2C i,j ∈Ac i=1
(4.37)
for any y ∈ U . Proof. For > 0 let U := {y ∈ Rnd : |yi,j | < if {i, . . . , j } ⊆ A and µ/2 < |yi,j | < 2ν otherwise}. (4.38) Let N := n(n−1) be the number of terms in σ (Mn ). We’ll find > 0 so that each term 2 1 n in (4.37) is ≤ 2NC i=1 |xi |ξ |vi |2 , where we count each γi,j − γAc ∩[i,j ] with i, j ∈ Ac as one term. A (long) moment’s look at Lemmata 4.11–4.15 reveals to us that this is possible. Here’s a list of the requirements for : 1. Lemma 4.11: < 2. Lemma 4.12: < 3. Lemma 4.13: < 4. Lemma 4.14: <
µ 4 µ 4 µ 4 µ 6
5. Lemma 4.15: n <
1−ξ/2 + ( 2 )ξ/2 ) ≤ and E(( 2 µ) µ 1−ξ/2 ( 4ν )ξ/2 ≤ and 2E( 2 µ) µ
1 2NC .
1 2NC . 1−ξ/2 ( 4ν )ξ/2 + ( 2 )ξ/2 ) ≤ 1 . and E(( 2 µ) µ µ 2NC 2 2−ξ 1 and E( µ ) ≤ 2NC . µ 2n 1−ξ/2 4nν ξ/2 ξ/2 ) ≤ 1 . ( µ ) + ( 2n 4 and E(( µ ) µ ) 2NC
Passive Advection and the Degenerate Elliptic Operators Mn
17
4.4. The proof of Theorem 4.9. Proof of Theorem 4.9. We shall prove this theorem by induction on n and we shall accomplish this by proving in parallel that there is a constant C < ∞ so that for any x ∈ R(n−1)d there is K ∈ Ln so that n−1 n−1 1 |Kxi |ξ |vi |2 ≤ σ (Mn ) ≤ C |Kxi |ξ |vi |2 . C i=1
(4.39)
i=1
This is trivial for σ (M2 ). We assume now that the claim above is true for σ (Mm ), 2 ≤ m < n and prove it for σ (Mn ). This is done as follows. For every x ∈ Snd−1 we find a neighbourhood Ux of x so that the claim above holds on Ux with a constant C(x) depending on x . Since Snd−1 is compact, there is a finite set {x1 , . . . , xk } so that ! Snd−1 ⊆ ki=1 Uxk , so the claim above will then hold with C = max1≤i≤k C(xi ). If x is not a degeneration point of Mn+1 , then by Proposition 4.10 the estimate above can be satisfied in a neighbourhood of x with K = ? , so we assume x is a degeneration point. We now apply the symmetry discussed in Remark 4.16, so we can assume there is nonempty A {1, . . . , n} so that |xi | = 0 if i ∈ A and |xi,j | > 0 if {i, . . . , j } ⊆ A. Write A as {i1 , · · · , j1 } ∪ . . . ∪ {im , . . . , jm } with i1 ≤ j1 < j1 + 1 < i2 < · · · < im ≤ jm . Denote Ac := {1, . . . , n} \ A. We may even assume that i1 = 1 and if m > 1, we have jm = n. Note that rk(x) = #(A). Let U be the neighbourhood of x given by Proposition 4.17. Recall that µ and ν were defined as µ := min{|xi,j | : {i, . . . , j } ⊆ A} and ν := max{|xi,j | : {i, . . . , j } ⊆ A}. Let µ (4.40) < |yB | < 2ν : B ⊆ A . U := U ∩ 2 First of all, let C < ∞ be such that our induction hypothesis is satisfied with it for 2 ≤ m < n and also that C is so large that the conclusion of Proposition 4.10 holds with µ . Also we require that := 4ν 1 max |yB |ξ ≤ 1 ≤ C min |yB |ξ B⊆A C B⊆A
(4.41)
holds whenever y ∈ U . We claim that on U we have σ (Mn ) ∼ σ (Mj1 +1 ) ⊕ ? if m = 1 and σ (Mn ) ∼ σ (Mj1 +1 ) ⊕ 1 ⊕ σ (Mj2 −i2 +2 ) ⊕ · · · ⊕ ? ⊕ σ (Mn−im +2 ) otherwise. Denote the righthand sides of these expressions collectively as . By our induction hypotheses, for any y ∈ U and any k ∈ {1, . . . , m} there is a symmetry K ∈ Ln so that for 1 ≤ k ≤ m we have jk jk 1 ξ 2 |Kyi | |vi | ≤ σ (Mjk −ik +2 )(Kyik , . . . , Kyjk ) ≤ C |Kyi |ξ |vi |2 C i=ik
(4.42)
i=ik
with C not depending on y : Just pick such a symmetry Kk ∈ Ljk −ik +2 for k ∈ {1, . . . , m} and take any K ∈ Ln such that the restriction to the yik , . . . , yjk coordinates is Kk . Here we have been abusing notation with the Kk ’s so that Kk above operates on coordinates yik , . . . , yjk and not y1 , . . . , yjk −ik +1 . Extend Kk now naturally to the whole of R(n−1)d . We can now take K to be say K = K1 K2 . . . Km−1 Km .
18
V. Hakulinen
Now for every y ∈ U fix such a transformation Ky and denote y := Ky y . By (4.41) and (4.42) we have n−1 n−1 1 |yi |ξ |vi |2 ≤ (y) ≤ C |yi |ξ |vi |2 . C i=1
(4.43)
i=1
As before, we write σ (Mn ) =
m
σil ,jl + σAc +
γi,j − σAc + the rest.
(4.44)
i,j ∈Ac
l=1
The first two terms satisfy n−1 m n−1 1 |yi |ξ |vi |2 ≤ σil ,jl + σAc ≤ C |yi |ξ |vi |2 , C i=1
l=1
(4.45)
i=1
and by Proposition 4.17 we have n 1 c γi,j − σA + the rest ≤ |yi |ξ |vi |2 . 2C i=1 i,j ∈Ac
(4.46)
So we have n−1 n−1 1 1 ξ 2 |yi | |vi | ≤ σ (Mn )(y) ≤ C + |yi |ξ |vi |2 . 2C 2C i=1
(4.47)
i=1
! Let U K := {y ∈ U : Ky = K}. Clearly U = {U K : K ∈ Kn }. We just proved that for any y ∈ U we have σ (Mn ) ∼ in Ky U Ky . Since both and σ (Mn ) are invariant under Ky−1 for any y ∈ U , we can conclude by Remark 4.1 that σ (Mn ) ∼ on U Ky . Since Ln is finite we can conclude that σ (Mn ) ∼ on U .
Let L n := {L ∈ GL(R(n−1)d ) : ∃i1 , j1 , . . . , in−1 , jn−1 : ∀x1 , . . . , xn−1 : L((x1 , . . . , xn−1 )) = (xi1 ,j1 , . . . , xin−1 ,jn−1 )}. (4.48) Obviously, L n is a finite set. It is also easy to see that it is a group. Note that the L as constructed in Theorem 4.9 belongs to L n . Remark 4.18. The following proposition simply says the following: Suppose we have a symbol of the form k "
σ (Mni +1 ) ⊕ ? .
(4.49)
i=1
This corresponds to a splitting Rnd = Rld ⊕ R(n−l)d with l = n1 + · · · + nk . Then we can replace R(n−l)d with any complementary subspace to Rld and the symbol looks the same in these new coordinates as it looks in a neighbourhood of 0 which is bounded in the Rld -direction.
Passive Advection and the Degenerate Elliptic Operators Mn
19
k Proposition 4.19. Let σ ∼ σ (Mni +1 ) ⊕ ? on a set U ⊆ B × R(n−l)d with B k i=1 bounded and l := rk(0) = i=1 ni . Let L ∈ GL(Rnd ) be such that 1. L : {0} × R(n−l)d = {0} × R(n−l)d and 2. Let P : Rnd → Rld be the natural projection onto the first ld coordinates and let L := L Rld × {0}. Then L " " k k σ (Mni +1 ) ∼ σ (Mni +1 ). (4.50) i=1
i=1
With these assumptions σL ∼
k "
σ (Mni +1 ) ⊕ ?
(4.51)
i=1
on LU . Proof. Without loss of generality we may assume that ? 0 L := , M?
(4.52)
with M an R(n−l)d × Rld -matrix. Also without loss of generality we may assume U = B(0, 1) × R(n−l)d . Let A := ki=1 σ (Mni +1 ). Denote v := (v1 , v2 ) and x := (x1 , x2 ), where v1 , x1 ∈ Rld and v2 , x2 ∈ R(n−l)d . Then v, (A ⊕ ? )L (x)v = v1 , A((L−1 x)1 )v1 + v1 , A((L−1 x)1 )M T v2 +M T v2 , A((L−1 x)1 )v1 + |v2 |2 =: (∗).
(4.53)
Since A(x) is a symmetric matrix for every x the two middle terms are equal. Moreover, (L−1 x)1 = x1 and thus (∗) = v1 , A(x1 )v1 + 2v1 , A(x1 )M T v2 + |v2 |2 =: (∗∗).
(4.54)
Next, we use induction on rk 0 = n1 + . . . + nk . If rk 0 = 1, i.e. A = σ (M2 ) we have 1 (|v1 |2 + |v2 |2 ) ≤ (∗∗) ≤ C(|v1 |2 + |v2 |2 ) C
(4.55)
for some C < ∞ when (x1 , x2 ) ∈ Sd−1 × R(n−1)d . Adding (|x1 |−ξ − 1)|v2 |2 and multiplying by |x1 |ξ yields 1 (|x1 |ξ |v1 |2 + |v2 |2 ) ≤ (∗∗) ≤ C(|x1 |ξ |v1 |2 + |v2 |2 ) C
(4.56)
when (x1 , x2 ) ∈ B(0, 1) × R(n−1)d . Since σ (M2 ) ∼ | · |ξ we can conclude our claim. Next, suppose our proposition is true for configurations of rank < l and we prove our claim when rk 0 = l. Now cover Sld−1 by finitely many open sets B1 , . . . , Bm so that " k i=1
Lj σ (Mni +1 )
∼
kj "
σ (Mnj,i +1 ) ⊕ ?
i=1
on Bj with some linear transformation Lj and with
kj
i=1 nj,i
< l.
(4.57)
20
V. Hakulinen
Letting L j := L(Lj ⊕ ? ), and applying this theorem on Uj := Bj × R(n−l)d we see that
σ (Mn+1 )Lj ∼
kj "
σ (Mnj,i +1 ) ⊕ ?
(4.58)
i=1
on Uj . Now a similar argument as above for rank 0 yields the desired conclusion. The reader may fill in the details. The following is an immediate corollary to this proposition. Corollary 4.20. Let L ∈ L be such that for some neighbourhood U of x we have σ (Mn+1 )L ∼
k "
σ (Mni +1 ) ⊕ ?
(4.59)
i=1
on LU . Then for every L ∈ L such that L−1 = L −1 on {|xi | = 0 : 1 ≤ i ≤ rk x}
(4.60)
we have
σ (Mn+1 )L ∼
k "
σ (Mni +1 ) ⊕ ?
(4.61)
i=1
on L U . 4.5. Some Corollaries. Corollary 4.21. For every n ≥ 2 there is C > 0 such that Cd(x, Dgn(Mn ))ξ ≤ σ (Mn ).
(4.62)
The proof of this fact is easy and thus omitted. The assumptions of Theorem 2.4 are now satisfied (by Corollary 4.21, Theorem A.1 and Proposition A.3) for Mn . Moreover, we can directly calculate the dimension of Mn : Corollary 4.22. There is C < ∞ such that for any f ∈ L2 (R(n−1)d ) we have ||e−Mn t f ||∞ ≤ Ct −
(n−1)d 4−2ξ
||f ||2 .
(4.63)
Moreover, C depends only on the lower bound for σ (Mn ). Proof. By Proposition A.3 there is C < ∞ so that ||f ||q ≤ C||d(x, Dgn(Mn ))ξ/2 ∇f ||2 =: (∗) for any f ∈ C0∞ (R(n−1)d ) with q := By Corollary 4.21 we have
(4.64)
2n n+ξ −2 .
(∗) ≤ C f, Mn f . Finally, by Theorem 2.13 we can conclude that (4.63) holds.
(4.65)
Passive Advection and the Degenerate Elliptic Operators Mn
21
Corollary 4.23. For any ρ ∈ (0, 1) there is C < ∞ such that for any x ∈ R(n−1)d and any y ∈ B(x, ρ|x|) we have
(n−1)d |x − y|2−ξ (4.66) KMn (t, x, y) ≤ Ct − 2−ξ exp − Ct and GMn (x, y) ≤ C|x − y|2−ξ −(n−1)d .
(4.67)
Proof. This is a direct consequence of Proposition 4.5, Theorem 2.12 and Corollary 4.22.
Corollary 4.24. Suppose A ∼λ σ (Mn1 +1 ) ⊕ · · · ⊕ σ (Mnk +1 ) ⊕ ? on Rld × R(n−l)d with l := n1 + · · · + nk < n and let > 0 be given. Then there is C < ∞ such that if z ∈ B(y1 , |y1 |) × B(y2 , |y1 |1−ξ/2 ) (here y := (y1 , y2 ) ∈ Rld × R(n−l)d ), we have
ld n−l |y1 − z1 |2−ξ + |y2 − z2 |2 . (4.68) KA (t, y, z) ≤ Ct − 2−ξ − 2 exp − Ct Moreover C depends on A only through λ, n1 , . . . , nk and n. Proof. The proof is straightforward using Theorem 2.15, Proposition 4.5 and Corollary 4.22 and we leave the details for the reader. The only finesse is the appearance of B(y2 , |y1 |1−ξ/2 ) above. This is due to the fact that if z1 ∈ B(y1 , |y1 |) and z2 ∈ B(y2 , |y1 |1−ξ/2 ), we have |y1 − z1 |2−ξ + |y2 − z2 |2 ≤ (|y1 |)2−ξ + |y2 − z2 |2 ≤ −ξ |y2 − z2 |2 + |y2 − z2 |2 .
(4.69)
5. Local Estimates for the Heat Kernel The main result in this section is Theorem 5.12. Superficially it is very similar to Corollary 4.24, but there is a very important difference: In Corollary 4.24 one assumes that A ∼ σ (Mn1 +1 ) ⊕ · · · ⊕ σ (Mnk +1 ) ⊕ ?
(5.1)
in Rnd but in Theorem 5.12 A = σ (Mn+1 ) and (5.1) holds only in a relatively compact neighbourhood of a point x. The point of this section is to close the gap between these two results. We start with some technicalities and prove a uniform version of the Harnack inequality adapted to our case. Remark 5.1. In a few places we use the somewhat terse assumption “A has a heat kernel”. In these places we assume that A has a heat kernel K such that both K(·, x, ·) and K(·, ·, x) are solutions to ut + Au = 0 in the sense of Remark 2.5 and that for every t and x we have both dy K(t, x, y) ≤ 1 and dy K(t, y, x) ≤ 1. (5.2) In the cases that are of interest to us (see Remark 2.14) this is the case and moreover our heat kernels are symmetric in the spatial coordinates.
22
V. Hakulinen
A well-known argument (see for example [20], Sect. I.3, p. 5) yields the following: Suppose A is a divergence-form operator on Rn with a nonnegative symbol. Suppose also that A is uniformly elliptic on some ball B and that A has a heat kernel. Then for any ball B ⊂⊂ B there is C < ∞ such that we have K(t, x, y) ≤ Ct −n/2
(5.3)
whenever t ∈ (0, 1], x ∈ B and y ∈ Rd . We shall now make a generalization (Corollary 5.5) of this result. So for the rest of the section we fix a symbol A on Rnd and suppose that A ∼λ σ (Mn1 +1 ) ⊕ · · · ⊕ σ (Mnk +1 ) ⊕ ?
(5.4)
on B(0, 2) × B(0, 2) ⊆ Rld × R(n−l)d , where l := n1 + · · · + nk . Let’s denote ¯ ¯ ¯ 1). Q := B(0, 1) × B(0, 1) and D := Snd−1 × B(0,
(5.5)
Proposition 5.2. For each t ∈ (0, 1] there is an open covering {Uyt }y∈Q of Q with the following properties: 1. y ∈ Uyt for every y ∈ Q and t ∈ (0, 1]. √ 2. There is > 0 not depending on t such that B(y1 , t 1/2−ξ ) × B(y2 , t) ⊆ Uyt . 3. For every t ∈ (0, 1], every y ∈ Q and every positive solution u of ut = ∇ · A∇u on (0, 3) × Uyt we have sup u(t, y ) ≤ C inf u(2t, y ). y ∈Uyt
y ∈Uyt
(5.6)
Moreover, C depends on A only through λ, n1 , . . . , nk and n. Remark 5.3. Strictly speaking in (3) we only assume u is a solution of ut = ∇ · A∇u in the sense of Remark 2.5 on (, 3) × Uyt for every ∈ (0, 3). Corollary 5.4. Proposition 5.2 holds with obvious modifications for any affine transform AK of A with possibly different and C. To give some intuition to the reader we first give a corollary to this proposition. Corollary 5.5. There is C < ∞ such that KA (t, y, y ) ≤ Ct − 2−ξ − ld
(n−l)d 2
(5.7)
for any y ∈ Q, y ∈ Rnd and t ∈ (0, 1]. Proof. By Proposition 5.2 for any y ∈ Q and y ∈ Rnd we have t 2−ξ + ld
(n−l)d 2
KA (t, y, y ) ≤ C |Uyt | sup KA (t, y , y ) ≤ CC
≤ CC
y ∈Uyt t |Uy | inf y ∈Uyt
≤ CC .
Uyt
KA (2t, y , y )
KA (2t, y , y ) dy (5.8)
Passive Advection and the Degenerate Elliptic Operators Mn
23
Next we prove a small lemma used in the proof of Proposition 5.2. The setup here is the following. Let y ∈ D. In our proof of Proposition 5.2 we use induction on rank. By Theorem 4.9 there is an invertible affine transformation Ky of Rnd sending y to 0 so that AKy ∼ σ (Mn 1 +1 ) ⊕ · · · ⊕ σ (Mn k +1 ) ⊕ ?
(5.9)
on B(0, 2) × B(0, 2) with l := n 1 + · · · + n k < l. Now Lemma 5.6 allows us to conclude that if (2) of Proposition 5.2 holds for the covering associated with y in Ky coordinates with some (for convenience, we have put this equal to 1 in the statement of Lemma 5.6), then it holds in the usual coordinates of Rnd with some other . Here is our choice of the subspaces for Lemma 5.6:
1. S1 := Ky−1 [Rl d × {0}] − {y} and 2. S2 := Ky−1 [{0} × R(n−l )d ] − {y}. In other words S2 is the degeneration subspace associated with y. The fact that y ∈ Q guarantees that {0} × R(n−l)d ⊆ S2 . Note that the −{y} in the definition of S2 is redundant, since y ∈ S2 , but we didn’t want to confuse the reader a few lines ago, did we? Lemma 5.6. Let S1 , S2 be a splitting of Rnd into complementary subspaces so that {0} × R(n−l)d ⊆ S2 . Assume also that each of them is equipped with a norm and denote the balls with respect to these norms with Bi (x, r) with i = 1, 2. Then there is > 0 so that √ √ B(0, t 1/(2−ξ ) ) × B(0, t) ⊆ B1 (0, t 1/(2−ξ ) ) × B2 (0, t) (5.10) for any t ∈ (0, 1]. Proof. Obviously there is > 0 so that B(0, ) × B(0, ) ⊆ B1 (0, 1) × B2 (0, 1).
(5.11)
√
Let us write B(0, t 1/(2−ξ ) ) × B(0, t) as √ √ 1 B(0, t 2−ξ ) × R(n−l)d ∩ B(0, t) × B(0, t)
(5.12)
√ √ and similarly for B1 (0, t 1/(2−ξ ) ) × B2 (0, t) (we used the fact that t 1/(2−ξ ) ≤ t for t ∈ (0, 1]). Now since {0} × R(n−l)d ⊆ S2 , we conclude by scaling that 1
1
B(0, t 2−ξ ) × R(n−l)d ⊆ B1 (0, t 2−ξ ) × S2 for any t > 0. Also by scaling we get √ √ √ √ B(0, t) × B(0, t) ⊆ B1 (0, t) × B2 (0, t) for any t > 0.
(5.13)
(5.14)
24
V. Hakulinen
√ Proof of Proposition 5.2. If l = 0, then we just choose Uyt := B(y, t). Obviously, these sets satisfy (2) above and by classical results (see again [20], Sect. I.3, p. 5) they satisfy (3) too. Next we assume that the cases < l have been handled and prove the proposition for l. This is done in three phases: 1. Phase 1: Use our induction hypothesis (i.e. that the cases < l have been handled) to handle points in D. 2. Phase 2: Use scaling to handle points z ∈ Q with 0 < |z1 | < 1 and times t ∈ (0, |z1 |2−ξ ]. And finally 3. Phase 3: Do something creative for points z ∈ Q and times t ∈ (|z1 |2−ξ , 1]. Note that this includes defining the sets Uzt when |z1 | = 0. First, Phase 1: By compactness, there is {y1 , . . . , yk } ⊆ D so that [B(0, 1) × B(0, 1)]}ki=1 cover D. Obviously each yi is of rank < l. For each {Ky−1 i t ∈ (0, 1] and z ∈ D pick Uzt to be one of the Uzt ’s associated with some of the y1 , . . . , yk (this is possible by induction hypothesis and Corollary 5.4). Now these Uzt ’s satisfy (2) and (3), where (3) is satisfied by induction and (2) is satisfied by Lemma 5.6 (and the discussion before it) and finiteness of the set {y1 , . . . , yk }. Next, Phase 2: We define the sets Uzt for z’s with 0 < |z1 | < 1 and t ∈ (0, |z1 |2−ξ ]. This is achieved by scaling A outwards so that in this scaling z travels to D. Then the symbol Az obtained this way has the same upper and lower bounds as A on B(0, 2)×B(0, 2), so we can use our sets Uyt defined above for y ∈ D. After this we just scale things back. So, let z ∈ Q with 0 < |z1 | < 1 and let y z := (y1 /|z1 |, z2 + (y2 − z2 )/|z1 |1−ξ/2 ).
(5.15)
ξ z if 1 ≤ i, j ≤ ld |z1 | Aij (y ) |z |ξ/2 A (y z ) if 1 ≤ i ≤ ld < j ≤ nd or 1 ij . Azij (y) := 1 ≤ j ≤ ld < i ≤ nd σ (A (y z )) if ld < i, j ≤ nd ij
(5.16)
Let Az be defined by
Similarly define uz by uz (t, y) := u(|z1 |ξ −2 t, y z ). Now if u satisfies ut = ∇A · ∇u on (0, 3) × B(0, 2) × B(0, 2), then uz satisfies uzt = ∇ · Az ∇uz on this same set. Since now if A ∼λ σ (Mn1 +1 ) ⊕ · · · ⊕ σ (Mnk +1 ) ⊕ ? on B(0, 2) × B(0, 2), then the same is true of Az , and we can conclude that (2) and (3) hold for Az with the same constants as for A. So if we scale back and let |z1 |ξ −2 t
Uzt = {(|z1 |y1 , z2 + |z1 |1−ξ/2 (y2 − z2 )) : (y1 , y2 ) ∈ Uzˆ
}
(5.17)
then (2) and (3) hold for these whenever defined. Finally, Phase 3: To finish the argument, we set for t ≥ |z1 |2−ξ 3 1/(2−ξ ) 1√ × B z2 , = B 0, t t . 2 2
Uzt
(5.18)
Passive Advection and the Degenerate Elliptic Operators Mn
25
Now (2) holds for these sets. To prove (3) we may assume without loss of generality that z2 = 0 and let At be defined as follows: ξ √ t − 2−ξ Aij (y1 t 1/(2−ξ ) , y2 t) if 1 ≤ i, j ≤ ld − ξ √ 4−2ξ A (y t 1/(2−ξ ) , y ij 1 2 t) if 1 ≤ i ≤ ld < j ≤ nd or . (5.19) Atij (y1 , y2 ) := t 1 ≤ j ≤ ld < i ≤ nd √ 1/(2−ξ ) Aij (y1 t , y2 t) if ld < i, j ≤ nd As before, for t ∈ (0, 1] the substitution A → At preserves the constant in the Harnack inequality (Theorem 2.4) and thus we can conclude that (3) holds. Remark 5.7. It is not hard to modify the previous proof so that for given > 0 there is > 0 so that √ 1. B(y1 , t 1/(2−ξ ) ) × B(y2 , t) ⊆ Uyt for every t ∈ (0, 1]. √ 2. Uyt ⊆ B(y1 , t 1/(2−ξ ) ) × B(y2 , t), when |y1 |2−ξ ≤ t ≤ 1. 3. Uyt ⊆ B(y1 , |y1 |) × B(y2 , |y1 |(2−ξ )/2 ), when 0 < t ≤ |y1 |2−ξ . We need (2) and (3) in the proof of Theorem 5.12. There we need to find > 0 so that √ t 1/(2−ξ ) Uz and B(y1 , t ) × B(y2 , t) are disjoint whenever z ∈ B(y1 , t 1/(2−ξ ) ) × √ B(y2 , t) and this is hard to arrange if we don’t have any kind of control over the Uzt ’s from outside. This required control is provided by (2) and (3) above. The actual choice of > 0 is done in Lemma 5.10. Anyway, it is quite easy to make (2) and (3) hold. First of all, it is easy to see that (2) and (3) hold with some 0 > 0 when Uyt ’s are defined as in the proof of Proposition 5.2. By letting Vyt := Uy with T := (0 / )2 we see that Vyt ’s for t ∈ (0, 1] satisfy (1)–(3) above together with the claims of Proposition 5.2. The details are left to the reader. We will use Proposition 5.2 in this form in the proofs below. t/T
We now have to estimate the tails of the heat kernel. We use a common probabilistic argument for this (killing probabilities). Denote d(x, y)2 := max{|x1 − y1 |2−ξ , |x2 − y2 |2 }. Obviously there is C < ∞ so that # C −1 d(x, y) ≤ |x1 − y1 |2−ξ + |x2 − y2 |2 ≤ Cd(x, y).
(5.20)
(5.21)
y
Below, PA (sups≤t d(Xs , y) ≥ µ) denotes the probability of the diffusion X associated with A starting from y at time 0 hitting the set {z : d(y, z) = µ} before time t. The following is Proposition 6.5 on p. 179 of [1]. Proposition 5.8. Suppose A ∼λ ? on Rl . There is C < ∞ depending on A only through λ such that
µ2 y . (5.22) PA sup |Xs − y| ≥ µ ≤ C exp − Ct s≤t
26
V. Hakulinen
Corollary 5.9. Suppose A ∼λ ? on B(0, 2) ⊆ Rnd . Then there is C < ∞ depending on A only through λ such that for every y ∈ B(0, 1), z ∈ B(y, 21 ) and 0 < t ≤ 1 we have
nd |y − z|2 KA (t, y, z) ≤ Ct − 2 exp − . (5.23) Ct The proof of this corollary is quite simple and well-known (folklore) and we shall not prove it here, but the interested reader can reconstruct the argument from the proof of Theorem 5.12 which is a generalization of Corollary 5.9. Unfortunately we need the following technicality in the proofs of Proposition 5.11 and Theorem 5.12. Lemma 5.10. Suppose > 0 is given. Then there is > 0 so that if d(y, z) ≥ |y1 |1−ξ/2 , we have
d(y, z) {z : d(z, z ) ≤ |z1 |1−ξ/2 } ⊆ z : d(z, z ) ≤ (5.24) 2 and B(y1 , |y1 |) × B(y2 , |y1 |1−ξ/2 ) ∩ B(z1 , |z1 |) × B(z2 , |z1 |1−ξ/2 ) = ∅. (5.25) Proof. Let α :=
d(y, z)2/(2−ξ ) . |y1 |
(5.26)
Then we have |z1 | ≤ |y1 | + |y1 − z1 | ≤ |y1 | + d(y, z)2/(2−ξ ) ≤ (1 + α)|y1 |.
(5.27)
So to prove (5.24), we just have to find > 0 so that ((1 + α)|y1 |)1−ξ/2 ≤
1 (α|y1 |)1−ξ/2 , 2
(5.28)
whenever α ≥ ( )2/(2−ξ ) . By elementary calculus, we see that this is possible. Using similar reasoning, we see that to prove (5.25) we have to find > 0 so that 1. |y1 | + (1 + α)|y1 | ≤ α|y1 | and 2. |y1 |1−ξ/2 + ((1 + α)|y1 |)1−ξ/2 ≤ (α|y1 |)1−ξ/2 , when α ≥ ( )2/(2−ξ ) . Again, this is possible.
Proposition 5.11. Suppose A ∼λ σ (Mn1 ) ⊕ · · · ⊕ σ (Mnk ) ⊕ ? on Rld+(n−l)d with k i=1 (ni − 1) = l and let > 0 be given. Then there is C < ∞ such that for 1−ξ/2 µ ≥ |y1 | we have
µ2 y PA sup d(Xs , y) ≥ µ ≤ C exp − . (5.29) Ct s≤t
Passive Advection and the Degenerate Elliptic Operators Mn
27
Proof. Let > 0 be given by Lemma 5.10. By Corollary 4.24, there is C1 < ∞ so that if d(y, z) ≥ |y1 |1−ξ/2 we have
(n−l)d ld |y1 − z1 |2−ξ + |y2 − z2 |2 KA (t, y, z) ≤ C1 t − 2−ξ − 2 exp − . (5.30) C1 t Now a direct computation gives y y PA sup d(Xs , y) ≥ µ ≤ PA (d(Xt , y) ≥ µ/2) s≤t
y
+PA (d(Xt , y) ≤ µ/2 and ∃s < t : d(Xs , y) = µ) y
≤ PA (d(Xt , y) ≥ µ/2) y
+PA (∃s < t : d(Xs , s) = µ and d(Xs , Xt ) ≥ µ/2) y
≤ PA (d(Xt , y) ≥ µ/2) +
sup d(y,z)=µ,s≤t
z PA (d(Xs , z) ≥ µ/2)
= (∗). By (5.24) of Lemma 5.10, for every z ∈ Rnd with d(y, z) = µ we have µ {z : d(z, z ) ≤ |z1 |1−ξ/2 } ⊆ z : d(z, z ) ≤ . 2 A fortiori we also have µ {z : d(y, z ) ≤ |y1 |1−ξ/2 } ⊆ z : d(y, z ) ≤ , 2
(5.31)
(5.32)
(5.33)
since there are points z ∈ Rd with d(y, z) = µ and |z1 | ≥ |y1 |. Thus by (5.30) we can conclude that
ld |y1 − z1 |2−ξ + |y2 − z2 |2 − 2−ξ − (n−l)d 2 (∗) ≤ C2 dy t exp − C1 t
d(y,z)≥µ/2 ld |y1 − z1 |2−ξ ≤ C3 t − 2−ξ exp − dy1 C1 t |y1 −z1 |2−ξ ≥µ2
(n−l)d |y2 − z2 |2 + C3 t − 2 exp − dy2 C1 t
|y2 −z22|≥µ µ ≤ C exp − . (5.34) Ct
Now we can finish with the local estimates. Theorem 5.12. Suppose that A ∼λ σ (Mn1 +1 ) ⊕ · · · ⊕ σ (Mnk +1 ) ⊕ ? on B(0, 2) × B(0, 2) with l := n1 + · · · + nk < n and that A has a heat kernel. For any ∈ (0, 1] there is C < ∞ so that if y ∈ Q, 0 < t ≤ 1 and |y1 |1−ξ/2 ≤ d(z, y) ≤ 21 we have
|y1 − z1 |2−ξ + |y2 − z2 |2 KA (t, y, z) ≤ Ct −ld/(2−ξ )−(n−l)d/2 exp . (5.35) Ct Moreover, this estimate depends on A only through λ, n1 , . . . , nk and n.
28
V. Hakulinen
Remark 5.13. It is not difficult to modify the proof to take into account more general sets. One can replace B(0, 2) × B(0, 2) with U := A × B with A and B open, starlike w.r.t. origin, open and satisfying
z : d(z, y) ≤
y∈Q
1 2
⊂⊂ U.
(5.36)
Similarly Q can be replaced with Q := A × B with A and B closed and starlike w.r.t. origin. Also d can be replaced with any equivalent metric. (Note in particular that Lemma 5.10 is preserved under replacement by an equivalent metric with possibly a different ). Proof. If 0 < d(z, y)2 ≤ t, then there is C < ∞ so that
|y1 − z1 |2−ξ + |y2 − z2 |2 1 ≤ C exp − . Ct
(5.37)
Thus in view of Corollary 5.5 we only need to prove the claim for t ≤ d(z, y)2 ≤ 1. y Let > 0 be given by Lemma 5.10 and let {Ut } be a collection of open coverings given by Proposition 5.2 and Remark 5.7 associated with this . We may assume 1 1 ≤ min 2 , 2ξ . √ We want to show that Uzt and B(y1 , t 1/(2−ξ ) ) × B(y2 , t) are disjoint whenever √ z ∈ B(y1 , t 1/(2−ξ ) ) × B(y2 , t). The case |y1 |2−ξ ≤ t ≤ 1 follows easily, since we assumed ≤ min{ 21 , 21ξ }. In case 0 < t ≤ |y1 |2−ξ we just use Lemma 5.10 to conclude that √ B(y1 , t 1/(2−ξ ) ) × B(y2 , t) ∩ B(z1 , |z1 |) × B(z2 , |z1 |(2−ξ )/2 ) = ∅,
(5.38)
whenever d(y, z) ≥ |y1 |1−ξ/2 . By the proof of Corollary 5.5 we have t 2−ξ + ld
(n−l)d 2
sup KMn+1 (t, y, z ) ≤ C2
dy KMn+1 (2t, x, y ).
(5.39)
|y1 − z1 |2−ξ + |y2 − z2 |2 KMn+1 (2t, y, z) ≤ C3 exp − , C3 t
(5.40)
z ∈Uzt
Uzt
By Proposition 5.11 we have Uzt
so we are done.
Remark 5.14. Note that the conclusion of the theorem depends on n1 , . . . , nk only through l. In particular the estimate obtained above remains the same, when σ (Mn1 +1 )⊕ · · · ⊕ σ (Mnk +1 ) is replaced by σ (M2 )⊕l .
Passive Advection and the Degenerate Elliptic Operators Mn
29
6. Construction of the Stationary State In this section, we shall finally prove Theorem 1.1 modulo some technicalities whose proofs are postponed until Appendix C. To this end, we shall inductively show the following Theorem 6.1. Let χ : Rd → R be compactly supported and nonnegative. Then for some Cn < ∞ we have −1 −1 M2n (M2n−2 (. . . (M2−1 χ ⊗ χ ) . . . ) ⊗ χ ) ≤ Cn
n
(1 + |x2i−1 |)2−ξ −d .
(6.1)
i=1
Obviously Theorem 1.1 follows directly from this. The following formula is a central tool in this section. Proposition 6.2. Let 1 ≤ l ∈ N. Then Rld
d ld y |x − y|2−ξ −ld
l−1
(1 + |yi |)2−ξ −d χ (yl ) ≤ C
i=1
l (1 + |xi |)2−ξ −d .
(6.2)
i=1
The proof of this proposition can be found in Appendix C. We want to show that n−1 n GM2n (x, y) (1 + |y2i−1 |)2−ξ −d χ (y2n−1 ) dy ≤ C (1 + |xi |)2−ξ −d . R(2n−1)d
i=1
i=1
(6.3) We find finitely many sets {Ai }ki=1 so that together with {(x, y) ∈ R(2n−1)d × : |y| ≥ ρ|x|} they cover R(2n−1)d × R(2n−1)d . Let Axi := {x + y : (x, y) ∈ nd R }. We shall write the above integral as
R(2n−1)d
R(2n−1)d
=
|x−y|≥ρ|x|
+
k(x) x x j =0 Aj \Aj −1
(6.4)
and then prove the desired estimate of (6.3) separately for each term of the right-hand side. We apologise to the reader for bouncing around in our use of 2n and n + 1, but for the moment n + 1 is more convenient. We will first reduce everything to investigation of operators σ (M2 )⊕l using Remark 5.14. What we mean by this is the following: Let −ξ 2 ξd C|x|− 2 t − d2 exp − |x| |x−y| if |y| < |x| Ct 2 EC (t, x, y) := (6.5) d |x| Ct − 2−ξ exp − |x−y|2−ξ if |y| ≥ 2 Ct and let ECn (x, y)
:=
∞
dt 0
n i=1
EC (t, xi , yi ).
(6.6)
30
V. Hakulinen
nd nd We want to find C < ∞ and a finite covering {Ai }m i=1 for R × R so that for every i ∈ {1, . . . , m} there is Li ∈ Ln+1 so that
G
L
Mn i
(t, x, x + y) ≤ ECn (t, x, x + y)
(6.7)
whenever (x, y) ∈ LAi . Then the proof of (6.3) is reduced to the investigation of EC (t, x, y) (which is just the natural estimate for σ (M2 )⊕n ). We’ll first define Ai ’s for symbols A ∼ ki=1 σ (Mni +1 ) ⊕ ? on B(0, 2) × B(0, 2) by induction on l := ki=1 and then use these to define Ai ’s for σ (Mn ). In this local case we just cover B(0, 2) × B(0, 2) × B(0, ). So suppose we have just a uniformly elliptic operator A on B(0, 2). Then we just take one set A1 := {(x, y) : x ∈ B(0, 1) and |x − y| < 21 }. Next suppose all the cases l < l have been handled. Then by induction hypothesis and compactness of Sld−1 × B(0, 1) there exists a finite set {x1 , . . . , xm } of Sld−1 × B(0, 1) so that there are affine transfori mations K1 , . . . , Km so that each Kj sends xj to 0 and AKj ∼ ki=1 σ (Mnj +1 ) ⊕ ? i ki j with lj = i=1 ni < l on B(0, 2) × B(0, 2). Since lj < l, there are {Ai }ki=1 so that Sld−1 × B(0, 1) × B(0, ) gets covered by ⊕2 Kj them and each Ai is just (L−1 j ) A for some associated A given for A . Moreover the linear part Lj of Kj is of the form Lj :=
Mj 0 , 0 ?
(6.8)
where Mj is a ld × ld-matrix. So there is a neighbourhood B := {(x, y) : x ∈ Sld−1 × B(0, 1) and |x − y| < }
(6.9)
! so that on B ⊆ m i=1 Ui everything is under control. Let’s define the set A˜ i as follows: A˜ i := {((rx1 , x2 ), (ry2 , r 1−ξ/2 y2 )) : (x, y) ∈ Ai , x ∈ Sld−1 × B(0, 1) and r ∈ (0, 1]}.
(6.10)
Clearly there is > 0 so that {A˜ i }m i=1 together with (see again Theorem 5.12)
|y1 |1−ξ/2 ≤ d(z, y) ≤
1 2
(6.11)
cover {(x, y) : x ∈ B(0, 1) × B(0, 1) and |x − y| < }
(6.12)
for some > 0. On this last set A clearly “behaves as” the heat kernel of σ (M2 )⊕l ⊕ ? , so we have to prove the same for ALi on Li A˜ i . This is a rather easy scaling argument: Pick
Passive Advection and the Degenerate Elliptic Operators Mn
31
k λ > 0 so that A ∼λ i=1 σ (Mni +1 ) ⊕ ? . Let y ∈ B(0, 1) × B(0, 1) and denote x y := (|y1 |−1 x1 , |y1 |ξ/2−1 (y2 − x2 ) + x2 ). Define |y1 |ξ Aij (z) |y |ξ/2 A (z) y 1 ij Bij (zy ) := A (z) ij
if 1 ≤ i, j ≤ ld if 1 ≤ i ≤ ld < j ≤ nd or 1 ≤ j ≤ ld < i ≤ nd if ld < i, j ≤ nd.
A straightforward computation shows that B y ∼λ sional analysis
k
i=1 σ (Mni +1 ) ⊕ ?
GA (y, z) = |y1 |2−ξ −ld−(1−ξ/2)(n−l)d GB y (y y , zy ).
(6.13)
. By dimen-
(6.14)
Therefore, since the same scaling property holds for ECn , we can conclude that GALi (y, z) ≤ ECn (y, z)
(6.15)
on the whole of Li A˜ i . Finally for σ (Mn ) we just cover Snd−1 × B(0, ρ) by the sets described above and conify these. Now if σ (Mn+1 )L “behaves as” σ (M2 )⊕l ⊕ ? on LA, then by scaling it “behaves as” σ (M2 )⊕l ⊕ σ (M2 )⊕(n−l) = σ (M2 )⊕n on CLA, where CLA denotes the conification of LA. So we have reduced (6.3) to proving R(2n−1)d
EC2n−1 (x, y)
n−1
(1 + |Ly2i−1 |)2−ξ −d χ (Ly2n−1 ) dy ≤ C
i=1
n
(1 + |Lxi |)2−ξ −d
i=1
(6.16) for arbitrary L ∈ L 2n and arbitrary C > 0. To this end, we split the domain of integration in (6.16) into parts and prove it separately for these parts. Define the sets Bjx as follows: Assume first that max{|xi | : 1 ≤ i ≤ n} = 1. (We then x/r
just simply let Bjx := rBj if max{|xi | : 1 ≤ i ≤ n} = r.) By symmetry we may assume |x1 | ≤ · · · ≤ |xn | = 1. Let k(x) := #({|x1 |, . . . , |xn |} \ {0}) (i.e. the number of distinct strictly positive numbers) and let be defined by 0 < |x x1 | = · · · = |x x2 −1 | < |x x2 | = · · · = |x x3 −1 | < · · · < |x xk(x) | = · · · = |xn | = 1 with x1 being the smallest integer so that |x x1 | > 0. Let rjx := |x xj |. For every x with k(x) > 1, we define x˜ as follows: x 1. x˜i = xi /rk(x)−1 for 1 ≤ i < xk(x) and 2. x˜i = xi = 1 for xk(x) ≤ i ≤ n.
(6.17)
32
V. Hakulinen
Clearly for such x, k(x) ˜ = k(x) − 1. We first give the sets Bjx inductively in terms of k(x) and then explicitly. First of all, for all our x let
1 x Bk(x) := y ∈ Rnd : |yi − xi | ≤ for every i ∈ {1, . . . , n}. (6.18) 2 y
In particular, for k(x) = 1 everything is done. If k(x) > 1 and Bj has been defined
x in the for y with k(y) < k(x), we just translate Bjx˜ on top of x and scale it by rk(x)−1 x x 1−ξ/2 first k(x) − 1 coordinates and by a factor of (rk(x)−1 ) in the rest of the coordinates. In plain “formulese”, this is
Bjx := {(ry1 , . . . , ry xk(x) −1 , x xk(x) + r 1−ξ/2 (y xk(x) − x xk(x) ), . . . , xn + r 1−ξ/2 (yn − xn )) : y ∈ Bjx˜ }.
(6.19)
Thus, explicitly, we have (denoting xk(x)+1 = n + 1)
1 Bjx := y ∈ Rnd : ∀i < xj +1 : |xi − yi | ≤ |x xj | and 2 ∀l ∈ {j + 1, . . . , k(x)}∀i ∈ { xl , . . . , xl+1 − 1} : 1 1−ξ/2 ξ/2 . |x xl | |xi − yi | ≤ |x xj | 2
(6.20)
For technical reasons related to the fact that the symmetry group of σ (Mn+1 ) (i.e. L n+1 ) is rather different from the one of σ (M2 )⊕n we have to modify our covering {Bjx } a bit, since with our current covering, (2) of Lemma 6.5 would not be true. (It is true however for such L for which {Lyl = 0} = {yl = 0} for some other l .) First of all, we can concentrate our investigation on a conical neighbourhood C of the degeneration set, since outside such neighbourhood for |x − y| ≤ ρ|x| we have ECn (x, y) ≤ C|x|−ξ |x − y|2−nd ≤ C |x − y|2−ξ −nd ,
(6.21)
and this is sufficient by the computation in Phase 1 of the proof of Theorem 6.1 below. Therefore, we pick our conical neighbourhood C and ρ > 0 so that the set B(x, 3ρ|x|) (6.22) x∈C
does not contain any of the degeneration points of σ (Mn+1 ) that are not degeneration points of σ (Mn+1 ). Then we just intersect each Bjx for x ∈ C with
B(x, 2ρ|x|)
(6.23)
x∈C
thus forcing (2) of Lemma 6.5 to be true. After this small diversion, we shall now list some basic properties of ECn on these sets needed to establish (6.3). The statements of Lemmata 6.3, 6.4 and 6.5 clearly scale x/r if for general x we define Bjx to be rBj , where r := max{|x1 |, . . . , |xn |}. So we can assume that r = 1 in the following proofs.
Passive Advection and the Degenerate Elliptic Operators Mn
33
Lemma 6.3. For every x ∈ C with |x1 | ≤ · · · ≤ |xn |, j ∈ {1, . . . , k(x)} and y ∈ Bjx there are some positive numbers a xj , . . . , an so that ECn (x, y)
≤C
n i= xj
− ξd ai 2
x −1 j
|xi − yi |
2−ξ
+
n
−ξ ai |xi
− yi |
2
ld 1− 2−ξ − (n−l)d 2
i= xj
i=1
(6.24) on Bjx \ Bjx−1 . Proof. This is just a straightforward computation by induction on k(x). Lemma 6.4. There is C < ∞ such that ECn (x, y) dy ≤ C (rjx )2−ξ . Bjx \Bjx−1
Proof. Again by induction on k(x).
(6.25)
Lemma 6.5. Let L ∈ L n+1 (L n+1 was defined in (4.48)). Then there is C < ∞ such that for every x ∈ C, j ∈ {1, · · · , k(x)} and l ∈ {1, . . . , n} the following hold: 1. If {yi = 0 : 1 ≤ i ≤ xj − 1} ⊆ {Lyl = 0}, then |Lxl | ≤ C rjx . 2. If {yi = 0 : 1 ≤ i ≤ xj − 1} ⊆ {Lyl = 0}, then Bjx ⊆ {|Lyl | ≥ C −1 |Lxl |}. Proof. As before, without loss of generality we may assume that max{|x1 |, . . . , |xn |} = 1 and that |x1 | ≤ · · · ≤ |xn |. First we handle (1). Basically it says that if Lyl can be expressed as a linear combination of yi ’s with 1 ≤ i ≤ xj − 1, then |Lxl | is small. Since maxL∈L n +1 ||L|| ≤ C for some C < ∞, it suffices to show that in case of L = ? we have |xl | ≤ rjx . But this is immediate from the fact that |xl | ≤ |x xj−1 | ≤ |x xj | = rjx . Also in the case of (2), we can immediately restrict our attention to the case L = ? . This is then immediate using (6.20). Proof of Theorem 6.1. As was said before, it suffices to prove (6.16) for x ∈ C, since for x ∈ C the whole thing reduces to Phase 1 below using (6.21). Without loss of generality, we may assume that the support of χ is so small that if (2) applies to y2n−1 , then whenever rjx ≥ 1, we have L[R(2n−2)d × supp χ ] ⊆ {|Ly2n−1 | < C −1 |Lx2n−1 |}.
(6.26)
Our proof goes as follows. First we split the domain of integration into parts and then we proceed in three phases: 1. Phase 1: Handle the integral |x−y|≥ ρ x| . | 2. Phase 2: Handle the integrals B x \B x with rjx ≤ 1. j j −1 3. Phase 3: Handle the integrals B x \B x with rjx ≥ 1. j
j −1
34
V. Hakulinen
First, Phase 1: We know that for |x − y| ≥ 21 |x| we have EC2n−1 (x, y)
≤ C1
2n−1
2−ξ −2(n−1)d |Lxi − Lyi |
(6.27)
,
i=1
so we can conclude that
(2n−1)d y E 2n−1 (x, y) C |x−y|≥ρ|x| d
(1 + |Ly2i−1 |)2−ξ −d χ (|Ly2n−1 |)
i=1
2n−1
≤ C1
n−1
R(2n−1)d n−1
d
(2n−1)d
y
2−ξ −(2n−1)d |Lxi − Lyi |
i=1
(1 + |Ly2i−1 |)2−ξ −d χ (|Ly2n−1 |) =: (∗).
·
(6.28)
i=1
By a change of variables we see that 2n−1 2−ξ −(2n−1)d n−1 d d y2i |xi − yi | R(n−1)d i=1
=
n
i=1
2−ξ −nd
|x2i−1 − y2i−1 |
n−1
R(n−1)d i=1
i=1
d z2i · 1 + d
n−1
2−ξ −(2n−1)d |z2i |
,
i=1
(6.29) so we can conclude that (∗) ≤ C1 ·
n
Rnd i=1
d
d y2i−1
n
2−ξ −nd |Lx2i−1 − Ly2i−1 |
i=1
n−1
(1 + |Ly2i−1 |)2−ξ −d χ (|Ly2n−1 |) =: (∗2 ).
(6.30)
i=1
By Proposition 6.2, we have (∗2 ) ≤ C2
n
(1 + |Lx2i−1 |)2−ξ −d .
(6.31)
i=1
Next, some initial preparation for Phases 2 and 3: Let x and j ≤ k(x) be given and let U := {1, 3, . . . , 2n − 1}, let U1 be the set of those l ∈ U for which (2) of Lemma 6.5 applies and let U2 := U \ U1 . Then, Phase 2: So suppose rjx ≤ 1. Then by Lemma 6.4 we have y∈Bjx \Bjx−1
≤ C3
d (n−1)d y EC2n−1 (x, y)
n−1
(1 + |Ly2i−1 |)2−ξ −d χ (|Ly2n−1 |)
i=1
sup
n
y∈Bjx \Bjx−1 i=1
(1 + |Ly2n−i |)2−ξ −d =: (∗3 ).
(6.32)
Passive Advection and the Degenerate Elliptic Operators Mn
35
Since (1 + |yi |)2−ξ −d ≤ 1 for any i ∈ U , we can conclude that (∗3 ) ≤ C3 sup (1 + |Lyi |)2−ξ −d := (∗4 ).
(6.33)
y∈Bjx \Bjx−1 i∈U
2
By (2) of Lemma 6.5 we have (∗4 ) ≤ C4
(1 + |Lxi |)2−ξ −d =: (∗5 ).
(6.34)
i∈U2
Since by (1) of Lemma 6.5 we have |Lxi | ≤ C for i ∈ U1 we can finally conclude that (∗5 ) ≤ C5
n
(1 + |Lx2i−1 |)2−ξ −d .
(6.35)
i=1
Finally, Phase 3: If rjx ≥ 1 and 2n − 1 ∈ U2 , then by (6.26) we have L[R(2n−2)d × supp χ ] ∩ Bjx = ∅
(6.36)
by (2) of Lemma 6.5 and thus in this case we have y∈Bjx
d (n−1)d y EC2n−1 (x, y)
n−1
(1 + |Ly2i−1 |)2−ξ −d χ (|Ly2n−1 |) = 0.
(6.37)
i=1
So we may assume 2n − 1 ∈ U1 . By (2) of Lemma 6.5 we have (1 + |Lyi |)2−ξ −d ≤ C6 (1 + |Lxi |)2−ξ −d
(6.38)
for i ∈ U2 . Therefore y∈Bjx \Bjx−1
≤ C7
d (2n−1)d y EC2n−1 (x, y)
(1 + |Lxi |)2−ξ −d
i∈U2
·
n
(1 + |Ly2i−1 |)2−ξ −d
i=1 y∈Bjx \Bjx−1
d (2n−1)d y ECn (x − y)
(1 + |Lyi |)2−ξ −d χ (Ly2n−1 ) = (∗6 ).
(6.39)
i∈U1 \{2n−1}
Writing l := 2n − 1 − l we get y∈Bjx \Bjx−1
d (2n−1)d y EC2n−1 (x − y)
≤ C8 ·
R(2n−1)d
d
i∈U1 \{2n−1}
(2n−1)d
y
n
(1 + |Lyi |)
i=l+1 2−ξ −d
(1 + |Lyi |)2−ξ −d χ (Ly2n−1 )
i∈U1 \{2n−1}
− ξd ai 2
l
|xi −yi |
2−ξ
i=1
+
n
ld l d 1−2−ξ −2
−ξ ai |xi −yi |2
i=l+1
χ (Ly2n−1 ) = (∗ ). 7
(6.40)
36
V. Hakulinen
Note that by the definition of U1 we have that in the expression (1 + |Lyi |)2−ξ −d χ (Ly2n−1 )
(6.41)
i∈U1 \{2n−1}
depends only on the variables y1 , . . . , yl . Using a similar change of variables as in (6.29), we get (∗7 ) ≤ C8
l
Rld i=1
d d yi
l
(1 + |Lyi |)
i∈U1 \{2n−1}
·
2n−1
− ξ2d
1+
ai
i=l+1
By substituting yi = Therefore
(∗ ) ≤ C10 ·
2−ξ −d
2n−1
χ (Ly2n−1 )
2n−1
Rl d i=l+1
d d yi
ld 1− 2−ξ − l 2d
−ξ
ai |xi − yi |2
= (∗8 ). (6.42)
i=l+1
−ξ/2 ai yi
8
ld 1− 2−ξ
i=1
·
|xi − yi |2−ξ
we see that the last integral is ≤ C9 .
l
Rld i=1
d
d yi
l
|xi − yi |
2−ξ
ld 1− 2−ξ
i=1
(6.43)
(1 + |Lyi |)2−ξ −d χ (Ly2n−1 ) =: (∗9 ).
i∈U1 \{2n−1}
l
Noticing that i=1 |xi − yi |2−ξ is essentially just |(x1 − y1 , . . . , xl − yl )|2−ξ for estimation purposes, using a similar change-of-variables argument as before, we can conclude that (∗9 ) ≤ C11 (1 + |Lxi |)2−ξ −d , (6.44) i∈U1
so (∗6 ) ≤ C12
n
(1 + |Lx2i−1 |)2−ξ −d .
(6.45)
i=1
A. Poincar´e and Sobolev Inequalities The following theorem was proved in [3]. Theorem A.1. Let q > 2 and let w1 and w2 be two weights on Rn and suppose that w1 is A2 and that w2 is doubling. Suppose also that for all balls B and B with B ⊆ 2B, 1/n w2 (B ) 1/q w1 (B ) 1/2 |B | ≤c (A.1) |B| w2 (B) w1 (B) with c independent of the balls. Then the Poincar´e and Sobolev inequalities hold for w1 , w2 with q.
Passive Advection and the Degenerate Elliptic Operators Mn
37
So in order to conclude that the Harnack inequality holds for Mn , it suffices to check the assumptions of the above theorem with w1 = d(x, F )ξ with F a finite union of vector subspaces of Rn or just {0} and w2 either |x|ξ or 1. Lemma A.2. Let F be a finite union of vector subspaces of Rn or just {0}. Suppose ξ > −n. Then wξ (x) := d(x, F )ξ satisfies the following: There is a constant C < ∞ such that for every x ∈ Rn we have ) 1 ξ n ξ n 1. If 0 < r < d(x,F 2 , then C d(x, F ) r ≤ wξ (B(x, r)) ≤ Cd(x, F ) r and d(x,F ) 1 n+ξ n+ξ ≤ wξ (B(x, r)) ≤ Cr . 2. If r ≥ 2 , then C r
d(x,F ) ξ ) ) ξ Proof. Since for y ∈ B(x, r) ⊆ B(x, d(x,F ≤ wξ (y) ≤ 3d(x,F , 2 ) we have 2 2 the first estimate follows. r For the second estimate, since wξ (x, r) = |x|n+ξ wξ (x, ˆ |x| ) we see that by scaling it n−1 suffices to prove the inequality for x ∈ S . To conclude the proof, it suffices to prove that 0 < lim
r→∞
sup x∈Sn−1
1 r n+ξ
wξ (B(x, r)) = lim
1
inf
r→∞ x∈Sn−1
r n+ξ
wξ (B(x, r)) < ∞.
(A.2)
The computation is omitted.
) was arbitrary. We can and will put Naturally, the choice of the borderline at d(x,F 2 the borderline at d(x, F ) with ∈ (0, 1) depending on the situation.
Proposition A.3. Let 0 < ξ < 2 and let F be a finite union of vector subspaces of Rn of codimension ≥ 2 or just {0}. Let w1 = C1 d(x, F )ξ and w2 = C2 |x|ξ with 0 < C1 , C2 < ∞. Then the Poincar´e and Sobolev inequalities hold for w1 , w2 with 2n q := n+ξ −2 and for w1 , 1 with q. Proof. By Lemma A.2 both w1 and w2 are A2 , so it suffices to prove the scaling assumption in Theorem A.1 with q. Now Lemma A.2 implies that there is a constant C < ∞ such that for every x ∈ Rd and r > 0 and every ball B := B(x , r ) ⊆ B(x, 2r) we have ) −1 ξ n ξ n 1. If 0 < r < d(x,F 4 , then C |x| r ≤ w(B ) ≤ C|x| r . ) < r, then C −1 r n+ξ ≤ w(B ) ≤ Cr ξ r n . 2. If d(x,F 4
Here w stood for either w1 or w2 . Thus we have for some C < ∞ the following: 1. If 0 < r <
d(x,F ) 4 ,
then C
2. If
d(x,F ) 4
−1
r n rn
≤
w(B ) w(B)
r n ≤C n . r
(A.3)
< r, then C
−1
r n+ξ r n+ξ
≤
w(B ) w(B)
r n ≤C n . r
(A.4)
Therefore, the claim reduces to finding C < ∞ such that for every r and r with r ≤ 2r we have:
38
V. Hakulinen
1. If 0 < r <
d(x,F ) 4 ,
then
and
2. If
d(x,F ) 4
n n−2+ξ n 1/2 2n r r r ≤ C n r r rn
(A.5)
n n+ξ 2 r 2 r ≤C . r r
(A.6)
n+ξ 1/2 n n−2+ξ 2n r r r ≤ C n r r r n+ξ
(A.7)
n+ξ n+ξ 2 2 r r ≤C . r r
(A.8)
< r, then
and
Obviously, such a C exists, so our claim has been proved.
B. Proofs for Sect. 4.3 Proof of Proposition 4.10. Let C1 := inf{v, σ (Mn+1 )(x)v : |x| = |v| = 1 and x ∈ A} and C2 := sup{v, σ (Mn+1 )(x)v : |x| = |v| = 1 and x ∈ A}. Since A is conical with A ∩ S nd−1 compact and disjoint from the degeneration set, we have C1 > 0. For x ∈ A we have n n C1 |xi |ξ |vi |2 ≤ C1 |x|ξ |vi |2 i=1
i=1
= C1 |x|ξ |v|2 ≤ σ (Mn ) ≤ C2 |x|ξ |v|2 n = C2 |x|ξ |vi |2 i=1
√ ξ n n ≤ C2 |xi |ξ |vi |2 ,
(B.1)
i=1
where the last inequality follows from the fact that ξ/2 n |x|ξ = |xi |2 i=1 ξ/2
≤ n max{|xi |ξ : 1 ≤ i ≤ n}) √ ξ n ≤ min{|xi |ξ : 1 ≤ i ≤ n} √ ξ n ≤ |xi |ξ .
(B.2)
Passive Advection and the Degenerate Elliptic Operators Mn
39
Proof of Lemma 4.11. We write |vi , (d(xi + xi+1 ) − d(xi ) − d(xi+1 ))vi+1 | ≤ |vi , (d(xi +xi+1 )−d(xi+1 ))vi+1 |+|vi , d(xi )vi+1 | and estimate the two terms separately. Since d is differentiable in the ball B(xi+1 , 21 |xi+1 |) a simple application of the mean value theorem of elementary calculus gives |vi , (d(xi + xi+1 ) − d(xi+1 ))vi+1 | ≤ sup vi , (xi · ∇)d(xi+1 + rxi )vi+1 0≤r≤1
≤
sup 1 3 2 ≤|y|≤ 2
|vˆi , (xˆi · ∇)d(y)vˆi+1 ||xi ||xi+1 |ξ −1 |vi ||vi+1 |
:= C|xi ||xi+1 |ξ −1 |vi ||vi+1 | |xi | 1−ξ/2 ξ/2 =C |xi | |xi+1 |ξ/2 |vi ||vi+1 | |xi+1 | C |xi | 1−ξ/2 (|xi |ξ |vi |2 + |xi+1 |ξ |vi+1 |2 ). ≤ 2 |xi+1 |
(B.3)
Similarly,
ξ |vi , d(xi )vi+1 | ≤ 1 + |xi |ξ |vi ||vi+1 | d −1 |xi | ξ/2 ξ/2 ξ |xi | |xi+1 |ξ/2 |vi ||vi+1 | ≤ 1+ d −1 |xi+1 | ξ |xi | ξ/2 1 + (|xi |ξ |vi |2 + |xi+1 |ξ |vi+1 |2 ). (B.4) ≤ 2 2d − 2 |xi+1 | Therefore, by setting E := max{ C2 , 21 +
ξ 2d−2 },
we can conclude our claim.
Proof of Lemma 4.12. We just estimate |vi , (d(vi,j ) − d(vi+1,j ))vj | and the other part is estimated similarly. Again an application of the mean value theorem gives us |vi , (d(xi,j ) − d(xi+1,j ))vj | ≤ sup vi , (xi · ∇)d(xi+1,j + rxi )vj 0≤r≤1
≤
sup 1 3 2 ≤|y|≤ 2
|vˆi (xˆi · ∇)d(y)vˆj ||xi ||xi+1,j |ξ −1 |vi ||vj |
:= C|xi ||xi+1,j |ξ −1 |vi ||vj | 1−ξ/2 |xi+1,j | ξ/2 |xi | |xi |ξ/2 |xj |ξ/2 |vi ||vj | =C |xi+1,j | |xj | 1−ξ/2 |xi+1,j | ξ/2 |xi | C (|xi |ξ |vi |2 + |xi+1 |ξ |vi+1 |2 ). ≤ |xj | 2 |xi+1,j |
(B.5)
40
V. Hakulinen
Proof of Lemma 4.13. First we make a split: |vi , (d(xi,j ) − d(xi+1,j ) − d(xi,j −1 ) + d(xi+1,j −1 ))vj | ≤ |vi , (d(xi,j ) − d(xi+1,j ))vj | + |vi , d(xi,j −1 )vj | + |vi , d(xi+1,j −1 )vj |. (B.6) Now the first term is estimated exactly as in the previous lemma and the latter as follows. (Actually we only estimate the second one, the third one is handled identically.) |vi , d(x i,j −1 )vj | ξ ≤ 1+ |xi,j −1 |ξ |vi ||vj | d −1 |xi,j −1 | ξ |xi | ξ/2 ξ/2 ξ |xi | |xj |ξ/2 |vi ||vj | ≤ 1+ d −1 |xi | |xj | |xi | + |xi+1,j −1 | ξ |xi | ξ/2 ξ 1 (|xi |ξ |vi |2 + |xj |ξ |vj |2 ) + ≤ 2 2d − 2 |xi | |xj | ξ/2 ξ 1 ξ |xi | (|xi |ξ |vi |2 + |xj |ξ |vj |2 ). (B.7) + 3 ≤ 2 2d − 2 |xj |
Proof of Lemma 4.14. Two applications of the mean value theorem give us |vi , (d(xi,j ) − d(xi+1,j ) − d(xi,j −1 ) + d(xi+1,j −1 ))vj | ≤ sup vi , ((xi · ∇)d(xi+1,j + rxi ) − (xi · ∇)d(xi+1,j −1 + rxi )vj 0≤r≤1
≤ ≤
sup vi , (xi · ∇)(xj · ∇)d(xi+1,j −1 + rxi + r xj ))vj
0≤r,r ≤1
sup 1 4 3 ≤|y|≤ 3
|vˆi , (xˆi · ∇)(xˆj · ∇)d(y)vˆj |xi ||xj ||xi+1,j −1 |ξ −2 |vi ||vj |
:= 2E|xi ||xj ||xi+1,j −1 |ξ −2 |vi ||vj | 1−ξ/2 1−ξ/2 |xj | |xi | (|xi |ξ |vi |2 + |xj |ξ |vj |2 ). ≤E |xi+1,j −1 | |xi+1,j −1 |
(B.8)
Proof of Lemma 4.15. We estimate the terms individually. The mean value theorem gives us |vi , (d(xi,j ) − d(xA ))vj | ≤ 2ξ/2 C
|xk | |xA |ξ −1 |vi ||vj |
k∈[i,j ]\A
≤2
ξ/2
C
k∈[i,j ]\A |xk |
|xA |
1−ξ/2
|xA | |xj |
ξ/2 .
(B.9)
Passive Advection and the Degenerate Elliptic Operators Mn
Since we assumed that 2
ξ/2
C
k∈[i,j ]\A |xk |
k∈[i,j ]\A |xk |
1−ξ/2
|xA | |xj |
≤
1 2
41
min{|xk,l | : k, l ∈ A, k ≤ l}, we have
ξ/2
|xA | 1−ξ/2 |xA | ξ/2 k∈[i,j ]\A |xk | ≤ 2ξ/2−1 C min{|xk,l | : k, l ∈ A, k ≤ l} − k∈[i,j ]\A |xk | |xj | · (|xi |ξ |vi |2 + |xj |ξ |vj |2 ) 1−ξ/2 ξ/2 k∈[i,j ]\A |xk | k∈A |xk | ≤C min{xk,l : k, l ∈ A, k ≤ l} |xj | · (|xi |ξ |vi |2 + |xj |ξ |vj |2 ).
(B.10)
Similar estimates hold for the other terms, except when A = {i, j }, which causes modifications to the last pair of terms. Then |vi , d(xi+1,j −1 )vj | ξ ≤ 1+ |xi+1,j −1 |ξ |vi ||vj | d −1 ξ/2 ξ 1 k∈[i,j ]\A |xk | (|xi |ξ |vi |2 + |xj |ξ |vj |2 ). + ≤ 2 2d − 2 |xj |
(B.11)
C. Proof of Proposition 6.2 In order to prove Proposition 6.2 we first need a lemma. Lemma C.1. Let 2 ≤ l ∈ N. Then there is C < ∞ such that Rd
d d y (k + |x − y|)2−ξ −ld (1 + |y|)2−ξ −d ≤ Ck 2−ξ −(l−1)d (1 + |x|)2−ξ −d . (C.1)
Proof. We split the domain of integration into three parts and estimate these separately:
d d y (k + |x − y|)2−ξ −ld (1 + |y|)2−ξ −d = + +
Rd
|x−y|≤|x|/2
|y|≤|x|/2
|y|,|x−y|≥|x|/2
=: (∗1 ) + (∗2 ) + (∗3 ).
(C.2)
To estimate (∗1 ) we note that in |x − y| ≤ |x|/2 we have |x|/2 ≤ |y| which implies that in |x − y| ≤ |x|/2 we have (1 + |y|)2−ξ −d ≤ (1 + |x|/2)2−ξ −d ≤ C1 (1 + |x|)2−ξ −d .
(C.3)
42
V. Hakulinen
Therefore
d d y (k + |x − y|)2−ξ −ld (1 + |x|)2−ξ −d 2−ξ −(l−1)d 2−ξ −d (1 + |x|) d d y (1 + |x − y|)2−ξ −ld = C1 k |x−y|≤|x|/(2k) 2−ξ −(l−1)d 2−ξ −d ≤ C1 k (1 + |x|) d d y (1 + |x − y|)2−ξ −ld
(∗ ) ≤ C1 1
|x−y|≤|x|/2
≤ C2 k 2−ξ −(l−1)d (1 + |x|)2−ξ −d .
Rd
(C.4)
We make in a similar estimate in (∗2 ): in |y| ≤ |x|/2 we have |x|/2 ≤ |x − y| which implies that in |y| ≤ |x|/2 we have (k + |x − y|)2−ξ −ld ≤ (k + |x|/2)2−ξ −ld ≤ C3 (k + |x|)2−ξ −ld .
(C.5)
Now we can compute:
d d y (k + |x|)2−ξ −ld (1 + |y|)2−ξ −d $ |x|d if |x| ≤ 1 and 2−ξ −ld = C4 (k + |x|) · 2−ξ |x| if |x| ≥ 1.
(∗2 ) ≤ C3
|y|≤|x|/2
(C.6)
To treat the case |x| ≤ 1, we compute: C4 (k + |x|)2−ξ −ld |x|d ≤ C4 k 2−ξ −(l−1)d |x|−d |x|d ≤ C5 k 2−ξ −(l−1)d (1 + |x|)2−ξ −d .
(C.7)
If |x| ≥ 1 we have C4 (k + |x|)2−ξ −ld |x|2−ξ ≤ C4 k 2−ξ −(l−1)d |x|−d |x|2−ξ = C4 k 2−ξ −(l−1)d |x|2−ξ −d .
(C.8)
Finally, we handle (∗3 ). When |y|, |x − y| ≥ |x|/2, we have |y|/3 ≤ |x − y|. Since this might not be obvious, we compute: Since B(x, |x|/2) ⊆ B(0, 3|x|/2), we have 1 |y|/3 = |x|/2 + d(y, B(0, 3|x|/2)) 3 ≤ |x|/2 + d(y, B(0, 3|x|/2) ≤ |x|/2 + d(y, B(x, |x|/2) = |x − y|.
(C.9)
Therefore, when |y|, |x − y| ≥ |x|/2, we have (k + |x − y|)2−ξ −ld (1 + |y|)2−ξ −d ≤ C6 (k + |y|)2−ξ −ld (1 + |y|)2−ξ −d and thus (∗3 ) ≤ C6
|y|≥|x|/2
d d y (k + |y|)2−ξ −ld (1 + |y|)2−ξ −d =: (∗4 ).
(C.10)
(C.11)
Passive Advection and the Degenerate Elliptic Operators Mn
43
We split the analysis of (∗4 ) into two subcases: |x| ≥ 2 and |x| ≤ 2. If |x| ≥ 2, then we have
(k + |y|)2−ξ −ld |y|2−ξ −d 2(2−ξ )−ld = C7 k d d y (1 + |y|)2−ξ −ld |y|2−ξ −d
(∗ ) ≤ C6 4
= C8 k
|y|≥|x|/2
2(2−ξ )−ld
|y|≥|x|/(2k) 2(2−ξ )−ld
(1 + |x|/k)
=: (∗5 ).
(C.12)
If |x| ≤ k, then (∗5 ) = C8 k 2(2−ξ )−ld ≤ C9 k 2−ξ −(l−1)d (1 + |x|)2−ξ −d .
(C.13)
On the other hand, if k ≤ |x|, then (∗5 ) = C8 |x|2(2−ξ )−ld ≤ C10 k 2−ξ −(l−1)d (1 + |x|)2−ξ −d .
(C.14)
If instead of |x| ≥ 2 we have |x| ≤ 2 in (∗4 ), we compute
d d y (k + |y|)2−ξ −ld (1 + |y|)2−ξ −d ≤ C11 d d y (k + |y|)2−ξ −ld + C11 (k + |y|)2−ξ −ld |y|2−ξ −d |y|≤1 |y|≥1 2−ξ −(l−1)d d 2−ξ −ld ≤ C12 k d y (1 + |y|) + C12 k 2−ξ −(l−1)d
(∗ ) ≤ C6 4
Rd
≤ C13 k
2−ξ −(l−1)d
Rd
(1 + |x|)2−ξ −d .
(C.15)
Proof of Proposition 6.2. Without loss of generality, we may assume that χ is the characteristic function of the unit ball. First we integrate yl out: Write k := l−1 i=1 |xi − yi |. Now we have 2−ξ −ld (k+|xl |) d d yl (k + |xl − yl |)2−ξ −ld ≤ C1 k 2−ξ −(l−1)d yl ∈B(0,1) k 2−ξ −ld
if |xl | ≥ 2, if |xl | ≤ 2 and k ≤ 1 and if |xl | ≤ 2 and k ≥ 1. (C.16)
The first case, i.e. |xl | ≥ 2, is computed by a repeated application of Lemma C.1:
44
V. Hakulinen
l−1
R(l−1)d i=1
l−1 l−1 2−ξ −ld d d yi |xl | + |xi − yi | (1 + |yi |)2−ξ −d i=1
≤ C2 (1 + |xl−1 |)2−ξ −d
i=1
l−1
R(l−1)d i=1
d d yi
l−2 l−2 2−ξ −(l−1)d · |xl | + |xi − yi | (1 + |yi |)2−ξ −d i=1
≤ ... l−1 ≤ Cl (1 + |xi |)2−ξ −d i=2
i=1
Rd
d d y1
·(|xl | + |x1 − y1 |)2−ξ −2d (1 + |y1 |)2−ξ −d ≤ Cl+1
l
(1 + |xi |)2−ξ −d .
(C.17)
i=1
In the second case, i.e. |xl | ≤ 2 and k ≤ 1, we get
l−1
k≤1 i=1
≤ sup
d d yi k 2−ξ −(l−1)d
i=1 l−1
k≤1 i=1
≤ C
l−1 (1 + |yi |)2−ξ −d
l−1
(1 + |yi |)2−ξ −d
l−1 k≤1 i=1
d d yi k 2−ξ −(l−1)d
(1 + |xi |)2−ξ −d .
(C.18)
i=1
The third case, i.e. |xl | ≤ 2 and k ≥ 1, uses the following trick:
l−1
d d y1 k 2−ξ −ld
k≥1 i=1
≤ C2
l−1
(1 + |yi |)2−ξ −d
i=1
l−1
R(l−1)d i=1
d d yi (1 + k)2−ξ −ld
l−1
(1 + |yi |)2−ξ −d = (∗).
(C.19)
i=1
Now repeating the computation of the first case, we get: (∗) ≤ C3
l−1 i=1
(1 + |xi |)2−ξ −d ≤ C4
l
(1 + |xi |)2−ξ −d .
(C.20)
i=1
Acknowledgements. The author would like to thank his thesis advisor Professor Antti Kupiainen for his support, guidance and patience during the preparation of this article and Professor N. Th. Varopoulos for teaching the argument used in Sect. 5 in the uniformly elliptic case.
Passive Advection and the Degenerate Elliptic Operators Mn
45
References 1. Bass, R.F.: Diffusions and elliptic operators. Probability and its applications. New York: SpringerVerlag, 1997 2. Bernard, D., Gaw¸edzki, K., Kupiainen, A.: Slow modes in passive advection. J. Stat. Phys. 90(3–4), 519–569 (1998) 3. Chanillo, S., Wheeden, R.L.: Weighted Poincar´e and Sobolev inequalities and estimates for weighted Peano maximal functions. Am. J. Math. 107, 1191–1226 (1985) 4. Davies, E.B.: Explicit constants for Gaussian upper bounds on heat kernels. Am. J. Math. 109, 319–334 (1987) 5. Davies, E.B.: One-parameter Semigroups. London: Academic Press, 1980 6. Davies, E.B.: Heat kernels and spectral theory. Cambridge Tracts in Mathematics 92, Cambridge: Cambridge University Press, 1989 7. Vanden Eijnden, W.E.E.: Generalized flows, intrinsic stochasticity, and turbulent transport. Proc. Natl. Acad. Sci. USA 97(15), 8200–8205 (2000) 8. Eyink, G., Xin, J.: Existence and uniqueness of L2 -solutions at zero-diffusivity in the Kraichnan model of a passive scalar. Preprint (chao-dyn/9605008), 1996 9. Gaw¸edzki, K.: Easy turbulence. Preprint (chao-dyn/9907024), 1999 10. Gaw¸edzki, K., Kupiainen, A.: Universality in turbulence: An exactly solvable model. In: Low-Dimensional Models in Statistical Physics and Quantum Field Theory (Schladming, 1995), Lecture Notes in Phys. 469, Berlin: Springer, 1996, pp. 71–105 11. Guti´errez, C.E., Wheeden, R.L.: Mean value and Harnack inequalities for degenerate parabolic equations. Coll. Math. LX/LXI, 157–194 (1990) 12. Hakulinen, V.: in preparation 13. Kraichnan, R.H.: Small-scale structure of a passive scalar field convected by turbulence. Phys. Fluids 11, 945–963 (1968) 14. Kupiainen, A.: Some mathematical problems in passive advection. In: Advances in differential equations and mathematical physics (Atlanta, GA, 1997), Contemp. Math. 217, Providence, RI:Am. Math. Soc., 1998, pp. 83–97 15. Le Jan, Y., Raimond, O.: Solutions statistiques fortes des e´ quations diff´erentielles stochastiques. C. R. Acad. Sci. Paris S´er. I Math. 327(10), 893–896 (1998) 16. Le Jan, Y., Raimond, O.: Integration of Brownian vector fields. Preprint (math.PR/9909147), 1999 17. Moser, J.: On a pointwise estimate for parabolic differential equations. Comm. Pure Appl. Math. 24, 727–740 (1971) 18. Nash, J.: Continuity of solutions of parabolic and elliptic equations. Am. J. Math. 80, 931–954 (1958) 19. Varopoulos, N.Th.: Hardy-Littlewood theory for semigroups. J. Funct. Anal. 63, 215–239 (1985) 20. Varopoulos, N.Th., Saloff-Coste, L., Coulhon, T.: Analysis and geometry on groups. Cambridge Tracts in Mathematics 100, Cambridge: Cambridge University Press, 1992 Communicated by A. Kupiainen
Commun. Math. Phys. 235, 47–68 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0772-6
Communications in
Mathematical Physics
Torus Chiral n-Point Functions for Free Boson and Lattice Vertex Operator Algebras Geoffrey Mason1, , Michael P. Tuite2,3, 1
Department of Mathematics, University of California Santa Cruz, CA 95064, USA. E-mail:
[email protected] 2 Department of Mathematical Physics, National University of Ireland, Galway, Ireland. E-mail:
[email protected] 3 School of Theoretical Physics, Dublin Institute for Advanced Studies, 10 Burlington Road, Dublin 4, Ireland Received: 6 May 2002 / Accepted: 4 October 2002 Published online: 24 January 2003 – © Springer-Verlag 2003
Abstract: We obtain explicit expressions for all genus one chiral n-point functions for free bosonic and lattice vertex operator algebras. We also consider the elliptic properties of these functions.
1. Introduction This is the first of several papers devoted to a detailed and mathematically rigorous study of chiral n-point functions at all genera. Given a vertex operator algebra (VOA) V (i.e. a chiral conformal field theory) one may define n-point functions at genus one following Zhu [Z] and use various sewing procedures to define such functions at successively higher genera. A discussion of this sewing procedure for genus two 0-point functions appears in [T]. In order to implement such a procedure in practice, one needs a detailed description of the genus one functions. This itself is a non-trivial issue, and little seems to be currently rigorously known beyond certain global descriptions for some specific theories [DMN, DM]. The purpose of the present paper is to supply the needed information in case V is either a free bosonic or lattice VOA. More precisely, if V is a free bosonic Heisenberg or lattice VOA, N a V-module, and v1 , . . . , vn states in V , we establish a closed formula below in Theorem 1 for the genus one n-point function FN (v1 , z1 ; . . . ; vn , zn ; τ ). Roughly speaking, in the free boson case the n-point functions are elliptic functions whose detailed structure depends on certain combinatorial data determined by the states v1 , . . . , vn . In the lattice case, the function is naturally the product of two pieces, one determined by the Heisenberg subalgebra and one which may be described in terms of the lattice and the genus one prime form. We note that the role Partial support provided by NSF DMS-9709820 and the Committee on Research, University of California, Santa Cruz Supported by an Enterprise Ireland Basic Research Grant and the Millenium Fund, National University of Ireland, Galway
48
G. Mason, M.P. Tuite
played by elliptic functions and the prime form in calculating genus one n-point functions in string theory has long been discussed by physicists but a rigorous and complete description of these n-point functions has been lacking until now e.g. [D, P]. The paper is organized as follows. We begin in Sect. 2 with a brief review of relevant aspects of free bosonic Heisenberg and even lattice vertex operator algebras. Section 3 contains the main results of this work. We begin with a discussion of free bosonic and lattice VOAs of rank one and later generalise to the rank l case. We firstly use a recursion formula for n-point functions due to Zhu [Z] to demonstrate that every lattice n-point function is a product of a part determined by the free bosonic Heisenberg sub-VOA and a part dependent on lattice vectors only. We also obtain an explicit expression for every free bosonic n-point function FN (v1 , z1 ; . . . ; vn , zn ; τ ) in terms of a combinatoric sum over specific elliptic functions labelled by data determined by the states v1 , . . . , vn . We next describe the n-point functions for pure lattice states. This involves the identification of such n-point functions as a sum of appropriate weights over a certain set of graphs. This combinatorial approach then leads to a closed expression for all such n -point functions in terms of the lattice vectors and the genus one prime form. Finally we conclude the section with Theorem 1 which describes the expression for every rank l lattice n-point function. Section 4 concludes the paper with a discussion of these n-point function from the point of view of their symmetry and elliptic properties. This provides some further insight into the nature of the explicit formulas obtained for n-point functions in Sect. 3. We collect here notation for some of the more frequently occurring functions and symbols that will play a role in our work. N = {1, 2, 3, . . . .} is the set of positive integers, Z the integers, C the complex numbers, H the complex upper-half plane. We will always take τ to lie in H, and z will lie in C unless otherwise noted. We set qz = exp(z) and q = q2πiτ = exp(2πiτ ). For n symbols z1 , . . . , zn we also set qi = qzi = exp(zi ) and zij = zi − zj . We now define some elliptic and modular functions. Let ℘ (z, ω1 , ω2 ) denote the Weierstrass elliptic ℘-function with periods ω1 , ω2 and set ℘ (z, 2πi, 2πiτ ) =
1 + z2
∞
(k − 1)Ek (τ )zk−2 ,
(1)
k=4,k even
so that ∞
Ek (τ ) = −
Bk 2 σk−1 (n)q n , + k! (k − 1)!
k ∈ N,
k even
(2)
n=1
is the Eisenstein series of weight k normalized as in [DLM]; Bk is a certain Bernoulli number and σk−1 (n) a power sum over positive divisors of n. Also set Ek (τ ) = 0,
k ∈ N,
k odd.
(3)
Ek (τ )zk ,
(4)
We define P0 (z, τ ) = − log z +
∞ 1 k=2
k
related to the genus one prime form K(z, τ ) [Mu] by K(z, τ ) = exp(−P0 (z, τ )).
(5)
Torus Chiral n-Point Functions
49
We further define for n ≥ 1, ∞ k−1 (−1)n d n 1 n Pn (z, τ ) = Ek (τ )zk−n . P0 (z, τ ) = n + (−1) n−1 (n − 1)! dzn z
(6)
k=n
Note that P1 (z, τ ) = ς(z, 2πi, 2πiτ ) − E2 (τ )z, for ς the Weierstrass zeta-function and P2 (z, τ ) = ℘ (z, 2πi, 2πiτ ) + E2 (τ ). We note two expansions for P2 : 1 + C(r, s, τ )zr−1 w s−1 , 2 (z − w) r,s∈N P2 (z + w1 − w2 , τ ) = D(r, s, z, τ )w1r−1 w2s−1 P2 (z − w, τ ) =
(7) (8)
r,s∈N
(expanding the latter in w1 , w2 ) so that for r, s ∈ N, (r + s − 1)! Er+s (τ ), (r − 1)!(s − 1)! (r + s − 1)! Pr+s (z, τ ). D(r, s, z) = D(r, s, z, τ ) = (−1)r+1 (r − 1)!(s − 1)! C(r, s) = C(r, s, τ ) = (−1)r+1
(9) (10)
We also define for r ∈ N, C(r, 0) = C(r, 0, τ ) = (−1)r+1 Er (τ ), D(r, 0, z) = D(r, 0, z, τ ) = (−1)r+1 Pr (z, τ ).
(11) (12)
The Dedekind eta-function is defined by η(τ ) = q 1/24
∞
(1 − q n ).
(13)
n=1
Finally, for a (finite) set we denote by ( ) the symmetric group consisting of all permutations of . Set Inv( ) = {σ ∈ ( )|σ 2 = 1}, (involutions of ( )), (14) Fix(σ ) = {x ∈ |σ (x) = x}, (fixed-points of σ ), (15) F ( ) = {σ ∈ Inv( )|Fix(σ ) = ∅}, (fixed-point-free involutions of ( )). (16) 2. Vertex Operator Algebras We discuss some aspects of VOA theory to establish context and notation. For more details, see [FHL, FLM, Ka, MN]. A vertex operator algebra (VOA) is a quadruple (V , Y, 1, ω) consisting of a Z-graded complex vector space V = n∈Z Vn , a linear map Y : V → (EndV )[[z, z−1 ]], and a pair of distinguished vectors (states): the vacuum 1 in V0 , and the conformal vector ω
50
G. Mason, M.P. Tuite
in V2 . We adopt mathematical rather than physical notation for vertex operators, so that for a state v in V , its image under the Y map is denoted Y (v, z) = v(n)z−n−1 , (17) n∈Z
with component operators (or Fourier modes) v(n) ∈ EndV and where the creation axiom Y (v, z).1|z=0 = v(−1).1 = v holds. We generally take z to be a formal variable. A concession to physics notation is made concerning the vertex operator for the conformal vector ω, where we write Y (w, z) = L(n)z−n−2 . n∈Z
The modes L(n) close on the Virasoro Lie algebra of central charge c: c [L(m), L(n)] = (m − n)L(m + n) + (m3 − m) δm,−n . 12 We define the homogeneous space of weight k to be Vk = {v ∈ V |L(0)v = kv}, where for v in Vk we write wt (v) = k. Then as an operator on V we have v(n) : Vm → Vm+k−n−1 . In particular, the zero mode o(v) = v(wt (v) − 1) is a linear operator on each homogeneous space of V . Next we consider some particular VOAs, namely Heisenberg VOAs (or free boson theories), and lattice VOAs. We consider an l-dimensional complex vector space (i.e., abelian Lie algebra) H equipped with a non-degenerate, symmetric, bilinear form (, ) and a distinguished orthonormal basis a1 , a2 , . . . al . The corresponding affine Lie algebra is ^ = H ⊗ C[t, t −1 ] ⊕ Ck with brackets [k, H] ^ = 0 and the Heisenberg Lie algebra H [a ⊗ t m , b ⊗ t n ] = (a, b)mδm,−n k. Corresponding to an element λ in the dual space H∗
(18)
we consider the Fock space defined
by the induced (Verma) module ^ ⊗U (H⊗C[t]⊕Ck) C, M λ = U (H) where C is the 1-dimensional space annihilated by H ⊗ tC[t] and on which k acts as the identity and H ⊗ t 0 via the character λ; U denotes the universal enveloping algebra. There is a canonical identification of linear spaces M λ = S(H ⊗ t −1 C[t −1 ]), where S denotes the (graded) symmetric algebra. The Heisenberg VOA M corresponds to the case λ = 0 and the Fock states v = a1 (−1)e1 .a1 (−2)e2 . . . .a1 (−n)en . . . .al (−1)f1 .al (−2)f2 . . . al (−p)fp .1, (19) for non-negative integers ei , . . . , fj form a basis of M. The vacuum 1 is canonically identified with the identity of M0 = C, while the weight 1 subspace M1 may be naturally identified with H. The vertex operator corresponding to h in H is given by Y (h, z) = h(n)z−n−1 , (20) n∈Z
where h(n) is the usual operator on M. M is a simple VOA.
Torus Chiral n-Point Functions
51
Next we consider the case of lattice VOAs VL associated to a positive-definite, even lattice L ([B, FLM]). Thus L is a free abelian group of rank l, say, equipped with a positive definite, integral bilinear form (, ) : L ⊗ L → Z such that (α, α) is even for α ∈ L. Let H be the space C ⊗Z L equipped with the C -linear extension of (, ) to H ⊗ H and let M be the corresponding Heisenberg VOA. The Fock space of the lattice theory may be described by the linear space VL = M ⊗ C[L] = M ⊗ eα , (21) α∈L
where C[L] denotes the group algebra of L with canonical basis eα , α ∈ L. M may be identified with the subspace M ⊗ e0 of VL , in which case M is a subVOA of VL and the rightmost equation of (21) then displays the decomposition of VL into irreducible M-modules. We identify eα with the element 1 ⊗ eα in VL ; each of the elements eα is a primary state of weight (α, α)/2. The vertex operator for h in H is again represented in the obvious way by (20). The vertex operator for eα is more complicated (loc. cit.) and is given by Y (eα , z) = Y− (eα , z)Y+ (eα , z)eα zα , ∓n . z Y± (eα , z) = exp ∓ n>0 α(±n) n
(22)
(The slight inconsistency in notation is more than compensated by its convenience.) The operators eα ∈ C[L] have group commutator eα eβ e−α e−β = (−1)(α,β) ,
(23)
and eα , zα act on any state u ⊗ eβ ∈ VL as eα (u ⊗ eβ ) = (α, β)u ⊗ eα+β ,
(24)
z (u ⊗ e ) = z
(25)
α
β
(α,β)
u⊗e , β
for cocycle (α, β) = ±1. This cocycle can be chosen so that [FLM] (α, β + γ ) = (α, β)(α, γ ), (α, −α) = (α, α) = 1.
(26) (27)
In the context of his theory of modular-invariance for n-point functions at genus 1, Zhu introduced in [Z] a second VOA (V , Y [, ], 1, ω) ˜ associated to a given VOA (V , Y (, ), 1, ω). This will be important in the present paper, and we review some aspects of the construction here. The underlying Fock space of the second VOA is the same space V as the first; moreover they share the same vacuum vector 1 and have the same central charge. The new vertex operators are defined by a change of co-ordinates 1 , namely Y [v, z] = v[n]z−n−1 = Y (qzL(0) v, qz − 1), (28) n∈Z
1 Concerning the co-ordinate change we follow [DLM] rather than [Z]. The latter has z replaced by 2π iz in (28). This leads to minor discrepancies between the notation in [Z] and the present paper which should be borne in mind.
52
G. Mason, M.P. Tuite
while the new conformal vector ω˜ is defined to be the state ω − L[n]z−n−2 Y [ω, ˜ z] =
c 24 1. We
set (29)
n∈Z
and write wt[v] = k if L[0].v = kv, V[k] = {v ∈ V |wt[v] = k}. States homogeneous with respect to the first degree operator L(0) are not necessarily homogeneous with respect to L[0]. On the other hand, it transpires (cf. [Z, DLM]) that the two Virasoro algebras enjoy the same set of primary states. We have L[−1] = L(0) + L(−1), which leads to the useful relation o(L[−1]v) = 0,
(30)
for any state v. In as much as the co-ordinate change z → qz = exp(z) maps the complex plane to an infinite cylinder, we sometimes refer to the VOA as being ‘on the sphere’, or ‘on the cylinder’. The Heisenberg VOA M is a simple example where there is not too much difference between being on the sphere or the cylinder. This is basically because M is generated by its weight 1 states which are primary for both Virasoro algebras, and because we have u[1]v = u(1)v = (u, v)1 and the commutator formula [u[m], v[n]] = m(u, v)δm,−n ,
(31)
for weight 1 states u, v ∈ M (cf. [Z, DMN] for more details). 3. Torus n-Point Functions In this section we will consider n-point functions at genus one. A general reference is Zhu’s paper [Z]. Let (V , Y, 1, ω) be a vertex operator algebra as discussed in Sect. 2 with N a V-module. Recall (loc. cit.) that for states v1 , . . . vn ∈ V , the n-point function on the torus determined by N is L(0) FN (v1 , z1 ; . . . ; vn , zn ; q) = T rN Y q1 v1 , q1 . . . Y qnL(0) vn , qn q L(0)−c/24 , (32) where qi = qzi , 1 ≤ i ≤ n, for auxiliary variables z1 , . . . , zn . Equation (32) incorporates some cosmetic changes compared to [Z]: we have adorned our n-point functions with an extra factor q −c/24 and omitted a factor of 2π i from the variables zi (cf. footnote to (28)). In case n = 1, (32) is the usual trace function which we will often denote by ZN (v1 , τ ) = T rN o(v1 )q L(0)−c/24 ,
(33)
where o(v1 ) again denotes the zero mode, now for the vertex operator Y (v1 , z) acting on N. Note that this trace is independent of z1 . Taking all vi = 1 in (32) yields the genus one partition function for N : dim Nm q m , (34) ZN (τ ) = T rN q L(0)−c/24 = q −c/24 m≥0
where Nm is the subspace of N of homogeneous vectors of conformal weight m. The following result, which we use later, holds:
Torus Chiral n-Point Functions
53
Lemma 1. For states v1 , v2 , . . . , vn as above we have FN (v1 , z1 ; . . . ; vn , zn ; q) = ZN (Y [v1 , z1n ].Y [v2 , z2n ] . . . Y [vn−1 , zn−1n ].vn , τ ) = ZN (Y [v1 , z1 ].Y [v2 , z2 ] . . . Y [vn , zn ].1, τ ),
(35) (36)
where zij = zi − zj . Proof. Recall notation from Sect. 2 for vertex operator algebras on the cylinder. Lemma 1 is implicit in [Z], Sect. 4, especially Eq. (4.4.21). We will give a direct proof in the case n = 2 based on the associativity of vertex operators. The general case follows in similar fashion. Associativity tells us ([FHL], Prop 3.3.2) that T rN Y (v1 , z1 )Y (v2 , z2 )q L(0) = T rN Y (Y (v1 , z12 )v2 ), z2 )q L(0) .
(37)
We also have ([FHL], Eq.(2.6.4)) exL(0) Y (v, y)e−xL(0) = Y (exL(0) v, ex y).
(38)
By (37) and (38) it follows that the left-hand-side of (35) is equal to L(0)
T rN Y (Y (q1
L(0)
v1 , q1 − q2 )q2
L(0)
= T rN Y (q2
v2 , q2 )q L(0)−c/24
Y (qzL(0) v1 , qz12 − 1).v2 , q2 )q L(0)−c/24 12
L(0)
= T rN Y (q2 Y [v1 , z12 ].v2 , q2 )q L(0)−c/24 = ZN (Y [v1 , z12 ].v2 , τ ), as required. Finally using [FHL], Eq.(2.3.17) we have in general that exL(−1) Y (v, y)e−xL(−1) = Y (v, y + x).
(39)
Hence (36) follows from o(Y [v1 , z12 ].v2 ) = o(Y [v1 , z12 ].e−z2 L[−1] .Y [v2 , z2 ].1) = o(e−z2 L[−1] .Y [v1 , z1 ].Y [v2 , z2 ].1) = o(Y [v1 , z2 ].Y [v2 , z2 ].1), and using (30).
Recall from the previous section the notation for states and modules for Heisenberg and lattice vertex operator algebras. We are going to develop explicit formulas for n-point functions in these cases. The final answer is quite elaborate, so we begin with the rank 1 case. The general case will proceed in exactly the same manner. We fix the following notation: L is a rank l = 1 even lattice with inner product (, ), M the corresponding Heisenberg vertex operator algebra based on the complex space H = C ⊗Z L, a ∈ H satisfies (a, a) = 1, N = M ⊗ eβ is a simple M-module with β ∈ L, h = (β, β)/2 the conformal weight of the highest weight vector of N . We will establish a closed formula for n-point expressions of the form FN (v1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q),
(40)
54
G. Mason, M.P. Tuite
where α1 , . . . , αn ∈ L and v1 , . . . , vn are elements in the canonical Fock basis (19) of M on the cylinder. Thus, v1 = a[−1]e1 a[−2]e2 . . . , etc. Note that the individual vertex operators Y (vi ⊗ eαi , zi ), 1 ≤ i ≤ n do not generally act on the module N , however their composite does as long as α1 + . . . + αn = 0. We will always assume that this is the case. It transpires that (40) factors as QN .FN (1 ⊗ eα1 , z1 ; . . . ; 1 ⊗ eαn , zn ; q),
(41)
where QN is independent of the αi , and our main task will be to elucidate the structure of the two factors QN and FN . Our results generalize the calculations in [DMN], which dealt with the case n = 1, α1 = 0. We turn to the precise description of QN . Consider first a Fock state v ∈ M given by v = a[−1]e1 . . . a[−p]ep .1,
(42)
where e1 , . . . , ep are non-negative integers. The state v is determined by a multi-set or labelled set, which consists of e1 + e2 + . . . + ep elements, the first e1 of which are labelled 1, the next e2 labelled 2, etc. In this way, each of the states vi in (40) is associated with a labelled set i , and we let = 1 ∪ . . . ∪ n denote the disjoint union of the i , itself a labelled set. For convenience we often specify an element of by its label: the reader should bear in mind that this expedient can be misleading because there are generally several distinct elements with the same label. An element ϕ ∈ Inv( ) (cf. (14)) considered as a permutation of , may be represented as a product of cycles, each of length 1 or 2: ϕ = (r1 s1 ) . . . (rb sb )(t1 ) . . . (tc ).
(43)
Equation (43) tells us that = {r1 , s1 , . . . , rb , sb , t1 , . . . , tc }, while ϕ exchanges elements with labels ri and si , and fixes elements with labels t1 , . . . , tc . Notice that involutions may produce the same permutation of labels yet correspond to distinct permutations of . We will always consider such involutions to be distinct, regardless of labels. Recall the definitions (9) to (12). Let be a subset of with || ≤ 2. If = {r, s} has size 2 with r ∈ i , s ∈ j , we define D(r, s, zij , τ ), i = j γ () = (44) C(r, s, τ ), i = j. Note that D(r, s, z, τ ) = D(s, r, −z, τ ), so that the order in which the arguments r, s appear is of no relevance. If = {r} ⊆ k we define D(r, 0, zkl , τ )αl ), (45) γ () = (a, δr,1 β + C(r, 0, τ )αk + l>k
recalling that a ∈ h satisfies (a, a) = 1 and that β ∈ L determines the module N = M ⊗ eβ . For ϕ ∈ Inv( ) set γ (), (46) (ϕ) =
where the product ranges over all orbits (cycles) of ϕ in its action on . Finally, set (ϕ). (47) QN (v1 , z1 ; . . . ; vn , zn ; q) = ϕ∈Inv( )
We can now formally state our first main result about n-point functions.
Torus Chiral n-Point Functions
55
Proposition 1. Let v1 , . . . , vn be states of the form (42) in the rank 1 free boson theory M and let be the labelled set determined (as above) by these states. Then the following holds for lattice elements α1 , . . . , αn satisfying α1 + . . . + αn = 0: FN (v1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q) = QN (v1 , z1 ; . . . ; vn , zn ; q)FN (1 ⊗ eα1 , z1 ; . . . ; 1 ⊗ eαn , zn ; q).
(48)
Proof. The idea is to carefully examine a recursion formula for n -point functions due to Zhu ([Z], Prop. 4.3.3). Bearing in mind the differences in notation between the present paper and [Z], we quote the following: Lemma 2 (Zhu). Assume that u1 , . . . , un , b are states in a vertex operator algebra V , that N is a V-module, and that o(b) acts as a scalar on N . Then FN (b[−1]u1 , z1 ; . . . ; un , zn ; q) L(0) = T rN o(b)Y q1 u1 , q1 . . . Y qnL(0) un , qn q L(0)−c/24 + E2k (τ )FN (b[2k − 1]u1 , z1 ; u2 , z2 ; . . . ; un , zn ; q) k≥1
+ −
n
(−1)m+1 Pm+1 (zk1 , τ )FN (u1 , z1 ; . . . ; b[m]uk , zk ; . . . ; un , zn ; q)
m≥0 k=2 n
1 2
FN (u1 , z1 ; . . . ; b[0]uk , zk ; . . . ; un , zn ; q).
(49)
k=1
We apply this result in the case that N and ui = vi ⊗eαi are as discussed in (40)–(47), with b = a. The zero mode of a acts on N as multiplication by the scalar (a, β) and a[0].vi ⊗ eαi = (a, αi )vi ⊗ eαi . As a result of α1 + . . . + αn = 0 it follows that the last summand in (49) vanishes, and (49) then reads FN (a[−1]v1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q) n = a, β + D(1, 0, z1k )αk FN (v1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q) +
k=2
C(2k − 1, 1)FN (vˆ1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q)
k≥1
+
n
D(m, 1, zk1 )FN (v1 ⊗ eα1 , z1 ; . . . vˆk ⊗ eαk , zk ; . . . ; vn ⊗ eαn , zn ; q).
m≥1 k=2
(50) Here, we have used the (admittedly uninformative) notation vˆ1 in the second summand to indicate that a factor a[−2k + 1] should be removed from the expression of v1 as a product (42), and indeed that this should be implemented as often as a[−2k + 1] occurs in the expression. If a[−2k + 1] does not occur in the expression for v1 then vˆ1 is defined to be zero. Similar notation vˆk occurs in the third summand, where it indicates removal of a factor a[−m]. Next we develop the analog of (50) in which a[−1] is replaced by a[−p] for any positive integer p. To this end we take b = L[−1]p−1 .a. We easily calculate that
56
G. Mason, M.P. Tuite
m b[m] = (−1)p−1 p−1 a[m − p + 1], in particular b[−1] = a[−p] and b[0] = 0 if p ≥ 2. Note also that o(b) = 0 if p ≥ 2, thanks to (30). With this choice of b and p, and after some calculation, (49) reduces to the next equation. In fact, we can combine the resulting equality (for p ≥ 2 ) with the case p = 1. What is obtained is the basic recursive relation satisfied by our n-point functions, namely FN (a[−p]v1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q) = C(2k − p, p)FN (vˆ1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q) k>p/2 n
+
D(m − p + 1, p, zk1 ).
m>p−1 k=2 FN (v1 ⊗ eα1 , z1 ; . . .
; vˆk ⊗ eαk , zk ; . . . ; vn ⊗ eαn , zn ; q) n + a, δp,1 β + C(p, 0)α1 + D(p, 0, z1k )αk .
k=2
FN (v1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q),
(51)
where we have used a similar convention to the case p = 1 regarding symbols vˆ1 , vˆk . Close scrutiny of relation (51) reveals how to complete the proof of Proposition 1, which at this point is a matter of interpreting the recursion formula. We choose an element with label p from the first labelled set 1 determined by v1 . The first sum on the r.h.s of (51) then corresponds to certain terms in the representation (42) of v1 . Indeed, as long as 2k > p, a factor a[p − 2k] will give rise to a term C(2k − p, p, τ )FN (. . . ), and via (44) we identify C(2k − p, p, τ ) with γ (), where = {2k − p, p} ⊆ 1 and (2k − p, p) is the initial transposition of a putative involution that is to be constructed inductively. Terms in the second (double) summation in (51) are treated similarly - they correspond to expressions γ ()FN (. . . ), where = {m − p + 1, p} and m − p + 1, p are labels of elements in k , 1 respectively (for k = 1). The third term in (51) is similarly seen to coincide with γ ()FN (. . . ), now with = {p} ⊆ 1 . We repeat this process in an inductive manner. It is easy to see that in this way we construct every element of Inv( ) exactly once, and what emerges is the formula (48). This completes our discussion of the proof of Proposition 1. In order to complete our discussion of the rank one n-point functions we must of course evaluate the term FN (1 ⊗ eα1 , z1 ; . . . ; 1 ⊗ eαn , zn ; q). Before we do that, however, it will be useful to draw some initial conclusions from Proposition 1. Taking all lattice elements α1 = . . . = αn = β = 0 corresponds to the case of n-point functions in the rank 1 free bosonic theory M. In this case all contributions from orbits of length 1 vanish and hence the sum over Inv( ) reduces to one over F ( ) of (16) only. Furthermore we know that FM (1, z1 ; . . . ; 1, zn ; q) is just the partition function for M, i.e. 1/η(τ ). Thus we arrive at a formula for n-point functions for a single free boson: Corollary 1. Let M be the VOA for a single free boson. For Fock states v1 , . . . , vn as in (42) with corresponding labelled set we have FM (v1 , z1 ; . . . ; vn , zn ; q) =
1 (ϕ). η(τ ) ϕ∈F ( )
(52)
Torus Chiral n-Point Functions
57
Some special cases for n = 1 of Corollary 1 were established in [DMN]. From (52) the general formula for 1-point functions for a single free boson is ZM (v; τ ) =
1 C(r, s, τ ), η(τ )
(53)
ϕ∈F ( )
for fixed point free involutions ϕ = . . . (rs) . . . of the labelled set labelling v of (42) where the product is taken over all the transpositions (rs) in ϕ. An even more special, yet interesting, case arises if in Corollary 1 we take each state vi to coincide with the conformal weight one state a. Thus vi = a[−1].1, 1 ≤ i ≤ n, consists of n elements each carrying the label 1, and elements of ( ) may be considered as mappings on the set {1, 2, . . . , n} in the usual way. If n is odd then there are no fixed-point-free involutions acting on , so (52) is zero in this case. If n is even then γ () = P2 (zij , τ ) if = i ∪ j for i = j . Thus we obtain Corollary 2. Let M be the VOA for a single free boson. Then for n even, FM (a, z1 ; . . . ; a, zn ; q) =
1 P2 (zij , τ ), η(τ )
(54)
ϕ∈F ( )
where the product ranges over the cycles of ϕ = . . . (ij ) . . . We next consider the case of an M-module N = M ⊗eβ . If n = 1 we necessarily have α1 = 0, ZN = q (β,β)/2 /η(τ ), and the labelled set coincides with 1 . If ϕ ∈ Inv( ) and = {r} is an orbit of ϕ of length 1 then from (45) we get γ () = δr,1 (a, β). So (ϕ) vanishes unless all labels of the set of fixed-points Fix(ϕ) are equal to 1. In this case we may write ϕ = 1|| ϕ0 to indicate that ϕ fixes a set of elements with label 1, and that ϕ0 is the fixed-point-free involution induced by ϕ on the complement \. We thus obtain Corollary 3. Let M be as in (40) and let N be the M-module M ⊗ eβ . Take v to be the state (42) and let denote the elements in with label 1. Then q (β,β)/2 (a, β)|| (ϕ0 ), η(τ ) ⊆ ϕ0 ∈F ( \) (ϕ0 ) = C(r, s, τ ),
ZN (v, τ ) =
(55)
where ϕ0 = . . . (rs) . . . acting on \. Similarly to Corollary 2 we can consider again the special case with each vi = a, generalising (54). In the above notation we therefore have = so that Corollary 4. Let M be the VOA for a single free boson with module N = M ⊗ eβ . Then FN (a, z1 ; . . . ; a, zn ; q) =
q (β,β)/2 (a, β)|| η(τ ) ⊆
ϕ0 ∈F ( \)
where the product ranges over the cycles of ϕ0 = . . . (ij ) . . . .
P2 (zij , τ ),
(56)
58
G. Mason, M.P. Tuite
We now show how (56) can be interpreted as the generator of all free bosonic n-point functions for Fock states (19). This provides a useful insight into the structure found for these n-point functions in terms of the elliptic function P2 (z, τ ) and the scalar (a, β). Proposition 2. FN (a, z1 ; . . . ; a, zn ; q) is a generating function for the n-point functions for all Fock states v1 , . . . , vn . Proof. This follows from Lemma 1 and the expansions of P2 of (7) and (8). We will illustrate the result for n = 1 and n = 2. A general proof can be given along the same lines. From (36) we obtain FN (a, z1 ; . . . ; a, zn ; q) = ZN (Y [a, z1 ] . . . Y [a, zn ].1, q) ZN (a[−l1 ] . . . a[−ln ].1, q)z1l1 −1 . . . znln −1 . = l1 ,...ln ∈Z
(57) The 1-point function for the bosonic Fock state v = a[−l1 ] . . . a[−ln ].1 is clearly the coefficient of z1l1 −1 . . . znln −1 for l1 , . . . , ln > 0. We then recover (55) from the expansion for each P2 (zij , τ ) using (7). For n = 2 consider the 1-point function ZN (Y [Y [a, w1 ] . . . Y [a, wm ].1, w].Y [Y [a, z1 ] . . . Y [a, zn ].1, z].1, q).
(58)
The 2-point function FN (v1 , qw ; v2 , qz ; q) for v1 = a[−l1 ] . . . a[−lm ].1 and v2 = n li −1 mj −1 a[−k1 ] . . . a[−kn ].1 is the coefficient of m zj in (58). By associativi=1 j =1 wi ity (37) and using Y [1, z] = Id Eq. (58) can be expressed as ZN (Y [a, w1 + w] . . . Y [a, wm + w].Y [a, z1 + z] . . . Y [a, zn + z].1, q). Using (56) this becomes (suppressing the τ dependence for clarity) q (β,β)/2 (a, β)|| η(τ ) ⊆
P2 (wab ) . . . P2 (zcd ) . . . P2 (w − z + we − zf ) . . . ,
ϕ0 ∈F ( \)
where ϕ0 = (ab) . . . (cd) . . . (ef ) . . . ∈ F ( \) with a, b, e . . . ∈ {1, 2, . . . m} and c, d, f . . . ∈ {1, 2, . . . n}. Then the coefficient of wala −1 wblb −1 in P2 (wab ) is C(la , lb ), the coefficient of zcmc −1 zdmd −1 in P2 (zcd ) is C(mc , md ) from (7) and the coefficient of m −1
wele −1 zf f in P2 (w − z + we − zf ) is D(le , mf , w − z) from (8 ) leading to the result (52) in this case.
We complete our discussion of bosonic n-point functions with two global formulas for 1-point functions. The first shows how to write certain 1 -point functions with respect to N in terms of 1-point functions with respect to M:
Torus Chiral n-Point Functions
59
Proposition 3. Let notation be as in Corollary 1. Then if ς is an indeterminate, 1 a[−m]ς m .1, τ ZN exp m m≥1 1 = q (β,β)/2 exp((a, β)ς )ZM exp a[−m]ς m .1, τ . m
(59)
m≥1
Proposition 4. Let λ1 , . . . λn be n scalars obeying ni=1 λi = 0. Then the following holds: n K(zij , τ ) λi λj a[−m] 1 m ZM exp λi zi .1, τ = , (60) m η(τ ) zij m≥1
1≤i<j ≤n
i=1
where K(z, τ ) is the prime form of (5). We divide the proof of Proposition 3 into several steps. Lemma 3. Let u ∈ M be a state such that a[1].u = 0. For an integer p ≥ 0, ZN (a[−1]p .u, q) = q (β,β)/2 ZM (((a, β) + a[−1])p .u, q).
(61)
Proof. Suppose first that p = 0. In this case, the lemma says that for a state v as in (42) which satisfies also e1 = 0, the traces of o(v) over N and M differ only by an overall factor of q (β,β)/2 . A moment’s thought shows that this follows from Corollary 1 because the set is empty in this case. This proves the case p = 0 of the lemma. We prove the general case by induction on p. Using Lemma 2 with n = 1 we calculate ZN (a[−1]p+1 .u, τ ) = (a, β)ZN (a[−1]p .u, τ ) +
E2k (τ )ZN (a[2k − 1].a[−1]p .u, τ )
k≥1
=q
(β,β)/2
(a, β)ZM (((a, β) + a[−1])p .u, τ )
+pE2 (τ )q (β,β)/2 ZM (((a, β) + a[−1])p−1 .u, τ ) +q (β,β)/2 E2k (τ )ZM (a[2k − 1].((a, β) + a[−1])p .u, τ ), k≥2
from which the result follows by using Lemma 2 again.
If we multiply (61) by ς p , rearrange the constants and sum over p we find Lemma 4. Let notation be as above. Then (62) ZN (exp(a[−1]ς ).u, τ ) = q (β,β)/2 exp((a, β)ς )ZM (exp a[−1]ς ).u, τ ). a[−m] m .1 we note that Proposition 3 is a special case Choosing u = exp m≥2 m ς of Lemma 4.
60
G. Mason, M.P. Tuite
Proof of Proposition 4. When expanded as a sum, the left-hand-side of (60) can be written in the form ek n p 1 1 k λi z i ZM (v, τ ), (63) ek ! k v k=1
i=1
where v ranges over the basis elements (42). In this regard p one should note that as a consequence of Corollary 1, ZM (v, τ ) = 0 whenever k=1 kek is odd. The argument that we use to establish the equality (60) involves the use of a combinatorial technique that proliferates in other work on genus two and higher VOAs [MT]. Use the case N = M of (55) and use Corollary 1 to write (63) as ek n p p 1 k 1 1 (ϕ) λi z i , (64) η(τ ) v ek ! k k=1
ϕ∈F ( v ) k=1
i=1
where v is the labelled set determined by v. Fix for a moment an element ϕ, considered as a product of transpositions. We can represent ϕ by a graph with nodes labelled by positive integers corresponding to the elements of v , two nodes being connected precisely when ϕ interchanges the nodes in question. Such a graph masquerades under various names in combinatorics: bipartite graph, or a complete matching (cf. [LW], for example), and is nothing more than another way to think about fixed-point-free involutions. Pictorially the complete matching looks like r1 • —– • s1 r2 • —– • s2 .. .
rb • —– • sb Fig. 1
We denote the complete matching determined by ϕ by the symbol µϕ . Any complete matching on a set labelled by positive integers corresponds to a fixed-point-free involution and a state v. Let us agree that two complete matchings µ1 , µ2 are isomorphic if there is a bijection from the node set of µ1 to the node set of µ2 that preserves labels. We may identify the node sets of the two matchings, call it , in which case an isomorphism may be realized by an element in the symmetric group ( ). More precisely, let us define the label subgroup of ( ) to be the subgroup ( ) consisting of all permutations of that preserve labels. Then it is the case that an isomorphism between µ1 and µ2 may be realized by an element in the label subgroup. Note that our notation implies that |( )| = ek !. Now consider (64). The expression to the right of the second summation, which we denote by ek n p 1 k ˆ ϕ) = (ϕ) λi z i (65) (µ k k=1
i=1
is determined by the complete matching µϕ . By what we have said, every complete matching on a set labelled by positive integers occurs in this way as v ranges over the preferred Fock basis of M (on the cylinder) and ϕ ranges over F ( v ), while the factor
Torus Chiral n-Point Functions
61
1 ek !
may be interpreted as averaging over all complete matchings with a given set of labels. The only duplication that occurs is due to automorphisms of the complete matching. The upshot is that expression (64) is equal to ˆ 1 (µ) , η(τ ) µ |Aut(µ)|
(66)
where µ ranges over all isomorphism classes of complete matchings labelled by positive integers. For a given complete matching µ as in Fig. 1, let E = E(r, s) denote an edge with labels r, s, and let m(E) denote the multiplicity of E in µ. Thus we may represent µ symbolically by its decomposition µ = m(E)E into isomorphism classes of labelled edges. Now it is evident that there is an isomorphism of groups Aut(µ) Aut(E) m(E) , E
a direct product, indexed by isomorphism classes of labelled edges, of groups which are themselves the (regular) wreathed product of Aut(E) and m(E) . In particular, we have |Aut(µ)| = m(E)!|Aut(E)|m(E) . (67) E
Note that |Aut(E)| ≤ 2, with equality only if the two node labels of E are equal. Next it ˆ is easy to see that the expression (µ) is multiplicative over edges. In other words, we have m(E) ˆ ˆ (µ) = , (68) (E) E
and for an edge E(r, s) we have n n 1 1 ˆ λi zir λj zjs (E) = C(r, s, τ ) r s =
i=1
r+1
(−1) r +s
r +s Er+s (τ ) s
j =1 n
n
i=1
j =1
λi zir
λj zjs .
Substitute (67) and (68) in (66) to obtain the expression ˆ ˆ or ) 1 (E) (E 1 exp = exp , η(τ ) |Aut(E)| η(τ ) 2 E
(69)
(70)
Eor
where E ranges over all labelled edges r•——•s and Eor ranges over all oriented edges r•—>—•s (which have trivial automorphism group). Using (69), the expression (70) is in turn equal to 2k n n 1 2k 1 1 (71) E2k (τ ) (−1)r+1 λi zir λj zj2k−r . exp r η(τ ) 2 2k k≥1
r=0
i=1
j =1
62
G. Mason, M.P. Tuite
But, 2k n n 1 r+1 2k (−1) λi zir λj zj2k−r = − r 2 r=0
i=1
j =1
1≤i<j ≤n
2k λi λj zij ,
and hence using (4), (5) we find (71) becomes 1 η(τ )
1≤i<j ≤n
1 exp(−λi λj (P0 (zij , τ ) + log zij )) = η(τ )
This completes the proof of Proposition 4.
1≤i<j ≤n
K(zij , τ ) zij
λi λj .
We now turn our attention on the second factor FN of (41) where we abbreviate 1 ⊗ eαi by eαi again. Proposition 5. Let M and N = M ⊗eβ as above, and let α1 , . . . , αn be lattice elements in the rank one even lattice L satisfying α1 + . . . + αn = 0. Then FN (eα1 , z1 ; . . . ; eαn , zn ; q) q (β,β)/2 = exp((β, αr )zr ) η(τ ) 1≤r≤n
(αi , αj )K(zij , τ )(αi ,αj ) .
(72)
1≤i<j ≤n
Proof. Use (36) of Lemma 1 to rewrite the LHS of (72) as ZN (Y [eα1 , z1 ] . . . Y [eαn , zn ].1; q).
(73)
Referring to (22), by repeated use of the identity z − w (α,β) Y+ (eα , z)Y− (eβ , w) = Y− (eβ , w)Y+ (eα , z) z (for |z| > |w| ) and using (24) to (27) we find (73) is n α [−m] i zrs (αr ,αs ) (αr , αs )ZN exp zim .1,τ . m 1≤r<s≤n
(74)
m>0 i=1
The operator corresponding to m = 1 in the exponential in (74) may be written in the form a[−1]ς, where ς = nk=1 (a, αk )zk so that, from Lemma 4, n αi [−m] m ZN exp zi .1,τ m m>0 i=1 n n αi [−m] m (β,β)/2 =q zi .1,τ . exp((β, αi )zi )ZM exp m m>0 i=1
i=1
Now use Proposition 4 with λi = (a, αi ) so that we find n αi [−m] m 1 ZM exp zi .1,τ = m η(τ )
1≤i<j ≤n
m>0 i=1
This completes the proof of the proposition.
K(zij , τ ) zij
(αi ,αj ) .
Torus Chiral n-Point Functions
63
We now consider the lattice VOA VL constructed from a rank l even lattice L as described in Sect. 2. We recall that a1 , a2 , . . . al is an orthonormal basis for H with respect to the non-degenerate symmetric bilinear form (, ). We let M be the rank l Heisenberg vertex operator algebra and let N = M ⊗ eβ be a simple M-module with β ∈ L, h = (β, β)/2 the conformal weight of the highest weight vector of N . Then M M 1 ⊗ . . . ⊗ M l , the tensor product of l copies of the rank 1 Heisenberg VOA, and is spanned by the Fock states v = a1 [−1]e1 . . . a1 [−p]ep . . . al [−1]f1 . . . al [−q]fq .1,
(75)
where e1 , . . . , fq are non-negative integers. We now give a general closed formula for all rank l lattice n-point functions (40), where v1 , . . . , vn are Fock states of the form (75). Viewing each vector vi as an element of M 1 ⊗. . .⊗M l we define ri as the labelled set for the r th tensored vector of vi , e.g. 1i contains 1 with multiplicity e1 , 2 with multiplicity e2 , etc. Therefore, vi is determined by the labelled set i = 1i ∪ . . . ∪ li , the disjoint union of l sets. We also define r = 1≤i≤n ri to be the labelled set for the r th tensored vectors of the n vectors v1 , . . . , vn . Then we have: Theorem 1. Let v1 , . . . , vn be states of the form (75) in the rank l free boson theory M and let 1 , . . . l be the labelled sets defined as above by these states. Then for N = M ⊗ eβ , a simple M-module with β ∈ L, the following holds for lattice elements α1 , . . . , αn ∈ L satisfying α1 + . . . + αn = 0: FN (v1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q) = QN (v1 , z1 ; . . . ; vn , zn ; q)FN (eα1 , z1 ; . . . ; eαn , zn ; q), where QN (v1 , z1 ; . . . ; vn , zn ; q) =
(76)
(ϕ r ),
(77)
(αi , αj )K(zij , τ )(αi ,αj ) .
(78)
1≤r≤l ϕ r ∈Inv( r )
and FN (eα1 , z1 ; . . . ; eαn , zn ; q) q (β,β)/2 = exp((β, αr )zr ) η(τ )l 1≤r≤n
1≤i<j ≤n
Proof. We sketch the proof which follows along the same lines as the rank one case described above. We firstly apply Lemma 2 to the r th tensored vectors labelled by r . Following the same argument for the rank one case in Proposition 1, this results in (76) and (77). We secondly evaluate the LHS of (78) as in Proposition 5 using Proposition 4 with λri = (ar , αi ) for r = 1, . . . , l to obtain (78). We conclude this section with the first non-trivial examples of lattice n-point functions which occur for n = 2. From Theorem 1 and recalling (27) we have: Corollary 5. For a rank l lattice theory with N = M ⊗ eβ as above and with states eα and e−α we have: FN (eα , z1 ; e−α , z2 ; q) =
q (β,β)/2 exp((β, α)z12 ) . ηl (τ ) K(z12 , τ )(α,α)
Taking the sum over all β ∈ L we immediately obtain:
(79)
64
G. Mason, M.P. Tuite
Corollary 6. For V = VL , the lattice vertex operator algebra for a rank l even lattice L, and for states eα and e−α we have: FVL (eα , z1 ; e−α , z2 ; q) = where α,L (τ, z) =
β∈L
1 ηl (τ )
α,L (τ, z12 /2π i) , K(z12 , τ )(α,α)
(β, β) exp 2πi τ + (β, α)z 2
(80)
,
(81)
is a Jacobi form of weight l and index (α, α)/2 [EZ]. 4. The Elliptic Properties of n-Point Functions In this section we will consider the elliptic properties of the n-point functions described in the previous section. For vertex operator algebras satisfying the so-called C2 condition, Zhu has shown that every n -point function is meromorphic and periodic in each parameter zi and is therefore elliptic [Z]. The C2 condition does not hold for simple modules of free bosonic theories but nevertheless all n-point functions (76) are found to be meromorphic and either elliptic for the free bosonic n-point functions or quasi-elliptic for the lattice n-point functions. In this section we will consider these n -point functions from first principles where our aim is to provide further insight into the structure found for these functions. In particular, we will show how (56), the generating function for free bosonic n -point functions and the lattice n-point function (72) are the unique elliptic or quasi-elliptic functions determined by permutation symmetry, periodicity and certain natural singularity and normalisation properties. We begin with a general statement about n-point functions: Lemma 5. The n-point function FN = FN (v1 ⊗ eα1 , z1 ; . . . ; vn ⊗ eαn , zn ; q) for α1 + . . . + αn = 0 obeys the following: (i) FN is symmetric under all permutations of its indices. (ii) FN is a function of zij = zi − zj . (iii) FN is non-singular at zij = 0 for all i = j . (iv) FN is periodic in zi with period 2πi. (v) FN is quasi-periodic in zi with period 2πiτ and multiplier (αi ,αi )
q (αi ,αi )/2+(αi ,β) qi
.
(82)
Proof. (i) Apply the general locality property for vertex operators, e.g. [Ka] (z − w)N Y (u, z).Y (v, w) = (z − w)N Y (v, w).Y (u, z), for N sufficiently large, to all adjacent pairs of operators in (36) of Lemma 1. (ii) This follows from (i) and (35) of Lemma 1. (iii) Suppose that FN is singular at zn = z0 for some z0 = zj for all j = 1, . . . , n−1. Using (ii) we may assume that z0 = 0 by redefining zi to be zi − z0 for all i. But FN cannot be singular at zn = 0 from (36) of Lemma 1 since Y [vn ⊗ eαn , zn ].1|zn =0 = vn ⊗ eαn and hence the result follows. (iv) This follows from the integrality of conformal weights.
Torus Chiral n-Point Functions
65
(v) Using (i) we have
L(0) FN = q −c/24 T rN Y q2 v2 ⊗ eα2 , q2 . L(0)
. . . Y qnL(0) vn ⊗ eαn , qn .Y q1 v1 ⊗ eα1 , q1 q L(0) . Under z1 → z1 + 2πiτ we have FN → FˆN where
L(0) FˆN = q −c/24 T rN Y q2 v2 ⊗ eα2 , q2 .
L(0)
. . . Y qnL(0) vn ⊗ eαn , qn .q L(0) .Y q1 v1 ⊗ eα1 , q1 ,
(83)
using (38). Consider the cocycle parts of the vertex operators within FˆN . Using (23) to (25) we see that eαr .qrαr .q L(0) .eα1 .q1α1 .(v ⊗ eβ ) 2≤r≤n (α1 ,α1 )
= q (α1 ,α1 )/2+(α1 ,β) q1
eαr .qrαr .q L(0) .(v ⊗ eβ ).
(84)
1≤r≤n
Since N = n∈Z Mn ⊗ eβ , the graded trace (83) decomposes into finite dimensional traces over Mn to which we may apply the standard trace property T rAB = T rBA on the remaining parts of the vertex operators within FˆN . Hence we obtain (α ,α ) FˆN = q (α1 ,α1 )/2+(α1 ,β) q1 1 1 FN ,
as required. By (i) we obtain the quasi-periodicity (82) for each zi .
Let us now consider the generating function, FN (a, z1 ; . . . ; a, zn ; q), for all Fock state n-point functions for the rank one case. Note that FN (q) ≡ ZN (q) = q (β,β)/2 /η(τ ) for n = 0. From Lemma 5, FN (a, z1 ; . . . ; a, zn ; q) is periodic in each zi with periods 2π i and 2π iτ . The singularity structure at zij = 0 is determined by Lemma 6. For n ≥ 2 and for i = j , FN (a, z1 ; . . . ; a, zn ; q) has the following leading behaviour in its (formal) Laurent expansion in zij : FN (a, z1 ; . . . ; a, zn ; q) =
1 F (a, z1 ; . . . ; a, ˆ zˆ i . . . ; a, ˆ zˆ j . . . ; a, zn ; q) + . . . , 2 N zij (85)
where a, ˆ zˆ i and a, ˆ zˆ j denotes the deletion of the corresponding vertex operators resulting in an n − 2 point function. Proof. Using Lemma 5 (i) it suffices to consider the expansion in zn−1n . The result then follows from (35) of Lemma 1 and using Y [a, zn−1n ].a =
1 2 zn−1n
1+
k−1 zn−1n a[−k].a.
k≥1
We also have the following integral normalisation:
66
G. Mason, M.P. Tuite
Lemma 7. For each i = 1, . . . , n ≥ 1, 2π i 1 FN (a, z1 ; . . . ; a, zn ; q)dzi = (a, β)FN (a, z1 ; . . . ; a, ˆ zˆ i ; . . . ; a, zn ; q), 2π i 0 (86) where a, ˆ zˆ i denotes the deletion of the corresponding vertex operator giving an n − 1 point function. Proof. Using Lemma 5 (i) it suffices to consider i = 1 only. Then the integral is 2πi 1 FN (a, z1 ; . . . ; a, zn ; q)dz1 2π i 0 1 = T rN Y (a, q1 )dq1 Y (q2 a, q2 ) . . . Y (qn a, qn )q L(0)−1/24 2πi C1 = T rN o(a)Y (q2 a, q2 ) . . . Y (qn a, qn )q L(0)−1/24 = (a, β)T rN Y (q2 a, q2 ) . . . Y (qn a, qn )q L(0)−1/24 , where C1 denotes a closed contour surrounding q1 = 0.
We now show that FN (a, z1 ; . . . ; a, zn ; q) is uniquely determined to be given by (56) of Corollary 4 as follows: Proposition 6. FN (a, z1 ; . . . ; a, zn ; q) is the unique meromorphic function in zi ∈ C/{2π i(m + nτ )|m, n ∈ Z} obeying Lemmas 5, 6 and 7 and is given by (56). Proof. If FN (a, z1 ; . . . ; a, zn ; q) is meromorphic then it is an elliptic function from Lemma 5 (iv) and (v). We prove the required result by induction. For n = 1, FN (a, z1 ; q) has no poles from Lemma 5 (iii) and is therefore constant in z1 . Then (86) of Lemma 7 implies FN (a, z1 ; q) = (a, β)
q (β,β)/2 , η(τ )
in agreement with the RHS of (56) in this case. Next consider the elliptic function G(z1 , . . . , zn ) = FN (a, z1 ; . . . ; a, zn ; q) n − P2 (z1i )FN (a, z2 ; . . . ; a, ˆ zˆ i ; . . . ; a, zn ; q), i=2
where a, ˆ zˆ i denotes the deletion of the corresponding vertex operator. From Lemma 6, G is holomorphic and elliptic in z1 and is therefore independent of z1 . Next note that although not an elliptic function, P1 (z, τ ) is periodic with period 2π i and so 2π i P2 (z1i , τ )dz1 = 0. Hence Lemma 7 implies 0 FN (a, z1 ; . . . ; a, zn ; q) = (a, β)FN (a, z2 ; . . . ; a, zn ; q) n + P2 (z1i )FN (a, z2 ; . . . ; a, ˆ zˆ i ; . . . ; a, zn ; q). i=2
(87) But this recurrence relation is precisely (50) with v1 = 1, v2 = . . . = vn = a and αi = 0. Thus we obtain the RHS of (56) by induction as before.
Torus Chiral n-Point Functions
67
Remark 1. The rank l result follows as before by considering the tensor product of l rank one Heisenberg VOAs. We now consider the lattice n-point function FN (eα1 , z1 ; . . . ; eαn , zn ; q) for a rank l lattice. We firstly note the following: Lemma 8. For n ≥ 1 and for i = j , the (formal) Laurent expansion in zij of FN (eα1 , z1 ; . . . ; eαn , zn ; q) has leading behaviour FN (eα1 , z1 ; . . . ; eαn , zn ; q) (α ,αj )
= (αi , αj )zij i
FN (eα1 , z1 ; . . . eˆαi , zˆ i ; . . . eαi +αj , zj ; . . . ; eαn , zn ; q) + . . . , (88)
eˆαi , zˆ i
where denotes the deletion of the corresponding vertex operator resulting in an n − 1 point function. Proof. Using Lemma 5 (i) it suffices to consider the expansion in zn−1n . The result then follows from (24), (25) and (35) of Lemma 1 to find (α
n−1 Y [eαn−1 , zn−1n ].eαn = (αn−1 , αn )zn−1n
,αn ) αn +αn−1
e
+ ... .
Next recall the following properties for the prime form K(z, τ ), e.g., [Mu] Lemma 9. The genus one prime form K(z, τ ) is a holomorphic function on C/{2π i(m+ nτ )|m, n ∈ Z} given by iθ1 (z, τ ) K(z, τ ) = − , (89) η(τ )3 θ1 (z, τ ) ≡ exp(π iτ (n + 1/2)2 + (n + 1/2)(z + iπ )), (90) n∈Z
where K(z, τ ) is quasi-periodic in z with period 2π i and multiplier −1 and with period 2π iτ and multiplier −q −1/2 qz−1 . Furthermore K(z, τ ) has a unique zero at z = 0 on C/{2π i(m + nτ )|m, n ∈ Z}. We finally show that FN (eα1 , z1 ; . . . ; eαn , zn ; q) is uniquely determined to be given by (78) as follows: Proposition 7. FN (eα1 , z1 ; . . . ; eαn , zn ; q) is the unique meromorphic function in zi ∈ C/{2π i(m + nτ )|m, n ∈ Z} obeying Lemmas 5 and 8. Proof. If FN (eα1 , z1 ; . . . ; eαn , zn ; q) is meromorphic then consider the meromorphic function FN (eα1 , z1 ; . . . ; eαn , zn ; q) G(z1 , . . . , zn ) = . (91) (αi ,αj ) 1≤r≤n exp((β, αr )zr ) 1≤i<j ≤n (αi , αj )K(zij , τ ) We wish to show that G = FN (q) = η(τ )l . We prove this by induction in n. It is easy to see that G is periodic with periods 2πi, 2π iτ using Lemma 5 (iv), (v) and Lemma 9 and hence G is elliptic in zi . G is also a function of zij and considering the Laurent expansion in zij one finds that the leading term is given by G(z1 , . . . , zn ) = FN (q) + . . . , using Lemma 8 and (26) and induction. Hence G is regular at zij = 0. But the denominator of G has zeros possible only at zij = 0 (for (αi , αj ) > 0) and hence G is a holomorphic elliptic function and is therefore constant in zi . Thus G(z1 , . . . , zn ) = FN (q) and the result follows.
68
G. Mason, M.P. Tuite
References [B]
Borcherds, R.: Vertex algebras, Kac-Moody algebras and the Monster. Proc. Natl. Acad. Sci. U.S.A. 83, 3068–3071 (1986) [D] D’Hoker, E.: String Theory. In: Quantum Fields and Strings: A Course for Mathematicians, Deligne et al. (eds). Providence, RI: AMS, 1999 [DLM] Dong, C., Li, H., Mason, G.: Modular-invariance of trace functions in orbifold theory and generalized moonshine. Commun. Math. Phys. 214, 1–56 (2000) [DM] Dong, C., Mason, G.: Monstrous moonshine of higher weight. Acta Math. 185, 101–121 (2000) [DMN] Dong, C., Mason, G., Nagatoma, K.: Quasi-modular forms and trace functions associated to free boson and lattice vertex operator algebras. Int. Math. Res. Not. 8, 409–427 (2001) [EZ] Eichler, M., Zagier, D.: The Theory of Jacobi Forms. Boston: Birkh¨auser, 1985 [FHL] Frenkel, I., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Am. Math. Soc. 104, 494 (1993) [FLM] Frenkel, I., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. New York: Academic Press, 1988 [Go] Goddard, P.: Meromorphic conformal field theory. In: Proceedings of the CIRM Luminy Conference 1988. Singapore: World Scientific, 1989 [Ka] Kac, V.: Vertex Operator Algebras for Beginners. University Lecture Series, Vol. 10, Providence, RI: AMS, 1998 [LW] van Lint, J., Wilson, R.: A Course in Combinatorics. Cambridge: Cambridge University Press, 1992 [MN] Matsuo, A., Nagatomo, K.: Axioms for a vertex algebra and the locality of quantum fields. Math. Soc. Japan Memoirs 4 (1999) [MT] Mason, G., Tuite, M.P.: Work in progress [Mu] Mumford, D.: Tata Lectures on Theta I. Boston: Birkh¨auser, 1983 [P] Polchinski, J.: String Theory, Volumes I and II. Cambridge: Cambridge University Press, 1998 [T] Tuite, M.P.: Genus two meromorphic conformal field theory. CRM Proc. Lecture Notes 30, 231–251 (2001) [Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Am. Math. Soc. 9, 237–302 (1996) Communicated by L. Takhtajan
Commun. Math. Phys. 235, 69–124 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0809-5
Communications in
Mathematical Physics
Renormalizations and Rigidity Theory for Circle Homeomorphisms with Singularities of the Break Type K. Khanin1,2,3 , D. Khmelev1,2,4 1
Isaac Newton Institute for Mathematical Sciences, University of Cambridge, 20 Clarkson Road, Cambridge CB3 OEH, UK. E-mail:
[email protected] 2 Department of Mathematics, Heriot-Watt University, Edinburgh, UK 3 Landau Institute for Theoretical Physics, Moscow, Russia 4 Department of Mathematics, University of Toronto, Canada Received: 1 August 2002 / Accepted: 7 October 2002 Published online: 25 February 2003 – © Springer-Verlag 2003
Abstract: Circle homeomorphisms with singularities of the break type are considered in the case when rotation numbers have periodic continued fraction expansion. We establish hyperbolicity for renormalizations and then use it in order to prove the following rigidity result. Namely, we show that any two homeomorphisms with a single break point are smoothly conjugate to each other provided they have the same quadratic irrational rotation number and the same “size” of a break. Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2. General Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3. The Family of Homeomorphisms Ta,v,c . . . . . . . . . . . . . . . 4. The Monotonicity of Rotation Number ρ(Ta,v,c ) w.r.t. a . . . . . . . 5. The Structure of Domains of Constant Rotation Number in c . . . 6. Exponential Instability of ρ(Ta,v,c ) w.r.t. a . . . . . . . . . . . . . . 7. Uniform Hyperbolicity of the RG-Transformation at Periodic Points 8. The Convergence of Renormalizations for f ∈ Fc . . . . . . . . . . 9. Proof of Theorem A . . . . . . . . . . . . . . . . . . . . . . . . . . 10. Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
69 74 79 89 92 97 102 109 113 121
1. Introduction Renormalization group ideas have been successfully applied in the context of dynamical systems for about 25 years, starting with the pioneering work by M. Feigenbaum [19, 20]. Feigenbaum studied infinite sequences of period-doubling bifurcations for unimodal mappings. It was the initial example for which the concept of universal metric properties appeared in the theory of dynamical systems for the first time. Since then the
70
K. Khanin, D. Khmelev
renormalizations method was successfully used in other dynamical problems such as complex universality, KAM-theory, circle dynamics, critical invariant curves and many others [1, 5–7, 9–18, 21–23, 25, 30, 31, 33–49, 51–55, 57–67, 69–73]. The literature on renormalizations is vast and the list above aims just to give an idea of different directions and developments of the theory. Although the universal behavior for period-doubling bifurcations was observed not only for one-dimensional mappings, but also for systems of ordinary differential equations and even in the case of infinite-dimensional PDE’s, the phenomenon is essentially one-dimensional. In fact, the main body of the RG-theory in dynamical systems corresponds to the one-dimensional or quasi-one-dimensional case (like complex dynamics). That is mainly due to the presence of the order structure in the one-dimensional situation. As a result, one can identify the combinatorial types of the trajectories of a mapping, which are topological invariants of the dynamics. These combinatorial types are characterized by kneading invariants in the case of interval maps and by rotation numbers in the case of circle dynamics. The discovery by Mitchell Feigenbaum led to realizing that the combinatorics of the trajectories in fact uniquely determines asymptotic geometry or, in other words, asymptotic scalings of trajectories. This brings us to the rigidity principle, which seems to be valid in a very general situation. Roughly speaking, it can be formulated in the following way. If two one-dimensional mappings have the same structure of singularities and the same combinatorial type, then they are, in a certain sense, smoothly conjugate to each other. In other words, if two maps are homeomorphically conjugate to each other and the types of their singular points are the same, then, in fact, the conjugation is smooth. At present, there exist only few examples, where this general statement was proven, and renormalization is the most powerful tool in the existing approaches. Probably, the first rigidity result was the local KAM-type theorem by Arnold [2]. He has proved that if an analytic diffeomorphism on the circle with irrational Diophantine rotation number is close to pure rotation, then they are analytically conjugate to each other. Arnold conjectured that in the one-dimensional case the local condition of being close to pure rotation is not essential and the result should be valid globally. Much later, the global result indeed has been proven by Michael Herman [26, 27]. Herman’s theorem is certainly a rigidity type result in the case where the singularities are absent. In this situation there is a distinguished representative which is a pure rotation, and the theorem says that a C 3+ε −smooth diffeomorphism is smoothly conjugate to a linear one. Yoccoz extended Herman’s theory to the case of Diophantine rotation numbers [74]. Later the results of Herman’s type were proven for the C 2+ε diffeomorphism [28, 29, 33, 62, 64]. The first case of mappings with singularities was studied in connection with perioddoubling sequences for unimodal maps. The local rigidity result was basically proven by O. Lanford, who gave a computer-assisted proof of existence of a hyperbolic fixed point for the period-doubling operator [38, 40]. The global result for a period doubling case was a remarkable achievement of the theory of Sullivan-McMullen-Lyubich [45, 51, 67]. In fact, this theory can be applied to the case of infinitely-renormalizable unimodal maps of bounded combinatorial type with a singular point of a type x 2α , where α ∈ N. The theory of Sullivan-McMullen-Lyubich is based on a beautiful and deep connection with complex dynamics. However, the condition of singularity being of integer type x 2α is quite annoying, although it is natural from the “physical” point of view. Indeed, it is quite clear that the phenomena of rigidity and universality do not depend on the condition of α being integer-valued, and the result should be correct for the maps of the type f (|x − xmax |β ) for all real β > 1. The fact that the general phenomenon can be analyzed only for particular values of the parameter β indicates that we yet do not have real understanding of the mechanism of rigidity.
Renormalizations and Rigidity Theory
71
It turns out that it is easier to discuss rigidity properties for circle dynamics rather than for interval maps. This is due to monotonicity which simplifies both the combinatorics of trajectories and also the dependence on the parameter for one-parameter families of circle maps. Combinatorics of trajectories is characterized by the rotation number ρ. In the case of ρ irrational it gives a complete description of the order of points for every trajectory. The counterpart to unimodal maps in the case of circle dynamics is formed by so-called critical circle mappings. Critical circle mappings are circle homeomorphisms which arise from smooth functions f (y), the derivatives of which are positive everywhere on [0, 1) except at one critical point xcr , where they vanish. These mappings were studied quite intensively in the last twenty years. As in the unimodal case the type of singularity is characterized by the exponent 1 + β, namely, f (y) = g((y − xcr )|y − xcr |β ), β > 0. It was shown by de Faria and de Melo and Yampolsky [12, 13, 72] that SullivanMcMullen-Lyubich theory can be extended to critical circle mappings. They have proved that if two mappings have singularities of the same type 1 + β, where β = 2k, k ∈ N and the same irrational rotation number of bounded type, then they are C 1+γ -conjugate. As in the case of unimodal maps, at present one can prove the result only for the case of integer β, while there is no doubt that the rigidity holds for all β > 0. One of the main aims of this paper is to present a class of one-dimensional dynamical systems, where rigidity analysis can be carried out for all singularities of particular type. Instead of critical maps we consider homeomorphisms of the circle with a break singularity. More precisely, let S 1 = R/Z, f (y) = y + f˜(y). Suppose that 1) f˜ is a continuous Z-periodic function: f˜ ∈ C(R), f˜(y + k) = f˜(y) for any k ∈ Z. We shall also assume that 0 ≤ f˜(0) < 1. 2) There exists a unique xcr ∈ [0, 1), such that f˜ ∈ C 2+ε (xcr , xcr + 1) . 3) In the point of break xcr there exist one-sided derivatives f (xcr −), f (xcr +) > 0 and f (xcr −)/f (xcr +) = c = 1. 4) inf f (y) > 0.
(B)
y∈(xcr ,xcr +1)
Denote by Fc the set of functions f satisfying conditions 1)–4). For every f ∈ Fc we define a homeomorphism of a circle Tf (x) = {f (x)}, where {b} = b − [b] = b (mod 1) and {·}, [·] stand for a fractional and integer parts of a number. We shall say that an orientation-preserving homeomorphism of a circle Tf has a break singularity of the type c at the point xcr . In other words, a homeomorphism with a break singularity is just a homeomorphism of the circle which is C 2+ε -smooth everywhere except at the point of break xcr , where it has a jump in the first derivative. Notice that the conditions 3)–4) imply that derivatives f (x) are positive and bounded away from zero. Parameter c is obviously a smooth invariant and it characterizes a type of singularity in the same sense as α, β above. Notice that c = 1 corresponds to the diffeomorphism case, with rigidity following from Herman’s theorem. However, in the case of non-trivial break (c = 1) the dynamics exhibits many “critical” properties which are similar to the case of critical circle mappings. As we shall see later, the mappings with break singularity have strong rigidity properties with highly non-trivial asymptotic scalings. This “criticality” is not obvious at all since the mappings do not have critical points. It is an interesting and unexpected phenomenon that a rather weak singularity of the break type can produce a very rich scaling structure. It is also important to mention that homeomorphisms of the break type can be considered as a natural one-parameter extension of Herman’s rigidity theory where linear rotations are replaced by fractional-linear mappings with break singularities.
72
K. Khanin, D. Khmelev
We next describe two natural geometrical constructions which lead to circle homeomorphisms of the break type. The first one was suggested by V.I. Arnold. Consider a convex closed set S. Suppose that its boundary ∂S is a curve of length 1. An involution Ia of this curve with respect to the line (direction) a is defined as follows. Consider all parallel translations of the line a. Since the set S is convex, any translation of a meets ∂S at none, one or two points. In the former case we exchange these two points, i.e, Ia (x1 ) = x2 and Ia (x2 ) = x1 for any two points x1 , x2 ∈ ∂S such that the segment connecting these two points is parallel to a. For any two directions a and b the iteration Ib ◦ Ia of the corresponding involutions defines an orientation-preserving homeomorphism of the boundary ∂S. In order to obtain a homeomorphism with break singularity suppose that ∂S is smooth everywhere except at one point C, where we have a corner (see Fig. 1). Suppose now that ∂S is parameterized by the arc length parameter: x = x(l), y = y(l), where l ∈ [0, 1), so that 0 corresponds to a singular point C. We shall denote by T the homeomorphism corresponding to Ib ◦ Ia in the coordinate l. Suppose that one of the following assumptions holds: i) x(l), y(l) ∈ C 4 ([0, 1]) and all tangent lines to ∂S have non-degenerate tangency, ii) x(l), y(l) ∈ C ∞ ([0, 1]) and there are no flat points.
(I )
Also assume that the curve (x(l), y(l)) has finite one-sided curvatures which are k+ > 0 at l = 0 and k− > 0 at l = 1. Notice that the last condition is obviously satisfied in
A π−γ
a γ γ
γ
b
α α
C γ
Fig. 1. Arnold’s two involutions. The homeomorphism Ib Ia has one break-point at A and the break parameter c = sin(γ + α)/ sin(γ − α)
Renormalizations and Rigidity Theory
73
the case i). The number of singularities of the homeomorphism Ib ◦ Ia obtained depends on the choice of a and b. Below we assume without loss of generality that a and b are passing through the point C. Let a be the internal bisector of the angle 2α between two distinct tangents to the curve ∂S at point C. Denote by γ the angle between a and b (see Fig. 1). We shall assume that 0 < α < γ < π − α. This condition guarantees that Ib ◦ Ia can have singularities only at points A and C. Since a is the bisector, it is easy to check that Ib ◦ Ia has no jump of first derivative at point C, i.e., T (0+) = T (0−). Simple calculation implies that T (0−)/T (0+) = k− /k+ . Assuming k− = k+ , we see that the induced homeomorphism T satisfies conditions (B) withbreak at the point xcr , where (x(xcr ), y(xcr )) = A. An easy calculation shows that c = T (xcr −)/T (xcr +) = sin γ / sin 2α. Notice that condition k− = k+ is not essential. It turns out that singularities in the second derivative do not change the renormalization group behavior of a mapping. One can easily generalize all the results of this paper to the case where the second derivative is piecewise continuous with finite number of jump discontinuities. The second construction (which was suggested to us by S. Anisov) involves an annulus formed by two convex sets S1 , S2 with disjoint boundaries ∂S1 , ∂S2 (see Fig. 2). We shall assume that the internal boundary ∂S2 is smooth and the external boundary ∂S1 is smooth everywhere except at the point B, where it has a corner. For every point P of ∂S1 there exist two tangents to ∂S2 . Let P− , P+ be the intersections of those tangents with ∂S1 and Q− , Q+ be the corresponding tangency points. The indices − and + are chosen according to orientation of the plain, namely, we assume that the oriented area corre −, P P + ) is positive. Obviously, both P+ and P− depend continuously sponding to (P P ˆ on P1 . Denote by T1 , Tˆ2 the homeomorphisms of ∂S1 , ∂S2 defined by Tˆ1 : P → P+ , Tˆ2 : Q− → Q+ . Using the normalized arc length parameters we get the homeomorphisms of the unit circle T1 , T2 corresponding to Tˆ1 , Tˆ2 . It is easy to see that Tˆ2 has one break point at the point Q (see Fig. 2). It follows that Tˆ2 satisfies all the conditions 1)–4) in (B). The homeomorphism Tˆ1 has, in general, two break points: at A and at B, where A = Tˆ1−1 B. However, since both break points belong to the same trajectory, the analysis of such homeomorphisms can be reduced to the case (B). The “strengths” of
B α
P
γ β
Q− P− Q
Q+ S2
S1 y
P+ A Fig. 2. Annulus with the corner at point B
x
74
the breaks cA =
K. Khanin, D. Khmelev
T1 (A−)/T1 (A+) and cB = T1 (B−)/T1 (B+) can be easily found: cA =
sin (β + γ ) , cB = sin α
sin γ . sin (α + β)
As in the case of involutions the boundary curves ∂S1 , ∂S2 should satisfy certain regularity conditions similar to (I ). The main result of this paper is given by the following theorem which for the first time was announced in [31] without the proof. Theorem A. Let f , g ∈ Fc . Suppose that 1) ρ(Tf ) = ρ(T g) = ρ, √ 2) ρ is quadratic irrational, i.e., ρ = a + b, a, b ∈ Q. Then there exists γ > 0 such that f and g are C 1+γ conjugate, i.e., Tf = ◦ T g ◦ −1 , where is a C 1+γ orientation-preserving diffeomorphism. It is well-known [24] that quadratic irrationals are exactly the irrational numbers which have eventually a periodic continued fraction expansion. Although they form a dense set in [0, 1), there are only countably many of them. In this paper for simplicity we restrict our attention to this case only. However it will be shown in the forthcoming paper [32] that the result can be generalized to a much wider class of rotation numbers. In fact, there are strong arguments which suggest that the case of circle mappings with singularities might have stronger rigidity properties compared to the diffeomorphism case, namely, that a certain rigidity holds for all irrational rotation numbers, not only for typical ones. The paper has the following structure. In Sect. 2 we define renormalizations and present results on the convergence to fractional-linear transformations which were proven in [34]. The properties of renormalization in the space of pairs of fractional-linear transformation is studied in Sect. 3. In Sects. 4–6 we prove the monotonicity and exponential instability of the rotation number with respect to the natural parameter which is defined in Sect. 2. In Sect. 7 we establish uniform hyperbolicity of the renormalization transformation at periodic points. The convergence of the renormalizations to the periodic orbits is proven in Sect. 8. We finish with the proof of the main. Theorem A in Sect. 9 and conclusions in Sect. 10. 2. General Setting In this section we present basic facts of the theory of circle homeomorphisms with the break singularity. Let f (y) = y + f˜(y), where f˜ is a continuous Z-periodic function, i.e., f˜(y + k) = f˜(y) + k for any k ∈ Z, 0 ≤ f˜(0) < 1. Suppose that f (y) strictly increases, i.e. f (y1 ) > f (y2 ) for any y1 > y2 . For any such f we define a homeomorphism of the circle Tf (x) = {f (x)}, where {·} is the fractional part of a number. Given a homomorphism Tf one can define its rotation number f n x0 , n→∞ n
ρ(f ) = lim
(1)
Renormalizations and Rigidity Theory
75
where the right-hand-side limit exists for all x0 ∈ R and is independent of x0 . In (1) and everywhere below f n stands for the nth iterate of f : f n = f (f . . . (f (x))).
n times
It is easy to see that 0 ≤ ρ(f ) < 1. We shall repeatedly use the continued fraction expansion for ρ(f ): ρ(f ) = ρ = [k1 , k2 , . . . , kn , . . . ] =
1 k1 +
,
1 1 ···
k2 +
kn +
1 ···
which is finite for rational ρ and is infinite for irrational. Positive integers kl are the partial quotients of ρ. In the case of rational ρ = [k1 , . . . , kn ] we shall always assume kn > 1. This condition guarantees the uniqueness of the expansion (see [24]). We shall also use the sequence of convergents pn = [k1 , . . . , kn ]. qn The numerator pn and denominator qn satisfy the following recurrent relation: qn = kn qn−1 + qn−2 , n ≥ 1, pn = kn pn−1 + pn−2 , where we have used standard conventions p−1 = 1, q−1 = 0 and p0 = 0, q0 = 1. The Gauss map G : ρ → {1/ρ} corresponds to the unit shift of a symbolic sequence given by the continued fraction expansion: Gρ = [k2 , k3 , . . . ] if ρ = [k1 , k2 , k3 , . . . ]. This implies that ki+1 = [1/ρi ], where ρi = G i ρ, i ≥ 0 and [·] is the integer part of a number. We next discuss the notion of renormalizations. The main idea of the method is to study large time iterates of the original mappings in a rescaled coordinate system corresponding to some neighborhood of a given point. We start with a traditional definition of a sequence of the renormalized pairs. Suppose Tf has an irrational rotation number ρ(Tf ). Denote pn /qn = [k1 , . . . , kn ] the sequence of convergents to ρ(Tf ). For a fixed initial point x0 define its trajectory {xi = Tf i x0 , i ≥ 1}. Then the nth renormalized pair associated to the point x0 is defined by: 1 [f qn (x0 + z(x0 − xqn−1 )) − x0 − pn ], z ∈ [−1, 0], x0 − xqn−1 1 gn (z) = [f qn−1 (x0 + z(x0 − xqn−1 )) − x0 − pn−1 ], z ∈ [0, an ], x0 − xqn−1 fn (z) =
76
K. Khanin, D. Khmelev
where x0 − xqn−1 = x0 + pn−1 − f qn−1 x0 , xqn − x0 = f qn x0 − x0 − pn , an = (xqn − x0 )/(x0 − xqn−1 ). We shall formulate now the results on renormalized pairs which were proven in [34]. In what follows we assume that the initial point x0 = xcr . Let us first define the family of pairs of the fractional-linear mappings F˜a,b,c (z) =
(a + cz)b ab(z − c) ˜ a,b,c (z) = , G . b + (b + a − c)z abc + (c − a − cb)z
Then, the following theorem holds. Theorem 1. Suppose f satisfies conditions 1)–4) given in the introduction. Then there exist positive constants K > 0 and 0 < λ < 1 such that for all n, fn − F˜an ,bn ,cn C 2 ([−1,0]) ≤ Kλεn and
Kλεn , an where cn = c for odd n and cn = 1/c for even n, ε is the H¨older exponent in condition 2) in (B) and an , bn are given by the following relations: ˜ an ,bn ,cn C 2 ([0,a ]) ≤ gn − G n
an =
x0 − xqn−1 +qn xqn − x0 , bn = . x0 − xqn−1 x0 − xqn−1
(2)
The flipping of cn is due to a change of an orientation while passing from the nth to the (n + 1)th step of the renormalization. The proof of Theorem 1 is based on an analysis of the dynamical partitions ξn (xcr ), which are formed by the points {xi = Tf i xcr , 0 ≤ i < qn + qn−1 }. We shall call ξn (xcr ) (n) (n) the nth level partition. It consists of two sets of intervals: i = Tf i 0 , 0 ≤ i < qn−1 (n−1) (n−1) (n) (n−1) and j = Tf j 0 , 0 ≤ j < qn , where 0 , 0 are intervals with endpoints at (x0 , xqn ) and (xqn−1 , x0 ) respectively. Denote q n −1 Mn = exp j =0
(n−1)
j
f (y) dy . 2f (y)
The following estimate was also proven in [34]: |an + bn Mn − cn | ≤ const an λn .
(3)
Here and often below we use the notation const for constants which depend only on the homeomorphism Tf . Using the thermodynamic formalism one can prove an ergodic theorem for the random process, corresponding to a symbolic representation for the elements of partition ξn (xcr ) (see [33]). This implies that (n−1) (n−1) |j | |j | (n−1) (n−1) √ j :j ∈1 j :j ∈2 ≤ const λ k , (4) − max 1 1 ,2 ∈ξn−k |1 | |2 |
Renormalizations and Rigidity Theory
77
where 0 < λ1 < 1. The estimate (4) means one can define an approximate density of q (n−1) (n−1) , 0 ≤ j < qn . This density is equal to pn = j n=0 |j |. It follows intervals j that f (y) Mn = exp p¯ n (5) dy = ep¯n log c , (y) 1 2f S √
n
where |pn − p¯ n | ≤ const λ2 , 0 < λ2 < 1. Dynamical partitions ξn (x) can be defined for any homomorphism of the circle with an irrational rotation number. We shall repeatedly use the following lemma from [33]. Lemma 1. Suppose ρ(Tf ) is irrational, Tf is piecewise C 1 and Var x∈S 1 log f (x) = V < ∞. Then for any x0 ∈ S 1 and for all n ≥ 0, (n+2)
|0
(x0 )| (n) |0 (x0 )|
≤λ=
1 < 1. 1 + e−V
This lemma follows from the Denjoy inequality (see [33]). It guarantees that the length of elements of the dynamical partition ξn (x0 ) decay exponentially as n → ∞ uniformly in x0 . It will be convenient for us to define also a slightly different renormalization scheme. This scheme allows to construct a new renormalized homeomorphism of the circle instead of renormalized pairs (fn , gn ). The new renormalizations are also associated with a fixed point x0 ∈ S 1 . As above it is natural to consider x0 = xcr in the case of homeomorphisms with a singularity. Suppose that ρ(Tf ) = 1/k for all k ∈ N, i.e., ρ(Tf ) = [k1 , k2 , . . . ]. Therefore, the following inequalities hold for all y ∈ R: f k1 y < y + 1 < f k1 +1 y < fy + 1. Then we can define a pair of functions (f, g) which we call the renormalization of Tf : 1 [f k1 (x0 + z(x0 − f x0 )) − x0 − 1], z ∈ [−1, 0), x0 − f x 0 1 g(z) = [f (x0 + z(x0 − x1 )) − x0 ], z ∈ [0, a], x0 − f x 0 f(z) =
where a = (f k1 x0 − x0 − 1)/(x0 − x1 ). We shall denote (f, g) = Rx0 Tf . Clearly, f(0) = a, g(0) = −1 and f(−1) = g(a) = (f k1 +1 x0 −x0 −1)/(x0 −f x0 ) ∈ (−1, 0). Therefore one can construct a homeomorphism of the unit circle corresponding to the pair (f, g) by rescaling the segment [−1, a] to the whole circle. We shall denote this operation by Hx0 . Choosing a semi-open interval [x0 − 1/(1 + a), x0 + a/(1 + a)) as a representative for the circle we define f((1 + a)(x − x0 )) 1 + x0 , when x ∈ [x0 − , x0 ], +a 1+a (Hx0 (f, g))(x) = g((1 +1a)(x (6) − x0 )) a + x0 , when x ∈ [x0 , x0 + ). 1+a 1+a Obviously (Hx0 Rx0 )Tf defines a new homeomorphism of the unit circle which we call the renormalization of Tf .
78
K. Khanin, D. Khmelev
Lemma 2. Suppose that f : x → x + ρ is renormalizable, i.e., ρ(Tf ) = ρ = [k1 , k2 , . . . ] = 1/k, k ∈ N. Then ρ((Hx0 Rx0 )Tf ) = [k2 + 1, k3 , . . . ] =
1 k2 + 1 +
.
1
(7)
1 ···
k3 +
kn +
1 ···
Proof. Obviously (Hx0 Rx0 )Tf is a linear rotation. To find its rotation number it is enough to evaluate it at one point. Taking x = x0 we have (Hx0 (f, g))(x0 ) =
f(0) a + x0 = + x0 . 1+a a+1
a . Using ρ = 1/(k1 + α), 0 < α = [k2 , k3 , ...] < a+1 1, a = (1 − k1 ρ)/ρ = α we have:
Hence, ρ((Hx0 Rx0 )Tf ) =
ρ((Hx0 Rx0 )Tf ) =
α 1 a = = = [k2 + 1, k3 , ...]. a+1 α+1 1 + 1/α
Notice that the last statement holds for rational rotation numbers as well as for irrationals. In particular, ρ((Hx0 Rx0 )Tf ) = [k2 + 1] if f (x) = x + ρ and ρ(Tf ) = ρ = [k1 , k2 ]. Lemma 3. Suppose that Tf is an arbitrary renormalizable homeomorphism of the circle S 1 and ρ(Tf ) = [k1 , k2 , . . . ]. Then ρ((Hx0 Rx0 )Tf ) satisfies (7). Proof. For irrational ρ the statement easily follows from the previous lemma. Indeed, a rotation number is completely determined by the order of points of a trajectory {xj }j ≥0 which by the Poincar´e theorem coincides with the same order for the linear rotation. If ρ = p/q then again by the Poincar´e theorem there exists a periodic trajectory y0 , y1 , . . . , yq−1 such that the order of points of the trajectory {yj } coincides with the order of points for pure rotation. At least one of the points of this periodic trajectory belongs to the fundamental interval with the end points Tf x0 and Tf k1 x0 containing x0 . Obviously this point corresponds to a periodic trajectory for the renormalized homeomorphism (Hx0 Rx0 )Tf . It is easy to see that the combinatorial type of this periodic trajectory corresponds exactly to a rotation number given by (7). Lemma 3 shows that Hx0 Rx0 acts on rotation numbers similarly to the Gauss mapping. The only difference is connected with a replacement of the first partial quotient k with k + 1. For simplicity of notations everywhere below we omit the index x0 , if x0 corresponds to the critical point of the mapping. If the homeomorphism (H R)Tf is renormalizable, we say that Tf is twice renormalizable, if all homeomorphisms (H R)i Tf, 1 ≤ i ≤ n−1 are renormalizable we say that Tf is n-times renormalizable. It follows from Lemma 3
Renormalizations and Rigidity Theory
79
that the homeomorphism Tf is n-times renormalizable if and only if the rotation number ρ(Tf ) = [k1 , . . . , kn , . . . ] has at least n + 1 partial quotients. It is easy to see that renormalized pairs (fn , gn ) which we defined above are given by (fn , gn ) = R(H R)n−1 Tf, n ≥ 1. Our aim is to study the asymptotic properties of (fn , gn ) as n → ∞. The essence of the renormalization method is based on the following idea: after iterations and rescaling the asymptotic behavior of the pairs (fn , gn ) becomes universal and depends only on few parameters. This is indeed the case in the break situation. We have seen above that (fn , gn ) ˜ a,b,c ). It was asymptotically as n → ∞ approaches the two-parameter family (F˜a,b,c , G ˜ ˜ shown in [34] that the family (Fa,b,c , Ga,b,c ) is invariant with respect to renormalization, ˜ a,b,c ) is renormalizable, then i.e., if the homeomorphism H (F˜a,b,c , G ˜ a,b,c ) = (F˜a ,b ,c , G ˜ a ,b ,c ), RH (F˜a,b,c , G where c = 1/c and a , b are obtained as follows. 1) Find a trajectory of the point −1 for the mapping F˜a,b,c until it changes sign: k+1 k F˜a,b,c (−1) = −b, . . . , F˜a,b,c (−1) < 0, F˜a,b,c (−1) > 0, k ≥ 1. k ˜ 2) Now R(a, b, c) = (a , b , c ), where c = 1/c, a = −F˜a,b,c (−1)/a and b = k+1 (−1)/a. F˜a,b,c The above procedure defines a transformation (a, b, c) → (a , b , c ) we shall denote ˜ Denote also by R ˜ c the transformation acting on (a, b) by the following formula. by R.
˜ c (a, b) = (a , b ). R
(8)
˜ and ˜ a,b,c ). Certainly R We shall also denote by T˜a,b,c the homeomorphism H (F˜a,b,c , G ˜ ˜ Rc are defined only if Ta,b,c is renormalizable. We finish this section with a general plan for the proof of Theorem A. In what follows ˜ We show that all the periodic trajectories for R ˜ are hyperbolic we study the behavior of R. and that the corresponding stable manifolds consist of homeomorphisms with a fixed rotation number. This together with Theorem 1 gives exponentially fast convergence to ˜ in the case when a rotation number is a quadratic irrational. periodic trajectories for R This implies that the renormalizations of Tf and T g are exponentially close to each other if Tf and T g have the same quadratic irrational rotation number and the same break: c(f ) = c(g). Hence Tf and T g are smoothly conjugate to each other which proves Theorem A. 3. The Family of Homeomorphisms Ta,v,c Consider the homeomorphism T˜a,b,c , defined by functions (a + cz)b when z ∈ [−1, 0], b + (b + a − c)z ab(z − c) ˜ a,b,c (z) = G when z ∈ [0, a]. abc + (c − a − cb)z F˜a,b,c (z) =
Notice that b ∈ [0, 1) and a > 0.
80
K. Khanin, D. Khmelev
Let us introduce a new variable v = (c−a−b)/b, so that b(a, v, c) = (c−a)/(v+1). Then a + cz when z ∈ [−1, 0], 1 − vz a(z − c) ˜ a,b(a,v,c),c (z) = Ga,v,c (z) = G when z ∈ [0, a]. ac + z(1 + v − c) Fa,v,c (z) = F˜a,b(a,v,c),c (z) =
Let us denote by Ta,v,c a corresponding diffeomorphism if one exists. Notice that a, v and c satisfy some constraints. First of all, there are “geometrical” constraints, which guarantee the existence of Ta,v,c . We shall use the notation c for the set of all (a, v) satisfying “geometrical” constraints. Secondly, it is convenient to exclude explicitly from the set c those (a, v) corresponding to rotation number 0. Thirdly, there are some ergodic constraints which we explain below. 3.1. Geometrical constraints. The renormalization scheme which we described above implies 0 ≤ b < 1. Accordingly, we consider only such values of a and v that this condition holds. Notice that Fa,v,c (0) = a and Ga,v,c (0) = −1. A pair (Fa,v,c , Ga,v,c ) defines an orientation-preserving homeomorphism of the circle if and only if Ga,v,c (a) = Fa,v,c (−1) = −b, 0 ≤ b < 1, and Fa,v,c (y) > 0 for all y ∈ [−1, 0], and Ga,v,c (y) > 0 for all y ∈ [0, a]. Lemma 4. Suppose a > 0. The pair (Fa,v,c , Ga,v,c ) defines an orientation-preserving homeomorphism if and only if 0 < a ≤ c and a + v > c − 1. Proof. Since a−c = G(a), 1+v c + va Fa,v,c (z) = , (1 − vz)2 ac(a + v − (c − 1)) Ga,v,c (z) = , (ac + z(1 + v − c))2
Fa,v,c (−1) =
we have 0≤b=
c−a < 1. 1+v
(9)
In the case 1 + v < 0 one has a + v < c − 1, which implies Ga,v,c (z) < 0, and, consequently, there is no suitable (a, v) for v + 1 < 0, a > 0. Hence 1 + v > 0. It follows from (9) that a ≤ c and (a + v) − (c − 1) > 0. Easy calculation shows that inequalities 0 < a ≤ c, a + v > c − 1 (z) > 0 and 1 + v > 0. imply Ga,v,c (z) > 0 and Fa,v,c
Consequently, the homeomorphism Ta,v,c is well-defined for all (a, v) ∈ c , where c = {(a, v) | 0 < a ≤ c, a + v > c − 1}. The set c is shown on Figs. 3 and 4.
Renormalizations and Rigidity Theory
81
a a=c
c–1
–1
v
c–1 Fig. 3. The set c for c > 1 (here c = 4)
a 1
a=c
–1
c–1
v
Fig. 4. The set c for c < 1 (here c = 1/4)
3.2. Rotation number ρ = 0. In this section we study those (a, v) corresponding to rotation number ρ = 0. Lemma 5. If c > 1 then the rotation number is 0 for all (a, v) ∈ c such that v > (c − 1)/2 and a ≤ (c − 1)2 /4v. If c ≤ 1 then the rotation number is positive for all (a, v) ∈ c . Proof. The rotation number is 0 if and only if Ta,v,c has a fixed point. Obviously, there are no fixed points on [0, a]. To find a fixed point on [−1, 0], we should find the solution to the equation Fa,v,c (z) = z or a + cz = z, where z ∈ [−1, 0]. 1 − vz Denote by P (z) = vz2 + (c − 1)z + a. One can show that the last equation has roots if and only if the next equation has roots when z ∈ [−1, 0], P (z) = vz2 + (c − 1)z + a = 0.
82
K. Khanin, D. Khmelev
Notice that P (0) = a > 0 and P (−1) = a + v − (c − 1) > 0 for all (a, v) ∈ c . Hence P (z) has roots in [−1, 0] only if v > 0, (c − 1)2 − 4av ≥ 0 and z∗ = −(c − 1)/2v belongs to [−1, 0], where z∗ is a point of the minimum for the quadratic function P (z). The last conditions are satisfied if v > 0, c > 1, v ≥ (c − 1)/2 and a ≤ (c − 1)2 /4v. Finally, notice that in the case v = (c − 1)/2 we have a ≤ (c − 1)2 /4v = (c − 1)/2 and the set of these (a, v) has no intersection with c . Denote ∂− c = {(a, v) | −1 < v ≤
c−1 , a = c − 1 − v} 2
in the case c > 1, and ∂− c = {(a, v) | −1 < v < c − 1, a = c − 1 − v or v > c − 1, a = 0} in the case c < 1. We next show that the rotation number ρ(Ta,v,c ) approaches to 0 as point (a, v) ∈ c tends to ∂− c . Lemma 6. For every (a0 , v0 ) ∈ ∂− c and any ε > 0 there exists δ = δ(ε, a0 , v0 ) such that ρ(Ta,v,c ) < ε provided (a, v) ∈ c and dist ((a, v), (a0 , v0 )) < δ. Proof. We shall give qualitative proof of the lemma, although it is possible to give effective estimates for δ. It is easy to see that for all (a, v) ∈ ∂− c the function Fa,v,c has a fixed point z0 on [−1, 0]. More precisely, the fixed point is z0 = 0 if a = 0 and z0 = −1 if a = c − 1 − v, −1 < v < c − 1. Hence if (a, v) ∈ c is close to ∂− c , then Ta,v,c has an almost fixed point at either 0 or −1. This implies that the rotation number is small. The subset of c , corresponding to rotation number 0 is shown on Fig. 5. 3.3. Rotation number ρ = 1/2. We next describe the set of (a, v) such that ρ(Ta,v,c ) = 1/2. Notice that this set is non-empty for all c. One can check that for any (c, v) ∈ c the point 0 = x0 is a periodic point of period 2 for the homeomorphism Tc,v,c . Are there any other points (a, v) such that ρ(Ta,v,c ) = 1/2? Like the case of the rotation number 0 the situation is different for c > 1 and c < 1. a a=c
c–1
(c–1)/2
–1
(c–1)/2
c–1
v
Fig. 5. The subset of c of (a, v) such that ρ(Ta,v,c ) = 0 for c > 1 (here c = 4)
Renormalizations and Rigidity Theory
83
Lemma 7. Fix any positive c = 1. Then the following statements hold. (i) ρ(Tc,v,c ) = 1/2 for all (c, v) ∈ c . (ii) If c > 1 then for any (a, v) ∈ c such that a < c the rotation number ρ(Ta,v,c ) is not 1/2. (iii) If 0 < c < 1 then there exists a non-trivial infinite domain D1/2 (c) of those (a, v) ∈ c such that the rotation number ρ(Ta,v,c ) is 1/2. 2 can be defined as follows: Proof. Notice that the homeomorphism Ta,v,c Fa,v,c (Fa,v,c (z)) for z ∈ [−1, −a/c], 2 Ta,v,c (z) = Ga,v,c (Fa,v,c (z)) for z ∈ [−a/c, 0], F for z ∈ [0, a]. a,v,c (Ga,v,c (z))
Notice that there are no fixed points in the first case. Indeed, if there exists z0 ∈ [−1, −a/c] such that Fa,v,c (z0 ) = z0 then the rotation number is zero and we arrive at contradiction with ρ(Ta,v,c ) = 1/2. Therefore, Fa,v,c (z) > z for all z ∈ [−1, −a/c], and hence Fa,v,c (Fa,v,c (z)) > z for all z ∈ [−1, −a/c], i.e., Fa,v,c (Fa,v,c (z)) has no roots on [−1, −a/c]. Now consider two other cases. Clearly, if z0 ∈ [0, a] is a fixed point of Fa,v,c (Ga,v,c (·)), −1 (z ) is a fixed point of G then z˜ 0 = Ga,v,c (z0 ) = Fa,v,c 0 a,v,c (Fa,v,c (·)) and z˜ 0 ∈ [−a/c, 0]. Vice versa, if z0 ∈ [−a/c, 0] is a fixed point of Ga,v,c (Fa,v,c (·)), then z˜ 0 = Fa,v,c (z0 ) = G−1 a,v,c (z0 ) is a fixed point of Fa,v,c (Ga,v,c (·)) and z˜ 0 ∈ [0, a]. Therefore, we can consider only the case Fa,v,c (Ga,v,c (·)) and z ∈ [0, a]. One can show that equation Fa,v,c (Ga,v,c (z)) = z has roots if and only if the next equation has roots when z ∈ [0, a] P (z) = (c − 1 + (a − 1)v)z2 − a(c − 1)(v + 1)z − ac(c − a) = 0. Clearly, if a = c, then P (0) = 0 and we have a fixed point of period 2 as required in the statement (i) of the lemma. Now assume that a < c. Notice that P (0) = −ac(c − a) < 0 and P (a) = a(c + va)(a − c) < 0 for all (a, v) ∈ c such that a < c. Hence P (z) has roots on [0, a] if and only if c − 1 + (a − 1)v < 0, z∗ =
1 a(c − 1)(v + 1) ∈ [0, a], 2 c − 1 + (a − 1)v
(10)
and P (z∗ ) > 0 where z∗ is a point of maximum of quadratic function P (z). Let us first consider the case c > 1. In this case c − 1 + (a − 1)v < 0 imply z∗ < 0. Therefore, P (z) < 0 for all z ∈ [0, a] and equation P (z) = 0 has no roots in [0, a]. Consequently, assume that 0 < c < 1. Again, P (0) < 0 and P (a) < 0 for all (a, v) ∈ c such that a < c. Denote D(c) = {(a, v) ∈ c | c − 1 + (a − 1)v < 0}. It follows from (10) that z∗ > 0 for all (a, v) ∈ D(c). Inequality z∗ < a is equivalent to −
a (a − c)v + (a − 1)v + (c − 1) < 0. 2 (c − 1) + (a − 1)v
Equation P (z) = 0 has roots when z ∈ (0, a) if and only if 1 a(c − 1)2 (v + 1)2 ∗ P (z ) = −a + c(c − a) ≥ 0. 4 c − 1 + (a − 1)v
(11)
(12)
84
K. Khanin, D. Khmelev
Denote by D1/2 (c) the domain of whose (a, v) ∈ c such that the last inequality holds and a < c. The domain D1/2 (c) can be described as follows: D1/2 (c) = {(a, v) ∈ c | a1/2 (v) ≤ a < c}, where a1/2 (v) is a solution of the quadratic equation a(c − 1)2 (v + 1)2 + 4c(c − a)(c − 1 + (a − 1)v) = 0 (−1) = (c − 1)/2. One can check that for any (a, v) ∈ such that a1/2 (−1) = −1, a1/2 D1/2 (c) an inequality c − 1 + (a − 1)v < 0 holds and hence D1/2 (c) ⊂ D(c). Also, one can check that for any (a, v) ∈ D1/2 (c) an inequality (11) holds and therefore, all the (a, v) such that ρ(Ta,v,c ) = 2 actually belong to D1/2 (c).
The set of (a, v) ∈ c such that ρ(Ta,v,c ) = 1/2 for 0 < c < 1 is shown in Fig. 6. Notice that Theorem 1 states that the family (Fa,v,c , Ga,v,c ) is a limiting family for successive renormalizations (RH )n−1 RTf . Since Lemma 3 implies that the rotation number ρ(H (RH )n−1 RTf ) never exceeds 1/2, one can expect that there are no (a, v) ∈ c corresponding to the rotation number which is greater than 1/2. This is indeed true and follows from the condition b ≥ 0. More precisely, the following proposition holds. Proposition 1. Fix arbitrary positive c = 1. Then 0 ≤ ρ(Ta,v,c ) ≤ 1/2 for all (a, v) ∈ c . Proof. Fix v and consider the function ρ(a) = ρ(Ta,v,c ) on the segment a ∈ (a0 , a1/2 ], where a0 = a0 (v), a1/2 = a1/2 (v). We choose a0 such that one of the following conditions holds: (a) (a0 , v) ∈ ∂− c , (b) ρ(Ta0 ,v,c ) = 0 and ρ((Ta0 +ε,v,c )) = 0 for any ε > 0 such that (a0 + ε, v) ∈ c . It follows from the proof of Lemma 7 that we can choose a1/2 such that ρ(Ta,v,c ) = 1/2 for all a1/2 < a ≤ c and ρ(Ta1/2 −ε,v,c ) = 1/2 for any ε > 0 such that (a1/2 −ε, v) ∈ c . Notice that ρ(a) is a continuous function on (a0 , a1/2 ] and by Lemma 6 we have lima→a0 ρ(a) = 0. Therefore, defining ρ(a0 ) = 0 we obtain that ρ(·) is a continuous a 1
a=c
–1
c–1
v
Fig. 6. The subset of c of (a, v) such that ρ(Ta,v,c ) = 1/2 for 0 < c < 1 (here c = 1/4)
Renormalizations and Rigidity Theory
85
function on [a0 , a1/2 ]. Hence, the continuous function ρ(a) takes values 0 and 1/2 at a0 and a1/2 respectively, and does not take these values for any a ∈ (a0 , a1/2 ). Therefore, 0 < ρ(a) < 1/2 for any a ∈ (a0 , a1/2 ) as required. We can see that cases c > 1 and 0 < c < 1 are in some sense dual to each other with exchanging roles of 0 and 1/2. Also notice that the rotation number 1/2 plays the role of 1 here: all the admissible rotation numbers are between 0 and 1/2 and the natural continued fraction representation of rotation numbers in (0, 1/2] is [1 + k1 , k2 , . . . ] and an analogue of the Gauss map here is a map sending [1+k1 , k2 , . . . ] → [1+k2 , k3 , . . . ]. 3.4. RG-transformation and ergodic constraints. Denote by Ic the set of (a, v) such that Ta,v,c is renormalizable. Also denote by Ick (Ic∞ ) the set of (a, v) such that Ta,v,c is k-times (infinitely many times) renormalizable. Obviously, Ic∞ are (a, v) such that ˜ and ρ(Ta,v,c ) is irrational. Let us define mappings R and Rc , which are mappings R ˜ c defined by (8) for new variables a and v = (c − a)/b − 1. Therefore R sends R k k (a, v, c) to (a , v , c ) and c = 1/c, a = −Fa,v,c (−1)/a = −F˜a,b(a,v,c),c (−1)/a and k+1 k+1 v = (c − a − b )/b , where b = Fa,v,c (−1)/a = F˜ (−1)/a and k is such a,b(a,v,c),c
k k+1 (−1) > 0. Denote also by R a mapping such that (−1) < 0 and Fa,v,c that Fa,v,c c
Rc (a, v) = (a , v ),
(13)
where (a , v , c ) = R(a, v, c). Lemma 8. Rc ( c ∩ Ic ) ⊂ 1/c . Proof. Consider (a , v , c ) = R(a, v, c). Since Ta,v,c is renormalizable, Ta ,v ,c is a homeomorphism and we have 0 < b =
c − a 1/c − a = < 1. v +1 v +1
Notice that 1/c − a > 0 implies two inequalities v + 1 > 0 and a + v > 1/c − 1 and hence (a , v ) ∈ 1/c . We next show that 1/c − a > 0. By the definition of Rc we have for some k ≥ 1, k a = −Fa,v,c (−1)/a, k where −1 < Fa,v,c (−1) < 0 and k+1 0 < Fa,v,c (−1) = Fa,v,c (−aa ) =
a − caa a(1 − ca ) = . 1 + v aa 1 + v aa
Since v > −1 and 0 < aa < 1, we have 1 + vaa > 0, which imply 1 − ca > 0.
We now define the domain c ⊂ c which corresponds to ergodic restrictions. We n−1 start with motivation for this definition. Let an , bn , vn = (cn −an −bn )/bn , cn = c(−1) be sequences corresponding to successive renormalizations of the given renormalizable p¯ Tf . It follows from (3) and (5) that an + bn Mn ≈ cn and Mn ≈ cn n , where 0 < p¯ n < 1.
86
K. Khanin, D. Khmelev
a 1
a=c
–1
v
c–1 Fig. 7. The set c for c < 1 (here c = 1/4)
a a=c
c–1
–1
v
c–1 Fig. 8. The set c for c > 1 (here c = 4) p¯
Therefore, we have vn = (cn − an )/bn − 1 ≈ Mn − 1 ≈ cn n − 1. This implies that asymptotically as n → ∞ we have 0 ≤ vn ≤ cn − 1 if cn > 1, (14) cn − 1 ≤ vn ≤ 0 if 0 < cn < 1. In view of (14) it is natural to define {(a, v) ∈ c | 0 ≤ v ≤ c − 1}, c > 1, c = {(a, v) ∈ c | c − 1 ≤ v ≤ 0}, 0 < c < 1. The next Proposition shows that c is invariant w.r.t. the RG-transformation. Proposition 2. Rc (c ∩ Ic ) ⊂ 1/c .
Renormalizations and Rigidity Theory
87
Proof. Since b = Fa,v,c (−aa )/a, we have c − a 1 − ca c(c − a ) = = . 1 + v 1 + vaa 1 + vaa
(15)
It follows from the proof of the previous lemma that c −a > 0. Hence (15) is equivalent to c 1 = , 1 + v 1 + vaa which gives v =
1 + vaa − 1. c
(16)
Suppose c > 1. Since 0 ≤ v ≤ c − 1 and 0 < aa < 1 we have c − 1 =
1 1 + (c − 1) − 1 ≤ v ≤ − 1 = 0. c c
The proof in the case 0 < c < 1 is similar.
Notice that Ta,v,c is not renormalizable whenever ρ(Ta,v,c ) = 1/k for some k ∈ N. ¯ c for which rotation numbers are positive: It is convenient to define a domain ¯ c = {(a, v) | ρ(Ta,v,c ) > 0}.
(17)
¯ c . We next show that the set c is an attractor for the RG-transClearly, c ∩ Ic ⊂ formation. Proposition 3. Fix arbitrary positive c = 1. Then for any (a, v) ∈ c ∩ Ic∞ there exists N = N(a, v, c) such that (R1/c ◦ Rc )N (a, v) ∈ c . Remark 1. This statement can be easily generalized to the following one. Take any −1 < V1 < 0 < V2 < ∞. Then there exists a constant N = N (V1 , V2 ) such that for any (a, v) ∈ c ∩ Ic2N satisfying V1 ≤ v ≤ V2 we have (R1/c ◦ Rc )N (a, v) ∈ c . Proof. Notice that by Proposition 2 the case 0 < c < 1 is reduced to the case c > 1 by two additional iterations of R. Therefore, assume c > 1. Let us denote (a0 , v0 , c0 ) = (a, v, c), (an , vn , cn ) = R(an−1 , vn−1 , cn−1 ) = Rn (a0 , v0 , c0 ). It follows from (16) that v2(n+1) =
1 + v2n+1 a2n+1 a2(n+1) −1 1/c 1 + v2n a2n a2n+1 =c+c − 1 a2n+1 a2(n+1) − 1, c
which gives v2(n+1) = (c − 1) + a2n+1 a2(n+1) (v2n a2n a2n+1 − (c − 1)).
(18)
88
K. Khanin, D. Khmelev
We also have 0 < ak ak+1 < 1 for all k ≥ 0. In fact, for k ≥ 1 there is a much stronger estimate 1 ak ak+1 ≤ λ = < 1, (19) 1 + e−V (x) < ∞. Indeed, suppose that ρ(Ta,v,c ) = [k1 , k2 , . . . ] = ρ where V = Var log Ta,v,c x∈S 1
is irrational and let ps /qs = [k1 , . . . , ks ]. By definition of ak for k ≥ 1, (k)
ak =
0 (xcr ) (k−1)
0
,
(xcr ) q
(s)
s xcr and xcr , s ≥ 0. Now where 0 (xcr ) is an interval with endpoints xqs = Ta,v,c application of Lemma 1 yields the estimate (19). Consider now three cases. If 0 ≤ v0 ≤ c − 1, then we can take N = 0, since by Proposition 2 (a2n , v2n ) ∈ c for all n ≥ 0. 1 Therefore, assume that v0 < 0. Let us choose N = N0 = 2 + , (c − 1)(1 − λ) where [·] stands for the integer part of a number. Now denote ln = v2n . It follows from (18) that
ln+1 = a2n a2n+1 ln a2n+1 a2(n+1) + (c − 1)(1 − a2n+1 a2(n+1) ).
(20)
Notice that if for some m < N0 we have lm < 0 and lm+1 > 0, then by (20) we have lm+1 < c − 1, and, by Proposition 2, we shall have (a2(m+s) , v2(m+s) ) ∈ c for all s > 0 and, in particular, for s = N0 − m. Therefore, the only “dangerous” case is ln < 0 for all n < N0 . It follows from (20) and (19) that ln+1 > ln + (c − 1)(1 − λ) > −1 + n(c − 1)(1 − λ) and for n = N0 − 1 we have lN0 > −1 + (N0 − 1)(c − 1)(1 − λ) > 0 by choice of N0 . Finally, assume that v0 > c − 1. Clearly, one can choose N = N1 > 1 such that λ(N1 −1) (v0 − (c − 1)) < (1/λ − 1)(c − 1). Let us denote rn = v2n − (c − 1). Using (18) we obtain rn+1 = a2n+1 a2(n+1) (rn + (c − 1))a2n a2n+1 − (c − 1) , (21) rn+1 = a2n+1 a2(n+1) rn a2n a2n+1 − (c − 1)(1 − a2n a2n+1 ) . (22) Clearly, if for some m < N1 one has rm > 0 and rm+1 < 0, then by (21) rm+1 > −(c−1), i.e., (a2(m+1) , v2(m+1) ) ∈ c and we have (a2(m+s) , v2(m+s) ) ∈ c for all s > 0 and, in particular, for s = N1 − m. Again, the only “dangerous” case is rn > 0 for all n < N1 . Since c > 1 it follows from (22) and (19) that rn ≤ λn r0 . By (21) the sign of rN1 is determined by the l.h.s. of the following inequalities (rN1 −1 + (c − 1))a2(N1 −1) a2N1 −1 − (c − 1) < (rN1 −1 + (c − 1))λ − (c − 1) 1 − 1 (c − 1) + (c − 1) λ − (c − 1) = 0. < λ
Renormalizations and Rigidity Theory
89
The last two inequalities hold since rN1 −1 ≤ λ(N1 −1) r0 < (1/λ − 1)(c − 1) by choice of N1 . Therefore, rN1 < 0 as required. 4. The Monotonicity of Rotation Number ρ(Ta,v,c ) w.r.t. a It follows from Proposition 3 that the domain c is of main interest in studying dynamics of the RG-transformation of the pair (Fa,v,c , Ga,v,c ). For our purposes it is enough to consider the case c > 1 only. Below we show that for a particular choice of the lifting fa for (Fa,v,c , Ga,v,c ), one can prove that fa is monotonous with respect to a. Let us now fix c > 1 and some v such that 0 ≤ v ≤ c − 1. For any a such that (a, v) ∈ c let us define the lifting fa : R → R such that 0 ≤ fa (0) < 1 and Tfa = Hx0 (a) (Fa,v,c , Ga,v,c ), where x0 (a) = 1/(1 + a). It follows from (6) that Fa (y), for y ∈ [0, 1/(1 + a)], (Hx0 (a) (Fa,v,c , Ga,v,c ))(y) = (23) Ga (y), for y ∈ [1/(1 + a), 1), where (1 + a)(v − c)y − (a + 1 + v − c) , (−1 + vy + avy − v)(1 + a) (a + v − c + 1))(−(a + 1)y + 1) Ga (y) = . (1 + a)(−(v + 1 − c)(1 + a)y − ac − c + 1 + v) Fa (y) =
Notice that in previous sections we implicitly assumed the point x0 to be fixed. Here we choose x0 depending on a since for fixed x0 monotonicity of fa w.r.t. a does not take place in the whole domain c . The proof of monotonicity of fa involves some cumbersome calculations which are briefly presented in the following four lemmas. Lemma 9. 1) We have U (y) ∂ , Fa (y) = ∂a (−1 + vy + avy − v)2 (1 + a)2 ∂ V (y) , Ga (y) = ∂a (1 + a)2 (−(v + 1 − c)(1 + a)y − ac + v + 1 − c)2 where U (y) = v(a + 1)2 (c − v)y 2 + v(a + 1)(a − 2c + 2v + 1)y + (v + 1)(c − v), V (y) = (c − v)(v + 1 − c)(a + 1)2 y 2 + (v + 1 − c)(a + 1)(−ac − 3c + 2v)y + a 2 c + 2c(v + 1 − c)a + (v + 1 − c)(2c − v). 2) There exist C such that ∂ Fa (y) ≤ C for any y ∈ 0, 1 , ∂a 1+a ∂ 1 Ga (y) ≤ C for any y ∈ , 1 ∂a 1+a ¯ c (see (17)). for any (a, v) ∈
90
K. Khanin, D. Khmelev
This lemma is proved by direct evaluation. Lemma 10. For all (a, v) ∈ c we have U (y) ≥ 1/8c > 0 for all y ∈ [0, 1/(1 + a)]. Proof. Notice that U (0) = (v + 1)(c − v) ≥ 0 and U (1/(1 + a)) = av + c ≥ 0 for all (a, v) ∈ A. Now consider the vertex yU of U (y), yU = −
a − 2c + 2v + 1 . 2(a + 1)(c − v)
Notice that yU ≥ 0 if and only if v ≤ c − a/2 − 1/2. Therefore it is enough to take into account U (yU ) only for (a, v) ∈ B ∩ c , where B = {(a, v) ∈ c | v ≤ c − a/2 − 1/2} = {(a, v) | 1/2 ≤ a ≤ c, c − 1 − a ≤ v ≤ c − 1/2 − a/2}. Consider P (v) = 4(c − v)U (yU ): P (v) = −4av 2 − (a 2 + 2a + 1 + 4c − 4ac)v + 4c2 . Clearly, min
(·,v)∈B∩c
P (v) ≥ min P (v) = min(P (c − 1 − a), P (c − 1/2 − a/2)). (·,v)∈B
Now notice that P1 (c) = P (c − 1 − a) = (a + 1)2 (3(c − a) + 1). Clearly we have P1 (c) ≥ (a + 1)2 for all c ≥ a. Finally, consider P2 (c) = P (c − 1/2 − a/2). We have 1−a P2 (c) = (1 + a)2 c + . 2 Clearly, we have P2 (a) ≥ (a + 1)3 /2 ≥ 0 for all a ≤ c. Taking into account all the estimates we obtain (1 + a)2 (1 + a)3 U (y) ≥ min (v + 1)(c − v), av + c, , 4(c − v) 8(c − v) 1 ≥ min(c, c, 1/4c, 1/8c) ≥ 8c as required.
Lemma 11. For all (a, v) ∈ c we have V (y) ≥ ac min(a + v − (c − 1), a) > 0 for all y ∈ [1/(1 + a), 1]. 1 Proof. Since the coefficient of y 2 in V (y) is non-positive, min V (y) = min(V ( 1+a ), V (1)). One can check by direct calculation that V (1/(1 + a)) = (a + v − (c − 1))ac ≥ 0 and V (1) = a 2 (v + 1)(c − v) ≥ a 2 c > 0 for all (a, v) ∈ c .
Renormalizations and Rigidity Theory
91
Easy calculation yields the following estimates. Lemma 12. We have 1/8c ∂ Fa (y) ≥ for y ∈ [0, 1/(1 + a)], ∂a (1 + c)2 c2 ∂ a + v − (c − 1) for y ∈ [1/(1 + a), 1]. Ga (y) ≥ ∂a (1 + c)2 c2 Now we can prove the monotonicity of fa . Notice that the derivative of ∂fa (y)/∂a is well-defined at point y = k for all k ∈ Z, since fa (k) = (k − 1) + (1 + Ga (1)) = k + Fa (1) = k +
a + v − (c − 1) . (1 + a)(1 + v)
But, for point y0 = 1/(1 + a0 ) we have ∂ ∂ a+v+1−c = = , fa (y0 ) Ga (y0 ) ∂a ∂a ac(1 + a)2 a=a0 + a=a0 ∂ ∂ c−v = = . fa (y0 ) Fa (y0 ) ∂a ∂a ac(1 + a)2 a=a0 −
a=a0
Proposition 4. Fix c > 1 and 0 ≤ v ≤ c − 1. For any a1 < a2 such that (a1 , v), (a2 , v) ∈ c there exists ε = ε(a1 , v, c) such that 1 ∂ fa (y) ≥ ε for all (a, y) ∈ [a1 , a2 ] × R \ {(a, k + ) | k ∈ Z, a ∈ [a1 , a2 ]} ∂a 1+a and
fa2 (y) − fa1 (y) ≥ (a2 − a1 )ε > 0 for all y ∈ R.
Proof. Since fa (y + 1) = fa (y) + 1, we can consider only the segment [0, 1]. Notice that Fa (y) for y ∈ [0, 1/(1 + a)], fa (y) = (24) 1 + Ga (y) for y ∈ [1/(1 + a), 1]. It follows from Lemma 12 that
∂ ∂ ∂ fa (y) ≥ min Fa (y), Ga (y) ≥ ε(a1 , v, c), ∂a ∂a ∂a
where
1 1 min a1 + v − (c − 1), . ε(a1 , v, c) = 2 c (1 + c)2 8c
Fix any y ∈ [0, 1]. One of the following three situations occur: a) y ∈ [0, 1/(1 + a2 )], b) y ∈ [1/(1 + a2 ), 1/(1 + a1 )], c) y ∈ [1/(1 + a1 ), 1].
92
K. Khanin, D. Khmelev
Consider the case a). Evidently, fa2 (y) − fa1 (y) = Fa2 (y) − Fa1 (y) =
a2
a1
∂ Fa (y) dα ≥ ε(a2 − a1 ) > 0. ∂a a=α
The case c) is similar. Case b) is more complicated. Given y ∈ [1/(1 + a2 ), 1/(1 + a1 )], let us choose β = β(y) such that y = 1/(1 + β). Notice that 1 + Gβ(y) (y) = Fβ(y) (y). Also notice that a1 ≤ β ≤ a2 . Therefore, fa2 (y) − fa1 (y) = (1 + Ga2 (y)) − Fa1 (y) = (1 + Ga2 (y)) − (1 + Gβ(y) (y)) + Fβ(y) (y) − Fa (y) = (∗). Again, using the Newton-Leibnitz formula, we obtain
β(y)
a2 ∂ ∂ Ga (y) Fa (y) dα + dα ≥ ε(a2 − a1 ) > 0. (∗) = ∂a β(y) ∂a a1 a=α a=α
Corollary 1. Fix (a1 , v) ∈ c . For any a1 < a2 ≤ c we have ρ(fa1 ) ≤ ρ(fa2 ). 5. The Structure of Domains of Constant Rotation Number in Φc Let us study some simple corollaries of monotonicity w.r.t. a for the family of functions fa defined in (23). Our aim is to understand the geometrical structure of subsets (a, v) ∈ c corresponding to constant rotation number ρ0 = ρ(Ta,v,c ). The structure of these domains is different for rational and irrational ρ. Both cases are important in the proof of Theorem A. In this section we fix c > 1 and define ρ(a, v) = ρ(Ta,v,c ). Denote (ρ0 ) = {(a, v) ∈ c | ρ(Ta,v,c ) = ρ0 }. The structure of domain (ρ0 ) is described in the following lemmas. Lemma 13. For any irrational ρ0 ∈ (0, 1/2) and 0 ≤ v ≤ c − 1 there exists a unique a˜ such that ρ(a, ˜ v) = ρ0 . Proof. Define the function ρ(a) = ρ(a, v) for a ∈ (c−1−v, c] and set ρ(c−1−v) = 0. By Lemma 6 the function ρ(a) is continuous on [c − 1 − v, c] and by Lemma 7 we have ρ(a) = 1/2. Since ρ(a) is continuous there exists a˜ such that ρ(a) ˜ = ρ0 . Assume that there exists a¯ = a˜ such that ρ(a) ¯ = ρ. Without loss of generality assume c − 1 − v < a˜ < a¯ < c. Denote f¯(x) = fa¯ (x) and f˜(x) = fa˜ (x), where fa is defined in (24). By Proposition 4 we have f¯(x) ≥ f˜(x) + δ for some δ = δ(a, ˜ a) ¯ > 0. Obviously f¯n (x) ≥ f˜n (x) + δ for any n ∈ N. Let ρ0 have a continued fraction expansion ρ0 = [k1 , k2 , . . . ]. Take m so large that x0 − δ < f˜q2m+1 (x0 ) − p2m+1 < x0 , where p2m+1 /q2m+1 = [k1 , . . . , k2m+1 ] and x0 = 0. Then f¯q2m+1 (x0 ) − p2m+1 > x0 which implies ρ ≥ p2m+1 /q2m+1 in contradiction with the obvious relation ρ0 ∈ (p2m /q2m , p2m+1 /q2m+1 ).
Renormalizations and Rigidity Theory
93
It follows from Lemma 13 that for any irrational ρ one can define a function aρ (v) on v ∈ [0, c − 1] such that ρ(Taρ (v),v,c ) = ρ. Lemma 14. Function aρ (v) is continuous with respect to v. Proof. It is well-known (see, for example [8, p.77, Theorem 2]) that rotation number is a continuous function of a homeomorphism. It follows that ρ(a, v) is a continuous function of (a, v). Fix v and a = aρ (v). Suppose aρ (v) is not a continuous function. It follows that there exists a sequence vn → v, an = aρ (vn ) → a ∗ = aρ (v) as n → ∞. Since ρ(a, v) is a continuous function, ρ(a ∗ , v) = ρ. This contradicts the uniqueness of a˜ = aρ (v). We next consider rational rotation numbers. For fixed v denote Iv (p/q) the set of values of a such that ρ(a, v) = p/q. Since ρ(a, v) is continuous and monotonous w.r.t. (1) (2) a it easily follows that Iv (p/q) is a closed interval Iv (p/q) = [ap/q (v), ap/q (v)]. Lemma 15. Consider 0 < p/q < 1/2 and fix v ∈ [0, c − 1]. Suppose ρ(a1 , v) < p/q < ρ(a2 , v), where 0 ≤ v ≤ c − 1. Then for any x ∈ [0, 1) there exists a unique q a(x) ∈ (a1 , a2 ) such that ρ(a(x), v) = p/q and fa(x) (x) = x + p. Proof. Notice that there exists K such that for all k > K we have kq
kq
fa1 (x) − x − kp < 0 and fa2 (x) − x − kp > 0. kq
Fix arbitrary k > K. Since fa (x) is a strictly monotonous function of a, there exists kq a = a(x), a(x) ∈ (a1 , a2 ) such that we have fa(x) (x) − x − kp = 0. This implies q fa (x) − x − p = 0. q lq Indeed if fa(x) (x) > p + x, then by fa(x) (x) + 1 = fa(x) (x + 1) we have fa(x) (x) > kq
lp + x for all l ≥ 1 which contradicts fa(x) (x) − x − kp = 0. In the same way one q excludes the case fa(x) (x) − x − p < 0. q Uniqueness of a(x) follows from monotonicity of fa (x) w.r.t. a. q
On Fig. 9 we present graphs of fa − p for p/q = 1/3 in the case c = 4, v = 2 for two different values of a. In both cases the rotation number is ρ = 1/3. The lower graph corresponds to a situation when the periodic orbit passes through the break point. (1) We recall that for all a the lifting fa (x) has two break points on [0, 1): xcr = 1/(1+a) (2) (2) (1) (2) and xcr = 0. Notice that xcr = Tfa (xcr ). Denote x0 = xcr = 0. By Lemma 15, there exists a unique a0 = a(x0 ) such that ρ(fa0 ) = p/q and x0 is a periodic point. Lemma 16. 1) Periodic trajectory {xi = Tfai0 (x0 ), 0 ≤ i ≤ q − 1} is a unique periodic trajectory for Tfa0 . q 2) The graph of fa (x) − p consists of q fractional-linear pieces. All endpoints lie on the diagonal and all other points strictly below it (see the lower curve on Fig. 9). Remark 2. We consider the case c > 1. In the opposite case c < 1 the graph is above the diagonal except at periodic points.
94
K. Khanin, D. Khmelev
1 0.8 0.6 0.4 0.2 0 Fig. 9. The graph of (a2 , v, c) = (2.5, 2, 4)
fa31
0.2 − 1,
fa32
0.4
0.6
− 1 for (a1 , v, c) =
(42 /(1
0.8
1
+ 2 + 4), 2, 4) (lower graph) and
Proof. Consider the periodic trajectory {xi , 0 ≤ i ≤ q} of period q and enumerate points according to the ordering on the circle S 1 : y0 = x0 = 0 < y1 < · · · < yq−1 < yq = 1. q Notice that fa −p is a fractional-linear function on [yi , yi+1 ], i = 0, . . . , q −1. Consider this fractional-linear function corresponding to [y0 , y1 ] in a renormalized coordinate system such that the interval has length 1: f¯(ζ ) =
1 q [(fa0 (y0 + (y1 − y0 )ζ ) − p) − y0 ], ζ ∈ [0, 1]. y1 − y 0
Notice that f¯(0) = 0, f¯(1) = 1. It also follows from an easy calculation (see relation (27) in the next section) that f¯ (1)/f¯ (0) = c2 . These three relations uniquely determine ζ f¯(ζ ) = . c + (1 − c)ζ It is easy to see that f¯ gives the expression for all q fractional-linear pieces of the funcq tion fa0 (x) − p in the renormalized coordinates. To finish the proof it is enough to notice that f¯ is a convex function and ζ < f (ζ ) for all ζ ∈ (0, 1). It easily follows from Lemma 16 that a0 is a left endpoint of the interval Iv (p/q): (1) a0 = ap/q (v). Lemma 17. For any a < a0 rotation number ρ(a, v) < p/q. q
Proof. Since fa (x) < fa0 (x) the graph of fa (x) − p is strictly below the diagonal. Hence the rotation number is less than p/q. Consider now the case a > a0 . In the next lemma we prove that the following picture q holds for fa (x) − p. When a changes all fractional-linear pieces deform smoothly and q a new piece appears near every integer point. All break points of a graph of fa (x) − p
Renormalizations and Rigidity Theory
95
lie above the diagonal. Periodic trajectories for a homeomorphism Tfa are generated only by deformed pieces, so that an extra piece appearing near integer points contains no periodic points. (1)
q
(2)
Lemma 18. 1) For any a ∈ (ap/q (v), ap/q (v)) the graph of fa (x) − p on [0, 1] consists of q + 1 fractional-linear pieces with endpoints 0 = y0 (a) < y1 (a) < · · · < yq (a) < 1, where yi (a), 1 ≤ i ≤ q − 1 are smooth deformations of yi , 1 ≤ i ≤ q − 1 which were defined for a = a0 in the proof of Lemma 16. The last endpoint yq (a) is a new break point which bifurcates from x = 1 at a = a0 . q 2) fa (yi (a)) − p > yi (a), 0 ≤ i ≤ q. (1) (2) 3) For any a ∈ (ap/q (v), ap/q (v)) there exist two periodic orbits for Tfa , one stable (2)
and one unstable. For a = ap/q (v) there exists a unique periodic trajectory which is neutral. Proof. 1) It is easy to see that the endpoints of different fractional-linear pieces correspond to preimages Tfa−i 0, 0 ≤ i ≤ q − 1 and Tfa−i (1/1 + a), 0 ≤ i ≤ q − 1. Since fa (1/(1 + a)) = 1 all endpoints are given by fa−i (l), 0 ≤ i ≤ q − 1, l ∈ Z. This gives q exactly q + 1 endpoints for fa − p on [0, 1]: 0 = y0 (a) < y1 (a) < · · · < yq−1 (a) < yq (a) < 1. −q Notice that yq (a) = 1 for a = a0 . However, for a > a0 we have yq (a) = fa (p + 1) < 1 since fa is increasing with a. All other endpoints yi (a), 1 ≤ i ≤ q −1 are smooth deformation of the corresponding endpoints yi for a = a0 . To finish the proof of the first statement we show that the order 0 = y0 (a) < y1 (a) < · · · < yq−1 (a) < yq (a) < 1 preserves when a increases. To prove it notice that if yi (a) = yj (a) then there exists a periodic orbit of period |i − j | < q for the homeomorphism Tfa which contradicts ρ = p/q. −j (i) 2) Since fa is increasing with a and yi (a) = fa (l(i)) we immediately see that yi (a) is monotone decreasing with a for i > 0. Notice that j (i) < q for i < q. Therefore q
q
−j (i)
fa (yi (a)) − p = fa (fa >
q−j (a)
(l(i))) − p = fa
q−j (a) fa0 (l(i)) − p
(l(i)) − p
= yi (a0 ) > yi (a).
To complete the proof we have to consider separately the cases i = 0 and i = q. We have for a > a0 , q
q
fa (yq (a0 )) − p = fa (0) − p > 0 = y0 and q
q
−q
fa (yq (a)) − p = fa (fa (p + 1)) − p = p + 1 − p = 1 > yq (a). 3) It follows from a previous statement that for a ↓ a0 deforming fractional-linear pieces have two intersections with the diagonal: one stable and one unstable (here and below we use convexity of the fractional-linear function). By Lemma 15 the periodic orbit passes through any point x at a unique value of a. Hence, stable and unstable periodic points are monotone functions of a: the stable one is increasing and unstable one is (2) decreasing. When a reaches ap/q (v) the stable and unstable trajectories meet each other.
96
K. Khanin, D. Khmelev
Otherwise monotonicity implies the existence of an interval of points x such that there is no periodic orbit which passes through x. Existence of such an interval contradicts (2) Lemma 15. Finally the unique periodic orbit for a = ap/q (v) has to be neutral since the (2)
mapping is differentiable. Obviously, if the multiplier is different from 1, ap/q (v) is not an endpoint of Ip/q (v). The following proposition summarizes the statements proven in Lemmas 15–18. (1)
(2)
Proposition 5. For every p/q there exists an interval [ap/q (v), ap/q (v)] of a such that (1)
(2)
(1)
(2)
/ [ap/q (v), ap/q (v)]. ρ(a) = p/q for all a ∈ [ap/q (v), ap/q (v)] and ρ(a) = p/q for all a ∈ (1)
(2)
For a = ap/q (v), ap/q (v) there exists a unique trajectory of period q. (1)
(2)
For a ∈ (ap/q (v), ap/q (v)) there exist exactly two periodic trajectories of period q, (1)
one stable and one unstable. For a = ap/q (v) the periodic trajectory is formed by points q−1 Tfa0 (0),
(1)
(2)
0, Tfa0 (0), . . . , where a0 = ap/q (v). For a = ap/q (v) a periodic trajectory η0 (a), . . . , ηq−1 (a) is neutral, i.e., q−1
fy (ηi (a); a1 ) = 1.
i=0 (1)
(2)
It follows from Proposition 5 that curves ap/q (v) and ap/q (v) are algebraic w.r.t. v, c, since they are the roots of some algebraic system of equations. Unfortunately, the complexity of the equations grows very fast. However, in some cases it is easy to find explicit (1) expressions for ap/q (v). For example, we can find an explicit solution for p/q = 1/3: (1)
ap/q (v) =
c2 . 1+v+c (1)
(1)
It is convenient to consider an inverse function to ap/q (v) which we denote vp/q (a). It is easy to see that (1)
c(c − a) . a
(25)
(c − a)2 (1 + c) c(c − a) − . a a(1 + c + (c − a))
(26)
v1/3 (a) = −1 + Direct calculation gives (1)
v1/4 (a) = −1 + 1/3
Consider the domain D1/4 (c) = {(a, v) ∈ c | 1/4 ≤ ρ(Ta,v,c ) < 1/3} (see Fig. 10). 1/3
Lemma 19. D1/4 (c) = {(a, v) ∈ c | v1/4 (a) ≤ v < v1/3 (a)}. Proof. It follows immediately from Proposition 5 and (25), (26).
Renormalizations and Rigidity Theory
97
a a=c
c–1
–1
v
c–1 Fig. 10. The set
1/3 D1/4 (c)
for c = 4
6. Exponential Instability of ρ(Ta,v,c ) w.r.t. a In this section we shall show that the monotonicity of the family fa described in (23) w.r.t. parameter a implies exponential instability of rotation number ρ(fa ) w.r.t. a. We assume everywhere in this section that the value of 0 ≤ v ≤ c − 1 is fixed and a ∈ A(v) which is A(v) = {a | (a, v) ∈ c }. We shall also define
¯ ¯ c } (see (17)). A(v) = {a | (a, v) ∈
The family fa has two 1-periodic sequences of break points: (1) (2) xcr (a) = 1/(1 + a), xcr (a) = 0
(mod 1).
One can easily find the breaks c1 (a) and c2 (a) at those points. Notice that ∂ F ((1 + a)y − 1) + 1 ∂ = = F (−1), fa (y) ∂y ∂y (1 + a) y=1+0 y=0+0 ∂ ∂ F ((1 + a)y − 1) + 1 = = F (0), fa (y) ∂y ∂y (1 + a) y=1/(1+a)−0 y=1/(1+a)−0 ∂ ∂ G((1 + a)y − 1) + 1 = = G (0), fa (y) ∂y ∂y (1 + a) y=1/(1+a)+0 y=1/(1+a)+0 ∂ ∂ G((1 + a)y − 1) + 1 fa (y) = = G (a), ∂y ∂y (1 + a) y=1−0
y=1−0
where F (−1) =
c + va , (1 + v)2
F (0) = c + va,
a + v − (c − 1) , ac c(a + v − (c − 1)) G (a) = . a(1 + v)2 G (0) =
98
K. Khanin, D. Khmelev
Hence a(c + va) F (0) =c , G (0) a + v − (c − 1) G (a) a + v − (c − 1) c22 (a) = =c F (−1) a(c + va) c12 (a) =
and c12 (a)c22 (a) = c2 .
(27)
As before for every fa we define a homeomorphism Tfa (y) = {fa (y)} of the unit circle S 1 = [0, 1), whose rotation number we denote ρ(a) = ρ(fa ). It follows from Corollary 1 that ρ(a) is monotonous with a. In what follows it is convenient to use another notation for fa (y): f (y; a) = fa (y). Partial derivatives of f (y; a) w.r.t. y and a will be denoted by fy (y; a) and fa (y; a). We also denote iterations of fa (y) by f i (y; a) = f (f i−1 (y; a); a), i ≥ 1, where f 0 (y; a) = y. Finally, we shall use the notation Tf (y; a) for the homeomorphism Tfa (y). Below we use dynamical partitions which were defined in [34]. Consider the arbitrary open Farey interval (p1 /q1 , p2 /q2 ) (see [34]). Let ρ(a) ∈ (p1 /q1 , p2 /q2 ). For (1) an arbitrary point y0 ∈ S 1 denote yi (a) = Tf i (y0 ; a). Let 0 (a) = [y0 , yq1 (a)] and (2) (1) (1) (2) (2) i 0 (a) = [yq2 (a), y0 ]. Define i (a) = Tf (0 (a); a) and j = Tf j (0 (a); a). The following simple proposition was proven in [34]. Proposition 6. Points {yi (a), 0 ≤ i < q1 + q2 } form a partition of the circle with ele(1) ments of a partition forming two sequences of intervals {i (a), 0 ≤ i < q2 }, and (2) {i (a), 0 ≤ i < q1 }. Denote by V (a) the total variation of ∂/∂y(log Tf (y; a)) on the circle S 1 : V (a) = |log c12 (a)| + |log c22 (a)| +
Var
(1) (2) y∈(xcr (a),xcr (a))
log fy (y; a) +
Var
(2) (1) y∈(xcr (a),xcr (a)+1)
log fy (y; a).
Let p = max(p1 , p2 ), q = max(q1 , q2 ). Consider an open Farey interval I with endpoints at p3 p1 + p2 p = and . q3 q1 + q 2 q Notice that I is a “good” Farey interval, i.e., q1 + q2 < 2q (see [34, 68]). The following proposition is the Denjoy-type inequality (see [34]). (1)
(2)
Proposition 7. Suppose ρ(a) ∈ I and yi (a) = xcr (a), xcr (a) for all 0 ≤ i ≤ q − 1. Then q−1 e−V (a) ≤ fy (yi (a); a) ≤ eV (a) . i=0
Renormalizations and Rigidity Theory
99
Let ρ = [k1 , k2 , k3 , . . . ] be an irrational rotation number. Consider a sequence of convergents pl = [k1 , k2 , k3 , . . . , kl ]. ql Denote p˜ 1 p˜ 3 p˜ 2 = [k1 , . . . , k2n ] < = [k1 , . . . , k2n+1 + 1] < = [k1 , . . . , k2n+1 ]. q˜1 q˜3 q˜2 Notice that (p˜ 1 /q˜1 , p˜ 2 /q˜2 ) forms a Farey interval. Since p˜ 3 = p˜ 1 + p˜ 2 , q˜3 = q˜1 + q˜2 , an interval (p˜ 3 /q˜3 , p˜ 2 /q˜2 ) is a good Farey interval inside (p˜ 1 /q˜1 , p˜ 2 /q˜2 ). Denote intervals of parameters a corresponding to both Farey intervals by (a1 , a2 ), (a3 , a2 ) respectively: (a1 , a2 ) = ρ −1 (p˜ 1 /q˜1 , p˜ 2 /q˜2 ), (a3 , a2 ) = ρ −1 (p˜ 3 /q˜3 , p˜ 2 /q˜2 ). By Proposition 7, e
−V (a)
q2n+1 −1
≤
fy (yi (a); a) ≤ eV (a)
(28)
i=0
for all a ∈ (a3 , a2 ) and for all y where all derivatives are defined. Denote by ε and δ two positive constants such that ε≤
1 ∂ 1 ∂ f (y, a) ≤ , δ ≤ f (y, a) ≤ ∂a ε ∂y δ (1,2)
holds for all a ∈ (a3 , a2 ) and y = xcr (a) (mod 1). Notice that existence of ε and δ follows from Lemma 9 and Proposition 4. Also define V =
V (a), C2 =
sup
and
L=
sup
sup
a∈(a3 ,a2 ) y=x (1,2) (a)
a∈(a3 ,a2 )
fy (y; a),
cr
sup a∈(a3 ,a2 )
max |log (c12 (a))|, |log (c22 (a))| .
Theorem 2. There exists a constant C = C(ε, δ, V , C2 , L) > 0 such that |(a3 , a2 )| = a2 − a3 ≤
C . 2 q2n+1
(29)
Proof. Choose y0 ∈ S 1 such that for almost all a ∈ A points of trajectory yj (a) = (1,2) xcr (a) for all 0 ≤ j < q2n + q2n+1 . Existence of such y0 follows from the Fubini theorem. We start with the following simple relations: q2n+1 −1 q2n+1 −1 ∂ fa (yi (a); a) fy (yj (a); a) yq2n+1 (a) = ∂a
(30)
j =i+1
i=0
and q2n+1 −1
q2n+1 −1
j =i+1
fy (yj (a), a) =
fy (yj (a); a)
j =0 i j =0
. fy (yj (a); a)
(31)
100
K. Khanin, D. Khmelev
Notice that (28) gives an estimate for the numerator in the formula above. We shall use (2n) (2n+1) (a) = elements of the partition j (a) = Tf j ([0, yq2n (a)]; a), 0 ≤ j < q2n+1 , i (2n)
Tf i ([yq2n+1 (a), 0]; a), 0 ≤ i < q2n . Suppose j does not contain a break point. Then since (2n) (2n) j +1 (a) = Tf (j (a); a) we have
|j +1 (a)| = fy (ξj (a); a)|j (2n)
(2n)
where ξj (a) ∈ j
(2n)
(a)|,
(a). Notice that
fy (ξj (a); a) fy (yj (a); a)
fy (yj (a); a) + fy (ζj (a); a)(ζj (a) − yj (a))
=
fy (yj (a); a)
fy (ζj (a); a)
= 1+
fy (yj (a); a)
(ζj (a) − yj (a)) ,
where ζj (a) ∈ (yj (a), ξj (a)). Hence (2n)
j +1 (a) (2n)
j
(a)
= fy (yj (a); a)(1 + δj ),
(2n)
(l)
(2n)
where |δj | < (C2 /δ)j . Suppose now that xcr (a) ∈ j it is easy to see that the following formula holds:
, where l = 1 or 2. Then
(2n)
j +1 (a) (2n) j (a)
= fy (yj (a); a)dj (1 + δj ),
where dj = αj + (1 − αj )cl2 (a), 0 ≤ αj ≤ 1 is a relative coordinate of a break point (l) (2n) xcr (a) inside j and δj satisfies the estimate |δj | ≤
2C2 max(1, cl2 (a)) (2n) δ min(1, cl2 (a)) j
for n large enough uniformly in 0 ≤ j < q2n+1 . It follows that (2n)
|i+1 | =
i
i
fy (yj (a); a)
j =0
j =0
dj
i
1 + δj
(2n)
|0
|,
j =0
(2n)
where dj = 1 if j does not contain break points. Obviously, there exists a constant C3 which depends on C2 , δ and c12 (a), c22 (a) such that (2n)
−C3 j Therefore
(2n)
≤ log (1 + δj ) ≤ C3 j
i 1 ≤ (1 + δj ) ≤ C4 C4 j =0
.
Renormalizations and Rigidity Theory
101
for some C4 = C4 (C2 , δ, L) > 1. Since e
−2L
≤
i
dj ≤ e2L ,
j =0
we get (2n)
(2n)
|0 | | (a)| 1 1 ≤ i . ≤ e2L C4 0(2n) e2L C4 |(2n) (a)| | (a)| j =0 fy (yi (a); a) i+1
(32)
i+1
Using (28), (31), (30), (32) one gets ∂ εe−V (a) yq2n+1 (a) ≥ 2L ∂a e C4
q2n+1 −1
|(2n) (a)| 0 (2n)
|i+1 (a)|
i=0
≥
δεe−V (a) e2L C4
q2n+1 −1
|(2n) (a)| 0 (2n)
i=0
|i
.
(a)|
Notice that the minimum of the sum q2n+1 −1
under condition
q2n+1 −1 i=0
(2n)
|i
1
i=0
(2n) |i (a)| (2n)
(a)| ≤ 1 is reached when all i (2n)
|i It follows that
q2n+1 −1
(a)| =
1
i=0
(2n) |i |
1 q2n+1
(a) are equal, i.e.,
.
2 ≥ q2n+1
and ∂ εδe−V (a) (2n) 2 |0 (a)|q2n+1 . yq2n+1 (a) ≥ 2L ∂a e C4
(33)
To finish the proof take arbitrary a such that a3 < a < a2 . Since (33) holds for almost all a ∈ (a, a2 ) we obtain
a2
a2 ∂ εδe−V 2 (2n) yq2n+1 (a)da ≥ 2L q2n+1 |0 (a)|da. yq2n+1 (a2 ) − yq2n+1 (a) = ∂a e C4 a a (2n)
Notice that |0
(a)| is an increasing function of a. Hence
yq2n+1 (a2 ) − yq2n+1 (a) ≥ (2n)
Since Tf q2n+1 (0
(2n+1)
(a); a) ⊃ 0
(2n)
|Tf q2n+1 (0
εδe−V 2 (2n) q (a2 − a)|0 (a)|. e2L C4 2n+1
(a) and (2n)
(a); a)| ≤ eV (a) |0
(a)|.
102
K. Khanin, D. Khmelev
we finally get yq2n+1 (a2 ) − yq2n+1 (a) (2n+1)
|0
(a)|
≥
εδe−V 2 q (a2 − a)e−V (a) . e2L C4 2n+1
Since the left-hand side is a continuous function of y0 we remark that the last estimate holds uniformly in y0 . Choose y0 to be a point of a periodic trajectory of period (2n+1) (a)|. q2n+1 = q˜2 for a = a2 . Then yq2n+1 (a2 ) = y0 and yq2n+1 (a2 )−yq2n+1 (a) = |0 Hence C a2 − a ≤ 2 , q2n+1 where C = e2L C4 e2V /εδ. Since a can be taken arbitrary close to a3 the estimate (29) holds. Denote the golden mean by
√ 1+ 5 σ = . 2
It is well-known that for an arbitrary irrational number ρ, q2n+1 ≥ σ 2n+1 . This estimate is asymptotically sharp for ρ = 1/σ . Denote ζ =
2 1 = √ . σ2 3+ 5
(34)
Then the following corollary holds. Corollary 2. In conditions of Theorem 2 |(a3 , a2 )| = a2 − a3 ≤ C(ε, δ, C2 , V , L)ζ 2n+1 . 7. Uniform Hyperbolicity of the RG-Transformation at Periodic Points In this section we describe all periodic points of the RG-transformation. We start with an important property of the Jacoby matrix for the RG-transformation which holds for all c > 0. Lemma 20. Let (a, v) be a renormalizable pair (a, v) ∈ Ic and (a , v , 1/c) = R(a, v, c). Denote by Dc (a, v) a Jacobian of Rc at (a, v): Dc (a, v) = det dRc (a, v). Then Dc (a, v) = −
a 1 d k (−1), F a c dz a,v,c
(35)
where k = [1/ρ(Ta,v,c )] is a number of iterations of the mapping Fa,v,c in the renormalization procedure.
Renormalizations and Rigidity Theory
103
Proof. Equation (16) gives v = Denote w = va. Then or
1 + vaa − 1. c
1 + wa w −1 = a c 1 w(a )2 w = − 1 a + . c c
Denote by D¯ c (a, w) a Jacobian of Rc in coordinates (a, w): ∂a ∂a a 2 ∂a ∂a ∂w , D¯ c (a, w) = = 2 (a ) ∂a ∂a ∂a ∂a c ∂a K1 + K2 K1 + + K2 ∂a ∂a ∂w c ∂w where K1 = 1/c − 1, K2 = 2wa /c. k (−1)/a, where We recall that a = −Fa,v,c Fa,v,c (z) =
a + cz for z ∈ [−1, 0]. 1 − vz
Denote by F the mapping Fa,v,c in a new coordinate z˜ = z/a: F (˜z) =
1 a + ca z˜ 1 + c˜z = . a 1 − va z˜ 1 − w z˜
It is important that F is independent of a. Since a = −F (−1/a) we have !k−1 " ∂a 1 , F (yi ) − − 2 =− ∂a an i=0
where yi = F i (y0 ), y0 = −1/a. Since Fa,v,c (z) = a F (z/a), we obtain
k (Fa,v,c ) (y0 ) = (F k ) (−1)
which gives
(a )2 1 d k D¯ c (a, w) = − 2 (−1). F a c dz a,v,c Since the Jacobian for the change of variables (a, v) → (a, w) is equal to a we finally get a a 1 d k (−1).
F Dc (a, v) = D¯ c (a, w) = − a a c dz a,v,c The next lemma gives another expression for D in the case when (a, v) is twice renormalizable.
104
K. Khanin, D. Khmelev
Lemma 21. Let (a, v) ∈ Ic2 . Then Dc (a, v) = −
a 1 Fa ,v ,1/c (0) a 1 Ga ,v ,c (0) = − , a c Ga,v,c (0) a c Ga,v,c (0)
(36)
where (a , v ) = R1/c (a , v ). Proof. Since Ga ,v ,c coincides with Fa ,v ,1/c after rescaling of the coordinate z we have d d k d d = = . Ga ,v ,c (z) Fa ,v ,1/c (z) Fa,v,c (z) Ga,v,c (z) dz dz dz z=0 z=0 z=−1 dz z=0 This together with (35) immediately implies the statement of the lemma.
We next discuss periodic points for the RG-transformation and show that the Jacobian along the periodic orbit is equal to 1. Notice that since R transforms c to 1/c the periodic orbit has always even period in the case c = 1. Lemma 21 implies the following important fact. Proposition 8. i) Let (ai , vi , ci ), 0 ≤ i ≤ p − 1 be a periodic orbit of the RG-transformation of even period p. Then the Jacobian of the transformation p/2 R(1) c0 = (R1/c0 ◦ Rc0 )
at point (a0 , v0 ) is equal to 1. ii) Let c = 1 and (ai , vi ) be a periodic orbit of the RG-transformation of period q q (even or odd). Then the Jacobian of R1 at point (a0 , v0 ) is equal to (−1)q . Proof. Denote Di = Dci (ai , vi ). Using Lemma 21 and relations ai+p = ai , vi+p = vi , ci+p = ci , we obtain p−1 i=0
Di =
p−1 i=0
ai+1 1 Gai+2 ,vi+2 ,ci+2 (0) − ai ci Gai ,vi ,ci (0)
= 1,
which proves the first statement. The proof in the second case is the same.
Proposition 8 suggests that there exists a smooth invariant measure for the transformation R1/c ◦ Rc . This is indeed the case and the density is given by √
χc (a, v) =
c . (c + av)(a + v − (c − 1))
(37)
More precisely, the following proposition holds. Proposition 9. Let (a , v ) = Rc (a, v). Then χ1/c (a , v ) =
χc (a, v) . |Dc (a, v)|
(38)
Renormalizations and Rigidity Theory
105
Proof. It follows from (36) that a (1/c + v a ) . a a + v − (c − 1)
|det dRc (a, v)| = |Dc (a, v)| =
(39)
It follows from (16) that c + av =
c(a + v + 1) − 1 . a
(40)
Formula (38) is obtained from (37), (39), (40) by direct calculation.
Fix arbitrary c > 1. Let (ai , vi , ci ) be a periodic orbit of even period p with c0 = c. Then ρ(Ta,v,c ) = [1 + k1 , k2 , . . . , kn , . . . ], where ki is a p-periodic sequence, i.e., ki+p = ki , i ≥ 1. Remark 3. Notice that the minimal period for the sequence ki might be equal to p/2. (n)
Denote Rc
(1)
= (Rc )n = (R1/c ◦ Rc )np/2 . Obviously (a0 , v0 ) is a fixed point for
(1) Rc :
R(1) c (a0 , v0 ) = (a0 , v0 ). (1)
(1)
Denote by Q : R2 → R2 a derivative of Rc at point (a0 , v0 ): Q = dRc (a, v) and by λ(Q) the spectral radius of Q. Theorem 3. λ(Q) ≥ σ 2p . Proof. Suppose that λ(Q) < σ 2p . Then there exists a vector norm · in R2 such that the matrix norm of Q is also less than σ 2p . Denote by Br (a0 , v0 ) a ball of the radius r centered at (a0 , v0 ). For any ε > 0 there exists r = r(ε) > 0 such that for any (a , v ), (a , v ) ∈ Br(ε) (a0 , v0 ) the following inequality holds: (1) R(1) c (a , v ) − Rc (a , v ) ≤ (1 + ε)Q(a , v ) − (a , v ).
Choose ε so small that γ = (1 + ε)Q < σ 2p . For large enough n denote (n)
(n)
ρ1 = [1 + k1 , . . . , knp , 4], ρ2 = [1 + k1 , . . . , knp , 2]. Notice that p˜ 1 p˜ 3 = [1 + k1 , . . . , knp−2 ] < = [1 + k1 , . . . , knp−1 + 1] q˜1 q˜3 p˜ 2 (n) (n) < ρ1 < ρ2 < = [1 + k1 , . . . , knp−1 ]. q˜2 (n)
(n)
Take arbitrary α1 , α2 such that (n)
(n)
ρ(Tα (n) ,v,c ) = ρ1 , ρ(Tα (n) ,v,c ) = ρ2 . 1
2
Denote by (n)
(n)
(n)
(αj (i), vj (i)) = R(i) c (αj , v), j = 1, 2.
106
K. Khanin, D. Khmelev
Notice that ρ(Tα (n) (n),v (n) (n),c ) = [1+4] = 1/5 and ρ(Tα (n) (n),v (n) (n),c ) = [1+2] = 1/3. 1 1 2 2 It follows from Proposition 2 and Lemma 19 that there exists U > 0 such that uniformly in n, (n)
(n)
(n)
(n)
(α1 (n), v1 (n)) − (α2 (n), v2 (n)) ≥ U.
(41)
On the other hand it follows from Corollary 2 that there exists a constant C0 such that for n large enough, (n)
(n)
(α1 , v) − (α2 , v) < C0 ζ np .
(42)
Now choose n so large that C0 (ζ p )n γ n < min(r(ε), U ). Then (n)
(n)
(n)
(n)
(α1 (n), v1 (n)) − (α2 (n), v2 (n)) ≤ C0 (ζ p )n γ n < U, which contradicts (41).
It follows from Proposition 8 and Theorem 3 that for arbitrary p any fixed point (1) P (c) of Rc is uniformly hyperbolic, i.e., there exist two eigenvalues λu and λs of (1) dRc (P (c)) such that |λs | ≤ ζ p < 1 < ζ −p ≤ |λu |.
(43)
Uniform hyperbolicity (43) implies that any fixed point P (c) can be smoothly continued in parameter c for all c ∈ [1, ∞). More precisely the following proposition holds. (1)
Proposition 10. 1. Let P (c0 ) be a fixed point of Rc0 for some p even and c0 > 1. Then there exists a unique smooth1 curve {(P (c), c), c ≥ 1} of fixed points for transformation (1) Rc which contains (P (c0 ), c0 ). q 2. Let P (1) be a fixed point for R1 . Then there exists a unique curve {(P (c), c), c ≥ 1} (1) of fixed points for Rc = (R1/c ◦ Rc )p(q)/2 , where q for q even, p(q) = 2q for q odd. The proof of Proposition 10 is standard (see, for example [3]). Proposition 10 implies that in order to study the structure of periodic orbits for R it is enough to find all periodic orbits of R1 . Lemma 22. Let c = 1. Suppose (a0 , v0 ) ∈ 1 ∩ I1∞ and (an , vn ) = Rn1 (a0 , v0 ). Then there exists λ < 1 such that |vn | ≤ λn . Proof. Using (16) with c = 1 we get vn = (an an−1 )vn−1 , which together with Lemma 1 implies exponential decay of |vn | (see Proposition 3). 1
In fact, arbitrary smooth . . .
Renormalizations and Rigidity Theory
107
Denote 1 = {(a, 0) ∈ 0 < a ≤ 1}. It follows from Lemma 22 that all infinitely renormalizable trajectories converge to 1 . Hence all periodic orbits of R1 belong to 1 . Denote by [k1 , . . . , kq ] the irrational number with periodic continued fraction expansion with a period given by k1 , . . . , kq : [k1 , . . . , kq ] = [k1 , . . . , kq , k1 , . . . , kq , . . . ]. Lemma 23. 1. Let (a, 0) ∈ 1 ∩ I1 . Then R1 (a, 0) = ({1/a}, 0), where {·} stands for the fractional part of a number. 2. All periodic points for R1 have the following form: (a, v) = ([k1 , . . . , kq ], 0). 3. Let {(ai , 0), 1 ≤ i ≤ q} be a periodic orbit for R1 : ai = [ki , . . . , kq , k1 , . . . , ki−1 ]. Then {(ai , 0), 1 ≤ i ≤ q} is hyperbolic and dR1 (ai , 0) has stable and unstable eigenvalues given by λs =
q
ai2 , λu = (−1)q
i=1
q
ai2
−1 .
(44)
i=1
Remark 4. Notice that stable and unstable eigenvalues (44) satisfy (43). Proof. It is easy to check that 1 is invariant under renormalizations, namely R1 (1 ∩ I1 ) ⊂ 1 (see Proposition 2). Using definition (23) of fa we get in the case c = 1, v = 0, a fa (x) = x + . 1+a It follows that ρ(fa ) =
1 a = = [1 + k1 , k2 , . . . ], 1+a 1 + 1/a
(45)
where a = [k1 , k2 , . . . ]. Hence ρ(R(a, 0)) = ρ(a , 0) = [1 + k2 , k3 , . . . ], which gives a = [k2 , k3 , . . . ] = {1/a}. The second and the third statements follow immediately from the first one and the following fact: q
det(dR1 (a, 0)) = (−1)q if (a, 0) belongs to periodic orbit with period q.
108
K. Khanin, D. Khmelev
Proposition 10, Lemma 23 and Lemma 22 show that for a fixed c > 1 all periodic points of R are parameterized by irrational numbers a having periodic continued fraction expansion. Namely for all a = [k1 , . . . , kq ], where q is a minimal period there exists a unique periodic orbit of period p(q): q for q even, {(ai , vi ), 1 ≤ i ≤ q}, where p(q) = 2q for q odd, such that ρi = ρ(ai , vi ) =
1 1 + 1/[ki , . . . , kq , k1 , . . . , ki−1 ]
= [1 + ki , ki+1 , . . . , kq , k1 , . . . , ki−1 , ki , . . . ]. Fix c > 1 and a¯ = [k1 , . . . , kq ]. Denote by = (ρa¯ ) a curve of pairs (a, v) ∈ c such that ρ(a, v) = ρa¯ = 1/(1 + 1/a) ¯ (cf. Sect. 5). It follows from above that is invariant under (R1/c ◦ Rc )p(q)/2 = R(1) and contains a unique fixed point (a ∗ , v ∗ ) of (1) (1) Rc : Rc (a ∗ , v ∗ ) = (a ∗ , v ∗ ). It also follows that there are no other periodic points of (1) Rc in . (1) We next show that a unique fixed point (a ∗ , v ∗ ) is globally attractive under Rc restricted on . The proof uses the following general fact. Proposition 11 ([4], Coppel, 1955). Let f : I → I be a continuous map of a compact interval I into itself. If f has no periodic points of period 2 then for every x ∈ I the trajectory {f k (x)} is bimonotonic and converges to a fixed point of f . Remark 5. A sequence xk , k ≥ 0 is said to be bimonotonic if for every m ≥ 0, either xk > xm for all k > m, or xk = xm for all k > m, or xk < xm for all k > m. Recall that for any irrational 0 < ρ < 1/2 a set (ρ) can be considered as a graph of a continuous function aρ (v). For ρ = ρa¯ denote this curve a(v): = {(a, v) | a = a(v), v ∈ [0, c − 1]}. Theorem 4. 1. a(v) is a real-analytic function in a complex neighborhood of [0, c − 1]. ¯ > 0 such that for any (a, v) ∈ , 2. There exists a constant C0 = C0 (a) ∗ ∗ np(q) , R(n) c (a, v) − (a , v ) ≤ C0 ζ
(46)
(1) n (n) where Rc = Rc . Remark 6. Notice that a priori we can guarantee only continuity of a(v). However hyperbolicity implies that it is smooth, in fact, real-analytic. Proof. Consider continuous mapping f of an interval I = [0, c − 1] into itself which (1) corresponds to the action of Rc on parameterized by v: f : v → v (R(1) c (a(v), v)), where v is a v-coordinate of Rc (a(v), v). Since f has a fixed point at v ∗ ∈ (0, c − 1) and no other periodic orbits Proposition 11 implies that v ∗ is globally attracting for f . (1)
Renormalizations and Rigidity Theory
109
Since I is compact we immediately get that for any ε > 0 there exists n(ε) such that f n(ε) I ⊂ (v ∗ − ε, v ∗ + ε) (indeed one can choose a finite subcover of compact I from # (1) −i ∗ ∗ i≥1 f (v −ε, v +ε) ⊃ I ). Obviously, the same statement holds for Rc : → . Let (st) be a local stable manifold for a hyperbolic fixed point (a ∗ , v ∗ ). Obviously for all (a, v) ∈ (st) the rotation number ρ(a, v) = ρa¯ . Therefore c ∩ (st) ⊂ . (1) Since Rc is real-analytic it is well-known (see [56]) that (st) is real-analytic as well. Recall that the whole gets mapped inside (st) in a finite number of iterations. This implies real-analyticity of a(v) in a neighborhood of [0, c − 1]. Finally, hyperbolicity and an estimate (43) for eigenvalues gives (46). 8. The Convergence of Renormalizations for f ∈ Fc Fix c > 1 and consider f ∈ Fc such that ρ(f ) = [k1 , . . . , kn , . . . ] is of bounded type, i.e., 1 ≤ ki ≤ M for all i ≥ 1. Take x0 = xcr and define a sequence xi = Tf i xcr . Below we consider renormalizations (fn , gn ) = (Rxcr Hxcr )n−1 Rxcr Tf defined in Sect. 2. In this section we would be interested in the behavior of a triple (an (f ), vn (f ), cn ) which is defined as follows (see (2)): xcr − xqn−1 +qn xqn − xcr , bn (f ) = , xcr − xqn−1 xcr − xqn−1 $ cn − an (f ) − bn (f ) c for odd n, vn (f ) = , cn = bn (f ) 1/c for even n.
an (f ) =
In the next two lemma statements it is convenient to study behavior of the renormalization operator R in coordinates (a, b, c), introduced at the end of Sect. 2. We say that a point (a0 , b0 , c0 ) is γ -non-degenerate if the following two conditions hold: ˜ 0 , b0 , c0 ) is well1) (a0 , b0 , c0 ) is renormalizable, i.e., the point (a1 , b1 , c1 ) = R(a defined. 2) Both points (a0 , b0 , c0 ) and (a1 , b1 , c1 ) satisfy the following inequalities: 0 < γ < ai , 0 < γ < bi < 1 − γ , i = 0, 1. Let us define k(a, b, c) to be the first partial quotient of ρ(T˜a,b,c ) minus 1, k(a, b, c) = k1 if ρ(T˜a,b,c ) = [1 + k1 , k2 , . . . ]. Lemma 24. Let c0 = c. Suppose that (a0 , b0 , c0 ) is γ -non-degenerate and k(a0 , b0 , c0 ) ≤ M. Then there exist constants K = K(M, γ , c) > 1, δ = δ(M, γ , c) > 0 such that for any (a, b) ∈ Uδ (a, b) the following two statements hold: 1) (a, b, c) is renormalizable, 2) k(a, b, c) = k(a0 , b0 , c0 ). ˜ (a, b) ∈ Uδ (a0 , b0 ) we have Moreover, for any (a, ˜ b), ˜ a, ˜ ˜ c) − R(a, ˜ c) − (a, b, c). R( ˜ b, b, c) ≤ K(a, ˜ b,
(47)
˜ 0 , b0 , c0 ) is defined by the following relaProof. Let us recall that (a1 , b1 , c1 ) = R(a tions: c1 =
F˜ak ,b ,c (−1) F˜ak+1 1 1 ,b ,c (−1) = , a1 = − 0 0 0 , b1 = − 0 0 0 , c0 c a0 a0
(48)
110
K. Khanin, D. Khmelev
˜ [l] where k = k(a0 , b0 , c0 ). Denote by R c (a, b) a transformation corresponding to the th l iterate of F˜a,b,c in (48). Clearly in some δ0 -neighborhood of (a0 , b0 ) the following estimate holds: ˜ [l] max D R c (a, b) ≤ K 1≤l≤M
for some K = K(C, M, γ ) > 1. It is easy to see that one can take δ0 = δ0 (γ , M, c) uniform for all γ -non-degenerate (a0 , b0 , c0 ). ˜ [k] Let (a , b ) = R c (a, b). Then for all (a, b) ∈ Uδ (a0 , b0 ), max(|b − b1 |, |a − a1 |) ≤ (a , b ) − (a1 , b1 ) ≤ K(a, b) − (a0 , b0 ). If (a, b) − (a0 , b0 ) < δ = min(δ0 , γ /2K), then γ -non-degeneracy of (a0 , b0 , c0 ) imply that a > γ /2 > 0, 0 < γ /2 < b < 1 − γ /2. It follows that k(a, b, c) = k(a0 , b0 , c0 ). Lemma 25. There exist constants 0 < σ1 < 1, 0 < θ1 < 1, C¯ 1 > 0, N1 = N1 (f ) such that for all n > N1 (f ) the rotation number of Hxcr (fn , gn ) and T˜an (f ),bn (f ),cn coincide at the first [σ1 n] places of continued fraction expansion and ˜ s (an (f ), bn (f ), cn ) ≤ C¯ 1 θ n (an+s (f ), bn+s (f ), cn+s ) − R 1 for all 0 ≤ s ≤ σ1 n. (0)
Proof. Denote αi = (an+i (f ), bn+i (f ), cn+i ), i = 0, . . . , s. It follows from boundness of partial quotients of ρ(f ) that an+i (f ), vn+i (f ), bn+i (f ) are bounded from their extremal values uniformly in i and n. It follows from Theorem 1 that for n large enough, (0) 1) Points αi are renormalizable, i = 0, . . . , s, (0) 2) k(αi ) = kn+i , ˜ (0) − α (0) ≤ Cλn , i = 0, . . . , s. 3) Rα i i+1 Since kn+i ≤ M, statements 1) and 3) imply that there exist N¯ = N¯ (f ) and γ = (0) γ (M, f ) such that points αi , i = 0, . . . , s are γ -non-degenerate for all n > N¯ (f ). Let K = K(γ , M, f ), δ = δ(γ , M, f ) be the constants defined in Lemma 24. Assume that s is chosen in such a way that the following inequality holds: (1 + K + · · · + K s−1 )Cλn < δ.
(49)
We shall prove by induction that the following statements hold for l = 0, . . . , s. (l+1) (l) ˜ (l) , 1) Points αi , i = l, . . . , s are renormalizable and hence the points αi+1 = Rα i i = l, . . . , s are well-defined. (l) (0) 2) k(αi ) = k(αi ), i = l, . . . , s. (l+1) (0) − αi ≤ (1 + K + · · · + K l )Cλn , i = l, . . . , s. 3) αi Obviously statements 1)–3) hold true for l = 0. Assume that statements 1)–3) are proven for l − 1 ≤ s − 1. By the induction assumption 3) for l − 1, (l)
(0)
αi − αi ≤ (1 + K + · · · + K l−1 )Cλn < δ, (l)
i = l, . . . , s. (l)
(0)
Condition (49) and Lemma 24 imply that αi is renormalizable and k(αi ) = k(αi ) (l+1) ˜ (l) , i = l, . . . , s, are well-defined. for i = l, . . . , s. Hence points αi+1 = Rα i
Renormalizations and Rigidity Theory
111
Since (l+1)
αi
(1) ˜ (l) − Rα ˜ (0) ≤ Kα (l) − α (0) − αi = Rα i−1 i−1 i−1 i−1
≤ K(1 + K + · · · + K l−1 )Cλn , the following inequality holds: (l+1)
αi
(0)
(l+1)
− αi ≤ αi
(1)
(1)
− αi + αi
(0)
− αi
≤ (1 + K + · · · + K l )Cλn ,
for i = l, . . . , s.
Hence all statements 1)–3) hold for l as required. ˜ i α (0) = α (i) is well-defined, k(α (i) ) = kn+i and It follows that the trajectory R 0 i i (i)
(0)
αi − αi ≤ (1 + K + · · · + K i−1 )Cλn ,
i = 1, . . . , s,
which imply ˜ s (an (f ), bn (f ), cn ) − (an+s (f ), bn+s (f ), cn+s ) ≤ (1 + K + · · · + K s−1 )Cλn . R We next choose σ1 so small that (49) holds. Since (1 + K + · · · + K s−1 )Cλn ≤
Ks − 1 n C Cλ ≤ K s λn . K −1 K −1
It is enough to take σ1 such that θ1 = K σ1 λ < 1. Then for all 0 ≤ s ≤ σ1 n and n large enough (49) holds. Since a trajectory (an (f ), bn (f ), cn ) is uniformly non-degenerate, we can change coordinates to (an (f ), vn (f ), cn ). Lemma 26. There exists a constant C1 > 0 such that for all n > N1 (f ), (an+s (f ), vn+s (f ), cn+s ) − Rs (an (f ), vn (f ), cn ) ≤ C1 θ1n for all 0 ≤ s ≤ σ1 n. Lemma 27. There exists N0 = N0 (f ) such that (a2n (f ), v2n (f )) ∈ c for all n > N0 . Proof. It follows from Proposition 3 that any point (a, v) which is bounded away from its extremal values enters c in a finite number of iterations and stays strictly inside c : (a2n , v2n ) = (R1/c ◦ Rc )n (a, v) ∈ c for n > N . In fact, if k(ai , vi , ci ) ≤ M for i = 2n, 2n + 1, 2n + 2, then there exists δ0 = δ0 (M, c, λ) such that 0 < δ0 < a2n+1 a2(n+1) < 1 − δ0 . Then estimate (18) yields the existence of δ = δ(M, c, λ) such that δ(M, c, λ) < v2(n+1) < c − 1 − δ(M, c, λ). Application of Lemma 26 easily gives the necessary conclusion.
112
K. Khanin, D. Khmelev
Suppose now that ρ(f ) has a periodic continued fraction expansion with period q: ρ(f ) = [k1 , . . . , kq ]. Lemma 3 implies that the rotation number for the renormalization pair (fnq , gnq ) is given by ρ¯ =
1 = [k1 + 1, k2 , . . . , kq , k1 , . . . , kq , . . . ]. 1 + 1/ρ(f )
As before put
p = p(q) =
q 2q
for q even, for q odd.
Denote by (αn , νn ) a point of the curve = (ρ) ¯ such that νn = vnp (f ). Lemma 28. There exist 0 < θ2 < 1, C2 > 0, N2 = N2 (f ) such that (αn , νn ) − (anp (f ), vnp (f )) ≤ C2 θ2n for all n > N2 (f ). Proof. Lemma 26 guarantees that the rotation number of (anp (f ), vnp (f )) agrees with ρ¯ on the first [σ1 n] positions of the continued fraction expansion. This together with Corollary 2 give the statement of the lemma. Lemma 29. There exist N3 (f ), K1 > 1, 0 < σ2 < 1 such that for all n > N3 (f ), Rs (anp (f ), vnp (f ), c) − Rs (αn , νn , c) ≤ K1s (anp (f ), vnp (f ), c) − (αn , νn , c) (50) for all 0 ≤ s ≤ σ2 p. Proof. Notice that (αn , νn ) ∈ Ic∞ , ρ(αn , νn , c) = ρ(f ) is of bounded type and trajectory Rs (αn , νn , c) is uniformly non-degenerate. Hence the arguments as in the proof of Lemmas 24, 25 give (50). ˆ ) < 1 and C = C(f ) such that for n large enough, Theorem 5. There exist λˆ = λ(f (anp (f ), vnp (f )) − (a ∗ , v ∗ ) < C λˆ n . Proof. Denote (a ∗ , v ∗ ) the fixed point of Rc : → , Rc = (R1/c ◦ Rc )p(q)/2 corresponding to the rotation number ρ. ¯ Notice that for n > max(N0 , N1 , N2 , N3 ) we have an estimate (1)
(1)
(a(n+s)p (f ), v(n+s)p (f )) − (a ∗ , v ∗ ) s ≤ (a(n+s)p (f ), v(n+s)p (f )) − (R(1) c ) (anp (f ), vnp (f )) s (1) s + (R(1) c ) (anp (f ), vnp (f )) − (Rc ) (αn , νn ) s ∗ ∗ + (R(1) c ) (αn , νn ) − (a , v ) np
sp
≤ C1 θ1 + C2 θ2n K1 + C4 λs0 , provided 0 ≤ s ≤ min ([σ1 n], [σ2 n]). Obviously one can choose an appropriate s = [σ3 n] such that the right-hand side of the last inequality is estimated from above by C λˆ n .
Renormalizations and Rigidity Theory
113
9. Proof of Theorem A In this section we consider f , g ∈ Fc and prove that they are smoothly conjugate provided they have the same break c and share the same rotation number of bounded type. Since the case c < 1 can be reduced to c > 1 by reversing the orientation of the circle we assume without loss of generality that c > 1. By the Denjoy theorem there exists a homeomorphism T h with a lift h ∈ C(R) such that f = h−1 ◦ g ◦ h. Our aim is to show that h is C 1+γ (R) for some γ > 0. Consider dynamical partitions ξn (f ) formed by the finite trajectory Pn (f ) of the break point xcr , Pn (f ) = {Tf i x0 | 0 ≤ i < qn + qn−1 }, where x0 = xcr . Partition ξn (f ) consists of two sets of intervals: & % & % (n) (n) (n−1) (n−1) = Tf j 0 , 0 ≤ j < qn , ξn (f ) = i = Tf i 0 , 0 ≤ i < qn−1 , ∪ j (n)
(n−1)
where 0 , 0 are open intervals with endpoints at (x0 , xqn ) and (xqn−1 , x0 ) respectively. Choose T h in such a way that T h(xcr (f )) = xcr (g). This obviously implies that if u ∈ Pn (f ) and u = Tf i xcr then T h(u) = v ∈ Pn (g) and v = T g i xcr . We shall use the following criteria of smoothness for h. Proposition 12 (de Faria, de Melo, [12]). Assume that the rotation number ρ(f ) is of bounded type. Suppose that there exist constants C > 0, 0 < λ¯ < 1 such that |1 | |h(1 )| ¯n (51) | | − |h( )| ≤ C λ 2 2 for each pair of adjacent elements 1 , 2 ∈ ξn (f ) for all n = 1. Then h is a C 1+γ -diffeomorphism for some γ > 0. Below we describe a plan for the proof of estimate (51). We first consider the set (n) (n−1) (n) (f ) and show that condition (51) holds for any 1 ∪ 2 ⊂ 0 (f ) ∪ 0 (f ) ∪ 0 (n−1) (f ), where 1 , 2 are adjacent elements of ξn+m and m = [σ n] for some σ > 0. 0
This fact is implied by the exponential convergence of renormalizations. We then extend estimate (51) for any two adjacent elements of partition ξn+m . This (n) (n−1) (f ) and showing can be done by iterating suitable intervals inside 0 (f ) ∪ 0 that the distortion of |1 |/|2 | is small. Finally, since m = [σ n] we can replace λ¯ n by 1/(1+σ ) n+m . λ¯ In what follows we shall give precise statements which implement this plan. Denote (0) (1) (n−1) (f ), by Tn f , Tn f the mappings Tf qn , Tf qn−1 defined on a closure of intervals 0 (n) 0 (f ) respectively: (n−1)
Tn(0) f x = Tf qn x, x ∈ 0 Tn(1) f x (0)
(1)
= Tf
qn−1
x, x ∈
,
(n) 0 .
We also define T0 f = T0 f = Tf . We shall show that any element of the partition (0) (1) ξn (f ) can be obtained by successive iteration of Ti f , Ti f , i = n − 1, . . . , 0. Notice that this statement is of combinatorial nature. In particular one can assume that Tf is a pure rotation by an arbitrary irrational angle. In fact only the existence of the dynamical partition ξn (f ) of the level n is required.
114
K. Khanin, D. Khmelev
Lemma 30. Consider the partition ξn (f ) consisting of (n) 0 ≤ i < qn−1 , i , (n−1) j , 0 ≤ j < qn . (n−1)
For any j
, 0 ≤ j < qn there exists a sequence of functions ϕ0 , . . . , ϕn−1 such that (n−1)
j
(n−1)
= ϕ0 ◦ · · · ◦ ϕn−1 0
,
where ϕi = (Ti f )ri −si (Ti f )si and 0 ≤ ri ≤ ki+1 , ri − si ≥ 0, si = 0, 1 for 0 ≤ i ≤ n − 1. (0)
(1)
(n−1)
Remark 7. Although the statement of Lemma 30 is formulated only for intervals j (n) i ,
,
(n) i
0 ≤ i < qn−1 . Indeed, since ⊂ 0 ≤ j < qn , it also holds for intervals (n−2) ∈ ξn−1 (f ), we can apply Lemma 30 for the partition ξn−1 (f ). This yields the i following formula: (n−1) (n−1) = ϕ0 ◦ · · · ◦ ϕn−2 0 , j where ϕi = (Ti f )ri −si (Ti f )si and 0 ≤ ri ≤ ki+1 , ri − si ≥ 0, si = 0, 1 for 0 ≤ i ≤ n − 2. (0)
(1)
Proof. We shall prove the lemma by induction. Obviously the statement holds true for n = 1. Indeed, in this case q0 = 1, q1 = k1 and the partition ξ1 has the following form: % & % & (1) (0) (0) ξ1 = 0 ∪ 0 , . . . , q1 −1 . Hence j = Tf j 0 = (T0 f )j 0 and one can take ϕ0 = (T0 f )r0 −s0 (T0 f )s0 , r0 = j , s0 = 0. Assume that the statement of the lemma holds true for all dynamical partitions of level less than or equal to n. Let us prove the statement for ξn+1 . Notice that ξn+1 consists of two sequences of intervals: % & % & (n+1) (n) ξn+1 = i , 0 ≤ i < qn ∪ j , 0 ≤ j < qn+1 . (0)
(0)
(0)
(0)
(0)
(1)
We shall consider two cases: 0 ≤ j < qn−1 and qn−1 ≤ j < qn+1 . (n) (n) (n−2) (n) Firstly, consider j for 0 ≤ j < qn−1 . Since 0 ⊂ 0 we have j ⊂ (n−2)
j
for all 0 ≤ j < qn−1 . Application of the induction assumption to (n−1)
ξn−1 = {i
(n−2)
, 0 ≤ i < qn−2 } ∪ {j
, 0 ≤ j < qn−1 }
yields the existence of a sequence ϕ0 , . . . , ϕn−2 such that (n−2)
j
(n−2)
= ϕ0 ◦ · · · ◦ ϕn−2 0
.
Choosing ϕn−1 = id and ϕn = id, we obtain that (n)
(n)
j = ϕ0 ◦ · · · ϕn−2 ◦ ϕn−1 ◦ ϕn 0 as required.
Renormalizations and Rigidity Theory
115
(n)
In the other case j , qn−1 ≤ j < qn+1 one has j = Tf j −qn−1 Tn(1) f 0 , (n)
(1)
(n)
(n)
(n)
since j ≥ qn−1 and Tn f 0 = Tf qn−1 0 . Since j < qn+1 and qn+1 = kn+1 qn + qn−1 there exists 0 ≤ rn ≤ kn+1 − 1 such that 0 ≤ j − qn−1 − rn qn < qn . By construction,
(n)
(n−1)
(Tn(0) f )rn (Tn(1) f )0 ⊂ 0 (n)
and j = Tf j −qn−1 −rn qn (Tn(0) f )rn (Tn(1) f )0 . (n)
(n)
(n−1)
Notice that j ⊂ j −qn−1 −r qn . Applying the induction assumption for ξn we get a n sequence of the function ϕ0 , . . . , ϕn−1 of the form required such that (n−1)
(n−1)
j −qn−1 −r qn = ϕ0 ◦ · · · ◦ ϕn−1 0 n
.
Hence choosing ϕn = (Tn f )rn −sn (Tn f )sn , rn = rn + 1, sn = 1, we obtain (0)
(1)
(n)
(n)
j = ϕ0 ◦ · · · ◦ ϕn 0 . (n)
(n−1)
Lemma 31. Assume that v ∈ Pn+m ∩ (0 (f ) ∪ 0 sequence ϕn , . . . , ϕn+m such that
(f )). Then there exists a
v = ϕn ◦ · · · ◦ ϕn+m xcr , where ϕi = (Ti f )ri −si (Ti f )si and 0 ≤ ri ≤ ki+1 , ri − si ≥ 0, si = 0, 1 for (1) 0 ≤ i ≤ n + m − 1, and ϕn+m = (Tn+m f )sn+m , sn+m = 0 or 1. (0)
(1)
(n)
(n−1)
Proof. An arbitrary point v ∈ Pn+m ∩ (0 (f ) ∪ 0 (f )) is an endpoint for two intervals 1 , 2 ∈ ξn+m , 1 = 2 . One of intervals 1 or 2 is of rank n + m − 1: (n+m−1) l = j for some 0 ≤ j < qn+m and l = 1 or 2. It follows from the proof of (n+m−1)
, where ϕi , n ≤ i ≤ n + m − 1, are Lemma 30 that l = ϕn ◦ · · · ϕn+m−1 0 (n+m−1) under mappings of the required form. Hence v is an image of an endpoint of 0 ϕn ◦· · ·◦ϕn+m−1 . To finish the proof it is enough to notice that xqn+m−1 = Tf qn+m−1 xcr = (1) Tn+m f xcr . Proposition 13. There exist constants 0 < µ0 < 1, C0 > 0, µ0 = µ0 (f, g), C0 = C0 (f, g) such that an (f ) < C0 µn (52) − 1 0 a (g) n for all n ∈ N. Proof (Proof follows from Theorem 5). (n)
(n)
Denote αn = |0 (f )|/|0 (g)|.
116
K. Khanin, D. Khmelev
Lemma 32. There exist constants α > 0, C1 = C1 (C0 , µ0 ) > 0, such that |αn − α| < C1 µn0 for all n ∈ N. Proof. It follows from (52) that
1 − αn ≤ C0 µn . 0 αn−1
Hence αn = (1 + εn−1 )αn−1 , where |εn | ≤ C0 µn0 . This implies that the sequence αn is a Cauchy sequence and hence there exists a limit α = limn→∞ αn . Moreover, convergence to α is exponential: |εm | ≤ K2 µn0 , |log α − log αn | ≤ K1 m≥n
which implies the statement.
It is convenient to assume that α = 1. Formally, one can use the following trivial lemma. Lemma 33. There exists an arbitrary smooth conjugation g˜ of g such that |(n) (f )| 0 − 1 ≤ C1 µn0 (n) | (g)| ˜ 0
(53)
for some constant C1 > 0. In what follows for simplicity we denote g˜ by g. We also assume for simplicity that xcr (f ) = xcr (g) = 0. Lemma 34. There exist constants C2 = C2 (C1 , f, g) > 1, C3 = C3 (C1 , f, g) > 0, ˆ ), λ(g)) ˆ 0 < µ1 = µ1 (µ0 , λ(f ), λ(g), λ(f < 1 such that for any x, x satisfying one of the following conditions: a) x ∈ 0 (f ), x ∈ 0 (g), (n−1) (n−1) (f ), x ∈ 0 (g), b) x ∈ 0 (n)
(n)
the following estimate holds: |Tn(l) f (x) − Tn(l) g(x )| ≤ C2 |x − x | + C3 |0 (f )|µn1 , (n)
where l = 1 for case a) and l = 0 for case b). Proof. Consider the case a). Notice that (n) (l) (l) (l) |0 (f )| (l) |Tn f (x) − Tn g(x )| ≤ Tn f (x) − Tn f x (n) |0 (g)| (n) (l) |0 (f )| (l) + Tn f x g(x ) − T . n (n) |0 (g)|
(54)
Renormalizations and Rigidity Theory
117
Since renormalizations of f and g are exponentially close (by Theorem 1 and Theorem 5), we obtain that n (n) (l) (n) |0 (f )| (l) ˆ ˆ − T g(x ) ≤ K | (g)| max(λ(f ), λ(g), λ(f ), λ(g)) . Tn f x 1 n 0 (n) | (g)| 0
In the estimate above it is essential that x |0 (f )|/|0 (g)| and x represent the same (l) point in renormalized coordinates for f and g respectively. Since the derivative of Tn f is bounded by a Denjoy-type inequality, we obtain (n) (n) | (f )| | (f )| (l) ≤ K2 x − x 0(n) Tn f (x) − Tn(l) f x 0(n) |0 (g)| |0 (g)| (n) (n) |0 (f )| ≤ K2 |x − x | + K2 x (n) − 1 ≤ K2 |x − x | + K2 C1 µn0 |0 (g)|. | (g)| 0 (n)
(n)
ˆ ), λ(g), ˆ µ0 ) we have Combining these estimates and setting µ1 = max(λ(f ), λ(g), λ(f |Tn(l) f (x) − Tn(l) g(x )| ≤ K2 |x − x | + |0 (g)|K3 µn1 . (n)
(n)
(n)
By Lemma 33 |0 (f )|/|0 (g)| is exponentially close to 1 which implies (54). The case b) can be considered in an analogous way. Lemma 35. There exist constants σ1 > 0, 0 < µ2 < 1, C4 > 0 such that the follow(n) (n−1) (f )) and w = h(v) ∈ ing estimate takes place for v ∈ Pn+m (f ) ∩ (0 (f ) ∪ 0 (n) (n−1) (g)): Pn+m (g) ∩ (0 (g) ∪ 0 (n)
|v − h(v)| ≤ C4 µn2 |0 (f )| for each 0 ≤ m ≤ σ1 n.
(55)
The same estimate holds for v = xqn or v = xqn−1 . Proof. The case v = xqn or v = xqn−1 is covered completely by Lemma 33. (n) (n−1) (f )) is obtained It follows from Lemma 31 that v ∈ Pn+m (f ) ∩ (0 (f ) ∪ 0 (0) (1) by iterations of Ti f and Ti f : v = ϕn ◦ · · · ◦ ϕn+m−1 ◦ ϕn+m (xcr (f )), where ϕi = (Ti f )ri −si (Ti f )si , 0 ≤ ri ≤ ki+1 , ri − si ≥ 0, si = 0, 1 for n ≤ i ≤ (1) n + m − 1 and ϕn+m = (Tn+m f )sn+m , sn+m = 0 or 1. A similar sequence ψi exists for h(v) = w: w = ψn ◦ · · · ◦ ψn+m (xcr (g)), (0)
(1)
where ψi = (Ti g)ri −si (Ti g)si for n ≤ i ≤ n + m − 1 and ψn+m = (Tn+m g)sn+m , sn+m = 0 or 1. Denote (0)
(1)
vi = ϕi ◦ · · · ϕn+m (xcr (f )), wi = ψi ◦ · · · ◦ ψn+m (xcr (g)).
(1)
118
K. Khanin, D. Khmelev
Iterating (54) ri times yields |ϕi (vi+1 ) − ψi (wi+1 )| ≤ C2ri |vi+1 − wi+1 | + |0 (f )|µi1 C3 (1 + C2 + · · · + C2ri −1 ) (i)
(i)
≤ K1 |vi+1 − wi+1 | + |0 (f )|µi1 K2 , where K1 = K1 (C2 , M) = C2M , K2 = K2 (C2 , C3 , M) = C3 (C2M − 1)/(C2 − 1) and we used that 0 ≤ ri ≤ ki+1 ≤ M. Now we iterate the previous estimate: (n)
|vn − wn | ≤ K1 |vn+1 − wn+1 | + K2 µn1 |0 (f )| (n)
(n+1)
≤ K12 |vn+2 − wn+2 | + K2 µn1 |0 (f )| + K1 K2 µn+1 1 |0 ≤ ··· ≤ K1m+1 |xcr (f ) − xcr (g)| + K2
n+m−1
(f )|
µi1 |(i) (f )|K1i−n
i=n (n)
(n+1)
≤ K2 µn1 max(|0 (f )|, |0 ≤
(f )|)
m−1
µi1 K1i
i=0 (n) (n+1) n m K3 max(|0 (f )|, |0 (f )|)µ1 K1 ,
where K3 = K3 (K1 , K2 , c). Taking σ1 > 0 so small that µ2 = µ1 K1σ1 < 1 and using an estimate (n+1)
|0
(f )|
(n) |0 (f )|
= an (f ) < c
which holds for all n large enough, we obtain (55).
Lemma 36. There exist constants σ2 > 0, 0 < µ3 < 1, C5 > 0 such that for any 0 ≤ m ≤ σ2 n, |L| |h(L)| n (56) − |R| |h(R)| ≤ C5 µ3 , (n)
(n−1)
for all adjacent elements L, R ∈ ξn+m (f ) such that L ∪ R ⊂ 0 (f ) ∪ 0
(f ).
Proof. Let v1 , v2 , v3 ∈ ξ(f ) be the endpoints of L and R, v2 being their common endpoint. Let w1 , w2 , w3 be the corresponding endpoints of h(L), h(R). (n) Then we have by (55) that |vi − wi | ≤ C4 µn2 |0 (f )|. Since the rotation number is of bounded type the following apriori estimates take place: (n)
|v1 − v2 |, |v2 − v3 |, |w1 − w2 |, |w2 − w3 | ≥ K1 µm ∗ |0 (f )|, 0 < µ∗ < 1, K1 > 0, 1 |v1 − v2 | |w1 − w2 | , ≤ ≤ K2 . K2 |v2 − v3 | |w2 − w3 |
Renormalizations and Rigidity Theory
Hence
119
|L| |h(L)| |v1 − v2 | |w1 − w2 | − = − |R| |h(R)| |v − v | |w − w | 2 3 2 3 |v1 − v2 | |w1 − w2 | |v2 − v3 | = 1 − |v2 − v3 | |v1 − v2 | |w2 − w3 | |w1 − w2 | |v2 − v3 | ≤ K2 1 − = K2 |1 − xy|, |v1 − v2 | |w2 − w3 |
where x=
|w1 − w2 | , |v1 − v2 |
y=
|v2 − v3 | . |w2 − w3 |
Notice that (n) |w1 − w2 | − |v1 − v2 | 2C4 µn2 |(n) 2C4 µn2 |0 (f )| 0 (f )| ≤ |x − 1| = ≤ ≤ K3 µn3 , (n) m |v1 − v2 | |v1 − v2 | K1 µ∗ |0 (f )| 2 < 1 and σ2 is chosen small enough. A similar estimate holds for where µ3 = µ2 µ−σ ∗ y which gives |1 − xy| ≤ K4 µn3 .
Lemma 37. There exist constants σ3 , 0 < µ4 < 1, C6 > 0 such that for m = [σ3 n], |L| |h(L)| n (57) − |R| |h(R)| ≤ C6 µ4 , for all adjacent elements L, R ∈ ξn+m (f ). (n)
(n−1)
(f ). Proof. Lemma 36 implies that the statement is true for L ∪ R ⊂ 0 (f ) ∪ 0 (n) (n−1) (f ). Then there exists j < qn + qn−1 such Consider now L ∪ R ⊂ 0 (f ) ∪ 0 (n) (n−1) (f ), L , R ∈ ξn+m . that L ∪ R = f j (L ∪ R ), where (L ∪ R ) ∈ 0 (f ) ∪ 0 l / f (L ∪ R ) for l = 1, . . . , j − 1. This statement is obvious when Moreover, xcr (f ) ∈ L ∪ R ⊂ ∈ ξn . If L and R belong to two different elements of partition ξn , then their joint endpoint is equal to Tf j xcr , where 0 ≤ j < qn + qn−1 . In this case one can (n+m) (n+m−1) , 0 } provided m ≥ 2. take {L , R } = {0 For any U ∈ ξn+m denote by (U ) the element of ξn such that U ⊂ (U ). It follows from Lemma 1, that |U | ≤ λm |(U )| for all U ∈ ξn+m , where λ = λ(f ). Therefore j −1 l=0
|f l U | ≤ λm
j −1
|f l (U )| ≤ 2λm .
l=0
Consider two arbitrary adjacent elements of the partition V1 , V2 ∈ ξn+m (f ). Let V be the smallest interval, containing both V1 and V2 : |V | = |V1 | + |V2 |, V1 , V2 ⊂ V . Suppose that f is twice differentiable on V . Then |f (V1 )| |V1 | f (ζ1 ) f (ζ3 ) : = =1+ (ζ1 − ζ2 ) = 1 + δ, |f (V2 )| |V2 | f (ζ2 ) f (ζ2 )
120
K. Khanin, D. Khmelev
where ζ1 ∈ V1 , ζ2 ∈ V2 , ζ3 ∈ (ζ1 , ζ2 ) ⊂ V , and |δ| < K0 |V |, K0 =
maxx∈S 1 \xcr f (x) minx∈S 1 \xcr f (x)
.
Since 1/K1 ≤ |V1 |/|V2 | ≤ K1 , the following estimate holds: |f (V1 )| |V1 | |f (V )| − |V | ≤ K1 δ ≤ K2 |V |. 2 2
(58)
Consider now two cases. First, assume that xcr (f ) ∈ / L or xcr (f ) ∈ / R . In this case we can apply estimate (58) to get j −1 |L | |f j (L )| |f l (L ∪ R )| ≤ 4K2 (λ(f ))m . |R | − |f j (R )| ≤ K2 l=0
The same estimate for g yields j −1 |h(L )| |g j (h(L ))| |g l (h(L ∪ R ))| ≤ 4K3 (λ(g))m . − ≤ K 3 |h(R )| |g j (h(R ))| l=0
Using Lemma 36 |L| |h(L)| |L| |L | |L | |h(L )| |h(L )| |h(L)| ¯ n − ≤ − + − + − |R| |h(R)| |R| |R | |R | |h(R )| |h(R )| |h(R)| ≤ C6 µ¯ 4 , where µ¯ 4 = max(λ(f )σ2 , λ(g)σ3 , µ3 ) < 1. It remains to study the case when xcr (f ) (n+m) (n+m−1) (f ) ∪ 0 (f ). Consider is a joint endpoint of L and R , i.e., L ∪ R = 0 L = f (L ) and R = f (R ). Notice that for the iterates of L and R we can use the previous estimates: |h(L )| |L | |f j −1 (L )| |g j −1 (h(L ))| m m |R | − |f j −1 (R )| ≤ 4K2 λ(f ) , |h(R )| − |g j −1 (h(R ))| ≤ 4K3 (λ(g)) . By Lemma 1 intervals L , R , h(L ), h(R ) are exponentially small. Therefore |L | |L | 2 = (c + δ(f )) |R | |R | and
|h(L )| |h(L )| 2 + δ(g)) = (c , |h(R )| |h(R )|
where |δ(f )|, |δ(g)| ≤ K4 max(λ(f ), λ(g))n+m . Hence |L| |h(L)| |R| − |h(R)| |L| |L | |L | |h(L )| |h(L )| |h(L)| ≤ − + − + − ≤ Cˆ 6 µˆ n4 , |R| |R | |R | |h(R )| |h(R )| |h(R)|
Renormalizations and Rigidity Theory
since
121
|L | |h(L )| c2 + ε(g) |R | |h(L )| − ≤ K 1 − 5 |R | |h(R )| c2 + ε(f ) |L | |h(R )| ≤ K6 max(µn3 , λ(f )n+m , λ(g)n+m ).
Proof (Proof of Theorem A). Lemma 37 implies that for all adjacent L, R ∈ ξn , |L| |h(L)| n |R| − |h(R)| ≤ C6 µ5 , 1/(1+σ3 )
where µ5 = µ4
. This together with Proposition 12 yields the smoothness required.
10. Concluding Remarks 1. We have considered in this paper the simplest case of rotation numbers with eventually periodic continued fraction expansion. The results in this case are based on the analysis of the hyperbolic properties of the periodic orbits for the renormalization transformation. In fact, the renormalizations are uniformly hyperbolic for all irrational rotation numbers. This will be shown in the forthcoming paper [32] where the global structure of the hyperbolic horseshoe for the renormalization transformation is studied. In particular, this implies the rigidity results for all irrational rotation numbers of the bounded type. 2. Another more algebraic approach to renormalizations of fractional-linear mappings with breaks was suggested by M. Martens and F. Tangermann [50]. 3. One of the most interesting problems in the whole subject is the analysis of the rigidity properties for non-typical irrational rotation numbers. It is quite possible that in the case of circle homeomorphisms with singularities contrary to the case of diffeomorphisms certain rigidity holds for all irrational rotation numbers. There are very few results in this area. One of them is due to de Faria and de Melo [12]. They have shown that in the non-analytic case one cannot have C 1+γ smoothness of conjugation for some irrational rotation numbers. However, their argument doesn’t rule out a possibility of C 1 smoothness or even C 1+γ in the analytic case. Below we formulate three questions. In our opinion it is reasonable to expect an affirmative answer to all of these questions. I. Suppose Ta1 ,v1 ,c and Ta2 ,v2 ,c have the same irrational rotation number. Is it true that Ta1 ,v1 ,c and Ta2 ,v2 ,c are C 1+γ smoothly conjugate? II. Let f and g be analytic in some complex neighborhood [0, 1], Tf of the interval f (1) g (1) and T g have the same irrational rotation number and c = = = 1. Is it f (0) g (0) true that Tf and T g are C 1+γ smoothly conjugate? III. Let f and g satisfy conditions 1)–4) from the introduction and have the same value of break c and the same irrational rotation number. Is it true that there exists an absolutely continuous conjugation between Tf and T g? Acknowledgement. The authors are grateful to E. Ghys, A. Teplinsky and M. Yampolsky for very useful discussions.
122
K. Khanin, D. Khmelev
References 1. Abad, J.J., Koch, H.: Renormalization and periodic orbits for Hamiltonian flows. Commun. Math. Phys. 212(2), 371–394 (2000) 2. Arnol’d, V.I.: Small denominators. I. Mapping the circle onto itself. Izv. Akad. Nauk SSSR Ser. Mat. 25, 21–86 (1961) 3. Arnol’d, V.I.: Ordinary Differential Equations. Berlin: Springer-Verlag, 1992. Translated from the third Russian edition by Roger Cooke 4. Block, L.S., Coppel, W.A.: Dynamics in One Dimension. Berlin: Springer-Verlag, 1992 5. Collet, P., Eckmann, J.-P., Koch, H.: On universality for area-preserving maps of the plane. Phys. D 3(3), 457–467 (1981) 6. Collet, P., Eckmann, J.-P., Koch, H.: Period doubling bifurcations for families of maps on Rn . J. Stat. Phys. 25(1), 1–14 (1981) 7. Collet, P., Eckmann, J.-P., Lanford, III O.E.: Universal properties of maps on an interval. Commun. Math. Phys. 76(3), 211–254 (1980) 8. Cornfeld, I.P., Fomin, S.V., Sinai, Ya.G.: Ergodic Theory. New York: Springer-Verlag, 1982. Translated from the Russian by A. B. Sosinski˘ı 9. Cvitanovi´c, P.: Universality in Chaos. Bristol: Adam Hilger Ltd., 1984 10. Davie, A.M.: Period doubling for C 2+ mappings. Commun. Math. Phys. 176(2), 261–272 (1996) 11. de Faria, E.:Asymptotic rigidity of scaling ratios for critical circle mappings. Ergodic Theory Dynam. Systems 19(4), 995–1035 (1999) 12. de Faria, E., de Melo, W.: Rigidity of critical circle mappings. I. J. Eur. Math. Soc. (JEMS) 1(4), 339–392 (1999) 13. de Faria, E., de Melo, W.: Rigidity of critical circle mappings. II. J. Am. Math. Soc. 13(2), 343–370 (2000) 14. de Melo, W., Pinto, A.A.: Rigidity of C 2 infinitely renormalizable unimodal maps. Commun. Math. Phys. 208(1), 91–105 (1999) 15. Eckmann, J.-P., Epstein, H.: On the existence of fixed points of the composition operator for circle maps. Commun. Math. Phys. 107(2), 213–231 (1986) 16. Eckmann, J.-P., Koch, H., Wittwer, P.: A computer-assisted proof of universality for area-preserving maps. Mem. Am. Math. Soc. 47(289), vi+122 (1984) 17. Eckmann, J.-P., Wittwer, P.: Multiplicative and additive renormalization. In: Ph´enom`enes Critiques, Syst`emes Al´eatoires, Th´eories de jauge, Part I, II (Les Houches, 1984). Amsterdam: North-Holland, 1986, pp. 455–465 18. Eckmann, J.-P., Wittwer, P.: A complete proof of the Feigenbaum conjectures. J. Stat. Phys. 46(3–4), 455–475 (1987) 19. Feigenbaum, M.J.: Quantitative universality for a class of nonlinear transformations. J. Stat. Phys. 19(1), 25–52 (1978) 20. Feigenbaum, M.J.: Metric universality in nonlinear recurrence. In: Stochastic Behavior in Classical and Quantum Hamiltonian Systems (Volta Memorial Conf., Como, 1977). Berlin: Springer, 1979, pp. 163–166 21. Feigenbaum, M.J., Kadanoff, L.P., Shenker, S.J.: Quasiperiodicity in dissipative systems: A renormalization group analysis. Phys. D 5(2–3), 370–386 (1982) 22. Gol’berg, A.I., Sinai, Ya.G., Khanin, K.M.: Universal properties of sequences of period-tripling bifurcations. Usp. Mat. Nauk 38(1(229)), 159–160 (1983) 23. Greene, J.M., MacKay, R.S., Vivaldi, F., Feigenbaum, M.J.: Universal behaviour in families of area-preserving maps. Phys. D 3(3), 468–486 (1981) 24. Hardy, G.H., Wright, E.M.: An Introduction to the Theory of Numbers. New York: The Clarendon Press Oxford University Press, Fifth edition, 1979 25. Haydn, N.T.A.: On invariant curves under renormalisation. Nonlinearity 3(3), 887–912 (1990) 26. Herman, M.-R.: Sur la conjugaison diff´erentiable des diff´eomorphismes du cercle a` des rotations. ´ Inst. Hautes Etudes Sci. Publ. Math. 49, 5–233 (1979) 27. Herman, M.-R.: R´esultats r´ecents sur la conjugaison diff´erentiable. In: Proceedings of the International Congress of Mathematicians (Helsinki, 1978). Helsinki: Acad. Sci. Fennica, 1980, pp. 811–820 28. Katznelson, Y., Ornstein, D.: The absolute continuity of the conjugation of certain diffeomorphisms of the circle. Ergodic Theory Dynam. Systems 9(4), 681–690 (1989) 29. Katznelson, Y., Ornstein, D.: The differentiability of the conjugation of certain diffeomorphisms of the circle. Ergodic Theory Dynam. Systems 9(4), 643–680 (1989) 30. Khanin, K.M.: Universal estimates for critical circle mappings. Chaos 1(2), 181–186 (1991) 31. Khanin, K.M.: Rigidity for circle homeomorphisms with a break-type singularity. Dokl. Akad. Nauk 357(2), 176–179 (1997)
Renormalizations and Rigidity Theory
123
32. Khanin, K.M., Khmelev, D., Teplinsky, A.: Homeomorphisms of the circle with the break-type singularity II. 2002. In preparation 33. Khanin, K.M., Sinai, Ya.G.: A new proof of M. Herman’s theorem. Commun. Math. Phys. 112(1), 89–101 (1987) 34. Khanin, K.M., Vul, E.B.: Circle homeomorphisms with weak discontinuities. In: Dynamical Systems and Statistical Mechanics (Moscow, 1991). Providence, RI: Am. Math. Soc., 1991, pp. 57–98 35. Koch, H.: A renormalization group for Hamiltonians, with applications to KAM tori. Ergodic Theory Dynam. Systems 19(2), 475–521 (1999) 36. Koch, H.: On the renormalization of Hamiltonian flows, and critical invariant tori. Discrete Contin. Dyn. Syst. 8(3), 633–646 (2002) 37. Kosygin, D.V.: Multidimensional KAM theory from the renormalization group viewpoint. In: Dynamical Systems and Statistical Mechanics (Moscow, 1991). Providence, RI: Am. Math. Soc., 1991, pp. 99–129 Translated from the Russian by V. E. Naza˘ıkinski˘ı 38. Lanford, III O.E.: A computer-assisted proof of the Feigenbaum conjectures. Bull. Am. Math. Soc. (N.S.) 6(3), 427–434 (1982) 39. Lanford, III O.E.: Functional equations for circle homeomorphisms with golden ratio rotation number. J. Stat. Phys. 34(1–2), 57–73 (1984) 40. Lanford, III O.E.: A shorter proof of the existence of the Feigenbaum fixed point. Commun. Math. Phys. 96(4), 521–538 (1984) 41. Lanford, III O.E.: Renormalization group methods for circle mappings. In: Nonlinear Evolution and Chaotic Phenomena (Noto, 1987). New York: Plenum, 1988, pp. 25–36 42. Lopes Dias, J.: Renormalization of flows on the multidimensional torus close to a KT frequency vector. Nonlinearity 15(3), 647–664 (2002) 43. Lopes Dias, J.: Renormalization scheme for vector fields on T2 with a Diophantine frequency. Nonlinearity 15(3), 665–679 (2002) 44. Lopes Dias, J.: Renormalizations of Vector Fields. PhD thesis, University of Cambridge, 2002 45. Lyubich, M.: Feigenbaum-Coullet-Tresser universality and Milnor’s hairiness conjecture. Ann. of Math. (2) 149(2), 319–420 (1999) 46. MacKay, R.: Renormalization of Area-preserving Maps. PhD thesis, Princeton University, 1982. Was later published by World Scientific Publishing Co. Inc. (1993) 47. MacKay, R.S.: A renormalisation approach to invariant circles in area-preserving maps. Phys. D 7(1–3), 283–300 (1983) Oreder in chaos (Los Alomas, N.M., 1982) 48. MacKay, R.S.: Renormalisation of bicritical circle maps. Phys. Lett. A 187(5–6), 391–396 (1994) 49. Martens, M.: The periodic points of renormalization. Ann. Math. (2) 147(3), 543–584 (1998) 50. Martens, M., Tangerman, F.M.: Private communication, 1995 51. McMullen, C.T.: Complex Dynamics and Renormalization. Princeton, NJ: Princeton University Press, 1994 52. McMullen, C.T.: Renormalization and 3-manifolds which Fiber over the Circle. Princeton, NJ: Princeton University Press, 1996 53. Mestel, B.D., Osbaldestin, A.H.: Feigenbaum theory for unimodal maps with asymmetric critical point: Rigorous results. Commun. Math. Phys. 197(1), 211–228 (1998) 54. Mestel, B.D., Osbaldestin, A.H.: Asymptotics of scaling parameters for period-doubling in unimodal maps with asymmetric critical points. J. Math. Phys. 41(7), 4732–4746 (2000) 55. Mestel, B.D., Osbaldestin, A.H., Winn, B.: Golden mean renormalization for the Harper equation: The strong coupling fixed point. J. Math. Phys. 41(12), 8304–8330 (2000) 56. Moser, J.: The analytic invariants of an area-preserving mapping near a hyperbolic fixed point. Commun. Pure Appl. Math. 9, 673–692 (1956) ¨ 57. Ostlund, S., Rand, D., Sethna, J., Siggia, E.: Universal properties of the transition from quasiperiodicity to chaos in dissipative systems. Phys. D 8(3), 303–342 (1983) 58. Rand, D.A.: Existence, nonexistence and universal breakdown of dissipative golden invariant tori. I. Golden critical circle maps. Nonlinearity 5(3), 639–662 (1992) 59. Rand, D.A.: Existence, nonexistence and universal breakdown of dissipative golden invariant tori. II. Convergence of renormalization for mappings of the annulus. Nonlinearity 5(3), 663–680 (1992) 60. Rand, D.A.: Existence, nonexistence and universal breakdown of dissipative golden invariant tori. III. Invariant circles for mappings of the annulus. Nonlinearity 5(3), 681–706 (1992) 61. Sinai, Ya.G., Khanin, K.M.: Renormalization group method in the theory of dynamical systems. Internat. J. Modern Phys. B 2(2), 147–165 (1988) 62. Sinai, Ya.G., Khanin, K.M.: Smoothness of conjugacies of diffeomorphisms of the circle with rotations. Usp. Mat. Nauk 44(1(265)), 57–82, 247 (1989) 63. Sinai,Ya.G., Khanin, K.M., Shchur, L.N.: A new approach to the construction of fixed points of the renormalization group in dynamical systems. Izv. Vyssh. Uchebn. Zaved. Radiofiz. 29(9), 1061–1066 (1986)
124
K. Khanin, D. Khmelev
64. Stark, J.: Smooth conjugacy and renormalisation for diffeomorphisms of the circle. Nonlinearity 1(4), 541–575 (1988) 65. Stirnemann, A.: Existence of the Siegel disc renormalization fixed point. Nonlinearity 7(3), 959–974 (1994) 66. Stirnemann, A.: A renormalization proof of Siegel’s theorem. Nonlinearity 7(3), 943–958 (1994) 67. Sullivan, D.: Bounds, quadratic differentials, and renormalization conjectures. In: American Mathematical Society Centennial Publications, Vol. II (Providence, RI, 1988). Providence, RI: Am. Math. Soc., 1992, pp. 417–466 ´ atek, G.: Rational rotation numbers for maps of the circle. Commun. Math. Phys. 119(1), 109– 68. Swi¸ 128 (1988) 69. Tresser, C., Coullet, P.: It´erations d’endomorphismes et groupe de renormalisation. C. R. Acad. Sci. Paris S´er. A-B 287(7), A577–A580 (1978) 70. Vul, E.B., Khanin, K.M.: On the unstable separatrix of Feigenbaum’s fixed point. Usp. Mat. Nauk 37(5(227)), 173–174 (1982) 71. Vul, E.B., Sinai,Ya.G., Khanin, K.M.: Feigenbaum universality and thermodynamic formalism. Usp. Mat. Nauk 39(3(237)), 3–37 (1984) 72. Yampolsky, M.: Complex bounds for renormalization of critical circle maps. Ergodic Theory Dynam. Systems 19(1), 227–257 (1999) 73. Yampolsky, M.: The attractor of renormalization and rigidity of towers of critical circle maps. Commun. Math. Phys. 218(3), 537–568 (2001) 74. Yoccoz, J.-C.: Conjugaison diff´erentiable des diff´eomorphismes du cercle dont le nombre de rotation ´ v´erifie une condition diophantienne. Ann. Sci. Ecole Norm. Sup. (4) 17(3), 333–359 (1984) Communicated by G. Gallavotti
Commun. Math. Phys. 235, 125–137 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0804-x
Communications in
Mathematical Physics
An Extension of the HarishChandra-Itzykson-Zuber Integral E. Br´ezin1 , S. Hikami2 Laboratoire de Physique Th´eorique, Ecole Normale Sup´erieure∗ , 24 rue Lhomond 75231, Paris Cedex 05, France. E-mail:
[email protected] 2 Department of Basic Sciences, University of Tokyo, Meguro-ku, Komaba, Tokyo 153, Japan. E-mail:
[email protected] 1
Received: 15 July 2002 / Accepted: 9 October 2002 Published online: 21 February 2003 – © Springer-Verlag 2003
Abstract: The HarishChandra-Itzykson-Zuber integral over the unitary group U (k) (β = 2) is present in numerous problems involving Hermitian random matrices. It is well known that the result is semi-classically exact. This simple result does not extend to other symmetry groups, such as the symplectic or orthogonal groups. In this article the analysis of this integral is extended first to the symplectic group Sp(k) (β=4). There the semi-classical approximation has to be corrected by a WKB expansion. It turns out that this expansion stops after a finite number of terms ; in other words the WKB approximation is corrected by a polynomial in the appropriate variables. The analysis is based upon new solutions to the heat kernel differential equation. We have also investigated arbitrary values of the parameter β, which characterizes the symmetry group. Closed formulae are derived for arbitrary β and k = 3, and also for large β and arbitrary k. 1. Introduction The HarishChandra and Itzykson-Zuber (HIZ) integral [1–3] for the unitary group Uk is defined by (1) I = dU exp itrU† XU, where X and are diagonal k × k Hermitian matrices. It is well known that this integral is given, up to some normalization of the measure, by I=
det 1≤i,j ≤k exp(ixi λj ) , (X)()
(2)
∗ Unit´e Mixte de Recherche 8549 du Centre National de la Recherche Scientifique et de l’Ecole ´ Normale Sup´erieure.
126
E. Br´ezin, S. Hikami
where (X) and () are the Vandermonde determinants of the eigenvalues of X and , ((X) = (xi − xj )). This HIZ formula may be easily derived by considering the i<j
Laplacian operator [4] L=−
∂2 . 2 ∂Xij
(3)
Its eigenfunctions are plane waves LeitrX = (tr2 )eitrX .
(4)
One can construct a unitary invariant eigenfunction of L, for the same eigenfunction tr2 , by the superposition † I = dU eitrUXU , (5) which is nothing but the HIZ integral. The integral I being unitary invariant, is a function of the k eigenvalues xi of X. The one dimensional quantum mechanics of the eigenvalues, for unitary invariant eigenstates, is then easily solved in terms of free fermions, leading to the unitary HIZ formula. The same considerations hold for the three ensembles β = 1, 2 and 4, corresponding to the orthogonal, unitary and symplectic ensembles, with −1 I = eitrgXg dg. (6) The Laplacian acting on invariant states may be expressed in terms of a differential operator on the eigenvalues xi [5]: k k 2 ∂ 1 ∂ I = −I (7) +β 2 x − x ∂x ∂x i j i i i=1 i=1,(i=j ) with the eigenvalue =
k
λ2i .
(8)
i=1
Note that the integral I is manifestly symmetric under interchange of the matrices and X, but the procedure is dissymmetric. The solutions will of course restore this property, which is not obvious if one considers Eq. (7) alone. The x-dependent eigenfunctions of this Schr¨odinger operator have a scalar product given by the measure (9) ϕ1 |ϕ2 = dx1 · · · dxk |(x1 · · · xk )|β ϕ1∗ (x1 · · · xk )ϕ2 (x1 · · · xk ). The measure becomes trivial if one multiplies the wave function by ||β/2 . Thus if, for some given ordering of the xi ’s, one changes I (x) to ψ(x1 · · · xk ) = |(x1 · · · xk )|β/2 I (x1 · · · xk ),
(10)
An Extension of the HarishChandra-Itzykson-Zuber Integral
127
one obtains the Hamiltonian, k 2 ∂ 1 β ψ = −ψ. −1 −β 2 (xi − xj )2 ∂xi2 i=1 i<j
(11)
The relation between matrix quantum mechanics and many-body problems with 1/r 2 pair potentials was already present in [8] and it has also be used for the study of Selberg integrals by Forrester [9]. This Schr¨odinger equation is a simple Calogero-Moser model. Although the solutions of this many-body problem are known, they are of little use for our purpose, since it is already quite involved to recover from the formal expression of the solution for arbitrary β, the simple unitary result of HIZ and it looks formidable to extend it to other values of β [10]. For β = 2, the solution is again given by plane waves in the xi and (taking into account the symmetry under permutations of I ), one obtains the HIZ formula. For β = 4, the problem is much less trivial, but it leads to explicit solutions for finite values of k. Indeed the problem turns out to possess simple solutions of the form of symmetrized sums of plane waves multiplied by polynomials in the variables 1/(xi − xj )(λi − λj ), providing therefore a complete explicit solution for the Sp(k) group integration [5]. For general k, the solution of (11) is of the form ψ0 = ei(λ1 x1 +···+λk xk ) χ ,
(12)
where χ satisfies
k k 2 ∂ ∂ 1 χ = 0 + 2i λi −y 2 2 ∂x (x − x ) ∂x i i j i i<j i=1 i=1
in which
y=β
β −1 . 2
(13)
(14)
For k=2, it is immediate to verify that there is a solution function of the single variable τ = (λ1 − λ2 )(x1 − x2 ), whose asymptotic expansion for large τ is χ = 1+
∞
(−i/τ )
n
n 1
n=1
√ = Iν (z) 2πze−z
y p−1− 2p
(15)
in which the Bessel function Iν (z) is defined with ν 2 = (y + 1/2)/4 and z = iτ/4. k ∂2 4 − annihilates the function −1 When β = 4, the operator 2 (xi − xj )2 ∂xi i=1
i<j
(x1 , · · · , xk ). Consequently the solution of (13) may be written χ (x1 · · · xk ) =
f (x1 · · · xk ) (x1 · · · xk )
(16)
128
E. Br´ezin, S. Hikami
in which f (x1 · · · xk ) is a polynomial of degree k(k − 1)/2 in the xa ’s. Defining τij = (λi − λj )(xi − xj )
(17)
one finds for k = 3 1 1 1 1 1 1 1 χ = 1 + 2i −4 − 12i + + + + . τ12 τ23 τ31 τ12 τ23 τ23 τ31 τ31 τ12 τ12 τ23 τ31 (18) Remaining still with β = 4, one may factor out the Vandermonde determinant from χ as (16), and one obtains a differential equation for f , k k k ∂2 ∂f ∂f β ∂ 1 − 1 + 2i f + 2 + iλ f λi = 0. (19) i 2 2 ∂x ∂x ∂x ∂xi i i i i=1 i=1 i=1 The solution of this equation is obtained by expanding in powers of the λi ’s, but the expansion ends with the highest degree, the Vandermonde product of degree six (λ1 · · · λ4 ). Using the notation of (17), τij = (xi − xj )(λi − λj ), we obtain i f = 1 − (τ12 + τ13 + τ14 + τ23 + τ24 + τ34 ) + O(τ 2 ) 4
(20)
as shown in the appendix of [5]. In the next section, we consider the symplectic case (β = 4) for larger values of k, and we will clarify the meaning of the coefficients which appear in the expansion of χ . 2. Decomposition into Complete Graphs in the Symplectic Case Let us discuss now the asymptotic expansion of χ for β = 4, i.e. for the symplectic group integration of the HIZ integral. Such integrals appear in various problems; for instance one encounters it in the calculation of the average of products of characteristic polyno
mials of random matrices in the Gaussian orthogonal ensemble < ki=1 det(λi − X) >. Indeed we have shown in a previous work [5, 6] that this average may be expressed as an integral over a k × k quaternion matrix; this representation exhibits a duality between the GOE average and a symplectic HIZ integral. Thus this requires to compute the HIZ integral for β = 4, as discussed in [5], in which the expressions were given up to k = 4. We extend here the construction of the polynomial solutions to arbitrary values of k. As discussed before, the expansion of the function χ in inverse powers of the vari 1 ables 1/τij , defined in (17), terminates with a term proportional to , of degree τij i<j
k(k − 1)/2. The monomials in the variables 1/τij may be represented graphically: one marks k points, and a line between the points i and j will represent a factor 1/τij in a given monomial. A complete graph is a graph in which all the pairs of points are connected by a line. For instance, a triangle (k=3) and a tetrahedron (k=4) are complete graphs. The k-point complete graph has k(k-1)/2 lines. A complete graph of k points receives a factor Ck in which Ck =
k l=1
l!.
(21)
An Extension of the HarishChandra-Itzykson-Zuber Integral
129
This number is related to the integral
Ck =
1 (2π)k/2
1
(zi − zj )2 e− 2
i<j
zi2
k
dzi .
(22)
i=1
The last term of highest degree in χ is equal to
i k Ck
k 1 . τij
(23)
i<j
All the terms of the expansion of χ have coefficients which may be decomposed into combinations of factors Cn with n ≤ k. For instance, the k=3 solution is χ =1+i
C2 C1 (C2 )2 (v1 + v2 + v3 ) + i 2 (v1 v2 + v2 v3 + v1 v3 ) + i 3 C3 v1 v2 v3 , C0 C1 (24)
1 1 1 , v1 = , and v2 = . The graphs of the various terms in χ may all τ12 τ23 τ31 be analyzed as a decomposition into complete graphs. For instance, v1 v2 consists of two lines connecting the point number three to points one and two. The lines (2-3) and (3-1) are complete graphs of two points, hence the factor C22 , with the point three counted twice, thus a division by (C1 ), equal to one. (One has to multiply further the result by a power of i raised to the number of lines.) When k = 4, by this rule of decomposition of the graphs into complete graphs, we find the coefficients of all the terms in the form where v3 =
χ = 1+i
6 (C2 )2 C 2 C1 vi + i 2 (v1 v2 + v2 v3 + v1 v3 + v1 v4 + v1 v5 + v2 v5 C0 C1 i=1
+ v 2 v 6 + v 3 v 6 + v 3 v4 + v 4 v5 + v 4 v 6 + v 5 v 6 ) + i 2 (C2 )2 (v1 v6 + v2 v4 + v3 v5 ) C3 C1 + i3 (v1 v2 v3 + v1 v4 v5 + v2 v5 v6 + v3 v4 v6 ) C0 (C2 )3 + i3 (v1 v3 v5 + v1 v3 v6 + v2 v3 v4 + v3 v4 v5 + v3 v5 v6 + v1 v5 v6 + v2 v4 v5 (C1 )2 + v 2 v 3 v 5 + v 1 v 2 v4 + v 1 v 4 v 6 + v 1 v 2 v6 + v 2 v 4 v6 ) (C2 )3 + i3 (v1 v2 v5 + v1 v3 v4 + v2 v3 v6 + v4 v5 v6 ) C1 (C2 )4 C0 + i4 (v1 v2 v4 v6 + v1 v3 v5 v6 + v2 v3 v4 v5 ) (C1 )4
130
E. Br´ezin, S. Hikami
C3 C2 (v2 v4 v5 v6 + v2 v3 v5 v6 + v1 v2 v5 v6 + v3 v4 v5 v6 + v1 v3 v4 v6 + v2 v3 v4 v6 C1 + v 1 v 2 v 3 v5 + v 1 v2 v 3 v 6 + v 1 v 2 v3 v 4 + v 1 v3 v4 v 5 + v 1 v 2 v 4 v 5 + v 1 v4 v 5 v 6 ) (C3 )2 + i5 (v2 v3 v4 v5 v6 + v1 v3 v4 v5 v6 + v1 v2 v4 v5 v6 + v1 v2 v3 v5 v6 C2 + v 1 v 2 v 3 v4 v 6 + v 1 v 2 v 3 v 4 v5 ) (25) + i 6 C 4 v 1 v 2 v 3 v 4 v 5 v6 , + i4
where we have used v1 = 1/τ12 , v2 = 1/τ23 , v3 = 1/τ13 , v4 = 1/τ14 , v5 = 1/τ24 , v6 = 1/τ34 . Again the graph of the last term v1 v2 v 3 v4 v5 v6 is a complete graph of 4-points. Substituting in this formula the values Cn = nj=1 j !, one recovers precisely the k = 4 result, given in an earlier publication [5, 7]. The decomposition of graphs into complete components has certain general structures for arbitrary values of k. The term of highest degree is i k(k−1)/2 Ck ; the next one, obtained by deleting one line, has a decomposition given by (Ck−1 )2 . Ck−2
(26)
For k=5, this decomposition means that the graph is made of two tetrahedra sharing a triangular face; hence the division by C3 to avoid double counting of this face. This explains the above factor (26). In the case k=4, this factor represents two triangles connected by (C3 )2 one edge, and it reads = 72. C2 When two lines are eliminated from the maximum complete graph Ck , we have two type of graphs. (i) The first case is the elimination of two lines emerging from the same point. In this case the decomposition is Ck−1 Ck−2 /Ck−3 . For k=4, it is a triangle and one line connected at the vertex. For k=5, it is a tetrahedron connected to a triangle through one edge. (ii) The second type consists of deleting two non consecutive lines. In this case, we have the decomposition of (Ck−2 )4 Ck−4 /(Ck−3 )4 . In the case k=4, it corresponds just to a square, four line connecting the four vertices of a square. For k=5, it is a graph of four triangles connected to each other by four edges. If we divide this factor (Ck−2 )4 Ck−4 /(Ck−3 )4 by Ck , as appeared in the expression 1 3 2 for f in (20), it becomes , , for k= 4, 5 and 6, respectively. These values have 18 80 75 been found by a direct evaluation of the solution of the differential equation (19). Thus we have, for β = 4 and for arbitrary values of k, an expression for χ whose coefficients are given by the decomposition of the complete graphs Cn . The case β = 4 is thus easy to analyze for two reasons. First there are no “multiple” lines in a graph, meaning that every
monomial in χ is of degree at most one in any variable 1/τij . Then the factor Cn = j ! for complete graphs has a simple structure. However, for β = 4, the number of lines between two points may be arbitrarily large. Then the contribution Cn of complete graphs has a more complicated expression. For values of β of the form β = 2(m + 1)
(27)
An Extension of the HarishChandra-Itzykson-Zuber Integral
131
a complete graph between two points consists of m lines joining these two points and y y y y −1 + −2 + · · · −(m − 1) + (28) C2 = 2 4 6 2m in which y has be defined in (14). The evaluation of C3 is non-trivial and we will discuss it in the following section. For β’s of the form (27) the previous graphical analysis of decomposition into complete subgraphs may be extended to arbitrary k. However, even then, more complex rules for non-complete graphs are required. To understand this complexity, we consider the k = 3 case for arbitrary β in the next section. 3. k=3 for Arbitrary β Even for the simple k = 3 case, the HIZ integral is not known explicitly in the form of a WKB expansion. An analysis in terms of the solutions of the Calogero three-body problem has been performed in [10]. However, its outcome is far from our purpose since it is already quite involved to recover from there the unitary HIZ formula, and one does not see that for values of β such as four, the WKB expansion is a polynomial. Therefore we start again with the partial derivative equation (13) for k = 3, and expand the solution as Cn,m,r χ= i n+m+r n m r . (29) τ12 τ23 τ31 n,m,r=0
Note that it is not at all obvious that Eq. (13) admits such a solution. Indeed, assuming that the solution is only a function of the τ ’s, one finds easily that 3 ∂ y ∂ 2 2 2i χ = −(λ1 − λ2 ) τ12 2i λi − + y χ + perm. ∂xi (xi − xj )2 ∂τ12 i=1
i<j
(30) However the term involving the second derivative does not share this structure since 3 2 ∂ 2χ 2 4 ∂ χ 3 ∂ = 2(λ1 − λ2 ) τ12 2 + τ12 ∂xa2 ∂τ12 ∂τ12 1
2 2 −2(λ1 − λ2 )(λ3 − λ1 )τ12 τ31
∂ 2χ + perm. ∂τ12 ∂τ31
(31)
The last term, with cross-products in the λa ’s, is a priori different. However there are, order by order, a set of identities which allow one to cast those cross-products into sums of squares. Those identities follow from (x1 − x2 ) + (x2 − x3 ) + (x3 − x1 ) = 0, leading to (λ1 − λ2 )τ23 τ31 + perm. = 0.
(32)
From there one derives order by order the identities which are needed to cast the crossproducts into squares. Let us quote the simplest, 2 (λ1 − λ2 )(λ3 − λ1 )τ23 + perm = (λ1 − λ2 )2 τ31 τ23 + perm.
(33)
132
E. Br´ezin, S. Hikami
Order by order those identities allow us to satisfy the conditions on the expansion of χ , which make it a function of the τ variables. One finds easily that Cn00 , which is associated to the diagram consisting of n lines between two points, is given by: Cn00 =
n
−(l − 1) +
l=1
y . 2l
(34)
The generating function of those coefficients is the modified Bessel function, solution of the k=2 problem. Note that Cn,m,r is symmetric under permutations of n,m and r ; for instance Cm,0,0 = C0,m,0 . When r=0, one finds that Cn,m,0 has a decomposition into a product of two factors, Cn,m,0 = Cn,0,0 Cm,0,0 .
(35)
For the full HIZ integral, triangle graphs, in which the three integers (n, m, r) of Cn,m,r are non-zero, play an essential role. From the differential equation (13), one can derive a recursion equation for the Cn,m,r . However a direct expansion of the solution is extremely tedious and complicated at increasing orders, since one has to use more and more identities of the type (33), but of higher degree. One needs triangle identities. The expansion for the differential equation (19) has a simpler structure; it gives a relation between Cn,m,r , with n + m + r = P + 1, in terms of Cn,m,r with n + m + r = P ; this relation is valid for a fixed given value of m. If one writes Cn,m,r as a linear combination of Cn ,m,r , where r < r, and n + r = n + r − 1, one obtains y , (36) Cn,m,1 = Cn,m,0 nm + 2 Cn,m,2 = Cn,m,1 Cn,m,3 = Cn,m,2
m
m(n − 1) y −1+ 2 4
(n − 2) − 2 +
+ Cn+1,m,0
m (n + 1), 2
(37)
mn m y + Cn+1,m,1 + Cn+2,m,0 (n + 2). (38) 6 3 3
3 For general r (r > 3), we have m(n − r + 1) y − (r − 1) + Cn,m,r = Cn,m,r−1 r 2r m m + Cn+1,m,r−2 (n − r + 3) + Cn+2,m,r−3 (n − r + 5) r r +··· m m + Cn+r−2,m,1 (n + r − 3) + Cn+r−1,m,0 (n + r − 1). (39) r r The solution of these recursion equations guarantees automatically the symmetry under exchange between n, m and r: Cn,m,r = Cn,r,m = Cm,n,r = Cm,r,n = Cr,n,m = Cr,m,n . From these recursion formulae, one may determine iteratively all the coefficients Cn,m,r . The resulting expressions are complicated for general y. For instance, even at low order, one finds y 2 3 y 2 y2 −6 + y + −1 + , (40) C2,2,2 = 2 4 2 8
An Extension of the HarishChandra-Itzykson-Zuber Integral
C3,3,3 =
y 2 2
133
y 2 19 y 2 1 ) −2 + −48 − 8y + y 2 + y 3 . 4 6 24 48
(−1 +
(41)
The solution for β = 4 and k = 3 may be written in the compact form χ = eφ , φ = 2i
1 1 1 + + τ12 τ23 τ13
(42) − 4i
1 , τ12 τ23 τ13
(43)
in which it is understood that one expands it in powers of 1/τ , dropping any term involving powers of 1/τij greater than one. This corresponds to the absence of multiple bonds between two points in the graphical representation for β = 4. This result agrees with (18). This certainly suggests that φ ought to be written in terms of Grassmann variables. In the next section, we consider such expressions for general y in an expansion in 1/y. This also provides consistency checks for the recursion equation (39). 4. Large y Expansion One can obtain an expansion in powers of 1/y if one returns to the differential equation (13), taking now as variables ua = xa /y.
(44)
The function χ (u1 · · · uk ) is a solution of k k 1 ∂2 ∂ 1 χ = 0. + 2i λa − y ∂u2a ∂ua (ua − ub )2 a=1
a=1
The leading term χ0 is thus a solution of k ∂ 1 χ0 = 0. 2i λa − ∂ua (ua − ub )2 a=1
(45)
a
(46)
a
We look for a solution which is a symmetric function of the variables va,b =
i , a
(47)
which thus satisfies k
λa
a=1
∂χ0 =i ∂ua
∂χ0 . ∂vab
(48)
∂ 2 2 (λa − λb )2 vab − 1 χ0 = 0. ∂vab
(49)
1≤a
2 (λa − λb )2 vab
Equation (46) takes then the simple form 1≤a
134
E. Br´ezin, S. Hikami
There is a trivial symmetric solution which satisfies the k(k − 1)/2 equations ∂ 2 − 1 χ0 = 0, ∂vab namely
1 χ0 = exp vab 2
a
(50)
iy 1 = exp . 2 (λa − λb )(xa − xb )
(51)
a
One thus writes χ = eφ , φ = φ0 +
1 1 φ1 + 2 φ 2 + · · · , y y
(52)
with φ0 =
1 vab . 2
(53)
a
The next terms are solutions of k k ∂φ0 2 ∂ 2 φ0 ∂φ1 + 2i + λa = 0, 2 ∂ua ∂ua ∂ua 1
(54)
1
k k ∂φ2 ∂φ0 ∂φ1 ∂ 2 φ1 2 + 2i + λa = 0. ∂ua ∂ua ∂ua 2 ∂ua 1
(55)
1
In order to make it clear that it is a non trivial property of these equations to have solutions which are symmetric polynomials in the variables vab , we shall examine these equations in some detail. For simplicity of notations we shall limit ourselves to k=3, the generalization to arbitrary k being immediate. It is thus convenient to simplify slightly the notations and write v12 = V3 , v23 = V1 , v31 = V2 .
(56)
Let us first consider (54); since the function φ1 is a symmetric function of the Vi ’s one finds easily, as in (48), that 3 ∂φ1 2 2 ∂ λa = i (λ2 − λ3 ) V1 + perm. φ1 . (57) ∂ua ∂V1 1
The explicit solution (53) allows one to write 3 1 ∂φ0 2 1 = − (λ2 −λ3 )2 V14 + perm. + (λ1 − λ2 )(λ3 − λ1 )V22 V32 + perm. ∂ua 2 2 1
3 ∂ 2 φ0 1
∂u2a
= −2[(λ2 − λ3 )2 V13 + perm.].
(58)
An Extension of the HarishChandra-Itzykson-Zuber Integral
135
The second term of the first line of (57) does not involve a priori (λa − λb )2 ; however the variables Vi are not independent since they are defined in terms of the differences (t1 − t2 ), (t2 − t3 ), (t3 − t1 ) whose sum is zero. Thus V2 V3 (λ1 − λ2 )(λ1 − λ3 ) + perm. = 0,
(59)
and one can check without difficulty that V22 V32 (λ1 − λ2 )(λ1 − λ3 ) + perm. = −V1 V2 V3 [(λ2 − λ3 )2 V1 + perm.]. With the help of this identity the equation for φ1 takes the simple form 1 2 1 2 2 ∂φ1 (λ2 − λ3 ) V1 + V1 + V1 − V2 V3 + perm. = 0. ∂V1 4 4
(60)
(61)
Thanks to the identity (60) one can look for a solution of the three equations ∂φ1 1 1 + V12 + V1 − V2 V3 = 0, ∂V1 4 4
(62)
plus the other two deduced from the cyclic permutations. The system is trivially integrable and one finds φ1 =
1 1 1 V1 V2 V3 − (V12 + V22 + V32 ) − (V13 + V23 + V33 ). 4 2 12
(63)
For arbitrary k this generalizes to φ1 =
1 1 2 1 3 vab vbc vca − vab − vab , 4 2 12 (abc)
a
(64)
a
in which the sum (abc) runs over all distinct triplets of indices. We have exposed this simple calculation in some detail, in order to make it clear that the success of the method relies on two successive facts: • Identities such as (59, 60) which allow one to write the differential equation as three separate equations for ∂φ/∂Va . • The integrability conditions of this system of three equations are satisfied. At higher orders the technique is identical. It requires new identities of higher degree, such as V22 V32 (V22 + V32 )(λ1 − λ2 )(λ1 − λ3 ) + perm. 1 = −(λ2 − λ3 )2 V1 V2 V3 V1 (V12 + V22 + V32 ) + V1 V2 V3 + perm. 2
(65)
and many others; thereby one obtains again remarkably an integrable system of three equations. At order 1/y 2 the solution is 1 1 φ2 = (V13 + V23 + V33 ) + (V14 + V24 + V34 ) − V1 V2 V3 (V1 + V2 + V3 ) 2 2 1 1 5 5 5 2 (66) + (V1 + V2 + V3 ) − V1 V2 V3 (V1 + V22 + V32 ). 8 20
136
E. Br´ezin, S. Hikami
5. A Cubic Identity Let us consider the case of k = 4 for different values of β, assuming that it is an even integer. For the tetrahedron (k = 4), the coefficients of the expansion of χ may be written as χ=
j n,m,r,p,q,l=0
i n+m+r+p+q+l
Cnmr;pql p
q
n τmτr τ τ τl τ12 13 23 24 34 14
,
(67)
where j = β/2 − 1. By imposing to satisfy the differential equation (19), one should determine those coefficients. For instance, in the case β = 6 (or y = 12), one finds C222;222 = 8 · 7(6 · 5)2 (4 · 3)3 , C222;221 = (2 · 7)(6 · 5)2 (4 · 3)3 , C222;211 = 4(6 · 5)2 (4 · 3)3 , C222;220 = (C222;000 )2 /C200;000 = (6 · 5)2 (4 · 3)3 , C222;000 = (6 · 5)(4 · 3)2 ,
C221;221 = 6(6 · 5)3 25 , Cnm2;pq1 = Cnm2;pq0 np + mq + y2 , Cnmr;pq0 = Cnmr;000
Cpqr;000 /Cr00;000 , C222;pq2 = C222;pq1 p2 + q2 − 1 + y4 . (The coefficient Cnmr;000 is identical to the Cnmr of the k=3 case (29). Whenever the graphs reduce to triangles, one recovers the coefficients of the k=3 problem.) However the above representation (67) is in fact somewhat ambiguous. Indeed there is an interesting cubic identity between the six variables τab = (xa − xb )(λa − λb ), 1 ≤ a < b ≤ 4, which holds for every set of xa ’s and λa ’s, namely 2 2 2 2 2 2 τ34 + τ13 τ24 + τ14 τ23 + τ23 τ14 + τ24 τ13 + τ34 τ12 τ12 −τ12 τ34 (τ23 + τ24 + τ14 + τ13 ) − τ13 τ24 (τ14 + τ12 + τ23 + τ34 ) −τ14 τ23 (τ12 + τ24 + τ34 + τ13 ) +τ12 τ24 τ14 + τ13 τ14 τ34 + τ23 τ24 τ34 + τ12 τ23 τ13 = 0.
(68)
Consequently, when one divides the l.h.s. of this identity by the Vandermonde squared (x)2 (λ)2 , one generates an identity for the terms of the expansion of χ whose coefficients are C221;220 , C221;211 , C122;121 . Thus there is an arbitrariness in writing χ in the form of the expansion (67), in view of this cubic identity. For arbitrary k, there are (k − 2)(k − 3)/2 similar cubic identities, and therefore only (2k − 3) independent τab . For other values of β such as β = 8 for instance, by imposing the differential equation (19), one finds that the values of C332;331 , C332;322 , C323;322 are not uniquely determined, again because of this same cubic identity. One may need to eliminate this arbitrariness by choosing some arbitrary value for some coefficients, decide that some are zero for instance. This cubic identity persists at higher orders. Indeed, multiplying this same cubic identity by some symmetric function of the τ ’s, such as τij , one obtains an identity of order τ 4 , which again leaves some arbitrariness in expressing the expansion of χ at next order. In summary, we have obtained expressions for the HarishChandra-Itzykson-Zuber integration for k × k matrices, for group integrals characterized by the parameter β. Our investigation relies upon new solutions to the heat kernel differential equation. Other methods such as supersymmetric quantum mechanics, combinatorial group analysis, or algebraic geometry may be of help in solving the numerous open questions concerning this problem.
An Extension of the HarishChandra-Itzykson-Zuber Integral
137
Acknowledgements. It is a pleasure to thank Dr. Alfaro for interesting remarks concerning the relevance of supersymmetric quantum mechanics in this problem. S.H. has benefited from a Grant-in-Aid for Scientific Research (B) by JSPS.
References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.
HarishChandra: Proc. Nat. Acad. Sci. 42, 252 (1956) Itzykson, C., Zuber, J.-B.: J. Math. Phys. 21, 411 (1980) Duistermaat, J.J., Heckman, G.H.: Invent. Math. 69, 259 (1982) Br´ezin, E.: In: Two Dimensional Quantum Gravity and Random Surfaces. Gross, D.J., Piran, T., Weinberg, S. (ed.), Singapore: World Scientific, 1992, p. 37 Br´ezin, E., Hikami, S.: Commun. Math. Phys. 223, 363 (2001) Br´ezin, E., Hikami, S.: Commun. Math. Phys. 214, 111 (2000) Guhr, T., Kohler, H.: J. Math. Phys. 43, 2707 (2002) Br´ezin, E., Itzykson, C., Parisi, G., Zuber, J.-B.: Commun. Math. Phys. 59, 35 (1978) Forrester, P.J.: Nucl. Phys. B388, 671 (1992) Mahoux, G., Mehta, M.L., Normand, J.-M.: Random Matrices and Their Applications. MSRI Publications 40, 301 (2001)
Communicated by L. Takhtajan
Commun. Math. Phys. 235, 139–167 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0780-6
Communications in
Mathematical Physics
Non-Semi-Regular Quantum Groups Coming from Number Theory Saad Baaj1 , Georges Skandalis2 , Stefaan Vaes2 1
Laboratoire de Math´ematiques Pures, Universit´e Blaise Pascal, Bˆatiment de Math´ematiques, 63177 Aubi`ere Cedex, France. E-mail:
[email protected] 2 Institut de Math´ematiques de Jussieu, Alg`ebres d’Op´erateurs et Repr´esentations, 175, rue du Chevaleret, 75013 Paris, France. E-mail:
[email protected];
[email protected] Received: 10 October 2002 / Accepted: 10 October 2002 Published online: 24 January 2003 – © Springer-Verlag 2003
Abstract: In this paper, we study C∗ -algebraic quantum groups obtained through the bicrossed product construction. Examples using groups of adeles are given and they provide the first examples of locally compact quantum groups which are not semi-regular: the crossed product of the quantum group acting on itself by translations does not contain any compact operator. We describe all corepresentations of these quantum groups and the associated universal C∗ -algebras. On the way, we provide several remarks on C∗ -algebraic properties of quantum groups and their actions. 1. Introduction What is the quantum analogue of a locally compact group? Several authors addressed this question and came out with various answers. G.I. Kac, Kac-Vainerman and EnockSchwartz gave a quite satisfactory set of axioms in [7, 9, 11] defining what EnockSchwartz called Kac algebras. In the 1980’s, new examples, which were certainly to be considered as quantum groups, but were not Kac algebras, were constructed, see e.g. [22]. Many efforts were then made to enlarge the definition of Kac algebras. Already in the Kac algebra setting, it was known that all the information on the quantum group could be encoded in one object: this was called a multiplicative unitary in [2]. It was then quite natural to try and make constructions with a multiplicative unitary as a starting point. In order to be able to construct C∗ -algebras out of a multiplicative unitary, some additional assumptions were made: this multiplicative unitary was assumed to be regular in [2], semi-regular in [1], manageable in [21], . . . . In fact, a multiplicative unitary can really be quite singular: in Remark 4.5 below, we give an example of a multiplicative unitary which should certainly not be considered as the quantum analogue of a locally compact group. In [14, 15], a definition of a locally compact quantum group along the lines of Kac algebras, but with a weaker set of axioms, was given and was shown to lead to a
140
S. Baaj, G. Skandalis, S. Vaes
manageable multiplicative unitary. We believe that the definitions of [14, 15] can be considered as the final ones, at least from the measure theoretic (von Neumann) point of view. On the other hand, regularity and semi-regularity are natural conditions and present some additional features – that will be discussed below. In particular, regularity is very much connected with the Takesaki-Takai duality. All this leaves us with many questions: what are the relations between the properties of semi-regularity and manageability? We actually really tried hard to prove that these properties are equivalent. All previously known examples were locally compact quantum groups whose multiplicative unitary was semi-regular. The main result of this paper is the construction of a locally compact quantum group whose multiplicative unitary is not semi-regular. This is done using a construction which goes back to G.I. Kac [10] and was used by several authors [2, 3, 16, 18, 19, 20, ...]. Let G be a locally compact group and let G1 and G2 be closed subgroups such that the map θ : (x1 , x2 ) → x1 x2 from G1 × G2 into G is a measure class isomorphism. Associated to this situation, there is a locally compact quantum group. We will show that its associated multiplicative unitary is semi-regular if and only if θ is a homeomorphism from G1 × G2 onto an open subset of G. Moreover, we will be able to identify all the associated operator algebras. In fact, in the examples considered up to now, G was a real Lie group and one could easily see that the associated multiplicative unitary is always semi-regular (by differentiability considerations). The examples that we consider here are just adelic analogues of these Lie groups. In particular, we will consider the case where G is an adelic ax + b group and G1 , G2 are natural subgroups. Note that in the non-semi-regular case, the image of θ is a countable union of compact sets with empty interior and the complement of the image of θ has measure 0. Although this is a relatively common phenomenon in topology, it was quite a surprise to us to encounter it in the locally compact group case. It was taken for granted in [3] that such a phenomenon could not occur. Because of this, the main result in [3] is incorrect as stated. One has either to add the extra assumption that the associated multiplicative unitary is semi-regular, or to change the conclusion: the product map (g1 , g2 ) → g1 g2 is not a homeomorphism from G1 × G2 onto a dense open subset of G but a measure class isomorphism. This is discussed in Subsect. 5.1. We conclude the introduction by explaining the structure of the paper. In Sect. 2, we recall the definition and main properties of multiplicative unitaries and locally compact quantum groups. In Sect. 3, we introduce the most general definition of a matched pair of locally compact groups. We describe the associated locally compact quantum groups obtained through the bicrossed product construction. We compute the different associated C∗ -algebras, as well as the corepresentations and the covariant representations. We give necessary and sufficient conditions for (semi-)regularity. In Sect. 5, we give several examples, providing the first examples of locally compact quantum groups that are not semi-regular. Finally, in Sect. 5, we give different characterizations of (semi-)regularity and we consider some C∗ -algebraic properties of coactions. 2. Preliminaries In this paper, all locally compact groups are supposed to be second countable. When X is a subset of a Banach space, we denote by [X] the norm closed linear span of X. We denote by K(H ) and L(H ) the compact, resp. the bounded operators on a
Non-Semi-Regular Quantum Groups
141
Hilbert space H . We use to denote the flip map from H ⊗ K to K ⊗ H , when H and K are Hilbert spaces. By ⊗, we denote several types of tensor products: minimal tensor products of C∗ -algebras, von Neumann algebraic tensor products or Hilbert space tensor products. There should be no confusion. 2.1. Multiplicative unitaries and locally compact quantum groups. Multiplicative unitaries were studied in [2]. We have the following definition. Definition 2.1. A unitary W ∈ L(H ⊗ H ) is called a multiplicative unitary if W satisfies the pentagonal equation W12 W13 W23 = W23 W12 . We associate to any multiplicative unitary two natural algebras S and Sˆ defined by S = [(ω ⊗ ι)(W ) | ω ∈ L(H )∗ ] ,
Sˆ = [(ι ⊗ ω)(W ) | ω ∈ L(H )∗ ] .
(2.1)
In general, the norm closed algebras S and Sˆ need not be C∗ -algebras, but they are when the multiplicative unitary is regular or semi-regular, see Definition 2.5 and the papers [1, 2], or when it is manageable, see [21]. In these cases, we can define comultiplications δ and δˆ on the C∗ -algebras S and Sˆ by the formulas δ : S → M(S⊗S) : δ(x) = W (x⊗1)W ∗ ,
ˆ S) ˆ : δ(y) ˆ δˆ : Sˆ → M(S⊗ = W ∗ (1⊗y)W .
Multiplicative unitaries appear most naturally as the right (or left) regular representation of a locally compact (l.c.) quantum group. The theory of locally compact quantum groups is developed in [14, 15]. Definition 2.2. A pair (M, δ) is called a (von Neumann algebraic) l.c. quantum group when • M is a von Neumann algebra and δ : M → M ⊗ M is a normal and unital *-homomorphism satisfying the coassociativity relation: (δ ⊗ ι)δ = (ι ⊗ δ)δ. • There exist normal semi-finite faithful weights ϕ and ψ on M such that – ϕ is left invariant in the sense that ϕ (ω ⊗ ι)δ(x) = ϕ(x)ω(1) for all x ∈ M + with ϕ(x) < ∞ and all ω ∈ M∗+ , – ψ is right invariant in the sense that ψ (ι⊗ω)δ(x) = ψ(x)ω(1) for all x ∈ M + with ψ(x) < ∞ and all ω ∈ M∗+ . There is an equivalent C∗ -algebraic approach to l.c. quantum groups and the link is provided by the right (or left) regular representation. So, suppose that (M, δ) is a l.c. quantum group with right invariant weight ψ. Represent M on the GNS-space H of ψ and consider the subspace Nψ ⊂ M of square integrable elements: Nψ = {x ∈ M | ψ(x ∗ x) < ∞} . Denote by : Nψ → H the GNS-map. Then, we define a unitary V ∈ L(H ⊗ H ) by the formula V (x) ⊗ (y) = ( ⊗ ) δ(x)(1 ⊗ y) for all x, y ∈ Nψ .
142
S. Baaj, G. Skandalis, S. Vaes
The unitary V is a multiplicative unitary and it is called the right regular representation of (M, δ). The comultiplication δ is implemented by V as above: δ(x) = V (x ⊗ 1)V ∗ . Although V need not be regular or semi-regular in general (see the discussion below), it is always manageable (see [14]) and we have C∗ -algebras S and Sˆ defined by Eq. (2.1) and we have the comultiplications δ and δˆ on the C∗ -algebraic level. To compare notations between this paper and the papers [14, 15], we observe that in [14, 15], the left regular representation is the main object. The (S, δ) agrees with the ˆ ˆ Jˆy Jˆ)(Jˆ ⊗ Jˆ). (A, ) of [14], but Sˆ agrees with JˆAˆ Jˆ and δ(y) agrees with (Jˆ ⊗ Jˆ) ( 2.2. Representations and corepresentations. Next, we recall the notion of a representation and a corepresentation of a multiplicative unitary, see Definition A.1 in [2]. Definition 2.3. Suppose that W is a multiplicative unitary on the Hilbert space H . We call x ∈ L(K ⊗ H ) a representation of W on the Hilbert space K if x12 x13 W23 = W23 x12 . We call y ∈ L(H ⊗ K) a corepresentation on the Hilbert space K if W12 y13 y23 = y23 W12 . We remark that, if W is the (left or right) regular representation of a l.c. quantum group, then a representation x satisfies x ∈ M(K(K) ⊗ S) and a corepresentation y satisfies y ∈ M(Sˆ ⊗ K(K)). The defining relations become (ι ⊗ δ)(x) = x12 x13 and (δˆ ⊗ ι)(y) = y13 y23 . It is clear that, conversely, if x and y satisfy these relations, they give a representation, resp. a corepresentation of W . In the same way as C ∗ (G), we can define the universal C∗ -algebras Su and Sˆu for any l.c. quantum group (see [13]). There exists a universal corepresentation Wu ∈ M(Sˆ ⊗Su ) such that Su = [(ω ⊗ ι)(Wu ) | ω ∈ L(H )∗ ] . There is a bijective correspondence between representations π : Su → L(K) of the C∗ -algebra Su and corepresentations y of W on K given by y = (ι ⊗ π )(Wu ). All this is developed in [13]. When we want to stress the distinction between the reduced and the universal C∗ algebras, we denote S by Sr . The following definition is taken from p. 482 in [2]. Definition 2.4. Let W be a multiplicative unitary. A pair (x, y) of a representation x and a corepresentation y of W on the same Hilbert space K is called covariant if y12 W13 x23 = x23 y12 . We remark that Proposition A.10 of [2] remains valid in the setting of l.c. quantum groups. Hence, if W is the (left or right) regular representation of a l.c. quantum group and (x, y) is a covariant pair, then x is stably isomorphic to W as well as y, but not jointly, and that is a very crucial point. 2.3. Regularity and semi-regularity. Let V be a multiplicative unitary. Then, we have a naturally associated algebra (see [2]), defined by C(V ) = {(ι ⊗ ω)(V ) | ω ∈ L(H )∗ } . We have [C(V )C(V )] = [C(V )], but [C(V )] is in general not a C∗ -algebra. Next, we recall the notions of regularity [2] and semi-regularity [1] of a multiplicative unitary.
Non-Semi-Regular Quantum Groups
143
Definition 2.5. A multiplicative unitary V is called regular if the closure of C(V ) is K(H ) and semi-regular if this closure contains K(H ). When V is the regular representation of a l.c. quantum group, we have the following characterization of regularity and semi-regularity. Proposition 2.6. Let V be the right regular representation of a l.c. quantum group on ˆ the Hilbert space H . Then, [C(V )] ∼ = S r S. So, V is regular if and only if S r Sˆ = K(H ) and V is semi-regular if and only if S r Sˆ contains K(H ). Remark that we consider δ as a right coaction of (S, δ) on the C∗ -algebra S. Then, ˆ Using the the reduced crossed product S r Sˆ is, by definition, given by [δ(S)(1 ⊗ S)]. ∗ ˆ ⊂ L(H ). left regular representation, this last C -algebra is isomorphic to [S S] Proof. Suppose that V is the right regular representation of (M, δ). From [14, 15], we know that we have two modular conjugations J and Jˆ at our disposal: J is the modular conjugation of the left invariant weight ϕ and, up to a scalar, also the modular conjugation of the right invariant weight ψ, while Jˆ is the modular conjugation of the left invariant weight on the dual. We put U = J Jˆ. Then, U is a unitary and U 2 is scalar. From [14, 15], we know enough formulas to apply Proposition 6.9 of [2] and to conclude that ((1 ⊗ U )V )3 is scalar. Hence, up to a scalar, we have V = (U ⊗ 1)V ∗ (1 ⊗ U )V ∗ (U ⊗ 1) .
(2.2)
ˆ = ˆ Using the fact that (J ⊗ Jˆ)V (J ⊗ Jˆ) = V ∗ and the fact that [(y ⊗ 1)δ(x) | x, y ∈ S] ˆ we get Sˆ ⊗ S, ˆ ω ∈ L(H )∗ ] [C(V )] = [(ι ⊗ ω) (1 ⊗ J yJ )V (1 ⊗ Jˆx Jˆ) | x, y ∈ S, ∗ ˆ ω ∈ L(H )∗ ] ˆ = [(ι ⊗ ω) ((J ⊗ Jˆ)(y ⊗ 1)δ(x)V (J ⊗ Jˆ) | x, y ∈ S, = [JˆSˆ JˆC(V )] . ˆ = S, ˆ we get that [U C(V )U ] = [SU ˆ C(V )U ]. We combine this last Observing that J SJ equality with Eq. (2.2) and the fact that V ∈ M(Sˆ ⊗ K(H )) to conclude that ˆ ω ∈ L(H )∗ , k ∈ K(H )] [U C(V )U ] = [(ι ⊗ ω) (x ⊗ k)V ∗ (1 ⊗ U )V ∗ | x ∈ S, ˆ . = [SS]
Corollary 2.7. Let V be the (left or right) regular representation of a l.c. quantum group on the Hilbert space H . Then, [C(V )] is always a C∗ -algebra. Remark 2.8. The isomorphism [C(V )] ∼ = S r Sˆ is typical for the regular representation. Suppose that, on the Hilbert space K, (x, y) is a covariant representation of the right regular representation V of a l.c. quantum group (in the sense of Definition 2.4 above). Because both x and y are necessarily amplifications of the regular representation (individually, but not jointly, see the remark after Definition 2.4), this means that we have
144
S. Baaj, G. Skandalis, S. Vaes
representations π of S and πˆ of Sˆ on K such that x = (πˆ ⊗ ι)(V ), y = (ι ⊗ π )(V ) and which are covariant for the action δ of (S, δ) on S in the sense that (π ⊗ ι)δ(a) = x(π(a) ⊗ 1)x ∗
for all a ∈ S .
ˆ whose image is So, we have a representation of the full crossed product S f S, ˆ [π(S)πˆ (S)]. On the other hand, we can write V = (πˆ ⊗π )(V ), which is a multiplicative unitary on K. Then, the image of S f Sˆ is equal to [SV SˆV ]. We claim that [C(V)] is Morita equivaˆ More precisely, [C(V)] is a C∗ -algebra and [C(V)] ⊗ K(K ⊗ H ) lent to [C(V )] ∼ = S r S. is spatially isomorphic to [C(V )] ⊗ K(K ⊗ K). Hence, up to Morita equivalence, [C(V)] is always the same, while [SV SˆV ] changes when V runs through the covariant images of V . Here, we can also observe that, if V is semi-regular or regular, then the same holds for any covariant image V of V . From the proofs of Proposition 3.19 and Corollaire 3.20 of [1], we know that the multiplicative unitaries V14 V24 and V36 on K ⊗ K ⊗ H are unitarily equivalent. It follows that [C(V )] ⊗ K(K ⊗ K) ∼ = [C(V13 V23 )] ⊗ K(H ). An easy calculation shows that B := [C(V13 V23 )] = [C(V) ⊗ K(K)]V. From the isomorphism above, we know that B is a C∗ -algebra. Because [(1 ⊗ K(K))B] = B and B is a C∗ -algebra, we have B = [B(1 ⊗ K(K))] = [C(V)C(V)] ⊗ K(K) = [C(V)] ⊗ K(K). Our claim is proven. We make a more detailed study of regularity and semi-regularity of l.c. quantum groups in Sect. 5.
3. Bicrossed Products of Locally Compact Groups Definition 3.1. We call G1 , G2 a matched pair of l.c. groups, if there is a given l.c. group G such that G1 and G2 are closed subgroups of G satisfying G1 ∩ G2 = {e} and such that G1 G2 has complement of measure 0 in G. Suppose, throughout this section, that we have fixed a matched pair G1 , G2 of closed subgroups of G. We always use g, h, k to denote elements in G1 , s, t, r to denote elements in G2 and accordingly, dg, ds, . . . to denote integration on G1 , G2 with respect to a fixed right Haar measure. We use x, y to denote elements in G and dx for the corresponding integration. The right Haar measure on G is fixed such that the following proposition holds. We denote by 1 , 2 and the modular functions on G1 , G2 and G. Proposition 3.2. The map θ : G1 × G2 → G : θ(g, s) = gs preserves sets of measure zero and satisfies −1 F (sg) 2 (s) (s)−1 dg ds , F (x) dx = F (gs) 1 (g) (g) dg ds = for all positive Borel functions F on G.
Non-Semi-Regular Quantum Groups
145
Proof. The proof of Lemma 4.10 in [19] can essentially be copied: if K ∈ Cc (G1 ×G2 ), we define the bounded, compactly supported, Borel function K˜ on G by the formula ˜ ˜ K(gs) = K(g, s) 1 (g)−1 (g) and K(x) = 0 for x ∈ G \ G1 G2 . Then, we define ˜ I (K) = K(x) dx. The same calculation as in [19] shows that I is a right invariant integral. Because we supposed that G is second countable, G1 × G2 is σ -compact and so, the integral I must be non-zero.
From the point of measure theory, we can not distinguish G1 × G2 and G. We define almost everywhere on G the Borel functions p1 : G → G1 and p2 : G → G2 such that x = p1 (x) p2 (x) . Then, we can define τ : L∞ (G1 ) ⊗ L∞ (G2 ) → L∞ (G2 ) ⊗ L∞ (G1 ) : τ (F )(s, g) = F (p1 (sg), p2 (sg)) . The ∗ -automorphism τ will be a (von Neumann algebraic version of an) inversion of (L∞ (G1 ), δ1 ) and (L∞ (G2 ), δ2 ) in the sense of [2], Def. 8.1. Denoting by σ the flip op map, it means that τ σ is a matching of (L∞ (G2 ), δ2 ) and (L∞ (G1 ), δ1 ) with trivial cocycles, in the sense of [19], Def. 2.1: (τ ⊗ ι)(ι ⊗ τ )(δ1 ⊗ ι) = (ι ⊗ δ1 )τ
and (ι ⊗ τ )(τ ⊗ ι)(ι ⊗ δ2 ) = (δ2 ⊗ ι)τ . (3.1)
Referring to Sect. 2.2 and in particular, Theorem 2.13 of [19], we can construct a l.c. quantum group using τ . Because we have chosen to follow systematically the conventions of [2] and [3], we will state explicitly the needed formulas. Define α : L∞ (G1 ) → L∞ (G2 )⊗L∞ (G1 ) and β : L∞ (G2 ) → L∞ (G2 )⊗L∞ (G1 ) by α(F ) = τ (F ⊗ 1) and β(F ) = τ (1 ⊗ F ) . Then, α defines a left action of G2 on L∞ (G1 ) while β defines a right action of G1 on op L∞ (G2 ). In fact, if we consider τ σ as a matching of (L∞ (G2 ), δ2 ) and (L∞ (G1 ), δ1 ) in the sense of [19], these two actions α and β precisely agree with the actions α and β appearing in Definition 2.1 of [19]. We also have the obvious formulas α(F )(s, g) = F (p1 (sg)) and
β(F )(s, g) = F (p2 (sg)) .
If we equip the quotient spaces G/G2 and G1 \G with their canonical invariant measure class, the embedding G1 → G/G2 identifies G1 with a Borel subset of G/G2 with complement of measure zero. Because of Proposition 3.2, this embedding, as well as its inverse, respects Borel sets of measure zero. Hence, it induces an isomorphism between L∞ (G1 ) and L∞ (G/G2 ). Then, p1 can be considered as the projection of G to G/G2 . Any s ∈ G2 acts as a homeomorphism on G/G2 , preserving Borel sets of measure zero and hence, the map p1 (s ·) is defined almost everywhere on G1 , preserves Borel sets of measure zero and gives precisely the automorphism αs of L∞ (G1 ). Similar considerations can be made for β. We also conclude that, for a fixed s ∈ G2 , sg ∈ G1 G2 for almost all g ∈ G1 . An analogous statement holds for a fixed g ∈ G1 . It follows that p1 (sg) and p2 (sg) are defined almost everywhere if either s or g is fixed. This will allow us to use freely p1 (sg) and p2 (sg) in integrals, see e.g. Eq. (3.4). We denote by X and Y the usual multiplicative unitaries on H1 := L2 (G1 ) and H2 := 2 L (G2 ) respectively, defined by (Xξ )(g, h) = ξ(gh, h) and (Y η)(s, t) = η(st, t). Then Yˆ is the multiplicative unitary on L2 (G2 ) given by (Yˆ η)(s, t) = 2 (s)1/2 η(s, s −1 t). Following [2] and [19], we can define the main actors of the paper.
146
S. Baaj, G. Skandalis, S. Vaes
Definition 3.3. Defining W = (β ⊗ ι)(Yˆ )123 (ι ⊗ α)(X)234 , we get a multiplicative unitary W , which is the right regular representation of the l.c. quantum group (M, δ) given by M = (α(L∞ (G1 )) ∪ L(G2 ) ⊗ 1)
= G2 α L∞ (G1 ) with δ(z) = W (z ⊗ 1)W ∗ , where L(G2 ) denotes the left regular representation of the group von Neumann algebra of G2 , generated by the unitaries λs defined by (λs ξ )(t) = 2 (s)1/2 ξ(s −1 t). op
The right Haar measure of G1 is the left Haar measure of G1 and secondly, using the modular function 2 , the L2 -spaces of G2 with right or left Haar measure are naturally isomorphic. Under this isomorphism W agrees exactly with the Wˆ of [19], Def. 2.2 (one just has to interchange the indices 1 and 2 referring to G1 and G2 everywhere). Hence, the (M, δ) defined above agrees with (M, op ) in [19], where op = σ with σ the flip map. So, we get indeed that W is the right regular representation of the l.c. quantum group (M, δ). Dually, we have Mˆ = (β(L∞ (G2 )) ∪ 1 ⊗ R(G1 ))
= L∞ (G2 ) β G1
with
ˆ δ(z) = W ∗ (1 ⊗ z)W ,
where R(G1 ) denotes the right regular representation of the group von Neumann algebra of G1 , generated by the unitaries (ρg ξ )(h) = ξ(hg). ˆ δ) ˆ ) ˆ above and (M, ˆ in [19] really agree. Then, (M, From Eq. (3.1), we get the following formulas for the comultiplication on the generˆ ators of M and M: δα = (α ⊗ α)δ1 and (ι ⊗ δ)(Yˆ12 ) = Yˆ12 ((ι ⊗ α)β ⊗ ι)(Yˆ )1234 , ˆ = (β ⊗ β)δ2 and (δˆ ⊗ ι)(X23 ) = (ι ⊗ (β ⊗ ι)α)(X)2345 X45 . δβ
(3.2) (3.3)
Remark 3.4. If we interchange G1 and G2 and keep G, we also have a matched pair of l.c. groups. Performing the bicrossed product construction again, we get a multiplicative unitary on L2 (G1 × G2 ), which through the unitary (vξ )(s, g) = 1 (g)1/2 2 (s)1/2 ξ(g −1 , s −1 ) corresponds to W ∗ . So, S and Sˆ are interchanged and the comultiplications are flipped. So, the von Neumann algebraic picture of the l.c. quantum group (M, δ) is completely clear. In this paper, we study the associated C∗ -algebras. To determine these, the following lemma will be crucial. Lemma 3.5. Using the embedding C0 (G/G2 ) → M(C0 (G1 )) ⊂ L∞ (G1 ), we have [(ω ⊗ ι)α(F ) | ω ∈ L(H2 )∗ , F ∈ C0 (G1 )] = C0 (G/G2 ) . Proof. We have to prove that for all Fi ∈ Cc (Gi ), the function H (g) := F1 (p1 (sg))F2 (s) ds
(3.4)
Non-Semi-Regular Quantum Groups
147
belongs to Cc (G/G2 ) and that these functions H span a dense subspace of C0 (G/G2 ). It suffices to look at F1 = K˜ 1 ∗ P1 with K1 , P1 ∈ Cc (G1 ), i.e. F1 (g) =
K1 (h)P1 (hg) dh .
Take some P2 ∈ Cc (G2 ) such that P2 (s −1 ) ds = 1. On G, we define bounded Borel functions K and P with compact support by K(hs) = K1 (h)F2 (s) 1 (h)−1 (h)
and
P (hs) = P1 (h)P2 (s)
and such that K and P equal 0 outside G1 G2 . In particular, K, P ∈ L2 (G) and defining Q = K˜ ∗ P , i.e. Q(x) = K(y)P (yx) dy , we have Q ∈ Cc (G). We claim that
Q(gt −1 ) dt .
H (g) = Take g ∈ G1 . Then, using Proposition 3.2,
Q(gt −1 ) dt =
K(y)P (ygt −1 ) dy dt = K(hs)P (hsgt −1 ) 1 (h) (h)−1 dh ds dt = K1 (h)F2 (s)P1 (hp1 (sg))P2 (p2 (sg)t −1 ) dt dh ds = K1 (h)F2 (s)P1 (hp1 (sg)) dh ds = H (g) .
This proves our claim. We conclude that H ∈ Cc (G/G2 ). Moreover, the functions H span a dense subset of C0 (G/G2 ), because in the proof above, the functions K and P both span a dense subset of L2 (G), so that the functions Q span a dense subset of C0 (G).
From [19], we know that the multiplicative unitary W is the (right) regular representation of the l.c. quantum group (M, δ). Hence, we have two associated C∗ -algebras: S = [(ω ⊗ ι)(W ) | ω ∈ L(H2 ⊗ H1 )∗ ]
and Sˆ = [(ι ⊗ ω)(W ) | ω ∈ L(H2 ⊗ H1 )∗ ] . (3.5)
The comultiplications δ and δˆ restrict nicely to morphisms δ : S → M(S ⊗ S) and ˆ δˆ : Sˆ → M(Sˆ ⊗ S). ˆ We use λ to denote the As a first application of Lemma 3.5, we can identify S and S. left regular representation of Cr∗ (G2 ) on L2 (G2 ).
148
S. Baaj, G. Skandalis, S. Vaes
Proposition 3.6. The following equalities hold: S = G2 r C0 (G/G2 ) = (λ(Cr∗ (G2 )) ⊗ 1)α(C0 (G/G2 )) and α : C0 (G1 ) → M(S) is non-degenerate, Sˆ = C0 (G1 \G) r G1 = (1 ⊗ Cr∗ (G1 ))β(C0 (G1 \G)) and ˆ is non-degenerate . β : C0 (G2 ) → M(S) Proof. Observe that β(F ) = B(F ⊗ 1)B ∗ for all F ∈ L∞ (G2 ) where B ∈ M(K(H2 ) ⊗ C0 (G1 )) is a representation of X, the canonical implementation of β. Because B is a representation of X, we get ∗ X23 )) | ω ∈ L(H2 )∗ , µ ∈ L(H1 )∗ ] S = [(ω ⊗ µ ⊗ ι ⊗ ι)(Yˆ13 (ι ⊗ ι ⊗ α)(B12 = [(ω ⊗ µ ⊗ ι ⊗ ι)(Yˆ13 (ι ⊗ ι ⊗ α)(B13 X23 )) | ω ∈ L(H2 )∗ , µ ∈ L(H1 )∗ ] = [(ω ⊗ ι ⊗ ι) Yˆ12 (ι ⊗ α)(B(1 ⊗ C0 (G1 ))) | ω ∈ L(H2 )∗ ]
= [(λ(Cr∗ (G2 )) ⊗ 1)α(C0 (G1 ))] .
Because S is a C∗ -algebra and because of Lemma 3.5, we get S = [(λ(Cr∗ (G2 )) ⊗ 1)α(C0 (G1 ))(λ(Cr∗ (G2 )) ⊗ 1)] = [(λ(Cr∗ (G2 )) ⊗ 1)α(C0 (G/G2 ))(λ(Cr∗ (G2 )) ⊗ 1)] . By definition, G2 r C0 (G/G2 ) = [(λ(Cr∗ (G2 )) ⊗ 1)α(C0 (G/G2 ))] and this is a C∗ -algebra. Hence, S = G2 r C0 (G/G2 ). So, we have proven the first line of the statement above. The second line is analogous (or follows by interchanging G1 and G2 and applying Remark 3.4).
Next, we prove the corresponding result for the universal C∗ -algebras. Proposition 3.7. Suppose that W is a corepresentation of W on a Hilbert space K. a) There exist unique corepresentations y of Yˆ and x of X on K such that W = (β ⊗ ι)(y) x23 . b) If we denote by π1 and πˆ 2 the representations of C0 (G1 ) and C ∗ (G2 ) corresponding to x and y and if we define the representation π˜ 1 of C0 (G/G2 ) using the extension of π1 to M(C0 (G1 )), then the pair (π˜ 1 , πˆ 2 ) is a covariant representation for the left action of G2 on C0 (G/G2 ). c) The representation π of Su associated to W and the representation π˜ 1 × πˆ 2 of G2 f C0 (G/G2 ) have the same image. Conversely, every covariant representation for the left action of G2 on G/G2 is obtained in this way from a corepresentation W of W . In particular, Su ∼ = G2 f C0 (G/G2 ) in a natural way. An analogous statement holds for Sˆu and in particular, Sˆu ∼ = C0 (G1 \G) f G1 .
Non-Semi-Regular Quantum Groups
149
Proof. a) This follows from [4]. Observe that x ∈ M(Cr∗ (G1 ) ⊗ K(K)) and y ∈ M(C0 (G2 ) ⊗ K(K)). b) Because W is a corepresentation, we get the following equality in L(H2 ⊗ H1 ⊗ H2 ⊗ H1 ⊗ K): (δˆ ⊗ ι)(x23 ) = (β ⊗ ι)(y ∗ )345 x25 (β ⊗ ι)(y)345 x45 .
(3.6)
Define the non-degenerate representation π˜ 1 of C0 (G/G2 ) as in the statement of the proposition. We know that, in L(H2 ⊗ H1 ⊗ H2 ⊗ H1 ⊗ H1 ), (δˆ ⊗ ι)(X23 ) = (ι ⊗ (β ⊗ ι)α)(X)2345 X45 .
(3.7)
From Proposition 3.6, it follows that X23 ∈ M(Sˆ ⊗ C0 (G1 )) and so, it follows that (ι ⊗ (β ⊗ ι)α)(X)2345 ∈ M(Sˆ ⊗ Sˆ ⊗ C0 (G1 )), from which we conclude that (β ⊗ ι)α : C0 (G1 ) → M(Sˆ ⊗ C0 (G1 ))
(3.8)
is a well-defined and non-degenerate ∗ -homomorphism. From Eq. (3.7) and the fact that (ι ⊗ π1 )(X) = x, we conclude that, in L(H2 ⊗ H1 ⊗ H2 ⊗ H1 ⊗ K), (δˆ ⊗ ι)(x23 ) = (ι ⊗ (ι ⊗ ι ⊗ π1 )(β ⊗ ι)α)(X)2345 x45 . Combining with Eq. (3.6), we find that (ι ⊗ (ι ⊗ ι ⊗ π1 )(β ⊗ ι)α)(X) = (β ⊗ ι)(y ∗ )234 x14 (β ⊗ ι)(y)234 ,
(3.9)
which yields in L(H2 ⊗ H1 ⊗ K) the equality (ι ⊗ ι ⊗ π1 )(β ⊗ ι)α(F ) = (β ⊗ ι)(y ∗ )(1 ⊗ 1 ⊗ π1 (F ))(β ⊗ ι)(y)
(3.10)
for all F ∈ C0 (G1 ). Because of non-degenerateness, the same holds for all F ∈ M(C0 (G1 )) and hence, for F ∈ C0 (G/G2 ). As G2 acts continuously on C0 (G/G2 ), we have α : C0 (G/G2 ) → M(C0 (G2 ) ⊗ C0 (G/G2 )) and so, we arrive at the formula (ι ⊗ π˜ 1 )α(F ) = y ∗ (1 ⊗ π˜ 1 (F ))y , for all F ∈ C0 (G/G2 ). This precisely gives the required covariance of (π˜ 1 , πˆ 2 ). c) Denote by π and π˜ 1 × πˆ 2 the corresponding representations of Su and G2 f C0 (G/G2 ). We have to prove that they have the same image and this will be analogous to the proof of Proposition 3.6. We use again the canonical implementation B ∈ M(K(H2 ) ⊗ C0 (G1 )) of β. Observe that π(Su ) = [(ω ⊗ µ ⊗ ι)(W) | ω ∈ L(H2 )∗ , µ ∈ L(H1 )∗ ] ∗ = [(ω ⊗ µ ⊗ ι)(y13 B12 x23 ) | ω ∈ L(H2 )∗ , µ ∈ L(H1 )∗ ] . ∗ x = (ι ⊗ π )(B) x B ∗ and hence, Because B12 B13 X23 = X23 B12 , we get B12 23 1 13 23 12
π(Su ) = [(ω ⊗ µ ⊗ ι)(y13 (ι ⊗ π1 )(B)13 x23 ) | ω ∈ L(H2 )∗ , µ ∈ L(H1 )∗ ] = [(ω ⊗ ι)(y (ι ⊗ π1 )(B) π1 (C0 (G1 )) | ω ∈ L(H2 )∗ ] = [πˆ 2 (C ∗ (G2 )) π1 (C0 (G1 ))] .
150
S. Baaj, G. Skandalis, S. Vaes
From Eq. (3.10), it follows that
π˜ 1 ((ωβ ⊗ ι)α(F )) = (ω ⊗ ι) (β ⊗ ι)(y ∗ ) (1 ⊗ 1 ⊗ π1 (F )) (β ⊗ ι)(y)
for all F ∈ C0 (G1 ) and ω ∈ L(H2 ⊗ H1 )∗ . Combining this with Lemma 3.5, it follows that [πˆ 2 (C ∗ (G2 )) π˜ 1 (C0 (G/G2 )) πˆ 2 (C ∗ (G2 ))] = [πˆ 2 (C ∗ (G2 )) π1 (C0 (G1 )) πˆ 2 (C ∗ (G2 ))] . The image of π˜ 1 × πˆ 2 is given by [πˆ 2 (C ∗ (G2 )) π1 (C0 (G/G2 ))] and is a C∗ -algebra. Also π(Su ) is a C∗ -algebra, so that the calculation above shows that, indeed π and π˜ 1 × πˆ 2 have the same image. This concludes the proof of the first part of the proposition. Suppose now, conversely, that we have a covariant representation (π˜ 1 , πˆ 2 ) on K for the left action of G2 on G/G2 . Denote by y the corepresentation of Yˆ corresponding to πˆ 2 . The representation π˜ 1 defines a measure class on G/G2 which, by covariance, is invariant under the action of G2 . We claim that this measure class is supported by the image of G1 in G/G2 . By invariance, the transformation (s, x) ¯ → (s, sx) of G2 ×G/G2 preserves sets of measure zero and so, we have to prove that the set of pairs (s, x) ¯ such that sx ∈ G1 has measure zero. Using the Fubini theorem, it is enough to prove that, for any x ∈ G, the set of all s ∈ G2 such that sx ∈ G1 has measure zero. But this last set is equal to the set of all s ∈ G2 such that gs ∈ G1 G2 x −1 for all g ∈ G1 . And here, we get a set of measure zero and our claim has been proved. Because the measure class on G/G2 is supported by G1 , we can use the Borel calculus to find a non-degenerate representation π1 of C0 (G1 ) on K such that π˜ 1 is indeed obtained by extending π1 to M(C0 (G1 )). Put x = (ι ⊗ π1 )(X). We have to prove that W := (β ⊗ ι)(y) x23 is a corepresentation of W . Because π˜ 1 makes sense on all bounded Borel functions on G/G2 (which is essentially the same thing as bounded Borel functions on G1 ), we get, by covariance π˜ 1 (F (p1 (s ·))) = ys∗ π˜ 1 (F )ys , for all bounded Borel functions on F on G1 . Integrating, we conclude that π˜ 1 ((ω ⊗ ι)α(F )) = (ω ⊗ ι)(y ∗ (1 ⊗ π˜ 1 (F ))y) , for all bounded Borel functions on F on G1 and ω ∈ L(H2 )∗ . From this, it follows that Eq. (3.10) holds for all F ∈ C0 (G1 ) and as we saw above, this yields that W is a corepresentation of W .
Observe that it follows immediately that the projection ∗ -homomorphism of Su onto S corresponds exactly to the projection ∗ -homomorphism of the full crossed product G2 f C0 (G/G2 ) onto the reduced crossed product G2 r C0 (G/G2 ). From the proof of the previous proposition, it also follows that we have a non-degenerate ∗ -homomorphism C0 (G1 ) → M(Su ). ˆ Next, we consider covariant representations and we identify S r,f S. Proposition 3.8. Suppose that V is a representation and W is a corepresentation of W , such that (V, W) is a covariant representation of W . Take the unique corepresentations x of X and y of Yˆ and the unique representations a of X and b of Yˆ such that V = b12 (ι ⊗ α)(a) and W = (β ⊗ ι)(y) x23 . Denote by π1 and π2 the representations of C0 (G1 ) and C0 (G2 ) associated with x and b, respectively.
Non-Semi-Regular Quantum Groups
151
a) The ranges of π1 and π2 commute and we can define a non-degenerate representation π of C0 (G1 × G2 ). Extending first to Cb (G1 × G2 ) and then restricting to C0 (G) using the map (g, s) → gs, we get a non-degenerate representation π˜ of C0 (G). b) The unitaries ag and ys commute for all g ∈ G1 , s ∈ G2 . c) The pair (π, ˜ (ys ag )(s,g) ) is a covariant representation for the action of G2 × G1 on C0 (G), where G2 acts on the left and G1 on the right. d) The images of the associated representations of S f Sˆ and (G2 × G1 ) f C0 (G) coincide. Conversely, every covariant representation for the action of G2 × G1 on C0 (G) is obtained in this way from a covariant pair for W. In particular, S f Sˆ ∼ = (G2 × G1 ) f C0 (G) in a natural way. Proof. Let (V, W) be a covariant pair for W on K and take, using Proposition 3.7, x, y, a and b as in the statement of the proposition. As we remarked right after Definition 2.4, both W and V are stably isomorphic to W , but not jointly. This means that we have faithful, normal ∗ -homomorphisms η : M → L(K) and ηˆ : Mˆ → L(K) such that W = (ι ⊗ η)(W ) and V = (ηˆ ⊗ ι)(W ). Restricting to the von Neumann subalgebras ˆ we obtain α(L∞ (G1 )) and L(G2 ) ⊗ 1 of M and β(L∞ (G2 )) and 1 ⊗ R(G1 ) of M, faithful, normal ∗ -homomorphisms π1 , πˆ 2 , π2 and πˆ 1 respectively, such that x = (ι ⊗ π1 )(X),
y = (ι ⊗ πˆ 2 )(Yˆ ),
a = (πˆ 1 ⊗ ι)(X)
and
b = (π2 ⊗ ι)(Yˆ ) .
The covariance of the pair (V, W) is equivalent to each of the following formulas: (η ⊗ ι)δ(z) = V(η(z) ⊗ 1)V ∗ for all z ∈ M ˆ ˆ for all z ∈ Mˆ . (ι ⊗ η) ˆ δ(z) = W ∗ (1 ⊗ η(z))W Evaluating these formulas on the generators of M and Mˆ following Eq. (3.2) and (3.3), covariance is equivalent to each of the following two lines of formulas: ∗ x12 (ι ⊗ α)(X)134 = V234 x12 V234 ,
∗ y12 ((ι ⊗ π1 )β ⊗ ι)(Yˆ )123 = V234 y12 V234 , (3.11)
∗ ∗ b34 W123 , (ι ⊗ (π2 ⊗ ι)α)(X)234 a34 = W123 a34 W123 . (β ⊗ ι)(Yˆ )124 b34 = W123 (3.12)
Using the explicit expression of V, the second formula of Eq. (3.11) becomes ∗ ∗ b23 y12 ((ι ⊗ π1 )β ⊗ ι)(Yˆ )123 b23 = (ι ⊗ ι ⊗ α)(a23 y12 a23 ).
(3.13)
Because α(L∞ (G1 )) ∩ L(G2 ) ⊗ 1 = C, we find a unitary v ∈ L(H2 ⊗ K), such that ∗ = v . Treating in the same way the second formula of Eq. (3.12), we find a23 y12 a23 12 ∗ a ∗ ∗ that y12 23 y12 ∈ 1 ⊗ L(K ⊗ H1 ). But, y12 a23 y12 = y12 v12 a23 , which yields the existence of u ∈ L(K) such that v = y(1 ⊗ u). Because clearly both v and y are corepresentations of Yˆ , we find that u = 1. Hence, we arrived at the commutation of a23 and y12 . This means that the unitaries ag and ys commute for all g ∈ G1 , s ∈ G2 .
152
S. Baaj, G. Skandalis, S. Vaes
Next, we observe that b∗ (1 ⊗ F )b = (π2 ⊗ ι)δ2 (F ) for all F ∈ L∞ (G2 ). Combining this with the formula (δ2 ⊗ ι)α = (ι ⊗ α)α, we can rewrite the first formula of Eq. (3.11) in the form ∗ ∗ x12 b23 = (ι ⊗ ι ⊗ α) a23 x12 a23 (ι ⊗ (π2 ⊗ ι)α)(X ∗ ) . (3.14) b23 ∗ x As above, it follows that b23 12 b23 ∈ L(H1 ⊗ K) ⊗ 1. Treating similarly the first formula of Eq. (3.12), we find ∗ ∗ (3.15) = (β ⊗ ι ⊗ ι) ((ι ⊗ π1 )β ⊗ ι)(Yˆ ∗ ) y12 b23 y12 x23 b34 x23 ∗ ∈ 1 ⊗ L(K ⊗ H ). A similar reasoning as above yields and hence, that x12 b23 x12 2 now the commutation of x12 and b23 . This means that the ranges of π1 and π2 commute. Combining this with Eq. (3.15), we get ∗ b23 y12 . ((ι ⊗ π1 )β ⊗ ι)(Yˆ ) b23 = y12
(3.16)
Denote by π the non-degenerate representation of C0 (G1 × G2 ) on K such that π(F1 ⊗ F2 ) = π1 (F1 ) π2 (F2 ). Dualizing the continuous map (g, s) → gs, we embed C0 (G) into M(C0 (G1 × G2 )) and obtain the non-degenerate representation π˜ of C0 (G) on K. We have to prove that (π, ˜ (ag ys )) is a covariant representation. We claim that γ := ((β ⊗ ι)τ ⊗ ι)(ι ⊗ δ2 ) : C0 (G1 × G2 ) → M(K(H2 ⊗ H1 ) ⊗ C0 (G1 × G2 )) is well-defined and non-degenerate. It suffices to verify this statement on C0 (G1 ) ⊗ 1 and 1 ⊗ C0 (G2 ) separately. For C0 (G1 ) ⊗ 1, it follows from Eq. (3.8). On the other hand, we observe that ∗ ∗ ˆ Y45 ∈ M(K(H2 ⊗ H1 ) ⊗ C0 (G1 × G2 ) ⊗ K(H2 )) , B12 (γ ⊗ ι)(Yˆ23 ) = B12 B13 Yˆ15 B13
where B ∈ M(K(H2 )⊗C0 (G1 )) is again the canonical implementation of β. This proves our claim. Using this last equation and Eq. (3.16), we also observe that ∗ b34 (ι ⊗ ι ⊗ π ⊗ ι)(γ ⊗ ι)(1 ⊗ Yˆ ) = B12 ((ι ⊗ π1 )β ⊗ ι)(Yˆ )134 B12 ∗ = (β ⊗ ι ⊗ ι)(y12 b23 y12 ) = (β ⊗ ι)(y ∗ )123 (π ⊗ ι)(1 ⊗ Yˆ )34 (β ⊗ ι)(y)123 .
Combining this with Eq. (3.10), which holds because W is a corepresentation of W , we get (ι ⊗ ι ⊗ π)γ (F ) = (β ⊗ ι)(y ∗ ) (1 ⊗ 1 ⊗ π(F )) (β ⊗ ι)(y) ,
(3.17)
for all F ∈ C0 (G1 × G2 ). By non-degenerateness, the same holds for F ∈ C0 (G), which exactly means that π(F ˜ (s ·)) = ys∗ π˜ (F ) ys , for all F ∈ C0 (G) and s ∈ G2 . One similarly proves that π(F ˜ (· g)) = ag π˜ (F ) ag∗ for all F ∈ C0 (G) and g ∈ G1 . So, (π˜ , (ag ys )) is a covariant representation.
Non-Semi-Regular Quantum Groups
153
Denote the image of the representation of S f Sˆ associated with (V, W) by A. Then, by definition A = [(ι ⊗ ω)(V) (µ ⊗ ι)(W) | ω, µ ∈ L(H2 ⊗ H1 )∗ ] . It follows from Proposition 3.7 and its proof that A = [πˆ 2 (C ∗ (G2 ) π1 (C0 (G1 )) π2 (C0 (G2 )) πˆ 1 (C ∗ (G1 ))] = [πˆ 2 (C ∗ (G2 ) π(C0 (G1 × G2 )) πˆ 1 (C ∗ (G1 ))] . From Eq. (3.17), we know that π (ω ⊗ ι ⊗ ι)(τ ⊗ ι)(ι ⊗ δ2 )(F ) = (ω ⊗ ι)(y ∗ (1 ⊗ π(F )) y) for all F ∈ C0 (G1 × G2 ) and ω ∈ L(H2 )∗ . Lemma 3.9 following this proof, tells us that the left-hand side spans a dense subset of π˜ (C0 (G)). So, we get ˜ 0 (G)) πˆ 1 (C ∗ (G1 )] = [πˆ 2 (C ∗ (G2 ) π(C0 (G1 × G2 )) πˆ 1 (C ∗ (G1 )] . [πˆ 2 (C ∗ (G2 ) π(C Because both A and the image of (G2 × G1 ) f C0 (G) are C∗ -algebras, we easily get that A indeed equals the image of (G2 × G1 ) f C0 (G). Suppose now that, conversely, we have a covariant representation (π˜ , (ys ag )) for the action of G2 × G1 on C0 (G). The representation π˜ gives rise to a measure class on G which is invariant by multiplication on the left by G2 and on the right by G1 . We claim that this measure class is supported by G1 G2 . Because the transformation (s, x) → (s, sx) on G2 × G preserves Borel sets of measure zero, it is enough to prove that for any x ∈ G, the set of all s ∈ G2 such that sx ∈ G1 G2 has measure zero. But this is the case, as we already saw in the proof of Proposition 3.7. Hence, the Borel functional calculus provides a non-degenerate representation π of C0 (G1 × G2 ) such that π˜ is obtained by extending π to M(C0 (G1 × G2 )) and then restricting to C0 (G) through the map (g, s) → gs. We denote by π1 and π2 the corresponding representations of C0 (G1 ) and C0 (G2 ). Then, we can define x = (ι ⊗ π1 )(X) and b = (π2 ⊗ ι)(Yˆ ). Defining V = b12 (ι ⊗ α)(a) and W = (β ⊗ ι)(y) x23 , it follows from Proposition 3.7 that V is a representation of W and W is a corepresentation of W . We know that π(F ˜ (s ·)) = ys∗ π(F ˜ )ys for all s ∈ G2 and all bounded Borel functions F on G. Integrating, it follows in particular that π˜ (ω ⊗ ι ⊗ ι)(β ⊗ ι)δ2 (F ) = (ω ⊗ ι)(y ∗ (1 ⊗ π2 (F ))y) for all ω ∈ L(H2 )∗ and all F ∈ C0 (G2 ). Evaluating this formula on F = (ι ⊗ µ)(Yˆ ), we get ∗ ((ι ⊗ π1 )β ⊗ ι)(Yˆ ) b23 = y12 b23 y12 , which makes sense because β : C0 (G2 ) → M(K(H2 ) ⊗ C0 (G1 )) is well-defined and non-degenerate (using the canonical implementation of β). Because a23 and y12 commute, it follows that Eq. (3.13) holds and hence, also the second formula of Eq. (3.11) holds.
154
S. Baaj, G. Skandalis, S. Vaes
Because π˜ (F (· g)) = ag π(F ˜ )ag∗ for all g ∈ G1 and all bounded Borel functions F on G, we find similarly that π˜ (ι ⊗ ι ⊗ ω)(ι ⊗ α)δ1 (F ) = (ι ⊗ ω)(a(π1 (F ) ⊗ 1)a ∗ ) for all ω ∈ L(H1 )∗ and all F ∈ C0 (G1 ), which yields, in the same way, ∗ x12 (ι ⊗ (π2 ⊗ ι)α)(X) = a23 x12 a23 ,
where (π2 ⊗ ι)α makes sense again using the canonical implementation of α. Because x12 and b23 commute, we get that Eq. (3.14) holds and hence, also the first formula in Eq. (3.11). Combining both formulas of Eq. (3.11) and the definitions of V, W and W , we get ∗ V345 W123 V345 = (β ⊗ ι)(y)123 ((ι ⊗ ι ⊗ π1 )(ι ⊗ δ1 )β ⊗ ι)(Yˆ )1234 x23 (ι ⊗ α)(X)245 = W123 (β ⊗ ι)(Yˆ )124 (ι ⊗ α)(X)245 = W123 W1245 ,
where we used the formula (ι ⊗ π1 )δ1 (F ) = x(F ⊗ 1)x ∗ for all F ∈ C0 (G1 ). So, we precisely get that (V, W) is a covariant pair for W .
The following lemma was needed to prove the previous proposition. It is completely analogous to Lemma 3.5 above. Lemma 3.9. Using the map (g, s) → gs, we embed C0 (G) → M(C0 (G1 × G2 )) ⊂ L∞ (G1 × G2 ). Then, we have C0 (G) = [(ω ⊗ ι ⊗ ι)(τ ⊗ ι)(ι ⊗ δ2 )(C0 (G1 × G2 )) | ω ∈ L(H2 )∗ ] . Proof. We have to prove that the functions gt → K2 (s)F1 (p1 (sg))F2 (p2 (sg)t) ds belong to Cc (G) whenever F1 ∈ Cc (G1 ), K2 , F2 ∈ Cc (G2 ) and that they span a dense subset of C0 (G). It suffices to consider F1 of the form K˜ 1 ∗ P1 with K1 , P1 ∈ Cc (G1 ) and F1 (g) = K1 (h)P1 (hg) dh . Then, we can define bounded Borel functions K and P with compact support on G by the formulas P (gs) = P1 (g)F2 (s),
K(gs) = K1 (g)K2 (s) 1 (g)−1 (g)
and P and K equal 0 outside G1 G2 . Because K and P belong to L2 (G) and have compact support, the function H := K ∗ P defined by H (x) = K(y)P (yx) dy belongs to Cc (G). But, using Proposition 3.2,
Non-Semi-Regular Quantum Groups
155
K(y)P (ygt) dy = K(hs)P (hsgt) 1 (h) (h)−1 dh ds = K1 (h)K2 (s)P1 (hp1 (sg))F2 (p2 (sg)t) dh ds = K2 (s)F1 (p1 (sg))F2 (p2 (sg)t) ds .
H (gt) =
This ends the proof, because it is clear that the functions K and P span dense subspaces of L2 (G) and hence the functions H span a dense subspace of C0 (G).
We also characterize the reduced crossed products in the following easily proved proposition. Proposition 3.10. The regular covariant pair (W, W ) for W corresponds through the procedure of Proposition 3.8 to a representation of (G2 × G1 ) f C0 (G) which is stably isomorphic to the regular representation. In particular, S r Sˆ ∼ = (G2 × G1 ) r C0 (G) and the projection ∗ -homomorphisms from the full onto the reduced crossed products are intertwined by the isomorphisms S f,r Sˆ ∼ = (G2 × G1 ) f,r C0 (G). Proof. Because the orbit of e under the action of G2 ×G1 , which is G2 G1 , is dense in G, it follows that the covariant representation associated with this orbit, on the Hilbert space L2 (G2 × G1 ), is stably isomorphic to the regular representation of (G2 × G1 ) f C0 (G). It is clear that the V and W corresponding to this covariant representation are twice W.
Using this proposition, we prove our main theorem. Theorem 3.11. The multiplicative unitary W of the bicrossed product l.c. quantum group (M, δ) is regular if and only if the map θ : G1 × G2 → G : θ(g, s) = gs is a homeomorphism of G1 × G2 onto G. The multiplicative unitary W is semi-regular if and only if θ is a homeomorphism of G1 ×G2 onto an open subset of G with complement of measure zero. Proof. In the proof of the previous proposition, we have seen that Sr Sˆ is precisely given by the image, denoted by A, of the irreducible representation of (G2 ×G1 )f C0 (G) corresponding to the free orbit G2 G1 . From Lemma 3.12, it follows that A = K(H2 ⊗H1 ) if and only if this orbit is closed and homeomorphic to G2 × G1 and that K(H2 ⊗ H1 ) ⊂ A if and only if this orbit is locally closed and homeomorphic to G2 × G1 . Because the orbit is dense and because of Proposition 2.6, we precisely arrive at the statement of the theorem, using also that G1 G2 = (G2 G1 )−1 .
In the next section, we will give examples where the image of θ is not open and hence, the associated multiplicative unitary is not semi-regular. Nevertheless, it should be observed that W , being the regular representation of a l.c. quantum group, is always manageable in the sense of [21]. So, not all manageable multiplicative unitaries are semi-regular. The next lemma is well known (see [8] for a related result), but we include a short proof for completeness.
156
S. Baaj, G. Skandalis, S. Vaes
Lemma 3.12. Suppose that a l.c. group G acts continuously (on the right) on a l.c. space X. Let x0 ∈ X have a free orbit, i.e. the map θ : G → X : θ(g) = x0 · g is injective. Denote by π the representation of C0 (X) f G on L2 (G) corresponding to the orbit of x0 . Then, a) The image of π contains K(L2 (G)) if and only if the orbit θ (G) is locally closed and θ is a homeomorphism of G onto θ (G). b) The image of π is equal to K(L2 (G)) if and only if the orbit θ (G) is closed and θ is a homeomorphism of G onto θ (G). Proof. Suppose first that the image of π contains K(L2 (G)). Equip θ (G) with its relative topology and assume that θ −1 is not continuous. Then, we find elements gn ∈ G such that gn (n ≥ 1) remains outside a neighborhood of g0 , but x0 ·gn → x0 ·g0 . Consider the dense subalgebra of the image of π consisting of the operators γ (F ), F ∈ Cc (G × X), defined by F (h, x0 · g) ξ(gh) dh . γ (F )ξ )(g) = G
Take a function η ∈ L2 (G) with η = 1 and with small enough support such that λgn η, λg0 η = 0 for all n ≥ 1, where (λg ) is the left regular representation of G. One verifies immediately that λgn η, γ (F )λgn η → λg0 η, γ (F )λg0 η
for all
F ∈ Cc (G × X) .
Hence, the same holds for all a ∈ π(C0 (X) f G) instead of γ (F ) and, in particular, for all a ∈ K(L2 (G)). This gives a contradiction when we take for a the projection on λg0 η. So, θ −1 is continuous. This means that θ (G) is locally compact in its relative topology, which precisely means that θ (G) is locally closed in X. Suppose next that the image of π is precisely K(L2 (G)). Denote by Y the closure of θ (G). From the previous paragraph, we already know that θ (G) is open in Y and that θ is a homeomorphism. Suppose that θ (G) = Y , take x1 ∈ Y \ θ (G) and take gn ∈ G such that x0 · gn → x1 . Then, gn goes to infinity. Writing π1 for the representation of C0 (X) f G on L2 (G) corresponding to the orbit of x1 , whose image contains the dense subalgebra of operators γ1 (F ), F ∈ Cc (G × X), we observe that λgn η, γ (F )λgn η → η, γ1 (F )η for all η ∈ L2 (G), F ∈ Cc (G × X) . But, λgn η → 0 weakly, as gn → ∞. Because the image of π is supposed to be exactly the compact operators, it follows from the previous formula that η, γ1 (F )η = 0 for all F and η. This is a contradiction. The converse implications are easy to prove.
Using the theory that we developed so far, it is also easy to give examples of l.c. quantum groups such that the projection ∗ -homomorphism from S f Sˆ onto S r Sˆ is not faithful. Loosely speaking, this means that the action of (S, δ) on itself by translation is not amenable and in particular, not proper (whatever this means). Proposition 3.13. Suppose that G1 and G2 are conjugated, i.e. there exists an element z0 ∈ G such that G2 = z0 G1 z0−1 . a) S f,r Sˆ is strongly Morita equivalent with Su,r .
Non-Semi-Regular Quantum Groups
157
b) The projection ∗ -homomorphism from S f Sˆ onto S r Sˆ is faithful if and only if G1 is amenable. c) M ∼ = Mˆ ∼ = L(L2 (G1 )). Proof. a) We use an argument which is essentially contained in Rieffel’s paper [17], but we include a sketch of it for completeness. Write B = C0 (G/G1 ). On Cc (G), the continuous compactly supported functions on G, we define a B-valued inner product by ξ, η(x) = (ξ η)(xg −1 ) dg . G1
Completion yields the Hilbert B-module E. Because the right action of G1 on G is proper, there is only one crossed product C0 (G) G1 , which can be identified with K(E), the “compact” operators on the Hilbert B-module E. Because E is full, we have a Morita equivalence between C0 (G) G1 and B. Then, we have S f,r Sˆ ∼ = (G2 × G1 ) f,r C0 (G) ∼ = G2 f,r K(E) ∼ = K(G2 f,r E) , where all isomorphisms are natural and intertwine the projection of full onto reduced crossed products and where G2 f,r E is the obvious full Hilbert G2 f,r C0 (G/G1 )module. In the identifications above, only the isomorphism G2 f K(E) ∼ = K(G2 f E) requires some care: consider the C∗ -algebra K(E ⊕ B) in which K(E) is a full corner and on which G2 acts by automorphisms. Then, G2 f K(E) is a full corner of G2 f K(E ⊕B) and is as such identified with K(G2 f E). Observing that the right multiplication by z0−1 gives a homeomorphism of G/G1 onto G/G2 intertwining the left action of G2 , we see that G2 f,r C0 (G/G1 ) ∼ = Su,r . So, we have proven the required strong Morita equivalence. b) Because of the Morita equivalence above, the projection of S f Sˆ onto S r Sˆ is faithful if and only if the projection of Su onto S is faithful. From Theorem 15 in [6], it follows that this last projection is faithful if and only if G2 is amenable. Because of conjugacy, this is equivalent to the amenability of G1 . c) As above, we observe that the left action of G2 on G/G2 is isomorphic to the left action of G2 on G/G1 . Using the isomorphism L∞ (G/G1 ) ∼ = L∞ (G2 ), we see that ∞ the action of G2 on L (G/G2 ) is isomorphic to the left translation of G2 on L∞ (G2 ). Hence, M ∼
= L(L2 (G2 )). Remark 3.14. Although in the situation of Proposition 3.13, we have a strong Morita equivalence between S f Sˆ and Su and hence, an isomorphism S f Sˆ ⊗ K ∼ = Su ⊗ K, this isomorphism is very much “twisted” for the following reason: if G1 is non-amenable, Su is very different from S, but nevertheless, we claim that, for any l.c. quantum ˆ As usual group (M, δ), we have a natural, injective ∗ -homomorphism M → M(S f S). ∗ we write S and Sˆ for the underlying C -algebras as in Eq. (2.1). From the remark after Definition 2.4, it follows that we can realize (in a natural way) S f Sˆ on a Hilbert space K such that there exist normal, faithful ∗ -homomorphisms ˆ We prove that π : M → L(K) and πˆ : Mˆ → L(K) such that S f Sˆ = [π(S) πˆ (S)]. ˆ Take x ∈ S
and observe that π(M) ⊂ M(S f S). (πˆ ⊗ ι)(V )(π(x) ⊗ 1) = (π ⊗ ι)δ(x)(πˆ ⊗ ι)(V ) = (π ⊗ ι)(W ∗ )(1 ⊗ x)(π ⊗ ι)(W )(πˆ ⊗ ι)(V ) ,
158
S. Baaj, G. Skandalis, S. Vaes
where V ∈ M(Sˆ ⊗ K) is the right regular representation of (M, δ) and W ∈ M(S ⊗ K) is the left regular representation. Hence, ˆ (S f S)π(x)
= [π(S)(ι ⊗ ω) (π ⊗ ι)(W ∗ )(1 ⊗ x)(π ⊗ ι)(W )(πˆ ⊗ ι)(V ) | ω ∈ L(H )∗ ] = [(ι ⊗ ω) (π(S) ⊗ x)(π ⊗ ι)(W )(πˆ ⊗ ι)(V ) | ω ∈ L(H )∗ ] ⊂ S f Sˆ .
So, we are done. 4. Examples We start off by presenting a fairly general example. Suppose that A is a l.c. ring, with unit, but not necessarily commutative and A = {0}. Denote by A the group of invertible elements in A. Then, A is a l.c. group by considering it as the closed subspace {(a, b) | ab = 1 = ba} of A × A. Next, we can define the ax + b-group on this l.c. ring A. G = A × A
with
(a, x)(b, y) = (ab, x + ay) .
Defining the closed subgroups (isomorphic with A ) G1 = {(a, a − 1) | a ∈ A }
and
G2 = {(b, 0) | b ∈ A } ,
we observe that G1 G2 consists of all pairs (a, x) such that x + 1 ∈ A . Hence, we get a matched pair of l.c. groups if and only if A has complement of (additive) Haar measure zero in A. Observe that G1 and G2 are conjugated by the element (−1, −1) ∈ G. Suppose that A is a l.c. ring, different from {0}, with A \ A∗ of (additive) Haar measure zero. Denote the bicrossed product multiplicative unitary associated to the above matched pair by WA and the associated l.c. quantum group by (M, δ)A . Proposition 4.1. We have the following properties. • The multiplicative unitary WA is not regular. It is semi-regular if and only if A is open in A. • The projection of S f Sˆ onto S r Sˆ is faithful if and only if the l.c. group A is amenable. ˆ δ) ˆ A is isomorphic to the opposite quantum group (M, δ op )A . • The dual (M, • The associated C∗ - and von Neumann algebras are given by Su,r ∼ = Sˆu,r ∼ = A f,r C0 (A), where A multiplies A on the left , M∼ = Mˆ ∼ = L(L2 (A )) , S f,r Sˆ ∼ = K(L2 (A )) ⊗ A f,r C0 (A) . Proof. Using Theorem 3.11 and Proposition 3.13, we get immediately the first two statements of the proposition. To prove the third statement, define the unitary U on L2 (A × A ) by the formula (Uξ )(b, a) = A (ab)1/2 ξ(a −1 , b−1 ) ,
Non-Semi-Regular Quantum Groups
159
where A is the modular function of A . A straightforward calculation on the generators shows that UMU ∗ = Mˆ and that this isomorphism intertwines δ op on M and δˆ on ˆ M. There is a homeomorphism from G/G2 onto A mapping (a, x) to x, intertwining the left action of G2 on G/G2 with the left multiplication of A on A. So, using Propositions 3.6 and 3.7, we get Su,r ∼ = A f,r C0 (A), where A multiplies A on the left. Because of the already proven third statement, we have Su,r ∼ = Sˆu,r and so, we have proven the first formula of the fourth statement. The second formula of the fourth statement follows from Proposition 3.13. Finally, using the homeomorphism (d, x) → (d, x − d) of G, the action of G2 × G1 on G is isomorphic to the action b · (d, x) · a = (bda, bx) of G2 × G1 on G. If we take the crossed product with the action of a ∈ A , we obtain K(L2 (A )) ⊗ C0 (A), on which A acts diagonally. This diagonal action is isomorphic with the amplified action on C0 (A) and we obtain the third formula, using Propositions 3.8 and 3.10.
Observe that the von Neumann algebraic picture of (M, δ)A is fairly trivial, because M∼ = L(L2 (A )). Nevertheless, the C∗ -algebra picture is far from trivial. As we will see in concrete examples below, the C∗ -algebra S need not be type I. An example of a non-amenable A∗ is, for instance, given by A = M2 (R). In that case the projection of S f Sˆ onto S r Sˆ is not an isomorphism. Example 4.2. Let P be an infinite set of prime numbers such that 1 < ∞. p
p∈P
Define A=
Qp ,
p∈P
where Qp is the l.c. field of p-adic numbers and where the prime means that we take the restricted Cartesian product: we only consider elements (xp ) that eventually belong to the compact sub-ring Zp of p-adic integers. When we equip A with the usual l.c. topology, we obtain a l.c. ring such that A has complement of measure zero in A but A has empty interior in A. The fact that A has complement of measure zero follows from a straightforward application of the Borel-Cantelli lemma: normalizing the Haar measure on Qp such that Zp has measure one, we observe that Zp \ Zp has measure p1 , which is assumed to be summable over p ∈ P. Observe that A is a (restricted) ring of adeles and that SA ∼ = A C0 (A). Such a ∗ C -algebra SA is not of type I and is intensively studied by J.B. Bost and A. Connes in [5]1 . Remark 4.3. Observe that in the previous example, we can replace Qp by the field of formal Laurent series over the finite field Fp with p elements. We have the compact sub-ring of formal power series over Fp and can perform an analogous construction. 1
More precisely, they study the Hecke algebra, which is p(A C0 (A))p for a natural projection p.
160
S. Baaj, G. Skandalis, S. Vaes
Example 4.4. Another, more canonical example can be given, much in the same spirit as Example 4.2. Denote for every prime number p by Kp the (unique) non-ramified extension of degree 2 of Qp . Let O(Kp ) be the compact ring of integers in Kp . Because the extension is non-ramified, the Haar measure of O(Kp ) \ O(Kp ) is p12 . Because this sequence is summable, we can take A= Kp . p prime
Remark 4.5. Consider the easiest case A = R. Define K = L2 (R∗ ). A covariant representation on the Hilbert space K for G2 × G1 acting on C0 (G) is given by π(F )(x) = F (x, x) for F ∈ C0 (G), x ∈ R∗
and
(λb ρa )(b,a)∈G2 ×G1 .
We can then take the covariant image of the bicrossed product multiplicative unitary WR and this yields the multiplicative unitary W on K given by
y xy , . (Wξ )(x, y) = ξ x+y+1 x+1 , x + y + xy , we see that both W and W ∗ Observing that (W ∗ ξ ) (x, y) = ξ x(y+1) y ∗ ) ⊗ L2 (R∗ ) invariant. So, the restriction of W gives us a multiplicative leave L2 (R+ + ∗ ). It is easy to check that the weak closure of S ˜ on K+ := L2 (R+ unitary W ˜ consists W precisely of all the operators T ∈ L(K+ ) such that P[x,+∞[ K+ is an invariant subspace ∗ . Hence, this weak closure is not invariant under involution. This of T for all x ∈ R+ means that SW˜ is not a C∗ -algebra. Analogously, SˆW˜ is not a C∗ -algebra. ˜ is very singular and should certainly not be considered The multiplicative unitary W as the multiplicative unitary of a quantum group. ∗ : v(x) = x , the multiplicative unitary Using the transformation v :]0, 1[→ R+ 1−x ˜ is transformed to a multiplicative unitary X on L2 (]0, 1[, dµ) given by W
1−x . (Xξ )(x, y) = ξ xy, y 1 − xy 1−x is studied in [12] and is shown to be essenThe transformation v(x, y) = xy, y 1−xy tially the only pentagonal transformation on ]0, 1[ of the form v(x, y) = (xy, u(x, y)) with u continuously differentiable. We can construct an even more singular multiplicative unitary Y on the Hilbert space 2 ˜ It is easy to check that l (Q∗+ ) which is formally given by the same formula as W. ∗ SY = [θes θer | 0 < r < s, r, s ∈ Q], where (eq ) denotes the obvious basis in l 2 (Q∗+ ). Instead of Q, we can as well take any countable subfield of R.
Example 4.6. Up to now, we considered in fact only one example: the ax + b group over an arbitrary l.c. ring with two natural subgroups. If A = R, this example is the easiest non-trivial example of a matched pair of real Lie groups (and the only non-trivial example when G is of dimension 2). But, there are a lot of examples of matched pairs of algebraic groups, even in low dimensions, see e.g. [20]. Often, the field R or C can
Non-Semi-Regular Quantum Groups
161
be replaced by any l.c. ring A with A \ A of measure zero and A = {0}. Fix such a l.c. ring A. We give two examples. First, define G = GL2 (A), the two by two matrices over A. Take
A A 1 0 G1 = . and G2 = A A 0 1 Observe that G1 and G2 are conjugated by 01 01 . Next, we give an example in the same spirit as [19], Sect. 5.4. Fix q ∈ A and suppose q to be central. Define on Bq := A4 the structure of a l.c. ring by putting
ab a b a + a b + b + = cdq c d q c + c d + d q
ab a b aa + qbc ab + bd · = . cdq c d q ca + dc dd + qcb q a b a qb Defining πq ac db q = qc d ⊕ c d , we identify Bq with the closed subring of M2 (A) ⊕ M2 (A) consisting of the elements ac db ⊕ ca bd satisfying b = qb and c = qc . If q ∈ A , the first component of πq gives an isomorphism Bq ∼ = M2 (A). Define Gq = (Bq ) and
ab 10 q q G1 = | a, d ∈ A , b ∈ A and G2 = |x∈A . 0dq x 1q
and
q Observe that G1 ∼ = G11 ∼ = the group of upper triangular matrices in GL2 (A) and q ∼ 1 ∼ G2 = G2 = (A, +) for all q ∈ A. For any q ∈ A, we get a bicrossed product l.c. quantum group (M, δ)q whose right regular representation Wq lives on the constant Hilbert space L2 G12 × G11 . It is easy to check that the application q → Wq is strongly∗ continuous. So, we have a continuous family of l.c. quantum groups. If q ∈ A , it is clear that (M, δ)q ∼ = (M, δ)1 , because then, the first component of πq is an isomorphism. On the other hand, if q = 0, we observe that G02 is a normal subgroup of G0 . So, G01 acts by automorphisms on (A, +) and G0 ∼ = G01 A. So, (M, δ)0 is commutative and isomorphic to (L∞ (H ), δH ), where H := G01 Aˆ and Aˆ is the Pontryagin dual of (A, +). In this precise sense, the l.c. quantum groups (M, δ)q are continuous deformations of the l.c. group H . If A is commutative, we can remain closer to [19], Sect 5.4 and quotient out the center of Gq consisting of the matrices a0 a0 q , for a ∈ A .
5. Concluding Remarks 5.1. Pentagonal transformations. Following [3], we call v : X × X → X × X a pentagonal transformation, when X is a (standard) measure space and v is a measure class isomorphism satisfying the pentagonal relation v23 ◦ v13 ◦ v12 = v12 ◦ v23 .
162
S. Baaj, G. Skandalis, S. Vaes
Associated to any pentagonal transformation v, we have a multiplicative unitary V on the Hilbert space L2 (X), defined by (V ξ )(x, y) = d(x, y)1/2 ξ(v(x, y)), where d is the right Radon-Nikodym derivative. Not all pentagonal transformations are nice: in Remark 4.5, we obtained a pentagonal transformation such that SV and SˆV are not even C∗ -algebras. In [3], there was given a natural sufficient condition for a pentagonal transformation to be good. As explained in the introduction, the result in [3] is incorrect as stated, but can be repaired as follows. Proposition 5.1. Let v be a pentagonal transformation on X. Introduce binary composition laws on X by writing v(x, y) = (x • y, x y). Suppose that the transformations φ(x, y) = (x • y, y) and η(x, y) = (x, x y) are measure class isomorphisms. Then, there exists a matched pair G1 , G2 of closed subgroups of a l.c. group G (in the sense of Definition 3.1), commuting actions on X of G2 on the left and G1 on the right, and a G2 × G1 -equivariant measurable map f : X → G, such that for almost all (x, y) ∈ X×X. v(x, y) = x·p1 p2 (f (x))−1 f (y) , p2 (f (x))−1 ·y We sketch the proof of this result, following [3] (there only appears an error in the proof of Proposition 3.4 a) of [3]). In order to find G1 , G2 and G, we write v −1 (x, y) = (x y, x ∗ y) and further, ψ (x, y) = (y ∗ x, y), w(a, b, c, d) = (a • (b c), d ∗ (b • c), c, d). One can check that φ and ψ are pentagonal transformations on X and w is a pentagonal transformation on X × X. Because the pentagonal transformations φ, ψ , w are all of the form (r, s) → (. . . , s), it follows from Lemme 2.1 in [3] that we can find l.c. groups G1 , G2 and G, right actions of Gi on X and of G on X × X, and equivariant measurable maps fi : X → Gi , F : X × X → G such that φ(x, y) = (x · f1 (y), y) , ψ (x, y) = (x · f2 (y), y) w(a, b, c, d) = ((a, b) · F (c, d), c, d).
and
v φ v −1 . One also verifies that w = ψ24 21 13 21 Denote by V the multiplicative unitary associated to v and by V 1 , V 2 and W the mul∗ V1 V V2 . tiplicative unitaries associated to φ, ψ and w, resp. So, we have W = V21 13 21 24 Hence, the group von Neumann algebras of G1 and G2 are von Neumann subalgebras of the group von Neumann algebra of G and so, G1 and G2 can be considered as closed subgroups of G. Under this identification, one has F (c, d) = f1 (c)f2 (d) almost everywhere. Because F is surjective (from a measure theoretic point of view), it follows that the complement of G1 G2 has measure zero. From the formula for W , we also get that L∞ (G) ⊂ L∞ (G1 ×G2 ) and we have to prove that we have an equality. Then, G1 ∩G2 = {e} and we have a matched pair G1 , G2 in G. Denote by − σ -weak the weak closure. Then,
− σ -weak ∗ 1 2 L∞ (G) = (ω ⊗ ι ⊗ ι)(V21 V13 V21 V24 ) | ω ∈ L(L2 (X × X))∗ 1 ∞ 2 = (ω ⊗ ι ⊗ ι) V13 (L (X) ⊗ 1)V (1 ⊗ L∞ (X)) 21 V24 | − σ -weak 2 . ω ∈ L(L (X × X))∗
Non-Semi-Regular Quantum Groups
163
If H, K ∈ L∞ (X), we observe that (H ⊗ 1)V (1 ⊗ K)V ∗ is the function (x, y) → H (x)K(x y). As the transformation (x, y) → (x, x y) is a measure class isomorphism, the weak closure of these functions is the whole of L∞ (X × X). So, 1 ∞ − σ -weak 2 L∞ (G) = (ω ⊗ ι ⊗ ι) V13 L (X)1 V21 V24 | ω ∈ L(L2 (X × X))∗ − σ -weak 1 ∞ 2 | ω ∈ L(L2 (X × X))∗ = (ω ⊗ ι ⊗ ι) V13 (SˆV 1 L (X))1 V21 V24 − σ -weak 2 = (L∞ (G1 ) ⊗ 1)(ω ⊗ ι ⊗ ι)(V24 ) | ω ∈ L L2 (X × X) ∗ = L∞ (G1 × G2 ) ,
where we used that SˆV 1 L∞ (X) is weakly dense in L(L2 (X)). We conclude that we indeed get a matched pair G1 , G2 in G. Because η (x, y) = (y x, y) is a measure class isomorphism, we associate with
η η = η ψ and it a unitary operator B on L2 (X × X). One checks that ψ23 12 13 12 23 2 2 ∗ hence, B13 B12 V23 = V23 B12 . It follows that B is a representation of V 2 . Hence, B ∈ L(L2 (X)) ⊗ L∞ (G2 ). It follows that there exists a left action of G2 on X such that y x = f2 (y) x almost everywhere. Analogously, we can find a right action of G1 on X such that x f1 (y)−1 = x y. The left action of G2 and the right action of G1 commute and defining f (x) = f1 (x)f2 (x)−1 , we have a G2 × G1 -equivariant measurable map f : X → G. One can show that this map and the commuting actions of G1 and G2 satisfy the conclusion of the proposition. Remark 5.2. Combining Proposition 5.1 and Proposition 3.8, we observe that the multiplicative unitary V associated to a good pentagonal transformation v is precisely a covariant image of the regular representation of the associated bicrossed product. In particular, (SV , δV ) is always a bicrossed product l.c. quantum group. Remark 5.3. In the beginning of Sect. 3, we started with a matched pair G1 , G2 of closed subgroups of G and constructed with them a ∗ -automorphism τ : L∞ (G1 × G2 ) → op L∞ (G2 × G1 ) such that τ σ is a matching of (L∞ (G2 ), δ2 ) and L∞ (G1 ), δ1 with trivial cocycles, in the sense of [19], Def. 2.1. Suppose now that, conversely, τ is such that τ σ is a matching. Hence, τ is a faithful ∗ -homomorphism satisfying (τ ⊗ ι)(ι ⊗ τ )(δ1 ⊗ ι) = (ι ⊗ δ1 )τ
and
(ι ⊗ τ )(τ ⊗ ι)(ι ⊗ δ2 ) = (δ2 ⊗ ι)τ.
Defining α and β to be the restrictions of τ to L∞ (G1 ) and L∞ (G2 ) respectively, we get a left action of G2 on L∞ (G1 ) and a right action of G1 on L∞ (G2 ). Hence, we can write τ (F )(s, g) = F (αs (g), βg (s)) for almost all g, s and where (αs ) and (βg ) are actions of G2 , resp. G1 by measure class isomorphisms of G1 , resp. G2 . We can define with the same formula as in Definition 3.3, the multiplicative unitary W , which corresponds to the pentagonal transformation v(s, g, t, h) = (s, g αβg (s)−1 t (h), βg (s)−1 t, h) on X := G2 ×G1 . Corresponding to v, we have the mappings φ and η as above. It is clear that η is a measure class isomorphism. We prove that the same holds for φ. Because we have a pentagonal transformation, we can define the (measure theoretically) surjective mapping P (x, y) = x•y. Defining the measure class isomorphism uα (s, g) = (s, αs (g)) and the surjective mapping P1 (s, g, t, h) = (s, gh), one checks that uα P = P1 (uα ×uα ),
164
S. Baaj, G. Skandalis, S. Vaes
where we use the relation αs (gh) = αs (g)αβg (s) (h) for almost all s, g, h, which follows from (ι ⊗ δ1 )α = (τ ⊗ ι)(ι ⊗ α)δ1 . It follows that P = u−1 α P1 (uα × uα ) and so −1 φ 1 (u × u ), where φ 1 (g, h) = (gh, h). Hence, also φ is a measure φ = u−1 × u α α α 24 α class isomorphism and we can apply Proposition 5.1. We find a l.c. group G and going through the construction, we observe that we identify G1 and G2 with closed subgroups of G such that G1 , G2 is a matched pair and αs (g) = p1 (sg), βg (s) = p2 (sg). 5.2. Regularity and semi-regularity. We will give several characterizations of regularity and semi-regularity of l.c. quantum groups. In fact, it will become clear that it is very amazing that there indeed exist non-semi-regular l.c. quantum groups. Terminology 5.4. A l.c. quantum group is said to be regular, resp. semi-regular, if its right (or, equivalently, left) regular representation is a regular, resp. semi-regular multiplicative unitary. Let V be the right regular representation of a fixed l.c. quantum group (M, δ), as in Sect. 2. Denote by S its underlying C∗ -algebra as in Eq. (2.1). In the proof of Proposition 2.6, we introduced the modular conjugations J and Jˆ, that we will use extensively. They are anti-unitary operators on the Hilbert space H , which is the GNS-space of the invariant weights. Observe that J MJ = M , JˆMˆ Jˆ = Mˆ , ˆ = S. ˆ JˆS Jˆ = S and J SJ When we write K, we always mean K(H ). ˆ and this is From Proposition 2.6, we know that regularity is equivalent with K = [S S] not always the case. However, if we put slightly more, we do get K. ˆ = K. Lemma 5.5. We have [J SJ S S] Proof. From Eq. (2.2) and using the notation U = J Jˆ, we know that, up to a scalar (1 ⊗ U ) = (U ⊗ U )V ∗ (1 ⊗ U )V ∗ (U ⊗ 1)V ∗ . Applying ω ⊗ ι, it follows that K = (ω ⊗ ι) (U ⊗ U )V ∗ (1 ⊗ U )V ∗ (U ⊗ 1)V ∗ | ω ∈ L(H )∗ . Because K = [J SJ K S] and because V ∈ M(K ⊗ S), it follows that K = [J SJ (ω ⊗ ι)(V ∗ ) S | ω ∈ L(H )∗ ] = [J SJ Sˆ S]. Because [Sˆ S] is a C∗ -algebra, we are done.
Recall that V ∈ M(K ⊗ S) and S = [(ω ⊗ ι)(V ) | ω ∈ L(H )∗ ]. So, the following result is quite surprising. Proposition 5.6. The l.c. quantum group (M, δ) is semi-regular if and only if [(ω ⊗ ι)(V ∗ (1 ⊗ S)V ) | ω ∈ L(H )∗ ] ∩ S = {0}. If this is the case, we have S ⊂ [(ω ⊗ ι)(V ∗ (1 ⊗ S)V ) | ω ∈ L(H )∗ ]. The quantum group is regular if and only if the equality holds.
Non-Semi-Regular Quantum Groups
165
Proof. We first claim that in general [Sˆ J SJ (ω ⊗ ι)(V ∗ (1 ⊗ S)V ) | ω ∈ L(H )∗ ] = [Sˆ J SJ ]. We make the following calculation, where we use the pentagonal equation, Lemma 5.5 and the formula V ∗ = (J ⊗ Jˆ)V (J ⊗ Jˆ): [Sˆ J SJ (ω ⊗ ι)(V ∗ (1 ⊗ S)V ) | ω ∈ L(H )∗ ] ∗ (1 ⊗ J SJ S ⊗ 1)V12 | ω, µ ∈ L(H )∗ = (ω ⊗ ι ⊗ µ) V23 V12 ∗ ∗ = (ω ⊗ ι ⊗ µ) V13 V12 V23 (1 ⊗ J SJ S ⊗ 1)V12 | ω, µ ∈ L(H )∗ = (ω ⊗ ι)(V ∗ (1 ⊗ Sˆ J SJ S)V ) | ω ∈ L(H )∗ = Jˆ (ι ⊗ ω)(V (1 ⊗ K)V ∗ ) | ω ∈ L(H )∗ Jˆ = Jˆ C(V )C(V )∗ Jˆ = Sˆ J SJ , where, in the end, we used the calculation of C(V ) from the proof of Proposition 2.6. This proves our claim. Suppose that a = 0 and a ∈ [(ω ⊗ ι)(V ∗ (1 ⊗ S)V ) | ω ∈ L(H )∗ ] ∩ S. Take b ∈ S and x ∈ Sˆ such that xJ bJ a = 0. Combining our claim with Lemma 5.5, we see that xJ bJ a ∈ [Sˆ J SJ ] ∩ K. Hence, [Sˆ J SJ ] ∩ K = {0}. Because the C∗ -algebra [Sˆ J SJ ] acts irreducibly on H , it follows that K ⊂ [Sˆ J SJ ] and so, the l.c. quantum group is semi-regular. Conversely, suppose that the l.c. quantum group is semi-regular. Then, ∗ ∗ (ω ⊗ ι) V ∗ (1 ⊗ S)V | ω ∈ L(H )∗ = (µ ⊗ ω ⊗ ι) V23 V13 V23 | ω, µ ∈ L(H )∗ ∗ = (µ ⊗ ω ⊗ ι)(V23 V12 V23 ) | ω, µ ∈ L(H )∗ = (ω ⊗ ι)(V ∗ (S ⊗ 1)V ) | ω ∈ L(H )∗ = (ω ⊗ ι)(V ∗ (JˆSˆ Jˆ S ⊗ 1)V ) | ω ∈ L(H )∗ ⊃ (ω ⊗ ι)(V ∗ (K ⊗ 1)V ) | ω ∈ L(H )∗ = S. If the l.c. quantum group is regular, the ⊃ above is an equality. If, finally, [(ω ⊗ ι)(V ∗ (1 ⊗ S)V ) | ω ∈ L(H )∗ ] = S, it follows from our claim above and Lemma 5.5 that [Sˆ J SJ ] = [Sˆ J SJ S] = K. This precisely gives the regularity.
Another characterization of semi-regularity is in terms of coactions of l.c. quantum groups. Let B be a C∗ -algebra and α : B → M(S ⊗ B) a non-degenerate ∗ -homomorphism satisfying (ι ⊗ α)α = (δ ⊗ ι)α. We say that α is a coaction of S on B. A seemingly natural definition for the continuous elements under the coaction is T := [(ω ⊗ ι)α(B) | ω ∈ L(H )∗ ] .
166
S. Baaj, G. Skandalis, S. Vaes
Proposition 5.7. The l.c. quantum group (M, δ) is semi-regular if and only if the above closed linear space T of continuous elements is a C∗ -algebra for all coactions α of S on a C∗ -algebra B. Proof. Suppose first that (M, δ) is semi-regular. Then, T ⊃ (ω ⊗ ι)α y (µ ⊗ ι)α(x) | ω, µ ∈ L(H )∗ , x, y ∈ B ∗ = (µ ⊗ ω ⊗ ι) α(y)23 V12 α(x)13 V12 | ω, µ ∈ L(H )∗ , x, y ∈ B = (µ ⊗ ω ⊗ ι)(α(y)23 (K ⊗ 1)V (1 ⊗ K) 12 α(x)13 ) | ω, µ ∈ L(H )∗ , x, y ∈ B ⊃ [(µ ⊗ ω ⊗ ι)(α(y)23 (K ⊗ K ⊗ 1)α(x)13 ) | ω, µ ∈ L(H )∗ , x, y ∈ B] = [T T ]. Hence, T is an algebra. It is clear that T ∗ = T and so, T is a C∗ -algebra. If conversely, the space of continuous elements is always a C∗ -algebra, we define B := K(H ⊕ C). Consider the left regular representation W ∈ M(S ⊗ K) satisfying (δ ⊗ ι)(W ) = W13 W23 and put X := W ⊕ 1 ∈ M(S ⊗ B). Define α(x) = X ∗ (1 ⊗ x)X for all x ∈ B. Writing θξ for the obvious operator in K(C, H ) whenever ξ ∈ H , we observe that
[(ω ⊗ ι)(W ∗ (1 ⊗ K)W ) | ω ∈ L(H )∗ ] θξ T = θ∗ C | ξ, η ∈ H . η
So, because T is a
C∗ -algebra,
we get
K ⊂ [(ω ⊗ ι)(W ∗ (1 ⊗ K)W ) | ω ∈ L(H )∗ ]. Using the fact that W = (U ⊗ 1)V (U ∗ ⊗ 1), we conclude that ˆ K ⊂ U [(ι ⊗ ω)(V ∗ (1 ⊗ K)V ) | ω ∈ L(H )∗ ]U ∗ = U [C(V )∗ C(V )]U ∗ = [S S]. So, (M, δ) is semi-regular.
We conclude that the notion of continuous elements is problematic for coactions of non-semi-regular quantum groups. Hence, it is not surprising that it is equally problematic to define continuous coactions. There are at least two natural definitions. Let α : B → M(S ⊗ B) be a coaction. • We call α continuous in the weak sense when B = [(ω ⊗ ι)α(B) | ω ∈ L(H )∗ ]. • We call α continuous in the strong sense when S ⊗ B = [α(B)(S ⊗ 1)]. We have the following result. Proposition 5.8. Consider a coaction α : B → M(S ⊗ B) of S on B. If (M, δ) is regular, continuity in the weak sense implies continuity in the strong sense. If (M, δ) is semi-regular, but not regular, there exists a coaction α that is continuous in the weak sense, but not in the strong sense. Proof. If (M, δ) is regular, we know from [2] that [(K ⊗ 1)V (1 ⊗ S)] = K ⊗ S. So, if α is continuous in the weak sense, we get [α(B)(S ⊗ 1)] = [α((ω ⊗ ι)α(B))(S ⊗ 1) | ω ∈ L(H )∗ ] ∗ (K ⊗ S ⊗ 1)) | ω ∈ L(H )∗ = (ω ⊗ ι ⊗ ι)(V12 α(B)13 V12 = [(ω ⊗ ι ⊗ ι) (V12 α(B)13 (1 ⊗ S ⊗ 1)) | ω ∈ L(H )∗ ] = (ω ⊗ ι ⊗ ι) ((K ⊗ 1)V (1 ⊗ S))12 α(B)13 | ω ∈ L(H )∗ = S ⊗ [(ω ⊗ ι)α(B) | ω ∈ L(H )∗ ] = S ⊗ B .
Non-Semi-Regular Quantum Groups
167
Hence, α is continuous in the strong sense. Suppose now that (M, δ) is semi-regular, but not regular. Define
∗
x θξ W (1 ⊗ x)W W ∗ (1 ⊗ θξ ) [Sˆ S] θξ | η, ξ ∈ H with α B := = , (1 ⊗ θη∗ )W θη∗ λ 1⊗λ θη∗ C where W ∈ M(S ⊗ JˆSˆ Jˆ) is again the left regular representation. It is easy to verify that α is a coaction which is continuous in the weak sense, but not in the strong sense.
References 1. Baaj, S.: Repr´esentation r´eguli`ere du groupe quantique des d´eplacements de Woronowicz. Ast´erisque 232, 11–48 (1995) 2. Baaj, S., Skandalis, G.: Unitaires multiplicatifs et dualit´e pour les produits crois´es de C∗ -alg`ebres. Ann. Scient. Ec. Norm. Sup., 4e s´erie 26, 425–488 (1993) 3. Baaj, S., Skandalis, G.: Transformations pentagonales. C.R. Acad. Sci., Paris, S´er. I, 327, 623–628 (1998) 4. Baaj, S., Vaes, S.: Double crossed products of locally compact quantum groups. Preprint 5. Bost, J.-B., Connes, A.: Hecke algebras, type III factors and phase transitions with spontaneous symmetry breaking in number theory. Selecta Math. (N.S.) 1, 411–457 (1995) 6. Desmedt, P., Quaegebeur, J., Vaes, S.: Amenability and the bicrossed product construction. Ill. J. Math., to appear 7. Enock, M., Schwartz, J.-M.: Kac Algebras and Duality of Locally Compact Groups. Berlin–Heidelberg–New York: Springer-Verlag, 1992 8. Fack, T.: Quelques remarques sur le spectre des C∗ -alg`ebres de feuilletages. Bull. Soc. Math. Belg. S´er. B 36, 113–129 (1984) 9. Kac, G.I.: Ring groups and the principle of duality, I, II. Trans. Moscow Math. Soc., 291–339 (1963), 94–126 (1965) 10. Kac, G.I.: Extensions of groups to ring groups. Math. USSR Sbornik 5, 451–474 (1968) 11. Kac, G.I., Vainerman, L.I.: Nonunimodular ring groups and Hopf-von Neumann algebras. Math. USSR. Sbornik 23, 185–214 (1974) 12. Kashaev, R.M., Sergeev, S.M.: On pentagon, ten-term, and tetrahedron relations. Commun. Math. Phys. 195, 309–319 (1998) 13. Kustermans, J.: Locally compact quantum groups in the universal setting. Internat. J. Math. 12, 289–338 (2001) 14. Kustermans, J., Vaes, S.: Locally compact quantum groups. Ann. Scient. Ec. Norm. Sup., 4e s´erie 33, 837–934 (2000) 15. Kustermans, J., Vaes, S.: Locally compact quantum groups in the von Neumann algebraic setting. Math. Scand., to appear 16. Majid, S.: Hopf-von Neumann algebra bicrossproducts, Kac algebra bicrossproducts, and the classical Yang-Baxter equations. J. Funct. Anal. 95, 291–319 (1991) 17. Rieffel, M. A.: Strong Morita equivalence of certain transformation group C∗ -algebras. Math. Ann. 222, 71–97 (1976) 18. Takeuchi, M.: Matched pairs of groups and bismash products of Hopf algebras. Comm. Algebra 9, 841–882 (1981) 19. Vaes, S., Vainerman, L.: Extensions of locally compact quantum groups and the bicrossed product construction. Adv. Math., to appear 20. Vaes, S., Vainerman, L.: On low-dimensional locally compact quantum groups. In: Locally Compact Quantum Groups and Groupoids. Proceedings of the Meeting of Theoretical Physicists and Mathematicians, Strasbourg, February 21–23, 2002., Ed. L. Vainerman, IRMA Lectures on Mathematics and Mathematical Physics. Berlin, New York: Walter de Gruyter, 2003, pp. 127–187 21. Woronowicz, S.L.: From multiplicative unitaries to quantum groups. Int. J. Math. 7(1), 127–149 (1996) 22. Woronowicz, S.L.: Twisted SU (2) group. An example of a non-commutative differential calculus. Publ. RIMS, Kyoto University 23, 117–181 (1987) Communicated by A. Connes
Commun. Math. Phys. 235, 169–189 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0788-y
Communications in
Mathematical Physics
Ergodic Properties of a Model Related to Disordered Quantum Anharmonic Crystals Francesco Fidaleo, Carlangelo Liverani Dipartimento di Matematica, II Universit`a di Roma (Tor Vergata), Via della Ricerca Scientifica, 00133 Roma, Italy. E-mail:
[email protected];
[email protected] Received: 14 January 2002 / Accepted: 14 October 2002 Published online: 10 February 2003 – © Springer-Verlag 2003
Abstract: We introduce a model suggested by disordered anharmonic quantum crystals. We then investigate in detail the ergodic properties exhibited by such a model.
1. Introduction Some recent results [7, 11] have emphasized the possibility to use perturbation theory to investigate the ergodic properties of harmonic crystals in the presence of a small local non-harmonicity. As a result, the existence of non-KMS invariant states has been proven, stable with respect to a large class of local perturbations [7]. It is then inevitable to address the question if such a method could be applied to a translational invariant anharmonic crystal, at least in the regime when the anharmonicity per site is very small. The previously mentioned results rested on some weaker version of L1 asymptotic abelianess for the unperturbed crystal, whereas the above generalization would require some sort of L1 space-time asymptotic abelianess for the corresponding harmonic crystal. Unfortunately, harmonic crystals seem not to enjoy such a strong ergodic property. Thus, any investigation in such a direction would require non-perturbative techniques. This state of affairs could support the point of view that the KMS are the only stable states with respect to “reasonable” perturbations. Conversely, in this paper we show that there exist models (see Sect. 2 for more details) for which the above mentioned program can be carried out. The physical relevance of such models is debatable, so the issue of the existence or not of stable non-KMS invariant states in physical models cannot be considered solved. In fact, we do not claim that these models describe the precise quantitative behaviour of a disordered anharmonic crystal, yet it is possible that such models capture the qualitative behaviour of physically relevant systems. Hence the present work spells a word of caution on the belief that Gibbs (or KMS) states are the only “stable” states and, therefore, the only physically relevant ones ([9]).
170
F. Fidaleo, C. Liverani
The proposed model is inspired by the work done on disordered systems. Such systems enjoy rather strong time mixing, at least limited to the average expectation of some class of observables (e.g. quadratic polynomials of the positions and momenta, such as two point functions). The main point of the model is that it has a non-linear dynamics, yet its ergodic properties can be investigated in some detail. As far as we know, this is one of the first extended non-linear systems for which time mixing properties can be investigated at all. The plan of the paper is as follows. Section 2 contains a preliminary description of the model and several explanatory comments. Section 3 collects the needed properties on random Hamiltonians. Section 4 defines precisely the algebra (that is the observables) on which the linear model is constructed. Section 5 contains the precise description of the non-linear perturbation. In Sect. 6 we enlarge the algebra of Sect. 4 in order to properly define a non-linear dynamics. In Sect. 7 we discuss its ergodic properties. The paper is supplemented by two appendices. Some needed facts about Anderson localization not present in the literature, are proven in Appendix A. Appendix B contains a needed estimate related to a sort of L1 space–time asymptotic abelianess of the linear system. 2. The Model Let us consider the Hamiltonian 1 H (q, p) = V∗ (qi − qj ), (pi2 + κi qi2 ) + 2 d
(2.1)
|i−j |=1
i∈Z
where the mass of all particles is equal to one, and the Planck constant = 1. In the above formula, the nearest neighbour interaction, described by V∗ (x) :=
1 2 x + V (x) 2
(2.2)
with x ∈ Rd , is split into the quadratic part, and the non-quadratic part V (x) describing the non-linear perturbation. All couples (qi , pi ) are d-dimensional canonical variables describing the state of a particle with the label i whose rest position, corresponding to qi = 0, is precisely the site i of Zd . Hence, the expression (2.1) represents the Hamilton function of an infinitely extended anharmonic crystal with a nearest neighbour interaction. Actually, the Hamiltonian (2.1) has a purely formal meaning even if the function V is a bounded one. Yet the corresponding Hamiltonian flow can be rigorously defined. We will consider a disordered system associated to the Hamiltonian (2.1). By disordered we mean that all the elastic constants {κi }i∈Zd appearing in the Hamiltonian (2.1), are random variables defined on a standard space (X, P ). Namely, disregarding for the moment the non-quadratic part of (2.1), let us consider the block (infinite) matrix 0 I A= , (2.3) −K 0 where is the d-dimensional discrete Laplacian given by (v)i = |i−j |=1 vj − 2dvi , and K = K(ξ ) is a random multiplication operator given, for generic ξ ∈ X, by (K(ξ )v)i := κi (ξ )vi .
Model Related to Disordered Quantum Crystals
171
The block matrix A(ξ ) acts componentwise on the double sequence of vectors {(qi , pi )}i∈Zd , each of which has d components, and is the generator of the linear flow at a fixed environment {κi (ξ )}i∈Zd . We assume that all κi are equidistributed independent random variables with law p, the support of p being a bounded closed interval of the positive real line. In such a way, we get for the sample space X ≡ i∈Zd supp(p), and for the law P ≡ i∈Zd p.1 Such a system can be seen as a collection of Hamiltonian systems for each fixed environment (that is for almost all the realizations of the random variables), but can also be seen as a single Hamiltonian system, this latter point of view being more profitable for us. For the reader’s convenience, let us briefly describe it in the simplified case in which the system has only finitely many degrees of freedom and the non-linear part is absent. Accordingly, we start with a classical disordered system described by the random Hamiltonian 1 Hξ (q, p) := (qi − qj )2 pi2 + κi (ξ )qi2 + 2 |i−j |=1
i∈
which takes into account the quadratic part of (2.1) and where, for simplicity, the index set is a finite one (i.e. a box in the standard lattice Zd ). In order to describe the dynamics of the (classical) associated disordered system, we consider the (infinite dimensional) sympletic space (V, ) consisting of an appropriate linear (sub)space V of measurable functions on X with values in the phase-space R2d|| equipped with a sympletic form of the type ((q, p), (Q, P )) := µ(dξ )σ ((q(ξ ), p(ξ )), (Q(ξ ), P (ξ ))) . (2.4) X
Here σ is the usual 2-form on the phase-space R2d|| , and µ is any measure on X in the measure-class determined by P . ˙ ), p(ξ ˙ )) = A(ξ )(q(ξ ), p(ξ )) are then determined by The equations of motion2 (q(ξ the Hamiltonian flow Tt∗ associated to the functional Hamiltonian H(q, p) := µ(dξ )Hξ (q(ξ ), p(ξ )) . (2.5) X
Therefore, the Hamiltonian flow determines uniquely, almost everywhere, the timeevolution at fixed environment. The dynamics of the random model is thus perfectly described, at least in relation to its linear part, by the above infinite dimensional Hamiltonian system. In order to have a quantum description of our random medium, one must follow the standard procedure of quantization, yet there exist different implementations. Our choice is to quantize the sympletic space (V, ) by specifying the measure in Formulae (2.4), (2.5) as the probability P describing the randomness. In such a way, our quantum system is described by a genuine CCR-algebra: the commutation function between (linear combinations of) fields and conjugate momenta is a multiple of the identity operator determined by the sympletic form (2.4), see Sect. 4 for details. 1 One could merely suppose that all κ are random variables on a common probability space (X, P ) on i which the lattice translations act ergodically, see e.g. [10 and 13] for a recent result on the two dimensional lattice. 2 See (2.3) for the description of the generator A.
172
F. Fidaleo, C. Liverani
This approach differs from the alternative one consisting of first quantizing at fixed environment and then studying the collection of the resulting systems. The algebra of observables in this case can be taken to be the “product” (more precisely a direct integral) of the CCR-algebras corresponding to each fixed environment. This is a different algebra than the previous one, in fact it has a non-trivial centre and it is not a CCR.3 To ensure the ergodic properties we are interested in, we choose, for simplicity, a rather small space V: just the smallest one containing the constant functions and invariant with respect to the linear evolution4 (see Definition 4.1). Clearly, this limits severely the possible observables and it could certainly be improved, yet it cannot be enlarged to contain observables related to a single environment, as the systems at fixed environment have poor ergodic properties (see Sect. 7). In order to put forward a non-linear model for which it is possible to investigate the ergodic properties by a perturbative approach, we must select a class of invariant states. In fact, we consider only quasi-free states (see Sect. 4). Such quasi-free states are defined on the Weyl operators as ω(W (v)) = eE(− 2 B(v,v)) , 1
(2.6)
where B is a random two-point function which uniquely determines the quasi-free state, and E denotes the average w.r.t. the randomness. Other typical possibilities could be to consider the quenched or annealed states.5 For example, the quenched state can be defined in the environment-by-environment scheme as6 1
ωq (W (v)) = E e− 2 B(v,v) .
(2.7)
Notice that the quenched state is not quasi-free. To compare the two states one must consider them on some common subalgebra, in our case the algebra generated by the constant functions. On such a subalgebra, the states described by (2.6) are the truncated functionals associated to the quenched states given in (2.7), see [4], Sect. 5.2.3.7 3 To have a more concrete idea of the situation, consider the trivial case in which d = 1, consists of only one point and X is a two point space (that is, just a one dimensional harmonic oscillator with two possible random values of the spring). In this case the largest possible V is isomorphic to R4 . For such a choice the CCR in our quantization scheme is naturally represented on the Hilbert space L2 (R4 ). The generated von Neumann algebra is all B(L2 (R4 )) (the algebra of the bounded operators). The alternative quantization scheme yields the algebra B(L2 (R2 )) ⊕ B(L2 (R2 )). The non-trivial center consists of operators with two diagonal blocks proportional to the identity. 4 In the trivial example of footnote 3 this corresponds to all R4 , yet when X is not discrete the situation is more subtle. 5 The quenched state ω and the annealed one ω at inverse temperature β are defined as q a
Tr(e−βH A) ωq (A) = E Tr(e−βH )
;
ωa (A) =
E(Tr(e−βH A)) E(Tr(e−βH ))
when the r.h.s. exist. 6
,
⊕
With an abuse of notation, here we are denoting by W (v) the direct integral
[15], corresponding to the element {vξ } in the field of CCR-algebras. 7 In addition, their two point functions coincide even at ∂λ ∂µ ω(W (λv)W (µTt w))|λ=µ=0 = ∂λ ∂µ ωq (W (λv)W (µTt w))|λ=µ=0 .
P (dξ )W (vξ ), see X
different
times,
that
is
Model Related to Disordered Quantum Crystals
173
In order to treat the non-linear perturbation, we assume that the function V appearing in (2.2) is given by the Fourier transform V (x) := eiλ·x ν(dλ) (2.8) Rd
of a complex bounded Borel measure ν on Rd . The measure ν should satisfy the reality condition ν ◦ Is = ν¯ where the bar is the complex-conjugation and Is is the space-inversion. In addition, to give meaning to the quantized version of 2.8, we assume that it is
deterministic. Deterministic means that its quantized version will be given by PV := Rd W (λv)ν(dλ), where v is an appropriate constant function, see Sect. 5 for the precise definition. To simplify matters, we assume that ν is compactly supported into some box [−L, L]d . Further, we also suppose that the Borel measure ν satisfies the bound L2 |ν|(1) <
1 , K∗
(2.9)
where K∗ is the constant appearing in Lemma B.3. Remark that K∗ depends only on the unperturbed Hamiltonian H0 . Loosely stated, our main results can be summarized as follows. Each invariant state ω of the type (2.6) is mixing w.r.t. the linear dynamics (Theorem 4.3). The non-linear dynamics relative to an infinitely extended perturbation of type (5.3) is well defined (Sect. 5). Finally we show that, for each such ω, there exists a uniquely determined state ω∞ which is invariant and mixing w.r.t. the non linear dynamics (Theorem 7.3). 3. General Properties of Random Hamiltonians The results of the present paper rest on some preliminary properties, together with some assumptions on the random Hamiltonian that, for the reader’s convenience, are collected in this section. Let us consider a random Hamiltonian on 2 (Zd ) of the form H (ξ ) := − + K(ξ )
(3.1)
as that appearing in the infinite block matrix (2.3). It is well known that the following properties hold true: (a) For almost all ξ ∈ X, the spectrum of H (ξ ) is contained in [0, 4d] + supp(p); (b) if d = 1, or when d > 1 in cases of large disorder (that is when supp(p) ⊂ (k0 , +∞) where k0 is some critical value), almost all H (ξ ) have pure point non-discrete spectrum; (c) under the same conditions as those in (b), we have that the spectral measures 8 X P (dξ )δm |EH (ξ ) (dx)|δn satisfy, for each Borel set of the real line, P (dξ )|δm |EH (ξ ) ()|δn | ≤ Ae−B|m−n| (3.2) X
which is nothing else than the Anderson localization, see e.g. [1, 10]. 8
By {δn }n∈Zd we designate the standard base of 2 (Zd ).
174
F. Fidaleo, C. Liverani
In addition, we will need the following properties. (i) There exist Borel functions {ϕk }k∈Zd such that, for each Borel set ⊂ R, we have P (dξ )δm |EH (ξ ) ()|δn = ϕn−m (x)dx, (3.3) X
where dx is the Lebesgue measure on the real line. It is well-known that this condition is satisfied for one dimensional models, see [10], Theorem IX.2. Notice that if (3.3) is satisfied, it follows by polarization that all ϕk belong to L1 (R, dx) with |ϕk (x)|dx ≤ 8 . (3.4) R
(ii) All densities ϕk are absolutely continuous w.r.t. the Lebesgue measure with derivatives ϕk belonging to Lp (R, dx) for some 1 < p ≤ 2, and satisfying the bound |ϕk (x)|p dx ≤ C(|k|α + 1) (3.5) R
for positive numbers C, α. The above regularity, limited to the integrated density of states (i.e. the case (ii) with k = 0), was proven in [5] for one dimensional models when the density of the probability measure p is infinitely often differentiable. The result for arbitrary k is obtained in Appendix A. Remark 3.1. By the previous considerations there exist one dimensional random Hamiltonians H (ξ ) on 2 (Z) satisfying all properties listed above. We conclude with a result which can be considered as an extension of the Fubini Theorem to our situation. Its proof is a straightforward approximation argument. Lemma 3.2. Under condition (i), we get for each bounded Borel function F on the real line, P (dξ )δm |F (H (ξ ))|δn = F (x)ϕm−n (x)dx . X
4. CCR-Algebra and Quasi-Free States We start by considering, for a fixed > 0, the weighted space W made of sequences v such that e|m| |vm | < +∞ . m∈Zd
In the following, we will choose > h, where h is the constant appearing in Lemma B.1. From now on, we keep fixed the weighted space W ≡ L0 as above, L is the direct sum of d copies of L0 , that is L∼ = L0 ⊗ Cd .
(4.1)
Model Related to Disordered Quantum Crystals
175
Notice that the random operator A(ξ ) given by (2.3), is an equibounded operator on L0 ⊕ L0 for all ξ ∈ X. Our starting point is the symplectic phase-space L2 := L ⊕ L. If k ) }, where k = 1, 2 distinguishes the position v ∈ L2 , then it can be written as v = {(vm j from the momentum, j = 1, . . . , d corresponds to the d-components of the position and the momentum, and finally m ∈ Zd is the index relative to the site. Such a phase-space is naturally equipped with the norm k e|m| |(vm )j | < +∞ . (4.2) []v[] := j,k m∈Zd
The phase-space is also equipped with the complex-conjugation C0 , and the commutation function 0 iI θ0 (u, v) = u, v −iI 0 (see [2, 3]). Further, for each ξ ∈ X, let A(ξ ) be the random operator (2.3) on the phasespace (L2 , C0 , θ0 ). The associated one-parameter group of Bogoliubov automorphisms is given by ∗
Tt (ξ )v := etA(ξ ) v . The group Tt (ξ ) is easily computed ([6], Formula (1.2)) obtaining cos(H (ξ )1/2 t) −H (ξ )1/2 sin(H (ξ )1/2 t) . Tt (ξ ) = H (ξ )−1/2 sin(H (ξ )1/2 t) cos(H (ξ )1/2 t)
(4.3)
(4.4)
Definition 4.1. A deterministic variable v ≡ (v, τ ) ∈ L2 × R is a (equivalence class of) Borel function on X with values in L2 , uniquely determined by v(ξ ) := Tτ (ξ )v . The space D(L) will be the algebraic linear span of the deterministic variables.9 A deterministic variable v will be (with an abuse of notation) alsoan element of D(L), i.e. a linear combination of generators of D(L). Further, if v = i Tτi (ξ )vi we write (again with an abuse of notation) []vi [] . []v[] := i
The space D(L) spanned by the deterministic variables is a phase-space in the terminology of [2, 3], if one defines as C the usual complex-conjugation on functions, and the “commutation function” θ as 0 iI θ (u, v) := P (dξ ) u(ξ ), v(ξ ) . (4.5) −iI 0 X If v = (v, τ ), the map (4.3) again defines a one-parameter group of Bogoliubov automorphisms Tt on all of D(L) by Tt v(ξ ) := Tt+τ (ξ )v .
(4.6)
9 The main reasons for the above choice is that if one considers the space of all square integrable functions on X with value into L2 , then the dynamics given by (4.3) is quasi-periodic. Namely, the whole system does not exhibit any natural ergodic property, being the last however present in real models.
176
F. Fidaleo, C. Liverani
We assume the self dual CCR-algebra A(D(L), C, θ ) canonically associated to the phase-space (D(L), C, θ ), together with the time evolution (4.6) and a suitable class of its quasi-free states (Proposition 4.2), as our starting point for the construction of a non-linear model. The following proposition describes a class of suitable quasi-free states ([2, 3]) on the CCR. Proposition 4.2. Let F be a bounded Borel function on the real line satisfying, on a neighborhood of the common spectrum of almost all H (ξ ), (i) F (x) > 0, (ii) xF (x)2 ≥ 41 . Then the two-point function given by i I F (H (ξ )) 2 P (dξ ) u(ξ ), v(ξ ) BF (u, v) = − 2i I H (ξ )F (H (ξ )) X
(4.7)
leads to a quasi-free state on the self-dual CCR-algebra A(D(L), C, θ ) which is invariant for the one-parameter group of Bogoliubov automorphisms (4.3). Proof. The proof is quite standard (e.g., see [7], Prop. 4.2), taking into account Lemma 3.2, and is left to the reader. It is well known that the real elements D(LR ) of D(L) generate by exponentiation the C ∗ -algebra W of Weyl operators10 associated to the GNS representation of a (fixed) quasi-free state ωF as above, being ωF given by 1
ωF (W (u)) = e− 2 BF (u,u) . The commutation rule (4.5) translates on the Weyl operators as 1
W (u)W (v) = e− 2 θ(u,v) W (u + v) . Next, we fix a quasi-free state ω on the self-dual CCR-algebra A(D(L), C, θ ) as above, and consider the quantum dynamical system (QDS for short) (W, αt0 , ω) based, respectively, on the C ∗ -algebra generated by the Weyl operators of the GNS representation of A(D(L), C, θ ) relative to ω, the one-parameter group of automorphisms induced on W by the Bogoliubov transformations (4.6), and finally the invariant state ω, see [2, 3].11 As the state ω is kept fixed during our analysis, we suppress all the indices relative to ω. Some of the most relevant ergodic properties of the QDS (W, αt0 , ω) are summarized in the following. 10 It is unclear if θ is nondegenerate on D(L), that is if W is a simple C ∗ -algebra. This fact does not affect our analysis. 11 Let (A, α , ω) be a triple consisting of a C ∗ -algebra A, a one-parameter group of automorphisms t of A, and a state on A respectively. Consider the GNS representation (πω , Hω , ω ) relative to the state ω. By a quantum dynamical system we mean that, for A ∈ A, all the maps t → αt A are continuous in the s-topology ([14], E 5.5) generated by the folium Fω . The last condition is equivalent to the fact that all the maps t → πω (αt A) are continuous w.r.t. the strong operator topology of B(Hω ). If the state ω is invariant w.r.t. {αt }, such a continuity condition assures that the unitary group {Uω (t)} implementing the dynamics on Hω in the Heisenberg picture, is continuous w.r.t. the strong operator topology. Then, by Stone Theorem, the (selfadjoint, possibly unbounded) Hamilton operator Hω is well-defined on Hω .
Model Related to Disordered Quantum Crystals
177
Theorem 4.3. Let (W, αt0 , ω) be a QDS as above. Then (W, αt0 , ω) is asymptotically abelian, and mixing as well. Proof. We briefly sketch the proof which is quite similar to the corresponding ones in [7]. Let u ≡ (u, τ ) and u˜ ≡ (u, ˜ τ˜ ) be deterministic variables. One computes 1 ˜ = 2 sin θ (u, Tt u) ˜ , W (u), W (Tt u) 2i 1
˜ ˜ ˜ −t u)) F (u,Tt u)+B F (u,T ˜ = ω(W (u))ω(W (u))e ˜ − 2 (θ(u,Tt u)+B ω(W (u)W (Tt u)) ,
where θ and BF are the two-point functions given in (4.5) and (4.7), respectively. By Lemma 3.2, Formulae 3.4 and 4.4, it is straightforward to see after an elementary change ˜ as well as BF (u, Tt u) ˜ are Fourier transforms of L1 -functions. of variables, that θ (u, Tt u) The proof follows by the Riemann-Lebesgue Lemma. 5. The Non-Linear Dynamics In order to define a quantum evolution related to the non-linear Hamiltonian (2.1), we start by defining, for i ∈ Zd and k = 1, 2, a vector ei,k made of d deterministic variables as ei,k := (([i, k, 1], 0), . . . , ([i, k, d], 0)),
(5.1)
where ([i0 , k0 , j0 ]ki )j = δi,i0 δj,j0 δk,k0 . We recall that by definition (W, αt0 , ω) is covariantly represented on the separable Hilbert space H,12 where the linear dynamics αt0 is generated (in the Heisenberg picture) by a strongly continuous one parameter unitary group αt0 A = Ut AUt−1 ,
(5.2)
and the invariant state ω is implemented by an invariant vector ∈ H. In such a way, the W ∗ -dynamical system (W , αt0 , ω ) is automatically singled out if the dynamics αt0 is defined by the natural extension of (5.2) to all of W . The state ω is the vector state on W given by . We want to construct the dynamics associated to the perturbation of the free Hamiltonian given by the formal sum P := Vm , (5.3) m∈Zd
where Vm is given by Vm := 12
d |m−n|=1 R
ν(dλ)W (λ · (em,1 − en,1 )) .
(5.4)
This can be viewed by considering the self-dual algebra A(R 2 , C, θ) based on the vector-valued := L2 (X, P ; 2 (Zd ) ⊗ Cd ). In such a way, A(D(L), C, θ ) ⊂ A(R 2 , C, θ) and the assertion follows as in Remark 6.3 of [2].
L2 -space R
178
F. Fidaleo, C. Liverani
In Formula (5.4), x · y is the standard Euclidean inner product in Rd , the Borel measure ν is sufficiently “small and regular”, and finally the integral is understood in the weak (or equivalently strong) operator topology. More precisely, we restrict ourselves to the measure ν satisfying all conditions listed in Sect. 2, and in particular the bound (2.9) with K∗ the same positive constant as that appearing in Lemma B.3. To construct the dynamics we use the usual spatial cut-off. Namely, for each bounded region ⊂ Zd , let us consider Vm , (5.5) P = m∈
where Vm is given by (5.4). Then, P is a well-defined element of W ([7], p. 978). The perturbation αt of the linear dynamics αt0 , is defined on all of W . It is the unique solution of the integral equation t 0 0 ds αs [P , αt−s A] , (5.6) αt A = αt A + i 0
see [4], Prop. 5.4.1. Let v be a deterministic variable, for i ∈ Zd we define the vector-valued function γvi as 1 P (dξ )ei,2 , Tt v . (5.7) γvi (t) := 2 X Consider for j = 1, . . . , d, k = 1, 2, and m, n ∈ Zd , the elementary deterministic variables δj · en,k (as usual δj is the delta-function at the place j in Cd ), defined taking into account (5.1), together with the vector-valued functions {γδmj ·en,k } given according to (5.7). Then, it is interesting to note that for v = {(vl , τl )}, γvm can be expressed in the following simple form: (vl kn )j γδmj ·en,k (t + τl ) , (5.8) γvm (t) = j,k,l,n
where the above series is absolutely convergent. Let β = (m, n) be made of an ordered pair of elements of Zd . We define the set IN := {(m, n) ∈ Zd × Zd | |m| ≤ N ; |m − n| = 1}. The set I∞ will be given by I∞ := {(m, n) ∈ Zd × Zd | |m − n| = 1}. Again, taking into account (5.1) and (5.7), we set eβ := em,1 − en,1 , β
together with the vector-valued function γv (t) := γvm (t) − γvn (t) and β
gβ (v, t, λ) = 2 sin(λ · γv (t)) .
(5.9)
The computation of the commutator in (5.6) is the starting point of our analysis. We get [Vm , αt0 W (v)] = −i
β=(m,n) |n−m|=1
Rd
ν(dλ)W (λ · eβ + Tt v)gβ (v, t, λ) ,
(5.10)
Model Related to Disordered Quantum Crystals
179
where the γvi are the vector-valued functions defined by (5.7). Next, we define t n := (tn , . . . , t1 ), λn := (λn , . . . , λ1 ), and β n := (βn , . . . , β1 ). Taking into account the analogous formulae in Sect. 5.2 of [7], we let wn (v, t n , λn , β n ) :=
n
λk · T−tk eβk + v ,
k=1
Gn (v, t n , λn , β n ) := gβn (v, tn , λn )
n−1
gβk Ttk+1
n
(5.11)
λj · T−tj eβj + v , tk − tk+1 , λk ,
j =k+1
k=1
and we consider the series Xt (v) = W (Tt v) +
∞
t
t
dtn
0 n n=1 β n ∈I∞
t
dtn−1 · · ·
tn
dt1 t2
ν(dλ1 ) . . . ν(dλn )W Tt wn (v, t n , λn , β n ) Gn (v, t n , λn , β n ) .
Rnd
(5.12)
Theorem 5.1. The series given in (5.12) is absolutely norm-summable, uniformly for t ∈ R. Moreover, it can be obtained as the following pointwise norm-limit, uniform in time: Xt (v) = lim αtN W (v),
(5.13)
N ↑Zd
where αtN W (v) is the unique solution of the integral equation (5.6) associated to the perturbation P in (5.5). Proof. We can compute as in [7], the perturbed dynamics on W associated to (5.5), taking into account (5.10). We obtain αtN W (v) = W (Tt v) +
n=1 β n ∈IN
×
∞
Rnd
t
t
dtn 0
tn
dtn−1 · · ·
t
dt1 t2
ν(dλ1 ) . . . ν(dλn )W Tt wn (v, t n , λn , β n ) Gn (v, t n , λn , β n ),
where the series is a well-defined converging one by Lemma B.3.13 The proof follows again by Lemma B.3 as the r.h.s. of the above series converges in norm to the corresponding one of (5.12). The results of the last theorem, and more precisely (5.13), allow us to define the series (5.12) as describing, at least for the Weyl operators, the non-linear dynamics associated to the whole perturbation (5.3). 13 The finite sums are well-defined as all integrands in the r.h.s. of (5.12) are Borel functions, see Footnote 13 of [7].
180
F. Fidaleo, C. Liverani
6. The C ∗ -Algebra of Physical Observables In the previous section we saw that the C ∗ -algebra W generated by the Weyl operators, cannot be invariant under the time evolution (5.12). On the other hand, W is most likely too large for our purposes, and the dynamics on it may very well have poor ergodic properties. We tackle this problem by enlarging W enough to accommodate the non-linear dynamics, but not as much as to spoil the ergodic properties we are interested in. We start by introducing the set Fm := {h ≡ {(hi , ti )} | hi : Rm → L2R , ti : Rm → R, (hi , ti ) = (0, 0) for all but finitely many i ∈ Z}. Clearly, h(x) = i Tti (x) (ξ )hi (x) ∈ D(LR ) is interpreted as a real deterministic variable. We then define the triples m := (m, h, µ), where m ∈ N, h ∈ Fm is made of Borel functions, and finally µ is a “suitable” complex measure on Rm . More precisely, given the set δ[]h(x)[] e |µ|(dx) < +∞ , δ > 0 , M := (m, h, µ) Rm
we can define, for each m = (m, h, µ) ∈ M, W (h(x))µ(dx), W (m) = Rm
where the integrals are meant w.r.t the strong (or, equivalently weak) operator topology of B(H). Next, we consider the linear set of operators M := span {W (m) | m ∈ M} . It is easy to check that M is a ∗-subalgebra of W . We define M as the C ∗ -algebra generated by M. Notice that W ⊂ M ⊂ W . Remark 6.1. The QDS (M, αt0 , ω) is asymptotically abelian, and mixing as well. The last results easily follow by Theorem 4.3 taking into account the definition of M. Let, for m ∈ M, t ∞ t W (Tt h(x))µ(dx) + dtn · · · dt1 Xt (m) = Rm
×
Rm+nd
0 n n=1 β n ∈I∞
t2
µ(dx)ν(dλ1 ) . . . ν(dλn )W Tt wn (h(x), t n , λn , β n )
× Gn (h(x), t n , λn , β n ) .
(6.1)
Taking into account all the properties relative to the measures ν(dλ) and µ(dx), Lemma B.3 tells us that the series (6.1) is absolutely norm-summable, uniformly in time as 1 −1 Xt (m) ≤ eaL []h(x)[] |µ|(dx) . 2 1 − K∗ L |ν|(1) Rm
Model Related to Disordered Quantum Crystals
181
Theorem 6.2. The series (6.1) uniquely defines a one-parameter group of automorphisms {αt }t∈R of M. Moreover, such a group of automorphisms can be obtained for A ∈ M, by the following pointwise norm-limit αt A = lim αtN A, N ↑Zd
where αtN A is the unique solution of (5.6). Proof. We start by noticing that, if m ∈ M, then all terms of the series (6.1) belong to M. Hence, αtN (M) ⊂ M as the series W (Tt h(x))µ(dx) αtN W (m) ≡ Rm t t ∞ t + dtn dtn−1 · · · dt1 µ(dx)ν(dλ1 ) . . . ν(dλn ) n n=1 β n ∈I
0 N
tn
t2
Rm+nd
× W Tt wn (h(x), t n , λn , β n ) Gn (h(x), t n , λn , β n ) , made of elements which generate M algebraically, converges in norm, so defines an element of M. Further, by Lemma B.3 Xt (m) − αt N W (m) ∞ n ≤ d t |µ|(dx)|ν|(dλ1 ) . . . |ν|(dλn )|Gn (h(x), t n , λn , β n )| n \I n n=1 β n ∈I∞
nt
Rm+nd
N
−→ 0 as N increases in order to exhaust all of Zd . Then on M, αtN converges pointwise in norm, to a group of automorphisms which, on a single generator W (m) of M, coincides with Xt (m). To conclude, we mention without proof the following Theorem 6.3. For every real deterministic variable v and each t ∈ R, the non-linear evolution αt W (v) is the unique solution of integral equation αt W (v) = W (Tt v) + ds ν(dλ) αt−s (W (λ · eβ + Ts v))gβ (v, s, λ). [0,t]×Rd
β∈I∞
7. Ergodic Properties of the Non-Linear Dynamics The investigation of ergodic properties of the non-linear system is possible as we are able to control the Dyson expansion uniformly in time, see Lemma B.3. 0 α A, it is immediate to show that, by We start by noticing that if we define t A := α−t t Theorem 6.2, Formula (6.1) gives the explicit expression for t on (topological) generators of M. Accordingly, the forward Møller morphism (the backward Møller morphism
182
F. Fidaleo, C. Liverani
− is constructed analogously) + A := limt→+∞ t A exists as a pointwise norm-limit and is given by + W (m) ∞ = W (h(x))µ(dx) + Rm
×
Rm+nd
n=1
n β n ∈I∞
+∞
dtn · · ·
0
+∞
dt1 t2
µ(dx)ν(dλ1 ) . . . ν(dλn )W wn (h(x), t n , λn , β n ) Gn (h(x), t n , λn , β n ) . (7.1)
In order to obtain on M invariant states w.r.t. the perturbed dynamics, we define for the real deterministic variable v, (v)) , n = 0, ω(W
±∞
±∞
±∞ ± dt dt · · · dt ν(dλ ) . . . ν(dλ ) n n−1 1 n an,β n (v) := 0 tn t2 1 Rnd ×ω(W w (v, t n , λn , β n ) )G (v, t n , λn , β n ) , n > 0, n n together with ± an,β n (m) :=
Rm
± an,β n (h(x))µ(dx)
for each m = (m, h, µ). It is straightforward to check that, for m ∈ M, the formula ω± (W (m)) :=
+∞ n=1
± an,β n (m)
(7.2)
n β n ∈I∞
defines (a priori different) states on M which are invariant w.r.t. the non-linear dynamics. Such states can be obtained by the pointwise limit ω± (A) = lim ω(αt A) . t→±∞
Indeed, as the unit ball of M∗ is ∗-weakly compact, the nets {ω ◦ αt }t∈R± have cluster points. But such nets converge to the r.h.s. of (7.2) on the total set {W (m) | m ∈ M}. Notice that, by definition, ω+ (A) = ω(+ A) for each element A of M. The following lemma is needed in the sequel. Lemma 7.1. For each A ∈ M we have lim ω± (αt0 A) = ω(A) .
t→+∞
(7.3)
Model Related to Disordered Quantum Crystals
183
Proof. By a standard approximation trick, it is enough to prove the assertion for each A = W (m). By Lemma B.3 it is enough to check that, for each n > 0 and v a real ± deterministic variable, limt→+∞ an,β n (Tt v) = 0. The last claim follows as for all the functions gβ given in (5.9), we get limt→+∞ gβ (Tt v, τ, λ) = 0 for all fixed τ , λ, taking into account Lemma B.2. Now we introduce the forward basin of attraction B of ω w.r.t. the unperturbed dynamics, B := ϕ ∈ W∗ lim ϕ(αt0 A) = ω(A) , A ∈ M . t→+∞
Lemma 7.1 tells us that {ω+ , ω− } ⊂ B. Proposition 7.2. Let ϕ ∈ B. Then lim ϕ(αt A) = ω+ (A) .
t→+∞
As an immediate consequence ω+ = ω− . Proof. For A ∈ M we compute by (7.3), |ϕ(αt A) − ω+ (A)| ≡ |ϕ(αt0 t A) − ω+ (A)| ≤ t A − + A + |ϕ(αt0 + A) − ω(+ A)| . The assertion follows as the last quantities go to zero as t increases. Further, as ω− ∈ B by Lemma 7.1, and it is also invariant w.r.t. the non-linear dynamics by construction, we have by the first assertion, ω− (A) = ω− (αt A) = lim ω− (αt A) = ω+ (A) t→+∞
and the proof is complete.
Setting ω∞ := ω+ ≡ ω− , we can conclude by Proposition 7.2 that ω∞ is the only state in B which is invariant w.r.t. the non-linear dynamics. We remark also that, by a direct inspection of (7.2), the triple (M, αt , ω∞ ) defines a QDS according to our definition, see Footnote 11. Finally, we are ready to prove our last announced theorem. Theorem 7.3. The QDS (M, αt , ω∞ ) is asymptotically abelian, and mixing as well. Proof. Recall that the QDS (M, αt0 , ω) is asymptotically abelian, and mixing as well, see Remark 6.1. Next, for A, B ∈ M we get lim [A, αt B] = lim [A, αt0 (t B − + B)] + lim [A, αt0 (+ B)] = 0
t→+∞
t→+∞
t→+∞
by the asymptotic abelianess of (M, αt0 ), taking into account that t B → + B in norm.
184
F. Fidaleo, C. Liverani
The proof of mixing follows by the analogous property of (M, αt0 , ω). Namely, lim ω∞ (Aαt B) = lim
t→+∞
lim ω(αs (A)αs+t (B))
t→+∞ s→+∞
= lim
0 lim ω(αs0 (s A)αs+t (s+t B))
= lim
lim ω(s Aαt0 (s+t B))
t→+∞ s→+∞ t→+∞ s→+∞
= lim ω(+ Aαt0 (+ B)) t→+∞
= ω(+ A)ω(+ B) = ω∞ (A)ω∞ (B) . The theorem is proven as the backward asymptotic abelianess and the backward mixing follow analogously. Appendix A In the present appendix we show that there exist one dimensional models satisfying the inequality in (3.5) for the densities {ϕk } given in (3.3). We restrict our analysis to the case when the law p is absolutely continuous w.r.t. the Lebesgue measure with infinitely often differentiable density f ∈ D(0, +∞). We refer the reader to [5] for the basic definitions and results about the supersymmetric replica trick. We start by considering, for a positive integer k, l, the interval Ik,l ⊂ Z given by k k+1 +l , +l , Ik,l := − 2 2 together with the associated finite volume random Hamiltonian HIk,l on the probability space (XIk,l , PIk,l ) recovered from (3.1). The main object of our analysis is, for (z) > 0, the Green functions PIk,l (dξ )δ # k $ |(HIk,l (ξ ) − z)−1 |δ# k+1 $ Gk,l (z) := −
XIk,l
2
2
together with their limit Gk (z) as l goes to infinity. Let h be the Fourier transform of the law p(dx) appearing in the definition of the random Hamiltonian H (ξ ), we set β(x, z) := h(x)eizx . Define +∞ +∞ (Tf )(r 2 ) := −2 J0 (rs)f (s 2 )sds ; (Jf )(r 2 ) := J0 (rs)f (s 2 )sds, 0
0
where J0 is the Bessel function of order 0, and B(z) is the multiplication operator by β( · , z). Lemma A.1. We have
# $
+∞
Gk,l (z) = 2i(−i)k
[(J B(z)) 0
× [(J B(z))
#
k+1 2
k 2
(T B(z))l 1](r 2 )
$
(T B(z))l 1](r 2 )β(r 2 , z)rdr.
Model Related to Disordered Quantum Crystals
185
¯ we compute (see [5] for the definition of the Proof. Setting ψ − := ψ and ψ + := ψ, transfer operator T(1 , 2 )) & 1 % ψ2± T(1 , 2 )F (22 )d 2 φ2 d ψ¯ 2 dψ2 = ±iψ1± f (φ22 )e−iφ1 ·φ2 d 2 φ2 2π R2 +∞ % & ≡ ±iψ1± J0 (rs)f (s 2 )sds . 0
The proof follows as in [5], Sect. 2.
As far as the Bessel function of order zero is concerned, we note that √ √ r(Jf )(r 2 ) = H0 ( sf (s 2 ))(r),
(A.1)
where H0 is the Hankel transform of order zero. Then, a little bit of Harmonic Analysis relative to Hankel Transforms is needed in the sequel. We denote by M the multiplication √ operator by r and by B the multiplication operator by the function β. The norms [] · []i denote the Banach norms of the spaces Hi defined in [5], Sect. 3. Lemma A.2. Let β(r 2 ) ∈ Lp ([0, +∞), dr), with p > 2. Then M(J B)2j B(H1 ,H0 ) ≤ 2j β(r 2 )p . Proof. The proof is similar to that of Lemma 3.1 of [5]. Let f ∈ H1 , then √ √ []M(J B)2 f []0 ≡ r(J BJ Bf )(r 2 )2 = H0 ( sβ(s 2 )(J Bf )(s 2 ))2 √ √ = rβ(r 2 )(J Bf )(r 2 )2 ≤ β(r 2 )p r(J Bf )(r 2 )( 1 − 1 )−1 2 p √ √ 2 2 2 2 = β(r )p H0 ( sβ(s )f (s ))( 1 − 1 )−1 ≤ β(r )p sβ(s 2 )f (s 2 )( 1 + 1 )−1 2 p 2 p 2 2 √ 2 2 2 ≤ β(r )p sf (s )2 ≡ β(r )p []Mf []0 , where we have used (A.1), Plancherel and Hausdorff-Young Formulae for the Hankel transform, and finally the Holder Inequality. The repeated application of the above computation leads to 2j
2j
[]M(J B)2j f []0 ≤ β(r 2 )p []Mf []0 ≤ β(r 2 )p []f []1 which is the assertion.
Remark A.3. By a simple application of the Lebesgue dominated convergence Theorem, if for r > 0, β(r 2 ) < 1 and decays faster than the inverse of a power, then β(r 2 )p < 1 for all p > 2 large enough. We are ready to prove the announced regularity result. Theorem A.4. Let H (ξ ) = −+K(ξ ) be a random Hamiltonian on 2 (Z) described by identically distributed random variables Kij (ξ ) = δij κi (ξ ) with common law p(dx) = f (x)dx. If f ∈ D(0, +∞), then for each Borel set ⊂ R we have P (dξ )δm |EH (ξ ) ()|δn = ϕ|n−m| (x)dx, X
where ϕk ∈ D(0, +∞) for each k ∈ N.
186
F. Fidaleo, C. Liverani
Moreover, for every p ≥ 1 the densities ϕk satisfy 1
ϕk p ≤ (supp(ϕ)) p C1 (k + 1)C2k , where C1 > 0 and 0 < C2 < 1 are positive quantities independent on p and k. Proof. We consider the case k = 2j . The odd case follows analogously. As it was'shown in Sect. 5 of [5], η(x) := liml (T B(x))l 1 exists for each x ∈ R, as a d function in n Hn .' Moreover, it is infinitely often differentiable with derivative dx η(x) belonging again to n Hn . Next, by the above consideration and Lemma A.1, we get G2j (x + i0) = 2(i)2j +1 M(J B(x))j η(x), B(x)M(J B(x))j η(x) where
u, v :=
+∞
(A.2)
u(r 2 )v(r 2 )r −1 dr .
0
Notice that, by polarization, ϕ2j is given precisely by G2j (x + i0), see [5], p. 229. As f is infinitely often differentiable with compact support, β(s, x) ≡ eisx h(s) falls off more rapidly than the inverse of every power. In such a way, for each k ∈ N, the multiplication operator M k B(x) is a bounded one. Hence we can differentiate infinitely often the bilinear form (A.2) obtaining the desired regularity property for all the ϕ2j . Further, after differenting the form (A.2), we easily obtain ∞ ≤ C1 (2j + 1)C2 , ϕ2j 2j
¯ of h(r 2 ) (p¯ > 2), can be where C1 does not depend on j , and C2 , being the p-norm made smaller than 1, see Lemma A.2 and Remark A.3. Appendix B In the present appendix we prove some estimations related to L1 -asymptotic abelianess, as well as the basic estimate concerning the control of the dynamics uniformly in time. We start by considering for j = 1, . . . , d, k = 1, 2, and m, n ∈ Zd , the deterministic variable δj · en,k defined taking into account 5.1, together with the vector-valued functions {γδmj ·en,k } given according to (5.7). We denote by | · | (one of the equivalent) norm on Cd .
Lemma B.1. The functions {γδmj ·en,k } satisfy R
dt γδmj ·en,k (t)Cd ≤ De−h|n−m|
(B.1)
for uniform constants D, h ∈ R+ . Further, for each deterministic variable v we get dt γvm (t)Cd ≤ D[]v[]e−h|m| , (B.2) R
where D is the same constant as above and we have assumed > h, for appearing in (4.2).
Model Related to Disordered Quantum Crystals
187
Proof. We choose a suitable sequence {Tm,n } of numbers greater than 1 and split the integral in (B.1) into two parts |γδmj ·en,k | = |γδmj ·en,k | + |γδmj ·en,k | . R
|t|
|t|≥Tm,n
Taking into account Lemma 3.2 and (3.2), we obtain for the first addendum |γδmj ·en,k | ≤ c1 Tm,n e−B|n−m| . |t|
To estimate the second addendum we proceed similarly to Lemma B.3 of [7]. We obtain, after an elementary application of the Holder, Hausdorff-Young and Poincare inequalities, −1 |γδmj ·en,k | = |Fk (Fk )(t)| = t Fk (Fk )(t) |t|≥Tm,n
|t|≥Tm,n
≤ c2
1
|t|≥Tm,n
t −p dt
p
|t|≥Tm,n
ϕn−m p ≤ c2
(|n − m|α + 1) (Tm,n )
p−1 p
,
where F1 , F2 stand, respectively, for the real and imaginary part of the Fourier transform, the functions Fk can be computed taking into account (4.4), and finally α is given in (3.5). The proof now follows by choosing Tm,n := ec3 |n−m| for suitable c3 > 0. The proof of (B.2) immediately follows by (B.1), taking into account (5.8). Lemma B.2. For each deterministic variable v, the (vector valued) function γvm is a continuous one vanishing at infinity. Moreover, sup γvm (t)Cd ≤ E[]v[] t∈R
for a uniform constant E ∈ R+ . Proof. We are just shown that the γδmj ·en,k are Fourier transforms of functions whose L1 -norms are equibounded, see (3.4). Then we have the uniform estimation |γvm (t)| ≤ E (vl kn )j ≡ E[]v[] j,k,l,n
taking into account (5.8). The second statement is immediate, whereas the first one follows by the Riemann-Lebesgue Lemma, taking into account the Lebesgue dominated convergence Theorem. Lemma B.3. There exist K∗ , a > 0, independent of the measure ν in (2.8), such that, for each real deterministic variable v, the following estimate holds: |Gn (v, t n , λn , β n )|d n t|ν|(dλ1 ) . . . |ν|(dλk ) n β n ∈I∞
nt ×Rdn
−1 []v[]
≤ (K∗ L2 |ν|(1))n eaL
.
188
F. Fidaleo, C. Liverani
Proof. Remembering definitions (5.8), (5.9) and (5.11) we have |Gn (v, t, λ, β)| n n λj λp |γeβj (tj − tp )| + λj γvβj (tj ) ≤ 2n βp j =1
=
n−1
2n
q=0
p=j +1
βj β |λj | |λσ (j ) ||γeβjσ (j ) (tj − tσ (j ) )|, λj γv (tj )
A⊂{1,...,n−1} σ ∈nA j ∈A #A=q
j ∈A
where nA := {σ : A → {2, . . . , n} | σ (k) > k}. Next, for each σ ∈ {1, . . . , n}A with |A| ≤ n we use the following n × n matrices: [B(A, σ )]kl :=
k ∈ A k ∈ A,
δkl δkl − δσ (k)l
(n = {σ ∈ {1, . . . , n}A | | det B(A, σ )| = 1}. Clearly A ⊂ (n . Moreover to define A A a permutation π ∈ Pn that fixes A acts in a natural way on σ ∈ {1, . . . , n}A and it is (n is globally invariant under this action. Hence, (see [7] for details) easily seen that A n β n ∈I∞
nt ×Rnd
|Gn | ≤
n−1 1 n 2 n! n n β ∈I∞ q=0
×
n nd (n R ×R A⊂{1,...,n−1} σ ∈ A #A=q
βj β |λj | |λσ (j ) ||γeβjσ (j ) (tj − tσ (j ) )| . λj γv (tj )
j ∈A
j ∈A
Let us now consider the change of variable (λn , tn ) = (λn , tn ) if n ∈ A and = (λn , tn − tσ (n) ) if n ∈ A. The Jacobian of such a transformation is exactly | det B(A, σ )| which is precisely 1 by definition. Setting nq = {1, . . . , n}q and remembering Lemma B.1, (λn , tn )
n β n ∈I∞
n−1 n 2n n β |Gn | ≤ |λj | ds γv j (s)Cd q n nd nd n! t ×R R σ ∈n R j =q+1 β∈I n q=0 ∞
×
q
q
|λj | |λσ (j ) |
j =1
R
β
ds γeβjσ (j ) (s)Cd
n−1 n+q 2n D n d n n ≤ []v[]n−q Ln+q e(n+q)h d 2 |ν|(1)n q n! q=0
×
q σ ∈nq
n∈Zn j =1
e−h|nj −nσ (j ) |
n j =q+1
e−h|nj | .
Model Related to Disordered Quantum Crystals
189
To continue we relabel the n sum with the new labels nj = nj for j ≤ q and 2D nj = nj − nσ (j ) for j ≤ q, and let C∗ := 1−e −h , n β∈I∞
n−1 n+q d n C∗n |ν(1)|n n []v[]n−q Ln+q e(n+q)h d 2 |Gn | ≤ 1, q n! nt ×Rn n σ ∈q
q=0
n−1 q n []v[]n−q Lq nq eqh d 2 q
3n 2
d C∗n |ν|(1)n Ln enh n! q=0
n a[]v[] ≤ K∗ L2 |ν|(1) e L , ≤
where K∗ := C∗ d 2 e2h+1 and a :=
e√−h . d
Acknowledgements. It is a pleasure to thank the I.H.E.S., where part of the paper was written, for the kind hospitality. We acknowledge the partial support of INDAM and MURST. Finally, we would like to thank the anonymous referees for forcing us to substantially clarify the context of the paper.
References 1. Aizenman, M., Molchanov, S.: Localization at large disorder and at extreme energies: An elementary derivation. Commun. Math. Phys. 157, 245–278 (1993) 2. Araki, H., Shiraishi, M.: On quasi-free states of the Canonical Commutation Relations. I. Publ. Res. Inst. Math. Sci. 7, 105–120 (1971–72) 3. Araki, H.: On quasi-free states of the Canonical Commutation Relations II. Publ. Res. Inst. Math. Sci. 7, 121–152 (1971–72) 4. Bratteli, O., Robinson, D. W.: Operator algebras and quantum statistical mechanics I, II, BerlinHeidelberg-New York: Springer, 1981 5. Campanino, M., Klein, A.: A supersymmetric transfer matrix and differentiability of the density of states in the one dimensional Anderson model. Commun. Math. Phys. 104, 227–241 (1986) 6. D¨urr, D., Naroditsky, V., Zanghi, N.: On the hydrodynamic motion of a test particle in a random chain. Ann. Phys. 178, 74–88 (1987) 7. Fidaleo, F., Liverani, C.: Ergodic properties for a quantum non-linear dynamics. J. Stat. Phys. 97, 957–1009 (1999) 8. Haag, R., Hugenholtz, N.M., Winnink, M.: On the equilibrium states in quntum statistical mechanics. Commun. Math. Phys. 5, 215–236 (1967) 9. Haag, R., Kastler, D., Trych–Pohlmeyer, B.: Stability and equibrium states. Commun. Math. Phys. 38, 173–193 (1974) 10. Kunz, H., Souillard, B.: Sur le spectre des op´erateurs aux diff´erence finies al´eatoires. Commun. Math. Phys. 78, 201–246 (1986) 11. Maassen, H., Guta, M., Botvich, D.: Stability of Bose dynamical systems by branching theory. Preprint 1999 12. Manuceau, J., Verbeure, A.: Quasi-free states of the CCR-algebra and Bogoliubov transformation. Commun. Math. Phys. 9, 293–302 (1968) 13. Schlag, W.: On the integrated density of states for Schr¨oedinger operators on Z2 with quasi periodic potential. Commun. Math. Phys. 223, 47–65 (2001) 14. Stratila, S., Zsido, L.: Lectures on von Neumann algebras. Tunbridge Wells-keat: Abacus Press (1979) 15. Takesaki, M.: Theory of operator algebras I. Berlin-Heidelberg-New York: Springer, 1979 Communicated by J. L. Lebowitz
Commun. Math. Phys. 235, 191–214 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0802-z
Communications in
Mathematical Physics
Annealed Feynman-Kac Models P. Del Moral, L. Miclo Laboratoire de Statistique et Probabilit´es, CNRS-UMR 5583, CNRS and Universit´e Paul Sabatier, 31062 Toulouse cedex 4, France. E-mail: delmoral(miclo)@cict.fr Received: 10 June 2002 / Accepted: 11 October 2002 Published online: 18 February 2003 – © Springer-Verlag 2003
Abstract: We analyze the concentration properties of an annealed Feynman-Kac model in distribution space. We characterize the concentration regions in terms of a variational problem involving a competition between the potential function and the mutation kernel. When the temperature parameter is evanescent with time and under appropriate hypotheses, the probability mass tends to concentrate on regions with minimal potential values. We give a precise description of these areas using non-linear semi-group contractions and large deviation techniques. We illustrate this annealed model with two physical interpretations related respectively to Markov motions in absorbing media and interacting measure valued processes. 1. Introduction This study is concerned with the long time behavior of a Feynman-Kac model associated to a potential function and a cooling schedule. This annealed distribution flow can be interpreted as the evolution of the laws of a Markov particle in an absorbing medium conditioned to non-extinction. In this context the cooling schedule represents in some sense the temperature of the medium. The smaller it is, the more stringent become the obstacles. These Feynman-Kac models can alternatively be regarded as a dynamical system in distribution space. In this context they can model the evolution of the marginal laws of a non-linear and non-homogeneous Markov process. The non-linearity comes from the fact that the elementary transitions depend on the distribution flow itself. In this connection the non-linear Markov model can be regarded as a Feynman-Kac type simulated annealing algorithm with mutation/selection transitions. As in the traditional simulated annealing model the temperature parameter is used to increase the selection pressure of the algorithm. These non-linear measure valued processes have a natural genetic type particle interpretation. They have some important applications in biology, advanced signal processing and numerical function analysis [4]. In contrast to previous studies on the
192
P. Del Moral, L. Miclo
convergence of genetic algorithms in global optimization problems (see for instance [2, 3] and the references given there) we underline that our study is not restricted to finite state spaces and above all that our mutation kernel is homogeneous in time (and not related to the cooling schedule), in particular it does not force the underlying (i.e. unperturbed by potentials) system to be motionless for large time. Our final counter-example relative to convergence to the global minima enables to better apprehend this classical assumption and we will see that it can be advantageously replaced by permitting the mutation kernel to have loops on each point of the state space. This latter precaution is very mild from the point of view of implementation of the procedure, whose speed of convergence should benefit a lot from homogeneity in time of the mutations (and furthermore, endowed with this feature, our algorithm seems closer to true biological/genetic mechanisms than the traditional ones). The above physical interpretations result from two ways to turn a sub-Markov and Boltzmann-Gibbs operator into a Markov kernel. One of the central questions in the study of these annealed Feynman-Kac models is of course the investigation of the long time behavior of these flows. Intuitively speaking when the temperature parameter tends to zero the probability mass of regions with high potential values decreases and the flow tends to concentrates to regions with minimal potential. The main objective of this article is to make clear this statement. First we discuss the convergence to equilibrium of the annealed models. We exhibit two different types of cooling schedules depending on the mixing parameter of the mutation transitions. Then we characterize the asymptotic regions where the flow concentrates in terms of a variational problem in distribution space. We show that the concentration properties of the annealed model are the result of a competition between the selection potential and the mutation transition. When the temperature parameter tends to zero the variation problem is solved by taking the infimum of the mean potentials over a suitably chosen collection of measures. We already mention that for sufficiently regular mutations and for judicious choices of cooling schedules the annealed model converges in probability to the global infimum of the potential function. To our knowledge the asymptotic concentration properties of the annealed FeynmanKac flow presented in this study have not been covered by the literature. We propose an original strategy based on non-linear semi-group contraction and large deviation techniques. The question of the stability of non-linear Feynman-Kac semi-groups arises in many research areas. To our knowledge the first studies in this field originate in nonlinear filtering literature. We again refer the reader to [4] for a precise discussion and a precise list of referenced papers. To our knowledge most of these works are only concerned with proving that the flow forgets its initial condition. The reason why these studies do not apply are twofold. First the annealed Feynman-Kac flows presented here are related to an increasing cooling schedule and the resulting functions e−β(n)V tend to the indicator function of null potential regions. The analysis of this degenerate situation is more involved and clearly differs from traditional filtering studies. On the other hand our objective is not to check that the flow forgets its initializations but we want to identify the regions on which the distributions concentrate. Our way to enter into this question has been influenced by the article [5]. This study shed some new light on the connections between the limiting distribution of the homogeneous model and the Lyapunov exponent of Schr¨odinger-Feynman-Kac semi-groups. Here we develop the profound interplay between the asymptotic behavior of these exponents and the concentration properties of the limiting measures. We also present a set of sufficient conditions on the mutation transitions under which these limiting regions coincide with the essential infimum of the potential function with respect to the invariant measure of the mutation kernel. We end
Annealed Feynman-Kac Models
193
this article showing that annealed models with off-diagonal type mutations may lose the essential infimum.
1.1. Description of the models and motivations. Let E be a separable complete metric space endowed with its Borel σ -field E and let V : E → R+ = [0, ∞) be a nonnegative bounded continuous potential function on E whose oscillation will be denoted by osc(V ) = sup V (x) − V (y) ; (x, y) ∈ E 2 . We consider a Markov kernel M(x, dy) on (E, E) and we denote by = E N , F = (Fn )n∈N , X = (Xn )n∈N , (Px )x∈E the canonical Markov chain with elementary transition M. For a given distribution µ on E we use the notation Eµ (.) for the expectation with respect to µ(dx) Px (.). Pµ (.) = E
We recall that a Markov kernel M generates two integral operators, one acting on the Banach space B(E) of bounded measurable functions f and the other on the set P(E) of probability measures µ on E by M(f )(x) = M(x, dy) f (y) and (µM)(A) = µ(dx) M(x, A) E
E
for any x ∈ E and A ∈ E. The space B(E) is endowed with the supremum norm f = sup |f (x)| x∈E
and the set P(E) with the total variation distance µ − ηtv = sup |µ(A) − η(A)| = A∈E
1 sup |µ(f ) − η(f )|. 2 f ∈B(E) : f ≤1
Here are collected some standard notations and conventions. If Q is a bounded positive operator on B(E) then we denote by Qn , n ∈ N, the semi-group defined by the inductive formula Qn = QQn−1 with Q0 = I d. Unless otherwise mentioned, x, y, z denote generic points in E, f an arbitrary bounded and measurable test-function on E and µ a given probability measure on E. We also use the conventions inf ∅ = ∞, sup∅ = −∞ and ∅ = 0, ∅ = 1 and a denotes the integer part of a number a ∈ R+ . : N → R+ we denote by µn ∈ P(E) the For a given inverse-freezing schedule β annealed distribution flow defined by the Feynman-Kac formulae µn (f ) =
γn (f ) γn (1)
n (p)V (Xp ) . β with γn (f ) = Eµ0 f (Xn ) exp − p=1
(1)
194
P. Del Moral, L. Miclo
In this notation we notice that γ0 = µ0 is the initial distribution law of the chain X. When (n) → ∞ as n → ∞, the distributions µn should inthe cooling schedule parameter β tuitively concentrate on some regions with minimal potential values. To illustrate this observation and motivate this article we present next two different physical interpretations of the Feynman-Kac model. To clarify the presentation we restrict ourselves to the (n) = β ∈ R+ . homogeneous situation and we suppose β Before proceeding further, we notice from the multiplicative nature of these formulae that the finite measures γn satisfy a linear equation of the form γn = γn−1 Q,
(2)
with
Q(x, dz) =
M(x, dy) G(y, dz)
and
E
G(x, dy) = e−βV (x) δx (dy).
The first way to turn the sub-Markovian kernel G into the Markov case consists in adding a cemetery point c ∈ E to the state space E and in extending the various quantities as follows. The test functions f ∈ B(E) and respectively the Markov transition M are first extended to E {c} by setting f (c) = 0 (note that V (c) = 0) and M c (x, dy) = ?E (x) M(x, dy) + ?c (x) δc (dy). Finally the Markov extension Gc of G on E {c} is given by Gc (x, dy) = e−βV (x) δx (dy) + (1 − e−βV (x) ) δc (dy).
(3)
The corresponding Markov chain (, F, X, (Pxc )x∈E {c} ) with elementary transitions M c Gc can be regarded as a Markov particle evolving in an environment with absorbing obstacles related to a potential function V . When the temperature of the medium decreases the obstacles become more and more stringent and the particle is more rapidly absorbed. In this physical context the Feynman-Kac flow (µn )n∈N defined by (1) represents the conditional distributions of the particle motions given the fact it has not been absorbed. With some obvious abusive notations we have c (f (Xn ) | T > n), µn (f ) = Eµ 0 c (.) is the expectawhere T = inf {n ≥ 0 ; Xn = c} is the lifetime of X and where Eµ 0 c tion with respect to Pµ0 (.). Intuitively speaking, when the temperature of the medium decreases a non-absorbed particle will evolve in regions with low potential obstacles. In measure valued and interacting processes literature the Feynman-Kac flow is alternatively seen as a solution of a non-linear evolution equation of the form β µn = µn−1 Kµ n−1 β
(4) β
with a collection of Markov kernels Kµ on E. The choice of Kµ is not unique. By direct inspection we see from (2) that we can choose β
β Kµ = MSµM β
with the Markov kernels Sµ on E defined by Sµβ (x, dy) = e−βV (x) δx (dy) + (1 − e−βV (x) ) β (µ)(dy).
Annealed Feynman-Kac Models
195
Here β is the Gibbs-Boltzmann mapping from P(E) into itself defined by β (µ)(dy) =
1 µ(e−βV )
e−βV (y) µ(dy).
Note that the evolution equation (4) is decomposed in two separate Markov transitions Sηβn
M
µn−1 −→ ηn = µn−1 M −→ µn = ηn Sηβn
(5)
In connection with the first interpretation we have turned here the sub-Markovian kernel G into the Markov case in a non-linear way by replacing the Dirac measure on the cemetery point c by the Gibbs-Boltzmann distribution β (ηn ). In this context the probβ abilistic interpretation consists in introducing a new reference probability measure Pµ0 on the canonical space under which µn is the distribution law of a non-homogeneous and non-linear Markov chain (Zn )n∈N . This measure is called the McKean measure β β associated to the collection of kernels Kµ and it is defined by its n-time marginals Pµ0 ,n , β Pµ (d(x0 , . . . , xn )) = µ0 (dx0 ) Kηβ0 (x0 , dx1 ) · · · Kηβn−1 (xn−1 , dxn ). 0 ,n
(6)
In this notation we clearly have that β (f (Zn )), µn (f ) = Eµ 0 β
β
where Eµ0 (·) is the expectation with respect to Pµ0 (·). The non-linear model (4) has a natural interacting particle interpretation. We will not enter into more details here. This would be too much digression and we refer the interested reader to the survey paper [4]. For completeness and for the convenience of the reader we give next an intuitive feel for the particles’ motions. As dictated by (5) the evolution of the systems is again decomposed into two mechanisms. During the first one the particles explore the state space independently of each other according to the mutation kernel M. After this mutation stage each particle in a site x decides with a probability e−βV (x) to stay in its location and with a probability 1 − e−βV (x) it selects a new particle at site y proportionally to its fitness e−βV (y) . This interacting particle model can be regarded as a genetic approximating model or as a Feynman-Kac version of the traditional simulated annealing algorithm. We refer the reader to the book of Duflo [7] and references therein for a precise description of these classical stochastic algorithms. In this “engineering perspective” the objective is to find the right mutation kernel and a judicious cooling schedule so that the particle will converge as the time tends to infinity to the global minima of the potential function V . 1.2. Statement of some results. In this section we present a quick derivation of our main (n) = β ∈ R+ , results. For an homogeneous cooling schedule, namely for all n ∈ N, β the distribution flow (4) is homogeneous with respect to the time parameter. More precisely it satisfies a non-linear equation in distribution space of the form µn = β (µn−1 ). The mapping β from P(E) into itself is defined by the formula β (µ)(f ) = µM e−βV f /µM e−βV .
196
P. Del Moral, L. Miclo β
By n , n ∈ N, we denote the corresponding semi-group β
n+1 = β ◦ βn
with
β
0 = I d.
We will work with the following mixing type condition: (H) There exists an integer m ≥ 1 and an ε ∈ (0, 1) such that for each x, y ∈ E, M m (x, .) ≥ ε M m (y, .). This condition clearly holds true for irreducible and aperiodic Markov kernels M on a finite space but it is also satisfied for bi-Laplace transitions with bounded drift functions as well as for some classes of Gaussian transitions (cf. [4]). The reasons for the introduction of this mixing type condition are twofold. – The main reason is that it guarantees the existence of an unique point µβ = β (µβ ) ∈ P(E) of each mapping β , β ∈ R+ . We will deduce this result from a contraction property β of the semi-group n . More precisely we will find a contraction parameter k ∈ (0, 1) and a collection of integers n(β), β ∈ R+ , such that for any µ, η ∈ P(E), β
β
n(β) (µ) − n(β) (η)tv < k µ − ηtv . (n) one expects that the non-homogeFor judicious choices of cooling schedule β neous Feynman-Kac flow behaves as the limiting distributions µβ (n) , as it is usual for simulated annealing (see for instance [9, 10]). – On the other side, condition (H) allows to connect the concentration properties of µβ with the asymptotic behavior of the logarithmic Lyapunov exponent n 1 ln sup Ex e−β p=1 V (Xp ) n→∞ n x∈E
(−βV ) = lim
(7)
of the semi-group Q(f ) = M(e−β f ) (on the Banach space B(E)). To describe in some detail the interplay between the exponents (7) and the fixed point measures µβ we recall that under condition (H) the occupation measures Ln =
n 1 δXp n p=1
converge as n → ∞ to an unique invariant measure ν = νM. The exponential deviant behavior of Ln is expressed by a large deviation principle. Namely, the distributions sequence Ln satisfy as n → ∞ a large deviation principle with a convex rate function I : µ ∈ P(E) → [0, ∞] defined by
µ(dx) Ent(K(x, .)|M(x, .)) , I (µ) = inf E
where the infimum is taken over all Markov kernels K with invariant measure µ. For a proof of this statement we refer the reader to the book of Dupuis and Ellis [8]. Note that
Annealed Feynman-Kac Models
197
I (µ) < ∞ implies that we can find a Markov kernel K leaving µ invariant, µK = µ, and verifying K(x, .) << M(x, .) for µ-a.s. all x ∈ E. Loosely speaking these probability measures can be interpreted as the limiting distributions of M-admissible Markov chains. The rate function I , the fixed points µβ and the logarithmic Lyapunov exponents
(−βV ), β ∈ R+ , are connected by the following formula: − (−βV ) = ln µβ (eβV ) =
inf
η∈P (E)
(β η(V ) + I (η)).
This formula expresses the concentration properties of the limiting measures µβ in terms of a variational problem in P(E) with competition between the mean potential η(V ) and the I -entropy I (η). If we combine this expression with the exponential version of the Markov inequality we can check that for any δ > 0, 1 ln µβ (V ≥ VI + δ) ≤ −δ β→∞ β lim
with
VI = inf {η(V ) ; η ∈ P(E) ,
I (η) < ∞},
Note that VI represents the minimal mean potential value we can asymptotically obtain running all M-admissible Markov chains. This asymptotic result indicates that the fixed points µβ concentrate as β tends to infinity to regions with potential less than VI . We will see that VI is always greater than the ν-essential infimum of V defined by Vν = sup {v ∈ R+ ; V ≥ v
ν − a.e.}.
The interplay between VI and Vν depends on the nature of Markov kernel M. For regular Markov kernels M we will see that VI = Vν . Nevertheless we will also exhibit situations where VI > Vν and for some δ > 0, µβ (V ≤ Vν + δ) → 0 as the parameter β → ∞. If (β(n))n∈N (respectively (t (n))n∈N ) is an increasing sequence of non-negative real numbers (resp. non-negative integers, with t (0) = 0), we will consider the cooling given by schedule β ∀ n ∈ N, ∀ t (n) ≤ p < t (n + 1),
(p) = β(n). β
But by a slight language abuse, from now on the sequence β := (β(n))n∈N will be called the cooling schedule and (t (n))n∈N the time mesh sequence. Our result is basically stated as follows: Theorem 1.1. When the mixing condition holds true for m = 1 we have VI = Vν . In addition there exists an integer parameter ≥ 1 such that for any choice of cooling schedule of the type β(n) = β(0) (n + 1)a ,
with a ∈ (0, 1) and β(0) < ∞
we have for the time mesh t (n) = n , µt (n) − µβ(n) tv ≤
C0 (n + 1)1−a
for a certain constant C0 > 0. When condition (H) is only satisfied for some m > 1, then for a logarithmic cooling schedule of the type β(n) = β(0) ln (n + e),
with b = (m − 1)osc(V )β(0) < 1
198
P. Del Moral, L. Miclo
we have for some time mesh t (n) = O(n1+b ln n), µt (n) − µβ(n) tv ≤ C1
ln (n + e) (n + 1)1−b
for a certain constant C1 > 0, and for each δ > 0, lim µt (n) (V ≥ VI + δ) = 0.
n→∞
is also quasi-logarithmic: We note that in the latter case, the “true” cooling schedule β (p) = O(ln(p)), as it is customary in simulated annealing. for large time p ∈ N, β In the context of Markov motions in an absorbing medium the choice of the mutation kernel M is dictated by the problem at hand. In this situation the analysis on the concentration levels discussed in this article will provide a way to predict the location of a non-absorbed particle when the temperature of the medium decreases. On the other side of the picture if we want to construct a Feynman-Kac simulated annealing which converges to the global infimum of a potential we have the choice of the mutation transitions. The concentration results presented here can be used to find a judicious choice of mutation kernel. In view of the above theorem it is preferable to explore the state space with a mutation transition satisfying condition (H) with a mixing parameter m = 1. In the genetic interpretation (5) the algorithm evolves according to a two stage mutation/selection. When the mutation transition has a mixing parameter m it seems preferable to run m mutations between each selection procedure, so that we are brought back to the situation m = 1. But in practice m can be very large and it may be not efficient to wait too long between mutations (quantitatively, this corresponds to a large constant C0 in the above theorem), especially for a given number of iteration n ∈ N∗ . 2. Regularity Properties The study of the asymptotic stability of Feynman-Kac type semi-groups has been initiated in [4]. Although this study does not discuss the contraction properties of the non-linear Feynman-Kac semi-group it designs a semi-group technique which can be easily transβ ferred to obtain contraction properties of the mappings n . Next we present a slightly more general formulation which applies to annealed Feynman-Kac models and capture the main ideas. Suppose Q is a given bounded and positive operator on B(E). We associate to Q the non-linear semi-group n , n ∈ N, defined on P(E) by the formula ∀ A ∈ E,
n (µ)(A) = µQn (A)/µQn (1).
Proposition 2.1. Suppose there exists an integer m ≥ 1 and a collection of numbers εQ ∈ (0, 1) such that Qm (x, .) ≥ εQ Qm (y, .). Then we have for any µ, η and n ∈ N, −2 2 n/m (1 − εQ ) µ − ηtv . n (µ) − n (η)tv ≤ 2εQ
(8)
Annealed Feynman-Kac Models
199
Proof. The contraction inequality (8) is essentially proved in [4]. We give next a short proof which completes the arguments given there. By construction we notice that n (µ)(f ) = µQn (f )/µQn (1) = n (µ)Qn (f ) with the Markov kernel Qn on E and the mapping n on P(E) defined by Qn (f ) =
Qn (f ) Qn (1)
and
n (µ)(f ) =
µ(Qn (1) f ) . µ(Qn (1))
The sequence of kernels Qn and the mappings n do not satisfy the semi-group property but we have for any pair of indexes p + q = n, (q)
Qn = Qp Q q with the Markov kernel
Qp (Qq (1) f ) . Qp (Qq (1)) To see this claim it suffices to use the semi-group property of Q and the definition of Qn ; indeed we have (q)
Qp (f ) =
Qn (f ) =
Qp (Qq (1)Qq (f )) Qp (Qq (f )) (q) = = Qp Qq . Qp (Qq (1)) Qp (Qq (1))
From this observation we can write (n−m)
Qn = Qm
(n−2m)
Qm
(n−n/mm)
· · · Qm
Qn−n/mm .
Under our assumptions we have for any q ∈ N, x, y ∈ E and n ≥ m, (q)
(q)
Qm (x, .) ≥ εQ Qm (y, .)
and
Qn (1)(x) ≥ εQ Qn (1)(y).
We recall that for any Markov kernel K such that for any x, y ∈ E, K(x, .) ≥ δ K(y, .) for some δ ∈ (0, 1) we have the contraction property µK − ηKtv ≤ (1 − δ) µ − ηtv . From this and the above considerations we obtain (q)
(q)
2 ) µ − ηtv . µQm − ηQm tv ≤ (1 − εQ
We end the proof of the proposition by noting that the decomposition η(Qn (1)) Qn (1) (µ − η) (f − n (η)(f )) n (µ)(f ) − n (η)(f ) = µ(Qn (1)) η(Qn (1)) together with (9) yields that −2 µ − ηtv . n (µ) − n (η)tv ≤ 2εQ
This ends the proof of the proposition.
(9)
200
P. Del Moral, L. Miclo
Next we apply Proposition 2.1 to determine the degree of contraction of the semi-group β n . When a Markov kernel M satisfies condition (H) for some m ≥ 1 and some ε ∈ (0, 1) we write (0) the integer (0) = 1 + ε −2 4 + ln (2ε −2 ) and m (β), β ∈ R+ , the collection of integers m (β) = m (0) (1 + eα(m)β ) (1 + α(m)β) with
α(m) = (m − 1)osc(V ).
Theorem 2.2. Suppose the Markov kernel M satisfies condition (H) for some integer m ≥ 1 and some parameter ε ∈ (0, 1). Then for any n ≥ m we have n/m βn (µ) − βn (η)tv < 2ε −2 e2α(m)β 1 − ε 2 e−α(m)β µ − ηtv . In addition we have for any β ∈ R+ , β
β
m (β) (µ) − m (β) (η)tv <
1 µ − ηtv . e
Thus m (β) can be seen as a relaxation time for β with respect to ·tv . Proof. Let Q = MG be the composition of the Markov kernel M with the multiplicative kernel G(f ) = e−βV f . For any positive function f ∈ B(E) we find that MQm−1 (f )(x) QQm−1 (f )(x) Qm (f )(x) = ≥ e−βosc(V ) m m−1 Q (f )(y) QQ (f )(y) MQm−1 (f )(y) and by induction m Qm (f )(x) −(m−1)βosc(V ) M G(f )(x) ≥ e Qm (f )(y) M m G(f )(y)
(this can also be seen directly on normalized Feynman-Kac formulae). Hence under our assumptions we conclude that Qm (x, .) ≥ ε e−α(m)β Qm (y, .). Thus Q satisfies the mixing condition stated in Proposition 2.1 with εQ = ε e−α(m)β . Consequently we find that βn (µ) − βn (η)tv < 2ε −2 e2α(m)β (1 − ε 2 e−2α(m)β )n/m µ − ηtv . The factor 2 in the second exponential term in the last display can be removed by using the same lines of arguments as in the proof of Proposition 2.1 and noting that for any positive function f ∈ B(E) , (q)
Qm (f )(x) =
M m G(Qq (1)f )(x) Qm ( Qq (1) f )(x) ≥ e−(m−1)βosc(V ) m q Q ( Q (1) )(x) M m G(Qq (1))(x) M m G(Qq (1)f )(y) ≥ ε 2 e−(m−1)βosc(V ) . M m G(Qq (1))(y)
Annealed Feynman-Kac Models
201
To prove the second assertion we observe that (1 − ε 2 e−α(m)β ) m (β)/m ≤ e−ε
2
e−α(m)β m (β)/m
≤ e−ε
2
(0)(1+α(m)β)
.
Since we have ε2 (0)(1 + α(m)β) = ε 2 (1 + ε −2 ) (4 + ln (2ε −2 )) (1 + α(m)β) ≥ (4 + ln (2ε −2 ) + 2α(m)β) ≥ 1 + ln (2ε −2 ) + 2α(m)β, we conclude that 2ε −2 e2α(m)β (1 − ε 2 e−α(m)β ) m (β)/m ≤ 1/e.
The end of the proof is now straightforward.
The uniform contraction property stated in Theorem 2.2 is an important first step in proving the convergence Theorem 1.1. First, and as mentioned in Sect. 1.2, it guarantees the existence of a unique collection of fixed point probability measures µβ = β (µβ ) ∈ P(E) ,
β ∈ R+ .
Furthermore it is also an important instrument tool for proving the following regularity condition. Proposition 2.3. When condition (H) holds true for some m ≥ 1 and ε ∈ (0, 1) we have for any 0 ≤ β1 ≤ β2 , µβ1 − µβ2 tv ≤ osc(V ) m (β1 ) (β2 − β1 ). Proof. Using the fixed point property we have the decomposition β
β
β
β
µβ1 − µβ2 = 1m (β1 ) (µβ1 ) − 1m (β1 ) (µβ2 ) + 1m (β1 ) (µβ2 ) − 2m (β1 ) (µβ2 ). By the uniform property stated in Theorem 2.2 we notice that µβ1 − µβ2 tv ≤
e β β 1m (β1 ) (µβ2 ) − 2m (β1 ) (µβ2 )tv . e−1
Let Pµ,n be the distribution of the sequence of random variables (X0 , . . . , Xn ) with initial distribution µ, that is Pµ,n (d(x0 , . . . , xn )) = µ(dx0 ) M(x0 , dx1 ) . . . M(xn−1 , dxn ). β
In this notation we see that each distribution n (µ) is the n-time marginal of the GibbsBoltzmann measure on E n+1 defined by β (dx) = Pµ,n
1 e−βVn (x) Pµ,n (dx) Pµ,n (e−βVn )
with the potential Vn from E n+1 into R+ defined for any x = (x0 , . . . , xn ) by Vn (x) =
n p=1
V (xp ).
202
P. Del Moral, L. Miclo
It is now well-known that for any β1 ≤ β2 β1 β2 Pµ,n − Pµ,n tv ≤
β2 − β1 β2 − β1 osc(Vn ) ≤ n osc(V ) 2 2
from which we conclude that m (β1 ) (β2 − β1 ) osc(V ). 2 This ends the proof of the proposition, due to the elementary bound e ≤ 2(e − 1). β
β
1m (β1 ) (µβ2 ) − 2m (β1 ) (µβ2 )tv ≤
3. Asymptotic Behavior In the further development of this section we assume without further mention that the Markov kernel satisfies condition (H) for some integer m ≥ 1 and some ε ∈ (0, 1). Let β = (β(n))n∈N be a given non-decreasing inverse cooling schedule and let tm (n), n ∈ N, be the associated time mesh defined by the following recursive formula: tm (n + 1) = tm (n) + m (β(n)) with
tm (0) = 0.
We associate to the pair (β(.), tm (.)) the annealed Feynman-Kac flow µp , p ∈ N, defined for each n ∈ N by µp+1 = β(n) (µp ) for each
tm (n) ≤ p < tm (n + 1).
In other words µp is the annealed Feynman-Kac flow with a constant inverse temperature parameter β(n) between the dates tm (n) and tm (n + 1), that is for each 0 ≤ p < tm (n + 1) − tm (n), p Eµtm (n) f (Xp ) e−β(n) q=1 V (Xq ) µtm (n)+p (f ) = . p Eµtm (n) e−β(n) q=1 V (Xq ) The core idea in the study of the long time behavior of the annealed model consists in combining the regularity properties of the fixed point distributions µβ with the contraction properties of the mappings β . To this end we introduce the decomposition β(n−1) β(n−1) µtm (n) −µβ(n) = m (β(n−1)) (µtm (n−1) )− m (β(n−1)) µβ(n−1) + µβ(n−1) −µβ(n) . From this display we now apply Theorem 2.2 and Proposition 2.3 to get the following system of inequalities: µtm (n) − µβ(n) tv 1 µtm (n−1) − µβ(n−1) tv + osc(V ) m (β(n − 1)) (β(n) − β(n − 1)) ≤ e and thus it appears that en µtm (n) − µβ(n) tv ≤ µ0 − µβ(0) tv + osc(V )
n
ep m (β(p − 1)) (β(p) − β(p − 1)).
p=1
We are now in a position to state the main result of this section.
(10)
Annealed Feynman-Kac Models
203
Theorem 3.1. – If m = 1 then we have t1 (n) = (0)n and we can choose β(t) = (t + 1)a ,
for any fixed
a ∈ (0, 1).
In addition we have for some c(a) < ∞, µt1 (n) − µβ(n) tv ≤ c(a)/n1−a .
(11)
– If m > 1 then we can take β(t) = β(0) ln (t + e),
with b = α(m)β(0) < 1.
In this case we have tm (n) = O(nb+1 ln n) and for some c(b) < ∞, µtm (n) − µβ(n) tv ≤ c(b) ln (n + e) /n1−b . Proof. When m = 1 we recall that 1 (β) = 2 (0) does not depend on β and t1 (n) = 2 (0)n. By direct inspection, if we choose β(t) = (t + 1)a , for any fixed a ∈ (0, 1), then we find from (10) that en µt1 (n) − µβ(n) tv ≤ µ0 − µβ(0) tv + 2 osc(V ) (0)
n
ep ((p + 1)a − p a ).
p=1
Recalling that x a − y a ≤ ay a−1 (x − y) for any x, y ≥ 0, we get n n n p p−1 e e . ep ((p + 1)a − p a ) ≤ a ≤ ae 1 + p 1−a p 1−a p=1
p=1
p=2
Next we observe that for any p ≥ 2, ep−2 p 1−a 1 ep−1 21−a ep−1 2 ep−1 = ≤ ≤ , (p − 1)1−a e p 1−a (p − 1)1−a e p 1−a e p 1−a so that we have n n ep−1 ep−1 ep−2 −1 ≤ (1 − 2/e) − p 1−a p 1−a (p − 1)1−a
p=2
p=2
en−1 ≤ 2 e 1−a . n It follows that µt1 (n) − µ
β(n)
tv ≤ e
−n
+ 6 osc(V ) (0) e−n +
Since en ≥ n1−a , this yields that µt1 (n) − µβ(n) tv ≤ c(a)/n1−a with
c(a) ≤ 1 + 18 osc(V ) (0).
2 n1−a
.
204
P. Del Moral, L. Miclo
Now we examine the situation where condition (H) is met only for some integer m > 1. We begin in this case by observing that m (β) ≤ 4m (0) eα(m)β (2 + α(m)β). We now choose the cooling schedule β(t) = β(0) ln (t + e) with b = α(m)β(0) < 1. In this notation, we find that from the above observation that m (β(n)) ≤ 4m (0) (e + n)b (2 + b ln (n + e)) ≤ 16m (0) (e + n)b ln (n + e).
(12)
This yields the following growth estimate on the corresponding time mesh tm (n) =
n−1
m (β(p)) ≤ 16meb (0) ln (n + e)
p=0
≤
n
pb
p=1
16m eb b+1
(0) (n + 1)b+1 ln (n + e).
We deduce from (10) and (12) that en µtm (n) − µβ(n) tv ≤ 1 + 32m osc(V )β(0) (0)
n−1
p=0
ln (p + e)
ep+1 . (p+e)1−b
This implies that en µtm (n) − µβ(n) tv ≤ 1 + 32m osc(V )β(0) (0) ln (n + e)
n p=1
From the above estimates we have e−n
n p=1
9 ep en −n ≤ 1−b . ≤ e e 1 + 2 1−b 1−b p n n
This finally shows that µtm (n) − µβ(n) tv ≤ c(b) ln (n + e)
1 n1−b
with c(b) = 1 + 288m osc(V )β(0) (0).
ep . p 1−b
Annealed Feynman-Kac Models
205
4. Concentration Properties This section is decomposed in two parts. In a first sub-section we provide a variational formulation of the concentration properties of µβ . We use a large deviation analysis to connect the Lyapunov exponent of the underlying Feynman-Kac semi-group at temperature β −1 > 0 with the β-exponential moment of the potential V under µβ . The description of the concentration level VI is discussed in the second part of this section. We give sufficient conditions on the Markov kernel M under which VI coincides with the essential infimum Vν of the potential. In the final part of the article we examine the finite state space situation. We provide an alternative formulation of VI in terms of the minimal mean potential along M-admissible cycles in E. We also examine the two situations, where VI = Vν and VI > Vν . 4.1. Large deviation analysis. In this section we examine the concentration properties of the fixed point distributions µβ as β tends to infinity. To obtain a better grasp of what is at stake it is useful to interpret these distributions as the limiting measures of the β canonical Markov chain Z under the McKean distribution Pµ0 defined in (6). We recall that this distribution describes the evolution of a non-homogeneous Markov chain with elementary transitions β
β Pµ (Zn ∈ dy | Zn−1 = x) = MSµn−1 M (x, dy) 0
with
−1 β Pµ ◦ Zn−1 = µn−1 . 0
This chain can be regarded as a Feynman-Kac simulated annealing model. At each time n the particle Zn−1 first evolves according to the transition kernel M to a new location Yn . With a probability e−βV (Yn ) we set Zn = Yn and with a probability 1 − e−βV (Yn ) it jumps to a new location Zn randomly chosen with the Boltzmann-Gibbs distribution β (ηn )(dy) =
1 e−βV (y) ηn (dy) ηn (e−βV )
with
β ηn = µn−1 M = Pµ ◦ Yn−1 . 0
These transitions tend to favor regions with low potential values. The precise description of these areas is contained in the concentration properties of the limiting distributions µβ . In contrast to what would be the case for traditional simulated annealing or statistical mechanics models here the limiting distributions are not defined in terms of a Boltzmann-Gibbs measure. As a result the concentration analysis is more involved and we have to find a new strategy to enter into these questions. Here the interplay between µβ and the quantities (β, M, V ) is only described by the fixed point formula ∀ f ∈ B(E), µβ (f ) = µβ (Qβ (f ))/µβ (Qβ (1)) with Qβ (f ) = M e−βf . As we already mentioned in the introduction under the uniform mixing condition (H) the Markov kernel M has a unique invariant measure ν = νM ∈ P(E) and the sequence of occupation measures Ln = n1 np=1 δXp of the chain X under Pµ0 satisfies as n → ∞ a large deviation principle with good rate function
µ(dx) Ent(K(x, .)|M(x, .)) , (13) I (µ) = inf E
206
P. Del Moral, L. Miclo
where the infimum is taken over all Markov kernels K with invariant measure µ (cf. [8]). In the most naive view we could think that the Feynman-Kac simulated annealing model converges in probability to the ν-essential infimum Vν of the potential V (since under (H) for any x0 ∈ E, M m (x0 , ·) is equivalent to ν, it is not really necessary to know the latter probability to compute Vν ). This intuitive idea appears to be true for regular Markov transitions M with a diagonal term M(x, x) > 0 but it is false in more general situations. To better introduce our strategy to study the concentration properties of µβ we need a more physical interpretation of the Feynman-Kac models. If we interpret the potential V as the absorption rate for a Markov particle with transition M evolving in an medium with obstacles, the normalizing constants n Eµ exp −β V (Xp ) p=1
represent the probabilities of a killed Markov particle starting with distribution µ and conditioned to perform a long crossing of length n without being trapped. The cost attached to performing long crossings is measured in terms of the logarithmic Lyapunov exponents of the semi-group Qβ on the Banach space B(E), n 1 1 ln ||Qnβ (1)|| = lim ln sup Ex e−β p=1 V (Xp ) . n→∞ n n→∞ n x
(−βV ) = lim
The next lemma shows that these Lyapunov exponents coincide with the logarithmic rate of the β-exponential moment of the fixed point measures µβ . It also makes a link between the large deviation rate I and the concentration properties of µβ . Informally it shows that µβ (eβV ) eβVI , where VI is the value of the variational problem VI := inf{µ(V ) µ ∈ P(E) : I (µ) < ∞}.
(14)
Loosely speaking the concentration properties of the limiting measures µβ as β tends to infinity are related to a competition in P(E) between the mean potential µ(V ) and the I -entropy I (µ). The next lemma also shows that the concentration of µβ is related to a variational problem in which the competition with the entropy I becomes less and less severe as β tends to infinity. Lemma 4.1. For any β ∈ R+ we have the formulae
(−βV ) 1 1 β βV − η(V ) + I (η) −−−−−−→ VI ≥ Vν . = ln µ (e ) = inf β β β η∈P (E) β→∞ Proof. If we take f = Qnβ (1) in the fixed point equality we readily find the recursive formula β n β µβ (Qn+1 β (1)) = µ (Qβ (1)) µ (Qβ (1)). Thus we have for each n ≥ 0, n µβ (Qnβ (1)) = (µβ (Qβ (1)))n = Eµβ e−β p=1 V (Xp ) .
(15)
Annealed Feynman-Kac Models
207
Now if we take f = eβV in the fixed point equation we get µβ (eβV ) µβ (Qβ (1)) = 1.
(16)
Recalling that under condition (H) the Laplace transformation 1 ln Eµ enLn (−βV ) n→∞ n
(−βV ) = lim
does not depend on the choice of the initial distribution µ we deduce that − (−βV ) = − ln µβ (Qβ (1)) = ln µβ eβV . Since (−βV ) is also given as the Fenchel transformation of I ,
(−βV ) = sup (η(−βV ) − I (η)), η∈P (E)
(17)
the end of the proof of the first assertion is clear. To end the proof we notice that 1 1 VI ≤ inf η(V ) + I (η) ≤ η0 (V ) + I (η0 ) β β η∈P (E) for each distribution η0 such that I (η0 ) < ∞. Letting β → ∞ we find that 1 η(V ) + I (η) ≤ η0 (V ). VI ≤ lim sup inf β β→∞ η∈P (E) Taking the infimum over all distributions η0 such that I (η0 ) < ∞ we obtain 1 ln µβ (eβV ) = VI . β→∞ β lim
To see that VI ≥ Vν , it is clearly sufficient to show that for any probability µ, I (µ) < +∞ implies that µ ν. One easy way to obtain this assertion in our context is to note that if I (µ) < +∞, then there exists a kernel K verifying µ = µK and K(x, ·) M(x, ·) for µ-a.s. all x ∈ E. But since for all x ∈ E, M m (x, ·) is equivalent to ν, due to the hypothesis (H), we get that µ = µK m µM m ∼ ν. This ends the proof of the lemma.
By using the exponential version of Markov’s inequality, Lemma 4.1 provides a concentration property of µβ in the level sets (V < VI + δ), δ > 0. More precisely we have for any δ > 0, µβ (V ≥ VI + δ) = µβ eβ(V −VI ) ≥ eβδ ≤ e−βδ µβ eβ(V −VI ) , from which one concludes that 1 ln µβ (V ≥ VI + δ) ≤ −δ. β→∞ β lim
Combining this concentration property with Theorem 3.1 we prove the following asymptotic convergence result.
208
P. Del Moral, L. Miclo
Proposition 4.2. Suppose condition (H) holds true for some m ≥ 1 and let tm (n) and β(n) be respectively the time mesh sequence and the cooling schedule described in Theorem 3.1. Then the corresponding annealed Feynman-Kac distribution flow µtm (n) concentrates as n → ∞ to regions with potential less than VI , that is for each δ > 0 we have that lim µtm (n) (V ≥ VI + δ) = 0. n→∞
Remark 4.3. The topological hypotheses that E is Polish and that V is continuous are only necessary to obtain (17), see for instance [6]. So except for the definition (14), the rest of the paper is true under the assumption that (E, E) is a measurable space and that V is a non-negative bounded and measurable potential. In particular under this extended setting we can consider 1 1 lim ln (Ex [exp(−βV (X1 ) − · · · − βV (Xn ))]) n→∞ β→+∞ β n
V∗ := − lim
which always exists and does depend on the initial condition x ∈ E. Indeed, if we denote ∀ n ∈ N, ∀ β ∈ R+ , λn (β) = inf ln Ex exp(−βV (X1 ) − · · · − βV (Xn )) , x∈E
then it is quite clear via the Markov property that (λn (β))n∈N is super-additive, so that the following limit exists: λ(β) := lim
n→∞
1 λn (β) n
(this is just a rewriting of the traditional existence of the Lyapunov exponent of the underlying unnormalized Feynman-Kac operator). Now taking into account condition (H), it appears that for any n ≥ m and x, y ∈ E, 2 exp(−(m − 1)β osc(V )) ≤
Ey [exp(−βV (X1 ) − · · · − βV (Xn ))] Ex [exp(−βV (X1 ) − · · · − βV (Xn ))]
≤ −2 exp((m − 1)β osc(V )), thus we see that
Ey [exp(−βV (X1 ) − · · · − βV (Xn ))] 1 = 0, ln n→∞ n Ex [exp(−βV (X1 ) − · · · − βV (Xn ))] lim
and in particular for any initial distribution µ0 , we have 1 ln Eµ0 [exp(−βV (X1 ) − · · · − βV (Xn ))] . n→∞ n
λ(β) = lim
Besides the lhs is convex in β, as a limit of convex functions, so we are assured of the existence of − lim
β→+∞
λ(β) λ(β) − λ(0) λ(β) − λ(0) = − lim = − sup β→+∞ β β β β>0
a priori in R {−∞}, but as V is non-negative and bounded, we conclude that V∗ ∈ R+ . In this context, Lemma 4.1 can be rewritten as saying that under the topological hypotheses that E is Polish and that V is continuous, we have V∗ = VI ≥ Vν .
Annealed Feynman-Kac Models
209
4.2. Concentration levels. This section is devoted to a discussion on the concentration regions of µβ as β tends to infinity. In a first subsection we examine Feynman-Kac models where the Markov kernel M satisfies condition (H) with m = 1 or has a regular diagonal term. We show that in this case the concentration level VI coincides with the essential infimum of the potential with respect to the invariant measure of M. The second part of this section focuses on Feynman-Kac models on finite state spaces. We relate the exponential concentration of µβ with a collection of Bellman’s fixed point equations. We propose an alternative characterization of the concentration level VI . We show that VI can be seen as the minimal mean potential value over all closed cycles on E. Thanks to this representation we will check that VI = Vν iff there exists a closed cycle on V −1 (Vν ). For more general off-diagonal mutation transitions we have VI > Vν . We illustrate this assertion with a simple three point example, showing furthermore that µβ is not concentrating on “neighborhoods” of V −1 (Vν ). 4.2.1. Diagonal mutations. Certainly the easiest way to insure that VI = Vν is to impose loops on every point of E for M. This assertion is based on the following simple upper bound. Proposition 4.4. Let x0 ∈ E be such that M(x0 , x0 ) > 0, then we have VI ≤ V (x0 ). Proof. By definition of the Markov chain X, we have that for any β ∈ R and any n ∈ N, Ex0 [exp(−βV (X1 ) − · · · − βV (Xn ))] ≥ Ex0 [?X1 =x0 , X2 =x0 , ... , Xn =x0 exp(−βV (X1 ) − · · · − βV (Xn ))] = {M(x0 , x0 )}n exp(−nβV (x0 )), thus
(−βV ) ≥ lim
n→∞
1 ln Ex0 exp −β n
n
V (Xp )
p=1
≥ ln(M(x0 , x0 )) − βV (x0 ) so that VI = − lim
β→+∞
(−βV ) ≤ V (x0 ). β
As a corollary, we get that VI ≤ inf V (x), x∈EM
where EM := {x ∈ E : M(x, x) > 0}. In particular, to be sure that VI = Vν , it is sufficient that there exists a sequence (xn )n∈N such that limn→∞ V (xn ) = Vν and verifying M(xn , xn ) > 0 for all n ∈ N. In practice, this can be insured by imposing on the underlying exploration kernel M that for all x ∈ E, M(x, x) > 0.
210
P. Del Moral, L. Miclo
Let us also remark that when condition (H) is satisfied for m = 1, this slight precaution is even useless, since we have automatically VI = Vν . Indeed, this is an immediate consequence of the next ordering of measures, for any β ∈ R+ , 2
ν(e−βV ·) 1 ν(e−βV ·) β ≤ µ (·) ≤ ν(e−βV ) 2 ν(e−βV )
itself deduced from the fixed point formula. As already mentioned at the end of Sect. 1, this suggests that in implementing genetic algorithms, it is better to wait a certain number of mutation steps (here m) before proceeding to a selection step. We note the same is true for simulated annealing algorithms (cf. for instance [1]). 4.2.2. Finite state space. We consider here the simpler case of a finite state space E endowed with an irreducible Markov kernel M. Then the unique associated invariant probability ν charges all points of E. We note that in this context the condition (H) is equivalent to the aperiodicity of M, but we won’t require this property for our first result. More precisely, our next objective is to give an explicit representation of VI , which in this setting is just given by 1 1 VI = − lim (18) lim ln Ex exp(−βV (X1 ) − · · · − βV (Xn )) . β→+∞ β n→∞ n We have already seen in Remark 4.3 that this definition does not depend on the starting point x ∈ E, which could be replaced by any initial distribution. We will say that a finite sequence of elements of E, C = (x1 , · · · , xn ), where the length n ∈ N∗ will be denoted in what follows by l(C), is a proper cycle (relatively to M), if for any 1 ≤ i ≤ n, M(xi , xi+1 ) > 0, with the convention that xn+1 = x1 , and if all the xi , 1 ≤ i ≤ n, are distinct. We denote C the finite set of all such admissible proper cycles. To any C = (x1 , · · · , xn ) ∈ C, we associate its mean potential 1 V (C) := V (xi ). n 1≤i≤n
Proposition 4.5. We have VI = min V (C). C∈C
In particular it appears that the equality VI = Vν is equivalent to the existence of a proper cycle inside V −1 (Vν ). Proof. Denoting by VC the rhs in the above proposition, we begin by showing that VI ≥ VC . A finite collection P = (y1 , ..., yn ) of elements of E is called a path (relatively to M) if for any 1 ≤ i < n, M(yi , yi+1 ) > 0 and as before we associate to this object its length l(P ) = n ∈ N (n = 0 corresponds to the empty path, note that this length was not permitted for cycles) and its mean potential V (P ) = 1≤i≤n V (yi )/n. If such a path is given, we can find k proper cycles C1 , ..., Ck , and a sub-path R of P (this does not mean that it is a subsegment, i.e. R is not necessarily of the form (yr , yr+1 , ..., yr+l(R) )) of length less than card(E) such that l(Ci )V (Pi ) + l(R)V (R). (19) l(P )V (P ) = 1≤i≤k
Annealed Feynman-Kac Models
211
To be convinced of the existence of such a decomposition, we look for the first return of the path P on itself: let s = min{t ≥ 2 : yt ∈ {y1 , ..., yt−1 }} and 1 ≤ r < s be such that ys = yr . Then we define C1 := (yr , yr+1 , ..., ys−1 ) and we consider the new path P := (y1 , ..., yr−1 , ys , ys+1 , ..., yn ) (one would have noted that M(yr−1 , ys ) > 0). Applying next recursively the previous procedure, we construct/remove the proper cycles C2 , ... , Ck and we end up with a path R whose elements are all different. From formula (19), we deduce that l(P )V (P ) ≥ l(Ci )VC − card(E) V ∞ 1≤i≤k
≥ l(P )VC − 2card(E) V ∞ . Thus for any x ∈ E and n ∈ N∗ , we have Ex [exp(−βV (X1 ) − · · · − βV (Xn ))] ≤ exp(nβVC − 2card(E)β V ∞ ) and the announced bound follows at once. To see the reciprocal inequality, let us consider C ∈ C such that V (C) = VC . If an initial point x and a large enough length n are given, we construct a path Pn by first going from x to a point of C by a self-avoiding path (whose existence is insured by irreducibility) and next always following C (in the direction included in its definition and jumping from its last element to the first one). Then it is quite clear that limn→∞ V (Pn ) = V (C), thus denoting q = minx,y∈E : M(x,y)>0 M(x, y) and taking into account the bound Ex [exp(−βV (X1 ) − · · · − βV (Xn ))] ≥ q n exp(nβV (Pn )) we conclude by a similar argument to the one given in the proof of Proposition 4.5. In fact the equality of the previous proposition is also true in the case of a Markov kernel M admitting a unique recurrence class (but in this situation ν does not necessarily charge all points of E). In the most general case, the initial point x in (18) plays a role: VI (x) is the minimal mean potential of proper cycles included in the recurrence classes which can be reached from x. Remark 4.6. In view of the above result, it appears that if we note AC the set of positive functions f defined on E which are of the form f = C∈C aC ?C , where we have identified proper cycles C with the subset of their elements and where (aC )C∈C is a family of non-negative reals, we have
ν(f V ) VI = inf ; f ∈ AC . ν(f ) This expression should be compared with the general formula for Vν :
ν(f V ) Vν = inf ; f ∈ A+ , ν(f ) where A+ is the set of positive bounded measurable functions defined on (E, E). To understand precisely concentration phenomenon for µβ , it would be very interesting to obtain a large deviation principle: there exists a function U : E → R+ such that ∀ x ∈ E,
U (x) := − lim ln(µβ (x))/β β→+∞
212
P. Del Moral, L. Miclo
(necessarily minE U = 0, in analogy with generalized simulated annealing, U could be called the underlying virtual energy). Unfortunately we have not been able to prove such a convergence, even under the condition (H), but we are still trying to get this result. Nevertheless, we note that under the latter hypothesis the family of mappings (ln(µβ (·))/β)β≥1 is compact; indeed, let m ≥ 1 and > 0 be as in (H), we have for any β > 0 and x ∈ E, µβ (x) =
µβ (Qm β (?{x} )) µβ (Qm β (1))
≥ 2 e−(m−1)βosc(V )
ν(e−βV ?{x} ) ≥ 2 e−mβosc(V ) ν(x), ν(e−βV )
thus 0≤−
1 1 ln µβ (x) ≤ m osc(V ) − ln( 2 min ν(x)). x∈E β β
So at least under (H), we can consider accumulation functions U of − ln(µβ (x))/β for β large. In order to derive Bellman’s equations verified by this kind of objects, let us introduce for n ∈ N∗ and x, y ∈ E, the n-communication cost function, V (n) (x, y) := min V (P ), (n)
P ∈Px,y
(n)
where Px,y is the set of paths of length n going from x to y (i.e. P = (P1 , . . . , Pn ) with M(x, P1 ) > 0 and Pn = y). In particular for any x, y ∈ E, V (1) (x, y) = V (y). As in the proof of Proposition 4.4, we show without difficulties that for any x, y ∈ E, lim inf n→∞ V (n) (x, y) = VC (and this is a true limit if M is aperiodic, the difference of the two terms being at most of order 1/n). For a subset A ⊂ E we also define the M-boundary of A as the subset of all possible sites which are accessible from A, that is ∂M (A) = {y ∈ E − A ;
∃x ∈ A
M(x, y) > 0}.
Now we can state E be any accumulation point as above, then it satisfies the Proposition 4.7. Let U ∈ R+ Bellman’s fixed point equations
U (y) = inf (U (x) + nV (n) (x, y)) − nVI y∈E
(20)
for any n ∈ N∗ and nVI = inf x,y∈E (U (x) + nV (x, y)). Furthermore we have the inclusions U −1 (0) ⊂ (V ≤ VI ) and ∂M U −1 (0) ⊂ (V > VI ).
(21)
Before getting into the proof of this proposition, let us pause for a while and give some comments on the consequence of these results. The inclusions (21) show that a point x ∈ {V ≤ VI } with energy U (x) > 0 cannot be reached from U −1 (0) (the reverse being in general true). This shows that when all pairs of points x, y ∈ {V ≤ VI } can be joined by a path in this level set then U −1 (0) = {V ≤ VI }.
Annealed Feynman-Kac Models
213
Proof of Proposition 4.7. The Bellman’s equations are immediate consequences of the equalities, seen in the proof of Lemma 4.1, ∀ n ∈ N∗ , ∀ x ∈ E, ∀ β > 0, µβ (x) = (µβ [exp(βV )])n µβ (y)Ey [?{x} (Xn ) exp(−βV (X1 ) − · · · − βV (Xn ))] y∈E
by taking the logarithm and dividing by β that we let go to infinity. To prove the inclusions (21), we suppose on the contrary that we can find a pair (x, y) ∈ E 2 such that U (x) = 0 , M(x, y) > 0 U (y) > 0 and V (y) ≤ VI . From the Bellman’s equation this will give that U (y) = inf {U (z) + V (y) − VI ; z ∈ E , M(z, y) > 0} ≤ inf {U (z) ; M(z, y) > 0} ≤ U (x) = 0, and we obtain a contradiction with the fact that U (y) > 0.
We end this article with a simple three point example in which VI > Vν and V −1 (Vν ) ⊂ U −1 (0). So we take for state space E = {0, 1, 2} and we consider the Markov kernel defined by p 1−p 0 M = 0 0 1 , with p ∈ (0, 1). 1 0 0 It is clear that M is irreducible and aperiodic and we check that its unique invariant probability ν is given by ν(0) =
1 3 − 2p
and ν(1) = ν(2) =
1−p . 3 − 2p
Let V : E → R+ be a potential function such that V (0) >
V (0) + V (1) + V (2) > V (2) > V (1) = 0. 3
(22)
So the ν-essential infimum Vν is given by Vν = 0 = V (1) and by Proposition 4.5, we have VI =
V (0) + V (1) + V (2) . 3
This could also be deduced from the fact that here the rate function I satisfies I (µ) < ∞ ⇐⇒ ∃ ∈ [0, 1] : µ = r(δ0 + δ1 + δ2 )/3 + (1 − r)δ0 , a property which reflects that trajectories of X are concatenations of the words [0] and [1, 2, 0] (except for a possible start with [2]). Our next objective is to solve explicitly the Bellman’s fixed point equation (20) for n = 1: U (0) = min {U (0), U (2)} + V (0) − VI U (1) = U (0) + V (1) − VI U (2) = U (1) + V (2) − V . I
214
P. Del Moral, L. Miclo
By (22), we see that in the first equality the min cannot be U (0) (otherwise V (0) = VI ), so U (0) = U (2) + V (0) − VI and this shows that U (2) < U (0). The last equation also implies that U (2) < U (1) and necessarily U (2) = 0, from which we obtain that U is unique and that it is given by U (0) = V (0) − VI U (1) = VI − V (2) U (2) = 0. One concludes that limβ→∞ µβ (2) = 1 and that this convergence is exponentially fast. In particular µβ does not concentrate for large β on the unique point 1 where the “essential” infimum is achieved (this latter assertion could also be deduced directly from the observation (21)). References 1. B´elisle, C.: Convergence theorems for a class of simulated annealing algorithms on Rd . J. Appl. Probab. 29(4), 885–895 (1992) 2. Cerf, R.: The dynamics of mutation-selection algorithms with large population sizes. Ann. Inst. H. Poincar´e Probab. Statist. 32(4), 455–508 (1996) 3. Del Moral, P., Miclo, L.: On the convergence and the applications of the generalized simulated annealing. SIAM J. Cont. Optim. 37(4), 1222–1250 (1999) 4. Del Moral, P., Miclo, L.: Branching and interacting particle systems approximations of FeynmanKac formulae with applications to non-linear filtering. In: S´eminaire de Probabilit´es, XXXIV. Berlin: Springer, 2000, pp. 1–145 5. Del Moral, P., Miclo, L.: Particle approximations of Lyapunov exponents connected to Schr¨odinger operators and Feynman-Kac semigroups. Preprint, publications du Laboratoire de Statistique et Probabilit´es, no 2001-08, 2001 6. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Boston, London: Jones and Bartlett, 1993 7. Duflo, M.: Random iterative models. Berlin: Springer-Verlag, 1997. Translated from the 1990 French original by Stephen S. Wilson and revised by the author 8. Dupuis, P., Ellis, R.: A weak convergence approach to the theory of large deviations. New York: John Wiley & Sons Inc., 1997. A Wiley-Interscience Publication 9. Holley, R., Stroock, D.: Simulated annealing via Sobolev inequalities. Commun. Math. Phys. 115, 553–569 (1988) 10. Miclo, L.: Recuit simul´e sans potentiel sur un ensemble fini. In: S´eminaire de Probabilit´es XXVI, J. Az´ema, P.A. Meyer, M. Yor (eds), Lecture Notes in Mathematics 1526, Berlin-Heidelberg-New York: Springer-Verlag, 1992, pp. 47–60 Communicated by A. Kupiainen
Commun. Math. Phys. 235, 215–231 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0789-x
Communications in
Mathematical Physics
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials Aldo Procacci1,∗ , Benedetto Scoppola2,∗∗ , Victor Gerasimov1 1
Departamento de Matem´atica, Universidade Federal de Minas Gerais, Av. Antˆonio Carlos, 6627, Caixa Postal 702, 30161-970, Belo Horizonte, MG Brazil. E-mail:
[email protected];
[email protected] 2 Dipartimento di Matematica, Universit´a “La Sapienza” di Roma, Piazzale A. Moro 2, 00185 Roma, Italy. E-mail:
[email protected] Received: 16 January 2002 / Accepted: 17 October 2002 Published online: 18 February 2003 – © Springer-Verlag 2003
Abstract: Given an infinite graph G quasi-transitive and amenable with maximum degree , we show that reduced ground state degeneracy per site Wr (G, q) of the q-state antiferromagnetic Potts model at zero temperature on G is analytic in the variable 1/q, whenever |2e3 /q| < 1. This result proves, in an even stronger formulation, a conjecture originally sketched in [12] and explicitly formulated in [16 and 19], based on which a sufficient condition for Wr (G, q) to be analytic at 1/q = 0 is that G is a regular lattice. 1. Introduction The Potts model with q states (or q “colors”) is a system of random variables (spins) σx sitting in the vertices x ∈ V of a locally finite graph G = (V, E) with vertex set V and edge set E, and taking values in the set of integers {1, 2, . . . , q} . Usually the graph G is a regular lattice, such as Zd with the set of edges E being the set on nearest neighbor pairs, but of course more general situations can be considered. A configuration σV of the system is a function σV : V → {1, 2, . . . , q} with σx representing the value of the spin at the site x. We denote by V the set of all spin configurations in V. If V ⊂ V we denote σV the restriction of σV to V and by V the set of all spin configurations in V . Let V ⊂ V and let G|V = (V , EV ), where E|V = {{x, y} ∈ E : x ∈ V , y ∈ V }. Then for V ⊂ V finite, the energy of the spin configuration σV in G|V is defined as HG|V (σV ) = −J δσx σy , (1.1) {x,y}⊂E|V
where δσx σy is the Kronecker symbol which is equal to one when σx = σy and zero otherwise. The coupling J is called ferromagnetic if J > 0 and anti-ferromagnetic if J < 0. ∗ ∗∗
Partially supported by CNPq (Brazil) Partially supported by CNR, G.N.F.M. (Italy)
216
A. Procacci, B. Scoppola, V. Gerasimov
The statistical mechanics of the system can be done by introducing the Boltzmann weight of a configuration σV , defined as exp{−βHG|V (σV )}, where β ≥ 0 is the inverse temperature. Then the probability to find the system in the configuration σV is given by Prob(σV ) =
e−βHG|V (σV ) . ZG|V (q)
(1.2)
The normalization constant in the denominator is called the partition function and is given by ZG|V (q, β) = e−βHG|V (σV ) . (1.3) σV ∈V
The case βJ = −∞ is the anti-ferromagnetic and zero temperature Potts model with q-states. In this case configurations with non zero probability are only those in which adjacent spins have different values (or colors) and ZG|V (q) becomes simply the number of all allowed configurations. The thermodynamics of the system at inverse temperature β and “volume” V is recovered through the free energy per unit volume given by fG|V (q, β) =
1 ln ZG|V (q), |V |
(1.4)
where |V | denotes the cardinality of V . All thermodynamic functions of the system can be obtained via derivative of the free energy. In the zero temperature anti-ferromagnetic case the function fG|V (q, β) is usually called the ground state entropy of the system. The Potts model, despite its simple formulation, is an intensely investigated subject. Besides its own interest as a statistical mechanics model, it has deep connections with several areas in theoretical physics, probability and combinatorics. In particular, Potts models on general graphs are strictly related to a typical combinatorial problem. As a matter of fact, the partition function of the Potts model with q state on a finite graph G, is equal, in the zero temperature antiferromagnetic case, to the number of proper coloring with q colors of the graph G, where proper coloring means that adjacent vertices of the graph must have different colors. This number viewed as a function of the number of colors q is actually a polynomial function in the variable q which is known as the chromatic polynomial. On the other hand, the same partition function in the general case can be related to more general chromatic type polynomials, known as Tutte polynomials [21]. This beautiful connection between statistical mechanics and graph coloring problems, first discussed by Fortuin and Kasteleyn [9], has been extensively studied and continues to attract many researchers till nowadays (see e.g. [1, 8, 15, 18–20, 22] and references therein). One of the main interests in statistical physics is to establish whether or not a given system exhibits phase transitions. This means to search for points in the interval β ∈ [0, ∞] where some thermodynamic function (like e.g. the free energy defined above) is non analytic. Now, functions such as (1.3) and (1.4) are manifestly analytic as long as V is a finite set. Hence a phase transition (i.e. non-analyticity) can arise only in the so called infinite volume limit or thermodynamic limit. That is, the graph G is some countably infinite graph, usually a regular lattice, and the infinite volume limit fG (q, β) = lim
N→∞
1 ln ZG|VN (q, β) |VN |
(1.5)
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials
217
is taken along a sequence VN of finite subsets of V such that, roughly speaking, G|VN increase in size equally in all directions. Typically, when V is Zd , VN are usually cubes of increasing size LN . There is a considerable amount of rigorous results about the thermodynamic limit and phase transitions for the Potts model on Zd and other regular lattices, see e.g. the reviews [23] and, more recently, [20]. On the other hand, the study of thermodynamic limits of spin systems on infinite graphs which are not usual lattices has recently gained the attention of many researchers (e.g. [3, 10, 11, 13] and references therein). Concerning specifically the antiferromagnetic Potts model and/or chromatic polynomial on graphs, Sokal [19] has shown recently that for any finite graph G with maximum degree , the zeros of the chromatic polynomial lies in a disk q ≤ C, and he optimally estimated such a constant, using the Dobrushin inductive approach in the cluster expansion framework and numerical methods, giving the value C = 7.963907. An important extension of this result would be to prove the existence and analyticity of the limiting free energy per unit volume (1.5) for a suitable class, as wide as possible, of infinite graphs. Such a generalization would be relevant from the statistical mechanics point of view, since it would imply that the anti-ferromagnetic Potts model on such a class of infinite graphs, if q is sufficiently large, does not present a phase transition at zero temperature (and hence at any temperature). In various papers, see e.g. [12, 16 and 19] for a more explicit mathematical formulation, has been conjectured that the ground state entropy per unit volume of the antiferromagnetic Potts model at zero temperature on an infinite graph G should be analytic in the neighborhood of 1/q = 0 whenever G is a regular lattice. In this paper we actually prove that this conjecture is true not only for regular lattices, but even for a wide class of graphs. In particular we prove that the ground state zero entropy is analytic near 1/q = 0 for all infinite graphs which are quasi-transitive and amenable, and the limit may be evaluated along any Følner sequence in V. We stress that this result proves the conjecture above in a considerably stronger formulation, since all regular lattices, either with the elementary cell made by one single vertex or by more than one vertex, are indeed quasi-transitive amenable graphs but actually the class of quasi-transitive amenable graphs is much wider than that of regular lattices. In order to obtain this result we perform a traditional polymer expansion. In the proof of analyticity we use quite rough analytical bounds to improve readability. Therefore our value of the constant appearing in the estimate of the convergence radius is not optimized as in [19]. However it should be possible to enlarge our radius of convergence with the same numerical techniques used in [19]. The constant could be greater than the quoted value since in order to prove the existence of the thermodynamic limit we have to show the absolute convergence not only of the pressure of the polymer gas, but also of more general quantities (see (3.12) and Lemma 4 below). We would also like to stress that the polymer expansion introduced here can be used without changes to prove the same result for finite temperature. We restrict ourselves to the zero temperature case since we were interested in the proof of the conjecture cited above. The paper is organized as follows. In Sect. 2 we introduce the notations used along the paper, and we state the main result (Theorem 2). In Sect. 3 we rephrase the problem in term of polymer expansion and prove a main technical result (Lemma 4). In Sect. 4 we prove a graph theory property (Lemma 6) concerning quasi-transitive amenable graphs. Finally in Sect. 5 we give the proof of the main result of the paper, i.e. Theorem 2.
218
A. Procacci, B. Scoppola, V. Gerasimov
2. Some Further Notations and Statement of the Main Result In general, if V is any finite set, we denote by |V | the number of elements of V . The set {1, 2, . . . , n} will be denoted shortly In . We denote by P2 (V ) the set of all subsets U ⊂ V such that |U | = 2 and by P≥2 (V ) the set of all finite subsets U ⊂ V such that |U | ≥ 2. Given a countable set V , and given E ⊂ P2 (V ), the pair G = (V , E) is called a graph in V . The elements of V are called vertices of G and the elements of E are called edges of G. Given two graphs G = (V , E) and G = (V , E ) in V , we say that G ⊂ G if E ⊂ E and V ⊂ V . Given a graph G = (V , E), two vertices x and y in V are said to be adjacent if {x, y} ∈ E. The degree dx of a vertex x ∈ V in G is the number of vertices y adjacent to x. A graph G = (V , E) is called locally finite if dx < +∞ for all x ∈ V , and it is called bounded degree if maxx∈V {dx } ≤ < ∞. A graph G = (V , E) is said to be connected if for any pair B, C of subsets of V such that B ∪ C = V and B ∩ C = ∅, there is an edge e ∈ E such that e ∩ B = ∅ and e ∩ C = ∅. We denote by GV the set of all connected graphs with vertex set V . If V = In we use the notation Gn in place of GIn . A tree graph τ on V is a connected graph τ ∈ GV such that |τ | = |V | − 1. We denote by TV the set of all tree graphs of V and shortly Tn in place of TIn . Let Rn ≡ (R1 , . . . , Rn ) be an ordered n-ple of non empty sets; then we denote by E(Rn ) the set ⊂ P2 (In ) defined as E(Rn ) = {{i, j } ∈ P2 (In ) : Ri ∩ Rj = ∅}. We denote G(Rn ) the graph (In , E(Rn )). Given two distinct vertices x and y of G = (V , E), a path τ (x, y) joining x to y is a tree sub-graph of G with dx = dy = 1 and dz = 2 for any vertex z in τ (x, y) distinct from x and y. We define the distance between x and y as |x − y| = min{|τ (x, y)| : τ (x, y) path in G}. Remark that |x − y| = 1 ⇔ {x, y} ∈ E. Given G = (V , E) connected and R ⊂ V , let E|R = {{x, y} ∈ E : x ∈ R, y ∈ R} and define the graph G|R = (R, E|R ). Note that G|R is a sub-graph of G. We call G|R the restriction of G to R. We say that R ⊂ V is connected if G|R is connected. For any non-void R ⊂ V , we further denote by ∂R the external boundary of R which is the subset of V \R given by ∂R = {y ∈ V \R : ∃x ∈ R : |x − y| = 1}.
(2.1)
An automorphism of a graph G = (V , E) is a bijective map γ : V → V such that {x, y} ∈ E ⇒ {γ x, γ y} ∈ E. A graph G = (V , E) is called transitive if, for any x, y in V , an automorphism γ on G exists which maps x to y. The graph G is called quasi-transitive if V can be partitioned in finitely many sets O1 , . . . Os (vertex orbits) such that for {x, y} ∈ Oi an automorphism γ on G exists which maps x to y and this holds for all i = 1, . . . , s. If x ∈ Oi and y ∈ Oi we say that x and y are equivalent. Remark that a locally finite quasi-transitive graph is necessarily bounded degree. Roughly speaking in a transitive graph any vertex of the graph is equivalent; in other words G “looks the same” to observers sitting in different vertices. In a quasi-transitive graph there is a finite number of different type of vertices and G “looks the same” to observers sitting in vertices of the same type. As an immediate example all periodic lattices with the elementary cell made by one site (e.g. square lattice, triangular lattice, hexagonal lattice, etc.) are transitive infinite graphs, while periodic lattices with the elementary cell made by more than one site are quasi-transitive infinite graphs.
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials
219
Let G = (V, E) be a connected infinite graph. G is said to be amenable if |∂W | : W ⊂ V, 0 < |W | < +∞ = 0. inf |W | A sequence {VN }N∈N of finite sub-sets of V is called a Følner sequence if |∂VN | = 0. N→∞ |VN | lim
(2.2)
Note that such a definition recalls the notion of Van Hove sequence in statistical mechanics. From now on G = (V, E) will denote a connected locally finite infinite graph and VN ⊂ V a finite subset. The partition function of the antiferromagnetic Potts model with q colours on G|VN at zero temperature can be rewritten (in a slightly different notation with respect to (1.1)) as ZG|VN (q) = (2.3) exp − Jxy δσx σy , {x,y}∈P2 (VN )
σVN
where Jxy =
+∞
if |x − y| = 1
, (2.4) 0 otherwise and when in (2.3) a product 0 · ∞ appears it must be interpreted as zero. We stress again that, due to assumption (2.4) (i.e. antiferromagnetic interaction + zero temperature), the function ZG|VN (q) represents the number of ways that the vertices x ∈ VN of G|VN can be assigned “colors” from the set {1, 2, . . . , q} in such a way that adjacent vertices always receive different colors. We also recall that the function ZG|VN (q) is called, in graph theory language, the chromatic polynomial of G|VN . Definition 1. Let G = (V, E) be a connected and locally finite infinite graph and let {VN }N∈N be a Følner sequence of subsets of V. Then we define, if it exists, the ground state specific entropy of the antiferromagnetic Potts model at zero temperature on G as SG (q) = lim
N→∞
1 ln ZG|VN (q). |VN |
We also define the reduced ground state degeneracy per site as 1 1 |V | lim ZG|VN (q) N . Wr (G, q) = q N→∞
(2.5)
(2.6)
The ground state specific entropy SG (q) and the reduced ground state degeneracy Wr (G, q) are directly related by the identity SG (q) = ln Wr (G, q) + ln q .
(2.7)
We can now state our main result. Theorem 2. Let G = (V, E) be a locally finite connected quasi-transitive amenable infinite graph with maximum degree , and let {VN }N∈N be a Følner sequence in G. Then, Wr (G, q) exists, is finite, is independent on the choice of the sequence {VN }N∈N , and is analytic in the variable 1/q whenever |1/q| < 1/2e3 (e being the basis of the natural logarithm).
220
A. Procacci, B. Scoppola, V. Gerasimov
Again we stress that this result proves the conjecture in [12], [16] and [19] a considerably stronger formulation, since any regular lattice is a quasi-transitive amenable graph but the class of quasi-transitive amenable graphs is actually much wider than that of regular lattices.
3. Polymer Expansion and Analyticity We first rewrite the partition function of the Potts model on a generic finite graph G = (V , E) as a hard core Polymer gas grand canonical partition function. This is a standard procedure (see e.g. [19]). Without loss of generality, we will assume in this section that G is a sub-graph of a bounded degree infinite graph G = (V, E) with maximum degree . Denote by π(V ) the set of all unordered partitions of V , i.e. an element of π(V ) is an unordered n-ple {R1 , R2 , . . . , Rn }, with 1 ≤ n ≤ |V |, such that, for i, j ∈ In , Ri ⊂ V , Ri = ∅, Ri ∩ Rj = ∅,and ∪ni=1 Ri = V . Then, by writing the factor exp{− {x,y}⊂V δσx σy Jxy } in (2.3) as {x,y}⊂V [(exp{−δσx σy Jxy } − 1) + 1] and developing the product (a standard Mayer expansion procedure, see e.g [7 or 6] and references therein) we can rewrite the partition function on G (2.3) as ZG (q) = q |V | G (q), where
G (q) =
(3.1)
ρ(R1 ) . . . ρ(Rn )
(3.2)
n≥1 {R1 ,...,Rn }∈π(V )
with 1 −|R| ρ(R) = q σR ∈R 0
if |R| = 1
E ⊂P2 (R) (R,E )∈GR
{x,y}∈E
[e−δσx σy Jxy − 1]
if |R| ≥ 2 and G|R ∈ GR . if |R| ≥ 2 and G|R ∈ / GR (3.3)
Observe that the sum in l.h.s. of (3.3) runs over all possible connected graphs with vertex set R. The r.h.s. of (3.2) can be written in a more compact way, by using the short notations Rn ≡ (R1 , . . . , Rn ) as
G (q) = 1 +
; 1 n! n≥1
ρ(Rn ) ≡ ρ(R1 ) · · · ρ(Rn )
ρ(Rn ) ,
(3.4)
Rn ∈[P≥2 (V )]n Ri ∩Rj =∅ ∀{i,j }⊂In
where [P≥2 (V )]n denote the n-times Cartesian product of P≥2 (V ) (which, we recall, denotes the set of all finite subsets of V with cardinality greater than 2). It is also convenient to simplify the expression for the activity (3.3) by performing the sum over σR . As a matter of fact
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials
q −|R|
σR ∈R
E ⊂P2 (R) (R,E )∈GR
{x,y}∈E
= q −|R|
= q −|R|
[e−δσx σy Jxy − 1]
σR ∈R
E ⊂P2 (R) (R,E )∈GR
{x,y}∈E
E ⊂P2 (R) (R,E )∈GR
221
δσx σy [e−Jxy − 1]
δσx σy
σR ∈R {x,y}∈E
[e−Jxy − 1] .
{x,y}∈E
But now, for any connected graph (R, E ) ∈ GR , δσx σy = q . σR ∈R {x,y}∈E
Hence we get, for |R| > 1, −(|R|−1) q ρ(R) = 0
E ⊂P2 (R) (R,E )∈GR
{x,y}∈E
[e−Jxy − 1]
if G|R ∈ GR .
(3.5)
otherwise
By definitions (3.5) or (3.3), the polymer activity ρ(R) can be viewed as a real valued function defined on any finite subset R of V. Of course this function depends on the “topological structure” of G. We remark that if γ is an automorphism of G, then (3.5) clearly implies that ρ(γ R) = ρ(R). In other words the activity ρ(R) is invariant under the automorphism of G. The function G (q) is the standard grand canonical partition function of a hard core polymer gas in which the polymers are finite subsets R ⊂ V with cardinality greater than 2, with activity ρ(R), and submitted to a hard core condition (Ri ∩ Rj = ∅ for any pair {i, j } ∈ In ). Note that by (3.1) and definitions (2.6)-(2.7) we have 1 (3.6) ln G|VN (q) . Wr (G, q) = exp lim N→∞ |VN | It is a well known fact in statistical mechanics that the natural logarithm of G can be rewritten as a formal series, called the Mayer series (see e.g. [7]) as ln G (q) =
∞ 1 n! n=1
where
φ (Rn ) = T
E ⊂E(Rn ) (In ,E )∈Gn
φ T (Rn )ρ(Rn ) ,
(3.7)
Rn ∈[P≥2 (V )]n
{i,j }∈E (−1)
|E |
if G(Rn ) ∈ Gn
(3.8) 0 otherwise and G(Rn ) ≡ G(R1 , . . . , Rn ) defined at the beginning of Sect. 2. The reader should note that the summation in the l.h.s. of (3.4) is actually a finite sum. On the contrary, the summation in the r.h.s. of (3.7) is an infinite series. We conclude this section showing two important technical lemmas concerning precisely the convergence of the series (3.7). In the proof of both lemmas we will use a well
222
A. Procacci, B. Scoppola, V. Gerasimov
known combinatorial inequality due to Rota [14], which states that if G = (V , E) is a connected graph, i.e. G ∈ GV , then |E | (−1) 1 = NTV [G] , (3.9) ≤ E ⊂E: E ⊂E: (V ,E )∈TV (V ,E )∈GV where NTV [G] is the number of tree graphs with vertex set V which are sub-graphsof G. We want to outline that a similar inequality can be used to treat the finite temperature case. Actually in the latter case the quantity −1 in the l.h.s. of (3.9) is replaced by e−βJ − 1 which is, in the antiferromagnetic case, a quantity between 0 and 1. Then one can use, instead of the Rota inequality, the more general Penrose inequality (see e.g. [6 and 19]) and get a bound similar to (3.10) with e/q replaced by e e−βJ − 1/q. Lemma 3. Let G = (V, E) be a bounded degree infinite graph with maximum degree , and let, for any R ∈ V such that |R| ≥ 2, the activity ρ(R) be given as in (3.5). Then, for any n ≥ 2, e n−1 sup |ρ(R)| ≤ , (3.10) |q| x∈V R⊂V: x∈R |R|=n
Proof. By definition
sup
x∈V
|ρ(R)| = |q|−(n−1) sup
x∈V
R∈V: x∈R |R|=n
R⊂V: x∈R |R|=n, G|R ∈GR
E ⊂P2 (R) (R,E )∈G
[e−Jxy
{x,y}∈E
R
− 1] . (3.11)
Using thus the Rota inequality (3.9), recalling that E|R = {{x, y} ∈ E : x ∈ R, y ∈ R}, and observing that e−Jxy − 1 = −1 if |x − y| = 1 and e−Jxy − 1 = 0 otherwise, we get −Jxy |E | = [e − 1] (−1) 1 ≤ E ⊂P2 (R) {x,y}∈E E ⊂E|R E ⊂E|R : (R,E )∈G (R,E )∈T (R,E )∈G R R R δ|x−y|1 , = E ⊂P2 (R) (R,E )∈TR
{x,y}∈E
where δ|x−y|1 = 1 if |x − y| = 1 and δ|x−y|1 = 0 otherwise. Hence sup |ρ(R)| ≤ |q|−(n−1) sup δ|x−y|1 x∈V
x∈V
R⊂V: x∈R |R|=n, G|R ∈GR
R⊂V: x∈R E ⊂P2 (R) |R|=n (R,E )∈TR
{x,y}∈E
≤
|q|−(n−1) sup (n − 1)! x∈V x E ⊂P2 (In ) (In ,E )∈Tn
n−1 1 =x, (x2 ,...,xn )∈V xi =xj ∀{i,j }∈In
{i,j }∈E
δ|xi −xj |1 .
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials
223
It is now easy to check that, for any E ⊂ P2 (In ) such that (In , E ) is a tree, it holds
x1 =x, (x2 ,...,xn )∈Vn−1 xi =xj ∀{i,j }∈In
{i,j }∈E
sup
x∈V
and since, by the Cayley formula,
sup
x∈V
1 = nn−2 , we get
E ⊂P2 (In ) (R,E )∈Tn
|ρ(R)| ≤
R⊂V: x∈R |R|=n
δ|xi −xj |1 ≤ n−1
|q|
n−1
e n−1 nn−2 . ≤ (n − 1)! |q|
To state the second lemma we need to introduce a formal series more general than the r.h.s. of (3.7). Let thus U ⊂ V be finite and let m be a positive integer. We define SUm (G, q) =
∞ 1 n! n=1
φ T (Rn )ρ(Rn ) ,
(3.12)
(V)]n
Rn ∈[P≥2 |Rn |≥m, R1 ∩U =∅
where |Rn | = ni=1 |Ri | and recall that P≥2 (V) denotes the set of all finite subsets of V with cardinality greater than or equal to 2 and [P≥2 (V)]n denote the n-times Cartesian product. We will now prove the following: Lemma 4. Let G = (V, E) be a locally finite infinite graph with maximum degree . Let U ⊂ V be finite and let m be a positive integer. Then SUm (G, q) defined in (3.12) exists and is analytic as a function of 1/q in the disk |2e3 /q| < 1. Moreover it satisfies the following bound: |SUm (G, q)|
≤ |U |
1−
1 2e3 |/q|
3 m/2 2e . q
Proof. We will prove the theorem by showing directly that the r.h.s. of (3.12) converges absolutely when |1/q| is sufficiently small. Let us define |S|m U (G) =
∞ 1 n! n=1
|φ T (Rn )ρ(Rn )| ,
(3.13)
Rn ∈[P≥2 (V)]n |Rn |≥m, R1 ∩U =∅
m then |SUm (G)| ≤ |S|m U (G). We now bound |S|U (G). We have:
|S|m U (G)
∞ [s/2] 1 ≤ n! s=m n=1
Rn ∈[P≥2 (V)]n R1 ∩U =∅, |Rn |=s
∞ [s/2] 1 |φ (Rn )ρ(Rn )| = n! s=m T
n=1
kn ∈Nn : ki ≥2 k1 +...+kn =s
Bn (kn ) ,
224
A. Procacci, B. Scoppola, V. Gerasimov
where kn ≡ (k1 , . . . , kn ), Nn denotes the n- times Cartesian product of N, [s/2] = max{ ∈ N : ≤ s/2}, and |φ T (Rn )ρ(Rn )| . Bn (kn ) = Rn ∈[P≥2 (V)]n R1 ∩U =∅ |R1 |=k1 ,..., |Rn |=kn
Recalling now (3.8) and using again the Rota bound (3.9) we get
if G(Rn ) ∈ Gn ≤ NTn [G(Rn )] T . |φ (Rn )| =0 otherwise Hence
Bn (kn ) ≤
NTn [G]
G∈Gn
|ρ(Rn )| .
(3.14)
Rn ∈[P≥2 (V)]n R1 ∩U =∅, G(Rn )=G |R1 |=k1 ,...,|Rn |=kn
Observing now that
NTn [G](· · ·) =
G∈Gn
we can rewrite
(· · ·) ,
τ ∈Tn G∈Gn : G⊃τ
Bn (kn ) ≤
Bn (τ, kn ) ,
(3.15)
τ ∈Tn
where
Bn (τ, kn ) =
|ρ(Rn )| .
Rn ∈[P≥2 (V)]n R1 ∩U =∅, G(Rn )⊃τ |R1 |=k1 ,...,|Rn |=kn
Note now that for any non negative function F (R) it holds F (R) ≤ |R | sup F (R) . x∈V
R∈V: R∩R =∅ |R|=k
(3.16)
R∈V x∈R, |R|=k
Hence we can now estimate Bn (τ, kn ) for any fixed τ by explicitly performing the sum over polymers Rn submitted to the constraint that g(Rn ) ⊃ τ , summing first over the “outermost polymers”, i.e. those polymers Ri such that i is a vertex of degree 1 in τ , and using repetitively the bounds (3.16). Then one can easily check that Bn (τ, kn ) ≤ |U | sup
x∈V
R1 ∈V x∈R1 , |R1 |=k1
· · · sup
x∈V
|ρ(R1 )||R1 |
Rn ∈V x∈Rn |Rn |=kn
d1
k
|Ri |di −1 |ρ(Ri )| ,
i=2
(3.17) where di is the degree of the vertex i of τ . Recall that, for any tree τ ∈ Tn , 1 ≤ di ≤ n−1 and d1 + . . . + dn = 2n − 2 hold. Now, by Lemma 3, (3.10), we can bound Bn (τ, kn ) ≤ |U |εk1 −1 k1d1
k
kidi −1 ε kn −1 ,
i=2
(3.18)
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials
225
where we have put for simplicity ε = e/|q|. Noting that estimates in the l.h.s. of (3.17) depends only on the degrees d1 , . . . , dn of the vertices in τ , we can now easily sum over all connected tree graphs in Tn and obtain Bn (kn ) ≤ Bn (τ, kn ) = Bn (τ, kn ) τ ∈Tn
r1 ,...,rn τ ∈Tn r1 +...+rn =2n−2 d1 =r1 ,...,dn =rn 1≤ri ≤n−1
≤ |U |
(n − 2)!k1
r1 ,...,rn r1 +...rn =2n−2 1≤ri ≤n−1
n i=1
kiri −1 ki −1 ε (ri − 1)!
,
where in the second line we used the bound (3.17) and the Cayley formula (n − 2)! 1 = n . i=1 (di − 1)! τ ∈T
(3.19)
n d1 ,...dn fixed
Now, recalling that k1 + . . . + kn = s and using the Newton multinomial formula, we get Bn (kn ) ≤ |U |k1 s n−2 ε s−n ≤ |U |s n ε s−n .
Thus, since k1 ,...,kn : ki ≥2 1 ≤ 2s−n , we obtain k1 +...kn =s
|SUm (G)| ≤ |U |
∞ [s/2] sn s=m n=1
n!
The series above converges if ε < |SUm (G)|
≤ |U |
∞ [s/2] sn s=m n=1
n!
ε s−n
k1 ,...,kn : ki ≥2 k1 +...kn =s
1 2e
∞ [s/2] sn
1 ≤ |U |
s=m n=1
n!
[2ε]s−n .
and we get the bound
[2ε]s−n ≤
∞
[2ε]s−[s/2]
s=m
≤ |U |
∞ n s n=1
n!
≤
∞
[2e2 ε]s/2
s=m
m/2
2e2 ε √ 1 − e 2ε
provided 2e2 ε < 1 . Hence, recalling that ε = e/|q|, the lemma is proved.
The following corollary is now a trivial consequence of the two lemmas above. Corollary 5. Let G = (V , E) be any finite connected sub-graph of an infinite connected bounded degree graph G = (V, E) with maximum degree . Then the function |V |−1 log G (q) is analytic in the variable 1/q for |1/q| < 1/2e3 and it admits the following bound uniformly in |V |: 1 3 1 ≤ 2e . (q) log
(3.20) G |V | q 1 − 2e3 |/q| Proof. For any G = (V , E) ⊂ G = (V, E) with V finite, by definition (3.7) and (3.12), it holds that | ln G (q)| ≤ |S|2V (G, q) and one can thus apply Lemma 4.
226
A. Procacci, B. Scoppola, V. Gerasimov
4. A Graph Theory Lemma Lemma 6. Let G = (V, E) be a locally finite quasi-transitive infinite graph and let {VN }N∈N be a Følner sequence of finite subsets of V. Then, for every vertex orbit O ⊂ V of Aut(G), there exists a non-zero finite limit |O∩VN | N→∞ |VN | lim
(4.1)
and it is independent of the choice of the sequence {VN }N∈N . Remark . A proof of this lemma can be found in [2] (Prop. 3.6). We give an alternative and self-contained proof for completeness. Proof. For a natural r and a finite set F ⊂V denote by Br F the set Br (F ) = {x∈V : ∃y∈F |x − y|r} .
(4.2)
Thus, for a single-point set {y}, Br ({y}) is the ball of radius r centered at y. Moreover we have the bound |Br (F )| |F |(1++ . . . +r ) r+1 |F | .
(4.3)
Let O1 , . . . , Os be the complete list of vertex orbits of Aut(G) in the set V and let A0 ⊂ V be a set with exactly one element in common with every orbit. Denote by d the diameter of A0 . Consider the orbit A = {gA0 : g∈Aut(G)} of A0 . A set A ⊂ V is therefore an element of A if g∈Aut(G) exists such that A = gA0 . For any set U ⊂V we denote AU = {A∈A : A⊂U }. Note that for any set A ∈ A and any vertex orbit O, we have that |A ∩ O| = 1, hence for a fixed vertex orbit Oi we can define the function ϕi as follows: ϕi : A → Oi : A → A ∩ Oi . The function ϕi is a surjection and for x ∈ Oi the number ki = |ϕi−1 (x)| is finite and does not depend on the choice of x ∈ Oi . To see that ki is finite simply observe that the automorphisms of G preserve the graph distance. Hence ϕi−1 (x) is contained in the ball of radius d centered at x. Such a ball has finite cardinality, is bounded by d+1 , and contains therefore a finite number of subsets. To see that |ϕi−1 (x)| does not depend on x when x varies in Oi , observe that if x, y ∈ Oi , we can find g ∈ Aut(G) such that y = gx. Then observe that ϕi−1 (y) ⊃ gϕi−1 (x) and thus |ϕi−1 (y)| ≥ |gϕi−1 (x)|. On the other side ϕi−1 (x) ⊃ g −1 ϕi−1 (y). Hence |ϕi−1 (x)| = |ϕi−1 (y)|. Define now the sets VN− = VN \Bd (∂VN ),
VN+ = VN ∪Bd (∂VN ) .
(4.4)
By construction VN− ⊂VN ⊂VN+ and ϕi−1 (VN− ∩Oi ) ⊂ AVN ⊂ ϕi−1 (VN ∩Oi ) ⊂ AV + . N
(4.5)
Indeed, suppose that A∈ϕi−1 (VN− ∩Oi ) and A ⊂VN . Let a1 =ϕi (A)∈VN− ⊂VN and let a2 ∈A\VN . There exists a path τ (a1 , a2 ) in G of length d. This chain must have at least one point in ∂VN . This implies a1 ∈Bd (∂VN ) contradicting the assumption a1 ∈VN− . The
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials
227
second inclusion of (4.5) is obvious and the third one is true by the same reason as the first one. Equation (4.5) implies ki |VN− ∩Oi | |AVN | ki |VN ∩Oi | |AVn+ |
(4.6)
and 1 1 |AVN | |VN ∩Oi | |AV + | . N ki ki
|VN− ∩Oi |
(4.7)
By taking the sum over i we get |VN− | α|AVN | |VN | α|AV + | ,
(4.8)
N
where α =
s
1 i=1 ki .
On the other hand AV + \AVN ⊂ ABd (∂VN ) and, by (4.3), N
|AV + | |AVN | + |ABd (∂VN ) | |AVN | + kd+1 |∂VN | ,
(4.9)
N
where k=max{ki : i∈{1, . . . , s}}. From the first inequality of (4.8) we have |AVN | If
|∂VN | |VN |
1 − 1 1 |VN | (|VN | − |Bd (∂VN )|) (|VN | − d+1 |∂VN |) . α α α
(4.10)
ε then, by (4.9) and (4.10), 1
|AV + | N
|AVN |
1+
d+1 |∂VN | 1 d+1 |∂V |) N α (|VN | −
1+
d+1 ε 1 d+1 ε) α (1 −
.
This proves that lim
N→∞
|AV + |
= 1.
(4.11)
|AVN | 1 = . |VN | α
(4.12)
N
|AVN |
By (4.8) and (4.10) we also have lim
N→∞
Dividing (4.7) by |VN | and using (4.12) we obtain lim
N→∞
and the lemma is proved.
1 |Oi ∩VN | = , |VN | ki α
228
A. Procacci, B. Scoppola, V. Gerasimov
5. Potts Model on Infinite Graphs: Proof of Theorem 2 Let G = (V, E) be an infinite bounded degree graph and let x ∈ V. Then we define ∞ 1 fG (x) = n! n=1
φ T (Rn )
Rn ∈[P≥2 (V)]n x∈R1
ρ(Rn ) . |R1 |
(5.1)
We stress that, by construction, fG (x) is invariant under automorphism. I.e. if x ∈ V and y ∈ V are equivalent (i.e. a γ automorphism of G exists such that y = γ x) then fG (x) = fG (y). Given now a finite set VN ⊂ V, we define 1 F (VN ) = fG (x) . (5.2) |VN | x∈VN
The numbers F (VN ) are actually functions of q. As a trivial corollary of Lemma 4 we can state the following Lemma 7. Let G = (V, E) be an infinite bounded degree. Then for any VN ⊂ V finite, the functions fG (x) and F (VN ) defined in (5.1) and (5.2)are analytic in the variable 1/q for |1/q| < 1/2e3 and bounded by 2e3 /q /(1 − 2e3 |/q|) uniformly in N . Proof. Comparing the l.h.s. of (3.12) with the l.h.s. of (5.1) we have that |fG (x)| ≤ |S|2{x} (G), hence one can again use Lemma 4 and get immediately the proof. From Lemma 6 and Lemma 7 it follows: Proposition 8. Let G = (V, E) be a locally finite quasi-transitive infinite graph and let {VN }N∈N be a sequence of finite subsets of V such that |∂VN |/|VN | → 0 as N → ∞. Let be the maximum degree of G, then the limit . lim F (VN ) = FG (q) (5.3) N→∞
exists, is finite, is independent on the sequence {VN }N∈N , and is analytic as a function of 1/q for |1/q| < 1/2e3 . Proof. If the limit (5.3) exists, then by Lemma 7 it is clearly bounded by |2e3 /q|/(1 − 3 2e |/q|) and is analytic in 1/q for |1/q| < 1/2e3 . To prove the existence of the limit (5.3) we proceed as follows. Since G is quasi-transitive then V can be partitioned into orbits O1 , . . . , Os of Aut(G) such that for any two vertices x, y in the same orbit Oi there is an automorphism of G which maps x to y. Hence for such a pair we have fG (x) = fG (y) and we can conclude that fG (x) has value in a finite set {f1 , . . . , fs } with fi = fG (x), where x is any vertex x ∈ Oi . Thus for any finite connected VN and any j ∈ {1, 2, . . . , s} we have 1 |VN ∩ O1 | |VN ∩ Os | fG (x) = f1 + . . . fs , |VN | |VN | |VN | x∈VN
hence lim F (VN ) == f1 lim
N→∞
N→∞
|VN ∩ O1 | |VN ∩ Os | + . . . + fs lim , N→∞ |VN | |VN |
and by Lemma 6 the limit above exists.
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials
229
We are at last in the position to prove the main results of the paper, namely Theorem 2 enunciated at the end of Sect. 2. Proof of Theorem 2. We will prove that limN→∞ |VN |−1 log G|VN (q) = FG (q), where FG (q) is the function defined in (5.3) and then use definition (3.6), fG (x) log G|VN − x∈VN
=
∞ n=1
1 n!
Now note that
Rn ∈[P≥2 (VN )]n
x∈VN
(· · ·) =
R1 ∈VN x∈R1
(· · ·) =
Rn ∈[P≥2 (V)]n x∈R1
moreover
x∈VN
Rn ∈[P≥2 (V)]n x∈R1
φ T (Rn )ρ(Rn ) −
ρ(Rn ) . |R1 |
(· · ·) ,
Rn ∈[P≥2 (V)]n x∈R1 ∃Ri : Ri ⊂VN
|R1 |(· · ·) ,
R1 ∈VN
(· · ·) +
Rn ∈[P≥2 (VN )]n x∈R1
φ T (Rn )
x∈VN
(· · ·) =
|R1 ∩ VN |(· · ·) .
R1 ∈V
R1 ∈V x∈R1
Hence, using also that |R1 ∩ VN |/|R1 | ≤ 1 we get ∞ 1 T fG (x) ≤ log G|VN − φ (Rn )ρ(Rn ) . n! R ∈[P (V)]n n=1
x∈VN
n ≥2 R1 ∩VN =∅ ∃Ri : Ri ⊂VN
Let us now choose p > ln and define p mN
1 |VN | = ln . p |∂VN |
(5.4)
We remark that, since by the hypothesis the sequence VN is Følner and hence (2.2) holds, p then limN→∞ mN = ∞, for any integer p. We now can rewrite (· · ·) = (· · ·) + (· · ·) . Rn ∈[P≥2 (V)]n R1 ∩VN =∅ ∃Ri : Ri ⊂VN
Rn ∈[P≥2 (V)]n p R1 ∩VN =∅, |Rn |≥mN ∃Ri : Ri ⊂VN ,
Hence ∞ 1 log G| − f (x) ≤ G VN n! n=1 x∈VN
+
∞ 1 n! n=1
Rn ∈[P≥2 (V)]n p R1 ∩VN =∅, |Rn |<mN ∃Ri : Ri ⊂VN ,
Rn ∈[P≥2 (V)]n p R1 ∩VN =∅, |Rn |≥mN ∃Ri : Ri ⊂VN ,
Rn ∈[P≥2 (V)]n
p R1 ∩VN =∅, |Rn |<mN ∃Ri : Ri ⊂VN ,
T φ (Rn )ρ(Rn )
T φ (Rn )ρ(Rn ) .
(5.5)
230
A. Procacci, B. Scoppola, V. Gerasimov
But, concerning the first sum, recalling definition (3.13), we have ∞ 1 n! n=1
p p m T φ (Rn )ρ(Rn ) ≤ |S|VNN (G, q) ≤ Const.|VN |ε mN /2 ,
Rn ∈[P≥2 (V)]n p R1 ∩VN =∅, |Rn |≥mN ∃Ri : Ri ⊂VN ,
where ε = q/2e3 < 1 by hypothesis. which, divided by |VN |, converges to zero as p N → ∞ because by hypothesis mN → ∞ as N → ∞. T On the other hand, recalling that due to
the factor φ (Rn ) the sets Ri pmust be pairwise connected, we have that | ∪i Ri | < i |Ri |. So, since | ∪i Ri | < mN and at least one among Ri intersects ∂VN , this means that all polymers Ri must lie in the set p
Bmp (∂VN ) = {x ∈ V : ∃v ∈ ∂VN : |x − v| ≤ mN } . N
Recalling (4.2) we have p
|Bmp (∂VN )| ≤ |∂VN |mN +1 . N
Hence we have, again recalling (3.13), that the second sum in the l.h.s. of (5.5) is bounded by ∞ 1 n! n=1
T φ (Rn )ρ(Rn ) ≤ |S|2B
p (∂VN ) mN
Rn ∈[P≥2 (V)]n
(G, q)
p R1 ∩VN =∅, |Rn |<mN ∃Ri : Ri ⊂VN ,
p
≤ Cost.|Bmp (∂VN )|ε ≤ Const.|∂VN |mN ε . N
Thus recalling definition (5.4), we have 1 1 1 = − f (x) − F (q) log
log
G | G G | G |V | |V | VN VN |V | N x∈V N N N
≤ Const.
|∂VN | |VN |
| ln ε| p
+ Const.ε
ln |∂VN | 1− p . |VN |
Since by hypothesis |∂VN |/|VN | → 0 as N → ∞, we conclude that the quantity above is as small as we please for N large enough. This ends the proof of the theorem. Acknowledgements. This work was supported by the Ministero dell’Universita’ e della Ricerca Scientifica e Tecnologica (Italy) and Conselho Nacional de Desenvolvimento Cientfico e Tecnol´ogico CNPq, a Brazilian Governmental agency promoting Scientific and technology development (grant n. 460102/00-1). We thank Prof. M´ario Jorge Dias Carneiro for useful discussions and Prof. Bojan Mohar for a valuable suggestion via e-mail.
Potts Model on Infinite Graphs and the Limit of Chromatic Polynomials
231
References 1. Baxter, R.J.: Dichromatic Polynomials and Potts Models Summed Over Rooted Maps. Annals of Combinatorics 5, 17–36 (2001) 2. Benjamini, I., Lyons, R., Peres, Y., Schramm, O.: Group-invariant percolation on graphs. Geom. Funct. Anal. 9(1), 29–66 (1999) 3. Benjamini, I., Schramm, O.: Percolation beyond Z d , many questions and a few answers. Electr. Comm. Probab. 1, 71–82 (1996) 4. Biggs, N.L.: Chromatic and thermodynamic limits. J. Phys. A 8(10), L110–L112 (1975) 5. Biggs, N.L., Meredith, G.H.J.: Approximations for chromatic polynomials. J. Combinatorial Theory Ser. B 20(1), 5–19 (1976) 6. Brydges, D.: A short course on cluster expansion. In: Les Houches 1984, K. Osterwalder, R., Stora, (eds.), Amsterdam: North Holland Press, 1986 7. Cammarota, C.: Decay of Correlations for Infinite range Interactions in unbounded Spin Systems. Commun. Math Phys. 85, 517–528 (1982) 8. Chang, Shu-Chiuan, Shrock, Robert: Structural properties of Potts model partition functions and chromatic polynomials for lattice strips. Phys. A 296(1–2), 131–182 (2001) 9. Fortuin, C.M., Kasteleyn, P.W.: On the random-cluster model. I. Introduction and relation to other models. Physica 57, 536–564 (1972) 10. H¨aggstr¨om, O., Schonmann, R.H., Steif, J.E.: The Ising model on diluted graphs and strong amenability. Ann. Probab. 28(3), 1111–1137 (2000) 11. Jonasson, J.: The random cluster model on a general graph and a phase transition characterization of nonamenability. Stochastic Process Appl. 79, 335–534 (1999) 12. Kim, D., Enting, I.G.: The limit of chromatic polynomials. J. Combin. Theory Ser. B 26(3), 327–336 (1979) 13. Lyons, R.: Phase transitions on nonamenable graphs. Probabilistic techniques in equilibrium and nonequilibrium statistical physics. J. Math. Phys. 41(3), 1099–1126 (2000) 14. Rota, G.: On the foundations of combinatorial theory. I. Theory of M¨obius functions. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 2, 340–368 (1964) 15. Salas, J., Sokal, A.D.: Transfer matrices and partition-function zeros for antiferromagnetic Potts models. I. General theory and square-lattice chromatic polynomial. J. Statist. Phys. 104(3–4), 609– 699 (2001) 16. Shrock, R., Tsai, S.H.: Families of graphs with Wr (G, q) functions that are nonanalytic at 1/q=0. Phys. Rev. E 56(4), 3935–3943 (1997) 17. Shrock, R., Tsai, S.H.: Ground-state degeneracy of Potts antiferromagnets: Homeomorphic classes with noncompact W boundaries. Physica A 265(1-2), 186–223 (1999) 18. Shrock, R.: Chromatic polynomials and their zeros and asymptotic limits for families of graphs. 17th British Combinatorial Conference (Canterbury, 1999). Discrete Math. 231, 421–446 (2001) 19. Sokal, A.D.: Bounds on the complex zeros of (di)chromatic polynomials and Potts-model partition functions. Combin. Probab. Comput. 10(1), 41–77 (2001) 20. Sokal, A.D.: A personal list of unsolved problems concerning lattice gases and antiferromagnetic Potts models. In: Inhomogeneous random systems (Cergy-Pontoise, 2000). Markov Process. Related Fields 7(1), 21–38 (2001) 21. Tutte, W.T.: A contribution to the theory of chromatic polynomials. Canadian J. Math. 6, 80–91 (1954) 22. Welsh, D.J.A., Merino, C.: The Potts model and the Tutte polynomial. J. Math. Phys. 41(3), 1127– 1152 (2000) 23. Wu, F.Y.: The Potts model. Rev. Mod. Phys. 54(1), 235–268 (1982) Communicated by H. Spohn
Commun. Math. Phys. 235, 233–253 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0805-9
Communications in
Mathematical Physics
Spectral Properties of Hypoelliptic Operators J.-P. Eckmann1,2 , M. Hairer1,∗ 1
D´epartement de Physique Th´eorique, Universit´e de Gen`eve, Geneve, Switzerland. E-mail:
[email protected];
[email protected] 2 Section de Math´ematiques, Universit´e de Gen`eve, Geneve, Switzerland Received: 30 July 2002 / Accepted: 18 October 2002 Published online: 25 February 2003 – © Springer-Verlag 2003
Abstract: We study hypoelliptic operators with polynomially bounded coefficients that T are of the form K = m i=1 Xi Xi + X0 + f , where the Xj denote first order differential operators, f is a function with at most polynomial growth, and XiT denotes the formal adjoint of Xi in L2 . For any ε > 0 we show that an inequality of the form uδ,δ ≤ C(u0,ε + (K + iy)u0,0 ) holds for suitable δ and C which are independent of y ∈ R, in weighted Sobolev spaces (the first index is the derivative, and the second the growth). We apply this result to the Fokker-Planck operator for an anharmonic chain of oscillators coupled to two heat baths. Using a method of H´erau and Nier [HN02], we conclude that its spectrum lies in a cusp {x + iy|x ≥ |y|τ − c, τ ∈ (0, 1], c ∈ R}. 1. Introduction In an interesting paper, [HN02], H´erau and Nier studied the Fokker-Planck equation associated to a Hamiltonian system H in contact with a heat reservoir at inverse temperature β. For this problem, it is well-known that the Gibbs measure µβ (dp dq) = exp (−βH (p, q)) dp dq is the only invariant measure for the system. In their study of convergence under the flow of any measure to the invariant measure, they were led to study spectral properties of the Fokker-Planck operator L when considered as an operator on L2 (µβ ). In particular, they showed that L has a compact resolvent and that its spectrum is located in a cusp-shaped region, as depicted in Fig. 1.1 below, improving (for a special case) earlier results obtained by Rey-Bellet and Thomas [RBT02b], who showed that e−Lt is compact and that L has spectrum only in Reλ > c > 0 aside from a simple eigenvalue at 0. ∗
Present address: Mathematics Research Centre of the University of Warwick
234
J.-P. Eckmann, M. Hairer Imλ
Reλ
Fig. 1.1. Cusp containing the spectrum of L
Extending the methods of [HN02], we show in this paper that the cusp-shape of the spectrum of L occurs for many H¨ormander-type operators of the form K=
m
XiT Xi + X0 + f,
(1.1)
i=1
(the symbol T denotes the formal L2 adjoint) when the family of vector fields {Xj }m j =0 is sufficiently non-degenerate (see Definition 2.1 and assumption b1 below) and some growth condition on f holds. The main motivation for our paper comes from the study of the model of heat conduction proposed in [EPR99a] and further studied in [EPR99b, EH00, RBT00, RBT02b, RBT02a]. These papers deal with Hamiltonian anharmonic chains of point-like particles with nearest-neighbor interactions whose ends are coupled to heat reservoirs modeled by linear classical field theories. Our results improve the detailed knowledge about the spectrum of the generator L of the associated Markov process, see Sect. 5. As a byproduct, our paper also gives a more elegant analytic proof of the results obtained in [EH00]. A short probabilistic proof has already been obtained in [RBT02b]. The main technical result needed to establish the cusp-form of the spectrum is the Sobolev estimate Theorem 4.1 which seems to be new.
2. Setup and Notations We will derive lower bounds for hypoelliptic operators with polynomially bounded coefficients that are of the form (1.1). We start by defining the class of functions and vector fields we consider. 2.1. Notations. For N ∈ R, we define the set PolN 0 of polynomially growing functions by ∞ n −N α PolN |∂ f (x)| ≤ Cα . (2.1) 0 = f ∈ C (R ) ∀α, sup (1 + x) x∈Rn
Spectral Properties of Hypoelliptic Operators
235
In this expression, α denotes a multi-index of arbitrary order. We also define the set PolN 1 of vector fields in Rn that can be written as n G = G0 (x) + Gj (x)∂j , Gi ∈ PolN 0 . j =1 th One can similarly define sets PolN k of k order differential operators. It is clear that if N+ε N M N X ∈ Polk and Y ∈ Pol , then [X, Y ] ∈ PolN+M k+−1 . If f is in Pol0 , but not in Pol0 for any ε > 0, we say it is of degree N.
2.2. Hypotheses.
n n Definition 2.1. A family {Ai }m j =1 Ai,j ∂j is called i=1 of vector fields in R with Ai = non-degenerate if there exist constants N and C such that for every x ∈ Rn and every vector v ∈ Rn one has the bound m v2 ≤ C(1 + x2 )N Ai (x), v 2 , with Ai (x), v =
n
i=1
j =1 Ai,j (x) vj .
The conditions on K which we will use below are taken from the following list. a The vector fields Xj with j = 0, . . . , m belong to PolN 1 and the function f belongs to PolN . 0 b0 There exists a finite number M such that the family consisting of {Xi }m i=0 , m m {[Xi , Xj ]}i,j =0 , [Xi , Xj ], Xk i,j,k=0 and so on up to commutators of rank M is non-degenerate. b1 There exists a finite number M such that the family consisting of {Xi }m i=1 , m {[Xi , Xj ]}m , [X , X ], X and so on up to commutators of rank M is i j k i,j =0 i,j,k=0 non-degenerate. The difference between b0 and b1 is in the inclusion of the vector field X0 (in b0 ), so that b1 is stronger than b0 . Definition 2.2. We call K0 the class of operators of the form of (1.1) satisfying a and b0 above, and K1 the class of those satisfying a and b1 . Clearly, b1 is more restrictive than b0 and therefore K1 ⊂ K0 . Remark 2.3. If K is in K0 then K is hypoelliptic. If K is in K1 then ∂t +K is hypoelliptic. 3. Localized Bound The main result of this section is Theorem 3.1 which provides bounds for localized test functions. We let B(x) denote the unit cube around x ∈ Rn : B(x) = y ∈ Rn |yj − xj | ≤ 1, j = 1, . . . , n . To formulate our bounds, we introduce the operator , defined as the positive square root of 2 = 1 − ni=1 ∂i2 = 1 − . Later on, we will also need the multiplication ¯ defined as the positive root of (multiplication by) ¯ 2 = 1 + x2 . operator
236
J.-P. Eckmann, M. Hairer
Theorem 3.1. Assume K ∈ K1 . Then, there
exist positive constants ε∗ , C∗ , and N∗ such that for every x ∈ Rn and every u ∈ C0∞ B(x) , one has uniformly for y ∈ R: ε∗ u ≤ C∗ (1 + x2 )N∗ u + (K + iy)u.
(3.1)
If K is in K0 (but not in K1 ) the same estimate holds, but the constant C∗ will depend generally on y.1 Proof. The novelty of the bound is in allowing for polynomial growth of the coefficients of the differential operators. Were it not for this, the result would be a special case of H¨ormander’s proof of hypoellipticity of second-order partial differential operators [H¨or85, Thm. 22.2.1]. Since the coefficients of our differential operators can grow polynomially we need to work with weighted spaces. We introduce a family of weighted Sobolev spaces S α,β with α, β ∈ R as the following subset of tempered distributions Sn on Rn : ¯ β u ∈ L2 (Rn )}. S α,β = {u ∈ Sn | α We equip this space with the scalar product ¯ β f, α ¯ β g L2 , f, g α,β = α
(3.2)
writing also ·, · α instead of ·, · α,0 . We also use the corresponding norms · α,β . Note that these spaces are actually a particular case of the more general class of Sobolev spaces introduced in [BC94]. The following lemma lists a few properties of the spaces S α,β that will be useful in the sequel. We postpone its proof to Appendix A. Lemma 3.2. Let α, β ∈ R. We have the following:
a. Embedding: For α ≥ α and β ≥ β, the space S α ,β is continuously embedded into S α,β . The embedding is compact if and only if both inequalities are strict. ¯ γ are bounded from S α,β into S α−γ ,β and b. Scales of spaces: The operators γ and N α,β−γ S respectively. If X ∈ Polk then X is bounded from S α,β into S α−k,β−N . c. Polarization: For every α, β ∈ R, one has the bound |f, g α,β | ≤ C f α ,β gα ,β ,
α + α = 2α, β + β = 2β ,
which holds for all f and g belonging to the Schwartz space Sn . The constant C may depend on the indices. N γ d. Commutator: Let X ∈ PolN k and Y ∈ Polk . For every γ ∈ R, [X, ] is bounded α,β α+1−k−γ ,β−N γ into S . Similarly, [X, [Y, ]] is bounded from S α,β into from S −γ ,β−N−N α+2−k−k . S e. Adjoint: Let X ∈ PolN k and let f, g ∈ Sn . Then f, Xg α,β = X T f, g α,β + R(f, g), where the bilinear form R satisfies the bound |R(f, g)| ≤ Cf α ,β gα ,β , with
α + α = 2α + k − 1, β + β = 2β + N . The constant C may depend on the indices. 1
The norms are L2 norms.
(3.3)
Spectral Properties of Hypoelliptic Operators
237
Remark 3.3. A special case of point e is given by k = 1. Since X T and −X then differ only by a function in PolN 0 , one can write f, Xg α,β = −Xf, g α,β + R (f, g), with the bilinear form R satisfying the same bounds as R. Notation 3.1. We write Ky instead of K + iy. We also introduce the notation ≤ B to mean:
There exist constants C and N independent of x and y such that for all u ∈ C0∞ B(x) : ≤ C(1 + x)N (u + Ky u). We will show below that A ε−1 u ≤ B,
(3.4)
holds for A taking values among all of the vector fields appearing in b1 or b0 . Assuming (3.4) one completes the proof of Theorem 3.1 as follows: Notice that if the collection {Ai }ki=1 is non-degenerate, then u2 ≤ u2 + C1 (1 + x2 )N1
k
Ai u2 ,
i=1
for every x ∈ Rn and every u ∈ C0∞ (B(x)). Therefore, by (3.4) we find u2ε = ε−1 u2 ≤ ε−1 u2 + C1 (1 + x2 )N1
k
Ai ε−1 u2 ≤ B 2 .
i=1
Polarizing, we obtain:
u2ε/2 ≤ u uε ≤ C2 u(1 + x2 )N2 u + Ky u
2 ≤ C22 u2 (1 + x2 )2N2 + u + Ky u ≤ (C2 u(1 + x2 )N2 + u + Ky u)2 , and hence (3.1) follows with ε∗ = ε/2, N∗ = N2 , and C∗ = C2 + 1. It remains to prove (3.4). Remark 3.4. To the end of this proof, we use the symbols C and N to denote generic positive constants which may change from one inequality to the next. By the bound on [A, ε−1 ] of Lemma 3.2(d)—and the fact that u ∈ C0∞ (B(x)) implies u0,N ≤ C(1 + x2 )N/2 u for every N > 0—we will have shown (3.4) if we can prove (3.5) Auε−1 ≤ B. Notice that by Lemma 3.2(b), the estimate (3.5) yields Au2ε−1,γ ≤ Cγ (1 + x2 )γ +N (u2 + Ky u2 ), for every γ > 0, x ∈ Rn , and u ∈ C0∞ (B(x)).
(3.6)
238
J.-P. Eckmann, M. Hairer
To prove (3.5), we proceed as follows. First, we verify it for A = Xi with i = 1, . . . , m (as well as for A = X0 in the case K0 ). The remaining bounds are shown by induction. The induction step consists in proving that if (3.5) holds for some A ∈ PolN 1 then [A, Xi ]uε/8−1 ≤ B for i = 0, . . . , m .
(3.7)
The first step. By the definition of K and the fact that Xi maps C0∞ (B(x)) into itself, we see that Xi u ≤ B , i = 1, . . . , m , (3.8) that is, (3.5) holds for ε ≤ 1 and A = Xi . We next show that it also holds for A = X0 whenever ε ≤ 1/2. (This will be the only place in the proof where C depends on y, but we need this estimate only for the case K0 .) Using (1.1) and Lemma 3.2(c), we can write X0 u2−1/2 ≤ X0 u−1 (Ky u + f u + |y| u) +
m
X0 u, XiT Xi u −1/2 .
i=1
Using Lemma 3.2(b) to estimate X0 u−1 , the first term is bounded by B 2 , so it remains to bound X0 u, XiT Xi u −1/2 . Using this time Lemma 3.2(e), (with α = − 21 and β = 0), we write X0 u, XiT Xi u −1/2 = Xi X0 u, Xi u −1/2 + R(X0 u, Xi u), (3.9) where R(X0 u, Xi u) is bounded by C(1+x)N X0 u−1 Xi u, which in turn is bounded by B 2 , using the previous bounds on X0 u−1 and Xi u. The first term of (3.9) can be written as |Xi X0 u, Xi u −1/2 | ≤ CXi X0 u−1 Xi u. Since Xi u ≤ B by (3.8), we only need to bound Xi X0 u−1 by B. This is achieved by writing Xi X0 u−1 ≤ X0 Xi u−1 + [Xi , X0 ]u−1 . The second term is bounded by B using Lemma 3.2(b). The first term is also bounded by B since Xi u0,N ≤ C(1 + x)N Xi u and X0 is bounded from S 0,N into S −1,0 (for some N ) by Lemma 3.2(b). Therefore, we conclude that X0 u−1/2 ≤ B,
(3.10)
where C will in general depend on y. The inductive step. Let A ∈ PolN 1 and assume that (3.5) holds. We show that a similar estimate (with different values for ε, C, and N ) then also holds for B = [A, Xi ] with i = 0, . . . , m. We distinguish the case i = 0 from the others. The case i > 0. We assume that (3.5) holds and we estimate Buε −1 for some ε ≤ 1/2 to be fixed later. We obtain Bu2ε −1 = Bu, AXi u ε −1 − Bu, Xi Au ε −1 = T1 + T2 . Both terms T1 and T2 are estimated separately. For T1 , we get from Remark 3.3: T1 = −ABu, Xi u ε −1 + R(Bu, Xi u), where (since ε ≤ 1/2), |R(Bu, Xi u)| ≤ C(1 + x)N Bu−1 Xi u ≤ C(1 + x)N uXi u ≤ B 2 . (3.11)
Spectral Properties of Hypoelliptic Operators
239
The term ABu, Xi u ε −1 is written as |ABu, Xi u ε −1 | ≤ BAu2ε −2 Xi u + [A, B]u−1 Xi u. The second term is bounded by B 2 like in (3.11). The first term is also bounded by B 2 by combining Lemma 3.2(b) with the induction assumption in its form (3.6) (taking 2ε ≤ ε). The estimation of T2 is very similar: we write again T2 = −Xi Bu, Au ε −1 + R(Bu, Au).
(3.12)
The first term is bounded by CXi Bu−1 Au2ε −1 . The second factor of this quantity is bounded by B by the inductive assumption, while the first factor is bounded by Xi Bu−1 ≤ BXi u−1 + [B, Xi ]u−1 ≤ B,
(3.13)
using Lemma 3.2(b) and the estimate Xi u0,N ≤ B. The remainder R of (3.12) is bounded by |R(Bu, Au)| ≤ C(1 + x)N Bu−1 Au2ε −1 , which is bounded by B 2 , using Lemma 3.2(b) for the first factor and the inductive assumption for the second. Combining the estimates on T1 and T2 we get Buε −1 ≤ B
for ε ≤ ε/2,
which is the required estimate. The case i = 0. To conclude the proof of Theorem 3.1, it remains to bound Buε −1 by B. In this expression, B = [A, X0 ] and ε > 0 is to be fixed later. We first introduce the operator m K˜ = XiT Xi , i=1
which is (up to a term of multiplication by a function) equal to the real part of Ky , when considered as an operator on L2 . We can thus write X0 as X0 = K − K˜ + f1 = K˜ − K T + f2 , for two functions f1 , f2 ∈ PolN 0 for some N. This allows us to express B as ˜ A] − 2KA ˜ + Af1 − f2 A. B = [A, X0 ] = AKy + KyT A + [K, We write Bu2ε −1 = Bu, [A, X0 ]u ε −1 and we bound separately by B 2 each of the terms that appear in this expression according to the above decomposition of the commutator. The two terms containing f1 and f2 are bounded by B 2 using the inductive assumption. We therefore concentrate on the four remaining terms. The term AKy . We write this term as Bu, AKy u ε −1 = −BAu, Ky u ε −1 + [A, B]u, Ky u ε −1 + R(Bu, Ky u), where the two last terms are bounded by B 2 using Lemma 3.2(b,e). Using assumption (3.6) (assuming ε ≤ ε/2) and Lemma 3.2(b,c), we also bound the first term by B 2 .
240
J.-P. Eckmann, M. Hairer
The term KyT A. We write this term as
Bu, KyT Au ε −1 = Ky Bu, Au ε −1 + 2−2ε [K, 2ε −2 ]Bu, Au ε −1 = T1 + T2 . The term T1 is bounded by Ky Bu−1 Au2ε −1 by polarization. The second factor of this product is bounded by B, using the induction hypothesis and the assumption ε ≤ ε/2. The first factor is bounded by Ky Bu−1 ≤ BKy u−1 + [K, B]u−1 .
(3.14)
The first term of this sum is obviously bounded by B. The second term is expanded using the explicit form of K as given in (1.1). The only “dangerous” terms appearing in this expansion are those of the form [XiT Xi , B]u−1 . They are bounded by [XiT Xi , B]u−1 ≤ [XiT , B]Xi u−1 + [Xi , B]XiT u−1 + XiT , [Xi , B] u −1 . The terms in this sum are bounded individually by B, using the estimates on Xi u, together with Lemma 3.2(b,d). We now turn to the term T2 . We bound it by
|T2 | ≤ C 2−2ε [K, 2ε −2 ]Bu−1 Au2ε −1 . The second factor is bounded by B by the induction hypothesis, so we focus on the first factor. We again write explicitly K as in (1.1) and estimate each term separately. The two terms containing X0 and f are easily bounded by B using Lemma 3.2(b,d). We also write XiT Xi = Xi2 + Yi with Yi ∈ PolN 1 and similarly bound by B the terms in Yi . The remaining terms are of the type
Qi = 2−2ε [Xi2 , 2ε −2 ]Bu−1 . They are bounded by Qi ≤ 2 2−2ε [Xi , 2ε −2 ]Xi Bu−1 + 2−2ε Xi , [Xi , 2ε −2 ] Bu −1 . In order to bound the first term, one writes Xi B = BXi + [Xi , B] and bounds each term separately by B, using the bound Xi u0,γ ≤ B together with Lemma 3.2(b,d). The last term is also bounded by B using Lemma 3.2(d). T ˜ A]. We write K˜ = m The term [K, i=1 Xi Xi and we bound each term separately: Bu, [XiT Xi , A]u ε −1 = Bu, XiT [Xi , A]u ε −1 + Bu, [XiT , A]Xi u ε −1 ≡ Ti,1 + Ti,2 . The first term is written as Ti,1 = Xi Bu, [Xi , A]u ε −1 + R(u), where R(u) is bounded by C(1 + x)N Bu−1 [Xi , A]u2ε −1 . The factor Bu−1 is bounded by B using Lemma 3.2(b) and the last factor is bounded by B, using the estimate for the case i = 0 (we have to assume ε ≤ ε/4 in order to get this bound). The term Xi Bu, [Xi , A]u ε −1 is estimated by |Xi Bu, [Xi , A]u ε −1 | ≤ Xi Bu−1 [Xi , A]u2ε −1 .
Spectral Properties of Hypoelliptic Operators
241
The first factor is bounded by B as in (3.13) and the second factor is again bounded by B, using the estimate for the case i = 0. It thus remains to bound Ti,2 , which we write as
Ti,2 = Bu, Xi [XiT , A]u ε −1 + Bu, [XiT , A], Xi u ε −1 . The first term in this equation is similar to the term Bu, XiT [Xi , A]u ε −1 and is bounded by B 2 in the same way. The second term is bounded by
Bu, [XiT , A], Xi u ε −1 ≤ Bu−1 [XiT , A], Xi u 2ε −1 , which can also be bounded by B 2 , using the estimate for the case i = 0, provided ε ≤ ε/8. ˜ The term KA. In order to bound this term, we need the following preliminary lemma: Lemma 3.5. Let v ∈ Sn , α, δ ∈ R, and let Ky be as above. There exist constants C˜ and N˜ independent of y such that the estimate m m 2 2 ˜ Xi vα ≤ C˜ Xi vα−δ,N˜ vα+δ,N˜ + Cv (3.15) ReKy v, v α − α,N˜ i=1
i=1
holds. Proof. Obviously ReKy v, v α = ReKv, v α . We decompose K according to (1.1). The terms containing X0 and f are bounded by Cv2α,N according to Lemma 3.2(b,e), so we focus on the terms containing XiT Xi . Using Lemma 3.2(e), we write them as XiT Xi v, v α = Xi v2α + Ri (v), where Ri (v) is bounded by CXi vα−δ,N vα+δ,N . This concludes the proof of Lemma 3.5. ˜ as We now write the term containing KA ˜ Bu, KAu ε −1 =
m
(Xi Bu, Xi Au ε −1 + Ri ),
(3.16)
i=1
and we apply Lemma 3.2(e) with f = Bu, g = Xi Au, X = XiT. Then we find |Ri | ≤ Bu−1,N Xi Au2ε −1 ≤ Bu2−1,N + Xi Au22ε −1 . By Lemma 3.2(b), the first term is bounded by B 2 . Using Lemma 3.2(c) to polarize the scalar product in (3.16) we thus get 2 ˜ |Bu, KAu ε −1 | ≤ B + C
m i=1
Xi Bu2−1 + C
m
Xi Au22ε −1 .
i=1
The term involving Xi Bu2−1 is bounded by B 2 as in (3.13). The last term is bounded by Lemma 3.5, yielding
242
J.-P. Eckmann, M. Hairer
2 ˜ |Bu, KAu ε −1 | ≤ B + C|Ky Au, Au 2ε −1 | + C
m i=1
Xi Au2−1,N˜
+ CAu24ε −1,N˜ . The last term in this expression is bounded by B 2 by the induction hypothesis if we choose ε ≤ ε/4. The term containing Xi Au can be bounded by B 2 as in (3.13), so the only term that remains to be bounded is |Ky Au, Au 2ε −1 |. By polarizing the estimate obtained by Lemma 3.2(c), one gets |Ky Au, Au 2ε −1 | ≤ CAu24ε −1 + CKy Au2−1 . The first term is bounded by B 2 using the induction assumption. The second term is bounded by B 2 exactly like (3.14) above. Summing all these bounds this proves (3.7) and hence the inductive step is completed. Since K was assumed to satisfy K1 (or K0 ), we see that after M inductive steps the proof of Theorem 3.1 is complete.
4. Global Estimate The results of the previous section were restricted to functions u with well-localized compact support. In this section, we are interested in getting bounds for every u ∈ Sn . The main estimate of this section is given by Theorem 4.1. Let K ∈ K1 and define Ky = K + iy for y ∈ R. For every ε > 0, there exist constants δ > 0 and C > 0 independent of y, such that for the norms defined by (3.2) one has uδ,δ ≤ C(u0,ε + Ky u),
(4.1)
for every u ∈ Sn . If K ∈ K0 , the same bound holds, but the constant C may depend on y. Since S δ,δ is compactly embedded into L2 , this result implies: Corollary 4.2. Let K be as above. If there exist constants ε, C > 0 such that u0,ε ≤ C(u + Ku),
(4.2)
then K has compact resolvent when considered as an operator acting on L2 . Proof (of the Corollary). Combining (4.1) with (4.2), we get uδ,δ ≤ C(u + Ku). This implies that for λ outside of the spectrum of K, the operator (K − λ)−1 is bounded from L2 into S δ,δ . By Lemma 3.2(a), it is therefore compact.
Spectral Properties of Hypoelliptic Operators
243
Proof (of Theorem 4.1). Let ε∗ and N∗ be the values of the constants obtained in estimate (3.1) of Theorem 3.1. Observe that Theorem 3.1 also holds for any bigger value of N∗ , and we will assume N∗ is sufficiently large. We choose ε > 0. As a first step, we will show that there exist constants δ and C such that, for any x ∈ Rn and u ∈ C0∞ (B(x)), the following estimate holds: uδ,δ ≤ C(1 + x2 )−N∗ uε∗ + C(1 + x2 )ε/2 u.
(4.3)
Denote by J the smallest integer for which J ≥1+
8N∗ , ε
and define
ε ε∗ δ = min 2N∗ , , . (4.4) 2 J First, we note that when A is a positive self-adjoint operator on some Hilbert space H, one has the estimate (4.5) AuJ ≤ CAJ u uJ −1 ,
whenever both expressions make sense. In the case J = 2j for j an integer, this can be seen by a repeated application of the Cauchy-Schwarz inequality. It was shown in [KS59] to hold in the general case as well. We next use Jensen’s inequality to write (1 + x2 )N∗ +δ/2 δ u ≤ C
δ u u
J
1 (N∗ +δ/2) 1+ J −1
u + C(1 + x2 )
u.
Dividing this expression by (1 + x2 )N∗ and using the definition of J , we get (1 + x2 )δ/2 δ u ≤ C(1 + x2 )−N∗
δ u u
J u
+ C(1 + x2 )(N∗ +δ/2)(1+ε/(8N∗ ))−N∗ u. Using (4.5), the fact that 8Nε ∗ ≤ 2Nε−δ by (4.4), and u ∈ C0∞ (B(x)), we get (4.3). ∗ +δ In order to prove Theorem 4.1, we use the following partitionof unity. Let χ0 : R → [0, 1] be a C ∞ function with support in |x| < 1 and satisfying i∈Z χ0 (x − i) = 1 for all x ∈ R. The family of functions P = {χx : Rn → [0, 1] | x ∈ Zn }, defined by χx (z) =
n
χ0 (zj − xj ),
j =1
is therefore a partition of unity for Rn . By construction, when x, x ∈ Z then χx and χx have disjoint support if there exists at least one index j with |xj − xj | ≥ 2. We can therefore split P into subsets Pk |k=1,...,3n such that any two different functions belonging to the same Pk have disjoint supports.
244
J.-P. Eckmann, M. Hairer
Consider next an arbitrary function u ∈ Sn . We define ux = χx u, and then the construction of the Pk implies ux 0,ε ≤ 3n u0,ε . (4.6) x∈Zn
Using (4.3), then Theorem 3.1 and (4.6), we find uδ,δ ≤ ux δ,δ ≤ C (1 + x2 )−N∗ ux ε∗ + (1 + x2 )ε/2 ux x∈Zn
≤C
x∈Zn
ux + (1 + x2 )−N∗ Ky ux + (1 + x2 )ε/2 ux
x∈Zn
≤ C3n (u + u0,ε ) + C
(1 + x2 )−N∗ Ky ux .
x∈Zn
For k ∈ {1, . . . , 3n } we now define fk =
(1 + x2 )−N∗ χx .
χx ∈Pk
With this notation, we have n
uδ,δ ≤ Cu0,ε + C
3
Ky fk u.
k=1
The claim (4.1) thus follows if we can show that Ky fk u ≤ Cu + CKy u.
(4.7)
Since the fk are bounded functions, it suffices to estimate [K, fk ]u. 2 −N∗ , so (for sufficiently By construction, every derivative of fk decays like (1+x ) large N∗ ), the functions [Xj , fk ] and Xk , [Xj , fk ] are bounded. The only “dangerous” terms appearing in [K, fk ] are thus the terms of the form [Xi , fk ]Xi . By choosing N∗ sufficiently large, it follows from (3.8) that [Xi , fk ]Xi u ≤ C(u + Ky u), thus concluding the proof of Theorem 4.1. 4.1. Cusp. Our statement about the cusp-like shape of the spectrum of K is now a consequence of Theorem 4.1. 2 Theorem 4.3. Let K ∈ PolN 2 be of the type (1.1). Assume that the closure of K in L is m-accretive and that K ∈ K1 . Assume furthermore that there exist constants ε, C > 0 such that (4.8) u0,ε ≤ C(u + Ky u),
for all y ∈ R. Then, the spectrum of K (as an operator on L2 ) is contained in the cusp {λ ∈ C | Re λ ≥ 0, |Im λ| ≤ C(1 + Re λ)ν }, for some positive constants C and ν.
Spectral Properties of Hypoelliptic Operators
245
Remark 4.4. In principle, our proofs give a constructive upper bound on ν. However, no attempt has been made to optimize this bound. Proof. The proof follows very closely that of Theorem 4.1 in [HN02], however we give the details for completeness. One ingredient we need is the following lemma: Lemma 4.5. Let A : L2 → L2 be a maximal accretive operator that has Sn as a core. Assume there exist constants C, α > 0 for which Au ≤ Cuα,α ,
∀u ∈ Sn .
Then, for every N ∈ N, there exists a constant CN such that A1/N u ≤ CN uα/N,α/N ,
∀u ∈ S α/N,α/N .
Proof. By Lemma 3.2(b), one can bound uα,α by
¯ α/N α/2N N u . uα,α ≤ C α/2N The generalized Heinz inequality presented in [Kat61] then yields ¯ α/N α/2N u . A1/N u ≤ CN α/2N This concludes the proof of Lemma 4.5.
We now turn to the proof of Theorem 4.3. Since K ∈ PolN 2 , one has for α = max{2, N} the bound (K + 1)u ≤ Cuα,α ,
∀u ∈ Sn .
By Lemma 4.5, one can find for every δ > 0 an integer M > 0 and a constant C such that: u, ((K + 1)∗ (K + 1))1/M u ≤ Cu2δ,δ , (4.9) Furthermore, Theorem 4.1 together with (4.8) yields constants C and δ such that for every u ∈ Sn and every y ∈ R: u2δ,δ ≤ C(u2 + (K + iy)u2 ).
(4.10)
Since K is m-accretive by assumption, we can apply [HN02, Prop. B.1] to get the estimate
1 |z + 1|2/M u2 ≤ ((K + 1)∗ (K + 1))1/M u, u + (K − z)u2 4 ≤ Cu2δ,δ + (K − z)u2 , where the second line is a consequence of (4.9). Using (4.10) and the triangle inequality for z = Re z + i Im z, we get 1 |z + 1|2/M u2 ≤ C((1 + Re z)2 u2 + (K − z)u2 ). 4 Together with the compactness of the resolvent of K, this immediately implies that every λ in the spectrum of K satisfies the inequality 1 |λ + 1|2/M u2 ≤ C(1 + Re λ)2 u2 . 4 This concludes the proof of Theorem 4.3.
246
J.-P. Eckmann, M. Hairer
5. Examples We present two examples in this section: A first, very simple one, and a second which was the main motivation for this paper. 5.1. Langevin equation for a simple anharmonic oscillator. Our first example consists of one anharmonic oscillator which is in contact with a stochastic heat bath at temperature T . The Hamiltonian of the oscillator is given by p2 ν2q 2 q4 + +ε . 2 2 4 For this model the associated spectral problem can be solved explicitly when ε = 0, because it is an harmonic oscillator. The spectrum lies in a cone as shown in Fig. 5.1. We also show that in first order perturbation theory in ε, the spectrum seems to form a non-trivial cusp, but this result remains conjectural, because of non-uniformity of our bounds. The Langevin equation for this system is dp = −ν 2 q dt − εq 3 dt − γp dt + 2γ T dw(t), dq = p dt, (5.1) H (p, q) =
where γ > 0 measures the strength of the interaction between the oscillator and the bath. Denote by (, P) the probability space on which the Wiener process w(t) is defined. We write ϕt,ω (x) with ω ∈ for the solution at time t for (5.1) with initial condition x = (p, q) and realization ω of the white noise. The corresponding semigroups acting on observables and on measures on R2 are given by (Tt f )(x) = (f ◦ ϕt,ω (x)) dP(ω), (5.2a) −1 (Tt∗ µ)(A) = (µ ◦ ϕt,ω (A)) dP(ω), (5.2b)
where A ⊂ R is a Borel set. It is well-known that 2
dµT (p, q) = exp (−H (p, q)/T ) dp dq is the only stationary solution for (5.2b). The Itˆo formula yields for ft = Tt f the Fokker-Planck equation given by ∂t ft = γ T ∂p2 ft + p ∂q ft − (ν 2 q + εq 3 + γp) ∂p ft .
(5.3)
We study (5.3) in the space Hβ = L2 (R2 , dµT ) and make the change of variables ft = exp(H /(2T ))Ft in order to work in the unweighted space H0 = L2 (R2 , dp dq). Equation (5.3) then becomes ∂t Ft = −L˜ ε Ft , where the differential operator L˜ ε is given by γ 2 γ L˜ ε = −γ T ∂p2 + p − − p ∂q + ν 2 q ∂p + εq 3 ∂p . 4T 2 By rescaling time, p and q, one can bring L˜ ε to the form Lε =
1 (−∂p2 + p 2 − 1) + α(q ∂p − p ∂q ) + cεq 3 ∂p , 2
√ where α = 2 2T ν/γ and c > 0.
Spectral Properties of Hypoelliptic Operators
247
The operator K = Lε is thus of the type (1.1) with X0 = α(q ∂p − p ∂q ) + cεq 3 ∂p and X1 = ∂p . We now verify the conditions of Sect. 2.2. It is obvious that these vector fields are of polynomial growth, thus condition a is satisfied. Since [X1 , X0 ] = −α∂q , the operator Lε satisfies condition b1 as well, and so the conclusion of Theorem 4.1 holds. Proceeding like in [EH00, Prop. 3.7], one shows an estimate of the type (4.8) (see also the proof of Theorem 5.5 below, where details are given). Therefore, Theorem 4.3 applies, showing that the spectrum of Lε is located in a cusp-shaped region. In fact, we show in the next subsection that the cusp is a cone when ε = 0, and then we study its perturbation to first order in ε. 5.1.1. First-order approximation of the spectrum of Lε . We will explicitly compute the spectrum and the corresponding eigenfunctions for L0 and then (formally) apply firstorder perturbation theory to get an approximation to the spectrum of Lε . We introduce the “creation and annihilation” operators a=
p + ∂p √ , 2
a∗ =
p − ∂p √ , 2
b=
q + ∂q √ , 2
b∗ =
q − ∂q √ , 2
in terms of which Lε can be written as Lε = a ∗ a + α(b∗ a − a ∗ b) + cεq 3 ∂p . With this notation, it is fairly easy to construct the spectrum of L0 . Note first that 0 is an eigenvalue for L with eigenfunction exp(−p 2 /2 − q 2 /2). This is actually the vacuum state for the two-dimensional harmonic oscillator in quantum mechanics (which is given by a ∗ a + b∗ b), so we call this eigenfunction | . ∗ defined by A straightforward calculation shows that the creation operators c± ∗ c±
∗
∗
= a + β± b ,
1 ±i β± = − 2α
√
4α 2 − 1 , 2α
satisfy the following commutation relation with L0 : √ 4α 2 − 1 1 α ∗ ∗ [L0 , c± ] = λ± c± , λ± = ± i =− . 2 2 β± Therefore, λn,m = nλ+ + mλ− with n and m positive integers are eigenvalues for L0 0 with eigenvectors given by ∗ n ∗ m (c+ ) (c− ) | . We conclude that for α > 1/2 the spectrum of L0 consists of a triangular grid located inside a cone (see Fig. 5.1). Remark 5.1. Although the spectrum of L0 is located inside a sector, L0 is not sectorial since the closure of its numerical range is the half-plane Re λ ≥ 0. In order to do first-order perturbation theory for the spectrum of Lε we also need the ∗ and d ∗ to | , eigenvectors for L∗0 , which can be obtained by applying successively d+ − where ∗ d± = a ∗ − β∓ b ∗ .
248
J.-P. Eckmann, M. Hairer Imλ
Imλ
λ+
Reλ
Reλ
λ−
Fig. 5.1. Spectrum of L0
Fig. 5.2. Approximate spectrum of Lε
∗ )n (d ∗ )m | is an eigenvector of L∗ with eigenvalue λ ¯ n,m . By With this notation, (d+ − 0 0 first-order perturbation theory, the eigenvalues of Lε are approximated by
λn,m ≈ λn,m + cεδn,m , ε 0
δn,m =
m d n q 3 ∂ (c∗ )n (c∗ )m | |d− p + + − m d n (c∗ )n (c∗ )m | . |d− + + −
(5.4)
The resulting spectrum2 is shown in Fig. 5.2 (the sector containing the spectrum of L0 is shown in light gray for comparison). One clearly sees that the boundary of the sector bends to a cusp. A (lengthy) explicit computation also shows that δn,0 = −12n(n − 1) √
λ¯ + 4α 2
−1
+ 9n √
iα 4α 2 − 1
.
In principle this confirms the cusp-like shape of the boundary, were it not for the nonuniformity of the perturbation theory (in n). 5.2. A model of heat conduction. In this subsection, we apply our results to the physically more interesting case of a chain of nearest-neighbor interacting anharmonic oscillators coupled to two heat baths at different temperatures. We model the chain by the deterministic Hamiltonian system given by H =
N 2 p i
i=0
2
N V2 (qi − qi−1 ). + V1 (qi ) + i=1
(We will give conditions on the potentials V1 and V2 later on.) In order to keep notations short, we assume pi , qi ∈ R, but one could also take them in Rd instead. The two heat baths are modeled by classical free field theories ϕL and ϕR with initial conditions drawn randomly according to Gibbs measures at respective inverse temperatures βL and βR . (We refer to [EPR99a] for a more detailed description of the model.) It is shown in [EPR99a] that this model is equivalent to the following system of stochastic differential equations: dqi = pi dt,
i = 0, . . . , N,
dp0 = −V1 (q0 ) dt + V2 (q˜1 ) dt + rL dt, dpj = −V1 (qj ) dt − V2 (q˜j ) dt + V2 (q˜j +1 ) dt, 2
Actually the set
{λn,m 0
+ cεδn,m | n, m ≥ 0}.
j = 1, . . . , N − 1,
Spectral Properties of Hypoelliptic Operators
249
dpN = −V1 (qN ) dt − V2 (q˜N ) dt + rR dt, drL = −γL rL dt + λ2L γL q0 dt − λL 2γL TL dwL (t), drR = −γR rR dt + λ2R γR qN dt − λR 2γR TR dwR (t), where Ti = βi−1 , γi are positive constants describing the coupling of the chain to the heat baths, and wi are two independent Wiener processes. The variables rL and rR describe the internal state of the heat baths. If TL = TR = T , the equilibrium measure for this system is dµT (p, q, r) = exp (−G(p, q, r)/T ) dp dq dr, where the “energy” G is given by the expression G(p, q, r) = H (p, q) +
rL 2 rR 2 − q0 rL + 2 − qN rR . 2 2λL 2λR
If TL = TR , there is no way of guessing the invariant measure for the system. We can nevertheless make the construction of Sect. 5.1 with the reference measure dµT˜ for some temperature T˜ > max{TL , TR }, which is a stability condition, as one can see in (5.6) below. The resulting operator K = L is given by ∗ L = XL∗ XL + XR XR + fL2 + fR2 + X0 ,
(5.5)
XL,R = λL,R γL,R TL,R ∂rL,R , fL,R = γL,R (TL,R /T˜ − 1)(rL,R − λL,R q0,N ),
(5.6)
where
X0 = ∇q H ∇p − ∇p H
∇q + bL (rL − λ2L q0 )∂rL
+ bR (rL − λ2R qN )∂rR with bL,R =
− rL ∂p0
− rL ∂pN ,
γL,R (TL,R 2 λL,R T˜ 2
− T˜ ).
We are now in a position to express the conditions of Sect. 2.2 in terms of sufficient conditions on the potentials of the model. The first two assumptions guarantee that L is in K1 . 2n−|α|
Assumption 5.1. There exist real numbers n, m > 0 such that D α V1 ∈ Pol0 2m−|α| D α V2 ∈ Pol0 for |α| ≤ 2.
and
Assumption 5.2. There exists a constant c > 0 such that V2 (x) > c for all x ∈ R. Remark 5.2. The second assumption states that there is a non-vanishing coupling between neighboring particles in every possible state of the chain. The verification that these assumptions imply a is easy, and the verification that b1 holds can be found in [EPR99a, EH00].
250
J.-P. Eckmann, M. Hairer
Proposition 5.3. Let L be defined as above and let V1 and V2 fulfill Assumptions 5.1 and 5.2 above. Then L satisfies the assumptions of Theorem 4.1 and satisfies Eq. (4.1) with C and δ independent of y. In order to show that the spectrum of L is located in a cusp-shaped region (i.e. that the hypotheses of Theorem 4.3 hold), two more assumptions have to be made on the asymptotic behaviour of V1 and V2 : Assumption 5.3. The exponents n and m appearing in Assumption 5.1 satisfy 1 < n < m. Remark 5.4. The physical interpretation of the condition n < m (actually 1 ≤ n ≤ m would probably work as well, see [RBT02b], but we could not apply directly the results of [EH00]) goes as follows. If n > m, the relative strength of the coupling between neighboring particles decreases as the energy of the chain tends to infinity. Therefore, an initial condition where all the energy of the chain is concentrated into one single oscillator is “metastable” in the sense that the energy gets transmitted only very slowly to the neighboring particles and eventually to the heat baths. As a consequence, it is likely that the convergence to a stationary state is no longer exponential in this case, and so the operator L has probably not a compact resolvent anymore. Our last assumption states that the potentials and the resulting forces really grow asymptotically like |x|n and |x|m respectively (and not just “slower than”). Assumption 5.4. The potentials V1 and V2 satisfy the conditions n n V1 (x) ≥ c1 1 + x2 − c2 , xV1 (x) ≥ c3 1 + x2 − c4 , m m V2 (x) ≥ c5 1 + x2 − c6 , xV2 (x) ≥ c7 1 + x2 − c8 , for all x ∈ R and for some positive constants ci . Theorem 5.5. Let L be defined as above and let V1 and V2 fulfill assumptions 5.1–5.4 above. Then, L has compact resolvent and there exist positive constants C and N such that the spectrum of L is contained in the cusp λ ∈ C Re λ ≥ 0 and Im λ ≤ C(1 + |Re λ|)N . Proof. We will apply Theorem 4.3, and need to check its assumptions. It has been shown in [EH00, Prop. B.3] that L is m-accretive. The fact that L ∈ K1 was checked above, and (4.8) was shown for y = 0 in [EH00, Prop. 3.7]. However, closer inspection of that proof reveals that whenever X0 was used, it only appeared inside a commutator. Therefore, we can replace it by X0 + iy without changing the bounds. Thus, we have checked all the assumptions of Theorem 4.3 and the proof of Theorem 5.5 is complete. A. Proof of Lemma 3.2 The points a and b of Lemma 3.2 are standard results in the theory of pseudodifferential operators (see e.g. [H¨or85, Vol. III] or, more specifically, [BC94, HT94a, HT94b]). The point c is an immediate consequence of the Cauchy-Schwarz inequality combined with a. In order to prove the points d and e, we first show the following intermediate result:
Spectral Properties of Hypoelliptic Operators
251
Lemma A.1. Let f : Rn → R and α ∈ R. Let k be the smallest even integer such that |α| ≤ k. Then, if f satisfies sup |∂ δ f (y)| < κ,
y∈Rn
∀ |δ| ≤ k,
the corresponding operator of multiplication is bounded from S α,β into S α,β and its operator norm is bounded by Cκ. The constant C depends only on α and β. Proof. By the definition of S α,β , it suffices to show that the operator α f −α is bounded by Cκ from L2 into L2 . Since f is obviously bounded by κ as a multiplication operator from L2 into L2 , it actually suffices to bound α [f, −α ]. Assume first that α ∈ (0, 2). In that case, we write ∞ α 1 α −α z−α/2 [f, 2 ] dz. [f, ] = Cα 2 z + z + 2 0 The commutator appearing in this expression can be written as [f, 2 ] =
n
(2∂i f ∂i + ∂i2 f s).
(A.1)
i=1
It is clear from basic Fourier analysis that ∂i (z + 2 )−1/2 ≤ 1 and therefore [f, 2 ](z + 2 )−1/2 ≤ Cκ. Furthermore, the spectral theorem tells us that for any function F , F ( 2 ) is bounded by supλ≥1 F (λ). Therefore there exists a constant C independent of z > 0 such that α (z + 2 )−1 ≤
C . 1 + z1−α/2
Combining these estimates shows the claim when α ∈ (0, 2). The case α = 2 follows from the boundedness of [f, 2 ] −2 . Values of α greater than 2 can be obtained by iterating the relation α+2 f −α−2 = α f −α − α [f, 2 ] −α−2 . Using (A.1), the fact that ∂i commutes with , and the fact that ∂i −2 is bounded, we can reduce this to the previous case, but with two more derivatives to control. The case α < 0 follows by considering adjoints. This concludes the proof of Lemma A.1. Remark A.2. Since the direct and the inverse Fourier transforms both map S α,β continuously into S β,α , the above lemma also holds for bounded functions of ∂y and not only for bounded functions of y. We are now ready to turn to the Proof of point d. Let X ∈ PolN k . We first consider γ ∈ (−2, 0). Since, in Fourier space, 2 is a multiplication operator by a real positive function, we can write ∞ 1 dz zγ /2 [X, 2 ] . [X, γ ] = Cγ 2 z+ z + 2 0
252
J.-P. Eckmann, M. Hairer
In order to bound this expression, we define B = [X, 2 ], commute B with the resolvent, and obtain ∞ ∞ 2−γ dz zγ /2 dz γ γ /2 γ −2 z B + C [B, 2 ] . [X, ] = Cγ γ 2 2 2 2 (z + ) (z + ) z + 2 0 0 ∞ The first term equals Cγ γ −2 B because 0 zγ /2 x 2−γ (z + x 2 )−2 dz does not depend on x > 0. This, in turn, is bounded from S α,β into S α+1−k−γ ,β−N using B ∈ PolN k+1 and Lemma 3.2(b). To bound the second term, we rewrite
∞ 0
zγ /2 dz [B, 2 ] = 2 2 (z + ) z + 2
∞ 0
zγ /2 1−γ 2 γ −1 2 −2 · [B, ] · dz. (z + 2 )2 z + 2
The factor 2 (z + 2 )−1 is bounded from S α,β into itself, uniformly in z. Using Lemma 3.2(b) as before, we see that the factor γ −1 [B, 2 ] −2 is bounded from S α,β into S α+1−k−γ ,β−N ≡ S α ,β . Finally, using Lemma A.1 and counting powers, we see that the first factor has norm bounded by O(z−3/2 ) for large z and O(zγ /2 ) for z near 0 as a map from S α ,β to itself. This proves the first statement of Lemma 3.2(d) for γ ∈ (−2, 0). The case γ = 2 follows from [X, 2 ] ∈ PolN k+1 . All other values of γ can be obtained by repeatedly using the equalities [X, γ +2 ] = [X, γ ] 2 + [X, 2 ] γ , [X, γ −2 ] = [X, γ ] −2 − −2 [X, 2 ] γ −2 . The second statement of Lemma 3.2(d) can be proven similarly and is left to the reader. Proof of point e. Recall that we want to bound I = |f, Xg α,β − X T f, g α,β |, T 2 where X ∈ PolN k and X denotes the formal adjoint (in L ) of X. We write this as
¯ −β −2α ¯ −β , XT ] ¯ β f, g α,β |. ¯ β 2α I = |[ The operator appearing in this expression can be expanded as ¯ −β , XT ] ¯ β = [ ¯β ¯ β 2α ¯ −β , XT ] ¯β + ¯ −β [ −2α , XT ] 2α ¯ −β −2α [ ¯ −β −2α [ ¯ −β , XT ] ¯ β 2α ¯ β. + The first term belongs to PolN k−1 by inspection, and the required bound follows at once from Lemma 3.2(b,c). A similar remark applies to the last term. The second term is similarly bounded by using Lemma 3.2(d,b,c). This concludes the proof of Lemma 3.2. Acknowledgements. We thank G. van Baalen and E. Zabey for helpful remarks, as well as a careful referee for pointing out many misprints. This work was partially supported by the Fonds National Suisse.
Spectral Properties of Hypoelliptic Operators
253
References [BC94] [EH00] [EPR99a] [EPR99b] [HN02] [H¨or85] [HT94a] [HT94b] [Kat61] [KS59] [RBT00] [RBT02a] [RBT02b]
Bony, J.-M., Chemin, J.-Y.: Espaces fonctionnels associ´es au calcul de Weyl-H¨ormander. Bull. Soc. Math. France 122(1), 77–118 (1994) Eckmann, J.-P., Hairer, M.: Non-equilibrium statistical mechanics of strongly anharmonic chains of oscillators. Commun. Math. Phys. 212(1), 105–164 (2000) Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Non-equilibrium statistical mechanics of anharmonic chains coupled to two heat baths at different temperatures. Commun. Math. Phys. 201, 657–697 (1999) Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Entropy production in non-linear, thermally driven hamiltonian systems. J. Stat. Phys. 95, 305–331 (1999) H´erau, F., Nier, F.: Isotropic hypoellipticity and trend to equilibrium for the Fokker-Planck equation with high degree potential, 2002. Preprint H¨ormander, L.: The Analysis of Linear Partial Differential Operators I–IV. NewYork: Springer, 1985 Haroske, D., Triebel, H.: Entropy numbers in weighted function spaces and eigenvalue distributions of some degenerate pseudodifferential operators. I. Math. Nachr. 167, 131– 156 (1994) Haroske, D., Triebel, H.: Entropy numbers in weighted function spaces and eigenvalue distributions of some degenerate pseudodifferential operators. II. Math. Nachr. 168, 109–137 (1994) Kato, T.: A generalization of the Heinz inequality. Proc. Japan Acad. 37, 305–308 (1961) Krasnosel ski˘ı, M.A., Sobolevski˘ı, P.E.: Fractional powers of operators acting in Banach spaces. Dokl. Akad. Nauk SSSR 129, 499–502 (1959) Rey-Bellet, L., Thomas, L.E.: Asymptotic behavior of thermal nonequilibrium steady states for a driven chain of anharmonic oscillators. Commun. Math. Phys. 215(1), 1–24 (2000) Rey-Bellet, L., Thomas, L.: Fluctuations of the entropy production in anharmonic chains, 2002. To be published in Ann. Henri Poincar´e Rey-Bellet, L., Thomas, L.E.: Exponential convergence to non-equilibrium stationary states in classical statistical mechanics. Commun. Math. Phys. 225(2), 305–329 (2002)
Communicated by A. Kupiainen
Commun. Math. Phys. 235, 255–273 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0810-z
Communications in
Mathematical Physics
Exotic Tensor Gauge Theory and Duality P.F. de Medeiros, C.M. Hull Physics Department, Queen Mary, University of London, Mile End Road, London, E1 4NS, UK. E-mail:
[email protected];
[email protected] Received: 9 September 2002 / Accepted: 22 October 2002 Published online: 21 February 2003 – © Springer-Verlag 2003
Abstract: Gauge fields in exotic representations of the Lorentz group in D dimensions – i.e. ones which are tensors of mixed symmetry corresponding to Young tableaux with arbitrary numbers of rows and columns – naturally arise through massive string modes and in dualising gravity and other theories in higher dimensions. We generalise the formalism of differential forms to allow the discussion of arbitrary gauge fields. We present the gauge symmetries, field strengths, field equations and actions for the free theory, and construct the various dual theories. In particular, we discuss linearised gravity in arbitrary dimensions, and its two dual forms. 1. Introduction Fields in symmetric or antisymmetric tensor representations of the Lorentz group occur in many contexts. However, tensor fields in more exotic representations corresponding to arbitraryYoung tableaux can also occur. Such fields arise among the higher-spin massive modes in string theory, and can also occur in dualising some of the more familiar tensor fields. They arise too in higher spin gauge theories in dimensions 5 ≤ D ≤ 7 [17]. The free covariant field theories for such representations have been discussed in [16, 18, 2]. It is well known that a massless p-form gauge field A in D dimensions can be dualised to a D − p − 2 form gauge field B, with dA = ∗dB, exchanging field equations with Bianchi identitites. This can be extended to other tensor fields. For example, a massive spin-two field in four dimensions, usually described in terms of a symmetric second rank tensor, has a dual description in terms of a third rank tensor dµνρ = d[µν]ρ satisfying dµνρ = −dνµρ and d[µνρ] = 0 [4]. It is then of mixed symmetry, corresponding to a Young tableau with two columns of length two and one. In [12–15], dual forms of linearised gravity were found in arbitrary dimensions. For example, in D = 5, linearised gravity is formulated in the usual way in terms of a symmetric tensor hµν . There are however two dual forms of this theory, one formulated in terms of a gauge field dµνρ (again satisfying dµνρ = −dνµρ and d[µνρ] = 0) and one
256
P.F. de Medeiros, C.M. Hull
formulated in terms of a gauge field cµνρσ with the same symmetries as the Riemann tensor. These fields transform under the gauge transformations δcµνρσ = 2 ∂[µ χν]ρσ + 2 ∂[ρ χσ ]µν − 4 ∂[µ χνρσ ] , δdµνρ = 2 ∂[µ αν]ρ − 2 ∂[µ ανρ] + ∂ρ βµν − ∂[ρ βµν] , δhµν = 2 ∂(µ ξν)
(1)
respectively, with parameters χµ[νρ] = χµνρ , αµν , β[µν] = βµν and ξµ , so that each gauge field corresponds to five physical degrees of freedom; in each case, these are in the 5 representation of the little group SO(3). The respective gauge invariant field strengths are Gµνρσ αβ = 9 ∂[µ cνρ][σ α,β] , Sµνραβ = −6 ∂[µ dνρ][α,β] , (2) Rµνρσ = −4 ∂[µ hν][ρ,σ ] , which all involve two derivatives. In gravity, the linearised curvature satisfies the Bianchi identity R[µνρ]σ = 0 (3) and the linearised field equation ηµρ Rµνρσ = 0,
(4)
where ηµν is the SO(4, 1)-invariant Minkowski metric. The gauge field dµνρ can be defined non-locally in terms of the gauge field hµν via the duality relation Sµνραβ =
1 γδ µνργ δ R αβ . 2
(5)
The gravitational field equation then implies that Sµνραβ satisfies the Bianchi identity S[µνρα]β = 0,
(6)
while the gravitational Bianchi identity implies the field equation ηµα Sµνραβ = 0.
(7)
Similarly, one can introduce the gauge field cµνρσ via a further duality Gµνρσ αβ =
1 σ αβγ δ Sµνρ γ δ 2
(8)
whose field strength then satisfies the Bianchi identity G[µνρσ ]αβ = 0,
(9)
corresponding to the gravitational Bianchi identity and the field eqautions for dµνρ , and the field equation ηµσ ηνα Gµνρσ αβ = 0, (10) corresponding to the gravitational field equation and the Bianchi identity for Sµνραβ . These dual forms of gravity in D = 5 arise naturally from considering the reduction of the (4,0) supersymmetric free theory in D = 6, in which the five physical degrees of freedom of the graviton in D = 5 arise from the reduction of a gauge + field CMNP Q in six dimensions with the symmetries of the Riemnann tensor and with
Exotic Tensor Gauge Theory and Duality
257
field strength satisfying D = 6 self-duality constraints [12, 14, 15]. In [12], it was argued that M-theory compactified to D = 5, with D = 5, N = 8 supergravity as the low-energy effective field theory, could have a strong-coupling limit giving rise to a D = 6, (4,0) supersymmetric theory with gravity described by the exotic gauge field + CMNP Q. This generalises to D dimensions, where the graviton hµν is dual to a gauge field Dµ1 ...µD−3 ν (corresponding to a Young tableau with one column of length D − 3 and one of length one) or to a gauge field Cµ1 ...µD−3 ν1 ...νD−3 (corresponding to a Young tableau with two columns, both of length D − 3). Antisymmetric tensor gauge fields are naturally formulated in the language of differential forms. The purpose of this paper is to develop the corresponding formalism for these higher rank gauge fields and their dualities, following the work of [6–8]. Gauge fields such as hµν , Dµ1 ...µD−3 ν or Cµ1 ...µD−3 ν1 ...νD−3 represented by Young tableaux with two columns of length p, q are elegantly described in terms of bi-forms, taking values in the tensor product space p ⊗ q of p-forms with q-forms. This formalism is developed in Sect. 2 and applied to gauge theories. The construction is generalised in Sect. 3, using multi-forms to establish dual descriptions of theories with gauge fields in arbitrary representations of GL(D, R). 2. Generalised Dual Descriptions of Linearised Gravity In five dimensions the usual electromagnetic duality between the local descriptions in terms of the one and two-form gauge fields Aµ or Bµν generalises to a gravitational triality between the local descriptions in terms of the three gauge fields hµν , dµνρ or cµνρσ in representations corresponding toYoung tableaux with two columns. The electromagnetic duality can be considered in arbitrary spacetime dimension D with equivalent descriptions in terms of the electric or magnetic potentials Aµ or Bµ1 ...µD−3 . As discussed in [15], one can similarly consider a linearised gravitational triality in D dimensions with the conventional presentation in terms of the graviton hµν having equivalent descriptions in terms of either Dµ1 ...µD−3 ν or Cµ1 ...µD−3 ν1 ...νD−3 . The three dual fields correspond to the GL(D, R)-irreducible two-columnYoung tableaux representations [1, 1], [D −3, 1] and [D − 3, D − 3]. The purpose of this section is to develop the theory of bi-forms which describe these gauge fields and their triality most economically. This work was first presented in [5].
2.1. Bi-forms. Consider the GL(D, R)-reducible tensor product space of p-forms and q-forms Xp,q := p ⊗ q on RD whose elements are T =
1 Tµ ...µ ν ...ν dx µ1 ∧ ... ∧ dx µp ⊗ dx ν1 ∧ ... ∧ dx νq , p!q! 1 p 1 q
(11)
where the components Tµ1 ...µp ν1 ...νq = T[µ1 ...µp ][ν1 ...νq ] are totally antisymmetric in each of the {µ} and {ν} sets of indices separately. No symmetry properties are assumed between the indices µi and the indices νj . The tensor field T ∈ X p,q is well defined and will be referred to as a bi-form. This definition of a bi-form is useful since one can employ various constructions from the theory of forms acting on the individual p and
q subspaces.
258
P.F. de Medeiros, C.M. Hull
A generalisation of the exterior wedge product defines the bi-form T T for any T ∈ X p,q and T ∈ Xp ,q , by
Xp+p ,q+q ,
T T =
∈
1
Tµ ...µ ν ...ν T µp+1 ...µp+p νq+1 ...νq+q + q )! 1 p 1 q dx µ1 ∧ ... ∧ dx µp+p ⊗ dx ν1 ∧ ... ∧ dx νq+q . (12) This definition gives the space X∗ := (p,q) ⊕X p,q a ring structure with respect to the -product and the natural addition of bi-forms. Clearly, standard operations on differential forms generalise to bi-forms. There are two exterior derivatives on X p,q . The left derivative (p
+ p )!(q
and the right derivative
d : Xp,q → X p+1,q
(13)
d˜ : Xp,q → Xp,q+1
(14)
whose actions on T are defined by 1 ∂[µ Tµ1 ...µp ]ν1 ...νq dx µ ∧ dx µ1 ∧ ... ∧ dx µp ⊗ dx ν1 ∧ ... ∧ dx νq , p!q! ˜ = 1 ∂[ν T|µ1 ...µp |ν1 ...νq ] dx µ1 ∧ ... ∧ dx µp ⊗ dx ν ∧ dx ν1 ∧ ... ∧ dx νq .1 (15) dT p!q!
dT =
It is clear from these definitions that d 2 = d˜ 2 = 0,
˜ d d˜ = dd.
(16)
One can also write the total derivative
defined as
D : Xp,q → X p+1,q ⊕ X p,q+1
(17)
D := d + d˜
(18)
D3
which satisfies = 0. Similarly, one can also construct distinct left ιk : Xp,q → X p−1,q
(19)
ι˜k : Xp,q → X p,q−1
(20)
and right interior products defined by 1 k µ1 Tµ1 µ2 ...µp ν1 ...νq dx µ2 ∧ ... ∧ dx µp ⊗ dx ν1 ∧ ... ∧ dx νq , (p − 1)!q! 1 ι˜k T = k ν1 Tµ1 ...µp ν1 ν2 ...νq dx µ1 ∧ ... ∧ dx µp ⊗ dx ν2 ∧ ... ∧ dx νq , (21) p!(q − 1)! ιk T =
for some vector field k. Again, it is clear that ι2k = ι˜2k = 0 and ιk ι˜k = ι˜k ιk . 1
As usual, the square bracketed indices are to be antisymmetrised while those inside vertical bars are excluded from the antisymmetrisation. For example T[µ|νρ|σ ] := 21 (Tµνρσ − Tσ νρµ ) for some fourth rank tensor Tµνρσ .
Exotic Tensor Gauge Theory and Duality
259
Consider now such bi-forms as reducible representations of the Lorentz group SO (D − 1, 1) ⊂ GL(D, R), so that there is a Minkowski metric ηµν and a totally antisymmetric tensor µ1 ...µD which are SO(D − 1, 1)-invariant tensors. These allow the construction of two inequivalent Hodge duality operations on bi-forms. There is a left dual (22) ∗ : X p,q → XD−p,q and a right dual ∗˜ : Xp,q → X p,D−q
(23)
defined by 1 µ ...µ Tµ ...µ ν ...ν 1 pµp+1 ...µD dx µp+1 ∧ ... ∧ dx µD p!(D − p)!q! 1 p 1 q ⊗dx ν1 ∧ ... ∧ dx νq , 1 ν ...ν ∗˜ T = Tµ ...µ ν ...ν 1 q νq+1 ...νD dx µ1 ∧ ... ∧ dx µp p!q!(D − q)! 1 p 1 q ⊗dx νq+1 ∧ ... ∧ dx νD , (24) ∗T =
where indices are raised using the SO(D − 1, 1)-invariant metric. These definitions imply ∗2 = (−1)1+p(D−p) , ∗˜ 2 = (−1)1+q(D−q) and ∗˜∗ = ∗˜ ∗. 2 This allows one to also define two inequivalent “adjoint” derivatives
and
d † := (−1)1+D(p+1) ∗ d∗ : X p,q → X p−1,q
(25)
d˜† := (−1)1+D(q+1) ∗˜ d˜ ∗˜ : Xp,q → X p,q−1
(26)
whose actions on T are defined by 1 ∂ µ1 Tµ1 µ2 ...µp ν1 ...νq dx µ2 ∧ ... ∧ dx µp ⊗ dx ν1 ∧ ... ∧ dx νq , (p − 1)!q! 1 d˜† T = ∂ ν1 Tµ1 ...µp ν1 ν2 ...νq dx µ1 ∧ ... ∧ dx µp ⊗ dx ν2 ∧ ... ∧ dx νq . (27) p!(q − 1)!
d †T =
2 2 These definitions imply d † = d˜† = 0 and d † d˜† = d˜† d † . One can then define the Laplacian operator
:= dd † + d † d ≡ d˜ d˜† + d˜† d˜ : Xp,q → X p,q , where the second equality follows identically. A trace operation τ : Xp,q → X p−1,q−1
(28)
(29)
2 Another formalism used to describe higher spin gauge theories is proposed in [6–8]. The construction there specifies a sequence of Young diagrams with increasing numbers of cells. There is then just one notion of exterior derivation which maps to the next element in the sequence and one interior product which maps to the previous element. This encounters some difficulties when it comes to describing the linearised gauge theory. Recall, for example, that the gauge transformation for dµνρ in (1) contained terms involving two parameters αµν and βµν with different index symmetries. Consequently if dµνρ is an element in the complex then one could only write its gauge transformation as the exterior derivative of either the symmetric part of αµν or βµν - whichever was chosen to be in the complex.
260
P.F. de Medeiros, C.M. Hull
can be defined by τT =
1 ηµ1 ν1 Tµ1 ...µp ν1 ...νq dx µ2 ∧...∧dx µp ⊗dx ν2 ∧...∧dx νq . (30) (p − 1)!(q − 1)!
Consequently, one can define two inequivalent “dual trace” operations
and
σ := (−1)1+D(p+1) ∗ τ ∗ : X p,q → X p+1,q−1
(31)
σ˜ := (−1)1+D(q+1) ∗˜ τ ∗˜ : Xp,q → X p−1,q+1
(32)
so that (−1)p+1 T[µ ...µ ν ]ν ...ν dx µ1 ∧ ... ∧ dx µp ∧ dx ν1 ⊗ dx ν2 ∧ ... ∧ dx νq , p!(q − 1)! 1 p 1 2 q (−1)q+1 Tµ ...µ [µ ν ...ν ] dx µ1 ∧ ... ∧ dx µp−1 ⊗ dx µp ∧ dx ν1 ∧ ... ∧ dx νq , σ˜ T = (p − 1)!q! 1 p−1 p 1 q (33) σT =
It is also useful to define a transpose operation t : Xp,q → X q,p
(34)
by tT =
1 Tν ...ν µ ...µ dx ν1 ∧ ... ∧ dx νq ⊗ dx µ1 ∧ ... ∧ dx µp p!q! 1 q 1 p
(35)
and a map η : Xp,q → X p+1,q+1
(36)
by ηT =
1 ηµ ν Tµ ...µ ν ...ν dx µ1 ∧ ... ∧ dx µp ∧ dx µp+1 (p + 1)!(q + 1)! 1 1 2 p+1 2 q+1 ⊗dx ν1 ∧ ... ∧ dx νq ∧ dx νq+1 ,
(37)
where the action in (37) is identical to the -product with the SO(D − 1, 1)-invariant metric ηµν , so that ηT ≡ η T . It is also convenient for the forthcoming discussion to state the identities τ d + dτ ˜ τ d˜ + dτ † † τd + d τ τ d˜† + d˜† τ
= d˜† , = d †, = 0,
(38)
= 0,
which imply the further relations (−1)n+1 τ n d + dτ n ˜ n (−1)n+1 τ n d˜ + dτ σ d + dσ σ˜ d˜ + d˜ σ˜ that follow by induction.
= n d˜† τ n−1 , = n d † τ n−1 , = 0, = 0,
(39)
Exotic Tensor Gauge Theory and Duality
261
A further important result is that, for any bi-form T ∈ X p,q , then τ nT = 0
(τ D−p−q+n ∗ ∗˜ )T = 0,
⇒
(40)
∗ ∗˜ )T = 0 for n ≥ 1 and D − p − q + n ≥ 1. The but does not imply proof follows since the expression on the right hand side of (40) only contains terms with n or more traces of T . Consequently τ T = 0 implies the whole bi-form T = 0 for D < p + q (but not for D ≥ p + q). More generally, τ n T = 0 implies T = 0 for D < p + q + 1 − n. (τ D−p−q+n−1
2.2. Reducible gauge theories. In this section, we will discuss the gauge theories for fields in reducible representations of GL(D, R), and in the following section we will refine this to examine irreducible representations. Consider then a gauge field A ∈ X p,q with components Aµ1 ...µp ν1 ...νq = A[µ1 ...µp ][ν1 ...νq ] . Consider also the general gauge transformation δA = dα p−1,q + d˜ α˜ p,q−1 (41) p−1,q p−1,q p,q−1 p,q−1 with gauge parameters α ∈X , α˜ ∈X . The gauge invariant field strength is ˜ F = d dA (42) satisfying the Bianchi indentities ˜ = 0. dF
dF = 0,
(43)
A natural field equation to impose is τ F = 0.
(44)
This gives a reducible theory. For example, A ∈ X1,1 is a general second rank tensor Aµν which can be decomposed into symmetric and antisymmetric parts. Such reducible theories were investigated in [15] and although the formalism can be developed, it seems more natural to decompose into irreducible representations of GL(D, R). 2.3. Bi-form gauge theory. The formalism above can be used to describe a free gauge theory whose gauge potential A ∈ X p,q is a tensor field transforming in an irreducible representation of GL(D, R) such that its components Aµ1 ...µp ν1 ...νq have the index symmetry of a two-column Young tableau with p cells in the left column and q cells in the right column. Without loss of generality, we take p ≥ q. A shorthand notation for this representation is [p, q].3 We will write X [p,q] for the subspace of X p,q in the representation [p, q]. Recall that irreducibility under GL(D, R) implies that the components of A must satisfy [10] and
A[µ1 ...µp ][ν1 ...νq ] = Aµ1 ...µp ν1 ...νq Aµ1 ...µp ν1 ...νq = Aν1 ...νq µ1 ..µp
Such a tensor has dimD [p, q] =
D p
,
A[µ1 ...µp ν1 ]ν2 ...νq = 0, if p = q.
D+1 q
q 1− p+1
(45)
(46)
3 The convention of writing tableaux in terms of the number of cells in each column differs from the standard way of labelling by row occupancy (e.g. in [10]), but is more suitable for the discussion here.
262
P.F. de Medeiros, C.M. Hull
independent components for p ≥ q. For example, the graviton hµν in five dimensions has dim5 [1, 1] = 15 components. These conditions can be written as the conditions on the bi-form A and
σ A = 0, A = tA
if p = q.
(47)
Thus, for p = q, X[p,q] is the kernel of the map σ : X p,q → X p+1,q−1 , while for p = q it is the subspace of the kernel invariant under the transpose t. It should be noted that σ˜ A = 0 for p > q though for p = q, σ A = σ˜ A = 0, since tA = A. This presentation is given before gauge fixing. After restricting to the physical (lightˆ written Aˆ i1 ...ip j1 ...jq , transform irreducibly under the cone) gauge, the components of A, little group SO(D − 2) ⊂ SO(D − 1, 1) ⊂ GL(D, R) (for D ≥ 2). Since this implies the existence of an SO(D − 2)-invariant metric δij then irreducibility under SO(D − 2) implies that, in addition to the index symmetries (45), the components of Aˆ must also satisfy the tracelessness condition Aˆ i i2 ...ip ij2 ...jq = 0 (or equivalently τ Aˆ = 0 with respect to the SO(D − 2)-invariant metric). Such a representation, in physical gauge, therefore has dim(D−2) [p, q] = dim(D−2) [p, q] − dim(D−2) [p − 1, q − 1]
(48)
independent components. For example, the physical graviton hˆ ij in five dimensions has dim3 [1, 1] = 5. The GL(D, R)-reducible space of bi-forms Xp,q contains the space of all type [p, q] tensors, written X[p,q] , as a GL(D, R)-irreducible subspace so the bi-form operations in Sect. 2.1 are well defined on these irreducible tensors. The projection from X p,q onto X[p,q] is the Young symmetriser Y[p,q] for the particular [p, q] tableau symmetry [10]. Thus A satisfies A = Y[p,q] ◦ A. (49) In order to preserve this, we project the gauge transformation (41) to obtain δA = Y[p,q] ◦ dα p−1,q + d˜ α˜ p,q−1
(50)
for bi-form parameters α p−1,q ∈ X p−1,q and α˜ p,q−1 ∈ X p,q−1 . These gauge parameters are not assumed to be GL(D, R)-irreducible. The first order gauge transformation for A is then proportional to the sum of two tableaux with symmetry of [p, q] type - one term having a single partial derivative entered in the left column and the other having a single partial derivative entered in the right column. In conventional Abelian gauge theory with a p-form (one-column [p, 0] tableau) potential Aµ1 ...µp one can write a field strength (p + 1)-form F = dA which is invariant under the gauge transformation δA = dα for some (p −1)-form parameter α. Following this, several authors [4, 9, 3] have proposed a type [3, 1] field strength involving a single derivative of the two-column, type [2, 1] gauge potential d[µν]ρ used to describe the Pauli-Fierz system. Such a construction, however, is found to be invariant under only the α 1,1 part of the most general gauge transformation proposed in (50). The example of the type [1, 1] graviton though has the linearised Riemann tensor, involving two derivatives, as its invariant field strength. This is indicative of the general observation that two-column tableaux gauge potentials should describe linearised “gravitational type” systems whose
Exotic Tensor Gauge Theory and Duality
263
field strengths involve two derivatives of the given gauge potential. Consequently, the unique GL(D, R)-irreducible field strength F is the type [p + 1, q + 1] tensor defined by ˜ ˜ F = Y[p+1,q+1] ◦ d dA ≡ d dA (51) which is invariant under the full gauge transformation (50). The left and right exterior derivatives act as d : X [p,q] → X p+1,q and d˜ : X [p,q] → X p,q+1 on the irreducible subspace X[p,q] and so do not map tableaux to tableaux. The composite operator ˜ however, acts as d d˜ : X [p,q] → X [p+1,q+1] on the irreducible subspace which d d, implies the identity in (51). The expression is unambiguous since the left and right exterior derivatives commute. A theorem in [7], in fact, allows any globally defined type [p + 1, q + 1] tensor F that is closed under both d and d˜ to be written as in (51) for some locally defined type [p, q] potential A (this is true globally on RD ). This generalises the well known Poincar´e lemma that any closed (p + 1)-form can be written locally as the exterior derivative of some p-form potential. In terms of Young tableaux, (51) simply corresponds to a type [p + 1, q + 1] pattern with one partial derivative in each of the two columns. The gauge invariance of (51) then follows because δF corresponds to a type [p + 1, q + 1] pattern with at least two (commuting) partial derivatives in a single column. In addition to this gauge invariance, the field strength F also satisfies the two second Bianchi identities ˜ = 0 dF = 0, dF (52) which follow from a similar reasoning, and the first Bianchi identity σF = 0
(53)
for p ≥ q, by virtue of GL(D, R)-irreducibility. The natural equation of motion is the generalised Einstein equation τ F = 0.
(54)
This is non-trivial, in the sense that it does not imply F = 0, for dimension D ≥ p+q +2 (using (40)). More generally, for dimensions D = p + q + 3 − n, the field equation τ nF = 0
(55)
gives a non-trivial equation. For example, for a type [1,1] tensor (graviton) hµν , F is the type [2,2] linearised Riemann tensor, and τ F = 0 is the vacuum Einstein equation Rµν = 0. This is non-trivial for D ≥ 4. For D = 3, the Einstein equation Rµν = 0 implies the whole curvature vanishes, and is too strong as it has no non-trivial solutions. For D = 3, the equation τ 2 F = 0, i.e. the vanishing of the Ricci scalar R = 0 does give a non-trivial equation and gives a theory dual to a scalar field [15].
2.4. Dualities between type [p, q] tensors. As discussed in [15], although the gravitational triality in five dimensions was found via dimensional reduction of a self-dual field in six dimensions, one can more generally consider the dual descriptions of a graviton in D dimensions. Given the linearised Riemann curvature R as the type [2, 2] field strength
264
P.F. de Medeiros, C.M. Hull
(51) for the graviton then one can construct two inequivalent Hodge duals given by the bi-forms S := ∗R ∈ X D−2,2 and G := ∗˜∗R ∈ X D−2,D−2 which are written 1 αβ R ν1 ν2 αβµ1 ...µD−2 , 2 1 = R αβγ δ αβµ1 ...µD−2 γ δν1 ...νD−2 4
Sµ1 ...µD−2 ν1 ν2 = Gµ1 ...µD−2 ν1 ...νD−2
(56) (57)
in component form. The curvature tensor satisfies the first Bianchi identities σ R = 0,
σ˜ R = 0,
(58)
˜ = 0, dR
(59)
the second Bianchi identities dR = 0, and the Einstein equation
τR = 0
(60)
in D ≥ 4, which implies the secondary field equations d † R = 0,
d˜† R = 0,
(61)
which are obtained using (38). The dual tensors S = ∗R,
G = ∗˜∗R
(62)
σ G = σ˜ G = 0,
(63)
then satisfy the algebraic constraints σ S = 0,
which imply they are GL(D, R)-irreducible, such that S ∈ X[D−2,2] and G ∈ X [D−2,D−2] . They also satisfy the differential constraints dS = 0,
˜ = 0, dS
d † S = 0,
d˜† S = 0
(64)
τ D−4 d˜† G = 0
(65)
and dG = 0,
˜ = 0, dG
τ D−4 d † G = 0,
together with the field equations τ S = 0,
τ D−3 G = 0,
(66)
which are non-trivial for S and G in D ≥ 4. These are derived from the properties of the Riemann tensor and by using (39) and (40). The dualities (56) and (57) interchange field equations and Bianchi identities. For example the field equation τ R = 0 becomes τ ∗ S = 0 and which is equivalent to ∗σ S = 0, implying the Bianchi identity σ S = 0. It then gives the field equation τ ∗ ∗˜ G = 0, which is equivalent to τ D−3 G = 0. The Bianchi identity σ R = 0 becomes the field equation τ S = 0 and then the Bianchi identity σ˜ G = 0. These properties follow from the fact that τ , which occurs in the field equations, is related to σ , which occurs in the Bianchi identities, by Hodge duality as σ ∼ ∗τ ∗.
Exotic Tensor Gauge Theory and Duality
265
The irreducibility of the duals S and G, together with the fact that they are closed under both d and d˜ further implies that these tensors can be solved in terms of the type [D − 3, 1] and [D − 3, D − 3] gauge potentials Dµ1 ...µD−3 ν and Cµ1 ...µD−3 ν1 ...νD−3 , such ˜ and G = d dC. ˜ The three constraints τ R = 0, τ S = 0 and τ D−3 G = 0 that S = d dD can then be seen as the non-trivial, linearised equations of motion for h, D and C. Another way to see the duality given above is in physical gauge where one has no gauge symmetry to consider and there exists an SO(D − 2) orientation tensor i1 ...iD−2 . The two dual potentials are then related to the physical graviton hij such that D = ∗h and C = ∗˜∗h (where ∗ here denotes the SO(D − 2)-covariant Hodge dual). This gaugefixed definition of D and C implies that they are irreducible under SO(D − 2) following the irreducibility of h. A non-trivial check that each of these three fields describes the same number of physical degrees of freedom follows the result that D(D − 3) dim(D−2) [1, − 3, 1] = dim(D−2) [D − 3, D − 3] = 1] = dim(D−2) [D 2 (67) for D > 3.4 The arguments presented in this section are straightforwardly extended for dualities between arbitrary type [p, q] tensor gauge theories. Consider a type [p, q] tensor gauge ˜ satisfying the field h (with p ≥ q) having type [p + 1, q + 1] field strength R = d dh ˜ = 0. It is assumed that first and second Bianchi identities σ R = 0, dR = 0 and dR D ≥ p + q + 2 so that the “Einstein” equation τ R = 0 which is imposed has non-trivial solutions. These properties of R again imply the secondary field equations d † R = 0 and d˜† R = 0. For general p ≥ q one can define three inequivalent Hodge dual field strengths S := ∗R ∈ X D−p−1,q+1 , S˜ := ∗˜ R ∈ X p+1,D−q−1 , G := ∗˜∗R ∈ X D−p−1,D−q−1 ,
(68)
where S˜ = tS for p = q. The above dualities, together with the properties of R, imply the algebraic constraints σ S = 0,
σ˜ S˜ = 0,
σ˜ G = 0,
(69)
with the additional constraints σ G = 0 only if p = q and σ˜ S = σ S˜ = 0 only if D = p + q + 2. S, S˜ and G are therefore GL(D, R)-irreducible. 5 The dual tensors also satisfy the differential constraints dS = 0, d S˜ = 0, ˜ = 0, dG
d † S = 0, d˜ S˜ = 0, τ D−2−p−q d † G = 0
d˜† S = 0, d˜† S˜ = 0,
(70)
4 This is analogous to the more usual duality in physical gauge wherein a p-form gauge field A is related to a dual (D − 2 − p)-form B such that B = ∗A. In this case both fields describe the same D−2 physical degrees of freedom. p 5 The constraints in (69) imply this following the assumption D ≥ p + q + 2 which implies that the ˜ The bi-form G has bi-form S has a left column length ≥ that of the right column and vice versa for S. right column length ≥ left column length, due to the assumption p ≥ q.
266
P.F. de Medeiros, C.M. Hull
˜ = 0, d † S˜ = 0, dG = 0 and τ D−2−p−q d˜† G = 0 if with the additional constraints dS p = q. The field equations imposed on the dual tensors are τ S = 0,
τ 1+p−q S˜ = 0,
τ D−1−p−q G = 0
(71)
which are non-trivial equations for S, S˜ and G if p ≥ q. The expressions (39) and (40) are used to derive the results above and again the dualities mix Bianchi and field equation ˜ constraints for the dual field strengths. For p = q, the expressions above imply S(= t S) and G can be solved in terms of type [D − p − 2, p] and [D − p − 2, D − p − 2] tensor ˜ and G = d dC. ˜ The field equations (71) can potentials D and C, such that S = d dD then be seen as the equations of motion for D and C.
2.5. Gauge invariant actions. In this section we construct the gauge invariant actions corresponding to the field equations given above for bi-form gauge fields in sufficiently large dimension. The construction of such actions is perhaps best described by starting with some simple examples. Consider the three fields hµν , dµνρ and cµνρσ of types [1, 1], [2, 1] and [2, 2] described in Sect. 1, but now in D dimensions. The invariant action S [1,1] for the graviton hµν is the linearised Einstein-Hilbert action given by S [1,1] = −
1 2
d D x hµν Eµν =
dDx
∂ [µ hν]ρ ∂[µ hν]ρ − 2∂ [µ hν]ν ∂[µ hρ]ρ , (72)
where Eµν := Rµν − 21 ηµν R denotes the linearised Einstein tensor, satisfying ∂ µ Eµν ≡ 0, constructed from contractions of the linearised Riemann tensor Rµνρσ = −4∂[µ hν][ρ,σ ] . The Lagrangian has two expressions as either − 21 hµν Eµν in terms of the linearised Einstein tensor Eµν or as quadratic terms in the type [2, 1] single derivative object ∂[µ hν]ρ and its SO(D − 1, 1)-irreducible trace part ∂[µ hν] ν . Notice that neither ∂[µ hν]ρ nor its SO(D − 1, 1)-irreducible traceless part is invariant under δhµν = 2∂(µ ξν) even though S [1,1] is. The easiest way to see the gauge invariance is in terms of the − 21 hµν Eµν Lagrangian. Since the linearised Riemann tensor Rµνρσ = −4∂[µ hν][ρ,σ ] is the field strength (51) for the graviton then it follows that the linearised Einstein tensor is also gauge invariant (it being constructed from traces of Rµνρσ ). The gauge transformation therefore only changes the Lagrangian by − 21 (δhµν )E µν which simply consists of a total derivative term (since ∂ µ Eµν ≡ 0 identically) and so vanishes in the action integral. The field equation for this model is simply the linearised vacuum Einstein equation Eµν = 0 which is equivalent to the linearised Ricci flatness condition R α µαν = 0 in D = 2. It is also noted that this equation is trivial unless D ≥ 4. 6 The equation can be decomposed in terms of its linearly independent components, whence the graviton µ satisfies ∂ 2 hµν = 0, ∂ µ hµν = 0 and hµ = 0. The gauge field dµνρ has a field strength Sµνραβ = −6 ∂[µ dνρ][α,β] (51) and field equation τ S = 0. The invariant action S [2,1] giving this field equation has been constructed in [9, 3]. The presentation in these articles consists of quadratic terms in the type [3, 1] single derivative object ∂[µ dνρ]α and its trace part ∂[µ dνα] α . As already noted, 6 In the lower dimension D = 3, one cannot obtain the non-trivial (vanishing Ricci scalar) equation R = 0 from a gauge invariant action.
Exotic Tensor Gauge Theory and Duality
267
these objects are not individually gauge invariant under (50), even though S [2,1] is. A more obviously gauge invariant presentation is given by 1 S [2,1] = − d D x d µνρ Eµνρ , (73) 2 where Eµνρ is the linearised “Einstein” tensor for dµνρ , defined by Eµνρ := S α µναρ − ηρ[µ S
αβ ν]αβ
(74)
or equivalently, E = τ S − ητ 2 S in bi-form notation. By construction, Eµνρ is a gauge invariant type [2, 1] tensor satisfying ∂ µ Eµνρ ≡ 0 and ∂ ρ Eµνρ ≡ 0 identically. Consequently the gauge transformation of the Lagrangian in (73) is a total derivative implying the action is gauge invariant. The field equation for this model is then the linearised vacuum “Einstein” equation Eµνρ = 0 which is equivalent to S α µναρ = 0 in D = 3. This equation is non-trivial in D ≥ 5. The linearly independent components of this µ imply that dµνρ satisfies ∂ 2 dµνρ = 0, ∂ µ dµνρ = 0, ∂ ρ dµνρ = 0 and dµν = 0. [2,2] for cµνρσ also has a more explicitly gauge invariant form The invariant action S given by 1 d D x cµνρσ Eµνρσ , (75) S [2,2] = − 4 where Eµνρσ is defined by Eµνρσ := Gαµναρσ − ηρ[µ G
αβ ν]αβσ
+ ησ [µ G
αβ ν]αβρ
+
1 αβγ ηρ[µ ην]σ G αβγ (76) 3
in terms of the field strength Gµνρσ αβ = 9 ∂[µ cνρ][σ α,β] . Equation (76) can equivalently be written as E = τ G − 2ητ 2 G + 13 η2 τ 3 G in bi-form notation. By construction, Eµνρσ is a gauge invariant type [2, 2] tensor satisfying ∂ µ Eµνρσ ≡ 0. This implies the gauge invariance of (75). The field equation for this model is Eµνρσ = 0 which is equivalent to Gαµναρσ = 0 in D = 4. This equation is non-trivial in D ≥ 6. The linearly independent components of this then imply that cµνρσ satisfies ∂ 2 cµνρσ = 0, ∂ µ cµνρσ = 0 and µ cµν σ = 0. The general construction then considers the gauge invariant action S [p,q] for a given type [p, q] tensor gauge field Aµ1 ...µp ν1 ...νq . The gauge invariance can be seen by writing S [p,q] in the form 1 (77) d D x Aµ1 ...µp ν1 ...νq Eµ1 ...µp ν1 ...νq , S [p,q] = − p!q! where Eµ1 ...µp ν1 ...νq is the gauge invariant type [p, q] tensor satisfying ∂ µ1 Eµ1 ...µp ν1 ...νq ≡ 0 and ∂ ν1 Eµ1 ...µp ν1 ...νq ≡ 0 identically. Such a tensor always exists and involves various terms involving two derivatives on Aµ1 ...µp ν1 ...νq and all its possible traces. The general form of Eµ1 ...µp ν1 ...νq is given by Eµ1 ...µp ν1 ...νq = Y[p,q] ◦ F α µ1 ...µp αν1 ...νq + a1 ηµ1 ν1 F α1 α2 µ2 ...µp α1 α2 ν2 ...νq α ...α + . . . + aq ηµ1 ν1 ...ηµq νq F 1 q+1 µq+1 ...µp α1 ...αq+1 (78)
268
P.F. de Medeiros, C.M. Hull
for q coefficients a1 ,...,aq (assuming p ≥ q). The leading order term is the single trace τ F of the type [p + 1, q + 1] field strength F of A which is a type [p, q] tensor. The correction terms involve successive traces of this object (appropriately symmetrised using Y[p,q] ). The precise values of the coefficients are fixed uniquely by the conservation conditions above (for example, a1 = −pq/2). The field equation is given by the linearised vacuum “Einstein” equation Eµ1 ...µp ν1 ...νq = 0 which is equivalent to the equation F α µ1 ...µp αν1 ...νq = 0, proposed in [15], for D = p + q. This equation is non-trivial in D ≥ p + q + 2. The linearly independent components of this equation imply ∂ 2 Aµ1 ...µp ν1 ...νq = 0, ∂ µ1 Aµ1 ...µp ν1 ...νq = 0, ∂ ν1 Aµ1 ...µp ν1 ...νq = 0 and Aα µ2 ...µp αν2 ...νq = 0. 3. Multi-Form Structure in Exotic Tensor Gauge Theory Having found the structure of bi-forms most suitable to describe gravitational dualities, this section will develop the general structure of multi-forms which can be used to construct gauge theories in D dimensions whose gauge fields transform in general irreducible representations of GL(D, R). The various dual descriptions of such gauge theories are also considered. This generalised description could also be relevant in studies of string field theory and W-geometry [11]. The multi-form construction has also been considered in [8, 1]. 3.1. Multi-forms. A multi-form of order N is a tensor field T that is an element of the GL(D, R)-reducible N -fold tensor product space of pi -forms (where i = 1, ..., N), written Xp1 ,...,pN := p1 ⊗ ... ⊗ pN . (79) The components of T are written Tµ1 ...µ1 1
antisymmetric in each set of
{µi }
and are taken to be totally
indices, such that
T[µ1 ...µ1 1
i i N p1 ...µ1 ...µpi ...µpN
i N i N p1 ]...[µ1 ...µpi ]...[µ1 ...µpN ]
= Tµ1 ...µ1 1
i N i N p1 ...µ1 ...µpi ...µ1 ...µpN
.
(80)
The generalisation of the operations defined for bi-forms to multi-forms of order N over RD is then straightforward. The -product is the map
: X p1 ,...,pN × X p 1 ,...,p N → X p1 +p 1 ,...,pN +p N
(81)
defined by the N -fold wedge product on the individual form subspaces, by analogy with (12). There are N inequivalent exterior derivatives d (i) : Xp1 ,...,pi ,...pN → Xp1 ,...,pi +1,...,pN
(82)
which are individually defined, by analogy with (15), as the exterior derivatives acting 2 on the pi form subspaces. This definition implies d (i) = 0 (with no sum over i) and that any two d (i) commute. One can also define the total derivative D :=
N
d
i=1
which satisfies DN+1 = 0.
(i)
:X
p1 ,...,pi ,...pN
→
N i=1
⊕ Xp1 ,...,pi +1,...,pN
(83)
Exotic Tensor Gauge Theory and Duality
269
There are N inequivalent interior products ιk : Xp1 ,...,pi ,...,pN → Xp1 ,...,pi −1,...,pN (i)
(84)
whose action is defined, by analogy with (21), as the individual interior products on (i) 2
(i)
each pi form subspace. Consequently ιk = 0 (with no sum over i) and any two ιk commute. For representations of SO(D − 1, 1) ⊂ GL(D, R) there are N inequivalent Hodge dual operations ∗(i) : Xp1 ,...,pi ,...pN → X p1 ,...,D−pi ,...pN (85) which, following (24), are defined to act as the Hodge duals on the individual pi form 2 subspaces. This implies that ∗(i) = (−1)1+pi (D−pi ) (with no sum over i) with any two (i) ∗ commuting. This also allows one to define N inequivalent “adjoint” exterior derivatives d†
(i)
:= (−1)1+D(pi +1) ∗(i) d (i) ∗(i) : Xp1 ,...,pi ,...pN → X p1 ,...,pi −1,...,pN . (i) 2
This implies d † = 0 (with no sum over i) and any two d † define the Laplacian operator := d (i) d †
(i)
(i)
(86)
commute. One can then
(i)
+ d † d (i) : Xp1 ,...,pi ,...pN → X p1 ,...,pi ,...,pN
(87)
with no sum over i. There exist (N − 1)! inequivalent trace operations τ (ij ) : Xp1 ,...,pi ,...,pj ,...pN → X p1 ,...,pi −1,...,pj −1,...,pN
(88)
defined, by analogy with (30), as the single trace between the pi and pj form subi j spaces using ηµ1 µ1 . This allows one to define two inequivalent “dual-trace” operations σ (ij ) := (−1)1+D(pi +1) ∗(i) τ (ij ) ∗(i) : Xp1 ,...,pi ,...,pj ,...pN → X p1 ,...,pi +1,...,pj −1,...,pN (89) and σ˜ (ij ) := (−1)1+D(pj +1) ∗(j ) τ (ij ) ∗(j ) : Xp1 ,...,pi ,...,pj ,...pN → X p1 ,...,pi −1,...,pj +1,...,pN (90) associated with a given τ (ij ) (with no sum over i or j ). Notice that σ˜ (ij ) = σ (j i) since τ (ij ) = τ (j i) . One can also write (N − 1)! inequivalent involutions t (ij ) : Xp1 ,...,pi ,...,pj ,...pN → X p1 ,...,pj ,...,pi ,...,pN
(91)
defined by exchange of the pi and pj form subspaces in the tensor product space. And finally, there are (N − 1)! distinct operations η(ij ) : Xp1 ,...,pi ,...,pj ,...pN → X p1 ,...,pi +1,...,pj +1,...,pN
(92)
defined as the -product with the SO(D − 1, 1) metric η (understood as a [1, 1] bi-form in the pi ⊗ pj subspace), such that η(ij ) T ≡ η T for any T ∈ X p1 ,...,pN .
270
P.F. de Medeiros, C.M. Hull
3.2. Multi-form gauge theory. Consider now a gauge potential that is a tensor A in an arbitrary irreducible representation of GL(D, R) whose components have the index symmetry of an N -column Young tableaux with pi cells in the i th column (it is assumed pi ≥ pi+1 ). A given multi-form A ∈ X p1 ,...,pN is in such an irreducible representation if σ (ij ) A = 0 (93) for any j ≥ i and also satisfying t (ij ) A = A if the i th and j th columns are of equal length, such that pi = pj . Such a representation is labelled [p1 , ..., pN ]. Again, one can project onto this irreducible tensor subspace X[p1 ,...,pN ] from X p1 ,...,pN using the Young symmetriser Y[p1 ,...,pN ] . The natural gauge transformation for this object is then given by N (i) p1 ,...,pi −1,...,pN (94) δA = Y[p1 ,...,pN ] ◦ d α(i) i=1 p ,...,p −1,...,p
i N for any gauge parameters α(i)1 ∈ Xp1 ,...,pi −1,...,pN . This just corresponds to the sum over N tableaux of type [p1 , ..., pN ] with the i th term in the sum having a single partial derivative entered in the i th column. The associated field strength F is a type [p1 + 1, ..., pN + 1] tensor defined by N N
(i) F = Y[p1 +1,...,pN +1] ◦ d A ≡ d (i) A (95)
i=1
i=1
which is gauge invariant under (94). The identity in (95) follows from the observation (i) : X [p1 ,...,pN ] → X [p1 +1,...,pN +1] maps that the composite exterior operator N i=1 d tableaux to tableaux even though individual exterior derivatives do not. The expression is unambiguous since all d (i) commute. The field strength (95) corresponds to a [p1 + 1, ..., pN + 1] Young tableau with N partial derivatives (one in each column). Gauge invariance then follows from δF vanishing identically since at least one column must contain at least two partial derivatives. Again, rewriting the generalised Poincar´e lemma in [7] allows any type [p1 + 1, ..., pN + 1] tensor F (satisfying d (i) F = 0 for all i) to be written as in (95) for some type [p1 , ..., pN ] potential A. The field strength F also satisfies second Bianchi identities d (i) F = 0
(96)
σ (ij ) F = 0
(97)
and the first Bianchi identities for any j ≥ i. Considering the irreducible representations of GL(D, R) above to be reducible representations of the SO(D − 1, 1) Lorentz subgroup allows the construction of a gauge invariant action functional from which physical equations of motion can be obtained. This principle is perhaps best illustrated by the use of two simple, non-trivial examples. Consider first the N = 1 example of a one-form Maxwell gauge field Aµ , viewed as a type [1, 0] tensor. The natural field equation for this model is the Maxwell equation ∂ µ ∂[µ Aν] = 0. This equation can be derived from a gauge invariant Lagrangian proportional to Aµ ∂ ν ∂[µ Aν] . The gauge invariant field equation factor Eµ = ∂ ν ∂[µ Aν]
Exotic Tensor Gauge Theory and Duality
271
corresponds to the trace of the type [2, 1] field strength tensor Fµνρ = 2 ∂[µ Aν],ρ . The derived field equations can then be written τ (12) F = 0.
(98)
Now consider the N = 3 example of a totally symmetric, third rank gauge field φµνρ , viewed as a type [1, 1, 1, 0] tensor. The type [2, 2, 2, 1] field strength F = d (1) d (2) d (3) d (4) φ is invariant under the most general gauge transformation δφµνρ = 3 ∂(µ ξνρ) for the second rank gauge parameter ξµν which can be taken to be symmetric. A unique gauge invariant action is given by 1 S [1,1,1,0] = − d D x φ µνρ Eµνρ , (99) 6 where
1 Eµνρ = Y[1,1,1,0] ◦ Fαµανβρβ − ηµν Fαβαβγργ 2
(100)
is the type [1, 1, 1, 0] field equation tensor, satisfying ∂ µ Eµνρ ≡ 0 identically. Consequently, (99) is invariant under the gauge transformation δφµνρ = 3 ∂(µ ξνρ) . The field equation derived from (99) is Eµνρ = 0 which implies Y[1,1,1,0] ◦ Fαµανβρβ = 0 in D = 2. In bi-form notation, this field equation is then written τ (ij ) τ (k4) F = 0, (101) where the sum is over three terms with (ij k) ∈ {(123), (231), (312)}, corresponding to the [1, 1, 1, 0] Young symmetrisation. It should be noted that this field equation contains four derivatives of gauge field φ and consequently is somewhat unphysical in the sense that physical theories contain precisely two derivatives in their equations of motion. Examples of such higher derivative field equations occur in Weyl gravity and higher order gauge theories with Lagrangians quadratic in the curvature. For N odd, the natural field equation for a general type [p1 , ..., pN , 0] gauge potential A, with type [p1 + 1, ..., pN + 1, 1] field strength F , is then given by Y[p1 ,...,pN ,0] ◦ τ (i1 i2 ) ...τ (iN N+1) F = 0, (102) I ∈SN
where the sum is on the labels I = (i1 ...iN ) whose values vary over all permutations of the set (1...N ). The (N + 1)th label is not included in the sum and the Young projection is onto an irreducible type [p1 , ..., pN , 0] tensor. For N even, the field equation for a type [p1 , ..., pN ] gauge potential A, with type [p1 + 1, ..., pN + 1] field strength F , is given by Y[p1 ,...,pN ] ◦ τ (i1 i2 ) ...τ (iN −1 iN ) F = 0, (103) I ∈SN
where the sum here is on all the labels I = (i1 ...iN ) whose values vary over all permutations of the set (1...N ). TheYoung projection is then onto an irreducible type [p1 , ..., pN ] tensor.
272
P.F. de Medeiros, C.M. Hull
These field equations can be derived from the gauge invariant action N
1 µ1 ...µ1 ...µi ...µi ...µN [p1 ,...,pN ] = − d D x A 1 p1 1 pi pN Eµ1 ...µ1 ...µi ...µi ...µN S p1 pi pN 1 1 p! i=1 i (104) in terms of the type [p1 , ..., pN ] gauge potential A and some gauge invariant field equation tensor E involving N partial derivatives on A for even N (or N + 1 derivatives for odd N ). Gauge invariance of (104) necessarily implies that E should satisi fy the N conservation conditions ∂ µ1 Eµ1 ...µ1 ...µi ...µi ...µN ≡ 0 identically for i = p1 pi pN 1 1 1, ..., N. For N even then the leading term in E necessarily involves N/2 traces of the field strength F of A and is obtained by the [p1 , ..., pN ] Young symmetrisation of the N −1 N 1 2 sum over all permutations of N labels of the term Fµ1 ...µ1 ...µN ηµ1 µ1 ...ηµ1 µ1 . 1
p1 +1
pN +1
The correction terms then consist of all further traces (appropriately symmetrised) with coefficients fixed by overall conservation of E. For N odd one can consider the potential to be a type [p1 , ..., pN , 0] tensor of even order N + 1 whose field strength F is a type [p1 + 1, ..., pN + 1, 1] tensor. The construction of E is then the same as for the even N case. The notion of duality described in the previous sections could also be extended to give 2N equivalent descriptions of a gauge theory with type [p1 , ..., pN ] gauge potential A. The construction is perhaps best illustrated by considering A in physical gauge such that it transforms in the SO(D − 2)-irreducible representation [p1 , ..., pN ]. There are then 2N dual descriptions in terms of each of the potentials A, ∗(i) A, ∗(i) ∗(j ) A,..., ∗(1) ... ∗(N) A. It must be understood however that not all of these dual descriptions are non-trivial. For example, in the type [1, 1] gravitational case then one has the set of four physical dual fields (h, ∗h, ∗˜ h, ∗˜∗h) though only three of them h, D = ∗h = t ∗˜ h and C = ∗˜∗h are distinct due to the symmetry of h. In general the number of distinct dual descriptions would therefore depend on the particular representation A is in. Note added. Most of the new material here was presented in [5]. Subsequently, while this article was in preparation, the work [1] appeared which also develops a multi-form formulation of the results of [15] and so has considerable overlap with this paper. References 1. Bekaert, X., Boulanger, N.: Tensor gauge fields in arbitrary representations of GL(D, R): Quality and Poincar´e lemma. hep-th/0208058 ˇ Mod. Phys. Lett. A16, 731 (2001), hep-th/0101201; 2. Burdik, C.: ˇ Pashnev, A., Tsulaia, M.: Nucl. Phys. Proc. Suppl. 102, 285 (2001), hep-th/0103143; Burdik, C., Buchbinder, I.L., Pashnev, A., Tsulaia, M.: Phys. Lett. B523, 338 (2001), hep-th/0109067; Buchbinder, I.L., Pashnev, A., Tsulaia, M.: Massless higher spin fields in the AdS background and BRST constructions for non-linear algebras. hep-th/0206026 3. Casini, H., Montemayor, R., Urrutia, L.F.: Phys. Lett. B507, 336 (2001), hep-th/0102104 4. Curtright, T.L., Freund, P.G.O.: Nucl. Phys. B172, 413 (1980); Curtright, T.L.: Phys. Lett. B165, 304 (1985) 5. de Medeiros, P.: Duality in non-perturbative quantum gravity. PhD thesis, Queen Mary, University of London [submitted 24/07/02] 6. Dubois-Violette, M., Henneaux, M.: Lett. Math. Phys. 49, 245 (1999), math.QA/9907135 7. Dubois-Violette, M.: Lectures on differentials, generalized differentials and on some examples related to theoretical physics. math.QA/0005256 8. Dubois-Violette, M., Henneaux, M.: Tensor fields of mixed Young symmetry type and N-complexes. math.QA/0110088
Exotic Tensor Gauge Theory and Duality
273
9. Garcia, J.A., Knaepen, B.: Phys. Lett. B441, 198 (1998), hep-th/9807016 10. Hamermesh, M.: Group theory and its application to physical problems. London-New York: Dover Publications, 1989 11. Hull, C.M.: Commun. Math. Phys. 156, 245 (1993), hep-th/9211113 12. Hull, C.M.: Nucl. Phys. B583, 237 (2000), hep-th/0004195 13. Hull, C.M.: Class. Quant. Grav. 18, 3233 (2001), hep-th/0011171 14. Hull, C.M.: J. High Energy Phys. 0012, 007 (2000), hep-th/0011215 15. Hull, C.M.: J. High Energy Phys. 0109, 027 (2001), hep-th/0107149 16. Labastida, J.M.F., Morris, T.R.: Phys. Lett.B180, 101 (1986) 17. Sezgin, E., Sundell, P.: J. High Energy Phys. 0109, 025 (2001), hep-th/0107186; Sezgin, E., Sundell, P.: 7D bosonic higher spin gauge theory: Symmetry algebra and linearized constraints, hep-th/0112100; Sezgin, E., Sundell, P.: Massless higher spins and holography, hep-th/0205131 18. Siegel, W., Zwiebach, B.: Nucl. Phys. B282, 125 (1987); Siegel, W.: Nucl. Phys. B284, (1987) 632; Siegel, W.: Introduction to string field theory. Chapter 4, Singapore: World Scientific, 1988, hep-th/0107094; Berkovits, N.: Phys. Lett. 388B, 743 (1996), hep-th/9607070; Siegel, W.: Fields. Chapter 12 (1999), hep-th/9912205 Communicated by A. Connes
Commun. Math. Phys. 235, 275–288 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0806-8
Communications in
Mathematical Physics
On Yang–Mills Instantons over Multi-Centered Gravitational Instantons G´abor Etesi1 , Tam´as Hausel2 1
Alfr´ed R´enyi Institute of Mathematics, Hungarian Academy of Science, Re´altanoda u. 13-15, Budapest, 1053 Hungary. E-mail:
[email protected] 2 Miller Institute for Basic Research in Science and Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720, USA. E-mail:
[email protected] Received: 16 September 2002 / Accepted: 22 October 2002 Published online: 21 February 2003 – © Springer-Verlag 2003
Abstract: In this paper we explicitly calculate the analogue of the ’t Hooft SU (2) Yang–Mills instantons on Gibbons–Hawking multi-centered gravitational instantons, which come in two parallel families: the multi-Eguchi–Hanson, or Ak ALE gravitational instantons and the multi-Taub–NUT spaces, or Ak ALF gravitational instantons. We calculate their energy and find the reducible ones. Following Kronheimer we also exploit the U (1) invariance of our solutions and study the corresponding explicit singular SU (2) magnetic monopole solutions of the Bogomolny equations on flat R3 .
1. Introduction Asymptotically Locally Euclidean or ALE and Asymptotically Locally Flat or ALF gravitational instantons are complete, non-compact hyper-K¨ahler four-manifolds which are intensively studied from both physical and mathematical sides recently. This paper is a continuation of the project of constructing SU (2) Yang–Mills instantons on ALF gravitational instantons started in [10] and [11]. In [10] we identified all the reducible SU (2) instantons on the Euclidean Schwarzschild manifold (which is ALF and Ricci flat, though not self-dual), and showed that these solutions, albeit not their reducibility, were already known to [6]. Then in [11] we went on and studied SU (2) instantons on the Taub–NUT space (which is an ALF gravitational instanton). Following [16] we exploited the self-duality of the metric, to obtain a family of SU (2) instantons, which could be considered as the analogue of the ’t Hooft solutions on R4 . (For more historical remarks the reader is referred to [11].) Here we carry out the generalization of [11] to the more general case of multi-Taub– NUT spaces (which are also ALF gravitational instantons). An interesting aspect of our paper is that what we do goes almost verbatim in the case of the other family of Gibbons– Hawking multi-centered gravitational instantons [13], namely the multi-Eguchi–Hanson spaces.
276
G. Etesi, T. Hausel
The only difference in the two cases will be the energy of some of the Yang–Mills instantons. Namely in the multi-Eguchi–Hanson case the infinity may contribute minus a fraction of one to the energy, while in the multi-Taub–NUT case the energy is always an integer. We will make the calculation of the energy integral in the convenient framework of considering the U (1)-invariant instantons as singular monopoles on R3 . We will follow [18] to write down this reduction and perform the integral. This way we will also exhibit explicit, but singular, solutions to the SU (2) Bogomolny equation on R3 . Remarkably we will get the same monopoles, regardless of which family of Gibbons–Hawking metrics we consider: the ALE or ALF. Then we go on and identify the reducible U (1) instantons among our solutions. They are interesting from the point of view of Hodge theory as their curvatures are L2 harmonic 2-forms. Namely, only very recently has the dimension of the space of L2 harmonic forms on ALF gravitational instantons been calculated in [14]. Remarkably we are able to identify a basis of generators for these L2 harmonic 2-forms, arising in this paper as curvatures of reducible ’t Hooft instantons. In the last part of our paper we put everything together and look at the parameter spaces of solutions we uncovered in this paper and explain its properties in light of the results and point out some future directions to consider. Our instantons are not new. Most of them were found during the early eighties by Aragone and Colaiacomo [1] and by Chakrabarti, Boutaleb-Joutei and Comtet in a series of papers [4]; for a review cf. [5]. The ’t Hooft solutions and a few more general ADHM solutions in the Eguchi–Hanson case were written down explicitly in [3]. Finally, SU (2) Yang–Mills instantons over Ak ALE gravitational instantons were classified by Kronheimer and Nakajima [19].
2. Instantons over the Multi-Centered Spaces In this section we generalize the method of [11], designed for finding Yang–Mills instantons on Taub–NUT space, to the multi-centered gravitational instantons (MV , gV ) of Gibbons and Hawking [13, 15]. In our previous paper we used the following result. Suppose (M, g) is a four-dimensional Riemannian spin-manifold which is self-dual and has vanishing scalar curvature. S of the rescaled manifold (M, g) Consider the metric spin connection ∇ ˜ with g˜ = f 2 g, where f is harmonic (i.e., f = 0 with respect to g). This so(4)-valued connection S . This connection lives on the complex spinor bundle SM. Take the ∇ − component of ∇ − can be constructed as the projection onto the chiral spinor bundle S M, according to the splitting SM = S + M ⊕S − M and can be regarded as an su(2)− -valued connection. This is because the above splitting of the spinor bundle is compatible with the Lie algebra decomposition so(4) ∼ = su(2)+ ⊕ su(2)− . In light of a result of Atiyah, Hitchin and Singer [2] (see also [11]) ∇ − is self-dual with respect to g. These ideas in the case of flat R4 were first used by Jackiw, Nohl and Rebbi [16], in the case of ALE gravitational instantons in [4] while for the Taub–NUT case by the authors [11] to construct plenty of new instantons. Our aim is to repeat this method in the present more general case. S is represented locally by an so(4)-valued 1-form ω˜ then we write A− for the If ∇ su(2)− -valued connection 1-form of ∇ − in this gauge over a chart U ⊂ M. Now we turn our attention to a brief description of the Gibbons–Hawking spaces denoted by MV . This space topologically can be understood as follows. There is a circle
Yang–Mills Instantons over Gravitational Instantons
277
action on MV with k fixed points p1 , . . . , pk ∈ MV , called NUTs1 . The quotient is R3 and we denote the images of the fixed points also by p1 , . . . , pk ∈ R3 . Then UV := MV \ {p1 , . . . , pk } is fibered over ZV := R3 \ {p1 , . . . , pk } with S 1 fibers. The degree of this circle bundle around each point pi is one. The metric gV on UV looks like (cf. e.g. p. 363 of [8]) ds 2 = V (dx 2 + dy 2 + dz2 ) +
1 (dτ + α)2 , V
(1)
where τ ∈ (0, 8πm] parametrizes the circles and x = (x, y, z) ∈ R3 ; the smooth function V : ZV → R and the 1-form α ∈ C ∞ (1 ZV ) are defined as follows: V (x, τ ) = V (x) = c +
k i=1
2m , |x − pi |
dα = ∗3 dV .
(2)
Here c is a parameter with values 0 or 1 and ∗3 refers to the Hodge-operation with respect to the flat metric on R3 . We can see that the metric is independent of τ hence we have a Killing field on (MV , gV ). This Killing field provides the above mentioned U (1)-action. Furthermore it is possible to show that, despite the apparent singularities in the NUTs, these metrics extend analytically over the whole MV . If c = 0 with k running over the positive integers we find the multi-Eguchi–Hanson spaces. If c = 1 we just recover the multi-Taub–NUT spaces. In particular if c = 1 and k = 1 then (1) is the Taub–NUT geometry on R4 (cf. Eq. (6) of [11]): this is easily seen by using the coordinate transformation x 2 + y 2 + z2 = (r − m)2 . Note that under this transform V has the form (by putting the only NUT into the origin r = m) V (r) = 1 +
2m r −m
i.e., coincides with the scaling function f found in the Taub–NUT case (cf. Eq. (8) in [11]) with λ = 2m. From here we guess that in the general case the right scaling functions will have the shape f (x) = λ0 +
k i=1
λi , |x − pi |
(3)
where by an inessential rescaling we can always assume that λ0 is either 0 or 1. We can prove that these are indeed harmonic functions. In order toput our formulas in the simplest form, we introduce the notation (x, y, z) = x 1 , x 2 , x 3 and will use the Einstein summation convention. 1 The reason for writing the nut with block letters is the following. In 1951 Taub discovered an empty space solution of the Lorentzian Einstein equations [23] whose maximal analytical extensions were found by Newman, Unti and Tamburino in 1963 [20]. Hence this solution is referred to as Taub–NUT space-time. On the other hand in 1978 Gibbons and Hawking presented a classification of known gravitational instantons taking into account the topology of the critical set of the Killing field appearing in these spaces [8, 13]. Those whose critical set contains only isolated points were called “nuts” while another class having two dimensional spheres as singular sets were called “bolts”. It is a funny coincidence that an example for the former class is provided by the generalization of the Riemannian version of Taub–NUT space-time, called multi-Taub–NUT space in the present work.
278
G. Etesi, T. Hausel
First note that from the form of the metric, we have the following straightforward orthonormal tetrad of 1-forms on UV : 1 ξ 0 = √ (dτ + α), V
ξ1 =
√ V dx 1 ,
ξ2 =
√ V dx 2 ,
ξ3 =
√ V dx 3 .
(4)
The orientation is fixed such that ∗(ξ 0 ∧ξ 1 ∧ξ 2 ∧ξ 3 ) = 1 which is the opposite to the orientation induced by any of the complex structures in the hyper-K¨ahler family. Let us thus take a general U (1)-invariant function f : MV → R. It means that f = f (x 1 , x 2 , x 3 ) does not depend on τ . Then we have on UV : df =
∂f 1 ∂f i dx i = √ ξ i ∂x V ∂x i
(i = 1, 2, 3).
By using the above orthonormal tetrad (4), we see that 1 ∂f j ∂f ∗df = εji k √ ξ ∧ ξ k ∧ ξ 0 = εji k i dx j ∧ dx k ∧ (dτ + α). i ∂x V ∂x (Note that ∗ is the Hodge star operator on (MV , gV ).) Consequently f = δ df = − ∗ d ∗ df =
∂ 2f ∂ 2f ∂ 2f + + . ∂x 2 ∂y 2 ∂z2
Thus we see that the U (1)-invariant f is harmonic on (MV , gV ) if and only if it is harmonic on the flat R3 . Positive harmonic functions on R3 which are bounded at infinity and have finitely many point-singularities, with at most inverse polynomial growth, have the shape λi f (x) = λ0 + , |x − qi | i
where qi ’s are finitely many points in Note if we want our function f : MV → R to be a harmonic function with only point singularities, we need to place the qi ’s at the NUTs pi of the metric. Thus we have found all reasonable positive U (1)-invariant, harmonic functions on (MV , gV ) with point singularities and bounded at infinity. They are of the form (3). Now we determine the Levi–Civit´a connection of the re-scaled metric g˜ = f 2 gV restricted to UV . By using the trivialization (4) of the tangent bundle T UV , the Levi– Civit´a connection can be represented by an so(4)-valued 1-form ω˜ on UV . With the help of the Cartan equation we can write R3 .
dξ˜ i = −ω˜ ji ∧ ξ˜ j . Taking into account that ξ˜ i = f ξ i , this yields dξ i + d(log f ) ∧ ξ i = −ω˜ ji ∧ ξ j . As we have seen, f does not depend on τ , therefore we have an expansion like 1 ∂ log f j d(log f ) = √ ξ V ∂x j
(j = 1, 2, 3).
Yang–Mills Instantons over Gravitational Instantons
279
Putting this and dξ i = −ωji ∧ ξ j for the original connection into the previous Cartan equation, we get 1 ∂ log f i i ωj + √ ξ ∧ ξ j = ω˜ ji ∧ ξ j , V ∂x j consequently the local components of the new connection on UV , after antisymmetrizing, have the shape 1 ∂ log f i ∂ log f j ω˜ ji = ωji + √ . ξ − ξ ∂x j ∂x i V (Here it is understood that ∂ log f/∂x 0 = 0.) By the aid of the Cartan equation in the original metric, the components of the original connection ω take the form on UV , 1 ∂ log V 0 1 ∂ log V 1 1 ∂ log V 2 ω21 = − √ ξ + √ ξ − √ ξ , 3 2 2 V ∂x 2 V ∂x 2 V ∂x 1 and
1 ∂ log V 0 1 ∂ log V 2 1 ∂ log V 3 ω10 = − √ ξ + √ ξ − √ ξ , 1 3 ∂x ∂x 2 V 2 V 2 V ∂x 2
and cyclically for the rest. Notice that from this explicit form we see that in this gauge the Levi–Civit´a connection is itself self-dual i.e., ω10 = ω32 etc. (cf. p. 363 of [8]). Consequently it cancels out if we project ω˜ onto the su(2)− subalgebra via A− λ0 ,...,λk =
3 3 1 i ηa,j ω˜ ji ηa , 4 a=1 i,j =0
where the ’t Hooft matrices ηi are given by: 0 1 0 0 0 0 −1 0 0 0 0 0 η1 = , η = 0 0 0 −1 2 1 0 0 0 1 0 0 1
−1 0 0 0
0 0 −1 0 , η = 0 3 0 0 1
0 0 0 1 −1 0 0 0
−1 0 . 0 0
Using the identification su(2)− ∼ = Im H via (η1 , η2 , η3 ) → (i, −j, −k) we get for ∇λ−0 ,...,λk in the gauge (or bundle trivialization) given by (4) on UV that i ∂ log f 0 ∂ log f 2 ∂ log f 3 A− = ξ − ξ + ξ √ λ0 ,...,λk ∂x 3 ∂x 2 2 V ∂x 1 j ∂ log f 0 ∂ log f 1 ∂ log f 3 + √ ξ + ξ − ξ ∂x 3 ∂x 1 2 V ∂x 2 k ∂ log f 0 ∂ log f 1 ∂ log f 2 + √ ξ − ξ + ξ . ∂x 3 ∂x 2 ∂x 1 2 V This long but very symmetric expression can be written in a quite simple form as follows. Consider the following quaternion-valued 1-form ξ and the imaginary quaternion d(log f ) (we use the symbol “d” to distinguish it from the real 1-form d(log f )): ξ := ξ 0 + ξ 1 i + ξ 2 j + ξ 3 k,
d(log f ) :=
∂ log f ∂ log f ∂ log f i+ j+ k. ∂x 1 ∂x 2 ∂x 3
280
G. Etesi, T. Hausel
It is easily checked that with this notation the connection takes the simple shape A− λ0 ,...,λk = Im
d(log f ) ξ . √ 2 V
(5)
This form emphasizes the analogy with the case of flat R4 (cf. p. 103 of [12]). By construction these connections are self-dual over (UV , gV ); but we will prove in the next sections that they are furthermore gauge equivalent to smooth, self-dual connections over the whole (MV , gV ) and have finite energy. We remark that there is a familiar face in the crowd (5). This is the solution corresponding to the choice f = V . The Yang–Mills instanton (5) is then the same as the projection of the Levi–Civit´a connection of MV onto the other chiral bundle S + MV , which is easy to see from the form of the Levi–Civit´a connection calculated earlier in this section. We denote this connection by ∇metric . This is the solution which we called in [11] the Pope–Yuille solution of unit energy in the Taub–NUT case [21, 4]. In the Eguchi–Hanson case this is the solution of energy 3/2 found in [4] and later again in [17]. To conclude this section, we write down the field strength or curvature of (5) over UV . The field strength of a connection ∇ with connection 1-form A over a chart is F = dA + A ∧ A. Therefore we can see by (5) that our field strength has the form over UV , dV 1 ∧ Im(d(log f ) ξ ) + √ Im(d(log f )dξ ) 3/2 4V 2 V 1 1 + √ Im(dd(log f ) ∧ ξ ) + (Im(d(log f )ξ ) ∧ Im(d(log f )ξ )) . 4V 2 V
Fλ−0 ,...,λk = −
The terms in the first line can be adjusted as follows. Using the identity dV dV ∧ ξ, dξ = ∗3 √ − 2V V we can write them in the form dV d(log f ) dV Re dV ξ ∧ ξ d(log f ) = − − 3/2 ∧ ξ 0 + ∗3 2 2V 2V 4V with
∂V ∂V ∂V i + 2 j + 3 k. 1 ∂x ∂x ∂x One immediately sees at this point that these two terms are self-dual with respect to gV at least over UV because ξ ∧ξ is a basis for self-dual 2-forms. Self-duality of the remaining two terms is not so transparent; however a tedious but straightforward calculation assures us about it. So we can conclude that the connections ∇λ−0 ,...,λk are self-dual with respect to gV at least over UV . The action, or energy, or L2 -norm of the connection (if it exists) is the integral 1 1 − − − 2 |F | = − tr F ∧ ∗F (6)
Fλ−0 ,...,λk 2 = λ0 ,...,λk gV λ0 ,...,λk λ0 ,...,λk . 8π 2 8π 2 dV =
MV
MV
Next we turn our attention to the extendibility of (5) over the NUTs and its asymptotical behaviour in order to calculate the above integral.
Yang–Mills Instantons over Gravitational Instantons
281
3. A Gauge Transformation Around a NUT Our next goal is to demonstrate that the self-dual connections just constructed are welldefined over the whole MV up to gauge transformations. As we have seen, the gauge (5) contains only pointlike singularities, hence if we could prove that the energy of ∇λ−0 ,...,λk is finite in a small ball around a fixed NUT then, in light of the removable singularity theorem of Uhlenbeck [24] we could conclude that our self-dual connections extend through the NUTs after suitable gauge transformations. However the direct calculation of (6) is complicated because of its implicit character. Consequently it will be performed in the next section; here we write down a gauge transformation explicitly, such that the resulting connection will be easily extendible over the NUTs. To this end, we derive a useful decomposition of (5). To keep our expressions as short as possible, we introduce further notations: let us write rj (x) := |x − pj | and Vi := c +
2m , ri
fi := λ0 +
λi , ri
and define the 1-form αi on R3 by the equation ∗3 dαi = dVi . With these notations we introduce a new real valued function ai on UV as follows: V =: ai Vi . One easily calculates ai = 1 +
(7)
2m ri . 2m + cri rj j =i
In the same fashion by putting the fixed NUT pi into the origin of R3 (i.e., pi = 0) we can write k λj 1 1 λj 1 2 3 d(log f ) = − (x 1 i + x 2 j + x 3 k) + i + p j + p k . p j j j f f r3 r3 j =1 j j =i j
On the other hand, d(log fi ) = −
1 λi 1 x i + x2j + x3k , 3 fi ri
therefore we can write for a real valued function bi on UV that d(log f ) =: bi d(log fi ) + pi , where 1+ bi =
1+
j =i
j =i
λj λi
3 ri rj
λj ri λi +λ0 ri rj
,
and we have introduced the function pi : UV → Im H given by pi :=
1 λj 1 (pj i + pj2 j + pj3 k). 3 f r j j =i
(8)
282
G. Etesi, T. Hausel
As a next step, we have to examine these new objects around the NUT pi . It is not difficult to see, via the explicit expressions for ai , bi and pi that lim ai (x) = 1,
ri (x)→0
lim bi (x) = 1,
ri (x)→0
lim |pi (x)|H = 0.
ri (x)→0
These observations show that in fact ai , bi and pi are well defined on UV ∪ {pi }. Now putting decompositions (7), (8) into (5) we arrive at the expression A− λ0 ,...,λk = Im
bi d(log fi )ξ pi ξ + Im √ . √ 2 ai Vi 2 ai V i
(9)
From here one immediately deduces that the first term (which formally coincides with the original ’t Hooft solution on flat R4 when c = 0; and the solution on Taub–NUT space constructed in [11] when c = 1) is singular while the last term vanishes in the NUT pi . In order to analyse the structure of the singular term more carefully, we introduce spherical coordinates around pi i.e., the origin of R3 : x 1 := ri sin i cos φi ,
x 2 := ri sin i sin φi ,
x 3 := ri cos i .
Here i ∈ (0, π] and φi ∈ (0, 2π] are the angles. In this way we rewrite the singular term as b i λi 1 bi d(log fi )ξ = − (dτ + α) + (ri + 2m)αi qi Im √ 2(λ0 ri + λi )(ri + 2m) ai 2 ai V i bi λi + ((sin φi i − cos φi j)d i − k dφi ) , 2(λ0 ri + λi ) where we have introduced the notation for the “radial” imaginary quaternion qi := sin i cos φi i + sin i sin φi j + cos i k and have exploited the identity αi = cos i dφi at several points. Now we can easily see that the above expression is singular because the quaternion qi is ill-defined in the NUT pi i.e., in the origin ri = 0 (all the other terms involved are regular in pi ). Consequently we are seeking for a gauge transformation which rotates the quaternion qi into k, for example. This gauge transformation cannot be performed continuously over the whole UV ∼ = R3 \ {p1 , . . . , pk } × S 1 , therefore we introduce the two open subsets π , UV+ := (x 1 , x 2 , x 3 , τ ) | x 3 ≥ 0, i.e., i ≥ 2 π UV− := (x 1 , x 2 , x 3 , τ ) | x 3 ≤ 0, i.e., i ≤ . 2 Now it is not difficult to check that the gauge transformations gi± : UV± → SU (2) given by φi i φi ± gi (τ, ri , i , φi ) := exp ±k exp −j exp −k 2 2 2 (here exp: su(2) → SU (2) is the exponential map) indeed rotate qi into k. (We remark that this gauge transformation is exactly the same which was used to identify the Charap–Duff instantons with Abelian ones over the Euclidean Schwarzschild manifold in [10].) Notice that exp(kφi )gi− = gi+ , showing that the two gauge transformations are related with an Abelian one along the equatorial plane x 3 = 0 i.e., i = π/2.
Yang–Mills Instantons over Gravitational Instantons
283
We can calculate that (cf. [10]) −1 gi± ((sin φi i − cos φi j) d i − kdφi )(gi± )−1 = −2gi± d gi± ∓ kdφi , ± −1 therefore we get for the gauge transformed connection Bλ±0 ,...,λk := gi± A− λ0 ,...,λk (gi ) + −1 by using decomposition (9) that gi± d gi± b i λi 1 − (dτ + α) + (ri + 2m)(∓1 + cos i )dφi k Bλ±0 ,...,λk = 2(λ0 ri + λi )(ri + 2m) ai bi λi pi ξ ± −1 ± ± −1 g i d gi + 1− g + gi± Im √ . (10) λ0 ri + λ i 2 ai V i i
Now we have reached the desired result: we can see that, approaching the NUT pi from the north (i.e., along a curve whose points obey x 3 > 0) the terms written in the first line of Bλ+0 ,...,λk remain regular while the (non-Abelian) terms of the second line vanish if ri = 0. The situation is exactly the same from the south if we use Bλ−0 ,...,λk . Moreover Bλ+0 ,...,λk and Bλ−0 ,...,λk are related via an Abelian gauge transformation along the equator x 3 = 0. Consequently in this new gauge our instantons are regular in the particular NUT pi . By performing the same transformations around all NUTs p1 , . . . , pk we can see that in fact the instantons (5) extend smoothly across all the NUTs. 4. Kronheimer’s Singular Monopoles and the Energy In this section we identify our U (1)-invariant instantons over the Gibbons–Hawking spaces with monopoles over flat R3 carrying singularities in the (images of the) NUTs p1 , . . . , pk . This identification enables us to calculate the energy of our solutions as well. In this section we will follow Kronheimer’s work [18]. Remember S − MV is an SU (2) vector bundle over MV and the U (1) action can be lifted from MV to S − MV . Our instantons ∇λ−0 ,...,λk are self-dual, U (1)-invariant SU (2)connections on this bundle. If we choose an U (1)-invariant gauge in S − MV (for example − (5) or (10) is suitable) then A− λ0 ,...,λk becomes an U (1)-invariant su(2) -valued 1-form which we can uniquely express as ∗ ∗ A− λ0 ,...,λk = π A − π (dτ + α),
where A and are a 1-form and a 0-form on ZV and π : UV → ZV is the projection. Dividing S − MV by the U (1)-action we obtain a SU (2) vector bundle E over ZV together with a pair (A, ) on it. Omitting π ∗ we calculate Fλ−0 ,...,λk = ∇A− λ0 ,...,λk = (F − dα) − ∇ ∧ (dτ + α).
(11)
Here we have used the notation F = ∇A. One finds that the self-duality Fλ−0 ,...,λk = ∗Fλ−0 ,...,λk of the original connection is equivalent to ∗3 (F − dα) = V ∇ or ∗3 F = ∇(V ) since dα = ∗3 dV . Putting := V we can write F = ∗3 ∇ which is the Bogomolny equation for the pair (A, ) on ZV . This shows that the pair (A, ) can be naturally interpreted as a magnetic vectorpotential and a Higgs field while
284
G. Etesi, T. Hausel
F as a magnetic field on ZV . Notice however that in this case the Higgs field is singular at the images of the NUTs, hence the reason we have to use the punctured R3 denoted by ZV . In our case, by using (5) we can write down the pair (A, ) explicitly. We easily find that =
d(log f ) , 2
A = Im
d(log f )i 1 d(log f )j 2 d(log f )k 3 dx +Im dx +Im dx . (12) 2 2 2
In this framework one can find a simple formula for the energy we are seeking. In what follows define ZεR to be the intersection of a large ball of radius R in R3 containing all the NUTs and the complements of small balls of radius εi surrounding the NUTs pi . Putting (11) into (6) we can write |(F − dα) − ∇ ∧ (dτ + α)|2gV 8π 2 Fλ−0 ,...,λk 2 = MV
1 2 2 ∧ (dτ + α) |F − dα| + |∇| V2 MV = 16πm lim V |∇|2 , =−
V
R→∞ εi →0 ZεR
taking into account the Bogomolny equation in the form F − dα = ∗3 V ∇. By writing 2V |∇|2 = d ∗3 d(V ||2 ) and exploiting Stokes’ theorem we arrive at the useful formula 2 || 8π 2 Fλ−0 ,...,λk 2 = 8πm lim . ∗3 d V R→∞ εi →0 ∂ZεR
In the above expressions |·, ·| stands for the Killing norm of su(2) induced by the Euclidean metric on R3 i.e., it is the standard Killing norm, hence it is equal to twice the usual norm square of a quaternion under the identification su(2) ∼ = Im H; e.g. ||2 = 2||2H . In order to determine the exact value of the action of our solutions, we simply have to calculate the contributions of each component of the boundary ∂ZεR , in other words, the NUTs pi and the infinity of R3 . First we can see that, using (7) and (8), for small εi there is an expansion 2 εi 1 b i λi ||2 = − qi + pi V 2 εi (λ0 εi + λi ) ai (cεi + 2m) H 1/(4mεi ) + O(1) if λi = 0, = 0 if λi = 0. This implies that 2 | − 1/(4mε 2 ) + O(1/εi )| || i d = V 0
if λi = 0, if λi = 0.
Yang–Mills Instantons over Gravitational Instantons
285
(For clarity we remark that the outermost | · | in the last expression is the Euclidean norm on R3 .) However in the above integral there is a contribution −4π εi2 by the spheres (the minus sign comes from the orientation), consequently each NUT pi together with the condition λi = 0 contributes a factor 8π 2 to the integral i.e., 1 to the energy. Hence the total contribution is n, where n stands for the number of non-zero λi ’s. Clearly 0 ≤ n ≤ k. To see the contribution of infinity, we have to understand the fall-off properties of the function ||2 /V . Clearly this is not modified if we put all the NUTs into the origin of R3 . Thus asymptotically our functions take the shape V =c+
2mk , R
f = λ0 +
k 1 λ λi =: λ0 + . R R i=1
Putting these expressions into ||2 /V and expanding it into 1/R terms one finds the following for large R: 2 ||2 R 1 λ = − q V 2 R(λ0 R + λ) cR + 2mk H 2 1/(4mkR) + O(1/R ) if λ0 = 0 and c = 0, = O(1/R 2 ) otherwise. Consequently 2 | − 1/(4mkR 2 ) + O(1/R 3 )| if λ0 = 0 and c = 0, || d = V |O(1/R 3 )| otherwise. Since now again 4πR 2 is the volume of the large sphere (notice that there is no minus sign because of the orientation) we get that the contribution of infinity is −8π 2 /k to the above integral i.e., −1/k to the energy in the case of the multi-Eguchi–Hanson space c = 0 with the special limit λ0 = 0. Otherwise the contribution of infinity is zero. Also notice that if n = 0 then f = λ0 is a constant hence the energy is certainly zero. We summarize our findings in the following Theorem 4.1. The connection ∇λ−0 ,...,λk as given in (5) is a smooth self-dual SU (2) Yang–Mills connection and has energy 0 if all λi = 0 (i > 0) i.e., n = 0 (in which case ∇λ−0 ,0,...,0 is the trivial connection), otherwise we have
Fλ−0 ,...,λk 2 =
n − (1/k) if λ0 = 0, c = 0, n
otherwise,
where k refers to the number of NUTs while n is the number of non-zero λi ’s (i > 0). 3
286
G. Etesi, T. Hausel
5. The Abelian Solutions In this section we show that (5) is reducible to U (1) if and only if for an i = 0, . . . , k we have λi > 0 (for simplicity we take λi = 1), while λj = 0 if j = i and j = 0, . . . , k. First take the case when λ0 = 0 but the other λ’s vanish, then our solution (5), which we denote by ∇0 , is trivial. Now suppose that λi = 0 for i = 1, . . . , k but the others vanish. Take the NUT pi and consider the new gauge (10). In the case at hand the second line van± ishes and (10) reduces to the manifestly Abelian instanton ∇i with Bi± := B0,...,0,1,0,...,0 (with 1 in the i th place): k dτ + α ± Bi = − + (∓1 + cos i )dφi . V ri 2 But in fact these connections are gauge equivalent because Bi+ + 21 k dφi = Bi− − 21 k dφi , and ∇i locally can be written as k dτ + α Bi = − + αi , (13) V ri 2 where ∗3 dαi = dVi . The curvature Fi = ∇Bi of this Abelian connection is the L2 harmonic 2-forms found by [22] in the multi-Taub–NUT case. Now take any reducible SU (2) instanton of the form (5). Then in a suitable gauge it can be brought to the form µi Bi , i
where µi are real numbers. This follows by applying a result from [14] which claims that the 2-forms Fi generate the space of L2 harmonic 2-forms on our space. Now note µi k that the corresponding Higgs field of this instanton is 2 . Since || is gauge ri i
invariant we have the following identity around a particular NUT pi via (8) and (12): 2 2 µ 1 1 µ b i λi j i ||2 = + = − qi + pi . 2 εi rj 2 εi (λ0 εi + λi ) H j =i
Provided it is not identically zero the right-hand side nowhere vanishes. We can deduce that the µi ’s have to be all non-positive or non-negative (otherwise ||2 would be zero at some point). Without loss of generality we suppose they are all non-negative. Moreover the right-hand side times εi2 tends to either 0 or 1/2 when εi tends to 0. Therefore we have that µi is either 0 or 1. Moreover for large R we have 1/(2R 2 ) i µi for the left-hand side while the right-hand side asymptotically looks like 1/(2R 2 ) + O(1/R 3 ) if λ0 = 0, ||2 = O(1/R 3 ) if λ0 = 0. Thus this last expansion implies that λ0 = 0 and only one µi = 1, the rest vanishes. This in turn shows that only one λi is not zero which proves the following Theorem 5.1. An instanton in the form (5) is reducible if and only if for any i = 0, . . . , k we have λi = 0 and λj = 0 for j = 0, 1, . . . , i − 1, i + 1, . . . , k; in this case it can be put into the form (13). 3
Yang–Mills Instantons over Gravitational Instantons
287
6. Conclusion In this paper we have explicitly calculated the analogue of the ’t Hooft SU (2) instantons for multi-centered metrics of k NUTs. We found a k parameter family of such solutions (parametrized by λ0 , . . . , λk all non-negative numbers modulo an overall scaling) one for each NUTs. The structure of the space of solutions could be best visualized as an intersection of the positive quadrant and the unit ball in Rk . The k flat sides of this body correspond to the solutions λi = 0 for an i = 1, . . . , k, while the spherical boundary component corresponds to λ0 = 0. The vertices of this body correspond to the reducible solutions, the origin corresponding to the trivial solution; the rest to the non-Abelian solutions (5). The energy of the solutions are k at the interior points of the body, while it reduces by 1 for every λi = 0 for i = 1, . . . , k and by 1/k if λ0 = c = 0. In order to see the boundary solutions as ideal solutions of energy k we can imagine an ideal Dirac delta connection of energy 1 at each NUT pi for which λi = 0 in the Taub–NUT case; the situation is the same for the multi-Eguchi–Hanson case except that we add a further Dirac delta connection of energy 1/k at infinity if λ0 = 0.
∆
∆
2
∆ metric λ 0 = 0
∆
λ2 = 0
∆
Fig. 1. Case of 2-Taub–NUT
1
λ0 = 0
λ1 = 0
∆
0
∆
λ1 = 0
0
metric
λ2 = 0
∆
2
1
Fig. 2. Case of Eguchi–Hanson
A puzzling feature of this description in the multi-Taub–NUT case is at the interior of the spherical boundary component of our solution space, where λ0 = 0 but no other λi is zero. These solutions are not reducible by Theorem 5.1 but have energy k by Theorem 4.1. However they are singular points in our solution space. Thus either there are other solutions on the other side of the spherical boundary, or the moduli space of energy k instantons will have singularities at non-reducible points, a phenomenon which does not occur when the underlying 4-manifold is compact or ALE. The relation to the multi-Eguchi–Hanson case, explained in the next paragraph, however might point to the second possibility. Following [18] we have also studied our U (1) invariant instantons as singular monopoles on R3 . Interestingly we got the same monopoles (12) from the two cases (c = 0, 1). This raises the possibility to take a U (1) invariant instanton on a multi-Eguchi–Hanson space (where all the solutions were classified in [19]), consider it as a singular monopole on R3 and then pull it back to a U (1) invariant instanton on the corresponding multiTaub–NUT space. This method might lead to the construction of (all) U (1) invariant instantons on a multi-Taub–NUT space and if so it would indeed exhibit singular points in the non-reducible part of the moduli space as we explained in the paragraph above. We postpone a more thorough discussion of these ideas for a future work as well as the construction of the spectral data [7, 18] for the singular monopoles.
288
G. Etesi, T. Hausel
Acknowledgement. The main steps in this work were done when the first author visited the University of California at Berkeley during May 2002. We would like to acknowledge the financial support by Prof. P. Major (R´enyi Institute, Hungary) from his OTKA grant No. T26176 and of the Miller Institute for Basic Research in Science at UC Berkeley.
References 1. Aragone, C., Colaiacomo, G.: Gravity-matter instantons. Lett. Nuovo Cim. 31, 135–139 (1981) 2. Atiyah, M.F., Hitchin, N., Singer, I.M.: Self-duality in four-dimensional Riemannian geometry. Proc. Roy. Soc. London A362, 425–461 (1978) 3. Bianchi, M., Fucito, F., Rossi, G., Martenilli, M.: Explicit construction of Yang–Mills instantons on ALE spaces. Nucl. Phys. B473, 367–404 (1996) 4. Boutaleb-Joutei, H., Chakrabarti, A., Comtet, A.: Gauge field configurations in curved space-times I-V, Part I: Phys. Rev. D20, 1884–1897 (1979); Part II: D20, 1898–1908 (1979); Part III: D21, 979–983 (1980); Part IV: D21, 2280–2284 (1980); Part V: D21, 2285–2290 (1980) 5. Chakrabarti, A.: Classical solutions of Yang–Mills fields. Fortschr. Phys. 35, 1–64 (1987) 6. Charap, J.M., Duff, M.J.: Space-time topology and a new class of Yang-Mills instanton. Phys. Lett. B71, 219–221 (1977) 7. Cherkis, S.A., Kapustin, A.: Singular monopoles and supersymmetric gauge theories in three dimensions. Nucl. Phys. B525, 215–234 (1998) 8. Eguchi, T., Gilkey, P.B., Hanson, A.J.: Gravity, gauge theories and differential geometry. Phys. Rep. 66, 213–393 (1980) 9. Eguchi, T., Hanson, A.J.: Self-dual solutions to Euclidean gravity. Ann. Phys. 120, 82–106 (1979) 10. Etesi, G., Hausel, T.: Geometric interpretation of Schwarzschild instantons. J. Geom. Phys. 37, 126–136 (2001) 11. Etesi, G., Hausel, T.: Geometric construction of new Yang–Mills instantons over Taub–NUT space. Phys. Lett. B514, 189–199 (2001) 12. Freed, D.S., Uhlenbeck, K.K.: Instantons and four-manifolds. MSRI Publications 1, Berlin: SpringerVerlag, 1984 13. Gibbons, G.W., Hawking, S.W.: Gravitational multi-instantons. Phys. Lett. B78, 430–432 (1978) 14. Hausel, T., Hunsicker, E., Mazzeo, R.: Hodge cohomology of gravitational instantons. Preprint, arXiv: math.DG/0207169 15. Hawking, S.W.: Gravitational instantons. Phys. Lett. A60, 81–83 (1977) 16. Jackiw, R., Nohl, C., Rebbi, C.: Conformal properties of pseudo-particle configurations. Phys. Rev. D15, 1642–1646 (1977) 17. Kim, H.,Yoon,Y.:Yang–Mills instantons in the gravitational instanton background. Phys. Lett. B495, 169–175 (2000) 18. Kronheimer, P.B.: Monopoles and Taub–NUT metrics. Diploma thesis. Merton College, Oxford, 1985 (unpublished) 19. Kronheimer, P.B., Nakajima, H.:Yang–Mills instantons on ALE gravitational instantons. Math. Ann. 288, 263–307 (1990) 20. Newman, E.T., Tamburino, L., Unti, T.J.: Empty space generalization of the Schwarzschild metric. J. Math. Phys. 4, 915–923 (1963) 21. Pope, C.N., Yuille, A.L.: A Yang–Mills instanton in Taub–NUT space. Phys. Lett. B78, 424–426 (1978) 22. Ruback, P.: The motion of Kaluza–Klein monopoles. Commun. Math. Phys. 107, 93–102 (1986) 23. Taub, A.H.: Empty space-times admitting a three parameter group of motions. Ann. Math. 53, 472– 490 (1951) 24. Uhlenbeck, K.K.: Removable singularities in Yang–Mills fields. Commun. Math. Phys. 83, 11–29 (1982) Communicated by A. Connes
Commun. Math. Phys. 235, 289–311 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0800-1
Communications in
Mathematical Physics
Gradient Formula for Linearly Self-Interacting Branes Brandon Carter1 , Richard A. Battye2,3 , Jean–Philippe Uzan4 1 2 3
LUTH, Bˆat. 18, Observatoire de Paris, 92195 Meudon, France Jodrell Bank Observatory, Macclesfield, Cheshire SK11 9DL, U.K. Department of Physics and Astronomy, University of Manchester, Schuster Laboratory, Brunswick St, Manchester M13 9PL, U.K. 4 Laboratoire de Physique Th´eorique, CNRS–UMR 8627, Bˆat. 210, Universit´e Paris XI, 91405 Orsay Cedex, France Received: 15 April 2002 / Accepted: 23 October 2002 Published online: 25 February 2003 – © Springer-Verlag 2003
Abstract: The computation of long range linear self-interaction forces in string and higher dimensional brane models requires the evaluation of the gradients of regularised values of divergent self-interaction potentials. It is shown that the appropriately regularised gradient in directions orthogonal to the brane surface will always be obtainable simply by multiplying the regularised potential components by just half the trace of the second fundamental tensor, except in the hypermembrane case for which the method fails. Whatever the dimension of the background this result is valid provided the codimension is two (the hyperstring case) or more, so it can be used for investigating brane-world scenarios with more than one extra space dimension. 1. Introduction In recent studies of the linear self-interaction of superconducting [1] and other [2, 3] cosmic string models in a standard (3+1) dimensional background spacetime, it has been found that the relevant divergences can be conveniently dealt with by the application of a simple universal gradient formula. This formula can be used to obtain the effective self-force from the appropriately regularized self-field and the precise form of this gradient formula does not depend on the physical nature of the fields to which the string may be coupled. These could, for example, be of the ordinary (observationally familiar) electromagnetic [1], and linearised gravitational [2] kinds, but might also be of the theoretically predicted dilatonic and axionic kinds [3]. The purpose of this work is to generalise this gradient formula to higher dimensional branes in higher dimensional backgrounds. A particular motivation for doing this is to generalise the study of the original kind of brane-world scenario [4] not just by dropping the postulate of reflection symmetry [5, 6], but by allowing for more than one extra dimension [7–9], so as to obtain models consisting of a 3-brane embedded in a spacetime of dimension greater than 5. More generally, our results are applicable to any p-brane in a (q + 1) dimensional background spacetime, so long as the codimension (q − p) is two
290
B. Carter, R.A. Battye, J.-P. Uzan
(the “hyperstring” case, which includes ordinary strings in a (3+1) dimensional spacetime) or more. The method given here does however fail (due to long range “infrared” divergences) when the co-dimension is only one (the “hypermembrane” case, which includes ordinary membranes in a (3+1) dimensional spacetime), but in that case it is not really needed, since the short range “ultraviolet” divergences will have a relatively innocuous form that can be dealt with in terms of simple jump discontinuities. The reason why one needs a gradient formula is that the relevant forces are typically obtained as gradients of potentials. Outside the brane worldsheet these potentials will be well behaved fields, but in the thin brane limit they will be singular on the worldsheet itself. One therefore needs to regularise these fields by some appropriate ultra-violet cut-off procedure, typically involving a length scale, say, that can be interpreted as characterising the underlying microstructure. When one has obtained the appropriately regularised potential on the worldsheet (which represents the cross-sectional average over an underlying microstructure) the next problem is to obtain the corresponding force by evaluating its gradient. The gradient components in directions tangential to the worldsheet can be obtained directly just by differentiating the regularised (macroscopic averaged) potential. The trouble is that, since the support of the regularised potential is confined to the worldsheet, differentiation in orthogonal directions is not directly meaningful. To obtain the necessary orthogonal gradient components it is therefore necessary, in principle, to go back to the underlying microstructure and perform the ultra-violet limit process again, a rather arduous task that has frequently been performed independently by different authors in a wide range of different physical contexts. The present work has been prompted by the observation [1] that, in the case of a string in an ordinary 3-dimensional background, the effect of the orthogonal gradient operation will always turn out to be equivalent just to multiplication of the regularised value of the relevant potential field by exactly half the extrinsic curvature vector K ρ , which is defined to be the trace K ρ = g µν Kµνρ ,
(1)
of the second fundamental tensor Kµν ρ of the worldsheet. The recognition of this simple universal rule makes it unnecessary to go back and perform the calculation over again every time such a problem arises in a different physical context. On the basis of dimensional considerations, and of the need to respect local Lorentz invariance, one would expect that multiplication by a factor proportional to K ρ would inevitably be what is required. But what is not so obvious in advance is whether the relevant proportionality factor should still always be exactly a half even for branes and backgrounds of higher dimensions. The present work addresses this question and provides an unreservedly affirmative reply whenever 0 < p < q − 1, on the basis of the consideration that whatever the relevant factor may be for a generic evolution of a brane of given space dimension p in a background of given space (as opposed to spacetime) dimension q, this factor must evidently remain the same for any static configuration of the brane. By restricting our attention to static configurations, for which a rigourous analysis is technically much simpler than in the dynamic case, what we have succeeded in showing here is that – provided the codimension (q − p) is two (the “hyperstring” case) or more, so that the quantities involved are at worst logarithmically divergent in the “infra red” (which excludes the strongly divergent case of a membrane, but admits the case of a
Gradient Formula for Linearly Self-Interacting Branes
291
string in 3 dimensions) – the appropriate factor is indeed always exactly a half, no matter how high the dimension of the brane or the background. However our analysis also shows that this easily memorable result will no longer hold in the strongly “infra red” divergent extreme case of a “hypermembrane” (meaning a brane with codimension one, so that its supporting worldsheet is a hypersurface). What the present approach can not do is to provide any evidence at all, one way or the other, about the value of the factor in question in the opposite extreme case of a zero-brane, meaning a simple point particle. The article is organised as follows. In Sect. 2, we recall the definitions of the basic geometrical quantities, in particular the second fundamental tensor, that will appear in our discussion. We introduce the Poisson equation, both in the thin brane limit and for regularised configurations in Sect. 3 and discuss its solution in Sect. 4. The cross-sectional average of the field and its gradient have a zero-order contribution, in the case of a flat brane, that is evaluated in Sect. 5 and we illustrate our formalism on the choice of a canonical profile function in Sect. 6. The cross-sectional average of the gradient of the field vanishes in the flat configuration and our goal is to evaluate its value at first order in the curvature. After defining the procedure of averaging in Sect. 7, we study the effect of the bending of the brane on this procedure in Sect. 8. Section 9 finalises our demonstration by computing the cross-sectional average of the gradient of the field. A summary of the notation and some useful integrals are presented in the appendix. 2. Brane Embedding Geometry Since it will play a central role in our analysis, it is worthwhile to recall the definition and some basic properties of the second fundamental tensor Kµν ρ . We consider a p-dimensional surface (a p-brane) with internal coordinates σ i (with i = 1 . . . p) smoothly embedded in a higher q-dimensional background endowed with a metric gµν with respect to coordinates x µ (with µ = 1 . . . q). If the location of the surface is given in terms of a set of embedding functions x¯ µ (σ i ), the internal surface metric, also referred to as first fundamental tensor, can be expressed as γij = gµν
∂ x¯ µ ∂ x¯ ν . ∂σ i ∂σ j
(2)
Provided that this metric is non-singular (in the sense that the determinant of the component matrix does not vanish), as will always be the case for a strictly timelike surface in a Lorentz signature background spacetime, or for any surface if the background metric is positive definite, then it will have a well defined inverse with components γ ij . It can be used to raise internal coordinate indices in the same way as the inverse background metric g µν is used for raising background coordinate indices. For any such non-singular embedding, γ ij can be mapped into the background tensor γ µν = γ ij
∂ x¯ µ ∂ x¯ ν . ∂σ i ∂σ j
(3)
µ
The corresponding mixed tensor γ ν acts on vectors as the natural (rank p) surface tangential projector operator, while its (rank q − p) complement ⊥µν = g µν − γ νµ , acts similarly as the corresponding orthogonal projection operator.
(4)
292
B. Carter, R.A. Battye, J.-P. Uzan µ
µ
Fields with support confined to the p-surface S (p) , such as γ ν and ⊥ν , can not be directly subjected to the unrestricted operation of partial differentiation with respect to the background coordinates, but only to differentiation in tangential directions, as performed by the operator ∇µ = γ µρ ∇ρ ,
(5)
where ∇ρ is the usual covariant derivative associated with gµν . The action of this differential operator on the first fundamental tensor defines the second fundamental tensor of the p-surface, according to the specification [10] Kµνρ = ηνσ ∇µ ηρσ .
(6)
As a non-trivial integrability condition, it satisfies the generalised Weingarten symmetry condition Kµνρ = Kνµρ .
(7)
It is also p-surface orthogonal on its last index and tangential on the other two, that is, Kµνσ ηρσ = 0 = ⊥σµ Kσ νρ ,
(8)
so that it has only a single non-identically vanishing self-contraction, namely the extrinsic curvature vector introduced in Eq. (1).
3. Regularisation of Brane Supported Source Distribution Since we are essentially concerned with the treatment of “ultraviolet” regularisation, the effects of long range background curvature will be unimportant and can thus be neglected. As the framework for our analysis, it will be sufficient to consider a q-dimensional background space with a flat, static, positive definite metric gµν . It is therefore possible to choose a linear (but not necessarily orthogonal) system of space coordinates x µ so that the background metric will be constant, that is, ∂gµν /∂x ρ = 0. This choice has different implications. First, the tangentially projected covariant derivative (5) will take the simple form ∇µ = γ µρ
ν ∂ ∂ ij ∂ x¯ = g γ . µν ∂x ρ ∂σ i ∂σ j
(9)
Second, in the linear system (for, example Maxwellian or linearised gravitational), with which we are concerned here, each tensorial component decouples and evolves independently like a simple scalar field. Thus, it will be sufficient for our purpose to concentrate on the prototype problem of finding a scalar field, let us say φ{x} (introducing the systematic use of curly brackets to indicate functional dependence), that satisfies the corresponding generalised Poisson equation to which the linear wave equation will be reduced in the static case to be dealt with here. Third, the background spacetime is assumed to be static so that the generalised Laplacian ∇ µ ∇µ is to be regarded as the static projection of what would be a Dalembertian type wave operator in a more complete (p + 1)-dimensional spacetime description.
Gradient Formula for Linearly Self-Interacting Branes
293
3.1. Generalised Poisson equation and its solutions in the thin brane limit. In unrationalised units for, let us say, a repulsive scalar source distribution ρ{x} the generalised Poisson equation takes the form ∇ µ ∇µ φ = −
[q−1]
ρ{x},
(10)
where the Laplacian operator is simply given by ∇ µ ∇µ = g µν
∂2 , ∂x µ ∂x ν
(11)
[q−1]
is the surface area of a unit (q −1)-sphere in the q-dimensional background. where The latter is given by the well known formula
[q−1]
=
2π q/2 , {q/2}
(12)
in which {z} is the usual Eulerian Gamma function (as specified in the appendix) which satisfies the recursion relation {n + 1} = n{n} and has the particular values {1} = 1 √ and {1/2} = π. Thus in particular we shall have [0] = 2, [1] = 2π , [2] = 4π , [3] = 2π 2 and [4] = 8π 2 /3. The motivation for working with a p-brane model of a physical system is of course to be able to use an economical description in which as many as possible of the fields involved are specified just on the supporting surface. Evidently, this requires less information than working with fields specified over the higher dimensional background. However, when long range interactions are involved then, as a price for such an economy, divergences are to be expected if the source fields are considered to be strictly confined to the supporting p-surface. To be more specific, the brane supported source distribution ρ{σ ¯ } (using a bar as a reminder to indicate quantities that are undefined off the relevant supporting surface) will correspond to the background source distribution [q] [p] ρ{x} = δ {x, x{σ ¯ }} ρ{σ ¯ } dS , (13) [q]
where δ {x, x{σ ¯ }} is the q-dimensional bi-scalar Dirac distribution. This is characterised by the condition that it vanishes wherever the evaluation points x µ and x¯ µ {σ } are different while nevertheless satisfying the unit normalisation condition [q] [q] δ {x, x} ¯ dS = 1. (14) [p]
[q]
We use the notations dS and dS respectively for the p-dimensional brane surface measure and q-dimensional background space measure. The latter is given by an expression of the form dS
[q]
[q]
= |g|1/2 d x, |g|1/2
(15)
is just the square root of the where the q-dimensional volume measure factor determinant of the (positive definite) matrix of metric components gµν . Similarly, we have [p] [p] dS = |γ |1/2 d σ, (16)
294
B. Carter, R.A. Battye, J.-P. Uzan
where the p-dimensional volume measure factor |γ |1/2 is the square root of the determinant of the matrix of surface metric components, as given by (2). The normalisation condition (14) is such that the measure factor means that the Dirac distributions specified in this way will transform as an ordinary bi-scalar, a property that distinguishes it from the corresponding Dirac type that would be given by the commonly used alternative convention in which the measure factor is omitted. This specifies distributions that behave not as ordinary bi-scalars but as weight one bi-scalar densities. Of course it does not matter which of these conventions is used if, as will be done below, the coordinates are restricted to be of the standard orthonormal type with respect to which the metric just has the Cartesian unit diagonal form. For the source distribution given in the thin brane limit by (13), the corresponding solution of the linear field equation (10) is given in terms of a Green function by the expression [q] [p] φ{x} = G {x, x{σ ¯ }} ρ{σ ¯ } dS , (17) while the corresponding scalar field gradient – which is what determines the resulting force density – will be given by ∂φ {x} = ∂x µ
[q]
∂G [p] {x, x{σ ¯ }} ρ{σ ¯ } dS . µ ∂x
(18)
[q]
The Green function G {x, x{σ ¯ }} satisfies the Laplace equation ∇µ ∇ µ G {x, x{σ ¯ }} = − [q]
[q−1] [q]
δ {x, x{σ ¯ }}
(19)
subject to the appropriate boundary conditions, which in the standard case for an isolated system will simply consist of the requirement that the field should tend to zero at large distances from the source. Like the Dirac “function”, it is a bi-scalar (not a bi-scalar density) that is well behaved as a distribution, but singular as a function wherever the two points on which it depends are coincident. Note that the Green function is determined up to an harmonic function, that is a solution of the homogeneous Poisson equation [q] [q] ∇µ ∇ µ G {x, x{σ ¯ }} = 0, with the appropriate boundary conditions. In particular G is determined up to an additive constant that induces a global shift in the potential and is not relevant for the computation of its gradient.
3.2. From the thin brane to a regular thick brane. Although convenient for many formal calculational purposes, the drawback to the use of the strict thin brane limit as represented in terms of such singular Dirac and Green distributions is that the ensuing value of the corresponding field φ and its gradient ∂φ/∂x µ will also be singular on the brane p-surface, which unfortunately is just where one needs to evaluate them for purposes such as computing the corresponding energy and force density. The obvious way to get round this difficulty – without paying the cost of introducing extra degrees of freedom – is to replace (13) by an expression of the form ρ {} {x} =
[q]
[p]
D{} {x, x{σ ¯ }} ρ{σ ¯ } dS ,
(20)
Gradient Formula for Linearly Self-Interacting Branes
295
[q]
in which the strict Dirac distribution δ {x, x} ¯ is replaced by a regular bi-scalar profile [q] ¯ say, characterised by some sufficiently small but finite smoothing function, D{} {x, x} length scale, say, which is subject to the normalisation condition [q] [q] D{} {x, x} ¯ dS = 1. (21) In order to respect the geometric isotropy and homogeneity of the background space, [q] this regular “profile function” D{} {x, x} ¯ can depend only on the relative distance, s say, that is defined – in the flat background under consideration – by s 2 = gµν (x µ − x¯ µ )(x ν − x¯ ν ),
(22)
according to some ansatz of the form [q]
[q]
D{} {x, x} ¯ = D{} {s 2 }.
(23)
The functionally singular Dirac distribution can be considered as the limit of such a function as the smoothing lengthscale tends to zero. Such a description corresponds to what may be described as a “fuzzy” brane model characterised by an effective thickness whose order of magnitude will be a function of the value of the regularisation parameter . For some kinds of physical application one might wish to adjust the regularisation ansatz to make it agree as well as possible with the actual internal structure (if known) of the extended physical system under consideration. The spirit of the present work is, however, rather to treat the regularisation parameter just as a provisional freely variable quantity for use at an intermediate stage of an analysis process whereby it would ultimately be allowed to tend to zero so as to provide a strict “thin” brane limit description. In order for our results to be meaningful for this latter purpose, it is important that they should be insensitive to the particular choice of the smoothing ansatz that is postulated. We shall take care to verify this. 4. Field Solution for Regularised Source Distribution [q]
For any source distribution of the regularised form (20), the function, G{} {s 2 } that is the solution (with the appropriate boundary conditions) of the corresponding generalisation ∇ µ ∇µ G{} {s 2 } = − [q]
[q−1]
[q]
D{} {s 2 },
(24)
of (19) will act as a generalised Green function. The solution of the Poisson equation (10) for the generic smoothed source distribution (20) will thus be given, using the natural abbreviation s{x, σ } = s{x, x{σ ¯ }}, (25) by [p] {} {x} = G [q] {s{x, σ }} ρ(σ ¯ ) dS . (26) φ {}
The gradient of this scalar field is evidently given by {} ∂φ ∂x µ
{x} =
∂G [q] {} ∂x µ
[p]
{s 2 {x, σ }} ρ{σ ¯ } dS ,
in which the integrand can be evaluated using the formula
(27)
296
B. Carter, R.A. Battye, J.-P. Uzan [q]
[q]
∂G{}
dG{}
= 2sµ
∂x µ
ds 2
,
(28)
where s µ = x µ − x¯ µ . One can evidently go on in the same way to obtain the second derivative ∂ 2 G [q] {} ∂ 2φ [p] {} {x} = {s 2 {x, σ }} ρ{σ ¯ } dS , (29) µ ν µ ν ∂x ∂x ∂x ∂x in which the integrand can be evaluated using the formula [q]
[q]
∂ 2 G{}
∂x µ ∂x ν
= 2gµν
[q]
dG{}
+ 4sµ sν
ds 2
d2 G{}
(ds 2 )2
.
(30) [q]
It is to be noted that, by contraction of (30), the Laplace equation (24) for G{} takes the form [q] [q−1] d q dG{} [q] s =− s q−2 D{} {s 2 }. (31) ds 2 ds 2 4 [q]
If one introduces the monotonically increasing function I{} {s 2 } by s [q] [q−1] [q] I{} {s 2 } ≡ uq−1 D{} {u2 } du,
(32)
0
then it will satisfy [q]
dI{} ds 2
[q−1]
=
s q−2 D{} {s 2 }. [q]
(33)
2 By substituting into (31), it follows that we shall have [q]
dG{} ds 2 and hence
[q]
d2 G{}
(ds 2 )2
[q]
=−
I{} {s 2 } 2s q
[q]
=
q I{} {s 2 } 4s q+2
−
,
[q−1]
(34) [q]
D{} {s 2 }
4s 2
.
(35) [q]
The normalisation condition (21) implies by (32) that the function I{} {s 2 } should satisfy the condition [q] I{} {s 2 } → 1 (36) in the limit s 2 → ∞, for all . It follows that any admissible ansatz for the smoothing profile will entail an asymptotic behaviour characterised by [q]
dG{} ds 2
∼−
1 2s q
(37)
as s 2 → ∞. The corresponding asymptotic formula for the regularised Green function [q] G{} {s 2 } will therefore be given, as long as q > 2, by [q]
G{} {s 2 } ∼
s 2−q q −2
(38)
Gradient Formula for Linearly Self-Interacting Branes
297
as s → ∞. In the particular case q = 2, the integral over s 2 is marginally convergent, with asymptotic behaviour characterised by 1 [2] G{} {s 2 } ∼ − ln{s 2 } 2
(39)
as s 2 → ∞. 5. Evaluation for Flat Brane Configurations Before considering the effect of the curvature of the brane, it will be instructive to apply the preceding formulae to the case of a flat p-brane configuration supporting a uniform source distribution. The uniformity condition means that the value of ρ¯ is independent of the internal coordinates σ i , and the flatness condition means that the embedding mapping σ i → x¯ µ {σ i } of the brane p-surface will be expressible in a system of suitably aligned Cartesian background space coordinates {x¯ µ } = {¯zi , r¯ a } (with µ = 1 . . . q, i = 1 . . . p, a = p+1 . . . q) in terms of a corresponding naturally induced system of internal coordinates σ i simply by z¯ i = σ i , r¯ a = 0. (40) With such a coordinate system, the background metric components will be given by gij = γij ,
gia = 0,
gab = ⊥ab ,
(41)
where γij and ⊥ab are unit matrices of dimension p and (q − p) respectively. The intrinsic uniformity of the configuration ensures that physical quantities depend only on the external coordinates r a , and the isotropy of the system ensures that scalar quantities can depend only on the radial distance r from the brane p-surface given by r 2 = ⊥ab r a r b .
(42)
In particular, the distance s from a point with coordinates x µ on the transverse (q − p)plane through the origin (that is, with σ i = 0) to a generic point with coordinates σ on the flat p-brane locus will be given simply by s 2 {x, σ } = r 2 + σ 2 ,
(43)
σ 2 = γij σ i σ j .
(44)
where
It can be seen that for a given fixed value of the source density ρ¯ on the brane, the prescription (20) will provide a radial dependence of the form ⊥
[q,p]
ρ {r 2 } = D{} {r 2 } ρ, ¯
(45) ⊥
[q,p]
where the dimensionally reduced radial profile function D{} {r 2 } is given in terms of [q]
the original q-dimensional profile function D{} {s 2 } by the expression ⊥
[q,p]
[p−1]
D{} {r 2 } =
∞ 0
σ p−1 D{} {σ 2 + r 2 } dσ. [q]
(46)
298
B. Carter, R.A. Battye, J.-P. Uzan
This integral will always be convergent as a consequence of the rapid asymptotic fall[q] off condition that must be satisfied by the function D {s 2 } in order for it to obey the normalisation condition (21), which implies that ∞ [q−p−1] ⊥ [q,p] r q−p−1 D{} {r 2 } dr = 1. (47) 0
[q]
Since (37) implies that the associated regularised Green function G{} {s 2 } will satisfy {} will be given by the comparatively weak asymptotic fall off condition (38), the field φ {} {r 2 } = ρ¯ ⊥G [q,p] {r 2 }, φ {}
(48) ⊥ [q,p]
using (26), where the dimensionally reduced radial Green function G{} {r 2 } is defined by ⊥ [q,p] [p−1] [q] G{} {r 2 } ≡ σ p−1 G{} {σ 2 + r 2 } dσ, (49) 0
in the limit where → ∞, a limit that will be convergent only if q > p + 2. In the prototype case [1–3] of an ordinary string with p = 1 in a background of space dimension q = 3, the latter is only marginally too low compared with the dimension of the brane surface to achieve convergence. In that case, as in the more general hyperstring case, that is to say whenever q = p + 2, it is still possible to obtain a useful asymptotic formula by taking the upper limit to be a finite “infra-red” cut off length. It can be seen from (38) that in this marginally divergent codimension-2 case the resulting asymptotic behaviour will be given by [q−3]
⊥ [q,q−2] , (50) ln √ G{} {r 2 } ∼ q −2 r 2 + 2 as → ∞. One might go on to try to obtain an analogous formula for what, as far as “infra-red” divergence behaviour is concerned, is the worst possible case of all, namely that of a hypermembrane (or “wall”) meaning the hyper-surface supported case characterised by q = p + 1. However in this codimension-1 case, the divergence will have the linear form [q−2] ⊥ [q,q−1]
, (51) G{} {r 2 } ∼ q −2 which is particularly sensitive to the choice of cut-off and hence is not very useful. {} will typically be less important than its (not For physical purposes the potential φ so highly gauge dependent) gradient, which is what will determine the relevant force density. In the flat and uniform configuration considered in this section, the gradient {} /∂x µ is not liable to the infrared divergence problems that beset the undifferenti∂φ ated field when the background dimension p is not sufficiently high compared with the brane dimension q. It is evident from the uniformity of the configuration that the gradient components in directions parallel to the brane p-surface will vanish, that is, we shall {} /∂zi = 0, while starting from the expression (27), it can be concluded that the have ∂ φ orthogonal gradient components will be given by {} ∂φ ∂r a
⊥ [q,p]
{r } = ρ¯ 2
∂ G{} ∂r a
{r 2 },
(52)
Gradient Formula for Linearly Self-Interacting Branes
299
where, by Eq. (34), even in the cases for which the undifferentiated integral (49) would be divergent, the required derivative is given by a well defined formula of the form ⊥ [q,p]
∂ G{}
[q,p]
{r 2 } = −ra J{} {r 2 },
∂r a
(53)
[q,p]
in which the function J{} {r 2 } is defined as the integral
[q,p]
[p−1]
J{} {r } = 2
[q]
∞
σ
p−1
0
I{} {r 2 + σ 2 } (r 2 + σ 2 )q/2
dσ,
(54)
which will always converge by virtue of (36). In a similar manner, starting from (29), it can be seen that the corresponding second [q,p] derivative components will be given in terms of the same convergent integral J{} {r 2 } by ⊥ [q,p]
{} ∂ 2φ ∂r a ∂r
{r 2 } = ρ¯ b
∂ 2 G{}
∂r a ∂r b
{r 2 }
(55)
in which, by the definition (49), it can be seen that we shall have ⊥ [q,p]
∂ 2 G{}
[p−1]
{r 2 } = 2
∂r a ∂rb
0
σ p−1 ⊥ab
[q]
dG{} ds 2
[q]
+ 2ra rb
d2 G{}
(ds 2 )2
{σ 2 + r 2 } dσ, (56)
in which it is to be recalled that the use of curly brackets (as in the expression {σ 2 + r 2 } at the end) indicates functional dependence (not simple multiplication). Using (35) to express the second derivative in the integrand as [q] [q] [q−1] 2−p d dG{} (p − q) dG{} σ [q] σ p , = − D {s 2 } − 2 (ds 2 )2 2r 2 ds 2 4r 2 {} r dσ 2 ds 2 [q]
d2 G{}
(57)
and observing that the asymptotic behaviour (37) ensures that the last term in (57) will contribute nothing to the integral on the right of (56), we finally obtain an expression of the convenient form ⊥ [q,p]
∂ 2 G{}
∂r a ∂rb ⊥
{r 2 } = − [q,p]
[q,p] ra rb [q−1] ⊥ [q,p] 2 r a rb D {r } + (q − p) − ⊥ J{} {r 2 }, ab {} r2 r2
(58)
in which D{} {r 2 } is the dimensionally reduced profile function given by the integral (46).
300
B. Carter, R.A. Battye, J.-P. Uzan
6. Illustration Using a Canonical Profile Function Our main results will be derived for any generic smoothing ansatz, without restriction [q] to any particular form for the profile function D{} {s 2 }. However as a concrete illustration, it is instructive to consider the case of what we shall refer to as the “canonical” [q] ¯ and regularisation ansatz, whereby the q-dimensional Dirac delta “function” δ {x, x} [q] ¯ are considered as the respective limits – as the associated Green function G (x, x) the regularisation parameter → 0 – of one-parameter families of smooth bi-scalar functions [q] [q] [q] [q] D{} {x, x} ¯ = δ{} {s 2 }, and G{} {x, x} ¯ = G{} {s 2 } (59) that are characterised in terms of the squared distance s 2 {x, x} ¯ and of the smoothing lengthscale by the prescription [q]
δ{} {s 2 } =
2q
[q−1]
s2 + 2
−(q+2)/2
.
(60)
[q]
This canonical profile function evidently satisfies δ{} {s 2 } → 0 when 2 → ∞ for 2 s = 0, as well as the unit normalisation condition (21) using the fact that dS [q] = [q−1] q−1 s ds. By expressing the canonical profile function as
q/2 s2 2 [q] 2 2−q d , (61) δ{} {s } = [q−1] s ds 2 s2 + 2 it can be seen that the resulting integral (32) will simply be given by [q]
I{} {s 2 } =
s2 2 s + 2
q/2 .
(62)
It is then straightforward, by integrating (34), to show that the corresponding regularised Green function, satisfying the Poisson equation (24), will be given by the canonical [q] Green function G{} defined by [q]
G{} {s 2 } =
−(q−2)/2 1 2 , s + 2 (q − 2)
(63)
as long as q > 2. As explained above, it is determined up to a constant that is fixed by the fall off requirement at infinity which is mandatory for the convergence of the integrals considered below but that will not be relevant for the gradients to be computed later on. In the marginally convergent case where q = 2, one finds that s2 1 [2] 2 G{} {s } = − ln 1 + 2 . (64) 2 For a specific physical system this particular choice is not necessarily the ansatz that will be most accurate for the purpose of providing a realistic representation of its actual internal microstructure. However this “canonical” prescription – as characterised by the formulae (62) and (63) – is distinguished from other conceivable alternatives by the
Gradient Formula for Linearly Self-Interacting Branes
301
very convenient property of having an analytic form that is preserved by the dimensional reduction procedure. The associated radial profile function is simply given by ∞ [p−1] 2q up−1 ⊥ [q,p] D{} {r 2 } = [q−1] 2 du. (65) 2 (q−p+2)/2 ( + r ) (1 + u2 )(q+2)/2 0 Since we shall always have q > p − 2, the latter integral will always be convergent. Using the well known properties (see (118) in the appendix) of the Euler Beta function B{p/2, (q −p)/2+1} it can be seen that in this canonical case the radial profile function will be given simply by ⊥ [q,p] [q−p] D{} {r 2 } = δ{} {r 2 }. (66) It can similarly be seen that, since we always have q > p, the integral (54) will also be convergent, and (again using the formulae in the appendix) that the result will be expressible in terms of the Beta function as [q−p] p q −p [q,p] 2 J{} {r } = , . (67) B 2 2 2(r 2 + 2 )(q−p)/2 According to (49) the corresponding dimensionally reduced Green functions will have a canonical form ⊥ [q,p] ⊥ [q,p] G{} {r 2 } = G{} {r 2 } (68)
that will similarly be given whenever they are well defined – that is whenever q > p + 2 – by expressions of the form [p−1] p q −p ⊥ [q,p] G{} {r 2 } = B , − 1 , (69) 2 2 2(q − 2)( 2 + r 2 )(q−p)/2−1 so that, in terms of the canonical Green function (63), they take the form ⊥
[q−1]
[q,p]
G{} {r 2 } =
[q−p]
G{} {r 2 }.
(70) The form of the Green function, unlike that of the canonical profile function, is thus not exactly preserved by the dimensional reduction process but the only difference is an overall volume factor. The preceding expression is only valid when q > p + 2, but in the marginally convergent hyperstring case where q = p + 2 one can use the expression (64) for the canonical Green function to obtain the asymptotic relation [q−3]
⊥ [q,q−2] (71) G{} {r 2 } ∼ ln √ (q − 2) r 2 + 2 as → ∞. It is to be noted that the logarithmic term in this formula and in (50) will be expressible by (64) in the form
[2] 2 ln √ = G{} {r } + ln (72) 2 2 r + in which the first term is independent of the cutoff . This means that in the asymptotic limit the dependence on r will drop out, so that we shall be left simply with [q−3]
⊥ [q,q−2] G{} {r 2 } ∼ ln (73) (q − 2) as → ∞.
[q−p−1]
302
B. Carter, R.A. Battye, J.-P. Uzan
7. Source Weighted Averages in the Flat Canonical Case For the purposes of a macroscopic description, what really matters is not the detailed field distribution in a smoothed microscopic treatment as developed in the previous sections, but only its effective averaged values. For instance, in the example of the scalar {} {r}, we shall be interested in its cross-sectional average defined as field φ ∞ {} ≡ [q−p−1] {} {r}w{r} dr, φ r q−p−1 φ (74) 0
where w{r} is a weighting factor dependent only on the radial coordinate r subject to the normalization condition ∞ [q−p−1] r q−p−1 w{r} dr = 1. (75) 0
Let us first consider the flat uniform configurations studied in Sect. 5. Whatever the explicit form of the function w{r}, it is obvious that the isotropy of the configuration will ensure that the average of the field gradient (52) will cancel out, meaning that we {} /∂r a = 0. Using the general relation shall have ∂ φ
2 r 1 r a rb = ⊥ab , f {r} q −p f {r}
(76)
which applies to any purely radial function f {r}, it can thus be seen from (55) and (58), using the expression (45) for the radial source density distribution, that [q−1] {} ∂ 2φ ρ ⊥ab , =− (77) q − p {} ∂r a ∂r b in which the value of ρ {} also depends on the choice of the weighting function. It is to be remarked that (as a useful check on the preceding algebra) the trace this equation gives back the original Poisson equation (56). The kinds of energy and force contributions that are relevant in physical applications will typically be proportional to the product of the linear field φ with the corresponding {} considered here should be source term. This implies that the weighting of the field φ proportional to the source term ρ {} . For a flat configuration, it can thus be deduced from (45) that the appropriate weighting factor should be explicitly given by the formula ⊥
[q,p]
w{r} = D{} {r 2 },
(78)
and hence that the average of the source term will be given by an expression of the form ∞
2 [q−p−1] ⊥ [q,p] ρ {} = ρ ¯ D{} {r 2 } r q−p−1 dr, (79) 0
which will be valid, whatever the profile function, as long as the configuration under consideration is flat. The corresponding expression for the average of the field φ will have the form ∞ [q−p−1] ⊥ [q,p] ⊥ [q,p] {} = ρ φ ¯ D{} {r 2 } G{} {r 2 }r q−p−1 dr. (80) 0
Gradient Formula for Linearly Self-Interacting Branes
303
In the special canonical case introduced in Sect. 6, the profile factor will be given by (60), so that the (always convergent) integral (79) will be obtainable explicitly – as shown in the appendix – in terms of the Beta function. This leads to an expression of the form (q − p)2 ρ¯ q −p q −p ρ{} = , + 2 q−p , B (81) [q−p−1] 2 2 2 for the average source density. With further use of the formulae in the appendix, this can be written even more explicitly as
ρ{} =
1 16π
q−p 2
(q − p)2 (q − p + 2) [{(q − p)/2}]3 ρ¯ . (q − p + 1) {q − p} q−p
(82)
The simplest application of this result is to the case of a hypermembrane or “wall”, meaning a hypersurface forming brane as characterised by q − p = 1, for which the formula (82) just gives ρ{} = 3π ρ/(32). ¯ For the case q − p = 2, which includes that of a string in an ordinary 3-dimensional space background, the corresponding result is 2 ). ρ{} = ρ/(3π ¯ In terms of the canonically reduced Green functions (68), it can be seen that the average of the regularised field φ{} will be given in the flat case by a prescription of the form ∞ [q−p−1] ⊥ [q,p] [q−p] φ{} = ρ G{} {r 2 }δ{} {r 2 }r q−p−1 dr. (83) ¯ 0
Whenever the expression (70) is valid, that is whenever q > p + 2, the formulae in the appendix will provide a result expressible as
φ{}
[q−1] 1 (q − p) q −p q −p ρ¯ = B , , [q−p−1] q−p−2 2 (q − p − 2) 2 2
(84)
which can be rewritten in terms of functions as π p/2 (q − p) [{(q − p)/2}]3 ρ¯ . φ{} = 2 (q − p − 2) {q/2}{q − p} q−p−2
(85)
The preceding formula will no longer be valid, but an analogous though weakly cutoff dependent expression will still be obtainable in the marginal hyperstring case, when q = p + 2. In this case, instead of (70), the effective dimensionally reduced Green function will be given by the simple asymptotic formula (73) in the large value limit for the relevant infrared cutoff . The ensuing asymptotic formula for the averaged value of φ{} (r) in a hyperstring is thus immediately seen to be given by [q−3] ρ¯
φ{} ∼ ln 2(q − 2)
as → ∞.
(86)
304
B. Carter, R.A. Battye, J.-P. Uzan
8. Allowance for Curvature in the Averaging Process We are now ready to tackle the main problem that motivates this study, which is the evaluation of the effect of curvature of the brane p-surface. In order to do this, instead of the exactly flat configuration characterised by the coordinate system (40) and the particular form of the metric components (41), we need to consider a generically curved p-surface adjusted so as to be tangent to the reference flat p-surface at the coordinate origin. The validity of the “thin brane” limit description with which we are concerned depends on the requirement that the relevant macroscopic lengthscale (that was used to specify an appropriate “infra-red” cut off, in cases for which the convergence condition q > p + 2 does not hold) should be large compared with the microscopic brane thickness scale . Subject to the requirement that α≡
1
(87)
is satisfied (either because is very large or because is very small), it will be sufficient for our present purpose to work at linear order in α. This means that we need only consider up to quadratic order in σ i for an expansion where the brane locus will be specified in the relevant neighbourhood – of dimension large compared with but small compared with – by z¯ i = σ i ,
r¯ a =
1 Kij a σ i σ j [1 + o{α}], 2
(88)
for an arbitrary set of constant coefficients Kij a . In this expansion, we have used the standard meaning of o{δ} denoting a quantity negligible with respect to δ, that is satisfying o{δ}/δ → 0 in the limit δ → 0. It can be easily verified that, as our notation (88) suggests, the constants of the development are exactly given by the non-vanishing components of the second fundamental tensor Kµν ρ , defined in (6) at the origin, with respect to the orthonormal background coordinates (40). In the simplest cases, the condition (87) might be satisfied simply by taking to be arbitrarily large. In specific applications an upper bound on the admissible magnitude for will commonly be imposed by the physical nature of the environment in which upper limits might be provided by lengthscales such as those characterising the gradients of relevant background fields or those characterising the separation between neighbouring branes. However, an upper limit that must always be respected will be provided, except in the strictly flat case, by the curvature lengthscale defined by the inverse of the magnitude of the mean curvature vector whose components are given by the formula K a = Ki ia .
(89)
As our notation suggests, these components K a are identifiable with respect to the orthonormal background coordinates (40) as the non-vanishing components of the complete curvature vector K µ given by (1). In order to ensure that none of the coefficients Kij a exceeds the order of magnitude of the inverse lengthscale −1 , the latter must thus be subject to the limitation a
−2 > (90) ∼ K Ka . In particular, our formalism will break down near a kink or a cusp of the brane worldsheet. The curvature of the worldsheet makes the evaluation of the physically appropriate weighting factor for the definition of effective sectional averages across the brane
Gradient Formula for Linearly Self-Interacting Branes
305
rather more delicate than in the flat case dealt with in the preceding section. The kind of average that is ultimately relevant is a source density weighted mean taken not just over a (q − p)-dimensional orthogonal section through the brane but rather over the q-dimensional volume between neighbouring (q − p)-dimensional orthogonal sections through the brane. This distinction is irrelevant for a flat brane for which neighbouring sections are exactly parallel. However for a slightly curved brane, there will be a relative tilt between nearby orthogonal cross sections, which means that the effective weighting factor w{r} appropriate for the purpose of averaging over a (q − p)-dimensional crosssection will no longer be exactly proportional to the density ρ{r} as in the flat case. This needs to be adjusted by a correction factor allowing for the relative compression or expansion of the relevant q-volume due to the relative tilting. By summing over the different directions in which the tilting can occur, it can be seen that the weighting factor will be given, at first order, by w{r} =
1 − Ka r a + o{α} .
ρ {}{r} ρ¯
(91)
To obtain the corresponding generalisation of the formula (78) for the flat case, we must allow for the fact that the original formula (45) for the source density distribution also needs to be corrected to take into account the effect of the curvature. Using (88) to expand the distance function (22) as s 2 {x, σ } = r 2 + σ 2 − ⊥ab Kij a r b σ i σ j [1 + o{α}] ,
(92)
we obtain the Taylor expansion [q]
[q]
[q]
D{} {s } = D{} {r + σ } − 2
2
2
dD{} {r 2 + σ 2 } dσ 2
⊥ab Kij a r b σ i σ j [1 + o{α}] ,
(93)
for the profile function (23). Since we are neglecting corrections of quadratic and higher order in the curvature, it can be seen that when the integration is carried out in two stages of which the first is just to take the integration over the p-sphere of constant [p] radius σ centered on the origin, it is possible to replace the surface element dS in (20) [p−1] σ p−1 dσ as in (46). Furthermore, in the anisotropic contribution proportional by [p] to σ i σ j it will be possible to replace σ i σ j dS by the spherically averaged equivalent [p−1] σ p+1 dσ/p. Therefore, to allow for the effect of the curvature at first order, γ ij the density distribution (45) needs to be replaced by ⊥
ρ {} {r} = ρ¯ D{} {r 2 } − [q,p]
[p−1]
Ka r a + o{α}
p
∞
[q]
σ p+1 dσ
dD{} {r 2 + σ 2 }
0
dσ 2
,
(94) in which the second term can be evaluated using an integration by parts, using the ⊥ [q,p] definition (46) for D{} {r 2 } and the final result takes the simple form ⊥
[q,p]
ρ {} {r} = ρ¯ D{}
1 {r } 1 + Ka r a + o{α} . 2 2
(95)
306
B. Carter, R.A. Battye, J.-P. Uzan
Thus, it can be seen that the geometric curvature adjustment factor in (91) is partially canceled by the density adjustment factor in (95) to give the corresponding curvature adjusted weighting factor 1 ⊥ [q,p] 2 (96) w{r} = D{} {r } 1 − Ka r a + o{α} . 2 We note that as a corollary of this partial cancellation, the first order effect of the curvature is entirely canceled in the correspondingly weighted density function, which will be given simply by w{r} ρ {} {r} = ρ¯
⊥ [q,p]
2 D{} {r 2 } [1 + o{α}] .
(97)
Thus to first order in the curvature, the average density will still be given by the same form (79) as in the flat case.
9. Allowance for Curvature in the Potential Gradient {} /∂r a , vanishes. Thus, In the flat configuration, the average of the field gradient ∂ φ {} and ∂ 2 φ {} /∂r a ∂r b for which the dominant contributions, respectively givunlike φ en by Eqs. (74) and (77), are of zeroth order in the curvature, it will have a dominant contribution that is of linear order in the curvature. {} in terms of the generalised Starting from the general expression (26) for the field φ Green function, the strict analogue of the integral (93) for the source density ρ {} is given, after Taylor expanding the Green function, by [q] 2 dG {r 2 + σ 2 } σ [p−1] [q] {} {} {r} = ρ¯ σ p−1 dσG{} {r 2 + σ 2 } − Ka r a [1 + o{α}] . φ p dσ 2 0 (98) The truncation of the integration at the finite infra-red cut-off is only needed for the treatment of the marginally divergent case for which q = p +2, but makes no difference, at first order in / . In the convergent cases q > p + 2, the result will be effectively the same as would be obtained simply by taking the limit → ∞. The corresponding expression for the gradient is found, after an integration by parts, to be {} ∂φ
[q]
dG{} {r 2 + σ 2 } σ2 = ρ ¯ σ dσ 2 + K r − r K [1 + o{α}] . b a a ∂r a p dσ 2 0 (99) The appropriately weighted potential function required for the evaluation of the relevant average can now be seen from (96) and (98) to be given, after another integration by parts, by [p−1]
p−1
[q−1] ⊥
{} {r} = ρ¯ w{r} φ
b
[q,p]
D{} {r 2 }
0
σ p−1 dσ G{} {r 2 + σ 2 } [1 + o{α}] [q]
[q−1] ⊥
−ρ¯
D{} {r 2 }Kb r b [q,p]
p [q] 2 G {r + 2 }. 2p {}
(100)
Gradient Formula for Linearly Self-Interacting Branes
307
When q > p + 2, the asymptotic behaviour (38) implies that the boundary contribution tends to zero, while the integral converges toward the radial Green function, as defined in (49) so that we obtain an expression of the simple cut-off independent form ⊥
⊥ [q,p]
{} {r} = ρ¯ D {r 2 } G {r 2 }. w{r} φ {} {} [q,p]
(101)
In the marginal case q = p + 2, we obtain an expression of the asymptotic form
[q−3]
Ka r a ⊥ [q,q−2] 2 {} {r} = ρ¯ D , (102) {r } ln − w{r} φ {} q −2 2(q − 2)2 in which the boundary contribution at the end will provide a finite curvature adjustment term. However, since this adjustment term is an odd function of the radius vector r a it will still provide no net contribution to the integrated average, which will thus be given simply by ∞ [q−p−1] ⊥ [q,p] ⊥ [q,p] φ{} = ρ¯ r q−p−1 dr D{} {r 2 } G{} {r 2 } [1 + o{α}] , (103) 0
which to first order in α is just the same as in the corresponding formula (80) for the flat case. When we go on to work out the analogously weighted potential gradient distribution, it can be seen that the term with quadratic radial dependence in (99) will cancel out, leaving [q] dG{} {r 2 + σ 2 } σ2 p−1 σ dσ 2r − K [1+o{α}] . a a ∂r a p dσ 2 0 (104) Since the first term in the previous integral is an odd function of the radius vector r a , it provides no net contribution to the corresponding integrated average, which will therefore be given simply by ∞ {} ∂φ [q−p−1] ⊥ [q,p] [q,p] = ρ¯ Ka r q−p−1 dr D{} {r 2 } H {r 2 } [1 + o{α}] , (105) ∂r a 0
w{r}
{} ∂φ
[p−1] ⊥
= ρ¯
[q,p]
D{} {r 2 }
where [p−1]
H
[q,p]
{r } ≡ − 2
p
[q]
σ
p+1
dσ
0
dG{} {r 2 + σ 2 } dσ 2
.
(106)
Using an integration by parts, this integral can be rewritten in terms of the dimensionally ⊥ reduced Green function G{} [q,p] {r 2 } defined by (49) in the form
p [q] 2 1 ⊥ [q,p] 2 G{} {r } − G{} {r + 2 }. 2 2p [p−1]
H
[q,p]
{r 2 } =
(107)
As in (100), the boundary contribution at the end will vanish in the large limit in consequence of the asymptotic behaviour (38) provided the convergence condition q > p + 2 is satisfied. In the marginally divergent hyperstring case q = p + 2, the boundary term will tend to a limit having a finite value (namely −1/2p 2 ) that will still be negligible
308
B. Carter, R.A. Battye, J.-P. Uzan
compared with the first term in (107). Thus, whenever q − p ≥ 2 we shall always have a relation of the form H
[q,p]
{r 2 } =
1 ⊥ [q,p] 2 G {r } [1 + o{α}] . 2 {}
(108)
The only case to which this formula does not apply is that of a hypermembrane, meaning the hypersurface supported case characterised by q = p − 1, for which the [q,p] boundary term will be linearly divergent, like the corresponding integral for D{} {r 2 } as characterised by (51). We can still derive an asymptotic relation H
[q,q−1]
{r 2 } ∼
q − 2 ⊥ [q,q−1] 2 G {r } 2(q − 1) {}
(109)
as → ∞ even in this extreme case, but its utility is limited by the strongly cut off dependence of the quantities involved. It can be seen that it approaches agreement with the generic formula (108) when the space dimension q is very large. When (108) is substituted back into (105) one obtains a result that can immediately {} to be be seen by comparison with the formula (103) for the averaged potential φ expressible in terms of the latter by {} ∂φ 1 {} [1 + o{α}] , (110) = Ka φ a ∂r 2 a simple and easily memorable result whose derivation is the main purpose of this work. 10. Discussion The preceding work shows not only that the relation (110) holds as an ordinary numerical equality for the strictly convergent cases for which the co-dimension, (q − p), is greater than 2, but also that it holds as an asymptotic relation for the marginally divergent case of a hyperstring with q − p = 2, including the previously studied examples [1–3] of application to a string with p = 1 in the ordinary case of a background with space dimension q = 3. It is also clear from the preceding work that the formula (110) is not applicable to the case of hypermembrane, for which the co-dimension is given by q − p = 1. In this case, it can be seen from (109) that the factor 1/2 in (110) should in principle be replaced by a factor (p − 1)/2p. However it is debatable whether any useful information can be extracted in this case since the result is strongly dependent on the cut-off due to the “infra-red” divergence, and in practice this issue is of little importance because the particularly good ultraviolet behaviour of the hypermembrane case makes it easily amenable to other, more traditional, methods such as the use [11] of Israel Darmois type jump conditions. Dropping the explicit reminder that higher order adjustments would be needed if one wanted accuracy beyond first order in the ratio of brane thickness to curvature radius, the relation (110) translates into fully covariant notation as
1 {} = K µ φ {} . ⊥µν ∇ν φ 2
(111)
Gradient Formula for Linearly Self-Interacting Branes
309
It is to be emphasised that the generic relation (111) is a robust result, whose validity, whenever the co-dimension is 2 or more, does not depend on the particular of the choice {} was based. canonical regularisation ansatz on which the explicit formula (85) for φ It also shows that (111) is valid for any alternative non-canonical regularisation ansatz, whose mathematical properties would be less convenient for the purpose of explicit evaluation, but that might provide a more exact representation of the internal structure for particular physical applications in cases where information about this actual internal structure is available. The question that remains to be discussed is how this result generalises from static Poisson configurations, as considered here, to dynamic configurations in a Lorentz covariant treatment for which the Laplacian operator would be replaced by a Dalembertian operator. A priori, considerations just of Lorentz invariance imply that the only admissible alternative to the formula (111) would be a formula differing just by replacement of the factor 1/2 by some other numerical pre-factor. However, it must be taken into account that this numerical coefficient must match the coefficient that applies to any static configuration. It can thus be deduced from the postulate of consistency with what has been derived here that the formula (111) should be applicable to any p-brane with p + 1-dimensional timelike worldsheet in Lorentzian q + 1-dimensional background spacetime, for all values of p and q for which our present derivation applies, that is to say for q − 2 ≥ p ≥ 1. There are two interesting opposite extreme cases that are beyond this range. At one extreme, in the most strongly “infra red” divergent hyper-membrane case p = q − 1, the formula (111) needs a modified factor (p − 1)/2p. However, as discussed above, the quantities involved are, in this case, too highly cut off dependent for the result in question to have much significance, and anyway, as remarked above, their ultraviolet misbehaviour is so mild that hypermembranes can be satisfactorily treated without recourse to regularisation, so no analogue of the gradient formula is needed. On the other hand, at the opposite extreme, in the most strongly “ultra violet” divergent case, namely that of a simple point particle with p = 0, an appropriate regularised gradient formula is indeed something that is obviously needed. However, for a static configuration, the possibility of curvature does not arise at all in this “zero-brane” case, so that – while there is no reason to doubt the conjecture that it should still apply – there is no short cut whereby the formula (111) can be derived without further work, which will require recourse to the technical complications (due to the dimension sensitive nature of the time-dependent Green functions [12]) involved in a fully dynamical treatment.
Appendix: Integral Formulae and Notation The standard definition of the Euler Gamma function is provided by the integral formula {z} ≡
∞
e−t t z−1 dt.
(112)
0
When the argument has an integer value n it can be shown, using the obvious recursion relation {n + 1} = n{n}, that it satisfies the well known relations
310
B. Carter, R.A. Battye, J.-P. Uzan
{n + 1} = n!,
(2n)! √ 1 = 2n n+ π. 2 2 n!
(113)
In this article, we frequently need to evaluate integrals of the form ∞ y p−1 dy. Ip,q ≡ (1 + y 2 )q/2 0
(114)
Whenever p > 1 and q > p, this integral will be convergent and will be expressible – via a change of variables t = 1/(1 + y 2 ) – in the form p q −p 1 , , (115) Ip,q = B 2 2 2 where B{a, b} is an Euler integral of the first kind – also known as the Beta function – that is specified by the formulae 1 {a}{b} B{a, b} ≡ . (116) t a−1 (1 − t)b−1 dt = {a + b} 0 [q−1]
The values, as given by (12), of the surface area of a unit q-sphere are related for different values of q by the formulae " ! " ! [q−1] p2 q−p 2 p q −p 2 " ! = B = , , (117) [p−1] [q−p−1] 2 2 q2 and thus satisfy the useful relation [p−1] q p q −p q −p B , +1 = [q−p−1] . [q−1] 2 2 2
(118)
In order to facilitate the reading of the article, we conclude by providing Table 1 summarising the notation used for the profile, Green, and other related functions in the three cases considered in this article.
Table 1 General notation
Infinitely thin limit
canonical case
Potential
φ {}
φ
φ{}
Source term
ρ {}
ρ
ρ{}
Profile distribution
D{}
[q]
δ
[q]
δ{}
Green function
G{}
[q]
G
[q]
G{}
[q,p]
δ
⊥ [q,p]
⊥
Radial profile function Radial Green function
⊥
D{} G{}
Acknowledgements. R.A.B. is funded by PPARC.
[q,p] [q,p]
G
[q] [q]
[q,p]
δ{} ⊥
[q,p]
G{}
Gradient Formula for Linearly Self-Interacting Branes
311
References 1. Carter, B.: Electromagnetic self-interaction in strings. Phys. Lett. B404, 246–252 (1997) [hep-th/9704210] 2. Carter, B., Battye, R.A.: Non-divergence of gravitational self-interactions for Goto-Nambu strings. Phys. Lett. B430, 49–53 (1998) [hep-th/9803012] 3. Carter, B.: Cancellation of linearised axion-dilaton self-interaction in strings. Int. J. Theor. Phys. 38, 2779–2804 (1999) [hep-th/0001136] 4. Randall, L., Sundrum, R.: A large mass hierarchy from a small extra dimension. Phys. Rev. Lett. 83, 3370–3373 (1999) [hep-ph/9905221] 5. Davis, A-C., Vernon, I., Davis, S.C., Perkins, W.B.: Brane world cosmology without the Z2 symmetry. Phys. Lett. B504, 254–261 (2001) [hep-ph/0008132] 6. Carter, B., Uzan, J-P.: Reflection symmetry breaking scenarios with minimal gauge form coupling in brane world cosmology. Nucl. Phys. B606, 45–58 (2001) [gr-qc/0101010] 7. Cohen, A.G., Kaplan, D.B.: Solving the hierarchy problem with noncompact extra dimensions. Phys. Lett. B470, 52–58 (1999) [hep-th/9910132] 8. Gregory, R.: Nonsingular global string compactifications. Phys. Rev. Lett. 84, 2564–2567 (2000) [hep-th/9911015] 9. Gherghetta, T., Shaposhnikov, M.: Localizing gravity on a string-like defect in six dimensions. Phys. Rev. Lett. 85, 240–243 (2000) [hep-th/0004014] 10. Carter, B.: Dynamics of cosmic strings and other brane models. In: Formation and Interactions of Topological Defects, NATO ASI B349. A-C. Davis, R. Brandenberger, New York: Plenum, 1995, pp. 303–348 [hep-th/9609041] 11. Battye, R.A., Carter, B.: Generic junction conditions in brane-world scenarios. Phys. Lett. B509, 331–336 (2001) [hep-th/0101061] 12. Courant, R., Hilbert, D.: Methods of Mathematical Physics II: Partial Differential Equations. New York: Wiley, 1962, pp. 681–698 Communicated by H. Nicolai
Commun. Math. Phys. 235, 313–338 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0782-4
Communications in
Mathematical Physics
Non-Abelian Geometry Keshav Dasgupta, Zheng Yin School of Natural Sciences, Institute for Advanced Study, Einstein Drive, Princeton, NJ 08540, USA. E-mail:
[email protected];
[email protected] Received: 13 December 2000 / Accepted: 24 October 2002 Published online: 24 January 2003 – © Springer-Verlag 2003
Abstract: Spatial noncommutativity is similar and can even be related to the nonAbelian nature of multiple D-branes. But they have so far seemed independent of each other. Reflecting this decoupling, the algebra of matrix valued fields on noncommutative space is thought to be the simple tensor product of constant matrix algebra and the Moyal-Weyl deformation. We propose scenarios in which the two become intertwined and inseparable. Therefore the usual separation of ordinary or noncommutative space from the internal discrete space responsible for non-Abelian symmetry is really the exceptional case of an unified structure. We call it non-Abelian geometry. This general structure emerges when multiple D-branes are configured suitably in a flat but varying B field background, or in the presence of non-Abelian gauge field background. It can also occur in connection with Taub-NUT geometry. We compute the deformed product of matrix valued functions using the lattice string quantum mechanical model developed earlier. The result is a new type of associative algebra defining non-Abelian geometry. A possible supergravity dual is also discussed.
1. Introduction This paper is devoted to the search and study of certain unusual and hitherto unknown facets of noncommutative space from string and field theories and quantum mechanics. Introducing noncommutativity as a way of perturbing a known field theory has received much interest recently (see [1–3] and the references therein), and hence we shall refrain from repeating the usual motivations and excuses for doing it. Another reason lies in string theory itself. The antisymmetric tensor field B in the Neveu-Schwarz − NeveuSchwarz sector of string theory, while simpler than it cousins in the Ramond − Ramond sector, is still shrouded in mystery and surprisingly resistant to an unified understanding. One of its many features is its relation to spatial noncommutativity. Let us recall it briefly.
314
K. Dasgupta, Z. Yin
Open strings interact by joining and splitting. This lends naturally to the picture of a geometrical product of open string wave functionals that is clearly noncommutative. One may formulate a field theory of open strings based on this noncommutative product the same way as conventional field theory is formulated on products of wave function fields [4]. But the string wave functional is unwieldy and its product is enormously complex. Noncommutativity certainly does not help. To learn more we have to do with less. One way is to truncate the theory to a low energy effective theory of the small set of massless fields. Another is to approximate the string by a minimal “lattice” of two points. This is especially well suited to mimicking the geometric product of open strings. It emerges from both approximations that, at least using some choice of variables, the natural product of wave function fields is the following noncommutative deformation of the usual one: x =x =x ı ∂ µν ∂ ( ∗ )(x) = exp (x )(x ) . (1.1) 2 ∂x µ ∂x ν The parameter of noncommutativity is expressed in terms of the spacetime metric1 G and B by −1 . (1.2) = −(2πα )2 G−1 BG−1 1 − (2π α )2 BG−1 BG−1 It should be noted that noncommutativity is not a consequence of B being nonvanishing or large. It is intrinsic to the geometry of smooth string junctions that a canonical product exists for the functions on the space of open paths in the target space manifold with the appropriate boundary condition, and that product is noncommutative. The approximations mentioned above induces noncommutativity in the algebra of functions on the submanifold to which the end points are restricted, namely the D-brane. The algebra actually becomes commutative in the limit of very large B! It is a glaring deficiency of the present understanding from string theory that one knew only how to deal with a constant and flat B field. Introducing curvature for B takes string theory away from the usual sigma model to a rather different realm, so understanding it fully seems to call for a drastic conceptual advance. On the other hand, a varying but flat B field should be accessible by the available technology but is hampered by technical difficulties. For example, a formal construction of a noncommutative product using an arbitrary Poisson structure in place of the constant has been given by Kontsevich [6]. The construction made essential use of a degenerate limit of the sigma model [7]. But the result employs some very abstruse mathematics and its convergence properties are essentially unknown. Behind the complication must lie some interesting and novel structures that need to be deciphered. We propose, as a first step toward understanding such a situation from string theory, probing it with multiple parallel D-branes configured in a way such that each D-brane only senses a constant but respectively different B field. On each D-brane the usual noncommutative algebra incorporates the effect of the locally constant B field without knowing that B is actually varying. The latter is revealed in the communication among different D-branes via the open strings that start and end on different D-branes. We study the wave functions associated with such “cross” strings and find that their product is deformed in a new and intriguing way that retains associativity. Along the same line of 1 Note that G in this paper and in [5] is the same as the “closed string metric” g in [3], and the noncommutativity parameter here is the same as there.
Non-Abelian Geometry
315
reasoning as in [3], one expects that it is in terms of this product that the effect of the B field is best described at the zero slope limit. As D-branes are dynamical objects inclined to fluctuate, this picture is necessarily an idealization, describing the limit where the effect of such fluctuation is very small. It would be worthwhile to study quantitatively the corrections due to such fluctuation. One can better appreciate the import of this new deformation of algebra by recalling another player. It is an essential feature of a D-brane that it has a gauge symmetry and an associated gauge connection. Let us briefly recollect some of the well known facts relevant here. In the simplest and most common circumstances, a single D-brane has an U (1) symmetry, and a multiplet of N D-branes on top of each other collectively have an U (N) symmetry, with fields that are N × N matrices transforming in the adjoint representation of this U (N ). When the algebra of functions on the D-brane submanifold is deformed, so is the gauge symmetry. For U (1), the new Lie algebra is given by the commutator of the deformed product and is no longer trivial. For U (N ), the product of the matrix valued functions becomes j Mji ∗ Nk (x). (1.3) (M ∗ N )ik (x) = j
And the deformed Lie algebra originates from the commutator of this new matrix algebra. In this deformation the noncommutativity of spacetime and the non-Abelian property of multiple D-branes are simply and independently tensored together and do not affect each other, yet. It has long been known that the U (1) (trace) part of the field strength F always appear together with B as (B − FU (1) ). A certain gauge symmetry actually connects the two. Therefore F also contributes to noncommutativity and appears in the expression for by replacing B with (B − tr F /N). What about the non-Abelian part of F ? Let us consider a constant background for F , as varying F is again too difficult. For this constraint to make sense it has to be U (N ) covariant, i.e. it should be covariantly constant. Hence we also choose F so that different spatial components of F are in some Cartan subalgebra of N × N matrices. By a choice of basis we can make them all diagonal. Such background generically breaks U (N ) down to U (1)N and it is meaningful to talk about N distinct branes, each with a constant field strength of its own unbroken U (1). This poses the same problem as the early configuration of multiple D-branes probing transversally varying the B field but with a different interpretation2 . Now we are dealing with an intrinsically non-Abelian deformation of the matrix product on the D-branes. As it turns out, this new product is no longer the simple tensoring of the star product (Eq. 1.1) and the usual matrix algebra. The noncommutative “real” space and the non-Abelian internal space mix and become inseparable. We call this Non-Abelian Geometry. In this work, we have found a large class of examples of this new geometry by considering non-trivial D-brane configurations with non-Abelian gauge field content and/or under the influence of varying B field. The concrete form of the product are derived from a lattice approximation of string theory [5] in Sect. 2 and 3, and presented below. There are different ways to express the product, corresponding to different choices of operator ordering. With the “symmetric” ordering defined and used throughout Sect. 2, 2 One difference is that in the configuration with varying flat B field, an open string stretching between two D-branes would have a mass offset proportional to the separation between them. It’s possible to take a special limit for the components of the close string metric along the separation of the D-branes to make the offset vanish. However, here we are only studying the kinematics encoded in the algebra connecting all the (i, j ) strings and this offset is irrelevant.
316
K. Dasgupta, Z. Yin
the geometry is defined by an algebra with the following product. (
∗ )ij (x)
≡
l
∂ ı ∂ µν exp 2 ∂x µ il;lk ∂x ν
x =S ik x ij
li (x )lk (x )
(1.4)
, x =Sjikk x
with S and satisfying (Sect. 2.2) j j
j j
Sj13j24 = Sjk11jk22 Sk13k24 ;
(1.5)
j j
j j
Sj13j24 j3 j4 ;j5 j6 = j1 j2 ;j5 j6 = j1 j2 ;j3 j4 Sj53j64 . j j
Here no summation over Latin (Yang-Mills) indices takes place, but each Sj13j24 and j1 j2 ;j3 j4 is a matrix with (suppressed) space-time indices and the products in (Eq. 1.5) are understood as their matrix products. With the split ordering introduced in Sect. 3, the product takes the form ( × )ij (ea , eA ) ≡
k,l
∂ ∂ ˆ Aa ki (ea , M A ) exp ı ∂M A ∂M a
M A =0=M a lj (M a , eA ) . kl
(1.6) ˆ Aa is a matrix with Latin indices k and l, the exponential is the usual exponenHere tial of a matrix. In this paper we shall derive (Eq. 1.4) and (Eq. 1.6) from certain physical backgrounds in string theory, so they appear in concrete and specific context with motivation from string theory. However, we should emphasize here that with the forms of the products now known, one can and should consider them independently and abstractly as examples of non-Abelian geometry. And one ought to look for their presence in other contexts as well. In our derivation of (Eq. 1.4) the parameters S and take on values determined by the physical background in our setup. However, (Eq. 1.4) stands as a valid definition of an associative product as long as (Eq. 1.5) holds. Similarly, we have derived (Eq. 1.6) in Sect. 3 for the case of SU (2), but this form of product applies more ˆ Aa . Unlike the symmetric ordering, here ˆ is already “gauge generally for arbitrary kl fixed” and requires no further constraint for (Eq. 1.6) to be associative. The two forms of multiplication should be related to each other by a change of ordering. This is clear in the examples discussed in this paper, although we have not worked out the general and explicit transformation relating the two in this paper. Finally we note that even in such general forms, they only represent a certain class of non-Abelian geometry. We propose an unifying perspective for the general program of non-Abelian geometry at the end of Sect. 3. The explicit form of this new geometry in its full generality is a very interesting problem still under investigation. At this point one may well consider other approaches to generalizing D-brane geometry. One very interesting approach in recent time has been the efforts to study the geometry of D-branes with vector bundles in Calabi-Yau manifolds ([8] and the references therein). There one takes the D-brane wrapping supersymmetric cycles in Calabi-Yau manifolds and the vector bundle on the D-branes as a whole and study their properties in relation to target space supersymmetry and mirror symmetry. It would be very relevant to fully reconcile these two facets of D-branes: the vector bundle aspects of D-branes steeped in conventional commutative geometry as one, and the non-commutative geometry that the open string fields see as the other. However, non-commutativity introduces such tremendous technical challenges in curve space that novel and powerful
Non-Abelian Geometry
317
methods and concepts seem necessary to tackle it. This paper provides one possibility in dealing with a varying anti-symmetric tensor B field. It might prove helpful in dealing with a nonconstant metric through various correspondences such as T-duality, which relates the metric and the B-field. Here is an outline of the paper. The non-Abelian noncommutative product is explicitly constructed in Sect. Two. It turns out that a two point lattice approximation to quantum mechanics is perfectly suited for this purpose. A systematic methodology of computing the product was developed in [5]. We review and elaborate it in Sect. 2.1. In Sect. 2.2 we apply it to the most general case of non-Abelian noncommutativity and obtain the main result of the paper (Eq. 2.55). In Sect. 3 we then turn to the specific case of the deformation parameter being in the adjoint of SU (2) and use a variation of the method presented in Sect. Two. The result is a surprisingly compact and highly suggestive form of the new product (Eq. 3.16). The situation of multiple noncommutativity parameters also makes an appearance in connection with Taub-NUT geometry3 . We shall discuss in Sect. Four how a whole class of Lorentz non-invariant theories governed by the B field dynamics can be studied in an unified way. We conclude with a discussion on the possible gravity dual for the system we study as well as some other related issues.
2. Construction of the Non-Abelian Noncommutative Product 2.1. Review and elaboration. The origin of noncommutativity. A classic and salient feature of string theory is its geometric appeal. For example, strings interact by smoothly joining and splitting. In conventional field theories, one can visualize an interaction of particles as a vertex of intersection by propagators in a Feynman diagram. The well known rule from perturbation theory states that each term in the interaction Lagrangian gives rise to a distinct kind of such vertex. Interaction at a point corresponds to the product of fields at the same point. The rule of string theory perturbation is entirely analogous. However, the algebra of the product, besides being obviously much more complicated, has a new twist. Consider the joining of two or more open strings into one. It should be apparent that this process is not commutative though still clearly associative. The multiplication between the wave functionals of the open string, also known as the open string fields, share the same property4 . Intuitively, the product seems easy to define. Let [γ ] and [γ ] be two string wave functionals, where the argument γ is an open path in the target space with the proper boundary conditions. The geometric product defined above can be written as ( ∪ ) [γ ] =
D[γ1 ]D[γ2 ] δ[γ = γ1 γ2 ] [γ1 ] [γ2 ].
(2.1)
The operation is just the geometric process of “joining” defined above, with a refining sensitivity to sign and orientation so that a segment that backtracks itself also erases itself. This “definition” manifests noncommutativity5 and associativity, but it is also 3 For branes near a conifold, non-Abelian noncommutativity makes an appearance in the fractional brane setup [9]. 4 This observation was made clear in [4], where one can also find relevant graphic illustrations. 5 Even though we have made no mention of B!
318
K. Dasgupta, Z. Yin
horribly divergent and ill defined. One can remedy this with an elaborate procedure [4] but there is an alternative way to make sense of this product, if one is willing to forgo the bulk of the data encoded in the string field in exchange for a better understanding of what remains. Before we do this first recall that the standard string action is 1 µ ν S= d 2 σ Gµν X˙ µ X˙ ν − X X 4πα µ ˙ + dτ Aµ (X)X − dτ Aµ (X)X˙ µ . (2.2) ∂2
∂1
Here the subscripts “2” and “1” on ∂ label the “left” and “right” boundaries of the open string worldsheet. G is the background closed string metric and assumed to be constant. 1 ˙ µ ν − X µ X˙ ν in the action. However Usually there would also be a term 4πα Bµν X X we would only be dealing with a flat B field, and in R D flat B is exact and equal to dA for some A . We henceforth include −A implicitly in A so that dA = F − B ≡ F. Let us be careful with boundary conditions from now on. To solve the equation of motion we need to impose one for each boundary component. We want the two ends of the strings to move only within two possibly distinct but parallel D-branes of the same dimensions. For the purpose at hand, we will only be concerned with coordinates that parameterize the D-branes’ worldvolume under the influence of a nondegenerate F and ignore from now on all the other coordinates, including those along which the D-branes separate. We shall only consider situations for which this space is R D . The boundary condition for the relevant coordinate fields is 1 ν Gµν X = Fµν X˙ ν . 2πα
(2.3)
For the problem to be tractable using the method in this paper, we also require F to be constant. Note that since F is evaluated only at the ends of the strings, confined to the D-branes, this requirement only enforces the constancy of the pull-back of B to the D-branes. B may vary in directions transverse to the D-branes, or have components not entirely parallel to the D-branes that vary. Indeed the flatness of B correlate the last two kinds of variations. In this subsection let us consider the case of a single D-brane, so there is only one constant F. We will return to the general case in the next subsection. Now we approximate the spatial extent of the open string by the coarsest “lattice” of two points, namely the two ends, labeled by 1 and 2. Let the width of the string be 2/ω. The action (Eq. 2.2) is approximated by6 1 ω2 2 2 2 ˙ ˙ X1 + X2 − S = dτ (X2 − X1 ) 4πωα 2 µ µ
+ dτ Aµ (X2 )X˙ − Aµ (X1 )X˙ . (2.4) 2
1
We shall call this system lattice string quantum mechanics (LSQM). The boundary conditions now become [5] Dµ1 ≡ [G(X1 − X2 )]µ + 6
A similar but different model appeared in [10].
4π α [F X˙ 1 ]µ ∼ 0; ω
Non-Abelian Geometry
319
Dµ2 ≡ [G(X2 − X1 )]µ −
4π α [F X˙ 2 ]µ ∼ 0. ω
The result of canonical quantization with constraints is 7 −1 µν µ ν X2 , X2 = −ı(2πα )2 G−1 FG−1 1 − (2π α )2 FG−1 FG−1 µ ≡ ıµν = − X1 , X1ν ; µ ν X1 , X2 = 0.
(2.5)
(2.6)
These are precisely the commutation relations for the ends of the string found in [11]. Matrix, Chan-Paton factor, and noncommutative product. We are now only one step from the deformed product (Eq. 1.1). It is here that the LSQM approach distinguishes itself for its conceptual and technical advantage. Now that the entire continuum of the open string is distilled down to two points, the above mentioned ∪ joining of two oriented paths into one reduces to the merging of two ordered pairs of points, with the second (end) of the first pair coinciding with and “cancelling” the first (start) of the second pair: (x1 , m) (m, x2 ) = (x1 , x2 ). This induces a product ∗ of two wave functions of the lattice string, entirely analogous to (Eq. 2.1): (2.7) ( ∗ ) (x1 , x2 ) = dm1 dm2 δ(m1 − m2 )(x1 , m2 )(m1 , x2 ). If this seems reminiscent of an ordinary matrix product, it is no illusion. One can think of an index on a matrix as a coordinate parameterizing some discrete space8 . Since a matrix carries two indices it is the wave function of a lattice string moving on this discrete space and its ∗ product would simply be the standard matrix multiplication. Distinguishing between the contravariant and covariant indices corresponds to distinguishing the two ends of the (lattice) string by a choice of orientation. On the other hand, attaching discrete indices to string ends is none other than introducing Chan-Paton factors. In this light, the noncommutativity of open string field and of non-Abelian gauge symmetry are not just similar in their failure to commute but have a shared geometric origin and interpretation! Now we return to LSQM (lattice string quantum mechanics). Its salient feature, reviewed shortly, is the truncation of the noncommutative string field algebra down to a closed noncommutative algebra of (wave) functions on the target space. The latter is something much simpler and easier to study than the full open string algebra and still carries nontrivial information, especially the effects of the B field. The known noncommutative algebra found this way is a deformation of the “classical” commutative algebra of functions. It modifies the U (1) gauge symmetry of a single D-brane experiencing this B field into a deformed one corresponding to the group of unitary transformations in a certain Hilbert space. When multiple D-branes are present so that U (1) is replaced by the non-Abelian U (N ), the U (N ) group as well as the N × N matrix algebra is also modified. The new algebra is just the tensor product of the matrix algebra and the 7 In comparing the results summarized here with [5] one should note that D i defined here is equal to Gµν Ciν in [5], and that there is a typographical error of a missing (−1) in front the expression on the third line in (Eq. 2.12) of [5], and another (−1) on the exponent of the second parenthesis in the same expression. 8 The precise connection is the discrete “fuzzy” torus explained later in footnote (13) in Sect. 3.
320
K. Dasgupta, Z. Yin
deformed noncommutative algebra of the scalar functions. No essential difference in the noncommutativity of the space is introduced by having non-Abelian gauge symmetry. This then begs the question: is there some other deformation in which the discrete internal space can become fully entangled with the (noncommutative) “real” space so that it is impossible to separate them. The answer, we shall propose, is yes. The condition, we shall show, is that the background parameter for noncommutativity is in an appropriate sense non-Abelian. This can be due to a non-U (1) background for the gauge field or a varying flat B configured in the manner prescribed above. Defining the product. Equation 2.7 almost entirely defines the rule for making product. We still have the trivial freedom of changing the overall normalization by a constant factor, which we will fix later. Yet that equation would seem to be applicable to functions on the square of R D rather than R D itself. Fortunately, (Eq. 2.6) says that although we start from 2D canonical coordinates in the LSQM, constraint (Eq. 2.5) reduces the size of a complete set of commuting observables to only D, the right number for a wave function to be defined on R D itself. At each of the two ends, there are only D/2 commuting observables. Let us make some choice and call them E1a and E2a , a = 1 . . . D/2, where the subscript labels boundary components. Together they form a complete set of commuting observables. We shall call it the “aa” representation which diagonalizes simultaneously E1a and E2a with eigenvalues e1a and e2a respectively. For wave functions aa (e1a , e2a ) and aa (e1a , e2a ) in this representation, the adaptation of (Eq. 2.7) is immediate and obvious: a a (2.8) ( ∗ )aa (e1 , e2 ) ∝ dma1 dma2 δ(m1 − m2 )aa (e1a , ma2 )aa (ma1 , e2a ). The proportionality sign here signifies that we have yet to specify the overall normalization, which scales the right-hand side of (Eq. 2.8) by a constant factor. We will fix it later by relating to the usual commutative product. This product is natural also from the point of view of the LSQM. X1 and X2 commute with each other. Therefore the left and right ends decouple and the Hilbert space for the LSQM is the tensor product of the Hilbert spaces H1 and H2 of two ends respectively9 . Furthermore, the operator algebra in H1 and H2 are generated by the same set of observables Xµ , but from (Eq. 2.6) their commutator is exactly opposite in sign. This canonically correlates them as a complex conjugate pair of representations of the same operator algebra. To see this, choose a basis of R D so that is brought to the canonical form: 0 I J = . (2.9) −I 0 Let a, b, . . . enumerate the first D/2 coordinates and A, B, . . . the rest. Then (Eq. 2.9) can be written more compactly as: D
J aA = δ a+ 2 ,A = −J Aa , J ab = 0 = J AB .
(2.10)
In the aa representation the E a ’s are simultaneously diagonalized while the E A ’s are implemented as differentiations: ∂ ∂ a α + ı (e ) , E1A = −ıJ aA 1 1 ∂e1a ∂a 9
Subtlety might arise for other topology but at least for R D this factorization holds.
Non-Abelian Geometry
321
E2A = ıJ aA
∂ ∂ a α + ı (e ) . 2 2 ∂e2a ∂a
(2.11)
Here α1 and α2 are just the usual phase ambiguity in canonical quantization. We can naturally identify e1 and e2 by identifying the wave functions in H2 to H1 after complex conjugation, and requiring α1 = −α2 . Thus a ket in H2 is a bra in H1 and vice versa. The product (Eq. 2.8) can be rewritten as (|α ⊗ β|) ∗ (|θ ⊗ ρ|) ∝ (β||θ )(|α ⊗ ρ|).
(2.12)
Although this product has manifest noncommutativity and associativity, the wave functions are not functions on the target space and there is no parameter visible that controls the noncommutativity. This is a fitting time to remember that an associative algebra has both additive and multiplicative structures but (Eq. 2.8) defines only the latter. We want our algebra to be a deformation, in its multiplicative structure, of the algebra of functions on R D , so it should be identified with the set of functions on R D as a vector space. Wave functions in the aa representation clearly does not suit this purpose. We need to find an “aA” representation in which a set of D observables that can pass as coordinates on R D are simultaneously diagonalized. That is tantamount to requiring the action of the translation generator P on them should be what is expected of R D coordinates. We call them geometric observables. For example, in the LSQM above P = −1 (X1 − X2 ). A particularly symmetric µ choice for the geometric observables is simply the center of mass coordinates Xc of the lattice string system: Xcµ = such that
1 µ µ (X + X2 ) 2 1
Xcµ , Pν = ıδνµ .
(2.13)
(2.14)
We can rewrite the product (Eq. 2.8) in terms of functions of the eigenvalue x of Xc by using the change of basis function10 ı 1 x||e1a , e2a = δ x a − (e1 + e2 )a exp − x A JAa (e2 − e1 )a 2 4
(2.15)
and find that, in terms of , (Eq. 2.8) is explicitly given by x =x =x ∂ ı µν ∂ (x )(x ) . ( ∗ )(x) = exp 2 ∂x µ ∂x ν
(2.16)
Here we have fixed the overall normalization mentioned before by requiring it to reproduce the usual commutative product when = 0. 10 |x is an eigenstate of X and |ea , ea (shorthand for |ea ⊗ ea |) that for E a and E a . We have also c 1 2 1 2 1 2 made a convenient choice for the phase for the basis wave function of these representations.
322
K. Dasgupta, Z. Yin
2.2. Non-Abelian deformation. Now we come to the main task of this paper and consider the possibility of more than one noncommutativity parameter. For each such parameter we can define the ∗ product above and have a distinct algebra. Let us assign labels ranging from 1 to N to this group of parameters i . We denote elements of the i th algebra by functions labeled such as ii , satisfying x =x =x ∂ ∂ ı (ii ∗ ii )(x) = exp . µν µ ν ii (x )ii (x ) 2 ∂x ∂x
(2.17)
The reductionist view of what we want to do is to find a way to glue these algebras together cogently into one unified algebra. For that we now return to string theory for intuition. In string theory the above situation can arise in a configuration of N D-branes with different but constant F on each of them. From the discussion of the last subsection, this can happen in an arbitrary combination of two scenarios. The first, already explained before, is a background of flat but varying B field configured in such a way that (only) the pull-back of B to each D-brane is constant through it. The second scenario is a background gauge field that is constant but breaks the U (N ) gauge symmetry. It is not in general meaningful to talk about constant non-Abelian curvature because it would normally not satisfy the equation of motion or the Bianchi identity, but if all the spatial components of the curvatures are in some Cartan subalgebra then everything is fine. For U (N) this amounts to being able to diagonalize all spatial components of the curvature as N × N matrices. This breaks U (N ) down to U (1)N and gives the interpretation of N D-branes each with a distinct and constant U (1) background. The (i, i) strings on each of the D-branes are now complemented by (i, j ) strings, which start on the i th brane and end on the j th D-brane11 . Consider wave functions ji in the lattice string quantum mechanics approximating to the (i, j ) string. The Hilbert space is a tensor product of Hi ⊗ Hj∗ and the product rule of the whole algebra is generated by (|α i ⊗ β|j ) ∗ (|θ j ⊗ ρ|k ) = (β|j |θ j )(|α i ⊗ ρ|k ).
(2.18)
Written in terms of matrix valued functions and on R D , this is j ji ∗ij k k (x). ( × )ik (x) ≡
(2.19)
j
Note that in (Eq. 2.18) the product seems to depend only on j , but one has to write the final form (Eq. 2.19) in the aA representation. In general that would mean ∗ij k depends on all three indices. Our goal is to calculate ∗ij k . Preparation and Notations. Again each brane is labeled by the index i, j, . . .. Let us denote the F on the i th D-brane by F i . One repeats the same procedure of constrained quantization. This time one finds the Poisson brackets of the constraint D’s are Dµ1 , Dν1 = −4(2πα )2 Fi 1 − (2π α )2 Fi G−1 Fi G−1 , µν
In [3] a system of D0-D4 was studied with the 0 − 4 strings having mixed boundary conditions. These strings complement the 0 − 0 strings and the 4 − 4 strings to produce a bigger algebra. 11
Non-Abelian Geometry
323
, Dµ2 , Dν2 = 4(2πα )2 Fj 1 − (2π α )2 Fj G−1 Fj G−1 µν Dµ1 , Dν2 = 2(2πα )2 Fi − Fj µν .
(2.20)
For i = j , D 1 and D 2 no longer commute. This would translate to X1 and X2 not commuting with each other and would impede the program we have developed for constructing the product. However, we can take the zero slope limit employed in [3], in which α → 0 while F and (2πα )2 G−1 remain finite. After taking the limit, one finds that µ ν µν i , X1µ , X1ν = −ı µν X2 , X2 = ıj , µ ν X1 , X2 = 0, (2.21) where
i = (F i )−1 .
(2.22)
We can always, through a congruence transformation, turn an F into the following canonical form: i = Ti J Ti , (2.23) where J =
0 −I
I . 0
(2.24)
It shall become convenient to use the following symbols: (1) Uij ≡ Ti Tj−1 ;
(2) M j1 j2 ≡ Tj−1 + Tj−1 ; 1 2 j1 j2 ) J M j3 j4 (M (3) F j1 j2 ;j3 j4 ≡ − ; 4 −1 ; (4) j1 j2 ;j3 j4 ≡ F j3 j4 ;j1 j2
(2.25)
(5) Sj13j24 ≡ (M j1 j2 )−1 M j3 j4 . j j
They are related to each other and j by
(1) F j1 j2 ;j3 j4 = −F j3 j4 ;j1 j2 ; (2) F j = F jj ;jj ; (3) j = jj ;jj ; (4) (5)
j j Sj13j24 = j1 j2 ;k1 k2 F k1 k2 ;j3 j4 j j Sj13j24 j3 j4 ;j5 j6 = j1 j2 ;j5 j6
(2.26) j j = Sjk11jk22 Sk13k24 ; j j = j1 j2 ;j3 j4 Sj53j64 .
Note that because we will deal with a plethora of indices we shall suppress spatial indices unless doing so will cause confusion. Repeated gauge indices i, j are not summed over unless stated otherwise explicitly, but repeated spatial indices are always summed over implicitly. The situation should be obvious from the context. Coordinates are arranged into column vector, or row vectors after transposition.
324
K. Dasgupta, Z. Yin
In search of a center. In this subsection we figure out the geometric observables for the µ (i, j ) dipole. That is, we want operators Xc such that µ Xc , Pν = ıδνµ , (2.27) where the translation generator Pµ for the dipole system described by (Eq. 2.4) is P = F i X1 − F j X2 in the zero slope limit taken earlier. P satisfies the property i j Pµ , Pν = −ı(Fµν − Fµν ) ≡ −ıµν .
(2.28)
(2.29)
Therefore they are like covariant derivatives and we require them to be as such: P ≡ + A, where µ = −ı and
∂ µ, ∂Xc
ν − ∂ν A µ = µν . ∂µ A
(2.30)
(2.31) (2.32)
suffers the usual phase ambiguity and we choose a linear gauge The definition of A = − 1 ( + )Xc , A 2
(2.33)
where is a symmetric matrix and pure gauge. It will be fixed later for convenience. There are an infinite number of choices for operators satisfying (Eq. 2.27). Let us for now look for one as close to the center of mass X1/2 = as possible. Alas
so X1/2
1 (X1 + X2 ) 2
ı µν µ µν ν = − (i − j ) ≡ ı∇ µν , X1/2 , X1/2 4 itself does not suffice. Therefore we define Xc indirectly so that
(2.34)
X1/2 = Xc + .
(2.35)
(2.36)
Then and can be found by substituting (Eq. 2.31) and (Eq. 2.36) into the commutation relation known for X1/2 and P : 1 µ ıδνµ = X1/2 , Pν = ı − ( − ) (2.37) 2 and
µ ν ı∇ µν = X1/2 , X1/2 = ı( − ).
(2.38)
It thus follows that is related to by 1 = I + ( − ) . 2
(2.39)
Non-Abelian Geometry
So
325
− + = ∇.
(2.40)
is thus related to a matrix γ satisfying
through
γ γ = − ∇
(2.41)
γ = 1 + .
(2.42)
One can then solve for Xc and find that
Xc = (1 + )−1 X1/2 − P 1 1 −1 i j = (1 + ) + F X1 + − F X2 . 2 2
(2.43)
2.2.1. Solving for ij . The matrix Uij defined in (Eq. 2.25) satisfy i = Uij j UijT .
(2.44)
Uij Uj k = Uik .
(2.45)
as well as the cocycle condition
From the requirement that the Xc ’s commute among themselves it follows that (1 + 2F i )−1 (1 − 2F j ) also satisfy the condition (Eq. 2.44). We use this to find a solution for ij in terms of Uij : −1
1 = 1 − Uij F i Uij + F j (2.46) 2 so that Uij = (1 + 2F i )−1 (1 − 2F j ). (2.47) Then one can show that −1 1 i + F Ti = (Ti−1 + Tj−1 )−1 (1 + ) 2 −1 1 − F j Tj = M ij = (1 + )−1 , 2
(2.48)
where M ij has been defined in (Eq. 2.25). This allows us to find a simple expression for Xc that we will use shortly: M ij Xc ≡ (E1 + E2 ), where
E1 = Ti−1 X1 ,
E2 = Tj−1 X2 .
The E’s are convenient because µ ν µ E2 , E2 = J µν = − E1 , E1ν ,
(2.49) (2.50)
326
K. Dasgupta, Z. Yin
µ E1 , E2ν = 0.
(2.51)
For Xc to be defined, M has to be nondegenerate, which means Uij has no eigenvalue equal to (−1). Actually it might very well happen that for certain given Fi ’s, for instance the SU (2) case that we shall consider later, Uij does have eigenvalue to (−1). However, Uij is only defined up to Uij → Ti Si (Tj Sj )−1 , where Si and Sj are Sp(D) transformations. It is easy to show that one can always find suitable S’s so that M is nondegenerate. 2.2.2. The product. In computing the actual product, the key step is the change of basis functions between the aA and aa representations. We now find them for the (i, j ) string. We choose to diagonalize E1a and E2a in the aa representation, a ranging from 1 to D2 .
The first D2 component of 21 M ij Xc is 21 E1a + E2a . The canonical conjugates of the rest
of the D2 components are E1a − E2a . To find how the latter are represented, substituting the expressions for and Xc in (Eq. 2.30), we find that =P + =−
1 ( + ) Xc 2
M J (E1 − E2 ) + + Ti−1 J Tj−1 − Tj−1 J Ti−1 (2M)−1 (E1 + E2 ) . 2 (2.52)
Now we fix the gauge choice by requiring that the second term in the last expression vanish. Thus (E1 − E2 )a are represented purely as derivatives with respect to (Mx)A . Therefore the change of basis matrix element between the (e1a , e2a ) basis and the Xc basis is √ a e1 + e2a |M| (M ij x)a a a aA 1 ij A δ exp −ı(e − e )J ( x) M − . e1a , e2a ||x = 1 2 D 2 2 2 22 (2.53) The determinant and powers of two appear as a result of the different normalization c between the Xc basis and the MX 2 basis. They will not matter in the end. Now we are finally ready to compute the star product. j j (ji ∗ k )(x) ∝ de1a de2a x||e1a , e2a (ji ∗ k )(e1a , e2a ) j = dx dx de1a de2a dma1 dma2 ji (x )k (x )δ(ma1 − ma2 ) a a a a a a ×x||e 1 , e2 e1 , m2 ||x m1 , e2 ||x |M ij M j k M ik | j dx dx ji (x )k (x ) (2.54) = D 2 2 ı × exp (x (F ik;ij x − F ik;j k x ) + x F ij ;j k x ) 2 x =S ik x ij D ı ∂ |M ik | ∂ j i =22 exp (x ) (x ) . ij ;j k j k ik |M ij M j k | 2 ∂x ∂x x =Sj k x
Non-Abelian Geometry
327
Again we fix the normalization by requiring the recovery of the usual matrix product when all the i ’s vanish. Hence we would get
(ji
j ∗ k )(x)
∂ ı ∂ µν = exp µ ij ;j k 2 ∂x ∂x ν
x =S ik x,x =S ik x ij jk
j ji (x )k (x )
.
(2.55)
For plane waves, this translates to ı exp (ık1 x)∗ij k exp (ık2 x) = exp − k1 ij ;j k k2 exp ı(k1 Sijik + k2 Sjikk )x , (2.56) 2 which is the desired product. By using (Eq. 2.26), one can show that (exp (ık1 x) ∗ij k exp (ık2 x)) ∗ikl exp (ık3 x) = exp (ık 1 x) ∗ij l (exp (ık2 x) ∗j kl exp (ık3 x)), 1 = exp − ı(k1 ij ;j k k2 + k2 j k;kl k3 + k1 ij ;kl k3 ) 2 il × exp ı(k1 Sijil + k2 Sjilk + k3 Skl )x ,
(2.57)
thus proving associativity. 3. The Case of SU (2) In this section we deal with the simplest instance of non-Abelian geometry: N = 2 and the deformation parameter is in the adjoint of SU (2). That is to say the noncommutativity parameter on brane 1 is but that on brane 2 is −. For this situation one may certainly apply the method developed in the previous section again. But we shall take this opportunity to consider a variation and illustrate the meaning of the large degree of freedom in choosing the geometric observables mentioned earlier. For simplification of notation, we can, by means of a congruence transformation, bring to its canonical form J and shall work in this basis till near the end of this section. We call the coordinate observables on the left and right ends of the string Lµ and R µ respectively. Unlike the previous section, where E2 is generically in a different parameterization of R D from E1 , related by some linear transformation, here R is the same parameterization as L. Therefore [Lµ , Lν ] and [R µ , R ν ] are exactly opposite in sign on a 11 string, but identical on a 12 string. 3.1. Split ordering. The Moyal-Weyl product can serve as a method of quantization, i.e. mapping a function on the phase space (in our case, R D ) to an operator to the Hilbert space of a quantum mechanical model. As usual there is the ambiguity of operator ordering, and the Moyal-Weyl product makes a symmetric choice. There are other orderings, and they can also be obtained by variation of the method developed in [5] and reviewed in the last section. Recall that to represent states in the LSQM Hilbert space as (wave) functions on R D , we had to choose a set of D geometric observables, simultaneously diagonalized in the aA representation. The action on them by the generator of translation should be what one expects for coordinates being translated. In the last section, Xc ’s are the geometric observables, but there are many other choices. Some of them, giving
328
K. Dasgupta, Z. Yin
different values to , correspond to different choices of phase for the wave function. Some other choices correspond to different operator ordering schemes in the language of quantization. Both will show up here. Let us divide the coordinates of the present problem into two groups which are canonical conjugates to each other with respect to J (and −J ). We label them with a, b, . . . and A, B, . . . respectively as in Sect. 2.1 Then we choose as geometric observables for any (i, j ) string E a = La , EA = RA.
(3.1)
Now let us consider the 11 string. First we will describe a scheme for illustration only . that will not be used again in the paper. Therefore to avoid confusion we use = instead of = in equations peculiar to this example. The basic commutation relations are − Lµ , Lν = J µν = R µ , R ν , Lµ , R ν = 0. (3.2) The translation operator is P = −J (L − R).
(3.3)
By a specific choice of phase of the basis state in the aA representation, we can implement translation by differentiation with respect to the space coordinates as per tradition: P = . This means in particular that ∂ . R a = −ıJ aA A + ea . ∂e
(3.4)
Then by another choice of phase the change of basis between aA and aa basis is described by a a aA A . e||La , R a = δ(ea − La ) expı(R −L )J e . (3.5) Then one finds that the noncommutative product is given by 11
∗ 11 (e)
M A =eA . ı ∂M∂ A J Aa ∂M∂ a 1 a A 1 a A =e 1 (e , M )1 (M , e ) . a a
(3.6)
M =e
This corresponds, in quantization, to a choice of ordering in which all the E a ’s are brought to the left and all the E A ’s are brought to the right. However, in the rest of the paper we shall only use a variant of this ordering so that the condition α1 = α2 is satisfied in (Eq. 2.11) and the final result could be in a more convenient form. Another choice of phase in the aA representation is made which replaces (Eq. 3.4) by ∂ R a = −ıJ aA A . (3.7) ∂e Then (Eq. 3.5) is replaced by e||La , R a = δ(ea − La ) exp ıR a J aA eA ,
(3.8)
Non-Abelian Geometry
329
and (Eq. 3.6) by
11
∗ 11 (e)
∂ ∂ = exp ı J Aa ∂M A ∂M a
M A =0=M a
11 (ea , M A )11 (M a , eA )
(3.9)
.
The U (1) phase that relates this and the last
one is given by the unit element in this new product. Instead of 1, it is exp ıea J aA eA . We call this scheme split ordering. 3.2. Off diagonal elements. On a 12 string the commutation relations are µ ν µ ν µν Lµ , Lν = −J = R , R , L , R = 0.
(3.10)
The translation operator is
P = −J (L + R). (3.11) A crucially new feature is that P no longer commutes among themselves: [P µ , P ν ] = −2ıJ µν . By a specific choice of gauge we can implement it as Pµ = −ı
∂ − [J e]µ . ∂eµ
(3.12)
This means in particular ∂ . (3.13) ∂eA Then by a choice of phase consistent with the split ordering the change of basis between aA and aa basis is described by ea , eA ||La , R a = δ(ea − La ) exp ıR a J aA eA . (3.14) R a = −ıJ aA
Then one finds that the noncommutative product is given by 11
∗ 12 (e)
M A =0=M a ∂ Aa ∂ 1 a A 1 a A = exp ı J (e , M ) (M , e ) . (3.15) 1 2 ∂M A ∂M a j
Using this method systematically we find all the possible products ji ∗ij k k . They in fact can be written in a very compact matrix form: M A =0=M a ∂ ∂ a A ˆ Aa (M , e ) , ( × ) (ea , eA ) ≡ (ea , M A ) exp ı ∂M A ∂M a (3.16) ˆ Aa = J Aa σ3 and all products are understood as matrix products12 . This is highly where ˆ For each (µ, ν) pair, ˆ µν is a two suggestive of an SU (2) valued Poisson structure . by two matrix, intuitively in the adjoint of SU (2). In this case, ˆ µν = J µν σ3 .
(3.18)
This product is clearly associative. 12
σ3 is just the usual element of Pauli matrices: 1 σ3 = 0
0 . −1
(3.17)
330
K. Dasgupta, Z. Yin
The unit element of this new product is Isu(2) = exp −ıea Fˆ aA eA , where
Fˆ µν = −J µν σ3 .
(3.19)
(3.20)
3.3. Non-Abelian geometry. Just as in the general case discussed in the last section, constant matrices form a subalgebra. However, × M = (ea , 0)M = M(0, eA ) = M × ,
(3.21)
unless (ea , 0) = (0, eA ) is a constant matrix that commutes with M. Therefore one cannot obtain the whole algebra by tensoring this matrix subalgebra with some other algebra. Curiously, there are two other distinct matrix subalgebras with the interesting properties: (MIsu(2) ) × = M, × (Isu(2) M) = M, (3.22) so that the total algebra is a left and right module under them separately and respectively. The new algebra defined by (Eq. 3.16) and (Eq. 2.55) also contains subalgebras that are the deformations of that of the scalar functions on R D . However, there are N, rather than just one, of them, distinguished by their deformation parameters i . No one is more preferred than the others. On the other hand, a well defined deformed algebra of functions is essential for the noncommutative geometric interpretation of D-brane worldvolume. A noncommutative space is itself defined only by the algebra of functions “on it.” The loss of a canonical noncommutative algebra of scalar functions calls for a drastic reinterpretation of the underlying “space.” In the present case, the N different algebras represent N deformed noncommutative spaces on top of each other, distinguished only by their deformation parameters. However, this simplistic picture overlooks all the (i, j ) strings. Indeed, it is clearly not “covariant” enough. The total algebra is not a simple tensor product of any one of the scalar subalgebras with some matrix algebra, as it were for the usual case of Abelian or no deformation. What is the meaning of this? We propose that these algebras define examples of a new type of geometry, which we call non-Abelian Geometry. It is a type quite apart from both the original underlying commutative space R D and the noncommutative R D defined by the Moyal product because the matrix (and more generally, non-Abelian) degree of freedom and the function degree of freedom become entangled everywhere and become one entity. Recall that the algebra of functions of the direct product of two manifolds, M × M , is the tensor product of their respective algebras: AM×M = AM ⊗ AM .
(3.23)
So should be the case for the direct product of noncommutative spaces. Now also recall that the algebra of N × N matrices can be reinterpreted as the algebra of functions on a certain discrete noncommutative space: a discrete “fuzzy” torus with N units of magnetic flux13 . Therefore the usual case of N × N matrix fields on commutative or 13 This can be constructed, for example, on a two-torus as follows. With N (more generally, rational) units of flux through the torus, the algebra of functions contains centers. One finds that the algebra has an N dimensional irreducible representation in which the Fourier components are realized as clock and shift operators of a N-ary clock and the products thereof. The latter generate the algebra of N × N matrices.
Non-Abelian Geometry
331
U (1) deformed space can be reinterpreted as (the algebra on) the direct product of the continuous space with the appropriate discrete torus. When the U (N ) bundle is nontrivial, it should be identified as a fibration of the discrete torus over the continuous base space. The notion of a base space is based on the existence of a canonical algebra of scalar valued functions on it, of which the total algebra is a module. This semi-decoupling makes it a matter of taste whether to consider the total algebra as representative of a combined noncommutative space or just conventionally as an adjoint module of the algebra of the base space. However, with non-Abelian deformation considered in this paper, such a canonical scalar algebra ceases to exist. The discrete torus and the continuous space therefore lose their independent identities and separate meaning. The conventional picture has to be replaced by a total space that intertwines the discrete, non-Abelian degrees of freedom of the Yang-Mills theory with that of the continuous, noncommutative space. This is the precise meaning of non-Abelian geometry. 4. The Taub-NUT Connection So far we have studied an example of Lorentz non-invariant theory. These theories give new deformations to the otherwise constrained structure of quantum field theory. As discussed above, they can be realized in string theory when we have a background B-field. In the presence of branes we have basically four choices of orienting the B-field resulting in four different theories. The first case would be to orient the B-field transverse to the brane [12] i.e. the B-field is polarized orthogonally. Naively such a constant B-field can be gauged away. However if we also have a nontrivial orthogonal space − say a TaubNUT − and one leg of the B-field is along the Taub-NUT cycle then this configuration gives rise to new theories known as the pinned brane theories [12]. The D-branes have minimal tension at the origin of the Taub-NUT and therefore the hypermultiplets in these theories are massive. The mass is given by b2 , (4.1) 1 + b2 where b is the expectation value of the B-field at infinity. The origin of the mass of the hypermultiplets is easy to see from the T-dual version. For simplicity we will take a D3 brane oriented along x 0,1,2,3 and is orthogonal to a Taub-NUT which has a non-trivial metric along x 6,7,8,9 . The coordinate x 6 is the Taub-NUT cycle and the B-field has polarization B56 . Making a T-duality along the compact direction of the Taub-NUT we have a configuration of a NS5 brane and a D4 brane. The hypermultiplets in this model come from strings on the D4 brane crossing the NS5 brane. Due to the twist on the torus x 5,6 , the D4 on the NS5 brane comes back to itself by a shift resulting in the hypermultiplets being massive while the vectors remain massless. The second case is to orient the B-field with one leg along the brane and the other leg orthogonal to it [13]. Again we could gauge away such a B-field. But in the presence of Taub-NUT − with the leg of the B-field along the Taub-NUT cycle − we generate new theories on the brane known as the dipole theories [14]. Hypermultiplets in these theories have dipole length L determined by the expectation of the B-field. The vector multiplets have zero dipole lengths. The dipoles are light and typically the branes are not pinned. The field theory on the branes are nonlocal theories with the following multiplication rule: x =x =x 1 i ∂ j ∂ ( ◦ )(x) ≡ exp (L1 i − L2 j ) (x )(x ) , (4.2) 2 ∂x ∂x m2 =
332
K. Dasgupta, Z. Yin
where (x) and (x) have dipole lengths L1 and L2 respectively. It is easy to check that when we specify the dipole length of the above product as L1 + L2 , the multiplication rule is associative. The dipoles in these theories are actually rotating arched strings stabilized (at weak coupling) by a generalized magnetic force [13]. In this limit the radiation damping and the coulomb attraction are negligible. Again the T-dual model can illustrate why the hypermultiplets have dipole length. We take the above configuration of a D3 transverse to a Taub-NUT but now with a B-field B16 . Under T-duality we get a configuration of a NS5 brane and a D4 brane with a twisted x 1,6 torus. Along direction x 1 the D4 comes back to itself up to a twist. Since both the NS5 and the D4 branes are along x 1 , this shift gives a dipole length to the hypermultiplets. Observe that this way the vectors have zero dipole length. The third case is to orient the B-field completely along the branes. Here we cannot gauge away the B-field. Gauging will give rise to an F field on the world volume of the branes. This would also mean that now we no longer need any nontrivial manifold. The supersymmetry will thus be maximal (in the above two cases the supersymmetry was reduced by half or more). The theory on the brane is noncommutative gauge theory. The B-field modifies the boundary conditions of the open strings describing the D branes. This modification is crucial in giving a non zero correlation function for three (and more) gauge propagators. This in turn tells us that the usual kinetic term of the gauge theory is replaced by14 detg g ii g jj Fij ∗ Fi j , (4.3) where (Eq. 4.3) involves an infinite sequence of terms due to the definition of ∗ product (Eq. 2.16). The above equation is written in terms of local variables, i.e. the variables defining the usual commutative Yang-Mills theory. The map which enables us to do this is found in [3]. In other words, noncommutative YM at low energies can be viewed as a simple tensor deformation of commutative YM. This deformation is also responsible in producing a scale in the theory. A key feature is that this scale governs the size of the smallest lump of energy that can be stored in space. Any lump of size smaller than this will have more energy and therefore will not be physically stable. For a D3 brane with a B-field B23 T-duality along x 3 will give a D2 brane on a twisted x 23 torus. The non-locality in this picture can be seen from the string which goes around the x 3 circle and reaches the D2 with a shift [2]. For the dipole theories (which are again non-local theories) there also exist a map with which we could write these theories in terms of local variables [14]. This map is relatively simpler than the Seiberg-Witten map for noncommutative theories. Using this map one can show that the dipole theories at low energies are simple vector deformations of SYM theory [13]. The fourth case is the topic of this paper. Here, as discussed earlier, we have a configuration of multiple D-branes with different B-fields oriented parallel to the branes. However one difference now is that this configuration may or may not preserve any supersymmetry. Also the multiplication rule in this theory is more complex and now there is no clear distinction between the non-Abelian and the noncommutative spaces. The algebra (Eq. 2.55) reflects this intertwining. From the above considerations it would seem that all the four theories have distinct origins. However, as we shall discuss below, all these theories can be derived from a particular setup in M-theory but with different limits of the background parameters. This will give us a unified way to understand many of the properties of these theories. 14
gij is the open string metric Gij of [3].
Non-Abelian Geometry
333
Consider first the pinned brane case. If we lift a D4 brane with transversely polarized B-field we will have a configuration of a M5 brane near a Taub-NUT singularity and a C-field having one leg parallel to the M5 brane. The limits of the external parameters which give rise to decoupled theory on the M5 brane are [12]: C → ,
R → ,
Mp → −β ,
β > 1,
(4.4)
where R is the Taub-NUT radius and Mp is the Planck mass. In this limit the energy scale of the excitations of the M5 brane is kept finite whereas the other scales in the problem are set to infinity. The dipole theories are now easy to get from the above configuration. Keeping the background limits the same we rotate the M5 brane such that the C-field now has two legs along the M5 (it’s still orthogonal to the Taub-NUT). With this choice a simple calculation will tell us that that the M5 is not pinned in this case. To generate the noncommutative gauge theories we first remove the M5 brane from the picture and identify the M-theory direction with the Taub-NUT circle. Now the limits15 which give rise to 6 + 1 dimensional noncommutative gauge theories are [16]: C → −1/2 ,
R → f ixed,
Mp → −1/6 ,
M gµν → 2/3 ,
(4.5)
M is the dimensionless M-theory metric16 . This limit is the same as Seibergwhere gµν Witten limit and the coupling constant of the theory
gY2 M = Mp−3 C = f ixed.
(4.6)
The theory of non-Abelian geometry can be studied in M-theory using a multi TaubNUT background with a G-flux that has non-zero expectation values near the Taub-NUT singularities. As discussed in [17], such a choice of background flux generally breaks supersymmetry. From the type IIA D6 brane point of view this flux will appear as gauge fluxes Fi = dAi on the i th world volume17 . If the Taub-NUT is oriented along x 7,8,9,10 , x 7 being the Taub-NUT circle, and the C field has two legs along the Taub-NUT and one leg along x 1 then the world volume gauge fields Ai can be determined by decomposing the C field as: N (2) C(t, y, x 7 , r) = Ai (t, y) ∧ Li (x 7 , r), (4.7) i=1 (2)
where y’s are the coordinates x 1,..,6 of the D6 brane world volume, Li are the harmonic √ forms of the multi Taub-NUT and |r | = x k x k , k = 8, 9, 10. However it turns out that 15 These limits however don’t specify the complete decoupling of the theory. This is because the theory has negative specific heat [15]. 16 There is an interesting digression to the above cases. Between the two limits of the background C field there exists a case under which
C → f inite, R → f inite, Mp → ∞, gs → 0. This gives us another decoupled theory on the M5 brane which is in the same spirit as the little string theory [12]. 17 There is a subtlety here. For F/M-theory compactification with G-flux, when there is a generic flux− not concentrated near the singularities of the manifold − this appears in the corresponding type IIB theory as HNSNS and HRR background. However when the flux is concentrated near the singularity, then it appears as gauge fields on the brane [17].
334
K. Dasgupta, Z. Yin
these harmonic forms are deformed from their original values due to the background G-flux. Therefore in the above equation the harmonic forms are C-twisted ones which preserve their anti-self-duality with respect to the C-twisted background metric. The precise form of this twist will be presented elsewhere [18]. In this framework it might be possible − by pure geometric means − to see this intertwining more clearly. Before we end this section let us summarize the connections between different theories governed by the B field dynamics in the following table18 : j
Theories
SUSY
Product rule: (ij ∗ k )(x)
Pinned branes
N =2
ij (x)k (x)
Dipole theory
N ≤2
Noncomm. geometry
N =4
Non-Abelian geometry
N ≤1
j
1
j
(Li1 ∂i −L2 ∂j ) i j ∂x ∂x j (x )k (x )|x =x =x ı ∂ µν ∂x∂ν j e 2 ∂x µ i (x )k (x )|x =x =x j µν ∂ ı ∂ x =S ik x j e 2 ∂x µ ij ;j k ∂x ν ij (x )k (x )| ijik x =S x
e2
jk
5. Discussions and Conclusion In this section we will discuss possible supergravity background for the analysis presented in the earlier sections. We will also illustrate some aspects of this using a mode expansion for a background U (1) × U (1). This case is related to the recent analysis done in [19], where the worldsheet propagator was calculated to compute two distinct noncommutativity parameters. It was shown that near one of the branes, say 1, the ∗-product involves 1 only. This is clear from our analysis because 11 ∗ 11 involves 11;11 which from (Eq. 2.26) is just 1 . Another recent paper which dealt with some related aspects is [9]. Here two different ∗ products arise naturally in the fractional brane setting. In this model we have a configuration of D5 − D5 wrapping a vanishing two cycle of a Calabi-Yau. This model however is supersymmetric and for some special choice of background B and F fields the tachyon is massless [9]. 5.1. Gravity solutions. Let us first consider the case of a large number of D3 branes on top of each other and with a background B-field switched on. The B-field is constant along the world volume of the D3 branes. What is the supergravity solution for the system? Obviously the near horizon geometry cannot be AdS as there is a scale in the theory which breaks the conformal invariance. Indeed, as shown by Hashimoto and Itzhaki [20] and independently by Maldacena and Russo [15], the supergravity solution can be calculated by making a simple T-duality of the D3 brane solution. Under a T-duality the B background becomes metric and it tilts the torus which the D2 brane wraps. This solution is known and therefore we could calculate the metric for this case and T-dualize to get our required solution. Observe that under a T-duality we do get a B-field which is constant along the brane but is a nontrivial function along the directions orthogonal to the brane. In other words there is a H field. But for all practical purposes this solution is good enough to give us the near horizon geometry of the system. The 18 We have chosen to make everything non-Abelian. It’s a straightforward exercise to extend the above three Abelian cases to this.
Non-Abelian Geometry
335
scale of the theory appears in the metric deforming our AdS background which one would expect in the absence of the B-field. For the case in which D3 branes are along x 0 , x 1 , .., x 3 and the B-field has a polarization B23 the near horizon geometry looks like [20, 15]: du2 ds 2 = α u2 (−dx02 + dx12 ) + u2 h(dx22 + dx32 ) + 2 + d25 , u
(5.1)
where h = (1 + a 4 u4 )−1 and a 2 is the typical scale in the theory (it is related to θ ). The above metric has the expected behavior that for small u the theory reduces to AdS5 × S 5 . From gauge theory this is the IR regime of the theory. One naturally expects that noncommutative YM reduces to ordinary YM at large distances. The above solution has an added √ advantage that it tells us that below the scale a (which will be proportional to ) commutative variables are no longer the right parameters to describe the system accurately. Noncommutativity becomes the inherent property of the system and therefore local variables fail to capture all the dynamics. At this point we should ask whether our new system has a consistent large N behavior. The gauge theory is highly noncommutative of course but it also has a large number of noncommutativity parameters (typically N). We have a system of N D3 branes with gauge fields F i , i = 1, .. N on them. We can simplify the problem by taking only one i . polarization of the gauge fields, i.e. we would concentrate on the fields F23 A simple analysis tells us immediately that the previous procedure to generate a solution is not going to help in this case. The procedure is suitable to generate one scale and therefore we should now rely on a different technique. Also the system now has no supersymmetry and therefore we have to carefully interpret the background. i on the branes as B . We can make a Lorentz Let us denote the magnetic field F23 i transformation to generate a constant magnetic field B on the branes but different electric fields Ei such that the relations Bi2 = B 2 − Ei2 , Ei • B = 0 are satisfied. We can also make a gauge transformation to convert the constant magnetic field to a background constant B-field. We now make a T-duality along the x 3 direction19 . Under this the electric fields Ei will become velocities vi of the D2 branes and the B-field will tilt the torus x 2 , x 3 as before. Therefore the final configuration will be a bunch of D2 branes (or, in a reduced sense, points) moving with velocities vi along the circle x 3 . At this point it would seem that the supergravity solution is easy to write down. But there are some subtleties here. Recall that when we had a single scale in the problem and the T-dual picture was a D2 brane wrapped on a tilted torus, T-duality along x 3 was easy because we had assumed that the harmonic function of the D2 brane is delocalized along the third direction. Therefore the D2 brane is actually smeared along that direction. This typically has the effect that the harmonic function of the D2 brane is no longer 1 + Qr 52 , rather its 1 + Qr 42 , Q2 is the charge of the D2 brane. This is the case that we have to consider. Delocalizing the D2 branes would mean that we have an infinite array of D2 branes moving with velocity, say, v1 and so on. Also since the system 19 There could be a subtlety in performing a T-duality here because the string theory background is not supersymmetric. But we are considering a T-duality completely from the supergravity point of view in which the transformation of the bosonic background is important for us. As such the extra corrections are not relevant for studying this.
336
K. Dasgupta, Z. Yin
lacks supersymmetry the velocities are not constant. An interpretation of this model can be given from fluid mechanics. Due to delocalization we have layers of fluid moving with velocities vi along x 3 with a viscosity between them. This would tend to retard the motion of the various layers making the problem slightly nontrivial. But as we shall see some interesting property of the system is obvious without going to the original (T-dual) model. The metric for the D2 brane (for simplicity the direction 23 is on a square torus) is given by 2 2 ds 2 = H −1/2 ds012 + H 1/2 ds34..9 , (5.2) where the harmonic function satisfies: N δ(ri ) ∂ 2 H = i=1
and ri is given by ri = (x3 − vi t)2 + x42 + · · · + x92 when the velocities are small so that we could neglect relativistic effects. Recall that the system is delocalized along the direction x3 therefore there is actually an infinite array of branes moving (i.e. it behaves like a fluid). Let r = x42 + · · · + x92 , then it’s easy to show that for a large radius of the x3 circle and near horizon geometry (i.e. r → 0) the harmonic function is modified from the naive expected value. The harmonic function becomes: N 1 π/2 sin3 θ dθ, H (r) = 1 + r4 0 (1 − vri t sin θ)5 i=1
(5.3)
when the compact direction is very small one can show that we get H (r) = 1 + r14 . A T-duality along that direction will give us a noncompact D3 brane whose harmonic function will have the right property. This calculation is done without assuming any force between the branes. A more detailed analysis would require the behavior of open strings between the branes. In the next section we will elaborate on this issue by doing a mode expansion. (i)
5.2. Mode expansion. For simplicity we will take two D3 branes having fluxes Fi = F23 on them. The D3 branes are oriented along x 0,1,2,3 and let z = x 2 + ix 3 such that the mode expansion for the system becomes: 1 − iF 1 exp (−i(n + ν)σ ) . z= An+ν exp ((n + ν)t) exp (i(n + ν)σ ) + 1 + iF1 n (5.4) The quantity ν measures the shift in the mode number due to the presence of different gauge fluxes at the boundary. This shift can be easily worked out from the boundary conditions at the two ends of the open string. In terms of the above variables ν is given by 1 (F2 − F1 )(1 + F1 F2 ) −1 ν= 2 sin . (5.5) 2π (1 + F12 )(1 + F22 ) As is obvious from the above formula when the gauge fluxes are the same on the different branes we do not expect any shift in the mode number. This shift can now be used to
Non-Abelian Geometry
337
calculate the new ground state energy of the system. This zero-point energy, in the NS sector, will now depend on ν. Using the identity
(n + ν) = −
n≥0
1 (6ν 2 − 6ν + 1) 12
(5.6)
the zero-point energy can be calculated from the bosons, fermions and the ghost contributions. The bosons and the fermions along directions x 2,3 are quantized with mode numbers n + ν and n ± |ν − 21 | respectively. In general for a system of Dp branes with fluxes F1,2 the zero point energy is given by 1 p − 1 1 − + ν − . (5.7) 2α 4 2 From the above formula it’s clear that there is a tachyon on the D3 brane for any values of ν. For very small values of ν the tachyon has m2 = − 2α1 (1 − ν) and for large values of ν it has m2 = − 2αν . Also now there is a subtlety about GSO projection. Therefore it depends on whether we study a D1 brane or D1 brane. When the branes are kept far apart then there would be no tachyon in the system but the branes will be attracted to each other which in turn will retard the velocities of the brane. Let us now consider the special case of SU (2). For this we have the following background: F1 = −F2 = −F. (5.8) It is straightforward to show that now the modes will be shifted by ν given as ν=
2 tan−1 F. π
(5.9)
For the case we are interested in, F → ∞, and therefore the shift ν = 1. The ground state energy does not change but all the modes of the string get shifted by 1. It would be worthwhile to analyze in greater detail the spectrum and dynamics of this theory. Acknowledgements. K.D. would like to thank J. Maldacena, S. Mukhi and R. Tatar for useful discussions. Z.Y. would like to thank D. Gaitsgory and C. H.Yan for useful discussions. We would like to thank N. Seiberg for helpful comments. The work of K.D. is supported by DOE grant number DE-FG02-90ER40542 and the work of Z.Y. is supported by NSF grant number PHY-0070928.
References 1. Connes, A., Douglas, M.R., Schwarz, A.: Noncommutative geometry and matrix theory: Compactification on tori. JHEP 9802, 003 (1998), hep-th/9711162 2. Douglas, M.R., Hull, C.: D-branes and the noncommutative torus. JHEP 9802, 008 (1998), hep-th/9711165 3. Seiberg, N., Witten, E.: String theory and noncommutative geometry. JHEP 9909, 032 (1999), hep-th/9908142 4. Witten, E.: Noncommutative geometry and string field theory. Nucl. Phys. B268, 253 (1986) 5. Yin, Z.: A note on space noncommutativity. Phys. Lett. B 466, 234 (1999), hep-th/9908152 6. Kontsevich, M.: Deformation quantization of poisson manifolds. q-alg/9709040 7. Cattaneo, A.S., Felder, G.: A path integral approach to kontsevich quantization formula. Commun. Math. Phys. 212, 591 (2000), math.qa/9902090 8. Douglas, M.R.: D-branes on Calabi-Yau manifolds. math.ag/0009209
338
K. Dasgupta, Z. Yin
9. Tatar, R.: A note on noncommutative field theory and stability of Brane-Antibrane systems. hep-th/0009213 10. Bigatti, D., Susskind, L.: Magnetic fields, branes and noncommutative geometry. Phys. Rev. D62, 066004 (2000), hep-th/9908056 11. Chu, C.-S., Ho, P.-M.: Constrained quantization of open string in background B-field and noncommutative D-Brane. Nucl. Phys. B568, 447 (2000), hep-th/9906192 12. Chakravarty, S., Dasgupta, K., Ganor, O.J., Rajesh, G.: Pinned branes and new non-lorentz invariant theories. Nucl. Phys. B587, 228 (2000), hep-th/0002175 13. Dasgupta, K., Ganor, O.J., Rajesh, G.: Vector deformations of N = 4 Yang-Mills theory, pinned branes and arched strings. hep-th/0010072 14. Bergman, A., Ganor, O.J.: Dipoles, twists and noncommutative gauge theory. hep-th/0008030 15. Maldacena, J.M., Russo, J.G.: Large N limit of non-commutative gauge theories. JHEP 9909, 025 (1999), hep-th/9908134 16. Dasgupta, K., Ganor, O.J., Rajesh, G.: unpublished 17. Dasgupta, K., Rajesh, G., Sethi, S.: M-theory, orientifolds and G-flux. JHEP 9908, 023 (1999), hep-th/9908088 18. Dasgupta, K., Rajesh, G., Sethi, S.: Work in progress 19. Dolan, L., Nappi, C.: A scaling limit with many noncommutativity parameters. hep-th/0009225 20. Hashimoto, A., Itzhaki, N.: Non-commutative Yang-Mills and the AdS/CFT correspondence. Phys. Lett. B465, 142 (1999), hep-th/9907166 Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 235, 339–378 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0790-4
Communications in
Mathematical Physics
Conformally Invariant Powers of the Laplacian, Q-Curvature, and Tractor Calculus A. Rod Gover1 , Lawrence J. Peterson2 1
Department of Mathematics, The University of Auckland, Private Bag 92019, Auckland 1, New Zealand. E-mail:
[email protected] 2 Department of Mathematics, The University of North Dakota, Grand Forks, ND 58202-8376, USA. E-mail:
[email protected] Received: 24 January 2002 / Accepted: 1 November 2002 Published online: 18 February 2003 – © Springer-Verlag 2003
Abstract: We describe an elementary algorithm for expressing, as explicit formulae in tractor calculus, the conformally invariant GJMS operators due to C.R. Graham et alia. These differential operators have leading part a power of the Laplacian. Conformal tractor calculus is the natural induced bundle calculus associated to the conformal Cartan connection. Applications discussed include standard formulae for these operators in terms of the Levi-Civita connection and its curvature and a direct definition and formula for T. Branson’s so-called Q-curvature (which integrates to a global conformal invariant) as well as generalisations of the operators and the Q-curvature. Among examples, the operators of order 4, 6 and 8 and the related Q-curvatures are treated explicitly. The algorithm exploits the ambient metric construction of Fefferman and Graham and includes a procedure for converting the ambient curvature and its covariant derivatives into tractor calculus expressions. This is partly based on [12], where the relationship of the normal standard tractor bundle to the ambient construction is described. 1. Introduction Conformally invariant differential operators have long been known to play an important role in physics and the geometry of many structures related to and including Riemannian and conformal geometries. For example, the classical field equations describing massless particles, including the Maxwell and Dirac (neutrino) equations, depend only on conformal structure [2, 18]. More recently string theory and quantum gravity have motivated several developments in mathematics where conformally invariant operators play a key role. Many of these could be said to fall under the umbrella of geometric spectral theory where, broadly, one attempts to relate global geometry to the spectrum of some natural operators on the manifold. For example, on compact manifolds there are programmes to find extremal metrics for functional determinants of natural operators. Conformally invariant operators yield determinants with a workable formula (a so called Polyakov formula) for the conformal variation of the determinant thus leading
340
A. R. Gover, L. J. Peterson
to significant progress [10, 6, 5]. In another direction there is new progress [35] in relating scattering matrices on conformally compact Einstein manifolds with conformal objects on their boundaries at infinity. This falls within the framework of the AdS/CFT correpondence of quantum gravity [43, 36, 37, 33]. In these areas it seems an especially important role is played by natural conformally invariant operators with principal part a power of the Laplacian . The earliest known of these is the conformally invariant wave operator which was first constructed for the study of massless fields on curved spacetime. More recently its Riemannian signature variation, usually called the Yamabe operator, has played a large role in the Yamabe problem on compact Riemannian manifolds. As an operator on functions it is given by the formula − (n − 2)R/(4(n − 1)), and it governs the transformation of the scalar curvature R under conformal rescaling. An operator with principal part 2 is due to Paneitz [40] (see also [41, 23]), and then sixth-order analogues were constructed in [3, 44]. Graham, Jenne, Mason and Sparling (GJMS) solved a major existence problem in [32] where they used a formal geometric construction to show the existence of conformally invariant differential operators P2k (to be referred to as the GJMS operators) with principal part k . In odd dimensions, k is any positive integer, while in dimension n even, k is a positive integer no more than n/2. The k = 1 and k = 2 cases recover, respectively, the Yamabe and Paneitz operators. In dimension 2 the transformation of the scalar curvature can also be deduced from the Yamabe operator by a dimensional continuation argument, and the curvature fixing problem corresponding to the Yamabe problem is usually known as the Gauss curvature prescription. In the late 1980’s Branson [4, 10] observed that the Paneitz operator P is formally self-adjoint and can be expressed in the form P 1 + ((n − 4)/2)Q4 , where P 1 annihilates constant functions and Q4 is a scalar curvature invariant which could play a role parallel to the scalar curvature in higher order analogues of the Gauss curvature prescription programme. In dimension 4 the conformal transformation of Q4 is given by the Paneitz operator, and it follows that the integral of Q4 over compact 4-manifolds is a global conformal invariant. On conformally flat structures this is a multiple of the Euler characteristic. It has recently been established by Graham and Zworski and Fefferman and Graham [34, 35, 27] that the GJMS operators P2k are formally self-adjoint, and so [5] shows that these operators yield an analogous local Riemannian invariant Qn for each even-dimensional manifold. There has been considerable recent interest and progress in understanding Branson’s Q-curvatures, especially in low dimensions and on conformally flat structures [16, 17]. In [32] the GJMS operators are derived from the Laplacian of the ambient metric of Fefferman and Graham [25, 26]. This construction is very valuable not only in itself but also because of the close links with the Poincar´e metrics of the conformally compact Einstein theory. On the other hand there is another way to generate a conformally invariant operator with principal part k . The result is usually presented as a simple formulae, first due to M.G. Eastwood, as given in (15). (See [28] for a derivation and some further related developments.) Underlying this formula are two related key tools. The first is a geometric construction developed by Eastwood and others [22, 19] known as the curved translation principle. This construction is a generalised and geometric variant of the translation functor due to Zuckerman and others [45]. The second is a machinery known as tractor calculus [1, 29, 14, 13]. This calculus brings the conformally invariant Cartan connection to induced bundles and also involves other fundamental conformally invariant operators (such as the ones used in this formula). The combination is potent since on the one hand it is very easy to expand these tractor formulae in terms of the Levi-Civita
Conformally Invariant Powers
341
connection and its curvature (which is useful for the investigation of issues such as positivity of the operators), and on the other hand the link with representation theory means one easily obtains rules for generalising the operators and how they may be composed with certain other conformally invariant operators. See for example (16). It should be pointed out that the tractor formulae are themselves complete and explicit formulae and can be readily worked with directly without using any knowledge of the representation theory aspects. That is essentially the approach below. See also [7], for example, where these tractor formulae for conformally invariant powers of the Laplacian are used to construct formally self-adjoint conformally invariant boundary problems, higher order conformally invariant Dirichlet-to-Neumann operators, and related constructions. One problem with the tractor approach up until now has been that, on even dimension n manifolds, this had failed to yield the operators of order n except for a quotient construction in dimension 4 [28]. Here we give a similar quotient tractor construction for a sixth-order operator and show that we have in fact recovered P4 and P6 . This brings us to one of the main purposes of this paper, which is to explicitly relate the tractor calculus approach to the GJMS construction. This is achieved in Sect. 4, where an algorithm is described for finding a tractor formula for any of the GJMS operators P2k . Remarkably this algorithm does not require solving the Fefferman-Graham ambient construction. For low order operators it is essentially trivial and quickly recovers the simple tractor formulae for P4 and P6 and yields a corresponding tractor formula for P8 . See Sect. 4.1 and Proposition 2.3. In Proposition 2.4 we use these formulae to prove directly that these operators are formally self-adjoint (verifying directly for these cases the general results of [35, 27]). Expanding these formulae into formulae in terms of the Levi-Civita connection and its curvature simply requires repeated use of the Leibniz rule and the definitions of the tractor objects. This is easily automated and is done in Sect. 2.2. The nature of the formulae we use mean the calculations have a large number of built-in self-checks which ensure that the formulae used are entered and used correctly by the software. Thus overall this demonstrates an effective means to obtain explicit formulae for the GJMS operators. It should be pointed out that the formulae in Sect. 2.2 are not in fact the raw output from the expansion of the tractor formulae, but rather this output manipulated into the canonical form described in [21]. The authors performed these expansions and manipulations mainly by using Mathematica and J. Lee’s Ricci programme [39]; this work was performed under the assumption of a Riemannian signature metric, but the resulting formulae are independent of the signature. The most important outcomes of Sect. 4 are Proposition 4.5 and Theorem 2.5. The first of these establishes important features about the form of the tractor formulae for the GJMS operators, and the latter exploits this to provide some new invariant operators closely linked to the GJMS operators. There are several applications of these. One is a direct tractor based construction of Branson’s Q-curvatures. See Proposition 2.7. In fact, this also gives a new definition for these invariants. This gives an effective way to calculate these (Q4 and Q6 are treated as examples), and it sheds light on their remarkable transformation properties. Another application of Theorem 2.5 is Corollary 2.6. In words this states that except for the k = n/2 case, the theorem yields generalisations of the GJMS operators P2k that are “strongly invariant” in the sense of [19]. That is, operators that can be composed with tractor bundle valued operators to yield further conformally invariant operators. This is one of the key ideas of the curved translation principle. Finally, Theorem 2.5 is a crucial ingredient in the general construction in [8] of an elliptic conformally invariant operator on 1-forms with close connections to the first de Rham cohomology.
342
A. R. Gover, L. J. Peterson
There are other results presented. For example, in Sect. 2.3 we describe how to proliferate Riemannian invariants which are not conformally invariant but have a transformation formula similar to the Branson Q-curvatures. These can be viewed as representing terms that could be added to the Q-curvature without affecting its key properties and so play a role in generating new curvature prescription problems. There are also many other potential applications for this work not touched upon in this article. For example, the tractor formulae for the GJMS operators could immediately be used in a construction parallel to that in [7] to produce alternative conformally invariant boundary problems and non-local operators based around the GJMS operators. It should also be pointed out that the results and ideas in this paper should have analogues for CR structures, where one would instead be involved with CR-invariant powers of the sub-Laplacian [30] and the ambient construction of C. Fefferman [24]. The construction presented in this article is in part an application of ideas developed in ˘ the joint work of one of the authors with A. Cap. See [12] where it is described explicitly how to relate the Cartan/tractor approach to the ambient construction of Fefferman and Graham and its applications to invariant theory. The relevant aspects of this theory are summarised in Sect. 3.1. There is a corresponding theory for the CR case [11]. ˘ The authors are indebted to Tom Branson, Andi Cap, Mike Eastwood, and Robin Graham for several illuminating conversations. The authors would also like to thank the Mathematical Sciences Research Institute and the organisers of the Spring session in 2001 for helping to make this research possible.
2. Conformal Geometry and Tractor Calculus We summarise here an approach to local conformal geometry that is rather useful for our applications. This is broadly based on the development presented in [13], but many of the ideas and tools had their origins in [42, 1], and [29]. The notation and conventions in general follow the last two sources. We shall work on a real conformal n-manifold M, where n ≥ 3. That is, we have a pair (M, [g]), where M is a smooth n-manifold and [g] is a conformal equivalence class of metrics of signature (p, q). Two metrics g and gˆ are said to be conformally equivalent if gˆ is a positive scalar function multiple of g. In this case it is convenient to write gˆ = 2 g for some positive smooth function . Although we assume that the metrics have some fixed signature, all considerations below will be signature independent. For a given conformal manifold (M, [g]), we shall denote by Q the bundle of metrics. That is, Q is a subbundle of S 2 T ∗ M with fibre R+ . The points correspond to values of metrics in the conformal class. Let E a denote the space of smooth sections of the tangent bundle T M, and similarly let Ea be the smooth sections of the cotangent bundle T ∗ M. In fact, we will generally abuse notation and also use these symbols to indicate the sheaves of germs of smooth sections and even the bundles themselves. These conventions will be carried through to all bundles that we discuss. We write E to denote the trivial bundle over M. Penrose’s abstract index notation is embraced throughout, so tensor products of these bundles will be indicated by adorning the symbol E with appropriate abstract indices. For example, in this notation ⊗2 T ∗ M is written Eab . An index which appears twice, once raised and once lowered, indicates a contraction. These conventions will be extended in an obvious way to the tractor bundles described below. In all settings indices may also be “suppressed” (omitted) if superfluous by context.
Conformally Invariant Powers
343
The bundle Q is a principal bundle with group R+ , so there are natural line bundles on (M, [g]) induced from the irreducible representations of R+ . We write E[w] for the line bundle induced from the representation of weight −w/2 on R (that is R+ x → x −w/2 ∈ End(R)). Thus a section of E[w] corresponds to a real-valued function f on Q with the homogeneity property f (x, 2 g) = w f (x, g), where is a positive function on M, x ∈ M, and g is a metric from the conformal class [g]. We use the notation Ea [w] for Ea ⊗ E[w] and so on. Note that for consistency with [1], this convention differs in sign from the one of [14, Sect. 4.15]. Let E+ [w] be the fibre subbundle of E[w] corresponding to R+ ⊂ R. Choosing a metric g from the conformal class defines a function f : Q → R by f (g, ˆ x) = −2 , 2 where gˆ = g, and this clearly defines a smooth section of E+ [−2]. Conversely, if f is such a section, then f (g, x)g is constant upto the fibres of Q and so defines a metric in the conformal class. Thus E+ [−2] is canonically isomorphic to Q, and the conformal metric g ab is the tautological section of Eab [2] that represents the map E+ [−2] ∼ = Q → E(ab) . ab ab bc From this there is a canonical section g of E [−2] such that g ab g = δa c (where δa c is the section of Ea c corresponding to the identity endomorphism of the tangent bundle). The conformal metric (and its inverse g ab ) will be used to raise and lower indices without further mention. Given a choice of metric g from the conformal class, we write ∇a for the corresponding Levi-Civita connection. With these conventions the Laplacian is given by = g ab ∇a ∇b = ∇ b ∇b . In view of the isomorphism E+ [−2] ∼ = Q, a choice of metric also trivialises the bundles E[w]. In particular we will write ξ g for the canonical section of E[1] satisfying g = (ξ g )−2 g. Conversely a section of E+ [1] clearly determines a metric by this relation, so such a ξ g is termed a choice of conformal scale. This determines a connection on E[w] via the corresponding trivialisation of E[w] and the exterior derivative on functions. We shall also denote such a connection by ∇a and refer to it as the Levi-Civita connection. Note in particular then that, by definition, ∇a ξ g = 0, so ∇a also preserves the conformal metric. The curvature Rab c d of the Levi-Civita connection is known as the Riemannian curvature, and is defined by (∇a ∇b − ∇b ∇a )v c = Rab c d v d . This can be decomposed into the totally trace-free Weyl curvature Cabcd and a remaining part described by the symmetric Rho-tensor Pab , according to Rabcd = Cabcd + 2g c[a Pb]d + 2g d[b Pa]c , where [· · · ] indicates the antisymmetrization over the enclosed indices. The Rho-tensor is a trace modification of the Ricci tensor Rab . We write J for the trace Pa a of P. Under a conformal transformation we replace our choice of metric g by the metric gˆ = 2 g, where is a positive smooth function. The Levi-Civita connection then transforms as follows: c ∇ a ub = ∇a ub − ϒa ub − ϒb ua + g ab ϒ uc
∇ a σ = ∇a σ + wϒa σ.
(1)
Here ub ∈ Eb , σ ∈ E[w], and ϒa := −1 ∇a . The Weyl curvature is conformally invariant, that is Cˆ abcd = Cabcd , and the Rho-tensor transforms by Pab = Pab − ∇a ϒb + ϒa ϒb − 21 ϒ c ϒc g ab . For the density bundle E[1], we have the jet exact sequence at 2-jets,
(2)
344
A. R. Gover, L. J. Peterson
0 → E(ab) [1] → J 2 (E[1]) → J 1 (E[1]) → 0, where (· · · ) indicates symmetrization over the enclosed indices. Note we have a bundle homomorphism E(ab) [1] → E[−1] given by complete contraction with g ab . This is split via ρ → n1 ρg ab and so the conformal structure decomposes E(ab) [1] into the direct sum E(ab)0 [1] ⊕ E[−1]. Clearly then E(ab)0 [1] is a smooth subbundle of J 2 (E[1]), and we define E A to be the quotient bundle. That is, the standard tractor bundle E A is defined by the exact sequence 0 → E(ab)0 [1] → J 2 (E[1]) → E A → 0.
(3)
The jet exact sequence at 2-jets and the corresponding sequence at 1-jets, viz 0 → E A which Ea [1] → J 1 (E[1]) → E[1] → 0, determine a composition series for we can A + summarise via a self-explanatory semi-direct sum notation E = E[1] Ea [1] + E[−1]. We denote by XA the canonical section of E A [1] := E A ⊗ E[1] corresponding to the mapping E[−1] → E A . Composing the canonical projection J 2 (E[1]) → E A with the 2-jet operator j 2 yields an invariant differential operator n1 D A : E[1] → E A . On the other hand, if we choose a metric g from the conformal class, then the map jx2 σ → [ n1 D A σ (x)]g := (σ (x), ∇a σ (x), − n1 ( + J)σ (x)), induces an isomorphism E A → E[1] ⊕ Ea [1] ⊕ E[−1] =: [E A ]g of vector bundles. Tautologically the displayed formula for n1 [D A σ (x)]g gives the operator D A in terms of this decomposition. If the image of V A ∈ E A is [V A ]g = (σ, µa , τ ), then from the change in the Levi-Civita connection (1) we get [V A ]gˆ = (σ, µa , τ ) = (σ, µa + σ ϒa , τ − ϒb µb − 21 σ ϒb ϒ b ). This transformation formula characterises sections of E A in terms of triples in E[1] ⊕ Ea [1] ⊕ E[−1]. With a fixed rescaling of the map E[−1] → E A , we have [X A ]g = (0, 0, 1). It is convenient to introduce scale-dependent sections Z Ab ∈ E Ab [−1] and Y A ∈ E A [−1] mapping into the other slots of these triples so that [V A ]g = (σ, µa , τ ) is equivalent to V A = Y A σ + Z Ab µb + X A τ. If Yˆ A and Zˆ A b are the corresponding quantities in terms of the metric gˆ = 2 g then we have Zˆ Ab = Z Ab + ϒ b X A , Yˆ A = Y A − ϒb Z Ab − 21 ϒb ϒ b X A .
(4)
The standard tractor bundle has an invariant metric hAB of signature (p + 1, q + 1) and an invariant connection, which we shall also denote by ∇a , preserving hAB . If V A is as above and V B ∈ E B is given by [V B ]g = (σ , µb , τ ), then hAB V A V B = µb µb + σ τ + τ σ . Using hAB and its inverse to raise and lower indices, we immediately see that YA X A = 1, ZAb Z A c = g bc
Conformally Invariant Powers
345
Y A Z Ac YA 0 0 ZAb 0 δb c XA 1 0
XA 1 0 0
Fig. 1. Tractor inner product
and that all other quadratic combinations that contract the tractor index vanish. This is summarised in Fig. 1. Thus we also have YA V A = τ, XA V A = σ, ZAb V A = µb and the metric may be decomposed into a sum of projections, hAB = ZA c ZBc + XA YB + YA XB . If for a metric g from the conformal class V A ∈ E A is given by [V A ]g = (σ, µa , τ ), then the invariant connection is given by
∇a σ − µ a [∇a V B ]g = ∇a µb + g ab τ + Pab σ . ∇a τ − Pab µb
(5)
The tractor metric will be used to raise and lower indices without further comment. We shall use either “horizontal” (as in [V B ]g = (σ, µb , τ )) or “vertical” (as in (5)) notation, depending on which is clearer in each given situation. Tensor products of the standard tractor bundle, skew or symmetric parts of these and so forth are all termed tractor bundles. The bundle tensor product of such a bundle with E[w], for some real number weight w, is termed a weighted tractor bundle. For example EA1 A2 ···A [w] = EA1 ⊗ · · · ⊗ EA ⊗ E[w] is a weighted tractor bundle. Given a choice of conformal scale we have the corresponding Levi-Civita connection on tensor and density bundles. In this setting we can use the coupled Levi-Civita tractor connection to act on sections of the tensor product of a tensor bundle with a tractor bundle. This is defined by the Leibniz rule in the usual way. For example if ub V C σ ∈ E b ⊗ E C ⊗ E[w] =: E bC [w], then ∇a ub V C σ = (∇a ub )V C σ + ub (∇a V C )σ + ub V C ∇a σ . Here ∇ means the LeviCivita connection on ub ∈ E b and σ ∈ E[w], while it denotes the tractor connection on V C ∈ E C . In particular with this convention we have ∇a XA = ZAa , ∇a ZAb = −Pab XA − YA g ab , ∇a YA = Pab ZA b ,
(6)
which for the purposes of automating calculations is a very useful description of the tractor connection. Note that if V is a section of E [w], which means simply some tractor bundle of weight w, then the coupled Levi-Civita tractor connection is not confomally invariant but transforms just as the Levi-Civita connection transforms on densities of the same weight. That is a V = ∇a V + wϒa V ∇ under the conformal rescaling g → gˆ = 2 g (cf. (1)). It is an elementary exercise using the last transformation formulae and (4) to show that, for V ∈ E [w], the formula D AP V := 2wX [P Y A] V + 2X [P Z A]b ∇b V
(7)
346
A. R. Gover, L. J. Peterson
determines an invariant operator D AP : E [w] → E [AP ] ⊗ E [w]. (This was first developed in early versions of [29] and is closely related to the “fundamental D” operator developed in [14].) Since we can vary the weight and the tractor bundle E , D AP is really an entire family of operators. The point is that with the way we have defined ∇, the same formula works for the entire family, and so it is reasonable to let the single symbol D AP denote all of these operators. We abuse terminology and describe it as an operator. (The Levi-Civita connection is usually used this way.) If we have a single formula Op that gives a family of conformally invariant operators Op : E ⊗ V → E ⊗ W as we range over all tractor bundles E then, following [19], we describe Op as a strongly invariant operator. For example D AP is strongly invariant. As already pointed out D AP is rather more universal since we can vary the weight w as well. Thus we can form compositions of this operator with itself, and in particular we consider hAB DA(Q D|B|P )0 V for V ∈ E [w] some weighted tractor. Expanding this out using (6), (7), and the Leibniz rule for ∇, it is easily verified that it may be re-expressed in the form hAB DA(Q D|B|P )0 V = −X(Q DP )0 V , where D is some operator determined explicitly in the calculation. Since the map EP [w− 1] → E(P Q)0 [w] given by SP → X(Q SP )0 is injective, this establishes DA : E [w] → EA ⊗ E [w − 1] as a conformally invariant differential operator on weighted tractor bundles. For V ∈ E [w], this is given by D A V := (n + 2w − 2)wY A V + (n + 2w − 2)Z Aa ∇a V − X A V ,
(8)
V := ∇p ∇ p V + wJV .
(9)
where
So DA is in fact precisely the tractor D-operator in [1]. Note the identity DA X A V = (n + 2w + 2)(n + w)V ,
(10)
which we will use later. The curvature of the tractor connection is defined by [∇a , ∇b ]V C = ab C E V E for V C ∈ E C and is precisely the local obstruction to conformal flatness. (That is locally there is a flat metric in the conformal class if and only if this curvature vanishes.) Using (6) and the usual formulae for the curvature of the Levi-Civita connection we calculate (cf. [1]) abCE = ZC c ZE e Cabce − 4X[C ZE] e ∇[a Pb]e .
(11)
It is straightforward to use this and (6) to show that if V ∈ ECE···F [w], then [DA , DB ]VCE···F = (n + 2w − 2)[WABC Q VQE···F + 2wABC Q VQE···F + 4X[A B] s C Q ∇s VQE···F + · · · + WABF Q VCE···Q + 2wABF Q VCE···Q + 4X[A B] s F Q ∇s VCE···Q ].
(12)
Conformally Invariant Powers
347
Here ABCE = ZA a ZB b abCE , BsCE = ZB b bsCE , and WABCE = (n − 4)ABCE − 2X[A ZB b] ∇ p pbCE .
(13)
It follows that on conformally flat structures [DA , DB ]VCE···F = 0. Similarly it is easily verified that [DB , DC ] annihilates densities. We should point out some features of WABCE . Firstly, it is conformally invariant. One can already see this from (12) by setting w = 0 and then considering sections VC of EC such that ∇a VC vanishes at a given point. This is also immediately clear 3 from the formula WAB K L := n−2 D P X[P AB] K L (see [29]), which is readily verified. From this several things are immediately clear. Firstly, WABCE vanishes on conformally flat structures. Next, we have that WABCE = W[AB][CE] and that it is trace-free (since ABCE is annihilated by contraction with X P on any index). Furthermore expanding (13) reveals that W[ABC]E = 0. Thus WABCE has “Weyl tensor symmetries”. Whence it is immediately clear that WABCE is also annihilated upon contraction with X P . Finally we should comment on the uniqueness of this tractor calculus. In Sects. 2.6 and 2.7 of [13] it is shown that the transformation properties (4) and the form of the connection (6) identify E A and its tractor connection ∇a as above as a normal tractor bundle and connection corresponding to the defining representation of SO(p + 1, q + 1). Let V be Rn+2 as the representation space for the standard (or defining) representation of SO(p + 1, q + 1). We can construct [14] from the pair (E A , ∇a ) a principal bundle G which is the frame bundle for E A corresponding to the metric and filtration. This has fibre P , a certain parabolic subgroup of SO(p + 1, q + 1). A Cartan connection ω on G is determined by ∇. This is the normal Cartan connection on G such that ∇a is the vector bundle connection induced from ω. That is the normality condition on the pair (E A , ∇a ) is equivalent to the pair (G, ω) being a normal Cartan bundle and connection in the sense of [15].
2.1. Conformally invariant powers of the Laplacian. Since the tractor-D operator constructed above is well-defined on any weighted tractor bundle, we can compose the tractor-D operators. It is clear from the formula for the tractor-D operator that any such composition will yield a natural operator, that is an operator which can be written as a polynomial formula in terms of a representative metric, its inverse, the metric connection and its curvature. On densities of the appropriate weight and with some minor adjustment a composition of this form will lead to conformally invariant operators with principal part a power of the Laplacian. First let us observe how the conformal Laplacian arises from the tractor machinery. Let f ∈ E[1−n/2]. Then observe that immediately from (8) we have DA f = −XA f . Since DA is conformally invariant we have immediately that for f ∈ E[1 − n/2], f is conformally invariant. From (9) this is (∇ a ∇a + 2−n 2 J)f – the usual conformal Laplacian.
Now suppose that instead we have f ∈ E [1 − n/2], where here and below E [w] will be used to indicate any tractor bundle of weight w. We still have DA f = −XA f,
(14)
but now in f = (∇ a ∇a + 2−n 2 J)f , ∇ means the Levi-Civita tractor coupled connection. In particular this establishes a strongly invariant generalisation of the Yamabe operator on tractor sections of the said weight.
348
A. R. Gover, L. J. Peterson
It is clear from our observations that there is a conformally invariant operator DA1 DA2 · · · DAk−1 : E[k − n/2] → EA1 A2 ···Ak−1 [−1 − n/2]. In the conformally flat case this already yields an operator between densities (cf. [28]). Proposition 2.1. On conformally flat structures, if f ∈ E[k − n/2], then DAk−1 · · · DA1 f = (−1)k−1 XA1 · · · XAk−1 2k f, where 2k : E[k − n/2] → E[−k − n/2] is a conformally invariant operator. Locally we can choose a flat metric from the conformal class. This determines a connection in terms of which we have 2k f = k f . Proof. In any choice of conformal scale, expand out DAk−1 · · · DA1 f via the formula (8) and move the X, Y, Z’s to the left of all ∇’s via the identities (6). It is immediately clear that the highest order term is precisely (−1)k−1 XA1 · · · XAk−1 k f and that any other coefficient of XA1 · · · XAk−1 involves the curvature Pab or its trace. On the other hand, on conformally flat structures, [DA , DB ]V = 0 for V any weighted tractor field. Thus DA · · · DA1 f is completely symmetric for any ∈ Z+ . In particular DAk−1 · · · DA1 f ∈ E(A1 ···Ak−1 ) [−1 − n/2] and DAk DAk−1 · · · DA1 f ∈ E(A1 ···Ak ) [−n/2]. Consequently it must be that 0 = D[Ak DAk−1] · · · DA1 f . But DAk−1 · · · DA1 f has weight 1 − n/2, so from (14) this implies X[Ak DAk−1 ] DAk−2 · · · DA1 f = 0. It follows immediately that DAk−1 DAk−2 · · · DA1 f = (−1)k−1 XA1 · · · XAk−1 2k f for some operator 2k . With the above we are done.
If we are happy to work in the scale of a flat metric then there is an even simpler proof along the lines of the proof of Proposition 4.3. We leave this for the reader. By the same ideas as in the proof above, it is easy to use (12) and (8) to show that, if k ≥ 3, then X[Ak DAk−1 ] DAk−2 · · · DA1 f = 0 for f ∈ E[k − n/2] on a general conformally curved manifold. Thus the proposition fails if we remove the requirement of conformal flatness. One way to generalise the 2k is as follows. Consider D A1 · · · D Ak−1 DAk−1 · · · DA1 f for f ∈ E [k − n/2]. This is manifestly strongly conformally invariant in all dimensions and for all positive integers k. Furthermore by the identity (10) we have that, on conformally flat structures, k A1 Ak−1 DAk−1 · · · DA1 f = (n − 2i)(i − 1) 2k f. (15) D ···D i=2
On the other hand for general conformally curved structures, suppose that the dimension n is odd or satisfies 2k < n. Then we can define 2k f by (15), and this gives a conformally invariant operator 2k : E [k − n/2] → E [−k − n/2]
(16)
with principal part k . Here, as usual, E [k − n/2] indicates any tractor bundle of weight k − n/2. In these dimensions this generalises the operator 2k of the proposition. Although we do not wish to describe the curved translation principle [22, 19], it is worth
Conformally Invariant Powers
349
pointing out that it is partly illustrated here. The tractor formula (15) manifests 2k as a “translate” of the Yamabe operator . In fact proceeding in smaller steps it demonstrates 2k as a translate of 2k−2 . Before we move on, let us demonstrate that the operators 2k are formally self-adjoint. We summarise some results we need from Sect. 7 of [7] in the following proposition. These results can be verified easily using the definitions above, and it is important for our needs to note that this works in a rather formal manner. That is, we can leave the dimension and weight as unknown in the calculations. Proposition 2.2. On a conformal manifold M we have (i) If ψ B ∈ E B [w] and ϕ ∈ E[1 − n − w] is compactly supported on M, then
A ϕDA ψ = (DA ϕ)ψ A . M
M
(ii) If E is any tractor bundle, then E is canonically isomorphic to its dual E via the tractor metric and for any pair ψ , ϕ ∈ E [1 − n/2], (E [1 − n/2] := E ⊗ E[1 − n/2]) we have
ϕ ψ = ψ ϕ . M
Since E[−n] is naturally identified with the space of volume densities, the integrals are well-defined. Now part (ii) of the proposition asserts 2 = is formally self-adjoint, while the same result for D A1 · · · D Ak−1 DAk−1 · · · DA1 follows immediately from this and repeated use of (i). So the formal self-adjoint property of 2k is proved. It should be pointed out that as well as observing that D A1 · · · D Ak−1 DAk−1 · · · DA1 f recovers a conformally invariant power of the Laplacian, M.G.Eastwood also observed the formal self-adjoint property. It is clear from (15) that, unfortunately, this formula does not yield a conformally invariant operator of order n on even dimensional structures, yet the existence of such an operator is guaranteed by the construction of [32]. Recall from Sect. 2 that if f ∈ E[w], then [DA , DB ]f = 0, and so the k = 2 case of the proposition does hold on general conformal structures. In particular, as observed in [28], for f ∈ E[2 − n/2] we can define P4 f by the quotient formula DA f = −XA P4 f. Then P4 has principle part 2 . This construction works even when n = 4, and in other dimensions P4 = 4 as defined above. It is not hard to do the next even order in a similar way. If now f ∈ E[3 − n/2], then [DB , DC ]f = 0 and hence DB DC f = D(B DC) f . Now it is a short exercise, using (12) and the definition of WABCD once more, to show that (n − 4)X[A DB] DC f = −2X[A WB] S C T DS DT f. Now from the fact that DS DT f is symmetric and that WBSCT has Weyl tensor type symmetries, we can deduce that in dimensions n = 4, PBC f := DB DC f +
2 W B S C T DS D T f n−4
(17)
350
A. R. Gover, L. J. Peterson
is symmetric (i.e. PBC f ∈ E(BC) [−1 − n/2]). On the other hand, from the previous display X[A PB]C f = 0. Thus, for n = 4, PBC f = XB XC P6 f, where P6 is a conformally invariant operator E[3 − n/2] → E[−3 − n/2] generalising (for the allowed dimensions) the sixth-order operator of Proposition 2.1. In particular this works in dimension 6. We should point out that although 2k as defined by (15) is manifestly strongly invariant, we cannot conclude this for P6 as defined here. The operator PBC defined above is clearly invariant when acting on weighted tractors, but the argument here to deduce that PBC f has the form XB XC P6 f relies on the vanishing of [DA , DB ]f . We will establish in Sect. 4 (see in particular Subsect. 4.1) the following result. Proposition 2.3. The operators P4 and P6 defined by the tractor expressions above are precisely the fourth-order and sixth-order GJMS operators. That is P4 = P4 , P6 = P6 . A tractor expression for the eighth-order GJMS operator P8 is as follows: XA XB XC P8 f = − DA DB DC f −
2 P Q n−4 WA B DP DQ DC f
2 − n−4 WA P C Q DP DB DQ f −
4 P Q n−6 XA UB C DP DQ f
2 + (n−4)(n−6) XA D E W B P C Q DE DP DQ f n−2 P Q S T + 4 (n−4) 2 (n−6) XA WB C WP Q DS DT f ,
where all operators act on everything to their right, in a given term, and UB P C Q is the tractor field 2 (n−4)2
W AP B F WFAC Q + W AP C F WBAF Q + W AP QF WBACF .
Here D E WB P C Q DE DP DQ f means D E (WB P C Q DE DP DQ f ) (and not (D E WB P C Q )DE DP DQ f ). This and similar conventions for other operators and situations will apply throughout the paper. We should emphasise at this point that the tractor formulae for P4 , P6 and P8 above, and similar ones for the higher order P2k that we could easily construct via the algorithm of Sect. 4, are genuine formulae for the GJMS operators. No further algorithm is required. They are valid on any conformal manifold where the given GJMS operators exist. In this tractor form they are already suitable for many applications, such as establishing strong invariance or constructing related operators. The remainder of the section will demonstrate this. We begin by using the tractor formulae directly to show that the operators P4 , P6 and P8 are formally self-adjoint (FSA). We treat these in order. For f ∈ E[2 − n/2], we have DA f = −XA P4 f
which implies
D A DA f = (n − 4)P4 f.
(18)
We have already observed that D A1 · · · D Ak−1 DAk−1 · · · DA1 is FSA on E[k − n/2]. So from the second of these it is clear that P4 is FSA in dimensions other than 4. From the expressions (8) and (9) it follows that D A DA f and DA f , as expressions in terms of Levi-Civita covariant derivatives of f , Pab and J, are polynomial in n. So from
Conformally Invariant Powers
351
(18) it is clear that (4 − n) divides this expression for D A DA f and so P4 f is also given as a formula polynomial in n and the Levi-Civita covariant derivatives of f , Pab and J. Working among tensors of this form a calculation to verify the FSA property of P4 (in dimensions greater than 4) can be carried out formally in dimension n, since Proposition 2.2 is established that way. It follows immediately that the same calculation must work when we set n = 4. Thus P4 is also FSA in dimension 4. Now for P6 , let f ∈ E[3 − n/2] and note that DB DC f and WB S C T DS DT f are polynomial in n. Thus PBC f is rational in n with a singularity only at n = 4. From PBC f = XB XC P6 f we have (n − 4)D C D B PBC f = (n − 4)D C D B DB DC f + 2D C D B WB S C T DS DT f = 2(n − 4)2 (n − 6)P6 f. Now since WBSCT has the Weyl tensor symmetries (in fact here we just need WBSCT = WCT BS = WT CSB ), it follows from Proposition 2.2 that D C D B WB S C T DS DT is FSA on E[3 − n/2]. We know D C D B DB DC f is also FSA and as expressions in terms of Levi-Civita covariant derivatives of f , Cabcd , Pab , and J, both of these and (n − 4)PBC are polynomial in n. Thus the expression like this for D C D B PBC f is divisible by (n − 6), so reasoning as for P4 , we quickly conclude that P6 is FSA in all dimensions for which it is defined. Finally, since UABCD also has Weyl tensor symmetry (as readily verified directly or since it corresponds to ∆R ABCD as in Sect. 3), it follows that D C D B UB P C Q DP DQ f is FSA for f ∈ E[4 − n/2]. A similar comment applies to the other terms on the right-hand side of the formula, 6(n − 4)2 (n − 6)(n − 8)P8 f = (n − 4)D C D B D A DA DB DC f + 2D C D B D A WA P B Q DP DQ DC f 2
P Q C B + 2D C D B D A WA P C Q DP DB DQ f − 4 (n−4) n−6 D D UB C DP DQ f P Q C B E + 2 n−4 n−6 D D D WB C DE DP DQ f P Q S T C B + 4 n−2 n−6 D D WB C WP Q DS DT f,
which follows from the earlier display for P8 . This shows immediately that P8 f is FSA in dimensions other than 8, and then, arguing as in the previous cases, we can deduce that it is also FSA in dimension 8. We have directly proved the following. Proposition 2.4. The GJMS operators P4 , P6 and P8 are formally self-adjoint. In fact the result is also immediate from the formulae for these operators in Sect. 2.2. The formulae there are given in terms of the Levi-Civita connection and its curvature and are in a canonical form that manifests the formal self-adjoint symmetry. (It should be pointed out that in deriving those formulae formal self-adjointness was not assumed.) Recently the entire family of operators P2k have been shown to be formally self-adjoint by other means [35, 27]. In Sect. 4 we will show that there are similar tractor formulae for all of the GJMS operators and that these tractor formulae share some of the qualitative features of the examples above. In particular we will prove the following theorem.
352
A. R. Gover, L. J. Peterson
Theorem 2.5. (i) In each dimension n and for each integer 2 ≤ k, with k ≤ n/2 if n is even, there is a conformally invariant differential operator ECD AB : EAB [k − 2 − n/2] → ECD [2 − k − n/2] such that ECD AB DA DB f = XC XD P2k f for f ∈ E[k − n/2], where P2k : E[k − n/2] → E[−k − n/2] is the order 2k GJMS operator. The operator is given by a formula which is a partial contraction polynomial in , DA , WABCD , XA , hAB and its inverse hAB . (ii) With n and k as in (i) and for E any tractor bundle, there is a conformally invariant differential operator ECD AB : E ⊗ EAB [k − 2 − n/2] → E ⊗ ECD [2 − k − n/2] which generalises the operator of part (i). (iii) On conformally flat structures the operator E is, up to a non-zero scale, 2k−4 . In this case, given a choice of flat metric from the conformal class, E is, up to a non-zero scale, k−2 . Proof. We have already observed that for f ∈ E[2 − n/2], we have (see (18)) DA f = −XA P4 f . Since then DA f ∈ EA [1 − n/2] we have XB DA f = −DB DA f , and so we have (i) for k = 2. Otherwise establishing part (i) is the primary purpose of Sect. 4. More precisely, from the discussion there we obtain Proposition 4.5, which asserts that XA1 · · · XAk−1 P2k f = (−1)k−1 DAk−1 · · · DA1 f + Ak−1 ···A1 P Q DP DQ f. Applying D A3 · · · D Ak−1 to both sides of this and using (10), we obtain XA1 XA2 P2k f = EA1 A2 P Q DP DQ f, where EA1 A2 P Q = 2(k−2) δA1P δA2Q
−1 A3 · · · D Ak−1 P Q . (19) + ( k−2 Ak−1 ···A3 A2 A1 i=2 (2i − n)(i − 1)) D As explained in Proposition 4.5, is given explicitly by a sum of terms each of which is a monomial in DA , WABCD , XA , hAB , and its inverse hAB . Each such monomial is thus a composition of strongly conformally invariant operators. So EA1 A2 P Q is a sum of compositions of strongly conformally invariant operators. Just knowing that EA1 A2 P Q is a sum of compositions of conformally invariant operators of this form gives part (i). Then part (ii) is immediate from the fact that these are strongly invariant operators. Next we show part (iii). From Proposition 4.5 each term in the expression for is of degree at least 1 in WABCD . The latter vanishes on conformally flat structures. Thus, from (19), on such structures E is just 2k−4 . From Proposition 2.1, given a choice of flat metric from the conformal class, we have 2k−4 = k−2 .
Remarks . In regard to part (i) of the theorem we should point out that Sect. 4 not only establishes the existence of a formula for E which is a partial contraction polynomial in DA , WABCD , XA , hAB , and its inverse hAB , but describes an algorithm for finding such a formula.
Conformally Invariant Powers
353
It seems likely that the operator E in the theorem is formally self-adjoint. Note for example, if we write E ∗ for the formal adjoint of E, then from Proposition 2.2, the identity k (10), and the result that P2k is formally self-adjoint, we have (n − 2i)(i − 1) P2k = i=k−1 ∗ P Q D D on E[k − n/2]. D A D B EAB P Q Finally we should add that the terms which distinguish the P2k from the 2k do not vanish in general. At least we have verified by direct calculation that for f ∈ E[3 − n/2] (and n = 4) the leading term of D C D B WB S C T DS DT f is a non-zero scalar multiple of C a cde C bcde ∇a ∇b f . Thus P6 is not simply a scalar multiple of 6 . It is a non-trivial matter to know when conformally invariant operators have strongly invariant generalisations. Some do not. For example in dimension 4 we know there is a conformally invariant operator P4 : E → E[−4] with principal part 2 . Suppose there were a strongly invariant generalisation of this. Then, in particular, it would give a conformally invariant operator HA B : EB → EA [−4] with principal part 2 . (Here we mean the principal part as an operator between the reducible bundles indicated.) Then, in the case of Riemannian signature conformal 4-manifolds, using the ellipticity of this, (10), Proposition 2.2 and the differential operator existence results in the conformally flat setting (cf. [23]) we can conclude that we would have a conformally invariant operator D A HA B DB : E[1] → E[−5] with principal part 3 (on arbitrary conformal 4-manifolds). This contradicts C.R. Graham’s non-existence result [31], and so we can conclude the operator HA B does not exist. (See also [22].) However P4 does have a strongly invariant generalisation in all other dimensions. This is just 4 as a special case of (16). More generally, a consequence of part (ii) of the theorem is that the GJMS operators P2k admit strongly invariant generalisations except in the critical dimension n = 2k. That is, we have the following proposition on n-dimensional conformal manifolds: Corollary 2.6. For each integer k ≥ 1, with 2k < n if n is even, there is a (tractor) formula that gives, for each tractor bundle E , a formally self-adjoint differential operator
P2k : E [k − n/2] → E [−k − n/2],
where E [w] := E ⊗ E[w]. The operator has principal part k and can be expressed as a sum of 2k and a contraction polynomial in DA , WABCD , XA , hAB , and its inverse hAB . In the conformally flat case the operator is 2k . In the case that E = E then
=P . P2k 2k Proof. Since DA is strongly invariant and since also, from (ii) of the theorem, EAB P Q is strongly invariant, it follows that there is a conformally invariant operator (F :=
( ki=k−1 (n − 2i)(i − 1))−1 D A D B EAB P Q DP DQ ) : E [k − n/2] → E [−k − n/2] for any tractor bundle. By (10) this precisely recovers the GJMS operator P2k if E [k −n/2] is simply the density bundle E[k − n/2]. Now consider the formal adjoint F ∗ of F . This is another conformally invariant operator E [k −n/2] → E [−k −n/2] (where as usual we identify E with its dual via the tractor metric). Since P2k is formally self-adjoint, it is clear that, when applied to E[k − n/2], F ∗ also recovers the GJMS operator. Thus (F + F ∗ )/2 is the required formally self-adjoint operator. It is clear from Proposition 4.5 that we can express F by a formula which is a sum of 2k and a contraction polynomial in DA , WABCD , XA , hAB , and its inverse hAB . From that proposition we also have that each term in the latter polynomial expression is of degree at least 1 in WABCD . Using Proposition 2.2 and the formal self-adjoint property of 2k , we see that there is an expression for F ∗ as a sum of 2k and a contraction
354
A. R. Gover, L. J. Peterson
polynomial in DA , WABCD , XA , hAB , and its inverse hAB . Again each term in the latter polynomial is of degree at least 1 in WABCD . So the final part of the corollary follows from these observations and Proposition 2.1.
2.2. Conventional formulae. There are circumstances where it is useful to have explicit formulae for the GJMS-related operators and invariants in terms of the Levi-Civita connection and its curvature. These formulae are generally cumbersome. But the various curvature terms are closely related to the spectrum of the operator, so it is important to be able to extract these explicitly. In particular, for example, issues of positivity can be investigated directly in this setting. Moreover such formulae are ready to be mechanically rewritten in local coordinates should this be required. Here we will describe how to re-express tractor formulae for P2k f into formulae which are polynomial in g, its inverse, ∇ (meaning the Levi-Civita connection), C, P and J, and of course linear in f . For the most part, the process is simply an expansion of the tractor formulae using the definitions above. Consider the Paneitz operator P4 first. We observed in Proposition 2.3 that for f ∈ E[2 − n/2], −XA P4 f = DA f . So P4 f = −Y A DA f , and we could simply calculate this scalar quantity Y A DA f . In fact we prefer to expand the entire tractor valued quantity DA f using (8) and (9). According to its definition, DA lowers weight by 1. Thus DA f is given by n n (∇b ∇ b + (1 − )J)((4 − n)YA f + 2ZA a ∇a f − XA (∇c ∇ c + (2 − )J)f ). 2 2 Now we simply move the XA , YA and ZA a to the left of all other operators by repeated use of (6). This is easily done by hand and simplified via the Bianchi identity to yield −DA f = XA (2 f − (n − 2)Jf + 4 Pij ∇i ∇j f − (n − 6)(∇ i J)∇i f −
n−4 2 (J)f
+
n(n−4) 2 4 J f
− (n − 4)Pij Pij f ).
The coefficient of XA on the right-hand side is a formula for the Paneitz operator. Note that the coefficient of YA and the coefficient of ZA a both turned out to be zero. Of course this is exactly as predicted by our formula −XA P4 f = DA f , but it provides a very useful check of the formulae to verify this. So this is all there is to producing the required formula for P4 from the tractor formula. Before we continue with the general case let us just reorganise the result. For any linear differential operator on densities of the appropriate weight there is a canonical form for the formula which, among other features, manifests the symmetry in the formally self-adjoint and formally anti-self-adjoint parts [21]. As already observed, the Paneitz operator is formally self-adjoint. Applying this idea to the formula above yields ij kl
ij
P4 f = ∇i ∇j S4 ∇k ∇l f + ∇i S2 ∇j f + ij kl
Here S4
n−4 g 2 Q4,n f.
(20)
ij
and S2 are the tensors (1/3)(g il g j k + g ik g j l + g ij g kl ) and 2(n−8) ij 4−3n ij 3 g J− 3 P , g
respectively, and Q4,n denotes the scalar n 2 2J
− 2Pij Pij − J.
In (20), the ∇’s act on all tensors to their right within the given term.
(21)
Conformally Invariant Powers
355
We now discuss the general case. Explicit tractor formulae are readily produced by the algorithm described in Sect. 4, and so we shall suppose we are beginning with a formula for P2k as described in Proposition 4.5. The formulae for P6 and P8 above (see Proposition 2.3) give explicit examples that can be kept in mind. These formulae are polynomial in , DA , WABCD , XA , hAB , and its inverse hAB . We replace each of these with its formula in terms of the coupled tractor-Levi-Civita connection ∇, XA and so forth according to the formulae (8), (9), (11), and (13). In doing so we note that W has weight −2 and that D lowers the weight of a tractor by 1. Next we move all occurrences of XA , ZA a , and YA to the left of the ∇ via repeated use of (6). At the end of this process all tractor valued objects are to the left of the remaining ∇’s, and so at this point these ∇’s are simply Levi-Civita covariant derivative operators. Next we use the inner product rules of Fig. 1 to simplify the resulting expression. The formula for P2k f is then simply the overall coefficient of XA1 XA2 · · · XAk−1 . From Proposition 4.5 all other slots of the tractor expression vanish. That is, the sum of the terms that do not contain XA1 XA2 · · · XAk−1 is zero. Verifying this or even partly verifying this provides a very serious check of all formulae and any software that are used in the calculation. For example, one can verify that the sum of the terms containing ZA1 a XA2 · · · XAk−1 vanishes. This procedure is very simple. But there are many terms involved, as the next examples will illustrate. Thus it becomes very useful to be able to calculate via a suitable computer algebra system. For the examples below the authors used Mathematica and J. Lee’s Ricci program [39], which proved to be very effective. Certainly the P8 case is beyond a reasonable hand calculation. The use of software and the self-checking nature of the formulae as discussed above mean that one can be confident of the final result. As a technical point for these calculations, we describe a simple technique which can considerably reduce the computing time they require. One can implement this technique by developing a short computer programme. We begin by noting that certain steps in the computation may produce tractor inner products of the form B1 B2 ···B B1 B2 ···B , where the indices B1 , B2 ,· · ·, and B appear as subscripts or superscripts attached to the tractors Y , Z, and X. Suppose that is large and that the tractors B1 B2 ···B and
B1 B2 ···B are the sums of many terms. Suppose also that no derivatives of Y , Z, or X occur. The tractor B1 B2 ···B is a linear combination of the following 3 terms: YB1 YB2 · · · YB −1 YB , YB1 YB2 · · · YB −1 ZB b , YB1 YB2 · · · YB −1 XB , YB1 YB2 · · · ZB −1 b −1 YB , YB1 YB2 · · · ZB −1 b −1 ZB b , YB1 YB2 · · · ZB −1 b −1 XB .. . XB1 XB2 · · · XB −1 XB .
The coefficients of these terms may, of course, be very complicated. By raising indices we may write B1 B2 ···B as a similar linear combination. Each term of each linear combination may be paired off with at most one term in the other linear combination so as to give a nonzero inner product. We compute the 3 possible inner products and add the results. We conclude this section with the calculation of P6 and P8 via these methods, beginning with the tractor formulae indicated in Proposition 2.3. As a check, the authors
356
A. R. Gover, L. J. Peterson
verified the vanishing of the overall coefficient of the ZB i XC term in the expansion of (17). In a similar fashion, they also verified the vanishing of the overall coefficients of the XB ZC i and ZA i XB XC terms in the expansions for P6 and P8 , respectively. This involved the use of the Bianchi identities, tensor symmetries, and changes in the order in which covariant derivatives are taken. The authors also manipulated the resulting formulae for the GJMS operators into the canonical form suggested in [21]. Here are the results: ij klmp
P6 = ∇i ∇j ∇k T6
ij kl
ij
∇l ∇m ∇p f + ∇i ∇j T4 ∇k ∇l f + ∇i T2 ∇j f
g
+ n−6 2 Q6,n f. ij klmp
Here T6
ij kl
and T4
are the symmetrizations of the tensors g ij g kl g mp and 2 − 3n ij kl Jg g + (20 − 2n)P ij g kl , 2
ij
respectively, and T2 is the tensor 88 − 86 n + n2 i j k 2 2176 − 768 n + 82 n2 + n3 i j k − P |k − P kP 15 (n − 4) 15 (n − 4) 2 320 − 218 n + 27 n2 i j (3 n − 2) (5 n − 54) i j − g Pk l Pk l + P J 15 (n − 4) 15 744 − 250 n + 31 n2 i j −164 − 120 n + 45 n2 i j 2 2 (5 n − 22) i j k g J − g J|k − J| 60 15 15 (n − 4) 2 296 − 26 n + 3 n2 4 + (22) Pk l C i k j l − C i k l m C j k l m . 15 (n − 4) 15 +
Here and below, for typesetting convenience, we write Pi j |k k as an alternative notation g for ∇ k ∇k Pij and so forth. Finally, Q6,n denotes the scalar 8 (n − 2) 64 − 8Pi j |k Pi j | k − Pi j Pi j |k k + P i j Pi k Pj k n−4 n−4 4 −4 − 4 n + n2 (n − 2) (n + 2) 3 J + (n − 6) J|i J| i + P i j Pi j J − 4 n−4 3n − 2 8 (n − 6) 32 JJ|i i − Pi j J| i j − J|i i j j − P i j Pk l C i k j l . + 2 n−4 n−4
(23)
We find that P8 f is given by ij klmpqr
∇i ∇j ∇k ∇l U8
ij klmp
∇ m ∇ p ∇ q ∇ r f + ∇ i ∇ j ∇ k U6
ij kl
ij
+ ∇i ∇j U4 ∇k ∇l f + ∇i U2 ∇j f + ij klmpqr
In this formula, U8 g mp g qr and
ij klmp
and U6
n−8 g 2 Q8,n f.
denote the symmetrizations of the tensors g ij g kl
−2nJg ij g kl g mp − 4(n − 12)g ij g kl Pmp , ij kl
respectively, and U4
∇l ∇m ∇p f
denotes the symmetrization of the tensor
Conformally Invariant Powers
357
−8 (−12 + n) i j k l 4 (−12 + n) (−64 + 5 n) i j k l 24 n i j k l m P | + P P + g P |m 5 15 −4 + n 16 1536 − 530 n + 59 n2 i j k l m 2 480 − 568 n + 67 n2 i j k l g P mP − g g Pmp Pmp − 15 (−4 + n) 15 (−4 + n) 4 n (−64 + 5 n) i j k l −96 − 20 n + 15 n2 i j k l 2 24 − 5 n i j k l m g P J+ g g J + g g J|m + 5 5 10 2 2 8 120 − 31 n + 4 n 16 192 − 7 n + n − g i j J| k l + g i j Pmp C k ml p 5 (−4 + n) 15 (−4 + n) 16 32 − C i m j p C k ml p − g i j C k mpq C l pmq . 15 15 ij
We let U2 denote the symmetrization of the tensor Eij + Fij + Gij , where Eij , Fij , and g Gij are as given in Figs. 2, 3, and 4. Finally, Q8,n denotes the scalar given in Fig. 5. 2.3. Branson’s Q-curvature. We have used P2k to indicate a conformally invariant operator between densities, P2k : E[k −n/2] → E[−k −n/2]. Suppose we choose a metric from the conformal class.Then we can trivialise these density bundles, and so P2k gives g an operator P2k between functions on the Riemannian (or pseudo-Riemannian) structure given by the choice of g. If we write (ξ g )w for the operator given by multiplication by
−4 −19200+8468 n−980 n2 +27 n3
− − + +
45 (−6+n) (−4+n)
315 (−4+n)
4 2592−5262 n+625 n2 +14 n3
g i j Pk l Pk l |m m
4 74928−21908 n−968 n2 +283 n3
g i j Pk l |m Pk l | m
315 (−6+n) (−4+n)
g i j Pk l |m Pk m | l
8 311616−146460 n+27484 n2 −2167 n3 +5 n4 +7 n5 315 (−6+n) (−4+n)
2 −44928−10224 n+14400 n2 −3206 n3 +203 n4
45 (−6+n) (−4+n)
g i j Pk l Pk m Pl m
g i j Pk l Pk l J
n2 −105 n3 g i j J3 + 14820−3650 n+231 n2 g i j J J k + −2560+1568 n+420 |k | 210 315
+ +
4 −6−31 n+5 n2
15
g i j JJ|k k
2 480384−238464 n+44040 n2 −3346 n3 +91 n4 315 (−6+n) (−4+n)
n) i j g i j Pk l J| k l − 3 (−48+7 g J|k k l l 35
n) i j + 8 (188+21 g Pk l |mp C k ml p 315
−
8 79488−20160 n+1020 n2 +50 n3 +7 n4 315 (−6+n) (−4+n)
ij k ml p g i j Pk l Pmp C k ml p + 16 45 g JC k l mp C
ij k mp q l + 16 (−4+n) g i j P C k l p mq − 16 mp q C | kl 45 g C k l mp |q C 45 ij k m l q p r − 16 g i j C k m Cl q p r − 16 qr k l mp C 45 g C k l mp C q r C 45
Fig. 2. The tensor Eij
358
A. R. Gover, L. J. Peterson 8 63648−43740 n+8300 n2 −405 n3 +37 n4
− − −
315 (−6+n) (−4+n)
2 −1728+1164 n−292 n2 +n3
105 (−6+n) (−4+n)
Pi j
k l |k l
Pk l Pi j | k l +
16 −57312+18900 n−1240 n2 −5 n3 +27 n4
8 −11108+5601 n−447 n2 +14 n3 315 (−4+n)
Pk l | j Pi k | l
Pk l Pi k | j l
315 (−6+n) (−4+n)
4 −23040+98136 n−18188 n2 +134 n3 +93 n4
Pi k |l l Pj k 315 (−6+n) (−4+n) 2 4 −7754−1224 n+29 n − Pi k |l Pj k | l 315 8 589824−282624 n+54804 n2 −4472 n3 +17 n4 +6 n5 + Pk l Pi k Pj l 105 (−6+n) (−4+n) 4 −51384+36938 n−9529 n2 +662 n3 +93 n4 − Pi k |l Pj l | k 315 (−6+n) (−4+n) 2 3 4 4 47232−112776 n+38488 n −4444 n +165 n + Pk l Pi j Pk l 105 (−6+n) (−4+n) 2 −53720+22678 n−3194 n2 +63 n3 + Pk l | i Pk l | j 315 (−4+n) 4 184896−132840 n+30900 n2 −2390 n3 +59 n4 + Pk l Pk l | i j 315 (−6+n) (−4+n) 2 3 4 2 −2880−37008 n+17564 n −2432 n +21 n + Pi j |k k J 315 (−6+n) (−4+n) 4 −33408−425736 n+228064 n2 −37810 n3 +2094 n4 +21 n5 Pi k Pj k J + 315 (−6+n) (−4+n)
n2 −105 n3 Pi j J2 + 9336−8206 n+875 n2 J i J j + −9216−1264 n+1568 | | 105 315
+ − + + −
8 −45432+19134 n−2965 n2 +138 n3
315 (−6+n)
2 24768−1404 n−2584 n2 +215 n3
315 (−6+n)
Pi k | j J| k
Pi j
|k J|
k
+
2 −40320−77904 n+52144 n2 −9826 n3 +651 n4
315 (−6+n) (−4+n)
4 6108−1487 n+70 n2 315
JJ| i j
4 934848−481176 n+106800 n2 −11644 n3 +457 n4 315 (−6+n) (−4+n) 4 −15984+8490 n−1411 n2 +85 n3
105 (−6+n) (−4+n)
Pi j J|k k
Pi k J| j k
J| i j k k
Fig. 3. The tensor Fij
g
(ξ g )w ∈ E[w], then P2k f = (ξ g )k+n/2 P2k (ξ g )k−n/2k f . The GJMS operators as disg cussed, for example, in [5] are the P2k . In this form the operators are not invariant but rather covariant (see below), and conformally invariant operators are often discussed g entirely in this setting. For many purposes the difference between P2k and P2k is rather small. In particular, since the Levi-Civita connection corresponding to g annihilates ξ g , g the formulae above for the P2k also serve as formulae for the operators P2k . From Theg orem 2.5 and the formulae (8), (9),(11) and (13), there is a universal expression for P2k which is polynomial in g, g −1 , C, P, ∇, and the ∇ covariant derivatives of C and P, and the coefficients in this universal expression are real rational functions of the dimension n which are regular for all odd n and all n ≥ 2k.
Conformally Invariant Powers
359
8 −3984+4492 n−296 n2 +13 n3
− − + − + +
315 (−6+n) (−4+n)
Pk l |m m C i k j l
4 33984−21624 n+4540 n2 −346 n3 +21 n4
105 (−6+n) (−4+n)
8 −19392+14302 n−2403 n2 +128 n3
315 (−6+n) (−4+n)
4 −33408+13680 n−1082 n2 +95 n3
ikj m lp Pk l |m C i k j m | l − 80 | 63 C k l mp C
315 (−6+n) (−4+n) 63 (−6+n) (−4+n)
4 69168−25940 n+1472 n2 +15 n3
315 (−6+n) (−4+n)
Pk l JC i k j l
J|k l C i k j l
16 −52992+13248 n−472 n2 −20 n3 +3 n4
Pk l Pj m C i k l m
Pk l |m C i k l m | j
8 17088−16264 n+652 n2 +94 n3 +5 n4
315 (−6+n) (−4+n)
n) Pk l Pk m C i l j m + 128 (31+2 Pk l | i m C j k l m 315
8 Ci j k l p m + 16 (−7+3 n) JC i j l k m − 68 C i j lkp m − 105 | | k l m|p C k l mC k l m|p C 63 45 n) i + 32 (231+2 P k |l m C j mk l + 315 ikl j mp q − + 64 35 C k l mp C q C
8 8016−1576 n−506 n2 +21 n3
315 (−6+n) (−4+n) 16 22200−7798 n+215 n2 +8 n3 315 (−6+n) (−4+n)
Pk l C i m k p C j ml p Pk l C i m k p C j p l m
i k m C j p l q − 32 Pi j C k ml p + 176 q k l mp C 105C k l mp C 45
−
16 2692−143 n+25 n2 315 (−4+n)
n) i Pk l C i m j p C k ml p − 16 (−33+20 P k C j l mp C k ml p 315
i j k l mp q − 32 45 C k l mp C q C
Fig. 4. The tensor Gij
˜ 2k,n be the local invariant P 1 on a conformal n-manifold. Since P is forLet Q 2k 2k g,1 g ˜ 2k,n , mally self-adjoint (FSA) it is clear we can write it in the form P2k = P2k + Q g,1 g g where P2k has the form δS2k d with δ the formal adjoint of d and S2k an order 2k − 2 differential operator. By setting w = 0 in the formulae (8) and (9) we see that, as an operator on E, DA factors through the exterior derivative d. At least this is true given a choice of metric g from g the conformal class. Thus in dimension n0 = 2k it is clear from Theorem 2.5 that Pn0 is ˜ 2k,n vanishes in dimension 2k = n0 . Using also a composition with d. Thus the term Q ˜ 2k,n = this and a careful use of classical invariant theory one can conclude that in fact Q g g g n−2k g Q . In the previous section we gave explicit formulae for Q , Q and Q8,n . 4,n 6,n 2k,n 2 g Clearly Q2k,n is also given by a formula rational in n and regular at n = n0 := 2k. g g In dimension n0 , Qn0 := Qn0 ,n0 is by definition (modulo a sign (−1)k ) Branson’s g Q-curvature, and for compact conformal n0 -manifolds, M Qn0 is a global conformal invariant. To see this, observe that the conformal invariance of P2k is equivalent to the covariance law g
n+2k 2
g
gˆ
g
P2k = P2k
n−2k 2
,
where gˆ = 2 g and we regard the powers of as multiplication operators. Applying n+2k n−2k gˆ n−2k g 2 Q 2 both sides to the constant function 1 we obtain n−2k + 2k = 2 Q2k 2
360
A. R. Gover, L. J. Peterson −12 8−4 n+n2 (−4+n)2
n) Pi j |k k Pi j |l l − 24Pi j |k l Pi j | k l − 48 (−18+7 Pi j Pk l Pi j | k l −6+n
(−2+n) (−2+n) 192 n P P Pi k j l − 48−4+n Pi j |k Pi j | k l l − 12−6+n Pi j Pi j |k k l l + −6+n | ij kl 48 −32+44 n−12 n2 +n3
384 (−16+3 n) i jk l Pi j Pk l | i Pj k | l + (−6+n) (−6+n) (−4+n) (−4+n) Pi j P k |l P | 384 4−6 n+n2 192 144−32 n−4 n2 +n3 + P Pi Pj k |l l + Pi j Pk l Pi k Pj l (−6+n) (−4+n)2 i j k (−6+n) (−4+n)2 48 −32+20 n−8 n2 +n3 + Pi j Pi k |l Pj l | k (−6+n) (−4+n) 6 (−8+n) 256−32 n−18 n2 +3 n3 + Pi j Pk l Pi j Pk l (−6+n ) (−4+n)2 24 −176+80 n−14 n2 +n3 4 48+80 n−46 n2 +5 n3 k l j i + Pi j Pk l | P | + Pi j |k Pi j | k J (−6+n) (−4+n) (−6+n) (−4+n) 4 12−20 n+5 n2 384 n P Pi Pj k J + Pi j Pi j |k k J − −6+n ij k −6+n
−
n2 +5 n3 P Pi j J2 + (−4+n) n (4+n) J4 − 96−80 n−30 ij −6+n 8
(−2+n) 200−38 n+3 n2 2 i + 2 + 13 n − 2 n JJ|i J| − Pi j J| i J| j −4+n 24 (−2+n) 88−20 n+n2 16 −48+70 n−17 n2 +n3 i jJ k + + P P Pi j Pi j |k J| k ij k| | (−6+n) (−4+n) (−6+n) (−4+n) 2 32−40 n+5 n2 n2 J2 J i + −8+3 n J i J j + + 16+4 n−3 Pi j Pi j J|k k |i |i |j 2 2 −4+n 2 16 4−8 n+n n) − Pi j |k k J| i j + 4 n (−46+5 Pi j JJ| i j −6+n (−4+n)2 2 −336+136 n−20 n2 +n3 24 1088−608 n+144 n2 −18 n3 +n4 i j + J|i j J| + Pi j Pi k J| j k (−4+n)2 (−6+n) (−4+n)2
(−7+n) + (−38 + 5 n) J|i J| i j j − 32−4+n Pi j |k J| i j k + 2 (−1 + n) JJ|i i j j n) ij k − 4 (−54+7 −6+n Pi j J| k
− J|i i j j k k
96 n P P JC i k j l − + −6+n ij kl
192 4−6 n+n2
− P P (−6+n) (−4+n)2 i j k l |m 2 24 (−8+n) −16−2 n+n Pi j J|k l C i k j l (−6+n) (−4+n)2
mCi k j l
384 (−5+n) i k j l m − 192 P k ilj m − (−6+n) | −4+n i j |k Pl m| C (−4+n) Pi j Pk l |m C
+
192 112−16 n−6 n2 +n3 (−6+n) (−4+n)2
96 n P i j lkm Pi j Pk l Pi m C j k l m − −4+n i j |k P l |m C
192 P P C i k C j ml p − 384 P P C i k C j p l m + −6+n ij kl m p −6+n i j k l m p i j k ml p + 192 (−6+n) 2 Pi j Pk l C m p C (−4+n)
g
Fig. 5. The invariant Q8,n g,1
n−2k
g
P2k 2 . Expanding this out yields a universal transformation formula for Q2k . Since g,1 P2k is a composition with d it is clear that we can divide this formula by n−2k 2 . Then in dimension n0 = 2k we obtain gˆ
g
g
n0 Q2k = Q2k + δS2k dϒ,
Conformally Invariant Powers
361
where ϒ = log . Then, if we denote by g the volume density with a metric associated g g, we have gˆ = n0 g , and the conformal invariance of M Q2k is clear. Recall that a choice of metric g determines a canonical section ξ g of E[1] by (ξ g )−2 g = g. It is g g g convenient to redefine Q2k to be (ξ g )−n0 times Q2k as above. Then Q2k is valued in E[−n] and the transformation law simplifies to gˆ
g
g
Q2k = Q2k + δS2k dϒ,
(24) gˆ
g
where δ and S2k are now also density valued. Note that we can also write this as Q2k = g g,1 g Q2k + P2k ϒ as P2k agrees with P2k in dimension 2k. g The discussion above for Qn0 and its properties are a minor adaption of the arguments presented in Branson’s [5]. It is clear that given explicit formulae for the P2k as in the previous section we can extract a formula for the Q-curvature as follows: Take the order g 0 part of the formula, divide by (n − 2k)/2 and then set n = 2k. For example Q8 is obtained by setting n = 8 in the formula given in Fig. 5. In [5] it is also shown that the global invariant is not trivial. In fact, it is established there that, on conformally flat g g structures, Qn0 is given by a multiple of the Pfaffian plus a divergence, and so M Qn0 is a multiple of the Euler characteristic χ (M). One of the keys to the importance of Q2k is the remarkable transformation formula (24). We will describe a new definition and construction for Q2k and proof of this transformation formula. This leads to a direct formula for Q2k . Here to prove the transformation law we will use a dimensional continuation argument. (This plays a minor role and can in fact can be replaced by a direct proof [8]). The construction is then adapted to proliferate other curvature quantities with transformation formulae of the general form (24), and in these cases dimensional continuation is not used at all. See Proposition 2.8. (Since the original writing of this Fefferman and Graham [27] have given another alternative construction and generalisation of the Q-curvature which involves the Poincar´e metric.) We work on a conformal manifold of dimension n0 = 2k. For a choice of metric g g g from the conformal structure let IA be the section of EA [−1] defined by IA := (n−2)YA − JXA , where, recall, YA ∈ EA [−1] gives the splitting of the tractor bundle corresponding g to the metric g (as in Sect. 2). We can write this as a triple [IA ]g = ((n − 2), 0, − J). gˆ ˆ A or According to this definition, if gˆ = 2 g then we have I := (n − 2)YˆA − JX A
gˆ ˆ In terms of the splitting determined by g, I gˆ is given by [IA ]gˆ = ((n − 2), 0, − J). A gˆ [I ]g = ((n − 2), −(n − 2)ϒa , − Jˆ − (n/2 − 1)ϒ b ϒb ). By (2) and ϒa = ∇a ϒ this A
gˆ
becomes [IA ]g = ((n − 2), −(n − 2)∇a ϒ, − J + ϒ), and so gˆ
g
IA = IA − DA ϒ.
(25)
This observation is due to Eastwood who also pointed out [20] that on conformally g flat structures this yields Branson’s curvature as follows. For each metric define QB by g
gˆ
g
g
QB := 2k−2 IB . Then, by (25), QB = QB − 2k−2 DB ϒ. Now since the structure is conformally flat, 2k−2 DA ϒ = −XA P2k ϒ (see Theorem 2.5 or e.g. [28]). Thus we have gˆ
g
QB = QB + XB P2k ϒ.
362
A. R. Gover, L. J. Peterson gˆ
g
It follows that X B QB = X B QB is a conformal invariant of weight n0 − 2. On a conformally flat structure there are no conformal invariants of the structure and so this g vanishes. Since this vanishes, Z Bc QB is also conformally invariant and so must vanish. g This shows that for any conformally flat metric, QB = XB Qg for some Riemannian invariant Qg and also that Qgˆ = Qg + P2k ϒ. That is, it transforms according to (24). On conformally flat structures one can always locally choose a metric that is flat whence all Riemannian invariants vanish. Using this we deduce that Qg is Branson’s curvature, g that is Qg = Q2k . Via the theorem we can generalise Eastwood’s cunning construction to the curved case. Note (25) holds on any conformal manifold. Let us define the operator FC B : EB [k −1−n/2] → EC [1−k −n/2] by FC B := (k −2)−1 (n−2k +2)−1 D K ECK AB DA , where E is the operator defined in Theorem 2.5. Now on a dimension n0 = 2k manifold g g we simply define QC := FC B IB . From (25) and the theorem we have immediately gˆ
g
QC = QC + XC P2k ϒ.
(26) g
g
It remains to verify that for any metric g in the conformal class, QC is indeed XC Q2k . 1f + In any dimension n and given any metric g, if f ∈ E[w] let us write DA f = DA 0 f , where D 1 f := (n + 2w − 2)Z a ∇ f − X f and wD 0 f is the remaining wDA A a A A A 0 f = (n+2w−2)Y f −X Jf . Let w = k−n/2, and assume order zero part. That is, DA A A n is odd or n ≤ 2k. Then −XA P2k f = FA B DB f = FA B DB1 f + wFA B DB0 f . Let ξ g be the section of E[1] corresponding to g. Recall that ξ g is parallel for the Levi-Civita connection of g. Since (ξ g )w is a section of E[w], we have g
XA wQ2k,n (ξ g )w = wFA B DB0 (ξ g )w . Now XA FA B DB0 (ξ g )w can be expressed as a universal expression which is polynomial in g, g −1 , C, P, ∇, and the ∇ covariant derivatives of C and P. The coefficients in this universal expression are real rational functions of the dimension n which are regular for all odd n and all n ≥ 2k. Furthermore, from the left-hand-side of the display, this expression vanishes in even dimensions n < 2k and for all odd dimensions. Thus it must vanish in dimension n0 = 2k. Similarly we can conclude Z Aa FA B DB0 (ξ g )w vanishes g if n is odd or n ≤ 2k. Thus in dimension n0 , w = 0, DB0 (ξ g )w = DB0 1 = IB , and as g g Q2k := Qn0 ,n0 , we have g
g
g
QA = FA B IB = XA Q2k . Thus we have the following. g
g
Proposition 2.7. Y A FA B IB is a formula for Branson’s Q-curvature Q2k . Note that if we take this as a definition for Q then the transformation property (24) arises from (26) which in turn is an immediate consequence of (25). The formula itself is direct and requires no dimensional continuation. The only subtlety in the construcg g tion was in establishing that QA has the form XA Q2k . We employed a dimensional
Conformally Invariant Powers
363
continuation argument to establish this above, but it turns out that there is an elementary g direct proof using the ambient construction [8]. Finally note that if we write IgA = hAB IB , g g then IgA FA B IB = (n − 2)Q2k . g It is straightforward to convert this tractor formula for Q2k into a formula in terms g of ∇, C, P, the metric, and its inverse. We simply expand Y A FA B IB using the forB mula for FA as a partial contraction polynomial in DA , WABCD , XA , hAB , and its inverse hAB , as obtained from Proposition 4.5, and apply (8), (9), (11), and (13) along the same lines as the calculations in Sect. 2.2. In fact, as a means of checking against formulae or calculational errors it is prudent to calculate the entire tractor valued g expression FA B IB and verify from this that only the bottom slot is not zero. That g g g is that X A FA B IB = 0 = Z Aa FA B IB . Doing this for Q4 we obtain the known formula Q4 = −2Pi j Pi j + 2J2 − J|j j . In terms of the Ricci curvature Rc and the scalar curvature Sc, this becomes Q4 = − 21 Rci j Rci j + 16 Sc2 − 16 Sc|i i . For Q6 the formulae are more severely tested by calcug g g g 2 lating EAB CE DC IE . For this case EAB CE DC IE = DA IB + n−4 WA C B E DC IE and the calculation verifies all components vanish except for the coefficient of XA XB , the negative of which is g
Q6 = −(8Pi j |k Pi j | k + 16Pi j Pi j |k k − 32Pi j Pi k Pj k − 16Pi j Pi j J +8J3 − 8J|k k J + J|j j k k + 16Pi j Pk l Ci k j l ). Note that these examples agree with setting n = 4 in (21) and n = 6 in (23). g Using IA it is easy to construct examples of other functionals of the metric that have transformation laws of the same form as (24). We state this as a proposition. Proposition 2.8. In dimension n0 = 2k, for each natural conformally invariant operag tor GA B : EB [−1] → EA [1 − n0 ] there is a Riemannian invariant D A GA B IB with a conformal transformation of the form gˆ
g
g
D A GA B IB = D A GA B IB + δT2k dϒ, g
where T2k is a Riemannian invariant differential operator such that the composition g δT2k d is a conformally invariant operator between functions and densities of weight −n. g If GA B is formally self-adjoint, then δT2k d is formally self-adjoint. gˆ
g
Proof. It is clear from (25) that D A GA B IB = D A GA B IB − D A GA B DB ϒ. Note that D A GA B DB is a composition of conformally invariant operators. Since ϒ is a function (i.e. is a density of weight 0), DB ϒ factors through dϒ. From Proposition 2.2 it follows that the formal adjoint of D A GA B DB also factors through the exterior derivative d. Thus g the conformally invariant operator D A GA B DB has the form −δT2k d. g Since δT2k dϒ = −D A GA B DB ϒ, the last part of the proposition is immediate from Proposition 2.2.
An example in dimension 6 is to take GA B to be the order zero operator |C|2 δA B , g gˆ where |C|2 = C abcd Cabcd . Then D A GA B IB = −4|C|2 , and D A GA B IB = −4|C|2 + a 2 16∇ |C| ∇a ϒ. We can easily make many other examples via the tractor objects already
364
A. R. Gover, L. J. Peterson
seen above. Other examples in dimension 6 are to take GA B to be WACDE W BCDE or B DP W ECDQ D D E WA B E F DF . In dimension 8 we could take GA B to be δA P CDE W Q and so on. Note all these examples have G formally self-adjoint. g It is a trivial exercise to verify that D A GA B IB is always a divergence, and so none of the invariants from the proposition yield non-trivial global invariants. Thus we could adg just the definition of Q2k by adding such functions without affecting it as a representative th of n0 de Rham cohomology and also without affecting the form of the transformation g law (24). Such changes would of course alter what we meant by S2k , but in any case g δS2k d would remain an invariant operator on functions. Such potential modifications are important from several points of view. The transformation law (24) is satisfied in dimension 2 by the scalar curvature, or more precisely by −Sc/2. In this context it is usually called the Gauss curvature prescription equation. As mentioned earlier, Q2k lends itself to higher dimensional analogues of this curvature prescription problem. For g g g the same reason, in any case where D A GA B IB is non-trivial, Q2k + D A GA B IB yields a distinct, and apparently equally natural, curvature prescription problem. Of course then g g Q2k + D A GA B IB does not arise by Branson’s construction from the GJMS operator P2k . But it is easily verified that it does arise via Branson’s argument applied to the conformally invariant operator P2k := P2k − D A GA B DB : E[k − n/2] → E[−k − n/2],
and according to either construction the conformal transformation formula in dimension n0 = 2k is gˆ
gˆ
Q2k + D A GA B IB = Q2k + D A GA B IB + P2k ϒ. g
g
It is possible, for example, that there are settings where such natural modifications to the GJMS operators will yield operators which are positive but the relevant GJMS operator fails to be positive. 3. The Ambient Metric Construction The ambient metric construction of Fefferman-Graham associates to a conformal manifold M of signature (p, q) a pseudo-Riemannian so-called ambient manifold M˜ of signature (p + 1, q + 1). The ambient manifold M˜ is Q × I , where I = (−1, 1). Hence˜ forth we identify Q with its natural inclusion ι : Q → M˜ given by Q q → (q, 0) ∈ M. Observe that Q carries a tautological symmetric 2-tensor g0 given by g0 = π ∗ g at the point (p, g) ∈ Q. This satisfies δs∗ g0 = s 2 g0 , where δs is the natural R+ -action on Q given by δs (p, g) = (p, s 2 g). We will also write δs for natural extension of this action to M˜ and denote by X the infinitesimal generator of this, i.e., for a smooth function f ˜ Xf (q) = d f (δs q)|s=1 . The metric on the ambient manifold M˜ will be denoted on M, ds h and is required to be a homogeneous extension of g0 in the sense that ι∗ h = g0
δs∗ h = s 2 h for s > 0.
(27)
The idea of the Fefferman-Graham construction is to attempt to find a formal power series solution along Q for the Cauchy problem of an ambient metric h satisfying (27) and the condition that it be Ricci-flat, i.e. Ric(h) = 0. It turns out that only a weaker curvature condition can be satisfied in the even dimensional case. The main results we
Conformally Invariant Powers
365
need are contained in Theorem 2.1 of [26]: If n is odd then, up to a R+ -equivariant diffeomorphism fixing Q, there is a unique power series solution for h satisfying (27) and Ric(h) = 0. If n is even then, up to a R+ -equivariant diffeomorphism fixing Q and the addition of terms vanishing to order n/2, there is a unique power series solution for h satisfying (27) and such that, along Q, Ric(h) vanishes to order n/2 − 2 and that the tangential components of Ric(h) vanish to order n/2 − 1. We should point out that we only use the existence part of the Fefferman-Graham construction. The uniqueness of the GJMS operators, the covariant derivatives of the ambient curvature and so forth are a consequence of the existence of tractor formulae for these objects. By choosing a metric g from the conformal class on M we determine a fibre variable on Q by writing a general point of Q in the form (p, t 2 g(p)), where p ∈ M and t > 0. Local coordinates x i on M then correspond to coordinates (t, x i ) on Q. These extend ˜ where ρ is a defining function for Q and such [26, 32] to coordinates (t, x i , ρ) on M, i that the curves ρ → (t, x , ρ) are geodesics for h. In these coordinates the ambient metric takes the form h = t 2 gij (x, ρ)dx i dx j + 2ρdtdt + 2tdtdρ.
(28)
This form is forced to all orders in odd dimensions. In even dimensions it is forced up to the addition of terms vanishing to order n/2. In order, in even dimensions, to recover the order n GJMS operators via the procedure of [32] we need also to assume that the metric has this form up to the addition of terms vanishing to order n/2 + 1. Although we only need this form to that order, to simplify our discussion we will assume that the form (28) holds to all orders in even dimensions too. This simply involves some choice of extension for the Taylor series of the components gij , and then with this assumption the identities discussed in the remainder of this subsection hold to all orders in all dimensions. We write ∇ for the ambient Levi-Civita connection determined by h. In terms of the coordinates one has X = t ∂t∂ , and if we let Q := h(X, X), then Q = 2ρt 2 and is a defining function for Q. In terms of this we have that, when n is even, the ambient construction determines h up to O(Qn/2 ). Let us use upper case abstract indi˜ For example, if v B is a vector field on M, ˜ then the ambient ces A, B, . . . for tensors on M. C Riemann tensor will be denoted R AB D and defined by [∇A , ∇B ]v C = R AB C D v D . Indices will be raised and lowered using the ambient metric hAB and its inverse hAB in the usual way. We will soon see that this index convention is consistent with our use of these indices for tractor bundles. The homogeneity property of h in (27) means that X is a conformal Killing vector, and in particular LX h = 2h, where L is the Lie derivative. It follows that ∇(A X B) = hAB . On the other hand, from the explicit coordinate form of the metric, we have that ∇B Q = 2XB , and so ∇A X B is symmetric. Thus ∇A X B = hAB which, in turn, implies X A R ABCD = 0.
(29)
In terms of our notation the theorem of [26] (mentioned above) means that in even dimensions the ambient Ricci curvature R BF can be written in the form R BF = Qn/2−2 X (B K F ) + Qn/2−1 LBF
366
A. R. Gover, L. J. Peterson
for appropriately homogeneous ambient tensors K F and LBF . In fact the choice to extend the metric h so that it has the form (28) restricts K A significantly. From (29) we have that XA R AC vanishes to all orders. With the contracted Bianchi identity 2∇A R AC = ∇C S (where S denotes the ambient Ricci scalar curvature) this implies that K A = XA K for an ambient homogeneous function K. Although it is not strictly necessary, it will simplify our subsequent calculations to restrict the ambient metric a little more. An elementary calculation verifies that we can adjust the components gij in (28) so that K = O(Q). Thus finally we have that in even dimensions the metric has the form (28) and R BF = Qn/2−1 LBF
(30)
for an appropriately homogeneous ambient tensor LBF . (The authors are appreciative ˘ and C.R. Graham in relation to this point.) of discussions with A. Cap 3.1. Recovering tractor calculus. Recall that a section of E[w] corresponds to a realvalued function f on Q with the homogeneity property f (p, s 2 g) = s w f (p, g), where p ∈ M and g is a metric from the conformal class [g]. Let E˜Q (w) denote the space of smooth functions on Q which are homogeneous of degree w in this way. We write ˜ ˜ E(w) for the smooth functions on M˜ which are similarly homogeneous, i.e. f˜ ∈ E(w) A means X ∇A f = w f˜. The construction of the GJMS operators in [32] exploits this ˜ relationship between E[w] and E(w). We will use here the analogous idea at the level ˜ This is developed more fully in [12], and here we just summarise the of tensors on M. basic ideas needed presently. Writing δs for the derivative of the action δs , let us define an equivalence relation on the ambient tangent bundle by Uq1 ∼ Vq2 if and only if there is s ∈ R+ such that Vq2 = s −1 δs Uq1 . Corresponding to this we have the equivalence relation on M˜ by ˜ ∼ q1 ∼ q2 if and only if q2 = δs q1 . It is straightforward to verify that the space T M/ ˜ is a rank n + 2 vector bundle over M/ ∼. Sections of this bundle correspond to smooth sections V : M˜ → T M˜ with the homogeneity property V (δs p) = s −1 δs V (p), or they could be alternatively characterised by their commutator with the Euler field X, A (0)) denote the space of sections of T M ˜ (T M| ˜ Q) [X, V ] = −V . We will let E˜ A (0) (E˜Q AB AB which are homogeneous in this way, and we will write E˜ (w) (E˜Q (w)) to mean A ⊗ E˜ B ⊗ E˜ (w) respectively) and so forth. (The reason for the ˜ E˜ A ⊗ E˜ B ⊗ E(w) (E˜Q Q Q weight convention will soon be obvious.) We will write E˜ (w) to mean an arbitrary ˜ tensor power of E˜ A (0) (or symmetrization thereof and so forth) tensored with E(w) and
˜ we will say sections of E (w) are tensors homogeneous of weight w. (We use the term “weight” here to distinguish from the homogeneity “degree” [12] as exposed by the Lie derivative along the field X.) Of course this construction is formal at the same order as ˜ but upon restriction to Q, T M/ ˜ ∼ yields a genuine rank n + 2 the construction of M, vector bundle over M = Q/ ∼ that will be denoted by T or T A . It is immediate from the homogeneity property of h that if U and V are sections of ˜ E˜ A (0), then the function hAB U A V B is in E(0). Restricting to Q we see that hAB U A V B descends to a function on M. From the bilinearity and signature of h it follows that h descends to give a signature (p + 1, q + 1) metric hT on the bundle T . We can use this to raise and lower indices in the usual way. ˜ Observe that XA ∈ E˜ A (1). Thus if ϕ ∈ E(−1), then ϕX A ∈ E˜ A (0). The same is true upon restriction to Q, so we have a canonical inclusion E[−1] → T with image
Conformally Invariant Powers
367
denoted by T 1 . We write XTA for the natural section of T A [1] := T A ⊗ E[1] giving this map, and so on Q, XA is the homogeneous section representing XTA . Clearly then V A → hTAB XTA V B determines a canonical homomorphism T → E[1], and we let T 0 denote the kernel. Recall that Q was defined to be hAB X A X B and that this was a defining function for Q. Thus XTA is a null vector for the metric hT , and it follows immediately that T 1 ⊂ T 0 . There is a simple geometric interpretation of T 0 and T 1 . Observe that A (1) that are annihilated by contraction with X . T 0 [1] corresponds to sections of E˜Q A 1 A (1) corresponding to On Q we have that XA = 2 ∇A Q, so along Q the sections of E˜Q ˜ Q and which are invariant under T 0 [1] are precisely those taking values in T Q ⊂ T M| the action of δs . Then, since X is the Euler vector field, it follows that T 1 [1] corresponds A (1) taking values in the vertical subbundle of T Q. Of course the map to functions in E˜Q Q → M is a submersion, and so T 0 [1]/T 1 [1] is naturally isomorphic to E a = T M. Tensoring by E[−1] we have T 0 /T 1 ∼ = E a [−1], and we can summarise the filtration of T by the composition series a + E [−1] + T = E[1] E[−1]. It is now straightforward to observe that the ambient Levi-Civita connection ∇ also descends to give a connection on T . First, from the defining property that ∇ preserves the metric it follows that if U A ∈ E˜ A (w) and V A ∈ E˜ A (w ), then U A ∇A V B ∈ E˜ B (w +w −1). Then since ∇ is torsion free, we have that ∇X U −∇U X −[X, U ] = 0 for any tangent vector field U . So if U ∈ E˜ A (0), then ∇X U = 0, as, in that case, [X, U ] = −U . So sections of E˜ A (0) may be characterised as those which are covariantly parallel along the vertical Euler vector field. These two results imply that ∇ determines a connection ∇ T on T . For U ∈ T let U˜ be the corresponding section of A (0). Similarly a tangent vector field V on M has a lift to a field V˜ ∈ E˜ A (1), on Q, EQ ˜ which is everywhere tangent to Q. This is unique up to adding f X, where f ∈ E(0). We ˜ Then we can form ∇ ˜ U˜ . This is clearly extend U˜ and V˜ homogeneously to fields on M. V independent of the extensions. Since ∇X U˜ = 0, it is also independent of the choice of V˜ as a lift of V . Finally, it is a section of E˜ A (0) and so determines a section ∇VT U of T which only depends on U and V . It is easily verified that this defines a covariant derivative on T . Let us summarise. By the above construction the ambient manifold and metric construction of Fefferman and Graham naturally determines a rank (n + 2) vector bundle T on M. This vector bundle comes equipped with a signature (p + 1, q + 1) metric hT , a connection ∇ T , and a filtration determined by a canonical section XT of T [1]. Furthermore if v a is a smooth tangent field on M and ϕ is a smooth section of E[1], one easily verifies from the above that the image of v a ∇aT (ϕX B ) lies in T 0 and that composing with the map to the quotient T 0 /T 1 recovers ϕv b . This is a non-degeneracy property of the connection. This with the fact that ∇ T preserves the metric means that T is a tractor bundle with a tractor connection in the sense of [14]. Since ∇ is Ricci flat it follows that ∇ T satisfies the curvature normalisation condition described in [13, 14]. (This is shown explicitly in [12].) From this and the non-degeneracy we can conclude that T A and ∇aT are a normal tractor bundle and connection corresponding to the defining representation of SO(p + 1, q + 1). That is we can take, T A = E A , XTA = X A , and ∇aT to be the usual tractor connection as in Sect. 2. We henceforth drop the notation T .
368
A. R. Gover, L. J. Peterson
We can also recover the operators introduced in the tractor setting. Observe that the ˜ Thus D AP gives an opoperator D AP := 2X [P ∇A] annihilates the function Q on M.
(w) → E˜ Q ˜ (w), and it is a trivial matter to show that this descends erator E˜Q ⊗ E [AP ] Q to DAP : E [w] → E[AP ] ⊗ E [w] as defined in Sect. 2. (Here, of course, E [w] is
(w).) Now we can formally follow the the weight w tractor bundle corresponding to E˜Q construction of DA . First one calculates that, for V˜ ∈ E˜ (w), and using (29), we have hAB D A(Q D |B|P )0 V = −X(Q D P )0 V , where D A V = (n + 2w − 2)∇A V − X A ∆V ,
∆ := ∇B ∇B .
(31)
Then we observe the map E˜P (w − 1) → E˜(P Q)0 (w) given by S˜P → X(Q S˜P )0 is injective. It follows immediately that, along Q, (31) is determined by the equation
(w) → hAB D A(Q D |B|P )0 V = −X(Q D P )0 V and so is precisely the operator D A : E˜Q
(w − 1), which descends to D : E [w] → E ⊗ E [w − 1]. In particular E˜AQ ⊗ E˜Q A A
˜ this is true when w = 1 − n/2, and so ∆ : E (1 − n/2) → E˜ (−1 − n/2) descends to the generalised Yamabe operator : E [1 − n/2] → E [−1 − n/2]. We will take ˜ Although we will not need it here, let us point out (31) as the definition of D A on M. that D AP as defined above acts more generally on sections of tensor bundles on M˜ and not just sections which are homogeneous. Following through the argument above in this more general setting yields a generalisation of the operator D A on tensor bundles given by D A = n∇A + 2XB ∇B ∇A − XA ∆. This still has the property that along Q it acts tangentially. Observe that hAB D A(Q D |B|P )0 V is only of the form −X (Q D P )0 V to order Q0 along Q and that although D A acts tangentially to Q to this order, it does not commute with ˜ from (31) we have Q. In fact for any tensor field V , homogeneous of weight w on M, D A QV = QD A V + 4Q∇A V .
(32)
So, along the Q = 0 surface Q, D A acts tangentially, but, D A does not act tangentially to other Q = constant surfaces. Nevertheless this allows us to conclude that if U and V are tensors of the same rank (and with U + QV homogeneous of some weight), then D A1 · · · D A (U + QV ) = (D A1 · · · D A U ) + QW for some tensor W . Thus, along Q, D A1 · · · D A U is independent of how U is extended off Q. The identities X A D A V = w(n + 2w − 2)V − Q∆V
(33)
D A X A V = (n + 2w + 2)(n + w)V − Q∆V
(34)
and
will also be useful. Here V is a tensor which is homogeneous of weight w. We are now in a position to show directly how the tractor field WABCD is represented in the ambient setting. Let us for a while restrict to n = 4. Note that the curvature of the ambient connection R ABCD is a section of E˜ABCD (−2) and so determines a section of the tractor bundle EABCD [−2]. We will write RABCD to denote this section. Let V˜ ∈ E˜ (w). From (31) we obtain [DA , DB ]V˜ = (n + 2w − 2)(n + 2w − 4)[∇A , ∇B ]V˜ − 2(n + 2w − 2)X[A [∆, ∇B] ]V˜ .
Conformally Invariant Powers
369
A (0) for the corresponding field on Q, and Now let V A ∈ E A . We write V˜ = V˜ A ∈ E˜Q ˜ Then, along Q, we have (see remark below) extend this homogeneously to a field on M.
[D A , D B ]V˜ C = (n − 2)(n − 4)R AB C E V˜ E + 4(n − 2)X [A R B]F C E ∇F V˜ E . Thus, since XF R BF CE = 0 = XF ∇F V˜ E , this implies [DA , DB ]V C = (n − 2)(n − 4)RAB C E V E + 4(n − 2)X[A RB]F C E Z F f ∇ f V E . Comparing this with (12) (with w set to 0 in that expression) we can at once conclude that X[A WBC]DE V E = (n − 4)X[A RBC]DE V E . Since this holds for any section V A of E A , it follows from the definition of WABCD that X[A BC]DE = X[A RBC]DE . Contracting with Z F f we have immediately X[A RB]F C E Z F f = X[A B]F C E Z F f . Substituting this in the above display and once again comparing to (12) we now have that WBCDE V E = (n − 4)RBCDE V E for all V E , and so WBCDE = (n − 4)RBCDE .
(35)
Remarks . Note that [∆, ∇B ]V˜C = 2R EBCF ∇E V˜ F + (∇E R EBCF )V˜ F + R BF ∇F V˜C . From the contracted Bianchi identity ∇E R EBCF = 2∇[C R F ]B , so in odd dimensions the last two terms of the display vanish to all orders. In even dimensions recall we have that, along Q, R BF vanishes to order n/2 − 1 and so in all even dimensions, other than 4, these last terms also vanish along Q. 4. The GJMS Operators Using the properties of D A , we observed in the previous section that if V is a tensor homogeneous of weight 1−n/2 then, along Q, ∆V is independent of how V is extended off Q. So ∆ gives an operator ∆ : E˜ (1 − n/2) → E˜ (−1 − n/2), and this descends to the generalised conformally invariant Laplacian (or Yamabe operator) as in (14). The observation that the conformally invariant Laplacian (on densities) can be obtained from an ambient Laplacian in this way goes back to [38] in the conformally flat dimension 4 setting and to [26] for the general curved case. For the generalised conformally invariant Laplacian we can also show this directly using the result ∇A Q = 2XA
(36)
from above. From this it follows that if U is a tensor field homogeneous of weight w (i.e. U ∈ E˜ (w)), then [∆, Q]U = 2(n + 2w + 2)U.
(37)
Thus if V ∈ E˜ (1 − n/2) and U is a tensor of the same rank and type but homogeneous of weight −1 − n/2, then ∆(V + QU ) = ∆V + Q∆U.
(38)
So clearly ∆V is independent of how V extends off Q. In [32], Graham, Jenne, Mason, and Sparling establish a remarkable generalisation of the result for densities which we state here in our current notation.
370
A. R. Gover, L. J. Peterson
Proposition 4.1. For n even and k ∈ {1, 2, . . . , n/2} or n odd and k ∈ Z+ , let f ∈ ˜ − n/2) be a homogeneous extension of f . The restriction E˜Q (k − n/2), and let f˜ ∈ E(k of ∆k f˜ to Q depends only on f and the conformal structure on M but not on the choice of the extension f˜ or on any choices in the ambient metric. Thus there is a conformally invariant operator ∆k : E˜Q (k − n/2) → E˜Q (−k − n/2), and this descends to a natural conformally invariant differential operator P2k : E[k − n/2] → E[−k − n/2] on M. As mentioned in the introduction, we call the operators P2k the GJMS operators. In this section we will describe a way that one can directly rewrite these operators in terms of D A , X A , the curvature R, and just one ∆. As observed above, each of these corresponds to an object in the tractor calculus. Before we begin we need one more result from [32]. (This follows from Proposition 2.2 and Sect. 3 from there). Proposition 4.2. For n even and k ∈ {1, 2, . . . , n/2} or n odd and k ∈ Z+ , let f ∈ ˜ − n/2) uniquely determined modulo E˜Q (k − n/2). Then f has an extension f˜ ∈ E(k k ˜ O(Q ) by the requirement that ∆f = 0 modulo O(Qk−1 ). The extension is independent of any choices in the ambient metric. ˜ We are ready to consider an example. Let f˜ ∈ E(2−n/2), and let f denote the section of E[2 − n/2] that it determines. Consider ∆D A f˜ = ∆(2∇A f˜ − X A ∆f˜). Since in all dimensions the ambient Ricci curvature vanishes along Q, we have [∆, ∇A ]f˜ = 0. So with the operator equality [∆, XA ] = 2∇A we immediately see that ∆D A f˜ = −XA ∆2 f˜. Thus DA f = −XA P4 f , where P4 is the fourth-order GJMS operator (which agrees with the Paneitz operator). Note that according to the earlier proposition above, the right-hand side is independent of how f extends off Q. So the left-hand side is likewise independent of the choice of extension. In fact this is already clear from (32) and (38). This suggests attempting to recover the higher order GJMS operators from ∆D A · · · D B f˜. On conformally flat structures this is immediately successful. ˜ − n/2), k ∈ Z+ , then Proposition 4.3. On conformally flat structures, if f˜ ∈ E(k ∆D Ak−1 · · · D A1 f˜ = (−1)k−1 X A1 · · · XAk−1 ∆k f˜. Proof. We are only interested in local results and differential operators. So without loss of generality we suppose that we are in the setting of the flat model for which the ambient space is simply Rn+2 equipped with the flat metric h given by a fixed bilinear form of signature (p + 1, q + 1) and the standard parallel transport. The latter also gives the ambient connection in this setting. In the standard coordinates, X = X I ∂/∂X I at the point XI , and the identities of the previous section hold as genuine equalities rather than just formally.
Conformally Invariant Powers
371
We have the operator identity [∆, X A ] = 2∇A on sections of E˜ (w). Since the structure is conformally flat, we also have [∆, ∇A ] = 0. It follows that [∆m , XA ] = 2m∆m−1 ∇A . Thus if f˜ ∈ E˜ (m + 1 − n/2), we have −∆m D A f˜ = −∆m [2m∇A f˜ − X A ∆f˜] = XA ∆m+1 f˜. The proposition now follows by induction on k.
To relate ∆D Ak−1 · · · D A1 f˜ and ∆k f˜ in the general case we must take account of the curvature of the ambient manifold. Since this is Ricci flat we have that if V˜B ∈ E˜A (w), then [∆, ∇A ]V˜B = −2R A P B Q ∇P V˜Q . More generally if V˜BC···E ∈ E˜BC···E (w), then [∆, ∇A ]V˜BC···E = −2R A P B Q ∇P V˜QC···E − 2R A P C Q ∇P V˜BQ···E − · · · −2R A P E Q ∇P V˜BC···Q . (39) In even dimensions the ambient metric is only Ricci flat and determined by the conformal structure on M to finite order, as described above. For example for even n (39) only holds mod O(Qn/2−2 ) (or mod O(Qn/2−1 ) if V˜ has rank 0). For simplicity in the following discussion we will often ignore this point and assume the given calculations do not involve sufficient transverse derivatives of the ambient metric to encounter this problem. We will return to a careful count of tranverse derivatives later in the section. We will also henceforth restrict to n = 4. This also simplifies matters. And there is no loss, as the results for n = 4 have been obtained above. ˜ It follows from the last display that if f˜ ∈ E(w) (and < n/2 if n is even), then ∆∇A · · · ∇A1 f˜ = − 2R A P A −1 Q ∇P ∇Q ∇A −2 · · · ∇A1 f˜ − · · · −2R A P A1 Q ∇P ∇A −1 · · · ∇A2 ∇Q f˜ − 2∇A R A −1 P A −2 Q ∇P ∇Q ∇A −3 · · · ∇A1 f˜ − · · · − 2∇A · · · ∇A3 R A2 P A1 Q ∇P ∇Q f˜ + ∇A · · · ∇A1 ∆f˜,
(40)
where here all ∇A ’s act on all tensors to their right and the result is mod O(Qn/2− ) if n is even. We may apply the Leibniz rule to (40). The term ∇A R A −1 P A −2 Q ∇P ∇Q ∇A −3 · · · ∇A1 f˜, for example, becomes (∇A R A −1 P A −2 Q )∇P ∇Q ∇A −3 · · · ∇A1 f˜ +R A −1 P A −2 Q ∇A ∇P ∇Q ∇A −3 · · · ∇A1 f˜. Often we will not require the details of contractions or the value of coefficients, and so we might write the last result symbolically as ∇R∇ −1 f˜ = (∇R)∇ −1 f˜ + R∇ f˜. (In this informal notation we will write ∇ to indicate a ∇A which is not part of a ∆. For example, it may have a free index or be contracted to the ambient curvature R.) We may repeatedly apply the Leibniz rule to (40) in this way until all of the terms on the right-hand side are of the form (omitting indices) (∇p R)∇q f˜. We might write the result symbolically as ∆∇ f˜ = ∇ ∆f˜ + (∇p R)∇q f˜. (41)
372
A. R. Gover, L. J. Peterson
Note that each term of the second sort on the right-hand side has q ≥ 2 and p + q = . Although in these symbolic formulae we omit the details of the contractions and the coefficients, we really want to regard these expressions as representing precise formulae. The idea of this notation is simply to manifest explicitly only the aspects of the formulae that we need for our general discussion. Now observe that (n + 2w − 2 − 2)∇A +1 ∇A · · · ∇A1 f˜ = D A +1 ∇A · · · ∇A1 f˜ + X A +1 ∆∇A · · · ∇A1 f˜, or, in our symbolic notation, (n+2w−2 −2)∇ +1 f˜ = D∇ f˜+X∆∇ f˜. We can substitute (41) into the right-hand side of this and so observe that if n+2w−2 −2 = 0, then we can replace a term ∇ +1 f˜ by the expression D∇ f˜ +X (∇p R)∇q f˜ +X∇ ∆f˜. Suppose w = k − n/2. Then n + 2w − 2 − 2 = 2(k − − 1), and we have 2(k − − 1)∇ +1 f˜ = D∇ f˜ + X (∇p R)∇q f˜ + X∇ ∆f˜. (42) In each term of the sum we again have q ≥ 2 and p + q = . Note that the left-hand side of (42) has at most + 1 transverse derivatives of f˜. Apart from the term X∇ ∆f˜, which we will deal with below, the right-hand side has at most transverse derivatives of f˜, as D acts tangentially to Q. Our strategy below will be to replace ∇’s with D’s beginning from the left. We may apply similar reasoning to R. Since R has weight −2, we have (n − 2m − 4) ∇m R = D∇m−1 R + X∆∇m−1 R. Here we have used the same informal notation ˜ that we used with we may write this as (n − 2m − 4)∇m R = f ,pabove.q By (39) m−1 m−1 D∇ R + X (∇ R)∇ R + X∇ ∆R. Now note that since R is Ricci flat, we have (43) ∆R BCDE = 2 R A CB F R FADE + R A CD F R BAF E + R A CE F R BADF , from the Bianchi identity. In odd dimensions this holds to all orders. In general we have ∆R BCDE = 2(∇B ∇[D R E]C − ∇C ∇[D R E]B ) + O(R 2 ) where O(R 2 ) indicates the quadratic term in the display. Using, once again, that in even dimensions R AB = Qn/2−1 LAB it follows that (∇B ∇[D R E]C − ∇C ∇[D R E]B ) vanishes to order n/2 − 3, and so (43) holds to that order. Thus we get the simplification (n − 2m − 4)∇m R = D∇m−1 R + X (∇p R)∇q R, (44) where in each term of the sum, p +q = m−1. In even dimensions we need m < n/2−2. This follows immediately from the previous paragraph. That is, we need (n−2m−4) > 0. Our effort to relate ∆D Ak−1 · · · D A1 f˜ and ∆k f˜ involves another identity, viz ∆(∇t ∆u R)E = (∆∇t ∆u R)E + (∇t ∆u R)∆E + 2(∇t+1 ∆u R)∇E.
(45)
Here E is any expression (for a linear operator) which, in terms of our informal symbolic notation, is a polynomial in ∇, ∆, R, and f˜. We also need the following fact which follows from the above:
Conformally Invariant Powers
373
Lemma 4.4. Suppose n is odd or t + u ≤ n/2 − 3. Then on Q there is an expression for ∇t ∆u R as a partial contraction polynomial in D A , R ABCD , XA , hAB , and its inverse hAB . This expression is rational in n, and each term is of degree at least 1 in R ABCD . Proof. Repeatedly use (39), (43), and (45) to rewrite ∇t ∆u R as a sum of terms of the form (∇v1 R) · · · (∇vj R). In doing this we convert some ∆’s into pairs of ∇’s via (45), but at most one ∇ from each pair acts on any given R. Thus in even dimensions, vi ≤ n/2 − 3, i ∈ {1, · · · , j }, and we may construct the desired partial contraction polynomial by repeatedly applying (44) to the terms (∇v1 R) · · · (∇vj R). In even dimensions, using (37) and ∇A Q = 2X A with the restriction t + u ≤ n/2 − 3, we see that (39), (43), and (44) all hold to sufficient order.
˜ − n/2) is any homogeneous extension Now let f ∈ E˜Q (k − n/2). Suppose f˜ ∈ E(k of f as in Proposition 4.2. We will consider ∆D Ak−1 · · · D A1 f˜, where k is a positive integer. If n is even, we assume that k ≤ n/2. Let us systematically rewrite this in terms of (−1)k−1 X A1 · · · XAk−1 ∆k f˜ and curvature coupled terms via the following steps: Step 1. Observe that ∆D Ak−1 · · · D A1 f˜ = ∆(2∇Ak−1 − X Ak−1 ∆) · · · (2(k − 1)∇A1 − X A1 ∆)f˜. Expand this out via the distributive law without changing the order of any of the operators. Step 2. Move all X’s to the left of any ∇ or ∆ via the identities [∇A , XB ] = hAB and [∆, XA ] = 2∇A (which hold to all orders). Step 3. Move all ∆’s to the right of any ∇’s (other than those implicit in ∆) via (40), and (45). In even dimensions one of course needs to be careful, since (40) is valid only if < n/2 and holds mod O(Qn/2− ). Elementary counting arguments (along similar lines to the discussion in the next paragraph) quickly establish that for terms encountered we have < k satisfied and with no more than (k − − 1) transverse derivatives of the result. Since we assume k ≤ n/2 when n is even the use of (40) is valid. Next by the proof of Proposition 4.3, we may cancel all terms not explicitly involving the curvature except for the term (−1)k−1 XA1 · · · XAk−1 ∆k f˜. (The proof of Proposition 4.3 involves only the identities used in Steps 1 and 2 with just the difference that these are applied in a different order.) We thus obtain hs X x (∇p1 ∆r1 R) · · · (∇pd ∆rd R)∇q ∆r f˜, (46) (−1)k−1 X k−1 ∆k f˜ + where d ≥ 1 in each term of the right-hand part. At this point let us take stock of what we have. For each term in the result of Step 1, the sum of the number of ∆’s in the term and the number of ∇’s in the term is exactly k. In Steps 2 and 3 some ∆’s may have been exchanged for ∇’s via the identity [∆, XA ] = 2∇A or for R’s via the commutator [∆, ∇A ], and similarly we may have lost some ∇’s by [∇A , XB ] = hAB . On the other hand, we may have converted some ∆’s into pairs of ∇’s via (45); note that at most one ∇ from each pair acts on f˜, and similarly at most one ∇ from each pair acts on any given R. Thus for each term of the sum in (46) we must have d + q + r ≤ k. Since d ≥ 1, it follows that k − q − r ≥ 1. Note that each R in (40) is followed by at least two ∇’s. Thus at each step in the construction
374
A. R. Gover, L. J. Peterson
of the right-hand part of (46), each of the rightmost two ∇’s of each term arose from Steps 1 and 2, and not from the use of (45). It follows that at least two of the ∇’s in ∇q ∆r f˜ did not arise from (45). Thus q ≥ 2, and for any i ∈ {1, · · · , d}, pi +ri +3 ≤ k. Now suppose n is even. Then, by assumption, k ≤ n/2, and for i ∈ {1, · · · , d} we have pi + ri ≤ n/2 − 3. Since the ambient metric is determined modulo terms of O(Qn/2 ), it follows immediately that the metric connection ∇ is determined modulo terms of O(Qn/2−1 ). Its curvature R is similarly determined modulo O(Qn/2−2 ). Now when ∆ = ∇A ∇A acts on functions, its rightmost ∇ is really just the exterior derivative. Thus as an operator on functions, ∆ is determined modulo terms of O(Qn/2−1 ). It now ˜ −n/2), all terms in the sum of (46) follows from (36) and (37) that, as an operator on E(k are determined uniquely modulo O(Q). If n is odd, the ambient metric is determined to infinite order so certainly the same is true. Next, by Proposition 4.2 we can assume that f˜ satisfies ∆f˜ = 0 modulo O(Qk−1 ), and given f , this determines f˜ uniquely modulo O(Qk ). This will simplify our arguments. The end result will be independent of this choice. Since k − q − r ≥ 1, we see immediately that all terms in (46) with r ≥ 1 will vanish modulo O(Q). We will thus delete these terms. From the inequality k − q − r ≥ 1 and, in even dimensions, the inequality pi + ri ≤ n/2 − 3, it follows that we can carry out the next step. Step 4. First rewrite (46) as (−1)k−1 X k−1 ∆k f˜ +
hs X x (∇p1 ∆r1 R) · · · (∇pd ∆rd R)∇q f˜.
(47)
Then repeatedly use (42) and Lemma 4.4 to eliminate all ∇’s and ∆’s from the right-hand part of this expression. The use of (42) introduces additional ∆’s. But terms containing these ∆’s vanish modulo O(Q), and we cancel them as soon as they appear. We obtain as result hs f˜, (−1)k−1 X k−1 ∆k f˜ = ∆D k−1 f˜ + (48) where, in terms of our informal symbolic notation, the operator is a polynomial in X, D, and R. The exponent s here is not claimed to bear any relationship to the s from earlier. The only differential operator of non-zero order used in the formula is D. Thus although we used the f˜ satisfying ∆f˜ = 0 modulo O(Qk−1 ) to obtain (48), observe now that it follows immediately from (32) that each term depends only on f and is otherwise independent of the extension f˜. Thus for any extension f˜, (48) holds modulo O(Q). Remarks . At this point it is worthwhile to justify our use of (42) in Step 4. First note that in each term in (47) we have q ≤ k − 1, by the counting given above. Thus in (42), + 1 will always be at most k − 1, will be at most k − 2, and k − − 1 will be nonzero. We may therefore solve for ∇ +1 f˜ in (42). On the other hand the use of (42) may generate additional curvature terms (∇p R)∇q f˜. But p + q = , where q ≥ 2. Thus in even dimensions, p ≤ − 2 ≤ k − 4 ≤ n/2 − 4, and we may apply Lemma 4.4 to ∇p R. In the final step we will use the fact that (n − 4)R descends to the tractor field W , X descends to X, h descends to h, and that ∆ : E˜ (1 − n/2) → E˜ (−1 − n/2) descends to : E [1 − n/2] → E [−1 − n/2]. Step 5. In the right-hand side of (48) make the following formal replacements: f˜ with f , ∆ with , X with X, h with h, R with W/(n − 4) (in dimensions n = 4) and D with D. The result is a tractor formula for (−1)k−1 X k−1 P2k f . We state this as a proposition.
Conformally Invariant Powers
375
Proposition 4.5. There is a tractor calculus expression for the GJMS operators of the form XA1 · · · XAk−1 P2k f = (−1)k−1 DAk−1 · · · DA1 f + Ak−1 ···A1 P Q DP DQ f,
(49)
where f ∈ E[k − n/2] and is a linear differential operator Ak−1 ···A1 P Q : EP Q [k − 2 − n/2] → EAk−1 ···A1 [−1 − n/2], expressed as a partial contraction polynomial in DA , WABCD , XA , hAB , and its inverse hAB . The expression for is rational in n, and each term is of degree at least 1 in WABCD . Proof. It is clear from the argument of this section that XA1 · · · XAk−1 P2k f = (−1)k−1 DAk−1 · · · DA1 f + Ak−1 ···A1 f, where Ak−1 ···A1 is a linear differential operator on f expressed as a partial contraction polynomial in DA , WABCD , XA , hAB , and its inverse hAB . It is also clear that this expression for is rational in n and that each term is of degree at least 1 in WABCD . Furthermore, recall that in Step 4 we used (42) and Lemma 4.4 to convert the expression ∇q f˜ of (47) into an expression in D, X, R, h, and h−1 . Since q ≥ 2 in (42) and (47), it follows that each term of this tractor expression ends in two consecutive D’s. The result now follows.
We conclude this section with examples. 4.1. Examples. The simplest example of our procedure is the Paneitz operator P4 , which we treated at the outset of this section. Recall that we obtained DA f = −XA P4 f , and it is clear that the tractor expression on the left-hand side of this is independent of any choices in the ambient construction. This is as guaranteed by the argument following Step 3. The next simplest case is of course the operator P6 . By assumption then, n = 4. Let f ˜ − n/2) such that its restriction denote a section of E[3 − n/2]. Let f˜ be a section of E(3 2 ˜ ˜ to Q agrees with f and such that, ∆f = Q g for some smooth g ∈ E(−3 − n/2). ˜ Expanding out ∆D A D B f according to Steps 1 and 2 gives ∆D A D B f˜ = XA X B ∆3 f˜ + 2X B [∇A , ∆]∆f˜ +2XA [∇B , ∆]∆f˜ + 4X A ∆[∇B , ∆]f˜ − 8[∇A , ∆]∇B f˜. Since [∆, ∇A ] vanishes on functions mod O(Q2 ), the third step reduces to ∆D A D B f˜ = XA X B ∆3 f˜ − 8[∇A , ∆]∇B f˜ = XA X B ∆3 f˜ − 16R A C B E ∇C ∇E f˜, along Q. The fourth step is simply the observation that on Q (with f˜ as above) we have 8∇C ∇E f˜ = D C D E f˜
376
A. R. Gover, L. J. Peterson
and so X A X B ∆3 f˜ = ∆D A D B f˜ + 2R A C B E D C D E f˜. As we have observed generally, at this stage none of the terms on either side depend on how f˜ extends off Q. Thus finally we have DA DB f +
2 WA C B E DC DE f = XA XB P6 f, n−4
where P6 is the sixth-order GJMS operator. Thus as promised, we have recovered the tractor formula found by other means in Sect. 2.1. Our final example is P8 . By following Steps 1 through 4, above, and by applying (29), we obtain −XA X B X C ∆4 f˜ = ∆D A D B D C f˜ + 2R A P B Q D P D Q D C f˜ + 2R A P C Q D P D B D Q f˜ 2 X A (D E R B P C Q )D E D P D Q f˜ − (n−6) + 4XA R B P C Q R P E Q F D E D F f˜ − 2X A U B P C Q D P D Q f˜ 2 − n−6 X A X E U B P C Q D E D P D Q f˜ 4 + (n−6) X A X E (D E R B P C Q )R P F Q G D F D G f˜.
Here U B P C Q denotes the tractor field 2 R AP B F R FAC Q + R AP C F R BAF Q + R AP QF R BACF . To demonstrate explicitly that P8 is formally self-adjoint, a variation on this formula is preferred. It is a straightforward exercise to rewrite the above equation as follows. XA X B X C ∆4 f˜ = −∆D A D B D C f˜ − 2R A P B Q D P D Q D C f˜ − 2R A P C Q D P D B D Q f˜ −
4 P Q ˜ n−6 X A (U B C )D P D Q f
2 + n−6 X A D E R B P C Q D E D P D Q f˜ P Q S T ˜ + 4 n−2 n−6 X A R B C R P Q D S D T f .
This together with (43) yields the tractor formula of Proposition 2.3. References 1. Bailey, T.N., Eastwood, M.G., Gover, A.R.: Thomas’s structure bundle for conformal, projective and related structures. Rocky Mountain J. Math. 24, 1191–1217 (1994) 2. Bateman, H.: The transformation of the electrodynamical equations. Proc. Lond. Math. Soc. 8, 223–264 (1910) 3. Branson, T.: Differential operators canonically associated to a conformal structure. Math. Scand. 57, 293–345 (1985) 4. Branson, T.: Private communication 1987
Conformally Invariant Powers
377
5. Branson, T.: Sharp inequalities, the functional determinant, and the complementary series. Trans. Am. Math. Soc. 347, 3671–3742 (1995) 6. Branson, T., Chang, S.-Y.A., Yang, P.: Estimates and extremals for zeta function determinants on four-manifolds. Commun. Math. Phys. 149, 241–262 (1992) 7. Branson, T., Gover, A.R.: Conformally invariant non-local operators. Pacific J. Math. 201, 19–60 (2001) 8. Branson, T., Gover, A.R.: In progress 9. Branson, T., Ørsted, B.: Conformal geometry and global invariants. Differential Geom. Appl. 1, 279–308 (1991) 10. Branson, T., Ørsted, B.: Explicit functional determinants in four dimensions. Proc. Am. Math. Soc. 113, 669–682 (1991) ˇ 11. Cap, A., Gover, A.R.: CR tractors and the ambient metric construction. In progress ˇ 12. Cap, A., Gover, A.R.: Standard tractors and the conformal ambient metric construction. Preprint math.DG/0207016, http://www.arxiv.org ˇ 13. Cap, A., Gover, A.R.: Tractor bundles for irreducible parabolic geometries. In: Global analysis and harmonic analysis (Marseille-Luminy, 1999), S´emin. Congr., 4, Paris: Soc. Math. France, 2000, pp. 129–154. Preprint ESI 865, http://www.esi.ac.at ˇ 14. Cap, A., Gover, A.R.: Tractor calculi for parabolic geometries. Trans. Am. Math. Soc. 354, 1511– 1548 (2002). Preprint ESI 792, http://www.esi.ac.at 15. Cartan, E.: Les espaces a` connexion conforme. Ann. Soc. Pol. Math. 2, 171–202 (1923) 16. Chang, S.-Y.A., Qing, J.,Yang, P.: Compactification of a class of conformally flat 4-manifold. Invent. Math. 142, 65–93 (2000) 17. Chang, S.-Y.A.,Yang, P.: On uniqueness of solutions of nth order differential equations in conformal geometry. Math. Res. Lett. 4, 91–102 (1997) 18. Dirac, P.A.M.: Wave equations in conformal space. Ann. of Math. (2) 37, 429–442 (1936) 19. Eastwood, M.G.: Notes on conformal differential geometry. In: The Proceedings of the 15th Winter School “Geometry and Physics” (Srni,´ 1995), Rend. Circ. Mat. Palermo (2) Suppl. No. 43, 1996, pp. 57–76 20. Eastwood, M.G.: Private communication 2001 21. Eastwood, M.G., Gover, A.R.: Formal adjoints and a canonical form for linear operators. Twistor Newsletter 41, 35–36 (1996) 22. Eastwood, M.G., Rice, J.W.: Conformally invariant differential operators on Minkowski space and their curved analogues. Commun. Math. Phys. 109, 207–228 (1987). Erratum, Commun. Math. Phys. 144, 213 (1992) 23. Eastwood, M.G., Singer, M.: A conformally invariant Maxwell gauge. Phys. Lett. A 107, 73–74 (1985) 24. Fefferman, C.: Monge-Amp`ere equations, the Bergman kernel and geometry of pseudoconvex domains. Ann. of Math. (2) 103, 395–416 (1976). Correction, Ann. of Math. (2) 104, 393–394 (1976) 25. Fefferman, C., Graham, C.R.: In progress 26. Fefferman, C., Graham, C.R. Conformal invariants. In: Elie Cartan et les math´ematiques d’aujourd’hui, Ast´erisque, hors s´erie. Paris: Soci´et´e Math´ematique de France, 1985, pp. 95–116 27. Fefferman, C., Graham, C.R.: Q-curvature and Poincar´e metrics. Math. Res. Lett. 9, 139–151 (2002). Preprint math.DG/0110271, http://www.arxiv.org 28. Gover, A.R.: Aspects of parabolic invariant theory. In: The 18th Winter School “Geometry and Physics” (Srni´ 1998), Rend. Circ. Mat. Palermo (2) Suppl. No. 59, 1999, pp. 25–47 29. Gover, A.R.: Invariant theory and calculus for conformal geometries. Adv. Math. 163, 206–257 (2001) 30. Gover, A.R., Graham, C.R.: CR invariant powers of the sub-Laplacian. Preprint math.DG/0301092, http://www.arxiv.org 31. Graham, C.R.: Conformally invariant powers of the Laplacian, II: Nonexistence. J. Lond. Math. Soc. (2) 46, 566–576 (1992) 32. Graham, C.R., Jenne, R., Mason, L., Sparling, G.: Conformally invariant powers of the Laplacian, I: Existence. J. Lond. Math. Soc. (2) 46, 557–565 (1992) 33. Graham, C.R., Witten, E.: Conformal anomaly of submanifold observables in AdS/CFT correspondence. Nucl. Phys. B 546, 52–64 (1999). Preprint hep-th/9901021, http://www.arxiv.org ´ 34. Graham, C.R., Zworski, M.: Scattering matrix in conformal geometry. S´emin. Equ. D´eriv. Partielles, ´ 2000–2001, Exp. No. XXIII, 14 pp., Ecole Polytech., Palaiseau, 2001. Preprint math.DG/0109089, http://www.arxiv.org 35. Graham, C.R., Zworski, M.: Scattering matrix in conformal geometry. Invent. Math., to appear 36. Henningson, M., Skenderis, K.: The holographic Weyl anomaly. J. High Energy Phys. (1998), no. 7, Paper 23, 12 pp. (electronic). Preprint hep-th/9806087, http://www.arxiv.org
378
A. R. Gover, L. J. Peterson
37. Henningson, M., Skenderis, K.: Holography and the Weyl anomaly, Proceedings of the 32nd International Symposium Ahrenshoop on the Theory of Elementary Particles (Buckow, 1998). Fortschr. Phys. 48, 125–128 (2000). Preprint hep-th/9812032, http://www.arxiv.org 38. Hughston, L.P., Hurd, T.R.: A CP 5 calculus for space-time fields. Phys. Rep. 100, 273–326 (1983) 39. Lee, J.M.: “Ricci” software package, http://www.math.washington.edu/∼lee. 40. Paneitz, S.: A quartic conformally covariant differential operator for arbitrary pseudo-Riemannian manifolds. Preprint 1983 41. Riegert, R.: A nonlocal action for the trace anomaly. Phys. Lett. B 134, 56–60 (1984) 42. Thomas, T.Y.: On conformal geometry. Proc. Natl. Acad. Sci. USA 12, 352–359 (1926) 43. Witten, E.: Anti-de Sitter space and holography. Adv. Theor. Math. Phys. 2, 253–291 (1998). Preprint hep-th/9802150, http://www.arxiv.org 44. W¨unsch, V.: On conformally invariant differential operators. Math. Nachr. 129, 269–281 (1986) 45. Zuckerman, G.: Tensor products of finite and infinite dimensional representations of semisimple Lie Groups. Ann. of Math. 106, 295–308 (1977) Communicated by P. Sarnak
Commun. Math. Phys. 235, 379–425 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0798-4
Communications in
Mathematical Physics
Glauber Dynamics of the Random Energy Model I. Metastable Motion on the Extreme States∗ G´erard Ben Arous1 , Anton Bovier2 , V´eronique Gayrard1, 1
Ecole Polytechnique F´ed´erale de Lausanne, 1015 Lausanne, Switzerland. E-mail:
[email protected] 2 Weierstrass-Institut f¨ ur Angewandte Analysis und Stochastik, Mohrenstrasse 39, 10117 Berlin, Germany. E-mail:
[email protected] Received: 9 October 2001 / Accepted: 17 October 2002 Published online: 28 February 2003 – © Springer-Verlag 2003
Abstract: We investigate the long-time behavior of the Glauber dynamics for the random energy model below the critical temperature. We give very precise estimates on the motion of the process to and between the states of extremal energies. We show that when disregarding time, the consecutive steps of the process on these states are governed by a Markov chain that jumps uniformly on all possible states. The mean times of these jumps are also computed very precisely and are seen to be asymptotically independent of the terminal point. A first indicator of aging is the observation that the mean time of arrival in the set of states that have waiting times of order T is itself of order T . The estimates proven in this paper will furnish crucial input for a follow-up paper where aging is analysed in full detail. 1. Introduction and Background 1.1. Introduction. The concept of “aging” has become one of the main paradigms in the theory of the dynamics of disordered systems1 . Roughly speaking, this term refers to a particular way in which dynamic properties of a system change with time when relaxing towards equilibrium: the time scale at which the process evolves slows down in proportion to the elapsed time, the system “ages”. It is in fact believed that most disordered systems, or at least those qualified as “glassy systems” do exhibit this phenomenon. While this is so, almost no results concerning aging in “real” spin systems do exist. In fact most existing results, even on the heuristic level, concern two types of dynamics: 1) Langevin dynamics in spherical models such as the spherical SK model [BDG, CD], or the spherical p-spin SK model [BCKM]. 2) Trap models [B, BD, BCKM] that are ∗
Work partially supported by the Swiss National Science Foundation under contract 21-65267.01 On leave from CPT-CNRS, Luminy, Case 907, 13288 Marseille Cedex 9, France. E-mail:
[email protected] 1 The cond-mat archives in Trieste contain 263 papers containing this term in their abstracts, and 124 containing it even in the title. See also [Be] for a recent mathematical review.
380
G. Ben Arous, A. Bovier, V. Gayrard
inspired by the structure of equilibrium states found in (mostly non-rigorous) analysis of mean field spin-glasses. These dynamics are, however, introduced ad hoc without any attempt to justify and derive them from an underlying Glauber dynamics on the microscopic degrees of freedom. In the context of the spherical models, a rigorous derivation of the aging phenomenon has been given recently in [BDG]. This model lacks, however, many of the expected features of spin glasses, in particular the existence of a complex energy landscape with many “metastable states”. The simplest model showing these features is the random energy model (REM) [D1, D2]. This model is indeed traded as one of the standard examples where aging occurs in the physics literature; the arguments in the physics literature are however, all based on the ad hoc introduction of an effective model (the REM-like trap model [B, BD, BM]) inspired by known properties of the equilibrium distribution and some heuristic arguments. The behaviour of the trap models can then be analysed in detail. In this and the companion paper [BBG] we prove the first rigorous results on the Glauber dynamics of the REM that will justify in a suitable sense the predictions based on the trap model heuristic. We feel that this is an important first step in showing that the abundant literature on this model is of relevance for realistic disordered systems. The key point of our analysis, and in fact a central problem of the entire subject, will be to control the behaviour of a Markov chain on a very high-dimensional set on a relatively small, but still asymptotically infinite subset of its “most recurrent” or “most stable” states on appropriate time scales, and to describe the ensuing effective dynamics. While we will have to use many of the particular features of the model we consider here, we feel that the general methodology developed in this paper will be of use in many other contexts of the dynamics of complex systems. The REM. We recall that the REM [D1, D2] is defined as follows. A spin configuration σ is a vertex of the hypercube SN ≡ {−1, 1}N . On an abstract probability space (, F, P ) we define the family of i.i.d. standard normal random variables {Xσ }σ ∈SN . We set Eσ ≡ [Xσ ]+ ≡ (Xσ ∨ 0). We define a random (Gibbs) probability measure on SN , µβ,N , by setting √
eβ NEσ , µβ,N (σ ) ≡ Zβ,N
(1.1)
2 . It is well-known [D1, D2] that this where Zβ,N is the normalizing partition function √ is model exhibits a phase transition at βc = 2 ln 2. For β ≤ βc , the Gibbs measures √ supported, asymptotically as N ↑ ∞, on the set of states σ for which Eσ ∼ N β, and no single configuration has positive mass. For β > βc , on the other hand, the Gibbs measure gives positive mass to the extreme elements of the order statistics of the family Eσ ; i.e. if we order the spin configurations according to the magnitude of their energies s.t.
Eσ (1) ≥ Eσ (2) ≥ Eσ (3) ≥ · · · ≥ Eσ (2N ) , (1.2) then for any finite k, the respective mass µβ,N σ (k) will converge, as N tends to infinity, to some positive random variable νk ; in fact, the entire family of masses µβ,N σ (k) , κ ∈ 2 The standard model has X instead of E . This modification has no effect on the equilibrium σ σ properties of the model, and will be helpful for setting up the dynamics.
Aging in the REM. Part I
381
N will converge in a suitable sense to a random process {νk }k∈N , called Ruelle’s point process [Ru]. We explain this in more detail below. So far the fact that σ are vertices of a hypercube has played no rˆole in our considerations. It will enter only in the definition of the dynamics of the model. The dynamics we will consider is a discrete time Glauber dynamics. That is we construct a Markov chain σ (t) with state space SN and discrete time t ∈ N by prescribing transition probabilities pN (σ, η) = P[σ (t + 1) = η|σ (t) = σ ] by √ 1 −β NEσ , if σ − η2 = 2 N e √ . (1.3) pN (σ, η) = 1 − e−β NEσ , if σ = η 0, otherwise Note that the dynamics is also random, i.e. the law of the Markov chain is a measure valued random variable on that takes values in the space of Markov measures on the path space SNN . We will mostly take a pointwise point of view, i.e. we consider the dynamics for a given fixed realization of the disorder parameter ω ∈ (dependence on which we persistently suppress in the notation). Remark. Let us comment on our choice of the dynamics. First, we chose discrete time rather than continuous time as to be closer to computer simulations. Since we work on a discrete space, there is no difficulty to treat the continuous time and all our results hold also in continuous time. Second, the fact that we chose the rates to depend only on the starting point allows us to avoid having to solve the problem of determining very precisely the barrier heights between any pair of points σ (i) , σ (j ) , which is a tremendous geometrical problem to which we have no answer. Clearly our choice favours the emergence of Bouchaud’s trap model. It is easy to see that this dynamics is reversible with respect to the Gibbs measure µβ,N . One also sees that it represents a nearest neighbor random walk on the hypercube with traps of random depths (i.e. the probability to make a zero step is rather large when Eσ is large)3 . The idea suggested by the known behavior of the equilibrium distribution is that these dynamics, for β > βc , will spend long periods of time in the states σ (1) , σ (2) , . . . etc. and will move “quickly” from one of these configurations to the next. Based on this intuition, Bouchaud et al. proposed the “REM-like” trap model: the state space is reduced to M points, representing the M “deepest” traps. Each of the states is assigned a positive random energy Ek which is taken to be exponentially distributed with rate one. The dynamics is now a continuous time Markov chain Y (t) taking values in SM ≡ {1, . . . , M}. If the process is in state k, it waits an exponentially distributed time with mean proportional to eEk α , where α = β/βc , and then jumps with equal probability in one of the other states k ∈ SM . This process is then analyzed using essentially techniques from renewal theory. The essential point is that if one starts the process from the uniform distribution, it is possible to show that if one only considers the times, Ti , at which the process changes its state, then the counting process, c(t), that counts the number of these jumps in the time interval (0, t) is a classical renewal (counting) process [KT]; moreover, as n ↑ ∞, this renewal process converges to a renewal process with a deterministic law for the renewal time with a heavy-tailed distribution (in the sense that the mean is infinite4 ) whose density is proportional to t −1−1/α . It is the emergence 3 We have chosen this particular dynamics for technical reasons. To study e.g. the Metropolis algorithm would require some extra work, but we expect essentially the same results to hold. 4 This is clearly due to the fact that the average of the waiting time eαEi over the disorder is infinite.
382
G. Ben Arous, A. Bovier, V. Gayrard
of such non-Markovian limit processes that is ultimately responsible for all the aging phenomena observed in the abundant literature on this and related models. Mathematically, the analysis of this trap model presents no particular challenge and the analysis presented e.g. in the review [BCKM] is essentially rigorous, or can be made so with minor efforts. Our purpose is to show, in a mathematically rigorous way, how and to what extent the REM-like trap model can be viewed as an approximation of what happens in the REM itself. Clearly the main difficulty in doing this will be to explain why the rather complicated random walk on the hypercube between the most profound traps can be interpreted as a simple jump process. This question has two aspects: 1) Why does the process jump with the uniform distribution on the extremal states? 2) Why can this process be seen as a Markov process, in particular, why are the times between visits of two extreme points asymptotically exponentially distributed? While these facts may appear “obvious” to most physicists, the reason why they are not addressed in any serious way in the literature is that a) they are not at all easy to solve and b) they are, strictly speaking, not even true. In fact, we will see in the course of the analysis (including the follow-up paper [BBG]) that such properties can be only established in a very weak asymptotic form, which is, however, just enough to imply that the predictions of Bouchaud’s model apply to the long time asymptotics of the process. While this fact will emerge here only through some very careful and tedious computations, it is clearly desirable to develop a more profound understanding of the phenomenon. In this first paper we will essentially address the question 1). We will show that if we look at the sequence of visits of the process on a selected set of the states of lowest energy, disregarding the times of these visits, the law of the sequence can indeed be described asymptotically by a simple discrete time Markov chain on this set, which jumps from one point to the next with the uniform distribution. We will also consider two more questions. First we will compute the mean entrance time and the entrance law on this set starting from an arbitrary point on the hypercube. Second we will compute the mean transition times between points in this set. It will turn out that these mean transition times do indeed depend, asymptotically, only on the starting point. Thus, modulo the Markovian hypothesis, we come very close to the heuristic picture outlined above. Moreover, we will see that the mean time to reach √ a set of extremes is proportional to the smallest “waiting times” on that set (if β > 2 ln 2), which will be interpreted √ as a first sign of the occurrence of aging. We will also show that in contrast, if β < 2 ln 2, then the mean time to reach any such point is much longer (by an exponentially large factor) than the waiting time in that point, independent of the starting point of measure. This dichotomy is in fact the main dynamical signature of the transition in this model. This resolves a question raised in an earlier attempt by Fontes et al. [FIKP] to analyse the dynamics of the REM using estimates on the spectral gap. This analysis revealed no sign of a phase transition in the behaviour of the spectral gap. Indeed, the spectral gap in this model correponds in both the high and the low temperature case essentially to the maximal mean waiting time in one site, which depends in a regular way on the temperature. For a different approach to the high-temperature dynamics see also the recent paper by Mathieu and Picco [MP, M]. The control of the property 2) and the more refined analysis of the aging phenomenon will be left two a companion paper [BBG], which will strongly rely on the results obtained here.
Aging in the REM. Part I
383
Our analysis will draw heavily on methods introduced only recently in the analysis of metastability in similar Markov chains in [BEGK1, BEGK2]. We note, however, that the situation here is in some respects quite different than in the setting investigated in these papers. In particular, the investigation of metastability concentrated on the situation where the time scales associated to each metastable state were sufficiently far apart so that to each state corresponds a distinct scale. Moreover, these long, metastable time-scales were assumed to be well separated from the shorter time scales on which the process may stay away from the set of metastable states. In the present situation, and this is a generic feature distinguishing aging from metastability, we have on the contrary an infinity of states that communicate on the same time scale, and to complicate the issue, there will be no “gap” between the time scales we are interested in and the “faster” times scales that we try to ignore. Thus the present situation violates the conditions of the setting investigated in [BEGK2] in a maximal way. The remainder of the introduction is organized as follows. In the next subsection we present some background results on the equilibrium properties of the REM. Based on this information, we will discuss in Subsect. 1.3 some aspects of the metastable behaviour of the model, and state precisely the results we alluded to above.
1.2. Equilibrium results for the REM. In this sub-section we give the necessary background on the (mostly well known, see e.g. [Ei, GMP, OP, Ru]) static aspects of the REM, i.e. we give a precise description of the infinite volume asymptotics of the Gibbs measures that will help to understand the heuristics of the model. A complete exposition can be found in [Bo]. The basic result is the following theorem that characterizes the precise behavior of the partition function: Proposition 1.1 ([BKL]). √ Let P denote the Poisson point process on R with intensity measure e−x dx. If β > 2 ln 2, then ∞ √ D −N [β 2 ln 2−ln 2]+ α2 [ln(N ln 2)+ln 4π ] e Zβ,N → eαz P(dz) (1.4) −∞
and D
ln Zβ,N − E ln Zβ,N → ln
∞
−∞
e P(dz) − E ln αz
∞ −∞
eαz P(dz).
(1.5)
Remark. The right-hand side of (1.4) is the partition function of what is known as Ruelle’s version of the random energy model [Ru]. The simple proof of this theorem is given in [BKL]. It relies, of course, on the classical theorem on the convergence of the point process of (properly rescaled) extremes of i.i.d. Gaussian r.v.’s to the Poisson point process P (see e.g. [LLR]). Namely, if we set uN (x) ≡
√ 1 ln(N ln 2) + ln 4π x − 2N ln 2 + √ √ 2 2N ln 2 2N ln 2
(1.6)
and define the point process PN ≡
σ ∈{−1,1}N
δu−1 (Xσ ) N
(1.7)
384
G. Ben Arous, A. Bovier, V. Gayrard
it is well-known that PN converges in distribution to the Poisson point process, P, with intensity measure e−x dx on the real line. Since the left hand side of (1.4) can be written as PN (dx)eαx , (1.8) the theorem follows if the convergence (in law, as N ↑ ∞) of this integral can be proven, which is the case if and only if α > 1. For this reason the Poisson point process P will play a central rˆole in all of our analysis. Proposition 1.1 can be extended to obtain a precise description of the Gibbs measures as well. To formulate this result, it will be convenient to compactify the space SN by mapping it to the interval [−1, 1] via SN σ → rN (σ ) ≡
N
σi 2−i ∈ [−1, 1].
(1.9)
i=1
Define the pure point measure µ˜ β,N on [−1, 1] by µ˜ β,N ≡ δrN (σ ) µβ,N (σ ).
(1.10)
σ ∈SN
Let us introduce the Poisson point process R on the strip [−1, 1] × R with intensity measure 21 dy × e−x dx. If (Yk , Xk ) denote the atoms of this process, define a new point process Wα on [−1, 1] × (0, 1] whose atoms are (Yk , wk ), where wk ≡ With this notation we have that Proposition 1.2 ([Bo]). If β > µ˜ β,N
eαXk . R(dy, dx)eαx
(1.11)
√ √ 2 ln 2, with α = β/ 2 ln 2, 1 D → µ˜ β ≡ Wα (·, dw)w.
(1.12)
0
Proof. Define the point process RN on [−1, 1] × R by δ(rN (σ ),uN (Xσ )) . RN ≡
(1.13)
σ ∈SN
A standard result of extreme value theory (see [LLR], Theorem 5.7.2) is easily adapted to yield that D
RN → R,
as N ↑ ∞,
(1.14)
where the convergence is in the sense of weak convergence on the space of sigma-finite measures endowed with the (metrizable) topology of vague convergence. Note that −1
−1
eαuN (Xσ ) eαuN (Xσ ) µβ,N (σ ) =
= . αu−1 RN (dy, dx)eαx N (Xσ ) σ e
(1.15)
Aging in the REM. Part I
385
We can define the point process WN ≡
δ
σ ∈SN
on [−1, 1] × (0, 1]. Then
exp(αu−1 N (Xσ )) N (dy,dx) exp(αx)
(1.16)
rN (σ ), R
µ˜ β,N =
WN (dy, dw)δy w.
(1.17)
Of course we would like to show that this quantity converges to the same object with WN replaced by W, as N ↑ ∞. The only non-trivial issue to be resolved is to see whether the denominators RN (dy, dx) exp(αx) converge. But Theorem 1.1 asserts precisely that D
this is the case whenever α > 1. Standard arguments then imply that first WN → W, and consequently, (1.12). Remark. Note that Theorem 1.2 contains in particular the convergence of the Gibbs measure in the product topology on SN , since cylinders correspond to certain subintervals of [−1, 1]. Let us discuss the properties on the limiting process µ˜ β . It is not hard to see that with probability one, the support of µ˜ β is the entire interval [−1, 1]. On the other hand, its mass is concentrated on a countable set, i.e. the measure is pure point. This is quite easy to see and the details of the argument can be found in [Bo]. 1.3. Metastability and statement of the main results. The properties of the invariant distribution explained in the previous section clearly imply that at temperatures below the critical one the dynamical process will spend most of its time on the extreme states. This suggests that the long time behaviour of the dynamics can be read off from observations of the process on visits to these states. More precisely, define the sets, for E ∈ R,
TN (E) ≡ σ ∈ SN Eσ ≥ uN (E) , (1.18) where uN (E) is defined in (1.6). We will call the set TN (E) “the top”, and frequently suppress indices, writing TN (E) = T (E) = T whenever no confusion is likely (the single letter T will only be used within proofs and the change in the notation will always be clearly signalled). Moreover, we will use the convention that M ≡ |TN (E)| denotes the cardinality of the top, and d ≡ 2M . Let us introduce, for σ ∈ SN , I ⊂ SN , the (slightly abusive) notation τIσ ≡ inf{n > 0|σ (n) ∈ I, σ (0) = σ }
(1.19)
for the first positive time the process starting in σ reaches the set I , i.e. here and in the following we will write P[τIσ = k] ≡ P[τIσ = k|σ (0) = σ ].
(1.20)
Let us recall that in [BEGK1, BEGK2] a very similar program was carried out in a situation that we consider generic for systems having “metastable states”. A key characterization of the effective dynamics on such a set M involves the quantities P[τIx < τxx ] (that, in potential theoretic language, are closely related to Newtonian capacities). There,
386
G. Ben Arous, A. Bovier, V. Gayrard
as here, we identified certain subsets M of the state space, . They are called metastable sets, if they satisfy the properties that x < τx] supx∈ P[τM x 1. z inf z∈M P[τM < τzz ] \z
(1.21)
Equation (1.21) implies a separation of the time-scales of the motion towards the set M (“fast scale”) and the motion within the set M (“slow scale”). Under some additional “non-degeneracy” hypothesis, namely that y
(i) for all pairs x, y ∈ M, and any set I ⊂ M\{x, y} either P[τIx < τxx ] µ(y)P[τI < y y y τy ] or P[τIx < τxx ] µ(y)P[τI < τy ], and (ii) there exists m1 ∈ M, s.t. for all x ∈ M\m1 , µ(x) µ(m1 ), it was shown in [BEGK2] that the motion on the set M can be described as a sequence of exits with asymptotically exponentially distributed times (on distinct scales) towards the more stable states, i.e. the equilibrium. It was also shown that the inverse mean exit times from any point x ∈ M are asymptotically equal to the small eigenvalues of the generator of the Markov chain. In the random energy model we will find ourselves in a situation where all of these hypotheses are not satisfied. When checking condition (1.21) with M ≡ T (E) we will see that this is not satisfied, and that, rather, supσ ∈SN P[τTσ(E) < τσσ ] inf σ ∈M P[τTσ(E)\σ < τσσ ]
→ 1,
as
E ↓ −∞.
(1.22)
Moreover, all the quantities P[τTσ(E)\σ < τσσ ] for x ∈ T (E) will turn out to be comparable. Thus the situation is completely different than in [BEGK2], and we have to expect a much more complicated behaviour of the process on T (E). Moreover, there is no natural criterion for the choice of a particular value of E, and we will, in fact, see later (in [BBG]) that it is somehow natural to consider limits as E ↓ −∞. In any case our purpose is the description of the process observed on T (E). Our first result concerns just the “motion” of the process disregarding time. To that effect we consider the random times θ0 ≡ min{n > 0|σ (n) ∈ T (E)}, θ ≡ min{n > θ−1 |σ (n) ∈ T (E)\σ (θ−1 )}.
(1.23)
Let ξ 1 , . . . , ξ |T (E)| be an enumeration of the elements of T (E). Now define (for fixed N and E), the stochastic process Y with state space {1, . . . , |T (E)|} and discrete time ∈ N by (N)
Y
= i ⇔ σ (θ ) = ξ i .
(1.24)
It is easy to see that Y is a Markov process. Moreover, the transition matrix elements can be expressed as i ξ ξi (1.25) p(i, j ) ≡ P τξ j < τT (E)\{ξ i ∪ξ j } . Note that this Markov chain has a state space whose size |T (E)| is a random variable. To formulate our first theorem it will be convenient to fix the size by conditioning. Thus set PM (·) ≡ P (·| |T (E)| = M).
Aging in the REM. Part I
387
Theorem 1.3. Let σ (n) denote the Markov chain with transition matrix defined in (1.3) and whose initial distribution is the uniform distribution on SN 5 . Let Y (N) be the Markov process defined by (1.24). Let Y denote the Markov chain on {1, . . . , M} with transition ∗ given by matrix pM 1 , if i = j ∗ (1.26) pM (i, j ) = M−1 0, if i = j ∗ (i) = 1/M. Then, for all M ∈ N, and initial distribution pM
D
Y (N) → Y,
PM -a.s.
(1.27)
Remark. Note that the statement of the theorem also implies the convergence in law (w.r.t. P ) of the probability distribution of Y (N) to that of Y . The next results concern mean times. √ Theorem 1.4. Assume that α ≡ β/ 2 ln 2 > 1. Then there exists a subset E ⊂ with = 1, such that for all ω ∈ E, for all N large enough, the following holds: P (E) i) For all η ∈ T (E), η E τT (E)\η =
1 1−
1 |T (E)|
eβ
√ NEη
+ Wβ,N,T (E) (1 + O(1/N )).
(1.28)
ii) For all σ ∈ / T (E), E τTσ(E) ≤ E(τTσ(E) )
≥
1 1−
1 |T (E)|
1 1−
1 |T (E)|
eβ
e
√ NEσ
√ β NEσ
+ Wβ,N,T (E) (1 + O(1/N )) 1 − eE (α − 1) + Wβ,N,T (E) (1 + O(1/N )). 1 + 1/|T (E)| (1.29)
iii) For all η, η¯ ∈ T (E), η = η, ¯ η η η η E τη¯ | τη¯ ≤ τT (E)\η − E τT (E)\η ≤
1 1−
1 |T (E)|
Wβ,N,T (E) O(1/N ), (1.30)
where √
Wβ,N,T (E)
e(α−1)E+β NuN (0) ≡ |T (E)|(α − 1)
α−1 1 + VN,E eE/2 √ 2α − 1
(1.31)
and VN,E is a random variable of mean zero and variance one. Theorem 1.4 is complemented by a somewhat converse result in the case α < 1: 5 In fact it is enough, for the result to hold, that the initial distribution gives zero mass to an -neighborhood of T (E).
388
G. Ben Arous, A. Bovier, V. Gayrard
Theorem 1.5. Assume that α < 1. Then, with probability one, for all N large enough, for all σ ∈ SN , EτTσ(E) =
1 2 η eN(β /2+ln 2) (1 + O(1/N )) sup EτSN \η . |T (E)| − 1 η∈SN
(1.32)
Remark. Since as N ↑ ∞, E|T (E)| → e−E , we see that for −E very large, Wβ,N,T (E) ∼ eα(E+uN (0)) . Thus (ii) of Theorem 1.4 implies that if α > 1, for all σ ∈ T (E), the mean time of arrival in the√top is proportional to eα(E+uN (0)) . On the other hand, there exists η ∈ T (E) such that √N Eη ∼ E +uN (0)+O(eE ), so that the slowest times of exit from η a state, EτSN \η = eβ NEη , in T (E) are just of the same order. This can be expressed by saying that on the average the process takes a time t to reach states that have an exit time t. This is a first, and weak, manifestation of the aging phenomenon that we will investigate in much greater detail in [BBG]. In contrast, if α < 1, Theorem 1.5 η EτTσ(E) supη∈SN EτSN \η , and thus the time spent in top states is irrelevant compared to the time between successive visits of such states. Thus we see a clear distinction between the high and the low temperature phase of the REM on the dynamical level. Remark. Statement iii) of Theorem 1.4 expresses the fact that the mean times of passage from a state η ∈ T (E) to another state η¯ ∈ T (E) are asymptotically independent of the terminal state η. ¯ This confirms to some extent the heuristic picture of Bouchaud. Indeed, if we added the hypothesis that the process observed on the top is Markovian, then the two preceding theorems would immediately imply that the waiting times must be exponentially distributed with rates independent of the terminal state and given by (1.30). We will see in [BBG] that this, however, cannot be justified. The remainder of this paper is devoted to proving Theorems 1.3, 1.4, and 1.5. Section 2 will in fact prove a number of results that will not only imply Theorem 1.3, but will also furnish basic input to both Sect. 3 and the follow-up paper [BBG]. Section 3 contains the proof of Theorems 1.4 and 1.5. 2. Probability Estimates In this section we provide estimates that will immediately allow to prove Theorem 1.3. In fact we will prove much more, anticipating what will be needed in Sect. 3 as well as in the follow-up paper [BBG]. These results are collected in the following proposition. 1/2 Proposition 2.1. Set M = |T (E)|, d = 2M and δ(N ) ≡ Nd log N . There exists a subset E ⊂ with P (E) = 1, such that for all ω ∈ E, for all N large enough, the following holds: For ε > 0 a constant, define the sets √ (2.1) B√εN (σ ) = σ ∈ SN | σ − σ 2 ≤ εN , σ ∈ SN and Wε (I ) ≡
σ ∈I
c B√ (σ ), εN
I ⊆ SN .
(2.2)
Aging in the REM. Part I
389
Then, i) For all ε > 0 there exists a constant c > 0 such that, for all η ∈ T (E) and all σ ∈ Wε (T (E)), P τησ < τTσ(E)\η −
1 M
≤
d NM (1 + cδ(N )).
(2.3)
ii) There exists a constant c > 0 such that, for all η ∈ T (E) and η¯ ∈ T (E) with η = η, ¯ √ β NEη¯ η¯ P τηη¯ < τT (E)\η − e
1 M
≤
d NM (1 + cδ(N )).
(2.4)
iii) There exists a constant c > 0 such that, for all η ∈ T (E) and η¯ ∈ T (E) with η = η, ¯ η¯ − P τηη¯ < τT (E)\{η,η} ¯
1 M−1
≤
d N(M−1) (1 + cδ(N )).
(2.5)
iv) There exists a constant c > 0 such that, for all η ∈ T (E), √ β NEη η P τT (E)\η < τηη − 1 − e
1 M
≤ 1−
1 M
d N (1 + cδ(N )).
(2.6)
v) There exists a constant c > 0 such that, for all σ ∈ / T (E), 1−
1 M
1−
d N (1 + cδ(N ))
≤ eβ
√ NEσ
P τTσ(E) < τσσ ≤ 1.
(2.7)
vi) For all ε > 0 there exists a constant c > 0 such that, for all σ ∈ / T (E) and all σ¯ ∈ Wε (T (E) ∪ σ ), 1 M+1
+
d NM (1 − cδ(N ))
≤ P τσσ¯ ≤ τTσ¯ (E) ≤
1 M
+
d NM (1 + cδ(N )).
(2.8)
Proof of Theorem 1.3. Assuming the proposition, Theorem 1.3 follows immediately from iii) and i), together with the fact that the mass of the set SN \W (T ) under the uniform measure on SN tends to zero as N tends to infinity. Let us briefly highlight the structure of the proof 2.1. In Subsect. 2.1 of Proposition σ σ we will show that, for I ⊂ SN , the probabilities P τη < τI can be expressed in terms of a lumped chain through a lumping procedure that allows to reduce the high dimensional state space SN to a much smaller one. In Subsect. 2.2 we analyse the lumped chain and establish the probability estimates which will serve as basic input to the proof of Proposition 2.1. The proof of the proposition is then carried out in Subsect. 2.3.
390
G. Ben Arous, A. Bovier, V. Gayrard
2.1. Lumped chains: Definition and properties. Lumping procedure. We begin with some preparatory notation and definitions. For M an integer, let SM×N be the set of all M × N matrices whose elements belong to S = {−1, 1}. A matrix ξ ∈ SM×N will be written either in terms of its matrix elements, row vectors or column vectors according to the following notation. In terms of its matrix µ µ=1,...,M µ elements we will write ξ = (ξi )i=1,...,N , where ξi ∈ S is the element lying at the intersection of the µth row and i th column. The row and column vectors of ξ will be denoted respectively by ξ µ and ξi , and written, in terms of their elements, as: µ
ξ µ = (ξi )i=1,...,N ∈ SN ,
µ ∈ {1, . . . , M},
µ (ξi )µ=1,...,M
i ∈ {1, . . . , N}.
ξi =
∈ SM ,
(2.9)
Observe that, when carrying an index placed as a superscript, the letter ξ refers to an element of the cube SN while, when carrying an index placed as a subscript, it refers to an element of the cube SM . As is usual, ξ may then be written as the N -tuple formed by its column vectors, ξ = (ξ1 , . . . , ξi , . . . , ξN )
(2.10)
or, denoting by t ξ the transpose matrix, as the M-tuple formed by its row vectors, t
ξ = (ξ 1 , . . . , ξ µ , . . . , ξ M ).
(2.11)
Given a subset I ⊂ SN we define a partition of the index set ≡ {1, . . . , i, . . . , N} in the following way. Let ξ = (ξ1 , . . . , ξi , . . . , ξN ) ∈ S|I |×N be any matrix having the property that I = ξ 1 , . . . , ξ µ , . . . , ξ |I | (2.12) in other words, any matrix having the set I for a set of row vectors. Next, let {e1 , . . . , ek , . . . , ed } be an arbitrarily chosen labeling of all d = 2|I | elements of S|I | (this labeling will be kept fixed throughout, whatever the choice of I is). Then ξ induces a partition of into d disjoint (possibly empty) subsets, k (I ), obtained by grouping together all indices i having the property that ξi = ek : =
d
k (I ),
k (I ) = {i ∈ | ξi = ek }.
(2.13)
k=1
We will write PI () = {k (I ), 1 ≤ k ≤ d} .
(2.14)
Remark. Observe that with the notation introduced above, we do not keep track of the particular choice of the matrix ξ we made. The reason is that since any two matrices satisfying (2.12) are obtained from each other by a permutation of their rows, the partitions they induce only differ through the labeling of the sets (2.13). As this labeling will be irrelevant for our purposes we will as a rule forget the underlying matrix. It is understood that in all statements involving PI (), a choice has been fixed.
Aging in the REM. Part I
391
Finally, this partition is used to define a many-to-one function, γI , that maps the elements of SN into d-dimensional vectors, γI (σ ) = γI1 (σ ), . . . , γIk (σ ), . . . , γId (σ ) , σ ∈ SN , (2.15) where, for all k ∈ {1, . . . , d}, γIk (σ ) =
1 σi . |k (I )|
(2.16)
i∈k (I )
A few elementary properties of γI are listed in the lemma below. Lemma 2.2. i) The range of γI , N,d (I ) ≡ γI (SN ), is a discrete subset of the d-dimensional cube [−1, 1]d and may be described as follows. Let {uk }dk=1 be the canonical basis of Rd . Then, x ∈ N,d (I ) ⇐⇒ x =
d k=1
nk uk , |k (I )|
(2.17)
where, for all 1 ≤ k ≤ d, |nk | ≤ |k (I )| has the same parity as |k (I )|. ii) |{σ ∈ SN | γI (σ ) = x}| =
d |k (I )| , k |k (I )| 1+x 2 k=1
∀x ∈ N,d (I ).
(2.18)
In particular, the restriction of γI to I is a one-to-one mapping from I onto γI (I ). iii) The elements of I are mapped onto corners of [−1, 1]d : for all σ ∈ I , γI (σ ) = (σi1 , . . . , σik . . . , σid ), iv) Let σ ∈ SN be such that inf η∈I \σ and I ≡ γI (I ). Then
for any choice of indices ik ∈ k (I ). (2.19) √ σ − η2 ≥ εN for some ε > 0. Set x ≡ γI (σ )
εN inf x − y2 ≥ √ . 2 d maxk |k (I )|
y∈I \x
(2.20)
Proof of Lemma 2.2. Assertions i), ii), and iii) result from elementary observations. To prove assertion iv) note that for any η ∈ I \ σ , setting y ≡ γI (η) and using 2.19, we have: εN ≤
N
(σi − ηi ) =
i=1
2
d
(σi − yk ) = 2 2
k=1 i∈k
d
|k (I )|(1 − yk xk )
k=1
≤ 2 max |k (I )|(y, y − x),
(2.21)
k
where we √ used in the last line that 1 − yk xk = yk (yk − xk ). But (y, y − x) ≤ y2 y − x2 = dy − x2 , so that εN x − y2 ≥ √ 2 d maxk |k (I )| which, together with assertion ii) yields (2.20).
(2.22)
392
G. Ben Arous, A. Bovier, V. Gayrard
The I -lumped chain. In the sequel we will denote by σN◦ (t) t∈N the ordinary random walk (ORW) associated to {σN (t)}t∈N , that is, the walk evolving on the edges of GN according to the transition probabilities ◦ pN (σ, σ )
=
1 N,
0,
if σ − σ 2 = otherwise
√ 2
.
(2.23)
All objects referring to the ORW will be distinguished
from those referring to the chain {σN (t)} by the superscript ◦ . Note in particular that σN◦ (t) is reversible w.r.t. the measure µ◦N (σ ) = 2−N ,
σ ∈ SN .
(2.24)
We will denote by XI,N (t) t∈N and call the I -lumped chain or the lumped chain induced by I , the chain defined through XI,N (t) ≡ γI (σN◦ (t)),
∀t ∈ N.
(2.25)
To N,d (I ) we associate an undirected graph, G( N,d (I )) = (V ( N,d (I )), E( N,d (I ))), with set of vertices V ( N,d (I )) = N,d (I ) and the set of edges: E( N,d (I )) = (x, x ) ∈ N,d (I ) | ∃k∈{1,...,d} , ∃s∈{−1,1} : x − x = s |k2(I )| uk . (2.26)
The properties of XI,N (t) are summarized in the lemma below. Lemma 2.3. Given any subset I ∈ SN :
i) The process XI,N (t) is Markovian no matter how the initial distribution π ◦ of {σN◦ (t)} is chosen. ii) Set Q◦N = µ◦N ◦ γI−1 . Then Q◦N is the unique reversible invariant measure for the chain XI,N (t) . In explicit form, the density of Q◦N reads: 1 |{σ ∈ SN | γI (σ ) = x}|, ∀x ∈ N,d (I ). (2.27) 2N
iii) The transition function rN◦ ( . , . ) of XI,N (t) does not depend on the choice of π ◦ and is given by: Q◦N (x) =
rN◦ (x, x )
=
|k (I )| 1−sxk N 2
0,
if (x, x ) ∈ E( N,d (I ))) and x − x = s |k2(I )| uk . otherwise (2.28)
Proof. The proof of this lemma is a direct application of the results of Burke and Rosenblatt [BR] on Markovian functions of Markov Chains.
Aging in the REM. Part I
393
Comparison lemmata. In order to make use of the above set-up we first need to establish how the Markov chain σ (t) relates to the ORW. This is done in the next lemma. Lemma 2.4. Let I ⊂ SN . Then, i) for all σ ∈ / I and η ∈ / I ∪σ P τησ < τIσ = P◦ τησ < τIσ ,
(2.29)
ii) for all σ ∈ I and η ∈ /I √ P τησ < τIσ = e−β NEσ P◦ τησ < τIσ .
(2.30)
It finally remains to establish how the quantities P◦ τησ < τIσ can be expressed in terms of a lumped chain. Lemma 2.5. Let I, J, K ⊂ SN be such that I ∩ J = ∅ and I ∪ J ⊆ K. Then, denoting by R◦ the law of the K-lumped chain, γ (σ ) γ (σ ) P◦ τIσ ≤ τJσ = R◦ τγKK(I ) ≤ τγKK(J ) , for all σ ∈ / I. (2.31) Remark. Note that K in the above lemma does not necessarily contain σ if σ ∈ / J. We skip the proofs of Lemma 2.4 and 2.5 as they are nothing but elementary exercises.
2.2. Main ingredients of the proof of Proposition 2.1. Observe that the entropy produced by the lumping procedure gives rise through (2.27) to a potential, FN (x) ≡ − N1 ln Q◦N (x). It moreover follows from assertions ii) and iii) of Lemma 2.2 that this potential is convex and takes on its global maximum at the corners of the cube [−1, 1]d . This allows us to draw on the results of [BEGK1] where such processes were investigated. Throughout this section I denotes an arbitrary (non empty) subset of SN whose size, |I |, does not depend on N. Given 0 < < 1 let K(I ) and K(I )c be the sets defined through:
K(I ) ≡ K (I ) ≡ k ∈ {1, . . . , d} | |k (I )| ≥ Nd , K(I )c ≡ K (I )c ≡ {1, . . . , d} \ K (I ).
(2.32)
Set κ = |K(I )|. Of course κ ≥ 1 since supposing κ = 0, (2.32) implies that dk=1 |k (I )| < N < N, contradicting (2.13). Let π : Rd → Rκ be the projection that maps x = (x1 , . . . , xd ) into πx = (xi1 , . . . , xiκ ) where, for all 1 ≤ j ≤ κ, ij ∈ K(I ). Finally, set N∗ = min |k |. k∈K(I )
With this notation we have:
(2.33)
394
G. Ben Arous, A. Bovier, V. Gayrard
Lemma 2.6. There exists a constant c > 0 such that, for all N large enough, 1 R◦ τ0x < τxx ≥ 1 − N
|µ |
1−
µ∈K(I )c
1 c − 2 N∗ N∗
,
for all x ∈ γI (I ). (2.34)
Lemma 2.7. Let x ∈ γI (I ) and y ∈ γI (SN ) be such that π x − πy2 ≥ δ for some constant 21 > δ > 0. Then there exist a constant h(δ, κ) > 0 such that, for all N large enough, y y R◦ τx < τ0 ≤ e−N∗ h(δ,κ) .
(2.35)
As an important consequence of the previous two lemmata we may immediately state: Lemma 2.8. Let x ∈ γI (I ) and J ⊆ γI (I ) be such that for all y, y ∈ J ∪ x, πy − πy2 ≥ δ for some δ > 0. Then, for all N large enough, ϑ 1 ≤ R◦ τx0 ≤ τJ0 ≤ , |J | ϑ|J |
for all J ⊆ γI (I ), x ∈ γI (I ),
(2.36)
where 1 ϑ = 1− N
1 c 1− − 2 . N∗ N∗
|µ |
µ∈K(I )c
In particular, if K(I )c = ∅, ◦ 0 c R τ ≤ τ 0 − 1 ≤ 1 1 + x J |J | |J |N∗ N∗
(2.37)
(2.38)
for some constant c > 0. Proof of Lemma 2.6. An L-steps path ω on N,d (I ), beginning at x and ending at y is defined as a sequence of L sites ω = (ω0 , ω1 , . . . , ωL ), with ω0 = x, ωL = y, and ωl = (ωlk )k=1,...,d ∈ V ( N,d (I )) for all 1 ≤ l ≤ L, that satisfies: (ωl , ωl−1 ) ∈ E( N,d (I )),
for all
l = 1, . . . , L.
(2.39)
(We may also write |ω| = L to denote the length of ω.) Recall from Lemma 2.2 that if x ∈ γI (I ), then a fortiori x ∈ {−1, 1}d . Without loss of generality we may thus choose x in (2.34) as the point x = (xk )dk=1 , xk = 1 for all 1 ≤ k ≤ d. There is no loss of generality either in taking K(I ) in (2.32) to be the set K(I ) = {1, . . . , κ} and in assuming |k (I )| to be even for all k ∈ K(I ). With this we introduce κ one-dimensional paths in N,d (I ), each being of length L ≡ κN∗ /2, and connecting x to the endpoint y defined by N∗ 1 − | , if k ∈ K(I ) d k| y = (yk )k=1 , yk = . (2.40) 1, if k ∈ K(I )c
Aging in the REM. Part I
395
Definition 2.9. For each 1 ≤ µ ≤ κ, let ω(µ) = (ω0 (µ), . . . , ωn (µ), . . . , ωL (µ)), ωn (µ) = (ωnk (µ))dk=1 , be the path in N,d (I ) defined through (k+µ−2) modκ+1 , if k ∈ K(I ) ωn , (2.41) ωnk (µ) = 1 if k ∈ K(I )c where ω = (ω0 , . . . , ωn , . . . , ωL ), ωn = (ωnk )dk=1 , is defined by ω0 = x and, for 1 ≤ n ≤ L, 1 − |k2(I )| n−1 − κ k n−1 ωn = 1 − 2 , |k (I )| κ 1
2 |k (I )| ,
(2.42)
if k ∈ K(I ) and k ≤ n − κ
n−1 κ
. if k ∈ K(I ) and k > n − κ n−1 κ if k ∈ K(I )c (2.43)
Here [x], x ∈ R, denotes the integer part of x. (The paths ω(µ) are in fact paths on the subgraph {z ∈ N,d (I ) | zk = 1 ∀k ∈ K(I )c }.) Let D be the subgraph of G( N,d (I )) with a set of vertices V (D) = {x ∈ N,d (I ) | 2 ≤ y2 } and a set of edges E(D) = {(x , x ) ∈ E( N,d (I )) | x , x ∈ V (D)}. Denoting by µ the subgraph of G( N,d (I )) “generated” by the path ω(µ), i.e., with a set of vertices V (µ ) = {x ∈ N,d (I ) | ∃0≤n≤L : x = ωn (µ)}, we set x
=D∪
κ
µ .
(2.44)
µ=1
Since both x and 0 belong to it follows from Lemma (4.1) of the appendix that x x ◦ R◦ τ0x < τxx ≥ R τ0 < τx y y ◦ ◦ = R τyx < τxx R τ0 < τx , (2.45) where the last equality is nothing but the Markov property. Again, the collection µ , 1 ≤ µ ≤ κ, being easily seen to verify conditions (4.2) and (4.3) of Lemma 4.1 (w.r.t. the event τyx < τxx ), we have, applying the latter lemma twice in a row, ◦ R τyx < τxx ≥ R∪◦ κ
µ=1 µ
κ ω0 (µ) ω0 (µ) ◦ τyx < τxx ≥ τ R < τ ωL (µ) ω0 (µ) µ
(2.46)
µ=1
and combining (2.45) and (2.46), we have, κ y y ω0 (µ) ω0 (µ) ◦ ◦ R◦ τ0x < τxx ≥ τ R τ0 < τx R < τ ωL (µ) ω0 (µ) . µ
(2.47)
µ=1
The bound (2.45) is of course meaningless if it so happens that y = 0. In this special case we only use (2.46) to write κ ω0 (µ) ω0 (µ) ◦ R◦ τ0x < τxx ≥ < τ τ (2.48) R ωL (µ) ω0 (µ) . µ µ=1
Thus, in view of (2.47) and (2.48), Lemma 2.6 will be proven if we can establish that:
396
G. Ben Arous, A. Bovier, V. Gayrard
Lemma 2.10. Under the assumptions of Lemma 2.6: i) There exists a constant c > 0 such that, for large enough N , for each µ ∈ K(I ) and with N∗ defined as in (2.33), | | c 1 µ ω0 (µ) ω0 (µ) ◦ (2.49) − 2 . 1− Rµ τωL (µ) < τω0 (µ) ≥ N N∗ N∗ ii) Assume that y = 0. There exists a constant c > 0 such that, for all N large enough, y y ◦ (2.50) τ0 < τx ≥ 1 − cdN 3/2 2−N/d . R Proof of Lemma 2.10 i). To simplify the presentation we will only treat the case µ = 1, that is, with the notation of Definition 2.9, establish that ω |1 | 1 c ◦ 0 < τ ω0 ≥ τ 1 − . (2.51) − R ωL ω0 1 N N∗ N∗2 It is well known that (see e.g. [Sp] or [BEGK1], Lemma 2.5) ◦ R 1
τωωL0
<
τωω00
! =
L Q ◦ (ω0 ) µ n=1
◦ Q µ
1 ◦ rµ (ωn , ωn−1 ) (ωn )
"−1 (2.52)
◦ and which we may also write, using reversibility together with the definitions of r µ ◦ (see Appendix A), Q µ
◦ R τωωL0 < τωω00 1
−1 !L−1 "−1 N∗ /2−1 κ QN (ω0 ) 1 = = Am,l , (2.53) QN (ωn ) rN (ωn , ωn+1 ) m=0
n=0
l=1
where Am,l ≡
QN (ω0 ) 1 . QN (ωmκ+l−1 ) rN (ωmκ+l−1 , ωmκ+l )
(2.54)
κ |k | |l | − m k 1−ωmκ+l−1 , N k=1 |k | 2
(2.55)
By (2.27) and (2.18), A−1 m,l = and by (2.43) |k | |l | − m |k | m m+1 N k>l−1 k≤l−1 κ |l | − m |k | |k | − m , = m N m+1
A−1 m,l =
k =1
(2.56)
k≤l−1
where we use the convention that the second product above is one whenever the index set k ≤ l − 1 is empty. From now on we distinguish two cases.
Aging in the REM. Part I
397
1) The case κ = 1. Here N∗ = |1 |. Inserting (2.56) in (2.53) yields
◦ R τωωL0 < τωω00 1
−1 N∗ /2−1 |1 | Cm , = N
(2.57)
m=0
where −1
Cm ≡
|1 | m
|1 | |1 | − 1 −1 . = m |1 | − m
(2.58)
Then, using the bound
−1
|1 | m
≤
6 , (|1 | − 1)3
3 ≤ m ≤ N∗ /2 − 1
(2.59)
easily yields ω |1 | c 1 ◦ 0 < τ ω0 ≥ − τ 1 − R ωL ω0 1 N N∗ N∗2
(2.60)
for some constant c > 0. 2) The case κ > 1. Inserting (2.56) in (2.53) yields
◦ R τωωL0 < τωω00 = 1
N∗ /2−1 m=0
!
−1 "−1 κ |k | Bm , m
(2.61)
k =1
where κ
m+1 N |l | − m |k | − m l=1 ! k≤l−1 " l κ m+1 N 1+ = |1 | − m |l | − m l=2 l =2 " ! κ m + 1 l−1 N ≤ . 1+ |1 | − m N∗ − m
Bm ≡
(2.62)
l=2
Since (m + 1)/(N∗ − m) < 1 for all 0 ≤ m ≤ N∗ /2 − 1, Bm ≤
N (N∗ − m) . (|1 | − m)(N∗ − 2m − 1)
(2.63)
Inserting (2.63) in (2.61),
◦ R τωωL0 < τωω00 1
−1 N∗ /2−1 |1 | ≥ Cm , N m=0
(2.64)
398
G. Ben Arous, A. Bovier, V. Gayrard
where
! Cm =
"−1 κ (N∗ − m)|1 | |k | . m (N∗ − 2m − 1)(|1 | − m)
(2.65)
k =1
Finally, a few simple computations yield the bounds C0 = 1 + N∗1−1 , C1 ≤ N∗−κ (1 + 5N∗−1 ), Cm ≤ 2κ−2 N∗−2κ+1 , 2 ≤ m ≤ N∗ /2 − 1,
(2.66)
from which we easily get ω |1 | c 1 ◦ 0 < τ ω0 ≥ R − τ 1 − ωL ω0 1 N N∗ N∗κ
(2.67)
for some constant c > 0. As (2.60) together with (2.67) give (2.51), the first assertion of Lemma 2.10 is proven. Proof of Lemma 2.10 ii). We first write y y y y ◦ ◦ R τ0 < τx = 1 − R τx < τ0 and use the renewal identity (see e.g. Corollary 1.9 in [BEGK1]) to get ◦ τy < τy R x y y∪0 y ◦ R τx < τ0 = ◦ y y . < τy R τ
(2.68)
(2.69)
x∪0
By reversibility the numerator of (2.69) may be rewritten as Q ◦ (x) y y ◦ ◦ x x R τx < τy∪0 = τ R < τ x∪0 . ◦ (y) y Q
(2.70)
◦ (x ) = Q◦ (x )/Q◦ () we have, by (2.27), Thus, remembering that Q N N Q◦ (x) 1 |{σ | γ (σ ) = x}| y y ◦ R τx < τy∪0 ≤ N = = ◦ QN (y) |{σ | γ (σ ) = y}| |{σ | γ (σ ) = y}| which by (2.18), for y defined in (2.40), gives: −1 −1 | | k y y ◦ ≤ N∗ R , τx < τy∪0 ≤ N∗ /2 N∗ /2
(2.71)
(2.72)
k∈K(I )
where we used that there exists at least one index k ∈ K(I ) with the property that |k | = N∗ . Since by (2.32) N ≥ N∗ ≥ Nd , Stirling’s formula enables us to conclude that, for large enough N , ' √ y y ◦ R τx < τy∪0 ≤ c N∗ 2−N∗ ≤ c N 2−N/d (2.73) for some constant c > 0.
Aging in the REM. Part I
399
To bound the probability appearing in the denominator of (2.69) we again resort to the path technique employed in the proof of assertion i). As we need only a rough estimate, this probability will be estimated by means of a single path, ω. Setting L=L 0 + · · · + Ld , 0, Lk = 21 (|k (I )| − N∗ ), |k (I )| , 2
if k = 0 if k ∈ K(I ) , if k ∈ K(I )c
(2.74)
ω = ( ω0 , . . . , ωL ) is defined as follows y, if n = 0
k−1
k ωn = . 2 if ωn−1 − |k (I )| uk , l=0 Ll , 1 ≤ k ≤ d l=0 Ll < n ≤
(2.75)
the subgraph of G( N,d (I )) generated by the (Observe that ωL = 0.) Denoting by D ωn }), we path ω (i.e., with a set of vertices V (D) = {x ∈ N,d (I ) | ∃0≤n≤L : x = have ⊂D⊂ D
(2.76)
and thus, by Lemma 6.1, that y y y y y y ◦ ◦ ◦ τx∪0 < τy ≥ R τ0 < τy ≥ RD R τ0 < τy .
(2.77)
To bound the last probability in (2.77) note that, just as in (2.53), ◦ RD
ω0 τ ωL
<
ω0 τ ω0
!L−1 "−1 QN ( ω0 ) 1 = . QN ( ωn ) rN ( ωn , ωn+1 )
(2.78)
n=0
ωn ) increases as n increases from At this stage, simply observe that on the one hand, QN ( 0 to L, implying that QN ( ω0 )/QN ( ωn ) ≤ 1 for all 0 ≤ n ≤ L, while on the other hand,
k for each 1 ≤ k ≤ d and all k−1 l=0 Ll , l=0 Ll ≤ n < ωn , ωn+1 ) = rN (
|k (I )| |k (I )| (1 + ωnk ) ≥ . 2N 2N
(2.79)
Therefore ◦ RD
ω0 τ ωL
<
ω0 τ ω0
≥
d L k −1 |k (I )| k=1 m=0
2N
−1
=
! d k=1
|k (I )| Lk 2N
"−1 ≥
1 , (2.80) Nd
where the last inequality follows from (2.74). Putting (2.80) back in (2.77) finally gives y y ◦ R (2.81) τx∪0 < τy ≥ (N d)−1 . Inserting (2.81) and (2.73) in (2.69) and plugging the resulting bound in (2.68) yields (2.50). The second assertion of Lemma 2.10 being proven, this concludes the proof of Lemma 2.10.
400
G. Ben Arous, A. Bovier, V. Gayrard
Inserting the bounds of Lemma 2.10 in 2.47 we obtain 1 1 |µ | 1− − R◦ τ0x < τxx ≥ 1 − N N∗ µ∈K(I )c 1 1 |µ | 1− − ≥ 1− N N∗ c µ∈K(I )
c N∗2 c N∗2
1 − c N 2−N/d
,
(2.82)
where the last inequality holds true for some constant c > 0, provided that N is large enough. The first assertion of 2.6 is proven. Proof of Lemma 2.7. For ρ ≥ 0 and x ∈ γI (I ) set
x,ρ
N,d (I ) = y ∈ N,d (I ) | π x − πy 2 > ρ .
(2.83)
By hypothesis, x,δ (I ). y ∈ N,d
(2.84)
Observe moreover that either y satisfies x,δ/2
i) ∀z ∈ {−1, 1}d ∩ N,d (I ), πz − πy2 > δ/2 or else x,δ ii) ∃z ∈ {−1, 1}d ∩ N,d (I ) such that πz − πy2 ≤ δ/2. We will first show that in case i), (2.35) is a direct consequence of reversibility. Indeed, as in (2.69), y y R◦ τx < τy∪0 y y (2.85) R◦ τx < τ0 = ◦ y y . R τx∪0 < τy A straightforward adaptation of the proof of the bound (2.81) to the case at hand shows that the denominator of (2.85) obeys the bound y y (2.86) R◦ τx∪0 < τy ≥ 1/N while by reversibility its numerator may be rewritten as Q◦ (x) y y x . R◦ τx < τy∪0 = ◦ R◦ τyx < τx∪0 Q (y)
(2.87)
Thus, by (2.27), Q◦ (x) |{σ | γ (σ ) = x}| 1 y y = = , R◦ τx < τy∪0 ≤ N Q◦N (y) |{σ | γ (σ ) = y}| |{σ | γ (σ ) = y}|
(2.88)
where the last equality follows from the fact that x ∈ γI (I ) (see Lemma 2.2). To estimate the last ratio note that condition i) combined with (2.84) implies that inf
z ∈{−1,1}κ
z − πy2 > δ/2
(2.89)
Aging in the REM. Part I
401
√ which in turn implies that there exists k ∈√K(I ) such that inf s=±1 |s − yk | > δ/2 κ, or in other words, such that |yk | < 1 − δ/2 κ. Thus, making use of (2.18) and Stirling’s formula, −1 |k (I )| |{σ | γ (σ ) = y}|−1 ≤ 1+y |k (I )| 2 k ≤ c exp {−| k (I )|I(yk )} ( ≤ c exp −N∗
inf √ I(u) |u|<1−δ/2 κ
,
for some constant c > 0, with N∗ defined as in (2.33), and where 1−u 1−u 1+u 1+u −I(u) = ln + ln , |u| ≤ 1. 2 2 2 2
(2.90)
(2.91)
Collecting all our bounds we arrive at R
◦
y τx
<
y τ0
(
≤ cN exp −N∗
inf √ I(u) |u|<1−δ/2 κ
.
(2.92)
As δ > 0, inf |u|<1−δ/2√κ I(u) > 0. Choosing h(δ, κ) = inf |u|<1−δ/2√κ I(u)/2, it then follows from (2.92) that, for all N large enough, y y R◦ τx < τ0 ≤ exp {−N∗ h(δ, κ)} . (2.93) Thus, under the assumption made in i), (2.35) is proven. Let us turn to the case ii). Observe that for δ < 1/2 there exists a unique point z ∈ {−1, 1}κ such that π z − πy2 ≤ δ/2. Calling z∗ this point and introducing the discrete hyper-surface Hδ/2 (z∗ ) = {z ∈ N,d (I ) | z∗ − πz 2 = δ/2}, we have y y y y R◦ τx < τ0 = R◦ τz < τHδ/2 (z∗ ) R◦ τxz < τ0z . (2.94) z ∈Hδ/2 (z∗ )
Now all points z ∈ Hδ/2 (z∗ ) have the following properties: firstly, as is obvious from x,δ the definition of Hδ/2 (z∗ ), πz − πz 2 > δ/4 for all z ∈ {−1, 1}d ∩ N,d (I ), implying
x,δ that assumption i) is satisfied with δ replaced by δ/2; secondly, since z∗ ∈ N,d (I ) by assumption, then
δ ≤ π x − πz∗ 2 ≤ πx − πz 2 + πz − π z∗ 2 ≤ π x − π z 2 + δ/2
(2.95)
implying that πx − πz 2 ≥ δ/2, i.e., that z ∈ N,d (I ). As a result, for each z ∈ Hδ/2 (z∗ ), the probability R◦ τxz < τ0z obeys the bound (2.92) with δ replaced by δ/2. It therefore follows from (2.94) that y y y y R◦ τx < τ0 ≤ exp {−N∗ h(δ/2, κ)} R◦ τz < τ∂ Bδ/2 (z∗ ) x,δ/2
z ∈∂ Bδ/2 (z∗ )
≤ exp {−N∗ h(δ/2, κ)} . This concludes the proof of Lemma 2.7.
(2.96)
402
G. Ben Arous, A. Bovier, V. Gayrard
Proof of Lemma 2.8. Again using renewal as in (2.69), R◦ τ 0 ≤ τ 0 R◦ τx0 ≤ τJ0∪0 x J ∪0 ◦ 0 0 , =
R τx ≤ τJ = ◦ τ0 ≤ τ0 R◦ τJ0 < τ00 y y∈J R J ∪0
(2.97)
so that we are left to bound a term of the form R◦ τy0 ≤ τJ0∪0 , y ∈ J . To do so observe that (2.98) R◦ τy0 ≤ τJ0∪0 = R◦ τy0 < τ00 − R◦ τJ0\y < τy0 < τ00 and that
R◦ τJ0\y < τy0 < τ00 = R◦ τz0 ≤ τJ0∪0 R◦ τyz < τ0z .
(2.99)
z∈J \y
By assumption, the probabilities R◦ τyz < τ0z in the r.h.s. above obey the bound (2.35) of Lemma 2.7. Thus R◦ τJ0\y < τy0 < τ00 ≤ e−Nh(δ,κ) R◦ τz0 ≤ τJ0∪0 ≤e
−Nh(δ,κ)
z∈J \y R◦ τJ0
< τ00 .
(2.100)
From (2.98) and (2.100) we deduce that R◦ τy0 < τ00 − e−Nh(δ,κ) R◦ τJ0 < τ00 ≤ R◦ τy0 ≤ τJ0∪0 ≤ R◦ τy0 < τ00 (2.101) and, summing over y ∈ J , R◦ τy0 ≤ τ00 − |J |e−Nh(δ,κ) R◦ τJ0 < τ00 ≤ R◦ τJ0 ≤ τ00 ≤ R◦ τy0 < τ00 . y∈J
y∈J
(2.102) Inserting the bounds (2.101) and (2.102) into (2.97), and using that R◦ τJ0 ≤ τ00 ≤1
◦ τ0 < τ0 R y y∈J 0 we arrive at: R−e where
−Nh(δ,κ)
≤R
◦
τx0
≤
τJ0
≤R
1 1 − |J |e−Nh(δ,κ)
R◦ τx0 ≤ τ00 . R≡
◦ τ0 ≤ τ0 y y∈J R 0
(2.103)
,
(2.104)
(2.105)
Aging in the REM. Part I
403
To estimate the above ratio we use first that, by reversibility, Q◦N (x)R◦ τ0x ≤ τxx y R=
y ◦ ◦ y∈J QN (y)R τ0 ≤ τy
(2.106)
and next that, by Lemma 2.6, R ϑR ≤ R◦ τx0 ≤ τJ0 ≤ , ϑ
(2.107)
where ϑ is defined in (2.37) and R≡
Q◦N (x) . ◦ y∈J QN (y)
(2.108)
Now since J ⊆ γI (I ), and since Q◦N (y) = 2−N for all y ∈ γI (I ), R=
1 . |J |
(2.109)
Collecting (2.104), (2.107), and (2.109) yields (2.38), concluding the proof of Lemma 2.8. 2.3. Proof of Proposition 2.1. While the estimates of Sect. 2.2 will furnish all the basic ingredients to the proof of Proposition 2.1, they depend upon the choice of the mapping γI through several quantities. To put them to use we still have to identify which mappings γI will be needed and establish the properties of all related objects. Taking a look at Proposition 2.1 in the light of Lemma 2.5 tells us at once that we will be concerned with two cases only: the case where the mapping γI is induced by the elements of the top (as required for the proof of the first four assertions) or the top augmented by a non-random element of SN (which is needed for the proof of the last one). These two cases are analysed below. Notation. In this section we will systematically write T for T (E). Lumped chain induced by the Top. Let t ξ = (ξ 1 , ξ 2 , . . . , ξ |T | ) be the matrix formed of the elements of the top ordered according to the magnitude of Xσ : T = {ξ 1 , ξ 2 , . . . , ξ |T | },
where Xξ 1 ≥ Xξ 2 ≥ · · · ≥ Xξ |T | .
(2.110)
Thus ξ is here a random variable on the probability space (, F, P ). One easily verifies that the conditional distribution of ξ , given that the top contains exactly M points, is the uniform distribution over the set SM×N of M-tuples of mutually distinct points of SN , i.e.: N (2 −M)! if ζ ∈ SM×N (2N )! , (2.111) P (ξ = ζ | |T | = M) = 0, otherwise
404
G. Ben Arous, A. Bovier, V. Gayrard
where
SM×N ≡ ζ ∈ SM×N ζ µ = ζ ν
for all
1 ≤ µ, ν ≤ M, µ = ν .
(2.112)
Set δ(N) ≡ (d/N )1/2 ln N
(2.113)
(where, as before, d = 2M ) and let S M×N be defined through S M×N ≡ ζ ∈ SM×N |k (ζ )| = Nd (1 + λk (N )), |λk (N )| < δ(N ), 1 ≤ k ≤ d . (2.114) The set E appearing in the statement of Proposition 2.1 may be chosen as E= EN ,
(2.115)
N0 N>N0
where EN is given by
EN ≡ ω ∈ | ξ(ω) ∈ S |T |×N .
(2.116)
It is easy to see, using the proof of Lemma 4.2 of [G], that: Lemma 2.11. P (E) = 1.
(2.117)
We will need a certain number of geometric properties of the set T , which we collect below. Lemma 2.12. For all 0 ≤ ε < 1/2, all ω ∈ EN , and large enough N the following holds: for all η = η, ¯ η ∈ T , η¯ ∈ T , B√εN (η) ∩ B√εN (η) ¯ = ∅,
(2.118)
N 1 ηi η¯ i ≤ δ(N ). N
(2.119)
and
i=1
Proof. With the notation of (2.110) let ξ µ = ξ ν be any two distinct elements of T . For all σ ∈ B√εN (ξ µ ) we have, µ σ − ξ ν 2 ≥ ξ ν − ξ µ 2 − σ √ − ξ 2 ν µ ≥ ξ − ξ 2 − εN 1/2 N √ √ 1 ν µ ξ i ξi − εN . = 2N 1 − N i=1
(2.120)
Aging in the REM. Part I
405
Using (2.13) we may write N d 1 ν µ 1 µ ξi ξ i = |k (ξ )|ekν ek . N N i=1
(2.121)
k=1
Since ω ∈ EN by assumption, it follows from (2.114) that N
1 N
µ
ξiν ξi =
1 d
i=1
d
µ
ekν ek +
k=1
=
1 d
d
1 d
d
µ
λk (N )ekν ek
k=1 µ
λk (N )ekν ek
k=1
≤ δ(N),
(2.122)
where the second equality follows from Lemma 2.1 of [G]. Thus (2.119) is proven. Inserting (2.122) in (2.120) yields, ' √ (2.123) σ − ξ ν 2 ≥ 2N (1 − δ(N )) − εN , which would entail (2.118) if we had ' √ √ 2N (1 − δ(N )) − εN > εN .
(2.124)
Now our assumptions on ε imply that this is the case for all N large enough. The lemma is therefore proven. With our choice of EN it readily follows from (2.16) that, for ω ∈ EN , choosing e.g. = 1/2 in definition (2.33), K(T )c = ∅ and N∗ = min |k (T )| ≥
N d
max |k (T )| ≤
N d
k∈K(T )
k∈K(T )
(2.125) d 1/2
1+
d 1/2
1−
N N
ln N
ln N .
(2.126)
Of course κ = d and the projection π defined in the line preceding (2.33) simply is the identity. Knowing this we have: Lemma 2.13. Assume that ω ∈ EN . Then, for all N large enough, i) For all σ ∈ Wε (T ),
√ ε d inf πx − πγT (σ )2 ≥ (1 − δ(N )). x∈γT (T ) 2
ii) For all σ ∈ T , inf
x∈γT (T )\γT (σ )
√ πx − πγT (σ )2 ≥ (1 − 2δ(N )) d.
(2.127)
(2.128)
406
G. Ben Arous, A. Bovier, V. Gayrard
Proof. As a consequence of (2.126) and assertion iv) of Lemma 2.2 we have, for all σ ∈ Wε (T ), inf
x∈γT (T )
πx − πγT (σ )2 =
inf
x∈γT (T )
x − γT (σ )2
εN ≥ √ 2√d maxk |k (T )| ε d ≥ (1 − δ(N ))−1 2 which yields (2.127). Similarly note that if σ ∈ T then, by Lemma 2.12, ' inf η − σ 2 ≥ 2N (1 − δ(N )). η∈T \σ
(2.129)
(2.130)
Just as in (2.129) this property combined with (2.126) and Lemma 2.2, iv) implies that, for all σ ∈ T , inf
x∈γT (T )\γT (σ )
which proves (2.128).
πx − πγT (σ )2 ≥
√ 1 − δ(N ) , d 1 + δ(N )
(2.131)
We are now ready to prove the first five assertions of Proposition 2.1. Notation. The following notation will be used throughout: T = γT (T ), y = γT (σ ), x = γT (η) and x¯ = γT (η). ¯ It will moreover be assumed that ω ∈ EN . Proof of Proposition 2.1, i). Using in turn assertion i) of Lemma 2.4 and Lemma 2.5, y y P τησ < τTσ\η = P◦ τησ < τTσ\η = R◦ τx < τT \x . (2.132) Defining y y y y R1 ≡ R◦ {τx < τT \x } ∩ {τ0 < τx } , y y y y R2 ≡ R◦ {τx < τT \x } ∩ {τx < τ0 } , y y R◦ τx < τT \x may be decomposed as y y R◦ τx < τT \x = R1 + R2 . Obviously
while
(2.133)
(2.134)
y y 0 ≤ R2 ≤ R◦ τx < τ0 ,
(2.135)
y y y R1 = R◦ τ0 < τx < τT \x y y = R◦ τ0 < τT R◦ τx0 < τT0 \x y y = 1 − R◦ τT < τ0 R◦ τx0 < τT0 \x
(2.136)
Aging in the REM. Part I
407
which, together with the bound y y y y R◦ τT < τ0 ≤ R◦ τx < τ0
(2.137)
x ∈T
yields
y y R◦ τx0 < τT0 \x 1 − M sup R◦ τx < τ0 ≤ R1 ≤ R◦ τx0 < τT0 \x . (2.138) x ∈T y y We are thus left to bound the quantities supx ∈T R◦ τx < τ0 and R◦ τx0 < τT0 \x , which will be done by means of, respectively, Lemma 2.7 and Lemma 2.8: on the one hand, since by assumption σ ∈ Wε (T ), it follows from (2.127) that δ in Lemma 2.7 may √ be chosen as δ = ε 4 d , so that inserting the bound (2.126) in (2.35) yields y y R◦ τx < τ0 ≤ e−Nh (d) , for all x ∈ T (2.139) for some constant h (d) > 0 and large enough N ; on√ the other hand, it follows from (2.128) that δ in Lemma 2.8 may be chosen as δ = d so that, in view of (2.125), combining the bounds (2.38) and (2.126), we obtain ◦ 0 d 1/2 1 d R τ < τ 0 ln N (2.140) − ≤ 1 + c 0 x T \x N M NM for some constant c0 > 0. Collecting the previous estimates we obtain that, for large enough N , ◦ y d 1/2 1 d R τx < τ y ln N − ≤ 1 + c 1 N T \x M NM
(2.141)
for some constant c1 > 0. Inserting (2.141) in (2.132) yields the claim of assertion i). Proof of Proposition 2.1, ii). The proof of this second assertion closely follows that of ¯ we assertion i). Keeping in mind the notation T = γT (T ), x = γT (η) and x¯ = γT (η) may write, using in turn assertion ii) of Lemma 2.4 and Lemma 2.5, √ √ η¯ η¯ P τηη¯ < τT \η = e−β NEη¯ P◦ τηη¯ < τT \η = e−β NEη¯ R◦ τxx¯ < τTx¯ \x . (2.142) We then decompose R◦ τxx¯ < τTx¯ \x as in (2.134), and bound R2 as in (2.135). As for R1 we write, just as in (2.136), R1 = 1 − R◦ τTx¯ < τ0x¯ R◦ τx0 < τT0 \x , (2.143) but this time use (2.137) to deduce that 1 − R◦ τTx¯ < τ0x¯ ≥ 1 − R◦ τxx¯ < τ0x¯ x ∈T
= R◦ τ0x¯ < τx¯x¯ − R◦ τxx¯ < τ0x¯
x ∈T \x¯
≥ R◦ τ0x¯ < τx¯x¯ − (M − 1) sup R◦ τxx¯ < τ0x¯ . x ∈T \x¯
(2.144)
408
G. Ben Arous, A. Bovier, V. Gayrard
Therefore R
◦
τx0
0
< τT \x
! R
◦
τ0x¯
<
τx¯x¯
− (M − 1) sup R
◦
x ∈T \x¯
"
τxx¯
<
τ0x¯
≤ R1 ≤ R◦ τx0 < τT0 \x . (2.145) Having already estimated the probability R◦ τx0 < τT0 \x in (2.140), we are left to treat the terms R◦ τ0x¯ < τx¯x¯ and supx ∈T \x¯ R◦ τxx¯ < τ0x¯ . The probabilities R◦ τxx¯ < τ0x¯ entering the latter term are easily dealt with by means of Lemma 2.7: note that for all x ∈√ T \ x and x¯ ∈ T , it follows from (2.128) that δ in Lemma 2.7 may be chosen as δ = d, so that inserting the bound (2.126) in (2.35) yields for all x ∈ T \ x¯ (2.146) R◦ τxx¯ < τ0x¯ ≤ e−Nh (d) , for some constant h (d) > 0 and large enough N . To bound R◦ τ0x¯ < τx¯x¯ we simply use that by Lemma 2.6, in view of (2.125) and (2.126), there exists a constant c2 > 0 such that 1/2 d R◦ τ0x¯ < τx¯x¯ ≥ 1 − ln N . (2.147) 1 + c2 Nd NM Gathering our bounds, we finally obtain ◦ x¯ 1/2 1 d R τ < τ x¯ − ln N ≤ 1 + c3 Nd x T \x M NM
(2.148)
for some constant c3 > 0. Inserting (2.148) in (2.142) concludes the proof of assertion ii). Proof of Proposition 2.1, iii). Again, the proof of this third assertion is very similar to that of assertion i). Using in turn assertion i) of Lemma 2.4 and Lemma 2.5, η¯ η¯ ◦ η¯ ◦ x¯ x¯ P τηη¯ < τT \{η,η} τ < τ = P = R τ (2.149) < τ η x ¯ T \{η,η} ¯ T \{x,x} ¯ . Defining
we have
x¯ x¯ } ∩ {τ < τ } R1 ≡ R◦ {τxx¯ < τTx¯ \{x,x} x 0 ¯ ◦ x¯ x¯ x¯ x¯ R2 ≡ R {τx < τT \{x,x} } ∩ {τ < τ } x 0 ¯
(2.150)
= R1 + R2 . R◦ τxx¯ < τTx¯ \{x,x} ¯
(2.151)
Next, just as in (2.135) we write 0 ≤ R2 ≤ R◦ τxx¯ < τ0x¯
(2.152)
Aging in the REM. Part I
409
while proceeding as in (2.136) and (2.137) to treat the term R1 yields, in analogy with (2.138), ! " 1 − (M − 1) sup R◦ τxx¯ < τ0x¯ ≤ R1 ≤ R◦ τx0 < τT0 \{x,x} R◦ τx0 < τT0 \{x,x} ¯ ¯ . x ∈T \x¯
(2.153) Since the probabilities R◦ τxx¯ < τ0x¯ , x ∈ T \ x, ¯ appearing in (2.152) and (2.153) have already been bounded in (2.146), we are left to estimate R◦ τx0 < τT0 \{x,x} ¯ . To do this we proceed exactly as in the proof of (2.140) (the only difference being that the set J in Lemma 2.8 is here given by J = T \ {x, x} ¯ so that |J | = M − 1) and obtain ◦ 0 d 1/2 1 d R τ < τ 0 − ≤ 1 + c ln N (2.154) 0 x T \{x,x} ¯ N M − 1 N (M − 1) for some constant c0 > 0. Collecting our bounds yields the claim of assertion iii).
Proof of Proposition 2.1, iv). This assertion is nothing but a direct consequence of assertion ii) since η η η (2.155) P τη¯ < τT \η¯ . P τT \η < τηη = η∈T ¯ \η
Thus (2.6) is proven. For later use (see the proof of assertion v)) let us however give a full derivation of the lower bound in (2.6): again, with the same notation as in the proofs of the first two assertions, using in turn assertion ii) of Lemma 2.4 and Lemma 2.5, it follows from (2.155) that √ √ η η P τT \η < τηη = e−β NEη P◦ τT \η < τηη = e−β NEη R◦ τTx \x < τxx . (2.156) Defining R1 ≡ R◦ {τTx \x < τxx } ∩ {τ0x < τTx \x } , R2 ≡ R◦ {τTx \x < τxx } ∩ {τTx \x < τ0x } ,
(2.157)
1 ≥ R◦ τTx \x < τxx = R1 + R2 ≥ R1 ,
(2.158)
R1 = R◦ τ0x < τTx \x < τxx = R◦ τ0x < τTx R◦ τT0 \x < τx0 = 1 − R◦ τTx < τ0x 1 − R◦ τx0 < τT0 \x
(2.159)
we have
and since
410
G. Ben Arous, A. Bovier, V. Gayrard
we obtain, proceeding as in (2.144) to bound 1 − R◦ τTx < τ0x , ! " x x ◦ x ◦ x R1 ≥ R τ0 < τx − (M − 1) sup R τx < τ0 1 − R◦ τx0 < τT0 \x . x ∈T \x
(2.160) Now all the probabilities entering the above expression have already been estimat ◦ τx < τx , ed (see respectively (2.147), (2.146) and (2.140) for the estimates on R x 0 R◦ τxx < τ0x , x ∈ T \ x, and R◦ τx0 < τT0 \x ). Plugging these estimates in (2.160) we obtain 1/2 1 1 − Nd 1 + c4 Nd ln N (2.161) R◦ τTx \x < τxx ≥ 1 − M for some constant c4 > 0. Inserting (2.161) in (2.156) proves the lower bound of (2.6). As is by now routine, the proof of assertion v) of Proposition 2.1 begins as in (2.156): we first invoke Lemma 2.4 to write √ (2.162) P τTσ < τσσ = e−β NEσ P◦ τTσ < τσσ and next use Lemma 2.5 to express the last probability above in terms of a lumped chain: γ ∪σ (σ ) γT ∪σ (σ ) (2.163) P◦ τTσ < τσσ = R◦ τγTT∪σ (T ) < τγT ∪σ (σ ) . Similarly, to prove assertion vi) we begin by writing: γ ∪σ (σ¯ ) γT ∪σ (σ¯ ) P τσσ¯ ≤ τTσ¯ = R◦ τγTT∪σ (σ ) < τγT ∪σ (T ) .
(2.164)
At this point however we see that contrary to the cases encountered so far the mapping γ involved in the last two identities is not constructed from the top alone, but the top augmented by a non-random point σ . To proceed any further we thus need to investigate its properties. Lumped chain induced by the Top and a non-random point. In order to study the mapping γT ∪σ we must go back to its definition (see Sect. 2.1). Most of the results we will establish below rely on the simple observation that the partition PT ∪σ () induced by T ∪ σ may be constructed by first constructing the partition PT () induced by the top alone and next, partitioning each of the elements of PT () according to the sign of σi . More precisely: Lemma 2.14. Set d = 2M+1 , d = 2M . There is a one-to-one correspondence between the elements of the partition PT ∪σ (), k (T ∪ σ ), k ∈ {1, . . . , d }
(2.165)
and the sets sk (T ) = {i ∈ k (T ) | σi = s} , (s, k) ∈ {−1, 1}× ∈ {1, . . . , d}.
(2.166)
Aging in the REM. Part I
411
Proof. Let {e1 , . . . , ek , . . . , ed } and {e1 , . . . , ek , . . . , ed } be arbitrarily chosen labellings of, respectively, all d elements of SM+1 and all d elements of SM . For u = (u1 , . . . , uM+1 ) ∈ RM+1 write u = (u, u), with u = (u1 , . . . , uM ) and u = uM+1 . Then, clearly, {ek }k ∈{1,...,d } = {(ek , s)}(s,k)∈{−1,1}×∈{1,...,d}
(2.167)
ek = (ek , s)
(2.168)
and that the relation
induces a one-to-one correspondance between the indices k ∈ {1, . . . , d } and the pairs (s, k) ∈ {−1, 1}× ∈ {1, . . . , d}. Let now ξ ∈ S(|T |+1)×N and ξ ∈ S(|T |)×N be two matrices satisfying property (2.12) with, respectively, I = T ∪ σ and I = T , and chosen such that: ξµ if µ ∈ {1, . . . , M} µ . (2.169) ξ = σ, if µ = M + 1 It then follows from Definition (2.13) that, whenever (2.168) holds, k (T ∪ σ ) ≡ {i ∈ | ξi = ek } = {i ∈ | ξ i = e k , ξ i = e k } = {i ∈ | ξi = ek , σi = s} = {i ∈ k (T ) | σi = s} = sk (T ). The lemma is therefore proven.
(2.170)
Lemma 2.15. Let K(T ∪ σ ) ≡ K (T ∪ σ ) and δ(N ) be defined as in (2.32), resp. (2.113). Choose = 1 − 2δ(N). Then, for all ω ∈ EN , d ≤ |K(T ∪ σ )| ≤ d 2
(2.171)
and N∗ ≡
N d k∈K(T ∪σ ) N max |k (T ∪ σ )| ≤ 2(1 + δ(N )) . d k∈K(T ∪σ ) min
|k (T ∪ σ )| ≥ (1 − 2δ(N ))
(2.172)
Proof. It obviously follows from Definition (2.166) that − |k (T )| = |+ k (T )| + |k (T )|.
(2.173)
For fixed k ∈ {1, . . . , d}, assume that there exists s ∈ {−1, 1} such that |sk (T )| < dN . It then follows from (2.173) that |−s k (T )| ≥ |k (T )| −
N N N N ≥ (1 − δ(N )) − ≥ , d d 2d d
(2.174)
where the second inequality follows from (2.126) and the fact that ω ∈ EN , while the last line results from our choice of . Thus for each k ∈ {1, . . . , d} there exists at least
412
G. Ben Arous, A. Bovier, V. Gayrard
one index s ∈ {−1, 1} such that |sk (T )| ≥ dN . This together with Lemma 2.14 yields the lower bound of (2.171). The upper bound beeing immediate, (2.171) is proven. Let us turn to (2.172). The first inequality simply follows from the definition of K(T ∪ σ ) and our choice . To prove the second inequality we first use that by (2.173), for each pair (s, k) ∈ {−1, 1}× ∈ {1, . . . , d}, |sk (T )| ≤ |k (T )| ≤ (1 + δ(N ))
N N = 2(1 + δ(N )) , d d
(2.175)
where the second inequality follows from (2.126), and next conclude by means of Lemma 2.14. The lemma is proven. To state the next lemma we need some extra notation. Set κ = |K(T ∪ σ )| and let : Rd → Rκ be the projection that maps x = (xk )k ∈{1,...,d } into π x = (xk )k ∈K(T ∪σ ) . For each k ∈ {1, . . . , d}, let s∗ ∈ {−1, 1} be defined through π
∗ |sk∗ (T )| ≥ |−s k (T )|
(2.176)
and set
D = k ∈ {1, . . . , d } | k (T ∪ σ ) = sk∗ (T ), k ∈ {1, . . . , d} .
(2.177)
Finally, let π ∗ : Rd → Rd be the projection that maps x = (xk )k ∈{1,...,d } into π ∗ x = (xk )k ∈D . Lemma 2.16. For all σ ∈ T c the following holds true: i) For all η ∈ T , π ∗ γT ∪σ (η) = γT (η). For 0 ≤ ε < 21 , define √ Aε (σ ) = η ∈ T π ∗ γT ∪σ (η) − π ∗ γT ∪σ (σ )2 ≤ ε d .
(2.178)
(2.179)
Then, ii) either Aε (σ ) = ∅ or else, |Aε (σ )| = 1. iii) For all η ∈ T \ Aε (σ ),
√ π γT ∪σ (η) − π γT ∪σ (σ )2 ≥ ε d
(2.180)
√ inf π γT (η) − π γT (η)2 ≥ (1 − 2δ(N )) d.
(2.181)
and, for all η ∈ T ,
η∈T \η
Proof. We first prove assertion i). By Lemma 2.14, to each k ∈ {1, . . . , d } there corresponds a unique pair (s, k) ∈ {−1, 1}× ∈ {1, . . . , d} verifying k (T ∪ σ ) = sk (T ). Fix k ∈ {1, . . . , d }. It follows from Definition (2.16) and (2.182) that 1 ηi . γTk ∪σ (η) = s |k (T )| s i∈k (T )
(2.182)
(2.183)
Aging in the REM. Part I
413
Now by (2.166), we have, sk (T ) ⊆ k (T ) for each
s ∈ {−1, 1}.
(2.184)
But assertion iii) of Lemma 2.2 states that ηi = γTk (η),
for all
i ∈ k (T ).
(2.185)
Hence, combining (2.185) and (2.183) we get
γTk ∪σ (η) = γTk (η) for each
s ∈ {−1, 1}.
(2.186)
Since (2.186) holds for each s ∈ {−1, 1} it holds true for s = s∗ . We therefore have proven that for each k ∈ {1, . . . , d } and each k ∈ {1, . . . , d} related through k (T ∪ σ ) = sk∗ (T ), γTk ∪σ (η) = γTk (η). But this, in view of (2.177), implies that π ∗ γT ∪σ (η) = γT (η), concluding the proof of assertion i). We now turn to the proof of assertion ii). Note that by (2.178), Aε (σ ) may be written as √ Aε (σ ) = η ∈ T γT (η) − π ∗ γT ∪σ (σ )2 ≤ ε d . (2.187) Assume that Aε (σ ) = ∅. Then there exists η ∈ T such that γT (η) − π ∗ γT ∪σ (σ )2 ≤ √ ε d. Thus, √ inf γT (η) − π ∗ γT ∪σ (σ )2 ≥ inf γT (η) − γT (η)2 − ε d η∈T \η η∈T \η √ √ ≥ (1 − 2δ(N )) d − ε d, (2.188) where the last inequality follows from Lemma 2.13. Since for all 0 ≤ ε < 21 , 1 − 2δ(N) − ε > ε, provided that N is sufficiently large, we conclude that |Aε (σ )| = 1. The claim of assertion ii) is thus proven and it remains to prove iii). To do so note that proceeding just as in the proof of (2.171), we easily see that D ⊆ K(T ∪ σ ). Hence, for all y, y ∈ Rd , π y − π y2 ≥ π ∗ y − π ∗ y2 .
(2.189)
Now (2.180) is an immediate consequence of (2.189) and the definition of Aε (σ ) while (2.181) results from the combination of (2.189) and (2.128) of Lemma 2.13. Assertion iii) being proven, the proof of the lemma is done. We are now ready to prove the last two assertions of Proposition 2.1. The following notation will be used throughout: I ≡ T ∪ σ , I ≡ γI (I ), y ≡ γI (σ ), and y¯ ≡ γI (σ¯ ). It will moreover be assumed that ω ∈ EN . Proof of Proposition 2.1, v and vi). With the notation introduced above, (2.163) and (2.164) read, respectively, y y P◦ τTσ < τσσ = R◦ τI \y < τy (2.190) and
y¯ y¯ P τσσ¯ ≤ τTσ¯ = R◦ τy < τI \y .
(2.191)
414
G. Ben Arous, A. Bovier, V. Gayrard
We may now distinguish two cases since, according to assertion ii) of Lemma 2.16, either σ is such that, case 1), Aε (σ ) = ∅, or else case 2), Aε (σ ) = {η} for some η ∈ T . In case 1), a simple adaptation of the proof of the lower bound (2.161) of assertion iii) yields y y 1 R◦ τI \y < τy ≥ 1 − M+1 (2.192) 1 − Nd (1 + c5 (1 + δ(N ))) for some constant c5 > 0. Similarly, retracing the proof of the upper bound of assertion i), we readily obtain that ◦ y¯ y¯ 1 d (2.193) (1 + c6 (1 + δ(N ))) R τy < τI \y − M+1 ≤ NM for some constant c6 > 0. Case 2) will also be brought back to well known situations once observed that, setting x ≡ γI (η), y y y y R◦ τI \y < τy ≥ R◦ τI \{y,x} < τy (2.194) while
y¯ y¯ y¯ y¯ R◦ τy < τI \y ≤ R◦ τy < τI \{y,x} .
Then, proceeding as in the proof of (2.192) we obtain that y y 1 R◦ τI \{y,x} < τy ≥ 1 − M 1 − Nd (1 + c7 (1 + δ(N ))) for some constant c7 > 0, while going back over the proof of (2.193) yields ◦ y¯ 1 d R τy < τ y¯ I \{y,x} − M ≤ NM (1 + c8 (1 + δ(N )))
(2.195)
(2.196)
(2.197)
for some constant c8 > 0. The lower bound in (2.7) then follows from (2.162) together with (2.190), (2.192), (2.194), and (2.196); the coresponding upper bound being immediate, assertion v) is proven. Finally, collecting (2.191), (2.193), (2.195), and (2.197) proves (2.8) of assertion vi). This completes the proof of Proposition 2.1. 3. Expected Times In this section we prove Theorems 1.4 and 1.5. Let ET (E) ( . ) and VT (E) ( . ) denote the expectation and the variance with respect to the conditional distribution P ( . | T (E)(ω) = T (E)). Define √ Zβ,N (T c (E)) ≡ eβ NEσ , (3.1) σ ∈T (E)c
VN,E ≡ Vβ,N,T (E) ≡ Recall that Wβ,N,T (E)
Zβ,N (T c (E)) − ET (E) (Zβ,N (T c (E))) . VT (E) (Zβ,N (T c (E))) √
e(α−1)E+β NuN (0) ≡ M(α − 1)
where, as in Sect. 2, M = |T (E)|.
(3.2)
α−1 1 + Vβ,N,T (E) eE/2 √ , 2α − 1
(3.3)
Aging in the REM. Part I
415
Remark. A remark is in order concerning the random variables defined in (3.1) to (3.3). The behavior of Zβ,N (T c (E)) will be studied in Lemma 3.3. It will in particular be established that Zβ,N (T c (E)) = MWβ,N,T (E) (1 + O(1/N )) (see (3.27)). This of course implies that Wβ,N,T (E) is a positive random variable. Note also that by definition Vβ,N,T (E) has mean zero and variance one, and that all its moments are finite. Notation. From now on we will systematically write T for T (E) and drop the indices β, N , and T (E) in all the symbols appearing in (3.1), (3.2) and (3.3). The cornerstone of the proof of Proposition 1.4 is a classical identity from potential theory (see e.g. [So] or Corollary (3.3) of [BEGK2]) that expresses the expectation of conditioned transition times in terms of the invariant measure and transition probabilities. Namely, it states that for all subsets I, J ⊆ SN , and all σ ∈ SN such that σ ∈ / I ∪ J, E
σ 1 τI | τIσ ≤ τJσ = µβ,N (σ )P(τIσ∪J < τσσ ) σ ≤ τσ ) P(τ I J µβ,N (σ )P(τσσ < τIσ∪J ) × µβ,N (σ ) + . P(τIσ ≤ τJσ ) c
(3.4)
σ ∈(I ∪J ∪σ )
Equation (3.4) generalizes the following expression for the expected value of unconditioned transition times: for all subset I ⊆ SN and all σ ∈ SN such that σ ∈ / I, 1 µβ,N (σ ) + E(τIσ ) = µβ,N (σ )P(τIσ < τσσ )
σ ∈(I ∪σ )c
σ
σ
µβ,N (σ )P(τσ < τI ) . (3.5)
Therefore, by definition of µβ,N , (3.4) reads E(τIσ | τIσ ≤ τJσ ) =
1
√ eβNEσ P(τIσ∪J √ β NEσ
+
× e
< τσσ )
e
√ β NEσ
σ ∈(I ∪J ∪σ )c
P(τσσ
<
P(τIσ σ τI ∪J ) P(τIσ
≤ τJσ ) ≤ τJσ ) (3.6)
and similarly, E(τIσ ) =
1
√ eβ NEσ P(τIσ
<
τσσ )
e
√ β NEσ
+
e
σ ∈(I ∪σ )c η
η
η
√ β NEσ
P(τσσ < τIσ ) . (3.7) η
Applying (3.6) and (3.7) to the quantities E(τη¯ | τη¯ ≤ τT \η ), E(τT \η ) and E(τTσ ), and inserting the probability estimates of Proposition 2.1 in the resulting expressions, the proof of Proposition 1.4 essentially reduces to studying the behavior of the random variable √
Z(T c ) = σ ∈T c eβ NEσ . We start by proving the first assertion of the proposition.
416
G. Ben Arous, A. Bovier, V. Gayrard
Proof of assertion i) of Theorem 1.4. We will assume throughout that the assumptions of Proposition 2.1 are satisfied. It follows from (3.7) that, for all η ∈ T , " ! √ √ 1 η β NEη β NEσ σ σ + e P(τη < τT \η ) . (3.8) E(τT \η ) = √ e η η eβ NEη P(τT \η < τη ) σ ∈T c The factor in front of the square brackets was estimated in Proposition 2.1, iv). Plugging in this estimate yields ! " √ √ 1 η β NEη β NEσ σ σ E(τT \η ) = e + e P(τη < τT \η ) (1 + O(1/N )) (3.9) 1 1− M σ ∈T c and we are left to study the term √ I≡ eβ NEσ P◦ (τησ < τTσ\η ).
(3.10)
σ ∈T c
To do so, we proceed as follows: for ε > 0 a constant, let B√εN (η) and Wε (T ) be defined as in (2.1) and (2.2) and set (3.11) T c ∩ B√εN (η) . Vε (T ) ≡ η∈T
Observing that T c = Vε (T ) ∪ Wε (T )
(3.12)
I = I1 + I2
(3.13)
I may be decomposed as
with I1 ≡
eβ
√ NEσ
σ ∈Vε (T )
I2 ≡
eβ
P◦ (τησ < τTσ\η ),
√ NEσ
σ ∈Wε (T )
P◦ (τησ < τTσ\η ).
(3.14)
Now obviously,
0 ≤ I1 ≤
eβ
√ NEσ
(3.15)
σ ∈Vε (T )
while, by Proposition 2.1, i), for all ω ∈ E and large enough N , I2 obeys the bound √ d β NEσ I2 − 1 e (3.16) ≤ N M (1 + cδ(N )). M σ ∈Wε (T )
Therefore, setting Z(Vε (T )) ≡
σ ∈Vε (T )
eβ
√ NEσ
,
Aging in the REM. Part I
417
Z(T c ) ≡
eβ
√ NEσ
(3.17)
,
σ ∈T c
and combining (3.15) and (3.16) together with (3.13), we arrive at I − 1 Z(T c ) 1 + (M − 1) Z(Vε (T )) ≤ d (1 + cδ(N )), c M Z(T ) N M
(3.18)
and it remains to study the behavior of the random variables Z(Vε (T )) and Z(T c ). As this depends on the cardinality of Vε (T ), we will first establish that: Lemma 3.1. Assume that 0 < ε < 1/2 and set J (x) = (1 − x) ln
1 1−x
+ x ln x1 , 0 < x < 1.
(3.19)
Then, for all ω ∈ E and large enough N , there exist constants, 0 < c− ≤ c+ < ∞, such that c− MN −1/2 eNJ (ε/4) − M ≤ |Vε (T )| ≤ c+ MN 1/2 eNJ (ε/4) .
(3.20)
Proof. Under the assumptions of Lemma 2.12, |Vε (T )| = |T c ∩ B√εN (η)| = |B√εN (η) \ η| = |B√εN (η)| − M. (3.21) η∈T
η∈T
η∈T
Now, for all η ∈ T ,
N N εN ≤ |B√εN (η)| ≤ , εN/4 4 εN/4
(3.22)
where we used that Nk is an increasing function of k for 0 ≤ k ≤ εN/4. By Stirling’s formula, for large enough N , there exist constants, 0 < a − ≤ a + < ∞ such that N a+ a− NJ (ε/4) ≤ ≤√ (3.23) e eNJ (ε/4) . √ εN/4 π ε(1 − ε/4) π ε(1 − ε/4) Inserting (3.23) in (3.22) and using that, by assumption, 0 < ε < 1/2 we obtain c− N −1/2 eNJ (ε/4) ≤ |B√εN (η)| ≤ c+ N 1/2 eNJ (ε/4)
(3.24)
for some constants, 0 < c− ≤ c+ < ∞. Inserted in (3.21), (3.24) yields (3.20), concluding the proof of Lemma 3.1. We are now ready to prove the following two lemmata. Lemma 3.2. Let Z(Vε (T )) be as in (3.17). Under the assumptions and with the notation of Lemma 3.1), the following holds: there exists a constant 0 < c < ∞ such that, for all 0 < ε < 1/2 , and large enough N , √ c P Z(Vε (T )) ≥ |Vε (T )|e2β N ln |Vε (T )| T (ω) = T ≤ √ e−NJ (ε/4) . (3.25) J (ε/4)
418
G. Ben Arous, A. Bovier, V. Gayrard
Lemma 3.3. Let Z(T c ) and W be as in (3.1) and (3.3). Then, for all N large enough, √ N (ln 2)/4 P Z(T c ) ≤ eβN ln 2 T (ω) = T ≤ e−e (3.26) and Z(T c ) = MW(1 + O(1/N )). Proof of Lemma 3.2. For δ > 0, set a =
(3.27)
√ eβ 2(δ+1)N ln |Vε (T )| ,
P (Z(V ε (T )) ≥ a|Vε (T )| | T (ω) = T ) √ ≤ P |Vε (T )| max eβ NEσ ≥ a|Vε (T )| T (ω) = T (T ) σ ∈V √ε β NEσ ≤ |Vε (T )|P e ≥ a T (ω) = T √ = |Vε (T )|P eβ NEσ ≥ a Eσ < uN (E) ,
(3.28)
where the second inequality holds true for all σ ∈ Vε (T ) (thereby implying the last equality). In explicit form, the probability appearing in the last line of (3.28) reads √ P √2(δ + 1) ln |Vε (T )| ≤ Eσ < uN (E) β NEσ P e ≥ a Eσ < uN (E) = . P (Eσ < uN (E)) (3.29) By a standard upper tail estimate for Gaussian random variables, ' P 2(δ + 1) ln |Vε (T )| ≤ Eσ < uN (E) ' ≤ P Eσ ≥ 2(δ + 1) ln |Vε (T )| 1 ≤ √ |Vε (T )|δ+1 4π(δ + 1) ln |Vε (T )|
(3.30)
while P (Eσ < uN (E)) = 1 − 2−N e−E .
(3.31)
Inserting (3.30) and (3.31) in (3.29) and combining with (3.28) yields P (Z(Vε (T )) ≥ a|Vε (T )| | T (ω) = T ) 1 ≤ . √ δ −N −E |Vε (T )| (1 − 2 e ) 4π(δ + 1) ln |Vε (T )|
(3.32)
Choosing δ = 1, (3.32) together with the lower bound on |Vε (T )| of Lemma 3.1 gives (3.25). This proves the lemma. Proof of Lemma 3.3. We first prove (3.27). Recall from Theorem 1.4 that ET ( . ) and VT ( . ) denote the expectation and the variance with respect to the conditional distribution P ( . | T (ω) = T ) and set Xβσ = eβ
√ NEσ
I{Eσ
(3.33)
Aging in the REM. Part I
419
Observing that ET (Z(T c )) = |T c |ET (Xβσ ), σ ) − ET2 (Xβσ )], (3.34) VT2 (Z(T c )) = ET (Z(T c )2 ) − ET2 (Z(T c ))2 = |T c |[ET (X2β the computation of the mean and variance of Z(T c ) reduces to that of ET (Xβσ ). Now
ET (Xβσ ) =
1 √ 2π
uN (E)
e−
x2 2 +β
√ Nx
dx
0
P (Eσ < uN (E))
.
Decomposing the integral above as uN (E) √ x2 1 e− 2 +β Nx dx = r1,N − r0,N √ 2π 0
(3.35)
(3.36)
with √ 2 eβ N/2 uN (E)−β N − y 2 = √ e 2 dy 2π −∞ √ 2 eβ N/2 −β N − y 2 = √ e 2 dy 2π −∞
r1,N r0,N
we have, by standard tail estimates for the Gaussian, √ β N 1 ≤ r0,N ≤ ' √ 2 2π(1 + β N ) 2πβ 2 N √ while, for β > 2 ln 2,
(3.37)
(3.38)
√
r1,N
e(α−1)E+β NuN (0) = (1 + O(1/N )). 2N (α − 1)
(3.39)
Inserting these bounds (3.36) and making use of (3.31), we get √
ET (Xβσ )
e(α−1)E+β NuN (0) (1 + O(1/N )). = 2N (α − 1)
(3.40)
Remembering that |T c | = 2N − M, it follows from (3.34) and (3.40) that √
e(α−1)E+β NuN (0) (1 + O(1/N )) ET (Z(T )) = (α − 1) c
(3.41)
and √
VT2 (Z(T c ))
e(2α−1)E+2β NuN (0) (1 + O(1/N )). = (2α − 1)
(3.42)
Hence, recalling from (3.2) that V = [Z(T c ) − ET (Z(T c ))]/[VT (Z(T c ))], we have VT (Z(T c )) c c Z(T ) = ET (Z(T )) 1 + V ET (Z(T c ))
420
G. Ben Arous, A. Bovier, V. Gayrard √ e(α−1)E+β NuN (0) E/2 α − 1 = 1 + Ve (1 + O(1/N )) √ (α − 1) 2α − 1 = MW(1 + O(1/N )),
(3.43)
where the last line follows from the definition of W. Thus (3.27) is proven. To prove (3.26) we use the rather abrupt bound √ √ P Z(T c ) ≤ eβN ln 2 T (ω) = T ≤ P ∀σ ∈T c Eσ < N ln 2 T (ω) = T √ ≤ P Eσ < N ln 2 Eσ < uN (E) . σ ∈T c
√ Since N ln 2 < uN (E) for fixed E and large enough N , we have √N ln 2 − x 2 2 √ e 2 dx √ 2π 0 P Eσ < 2N ln 2/2 Eσ < uN (E) = , P (Eσ < uN (E))
(3.44)
(3.45)
and it follows from (3.31) together with the bound √N ln 2 ∞ x2 x2 2 2 1 e− 2 dx = 1 − √ e− 2 dx ≥ 1 − √ e−N(ln 2)/2 (3.46) √ √ 2π 0 2π 4 N N ln 2 that
|T c | 1 − √1 e−N(ln 2)/2 √ 4 N P ∀σ ∈T c Eσ < N ln 2 T (ω) = T ≤ 1 − 2−N e−E −N N 1 −N(ln 2)/2 2 (1−2 M) ≤ 1− √ e 8 N √ ≤ exp −eN(ln 2)/2 (1 − 2−N M)/8 N (3.47) ≤ exp −eN(ln 2)/4 .
This proves (3.26) and concludes the proof of Lemma 3.3.
√ We are now ready to complete the proof of assertion i) of Theorem 1.4. For β ≥ 2 ln 2 and fixed 0 < ε < 1/2 (to be chosen appropriately later), let EN ≡ EN (β, ε) be defined as √ √ EN ≡ E ∩ ω ∈ | Z(Vε (T )) ≤ |Vε (T )|e2β N ln |Vε (T )| , Z(T c ) ≥ eβN ln 2 , (3.48) where E is taken from Proposition 2.1. Now set E ≡ EN .
(3.49)
N0 N>N0
Obviously, by (3.25), (3.26) and Lemma 2.11, P E = 1.
(3.50)
Aging in the REM. Part I
421
Assume from now on that ω ∈ EN (and that N is large enough). By Lemma 3.1 and Lemma 3.2, Z(Vε (T )) ≤ ecβN
√
ε
(3.51)
for some numerical constant 0 < c < ∞. Thus, by 3.26 of Lemma 3.3, 0<
Z(Vε (T )) < e−c(ε)βN , Z(T c )
(3.52)
√ where c(ε) = ln 2 − c ε. Let ε0 be defined through c(ε0 ) = 0. Then, choosing 0 < ε < (1/2 ∧ ε0 ), 1 Z(Vε (T )) 1 c c −c(ε)βN ) 1 + O e (3.53) Z(T ) 1 + (M − 1) = Z(T M Z(T c ) M which, combined with (3.27) of Lemma 3.3 yields, 1 Z(Vε (T )) c Z(T ) 1 + (M − 1) = W(1 + O(1/N )). M Z(T c )
(3.54)
This inserted in turn in (3.18) gives, I = W(1 + O(1/N )).
(3.55)
Combining (3.55) with (3.10) and (3.9) concludes the proof of assertion i) of Theorem 1.4. Proof of assertion ii) of Theorem 1.4. It follows from (3.7) that, for all σ ∈ / T, √ √ 1 eβ NEσ + E(τTσ ) = √ eβ NEσ P(τσσ < τTσ ) . (3.56) σ β NE σ σ e P(τT < τσ ) σ ∈T c \σ Assuming again that the assumptions of Proposition 2.1 are satisfied, it follows from Proposition 2.1, v), that √ 1 β √NEσ eβ NEσ + I ≤ E(τTσ ) ≤ + I e (1 + O(1/N )), (3.57) 1 1− M where I ≡
eβ
σ ∈T c \σ
√ NEσ
P(τσσ < τTσ ).
(3.58)
Then, decomposing I as I = I1 + I2
(3.59)
with (for ε > 0 a constant and with Vε (T ) and Wε (T ) defined as in (3.11) and (3.12)) √ eβ NEσ P(τσσ < τTσ ), I1 ≡ σ ∈Vε (T ∪σ )
422
G. Ben Arous, A. Bovier, V. Gayrard
I2 ≡
eβ
√ NEσ
σ ∈Wε (T ∪σ )
P(τσσ < τTσ ).
(3.60)
But clearly,
0 ≤ I1 ≤
eβ
√ NEσ
(3.61)
,
σ ∈Vε (T ∪σ )
while by Lemma 2.1, vi), I2 obeys the bounds I2
1 ≥ M +1 ≤ I2 ≤
e
√ β NEσ
σ ∈Wε (T ∪σ )
1 M
eβ
√ NEσ
σ ∈Wε (T ∪σ )
d 1 + (1 − cδ(N )) N d 1 + (1 + cδ(N )) . N
(3.62)
Therefore, recalling the definition of Z(Vε (T )) and Z(T c ) from (3.17), Z((T ∪ σ )c ) Z(Vε (T ∪ σ )) d 1 − (M − 1) 1 + (1 + cδ(N )) I ≥ M +1 Z((T ∪ σ )c ) NM c ) (T ∪ σ )) Z((T ∪ σ ) Z(V d ε I ≤ 1 + (M − 1) 1+ (1 + cδ(N )) . (3.63) M Z((T ∪ σ )c ) NM In other words, comparing (3.63) with (3.18), and noting that the difference between Z(T c ) and Z((T ∪ σ )c is even in the worst case not larger than eE (α − 1)Z(T c ) , I obeys virtually the same upper bound as does I in the proof of assertion i). From here on the proof of assertion ii) follows step by step that of the first assertion. Proof of assertion iii) of Theorem 1.4. As is the proof of the first two assertions we will assume that the assumptions of Proposition 2.1 are satisfied. By (3.6) we have, for all η, η¯ ∈ T , η = η, ¯ η
η
η
E(τη¯ | τη¯ ≤ τT \η ) =
1
√ η η eβ NEη P(τT \η < τη ) ! √ √ β NEη β NEσ
× e
+
e
σ ∈T c
P(τησ
<
τTσ\η )
P(τη¯σ ≤ τTσ\η )
"
. η η P(τη¯ ≤ τT \η ) (3.64)
Recalling the definition of I from (3.10) and comparing Eqs. (3.64) and (3.8), we see that η η their right hand sides are identical up to the extra factor P(τη¯σ ≤ τTσ\η )/P(τη¯ ≤ τT \η ) that multiplies each of the terms of the sum in I . Mimicking the proof of assertion i), set I ≡ I1 + I2 ,
(3.65)
where, for ε > 0 a constant and with Vε (T ) and Wε (T ) defined as in (3.11) and (3.12), I1 ≡
eβ
σ ∈Vε (T )
I2
≡
σ ∈Wε (T )
e
√ NEσ
P◦ (τησ < τTσ\η )
√ β NEσ
P
◦
(τησ
<
P(τη¯σ ≤ τTσ\η ) η
η
P(τη¯ ≤ τT \η )
τTσ\η )
,
P(τη¯σ ≤ τTσ\η ) η
η
P(τη¯ ≤ τT \η )
.
(3.66)
Aging in the REM. Part I
423
It follows from Proposition 2.1, iii) that, for σ ∈ Vε (T ), 0≤
P(τη¯σ ≤ τTσ\η ) η
η
P(τη¯ ≤ τT \η )
≤ (M − 1)(1 + O(1/N )),
(3.67)
where we trivially used that 0 ≤ P(τη¯σ ≤ τTσ\η ) ≤ 1 in the numerator. In the case where σ σ ∈ Wε (T ), observing that P(τη¯σ ≤ τTσ\η ) = P(τη¯σ < τ(T \η)\η¯ ), and making use of assertion i) of Proposition 2.1 with T replaced by T \ η (hence, since M = |T |, with M replaced by M − 1) we get, P(τ σ ≤ τ σ ) η¯ T \η − 1 (3.68) ≤ O(1/N ). P(τη¯η ≤ τTη \η ) Thus, with I = I1 + I2 , I1 and I2 being defined as in (3.14), we get |I − I | ≤ ≤ = ≤
|I1 − I1 | + |I2 − I2 | M(1 + O(1/N ))I1 + O(1/N )I2 [M(I1 /I2 )(1 + O(1/N )) + O(1/N )] I2 [M(I1 /I2 )(1 + O(1/N )) + O(1/N )] I,
(3.69)
Z(T c )
defined as in (3.17) we have, in view of (3.15) and where for Z(Vε (T )) and (3.16), MZ(Vε (T )) I1 ≤ c I2 Z(T ) − Z(Vε (T )) − Nd (1 + cδ(N )) M[Z(Vε (T ))/Z(Vε (T ))] ≤ (3.70) . 1 − [Z(Vε (T ))/Z(Vε (T ))] 1 + Nd (1 + cδ(N )) With EN defined as in (3.48), choosing ε as in the line following (3.52), and inserting the bound (3.52) in (3.69) we get, for large enough N , I1 ≤ 2e−c(ε)βN , I2 From this and (3.55), (3.69) yields
on EN .
(3.71)
|I − I | ≤ WO(1/N ).
(3.72)
Finally, combining (3.64) and (3.8), and using the previous bound, √ η η η η η E(τη¯ | τη¯ ≤ τT \η ) − E(τT \η ) = eβ NEη P(τT \η < τηη )|I − I | ≤
1 1−
1 M
WO(1/N ), (3.73)
|I
where the pre-factor of − I | was estimated by means of Proposition 2.1, iv). This completes the proof of the last assertion of Theorem 1.4. Proof of Theorem 1.5. The proof of this theorem is very similar to that of assertion (i) of Theorem 1.4. The only difference is that this time, the partial partition function Z(V (E)) √ 2 is negligible compared to Zβ,N . Finally, for β < 2 ln 2, Zβ,N = eNβ /2 (1+O(N −1/4 )) with probability tending to one faster than any polynomial, as follows from easy estimates (see [BKL or Bo]), and this proves the theorem. Acknowledgements. We thank the Weierstrass Institute and the Mathematics Department of the EPFL for financial support and mutual hospitality.
424
G. Ben Arous, A. Bovier, V. Gayrard
A. Appendix We state and prove a simple lemma that is used in Sect. 2. ◦ Lemma 4.1. Let k ⊂ N , 1 ≤ k ≤ K, be a collection of subgraphs of N and let R k denote the law of the Markov chain with transition rates r ◦ (x , x ), if x = x , and (x , x ) ∈ E(k ) ◦ rk (x , x ) = N . (4.1) 0, otherwise
Assume that E(k ) ∩ E(k ) = ∅, ∀k, k ∈ {1, . . . , K}, k = k
(4.2)
and that y, x ∈
K
(4.3)
V (k ).
k=1
Then K y y y y ◦ τx < τy . R R◦ τx < τy ≥ k
(4.4)
k=1
Proof. This lemma is a straightforward generalisation of Lemma 2.1 of [BEGK1]. Let y Hx denote the space of functions y
Hx ≡ {h : N → [0, 1] : h(y) = 0, h(x) = 1} and define the Dirichlet forms N (h) ≡ x ,x ∈ N
k (h) ≡
x ,x ∈k
QN (x )rN◦ (x , x )[h(x ) − h(x )]2 , ◦ (x )r ◦ (x , x )[h(x ) − h(x )]2 , Q k k
◦ (y) = QN (y)/QN (k ). Then N (h) ≥ where Q k that inf y N (h) ≥ inf y
h∈Hx
h∈Hx
K k=1
(4.5)
QN (k )k (h) ≥
K
K k=1
(4.6)
k=1 QN (k )k (h),
implying
QN (k ) inf y k (h)
(4.7)
h∈Hx
from which the lemma follows by an application of Theorem 2.2 of [BEGK1].
References [Be] [B] [Bo]
Ben Arous, G.: Aging and spin glasses. In: Proceedings of the International Congress of Mathematicians 2002, Beijing, China, Li, Ta, Tsien et al. (eds.). Vol. 3, China: Higher Education Press, 2002, pp. 3–14 Bouchaud, J.P.: Weak ergodicity breaking and aging in disordered systems. J. Phys. I (France) 2, 1705 (1992). Bovier, A.: Statistical mechanics of disordered systems. MaPhySto Lecture Notes 10, Aarhus, 2001
Aging in the REM. Part I [BBG]
425
Ben Arous, G., Bovier, A., Gayrard, V.: Glauber dynamics of the random energy model. II. Aging below the critical temperature, Commun. Math. Phys. to appear [BCKM] Bouchaud, J.P., Cugliandolo, L., Kurchan, J., M´ezard, M.: Out-of-equilibrium dynamics in spi-glasses and other glassy systems. In: Spin-Glasses and Random Fields, A.P. Young (ed). Singapore: World Scientific, 1998 [BD] Bouchaud, J.P., Dean, D.: Aging on Parisi’s tree. J. Phys. I (France) 5, 265 (1995) [BDG] Ben Arous, G., Dembo, A., Guionnet, A.: Aging of spherical spin glasses. Probab. Theory Relat. Fields 120, 1–67 (2001) [BEGK1] Bovier, A., Eckhoff, M., V. Gayrard, M., Klein, M.: Metastability in stochastic dynamics of disordered mean field models. Probab. Theory Relat. Fields 119, 99–161 (2001) [BEGK2] Bovier, A., Eckhoff, M., Gayrard, V., Klein, M.: Metastability and low lying spectra in reversible Markov chains. Commun. Math. Phys. 228, 219–255 (2002) [BKL] Bovier, A., Kurkova, I., L¨owe, M.: Fluctuations of the free energy in the REM and the p-spin SK models. Ann. Probab. 30, 605–651 (2002) [BM] Bouchaud, J.P., Monthus, C.: Models of traps and glass phenomenology. J. Phys. A-Math. Gen. 29(14), 3847–3869 (1996) [BMR] Bouchaud, J.P., Maass, P., Rinn, B.: Multiple scaling regimes in simple aging models. Phys. Rev. Letts. 84(23), 5403–5406 (2000) [BR] Burke, C.J., Rosenblatt, M.: A Markovian function of a Markov Chain. Ann. Math. Statist. 29, 1112–1122 (1958) [CD] Cugliandolo, L., Dean, D.: Full dynamical solution for a spherical spin-glass model. J. Phys. A - Math. Gen. 28(15), 4213–4234 (1995) [D1] Derrida, B., Random energy model: Limit of a family of disordered models. Phys. Rev. Letts. 45, 79–82 (1980) [D2] Derrida, B.: Random energy model: An exactly solvable model of disordered systems. Phys. Rev. B 24, 2613–2626 (1981) [Ei] Eisele, Th.: On a third-order phase transition. Commun. Math. Phys. 90, 125–159 (1983) [FIKP] Fontes, L.R.G., Isopi, M., Kohayakawa, Y., Picco, P.: The spectral gap of the REM under Metropolis dynamics. Ann. Appl. Probab. 8, 917–943 (2001) [G] Gayrard, V.: Thermodynamic limit of the q-state Potts-Hopfield model with infinitely many patterns. J. Stat. Phys. 68, 977–1011 (1992) [GMP] Galvez, A., Martinez, S., Picco, P.: Fluctuations in Derrida’s random energy and generalized random energy models. J. Stat. Phys. 54, 515–529 (1989) [KT] Karlin, S., Taylor, H.M.: A first course in stochastic processes. Boston: Academic Press, 1975 [Li] Liggett, T.M.: Interacting particle systems. Berlin: Springer, 1985 [LLR] Leadbetter, M.R., Lindgren, G., Rootz´en, H.: Extremes and Related Properties of Random Sequences and Processes. Berlin-Heidelberg-New York: Springer, 1983 [M] Mathieu, P.: Convergence to equilibrium for spin glasses. Commun. Math. Phys. 215, 57–68 (2000) [MP] Mathieu, P., Picco, P.: Convergence to equilibrium for finite Markov processes, with application to the Random Energy Model. Preprint CPT-2000/P.3982 2000 [OP] Olivieri, E., Picco, P.: On the existence of thermodynamics for the random energy model. Commun. Math. Phys. 96, 125–144 (1984) [Ru] Ruelle, D.: A mathematical reformulation of Derrida’s REM and GREM. Commun. Math. Phys. 108, 225–239 (1987) [So] Soardi, P.M.: Potential theory on infinite networks. LNM 1590, Berlin-Heidelberg-New York: Springer, 1994 [Sp] Spitzer, F.: Principles of random walks. Second edition. Graduate Texts in Mathematics, Vol. 34. New York-Heidelberg: Springer-Verlag, 1976 Communicated by M. Aizenman
Commun. Math. Phys. 235, 427–466 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0811-y
Communications in
Mathematical Physics
Singularly Perturbed Elliptic Equations with Symmetry: Existence of Solutions Concentrating on Spheres, Part I Antonio Ambrosetti1 , Andrea Malchiodi2 , Wei-Ming Ni3 1 2
S.I.S.S.A., 2-4 via Beirut, 34014 Trieste, Italy. E-mail:
[email protected] School of Mathematics, Institute for Advanced Study, 1 Einstein Drive, Princeton, NJ 08540 USA. E-mail:
[email protected] 3 School of Mathematics, University of Minnesota, Minneapolis, MN 55455, USA. E-mail:
[email protected]
Received: 20 September 2002 / Accepted: 28 October 2002 Published online: 28 February 2003 – © Springer-Verlag 2003
Abstract: We deal with the existence of positive radial solutions concentrating on spheres to a class of singularly perturbed elliptic problems like −ε 2 u + V (|x|)u = up , u ∈ H 1 (Rn ). Under suitable assumptions on the auxiliary potential M(r) = r n−1 V θ (r), θ (p + 1)/(p − 1) − 1/2, we provide necessary and sufficient conditions for concentration as well as the bifurcation of non-radial solutions. Contents 1. 2. 3. 4.
5. 6. 7.
8.
Introduction and Statement of the Main Results . . . . . Preliminaries . . . . . . . . . . . . . . . . . . . . . . . Some Preliminary Estimates . . . . . . . . . . . . . . . The Contraction Argument on Cε . . . . . . . . . . . . . 4.1 Study of the linearized operator and abstract setting 4.2 Step 3: Decay estimate . . . . . . . . . . . . . . . 4.3 Proof of Proposition 4.3 completed . . . . . . . . . Proof of Theorem 1.1 . . . . . . . . . . . . . . . . . . . Necessary Conditions for Concentration . . . . . . . . . Bifurcation of Non-radial Solutions . . . . . . . . . . . . 7.1 About the Morse index . . . . . . . . . . . . . . . 7.2 Some formal expansions . . . . . . . . . . . . . . 7.3 Proof of Theorem 1.6 . . . . . . . . . . . . . . . . Open Problems and Perspectives . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
427 432 436 439 440 443 447 449 451 453 454 455 462 463
1. Introduction and Statement of the Main Results Singularly perturbed elliptic equations arise in many applications. For example, in plasma physics or in condensed matter physics one simulates the interaction effect
428
A. Ambrosetti, A. Malchiodi, W.-M. Ni
among many particles introducing a nonlinear term to obtain a nonlinear Schr¨odinger equation like iε
∂ψ = −ε2 x ψ + Q(x)ψ − |ψ|p−1 ψ, ∂t
(NLS)
where i is the imaginary unit and p > 1. Making the Ansatz ψ(t, x) = exp(iλε−1 t)u(x), one finds that the standing wave u is a positive solution of −ε 2 u + (λ + Q(x))u = up ,
u ∈ H 1 (Rn ),
(1)
where the condition that u ∈ H 1 (Rn ) is made in order to obtain a bound state. In addition, the behavior of u as ε → 0 has great physical interest since it can be used to describe the transition from quantum to classical mechanics. Also problems like −ε2 u + u = up ,
p > 1,
(2)
on a bounded domain with Neumann boundary conditions, arise in applications: they model biological pattern formations, see e.g. [29]. For example, a straightforward scaling argument shows that a positive solution of Eq. (2) with Neumann boundary conditions can be used to construct a positive steady-state solution of the following system: p in × (0, +∞), − U + Uξ q Ut = d1 U 1 Ur (S) ξt = || −ξ + ξ s in (0, +∞), ∂U on ∂ × (0, +∞), ∂ν = 0 where ν denotes the unit outer normal to , and p, q, r, s are non-negative and satisfy 0<
p−1 r < . q s+1
Problem (S) is, in turn, the shadow-system of the well-known activator-inhibitor system proposed by Gierer and Meinhardt following Turing’s idea of diffusion-driven instability, in modelling the regeneration phenomenon of hydra in morphogenesis ([19, 37]). More precisely, (S) is obtained by letting d2 → +∞ in the following system: Up Ut = d1 U − U + Vrq in × (0, +∞), (GM) in × (0, +∞), Vt = d2 V − V + U Vs ∂U = ∂V = 0 on ∂ × (0, +∞). ∂ν ∂ν Spike-layer solutions (or, simply, spikes), i.e. solutions concentrating on isolated points, of (2) (with Neumann boundary conditions) have been studied quite thoroughly in recent years following the work of [30, 31]. However, solutions concentrating on higher-dimensional sets have not been investigated until the recent work [27], in which solutions concentrating on the boundary of two-dimensional domains are established. Single-peak spike-layer steady states have been proved to be stable under suitable conditions for (S) and then for (GM) when d2 is very large. (See [32] for details.) It is expected that spike-layer steady states with multiple peaks, and perhaps even steady
Singularly Perturbed Elliptic Problems, Concentration on Spheres
429
states with higher dimensional concentration sets, will become stable under suitable hypotheses as d2 decreases. This stability question seems, however, very difficult and lies outside the scope of this paper. Positive solutions of (1) are also known to exhibit concentration phenomena: for example, single-peak spikes concentrate at the stable stationary points of the potential Q (see e.g. [3, 6, 15–17, 25]). It is natural to ask if, extending [27], higher dimensional spike-layers exist for the stationary Schr¨odinger equation (1) as well as for (2) with Neumann or Dirichlet boundary conditions. In this generality the problem seems to be a rather complicated task, and our goal is to prove this kind of concentration in the radial case, where the symmetry reduces the technical difficulties and allows us to handle the problem using a finite dimensional reduction. In this paper we focus on (1) and, letting λ + Q = V , we consider the following class of singularly perturbed elliptic problems on Rn with radial potentials −ε2 u + V (|x|)u = up , x ∈ Rn , (3) u ∈ H 1 (Rn ) u > 0, where |x| denotes the Euclidean norm of x ∈ Rn , V : Rn → R and p > 1. On the potential V we make the following assumptions: (V 1) V ∈ C 1 (R+ , R), (V 2) V is bounded and λ20 := inf{V (r) : r ∈ R+ } > 0. As pointed out in [39], if (V 1) − (V 2) hold and 1 < p < (n + 2)/(n − 2), then (3) has a radial solution that can be found by the Mountain-Pass Theorem: actually, the subspace of the radial functions of H 1 (Rn ) is compactly embedded in Lq , for all q < 2n/(n − 2). Such a solution is a spike, and precisely it concentrates at the origin, see [39]. To give an idea why (3) might possess solutions concentrating on a sphere, let us make the following heuristic considerations. A concentrating solution of (3) carries a potential energy due to V and a volume energy. The former, see (I I ) in (70) below, would prefer the region of concentration approach the minima of V . On the other hand, unlike the case of spike-layer solutions where the volume energy does not depend on the location, the volume energy of solutions concentrating on spheres tends to shrink the sphere, see (I ) and (I I I ) in (70) below. In the region where V is decreasing, there could possibly be a balance that gives rise to solutions concentrating on a sphere. This phenomenon is quantitatively reflected by an auxiliary weighted potential M defined as follows. Let p+1 1 θ= − p−1 2 and define M by setting M(r) = r n−1 V θ (r). Theorem 1.1. Let (V 1) − (V 2) hold, let p > 1 and suppose that M has a point of local strict maximum or minimum at r = r. Then, for ε > 0 small enough, (3) has a radial solution which concentrates near the sphere |x| = r. Remark 1.2. From the proof of Theorem 1.1, it will be clear that the existence of a radial solution can be established even if there exists an interval [a, b], and δ > 0 such that M(r) ≡ M0 , ∀ r ∈ [a, b]; min{M(a − δ), M(b + δ)} > M0 or < M0 . In such a case the concentration will take place at some r ∈ [a − δ, b + δ].
(M)
430
A. Ambrosetti, A. Malchiodi, W.-M. Ni
In the case n = 1 one obviously has M (r) = 0 iff V (r) = 0. Otherwise, when n > 1 one has
M (r) = r n−2 V θ−1 (r) (n − 1)V (r) + θrV (r) . Therefore, no solution concentrating on a sphere can exist in the region where V (r) ≥ 0. On the other hand, taking into account the behavior of M one immediately infers that this kind of solution, if any, arises generically in pairs. More precisely, one has: Corollary 1.3. Suppose that p > 1 and that, in addition to (V 1) − (V 2), one of the following conditions hold: (i) there exists r ∗ > 0 such that (n − 1)V (r ∗ ) + θ r ∗ V (r ∗ ) < 0.
(4)
(ii) V is C 2 and there exists rˆ such that M (ˆr ) = 0 and M (ˆr ) = 0. Then (3) has a pair of solutions concentrating on a sphere. Remark 1.4. Carrying out the proof of Theorem 1.1 we first take p ∈ 1, n+2 n−2 . In Sect. 5 we will indicate the modifications to handle the case of a general p > 1. The preceding existence results are complemented by showing that concentration necessarily occurs on stationary points of M. Theorem 1.5. Suppose that, for all ε > 0 small, (3) has a radial solution uε concentrating on the sphere |x| = r, in the sense that ∀ δ > 0, ∃ ε0 > 0 and R > 0 such that uε (r) ≤ δ,
for ε ≤ ε0 , and |r − r| ≥ εR.
(5)
Then uε has a unique maximum at r = rε , rε → r and M ( r) = 0. The preceding theorem is the counterpart of the results of [39] dealing with necessary conditions for concentration of spikes. The two main new features of the preceding results are: the role of the auxiliary potential M and the fact that we do not impose any growth restriction to the nonlinearity up . The fact that the radius of concentration is not related to the zeros of V but to those of M has never been observed before and is a specific feature of radial solutions concentrating on spheres, in striking contrast with what happens for the spikes. As anticipated before, the role of M is to balance the volume energy and the energy due to the potential V . The surprising phenomenon (at least for us) is that, indeed, there can be no solution concentrating on a sphere of radius r, if V (r) = 0. About the power p, we do not impose any growth restriction: p can be any number greater than 1, including the critical or supercritical exponent. How this is possible will become more clear in the next section, when we give an outline of the proof. Here we limit ourselves to point out that the fact that we can deal with any p > 1 is due not only to the presence of a radial potential but it is also a specific feature of the solutions concentrating on a sphere. For example, spikes concentrating on single points do not exist when p = (n + 2)/(n − 2), see [12].
Singularly Perturbed Elliptic Problems, Concentration on Spheres
431
Our last result concerning Eq. (3) is related to the Morse index of the solution uε and to the bifurcation of non-radial solutions from the set {uε }. According to [39], the solution uε found in Theorem 1.1 cannot be a ground state (i.e. of minimal energy) and hence cannot be a Mountain-Pass of the energy functional. Actually, it is possible to show that the Morse index of uε in H 1 (Rn ) diverges as ε ↓ 0 and this allows us to prove the existence of a plethora of non-radial solutions of (3). More precisely, let V satisfy the assumptions of Theorem 1.1 and let r > 0 be a point of strict local maximum or minimum of M. According to Theorem 1.1, there exists ε > 0 such that for all ε ∈ (0, ε), (3) possesses a family ue of solutions concentrating at the sphere |x| = r. Let us denote by r,ε such a set. Theorem 1.6. Suppose that, in addition to the assumption of Theorem 1.1, the potential V is smooth and that at a point r > 0 of local strict maximum or minimum of M there holds M (r) = 0.
(6)
Then there exists ε0 ∈ (0, ε) such that = r,ε0 is a smooth curve. Moreover, there exist a sequence εj ↓ 0 such that from each uεj ∈ bifurcates a family of non-radial solutions of (3). Remark 1.7. If p ∈]1, (n + 2)/(n − 2)[ and there exists rˆ > 0 such that V (ˆr ) = 0, V (ˆr ) = 0, we can find other solutions of (3) by working on H 1 (Rn ) and using the results of [6] that deal with potentials having a manifold of stationary points like our V (|x|). This approach leads to solutions different in nature: they are spike layers, namely they do not concentrate near a sphere. Actually these solutions are found near critical points of V , not M. Moreover they have low Morse index. To the best of our knowledge, the only result dealing with the existence of solutions of (3) concentrating on a sphere, is the recent paper [7] where an appropriate global variational scheme is used. Some related results are given in [8, 14]. The approach in [7] requires 1 < p < (n + 2)/(n − 2) and some additional assumptions on the behavior of V that are slightly involved and seem technical in nature. Unlike [7], we use a local approach which is outlined in the next section. Such an approach is based on a finite dimensional reduction that takes also into account the variational nature of the problem and allows us to obtain the existence result in a much greater generality. Actually, in Theorem 1.1 we handle critical and super-critical powers and make merely a neat assumption involving M. Moreover, in [7] neither the radii where the solutions concentrate nor results like Corollary 1.3, Theorems 1.5 and 1.6, are established. The paper consists of 7 more sections. In Sect. 2 we introduce some notation and outline the main steps of the abstract procedure. In Sect. 3 we prove some preliminary estimates that play a key role in the rest of the arguments. In Sect. 4 we carry out the abstract procedure that allows us to reduce the problem to the study of a finite dimensional functional whose leading part is M. The proof of Theorem 1.1 is given in Sect. 5 while Sect. 6 contains the proof of Theorem 1.5 and Sect. 7 that of Theorem 1.6. In the last section we collect some open problems and new perspectives. In particular, we indicate how Theorem 1.1 can be extended to find solutions concentrating on a k dimensional sphere for any 1 ≤ k ≤ n − 1. In the subsequent paper [5] we will discuss singularly perturbed problems on a bounded symmetric domain with Dirichlet or Neumann boundary conditions. Roughly, in addition to potential energy due to V , if any, and to the volume energy, one has here to
432
A. Ambrosetti, A. Malchiodi, W.-M. Ni
take into account the effect of the boundary on spherical layers, and this leads to a new kind of solution. To give an idea, we will prove that the equation −ε 2 u + u = up
(7)
on the unit ball (on the annulus {a < |x| < 1}), with Neumann (Dirichlet) boundary conditions, possesses a family of radial solutions with a local maximum point rε < 1 (resp. a < rε < 1) for which 1 − rε ∼ ε| log ε| (resp. rε − a ∼ ε| log ε|). Finally, let us mention that the above results have been reported in the preliminary note [4]. Notation. – Hr1 denotes the set of the functions u ∈ H 1 (Rn ) which are radially symmetric and will be endowed with the norm
+∞ 2 uH 1 = r n−1 |u (r)|2 + V (εr)u2 (r) dr. r
0
The corresponding scalar product will be denoted by (·, ·). – We will often use the notation c, c1 , etc. or C, C1 , etc. for denoting different positive constants. The value of C is allowed to vary from line to line (and also in the same formula). – We use the notation ∼ to denote quantities which, in the limit ε → 0, are of the same order. Similarly, the notation a b means that the order a does not exceed that of b when ε converges to zero. – The symbol o(1) denotes a quantity that tends to 0 as ε → 0. 2. Preliminaries In the first part of this section we collect some preliminary facts that will be used frequently in the following. Moreover, we will outline our procedure. Consider the problem −u + u = up in Rn ; (8) u>0 in Rn , with n ≥ 1, p > 1 and u ∈ H 1 (Rn). We recall that in the sequel (see Remark 1.4) if n+2 n ≥ 3 we will first assume that p ∈ 1, n−2 , so the variational approach is consistent. It is known that problem (8) possesses the ground state solution U which satisfies the properties U (x) = U (|x|), U (r) > 0, n−1 lim er r 2 U (r) = α r→∞
n,p
for all x ∈ Rn , for all r > 0, > 0,
limr→∞ U (r) U (r)
(9)
= −1,
where αn,p is a constant depending only on the dimension n and the exponent p. Hence all the H 1 solutions of (8) belonging to H 1 (R) are of the form U (x − x0 ), with x0 ∈ R.
Singularly Perturbed Elliptic Problems, Concentration on Spheres
433
Solutions of problem (8) for n = 1 in the class H 1 (R) can be found as critical points of the functional I 0 : H 1 (R) → R defined as
1 1 |∇u|2 + u2 − I 0 (u) = |u|p+1 , u ∈ H 1 (R). (10) 2 R p+1 R From the above discussion it follows that the set Z0 = U (x − x0 ) : x0 ∈ R ⊆ H 1 (R) is a manifold of critical points of mountain pass type for the functional I 0 . The manifold Z0 has the following non-degeneracy property, see [34]. Proposition 2.1. The manifold Z0 is non-degenerate for the functional I 0 . Precisely there exists a positive constant C such that
I 0 (U )[U , U ] ≤ −C −1 U 20 ; I 0 (U )[v, v] ≥ C −1 v20 , 1 for all v ∈ H (R), v ⊥ U , v ⊥ TU Z0 , where · 0 denotes the natural norm of H 1 (R). For all λ > 0, let Uλ denote the solution of the problem −U + λ2 U = U p in R, U (0) = 0, U > 0 U ∈ H 1 (R). The Uλ ’s are critical points of the functional
1 1 I λ (u) = |u|p+1 , |∇u|2 + λ2 u2 − 2 R p+1 R
(11)
u ∈ H 1 (R),
and it is immediate to check that Uλ (s) = λ2/(p−1) U (λs). We point out that, by (9), the Uλ ’s are radial functions and satisfy the following decay properties Uλ (s), Uλ (s), Uλ (s) ≤ ce−λs ,
|s| > 1.
Moreover, using the Pohozahev identity, one easily finds p+1 λ2 U 2 = 1 + 1 , λ R R Uλ 2 p+1 p+1 (U )2 = 1 − 1 . R λ R Uλ 2 p+1
(12)
(13)
In the sequel we will work in the space Hr1 (Rn ) = Hr1 of the functions u ∈ H 1 (Rn ) which are radially symmetric. For each u ∈ Hr1 there is a real function u(r) ˜ such that u(x) = u(|x|). ˜ With an abuse of notation, we will simply write u(r) instead of u(r). ˜ For future reference, let us recall that (see [36]) if u ∈ Hr1 one has: |u(r)| ≤ c r (1−n)/2 uH 1 (Rn ) ,
r ≥ 1.
(14)
434
A. Ambrosetti, A. Malchiodi, W.-M. Ni
Performing the change of variable x → εx, problem (3) becomes −u + V (ε|x|)u = up ,
u ∈ Hr1 ,
u > 0.
(15)
uε (x/ε) is a solution of (3). If uε is a solution of (15) then uε (x) = Solutions of (15) are found as critical points of the C 2 functional Iε : Hr1 → R,
1 1 |∇u|2 + V (ε|x|)u2 dx − |u|p+1 dx Iε (u) = 2 Rn p + 1 Rn
+∞ 1 1 +∞ n−1 2 r r n−1 |u|p+1 dr. (u ) + V (εr)u2 dr − = 2 0 p+1 0 Replacing u with its positive part u+ one finds that any critical point uε of Iε satisfies uε ≥ 0 and hence, by the maximum principle, u+ is positive. We will look for critical points of Iε near suitable approximate solutions of (15). Precisely, for all ρ > 0, let U = Uρ,ε be the solution of (11) with λ2 = V (ερ): −U + V (ερ)U = U p , (16) U (0) = 0, U ∈ H 1 (R+ ), U > 0. Setting Stat (M) = {r > 0 : M (r) = 0}, let us fix ρ0 > 0 with 8ρ0 < min Stat (M) and let φε (r) denote a smooth non-decreasing function such that ρ0 0, if r ≤ 2ε , 4ε 16ε 2 |φε (r)| ≤ φε (r) = , |φε (r)| ≤ 2 . ρ0 ρ0 1, if r ≥ ρ0 . ε For ρ ≥ 4ρ0 /ε, set zρ,ε (r) = φε (r) Uρ,ε (r − ρ).
(17)
Fixed > r, see Theorem 1.1, consider the compact interval Tε = 4ε−1 ρ0 , ε−1 and let Z = Zε = {z = zρ,ε : ρ ∈ Tε }. We will look for critical points of Iε of the form u = z + w,
z = zρ,ε ∈ Z,
w ⊥ Tz Z.
We will use a finite dimensional reduction discussed in [1, 2]. Let us outline the main steps of the argument. Step 1. For all ρ ∈ Tε , we first solve the auxiliary equation Iε (z + w) ∈ Tz Z, namely Iε (z + w) = α z˙ ,
z˙ =
∂z , ∂ρ
for some α ∈ R. It is convenient to write such an equation in the form Iε (z)[w] = − Iε (z + w) − Iε (z)[w] − α z˙ .
(18)
Singularly Perturbed Elliptic Problems, Concentration on Spheres
435
Let P denote the orthogonal projection onto (Tz Z)⊥ . We shall show (see Sect. 4) that for ε small enough P Iε (z) is invertible and so, setting
−1 Sε (w) = − P Iε (z) P Iε (z + w) − Iε (z)[w] − α z˙ , (19) solving (18) is equivalent to finding the solutions of w = Sε (w). Recall that (12) yields z(r) ≤ Ce−λ0 |r−ρ| ,
(λ20 = inf{V (r) : r ∈ R+ }).
Choose η > 0 such that λ1 := λ0 − η >
λ0 . min{p, 2}
It is possible to show that there exist positive constants γ > 0 such that, setting Cε = w ∈ Hr1 : wHr1 ≤ γ εzHr1 , |w(r)| ≤ γ e−λ1 (ρ−r) for r ∈ [0, ρ] ,
(20)
for all ε small and all ρ ∈ Tε , the map Sε is a contraction such that Sε (Cε ) ⊂ Cε and thus it has a unique fixed point wρ,ε therein. This is accomplished with some estimates that will be carried out in Sect. 4. In addition, it will be shown that w is of class C 1 with respect to ρ and there results ∂w (21) ∂ρ = o(˙z). Step 2. Define the finite dimensional functional ε (ρ) by setting ε (ρ) = Iε (zρ,ε + wρ,ε ). Then, according to the general result of [1] cited above, one has the following proposition. uε = zρε ,ε + Proposition 2.2. If, for some ε 1, ρε is a stationary point of ε , then wρε ,ε is a critical point of Iε . Proof. At ρ = ρε one has that (we omit below the dependence on ε) ∂zρ 2 ∂zρ ∂zρ ∂wρ ∂wρ ∂ 0= = I (zρ + wρ ) +α + = α , , ∂ρ ∂ρ ∂ρ ∂ρ ∂ρ ∂ρ where α = α(ρε ) has been found in Step 1 before. Using (21) we deduce that α = 0 and the result follows. In order to use this proposition, we will show, see Sect. 5, that εn−1 ε (ρ) = C0 M(ερ) + O(ε 2 ),
ρ ∈ Tε ,
for some constant C0 > 0. Then ε has a stationary point ρε ∼ r/ε. Let us point out that, according to the choice of ρ0 and , one has that ρε ∈ Tε and hence the arguments outlined in Step 1 apply. Using Proposition 2.2 we infer that uε (r) ∼ U (r − ρε ) is a uε (r/ε) ∼ U ((r − r)/ε) is a radial solution radial solution of (15). Therefore uε (r) = of (3) concentrating on the sphere |x| = r, proving Theorem 1.1. The proofs of the other results follow roughly, a similar scheme up to some appropriate modifications.
436
A. Ambrosetti, A. Malchiodi, W.-M. Ni
Remark 2.3. The abstract procedure sketched before is related to the arguments employed in [6, 27], see also [1, 3]. Nevertheless, there are several new ingredients in the proofs carried out here. As in [27], we look for a solution near a suitable approximated one. But in the present situation the approximated solutions arise in a manifold, namely the set Z defined above. For this reason it is convenient to use some argument inspired by [1, 3, 6], that consists of splitting the proof in two parts: (i) the existence of solutions of the auxiliary problem Iε (z + w) ∈ Tz Z and (ii) the study of the finite dimensional functional ε (ρ) that selects the value ρε where solutions of (15) branch off. We point out that in solving the auxiliary equation, the gradient Iε (z) is small relatively to z, see Eq. (E1) in the next section, although it could be large for n ≥ 3. Finally, since we do not impose any growth restriction, we need to have appropriate a priori bounds. To overcome these problems, the idea is to introduce the subset Cε of functions with a suitable exponential decay, see (20), and to look for fixed points of the map Sε therein. This is the most remarkable difference with the approach of all the aforementioned papers. Working in Cε makes it possible to obtain L∞ estimates which are needed to prove the previous results in that generality. 3. Some Preliminary Estimates In this section we prove some estimates that will be needed in the sequel. First of all, a lemma is in order. We anticipate that the pointwise estimate (24) below is the key ingredient that will allow us to deal with any exponent p > 1, possibly critical or super-critical. Lemma 3.1. For ε > 0 small, ρ ∈ Tε , w ∈ Cε and r > 0 one has zρ,ε Hr1 ∼ ε(1−n)/2 ; wHr1 ε zρ,ε Hr1 ∼ ε |w(r)| ≤ C ε,
(22) (3−n)/2
;
(23)
∀ r ≥ 0,
(24)
where C depends only on n and the constant γ in the definition of Cε . Proof. The definition of zρ,ε and the fact that Uρ decays exponentially to zero as |x| → ∞ yield
zρ,ε 2H 1 = r
+∞
r n−1 (|z |2 + V (εr)z2 )dr ∼ ρ n−1 .
0
Since ρ ∈ Tε , then ρ ∼ ε −1 and (22) follows. Equation (23) is an immediate consequence of (22) and of the fact that w ∈ Cε . To estimate |w(r)| we first recall that, taking ε such that ρ0 /ε > 1 and using (14) we get |w(r)| ≤ c r (1−n)/2 wHr1 ≤ c ε (n−1)/2 wHr1 , Then (23) implies
|w(r)| ≤ C ε
for all r ≥ ρ0 /ε.
for all r ≥ ρ0 /ε.
Furthermore, according to the definition of Cε , and recalling that ρ ∈ Tε implies ρ ≥ 4 ρε0 , we have |w(r)| ≤ γ e−λ1 (ρ−r) ≤ γ e−3 Hence (24) follows.
λ1 ε
≤ Cε,
for all r ≤ ρ0 /ε.
Singularly Perturbed Elliptic Problems, Concentration on Spheres
437
We are now in position to prove that, for ε > 0 small enough, for all w ∈ Cε and all ρ ∈ Tε the following estimates hold: Iε (zρ,ε ) ε zρ,ε Hr1 ∼ ε(3−n)/2 ; Iε (zρ,ε + sw) ≤ C, (0 ≤ s ≤ 1) ; Iε (zρ,ε + w) ε (3−n)/2 ; Iε (zρ,ε + sw) − Iε (zρ,ε ) ε1∧(p−1) , (0 ≤ s ≤ 1) ; Iε (zρ,ε + w) − Iε (zρ,ε ) − Iε (zρ,ε )[w] ε1∧(p−1) wHr1 .
(E1) (E2) (E3) (E4) (E5)
Notation. In all the equations below it is understood that w ∈ Cε and ρ ∈ Tε . Sometimes we will also omit the subscripts ρ, ε. Moreover, consistently with our notation, throughout this section we also omit the coefficient ωn−1 (the measure of S n−1 ) when we write integrals in polar coordinates. Proof of (E1). For all v ∈ Hr1 one has Iε (z)[v]
+∞ = r n−1 z v + V (εr)zv − zp v dr 0
+∞
+∞ v(r n−1 z ) dr + r n−1 V (εr)zv − zp v dr =− 0 0
+∞
+∞
+∞ r n−2 z vdr − r n−1 z vdr + r n−1 V (εr)zv − zp v dr . = −(n − 1) 0 0 0 A0 (v)
A1 (v)
Using the H¨older inequality we get |A0 (v)| ≤ C vHr1
+∞
(r (n−3)/2 z )2 dr
1/2 .
0
Since z decays exponentially away from r = ρ and since ρ ∈ Tε , it follows that
+∞
+∞ (n−3)/2 2 (r z ) dr = r −2 · r n−1 |z |2 dr ∼ ρ −2 z2H 1 ∼ ε2 z2H 1 . 0
r
0
r
Then, using (22) we find sup{|A0 (v)| : vHr1 ≤ 1} ∼ ε z ∼ ε(3−n)/2 .
(25)
To estimate A1 (v) we recall that, by definition, z = φ · U (r − ρ) and hence
+∞ r n−1 (φ U + 2φ U )vdr A1 (v) = 0 A2 (v)
+∞
+∞ + r n−1 φU v + V (εr)φU v − (φU )p v dr − r n−1 φU vdr , 0 0 A3 (v)
438
A. Ambrosetti, A. Malchiodi, W.-M. Ni
where U stands for U (r − ρ). Since the support of φ is the interval [ρ0 /2ε, ρ0 /ε] and U decays exponentially to zero as r → ∞ we get sup{|A2 (v)| : vHr1 ≤ 1} ∼ e−c/ε .
(26)
Finally, using Eq. (16) we infer
A3 (v) =
+∞
+∞
r n−1 (V (εr) − V (ερ)) φU vdr =
0
r n−1 (V (εr) − V (ερ)) zvdr.
0
Since V is bounded, |A3 (v)| ≤ c vHr1
1/2
+∞
r
n−3 2
z dr
,
0
and hence, in view of (22), sup{|A3 (v)| : vHr1 ≤ 1} ∼ ε z ∼ ε(3−n)/2 . Putting together (25), (26) and (27), we find (E1).
(27)
Proof of (E2). For v ∈ Hr1 we get I (z + sw)[v, v] = ε
+∞
r 0
≤ c v2H 1 r
2
|v | + v + V (εr)v − p|z + sw| +∞ n−1 p−1 2 +p r |z + sw| v dr .
n−1
2
2
0
According to (24) one has that |z(r) + sw(r)| ≤ c and thus
+∞
r
n−1
|z + sw|
v dr ≤ c v2H 1 . r
p−1 2
0
Then |Iε (z + sw)[v, v]| ≤ c v2H 1 and (E2) follows. r
Proof of (E3). From Iε (z + w) = Iε (z) +
1 0
Iε (z + sw)[w]ds
we deduce Iε (z + w) ≤ Iε (z) + sup Iε (z + sw) · w. 0≤s≤1
Then the result follows from (E1), (E2) and (23).
p−1 2
v
dr
Singularly Perturbed Elliptic Problems, Concentration on Spheres
439
Proof of (E4). One has
+∞ n−1 p−1 p−1 2 I (z + sw)[v, v] − I (z)[v, v] = r − pz dr p|z + sw| v ε ε 0
+∞ r n−1 (|z| + |w|)p−1 − zp−1 v 2 dr ≤p 0
+∞ |w| + |w|p−1 r n−1 v 2 dr. ≤c 0
Using (24) we infer that I (z + sw)[v, v] − I (z)[v, v] ≤ c (ε + ε p−1 ) v2 1 , ε ε H r
proving (E4).
Proof of (E5). Since Iε (z
+ w) − Iε (z)
then Iε (z + w) − Iε (z) − Iε (z)[w] = and using (E4) we find that (E5) holds.
1
= 0
Iε (z + sw)[w]ds, 1
0
Iε (z + sw) − Iε (z) [w]ds
4. The Contraction Argument on Cε In this section we study the contraction mapping argument on the set Cε . As stated in the Introduction, we are interested in finding a function w satisfying the two conditions i) Iε (z + w) ∈ Tz Z; ii) w ⊥ z˙ , where z˙ = ∂z/∂ρ. Here and below z stands for zρ,ε defined in (17). Setting W = (Tz Z)⊥ and letting P = Pρ,ε denote the orthogonal projection onto W , conditions i) and ii) above are clearly satisfied if and only if, given z, one finds w ∈ W such that P Iε (z + w) = 0. We are going to show that there exists γ > 0 such that this equation is indeed solvable in Cε provided ε is small enough. First of all, we need some preliminary estimates on the function z˙ . Since zρ,ε (r) = ϕε (r)UV (ερ) (r − ρ), we have ∂ ∂ ∂ zρ,ε = ϕε (r) εV (ερ) UV |V (ερ) (r − ρ) − UV (ερ) (r − ρ) . ∂ρ ∂V ∂r From this formula and from the exponential decay of Uλ it is easy to deduce 2
∂ 1 zρ,ε ∼ ρ n−1 (UV (ερ) )2 + (UV (ερ) )2 ∼ n−1 , ∂ρ ε R
(28)
since ρ is of order 1ε . In particular the function z˙ and its derivatives up to order 2, pointwise, are uniformly bounded and decay exponentially away from ρ.
440
A. Ambrosetti, A. Malchiodi, W.-M. Ni
4.1. Study of the linearized operator and abstract setting. In this subsection we introduce the abstract set-up for solving the equation P Iε (z + w) = 0, and we begin the proof of the existence of a solution. The remaining part of the proof regards some decay properties and since it relies on separate arguments, we postpone it to the next subsection. We start by proving that the operator Iε (z) is positive-definite on the subspace of Hr1 orthogonal to {tz} ⊕ Tz Z. Lemma 4.1. There exists a positive constant δ such that, for every ρ ∈ Tε and for ε sufficiently small there holds Iε (z)[v, v] ≥ δv2 ,
for all v ⊥ {tz} ⊕ Tz Z.
Proof. For ρ ∈ Tε , µ > 0, we define a cutoff function ψρ,µ : R+ → R which satisfies ψρ,µ (r) = 0 for r ∈ [0, ρ − 2µ] ∪ [ρ + 2µ, ∞] , ψ (r) = 1 for r ∈ [ρ − µ, ρ + µ] , ρ,µ (29) |(r) ≤ 2 |ψ for r ∈ [ρ − 2µ, ρ − µ] ∪ [ρ + µ, ρ + 2µ] , ρ,µ µ |ψ |(r) ≤ 4 for r ∈ − 2µ, ρ − µ] ∪ + µ, ρ + 2µ] . [ρ [ρ ρ,µ µ In the following, when there is no confusion, we simply write ψ for ψρ,µ . For every function u ∈ H 1 we have clearly u2 = ψu2 + (1 − ψ)u2 + 2 (ψu, (1 − ψ)u) ,
(30)
and Iε (z)[u, u] = Iε (z)[ψu, ψu] + Iε (z)[(1 − ψ)u, (1 − ψ)u] + 2Iε (z)[ψu, (1 − ψ)u]. (31) On the other hand, since u is orthogonal to z and z˙ and since z and z˙ have an exponential decay, see (28), from the H¨older inequality it is easy to check that |(ψu, z)| = |− ((1 − ψ)u, z)| ≤ Ce−µ zu ≤ Ce−µ ε |(ψu, z˙ )| = |− ((1 − ψ)u, z˙ )| ≤ Ce
−µ
˙zu ≤ Ce
−µ
ε
1−n 2
u;
(32)
1−n 2
u,
(33)
where C is a positive constant independent of ε, u, µ and ρ ∈ Tε . Moreover, since V is smooth, there holds |V (εr) − V (ερ)| ≤ Cεµ, r n−1 − ρ n−1 ≤ Cµε 2−n for r ∈ [ρ − 2µ, ρ + 2µ] . (34) Formulas (32), (33) and the second equation in (34) imply
1 ρ n−1 ((ψu) )2 (r) + (ψu)2 (r) dr = ψu2 + O + εµ u2 ; µ
1 ρ n−1 (ψu) (r)UV (ερ) (r) + (ψu)(r)UV (ερ) (r) dr = O + εµ u; µ
1 (ψu) (r)UV (ερ) (r) + (ψu)(r)UV (ερ) (r) dr = O ρ n−1 + εµ u, µ
(35) (36) (37)
Singularly Perturbed Elliptic Problems, Concentration on Spheres
441
where O µ1 + εµ ≤ C µ1 + εµ for some constant C independent of ε, u, µ and ρ ∈ Tε . For every v(r) ∈ H 1 (R) we have
∞ r n−1 (v )2 + V (εr)v 2 − pzp−1 v 2 dr Iε (z)[v, v] − ωn−1 ρ n−1 I (z)[v, v] = 0
∞ (v )2 + V (ερ)v 2 − pzp−1 v 2 dr, − ρ n−1 0
(38) hence from (34) and the decay of z we deduce
∞ Iε (z)[ψu, ψu] = ρ n−1 ((ψu) )2 + V (ερ)(ψu)2 − pzp−1 (ψu)2 dr 0 1 +O (39) + εµ u2 . µ From (35)-(37), (39) and Proposition 2.1 it follows that 1 2 Iε (z)[ψu, ψu] ≥ δψu + O + εµ u2 . µ Using the exponential decay of z we find also
1 ≥ (1 − ψ)u + O + εµ u2 ; µ 1 Iε (z)[ψu, (1 − ψ)u] = (ψu, (1 − ψ)u) + O + εµ u2 . µ
Iε (z)[(1 − ψ)u, (1 − ψ)u]
2
From the last three equations and (31) we finally deduce Iε [u, u] ≥ δψu2 + (1 − ψ)u2 + 2(ψu, (1 − ψ)u) + o(1)u2 1 ≥ δu2 + O + εµ u2 . µ Choosing µ sufficiently large, the above quadratic form is positive definite in u, provided ε is sufficiently small. This concludes the proof of the lemma. From simple computations one finds Iε (z)[z, z]
= (1 − p)
zp+1 + oε (1).
This and Lemma 4.1 readily yield Proposition 4.2. For ε sufficiently small and for ρ ∈ Tε the operator P Iε (z) is invertible on W with uniformly bounded inverse. In other words, we have −1 where Aε = − P Iε (z) , Aε ≤ C; and where C is a fixed positive constant.
442
A. Ambrosetti, A. Malchiodi, W.-M. Ni
We are now ready to prove the existence of w ∈ Cε satisfying properties i) and ii) above, namely P Iε (z + w) = 0. Proposition 4.3. For ε sufficiently small there exists a positive constant γ such that for ρ ∈ Tε , there exists and a function w = w(zρ,ε ) ∈ W = (Tz Z)⊥ satisfying P Iε (z + w) = 0 and iii) w ∈ Cε = w ∈ Hr1 : w ≤ γ εz, |w(r)| ≤ γ e−λ1 |r−ρ| for r ∈ [0, ρ] , iv) w ≤ C Iε (zρ ). Moreover w is of class C 1 in ρ and ∂w/∂ρ = o(˙z). Proof. We divide the proof into three steps, postponing the third one to the next subsection. Step 1. Abstract setting. First of all, let us write the equation P Iε (z + w) = 0 in the equivalent form P Iε (z) + P hw + P Iε (z)[w] = 0, where hw = Iε (z + w) − Iε (z) − Iε (z)[w]. Setting Sε (w) := Aε P Iε (z) + P hw and using Proposition 4.2, we deduce that P Iε (z + w) = 0
⇔
w = Sε (w).
(40)
We are going to prove that the map Sε is a contraction on the set Cε (for a suitable choice of γ ) endowed with the norm induced by Hr1 . In order to do this, we shall prove through some decay and norm estimates that Cε is sent into itself by Sε and that Sε (w) − Sε (w) ˆ ≤ κw − w, ˆ
for all w, wˆ ∈ Cε ,
(41)
for some κ < 1. Step 2. Norm estimates. From estimates (E1), (E5), and from Proposition 4.2, we find immediately Sε (w) ≤ C Iε (zρ ) + ε1∧(p−1) w ≤ εz + ε 1∧(p−1) w ≤ ε C0 1 + ε 1∧(p−1) C1 z, ∀ w ∈ Cε . (42) Also, if w, wˆ ∈ Cε , one has Sε (w) − Sε (w) ˆ = Aε P Iε (z + w) − P Iε (z)[w] − P Iε (z + w) ˆ + P Iε (z)[w] ˆ . Writing Iε (z + w) − Iε (z)[w] − Iε (z + w) ˆ + Iε (z)[w] ˆ
1 = Iε (z + wˆ + s(w − w)) ˆ − Iε (z) [w − w]ds, ˆ 0
and repeating the proof of (E4), we find Iε (z + w) − Iε (z)[w] − Iε (z + w) ˆ + Iε (z)[w] ˆ ≤ Cε 1∧(p−1) w − w. ˆ This and Proposition 4.2 imply Sε (w) − Sε (w) ˆ ≤ Cε1∧(p−1) w − w, ˆ so Eq. (41) is satisfied for ε small.
for all w, wˆ ∈ Cε ,
Singularly Perturbed Elliptic Problems, Concentration on Spheres
443
4.2. Step 3: Decay estimate. The last step consists in proving that the set Cε is sent onto itself by Sε . Having the norm estimates of Step 2, it is sufficient to show that w ∈ Cε
⇒
|(Sε w)(r)| ≤ γ e−λ1 (ρ−r)
for 0 ≤ r ≤ ρ.
Let us point out that the constant γ > 0 in the definition of Cε has not been selected yet, and will be determined below. First, using the definition of Aε one can immediately see that, given f ∈ Hr1 , there holds w = Aε (Pf )
if and only if
− P Iε (z)[w] = f −
(f, z˙ )˙z . ˙z2
(43)
We also note that, testing the equation P Iε (z)[w] = −f + ˙z−2 (f, z˙ )˙z on any smooth function, w satisfies the following differential equation: −w + V (εr)w − pzp−1 w = −(f + α z˙ ) + V (εr)(f + α z˙ ) with
in Rn ,
(44)
α = ˙z−2 (Iε (z)[w] + f, z˙ ).
Hence, if we set hz = Iε (z) and
w˜ = Sε (w) = Aε (P Iε (z)) + Aε P Iε (z + w) − P Iε (z) − P Iε (z)[w] ,
(45)
from (40), (43) and (44) we find that w˜ ∈ W satisfies the differential equation −w˜ + V (εr)w˜ − pzp−1 w˜ = −(hz + hw + β z˙ ) + V (εr)(hz + hw + β z˙ ) with
in Rn ,
(46)
β = ˙z−2 (Iε (z)[w] + hz + hw , z˙ ) = ˙z−2 (Iε (z + w), z˙ ).
The functions hz and hw are defined by duality as (hz , v) = Iε (z)[v];
(hw , v) = Iε (z+w)[v]−Iε (z)[v]−Iε (z)[w, v],
for all v ∈ Hr1 ,
so they solve respectively the two equations −hz + V (εr)hz = −z + V (εr)z − zp , in Rn ; in Rn . −hw + V (εr)hw = − (z + w)p − zp − pzp−1 w , Hence from the last formulas we deduce that w˜ satisfies −w˜ + V (εr)w˜ − pzp−1 w˜ = − (z + w)p − zp − pzp−1 w + β (−˙z + V (εr)˙z) in Rn . + −z + V (εr)z − zp , (47) Now, for N > 0 to be chosen later, we write w(x) ˜ = w˜ 1 (x) + w˜ 2 (x),
for x ∈ Bρ−N (0),
444
A. Ambrosetti, A. Malchiodi, W.-M. Ni
where the functions w˜ 1 and w˜ 2 are defined respectively as the solutions of the two problems −w˜ 1 + V (εr)w˜ 1 − pzp−1 w˜ 1 = 0 in Bρ−N (0), (48) w˜ 1 = w˜ on ∂Bρ−N (0);
−w˜ 2 + V (εr)w˜ 2 − pzp−1 w˜ 2 = fz,w w˜ 2 = 0
in Bρ−N (0), on ∂Bρ−N (0),
(49)
with
fz,w = (z + w)p − zp − pzp−1 w − β (−˙z + V (εr)˙z) − −z + V (εr)z − zp .
Let us choose positive numbers λ2 and λ3 such that (recall that λ0 < (2 ∧ p)λ1 ).
λ1 < λ3 < λ2 < λ0
(50)
Denote by G(·, y) the Green’s function of − + λ22 with pole y, namely the solution of −u + λ22 u = δy . Then it is standard to check as for (9) that G(x, y) ≤ Ce−λ2 |x−y|
1 |x − y|
n−1 2
for |x − y| ≥ 1.
,
(51)
Estimate of w˜ 1 . We have the following elementary lemma. Lemma 4.4. Let R > 1 and V ∗ (r) be such that (λ∗0 )2 := inf V ∗ > λ22 . Let u ∈ Hr1 (Rn ) be any C 2 solution of the equation −u + V ∗ (r)u = 0;
in BR ,
u(0) > 0.
Then there exists a positive constant C, depending only on n and (λ∗0 )2 − λ22 , such that u(r) ≤ Ce−λ1 (R−r) u(R), Proof. Let u ∈
Hr1 (Rn )
be a
C2
for r ∈ [0, R].
solution of the equation
−u + (λ∗0 )2 u = 0;
in BR ,
u(R) = u(R).
From the maximum principle it follows that u(r) ≤ u(r) for all r ∈ [0, R]. Then the lemma follows easily from the well-known results about Bessel functions. See also [18], Appendix C, p. 401. The function w˜ 1 can be estimated by using Lemma 4.4. We choose N such that λ20 − p sup zp−1 (r) > λ22 .
(52)
r≤ρ−N
Applying Lemma 4.4 with V ∗ = V − pzp−1 and R = ρ − N , we have from (48), w˜ 1 (r) ≤ Ce−λ1 (ρ−N−r) w˜ 1 (ρ − N ) = Ceλ1 N e−(ρ−r) w(ρ ˜ − N ). Using (24), we infer ε e−λ1 (ρ−r) , w˜ 1 (r) ≤ C
for r ≤ ρ − N.
(53)
Singularly Perturbed Elliptic Problems, Concentration on Spheres
445
Estimate of w˜ 2 . By Eq. (12) we also can choose N0 > 0 large enough, depending only on n and V such that (52) holds. Let N ≥ N0 : if ρ ∈ Tε , then ρ − N > 0 for ε small. Let u˜ 2 denote the solution of the problem −u˜ 2 + λ22 u˜ 2 = f1 + f2 + f3 in Bρ−N (0), (54) u˜ 2 = 0 on ∂Bρ−N (0), where
f1 = (z + w)p − zp − pzp−1 w ; f3 = −z + V (εr)z − zp .
f2 = |β| |−˙z + z˙ | ;
From the maximum principle, taking into account (52), it follows easily that 0 ≤ |w˜ 2 | ≤ u˜ 2 , so it is sufficient to prove an estimate for u˜ 2 . In order to do this, we need to consider first the decay of the functions f1 , f2 and f3 . From the fact that w ∈ Cε and from standard computations we have f1 (r) = (z + w)p − zp − pzp−1 w (r) ≤ C|w|(r)2∧p 2∧p −(2∧p)λ1 (ρ−r)
≤ CC1
e
,
for r ≤ ρ.
(55)
From the definition of β and the estimate (E3), it follows that |β| ≤ ε. Moreover there holds f2 = |β| |−˙z + z˙ | ≤ Cεe−λ2 (ρ−r) ,
for r ≤ ρ.
(56)
Similarly, from the expression of zρ,ε we deduce f3 = −z + V (εr)z − zp ≤ Ce−λ2 (ρ−r) ,
for r ≤ ρ.
(57)
Equations (55)–(57) then imply 2∧p f˜(r) := f1 (r) + f2 (r) + f3 (r) ≤ Ce−λ2 (ρ−r) + CC1 e−(2∧p)λ1 (ρ−r) ,
for r ≤ ρ. (58)
Let GB denote the Green’s function for the operator −u + λ22 u on the set Bρ−N (0) with Dirichlet boundary conditions. By the maximum principle we have clearly
GB (x, y)f˜(y)dy ≤ G(x, y)f˜(y)dy, x ∈ Bρ−N (0). (59) u˜ 2 (x) = Bρ−N (0)
Rn
For a small number σ to be fixed later and for any x ∈ Bρ−N (0), we divide Rn in the two regions A1 = y ∈ Rn : |y − x| ≥ (1 − σ ) ||x| − ρ| ; A2 = y ∈ Rn : |y − x| ≤ (1 − σ ) ||x| − ρ| . 2∧p (see (58)) we find From (51) and from the fact that |f˜(x)| ≤ C 1 + C1
446
A. Ambrosetti, A. Malchiodi, W.-M. Ni
∞ G(x, y)f˜(y)dy ≤ Cf L∞ e−λ2 r r n−1 dr A1 (1−σ )(ρ−|x|) ∞ 2∧p e−λ2 r r n−1 dr. ≤ C 1 + C1 (1−σ )(ρ−r)
If σ is chosen sufficiently small (in order to have λ3 < (1 − σ )λ2 ), it is easy to prove the following estimate:
∞ e−λ2 r r n−1 dr ≤ Ce−λ3 (ρ−r) , for (ρ − r) ≥ 1, (1−σ )(ρ−r)
where C is independent of ρ ∈ Tε and r. Hence from the last two equations it follows that
2∧p G(x, y)f˜(y)dy ≤ C 1 + C1 for (ρ − r) ≥ N. (60) e−λ3 (ρ−r) , A1
On the other hand there holds |y − x| + |ρ − |y|| ≥ |ρ − r| ,
for y ∈ A2 .
As a consequence, from (51) and (58) we deduce
2∧p G(x, y)f˜(y)dy ≤ C 1 + C1 e−λ2 |x−y| e−λ2 |ρ−|y|| A2 A2 2∧p ≤ C 1 + C1 e−λ2 |ρ−r| . A2
Since |A2 | ≤ C(1 − |σ |)n−1 |ρ − r|n−1 , it follows that
2∧p e−λ3 (ρ−r) , G(x, y)f˜(y)dy ≤ C 1 + C1
for (ρ − r) ≥ N.
(61)
A2
From (59), (60) and (61) we find 2∧p u˜ 2 (x) ≤ C 1 + C1 e−λ3 (ρ−|x|) ,
for (ρ − |x|) ≥ N,
so from (53) it follows that 2∧p −λ1 (r−ρ) , |w(r)| ˜ ≤ C 0 1 + C1 e−λ3 (ρ−r) + Cεe
for (ρ − r) ≥ N.
(62)
We now choose γ > 2C0 , where C0 is the constant given in formula (42). Then for ε sufficiently small we have w ∈ Cε
⇒
w ˜ ≤ γ εz.
(63)
Next, we can first choose N > N0 1 (satisfying (52)) and then ε > 0 small such that 2∧p −λ1 (r−ρ) ≤ γ , C 0 1 + C1 (64) e−(λ3 −λ1 )N + Cεe
Singularly Perturbed Elliptic Problems, Concentration on Spheres
447
where C 0 is the constant given in formula (62). Then (62) and (64) imply 2∧p |w|(x) ˜ ≤ C 0 1 + C1 e−(λ3 −λ1 )(ρ−r) e−λ1 (ρ−r) ≤ γ e−λ1 (ρ−r) , for (ρ − r) ≥ N. (65) On the other hand, since ρ ∈ Tε Eq. (14) and w ˜ ≤ γ εz imply |w(x)| ˜ ≤ C γ εz · r (1−n)/2 ≤ Cγ ε ≤ γ e−λ1 (ρ−r) ,
for (ρ − r) ≤ N,
(66)
provided ε is sufficiently small. 4.3. Proof of Proposition 4.3 completed. We are now in position to complete the proof of Proposition 4.3. Equations (65) and (66) imply w ∈ Cε
⇒
|w(r)| ˜ ≤ γ e−λ1 (ρ−r)
for r ≤ ρ,
hence, by (63) and the last equation, w˜ = Sε w ∈ Cε . Then Sε is a contraction from Cε into itself, and hence there exists w ∈ Cε such that w = Sε (w), namely such that P Iε (z + w) = 0. Moreover, iv) follows directly from (42). It remains to show the C 1 dependence of w on ρ as well as the estimate of ∂w/∂ρ and to this is devoted the rest of this subsection. Since in the sequel we are interested in the dependence on ρ we will emphasize this and will often omit the other subscripts such as ε. We know that wρ solves the equation
wρ = Sρ (wρ ) = Aρ Pρ Iε (zρ + wρ ) − Iε (zρ )[wρ ] . Since the map ρ → Sρ (·) is continuous, it is not hard to show that ρ → wρ is also continuous, see e.g. [10] pp. 22-23. For ρ, ρ1 ∈ Tε , we first evaluate
wρ − wρ1 = Aρ Pρ Iε (zρ + wρ ) − Iε (zρ )[wρ ]
−Aρ1 Pρ1 Iε (zρ1 + wρ1 ) − Iε (zρ1 )[wρ1 ] . We have
wρ − wρ1 = Aρ Pρ − Aρ1 Pρ1 Iε (zρ + wρ ) − Iε (zρ )[wρ ]
+Aρ1 Pρ1 Iε (zρ + wρ ) − Iε (zρ1 + wρ1 ) − Iε (zρ )[wρ ] + Iε (zρ1 )[wρ1 ] . This can also be rewritten in the form wρ − wρ1 = X1 + X2 + X3 + Gρ,ρ1 (wρ , wρ1 ),
(67)
where
X1 = Aρ Pρ − Aρ1 Pρ1 Iε (zρ + wρ ) − Iε (zρ )[wρ ] ,
X2 = Aρ1 Pρ1 Iε (zρ + wρ ) − Iε (zρ1 + wρ ) ,
X3 = Aρ1 Pρ1 Iε (zρ1 ) − Iε (zρ ) [wρ ] ,
Gρ,ρ1 (wρ , wρ1 ) = Aρ1 Pρ1 Iε (zρ1 + wρ ) − Iε (zρ1 + wρ1 ) − Iε (zρ1 )[wρ − wρ1 ] .
448
A. Ambrosetti, A. Malchiodi, W.-M. Ni
Now we evaluate (wρ − wρ1 )/(ρ − ρ1 ) and let ρ1 tend to ρ. From (67) we infer wρ − wρ1 Gρ,ρ1 (wρ , wρ1 ) X1 + X2 + X3 − = . ρ − ρ1 ρ − ρ1 ρ − ρ1
(68)
Note that the three terms Xi do not depend upon wρ1 , while they depend smoothly (at least C 1 ) with respect to ρ1 . This implies that they are differentiable at ρ and there holds (all the derivatives are evaluated at ρ; moreover we denote by z˙ the derivative ∂zρ /∂ρ):
∂Aρ Pρ X1 → Iε (zρ + wρ ) − Iε (zρ )[wρ ] =: X1 , ρ − ρ1 ∂ρ
X2 → Aρ Pρ Iε (zρ + wρ )[˙zρ ] =: X2 , ρ − ρ1 X3 ∂ → A ρ Pρ Iε (zρ ) [wρ ] =: X3 . ρ − ρ1 ∂ρ Furthermore, since Iε (zρ1 + wρ ) − Iε (zρ1 + wρ1 ) − Iε (zρ1 )[wρ − wρ1 ]
1 wρ − wρ1 [Iε (zρ1 + wρ1 + s(wρ − wρ1 )) − Iε (zρ1 )]ds = , ρ − ρ1 0 it follows from (E4) that as ρ1 approaches ρ, Eq. (68) can be solved (via the contraction mapping principle) in the unknown (wρ − wρ1 )/(ρ − ρ1 ). Then, passing to the limit we deduce that (wρ − wρ1 )/(ρ − ρ1 ) converges to a vρ satisfying
vρ = X1 + X2 + X3 + Aρ Pρ (Iε (zρ + wρ ) − Iε (zρ ))[vρ ] . In this way we have shown that w is continuously differentiable with respect to ρ and the derivative w˙ satisfies
˙ = X1 + X2 + X3 . I d − Aρ Pρ (Iε (zρ + wρ ) − Iε (zρ ))[w] Let us evaluate X1 ≤ CIε (zρ + wρ ) − Iε (zρ )[wρ ], X2 ≤ CIε (zρ + wρ )[˙zρ ], X3 ≤ Cwρ . Using (23), (E2), (E3) and the fact that ˙zρ X1 ≤ C ε zρ ≤ C ε ˙zρ ,
zρ , we infer X3 ≤ C ε ˙zρ .
As for X2 we write Iε (zρ + wρ )[˙zρ ] = (Iε (zρ + wρ ) − Iε (zρ ))[˙zρ ] + Iε (zρ ))[tzρ ]. Using (E4) we get (Iε (zρ + wρ ) − Iε (zρ ))[˙zρ ] ≤ Cε(p−1)∧1 ˙zρ . On the other hand, small modifications of the arguments used in the proof of (E1) readily yield Iε (zρ )[˙zρ ] ≤ C ε ˙zρ .
Singularly Perturbed Elliptic Problems, Concentration on Spheres
449
Finally note that, using (E4), we can estimate the term on the
right-hand side and it follows that the operator I d − Aρ Pρ (Iε (zρ + wρ ) − Iε (zρ )) is invertible with bounded inverse. In conclusion it follows that w˙ ρ = o(˙zρ ), as required and the proof of Proposition 4.3 is now complete.
Remark 4.5. It is possible to prove some better decay estimates for the function w constructed in this section. Indeed the function w satisfies the equation −w + V (εr)w − pzp−1 w − (z + w)p − zp − pzp−1 w = α (−˙z + V (εr)˙z) − −z + V (εr)z − zp . From the smoothness of V and from theabove estimates, the right-hand side can be bounded by Cε(1 + |r − ρ|)z. The term (z + w)p − zp − pzp−1 w is superlinear in w, see (55), and hence (since w is pointwise small) the coefficient of w in the left-hand side is bounded below by a positive constant for |r − ρ| sufficiently large. In this way one finds |w(r)| ≤ Cεe−λ1 |r−ρ| , for all r ≥ 0. This estimate is not necessary for the existence result in Theorem 1.1, but it is useful to obtain Theorem 1.6, based on more refined arguments. 5. Proof of Theorem 1.1 First of all we prove the following lemma Lemma 5.1. For ε > 0 small, there is a constant C0 > 0 such that: εn−1 Iε (zρ,ε + wρ,ε ) = C0 M(ερ) + O(ε 2 ),
ρ ∈ Tε .
Proof. For brevity, we write z instead of zρ,ε and w instead of wρ,ε . One has
1 Iε (z + sw)[w]2 ds. Iε (z + w) = Iε (z) + Iε (z)[w] + 0
Using (23), (E1) and (E2) we infer Iε (z + w) = Iε (z) + O(ε 3−n ). On the other hand, recall that by definition zρ,ε (r) = φε (r) Uρ,ε (r − ρ), see Sect. 2, where φε is a cut-off function. Then z concentrates near ρ and one finds 2
∞ 2 zp+1 n−1 |z | + V (εr)z Iε (z) = − dr r 2 p+1 0
2 |U | + V (ερ)U 2 U p+1 = ρ n−1 − dr (1 + o(1)) . 2 p+1 R Moreover, U = Uρ,ε satisfies −U + V (ερ)U = U p and hence, see Sect. 2, U (r) = λ2/(p−1) U (λr),
λ2 = V (ερ).
450
A. Ambrosetti, A. Malchiodi, W.-M. Ni
It follows by a straightforward calculation that
2 U p+1 |U | + V (εr)U 2 − dr = C0 V θ (ερ), 2 p + 1 R p+1 1 1 1 p+1 U . − , C0 = − p−1 2 2 p+1 R Substituting into the preceding equations we find
where
θ=
Iε (z + w) = C0 ρ n−1 V θ (ερ) + O(ε 3−n ). Recalling the definition of M we get Iε (z + w) =
C0 C0 (ερ)n−1 V θ (ερ) + O(ε 3−n ) = n−1 M(ερ) + O(ε 3−n ), ε n−1 ε
and the lemma follows.
Proof of Theorem 1.1 completed. Let us first consider the case p ∈ 1, n+2 n−2 . By Lemma 5.1, if r is a maximum (resp. minimum) of M then ε (ρ) = Iε (zρe ,ε + wρe ,ε ) will possess a maximum (resp. minimum) at some ρε ∼ r/ε, with ρε ∈ Tε . Using Proposition 2.2, such a stationary point of ε gives rise to a critical point uε = zρe ,ε + wρε ,ε , which is a (radial) solution of (15). By a rescaling, we infer that uε (r) = uε (r/ε) is a radial solution of (3). Since uε (r) ∼ U (r − ρε ) ∼ U (r − r/ε), then uε (r) ∼ U ((r − r)/ε) and hence uε concentrates near the sphere |x| = r. Let us now consider the case p > n+2 n−2 . The proof in this situation is done using some truncation for the nonlinear term, and then proving a-priori L∞ estimates on the solutions. We list the modifications to the above proof which are necessary to handle this case. For K > 0, we define a smooth positive function FK : R → R such that FK (t) = |t|p+1 Let Iε,K :
Hr1
for |t| ≤ K;
FK (t) = (K + 1)p+1
→ R be the functional obtained substituting 1 p−1
for |t| ≥ K + 1.
|u|p+1
with FK (u) in Iε ,
and let K0 = (sup V ) . We note that by the definition of Uλ and zρ,ε , see Sect. 2, it is zρ,ε ≤ K0 for all ρ ∈ Tε and ε sufficiently small. (z) remains invertible and its In the above notation, if K ≥ K0 , the operator P Iε,K inverse Aε has uniformly bounded norm, independent of K. In fact, Proposition 2.1, Lemma 4.1 and Proposition 4.2 are based on local arguments and remain unchanged. Moreover, if K ≥ K0 + γ (see the definition (20) of Cε ) and using the pointwise bounds on |w(r)| stated in Eq. (24), one readily checks that the estimates (E1) − (E5) in Sect. 3 involving Iε (z + w) and Iε (z + w) are also independent of K. As a consequence Steps 1 and 2 in Proposition 4.3 hold true. As far as Step 3 is concerned, the decay estimates do not need any modification, and it is sufficient to choose γ and M as above. This completes the proof of Theorem 1.1. Remark 5.2. In the case that M satisfies (M) we use again Lemma 5.1 to infer that there exists r ∈ [a − δ, b + δ] such that ε (ρ) possesses at least a maximum or a minimum at some ρε ∼ r/ε. According to Proposition 2.2 this stationary point of ε still gives rise to a radial solution of (3) concentrating near the sphere |x| = r. This proves what is stated in Remark 1.2.
Singularly Perturbed Elliptic Problems, Concentration on Spheres
451
Proof of Corollary 1.3. In view of the behavior of M at r = 0 and as r → ∞, in both cases (i) and (ii), it follows that M possesses at least a local maximum and a local minimum. Let us remark that when (1.3) holds these stationary points of M could possibly be degenerate; however, in any case, (M) holds. Then an application of Theorem 1.1 (jointly with Remark 1.2, if necessary) yields the existence of a pair of solutions to (3).
6. Necessary Conditions for Concentration In this section we will prove Theorem 1.5. The first part of the proof is similar to that of Theorem 3.1, Steps 1 and 2, of [39], and therefore we will be sketchy. Let uε (r) be a solution of (3) concentrating near the sphere |x| = r, in the sense of (5), namely ∀ δ > 0, ∃ ε0 > 0 and R > 0 such that uε (r) ≤ δ,
for ε ≤ ε0 , and |r − r| ≥ εR.
As in [39], one can show that uε has a unique maximum at rε → rˆ , and uε L∞ ≤ c1 .
(69)
Let uε (r) = uε (εr) denote the corresponding solution of (15). By (69) uε (· − rε /ε) 2 (R) and converges, up to a subsequence, to some u0 in Cloc r) u0 = u0 . − u0 + V ( p
Moreover, by the properties of uε , it follows immediately u0 is positive and has a maximum point at 0. Hence, using the notation introduced in Sect. 2, we can say that u0 = Uλ ,
with λ2 = V ( r).
It remains to show that M ( r) = 0 and this is proved in the last part of this section. Let us write (15) in the form − r n−1 uε + r n−1 V (εr) uε = r n−1 upε . Multiplying by uε and integrating over [0, R], R > 0, we get
0= 0
R
R
R uε uε dr − r n−1 V (εr) uε uε dr + r n−1 upε uε dr . r n−1 0 0
(I )
(I I )
(70)
(I I I )
Let us evaluate separately the three integrals. 2 uε (R) − (I ) = R n−1
2 1 uε (R) − r n−1 uε uε dr = R n−1 2 0
2 n − 1 R n−2 2 1 n−1 = R r [ uε ] dr, uε (R) + 2 2 0 R
0
R
uε ]2 dr r n−1 [
452
A. Ambrosetti, A. Malchiodi, W.-M. Ni
1 R n−1 r V (εr) u2ε dr 2 0
1 n−1 1 R n−1 2 = R V (εR) ( uε (R)) − V (εr) u2ε dr r 2 2 0
1 n−1 1 R 2 = R V (εR) ( (n − 1)r n−2 V (εr) + εr n−1 V (εr) uε (R)) − u2ε dr, 2 2 0 !
R p+1 uε 1 n − 1 R n−2 p+1 n−1 r dr = uε (R)]p+1 − r uε dr. R n−1 [ (I I I ) = p+1 p+1 p+1 0 0 (I I ) =
Substituting into (70) we find
1 R 2 n − 1 R n−2 2 r [ uε ] dr + uε (n − 1)r n−2 V (εr) + εr n−1 V (εr) dr 0= 2 2 0 0
R n−1 − r n−2 up+1 dr + ε (R), ε p+1 0 where ε (R) =
2 1 1 n−1 1 uε (R) − R n−1 V (εR) ( R R n−1 [ uε (R))2 + uε (R)]p+1 . 2 2 p+1
uε have an exponential decay, then ε (R) → 0 as R → ∞ (depending Since uε and on ε) and we get, for any ε small,
n − 1 +∞ n−2 2 1 +∞ r [ uε ] dr + u2ε dr (n − 1)r n−2 V (εr) + εr n−1 V (εr) 2 2 0 0
n − 1 +∞ n−2 p+1 r uε dr. (71) = p+1 0 Now, recall that (i) uε concentrates at r/ε, (ii) uε (· − r/ε) → u0 = Uλ , with λ2 = V ( r). This implies
+∞
(Uλ )2 dr,
r n−2 V (εr) u2ε dr ∼ ( r/ε)n−2 V ( r) Uλ2 dr, R 0
+∞ p+1 r n−2 up+1 dr ∼ ( r/ε)n−2 Uλ dr, ε 0 +∞
r n−2 [ uε ]2 dr ∼ ( r/ε)n−2
R
R
0
as well as
+∞
n−1 2 n−1 2 n−2 ε r V (εr) uε dr ∼ ε ( r/ε) V ( r) Uλ dr = r ( r/ε) V ( r) Uλ2 dr. 0
R
R
(72)
Singularly Perturbed Elliptic Problems, Concentration on Spheres
Let us now use Eqs. (13), see Sect. 2, with λ2 = V ( r), namely
1 1 p+1 + V ( r) Uλ2 dr = Uλ dr, 2 p + 1 R R
1 1 p+1 2 − (Uλ ) dr = Uλ dr. 2 p + 1 R R
453
(73) (74)
Inserting (73) and (74) into the preceding equations, we infer
+∞ 1 1 p+1 n−2 2 r [ uε ] dr ∼ Uλ dr, − r/ε)n−2 ( 2 p+1 R 0
+∞ 1 1 p+1 r n−2 V (εr) u2ε dr ∼ Uλ dr. + r/ε)n−2 ( 2 p+1 R 0 Then we find
n − 1 +∞ n−2 2 n − 1 +∞ n−2 n − 1 +∞ n−2 p+1 r [ uε ] dr + r V (εr) u2ε dr − r uε dr 2 2 p+1 0 0 0 1 1 p+1 ∼ (n − 1) ( r/ε)n−2 U dr. − 2 p+1 R λ (75) Inserting (72) and (75) into (71) we get
1 1 p+1 n−2 1 n−2 − r ( r/ε) U dr = V ( r) Uλ2 dr + o(ε 2−n ). (n − 1) ( r/ε) 2 p+1 R λ 2 R (76) Letting ε → 0 and dividing by r n−2 we deduce
1 p−1 p+1 Uλ dr = r) Uλ2 dr. rV ( (n − 1) 2(p + 1) R 2 R Using again (73) we get
1 p−1 2 r) Uλ2 dr, V ( r) Uλ dr = rV ( (n − 1) p+3 2 R R
namely (n − 1)V ( r) =
p+3 r). rV ( 2(p − 1)
r) = θ rV ( r) Since θ = (p + 1)(p − 1)−1 − 1/2 = (p + 3)/2(p − 1), we find (n − 1)V ( which is equivalent to M ( r) = 0, as required. This completes the proof of Theorem 1.5. 7. Bifurcation of Non-radial Solutions In this section we study bifurcation of non-radial solutions from a family of radial solutions concentrating on spheres. We will assume first that the exponent p lies in the range n+2 1, n−2 , so all the functionals involved are well-defined. The general case p > n+2 n−2 will be handled by a truncation procedure as before.
454
A. Ambrosetti, A. Malchiodi, W.-M. Ni
7.1. About the Morse index. We have the following result. Proposition 7.1. Let uε be the family of radial solutions uε of (3) having the form uε = zρε ,ε + wρε ,ε ,
for some ρε ∼
r0 , ε
where wρε ,e ∈ Cε (see Eq. (20)). Then the Morse index of Iε (uε ) tends to infinity as ε goes to zero. Proof. Let {ϕj }j , j ∈ N denote the eigenfunctions of − on S n−1 with eigenvalues {µj }j , where the µj ’s are chosen non-decreasing in j . In particular we have
2 |∇ϕj | = µj ϕj2 , j ≥ 1. (77) S n−1
S n−1
Recalling the definition of the cutoff function φε in Sect. 2, let us define the function vj ∈ H 1 (Rn ) as vj (r, θ ) = uε (r)φε (r)ϕj (θ ), We note that
r ≥ 0, θ ∈ S n−1 , j ≥ 1.
1 ∇vj = ϕj (θ ) (φε uε ) (r), uε (r)φε (r)∇S n−1 ϕj (θ ) , r
(78)
and hence using polar coordinates we deduce (vi , vj ) = 0 Moreover it turns out that Iε (uε )[vj , vj ] where
C1,ε =
C2,ε =
for i = j.
= C1,ε
2 r n−1 (φε uε ) +
S n−1
(79)
ϕj2
+ C2,ε
S n−1
|∇ϕj |2 ,
r n−1 V (εr)(φε uε )2 − p
(80)
2 r n−1 up+1 φ ε ε ,
r n−3 (φε uε )2 .
From the estimates of the previous sections one easily finds r n−1 r n−3 0 0 C1,ε ∼ I V (r0 ) [UV (r0 ) , UV (r0 ) ] < 0; C2,ε ∼ UV2 (r0 ) . ε ε R From (77) and the last equations it follows that, for any fixed j ≥ 1, for ε small one has r n−1 1 0 Iε (uε )[vi , vi ] ≤ ωn−1 I V (r0 ) [UV (r0 ) , UV (r0 ) ] < 0, for all i ≤ j. 2 ε Hence setting
Mj = span {vi : i = 1, . . . , j } ,
from the orthogonality relation (79) it follows that Iε (uε ) is negative-definite on Mj for ε sufficiently small. This concludes the proof.
Singularly Perturbed Elliptic Problems, Concentration on Spheres
455
Remark 7.2. (a) From the Weyl’s asymptotic formula, see for example [11], it turns out that on S n−1 one has 2 for j large. µj ∼ j n−1 , Hence, using the above argument, one can prove that the Morse index of Iε (uε ) is of order greater than or equal to ε −(n−1) . This is in agreement with the result in [27], where solutions concentrate along 1-dimensional sets, and their Morse index is indeed of order ε−1 . (b) Let V = λ+Q and let uε,λ denote a solution of (3), as in Proposition 7.1. The study of the orbital stability of the corresponding solitary wave ψ(t, x) = exp(iλε−1 t)uε,λ (x) for the Schr¨odinger evolution equation (NLS) can be done using the results of [21]. First of all, in order to use the Stability Theorem, see [21] pag. 309, the Morse index of uε,λ should be not larger than 1, which is not the case here, for ε small. On the other hand, the Instability Theorem of [21] implies that ψ is orbitally unstable provided that
∂ |uε,λ (x)|2 dx > 0, (resp. < 0), ∂λ Rn and that the Morse index of uε,λ is greater than 1 and odd (resp. even). For results dealing with orbital stability and instability of solitary waves corresponding to spike-layer solutions, see also [33]. 7.2. Some formal expansions. In this subsection we derive some formal expansions for the solutions of (3) and for the eigenvalues of the linearized operator Iε . These expansions will be used in the next subsection to obtain rigorous proofs. In the following, for some critical point r of M, we will write 1 1 z(s) = V (r) p−1 U V (r) 2 s . In the next lemma we will solve formally Eq. (15) in power series of ε. The solution will be found in the form uε (r) = z(r − r/ε − β) + w(r − r/ε − β), with w perpendicular to z with respect to the scalar product
!u, v" = u v + V (r)uv. R
To be rigorous, the functions we are dealing with are defined only on the positive real axis. But since we start with purely formal computations, we can suppose they are defined on all of R. In the final step of the proof, a cutoff function will be applied. The above scalar product ! , " will be always used throughout this subsection. We will use formal expansions for both β and w. Lemma 7.3. Suppose uε is a solution of (15) of the form zρ (r −r/ε−β)+w(r −r/ε−β), with M (r) = 0 and w ∈ Cε . Then, expanding formally w as w(s) = εw1 (s)+ε 2 w2 (s)+ o(ε2 ), where s = r − r/ε − β, and expanding β as β = εβ1 + ε 2 β2 , the functions w1 and w2 satisfy the equations n−1 dz L r w1 = z = z − sV (r)z; , (81) r dr
456
A. Ambrosetti, A. Malchiodi, W.-M. Ni
Lr w2 =
n−1 1 w1 − sV (r)w1 − s 2 V (r)z r 2 1 (n − 1)s + p(p − 1)zp−2 w12 − z − V (r)β1 z, 2 r2
(82)
where Lr denotes the operator Lr u = −u + V (r)u − pzρp−1 u. The function w1 can be determined explicitly and is of the form V (r) s 2 z sz w1 = + + bz , V (r) 4 p−1
(83)
for some real number b. Suppose also that M (r) = 0. Then the above function w(s) can be expanded at the third order in ε for a suitable value of the coefficient β1 . Let us remark that above the term bz is added to obtain w1 ⊥ z . The value of b does not need to be computed explicitly. Proof. The function uε (r) solves the differential equation −uε (r) + V (εr)uε (r) = (uε (r))p . We make the change of variables s = r − r/ε − β, so r = s + r/ε + β. We obtain 1 V (εr) = V (εs + ε 2 β1 + r) = V (r) + εV (r)(s + εβ1 ) + ε 2 V (r)s 2 + o(ε 2 ), (84) 2 n−1 (n − 1)s n−1 n−1 + o(ε 2 ) = ε − ε2 + o(ε 2 ). (85) = r/ε + s r r r2 If we expand Eq. (15) at order ε2 using (84) and (85), we get n−1 2 2 (n − 1)s 2 z − z (s) − εw1 (s) − ε w2 (s) − ε (s) + εw (s) + ε w (s) −ε 1 2 r r2 1 + V (r) + εV (r)(s + β) + ε 2 V (r)s 2 + ε 2 V (r)β1 z(s) + εw1 (s) + ε 2 w2 (s) 2 p 2 = z(s) + εw1 (s) + ε w2 (s) + o(ε 2 ). From this equation we deduce immediately (81) and (82). Let us now prove (83). Differentiating the equation −z + V (r)z = zp and using some elementary computations we find Lr z = 0,
Lr z = p(p − 1)zp−2 (z )2 ,
(86)
Lr z = −(p − 1)zp .
(87)
Lr (sz ) = −2z , Using (86) we also obtain Lr ((r − ρ)z) = −2z − (p − 1)(r − ρ)zp ,
Lr ((r − ρ)2 z ) = −2z − 4(r − ρ)z . (88)
Singularly Perturbed Elliptic Problems, Concentration on Spheres
457
From (87) and (88) it follows that (r − ρ)z 1 2 V V V (r) (r − ρ)2 z Lr + =− + z − (r−ρ) z + zp . V (r) 4 p−1 2 p−1 V V Using the condition M (r) = 0 we derive 1 V n−1 1 V Lr (r − ρ)2 z + (r − ρ)z = z − (r − ρ)V (r)z, 4V p−1 V r which implies (83). Now we verify the solvability for w1 and w2 . A necessary condition is that the r.h.s. in (81) and (82) is orthogonal to z . Multiplying the r.h.s. of (81) by z and integrating by parts we get
n−1 n−1 1 (z )2 − V (r) szz = (z )2 + V (r) z2 . r r 2 From the condition M (r) = 0 and from (13) we obtain that the r.h.s. is zero, hence there are no obstructions for the solvability in w1 . On the other hand, multiplying the r.h.s. of (82) by z and using some elementary calculation we get
n−1 1 1 2 w1 z − V (r) w1 sz − V (r) s zz + p(p − 1) zp−2 z w12 . r 2 2 Integrating by parts and using (81) we obtain
1 1 p(p − 1) zp−2 z w12 = p (zp−1 ) w12 = −p zp−1 w1 w1 2 2
n−1 = w1 Lr z + w1 − V (r)w1 = w1 z −sV (r)z . r Hence our expression becomes
n−1 n−1 1 2 w1 z − V (r) w1 sz − V (r) s zz + w1 z − sV (r)z . r 2 r We note now that all the integrands in the above expression are odd functions, and hence vanish. This implies the solvability for w2 . Now we expand the equation up to the third order in ε. Writing uε = z + εw1 + ε2 w2 + ε3 w3 + o(ε3 ) and β = εb1 + ε2 β2 + ε3 β3 , we get the following equation for w3 : Lr w3 = f, where
1 1 2 3 f = −V (r)w2 − V (r)β1 + V (r)s w1 − V (r)β1 s + V (r)s z 2 6 2 n−1 n−1 n−1 s β1 + w2 − V (r)β2 z − 2 sw1 + − 2 r r r r r 1 +p(p − 1)zp−2 w1 w2 + p(p − 1)(p − 2)zp−3 w13 . 6
458
A. Ambrosetti, A. Malchiodi, W.-M. Ni
The last equation is solvable if and only if !f, z " = 0. We are going to show that, under the assumption M (r) = 0, this is the case for a suitable choice of β1 . We write !f, z " = f1 + β1 f2 , where the term β1 f2 includes all the terms in !f, z " where β1 appears. We note that β1 appears also in the expression of w2 , see Eq. (82). From (86) and (87), if we set w˜ =
z s + , z p−1
then
Lr w˜ = −V (r)z.
It follows that
V (r) ˜ β1 w, V (r) where the function w2 does not contain the coefficient β1 . Using this equation we obtain
V (r)2 n − 1 V (r) s wz ˜ − V (r) w1 z − V (r) szz + w˜ z f2 = V (r) r V (r)
n−1 V (r) 2 (z ) + p(p − 1) − 2 ˜ . zp−2 w1 wz V (r) r w2 = w 2 +
Using the explicit expression of w˜ and some integration by parts we are left with
V (r) n − 1 1 n−1 2 (z )2 + (z )2 − 2 f2 = V (r) r 2 p−1 r
1 1 1 V (r)2 2 − z2 . + V (r) z + 2 V (r) p − 1 4 From Eq. (13) we get 2θ (z )2 = V (r) z2 , and hence using the condition M (r) = 0 we deduce 1 V (r)2 V (r) f2 = V (r) − + z2 . 2 V (r) r On the other hand, using again the condition M (r) = 0 we obtain V (r) V (r) n−1 V (r)2 +θ M |r=r = θ V (r) − + . M (r) = r V (r) V (r) r The last two equations imply f2 = 0, which makes the equation !f, z " = 0 solvable setting β1 = − ff21 . This concludes the proof of the lemma. Remark 7.4. From the above computations we see that the 0th order term in the expansion of β vanishes. This fact can also be checked as follows. Setting s = r − r/ε − β0 − εβ1 − ε 2 β2 and looking at the equation for w2 , one finds that the equation is solvable if and only if M (r)β0 = 0. This equation always admits the trivial solution β0 = 0. In general, the expansion at order εk should determine the coefficient βk−2 . In the next lemma we consider formal expansions for the second eigenvalue (and the corresponding eigenfunction) of Iε (z + εw1 + ε 2 w2 ). This operator is a perturbation of I λ (Uλ ), where λ2 = V (r), see Sect. 2. The operator I λ (Uλ ) possesses a zero eigenvalue with multiplicity 1, with corresponding eigenfunction Uλ . In the present notation we have set z = Uλ . Therefore it is natural to expand the eigenvalues and the eigenfunctions of Iε (z + εw1 + ε 2 w2 ) starting from 0 and z respectively.
Singularly Perturbed Elliptic Problems, Concentration on Spheres
459
Lemma 7.5. Let w1 and w2 be defined as in Lemma 7.3. Let us consider the formal expansion for the second eigenvalue of Iε (z + εw1 + ε 2 w2 ) and of the corresponding eigenfunction Iε (z + εw1 + ε 2 w2 )[z + εz1 + ε 2 z2 ] = (ετ1 + ε 2 τ2 )(z + εz1 + ε 2 z2 ) + o(ε 2 ), where the function z is centered at the point r/ε − εβ1 − ε2 β2 . Then τ1 and τ2 are given by 2 z 1 θ . τ1 = 0, M (r) 2 τ2 = 2 V (r) (z ) + V (r)(z )2 In particular τ2 does not depend on the numbers β1 and β2 . Proof. Let us start by computing the eigenvalue equation for Iε (z + w) to the first order in ε. As before we write z = z(r − r/ε − εβ1 ) and we set s = r − r/ε − εβ1 . Writing the eigenvalue as ετ1 + o(ε) and the eigenvector as z + εz1 + o(ε), the equation becomes Iε (z + εw1 + o(ε))[z + εz1 + o(ε)] = (ετ1 + o(ε))(z + εz1 + o(ε)). As a partial differential equation this means −(z + εz1 ) + (V (r) + εsV (r))(z + εz1 ) − p(z + εw1 )p−1 (z + εz1 )
= (ετ1 ) −(z + εz1 ) + (V (r) + εsV (r))(z + εz1 ) + o(ε). Considering the coefficient of ε in the last equation we get Lr z1 −
n − 1 z + V (r)sz − p(p − 1)zp−2 w1 z = τ1 (−z + V (r)z ). r
We can obtain the value of τ1 multiplying the last equation by z . Using the fact that Lr is self-adjoint and Lr z = 0, we deduce τ1 !z , z " =
n − 1 !z , z " + V (r)!sz , z " − p(p − 1)!zp−2 w1 z , z " = 0, r
since all the scalar products involve an even function and an odd function (we recall that from (83) w1 is odd). Hence τ1 = 0 and z1 satisfies Lr z1 =
n − 1 z − V (r)sz + p(p − 1)zp−2 w1 z . r
Again this equation can be solved explicitly, as for w1 . Using (86) and −z +V (r)z = zp we obtain Lr (z ) = p(p − 1)zp−2 (z )2 , Lr (sz ) = −2z ; (89) Lr (s 2 z ) = p(p − 1)s 2 zp−2 (z )2 − 4sz − 2z , L sz = sL z − 2z = p(p − 1)szp−2 (z )2 − 2z . r r We claim that z1 =
1 V 2 1 V s z + sz + bz , 4V p−1 V
where b is the same as in (83). From (89) we get
(90)
460
A. Ambrosetti, A. Malchiodi, W.-M. Ni
Lr (z1 ) =
1 V p(p − 1)s 2 zp−2 (z )2 − 4sz − 2z 4V 1 V + (−2z ) + bp(p − 1)zp−2 (z )2 . p−1 V
(91)
From the condition M (r) = 0 and some elementary computations we obtain Lr z1 =
V n − 1 V 1 − sz + p(p − 1) s 2 zp−2 (z )2 + bp(p − 1)zp−2 (z )2 . r V 4 V
From (88) and from the explicit expression of w1 we see that z1 satisfies the required equation. Let us now turn to the expansion at the second order in ε. We expand the second eigenvalue of Iε (z + εw1 + ε2 w2 ) as ε 2 τ2 + o(ε 2 ) and the corresponding eigenvector as z + εz1 + ε 2 z2 + o(ε 2 ). We also write z = z(r − r/ε − εβ1 − ε 2 β2 ) and we set s = r − r/ε − εβ1 − ε 2 β2 . The eigenvalue equation becomes 1 −(z + εz1 + ε 2 z2 ) + V (r) + εsV (r) + ε 2 s 2 V (r) + ε 2 V (r)β1 2 × (z + εz1 + ε 2 z2 ) − p(z + εw1 + ε 2 w2 )p−1 (z + εz1 + ε 2 z2 ) = ε2 τ2 (−z + V (r)z ) + o(ε 2 ). Considering the coefficient of ε 2 we get n−1 1 n−1 z1 + s 2 V (r)z + sV (r)z1 − p(p − 1)zp−2 w2 z + 2 sz r 2 r 1 p−3 2 p−2 +V (r)β1 z − p(p − 1)(p − 2)z w1 z − p(p − 1)z w1 z 1 2 = τ2 (−z + V (r)z ).
L r z2 −
Multiplying by z and using again the equation Lr z = 0 we obtain
1 n−1 2 2 z1 z + V (r) s (z ) + V (r) sz1 z − p(p − 1) zp−2 w2 (z )2 − r 2
n−1 1 2 sz (z p(p − 1)(p − 2) zp−3 w12 (z )2 − p(p − 1) + z + V (r)β ) − 1 2 r 2 × zp−2 w1 z1 z = τ2 !z , z ". The term −p(p −1) zp−2 w2 (z )2 can be estimated using (86) and integrating by parts, and gives
p−2 2 w2 (z ) = − w2 Lr z = − z Lr w2 . −p(p − 1) z From the last two equations, from (82) and from the fact that z1 = w1 − V (r) 1 1 V (r) 2 sz + p−1 z we deduce τ2 !z , z " = σ1 + σ2 + σ3 + σ4 ,
(92)
Singularly Perturbed Elliptic Problems, Concentration on Spheres
where
461
1 1 V (r) s 2 (z )2 + V (r) s 2 zz , 2 2
1 σ2 = − p(p − 1) zp−2 z w12 2
1 p−3 2 2 − p(p − 1)(p − 2) z w1 (z ) − p(p − 1) zp−2 z w1 w1 , 2
σ3 = V (r) sw1 z + V (r) sw1 z
1 V (r) 1 +p(p − 1) zp−2 w1 z sz + z , V (r) 2 p−1
1 1 n − 1 V (r) 1 σ4 = z + z + sz r V (r) 2 p−1 2
2 V (r) 1 1 n−1 − sz z . (r − ρ)z sz + z +2 2 V (r) 2 p−1 r σ1 =
We chose these coefficients to highlight the number V (r), the quadratic terms in w1 , the linear terms in w1 , and the terms depending only on z, V (r), V (r). Integrating by parts we deduce
1 1 σ1 = V (r) z2 , σ2 = − p(p − 1) (93) zp−2 z w12 = 0. 2 2 Using integration by parts in the first two terms of σ3 and Eq. (89) we get
V (r) 1 1 V (r) σ3 = w1 −z + p(p − 1)zp−2 (z )2 s = w1 Lr sz V (r) 2 2 V (r) 1 V (r) = sz Lr w1 . 2 V (r) From Eq. (81) we finally obtain 1 V (r) σ3 = 2 V (r)
sz
n−1 z − sV (r)z . r
(94)
Using some integration by parts we deduce
1 V (r) n − 1 2 n−1 p−2 2 2 σ3 + σ4 = (z )2 . (z ) − V (r) (z ) − 2 2 V (r) r p−1 p−1 r Equation (13) and the condition M (r) = 0 imply n−1 V (r) +θ = 0; r V (r)
2θ
(z )2 = V (r)
z2 .
The last two equations and (93) yield
1 V (r)2 n−1 2 σ1 + σ2 + σ3 + σ4 = (z )2 , V (r) − z − 2 2 V (r) r and hence from (92) we obtain the conclusion.
462
A. Ambrosetti, A. Malchiodi, W.-M. Ni
7.3. Proof of Theorem 1.6. We just give a scheme of the proof, omitting some details. We will assume that the function zω defined below is multiplied by a suitable cutoff φε as in Sect. 2, so our estimates will be affected only by an error exponential in ε. Let us consider the following manifold Z = {zω } = z(s) + εw1 (s) + ε 2 w2 (s), s = r − r/ε − εω , where w1 , w2 are as in Lemma 7.3, and where ω belongs to some large fixed interval. From the above expansions one can check that Iε (zω ) ≤ Cε3 · ε (1−n)/2 for some constant C. The arguments in Sect. 4 and in Lemma 7.3 imply the following. For any zω ∈ Z there exists a function w such that w ≤ Cε3 · ε (1−n)/2 ,
!w, z " = 0,
Iε (zω + w) = C0 ε 3 (ω − β1 )z + o(ε 3 ),
for some positive constant C0 > 0. The above scalar product !·, ·" is defined in the previous subsection. In particular we will find a solution of Iε (zω + w) = 0 for some ω = β1 + o(1). This solution will be non-degenerate. To prove this fact one can reason as follows. Lemma 7.5 furnishes an approximation for the second eigenvalue of Iε (z+εw1 +ε 2 w2 ) and of its corresponding eigenvector. Since a solution uε of (15) satisfies Iε (uε )[uε ] = −(p−1)uε , the function z+εw1 +ε2 w2 is an approximation of order ε 2 to the first eigenvector of Iε (z+εw1 +ε 2 w2 ), and the first eigenvalue will be given by −(p −1)+o(ε 2 ). Using the Courant-Fischer min-max method on the two-dimensional space spanned by z + εw1 + ε 2 w2 and z + εz1 + ε 2 z2 , one can prove that the second eigenvalue of Iε (zω ) is again of the form ε2 τ2 + o(ε2 ). The next step consists in passing from Iε (zω ) to Iε (zω + w). We note that for any v1 , v2 ∈ Hr1 there holds
Iε (zω + w)[v1 , v2 ] − Iε (zω )[v1 , v2 ] = p |zω |p−1 − |zω + w|p−1 v1 v2 . (95) The function w can be estimated in the following way, reasoning as in Remark 4.5 |w(r)| ≤ Cε 3 e−λ1 |r−r/ε−εω| ,
for all r ∈ R.
(96)
When p ≥ 2, the function t → |t|p−1 is Lipschitz, so (95) and (96) imply that |Iε (zω + w)[v1 , v2 ] − Iε (zω )[v1 , v2 ]| ≤ Cε3 v1 v2 . To get a similar inequality for p < 2 we need a more accurate argument. From (9), (96) and some elementary computations one finds 1 for |x − r/ε| ≤ (3| log ε| − C), |zω |p−1 − |zω + w|p−1 ≤ Czp−2 w, λ0 − λ 1 for some positive constant C. As a consequence we have |zω |p−1 − |zω + w|p−1 ≤ Cε 3 e[(2−p)λ0 −λ1 ]|x−r/ε| , 1 for |x − r/ε| ≤ (3| log ε| − C). λ0 − λ 1
(97)
Singularly Perturbed Elliptic Problems, Concentration on Spheres
On the other hand, from (96) one finds λ (p−1) 3 1 |zω |p−1 − |zω + w|p−1 ≤ C|w|p−1 ≤ Cε λ0 −λ1 , 1 for |x − r/ε| ≥ (3| log ε| − C). λ0 − λ 1
463
(98)
If λ1 is chosen sufficiently close to λ0 , then (95), (97) and (98) imply again |Iε (zω + w)[v1 , v2 ] − Iε (zω )[v1 , v2 ]| ≤ Cε3 v1 v2 for some constant C. We have proved that the second eigenvalue of Iε (zω + w) is of the form ε2 λ2 + o(ε 2 ) for every ω in a fixed compact set of R. From this we deduce that there is an unique ω (ω = β1 + o(1)) for which Iε (zω + w) = 0, and the solution uε = zω + w of (15) is non-degenerate and locally unique in the class of radial functions. This implies that the set in Theorem 1.6 is a smooth curve. By Proposition 7.1 the Morse index of Iε (uε ), in the space H 1 (Rn ), diverges as ε → 0. To obtain the conclusion it is sufficient to apply the result of [24]. When p > n+2 n−2 it is sufficient to consider a cutoff function FK as in Sect. 5. The above proof yields bifurcation of non-radial solutions of Iε,K = 0. The ∞ L bounds on the radial solutions and standard regularity results imply that non-radial solutions which are sufficiently close to the radial ones (in H 1 (Rn )) are also uniformly bounded. Hence these critical points are also solutions of (15). Remark 7.6. Bifurcation of non-symmetric solutions of some classes of elliptic equa tions on Rn has been recently studied in [13, 35]. 8. Open Problems and Perspectives In this section we discuss some possible extensions of the results proved in the previous sections. Concentration on k-dimensional sets. Similarly to Theorem 1.1, one could ask about concentration at k-dimensional spheres, 1 ≤ k ≤ n−1. In such a case, the corresponding limit problem is of the form (see (11)) p in Rn−k , −Uλ,k + λ2 Uλ,k = Uλ,k (99) Uλ,k > 0, Uλ,k ∈ H 1 (Rn−k ). Here λ2 = V (εr k ) is the potential at the concentration radius (to be found) and the exponent p is subcritical w.r.t. Rn−k , namely 1 < p < n−k+2 n−k−2 if n − k > 2, 1 < p if n − k ≤ 2. From a simple rescaling argument one finds 2
Uλ,k (x) = λ p−1 U k (λx),
x ∈ Rn−k ,
1 1 1 p+1 2 |∇Uλ,k |2 + λ2 Uλ,k − Uλ,k n−k n−k 2 Rn−k 2 p + 1 R R 1 1 = − λ2θk U k H 1 (Rn−k ) , 2 p+1
(100)
(101)
464
A. Ambrosetti, A. Malchiodi, W.-M. Ni
1 where θk = p+1 p−1 − 2 (n − k) and U k is the solution of (99) with λ = 1. Hence the energy of an approximate solution zρ of (15) which is concentrated near a k-dimensional sphere of radius ρ can be estimated as
E(zρ ) ∼ ρ k V θk (ερ).
(102)
As a consequence, solutions of (3) should concentrate at critical points of the auxiliary functional Mk (r) := r k V θk (r). When k = n − 1, Mk coincides with M while for k = 0 the critical points of Mk coincide with those of V . The complete details are not carried out here.
Global analysis of bifurcation branches. In Theorem 1.6 we have proved that, on the curve of radial solutions concentrating at spheres, there are infinitely many bifurcations of non-radial solutions. It is natural to ask the following questions: 1) Are these bifurcations sets constituted by connected and global branches? 2) If so, what is the behavior of the solutions along the branches? Concerning the former question, standard tools of bifurcation theory should allow to perform a detailed local analysis near the bifurcation points. The proof of Theorem 1.6 suggests that the angular dependence of the non-radial solutions should be qualitatively related to the spherical harmonics on S n−1 , with higher and higher degree when ε tends to zero. See also [37], Sect. 12. The latter question has been investigated in [13] for a somewhat related problem, see also [35]. We remark that, if p is subcritical, then near local maxima of V there exist multi-spike solutions, see [23]. Furthermore, as anticipated in Remark 1.7, there exist spike or multi-spike solutions peaked at critical manifolds of V , if any. We wonder whether or not it is possible to reach these kinds of solutions along the non-radial branches.
The non-radial case. Below we outline the heuristic argument which could lead to find the k-dimensional manifolds ⊆ Rn where concentration should take place. Let us consider an approximate solution z of (1) which is concentrated near . In analogy with (102), the energy of such a solution (we are not rescaling in this case) could be expressed as
V θk dσ, (103) E(z ) ∼ εn−k
where dσ is the volume element of . We refer e.g. to the paper [38] for the geometric formulas used below. Let X denote a vector field perpendicular at . Then the Leibnitz rule and the classical formula for the variation of the area yields d θk d dσ + V θk V dσ dX dX
= εn−k ∇X V θk − V θk H · X dσ,
d E(z ) = εn−k dX
(104)
Singularly Perturbed Elliptic Problems, Concentration on Spheres
465
where H denotes the mean-curvature vector of . We recall that, if x1 , . . . , xk are local coordinates orthonormal at a point p ∈ , and if F : x1 , . . . , xk → Rn denotes the immersion of (F (0) = p), then H(p) is given by H(p) =
k " ∂ 2F i=1
∂xi2
!⊥ |x=0
.
(105)
Here ⊥ denotes the component orthogonal to . From formula (104) the conditions of stationarity of become θk ∇ ⊥ V = V H,
(106)
where ∇ ⊥ denotes the component of V normal to . Note that when is a k-dimensional sphere then, by (105), conditions (106) and Mk (r) = 0 coincide. We conjecture that, under suitable non-degeneracy assumptions, (106) is a sufficient condition for the existence of solutions of (1) concentrating on . The main difference is that in the non-radial case the invertibility of P Iε (z), see Proposition 4.2, requires a more delicate analysis. In particular, we suspect that concentration occurs in general along sequences εj → 0 as in [27]. Acknowledgements. A. A. and A. M. are supported by MURST, under the project Variational Methods and Nonlinear Differential Equations. A.M. is also supported by NSF under agreement No. DMS9729992. W.-M.N. is partially supported by the National Science Foundation. The authors wish to thank some institutions for hospitality and support. Precisely, A.A. wishes to thank the University of Paris VI, A.M. the University of Paris VI (under the European Grant ERB FMRX CT98 0201), the University of Minnesota and S.I.S.S.A., and W.-M.N., S.I.S.S.A.
References 1. Ambrosetti,A., Badiale, M.: Homoclinics: Poincar´e-Melnikov type results via a variational approach. Ann. Inst. H. Poincar´e Analyse Non Lin´eaire 15, 233–252 (1998) 2. Ambrosetti, A., Badiale, M.: Variational perturbative methods and bifurcation of bound states from the essential spectrum. Proc. Royal Soc. Edinburg 128-A, 1131–1161 (1998) 3. Ambrosetti, A., Badiale, M., Cingolani, S.: Semiclassical states of nonlinear Schr¨odinger equations. Arch. Rational Mech. Anal. 140, 285–300 (1997) 4. Ambrosetti, A., Malchiodi, A., Ni, W.-M.: Solutions concentrating on spheres to symmetric singularly perturbed problems. C. Rendus Acad. Sci. 335, 145–150 (2002) 5. Ambrosetti, A., Malchiodi, A., Ni, W.-M.: Singularly Perturbed Elliptic Equations with Symmetry: Existence of Solutions Concentrating on Spheres, Part II. Preprint, 2002 6. Ambrosetti,A., Malchiodi,A., Secchi, S.: Multiplicity results for some nonlinear singularly perturbed elliptic problems on R n . Arch. Rat. Mech. Anal., 159, 253–271 (2001) 7. Badiale, M., D’Aprile, T.: Concentration around a sphere for a singularly perturbed Schr¨odinger equation. Nonlin. Anal. T.M.A. 49, 947–985 (2002) 8. Benci, V., D’Aprile, T.: The semiclassical limit of the nonlinear Schr¨odinger equation in a radial potential. J. Differential Equations, 184, 109–138 (2002) 9. Berestycki, H., Lions, P.-L.: Nonlinear scalar field equations I: Existence of a ground state. Arch. Rational Mech. Anal. 82, 313–345 (1983) 10. Bressan, A.: Hyperbolic systems of conservation laws. Oxford Lecture Series in Math. and its Applications. Oxford: Oxford Univ. Press, 2000 11. Chavel, I.: Eigenvalues in Riemannian Geometry. London-New York: Academic Press, 1984 12. Cingolani, S., Pistoia,: Nonexistence of single blow-up solutions for a nonlinear Schr¨odinger equation involving critical Sobolev exponent. Zeit. angew. Mathematik und Physik. To appear 13. Dancer, N.: New solutions of equations on Rn . Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 30(3–4), 535–563 (2001)
466
A. Ambrosetti, A. Malchiodi, W.-M. Ni
14. D’Aprile, T.: On a class of solutions with non-vanishing angular momentum for nonlinear Schr¨odinger equations. Diff. Int. Eq. To appear 15. Del Pino, M., Felmer, P.: Local mountain passes for semilinear elliptic problems in unbounded domains. Calc. Var. 4, 121–137 (1996) 16. Del Pino, M., Felmer, P.: Semi-classcal states for nonlinear Schr¨odinger equations. J. Funct. Anal. 149, 245–265 (1997) 17. Floer, A., Weinstein, A.: Nonspreading wave packets for the cubic Schr¨odinger equation with a bounded potential. J. Funct. Anal. 69, 397–408 (1986) 18. Gidas, B., Ni, W.-M., Nirenberg, L.: Symmetry of positive solutions of nonlinear elliptic equations in Rn . In: Advances in Mathematics Supplementary Studies, Vol. 7-A, A. Nachbin (ed.), pp. 369–402 19. Gierer, A., Meinhardt, H.: A theory of biological pattern formation. Kybernetik 12, 30–39 (1972) 20. Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer- Verlag, 1983 21. Grillakis, M., Shatah, J., Strauss, W.: Stability theory of solitary waves in the presence of symmetry. II. J. Funct. Anal. 94, 308–348 (1990) 22. Grossi, M.: Some results on a class of nonlinear Schr¨odinger equations. Math. Zeit. 235(4), 687–705 (2000) 23. Kang, X., Wei, J.: On interacting bumps of semi-classical states of nonlinear Schr¨odinger equations. Adv. Diff. Eq. 5(7–9), 899–928 (2000) 24. Kielhofer, H.: A bifurcation theorem for potential operators. J. Funct. Anal. 77, 1–8 (1988) 25. Li, Y.Y.: On a singularly perturbed elliptic equation. Adv. Diff. Eq. 2(6), 955–980 (1997) 26. Malchiodi, A.: Adiabatic limits for some Newtonian systems in Rn . Asympt. Anal. 25, 149–181 (2001) 27. Malchiodi, A., Montenegro, M.: Boundary concentration phenomena for a singularly perturbed elliptic problem. Commun. Pure Appl. Math. 55, 1507–1568 (2002) 28. Lin, C.-S., Ni, W.-M., Takagi, I.: Large amplitude stationary solutions to a chemotaxis systems. J. Differential Equations 72, 1–27 (1988) 29. Ni, W.-M.: Diffusion, cross-diffusion, and their spike-layer steady states. Notices Amer. Math. Soc. 45(1), 9–18 (1998) 30. Ni, W.-M., Takagi, I.: On the shape of least-energy solution to a semilinear Neumann problem. Commun. Pure Appl. Math. 41, 819–851 (1991) 31. Ni, W.-M., Takagi, I.: Locating the peaks of least-energy solutions to a semilinear Neumann problem. Duke Math. J. 70, 247–281 (1993) 32. Ni, W.-M., Takagi, I., Yanagida, E.: Stability of least energy patterns of the shadow system for an activator-inhibitor model. Japan J. Ind. Appl. Math. 18(2), 259–272 (2001) 33. Oh, Y.-G.: Stability of semiclassical bound states of nonlinear Schroedinger equations with potentials. Commun. Math. Phys. 121, 11–33 (1989) 34. Oh, Y.-G.: On positive multibump states of nonlinear Schroedinger equation under multiple well potentials. Commun. Math. Phys. 131, 223–253 (1990) 35. Shi, J.: Semilinear Neumann boundary value problems on a rectangle. Trans. Am. Math. Soc. 354(8), 3117–3154 (2002) 36. Strauss, W.A.: Existence of solitary waves in higher dimensions, Commun. Math. Phys. 55, 149–162 (1977) 37. Turing, A.-M.: The Chemical Basis of Morphogenesis. Phil. Trans. Royal Soc. London, Series B, Biological Sciences 237, 37–72 (1952) 38. Wang, M.-T.: Mean Curvature Flows in Higher Codimension. Preprint, 2002 39. Wang, X.: On concentration of positive bound states of nonlinear Schr¨odinger equations. Commun. Math. Phys. 153, 229–243 (1993) Communicated by P. Constantin
Commun. Math. Phys. 235, 467–494 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0784-2
Communications in
Mathematical Physics
The Heun Equation and the Calogero-Moser-Sutherland System I: The Bethe Ansatz Method Kouichi Takemura Department of Mathematical Sciences, Yokohama City University, 22-2 Seto, Kanazawa-ku, Yokohama 236-0027, Japan. E-mail:
[email protected] Received: 28 September 2001 / Accepted: 31 October 2002 Published online: 31 January 2003 – © Springer-Verlag 2003
Abstract: We propose and develop the Bethe Ansatz method for the Heun equation. As an application, holomorphy of the perturbation for the BC1 Inozemtsev model from the trigonometric model is proved.
1. Introduction Olshanetsky and Perelomov proposed a family of integrable quantum systems, which is called the Calogero-Moser-Sutherland system or the Olshanetsky-Perelomov system [6]. In the early 90’s, Ochiai, Oshima and Sekiguchi classified integrable models of quantum mechanics which are invariant under the action of a Weyl group with certain assumptions [5]. For the case of BN (N ≥ 3), the generic model coincides with the BCN Inozemtsev model. Oshima and Sekiguchi found that the eigenvalue problem for the Hamiltonian of the BC1 (one particle) Inozemtsev model is transformed to the Heun equation with full parameters [7]. Here, the Heun equation is a second-order Fuchsian differential equation with four regular singular points. In this paper we will investigate solutions to the Heun equation motivated by the analysis of the Calogero-Moser-Sutherland systems. In the papers [9, 4], holomorphy of the perturbation for the Calogero-Moser-Sutherland model of type AN from the trigonometric model was proved, and in [9] some results were obtained by applying the Bethe Ansatz method. Note that the Bethe Ansatz for the model of type AN was established by Felder and Varchenko [2] by investigating asymptotic behavior of an integral expression of a solution to the KZB equation. For the case of type A1 , the Bethe Ansatz method was found in the textbook of Whittaker and Watson [12], although it was covered in the chapter entitled “Lam´e’s equation”, and the phrase “Bethe Ansatz” was not used. Note that the Lam´e equation is a special case of the Heun equation [8].
468
K. Takemura
In this paper we will propose and develop the Bethe Ansatz method for the Heun equation, and holomorphy of the perturbation for the BC1 Inozemtsev model from the trigonometric model will be obtained. The method “Bethe Ansatz” appears frequently in physics. In particular the Bethe Ansatz method has great merits for investigating solvable lattice models [1]. In this paper, we use “Bethe Ansatz” in a somewhat restricted sense, similar to the usage in [2, 9]. The Bethe Ansatz method replaces the problem of finding eigenstates and eigenvalues of the Hamiltonian (for example Eq. (2.2)) with a problem of solving transcendental equations for a finite number of variables which are called the Bethe Ansatz equations (for example Eqs. (3.29)). In our case, we use elliptic functions to describe the Bethe Ansatz equation. It would be very difficult to solve the transcendental equations explicitly. Instead, by applying the trigonometric limit the transcendental equations tend to algebraic equations, which may be more easily solvable. On the other hand, the Hamiltonian of the BC1 Calogero-Sutherland model appears after applying the trigonometric limit, and the eigenstates for the BC1 Calogero-Sutherland model are described by Jacobi polynomials. Thus the Jacobi polynomials and the solutions to the Bethe Ansatz equation for the trigonometric case are closely connected, and they are related to the perturbation. In fact holomorphy of the perturbation for the BC1 Inozemtsev model from the trigonometric model follows from holomorphy of the solutions to the Bethe Ansatz equation with respect to a parameter of a period of elliptic functions. Now we examine the perturbation. Based on the Hamiltonian HCS of the BC1 Calogero-Sutherland model, the Hamiltonian H model can be regarded of the BC1 Inozemtsev i as a perturbed one, i.e. H = HCS +V0 + ∞ i=1 Vi (x)p , where V0 is a constant and Vi (x) (i ∈ Z≥1 ) are functions. We can calculate eigenvalues and eigenstates of H as a formal power series in p by the perturbation. For our case, holomorphy of the perturbation can be shown and it implies convergence of the formal power series in p if |p| is sufficiently small. This paper is organized as follows. In Sect. 2, we clarify the relationship between the Heun equation and the BC1 Inozemtsev system [7]. In Sect. 3, we introduce the Bethe Ansatz method for the BC1 Inozemtsev system, or equivalently for the Heun equation. In Sect. 3.1, invariant subspaces of doubly periodic functions are introduced. In Sect. 3.2, we obtain an integral representation of eigenfunctions of the BC1 Inozemtsev system. In Sect. 3.3, we introduce the Bethe Ansatz method for the BC1 Inozemtsev system and describe the Bethe Ansatz equation. In Sect. 3.4, we rewrite the Bethe Ansatz equation in terms of theta functions in order to consider the trigonometric limit. In Sect. 4, we consider the trigonometric limit and solve the trigonometric Bethe Ansatz equation. In Sect. 5, relationships between the Bethe Ansatz method and the spectral problem on L2 -space for the BC1 Inozemtsev system are clarified. Then holomorphy of the perturbation for the BC1 Inozemtsev model from the trigonometric model is obtained. A key observation is that holomorphy of the perturbation follows from holomorphy of the solutions to the Bethe Ansatz equation. In Sect. 6, we give comments, and in the appendix, we note definitions and formulae of elliptic functions. 2. The Heun Equation and the Hamiltonian of the BC1 Inozemtsev System The expression of the Heun equation in terms of elliptic functions was already known by Darboux in the 19th century, and this appeared in connection with the soliton theory by Verdier and Treibich [11]. In [7], the relationship between the eigenfunctions for the
Heun Equation
469
BC1 Inozemtsev model and the solutions to the Heun equation was demonstrated. In this section, we recall the relationship between the BC1 Inozemtsev model and the Heun equation. The Hamiltonian of the BC1 Inozemtsev model is given as follows: d2 + li (li + 1)℘ (x + ωi ), 2 dx 3
H := −
(2.1)
i=0
where ℘ (x) is the Weierstrass ℘-function with periods (2ω1 , 2ω3 ), ω0 = 0, ω2 = −(ω1 + ω3 ) and li (i = 0, 1, 2, 3) are coupling constants. In this paper, we will consider eigenvalues and eigenfunctions of the Hamiltonian H . Let f (x) be an eigenfunction of H with an eigenvalue E, i.e. 3 d2 (H − E)f (x) = − 2 + li (li + 1)℘ (x + ωi ) − E f (x) = 0. (2.2) dx i=0
We will make a correspondence between Eq. (2.2) and the Heun equation as was done in [7]. Set z = ℘ (x) and ei = ℘ (ωi ) (i = 1, 2, 3). By a straightforward calculation, we have 2 d2 d d 1 1 1 1 2 = ℘ (x) + + + . 2 2 dx dz 2 z − e1 z − e2 z − e3 dz Hence Eq. (2.2) is equivalent to the equation 2 d d 1 1 1 1 1 + + + − dz2 2 z − e1 z − e2 z − e3 dz 4(z − e1 )(z − e2 )(z − e3 ) 3 )(ei − ei ) − e (e i i · C˜ + l0 (l0 + 1)z + f˜(z) = 0, (2.3) li (li + 1) z − ei i=1
where i , i ∈ {1, 2, 3}, i < i , i and i are different from i, f˜(℘ (x)) = f (x) and C˜ = −E + 3i=1 li (li + 1)ei . Note that e1 + e2 + e3 = 0. Equation (2.3) has four singular points e1 , e2 , e3 and ∞ on the Riemann sphere and all singular points are regular. The exponents at z = ei (i = 1, 2, 3) are (li + 1)/2 and −li /2, and the exponents at ∞ are (l0 + 1)/2 and −l0 /2. The Heun equation is a standard form of a Fuchsian differential equation with four regular singularities. It is written as d 2 γ δ αβw − q d + + + + g(w) = 0, (2.4) dw w w − 1 w − t dw w(w − 1)(w − t) with the condition α + β + 1 = γ + δ + . It is easy to transform an arbitrary Fuchsian differential equation with four regular singularities into the Heun equation. Thereby Eq. (2.3) is transformed to the Heun equation. Conversely, if a Fuchsian differential equation with four regular singularities is given, we can transform it into Eq. (2.3) with suitable ei (i = 1, 2, 3) with the condition e1 + e2 + e3 = 0, li (i = 0, 1, 2, 3), and C˜ by changing the variable z → az+b cz+d and making the transformation f → (z − e1 )α1 (z − e2 )α2 (z − e3 )α3 f . Thus the relationship between the BC1 Inozemtsev model and the Heun equation is made.
470
K. Takemura
In this paper, we discover properties of the Heun equation by using an expression of elliptic functions especially for the case li ∈ Z≥0 (i = 0, 1, 2, 3). Note that a Fuchsian differential equation with four regular singularities with exponents (γi , δi ) (i = 0, 1, 2, 3) satisfying γi − δi ∈ Z + 21 is transformed to Eq. (2.3) with the condition li ∈ Z≥0 . 3. Bethe Ansatz Equation 3.1. The invariant subspace of doubly periodic functions. In this subsection we will find finite-dimensional invariant spaces of doubly periodic functions with respect to the Hamiltonian H (see Eq. (2.1)) for the case li ∈ Z≥0 (i = 0, 1, 2, 3) and analyze them. In the terminology of the Heun-type equation (2.3), doubly periodic functions in x correspond to algebraic functions in the parameter z(= ℘ (x)). Let F be the space spanned by meromorphic doubly periodic functions up to signs, namely F= F1 ,3 , (3.1) 1 ,3 =±1
F1 ,3 = {f (x): meromorphic |f (x + 2ω1 ) = 1 f (x), f (x + 2ω3 ) = 3 f (x)}, (3.2) where (2ω1 , 2ω3 ) are basic periods of elliptic functions. Let ki (i = 0, 1, 2, 3) be the rearrangement of li (i = 0, 1, 2, 3) such that k0 ≥ k1 ≥ k2 ≥ k3 (≥ 0). The dimension of the maximum finite-dimensional subspace in F which is invariant under the action of the Hamiltonian is given as follows. Theorem 3.1. (i) If the number k0 + k1 + k2 + k3 (= l0 + l1 + l2 + l3 ) is even and k0 + k3 ≥ k1 + k2 , then the dimension of the maximal finite-dimensional invariant subspace in F with respect to the action of the Hamiltonian H (see Eq. (2.1)) is 2k0 + 1. (ii) If the number k0 + k1 + k2 + k3 is even and k0 + k3 < k1 + k2 , then the dimension is k0 + k1 + k2 − k3 + 1. (iii) If the number k0 + k1 + k2 + k3 is odd and k0 ≥ k1 + k2 + k3 + 1, then the dimension is 2k0 + 1. (iv) If the number k0 + k1 + k2 + k3 is odd and k0 < k1 + k2 + k3 + 1, then the dimension is k0 + k1 + k2 + k3 + 2. Proof. First, by assuming the existence of the finite-dimensional invariant space W , we derive the properties of the space, and later we show its existence. Let W be a finite-dimensional invariant space in F and T (i) (i = 0, 1, 2, 3) be operators on F defined by T (i) f (x) = f (−x + 2ωi ). From the definitions of the operator H and the periodicity of the function which belongs to F, it follows that (T (i) )2 = 1, T (i) T (j ) = T (j ) T (i) , T (i) H = H T (i) and T (3) T (2) T (1) T (0) = 1 on F. Set
(3) a3 (2) a2 (1) a1 (0) a0 V = · W, (3.3) T T T T a0 ,a1 ,a2 ,a3 ∈{0,1}
then the space V is finite-dimensional and invariant under the actions of H and T (i) (i = 0, 1, 2, 3). From the commutativity of the operators T (i) (i = 0, 1, 2, 3), we have the decomposition V = V0 ,1 ,2 ,3 , (3.4) 0 ,1 ,2 ,3 =±1 0 1 2 3 =1
Heun Equation
471
where V0 ,1 ,2 ,3 = {f (x) ∈ V |T (i) f (x) = i f (x) for i = 0, 1, 2, 3} (0 , 1 , 2 , 3 = ±1). Let f (x) be an element of V0 ,1 ,2 ,3 . From the definition of T (i) and V0 ,1 ,2 ,3 , the function f (x + ωi ) (i = 0, 1, 2, 3) is odd or even. Hence f (x) =
∞ n=Mi
c
(i) 1− 2n+ 2 i
(x − ωi )2n+
1−i 2
(3.5)
,
for some Mi ∈ Z. The exponents of the operator H at x = ωi are −li and li + 1. Let αi i be one of the exponents at x = ωi such that αi + 1− 2 is divisible by 2. If c
(i)
2n+
1−i 2
j i = 0 for some n such that 2n + 1− 2 < αi then the functions H f (x − ωi )
(j = 0, 1, 2, . . . ) are linearly independent and there is a contradiction with finite(i) i dimensionality. Therefore if 2n + 1− 1−i = 0. 2 < αi then c 2n+
2
(i)
i = 0 for all i(∈ {0, 1, 2, 3}) and n such that 2n + 1− 2 < 1− (i) 2n+ 2 i αi . Then the function Hf (x) is expressed as Hf (x) = n∈Z c˜ 1−i (x − ωi )
Conversely assume c
1− 2n+ 2 i
2n+
(i = 0, 1, 2, 3) and c˜
(i)
= 0 for n such that 2n +
1− 2n+ 2 i
1−i 2
2
< αi , because αi is an
exponent of the operator H at x = ωi . Therefore the space V0 ,1 ,2 ,3 consists of functions whose coefficients of the expan(i) sion (see Eq. (3.5)) at x = ωi (i ∈ {0, 1, 2, 3}) satisfy c 1−i = 0 for all i and n such 2n+
2
i that 2n + 1− 2 < αi . From decomposition (3.4) and the condition 0 1 2 3 = 1, the number α0 + α1 + α2 + α3 must be even. Now we show that the non-zero finite dimensional vector space V0 ,1 ,2 ,3 exists iff 2 +α3 and α0 + α1 + α2 + α3 ∈ 2Z≤0 . Let d = − α0 +α1 +α 2
V˜0 ,1 ,2 ,3 =
d n=0 C
{0},
σ1 (x) σ (x)
α1
σ2 (x) σ (x)
α2
σ3 (x) σ (x)
α3
℘ (x)n , d ∈ Z≥0 ; otherwise,
(3.6)
where σi (x) are the co-sigma functions which are defined in the appendix. Then the space V˜0 ,1 ,2 ,3 is maximum, i.e. V0 ,1 ,2 ,3 ⊂ V˜0 ,1 ,2 ,3 and it is seen that H · V˜0 ,1 ,2 ,3 ⊂ V˜0 ,1 ,2 ,3 . Note that the dimension of the space V˜0 ,1 ,2 ,3 is d + 1. Set V˜ =
V˜0 ,1 ,2 ,3 ,
(3.7)
0 ,1 ,2 ,3 =±1 0 1 2 3 =1
then the space V˜ is the maximum finite-dimensional space spanned by functions in F and is invariant under the action of the Hamiltonian H . In particular, we have W ⊂ V ⊂ V˜ . The dimension of the space V˜ is the summation of the dimension of the spaces V˜0 ,1 ,2 ,3 for 0 1 2 3 = 1.
472
K. Takemura
If l0 + l1 + l2 + l3 is even, then the dimension of the space spanned by all finite dimensional invariant spaces in F is l0 + l1 + l2 + l3 l 0 + l 1 − l2 − l 3 + 1 + 2 2 l0 − l1 − l2 + l3 l0 − l1 + l2 − l3 . + (3.8) + 2 2 If l0 +l1 +l2 +l3 is odd, then the dimension of the space spanned by all finite dimensional invariant spaces in F is l0 + l 1 + l 2 − l 3 + 1 l 0 + l 1 − l 2 + l 3 + 1 + 2 2 l0 − l1 + l2 + l3 + 1 −l0 + l1 + l2 + l3 + 1 + . + (3.9) 2 2 Hence the theorem is obtained.
Remark. Theorem 3.1 and its proof is a generalization of [12, §23.41].
We will consider the multiplicities of eigenvalues of the Hamiltonian H on the space spanned by finite dimensional invariant spaces in F. We will obtain the following theorem, which will be used later. Theorem 3.2. Let V˜ be the space which has appeared in Theorem 3.1 (i.e. Eq. (3.7)). If two of li (i = 0, 1, 2, 3) are zero, then the roots of the characteristic polynomial of the operator H (see Eq. (2.1)) on the space V˜ are distinct for generic ω1 and ω3 . Proof. By shifting the variable x → x + ωi and rearranging the periods, it is sufficient to show the case l2 = l3 = 0 and l0 ≥ l1 . Set (z) = (z − e1 )α˜ 1 (z − e2 )α˜ 2 (z − e3 )α˜ 3 , where α˜ i = −li /2 or (li + 1)/2 for each i(∈ {1, 2, 3}), and f˜(z) = (z)f (z). If the function f˜(z) satisfies Eq. (2.3), then 3 (α˜ 1 + α˜ 2 + α˜ 3 − l20 )(α˜ 1 + α˜ 2 + α˜ 3 + l0 +1 d 2 f (z) 2α˜ i + 21 df (z) 2 )z + + dz2 z − ei dz (z − e1 )(z − e2 )(z − e3 ) i=1 E ˜ 2 + α˜ 3 )2 − e2 (α˜ 1 + α˜ 3 )2 − e3 (α˜ 1 + α˜ 2 )2 4 − e1 (α f (z) = 0. (3.10) + (z − e1 )(z − e2 )(z − e3 ) (0, 1/2) (α˜ 2 = 0) Note that the exponents of Eq. (3.10) at z = e2 are . Write (−1/2, 0) (α˜ 2 = 1/2) ∞ f (z) = r=0 cr (z − e2 )r , then the recursive relations for coefficients cr are given as follows: 1 (e2 − e1 )(e2 − e3 )(2α˜ 2 + )c1 + qc0 = 0, 2
(3.11)
1 (r − 1 + γ1 )(r − 1 + γ2 )cr−1 + (e2 − e1 )(e2 − e3 )(r + 1)(r + 2α˜ 2 + )cr+1 2 + (((2α˜ 2 + 2α˜ 3 + r)(e2 − e1 ) + (2α˜ 2 + 2α˜ 1 + r)(e2 − e3 ))r + q)cr = 0, (3.12)
Heun Equation
473
E where r ≥ 1, γ1 = α˜ 1 + α˜ 2 + α˜ 3 − l20 , γ2 = α˜ 1 + α˜ 2 + α˜ 3 + l0 +1 ˜2 + 2 and q = 4 − e1 (α 2 2 2 α˜ 3 ) − e2 (α˜ 1 + α˜ 3 ) − e3 (α˜ 1 + α˜ 2 ) + e2 γ1 γ2 . Either γ1 or γ2 is an integer. Let this integer be denoted by γ˜ . The coefficients cr are polynomial in E of degree r. We denote cr by cr (E). If E is a solution to the equation cr (E) = 0 for the positive integer r = 1 − γ˜ , then it is seen that c2−γ˜ (E) = 0 from the recursive relation (3.12), and also cr (E) = 0 for r ≥ 2 − γ˜ . For the value E, the power series f (z) = r cr (E)(z − e2 )r is a finite summation and the function f (℘ (x)) is meromorphic and doubly periodic. If l0 + l1 is even, it occurs that (α˜ 1 , α˜ 2 , α˜ 3 ) = (−l1 /2, 0, 0), (−l1 /2, 1/2, 1/2), ((l1 + 1)/2, 1/2, 0), ((l1 + 1)/2, 0, 1/2) and the degrees of the polynomials c1−γ˜ (E) are (l0 + l1 + 1)/2, (l0 + l1 )/2, (l0 − l1 )/2, (l0 − l1 )/2 respectively. We denote these polynomials by c(1) (E), c(2) (E), c(3) (E), and c(4) (E). Since α˜i (i = 0, 1, 2, 3) belongs to 21 Z, the function (℘ (x))f (℘ (x)) is doubly periodic up to signs. Let E1 (resp. E2 ) be a solution to c(i1 ) (E) = 0 (resp. c(i2 ) (E) = 0) for some i1 (resp. i2 ) and let f1 (z) (resp. f2 (z)) be the corresponding solution to Eq. (3.10) which is determined by recursive relations (3.11, 3.12). The periodicity of the functions (℘ (x))f1 (℘ (x)) and (℘ (x))f2 (℘ (x)) , (i.e. the signs of (℘ (x + ωi ))f1 (℘ (x + ωi ))/ (℘ (x))f1 (℘ (x)) and (℘ (x + ωi ))f2 (℘ (x + ωi ))/ (℘ (x))f2 (℘ (x))(i = 1, 3)) are different if i1 = i2 . Assume that E is a common solution to c(i) (E) = 0 and c(j ) (E) = 0 for i = j . Let f˜(i) (z) (resp. f˜(j ) (z)) be the solution to the differential equation (3.10) related to the value E. Then the functions f˜(i) (℘ (x)) and f˜(j ) (℘ (x)) are the basis of solutions to (2.2). However, this contradicts the periodicity of the solutions when considering exponents. Hence, we proved that the equations c(i) (E) = 0 and c(j ) (E) = 0 (i = j ) do not have common solutions. The degree of the polynomial c(1) (E)c(2) (E)c(3) (E)c(4) (E) is equal to the dimension of the space V˜ (see Eq. (3.7)). If we show that the roots of the equation c(i) (E) = 0 (i = 1, 2, 3, 4) are distinct for generic ω1 and ω3 , the theorem is proved. It is enough to show that if e1 , e2 , e3 are real and (e2 −e1 )(e2 −e3 ) < 0 then the roots of the equation c(i) (E) = 0(i = 1, 2, 3, 4) are real and distinct for the case (α˜ 1 , α˜ 2 , α˜ 3 ) = (−l1 /2, 0, 0), (−l1 /2, 1/2, 1/2), ((l1 + 1)/2, 1/2, 0), ((l1 + 1)/2, 0, 1/2). (r) We will show that the polynomial cr (E) (1 ≤ r ≤ 1 − γ˜ ) has r real distinct roots si (r) (r−1) (r) (r−1) (r) (r−1) (r) (i = 1, . . . , r) such that s1 < s1 < s2 < s2 < · · · < sr−1 < sr−1 < sr by induction on r. The case r = 1 is trivial. Let k ∈ Z≥1 and assume that the state(k) (k−1) (k) ment is true for r ≤ k. From the assumption of the induction, s1 < s1 < s2 < (k−1) (k−1) (k) < · · · < sk−1 < sk . It is immediate from (3.12) that the leading term of s2 (k−1)
the polynomial cr (E) in E is positive. Since si (k−1)
ck−1 (si
(k)
(k−1)
(k−1)
) = ck−1 (si+1 ) = 0 and si
(k)
< si
(k−1)
and si+1
satisfy the equation
(k−1)
< si+1 , the sign of the value (k)
ck−1 (si ) is (−1)k−i . By the recursion relation (3.12) and the equation ck (si ) = 0, (k) (k) the sign of the value ck+1 (si ) is opposite to that of ck−1 (si ). Combined with the asymptotics of ck+1 (E) for E → +∞ and E → −∞ and degE ck+1 (E) = k + 1, it follows that the polynomial ck+1 (E) has k + 1 real distinct roots and the inequality (k+1) (k) (k+1) (k) (k) (k+1) < s1 < s2 < s2 < · · · < sk < sk+1 is satisfied. s1 For the case that l0 + l1 is odd, the proof is similar.
The following proposition was essentially proved in the proof of the previous theorem.
474
K. Takemura
Proposition 3.3. Suppose l2 = l3 = 0 and l0 , l1 ∈ Z≥0 . If e1 , e2 , e3 are real and (e2 − e1 )(e3 − e1 ) > 0, then the roots of the characteristic polynomial of operator H (2.1) on the space V˜ are real and distinct. Proof. It is sufficient to prove for the case l0 ≥ l1 . If e1 , e2 , e3 are real then the condition (e2 −e1 )(e3 −e1 ) > 0 is equivalent to ((e2 −e1 )(e2 −e3 ) < 0 or (e3 −e1 )(e3 −e2 ) < 0). For the case (e2 − e1 )(e2 − e3 ) < 0, the proposition is proved in the proof of the previous theorem. For the case (e3 − e1 )(e3 − e2 ) < 0, the proof is similar.
3.2. Results from the method of monodromy. In this subsection, we present an integral expression of solutions to Eq. (2.2) and related results for the case li ∈ Z≥0 (i = 0, 1, 2, 3). First, we show a proposition which is related to the monodromy of solutions to Eq. (2.2) for the case li ∈ Z≥0 (i = 0, 1, 2, 3). Proposition 3.4. If l0 , l1 , l2 , l3 ∈ Z≥0 then the monodromy matrix of Eq. (2.2) around each singular point is a unit matrix. Proof. Each solution to Eq. (2.2) has singular points only at x = n1 ω1 + n3 ω3 (n1 , n3 ∈ Z). Due to periodicity, it is sufficient to consider the case a = ωi (i = 0, 1, 2, 3). The exponents at the singular point a = ωi are −li and li +1, hence there exist solutions of the 2k li +1 (1+ ∞ a x 2k ), form fa,1 (x) = (x−a)−li (1+ ∞ k=1 ak x ) and fa,2 (x) = (x−a) k=1 k because Eq. (2.2) lacks a first-order differential term and because the functions ℘ (x+ωj ) (j = 0, 1, 2, 3) are even and because the numbers lj + 1 − (−lj ) are odd. Then it is obvious that the monodromy matrix for the functions fa,1 (x) and fa,2 (x) around the point x = a is a unit matrix, hence the local monodromy matrix of Eq. (2.2) around each singular point is also a unit matrix.
We explain how to choose particular solutions to Eq. (2.2) by applying Proposition 3.4. Let Mi (i = 1, 3) be the transformations of solutions to the differential equation (2.2) obtained by the analytic continuation x → x + 2ωi . Since all the local monodromy matrices are unit, the transformations Mi do not depend on the choice of paths. From the fact that the fundamental group of the torus is commutative, we have M1 M3 = M3 M1 . The operators Mi act on the space of solutions to Eq. (2.2) for each E, which is two dimensional. By the commutativity M1 M3 = M3 M1 , there exists a joint eigenvector
(x, E) for the operators M1 and M3 . From Proposition 3.4, the function (x, E) is single-valued and satisfies equations (H − E) (x, E) = 0, M1 (x, E) = m1 (x, E) and M3 (x, E) = m3 (x, E) for some m1 , m3 ∈ C \ {0}. We immediately obtain relations (H − E) (−x, E) = 0, M1 (−x, E) = m−1 1 (−x, E) and M3 (−x, E) = m−1
(−x, E). 3 Now consider the function ∗ (x, E) = (x, E) (−x, E). This function is even (i.e. ∗ (x, E) = ∗ (−x, E)), doubly periodic (i.e. ∗ (x + 2ω1 , E) = ∗ (x + 2ω3 , E) = ∗ (x, E)), and satisfies the equation 3 d3 d −4 li (li + 1)℘ (x + ωi ) − E dx 3 dx i=0 3 −2 li (li + 1)℘ (x + ωi ) ∗ (x, E) = 0, i=0
which products of any pair of solutions to Eq. (2.2) satisfy.
Heun Equation
475
The following proposition is used to determine an appropriate constant multiplication of the function ∗ (x, E). Proposition 3.5. If l0 , l1 , l2 , l3 ∈ Z≥0 then the equation
3 d d3 −4 li (li + 1)℘ (x + ωi ) − E 3 dx dx i=0 3 −2 li (li + 1)℘ (x + ωi ) (x, E) = 0,
(3.13)
i=0
has a nonzero doubly periodic solution which has the expansion (x, E) = c0 (E) +
3 l i −1
bj (E)℘ (x + ωi )li −j , (i)
(3.14)
i=0 j =0 (i)
where the coefficients c0 (E) and bj (E) are polynomials in E, they do not have common divisors, and the polynomial c0 (E) is monic. Moreover the function (x, E) is (i) determined uniquely. Set g = degE c0 (E), then degE bj (E) < g for all i and j . Proof. Recall that the function ∗ (x, E) = (x, E) (−x, E) is even and doubly periodic, and it satisfies Eq. (3.13). Here we show the following lemma. Lemma 3.6. Suppose l0 , l1 , l2 , l3 ∈ Z≥0 and fix E ∈ C. If Eq. (2.2) does not have solutions in F, then the dimension of the space of even doubly periodic functions which satisfy Eq. (3.13) is one. Proof of the lemma. Since the dimension of even (or odd) functions (not necessarily doubly periodic) which satisfy Eq. (2.2) is one and solutions to Eq. (3.13) are spanned by products of two solutions to Eq. (2.2), the dimension of even functions which satisfy Eq. (3.13) is two. Suppose that the function (x, E) is proportional to (−x, E), then by definition −1 2 2 one has m1 = m−1 1 and m3 = m3 , that is m1 = m3 = 1, and it contradicts the assumption of the lemma. Therefore the functions (x, E) and (−x, E) are linearly independent. If the dimension of even doubly periodic functions which satisfy Eq. (3.13) is no less than two, all even solutions to Eq. (3.13) must be doubly periodic. Hence the function
(x, E)2 + (−x, E)2 is doubly periodic. Since (x, E) ∈ F, it follows that m21 = 1 or m23 = 1. For example, suppose m21 = 1. From the periodicity of (x, E), (x, E)2 + 2
(−x, E)2 = (x + 1, E)2 + (−x − 1, E)2 = m21 (x, E)2 + m−2 1 (−x, E) which contradicts the linear independence of (x, E) and (−x, E). Therefore the dimension of even doubly periodic functions which satisfy Eq. (3.13) is less than two. Since the function ∗ (x, E) is an even doubly periodic function that satisfies Eq. (3.13), we obtain the lemma.
476
K. Takemura
We continue the proof of Proposition 3.5. Since the function ∗ (x, E) is an even doubly periodic function that satisfies the differential equation (3.13) and the exponents of Eq. (3.13) at x = ωi are −2li , 1, 2li + 2 , it admits the following expansion: ∗ (x, E) = c˜0 (E) +
3 l i −1
(i) b˜j (E)℘ (x + ωi )li −j .
(3.15)
i=0 j =0
Substituting Eq. (3.15) for Eq. (3.13), we obtain relations for the coefficients c˜0 (E) and (i) (i) (i) b˜j (E). On the relations, the coefficient b˜j (E) is expressed as a polynomial in b˜j (E) (i) (j < j ) and E, and the coefficient c˜0 (E) is expressed as a polynomial in b˜ (E) j
(i = 0, 1, 2, 3; 0 ≤ j ≤ li − 1) and E. Through a multiplicative change of scale given by 3 l −1
i c0 (E) ∗ (i) (x, E) = bj (E)℘ (x + ωi )li −j , (x, E) = c0 (E) + c˜0 (E)
(3.16)
i=0 j =0
(i)
we can choose the coefficients in such a way that c0 (E) and bj (E) are polynomials in E, they do not have common divisors, and the polynomial c0 (E) is monic. If there is a function in F that satisfies (H −E)f (x) = 0, we have f (x) ∈ V˜0 ,1 ,2 ,3 (see Eq. (3.7)) for some 0 , 1 , 2 , 3 ∈ {±1}, and the value E is an eigenvalue of H on the space V0 ,1 ,2 ,3 . From Theorem 3.1 and Lemma 3.6, any even doubly periodic function that satisfies Eq. (3.13) is determined uniquely up to constant multiplication (i) except for finitely many E. Hence the polynomials c0 (E) and bj (E) are determined uniquely. g Let (x, E) = k=0 ag −k (x)E k , where g is the maximum of the degrees of c0 (E) (i) and bj (E) (i = 0, 1, 2, 3; 0 ≤ j ≤ li − 1) in E. Substituting this into Eq. (3.13)
d and computing the coefficient of E g +1 yields dx a0 (x) = 0, and therefore we have (i) degE bj (E) < degE c0 (E), g = g, and a0 (x) = 1.
We will get an integral representation of (x, E) by using the function (x, E). The following argument is motivated by the book of Whittaker and Watson [12, §23.7]. Since the functions (x, E) and (−x, E) satisfy the differential equation (2.2), we
d d d d2
(−x, E) dx
(x, E) − (x, E) dx
(−x, E) = (−x, E) dx have dx 2 (x, E) − 2
d d d
(x, E) dx 2 (−x, E) = 0. Therefore (−x, E) dx (x, E) − (x, E) dx (−x, E) = d log (x,E) 2C(E) − d log (−x,E) = (x,E) for dx dx d log (−x,E) d log (x,E) d(x,E) 1 + = (x,E) dx , dx dx
2C(E) and
some constant C(E). From the
equation
we have
d log (±x, E) 1 C(E) d(x, E) = ± , dx 2(x, E) dx (x, E) which is integrated to yield
(±x, E) =
±C(E)dx (x, E) exp . (x, E)
This formula is the integral representation of the solution to Eq. (2.2).
(3.17)
(3.18)
Heun Equation
477
It remains to determine the constant C(E). Differentiating Eq. (3.17) yields d 2 (x, E) d (x, E) 2 1 1 −
(x, E) dx 2
(x, E) dx 2 d (x, E) 1 1 d(x, E) 2 1 C(E) d(x, E) . (3.19) = − − 2(x, E) dx 2 2 (x, E) dx (x, E)2 dx Set Q(E) = −C(E)2 . By substituting Eq. (2.2) and Eq. (3.17) into Eq. (3.19), we obtain the formula 3 li (li + 1)℘ (x + ωi ) Q(E) = −C(E)2 = (x, E)2 E − i=0
1 d 2 (x, E) 1 + (x, E) − 2 dx 2 4
d(x, E) dx
2 .
(3.20)
It follows that Q(E) is a monic polynomial in E of degree 2g + 1 from the expression for (x, E) given by Eq. (3.14) and Proposition 3.5. In summary, we have the following proposition. Proposition 3.7. Let (x, E) be the doubly periodic function defined in Proposition 3.5 and Q(E) be the monic polynomial defined in Eq. (3.20). Then the function √ ± −Q(E)dx
(±x, E) = (x, E) exp (3.21) (x, E) is a solution to the differential equation (2.2). From the previous proposition, if E is a solution to the equation Q(E) = 0, then the corresponding function (x, E) is an element of F. Conversely, we suppose (x, E) ∈ F. For this case, the function (−x, E) has the same periodicity as (x, E). If the function (x, E) is neither even nor odd, then the function (x, E) + (−x, E) is non-zero, even, and doubly periodic up to signs, and the non-zero function (x, E) − (−x, E) has the same periodicity. The function (x, E) + (−x, E) belongs to the space V˜0 ,1 ,2 ,3 (see Eq. (3.7)) for some 0 , 1 , 2 , 3 ∈ {±1}. Since the function (x, E) − (−x, E) has the same periodicity and the opposite parity as (x, E) + (−x, E), we have (x, E) − (−x, E) ∈ i V˜−0 ,−1 ,−2 ,−3 . Let αi ∈ {−li , li + 1} be the number such that αi + 1− 2 is divisible by α +α +α +α 2, then we have dim V˜0 ,1 ,2 ,3 = max(0, 1− 0 1 2 2 3 ) and dim V˜−0 ,−1 ,−2 ,−3 = 2 +α3 max(0, α0 +α1 +α −1). Hence either the space V˜0 ,1 ,2 ,3 or the space V˜−0 ,−1 ,−2 ,−3 2 is {0}, which contradicts the property (x, E) = ± (−x, E). Therefore the function
(x, E) must be odd or even. If the function (x, E) is odd or even, the functions d
(x, E) and (−x, E) are linearly dependent. Thus the Wronskian (−x, E) dx
(x, E) d − (x, E) dx (−x, E) is equal to 0 and C(E) = Q(E) = 0. Therefore the following theorem is proved: Theorem 3.8. Equation (2.2) has a solution in the space F (3.1) if and only if the value E satisfies the equation Q(E) = 0. Let us count the degree of the polynomial Q(E) and discuss the relationship to the maximum finite-dimensional invariant subspace V˜ (see Eq. (3.7)) in F.
478
K. Takemura
Conjecture 1. (i) The degree of the polynomial Q(E) in E is equal to the dimension of the space V˜ . (See Eq. (3.7) and Theorem 3.1.) (ii) For generic ω1 and ω3 , roots of the equation Q(E) = 0 are distinct. Proposition 3.9. Conjecture 1 is true if two of li (i = 0, 1, 2, 3) are zero. Proof. It is sufficient to show the case l2 = l3 = 0 and l0 ≥ l1 . The function (x, E) admits expression (3.14), but we now use the other expression, (x, E) =
l 0 +l1 1 aj (E)(z − e1 )j , (z − e1 )l1
(3.22)
j =0
where z = ℘ (x). We are going to obtain a recursive relation for the coefficients aj (E). Since (x, E) satisfies Eq. (3.13), we find that j (j − 2l0 − 1)(2j − 2l0 − 1)aj (E) + 2(j − l0 − 1)(−E + (3j 2 − 6(l0 + 1)j + 2l02 + 5l0 − l12 − l1 + 3)e1 )aj −1 (E) + (2j − 2l0 − 3)(j − 2 − l0 − l1 )(j − 1 − l0 + l1 )(e1 − e2 )(e1 − e3 )aj −2 (E) = 0, (3.23) where j = 1, 2, . . . , l0 + l1 + 1 and a−1 (E) = al0 +l1 +1 (E) = 0. By recursive relation (3.23), we have (a) All coefficients aj (E) (j = 1, . . . l0 + l1 ) are divisible by a0 (E) as a polynomial in E, e1 , e2 , e3 . Hence we may set a0 (E) = 1. (b) The coefficient aj (E) (j = 0, 1, . . . , l0 +l1 ) is a polynomial in variables E, e1 , e2 , e3 and the homogeneous degree of aj (E) w.r.t. E, e1 , e2 , e3 is j . (c) degE aj (E) ≤ j . (d) If j ≤ l0 then degE aj (E) = j . Put j = l0 + l1 + 1 in Eq. (3.23), then degE al0 +l1 −1 (E) ≤ degE al0 +l1 (E) + 1. By putting j = l0 + l1 − k + 1 (k = 1, . . . , l1 − 1) in Eq. (3.23), we also obtain degE al0 +l1 −k−1 (E) ≤ degE al0 +l1 −k (E) + 1. It follows that degE aj (E) ≤ l0 for all j , degE aj (E) = l0 ⇔ j = l0 , and the coefficient of E l0 in al0 (E) does not depend on e1 , e2 , e3 . From relation (3.20), it is seen that degE Q(E) = 2l0 + 1 and the coefficient of E 2l0 +1 in Q(E) does not depend on e1 , e2 , e3 . From Theorem 3.8, Eq. (2.2) has a solution in F iff the value E satisfies the equation Q(E) = 0, and from Theorem 3.1, dim V˜ = 2l0 + 1 (see Eq. (3.7)). From Theorem 3.2, the roots of the characteristic polynomial of the operator H (see Eq. (2.1)) on the space V˜ are distinct for generic ω1 and ω3 . Hence the characteristic polynomial of the operator H on the space V˜ coincides with the polynomial Q(E) up to constant multiplication. Therefore we obtain the proposition.
The following proposition is proved case by case. Proposition 3.10. Conjecture 1 is true for the case l0 + l1 + l2 + l3 ≤ 8.
Heun Equation
479
3.3. Bethe Ansatz equation in terms of sigma functions. In the previous subsection, we introduced the elliptic function (x, E) and the polynomial Q(E). In this subsection, we will propose the Bethe Ansatz method. We postulate the Ansatz (assumption) that the eigenfunction of the Hamiltonian (2.2) is written as Eq. (3.31) with auxiliary variables (t1 , . . . , tl ). The Bethe Ansatz method replaces the spectral problem (see Eq. (2.2)) with solving transcendental equations (3.29) for a finite number of variables (t1 , . . . , tl ), which we call the Bethe Ansatz equation. It will be shown that the basis of solutions to Eq. (2.2) is written as the form satisfying Ansatz (3.31) except for finitely many eigenvalues E. The precise statement will be given in Theorem 3.12. Note that the case of the Lam´e equation (i.e. the case l1 = l2 = l3 = 0) was treated in [12]. Throughout this subsection, we assume Q(E) = 0. As was discussed in Sect. 3.2, the functions (x, E) and (−x, E) are linearly independent. For simplicity, assume l0 = 0. The function (x, E) is an even doubly periodic function which may have poles of degree 2li at x = ωi (i = 0, 1, 2, 3). Therefore we have the following expression; (x, E) =
(0) l j =1 (℘ (x) − ℘ (tj )) l 1 (℘ (x) − e1 ) (℘ (x) − e2 )l2 (℘ (x) − e3 )l3
a0
(3.24)
for some values t1 , . . . , tl , where l = l0 + l1 + l2 + l3 . Note that the values t1 , . . . , tl depend on E. Proposition 3.11. Assume Q(E) = 0, then the values tj , −tj , ωi (j = 1, . . . , l, i = 0, 1, 2, 3, see Eq. (3.24)) are mutually distinct up to periods of elliptic functions. Proof. (a) We show tj = ωi up to periods of elliptic functions for each j ∈ {1, . . . , l} and i ∈ {0, 1, 2, 3}. Since the function (x, E) is an even doubly periodic function and satisfies Eq. (3.13), it has a pole of degree 2li or a zero of degree 2li + 2 at x = ωi (i = 0, 1, 2, 3). Suppose that the function (x, E) has a zero of degree 2li + 2 at x = ωi for some i ∈ {0, 1, 2, 3}. Then the functions (x, E) and (−x, E) have zeros of degree li + 1 at x = ωi , because (x, E) is proportional to the product (x, E) (−x, E),
(x, E) and (−x, E) satisfy the differential equation (2.2) and their exponents at x = ωi are −li and li + 1. From the condition Q(E) = 0, the functions (x, E) and (−x, E) are linearly independent and form a basis of solutions to Eq. (2.2). It follows that every solution to Eq. (2.2) has a zero of degree li + 1 at x = ωi and this contradicts the fact that one of the exponents at x = ωi is −li . Therefore the function (x, E) has a pole of degree 2li at x = ωi (i = 0, 1, 2, 3). If tj = ωi or −ωi for some j and i, then from expression (3.24) it contradicts the degree of the pole at x = ωi . (b) Now we show tj = tj (j = j , j, j ∈ {1, . . . , l}) and tj = −tj (j, j ∈ {1, . . . , l}) up to the periods of elliptic functions. If tj = −tj , then we have tj = ωi mod 2ω1 Z + 2ω3 Z for some i ∈ {0, 1, 2, 3} from the condition 2tj = 0 mod 2ω1 Z + 2ω3 Z and it is reduced to the case (a). Assume (tj = tj (j = j , j, j ∈ {1, . . . , l}) or tj = −tj (j, j ∈ {1, . . . , l})) and tj = ωi (j ∈ {1, . . . , l}, i ∈ {0, 1, 2, 3}), then the function (x, E) has zeros of degree no less than 2 at two different points x = tj and x = −tj . The degree of zeros of the functions (x, E) and (−x, E) at points x = tj and x = −tj are 0 or 1, because (x, E) and (−x, E) satisfy the differential equation (2.2) which is holomorphic at the points x = ±tj . Since (x, E) is proportional to
480
K. Takemura
(x, E) (−x, E), there is a contradiction with the degree of zeros of the function (x, E). Therefore the proposition is proved.
Set z = ℘ (x) and zj = ℘ (tj ). From formula (3.20), we obtain
d(x, E) dz
2 =
4C(E)2 . ℘ (tj )2
=
2C(E) . ℘ (tj )
z=zj
We fix the signs of tj by taking
d(x, E) dz
z=zj
If we put 2C(E)/(x, E) into partial fractions, it is seen that
2C(E) = (x, E)
l j =1
2C(E)
d dz z=z j
℘ (x) − ℘ (tj )
=
l
ζ (x − tj ) − ζ (x + tj ) + 2ζ (tj ) ,
j =1
where ζ (x) is the Weierstrass zeta function. It follows from formula (3.18) that
(x, E) =
(0) l j =1 (℘ (x) − ℘ (tj )) l 1 (℘ (x) − e1 ) (℘ (x) − e2 )l2 (℘ (x) − e3 )l3
a0
l
1 × exp log σ (tj + x) − log σ (tj − x) − 2xζ (tj ) 2 j =1 (0) l l a0 j =1 σ (x + tj ) = ζ (tj ) , exp −x σ (x)l0 σ1 (x)l1 σ2 (x)l2 σ3 (x)l3 lj =1 σ (tj ) i=1
(3.25)
where σ (x) is the Weierstrass sigma function and σi (x)(i = 1, 2, 3) are the Weierstrass co-sigma functions. Consider fixing the value E. If (x, E) ∈ F, then (x, E) and (−x, E) are linearly independent and the linear combinations of (x, E) and
(−x, E) exhaust solutions to Eq. (2.2). Hence if there is no solution to Eq. (2.2) in F, a basis of the solution is written as (x, E) and (−x, E) where (x, E) is given as Eq. (3.25). Note that the condition for E that there is no solution to Eq. (2.2) in F is equivalent to the condition that the value E satisfies Q(E) = 0. Conversely assume that differential equation (2.2) has the solution written as l ˜
(x) =
σ (x)l0 σ
j =1 σ (x + tj ) l l l 1 (x) 1 σ2 (x) 2 σ3 (x) 3
exp(cx),
(3.26)
Heun Equation
481
for some c and tj (j = 1, . . . , l) such that tj = tj (j = j ) and tj = ωi (j = 1, . . . , l, i = 0, 1, 2, 3). By a straightforward calculation, it follows that ˜ d 2 (x) dx 2 3 l ˜ = (x) li (li + 1)℘ (x + ωi ) + 2l0 ζ (x) −c − ζ (tj ) i=0
+2
3
li ζ (x + ωi ) −c − lζ (ωi ) −
i=1
+2
l
l
ζ (tj − ωi )
j =1
ζ (x + tj ) c +
j =1
+
j =1
ζ (tk − tj ) − l0 ζ (−tj ) −
k=j
3 l
3
li (ζ (−tj + ωi ) − ζ (ωi ))
i=1
li ℘ (tj − ωi ) − ζ (tj − ωi )2 −
j =1 i=0
℘ (tj − tk ) − ζ (tj − tk )2
j
+c − (l0 l1 + l2 l3 )e1 − (l0 l2 + l1 l3 )e2 − (l0 l3 + l1 l2 )e3 + 2
3
li ηi (2c + lηi )
i=1
(3.27) Note that we used the formula (ζ (x + a) − ζ (x + b) − ζ (a − b))2 = ℘ (x + a) + ℘ (x + b) + ℘ (a − b)
(3.28)
to derive Eq. (3.27). ˜ Hence we find that the function (x) satisfies Eq. (2.2) if and only if tj (j = 1, . . . , l) and c satisfy the following equations:
ζ (−tj + tk ) − l0 ζ (−tj ) −
k=j
3
li (ζ (−tj + ωi ) − ζ (ωi )) = −c (j = 1, . . . , l),
i=1
(1 − δl0 ,0 ) c +
l
(3.29)
ζ (tj ) = 0,
j =1
(1 − δli ,0 )(c + lζ (ωi ) +
l
ζ (−ωi + tj )) = 0,
(i = 1, 2, 3).
j =1
The eigenvalue E is given by E = − c2 + (l0 l1 + l2 l3 )e1 + (l0 l2 + l1 l3 )e2 + (l0 l3 + l1 l2 )e3 − −
3 l j =1 i=0
li (℘ (tj − ωi ) − ζ (tj − ωi )2 ) +
3
li ηi (2c + lηi )
i=1
(℘ (tj − tk ) − ζ (tj − tk )2 ).
j
(3.30) Note that there are other expressions of E in terms of tj (j = 1, . . . , l) and c.
.
482
K. Takemura
Summarizing, we obtain the following theorem: Theorem 3.12. Set l ˜
(x) =
σ (x)l0 σ
j =1 σ (x + tj ) l l l 1 (x) 1 σ2 (x) 2 σ3 (x) 3
exp(cx), (l = l0 + l1 + l2 + l3 )
(3.31)
and assume tj = tj (j = j ) and tj = ωi (j = 1, . . . , l, i = 0, 1, 2, 3). ˜ The condition that the function (x) satisfies Eq. (2.2) for some E is equivalent to tj (j = 1, . . . , l) and c satisfying relations (3.29), with the eigenvalue E given as (3.30). For the eigenvalue E such that Q(E) = 0, a basis of solutions to (2.2) is written as ˜ ˜
(x) and (−x), and tj (j = 1, . . . , l) such that tj = tj (j = j ) and tj = ωi (j = 1, . . . , l, i = 0, 1, 2, 3). ˜ Remark. (i) Comparing with [2, 9], it would be reasonable to call the function (x) the Bethe vector and Eqs. (3.29) the Bethe Ansatz equation. (ii) For each l0 , l1 , l2 , l3 , the number of eigenvalues E which do not have a basis of solutions to Eq. (2.2) as the Bethe vectors (3.31) is at most the dimension of the space V˜ (see Eq. (3.7)) which is given in Theorem 3.1. (iii) If l0 = 0 and l1 = 0 then the number of Eqs. (3.29) is strictly greater than the number of the variables tj (j = 1, . . . , l), although Eqs. (3.29) have solutions (t1 , . . . , tl ) if Q(E) = 0. For the case l1 = l2 = l3 = 0 (the case of the Lam´e equation), the equation c + lj =1 ζ (tj ) = 0 is obtained by summing up other equations in (3.29). Hence the number of equations is essentially equal to that of the variables tj (j = 1, . . . , l0 ).
3.4. Bethe Ansatz equation in terms of theta functions. We will introduce the Bethe Ansatz equation written in terms of theta functions. Set τ = ω3 /ω1 and ω1 = 1/2, then the Hamiltonian (see Eq. (2.1)) is described as d2 + l0 (l0 + 1)℘ (x) dx 2 1 1+τ τ + l1 (l1 + 1)℘ x + + l2 (l2 + 1)℘ x + + l3 (l3 + 1)℘ x + . 2 2 2 (3.32)
H =−
By replacing the sigma function with the theta function by formula (A.7), it is seen ˜ ˜ that if Q(E) = 0, then the basis of solutions to Eq. (2.2) is written as (x) and (−x), where √ exp(π −1cx) lj =1 θ(x + tj ) ˜
(x) = , (3.33) θ (x)l0 θ (x + 1/2)l1 θ (x + (1 + τ )/2)l2 θ(x + τ/2)l3 tj = tj (j = j ), tj ∈ 21 Z + τ2 Z (j = 1, . . . , l) and l = l0 + l1 + l2 + l3 . By similar calculations as the previous subsection, we obtain relations for tj (j = ˜ 1, . . . , l) and c for which the function (x) becomes an eigenfunction of operator (3.32).
Heun Equation
483
˜ Theorem 3.13. The function (x) (see Eq. (3.33)) with the condition tj = tj (j = j ) 1 τ and tj ∈ 2 Z + 2 Z is an eigenfunction of the operator H (see Eq. (3.32)) if and only if tj (j = 1, . . . , l(= l0 + l1 + l2 + l3 )) and c satisfy the relations, θ (tk − tj ) √ θ (tj ) θ (tj − 1/2) + l0 + l1 cπ −1 + θ (tk − tj ) θ (tj ) θ(tj − 1/2) k=j
θ (tj − (1 + τ )/2) θ (tj − τ/2) + l3 = 0, (j = 1, . . . , l) + l2 θ (tj − (1 + τ )/2) θ (tj − τ/2) l √ √ θ (tj ) = 0, (1 − δl0 ,0 ) cπ −1 + π −1(l2 + l3 ) + θ(tj ) j =1 l √ √ θ (tj − 1/2) = 0, (1 − δl1 ,0 ) cπ −1 + π −1(l2 + l3 ) + (3.34) θ(tj − 1/2) j =1 l √ √ θ (tj − (1 + τ )/2) = 0, (1 − δl2 ,0 ) cπ −1 − π −1(l0 + l1 ) + θ(tj − (1 + τ )/2) j =1 l √ √ θ (tj − τ/2) = 0. (1 − δl3 ,0 ) cπ −1 − π −1(l0 + l1 ) + θ(tj − τ/2) j =1
The eigenvalue E is given by E =π 2 (c2 + (l0 + l1 )(l2 + l3 )) + l(l + 1)η1 −
θ (tj − tk ) θ(tj − tk ) j
+ (l0 l1 + l2 l3 )e1 + (l0 l2 + l1 l3 )e2 + (l0 l3 + l1 l2 )e3 l θ (tj ) θ (tj − 1/2) θ (tj − (1 + τ )/2) θ (tj − τ/2) l0 + + l1 + l2 + l3 . θ (tj ) θ (tj − 1/2) θ(tj − (1 + τ )/2) θ(tj − τ/2) j =1
(3.35) If the value E satisfies Q(E) = 0, then a basis of solutions to Eq. (2.2) is written ˜ ˜ ˜ as (x) and (−x), where the function (x) is given in Eq. (3.33) and satisfies the 1 condition tj = tj (j = j ) and tj ∈ 2 Z + τ2 Z (j = 1, . . . , l). It would be hopeless to attempt to solve Eqs. (3.34) √ explicitly for generic τ . Instead, −1∞ and to gain some knowledge it may be possible to solve them for the case τ → √ for the case τ = −1∞√ from this limit. We will investigate solutions to Eqs. (3.34) for the trigonometric (τ → −1∞) case in the next section. 4. Trigonometric Limit 4.1. Trigonometric limit of √ the polynomial Q(E). In this section, we will consider the trigonometric limit τ → −1∞.
484
K. Takemura
The trigonometric limits of the elliptic functions are given by π2 1 π2 π2 π2 ℘ (x) → − , ℘ x + , (4.1) + → − + 3 (sin πx)2 2 3 (cos π x)2 τ π2 1+τ π 2 θ(x) ℘ x+ →− , ℘ x+ →− , → sin π x, (4.2) 2 3 2 3 θ (0) √ as τ → −1∞. They converge uniformly on compact sets. Due to this limit, the term τ l2 (l2 + 1)℘ (x + 1+τ to a constant. From now on, we 2 ) + l3 (l3 + 1)℘ (x + 2 ) converges √ √ consider the case l2 = l3 = 0. Set p = exp(2π −1τ ). Note that the limit τ → −1∞ corresponds to the limit p → 0. As p → 0, we obtain H → HT − CT , where 2
2
2
d π π HT = − dx 2 + l0 (l0 + 1) (sin πx)2 + l1 (l1 + 1) (cos πx)2 ,
CT = (l0 (l0 + 1) + l1 (l1
(4.3)
+ 1))π 2 /3.
(4.4)
We will consider the trigonometric limit of functions which have appeared in Sect. 2 2 2 3.2. Note that e1 → 2π3 , e2 → −π3 and e3 → −π3 as p → 0. The function (x, E) (see Eq. (3.14)) admits the trigonometric limit T (x, E) = c˜0 (E) +
l 0 −1 j =0
(0)
a˜ j (E) (sin πx)2(l0 −j )
+
l 1 −1 j =0
(1)
a˜ j (E) (cos π x)2(l1 −j )
,
(4.5)
and the function T (x, E) satisfies the trigonometric version of Eqs. (3.13) and (3.20). We will calculate the polynomial QT (E) which is the trigonometric version of Q(E). From the trigonometric version of Eq. (3.20), we have QT (E) = (E + CT )c˜0 (E)2 . Proposition 4.1. If l0 + l1 is even, then l0 +l1 2
QT (E) = b · (E + CT )
|l0 −l1 | 2
(E − (2i) + CT )
i=1
2
2
(E − (2i − 1)2 + CT )2 , (4.6)
i=1
where CT is defined in (4.4) and b is a constant which is independent of E. If l0 + l1 is odd, then |l0 −l1 |−1 2
QT (E) = b · (E + CT )
i=1
l0 +l1 +1 2
(E − (2i)2 + CT )2
(E − (2i − 1)2 + CT )2 .
i=1
(4.7) Proof. For simplicity, we prove for the case l0 ≥ l1 and l0 , l1 ∈ 2Z>0 . The proofs for the other cases are similar. From the proof of Proposition 3.9, it follows that the polynomial QT (E) is obtained by a suitable limit of the characteristic polynomial of the space V˜ (see Eq. (3.7)). For ˜ ˜ this case, we have V˜ = V˜1,1,1,1 ⊕ V˜1,1,−1,−1 ⊕ √ V1,−1,−1,1 ⊕ V1,−1,1,−1 . Set = (e2 − e3 ), then → 0 as τ → −1∞. We will solve recursive relation (3.12) for the case → 0. If = 0 then the invariant subspace V˜1,1,1,1 is given as
Heun Equation
485
− l1 1 l0 +l 2 σ1 (x) 2 ˜ ℘ (x)n . Then the characteristic polynomial of the space V1,1,1,1 = n=0 C σ (x) V˜1,1,1,1 for → 0 is expressed as
−
l0 +l1 2 −1
l0 +l1 2
c1
(E − (2i)2 + CT ) + −
l0 +l1 2
+··· ,
(4.8)
i=0
where c1 is a constant. Similarly, the leading terms of the characteristic polynomials of the spaces V˜1,1,−1,−1 , V˜1,−1,−1,1 , V˜1,−1,1,−1 as → 0 are given as l0 +l1 2
l0 −l1 2
E − (2i) + CT , 2
i=1
l0 −l1 2
E − (2i − 1) + CT , 2
i=1
E − (2i − 1)2 + CT ,
i=1
(4.9) up to constant multiplications respectively. By multiplying the leading terms of these four cases, we obtain the proposition.
4.2. Bethe Ansatz equation for the trigonometric model. In this subsection, we will obtain the trigonometric √ version of Eqs. (3.34).√ Set X = exp(2π −1x) and Tj = exp(2π −1tj ). Now we state the trigonometric version of Theorem 3.13. Set ˜ T (X) =
l0 +l1 j =1
(1 − XTj )
(X − 1)l0 (X + 1)l1
X c/2 ,
(4.10)
and assume Tj = Tj (j = j ) and Tj = 0, ±1 (j ∈ {1, . . . , l0 + l1 }). Suppose that Tj and c satisfy relations,
Tk +Tj k=j Tk −Tj
T +1
T −1
+ l0 Tjj −1 + l1 Tjj +1 = 0 (j = 1, . . . , l0 + l1 ) (4.11) +l1 Tj +1 l0 +l1 Tj −1 (1 − δl0 ,0 ) c + lj0=1 j =1 Tj +1 = 0, (4.12) Tj −1 = 0, (1 − δl1 ,0 ) c + c+
˜ T (e2π then it is shown that the function
2 2 with an eigenvalue E = π c .
√ −1x ) is an eigenfunction of operator H
˜ T (X) = Conversely, if the function
l0 +l1 j =1
(1−XTj )
(X−1)l0 (X+1)l1
T
(4.3)
X c/2 (Tj = Tj (j = j )
˜ T (e2π and Tj =√ 0, ±1 (j = 1, . . . , l0 + l1 )) satisfies the equation HT
˜ T (e2π −1x ), we obtain Eqs. (4.11, 4.12) and E = π 2 c2 . E
√ −1x )
=
Proposition 4.2. Fix the value E. If QT (E) = 0 then the basis of the solutions√to the √ 2π −1x ˜ ˜ T (e−2π −1x ), equation HT f √ (x) = (E + CT )f (x) is written as T (e ) and
E+C 2π −1x 2 ˜ T (e where
) is given in Eq. (4.10) with the conditions c = π 2 T , Tj = Tj (j = j ) and Tj = 0, ±1 (j ∈ {1, . . . , l0 + l1 }).
486
K. Takemura
Proof. From expression (4.5), it follows that T (x, E) =
c˜0 (E)(sin πx)2(l0 +l1 ) + (lower terms in (sin π x)2 ) , (sin πx)2l0 (cos π x)2l1
(4.13)
for some t1 , . . . , tl0 +l1 , where the polynomial c˜0 (E) is common in Eqs. (4.5) and (4.13). From the relation QT (E) = (E + CT )c˜0 (E)2 , we have c˜0 (E) = 0. Hence T (x, E) is written as 0 +l1 c˜0 (E) lk=1 ((sin π x)2 − (sin π tj )2 ) T (x, E) = (4.14) (sin πx)2l0 (cos π x)2l1 for some t1 , . . . , tl0 +l1 . By a similar argument to that performed in Sect. 3.3, we obtain 0 +l1 √ c˜0 (E) lk=1 sin π(x + tk )
T (x) = exp(π −1c), (4.15) l l 0 1 (sin πx) (cos π x) T where the value c satisfies c2 = E+C and the signs of t1 , . . . , tl0 +l1 are suitably chosen. π2 If T (x) satisfies the differential equation (HT − CT − E) T (x) = 0, T (−x) also satisfies it. From the trigonometric version of formula (3.18), it is seen that the functions
T (x) and T (−x) are linearly independent. By a similar argument to the proof of Proposition 3.11, the√values ±tj , (j = 1, . . . , l√ 0 + l1 ), 0, 1/2 are mutually different. By setting X = exp(2π −1x) and Tj = exp(2π −1tj ), we obtain the proposition.
4.3. Solutions to the Bethe Ansatz equation. We will solve the trigonometric Bethe Ansatz equation. Theorem 4.3. Assume QT (π 2 c2 −CT ) = 0. Let (T10 , . . . , Tl0 )(l = l0 +l1 ) be a solution to Eqs. (4.11, 4.12) and let σi = j1 <···<ji Tj01 . . . Tj0i be the i th elementary symmetric function of the solution (T10 , . . . , Tl0 ). Then there exists a solution to Eqs. (4.11, 4.12) uniquely up to permutation of (T10 , . . . , Tl0 ). The functions σi are determined by the following recursive relation: (l0 − l1 )(c − l0 − l1 ) , (4.16) c+1 (l0 − l1 )(c + 2i − 2 − l0 − l1 ) (i − 2 − l0 − l1 )(c + i − 2 − l0 − l1 ) σi = σi−1 + σi−2 i(i + c) i(i + c) (4.17) σ0 = 1,
σ1 =
for i = 2, 3, . . . , l0 + l1 . Proof. By the assumption QT (π 2 c2 − CT ) = √0 and Proposition 4.2, there exists ˜ T (e2π −1x ) = 0 of the form
˜ T (X) = asolution to the equation (HT − π 2 c2 )
l j =1 (1−XTj ) (X−1)l0 (X+1)l1
X c/2 . Then the values (T1 , . . . , Tl ) satisfy Eqs. (4.11, 4.12) and we have ˜ T (X) =
1+
l
(X
j j j =1 (−1) σj X − 1)l0 (X + 1)l1
X c/2 .
(4.18)
˜ T (X) = By comparing the coefficients of Xj (j = 1, . . . , l) of the equation (HT −π 2 c2 )
√ 2π −1x 0 (X = e ), we obtain the recursive relation (4.17) for σi .
Heun Equation
487
Now we exhibit explicit expressions of σm (1 ≤ m ≤ l0 + l1 ). By straightforward calculations, we obtain the following proposition: Proposition 4.4. (i) If l1 = 0 then m−1
σm =
j =0
(l0 − j )(c + j − l0 ) . (j + 1)(c + j + 1)
(4.19)
(ii) If l1 = 1 then m−2 σm =
j =0
(l0 − j ) m−2 j =1 (c + j − l0 ) m−1 m! j =0 (c + j + 1)
· (c − l0 − 1)((l0 − 2m + 1)c − l02 − l0 + 2ml0 + 2m − 2m2 ).
(4.20)
(iii) If l1 = 2 then m−3 σm =
j =0
(l0 − j ) m−3 j =2 (c + j − l0 ) (c − l0 )(c − l0 − 2) m−1 m! j =0 (c + j + 1)
· (c2 (l02 − (4m − 3)l0 + 4m2 − 8m + 2) + 2c(−l03 + (4m − 3)l02 − (6m2 − 10m + 2)l0 + 4m3 − 12m2 + 8m) + l04 − (4m − 3)l03 + (8m2 − 12m + 1)l02 − (8m3 − 20m2 + 8m + 3)l0 + 4m4 − 16m3 + 16m2 − 2).
(4.21)
(iv) If l1 = l0 then σ2m−1 = 0,
σ2m =
m−1 j =0
(v) If l1 = l0 − 1 then
m−1
σ2m (vi) If l1 = l0 − 2 then σ2m−1 = σ2m =
m
j =2 (j
− l0 )
(4.23)
m−1
j =0 (c + 2j + 2 − 2l0 ) , (m − 1)!(c + 1) m−2 j =0 (c + 2j + 2) m m−1 j =2 (j − l0 ) j =0 (c + 2j + 2 − 2l0 ) . − m!(c + 1) m−1 j =0 (c + 2j + 2)
2
(4.22)
m−1
j =0 (c + 2j + 1 − 2l0 ) , m−1 (m − 1)! j =0 (c + 2j + 1) m−1 m j =1 (j − l0 ) j =0 (c + 2j + 1 − 2l0 ) . = m! m−1 j =0 (c + 2j + 1)
σ2m−1 =
j =1
(j − l0 )
(j − l0 )(c + 2j − 2l0 ) . (j + 1)(c + 2j + 2)
·(c(l0 − 2m − 1) + (4m + 1)l0 − (2m + 1)2 ). Remark. Proposition 4.4 (i) reproduces Proposition 4.1 in [9].
(4.24)
488
K. Takemura
5. Eigenstates for the BC1 Inozemchev System 5.1. Eigenstates for the trigonometric Hamiltonian. Let us find square-integrable eigenstates of the Hamiltonian HT . Set (x) = (sin πx)l0 +1 (cos πx)l1 +1 ,
HT = (x)−1 HT (x),
(5.1)
then the gauge transformed Hamiltonian is written as (l1 + 1) sin π x d d2 (l0 + 1) cos πx − + (l0 + l1 + 2)2 π 2 . HT = − 2 − 2π dx sin πx cos π x dx (5.2) By a straightforward calculation we obtain HT (cos 2π x)m = π 2 (2m + l0 + l1 + 2)2 (cos 2π x)m +4π 2 ((−m(m − 1) + m(l0 − l1 ))(cos 2π x)m−1 .
(5.3)
Therefore the operator HT acts triangularly on the space spanned by {1, cos 2π x, (cos 2π x)2 , . . . }. Since (2m+l0 +l1 +2)2 = (2m +l0 +l1 +2)2 (m, m ∈ Z≥0 , m = m ), the operator HT is diagonalizable and HT ψm (x) = π 2 (2m + l0 + l1 + 2)2 ψm (x),
(5.4)
for some functions ψm (x) = (cos 2πx)m + (lower terms). Set w = (1 − cos 2π x)/2. By a change of variable, it follows that HT − π 2 (2m + l0 + l1 + 2)2 d2 2l0 + 3 d − ((l0 + l1 + 2) + 1)w = −4π 2 · w(1 − w) 2 + dw 2 dw + m(m + l0 + l1 + 2) . (5.5) Hence the operator HT − π 2 (2m + l0 + l1 + 2)2 is transformed to the hypergeometric operator. From expression (5.5), we obtain 2l0 + 3 1 − cos 2π x ; , (5.6) ψm (x) = c˜m Gm l0 + l1 + 2, 2 2 where the function Gm (α, β; w) = 2F1 (−m, α + m; β; w) is the Jacobi polynomial of degree m and c˜m is a constant. We define the inner product by 1 f, g = dxf (x)g(x), (5.7) 0
then the values (x)ψm (x), (x)ψm (x) are finite and π 2 (2m + l0 + l1 + 2)2 (x)ψm (x), (x)ψm (x) = HT (x)ψm (x), (x)ψm (x) = (x)ψm (x), HT (x)ψm (x) = π 2 (2m + l0 + l1 + 2)2 (x)ψm (x), (x)ψm (x). Therefore the functions (x)ψm (x) (m ∈ Z≥0 ) are orthogonal.
(5.8)
Heun Equation
489
Remark. (i) The values c˜m and (x)ψm (x), (x)ψm (x) are known explicitly. (ii) The set of functions (x)ψm (x) (m ∈ Z≥0 ) is a complete orthogonal basis of the space L2 (R/Z). For the proof, see [4 or 10]. (iii) The functions ψm (x) are simply the BC1 -Jacobi polynomials.
5.2. Jacobi polynomial, Bethe Ansatz and perturbation. In this subsection, we will see the relationship between the Jacobi polynomial and the function obtained by the Bethe Ansatz method. After that, we√ will show the holomorphy of the perturbation with respect to the parameter p = exp(2π −1τ ) ∼ 0 for the Hamiltonian of the Inozemtsev model (2.1) with the condition l2 = l3 = 0. For the case l1 = l2 = l3 = 0, the corresponding result is obtained in [9]. Let us recall the Bethe Ansatz method for the trigonometric model, which was discussed in Sect. 4. Let m be an non-negative integer, and set c = l0 +l1 +2+2m, then QT (π 2 c2 −CT ) = 0, where QT (E) and CT were defined in Sect. 4.1. Hence the solution T1 , . . . , Tl0 +l1 of the Bethe Ansatz equation (see Eqs. (4.11, 4.12)) with the condition c = l0 + l1 + 2 + 2m exists uniquely up to permutation. In this setting, the function l0 +l1 √ j =1 sin π(x + tj ) π −1cx
T (x) = e , (5.9) l l (sin πx) 0 (cos π x) 1 √ satisfies HT T (x) = π 2 c2 T (x), where exp(2π −1tj ) = Tj and the values (T1 , . . . , Tl0 +l1 ) are a solution to Eqs. (4.11, 4.12). From the condition QT (π 2 (l0 + l1 + 2 + 2m)2 − CT ) = 0, it follows that the functions T (x) and T (−x) are linearly sym sym independent. Set T (x) = T (x)−(−1)l0 T (−x), then T (x) = 0 and the degree sym of the pole of the function T (x) at x = n (resp. x = n + 1/2) (n ∈ Z) is less than sym l0 (resp. l1 ). Since the exponents of the differential equation (HT − π 2 c2 ) T (x) = 0 at x = n (resp. x = n + 1/2) are −l0 and l0 + 1 (resp. −l1 and l1 + 1), the funcsym tion T (x) has zeros of degree l0 + 1 (resp. l1 + 1) at x = n (resp. x = n + 1/2). Since the eigenvalue of the function (x)ψm (x) with respect to the Hamiltonian HT is π 2 (l0 + l1 + 2 + 2m)2 and the function (x)ψm (x) is holomorphic at x = 0, it follows sym that T (x) = Bm (x)ψm (x) for some non-zero constant Bm . Now, we will show that for each c = l0 +l1 +2+2m (m ∈ Z≥0 ) there exists p˜ ∈ R>0 such that the solution to the Bethe Ansatz equation (3.34) exists for each p(|p| < p) ˜ and it converges analytically to the solution to the Bethe Ansatz equation for the case p = 0. Set cm = l0 + l1 + 2 + 2m. Since the coefficients of the polynomial Q(E) defined in Sect. 3.2 are holomorphic in p, there exists 1 , p1 ∈ R>0 such that Q(E) = 0 for all E 2 | < and |p| < p . Let us recall that the function and p such that |E + CT − π 2 cm 1 1 (x, E) is expressed as l0 +l1 j =1 (℘ (x) − ℘ (tj )) (x, E) = . (5.10) ℘ (x)l0 ℘ (x + 1/2)l1 Set aj = ℘ (tj ) and z = ℘ (x), then the values aj depend on the values E and p. The numerator of Eq. (5.10) is written as a polynomial in z and the coefficients of zj (j = 1, . . . , l0 + l1 ) are holomorphic in E and p. Since the zeros of the numerator of Eq. (5.10) in z are simple, the values aj are holomorphic in E and p such that 2 | < , |p| < p . |E + CT − π 2 cm 1 1
490
K. Takemura
2 | < and |p| < p , then Q(E) = 0 and there exists a soluSuppose |E +CT −π 2 cm 1 1 tion to BetheAnsatz equation (3.34). Denote the solution by (t1 (E, p), . . . , tl0 +l1 (E, p)). If we choose the index suitably, then aj = ℘ (tj (E, p)). From Proposition 3.11 and the proof of Proposition 4.2, it follows that the values Z/2 (p = 0) tj (E, p) are mutually distinct and are not contained in for E Z/2 + Zτ/2 (p = 0) 2 | < , |p| < p . Since t (E, p) ∈ 1 Z + 1 τ Z, we and p such that |E + CT − π 2 cm 1 1 j 2 2 have ℘ (tj (E, p)) = 0 and the values tj (E, p) are holomorphic in E and p because the values aj are holomorphic in E and p. From Eq. (3.35), the value c satisfies the relation π 2 c2 = h(E, p) where
h(E, p) = E − (l0 + l1 )(l0 + l1 + 1)η1 − l0 l1 e1 +
θ (tj (E, p) − tk (E, p)) j
−
l 0 +l1 j =1
l0
θ(tj (E, p) − tk (E, p))
θ (tj (E, p)) θ (tj (E, p) − 1/2) + l1 . θ (tj (E, p)) θ (tj (E, p) − 1/2)
(5.11)
We are going to investigate the Bethe Ansatz equation with the condition c = cm for each p (|p|: sufficiently small). For the case p = 0, we have h(E − CT , 0) = E ∂h 2 − C , 0) = 1. By the implicit function theorem, there exists p˜ ∈ R and ∂E (π 2 cm T >0 such that there exists a holomorphic function E(p) in p on the domain |p| < p˜ such 2 and E(0) = π 2 c2 − C . Set t˜ (p) = t (E(p), p) (j = that h(E(p), p) = π 2 cm T j j m 1, . . . , l0 + l1 ), then the function t˜j (p) is holomorphic in the variable p on |p| < p, ˜ (t˜1 (p), . . . , t˜l0 +l1 (p)) is a solution to Bethe Ansatz equation (3.34) with the condition c = cm , and the corresponding eigenvalue E(p) is also holomorphic in p. Hence we √ obtain that the solution to Bethe Ansatz equation (3.34) at p(= exp(2π −1τ )) is holomorphically connected to the solution at p = 0 with the condition c = cm = l0 + l1 + 2 + 2m (m ∈ Z≥0 ). Now, we set √ +l1 θ(x + t˜j (p)) exp(π −1cm x) lj0=1 , (5.12)
m (x, p) = θ (x)l0 θ(x + 1/2)l1 sym
m (x, p) = m (x, p) − (−1)l0 m (−x, p). sym
(5.13) sym
By a similar argument to the case of the function T (x), the function m (x, p) is sym square-integrable on the interval [0, 1]. The function m (x, p) converges to the funcsym tion Bm T (x) (c = l0 + l1 + 2 + 2m) uniformly on compact sets for some non-zero as p → 0. constant Bm sym Combined with the relation T (x) = Bm (x)ψm (x) for some Bm , the following theorem is proved: Theorem 5.1. Let m be an non-negative integer. For each m, there exists ∈ R>0 such that if |p| < then there exists an eigenvalue Em (p) and an eigenfunction gm (x, p) of the Hamiltonian of the Inozemtsev model (see Eq. (3.32)) with the condition l2 = l3 = 0 such that π2 (l0 (l0 + 1) + l1 (l1 + 1)) and 3 on compact sets,
Em (p) → π 2 (l0 + l1 + 2 + 2m)2 −
(5.14)
gm (x, p) → (x)ψm (x)
(5.15)
Heun Equation
491
as p → 0 and gm (x, p) is square-integrable, i.e. 1 |gm (x, p)|2 dx < ∞.
(5.16)
0
Here, the function ψm (x) is defined in Eq. (5.6). The functions Em (p) and gm (x, p) are holomorphic in p if |p| < . Remark. For the case l1 = l2 = l3 = 0, this theorem was established in [9].
5.3. The algorithm of the perturbation. Assume l2 = l3 = 0. From formula (A.9), we obtain the following expansion in p: H = HT − CT +
∞
Vk (x)p k ,
(5.17)
k=1
where H is the Hamiltonian of the BC1 (one-particle) Inozemtsev model (see Eq. (2.2)) with the condition l2 = l3 = 0, HT is the Hamiltonian of the BC1 Calogero-Sutherland model (see Eq. (4.3)), CT is a constant, and Vk (x) are the functions which are expressed as a finite sum of cos 2πnx (n ∈ Z≥0 ). Set vm = (x)ψm (x), then vm is an eigenvector of the operator HT with the eigenvalue Em = π 2 (l0 + l1 + 2 + 2m)2 . We now apply the algorithm of the perturbation. More precisely, we determine the ˜ {k} k eigenvalues E˜ m (p) = Em + ∞ k=1 Em p and the eigenfunctions vm (p) = vm + ∞ {k} ∞ k k m cm,m vm p of the operator HT − CT + k=1 k=1 Vk (x)p as a formal power series in p. Since the function Vk (x)vm is expressed as the finite sum of vm (m ∈ Z≥0 ) and the eigenvalues Em (m ∈ Z≥0 ) are non-degenerate, the coefficients of the formal power series vm (p) and E˜ m (p) satisfying ∞ k HT − CT + Vk (x)p vm (p) = E˜ m (p)vm (p), (5.18) k=1
vm (p), vm (p) = vm , vm ,
are determined recursively and uniquely (for details see [10]). The convergence of the formal power series vm (p) and E˜ m (p) is not guaranteed a priori and they may diverge in general. Nevertheless it is shown that the formal power series vm (p) and E˜ m (p) converge in our model as a corollary of Theorem 5.1. Corollary 5.2. Let vm (p) be the eigenvector and E˜ m (p) the eigenvalue of the Hamiltonian H of the BC1 (one-particle) Inozemtsev model with the condition l2 = l3 = 0, which are obtained by the algorithm of the perturbation. If |p| is sufficiently small then the power series E˜ m (p) converges and the power series vm (p) converges on compact sets of x. Proof. From Theorem 5.1, for each m there exists ∈ R>0 such that if |p| < then there exists the eigenvalue Em (p) and the square-integrable eigenfunction gm (x, p) that converge respectively to the trigonometric eigenvalue and eigenfunction as p → 0. Set
492
K. Takemura
m (x), (x)ψm (x) g˜ m (x, p) = (x)ψ gm (x,p),gm (x,p) gm (x, p), then Em (p) and g˜ m (x, p) are holomorphic in p near p = 0 and they satisfy
k HT − CT + ∞ k=1 Vk (x)p g˜ m (x, p) = Em (p)g˜ m (x, p), g˜ m (x, p), g˜ m (x, p) = (x)ψm (x), (x)ψm (x) = vm , vm . (5.19) These equations are the same as Eqs. (5.18). From the uniqueness of the coefficients in p, we obtain E˜ m (p) = Em (p) and vm (p) = g˜ m (x, p). From the holomorphy of Em (p) and g˜ m (x, p), the convergence of E˜ m (p) and vm (p) in p is shown.
6. Comments 6.1. After obtaining Theorem 5.1, we should discuss the completeness of functions gm (x, p) (see Theorem 5.1) and clarify properties of eigenvalues and eigenstates. These matters will be discussed in the forthcoming article [10]. Note that some results will be obtained by applying the method performed in [4]. 6.2. Roughly speaking, the problem of finding square-integrable eigenstates for the Hamiltonian of the Inozemtsev model (see Eq. (2.1)) is equivalent to the problem of finding the solutions to the Heun equation (see Eq. (2.4)) with special conditions for a monodromy matrix around two singular points. It would be desirable if the Bethe Ansatz method would be of help in investigating the monodromies of the Heun equation. Acknowledgement. The author would like to thank Prof. M. Kashiwara, Dr.Y. Komori, and Prof. T. Miwa for discussions and support. He is partially supported by the Grant-in-Aid for Scientific Research (No. 13740021) from the Japan Society for the Promotion of Science.
Appendix A This appendix presents the definitions and formulae for elliptic functions. Let ω1 and ω3 be complex numbers such that the value ω3 /ω1 is an element of the upper half plane. The Weierstrass ℘-function, the Weierstrass sigma-function and the Weierstrass zeta-function are defined as follows, ℘ (z) = ℘ (z|2ω1 , 2ω3 ) 1 = 2+ z
1
1 − (z − 2mω1 − 2nω3 )2 (2mω1 + 2nω3 )2 (m,n)∈Z×Z\(0,0) z σ (z) = ℘ (z|2ω1 , 2ω3 ) = z 1− 2mω1 + 2nω3 (m,n)∈Z×Z\{(0,0)} z z2 · exp , + 2mω1 + 2nω3 2(2mω1 + 2nω3 )2 σ (z) ζ (z) = . σ (z)
, (A.1)
Heun Equation
493
Setting ω2 = −ω1 − ω3 and ei = ℘ (ωi ), ηi = ζ (ωi ) (i = 1, 2, 3) yields the relations e1 + e2 + e3 = η1 + η2 + η3 = 0, ℘ (z) = −ζ (x), (℘ (z))2 = 4(℘ (z) − e1 )(℘ (z) − e2 )(℘ (z) − e3 ), ℘ (z + 2ωj ) = ℘ (z), ζ (z + 2ωj ) = ζ (z) + 2ηj , (j = 1, 2, 3), ℘ (z) 1 1 1 1 , = + + (℘ (z))2 2 z − e1 z − e2 z − e3 (ei − ei )(ei − ei ) ℘ (z + ωi ) = ei + , (i = 1, 2, 3), ℘ (z) − ei
(A.2)
where i , i ∈ {1, 2, 3} with i < i , i = i , and i = i . The co-sigma functions σi (z) (i = 1, 2, 3) are defined by σi (z) = exp(−ηi z)σ (z + ωi )/σ (ωi ),
(A.3)
then we obtain
σi (z) σ (z)
2 = ℘ (z) − ei ,
(i = 1, 2, 3).
(A.4)
Let τ be an complex number on the upper half plane. The elliptic theta function θ(x) is defined by θ (x) = 2
∞
√ (−1)n−1 exp(τ π −1(n − (1/2))2 ) sin(2n − 1)π x.
(A.5)
n=1
Then θ (x + 1) = −θ (x),
θ (x + τ ) = −e−2π
√ √ −1x−π −1τ
θ(x).
(A.6)
Setting τ = ω3 /ω1 yields the relation σ (2ω1 x) = 2ω1 exp(2η1 ω1 x 2 )θ (x)/θ (0).
(A.7)
The following formulas are used to obtain Theorem 3.13: θ ( 21 ) θ ( 21 )
= 0,
√ θ (± τ2 ) τ = ∓π −1, θ (± 2 )
℘ (x) =
θ (x)2 θ (x) − − 2η1 , 2 θ(x) θ(x)
(A.8)
where ω1 = 1/2 and ω3 = τ/2. √ Setting ω1 = 1/2, ω3 = τ/2, and p = exp(2π −1τ ), the expansion of the Weierstrass ℘ function in p is given as follows: ℘ (x) =
∞ π2 np n π2 2 − (cos 2nπ x − 1). − 8π 3 1 − pn sin2 (π x) n=1
(A.9)
494
K. Takemura
References 1. Baxter, R.J.: Exactly Solved Models in Statistical Mechanics. London: Academic Press, 1982 2. Felder, G., Varchenko, A.: Three formulae for eigenfunctions of integrable Schrodinger operator. Comp. Math. 107, 143–175 (1997) 3. Inozemtsev, V.I.: Lax representation with spectral parameter on a torus for integrable particle systems. Lett. Math. Phys. 17, 11–17 (1989) 4. Komori, Y., Takemura, K.: The perturbation of the quantum Calogero-Moser-Sutherland system and related results. Commun. Math. Phys. 227, 93–118 (2002) 5. Ochiai, H., Oshima, T., Sekiguchi, H.: Commuting families of symmetric differential operators. Proc. Japan. Acad. 70, 62–66 (1994) 6. Olshanetsky, M.A., Perelomov, A.M.: Quantum integrable systems related to Lie algebras. Phys. Rep. 94, 313–404 (1983) 7. Oshima, T., Sekiguchi, H.: Commuting families of differential operators invariant under the action of a Weyl group. J. Math. Sci. Univ. Tokyo 2, 1–75 (1995) 8. Ronveaux, A.(ed.): Heun’s Differential Equations. Oxford Science Publications, Oxford: Oxford University Press, 1995 9. Takemura, K.: On the eigenstates of the elliptic Calogero-Moser model. Lett. Math. Phys. 53, 181– 194 (2000) 10. Takemura, K.: The Heun equation and the Calogero-Moser-Sutherland system II, III. Preprint, math.CA/0112179 (2001), math.CA/0201208 (2002) 11. Treibich, A., Verdier, J.-L.: Solitons elliptiques. Progr. Math. 88, 437–480 (1990) 12. Whittaker, E.T., Watson, G.N.: A Course of Modern Analysis. Fourth edition. New York: Cambridge University Press, 1962 Communicated by L. Takhtajan
Commun. Math. Phys. 235, 495–511 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0781-5
Communications in
Mathematical Physics
Aubry-Mather Theory and Hamilton-Jacobi Equations Ugo Bessi Dipartimento di Matematica, Universit`a Roma Tre, Largo S. Leonardo Murialdo, 00146 Roma, Italy. E-mail:
[email protected] Received: 20 November 2000 / Accepted: 4 November 2002 Published online: 24 January 2003 – © Springer-Verlag 2003
Abstract: We review and extend some results of Jauslin, Kreiss, Moser and Weinan E on the smooth approximation of Mather sets. Introduction We will consider a time-dependent lagrangian L(q, q, ˙ t) satisfying L(q + 1, q, ˙ t) = L(q, q, ˙ t) = L(q, q, ˙ t + 1); in other words, the phase space is the cylinder S 1 × R and L depends periodically on time; the standard hypotheses on the convexity and regularity of L are listed below. Even in this low-dimensional and simple-looking situation the dynamics of the Euler-Lagrange flow of L is not completely understood. For instance, in general it is impossible to read directly on L whether, given M > 0, there are orbits q satisfying |q(T ˙ ) − q(0)| ˙ > M for some T > 0; this fact being known as Arnold’s diffusion. J. N. Mather has shown in [11] and [12] that much information on the orbits is coded inside the family of functions, depending on a real parameter c, defined by n h∞ (x, t, ξ ) = lim inf min [L(q(s), q(s), ˙ s) − cq(s) ˙ + α(c)]ds : c n→∞ t q ∈ A.C.([t, n], S 1 ), q(t) = x, q(n) = ξ , where A.C.([t, n], S 1 ) denotes the class of S 1 -valued absolutely continuous functions. To explain the connection of h∞ c (x, t, ξ ) with the Hamiltonian dynamics we note that if there is a KAM torus of mean action c, then h∞ c (x, t, ξ ) is its generating function. Let us show briefly this classical argument from the calculus of variations. We call (p, I ) the canonical variables conjugate to (q, t); it is well-known that a KAM torus T is a Lagrangian submanifold and a graph; from this it follows easily that T is the graph of a closed 1-form: T = {(p, I ) = (c − ∂q u(q, t), −∂t u(q, t))}. Now I + H is constant on the orbits; since the orbits on a KAM torus are dense, I + H is constant on T and u must
496
U. Bessi
satisfy the Hamilton-Jacobi equation, i. e. Eq. (1) below. We note that the Lagrangian L˜ = L(q, q, ˙ t) − cq˙ + ∂q u(q, t)q˙ + ∂t u(q, t) evaluated on the Legendre transform of T is equal to −(I + H )|T ; it is thus a constant which we call −α(c). Moreover, ∂q˙ L˜ vanishes on the Legendre transform of T , which by the convexity of L implies that L˜ is constantly equal to its minimum on the KAM torus. Thus the orbits lying on the KAM ˜ moreover, their action is easy to calculate because it torus minimize the action of L; is the integral of a constant. Since L˜ and L have the same minimal orbits with fixed endpoints, it now follows easily that u(x, t) = h∞ c (x, t, ξ ). In this case the dependence ∞ of h∞ c on ξ is trivial: changing ξ means adding a constant to hc . Another important class of Lagrangian submanifolds are the stable and unstable manifolds of periodic orbits; for suitable values of c, the graph of c − ∂x h∞ c (x, t, ξ ) selects portions of the stable and unstable manifolds of the “action minimizing” hyperbolic periodic orbits. Here changing ξ means changing the selection: the reader may see this in a simple case calculating h∞ ˙ t) = 21 |q| ˙ 2 +[1−cos(q)]; h∞ 0 (x, t, ξ ) for the pendulum Lagrangian L(q, q, 0 (x, t, ξ ) is the action of a heteroclinic chain connecting x with ξ . Thus the function h∞ c , due to its relationship with KAM tori and stable and unstable manifolds of periodic orbits, plays an important rˆole in the theorems regarding the dynamics of L. The interested reader is referred to [11] and [12] for the whole story. We also point out [4], which contains many connections between Lagrangian dynamics, Aubry-Mather theory and Hamilton-Jacobi equations. Let now H denote the Hamiltonian of L; it is a standard calculation (see for instance [10], Theorem 3.1 or [9], Sects. 10 and 11) that for any ξ ∈ S 1 and c ∈ R v(x, t) = h∞ c (x, t, ξ ) is a viscosity solution of the following Hamilton-Jacobi equation: ∂t v − H (x, c − ∂x v, t) + α(c) = 0 (1) v(x + 1, t) = v(x, t) = v(x, t + 1). Jauslin, Kreiss and Moser proposed in [6] to find h∞ c by the vanishing viscosity method. We recall their results in the appendix; what matters now is that 2 ν 2 ∂xx u + ∂t u − H (x, c − ∂x u, t) + β(ν) = 0 (2) u(x + 1, t) = u(x, t) = u(x, t + 1) has, for a suitable choice of β(ν) ∈ R, a classical solution which is unique up to an additive constant and which we call uν . If νk 0 then uνk has (always up to an additive constant) a subsequence converging uniformly whose limit solves (1) in the viscosity sense. The problem with this approach is that the viscosity solutions of (1) are not unique and many of them, as shown in [6], do not coincide with h∞ c (x, t, ξ ) for any value of ξ . Thus one is led to the following question: when does the limit of uν as ν → 0 exist and coincide, for a suitable choice of ξ , with h∞ c (x, t, ξ )? This question has a rather long history, though in a different setting: as we explain in the digression at the end of the paper, if L = 21 |q| ˙ 2 − V (q), then the solutions of (2) can be transformed into those of the Schr¨odinger equation. Here ν plays the rˆole of the Plank constant h; identifying the limit of uν as ν → 0 is known in Physics as the problem of the semiclassical approximation. Theorem 5 of [6] concerns the tunnelling effect. Indeed, here Jauslin et al. restrict to the integrable Lagrangian L(q, q, ˙ t)= 21 |q| ˙ 2 − V (q); they suppose that the absolute maxima of V are all nondegenerate and have different second order derivatives. Let c0 be the action functional of the heteroclinic chain connecting all the maxima of V ; if |c| ≥ c0 it is easy to see that the solution of (1) is unique and uν converges to it. If |c| < c0 they show that uν converges, as ν → 0, to h∞ c (x, t, ξ ) where ξ is the maximum of V with
Aubry-Mather Theory and Hamilton-Jacobi Equations
497
smallest second order derivative; this is exactly the statement of tunnelling. Our aim is to generalize this result of [6] to a more general class of lagrangians and to periodic orbits of any rational rotation numbers. Assumptions on the lagrangian . We suppose that L is of class C 3 , that ∂q˙ q˙ L > 0, that L(q,q,t) ˙ lim|q|→∞ = +∞ and that the Euler-Lagrange flow of L is complete (i. e. the ˙ |q| ˙ trajectories of L have R as their domain of definition). These are exactly the hypotheses under which the Aubry-Mather theory works, except that L is assumed to be of class C 3 and not C 2 . Assumptions on the minimal orbits . Let n ∈ N, m ∈ Z be relatively prime and let us suppose that the functional I : {q ∈ A.C.([0, n], R) : q(n) = q(0) + m, q(0) ∈ [0, 1)} → R, n I: q → L(q, q, ˙ t)dt
(3)
0
has only a finite number of minima, q˜1 , . . . , q˜N , all of them hyperbolic. This implies (see for instance [12], Sect. 7) that there is an interval [c1 , c2 ] with c1 < c2 maximal with respect to this property: for c ∈ [c1 , c2 ] and any ξ ∈ S 1 the graph of c − ∂x h∞ c (x, t, ξ ) consists of portions of the stable and unstable manifolds of some q˜i . We will always suppose c ∈ [c1 , c2 ]. Moreover, dropping some q˜i we can always suppose that q˜i (·+l) = q˜j for i = j , ∀l ∈ Z. Though we will not use this in the following, it is proven in [12] that [c1 , c2 ] is the interval on which α (c) = m n. The local stable manifold of qi , being Lagrangian, is the graph of a closed one-form, c − ∂x zi (x, t); we set n 2 Ai = ∂xx zi (q˜˙i (t), t)dt. 0
We will prove the following theorem. Theorem 1. Let L, {q˜i }N i=1 , c1 < c2 n ∈ N, m ∈ Z and Ai be as above. Let c ∈ [c1 , c2 ] and let there be only one q˜i with minimal Ai , say q˜j . Then ∂x uν (x, t) → ∂x h∞ c (x, t, q˜j (0)) a. e. as ν → 0. We will also prove a theorem for a completely opposite situation, the case in which a KAM torus exists. It is a theorem of Birkhoff’s that the KAM tori for this kind of lagrangians are graphs of closed one-forms, T = {(I, p) = (−∂t u, c − ∂x u(x, t))}; as we explain later, it follows from [13] that ∂x uν → ∂x u a. e.; the authors of [6] ask how fast this convergence is. We are interested only in ∂x u because, as we will see later, ∂x u gives the initial conditions of the characteristics; ∂t u can be derived from these using the H-J equation, i. e. the conservation of energy. Theorem 2. Let there exist a KAM torus T C 1 -conjugate to a rotation of frequency ω; we suppose that for some τ, D > 0 ω satisfies |ωq − p| ≥
D qτ
∀p ∈ Z, ∀q ∈ N.
It is well-known that T = {(I, p) = −∂t u, (c − ∂x u(x, t))} for a unique c ∈ R and some u ∈ C 2 (S 1 × S 1 ). Then 1
|∂x uν (x, 0) − ∂x u(x, 0)| ≤ Dν 3τ +2 .
498
U. Bessi
Our method is different from that of [6] and is based on the fact that the viscosity solutions u of (1) and uν of (2) have a variational characterization similar to that of h∞ c , but with a small white noise added for uν . As we will see below, u(x, t) is the minimum action of the orbits in a certain class; for u(x, 0) this minimum is attained on an orbit qx whose initial conditions are determined by ∂x u(x, 0) and which accumulates on some q˜i . A simple comparison argument between the orbits on which u and uν are attained allows us to give bounds on uν from above and from below. We prove that if uν → u then the orbit realizing uν (x, 0) is close to the characteristic qx of u and thus spends a very long time close to q˜i . During this time the Brownian motion gives a large contribution to uν , raising its bound from below so much that it becomes incompatible with the bound from above unless ∂x uν converges to ∂x h∞ c (x, t, ξ ) with the right ξ . Proof of the Results We begin recalling some notions of [11] and [12]. We say that q ∈ A.C.(R, S 1 ) is c-minimal if ∀a, b, d, e ∈ Z, we have
b
a < b, d < e, ∀q˜ ∈ A.C.([d, e], S 1 ) : q(d) ˜ = q(a), q(e) ˜ = q(b),
e
[L(q, q, ˙ s) − cq˙ + α(c)]ds ≤
a
˙˜ s) − cq˙˜ + α(c)]ds, [L(q, ˜ q,
d
where α(c) is the same number defined in the introduction. If we allow for a = d = t we have an analogous definition for orbits defined on the half line [t, ∞); in this case we will speak of orbits c-minimal on [t, ∞). Let us now consider a sequence nk → ∞ and orbits qnk such that qnk (t) = x qnk (nk ) = ξ and
t
nk
[L(qnk , q˙nk , s) − cq˙nk + α(c)]ds → h∞ c (x, t, ξ ).
Such sequences exist by the definition of h∞ c and it can be shown that, up to a subsequence, they converge on compact sets to an orbit q c-minimal on [t, ∞) and with q(t) = x; we will say that q realizes h∞ c (x, t, ξ ). There is one caveat: the orbits realizing h∞ c (x, t, ξ ) need not have ξ in their ω-limit; of this the reader may easily convince herself considering the pendulum lagrangian L(q, q, ˙ t) = 21 |q| ˙ 2 + [1 − cos(q)] and ∞ the corresponding h0 (x, 0, ξ ) with x < 0 < ξ : the orbit realizing h∞ 0 (x, 0, ξ ) is the homoclinic starting at x and accumulating at the unstable equilibrium 0. We recall from [11] that, given any ω ∈ R, it is possible to choose c ∈ R such that all the recurrent c-minimal orbits have rotation number ω; from [1] or [12] it follows that if ω = m n ∈Q then the recurrent c-minimal orbits are all periodic of period n. The relationship between the minimum value of the functional I defined in (3) and α(c) is that, for c ∈ [c1 , c2 ], n [L(q˜i , q˙˜ i , t) − cq˙˜ i + α(c)]dt = 0 ∀i. (4) 0
There are two ways to consider our orbits: as having S 1 or R, the universal cover, as target space; from now on we will stick to the second one. It is a result of [1] that the order
Aubry-Mather Theory and Hamilton-Jacobi Equations
499
of R induces an order on the recurrent c-minimal orbits; in other words, if q˜i (0) < q˜j (0) then q˜i (t) < q˜j (t)∀t ∈ R. Let now 0 ≤ x1 < x2 < · · · < xnN < 1 denote the points {q˜j (l)(mod1)}0≤l≤n−1 1≤j ≤N ; for 1 ≤ i ≤ nN let us call qi (t) the orbit in {q˜j (· + l)}j,l such that qi (0) = xi . Since the recurrent minimal orbits are ordered, qi (t) < qi+1 (t)∀t. The following lemmas are a consequence of the fact that two c-minimal orbits intersect at most once. Lemma 1. Let c ∈ [c1 , c2 ] and x ∈ (xi , xi+1 ). Let q − (t) and q + (t) be two orbits, c-minimal on [0, ∞) and satisfying
limt→+∞ |qi (t) − q − (t)| = 0 = limt→+∞ |qi+1 (t) − q + (t)|. q + (0) = x = q − (0).
(5)
Then q˙ − (0) < q˙ + (0). Proof. We can rule out that q˙ − (0) = q˙ + (0): otherwise q − and q + would coincide contradicting (5). Let us suppose by contradiction that q˙ − (0) > q˙ + (0); we already recalled that qi (t) < qi+1 (t)∀t; by (5) and continuity this implies that there is t0 > 0 such that q − (t0 ) = q + (t0 ). Since both q − and q + are c-minimal on [0, ∞), their action functionals on [0, t0 ] must be the same. Thus also + q (t) 0 ≤ t ≤ t0 q(t) = q − (t) t ≥ t0 is a c-minimal orbit; this is a contradiction since q does not satisfy the E-L equation at t0 , where it is not smooth.
Let now νk → 0 and let uνk solve (2); as we recall in the appendix, possibly taking a subsequence we can suppose that uνk converges uniformly to a Lipschitz u solving (1) in the viscosity sense. It is proven in [13] that, if x ∈ S 1 is a point of differentiability of u(x, 0) (and almost any point is since u is Lipschitz) then the orbit qx of L with initial conditions qx (0) = x (6) q˙x (0) = Hp (x, c − ∂x u(x, 0), 0) is c-minimal on [0, ∞). Let now c ∈ [c1 , c2 ] and x ∈ [xi , xi+1 ]; from [1] or from Lemma 2.15 of [13] it follows that either lim |qx (t) − qi (t)| = 0
t→+∞
or
lim |qx (t) − qi+1 (t)| = 0.
t→+∞
Lemma 2. There is one and only one point z¯ ∈ [xi , xi+1 ] such that if x ∈ (xi , z¯ ) then if x ∈ (¯z, xi+1 ) then
lim |qx (t) − qi (t)| = 0
t→+∞
lim |qx (t) − qi+1 (t)| = 0.
t→+∞
500
U. Bessi
2 u ≤ M as ν → 0; thus Proof. We recall in the appendix the result of [6] that ∂xx ν ∂x u(x, 0) can have only downward jumps, i. e.
∀z ∈ (xi , xi+1 )
lim inf ∂x u(x, 0) ≥ lim sup ∂x u(x, 0). x z
(7)
yz
Moreover, if ∂x u(z, 0) exists, then it stays between the two limits above. Let us suppose that the thesis does not hold; then there is z0 ∈ (xi , xi+1 ) and two sequences, xs z0 and ys z0 such that, for s → ∞, qxs and qys accumulate on qi+1 and qi respectively. One of these two sequences may coincide with z0 , if ∂x u(z0 , 0) exists. It follows by an easy compactness argument that qxs and qys converge, up to a subsequence, to two orbits, which we call respectively q + and q − ; they are both cminimal on [0, ∞) and satisfy q − (0) = q + (0) = z0 . We assert that q + accumulates on qi+1 and q − on qi . Let us postpone the proof of this assertion and let us look at its consequences: by Lemma 1 we have that q˙ − (0) < q˙ + (0) or equivalently lims→∞ q˙ys (0) < lims→∞ q˙xs (0). But Hp is monotone increasing in p since we supposed L convex; thus by (6) lims→∞ ∂x u(ys , 0) > lims→∞ ∂x u(xs , 0). Since from [6] we know that ∂x u is B. V., this contradicts (7). We now prove that q − accumulates on qi . We recall that by [1] the c-minimal q − must accumulate either on qi or on qi+1 : let us suppose by contradiction that q − accumulates on qi+1 . On any interval [0, T ] qys converges uniformly to q − ; since we are supposing that |q − (t) − qi+1 (t)| → 0 for t → ∞, we can find a positive integer k1 such that |qys (k1 n) − qi+1 (k1 n)| ≤
1 |qy (0) − qi+1 (0)|. 2 s
(8)
But qys remains close to qi+1 only for a finite time, because we are supposing that it accumulates on qi ; thus we can choose k1 in such a way to minimize |qys (k1 n) − qi+1 (k1 n)|. We recall that qi (t + n) = qi (t) + m; since by [1] qys always stays between qi (t) and qi+1 (t), it is easy to see that qˆs (t) = qys (t +k1 n)−k1 m satisfies qi (t) < qˆs (t) < qi+1 (t) for t ∈ [0, ∞). Moreover, qˆs accumulates on qi since qys does. By (8) qˆs (0) > qys (0) while qˆs (k1 n) ≤ qys (k1 n) for T big because k1 is minimizing. Thus qˆs (t) and qys cross at t0 ≤ k1 . In other words, qˆs and qys cross twice, at t0 and at +∞, which by [1] is absurd.
We recall the terminology of [6]. If the point z¯ defined in the previous lemma belongs to (xi , xi+1 ), we will say that z¯ is a strong jump. Let us consider two abutting intervals (xi−1 , xi ) and (xi , xi+1 ) and let us see what can happen at xi . 1) Let qx be defined as in (6); then if ∀x ∈ (xi − , xi + )
lim |qx (t) − qi (t)| = 0
t→+∞
we call xi a transition point. We note that if xi is a transition point, then by (6) the local stable manifold of qi , the periodic orbit passing through xi , is given by the graph of c − ∂x u(x, t). 2) If ∀x ∈ (xi − , xi ) limt→+∞ |qx (t) − qi−1 (t)| = 0 ∀x ∈ (xi , xi + ) limt→+∞ |qx (t) − qi+1 (t)| = 0 we call xi a limit jump, since it can be considered as the limit of an interior strong jump. 3) If ∀x ∈ (xi − , xi ) limt→+∞ |qx (t) − qi−1 (t)| = 0 ∀x ∈ (xi , xi + ) limt→+∞ |qx (t) − qi (t)| = 0
Aubry-Mather Theory and Hamilton-Jacobi Equations
or
∀x ∈ (xi − , xi ) ∀x ∈ (xi , xi + )
501
limt→+∞ |qx (t) − qi (t)| = 0 limt→+∞ |qx (t) − qi+1 (t)| = 0
we call xi a second order discontinuity, retrograde in the first case and advanced in the second. We note that there are no other cases. First, because by [1] if x ∈ (xi , xi+1 ) the c-minimal orbit qx must satisfy limt→+∞ |qx (t) − qj (t)| = 0 for some j ∈ (i, i + 1). Second, because, if is small enough, by Lemma 2 all the qx with x ∈ (xi − , xi ) accumulate on the same orbit. Lemma 3. Let c ∈ (c1 , c2 ); then u has at least one transition point. Moreover, if qi (0) is a transition point, then all the points qi (l), l ∈ Z are transition. Proof. We begin to prove that u cannot have only second order discontinuities, all of the same kind; to fix ideas, say advanced. Let us suppose by contradiction that this is the case. Thus, if x ∈ (xj −1 , xj −1 +)∪(xj −, xj ), qx accumulates on qj (t); by Lemma 2 qx must accumulate on qj ∀x ∈ (xj −1 , xj ). We recall that qj denotes the periodic orbit passing through xj at time 0. Since qj −1 (t) < qx (t) < qj (t) and qj , qj −1 have rotation number m n , we can find k1 ∈ N, depending on x, such that qx (t +k1 n)−k1 m ∈ [xj −1 +, xj −] for some > 0, uniform in x. Letting now x → xj −1 , by a standard compactness argument qx (t + k1 n) − k1 m converges to a heteroclinic (or homoclinic, if xj −1 and xj are on the same q˜k ) connection between qj −1 and qj ; by the variational characterization of u explained at the end of this lemma its action is u(xj , 0) − u(xj −1 , 0). Since this holds for all j , we have a heteroclinic chain from q1 to q2 to q3 all the way up to q1 + 1; the action of this heteroclinic chain is nN+1 [u(xj , 0) − u(xj −1 , 0)] that is to say 0 since 2 u(x1 + 1, 0) = u(x1 , 0). By [12] a heteroclinic chain with zero action exists if and only if c = c1 or if c = c2 , a contradiction. To show that u cannot have only second order discontinuities, we note that by Lemma 2 no retrograde second order discontinuity can immediately follow an advanced one; thus the second order discontinuities of u are either all retrograde or all advanced and we fall back into the previous case. Let us now suppose by contradiction that u has only second order discontinuities and limit jumps; let xi be a limit jump. Since xi+1 is not a transition point, by Lemma 2 it must be an advanced second order discontinuity, as xi+2 , xi+3 , etc., all the way up to xi + 1 which we have supposed a limit jump, a contradiction. Since we have ruled out all other possibilities, u must have a transition point. Let us now come to the last assertion of the lemma. We remark that it is a standard fact (see for instance [13], Thm. 2.6) that the graph of c − ∂x u(x, 0) (which by (6) is the graph of the Legendre transform of q˙x (0)) is carried into itself by the step-1 map of the Hamiltonian flow of H ; thus, if qi (0) is a transition point also qi (1), qi (2), etc. are such and all the orbit qi is transition.
By the way, we note that the invariance of the graph of ∂x u(x, 0) under the step-1 map of the Hamiltonian flow of H is at the basis of a completely different approach to this theory, that of [8]. Because of the previous lemma, from now on we will speak of transition orbits instead of transition points. We now recall the variational characterization of u and uν . If u is any viscosity solution of (H-J), then (see again [9], Sects. 10 and 11)
502
U. Bessi
∀(x, t)∈ S 1 × (−∞, 0] u(x, t) 0 1 = min [L(q, q, ˙ s) − cq˙ + α(c)]ds + u(q(0), 0) : q ∈ A.C.([t, 0], S ), q(t) = x . t
From [5] it follows that the solution uν of (2) satisfies, for t ≤ 0, 0 uν (x, t) = min E [L(ξ, η, ˙ s) − cη˙ + β(ν)]ds + uν (ξ(0), 0) , Y
(9)
t
where Y varies in the class of smooth, time dependent vector fields; ξ solves for s ≥ t the stochastic differential equation dξ(s) = Y (ξ(s), s)ds + νdw(s) ξ(t) = x and η(s) ˙ = Y (ξ(s), s); w is the standard Brownian motion and E denotes expectation with respect to the Wiener measure. We call Yν the minimal Y (which exists and is unique as shown in [5]) and ξν the solution of the corresponding stochastic differential equation; we set η˙ ν (s) = Yν (ξν (s), s). We recall from [5], formula (4.5), that Yν (x, s) = Hp (x, c − ∂x uν (x, s), s).
(10)
Since uν is periodic, the last formula implies that Yν (x +1, s) = Yν (x, s) = Yν (x, s +1). Lemma 4. Let c ∈ (c1 , c2 ) and let u be the a. e. limit of a sequence uνl , νl 0. Then u has only one transition orbit, which is the one with minimal Ai . Proof. The proof consists in two estimates on α(c)−β(ν), one from above and one from below, which can agree only if the thesis is true. Before doing the two estimates, we need to know some properties of the stochastic minimizing flow in the neighbourhood of a transition orbit. Let qj be a transition orbit of u; then in a neighbourhood of qj the graph of c − ∂x u(x, t) coincides with the stable manifold of qj . Thus in the neighbourhood of qj , (11) Uj = {(q, t) : |q − qj (t)| < δ}, we have that c − ∂x u(x, t) is of class C 2 ; because of this greater regularity the method of Theorem 4 of [5] implies that, possibly reducing δ, ∂x uν (x, t) → ∂x u(x, t) uniformly in Uj . We know that ξν satisfies
t
ξν (t) − qj (0) =
Yν (ξν (s), s)ds + νw(t).
0
We set Y (x, t) = Hp (x, c − ∂x u(x, t), t), a(t) = ∂x Y (qj (t), t). We recall that by (6) q˙j (t) = Y (qj (t), t); by (10), the uniform convergence of ∂x uν and a Taylor development we have that |Yν (x, t) − q˙j (t) − a(t)[x − qj (t)]| ≤ (ν)
∀(x, t) ∈ Uj
Aubry-Mather Theory and Hamilton-Jacobi Equations
503
with (ν) ≤ Cδ 2 for ν small enough. Thus we have that dξν (s) = [q˙j (s) + a(s)(ξν (s) − qj (s)) + g(s, w)]ds + νw(s) ξν (0) = qj (0) with
|g(s, w)| ≤ (ν) if (ξν (s), s) ∈ Uj .
For every Brownian path w, g(s, w) is a continuous function of s. Let q solve q(s) ˙ = q˙j (s) + a(s)(q(s) − qj (s)) + g(s) q(0) = qj (0).
(12)
(13)
(14)
From the explicit solution of (12), which is a linear equation, we see that t t ξν (t) − q(t) = ν e s a(τ )dτ dw(s). 0
Since ξν (t) − q(t) is a martingale, Kolmogoroff’s inequality (see for instance [3], Chap. 7) implies t t E sup |ξν (t¯) − q(t¯)|2 ≤ ν 2 e2 s a(τ )dτ ds ≤ D1 ν 2 ∀t ≥ 0, 0≤t¯≤t
0
n where the last inequality comes from the fact that 0 a(τ )dτ < 0 since a(τ ) is the linearized motion on the stable manifold. We recall that ξν = ξν (t, ω) and q = q(t, ω), where ω belongs to some probability space ; we set δ A = ω : sup |ξν (t, ω) − q(t, ω)| ≥ . 4 t≥0 We denote by 1A the characteristic function of A; by the Tchebitcheff inequality we get E1A ≤
D2 ν 2 . δ2
(15)
On the other side, let ω ∈ Ac ; then by (13) |g(s)| ≤ (ν) if |q(s) − qj (s)| ≤ 4δ . But q solves the linear equation (14), which consists in a linear part attracting to qj perturbed by a term g which is of order Cδ 2 as long as q stays in Uj . From the explicit solution of this equation we get that, as long as q stays in Uj , |q(t) − qj (t)| ≤ (ν) ≤ Cδ 2 ; thus q never gets out of Uj and we get |q(t, ω) − qj (t)| ≤
δ 4
∀t ≥ 0, ∀ω ∈ Ac
δ 2
∀t ≥ 0, ∀ω ∈ Ac .
which by the definition of A yields |ξν (t, ω) − qj (t)| ≤
(16)
Let Ui be a neighbourhood of qi defined as in (11) and let z be a smooth function such that 1) the graph of c − ∂x z gives the stable manifold of qi in each Ui ; 2) z(qi (t)) = u(qi (t), t) for every i, where u is the viscous limit of the uν .
504
U. Bessi
The two properties are compatible: in Ui we take z to be a generating function of the local stable manifold of qi , the one with z(xi ) = u(xi , 0). This immediately gives 1) while 2) follows automatically by the variational characterization. We extend z smoothly outside the Ui . Then z, being the generating function of the local stable manifold, solves the Hamiton-Jacobi equation ∂t z − H (x, c − ∂x z, t) + α(c) + g(x, t) = 0 (17) z(x + 1, t) = z(x, t) = z(x, t + 1) with g = 0 outside ∪i Ui . Let uν solve (H − J )ν and let v be defined by uν = z + v. Subtracting the last equation from (H − J )ν we get 2 2 ν2 ∂xx v + ∂t v − H (x, c − ∂x z − ∂x v, t) + ν2 ∂xx z − g(x, t) +H (x, c − ∂x z, t) + β(ν) − α(c) = 0, v(x + 1, t) = v(x, t) = v(x, t + 1). This is still a Hamilton-Jacobi equation, thus it has a variational characterization with Lagrangian 2
ν ˜ q, L(q, ˙ t) = L(q, q, ˙ t)−(c−∂x z)q˙ +H (x, c−∂x z, t)+ ∂xx z−g(x, t)+β(ν)−α(c). 2 We now estimate β(ν) − α(c) from above. We consider the optimal ξ with initial condition at the transition orbit qj : dξ(t) = Yν (ξ(t), t)dt + νdw(t) ξ(0) = qj (0). Setting η(t) ˙ = Hp (ξ(t), c − ∂x u(ξ(t), t), t) we get from the variational characterization of v nk∧τ ˜ v(qj (0), 0) = E L(ξ, η, ˙ t)dt + v(ξ(nk ∧ τ ), nk ∧ τ ) , 0
where τ is the exit time from Uj . This is not exactly formula (8) but since the only difference is the presence of the stopping time τ , it can be proven as in [5], keeping track of the boundary condition on ∂Uj . Recalling that L(q, q, ˙ t) − (c − ∂x z)q˙ + H (x, c − ∂x z, t) ≥ 0
(18)
and that g|Uj = 0, we get nk∧τ 2 ν ∂xx z(ξ, t) + β(ν) − α(c) dt + v(ξ(nk ∧ τ ), nk ∧ τ ) . v(qj (0), 0) ≥ E 2 0 Since in Uj uν converges uniformly to z as ν → 0, v converges uniformly to 0; thus we have that nk∧τ 2 ν 0≥E ∂xx z(ξ, t) + β(ν) − α(c) dt − (ν) 2 0 with (ν) → 0 as ν → 0. Since the stable manifold of qi is C 2 , ∂xx z is Lipschitz in Uj , which implies nk∧τ 2 ν 2 ∂xx z(qj (t), t) + β(ν) − α(c) − C4 δν dt − (ν) ≥ 0≥E 2 0
Aubry-Mather Theory and Hamilton-Jacobi Equations
505
ν2 Aj + nk β(ν) − α(c) − C5 δν 2 − (ν), (19) 2 where the last inequality follows from (15) and (16). We now give the estimate from below. We consider ξ that satisfies dξ(t) = [Hp (ξ(t), (c − ∂x z(ξ(t), t), t)]dt + νdw(t) ξ(0) = qi (0). k(1 − (ν))
Setting η(t) ˙ = Hp (ξ(t), c − ∂x z(ξ(t), t), t) we get from the variational characterization of v, nk∧τ ˜ L(ξ, η, ˙ t)dt + v(ξ(nk ∧ τ ), nk ∧ τ ) . v(qi (0), 0) ≤ E 0
Since the drift term of ξ is the motion on the stable manifold of qi , we have that (15) and (16) hold for ξ with j replaced by i; since z solves the H-J equation (17), (18) is now an equality and we have nk∧τ 2 ν v(qi (0), 0) ≤ E ∂xx z(ξ, t) + β(ν) − α(c) dt + v(ξ(nk ∧ τ ), nk ∧ τ ) . 2 0 Since uν and z are Lipschitz uniformly in ν, uν converges to u and u(qi (t), t) = z(qi (t), t), we get that |v| ≤ C6 δ in Ui , which implies 0 ≤ k(1 + (ν))
ν2 Ai + nk β(ν) − α(c) + C5 δν 2 + C6 δ. 2
Since the above formula holds for every i, 0 ≤ k(1 + (ν))
ν2 min Ai + nk β(ν) − α(c) + C5 δν 2 + C6 δ. 2 i
(20)
Taking δ small and k large enough, we see that (19) and (20) are in contradiction unless Aj is minimal; in more detail, we take k large, so that Ck6 δ << ν 2 and then δ small so that Cδν 2 << ν 2 .
Proof of Theorem 1. We distinguish two cases: c ∈ (c1 , c2 ) and c ∈ {c1 , c2 }. We will use Lemma 4 only for the proof of the first case; in the second one, essentially we will prove that the solution of (1) is unique and we won’t need Lemma 4 and the assumption on the minimal orbits on which it depends. Let us begin with the first case and let us consider a sequence νk 0. We know from [6] that, possibly taking a subsequence, ∂x uνk → ∂x u a. e., where u is a viscosity solution of (1). We know from Lemma 4 that u has only one transition orbit, qj : the one with minimal Aj . Adding a constant, we can always suppose that u(qj (0), 0) = h∞ c (qj (0), 0, qj (0)) = 0, the second equality being a consequence of (4). Clearly, it 1 suffices to prove that u(x, 0) = h∞ c (x, 0, qj (0)) ∀x ∈ S . We recall that the periodic orbit qi satisfies qi (0) = xi . Let x ∈ (xi , xi+1 ) be a common differentiability point of u(x, 0) and h∞ c (x, 0, qj (0)); let us consider the orbit qˇx satisfying qˇx (0) = x qˇ˙x (0) = Hp (x, c − ∂x h∞ c (x, 0, qj (0)), 0).
506
U. Bessi
We have that qˇx is c-minimal on [0, ∞) and by [1] it must be asymptotic to qi or to qi+1 ; to fix ideas, let us suppose it is qi ; thus |qˇx (t) − qi (t)| → 0 for t → ∞. It is easy to see that h∞ c (x, 0, qj (0)) is realized by a c-minimal heteroclinic chain: qˇx starting at x and asymptotic to qi , qˆi−1 asymptotic at −∞ to qi and at +∞ to qi−1 , qˆi−2 from qi−1 to qi−2 , etc., until we arrive to qˆs connecting qs+1 with qs = qj (· + b) + a with a and b integers. Again by the variational characterization we have that h∞ c (x, 0, qj (0)) =
b
[L(qj , q˙j , t) − cq˙j + α(c)]dt nk i−1 + lim [L(qˆl , q˙ˆ l , t) − cq˙ˆ l + α(c)]dt 0
l=s
k→∞ −nk nk
+ lim
k→∞ 0
[L(qˇx , qˇ˙x , t) − cqˇ˙x + α(c)]dt.
We assert that also u(x, 0) is realized by a heteroclinic chain connecting x with qj (· + b ) + a . Indeed, let us consider the orbit qx defined by (6) and to fix ideas let us suppose it accumulates on qi+1 . If qi+1 is a transition orbit (i. e. if qi+1 = qj (· − b ) + a for some a b integers) we stop; otherwise, qi+1 is neither a transition orbit nor a limit jump, because qx accumulates on it; thus it must be an advanced second order discontinuity and we can connect it to qi+2 by the heteroclinic q¯i+1 ; we go on in this way until we arrive to q¯s connecting qs to qj (· − b ) + a . The variational characterization of u now implies b [L(qj , q˙j , t) − cq˙j + α(c)]dt u(x, 0) = 0
+
s l=i+1
+ lim
lim
nk
k→∞ −nk nk
k→∞ 0
[L(q¯l , q˙¯ l , t) − cq˙¯ l + α(c)]dt
[L(qx , q˙x , t) − cq˙x + α(c)]dt.
By the variational characterization of u, also this heteroclinic chain is c-minimal and thus must have the same action as the heteroclinic chain for h∞ c ; by the last two formulas, we have that h∞ (x, 0, q (0)) = u(x, 0) for a. e. x and thus for all x since both j c functions are continuous. Since every sequence {uνl }νl →0 has a subsequence converging to h∞ c (x, 0, qj (0)), the thesis follows. Let now c ∈ {c1 , c2 }. Adding a constant we can always suppose that u(x0 , 0) = ∞ h∞ c (x0 , 0, x0 ) = 0. We want to prove that u(x, 0) = hc (x, 0, x0 )∀x. Let x ∈ (xi , xi+1 ) ∞ be a common differentiability point of u and hc ; we consider the orbit qx defined by (6); it tends asymptotically say to xi+1 and we can see as before that it realizes u(x, 0); since qx ∞ is c-minimal it also realizes h∞ c (x, 0, xi+1 ) and thus ∂x u(x, 0) = ∂x hc (x, 0, xi+1 ). Thus ∞ ∞ the proof would follow if we could prove that hc (x, 0, xi+1 ) = hc (x, 0, x0 ) + c(xi+1 ) with c(xi+1 ) constant on (xi , xi+1 ). To do this, let us suppose to fix ideas that c = c1 and let us define qˇx by the formula above applied to h∞ c (x, 0, x0 ); it follows easily from [12] that since c = c1 , qˇx must accumulate on xi for all x ∈ (xi , xi+1 ). In other words, for this value of ch∞ c (x, 0, x0 ) has only retrograde second order discontinuities. ∞ Since qˇx accumulates on xi and realizes h∞ c (x, 0, x0 ) we have that hc (x, 0, x0 ) = ∞ (x , 0, x ) and we are done. h∞ (x, 0, x ) + h
i i 0 c c
Aubry-Mather Theory and Hamilton-Jacobi Equations
507
Remark 1. We remark that the definition of Ai is gauge-invariant: in other words, the ˜ q, Lagrangian L(q, ˙ t) = L(q, q, ˙ t) − ∂q S(q, t)q˙ − ∂t S(q, t), which has the same c-minimal orbits as L, has the same Ai . We also note that Ai is not the Lyapounoff exponent of q˜i , which is given by n [Hpq (q˜i , c − uq (q˜i , t), t) − Hpp (q˜i , c − uq (q˜i , t), t)uqq (q˜i , t)]dt. 0
Proof of Theorem 2. We now turn to a completely different case, that of the existence of a smooth KAM torus. It is a standard fact that any smooth KAM torus T is the graph of a closed one-form: T = {(q, p, t) = (q, c − ∂q u(q, t), t)}(q,t)∈S 1 ×S 1 . Moreover, u is a smooth solution of (H-J); it follows from Theorem 2.19 of [13] that, since the rotation number ω is irrational, then u is, up to an additive constant, the only viscosity solution of (H-J). Since u is unique we have that ∂x uν → ∂x u a. e. as ν 0. Adding a constant, we can always suppose that uν (x, 0) ≥ u(x, 0) and that uν (x, ¯ 0) = u(x, ¯ 0) for some x¯ depending on ν. We will denote by D1 a common bound ¯ By the variational of |∂x uν | and |∂x u|. Let Qk (t) be the c-minimal orbit with Qk (k) = x. characterization of uν and u we get k ˙ k , t) − L(Qk , Q ˙ k , t)]dt 0 ≤ uν (Qk (0), 0) − u(Qk (0), 0) ≤ E [L(Qk + νw(t), Q 0
+[β(ν) − α(c)]k + uν (x¯ + νw(k), 0) − u(x, ¯ 0). We estimate the integral as in [5], Theorem 4 and we get k ˙ k , t) − L(Qk , Q ˙ k , t)]dt ≤ D2 νkE sup |w(t)|. E [L(Qk + νw(t), Q 0
0≤t≤k
Moreover, since uν (x, ¯ 0) = u(x, ¯ 0) and D1 is the common Lipschitz constant of uν and u, we get that ¯ 0)] ≤ D1 νE sup |w(t)|. E[uν (x¯ + νw(k), 0) − u(x, 0≤t≤k
We also note that |β(ν) − α(c)| tends to 0 as ν tends to 0; from this and the last three formulas we get 3 0 ≤ uν (Qk (0), 0) − u(Qk (0), 0) ≤ νD3 k 2 . (21) We recall that by Theorem D of [2] there is D4 > 0 such that D4 kω mod(1) : 0 ≤ k ≤ τ is a -net on S 1 ; since the motion on T is C 1 -conjugate to a rotation of frequency ω, also {qk (0)}0≤k≤ D4 is an -net on S 1 . Using this fact, (23) and the fact that |∂x uν | and τ
|∂x u| ≤ D1 , we get that
|uν (x, 0) − u(x, 0)| ≤ D5
ν 3
2τ
+ D1 .
508
U. Bessi 2
If we choose ν 3τ +2 we get that 2
|uν (x, 0) − u(x, 0)| ≤ D4 ν 3τ +2
∀x ∈ S 1 .
(22)
¯ 0) = u(x, ¯ 0) we have that Since uν (x, x 2 [∂x uν (s, 0) − ∂x u(s, 0)]ds ≤ D4 ν 3τ +2
∀x ∈ [0, 1].
x¯
(23)
By [6] (see also the Appendix) we know that ∂xx uν (s, 0) − ∂xx u(s, 0) ≤ M
(24)
as ν 0. If the thesis were not true, 1
¯ 0) − ∂x uν (x, ¯ 0)| > Dν 3τ +2 |∂x uν (x,
(25)
for some big D and some x. ¯ Now (26) implies that (27) holds with D2 in a semi-interval 1 of length 2M to the left of or to the right of x, ¯ contradicting (25).
Digression Let us consider the Schr¨odinger equation for −V , −
ν2 u¨ − V (x)u = Eu. 2
We are interested in the eigenvalues E lying in the lowest part of the spectrum. If we x2
were on R and if −V (x) = 21 x 2 then we would have that e− 2ν is the eigenfunction corresponding to the lowest-lying eigenvalue. We might expect that in our case the eigenfunctions we are looking for are perturbed gaussians peaked near some minimum of w(x) −V . We set u(x) = e− ν and we find 1 2 ν w¨ − |w| ˙ − V (x) = E 2 2 ˙ 2 −V (q) and c = 0. Theorem 5 of [6] says the following: which is (2) if L(q, q, ˙ t) = 21 |q| if all the maxima of V are nondegenerate and have different second order derivatives, then as ν → 0 the solution wν of (2) converges to h∞ 0 (x, 0, ξ ), where ξ is the maximum of V with smallest second order derivative. Recalling that close to ξ the graph of −∂x h∞ (x, 0, ξ ) is the local stable manifold of ξ , we have that wν (x) h∞ 0 (x, 0, ξ ) √ 0 |x−ξ |2 |V (ξ )| 2 which implies that u is a perturbed gaussian peaked near ξ . More suggestively, if we drop a charge in a minimum of −V different from ξ , this charge will tunnel through the energy barrier to ξ . We refer the reader to [7] and references therein for a probabilistic treatment of tunnelling. Useless to say, for c = 0 tunnelling theory holds in any dimension and has far richer results than those contained in this digression.
Aubry-Mather Theory and Hamilton-Jacobi Equations
509
Appendix We recall the method used in [6] to show that there exists one solution (and only one, up to an additive constant) of (2). In [6] only lagrangians of the type L(q, q, ˙ t) = 1 ˙ 2 − V (q, t) are considered and this makes some of the details of the proof different. 2 |q| The idea of [6] is to differentiate (2) with respect to x; setting v(x, t) = ∂x u(x, t) we have ν2 2 v + ∂ v − ∂ H (x, c − v, t) + ∂ H (x, c − v, t)∂ v = 0 2 ∂xx t x p x 1 (26) v(x, t)dt = 0 0 v(x, t + 1) = v(x, t) = v(x + 1, t). In the above equation v has zero mean since u is periodic. To solve (25) one considers the evolution (actually involution, since t ≤ 0) problem 2 2 w + ∂ w − ∂ H (x, c − w, t) + ∂ H (x, c − w, t)∂ w = 0 ν2 ∂xx t x p x (27) w(x, 0) = φ(x) w(x, t) = w(x + 1, t), where φ is a smooth function of mean 0; it is easy to see that 1 w(x, t)dx = 0 ∀t ≤ 0. 0
One would like to show that for n ∈ Z, n → −∞, w(x, t + n) converges to a solution of (25). Two things are needed: 1) w(·, t)t≤0 is compact in some topology; 2) the limit of w(x, t + n) as n → −∞ is unique. Point 2) is proven in the same way as in [6] so we won’t speak about it; we show point 1). We begin to show that if ν is small enough then w(·, t)∞ is bounded as t → −∞. Let denote one primitive of φ and let us consider the problem ∂t u − H (x, c − ∂x u, t) + α(c) = 0 u(x, 0) = (x). Let T < 0 and let qx be the orbit of L with initial conditions q˙x (T ) = ∂p H (x, c − ∂x u(x, T ), T ) qx (T ) = x.
(28)
We begin getting a bound on q˙x (T ); then we will show how this bound carries over to a bound on ∂x w(x, T ); this bound will propagate to all negative t by induction. We recall a lemma of [11]: if a c-minimal orbit q satisfies |q(T )−q(0)| ≤ K then it also satisfies |T | |q(t)| ˙ ≤ K (K) for t ∈ [T , 0]. We define ˙ t) − c| : |p| ≤ K, K (K) = sup{|∂p H (q, c − p, t)| + |∂q˙ L(q, q, |q| ˙ ≤ K (K) q, t ∈ S 1 }. By the variational characterization of u we know that qx realizes 0 min [L(q, q, ˙ t) − cq˙ + α(c)]dt + (q(0)) : q(T ) = x . T
510
U. Bessi
d This easily implies that there is K > 0 such that, for any M > 0, if dx
∞ ≤ M, then qx (T )−qx (0) there is T (M) < 0 such that, if T ≤ T (M), then | | < K. In other words, if T d we wait long enough, the mean speed of qx does not depend on dx
∞ . Once K is thus defined we set M = K (K) + 1, T = T (M) and we choose φ in (26) such that x (0) φ∞ ≤ M. As we already said this choice of φ and T implies that | qx (T )−q |
for some suitable β(ν) and after adding a suitable function of time. The technique of Theorem 4 of [5] shows that there is ν0 > 0, only depending on L and M, such that, if ν ∈ (0, ν0 ), then ∂x zL∞ (S 1 ×[−T ,0]) ≤ ∂x uL∞ (S 1 ×[−T ,0]) + 1 ≤ K + 1 = M. If we iterate this argument using uν (x, −T ) as a new final condition, we get ∂x zL∞ (S 1 ×[−nT ,0]) ≤ K + 1 for any n ∈ N; since ∂x z = w we have that sup{|w(x, t)| : x ∈ S 1 ,
t ≤ 0} ≤ M
(29)
with M independent on ν ∈ (0, ν0 ). After getting this bound, the argument is identical to the one in [6]; namely, we differentiate (26) with respect to x and we get, setting g = ∂x w, ν2 ∂xx g +∂t g − ∂xx H (x, c − w, t) + 2∂xp H (x, c − w, t)g 2 −∂pp H (x, c − w, t)g 2 + ∂p H (x, c − w, t)∂x g = 0. If (x0 , t0 ) is a maximum of g in S 1 ×[−T , 0] we have that ∂x g(x0 , t0 ) = 0, ∂xx g(x0 , t0 ) ≤ 0 and ∂t g(x0 , t0 ) ≤ 0; from the above equation we get 2∂xp H (x, c − w, t)g − ∂pp H (x, c − w, t)g 2 ≥ ∂xx H (x, c − w, t). Since ∂pp H > 0, (28) and the above equation imply that ∂x w(x, t) = g(x, t) ≤ M 1 with M independent on ν ∈ (0, ν0 ); since 0 ∂x w(x, t)dx = 0 we get that the total variation of w(·, t) is bounded by a constant independent on t ≤ 0 and ν ∈ (0, ν0 ). Helly’s theorem now provides the pointwise convergence of w(x, t + n) to a limit which has the desired periodicity properties. The above argument also shows that the solution vν of (25) satisfies |vν (x, t)| ≤ D1 for all x, t ∈ S 1 and ν ∈ (0, ν0 ); thus, if uν is a solution of (2), ∂x uν satisfies the same 2 u ≤ M . Since 1 ∂ 2 u dx = 0, the total bound. Moreover, we also have that ∂xx ν 0 xx ν variation of ∂x uν is bounded independently on ν ∈ (0, ν0 ] and thus by Helly’s theorem {∂x uν }ν∈(0,ν0 ] is compact for the a. e. convergence. Thus also {uν } is compact and it is a standard fact that the pointwise limits of uν are viscosity solutions of (2). Acknowledgement. The author would like to thank Nalini Anantharaman for helpful trouble shooting on a previous version of this paper.
Aubry-Mather Theory and Hamilton-Jacobi Equations
511
References 1. Aubry, S., Le Daeron, P.Y.: The discrete Frenkel-Kontorova model and its extensions. Physica 8D, 381–422 (1983) 2. Bourgain, J., Golse, F., Wennberg, B.: On the distribution of free path lengths for the periodic lorentz gas. Commun. Math. Phys. 190, 491–508 (1998) 3. Da Prato, G.: Introduction to Differential Stochastic Equations. Pisa: Scuola Normale Superiore, 1995 4. Fathi, A.: Weak KAM theory in Lagrangian Dynamics. Lecture notes of a course held in Lyon, June 2000 5. Fleming, W.H.: The cauchy problem for a nonlinear first order partial differential equation. J. D. E. 5, 515–530 (1969) 6. Jauslin, H.R., Kreiss, H.O., Moser, J.: On the forced burgers equation with periodic boundary conditions. Proc. Symposia in Pure Math. 65, 133–153 (1999) 7. Iona Lasinio, G., Martinelli, F., Scoppola, E.: New approach to the semiclassical limit in quantum mechanics, multiple tunnellings in one dimension. Commun. Math. Phys. 80, 223–254 (1980) 8. Katznelson, Y., Ornstein, D.S.: Twist maps and Aubry-Mather sets. Contemp. Math. 211, 343–357 (1997) 9. Lions, P. L.: Generalized Solutions of Hamilton-Jacobi Equations. London: Pitman, 1981 10. Mantegazza, C., Mennucci, A.C.: Hamilton-Jacobi equations and distance functions on Riemannian Manifolds. Preprint SNS, 1996 11. Mather, J.N.: Action minimizing invariant measures for positive definite lagrangian systems. Math. Z. 207, 169–207 (1991) 12. Mather, J.N.: Variational construction of connecting orbits. Annales de l’Institut Fourier 43, 1349– 1368 (1993) 13. Weinan, E.: Aubry-Mather theory and periodic solutions of the forced Burgers equation. Comm. Pure Appl. Math. 52, 811–828 (1999) Communicated by G. Gallavotti
Commun. Math. Phys. 235, 513–543 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0794-8
Communications in
Mathematical Physics
Conformal Geodesics on Vacuum Space-times Helmut Friedrich Max-Planck-Institut f¨ur Gravitationsphysik, Am M¨uhlenberg 1, 14476 Golm, Germany Received: 29 April 2002 / Accepted: 4 November 2002 Published online: 10 February 2003 – © Springer-Verlag 2003
Abstract: We discuss properties of conformal geodesics on general, vacuum, and warped product space-times and derive a system of conformal deviation equations. It is then shown how to construct on the Schwarzschild-Kruskal space-time globally defined systems of conformal Gauss coordinates which extend smoothly and without degeneracy to future and past null infinity. 1. Introduction Geometrically defined coordinate systems are convenient in local studies but they often degenerate on larger domains. It is well known, for instance, that Gauss coordinates are of limited use because the underlying geodesics tend to develop caustics. In [8] conformal Gauss coordinates have been introduced and successfully employed in local studies of conformally rescaled space-times. Here the time-like coordinate lines are generated by conformal geodesics, curves which are associated with conformal structures in a similar way as geodesics are associated with metrics. In this case the danger of a coordinate break-down is even more severe, because the conformal geodesics generating the system can not only develop envelopes or intersections, as occur in caustics of metric geodesics, but they may even become tangent to each other. However, it will be shown in this article that the structure responsible for this difficulty also has useful aspects. Subsequent analyses of conformal Gauss coordinates revealed that conformal geodesics possess various remarkable properties. In [5] they have been studied systematically in the context of the conformal field equations. Though this introduces at first additional complications into the equations because it requires the use of Weyl connections, it leads to unexpected simplifications. While the Bianchi equation satisfied by the rescaled conformal Weyl tensor, which is given by d i j kl = −1 C i j kl , where denotes the conformal factor, always implies hyperbolic reduced systems of partial differential equations, the generalized conformal field equations (in vacuum, possibly with a cosmological constant) admit in the new gauge hyperbolic reductions in which the frame
514
H. Friedrich
coefficients, the connection coefficients and the components of the conformal Ricci tensor are governed by equations of the form ∂τ e µ k = . . . ,
∂τ ˆ i j
k
= ... ,
∂τ Rˆ j k = . . . ,
where τ denotes the time variable in the given gauge and the right-hand sides are given by algebraic functions of the unknowns eµ k , ˆ i j k , Rˆ j k , d i j kl and known functions on the solution manifold. Moreover, associated with a conformal geodesic is a function along that curve, the conformal factor, which is determined up to a constant factor that can be fixed on a given initial hypersurface. This function necessarily has zeros at points where the conformal geodesic crosses null infinity. It turns out that for given data on the initial hypersurface this function can be determined explicitly and acquires a form which allows one to prescribe to null infinity by a suitable choice of initial data a finite coordinate location. In [5] these facts are used to provide a rather complete discussion of Anti-de Sitter-type solutions, including a characterization of these solutions in terms of initial and boundary data. In [6] these properties form a basic ingredient of a detailed investigation of the behaviour of asymptotically flat solutions near space-like and null infinity under certain assumptions on the Cauchy data for the solutions. In particular, the cylinder at space-like infinity introduced in [6] to remove the conformal singularity at space-like infinity is obtained as a limit set of conformal geodesics. The results obtained there strongly suggest that even under the most stringent smoothness assumptions on the conformal data near space-like infinity the solutions will in general develop logarithmic singularities at null infinity. However, they also suggest that these singularities can be avoided if the data satisfy certain regularity conditions at space-like infinity. The conversion of these conjectures into facts requires the proof of a certain existence result for the regular finite initial value problem at space-like infinity formulated in [6]. While the basic difficulty of this proof has nothing to do with the specific features of the underlying conformal Gauss system, only the proof will show that the latter is doing what we expect it to do: to extend near space-like infinity smoothly and without degeneracy to null infinity if null infinity admits a smooth differentiable structure at all. An independent proof that conformal Gauss systems behave this way on general space-times is difficult, because it requires the information on the asymptotic structure of the solution which we first hope to obtain by completing the analysis of [6]. The question whether conformal Gauss systems can be used globally is even more difficult. In the context of the hyperboloidal initial value problem with smooth initial data it can be shown that these coordinates remain good for a while and that they can in fact be globally defined for data sufficiently close to Minkowskian hyperboloidal data. Nothing is known for data which deviate strongly from Minkowskian ones. There remains, however, the possibility to study the behaviour of conformal Gauss systems on specific solutions. It has been shown for instance in [6] that there exist conformal Gauss systems on the Schwarzschild space-time which smoothly cover neighbourhoods of space-like infinity including parts of null infinity. One may be tempted to think that this is only possible due to the weakness of the field in that domain (although it doesn’t look weak in the conformal picture) while in regimes of strong curvature a break-down of the coordinates will occur. It turns out that this is not true. It is the main result of this article that there exist conformal Gauss coordinates on the Schwarzschild-Kruskal space-time which smoothly cover the whole space-time and which extend smoothly (in fact analytically) and without degeneracy through null infinity.
Conformal Geodesics on Vacuum Space-times
515
This raises hopes that conformal Gauss systems stay regular under much wider circumstances than expected so far and that we may be able to exploit the simplicity of the reduced equations and the fact that conformal Gauss systems provide by themselves the conformal compactification in time directions under quite general assumptions. How robust these systems really are, whether they will stay regular under non-linear perturbations of the Schwarschild-Kruskal space, remains to be seen. Though a general analytical investigation of these equations appears difficult at present, important information about them may be obtained by numerical calculations. Conversely, there are good reasons to believe that it should be possible to give strong analytical support to numerical work based on the use of conformal Gauss systems because the latter are amenable to geometric analysis. We begin the article by discussing properties of conformal geodesics on general space-times and then specialize to vacuum space-times, using also some of the results which have been obtained already in the articles referred to above. For later application we work out the conformal geodesic equations on warped product space-times. Various features of conformal Gauss systems are then illustrated by explicit examples on Minkowski space-time. After specializing further to warped product vacuum space-times a conformal Gauss system on the Schwarzschild-Kruskal space-time will be analysed. Expressions for the conformal geodesics determined by suitably chosen initial data are derived in terms of elliptic and theta functions. It is shown that the curves extend smoothly through null infinity and that the associated conformal factor defines a smooth conformal extension if the congruence is regular. We do not attempt to derive the explicit expression of the rescaled metric in terms of the new coordinates, because this cannot be done anyway in more general situations. Instead, the regularity of the coordinate system and the conformal extension is established by analysing a system of equations which is the analogue for conformal geodesics of the Jacobi equation for metric geodesics. While specific features of the Schwarzschild solution are used here, it is clear that similar techniques apply in more general cases. We end the article by indicating as a possible application the numerical calculation of entire asymptotically flat vacuum solutions. 2. Conformal Geodesics on Pseudo-Riemannian Manifolds Before we introduce conformal geodesics we recall a few concepts and formulae of conformal geometry. On a pseudo-Riemannian manifold (M, g) ˜ of dimension n ≥ 3 we consider two operations preserving the conformal structure defined by g: ˜ (i) conformal rescalings of the metric g˜ µν → gµν = 2 g˜ µν ,
(1)
with smooth conformal factor > 0, (ii) transitions ∇ → ∇ˆ of the Levi-Civita connection ∇ of a metric g in the conformal class of g˜ into (torsion free) Weyl connections ∇ˆ with respect to g. ˜ In terms of the Christoffel symbols and the connection coefficients ˆ defined by ∇∂µ ∂ν = µ ρ ν ∂ρ and ∇ˆ ∂µ ∂ν = ˆ µ ρ ν ∂ρ respectively, the transition is , described by µ ρ ν → ˆ µ ρ ν = µ ρ ν + S(f )µ ρ ν , S(f )µ ρ ν ≡ δ ρ µ fν + δ ρ ν fµ − gµν g ρλ fλ ,
(2)
516
H. Friedrich
with a smooth 1-form f . For given f the difference tensor S(f ) depends only on the conformal structure of g. The transformation of the Levi-Civita connection ˜ of g˜ under (1) is a special case of (2) in which the 1-form is exact ˜ µ ρ ν → µ ρ ν = ˜ µ ρ ν + S(f )µ ρ ν
with
f = −1 d .
(3)
Conversely, if the 1-form f is closed (2) arises locally from a rescaling of g with a suitable conformal factor. By (2) we have ∇ˆ ρ gµν = −2 fρ gµν .
(4)
It follows from this that ∇ˆ preserves the conformal structure of g˜ in the sense that with a function θ > 0 satisfying on a given curve x(τ ) in M ∇ˆ x˙ θ = θ < f, x˙ >,
(5)
ˆ the metric θ 2 gµν is parallel transported along x(τ ) with respect to ∇. The curvature tensors of the connections ∇ˆ and ∇, defined by (∇ˆ λ ∇ˆ ρ − ∇ˆ ρ ∇ˆ λ ) Xµ = Rˆ µ νλρ X ν etc., are related by Rˆ µ νλρ − R µ νλρ = 2 {∇[λ Sρ] µ ν + Sδ µ [λ Sρ] δ ν },
(6)
where indices are raised or lowered with respect to g. The tensor Lˆ µν =
1 n−2 ˆ 1 { Rˆ (µν) − R[µν] − gµν Rˆ }, n−2 n 2 (n − 1)
which occurs in the decomposition Rˆ µ νλρ = 2 {g µ [λ Lˆ ρ]ν − g µ ν Lˆ [λρ] − gν[λ Lˆ ρ] µ } + C µ νλρ of the curvature tensor of ∇ˆ into its trace parts and the trace-free, conformally invariant conformal Weyl tensor C µ νλρ , is related to the tensor Lµν =
1 1 { Rµν − R gµν }, n−2 2 (n − 1)
by the equation ∇µ fν − fµ fν + gµν
1 fλ f λ = Lµν − Lˆ µν . 2
(7)
2.1. Conformal geodesics. A conformal geodesic associated with the conformal structure defined by g˜ is given by a curve x(τ ) in M and a 1-form b(τ ) along x(τ ) such that the equations ˙ µ + S(b)λ µ ρ x˙ λ x˙ ρ = 0, (∇˜ x˙ x) (∇˜ x˙ b)ν −
1 bµ S(b)λ µ ν x˙ λ = L˜ λν x˙ λ , 2
(8) (9)
hold on x(τ ). For given initial data x∗ ∈ M, x˙∗ ∈ Tx∗ M, b∗ ∈ Tx∗∗ M there exists a unique conformal geodesic x(τ ), b(τ ) near x∗ satisfying for given τ∗ ∈ R, x(τ∗ ) = x∗ ,
x(τ ˙ ∗ ) = x˙∗ ,
b(τ∗ ) = b∗ .
(10)
Conformal Geodesics on Vacuum Space-times
517
From (8) it follows that the sign of g( ˜ x, ˙ x) ˙ is preserved along x(τ ), since we have ∇˜ x˙ (g( ˜ x, ˙ x)) ˙ = −2 < b, x˙ > g( ˜ x, ˙ x). ˙
(11)
In particular, if g( ˜ x, ˙ x) ˙ = 0 holds at one point it holds everywhere on x(τ ) and (8) implies that x(τ ) can be reparametrized to coincide with a null geodesic of g. ˜ ¯ τ¯ ) be two solutions to (8), (9) with g( Let x(τ ), b(τ ) and x( ¯ τ¯ ), b( ˜ x, ˙ x) ˙ = 0. We derive now the conditions under which the curves x(τ ), x( ¯ τ¯ ) coincide locally as point sets such that there exists a local reparameterization τ = τ (τ¯ ) with x(τ (τ¯ )) = x( ¯ τ¯ ). This relation and (8) imply x˙ ∂τ2¯ τ + 2 < b¯ − b, x˙ > x˙ (∂τ¯ τ )2 − g( ˜ x, ˙ x)( ˙ b¯ − b) (∂τ¯ τ )2 = 0 which is equivalent to the equations b¯ − b = α x˙ ,
∂τ2¯ τ + α g( ˜ x, ˙ x) ˙ (∂τ¯ τ )2 = 0,
(12)
with some function α. In the first of these equations the index of x˙ is lowered with the metric g. ˜ From the first equation and (9) 1 2 ˜ x, ˙ x) ˙ α g( 2
α˙ = 2 α < b, x˙ > +
(13)
follows, which together with (12) is equivalent to our requirement. Equations (11), (13) imply the equation ∂τ (α g( ˜ x, ˙ x)) ˙ = 1/2 (α g( ˜ x, ˙ x)) ˙ 2 which has the solutions α g( ˜ x, ˙ x) ˙ =
2 α∗ g( ˜ x˙∗ , x˙∗ ) , ˜ x˙∗ , x˙∗ ) τ 2 − α∗ g(
where τ = τ − τ∗ and τ¯ = τ¯ − τ¯∗ . From these solutions with (12)
τ =
4 e τ¯ , 1 + 2 e α∗ g( ˜ x˙∗ , x˙∗ ) τ¯
b¯ = b +
1 2 α∗ g( ˜ x˙∗ , x˙∗ ) x, ˙ g( ˜ x, ˙ x) ˙ 2 − α∗ g( ˜ x˙∗ , x˙∗ ) τ
e, α∗ , τ∗ , τ¯∗ ∈ R,
(14)
e = 0
follows. Thus, the changes of the initial data (10) which locally preserve the point set spread out by the curve x(τ ) are given by x˙∗ → 4 e x˙∗ ,
b∗ → b∗ + α∗ x, ˙
e, α∗ ∈ R,
e = 0.
(15)
The 1-form remains unchanged and an affine parameter transformation results if α∗ = 0. With suitable choices of the free constants, however, any fractional linear transformations of the parameter can be obtained and it can be arranged that τ → ∞ at any prescribed value of τ¯ . This fact is related to an important difference between conformal geodesics and metric geodesics. It may happen that the parameter τ on a conformal geodesic takes any value in R while the curve still acquires endpoints in M as τ → ±∞. If the transformation in (14) has a pole it may further happen that the point sets spread out by x(τ ), x( ¯ τ¯ ) coincide only partly, namely at the points where the transformation is defined, but each of the curves extends into a region not entered by the other one. In fact, there may occur an arbitrary number of such overlaps (we will encounter and make use of this phenomenon below). In certain contexts it may be preferable to call the union of the corresponding point sets a conformal geodesic. However, it will be more convenient for us to preserve this name for a solution of the conformal geodesic equation with its distinguished parameter.
518
H. Friedrich
If the 1-form b is used to define a connection ∇ˆ along x(τ ) by requiring that its difference tensor with respect to ∇˜ is given by ∇ˆ − ∇˜ = S(b),
(16)
Eq. (8) can be written in the form ∇ˆ x˙ x˙ = 0. Thus x(τ ) is an autoparallel with respect ˆ Equation (9), which determines the connection ∇ˆ along that autoparallel, acquires to ∇. some meaning by comparing it with (7). A congruence of conformal geodesics, covering smoothly and without caustics an open set U , defines a smooth 1-form field b and ˜ b, L˜ respectively and thus a connection ∇˜ on U . If ∇, f , L are replaced in (7) by ∇, transvected with x, ˙ we find that Eq. (9) takes the form of the restriction Lˆ µν x˙ µ = 0 on ˆ the connection ∇ in the direction of the congruence. Here it turns out to be important that (7) does not contain the contraction of ∇µ fν . Let f be an arbitrary 1-form and ∇ˆ = ∇˜ + S(f ) the associated Weyl connection. ˜ we find for any curve x(τ ) in M and 1-form ˜ L, Observing (7) with ∇, L replaced by ∇, b(τ ) along x(τ ) the identities (∇˜ x˙ x) ˙ µ + S(b)λ µ ρ x˙ λ x˙ ρ = (∇ˆ x˙ x) ˙ µ + S(b − f )λ µ ρ x˙ λ x˙ ρ , 1 bµ S(b)λ µ ν x˙ λ − L˜ λν x˙ λ 2 1 = (∇ˆ x˙ (b − f ))ν − (b − f )µ S(b − f )λ µ ν x˙ λ − Lˆ λν x˙ λ . 2
(17)
(∇˜ x˙ b)ν −
(18)
It follows that conformal geodesics are invariant under transitions to general Weyl connections and, in particular, under conformal rescalings of g˜ in the following sense: if ˜ then x(τ ), b(τ ) − f |x(τ ) x(τ ), b(τ ) is a solution of these equations with respect to ∇, is a solution of the conformal geodesic equation with respect to ∇ˆ = ∇˜ + S(f ). Note that the parameter τ is an invariant of the conformal class of g. ˜ It is determined by the initial conditions (10) but it does not depend on the Weyl connection and the metric in the conformal class chosen to write the conformal geodesics equations. Let ek , k = 0, 1, . . . n − 1, be a frame field which is parallel along x(τ ) for the connection ∇ˆ of (16) and satisfies g(e ˜ i , ek ) = −2 diag(1, . . . 1, −1, . . . , −1) at x(τ∗ ) with some = ∗ > 0. It follows then from (4), (5), that this relation is preserved along x(τ ) with the function = (τ ) satisfying ∇˜ x˙ = < b, x˙ >,
(τ∗ ) = ∗ .
(19)
Consider again a congruence of conformal geodesics as above and initial data ∗ chosen such that the function satisfying (19) is smooth and positive on U . Then the associated 1-form is given in terms of the Levi-Civita connection ∇ of the metric g = 2 g, ˜ for which the frame is orthonormal, by h(τ ) = b(τ ) − −1 d |x(τ ) ,
(20)
along x(τ ). Equations (19) and (11) imply x(τ ) < h, x˙ > = 0,
g(x, ˙ x) ˙ = 2 g( ˜ x, ˙ x) ˙ = 2∗ g( ˜ x˙∗ , x˙∗ ).
(21)
Conformal Geodesics on Vacuum Space-times
519
We assume the congruence, the function , and the frame ek to be constructed by a suitable choice of the initial data such that g(x, ˙ x) ˙ = 1,
˙ e0 = x.
(22)
From (21) and ∇ = ∇˜ + S(−1 d) = ∇ˆ − S(h) follows ∇x˙ ek = − < h, ek > x˙ + g(x, ˙ ek ) h , where the index of h is raised with g. This implies that ˙ ek ) ∇x˙ x˙ + g(∇x˙ x, ˙ ek ) x˙ Fx˙ ek ≡ ∇x˙ ek − g(x, = − < h, ek > x˙ + g(x, ˙ ek ) h − g(x, ˙ ek ) h + g(h , ek ) x˙ = 0. ˆ is in general not Thus the frame ek , which is parallel propagated with respect to ∇, parallel but always Fermi-propagated with respect to ∇. The following general observation, which follows by a direct calculation, describes the sense in which Fermi-transport is conformally invariant. Let θ > 0 be some conformal factor and g, ˜ g metrics on M such that g = θ 2 g. ˜ Denote by F˜ , F the respective Fermi-transports. Let x(τ ) be any curve in M such that g(x, ˙ x) ˙ = 1. For any vector field X along the curve we then have F˜θ x˙ (θ X) = θ 2 Fx˙ X, where θ x˙ is the tangent vector of the curve parametrized in terms of g-arc ˜ length. Applying this to the situation considered above, we see that the g-orthonormal ˜ frame ek is Fermi-transported along the conformal geodesics, if the latter are parametrized in terms of g-arc ˜ length. Let f : M → M be a conformal diffeomorphism of g˜ such that f ∗ g˜ = 2 g˜ with some function on M. Since conformal geodesics are invariants of the conformal structure, it is not surprising that f maps conformal geodesics into conformal geodesics. If K is a conformal Killing vector field its flow defines a 1-parameter family of local conformal diffeomorphism and thus maps the set of conformal geodesics into itself. Though we will occasionally make use of this fact, it will not be demonstrated here and we refer to [11] for the formal argument. 2.2. The conformal deviation equations. Following the discussion of the Jacobi equation for metric geodesics, we derive now an analogous system of equations for conformal geodesics. Let x(τ, λ), b(τ, λ) be a family of solutions to (8), (9) depending smoothly on a parameter λ. We denote the tangent vector field of the conformal geodesics, the deviation vector field of the congruence, and the deviation 1-form by X = ∂τ x = x, ˙
Z = ∂λ x,
B = ∇˜ Z b,
respectively. Considering x = x(τ, λ) as a map from some open subset of R2 into M˜ and denoting by T x its tangent map, we have T x([∂τ , ∂λ ]) = 0, whence [X, Y ] = ∇˜ X Z − ∇˜ Z X = 0. This, (8), and (∇˜ X ∇˜ Z − ∇˜ Z ∇˜ X ) X − ∇˜ [X,Z] Z = R˜ (X, Z) X imply the conformal Jacobi equation ∇˜ X ∇˜ X Z = R˜ (X, Z) X − S(B; X, X) − 2 S(b; X, ∇˜ X Z),
(23)
where (S(b; X, Y ))µ = S(b)ρ µ ν X ρ Y ν for a given 1-form b and vector fields X, Y . The last two terms on the right-hand side of (23) indicate why conformal geodesics are potentially more useful than metric geodesics for the construction of coordinate systems. Under suitable circumstances the acceleration induced by the 1-form b may counteract curvature induced tendencies of the curves to develop caustics.
520
H. Friedrich
Caustics of conformal geodesics can be more complicated than caustics of metric geodesics because for a given tangent vector there exists (essentially) a 3-parameter family of conformal geodesics with the same tangent vector. Moreover, to analyse them we need to complement the conformal Jacobi equation by an equation which governs the behaviour of B. Writing (∇˜ X ∇˜ Z − ∇˜ Z ∇˜ X ) b − ∇˜ [X,Z] b = −b R˜ (X, Z), we get from (9) the 1-form deviation equation ˜ ˜ ∇˜ X Z, .) ∇˜ X B = −b R˜ (X, Z) + (∇˜ Z L)(X, .) + L( 1 + (B · S(b; X, .) + b · S(B; X, .) + b · S(b; ∇˜ X Z, .)), 2
(24)
where (B · S(b; X, .))ν = Bµ S(b)ρ µ ν X ρ . We refer to Eqs. (23), (24) as the system of conformal deviation equations. They form a linear system of ODE’s for Z and B along the curves x(τ ).
2.3. Conformal geodesic equations on warped products. For later applications we work out the simplifications of the conformal geodesic equations which are obtained in the case where g˜ can be written as a warped product. Thus we assume that there exist coordinates x µ for which the set I = {0, 1, . . . , n − 1} in which the index µ takes its values can be decomposed into two non-empty subsets I1 , I2 with I = I1 ∪ I2 , ∅ = I1 ∩ I2 , such that in terms of the coordinates x A , A ∈ I1 , and x a , a ∈ I2 , the metric takes the form g˜ = g˜ µν dx µ dx ν = hAB dx A dx B + f 2 kcd dx c dx d , hAB = hAB (x C ),
kab = kab (x c ),
(25)
f = f (x A ) > 0,
where 1 ≤ p ≡ kab k ab ≤ n−1 by our assumptions on I1 , I2 . The Christoffel coefficients are then given by ˜ B A C =
1 AD h (hBD,C + hCD,B − hBC,D ), 2 ˜ B A c = 0,
˜ B a c = ˜ c a B =
1 f,B k a c , f
˜ b A c = −hAD f f,D kbc ,
˜ B a C = 0,
˜ b a c =
1 ad k (kbd,c + kcd,b − kbc,d ). 2
Those which cannot be derived by symmetry considerations from the coefficients above, vanish. The curvature tensor has components R A BCD [g] ˜ = R A BCD [h],
R a BcD [g] ˜ = −k a c
1 DB DD f, f
˜ = R a bcd [k] − 2 hCD f,C f,D k a [c kd]b , R a bcd [g]
(26)
(27)
Conformal Geodesics on Vacuum Space-times
521
where D denotes the h-Levi-Civita connection with connection coefficients ˜ A B C . Components which cannot be deduced by symmetry considerations from those above, vanish. In particular R A BCd [g] ˜ = 0, R a bCD [g] ˜ = 0, R A Bcd [g] ˜ = 0, R A bcd [g] ˜ = 0, R a BCD [g] ˜ = 0. (28) ˜ = 0, This implies LAc [g] Lcd [g] ˜ =
1 1 Rcd [k] − kcd R[k] 2 12 1 2 3−p (p − 1)(6 − p) A A − f R[h] + f DA D f + DA f D f kcd , 12 6 12
LAB [g] ˜ =
1 1 p 1 RAB [h] − hAB R[h] − DA DB f − hAB DC D C f 2 12 2f 3 1 − { R[k] − p (p − 1) DC f D C f } hAB . 12 f 2
The conformal geodesic equations take the form x¨ A + ˜ B A C x˙ B x˙ C + ˜ b A c x˙ b x˙ c = −2 (bC x˙ C + bc x˙ c ) x˙ A +(hBC x˙ B x˙ C + f 2 kbc x˙ b x˙ c ) hAD bD ,
(29)
x¨ a + ˜ b a c x˙ b x˙ c + ˜ B a c x˙ B x˙ c + ˜ b a C x˙ b x˙ C = −2 (bC x˙ C + bc x˙ c ) x˙ a + (hBC x˙ B x˙ C + f 2 kbc x˙ b x˙ c )
1 ac k bc , f2
(30)
b˙A − ˜ C D A x˙ C bD − ˜ c d A x˙ c bd 1 1 = (bC x˙ C + bc x˙ c ) bA − (hBC bB bC + 2 k bc bb bc ) hAD x˙ D + L˜ AB [g] ˜ x˙ B , 2 f (31) b˙a − ˜ c D a x˙ c bD − ˜ C d a x˙ C bd − ˜ c d a x˙ c bd 1 1 = (bC x˙ C + bc x˙ c ) ba − (hBC bB bC + 2 k bc bb bc ) f 2 kac x˙ c + L˜ ab [g] ˜ x˙ b . 2 f (32) 3. Conformal Geodesics on Vacuum Fields Though many of the subsequent discussions apply to more general situations, we shall assume from now on that dim M = 4, sign(g) ˜ = (1, −1, −1, −1) and that g˜ satisfies Einstein’s vacuum field equations R˜ µν = 0. With the resulting simplification and the
522
H. Friedrich
notation introduced above the conformal geodesic equations and the equation for the frame ek now take the form ∇˜ x˙ x˙ + 2 < b, x˙ > x˙ − g( ˜ x, ˙ x) ˙ b = 0,
(33)
1 ∇˜ x˙ b − < b, x˙ > b + g˜ (b, b) x˙ = 0, 2
(34)
∇˜ x˙ ek + < b, x˙ > ek + < b; ek > x˙ − g( ˜ x, ˙ ek ) b = 0.
(35)
Since these equations admit solutions with vanishing 1-form b, metric geodesics form on vacuum space-times a subclass of conformal geodesics. 3.1. Conformal geodesics on Minkowski space. Denote by x µ coordinates on Minkowski space (M, η) in which the metric takes the form η = ηµν dx µ dx ν = (dx 0 )2 − (dx 1 )2 −(dx 2 )2 −(dx 3 )2 and let k = δ µ k ∂µ be the orthonormal standard frame. There are various possibilities to determine the solutions to (33), (34), (35). The framework of the normal conformal Cartan connection (cf. [5]), in which conformal geodesics are defined as “horizontal” curves on a certain bundle allows one a purely algebraic calculation of these curves, because the bundle admits in the case of the conformally compactified Minkowski space a purely group theoretical description. Since the curves on the bundle space are known explicitly there only remains the task to calculate their projection onto the base space. We just give the result of the calculation. Besides the initial data (10) we prescribe the value ∗ > 0 and a Lorentz transforj mation A∗ which determines the initial value ek∗ = −1 ∗ A∗ k j of the frame ek . The corresponding solution is given by 1 (τ ) = ∗ 1 + τ < b∗ , x˙∗ > + ( τ )2 η(x˙∗ , x˙∗ ) η (b∗ , b∗ ) , (36) 4 µ
x µ (τ ) = x∗ +
∗ (τ )
µ
τ x˙∗ +
1 µ ( τ )2 η(x˙∗ , x˙∗ ) b∗ , 2
(37)
1
τ η (b∗ , b∗ ) x˙∗ν , 2
(38)
bν (τ ) = (1 + τ < b∗ , x˙∗ >) b∗ν − ek =
1 Aj k (τ ) j , (τ )
with a Lorentz transformation A(τ ) = (Aj k (τ )) given by ∗ 1 2 A(τ ) = A∗ + τ b∗ x˙∗ −
τ x˙∗ + ( τ ) η(x˙∗ , x˙∗ ) b∗ (τ ) 2 1 × b∗ · A∗ + τ η (b∗ , b∗ ) x˙∗ . 2 Indices are moved in this subsection with the metric η.
(39)
Conformal Geodesics on Vacuum Space-times
523
Conformal geodesics which satisfy η(x, ˙ x) ˙ = 0 at one point coincide (as point sets) with null geodesics, those for which b∗ = 0 are Minkowski geodesics. If x˙∗ is spaceor time-like we can assume, possibly after a reparametrization, that < b∗ , x˙∗ > = 0. It follows that the remaining conformal geodesics fall into one of the following classes. If x˙∗ and b∗ generate a space-like 2-surface in the tangent space of x∗ , the corresponding conformal geodesic is a metric circle in the plane tangent to that 2-surface. If x˙∗ and b∗ generate a time-like 2-surface and x˙∗ is space-like, the corresponding conformal geodesic is a space-like metric hyperbola in the plane tangent to that 2-surface. If x˙∗ is space-like and b∗ is null, the conformal geodesic is a space-like curve in a null plane. We shall mainly be interested in the case in which x˙∗ and b∗ generate a time-like 2-surface and x˙∗ is time-like. An example of such a conformal geodesic is given by the curve a τ2 τ 1 2 2 x(τ ) = , , + , 0, 0 , |τ | < a2 τ 2 a a2 τ 2 |a| 1− 1− 4
4
which is a time-like hyperbola satisfying ηµν x µ (τ ) x µ (τ ) = − a −2 . We shall use now time-like conformal geodesics to construct coordinate systems on Minkowski space where the parameter τ on the curves defines a time coordinate while ˜ Most the other coordinates are obtained by dragging along spatial coordinates on S. important for us is the fact that the construction provides a conformal rescaling and extension of Minkowski space to which the coordinates are adapted in a natural way. We denote by r∗ the restriction of the standard radial coordinate r on Minkowski space to the hypersurface S˜ = {x 0 = 0}. In spherical coordinates θ , φ the inner metric induced on S then takes the form h˜ = −(dr∗2 + r∗2 dσ 2 ) with dσ 2 = dθ 2 + sin2 θ dφ 2 the standard line element on the 2-sphere. If we write r∗ = tan( χ2 ) with 0 ≤ χ < π, we find that the conformal factor ∗ = 2 (1 + r∗2 )−1 = 1 + cos χ allows us to realize the conformal embedding of S˜ into the 3-sphere S = S˜ ∪ {i} = S 3 with line element 2∗ h˜ = −(dχ 2 + sin2 χ dσ 2 ), where i denotes the point {χ = π } at space-like infinity. To define a congruence of conformal geodesics and a conformal factor, we choose ˜ u ∈ R3 with |u| = 1, we set initial data on S˜ as follows. At x∗ = (0, r∗ u) ∈ S, 2 b∗ = a uc c (sum over c = 1, 2, 3) and x˙∗ = −1 ˙ x) ˙ = 1. The ∗ 0 such that ∗ η(x, ˜ function a on S is determined by the following consideration. With the data above and τ∗ = 0 we get (τ ) = ∗ (1 − 41 τ 2 a 2 −2 ∗ ) from (36). To obtain, if possible, a 1-form −1 b which is exact, we require d = b at x∗ which gives a = 2 r∗ (1 + r∗2 )−1 . Our data are then spherically symmetric and we can express the conformal factor (36) and the curves (37) in terms of the time coordinate t = x 0 and the radial coordinate r to obtain 1 2 (1 + r∗2 ) τ r∗2 (4 + τ 2 ) , r = . (40) = ∗ 1 − τ 2 r∗2 , t = 4 4 − τ 2 r∗2 4 − τ 2 r∗2 To relate these expressions to known facts, we use r∗ = tan χ2 and set τ2 = tan 2s to obtain τ 2 1 = , (41) = ω, with = cos s + cos χ , ω = 1 + 2 cos2 2s
524
H. Friedrich
t=
sin s , cos χ + cos s
r=
sin χ . cos χ + cos s
(42)
Reading the last equation as a coordinate transformation, we get 2 η = 2 dt 2 − dr 2 − r 2 dσ 2 = gE ≡ ds 2 − dχ 2 − sin2 χ dσ 2 . The set ME = R × S 3 endowed with the metric gE defines the Einstein cosmos. We see that (42) realizes the well known conformal embedding of Minkowski space into the Einstein cosmos, which maps the former onto the subset |s ± χ | < π , 0 ≤ χ < π of ME . The boundary ∂M of this set in ME supplies representations of future and past null infinity, future and past time-like infinity, and space-like infinity of Minkowski space, which are given by the subsets J ± = {s ± χ = ±π }, i ± = {s = ±π, χ = 0}, i 0 = {s = 0, χ = π } of ∂M respectively. In terms of τ and we get from (40) the metric 1 2 2 2 2 2 2 η=ω (43) dτ − dχ − sin χ dσ , ω2 which extends smoothly to J ± but does not extend to i ± , where ω → ∞. This is ˜ The curve τ → z(τ ) = (s = not due to an unfortunate choice of initial data on S. 2 arctan τ2 , χ = 0) is a conformal geodesic on the Einstein cosmos which approaches i ± as τ → ±∞. The freedom to perform parameter transformations, characterized by (14), (15), does not allow us to find a parametrization for which the curve extends simultaneously to i − and i + . Thus the separation of i − and i + realizes a conformal invariant. It is possible, however, to find reparametrizations under which the rescaled metric extends smoothly either to i + or to i − . The curve τ¯ → z¯ (τ¯ ) = (s = s∗ + 2 arctan( τ2¯ − tan s2∗ ), χ = 0) is again a conformal geodesic, because ∂s is a Killing vector field for gE . We choose s∗ to be a constant with 0 < s∗ < π . Then z¯ (τ¯ ) is related to z(τ ) by the parameter transformation τ = τ¯ (1 + tan2 s2∗ − 21 τ¯ tan s2∗ )−1 . Comparing with (14) we are led to set τ∗ = 0, τ¯∗ = 0, 4 e = (1 + tan2 s2∗ )−1 , and α∗ η(x˙∗ , x˙∗ ) = − tan s2∗ and to consider the initial data x´¯ ∗ =
1 1 + tan2
s∗ 2
x˙∗ ,
¯ ∗ = (η(x´¯ ∗ , x´¯ ∗ ) 2 = 1
Observing these data and setting Eqs. (42), while (36) yields now tan s2∗ ¯ ¯ = ∗ 1 − τ¯ 1 + tan2 with as in (41) and ω¯ = cos−2
1 cos2
s∗ 2
∗ ,
b¯∗ = b∗ −
τ¯ 2
= tan
s∗ 2
tan2 s2∗ − tan2 χ2 1 + τ¯ 2 4 (1 + tan2 s2∗ )2
s−s∗ 2
+ tan
s∗ 2,
(s−s∗ ) 2 .
tan s2∗ x˙∗ . η(x˙∗ , x˙∗ ) (44)
we get from (37) again = ω, ¯
It follows that the metric 1 2 2 2 2 ¯ 2 η = ω¯ 2 ds 2 − dχ 2 − χ 2 d σ 2 = ω¯ 2 , d τ ¯ − dχ − χ d σ ω¯ 2
extends regularly onto the domain |s − s∗ | < π containing i + .
Conformal Geodesics on Vacuum Space-times
525
If we insist on a conformal factor which defines a conformal compactification of ˜ h) ˜ and initial data ensuring b = −1 d on S, ˜ invariance under t → − t implies the (S, existence of a point on S˜ where b vanishes. The conformal geodesic through such a point will then be a metric geodesic. In our first example such a point is given by r∗ = 0 and it follows from (40) that the parameter τ takes values in R and the conformal factor is constant on the curve r = 0. Thus, in order to achieve a compactification which extends smoothly to i + for a finite value of the parameter we have to choose non-time-symmetric initial data such as (44). ˜ The possibility to Notice that s∗ could have been chosen to be a function on S. change the parametrization while leaving the curves as point sets unchanged offers a large freedom to select slices of constant parameter value. 3.2. The conformal factor and the 1-form on general vacuum fields. In the following we shall derive some general, explicit information on the solutions to (33), (34), (35) (cf. the more general discussion in [5] on solutions to the vacuum field equations R˜ µν = λ g˜ µν with a cosmological constant λ). From (33), (34) follows 1 ∇˜ x˙ < b, x˙ > = − < b, x˙ >2 + g( ˜ x, ˙ x) ˙ g˜ (b, b), 2
(45)
∇˜ x˙ g˜ (b, b) = < b, x˙ > g˜ (b, b),
(46)
where the index on the symbol of a metric indicates here and in the following the contravariant version of that metric. Assume again that the congruence of conformal geodesics, , and the frame have been chosen such that (22) holds and g( ˜ x, ˙ x) ˙ = −2 along the congruence. Equation (11) then implies ∇˜ x˙ = < b, x˙ >. Taking a derivative and observing (45) yields ∇˜ x2˙ = 1/2 g˜ (b, b) −1 . Taking another derivative and observing (46) gives finally ∇˜ x3˙ = 0.
(47) In terms of the initial data (10) this yields with ∗ = (g( ˜ x˙∗ , x˙∗ )) the explicit expression 1 ¨∗ ˙ ∗ + ( τ )2 (τ ) = ∗ + τ 2 1 2 = ∗ 1 + τ < b∗ , x˙∗ > + ( τ ) g( ˜ x˙∗ , x˙∗ ) g˜ (b∗ , b∗ ) . 4
(48)
Similar steps lead to ∇˜ x˙ (g( ˜ x, ˙ x) ˙ (g˜ (b, b))2 ) = 0, whence ¨ −1 g˜ (b, b) = −1 ∗ g˜ (b∗ , b∗ ) = 2 ∗ .
(49)
In particular, the sign of g˜ (b, b) is preserved as long as > 0. >From (34), (35) follows ∇˜ x˙ ( < b, ek >) = 1/2 g˜ (b, b) g( ˜ x, ˙ ek ). >From this equation and (19), (22) the following explicit expression can be derived for the components bk ≡ < b, ek > of the 1-form b in the frame ek , ˙ b0 = −1 ,
ba = −1 da
with
da = < b∗ , ∗ ea∗ > .
(50)
526
H. Friedrich
This implies with (49) the relation ˙ 2 − δ ab da db = 2 ηj k bj bk = 2 g (b, b) ¨ = g˜ (b, b) = −1 ∗ g˜ (b∗ , b∗ ) = 2 ∗ .
(51)
While the expressions (48), (50) depend on invariants built from the initial data, they are universal in the sense that their form does not depend on the solution g. ˜ Using (50), we can write the representation of the 1-form with respect to the connection of g, i.e. (20), in the form ek () = < b, ek > − < h, ek > on the congruence. If there is a point on a given curve of the congruence where vanishes and b and h remain bounded on that curve, (50), (51) imply ηj k ej () ek () → 0
as
→ 0.
(52)
3.3. The g-adapted ˜ form of the conformal geodesic equation. We write b = bˆ + ζ x˙ (where here and below indices of 1-forms and vectors are moved with g) ˜ with bˆ such that ˆ x˙ > = 0, < b,
whence ζ =
< b, x˙ > , g( ˜ x, ˙ x) ˙
ˆ b). ˆ (53) g (b, b) = < b, x˙ >2 + g (b,
ˆ b) ˆ x˙ . Equations (33), (34) are then equivalent to ∇˜ x˙ x˙ = bˆ , ∇˜ x˙ bˆ = −g˜ (b, We introduce the parameter transformation
τ dτ τ¯ (τ ) = τ¯∗ + , (54)
τ∗ (τ ) with inverse τ = τ (τ¯ ) and set x( ¯ τ¯ ) = x(τ (τ¯ )). Then x¯ ≡ ∂τ¯ x¯ = x˙ satisfies
g( ˜ x¯ , x¯ ) = 1 and we obtain the g-adapted ˜ conformal geodesic equations ∇˜ x¯ x¯ = bˆ ,
∇˜ x¯ bˆ = β 2 x¯ ,
(55)
where, by (50), ˆ b) ˆ = δ ab da db = const. β 2 ≡ −g˜ (b,
(56)
ˆ If bˆ along the conformal geodesic. These equations bring out the important role of b. vanishes at a point it vanishes along x( ¯ τ¯ ) and the curve is a g-geodesics. ˜ We determine the transformation (54) under the assumption that g (b∗ , b∗ ) = g( ˜ x˙∗ , x˙∗ ) g˜ (b∗ , b∗ ) < 0.
(57)
It follows then from (48) that (τ ) vanishes at τ± = τ∗ −
2 ∗ , ∗ < b∗ , x˙∗ > ∓ |β|
(58)
with τ+ = τ− , such that =
1 ∗ g (b∗ , b∗ )(τ − τ+ ) (τ − τ− ), 4
(59)
Conformal Geodesics on Vacuum Space-times
527
and the integration of (54) gives τ¯ (τ ) = τ¯∗ +
1 (τ∗ − τ+ ) (τ − τ− ) log , |β| (τ − τ+ ) (τ∗ − τ− )
(60)
whence 2 ∗ sinh |β|
τ ¯ 2 .
τ = |β| |β| cosh 2 τ¯ − ∗ < b∗ , x˙∗ > sinh |β|
τ ¯ 2
(61)
In terms of the parameter τ¯ we also get =
|β| cosh
|β| 2
τ¯
∗ β 2 − ∗ < b∗ , x˙∗ > sinh
|β| 2
τ¯
2 .
(62)
Suppose that the solution admits a smooth conformal extension for which = 0 on J + , that the extension admits a point i + such that J + coincides with the past light cone of i + , and that one of our conformal geodesics, x(τ ) say, passes through i + for a finite value τi of τ . We note that condition (57) precludes a discussion of the zeros of on x(τ ). If (58) describes the zeros of on the conformal geodesics covering a neighbourhood of x(τ ), then, on a family of such conformal geodesics which approaches x(τ ), we should find τ± → τi , and consequently β = 0, < b∗ , x˙∗ > = 0 on x(τ ) and τi = τ∗ −
. ˆ τ¯ , λ) a smooth 3.4. The g-adapted ˜ conformal deviation equations. Denote by x( ¯ τ¯ , λ), b( family of solutions to (55) with family parameter λ. As before, we write X = ∂τ¯ x¯ = x¯ ,
Z = ∂λ x, ¯
ˆ Bˆ = ∇˜ Z b.
(63)
Following the derivation of the conformal deviation equations and observing that vacuum field equations L˜ µν = 0, we obtain the g-adapted ˜ conformal deviation equations ∇˜ X ∇˜ X Z = C (X, Z) X + Bˆ ,
(64)
ˆ b)) ˆ X + g˜ (b, ˆ b) ˆ ∇˜ X Z , ∇˜ X Bˆ = −bˆ C (X, Z) + (∇˜ Z g˜ (b,
(65)
where C denotes the conformal Weyl tensor of g. ˜ ˆ b)) ˆ = 0, we have by (56) always g˜ (b, ˆ b) ˆ = const. along While in general ∇˜ Z (g˜ (b, ˆ b)) ˆ = ∇˜ Z (∇˜ X g˜ (b, ˆ b)) ˆ = 0. Thus the coefficients x( ¯ τ¯ ), which implies ∇˜ X (∇˜ Z g˜ (b, of X and ∇˜ X Z in the second line of (65) are constant and known along x( ¯ τ¯ ) by their initial data at some τ¯∗ .
528
H. Friedrich
3.5. Conformal geodesics and the g-adapted ˜ conformal deviation equations on warped product vacuum fields. It will be shown now that on warped product vacuum fields the discussion of the conformal geodesic equations reduces under suitable assumptions to the analysis of one equation and that a similar statement is true for the g-adapted ˜ conformal deviation equations. If g˜ can be written in the form (25), the vacuum field equations take the form RAC [h] =
p DA DC f, f
Rac [k] = f −(p−2) DA (f (p−1) D A f ) kac ,
(66)
and the dependence of the various fields in the second equation on x a implies Rac [k] =
R[k] kac , p
R[k] = p f −(p−2) DA (f (p−1) D A f ) = const.
(67)
We shall assume from now on that p = 2 and that the indices A, a take values 0, 1 and 2, 3 respectively. Then RABCD [h] = R[h] hA[C hD]B and RAC [h] = 21 R[h] hAC and thus 2 DA DB f = hAB DC D C f by the first of Eqs. (66). Contracting with D A , commuting derivatives yield DB (f 2 DA D A f ) = 0 if (66) is used again. It follows 2 c ≡ f 2 DA D A f = const.,
DA DB f =
c hAB , f2
RABCD [h] =
4c hA[C hD]B . f3 (68)
These equations imply with (26), (27), 4c A 4c a h [C hD]B , R a bcd [g] ˜ = k [c kd]b , 3 f f c R a BcD [g] ˜ = − 3 k a c hBD . f R A BCD [g] ˜ =
(69)
It follows that R µ νλρ = 0 unless c = 0. If Einstein’s vacuum field equation L˜ µν = 0 holds, Eqs. (29), (30), (31), (32) admit solutions satisfying x˙ a = 0, bc = 0, and these solutions obey equations which can be written in the form Dx˙ x˙ = −2 < b, x˙ > x˙ + h(x, ˙ x) ˙ b ,
Dx˙ b = < b, x˙ > b −
1 h (b, b) x˙ , 2
(70)
where all quantities are derived from h. We shall discuss now the g-adapted ˜ version of (70), assuming that ≡ det(hAB ) < 0. With h = | | d x 0 ∧ d x 1 , (71) √ it follows that h (x, ˙ .) = | |(x˙ 0 d x 1 −x˙ 0 d x 0 ) and h (h (x, ˙ .), h (x, ˙ .)) = −h(x, ˙ x). ˙ Since the vectors x, ˙ bˆ are contained in the 2-dimensional space spanned by ∂A , A = 0, 1, ˆ x˙ > = 0, the equations above and (56) imply the representation and < b, bˆ = ± β h ( x, ˙ .) = ± β h (x¯ , .),
(72)
Conformal Geodesics on Vacuum Space-times
529
where the sign is determined by the initial conditions and the convention for β. Using this expression in the second of Eqs. (55), gives β 2 x¯ = Dx¯ bˆ = ± β h (Dx¯ x¯ , .), which can be rewritten in the form Dx¯ x¯ = bˆ .
(73)
Thus, the g-adapted ˜ forms of the two Eqs. (70) are equivalent to each other and we only have to solve (73) with bˆ as in (72) to obtain a solution to (70). Observing the formulae for the connection coefficients, the results (28) for the curvature tensor of warped products, and C A BCD [g] ˜ =
4c A h [C hD]B , f3
(74)
Eqs. (64), (65) are seen to be equivalent to each other and to the equation DX DX Z =
2c h (X, Z) h (X, .) ± DZ β h (X, .) + β h (DX Z, .) . 3 f
(75)
The sign here is chosen as in (72) and use has been made of the identities h (h (X, .) .) = X ,
X h(Z, X) − Z = h (X, Z) h (X, .) .
4. Conformal Geodesics on the Schwarzschild Space-time For our discussions various forms of the Schwarzschild line element will be needed. Its standard form with mass m is given for r¯ > 2 m by 2 m −1 2m d r¯ 2 − r¯ 2 d σ 2 . (76) g˜ = 1 − d t2 − 1 − r¯ r¯ In terms of the retarded and advanced null coordinates w = t − (¯r + 2 m log(¯r − 2 m)),
v = t + (¯r + 2 m log(¯r − 2 m)),
(77)
the line element (76) is obtained in the forms 2m 2m g˜ = 1 − dw 2 + 2 dw d r¯ − r¯ 2 dσ 2 , g˜ = 1 − dv 2 − 2 dvd r¯ − r¯ 2 dσ 2 , r¯ r¯ (78) respectively. These extend analytically into regions where r¯ ≤ 2 m. The retarded null coordinate w extends smoothly through J + and through the past horizon to the past singularity, while the advanced null coordinate v extends smoothly through J − and the future horizon to the future singularity. The transformation 1/2 r¯ t r¯ exp −1 sinh , s= 2m 4m 4m 1/2 t r¯ r¯ −1 cosh , exp ρ= 2m 4m 4m
530
H. Friedrich
applied to (76) with m > 0 produces the Schwarzschild-Kruskal line element 32 m3 r¯ 2 2 2 2 with K = exp − , r¯ = k(ρ 2 − s 2 ), g˜ = K (d s − d ρ ) − r¯ d σ r¯ 2m (79) r¯ where k denotes the inverse of the map ]0, ∞[ r¯ −→ ( 2r¯m − 1) exp( 2m ) ∈ ] − 1, ∞[. This line element extends analytically to a non-degenerate line element on the domain ρ 2 − s 2 > −1. The coordinate r, given for r¯ > 2 m by 1 1 m 2 r¯ = resp. r = (80) r+ r¯ − m + r¯ (¯r − 2 m) , r 2 2
yields the isotropic Schwarzschild line element 1 − 2mr 2 2 m 4 g˜ = d t − 1 + (d r 2 + r 2 d σ 2 ), 2r 1 + 2mr
r>
m . 2
(81)
On the hypersurface {t = 0} it induces initial data which can be extended analytically to an initial data set m 4 (S˜ = R3 \ {0}, h˜ = − 1 + (d r 2 + r 2 d σ 2 ), χ˜ = 0), (82) 2r where χ˜ denotes the second fundamental form. These data may be identified isometrically with the initial data induced by the Schwarzschild-Kruskal metric (79) on the hypersurface {s = 0} by the transformation r − m2 (r + m2 )2 ρ=√ exp . 4mr 2mr In the following discussions one may think of the conformal geodesics as being constructed on the Schwarzschild-Kruskal solution (79) with initial data being prescribed on the hypersurface S˜ = {s = 0}. It will be convenient, however, to specify them in terms of the coordinate r in (82). To discuss the different aspects of these curves we will use those coordinates which will appear to give the simplest formulae. Because of the symmetries s → −s and ρ → −ρ, it will be sufficient to consider the region {s ≥ 0, ρ ≥ 0} if we prescribe data for the conformal geodesics which respect to these symmetries. The data will also be chosen to be spherically symmetric. The function r¯ extends by the first of Eqs. (80) to an analytic function on S˜ which ˜ h) ˜ has two takes its minimum r¯min = 2 m at the “throat” {r = rh ≡ m2 }. The space (S, asymptotically flat ends. In terms of (82) the end i1 at r = ∞ has the “usual” representation, while the end i2 at r = 0 has a coordinate representation adapted to the metric (1 + 2mr )−4 h˜ which may be thought of as a conformal compactification of h˜ at the end i2 . The map m 2 1 , (83) r→ 2 r which corresponds to the isometry ρ → −ρ allows us to show the equivalence of the spatial infinities i1 and i2 . It leaves r¯ invariant and has {r = rh } as its fixed point set.
Conformal Geodesics on Vacuum Space-times
531
4.1. Conformally compactified Schwarzschild-Kruskal data and initial data for a congruence of conformal geodesics. We shall in the following prescribe initial data on S˜ for a congruence of conformal geodesics whose g-adapted ˜ initial vectors x¯ coincide ˜ with the future directed unit normals of S. By r¯∗ will be denoted the restriction to S˜ of the function r¯ on the Schwarzschild-Kruskal space-time and r will be considered as a ˜ For the conformal factor we choose coordinate on S. ∗ =
1 r2 = r¯∗2 (r + m2 )4
(84)
which implies m −4 −2∗ h˜ = r + (d r 2 + r 2 d σ 2 ). 2
(85)
It follows that the transformation r = tan χ2 , 0 < χ < π (and, to get simple equations, µ m 2 = tan 2 ) could be used to make the conformal compactification achieved by ∗ manifest in terms of coordinates, since it realizes an embedding of S˜ into the unit 3-sphere S 3 with the poles at χ = 0, χ = π corresponding to the ends i2 , i1 respectively. The choice (84) is particularly well adapted to the Schwarzschild-Kruskal geometry and implies simple formulae. However, it is important to note that our basic results do not depend on it. For the analysis of the field near space-like infinity other choices might be preferable (cf. [6]), because the metric (85) does not extend smoothly to space-like infinity. We set furthermore b∗ = bˆ∗ = −1 ∗ d ∗ = − and, observing F ≡ 1 − 2 m/¯r∗ = (r − β=
m 2 2 ) (r
+
2 (r − r (r + m −2 2) ,
2 r (r − m2 ) , (r + m2 )3
m 2) m 2)
d r,
(86)
choose (87)
√ −g˜ (bˆ∗ , bˆ∗ ) = r¯2∗ F (¯r∗ ) > 0 near the end i1 in analogy to our procedure on Minkowski space. If h˜ is used to raise indices, bˆ∗ is outward pointing at the ends i1 , i2 , as it should. With the choices above the general formula (59) reduces to 2 ∗ 2 2 (88) −τ , = F (¯r∗ ) β such that β is analytic on S˜ and β =
where functions with subscript ∗ are assumed to be constant along the conformal geodesics. Note that our initial data and gauge conditions are preserved by the isometry (83) and are adapted to the spherical symmetry and the time reflection symmetry. We have β → −β under (83) and β(r) = 0 precisely for r = m2 . This means that conformal geodesics satisfying the initial data above at points with r = m2 will be metric geodesics in the hypersurface {ρ = 0} of the Schwarzschild-Kruskal space-time.
532
H. Friedrich
4.2. The conformal geodesic equations on the Schwarzschild space-time. To discuss conformal geodesics on the Schwarzschild space-time we write the g-adapted ˜ form of the conformal geodesic equation in terms of the line element (76), which can be written in the form (25) with the metric h given by 1 2 2m 2 . h = F dt − d r¯ , F = 1 − F r¯ Initial data for the conformal geodesics will be given on the hypersurface S˜ = {t = 0} ˜ with τ∗ = 0, τ¯∗ = 0 on S. Because of (72) (where due to our conventions we have to use the minus sign), Eqs. (73) take the form F,r¯ 1 r¯ t = β r¯ , F F
(89)
F,r¯ 2 F F,r¯ 2 (¯r ) + (t ) = F β t , 2F 2
(90)
t
+ r¯
−
with β satisfying (56). Assuming on S˜ that the initial vector is the future directed unit ˜ the initial data on S˜ are given by normal to S, t∗ = 0,
r¯∗ > 2 m,
t∗ = √
1 , F (¯r∗ )
r¯∗ = 0,
bˆt∗ = 0,
1 bˆr¯ ∗ = −β(¯r∗ ) √ . F (¯r∗ )
The g-normalization ˜ gives F (t )2 −
1 2 (¯r ) = 1. F
Solving for t > 0 and inserting the result into Eq. (90) gives 1 0 = r¯
+ F,r¯ −β F + (¯r )2 . 2
(91)
(92)
Dividing by the square root and multiplying with r¯ leads to d( F + (¯r )2 − β r¯ )/d τ¯ = 0, whence F + (¯r )2 − β r¯ = γ , (93) √ with the constant of integration given by γ = F (¯r∗ ) − β(¯r∗ ) r¯∗ . It follows that
r¯ = ± (γ + β r¯ )2 − F (¯r ), (94) with a sign which depends on the value of r¯∗ . Given r¯ (τ¯ ), we obtain t (τ¯ ) by integrating (t )2 = (β r¯ + γ )2 F −2 for the given initial conditions. However, the integration of t is in general not particularly interesting, because t does neither extend smoothly through J + nor through the future horizon. To study the extension of the conformal geodesics through J + it is convenient to use the first of the line elements (78). The conformal geodesic equations then read w
−
1 F,r¯ (w )2 = −β w , 2
r¯
+
1 F F,r¯ (w )2 + F,r¯ r¯ w = β (¯r + F w ). 2
Conformal Geodesics on Vacuum Space-times
533
These equations imply for r¯ the equations obtained above. The normalization of the tangent vector reads F (w )2 + 2 w r¯ = 1. Solving for w and requiring w > 0 on S˜ gives 1 1 F + (¯r )2 − r¯ = , (95) w = F F + (¯r )2 + r¯ which has to be integrated with the initial conditions w∗ = −¯r∗ − 2 m log(¯r∗ − 2 m). Thus w(τ¯ ) can be obtained by a simple integration once r¯ (τ¯ ) has been determined. To study the extension through the horizon, it is convenient to use the second of the line elements (78). The conformal geodesic equations then read v
+
1 F,r¯ (v )2 = β v , 2
r¯
+
1 F F,r¯ (v )2 − F,r¯ r¯ v = −β (¯r − F v ). 2
The normalization of the tangent vector gives now F (v )2 − 2 v r¯ = 1, which leads to the equation 1 F + (¯r )2 + r¯ , (96) v = F which has to be integrated with the initial conditions v∗ = r¯∗ + 2 m log(¯r∗ − 2 m). 4.2.1. Conformal geodesics on which r¯ is constant. The explicit solution of the key equation (94) requires the discussion of three different cases. We first consider the borderline solution. A change of the sign in (94) should occur near conformal geodesics along which r¯ √ is constant. Requiring r¯ = 0 in (92) gives 1/2 F = β F . With β given by (87) this condition has the solution √ 5 3± 5 r¯∗ = rˆ ≡ m resp. r = r± ≡ m. 2 4 Inserting r¯ = 0 into (91), using (60) with τ∗ = 0 , τ¯∗ = 0, and observing (84), (87) with r = r+ gives on the conformal geodesics on which r¯ = rˆ , 2 ∗ + β τ τ¯ 25 m log , t= = 4 2 ∗ − β τ F (ˆr ) ∗ (r+ ) = √2 . and thus t → ∞ as τ → τi ≡ 2 β(r +) 5m It follows now from Eqs. (92), (94) that each of the conformal geodesics specified by our data falls into one of the following four classes: (i) the conformal geodesics passing through points with r = m2 , which coincide with metric geodesics in the SchwarzschildKruskal space-time and approach the singularity, (ii) the conformal geodesics passing through points with r = r± , which are tangent to the static Killing field ∂t and approach time-like infinity for the finite value τi of their parameters, (iii) the conformal geodesics passing through points with 0 < r < r− or with r+ < r < ∞, for which r¯ is monotonically increasing, (iv) the conformal geodesics passing through points with r− < r < m2 or with m2 < r < r+ , for which r¯ is monotonically decreasing. Though it will be seen below that the integration procedure for (94) depends on the classes specified above, it is ˜ determine a smooth congruence of conformal clear that our data, which are smooth on S, ˜ geodesics which is free of caustics near S.
534
H. Friedrich
4.2.2. Conformal geodesics on which r¯ is increasing. Because of the symmetry of the data it is sufficient to discuss the case r+ < r < ∞. Under this assumption the “+” sign holds in (94). The radicand in this equation factorizes as (γ + β r¯ )2 − F (¯r ) =
β2 (¯r − r¯∗ ) (¯r − α+ ) (¯r − α− ). r¯
√ Because γ = − F (¯r∗ ), it follows that α± = ± α
with α =
(97)
m r¯∗ . 2 F (¯r∗ )
Since α < r¯∗ and α → rˆ as r¯∗ → rˆ , the polynomial on the right-hand side of (97) has under our assumption three different zeros while in the limit above two zeros will coincide and the integration procedure needs to be changed. The use of the notation t≡
(α + r¯ ) r¯∗ (α + r¯∗ ) r¯
1 2
, k≡
1 21 √ 2 m α 1 1 1+ 1+ = , e ≡ 2 k, 2 r¯∗ 2 2 r¯∗ − 4 m
in (94) gives, after an integration, 1 e
2 2 (e − 1) (e2 − k 2 ) β τ¯ = e
√ α
1+ r¯ 1
(1 − e2 t 2 )
dt , (1 − t 2 ) (1 − k 2 t 2 )
√
where, by our assumptions, 1 < e < 2, √1 < k < 1. It is well known how to express 2 this elliptic integral of the third kind in terms of theta functions (for all manipulations involving elliptic integrals, elliptic functions, theta functions, etc. we refer to [10]). The substitution t = sn(u, k) with Jacobi’s elliptic function sn yields 2 2 β τ¯ = (e − 1) (e2 − k 2 ) e
u K
= −2 Z(c) (u − K) + log
dv (1 − e2 sn2 v)
π (u + c) 2πK θ1 2 K (u − c)
θ1
(98)
.
Here Z denotes Jacobi’s zeta function Z(u) =
∞ π u 2 π u d q 2n−1 log θ4 = sin π du 2 K K K 1 − 2 q 2n−1 cos(π n=1
. u 4n−2 K)+q
It is an analytic function on the real line with Z(u) > 0 for 0 < u < K and Z(0) = Z(K) = 0. The theta functions are given in their product representations by 1
θ1 (z) = 2 q 4 sin z
∞ n=1
(1 − q 2 n ) (1 − 2 q 2 n cos 2 z + q 4 n ),
Conformal Geodesics on Vacuum Space-times
1
θ2 (z) = 2 q 4 cos z
∞
535
(1 − q 2 n ) (1 + 2 q 2 n cos 2 z + q 4 n ),
n=1
θ4 (z) =
∞
(1 − q 2 n ) (1 − 2 q 2 n−1 cos 2 z + q 4 n−2 ),
n=1
with q = exp(−π kind
K K
). The constant K > 0 is the complete elliptic integral of the first
1
K(k) = 0
dt (1 − t 2 ) (1 − k 2 t 2 )
,
K = K(k ) with k 2 = 1 − k 2 , and c is the (unique) number in the open interval ]0, K[ satisfying sn (c, k) = 1/e. With τ∗ = 0, τ¯∗ = 0, (84), and (87) we get from Eq. (60) β τ¯ = − log g(τ ) with
g(τ ) ≡
2 ∗ − β τ . 2 ∗ + β τ
(99)
Setting now x = K − u to exhibit the symmetries of the solution, we obtain π (c−x) π (c+x) x Z(c) − θ e e−x Z(c) θ 2 2K 2K 2 ∗ 2 τ = G(x, k) ≡ . β θ π (c−x) ex Z(c) + θ π (c+x) e−x Z(c) 2
2
2K
2K
Here we consider 2 β∗ as a function of k, which makes sense because ∂k/∂r < 0 for r ∈]r+ , ∞[. Since the denominator of the function G is positive in an open neighbourhood of the closed interval [−(K − c), K − c], the function G is analytic there. While τ¯ → ∞ as u → c and τ¯ → −∞ as u → 2 K − c by (98), we have now τ → ± 2 β∗ , whence τ¯ → ±∞, as x → ±(K − c). It follows from the ODE satisfied by r¯ that we can solve the equation above for x(τ ) with τ ∈ ] − 2 β∗ , 2 β∗ [. A direct calculation shows that G (x, k) > 0 at x = ±(K − c). This implies that x(τ ) extends as an analytic function into an open neighbourhood of [− 2 β∗ , 2 β∗ ]. Inserting now x(τ ) into the equation t = sn(K − x, k), we finally get r¯∗ 2 (1 − k 2 ) k 2 sn2 (x(τ )) =1− . r¯ 2 k 2 − 1 1 − k 2 sn2 (x(τ ))
(100)
In the limit when r → r+ we have k → 1 and thus sn(u, k) → tanh u. The formula (100) thus suggests that the solution r¯ (τ, r) of (100) approaches the constant solution r¯ = rˆ in that limit, as√ we know √ already by general arguments. However, since K → ∞ while c → 1/2 log(( 2 + 1)/( 2 − 1)) as k → 1, the precise behaviour of the solution in that limit requires a quite careful discussion involving also the behaviour of G(x, k). The right-hand side of (100) is an analytic function of τ in an open interval containing [−τscri (r), τscri (r)], with τscri (r) = 2 β∗ = (r+ m )r(r− m ) . It vanishes precisely at the 2
2
536
H. Friedrich
points τ = ±τscri at which the conformal factor (88) vanishes. Furthermore, it depends analytically on r for r+ < r < ∞. It follows that r¯ (r, τ ) (r, τ ) is positive and analytic for
r ∈]r+ , ∞[, τ ∈ [−τscri (r), τscri (r)]. (101)
Using the coordinate z = 1r¯ in the first of the line elements (78) and rescaling with the conformal factor = z gives the smooth conformal representation 2 g˜ = z2 (1 − 2 m z) d w2 − 2 d w d z − d σ 2 ,
(102)
of the Schwarzschild metric which extends analytically through future null infinity, given here by J + = {z = 0}. Integrating (95) with r¯ given by (100) and writing the solution in terms of z, we obtain our conformal geodesics in the form τ → (w(τ, r), z(τ, r)).
(103)
From the discussion above it is clear that the first of the functions on the right-hand side is an analytic function of both variables for r ∈ ]r+ , ∞[, τ ∈ [−τscri (r), τscri (r)]. We show that this holds true also for the second function. It is clear that the function w is ˜ Parametrizing w in terms of z, we obtain from (94), (95), analytic near S. −1
dw = − β 2 (1 − r¯∗ z) (1 − α 2 z2 ) + (β + γ z) β 2 (1 − r¯∗ z) (1 − α 2 z2 ) . dz The assertion now follows because the function on the right-hand side is analytic for z in an open interval of the form ] − , r¯1∗ [ with some > 0. It follows in particular that there exists a smooth function w(r), ˆ with r ∈ ]r+ , ∞[, such that w(τ, r) → w(r) ˆ as τ → τscri . We show that w(r) ˆ →∞
as r → r+ ,
w(r) ˆ → −∞
as
r → ∞.
The first assertion follows from the observation that w = −ˆr − 2 m log(ˆr − 2 m) +
25 2 ∗ + β τ m log , 4 2 ∗ − β τ
along the conformal geodesic with r¯ = rˆ and the fact that the solutions are jointly smooth in the initial data and the parameter. The second assertion follows from a comparison of w with the solution to −1 2 1 α α du + , = − β 2 (1 − r¯∗ z) 1 − (1 − r¯∗ z) 1 − ( )2 dz r¯∗ 2 r¯∗ which satisfies u = w at z = r¯1∗ . Since 0 ≥ dd wz ≥ dd uz for z ∈ [0, r¯1∗ ], it follows that the value u(r) ˆ of u at z = 0 gives an upper estimate for w(r). ˆ Since the direct integration gives u(r) ˆ with u(r) ˆ → −∞ as r → ∞, the assertion follows.
Conformal Geodesics on Vacuum Space-times
537
4.2.3. Conformal geodesics on which r¯ is decreasing. It is sufficient to discuss the case m 2 < r < r− . Now (94) must hold with the “−” sign. The function α in the factorization (97) then satisfies r¯∗ < α and α → ∞ as r¯∗ → m2 while α → rˆ as r¯∗ → rˆ . If we set t≡
1 1 α 2 1+ , 2 r¯
k≡
1 2
− 1 2 α 1+ , r¯∗
e≡
√
2,
such that e > 1, 0 < k < 1, the integration of (94) yields √ α 1 e 1+ r¯ dt β τ¯ = 2 (2 − k 2 ) . (t 2 e2 − 1) (t 2 k 2 − 1) (t 2 − 1) 1 k
With the substitution k sn(u, k) = 1/t and c ∈ ]0, K[ such that sn(c, k) = 1/e we get π (u+c)
K 2 θ 2 4 2K 2−k sn v . β τ¯ = k 2 d v = −2 Z(c) (u − K) + log 2 k π (u−c) 2 2 1 − sn v θ u
4
2
Setting x = K − u and observing (99), we finally obtain x 2v 2
2 − k 1 − sn 2 ∗ tanh k 2 d v . τ = H (x, k) ≡ β 2 2 − k 2 (1 + sn2 v)
2K
(104)
0
Again we consider 2 β∗ as a function of k, which makes sense because ∂k/∂r > 0 for r ∈ ] m2 , r+ [. The function H (x) is real analytic on R with H (x) ≥ 0 and H (x) = 0 iff x = (2 m+1) K with m ∈ Z. Thus (104) can be solved to obtain an analytic function x(τ ) which maps the interval ] − τs , τs [, with τs ≡ H (K, k) < 2 ∗ /β, diffeomorphically onto ] − K, K[. The solution can then be written r¯ = r¯∗
(2 − k 2 ) (1 − sn2 (x(τ ))) . 2 − k 2 (1 + sn2 (x(τ )))
(105)
From the discussion above it follows that r¯ (τ ) → 0 and d r¯ /d τ → ∓∞ as τ → ±τs . Thus in this case there does not exist a smooth (though a continuous) extension of r¯ (τ ) beyond its physical domain. The limit r → r+ implies k → 1. In particular, it follows from (104), (105) that the right-hand side of (105) goes in this limit to rˆ for constant τ and τs → τi . The limit r → m2 implies k → 0. Since β = 0 at r = m2 , the expression for τs appears to give a nonsensical result at that point. However, we have 2 ∗ k 2 /β → 2/m. Expanding the right-hand side of (104) and observing that K(0) = π/2 thus gives τs → π/4 m as r → m/2, consistent with the fact that the conformal geodesics approach in the limit metric geodesics with length τ¯ = ∗ (¯r∗τs=2m) = m π . To follow the conformal geodesics through the horizon, one has to integrate for given r¯ (τ ) the equation γ + β r¯ − (γ + β r¯ )2 − F (¯r )
v = , (106) F (¯r )
538
H. Friedrich
whose right-hand side defines a positive analytic function of r¯ for r¯ > 0. The conformal geodesics are then obtained in the form τ → x(τ, r) = (v(τ, r), r¯ (τ, r)),
(107)
where the function on the right-hand side are analytic for (τ, r) with r ∈ ] m2 , r+ [, τ ∈ [0, τs (r)[. 4.3. Analytic coordinates covering the Schwarzschild-Kruskal space-time and its null ˜ drag these coordinates along infinity. If we set now y 1 = r, y 2 = φ, y 3 = θ on S, with the congruence of conformal geodesics constructed in the previous section, and set ˜ The purpose of the following y 0 = τ , we obtain a smooth coordinate system near S. discussion is to establish the following result (we ignore here the coordinate singularity arising from the use of coordinates on S 2 , which can easily be removed): The conformal Gauss system y ν defines a smooth global coordinate system on the Schwarzschild-Kruskal space-time which extends smoothly to null infinity. It remains to be shown that the conformal geodesics of the congruence do not develop caustics. Because of the symmetry of the functions (79) and of our family of curves under s → −s, ρ → −ρ, it is sufficient to control the behaviour of the congruence on the subset {s ≥ 0, ρ ≥ 0} of the Schwarzschild-Kruskal space-time. Since the coordinates y µ ˜ they are known to be regular in particular near the set {s = 0, ρ = 0}. are regular near S, Therefore we can work with the coordinates w, r¯ or z, r¯ or v, r¯ respectively. Using the functions on the right-hand sides of (103) and (107), we can in principle express the corresponding line elements in terms of the coordinates y µ . Though the resulting metric coefficients will be smooth where (103) and (107) are analytic, the Jacobian of the transformation may drop rank at various places and we may end up with a degenerate representation of the metric. Having available the explicit solution, we could try to check this by a direct calculation. Since the explicit calculation will be tedious and, in particular, because this method will not apply to more general cases, we prefer to employ an argument which is similar to the analysis of the behaviour of metric geodesics congruences in terms of the Jacobi equation. Consider the transformation provided by (107). We need to show that the tangent vector field x˙ = −1 x¯ = −1 X and the connecting vector field x,r = x¯,r − x˙ τ,r = Z − x˙ τ,r are linearly independent on their domain of definition. In terms of the 2-form (71) (with h = F d v 2 − 2 d v d r¯ , x 0 = v, x 1 = r¯ ) this can be expressed as the requirement that the invariant −1 h (X, Z) does not vanish. In the case of (103) the nondegeneracy up to and in fact beyond null infinity would follow if it could be shown that −(¯r )2 (w˙ z,r − z˙ w,r ) = 0
for r∗ ∈ ]r+ , ∞[, τ ∈ [0, τscri (r)].
If z is replaced by r¯ again, this condition translates in terms of x(τ, r) = (v(τ, r), r¯ (τ, r)) and (71) (with h = F d w2 + 2 d w d r¯ , x 0 = w, x 1 = r¯ ) into the requirement that ˙ x,r ) = h (X, Z), 0 = 2 h (x, in the domain given above. Here the factor is of course most significant because it vanishes at τscri .
Conformal Geodesics on Vacuum Space-times
539
The proof that these requirements are met by our transformations relies on a differential equation satisfied by h (X, Z). To make use of (75), we observe that the various representations of the Schwarzschild metric with warping function f = r¯ and second metric k = −dσ 2 give the value c = 21 f 2 DA D A f = −m for the constant in (69). It follows DX h (X, Z) = h (DX X, Z) + h (X, DX Z) = h (−β (X, .) , Z) + h (X, DX Z) = −β h(X, Z) + h (X, DX Z), and similarly DX DX h (X, Z) = β 2 h (X, Z) − 2 β h(X, DX Z) + h (X, DX DX Z) 2m 2 = β + 3 h (X, Z) + DZ β, r¯ where in the second equation DX Z = DZ X, h(X, X) = 1, and (75) have been used. The invariant h (X, Z) thus satisfies the ODE DX DX h (X, Z) − (β 2 + k) h (X, Z) = DZ β, with k =
2m , r¯ 3
and on S˜ the initial conditions
h (X, Z)∗ =
(r + m2 )2 > 1, r2
(DX h (X, Z))∗ = 0
for
r≥
m . 2
The quantity DZ β, which is constant along the conformal geodesics, is given by DZ β = β,r = −2 It vanishes at r˘± ≡
√ 2± 3 2
DZ β > 0
r2 − 2 m r + (r + m2 )4
m2 4
.
m, where r¯ (˘r± ) = 3 m > rˆ , and satisfies
for
m ≤ r < r˘+ , 2
DZ β < 0
for
r˘+ < r.
A lower estimate for h (X, Z) can be obtained as follows. Denote by u and v the solutions to the ODE problems w
− (β 2 + k) w = f,
w(0) = 1,
w (0) = 0,
where f = 0 in the case of u and f = −1 in the case of v. Then u is strictly increasing with u ≥ cosh(β τ¯ ) for τ¯ ≥ 0, and h (X, Z) can be given in the form v (108) h (X, Z) = u h (X, Z)∗ + (1 − ) β,χ . u Since (u − v)
− (β 2 + k) (u − v) = 1 and the function u − v has vanishing initial conditions at τ¯ = 0, it follows that u ≥ v for τ¯ ≥ 0. Since v changes its sign, a further estimate is needed. We derive a representation of v in terms of u. Since u > 0 there exists a function f with v = f u. The ODE’s satisfied by u and v imply for f the equation f
= −2
u 1 f − , u u
540
H. Friedrich
which has because of the initial conditions for u and v the solution
τ
τ¯ 1 u dτ
dτ . f =1− u2 0 0 Since u is strictly increasing, it follows that v 0≤1− = u
τ¯
0
τ¯
≤2
1 u2
τ
u dτ
τ¯
dτ ≤
0
0
τ e−β τ dτ =
0
1 u τ dτ ≤ u2
2 (1 − (β τ¯ + 1) e−β τ¯ ) β2
τ¯ 0
τ dτ cosh(β τ )
for
τ¯ ≥ 0,
which implies 0≤1−
2 v ≤ 2. u β
(109)
A direct calculation gives −1 < 2 β −2 DZ β < 0 for r > r˘+ . It follows that v β,χ ≥ h (X, Z)∗ h (X, Z)∗ + 1 − u
for
m < r ≤ r˘+ , 2
m r∗ + v h (X, Z)∗ + 1 − β,χ ≥ h (X, Z)∗ − 1 = u r∗2
m2 4
for
r˘+ < r.
Since (62) gives under our assumptions = (108) implies that for given r >
m 2
∗ cosh2 ( β2 τ¯ )
(110)
,
there is a constant c > 0 such that
h (X, Z) ≥ ∗ c
cosh(β τ¯ ) cosh2 ( β2 τ¯ )
≥ ∗ c.
In the region covered by (107) it suffices of course to get a lower estimate for h (X, Z), because is positive where the conformal geodesics approach the singularity. On the curves with r¯ = rˆ , which separate the domains where (103) and (107) are valid, c2 ≡ β 2 + k with c = const. > 0 and thus h (X, Z) = (h (X, Z)∗ cosh(c τ¯ ) + c−2 DZ β (cosh(c τ¯ ) − 1) ≥ ∗ h (X, Z)∗ > 0. Since β = 0 but k ≥ k∗ = (2m)−2 , DZ β = m−2 on the curves starting with r = follows that = ∗ = (2m)−2 and a similar result is obtained.
m 2,
it
Conformal Geodesics on Vacuum Space-times
541
4.4. Global numerical evolution for a class of standard Cauchy data. The methods described above offer a possibility to study in detail the complete numerical evolution of three-dimensional space-times without symmetries which are determined by certain asymptotically flat Cauchy data. In [2] and more recently in [1] the existence of smooth, time-symmetric, asymptotically flat solutions of the vacuum constraint has been shown which coincide with certain given time-symmetric initial data on compact sets and with Schwarzschild data in a neighbourhood of space-like infinity. If these data can be constructed numerically, it is easy to determine numerically hyperboloidal initial data implied by the Cauchy data. Consider the time symmetric initial data set (82). The set S˜ can be embedded by the transformation r = tan χ2 into the 3-dimensional standard unit sphere S = S 3 . With the conformal factor S∗ = 2 (1 + r 2 )−1 (1 + 2mr )−2 = 2 cos2 χ2 (1 + m2 cot χ2 )−2 , the rescaled metric (S∗ )2 h˜ = −(d χ 2 + sin2 χ d σ 2 ) coincides with the restriction of the standard metric on S 3 to the set S \ {i0 , iπ }, where i0 = {χ = 0}, iπ = {χ = π }. If we set m = 0, we get Minkowski data and conformally compactified Minkowski data respectively. For given χ0 with π2 < χ0 < π, let ξ be a smooth function on R such that ξ(x) = 0 for x ≤ π2 , ξ(x) = 1 for x ≥ χ0 and such that ξ ≥ 0. We define ψ ∈ C ∞ (] − ∞, π [) by setting it equal to 1 for x ≤ π2 and equal to 1 + ξ(x) (cot x2 − 1) for π2 < x < π . Then ψ ≤ 0 and 21 ≤ M ≡ supx<π dd ψx (x) < ∞. Suppose we are given time symmetric initial data on R3 which agree with Schwarzs1 ˜ suitably child data of mass m < M near space-like infinity and are such that the metric h, written in terms of the coordinates φ, θ, χ on S, satisfies h˜ = −
(1 + m2 cot χ2 )4 (d χ 2 + sin2 χ d σ 2 ) 4 cos4 χ2
for
χ > χ0 .
(111)
The conformal factor ∗ =
2 cos2 χ2 , (1 + m2 ψ(χ ))2
(112)
is smooth on S \ {iπ }, coincides for χ ≥ χ0 with S∗ , has non-vanishing gradient in S \ {i0 , iπ } and goes to zero at iπ . It defines a conformal compactification of the data such that h = 2∗ h˜ coincides with the standard metric of the 3-sphere for χ > χ0 . We choose initial data b∗ for the 1-form field which annihilate the normals of the initial hypersurface and satisfy b∗ = −1 ∗ d ∗ in S \ {iπ }. Since the time evolution of the data will be Schwarzschild near iπ , it can be determined there by the methods described above. It is clear that we can construct a smooth hyperboloidal hypersurface, which coincides with S on the the set {χ ≤ χ0 } and extends to the future null infinity of the Schwarzschild part. It should not be difficult to determine the corresponding initial data for the conformal field equations, possibly by a numerical integration (as shown in [6], this reduces to solving a system of ODE’s). Since there are codes available to evolve such data numerically (cf. [3, 9] and the references given there) we could in principle calculate their evolution in time. We note that it is possible to extend these data smoothly, as solutions to the conformal constraint equations, across null infinity. This fact offers enormous simplifications in the codes employed in [3 and 9].
542
H. Friedrich
If data satisfying (111) for a fixed χ0 can be constructed for sufficiently small mass m, as suggested in [1], they will be close to corresponding Minkowskian data and the result of [4] suggests that they will evolve into solutions possessing complete past and future null infinities and regular points i ± at future and past time-like infinity. If we use a gauge in which b∗ picks up on S \ {iπ } a suitable component in the direction of the tangent vectors of the conformal geodesics, it will be possible to construct a gauge of the type considered above which smoothly covers the complete future of the initial hypersurface as well as J + ∪ {i + }. The work in [9] suggests that such solutions can be calculated numerically in their entirety. The numerical calculation of such space-time is, of course, not an end in itself. However, such calculations allow one to perform detailed tests of codes under completely controlled assumptions and to study the robustness of the codes and of the gauge conditions by choosing data with an increasing value of the mass which eventually yields the time evolution of a collapsing gravitational field. 5. Concluding Remarks We have described in detail the construction of a global system of conformal Gauss coordinates on the Schwarzschild-Kruskal solution which extend smoothly and without degeneracy through null infinity. Furthermore, we have shown that the conformal factor naturally associated with this system defines a smooth conformal extension of the Schwarzschild-Kruskal space-time which gives to null infinity a finite location in the new coordinates which is determined by our choice of initial data. We did not try to work out in detail the behaviour of the fields in the conformal extension constructed here. An analysis of the fields near space-like infinity can be found in [6]. The behaviour of the fields near time-like infinity, which is of considerable interest for the numerical calculation of space-times, has still to be investigated. There is a property of the Gauss system which we only indicated but which may turn out to be quite important. While the regularity of a conformal Gauss system is essentially decided by its underlying conformal geodesics considered as point sets, there always exists a huge class of different time slicings based on one and the same underlying congruence of conformal geodesics. The consequences of this freedom still have to be explored. Of course, there are many more such coordinate systems. It would be interesting to see whether the initial data for congruences of conformal geodesics which lead to such coordinates can be characterized in a general way. References 1. Chru´sciel, P.T., Delay, E.: Existence of non-trivial, vacuum, asymptotically simple space-times. Class. Quantum Grav. 19, L71–L79 (2002) 2. Corvino, J.: Scalar curvature deformation and gluing construction for the Einstein constraint equations. Commun. Math. Phys. 214, 137–189 (2000) 3. Frauendiener, J.: Numerical treatment of the hyperboloidal initial value problem for the vacuum equations. III. On the determination of radiation. Class. Quantum Grav. 17, 373–387 (2000) 4. Friedrich, H.: On the existence of n-geodesically complete or future complete solutions of Einstein’s field equations with smooth asymptotic structure. Commun. Math. Phys. 107, 587–609 (1986) 5. Friedrich, H.: Einstein equations and conformal structure: Existence of Anti-de-Sitter-type spacetimes. J. Geom. Phys. 17, 125–184 (1995) 6. Friedrich, H.: Gravitational fields near space-like and null infinity. J. Geom. Phys. 24, 83–163 (1998) 7. Friedrich, H., K´ann´ar, J.: Bondi-type systems near space-like infinity and the calculation of the Newman-Penrose constants. J. Math. Phys. 41, 2195–2232 (2000)
Conformal Geodesics on Vacuum Space-times
543
8. Friedrich, H., Schmidt, B.: Conformal geodesics in general relativity. Proc. Roy. Soc. A 414, 171–195 (1987) 9. H¨ubner, P.: From now to timelike infinity on a finite grid. Class. Quantum Grav. 18, 1871–1884 (2001) 10. Lawden, D.F.: Elliptic functions and applications. Berlin: Springer, 1989 11. Ogiue, K.: Theory of conformal connections. Kodai Math. Sem. Rep. 19, 193–224 (1967) Communicated by H. Nicolai
Commun. Math. Phys. 235, 545–563 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0793-9
Communications in
Mathematical Physics
New Solutions of Einstein Equations in Spherical Symmetry: The Cosmic Censor to the Court Roberto Giamb`o1 , Fabio Giannoni1 , Giulio Magli2 , Paolo Piccione1,∗,∗∗ 1 2
Dipartimento di Matematica e Informatica, Universit`a di Camerino, 62032 Camerino, Italy. E-mail: [email protected]; [email protected]; [email protected] Dipartimento di Matematica, Politecnico di Milano, 20133 Milano, Italy. E-mail: [email protected]
Received: 4 April 2002 / Accepted: 11 November 2002 Published online: 18 February 2003 – © Springer-Verlag 2003
Abstract: A new class of solutions of the Einstein field equations in spherical symmetry is found. The new solutions are mathematically described as the metrics admitting separation of variables in area-radius coordinates. Physically, they describe the gravitational collapse of a class of anisotropic elastic materials. Standard requirements of physical acceptability are satisfied, in particular, existence of an equation of state in closed form, weak energy condition, and existence of a regular Cauchy surface at which the collapse begins. The matter properties are generic in the sense that both the radial and the tangential stresses are non-vanishing, and the kinematical properties are generic as well, since shear, expansion, and acceleration are also non-vanishing. As a test-bed for cosmic censorship, the nature of the future singularity forming at the center is analyzed as an existence problem for o.d.e. at a singular point using techniques based on comparison theorems, and the spectrum of endstates – blackholes or naked singularities – is found in full generality. Consequences of these results on the Cosmic Censorship conjecture are discussed.
1. Introduction The final state of gravitational collapse is an important open issue of classical gravity. It is, in fact, commonly believed that a collapsing star that it is unable to radiate away – via e.g. supernova explosion – a sufficient amount of mass to fall below the neutron star limit, will certainly and inevitably form a black hole, so that the singularity corresponding to diverging values of energy and stresses will be safely hidden – at least to faraway observers – by an event horizon. However, this is nothing more than a conjecture – what Roger Penrose first called a “Cosmic Censorship” conjecture [24] – and has never been proved. On the contrary, in the last twenty years of research, many analytic examples of ∗ ∗∗
On leave from Departamento de Matem´atica, Universidade de S˜ao Paulo, Brazil Author partially sponsored by CNPq (Brazil), Grant No. \ 200615/01-7.
546
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
(spherically symmetric) naked singularities satisfying the principles of physical reasonableness have been discovered. Spherically symmetric naked singularities can be divided into two groups: those occurring in scalar fields models [3, 4] and those occurring in astrophysical sources modeled with continuous media, which are of exclusive interest here (see [11] for a recent review). The first (shell focusing) examples of naked singularities were discovered numerically by Eardley and Smarr [9] and, analytically, by Christodoulou [2], in the gravitational collapse of dust clouds. Since then, the dust models have been developed in full detail: today we know the complete spectrum of endstates of the gravitational collapse of spherical dust with arbitrary initial data [16]. This spectrum can be described as follows: given the initial density and velocity of the dust, a integer n can be introduced, in such a way that if n equals one or two, the singularity is naked, if n ≥ 4 the singularity is covered, while if n = 3 the system undergoes a transition from naked singularities to blackholes in dependence of the value of a certain parameter. If the collapse is marginally bound, the integer n is the order of the first non-vanishing derivative of the energy density at the center. The dust models can, of course, be strongly criticized from the physical point of view. Since from one side very few results are known on perfect (i.e. isotropic) fluids and, on the other side, stresses have to be expected to be anisotropic in strongly collapsed objects, one of us initiated some years ago a program whose objective was to understand what happens if the gravitational collapse occurs in the presence of only one non-vanishing stress, the tangential one [20, 21]. The program can be said to have been completed: we know the complete spectrum of the gravitational collapse with tangential stresses in dependence of the data both for the clusters [10, 12] and for the continuous media models [8]. The results of this program are somewhat puzzling. In fact, what one should expect on physical grounds is that the equation of state plays a relevant role in deciding the final state of the collapse. Well, it is only apparently so. In fact, also in the tangential stress case a parameter n can be defined, in such a way that the endstates of collapse depend on n exactly in the same way as in dust. In the present paper we present a complete, new model of gravitational collapse which includes both radial and tangential stresses. This is done deriving a class of anisotropic solutions which is in itself new, and contains the dust and the tangential stress metrics as special cases. As far as we are aware, this is the first time that the spectrum of the endstates of a solution satisfying all the requirements of physical reasonability and exhibiting both radial and tangential stresses is found. As will be discussed in the concluding section, in view of the results of the present paper, the case for a cosmic censor – at least in spherical symmetry – becomes very weak. 2. Einstein Field Equations in Area–Radius Coordinates Consider a spherically symmetric collapsing object. The general line element in comoving coordinates can be written as ds 2 = −e2ν dt 2 + (1/η) dr 2 + R 2 (dθ 2 + sin2 θ dφ 2 )
(1)
(where ν, η and R are functions of r and t). We shall use a dot and a prime to denote derivatives with respect to t and r respectively. In the present paper we consider as admissible sources of the gravitational field only those matter models which admit a well-defined thermodynamical description in terms of the standard relativistic mechanics of continua. We recall that the physical properties
The Cosmic Censor to the Court
547
of an elastic material in isothermal conditions can be described in terms of one state function w, which depends on three parameters (the so-called strain parameters) and on the coordinates. In the comoving description, the strain tensor of the material can be expressed in a “purely gravitational” way and, as a consequence, the state function depends only on the space-space part of the metric (see e.g. [19] and references therein). In addition, in spherical symmetry, the state function cannot depend on angles, and therefore the equation of state of a general spherically symmetric material can be given as a function of r and of two strain parameters [20]: w = w(r, R, η).
(2)
It is useful also to introduce the matter density √ η ρ= (3) 4πER 2 (where E = E(r) is an arbitrary positive function) so that the internal energy density is given by = ρw. If w depends only on ρ, one recovers the case of the barotropic perfect fluid; in general, however, the stresses are anisotropic and are given by the following stress-strain relations: ∂w ∂w 1 pr = 2ρη , pt = − ρR . (4) ∂η 2 ∂R Comoving coordinates r, t are extremely useful in dealing with gravitational collapse because of the transparent physical meaning of the comoving time. We shall, however, make systematic use here also of another system of coordinates, the so-called area-radius coordinates, which were first introduced by Ori [23] to study charged dust, and then successfully applied to other models of gravitational collapse (see e.g. [10, 21]). These coordinates prove extremely useful for technical purposes, as will be clear below. In these coordinates the comoving time is replaced by R. The velocity field of the material µ ˙ λ and therefore the transformed metric, although v µ = e−ν δt transforms to v λ = e−ν Rδ R non-diagonal, is still comoving. It can be written as ds 2 = −Adr 2 − 2BdRdr − CdR 2 + R 2 (dθ 2 + sin2 θ dϕ 2 ) ,
(5)
where A, B and C are functions of r and R. Obviously 1 1 C= := 2 , (6) λ vλ v u ˙ −ν |. Formula (2) implies that the internal energy w now where we have denoted u = |Re depends on two coordinates (r and R) and on only one field variable η. It is convenient to introduce the quantity 1 := B 2 − AC = 2 > 0 , (7) ηu √ √ √ (so that η = 1/(u ) and B = − + A/u2 ), and to use A, u and as the fundamental field variables. A convenient set of Einstein equations for these is Grr = 8π Trr , r r R GR r = 8π Tr and GR = 8πTR [21]. Denoting partial derivatives with a comma, the first of these equations reads 1 A A 1− −R = −8π pr , (8) 2 R ,R so that A decouples from u and :
548
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
2F (r) 2 − , A= 1− R R where
:= −
R
4πσ 2 pr (r, σ, η(r, σ )) dσ.
(9)
(10)
R0 (r)
In the above formulae, the arbitrary positive function R0 (r) describes the values of R on the initial data and will be conveniently taken to be R0 (r) = r.
(11)
To find the physical meaning of the other arbitrary function F , we introduce the so-called Misner-Sharp mass (r, R). It is defined in such a way that the equation R = 2 spans the boundary of the trapped region, i.e. the region in which outgoing null rays reconverge. This boundary is the apparent horizon of the spacetime, and it can be shown that (see e.g. [14]):
(r, R) = (R/2) 1 − g µν ∂µ R ∂ν R . (12) Transforming to radius-area coordinates and using (9), one easily gets = (R/2) (1 − A/ ) = F (r) +
(13)
and therefore the function F is the value of the Misner-Sharp mass on the data. Since vanishes if the radial stress is zero, when pr = 0 the Misner-Sharp mass is conserved during the evolution (i.e. independent on R). This is the main technical reason which makes the spacetimes with vanishing radial stresses simpler with respect to general collapse models. Using (9) it can be shown that the two remaining field equations can be written as follows: ∂w 2 E ,r = w + 2η u2 + 1 − (14) ∂η R and √ ( ),R = −
1 u2 + 1 −
2 R
1 . u ,r
(15)
3. The General Solution Admitting Separation of Variables 3.1. The solution. Let us consider the system of two coupled PDE’s for and u (14)– (15). If does not depend on the field variables, Eq. (14) becomes algebraic and the system decouples. Functional properties of are related to the choice of w and in fact,
will be independent of η if pr is. This obviously happens if pr is zero (leading to the already well known cases of dust and of vanishing radial stresses, for which w does not depend on η) but also if pr is some function of r and R only. This in turn occurs if w is of the form 1 w(r, R, η) = h(r, R) + √ l(r, R), η
(16)
The Cosmic Censor to the Court
549
the case pr = 0 corresponding to l = 0. If l is non-zero the radial pressure is non-vanishing pr = −
1 l(r, R) 4πER 2
(17)
so that
=
1 E
R
l(r, σ ) dσ.
(18)
r
Using (16) and (13), Eq. (14) gives E ,r = h u2 + 1 −
2 , R
which allows us to compute the quantity u2 : u2 (r, R) = −1 +
2 + R
E,r h
2
As a consequence, (15) becomes a quadrature which allows the calculation of √ =−
(19)
. √
:
1 ∂H (r, σ ) dσ, Y (r, σ ) ∂r
(20)
R , 2(r, R) + R(Y 2 (r, R) − 1)
(21)
where we have defined
H(r, R) =
Y (r, R) =
E(r),r (r, R) . h(r, R)
(22)
It is convenient to eliminate the indefinite integral in (20). For this aim, the following equation is useful: √ Y (r, R) = R (r, t (r, R)) η(r, t (r, R));
(23)
this can be easily found from the relations between the metrics coefficients in the comoving and the√area–radius coordinates. Using (7), (23) and the fact that R = 1 (r,r) for R = r, we get (r, r) = H Y (r,r) , which fixes the integration constant, to obtain √ (r, R) =
r
R
1 ∂H H(r, r) (r, σ ) dσ + . Y (r, σ ) ∂r Y (r, r)
(24)
550
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
3.2. Physical requirements. In this section we will discuss the conditions that have to be imposed on the equation of state and on the data, in order for the solutions to be physically meaningful. From now on, we consider only spacetimes whose matter source satisfies the equation of state (16). For such spacetimes, using (13) and (18), the constitutive function w may be written as ,r (r, R) ,R (r, R) + , (25) w(r, R, η) = E(r) √ Y (r, R) η therefore the energy density = ρw and the stresses (4) are given by: √ η 1 ,r + ,R , = 4πR 2 Y ,R pr = − , 4πR 2 √ η ,rR ,r ∂Y ,RR − 2 + √ . pt = − Y Y ∂R η 8πR
(26) (27) (28)
From now on we choose as independent arbitrary functions and Y . In the present paper, we assume C k regularity of the data: the equation of state and the arbitrary functions are assumed to be Taylor-expandable up to the required order. Due to Eqs. (26), (27) and (28), it will then be possible to translate all the conditions which the metrics have to satisfy in order to be physically meaningful (like e.g. weak energy condition or regularity of the center up to singularity formation, see below), in terms of conditions on the functions and Y or on their derivatives. We first impose the weak energy condition (w.e.c.). In spherical symmetry, this condition is equivalent to ≥ 0, + pr ≥ 0, + pt ≥ 0. These inequalities lead to R R ,r , ,R ≥ ,RR . (29) ,r ≥ 0, ,R ≥ 0, ,r ≥ Y 2 Y ,R 2 For an explicit solution satisfying the weak energy condition we refer to the example in 3.5 below. We also impose regularity of the metric at the center (“local flatness”). In comoving coordinates r, t this amounts to require R(0, t) = 0,
eλ(0,t) = R (0, t).
(30)
In addition, the stress tensor must be isotropic at r = 0, that is pr (0, t) = pt (0, t),
(31)
for any regular t = const hypersurface. Finally, we require the existence of a regular Cauchy surface (t = 0, say) carrying the initial data for the fields. These requirements are fundamental, since they assure that the singularities eventually forming will be a genuine outcome of the dynamics. We have already chosen R = r at t = 0; using (26), (23) and the fact that R (r, 0) = 1 we find the expression for the initial energy density 0 (r): 0 (r) =
,r (r, r) + ,R (r, r) . 4π r 2
(32)
The Cosmic Censor to the Court
551
Once the regularity conditions are satisfied, the data will be regular if this function is regular. For physical reasonability we also require the initial density to be decreasing outwards, that is 0 (r) ≤ 0 for r > 0. It is easy to check that the above stated conditions are equivalent, in terms of the functions and Y , to the following: Y (0, 0) = 1, (33) (0, 0) = ,r (0, 0) = ,R (0, 0) = ,rr (0, 0) = ,rR (0, 0) = ,RR (0, 0) = 0, (34)
2 (35) ,r (r, r) + ,R (r, r) < 0. ,rr (r, r) + 2,rR (r, r) + ,RR (r, r) − r Definition 1. We say that a spacetime is a physically valid area-radius separable spacetime (ARS) if 1. the equation of state of the matter is (25), where (r, R) and Y (r, R) are arbitrary positive functions; 2. the weak energy condition, the regularity condition, and the condition of decreasing initial density hold, i.e. (r, R) and Y (r, R) satisfy (29) and (33)–(35).
3.3. Kinematics. Spherically symmetric non-static solutions can be invariantly classified in terms of their kinematical properties (see e.g. [17, 18]). Most of the known solutions have vanishing shear, or expansion, or acceleration. For the solutions studied here all such parameters are generally non-vanishing. The acceleration in comoving coordinates r, t is given by aµ = ν δµr , therefore, it can be uniquely characterized by the scalar A := aµ a µ . Using (23) together with the relation Y ν = R Y,R
(36)
˙ + (which easily follows from the the field equation in comoving coordinates R˙ = Rν ˙ R λ) we get A = Y,R .
(37)
The expansion is given by √ 2 (u ),R + . = ±u √ R u
(38)
µ µ Finally, the shear tensor σν can be uniquely characterized by the scalar σ := (2/3) σν σµν given by u σ =± 3
√ 1 (u ),R − . √ R u
(39)
552
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
3.4. Special classes. Special cases of the solutions discussed above are: 1. The dust (Tolman-Bondi) spacetimes. The energy density equals the matter density, and this implies ,R = 0 and Y = E,r . 2. The general solution with vanishing radial stresses. Vanishing of pr implies ,R = 0 (see (27)), while Y depends also on R (if Y,R = 0 one recovers dust). The properties of these solutions have been widely discussed in [20]. 3. An interesting new subclass is obtained by imposing the vanishing of the acceleration. This subclass corresponds to a choice of the function Y depending on r only. It is well known that acceleration-free perfect fluid models can describe only very special collapsing objects (the pressure must be a function of the comoving time only) while these anisotropic solutions exhibit several interesting features, like e.g. a complete spectrum of endstates [7].
3.5. An explicit example: “Anisotropysations” of Tolman–Bondi–de Sitter spacetime. The investigation of various explicit examples of new solutions within the class presented here, as well as their physical applications, will be presented in a forthcoming paper [7]. We restrict ourself here only to sketch a simple acceleration–free (i.e. Y = Y (r)) case, essentially with the aim of showing explicitly that the new sector of the solutions (w.r. to dust and the tangential stress case) is non-empty. A simple way to produce explicit examples of new collapsing solutions within the equation of state considered here is to choose the mass function as the sum of a function of r only (governing the dust limit) and a function of R only (governing the anisotropic stresses). In order to satisfy the requirements of Definition 1, easy calculations show that the function must have the form r R (r, R) = φ(s)s 2 ds + χ (σ )σ 2 dσ, (40) 0
0
where χ and φ are positive and non-increasing functions. In the particular case in which χ is a constant, the contribution of χ to the energy-momentum tensor is formally the same as that of a cosmological constant, and therefore the spacetime coincides with the so-called Tolman–Bondi–de Sitter (TBdS), describing the collapse of spherical dust. This is actually the unique model of gravitational collapse in the presence of a lambda term which is known so far. In recent years, the description of gravitational collapse in such spacetimes attracted a renewed interest both from the astrophysical point of view, since recent observations of high-redshift type Ia supernovae suggest a non-vanishing value of lambda, and from the theoretical point of view, after the proposal of the socalled Ads-Cft correspondence in string theory. In such a context, the solutions (40) can be used to investigate the effects of stresses by adding higher order terms to the function χ . For instance, choosing (r, R) = αr 3 +
R 3 , 1 + R/R∗
where R∗ is a positive constant, one obtains a model which is TBdS homogeneous and isotropic near the center but becomes anisotropic and inhomogeneous when R is of the order of R∗ (we stress, however, that this is only one possible application of this special sub-class of solutions).
The Cosmic Censor to the Court
553
4. Conditions for Singularity Formation 4.1. Shell crossing and shell-focusing singularities. Due to Eqs. √ (7) and (26), the energy density becomes singular if, during the evolution, R or u vanish. The latter case corresponds to R (r, t) = 0 in comoving coordinates and is called a shell–crossing singularity, since it is generated by shells of matter intersecting each other. Shell-crossing singularities correspond to weak (in the Tipler sense) divergences of the invariants of the Riemann tensor. As a consequence, these singularities are usually considered as “not interesting”, although no proof of extensibility is as yet present in the literature. In any case, we shall concentrate here on the shell-focusing R = 0 singularity, for which no kind of extension is possible. Therefore, we work out the conditions ensuring that crossing singularities do not occur. Since we are considering a collapsing scenario, R < r during evolution. Therefore, √ due to Eq. (24), it is sufficient to require strict positivity of the function (r, 0) together √ with non-increasing behavior of (r, R) w.r. to R. We thus have the following: Proposition 1. In a ARS spacetime the formation of shell–crossing singularities does not occur if ,r (r, R) + Y (r, R) Y,r (r, R) R ≥ 0 for r ≥ 0, R ∈ [0, r], 0
r
1 ∂H H(r, r) (r, σ ) dσ + >0 Y (r, σ ) ∂r Y (r, r)
for r > 0.
(41)
(42)
The above condition, although being only sufficient, allows for a wide class of new metrics. It becomes also necessary in the particular case of dust [13]. Spherically symmetric matter models do not, of course, unavoidably form singularities if stresses are present. One can, in fact, construct models of oscillating or bouncing spheres. In comoving coordinates (r, t), the locus of the zeroes of R(r, t) - if any - defines implicitly a singularity curve ts (r) via R(r, ts (r)) = 0. The quantity ts (r) represents the comoving time at which the shell labelled r becomes singular, and therefore the central singularity forms if lim ts (r) = t∗ ,
r→0+
(43)
is finite (t∗ is positive due to the regularity of the data at t = 0 ). To study the behavior of this limit in dependence of the choice of the data, we use R˙ = −eν u. Integrating along a flow line we get r t= e−ν(r,σ ) H(r, σ )dσ, (44) R
where the initial condition (11) has been used. Up to time reparameterizations we can assume ν(0, t) = 0. Using this fact, together with (36), it can be seen that ν is uniformly bounded and limr→0+ ν(r, σ ) = 0 uniformly for σ ∈ [0, r]. Therefore it is necessary and sufficient to require that, for r near 0, H(r, σ ) is integrable with respect to σ in [0, r] and r 0 < lim H(r, σ )dσ < ∞. (45) r→0+ 0
554
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
For this aim, let us consider the following Taylor expansion centered at the point (0, 0), where we take Definition 1 into account: 2(r, R) + R(Y 2 (r, R) − 1) = 2Y,r (0, 0)r R + 2Y,R (0, 0)R 2 + hij r i R j . k≥3 i+j =k
(46) Changing the variable σ = rτ and recalling the definition of H (Eq. (21)) the integral in (45) becomes √ √ 1 τ r dτ, 0 2Y,r (0, 0)τ + 2Y,R (0, 0)τ 2 + r i+j =3 hij τ j + r ϕ(r) with ϕ(0) = 0. Convergence of this integral to a finite non-zero value requires vanishing of the zero order terms and at least one non-vanishing first order term in the denominator (i.e. (h30 , h21 , h12 ) = (0, 0, 0)). We shall, however, consider only the sub-case in which h30 (which we will call α hereafter) is non-vanishing, since a vanishing α would correspond to a bad-behaved “dust limit” of the solutions, i.e. when the source of the anisotropic stress becomes very weak. For the same reason, α cannot be negative otherwise the weak energy condition would be violated in the same limit. Thus: Proposition 2. In an ARS spacetime shell focusing singularities form if Y,r (0, 0) = Y,R (0, 0) = 0, α > 0.
(47)
From the above discussion, we now define the subclass of ARS spacetimes that we are dealing with hereafter. Definition 2. An ARS spacetime is called collapsing if: 1. Shell-crossing singularities do not form (i.e. and Y satisfy (41), (42)); 2. Shell-focusing singularities form in a finite amount of comoving time (i.e. and Y satisfy (47)).
4.2. The apparent horizon and the nature of the possible singularities. A key role in the study of the nature of a singularity is played by the apparent horizon (see for instance [14]). The apparent horizon in comoving coordinates is the curve th (r) defined by R(r, th (r)) = 2(r, R(r, th (r))). In area-radius coordinates, one has correspondingly a curve Rh (r) and, since ,R (0, 0) = 0, the implicit function theorem ensures that Rh (r) is well defined in a right neighborhood of r = 0 and such that Rh (r) < r. This fact, along with the requirement ,R ≥ 0 coming from the w.e.c. (29) ensures (r, Rh (r)) < (r, r). Now using (44) it is 0 ≤ ts (r) − th (r) =
2(r,Rh )
e 0
−ν
H(r, σ )dσ ≤
(r,r)
e−ν H(r, σ )dσ.
(48)
0
Changing the variable σ = rτ as before and using the fact that ν is bounded and (0, 0) = 0, we find that this integral converges to 0 as r tends to 0, ensuring the following
The Cosmic Censor to the Court
555
Fig. 1. The singularity curve ts (r) corresponds to the r-axes R = 0 in area-radius coordinates. The initial data surface t = 0 becomes, in view of (11), the curve R0 (r) = r, whereas the other t = t0 (< ts (0) = t∗ ) surfaces correspond (at least to first order of Taylor development) to curves R = a r with a < 1. The gray area represents the trapped region, and its boundary, the apparent horizon, is also sketched
Proposition 3. In a collapsing ARS spacetime the center becomes trapped at the same comoving time at which it becomes singular, that is lim th (r) = t∗ = lim ts (r).
r→0+
r→0+
(49)
In the models studied in this paper, the only singularity that can be naked is the central (r = 0) one. Indeed, a singularity cannot be naked if it occurs after the formation of the apparent horizon th (r). But using (48), we get ts (r) = th (r) only if (r, Rh (r)) = 0, which happens if and only if r = 0. Thus, the shell labeled r > 0 becomes trapped before becoming singular, and hence all non–central singularities are censored. An important consequence of this fact is that, although the area-radius coordinates map the singularity curve to the axis R = 0, it is actually only the “point” R = r = 0 that has to be analyzed in order to establish the causal structure of the spacetime. The relative disposition of the relevant curves (initial data, apparent horizon and singularity) in the r, t and in the r, R planes are reported in Fig. 1. The next section will be devoted to the study of the nature of the central singularity. A singularity is either locally naked, if it is visible to nearby observers, globally naked, if it is visible also to far-away observers, or censored, if it is invisible to any observer. We shall not be concerned here with the issue of global nakedness, so that our test of cosmic censorship will be performed on the strong cosmic censorship hypothesis: each singularity is invisible to any observer. We call a blackhole a singularity of this kind, although one could conceive situations in which the singularities are locally naked but hidden by a global event horizon, an obvious example of this being of course the Kerr spacetime with mass to angular momentum per unit mass ratio greater than one. However, in all the examples so far discovered of dynamical formation of naked singularities, one can easily attach smoothly to the region of spacetime which contains the naked singularity a regular asymptotic region containing no event horizon. It is, therefore, likely that the unique version of the cosmic censorship conjecture that can be the object of a mathematical proof is the strong one.
556
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
5. The Spectrum of Endstates 5.1. The radial null-geodesic equation. The equation of radial null geodesics in the coordinate system (r, R) is given by √ dR = u [Y − u]. dr
(50)
Indeed, all null curves in two dimensions can be reparameterized to become geodesics. Therefore, (50) comes from (5) setting ds 2 = 0, (dθ 2 + sin2 θ dϕ 2 ) = 0, and requiring the future–pointing character of the curve. The center R = r = 0 is (locally) naked if there exists a future pointing local solution Rg (r) of the geodesic equation which extends back to the singularity (i.e. R(0) = 0) and “escapes from the apparent horizon”, that is Rg (r) > Rh (r) for r > 0. We will study in full detail only the existence of radial null geodesics emanating from the singularity. We are, however, going to prove that if a singularity is radially censored (that is, no radial null geodesics escape), then it is censored (see Subsect. 5.5 below). In what follows we shall need to consider sub and super solutions of Eq. (50). We recall that a function y0 (r) is said to be a subsolution (respectively supersolution) of an ordinary differential equation of the kind y = f (r, y) if it satisfies y0 ≤ f (r, y0 ) (respectively ≥). 5.2. The main theorem. It is easy to √ check that in a collapsing ARS spacetime (see Definition 2) the Taylor expansion of (r, 0) at the center is given by √ (r, 0) = ξ r n−1 + . . . , (51) where ξ is a positive number and n is a positive integer. The following theorem shows that the causal nature of such spacetimes is fully governed by these two quantities. Theorem 1. In a collapsing ARS spacetime, the singularity forming at the center is locally naked if n = 1, if n = 2, or if n = 3 and αξ > ξc where √ 26 + 15 3 ξc = . 2
(52)
Otherwise the singularity is covered. For the sake of clarity, we divide the proof of the main theorem into two parts.
5.3. Sufficient conditions for existence. Theorem 2. In a collapsing ARS spacetime, the singularity forming at the center is locally naked if n < 3 or if n = 3 and αξ > ξc . To prove the theorem we first need the following crucial result: Lemma 1. In a collapsing ARS spacetime the apparent horizon is a supersolution of Eq. (50).
The Cosmic Censor to the Court
557
Fig. 2. Disposition of the curves of interest in the r-t (left) and r-R plane (right). In the R-r plane the ˜ of (50) exists above lower dotted line below represents the apparent horizon Rh (r). If a subsolution R(r) Rh (r) (the other dotted line), then a null geodesic Rg (r) which extends back to the singularity exists. On the left, the same situation is represented in comoving coordinates: the singular geodesics tg (r) lies below the apparent horizon but above the curve tR˜ (r)
Proof. This result can actually be proved for much more general spacetimes (essentially, only the weak energy condition is needed) [6]. However, we give here a very simple proof for the model at hand. As we have seen, we always suppose that α is positive (see Definition 2). This implies 2(r, 0) ∼ = αr 3 , and from the definition of the apparent horizon Rh (r), it is Rh (r) = 2(r, Rh (r)) = 2(r, 0) + Rh (r),R (r, sr ) ∼ = αr 3 ,
(53)
∼ 3αr 2 > 0 in an (sr ∈ (0, Rh (r))) since ,R (r, ξr ) is infinitesimal. This implies Rh (r) = open right neighborhood of 0, whereas it is easily seen that u(r, Rh (r)) = Y (r, Rh (r)), implying that the right-hand side of (50) gives zero when evaluated at (r, Rh ). Hence Rh (r) is a supersolution of (50). Lemma 2. The singularity forming at the center is naked if there exists a subsolution of ˜ Eq. (50) of the form R(r) = xr 3 , where x > α. Proof. The singularity is naked if there exists a geodesic Rg (r) such that Rg (r) > Rh (r) ˜ in an open right neighborhood of 0. If R(r) = xr 3 is a subsolution of (50) and x > α, then ˜ ˜ 0 )[. R(r) > Rh (r). Let r0 > 0 and Rg (r) be a geodesic through Rg (r0 ) ∈]Rh (r0 ), R(r This curve cannot cross the subsolution from below, neither can it cross the supersolution from above. Thus Rg (r) is defined in ]0, r0 ] and limr→0+ Rg (r) = 0 (see Fig. 2). Therefore Rg is the sought geodesic. ˜ Proof (Theorem 2). We will derive sufficient conditions for R(r) = xr 3 to be a subsolution of (50) with x > α; the result will then follow from Lemma 2. We have α ∼ ˜ u(r, R(r)) = (54) x
558
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
(see (21)), and
α , x
(55)
1 ∂H (r, σ ) dσ. Y (r, σ ) ∂r
(56)
∼ ˜ ˜ Y (r, R(r)) − u(r, R(r)) =1− since Y (0, 0) = 1. Now, from (24), √ √ ˜ (r, R(r)) = (r, 0) −
xr 3 0
But
xr 3
0
√ i−1 σ j + . . . σ i h r ij i+j =3 1 ∂H (r, σ ) dσ = − 3/2 dσ Y (r, σ ) ∂r 0 iσ j + . . . 2Y (r, σ ) h r i+j =3 ij √ i−1+3j τ j x r 3/2 τ i h r ij i+j =3 ∼ r 3 dτ =− 3/2 0 i+3j j 2 τ i+j =3 hij r x 2 r2 ∼ xr , = − √ τ 3/2 |ττ =x =0 = − α α
xr 3
where the variable change σ = r 3 τ has been performed, and we recall that α = h30 . Then √ x 2 n−1 ∼ ˜ (r, R(r)) = ξ r + (57) xr . α The curve R˜ is certainly a subsolution of (50) if the following inequality holds: α α n−1 2 2 3x r < 1 − + xr . ξr x x
(58)
This inequality holds always if n = 1 or if n = 2; namely, the term on the right-hand side is a positive function that behaves like r m whenever x > α. If the n = 3 condition (58) is equivalent to ξ S x, < 0, (59) 3 where S(x, p) ≡ 2x 2 +
√
αx 3/2 − 3p
√
α x 1/2 + 3p α,
(60)
and using standard techniques it can be seen that (59) holds for some x > α if and only if √ ξ 26 + 15 3 > = ξc . (61) α 2
The Cosmic Censor to the Court
559
5.4. Necessary conditions for existence. Theorem 3. In an collapsing ARS spacetime, if the singularity forming at the center is locally naked then n < 3 or, n = 3 and αξ ≥ ξc . To show the theorem, essentially adapting an argument exploited in [5] for dust solutions, we need the following lemma. ˜ Lemma 3. In an collapsing ARS spacetime, if a curve R(r) is a geodesic emanating from the central singularity such that r e−ν H(r, σ ) dσ tR˜ (r) := ˜ R(r)
verifies limr→0+ tR˜ (r) = t0 = limr→0+ ts (r), then lim
r→0+
˜ R(r) = 0. r
(62)
Proof. We have limr→0+ (ts (r) − tR˜ (r)) = 0. Recalling that r ts (r) = e−ν H(r, σ ) dσ, 0
we also have ts (r) − tR˜ (r) =
˜ R(r) 0
e−ν H(r, σ ) dσ ∼ =
˜ R(r)/r 0
e−ν
√
τ
j i+j =3 hij τ
1/2 dτ,
and since ν is bounded this quantity must be infinitesimal, so that (62) must hold. ˜ ˜ = 0. Proof (Theorem 3). Let R(r) = x(r)r 3 be a geodesic such that x(r) > α, and R(0) 2 Using Lemma 3, x(r)r must be infinitesimal, so it is a straightforward calculation to verify that α ∼ ˜ u(r, R(r)) , (63) = x(r) α ∼ ˜ ˜ Y (r, R(r)) − u(r, R(r)) , (64) =1− x(r) whereas, using the same arguments as in (57), it is x(r)r 3 √ √ 1 ∂H ˜ (r, R(r)) = (r, 0) − (r, σ ) dσ Y (r, σ ) ∂r 0 x(r) n−1 ∼ + x(r)r 2 . =ξr α ˜ Since R(r) is a geodesic, (50) yields α α n−3 ∼ x (r)r = 1 − + x(r) − 3 x(r). ξr x(r) x(r)
(65)
(66)
560
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
By contradiction, let us first assume n > 3. Therefore, if x(r) went to +∞ for r → 0+ , using (66) it would be 2 x (r) = − x(r)ψ(r) + ϕ(r), r where ϕ(r) is a bounded function and ψ(0) = 1. Using comparison theorems for ODE’s ˜ there would exist λ > 0 such that x(r) ≥ rλ2 and so R(r) ≥ λr, which is in contradiction with Lemma 3. If limr→0+ x(r) = l, with l ∈ (0, +∞), then a straightforward calculation of the limit of both sides in (66) yields a contradiction as well, because k x(r) x (r) ∼ = − r for some positive constant k. If x(r) does not have limit, there exists a sequence rm of local minima for x(r) such that rm → 0 for m → ∞, x (rm ) = 0 and x(rm ) is bounded. Evaluating (66) for r = rm , and taking the limit of both sides for m → ∞ we get a contradiction once again. Then n ≤ 3. In the case n = 3, Eq. (66) can be written as S(x(r), ξ3 ) + ... , (67) x(r) where S(x, p) was defined in (60). Arguing as before, x(r) cannot go to +∞ as r goes ˜ to 0 since R(r) would be bounded from below by λr for some λ > 0. If limr→0+ x(r) = l < +∞ then S(x(r), ξ3 ) → S(l, ξ3 ) and, from (67), S(l, ξ3 ) = 0. Since l ≥ α, Eq. (61) implies αξ ≥ ξc . If x(r) does not have limit, as before there exists a sequence rm → 0 of local minima for x(r) such that x (rm ) = 0 and x(rm ) is bounded, and this yields, using (67), that S(x(rm ), ξ3 ) → 0, that is x(rm ) converges to a root of S(x, ξ3 ) = 0 greater than α, which is possible again only if αξ ≥ ξc . x (r)r = −
5.5. Non-radial geodesics. We have limited our analysis to radial null geodesics. However, we will now show that, if no radial null geodesic escapes from the singularity, then no null geodesic escapes at all. In other words, we show that a radially censored singularity is censored (in the case of dust spacetimes, this result was first given by Nolan and Mena [22]). ˜ Let by contradiction R(r) be a non-radial null geodesics escaping from the center, that we will suppose radially censored. Arguing in a similar way as for recovering (50), ˜ we have that R(r) solves equation dR˜ 2 2 2 (68) = −u B + + CL /R , dr where L2 is the conserved angular momentum. Since C is positive we have √ dR˜ ≤ −u2 B + , dr ˜ that is R(r) is a subsolution of the null radial geodesic equation (50). By hypothesis ˜ ˜ R(r) > Rh (r) for r > 0, and R(0) = Rh (0). Then a comparison argument in ODE similar to the one exploited in Lemma 2 ensures the existence of a radial null geodesic, which is a contradiction. Thus, the following holds true: Proposition 4. Any singularity radially censored is censored. We emphasize that this is a general result, not depending on the class of solutions we are dealing with, but only on the spherical symmetry of the model.
The Cosmic Censor to the Court
561
5.6. Physical interpretation of the results. As we have seen, although the mathematical structure of the solutions of the Einstein field equations and the way in which this structure governs the properties of the differential equation of radial geodesics is extremely intricate, our final results are nevertheless extremely simple:√what governs the whole machinery is just the first term of the Taylor expansion of (r, 0) near the center (Theorem √ 1). To understand what this results says in physical terms, it is convenient to write (r, 0) as √ (r, 0) = I1 (r) + I2 (r), (69) where (see also (24)) r 1 ∂ I1 (r) := H(r, σ ) dσ , Y (r, r) ∂r 0
r
I2 (r) :=
γ (r, σ ) 0
∂H (r, σ ) dσ, ∂r
(70)
and we have defined
1 1 − . Y (r, σ ) Y (r, r) The function I1 (r) has the following behavior: 1 1 ∂ H(r, rτ )r dτ I1 (r) = Y (r, r) ∂r 0 √ 1 p−1 Q(τ ) + . . . τ (p r ∼ p a r p−1 + . . . , ∼ = 3/2 = p 0 2 [P (τ ) − r Q(τ ) + . . . ] γ (r, σ ) :=
where the following Taylor expansion through (0, 0) has been introduced: 2(r, R) + R(Y 2 (r, R) − 1) = hij r j R j + hij r j R j + . . . , i+j =3
P (τ ) =
i+j =3
j
hij τ ,
Q(τ ) = −
(72)
i+j =3+p
and
(71)
j
hij τ ,
i+j =3+p
1
a= 0
√ Q(τ ) τ dτ. 2P (τ )3/2
(73)
The remaining summand I2 (r) is zero if Y depends only on r, i.e. if the acceleration A(r, R) = Y,R (r, R) (formula (37)) vanishes. Its behavior is as follows: √ i−1 σ j + . . . r σ i+j =3 i hij r I2 (r) = − γ (r, σ ) 3/2 dσ 0 iσ j + . . . 2 h r i+j =3 ij √ j 1 τ i+j =3 i hij τ + . . . γ (r, rτ ) dτ. =− 2r [P (τ ) + . . . ]3/2 0 Now, using Taylor expansion of Y through the center Y (r, R) = ϕ(r) + kij r i R j + . . . , i+j =q+1 j >0
(ϕ(r) contains the terms where R does not appear, and ϕ(0) = 1), it is
562
R. Giamb`o, F. Giannoni, G. Magli, P. Piccione
γ (r, rτ ) ∼ = Y (r, r) − Y (r, rτ ) = r 1+q
kij (1 − τ j ) + . . . = S(τ ) r 1+q + . . . ,
i+j =q+1 j >0
where q is easily seen to be the order of the first non-vanishing term of the expansion of the acceleration near the center. Thus, we obtain: √ 1 τ i hij τ j i+j =3 S(τ ) dτ. I2 (r) = −b r q + . . . , where b := 2P (τ )3/2 0 It follows that √ (r, 0) = p a r p−1 − b r q + . . . .
(74)
The index n defined in formula (51) is thus the smaller1 between p and q + 1. The value of p is clearly related to the degree of inhomogeneity of the system, since one can generate a low value of it taking a low order of non-vanishing derivatives of the initial density profile at the center (formula (32)). This effect can be related to the shear as well (see e.g. [1, 15]). The value of q, and thus the second term, is related strictly to the acceleration: it vanishes if the acceleration vanishes, and in any case it does not contribute to the nature of the final state if the system “does not accelerate enough” near the center. The effect of this term on naked singularities formation can be considered as a new feature of our models. It can, in fact, be shown that (virtually) all the particular cases already known in literature of formation of naked singularities in the gravitational collapse of continua (for instance the dust and the tangential stress model) can be retrieved as particular cases of our main theorem, and that in all such cases the acceleration term is negligible and the effect does not occur (of course, these cases do not exhaust the content of Theorem 1). 6. Discussion and Conclusions From the very beginning, the problem of Cosmic Censorship was shown to be linked to specific mathematical (as opposed to physically transparent) properties of the available arbitrary functions, like e.g. “first derivative of initial density equal to zero, second non-zero at the center”. In all the cases which have been discovered so far the situation became more and more intricate. In the present paper, we have constructed a new class of solutions which contains as subcases all the solutions for which censorship has already been investigated in full detail. The new solutions add to the (rather scarce) set of spherically symmetric spacetimes whose kinematical properties are generic. The analysis of the structure of their singularities allowed us to show that a general and simple pattern actually exists. This pattern follows from Theorem 1: given any set of regular data, and any equation of state within the considered class, the formation of naked singularities or blackholes depends on a sort of selection mechanism. The mechanism works as follows: it extrapolates the value of an integer n and selects the final state according to it. The extrapolation essentially depends on the kinematical invariants of the motion. If the resulting n equals one or two or it is greater than three, the final state is decided and has no other dependence on the data or on the matter properties. Dimensional quantities such as e.g. value of the derivative of the density at the center, cosmological constant, 1 One can conceive very special cases in which the terms exactly balance each other and the index n has to be defined at the next order.
The Cosmic Censor to the Court
563
value of the derivative of the velocity profile at the center, profile of the state equation for tangential stresses, and so on, play a further role only at the transition between the two endstates, occurring at n = 3. This role is to combine themselves to produce a non-dimensional quantity which acts as a critical parameter. The (spherically symmetric) cosmic censor seems to answer to the court, that the formation of naked singularities or blackholes is essentially a local and kinematical phenomenon: it neither depends (or weakly depends) on what is collapsing, nor does it depends on the details of the data characterizing how it starts collapsing. The formation or whatsoever of blackholes or visible singularities depends only on the kinematical properties of the motion near the center of symmetry of the system. Acknowledgement. The authors wish to thank the referees for their very useful comments and suggestions.
References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24.
Brinis, E., Jhingan, S., Magli, G.: Class. Quantum Gravity 17, 4481–4490 (2000) Christodoulou, D.: Commun. Math. Phys. 93, 171 (1984) Christodoulou, D.: Ann. Math. 140, 607 (1994) Christodoulou, D.: Ann. Math. 149, 183 (1999) Giamb`o, R., Magli, G.: math-ph/0209059, to appear on Diff. Geom. Appl. Giamb`o, R., Giannoni, F., Magli, G.: Class. Quantum Grav. 19, L5 (2002) Giamb`o, R.: Naked singularity existence for an acceleration free solution of Einstein’s equations, to appear in the Proceedings of the Sigrav Conference on General Relativity and Gravitational Physics, 2002 Gon¸calves, S., Jhingan, S., Magli, G.: Phys. Rev. D 65, 64011 (2002) Eardley, D.M., Smarr, L.: Phys. Rev. D 19, 2239 (1979) Harada, T., Iguchi, H, Nakao, K.: Phys. Rev. D 58, R041502 (1998) Jhingan, S., Magli, G.: Phys. Rev. D61, 124006 (2000) Jhingan, S., Magli, G.: Gravitational collapse of fluid bodies and cosmic censorship: Analytic insights. In: Recent Developments in General Relativity B. Casciaro, D. Fortunato, A. Masiello, M. Francaviglia (eds), Berlin: Springer Verlag, 2000 Hellaby, C., Lake, K.: Ap. J. 290, 381 (1985) Joshi, P.S.: Global aspects in gravitation and cosmology, Oxford: Clarendon Press, 1993 Joshi, P.S., Dadhich, N., Maartens, R.: gr-qc/0109051 Joshi, P.S., Singh, T.P.: Class. Quantum Grav. 13, 559 (1996) Kramer, D., Stephani, H., Herlt, E., MacCallum, M.: Exact solutions of the einstein’s field equations. Cambridge: Cambridge University Press, 1980 Krasi´nski, A.: Inhomogeneous cosmological models. Cambridge: Cambridge Univ. Press, 1997 Kijowski, J., Magli, G.: Class. Quantum Grav. 15, 3891(1998) Magli, G.: Class. Quantum Grav. 14, 1937 (1997) Magli, G.: Class. Quantum Grav. 15, 3215 (1998) Mena, F.C., Nolan, B.C.: Class. Quantum Grav. 18, 4531 (2001) Ori, A.: Class. Quantum. Grav. 7, 985 (1990) Penrose, R.: Nuovo Cimento 1, 252 (1969)
Communicated by H. Nicolai