Communications in Mathematical Physics - Volume 239

Commun. Math. Phys. 239, 1–28 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0882-9 Communications in Mathe...

Author: M. Aizenman (Chief Editor)


Commun. Math. Phys. 239, 1–28 (2003)
Digital Object Identifier (DOI) 10.1007/s00220-003-0882-9

Communications in Mathematical Physics

Berezin-Toeplitz Operators, a Semi-Classical Approach

Laurent Charles

Dipartimento di Matematica, Università di Bologna, Piazza di Porta S. Donato, 5, 40127 Bologna, Italia. E-mail: [email protected]

Received: 10 April 2001 / Accepted: 17 September 2002
Published online: 23 June 2003 – © Springer-Verlag 2003

Abstract: This article is devoted to Toeplitz operators [4] in the context of geometric quantization [11, 15]. We propose an ansatz for their Schwartz kernel. From this, we deduce the main known properties of the principal symbol of these operators and obtain new results: we define their covariant and contravariant symbols, which are full symbols, and compute the product of these symbols in terms of the Kähler metric. This gives canonical star products on Kähler manifolds. This ansatz is also useful to introduce the notion of microsupport.

1. Introduction

Let $M$ be a compact Kähler manifold with fundamental 2-form $\omega \in \Omega^2(M, \mathbb{R})$. Assume that there exists a Hermitian line bundle $L \to M$ with a covariant derivation $\nabla$ of curvature $\frac{1}{i}\omega$. $M$ and $L$ are the data of geometric quantization introduced by Kostant [11] and Souriau [15]: the symplectic manifold $(M, \omega)$ is the classical phase space and the space $\mathcal{H}$ consisting of the holomorphic sections of $L$ is the quantum space. The set of classical observables is the Poisson algebra $C^\infty(M)$. The quantum observables are the linear operators of $\mathcal{H}$.

To relate the classical and quantum observables, Berezin introduced in [2] the notions of covariant symbol and contravariant symbol. To describe this, introduce the space $L^2(M, L)$ which consists of the sections of $L \to M$ with finite $L^2$ norm, endowed with the scalar product

$$(s_1, s_2) = \int_M h(s_1, s_2)\, \mu_M,$$

where $h$ is the Hermitian metric and $\mu_M$ is the Liouville measure $\frac{1}{n!}\omega^{\wedge n}$. Since $M$ is compact, $\mathcal{H}$ is a finite dimensional subspace of $L^2(M, L)$. Denote by $\Pi$ the orthogonal projector of $L^2(M, L)$ onto $\mathcal{H}$. From now on, we identify a quantum observable $T$ with the bounded operator of $L^2(M, L)$ which vanishes on $\mathcal{H}^\perp$ and which restricts on $\mathcal{H}$ to


$T$. So the quantum observables are the operators $T : L^2(M, L) \to L^2(M, L)$ such that $\Pi T \Pi = T$.

A contravariant symbol is a function $f \in C^\infty(M)$ to which we associate the operator $\Pi M_f \Pi$, where $M_f : L^2(M, L) \to L^2(M, L)$ is the multiplication by $f$,

$$f \xrightarrow{\ \text{contravariant}\ } \Pi M_f \Pi.$$

On the covariant side, we start from a quantum observable $T$. We denote by $T(x_l, x_r)$ its Schwartz kernel. It is the section of $L \boxtimes L^{-1} \to M^2$ such that

$$(Ts)(x_l) = \int_M T(x_l, x_r).s(x_r)\, \mu_M(x_r).$$

Since $L \otimes L^{-1} \simeq \mathbb{C}$, $T(x_l, x_r)$ restricts on the diagonal to a function. Assume that $\Pi(x, x)$ does not vanish. The covariant symbol of $T$ is $f(x) = T(x, x)/\Pi(x, x)$,

$$\frac{T(x, x)}{\Pi(x, x)} \xleftarrow{\ \text{covariant}\ } T.$$

It is natural to ask if one can find products on $C^\infty(M)$ corresponding to the product of the operators. Using the covariant symbol, Moreno and Ortega [13] defined star products on the projective space CP and the Poincaré disc. When $M$ is a coadjoint orbit of a compact Lie group, similar results were obtained by Cahen, Rawnsley and Gutt [6]. More generally when $M$ is a Kähler manifold, Bordemann, Meinrenken, Schlichenmaier [3] and Guillemin [9] deduced from the theory of Toeplitz operators of Boutet de Monvel and Guillemin [4] that the product of contravariant symbols is a star product.

These results involve the semi-classical limit defined in the following way. For every positive integer $k$, we replace in the previous constructions the line bundle $L$ by $L^k$. We obtain a sequence of Hilbert spaces $\mathcal{H}_k$. The semi-classical limit is $k \to \infty$. Furthermore, we restrict our attention to a family of quantum observables called Toeplitz operators. By definition, a Toeplitz operator is a sequence $(T_k)$ such that for every $k$, $T_k : L^2(M, L^k) \to L^2(M, L^k)$,

$$T_k = \Pi_k M_{f(.,k)} \Pi_k + R_k, \tag{1}$$

where
– $(f(., k))_k$ is a sequence of $C^\infty(M)$ which admits an asymptotic expansion of the form $\sum_{l=0}^{\infty} k^{-l} f_l$ for the $C^\infty$ topology with $f_0, f_1, \ldots$ $C^\infty$ functions.
– $(R_k)$ is a negligible operator, that is $\Pi_k R_k \Pi_k = R_k$ and its uniform norm $\|R_k\|$ is $O(k^{-\infty})$.

The interest of these operators is that the contravariant map of Berezin leads to a bijection between $C^\infty(M)[[\hbar]]$ and the set $\mathcal{T}$ of the Toeplitz operators modulo the negligible operators.

Theorem 1 ([3, 9]). The product of two Toeplitz operators is a Toeplitz operator, so $\mathcal{T}$ is an algebra. The contravariant symbol map $\sigma_{cont} : \mathcal{T} \to C^\infty(M)[[\hbar]]$


which sends the operator $(T_k)$ defined by (1) into $\sum_l \hbar^l f_l$ is well-defined. It is onto and its kernel is the set of negligible Toeplitz operators. Furthermore, if $\sigma_{cont}(T_k) = f_0 + O(\hbar)$ and $\sigma_{cont}(U_k) = g_0 + O(\hbar)$, then

$$\sigma_{cont}(T_k U_k) = f_0.g_0 + O(\hbar), \qquad \sigma_{cont}(T_k U_k - U_k T_k) = \frac{\hbar}{i}\{f_0, g_0\} + O(\hbar^2),$$

$$\|T_k\| = O(k^{-N}) \text{ iff } \sigma_{cont}(T_k) = O(\hbar^N), \qquad \|T_k\| \sim \sup |f_0| \text{ if } f_0 \neq 0.$$

The principal symbol of a Toeplitz operator $(T_k)$ is the function $f_0$ such that $\sigma_{cont}(T_k) = f_0 + O(\hbar)$. Observe that the map $\sigma_{cont}$ is a full symbol in the sense that $\sigma_{cont}(T_k) = 0$ if and only if $\|T_k\| = O(k^{-\infty})$.

Our main result is an ansatz for the Schwartz kernel of a Toeplitz operator.

Theorem 2. Let $E$ be a section of $L \boxtimes L^{-1}$ such that $E(x, x) = 1$, $|E(x_l, x_r)| < 1$ if $x_l \neq x_r$ and

$$\nabla_{\partial_{\bar z^i_l}} E(x_l, x_r) = \nabla_{\partial_{z^i_r}} E(x_l, x_r) = 0 + O(|x_r - x_l|^\infty)$$

for any complex coordinate system $(z^i)$ of $M$. If $(T_k)$ is a Toeplitz operator, its Schwartz kernel is of the form

$$T_k(x_l, x_r) = \Bigl(\frac{k}{2\pi}\Bigr)^n E^k(x_l, x_r)\, a(x_l, x_r, k) + R_k(x_l, x_r), \tag{2}$$

where $(a(., k))_k$ is a sequence of $C^\infty(M^2)$ which admits an asymptotic expansion $\sum_{l=0}^{\infty} k^{-l} a_l$ for the $C^\infty$ topology whose coefficients satisfy

$$\partial_{\bar z^j_l}.a_l(x_l, x_r) = \partial_{z^j_r}.a_l(x_l, x_r) = 0 + O(|x_r - x_l|^\infty)$$

for any complex coordinate system $(z^i)$ of $M$. $(R_k)$ is negligible, that is $R_k$ is uniformly $O(k^{-\infty})$ and the same holds for its successive covariant derivatives. Conversely, if $(T_k)$ is a sequence of operators whose Schwartz kernels are given by (2), then $\|\Pi_k T_k \Pi_k - T_k\| = O(k^{-\infty})$ and $(\Pi_k T_k \Pi_k)$ is a Toeplitz operator.

For the projector $(\Pi_k)$, this ansatz follows from a theorem of Boutet de Monvel and Sjöstrand about the Szegő kernel of a strictly pseudoconvex domain. This representation of the Schwartz kernel is actually similar to the representation of the Schwartz kernel of an $\hbar$-pseudodifferential operator as an oscillatory integral.

From this theorem, we can give a direct proof of Theorem 1 and deduce many other properties of Toeplitz operators: if $(T_k)$ is a Toeplitz operator, the sequence $(T_k(x, x))$ admits an asymptotic expansion

$$T_k(x, x) = \Bigl(\frac{k}{2\pi}\Bigr)^n \sum_l k^{-l} a_l(x, x) + O(k^{-\infty}).$$

We set

$$\sigma(T_k) = \sum_l \hbar^l a_l(x, x) \qquad\text{and}\qquad \sigma_{cov}(T_k) = \sigma(T_k).\sigma(\Pi_k)^{-1}.$$


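So we obtain three full symbol maps

$$\sigma,\ \sigma_{cont},\ \sigma_{cov} : \mathcal{T} \to \mathcal{D}, \qquad\text{with } \mathcal{D} = C^\infty(M)[[\hbar]],$$

fitting into a commutative diagram together with two maps $\tilde\psi$ and $\psi$ between copies of $\mathcal{D}$,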

that is $\sigma$, $\sigma_{cov}$ and $\sigma_{cont}$ are onto, their kernel is the set $O(k^{-\infty}) \cap \mathcal{T}$ of the negligible Toeplitz operators. Since $\mathcal{T}$ is an algebra in which $O(k^{-\infty}) \cap \mathcal{T}$ is an ideal, we obtain three associative products $*$, $*_{cov}$ and $*_{cont}$ on $C^\infty(M)[[\hbar]]$. We prove that these are star products. Furthermore, the maps $\tilde\psi$ and $\psi$ are equivalences of star products. We compute these products modulo $O(\hbar^2)$: if $f$ and $g \in C^\infty(M)$, then

$$f * g = f.g + \hbar\, G^{ij}(\partial_{z^i} f)(\partial_{\bar z^j} g) - \tfrac{\hbar}{2}\, r.f.g + O(\hbar^2),$$
$$\psi(f) = f + \hbar\, G^{ij}\partial_{z^i}\partial_{\bar z^j} f + O(\hbar^2),$$
$$f *_{cov} g = f.g + \hbar\, G^{ij}(\partial_{z^i} f)(\partial_{\bar z^j} g) + O(\hbar^2),$$
$$f *_{cont} g = f.g - \hbar\, G^{ij}(\partial_{z^i} f)(\partial_{\bar z^j} g) + O(\hbar^2),$$

where $r$ is the scalar curvature of $M$, and the functions $G^{i,j}$ are defined by $G^{i,j}.G_{k,j} = \delta_{i,k}$ and $\omega = i\, G_{j,k}\, dz^j \wedge d\bar z^k$.

In fact, we can also compute the remainders $O(\hbar^2)$. We prove that the bidifferential operators $B_l$ associated to the star product $*$ are of the form

$$B_l(f, g) = \sum_{\alpha,\beta} \tilde B^l_{\alpha,\beta}\bigl([\det(G_{ij})]^{-1}, G_{\alpha',\beta'}\bigr)\, \partial^{\beta}_{\bar z} f.\partial^{\alpha}_{z} g \qquad\text{if } f * g = \sum_l \hbar^l B_l(f, g),\ \ \forall f, g \in C^\infty(M), \tag{3}$$

where the functions $G_{\alpha,\beta}$ are the derivatives of the $G_{i,j}$ and the $\tilde B^l_{\alpha,\beta}$ are polynomials in $[\det(G_{ij})]^{-1}$ and $G_{\alpha',\beta'}$. These polynomials are universal, that is they depend neither on the choice of the complex coordinate system nor on the Kähler metric. Furthermore these formulas define a canonical star product on every Kähler manifold, including Kähler manifolds which are not compact or do not have a prequantization bundle. We prove similar properties for the star products $*_{cov}$ and $*_{cont}$.

These results are connected with a theorem of Lu about the projector $\Pi_k$. The unit $\sigma(\Pi_k)$ of $(C^\infty(M)[[\hbar]], *)$ is not the formal series $1$, but a formal series $1_* = \sum_l \hbar^l S_l$, with $S_0 = 1$ and $S_1 = \frac{r}{2}$, such that

$$\Pi_k(x, x) = \Bigl(\frac{k}{2\pi}\Bigr)^n \sum_l k^{-l} S_l(x) + O(k^{-\infty}). \tag{4}$$

The existence of this asymptotic expansion was proved by Zelditch [17], by using the result of Boutet de Monvel and Sjöstrand. In [12], Lu computed $S_0$, $S_1$, $S_2$, $S_3$ and $S_4$ and with his method we can also compute the other coefficients. Since $1_*$ is the unit of $(C^\infty(M)[[\hbar]], *)$, we can also compute it from the formulas for the star product $*$.
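As a sanity check of the last remark, the following sketch verifies to first order that $1 + \frac{\hbar}{2} r$ is a unit for $*$. It assumes the first-order formula for $*$ is read as $f * g = f.g + \hbar\, G^{ij}(\partial_{z^i} f)(\partial_{\bar z^j} g) - \frac{\hbar}{2} r.f.g + O(\hbar^2)$ and that $S_1 = \frac{r}{2}$; it is only an illustration, not part of the proofs of the paper.

```latex
% Assumption: f * g = f g + \hbar G^{ij} (\partial_{z^i} f)(\partial_{\bar z^j} g)
%                     - (\hbar/2) r f g + O(\hbar^2),   and   1_* = 1 + (\hbar/2) r + O(\hbar^2).
\begin{align*}
1 * g &= g + \hbar\, G^{ij}(\partial_{z^i} 1)(\partial_{\bar z^j} g) - \tfrac{\hbar}{2}\, r\, g + O(\hbar^2)
       = g - \tfrac{\hbar}{2}\, r\, g + O(\hbar^2), \\
\bigl(\tfrac{\hbar}{2}\, r\bigr) * g &= \tfrac{\hbar}{2}\, r\, g + O(\hbar^2), \\
1_* * g &= 1 * g + \bigl(\tfrac{\hbar}{2}\, r\bigr) * g + O(\hbar^2) = g + O(\hbar^2).
\end{align*}
% The same computation with the factors exchanged gives g * 1_* = g + O(\hbar^2),
% because the derivative term again hits the constant function 1.
```

Read in the other direction, requiring $1_* * g = g$ at order $\hbar$ forces $S_1 = \frac{r}{2}$ once the first-order term of $*$ is known, which is the sense in which the unit can be computed from the star product.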


In a future article, we will explain how we can generalize the ansatz for the kernel of Toeplitz operators to define Lagrangian sections similar to the Lagrangian distributions of the theory of $\hbar$-pseudodifferential operators. We will deduce from this the Bohr-Sommerfeld conditions for a Toeplitz operator. To prepare this, we introduce the notion of microsupport. This is fairly easy, because the quantum states are defined on the phase space.

The paper is organized as follows. The second section introduces our notations. In the third one, we consider an algebra $\mathcal{F}$, which contains as a subalgebra the set of the Toeplitz operators. We prove that $(\Pi_k)$ belongs to this algebra, introduce the full symbol of its operators and compute the product of the symbols. In the following section, we derive from this the properties of the Toeplitz operators. Finally we define the notion of microsupport and consider the functional calculus of Toeplitz operators.

This article is a part of our PhD thesis [7]. It is self-contained, except that we use the essential result of Boutet de Monvel and Sjöstrand on the Szegő kernel and we apply the stationary phase lemma of Hörmander.

2. Notations

2.1. Geometric objects. First if $L \to M$ is a Hermitian fiber bundle, we denote by $h(u, v)$ the scalar product of $u, v \in L_x$ and by $|u| = h(u, u)^{\frac{1}{2}}$ the norm of $u$. When $L$ is endowed with a connection, we denote by $\nabla : C^\infty(M, L) \to \Omega^1(M, L)$ the covariant derivation. We use the same notation for the induced Hermitian structure and covariant derivation on $L^k \to M$ and $L^k \boxtimes L^{-k} \to M \times M$. If $D : C^\infty(M, L^k) \to C^\infty(M, L^k)$ is a differential operator, we define the differential operators $D_l$ and $D_r$ by

$$D_l = D \otimes \mathrm{Id} : C^\infty(M \times M, L^k \boxtimes L^{-k}) \to C^\infty(M \times M, L^k \boxtimes L^{-k}),$$
$$D_r = \mathrm{Id} \otimes D : C^\infty(M \times M, L^k \boxtimes L^{-k}) \to C^\infty(M \times M, L^k \boxtimes L^{-k}).$$

2.2. Negligible terms. First, if $(f(., k))_k$ is a sequence of $C^\infty(X)$, we say that $(f(., k))$ is negligible if for every integer $l$, $N$, for every vector fields $X_1, \ldots, X_l$ and for every compact $K \subset X$, there exists $C$ such that

$$|(X_1 \ldots X_l.f)(x, k)| \leq C k^{-N}, \quad \forall x \in K.$$

Consider now a line bundle $L \to X$ endowed with a Hermitian structure. Let $(s_k)_k$ be a sequence such that $s_k \in C^\infty(X, L^k)$ for all $k$. Introduce a covariant derivation $\nabla : C^\infty(X, L) \to \Omega^1(X, L)$. We say that $(s_k)$ is negligible if for every integer $l$, $N$, for every vector fields $X_1, \ldots, X_l$ and for every compact $K \subset X$, there exists $C$ such that

$$|\nabla_{X_1} \ldots \nabla_{X_l} s_k(x)| \leq C k^{-N}, \quad \forall x \in K.$$

It is easy to see that this definition depends on the choice of $h$, but does not depend on $\nabla$. Locally, if $t : U \to L$ is a unitary gauge (i.e. $|t(x)| = 1$, $\forall x \in U$) and $s_k = f(., k)\, t^k$ on $U$, the fact that $(s_k)$ is negligible means that the sequence $(f(., k))$ is negligible.

Let $(T_k)$ be a sequence such that for every $k$, $T_k$ is an operator $C^\infty(X, L^k) \to C^\infty(X, L^k)$ with a smooth Schwartz kernel. Using a density $\mu \in C^\infty(X, |\Lambda|(X))$, the kernel $T_k(x_l, x_r)$ can be viewed as a section of $L^k \boxtimes L^{-k}$. We say that $(T_k)$ is a smoothing


operator if $(T_k(x_l, x_r))$ is a negligible sequence. This definition does not depend on the choice of $\mu$. We will denote by $O(k^{-\infty})$ a negligible sequence of functions or the set of the negligible sequences of functions. We use the same notation for sequences of sections or for smoothing operators.

2.3. Symbols. A symbol of order $N$ is a sequence of functions $(f(., k))$ in $C^\infty(X)$ which admits an asymptotic expansion

$$f(., k) = \sum_{l=N}^{\infty} k^{-l} f_l + O(k^{-\infty})$$

in the $C^\infty$ topology. We denote by $S^N(X)$ the set of the symbols of order $N$ defined on $X$. We associate to $(f(., k)) \in S^0(X)$ the formal symbol $\sum_l \hbar^l f_l$. This defines a map $S^0(X) \to C^\infty(X)[[\hbar]]$. By the Borel lemma, this map is onto, its kernel is $O(k^{-\infty})$.

2.4. Taylor expansions. We say that a function $f \in C^\infty(X)$ vanishes to order $k$ along a submanifold $Y \subset X$, if for every differential operator $D$ of order $k - 1$, $D.f|_Y = 0$. We say that a function $f \in C^\infty(X)$ vanishes to order $\infty$ along $Y$, if it vanishes to order $k$ along $Y$ for every $k$. We denote by $I^k(Y)$ the ideal of $C^\infty(X)$ which is the set of the functions which vanish to order $k$ along $Y$. The Taylor series of $f \in C^\infty(X)$ along $Y$ is the class of $f$ in $C^\infty(X)/I^\infty(Y)$.

Lemma 1. Let $X$ be a submanifold of an open set $\Omega$ of $\mathbb{R}^k$. Let $d \in C^\infty(\Omega, \mathbb{R}^+)$ be a non-negative function which vanishes along $X$ to order 2, does not vanish outside of $X$ and whose Hessian has kernel $T_x X$ for all $x$ in $X$. Let $a(., k)$ be a sequence of $C^\infty(\Omega)$ which has an asymptotic expansion $\sum_{i=0}^{\infty} a_i(x) k^{-i}$ in the $C^0$ topology. Let $N$ be a non-negative integer. The following assertions are equivalent:
i) $\forall$ compact $K$ of $\Omega$, $\exists\, C$ such that $|e^{-kd(x)} a(x, k)| \leq C k^{-\frac{N}{2}}$ on $K$.
ii) $a_i \in I^{N-2i}(X)$, $\forall i$ such that $N \geq 2i$.

Proof. Let $l$ be some integer larger than $\frac{N}{2}$. We have $|a(x, k) - \sum_{i=0}^{l} a_i(x) k^{-i}| \leq C_K k^{-\frac{N}{2}-1}$ on every compact $K$ of $\Omega$. Consequently, i) is equivalent to

$$\Bigl| e^{-kd(x)} \sum_{i=0}^{l} a_i(x) k^{-i} \Bigr| \leq C_K k^{-\frac{N}{2}}. \tag{5}$$

Moreover, assertion ii) is equivalent to $|a_i(x)| \leq C d(x)^{\frac{N}{2}-i}$ on every compact $K$ of $\Omega$. The function $y \to y^m e^{-y}$ is bounded on $\mathbb{R}^+$. It follows that

$$|e^{-kd(x)} a_i(x) k^{-i}| \leq C |a_i(x)|\, d(x)^{-\frac{N}{2}+i}\, k^{-\frac{N}{2}}.$$
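The last inequality can be spelled out as follows. This is a short sketch of the step, with the explicit constant $(m/e)^m$ added for illustration (it does not appear in the text); $m = \frac{N}{2} - i \geq 0$ and $x$ is a point where $d(x) > 0$.

```latex
% Elementary bound: for m > 0, sup_{y >= 0} y^m e^{-y} = (m/e)^m, attained at y = m
% (for m = 0 the supremum is 1).
\begin{align*}
e^{-k d(x)}\, k^{-i}
  &= k^{-\frac{N}{2}}\, d(x)^{-m}\, \bigl(k\, d(x)\bigr)^{m} e^{-k d(x)},
     \qquad m = \tfrac{N}{2} - i, \\
  &\leq \Bigl(\tfrac{m}{e}\Bigr)^{m}\, d(x)^{-\frac{N}{2}+i}\, k^{-\frac{N}{2}} .
\end{align*}
% Multiplying by |a_i(x)| gives the displayed estimate; if moreover
% |a_i(x)| <= C d(x)^{N/2 - i}, the right-hand side is bounded by a constant times k^{-N/2}.
```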


This proves that ii) implies i). Conversely, we introduce the set $D = \{x \in \Omega \,/\, d(x)^{-1} \in \mathbb{N}\}$. Consider an integer $j$ between $1$ and $l + 1$ and use the inequality (5) where $x \in D$ and $k = j/d(x)$. We obtain that the function

$$b_j(x) = j^l a_0(x) d(x)^{-\frac{N}{2}} + j^{l-1} a_1(x) d(x)^{-\frac{N}{2}+1} + \ldots + j^0 a_l(x) d(x)^{-\frac{N}{2}+l}$$

is bounded on $K \cap D$ if $K$ is a compact of $\Omega$. The functions $b_j(x)$ are obtained from the functions $a_j(x) d(x)^{-\frac{N}{2}+j}$ by a linear system of Vandermonde type. By solving this system, we obtain that $a_j(x) d(x)^{-\frac{N}{2}+j}$ is bounded on $K \cap D$ if $K$ is a compact of $\Omega$. Using the Taylor expansions of $a_j$ and $d$ along $X$, ii) follows.

3. The Algebra $\mathcal{F}$

This section is devoted to an algebra of operators defined in the following way.

Definition 1. $\mathcal{F}$ is the space of operators $\bigl(Q_k : C^\infty(M, L^k) \to C^\infty(M, L^k)\bigr)_{k \geq 0}$, whose kernel is of the form

$$Q_k(x_l, x_r) = \Bigl(\frac{k}{2\pi}\Bigr)^n E^k(x_l, x_r)\, a(x_l, x_r, k) + O(k^{-\infty}), \tag{6}$$

where
– $E$ satisfies the same assumptions as in Theorem 2,
– $(a(., k))$ is a symbol in $S^0(M^2)$.

Our basic interest in this algebra is that it contains as a subalgebra the set of Toeplitz operators. In the next section we will derive many properties of the Toeplitz operators from those of operators of $\mathcal{F}$.

The first subsection is devoted to the section $E$ defined in Theorem 2. We prove its existence and give its main properties. In the following two subsections, we prove that $(\Pi_k)$ is an operator of $\mathcal{F}$. In the last subsections we define the full symbol of an operator of $\mathcal{F}$, prove that $\mathcal{F}$ is an algebra and compute the product of symbols.

3.1. The section $E$. In the following, $g$ denotes the Riemannian metric of $M$ defined by $g(X, Y) = \omega(X, JY)$.

Proposition 1. There exists a section $E$ of $L \boxtimes L^{-1}$ such that

$$E|_{\mathrm{diag}(M)} = 1, \qquad \nabla_{\bar Z_l} E \equiv \nabla_{Z_r} E \equiv 0 \mod I^\infty(\mathrm{diag}(M)) \tag{7}$$

for all holomorphic vector fields $Z$ of $M$. This section is unique modulo a section which vanishes, with all its derivatives, along $\mathrm{diag}(M)$. The function $\delta = -2 \ln |E|$ of $C^\infty(M \times M)$ vanishes, with its first derivatives, along $\mathrm{diag}(M)$. If $x \in M$, the Hessian at $(x, x)$ of $\delta$ is the quadratic form whose kernel is $\mathrm{diag}(T_x M)$ and whose restriction to $T_x M \times (0) \subset T_x M \times T_x M$ is $\frac{1}{2} g$. Furthermore,

$$\nabla E \equiv -E \otimes (\partial_l + \bar\partial_r)\delta \mod I^\infty(\mathrm{diag}(M)). \tag{8}$$


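On a neighborhood of $\mathrm{diag}(M)$, we have $\delta(x_l, x_r) > 0$ if $x_l \neq x_r$. By modifying $E$ outside this neighborhood, we may assume that $\delta(x_l, x_r) > 0$ if $x_l \neq x_r$ for all $(x_l, x_r) \in M \times M$, which is the condition $|E(x_l, x_r)| < 1$ of Theorem 2.

Remark 1. Let $t : U \to L$ be a holomorphic gauge. Let $\rho \in C^\infty(U)$ be such that $|t| = e^{-\rho}$ and introduce the unitary gauge $s = e^{\rho} t$. Then we will prove that

$$E = e^{i\psi}\, s \otimes s^{-1} \quad\text{with}\quad \psi(x_l, x_r) = i(\rho(x_l) + \rho(x_r)) + \tilde\psi(x_l, x_r), \tag{9}$$

where $\tilde\psi$ is such that

$$\tilde\psi(x, x) = -2i\rho(x) \quad\text{and}\quad \partial_{\bar z^i_l} \tilde\psi \equiv \partial_{z^i_r} \tilde\psi \equiv 0 \mod I^\infty(\mathrm{diag}(U)). \tag{10}$$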

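This local expression will be useful, especially to apply the stationary phase lemma for the composition of operators.

Proof. We introduce the same local data as in the previous remark and look for a section $E$ verifying (9). Then Eqs. (7) are equivalent to (10). There is a unique function $\tilde\psi$ satisfying (10) modulo $I^\infty(\mathrm{diag}(U))$. Using the local uniqueness, we can construct with a partition of unity the global section $E$ required. We have

$$\delta(x_l, x_r) = 2\rho(x_l) + 2\rho(x_r) - i\tilde\psi(x_l, x_r) - i\tilde\psi(x_r, x_l) \mod I^\infty(\mathrm{diag}(M)).$$

From $\partial_{z^j_l}.\tilde\psi \equiv (\partial_{z^j_l} + \partial_{z^j_r})\tilde\psi$ modulo $I^\infty(\mathrm{diag}(U))$ it follows that $\partial_{z^j_l}\delta(x, x)$ vanishes. Similarly we show that the other derivatives of $\delta$ vanish along $\mathrm{diag}(M)$. To compute the Hessian of $\delta$, observe that

$$\partial_{z^j_l}\partial_{z^k_r}\delta\,\big|_{(x,x)} = \partial_{\bar z^j_l}\partial_{\bar z^k_r}\delta\,\big|_{(x,x)} = 0, \qquad \partial_{z^j_l}\partial_{\bar z^k_l}\delta\,\big|_{(x,x)} = \partial_{z^j_r}\partial_{\bar z^k_r}\delta\,\big|_{(x,x)} = G_{j,k} \tag{11}$$

with $G_{j,k} = \partial_{z^j}\partial_{\bar z^k}(\rho + \bar\rho)$. Let $X$ and $Y$ be two vectors in $T_x M$,

$$(X, 0) = (Z, Z) + (\bar Z, -Z) \quad\text{with } Z = \tfrac{1}{2}(X - iJX) \in T^{1,0}_x M,$$
$$(Y, 0) = (\bar W, \bar W) + (W, -\bar W) \quad\text{with } W = \tfrac{1}{2}(Y - iJY) \in T^{1,0}_x M.$$

Since the kernel of $\mathrm{Hess}\,\delta|_{(x,x)}$ certainly contains $\mathrm{diag}\, T_x M$, we have

$$\mathrm{Hess}\,\delta|_{(x,x)}\bigl((X, 0), (Y, 0)\bigr) = \mathrm{Hess}\,\delta|_{(x,x)}\bigl((\bar Z, -Z), (W, -\bar W)\bigr) = \tfrac{1}{2i}\,\omega(W, \bar Z) + \tfrac{1}{2i}\,\omega(Z, \bar W)$$

using (11), and since $\omega = i\partial\bar\partial(\rho + \bar\rho) = i \sum G_{j,k}\, dz^j \wedge d\bar z^k$, we have

$$\mathrm{Hess}\,\delta|_{(x,x)}\bigl((X, 0), (Y, 0)\bigr) = \tfrac{1}{2}\, g(X, Y).$$

By differentiating $h(E, E) = \exp(-\delta)$ we obtain (8).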
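For orientation, here is the local model $U = \mathbb{C}^n$ with $\rho = \frac{1}{2}|z|^2$, in which every object of Proposition 1 and Remark 1 is explicit. This flat example is our own illustration under that choice of $\rho$ (it is not compact, so it only models the local structure), and the formulas below hold exactly rather than modulo $I^\infty$.

```latex
% Flat model (illustrative assumption): U = C^n, \rho(z) = |z|^2 / 2,
% so that \omega = i \partial\bar\partial(\rho + \bar\rho) = i \sum_j dz^j \wedge d\bar z^j
% and G_{j,k} = \delta_{j,k}.
\begin{align*}
\tilde\psi(x_l, x_r) &= -i\, z_l \cdot \bar z_r ,
   &&\text{holomorphic in } z_l,\ \text{antiholomorphic in } z_r,\ 
     \tilde\psi(x, x) = -i|z|^2 = -2i\rho(x), \\
i\psi(x_l, x_r) &= -\tfrac{1}{2}|z_l|^2 - \tfrac{1}{2}|z_r|^2 + z_l \cdot \bar z_r ,
   &&\text{so } \psi(x, x) = 0 \text{ and } E(x, x) = 1, \\
|E(x_l, x_r)| &= e^{-\frac{1}{2}|z_l - z_r|^2} ,
   &&\text{so } |E| < 1 \text{ off the diagonal}, \\
\delta(x_l, x_r) &= -2\ln|E| = |z_l - z_r|^2 ,
   &&\text{so } \partial_{z^j_l}\partial_{\bar z^k_l}\delta = \delta_{j,k} = G_{j,k},\quad
     \partial_{z^j_l}\partial_{z^k_r}\delta = 0 .
\end{align*}
```

In this model $\delta$ is non-negative, vanishes exactly on the diagonal together with its first derivatives, and its Hessian has kernel the diagonal directions, as stated in Proposition 1.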


3.2. The Szegő projector. We recall first the result of Boutet de Monvel and Sjöstrand that we will apply. Let $Y$ be a complex manifold of dimension $k + 1$. Let $D$ be a domain of $Y$ with compact $C^\infty$ boundary. Let $\mathcal{E} \to \partial D$ be the complex subbundle of $T(\partial D) \otimes \mathbb{C}$, which consists of the holomorphic vectors of $Y$ tangent to $\partial D$. The complex dimension of $\mathcal{E}$ is $k$. Let $r : Y \to \mathbb{R}$ be a defining function for the boundary of $D$, i.e. $D = \{r \leq 0\}$ and $dr(y) \neq 0$ if $y \in \partial D$. Assume that $D$ is strictly pseudoconvex, i.e. the sesquilinear form of $\mathcal{E}_y$,

$$(X, Y) \to \langle \partial\bar\partial r, X \wedge \bar Y \rangle, \qquad X, Y \in \mathcal{E}_y$$

is positive definite at every point $y \in \partial D$. Then the restriction of $-i\partial r$ to the boundary $\partial D$ is a contact form.

Let $\mu \in C^\infty(\partial D, |\Lambda|(\partial D))$ be a volume form. Hence $L^2(\partial D)$ is endowed with a Hilbertian structure. $\mathcal{H}$ is the set of the functions of $L^2(\partial D)$ satisfying the induced Cauchy-Riemann equations:

$$\mathcal{H} = \bigl\{ f \in L^2(\partial D) \,/\, \bar Z.f = 0,\ \forall Z \in C^\infty(\partial D, \mathcal{E}) \bigr\}.$$

The Szegő projector $\Pi : L^2(\partial D) \to L^2(\partial D)$ is the orthogonal projection onto $\mathcal{H}$. Let $\phi \in C^\infty(Y \times Y)$ be a function such that

$$\phi(y, y) = \frac{1}{i} r(y) \quad\text{and}\quad \bar Z_l \phi \equiv Z_r \phi \equiv 0 \mod I^\infty(\mathrm{diag}(Y)) \tag{12}$$

for all holomorphic vector fields $Z$. Define $\varphi \in C^\infty(\partial D \times \partial D)$ by $\varphi(u_l, u_r) = \phi(u_l, u_r)$. $d\varphi$ does not vanish on $\mathrm{diag}(\partial D)$. $d\,\mathrm{Im}\,\varphi$ vanishes identically on $\mathrm{diag}(\partial D)$ and the Hessian of $\mathrm{Im}\,\varphi$ at $(u, u)$ is positive with kernel $\mathrm{diag}(T_u \partial D)$. So by modifying $\varphi$ outside a neighborhood of $\mathrm{diag}(\partial D)$, we may assume that $\mathrm{Im}\,\varphi(u_l, u_r) > 0$ if $u_l \neq u_r$. The map

$$\mathbb{R}^+ \times \partial D \times \partial D \to \mathbb{C}, \qquad (\tau, u_l, u_r) \to \tau\varphi(u_l, u_r)$$

is a non-degenerate phase function of positive type (cf. [10]) and parametrizes a positive canonical ideal $C$. Let $F^0(C)$ be the set of the Fourier integral operators of order 0 associated with $C$. It consists of the operators $T : C^\infty(\partial D) \to C^\infty(\partial D)$ whose Schwartz kernel is the sum of an oscillatory integral and a $C^\infty$ function:

$$T(u_l, u_r) = \int_{\mathbb{R}^+} e^{i\tau\varphi(u_l, u_r)} s(u_l, u_r, \tau)\, |d\tau| + f(u_l, u_r), \tag{13}$$

where $s$ is a classical symbol of $S^n_{1,0}(\partial D \times \partial D \times \mathbb{R}^+)$ which admits an asymptotic expansion

$$s(u_l, u_r, \tau) \sim \sum_{l=0}^{\infty} \tau^{n-l} s_l(u_l, u_r).$$

These operators are continuous $L^2(\partial D) \to L^2(\partial D)$.

Theorem 3 (Boutet de Monvel, Sjöstrand [5]). $\Pi$ is an elliptic Fourier integral operator of order 0 associated with the canonical ideal $C$.


To apply this result, we introduce the principal bundle $\pi : P \to M$ with structural group $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$ such that $L^k \simeq P \times_{s_k} \mathbb{C}$, where $s_k : \mathbb{T} \to \mathrm{Gl}(\mathbb{C})$ is the representation defined by $s_k(\theta).v = e^{-ik\theta} v$. Consider the embedding $i$ of the principal bundle $P$ into $L^* \simeq P \times_{s_{-1}} \mathbb{C}$ defined by

$$i(u) = [u, 1] \in P \times_{s_{-1}} \mathbb{C}, \qquad \forall u \in P.$$

The covariant derivation $\nabla$ induces a connection 1-form $\alpha \in \Omega^1(P)$. Let $\mathrm{Hor}^{1,0} \to P$ denote the subbundle of $TP \otimes \mathbb{C}$ which consists of the horizontal lifts of holomorphic vectors. Let $H : L^* \to \mathbb{R}$ denote the function sending $u \in L^*$ into $|u|^2$. The following result is well-known.

Proposition 2. $D = \{H \leq 1\}$ is a strictly pseudoconvex domain of $L^*$ with boundary $i(P)$. The fiber bundle of holomorphic vectors of $L^*$ tangent to $i(P)$ is $i_* \mathrm{Hor}^{1,0}$. Moreover $i^* \partial \ln H = i\alpha$.

$\mu_P = \frac{1}{2\pi\, n!}\, \alpha \wedge (d\alpha)^{\wedge n}$ is a volume form. So we obtain a scalar product on $L^2(P)$, a Szegő projector $\Pi$ and a subspace $\mathcal{H} = \mathrm{Im}\,\Pi \subset L^2(P)$. Since $L^k \simeq P \times_{s_k} \mathbb{C}$, we have an identification between sections of $L^k$ and functions $f \in C^\infty(P)$ such that $R_\theta^* f = e^{ik\theta} f$. If $s : M \to L^k$ is associated to $f \in C^\infty(P)$, then $\nabla_X s$ is associated to $X^{hor}.f$, where $X^{hor}$ denotes the horizontal lift of the vector field $X$. So $s$ is holomorphic if and only if $f$ satisfies the induced Cauchy-Riemann equations. Furthermore, this identification is compatible with scalar products, that is $(s_1, s_2) = (f_1, f_2)$ if $s_1$ and $s_2$ are respectively associated to $f_1$ and $f_2$. By Fourier decomposition, we obtain

$$L^2(P) \simeq \bigoplus_{k=-\infty}^{\infty} L^2(M, L^k) \qquad\text{and}\qquad \mathcal{H} \simeq \bigoplus_{k=0}^{\infty} \mathcal{H}_k.$$

Using the first sum, we associate to a bounded family $(T_k)_{k \in \mathbb{Z}}$ of bounded operators of $L^2(M, L^k)$ a bounded operator $T$ of $L^2(P)$ which commutes with the action of $\mathbb{T}$, and conversely. The sequence $(T_k)_{k \geq 0}$ is called the sequence of positive Fourier coefficients of $T$. In particular, the sequence of positive Fourier coefficients of the Szegő projector $\Pi$ is the sequence $(\Pi_k)$. In the next section we will prove the following theorem.

Theorem 4. The operators of $\mathcal{F}$ are the sequences of positive Fourier coefficients of the Fourier integral operators of $F^0(C)$ which commute with the action of $\mathbb{T}$. Furthermore if $(T_k)$ is the sequence of positive Fourier coefficients of $T \in F^0(C)$, then $T$ is smoothing if and only if $(T_k)$ is smoothing.

Applying the theorem of Boutet de Monvel and Sjöstrand, we obtain

Corollary 1. The projector $(\Pi_k)$ belongs to $\mathcal{F}$.

3.3. Proof of Theorem 4. First, we prove that the section $E$ of $L \boxtimes L^{-1}$ determines a non-degenerate phase function of positive type which parametrizes the canonical ideal $C$. $P \times P \to M \times M$ is a $\mathbb{T}^2$-principal bundle and $L \boxtimes L^{-1} \simeq (P \times P) \times_s \mathbb{C}$,


where $s : \mathbb{T}^2 \to \mathrm{Gl}(\mathbb{C})$ is the representation defined by $s(\theta_l, \theta_r).v = e^{i(\theta_l - \theta_r)}.v$. In this way the section $E$ is associated to a function $\tilde E \in C^\infty(P \times P)$ such that

$$R^*_{(\theta_l, \theta_r)} \tilde E = e^{i(\theta_l - \theta_r)} \tilde E.$$

$E(x, x) = 1$ implies $\tilde E(u, u) = 1$. Let $V$ be the neighborhood of $\mathrm{diag}\, P$ given by $V = \{|\tilde E - 1| \leq \frac{1}{2}\}$. Define the function $\varphi = \frac{1}{i} \ln \tilde E$ on $V$.

Lemma 2. The function $\tau\varphi(u_l, u_r)$ defined on $\mathbb{R}^+ \times V$ is a non-degenerate phase function of positive type which parametrizes the canonical ideal $C$.

Proof. Introduce the same notation as in Remark 1. We identify $U \times \mathbb{C}$ with $L^{-1}$ over $U$, by sending $(x, z)$ into $z s^{-1}(x)$. In the same way, the bundle $P$ over $U$ can be identified with $U \times \mathbb{T}$ in such a way that $i(x, \theta) = e^{i\theta} s^{-1}(x)$. Introduce a complex coordinate system $(z_j)$ over $U$. Recall that $e^{\rho} s^{-1}$ is a holomorphic gauge and let $w$ denote the linear holomorphic coordinate of $L^{-1}$ such that $w(e^{\rho} s^{-1}) = 1$. Then $H(z_j, w) = w\bar w\, e^{\rho + \bar\rho}$ and the embedding $i$ is given by

$$U \times \mathbb{T} \to U \times \mathbb{C}, \qquad (z_j, \theta) \to (z_j, w = e^{i\theta - \rho}).$$

The function $\varphi$ is given by $\varphi = \theta_l - \theta_r + i(\rho(x_l) + \rho(x_r)) + \tilde\psi(x_l, x_r)$. It extends to a function $\phi$ defined on a neighborhood of $\mathrm{diag}(L^{-1}) \subset L^{-1} \times L^{-1}$ by

$$\phi = -i \ln(w_l \bar w_r) + \tilde\psi(x_l, x_r).$$

From Eqs. (10) which determine $\tilde\psi$, it follows that $\phi$ satisfies Eqs. (12), where the function $r$ is $\ln H$.

To prove Theorem 4, we will apply the stationary phase lemma to obtain expression (6) from expression (13). By the previous lemma, the oscillatory term $e^{i\tau\varphi(u_l, u_r)}$ becomes $E^k(x_l, x_r)$. The amplitude $s(u_l, u_r, \tau)$ gives the symbol $a(x_l, x_r, k)$. In connection with negligible terms, observe that if $T : C^\infty(P) \to C^\infty(P)$ is a smoothing operator (i.e. its kernel is $C^\infty$) which commutes with the action of $\mathbb{T}$, then the family $(T_k)_{k \in \mathbb{Z}}$ of its Fourier coefficients is smoothing (i.e. the kernels $T_k(x_l, x_r)$ are $C^\infty$, $|T_k(x_l, x_r)| = O(|k|^{-\infty})$ as $k \to \pm\infty$ and the same holds for their successive covariant derivatives), and conversely.

Proof of Theorem 4. Consider an operator $T \in F^0(C)$ which commutes with the action of $\mathbb{T}$. Its kernel is of the form

$$T(u_l, u_r) = \int_{\mathbb{R}^+} e^{i\tau\varphi(u_l, u_r)} s(u_l, u_r, \tau)\, |d\tau| + f(u_l, u_r),$$

where $\varphi$ is defined as in Lemma 2 and the classical symbol $s \in S^n_{1,0}$ has support in $V \times \mathbb{R}^+$ and asymptotic expansion

$$s(u_l, u_r, \tau) \sim \sum_{l=0}^{\infty} \tau^{n-l} s_l(u_l, u_r). \tag{14}$$

We compute the Fourier coefficients of $T(u_l, u_r)$. We may assume $f = 0$ since its Fourier coefficients are negligible. Since $T$ commutes with the action of $\mathbb{T}$, its kernel is $\mathbb{T}$-invariant, i.e. $T(R_\theta.u_l, R_\theta.u_r) = T(u_l, u_r)$. So by averaging, we may assume that


$s$ and the coefficients $s_l$ of its asymptotic expansion are $\mathbb{T}$-invariant. Let $Q$ denote the quotient of $P \times P$ by the diagonal action and $p : P \times P \to Q$ the associated projection. The push-forward of $T(u_l, u_r)$ by $p : P \times P \to Q$ is

$$p_* T(q) = \int_{\mathbb{R}^+} e^{i\tau\tilde\varphi(q)} \tilde s(q, \tau)\, |d\tau|,$$

where $\tilde s$ and $\tilde\varphi$ are such that $p^*\tilde s = s$ and $p^*\tilde\varphi = \varphi$. Furthermore $\tilde s \sim \sum_{l=0}^{\infty} \tau^{n-l} \tilde s_l$ where the functions $\tilde s_l$ are defined by $p^*\tilde s_l = s_l$. $Q$ is a $\mathbb{T}$-principal bundle with base $M \times M$. The action of $\theta \in \mathbb{T}$ is given by $R_\theta.p(u_l, u_r) = p(R_\theta.u_l, u_r)$. We have to compute the positive Fourier coefficients of $p_* T$ for this action. We may assume that $P \simeq U \times \mathbb{T} \ni (x, \theta)$ and $Q \simeq U \times U \times \mathbb{T} \ni (x_l, x_r, \gamma)$ with $p^*\gamma = \theta_l - \theta_r$. Using the same notation as in the proof of Lemma 2, we have $\tilde\varphi(x_l, x_r, \gamma) = \gamma + \psi(x_l, x_r)$. The Fourier coefficients of $p_* T$ are

$$I_k(x_l, x_r, \gamma) = e^{ik\gamma} \int_{\mathbb{T} \times \mathbb{R}^+} e^{-ik\theta} e^{i\tau(\theta + \psi(x_l, x_r))}\, \tilde s(x_l, x_r, \theta, \tau)\, |d\theta||d\tau|.$$

The support of $\tilde s$ is included in $p(V) \times \mathbb{R}^+ \subset U \times U \times (-\alpha, \alpha) \times \mathbb{R}^+$ with $0 < \alpha < \pi$. We replace $\tau$ by $|k|\tau$,

$$I_k(x_l, x_r, \gamma) = e^{ik\gamma} \int_{\mathbb{T} \times \mathbb{R}^+} e^{-i|k|\phi(\theta, \tau, x_l, x_r)}\, \tilde s(x_l, x_r, \theta, |k|\tau)\, |k|\, |d\theta||d\tau|$$

$$\text{with}\quad \phi(\theta, \tau, x_l, x_r) = \begin{cases} \theta - \tau\theta - \tau\psi(x_l, x_r) & \text{if } k > 0, \\ -\theta - \tau\theta - \tau\psi(x_l, x_r) & \text{if } k < 0. \end{cases}$$

To estimate this as $|k|$ tends to $\infty$, we follow the method of stationary phase ([10], Sect. 7.7). First observe that if $k < 0$, the phase $\phi$ does not have a critical point, so $|I_k(x_l, x_r, \gamma)|$ is uniformly $O(|k|^{-\infty})$ as $k \to -\infty$ and the same holds for its successive derivatives. Consequently, $T$ is a smoothing operator iff the sequence $(T_k)$ of its positive Fourier coefficients is smoothing.

Assume now that $k > 0$. The first step is to restrict the integral to a small neighborhood of the critical locus of the phase $\phi$ by integrating by parts,

$$(\partial_\tau\phi = \partial_\theta\phi = 0) \text{ iff } (x_l = x_r,\ \theta = 0 \text{ and } \tau = 1).$$

Recall that the imaginary part of $\psi(x_l, x_r)$ is positive if $x_l \neq x_r$. We obtain

$$I_k(x_l, x_r, \gamma) = e^{ik\gamma} \int_D e^{-ik\phi(\theta, x_l, x_r, \tau)}\, \tilde s(x_l, x_r, \theta, k\tau)\, k\, |d\theta||d\tau| + e^{ik\gamma} g_k(x_l, x_r),$$

where $D = (-\epsilon, \epsilon) \times (1 - \epsilon, 1 + \epsilon)$ and the semi-norms $C^0(K)$ of $g_k$ are $O(k^{-\infty})$ if $K$ is compact. We now apply Theorem 7.7.12 of [10]. Observe that $-\phi = \psi + (\partial_\tau\phi)(\partial_\theta\phi)$. Hence

$$I_k(x_l, x_r, \gamma) = \Bigl(\frac{k}{2\pi}\Bigr)^n e^{ik(\gamma + \psi(x_l, x_r))}\, a(x_l, x_r, k) + e^{ik\gamma} g_k(x_l, x_r),$$


where $a(., k)$ admits an asymptotic expansion $\sum_{l=0}^{\infty} k^{-l} a_l$ in the $C^\infty$ topology and the semi-norms $C^0(K)$ of $g_k$ are $O(k^{-\infty})$. Actually it follows from Theorem 7.7.12 of [10] that $I_k$ has an asymptotic expansion in the $C^0$ topology, but the coefficients $a_l$ are $C^\infty$ and by the Borel process there exists a sequence $a(., k)$ as above. Using the identification between the functions $I_k$ and the sections of $L^k \boxtimes L^{-k}$, we express the kernels of the positive Fourier coefficients of $T$ in the form

$$T_k(x_l, x_r) = \Bigl(\frac{k}{2\pi}\Bigr)^n E^k\, a(x_l, x_r, k) + R_k(x_l, x_r), \tag{15}$$

where $|R_k(x_l, x_r)|$ is uniformly $O(k^{-\infty})$. We have to improve this, that is to show that $|\nabla_{X_1} \ldots \nabla_{X_l} R_k| \leq C_{K,N} k^{-N}$ on every compact $K$ and for all $N$. The sections $\nabla_{X_1} \ldots \nabla_{X_l} T_k$ are the positive Fourier coefficients of $X_1^{hor} \ldots X_l^{hor} T$. We can estimate them in the same way as the Fourier coefficients of $T$. Consequently their norm is $O(k^N)$ on every compact with $N$ sufficiently large. The derivatives of $E^k a(., k)$ satisfy the same estimate, so the same holds for $\nabla_{X_1} \ldots \nabla_{X_l} R_k$. By applying Lemma 3.2 of [14], we obtain that $(R_k)$ is smoothing.

Conversely, we have to show that for every sequence $(a_k)$ of $C^\infty(M \times M)$ which admits an asymptotic expansion $\sum k^{-l} a_l$ in the $C^\infty$ topology, there exists a symbol $s \in S^n_{1,0}(P \times P \times \mathbb{R}^+)$ which admits an asymptotic expansion $\sum \tau^{n-l} s_l$ such that (15) is satisfied. Assume that $s_0$ is locally $\mathbb{T}^2$-invariant on a neighborhood of $\mathrm{diag}\, P$. We can easily compute (cf. Theorem 7.7.2 [10]) the first coefficient $a_0$ of the asymptotic expansion. It is such that $\tilde p^* a_0 = s_0$ on a neighborhood of $\mathrm{diag}(P)$, where $\tilde p$ is the projection of $P \times P$ onto $M \times M$. So we can choose the convenient $s_0$, and by successive iterations the other coefficients $s_l$. Finally we obtain $s$ by the Borel process.

3.4. Symbol of the operators of $\mathcal{F}$. Let us define the full symbol of an operator of $\mathcal{F}$. Let $J$ denote the space $C^\infty(M^2)/I^\infty(\mathrm{diag}\, M)$ which consists of the Taylor expansions along $\mathrm{diag}(M)$ of the functions in $C^\infty(M^2)$.

Definition 2. The symbol $S(T_k)$ of an operator $(T_k) \in \mathcal{F}$ is the formal series

$$S(T_k) = \sum_l \hbar^l [a_l] \in J[[\hbar]],$$

where the kernel of $(T_k)$ is given by (6) and the symbol $(a(., k)) \in S^0(M^2)$ has the asymptotic expansion $\sum_{l=0}^{\infty} k^{-l} a_l(x_l, x_r)$.

From Lemma 1, we deduce that $S(T_k)$ is well-defined, i.e. it does not depend on the choice of the section $E$ nor on the choice of the symbol $(a(., k))$. Furthermore, the Borel process and Lemma 1 imply that

Lemma 3. The map $S : \mathcal{F} \to J[[\hbar]]$ is onto and its kernel is the set of smoothing operators.

Since $M$ is compact and the kernel $T_k(x_l, x_r)$ is $C^\infty$, $T_k$ is a bounded operator of $L^2(M, L^k)$ for every $k$. Furthermore the sequence $(T_k^*)$ of adjoints belongs to $\mathcal{F}$ and $S(T_k^*)(x_l, x_r) = \overline{S(T_k)(x_r, x_l)}$. For every $k$, $T_k$ is trace class and we have the asymptotic expansion

$$\mathrm{Tr}\, T_k = \Bigl(\frac{k}{2\pi}\Bigr)^n \sum_l k^{-l} \int_M f_l(x, x)\, \mu_M(x) + O(k^{-\infty}) \qquad\text{if } S(T_k) = \sum_l \hbar^l f_l.$$
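As an illustration of the trace formula, one can compare it with the dimension count on the Riemann sphere. The sketch below is our own consistency check, not taken from the paper; it assumes $M = \mathbb{CP}^1$ with $L$ the hyperplane bundle (so that $\int_M \omega = 2\pi$ and $\dim \mathcal{H}_k = k + 1$), applies the formula to $T_k = \Pi_k$ with diagonal coefficients $S_l$ as in (4), and uses $S_0 = 1$, $S_1 = \frac{r}{2}$ as stated in the introduction.

```latex
% Assumptions: M = CP^1, n = 1, \mu_M = \omega, \int_M \omega = 2\pi,
% dim H_k = k + 1, and the diagonal coefficients of S(\Pi_k) are the S_l of (4).
\begin{align*}
k + 1 = \operatorname{Tr} \Pi_k
  &= \frac{k}{2\pi} \Bigl( \int_M S_0\, \mu_M + k^{-1} \int_M S_1\, \mu_M
       + k^{-2} \int_M S_2\, \mu_M + \dots \Bigr) + O(k^{-\infty}) \\
  &= k + \frac{1}{2\pi} \int_M S_1\, \mu_M + O(k^{-1}) .
\end{align*}
% Comparing coefficients:  \int_M S_1 \mu_M = 2\pi  and  \int_M S_l \mu_M = 0 for l >= 2.
% With S_1 = r/2 this reads \int_M r \mu_M = 4\pi, which matches Gauss-Bonnet
% (\int K dA = 2\pi\chi = 4\pi) if r is normalized as the Gauss curvature K in
% complex dimension one; that normalization is an inference, not a statement of the paper.
```

The leading term reproduces $\dim \mathcal{H}_k \sim (k/2\pi)^n\, \mathrm{Vol}(M)$, and the subleading term shows how the integral of $S_1$ encodes the constant term of the Riemann-Roch count.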


3.5. Symbolic calculus. We discuss now the composition of the operators of $\mathcal{F}$. We will prove that the product of two operators of $\mathcal{F}$ belongs to $\mathcal{F}$. The set of smoothing operators is an ideal of $\mathcal{F}$, so that $S$ induces a product in $J[[\hbar]]$. We will also compute this product.

Let us introduce some notations. Let $(z^i)$ be a complex coordinate system defined on an open set $U$ of $M$. In terms of these coordinates the Taylor expansion along the diagonal of a function $f \in C^\infty(U^2)$ can be seen as a formal series of $C^\infty(U)[[Z_l, \bar Z_r]]$.

Lemma 4. The map $D_2 : C^\infty(U^2) \to C^\infty(U)[[Z_l, \bar Z_r]]$ defined by

$$[D_2 f](x, Z_l, \bar Z_r) = \sum_{\alpha,\beta} f_{\alpha,\beta}(x)\, Z_l^{\alpha} \bar Z_r^{\beta} \qquad\text{where } f_{\alpha,\beta}(x) = \frac{1}{\alpha!\beta!}\, \partial^{\alpha}_{z_l} \partial^{\beta}_{\bar z_r} f(x_l, x_r)\Big|_{x_l = x_r = x}$$

induces an algebra isomorphism from $C^\infty(U^2)/I^\infty(\mathrm{diag}\, U)$ onto $C^\infty(U)[[Z_l, \bar Z_r]]$.

We need also to consider Taylor expansions of functions in $C^\infty(U^3)$ along the set $\mathrm{trig}(U) = \{(x, x, x) \,/\, x \in U\}$. We use the indices $l$, $m$, $r$ for the first, second and third factors of $U^3$.

Lemma 5. The map $D_3 : C^\infty(U^3) \to C^\infty(U)[[Z_l, Z_m, \bar Z_m, \bar Z_r]]$ defined by

$$[D_3 f](x, Z_l, Z_m, \bar Z_m, \bar Z_r) = \sum_{\alpha,\gamma,\delta,\beta} f_{\alpha,\gamma,\delta,\beta}(x)\, Z_l^{\alpha} Z_m^{\gamma} \bar Z_m^{\delta} \bar Z_r^{\beta},$$

$$\text{where}\quad f_{\alpha,\gamma,\delta,\beta}(x) = \frac{1}{\alpha!\gamma!\delta!\beta!}\, \partial^{\alpha}_{z_l} \partial^{\gamma}_{z_m} \partial^{\delta}_{\bar z_m} \partial^{\beta}_{\bar z_r} f(x_l, x_m, x_r)\Big|_{x_l = x_m = x_r = x},$$

induces an algebra isomorphism from $C^\infty(U^3)/I^\infty(\mathrm{trig}\, U)$ onto the algebra of formal series $C^\infty(U)[[Z_l, Z_m, \bar Z_m, \bar Z_r]]$.

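Let us define the functions $G_{ij}$, $G^{ij}$ and $G_{\alpha,\beta}$ associated to the Kähler metric. The functions $G_{ij}$ are given by
\[ \omega = i \sum_{i,j} G_{ij}\, dz^i \wedge d\bar z^j . \]
The functions $G^{ij}$ are such that $(G^{ji})_{i,j}$ is the inverse of $(G_{ij})_{i,j}$. To define the functions $G_{\alpha,\beta}$, observe that $d\omega = 0$ implies $\partial_{z^k} G_{ij} = \partial_{z^i} G_{kj}$ and $\partial_{\bar z^k} G_{ij} = \partial_{\bar z^j} G_{ik}$. Consequently $\partial_{z^{i_1}} \partial_{z^{i_2}} \cdots \partial_{\bar z^{j_1}} \partial_{\bar z^{j_2}} \cdots G_{i_0 j_0}$ is symmetric with respect to $i_0, i_1, i_2, \ldots$ and $j_0, j_1, j_2, \ldots$. Let $G_{\alpha,\beta}$ denote this function where $\alpha$ (resp. $\beta$) is the multi-index such that $\alpha(l)$ (resp. $\beta(l)$) is the number of indices $i_k$ (resp. $j_k$) equal to $l$.

Theorem 5. If $(P_k)$ and $(Q_k)$ are operators in $\mathcal{F}$, then the same holds for $(P_k \circ Q_k)$. The product
\[ A : \mathcal{J}[[\hbar]] \times \mathcal{J}[[\hbar]] \to \mathcal{J}[[\hbar]], \qquad A\big(S(P_k), S(Q_k)\big) = S(P_k \circ Q_k) \]
is associative and $\mathbb{C}[[\hbar]]$-bilinear. The operators $A_l : \mathcal{J} \times \mathcal{J} \to \mathcal{J}$ defined by $A(F,G) = \sum_l \hbar^l A_l(F,G)$ for all $F, G \in \mathcal{J}$ are given with the previous notations by
\[ A_l(F,G) = [\det(G_{ij})]^{-1} \sum_{k=l}^{3l} \frac{(-1)^{l-k}}{k!\,(k-l)!}\, \Delta^{k}\big(R^{\,k-l} H . D\big)\Big|_{Z_m = \bar Z_m = 0}, \]
where $R$, $D$ and $H$ are the formal series of $C^\infty(U)[[Z_l, Z_m, \bar Z_m, \bar Z_r]]$ defined by
\[ R = \sum_{|\alpha|>0,\ |\beta|>0,\ |\alpha|+|\beta| \ge 3} \frac{G_{\alpha,\beta}(x)}{\alpha!\beta!}\, Z_m^{\alpha} \bar Z_m^{\beta}, \qquad D = \sum_{\alpha,\beta} \frac{\partial_z^{\alpha} \partial_{\bar z}^{\beta} [\det(G_{ij})](x)}{\alpha!\beta!}\, Z_m^{\alpha} \bar Z_m^{\beta}, \]
\[ H = [D_3(f(x_l,x_m) . g(x_m,x_r))](x, Z_l, Z_m, \bar Z_m, \bar Z_r), \]
and $\Delta$ is the operator $\sum_{i,j} G^{ij}(x)\, \partial_{Z_m^i} \partial_{\bar Z_m^j}$ which acts on $C^\infty(U)[[Z_l, Z_m, \bar Z_m, \bar Z_r]]$.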

Remark 2. If $\sum_l \hbar^l f_l$ is the full symbol of $(T_k)$, its principal symbol is $f_0$. The formula for the composition of principal symbols is
\[ \mathcal{J} \times \mathcal{J} \to \mathcal{J}, \qquad \big(F(x, Z_l, \bar Z_r),\, G(x, Z_l, \bar Z_r)\big) \mapsto F(x, Z_l, 0) . G(x, 0, \bar Z_r). \]
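As a quick consistency check, the composition rule of Remark 2 can be read off from the $l = 0$ term of the formula in Theorem 5. The sketch below takes the definitions of $D_2$, $D_3$, $D$ and $H$ as printed above, and writes $f$, $g$ for functions representing $F$, $G$ (this naming is an assumption of the sketch).

```latex
\begin{align*}
% l = 0: the sum over k reduces to the single term k = 0,
% so neither \Delta nor R appears.
A_0(F,G) &= [\det(G_{ij})]^{-1}\,\big(H . D\big)\big|_{Z_m=\bar Z_m=0}, \\
% only the term \alpha = \beta = 0 of D survives at Z_m = \bar Z_m = 0:
D\big|_{Z_m=\bar Z_m=0} &= \det(G_{ij})(x), \\
% in H the terms free of Z_m, \bar Z_m are those where
% \partial_{z_l}^\alpha acts on f and \partial_{\bar z_r}^\beta acts on g:
H\big|_{Z_m=\bar Z_m=0}
  &= \Big(\sum_{\alpha} \tfrac{1}{\alpha!}\,\partial_{z_l}^{\alpha} f(x,x)\, Z_l^{\alpha}\Big)
     \Big(\sum_{\beta} \tfrac{1}{\beta!}\,\partial_{\bar z_r}^{\beta} g(x,x)\, \bar Z_r^{\beta}\Big)
   = F(x,Z_l,0)\, G(x,0,\bar Z_r), \\
% the determinant factors cancel:
A_0(F,G) &= F(x,Z_l,0)\, G(x,0,\bar Z_r).
\end{align*}
```

Setting $Z_l = \bar Z_r = 0$ in the last line gives the pointwise product of the values of $f$ and $g$ on the diagonal.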

Proof. We compute the kernel $T_k(x_l,x_r)$ of the product of two operators in $\mathcal{F}$ whose symbols $F$ and $G$ belong to $\mathcal{J}$. We will estimate this kernel by applying the stationary phase lemma,
\[ T_k(x_l,x_r) = \Big(\frac{k}{2\pi}\Big)^{2n} \int_M E^k(x_l,x_m,x_r)\, F(x_l,x_m)\, G(x_m,x_r)\, \mu_M(x_m), \]
where $E(x_l,x_m,x_r) \in L^*_{x_l} \otimes L_{x_r}$ is the contraction of $E(x_l,x_m) \in L^*_{x_l} \otimes L_{x_m}$ with $E(x_m,x_r) \in L^*_{x_m} \otimes L_{x_r}$. We have $|E(x_l,x_m,x_r)| < 1$ if $(x_l,x_m,x_r) \notin \mathrm{trig}(M)$. So the sequence $T_k(\cdot,k)$ is negligible on every open set which does not meet the diagonal $\{x_l = x_r\}$, and to estimate $T_k$ on the neighborhood of $(x,x)$ modulo a negligible sequence it suffices to integrate on a small neighborhood of $(x,x,x)$. So we may assume that $M$ is an open set $U$ and use the notations introduced in Remark 1. Then
\[ E^k(x_l,x_m,x_r) = e^{ik\phi(x_l,x_m,x_r)}\, s(x_l) \otimes s^{-1}(x_r), \]
where $\phi(x_l,x_m,x_r) = \psi(x_l,x_m) + \psi(x_m,x_r)$.

Lemma 6. The formal series $[D_3 \phi(x_l,x_m,x_r) - D_3 \psi(x_l,x_r)](x, Z_l, Z_m, \bar Z_m, \bar Z_r)$ is equal to
\[ i \sum_{|\alpha|>0,\ |\beta|>0} \frac{G_{\alpha,\beta}(x)}{\alpha!\beta!}\, Z_m^{\alpha} \bar Z_m^{\beta} = i R(x, Z_m, \bar Z_m) + i \sum_{j,k} G_{j,k}\, Z_m^{j} \bar Z_m^{k}, \tag{16} \]
where $R$ is the formal series defined in Theorem 5.

Proof. Set $G(x) = \rho(x) + \bar\rho(x)$. Since $\partial_{z^i} \partial_{\bar z^j} G = G_{i,j}$, we have $G_{\alpha,\beta} = \partial_z^{\alpha} \partial_{\bar z}^{\beta} G$ if $|\alpha|, |\beta| > 0$. Define the functions $G_{0,\beta} = \partial_{\bar z}^{\beta} G$ and $G_{\alpha,0} = \partial_z^{\alpha} G$. From Remark 1,
\[ \phi(x_l,x_m,x_r) - \psi(x_l,x_r) = i \big( G(x_m) - \tilde\psi(x_l,x_m) - \tilde\psi(x_m,x_r) + \tilde\psi(x_l,x_r) \big). \]


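To compute the successive derivatives of $\tilde\psi(x_l,x_m)$ with respect to $z_l^k$ or $\bar z_r^k$, observe that $\partial_{z_l^k} \tilde\psi(x,x) = \partial_{z^k} G(x)$ and
\[ \partial_{\bar z_l^j} \partial_{z_l^k} \tilde\psi(x_l,x_r) \equiv \partial_{z_r^j} \partial_{z_l^k} \tilde\psi(x_l,x_r) \equiv 0 \mod \mathcal{I}^\infty(\{x_l = x_r\}). \]
By iterating this and doing the same with $\bar z_r^k$, we obtain
\[ \partial_{z_l}^{\alpha} \tilde\psi(x,x) = G_{\alpha,0}, \qquad \partial_{\bar z_r}^{\beta} \tilde\psi(x,x) = G_{0,\beta}. \]
It follows that
\[ [D_3 \tilde\psi(x_l,x_m)] = \sum_{\beta} \frac{G_{0,\beta}(x)}{\beta!}\, \bar Z_m^{\beta}, \qquad [D_3 \tilde\psi(x_m,x_r)] = \sum_{\alpha} \frac{G_{\alpha,0}(x)}{\alpha!}\, Z_m^{\alpha}. \]
Moreover, we have
\[ [D_3 \tilde\psi(x_l,x_r)] = G(x), \qquad [D_3 G(x_m)] = \sum_{\alpha,\beta} \frac{G_{\alpha,\beta}(x)}{\alpha!\beta!}\, Z_m^{\alpha} \bar Z_m^{\beta}. \]
By adding up these series, we obtain the result.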

To apply the stationary phase lemma, we show that $d^2_{x_m} \phi$ is non-degenerate at $(x,x,x)$. By Lemma 6, the matrix of $d^2_{x_m} \phi$ is written
\[ \begin{pmatrix} 0 & -i G_{jk}(x) \\ -i G_{kj}(x) & 0 \end{pmatrix} \tag{17} \]
in terms of the basis $(\partial_{z_m^i}, \partial_{\bar z_m^i})$, and the result follows. Let us determine the ideal $J$ of $C^\infty(U^3)$ generated by $\partial_{z_m^j} \phi$, $\partial_{\bar z_m^j} \phi$.

Lemma 7. A function $f \in C^\infty(U^3)$ belongs to $J$ if and only if $[D_3 f]$ belongs to the ideal generated by $Z_m^j$, $\bar Z_m^j$.

Proof. By Lemma 6, $[D_3 \partial_{z_m^j} \phi]$ and $[D_3 \partial_{\bar z_m^j} \phi]$ belong to the ideal generated by $Z_m^j$, $\bar Z_m^j$, so the same holds for every function of $J$. Conversely, consider the ideal $J'$ generated by the functions $u^j = z_m^j - z_l^j$ and $v^j = \bar z_m^j - \bar z_r^j$. The function $\sum_j u^j \bar u^j + v^j \bar v^j$ vanishes on $\mathrm{trig}\, U$ to order 2, and its Hessian is non-degenerate in the directions transversal to $\mathrm{trig}\, U$. So $\mathcal{I}^\infty(\mathrm{trig}\, U) \subset J'$. Moreover $[D_3 u^j] = Z_m^j$ and $[D_3 v^j] = \bar Z_m^j$. We obtain that
\[ f \in J' \iff [D_3 f] \in \langle Z_m^j, \bar Z_m^j \rangle . \]
From Lemma 6, we see that the functions $\partial_{z_m^j} \phi$, $\partial_{\bar z_m^j} \phi$ belong to $J'$, that is they are linear combinations of the functions $u^j$ and $v^j$ with $C^\infty$ coefficients. This gives a linear system which is invertible on a neighborhood of $\mathrm{trig}\, U$ since the coefficients along $\mathrm{trig}\, U$ are those of the matrix (17). We obtain that $J' \subset J$ and the result follows.

If $f \in C^\infty(U^3)$, let $f^r \in C^\infty(U^2)$ denote a function such that $f(x_l,x_m,x_r) \equiv f^r(x_l,x_r)$ modulo $J$. By Lemma 7, such a function exists, is unique modulo $\mathcal{I}^\infty(\mathrm{diag}\, U)$ and
\[ [D_2 f^r](x, Z_l, \bar Z_r) = [D_3 f](x, Z_l, 0, 0, \bar Z_r). \]
Lemma 6 implies that $\phi^r = \psi$. The final result follows then from Theorem 7.7.12 of [10] by using that $\mu_M = \det(G_{ij})\, |dz^1 \ldots dz^n . d\bar z^1 \ldots d\bar z^n|$, (17) and (16).


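4. Toeplitz Operators

In this chapter we prove Theorem 2 and give the properties of the full symbol $\sigma$, the covariant symbol and the contravariant one. The first task is to compute the symbol of the projector $(\Pi_k)$. To do this we consider the set $\tilde{\mathcal{T}}$ of the operators $(T_k) \in \mathcal{F}$ such that
\[ \forall Z \in C^\infty(M, T^{1,0}M), \qquad \nabla_{\bar Z} \circ T_k \equiv T_k \circ \nabla_Z \equiv 0 \mod O(k^{-\infty}), \]
where $O(k^{-\infty})$ is the set of smoothing operators. $\tilde{\mathcal{T}}$ is a subalgebra of $\mathcal{F}$ and $(\Pi_k)$ is an operator of $\tilde{\mathcal{T}}$. As we shall see, $\tilde{\mathcal{T}} = \mathcal{T} + O(k^{-\infty})$, that is every operator of $\tilde{\mathcal{T}}$ is the sum of a Toeplitz operator and a smoothing operator, and conversely.

Lemma 8. Let $(T_k)$ be an operator of $\mathcal{F}$ with symbol $S(T_k) = \sum_l \hbar^l [f_l]$. Then $(T_k)$ belongs to $\tilde{\mathcal{T}}$ if and only if
\[ \bar Z_l . f_l(x_l,x_r) \equiv Z_r . f_l(x_l,x_r) \equiv 0 \mod \mathcal{I}^\infty(\mathrm{diag}\, M) \tag{18} \]
for every integer $l$ and holomorphic vector field $Z \in C^\infty(M, T^{1,0}M)$.

Proof. If $(T_k)$ is an operator in $\mathcal{F}$ with symbol $\sum_l \hbar^l [f_l]$ and $Z$ is a holomorphic vector field, then $(\nabla_{\bar Z} \circ T_k)$ and $(T_k \circ \nabla_Z)$ are operators in $\mathcal{F}$ with symbol $\sum_l \hbar^l [\bar Z_l . f_l(x_l,x_r)]$ and $\sum_l \hbar^l [Z_r . f_l(x_l,x_r)]$.

Let us define the full symbol map $\tilde\sigma : \tilde{\mathcal{T}} \to C^\infty(M)[[\hbar]]$ by
\[ \begin{array}{ccc} \mathcal{F} & \xrightarrow{\ S\ } & \mathcal{J}[[\hbar]] \\ \uparrow & & \downarrow r \\ \tilde{\mathcal{T}} & \xrightarrow{\ \tilde\sigma\ } & C^\infty(M)[[\hbar]] \end{array} \]
where $r$ is the restriction
\[ r\Big( \sum_l \hbar^l f_l(x_l,x_r) \Big) = \sum_l \hbar^l f_l(x,x). \]
From the properties of $S$ and Lemma 8, it follows that the map $\tilde\sigma$ is onto and its kernel is the set $O(k^{-\infty})$ which consists of the smoothing operators. Since $O(k^{-\infty})$ is an ideal of $\tilde{\mathcal{T}}$, we obtain an associative product
\[ C^\infty(M)[[\hbar]] \times C^\infty(M)[[\hbar]] \to C^\infty(M)[[\hbar]], \qquad (f,g) \mapsto f * g. \]

Lemma 9. The product $*$ is $\mathbb{C}[[\hbar]]$-bilinear. The operators
\[ B_l : C^\infty(M) \times C^\infty(M) \to C^\infty(M) \]
defined by $f * g = \sum_l \hbar^l B_l(f,g)$ for every $f, g \in C^\infty(M)$ are bidifferential. Locally, they are given by
\[ B_l(f,g) = [\det(G_{ij})]^{-1} \sum_{k=l}^{3l} \frac{(-1)^{l-k}}{k!\,(k-l)!}\, \Delta^{k}\big(R^{\,k-l} H . D\big)\Big|_{Z_m = \bar Z_m = 0}, \tag{19} \]
where $\Delta$, $R$ and $D$ are defined as in Theorem 5 and
\[ H = \Big( \sum_{\alpha} \frac{1}{\alpha!}\, \partial_z^{\alpha} g(x)\, Z_m^{\alpha} \Big) . \Big( \sum_{\beta} \frac{1}{\beta!}\, \partial_{\bar z}^{\beta} f(x)\, \bar Z_m^{\beta} \Big). \]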


Proof. This follows from Theorem 5. It suffices to compute $[D_3 F(x_l,x_m)]$, where $F(x,x) = f(x)$ and $\bar Z_l . F(x_l,x_r)$ and $Z_r . F(x_l,x_r)$ vanish to order $\infty$ along $\{x_l = x_r\}$ if $Z$ is a holomorphic vector field. This computation can be done as in the proof of Lemma 6. We compute in the same way $[D_3 G(x_m,x_r)]$.

We obtain that $B_0(f,g) = f.g$. Consequently $*$ has a unit and it is determined as being the unique formal series $1_* = \sum_l \hbar^l S_l$ such that $1_* \neq 0$ and $1_* * 1_* = 1_*$. The symbol $\tilde\sigma(\Pi_k)$ satisfies $\tilde\sigma(\Pi_k) * \tilde\sigma(\Pi_k) = \tilde\sigma(\Pi_k)$. Furthermore $\tilde\sigma(\Pi_k) \neq 0$, because $(\Pi_k)$ is not a smoothing operator. So $\tilde\sigma(\Pi_k) = 1_*$. To compute it, we can use that $1_* * 1 = 1$, which gives
\[ S_0 = 1, \qquad S_l = - \sum_{i=0}^{l-1} B_{l-i}(S_i, 1). \tag{20} \]
Let $(T_k)$ be an operator of $\mathcal{F}$. Using that $(\Pi_k)$ is the unit of $\tilde{\mathcal{T}}/\mathcal{R}$, we obtain that $(T_k) \in \tilde{\mathcal{T}}$ if and only if $\Pi_k T_k \Pi_k \equiv T_k \mod O(k^{-\infty})$. We now come to the Toeplitz operators.

Proposition 3. The following assertions are equivalent and define the set $\mathcal{T}$ of the Toeplitz operators $(T_k)$:
i) $(T_k) \in \mathcal{F}$ and $\Pi_k T_k \Pi_k = T_k$,
ii) there exists $\sum_l \hbar^l f_l \in C^\infty(M)[[\hbar]]$ such that $T_k \equiv \Pi_k M_{f(\cdot,k)} \Pi_k + O(k^{-\infty})$, where $f(\cdot,k) = \sum_l k^{-l} f_l + O(k^{-\infty})$, and $\Pi_k T_k \Pi_k = T_k$.
$\mathcal{T}$ is a $*$-algebra, that is if $(R_k)$ and $(S_k)$ belong to $\mathcal{T}$, then the same holds for $(R_k \circ S_k)$ and $(R_k^*)$. Define the full symbol map $\sigma : \mathcal{T} \to C^\infty(M)[[\hbar]]$ by
\[ \begin{array}{ccc} \mathcal{F} & \xrightarrow{\ S\ } & \mathcal{J}[[\hbar]] \\ \uparrow & & \downarrow r \\ \mathcal{T} & \xrightarrow{\ \sigma\ } & C^\infty(M)[[\hbar]] \end{array} \]
or equivalently $\sigma$ is the restriction
\[ \sigma : \mathcal{T} \hookrightarrow \tilde{\mathcal{T}} \xrightarrow{\ \tilde\sigma\ } C^\infty(M)[[\hbar]]. \]
Then $\sigma$ is onto and its kernel is $\mathcal{T} \cap O(k^{-\infty})$. Since $\mathcal{T} \cap O(k^{-\infty})$ is an ideal of $\mathcal{T}$, we obtain an associative product $C^\infty(M)[[\hbar]] \times C^\infty(M)[[\hbar]] \to C^\infty(M)[[\hbar]]$. It is the same as the product $*$ described in Lemma 9.
The map $\sigma_{cont} : \mathcal{T} \to C^\infty(M)[[\hbar]]$ such that $\sigma_{cont}(T_k) = \sum_l \hbar^l f_l$ if $T_k \equiv \Pi_k M_{f(\cdot,k)} \Pi_k + O(k^{-\infty})$ and $f(\cdot,k) = \sum_l k^{-l} f_l + O(k^{-\infty})$ is well-defined. It is onto and its kernel is $\mathcal{T} \cap O(k^{-\infty})$.
The map $\tilde\psi : C^\infty(M)[[\hbar]] \to C^\infty(M)[[\hbar]]$, which sends $\sigma_{cont}(T_k)$ into $\sigma(T_k)$ for every $(T_k) \in \mathcal{T}$, is $\mathbb{C}[[\hbar]]$-linear. The operators $\tilde\psi_l$ such that $\tilde\psi(f) = \sum_l \hbar^l \tilde\psi_l(f)$ if $f \in C^\infty(M)$, are differential of order $2l$. Furthermore $\tilde\psi_0(f) = f$.

Remark 3. Recall that $\tilde{\mathcal{T}}$ is the set of the operators $(T_k) \in \mathcal{F}$ which satisfy $\Pi_k T_k \Pi_k \equiv T_k \mod O(k^{-\infty})$. Using the definition i) of a Toeplitz operator, we obtain that $\tilde{\mathcal{T}} = \mathcal{T} + O(k^{-\infty})$. Now Theorem 2 of the introduction follows from Lemma 8.


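Proof. First define a Toeplitz operator by assertion i). The properties of $\sigma$ follow from those of $\tilde\sigma$ and the fact that $\tilde{\mathcal{T}} = \mathcal{T} + O(k^{-\infty})$. To prove that ii) $\Rightarrow$ i), observe that $M_{f(\cdot,k)} \Pi_k \in \mathcal{F}$ and so $\Pi_k M_{f(\cdot,k)} \Pi_k \in \mathcal{F}$. To prove the converse, we compute $\sigma(\Pi_k M_{f(\cdot,k)} \Pi_k)$. We have
\[ S(M_{f(\cdot,k)} \Pi_k)(x_l,x_r,\hbar) = \sum_l \hbar^l f_l(x_l) . S(\Pi_k)(x_l,x_r,\hbar) \]
and by applying Theorem 5, we obtain that
\[ \sigma(\Pi_k M_{f(\cdot,k)} \Pi_k) = \sum_{l,m} \hbar^{l+m}\, \tilde\psi_l(f_m), \]
where the operators $\tilde\psi_l$ are differential of order $2l$ and $\tilde\psi_0$ is the identity. This defines a map $\tilde\psi = \sum_l \hbar^l \tilde\psi_l$, which is bijective. We obtain that i) $\Rightarrow$ ii). Now by definition $\sigma_{cont} = \tilde\psi^{-1} \circ \sigma$ and the properties of $\sigma_{cont}$ follow from those of $\sigma$. Finally, observe that ˜ = ˜ and this completes the proof.

In the last proposition, we defined the symbol $\sigma$ and the contravariant symbol. The third full symbol is the covariant symbol.

Definition 3. The covariant symbol map $\sigma_{cov} : \mathcal{T} \to C^\infty(M)[[\hbar]]$ is the map
\[ (T_k) \mapsto \sigma_{cov}(T_k) = \sigma(T_k) \Big( \sum_l \hbar^l S_l \Big)^{-1}. \]
We denote by $\psi$ the map $C^\infty(M)[[\hbar]] \to C^\infty(M)[[\hbar]]$, which sends $\sigma_{cont}(T_k)$ into $\sigma_{cov}(T_k)$. It satisfies the same properties as $\tilde\psi$. So we have the following commutative diagram:
\[ \begin{array}{ccccc} & & \mathcal{T}/O(k^{-\infty}) & & \\ & \swarrow{\scriptstyle \sigma_{cov}} & \downarrow{\scriptstyle \sigma_{cont}} & \searrow{\scriptstyle \sigma} & \\ D & \xleftarrow{\ \psi\ } & D & \xrightarrow{\ \tilde\psi\ } & D \end{array} \qquad \text{with } D = C^\infty(M)[[\hbar]], \]
where each map is a bijection. Using the symbol maps $\sigma_{cov}$ and $\sigma_{cont}$, we define the products $*_{cov}$ and $*_{cont}$ of $C^\infty(M)[[\hbar]]$. These are associative products with unit 1. We describe the symbolic calculus modulo $O(\hbar^2)$. To do this, we introduce the bivector $G^{-1} \in C^\infty(M, T^{1,0}M \otimes T^{0,1}M)$ and the Laplacian $\Delta : C^\infty(M) \to C^\infty(M)$ defined locally by
\[ G^{-1} = \sum_{i,j} G^{ij}\, \partial_{z^i} \otimes \partial_{\bar z^j}, \qquad \Delta = \sum_{i,j} G^{ij}\, \partial_{z^i} \partial_{\bar z^j}, \]
where $(z^i)$ are complex coordinates. We denote by $r \in C^\infty(M)$ the scalar curvature of $(M,g)$.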


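Proposition 4. If $f$ and $g \in C^\infty(M)$, we have
\[ \begin{aligned} f * g &= f.g + \hbar \langle dg \otimes df, G^{-1} \rangle - \tfrac{\hbar}{2}\, r.f.g + O(\hbar^2), \\ \psi(f) &= f + \hbar\, \Delta f + O(\hbar^2), \\ f *_{cov} g &= f.g + \hbar \langle dg \otimes df, G^{-1} \rangle + O(\hbar^2), \\ f *_{cont} g &= f.g - \hbar \langle df \otimes dg, G^{-1} \rangle + O(\hbar^2). \end{aligned} \]
Consequently, we have $f * g = f.g + O(\hbar)$, $f * g - g * f = \frac{\hbar}{i} \{f,g\} + O(\hbar^2)$ and the same holds for $*_{cov}$ and $*_{cont}$. Furthermore, if $(T_k)$ is a Toeplitz operator then $\sigma(T_k) = \sigma_{cov}(T_k) + O(\hbar) = \sigma_{cont}(T_k) + O(\hbar)$. We say that the function $f \in C^\infty(M)$ such that $\sigma(T_k) = f + O(\hbar)$ is the principal symbol of $(T_k)$.

Proof. Let $x$ be a point of $M$ and $(z^i)$ a complex coordinate system such that
\[ z^i(x) = 0, \qquad G_{i,j}(x) = \delta_{i,j}, \qquad G_{i,\alpha}(x) = G_{\alpha,i}(x) = 0 \ \text{ if } |\alpha| = 2. \]
We have to show that the operator $B_1$ defined in Lemma 9 is given by
\[ B_1(f,g)\big|_x = \Big( \sum_i (\partial_{\bar z^i} f)(\partial_{z^i} g) + \frac{1}{2} \sum_{i,j} G_{ij,ij} . f.g \Big)\Big|_x \tag{21} \]
with $G_{ij,kl} = G_{\alpha,\beta}$, where $\alpha(s) = \delta_{si} + \delta_{sj}$ and $\beta(s) = \delta_{sk} + \delta_{sl}$. The formula (19) gives for $B_1(f,g)$,
\[ \Big( \Delta(F.G.D) - \frac{1}{2} \Delta^2(R.F.G.D) + \frac{1}{12} \Delta^3(R^2.F.G.D) \Big)\Big|_{Z_m = \bar Z_m = 0}. \]
Since $R$ vanishes to order 4 at $x$, the third term of the sum vanishes at $x$. We have
\[ D \equiv 1 + \sum_{i,j,k} G_{ij,ik}(x)\, Z_m^{j} \bar Z_m^{k} \]
modulo some terms of order larger than 3. So, at $x$, $\Delta(F.G.D)\big|_{Z_m = \bar Z_m = 0}$ is equal to
\[ \Delta \Big( \sum_{i,j} (\partial_{\bar z^i} f)(\partial_{z^j} g)\, Z_m^{j} \bar Z_m^{i} + f.g \sum_{i,j,k} G_{ij,ik}(x)\, Z_m^{j} \bar Z_m^{k} \Big)\Big|_{Z_m = \bar Z_m = 0} = \sum_i (\partial_{\bar z^i} f)(\partial_{z^i} g) + f.g \sum_{i,j} G_{ij,ij}. \]
On the other hand
\[ R\big|_x \equiv \frac{1}{4} \sum_{i,j,k,l} G_{ij,kl}(x)\, Z_m^{i} Z_m^{j} \bar Z_m^{k} \bar Z_m^{l} \]
modulo some terms of order larger than 5. So, at $x$,
\[ \Delta^2(R.F.G.D)\big|_{Z_m = \bar Z_m = 0} = \Delta^2 \Big( \frac{1}{4}\, f.g \sum_{i,j,k,l} G_{ij,kl}(x)\, Z_m^{i} Z_m^{j} \bar Z_m^{k} \bar Z_m^{l} \Big)\Big|_{Z_m = \bar Z_m = 0} = f.g \sum_{i,j} G_{ij,ij}. \]
By adding up, we obtain (21). In the same way, we compute
\[ \tilde\psi(f)\big|_x = f + \hbar \Big( \sum_i \partial_{\bar z^i} \partial_{z^i} f - \frac{1}{2} \sum_{i,j} G_{ij,ij} . f \Big)\Big|_x + O(\hbar^2). \]
And we obtain the formulas of the proposition.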
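The first-order coefficient of the unit $1_* = \sum_l \hbar^l S_l$ follows from the recursion (20) and the local formula (21). The short derivation below is only a sketch. It assumes the identification $r(x) = -\sum_{i,j} G_{ij,ij}(x)$ in the coordinates used in the proof above, which is what comparing (21) with the first formula of Proposition 4 suggests.

```latex
\begin{align*}
% (20) with l = 1 and S_0 = 1:
S_1 &= -B_1(S_0, 1) = -B_1(1,1), \\
% (21) with f = g = 1: every derivative term vanishes
B_1(1,1)\big|_x &= \tfrac12 \sum_{i,j} G_{ij,ij}(x) = -\tfrac12\, r(x), \\
% hence the unit to first order:
S_1 &= \tfrac12\, r, \qquad 1_* = 1 + \tfrac{\hbar}{2}\, r + O(\hbar^2).
\end{align*}
```

Since $\tilde\sigma(\Pi_k) = 1_*$, this is also the expansion of the symbol of the projector to first order.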

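By applying Eq. (20), we compute $\sigma(\Pi_k)$ modulo $O(\hbar^2)$.

Corollary 2. $\sigma(\Pi_k) = 1 + \frac{\hbar}{2}\, r + O(\hbar^2)$.

From this we obtain the first and second terms of the Riemann-Roch-Hirzebruch formula
\[ \dim \mathcal{H}_k = \Big(\frac{k}{2\pi}\Big)^{n} \int_M \Big( 1 + \frac{1}{2k}\, r \Big) \frac{\omega^n}{n!} + O(k^{n-2}). \]
Applying Lemma 9, Proposition 3 and Proposition 4, we obtain:

Proposition 5. The products $*$, $*_{cov}$ and $*_{cont}$ are equivalent star products.

Let $V_l, N_l : C^\infty(M^2) \to C^\infty(M)$ denote the bidifferential operators such that $f *_{cov} g = \sum_l \hbar^l V_l(f,g)$, $f *_{cont} g = \sum_l \hbar^l N_l(f,g)$, and $\psi_l$, $\psi_l^{-1}$ the differential operators such that $\psi^{-1}(f) = \sum_l \hbar^l \psi_l^{-1}(f)$, $\psi(f) = \sum_l \hbar^l \psi_l(f)$, where $\psi^{-1}$ is the inverse of $\psi$. The operators $V_l$ may be easily computed using the following equation:
\[ V_l(f,g) = \sum_{l_1 + l_2 + l_3 + l_4 = l} S^{-1}_{l_1}\, B_{l_2}(S_{l_3} f, S_{l_4} g), \tag{22} \]
where $\sum_l \hbar^l S_l^{-1}$ is the inverse of $\sum_l \hbar^l S_l$ for the usual product. Then we can deduce $N_l$, $\psi_l$ and $\psi_l^{-1}$ from the following proposition.

Proposition 6. Let $U$ be an open set of $M$ endowed with a system of complex coordinates. Denote by $(a_{\alpha,\beta})$ the family of $C^\infty(U)$ such that $\psi_l = \sum_{\alpha,\beta} a_{\alpha,\beta}\, \partial_z^{\alpha} \partial_{\bar z}^{\beta}$, then
\[ V_l(f,g) = \sum_{\alpha,\beta} a_{\alpha,\beta}\, (\partial_z^{\alpha} g)(\partial_{\bar z}^{\beta} f). \tag{23} \]
In a similar way, if we denote by $b_{\alpha,\beta}$ the functions of $C^\infty(U)$ such that $\psi_l^{-1} = \sum_{\alpha,\beta} b_{\alpha,\beta}\, \partial_z^{\alpha} \partial_{\bar z}^{\beta}$, then $N_l(f,g) = \sum_{\alpha,\beta} b_{\alpha,\beta}\, (\partial_z^{\alpha} f)(\partial_{\bar z}^{\beta} g)$.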
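As a check of Proposition 6 at order $l = 1$, one can compare (23) with the first-order formulas of Proposition 4. The sketch below assumes $\psi_1 = \Delta$ and $\psi_1^{-1} = -\Delta$, read off from $\psi(f) = f + \hbar \Delta f + O(\hbar^2)$, and pairs $G^{-1} = \sum_{i,j} G^{ij}\, \partial_{z^i} \otimes \partial_{\bar z^j}$ with $dg \otimes df$ factor by factor.

```latex
\begin{align*}
% a_{\alpha,\beta} = G^{ij} for \alpha = e_i, \beta = e_j, zero otherwise
\psi_1 = \sum_{i,j} G^{ij}\,\partial_{z^i}\partial_{\bar z^j}
  &\;\Longrightarrow\;
  V_1(f,g) = \sum_{i,j} G^{ij}\,(\partial_{z^i} g)(\partial_{\bar z^j} f)
           = \langle dg \otimes df,\, G^{-1}\rangle, \\
% b_{\alpha,\beta} = -G^{ij} for \alpha = e_i, \beta = e_j, zero otherwise
\psi_1^{-1} = -\sum_{i,j} G^{ij}\,\partial_{z^i}\partial_{\bar z^j}
  &\;\Longrightarrow\;
  N_1(f,g) = -\sum_{i,j} G^{ij}\,(\partial_{z^i} f)(\partial_{\bar z^j} g)
           = -\langle df \otimes dg,\, G^{-1}\rangle.
\end{align*}
```

Both expressions agree with the coefficients of $\hbar$ in $f *_{cov} g$ and $f *_{cont} g$ given in Proposition 4.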


Using this, we may deduce that the functions $a_{\alpha,\beta}$, $b_{\alpha,\beta}$ are given by universal polynomials in $[\det(G_{i,j})]^{-1}$ and $G_{\alpha',\beta'}$.

Proof. From (22), we deduce that the operators $V_l$ act by antiholomorphic derivations on the first factor and holomorphic derivations on the second factor. Observe that $\psi(f) = f$ and $\psi(\bar f) = \bar f$ over an open set $V$, if $f$ is holomorphic on $V$. Indeed, since $S(M_f \Pi_k)(x_l,x_r) = f(x_l)\, S(\Pi_k)(x_l,x_r)$ satisfies Eqs. (18) over $V$, we have $A(S(\Pi_k), S(M_f \Pi_k)) = S(\Pi_k M_f \Pi_k)$ over $V$, which leads to $\psi(f) = f$. $\psi(\bar f) = \bar f$ can be proved in the same way.

Let us prove that the operators $N_l$ act by holomorphic derivations on the first factor and antiholomorphic derivations on the second factor. It suffices to prove that $f *_{cont} g = f.g$ and $\bar g *_{cont} \bar f = \bar g . \bar f$ on $V$ if $g$ is holomorphic on $V$. The second equation follows from the first one by considering adjoints. $f *_{cont} g = f.g$ is a consequence of
\[ S(\Pi_k M_f \Pi_k M_g \Pi_k)\big|_V = A\big(S(\Pi_k M_f), S(\Pi_k M_g \Pi_k)\big)\big|_V = A\big(S(\Pi_k M_f), S(M_g \Pi_k)\big)\big|_V . \]
Finally, if $f$ and $g$ are holomorphic over $V$, then
\[ \psi(\bar f . g)\big|_V = \psi(\bar f *_{cont} g)\big|_V = \psi(\bar f) *_{cov} \psi(g)\big|_V = \bar f *_{cov} g\big|_V , \]
that is $\psi_l(\bar f g) = V_l(\bar f, g)$ over $V$, which proves (23). In the same way, we obtain that $\psi_l^{-1}(f . \bar g) = N_l(f, \bar g)$ on $V$.

Let us explain how we can define the symbolic calculus on a Kähler manifold which is not necessarily compact or which does not admit a prequantization bundle. Observe that the star products $*$, $*_{cov}$, $*_{cont}$ and the equivalence maps do not depend on the choice of the prequantization bundle $L$. Indeed, this is clear for $*$ because the formula (19) depends only on the Kähler metric. So the unit $1_*$ of $(C^\infty(M)[[\hbar]], *)$ does not depend on $L$ and consequently the same holds for the covariant star product. Finally, we compute the contravariant symbol by the formulas given in Proposition 6, which do not depend on $L$.

Now assume that $M$ is a Kähler manifold endowed with a prequantization bundle which is not necessarily compact. We can define the algebra $\mathcal{F}$ in the following way: it consists of the operators which satisfy the assumptions of Definition 1 and whose kernel is properly supported. Then we define as previously the subalgebra $\tilde{\mathcal{T}}$ and introduce the symbol $\sigma$, the covariant symbol and the contravariant symbol. Their products still define the star products $*$, $*_{cov}$, $*_{cont}$ and all the formulas are the same as in the compact case.

Finally, if $M$ does not admit a prequantization bundle, we cannot construct an algebra of operators. Using complex coordinate systems, we can still define the products $*$, $*_{cov}$, $*_{cont}$ and the equivalence maps. We have to prove that these definitions do not depend on the choice of coordinates and that the products obtained are associative. But it suffices to prove this locally and for every $x \in M$ there exists a neighborhood $U$ of $x$ endowed with a prequantization bundle $L \to U$. So we can apply the previous observations on the non-compact case.

5. Microsupport, Characterization by the Coherent States and Functional Calculus of Toeplitz Operators

We begin with the microsupport. First we introduce the coherent states. Let $P \subset L$ be the set which consists of the vectors $u \in L$ such that $|u| = 1$. Denote by $\pi : P \to M$ the canonical projection. For every $k$, the map which sends $s \in \mathcal{H}_k$ into $h\big(s(\pi(u)), u^k\big)$ is continuous. Let $u \in P$. By the Riesz lemma, there exists a unique vector $e_k^u$ of $\mathcal{H}_k$ such that
\[ (s, e_k^u) = h\big(s(\pi(u)), u^k\big), \quad \forall s \in \mathcal{H}_k, \]
that is, $s(\pi(u)) = (s, e_k^u)\, u^k$ for all $s \in \mathcal{H}_k$. $e_k^u$ is the coherent state at $u$. If $T_k$ is an operator $C^\infty(M, L^k) \to C^\infty(M, L^k)$ such that $\Pi_k T_k \Pi_k = T_k$, we have
\[ T_k e_k^u(x) = T_k(x, \pi(u)) . u^k, \tag{24} \]
\[ (T_k e_k^u, e_k^v) = v^{-k} . T_k\big(\pi(v), \pi(u)\big) . u^k, \tag{25} \]
where the points are contractions. These properties can be proved by writing them in terms of an orthogonal basis of $\mathcal{H}_k$. By choosing $T_k = \Pi_k$, we deduce that
\[ e_k^u(x) = \Pi_k(x, \pi(u)) . u^k, \qquad (e_k^u, e_k^v) = O(k^{-\infty}) \ \text{ if } \pi(u) \neq \pi(v), \]
\[ (e_k^u, e_k^u) = \Big(\frac{k}{2\pi}\Big)^{n} \sum_l k^{-l} S_l(\pi(u)) + O(k^{-\infty}). \]

Proposition 7. Let $(u_k)$ be a sequence of $\mathcal{H}_k$. The following assertions are equivalent:
i) $\exists N$, $\|u_k\| = O(k^N)$,
ii) $\exists N$, $\sup_M |u_k| = O(k^N)$,
iii) $\forall l \ge 0$, $\forall$ vector fields $X_1, \ldots, X_l$ of $M$, $\exists N$, $\sup_M |\nabla_{X_1} \cdots \nabla_{X_l} u_k| = O(k^N)$.
When they are satisfied, we say that $(u_k)$ is admissible.

Proof. Obviously, iii) $\Rightarrow$ ii) $\Rightarrow$ i). To prove that i) $\Rightarrow$ iii), we introduce the vectors $(e_k^{Q,u})$ which generalize the coherent states. Let $Q : C^\infty(M, L^k) \to C^\infty(M, L^k)$ be a differential operator of the form $\nabla_{X_1} \circ \cdots \circ \nabla_{X_l}$, where $X_1, \ldots, X_l$ are $l$ vector fields. Since $\mathcal{H}_k$ is a finite dimensional subspace of $C^\infty(M, L^k)$, the map which sends $v \in \mathcal{H}_k$ into $h\big((Q.v)|_{\pi(u)}, u^k\big)$ is continuous. By the Riesz lemma, there exists $e_k^{Q,u} \in \mathcal{H}_k$ such that $(v, e_k^{Q,u}) = h\big((Q.v)|_{\pi(u)}, u^k\big)$ for all $v \in \mathcal{H}_k$. We have
\[ T_k e_k^{Q,u}(x) = \big(Q_r . T_k\big)(x, \pi(u)) . u^k, \tag{26} \]
\[ (T_k e_k^{Q,u}, e_k^{R,v}) = v^{-k} . \big(\bar R_l \otimes Q_r . T_k\big)\big(\pi(v), \pi(u)\big) . u^k . \]
Hence the norm of $(e_k^{Q,u})_k$ is $O(k^N)$ uniformly with respect to $u \in P$ for some $N$ (which depends on the order of $Q$). The result follows by applying the Cauchy-Schwarz lemma.

Proposition 8. Let $(u_k)$ be an admissible sequence of $\mathcal{H}_k$ and $x \in M$. The following assertions are equivalent.
i) $\exists$ a neighborhood $V$ of $x$ such that $\int_V |u_k|^2 \mu_M = O(k^{-\infty})$,
ii) $\exists$ a neighborhood $V$ of $x$ such that $\sup_V |u_k| = O(k^{-\infty})$,
iii) $\exists$ a neighborhood $V$ of $x$ such that, $\forall l \ge 0$ and $\forall$ vector fields $X_1, \ldots, X_l$ of $M$, $\sup_V |\nabla_{X_1} \cdots \nabla_{X_l} u_k| = O(k^{-\infty})$.
When they are satisfied, we say that $(u_k)$ is negligible at $x$.


Proof. Obviously, iii) $\Rightarrow$ ii) $\Rightarrow$ i). Let us prove that i) $\Rightarrow$ iii). Choose a neighborhood $W$ of $x$ such that $\overline{W} \subset V$ and a section $f : W \to P$. Let us write for all $y \in W$,
\[ |\nabla_{X_1} \cdots \nabla_{X_l} u_k|(y) = \Big| \int_M h\big(u_k, e_k^{f(y),Q}\big)\, \mu_M \Big| . \]
$\big| h\big(u_k, e_k^{f(y),Q}\big)(x) \big|$ is smaller than $|u_k(x)| . |e_k^{f(y),Q}(x)|$. The first term of this product is $O(k^N)$ since $u_k$ is admissible and the second one is uniformly $O(k^{-\infty})$ when $(x,y) \in V^c \times W$. Hence, the integral on the complement $V^c$ of $V$ is $O(k^{-\infty})$. By applying the Cauchy-Schwarz lemma and assumption i), we can estimate the integral on $V$.

Remark 4. If $(s_k)$ is a sequence of $\mathcal{H}_k$, we prove in the same way that $(s_k)$ is negligible iff $\|s_k\| = O(k^{-\infty})$.

Remark 5. Let $(T_k)$ be a sequence such that for every $k$, $T_k$ is an operator $C^\infty(M, L^k) \to C^\infty(M, L^k)$ and $\Pi_k T_k \Pi_k = T_k$. Note that we can apply the previous propositions to the sequence $(T_k(x_l,x_r))$ of kernels. Indeed, $T_k(x_l,x_r)$ is a holomorphic section of $L^k \boxtimes L^{-k} \to M \times M$, $M \times M$ is a Kählerian manifold whose fundamental 2-form is $\omega_l - \omega_r$ and the curvature of $L^k \boxtimes L^{-k}$ is $\frac{k}{i}(\omega_l - \omega_r)$. We can also apply the previous remark and deduce that $(T_k)$ is a smoothing operator iff $\|T_k\| = O(k^{-\infty})$.

Definition 4. The microsupport of an admissible sequence $(u_k)$ of $\mathcal{H}_k$ is the complement of $\{x \in M \,/\, (u_k) \text{ is negligible at } x\}$. Let $(T_k)$ be a sequence such that for every $k$, $T_k$ is an operator $C^\infty(M, L^k) \to C^\infty(M, L^k)$ and $\Pi_k T_k \Pi_k = T_k$. We say that $(T_k)$ is admissible if the kernel sequence $(T_k(x_l,x_r))$ is admissible. In this case, the microsupport of $(T_k)$ is the microsupport of the sequence $(T_k(x_l,x_r))$.

The microsupport is a closed set. We denote it by $\mathrm{MS}(u_k)$ or $\mathrm{MS}(T_k)$. We have
\[ \mathrm{MS}(T_k . s_k) \subset \mathrm{MS}(T_k) . \mathrm{MS}(s_k) = \{ x \,/\, \exists y \in M,\ y \in \mathrm{MS}(s_k) \text{ and } (x,y) \in \mathrm{MS}(T_k) \}, \]
\[ \mathrm{MS}(T_k \circ T'_k) \subset \mathrm{MS}(T_k) \circ \mathrm{MS}(T'_k) = \{ (x,z) \,/\, \exists y \in M,\ (x,y) \in \mathrm{MS}(T_k) \text{ and } (y,z) \in \mathrm{MS}(T'_k) \}. \]
The microsupport of a Toeplitz operator $(T_k)$ with symbol $\sum_l \hbar^l f_l$ is a subset of $\mathrm{diag}(M)$. By identifying $\mathrm{diag}(M)$ with $M$, we have $\mathrm{MS}(T_k) = \cup_l\, \mathrm{Supp}\, f_l$. We say that $(T_k)$ is elliptic at $x$ if $f_0(x) \neq 0$, or equivalently if there exists a Toeplitz operator $(S_k)$ such that $(T_k S_k - \Pi_k)$ and $(S_k T_k - \Pi_k)$ are negligible at $(x,x)$.

Proposition 9. Let $(s_k)$ be an admissible sequence of $\mathcal{H}_k$. A point $x$ of $M$ does not belong to the microsupport of $(s_k)$ if and only if there exists a Toeplitz operator $(T_k)$ elliptic at $x$ such that $(T_k . s_k)$ is negligible at $x$.


Proof. If $s_k(y)$ is $O(k^{-\infty})$ on a neighborhood $V$ of $x$, we introduce a Toeplitz operator $(T_k)$ elliptic at $x$ and whose microsupport is a subset of $V$. This implies that $(T_k s_k)$ is negligible. Conversely, assume that $T_k . s_k(y)$ is $O(k^{-\infty})$ on a neighborhood of $x$ and $(T_k)$ is elliptic at $x$. By multiplying $(T_k)$ by a Toeplitz operator $(S_k)$ such that $(S_k T_k - \Pi_k)$ is negligible at $(x,x)$, we may assume that $(T_k - \Pi_k)$ is negligible at $(x,x)$. If $f$ is a section of $P$ defined on a neighborhood of $x$, we have $\big(T_k s_k, e_k^{f(y)}\big) = \big(s_k, T_k^* e_k^{f(y)}\big)$. So it suffices to prove that when $y$ belongs to some neighborhood of $x$,
\[ T_k^* e_k^{f(y)} = e_k^{f(y)} + r_k^{y}, \]
where $\|r_k^y\|$ is $O(k^{-\infty})$ uniformly with respect to $y$. This follows from (24) which implies that
\[ T_k e_k^u(x) = \frac{T_k(x, \pi(u))}{\Pi_k(x, \pi(u))}\, e_k^u(x), \tag{27} \]
and the fact that $(T_k - \Pi_k)$ is negligible at $(x,x)$.

The computation modulo $O(k^{-1})$ of the $L^2$-norm of a Toeplitz operator was done in [3]. We recall the proof which is an easy consequence of the previous results. Then we give a characterization by the coherent states of the Toeplitz operators. We end with the functional calculus.

Proposition 10. Let $(T_k)$ be a Toeplitz operator whose symbol is $\sum_{l \ge N} \hbar^l f_l$ with $f_N \neq 0$. We have
\[ \|T_k\| \sim k^{-N} \sup |f_N| . \]

Remark 6. We have the same estimation with the covariant symbol and the contravariant one since $\sigma(T_k) = O(\hbar^N)$ implies $\sigma_{cov}(T_k) = \sigma(T_k) + O(\hbar^{N+1})$ and $\sigma_{cont}(T_k) = \sigma(T_k) + O(\hbar^{N+1})$.

Proof. By using contravariant symbols, we can prove that
\[ T_k = k^{-N}\, \Pi_k M_{f_N} \Pi_k + k^{-N-1}\, \Pi_k M_{g(\cdot,k)} \Pi_k + R_k, \]
where $(R_k)$ is a smoothing operator and $(g(\cdot,k))$ is a sequence of $C^\infty(M)$ whose norm is uniformly $O(k^{-N-1})$. It follows that there exists $C$ such that $\|T_k\| \le k^{-N} \sup |f_N| + C k^{-N-1}$. Let $u$ be in $P$ such that $\sup |f_N| = |f_N(\pi(u))|$. Using that $\|T_k e_k^u\|^2 = (T_k^* T_k e_k^u, e_k^u)$ and (25), we obtain
\[ \frac{\|T_k e_k^u\|^2}{\|e_k^u\|^2} = k^{-2N} |f_N(\pi(u))|^2 + O(k^{-2N-1}). \]
We deduce from this that $\|T_k\| \ge k^{-N} \sup |f_N| + C' k^{-N-1}$.


Proposition 11. Let $(T_k)$ be a sequence such that for every $k$, $T_k$ is an operator of $C^\infty(M, L^k)$ and $\Pi_k T_k \Pi_k = T_k$. Then $(T_k)$ is a Toeplitz operator if and only if there exists a symbol $(f(\cdot,k))$ of $S^0(M \times M)$ such that
\[ T_k e_k^u(x) = f(x, \pi(u), k)\, e_k^u(x) + r_k^u(x), \tag{28} \]
where $(r_k^u)$ is a uniformly negligible sequence with respect to $u$. In this case, the covariant symbol of $(T_k)$ is $\sum_l \hbar^l f_l(x,x)$, where $f(\cdot,k) = \sum_l k^{-l} f_l + O(k^{-\infty})$.

This result can be compared with the characterisation of the $\hbar$-pseudodifferential operators from their action on the oscillatory functions $e^{\frac{i}{\hbar} x.\xi}$ (cf. [8]).

Proof. If $(T_k)$ is a Toeplitz operator, we can prove (28) by using (27) and the expression of the kernels of $(T_k)$ and $(\Pi_k)$. Conversely, if $s$ is a section of $\mathcal{H}_k$, we have
\[ s = \int_P (s, e_k^u)\, e_k^u\, \mu_P(u). \]
Consequently,
\[ T_k s = \int_P (s, e_k^u)\, T_k e_k^u\, \mu_P(u). \]
Using (28), we obtain that $T_k(x_l,x_r) = f(x_l,x_r,k)\, \Pi_k(x_l,x_r) + u^{-k} . r_k^u(x_l)$ with $\pi(u) = x_r$. Hence, $(T_k) \in \mathcal{F}$ and by assumption, $\Pi_k T_k \Pi_k = T_k$, that is $(T_k)$ is a Toeplitz operator.

Proposition 12. Let $(T_k)$ be a selfadjoint Toeplitz operator with symbol $\sum_l \hbar^l f_l$ and $g$ be a function of $C^\infty(\mathbb{R}, \mathbb{C})$. Then $(g(T_k))$ is a Toeplitz operator with principal symbol $g(f_0)$.

Proof. By Proposition 10, the spectrum of $T_k$ is a subset of $[-\sup |f_0| - 1, \sup |f_0| + 1]$ if $k$ is sufficiently large. By modifying $g$ outside this interval, we may assume that $g$ has compact support. So $g$ extends to a function $G$ of $C^\infty(\mathbb{C})$ with compact support and such that $\partial_{\bar z} G$ vanishes to order $\infty$ along $\mathbb{R} \subset \mathbb{C}$. Let $a, b \in \mathbb{R}$ be such that $\mathrm{Supp}\, g \subset (a,b)$. Introduce the loops $\gamma_\epsilon$:

[Figure: the loop $\gamma_\epsilon$ in $\mathbb{C}$ through the points $a$, $b$, $\frac{1}{2}(b+a) + i\epsilon$ and $\frac{1}{2}(b+a) - i\epsilon$.]

Since $\partial_{\bar z} G$ vanishes to order $\infty$ along $\mathbb{R}$,
\[ g(T_k) = \frac{1}{2i\pi} \lim_{\epsilon \to 0} \int_{\gamma_\epsilon} G(z) (z - T_k)^{-1}\, dz. \]
By applying Stokes theorem
\[ g(T_k) = \frac{1}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} G(z)\, (z - T_k)^{-1}\, |dz\, d\bar z|, \]
where the integral is well-defined since $\|(z - T_k)^{-1}\| = O(|y|^{-1})$ ($y$ is the imaginary part of $z$). For $y \neq 0$, we denote by $\sum_l \hbar^l h_l(z,x)$ the inverse of $z - \sum_l \hbar^l f_l$ in $(C^\infty(M)[[\hbar]], *)$. Using that the bidifferential operators $B_l$ associated to $*$ are of degree $l$ in each argument, we obtain that

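\[ h_l(z,x) = P_l(z,x)\, (z - f_0)^{-(l+1)}, \tag{29} \]
where the functions $P_l$ are polynomial in $z$ with coefficients in $C^\infty(M)$. So the functions $y^{-1} h_l(z,x)\, \partial_{\bar z} G(z)$ are $C^\infty$. By applying the Borel process, we construct a symbol $H(z, x_l, x_r, k)$ in $S^0(\mathbb{C} \times M \times M)$ with asymptotic expansion $\sum_l k^{-l} H_l(z, x_l, x_r)$ such that
\[ H_l(z,x,x) = y^{-1} h_l(z,x)\, \partial_{\bar z} G(z), \qquad \partial_{\bar z_l^i} H_l = O(|x_l - x_r|^{\infty}) \ \text{ and } \ \partial_{z_r^i} H_l = O(|x_l - x_r|^{\infty}), \]
where the estimations are uniform with respect to $z$. Introduce the operators $L_k^z$ of kernel $\big(\frac{k}{2\pi}\big)^{n} E^k H(z, x_l, x_r, k)$. We have
\[ L_k^z . (z - T_k) = y^{-1} \partial_{\bar z} G\, \Pi_k + S_k^z, \]
where $\|S_k^z\| = O(k^{-\infty})$ uniformly with respect to $z$. We deduce from this that
\[ \partial_{\bar z} G\, (z - T_k)^{-1} = y L_k^z - y S_k^z (z - T_k)^{-1}, \]
and then that $g(T_k)$ is a Toeplitz operator with symbol
\[ \sum_l \frac{\hbar^l}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} G(z)\, h_l(z,x)\, |dz\, d\bar z|. \tag{30} \]
The full calculus of the symbol $\sum_l \hbar^l G_l$ of $g(T_k)$ can be done in the following way: write the Taylor series $\sum_p \frac{1}{p!}\, g^{(p)}\big(f_0(x)\big)\, (y - f_0(x))^p$ of $g$ at $f_0(x)$. Then
\[ \sum_l \hbar^l G_l\Big|_x = \sum_p \frac{1}{p!}\, g^{(p)}\big(f_0(x)\big) \Big( \sum_l \hbar^l f_l(y) - f_0(x)\, 1_*(y) \Big)^{*p}\Big|_{y=x}, \]
where $1_*$ is the unit of $(C^\infty(M)[[\hbar]], *)$ and
\[ \Big( \sum_l \hbar^l f_l \Big)^{*0} = 1_*, \qquad \Big( \sum_l \hbar^l f_l \Big)^{*1} = \sum_l \hbar^l f_l, \qquad \Big( \sum_l \hbar^l f_l \Big)^{*2} = \sum_l \hbar^l f_l * \sum_l \hbar^l f_l, \]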


and so on. Indeed, the right-hand side is well-defined. Then assume that $g$ vanishes to order $q+1$ at $f_0(x)$; we deduce from (29) and (30) that $G_0(x) = \ldots = G_q(x) = 0$. So we may replace $g$ with its Taylor series. In particular, if $\sigma(T_k) = f_0 + \hbar f_1 + O(\hbar^2)$, then $\sigma(g(T_k))$ is equal to
\[ g(f_0) + \hbar \Big( g(f_0)\, \frac{r}{2} + g'(f_0) \big( f_1 - \frac{r}{2} f_0 \big) + \frac{1}{2}\, g''(f_0) \sum_{i,j} G^{i,j} (\partial_{z^i} f_0)(\partial_{\bar z^j} f_0) \Big) + O(\hbar^2). \]
The same formulas apply for the covariant symbol and contravariant symbol. Hence if $\sigma_{cov}(T_k) = f_0 + \hbar f_1 + O(\hbar^2)$, then
\[ \sigma_{cov}(g(T_k)) = g(f_0) + \hbar \Big( g'(f_0) f_1 + \frac{1}{2}\, g''(f_0) \sum_{i,j} G^{i,j} (\partial_{z^i} f_0)(\partial_{\bar z^j} f_0) \Big) + O(\hbar^2), \]
and if $\sigma_{cont}(T_k) = f_0 + \hbar f_1 + O(\hbar^2)$, then
\[ \sigma_{cont}(g(T_k)) = g(f_0) + \hbar \Big( g'(f_0) f_1 - \frac{1}{2}\, g''(f_0) \sum_{i,j} G^{i,j} (\partial_{z^i} f_0)(\partial_{\bar z^j} f_0) \Big) + O(\hbar^2). \]

References

1. Bargmann, V.: On a Hilbert space of analytic functions and an associated integral transform. Comm. Pure Appl. Math. 14, 187–214 (1961)
2. Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 40, 153–174 (1975)
3. Bordemann, M., Meinrenken, E., Schlichenmaier, M.: Toeplitz Quantization of Kähler Manifolds and gl(N), N → ∞ Limits. Commun. Math. Phys. 165, 281–296 (1994)
4. Boutet de Monvel, L., Guillemin, V.: The spectral theory of Toeplitz operators. Ann. Math. Stud. 99, Princeton NJ: Princeton University, 1981
5. Boutet de Monvel, L., Sjöstrand, J.: Sur la singularité des noyaux de Bergman et de Szegö. Astérisque 34(35), 123–164 (1976)
6. Cahen, M., Gutt, S., Rawnsley, J.: Quantization of Kähler manifolds I. J. Geom. Phys. 7, 45–62 (1990); II Trans. Amer. Math. Soc. 337, 73–98 (1993); III Lett. Math. Phys. 30, 291–305 (1994)
7. Charles, L.: Semi-classical aspects of Geometric quantization, PhD-Thesis, University Paris-Dauphine, 2000
8. Colin de Verdière, Y.: Méthodes semi-classiques et théorie spectrale. In preparation
9. Guillemin, V.: Star products on compact pre-quantizable symplectic manifolds. Lett. Math. Phys. 35, 85–89 (1995)
10. Hörmander, L.: The analysis of linear partial differential operators I,IV. Berlin-Heidelberg-New York: Springer Verlag, 1983
11. Kostant, B.: Quantization and unitary representation. Lect. Notes Math. 170, Berlin-Heidelberg-New York: Springer, 1970, pp. 87–208
12. Lu, Z.: On the lower order terms of the asymptotic expansion of Tian-Yau-Zelditch. Am. J. Math. 122, 235–273 (2000)
13. Moreno, C., Ortega-Navarro, P.: Deformations of the algebra of functions on Hermitian symmetric spaces resulting from quantization. Ann. Inst. H. Poincaré Sect. A 38, 215–241 (1983); ∗-products on $D^1(\mathbb{C})$, $S^2$ and related spectral analysis. Lett. Math. Phys. 7, 1983
14. Shubin, M.A.: Pseudodifferential operator and spectral theory, Berlin-Heidelberg-New York: Springer Verlag, 1987
15. Souriau, J.M.: Structure des systèmes dynamiques, Paris: Dunod, 1970
16. Tian, G.: On a set of polarized Kähler metrics on algebraic manifolds. J. Diff. Geom. 32, 99–130 (1990)
17. Zelditch, S.: Szegö Kernels and a theorem of Tian. Int. Math. Res. Notices 5, (1998)

Communicated by P. Sarnak

Commun. Math. Phys. 239, 29–51 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0841-5


Concentration Inequalities for Functions of Gibbs Fields with Application to Diffraction and Random Gibbs Measures

Christof Külske

Weierstrass-Institut für Angewandte Analysis und Stochastik, Mohrenstrasse 39, 10117 Berlin, Germany. E-mail: [email protected] Received: 10 June 2002 / Accepted: 5 February 2003 Published online: 5 May 2003 – © Springer-Verlag 2003

Abstract: We derive useful general concentration inequalities for functions of Gibbs fields in the uniqueness regime. We also consider expectations of random Gibbs measures that depend on an additional disorder field, and prove concentration w.r.t. the disorder field. Both fields are assumed to be in the uniqueness regime, allowing in particular for non-independent disorder fields. The modification of the bounds compared to the case of an independent field can be expressed in terms of constants that resemble the Dobrushin contraction coefficient, and are explicitly computable. On the basis of these inequalities, we obtain bounds on the deviation of a diffraction pattern created by random scatterers located on a general discrete point set in Euclidean space, restricted to a finite volume. Here we also allow for thermal dislocations of the scatterers around their equilibrium positions. Extending recent results for independent scatterers, we give a universal upper bound on the probability of a deviation of the random scattering measures applied to an observable from its mean. The bound is exponential in the number of scatterers with a rate that involves only the minimal distance between points in the point set. 1. Introduction Concentration inequalities for functions of random fields play an important role in various areas of probability theory, with numerous applications ranging from the more abstract to the explicit analysis of given models ([LT91, Ta96, Le01]). Such exponential inequalities give an upper bound on the probability of a deviation of a function from its mean; they are of interest when the function is defined in a complicated way and explicit computations for its fluctuations are not possible. They assume no spatial symmetry of the function, and so they apply also when there is no reason for a large deviation principle to hold. When the underlying field is a product field, such inequalities are very well

Work supported by the DFG


C. Külske

known, and are beautifully tied to the concentration of measure phenomenon [Ta96]. If the function in particular happens to have some (approximate) additivity and there is translation-invariance they provide a large deviation upper bound that is valid in finite volume (and not just asymptotically) with a lower bound on the rate function. So, they are both weaker and stronger than a full large deviation principle (that also incorporates a lower bound on the probabilities with the correct rate function). The aim of our present paper is twofold. First of all, motivated by the study of disordered systems, we derive general concentration inequalities for functions of Gibbs fields in the Dobrushin uniqueness regime that have not appeared before in this simple and useful form. Replacing “independence” by the “weak dependence” of a Gibbs measure in the Dobrushin uniqueness regime is a natural generalisation when we are dealing with a random field on a lattice, or on a graph having some spatial structure. The focus in our approach is on applicability of the estimates and not just existence. In particular we are interested not just in the mere finiteness of the constants appearing in the estimates but in explicit expressions that can be readily evaluated (or estimated) in given models. Secondly, in parallel to the general treatment, we show in this paper how these estimates can be applied to the analysis of the self-averaging properties of random diffraction measures of general point sets in Euclidean space ([BaaHoe00, Hof95a, Hof95b, D93, EnMi92]). These diffraction measures describe the intensity of the reflections of an incoming beam at the points of the set when looked at from far away (at infinity). They are given by the Fourier transform of the autocorrelation measure of the scatterers. Randomness appears here naturally as a probability distribution governing the thermal dislocations of the scatterers around their equilibrium positions. 
It is clear that these dislocations will interact and taking them to be i.i.d. would only be a very crude model. Additionally, we also consider a random distribution for the scattering amplitudes. We stress that the scattering patterns described by the random scattering measures are beautiful objects themselves that are of considerable interest. In this context we give a universal upper bound on the probability of a deviation of the random scattering measures from its mean, applied to an observable that models the measurement device. The bound depends on the point set only through the minimal distance between its points (Theorems 4,5). In particular the results also apply to diffuse scattering. This analysis extends the previous results for independent scatterers of [K01b]. Being motivated by the study of general disordered systems, the first and basic question is for a useful concentration estimate of a function of a Gibbs field in the uniqueness regime, where no assumptions are made about translational invariance (Theorem 1). In the next more interesting step we will be interested also in expectations of functions w.r.t. Gibbs measures, when the latter are themselves functions of another random field modelling the disorder (Theorems 2,3). This setup corresponds physically to a system that is quenched from an equilibrium state at some sufficiently high (but finite) temperature. Then the quenched degrees of freedom are described by the Gibbs field modelling the disorder. This is more realistic than the assumption of independence for the quenched degrees of freedom which is usually made for simplicity in the classical models of disordered systems (like the random field Ising model or the Edwards Anderson spin glass). We emphasize that we are able to treat also this dependent situation, again assuming no symmetries at all. 
We believe that these inequalities can be useful tools in a variety of circumstances to extend results for disordered systems from independent disorder to dependent disorder. The assumption we chose to impose on the random distributions is essentially the Dobrushin uniqueness condition going back to [Do68]. (For an excellent presentation

Concentration Inequalities


see [Geo88] Chapter 8. More precisely we assume even a slightly stronger form of it, but the difference is minor from the point of view of applications. For general background material about Gibbsian theory see [Geo88, EFS93].) Recall the idea of Dobrushin uniqueness: Assume that the total interaction of a spin at any given site with the other spins is sufficiently small, meaning that the “Dobrushin contraction coefficient” is sufficiently small. Then there is a unique Gibbs measure (“absence of phase transition”) and this measure has fast decay of correlations. Now, it turns out that the constants that appear in our estimates can in all cases be expressed by the original Dobrushin contraction coefficient, and constants measuring the dependence of one random field from the other one that are defined in the same spirit. We stress that all these quantities can be estimated in terms of the Hamiltonian defining the interaction of the random field (the potential of the Gibbsian specification) in a very simple way. Coming back to our main example of diffraction measures we will need to estimate the concentration properties of a function that is not convex. Unfortunately non-convex functions are appearing in a lot of applications, and so very often all elegant methods based on convexity are simply not applicable. Let us mention in this context also the very beautiful result of [SZ92] who proved that the Dobrushin-Shlosman Mixing Condition [DS84] implies a Logarithmic Sobolev inequality and vice versa, at least for certain state spaces. (The Dobrushin-Shlosman condition is less restrictive than the Dobrushin condition we are working with. A simple new proof of the first implication was recently given in [Ce01]). In principle one can obtain exponential concentration as a corollary to a log-Sobolev inequality (see [Le01] Theorem 5.3). 
Here the problem would be that there are no handy formulas for the constant appearing in the log-Sobolev inequality, so that the resulting concentration estimates would not be explicit either. Also, for the purpose of the concentration results we are interested in, log-Sobolev inequalities are a detour, assuming an additional structure (gradient) that is not needed for the present problem.

We conclude this introduction with an outline of the rest of the paper. Section 2 is an extended introduction containing an overview of the main results, including the general concentration theorems and a first application to random scattering measures. In Sect. 3 we give more results for random scattering measures along with their proofs. They follow in an elementary but slightly tricky way from the general concentration estimates. In Sect. 4 we describe applications to disordered spin systems and provide details about the estimation of constants. Section 5 contains a simple proof of the basic concentration estimate of Theorem 1, where in particular the form of the constants appearing becomes clear. It follows from consistent use of estimates in the Dobrushin uniqueness region on the basis of the classical martingale method. Section 6 contains a proof of the concentration estimates for expectations w.r.t. random Gibbs measures of Theorems 2 and 3. These proofs use the explicit knowledge of the variation of the Gibbs measure in the Dobrushin uniqueness regime when the local specification is perturbed, in combination with a chain rule argument for variations.

2. Main Results

2.1. Basic concentration estimate in the Dobrushin uniqueness regime. Suppose that $\Gamma$ is a countably infinite or finite set and $E$ is a standard Borel space. In our applications below $E$ will be a finite set or a ball in a finite-dimensional Euclidean space. Suppose we are given a random field $X = (X_x)_{x\in\Gamma}$ taking values in $E^{\Gamma}$, with distribution $\mu$. Usually the distribution $\mu$ will be explicitly given as a Gibbs measure in terms


of the exponential of the negative Hamiltonian defining the model, which associates an energy to a configuration of $X_x$'s. More precisely this Hamiltonian is in turn given by an interaction potential which is the proper basic object. The measure $\mu$ will then describe physically the “equilibrium distribution” corresponding to this interaction. However, we don’t need to make these quantities explicit at this point. Following standard notation, we denote by

$$C = \bigl(C_{x,y}\bigr)_{x,y\in\Gamma} \quad\text{with}\quad C_{x,y} := \sup_{\xi,\xi'\in E^{\Gamma};\ \xi_{y^c}=\xi'_{y^c}} \bigl\| \mu(\,\cdot\,|\,\xi_{x^c}) - \mu(\,\cdot\,|\,\xi'_{x^c}) \bigr\|_x \tag{2.1}$$

the Dobrushin interdependence matrix. Here the r.h.s. of (2.1) denotes the variational distance at the site $x$. Given two measures $\rho$ and $\rho'$ on $E$ it is defined by $\|\rho(\,\cdot\,) - \rho'(\,\cdot\,)\|_x = \max_f \bigl|\int\rho(d\xi_x) f(\xi_x) - \int\rho'(d\xi_x) f(\xi_x)\bigr| / \delta(f)$. The maximum is over nonconstant functions $f$ on $E$. Here and throughout the paper $\delta(f) := \sup_{u,u'} |f(u) - f(u')|$ denotes the total variation of a function $f$, where $u, u'$ are taken over the range of definition of this function. If $f$ is vector valued, $|\cdot|$ denotes the Euclidean norm. We write $y^c \equiv \Gamma\setminus y$ for the complement of the site $y$. One says that the random field $X$ (respectively its distribution $\mu$) satisfies the Dobrushin uniqueness condition iff

$$c^X := \sup_{x\in\Gamma} \sum_{y\in\Gamma} C_{x,y} < 1. \tag{2.2}$$

The “Dobrushin contraction coefficient” $c^X$ is a well-known quantity which estimates the possible change of the single site conditional expectations (appearing on the r.h.s. of (2.1)) when the field values at the other sites are varied. The Dobrushin uniqueness condition (2.2) is perhaps the best-known weak-dependence condition in the theory of Gibbs measures. We need to introduce a new notion. Let us say that the random field $X$ (resp. $\mu$) satisfies the transposed Dobrushin uniqueness condition iff

$$c^X_t := \sup_{y\in\Gamma} \sum_{x\in\Gamma} C_{x,y} < 1. \tag{2.3}$$

Obviously $c^X$ and $c^X_t$ vanish if the $X_x$'s are independent. Then we have the following general concentration estimate.

Theorem 1. Suppose the random field $X = (X_x)_{x\in\Gamma}$ taking values in $E^{\Gamma}$ is distributed according to a Gibbs measure $\mu$ that obeys the Dobrushin uniqueness condition with Dobrushin constant $c^X$, and also the transposed Dobrushin uniqueness condition with constant $c^X_t$. Suppose that $F$ is a real function on $E^{\Gamma}$ with $\mu\bigl(\exp(tF(X))\bigr) < \infty$ for all real $t$. Then we have the Gaussian concentration estimate

$$\mu\Bigl(F(X) - \mu\bigl(F(X)\bigr) \ge r\Bigr) \le \exp\left(-\frac{r^2}{2}\,\frac{(1-c^X)(1-c^X_t)}{\|\underline{\delta}(F)\|^2_{l^2}}\right) \quad \forall r \ge 0. \tag{2.4}$$

Here $\underline{\delta}(F) \equiv \bigl(\delta_x(F)\bigr)_{x\in\Gamma}$ is the (infinite) variation vector of $F$, where $\delta_x(F) = \sup_{\xi,\xi';\ \xi_{x^c}=\xi'_{x^c}} |F(\xi) - F(\xi')|$ denotes the variation of $F$ at the site $x$. Its $l^2$-norm is denoted by $\|\underline{\delta}(F)\|^2_{l^2} \equiv \sum_{x\in\Gamma} (\delta_x(F))^2$. If this norm is infinite, the statement is empty (and thus correct).
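For a finite index set the right-hand side of (2.4) is a one-line computation once the interdependence matrix and the variation vector are known. The following is only a minimal sketch under that finiteness assumption; the function names `dobrushin_constants` and `gaussian_bound` and the example matrix are illustrative and do not come from the paper.

```python
import math

def dobrushin_constants(C):
    """Return (c, c_t): the largest row sum (2.2) and largest column sum (2.3) of C."""
    c = max(sum(row) for row in C)
    c_t = max(sum(col) for col in zip(*C))
    return c, c_t

def gaussian_bound(r, C, delta):
    """Right-hand side of (2.4) for a finite index set.

    C     : square list of lists, the interdependence matrix (C_{x,y})
    delta : list of site variations delta_x(F)
    """
    c, c_t = dobrushin_constants(C)
    if not (c < 1 and c_t < 1):
        raise ValueError("Dobrushin condition (2.2) or its transpose (2.3) fails")
    norm_sq = sum(d * d for d in delta)  # squared l^2-norm of the variation vector
    if norm_sq == 0:
        # F is constant, so only a deviation of size r = 0 is possible
        return 1.0 if r == 0 else 0.0
    return math.exp(-r * r * (1 - c) * (1 - c_t) / (2 * norm_sq))

# Hypothetical three-site chain with nearest-neighbour dependence
C_example = [[0.0, 0.3, 0.0],
             [0.2, 0.0, 0.2],
             [0.0, 0.3, 0.0]]
c_example, ct_example = dobrushin_constants(C_example)
bound_example = gaussian_bound(3.0, C_example, [1.0, 1.0, 1.0])
```

In this example the row sums give $c^X = 0.4$ and the column sums give $c^X_t = 0.6$, so the two constants can differ even for a small system.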


Remark. Ready-to-use upper bounds on the Dobrushin constant $c^X$ are known when the conditional expectations are given in terms of a Gibbsian specification with a defining interaction potential (see [Geo88], Chapter 8.1).$^1$ Let us mention the following general classic bound on $c^X$ that takes care of all high-temperature situations. We point out here that it gives the same estimate on the constant $c^X_t$ as on $c^X$. So, suppose that $\mu(d\xi_x\,|\,X_{x^c} = \xi_{x^c}) = \exp\bigl(-\sum_{A\ni x}\Phi_A(\xi_x\xi_{\Gamma\setminus x})\bigr)\,\lambda(d\xi_x)/Z_x(\xi_{\Gamma\setminus x})$ for a Gibbsian potential $\Phi = (\Phi_A)_{A\subset\Gamma}$ (meaning that $\Phi_A$ is a function on $E^{\Gamma}$ that depends only on $E^{A}$). Here $\lambda$ is a $\sigma$-finite measure on $E$, which must be the same for all sites $x\in\Gamma$, and $Z_x(\xi_{\Gamma\setminus x})$ is the usual normalization factor. Then we have that

$$c^X,\ c^X_t \le \frac{1}{2}\sup_{x\in\Gamma}\sum_{A\ni x}(|A|-1)\,\delta(\Phi_A) \tag{2.5}$$

which is independent of the single-site part $\delta(\Phi_x)$. This is stated as Proposition 8.8 in [Geo88] as a bound for $c^X$; for a brief explanation why it implies the bound for $c^X_t$ too, see Sect. 4. Be aware however that interdependence constants $C_{xy}$ and $C_{yx}$ whose actual values differ significantly could occur for models with very different $\Phi_x$ for different sites $x\in\Gamma$.

Remark. Often the theorem will be used in the following situation. Suppose that $F = F(X_\Lambda)$ is a function that depends only on variables in a finite set $\Lambda\subset\Gamma$. Then $\|\underline{\delta}(F)\|^2_{l^2} \le |\Lambda|\,\|\underline{\delta}(F)\|^2_{l^\infty}$. The reader who would like to see an interesting application of this is advised to go directly to Sect. 2.3, “First application to random diffraction measures”.
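As an illustration of how the bound (2.5) is evaluated in a concrete model, the sketch below applies it to a nearest-neighbour Ising pair interaction $\Phi_{\{x,y\}}(\sigma) = -\beta J\sigma_x\sigma_y$ on $\mathbb{Z}^d$. This model and the helper names are assumptions chosen for illustration and are not taken from the text. Each pair term has oscillation $2\beta J$ and $|A|-1 = 1$, so the bound equals $2d\beta J$.

```python
def high_temperature_bound(potentials_at_site):
    """Evaluate (1/2) * sum over A containing x of (|A| - 1) * delta(Phi_A) for one site x.

    potentials_at_site: iterable of pairs (size of A, oscillation delta(Phi_A)),
    one entry per set A that contains the site x.
    """
    return 0.5 * sum((size - 1) * osc for size, osc in potentials_at_site)

def ising_bound(d, beta_J, field_osc=0.0):
    """Bound (2.5) for a nearest-neighbour Ising model on Z^d (illustrative model).

    By translation invariance the supremum over sites is attained at every site.
    """
    pairs = [(2, 2.0 * beta_J)] * (2 * d)  # 2d neighbours, oscillation 2*beta*J each
    single_site = [(1, field_osc)]         # single-site term drops out: |A| - 1 = 0
    return high_temperature_bound(pairs + single_site)
```

With this estimate the Dobrushin condition holds as soon as $2d\beta J < 1$, and an arbitrarily strong single-site field does not change the bound.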

$^1$ Prescribing a consistent set of finite volume conditional probabilities in terms of an interaction potential is of course the standard way of producing a Gibbs measure. Recall the following well-known facts about Dobrushin uniqueness. If $\mu$ is an infinite-volume measure for which the Dobrushin uniqueness condition (2.2) holds, it is necessarily the unique Gibbs measure for the local specification defined by the system of its conditional expectations. This can be proved by a contraction method where the Dobrushin constant $c$ appears as a contraction coefficient (see e.g. Theorem 8.7 of [Geo88]). Existence must be proved separately but is of course guaranteed e.g. by a compact state space $E$.

2.2. Chain rule concentration estimates for disordered systems with dependent disorder. The concentration inequalities we are going to present now apply to situations where a random field $Y$ is given whose distribution depends on the realizations of another “external” random field $X$. This is precisely the case in the study of disordered systems. Here $X$ models the quenched randomness (which we sometimes will call external randomness) and one is given the Gibbs distribution of $Y$ for any fixed configuration of $X$. We assume here that both fields are in the Dobrushin uniqueness regime in a natural sense, and that the dependence of $Y$ on $X$ is not completely unreasonable. To control these properties quantitatively we will have to introduce constants (in the spirit of the Dobrushin constant) governing the deviation of the fields $X$ (respectively $Y$) from the case of product distributions, and constants governing the degree of influence of $X$ on $Y$. Very often in disordered systems the distribution of the external random field $X$ will even be assumed to be a product distribution, but we don’t need this for our estimates. We emphasize that we are able to treat the more general case of Dobrushin uniqueness for $X$. The resulting concentration estimates will depend only on these constants, and thus contain only minimal information about the distribution of $(X, Y)$. We stress that while the definition of the constants might look a little frightening at first sight, they are very easy to control, so the estimates are very explicit. (This is done e.g. by (2.5) and an analogous consideration given below in Sect. 4.) We call them “chain rule estimates” because the distribution of the field $Y$ is a (possibly very complicated) function of the field $X$, so that in order to control expectations of functions of both fields some “chain rule for variations” will be needed.

Let us now formulate our results in a precise manner. Suppose that $\Gamma_X$ and $\Gamma_Y$ are countable (finite or infinite) sets, and $E_X$ and $E_Y$ are standard Borel spaces. Suppose that we are given two random fields $X = (X_x)_{x\in\Gamma_X}$ taking values in $E_X^{\Gamma_X}$ and $Y = (Y_x)_{x\in\Gamma_Y}$ taking values in $E_Y^{\Gamma_Y}$. Suppose that their joint distribution $\mu$ satisfies the following conditions.

(i) The marginal of $\mu$ on the variable $X$, denoted by $\mu_X$, is a Gibbs measure that obeys the Dobrushin uniqueness condition (2.2) and the transposed condition (2.3). We denote the corresponding “marginal Dobrushin constant” by $c^X$ and its transposed version by $c^X_t$.

(ii) For any realization $\eta$ of $X$ the conditional distribution of $Y$ given $X$, denoted by $\mu(\,\cdot\,|\,X = \eta)$, is a Gibbs measure that obeys Dobrushin uniqueness and its transposed version. Moreover we demand uniformity in $\eta$ in the sense that the following uniform Dobrushin constant $c^{Y,\infty}$ and its transposed version $c^{Y,\infty}_t$ obey

$$c^{Y,\infty} := \sup_{x\in\Gamma_Y}\sum_{y\in\Gamma_Y}\sup_{\eta} C^{Y}_{x,y}(\eta) < 1, \qquad c^{Y,\infty}_t := \sup_{y\in\Gamma_Y}\sum_{x\in\Gamma_Y}\sup_{\eta} C^{Y}_{x,y}(\eta) < 1. \tag{2.6}$$

Here $C^{Y}_{x,y}(\eta)$ denotes the Dobrushin matrix for the fixed configuration $\eta$.

(iii) To control the dependence of the field $Y$ on the field $X$ let us introduce their dependence matrix in the following way:

$$C^{Y\leftarrow X}_{z,u} := \sup_{\eta,\eta';\ \eta_{u^c}=\eta'_{u^c};\ \omega_{z^c}} \bigl\| \mu(\,\cdot\,|\,X=\eta,\ Y_{z^c}=\omega_{z^c}) - \mu(\,\cdot\,|\,X=\eta',\ Y_{z^c}=\omega_{z^c}) \bigr\|_z . \tag{2.7}$$

It describes the possible change of the fixed $Y$-single-site conditional distribution at $z$ w.r.t. variation of the $X$-variables at $u$. The supremum is taken over the respective spaces, i.e. $\eta,\eta'\in E_X^{\Gamma_X}$ and $\omega\in E_Y^{\Gamma_Y}$. We demand that the following dependence constant and its transposed version obey

$$c^{Y\leftarrow X} := \sup_{z\in\Gamma_Y}\sum_{u\in\Gamma_X} C^{Y\leftarrow X}_{z,u} < \infty, \qquad c^{Y\leftarrow X}_t := \sup_{u\in\Gamma_X}\sum_{z\in\Gamma_Y} C^{Y\leftarrow X}_{z,u} < \infty. \tag{2.8}$$

For independent $X$ and $Y$ these constants vanish, obviously.

We need a little more notation. Let us write $\delta^X_x(G) := \sup_{\eta,\eta';\ \eta_{x^c}=\eta'_{x^c};\ \omega} |G(\eta,\omega) - G(\eta',\omega)|$ for the $X$-variation at the site $x\in\Gamma_X$ for a function $G$ on the product space. The notation for $\delta^Y_x(G)$ is analogous. Note that the corresponding partial infinite variation vectors $\underline{\delta}^X(G) \equiv \bigl(\delta^X_x(G)\bigr)_{x\in\Gamma_X}$ and $\underline{\delta}^Y(G)$ are not in the same space anymore, in general, because the index sets $\Gamma_X$ and $\Gamma_Y$ are different. Then the first result concerns the concentration properties of $Y$-averages w.r.t. the field $X$.


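Theorem 2. Suppose that $X$ and $Y$ are random fields with joint distribution $\mu$ satisfying (i), (ii), (iii). Suppose that $G$ is a real function on $E_X^{\Gamma_X}\times E_Y^{\Gamma_Y}$ with $\mu\bigl(\exp(tG(X,Y))\bigr) < \infty$ for all real $t$. Then we have the Gaussian concentration estimate

$$\mu_X\Bigl(\mu\bigl(G(X,Y)\,\big|\,X\bigr) - \mu\bigl(G(X,Y)\bigr) \ge r\Bigr) \le \exp\left(-\frac{r^2}{2}\,\frac{(1-c^X)(1-c^X_t)}{\bigl(\|\underline{\delta}^X(G)\|_{l^2} + c^{Y,\mathrm{eff}}\,\|\underline{\delta}^Y(G)\|_{l^2}\bigr)^2}\right) \quad \forall r\ge 0 \tag{2.9}$$

with the “effective constant”

$$c^{Y,\mathrm{eff}} = \left(\frac{c^{Y\leftarrow X}\,c^{Y\leftarrow X}_t}{(1-c^{Y,\infty})(1-c^{Y,\infty}_t)}\right)^{\frac{1}{2}} < \infty.$$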

Remark. We can view $c^{Y,\mathrm{eff}}$ as the “effective strength” of the influence the random field $X$ has on the field $Y$. The form of the constants will become clear in the proof that combines an application of Theorem 1 for the $X$-marginal with a chain rule for variations.

Remark. The reader should realize that the dependence constants (and thus $c^{Y,\mathrm{eff}}$) are as easily estimated as the Dobrushin constants if the single-site conditional distribution of $Y_x$ is given in a Gibbsian form with a random energy function. This is analogous to the estimate for the Dobrushin constants in (2.5) and is explained in more detail in Proposition 2 of Sect. 4.

Almost automatically we then also have the following “total concentration result”.

Theorem 3. Under the hypothesis of Theorem 2 we have the “total” concentration estimate

$$\mu\Bigl(G(X,Y) - \mu\bigl(G(X,Y)\bigr) \ge r\Bigr) \le \exp\left(-\frac{r^2}{2}\left[\left(\frac{(1-c^X)(1-c^X_t)}{\bigl(\|\underline{\delta}^X(G)\|_{l^2} + c^{Y,\mathrm{eff}}\,\|\underline{\delta}^Y(G)\|_{l^2}\bigr)^2}\right)^{-1} + \left(\frac{(1-c^{Y,\infty})(1-c^{Y,\infty}_t)}{\|\underline{\delta}^Y(G)\|^2_{l^2}}\right)^{-1}\right]^{-1}\right). \tag{2.10}$$
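To make the structure of the bounds (2.9) and (2.10) concrete, here is a small sketch that evaluates the effective constant and both right-hand sides from given numerical constants. The function and argument names are hypothetical; the inputs `dX` and `dY` stand for the $l^2$-norms of the $X$- and $Y$-variation vectors of $G$, and the sketch assumes that the resulting variance-like quantity is strictly positive.

```python
import math

def effective_constant(c_yx, c_yx_t, c_y, c_y_t):
    """c^{Y,eff} from the dependence constants (2.8) and the uniform constants (2.6)."""
    if not (c_y < 1 and c_y_t < 1):
        raise ValueError("uniform Dobrushin condition (2.6) fails")
    return math.sqrt(c_yx * c_yx_t / ((1 - c_y) * (1 - c_y_t)))

def averaged_bound(r, dX, dY, c_x, c_x_t, c_eff):
    """Right-hand side of (2.9), written as exp(-r^2 / (2 * variance-like term))."""
    var_x = (dX + c_eff * dY) ** 2 / ((1 - c_x) * (1 - c_x_t))
    return math.exp(-r * r / (2 * var_x))

def total_bound(r, dX, dY, c_x, c_x_t, c_y, c_y_t, c_eff):
    """Right-hand side of (2.10): the two variance-like terms are added."""
    var_x = (dX + c_eff * dY) ** 2 / ((1 - c_x) * (1 - c_x_t))
    var_y = dY ** 2 / ((1 - c_y) * (1 - c_y_t))
    return math.exp(-r * r / (2 * (var_x + var_y)))
```

Writing the exponent as $-r^2/(2V)$ with $V$ the sum of the inverted fractions is algebraically the same as the nested inverses in (2.10). Since $V$ only grows when the second term is added, the total bound is never smaller than the averaged one.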

Remark. The form is easy to understand. The term within the inverse of the outer square brackets has the character of a squared variance. It is the sum of the term for the $Y$-average from Theorem 2 and a uniform version of the term for the conditional $Y$-distribution from Theorem 1.

2.3. First application to random diffraction measures. It is our aim now to look at the self-averaging properties of the diffraction pattern created by random scatterers (“atoms”) located on a general discrete point set which is a subset of Euclidean space. The function $F$ whose concentration properties we will be interested in describes the result of a measurement at the random diffraction pattern. We stress that this function is


not a convex function, so all methods based on convexity simply cannot be applied. To appreciate the charm of this topic the interested reader may take a look at some of the beautiful experimental diffraction patterns of quasicrystals (this is how quasicrystals were discovered in 1982). Here is the problem. Let us describe at first how this function is defined.$^2$ Consider the scattering image of the complex random measure (“random Dirac comb”) given by

$$\rho_{\Gamma}(\eta,\omega) = \sum_{x\in\Gamma}\eta_x\,\delta_{x+\omega_x}, \tag{2.11}$$

where $\delta_x$ denotes the Dirac-measure at the site $x$. The point set $\Gamma\subset\mathbb{R}^{\nu}$ is assumed to be countable. The $\eta_x$'s are complex numbers modelling scattering amplitudes. The $\omega_x$'s (“dislocations”) are vectors in the underlying Euclidean space $\mathbb{R}^{\nu}$. Below they will be made random according to a random field $X = (X_x)_{x\in\Gamma}$ taking values $\eta = (\eta_x)_{x\in\Gamma}$ and a random field $Y = (Y_x)_{x\in\Gamma}$ taking values $\omega$. So, the point set $\Gamma$ modelling the locations of the scatterers in Euclidean space has a geometric meaning here, but it also serves just as an index set for the random fields. The classes of distributions we allow for them will be described later. Fix any finite volume $\Lambda\subset\Gamma$. Then, the object that contains all information about the scattering image of the points in $\Lambda$ is the finite volume scattering measure which by definition is the Fourier-transform of the corresponding finite volume autocorrelation measure. The latter is defined as follows:

$$\gamma^{\eta,\omega}_{\Lambda} := \frac{1}{|\Lambda|}\sum_{x,x'\in\Lambda}\eta_x\,\eta^{*}_{x'}\,\delta_{x-x'+\omega_x-\omega_{x'}}. \tag{2.12}$$

Here the star denotes complex conjugate. Since we allow $\Lambda$ to be any finite set, we have chosen the natural normalization by the number of points, as in [K01b]. A measurement on the scattered intensity is described by an observable $k\mapsto\varphi(k)$ in Fourier-space, modelling the measurement device, which is usually taken as a Schwartz test-function. The corresponding result of the measurement is then given by $\hat\gamma^{\eta,\omega}_{\Lambda}(\varphi) \equiv \int\hat\gamma^{\eta,\omega}_{\Lambda}(k)\,\varphi(k)\,dk$. Here the Fourier-transform of a tempered distribution $\gamma$ is defined by duality, $\hat\gamma(\varphi) = \gamma(\hat\varphi)$, where $\hat\varphi$ denotes the Fourier-integral of the Schwartz-function $\varphi$ over $\mathbb{R}^{\nu}$. So, the function we are interested in is given by

$$(\eta,\omega) \mapsto \hat\gamma^{\eta,\omega}_{\Lambda}(\varphi) = \frac{1}{|\Lambda|}\sum_{x,x'\in\Lambda}\eta_x\,\eta^{*}_{x'}\,\hat\varphi(x-x'+\omega_x-\omega_{x'}). \tag{2.13}$$

We assume that the function $\varphi(k)$ is real and view it as a fixed parameter, so that (2.13) is a real function$^3$ on the random fields modelling the dislocations and random amplitudes.

$^2$ For a summary of the basic notions of mathematical scattering theory for point scatterers, see e.g. Sect. II of [BaaHoe00] and Appendix A of [K01b]. The reason for the definitions of the diffraction measures can be understood in an elementary way by superposition of the reflections of an incoming beam at the individual scatterers. The results are physically meaningful when one takes measurements at distances far away from the scatterers and there is only single-scattering.

$^3$ Write $\hat\gamma^{\eta,\omega}_{\Lambda}(k) = \frac{1}{|\Lambda|}\bigl|\sum_{x\in\Lambda}\eta_x\,e^{ik\cdot(x+\omega_x)}\bigr|^2$ for the Lebesgue density of the finite volume scattering measure. So, for real test functions $\varphi(k)$ the function (2.13) is always real, and it is nonnegative if $\varphi\ge 0$. Of course it is not a convex function of $\omega$ but of oscillatory nature! It is convex as a function of $\eta$ though.

We can now take averages of this function describing the random scattering image, for instance w.r.t. the distribution of the dislocations $\omega$ to obtain an $\omega$-averaged scattering image. This can of course also be done w.r.t. the scattering amplitudes $\eta$, or w.r.t.


both random fields $\eta$ and $\omega$. The study of the large-$\Lambda$ behavior of the average is then one part of the story that is essentially reduced to understanding the diffraction pattern of $\Gamma$ without disorder. The other part of the story which we are going to discuss now is the control of the self-averaging properties of the diffraction image. Concentration estimates were looked at for the first time in [K01b], for the cases of independent $\omega_x$'s and fixed $\eta_x$'s, and vice versa. Before that there were only few partial results of the SLLN type, which can be found in the quasicrystal literature for special sets $\Gamma$, see however [Hof95a]. (This is because of the different inclinations of the probabilistic, statistical mechanics and diffraction communities, which we are hoping to bring together at this point.) The emphasis in this study is to understand the influence of the point set $\Gamma$ and the function $\hat\varphi$ on the quality of the concentration estimate. Since scattering experiments are a tool to guess the structure of $\Gamma$, one is interested in estimates that depend on very little a priori information about $\Gamma$. It turned out in [K01b] that for the independent case we could obtain large deviation upper bounds that involve only the minimal distance between points in $\Gamma$ and hence do not depend on the structure of the set $\Gamma$ at all. This means in particular that the quality of the large deviation estimate is independent of the nature of the limiting diffraction image when $\Lambda$ tends to infinity, be it pure point or diffuse. The dependence on the observable $\varphi$ is expressed then in terms of a suitable Sobolev-norm. The proof given in [K01b] for the independent case used a cluster expansion for the logarithmic moment generating function of (2.13). At the price of some technical work, it has the advantage of providing also a central limit theorem (for “non-pathological” $\Gamma$, in particular lattices) and shows that the bounds appearing are essentially optimal.
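The observable (2.13) is easy to evaluate numerically for a small configuration, which may help in getting a feeling for its oscillatory dependence on the dislocations. The sketch below works in dimension $\nu = 1$ and assumes, purely for illustration, that $\varphi$ is the centred Gaussian density with standard deviation `sigma` and that the Fourier convention is $\hat\varphi(x) = \int\varphi(k)e^{ikx}dk$, so that $\hat\varphi(x) = e^{-\sigma^2x^2/2}$. The function names are hypothetical.

```python
import math

def phi_hat(x, sigma=1.0):
    """Fourier integral of the N(0, sigma^2) density (assumed convention e^{ikx})."""
    return math.exp(-0.5 * (sigma * x) ** 2)

def scattering_observable(points, eta, omega, sigma=1.0):
    """Evaluate (2.13) in one dimension.

    points : positions x of the finite volume
    eta    : scattering amplitudes eta_x (real or complex numbers)
    omega  : dislocations omega_x (real numbers)
    """
    n = len(points)
    total = 0j
    for x, ex, wx in zip(points, eta, omega):
        for y, ey, wy in zip(points, eta, omega):
            # term eta_x * conj(eta_x') * phi_hat(x - x' + omega_x - omega_x')
            total += ex * ey.conjugate() * phi_hat(x - y + wx - wy, sigma)
    return total / n

# A single scatterer: the dislocation drops out and the value is |eta|^2 * phi_hat(0)
single = scattering_observable([0.0], [0.5 + 0.5j], [0.3])
# Four unit scatterers on a lattice segment without dislocations
lattice = scattering_observable([0.0, 1.0, 2.0, 3.0], [1, 1, 1, 1], [0.0] * 4)
```

Because this $\hat\varphi$ is even and real, the imaginary parts of the terms for $(x,x')$ and $(x',x)$ cancel, in line with the remark that (2.13) is real for real $\varphi$.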
On the basis of the general results in Theorems 1, 2, 3 we can now extend the concentration result in a rather easy and elegant way to the case of dependent fields that obey Dobrushin uniqueness. Let us give here only the result that corresponds to Theorem 1, and provide more discussion later.

Theorem 4. Assume that $X = (X_x)_{x\in\Gamma}$ is a field of complex random variables (“scatterers”) indexed by the point set $\Gamma\subset\mathbb{R}^{\nu}$, and that $Y = (Y_x)_{x\in\Gamma}$ is a random field of $\mathbb{R}^{\nu}$-valued random variables (“thermal dislocations”). Assume that the field of the joint variables $Z = (X,Y) = (X_x, Y_x)_{x\in\Gamma}$ is distributed according to a Gibbs measure $\mu$ that obeys the Dobrushin uniqueness condition (2.2) with a Dobrushin constant $c$. Assume also the transposed Dobrushin uniqueness condition (2.3) with constant $c_t$. Let $\Lambda\subset\Gamma$ be any finite set. Assume that the random point set $\{x+\omega_x,\ x\in\Lambda\}$ has minimal distance $b > 0$, for $\mu$-a.e. realization $\omega$ of the dislocations. Moreover we assume the following $\mu$-a.s. uniform bounds on the single-site distributions

$$|X_x| \le 1, \qquad \delta(X_x) \le \varepsilon_{sc}, \qquad \delta(Y_x) \le \varepsilon_{dl} \tag{2.14}$$

for all $x\in\Lambda$.$^4$ Then the corresponding random scattering image $\hat\gamma^{X,Y}_{\Lambda}(\varphi)$ in the finite volume $\Lambda$ obeys the universal large deviation estimate

$$\mu\Bigl(\bigl|\hat\gamma^{X,Y}_{\Lambda}(\varphi) - \mu\bigl(\hat\gamma^{X,Y}_{\Lambda}(\varphi)\bigr)\bigr| \ge r\Bigr) \le 2\exp\left(-\frac{|\Lambda|\,r^2}{8}\,\frac{(1-c)(1-c_t)}{\bigl(\varepsilon_{sc}\,\|\hat\varphi\|_{\nu,b} + \varepsilon_{dl}\,\|d\hat\varphi\|_{\nu,b}\bigr)^2}\right) \quad \forall r > 0. \tag{2.15}$$

$^4$ So $\varepsilon_{dl}$ bounds the diameter of the supports of the distribution of the dislocation variables $Y_x$ taken in the Euclidean norm for all sites $x$.


Here we have introduced the Sobolev-norm involving integrals of derivatives up to the order of the dimension $\nu$, where we make explicit also a scaling factor $b/2$. For a function $g:\mathbb{R}^{\nu}\to\mathbb{C}$ the norm is given by

$$\|g\|_{\nu,b} := \frac{1}{|B_1|}\sum_{k=0}^{\nu}\frac{1}{k!}\,\frac{1}{(b/2)^{\nu-k}}\int_{\mathbb{R}^{\nu}}\|d^k g(y)\|\,dy. \tag{2.16}$$
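In dimension $\nu = 1$ the norm (2.16) reduces to $\frac12\bigl[\frac{2}{b}\int|g| + \int|g'|\bigr]$, since $|B_1| = 2$. The sketch below computes it by a plain trapezoidal rule and inserts the result into the right-hand side of (2.15). It assumes a Gaussian $\hat\varphi(y) = e^{-y^2/2}$ and hypothetical numerical values for all constants; the helper names are illustrative only.

```python
import math

def integrate_abs(f, lo=-12.0, hi=12.0, n=48000):
    """Trapezoidal approximation of the integral of |f| over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (abs(f(lo)) + abs(f(hi)))
    for i in range(1, n):
        total += abs(f(lo + i * h))
    return total * h

def sobolev_norm_1d(f, df, b):
    """Norm (2.16) for nu = 1: |B_1| = 2, with the terms k = 0 and k = 1."""
    return 0.5 * (integrate_abs(f) / (b / 2.0) + integrate_abs(df))

def theorem4_bound(r, n_sites, c, c_t, eps_sc, eps_dl, norm_phi, norm_dphi):
    """Right-hand side of (2.15) for a finite volume with n_sites points."""
    denom = (eps_sc * norm_phi + eps_dl * norm_dphi) ** 2
    return 2.0 * math.exp(-n_sites * r * r * (1 - c) * (1 - c_t) / (8.0 * denom))

def g(y):    # assumed phi_hat
    return math.exp(-0.5 * y * y)

def dg(y):   # first derivative of g
    return -y * math.exp(-0.5 * y * y)

def d2g(y):  # second derivative of g
    return (y * y - 1.0) * math.exp(-0.5 * y * y)

b = 1.0                                  # hypothetical minimal distance
norm_phi = sobolev_norm_1d(g, dg, b)     # norm of phi_hat
norm_dphi = sobolev_norm_1d(dg, d2g, b)  # norm of its differential
bound = theorem4_bound(0.5, 1000, 0.2, 0.3, 0.1, 0.1, norm_phi, norm_dphi)
```

For this Gaussian the integrals can be done by hand, giving $\|\hat\varphi\|_{1,b} = \sqrt{2\pi}/b + 1$ and $\|d\hat\varphi\|_{1,b} = 2/b + 2e^{-1/2}$, which provides a check on the quadrature.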

The constant $b/2$ plays the role of fixing a length scale and here it is the “uniform packing radius” as defined above. The constant $|B_1|$ denotes the volume of the $\nu$-dimensional unit ball.$^5$

Remark. Theorem 4 shows self-averaging of the diffraction measures with an explicit estimate on the rate. We regard this estimate as very satisfactory. Indeed, the l.h.s. of (2.15) depends in a complicated way on three complicated objects, the geometry of the point set $\Lambda\subset\Gamma$, the test function $\varphi$, and the distribution $\mu$ of the random field $(\omega,\eta)$. The upper bound on the r.h.s. of (2.15) is in comparison very simple. The influence of the dependence structure of the random field is entirely factorized into the constant $(1-c)(1-c_t)$, a structure that is inherited from Theorem 1. The dependence on $\varphi$ is only through the integrals appearing in the Sobolev norm. The dependence on $\Gamma$ is only through the uniform packing radius $b/2 > 0$ appearing as the scaling factor in this norm. We stress that all quantities appearing in the estimate (2.15) are explicitly computable, and so an experimentalist can produce actual numbers on the r.h.s. of (2.15). Also the assumption of uniform positivity of the packing radius can be given up, leading to somewhat uglier estimates. For more on this see Sect. 3, Addition to Proposition 1.

Remark. Even for the independent case this bound is slightly better than the one given in [K01b]. It seems possible to prove a result of this type by an extension of the expansion method described in [K01b], at least to certain smaller classes of weakly dependent Gibbs fields. This would be at the price of adding a huge layer of complexity to the expansions, so the concentration estimate method is to be preferred.

$^5$ Of course, $d^k g(y) : (\mathbb{R}^{\nu})^k \to \mathbb{C}$ denotes the $k$th differential of $g$ at the point $y$ and $\|d^k g(y)\| = \sup_{|v_1|=\dots=|v_k|=1} |d^k g(y)[v_1,\dots,v_k]|$ is the usual norm of a $k$-multilinear mapping, at any fixed point $y$, where $|v|$ denotes the Euclidean norm. Similarly $\|dg\|_{\nu,b} = \frac{1}{|B_1|}\sum_{k=0}^{\nu}\frac{1}{k!}\,\frac{1}{(b/2)^{\nu-k}}\int_{\mathbb{R}^{\nu}}\|d^{k+1} g(y)\|\,dy$. The advantage of including the factor $b > 0$ inside the definition of the norm is the scale invariance: Rescaling of the measurement function $\varphi_\sigma(k) = \sigma^{-\nu}\varphi_1(k/\sigma)$, where $\varphi_1$ is a probability density w.r.t. the $\nu$-dimensional Lebesgue measure, leads to $\|\hat\varphi_\sigma\|_{\nu,b} = \|\hat\varphi_1\|_{\nu,b\sigma}$. Similarly $\varepsilon\,\|d\hat\varphi_\sigma\|_{\nu,b} = \varepsilon\sigma\,\|d\hat\varphi_1\|_{\nu,b\sigma}$.

3. Further Application to Diffraction – Proofs

3.1. Concentration result for quenched scatterers or quenched dislocations. It is physically important to know what happens when we have a frozen configuration of scattering amplitudes $\eta$ and we are interested in the concentration of $\hat\gamma^{\eta,\omega}_{\Lambda}(\varphi)$ centered at its average over the dislocations $\omega$, for fixed $\eta$. So, we have “quenched” the $\eta$-configuration. This describes a disordered material with frozen types of scatterers that are subjected to thermal motions around their equilibrium positions. We mention that we get the valid bound for this case by the formal application of Theorem 4 (although this case is not logically contained in the statement of the theorem). The corresponding constant in the denominator of the argument of the exponential is obtained by putting the bound on the variation of the amplitudes $\varepsilon_{sc} = 0$. So, it doesn’t depend on the Sobolev norm of $\hat\varphi$


anymore but only on the Sobolev norm of its differential. Next $c, c_t$ have to be taken as constants for the $\omega$-distribution for that particular $\eta$. The same game can be played by exchanging the roles of $\eta$ and $\omega$, so that we are fixing the latter ones. Note that, when $\omega$ is fixed we are left with a model on a distorted but fixed point set $\{x+\omega_x,\ x\in\Gamma\}$ (with modified but positive minimal packing radius $b/2$). Thus we can assume without loss of generality that $\omega_x \equiv 0$ for all $x\in\Gamma$.

3.2. Concentration result for average over dislocations. It is physically very natural to consider a model of scatterers $\eta$ and dislocations $\omega$ whose joint distribution $(X,Y) \equiv (\eta,\omega)$ is of the type described in Sect. 2.2. A special case for this would be a model of independent scatterers with thermal dislocations that might depend on the type of the scatterer, but we don’t need independence for the scatterers.

Theorem 5. Suppose a distribution for the scatterers $X$ and dislocations $Y$ as described in Sect. 2.2. Again we assume the uniform bounds on the scattering amplitudes and dislocations as detailed in (2.14) of Theorem 4. Then, the corresponding fixed-scatterer scattering image that is averaged over the dislocations obeys the universal large deviation estimate

$$\mu_X\Bigl(\bigl|\mu\bigl(\hat\gamma^{X,Y}_{\Lambda}(\varphi)\,\big|\,X\bigr) - \mu\bigl(\hat\gamma^{X,Y}_{\Lambda}(\varphi)\bigr)\bigr| \ge r\Bigr) \le 2\exp\left(-\frac{|\Lambda|\,r^2}{8}\,\frac{(1-c^X)(1-c^X_t)}{\bigl(\varepsilon_{sc}\,\|\hat\varphi\|_{\nu,b} + c^{Y,\mathrm{eff}}\,\varepsilon_{dl}\,\|d\hat\varphi\|_{\nu,b}\bigr)^2}\right) \quad \forall r\ge 0. \tag{3.1}$$

We also have the total bound

$$\mu\Bigl(\bigl|\hat\gamma^{X,Y}_{\Lambda}(\varphi) - \mu\bigl(\hat\gamma^{X,Y}_{\Lambda}(\varphi)\bigr)\bigr| \ge r\Bigr) \le 2\exp\left(-\frac{|\Lambda|\,r^2}{8}\left[\left(\frac{(1-c^X)(1-c^X_t)}{\bigl(\varepsilon_{sc}\,\|\hat\varphi\|_{\nu,b} + c^{Y,\mathrm{eff}}\,\varepsilon_{dl}\,\|d\hat\varphi\|_{\nu,b}\bigr)^2}\right)^{-1} + \left(\frac{(1-c^{Y,\infty})(1-c^{Y,\infty}_t)}{\bigl(\varepsilon_{dl}\,\|d\hat\varphi\|_{\nu,b}\bigr)^2}\right)^{-1}\right]^{-1}\right). \tag{3.2}$$

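Let us now give the estimate on the $l^2$-norm of the variation of our function w.r.t. the scatterers and the dislocations. From this, Theorem 4 follows immediately from Theorem 1. Similarly Theorem 5 follows from Theorem 2 and Theorem 3.

Proposition 1. Look at the function $(\eta,\omega) \mapsto \gamma^{\eta,\omega}_{\Lambda}(\hat\varphi)$ on the set where $|\eta_x| \le 1$ for all sites $x\in\Lambda$ and the minimal distance of the point set $\{x+\omega_x,\ x\in\Lambda\}$ is bigger than $b > 0$. Then we have

$$\bigl\|\underline{\delta}^{\eta}\bigl(\gamma^{\eta,\omega}_{\Lambda}(\hat\varphi)\bigr)\bigr\|_{l^2} \le \frac{2\,\|\hat\varphi\|_{\nu,b}}{|\Lambda|}\Bigl(\sum_{x\in\Lambda}[\delta(\eta_x)]^2\Bigr)^{\frac{1}{2}} \tag{3.3}$$

and

$$\bigl\|\underline{\delta}^{\omega}\bigl(\gamma^{\eta,\omega}_{\Lambda}(\hat\varphi)\bigr)\bigr\|_{l^2} \le \frac{2\,\|d\hat\varphi\|_{\nu,b}}{|\Lambda|}\Bigl(\sum_{x\in\Lambda}[\delta(\omega_x)]^2\Bigr)^{\frac{1}{2}}. \tag{3.4}$$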

Proof. For each $x \in \Lambda$ we have for the variation of the non-normalized observable that
\[
\delta^{\eta}_x \sum_{x', x'' \in \Lambda} \eta_{x'} \eta^{*}_{x''}\, \hat\varphi(x' - x'' + \omega_{x'} - \omega_{x''})
\le 2 \delta(\eta_x) \times \sup_{\omega} \sum_{x' \in \Lambda} \big| \hat\varphi(x - x' + \omega_x - \omega_{x'}) \big|, \tag{3.5}
\]
where we have used that $|\eta_z| \le 1$ for all $z$, and that $|\hat\varphi(x)| = |\hat\varphi(-x)|$. This expression is not particularly transparent, but it can be estimated in terms of the much nicer Sobolev norm. To get good estimates it is important to refrain from the temptation to put the sup inside the sum!

Now, let us use the following fact that was proved as Proposition 3 in [K01b]: For any point set $\Lambda \subset \mathbb{R}^\nu$ whose points have a minimal distance of $a > 0$ we have the estimate
\[
\sum_{z \in \Lambda} |g(z)| \le \|g\|_{\nu,a}. \tag{3.6}
\]
Here the norm on the r.h.s. was introduced in (2.16). This statement is reminiscent of Sobolev embedding theorems. It follows from the fact that for any $\nu$-times differentiable function $g$ on the unit ball $B_1$ around the origin one has
\[
|g(0)| \le \frac{1}{|B_1|} \sum_{k=0}^{\nu} \frac{1}{k!} \int_{B_1} \big\| d^k g(y) \big\|\, dy.
\]
We apply this statement for the set $\Lambda(x, \omega) \equiv \{x - x' + \omega_x - \omega_{x'},\ x' \in \Lambda\}$ that includes the arguments the r.h.s. of (3.5) is summed over. It is simple but important to note that its minimal distance is bounded below by $b > 0$, independently of $x$ and $\omega$. So we get
\[
\sum_{x' \in \Lambda} \big| \hat\varphi(x - x' + \omega_x - \omega_{x'}) \big| \le \sum_{z \in \Lambda(x,\omega)} |\hat\varphi(z)| \le \|\hat\varphi\|_{\nu,b}. \tag{3.7}
\]
This already proves the desired estimate (3.3) on the $l^2$-norm.

Next we show the result (3.4) for the $\omega$-variation. It is in the same spirit but there is a small trick involved. We have
\[
\delta^{\omega}_x \sum_{x', x'' \in \Lambda} \eta_{x'} \eta^{*}_{x''}\, \hat\varphi(x' - x'' + \omega_{x'} - \omega_{x''})
\le 2 \sup_{\omega_{x^c}} \sup_{\omega_x, \omega'_x} \sum_{x' \in \Lambda \setminus x} \big| \hat\varphi(x - x' + \omega'_x - \omega_{x'}) - \hat\varphi(x - x' + \omega_x - \omega_{x'}) \big|. \tag{3.8}
\]
This time, for each fixed $x$, $\omega_{x^c}$, and $\omega_x$ let us define the set $\tilde\Lambda(x, \omega_{x^c}, \omega_x) := \{x - x' + \omega_x - \omega_{x'},\ x' \in \Lambda \setminus x\}$ including all the arguments of the second $\hat\varphi$-term. We note that the minimal distance between the points of any of these sets is bounded below by $b > 0$. Then we can bound the r.h.s. of (3.8) by

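\[
2 \sup_{\omega_x, \omega'_x} \sup_{\omega_{x^c}} \sum_{z \in \tilde\Lambda(x, \omega_{x^c}, \omega_x)} \big| \hat\varphi(z + \omega'_x - \omega_x) - \hat\varphi(z) \big|
\le 2 \sup_{|u| \le \delta(\omega_x)} \sup_{\tilde\Lambda} \sum_{z \in \tilde\Lambda} \big| \hat\varphi(z + u) - \hat\varphi(z) \big|, \tag{3.9}
\]
where $\sup_{\tilde\Lambda}$ is over all $\tilde\Lambda$ with minimal distance $\ge b$. For $u \ne 0$ and any such $\tilde\Lambda$ we write
\[
\begin{aligned}
\sum_{z \in \tilde\Lambda} \big| \hat\varphi(z + u) - \hat\varphi(z) \big|
&= |u| \sum_{z \in \tilde\Lambda} \Big| \int_0^1 \frac{d}{ds}\Big|_{s=0} \hat\varphi(z + tu + su/|u|)\, dt \Big| \\
&\le |u| \int_0^1 \sum_{z \in \tilde\Lambda} \Big| \frac{d}{ds}\Big|_{s=0} \hat\varphi(z + tu + su/|u|) \Big|\, dt \\
&\le |u| \sup_{0 \le t \le 1} \sum_{w \in \tilde\Lambda + tu} \Big| \frac{d}{ds}\Big|_{s=0} \hat\varphi(w + su/|u|) \Big|.
\end{aligned} \tag{3.10}
\]
It is important to note that $\tilde\Lambda + tu$ is still a set with minimal distance $\ge b$, for any fixed $t$. So we can estimate the sum uniformly in $t$ and get
\[
\sum_{w \in \tilde\Lambda + tu} \Big| \frac{d}{ds}\Big|_{s=0} \hat\varphi(w + su/|u|) \Big|
\le \Big\| \frac{d}{ds}\Big|_{s=0} \hat\varphi(\,\cdot + su/|u|) \Big\|_{\nu,b}
\le \|d\hat\varphi\|_{\nu,b}. \tag{3.11}
\]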

This finishes the proof of Proposition 1.

The assumption that $\{x + \omega_x,\ x \in \Lambda\}$ may have a positive minimal distance, $\mu$-a.s., is not necessary for a similar estimate to hold. We will now briefly discuss what estimates can be made when the a.s. minimal distance assumption is lifted, however still assuming a.s. uniformly bounded dislocations. In fact, the reader will realize that the proof of Proposition 1 shows the a priori sharper statement (i) given below. The resulting estimate is then exploited more explicitly in statement (ii) under the assumption of bounded dislocations.

Addition to Proposition 1. (i) For a function $g : \mathbb{R}^\nu \to \mathbb{C}$ define the norm $\|g\|_{\Lambda,\mu}$ to be the smallest number such that
\[
\sup_{v \in \mathbb{R}^\nu} \sum_{x \in \Lambda + v} |g(x + Y_x)| \le \|g\|_{\Lambda,\mu} \quad \text{for } \mu\text{-a.e. realization of } Y. \tag{3.12}
\]
A similar definition is made for a linear form $dg$ by replacing the modulus on the l.h.s. by the norm of the linear functional at $x + Y_x$. Then, under the sole condition that $|\eta_x| \le 1$ without any restrictions on $\Lambda$ and $\mu$, Proposition 1 holds with $\|\cdot\|_{\Lambda,\mu}$ replacing $\|\cdot\|_{\nu,b}$.

(ii) Denote the minimal distance of the unperturbed set $\Lambda \subset \mathbb{R}^\nu$ by $b_0 > 0$ and assume that $|Y_x| \le R$ a.s., for any fixed arbitrarily large $R < \infty$. Then we have the (crude) estimate $\|\cdot\|_{\Lambda,\mu} \le (2 + 2R/b_0)^{\nu}\, \|\cdot\|_{\nu,b_0}$.

Remark. Note that therefore Theorem 4 and Theorem 5 have obvious extensions obtained by the application of the Addition to Proposition 1 on the basis of the general concentration Theorems 1, 2, 3!

Proof of (ii). The idea is to estimate the sum on the l.h.s. of (3.12) in terms of sums of integrals over balls with fixed radii $b_0/2$ that might overlap, using the statement given after (3.6). Then simply count the possible number of overlaps. Without loss of generality put $v = 0$. Then
\[
\begin{aligned}
\sum_{x \in \Lambda} |g(x + Y_x)|
&\le \sum_{x \in \Lambda} \frac{1}{|B_1|} \sum_{k=0}^{\nu} \frac{1}{k!} \frac{1}{(b_0/2)^{\nu - k}} \int_{\mathbb{R}^\nu} 1_{B_{b_0/2}(x + Y_x)}(y)\, \big\| d^k g(y) \big\|\, dy \\
&\le \Big( \sup_{y \in \mathbb{R}^\nu} \sum_{x \in \Lambda} 1_{B_{b_0/2}(x + Y_x)}(y) \Big)\, \|g\|_{\nu,b_0} \\
&\le \big( 2 + 2R/b_0 \big)^{\nu}\, \|g\|_{\nu,b_0}.
\end{aligned} \tag{3.13}
\]
To understand the last inequality note that, at any point $y$, the sum in the bracket must be smaller than the number of points in any set with minimal distance $b_0$ whose distance to $y$ is smaller than $R' = R + b_0/2$. But this number is certainly bounded by the volume of the ball with radius $R' + b_0/2$ divided by the volume of the ball with radius $b_0/2$. It is obvious from this argument that the given factor could be improved by more careful counting.
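The counting step behind the factor $(2 + 2R/b_0)^{\nu}$ can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes the unperturbed set is the cubic lattice $b_0\mathbb{Z}^\nu$ (one particular set with minimal distance $b_0$), and the function names `overlap_bound` and `lattice_points_within` are hypothetical helpers introduced only for this illustration.

```python
import itertools
import math


def overlap_bound(R, b0, nu):
    """Crude volume-counting factor (2 + 2R/b0)^nu from statement (ii)."""
    return (2.0 + 2.0 * R / b0) ** nu


def lattice_points_within(y, radius, b0):
    """Count points of the lattice b0 * Z^nu at distance < radius from y.

    Assumption for illustration only: the point set is a cubic lattice,
    which is one example of a set with minimal distance b0.
    """
    nu = len(y)
    reach = int(math.ceil(radius / b0)) + 1          # search window, in lattice steps
    centre = [round(c / b0) for c in y]              # nearest lattice cell to y
    count = 0
    for offset in itertools.product(range(-reach, reach + 1), repeat=nu):
        point = [(ci + oi) * b0 for ci, oi in zip(centre, offset)]
        if math.dist(point, y) < radius:
            count += 1
    return count


if __name__ == "__main__":
    R, b0, nu = 1.5, 1.0, 2
    y = (0.3, 0.7)
    # Points that can be displaced by at most R and still cover y with a
    # ball of radius b0/2 lie at distance < R + b0/2 from y.
    n_points = lattice_points_within(y, R + b0 / 2.0, b0)
    print(n_points, "<=", overlap_bound(R, b0, nu))
```

For a lattice the actual count is typically well below the bound, in line with the remark that the factor could be improved by more careful counting.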

4. Application to Random Gibbs Measures

Example: Self-averaging of free energy density for dependent disorder. Let us mention at first an application that shows exponential self-averaging of the free energy for the case of a disordered model with disorder field that obeys Dobrushin uniqueness. For the case of independent disorder such estimates can already be found in [HP82]. For a full large deviation principle for the free energy of a random spin system with i.i.d. disorder distribution, see Sect. 5 in [SY01]. Note that in our setup we don't assume absence of phase transition for the spin variables of the model itself. It is a straightforward application of the basic concentration Theorem 1 and reads in the abstract setting as follows.

Corollary 1. Suppose the random field $X = (X_x)_{x \in \Lambda_X}$ ("disorder field") taking values in $E_X^{\Lambda_X}$ is distributed according to a Gibbs measure $\mu^X$ that obeys the Dobrushin uniqueness condition with Dobrushin constant $c^X$, and also the transposed Dobrushin uniqueness condition with constant $c^X_t$. Suppose that $\Omega$ is a measurable space ("spin space") and $\rho$ is a positive measure on $\Omega$ ("a priori measure on the spin-space"). Suppose that $H$ is a real function ("Hamiltonian") on $E_X^{\Lambda_X} \times \Omega$. Define the function (corresponding "free energy") by
\[
F(X) := - \log \int_{\Omega} \rho(d\omega)\, e^{-H(X,\omega)} \tag{4.1}
\]
whenever it exists. Then we have the Gaussian concentration estimate
\[
\mu^X\Big( F(X) - \mu^X\big(F(X)\big) \ge r \Big) \le \exp\left( - \frac{r^2 (1 - c^X)(1 - c^X_t)}{2\, \|\delta^X(H)\|^2_{l^2}} \right) \quad \forall r \ge 0. \tag{4.2}
\]

This follows from the easy fact that the partial variation $\delta^X_x(F)$ is bounded by the partial variation $\delta^X_x(H)$. Note that the estimate can be used to prove self-averaging of the finite volume free energy density that is exponentially fast in the volume (for disordered spin systems whose Hamiltonians have bounded local variations w.r.t. the disorder field $X$). This is clear since $\|\delta^X(H)\|^2_{l^2}$ will be of the order $|\Lambda|$ when $H$ is any reasonable finite volume random Hamiltonian depending only on spin variables in $\Lambda$ (while fixing a spin-boundary condition outside). Note that for very general non-local dependence of $H$ on $X$ this fact is still true, the precise constants depending on the specific model, of course.

Example: Pair interactions on general graphs. Let us now discuss the class of models with pair interactions on a general graph to illustrate how the various "Dobrushin-type" constants can be estimated in terms of simpler constants bounding the pair potentials themselves. Suppose that $G_X = (\Lambda_X, B_X)$ is a graph with vertex set $\Lambda_X$ and set of edges (or "bonds") $B_X$. Suppose that its degree is bounded by $m_X$. Suppose that $\mu^X$ is a measure with state-space $E_X^{\Lambda_X}$ obeying Dobrushin uniqueness and its transposed version with formal Boltzmann weight
\[
\propto \exp\Big( - \sum_{\{x,y\} \in B_X} U_{x,y}(\omega_x, \omega_y) \Big) \prod_{x \in \Lambda_X} \lambda(d\eta_x) \tag{4.3}
\]
with a pair potential satisfying $\sup_{\omega,\omega'} |U_{x,y}(\omega) - U_{x,y}(\omega')| \le u$ for all $\{x, y\} \in B_X$. Then we have from (2.5) that $c^X, c^X_t \le m_X u/2$ for the constants appearing in Theorem 1. The same would be true if there were any additional single-site potential possibly differing from site to site (as long as all integrals converge).

Let us now consider a disordered (or nested) system whose fields $X$ and $Y$ are both of the pair potential type and see what constants arise in the chain rule estimates of Theorem 2 and Theorem 3. Let us suppose that $Y$ is a variable whose conditional distribution $\mu(\,\cdot\,|X = \eta)$ is a Gibbs measure on a graph $G_Y = (\Lambda_Y, B_Y)$ with vertex set $\Lambda_Y$ and set of edges $B_Y$. Suppose that its degree is bounded by $m_Y$. Suppose uniform Dobrushin uniqueness and its transpose for the distribution with formal Boltzmann weight of the form
\[
\propto \exp\Big( - \sum_{\{x,y\} \in B_Y} W_{x,y}(\omega_x, \omega_y, \eta_{\{x,y\}}) \Big) \prod_{x \in \Lambda_Y} \lambda(d\omega_x) \tag{4.4}
\]
with a pair potential $W$ that is a function also of an edge variable $\eta_{x,y}$. So, we assume that $\Lambda_X = B_Y$ equals the set of edges of the inner variable $Y$. This is the case e.g. for "nearest neighbor" pair-interacting spin glass models on arbitrary graphs. Suppose that the $X$-influence on the interaction between $Y$'s is bounded in the sense that $\sup_{\omega} \sup_{\eta,\eta'} |W_{x,y}(\omega, \eta) - W_{x,y}(\omega, \eta')| \le q$. Then we have from Proposition 2 given in the section below that $C^{Y \leftarrow X}_{x,y} \le q/2$ so that the interaction constants are bounded by $c^{Y \leftarrow X} \le m_X q/2$ and $c^{Y \leftarrow X}_t \le q$. Finally, assuming that $\sup_{\eta} \sup_{\omega,\omega'} |W_{x,y}(\omega, \eta) - W_{x,y}(\omega', \eta)| \le w$, we get the bound on the uniform Dobrushin constants $c^{Y,\infty}, c^{Y,\infty}_t \le m_Y w/2$. In this way all constants appearing in Theorems 1, 2, 3 have been expressed in the elementary variation parameters $q, w, u$ of the potentials and the degree of the two graphs appearing. In the simple situation of the graph $\Lambda_Y = \mathbb{Z}^\nu$ with independent $X_x$'s we thus have in particular $c^X = c^X_t = 0$, $c^{Y \leftarrow X} \le \nu q$, $c^{Y \leftarrow X}_t \le q$, and $c^{Y,\infty}, c^{Y,\infty}_t \le \nu w$.
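The bookkeeping of this example is easy to script. The sketch below is illustrative only: `dobrushin_bound` encodes the bound "degree times variation over two" quoted from (2.5), and `gaussian_tail_bound` evaluates a right-hand side of the form appearing in (4.2); both function names and the numerical inputs are assumptions made for the illustration, not notation from the paper.

```python
import math


def dobrushin_bound(max_degree, variation):
    """Upper bound m * u / 2 on a Dobrushin-type constant for pair potentials
    on a graph of degree at most max_degree whose pair potentials vary by at
    most `variation`."""
    return max_degree * variation / 2.0


def gaussian_tail_bound(r, c, c_t, delta_sq):
    """Evaluate exp(-r^2 (1 - c)(1 - c_t) / (2 * delta_sq)).

    delta_sq stands for the squared l^2-norm of the variation vector;
    the bound is only meaningful in the uniqueness regime c, c_t < 1.
    """
    if not (0.0 <= c < 1.0 and 0.0 <= c_t < 1.0):
        raise ValueError("need 0 <= c, c_t < 1 (Dobrushin uniqueness regime)")
    if delta_sq <= 0.0:
        raise ValueError("delta_sq must be positive")
    return math.exp(-r * r * (1.0 - c) * (1.0 - c_t) / (2.0 * delta_sq))


if __name__ == "__main__":
    nu, w = 3, 0.1                       # hypothetical lattice dimension and variation
    c_y = dobrushin_bound(2 * nu, w)     # degree of Z^nu is 2*nu, giving nu * w
    print("c^{Y,inf} <=", c_y)
    print("tail bound at r = 2:", gaussian_tail_bound(2.0, c_y, c_y, delta_sq=1.0))
```

On $\mathbb{Z}^\nu$ the degree is $2\nu$, so `dobrushin_bound(2 * nu, w)` reproduces the value $\nu w$ stated at the end of the example.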


Simple estimates on the dependence constants. For practical use let us mention the following proposition that was already applied in the previous example.

Proposition 2. Suppose that the conditional distribution of $Y_x$ has the Gibbsian form $\mu(d\omega_x \,|\, X = \eta, Y_{x^c} = \omega_{x^c}) = \exp\big(-H_x(\eta, \omega_x, \omega_{x^c})\big)\, \lambda(d\omega_x) / Z_x(\eta\, \omega_{\Lambda \setminus x})$, where $H_x(\eta, \omega)$ is a function on the product space and $\lambda$ is a $\sigma$-finite measure on $E_X$. Then we have that
\[
C^{Y \leftarrow X}_{x,y} \le \frac{1}{2}\, \delta^X_y(H_x). \tag{4.5}
\]

Proof of Proposition 2. Within the proof of Proposition 8.8 of [Geo88] the following was shown. Suppose that $\lambda^{(i)}_x(d\omega_x) = e^{u^{(i)}(\omega_x)} \lambda(d\omega_x) / \int \lambda(d\tilde\omega_x)\, e^{u^{(i)}(\tilde\omega_x)}$, $i = 1, 2$, are two measures on the single-site space $E$, given in terms of the functions $u^{(i)}$. Then their variational distance can be bounded in terms of the variation of the function $u^{(1)} - u^{(2)}$ so that one has $\|\lambda^{(1)}_x - \lambda^{(2)}_x\|_x \le \frac{1}{4} \sup_{\omega_x, \omega'_x} |u^{(1)}(\omega_x) - u^{(2)}(\omega_x) - u^{(1)}(\omega'_x) + u^{(2)}(\omega'_x)|$. But from here the proposition is obvious.

Proof of Estimate on Dobrushin constants and transpose given in (2.5). Assuming the inequality above one sees that $C_{x,y} \le \frac{1}{2} \sum_{A \supset \{x,y\}} \delta(\Phi_A)$ (which is also explicitly pointed out in the proof of Proposition 8.8 in [Geo88]). We point out for our purposes that it is symmetric in $x, y$. So one gets (2.5) from here, for both $c^X$ and $c^X_t$.

5. Proof of Theorem 1

The proof of Theorem 1 relies on an appropriate extension of the martingale method that is well-known for the case of functions of independent variables to the case of Dobrushin uniqueness. (See e.g. [Ta96] Paragraph 4 for independent variables.) Recall the idea of this method. The exponential moment generating function of $F(X) - E(F(X))$ is estimated in a simple way: Put some order on the sites and write $F(X) - E(F(X))$ as a sum of martingale differences. These are differences between conditional expectations obtained by fixing the values of the field on sets differing at one site. Then integrate the exponential successively over the individual fields, using bounds on the integrals at each step.

Our application is based on Lemma 1 which is a uniform estimate on the martingale differences which are obtained by introducing an arbitrary order of the sites of the index set $\Lambda$. The interesting point of the proof is then to understand how the weak dependence of the Gibbs distribution can be handled, in comparison to the case of independent variables. It turns out that this can be done in a very simple and elegant way by the use of estimates of the variational distance of Gibbs-measures in the Dobrushin uniqueness regime w.r.t. changes of the local specification. A clear two-page proof of the result we need for our purposes can be found in [Geo88]; we won't repeat it here and just refer to the necessary information we need as "Fact about Dobrushin uniqueness". This "fact" will be exploited again in more generality below in the proof of Theorem 3.

Now, let us start with the proof. In fact we prove the following stronger (but less convenient) statement.

Theorem 1'. Fix a bijection from the positive integers to $\Lambda$ and denote by $<$ the order on $\Lambda$ that is inherited by that bijection. Denote by D< = Dx,y 1x

Suppose that $F$ is a real function on $E^{\Lambda}$ with $\mu\big(\exp(tF(X))\big) < \infty$ for all real $t$. Then we have
\[
\mu\Big( F(X) - \mu\big(F(X)\big) \ge r \Big) \le \exp\left( - \frac{r^2}{2\, \big\| (1 + [D_<]^t)\, \delta(F) \big\|^2_{l^2(\Lambda)}} \right). \tag{5.1}
\]
$D^t$ denotes the transpose of a matrix $D$, and $1$ is the unit matrix.

Assuming this, Theorem 1 is implied for simple reasons: We first use that $\|(1 + D^t_<) v\|_2 \le \|D^t v\|_2$ for vectors $v$ with positive entries, because of the positivity of the matrix elements of $C$. Next use that $\|D^t v\|^2_{l^2} = |\sum_{x,y} (DD^t)_{x,y} v_x v_y| \le \frac{1}{2} \sum_{x,y} (DD^t)_{x,y} (v_x^2 + v_y^2) \le \sup_x \sum_y (DD^t)_{x,y}\, \|v\|^2_{l^2}$. We have that $\sup_x \sum_{y,z} D_{x,z} D_{y,z} \le \sup_x \sum_u D_{x,u}\, \sup_z \sum_y D_{y,z} = \|D\|_{l^\infty} \|D\|_{l^1}$, where the last symbols denote the operator norms. Noting that the Dobrushin constant equals $c^X = \|C\|_{l^1}$ and that $c^X_t = \|C\|_{l^\infty}$ we see that the last expression is bounded by $(1 - c^X)^{-1} (1 - c^X_t)^{-1}$. This proves the form of the estimate given in Theorem 1.

Now let us start with the proof of the uniform bounds on the martingale differences of the function $F(X)$.

Lemma 1. Define the decreasing sequence of $\sigma$-algebras by putting $\mathcal{T}_x := \sigma(X_y;\ y \ge x)$, for $x \in \Lambda$. Then the martingale differences of the random variable $F(X)$ taken w.r.t. this ordering obey the uniform bound
\[
\big\| \mu(F(X)|\mathcal{T}_x) - \mu(F(X)|\mathcal{T}_{x+1}) \big\|_\infty \le \delta_x(F) + \sum_{y \in \Lambda,\, y < x} \delta_y(F)\, D_{y,x}. \tag{5.2}
\]
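The matrix step used above to pass from (5.1) to Theorem 1, namely $\|D^t v\|^2_{l^2} \le \|D\|_{l^\infty}\|D\|_{l^1}\|v\|^2_{l^2} \le (1-c)^{-1}(1-c_t)^{-1}\|v\|^2_{l^2}$ for $D = \sum_n C^n$, can be checked numerically on a small example. This is a hedged sketch only: the 3×3 matrix is made up, the geometric series is truncated (which only lowers the left-hand side because all entries are non-negative), and the helper names are hypothetical.

```python
def neumann_series(c, terms=200):
    """Truncated geometric series D = sum_{n=0}^{terms} C^n for a square
    matrix C with non-negative entries (given as a list of rows)."""
    n = len(c)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]
    for _ in range(terms):
        power = [[sum(power[i][k] * c[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


def transpose_bound(c, v, terms=200):
    """Return (||D^t v||^2, ||v||^2 / ((1 - a)(1 - b))), where a and b are
    the maximal row sum and the maximal column sum of C."""
    n = len(c)
    row_norm = max(sum(row) for row in c)
    col_norm = max(sum(c[i][j] for i in range(n)) for j in range(n))
    if row_norm >= 1.0 or col_norm >= 1.0:
        raise ValueError("both maximal row and column sums must be below 1")
    d = neumann_series(c, terms)
    dt_v = [sum(d[y][x] * v[y] for y in range(n)) for x in range(n)]
    lhs = sum(w * w for w in dt_v)
    rhs = sum(w * w for w in v) / ((1.0 - row_norm) * (1.0 - col_norm))
    return lhs, rhs


if __name__ == "__main__":
    # Made-up interdependence matrix with zero diagonal and small entries.
    C = [[0.0, 0.2, 0.1],
         [0.3, 0.0, 0.1],
         [0.0, 0.25, 0.0]]
    lhs, rhs = transpose_bound(C, [1.0, 2.0, 0.5])
    print(lhs, "<=", rhs)
```

Since the bound is symmetric in the two operator norms, it does not matter here which of them is identified with $c$ and which with $c_t$.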

Proof of Lemma 1. This estimate relies on the following piece of information (see [Geo88], Theorem 8.20).

Fact about Dobrushin uniqueness. Suppose that $\Lambda$ is a countable set, infinite or finite, and the random variables $(X_x)_{x \in \Lambda}$ are distributed according to a Gibbs measure $\rho$ that obeys the Dobrushin uniqueness condition (see the Introduction). Put $D = \sum_{n=0}^{\infty} C^n$, where $C$ is the interdependence matrix of $\rho$. Suppose that we are given another Gibbs measure $\tilde\rho$ such that the variational distance of the single-site conditional probabilities is uniformly bounded
\[
\sup_{\xi} \big\| \rho(\,\cdot\,|\xi) - \tilde\rho(\,\cdot\,|\xi) \big\|_x \le b_x \tag{5.3}
\]
with constants $b_x$ for $x \in \Lambda$. Then the expectations of any function $f(\xi)$ on the infinite-volume configurations $\xi$ don't differ more than
\[
|\rho(f) - \tilde\rho(f)| \le \sum_{y, x \in \Lambda} \delta_y(f)\, D_{y,x}\, b_x. \tag{5.4}
\]

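To show Lemma 1 let us use short notations like $\mu(F(X)|\mathcal{T}_x)(\xi) \equiv \mu(F(X)|\xi_{\ge x})$, etc. Now, to estimate the martingale differences in (5.2) let us write
\[
\begin{aligned}
\big| \mu(F(X)|\xi_{>x}\xi_x) - \mu(F(X)|\xi_{>x}) \big|
&\le \big| \mu\big(F(X_{<x}\xi_x\xi_{>x}) \,\big|\, \xi_{>x}\xi_x\big) - \mu\big(F(X_{<x}\xi_x\xi_{>x}) \,\big|\, \xi_{>x}\big) \big| \\
&\quad + \mu\Big( \big| F(X_{<x}\xi_x\xi_{>x}) - F(X_{<x}X_x\xi_{>x}) \big| \,\Big|\, \xi_{>x} \Big).
\end{aligned} \tag{5.5}
\]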

The second term is bounded by $\delta_x(F)$. For the first term we apply the "fact about Dobrushin uniqueness" on the conditional spin-system on the sites $y$ in $\Lambda$ with $y < x$ that is obtained from the original conditional probabilities by fixing $\xi_{>x}$. Putting $\rho(d\xi_{\le x}) = \mu(d\xi_{\le x}|\xi_{>x})$ and $\tilde\rho(d\xi_{\le x}) = \mu(d\xi_{\le x}|\xi_{>x}\xi_x)$ we have the estimate (5.3) with $b_y = 0$ for all $y < x$ and $b_x = 1$. This gives in fact that $\sup_x \sup_{\xi_{>x}\xi_x}$ over the first modulus on the r.h.s. of the last inequality is bounded by the second term in (5.4). This finishes the proof of Lemma 1.

Note that, in this application, we applied the "fact about Dobrushin uniqueness" to the finite index set of the sites that are less than or equal to $x$. In this situation the proof of the "fact" becomes even simpler, as is easily seen by going through the short proof of Lemma 8.18 and Theorem 8.20 given in [Geo88]. It is also simple to verify that the statement holds for any, possibly degenerate kernels $\tilde\rho(\,\cdot\,|\xi)$ allowing e.g. also for Dirac measures on specific configurations.

To complete the proof of Theorem 1 we apply Lemma A.1 (given in Appendix A) on the filtration (decreasing sequence of $\sigma$-algebras) defined in Lemma 1. To be able to do so we need that $\mu$ is trivial on the tail $\sigma$-algebra, but this is clear because it is the only Gibbs measure that is compatible with the specification defined by its conditional expectation, using Dobrushin uniqueness again. So the proof of Theorem 1 is finished.

Lemma A.1 itself, at least in the case of a finite filtration, is a simple application of the martingale method in the context of uniformly bounded martingale differences. However, we need to treat correctly the presence of the infinite filtration. Infinities in the filtrations appear also in a slightly different way in the proof of Theorem 3, so for the sake of clarity we give the results needed along with their complete proofs in Appendix A.

Remark. We remark that a term like $\mu(f(X_z)|\xi_{>x}\xi_x) - \mu(f(X_z)|\xi_{>x})$ is dangerous in the presence of a phase transition for the measure $\mu$. Then we could not exclude that there might be discontinuous behavior, even for arbitrarily distant sites $x, z$, for certain $\xi_{>x}$. Therefore the proof doesn't generalize to the phase transition region.

6. Proof of Theorems 2, 3

The proof of Theorem 2 relies on Theorem 1 and another application of the "Fact about Dobrushin uniqueness" stated in Sect. 5, along with the application of a chain rule for variations. Again, let us give the strongest version of Theorem 2 first.

Theorem 2'. Fix a bijection from the positive integers to $\Lambda_X$ and denote by $<$ the order on $\Lambda_X$ that is inherited by that bijection. Denote by $D^{Y,\infty} = \sum_{n=0}^{\infty} (C^{Y,\infty})^n$ the geometric series of the uniform Dobrushin matrix $C^{Y,\infty}_{x,y} = \sup_{\eta} C^{Y}_{x,y}(\eta)$. Suppose that $G$ is a real function with $\mu\big(\exp(tG(X, Y))\big) < \infty$ for all real $t$. Then we have the Gaussian concentration estimate
\[
\mu\Big( \mu\big(G(X,Y)\,\big|\,X\big) - \mu\big(G(X,Y)\big) \ge r \Big) \le \exp\left( - \frac{r^2}{2M^2} \right), \tag{6.1}
\]
where
\[
M = \Big\| \big(1 + [D^X_<]^t\big)\, \delta^X(G) + \big(1 + [D^X_<]^t\big)\, [C^{Y \leftarrow X}]^t\, [D^{Y,\infty}]^t\, \delta^Y(G) \Big\|_{l^2} \tag{6.2}
\]
whenever this quantity is finite. Of course the definition of $D^X_<$ is the same as $D_<$ in Theorem 1' for the marginal distribution on $X$.

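Proof of Theorem 2'. We denote the function that appears in the estimate by $F(\eta) := \mu^{X=\eta}\big(G(\eta, Y)\big)$ and apply Theorem 1' for that function. We need to estimate its variation $\delta^X_z(F)$. We will show that, in the sense of the inequality between coordinates, we have
\[
\delta(F) \le \delta^X(G) + [C^{Y \leftarrow X}]^t\, [D^{Y,\infty}]^t\, \delta^Y(G). \tag{6.3}
\]
From that Theorem 2' follows by Theorem 1'. Take any $\eta$ and $\eta'$ that differ only at the site $z$, and put $G^-(\eta_{z^c}, \omega) := \inf_{\eta_z} G(\eta_{z^c}\eta_z, \omega)$. Then we have
\[
\begin{aligned}
\mu^{X=\eta}\big(G(\eta, Y)\big) - \mu^{X=\eta'}\big(G(\eta', Y)\big)
&\le \mu^{X=\eta}\big(G(\eta, Y)\big) - \mu^{X=\eta}\big(G^-(\eta_{z^c}, Y)\big) \\
&\quad + \mu^{X=\eta}\big(G^-(\eta_{z^c}, Y)\big) - \mu^{X=\eta'}\big(G^-(\eta_{z^c}, Y)\big) \\
&\le \delta^X_z(G) + \sup_{\bar\eta}\, \delta^X_z\Big( \mu^{X=\eta}\big(G(\bar\eta, Y)\big) \Big).
\end{aligned} \tag{6.4}
\]
To control the variation of the conditional spin system when we change its local specification by changing the $X$-variable we need to use again the "Fact about Dobrushin uniqueness". Denoting $\rho(d\omega) = \mu^{X=\eta_{\setminus z}\eta_z}(d\omega)$ and $\tilde\rho(d\omega) = \mu^{X=\eta_{\setminus z}\eta'_z}(d\omega)$ we have to put $b_x \le C^{Y \leftarrow X}_{x,z}$ in the statement of the "fact" controlling the change in the local specifications caused by a single-site variation of $X$. For fixed $\bar\eta$ we set $f(\omega) := G(\bar\eta, \omega)$ so that we get from the "fact"
\[
\big| \mu^{X=\eta} f(Y) - \mu^{X=\eta'} f(Y) \big| \le \sum_{y \in \Lambda_Y} \delta^Y_y(f) \sum_{x \in \Lambda_Y} D^{Y,\infty}_{y,x}\, C^{Y \leftarrow X}_{x,z}. \tag{6.5}
\]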

Collecting terms and using vector notation the desired inequality for δ(F ) follows. Assuming this, Theorem 2 is obtained from Theorem 2’ by an analogous estimate on M as Theorem 1 is obtained from Theorem 1 . Using the triangle inequality and splitting off the common matrix we are left with the new term " #t " #t " #t #t

C Y ←X D Y,∞ v 2l 2 ≤ C Y ←X C Y ←X 2l 2 × [D Y,∞ v 2l 2 .

(6.6)

The first factor is bounded by cY ←X ctY ←X . The second factor has already been seen to be bounded by (1 − cY,∞ )−1 (1 − ctY,∞ )−1 v 2l 2 .


Proof of Theorem 3. To prove Theorem 3 we need a double filtration. Define the filtration $\mathcal{T}^{(1)}_x := \sigma\big(X_y;\ y \ge x\big)$ on the probability space $E^{\Lambda_X}$ and the filtration $\mathcal{T}^{(2)}_x := \sigma\big(Y_y;\ y \ge x\big)$ on the probability space $E^{\Lambda_Y}$. Then Lemma A.2 tells us that we can treat them like they were finite filtrations if the function in question has exponential moments and we have bounds on their martingale differences. Now, the martingale differences in the first line of (A.3) are controlled by (5.2) applied to the conditional distribution of $Y$ given any fixed configuration of $X$. The martingale differences in the second line of (A.3) are controlled in terms of (6.3). Collecting terms Theorem 3 follows.

A different (although less natural) way to prove the "total concentration result" of Theorem 3 would be to prove that the joint distribution can be represented as a Gibbs measure for the joint variables $\xi_x = (\eta_x \omega_x)$, estimate its joint constants $c, c_t$, and then apply Theorem 4. Note in this context that it won't be true in general that the resulting measure is a Gibbs measure, even for independent $X_x$'s, when one allows for conditional Gibbsian distributions of the $Y$-variables having phase transitions (which is however excluded here). For more on this, see the research in [K99, K01a].

Appendix

Lemma A.1. Suppose that $(\Omega, \mathcal{T}_0, \mu)$ is a probability space. Suppose that $(\mathcal{T}_i)_{i=0,1,2,\dots}$ is a decreasing sequence of $\sigma$-algebras such that $\mu$ is trivial on the tail-$\sigma$-algebra $\mathcal{T}_\infty := \bigcap_{i=0,1,2,\dots} \mathcal{T}_i$. Suppose that $Z$ is a real random variable on $\Omega$ such that $\mu(\exp(tZ)) < \infty$ for all real $t$. Assume that $Z$ has uniformly bounded martingale differences
\[
\big\| \mu(Z|\mathcal{T}_i) - \mu(Z|\mathcal{T}_{i+1}) \big\|_\infty \le M_i. \tag{A.1}
\]
Then we have the exponential concentration estimate
\[
\mu\big( Z - \mu(Z) \ge a \big) \le \exp\left( - \frac{a^2}{2 \sum_{i=0}^{\infty} M_i^2} \right). \tag{A.2}
\]

Remark. Tail triviality is needed! Otherwise $\mu(Z)$ must be replaced by $\mu(Z|\mathcal{T}_\infty)$ in the l.h.s. of the estimate.

Remark. If the sum in the denominator of the argument of the exponential does not converge, the statement is empty, obviously. In the case of a finite filtration $(\mathcal{T}_i)_{i=0,1,\dots,n}$ the statement is applied by putting $\mathcal{T}_i := \mathcal{T}_n$ for $i \ge n$.

Lemma A.2. Suppose that $(\Omega^{(1)}, \mathcal{T}^{(1)}_0)$ and $(\Omega^{(2)}, \mathcal{T}^{(2)}_0)$ are measurable spaces. Denote by $(\Omega, \mathcal{F}_0, \mu)$ the corresponding product space with the product $\sigma$-algebra where the distribution $\mu$ has the form $\mu(d\omega^{(1)} d\omega^{(2)}) = \mu^{(1)}(d\omega^{(1)})\, \mu^{(2)}(d\omega^{(2)}|\omega^{(1)})$ with a probability measure on the first space and a probability kernel from the first to the second space. Suppose that $(\mathcal{T}^{(k)}_i)_{i=0,1,2,\dots}$ are two decreasing sequences of $\sigma$-algebras on the spaces $\Omega^{(k)}$ such that (a) the measure $\mu^{(1)}$ is trivial on the tail-$\sigma$-algebra $\mathcal{T}^{(1)}_\infty := \bigcap_{i=0,1,2,\dots} \mathcal{T}^{(1)}_i$ and (b) the measure $\mu^{(2)}(\,\cdot\,|\omega^{(1)})$ is trivial on the tail-$\sigma$-algebra $\mathcal{T}^{(2)}_\infty := \bigcap_{i=0,1,2,\dots} \mathcal{T}^{(2)}_i$ for $\mu^{(1)}$-a.e. $\omega^{(1)}$.

Suppose that $Z$ is a real random variable on $\Omega$ such that $\mu(\exp(tZ)) < \infty$ for all real $t$. Assume that $Z$ has uniformly bounded martingale differences
\[
\begin{aligned}
\big\| \mu(Z|\mathcal{T}^{(1)}_0 \otimes \mathcal{T}^{(2)}_i) - \mu(Z|\mathcal{T}^{(1)}_0 \otimes \mathcal{T}^{(2)}_{i+1}) \big\|_\infty &\le M_i, \quad \forall i = 0, 1, \dots, \\
\big\| \mu(Z|\mathcal{T}^{(1)}_j \otimes \mathcal{T}^{(2)}_\infty) - \mu(Z|\mathcal{T}^{(1)}_{j+1} \otimes \mathcal{T}^{(2)}_\infty) \big\|_\infty &\le L_j, \quad \forall j = 0, 1, \dots.
\end{aligned} \tag{A.3}
\]
Then we have the exponential concentration estimate
\[
\mu\big( Z - \mu(Z) \ge a \big) \le \exp\left( - \frac{a^2}{2 \sum_{i=0}^{\infty} (M_i^2 + L_i^2)} \right). \tag{A.4}
\]

Proof of Lemma A.1. We will show that
\[
\mu\big( e^{t(Z - \mu(Z))} \big) \le e^{\frac{t^2}{2} \sum_{i=0}^{\infty} M_i^2}. \tag{A.5}
\]

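From this the estimate on the probabilities follows in the standard way from the exponential Markov inequality, in the form $\mu\big(Z - \mu(Z) \ge a\big) \le e^{-ta}\, \mu\big(e^{t(Z - \mu(Z|\mathcal{T}_\infty))}\big)$ for all $t \ge 0$, by optimizing the bound (A.5) over $t$. Now, to show (A.5) one puts $t \equiv 1$ without loss and estimates the Laplace transform
\[
\begin{aligned}
\mu\big( e^{Z - \mu(Z|\mathcal{T}_1)}\, e^{\mu(Z|\mathcal{T}_1) - \mu(Z)} \big)
&= \mu\Big( \mu\big[ e^{Z - \mu(Z|\mathcal{T}_1)} \big| \mathcal{T}_1 \big] \times e^{\mu(Z|\mathcal{T}_1) - \mu(Z)} \Big) \\
&\le \Big\| \mu\big[ e^{Z - \mu(Z|\mathcal{T}_1)} \big| \mathcal{T}_1 \big] \Big\|_\infty\, \mu\big( e^{\mu(Z|\mathcal{T}_1) - \mu(Z|\mathcal{T}_\infty)} \big).
\end{aligned} \tag{A.6}
\]
The supremum over the conditional Laplace transform of the first martingale difference is estimated in terms of the uniform bound $M_0$. Since the expectation vanishes one gets that
\[
\Big\| \mu\big[ e^{Z - \mu(Z|\mathcal{T}_1)} \big| \mathcal{T}_1 \big] \Big\|_\infty \le e^{\frac{M_0^2}{2}}. \tag{A.7}
\]
(This follows from the inequality $e^{\lambda z} \le e^{\frac{\lambda^2}{2}} + z \sinh \lambda$ for $|z| \le 1$.) From that we get by iteration
\[
\mu\big( e^{Z - \mu(Z)} \big) \le e^{\frac{1}{2} \sum_{i=0}^{N-1} M_i^2}\, \mu\big( e^{Y_N} \big), \tag{A.8}
\]
where $Y_N = \mu(Z|\mathcal{T}_N) - \mu(Z)$. To show (A.5) we show that $\lim_{N \uparrow \infty} \mu\big(e^{Y_N}\big) = 1$. To see this, note at first that, by the backwards martingale theorem (see e.g. Bauer Theorem 60.8) we know that, $\mu$-a.s., $\lim_{N \uparrow \infty} \mu(Z|\mathcal{T}_N) = \mu(Z|\mathcal{T}_\infty)$. But since we assumed that $\mu$ is trivial on $\mathcal{T}_\infty$ this means $\lim_{N \uparrow \infty} Y_N = 0$ $\mu$-a.s. So one has $\lim_{N \uparrow \infty} \mu\big(e^{Y_N} 1_{Y_N \le \lambda}\big) = 1$ for all fixed $\lambda$. But from this follows the convergence of the full integrals because of the uniform estimate
\[
\begin{aligned}
\sup_{N=0,1,\dots} \mu\big( e^{Y_N} 1_{Y_N \ge \lambda} \big)
&\le e^{-\lambda} \sup_{N=0,1,\dots} \mu\big( e^{2Y_N} \big) \\
&\le e^{-\lambda} \sup_{N=0,1,\dots} \Big( \mu\, e^{2\mu(Z|\mathcal{T}_N)} \Big)^{\frac{1}{2}} \Big( \mu\, e^{-2\mu(Z|\mathcal{T}_\infty)} \Big)^{\frac{1}{2}} \\
&\le e^{-\lambda} \Big( \mu\, e^{2Z} \Big)^{\frac{1}{2}} \Big( \mu\, e^{-2Z} \Big)^{\frac{1}{2}} < \infty,
\end{aligned} \tag{A.9}
\]
where the last inequality is Jensen's inequality.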
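For completeness, the optimization over $t$ mentioned at the start of the proof can be written out. The following worked step is a sketch under the assumptions of Lemma A.1; the abbreviation $S := \sum_{i=0}^{\infty} M_i^2$ is introduced here only for readability and is assumed finite and positive.

```latex
\begin{align*}
\mu\big(Z - \mu(Z) \ge a\big)
  &\le e^{-ta}\, \mu\big(e^{t(Z - \mu(Z))}\big)
   \le \exp\Big(-ta + \tfrac{t^2}{2}\, S\Big)
   \qquad \text{for all } t \ge 0, \\
\frac{d}{dt}\Big(-ta + \tfrac{t^2}{2}\, S\Big) = 0
  &\iff t = \frac{a}{S}, \\
\mu\big(Z - \mu(Z) \ge a\big)
  &\le \exp\Big(-\frac{a^2}{S} + \frac{a^2}{2S}\Big)
   = \exp\Big(-\frac{a^2}{2S}\Big).
\end{align*}
```

The first line combines the exponential Markov inequality with (A.5) and uses tail triviality to identify $\mu(Z|\mathcal{T}_\infty)$ with $\mu(Z)$; the last line is exactly the estimate (A.2).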


Proof of Lemma A.2. We need to show that $\mu\big(e^{t(Z - \mu(Z))}\big) \le e^{\frac{t^2}{2} \sum_{i=0}^{\infty} (M_i^2 + L_i^2)}$. Now, we write the Laplace transform as
\[
\mu\big( e^{t(Z - \mu(Z))} \big) = \mu\Big( e^{Z - \mu(Z|\mathcal{T}^{(1)}_0 \otimes \mathcal{T}^{(2)}_\infty)}\, e^{U} \Big) \tag{A.10}
\]
with $U = \mu(Z|\mathcal{T}^{(1)}_0 \otimes \mathcal{T}^{(2)}_\infty) - \mu(Z)$. With the same arguments as the ones leading to (A.8) one gets that
\[
\begin{aligned}
\mu\big( e^{Z - \mu(Z)} \big)
&\le e^{\frac{1}{2} \sum_{i=0}^{N-1} M_i^2}\, \mu\big( e^{V_N} e^{U} \big) \\
&\le e^{\frac{1}{2} \sum_{i=0}^{\infty} M_i^2} \Big( \mu\big(e^{U}\big) + \mu\big( (e^{V_N} - 1)\, e^{U} \big) \Big),
\end{aligned} \tag{A.11}
\]
where $V_N = \mu(Z|\mathcal{T}^{(1)}_0 \otimes \mathcal{T}^{(2)}_N) - \mu(Z|\mathcal{T}^{(1)}_0 \otimes \mathcal{T}^{(2)}_\infty)$. We can apply the martingale decomposition for $\mu\big(e^{U}\big)$ from which follows that $\mu\big(e^{U}\big) \le e^{\frac{1}{2} \sum_{i=0}^{\infty} L_i^2}$, using tail-triviality. So, we need to show that the second term in the last parenthesis converges to zero with $N \uparrow \infty$. But this follows from the backwards martingale convergence theorem, tail triviality and existence of all exponential moments in an analogous fashion as in the proof of Lemma A.1.

Acknowledgements. I am grateful to M. Baake for his motivation of the study of diffraction patterns and to A. Bovier for valuable discussions about concentration inequalities. This work was supported by the DFG Schwerpunkt 'Wechselwirkende stochastische Systeme hoher Komplexität'.

References

[BaaHoe00] Baake, M., Höffe, M.: Diffraction of random tilings: Some rigorous results. J. Stat. Phys. 99(1/2), 219–261 (2000)
[BaaMoo98] Baake, M., Moody, R.V.: Diffractive point sets with entropy. J. Phys. A 31, 9023–9039 (1998)
[BaaMoo01] Baake, M., Moody, R.V.: Weighted Dirac combs with pure point diffraction. Preprint, 2001
[Ce01] Cesi, F.: Quasi-factorization of the entropy and logarithmic Sobolev inequalities for Gibbs random fields. Probab. Theory Relat. Fields 120, 569–584 (2001)
[D93] Dworkin, S.: Spectral theory and x-ray diffraction. J. Math. Phys. 34, 2965–2967 (1993)
[Do68] Dobrušin, R.L.: Description of a random field by means of conditional probabilities and conditions for its regularity. Teor. Verojatnost. i Primenen 13, 201–229 (1968)
[DS84] Dobrushin, R.L., Shlosman, S.B.: In: Statistical Physics and Dynamical Systems (Köszeg, 1984). Boston, MA: Birkhäuser, 1985, pp. 371–403
[EFS93] van Enter, A.C.D., Fernández, R., Sokal, A.D.: Regularity properties and pathologies of position-space renormalization-group transformations: Scope and limitations of Gibbsian theory. J. Stat. Phys. 72, 879–1167 (1993)
[EnMi92] van Enter, A.C.D., Miekisz, J.: How should one define a (weak) crystal? J. Stat. Phys. 66, 1147–1153 (1992)
[Geo88] Georgii, H.O.: Gibbs Measures and Phase Transitions. Berlin: de Gruyter, 1988
[He00] Herrmann, D.J.L.: Properties of Models for Aperiodic Solids. Ph.D. thesis, Nijmegen, 2000
[Hof95a] Hof, A.: Diffraction by aperiodic structures at high temperatures. J. Phys. A 28, 57–62 (1995)
[Hof95b] Hof, A.: On diffraction by aperiodic structures. Commun. Math. Phys. 169, 25–43 (1995)
[HP82] van Hemmen, J.L., Palmer, R.G.: The thermodynamic limit and the replica method for short-range random systems. J. Phys. A 15(12), 3881–3890 (1982)
[K99] Külske, C.: (Non-) Gibbsianness and Phase Transitions in Random Lattice Spin Models. Markov Proc. Rel. Fields 5, 357–383 (1999)
[K01a] Külske, C.: Weakly Gibbsian Representations for joint measures of quenched lattice spin models. Probab. Theory Relat. Fields 119, 1–30 (2001)
[K01b] Külske, C.: Universal bound on the selfaveraging of random diffraction measures. WIAS-preprint 676, available as preprint math-ph/0109005 at http://lanl.arXiv.org/, to be published in Probab. Theory Relat. Fields
[Le01] Ledoux, M.: The concentration of measure phenomenon. Mathematical Surveys and Monographs 89, Providence, RI: American Mathematical Society, 2001
[LT91] Ledoux, M., Talagrand, M.: Probability in Banach spaces. Berlin: Springer, 1991
[Ma98] Marton, K.: Measure concentration for a class of random processes. Probab. Theory Relat. Fields 110, 427–439 (1998)
[M00] Schlottmann, M.: Generalized model sets and dynamical systems. In: Directions in Mathematical Quasicrystals, CRM Monogr. Ser., 13, Providence, RI: Am. Math. Soc. 2000, pp. 143–159
[Sa00] Samson, P.-M.: Concentration of measure inequalities for Markov chains and Φ-mixing processes. Ann. Probab. 28(1), 416–461 (2000)
[SY01] Seppäläinen, T., Yukich, J.E.: Large deviation principles for Euclidean functionals and other nearly additive processes. Probab. Theory Relat. Fields 120, 309–345 (2001)
[SZ92] Stroock, D.W., Zegarlinski, B.: The equivalence of the logarithmic Sobolev inequality and the Dobrushin-Shlosman mixing condition. Commun. Math. Phys. 144, 303–323 (1992)
[Ta96] Talagrand, M.: A New Look at Independence. Ann. Probab. 24, 1–34 (1996)

Communicated by H. Spohn

Commun. Math. Phys. 239, 53–63 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0872-y

Communications in

Mathematical Physics

On the Largest Eigenvalue of a Random Subgraph of the Hypercube

Alexander Soshnikov1, Benny Sudakov2

1 Department of Mathematics, University of California at Davis, One Shields Ave., Davis, CA 95616, USA. E-mail: [email protected]
2 Department of Mathematics, Princeton University, Princeton, NJ 08544, USA and Institute for Advanced Study, Princeton, NJ 08540, USA. E-mail: [email protected]

Received: 30 May 2002 / Accepted: 10 February 2003 Published online: 3 June 2003 – © Springer-Verlag 2003

Abstract: Let $G$ be a random subgraph of the $n$-cube where each edge appears randomly and independently with probability $p$. We prove that the largest eigenvalue of the adjacency matrix of $G$ is almost surely $\lambda_1(G) = (1 + o(1)) \max\big(\Delta^{1/2}(G), np\big)$, where $\Delta(G)$ is the maximum degree of $G$ and the $o(1)$ term tends to zero as $\max(\Delta^{1/2}(G), np)$ tends to infinity.

1. Introduction and Formulation of Results

Let $Q_n$ be a graph whose vertices are all the vectors $\{x = (x_1, \dots, x_n) \mid x_i \in \{0, 1\}\}$ and two vectors $x$ and $y$ are adjacent if they differ in exactly one coordinate, i.e., $\sum_i |x_i - y_i| = 1$. We call $Q_n$ the $n$-dimensional cube or simply the $n$-cube. Clearly $Q_n$ is an $n$-regular, bipartite graph of order $2^n$. In this paper we study random subgraphs of the $n$-cube. A random subgraph $G(Q_n, p)$ is a discrete probability space composed of all subgraphs of the $n$-cube, where each edge of $Q_n$ appears randomly and independently with probability $p = p(n)$. Sometimes with some abuse of notation we will refer to a random subgraph $G(Q_n, p)$ as a graph on $2^n$ vertices generated according to the distribution described above. Usually asymptotic properties of random graphs are of interest. We say that a graph property $A$ holds almost surely, or a.s. for brevity, in $G(Q_n, p)$ if the probability that $G(Q_n, p)$ has property $A$ tends to one as $n$ tends to infinity. Necessary background information on random graphs in general and random subgraphs of the $n$-cube can be found in [4].
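The objects just defined are easy to simulate for small $n$. The sketch below is only an illustration and makes its own choices: vertices of $Q_n$ are encoded as integers whose bits are the coordinates, the largest eigenvalue is estimated by a plain power iteration through the ratio $\|Ax\|/\|x\|$ (which never exceeds the true value), and the function names are hypothetical. For such small $n$ one should not expect the asymptotic formula of the abstract to be sharp; the elementary facts that $\lambda_1$ lies between the average degree and the maximal degree hold for every graph and are what the sketch checks.

```python
import math
import random


def random_subcube(n, p, rng):
    """Adjacency lists of a random subgraph of the n-cube.

    Vertices are the integers 0 .. 2^n - 1 (bit i is coordinate i); each
    edge of the cube is kept independently with probability p.
    """
    size = 1 << n
    adj = [[] for _ in range(size)]
    for x in range(size):
        for i in range(n):
            y = x ^ (1 << i)                  # neighbour differing in coordinate i
            if x < y and rng.random() < p:    # visit each cube edge once
                adj[x].append(y)
                adj[y].append(x)
    return adj


def largest_eigenvalue(adj, iterations=500):
    """Estimate lambda_1 of the adjacency matrix by power iteration,
    reporting ||A x|| / ||x|| for the current iterate x."""
    x = [1.0] * len(adj)
    estimate = 0.0
    for _ in range(iterations):
        y = [sum(x[u] for u in nbrs) for nbrs in adj]     # y = A x
        norm_x = math.sqrt(sum(v * v for v in x))
        norm_y = math.sqrt(sum(v * v for v in y))
        if norm_y == 0.0:                                 # graph without edges
            return 0.0
        estimate = norm_y / norm_x
        x = [v / norm_y for v in y]
    return estimate


if __name__ == "__main__":
    n, p = 8, 0.3
    graph = random_subcube(n, p, random.Random(0))
    max_deg = max(len(nbrs) for nbrs in graph)
    avg_deg = sum(len(nbrs) for nbrs in graph) / len(graph)
    lam = largest_eigenvalue(graph)
    print("lambda_1 ~", lam)
    print("average degree:", avg_deg, " max degree:", max_deg)
    print("max(sqrt(max degree), n*p):", max(math.sqrt(max_deg), n * p))
```

Using $\|Ax\|/\|x\|$ rather than a signed Rayleigh quotient avoids the oscillation that the eigenvalue pair $\pm\lambda_1$ of a bipartite graph would otherwise cause.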

Research was supported in part by the NSF grant DMS-0103948. Research was supported in part by NSF grants DMS-0106589, CCR-9987845 and by the State of New Jersey.


Random subgraphs of the hypercube were studied by Burtin [5], Erd¨os and Spencer [8], Ajtai, Koml´os and Szemer´edi [2] and Bollob´as [4], among others. In particular it was shown that a giant component emerges shortly after p = 1/n ([2]) and the graph becomes connected shortly after p = 1/2 ([5, 8, 4]). Recently the model has become of interest in mathematical biology ([7, 13, 14]). In particular it appears (see [13, 14]) that random graphs play an important role in a general model of a population evolving over a network of selectively neutral genotypes. It has been shown that the population’s limit distribution on the neutral network is solely determined by the network topology and given by the principal eigenvector of the network’s adjacency matrix. Moreover, the average number of neutral mutant neighbors per individual is given by the spectral radius. The subject of this paper is the asymptotic behavior of the largest eigenvalue of the random graph G(Qn , p). The adjacency matrix of G is an 2n ×2n matrix A whose entries are either one or zero depending on whether the edge (x, y) is present in G or not. A is a random real symmetric matrix with the eigenvalues denoted by λ1 ≥ λ2 ≥ . . . ≥ λ2n . It follows from the Perron-Frobenius theorem that the largest eigenvalue is equal to the spectral norm of A, i.e. λmax (G) = λ1 (G) = A = maxj |λj |. Note also that for a subgraph of the n-cube, or in general, for any bipartite graph, λk (G) = −λ|V |−k (G), k = 1, 2, . . . and in particular |λmin (G)| = λmax (G). It is easy to observe that for every graph G = (V , E) its largest eigenvalue λ1 (G) is always squeezed between the average degree of G, d¯ = v∈V dG (v)/|V | and its maximal degree (G) = maxv∈V dG (v). In some situtaions these two bounds have the same asymptotic value which determines the behavior of the largest eigenvalue. 
For example, this easily gives the asymptotics of the largest eigenvalue of a random subgraph $G(n, p)$ of a complete graph of order $n$ for $p \gg \log n / n$. On the other hand, in our case there is a gap between the average and maximal degree of the random graph $G(Q_n, p)$ for all values of $p < 1$ and therefore it is not immediately clear how to estimate its largest eigenvalue.

Here we determine the asymptotic value of the largest eigenvalue of sparse random subgraphs of the n-cube. To understand the result better, observe that if $\Delta$ denotes the maximal degree of a graph $G$, then $G$ contains a star $S_\Delta$ and therefore $\lambda_1(G) \ge \lambda_1(S_\Delta) = \sqrt{\Delta}$. Also, as mentioned above, $\lambda_1(G)$ is at least as large as the average degree of $G$. As for all values of $p(n) \gg n^{-1} 2^{-n}$, a.s. $|E(G(Q_n, p))| = (1 + o(1)) p n 2^{n-1}$, we get that a.s. $\lambda_1(G(Q_n, p)) \ge (1 + o(1)) np$. Combining the above lower bounds, we get that a.s. $\lambda_1(G(Q_n, p)) \ge (1 + o(1)) \max\big(\sqrt{\Delta}, np\big)$. It turns out this lower bound can be matched by an upper bound of the same asymptotic value, as given by the following theorem:

Theorem 1.1. Let $G(Q_n, p)$ be a random subgraph of the n-cube and let $\Delta$ be the maximum degree of $G$. Then almost surely the largest eigenvalue of the adjacency matrix of $G$ satisfies
$$\lambda_1(G) = \big(1 + o(1)\big) \max\big(\sqrt{\Delta}, np\big),$$
where the $o(1)$ term tends to zero as $\max(\Delta^{1/2}(G), np)$ tends to infinity.

As the asymptotic value of the maximal degree of $G(Q_n, p)$ can be easily determined for all values of $p(n)$ (see Lemma 2.1), the above theorem enables us to estimate the asymptotic value of $\lambda_1(G(Q_n, p))$ for all relevant values of $p$. This theorem is similar to the recent result of Krivelevich and Sudakov [11] on the largest eigenvalue of a random subgraph $G(n, p)$ of a complete graph of order $n$. The rest of this paper is organized as follows. In the next section we gather some necessary technical lemmas about random subgraphs of the n-cube. Section 3 is devoted


to the proof of the main theorem. Section 4, the last section of the paper, contains some concluding remarks. Throughout the paper we will systematically omit floor and ceiling signs for the sake of clarity of presentation. All logarithms are natural. We will frequently use the inequality $\binom{a}{b} \le (ea/b)^b$. Also we use the following standard notations: $a_n = \Theta(b_n)$, $a_n = O(b_n)$ and $a_n = \Omega(b_n)$ for $a_n > 0$, $b_n > 0$ as $n \to \infty$ if there exist constants $C_1$ and $C_2$ such that $C_1 b_n < a_n < C_2 b_n$, $a_n < C_2 b_n$ or $a_n > C_1 b_n$ respectively. The equivalent notations $a_n = o(b_n)$ and $a_n \ll b_n$ mean that $a_n / b_n \to 0$ as $n \to \infty$. We will say that an event $\Upsilon_n$, depending on a parameter $n$, holds almost surely if $\Pr(\Upsilon_n) \to 1$ as $n \to \infty$ (please note that this notion, very common in the literature on random structures, has a different meaning in probability theory).

2. Few Technical Lemmas

In this section we establish some properties of random subgraphs of the n-cube which we will use later in the proof of our main theorem. First we consider the asymptotic behavior of the maximal degree of $G(Q_n, p)$. It is not difficult to show that if $p$ is a constant less than 1/2 then a.s. $\Delta(G) = (1 + o(1)) cn$, where $c$ satisfies the equation $\log 2 + c \log p + (1 - c) \log(1 - p) = c \log c + (1 - c) \log(1 - c)$, and $\Delta(G) = (1 + o(1)) n$ for $p \ge 1/2$. We omit the proof of this statement since for our purposes it is enough to have $\sqrt{\Delta(G)} = O(np)$, which follows immediately from the fact that $\Delta(G) \le n$. The case when $p = o(1)$ is studied in more detail in the following lemma.

Lemma 2.1. Let $G = G(Q_n, p)$ be a random subgraph of the n-cube. Denote by
$$\kappa(n) = \max\Big\{ k : 2^n \binom{n}{k} p^k (1 - p)^{n-k} \ge 1 \Big\}.$$
Then the following statements hold.
(i) If $p = o(1)$ and $p$ is not exponentially small in $n$ then almost surely $\kappa(n) - 1 \le \Delta(G) \le \kappa(n) + 1$.
(ii) If $p = \Theta(2^{-n/k} n^{-1})$, then $2^n \binom{n}{k} p^k (1 - p)^{n-k} = \Theta(1)$ and $\kappa(n) = k - 1$ or $k$. Also almost surely $\Delta(G)$ is either $k - 1$ or $k$.
(iii) If $p$ is exponentially small, but not proportional to $2^{-n/k} n^{-1}$, then $\kappa(n) = \frac{n \log 2}{\log(p^{-1}) - \log n}$ and almost surely $\Delta(G) = \kappa(n)$.

Proof. Let $X_k$ be the number of vertices of $G(Q_n, p)$ with degree larger than $k - 1$. Then $X_k = \sum_{v \in Q_n} I_v$, where $I_v$ is the indicator random variable of the event that $\deg(v) \ge k$. One can easily calculate the expectation $E X_k = 2^n \sum_{l \ge k} \binom{n}{l} p^l (1 - p)^{n-l}$. Also note that $Q_n$ is bipartite and therefore has an independent set $X$ of size $2^{n-1}$. By definition, the events that $d(v) < k$ are mutually independent for all $v \in X$. Therefore we obtain that
$$\Pr\big(X_k = 0\big) \le \prod_{v \in X} \Pr\big(d(v) < k\big) = \prod_{v \in X} \big(1 - E I_v\big) \le \exp\Big(-\sum_{v \in X} E I_v\Big) = \exp\big(-E X_k / 2\big). \tag{1}$$


Let us now consider the case (i) in more detail. We have that
$$2^{n - O(np)} (n/k)^k p^k \le 2^n \binom{n}{k} p^k (1 - p)^{n-k} \le 2^n (en/k)^k p^k.$$
Therefore it is easy to check that $\kappa(n)$ must satisfy the inequalities
$$\frac{n \log 2}{\log(p^{-1})} \Big(1 - \frac{1}{\log\log(p^{-1})}\Big) \le \kappa(n) \le \frac{n \log 2}{\log(p^{-1})} \Big(1 + \frac{1}{\log\log(p^{-1})}\Big).$$
By definition, $E X_{\kappa(n)} \ge 1$. Elementary computations show that
$$E X_{k+1} = (1 + o(1)) \frac{p \log(p^{-1})}{\log 2} E X_k$$
for $k = (1 + o(1)) \kappa(n)$, which implies $E X_{\kappa(n)-1} \ge (1 + o(1)) \frac{\log 2}{p \log(p^{-1})}$. Therefore by (1), we have that
$$\Pr\big(\Delta(G) < \kappa(n) - 1\big) = \Pr\big(X_{\kappa(n)-1} = 0\big) \le \exp\Big(-\frac{E X_{\kappa(n)-1}}{2}\Big) \le \exp\Big(-(1 + o(1)) \frac{\log 2}{2 p \log(p^{-1})}\Big) = o(1).$$
On the other hand, since $E X_{\kappa(n)+1} \le 1 + o(1)$ we have that $E X_{\kappa(n)+2} \le (1 + o(1)) \frac{p \log(p^{-1})}{\log 2} = o(1)$. Thus by Markov's inequality we conclude that a.s. $X_{\kappa(n)+2} = 0$ and therefore almost surely $\kappa(n) - 1 \le \Delta(G) \le \kappa(n) + 1$.

Now consider the case (ii). Since $p(n) = \Theta(2^{-n/k} n^{-1})$ we have that $2^n \binom{n}{k} p^k (1 - p)^{n-k} = \Theta(1)$ and $\kappa(n) = k - 1$ or $k$. Also it is easy to check that $E X_{k+1} = \Theta(2^{-n/k}) = o(1)$ and $E X_{k-1} = \Theta(2^{n/k})$. Therefore by (1) we have that $\Pr\big(\Delta(G) < k - 1\big) \le \exp\big(-E X_{k-1}/2\big) = o(1)$ and by Markov's inequality $\Pr\big(\Delta(G) \ge k + 1\big) = o(1)$.

Finally suppose that $p(n)$ is exponentially small but not proportional to $2^{-n/k} n^{-1}$ for any $k \ge 1$. Then it is rather straightforward to check that $\kappa(n) = \frac{n \log 2}{\log(p^{-1}) - \log n}$ and $E X_{\kappa(n)+1} \ll 1 \ll E X_{\kappa(n)}$. Therefore, again using (1) and Markov's inequality, we conclude that $\Pr\big(\Delta(G) < \kappa(n)\big) = o(1)$ and $\Pr\big(\Delta(G) \ge \kappa(n) + 1\big) = o(1)$. Thus almost surely $\Delta(G) = \kappa(n)$. This completes the proof.
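The threshold $\kappa(n)$ of Lemma 2.1 is easy to evaluate numerically, which can help when comparing it with the leading term $n \log 2 / \log(p^{-1})$ from case (i). The sketch below is an illustration only: the function names `log_expected_count` and `kappa` and the choice $n = 10^4$, $p = n^{-1/2}$ are assumptions made for the example. The computation is carried out in log space so that $2^n \binom{n}{k} p^k (1-p)^{n-k}$ never overflows.

```python
import math

def log_expected_count(n, k, p):
    """Natural log of 2^n * C(n, k) * p^k * (1 - p)^(n - k)."""
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return n * math.log(2) + log_binom + k * math.log(p) + (n - k) * math.log1p(-p)

def kappa(n, p):
    """Largest k in {0, ..., n} with 2^n C(n, k) p^k (1 - p)^(n - k) >= 1,
    or None if no k qualifies."""
    best = None
    for k in range(n + 1):
        if log_expected_count(n, k, p) >= 0:
            best = k
    return best

n = 10_000
p = n ** -0.5
k = kappa(n, p)
leading = n * math.log(2) / math.log(1 / p)   # leading-order term from case (i)
print(k, leading)
```

At moderate $n$ the exact value sits noticeably above the leading term, since the correction factor in case (i) decays only like $1/\log\log(p^{-1})$.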

Next we need the following lemma, which shows that a.s. $G$ cannot have a large number of vertices of high degree too close to one another. More precisely the following is true.

Lemma 2.2. Let $G(Q_n, p)$ be a random subgraph of the n-cube. Then almost surely
(i) For every $0 < p \le 1$ and for any two positive constants $a$ and $b$ such that $a + b > 1$ and $n^b \ge 6np$, $G$ contains no vertex which has within distance one or two at least $n^a$ vertices of $G$ with degree $\ge n^b$.
(ii) For $p \ge n^{-2/3}$ and any constant $a > 0$, $G$ contains no vertex which has within distance one or two at least $n^a / p$ vertices of $G$ with degree $\ge np + np/\log n$.

Proof. We prove the lemma for the case of vertices at distance two; the case of vertices at distance one can be treated similarly. Note that since the n-cube is a bipartite graph the vertices which are within the same distance from a given vertex in $Q_n$ are not adjacent.


(i) Let $X$ be the number of vertices of $G$ which violate condition (i). To prove the statement we estimate the expectation of $X$. Clearly we can choose a vertex $v$ of the n-cube in $2^n$ ways. Since there are at most $n^2$ vertices in $Q_n$ within distance two from $v$ we have at most $\binom{n^2}{n^a}$ possibilities to choose a subset $S$ of size $n^a$ of vertices which will have degree at least $n^b$. The probability that the degree of some vertex in $S$ is at least $n^b$ is bounded by $\binom{n}{n^b} p^{n^b}$. Note that these events are mutually independent, since $S$ contains no edges of the n-cube. Therefore, using that $a + b > 1$ and $b > 0$, we obtain that
$$E(X) \le 2^n \binom{n^2}{n^a} \left( \binom{n}{n^b} p^{n^b} \right)^{n^a} \le 2^n n^{2n^a} \left( \Big(\frac{enp}{n^b}\Big)^{n^b} \right)^{n^a} \le 2^{\,n + 2n^a \log_2 n - n^{a+b}} = o(1).$$
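The last step of this first-moment estimate uses both hypotheses of part (i). Spelled out as a worked step (a sketch of the omitted arithmetic, not the authors' wording):

```latex
% Since n^b >= 6np, the base of the power is below 1/2:
\frac{enp}{n^b} \le \frac{e}{6} < \frac{1}{2}
\quad\Longrightarrow\quad
\Big(\frac{enp}{n^b}\Big)^{n^b \cdot n^a} \le 2^{-n^{a+b}},
\qquad
n^{2n^a} = 2^{\,2n^a \log_2 n}.
% Since a + b > 1 and b > 0, the term n^{a+b} dominates both n and n^a \log_2 n, so
n + 2n^a \log_2 n - n^{a+b} \longrightarrow -\infty
\quad\text{and hence}\quad
E(X) = o(1).
```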

Thus by Markov's inequality we conclude that almost surely no vertex violates condition (i).

(ii) Let again $X$ be the number of vertices of $G$ which violate condition (ii). Similarly as before we have $2^n$ choices for the vertex $v$ and at most $\binom{n^2}{n^a/p}$ choices for the set $S$ of $n^a/p$ vertices within distance two from $v$ which will have degree $\ge np + np/\log n$. Since for all vertices $v \in G$ the degree $d(v)$ is binomially distributed with parameters $n$ and $p$, then by a standard large deviation inequality (cf., e.g., [1], Appendix A)
$$\Pr\big(d(v) \ge t = np + np/\log n\big) \le e^{-(t - np)^2/2np + (t - np)^3/2(np)^2} = e^{-(1 + o(1)) np/(2 \log^2 n)}.$$
As we already mentioned, the events that vertices in $S$ have degree $\ge np + np/\log n$ are mutually independent. Therefore, using that $p \ge n^{-2/3}$ and $a > 0$, we conclude that
$$E(X) \le 2^n \binom{n^2}{n^a/p} \Big( e^{-(1 + o(1)) np/(2 \log^2 n)} \Big)^{n^a/p} \le 2^n n^{2n^a/p} e^{-(1 + o(1)) n^{1+a}/(2 \log^2 n)} \le 2^n e^{2n^{a+2/3} \log n} e^{-(1 + o(1)) n^{1+a}/(2 \log^2 n)} = o(1).$$
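The simplifications in part (ii) follow from substituting $t - np = np/\log n$ and $1/p \le n^{2/3}$. As a worked step (a sketch of the omitted arithmetic, not taken from the paper):

```latex
% Large deviation exponent with t - np = np / \log n:
\frac{(t-np)^2}{2np} = \frac{np}{2\log^2 n},
\qquad
\frac{(t-np)^3}{2(np)^2} = \frac{np}{2\log^3 n},
\qquad\text{so}\qquad
-\frac{(t-np)^2}{2np} + \frac{(t-np)^3}{2(np)^2}
  = -\frac{np}{2\log^2 n}\Big(1 - \frac{1}{\log n}\Big)
  = -(1+o(1))\,\frac{np}{2\log^2 n}.
% Raising to the power n^a / p and bounding the binomial coefficient:
\frac{n^a}{p}\cdot\frac{np}{2\log^2 n} = \frac{n^{1+a}}{2\log^2 n},
\qquad
\binom{n^2}{n^a/p} \le n^{2n^a/p} = e^{2(n^a/p)\log n} \le e^{2n^{a+2/3}\log n}.
% Since a > 0, n^{1+a}/\log^2 n dominates both n and n^{a+2/3}\log n.
```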

Thus we can complete the proof of the lemma using again Markov's inequality.

3. Proof of the Theorem

In this section we present our main result. We start by listing some simple properties of the largest eigenvalue of a graph that we will use later in the proof. Most of these easy statements can be found in Chapter 11 of the book of Lovász [12].

Proposition 3.1. Let $G$ be a graph on $n$ vertices and $m$ edges and with maximum degree $\Delta$. Let $\lambda_1(G)$ be the largest eigenvalue of the adjacency matrix of $G$. Then it has the following properties:


(I) $\max\big(\sqrt{\Delta}, \frac{2m}{n}\big) \le \lambda_1(G) \le \Delta$.
(II) If $E(G) = \cup_i E(G_i)$ then $\lambda_1(G) \le \sum_i \lambda_1(G_i)$. If in addition the graphs $G_i$ are vertex disjoint, then $\lambda_1(G) = \max_i \lambda_1(G_i)$.
(III) If $G$ is a bipartite graph then $\lambda_1(G) \le \sqrt{m}$. Moreover if it is a star of size $\Delta$ then $\lambda_1(G) = \sqrt{\Delta}$.
(IV) If the degrees on both sides of the bipartition are bounded by $\Delta_1$ and $\Delta_2$ respectively, then $\lambda_1(G) \le \sqrt{\Delta_1 \Delta_2}$.
(V) For every vertex $v$ of $G$ let $W_2(G, v)$ denote the number of walks of length two in $G$ starting at $v$. Then $\lambda_1(G) \le \sqrt{\max_v W_2(G, v)}$.

Proof of Theorem 1.1. We already derived in the introduction the lower bound of this theorem so we will concentrate on proving an upper bound. We will frequently use the following simple fact: between any two distinct vertices of the n-cube there are at most two paths of length two. We divide the proof into a few cases with respect to the value of the edge probability $p$. In each case we partition $G$ into smaller subgraphs, whose largest eigenvalue is easier to estimate. We start with a rather easy case when the random graph is relatively sparse.

Case 1. Let $e^{-\log^4 n} \le p \le n^{-2/3}$. For these values of $p$, by Lemma 2.1, we have that $\Delta(G) \ge \frac{n}{\log^4 n}$. Partition the vertex set of $G$ into three subsets as follows. Let $V_1$ be the set of vertices with degree at most $n^{2/5}$, let $V_2$ be the set of vertices with degree larger than $n^{2/5}$ but smaller than $n^{4/7}$ and let $V_3$ be the set of vertices with degree at least $n^{4/7}$. Also let $G_1$ be a subgraph of $G$ induced by $V_1$, let $G_2$ be a subgraph induced by $V_2 \cup V_3$, let $G_3$ be a bipartite graph containing edges of $G$ connecting $V_1$ and $V_2$ and finally let $G_4$ be a bipartite graph containing edges connecting $V_1$ and $V_3$. By definition $G = \cup_i G_i$ and thus by claim (II) of Proposition 3.1 we obtain that $\lambda_1(G) \le \sum_{i=1}^{4} \lambda_1(G_i)$.

Since the maximum degree of the graph $G_1$ is at most $n^{2/5}$, by (I) it follows that $\lambda_1(G_1) \le n^{2/5}$. The degrees of vertices of the bipartite graph $G_3$ are bounded on one side by $n^{2/5}$ and on the other by $n^{4/7}$. Hence using (IV) we conclude that $\lambda_1(G_3) \le \sqrt{n^{2/5} n^{4/7}} = n^{17/35}$.

Let $v$ be a vertex of $G_2$ and let $N_2(G_2, v)$ be the set of vertices of $G_2$ which are within distance exactly two from $v$. Since between any two distinct vertices of $Q_n$ there are at most two paths of length two it is easy to see that the number of walks of length two in $G_2$ starting at $v$ is bounded by $d_{G_2}(v) + 2|N_2(G_2, v)|$. Since every vertex of $V_2 \cup V_3$ has degree in $G$ at least $n^{2/5}$, using Lemma 2.2 (i) with $a = 4/5$ and $b = 2/5$ we obtain that almost surely both $d_{G_2}(v)$ and $|N_2(G_2, v)|$ are bounded by $n^{4/5}$. Therefore for every vertex $v$ in $G_2$ we have $W_2(G_2, v) \le d_{G_2}(v) + 2|N_2(G_2, v)| \le 3n^{4/5}$ and hence by (V) $\lambda_1(G_2) \le \sqrt{3n^{4/5}} = \sqrt{3}\, n^{2/5}$.

Finally we need to estimate $\lambda_1(G_4)$. To do so consider a partition of $V_1$ into two parts. Let $V_1'$ be the set of vertices in $V_1$ with at least two neighbors in $V_3$ and let $V_1'' = V_1 - V_1'$. Let $G_4'$ and $G_4''$ be bipartite graphs with parts $(V_1', V_3)$ and $(V_1'', V_3)$ respectively. By definition, $G_4 = G_4' \cup G_4''$ and hence $\lambda_1(G_4) \le \lambda_1(G_4') + \lambda_1(G_4'')$. Since the vertices in $V_1''$ have at most one neighbor in $V_3$ and the graph $G_4''$ is bipartite it follows that $G_4''$ is the union of vertex disjoint stars of size at most $\Delta(G)$. So by (III) we get $\lambda_1(G_4'') \le \sqrt{\Delta(G)}$. Now let $u$ be a vertex of $V_3$ with at least $2n^{1/2}$ neighbors in $V_1'$. By definition, every neighbor of $u$ in $V_1'$ has also an additional neighbor in $V_3$, which is distinct from $u$. Therefore we obtain that there are at least $2n^{1/2}$ simple paths of length two from $u$ to the set $V_3$. Since between any two distinct vertices of the n-cube there are at most two paths of length two we obtain that $u$ has at least $n^{1/2}$ other vertices of $V_3$ within distance


two. Since the degree of all these vertices is at least $n^{4/7}$ it follows from Lemma 2.2 (i) with $a = 1/2$ and $b = 4/7$ that a.s. there is no vertex $u$ with this property. Therefore the degree of every vertex from $V_3$ in the bipartite graph $G_4'$ is bounded by $2n^{1/2}$ and we also have that the degree of every vertex from $V_1'$ is at most $n^{2/5}$. So using again (IV) we obtain $\lambda_1(G_4') \le \sqrt{2n^{1/2} n^{2/5}} = \sqrt{2}\, n^{9/20}$. This implies the desired bound on $\lambda_1(G)$, since
$$\lambda_1(G) \le \sum_i \lambda_1(G_i) \le n^{2/5} + \sqrt{3}\, n^{2/5} + n^{17/35} + \sqrt{2}\, n^{9/20} + \sqrt{\Delta(G)} = \big(1 + o(1)\big) \sqrt{\Delta(G)}.$$

Case 2. Let $p \ge n^{-4/9}$. This case, when the random graph is dense, is also quite simple. Indeed, partition the vertices of $G$ into two parts. Let $V_1$ be the set of vertices with degree larger than $np + np/\log n$ and let $V_2$ be the rest of the vertices. Clearly $G = \cup_i G_i$, where $G_1$ is a subgraph induced by $V_1$, $G_2$ is a subgraph induced by $V_2$, and $G_3$ is a bipartite subgraph with bipartition $(V_1, V_2)$. By definition, the maximum degree of $G_2$ is at most $np + np/\log n$, implying $\lambda_1(G_2) \le np + np/\log n$. Since every vertex in $V_1$ has degree at least $np + np/\log n$, by Lemma 2.2 (ii) with $a = 1/18$ we obtain that almost surely no vertex in $G$ can have more than $n^a/p \le n^{1/2}$ vertices in $V_1$ within distance one or two. In particular, this implies that the maximum degree in $G_1$ is at most $n^{1/2}$ and so $\lambda_1(G_1) \le n^{1/2}$.

Partition $V_2$ into two parts. Let $V_2'$ be the set of vertices in $V_2$ with at least two neighbors in $V_1$ and let $V_2'' = V_2 - V_2'$. Let $G_3'$ and $G_3''$ be bipartite graphs with parts $(V_1, V_2')$ and $(V_1, V_2'')$ respectively. By definition, $G_3 = G_3' \cup G_3''$ and thus $\lambda_1(G_3) \le \lambda_1(G_3') + \lambda_1(G_3'')$. Since the vertices in $V_2''$ have at most one neighbor in $V_1$ and the graph $G_3''$ is bipartite it follows that $G_3''$ is the union of vertex disjoint stars of size at most $\Delta(G)$. So by (III) we get $\lambda_1(G_3'') \le \sqrt{\Delta(G)} \le n^{1/2}$. Now let $u$ be a vertex of $V_1$ with at least $2n^{1/2}$ neighbors in $V_2'$. By definition, every neighbor of $u$ in $V_2'$ has also an additional neighbor in $V_1$, which is distinct from $u$. Therefore we obtain that there are at least $2n^{1/2}$ simple paths of length two from $u$ to the set $V_1$. Since between any two distinct vertices there are at most two paths of length two we obtain that $u$ has at least $n^{1/2}$ other vertices of $V_1$ within distance two. As we already explained in the previous paragraph, this almost surely does not happen. Thus the degree of every vertex from $V_1$ in the bipartite subgraph $G_3'$ is bounded by $2n^{1/2}$ and we also have that the degree of every vertex from $V_2'$ is at most $np + np/\log n$. So using (IV) we obtain $\lambda_1(G_3') \le \sqrt{2n^{1/2} (np + np/\log n)}$. Now since $np \ge n^{5/9}$, it follows that all of $\lambda_1(G_1)$, $\lambda_1(G_3')$, $\lambda_1(G_3'')$ are $o(np)$ and therefore
$$\lambda_1(G) \le \sum_i \lambda_1(G_i) \le \lambda_1(G_2) + o(np) \le np + np/\log n + o(np) = (1 + o(1)) np.$$
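The tools used in the two cases above, namely parts (III), (IV) and (V) of Proposition 3.1 and the fact that two distinct vertices of the n-cube are joined by at most two paths of length two, can be sanity-checked numerically on a small cube. This is only an illustrative sketch: the helper `hypercube_subgraph`, the seeds, and the values `n = 7`, `p = 0.4` are assumptions made for the example.

```python
import numpy as np

def hypercube_subgraph(n, p, rng):
    """Adjacency matrix of a random subgraph of Q_n with edge probability p."""
    size = 1 << n
    A = np.zeros((size, size))
    for x in range(size):
        for i in range(n):
            y = x ^ (1 << i)              # neighbour of x obtained by flipping bit i
            if x < y and rng.random() < p:
                A[x, y] = A[y, x] = 1.0
    return A

n = 7
Q = hypercube_subgraph(n, 1.0, np.random.default_rng(0))   # p = 1: the full n-cube
paths2 = Q @ Q                            # (Q^2)[x, y] = number of paths x - z - y
max_two_paths = (paths2 - np.diag(np.diag(paths2))).max()

A = hypercube_subgraph(n, 0.4, np.random.default_rng(1))
m = A.sum() / 2                           # number of edges
deg = A.sum(axis=1)
lam1 = np.linalg.eigvalsh(A)[-1]
walks2 = (A @ A).sum(axis=1)              # W_2(G, v): walks of length two from v

# Bipartition of the cube by parity of the number of ones, as needed for (IV).
parity = np.array([bin(x).count("1") % 2 for x in range(1 << n)])
delta_even, delta_odd = deg[parity == 0].max(), deg[parity == 1].max()

print(max_two_paths, lam1, np.sqrt(m), np.sqrt(delta_even * delta_odd), np.sqrt(walks2.max()))
```

Here $W_2(G, v)$ is computed as a row sum of $A^2$, so it includes the walks that return to $v$, in line with the bound $d(v) + 2|N_2(G, v)|$ used in Case 1.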

Case 3. Let $n^{-2/3} \le p \le n^{-4/9}$. This part of the proof is slightly more involved than the two previous ones since in particular it needs to deal with a delicate case when $np$ and $\sqrt{\Delta(G)}$ are nearly equal. Partition the vertex set of $G$ into four parts. Let $V_1$ be the set of vertices with degree at least $n^{2/3}$ and let $V_2$ be the set of vertices with degrees larger than $np + np/\log n$ but less than $n^{2/3}$. Let $V_4$ contain all vertices of $G$ which have at least one neighbor in $V_1$ and degree at most $np + np/\log n$. Finally let $V_3$ be the set of remaining vertices of $G$. Note


that by definition there are no edges between $V_1$ and $V_3$ and every vertex from $V_3$ also has degree at most $np + np/\log n$ in $G$.

We consider the following subgraphs of $G$. Let $G_1$ be the bipartite subgraph containing all the edges between $V_1$ and $V_4$. Partition $V_4$ into two parts. Let $V_4'$ be the set of vertices in $V_4$ with at least two neighbors in $V_1$ and let $V_4'' = V_4 - V_4'$. Let $G_1'$ and $G_1''$ be bipartite graphs with parts $(V_1, V_4')$ and $(V_1, V_4'')$ respectively. By definition, $G_1 = G_1' \cup G_1''$ and thus $\lambda_1(G_1) \le \lambda_1(G_1') + \lambda_1(G_1'')$. Since the vertices in $V_4''$ have at most one neighbor in $V_1$ and the graph $G_1''$ is bipartite it follows that $G_1''$ is the union of vertex disjoint stars of size at most $\Delta(G)$. So by (III) we get $\lambda_1(G_1'') \le \sqrt{\Delta(G)}$. Now let $u$ be a vertex of $V_1$ with at least $2n^{2/5}$ neighbors in $V_4'$. By definition, every neighbor of $u$ in $V_4'$ has also an additional neighbor in $V_1$, which is distinct from $u$. Therefore we obtain that there are at least $2n^{2/5}$ simple paths of length two from $u$ to the set $V_1$. As before, this implies that $u$ has at least $n^{2/5}$ other vertices of $V_1$ within distance two. By Lemma 2.2 (i) with $a = 2/5$ and $b = 2/3$ this almost surely does not happen. Thus the degree of every vertex from $V_1$ in the bipartite subgraph $G_1'$ is bounded by $2n^{2/5}$ and we also have that the degree of every vertex from $V_4'$ is at most $np + np/\log n \le n^{5/9}$. So using (IV) we obtain $\lambda_1(G_1') \le \sqrt{2n^{2/5} n^{5/9}} = \sqrt{2}\, n^{43/90} = o(\sqrt{\Delta})$. Therefore $\lambda_1(G_1) \le \lambda_1(G_1') + \lambda_1(G_1'') \le (1 + o(1)) \sqrt{\Delta}$.

Our second subgraph $G_2$ is induced by the set $V_3$. By definition, the maximum degree in it is at most $np + np/\log n$ and therefore $\lambda_1(G_2) \le (1 + o(1)) np$. Crucially this graph is vertex disjoint from $G_1$, which implies by (II) that
$$\lambda_1\big(G_1 \cup G_2\big) = \max\big(\lambda_1(G_1), \lambda_1(G_2)\big) \le \big(1 + o(1)\big) \max\big(\sqrt{\Delta}, np\big).$$
Next we define the remaining graphs whose union with $G_1$ and $G_2$ equals $G$ and show that their largest eigenvalues contribute only smaller order terms in the upper bound on $\lambda_1(G)$.

Let $G_3$ be the subgraph of $G$ induced by the set $V_1 \cup V_2$. By definition, every vertex in $G_3$ has at least $np + np/\log n$ neighbors in $G$. Therefore by Lemma 2.2 (ii) with $a = 1/12$ we obtain that for every $v \in G_3$ there are at most $n^a/p \le n^{3/4}$ other vertices of $G_3$ within distance one or two. This implies that $d_{G_3}(v)$ and $|N_2(G_3, v)|$ are both bounded by $n^{3/4}$. Then, as we already showed in Case 1, the total number of walks of length two starting at $v$ is bounded by $d_{G_3}(v) + 2|N_2(G_3, v)| \le 3n^{3/4}$. Thus by (V) we get $\lambda_1(G_3) \le \sqrt{3n^{3/4}} = o(\sqrt{\Delta})$.

Let $u$ be a vertex of $V_3 \cup V_4$ which has at least $2n^{2/5}$ neighbors in the set $V_4$. Since every vertex in $V_4$ has at least one neighbor in $V_1$ we obtain that there are at least $2n^{2/5}$ simple paths of length two from $u$ to $V_1$. On the other hand we know that there are at most two such paths between any pair of distinct vertices. This implies that $u$ has at least $n^{2/5}$ vertices within distance two whose degree is at least $n^{2/3}$. Using Lemma 2.2 (i) with $a = 2/5$ and $b = 2/3$ we conclude that almost surely there is no such vertex $u$. Now let $G_4$ be a subgraph induced by the set $V_4$ and let $G_5$ be the bipartite graph with parts $(V_3, V_4)$. By the above discussion, the maximum degree of $G_4$ is at most $2n^{2/5}$, implying $\lambda_1(G_4) \le 2n^{2/5} = o(\sqrt{\Delta})$. We also know that every vertex from $V_3$ has at most $2n^{2/5}$ neighbors in $V_4$ and every vertex in $V_4$ has at most $np + np/\log n \le n^{5/9}$ neighbors in $V_3$. Therefore by (IV) we obtain that $\lambda_1(G_5) \le \sqrt{2n^{2/5} n^{5/9}} = \sqrt{2}\, n^{43/90} = o(\sqrt{\Delta})$.

Finally consider the bipartite subgraph $G_6$ whose parts are $V_2$ and $V_3 \cup V_4$. Let $X$ be the set of vertices from $V_3 \cup V_4$ with at least $2n^{2/7}$ neighbors in $V_2$ and let $Y = V_3 \cup V_4 - X$. Note that $G_6 = G_6' \cup G_6''$ where $G_6'$ is a bipartite graph with parts $(V_2, X)$ and $G_6''$


is a bipartite graph with parts $(V_2, Y)$. The upper bound on $\lambda_1(G_6'')$ follows immediately from the facts that $G_6''$ is bipartite, the degree of vertices in $V_2$ is bounded by $n^{2/3}$ and, by definition, every vertex in $Y$ has at most $2n^{2/7}$ neighbors in $V_2$. Therefore $\lambda_1(G_6'') \le \sqrt{2n^{2/7} n^{2/3}} = \sqrt{2}\, n^{10/21} = o(\sqrt{\Delta})$. To bound $\lambda_1(G_6')$, note that almost surely every vertex in $V_2$ has at most $n^{3/7}$ neighbors in $X$. Indeed, let $u \in V_2$ be a vertex with more than $n^{3/7}$ neighbors in $X$. Since every neighbor of $u$ in $X$ has at least $2n^{2/7} - 1$ additional neighbors in $V_2$ different from $u$ we obtain that there are at least $(2n^{2/7} - 1) n^{3/7} = (2 + o(1)) n^{5/7}$ simple paths of length two from $u$ to $V_2$. On the other hand we know that there are at most two such paths between any pair of distinct vertices. This implies that $u$ has at least $(1 + o(1)) n^{5/7}$ vertices of $V_2$ within distance two. By definition, every vertex of $V_2$ has at least $np + np/\log n$ neighbors in $G$. Therefore using Lemma 2.2 (ii) with $a = 1/22$ we conclude that almost surely there is no such vertex $u$. Now the upper bound on $\lambda_1(G_6')$ can be obtained using that $G_6'$ is bipartite, the degree of vertices in $X$ is bounded by $np + np/\log n \le n^{5/9}$ and that every vertex in $V_2$ has at most $2n^{3/7}$ neighbors in $X$. Indeed, by (IV) $\lambda_1(G_6') \le \sqrt{2n^{3/7} n^{5/9}} = \sqrt{2}\, n^{31/63} = o(\sqrt{\Delta})$, and hence $\lambda_1(G_6) \le \lambda_1(G_6') + \lambda_1(G_6'') = o(\sqrt{\Delta})$.

From the above definitions it is easy to check that $G = \cup_{i=1}^{6} G_i$. Hence using our estimates on the largest eigenvalues of the graphs $G_i$ we obtain the desired upper bound on $\lambda_1(G)$, as follows:
$$\lambda_1(G) \le \lambda_1\big(G_1 \cup G_2\big) + \sum_{i \ge 3} \lambda_1(G_i) \le (1 + o(1)) \max\big(\sqrt{\Delta(G)}, np\big) + o\big(\sqrt{\Delta(G)}\big) = (1 + o(1)) \max\big(\sqrt{\Delta(G)}, np\big).$$
This completes the proof of the third case. Now to finish the proof of the theorem it remains to deal with the last very simple case when the random graph is very sparse.

Case 4. Let $p \le e^{-\log^4 n}$. For every integer $k \ge 1$ denote by $Y_k$ the number of connected components with $k$ edges. It is not difficult to see that $E Y_k \le O(2^n k!\, n^k p^k)$. Indeed, we can pick the first vertex in the connected component in $2^n$ ways. Suppose we already know the first $1 \le t \le k$ vertices of the component. Then these vertices are incident to at most $tn$ edges of the n-cube and therefore we can pick the next edge in at most $tn$ ways. This gives at most $2^n \prod_{t=1}^{k} tn = 2^n k!\, n^k$ ways to pick the edges of the connected component.

First consider the case when $p$ is not exponentially small. Then, by Lemma 2.1 we have that almost surely $\Delta(G) = (1 + o(1)) \kappa(n)$, where $\kappa(n) = \max\big\{k : 2^n \binom{n}{k} p^k (1 - p)^{n-k} \ge 1\big\}$ and $\kappa(n)$ tends to infinity together with $n$. Let $k_0 = \kappa(n) + \kappa(n)/\log \kappa(n)$. Then it is easy to check that $E Y_{k_0} = o(1)$ and therefore, by Markov's inequality, almost surely $G(Q_n, p)$ contains no connected component with more than $k_0$ edges. Since the largest eigenvalue of $G$ is the maximum of the eigenvalues of its connected components and the largest eigenvalue of a component with $k_0$ edges is not greater than $\sqrt{k_0}$ (see parts (II) and (III) of Proposition 3.1), we obtain that
$$\lambda_1(G) \le \sqrt{k_0} \le \sqrt{\kappa(n) + \kappa(n)/\log \kappa(n)} = (1 + o(1)) \sqrt{\Delta(G)}.$$

Next, let $p \le 2^{-\alpha n}$ for some fixed $\alpha > 0$. If $p$ is not proportional to $2^{-n/k} n^{-1}$, $k = 1, 2, 3, \dots$, then it follows from part (iii) of Lemma 2.1 that with probability going to


one the maximum degree of $G(Q_n, p)$ is $\kappa(n) = \frac{n \log 2}{\log(p^{-1}) - \log n}$. Note that in this case $\kappa(n)$ is a constant and it is easy to check that $E Y_{\kappa(n)+1} \le O\big(2^n n^{\kappa(n)+1} p^{\kappa(n)+1}\big) = o(1)$. Thus, by Markov's inequality, there are no connected components with more than $\kappa(n)$ edges. Since the largest eigenvalue of $G$ is the maximum of the eigenvalues of its connected components and the largest eigenvalue of a component with $k$ edges is not greater than $\sqrt{k}$ (and is equal to $\sqrt{k}$ only if the component is a star on $k + 1$ vertices), we obtain that a.s. $\lambda_1(G) = \sqrt{\kappa(n)} = \sqrt{\Delta(G)}$.

Finally if $p(n)$ is proportional to $2^{-n/k} n^{-1}$, $k = 1, 2, 3, \dots$, then by part (ii) of Lemma 2.1 almost surely $\Delta(G) \in \{k - 1, k\}$ and again one can check that $E Y_{k+1}$ is exponentially small. Using Markov's inequality, as before, we conclude that there are no connected components with more than $k$ edges. Therefore a.s. $\lambda_1(G)$ is either $\sqrt{\Delta(G)}$ or $\sqrt{\Delta(G) + 1}$. This completes the proof of the theorem.

4. Concluding Remarks

There are several other important questions that are beyond the reach of the presented technique. The most fundamental is perhaps the local statistics of the eigenvalues, in particular the local statistics near the edge of the spectrum. For results in this direction for other random matrix models we refer the reader to [17, 18, 16]. A recent result of Alon, Krivelevich and Vu [3] states that the deviation of the first, second, etc. largest eigenvalues from their means is at most of order $O(1)$. Unfortunately our results give only the leading term of the mean. A second, perhaps even more difficult question is whether the local behavior of the eigenvalues is sensitive to the details of the distribution of the matrix entries of $A$. We refer the reader to [15, 6, 16, 10] for results of that nature for unitary invariant and Wigner random matrices.

Acknowledgement. The first author would like to thank Sergey Gavrilets and Janko Gravner for bringing this problem to his attention and for useful discussions.

References
1. Alon, N., Spencer, J.: The Probabilistic Method. 2nd ed. New York: Wiley, 2000
2. Ajtai, M., Komlós, J., Szemerédi, E.: Largest random component of a k-cube. Combinatorica 2(1), 1–7 (1982)
3. Alon, N., Krivelevich, M., Vu, V.H.: On the concentration of eigenvalues of random symmetric matrices. Israel J. Math. 131, 259–267 (2002)
4. Bollobás, B.: Random Graphs. 2nd ed. New York: Cambridge University Press, 2001
5. Burtin, Yu.: The probability of connectedness of a random graph (in Russian). Problemy Peredaci Informacii 13(2), 90–99 (1977)
6. Deift, P.: Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach. Courant Lecture Notes in Mathematics 3. New York: Courant Institute, 1999
7. Gavrilets, S., Gravner, J.: Percolation on the fitness hypercube and the evolution of reproductive isolation. J. Theor. Biol. 184(1), 51–64 (1997)
8. Erdős, P., Spencer, J.: Evolution of the n-cube. Comput. Math. Appl. 5(1), 33–39 (1979)
9. Janson, S., Łuczak, T., Ruciński, A.: Random Graphs. New York: Wiley, 2000
10. Johansson, K.: Universality of the local spacing distribution in certain Hermitian Wigner matrices. Commun. Math. Phys. 215, 683–705 (2001)
11. Krivelevich, M., Sudakov, B.: The largest eigenvalue of sparse random graphs. Combinatorics, Probability and Computing 12, 61–72 (2003)
12. Lovász, L.: Combinatorial Problems and Exercises. Amsterdam: North Holland, 1993
13. van Nimwegen, E., Crutchfield, J.P., Huynen, M.: Neutral evolution of mutational robustness. P. Natl. Acad. Sci. USA 96(17), 9716–9720 (1999)
14. van Nimwegen, E., Crutchfield, J.P.: Metastable evolutionary dynamics: Crossing fitness barriers or escaping via neutral paths? B. Math. Biol. 62(5), 799–848 (2000)
15. Pastur, L., Shcherbina, M.: Universality of the local eigenvalue statistics for a class of unitary invariant random matrix ensembles. J. Stat. Phys. 86, 109–147 (1997)
16. Soshnikov, A.: Universality at the edge of the spectrum in Wigner random matrices. Commun. Math. Phys. 207, 697–733 (1999)
17. Tracy, C.A., Widom, H.: Level-spacing distributions and the Airy kernel. Commun. Math. Phys. 159, 151–174 (1994)
18. Tracy, C.A., Widom, H.: On orthogonal and symplectic matrix ensembles. Commun. Math. Phys. 177, 727–754 (1996)

Communicated by P. Sarnak

Commun. Math. Phys. 239, 65–92 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0866-9


Lyapunov Functionals and $L^1$-Stability for Discrete Velocity Boltzmann Equations

Seung-Yeal Ha (1), Athanasios E. Tzavaras (1, 2)

(1) Department of Mathematics, University of Wisconsin-Madison, Madison, WI 53706, USA. E-mail: [email protected]; [email protected]
(2) Institute for Applied and Computational Mathematics, FORTH (P.O. Box 1527), 71110 Heraklion, Crete, Greece

Received: 18 October 2002 / Accepted: 11 February 2003 Published online: 28 May 2003 – © Springer-Verlag 2003

Abstract: We devise Lyapunov functionals and prove uniform $L^1$ stability for one-dimensional semilinear hyperbolic systems with quadratic nonlinear source terms. These systems encompass a class of discrete velocity models for the Boltzmann equation. The Lyapunov functionals are equivalent to the $L^1$ distance between two weak solutions and are non-increasing in time. They result from computations of two point interactions in the phase space. For certain models with only transversal collisional terms there exist generalizations for three and multi-point interactions.

1. Introduction

In this article we devise Lyapunov functionals and prove uniform $L^1$ stability for the Cauchy problem for semilinear hyperbolic systems with quadratic source terms,
$$\partial_t f_i + v_i \partial_x f_i = \sum_{j,k=1}^{N} B_i^{jk} f_j f_k, \qquad f_i(x, 0) = f_i^0(x). \tag{1.1}$$
This system encompasses certain one-space dimensional discrete velocity models in the kinetic theory of gases (see Sect. 2). In this context, $f_i(x, t)$ stands for the number of particles moving with velocity $v_i$, $i = 1, \dots, N$, and $(x, t) \in \mathbb{R} \times \mathbb{R}_+$. The collision operator is of the general form
$$Q_i(f) = \sum_{j,k=1}^{N} B_i^{jk} f_j f_k$$
and the system is assumed strictly hyperbolic: $v_i \ne v_j$ for $i \ne j$. Precise assumptions on the interaction coefficients $B_i^{jk}$ will be placed in the sequel. We are interested in positive weak solutions of (1.1) of class $L^1$.


The study of discrete velocity approximations of the Boltzmann collision operator goes back to works of Carleman, Broadwell [10], Gatignol [15]. In the one-space dimensional context, there exist a number of global existence results for small, [23], or large $L^1$-data, [11, 25, 26], as well as studies concerning the asymptotic behavior, [1, 2], and uniform bounds for solutions emanating from $L^1 \cap L^\infty$ data, [3, 2], and results for initial-boundary value problems [8]. For global existence results in several space variables we refer to [4, 21, 18] and the survey article [19]. Bony [3] introduced the following functional in the theory of discrete velocity models:
$$Q(t) = \sum_{m,n} \int_{\mathbb{R}} \int_{\mathbb{R}} (v_m - v_n)\, \mathrm{sgn}(y - x)\, |f_m(x, t)|\, |f_n(y, t)|\, dy\, dx, \tag{1.2}$$

where $\mathrm{sgn}(x)$ equals $-1$ for $x < 0$, $0$ for $x = 0$ and $+1$ for $x > 0$. Study of the evolution of $Q$ leads to uniform integrability of the transversal source terms in space-time, which plays a central role in the existence and asymptotic analysis of [3]. A continuous version of Bony's functional has been proposed by Cercignani [12, 14] for the full Boltzmann equation with a truncated collision kernel. The functional $Q$ measures the potential interactions between particles with different velocities, and has some similarities with the potential of the interaction functional introduced by Glimm [16] for the study of quasilinear hyperbolic systems. (It also has the important difference that $Q$ is not positive.)

The objective of the present work is to introduce a new functional measuring the $L^1$ distance between two weak solutions $f$ and $\bar{f}$ of (1.1). This functional reduces to the Bony functional (1.2) when one of $f$ or $\bar{f}$ is zero, and has certain analogies to the Liu-Yang functional [22] that was recently devised for the stability of small BV solutions for systems of conservation laws (see also Bressan-Liu-Yang [9], Hu-LeFloch [17]). The functional provides information regarding the long-time response of solutions, and will be used, in particular, to establish uniform $L^1$-stability for solutions to (1.1) of small $L^1$-mass. For the special case of the Broadwell system, there is an alternative functional, consisting of only positive terms and accounting only for forward interactions, and the smallness assumption can be precisely quantified.

We proceed to explain the results. Consider first (1.1) under the structural hypotheses:

1. $B_i^{jk}$ satisfy symmetry, sign and boundedness conditions:
$$B_i^{jk} = B_i^{kj}, \qquad B_i^{jk} \le 0 \ \text{if } j = i \text{ or } k = i, \qquad B_i^{jk} \ge 0 \ \text{if } j \ne i \text{ and } k \ne i, \qquad |B_i^{jk}| \le B^*, \ \text{for some positive constant } B^*. \tag{1.3}$$

2. The system is strictly hyperbolic,
$$v_1 < v_2 < \dots < v_N. \tag{1.4}$$

3. Conservation of mass and momentum: There exist weights $\nu_i \ge 1$ $(i = 1, \dots, N)$ such that
$$\sum_{i=1}^{N} \nu_i B_i^{jk} = 0, \qquad \sum_{i=1}^{N} v_i \nu_i B_i^{jk} = 0, \qquad \text{for fixed } j, k. \tag{1.5}$$

Lyapunov Functionals and L1 -Stability for Discrete Velocity


4. Any existing quadratic interactions are assumed to satisfy: if $B_i^{ii} < 0$, then there is a sequence of indices $i = i_1, i_2, \dots, i_r$ so that

\[ B_{i_{k+1}}^{i_k i_k} > 0 \ \text{for } k = 1, \dots, r-1, \qquad B_{i_r}^{i_r i_r} = 0. \tag{1.6} \]

In Sect. 2 we discuss how models of the kinetic theory of gases fit into the above framework. For the moment, note that (1.5) reflects conservation of mass and momentum and that certain non-strictly hyperbolic models of kinetic theory can be accommodated by the above assumptions. This is due to the presence of the weights $\nu_i$; their role is discussed in Sect. 2. Finally, the structural hypothesis (1.6) is always satisfied for kinetic theory models.

To outline the approach, let $L(t)$ be the $L^1$-distance between two solutions $f$ and $\bar f$,

\[ L(t) \equiv \sum_{m} \int_{\mathbb{R}} \nu_m\,|f_m(x,t) - \bar f_m(x,t)|\,dx, \]

and let $\delta_i$ stand for $\delta_i(x,t) = \operatorname{sgn}(f_i(x,t) - \bar f_i(x,t))$. A direct calculation (see Proposition 3.2) shows that the time derivative of $L$,

\[ \frac{dL(t)}{dt} + \int_{\mathbb{R}} \sum_{n,i,\ n \ne i} \nu_i B_i^{nn} \Bigl(1 - \frac{\delta_i}{\delta_n}\Bigr) |f_n - \bar f_n|(f_n + \bar f_n)\,dx \le O(1) \sum_{m,n,\ m \ne n} \int_{\mathbb{R}} |f_m - \bar f_m|(f_n + \bar f_n)\,dx, \tag{1.7} \]

consists of two types of terms: (i) Terms accounting for interactions among particles moving with the same velocity; these appear to the left of the inequality and turn out to be positive. Such a property holds for the much simpler class of contractive relaxation systems (e.g. [20, 28]) and it is remarkable that, due to the conservation laws in (1.5), it holds also for discrete Boltzmann type operators. (ii) Terms that are due to interactions between transversally moving particles, which contribute the term to the right of the inequality in (1.7).

To control the terms on the right, we introduce a quadratic functional $Q_d(t)$ of the form

\[ Q_d(t) = \sum_{m,n} \int_{\mathbb{R}} \int_{\mathbb{R}} \operatorname{sgn}(y - x)\,(v_m - v_n)\,\nu_m \nu_n\,|f_m - \bar f_m|(x,t)\,(f_n + \bar f_n)(y,t)\,dx\,dy. \tag{1.8} \]

$Q_d$ accounts for the potential of (forward and backward) interactions of transversally moving particles between $f$ or $\bar f$ and the difference $|f - \bar f|$. A second calculation (Proposition 3.2) shows

\[ \frac{dQ_d(t)}{dt} + c\,\Lambda_d(f, \bar f)(t) \le C\,\|f + \bar f\|_{L^1(\mathbb{R})} \bigl( \Lambda_s(f, \bar f)(t) + \Lambda_d(f, \bar f)(t) \bigr), \tag{1.9} \]

where $c$, $C$ are positive constants and

\[ \Lambda_d(f, \bar f)(t) = \sum_{m,n,\ m \ne n} \int_{\mathbb{R}} \nu_m \nu_n\,|f_m - \bar f_m|(x)\,(f_n + \bar f_n)(x)\,dx, \]

\[ \Lambda_s(f, \bar f)(t) = \sum_{m,n,\ m \ne n} \int_{\mathbb{R}} \nu_m B_m^{nn} \Bigl(1 - \frac{\delta_m}{\delta_n}\Bigr) |f_n - \bar f_n|(f_n + \bar f_n)\,dx. \]

Setting $H(t) = L(t) + K Q_d(t)$, we have the following theorem:


S.-Y. Ha, A.E. Tzavaras

Theorem 1.1. Suppose the system (1.1) satisfies the assumptions (1.3)–(1.6) and let $f, \bar f \in C(\mathbb{R}_+; (L^1(\mathbb{R}))^N)$ be two mild solutions of (1.1) corresponding to initial data

\[ f_0, \bar f_0 \ge 0, \qquad f_0, \bar f_0 \in \bigl((L^\infty \cap L^1)(\mathbb{R})\bigr)^N \quad \text{such that} \quad \|f_0\|_{L^1(\mathbb{R})} \ll 1 \ \text{and} \ \|\bar f_0\|_{L^1(\mathbb{R})} \ll 1. \]

Then, for an appropriate choice of $K$, the functional $H(t) = L(t) + K Q_d(t)$ is equivalent to the $L^1$ distance between $f$ and $\bar f$ and satisfies

\[ \frac{dH(t)}{dt} + c \bigl( \Lambda_d(f, \bar f)(t) + \Lambda_s(f, \bar f)(t) \bigr) \le 0, \tag{1.10} \]

for some positive constant $c$. Moreover,

\[ \|f(\cdot,t) - \bar f(\cdot,t)\|_{L^1(\mathbb{R})} \le C\,\|f_0(\cdot) - \bar f_0(\cdot)\|_{L^1(\mathbb{R})}, \]

where $C$ is a positive constant independent of time $t$.

The theorem above is a statement on the asymptotic response of (1.1). (Note that, for quadratic models (1.1), given $L^\infty$ bounds it is easy to establish $L^1$-stability estimates with a constant that depends on time. $L^\infty$ bounds are available for data of class $L^1 \cap L^\infty$, see [2, 4].) The functional inequality (1.10) is valid for data of small initial $L^1$-mass, while the functional $H(t)$ has the properties

1. $H$ is equivalent to the $L^1$-distance, i.e., for some constant $C_0 > 0$,
\[ \frac{1}{C_0}\,\|f(\cdot,t) - \bar f(\cdot,t)\|_{L^1(\mathbb{R})} \le H(t) \le C_0\,\|f(\cdot,t) - \bar f(\cdot,t)\|_{L^1(\mathbb{R})}. \]
2. $H$ is non-increasing in time $t$.

The functionals in (1.2), (1.8) account for both forward and backward interactions of transversally moving particles. In this sense they are different from the Glimm functional or the Liu-Yang functional, as the latter compute the potential of only forward interactions. The functional $Q_d(t)$ is not positive, and thus not a Lyapunov functional itself. For certain special systems, such as the Broadwell model

\[ \begin{aligned} \partial_t f_1 - \partial_x f_1 &= f_2^2 - f_1 f_3, \\ \partial_t f_2 &= -\tfrac{1}{2}(f_2^2 - f_1 f_3), \\ \partial_t f_3 + \partial_x f_3 &= f_2^2 - f_1 f_3, \end{aligned} \tag{1.11} \]

or models with transversal interactions ($B_i^{kl} = 0$ for $k, l$ with $v_k = v_l$), it is possible to define alternative functionals, where the interaction potential is positive and accounts for only forward interactions. For the Broadwell system, an interaction potential $Q_d(t)$ built on the indicator $\mathbf{1}_{x<y}$ gives rise to a sharper Lyapunov inequality (see Sect. 3.2), and yields the uniform $L^1$ stability Theorem 3.1, where the smallness assumption on the data is quantified.

Finally, there exist some interesting models with wave speeds $v_i = v_i(x,t)$. The linearization of conservation laws leads for instance to such models in divergence form, and an analog of the Glimm functional is then constructed by Schatzman [24]. Consider

\[ \partial_t f_i + \partial_x \bigl( v_i(x,t) f_i \bigr) = Q_i(f), \qquad f_i(x,0) = f_{i0}(x), \tag{1.12} \]

where $Q_i(f) = \sum_{j,k} B_i^{jk}(x,t) f_j f_k$, $i = 1, \dots, N$, and (1.12) is strictly hyperbolic in the sense that $v_i(x,t)$ are real-valued $C^1$ functions satisfying $v_i(x,t) < v_j(x,t)$ for all $(x,t) \in \mathbb{R} \times \mathbb{R}_+$ and $i < j$. Analogous estimates are then established for (1.12) provided the source consists of only transversal terms. In return, one can relax the assumptions on signs of the data and collision coefficients, and no conservation law is assumed for the source.

Theorem 1.2. Suppose that the coefficients in (1.12) satisfy hypotheses (3.39)–(3.40), and let $f, \bar f \in C(\mathbb{R}_+; (L^1(\mathbb{R}))^N)$ be two mild solutions corresponding to initial data

\[ f_0, \bar f_0 \in \bigl((L^\infty_+ \cap L^1)(\mathbb{R})\bigr)^N \quad \text{with} \quad \|f_0\|_{L^1(\mathbb{R})} \ll 1 \ \text{and} \ \|\bar f_0\|_{L^1(\mathbb{R})} \ll 1. \]

Then $f$, $\bar f$ satisfy the inequalities (3.56)–(3.58) and

\[ \|f(\cdot,t) - \bar f(\cdot,t)\|_{L^1(\mathbb{R})} \le C\,\|f_0(\cdot) - \bar f_0(\cdot)\|_{L^1(\mathbb{R})}, \]

where $C$ is a positive constant independent of time $t$.

The functionals $Q$ and $Q_d$ consist of sums of two-point distribution functions in the phase space. For the system (1.12), we introduce in Sect. 3.3 functionals consisting of three-point and multi-point distribution functions (see (3.52), (3.53)) and obtain various Lyapunov inequalities.

This paper is organized as follows. In Sect. 2, we review the basics of a one-dimensional discrete model for the Boltzmann equation and give an outline of the global existence theory in $L^1 \cap L^\infty$. In Sect. 3, we explicitly construct the nonlinear functionals, study their time-variation, and finally prove $L^1$ stability. This is done consecutively, first for general discrete velocity models (1.1), then for the Broadwell system (1.11), and finally for semilinear quadratic systems with transversal terms (1.12).

2. Preliminaries

2.1. Discrete velocity Boltzmann equations. We present a review of the basics for discrete velocity Boltzmann equations and refer to [10, 15, 11, 21, 19, 2] for further details. The discretization of the velocity space in the kinetic theory of gases allows one to replace the Boltzmann equation by a system of semilinear hyperbolic equations. There is a set of preselected velocities $V_1, \dots, V_N \in \mathbb{R}^3$ and a set of admitted binary collisions $(k, l) \to (i, j)$. The pre-collisional velocities $V_k, V_l$ and post-collisional velocities $V_i, V_j$ satisfy microscopic conservation laws of mass, momentum and energy,

\[ V_i + V_j = V_k + V_l, \qquad |V_i|^2 + |V_j|^2 = |V_k|^2 + |V_l|^2. \tag{2.1} \]

The interaction coefficients $A_{ij}^{kl}$ are positive constants measuring the relative strengths of the collisions $(k, l) \to (i, j)$; if a collision $(k, l) \to (i, j)$ does not occur one sets $A_{ij}^{kl} = 0$. Typical assumptions for the interaction coefficients are symmetry

\[ A_{ij}^{kl} = A_{ij}^{lk} = A_{ji}^{kl}, \tag{2.2} \]

and microreversibility (or detailed balance)

\[ A_{ij}^{kl} = A_{kl}^{ij}. \tag{2.3} \]

The latter is sometimes relaxed to semi-detailed balance (see [15]) but we will not insist on that here. The kinetic function $f_i(X,t)$ describes the density of particles at the point $(X,t) \in \mathbb{R}^3 \times \mathbb{R}$ moving with velocity $V_i$ and is governed by the discrete velocity Boltzmann equation

\[ \partial_t f_i + V_i \cdot \nabla_X f_i = \sum_{j,k,l} \bigl( A_{ij}^{kl} f_k f_l - A_{kl}^{ij} f_i f_j \bigr), \tag{2.4} \]

for $i = 1, \dots, N$. Next, we briefly review some properties of the collision operator

\[ Q_i(f) := \sum_{j,k,l} \bigl( A_{ij}^{kl} f_k f_l - A_{kl}^{ij} f_i f_j \bigr), \qquad i = 1, \dots, N. \]

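Let $\varphi : \mathbb{R}^3 \to \mathbb{R}$ be any measurable function. Then we have

\[ \partial_t \sum_i \varphi(V_i) f_i + \operatorname{div}_X \sum_i V_i\,\varphi(V_i) f_i = \sum_{i,j,k,l} \varphi(V_i) \bigl( A_{ij}^{kl} f_k f_l - A_{kl}^{ij} f_i f_j \bigr). \]

In view of (2.2) and (2.3) the right-hand side may be rearranged as

\[ \begin{aligned} \sum_{i,j,k,l} \varphi(V_i) \bigl( A_{ij}^{kl} f_k f_l - A_{kl}^{ij} f_i f_j \bigr) &= \frac{1}{4} \sum_{i,j,k,l} \bigl( \varphi(V_i) + \varphi(V_j) - \varphi(V_k) - \varphi(V_l) \bigr) \bigl( A_{ij}^{kl} f_k f_l - A_{kl}^{ij} f_i f_j \bigr) \\ &= \frac{1}{2} \sum_{i,j,k,l} \bigl( \varphi(V_i) + \varphi(V_j) - \varphi(V_k) - \varphi(V_l) \bigr) A_{ij}^{kl} f_k f_l \\ &= \frac{1}{4} \sum_{i,j,k,l} \bigl( \varphi(V_i) + \varphi(V_j) - \varphi(V_k) - \varphi(V_l) \bigr) A_{ij}^{kl} (f_k f_l - f_i f_j). \end{aligned} \tag{2.5} \]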
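The rearrangement (2.5) can be checked numerically. The following is a minimal sketch, not part of the original argument: it draws random coefficients, symmetrizes them so that (2.2) and (2.3) hold, and compares the first and last expressions in (2.5). The array layout `a[i, j, k, l]` for $A_{ij}^{kl}$ and all function names are assumptions made for illustration; the random coefficients do not encode any particular collision model.

```python
import numpy as np

def symmetrized_coefficients(n, rng):
    """Random a[i, j, k, l] ~ A_{ij}^{kl} obeying (2.2) and (2.3) (illustrative only)."""
    a = rng.random((n, n, n, n))
    a = a + a.transpose(1, 0, 2, 3)   # symmetry in the lower pair (i, j)
    a = a + a.transpose(0, 1, 3, 2)   # symmetry in the upper pair (k, l)
    a = a + a.transpose(2, 3, 0, 1)   # microreversibility: (ij) <-> (kl)
    return a

def moment_of_collision_operator(a, f, phi):
    """First expression in (2.5): sum_i phi_i Q_i(f)."""
    gain = np.einsum("i,ijkl,k,l->", phi, a, f, f)   # phi_i A_{ij}^{kl} f_k f_l
    loss = np.einsum("i,klij,i,j->", phi, a, f, f)   # phi_i A_{kl}^{ij} f_i f_j
    return gain - loss

def symmetrized_moment(a, f, phi):
    """Last expression in (2.5)."""
    d = (phi[:, None, None, None] + phi[None, :, None, None]
         - phi[None, None, :, None] - phi[None, None, None, :])
    ff = np.outer(f, f)
    p = ff[None, None, :, :] - ff[:, :, None, None]   # f_k f_l - f_i f_j
    return 0.25 * np.sum(d * a * p)

rng = np.random.default_rng(0)
a = symmetrized_coefficients(4, rng)
f = rng.random(4)
phi = rng.standard_normal(4)
print(moment_of_collision_operator(a, f, phi), symmetrized_moment(a, f, phi))
```

With $\varphi \equiv 1$ the symmetrized form vanishes term by term, which is the discrete statement of mass conservation.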

For the choice of $\varphi(V)$ equal to one of the collisional invariants $1$, $V^1$, $V^2$, $V^3$ or $|V|^2$, solutions of (2.4) satisfy macroscopic conservation laws of mass, momentum and energy. Moreover, by multiplying (2.4) by $1 + \log f_i$, we have the H-theorem

\[ \begin{aligned} \partial_t \sum_i f_i \log f_i + \operatorname{div}_X \sum_i V_i f_i \log f_i &= \sum_{i,j,k,l} A_{ij}^{kl} (\log f_i)(f_k f_l - f_i f_j) \\ &= \frac{1}{4} \sum_{i,j,k,l} A_{ij}^{kl} (\log f_i + \log f_j - \log f_k - \log f_l)(f_k f_l - f_i f_j) \\ &= -\frac{1}{4} \sum_{i,j,k,l} A_{ij}^{kl} \log\Bigl( \frac{f_k f_l}{f_i f_j} \Bigr) (f_k f_l - f_i f_j) \le 0. \end{aligned} \tag{2.6} \]


We consider now the description of one-dimensional motions of a dilute gas. Let $D \in \mathbb{R}^3$ be the direction of motion and consider the ansatz

\[ f_i(X,t) = \bar f_i(D \cdot X, t), \qquad x = D \cdot X. \]

Then $\bar f(x,t)$ satisfies a system of the form

\[ \partial_t f_i + v_i \partial_x f_i = \sum_{j,k,l} A_{ij}^{kl} (f_k f_l - f_i f_j), \tag{2.7} \]

where the projected velocities $v_i = V_i \cdot D$ satisfy microscopic conservation of mass and momentum

\[ v_i + v_j = v_k + v_l, \tag{2.8} \]

but not, in general, microscopic conservation of energy. In view of (2.8), $\bar f$ satisfies macroscopic conservation of mass and momentum (in the direction of motion)

\[ \partial_t \sum_i f_i + \partial_x \sum_i v_i f_i = 0, \qquad \partial_t \sum_i v_i f_i + \partial_x \sum_i v_i^2 f_i = 0, \tag{2.9} \]

and the H-theorem

\[ \partial_t \sum_i f_i \log f_i + \partial_x \sum_i v_i f_i \log f_i + \frac{1}{4} \sum_{i,j,k,l} A_{ij}^{kl} \log\Bigl( \frac{f_k f_l}{f_i f_j} \Bigr) (f_k f_l - f_i f_j) = 0. \tag{2.10} \]

From the viewpoint that the one-dimensional model (2.7) describes a one-dimensional motion of a three-dimensional discrete velocity model (2.4), the one-dimensional model does not have to satisfy conservation of energy and the system does not need to be strictly hyperbolic, even if the original system has distinct velocities. The loss of strict hyperbolicity causes various difficulties with the types of estimates pursued here. Following Beale [2], we wish to reduce the system by combining the densities $f_i$ and $f_{i'}$ for which the projected velocities $v_i$ and $v_{i'}$ coincide, $v_i \equiv v_{i'}$. The equations for $f_i$, $f_{i'}$ read

\[ \partial_t f_i + v_i \partial_x f_i = Q_i(f) = \sum_{j,k,l} A_{ij}^{kl} (f_k f_l - f_i f_j), \tag{2.11} \]

\[ \partial_t f_{i'} + v_{i'} \partial_x f_{i'} = Q_{i'}(f) = \sum_{j',k',l'} A_{i'j'}^{k'l'} (f_{k'} f_{l'} - f_{i'} f_{j'}). \tag{2.12} \]

We place structural hypotheses on the system so that, when for two indices $i$ and $i'$ the projected velocities coincide, $v_i \equiv v_{i'}$, we can identify the corresponding collision operators, $Q_i(f) \equiv Q_{i'}(f)$, so that Eqs. (2.11) and (2.12) coincide. This dictates certain restrictions on the interaction coefficients $A_{ij}^{kl}$; see [2] for the precise hypotheses. Then, if the initial data are the same for particles moving with the same velocities, we can identify the densities $f_i$ and $f_{i'}$ and replace them by one equation that is counted $\nu_i$ times (where $\nu_i$ is the number of projected velocities that coincide with the velocity $v_i$). The system (2.7) can be put into the form of (1.1),

\[ \partial_t f_i + v_i \partial_x f_i = \sum_{j,k=1}^{N} B_i^{jk} f_j f_k, \tag{2.13} \]

with

\[ B_i^{kl} = \sum_{j=1}^{N} A_{ij}^{kl} - \frac{1}{2} \sum_{m,n=1}^{N} A_{mn}^{kl}\,\delta_{ik} - \frac{1}{2} \sum_{m,n=1}^{N} A_{mn}^{kl}\,\delta_{li}, \]

and $\delta_{ik}$ is the Kronecker symbol. It is clear that (1.3) and (1.4) are satisfied. Hypothesis (1.5) reflects in this setting the conservation laws of mass and momentum. Most of the usual examples of discrete kinetic theory fit under the above framework.

Next we consider Hypothesis (1.6). For a kinetic theory model quadratic terms arise as follows:

• If $B_i^{jj} > 0$ with $i \ne j$ then there is a collision $(j, j') \to (i, i')$. For a nontrivial collision, (2.8) implies that either $v_i < v_j = v_{j'} < v_{i'}$ or $v_{i'} < v_j = v_{j'} < v_i$.
• If $B_i^{ii} < 0$ then there is a collision $(i, i') \to (j, j')$. For a nontrivial collision, (2.8) implies $v_j \ne v_{j'}$ and, if we denote by $v_j$ the smallest outgoing speed, $v_j < v_i = v_{i'} < v_{j'}$.

Suppose now that $B_i^{ii} < 0$ and set $i_1 = i$. Then there is a nontrivial collision $(i_1, i_1') \to (i_2, i_2')$ with $v_{i_2} < v_{i_1} < v_{i_2'}$. Consider the equation for the balance of $f_{i_2}$. There, the interaction coefficient $B_{i_2}^{i_1 i_1} > 0$. If $B_{i_2}^{i_2 i_2} = 0$ then we are finished and (1.6) is justified. If not, then $B_{i_2}^{i_2 i_2} < 0$ and there is a nontrivial collision $(i_2, i_2') \to (i_3, i_3')$ with $v_{i_3} < v_{i_2} < v_{i_3'}$. We consider the equation for the balance of $f_{i_3}$, note that $B_{i_3}^{i_2 i_2} > 0$ and repeat the previous step. Since the velocities $v_{i_k}$ are strictly decreasing in each step, the process necessarily terminates and (1.6) is justified.

A well-studied paradigm of a one-dimensional discrete velocity model is the one proposed by Broadwell [10]. The Broadwell model describes particles moving with a set of six velocities and colliding with equal probabilities. It reads

\[ \begin{aligned} \partial_t f_1^+ + \partial_x f_1^+ = \partial_t f_1^- - \partial_x f_1^- &= \tfrac{1}{2} (f_2^+ f_2^- + f_3^+ f_3^-) - f_1^+ f_1^-, \\ \partial_t f_2^+ + \partial_y f_2^+ = \partial_t f_2^- - \partial_y f_2^- &= \tfrac{1}{2} (f_1^+ f_1^- + f_3^+ f_3^-) - f_2^+ f_2^-, \\ \partial_t f_3^+ + \partial_z f_3^+ = \partial_t f_3^- - \partial_z f_3^- &= \tfrac{1}{2} (f_1^+ f_1^- + f_2^+ f_2^-) - f_3^+ f_3^-, \end{aligned} \]

where $f_1^\pm$, $f_2^\pm$ and $f_3^\pm$ are densities of particles moving with velocities $\pm 1$ in the direction of the $x$, $y$, and $z$ axes respectively. One then considers one-dimensional motions of particles, depending on $x$ but independent of the $y$ and $z$ coordinates, and under the ansatz $f_2^+ = f_2^- = f_3^+ = f_3^-$. If we set $f_1^+ = f_1$, $f_1^- = f_3$, $f_2^+ = f_2^- = f_3^+ = f_3^- = f_2$, then the six-velocity system above reduces to the one-dimensional system

\[ \begin{aligned} \partial_t f_1 - \partial_x f_1 &= f_2^2 - f_1 f_3, \\ \partial_t f_2 &= -\tfrac{1}{2} (f_2^2 - f_1 f_3), \\ \partial_t f_3 + \partial_x f_3 &= f_2^2 - f_1 f_3. \end{aligned} \]

It is easy to check that the one-dimensional Broadwell model is of the general form of the system (1.1) and satisfies (1.3)–(1.6).
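The claim that the one-dimensional Broadwell model satisfies (1.3)–(1.5) can be verified mechanically. The sketch below is illustrative only: it encodes the right-hand sides as a coefficient array `b[i, j, k]` for $B_i^{jk}$ and assumes the weights $\nu = (1, 4, 1)$, which are not stated at this point of the text but are consistent with $f_2$ standing for four identified densities. The function names are hypothetical.

```python
import numpy as np

V = np.array([-1.0, 0.0, 1.0])    # projected speeds v_1 < v_2 < v_3
NU = np.array([1.0, 4.0, 1.0])    # assumed weights: f_2 stands for four densities

def broadwell_coefficients():
    """b[i, j, k] ~ B_i^{jk}, so that sum_{j,k} b[i, j, k] f_j f_k is the i-th source."""
    q = np.zeros((3, 3))           # symmetric quadratic form of f_2^2 - f_1 f_3
    q[1, 1] = 1.0
    q[0, 2] = q[2, 0] = -0.5
    b = np.zeros((3, 3, 3))
    b[0], b[1], b[2] = q, -0.5 * q, q
    return b

def satisfies_structure(b, v, nu, tol=1e-12):
    """Check symmetry and signs (1.3), strict hyperbolicity (1.4), conservation (1.5)."""
    n = len(v)
    symmetric = np.allclose(b, b.transpose(0, 2, 1))
    signs = all(
        (b[i, j, k] <= tol) if (j == i or k == i) else (b[i, j, k] >= -tol)
        for i in range(n) for j in range(n) for k in range(n)
    )
    hyperbolic = bool(np.all(np.diff(v) > 0))
    mass = np.allclose(np.einsum("i,ijk->jk", nu, b), 0.0)
    momentum = np.allclose(np.einsum("i,ijk->jk", v * nu, b), 0.0)
    return bool(symmetric and signs and hyperbolic and mass and momentum)

print(satisfies_structure(broadwell_coefficients(), V, NU))
```

With unit weights the mass balance in (1.5) fails for this reduced system, which illustrates why the weights $\nu_i$ appear in the hypotheses.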

2.2. Existence theory. Next we discuss the existence theory for the Cauchy problem of (1.1). There are two avenues for defining weak solutions of (1.1). First, using the notion of mild solution:

Definition 2.1. $f = (f_1, \dots, f_N) \in C([0,T]; (L^1(\mathbb{R}))^N)$ is a mild solution of (1.1) with data $f_0 \in (L^1(\mathbb{R}))^N$ if $Q_i(f) \in L^1(\mathbb{R} \times [0,T])$ and for $t \in [0,T]$ and a.e. $x \in \mathbb{R}$, $f(x,t)$ satisfies the integral equation

\[ f_i(x,t) = f_{i0}(x - v_i t) + \int_0^t Q_i(f)(x - v_i (t - \tau), \tau)\,d\tau, \tag{2.14} \]

for $i = 1, \dots, N$.

A second possibility is to define $f$ as a weak solution:

Definition 2.2. $f = (f_1, \dots, f_N) \in C([0,T]; (L^1(\mathbb{R}))^N)$ is a weak solution of (1.1) with data $f_0 \in (L^1(\mathbb{R}))^N$ if $Q_i(f) \in L^1(\mathbb{R} \times [0,T])$ and for any test function $\varphi \in C_c^\infty(\mathbb{R} \times [0,\infty))$ we have

\[ \iint f_i \bigl( \partial_t \varphi + v_i \partial_x \varphi \bigr)\,dx\,dt + \int f_{i0}(x)\,\varphi(x,0)\,dx = - \iint Q_i(f)\,\varphi\,dx\,dt \tag{2.15} \]

for $i = 1, \dots, N$.

In fact, for solutions $f$ of class $C([0,T]; (L^1(\mathbb{R}))^N)$ with $Q_i(f) \in L^1(\mathbb{R} \times [0,T])$ the two notions of solution are equivalent. Obviously, a mild solution is also a weak solution. To see that a weak solution is also a mild solution, rewrite first (2.15) in the equivalent form

\[ \iint f_i(y + v_i \tau, \tau)\,(\partial_t \psi)(y, \tau)\,dy\,d\tau + \int f_{i0}(y)\,\psi(y, 0)\,dy = - \iint Q_i(f)(y + v_i \tau, \tau)\,\psi(y, \tau)\,dy\,d\tau, \]

where $\varphi(x,t) = \psi(x - v_i t, t)$. Fix $t > 0$, $\delta \in (0, t)$ and take the test function $\psi(y, \tau) = a(y)\,b_\delta(\tau)$, where $a \in C_c^\infty(\mathbb{R})$ and $b_\delta \in C_c^\infty([0,\infty))$ is selected so that it takes the value 1 on $[0, t - \delta]$, decreases linearly on $(t - \delta, t)$, and takes the value 0 on $(t, \infty)$. Taking the limit $\delta \to 0$, we deduce

\[ \int_{\mathbb{R}} f_i(y + v_i t, t)\,a(y)\,dy - \int_{\mathbb{R}} f_{i0}(y)\,a(y)\,dy = \int_{\mathbb{R}} \int_0^t Q_i(f)(y + v_i \tau, \tau)\,a(y)\,d\tau\,dy, \]

from which (2.14) follows. We state the main global existence result for (1.1).

Theorem 2.1 ([2, 3]). Suppose that (1.1) satisfies Hypotheses (1.3)–(1.6) and let $f_0 \ge 0$ with $f_0 \in \bigl( L^1(\mathbb{R}) \cap L^\infty_+(\mathbb{R}) \bigr)^N$. There exists a unique, nonnegative mild solution $f$ of (1.1) with

\[ f \in C\bigl([0,T]; (L^1(\mathbb{R}))^N\bigr) \cap \bigl( L^\infty(\mathbb{R} \times [0,T]) \bigr)^N \quad \text{for any } T > 0, \]

and $Q_i(f) \in L^1(\mathbb{R} \times \mathbb{R}_+)$, $i = 1, \dots, N$. Moreover, if $f_0$ is of class $C^1$, then $f$ is of class $C^1$ in $(x,t)$.

Variants of this theorem are proved by Tartar [26], Beale [2] and Bony [3]. The reader is also referred to [23, 25, 11, 21, 6, 7] for further existence and asymptotic behavior results.

Outline of the proof. It is instructive to outline the proof of Theorem 2.1 following the ideas of [3]. One starts with estimates for $C^1$ solutions, in terms of the $L^1$-norm $\mu := \sum_j \int_{\mathbb{R}} \nu_j f_{0j}(x)\,dx$ of the data.

Step 1. One first shows using Proposition 3.1 and (3.12) that the transversal terms satisfy the $L^1$-bound

\[ \int_0^\infty \int_{\mathbb{R}} f_m(x,t) f_n(x,t)\,dx\,dt \le C \mu^2 < \infty, \qquad \text{for } m \ne n. \tag{2.16} \]

Step 2. Next, it is shown that

\[ B_i^{jj} \ne 0 \quad \text{implies} \quad \int_0^\infty \int_{\mathbb{R}} f_j^2(x,t)\,dx\,dt \le C(\mu + \mu^2) < \infty. \tag{2.17} \]

Suppose that $B_i^{jj} \ne 0$ with $i \ne j$. Then $B_i^{jj} > 0$. Consider the balance equation for $f_i$. If $B_i^{ii} = 0$ then by integrating in space-time we conclude (2.17). If $B_i^{ii} < 0$ then using (1.6) we again conclude (2.17).

Step 3. The previous steps indicate that $Q_i(f) \in L^1([0,\infty) \times \mathbb{R})$. Bony [3] uses this fact to establish that solutions $f$ are in $L^\infty$, with an explicit bound depending on the $L^1$-mass of the data. We refer to [3] for the proof of the estimate.

Once these estimates are established for smooth solutions, the existence of mild solutions for data $f_0 \in L^1 \cap L^\infty$ follows by a standard density argument. Uniqueness for mild solutions of class $L^\infty$ is trivial. As weak and mild solutions coincide, uniqueness is inherited for the class of bounded weak solutions.


Since (1.1) has quadratic nonlinearities it is easy to see that

\[ \|f(\cdot,t) - \bar f(\cdot,t)\|_{L^1(\mathbb{R})} \le \|f_0 - \bar f_0\|_{L^1(\mathbb{R})}\,e^{Mt}, \]

where $M$ depends on $\max_k \|f_k(x,t)\|_{L^\infty(\mathbb{R} \times (0,t))}$. Thus the $L^\infty$-bounds imply $L^1$-stability with a constant depending exponentially on time.

The estimates (2.16)–(2.17) also give information on asymptotic behaviour. By integrating (1.1), we write

\[ f_i(y + v_i t, t) = f_{i0}(y) + \int_0^t Q_i(f)(y + v_i s, s)\,ds. \tag{2.18} \]

If we formally set $F_i^\infty(y) := f_{i0}(y) + \int_0^\infty Q_i(f)(y + v_i s, s)\,ds$, then

\[ \|f_i(\cdot + v_i t, t) - F_i^\infty\|_{L^1(\mathbb{R})} \le \int_t^\infty \int_{\mathbb{R}} |Q_i(f)|(y + v_i s, s)\,dy\,ds \to 0 \]

as $t \to \infty$ and thus $f_i(x,t) \to F_i^\infty(x - v_i t)$ in $L^1(\mathbb{R})$ and a.e. Hence, the leading term in the asymptotic response of $f_i$ is a traveling wave. If the collision model contains the quadratic interaction $(i, i) \to (j, j)$ then the coefficient $B_i^{ii} < 0$ and (2.17) implies that $\int_0^\infty \int_{\mathbb{R}} f_i^2\,dx\,dt < \infty$ and $F_i^\infty(y) = 0$ a.e. In other words, the leading traveling wave in the asymptotic behavior of a field which self-interacts is trivial. We refer to [2, 4, 5, 26, 27] for further results on asymptotic behavior.

3. Lyapunov Functionals and Uniform $L^1$ Stability Estimates

In this section, we construct nonlinear functionals which are equivalent to the $L^1$ distance and non-increasing in time. We begin in Sect. 3.1 with a general discrete velocity Boltzmann equation. Then, in Sect. 3.2, we specialize to the well-known Broadwell model. In this case, the proposed functional contains only the forward interaction potential and is thus in closer analogy to the spirit of the Glimm and Liu-Yang functionals. In Sect. 3.3, we take up systems that only have transversal source terms. We then allow for solutions that may be negative, calculate two-point, three-point or multi-point interactions, and establish more complicated Lyapunov type functionals.

3.1. General discrete velocity Boltzmann equations. We first consider the Cauchy problem for the discrete velocity Boltzmann equation

\[ \partial_t f_i + v_i \partial_x f_i = \sum_{j,k} B_i^{jk} f_j f_k, \qquad i = 1, \dots, N, \tag{3.1} \]

under Hypotheses (1.3), (1.4) and (1.5). Using the fact that $1$ and $v_i$ are collisional invariants, it follows from (1.5) that

\[ \partial_t \sum_{n=1}^{N} \nu_n f_n + \partial_x \sum_{n=1}^{N} v_n \nu_n f_n = 0, \tag{3.2} \]

\[ \partial_t \sum_{m=1}^{N} (v_m - v_n) \nu_m f_m + \partial_x \sum_{m=1}^{N} v_m (v_m - v_n) \nu_m f_m = 0. \tag{3.3} \]


Motivated by (3.2) and (3.3), Bony's functional for (3.1) is defined by

\[ \begin{aligned} Q(t) &= \sum_{m,n} \int_{\mathbb{R}} \int_{\mathbb{R}} \operatorname{sgn}(y - x)\,(v_m - v_n)\,\nu_m \nu_n\,f_m(x,t) f_n(y,t)\,dy\,dx \\ &= \sum_{m,n} \int_{\mathbb{R}} \int_{\mathbb{R}} \mathbf{1}_{x<y}\,(v_m - v_n)\,\nu_m \nu_n\,f_m(x,t) f_n(y,t)\,dy\,dx - \sum_{m,n} \int_{\mathbb{R}} \int_{\mathbb{R}} \mathbf{1}_{x>y}\,(v_m - v_n)\,\nu_m \nu_n\,f_m(x,t) f_n(y,t)\,dy\,dx \\ &=: I + II. \end{aligned} \tag{3.4} \]

From the conservation of mass and the positivity of solutions, it easily follows that

\[ |Q(t)| \le 2 \max_{m,n} |v_m - v_n| \Bigl( \int_{\mathbb{R}} \sum_m \nu_m f_{0m}(x)\,dx \Bigr)^2 < \infty. \tag{3.5} \]

We note that for $m, n$ such that $m > n$, $I$ and $II$ denote respectively the forward and backward interaction potentials between the waves traveling with speeds $v_m$ and $v_n$. By the choice of the weight $v_m - v_n$, this functional can be negative. For notational simplicity, we suppress from now on the $t$ dependence and write

\[ f(x,t) \equiv f(x), \qquad Q_n(f)(x,t) \equiv Q_n(x). \]

Define the instantaneous interaction production $\Lambda(f)$ by

\[ \Lambda(f)(t) = \sum_{m,n,\ m > n} \int_{\mathbb{R}} \nu_m \nu_n\,f_m(x) f_n(x)\,dx. \tag{3.6} \]

The following proposition shows uniform integrability of the transversal terms of the source.

Proposition 3.1 ([3]). Assume that (3.1) satisfies (1.3)–(1.5), and let $f$ be a solution emanating from the initial datum $f_0$. Then $Q(t)$ is non-increasing in time $t$, i.e.,

\[ \frac{dQ(t)}{dt} \le -4 v_*^2\,\Lambda(f)(t), \qquad \text{where } v_*^2 = \min_{m \ne n} (v_m - v_n)^2. \]

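Proof. We consider $I$ in (3.4). Recall that

\[ \partial_t f_m(x) + v_m \partial_x f_m(x) = Q_m(x), \tag{3.7} \]

\[ \partial_t f_n(y) + v_n \partial_y f_n(y) = Q_n(y). \tag{3.8} \]

Then (3.7) and (3.8) give

\[ \partial_t \bigl[ \mathbf{1}_{x<y} f_m(x) f_n(y) \bigr] + (v_m \partial_x + v_n \partial_y) \bigl[ \mathbf{1}_{x<y} f_m(x) f_n(y) \bigr] + (v_m - v_n)\,\delta(x - y)\,f_m(x) f_n(y) = \mathbf{1}_{x<y} \bigl[ Q_m(x) f_n(y) + f_m(x) Q_n(y) \bigr], \tag{3.9} \]

and in turn

\[ \begin{aligned} &\partial_t \Bigl[ \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,\mathbf{1}_{x<y} f_m(x) f_n(y) \Bigr] + \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,(v_m \partial_x + v_n \partial_y) \bigl[ \mathbf{1}_{x<y} f_m(x) f_n(y) \bigr] \\ &\qquad + \sum_{m,n} (v_m - v_n)^2 \nu_m \nu_n\,\delta(x - y)\,f_m(x) f_n(y) \\ &\quad = \mathbf{1}_{x<y} \sum_{m,n} (v_m - v_n) \nu_m \nu_n \bigl[ Q_m(x) f_n(y) + f_m(x) Q_n(y) \bigr] = 0, \end{aligned} \tag{3.10} \]

where we have used that, from (1.5),

\[ \sum_m \nu_m Q_m(x) = 0, \qquad \sum_m v_m \nu_m Q_m(x) = 0. \tag{3.11} \]

Integrating (3.10) over $\mathbb{R} \times \mathbb{R}$, we have

\[ \frac{d}{dt} \int_{\mathbb{R}} \int_{\mathbb{R}} \mathbf{1}_{x<y} \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,f_m(x) f_n(y)\,dy\,dx = - \sum_{m,n} (v_m - v_n)^2 \nu_m \nu_n \int_{\mathbb{R}} f_m(x) f_n(x)\,dx. \]

The term $II$ is treated in the same way and gives

\[ \frac{d}{dt} \Bigl( - \int_{\mathbb{R}} \int_{\mathbb{R}} \mathbf{1}_{y<x} \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,f_m(x) f_n(y)\,dy\,dx \Bigr) = - \sum_{m,n} (v_m - v_n)^2 \nu_m \nu_n \int_{\mathbb{R}} f_m(x) f_n(x)\,dx. \]

Combining the above, we obtain the desired result.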
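The monotonicity of Bony's functional can be observed on a toy discretization of the Broadwell system. The sketch below is not the authors' method; it assumes a grid with $\Delta t = \Delta x$ so that transport is an exact shift by one cell, an explicit Euler step for the collisions, compactly supported nonnegative data placed away from the boundary, and the weights $\nu = (1, 4, 1)$. All names are hypothetical. For this splitting the collision step leaves the discrete functional unchanged (by the discrete version of (3.11)), and each shift can only lower it as long as the densities stay nonnegative.

```python
import numpy as np

V = np.array([-1, 0, 1])           # cell shifts per step when dt == dx
NU = np.array([1.0, 4.0, 1.0])     # assumed Broadwell weights

def shift(row, s):
    """Exact transport by s cells with zero inflow (no periodic wrap-around)."""
    out = np.zeros_like(row)
    if s > 0:
        out[s:] = row[:-s]
    elif s < 0:
        out[:s] = row[-s:]
    else:
        out[:] = row
    return out

def step(f, dt):
    """One splitting step: explicit Euler collisions, then exact shifts."""
    q = f[1] ** 2 - f[0] * f[2]
    g = f + dt * np.array([q, -0.5 * q, q])
    return np.array([shift(g[i], V[i]) for i in range(3)])

def bony(f, dx):
    """Discrete analogue of (3.4) on the grid."""
    idx = np.arange(f.shape[1])
    sgn = np.sign(idx[None, :] - idx[:, None])   # sgn[a, b] = sgn(b - a)
    w = NU[:, None] * f
    total = 0.0
    for m in range(3):
        for n in range(3):
            total += (V[m] - V[n]) * (w[m] @ sgn @ w[n])
    return total * dx * dx

dx = dt = 0.1
f = np.zeros((3, 200))
f[0, 100:130] = 0.3                # leftward-moving density, placed on the right
f[1, 90:110] = 0.2                 # density at rest
f[2, 70:100] = 0.3                 # rightward-moving density, placed on the left
values = [bony(f, dx)]
for _ in range(50):
    f = step(f, dt)
    values.append(bony(f, dx))
print(values[0], values[-1])
```

The run is kept short (50 steps) so that no mass reaches the boundary of the grid; with outflow through the boundary the discrete functional is no longer guaranteed to be monotone.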

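Remark 3.1. By Proposition 3.1 and (3.5) the transversal terms are integrable in space-time, that is, for $m \ne n$,

\[ \int_0^\infty \int_{-\infty}^{\infty} \nu_m \nu_n\,f_m(x,t) f_n(x,t)\,dx\,dt \le C\,\|f_0\|_{L^1(\mathbb{R})}^2 < \infty, \qquad \text{for some } C > 0. \tag{3.12} \]

Next, we study the $L^1$ stability of mild solutions. Let $f$ and $\bar f$ be two solutions corresponding to initial data $f_0(x)$ and $\bar f_0(x)$ respectively. Define the nonlinear functionals

\[ \begin{aligned} L(t) &\equiv \sum_m \int_{\mathbb{R}} \nu_m\,|f_m(x) - \bar f_m(x)|\,dx, \\ Q_d(t) &\equiv \sum_{m,n} \int_{\mathbb{R}} \int_{\mathbb{R}} \operatorname{sgn}(y - x)\,(v_m - v_n)\,\nu_m \nu_n\,|f_m(x) - \bar f_m(x)|\,(f_n(y) + \bar f_n(y))\,dx\,dy, \\ H(t) &\equiv L(t) + K Q_d(t), \end{aligned} \]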


where the positive constant $K > 0$ is later appropriately selected. $L(t)$ measures the $L^1$ distance between $f$ and $\bar f$, while $Q_d(t)$ is a generalization of the Bony functional $Q(t)$. $Q_d(t)$ measures the (forward and backward) interaction potentials between $f$ and $|f - \bar f|$ and between $\bar f$ and $|f - \bar f|$. We study the time-variation of these nonlinear functionals. In the calculations there will enter the analogs of the instantaneous interaction production $\Lambda(f)(t)$ that take the form

\[ \begin{aligned} \Lambda_d(f, \bar f)(t) &= \sum_{m,n,\ m \ne n} \int_{\mathbb{R}} \nu_m \nu_n\,|f_m - \bar f_m|(x)\,(f_n + \bar f_n)(x)\,dx, \\ \Lambda_s(f, \bar f)(t) &= \sum_{m,n,\ m \ne n} \int_{\mathbb{R}} \nu_m B_m^{nn} \Bigl( 1 - \frac{\delta_m}{\delta_n} \Bigr) |f_n - \bar f_n|(f_n + \bar f_n)\,dx, \\ \Lambda(f, \bar f)(t) &= \Lambda_d(f, \bar f)(t) + \Lambda_s(f, \bar f)(t). \end{aligned} \]

All these functionals are positive and their role will be clarified in the sequel.

Proposition 3.2. Assume that (3.1) satisfies (1.3)–(1.5). Let $f$ and $\bar f$ be two solutions of (3.1) corresponding to initial data $f_0$ and $\bar f_0$ with $\|f_0\|_{L^1(\mathbb{R})} + \|\bar f_0\|_{L^1(\mathbb{R})} \ll 1$. Then, for an appropriate choice of $K$, the functional $H(t)$ is equivalent to the $L^1$ distance between $f$ and $\bar f$ and satisfies

\[ \begin{aligned} \frac{dL(t)}{dt} + \Lambda_s(f, \bar f)(t) &\le C_1\,\Lambda_d(f, \bar f)(t), \\ \frac{dQ_d(t)}{dt} + c_2\,\Lambda_d(f, \bar f)(t) &\le C_2 \bigl( \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} \bigr) \Lambda(f, \bar f)(t), \\ \frac{dH(t)}{dt} &\le -c_3\,\Lambda(f, \bar f)(t), \end{aligned} \tag{3.13} \]

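where $C_1$, $C_2$, $c_2$ and $c_3$ are positive constants which are independent of time $t$.

Proof. We consider the time-evolution of each functional separately. Let $f$ and $\bar f$ be $C^1$ solutions of compact support corresponding to compactly supported $C^1$ data $f_0(x)$ and $\bar f_0(x)$ respectively. The case of $L^1$ solutions will follow by a standard density argument.

Step 1. Computation of $\frac{dL(t)}{dt}$. Note that $f$ and $\bar f$ satisfy

\[ \partial_t f_i + v_i \partial_x f_i = Q_i, \tag{3.14} \]

\[ \partial_t |f_i - \bar f_i| + \partial_x \bigl( v_i |f_i - \bar f_i| \bigr) = (Q_i - \bar Q_i)\,\delta_i, \tag{3.15} \]

where we use the notation $\delta_i(x,t) = \operatorname{sgn}(f_i(x,t) - \bar f_i(x,t))$. We separate the terms due to transversal interactions from the terms due to self-interactions,

\[ Q_i = \sum_{m \ne n} B_i^{mn} f_m f_n + \sum_n B_i^{nn} f_n^2, \]

and use (1.3) to write

\[ (Q_i - \bar Q_i)\,\delta_i = \sum_{m,n,\ m \ne n} B_i^{mn}\,\frac{\delta_i}{\delta_m}\,|f_m - \bar f_m|(f_n + \bar f_n) + \sum_n B_i^{nn}\,\frac{\delta_i}{\delta_n}\,|f_n - \bar f_n|(f_n + \bar f_n). \tag{3.16} \]

We note the identity

\[ \begin{aligned} \sum_n \sum_i \nu_i B_i^{nn}\,\frac{\delta_i}{\delta_n}\,|f_n - \bar f_n|(f_n + \bar f_n) &= \sum_{i,n,\ i \ne n} \nu_i B_i^{nn}\,\frac{\delta_i}{\delta_n}\,|f_n - \bar f_n|(f_n + \bar f_n) + \sum_n \nu_n B_n^{nn}\,|f_n - \bar f_n|(f_n + \bar f_n) \\ &= \sum_{i,n,\ i \ne n} \nu_i B_i^{nn}\,\frac{\delta_i}{\delta_n}\,|f_n - \bar f_n|(f_n + \bar f_n) - \sum_{i,n,\ i \ne n} \nu_i B_i^{nn}\,|f_n - \bar f_n|(f_n + \bar f_n) \\ &= \sum_{i,n,\ i \ne n} \nu_i B_i^{nn} \Bigl( \frac{\delta_i}{\delta_n} - 1 \Bigr) |f_n - \bar f_n|(f_n + \bar f_n) \le 0, \end{aligned} \tag{3.17} \]

which follows from conservation of mass, the first identity in (1.5), in the form $\nu_n B_n^{nn} = - \sum_{i,\ i \ne n} \nu_i B_i^{nn}$, and the fact that $B_i^{nn} \ge 0$ for $i \ne n$. From (3.15)–(3.17) we obtain

\[ \begin{aligned} &\partial_t \sum_i \nu_i |f_i - \bar f_i| + \partial_x \sum_i \nu_i v_i |f_i - \bar f_i| + \sum_{n,i,\ n \ne i} \nu_i B_i^{nn} \Bigl( 1 - \frac{\delta_i}{\delta_n} \Bigr) |f_n - \bar f_n|(f_n + \bar f_n) \\ &\quad = \sum_i \sum_{m,n,\ m \ne n} \nu_i B_i^{mn}\,\frac{\delta_i}{\delta_m}\,|f_m - \bar f_m|(f_n + \bar f_n) \end{aligned} \tag{3.18} \]

and, in turn,

\[ \frac{dL(t)}{dt} + \Lambda_s(f, \bar f)(t) \le C_1\,\Lambda_d(f, \bar f)(t), \tag{3.19} \]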

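for some positive constant $C_1$.

Step 2. Calculation of $\frac{dQ_d(t)}{dt}$. From (3.14) and (3.15), we obtain

\[ \begin{aligned} &\partial_t \bigl[ \mathbf{1}_{x<y} |f_m - \bar f_m|(x)\,(f_n + \bar f_n)(y) \bigr] + (v_m \partial_x + v_n \partial_y) \bigl[ \mathbf{1}_{x<y} |f_m - \bar f_m|(x)\,(f_n + \bar f_n)(y) \bigr] \\ &\qquad + (v_m - v_n)\,\delta(x - y)\,|f_m - \bar f_m|(x)\,(f_n + \bar f_n)(y) \\ &\quad = \mathbf{1}_{x<y} (Q_m - \bar Q_m)(x)\,\delta_m(x)\,(f_n + \bar f_n)(y) + \mathbf{1}_{x<y} |f_m - \bar f_m|(x) \bigl( Q_n(y) + \bar Q_n(y) \bigr), \end{aligned} \tag{3.20} \]

and in turn

\[ \begin{aligned} &\partial_t \Bigl[ \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,\mathbf{1}_{x<y} |f_m - \bar f_m|(x)\,(f_n + \bar f_n)(y) \Bigr] \\ &\qquad + \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,(v_m \partial_x + v_n \partial_y) \bigl[ \mathbf{1}_{x<y} |f_m - \bar f_m|(x)\,(f_n + \bar f_n)(y) \bigr] \\ &\qquad + \sum_{m,n} (v_m - v_n)^2 \nu_m \nu_n\,\delta(x - y)\,|f_m - \bar f_m|(x)\,(f_n + \bar f_n)(y) \\ &\quad = \mathbf{1}_{x<y} \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,(Q_m - \bar Q_m)(x)\,\delta_m(x)\,(f_n + \bar f_n)(y) \\ &\qquad + \mathbf{1}_{x<y} \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,|f_m - \bar f_m|(x) \bigl( Q_n(y) + \bar Q_n(y) \bigr). \end{aligned} \tag{3.21} \]

The last term in (3.21) vanishes, due to the conservation of mass and momentum (3.11). To estimate the first term in the right-hand side of (3.21), we note that, from (3.16),

\[ \begin{aligned} \sum_i (v_i - v_j) \nu_i \nu_j\,(Q_i - \bar Q_i)\,\delta_i &= \sum_i \sum_{m,n,\ m \ne n} (v_i - v_j) \nu_i \nu_j\,B_i^{mn}\,\frac{\delta_i}{\delta_m}\,|f_m - \bar f_m|(f_n + \bar f_n) \\ &\quad + \sum_n \sum_i (v_i - v_j) \nu_i \nu_j\,B_i^{nn}\,\frac{\delta_i}{\delta_n}\,|f_n - \bar f_n|(f_n + \bar f_n), \end{aligned} \tag{3.22} \]

and that, as in (3.17) but using now both conservations of mass and momentum in (1.5),

\[ \sum_n \sum_i (v_i - v_j) \nu_i \nu_j\,B_i^{nn}\,\frac{\delta_i}{\delta_n}\,|f_n - \bar f_n|(f_n + \bar f_n) = \sum_{i,n,\ i \ne n} (v_i - v_j) \nu_i \nu_j\,B_i^{nn} \Bigl( \frac{\delta_i}{\delta_n} - 1 \Bigr) |f_n - \bar f_n|(f_n + \bar f_n). \tag{3.23} \]

Note that $B_i^{nn} \ge 0$ for $i \ne n$ and that $\frac{\delta_i}{\delta_n} - 1 \le 0$. We can now estimate (3.21) as

\[ \begin{aligned} &\frac{d}{dt} \int_{\mathbb{R}} \int_{\mathbb{R}} \mathbf{1}_{x<y} \sum_{m,n} (v_m - v_n) \nu_m \nu_n\,|f_m - \bar f_m|(x)\,(f_n + \bar f_n)(y)\,dx\,dy + v_*^2\,\Lambda_d(f, \bar f)(t) \\ &\quad \le \int_{\mathbb{R}} \int_{\mathbb{R}} \mathbf{1}_{x<y} \sum_n \sum_m (v_m - v_n) \nu_m \nu_n\,(Q_m - \bar Q_m)(x)\,\delta_m(x)\,(f_n + \bar f_n)(y)\,dx\,dy \\ &\quad \le C_2\,\|f + \bar f\|_{L^1(\mathbb{R})} \bigl( \Lambda_s(f, \bar f)(t) + \Lambda_d(f, \bar f)(t) \bigr). \end{aligned} \tag{3.24} \]

We conclude, by using similar estimates for the remaining part of $Q_d(t)$, that

\[ \frac{dQ_d(t)}{dt} + 2 v_*^2\,\Lambda_d(f, \bar f)(t) \le C_2 \bigl( \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} \bigr) \Lambda(f, \bar f)(t), \tag{3.25} \]

where $v_*^2 = \min_{m \ne n} (v_m - v_n)^2$.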

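Step 3. Calculation of $\frac{dH(t)}{dt}$. By the definition of $H(t)$,

\[ H(t) = \sum_m \int_{\mathbb{R}} \nu_m\,|f_m(x) - \bar f_m(x)| \Bigl[ 1 + K \sum_n (v_n - v_m) \int_{-\infty}^{x} \nu_n (f_n(y) + \bar f_n(y))\,dy - K \sum_n (v_n - v_m) \int_{x}^{\infty} \nu_n (f_n(y) + \bar f_n(y))\,dy \Bigr] dx. \]

Therefore, if

\[ K C_3 \bigl( \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} \bigr) < 1, \tag{3.26} \]

where $C_3 := \max_{m,n} \nu_n |v_m - v_n|$, then there exists $M > 0$ such that

\[ \frac{1}{M}\,L(t) \le H(t) \le M\,L(t), \]

and $H(t)$ is equivalent to the $L^1$-distance $L(t)$. On the other hand, from (3.19) and (3.25), we have

\[ \begin{aligned} \frac{dH(t)}{dt} &= \frac{dL(t)}{dt} + K\,\frac{dQ_d(t)}{dt} \\ &\le \Bigl[ -1 + K C_2 \bigl( \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} \bigr) \Bigr] \Lambda_s(f, \bar f)(t) + \Bigl[ -2 v_*^2 K + K C_2 \bigl( \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} \bigr) + C_1 \Bigr] \Lambda_d(f, \bar f)(t). \end{aligned} \tag{3.27} \]

If the $L^1$ mass of the two solutions is sufficiently small,

\[ \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} < \frac{2 v_*^2}{C_1 \max\{C_2, C_3\}} < 2 v_*^2, \tag{3.28} \]

then $K$ can be selected in the nonempty interval

\[ \frac{C_1}{2 v_*^2 - \bigl( \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} \bigr)} < K < \frac{1}{\max\{C_2, C_3\} \bigl( \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} \bigr)}. \tag{3.29} \]

Then (3.26) is fulfilled and there exists a (possibly small) positive constant $c_3$ so that, for the above choice of $K$,

\[ \frac{dH(t)}{dt} \le -c_3\,\Lambda(f, \bar f)(t). \]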
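The equivalence constant $M$ in Step 3 can be made explicit. The following worked estimate is a sketch under two assumptions that are not spelled out in the text: the solutions are nonnegative, and $\|g\|_{L^1(\mathbb{R})}$ denotes $\sum_n \int_{\mathbb{R}} |g_n|\,dy$. The symbol $\theta$ is introduced here only for illustration.

```latex
% Assumption: \theta := K C_3 (\|f\|_{L^1} + \|\bar f\|_{L^1}) < 1, as required by (3.26).
\begin{align*}
|K Q_d(t)|
  &\le K \sum_m \int_{\mathbb{R}} \nu_m\,|f_m - \bar f_m|(x)
       \sum_n |v_n - v_m|\,\nu_n \int_{\mathbb{R}} (f_n + \bar f_n)(y)\,dy\,dx \\
  &\le K C_3 \bigl( \|f\|_{L^1(\mathbb{R})} + \|\bar f\|_{L^1(\mathbb{R})} \bigr) L(t)
   = \theta\,L(t), \\
(1 - \theta)\,L(t) &\le H(t) \le (1 + \theta)\,L(t),
  \qquad \text{so one may take } M = \frac{1}{1 - \theta} \ge 1 + \theta .
\end{align*}
```

The last inequality uses $(1 - \theta)(1 + \theta) \le 1$, so a single constant $M$ serves for both the lower and the upper bound.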

From Proposition 3.2, we obtain $L^1$ stability of mild solutions.

Proof of Theorem 1.1. Let $f_0^{(k)}$ and $\bar f_0^{(k)}$ be $C^1$-approximations of two given initial data $f_0, \bar f_0 \in L^1$ such that

\[ f_0^{(k)} \to f_0, \qquad \bar f_0^{(k)} \to \bar f_0 \quad \text{in } L^1(\mathbb{R}) \quad \text{as } k \to \infty. \]

Then we can construct $C^1$ solutions $f^{(k)}(x,t)$ and $\bar f^{(k)}(x,t)$ corresponding to the two smooth initial data $f_0^{(k)}$ and $\bar f_0^{(k)}$ respectively. It follows that $f^{(k)}$ is Cauchy in $L^1$ and

\[ f^{(k)}(x,t) \to f(x,t), \qquad \bar f^{(k)}(x,t) \to \bar f(x,t) \quad \text{in } L^1(\mathbb{R}) \quad \text{as } k \to \infty. \]

Define

\[ H(t) = H[f(\cdot,t), \bar f(\cdot,t)] \equiv \lim_{k \to \infty} H[f^{(k)}(\cdot,t), \bar f^{(k)}(\cdot,t)]. \]

Then by the two key properties of $H(t)$, we have

\[ \|f^{(k)}(\cdot,t) - \bar f^{(k)}(\cdot,t)\|_{L^1(\mathbb{R})} \le C\,\|f_0^{(k)}(\cdot) - \bar f_0^{(k)}(\cdot)\|_{L^1(\mathbb{R})}, \qquad \text{for some constant } C > 0. \]

Letting $k \to \infty$, we have

\[ \|f(\cdot,t) - \bar f(\cdot,t)\|_{L^1(\mathbb{R})} \le C\,\|f_0(\cdot) - \bar f_0(\cdot)\|_{L^1(\mathbb{R})}. \]


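3.2. The one-dimensional Broadwell model. Next, we consider the one-dimensional Broadwell model (1.11). In this case, a variant of $Q_d(t)$ can be defined using only the forward part of the interaction potential. This is in accord with the approach of the Glimm potential, and in contrast to the interaction potential used in the previous subsection for general discrete velocity Boltzmann models. Let $f$ and $\bar f$ be two solutions of (1.11) which for the time being are taken to be $C^1$. We use the notation $f(x)$, $f(y)$ for the evaluation of $f$ at the points $(x,t)$, $(y,t)$ respectively; the $t$ dependence is mostly suppressed. From (1.11), we derive the conservation laws for the partial masses,

\[ \partial_t (2 f_2 + f_3)(x) + \partial_x f_3(x) = 0, \tag{3.30} \]

\[ \partial_t (f_1 + 2 f_2)(y) - \partial_y f_1(y) = 0. \tag{3.31} \]

In the sequel, we define a potential of interaction functional $Q(t)$ in the form

\[ Q(t) = \int_{\mathbb{R}} \int_{\mathbb{R}} \mathbf{1}_{x<y}\,(2 f_2 + f_3)(x)\,(f_1 + 2 f_2)(y)\,dx\,dy. \]

The definition is motivated in the following lemma, which appears in Tartar [27] and is there attributed to Varadhan.

Proposition 3.3. Along solutions $f$ of (1.11), we have

\[ \frac{dQ(t)}{dt} = -2 \int_{\mathbb{R}} \bigl( f_1 f_3 + f_1 f_2 + f_2 f_3 \bigr)(x,t)\,dx. \]

Proof. We multiply (3.30) by $(f_1 + 2 f_2)(y)$ and (3.31) by $(2 f_2 + f_3)(x)$. Adding, multiplying the resulting identity by $\mathbf{1}_{x<y}$ and integrating over $\mathbb{R} \times \mathbb{R}$, the two flux terms contribute $-\int_{\mathbb{R}} f_3 (f_1 + 2 f_2)\,dx$ and $-\int_{\mathbb{R}} (2 f_2 + f_3) f_1\,dx$ on the diagonal $x = y$, which gives the result.

For two solutions $f$ and $\bar f$ we use the weighted $L^1$ distance $L(t) = \int_{\mathbb{R}} \bigl( |f_1 - \bar f_1| + 4 |f_2 - \bar f_2| + |f_3 - \bar f_3| \bigr)(x,t)\,dx$, a forward interaction potential $Q_d(t)$ between $f$, $\bar f$ and $|f - \bar f|$, and

\[ H(t) \equiv L(t) + K Q_d(t), \]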


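where $K$ is a positive constant to be determined later. We also define the instantaneous interaction productions as follows:

\[ \Lambda(f, \bar f)(t) = \Lambda_s(f, \bar f)(t) + \Lambda_d(f, \bar f)(t), \tag{3.32} \]

\[ \Lambda_s(f, \bar f)(t) \equiv \int_{\mathbb{R}} \Bigl( 2 - \frac{\delta_1(x)}{\delta_2(x)} - \frac{\delta_3(x)}{\delta_2(x)} \Bigr) |f_2(x) - \bar f_2(x)|\,(f_2(x) + \bar f_2(x))\,dx, \tag{3.33} \]

\[ \Lambda_d(f, \bar f)(t) \equiv \int_{\mathbb{R}} \sum_{m=1}^{3} |f_m(x) - \bar f_m(x)| \sum_{n \ne m} (f_n(x) + \bar f_n(x))\,dx. \tag{3.34} \]

The above functionals are all positive and for positive $L^1$-solutions $H(t)$ is equivalent to the $L^1$ distance between $f$ and $\bar f$. Furthermore, $L(t)$ denotes the weighted $L^1$ distance while $Q_d(t)$ represents the potential of interaction between particles. Next, we study the time-evolution of $L$ and $Q_d$.

Lemma 3.1. Let $f$ and $\bar f$ be two solutions of (1.11) corresponding to initial data $f^0$ and $\bar f^0$ with $\|f^0 + \bar f^0\| < 2$. Then $K$ can be selected so that the functionals satisfy the Lyapunov type estimates:

\[ \begin{aligned} \frac{dL(t)}{dt} &\le -\Lambda_s(f, \bar f)(t) + \Lambda_d(f, \bar f)(t), \\ \frac{dQ_d(t)}{dt} &\le (-2 + \|f + \bar f\|)\,\Lambda_d(f, \bar f)(t), \\ \frac{dH(t)}{dt} &\le -C_1\,\Lambda(f, \bar f)(t), \end{aligned} \]

where $C_1$ is a positive constant independent of time $t$.

Proof. First, we derive the equations for the differences $|f_i(x,t) - \bar f_i(x,t)|$, $1 \le i \le 3$, in the form

\[ \begin{aligned} \partial_t |f_1 - \bar f_1| - \partial_x |f_1 - \bar f_1| &= \frac{\delta_1}{\delta_2} |f_2 - \bar f_2|(f_2 + \bar f_2) - |f_1 - \bar f_1|\,\frac{f_3 + \bar f_3}{2} - \frac{\delta_1}{\delta_3} |f_3 - \bar f_3|\,\frac{f_1 + \bar f_1}{2}, \\ \partial_t |f_2 - \bar f_2| &= -\frac{1}{2} |f_2 - \bar f_2|(f_2 + \bar f_2) + \frac{\delta_2}{2 \delta_1} |f_1 - \bar f_1|\,\frac{f_3 + \bar f_3}{2} + \frac{\delta_2}{2 \delta_3} |f_3 - \bar f_3|\,\frac{f_1 + \bar f_1}{2}, \\ \partial_t |f_3 - \bar f_3| + \partial_x |f_3 - \bar f_3| &= \frac{\delta_3}{\delta_2} |f_2 - \bar f_2|(f_2 + \bar f_2) - \frac{\delta_3}{\delta_1} |f_1 - \bar f_1|\,\frac{f_3 + \bar f_3}{2} - |f_3 - \bar f_3|\,\frac{f_1 + \bar f_1}{2}. \end{aligned} \tag{3.35} \]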


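Step 1. We consider the functionals separately. From (3.35) we have
$$\begin{aligned}
&\partial_t \big( |f_1 - \bar f_1| + 4|f_2 - \bar f_2| + |f_3 - \bar f_3| \big) + \partial_x \big( |f_3 - \bar f_3| - |f_1 - \bar f_1| \big) + \Big( 2 - \frac{\delta_1}{\delta_2} - \frac{\delta_3}{\delta_2} \Big) |f_2 - \bar f_2|(f_2 + \bar f_2)\\
&\qquad = \Big( \frac{2\delta_2}{\delta_1} - \frac{\delta_3}{\delta_1} - 1 \Big) |f_1 - \bar f_1|\,\frac{f_3 + \bar f_3}{2} + \Big( \frac{2\delta_2}{\delta_3} - \frac{\delta_1}{\delta_3} - 1 \Big) |f_3 - \bar f_3|\,\frac{f_1 + \bar f_1}{2}.
\end{aligned} \tag{3.36}$$
By the definitions of $L$, $\Lambda_s$ and $\Lambda_d$, we have
$$\frac{dL(t)}{dt} \le -\Lambda_s(f,\bar f)(t) + \int_{\mathbb{R}} \big( |f_1 - \bar f_1|(f_3 + \bar f_3) + |f_3 - \bar f_3|(f_1 + \bar f_1) \big)\,dx \le -\Lambda_s(f,\bar f)(t) + \Lambda_d(f,\bar f)(t). \tag{3.37}$$

Step 2. By a direct yet cumbersome calculation, we obtain from (3.30), (3.31) and (3.35) the identity
$$\frac{dQ_d(t)}{dt} = -2 \int_{\mathbb{R}} \sum_{m=1}^{3} |f_m - \bar f_m|(x) \Big( \sum_{n \neq m} (f_n + \bar f_n)(x) \Big)\,dx + \iint_{\mathbb{R}^2} \mathbf{1}_{x<y}\,[\,\dots\,]\,dx\,dy,$$
where the double integrals collect products such as $|f_3 - \bar f_3|(x)\,(f_1 + \bar f_1)(y)$ with coefficients built from the ratios $\delta_i/\delta_j$ […]. Estimating these terms gives
$$\begin{aligned}
\frac{dQ_d(t)}{dt} &\le -2 \int_{\mathbb{R}} \sum_{m=1}^{3} |f_m - \bar f_m|(x) \Big( \sum_{n \neq m} (f_n + \bar f_n)(x) \Big)\,dx\\
&\quad + \big( \|f_1 + \bar f_1\|_{L^1(\mathbb{R})} + 2\|f_2 + \bar f_2\|_{L^1(\mathbb{R})} \big) \int_{\mathbb{R}} (f_3 + \bar f_3)(x)\,|f_1 - \bar f_1|(x)\,dx\\
&\quad + \big( 2\|f_2 + \bar f_2\|_{L^1(\mathbb{R})} + \|f_3 + \bar f_3\|_{L^1(\mathbb{R})} \big) \int_{\mathbb{R}} (f_1 + \bar f_1)(x)\,|f_3 - \bar f_3|(x)\,dx\\
&\le \big( -2 + \|f + \bar f\| \big)\,\Lambda_d(f,\bar f)(t).
\end{aligned}$$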

Step 3. By the definition of $H$, we have
$$\frac{dH(t)}{dt} = \frac{dL(t)}{dt} + K\,\frac{dQ_d(t)}{dt} \le -\Lambda_s(f,\bar f)(t) + \big[ 1 + K(-2 + \|f + \bar f\|) \big]\,\Lambda_d(f,\bar f)(t).$$
Since $\|f + \bar f\| < 2$, we can choose $K$ sufficiently large so that $1 + K(-2 + \|f + \bar f\|) < 0$. We then have
$$\frac{dH(t)}{dt} \le -C_1\,\Lambda(f,\bar f)(t),$$
where $C_1$ is a positive constant independent of time $t$.
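The choice of $K$ in Step 3 can be made explicit. The following is a minimal sketch, assuming (this is not stated in this form in the text) that $\|f + \bar f\|$ is bounded by some number `m` < 2 uniformly in time; the names `choose_K`, `lambda_d_coefficient` and the parameter `margin` are hypothetical and serve only as illustration.

```python
def choose_K(m: float, margin: float = 1.0) -> float:
    """Return K such that 1 + K*(-2 + m) equals -margin.

    m      : assumed uniform bound on ||f + f_bar||, with m < 2
    margin : desired size of the negative coefficient of Lambda_d
             (hypothetical tuning knob, not from the text)
    """
    if not m < 2.0:
        raise ValueError("need ||f + f_bar|| < 2")
    if margin <= 0.0:
        raise ValueError("margin must be positive")
    # Any K > 1/(2 - m) makes the coefficient negative; this one hits -margin.
    return (1.0 + margin) / (2.0 - m)


def lambda_d_coefficient(K: float, m: float) -> float:
    """Coefficient of Lambda_d in the Step 3 bound for dH/dt."""
    return 1.0 + K * (-2.0 + m)


K = choose_K(1.5)                      # sample bound m = 1.5
coeff = lambda_d_coefficient(K, 1.5)   # negative by construction
```

With such a $K$ the bound reads $dH/dt \le -\Lambda_s - \texttt{margin}\cdot\Lambda_d$, so in this sketch $C_1 = \min(1, \texttt{margin})$ would do, since $\Lambda_s$ and $\Lambda_d$ are nonnegative.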

Proceeding as in the proof of Theorem 1.1, we have from Lemma 3.1 the following $L^1$ stability estimate.

Theorem 3.1. Let $f$ and $\bar f$ be two mild solutions of (1.11) subject to the hypotheses of Lemma 3.1. Then we have the uniform $L^1$ stability estimate
$$\int_{\mathbb{R}} \big( |f_1 - \bar f_1| + 4|f_2 - \bar f_2| + |f_3 - \bar f_3| \big)(x,t)\,dx \le C \int_{\mathbb{R}} \big( |f_1^0 - \bar f_1^0| + 4|f_2^0 - \bar f_2^0| + |f_3^0 - \bar f_3^0| \big)(x)\,dx,$$

where $C$ is a constant independent of time $t$.

3.3. Systems with transversal source terms. In this subsection, we consider the semilinear hyperbolic system
$$\partial_t f_i + \partial_x \big( v_i(x,t) f_i \big) = \sum_{j,k,\ j \neq k} B_i^{jk}(x,t)\, f_j f_k, \qquad f_i(x,0) = f_i^0(x), \tag{3.38}$$
$(x,t) \in \mathbb{R} \times \mathbb{R}_+$, $i = 1, \dots, N$. We assume neither conservation laws nor nonnegativity of the initial data, but only the transversality of the source. We also admit variable wave speeds $v_i = v_i(x,t)$ and variable interaction coefficients $B_i^{jk} = B_i^{jk}(x,t)$. The assumptions (1.3)–(1.6) are replaced by

1. The interaction coefficients $B_i^{jk}$ are bounded and only transversal terms enter the interaction:
$$|B_i^{jk}(x,t)| \le B^*, \ \text{for some constant } B^*, \qquad B_i^{kk} = 0, \ \text{for all } i, k. \tag{3.39}$$

2. The wave speeds are globally separated, $v_1(x,t) < v_2(x,t) < \cdots < v_N(x,t)$, with
$$v_* := \min_{i > j}\ \sup_{x \in \mathbb{R},\ t > 0} \big( v_i(x,t) - v_j(x,t) \big) > 0. \tag{3.40}$$

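Throughout, we suppress the $t$-dependence and write $f(x) \equiv f(x,t)$, $v_i(x) \equiv v_i(x,t)$ and so on. The equation for $|f_i|$ is
$$\partial_t |f_i| + \partial_x (v_i |f_i|) = \tilde Q_i(f) = Q_i(f)\,\mathrm{sgn} f_i = \sum_{j \neq k} B_i^{jk} f_j f_k\,\mathrm{sgn} f_i. \tag{3.41}$$

Define the nonlinear functionals
$$L(f)(t) \equiv \sum_{m} \int_{\mathbb{R}} |f_m(x,t)|\,dx = \|f(\cdot,t)\|_{L^1(\mathbb{R})}, \tag{3.42}$$
$$Q(f)(t) \equiv \sum_{m > n} \int_{\mathbb{R}} \int_{\mathbb{R}} \mathbf{1}_{x<y}\,|f_m(x,t)|\,|f_n(y,t)|\,dx\,dy, \tag{3.43}$$
$$F(f)(t) \equiv L(f)(t) + M Q(f)(t), \tag{3.44}$$
and the instantaneous interaction production
$$\Lambda(f)(t) = \sum_{m > n} \int_{\mathbb{R}} |f_m(x,t)|\,|f_n(x,t)|\,dx. \tag{3.45}$$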

We study the evolution of these functionals for data $f_0$ with sufficiently small $L^1$-norm.

Proposition 3.4. Let (3.38) satisfy the structural hypotheses (3.39)–(3.40) and $f$ be a mild solution. Then

(i) $f$ satisfies the inequalities
$$\frac{dL(t)}{dt} \le b\,\Lambda(t), \qquad \frac{dQ(t)}{dt} + v_* \Lambda(t) \le b\,\Lambda(t) L(t),$$
where $b$ is a (generic) constant depending on $B^*$ and $N$.

(ii) There exist constants $M$ (large) and $\varepsilon$ (small) so that if $f_0$ satisfies
$$L(0) = \|f_0\|_{L^1(\mathbb{R})} < \varepsilon \quad \text{and} \quad F(0) = L(0) + M Q(0) < \varepsilon, \tag{3.46}$$
then
$$\frac{d}{dt}\big( L + M Q \big)(t) \le -b\,\Lambda(t), \qquad L(t) \le \varepsilon. \tag{3.47}$$

Proof. It follows from (3.41) that $L(t)$ satisfies
$$\frac{dL(t)}{dt} \le b\,\Lambda(t). \tag{3.48}$$

Consider now $Q(t)$. The equations for $f_m(x)$ and $f_n(y)$ read
$$\partial_t |f_m|(x) + \partial_x (v_m |f_m|)(x) = \sum_{j,k} B_m^{jk} (f_j f_k)(x)\,\mathrm{sgn} f_m(x), \tag{3.49}$$
$$\partial_t |f_n|(y) + \partial_y (v_n |f_n|)(y) = \sum_{j,k} B_n^{jk} (f_j f_k)(y)\,\mathrm{sgn} f_n(y). \tag{3.50}$$
If we multiply (3.49) by $|f_n(y)|$, (3.50) by $|f_m(x)|$, add the identities, and multiply the result by $\mathbf{1}_{x<y}$ […]

This identity is integrated over $\mathbb{R}^2$ and we add the contributions for the velocities with $m > n$. We then obtain
$$\frac{dQ(t)}{dt} + v_* \Lambda(t) \le b\,\Lambda(t) L(t). \tag{3.51}$$

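Note that (3.48) and (3.51) give
$$\frac{dF(t)}{dt} = \frac{d}{dt}\big( L + M Q \big) \le \Lambda(t)\,\big[ b - M\big( v_* - b L(t) \big) \big].$$
Select now
$$\varepsilon < \frac{v_*}{2b} \quad \text{and} \quad M > \frac{4b}{v_*},$$
and let $f_0$ satisfy (3.46). For the solution $f$ define
$$T := \sup\{\, t > 0 : L(s) < \varepsilon \ \text{for } s \in (0,t) \,\}.$$
Clearly, $T > 0$. Moreover, for $t \in (0,T)$ we have $L(t) < \varepsilon$ and $v_* - b L(t) > \frac{v_*}{2}$. This implies that $F$ satisfies the differential inequality
$$\frac{dF(t)}{dt} \le -b\,\Lambda(t),$$
and that for $t \in [0,T]$,
$$L(t) \le \big( L + M Q \big)(t) \le \big( L + M Q \big)(0) < \varepsilon.$$
We conclude that $T = \infty$ and the inequalities follow.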
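The selection of the smallness threshold and of $M$ in this proof can be checked numerically. The sketch below is illustrative only: the function names, the safety factor `slack`, and the sample values of $b$ and $v_*$ are assumptions, not taken from the text. It picks a threshold below $v_*/(2b)$ and an $M$ above $4b/v_*$, and evaluates the coefficient $b - M(v_* - bL)$ multiplying the interaction production in the bound for $dF/dt$.

```python
def smallness_constants(b: float, v_star: float, slack: float = 0.5):
    """Pick (eps, M) with eps < v_star/(2b) and M > 4b/v_star.

    slack in (0, 1) is a hypothetical safety factor, not part of the text.
    """
    if not (b > 0.0 and v_star > 0.0 and 0.0 < slack < 1.0):
        raise ValueError("need b > 0, v_star > 0 and 0 < slack < 1")
    eps = slack * v_star / (2.0 * b)
    M = (1.0 + slack) * 4.0 * b / v_star
    return eps, M


def production_coefficient(b: float, v_star: float, M: float, L: float) -> float:
    """Coefficient c in the bound dF/dt <= c * (interaction production),
    c = b - M * (v_star - b * L)."""
    return b - M * (v_star - b * L)


b, v_star = 2.0, 0.5                      # sample values, for illustration only
eps, M = smallness_constants(b, v_star)
# c is increasing in L, so the worst case over 0 <= L <= eps is L = eps
worst = production_coefficient(b, v_star, M, eps)
```

As long as $L(t)$ stays below the threshold, the coefficient stays below $-b$, which is the differential inequality for $F$ used to close the continuation argument.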


From the perspective of kinetic theory, (3.51) is interpreted as describing the evolution of a two-point distribution function. Analogous differential inequalities appear for the three-point or multi-point distribution functions. For a triplet of wave speeds $v_k > v_m > v_n$ define the triple interaction potential
$$T_3(t) := \sum_{k > m > n} \int_{\mathbb{R}^3} \mathbf{1}_{x<y<z}\,|f_k(x,t)|\,|f_m(y,t)|\,|f_n(z,t)|\,dx\,dy\,dz.$$

More generally, for an $n$-tuple index $(k_1, \dots, k_n)$ with $k_1 > k_2 > \cdots > k_n$, we define the multiple interaction potential $M_n(t)$ by
$$M_n(t) := \sum_{k_1 > \dots > k_n} \int_{\mathbb{R}^n} \prod_{i=1}^{n-1} \mathbf{1}_{x_i < x_{i+1}} \prod_{i=1}^{n} |f_{k_i}(x_i,t)|\,dx_1 \dots dx_n. \tag{3.53}$$

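The evolution of $T_3$ and $M_n$ obeys respectively
$$\frac{d}{dt} T_3(t) + \Lambda_3(t) \le b\,\Lambda(t) L^2(t), \qquad \frac{d}{dt} M_n(t) + \Lambda_n(t) \le b\,\Lambda(t) L^{n-1}(t), \tag{3.54}$$
where $\Lambda(t)$ and $L(t)$ are as before, $b$ depends on $B^*$ and $N$, while $\Lambda_n(t)$ is given by
$$\Lambda_n(t) := \sum_{k_1 > \dots > k_n} \sum_{j} \int_{\mathbb{R}^{n-1}} \big( v_{k_j}(x_j) - v_{k_{j+1}}(x_{j+1}) \big) \prod_{i \neq j} \mathbf{1}_{x_i < x_{i+1}}\ [\,\dots\,] \prod_{i \neq j,\, j+1} |f_{k_i}(x_i)| \prod_{i \neq j} dx_i. \tag{3.55}$$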

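We outline the proof of the first inequality in (3.54). Using (3.41) at three distinct points $x$, $y$, $z$, one obtains
$$\big( \partial_t + \partial_x v_k(x) + \partial_y v_m(y) + \partial_z v_n(z) \big)\,|f_k(x)|\,|f_m(y)|\,|f_n(z)| = \tilde Q_k(x)\,|f_m(y) f_n(z)| + \tilde Q_m(y)\,|f_k(x) f_n(z)| + \tilde Q_n(z)\,|f_k(x) f_m(y)|,$$
whence
$$\big( \partial_t + \partial_x v_k(x) + \partial_y v_m(y) + \partial_z v_n(z) \big)\,\mathbf{1}_{x<y<z}\ [\,\dots\,]$$
[…] $v_k > v_m > v_n$ and integrate over $\mathbb{R}^3$ to arrive at
$$\begin{aligned}
&\frac{dT_3(t)}{dt} + \sum_{k > m > n} \int_{y<z} \big( v_k(y) - v_m(y) \big)\,|f_k(y) f_m(y)|\,|f_n(z)|\,dy\,dz + \sum_{k > m > n} \int_{x<y} \big( v_m(y) - v_n(y) \big)\,|f_k(x)|\,|f_m(y) f_n(y)|\,dx\,dy\\
&\qquad = \sum_{k > m > n} \int_{\{x<y<z\}} \Big[ \tilde Q_k(x)\,|f_m(y) f_n(z)| + \tilde Q_m(y)\,|f_k(x) f_n(z)| + \tilde Q_n(z)\,|f_k(x) f_m(y)| \Big]\,dx\,dy\,dz\\
&\qquad \le b\,\Lambda(t)\,(L(t))^2.
\end{aligned}$$
The second inequality in (3.54) follows from a similar though lengthier computation.

Consider now $f$, $\bar f$ two mild solutions and define the nonlinear functionals:
$$\mathcal{L}(t) := \sum_{m} \int_{\mathbb{R}} |f_m(x) - \bar f_m(x)|\,dx,$$
$$Q_d(t) := \sum_{m > n} \int_{\{x<y\}} \Big[ |f_m - \bar f_m|(x)\,\big( |f_n| + |\bar f_n| \big)(y) + \big( |f_m| + |\bar f_m| \big)(x)\,|f_n - \bar f_n|(y) \Big]\,dx\,dy,$$
$$H(t) = \mathcal{L}(t) + M_1 Q_d(t) + M_2 \big( Q(t) + \bar Q(t) \big),$$
where $Q(t)$, $\bar Q(t)$ are the interaction potentials for $f$ and $\bar f$ as in Proposition 3.4, and $M_1$, $M_2$ are constants to be selected later. Here $\mathcal{L}(t)$ denotes the $L^1$ distance between $f$ and $\bar f$, as opposed to the norms $L(f)(t)$ and $L(\bar f)(t)$.

Proposition 3.5. Let (3.38) satisfy (3.39)–(3.40), and let $f$, $\bar f$ be two mild solutions emanating from data $f_0$, $\bar f_0$. Then

(i) $f$, $\bar f$ satisfy the differential inequalities
$$\frac{d\mathcal{L}(t)}{dt} \le b\,\Lambda_d(f,\bar f)(t), \tag{3.56}$$
$$\frac{dQ_d(t)}{dt} + v_* \Lambda_d(f,\bar f)(t) \le b\,\Lambda_d(f,\bar f)(t)\,\big( L(f)(t) + L(\bar f)(t) \big) + b\,\mathcal{L}(t)\,\big( \Lambda(f)(t) + \Lambda(\bar f)(t) \big), \tag{3.57}$$
where $b$ depends on $B^*$ and $N$, $\Lambda$ is defined in (3.45), and $\Lambda_d$ is given by
$$\Lambda_d(f,\bar f)(t) = \sum_{k,l,\ k \neq l} \int_{\mathbb{R}} |f_k(x) - \bar f_k(x)|\,\big( |f_l|(x) + |\bar f_l|(x) \big)\,dx.$$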

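(ii) There are choices of the parameters $\varepsilon$ (small) and $M_1$, $M_2$ (large) such that if the data are selected to satisfy $\|f_0\|_{L^1(\mathbb{R})} + \|\bar f_0\|_{L^1(\mathbb{R})} < 1$, then we have
$$\frac{dH(t)}{dt} + \frac{v_*}{2}\,\Lambda_d(f,\bar f)(t) + c\,\big( \Lambda(f)(t) + \Lambda(\bar f)(t) \big) \le 0, \tag{3.58}$$
for some constant $c > 0$.

Proof. Recall that
$$\partial_t |f_m - \bar f_m| + \partial_x \big( v_m(x) |f_m - \bar f_m| \big) = \sum_{k \neq l} B_m^{kl} \big[ (f_k - \bar f_k) f_l + (f_l - \bar f_l) \bar f_k \big]\,\delta_m =: R_m, \tag{3.59}$$
$$\partial_t |f_n| + \partial_y \big( v_n(y) |f_n| \big) = \sum_{k \neq l} B_n^{kl} f_k f_l\,\mathrm{sgn}(f_n) = \tilde Q_n(f)(y), \tag{3.60}$$
$$\partial_t |\bar f_n| + \partial_y \big( v_n(y) |\bar f_n| \big) = \sum_{k \neq l} B_n^{kl} \bar f_k \bar f_l\,\mathrm{sgn}(\bar f_n) = \tilde Q_n(\bar f)(y). \tag{3.61}$$
We have
$$\frac{d\mathcal{L}(t)}{dt} = \sum_{m} \int_{\mathbb{R}} R_m(x,t)\,dx \le b\,\Lambda_d(f,\bar f)(t). \tag{3.62}$$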

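Next, consider $\frac{dQ_d(t)}{dt}$. From (3.59), (3.60) and (3.61), we obtain the identity
$$\begin{aligned}
&\big( \partial_t + \partial_x v_m(x) + \partial_y v_n(y) \big) \Big[ \mathbf{1}_{x<y}\,|f_m - \bar f_m|(x)\,\big( |f_n| + |\bar f_n| \big)(y) \Big] + \big( v_m(x) - v_n(y) \big)\,\delta(y - x)\,|f_m - \bar f_m|(x)\,\big( |f_n| + |\bar f_n| \big)(x)\\
&\qquad = \mathbf{1}_{x<y} \Big[ R_m(x)\,\big( |f_n| + |\bar f_n| \big)(y) + |f_m - \bar f_m|(x)\,\big( \tilde Q_n(f) + \tilde Q_n(\bar f) \big)(y) \Big],
\end{aligned}$$
together with the analogous identity for $\big( |f_m| + |\bar f_m| \big)(x)\,|f_n - \bar f_n|(y)$. Integrating over $\mathbb{R}^2$ and summing over $m > n$ gives
$$\begin{aligned}
&\frac{dQ_d(t)}{dt} + \sum_{m > n} \int_{\mathbb{R}} \big( v_m(x) - v_n(x) \big) \Big[ |f_m - \bar f_m|(x)\,\big( |f_n| + |\bar f_n| \big)(x) + \big( |f_m| + |\bar f_m| \big)(x)\,|f_n - \bar f_n|(x) \Big]\,dx\\
&\qquad = \sum_{m > n} \int_{\mathbb{R}^2} \mathbf{1}_{x<y} \Big[ R_m(x)\,\big( |f_n| + |\bar f_n| \big)(y) + |f_m - \bar f_m|(x)\,\big( \tilde Q_n(f) + \tilde Q_n(\bar f) \big)(y)\\
&\qquad\qquad\qquad + \big( \tilde Q_m(f) + \tilde Q_m(\bar f) \big)(x)\,|f_n - \bar f_n|(y) + \big( |f_m| + |\bar f_m| \big)(x)\,R_n(y) \Big]\,dx\,dy\\
&\qquad \le b\,\Lambda_d(f,\bar f)(t)\,\big( L(t) + \bar L(t) \big) + b\,\mathcal{L}(t)\,\big( \Lambda(t) + \bar\Lambda(t) \big),
\end{aligned}$$
where we used the notation $\bar L(t) = L(\bar f)(t)$, $\bar Q(t) = Q(\bar f)(t)$ and $\bar\Lambda(t) = \Lambda(\bar f)(t)$, while $\mathcal{L}(t)$ is the $L^1$ distance between $f$ and $\bar f$. Hence, $Q_d$ obeys the differential inequality
$$\frac{dQ_d}{dt} + v_* \Lambda_d \le b\,\Lambda_d\,(L + \bar L) + b\,\mathcal{L}\,(\Lambda + \bar\Lambda). \tag{3.63}$$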

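Combining with (3.62), we obtain
$$\frac{d}{dt}\big( \mathcal{L} + M_1 Q_d \big) + M_1 \big( v_* - b(L + \bar L) \big)\,\Lambda_d \le b\,\Lambda_d + M_1 b\,\mathcal{L}\,(\Lambda + \bar\Lambda). \tag{3.64}$$
Proposition 3.4 yields
$$\frac{d}{dt}\big( Q + \bar Q \big) + v_* (\Lambda + \bar\Lambda) \le b\,\big( L \Lambda + \bar L \bar\Lambda \big), \tag{3.65}$$
and that there exists a threshold $\varepsilon_0$ such that for $\varepsilon < \varepsilon_0$ and for $\|f_0\|_{L^1}$, $\|\bar f_0\|_{L^1}$ sufficiently small, $(L + M Q)(t)$ and $(\bar L + M \bar Q)(t)$ are decreasing in time and $L(t), \bar L(t) < \varepsilon$. From (3.64) and (3.65) we deduce
$$\frac{d}{dt}\big( \mathcal{L} + M_1 Q_d + M_2 (Q + \bar Q) \big) + M_1 \big( v_* - b(L + \bar L) \big)\,\Lambda_d + M_2 \big( v_* - b(L + \bar L) \big)\,(\Lambda + \bar\Lambda) \le b\,\Lambda_d + M_1 b\,\mathcal{L}\,(\Lambda + \bar\Lambda). \tag{3.66}$$
Since $L + \bar L < 2\varepsilon$, by selecting $\varepsilon$ even smaller (if necessary) and $M_1$, $M_2$ sufficiently large we have
$$\frac{d}{dt}\big( \mathcal{L} + M_1 Q_d + M_2 (Q + \bar Q) \big) + \frac{v_*}{2}\,\Lambda_d + c\,(\Lambda + \bar\Lambda) \le 0$$
for some constant $c > 0$.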

Acknowledgement. We thank Marshall Slemrod for various helpful discussions. SYH is partially supported by the National Science Foundation. AET is partially supported by the National Science Foundation, the European Union-IHP project HYKE, and a Marie Curie Fellowship.

References

1. Beale, J.-T.: Large-time behavior of the Broadwell model of a discrete velocity gas. Commun. Math. Phys. 102, 217–235 (1985)
2. Beale, J.-T.: Large-time behavior of discrete velocity Boltzmann equations. Commun. Math. Phys. 106 (4), 659–678 (1986)
3. Bony, J.-M.: Solutions globales bornées pour les modèles discrets de l'équation de Boltzmann, en dimension 1 d'espace. In: Journées "Équations aux dérivées partielles" (Saint Jean de Monts, 1987), Exp. No. XVI, 10 pp. Palaiseau: École Polytech, 1987
4. Bony, J.-M.: Existence globale et diffusion en théorie cinétique discrète. In: Advances in kinetic theory and continuum mechanics. R. Gatignol and Soubbaramayer, eds, Berlin-Heidelberg-New York: Springer-Verlag, 1991, pp. 81–90
5. Bony, J.-M.: Problème de Cauchy et diffusion à données petites pour les modèles discrets de la cinétique des gaz. In: Journées "Équations aux dérivées partielles" (Saint Jean de Monts, 1990), Exp. No. I, 12 pp. Palaiseau: École Polytech, 1990
6. Bony, J.-M.: Existence globale à données de Cauchy petites pour les modèles discrets de l'équation de Boltzmann. Comm. Partial Differential Equations 16, 533–545 (1991)
7. Bony, J.-M.: Existence globale et diffusion pour les modèles discrets de la cinétique des gaz. In: First European Congress of Mathematics, Vol. I (Paris, 1992), Progr. Math. 119, Basel: Birkhäuser, 1994, pp. 391–410
8. Bose, C., Grzegorczyk, P., Illner, R.: Asymptotic behavior of one-dimensional discrete velocity models in a slab. Arch. Rational Mech. Anal. 127, 337–360 (1994)
9. Bressan, A., Liu, T.-P., Yang, T.: L1 stability estimates for n × n conservation laws. Arch. Rational Mech. Anal. 149, 1–22 (1999)
10. Broadwell, J.E.: Shock structure in a simple discrete velocity gas. Phys. Fluids 7, 1243–1247 (1964)
11. Cabannes, H.: Solution globale du problème de Cauchy en théorie cinétique discrète. J. Mécanique 17, 1–22 (1978)
12. Cercignani, C.: A remarkable estimate for the solutions of the Boltzmann equation. Appl. Math. Lett. 5, 59–62 (1992)
13. Cercignani, C.: Weak solutions of the Boltzmann equation and energy conservation. Appl. Math. Lett. 8, 53–59 (1995); See also: Errata. Appl. Math. Lett. 8(5), 95–99 (1995)
14. Cercignani, C., Illner, R.: Global weak solutions of the Boltzmann equation in a slab with diffusive boundary conditions. Arch. Rat. Mech. Anal. 134, 1–16 (1996)
15. Gatignol, R.: Théorie Cinétique des Gaz à Répartition Discrète de Vitesses. Lecture Notes in Physics, 36, Berlin-New York: Springer-Verlag, 1975
16. Glimm, J.: Solutions in the large for nonlinear hyperbolic systems of equations. Comm. Pure. Appl. Math. 18, 697–715 (1965)
17. Hu, J., LeFloch, Ph.: L1 continuous dependence property for systems of conservation laws. Arch. Ration. Mech. Anal. 151, 45–93 (2000)
18. Illner, R.: Global existence results for discrete velocity models of the Boltzmann equation. J. Meca. Th. Appl. 1, 611–622 (1982)
19. Platkowski, T., Illner, R.: Discrete velocity models of the Boltzmann equation: A survey on the mathematical aspects of the theory. SIAM Review 30, 213–255 (1988)
20. Katsoulakis, M., Tzavaras, A.: Contractive relaxation systems and the scalar multidimensional conservation law. Comm. Partial Differential Equations 22, 195–233 (1997)
21. Kawashima, S.: Global solution of the initial value problem for a discrete velocity model of the Boltzmann equation. Proc. Japan Acad. 57, 19–24 (1981)
22. Liu, T.-P., Yang, T.: Well posedness theory for hyperbolic conservation laws. Comm. Pure Appl. Math. 52, 1553–1586 (1999)
23. Nishida, T., Mimura, M.: On the Broadwell model of the Boltzmann equation for a simple discrete velocity gas. Proc. Japan. Acad. 50, 812–817 (1974)
24. Schatzman, M.: Continuous Glimm functionals and uniqueness of solutions of the Riemann problem. Indiana Univ. Math. J. 34, 533–589 (1985)
25. Tartar, L.: Existence globale pour un système hyperbolique semi linéaire de la théorie cinétique des gaz. In: Séminaire Goulaouic-Schwartz (1975/1976), Équations aux dérivées partielles et analyse fonctionnelle, Exp. No. 1, 11 pp. Centre Math. Palaiseau: École Polytech, 1976
26. Tartar, L.: Some existence theorems for semilinear hyperbolic systems in one space variable. MRC Technical Summary Report, No. 2164. University of Wisconsin-Madison (1980)
27. Tartar, L.: Oscillations and asymptotic behaviour for two semilinear hyperbolic systems. In: Dynamics of infinite-dimensional systems. NATO Adv. Sci. Inst. Ser. F, 37, Berlin: Springer, 1987, pp. 341–356
28. Tzavaras, A.: On the mathematical theory of fluid dynamic limits to conservation laws. In: Advances in Mathematical Fluid Mechanics. J. Malek, J. Necas, M. Rokyta, eds.; New York: Springer, 2000, pp. 192–222

Communicated by J.L. Lebowitz

Commun. Math. Phys. 239, 93–113 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0858-9

Communications in

Mathematical Physics

Measures of Maximal Dimension for Hyperbolic Diffeomorphisms

Luis Barreira (1), Christian Wolf (2)

(1) Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal. E-mail: [email protected]
(2) Department of Mathematics, Wichita State University, Wichita, KS 67260–0033, USA. E-mail: [email protected]

Received: 18 September 2002 / Accepted: 14 February 2003 Published online: 28 May 2003 – © Springer-Verlag 2003

Abstract: We establish the existence of ergodic measures of maximal Hausdorff dimension for hyperbolic sets of surface diffeomorphisms. This is a dimension-theoretical version of the existence of ergodic measures of maximal entropy. The crucial difference is that while the entropy map is upper-semicontinuous, the map $\nu \mapsto \dim_H \nu$ is neither upper-semicontinuous nor lower-semicontinuous. This forces us to develop a new approach, which is based on the thermodynamic formalism. Remarkably, for a generic diffeomorphism with a hyperbolic set, there exists an ergodic measure of maximal Hausdorff dimension in a particular two-parameter family of equilibrium measures.

1. Introduction

In the theory of dynamical systems the study of dimension became popular around the late 1970s and the beginning of the 1980s. It appeared as a means to characterize the number of independent modes necessary to describe the "strange attractors" of many infinite-dimensional systems that are associated to natural phenomena. Furthermore, dimension is related to other invariants of a dynamical system which are associated to invariant sets and invariant measures. These invariants include topological and measure-theoretic entropies and Lyapunov exponents. We refer the reader to [6, 1, 11] for detailed accounts and references.

In particular, the measure-theoretic entropy describes the complexity of the dynamics from the point of view of a given invariant measure, and thus discards sets of small measure. On the other hand, topological entropy measures the complexity from the point of view of topological dynamics without discarding any component of the phase space. This indicates that measure-theoretic entropy cannot be larger than topological entropy, and this relation is made rigorous by the so-called variational principle of topological entropy.

Partially supported by the Center for Mathematical Analysis, Geometry, and Dynamical Systems, through FCT's Funding Program.


Namely, let $f$ be a homeomorphism of a compact metric space and denote by $h_{\mathrm{top}}(f)$ the topological entropy and by $h_\nu(f)$ the measure-theoretic entropy of $f$ (see [8] for the definitions; see also Sect. 2 below). Then
$$h_{\mathrm{top}}(f) = \sup\{ h_\nu(f) : \nu \in \mathcal{M} \}, \tag{1}$$

where $\mathcal{M}$ is the set of $f$-invariant probability measures. A measure $\nu \in \mathcal{M}$ at which the supremum in (1) is attained is called a measure of maximal entropy. A priori there exists no natural measure of maximal entropy, that is, a natural measure of the size of sets from the point of view of entropy (which attains the maximal complexity in terms of entropy). Nevertheless, when the map $\nu \mapsto h_\nu(f)$ is upper-semicontinuous (and therefore in particular when $f$ is an expansive map of a compact metric space) there exist measures of maximal entropy. Furthermore, many transitive systems of hyperbolic nature (including topological Markov chains, hyperbolic sets, and repellers) possess a unique (and thus ergodic) measure of maximal entropy.

The main purpose of this paper is to discuss the corresponding questions in the case of Hausdorff dimension, for which the theory is poorly understood. Namely, let $\dim_H \nu$ denote the Hausdorff dimension of a measure $\nu$ (see Sect. 2 for the definition) and consider the quantity
$$\delta(f) = \sup\{ \dim_H \nu : \nu \in \mathcal{M} \}. \tag{2}$$
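The variational principle (1) can be illustrated on the simplest symbolic example. The sketch below is an illustration under stated assumptions, not part of the paper: it takes the full shift on two symbols (topological entropy $\log 2$) and scans only the Bernoulli measures, which form a small sub-family of the invariant measures; the function name and the grid are hypothetical.

```python
import math


def bernoulli_entropy(p: float) -> float:
    """Measure-theoretic entropy of the Bernoulli(p, 1-p) measure
    on the full shift on two symbols."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


H_TOP = math.log(2.0)  # topological entropy of the full 2-shift

# Scan a grid of Bernoulli measures (only a sub-family of all invariant measures).
grid = [i / 1000.0 for i in range(1001)]
best_p = max(grid, key=bernoulli_entropy)
```

On this grid the entropy never exceeds $\log 2$ and the maximum is attained at the uniform weights, which is the measure of maximal entropy of the full shift. The question studied in the paper is what replaces this picture when the entropy is replaced by $\dim_H \nu$ in (2).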

A related quantity was introduced by Denker and Urbanski in [5] (with the supremum in (2) replaced by the supremum over the ergodic measures of positive entropy). In particular, this quantity has been intensively studied in one-dimensional complex dynamics (see [15] for further details). A measure $\nu \in \mathcal{M}$ at which the supremum in (2) is attained is called a measure of maximal Hausdorff dimension or simply a measure of maximal dimension. This is the dimension-theoretical counterpart to a measure of maximal entropy, and each of these measures provides a natural measure of the size of sets from the point of view of dimension. Furthermore, it attains the maximal complexity in terms of dimension. It is thus of considerable interest to discuss the existence and ergodicity of measures of maximal dimension. The main outcome of this paper is a complete solution to this problem in the case of hyperbolic sets of diffeomorphisms on surfaces. Our main result is the following (see Theorem 6 in Sect. 4):

Theorem 1. Let $f$ be a $C^{1+\varepsilon}$ surface diffeomorphism, and let $\Lambda$ be a compact locally maximal hyperbolic set such that $f|\Lambda$ is topologically mixing. Then there exists an ergodic $f$-invariant probability measure $\mu$ on $\Lambda$ such that
$$\dim_H \mu = \sup\{ \dim_H \nu : \nu \in \mathcal{M} \text{ is ergodic} \}. \tag{3}$$

It was established by Barreira, Pesin and Schmeling in [1] that for a C 1+ε diffeomorphism, any finite invariant hyperbolic measure with compact support (and thus in particular any finite invariant measure supported on a compact hyperbolic set) possesses an “almost” local product structure. This means that the measure imitates (up to a small exponential error when compared to the Lyapunov exponents) the product structure of the invariant set. One consequence is that virtually all characteristics of dimension type of the measure (including the Hausdorff dimension, box dimensions, and information dimensions) coincide. In particular, the Hausdorff dimension can be replaced by any of these characteristics in our main result.


One consequence of Theorem 1 is the existence of measures of maximal dimension. This follows from the behavior of the Hausdorff dimension of an invariant measure under an ergodic decomposition.

Theorem 2 ([3]). Let $f$ be a $C^{1+\varepsilon}$ surface diffeomorphism, and let $\Lambda$ be a compact $f$-invariant locally maximal hyperbolic set. If $\mu$ is an $f$-invariant probability measure on $\Lambda$ and $\tau$ is an ergodic decomposition of $\mu$ then
$$\dim_H \mu = \operatorname{ess\,sup}\{ \dim_H \nu : \nu \in \mathcal{M} \text{ is ergodic} \},$$
with the essential supremum taken with respect to $\tau$.

An immediate consequence of Theorem 2 is that
$$\sup\{ \dim_H \nu : \nu \in \mathcal{M} \} = \sup\{ \dim_H \nu : \nu \in \mathcal{M} \text{ is ergodic} \}. \tag{4}$$

This identity allows us to reduce the study of measures of maximal dimension to the corresponding study for ergodic measures. Combining (4) with Theorem 1 we can finally establish the existence of ergodic measures of maximal dimension.

Corollary 3. Let $f$ be a $C^{1+\varepsilon}$ surface diffeomorphism, and let $\Lambda$ be a compact locally maximal hyperbolic set such that $f|\Lambda$ is topologically mixing. Then there exists an ergodic measure of maximal dimension.

It should be noted that measures of maximal dimension are never unique. Namely, if $\mu$ is a measure of maximal dimension, then any convex combination of $\mu$ and any other invariant measure that gives positive weight to $\mu$ is also a measure of maximal dimension. Nevertheless, it is possible to identify classes of diffeomorphisms under which the number of ergodic measures $\mu$ satisfying (3) is finite (see Sect. 5).

We now want to describe the main difficulties in the proof of Theorem 1. The crucial difference between entropy and dimension is that while the entropy map is upper-semicontinuous, the map $\nu \mapsto \dim_H \nu$ is neither upper- nor lower-semicontinuous. For the former one can consider the sequence $(\nu + (n-1)\nu_x)/n$, where $\dim_H \nu > 0$ and $\nu_x$ is supported on a periodic orbit, and for the latter consider a sequence of atomic measures converging to a measure of positive dimension (the existence of such a sequence follows from [14]). On the other hand, it follows from work of Young in [17] and the upper-semicontinuity of the entropy map that $\nu \mapsto \dim_H \nu$ is upper-semicontinuous when restricted to ergodic measures. However, by a classical result of Sigmund in [14] the set of ergodic measures is a proper dense subset of $\mathcal{M}$ with respect to the weak* topology. Thus the upper-semicontinuity on the set of ergodic measures does not imply the existence of a measure of maximal dimension (either ergodic or nonergodic). These difficulties force us to develop a new approach, which is based on the thermodynamic formalism.
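The first of the two counterexamples can be spelled out with the help of Theorem 2. The following is a sketch, assuming for simplicity that $\nu$ is ergodic with $\dim_H \nu > 0$ and that $\nu_x$ is the invariant probability measure on a periodic orbit (so that $\dim_H \nu_x = 0$); the notation $\mu_n$ is introduced here only for illustration.

```latex
\mu_n := \frac{1}{n}\,\nu + \frac{n-1}{n}\,\nu_x
\;\xrightarrow[n\to\infty]{}\; \nu_x
\quad \text{in the weak}^* \text{ topology},
\qquad
\dim_H \mu_n = \max\{\dim_H \nu,\ \dim_H \nu_x\} = \dim_H \nu > 0 = \dim_H \nu_x .
```

The ergodic decomposition of $\mu_n$ charges both $\nu$ and $\nu_x$, so the essential supremum in Theorem 2 equals $\dim_H \nu$ for every $n$. Hence $\limsup_n \dim_H \mu_n > \dim_H(\lim_n \mu_n)$, which is exactly the failure of upper-semicontinuity.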
We shall describe here one particular case that illustrates well the nature of our approach. Consider the functions
$$\varphi_u = \log \| df|E^u \| \quad \text{and} \quad \varphi_s = \log \| df|E^s \|, \tag{5}$$
where $E^u$ and $E^s$ are the unstable and stable distributions on $\Lambda$. We assume here that neither $\varphi_u$ nor $\varphi_s$ is cohomologous to a constant (see Sect. 2 for the definition). Our approach consists in the following:


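1. consider a measure $\mu \in \mathcal{M}$ which is the limit of a sequence of ergodic measures $\nu_n \in \mathcal{M}$ satisfying
$$\lim_{n \to \infty} \dim_H \nu_n = \sup\{ \dim_H \nu : \nu \in \mathcal{M} \text{ is ergodic} \};$$
2. construct curves $\gamma_u : \mathbb{R} \to \mathbb{R}$ and $\gamma_s : \mathbb{R} \to \mathbb{R}$ such that
$$\int \varphi_u \, d\nu_{\gamma_u(q), q} = \int \varphi_u \, d\mu \quad \text{and} \quad \int \varphi_s \, d\nu_{p, \gamma_s(p)} = \int \varphi_s \, d\mu,$$
where $\nu_{p,q}$ denotes the equilibrium measure of $-p\varphi_u + q\varphi_s$;
3. show that the two curves intersect, that is, there exists $(p,q) \in \mathbb{R}^2$ such that $(\gamma_u(q), q) = (p, \gamma_s(p))$;
4. show that for each intersection point $(p,q) \in \mathbb{R}^2$ the measure $\nu_{p,q}$ satisfies the identity
$$\dim_H \nu_{p,q} = \sup\{ \dim_H \nu : \nu \in \mathcal{M} \text{ is ergodic} \}. \tag{6}$$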

The proofs of these statements are based on the thermodynamic formalism. We note that the curves $\gamma_u$ and $\gamma_s$ correspond to level sets of measures $\nu_{p,q}$ having the same positive and negative values of the Lyapunov exponent. We remark that a priori there may exist an ergodic measure of maximal dimension that is not among the measures $\nu_{p,q}$. Nevertheless, the number $\delta(f)$ can always be arbitrarily approximated by the dimension of the measures $\nu_{p,q}$, i.e.,
$$\delta(f) = \sup\{ \dim_H \nu_{p,q} : p, q \in \mathbb{R} \}.$$
This is established in Sect. 4.

We now describe several applications of the measures of maximal dimension (actually not only of their existence but also of the properties established in Sect. 4). A first application concerns the dependence of $\delta(f)$ on the diffeomorphism $f$. Namely, by using the existence of the measures of maximal dimension we show, under reasonable assumptions (see Theorem 11), that $f \mapsto \delta(f)$ is $C^{r-3}$ with respect to the $C^r$ topology. This improves a result of McCluskey and Manning [10] showing that $f \mapsto \delta(f)$ is continuous. A second application concerns the discussion of whether the measures of maximal dimension have the Gibbs property. We provide a very general condition (see Theorem 12) which implies that every ergodic measure of maximal dimension is an equilibrium measure of a Hölder continuous potential, and thus has the Gibbs property. This condition holds on an open subset of surface diffeomorphisms with a locally maximal hyperbolic set.

Another interesting application of Theorem 1 is the existence of measures of "maximal recurrence". It was proven by Barreira and Saussol in [2] that if $\nu$ is an equilibrium measure of a Hölder continuous potential of a $C^{1+\varepsilon}$ diffeomorphism $f$ on a locally maximal hyperbolic set, then
$$\lim_{r \to 0} \frac{\log \inf\{ k > 0 : f^k(x) \in B(x,r) \}}{-\log r} = \dim_H \nu \tag{7}$$
for $\nu$-almost every point $x$, where $B(x,r)$ is the ball of radius $r$ centered at $x$. When the limit in the left-hand side of (7) exists, it is called the recurrence rate at $x$. An immediate consequence of the above discussion is that any measure $\mu = \nu_{p,q}$ as in (6) satisfies (7) with the maximum possible value in the right-hand side, and thus is also a measure of "maximal recurrence" among all equilibrium measures of Hölder continuous potentials.

The paper is organized as follows. In Sect. 2 we define the basic concepts and recall several required results. In Sect. 3 we construct the curves $\gamma_u$ and $\gamma_s$. Section 4 establishes Theorem 1 as well as several properties of measures of maximal dimension, including a complete characterization when neither $\varphi_u$ nor $\varphi_s$ is cohomologous to a constant. Section 5 provides further applications of Theorem 1 and of the properties established in Sect. 4. In particular, we discuss the regularity of the map $f \mapsto \delta(f)$ and the Gibbs property of the measures of maximal dimension.

2. Preliminaries

Let $f : M \to M$ be a $C^{1+\varepsilon}$ diffeomorphism on a two-dimensional Riemannian manifold for some $\varepsilon \in (0,1]$, and $\Lambda \subset M$ a compact locally maximal hyperbolic set. This means that $\Lambda$ is an $f$-invariant compact set and that there exists a continuous splitting of the tangent bundle $T_\Lambda M = E^u \oplus E^s$, and constants $c > 0$ and $\lambda \in (0,1)$ such that for each $x \in \Lambda$:

1. $d_x f(E^u_x) = E^u_{f(x)}$ and $d_x f(E^s_x) = E^s_{f(x)}$;
2. $\| d_x f^{-n} v \| \le c \lambda^n \| v \|$ whenever $v \in E^u_x$ and $n > 0$;
3. $\| d_x f^{n} v \| \le c \lambda^n \| v \|$ whenever $v \in E^s_x$ and $n > 0$.

Furthermore, there exists an open neighborhood $U$ of $\Lambda$ such that $\Lambda = \bigcap_{n \in \mathbb{Z}} f^n U$. We shall always assume that the unstable and stable distributions $E^u$ and $E^s$ have dimension one, and that $f|\Lambda$ is topologically mixing. We define functions $\varphi_u : \Lambda \to \mathbb{R}$ and $\varphi_s : \Lambda \to \mathbb{R}$ by (5). The unstable and stable distributions are Hölder continuous (in fact they are of class $C^1$ under our assumptions; see for example [8, Sect. 19.1]). Hence, the functions $\varphi_u$ and $\varphi_s$ are also Hölder continuous.

Let now $\mathcal{M}$ be the family of $f$-invariant probability Borel measures on $\Lambda$ equipped with the weak* topology, and $\mathcal{M}_E \subset \mathcal{M}$ the subset of ergodic measures. This makes $\mathcal{M}$ a compact metrizable space. For each $\nu \in \mathcal{M}$ we define
$$\lambda_u(\nu) = \int_\Lambda \varphi_u \, d\nu \quad \text{and} \quad \lambda_s(\nu) = \int_\Lambda \varphi_s \, d\nu. \tag{8}$$

We denote by $\dim_H Z$ the Hausdorff dimension of the set $Z$, and define by
$$\dim_H \nu = \inf\{ \dim_H Z : \nu(\Lambda \setminus Z) = 0 \}$$
the Hausdorff dimension of the measure $\nu$. Let $h_\nu(f)$ denote the measure-theoretic entropy of $f$ with respect to $\nu$ (see for example [8] for the definition). For surface diffeomorphisms it was shown by Young in [17] that if $\nu$ is ergodic, then $\dim_H \nu = d(\nu)$, where
$$d(\nu) \overset{\mathrm{def}}{=} h_\nu(f) \left( \frac{1}{\lambda_u(\nu)} - \frac{1}{\lambda_s(\nu)} \right). \tag{9}$$
This result is crucial for our approach.
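Formula (9) is easy to evaluate once the entropy and the two Lyapunov exponents are known. The sketch below is an illustration with assumed data, not taken from the paper: the function name `young_dimension` is hypothetical, and the sample values correspond to a model linear horseshoe that expands by a factor 3 along the unstable direction, contracts by 1/3 along the stable direction, and carries the measure of maximal entropy $\log 2$.

```python
import math


def young_dimension(entropy: float, lyap_u: float, lyap_s: float) -> float:
    """Evaluate d(nu) = h_nu(f) * (1/lambda_u - 1/lambda_s) as in (9).

    Requires lambda_s < 0 < lambda_u, as on a hyperbolic set.
    """
    if not (lyap_u > 0.0 and lyap_s < 0.0):
        raise ValueError("need lambda_s < 0 < lambda_u")
    if entropy < 0.0:
        raise ValueError("entropy must be nonnegative")
    return entropy * (1.0 / lyap_u - 1.0 / lyap_s)


# Assumed model data: h = log 2, lambda_u = log 3, lambda_s = -log 3.
d = young_dimension(math.log(2.0), math.log(3.0), -math.log(3.0))
```

For these sample values $d = 2\log 2/\log 3$, the dimension of a product of two middle-third Cantor sets, which is the invariant set of that model horseshoe.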


We also need the notion of topological pressure $P(\varphi)$ of a continuous function $\varphi : \Lambda \to \mathbb{R}$ with respect to $f|\Lambda$ (see [12, 8] for the definition and details). The topological pressure satisfies the following variational principle:
$$P(\varphi) = \sup_{\nu \in \mathcal{M}} \left( h_\nu(f) + \int_\Lambda \varphi \, d\nu \right). \tag{10}$$
Furthermore, the supremum in (10) can be replaced by the supremum over $\nu \in \mathcal{M}_E$. The number $h_{\mathrm{top}}(f) = P(0)$ is the topological entropy of $f|\Lambda$. If there exists a measure $\nu \in \mathcal{M}$ at which the supremum in (10) is attained it is called an equilibrium measure of $\varphi$. We recall that two functions $\varphi, \psi : \Lambda \to \mathbb{R}$ are said to be cohomologous if $\varphi - \psi = \eta - \eta \circ f$ for some continuous function $\eta : \Lambda \to \mathbb{R}$. In this case $P(\psi) = P(\varphi)$.

Given $\alpha \in (0,1]$, let $C^\alpha(\Lambda)$ be the space of Hölder continuous functions $\varphi : \Lambda \to \mathbb{R}$ with Hölder exponent $\alpha$. We now list several properties of the topological pressure which are needed later on (see [12] for details). Let $\alpha \in (0,1]$ be fixed. Then:

1. The map $\varphi \mapsto P(\varphi)$ is real-analytic on $C^\alpha(\Lambda)$.
2. Each function $\varphi \in C^\alpha(\Lambda)$ has a unique equilibrium measure $\nu_\varphi \in \mathcal{M}$; furthermore $\nu_\varphi$ is ergodic and given $\psi \in C^\alpha(\Lambda)$ we have
$$\frac{d}{dt} P(\varphi + t\psi) \Big|_{t=0} = \int_\Lambda \psi \, d\nu_\varphi. \tag{11}$$
3. For each $\varphi, \psi \in C^\alpha(\Lambda)$ we have $\nu_\varphi = \nu_\psi$ if and only if $\varphi - \psi$ is cohomologous to a constant.
4. For each $\varphi, \psi \in C^\alpha(\Lambda)$ and $t \in \mathbb{R}$ we have
$$\frac{d^2}{dt^2} P(\varphi + t\psi) \ge 0, \tag{12}$$

with equality if and only if $\psi$ is cohomologous to a constant.

Our approach is based on the study of the topological pressure of the two-parameter family $(p,q) \mapsto -p\varphi_u + q\varphi_s$. More precisely, we consider the function $Q : \mathbb{R}^2 \to \mathbb{R}$ defined by $Q(p,q) = P(-p\varphi_u + q\varphi_s)$. Since $\varphi_u$ and $\varphi_s$ are Hölder continuous, Property 1 above implies that $Q$ is real-analytic. Furthermore, Property 2 implies that for each $(p,q) \in \mathbb{R}^2$ the function $-p\varphi_u + q\varphi_s$ has a unique equilibrium measure $\nu_{p,q} \in \mathcal{M}_E$. For simplicity, and since there is no danger of confusion, we shall use the notations
$$\lambda_u(p,q) = \lambda_u(\nu_{p,q}), \qquad \lambda_s(p,q) = \lambda_s(\nu_{p,q}), \qquad h(p,q) = h_{\nu_{p,q}}(f).$$
Accordingly, we also think of $\lambda_u$, $\lambda_s$, and $h$ as functions in $\mathbb{R}^2$. Note that by the variational principle of the topological pressure (see (10)) we have
$$Q(p,q) = h(p,q) - p\lambda_u(p,q) + q\lambda_s(p,q). \tag{13}$$
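Identity (13) can be checked by hand in a symbolic toy model. The sketch below is an illustration under stated assumptions, not the setting of the paper: it uses the full shift on two symbols with potentials $\varphi_u$, $\varphi_s$ that depend only on the first symbol, for which the pressure of $-p\varphi_u + q\varphi_s$ is the logarithm of a finite sum and the equilibrium measure is a Bernoulli measure. The sample values in `PHI_U` and `PHI_S` and the function name are hypothetical.

```python
import math

# Assumed toy data: values of phi_u > 0 and phi_s < 0 on the two symbols.
PHI_U = [math.log(3.0), math.log(5.0)]
PHI_S = [-math.log(4.0), -math.log(6.0)]


def equilibrium_data(p, q, phi_u=PHI_U, phi_s=PHI_S):
    """Return (Q, h, lambda_u, lambda_s) for the potential -p*phi_u + q*phi_s
    in the toy model (full shift, potentials depending on one symbol)."""
    weights = [math.exp(-p * a + q * b) for a, b in zip(phi_u, phi_s)]
    total = sum(weights)
    probs = [w / total for w in weights]       # Bernoulli equilibrium measure
    pressure = math.log(total)                 # Q(p, q) in the toy model
    entropy = -sum(w * math.log(w) for w in probs)
    lam_u = sum(w * a for w, a in zip(probs, phi_u))
    lam_s = sum(w * b for w, b in zip(probs, phi_s))
    return pressure, entropy, lam_u, lam_s


Q, h, lam_u, lam_s = equilibrium_data(1.0, 0.5)
gap = Q - (h - 1.0 * lam_u + 0.5 * lam_s)      # vanishes up to rounding, cf. (13)
```

In this model one can also differentiate $Q$ numerically in $p$ and recover $-\lambda_u(p,q)$, in line with Property 2 of the pressure.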

We now briefly describe how these functions relate to dimension theory. It is straightforward to verify that there exist unique nonnegative numbers tu and ts satisfying Q(tu , 0) = Q(0, ts ) = 0. It follows from work of McCluskey and Manning in [10] that dimH = tu + ts

(14)

Measures of Maximal Dimension for Hyperbolic Diffeomorphisms

99

(for more details see [11] and the references therein). Since 0 = h(tu , 0) − tu λu (tu , 0) ≥ hν (f ) − tu λu (ν), with strict inequality if and only if ν = νtu ,0 (together with an analogous property for the stable part), we have tu = max ν∈M

hν (f ) h(tu , 0) = λu (ν) λu (tu , 0)

and ts = max ν∈M

hν (f ) h(0, ts ) =− , −λs (ν) λs (0, ts )

(15)

and the maxima are uniquely attained at the measures $\nu_{t_u,0}$ and $\nu_{0,t_s}$ respectively. We call a probability measure $\mu$ supported on $\Lambda$ a measure of full dimension if $\dim_H \Lambda = \dim_H \mu$. Together with (9) and (14), the uniqueness of the maxima in (15) implies that there exists a measure $\mu \in \mathcal{M}_E$ of full dimension if and only if $\nu_{t_u,0} = \nu_{0,t_s}$, in which case $\mu = \nu_{t_u,0} = \nu_{0,t_s}$. In particular, if there is a measure of full dimension in $\mathcal{M}_E$, then it is unique.

For example, if $f$ preserves volume then there exists an ergodic invariant measure of full dimension (and in this case $t_u = t_s$). To the best of our knowledge this was first explicitly observed by Friedland and Ochs in [7]. A short argument is the following. If $f^n(x) = x \in \Lambda$ and $k \in \mathbb{Z}$, then

$$1 = |\det d_x f^{kn}| = \exp[k\phi_u(f^n(x)) + k\phi_s(f^n(x))] \sin\angle(E^u(x), E^s(x)).$$

Letting $k \to \pm\infty$ yields $\phi_u(f^n(x)) + \phi_s(f^n(x)) = 0$ whenever $f^n(x) = x \in \Lambda$, and by Livshitz's theorem (see for example [8, Theorem 19.2.1]) $\phi_u + \phi_s$ is cohomologous to zero. This implies that $Q(t, 0) = Q(0, t)$ for every $t \in \mathbb{R}$ and hence $t_u = t_s$. Thus $-t_u\phi_u$ is cohomologous to $t_s\phi_s$, and $\nu_{t_u,0} = \nu_{0,t_s}$ is the ergodic invariant measure of full dimension.

In the case of hyperbolic polynomial automorphisms of $\mathbb{C}^2$ it is shown by Wolf in [16] that if there exists an ergodic invariant measure of full dimension, then either the map is volume preserving, or $\phi_u$ and $\phi_s$ are both cohomologous to a constant. In the latter case the ergodic invariant measure of full dimension obviously coincides with the measure of maximal entropy.

3. Preparatory Results

We first introduce some notation. Since the maps $\nu \mapsto \lambda_u(\nu)$ and $\nu \mapsto \lambda_s(\nu)$ defined by (8) are continuous on $\mathcal{M}$, and $\mathcal{M}$ is compact, we can define

$$\lambda_u^{\min} = \min \lambda_u(\mathcal{M}), \quad \lambda_u^{\max} = \max \lambda_u(\mathcal{M}), \quad \lambda_s^{\min} = \min \lambda_s(\mathcal{M}), \quad\text{and}\quad \lambda_s^{\max} = \max \lambda_s(\mathcal{M}).$$

Set

$$I_u = (\lambda_u^{\min}, \lambda_u^{\max}) \quad\text{and}\quad I_s = (\lambda_s^{\min}, \lambda_s^{\max}).$$

Note that $I_u \ne \emptyset$ (respectively $I_s \ne \emptyset$) if and only if $\phi_u$ (respectively $\phi_s$) is not cohomologous to a constant. We shall also consider the functions

$$d_u(p, q) = h(p, q)/\lambda_u(p, q) \quad\text{and}\quad d_s(p, q) = -h(p, q)/\lambda_s(p, q). \tag{16}$$

L. Barreira, C. Wolf

Recall that the function $Q$ is real-analytic. It follows from (11) that

$$\partial_p Q = -\lambda_u \quad\text{and}\quad \partial_q Q = \lambda_s. \tag{17}$$

Therefore the functions $\lambda_u$ and $\lambda_s$ are also real-analytic. We conclude from (13) that $h$ and hence also $d_s$ and $d_u$ are real-analytic.

Proposition 4. The following properties hold:
1. If $\phi_u$ is not cohomologous to a constant and $q \in \mathbb{R}$, then:
   a) $\lambda_u(\cdot, q)$ is strictly decreasing and $\{\lambda_u(p, q) : p \in \mathbb{R}\} = I_u$;
   b) $h(\cdot, 0)$ is strictly decreasing on $[0, \infty)$;
   c) $d_u(\cdot, 0)$ is strictly increasing on $(-\infty, t_u]$ and strictly decreasing on $[t_u, \infty)$.
2. If $\phi_s$ is not cohomologous to a constant and $p \in \mathbb{R}$, then:
   a) $\lambda_s(p, \cdot)$ is strictly increasing (since $\partial_q \lambda_s = \partial_q^2 Q > 0$ by (17)) and $\{\lambda_s(p, q) : q \in \mathbb{R}\} = I_s$;
   b) $h(0, \cdot)$ is strictly decreasing on $[0, \infty)$;
   c) $d_s(0, \cdot)$ is strictly increasing on $(-\infty, t_s]$ and strictly decreasing on $[t_s, \infty)$.

Proof. Assume that $\phi_u$ is not cohomologous to a constant and fix $q \in \mathbb{R}$. By (12) and (17) we have

$$\partial_p \lambda_u = -\partial_p^2 Q < 0, \tag{18}$$

and thus $\lambda_u(\cdot, q)$ is strictly decreasing. The continuity of $\lambda_u(\cdot, q)$ implies that $\{\lambda_u(p, q) : p \in \mathbb{R}\}$ is an open interval. For Property 1 we show that

$$\lim_{p \to \infty} \lambda_u(p, q) = \lambda_u^{\min} \quad\text{and}\quad \lim_{p \to -\infty} \lambda_u(p, q) = \lambda_u^{\max}. \tag{19}$$

Otherwise there would exist $\nu \in \mathcal{M}$ and $\varepsilon > 0$ such that $\lambda_u(\nu) + \varepsilon < \lambda_u(p, q)$ for all $p \in \mathbb{R}$. Take $p > 0$ satisfying $p\varepsilon > h_{\mathrm{top}}(f) - q\lambda_s(\nu) + q\lambda_s(p, q)$ (such a $p$ always exists since the function $\lambda_s(\cdot, q)$ is bounded). We obtain

$$Q(p, q) = h(p, q) - p\lambda_u(p, q) + q\lambda_s(p, q) < h_{\mathrm{top}}(f) - p(\lambda_u(\nu) + \varepsilon) + q\lambda_s(p, q) < h_\nu(f) - p\lambda_u(\nu) + q\lambda_s(\nu),$$

which contradicts the variational principle for the topological pressure. This establishes the first identity in (19). A similar argument establishes the second identity and Property 1a holds. It follows from (13) that

$$h(p, 0) = Q(p, 0) + p\lambda_u(p, 0). \tag{20}$$

Using (17) and (18) it is straightforward to verify that

$$\partial_p h(p, 0) = p\,\partial_p \lambda_u(p, 0). \tag{21}$$

This establishes Property 1b. Using now (13) and (21) we obtain

$$\partial_p d_u(p, 0) = \frac{p\,\partial_p \lambda_u(p, 0)\,\lambda_u(p, 0) - h(p, 0)\,\partial_p \lambda_u(p, 0)}{\lambda_u(p, 0)^2} = -\partial_p \lambda_u(p, 0)\,\frac{Q(p, 0)}{\lambda_u(p, 0)^2}.$$

It follows from the variational principle that $Q(\cdot, q)$ is strictly decreasing. This implies that $Q(p, 0) > Q(t_u, 0) = 0$ for $p < t_u$ and $Q(p, 0) < Q(t_u, 0) = 0$ for $p > t_u$. Property 1 follows now immediately from (18). The proofs of the statements for the stable part are entirely analogous.
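For readers who want the intermediate steps, the identity (21) and the closed form of $\partial_p d_u(p, 0)$ used in the proof of Proposition 4 can be checked as follows. This is only a sketch based on (17) and (20) as stated above; the abbreviations $\lambda$, $\lambda'$, $h$ and $Q$ (all evaluated at $(p, 0)$) are introduced here for brevity and are not notation from the original.

```latex
% Abbreviations introduced for this sketch only:
%   \lambda = \lambda_u(p,0), \quad \lambda' = \partial_p\lambda_u(p,0),
%   h = h(p,0), \quad Q = Q(p,0).
\begin{align*}
\partial_p h &= \partial_p Q + \lambda + p\lambda'
   && \text{differentiate (20)} \\
 &= -\lambda + \lambda + p\lambda' = p\lambda'
   && \text{by (17); this is (21)} \\
\partial_p d_u &= \frac{(\partial_p h)\,\lambda - h\,\lambda'}{\lambda^2}
   = \frac{\lambda'\,(p\lambda - h)}{\lambda^2}
   = -\lambda'\,\frac{Q}{\lambda^2}
   && \text{since } Q = h - p\lambda \text{ by (20)}
\end{align*}
```

Since $\lambda' < 0$ by (18), the sign of $\partial_p d_u(p, 0)$ agrees with the sign of $Q(p, 0)$, which is positive for $p < t_u$ and negative for $p > t_u$.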

Measures of Maximal Dimension for Hyperbolic Diffeomorphisms

Using Proposition 4 we introduce two curves, crucial for our approach.

Proposition 5. The following properties hold:
1. For each $a \in I_u$ there exists a unique function $\gamma_u : \mathbb{R} \to \mathbb{R}$ satisfying $\lambda_u(\gamma_u(q), q) = a$ for all $q \in \mathbb{R}$, and $\gamma_u$ is real-analytic;
2. For each $b \in I_s$ there exists a unique function $\gamma_s : \mathbb{R} \to \mathbb{R}$ satisfying $\lambda_s(p, \gamma_s(p)) = b$ for all $p \in \mathbb{R}$, and $\gamma_s$ is real-analytic.

Proof. We shall prove the second statement. The proof of the first statement is analogous. Let $b \in I_s$. In particular $I_s \ne \emptyset$, and $\phi_s$ is not cohomologous to a constant. By Statement 2 of Proposition 4 and (17), for each $p \in \mathbb{R}$ there exists a unique number $\gamma_s(p) \in \mathbb{R}$ such that $\partial_q Q(p, \gamma_s(p)) = \lambda_s(p, \gamma_s(p)) = b$. Since $\phi_s$ is not cohomologous to a constant, $\partial_q^2 Q(p, q) > 0$ for all $(p, q) \in \mathbb{R}^2$. The Implicit Function Theorem shows that $p \mapsto \gamma_s(p)$ is real-analytic.

4. Measures of Maximal Dimension

4.1. Existence. We now establish the existence of ergodic measures of maximal dimension on locally maximal hyperbolic sets. The following is our main result.

Theorem 6. Let $f$ be a $C^{1+\varepsilon}$ surface diffeomorphism, and let $\Lambda$ be a compact locally maximal hyperbolic set of $f$ such that $f|\Lambda$ is topologically mixing. Then there exists a measure $\mu \in \mathcal{M}_E$ such that

$$\dim_H \mu = \sup\{\dim_H \nu : \nu \in \mathcal{M}_E\}. \tag{22}$$

Proof. Let $(\nu_n)_{n \in \mathbb{N}}$ be a sequence of measures in $\mathcal{M}_E$ such that

$$\lim_{n \to \infty} \dim_H \nu_n = \sup\{\dim_H \nu : \nu \in \mathcal{M}_E\}. \tag{23}$$

Since $\mathcal{M}$ is compact in the weak$^*$ topology, we can also assume that $(\nu_n)_{n \in \mathbb{N}}$ converges to some measure $m \in \mathcal{M}$. Since the map $\mathcal{M} \ni \nu \mapsto h_\nu(f)$ is upper semi-continuous, it follows from (9) and the continuity of $\nu \mapsto \lambda_u(\nu)$ and $\nu \mapsto \lambda_s(\nu)$ that

$$\lim_{n \to \infty} \dim_H \nu_n \le d(m). \tag{24}$$

Using (23) and (24) we obtain

$$\sup\{\dim_H \nu : \nu \in \mathcal{M}_E\} \le d(m). \tag{25}$$

Therefore, in order to establish the existence of a measure $\mu \in \mathcal{M}_E$ satisfying (22), it is sufficient to show that there exists $\mu \in \mathcal{M}_E$ with

$$\dim_H \mu = d(m). \tag{26}$$

It is clear that any measure $\mu \in \mathcal{M}_E$ satisfying (26) also satisfies (22). We note that when $m$ is ergodic, it follows from (9) that $\dim_H m = d(m)$, and hence (22) holds for $m$. However, $m$ may not be ergodic. Set $a = \lambda_u(m)$ and $b = \lambda_s(m)$. By Proposition 5, whenever $a \in I_u$ (respectively $b \in I_s$) we can consider the curve $\gamma_u$ (respectively $\gamma_s$) associated to the number $a$ (respectively $b$). We first prove some auxiliary statements.
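The inequality (24) combines upper semi-continuity of the entropy with continuity of the Lyapunov exponents. A sketch of the estimate follows. It assumes that (9) is the formula $\dim_H \nu = h_\nu(f)\,(1/\lambda_u(\nu) - 1/\lambda_s(\nu))$ for ergodic $\nu$ and that $d(\nu)$ denotes the same expression for arbitrary $\nu \in \mathcal{M}$; both are inferred from how (9) and $d(m)$ are used in the proofs of this section, since their statements lie outside this part of the text.

```latex
% Assumed form of (9) and of d(.), inferred from their use in (31):
%   d(\nu) = h_\nu(f) \bigl( 1/\lambda_u(\nu) - 1/\lambda_s(\nu) \bigr),
%   with \lambda_u > 0 > \lambda_s.
\begin{align*}
\lim_{n\to\infty} \dim_H \nu_n
  &= \lim_{n\to\infty} h_{\nu_n}(f)
     \Bigl(\frac{1}{\lambda_u(\nu_n)} - \frac{1}{\lambda_s(\nu_n)}\Bigr) \\
  &= \Bigl(\limsup_{n\to\infty} h_{\nu_n}(f)\Bigr)
     \Bigl(\frac{1}{\lambda_u(m)} - \frac{1}{\lambda_s(m)}\Bigr)
     && \text{continuity of } \lambda_u,\ \lambda_s \\
  &\le h_m(f)\Bigl(\frac{1}{\lambda_u(m)} - \frac{1}{\lambda_s(m)}\Bigr) = d(m)
     && \text{upper semi-continuity of } \nu \mapsto h_\nu(f)
\end{align*}
```

The second equality uses that the limit in (23) exists and that the factor in parentheses converges to a strictly positive number.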

Lemma 1. If $\lambda_s(m) \in I_s$ then there exists $p \in [0, h_m(f)/\lambda_u(m)]$ such that $\lambda_u(p, \gamma_s(p)) = \lambda_u(m)$.

Proof of the lemma. The assumption $\lambda_s(m) \in I_s$ guarantees that $\gamma_s$ is well-defined. Since $\nu_{p,\gamma_s(p)}$ is the equilibrium measure of $-p\phi_u + \gamma_s(p)\phi_s$ we have

$$h(p, \gamma_s(p)) - p\lambda_u(p, \gamma_s(p)) + \gamma_s(p)\lambda_s(p, \gamma_s(p)) \ge h_m(f) - p\lambda_u(m) + \gamma_s(p)\lambda_s(m) \tag{27}$$

for all $p \in \mathbb{R}$. Note that $\lambda_u(p, \gamma_s(p)) > 0$. It is straightforward to verify that

$$\frac{h(p, \gamma_s(p))}{\lambda_u(p, \gamma_s(p))} - \frac{h_m(f)}{\lambda_u(m)} \ge \left(1 - \frac{\lambda_u(m)}{\lambda_u(p, \gamma_s(p))}\right)\left(p - \frac{h_m(f)}{\lambda_u(m)}\right). \tag{28}$$

Define $\kappa = h_m(f)/\lambda_u(m)$. Setting $p = \kappa$, it follows from (28) that

$$h(\kappa, \gamma_s(\kappa))/\lambda_u(\kappa, \gamma_s(\kappa)) \ge h_m(f)/\lambda_u(m). \tag{29}$$

Assume now that $\lambda_u(\kappa, \gamma_s(\kappa)) > \lambda_u(m)$. By (29), $h(\kappa, \gamma_s(\kappa)) > h_m(f)$. We conclude from (9) and (29) that $\dim_H \nu_{\kappa,\gamma_s(\kappa)} > d(m)$. This is a contradiction to (25) and thus we must have

$$\lambda_u(\kappa, \gamma_s(\kappa)) \le \lambda_u(m). \tag{30}$$

On the other hand, it follows from (9) and (25) that

$$\frac{h(0, \gamma_s(0))}{\lambda_u(0, \gamma_s(0))} - \frac{h(0, \gamma_s(0))}{\lambda_s(m)} \le \frac{h_m(f)}{\lambda_u(m)} - \frac{h_m(f)}{\lambda_s(m)}. \tag{31}$$

Setting $p = 0$ in (27) we obtain $h(0, \gamma_s(0)) \ge h_m(f)$. Therefore (31) yields

$$\lambda_u(0, \gamma_s(0)) \ge \lambda_u(m). \tag{32}$$

It follows from the continuity of $p \mapsto \lambda_u(p, \gamma_s(p))$ together with (30) and (32) that there exists $p \in [0, \kappa]$ for which $\lambda_u(p, \gamma_s(p)) = \lambda_u(m)$. This completes the proof of the lemma.

Lemma 2. Assume that neither $\phi_u$ nor $\phi_s$ is cohomologous to a constant. Then $\lambda_u(m) \in I_u$ if and only if $\lambda_s(m) \in I_s$.

Proof of the lemma. Assume that $\lambda_s(m) \in I_s$. By Lemma 1, there exists $p$ for which $\lambda_u(p, \gamma_s(p)) = \lambda_u(m)$. Proposition 4 shows that $\lambda_u(p, \gamma_s(p)) \in I_u$ and hence $\lambda_u(m) \in I_u$. A similar argument and the corresponding version of Lemma 1 show that $\lambda_s(m) \in I_s$ whenever $\lambda_u(m) \in I_u$.

Lemma 2 indicates that it is enough to consider the following four cases:
1. $\lambda_s(m) \in I_s$ and $\lambda_u(m) \in I_u$;
2. $\lambda_s(m) \in I_s$ and $\phi_u$ is cohomologous to a constant;
3. $\lambda_u(m) \in I_u$ and $\phi_s$ is cohomologous to a constant;
4. $\lambda_s(m) \notin I_s$ and $\lambda_u(m) \notin I_u$.

We continue with an auxiliary statement.
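The step from (27) to (28) in the proof of Lemma 1 is left to the reader in the text. One possible way to carry it out is sketched below. The shorthand $\lambda$, $\lambda_m$, $h$, $h_m$ is introduced here only for readability, and the sketch relies on the defining property $\lambda_s(p, \gamma_s(p)) = \lambda_s(m)$ of the curve $\gamma_s$.

```latex
% Shorthand introduced for this sketch only:
%   \lambda = \lambda_u(p,\gamma_s(p)), \quad \lambda_m = \lambda_u(m),
%   h = h(p,\gamma_s(p)), \quad h_m = h_m(f).
\begin{align*}
h - p\lambda &\ge h_m - p\lambda_m
  && \text{(27), after cancelling } \gamma_s(p)\lambda_s(m) \text{ on both sides} \\
\frac{h}{\lambda} &\ge \frac{h_m}{\lambda} + p\Bigl(1 - \frac{\lambda_m}{\lambda}\Bigr)
  && \text{divide by } \lambda > 0 \\
\frac{h}{\lambda} - \frac{h_m}{\lambda_m}
  &\ge p\Bigl(1 - \frac{\lambda_m}{\lambda}\Bigr)
     - \frac{h_m}{\lambda_m}\Bigl(1 - \frac{\lambda_m}{\lambda}\Bigr)
  && \text{since } \frac{h_m}{\lambda} - \frac{h_m}{\lambda_m}
     = -\frac{h_m}{\lambda_m}\Bigl(1 - \frac{\lambda_m}{\lambda}\Bigr) \\
  &= \Bigl(1 - \frac{\lambda_m}{\lambda}\Bigr)\Bigl(p - \frac{h_m}{\lambda_m}\Bigr)
  && \text{which is (28)}
\end{align*}
```

With $p = \kappa = h_m(f)/\lambda_u(m)$ the right-hand side vanishes, which gives (29).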

Lemma 3. If p, q ∈ R are such that λu (p, q) = λu (m) and λs (p, q) = λs (m), then m = νp,q .

Proof of the lemma. We have

$$h(p, q) + \int_\Lambda (-p\phi_u + q\phi_s)\,d\nu_{p,q} = h(p, q) - p\lambda_u(m) + q\lambda_s(m) \ge h_m(f) + \int_\Lambda (-p\phi_u + q\phi_s)\,dm,$$

and hence $h(p, q) \ge h_m(f)$, with equality if and only if $\nu_{p,q} = m$. On the other hand, combining (9) with (25) gives $h(p, q) \le h_m(f)$. Therefore $h(p, q) = h_m(f)$ and $m = \nu_{p,q}$.

We now consider each of the above four cases.

Lemma 4. If $\lambda_u(m) \in I_u$ and $\lambda_s(m) \in I_s$, then there exist $p, q \in \mathbb{R}$ such that $(p, \gamma_s(p)) = (\gamma_u(q), q)$ and $m = \nu_{p,q}$.

Proof of the lemma. The hypotheses guarantee that $\gamma_u$ and $\gamma_s$ are well-defined. Since $\lambda_s(p, \gamma_s(p)) = \lambda_s(m)$, it follows from Lemma 1 and the uniqueness of $\gamma_u$ that $(p, \gamma_s(p)) = (\gamma_u(q), q)$ for some $p, q \in \mathbb{R}$. In particular, $\lambda_u(p, q) = \lambda_u(m)$ and $\lambda_s(p, q) = \lambda_s(m)$. Lemma 3 implies that $m = \nu_{p,q}$.

Lemma 5. If $\lambda_s(m) \in I_s$ and $\phi_u$ is cohomologous to a constant, then there exist $p, q \in \mathbb{R}$ such that $m = \nu_{p,q}$.

Proof of the lemma. Since $\lambda_s(m) \in I_s$, $\gamma_s$ is well-defined, and $\lambda_s(p, \gamma_s(p)) = \lambda_s(m)$ for each $p$. On the other hand, the cohomological assumption ensures that $\lambda_u(p, \gamma_s(p)) = \lambda_u(m)$. Setting $q = \gamma_s(p)$ we obtain $\lambda_u(p, q) = \lambda_u(m)$ and $\lambda_s(p, q) = \lambda_s(m)$. Lemma 3 implies that $m = \nu_{p,q}$.

An analogous argument establishes the following.

Lemma 6. If $\lambda_u(m) \in I_u$ and $\phi_s$ is cohomologous to a constant, then there exist $p, q \in \mathbb{R}$ such that $m = \nu_{p,q}$.

Finally we consider the fourth case.

Lemma 7. If $\lambda_u(m) \notin I_u$ and $\lambda_s(m) \notin I_s$ then:
1. $\lambda_u(m) = \lambda_u^{\min}$ and $\lambda_s(m) = \lambda_s^{\max}$;
2. there exists $\nu \in \mathcal{M}_E$ such that $\lambda_u(\nu) = \lambda_u(m)$, $\lambda_s(\nu) = \lambda_s(m)$, and $h_\nu(f) = h_m(f)$.

Proof of the lemma. We first establish Property 1. When $I_u = I_s = \emptyset$ (i.e., $\phi_u$ and $\phi_s$ are both cohomologous to constants), there is nothing to prove. Assume now that

$$I_u = \emptyset, \quad I_s \ne \emptyset, \quad\text{and}\quad \lambda_s(m) = \lambda_s^{\min}. \tag{33}$$

Since $\nu_{0,0}$ is the measure of maximal entropy we have $h(0, 0) \ge h_m(f)$. Therefore it follows from $\lambda_u(0, 0) = \lambda_u^{\min}$, Statement 2 in Proposition 4, and (9) that $\dim_H \nu_{0,0} > d(m)$. But this contradicts (25). Thus (33) cannot occur. Analogously we can show that it is impossible to have $I_s = \emptyset$, $I_u \ne \emptyset$, and $\lambda_u(m) = \lambda_u^{\max}$. In order to complete the proof of Property 1 it remains to consider the case when $I_u \ne \emptyset$ and $I_s \ne \emptyset$. In this case $\lambda_u(m) \in \partial I_u = \{\lambda_u^{\min}, \lambda_u^{\max}\}$ and $\lambda_s(m) \in \partial I_s = \{\lambda_s^{\min}, \lambda_s^{\max}\}$. Assume first that

$$\lambda_u(m) = \lambda_u^{\max} \quad\text{and}\quad \lambda_s(m) = \lambda_s^{\min}. \tag{34}$$

Since $\nu_{0,0}$ is the measure of maximal entropy we have $h(0, 0) \ge h_m(f)$. On the other hand, Proposition 4 implies that $\lambda_u(0, 0) < \lambda_u(m)$ and $\lambda_s(0, 0) > \lambda_s(m)$. Using (9) we obtain $\dim_H \nu_{0,0} > d(m)$. This is again a contradiction to (25) and hence (34) cannot occur. Assume now that

$$\lambda_u(m) = \lambda_u^{\min} \quad\text{and}\quad \lambda_s(m) = \lambda_s^{\min}. \tag{35}$$

We claim that

$$h(p, 0) > h_m(f) \tag{36}$$

for all $p > 0$. Otherwise, if $h(p, 0) \le h_m(f)$ for some $p > 0$, Proposition 4 would imply $h(p, 0) - p\lambda_u(p, 0) < h_m(f) - p\lambda_u(m)$. But this is impossible since $\nu_{p,0}$ is the equilibrium measure of $-p\phi_u$. We also claim that

$$d_u(p, 0) \ge h_m(f)/\lambda_u(m) \tag{37}$$

for all sufficiently large $p$ (see (16) for the definition of $d_u$). Otherwise, Proposition 4 would guarantee the existence of $p_0 \in \mathbb{R}$ and $\varepsilon > 0$ such that $d_u(p, 0) + \varepsilon < h_m(f)/\lambda_u(m)$ for all $p \ge p_0$. It would then follow from (19) that $h_m(f) > h(p, 0)$ for all sufficiently large $p$. This contradicts (36) and hence (37) holds for all sufficiently large $p$. It follows from (35)–(37) that

$$\dim_H \nu_{p,0} = d_u(p, 0) + d_s(p, 0) \ge \frac{h_m(f)}{\lambda_u(m)} - \frac{h(p, 0)}{\lambda_s(p, 0)} > d(m)$$

for all sufficiently large $p$. This contradicts (25) and hence (35) cannot occur. Analogously one can show that it is impossible to have $\lambda_u(m) = \lambda_u^{\max}$ and $\lambda_s(m) = \lambda_s^{\max}$. Therefore we must have $\lambda_u(m) = \lambda_u^{\min}$ and $\lambda_s(m) = \lambda_s^{\max}$, and Property 1 is established.

To prove Property 2 we consider an ergodic decomposition $\tau$ of $m$, i.e., a probability measure on the metrizable space $\mathcal{M}$ with $\tau(\mathcal{M}_E) = 1$ such that

$$\int_{\mathcal{M}} \left(\int_\Lambda \varphi\,d\nu\right) d\tau(\nu) = \int_\Lambda \varphi\,dm \tag{38}$$

for all $\varphi \in C(\Lambda, \mathbb{R})$. Applying (38) to $\phi_u$ yields

$$\lambda_u^{\min} = \lambda_u(m) = \int_{\mathcal{M}} \lambda_u(\nu)\,d\tau(\nu).$$

Since $\lambda_u(\nu) \ge \lambda_u^{\min}$ for all $\nu \in \mathcal{M}$ there exists $A_1 \subset \mathcal{M}_E$ with $\tau(A_1) = 1$ such that $\lambda_u(\nu) = \lambda_u^{\min}$ for all $\nu \in A_1$. Analogously there exists $A_2 \subset \mathcal{M}_E$ with $\tau(A_2) = 1$ such that $\lambda_s(\nu) = \lambda_s^{\max}$ for all $\nu \in A_2$. We conclude from (9) and (25) that $h_\nu(f) \le h_m(f)$ for all $\nu \in A_1 \cap A_2$. On the other hand, since $\tau(A_1 \cap A_2) = 1$ and $h_m(f) = \int_{\mathcal{M}} h_\nu(f)\,d\tau(\nu)$ (see for example [4]), there exists $A \subset A_1 \cap A_2$ with $\tau(A) = 1$ such that $h_\nu(f) = h_m(f)$ for all $\nu \in A$. This completes the proof of the lemma.

By Lemmas 4–7, in each of the four cases there exists a measure $\mu \in \mathcal{M}_E$ satisfying (26): each measure $\nu_{p,q}$ in Lemmas 4–6, and each measure $\nu$ in Statement 2 of Lemma 7. This completes the proof.
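The passage from the integral identity to the full-measure set $A_1$ in the proof of Lemma 7 uses a standard fact about nonnegative integrands with zero integral. A short sketch in the notation of (38) is given below; the function $g$ is an auxiliary name introduced here and is not part of the original argument.

```latex
% g is an auxiliary function introduced for this sketch only.
\begin{align*}
g(\nu) &:= \lambda_u(\nu) - \lambda_u^{\min} \ge 0
   \quad \text{for all } \nu \in \mathcal{M}, \\
\int_{\mathcal{M}} g \, d\tau
   &= \int_{\mathcal{M}} \lambda_u(\nu)\, d\tau(\nu) - \lambda_u^{\min}
    = \lambda_u(m) - \lambda_u^{\min} = 0, \\
&\Longrightarrow\ g = 0 \ \ \tau\text{-almost everywhere, i.e. }
   \tau\bigl(\{\nu \in \mathcal{M}_E : \lambda_u(\nu) = \lambda_u^{\min}\}\bigr) = 1 .
\end{align*}
```

The same reasoning applied to $\lambda_s^{\max} - \lambda_s(\nu)$ yields $A_2$, and applied to $h_m(f) - h_\nu(f)$ on $A_1 \cap A_2$ it yields $A$.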

As explained in the introduction each ergodic measure $\mu \in \mathcal{M}$ satisfying (22) is a measure of maximal dimension, and thus we can simply refer to it as an ergodic measure of maximal dimension. For hyperbolic surface diffeomorphisms with constant Jacobian, the identity (22) follows from work of Wolf in [16] in the case of polynomial automorphisms of $\mathbb{C}^2$. We remark that because of the constant Jacobian assumption the methods in [16] cannot be used in our study.

We note that the compactness assumption in Theorem 6 is essential. Otherwise, we can consider a sequence of (linear) Smale horseshoes $\Lambda_n \subset \mathbb{R}^2$ such that $\dim_H \Lambda_n \nearrow 2$ as $n \to \infty$, and thus the (noncompact) hyperbolic set $\Lambda = \bigcup_{n=1}^{\infty} \Lambda_n$ has Hausdorff dimension 2. It can be arranged that each of the horseshoes $\Lambda_n$ has associated a natural ergodic invariant measure $\nu_n$ of full dimension. On the other hand, there exists no ergodic measure of maximal dimension. Note that it is nevertheless easy to find a nonergodic measure of maximal dimension: simply consider the measure $\mu = \sum_{n=1}^{\infty} \nu_n/2^n$.

We now present several consequences of Theorem 6 and its proof. A first result pertains to the connection with invariant measures of full dimension.

Corollary 7. Under the hypotheses of Theorem 6, the following properties are equivalent:
1. there exists an invariant measure of full dimension on $\Lambda$;
2. there exists an ergodic invariant measure of full dimension on $\Lambda$;
3. $\sup\{\dim_H \nu : \nu \in \mathcal{M}\} = \dim_H \Lambda$.

Proof. Clearly, $\sup\{\dim_H \nu : \nu \in \mathcal{M}_E\} \le \sup\{\dim_H \nu : \nu \in \mathcal{M}\} \le \dim_H \Lambda$. It is established in [3] (see also the discussion in the introduction) that the two suprema are equal. Theorem 6 shows that the common value is attained at an ergodic measure. This implies the desired statement.

Let now $\mathcal{N} \subset \mathcal{M}$ be the set of measures that are accumulation points of some sequence $(\nu_n)_{n \in \mathbb{N}}$ in $\mathcal{M}_E$ satisfying

$$\lim_{n \to \infty} \dim_H \nu_n = \sup\{\dim_H \nu : \nu \in \mathcal{M}_E\}. \tag{39}$$

The following is another consequence of Theorem 6 and its proof.

Corollary 8. Under the hypotheses of Theorem 6, the supremum in (39) is a maximum, and for each $m \in \mathcal{N}$ the following properties hold:
1. $d(m) = \max\{\dim_H \nu : \nu \in \mathcal{M}_E\}$;
2. if neither $\phi_u$ nor $\phi_s$ is cohomologous to a constant, then one of the following exclusive alternatives holds:
   a) $\lambda_u(m) \in I_u$ and $\lambda_s(m) \in I_s$, with $m = \nu_{p,q}$ for some $p, q \in \mathbb{R}$;
   b) $\lambda_u(m) = \lambda_u^{\min}$ and $\lambda_s(m) = \lambda_s^{\max}$;
3. there exists $\nu \in \mathcal{M}_E$ such that $\lambda_u(\nu) = \lambda_u(m)$, $\lambda_s(\nu) = \lambda_s(m)$ and $h_\nu(f) = h_m(f)$;
4. if $(\nu_n)_{n \in \mathbb{N}}$ as in (39) has limit $m$ then $h_{\nu_n}(f) \to h_m(f)$ as $n \to \infty$.

We note that under the assumption of Statement 2b the measure $m$ cannot be an equilibrium measure in the two-parameter family $\{\nu_{p,q} : p, q \in \mathbb{R}\}$. Nevertheless, somewhat surprisingly, the supremum of the Hausdorff dimension of ergodic invariant measures can be arbitrarily approximated by the dimension of the measures in this family.

Theorem 9. Under the hypotheses of Theorem 6, the following holds:

$$\sup\{\dim_H \nu_{p,q} : p, q \in \mathbb{R}\} = \sup\{\dim_H \nu : \nu \in \mathcal{M}_E\}. \tag{40}$$

Proof. Let $\mu \in \mathcal{M}_E$ be a measure of maximal dimension. Recall that the existence of $\mu$ is guaranteed by Theorem 6. We shall consider three cases:
1. $\lambda_s(\mu) \in I_s$ and $\lambda_u(\mu) \in I_u$;
2. $\lambda_u(\mu) \notin I_u$, $\lambda_s(\mu) \notin I_s$, and neither $\phi_u$ nor $\phi_s$ is cohomologous to a constant;
3. $\phi_u$ or $\phi_s$ is cohomologous to a constant.

We note that under the assumptions of Case 1 neither $\phi_u$ nor $\phi_s$ is cohomologous to a constant. Therefore Corollary 8 implies that the above three cases cover all possibilities. Under the assumptions of Case 1 it follows from Corollary 8 that $\mu = \nu_{p,q}$ for some $p, q \in \mathbb{R}$; thus (40) holds. In the second case, Corollary 8 implies that $\lambda_u(\mu) = \lambda_u^{\min}$ and $\lambda_s(\mu) = \lambda_s^{\max}$. Therefore (40) is a consequence of the following lemma.

Lemma 8. If $\mu \in \mathcal{M}_E$ is a measure of maximal dimension such that $\lambda_u(\mu) = \lambda_u^{\min}$ and $\lambda_s(\mu) = \lambda_s^{\max}$, then $\dim_H \nu_{t,t} \to d(\mu)$ as $t \to \infty$.

Proof of the lemma. Since $\nu_{t,t}$ is the equilibrium measure of $-t\phi_u + t\phi_s$,

$$h(t, t) - t\lambda_u(t, t) + t\lambda_s(t, t) \ge h_\mu(f) - t\lambda_u(\mu) + t\lambda_s(\mu). \tag{41}$$

By Proposition 4 we obtain

$$\lambda_u(t, t) > \lambda_u(\mu) \quad\text{and}\quad \lambda_s(t, t) < \lambda_s(\mu). \tag{42}$$

Therefore (41) implies that $h(t, t) > h_\mu(f)$ for all $t \ge 0$. Let now $\varepsilon > 0$ and $t > h_{\mathrm{top}}(f)/\varepsilon$. Using (41) we obtain

$$\lambda_u(t, t) - \lambda_s(t, t) \le \frac{h(t, t) - h_\mu(f)}{t} + \lambda_u(\mu) - \lambda_s(\mu) < \frac{\varepsilon h(t, t)}{h_{\mathrm{top}}(f)} + \lambda_u(\mu) - \lambda_s(\mu) \le \varepsilon + \lambda_u(\mu) - \lambda_s(\mu).$$

Therefore (42) yields $\lambda_u(t, t) \to \lambda_u(\mu)$ and $\lambda_s(t, t) \to \lambda_s(\mu)$ as $t \to \infty$. Since $\mu$ is an ergodic measure of maximal dimension, combining $h(t, t) > h_\mu(f)$ with (9) establishes the desired statement.

Finally we consider the third case.

Lemma 9. If $\phi_u$ or $\phi_s$ is cohomologous to a constant then $\mu = \nu_{p,q}$ for some $p, q \in \mathbb{R}$.

Proof of the lemma. If $\phi_u$ and $\phi_s$ are both cohomologous to a constant, then $\lambda_u(\nu) = \lambda_u(\nu_{0,0})$ and $\lambda_s(\nu) = \lambda_s(\nu_{0,0})$ for all $\nu \in \mathcal{M}_E$. Setting $m = \mu$, it follows from Lemma 3 that $\mu = \nu_{0,0}$. We now assume that only one of the functions $\phi_u$ and $\phi_s$ is cohomologous to a constant. Without loss of generality we shall only consider the case when $\phi_s$ is cohomologous to a constant. This implies that for all $\nu \in \mathcal{M}_E$,

$$\lambda_s(\nu) = \lambda_s(\nu_{0,0}). \tag{43}$$

Let us assume that $\lambda_u(\mu) \in I_u$. By Proposition 4 there exists $p \in \mathbb{R}$ such that $\lambda_u(p, 0) = \lambda_u(\mu)$. Again Lemma 3 implies that $\mu = \nu_{p,0}$. We now assume that $\lambda_u(\mu) \notin I_u$. If $\lambda_u(\mu) = \lambda_u^{\max}$, then the fact that $\nu_{0,0}$ is the measure of maximal entropy combined with (9) implies that $\dim_H \nu_{0,0} > \dim_H \mu$. Since $\mu$ is a measure of maximal dimension, we obtain a contradiction. Thus $\lambda_u(\mu) \ne \lambda_u^{\max}$. To complete the proof of the lemma we shall show that $\lambda_u(\mu) \ne \lambda_u^{\min}$. Otherwise Lemma 8 implies that $\dim_H \nu_{t,t} \to \dim_H \mu$ as $t \to \infty$. Since $\phi_s$ is cohomologous to a constant it follows from the uniqueness of the equilibrium measure and (43) that $\nu_{t,t} = \nu_{t,0}$ for all $t \in \mathbb{R}$. Hence

$$\dim_H \nu_{t,0} \to \dim_H \mu \quad\text{as } t \to \infty. \tag{44}$$

On the other hand, Proposition 4 together with (9) and (43) imply that $d_u(\cdot, 0) + d_s(\cdot, 0)$ is strictly decreasing on $[t_u, \infty)$ and thus (44) is a contradiction to the fact that $\mu$ is a measure of maximal dimension. We have established (40) in the three cases. This completes the proof.

An interesting consequence of Theorem 9 is that the number $\delta(f)$ (as defined in the introduction) can be arbitrarily approximated by the dimension of the measures $\nu_{p,q}$: it follows immediately from Theorem 9 and (4) that

$$\sup\{\dim_H \nu_{p,q} : p, q \in \mathbb{R}\} = \sup\{\dim_H \nu : \nu \in \mathcal{M}\}.$$

4.2. Characterization. We now provide a characterization of the measures of maximal dimension under the assumption that neither $\phi_u$ nor $\phi_s$ is cohomologous to a constant.

Theorem 10. Let $f$ be a $C^{1+\varepsilon}$ surface diffeomorphism, and let $\Lambda$ be a compact locally maximal hyperbolic set of $f$ such that $f|\Lambda$ is topologically mixing, and neither $\phi_u$ nor $\phi_s$ is cohomologous to a constant.
1. If no nontrivial linear combination of $\phi_u$ and $\phi_s$ is cohomologous to a constant and $\mu = \nu_{p,q}$ is a measure of maximal dimension, then $0 \le p \le t_u$ and $0 \le q \le t_s$.
2. If some nontrivial linear combination $-\phi_u + \alpha\phi_s$ is cohomologous to a constant (in which case $\alpha > 0$) and $\mu \in \mathcal{M}_E$ is a measure of maximal dimension, then there exists

$$t \in [\min\{t_u, \alpha^{-1} t_s\}, \max\{t_u, \alpha^{-1} t_s\}] \tag{45}$$

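such that $\mu = \nu_{\kappa t,(1-\kappa)\alpha t}$ for every $\kappa \in [0, 1]$.

Proof. Assume first that no nontrivial linear combination of $\phi_u$ and $\phi_s$ is cohomologous to a constant and let $\nu_{p_0,q_0}$ be an ergodic measure of maximal dimension. In particular, $\phi_s$ is not cohomologous to a constant. Set $b = \lambda_s(p_0, q_0)$ and consider the curve $\gamma_s$ given by Proposition 5. The uniqueness of $\gamma_s$ implies that $q_0 = \gamma_s(p_0)$.

Lemma 10. The function $p \mapsto \lambda_u(p, \gamma_s(p))$ is strictly decreasing.

Proof of the lemma. It follows from the definition of $\gamma_s$ that

$$\partial_p \lambda_s(p, \gamma_s(p)) + \partial_q \lambda_s(p, \gamma_s(p))\,\gamma_s'(p) = 0,$$

and hence (17) implies that $\gamma_s'(p) = -\partial_p \partial_q Q/\partial_q^2 Q$, with the derivatives on the right-hand side computed at $(p, \gamma_s(p))$. Again applying (17) yields

$$\partial_p \lambda_u(p, \gamma_s(p)) + \partial_q \lambda_u(p, \gamma_s(p))\,\gamma_s'(p) = -\partial_p^2 Q - \partial_p \partial_q Q\,(-\partial_p \partial_q Q/\partial_q^2 Q) = -[\partial_p^2 Q\,\partial_q^2 Q - (\partial_p \partial_q Q)^2]/\partial_q^2 Q. \tag{46}$$

Consider now the bilinear form

$$A(\varphi_1, \varphi_2) = \partial_{t_1} \partial_{t_2} P(-p\phi_u + q\phi_s + t_1\varphi_1 + t_2\varphi_2)|_{t_1 = t_2 = 0}.$$

Then $A(v\phi_u + w\phi_s, v\phi_u + w\phi_s)$ coincides with

$$\begin{pmatrix} v & w \end{pmatrix} \begin{pmatrix} A(\phi_u, \phi_u) & A(\phi_u, \phi_s) \\ A(\phi_s, \phi_u) & A(\phi_s, \phi_s) \end{pmatrix} \begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} v & w \end{pmatrix} B \begin{pmatrix} v \\ w \end{pmatrix},$$

where

$$B = \begin{pmatrix} \partial_p^2 Q & -\partial_p \partial_q Q \\ -\partial_q \partial_p Q & \partial_q^2 Q \end{pmatrix}.$$

Since no nontrivial linear combination of $\phi_u$ and $\phi_s$ is cohomologous to a constant, if $(v, w) \ne 0$ then $A(v\phi_u + w\phi_s, v\phi_u + w\phi_s) > 0$ (see [12]) and hence $B$ is positive definite. In particular $\det B$ (which coincides with the quantity in square brackets in (46)) is positive. Furthermore, since $\phi_s$ is not cohomologous to a constant we have $\partial_q^2 Q > 0$ and hence $\partial_p \lambda_u(p, \gamma_s(p)) + \partial_q \lambda_u(p, \gamma_s(p))\,\gamma_s'(p) < 0$. This establishes the desired statement.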

Assume now that $p_0 < 0$. Lemma 10 implies that

$$\lambda_u(p_0, q_0) > \lambda_u(0, \gamma_s(0)). \tag{47}$$

Since $\nu_{0,\gamma_s(0)}$ is the equilibrium measure of $\gamma_s(0)\phi_s$ we have $h(0, \gamma_s(0)) + \gamma_s(0)\lambda_s(0, \gamma_s(0)) \ge h(p_0, q_0) + \gamma_s(0)\lambda_s(p_0, q_0)$. Furthermore $\lambda_s(0, \gamma_s(0)) = \lambda_s(p_0, q_0)$ and hence

$$h(0, \gamma_s(0)) \ge h(p_0, q_0). \tag{48}$$

Combining (47) and (48) with (9) yields

$$\frac{h(p_0, q_0)}{\lambda_u(p_0, q_0)} - \frac{h(p_0, q_0)}{\lambda_s(p_0, q_0)} < \frac{h(0, \gamma_s(0))}{\lambda_u(0, \gamma_s(0))} - \frac{h(0, \gamma_s(0))}{\lambda_s(0, \gamma_s(0))}.$$

But this contradicts the fact that $\nu_{p_0,q_0}$ is an ergodic measure of maximal dimension. This shows that $p_0 \ge 0$. Replacing $m$ by $\nu_{p_0,q_0}$ and applying the same arguments as in the proof of Lemma 1 (see Eqs. (27)–(32)) yields $p_0 \le h(p_0, q_0)/\lambda_u(p_0, q_0)$. Therefore (15) implies $p_0 \le t_u$. The number $p$ in Lemma 1 is a priori not necessarily unique. However, the uniqueness follows immediately from Lemma 10. Similar arguments show that $0 \le q_0 \le t_s$.

Assume now that some nontrivial linear combination $-\phi_u + \alpha\phi_s$ is cohomologous to a constant. This implies that

$$\nu_{t,0} = \nu_{0,\alpha t} = \nu_{\kappa t,(1-\kappa)\alpha t} \tag{49}$$

for every $\kappa \in [0, 1]$. Therefore, the function $d_u + d_s$ is determined by the function $t \mapsto \dim_H \nu_{t,0}$. By Proposition 4, $d_u(\cdot, 0)$ is strictly increasing on $(-\infty, t_u]$ and strictly decreasing on $[t_u, \infty)$. Furthermore $d_s(0, \cdot)$ is strictly increasing on $(-\infty, t_s]$ and strictly decreasing on $[t_s, \infty)$. It follows from (49) that $t \mapsto d_u(t, 0) + d_s(t, 0)$ is strictly increasing on $(-\infty, \min\{t_u, \alpha^{-1} t_s\}]$ and strictly decreasing on $[\max\{t_u, \alpha^{-1} t_s\}, \infty)$. Therefore, its maximum can only be attained at some point $t$ as in (45). If $\mu$ is an ergodic measure of maximal dimension, it follows from Corollary 8, Lemma 8, and (49) that there exists $t$ as in (45) such that $\mu = \nu_{t,0}$. This completes the proof.

We remark that an alternative proof of Lemma 10 can be obtained using the curves $\gamma_u$ and $\gamma_s$. In the following we provide a brief sketch. Let $p_1, p_2 \in \mathbb{R}$ with $p_1 < p_2$ and denote by $\gamma_u^1, \gamma_u^2$ the unique real-analytic functions satisfying $\lambda_u(\gamma_u^k(q), q) = \lambda_u(p_k, \gamma_s(p_k))$ for all $q \in \mathbb{R}$ and $k = 1, 2$. Since no nontrivial linear combination of $\phi_u$ and $\phi_s$ is cohomologous to a constant, one can show that the curves $\gamma_u^1$ and $\gamma_u^2$ do not intersect in $\mathbb{R}^2$ and hence $p_1 < \gamma_u^2(\gamma_s(p_1))$. The result follows from the fact that $\lambda_u(\cdot, \gamma_s(p_1))$ is strictly decreasing (see Proposition 4).

The function $d_s(\cdot, 0) + d_u(\cdot, 0)$ is real-analytic. Therefore, when $d_s(\cdot, 0) + d_u(\cdot, 0)$ is not constant, it has at most finitely many maxima in $[\min\{t_u, \alpha^{-1} t_s\}, \max\{t_u, \alpha^{-1} t_s\}]$. Under the assumptions of Theorem 10 it is thus an immediate consequence that if $\phi_u$ and $\phi_s$ are linearly dependent as cohomology classes then there exist at most finitely many ergodic measures of maximal dimension. We note that an analogous result to that in Statement 2 of Theorem 10 was obtained earlier in the case of hyperbolic polynomial automorphisms of $\mathbb{C}^2$ (see [16]).

5. Applications

We now provide several nontrivial applications of the results in the previous section.

5.1. Regular dependence on the diffeomorphism. In this section we discuss the dependence of the Hausdorff dimension of the measure of maximal dimension (given by Theorem 6) on the diffeomorphism. Namely, we are interested in how the quantity $\delta(f) = \sup\{\dim_H \nu : \nu \in \mathcal{M}\}$ varies with $f$ (restricted to the locally maximal hyperbolic set $\Lambda$). More precisely, consider an open neighborhood $U$ of $\Lambda$ such that $\Lambda = \bigcap_{n \in \mathbb{Z}} f^n U$. Then (see for example [8]) there exists a neighborhood $\mathcal{U}$ of $f$ in the space of $C^r$ diffeomorphisms (with respect to the $C^r$ topology) such that $\Lambda_g = \bigcap_{n \in \mathbb{Z}} g^n U$ is a locally maximal hyperbolic set of $g$ for all $g \in \mathcal{U}$. Moreover, $g|\Lambda_g$ is topologically mixing. We are interested in the regularity of the map $g \mapsto \delta(g)$. One might think that the regularity of this map could be determined from that of the Hausdorff dimension of the set $\Lambda_g$. In fact it was proven by Mañé in [9] that the map $g \mapsto \dim_H \Lambda_g$ is of class $C^{r-1}$. On the other hand, it follows from work of McCluskey and Manning in [10] that the identity $\delta(f) = \dim_H \Lambda$ fails whenever $-t_u\phi_u$ and $t_s\phi_s$ are not cohomologous, and thus one can show that for every $g$ in an open and dense subset of $\mathcal{U}$ (with respect to the $C^r$ topology)

we have $\delta(g) < \dim_H \Lambda_g$. This indicates that a priori one cannot study the regularity of the map $g \mapsto \delta(g)$ using information about the regularity of the map $g \mapsto \dim_H \Lambda_g$. It is shown in [10] that $g \mapsto \delta(g)$ is continuous. We shall discuss the higher regularity of the map $g \mapsto \delta(g)$ under some reasonable assumptions.

It is well-known that there exists a neighborhood $\mathcal{U}$ of $f$ and $\alpha \in (0, 1]$ such that for all $g \in \mathcal{U}$ there exists a unique $\alpha$-Hölder homeomorphism $h_g : \Lambda \to \Lambda_g$ satisfying $g \circ h_g = h_g \circ f$ (see for example [8]). For a given $g \in \mathcal{U}$ we denote by $\mathcal{M}_g$ the space of $g$-invariant probability measures on $\Lambda_g$ and by $\mathcal{M}_{E,g}$ the subset of ergodic measures. For each $g \in \mathcal{U}$ we define a map $T_g : \mathcal{M} \to \mathcal{M}_g$ by $T_g(\nu) = (h_g)_*\nu$, where $((h_g)_*\nu)(A) = \nu(h_g^{-1}(A))$ for all Borel sets $A \subset \Lambda_g$. It is easy to see that $T_g$ is a homeomorphism. Moreover it follows directly from the definition of $T_g$ that $(f|\Lambda, \nu)$ and $(g|\Lambda_g, T_g(\nu))$ are measure-theoretically isomorphic, in particular $h_{T_g(\nu)}(g) = h_\nu(f)$ for all $\nu \in \mathcal{M}$ and $g \in \mathcal{U}$.

Theorem 11. Let $f$ be a $C^r$ surface diffeomorphism, for some $r \ge 4$, and let $\Lambda$ be a compact locally maximal hyperbolic set of $f$ such that $f|\Lambda$ is topologically mixing. Assume that neither $\phi_u$ nor $\phi_s$ is cohomologous to a constant and that $f$ admits a unique measure $\mu \in \mathcal{M}_E$ of maximal dimension. Assume furthermore that $\mu = \nu_{p_0,q_0}$ for some $p_0, q_0 \in \mathbb{R}$, and that $D^2(d_u + d_s)(p_0, q_0)$ is invertible. Then $g \mapsto \delta(g)$ is of class $C^{r-3}$ in a neighborhood $\mathcal{U}$ of $f$.

Proof. Since $\delta(g) = \sup\{\dim_H \nu : \nu \in \mathcal{M}_{E,g}\}$, it is sufficient to restrict ourselves to ergodic measures. Consider a neighborhood $\mathcal{U}$ of $f$ such that for each $g \in \mathcal{U}$ neither $\phi_u$ nor $\phi_s$ (defined with respect to $g$) is cohomologous to a constant. The existence of such a neighborhood is a simple consequence of work of McCluskey and Manning in [10] combined with Livshitz's theorem. We now need the following lemma.

Lemma 11. Let $V$ be an open neighborhood of $\nu_{p_0,q_0}$ in $\mathcal{M}_E$.
Then there exists a neighborhood U of f such that each measure ν ∈ ME,g of maximal dimension satisfies ν ∈ Tg (V ). Proof of the lemma. It follows from work of McCluskey and Manning in [10] that if U is small, then the functions δV (g) = sup{dimH ν : ν ∈ Tg (V )} and δME \V (g) = sup{dimH ν : ν ∈ Tg (ME \ V )} are continuous in g. Since ME \ V does not contain an ergodic measure of maximal dimension we may conclude from Corollary 8 that δME \V (f ) < δ(f ). In order to see this, notice that otherwise there would exist a sequence of measures νn ∈ ME \ V with dimH νn → δ(f ) as n → ∞. For any accumulation point m ∈ N (see (39)) of this sequence, it follows from Statement 2 of Corollary 8 and the uniqueness of the measure νp0 ,q0 that λu (µ) ∈ Iu and λs (µ) ∈ Is . By Statement 3 in the same theorem there exists an ergodic measure of maximal dimension in ME \ V , but this violates the uniqueness of νp0 ,q0 . By making U smaller if necessary, it follows from the continuous dependence of the quantities δV (g) and δME \V (g) on g that δME \V (g) < δ(g) for all g ∈ U which proves the lemma. Define a function Q : U × R2 → R by Q(g, p, q) = P (g, −pφu + qφs ), where P (g, ·) denotes the topological pressure of g|g . It follows from work of Ma˜ne´ in [9] that Q is of class C r−1 . Therefore (9), (17), and (20) imply that the function : U ×R2 → R

defined by (g, p, q) = dimH νp,q (g) is of class C r−2 . Here νp,q (g) denotes the equilibrium measure of −pφu + qφs with respect to g|g (with φu and φs defined with respect to g). Since neither φu nor φs is cohomologous to a constant we may conclude from Proposition 4 that λu (p0 , q0 ) ∈ Iu and λs (p0 , q0 ) ∈ Is . Let now V be an open neighborhood of νp0 ,q0 in ME such that λu (ν) ∈ Iu and λs (ν) ∈ Is for all ν ∈ V . Without loss of generality we may assume that U is the same as in Lemma 11. Since Tg is a homeomorphism and Iu and Is are open intervals for all g ∈ U, it follows that λu (ν) ∈ Iu and λs (ν) ∈ Is for all ν ∈ Tg (V ). Therefore Corollary 8 and Lemma 11 imply that each measure µ(g) ∈ ME,g of maximal dimension coincides with νp,q (g) for some p, q ∈ R. The map (p, q) → νp,q is real-analytic and thus in particular continuous (see [9]). This implies that every neighborhood V of νp0 ,q0 in ME also contains a neighborhood of νp0 ,q0 in {νp,q : p, q ∈ R}. By making V and U smaller if necessary, using the fact that is of class C r−2 we may assume that (g, ·) has a unique extremum νp(g),q(g) (g) ∈ Tg (V ) ∩ {νp,q (g) : p, q ∈ R} and this must be a maximum. We conclude that δ(g) coincides with the Hausdorff dimension of the unique measure νp(g),q(g) (g) such that (∂p , ∂q )(g, p(g), q(g)) = 0.

(50)

Observe that (∂p , ∂q ) is of class C r−3 . Applying the Implicit Function Theorem to Eq. (50) completes the proof of the theorem. We remark that the assumptions of Theorem 11 for example hold if f preserves volume (provided that neither φu nor φs is cohomologous to a constant), and more generally when f admits an ergodic invariant measure of full dimension (see Sect. 2 for the definition).
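The final step of the proof of Theorem 11 applies the Implicit Function Theorem to Eq. (50). As a hedged sketch of what this yields, write $F(g, p, q) = \dim_H \nu_{p,q}(g)$ for the function considered in the proof. The letters $F$ and $G$ are placeholders chosen here and are not the notation of the original. The sketch also assumes that $F(f, \cdot) = d_u + d_s$, which is how $\dim_H \nu_{p,q}$ is expressed in the proof of Lemma 7.

```latex
% F(g,p,q) := \dim_H \nu_{p,q}(g); the names F and G are placeholders
% introduced for this sketch only.
\begin{align*}
G(g,p,q) &:= (\partial_p F, \partial_q F)(g,p,q),
   \qquad G(f,p_0,q_0) = 0
   && \text{$(p_0,q_0)$ maximizes } F(f,\cdot) \\
D_{(p,q)} G(f,p_0,q_0) &= D^2 (d_u + d_s)(p_0,q_0)
   && \text{invertible by hypothesis} \\
&\Longrightarrow\ \text{there is a } C^{r-3} \text{ map } g \mapsto (p(g),q(g))
   \text{ near } f \text{ with } G(g,p(g),q(g)) = 0, \\
\delta(g) &= F(g,p(g),q(g)).
\end{align*}
```

Since $F$ is of class $C^{r-2}$ and $g \mapsto (p(g), q(g))$ is of class $C^{r-3}$, the composition $\delta(g)$ is of class $C^{r-3}$, in line with the statement of Theorem 11.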

5.2. Gibbs property of measures of maximal dimension. We now provide conditions which imply that every ergodic measure of maximal dimension is an equilibrium measure of a Hölder continuous function, and thus has the Gibbs property. We continue to use the notation of Sect. 5.1.

Theorem 12. Let $f$ be a $C^2$ surface diffeomorphism, and let $\Lambda$ be a compact locally maximal hyperbolic set of $f$ such that $f|\Lambda$ is topologically mixing. Assume that no measure of maximal dimension $m$ satisfies $\lambda_u(m) = \lambda_u^{\min}$ and $\lambda_s(m) = \lambda_s^{\max}$. Then there exists an open neighborhood $\mathcal{U}$ of $f$ in the $C^2$ topology such that for every $g \in \mathcal{U}$ each ergodic measure of maximal dimension $\mu_g$ of $g|\Lambda_g$ is an equilibrium measure of a Hölder continuous function.

Proof. We first consider the map $f$. The assumptions imply that $\phi_u$ and $\phi_s$ cannot be simultaneously cohomologous to a constant. This follows from Theorem 6. We define

$$\delta_b(f) = \sup\{d(\nu) : \lambda_u(\nu) = \lambda_u^{\min} \text{ and } \lambda_s(\nu) = \lambda_s^{\max}\}$$

(with "b" standing for boundary case). In a similar way to that in the proof of Lemma 7 we are able to show that there exists $\nu \in \mathcal{M}_E$ with $\lambda_u(\nu) = \lambda_u^{\min}$ and $\lambda_s(\nu) = \lambda_s^{\max}$ such that

$\dim_H \nu = \delta_b(f)$. By Theorem 6 there exists an ergodic measure $\mu$ of maximal dimension for $f|\Lambda$, and therefore the hypothesis in the theorem implies that $\delta_b(f) < \delta(f)$. By Corollary 8 we conclude that $\mu = \nu_{p,q}$ for some $p, q \in \mathbb{R}$. McCluskey and Manning [10] showed that $g \mapsto \delta(g)$ is continuous in the $C^2$ topology. Moreover, one can modify their proof to show that $g \mapsto \delta_b(g)$ is also continuous. Therefore, there exists an open neighborhood $\mathcal{U}$ of $f$ in the $C^2$ topology such that $\delta_b(g) < \delta(g)$ for all $g \in \mathcal{U}$. In particular, if $g \in \mathcal{U}$ then no measure $m_g$ of maximal dimension for $g|\Lambda_g$ satisfies $\lambda_u(m_g) = \lambda_u^{\min}$ and $\lambda_s(m_g) = \lambda_s^{\max}$. Arguing as above (but now for $g \in \mathcal{U}$ instead of $f$), we conclude that every ergodic measure of maximal dimension for $g|\Lambda_g$ is an equilibrium measure of a Hölder continuous function.

It follows immediately from the proof of Theorem 6 that if $\phi_u$ and $\phi_s$ are both cohomologous to a constant, then the measure of maximal entropy is the unique ergodic measure of maximal dimension. To the best of our knowledge, all known examples of hyperbolic surface diffeomorphisms satisfy one of the following exclusive alternatives:
1. no measure of maximal dimension $m$ satisfies $\lambda_u(m) = \lambda_u^{\min}$ and $\lambda_s(m) = \lambda_s^{\max}$;
2. $\phi_u$ and $\phi_s$ are cohomologous to a constant.

Conjecturally these two cases cover all possibilities. This would imply that every ergodic measure of maximal dimension has the Gibbs property.

Acknowledgement. This paper was written while Christian Wolf was a postdoctoral fellow at the Center for Mathematical Analysis, Geometry, and Dynamical Systems of Instituto Superior Técnico in Lisbon, and he would like to thank the Department of Mathematics for its hospitality.

References
1. Barreira, L., Pesin, Ya., Schmeling, J.: Dimension and product structure of hyperbolic measures. Ann. Math. (2) 149, 755–783 (1999)
2. Barreira, L., Saussol, B.: Hausdorff dimension of measures via Poincaré recurrence. Commun. Math. Phys. 219, 443–463 (2001)
3. Barreira, L., Wolf, C.: Pointwise dimension and ergodic decompositions, Preprint
4. Denker, M., Grillenberger, C., Sigmund, K.: Ergodic theory on compact spaces. Lect. Notes in Math. 527, Berlin-Heidelberg-New York: Springer, 1976
5. Denker, M., Urbanski, M.: On Sullivan’s conformal measures for rational maps on the Riemann sphere. Nonlinearity 4, 365–384 (1991)
6. Eckmann, J.-P., Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod. Phys. 57, 617–656 (1985)
7. Friedland, S., Ochs, G.: Hausdorff dimension, strong hyperbolicity and complex dynamics. Discrete Contin. Dynam. Systems 4, 405–430 (1998)
8. Katok, A., Hasselblatt, B.: Introduction to the modern theory of dynamical systems. In: Encyclopedia of Mathematics and its Applications 54, Cambridge: Cambridge University Press, 1995
9. Mañé, R.: The Hausdorff dimension of horseshoes of diffeomorphisms of surfaces. Bol. Soc. Brasil. Mat. (N.S.) 20, 1–24 (1990)
10. McCluskey, H., Manning, A.: Hausdorff dimension for horseshoes. Ergodic Theory Dynam. Systems 3, 251–260 (1983)
11. Pesin, Ya.: Dimension theory in dynamical systems: contemporary views and applications. Chicago Lectures in Mathematics, Chicago: Chicago University Press, 1997
12. Ruelle, D.: Thermodynamic formalism. In: Encyclopedia of Mathematics and its Applications 5, Reading, MA: Addison-Wesley, 1978
13. Ruelle, D.: Repellers for real analytic maps. Ergodic Theory Dynam. Systems 2, 99–107 (1982)
14. Sigmund, K.: Generic properties of invariant measures for axiom A-diffeomorphisms. Invent. Math. 11, 99–109 (1970)
15. Urbanski, M.: Measures and dimensions in conformal dynamics. Bull. Am. Math. Soc., to appear

16. Wolf, C.: On measures of maximal and full dimension for polynomial automorphisms of $\mathbb{C}^2$. Trans. Am. Math. Soc., to appear
17. Young, L.-S.: Dimension, entropy and Lyapunov exponents. Ergodic Theory Dynam. Systems 2, 109–124 (1982)

Communicated by J.L. Lebowitz

Commun. Math. Phys. 239, 115–153 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0869-6

Generalized Lamé Operators

Oleg Chalykh (1), Pavel Etingof (2), Alexei Oblomkov (2)

(1) Department of Mathematics, Cornell University, Ithaca, NY 14853, USA. E-mail: [email protected]
(2) Department of Mathematics, MIT, 77 Mass. Ave, Cambridge, MA 02139, USA. E-mail: [email protected]; [email protected]

Received: 2 December 2002 / Accepted: 14 February 2003 Published online: 28 May 2003 – © Springer-Verlag 2003

Abstract: We introduce a class of multidimensional Schrödinger operators with elliptic potential which generalize the classical Lamé operator to higher dimensions. One natural example is the Calogero–Moser operator, others are related to the root systems and their deformations. We conjecture that these operators are algebraically integrable, which is a proper generalization of the finite-gap property of the Lamé operator. Using earlier results of Braverman, Etingof and Gaitsgory, we prove this under the additional assumption of the usual, Liouville integrability. In particular, this proves the Chalykh–Veselov conjecture for the elliptic Calogero–Moser problem for all root systems. We also establish algebraic integrability in all known two-dimensional cases. A general procedure for calculating the Bloch eigenfunctions is explained. It is worked out in detail for two specific examples: one is related to the $B_2$ case, another one is a certain deformation of the $A_2$ case. In these two cases we also obtain similar results for the discrete versions of these problems, related to the difference operators of Macdonald–Ruijsenaars type.

1. Introduction

In this paper we consider higher-dimensional analogues of the classical Lamé operator
$$L = -\frac{d^2}{dz^2} + m(m+1)\wp(z), \qquad m \in \mathbb{Z}_+. \tag{1.1}$$

Here $\wp(z) = \wp(z|1, \tau)$ is the Weierstrass $\wp$-function with periods $1, \tau$. More generally, we are interested in multivariable analogues of the so-called elliptic algebro-geometric operators $L = -d^2/dz^2 + u(z)$, which appeared in the finite-gap theory initiated

On leave of absence from: Advanced Education and Science Centre, Moscow State University, Moscow 119899, Russia

in the 70’s by Novikov [1]. This theory provides a beautiful interplay between the spectral theory and algebraic geometry, and the Lamé operator is the simplest and best known member of this family of operators (see [2] for a survey). Since then there have been several attempts to generalize some parts of that theory to higher dimensions, most notably [3, 4], see also [5–7]. We should stress, however, that in general this leads to differential operators with matrix coefficients. On the other hand, in [8] it was suggested to consider the quantum elliptic Calogero–Moser problem and its versions related to the root systems [9] as natural multidimensional analogues of the Lamé operator. More specifically, a conjecture from [8] says that for integer values of the coupling parameters the corresponding Schrödinger operators are algebraically integrable (this is a proper generalization of the properties of the algebro-geometric operators to higher dimensions, see [10, 11] and Sect. 3 below). For the rational and trigonometric versions of the Calogero–Moser problem this was proved in [10]. The elliptic version, however, turned out to be more difficult: until now it was known for the $A_n$ case only, due to [11]. One of the results of the present paper is a proof of that conjecture of [8] for all root systems. In fact, our approach applies to a wider class of Schrödinger operators, which is an elliptic version of the class introduced in [12], see also [13]. Their singularities are the second order poles along a set of hyperplanes satisfying some special conditions, which encode the triviality of the local monodromy around each of the poles. We call the corresponding Schrödinger operators the generalized Lamé operators. The Calogero–Moser operators with integer coupling parameters give particular examples of such operators.
Our main result says that for a generalized Lamé operator its algebraic integrability follows from the usual, Liouville integrability (in a slightly stronger sense). The proof uses a criterion from [11], based on differential Galois theory. In dimension one we recover in this way the main result of [14]. For the elliptic Calogero–Moser problem the complete integrability was proved (for all root systems) by Cherednik [15], and this allows us to prove the conjecture of [8]. A complete description of all generalized Lamé operators is an open problem. All known (irreducible) examples in dimension > 1 are related to the root systems and their deformations which appeared in [13, 16]; we list them all in Sect. 4. We conjecture that they are all algebraically integrable. Using our main result, we check this for all two-dimensional examples, since the complete integrability is relatively easy to work out in that case. One important property of the algebraically integrable operators is that their eigenfunctions can be calculated explicitly (at least, in principle). In Sect. 5 we explain how to find the Bloch eigenfunctions for a given integrable generalized Lamé operator, in particular, for the elliptic Calogero–Moser problem. As a result, we will see that the Bloch solutions are parametrized by the points of an algebraic variety, which is a covering of a product of elliptic curves (in perfect agreement with the situation in dimension one, due to Krichever [17]). Let us mention that for the elliptic Calogero–Moser problem (in the $A_n$ case) the Bloch eigenfunctions were calculated by Felder and Varchenko [18]. Our procedure is different and more general (at the cost of being less effective). We also explain how these Bloch solutions can be used to construct the discrete spectrum eigenstates for the Calogero–Moser problem.
In the last three sections of the paper we consider two particular examples of the generalized Lamé operators in dimension 2, for which we make the formulas for the Bloch solutions very explicit. The first example is
$$L = -\Delta + 2\wp(x) + 2\wp(y) + 4\wp(x - y) + 4\wp(x + y). \tag{1.2}$$

This is a special case of the elliptic Calogero–Moser problem of the $B_2$-type. Our second example is
$$L = -\Delta + 2\wp(x) + 2(a^2 + b^2)\,\wp(ax + by) + 2(\bar a^2 + \bar b^2)\,\wp(\bar a x + \bar b y), \tag{1.3}$$
$$a = \frac{-1 + i\sqrt{3}\sin\alpha}{2}, \qquad \bar a = \frac{-1 - i\sqrt{3}\sin\alpha}{2}, \qquad b = -\bar b = \frac{i\sqrt{3}\cos\alpha}{2},$$
where $\alpha$ is a complex parameter. Such an operator was considered by Hietarinta [19] who showed that it admits a commuting operator of order 3. Its rational version $\wp(z) = z^{-2}$ corresponds to a specific choice of the parameters in the family of two-dimensional Schrödinger operators introduced by Berest and Lutsenko [20] in connection with Huygens’ Principle (see Sect. 4 of [13] for details). Note that the potential in (1.3) is real-valued (for real $x, y$) if $\alpha$ is real and the period $\tau$ is pure imaginary. It is more convenient to work with the following 3-dimensional version of (1.3):
$$L = -\partial_1^2 - \partial_2^2 - \partial_3^2 + 2(a_1^2 + a_2^2)\,\wp(a_1 x_1 - a_2 x_2) + 2(a_2^2 + a_3^2)\,\wp(a_2 x_2 - a_3 x_3) + 2(a_3^2 + a_1^2)\,\wp(a_3 x_3 - a_1 x_1), \tag{1.4}$$

where $a_1^2 + a_2^2 + a_3^2 = 0$, $\partial_i = \partial/\partial x_i$. Then it is easy to see that $L$ commutes with the operator $L_0 = a_1^{-1}\partial_1 + a_2^{-1}\partial_2 + a_3^{-1}\partial_3$ and after restriction to the plane $a_1^{-1}x_1 + a_2^{-1}x_2 + a_3^{-1}x_3 = 0$ it reduces to the operator (1.3) with proper $a, b$. We calculate explicitly the Bloch eigenfunctions of the operators (1.2), (1.4). Notice a certain similarity between our approach and the one used by Inozemtsev for the $A_2$ case [21]. Let us also mention that in dimension two there is a nice theory of Schrödinger operators which are finite-gap at a fixed energy level, see [22, 23]. It would be interesting to analyze our results from that point of view. In the last two sections we also calculate the Bloch solutions for the discrete versions of (1.2), (1.4), which are given by certain difference operators of Macdonald–Ruijsenaars type. This raises a natural question about generalizing our results to the difference setting. We hope to return to this problem in the future.

2. Generalized Lamé Operators

Let $V = \mathbb{C}^n$ be a complex Euclidean space with the scalar product denoted by $(\,,)$, and $A = \{\alpha\}$ be a given finite set of affine-linear functions on $V$. Let us consider a Schrödinger operator
$$L = -\Delta + u(x), \qquad \Delta = \partial^2/\partial x_1^2 + \dots + \partial^2/\partial x_n^2, \tag{2.1}$$
with the elliptic potential $u$ of the following form:
$$u(x) = \sum_{\alpha \in A} c_\alpha\, \wp(\alpha(x)|\tau). \tag{2.2}$$

Here $\wp(z|\tau)$ is the Weierstrass $\wp$-function with the periods $1, \tau$, $\mathrm{Im}(\tau) > 0$, and $c_\alpha \in \mathbb{C}$ are the parameters which will be specified later (below we will mostly suppress $\tau$, denoting $\wp(z|\tau)$ simply by $\wp(z)$). Each of the functions $\alpha(x)$ in standard coordinates on $V$ looks as $\alpha(x) = a_0 + a_1 x_1 + \dots + a_n x_n$, so $\alpha$ is, effectively, a pair $(\alpha_0, a_0)$, where $a_0 = \alpha(0) \in \mathbb{C}$ and $\alpha_0 = \mathrm{grad}\,\alpha = (a_1, \dots, a_n)$. We assume that each $\alpha_0$ is a non-isotropic (co)vector, i.e. $(\alpha_0, \alpha_0) = a_1^2 + \dots + a_n^2 \ne 0$. Let us project $A$ onto $V^*$, $\alpha \mapsto \alpha_0$, denoting by $A_0$ the resulting set. We want $u$ to be periodic, more precisely, we assume that the lattice $\mathcal{M} \subset V^*$, generated over $\mathbb{Z}$ by the set $A_0$, has rank $\le n$. For simplicity, let us assume that $\mathrm{rk}\,\mathcal{M} = n$ (the general case can be reduced to that by passing to the factor space $V/\mathrm{Ann}\,\mathcal{M}$). In that case the lattice $\mathcal{L} + \tau\mathcal{L}$ is the period lattice for $u$, with $\mathcal{L} := \mathrm{Hom}(\mathcal{M}, \mathbb{Z}) \subset V$. Thus, $u$ may be considered as a meromorphic function on a (compact) torus $T = V/(\mathcal{L} + \tau\mathcal{L})$, which is isomorphic to the product of $n$ copies of the elliptic curve $E = \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$. Singularities of $u(x)$ are the second order poles along the following set of hyperplanes:
$$\mathrm{Sing} = \bigcup_{\alpha \in A}\ \bigcup_{m,n \in \mathbb{Z}} \pi_\alpha^{m,n}, \qquad \pi_\alpha^{m,n} := \{x : \alpha(x) = m + n\tau\}.$$
We will assume that all hyperplanes $\pi_\alpha^{m,n}$ are pairwise different; this can always be achieved by rearranging the terms in (2.2). This will imply that
$$u(x) \sim c_\alpha (\alpha(x) - m - n\tau)^{-2} + O(1) \qquad \text{near } \pi_\alpha^{m,n}.$$
Our next important assumption is that each $c_\alpha$ in (2.2) has a form
$$c_\alpha = m_\alpha(m_\alpha + 1)(\alpha_0, \alpha_0) \qquad \text{with some } m_\alpha \in \mathbb{Z}_{>0}.$$

Now we are going to put some more restrictions on $u$ demanding its quasi-invariance in the following sense.

Definition. Let us say that the potential (2.2) as above is quasi-invariant if for any hyperplane $\pi = \pi_\alpha^{m,n} \in \mathrm{Sing}$ the (meromorphic) function $u(x) - u(s_\pi x)$ is divisible by $(\alpha(x) - m - n\tau)^{2m_\alpha + 1}$, where $s_\pi$ denotes the orthogonal reflection with respect to $\pi$.

Here are two important examples of such potentials (more examples will appear later).

Example 2.1. Consider the following potential $u(x)$ in $\mathbb{C}^n$:
$$u = \sum_{i<j} 2m(m+1)\,\wp(x_i - x_j), \qquad m \in \mathbb{Z}_{>0}. \tag{2.3}$$

It is invariant under any permutation of the coordinates $x_1, \dots, x_n$, and also under translations $x \mapsto x + l$ for $l \in \mathbb{Z}^n + \tau\mathbb{Z}^n$ ($\mathbb{Z}^n$ is the standard integer lattice in $n$ dimensions). As a result, $u$ will be symmetric with respect to any of the hyperplanes $x_i - x_j = m + n\tau$. Thus, its Laurent expansion in the normal direction will have no odd terms at all, hence $u$ is quasi-invariant. The corresponding Schrödinger operator (2.1) is the Hamiltonian of the quantum elliptic Calogero–Moser problem. More generally, the elliptic Calogero–Moser problems related to other root systems [9] also lead, in the same way, to quasi-invariant potentials. In particular, for the rank-one system $A_1$ we have the classical Lamé operator (1.1).
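The symmetries used in Example 2.1 can be checked numerically. The sketch below is an illustration only and is not part of the original argument: the function names `wp` and `cm_potential`, the truncation parameter `terms`, and the sample points are all assumptions. It evaluates $\wp(z|1,\tau)$ through the standard expansion in $\csc^2$ obtained by summing the lattice series over the integer period first, and then compares the potential (2.3) at a point, at its reflection in the hyperplane $x_1 = x_2$, and at a lattice translate.

```python
import cmath


def wp(z, tau, terms=20):
    """Weierstrass p-function with periods 1 and tau (Im tau > 0).

    Truncated csc^2 expansion; `terms` is an illustrative cut-off, the
    neglected terms decay like exp(-2*pi*terms*Im(tau)).
    """
    def csc2(w):
        return 1.0 / cmath.sin(cmath.pi * w) ** 2

    total = csc2(z) - 1.0 / 3.0
    for n in range(1, terms + 1):
        total += csc2(z - n * tau) + csc2(z + n * tau) - 2.0 * csc2(n * tau)
    return cmath.pi ** 2 * total


def cm_potential(x, m, tau):
    """u(x) = sum over i<j of 2 m (m+1) wp(x_i - x_j), as in (2.3)."""
    n = len(x)
    return sum(2 * m * (m + 1) * wp(x[i] - x[j], tau)
               for i in range(n) for j in range(i + 1, n))


if __name__ == "__main__":
    tau = 0.2 + 1.1j                          # sample period (assumption)
    x = [0.13 + 0.05j, -0.21 + 0.11j, 0.37 - 0.08j]
    u = cm_potential(x, 1, tau)
    reflected = [x[1], x[0], x[2]]            # reflection in x_1 = x_2
    translated = [x[0] + 1, x[1], x[2] + tau]  # shift by a lattice vector
    print("reflection defect :", abs(u - cm_potential(reflected, 1, tau)))
    print("translation defect:", abs(u - cm_potential(translated, 1, tau)))
```

Both printed defects should be at the level of rounding error, which is the numerical counterpart of the statement that the Laurent expansion of $u$ across each hyperplane has no odd terms.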

Example 2.2. This example is in dimension one. The potential $u$ has $N$ poles $x_1, \dots, x_N$ and looks like
$$u = \sum_{i=1}^{N} 2\wp(x - x_i). \tag{2.4}$$
To ensure its quasi-invariance, one has to impose the condition that the regular part of $u$ has zero derivative at each of its poles, more explicitly:
$$\sum_{j \ne i} \wp'(x_i - x_j) = 0 \qquad \text{for all } i = 1, \dots, N. \tag{2.5}$$
This system of equations describes the so-called “elliptic locus” from [24], which has an intimate connection with the classical elliptic Calogero–Moser system and the KdV hierarchy.

Let us call a Schrödinger operator $L$ with quasi-invariant elliptic potential $u(x)$ a generalized Lamé operator. In trigonometric and rational versions ($\wp(x) = \sin^{-2} x$ or $x^{-2}$) such operators were considered in [12], where their eigenfunctions were effectively constructed. From the results of [12] the so-called algebraic integrability of $L$ follows (see the paper [13] for the rational case). We recall the definition of the algebraic integrability in the next section, following [8, 11]; let us just remark that in dimension one this coincides with the class of algebro-geometric operators which appear in the finite-gap theory, see [2, 25]. This motivates the following

Conjecture. The generalized Lamé operators are all algebraically integrable.

As a particular case, this contains a conjecture of [8] about the algebraic integrability of the elliptic Calogero–Moser problems. As we already mentioned in the introduction, for the $A_n$-case (2.3) this has been proved by Braverman, Etingof and Gaitsgory in [11]. It is also known to be true in dimension one, due to Gesztesy and Weikard [14]. In the next section we prove this conjecture under the additional assumption of the usual, Liouville integrability of $L$ (in a slightly stronger sense). As a corollary, we will obtain the algebraic integrability of the quantum Calogero–Moser problems for integer coupling parameters.

3. Monodromy and Algebraic Integrability

Let $L$ be a generalized Lamé operator as defined previously. Recall that $L$ is completely integrable if it is a member of a commutative family of differential operators $L_1 = L, L_2, \dots, L_n$ which are algebraically independent.
We assume that the $L_i$’s have meromorphic coefficients and are periodic with respect to the same lattice, which makes them (singular) differential operators on the torus $T = \mathbb{C}^n/(\mathcal{L} + \tau\mathcal{L})$. The following proposition shows that possible singularities of $L_i$ are contained in the singular locus $\mathrm{Sing}$ of the Schrödinger operator $L$.

Proposition 3.1. Let $L$ be a Schrödinger operator regular in an open set $U$. Then any differential operator $M$ on $U$ with meromorphic coefficients commuting with $L$ is regular in $U$.

To prove the proposition, we will need the following lemma.

Lemma 3.2. Let $S(x, p)$ be a meromorphic function on $T^*U = U \times V^*$ which is polynomial in the momentum $p$, and $\{p^2, S(x, p)\}$ is a regular function. Then $S$ is a regular function.

Proof. Assume the contrary. The function $S$ is a finite sum $\sum_k S_k(x) p^k$, where $p^k$ are monomials. Let $D \subset U$ be the divisor of poles of $S$. Take a generic point $z_0$ of this divisor. Near this point $D$ is given by an equation $f = 0$, where $f$ is analytic at $z_0$ and $df(z_0) \ne 0$. Let $S = f^{-k} Q\,(1 + O(f))$ near $z_0$ ($Q$ is regular at $z_0$, with $Q(z_0) \ne 0$). Then
$$\{p^2, S\} = -2k \sum_i p_i \frac{\partial f}{\partial x_i}\, f^{-k-1} Q + O(f^{-k}).$$
But $\sum_i p_i \frac{\partial f}{\partial x_i}(z_0, p) \ne 0$ for generic $p$. Thus, $\{p^2, S\}$ is singular, which is a contradiction.
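The mechanism of this proof can be seen in a one-dimensional toy case, which is an illustration added here and not taken from the paper. It assumes the sign convention $\{p^2, S\} = 2\sum_i p_i\,\partial S/\partial x_i$, which is the one consistent with the displayed leading term, and the illustrative choices $n = 1$, $f(x) = x$, $k = 1$, $Q = p$.

```latex
% Toy case: n = 1, f(x) = x, k = 1, Q = p (all illustrative choices).
\[
  S(x,p) = \frac{p}{x},
  \qquad
  \{p^{2}, S\} \;=\; 2p\,\frac{\partial S}{\partial x}
  \;=\; -\frac{2p^{2}}{x^{2}} .
\]
% This agrees with the general leading term
%   -2k \, p \, (\partial f/\partial x) \, f^{-k-1} Q  with  k = 1, f = x, Q = p.
% The bracket with p^2 raises the pole order from 1 to 2, so it cannot be
% regular unless S itself has no pole.
```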

Now we prove the proposition. Suppose $[M, L] = 0$. Assume $M$ is not regular. Then we can write $M$ as $M' + M''$, where $M''$ is regular, and $M'$ has a singular highest symbol. Then $[L, M'] = -[L, M'']$ is regular. Let $S(x, p)$ be the symbol of $M'$. We have that $\{p^2, S\}$ is regular (since if it is nonzero, it is the symbol of $[L, M']$). Then the lemma implies that $S$ is regular. This contradiction proves the proposition.

Notice also that if $S(x, p)$ is the highest symbol of $L_i$, then from $[L, L_i] = 0$ it follows that $\{p^2, S\} = 0$. Thus, by Lemma 3.2, $S$ must be regular everywhere. Hence, each $L_i$ must have constant highest symbol, i.e. $L_i = s_i(\partial) + \dots$ for some polynomial $s_i$.

Definition. Let us say that a Schrödinger operator $L = -\Delta + u$ in $V = \mathbb{C}^n$ is strongly integrable if the commuting operators $L_1 = L, \dots, L_n$ have algebraically independent homogeneous constant highest symbols $s_1, \dots, s_n$ and if $\mathbb{C}[V]$ is finitely generated as a module over the ring generated by $s_1, \dots, s_n$ or, equivalently, if the system $s_1(\xi) = 0, \dots, s_n(\xi) = 0$ has the unique solution $\xi = 0$.

Now let $L$ be a strongly integrable generalized Lamé operator, so we have the operators $L_1 = L, \dots, L_n$ with meromorphic coefficients on the torus $T = \mathbb{C}^n/(\mathcal{L} + \tau\mathcal{L})$, and $L_i = s_i(\partial) + \dots$. First of all, $\mathbb{C}[V]$ is locally free as a module over $\mathbb{C}[s_1, \dots, s_n]$. This follows from the fact, due to Serre [26], that if $f : X \to Y$ is a finite map of smooth affine varieties of the same dimension, then $O(X)$ is a locally free $O(Y)$-module. Further, since the $s_i$ are homogeneous, this module is graded, hence, it must be free. Denote by $N$ the rank of this free module. Consider now the eigenvalue problem
$$L_i \psi = \lambda_i \psi, \qquad i = 1, \dots, n, \tag{3.1}$$

with $\lambda = (\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$. Then the space of solutions of this system in any simply connected domain in $T \setminus \mathrm{Sing}$ is $N$-dimensional. So we have a holonomic system of rank $N$ on $T$ with singularities along a finite union of hypertori $\mathrm{Sing} \subset T$.

Theorem 3.3. The holonomic system (3.1) has regular singularities.

Proof. Regularity of singularities is a codimension one condition. Thus, it is sufficient to restrict our attention to a neighborhood $U$ of a point which lies on exactly one subtorus from the pole divisor. Let us assume that this point is $0$, and the subtorus is locally defined by the equation $x_1 = 0$.

Lemma 3.4. For any $r \ge 0$, the singular part of the differential operator $(x_1)^r L_i$ has the order at most $d_i - r - 1$, where $d_i$ is the degree of $s_i$.

Proof. Let us introduce new variables $y_j = t x_j$, and write our operators with respect to them. Then $L = t^{-2} L^{(0)} + O(1)$, $t \to 0$, where
$$L^{(0)} = -\Delta + \frac{m_\alpha(m_\alpha + 1)}{x_1^2}$$
with $m_\alpha \in \mathbb{Z}_+$ corresponding to the chosen hyperplane $x_1 = 0$. Let $L_i^{(0)}$ be the coefficient at the lowest power of $t$ in the expression for $L_i$. Then $L_i^{(0)}$ is homogeneous, of degree $\le -d_i$ in $t$ (since the symbol of $L_i$ has this degree). On the other hand, the order of this operator is at most $d_i$. Thus, if this degree of $L_i^{(0)}$ is less than $d_i$, then its symbol would have to be singular. But $L_i^{(0)}$ commutes with $L^{(0)}$, so this contradicts Lemma 3.2. Thus, the degree is exactly $d_i$, which implies the statement.

Lemma 3.5. Consider a system of differential equations $df/dz = A(z) f(z)$ with holomorphic coefficients on a punctured disk $0 < |z| < \varepsilon$ ($f$ is a vector function, $A$ is a matrix function). Assume that $A(z)$ is meromorphic at $z = 0$ and that there exists an integer-valued function $D(i)$ such that the order of $a_{ij}(z)$ at $z = 0$ is at least $D(j) - D(i) - 1$. Then the system has a regular singularity at $z = 0$.

Indeed, let us make a change of variable $g_i = z^{D(i)} f_i$. This will change the matrix $A$ into a new matrix $\tilde{A}$ which obviously has at most a simple pole at $z = 0$.

Now let us prove the theorem. First, let us reduce (3.1) to a first order holonomic system in a standard way. Choose a collection of $N$ homogeneous polynomials $q_1, \dots, q_N$ in $\partial_i$, which form a basis in $\mathbb{C}[\partial_1, \dots, \partial_n]$ as a free module over $\mathbb{C}[s_1(\partial), \dots, s_n(\partial)]$. Let $\psi$ be a solution of the eigenvalue problem (3.1), and consider the functions $q_1(\partial)\psi, \dots, q_N(\partial)\psi$. Then, for any polynomial $q$ of $\partial_i$, the function $q\psi$ can be expressed via $q_i\psi$, with the coefficients depending on $x$ and $\lambda$. Indeed, assume that $q$ is homogeneous of degree $d$. We know that $q$ can be uniquely represented in the form $\sum_j g_j q_j$, where $g_j \in \mathbb{C}[s_1, \dots, s_n]$. Thus, $q\psi - \sum_j q_j\, g_j(L_1, \dots, L_n)\psi$ is expressed through differential polynomials of $\psi$ of degree smaller than $d$. But $g_j(L_1, \dots, L_n)\psi = g_j(\lambda_1, \dots, \lambda_n)\psi$, so, eventually, by induction we get a desired representation of $q\psi$.

Now observe that by Lemma 3.4, the coefficient of $q_j\psi$ in the expansion of $q\psi$ has a pole in $x_1$ of order at most $\deg(q) - \deg(q_j)$. Thus, we have a holonomic system of matrix partial differential equations
$$\partial_k q_i \psi = a_{ij}^{(k)}(x)\, q_j \psi, \qquad k = 1, \dots, n,$$
which is equivalent to the system (3.1). Each of the matrices $a_{ij}^{(k)}$ ($k = 1, \dots, n$) satisfies the conditions of Lemma 3.5 with respect to $z = x_1$, for $D(i) = -\deg(q_i)$. In particular, using the lemma, we conclude from the first equation $\partial_1 q_i \psi = a_{ij}^{(1)} q_j \psi$ that all the solutions have at most power growth in $x_1$ when approaching $x_1 = 0$, hence the system (3.1) has a regular singularity at $x_1 = 0$.

Now take any point $x_0 \in T \setminus \mathrm{Sing}$ and consider the monodromy of the system (3.1) at the point $x_0$; this gives an $N$-dimensional representation of the fundamental group $\pi_1(T \setminus \mathrm{Sing})$.
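To see the change of variable of Lemma 3.5 at work, here is a minimal worked example, added for illustration and not taken from the paper. It treats the one-dimensional equation $-\psi'' + m(m+1)z^{-2}\psi = \lambda\psi$ written as a first order system for $f = (\psi, \psi')$, with the illustrative choice $D(1) = 0$, $D(2) = 1$; the symbol $\mu$ for the eigenvalues of the residue matrix is also an assumption of this sketch.

```latex
% First order form of  -psi'' + m(m+1) z^{-2} psi = lambda psi,  f = (psi, psi')^T:
\[
  \frac{df}{dz} = A(z) f, \qquad
  A(z) = \begin{pmatrix} 0 & 1 \\ m(m+1)z^{-2} - \lambda & 0 \end{pmatrix}.
\]
% With D(1) = 0, D(2) = 1 the hypothesis ord a_{ij} >= D(j) - D(i) - 1 holds:
%   ord a_{12} = 0 >= 0,   ord a_{21} = -2 >= -2.
% Change of variable g_1 = f_1, g_2 = z f_2:
\[
  \frac{dg}{dz} = \tilde A(z)\, g, \qquad
  \tilde A(z) = \frac{1}{z}
  \begin{pmatrix} 0 & 1 \\ m(m+1) - \lambda z^{2} & 1 \end{pmatrix},
\]
% which has a simple pole at z = 0.  The residue matrix has characteristic
% polynomial  mu^2 - mu - m(m+1) = (mu - m - 1)(mu + m),  so
\[
  \mu_{1} = -m, \qquad \mu_{2} = m + 1 .
\]
```

For $\lambda = 0$ the two solutions are exactly $z^{-m}$ and $z^{m+1}$, so the eigenvalues of the residue matrix are the local exponents of the equation.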

Proposition 3.6. The monodromy group of the system (3.1) is commutative for any $\lambda$.

Proof. Take a hyperplane $\pi = \pi_\alpha^{m,n} \in \mathrm{Sing}$ and let $P$ be a generic point of $\pi$. Changing the coordinates if necessary, we may assume that $P = (0, \dots, 0)$ is the origin and $\pi$ is given by the equation $x_1 = 0$. From the regularity of singularities it follows that there exist $\gamma_1, \dots, \gamma_r \in \mathbb{C}$ such that any solution $\psi$ of the system (3.1) near $P \in \pi$ has a convergent series expansion in the subspace
$$\bigoplus_{j=1}^{r} x_1^{\gamma_j}\, \mathbb{C}[[x_1, \dots, x_n]][\log(x_1)].$$
Substituting such a series into the first equation $L\psi = \lambda_1\psi$, one arrives at certain recurrence relations from which it follows that (1) we have two “leading exponents” $\gamma_1 = -m_\alpha$, $\gamma_2 = m_\alpha + 1$, both integer; (2) there will be no $\log(x_1)$ involved. The latter fact is due to the quasi-invariance of $u$, see [27] for the one-dimensional case and [13], Sect. 2 for a discussion in the multivariable setting. As a result, we see that all the solutions are single-valued near $\pi$, so the local monodromy corresponding to a loop around $\pi$ is trivial. Consequently, the global monodromy group will be a homomorphic image of the commutative group $\pi_1(T)$.

Corollary 3.7. The differential Galois group of the system (3.1) is commutative for any $\lambda$.

Proof. It is known (see [28]) that for a regular holonomic system on a smooth projective variety the differential Galois group coincides with the Zariski closure of the monodromy group. In our situation the proof is simple: first, we know that all solutions of the system (3.1) are meromorphic in $\mathbb{C}^n$. Indeed, this is true outside a set of codimension 2, by the proposition above, hence it holds everywhere by Hartogs’ theorem. Now let $F$ be the field of “elliptic functions”, i.e. meromorphic functions on the torus $T$, and let $L$ be the solution field of (3.1) (on some simply connected domain). The monodromy gives the homomorphism $\pi_1(T) \to \mathrm{Dgal}(L : F)$ of the fundamental group of the torus $T$ into the differential Galois group. Let $G$ be the Zariski closure of the image of this homomorphism inside $\mathrm{Dgal}(L : F)$. By the main theorem of the differential Galois theory [29], to prove that $G = \mathrm{Dgal}(L : F)$ it suffices to show that in the solution field any $G$-invariant function is actually $\mathrm{Dgal}$-invariant, i.e. belongs to $F$. But any meromorphic $G$-invariant function is $\pi_1(T)$-invariant, hence elliptic. Thus, it is $\mathrm{Dgal}$-invariant by definition. This proves that $\mathrm{Dgal}(L : F) = G$, and the latter is obviously commutative.
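A one-dimensional sanity check of the statement that all solutions are meromorphic, with commuting multipliers, is provided by the Lamé operator (1.1) with $m = 1$. This example is added for illustration and is not part of the original proof. Classically (Hermite), $\psi(z) = \frac{\sigma(z+k)}{\sigma(z)\sigma(k)}\,e^{-\zeta(k)z}$ solves $-\psi'' + 2\wp(z)\psi = -\wp(k)\psi$, and by the addition theorem for $\zeta$ its logarithmic derivative is the elliptic function $\varphi = \frac{1}{2}\,\frac{\wp'(z) - \wp'(k)}{\wp(z) - \wp(k)}$ with simple poles. The sketch below checks numerically the equivalent Riccati equation $\varphi' + \varphi^2 = 2\wp(z) + \wp(k)$. The function names, the series truncation, the finite-difference step and the sample points are assumptions of this sketch.

```python
import cmath

PI = cmath.pi


def wp(z, tau, terms=20):
    """Weierstrass p-function for periods 1, tau (truncated csc^2 series)."""
    total = 1 / cmath.sin(PI * z) ** 2 - 1 / 3
    for n in range(1, terms + 1):
        total += (1 / cmath.sin(PI * (z - n * tau)) ** 2
                  + 1 / cmath.sin(PI * (z + n * tau)) ** 2
                  - 2 / cmath.sin(PI * n * tau) ** 2)
    return PI ** 2 * total


def wp_prime(z, tau, terms=20):
    """Derivative of wp in z, obtained by differentiating the series termwise."""
    total = 0
    for n in range(-terms, terms + 1):
        w = PI * (z - n * tau)
        total += cmath.cos(w) / cmath.sin(w) ** 3
    return -2 * PI ** 3 * total


def log_derivative(z, k, tau):
    """phi = psi'/psi for Hermite's solution of the m = 1 Lame equation."""
    return 0.5 * (wp_prime(z, tau) - wp_prime(k, tau)) / (wp(z, tau) - wp(k, tau))


def riccati_residual(z, k, tau, h=1e-5):
    """phi' + phi^2 - (2 wp(z) + wp(k)); phi' by a central difference."""
    dphi = (log_derivative(z + h, k, tau) - log_derivative(z - h, k, tau)) / (2 * h)
    phi = log_derivative(z, k, tau)
    return dphi + phi ** 2 - (2 * wp(z, tau) + wp(k, tau))


if __name__ == "__main__":
    tau = 0.2 + 1.1j                      # sample period (assumption)
    z, k = 0.3 + 0.1j, 0.15 - 0.2j        # sample point and spectral parameter
    print("Riccati residual:", abs(riccati_residual(z, k, tau)))
    print("period-1 defect :", abs(log_derivative(z + 1, k, tau) - log_derivative(z, k, tau)))
    print("period-tau defect:", abs(log_derivative(z + tau, k, tau) - log_derivative(z, k, tau)))
```

The residual is limited by the finite-difference step rather than by the series, so it should be small but not at machine precision; the two periodicity defects confirm that $\varphi\,dz$ descends to the elliptic curve.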
Now let us remark (see [11]) that a quantum completely integrable system (QCIS) on a smooth algebraic variety $X$ of dimension $n$ naturally defines an embedding $\theta : O(\Lambda) \to D(X)$, where $\Lambda = \mathbb{A}^n$ is an affine space and $D(X)$ denotes the ring of differential operators on $X$. More generally, $\Lambda$ can be any affine variety with $\dim(\Lambda) = n$. Then we have an analogous eigenvalue problem $\theta(g)\psi = g(\lambda)\psi$, $\forall g \in O(\Lambda)$. The dimension of the local solution space of this system at a generic point of $X$ is called the rank of a QCIS. Recall further, that a QCIS $S = (\Lambda, \theta)$ is algebraically integrable if it is dominated

by another QCIS $S' = (\Lambda', \theta')$ of rank one ($S$ is dominated by $S'$ if there is a map of algebras $h : O(\Lambda) \to O(\Lambda')$ such that $\theta = \theta' \circ h$). In our situation, this implies that apart from the operators $L_1, \dots, L_n$, we have additional commuting operators which are not algebraic combinations of $L_i$ (though, of course, they are algebraically dependent with $L_i$). In dimension $n = 1$ this is equivalent to saying that $L$ is a member of a maximal commutative ring in $D(X)$ of rank one; this is known to coincide with the class of algebro-geometric operators.

Theorem 3.8. Any generalized Lamé operator $L$ which is strongly integrable is algebraically integrable.

This follows immediately from the result above and the criterion from [11]. Moreover, according to [11], for generic $\lambda$ the solution space is generated by the quasiperiodic solutions:

Corollary 3.9. There exist meromorphic 1-forms $\omega_j$ on the torus $T$, with first order poles and depending analytically on $\lambda$, such that the functions $\psi_j = e^{\int \omega_j}$ give a basis of the solution space of (3.1) for generic $\lambda$.

Each of these functions will be double-Bloch, in the terminology of [30]:
$$\psi_j(x + l) = e^{2\pi i \langle a_j, l\rangle}\psi_j(x), \qquad \psi_j(x + \tau l) = e^{2\pi i \langle b_j, l\rangle}\psi_j(x), \tag{3.2}$$
for appropriate $a_j, b_j \in V^*$ and for all $l \in \mathcal{L}$. Namely,
$$\langle a_j, l\rangle = \frac{1}{2\pi i}\int_z^{z+l} \omega_j \qquad \text{and} \qquad \langle b_j, l\rangle = \frac{1}{2\pi i}\int_z^{z+\tau l} \omega_j$$
(since $\psi_j$ has no branching along $\mathrm{Sing}$, these are well defined modulo $\mathbb{Z}$). In Sect. 5 below we explain how one can calculate these double-Bloch solutions for the generalized Lamé operators.

Remark 3.10. Our argument applies to a more general situation when one has commuting operators $L_1, \dots, L_n$ on an abelian variety, such that the system $L_i\psi = \lambda_i\psi$, $i = 1, \dots, n$, is regular holonomic. Then the triviality of its local monodromy around singularities implies that this system is algebraically integrable.

4. Examples

4.1. One-dimensional case. Let $L$ be a one-dimensional Schrödinger operator $L = -\frac{d^2}{dx^2} + u(x)$, $x \in \mathbb{C}$. Consider the eigenvalue problem $L\psi = \lambda\psi$. Now, assuming that $u$ is meromorphic and has a pole at $x = 0$, let us consider the local monodromy of the solutions of this second order differential equation. Suppose that this monodromy is trivial for all $\lambda$, in other words, all the solutions are meromorphic at $x = 0$. Then, using the classical Frobenius analysis, one can see that $u$ must have a pole of the second order, with no residue: $u = m(m+1)x^{-2} + O(1)$, with integer $m$. Furthermore, in the series for $u$ at $x = 0$ there must be no terms of order $2j - 1$ for all $j = 1, \dots, m$ (see [27]). This is exactly the quasi-invariance of $u$ with respect to the symmetry $x \mapsto -x$. Now let $u$ be elliptic, with periods $1, \tau$. Let us demand all the solutions of the equation $L\psi = \lambda\psi$ to be meromorphic in $\mathbb{C}$ (such $u$ are called Picard potentials in [25, 14]). By

the discussion above, this is equivalent to the quasi-invariance of $u$ at each pole. More explicitly,
$$u(x) = \sum_{i=1}^{M} m_i(m_i + 1)\,\wp(x - x_i), \qquad \text{with } m_i \in \mathbb{Z}_+,$$
and the poles $x_i$ must satisfy the following system of equations, which generalizes (2.5):
$$\sum_{j \ne i} m_j(m_j + 1)\,\wp^{(2s-1)}(x_i - x_j) = 0 \tag{4.1}$$
for all $i = 1, \dots, M$ and $s = 1, \dots, m_i$. Such $L$ automatically defines a QCIS (of rank 2), and the regularity of singularities is obvious. Applying the results of the previous section, we conclude that $L$ is algebraically integrable. Algebraic integrability of $L$ implies the existence of a differential operator $P$, commuting with $L$. Since $P$ is not a polynomial of $L$, we may assume that it is of odd order. Thus, $L$ must be algebro-geometric, according to the Burchnall–Chaundy–Krichever theory, see e.g. [31]. Thus, our result in this case is equivalent to the main result of [14]. Notice that our approach easily extends to the case of operators of any order (cf. the remark at the end of [14]).

4.2. Quantum Calogero–Moser system. Let $V$ be a complex Euclidean space, $\dim V = n$. Let $R = \{\alpha\}$ be a reduced irreducible root system in $V^*$, $W$ be the corresponding Weyl group, and the parameters $m_\alpha$ be chosen in a $W$-invariant way. The corresponding Calogero–Moser operator [9] looks as
$$L = -\Delta + \sum_{\alpha \in R_+} m_\alpha(m_\alpha + 1)(\alpha, \alpha)\,\wp((\alpha, x)). \tag{4.2}$$

Let $\mathbb{C}[V]^W$ be the ring of $W$-invariant polynomials. By the Chevalley theorem, it is freely generated by $n$ elements $p_1, \dots, p_n$, and $\mathbb{C}[V]$ is a free module over $\mathbb{C}[V]^W$, of rank $|W|$. The following result has been proved in [15] for any $W$-invariant $m_\alpha \in \mathbb{C}$.

Theorem 4.1 (Cherednik). For each homogeneous $p \in \mathbb{C}[V]^W$ there exists a differential operator $L_p$ with the highest symbol $p$, commuting with $L$: $[L, L_p] = 0$. The family $\{L_p\}$ is commutative.

For integer $m_\alpha$ the potential $u$ in (4.2) is quasi-invariant (cf. Example 2.1). Altogether this proves the conjecture of [8]:

Corollary 4.2. The Calogero–Moser operator (4.2) is algebraically integrable for integer $m_\alpha$.

Let us mention that for the classical root systems $R = A \dots D$ the complete set of commuting operators $L_1, \dots, L_n$ for (4.2) was explicitly found by Oshima and Sekiguchi [34]. Their results cover the $BC_n$ case, too. In that case we have 5 parameters $m, g_0, g_1, g_2, g_3$ and the Inozemtsev operator
$$L = -\Delta + 2m(m+1)\sum_{i<j}\bigl(\wp(x_i - x_j) + \wp(x_i + x_j)\bigr) + \sum_{i=1}^{n}\sum_{s=0}^{3} g_s(g_s + 1)\,\wp(x_i + \omega_s), \tag{4.3}$$

with $\omega_s$ ($s = 0 \dots 3$) denoting the half-periods $0, 1/2, \tau/2, (1 + \tau)/2$. The Weyl group $W$ for this case is generated by the permutations of $x_i$ and sign flips, and Theorem 4.1 still holds true, due to [34]. Applying Theorem 3.8 we obtain

Corollary 4.3. The Inozemtsev operator (4.3) is algebraically integrable for any integer $m, g_0, g_1, g_2, g_3$.

4.3. Deformed root systems. Other known examples of the generalized Lamé operators in dimension > 1 are related to deformed root systems, which appeared in [13]. Below we describe the set of linear functionals $A = \{\alpha\}$ and the corresponding multiplicities $m_\alpha$.

(1) $A_{n,1}(m)$ system [13]. It consists of the following covectors in $\mathbb{C}^{n+1}$:
    $x_i - x_j$, $1 \le i < j \le n$, with multiplicity $\bar m$,
    $x_i - \sqrt{m}\,x_{n+1}$, $i = 1, \dots, n$, with multiplicity 1.
Here $m$ is an integer parameter, and $\bar m$ stands for $\bar m = \max\{m, -1 - m\}$.

(2) $C_{n,1}(m, l)$ system [13]. It consists of the following covectors in $\mathbb{C}^{n+1}$:
    $x_i \pm x_j$, $1 \le i < j \le n$, with multiplicity $\bar k$,
    $2x_i$, $i = 1, \dots, n$, with multiplicity $\bar m$,
    $x_i \pm \sqrt{k}\,x_{n+1}$, $i = 1, \dots, n$, with multiplicity 1,
    $2\sqrt{k}\,x_{n+1}$ with multiplicity $\bar l$.
Here $k, l, m$ are integer parameters related as $k = \frac{2m+1}{2l+1}$, and $\bar k, \bar l, \bar m$ have the same meaning as in the $A_{n,1}(m)$ case. In the two-dimensional case $n = 1$ the first group of roots is absent and there is no restriction for $k$ to be integer.

(3) Here is a $BC_n$-type generalization of the previous example. Let $\omega_s$ ($s = 0 \dots 3$) denote the half-periods, as in the $BC_n$ case (4.3). The set of linear functionals $\alpha \in A$ and the corresponding multiplicities look as follows:
    $x_i \pm x_j$, $1 \le i < j \le n$, with multiplicity $k$,
    $x_i + \omega_s$, $i = 1, \dots, n$, with multiplicity $m_s$ ($s = 0 \dots 3$),
    $x_i \pm \sqrt{k}\,x_{n+1}$, $i = 1, \dots, n$, with multiplicity 1,
    $\sqrt{k}\,x_{n+1} + \omega_s$ with multiplicity $l_s$ ($s = 0 \dots 3$).
Here $k, l_s, m_s$ are nine integer parameters related through $k = \frac{2m_s+1}{2l_s+1}$ for all $s = 0 \dots 3$. The previous case corresponds to $m_s \equiv m$ and $l_s \equiv l$. Again, in case $n = 1$ the first group of roots is absent and $k$ may not be integer.

(4) Hietarinta operator [19]. In this case we have three covectors in $\mathbb{C}^3$,
    $\alpha = a_1 x_1 - a_2 x_2$, $\beta = a_2 x_2 - a_3 x_3$, $\gamma = a_3 x_3 - a_1 x_1$,
with $m_\alpha = m_\beta = m_\gamma = 1$. Here $a_i$ are arbitrary complex parameters such that $a_1^2 + a_2^2 + a_3^2 = 0$. Notice that the system is essentially two-dimensional since $\alpha + \beta + \gamma = 0$.

(5) $A_{n-1,2}(m)$ system [16]. It consists of the following covectors in $\mathbb{C}^{n+2}$:
    $x_i - x_j$, $1 \le i < j \le n$, with multiplicity $m$,
    $x_i - \sqrt{m}\,x_{n+1}$, $i = 1, \dots, n$, with multiplicity 1,
    $x_i - \sqrt{-1 - m}\,x_{n+2}$, $i = 1, \dots, n$, with multiplicity 1,
    $\sqrt{m}\,x_{n+1} - \sqrt{-1 - m}\,x_{n+2}$ with multiplicity 1.
Notice that for $m = 1$ this system coincides with the system $A_{n,1}(-2)$ above.

O. Chalykh, P. Etingof, A. Oblomkov

In all these cases a direct check shows that the corresponding potential u is quasiinvariant. Notice that in all cases u is symmetric with respect to sα as soon as mα > 1. Thus, one has to check the quasi-invariance only for those α where mα = 1. We believe that all these operators are algebraically integrable. In the next section we check this for all two-dimensional examples.

4.4. Two-dimensional case. Let us check that all known two-dimensional generalized Lamé operators are algebraically integrable. Apart from the root systems $A_2$, $BC_2$ and $G_2$ considered previously, we have three deformed cases, namely, the $A_{1,1}(m)$ case, the Hietarinta operator and the deformed $BC_2$ case. First, let us consider the $A_{1,1}(m)$ case:
\[
L = -\partial_1^2 - \partial_2^2 - \partial_3^2 + 2m(m+1)\wp(x_1 - x_2) + 2(m+1)\wp(x_1 - \sqrt{m}\,x_3) + 2(m+1)\wp(x_2 - \sqrt{m}\,x_3). \tag{4.4}
\]
In this case we can use the result of [32] where the complete integrability of (4.4) was established. First, $L$ obviously commutes with $L_0 = \partial_1 + \partial_2 + \frac{1}{\sqrt{m}}\partial_3$.

Proposition 4.4 ([32]). For any $m$ there exists a third order operator $L_2$ commuting with $L_0$, $L$, and its highest symbol is $\partial_1^3 + \partial_2^3 + \sqrt{m}\,\partial_3^3$.

It is easy to check that as soon as $m \neq -1$, the highest symbols of $L_0$, $L$, $L_2$ will satisfy the requirements of the strong integrability (notice that for $m = -1$ the operator $L$ is trivial). As a result, for integer $m$ we obtain the algebraic integrability of the operator (4.4). Note that for the special case $m = 2$ the algebraic integrability of (4.4) was demonstrated in [33] by presenting an explicit extra operator commuting with $L_0$, $L$, $L_2$.

Now let us consider the Hietarinta operator
\[
L = -\partial_1^2 - \partial_2^2 - \partial_3^2 + 2(a_1^2 + a_2^2)\wp(a_1 x_1 - a_2 x_2) + 2(a_2^2 + a_3^2)\wp(a_2 x_2 - a_3 x_3) + 2(a_3^2 + a_1^2)\wp(a_3 x_3 - a_1 x_1), \tag{4.5}
\]
where $a_1^2 + a_2^2 + a_3^2 = 0$. Such $L$ commutes with $L_0 = a_1^{-1}\partial_1 + a_2^{-1}\partial_2 + a_3^{-1}\partial_3$, and we need one more operator for the complete integrability (we assume that all $a_i$ are nonzero, otherwise $L$ is reducible). Such an operator was found in [19].

Proposition 4.5 ([19]). There exists a third order operator $L_2$ commuting with $L_0$, $L$ above, and its highest symbol is $a_1\partial_1^3 + a_2\partial_2^3 + a_3\partial_3^3$.

If we denote the highest symbols of these three operators as $s_1, s_2, s_3$, then it is easy to check that the system $s_1(\xi) = s_2(\xi) = s_3(\xi) = 0$ has the only solution $\xi = 0$; the only exception is the case when $a_1^3 = a_2^3 = a_3^3$. As a result, we conclude that for all values of the parameters (apart from the case $a_1^3 = a_2^3 = a_3^3$) the Hietarinta operator is strongly integrable and, thus, algebraically integrable. Note that this also follows from [19], where one more operator commuting with $L_0$, $L$, $L_2$ was found.

Remark 4.6. In the case $a_1^3 = a_2^3 = a_3^3$ the operator (4.5) is still algebraically integrable, although it is no longer strongly integrable.

Finally, let us consider the deformed $BC_2$ case. The Schrödinger operator $L$ has the following form:
\[
L = -\partial_x^2 - \partial_y^2 + U(x, y), \tag{4.6}
\]
where $U = 2(k+1)\bigl(\wp(x + \sqrt{k}\,y) + \wp(x - \sqrt{k}\,y)\bigr) + v(x) + w(y)$ and $v$, $w$ are given by the expressions
\[
v = \sum_{s} m_s(m_s+1)\wp(x + \omega_s), \qquad w = k\sum_{s} l_s(l_s+1)\wp(\sqrt{k}\,y + \omega_s). \tag{4.7}
\]
Here $\omega_0, \omega_1, \omega_2, \omega_3$ are the half-periods and $m_s$, $l_s$ and $k$ are nine parameters such that $k = (2m_s+1)/(2l_s+1)$ for all $s = 0, 1, 2, 3$ (thus, effectively, $L$ contains five independent parameters).

Proposition 4.7. For any values of the parameters $m_s$, $l_s$, $k$ such that $k = \frac{2m_s+1}{2l_s+1}$ the following operator commutes with $L$:
\[
\begin{aligned}
M ={}& -\partial_x^4 - k\,\partial_y^4 + 2(U - w)\,\partial_x^2 - 4\sqrt{k}\,(k+1)\bigl(\wp(x + \sqrt{k}\,y) - \wp(x - \sqrt{k}\,y)\bigr)\partial_x\partial_y \\
& + 2k(U - v)\,\partial_y^2 + \bigl(2v' + 2(k+1)(2-k)(\wp'(x + \sqrt{k}\,y) + \wp'(x - \sqrt{k}\,y))\bigr)\partial_x \\
& + \bigl(2k^2 w' + 2\sqrt{k}\,(k+1)(2k-1)(\wp'(x - \sqrt{k}\,y) - \wp'(x + \sqrt{k}\,y))\bigr)\partial_y \\
& - (k+1)^3\,\wp(x + \sqrt{k}\,y)\,\wp(x - \sqrt{k}\,y) \\
& + 8(k^3+1)\bigl(\wp^2(x + \sqrt{k}\,y) + \wp^2(x - \sqrt{k}\,y)\bigr) \\
& - 4(k+1)(v + kw)\bigl(\wp(x + \sqrt{k}\,y) + \wp(x - \sqrt{k}\,y)\bigr) + v'' - v^2 + k(w'' - w^2).
\end{aligned}
\]
Here $v'$, $w'$ and so on are the derivatives with respect to the corresponding variable.

Now since the system $\xi_1^2 + \xi_2^2 = \xi_1^4 + k\xi_2^4 = 0$ does not have nontrivial solutions as soon as $k \neq -1$, we conclude that $L$ is strongly integrable (for $k = -1$ this is also true because $L$ is reducible in that case). Hence, for any integer values of the parameters $l_s$, $m_s$ the operator (4.6)–(4.7) is algebraically integrable.

5. Bloch Solutions

Let $L$ be a generalized Lamé operator which is strongly integrable, thus algebraically integrable. We know already that for generic $\lambda$ the solution space of (3.1) is spanned by the meromorphic double-Bloch solutions. Now we are going to explain how one can, in principle, calculate them.

Let $W$ denote the following linear subspace in the space of meromorphic functions on $\mathbb{C}^n$. First, its elements are holomorphic everywhere apart from the singular locus $\mathrm{Sing} = \bigcup \pi_\alpha^{m,n}$ of the operator $L$, where they may have poles of order $\le m_\alpha$ along $\pi_\alpha^{m,n}$. Next, take any hyperplane $\pi = \pi_\alpha^{m,n}$, with $s_\pi$ denoting the orthogonal reflection with respect to $\pi$. Then any function $\phi \in W$ must have the following property:
\[
\phi(x) - (-1)^{m_\alpha}\phi(s_\pi x) \ \text{ is divisible by } \ (\alpha(x) - m - n\tau)^{m_\alpha + 1}. \tag{5.1}
\]

Proposition 5.1. The subspace $W$ is stable under the action of $L$: $L(W) \subseteq W$. Furthermore, any meromorphic eigenfunction $\phi$ of $L$ must belong to $W$. The same is true for any of the commuting operators $L_2, \dots, L_n$: $L_i(W) \subseteq W$.

Proof. Consider any hyperplane $\pi \in \mathrm{Sing}$ and adjust the coordinates in such a way that $\pi$ is given by the equation $x_n = 0$. Take a generic point in $\pi$ and expand $\phi$ in a Laurent series in the direction normal to $\pi$, i.e. $\phi = \sum_{j \in \mathbb{Z}} a_j (x_n)^j$, $a_j = a_j(x_1, \dots, x_{n-1})$. Now put $M := \{-m_\alpha + 2\mathbb{Z}_{\ge 0}\} \cup \{m_\alpha + 1 + 2\mathbb{Z}_{\ge 0}\}$ and split the series into two parts, with $j \in M$ and with $j \in \mathbb{Z} \setminus M$: $\phi = \phi_1 + \phi_2$. The first claim is that an application of $L$ to $\phi_1$ will produce a series of a similar kind. This follows directly from the quasi-invariance of $u$, proving the first part of the proposition. On the other hand, if $a_j (x_n)^j$ is the first nonzero term in $\phi_2$, then applying $L$ to $\phi_2$ will give a series starting from $(x_n)^{j-2}$, which would contradict the equation $L\phi = \lambda\phi$, thus proving the second claim.

In a similar way, if $L'L = LL'$ for some other operator $L'$, then $L'L^r = L^rL'$ for any $r \ge 1$. Thus, if $W' := L'(W)$ then $L^r(W') \subseteq W'$ for any $r$. Now suppose we could find a function $\phi$ in $W'$ which is not in $W$. Then consider a series expansion of $\phi$ in the direction normal to $\pi$, and split it into $\phi = \phi_1 + \phi_2$ as above. If $\phi_2 \neq 0$, then $L^r\phi$ would have a pole of an arbitrarily high order along $\pi$ (as $r$ increases), which is impossible for an element in $W'$ (since $W' = L'W$ and $L'$ has meromorphic coefficients). This would contradict the inclusion $L^r(W') \subseteq W'$. Thus, $W' \subseteq W$.

Now let $\psi$ be a double-Bloch solution of (3.1), so for appropriate $a, b \in V^*$ and for any $l \in L$ we have:
\[
\psi(x + l) = \psi(x)\,e^{2\pi i\langle a, l\rangle}, \qquad \psi(x + \tau l) = \psi(x)\,e^{2\pi i\langle b, l\rangle}. \tag{5.2}
\]
We know that $\psi$ is meromorphic in $\mathbb{C}^n$ with possible poles along the hyperplanes $\pi_\alpha^{m,n}$, of order $m_\alpha$. In our discussion below we restrict ourselves to the case when all the linear functions $\alpha \in A$ have zero constant term, i.e. $\alpha = \alpha_0 \in V^*$ for all $\alpha$, so $\alpha(x) = \langle\alpha, x\rangle$. Everything extends to the general case with obvious modifications. As a result, we see that $\psi$ can be presented in the form
\[
\psi = \Phi/\delta, \qquad \delta(x) = \prod_{\alpha \in A} \theta(\langle\alpha, x\rangle)^{m_\alpha}, \tag{5.3}
\]
for some holomorphic $\Phi(x)$. Here $\theta = \theta_1$ is the classical (odd) Jacobi theta function,
\[
\theta(z) = \sum_{n \in \mathbb{Z}} \exp\bigl(\pi i (n + 1/2)^2\tau + 2\pi i (n + 1/2)(z + 1/2)\bigr). \tag{5.4}
\]
Recall that $\theta(z)$ has the following translation properties in $z$:
\[
\theta(z + 1) = -\theta(z), \qquad \theta(z + \tau) = -e^{-2\pi i z - \pi i\tau}\,\theta(z).
\]
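The two translation rules can be confirmed numerically from the series (5.4). The following is only a minimal sketch: the helper name `theta`, the truncation order and the sample values of $\tau$ and $z$ are illustrative assumptions, not part of the original argument.

```python
import cmath

def theta(z, tau, terms=12):
    """Truncated series (5.4); `terms` is an assumed cut-off, ample for Im(tau) ~ 1."""
    total = 0j
    for n in range(-terms, terms):
        h = n + 0.5  # the half-integer n + 1/2; the range is symmetric in h
        total += cmath.exp(1j * cmath.pi * h * h * tau
                           + 2j * cmath.pi * h * (z + 0.5))
    return total

tau = 0.3 + 1.1j   # any tau with Im(tau) > 0 (sample value)
z = 0.27 - 0.13j   # arbitrary test point (sample value)

base = theta(z, tau)
shift_one = theta(z + 1, tau)      # should equal -theta(z)
shift_tau = theta(z + tau, tau)    # should equal -exp(-2 pi i z - pi i tau) theta(z)
expected_tau = -cmath.exp(-2j * cmath.pi * z - 1j * cmath.pi * tau) * base
odd_defect = theta(-z, tau) + base  # vanishes because theta is odd
```

The same routine can be reused to check the translation formulas for $\delta$, since $\delta$ is a product of such factors.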

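This determines the translation properties of $\delta$. To write them down, it is convenient to introduce the following linear map $\Omega : V \to V^*$ which is defined as
\[
\Omega : x \mapsto \sum_{\alpha \in A} m_\alpha \langle\alpha, x\rangle\,\alpha. \tag{5.5}
\]
Note that $\Omega$ maps the lattice $L$ to a sublattice in $M = \mathrm{Hom}(L, \mathbb{Z})$. We also need a covector $\rho = \frac{1}{2}\sum_{\alpha \in A} m_\alpha \alpha$.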

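Under these notations, we have the following translation formulas for any $l \in L$:
\[
\delta(x + l) = e^{2\pi i\langle\rho, l\rangle}\,\delta(x), \tag{5.6}
\]
\[
\delta(x + l\tau) = e^{2\pi i\langle\rho, l\rangle - 2\pi i\langle\Omega l, x\rangle - \pi i\langle\Omega l, l\rangle\tau}\,\delta(x). \tag{5.7}
\]
As a corollary of (5.2) and (5.6)–(5.7), we conclude that the numerator $\Phi$ in (5.3) must have the translation properties as follows:
\[
\Phi(x + l) = e^{2\pi i\langle a + \rho, l\rangle}\,\Phi(x), \qquad \Phi(x + l\tau) = e^{2\pi i\langle b + \rho, l\rangle - 2\pi i\langle\Omega l, x\rangle - \pi i\langle\Omega l, l\rangle\tau}\,\Phi(x).
\]
The vector space of entire functions with such properties is finite-dimensional. Indeed, let $\omega(x, y)$ denote the bilinear form on $V$ associated with the operator $\Omega$:
\[
\omega(x, y) = \sum_{\alpha \in A} m_\alpha \langle\alpha, x\rangle\langle\alpha, y\rangle. \tag{5.8}
\]
It is symmetric positive and integer-valued on the lattice $L$. Let us define $\Theta\begin{bmatrix} p \\ q \end{bmatrix}$ by the following series:
\[
\Theta\begin{bmatrix} p \\ q \end{bmatrix}(x) = \sum_{l \in L} \exp\bigl(2\pi i\,\omega(l + p, x + q) + \pi i\tau\,\omega(l + p, l + p)\bigr). \tag{5.9}
\]
It has the following translation properties:
\[
\Theta\begin{bmatrix} p \\ q \end{bmatrix}(x + l) = e^{2\pi i\,\omega(l, p)}\,\Theta\begin{bmatrix} p \\ q \end{bmatrix}(x), \tag{5.10}
\]
\[
\Theta\begin{bmatrix} p \\ q \end{bmatrix}(x + l\tau) = e^{-2\pi i\,\omega(l, x + q) - \pi i\tau\,\omega(l, l)}\,\Theta\begin{bmatrix} p \\ q \end{bmatrix}(x). \tag{5.11}
\]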

It is easy to show (see e.g. [35]) that the space of holomorphic functions with such translation properties has dimension equal to $[M : \Omega L]$; this is equal to $\det(\Omega_{ij})$, where $\Omega_{ij} = \langle\Omega e_i, e_j\rangle$ for some basis $e_1, \dots, e_n$ of $L$. A natural basis in this space is given by the functions $\Theta\begin{bmatrix} p \\ q + r \end{bmatrix}$ with $r \in \Omega^{-1}M$ running over the set of representatives in $\Omega^{-1}(M)/L$. For later purposes, let us use a slightly different basis, namely, the functions
\[
\Phi_r = e^{\langle k, x\rangle}\,\Theta(x + \gamma + r), \qquad \Theta := \Theta\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad r \in \Omega^{-1}(M)/L. \tag{5.12}
\]
It is easy to relate the parameters $k \in V^*$ and $\gamma \in V$ to $p, q$:
\[
k = 2\pi i\,\Omega p, \qquad \gamma = q + \tau p.
\]
Let us denote the linear space generated by the functions (5.12) as $U_{k,\gamma}$. We conclude that $\psi$ must belong to the linear space $\delta^{-1}U_{k,\gamma}$ with $k, \gamma$ related in a simple way to the "quasimomenta" $a, b$ in (5.2). Now recall Proposition 5.1. It implies that $\psi$ must belong to the (finite-dimensional) subspace $W_{k,\gamma} = \delta^{-1}U_{k,\gamma} \cap W$. It also implies that $L$ (as well as any of the $L_i$'s) preserves this finite-dimensional space, so we can eventually find $\psi$ by diagonalizing the action of $L$ on $W_{k,\gamma}$. Note that since the
double-Bloch solutions $\psi$ form an $n$-parametric family (with $\lambda = (\lambda_1, \dots, \lambda_n)$ being the parameters), this space $W_{k,\gamma}$ will be nonzero only for $(k, \gamma)$ belonging to a certain $n$-dimensional subvariety. In most cases $\dim W_{k,\gamma} \le 1$, so the (unique) function $\psi_{k,\gamma}$ generating $W_{k,\gamma} \neq 0$ will be an eigenfunction for $L$ automatically.

All this simplifies a little for the Calogero–Moser models, so let us consider this case in more detail. For a given reduced, irreducible root system $R$ in $V^*$ and a fixed $W$-invariant $m(\alpha)$, we consider the Calogero–Moser operator
\[
L = -\Delta + \sum_{\alpha \in R_+} m_\alpha(m_\alpha + 1)(\alpha, \alpha)\,\wp(\langle\alpha, x\rangle \,|\, \tau). \tag{5.13}
\]
The Bloch solutions must be of the form $\psi = \delta^{-1}\Phi$ as in (5.3). Note that $\delta$ in this case has the following symmetry:
\[
\delta(wx) = \varepsilon_m(w)\,\delta(x) \quad \text{for any } w \in W, \tag{5.14}
\]
where $\varepsilon_m$ is the one-dimensional character of $W$ such that
\[
\varepsilon_m(s_\alpha) = (-1)^{m_\alpha}. \tag{5.15}
\]
The bilinear symmetric form (5.8) is obviously $W$-invariant, and since $R$ is irreducible, $\omega$ must be proportional to the $W$-invariant scalar product $(x, y)$. So, $\omega(x, y) = \kappa \cdot (x, y)$ for some $\kappa$ which depends on $m_\alpha$. (For instance, if $R$ consists of one $W$-orbit only and $m_\alpha \equiv m$, one has $\omega(x, y) = mh\,(x|y)$, where $h = h(R)$ is the Coxeter number and the form $(x|y)$ on $V$ is normalized in such a way that $(\alpha^\vee|\alpha^\vee) = 2$ for all $\alpha^\vee \in R^\vee$.) The lattice $L$ in this case is the coweight lattice $P^\vee$ of $R$, while $M$ is the root lattice $Q$. Still, the numerator $\Phi$ must belong to the finite-dimensional space $U_{k,\gamma}$ spanned by the functions (5.12).

We already know that a Bloch solution $\psi = \delta^{-1}\Phi$ appears only for those $k, \gamma$ when $\delta^{-1}U_{k,\gamma} \cap W \neq 0$. Such $k, \gamma$ can be effectively determined. Indeed, due to (5.14), the conditions (5.1) for $\psi$ reduce to the quasi-invariance of $\Phi$:
\[
\Phi(x) - \Phi(s_\alpha x) \ \text{ is divisible by } \ \langle\alpha, x\rangle^{2m_\alpha + 1} \quad \text{for all } \alpha \in R. \tag{5.16}
\]

(These are local conditions near $\pi_\alpha = \{x : \langle\alpha, x\rangle = 0\}$; similar conditions for other hyperplanes $\pi_\alpha^{m,n} \in \mathrm{Sing}$ will follow because $\psi$ is quasiperiodic.) These conditions can be rewritten as
\[
\langle\alpha^\vee, \partial\rangle^{2j-1}\Phi \equiv 0 \quad \text{for } \langle\alpha, x\rangle = 0 \text{ and } j = 1, \dots, m_\alpha, \tag{5.17}
\]
with $\langle\alpha^\vee, \partial\rangle$ denoting the derivative in the $\alpha^\vee$-direction. Now recall that we have the period lattice $L = P^\vee$, and $\Phi$ belongs to the linear space $U_{k,\gamma}$ of the functions with the translation properties (5.10)–(5.11). Consider the following sublattice $L_\alpha \subseteq L$:
\[
L_\alpha := L' \oplus L'', \qquad L' = L \cap \ker\alpha, \qquad L'' = L \cap \mathbb{R}\alpha^\vee. \tag{5.18}
\]
Let $U^\alpha_{k,\gamma}$ denote the space of theta functions with the same translation properties, but for the translations $l$ from $L_\alpha$ only. Obviously, we have a natural inclusion map $U_{k,\gamma} \to U^\alpha_{k,\gamma}$. It is possible to describe this linear map explicitly, using the standard bases in both spaces (look at the formula (6.4) below which is a particular example of such a relation). According to (5.18), the lattice $L_\alpha$ is the direct orthogonal sum of two sublattices. Thus,
the corresponding theta functions from $U^\alpha_{k,\gamma}$ will be the products of the $(n-1)$-dimensional theta functions related to $L'$ and the one-dimensional theta functions related to the lattice $L''$. This corresponds to the decomposition of $U^\alpha_{k,\gamma}$ into a tensor product: $U^\alpha_{k,\gamma} = (U^\alpha_{k,\gamma})' \otimes (U^\alpha_{k,\gamma})''$. Applying the derivative in the $\alpha^\vee$-direction will affect the one-dimensional theta functions only. As a result, for each $j$ we have an explicit linear map $\Lambda_{\alpha,j}$ from $U_{k,\gamma}$ to $(U^\alpha_{k,\gamma})'$ given by
\[
\Lambda_{\alpha,j}\,\phi = \langle\alpha^\vee, \partial\rangle^{2j-1}\phi\,\big|_{\langle\alpha, x\rangle = 0}.
\]
This map is given by a matrix whose entries are certain combinations of one-dimensional theta-functions and their derivatives. Now we can organize all these maps for $\alpha \in R_+$ into one big linear map
\[
\Lambda : U_{k,\gamma} \to \bigoplus_{\substack{\alpha \in R_+ \\ j = 1, \dots, m_\alpha}} (U^\alpha_{k,\gamma})', \qquad \Lambda = (\Lambda_{\alpha,j})_{\alpha \in R_+,\ j = 1, \dots, m_\alpha}, \tag{5.19}
\]

with $\Lambda_{\alpha,j}$ defined above. We can think of $\Lambda$ as an $M \times N$ matrix, where $N$, $M$ are the dimensions of the source and the target spaces, respectively. The outcome is the following: a double-Bloch solution $\psi$ appears exactly for those $k, \gamma$ where this linear map has nontrivial kernel. This gives equations on such $k, \gamma$ (by equating to zero all $N \times N$ minors of the matrix $\Lambda$). In its turn, the kernel will determine a corresponding Bloch eigenfunction. (If the kernel has dimension $> 1$, it still defines an invariant subspace for the action of $L$, so we have at least one double-Bloch solution.) So, let $\widetilde{C}$ denote an analytic subvariety in $V^* \times V$ given by
\[
\widetilde{C} = \{(k, \gamma) \mid \ker\Lambda \neq 0\}. \tag{5.20}
\]
Formulas (5.10)–(5.12) make clear that $\widetilde{C}$ is invariant under the following transformations of $k, \gamma$:
\[
(k, \gamma) \to (k, \gamma + l), \qquad l \in \Omega^{-1}M, \tag{5.21}
\]
\[
(k, \gamma) \to (k + 2\pi i\,\Omega l, \gamma + \tau l), \qquad l \in \Omega^{-1}M. \tag{5.22}
\]
Now, for a function of one variable $f(z) = e^{kz}\theta(z + c)$ its derivatives at $z = 0$ are obviously polynomial in $k$. Therefore $\widetilde{C}$, after being factored by the translations above, can be considered as an algebraic covering of an abelian variety (a product of elliptic curves) $\mathbb{C}^n/(L' + \tau L')$, where $L' = \Omega^{-1}M$.

Proposition 5.2. The double-Bloch eigenfunctions of the Calogero–Moser operator (5.13) are parametrized by the points of an algebraic variety which is a covering of an abelian variety $\mathbb{C}^n/(L' + \tau L')$, where $L' = \Omega^{-1}Q$, $Q$ is the root lattice and the map $\Omega$ is defined by (5.5).

A similar analysis applies to any (integrable) generalized Lamé operator, so the double-Bloch eigenfunctions are also parametrized by the points of an algebraic variety covering a product of elliptic curves.

Now let $C$ denote the result of factoring the variety (5.20) by the translations (5.21)–(5.22). It is an algebraic variety parametrizing the double-Bloch eigenfunctions of $L$. Below, following [18], we will refer to it as the Hermite–Bloch variety for $L$. It differs from the complex Bloch variety, traditionally defined as the set of $(\mu, E) \in (\mathbb{C}^\times)^n \times \mathbb{C}$ such that there exists $\psi$ with $L\psi = E\psi$ and $\psi(x + l_i) = \mu_i\psi(x)$, where $l_1, \dots, l_n$ is a basis in $L$. Note that the latter is a transcendental complex analytic variety.

Remark 5.3. In case of the Calogero–Moser operator related to a root system $R$, there is a natural action of the Weyl group $W$ on the Hermite–Bloch variety $C$. Also, there is a natural projection of $C$ onto $\mathbb{C}^n$ sending $\psi = \psi_{k,\gamma}$ to the set of eigenvalues $\lambda_i$, $L_i\psi = \lambda_i\psi$. This is a $|W|$-sheeted covering and the Weyl group acts on $C$ by permuting the points in the fiber.

Remark 5.4. Note that our results do not contradict the theorem of Feldman–Knörrer–Trubowitz [36] in dimension two, since their result only applies to a real-valued smooth potential $u$ in $\mathbb{R}^2$.

The Hermite–Bloch variety $C$ is a subvariety in the total space of a certain bundle over the product of elliptic curves defined by (5.21)–(5.22). This bundle naturally compactifies to a bundle with the fibers isomorphic to the projective space $\mathbb{P}^n$. As a result, $C$ compactifies to a projective variety, covering the product of elliptic curves.

The variety $C$ is closely related to the so-called spectral variety which is defined as follows. Suppose $L$ is a strongly integrable generalized Lamé operator, so we have $n$ commuting operators $L_1 = L, \dots, L_n$, which generate a commutative subalgebra in the ring of PDO with meromorphic elliptic coefficients. Then by [11], Theorem 2.2, the centralizer of this subalgebra will be a maximal commutative ring which we will denote by $Z(L)$ (using Proposition 5.1 one can show that the operators in this ring will share a common family of the double-Bloch eigenfunctions). Each operator in $Z(L)$ must have constant highest symbols, by Lemma 3.2. Then from the strong integrability we immediately derive that $Z(L)$ is finitely generated. Thus, $\mathrm{Spec}\,Z(L)$ defines an affine algebraic variety, which we call the spectral variety. It is not quite clear whether the spectral variety is isomorphic to the Hermite–Bloch variety (for instance, the latter may not be affine), but at least they must be birationally equivalent.

Finally, let us remark on some algebraic geometry behind the double-Bloch solutions and the Hermite–Bloch variety for the Calogero–Moser system (5.13). We consider the torus $T = \mathbb{C}^n/(Q^\vee + \tau Q^\vee)$, where $Q^\vee = R^\vee \otimes \mathbb{Z}$ is the coroot lattice. Let us define the following subsheaf $Q \subset \mathcal{O}(T)$ of the structure sheaf of $T$ by requiring its local sections to have zero normal derivatives of order $1, 3, \dots, 2m_\alpha - 1$ along each of the hyperplanes $\langle\alpha, x\rangle = 0$ (considered as hypertori in $T$). The sheaf $Q$ can be considered as the structure sheaf $\mathcal{O}(X)$ of a singular variety $X$, with $T$ being its injective normalization (cf. [39]). Such $X$ is projective; it is a $|W|$-sheeted covering of the weighted projective space $T/W$ considered by Looijenga [37], see also [38]. Notice that from the results of [40] it follows that $X$ is Cohen–Macaulay and Gorenstein. Let us consider now the group $\mathrm{Pic}(X)$ of invertible sheaves on $X$. Then each of the double-Bloch solutions $\psi_{k,\gamma}$ represents a meromorphic section of a degree zero line bundle on $X$ (to define degree, we use the pull-back to the torus $T$). In this way the Hermite–Bloch variety for the Calogero–Moser system becomes an $n$-dimensional subvariety in $\mathrm{Pic}^0(X)$. It would be interesting to study this relation in more detail.

Remark 5.5. An interesting thing is to analyse how the spectral variety changes when $\tau$ goes to $+i\infty$ (trigonometric limit). In this limit the spectral variety becomes rational and is relatively well understood. Thus, one could think of the whole family depending on $\tau$ as a deformation of this rational variety. This point of view was used in [41] to construct the spectral surface in the simplest $A_2$ case.

5.1. Discrete spectrum eigenstates. Let us explain how the Bloch solutions can be used to construct the discrete spectrum eigenstates of $L$. Our discussion is strictly confined to
the Calogero–Moser operator (5.13). We take a purely imaginary $\tau$; this ensures that the potential in (5.13) is real-valued for $x \in \mathbb{R}^n$. The Calogero–Moser operator $L$ is defined on a dense subset of $L^2(\mathbb{R}^n)$ and it is self-adjoint only formally, and its Bloch solutions are singular. It has square-integrable eigenstates, though. Namely, let $\psi = \psi_{k,\gamma}$ be one of the double-Bloch solutions constructed in the previous section. Given such a $\psi$, let us symmetrize it as follows:
\[
\Psi(x) = \sum_{w \in W} \varepsilon_m(w)\,\det w\;\psi(wx), \tag{5.23}
\]
where $W$ is the Weyl group of the root system $R$ and $\varepsilon_m$ is the character (5.15). The Calogero–Moser operator $L$ is $W$-invariant, thus the constructed $\Psi$ will again be its eigenfunction (for the same reason, it will be an eigenfunction for all commuting operators $L_i$). A priori, $\Psi$ might have poles in $\mathbb{R}^n$ along the hyperplanes $\langle\alpha, x\rangle = c$, $c \in \mathbb{Z}$. However, it is easy to see that $\Psi$ has no poles along the hyperplanes $\langle\alpha, x\rangle = 0$. This follows immediately from the properties (5.1) of $\psi$. To avoid the appearance of singularities on other hyperplanes $\langle\alpha, x\rangle = c$, one has to impose the condition that all the terms $\psi(wx)$ in the sum (5.23) have the same Bloch–Floquet multipliers with respect to a shift $x \to x + l$ with $l \in L = P^\vee$. This means that $\exp\langle k, l\rangle = \exp\langle wk, l\rangle$ for all $w \in W$ and $l \in P^\vee$, which in its turn implies that $k$ belongs to the lattice $2\pi i P$. So, we have the following result.

Proposition 5.6. Let $L$ be the Calogero–Moser operator (5.13). Then for any point $(k, \gamma)$ of its Hermite–Bloch variety which satisfies the additional condition $k \in 2\pi i P$ (with $P$ being the weight lattice for $R$), the corresponding function $\Psi$ given by (5.23) (if nonzero) will be an eigenfunction of the Calogero–Moser operator (5.13) and of the higher operators $L_2, \dots, L_n$ which is nonsingular in $\mathbb{R}^n$.

By construction, $\Psi$ vanishes along the hyperplanes $\langle\alpha, x\rangle \in \mathbb{Z}$, and it gets a factor of $(-1)^{m_\alpha + 1}$ under the orthogonal reflection with respect to such a hyperplane. Since these are the reflection hyperplanes of the affine Weyl group of $R$, they cut $\mathbb{R}^n$ into its fundamental domains (alcoves), so the restriction of $\Psi$ to each alcove will be, essentially, the same. We can restrict $\Psi$ to any alcove, extending it by zero outside, and this gives us a finitely supported smooth eigenfunction of $L$ (notice that in the complex domain it still has poles). We see from this that the discrete spectrum of $L$ in $L^2(\mathbb{R}^n)$ is infinitely degenerate (one says that the spectral problem for $L$ splits into identical spectral problems on each of the alcoves).

Morally, this is the reason why one should expect the same spectrum considering $L$ not on $L^2(\mathbb{R}^n)$ but on the space $L^2(T)^W$ of $W$-invariant functions on the torus $T = \mathbb{R}^n/Q^\vee$, as in [46]. The latter case is simpler from the technical point of view, since the operator $L$ is essentially self-adjoint on $L^2(T)^W$, see [46] for the details. In [46] Komori and Takemura considered the elliptic Calogero–Moser problems as a perturbation (in $\tau$) of the trigonometric case $\tau = +i\infty$, and Theorem 3.7 of [46] claims that for sufficiently small $p = e^{2\pi i\tau}$ the family of eigenfunctions (Jack polynomials) which corresponds to $p = 0$ admits analytic continuation in $p$, and the resulting functions will give rise to a complete orthogonal family of eigenfunctions of $L$ in $L^2(T)^W$. One can show that our family $\Psi$ in the limit $\tau \to +i\infty$ specializes to the Jack polynomials. Comparing this with the previous discussion, we conclude that our family must coincide with the one considered in [46].

6. Calogero–Moser Model of $B_2$ Type

In this section we consider the following 2-dimensional Schrödinger operator
\[
L = -\Delta + 2\wp(x_1) + 2\wp(x_2) + 4\wp(x_1 - x_2) + 4\wp(x_1 + x_2), \tag{6.1}
\]
where $\wp(z) = \wp(z|\tau)$ is the Weierstrass $\wp$-function with the periods $1, \tau$ ($\mathrm{Im}\,\tau > 0$). Our goal is to calculate its double-Bloch eigenfunctions, i.e. such $\psi$ that
\[
\psi(x + e_j) = \lambda_j\,\psi(x), \tag{6.2}
\]
\[
\psi(x + \tau e_j) = \mu_j\,\psi(x) \qquad (j = 1, 2), \tag{6.3}
\]

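where $(e_1, e_2)$ is the standard basis in $\mathbb{C}^2$ and $(\lambda_1, \lambda_2, \mu_1, \mu_2)$ are fixed Bloch–Floquet multipliers.

First we recall some standard definitions and formulas from the theory of theta-functions, see [43, 35]. Let $\theta\begin{bmatrix} \alpha \\ \beta \end{bmatrix}$ be the one-dimensional theta-function (with characteristics), defined by the following series:
\[
\theta\begin{bmatrix} \alpha \\ \beta \end{bmatrix}(z|\tau) = \sum_{n \in \mathbb{Z}} \exp\bigl(\pi i (n + \alpha)^2\tau + 2\pi i (n + \alpha)(z + \beta)\bigr).
\]
Notice that $\alpha$ and $\beta$ are defined modulo 1:
\[
\theta\begin{bmatrix} \alpha + 1 \\ \beta \end{bmatrix} = \theta\begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \qquad \theta\begin{bmatrix} \alpha \\ \beta + 1 \end{bmatrix} = e^{2\pi i\alpha}\,\theta\begin{bmatrix} \alpha \\ \beta \end{bmatrix}.
\]
Later we will need the following formula which can be easily derived from the definitions:
\[
\theta\begin{bmatrix} \alpha_1 \\ 0 \end{bmatrix}(x_1|\tau)\,\theta\begin{bmatrix} \alpha_2 \\ 0 \end{bmatrix}(x_2|\tau) = \theta\begin{bmatrix} \alpha_+ \\ 0 \end{bmatrix}(x_+|2\tau)\,\theta\begin{bmatrix} \alpha_- \\ 0 \end{bmatrix}(x_-|2\tau) + \theta\begin{bmatrix} \alpha_+ + \frac12 \\ 0 \end{bmatrix}(x_+|2\tau)\,\theta\begin{bmatrix} \alpha_- + \frac12 \\ 0 \end{bmatrix}(x_-|2\tau), \tag{6.4}
\]
where $\alpha_\pm = \frac12(\alpha_1 \pm \alpha_2)$, $x_\pm = x_1 \pm x_2$. We will mostly use $\theta\begin{bmatrix} 1/2 \\ 1/2 \end{bmatrix}$, which we will denote simply by $\theta(z)$ and which will always stand for the odd Jacobi theta function (5.4).

According to the previous section, $\psi$ must have the form
\[
\psi = \frac{\Phi(x_1, x_2)}{\theta(x_1|\tau)\,\theta(x_2|\tau)\,\theta(x_1 - x_2|\tau)\,\theta(x_1 + x_2|\tau)}. \tag{6.5}
\]
Here $\Phi$ is nonsingular in $\mathbb{C}^2$. The translation properties for $\psi$ easily translate into the properties of $\Phi$:
\[
\Phi(x + e_j) = -\lambda_j\,\Phi(x), \qquad \Phi(x + \tau e_j) = -\mu_j\,e^{-3\pi i\tau - 6\pi i x_j}\,\Phi(x).
\]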
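The product formula (6.4) is used repeatedly below, so a quick numerical sanity check may be helpful. This is only a sketch under stated assumptions: the function names `theta_char`, `product_lhs`, `product_rhs`, the truncation order and the sample arguments are illustrative choices, not taken from the paper.

```python
import cmath

def theta_char(alpha, beta, z, tau, terms=12):
    """Truncated series for the theta-function with characteristics (alpha, beta)."""
    total = 0j
    for n in range(-terms, terms + 1):
        h = n + alpha
        total += cmath.exp(1j * cmath.pi * h * h * tau
                           + 2j * cmath.pi * h * (z + beta))
    return total

def product_lhs(a1, a2, x1, x2, tau):
    # Left-hand side of (6.4): product of two theta-functions with modulus tau.
    return theta_char(a1, 0, x1, tau) * theta_char(a2, 0, x2, tau)

def product_rhs(a1, a2, x1, x2, tau):
    # Right-hand side of (6.4): two products of theta-functions with modulus 2*tau.
    ap, am = (a1 + a2) / 2, (a1 - a2) / 2
    xp, xm = x1 + x2, x1 - x2
    return (theta_char(ap, 0, xp, 2 * tau) * theta_char(am, 0, xm, 2 * tau)
            + theta_char(ap + 0.5, 0, xp, 2 * tau) * theta_char(am + 0.5, 0, xm, 2 * tau))

# Sample data (assumed): characteristics of the kind appearing in (6.6).
tau = 0.25 + 1.0j
args = (1 / 3, 2 / 3, 0.2 + 0.1j, -0.4 + 0.05j, tau)
defect = abs(product_lhs(*args) - product_rhs(*args))
```

The identity reflects the splitting of the pairs $(n_1, n_2)$ by the parity of $n_1 + n_2$: even sums produce the first product on the right, odd sums the second.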

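Standard considerations from the theory of theta-functions show that the linear space of functions with these properties has dimension 9 and $\Phi$ must be of the form
\[
\Phi = \exp(K_1 x_1 + K_2 x_2) \sum_{0 \le i, j \le 2} c_{ij}\,\theta\begin{bmatrix} i/3 \\ 0 \end{bmatrix}(3x_1 + \gamma_1|3\tau)\,\theta\begin{bmatrix} j/3 \\ 0 \end{bmatrix}(3x_2 + \gamma_2|3\tau), \tag{6.6}
\]
where $c_{ij}$ are arbitrary constants and the parameters $\gamma_j$, $K_j$ relate to $\lambda_j$, $\mu_j$ as follows:
\[
\lambda_j = -e^{K_j}, \qquad \mu_j = -e^{-2\pi i\gamma_j + K_j\tau}. \tag{6.7}
\]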
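The dimension 9 agrees with the general count from Section 5, where the dimension equals the determinant of the form (5.8) in a basis of the period lattice. A small sketch, assuming the period lattice $\mathbb{Z}^2$ of (6.2)–(6.3), the four positive roots of $B_2$ and all multiplicities equal to 1; the helper names `gram_matrix` and `determinant` are hypothetical.

```python
from fractions import Fraction

def gram_matrix(covectors, multiplicities):
    """Matrix of omega(x, y) = sum m_a <a, x><a, y> in the standard basis."""
    n = len(covectors[0])
    return [[sum(m * a[i] * a[j] for a, m in zip(covectors, multiplicities))
             for j in range(n)] for i in range(n)]

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap flips the sign
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det

# Positive roots of B2 as covectors: x1, x2, x1 - x2, x1 + x2, each with m = 1.
b2_covectors = [(1, 0), (0, 1), (1, -1), (1, 1)]
b2_multiplicities = [1, 1, 1, 1]
omega = gram_matrix(b2_covectors, b2_multiplicities)
dimension = determinant(omega)
```

Here the form is $3(x_1 y_1 + x_2 y_2)$, which is also why the arguments $3x_j$ and the modulus $3\tau$ appear in (6.6).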

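Remark 6.1. The shifting
\[
\gamma_j \to \gamma_j + 1 \tag{6.8}
\]
does not change the space (6.6); the same applies to the shifts
\[
(K_j, \gamma_j) \to (K_j + 2\pi i, \gamma_j + \tau). \tag{6.9}
\]
Conversely, for any given $\lambda_j$, $\mu_j$ the corresponding $(\gamma_j, K_j)$ are determined uniquely modulo the shifts (6.8)–(6.9).

Now, in accordance with Proposition 5.1, we impose the following "vanishing" conditions on $\Phi$:
\[
\partial_1\Phi \equiv 0 \quad \text{for } x_1 = 0, \tag{6.10}
\]
\[
\partial_2\Phi \equiv 0 \quad \text{for } x_2 = 0, \tag{6.11}
\]
\[
(\partial_1 + \partial_2)\Phi \equiv 0 \quad \text{for } x_1 + x_2 = 0, \tag{6.12}
\]
\[
(\partial_1 - \partial_2)\Phi \equiv 0 \quad \text{for } x_1 - x_2 = 0. \tag{6.13}
\]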

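As we will see below, for a certain 2-dimensional surface in the 4-dimensional space of parameters $(k_j, a_j)$, the conditions (6.10)–(6.13) cut a one-dimensional subspace in the 9-dimensional space (6.6). Thus, the corresponding $\psi$ will be an eigenfunction for $L$ automatically. To determine the corresponding $(K_j, \gamma_j)$, let us rewrite $\Phi$ using (6.4) and making the identification $\theta\begin{bmatrix} \alpha + 1 \\ 0 \end{bmatrix} = \theta\begin{bmatrix} \alpha \\ 0 \end{bmatrix}$:
\[
\Phi = e^{K_+ x_+ + K_- x_-} \sum_{0 \le l, m \le 2} \widetilde{c}_{lm}\left(\theta\begin{bmatrix} l/3 \\ 0 \end{bmatrix}(3x_+ + \gamma_+|6\tau)\,\theta\begin{bmatrix} m/3 \\ 0 \end{bmatrix}(3x_- + \gamma_-|6\tau) + \theta\begin{bmatrix} l/3 + 1/2 \\ 0 \end{bmatrix}(3x_+ + \gamma_+|6\tau)\,\theta\begin{bmatrix} m/3 + 1/2 \\ 0 \end{bmatrix}(3x_- + \gamma_-|6\tau)\right),
\]
where $K_\pm = \frac12(K_1 \pm K_2)$, $\gamma_\pm = \gamma_1 \pm \gamma_2$ and
\[
\widetilde{c}_{lm} = c_{ij} \quad \text{with } i \equiv l + m \pmod 3, \quad j \equiv l - m \pmod 3. \tag{6.14}
\]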

It is easy to see now that (6.12) leads to six linear equations on $\widetilde{c}_{lm}$ of the form
\[
\sum_{l=0}^{2} A_l\,\widetilde{c}_{lm} = 0, \qquad \sum_{l=0}^{2} B_l\,\widetilde{c}_{lm} = 0 \qquad (m = 0, 1, 2)
\]
for certain explicitly given $A = (A_l)$, $B = (B_l)$. For generic parameters $\gamma_1, \gamma_2$ the vectors $A$, $B$ will be linearly independent. Therefore, these 6 equations determine the 2-dimensional kernel of the matrix $\widetilde{C} = (\widetilde{c}_{lm})$. Similarly, (6.13) gives six more equations for $\widetilde{c}_{lm}$, which determine the cokernel of $\widetilde{C}$. Thus, for any $K_1, K_2$ and generic $\gamma_1, \gamma_2$ the vanishing conditions (6.12)–(6.13) determine $\widetilde{c}_{lm}$ (and, hence, $\Phi$) uniquely up to a common factor. In principle, it is straightforward to write down explicit expressions for the coefficients $c_{ij}$ but they are cumbersome and not very useful. However, there is a better way of getting an expression for $\Phi$, by taking a limit $\omega \to 0$ in the formula for the difference case, see (7.7) below. It turns out that $\Phi$ has the form
\[
\Phi = \frac{e^{(k,x)}}{\theta(a_1|\tau)\,\theta(a_2|\tau)} \sum_{0 \le i, j \le 2} b_{ij}\,(k_1 + k_2)^i (k_1 - k_2)^j, \tag{6.15}
\]
where the parameters $k_1, k_2$ and $a_1, a_2$ relate to $K_j$, $\gamma_j$ as
\[
k_j = K_j + \pi i, \qquad a_j = \gamma_j - (1 + \tau)/2. \tag{6.16}
\]

The coefficients $b_{ij}$ depend on $x$ and $a_1, a_2$ and are given by the following recipe. Let us introduce formal commutative variables $A, B, C, D$. We also need the following scalar $\lambda$ depending on $\tau$: $\lambda := \theta'''(0|\tau)/\theta'(0|\tau)$. Now introduce $U$, $V$ as $U := A + B - C$, $V := A + B - D$ and put
\[
\begin{aligned}
b_{22} &= 1, & b_{21} &= 2U, & b_{20} &= -\lambda + U^2, \\
b_{12} &= 2V, & b_{11} &= 4UV, & b_{10} &= 2V(-\lambda + U^2), \\
b_{02} &= -\lambda + V^2, & b_{01} &= 2U(-\lambda + V^2), & b_{00} &= (-\lambda + U^2)(-\lambda + V^2).
\end{aligned} \tag{6.17}
\]
After that one opens the brackets, so each of the $b_{ij}$ becomes a sum of monomials in $A, B, C, D$ with scalar coefficients, and then replaces each monomial using the following rule:
\[
c\,A^p B^q C^r D^s \longrightarrow c\,\theta^{(p)}(x_1 + a_1)\,\theta^{(q)}(x_2 + a_2)\,\theta^{(r)}(x_1 - x_2)\,\theta^{(s)}(x_1 + x_2). \tag{6.18}
\]
Here $\theta = \theta(z|\tau)$, as before, denotes the odd Jacobi theta function, and the upper index in brackets refers to taking derivatives in $z$. We treat a scalar as a multiple of $A^0 B^0 C^0 D^0$ assuming, as usual, that $f^{(0)} = f$. To illustrate this, we present below the first few coefficients:
\[
\begin{aligned}
b_{22} ={}& \theta(x_1 + a_1)\,\theta(x_2 + a_2)\,\theta(x_1 - x_2)\,\theta(x_1 + x_2), \\
b_{21} ={}& 2\theta'(x_1 + a_1)\,\theta(x_2 + a_2)\,\theta(x_1 - x_2)\,\theta(x_1 + x_2) \\
& - 2\theta(x_1 + a_1)\,\theta'(x_2 + a_2)\,\theta(x_1 - x_2)\,\theta(x_1 + x_2) \\
& - 2\theta(x_1 + a_1)\,\theta(x_2 + a_2)\,\theta'(x_1 - x_2)\,\theta(x_1 + x_2), \\
b_{20} ={}& -\lambda\,\theta(x_1 + a_1)\,\theta(x_2 + a_2)\,\theta(x_1 - x_2)\,\theta(x_1 + x_2) + \cdots.
\end{aligned} \tag{6.19}
\]

Proposition 6.2. For any $K_1, K_2$ and generic $\gamma_1, \gamma_2$ there is a unique (up to a constant factor) function $\Phi(x)$ of the form (6.6) with the properties (6.10)–(6.13). It is described by the formulas (6.15)–(6.18) above.

To prove the formula (6.15), one goes to the limit $\omega \to 0$ in the formula (7.7) from the next section, picking up the first nonzero term (of order 6 in $\omega$).

Formulas (6.15)–(6.18) fix the dependence of $\Phi(x)$ on the 4 parameters $a_1, a_2, k_1, k_2$. Let us consider now the translation properties of $\Phi$ regarded as a function of these parameters. Recall that for generic $a_1, a_2$ the function $\Phi$ was determined uniquely up to a factor by (6.6) and (6.12)–(6.13). Thus, it follows immediately from Remark 6.1 that under the shifts (6.8)–(6.9) $\Phi$ must remain the same, up to a factor independent of $x$. To find this factor, it is sufficient to look at the formula (6.19) for the leading coefficient $b_{22}$. As a result, we conclude that the function $\Phi$, given by formulas (6.6), (6.15)–(6.18), is invariant with respect to the shifts (6.8)–(6.9):
\[
\Phi(a_1 + 1, a_2, k_1, k_2) = \Phi(a_1, a_2 + 1, k_1, k_2) = \Phi(a_1, a_2, k_1, k_2), \tag{6.20}
\]
\[
\Phi(a_1 + \tau, a_2, k_1 + 2\pi i, k_2) = \Phi(a_1, a_2 + \tau, k_1, k_2 + 2\pi i) = \Phi(a_1, a_2, k_1, k_2). \tag{6.21}
\]
Now let us find the interrelations between the parameters $a_j$, $k_j$ which will guarantee the two remaining vanishing conditions (6.10)–(6.11). Let $G_1$ and $G_2$ denote the derivatives $\partial_1\partial_2\Phi$ and $\partial_1\partial_2^3\Phi$ evaluated at $x_1 = x_2 = 0$:
\[
G_1 = \frac{\partial^2\Phi}{\partial x_1\,\partial x_2}(0, 0), \qquad G_2 = \frac{\partial^4\Phi}{\partial x_1\,\partial x_2^3}(0, 0). \tag{6.22}
\]
Thus defined, $G_1$, $G_2$ will be regarded as functions of the parameters $a_1, a_2, k_1, k_2$. Consider now the following two equations on these 4 parameters:
\[
G_1 = 0, \qquad G_2 = 0. \tag{6.23}
\]
It is clear that conditions (6.10) imply both Eqs. (6.23). Indeed, they guarantee that the function
\[
f(t) = \partial_1\Phi(0, t) \tag{6.24}
\]
is identically zero, in particular, $f'(0) = G_1 = 0$ and $f'''(0) = G_2 = 0$. More interestingly, (6.23) are "almost" equivalent to (6.10). To see this, note that the function (6.24) has the following translation properties in $t$:
\[
f(t + 1) = e^{k_2} f(t), \qquad f(t + \tau) = e^{k_2\tau - 2\pi i(3t + a_2) - 3\pi i\tau} f(t).
\]
From (6.12)–(6.13) we know that $(\partial_1 \pm \partial_2)\Phi(0, 0) = 0$. This gives that $f(0) = \partial_1\Phi(0, 0) = 0$, while the first equation in (6.23) gives that $f'(0) = 0$. Together with the translation properties above this implies that $f$ is proportional to the following theta function:
\[
e^{k_2 t}\,\theta(t + a_2|\tau)\,[\theta(t|\tau)]^2.
\]

This function has nonzero third derivative at t = 0 as soon as k2 θ (a2 |τ ) + θ (a2 |τ ) = 0.

(6.25)

Thus, the conditions $f(0) = f'(0) = f'''(0) = 0$ imply that $f$ is identically zero provided that (6.25) is true. The outcome is the following: for all $a_1, a_2, k_1, k_2$ satisfying the condition (6.25), Eqs. (6.23) imply (6.10).

Our next remark is that for any $a_1, a_2, k_1, k_2$ we have the relation $\partial_1\partial_2^3\Phi(0,0) = \partial_1^3\partial_2\Phi(0,0)$. This is due to the identity

\[ 4\partial_1^3\partial_2 - 4\partial_1\partial_2^3 = (\partial_1 + \partial_2)^3(\partial_1 - \partial_2) - (\partial_1 - \partial_2)^3(\partial_1 + \partial_2), \]

where the right-hand side, applied to $\Phi$, vanishes at $x_1 = x_2 = 0$ due to (6.12)–(6.13). Thus, repeating the same arguments as above, we conclude that Eqs. (6.23) imply also (6.11), as soon as

\[ k_1\,\theta(a_1|\tau) + \theta'(a_1|\tau) \neq 0. \tag{6.26} \]
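Because $\partial_1$ and $\partial_2$ commute, the operator identity used above is just a polynomial identity in two commuting symbols. The following minimal sketch checks it on a grid of integers; both sides have degree at most 4 in each variable, so agreement on a 5 × 5 grid already forces equality. The function names are illustrative.

```python
from itertools import product

def lhs(a, b):
    # 4*d1^3*d2 - 4*d1*d2^3 with d1 -> a, d2 -> b
    return 4 * a**3 * b - 4 * a * b**3

def rhs(a, b):
    # (d1 + d2)^3 (d1 - d2) - (d1 - d2)^3 (d1 + d2)
    return (a + b)**3 * (a - b) - (a - b)**3 * (a + b)

def identity_holds(grid=range(-2, 3)):
    """Compare both sides on a 5 x 5 integer grid."""
    return all(lhs(a, b) == rhs(a, b) for a, b in product(grid, repeat=2))

print(identity_holds())  # prints: True
```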

As a result, we see that under assumptions (6.25)–(6.26) both vanishing conditions (6.10)–(6.11) are equivalent to the single system (6.23). To remove the restrictions (6.25)–(6.26), let us consider Eqs. (6.23) in more detail.

First notice that the functions $G_1, G_2$ share the same translation properties (6.20)–(6.21) in $(a_1, a_2, k_1, k_2)$ with $\Phi$ (since differentiating in $x$ doesn't affect these properties). Notice also that $G_1, G_2$ are polynomials in $k_1, k_2$. Let us pass from $k_1, k_2$ to other variables $p_1, p_2$ as follows:

\[ p_1 = k_1 + \zeta(a_1), \qquad p_2 = k_2 + \zeta(a_2), \tag{6.27} \]

where we used $\zeta = \zeta(z|\tau)$ to denote the logarithmic derivative of $\theta(z)$:

\[ \zeta(z) = \frac{\theta'(z|\tau)}{\theta(z|\tau)}. \]

This is slightly different from the Weierstrass $\zeta$-function. Clearly, $G_1, G_2$ are still polynomials in $p_1, p_2$ with coefficients depending on $a_1, a_2$. The translation properties (6.20)–(6.21) imply that the coefficients in these polynomials will be elliptic functions of $a_1$ and $a_2$. In fact, one can write down $G_1, G_2$ quite explicitly. The following result follows from our calculations for the difference case from the next section.

Proposition 6.3. The system $G_1 = G_2 = 0$ is equivalent to the following system:

\begin{align*}
p_1\bigl(p_2^3 + 3\zeta_2' p_2 + \zeta_2''\bigr) &= p_2\bigl(p_1^3 + 3\zeta_1' p_1 + \zeta_1''\bigr), \\
p_1\bigl(p_2^5 + 10\zeta_2' p_2^3 + 10\zeta_2'' p_2^2 + (5\zeta_2''' + 15(\zeta_2')^2)p_2 + \zeta_2'''' + 10\zeta_2'\zeta_2''\bigr) &= p_2\bigl(p_1^5 + 10\zeta_1' p_1^3 + 10\zeta_1'' p_1^2 + (5\zeta_1''' + 15(\zeta_1')^2)p_1 + \zeta_1'''' + 10\zeta_1'\zeta_1''\bigr),
\tag{6.28}
\end{align*}

where $\zeta_1, \zeta_2$ stand for $\zeta(a_1)$ and $\zeta(a_2)$, while the prime denotes the derivative with respect to the corresponding variable. (Notice that $\wp(z) = -\zeta' + \mathrm{const}$, so $\zeta'' = -\wp'$ and so on.)

Generalized Lam´e Operators


From the discussion below it will follow that for generic $a_1, a_2$ the system (6.28) has a finite number of solutions $(p_1, p_2)$. Thus, we can think of (6.28) as a finite covering of the product $E \times E$ of two copies of an elliptic curve $E = E_\tau = \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$. In fact, the only $(a_1, a_2)$ where the fiber is infinite are those with $a_1 = \pm a_2 \pmod{1, \tau}$. This corresponds to the following "vertical" components of (6.28):

\[ \{a_1 = a_2,\ p_1 = p_2\}, \qquad \{a_1 = -a_2,\ p_1 = -p_2\}. \tag{6.29} \]

Another "trivial" component is, obviously,

\[ \{p_1 = p_2 = 0\}. \tag{6.30} \]

If we delete these three components from (6.28), the remaining part will be, in fact, a 13-fold covering of $\dot{E} \times \dot{E}$. Since we deleted the component (6.30), the conditions (6.25)–(6.26) and, hence, (6.10)–(6.11) are satisfied on the remaining part. Thus, we arrive at the following theorem.

Theorem 6.4. Let $C$ be the finite covering of the product of two (punctured) elliptic curves which is obtained from (6.28) by deleting the components (6.29) and (6.30). Then $C$ is the Hermite–Bloch variety for the operator (6.1), and a double-Bloch solution $\psi(x)$ corresponding to a point $(a_1, a_2, p_1, p_2)$ in $C$ is given by the formulas (6.5), (6.15)–(6.18) and (6.27).

We still have to explain why $C$ is a 13-fold covering. To this end let us consider a family of plane rational curves $\varphi : \mathbb{P}^1 \to \mathbb{P}^2$ of degree 5, depending on a parameter $a \in E = \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$ and defined as follows: if $a \in E$ and $u = (u_0 : u_1) \in \mathbb{P}^1$ then $\varphi(a, u) = (\varphi_0 : \varphi_1 : \varphi_2)$, where

\begin{align*}
\varphi_0 &= u_1 u_0^4, \\
\varphi_1 &= u_1^3 u_0^2 + 3\zeta'(a)\,u_1 u_0^4 + \zeta''(a)\,u_0^5, \\
\varphi_2 &= u_1^5 + 10\zeta'(a)\,u_1^3 u_0^2 + 10\zeta''(a)\,u_1^2 u_0^3 + \bigl(5\zeta'''(a) + 15(\zeta'(a))^2\bigr)u_1 u_0^4 + \bigl(\zeta''''(a) + 10\zeta'(a)\zeta''(a)\bigr)u_0^5.
\end{align*}

Then the solutions $(p_1, p_2)$ of (6.28) correspond to the intersection points of two curves $C_1 = \varphi(a_1, \cdot)$, $C_2 = \varphi(a_2, \cdot)$ from our family. Namely, if $\varphi(a_1, u) = \varphi(a_2, v)$ then $p_1 = u_1/u_0$ and $p_2 = v_1/v_0$ obviously satisfy (6.28), and vice versa, provided $p_1, p_2 \neq 0$. We should, however, exclude from consideration points with $p_1, p_2 = \infty$. Namely, all the curves from our family pass through $(0 : 0 : 1) = \varphi(a, \infty)$. It is a standard exercise in basic algebraic geometry to show that $\operatorname{mult}(C_1 \cap C_2)$ at this point equals 12. In doing this local analysis, one immediately observes that the condition $\zeta'(a_1) = \zeta'(a_2)$ is necessary for $C_1$ and $C_2$ to coincide near the point $(0 : 0 : 1)$. This proves that the covering (6.28) is finite apart from the components (6.29). After that, Bézout's theorem tells us that the number of common points, apart from $(0 : 0 : 1)$, equals $5 \times 5 - 12 = 13$.

Remark 6.5. Operator (6.1) is symmetric under the Weyl group $W$ of $B_2$, whose 8 elements act by permuting coordinates and/or changing their signs. This induces the action of $W$ on Bloch solutions and, therefore, on the Hermite–Bloch variety (6.28). This action, in terms of $(a_1, a_2, p_1, p_2)$, is generated by the involutions $(a_1, a_2, p_1, p_2) \to (a_2, a_1, p_2, p_1)$ and $(a_1, a_2, p_1, p_2) \to (-a_1, a_2, -p_1, p_2)$.
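As a quick illustration of Remark 6.5, the sketch below generates the orbit of a sample point under the two involutions and counts its elements. The sample point and the helper names are arbitrary choices made here; for a point with no special symmetry the orbit should have as many elements as $W$.

```python
def swap(v):
    """(a1, a2, p1, p2) -> (a2, a1, p2, p1)."""
    a1, a2, p1, p2 = v
    return (a2, a1, p2, p1)

def flip(v):
    """(a1, a2, p1, p2) -> (-a1, a2, -p1, p2)."""
    a1, a2, p1, p2 = v
    return (-a1, a2, -p1, p2)

def orbit(v, generators=(swap, flip)):
    """All images of v under the group generated by the given involutions."""
    seen = {v}
    frontier = [v]
    while frontier:
        w = frontier.pop()
        for g in generators:
            image = g(w)
            if image not in seen:
                seen.add(image)
                frontier.append(image)
    return seen

print(len(orbit((1, 2, 3, 5))))  # prints: 8
```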


6.1. Algebraic integrability. The operator $L$ (6.1) is completely integrable. According to Theorem 4.1, it has a commuting operator of order four, $L_1 = \partial_1^2\partial_2^2 + \dots$. Using Proposition 5.1, we see that our $\psi$ is a common eigenfunction for $L, L_1$:

\[ L\psi = E\psi, \qquad L_1\psi = E_1\psi, \qquad L_2\psi = E_2\psi. \tag{6.31} \]

Here $E, E_1$ are some functions of the parameters $a_1, a_2, k_1, k_2$ which, in principle, can be calculated explicitly (though we didn't have enough energy to perform such a calculation). In [44] it was shown that apart from $L_1$, there is another operator $L_2 = \partial_1^5 - 5\partial_1^3\partial_2^2 + \dots$ which commutes with $L, L_1$ (see [44] for the explicit expression for $L_1, L_2$). The existence of a fifth order quantum integral $L_2$ means in this case that the Schrödinger operator $L$ is algebraically integrable. To see this directly, let us consider one more operator $L_3$, obtained from $L_2$ by interchanging $x_1$ and $x_2$. Then one easily checks that the common eigenspace (6.31) of $L$ and $L_1$ is 8-dimensional, and for generic $E$ and $E_1$ it is spanned by the double-Bloch solutions $\psi$ constructed previously. On the other hand, by Proposition 5.1, each $\psi$ will be an eigenfunction of $L_2$ and $L_3$ as well. So, the only thing to check is that the eigenvalues $E_2, E_3$ separate all 8 solutions of the system (6.31). It is enough to check this in the trigonometric limit $\tau \to +i\infty$, which is not difficult.

Remark 6.6. We do not give the precise relation between the Hermite–Bloch variety and the spectral surface. To find such a relation, a careful analysis of the structure of the divisor at infinity is needed. Let us remark that in [45] two algebraic relations between the 4 operators $L, \dots, L_3$ were calculated explicitly. Thus, they determine a 2-dimensional affine algebraic variety in $\mathbb{C}^4$. However, it is not isomorphic to the spectral surface as an affine variety (though they are birationally equivalent). This can be seen already in the trigonometric limit $\tau \to +i\infty$, by using the information about the spectral variety from [10]. Namely, the results of [10] imply that these four operators $L, \dots, L_3$ are not enough to generate the whole commutative ring (which is isomorphic to the coordinate ring of the spectral surface).

6.2. Spectrum of L. Throughout this subsection we assume that the parameter $\tau$ is pure imaginary, so the potential $u(x)$ of the Schrödinger operator (6.1) is real-valued for $x \in \mathbb{R}^2$. Its singularities lie along the family of lines

\[ x_1 \in \mathbb{Z}, \qquad x_2 \in \mathbb{Z}, \qquad x_1 + x_2 \in \mathbb{Z}, \qquad x_1 - x_2 \in \mathbb{Z}. \tag{6.32} \]

These lines cut $\mathbb{R}^2$ into triangles, and the spectral problem for $L$ splits into separate spectral problems for each triangle. Let $\psi(x) = \psi(x; a_1, a_2, p_1, p_2)$ be a double-Bloch eigenfunction for $L$ which corresponds to a point $(a_1, a_2, p_1, p_2)$ of the surface (6.28) in accordance with the formulas (6.5), (6.15)–(6.18) and (6.27). Given such a $\psi$, let us symmetrize it in the following way:

\[ \Psi(x) = \sum_{w \in W} \psi(wx), \tag{6.33} \]

where $W$ denotes the Weyl group for the system $B_2$. To get a square-integrable eigenfunction, according to Proposition 5.6, one takes $k \in 2\pi i P$, that is, $k = (i\pi m, i\pi n)$ with integers $m, n$ having the same parity. For such $k$ the substitution of (6.27) into (6.28) leads to a system of (transcendental) equations on $a_1, a_2$, and by solving it one eventually finds the corresponding eigenfunctions. Note that because of the invariance of the system (6.28) (and of $\psi$) under the shifts (6.8)–(6.9), it is enough to look for the solutions $(a_1, a_2)$ with $a_i$ lying inside the fundamental parallelogram with the vertices $\pm(1 + \tau)/2$. Also, taking into account the Weyl group action, we can restrict ourselves to the dominant weights, i.e. $0 \le m \le n$.

Now let us consider the trigonometric limit $\tau \to +i\infty$. One can show that for any $m, n$ the corresponding system (6.28) will have a unique solution $a_1, a_2$ inside the fundamental parallelogram (at least, for sufficiently large $\operatorname{Im}\tau$). Moreover, if one fixes $k = (k_1, k_2)$ and then takes the limit $\tau \to +i\infty$, then $\psi$ will go to the Baker–Akhiezer function $\psi(k, x)$ considered in [42] (this can be seen directly from the formulas for $\psi$). Now, for the Baker–Akhiezer function $\psi(k, x)$ it is known (see Theorem 6.7 of [42]) that the formula (6.33) will produce all the Jack polynomials if

\[ k = 2\pi i(\lambda + \rho) \quad\text{with } \lambda \in P_+ \text{ and } \rho = \frac{1}{2}\sum_{\alpha \in R_+}(m_\alpha + 1)\,\alpha. \tag{6.34} \]

For other $k \in 2\pi i P_+$ the symmetrized $\Psi$ will be zero. In our situation this means that the function $\Psi = \Psi_{m,n}$ defined by (6.33) will be non-zero as soon as

\[ n - 4 \ge m \ge 2, \quad\text{with } n \equiv m \pmod 2. \tag{6.35} \]

For other $k \in 2\pi i P_+$ it will be zero in the trigonometric limit. But the result of [46] cited in Sect. 5.1 claims that the family of the eigenfunctions of $L$ is analytic in $p = e^{\pi i\tau}$ and specializes to the Jack polynomials at $p = 0$. Hence, if $\Psi_{m,n} = 0$ in the trigonometric limit, it must be zero for all $\tau$ identically. We conclude that the eigenfunctions $\Psi_{m,n}$ of $L$ are labeled by $m, n$ satisfying (6.35). In particular, the ground state corresponds to $(m, n) = (2, 6)$. The constructed solutions $\Psi_{m,n}$ will have second order zeros along the lines (6.32) and will be invariant under orthogonal reflections with respect to these lines. According to the results of [46], the resulting family is complete in $L^2(T)^W$ (see Sect. 5.1 above).

Remark 6.7. For certain $(a_1, a_2, k_1, k_2)$ the corresponding Bloch solution $\psi$ has a nontrivial symmetry in $x$, which must be a subgroup of the Weyl group $W$. Our formulas for $\psi$ do not work directly for some of these cases, because of the presence of an extra component (6.29). Of particular interest are those points which correspond to solutions that are double-(anti)periodic (in each of the variables $x_1, x_2$). These can be viewed as multidimensional analogues of the classical Lamé polynomials [43].

7. Difference B2 Case

In this section we will generalize the results above to the following difference version of the operator (6.1):

\[ L = a_0 + a_+ T_1^{2\omega} + a_- T_1^{-2\omega} + b_+ T_2^{2\omega} + b_- T_2^{-2\omega}, \tag{7.1} \]

where $T_i^{\varepsilon}$ stands for a shift in $x_i$ by $\varepsilon$, and the coefficients $a_\pm, b_\pm$ are

\begin{align*}
a_\pm &= \frac{\theta(x_1 \mp \omega)\,\theta(x_1 + x_2 \mp 2\omega)\,\theta(x_1 - x_2 \mp 2\omega)}{\theta(x_1 \pm \omega)\,\theta(x_1 + x_2)\,\theta(x_1 - x_2)}, \\
b_\pm &= \frac{\theta(x_2 \mp \omega)\,\theta(x_1 + x_2 \mp 2\omega)\,\theta(x_1 - x_2 \pm 2\omega)}{\theta(x_2 \pm \omega)\,\theta(x_1 + x_2)\,\theta(x_1 - x_2)},
\end{align*}

while $a_0$ has the form $a_0 = c_+ + c_- + d_+ + d_-$ with $c_\pm, d_\pm$ given by

\begin{align*}
c_\pm &= \frac{\theta(2\omega)\,\theta(x_1 \pm 5\omega)\,\theta(x_1 + x_2 \mp 2\omega)\,\theta(x_1 - x_2 \mp 2\omega)}{\theta(4\omega)\,\theta(x_1 \pm \omega)\,\theta(x_1 + x_2)\,\theta(x_1 - x_2)}, \\
d_\pm &= \frac{\theta(2\omega)\,\theta(x_2 \pm 5\omega)\,\theta(x_1 + x_2 \mp 2\omega)\,\theta(x_1 - x_2 \pm 2\omega)}{\theta(4\omega)\,\theta(x_2 \pm \omega)\,\theta(x_1 + x_2)\,\theta(x_1 - x_2)}.
\end{align*}

In all formulas $\theta(z) = \theta(z|\tau)$ is the odd Jacobi theta function (5.4). This is a very special case of the so-called $BC_n$ generalization of the quantum Ruijsenaars model [47]. In the trigonometric case it was introduced by Koornwinder [48]. The elliptic version was first suggested by van Diejen [49] and later extended in [50], where its complete integrability was proven. This also can be viewed as an elliptic generalization of one of the Macdonald operators [51] for $B_2$. In what follows we assume that the parameter $\omega$ is generic. Note that the operator (6.1) can be restored (up to a certain gauge) in the limit $\omega \to 0$.

It is worth mentioning that the coefficients of $L$ are not periodic, so instead of (double-)Bloch solutions one should look for the eigenfunctions in a certain $\theta$-functional space. Another way of putting it is to observe that $L$ can be reduced to elliptic form using a proper gauge. For instance, consider

\[ \widetilde{L} = \delta^{-1} \circ L \circ \delta, \qquad \delta = \theta(x_1)\,\theta(x_2)\,\theta(x_1 + x_2)\,\theta(x_1 - x_2). \]

Then $\widetilde{L}$ will have elliptic coefficients, so we can look for its Bloch solutions $\psi(x)$. Correspondingly, $\Phi = \delta\psi$ will be an eigenfunction for $L$ and it will have translation properties similar to those of $\delta$. Abusing the language, below we refer to $\Phi$ as a Bloch solution for $L$.

7.1. Bloch solutions. We are going to construct eigenfunctions of $L$ similar to the differential case above. Our ansatz for $\Phi$ remains unchanged:

\[ \Phi = \exp(K_1 x_1 + K_2 x_2) \sum_{0 \le i,j \le 2} c_{ij}\, \theta\begin{bmatrix} i/3 \\ 0 \end{bmatrix}(3x_1 + \gamma_1 \,|\, 3\tau)\; \theta\begin{bmatrix} j/3 \\ 0 \end{bmatrix}(3x_2 + \gamma_2 \,|\, 3\tau). \tag{7.2} \]

An analogue of the vanishing conditions (6.10)–(6.13) is dictated by the singularities of $L$ and is the following:

\begin{align*}
\Phi(\omega, x_2) &\equiv \Phi(-\omega, x_2) && \text{for all } x_2, \tag{7.3} \\
\Phi(x_1, \omega) &\equiv \Phi(x_1, -\omega) && \text{for all } x_1, \tag{7.4} \\
\Phi(x_1 + \omega, x_2 + \omega) &\equiv \Phi(x_1 - \omega, x_2 - \omega) && \text{for } x_1 + x_2 = 0, \tag{7.5} \\
\Phi(x_1 + \omega, x_2 - \omega) &\equiv \Phi(x_1 - \omega, x_2 + \omega) && \text{for } x_1 - x_2 = 0. \tag{7.6}
\end{align*}


We are going to show that for a certain two-dimensional variety in the space of parameters $\gamma_1, \gamma_2, K_1, K_2$ there is only one (up to a factor) such $\Phi$. As a consequence, $\Phi$ will be an eigenfunction of $L$, due to a natural analogue of Proposition 5.1.

We start from conditions (7.5), (7.6). Using the formula (6.4) and repeating the arguments used in the case $\omega = 0$, we obtain a linear system for $c_{ij}$ and can see that (for generic $\gamma_1, \gamma_2$) it defines $c_{ij}$ uniquely, up to a factor. However, solving this system leads to a very cumbersome formula. Instead, let us define $\Phi$ by the following formula:

\[ \Phi = \exp(k_1 x_1 + k_2 x_2) \sum_{i,j} b_{ij}\, \theta(x_1 + a_1 + i\omega)\, \theta(x_2 + a_2 + j\omega)\, e^{\omega(i k_1 + j k_2)}, \tag{7.7} \]

where the summation is taken over the following set of indices:

\[ (i, j) = (0, 4),\ (4, 0),\ (0, -4),\ (-4, 0),\ (2, 2),\ (2, -2),\ (-2, -2),\ (-2, 2),\ (0, 0), \]

and the coefficients $b_{ij} = b_{ij}(x)$ look as follows:

\[ b_{ij} = \beta_{ij}\, \theta\Bigl(x_1 + x_2 - \frac{i + j}{2}\,\omega\Bigr)\, \theta\Bigl(x_1 - x_2 - \frac{i - j}{2}\,\omega\Bigr) \]

with

\begin{align*}
\beta_{04} = \beta_{40} = \beta_{0,-4} = \beta_{-4,0} &= (\theta(2\omega))^2, \\
\beta_{22} = \beta_{2,-2} = \beta_{-2,-2} = \beta_{-2,2} &= -\theta(2\omega)\,\theta(4\omega), \\
\beta_{00} &= (\theta(4\omega))^2.
\end{align*}

Proposition 7.1. For any $K_1, K_2$ and generic $\gamma_1, \gamma_2$ there exists a unique (up to a factor) function $\Phi$ of the form (7.2) satisfying conditions (7.5)–(7.6). It is given by the formula (7.7), where $k_j = K_j + \pi i$ and $a_j = \gamma_j - (1 + \tau)/2$ $(j = 1, 2)$.

To prove the proposition one first checks that this $\Phi$ has the needed translation properties in $x_1, x_2$; then an elementary check shows that conditions (7.5)–(7.6) are satisfied.

Let us turn now to conditions (7.3)–(7.4). First, let us remark that (7.5), (7.6) at $x = (0, 0)$ imply that

\[ \Phi(\omega, \omega) - \Phi(-\omega, -\omega) = \Phi(\omega, -\omega) - \Phi(-\omega, \omega) = 0. \tag{7.8} \]

Introduce now

\begin{align*}
G_1 &= \Phi(\omega, \omega) - \Phi(\omega, -\omega) - \Phi(-\omega, \omega) + \Phi(-\omega, -\omega), \\
G_2 &= \Phi(\omega, -3\omega) - \Phi(-\omega, -3\omega) - \Phi(\omega, 3\omega) + \Phi(-\omega, 3\omega),
\end{align*}

and consider the system

\[ G_1 = 0, \qquad G_2 = 0. \tag{7.9} \]

Obviously, the vanishing condition (7.3) implies (7.9). Conversely, from the system (7.9) we deduce immediately that the function $f(t) = \Phi(\omega, t) - \Phi(-\omega, t)$ satisfies the conditions $f(\omega) = f(-\omega)$ and $f(3\omega) = f(-3\omega)$. Together with (7.8) this gives that $f(\omega) = f(-\omega) = 0$. Since $f$ is a theta function of order 3, it must have the form $f = \theta(t - \omega)\,\theta(t + \omega)\,g(t)$ with $g(3\omega) = g(-3\omega)$. Now, since $g$ is a theta function of the first order with known characteristics (expressed in terms of $a_2, k_2$), the condition $g(3\omega) = g(-3\omega)$ implies $g \equiv 0$ as soon as

\[ e^{6\omega k_2} \neq \frac{\theta(a_2 - 3\omega|\tau)}{\theta(a_2 + 3\omega|\tau)}. \tag{7.10} \]

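The latter condition is a difference version of (6.25). Now let us collect some corollaries of (7.5)–(7.6):

\begin{align*}
\Phi(\omega, -3\omega) &= \Phi(3\omega, -\omega), & \Phi(-\omega, -3\omega) &= \Phi(-3\omega, -\omega), \\
\Phi(\omega, 3\omega) &= \Phi(3\omega, \omega), & \Phi(-\omega, 3\omega) &= \Phi(-3\omega, \omega).
\end{align*}

Thus, $G_2$ can be rewritten as $G_2 = \Phi(3\omega, -\omega) - \Phi(-3\omega, -\omega) - \Phi(3\omega, \omega) + \Phi(-3\omega, \omega)$. Hence, we can repeat the same arguments with respect to the $x_2$-variable and conclude that the system (7.9) implies (7.4) as soon as

\[ e^{6\omega k_1} \neq \frac{\theta(a_1 - 3\omega|\tau)}{\theta(a_1 + 3\omega|\tau)}. \tag{7.11} \]

Summing up, we see that the system (7.9) implies both of the conditions (7.3)–(7.4) under assumptions (7.10)–(7.11). To get rid of the restrictions (7.10)–(7.11) let us calculate $G_1, G_2$. First, introduce the notation $\xi_1 = e^{\omega k_1}$, $\xi_2 = e^{\omega k_2}$. A direct substitution gives that

\begin{align*}
\Phi(\omega, \omega) ={}& \beta_{0,-4}\,\theta(4\omega)\theta(-2\omega)\,\theta(a_1 + \omega)\theta(a_2 - 3\omega)\,\xi_1 \xi_2^{-3} \\
&+ \beta_{-4,0}\,\theta(4\omega)\theta(2\omega)\,\theta(a_1 - 3\omega)\theta(a_2 + \omega)\,\xi_1^{-3} \xi_2 \\
&+ \beta_{-2,2}\,\theta(2\omega)\theta(2\omega)\,\theta(a_1 - \omega)\theta(a_2 + 3\omega)\,\xi_1^{-1} \xi_2^{3} \\
&+ \beta_{2,-2}\,\theta(2\omega)\theta(-2\omega)\,\theta(a_1 + 3\omega)\theta(a_2 - \omega)\,\xi_1^{3} \xi_2^{-1}.
\end{align*}

In a similar way we calculate $\Phi(-\omega, \omega)$ and $G_1 = 2\Phi(\omega, \omega) - 2\Phi(-\omega, \omega)$. As a result, the equation $G_1 = 0$ takes the following form (up to a nonessential factor):

\begin{align*}
&\bigl(\theta(a_1 + 3\omega)\xi_1^3 - \theta(a_1 - 3\omega)\xi_1^{-3}\bigr)\bigl(\theta(a_2 + \omega)\xi_2 - \theta(a_2 - \omega)\xi_2^{-1}\bigr) \\
&\qquad = \bigl(\theta(a_1 + \omega)\xi_1 - \theta(a_1 - \omega)\xi_1^{-1}\bigr)\bigl(\theta(a_2 + 3\omega)\xi_2^3 - \theta(a_2 - 3\omega)\xi_2^{-3}\bigr). \tag{7.12}
\end{align*}

In a similar way one can compute $G_2$. It turns out that it is a linear combination of 8 terms of the form $\xi_1^{\pm 3}\xi_2^{\pm 5}$, $\xi_1^{\pm 5}\xi_2^{\pm 3}$ and 8 terms of the form $\xi_1^{\pm 1}\xi_2^{\pm 3}$, $\xi_1^{\pm 3}\xi_2^{\pm 1}$. Moreover, the combination of the last 8 terms is proportional to $G_1$, so we can get rid of them by subtracting a multiple of $G_1$; this does not affect the system (7.9). After these transformations the equation $G_2 = 0$ takes the following nice form:

\begin{align*}
&\bigl(\theta(a_1 + 5\omega)\xi_1^5 - \theta(a_1 - 5\omega)\xi_1^{-5}\bigr)\bigl(\theta(a_2 + 3\omega)\xi_2^3 - \theta(a_2 - 3\omega)\xi_2^{-3}\bigr) \\
&\qquad = \bigl(\theta(a_1 + 3\omega)\xi_1^3 - \theta(a_1 - 3\omega)\xi_1^{-3}\bigr)\bigl(\theta(a_2 + 5\omega)\xi_2^5 - \theta(a_2 - 5\omega)\xi_2^{-5}\bigr). \tag{7.13}
\end{align*}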


Summarizing, we see that the system (7.9) is equivalent to Eqs. (7.12)–(7.13). These equations are obviously invariant under the transformations (6.8)–(6.9); in this way they define a covering over the product $E \times E$ of two elliptic curves $E = \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$. It has two "vertical" components

\[ \{a_1 = a_2,\ \xi_1 = \xi_2\}, \qquad \{a_1 = -a_2,\ \xi_1 = (\xi_2)^{-1}\}. \tag{7.14} \]

Another "trivial" component is given by

\[ \theta(a_1 + 3\omega)\xi_1^3 - \theta(a_1 - 3\omega)\xi_1^{-3} = \theta(a_2 + 3\omega)\xi_2^3 - \theta(a_2 - 3\omega)\xi_2^{-3} = 0. \tag{7.15} \]

After deleting these three components, one gets a finite (in fact, 17-fold) covering of $E \times E$; let us denote it by $C$. We can conclude now that for any point in $C$ the corresponding function $\Phi$ will satisfy the vanishing conditions (7.3)–(7.6) and, hence, it will be an eigenfunction of the difference operator $L$.

Theorem 7.2. Formulas (7.7) and Eqs. (7.12)–(7.13) describe the (double-)Bloch eigenfunctions of the difference operator (7.1). The Bloch–Hermite variety $C$, which is obtained by deleting components (7.14)–(7.15) from the variety (7.12)–(7.13), is a 17-fold covering of the product $E \times E$ of two elliptic curves.

As a corollary, considering the limit $\omega \to 0$ we can calculate explicitly the Hermite–Bloch variety for the operator (6.1). Namely, one picks up the terms of order 4 and 6 in $\omega$ in Eq. (7.12).

Corollary 7.3. In the limit $\omega \to 0$ the system (7.12)–(7.13) goes to (6.28).

The only thing we still have to explain is why the degree of the covering is 17. To this end, let us define a family of plane rational curves $\varphi : \mathbb{P}^1 \to \mathbb{P}^2$ of degree 5, depending on a parameter $a \in E$. Namely, for $a \in E$ and $u = (u_0 : u_1) \in \mathbb{P}^1$ put $\varphi(a, u) = (\varphi_0 : \varphi_1 : \varphi_2)$, where

\begin{align*}
\varphi_0 &= u_0^2 u_1^3\, \frac{\theta(a + \omega)\,\theta(a - \omega/2)}{\theta(a + \omega/2)} - u_0^3 u_1^2\, \frac{\theta(a - \omega)\,\theta(a + \omega/2)}{\theta(a - \omega/2)}, \\
\varphi_1 &= u_0 u_1^4\, \frac{\theta(a + 3\omega)\,\theta^3(a - \omega/2)}{\theta^3(a + \omega/2)} - u_0^4 u_1\, \frac{\theta(a - 3\omega)\,\theta^3(a + \omega/2)}{\theta^3(a - \omega/2)}, \\
\varphi_2 &= u_1^5\, \frac{\theta(a + 5\omega)\,\theta^5(a - \omega/2)}{\theta^5(a + \omega/2)} - u_0^5\, \frac{\theta(a - 5\omega)\,\theta^5(a + \omega/2)}{\theta^5(a - \omega/2)}.
\end{align*}

Then the solutions $(\xi_1, \xi_2)$ of (7.12)–(7.13) correspond to the intersection points of two curves $C_1 = \varphi(a_1, \cdot)$, $C_2 = \varphi(a_2, \cdot)$ from our family. Namely, if $\varphi(a_1, u) = \varphi(a_2, v)$ then $\xi_1, \xi_2$ with

\[ (\xi_1)^2 = \frac{u_1\,\theta^2(a_1 - \omega/2)}{u_0\,\theta^2(a_1 + \omega/2)} \quad\text{and}\quad (\xi_2)^2 = \frac{v_1\,\theta^2(a_2 - \omega/2)}{v_0\,\theta^2(a_2 + \omega/2)} \]

clearly satisfy (7.12)–(7.13), and vice versa, provided (7.10)–(7.11). We should, however, exclude from consideration points with $\xi_1, \xi_2 = 0, \infty$ since $\xi_i = e^{\omega k_i}$. Namely, all the curves from our family pass through $(0 : 0 : 1) = \varphi(a, 0) = \varphi(a, \infty)$. A small difference from the $\omega = 0$ case is that now we have $\operatorname{mult}(C_1 \cap C_2) = 8$ at this point. By Bézout's theorem, the number of intersection points of $C_1, C_2$, apart from $(0 : 0 : 1)$, equals $5 \times 5 - 8 = 17$.
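The statement of Corollary 7.3 about the terms of order 4 and 6 in (7.12) can be sanity-checked with exact rational arithmetic. The sketch below rests on one assumption made here: each factor of (7.12) is written as $\theta(a)\,[h(n\omega) - h(-n\omega)]$ with $\log h(t) = p\,t + \zeta' t^2/2! + \zeta'' t^3/3! + \dots$, where $p = k + \zeta(a)$ as in (6.27). The sample values for $p_j$ and the derivatives of $\zeta$ are arbitrary rationals, not values of an actual theta function, and all function names are illustrative.

```python
from fractions import Fraction
from math import factorial

ORDER = 6  # highest power of omega that is kept

def mul(u, v):
    """Product of two power series in omega, truncated at ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            if i + j <= ORDER:
                out[i + j] += ui * vj
    return out

def exp_series(x):
    """Taylor coefficients of h(t) = exp(sum_m x[m-1] * t**m / m!)."""
    log_h = [Fraction(0)] * (ORDER + 1)
    for m, xm in enumerate(x, start=1):
        log_h[m] = Fraction(xm) / factorial(m)
    result = [Fraction(0)] * (ORDER + 1)
    power = [Fraction(1)] + [Fraction(0)] * ORDER  # log_h ** 0
    for j in range(ORDER + 1):
        result = [r + c / factorial(j) for r, c in zip(result, power)]
        power = mul(power, log_h)
    return result

def A(n, x):
    """omega-coefficients of h(n*omega) - h(-n*omega); odd powers only."""
    h = exp_series(x)
    return [h[m] * n**m * (1 - (-1)**m) for m in range(ORDER + 1)]

def B3(x):
    """Cubic bracket of (6.28): p^3 + 3 zeta' p + zeta''."""
    p, z1, z2, _z3, _z4 = x
    return p**3 + 3*z1*p + z2

def B5(x):
    """Quintic bracket of (6.28)."""
    p, z1, z2, z3, z4 = x
    return (p**5 + 10*z1*p**3 + 10*z2*p**2
            + (5*z3 + 15*z1**2)*p + z4 + 10*z1*z2)

def defect_7_12(x, y):
    """LHS minus RHS of (7.12), with theta(a1)*theta(a2) divided out."""
    left = mul(A(3, x), A(1, y))
    right = mul(A(1, x), A(3, y))
    return [l - r for l, r in zip(left, right)]

x = (2, 3, 5, 7, 11)    # sample (p1, zeta1', zeta1'', zeta1''', zeta1'''')
y = (-1, 4, 1, -6, 9)   # sample (p2, zeta2', zeta2'', zeta2''', zeta2'''')
d = defect_7_12(x, y)
print(d[2] == 0,
      d[4] == 16 * (y[0] * B3(x) - x[0] * B3(y)),
      d[6] == 8 * (y[0] * B5(x) - x[0] * B5(y)))  # prints: True True True
```

At these sample values the order-2 term cancels, and the order-4 and order-6 terms are constant multiples of the two equations of (6.28), in line with the corollary.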


Remark 7.4. Note that for given $\xi_1^2, \xi_2^2$ the corresponding quasimomenta $k_1, k_2$ seem to be non-unique, with the ambiguity of adding some multiple of $i\pi/\omega$. However, this would result in multiplying $\Phi$ by $e^{i\pi x_1/\omega}$, $e^{i\pi x_2/\omega}$, which are quasi-constant on the lattice $2\omega\mathbb{Z}^2$. Thus, this leads to the same eigenfunction, so the constructed Bloch solutions are in one-to-one correspondence with the points of the surface $C$.

Remark 7.5. It is clear that the Weyl group action on $C$ is generated by the two involutions $(a_1, \xi_1) \to (-a_1, \xi_1^{-1})$ and $(a_1, \xi_1) \leftrightarrow (a_2, \xi_2)$.

7.2. Structure of the solution space. As we mentioned above, the difference operator (7.1) is completely integrable, i.e. there exists another difference operator $L_1$ which commutes with $L$. It is given by the following expression [49, 50]:

\[ L_1 = c_{++}\,T_1^{\omega}T_2^{\omega} + c_{+-}\,T_1^{\omega}T_2^{-\omega} + c_{-+}\,T_1^{-\omega}T_2^{\omega} + c_{--}\,T_1^{-\omega}T_2^{-\omega}, \tag{7.16} \]

where the coefficients $c_{\varepsilon_1,\varepsilon_2}$ look as follows (we treat $\pm$ as $\pm 1$):

\[ c_{\varepsilon_1,\varepsilon_2} = \frac{\theta(x_1 - \omega\varepsilon_1)\,\theta(x_2 - \omega\varepsilon_2)\,\theta(x_1 + x_2 - 2\omega(\varepsilon_1 + \varepsilon_2))\,\theta(x_1 - x_2 - 2\omega(\varepsilon_1 - \varepsilon_2))}{\theta(x_1)\,\theta(x_2)\,\theta(x_1 + x_2)\,\theta(x_1 - x_2)}. \]

Now let us consider the system of two partial difference equations

\[ Lf = Ef, \qquad L_1 f = E_1 f, \tag{7.17} \]

defined on the lattice $2\omega\mathcal{L}$ where $\mathcal{L} = \{(m, n) \mid m \pm n \in \mathbb{Z}\}$. More precisely, we fix a generic $x^0 \in \mathbb{C}^2$ as a base point and regard a function $f$ in (7.17) as being defined on $x^0 + 2\omega\mathcal{L} \subset \mathbb{C}^2$. The base point $x^0$ must be outside the singular locus of $L, L_1$, i.e. such that $L, L_1$ are nonsingular on $x^0 + 2\omega\mathcal{L}$.

The Bloch solutions $\Phi$ constructed above are common eigenfunctions of $L$ and $L_1$. (The proof for $L_1$ is the same: the only thing to check is an analogue of Proposition 5.1.) Since $L$ and $L_1$ are $W$-symmetric, each of the 8 functions $\Phi(wx)$ $(w \in W)$ will solve the system (7.17). We know that for a generic point of the spectral surface $X$ (thus, for generic $E, E_1$) all 8 functions $\Phi(wx)$ are linearly independent (as functions on $\mathbb{C}^2$), because they have different translation properties with respect to the shifts (6.8)–(6.9). Hence, their restriction to $x^0 + 2\omega\mathcal{L}$ also gives 8 linearly independent solutions of (7.17) (at least, for a generic base point $x^0$). On the other hand, it is not difficult to see that any solution $f$ is uniquely determined by its values at the eight points $x^0 + \nu$ with the following $\nu$:

\[ \nu = (0, 0),\ (\pm\omega, \omega),\ (\omega, -\omega),\ (\pm 2\omega, 0),\ (0, 2\omega),\ (\omega, 3\omega). \]

This implies that the dimension of the solution space of (7.17) is at most 8. Thus, we conclude that any solution of (7.17) (for generic $E, E_1$) is a linear combination of the 8 Bloch solutions $\{\Phi(wx)\}_{w \in W}$.

Proposition 7.6. The space of solutions of the system (7.17) has dimension 8 and for generic $E, E_1$ is generated by the Bloch solutions $\Phi(wx)$ $(w \in W)$.
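The count of initial points in the argument above is easy to check mechanically. The following sketch works in units of $2\omega$, so that the lattice becomes the set of pairs $(m, n)$ with $m \pm n \in \mathbb{Z}$, and confirms that the eight offsets $\nu$ are distinct lattice points. It does not verify that these values determine a solution; the names `NU` and `in_lattice` are illustrative.

```python
from fractions import Fraction as F

# The eight offsets nu from the text, divided by 2*omega.
NU = [
    (F(0), F(0)),
    (F(1, 2), F(1, 2)), (F(-1, 2), F(1, 2)),  # (+-omega, omega)
    (F(1, 2), F(-1, 2)),                      # (omega, -omega)
    (F(1), F(0)), (F(-1), F(0)),              # (+-2*omega, 0)
    (F(0), F(1)),                             # (0, 2*omega)
    (F(1, 2), F(3, 2)),                       # (omega, 3*omega)
]

def in_lattice(point):
    """True if (m, n) satisfies m + n in Z and m - n in Z."""
    m, n = point
    return (m + n).denominator == 1 and (m - n).denominator == 1

print(len(set(NU)), all(in_lattice(nu) for nu in NU))  # prints: 8 True
```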


Remark 7.7. Above we associated a double-Bloch eigenfunction to a solution $(a_1, a_2, \xi_1, \xi_2)$ of Eqs. (7.12)–(7.13). Note that these equations are invariant under $\xi_j \to -\xi_j$, but this does not lead to another eigenfunction, since they will differ by a quasiconstant factor. The situation is different for the system (7.17), since it is defined on a different lattice. It is easy to see that $(\xi_1, \xi_2)$ and $(-\xi_1, -\xi_2)$ still lead to the same solution modulo quasiconstants; the same is true for $(\xi_1, -\xi_2)$ and $(-\xi_1, \xi_2)$. The resulting two functions have the same eigenvalue $E$ in (7.17), but opposite values of $E_1$. Thus, the Bloch variety for the system (7.17) is a double covering of the surface $C$ introduced above.

We will not go into discussing the spectral properties of the difference operator $L$. See papers [52, 53] devoted to this rather delicate matter. Let us just remark on some special solutions analogous to the "discrete spectrum" considered in Sect. 6.2. Namely, let us consider the following anti-invariant solution of the system (7.17):

\[ \Phi_{\mathrm{skew}}(x) = \sum_{w \in W} (\det w)\,\Phi(wx). \]

The vanishing conditions (7.3)–(7.6) imply that $\Phi_{\mathrm{skew}}$ vanishes along the lines $x_1 \pm x_2 = \pm 2\omega$ and $x_j = \pm\omega$. It also vanishes if $x_1 = \pm x_2$ and $x_j = 0$ due to anti-invariance. Let us require now that all 8 functions $\Phi(wx)$ have the same Floquet–Bloch multipliers with respect to the shifts by $e_1$ and $e_2$, which is equivalent to the conditions $\exp(k_1) = \exp(k_2) = \pm 1$. Then $\Phi_{\mathrm{skew}}$ will vanish also along the shifted lines

\[ x_1 \pm x_2 = m,\ m \pm 2\omega, \qquad x_j = n,\ n \pm \omega \qquad (m, n \in \mathbb{Z}). \]

In the limit $\omega \to 0$ these solutions go to those constructed in Sect. 6.2; more precisely,

\[ \omega^{-6}\,\Phi_{\mathrm{skew}} \longrightarrow 64\,(\theta'(0))^2\,\theta(a_1)\,\theta(a_2)\,\theta(x_1)\,\theta(x_2)\,\theta(x_1 + x_2)\,\theta(x_1 - x_2) \quad\text{as } \omega \to 0. \]

8. Hietarinta Operator and Its Discretization

8.1. Continuous case. We consider now the Schrödinger operator (1.4), but first let us rescale the coordinates $x_i \to a_i x_i$, so instead of (1.4) we will consider

\[ L = -a_1^2\partial_1^2 - a_2^2\partial_2^2 - a_3^2\partial_3^2 + 2(a_1^2 + a_2^2)\,\wp(x_1 - x_2) + 2(a_2^2 + a_3^2)\,\wp(x_2 - x_3) + 2(a_3^2 + a_1^2)\,\wp(x_3 - x_1), \tag{8.1} \]

where, as before, $\wp(z) = \wp(z|\tau)$ is the Weierstrass $\wp$-function and $a_1^2 + a_2^2 + a_3^2 = 0$. We are going to calculate the double-Bloch eigenfunctions of $L$. More specifically, we are looking for the solutions $\psi$ of the equation $L\psi = E\psi$ with the following properties:

(i) $\psi$ is of the form

\[ \psi(x) = \frac{\Phi(x)}{\theta(x_{12})\,\theta(x_{23})\,\theta(x_{31})}\,\exp(k_1 x_1 + k_2 x_2 + k_3 x_3), \tag{8.2} \]

where $x_{ij} := x_i - x_j$, $\theta = \theta_{1/2,1/2}$ and $\Phi$ is holomorphic in $\mathbb{C}^3$ and depends on the differences $x_{ij}$ only, in other words, $(\partial_1 + \partial_2 + \partial_3)\Phi = 0$;

(ii) $\psi$ has the following translation properties:

\[ \psi(x + e_j) = e^{k_j}\psi(x), \qquad \psi(x + \tau e_j) = e^{\mu_j}\psi(x) \qquad (j = 1, 2, 3), \tag{8.3} \]

where $(e_1, e_2, e_3)$ is the standard basis in $\mathbb{C}^3$. It is not difficult to conclude that for fixed $k_j, \mu_j$ the conditions above determine a three-dimensional functional space, and the corresponding $\Phi(x)$ in (8.2) must be of the form

\[ \Phi = \sum_{l=0}^{2} c_l\, \theta(x_{12} + b_{12} + l\tau/3)\, \theta(x_{23} + b_{23} + l\tau/3)\, \theta(x_{31} + b_{31} + l\tau/3), \tag{8.4} \]

where $c_0, c_1, c_2$ are arbitrary constants and the parameters $b_{12}, b_{23}, b_{31}$ are related to the $\mu_j$ above in the following way:

\begin{align*}
e^{\mu_1} &= e^{k_1\tau + 2\pi i b_{31} - 2\pi i b_{12}}, \\
e^{\mu_2} &= e^{k_2\tau + 2\pi i b_{12} - 2\pi i b_{23}}, \tag{8.5} \\
e^{\mu_3} &= e^{k_3\tau + 2\pi i b_{23} - 2\pi i b_{31}}.
\end{align*}

This shows that the three-dimensional space (8.4) depends, essentially, on the pairwise differences of the parameters $b_{lm}$ only. Thus, without loss of generality we may assume that

\[ b_{12} + b_{23} + b_{31} = 0. \tag{8.6} \]

In the formulas below we will also use $b_{21}, b_{32}, b_{13}$ under the convention that $b_{ij} = -b_{ji}$.

Now, in accordance with Proposition 5.1, we impose certain vanishing conditions on $\psi$ which are motivated by the structure of the singularities of the operator (8.1). Namely, for any $i = 1, 2, 3$ consider the function $f(t) = \psi(x + t a_{i-1}^2 e_{i-1} - t a_{i+1}^2 e_{i+1})$ for $x$ such that $x_{i-1} = x_{i+1}$ (we treat indices modulo 3, so $x_0 = x_3$). Our assumptions about $\psi$ imply that for such $x$ the function $f$ will have a pole at $t = 0$, so its Laurent expansion will look as $f = a_{-1}t^{-1} + a_0 + a_1 t + \dots$. The coefficients in this expansion depend on $x$. Let us require that $a_0 = 0$ for all $x$ such that $x_{i-1} = x_{i+1}$. Using (8.2) one rewrites this condition as follows:

\[ a_{i-1}^2 k_{i-1} - a_{i+1}^2 k_{i+1} + F_i = 0, \tag{8.7} \]

\[ F_i := \frac{\bigl(a_{i-1}^2\partial_{i-1} - a_{i+1}^2\partial_{i+1}\bigr)\Phi}{\Phi} - a_{i-1}^2\,\frac{\theta'(x_{i-1,i})}{\theta(x_{i-1,i})} + a_{i+1}^2\,\frac{\theta'(x_{i+1,i})}{\theta(x_{i+1,i})}, \tag{8.8} \]

with (8.7) to be valid for all $x$ such that $x_{i-1} = x_{i+1}$. The following lemma follows from Proposition 5.1.

Lemma 8.1. If $\psi$ has the form (8.2), (8.4) and satisfies the vanishing conditions (8.7), then the same will be true for its image $\widetilde{\psi} = L\psi$ under the action of the operator (8.1).

As we will see below, for a certain three-dimensional subvariety in the space of the parameters $k_j, b_{lm}$ the vanishing conditions cut a one-dimensional subspace in the space (8.4). Thus, the lemma ensures that the corresponding $\psi(x)$ will be an eigenfunction of $L$.

We may regard the restriction of the expression $F_i$ to the plane $x_{i-1} = x_{i+1}$ as a function of $z = x_{i-1,i} = x_{i+1,i}$. It is easy to check then that $F_i$ will be an elliptic function of $z$ with periods $1, \tau$. So, first of all we have to choose the parameters $b_{lm}, c_j$ in such a way that $F_i(z)$ would be non-singular. Let us assume that the parameters $a_1, a_2, a_3$ are generic enough, i.e. that $a_i^2 \neq a_j^2$. Then for $F_i$ to be non-singular at $z = 0$ we need $\Phi(0) = 0$. This gives the following condition:

\[ \widetilde{c}_0 + \widetilde{c}_1 + \widetilde{c}_2 = 0, \qquad \widetilde{c}_l = c_l\, \theta\Bigl(b_{12} + \frac{l\tau}{3}\Bigr)\, \theta\Bigl(b_{23} + \frac{l\tau}{3}\Bigr)\, \theta\Bigl(b_{31} + \frac{l\tau}{3}\Bigr). \tag{8.9} \]

Now we note that $\Phi(z) := \Phi|_{x_{i-1} = x_{i+1}}$ is a one-dimensional $\theta$-function of order 2, hence it has two zeros (modulo $1, \tau$). The first zero is $z = 0$ (due to condition (8.9)). An easy check shows that the second zero is $z = b_{i,i-1} + b_{i,i+1}$. So, up to a constant factor, $\Phi(z) = \theta(z)\,\theta(z - b_{i,i-1} - b_{i,i+1})$. To get rid of a possible pole at $z = b_{i,i-1} + b_{i,i+1}$ in (8.8) we must require that $\bigl(a_{i-1}^2\partial_{i-1} - a_{i+1}^2\partial_{i+1}\bigr)\Phi = 0$ for $x_{i-1,i} = x_{i+1,i} = b_{i,i-1} + b_{i,i+1}$. This leads to the following relation:

\[ \sum_{l=0}^{2} \widetilde{c}_l \Bigl( a_1^2\,\zeta\Bigl(b_{23} + \frac{l\tau}{3}\Bigr) + a_2^2\,\zeta\Bigl(b_{31} + \frac{l\tau}{3}\Bigr) + a_3^2\,\zeta\Bigl(b_{12} + \frac{l\tau}{3}\Bigr) \Bigr) = 0. \tag{8.10} \]

Here and below $\zeta(z) := \frac{d}{dz}\log\theta(z)$. Notice that the relation (8.10) is symmetric with respect to the indices 1, 2, 3 (so we have just one condition instead of a possible three!).

We use the relations (8.9), (8.10) to express (up to a common factor) the parameters $c_l$ in terms of the parameters $b_{ij}$. These relations imply that each $F_i$ in (8.8) is nonsingular in $z = x_{i-1,i} = x_{i+1,i}$; therefore they are constants depending on $c_l, b_{ij}$. Thus, (8.7) leads to expressions for the differences $a_{i-1}^2 k_{i-1} - a_{i+1}^2 k_{i+1}$ in terms of $c_l$ and $b_{lm}$. One can check that the resulting system is always compatible (i.e. that $F_1 + F_2 + F_3 = 0$). This follows, for instance, from the compatibility of the system (8.18) below by going to the limit $\omega \to 0$. As a corollary, the formulas (8.2), (8.4), together with (8.9), (8.10) and (8.7), deliver the expression for the double-Bloch eigenfunctions of the operator $L$.

Finally, let us discuss the structure of the Hermite–Bloch variety of the operator (8.1). The double-Bloch solutions are parametrized by $b_{12}, b_{23}, b_{31}$ with $b_{12} + b_{23} + b_{31} = 0$, and the corresponding $k_1, k_2, k_3$ are determined from (8.7). Denote by $b$ and $k$ the three-component vectors $b = (b_{12}, b_{23}, b_{31})$ and $k = (k_1, k_2, k_3)$. Then two different points in the parameter space $(b, k)$ lead to the same solution iff the corresponding Floquet multipliers in (8.3) are the same. Taking into account relations (8.5), we conclude that the following transformations do not lead to another Bloch solution:

\begin{align*}
b &\to b + \varepsilon_1, & k &\to k & (\varepsilon_1 &= (2/3, -1/3, -1/3)), \\
b &\to b + \varepsilon_2, & k &\to k & (\varepsilon_2 &= (-1/3, 2/3, -1/3)), \\
b &\to b + \tau\varepsilon_1, & k &\to k + 2\pi i(1, -1, 0), \tag{8.11} \\
b &\to b + \tau\varepsilon_2, & k &\to k + 2\pi i(0, 1, -1).
\end{align*}
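The invariance of the Floquet multipliers under (8.11) can be checked directly from (8.5). The worked check below treats the shifts by $\varepsilon_1$ and $\tau\varepsilon_1$ (the case of $\varepsilon_2$ is analogous); the symbol $\Delta$ for the change of a quantity and the compact index form of (8.5), with indices taken modulo 3, are shorthand introduced here.

```latex
\begin{align*}
\text{(8.5):}\quad & \mu_j \equiv k_j\tau + 2\pi i\,(b_{j-1,j} - b_{j,j+1}) \pmod{2\pi i}, \qquad j = 1, 2, 3. \\[4pt]
b \to b + \varepsilon_1:\quad
 & \Delta(b_{31} - b_{12}) = -\tfrac13 - \tfrac23 = -1, \quad
   \Delta(b_{12} - b_{23}) = \tfrac23 + \tfrac13 = 1, \quad
   \Delta(b_{23} - b_{31}) = 0, \\
 & \text{so each } \mu_j \text{ changes by an integer multiple of } 2\pi i
   \text{ and } e^{\mu_j} \text{ is unchanged.} \\[4pt]
b \to b + \tau\varepsilon_1,\ k \to k + 2\pi i(1, -1, 0):\quad
 & \Delta\mu_1 = 2\pi i\,\tau + 2\pi i\,(-\tau) = 0, \\
 & \Delta\mu_2 = -2\pi i\,\tau + 2\pi i\,(\tau) = 0, \\
 & \Delta\mu_3 = 0, \\
 & \text{and } e^{k_j} \text{ is unchanged because each } k_j \text{ shifts by a multiple of } 2\pi i.
\end{align*}
```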

Thus, $b$ is effectively represented by a point of the factor $\mathbb{C}^2/(\mathcal{L} + \tau\mathcal{L})$, where $\mathbb{C}^2 = \{z_1 + z_2 + z_3 = 0\} \subset \mathbb{C}^3$ and the lattice $\mathcal{L}$ is generated by $\varepsilon_1, \varepsilon_2$. This factor is isomorphic to the product of two elliptic curves with parameter $\tau$. Above each point $b$ we have a complex line of double-Bloch solutions, because Eqs. (8.7) determine $k$ up to adding any multiple of $(a_1^{-2}, a_2^{-2}, a_3^{-2})$.

8.2. Discrete case. We keep the notation $x_{ij}$ for $x_i - x_j$. The discrete version of the operator (8.1) looks as follows:

\[ D = \sum_{i=1}^{3} \frac{\theta(\omega)\,\theta(x_{i-1,i} + \omega a_i^2)\,\theta(x_{i,i+1} - \omega a_i^2)}{\theta(\omega a_i^2)\,\theta(x_{i-1,i})\,\theta(x_{i,i+1})}\; T_i^{\omega a_i^2}, \tag{8.12} \]

where $a_1^2 + a_2^2 + a_3^2 = 0$ and $T_i^{\varepsilon}$ stands for a shift by $\varepsilon$ in $x_i$. Its rational version $\theta(z) = z$ was communicated to us by M. Feigin, who found it to be dual (in the bispectral sense) to the trigonometric version $\wp = \sin^{-2}$ of the Hietarinta operator (see [54]). The difference operator (8.12) relates to (8.1) in the following way:

\[ D = a_1^{-2} + a_2^{-2} + a_3^{-2} + \omega(\partial_1 + \partial_2 + \partial_3) + \frac{\omega^2}{2}\,(\mathrm{const} - \widetilde{L}) + o(\omega^2) \quad\text{as } \omega \to 0, \]

where $\widetilde{L}$ is gauge-equivalent to $L$,

\[ \widetilde{L} = \delta \circ L \circ \delta^{-1}, \qquad \delta = \theta(x_1 - x_2)\,\theta(x_2 - x_3)\,\theta(x_3 - x_1). \]

Unlike $L$, the operator $D$ is not periodic. As a result, instead of the double-Bloch eigenfunctions, we will look for eigenfunctions with translation properties similar to those of $\delta$. Apart from that, our ansatz for the eigenfunctions $\varphi$ of the operator $D$ remains the same:

\[ \varphi = \Phi\,\exp(k_1 x_1 + k_2 x_2 + k_3 x_3), \tag{8.13} \]

\[ \Phi = \sum_{l=0}^{2} c_l\, \theta(x_{12} + b_{12} + l\tau/3)\, \theta(x_{23} + b_{23} + l\tau/3)\, \theta(x_{31} + b_{31} + l\tau/3), \tag{8.14} \]

\[ b_{12} + b_{23} + b_{31} = 0. \tag{8.15} \]

The vanishing conditions now look as follows: for each $i = 1, 2, 3$

\[ F_i := \theta(x_{i,i-1} + \omega a_{i-1}^2)\, T_{i-1}^{\omega a_{i-1}^2}(\varphi) - \theta(x_{i,i+1} + \omega a_{i+1}^2)\, T_{i+1}^{\omega a_{i+1}^2}(\varphi) = 0 \tag{8.16} \]

identically for all $x$ with $x_{i+1} = x_{i-1}$. We have then a straightforward analog of Lemma 8.1, so the same approach as above will give us the eigenfunctions for $D$. Let us first formulate the result. Namely, we consider the following two conditions on the function $\Phi(x_1, x_2, x_3)$ given by (8.14):

\[ \Phi(\omega a_1^2, 0, -\omega a_2^2) = 0, \qquad \Phi(-\omega a_2^2, 0, \omega a_3^2) = 0. \tag{8.17} \]

This gives us two linear equations on cl and we use them to express cl (up to a factor) through bij .


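Secondly, we impose the following three relations:
\[
\begin{aligned}
e^{\omega a_1^2 k_1 - \omega a_3^2 k_3} &= \frac{\theta(\omega a_3^2)\,\Phi(0, 0, \omega a_3^2)}{\theta(\omega a_1^2)\,\Phi(\omega a_1^2, 0, 0)},\\
e^{\omega a_2^2 k_2 - \omega a_1^2 k_1} &= \frac{\theta(\omega a_1^2)\,\Phi(\omega a_1^2, 0, 0)}{\theta(\omega a_2^2)\,\Phi(0, \omega a_2^2, 0)},\\
e^{\omega a_3^2 k_3 - \omega a_2^2 k_2} &= \frac{\theta(\omega a_2^2)\,\Phi(0, \omega a_2^2, 0)}{\theta(\omega a_3^2)\,\Phi(0, 0, \omega a_3^2)}.
\end{aligned}
\tag{8.18}
\]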
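A short check, sketched here with $\Phi$ denoting the theta-function combination of (8.14), shows why these three relations can be solved simultaneously and why they leave one parameter free. Multiplying the three left-hand sides gives
\[
e^{(\omega a_1^2 k_1 - \omega a_3^2 k_3) + (\omega a_2^2 k_2 - \omega a_1^2 k_1) + (\omega a_3^2 k_3 - \omega a_2^2 k_2)} = e^{0} = 1,
\]
while the product of the three right-hand sides telescopes:
\[
\frac{\theta(\omega a_3^2)\Phi(0,0,\omega a_3^2)}{\theta(\omega a_1^2)\Phi(\omega a_1^2,0,0)}\cdot
\frac{\theta(\omega a_1^2)\Phi(\omega a_1^2,0,0)}{\theta(\omega a_2^2)\Phi(0,\omega a_2^2,0)}\cdot
\frac{\theta(\omega a_2^2)\Phi(0,\omega a_2^2,0)}{\theta(\omega a_3^2)\Phi(0,0,\omega a_3^2)} = 1 .
\]
So any one of the relations follows from the other two, and the system imposes only two independent conditions on the three unknowns $k_1, k_2, k_3$.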

We use these formulas to express $k_1, k_2, k_3$ through $\Phi$. The solution is not unique, and in fact we have a one-parameter family of $k_i$. Altogether, formulas (8.17)–(8.18) fix the dependence of $c_l$ and $k_i$, and hence of $\varphi$, on the three parameters $b_{ij}$ (related by (8.15)). The resulting family of functions $\varphi(x)$ depends on three parameters: two of the $b_{ij}$ and one more due to the freedom in resolving (8.18); see more comments below.

Theorem 8.2. The formulas (8.13)–(8.15) and the relations (8.17)–(8.18) give a three-parameter family of eigenfunctions for the difference operator D.

To prove the theorem, let us first notice that each of the $F_i$ in (8.16), regarded as a function of $z = x_{i,i-1} = x_{i,i+1}$, is a one-dimensional theta-function of order 3, so if it does not vanish identically, it must have three zeros (modulo 1, $\tau$). Moreover, a simple count shows that the sum of these zeros is equal to $b_{i,i-1} + b_{i,i+1}$. On the other hand, a direct substitution into (8.16) shows that the relations (8.17) imply that $F_2$ vanishes for $z = -\omega a_1^2$ and $z = -\omega a_3^2$. Further, the first relation in (8.18) simply encodes the fact that $F_2$ vanishes at z = 0. Since the sum of these three zeros is $\omega a_2^2$ which, generically, is not $b_{21} + b_{23}$, we conclude that $F_2(z)$ is zero identically. At first glance it seems that we need to add four more conditions to ensure all the vanishing properties (8.16). Namely, one needs also
\[
\Phi(0, \omega a_2^2, -\omega a_1^2) = \Phi(0, -\omega a_1^2, \omega a_3^2) = \Phi(-\omega a_3^2, \omega a_2^2, 0) = \Phi(\omega a_1^2, -\omega a_3^2, 0) = 0.
\tag{8.19}
\]
However, since $\Phi$ depends on the pairwise differences of the $x_i$ only, we will have that
\[
\Phi(0, \omega a_2^2, -\omega a_1^2) = \Phi(-\omega a_2^2,\, -\omega a_2^2 + \omega a_2^2,\, -\omega a_2^2 - \omega a_1^2) = \Phi(-\omega a_2^2, 0, \omega a_3^2) = 0.
\]
In the same way the other relations in (8.19) follow from (8.17). This demonstrates that the three-parameter family constructed in the theorem satisfies the vanishing conditions (8.16), thus proving the theorem.

Finally, let us comment on the structure of the Hermite–Bloch variety. Similarly to the case $\omega = 0$ above, Bloch solutions are parametrized by $b = (b_{12}, b_{23}, b_{31})$ with $b_{12} + b_{23} + b_{31} = 0$. This determines the corresponding $c_l$ by (8.17). After that $k = (k_1, k_2, k_3)$ are determined from (8.18). At this point we have certain freedom: if $k = (k_1, k_2, k_3)$ is a solution of (8.18), then any
\[
k' = k + \frac{t}{\omega}\,(a_1^{-2}, a_2^{-2}, a_3^{-2}) + \frac{2\pi i}{\omega}\,(n_1 a_1^{-2}, n_2 a_2^{-2}, n_3 a_3^{-2})
\]
with any $t \in \mathbb{C}$ and integer $n_1, n_2, n_3$ will be a solution, too. However, the last term is not essential since it results in multiplying by a quasiconstant. For the same reason,


the factor t in the second term is essential modulo $2\pi i$ only. Besides, we still have the translation invariance of $\varphi$ with respect to the transformations (8.11). Thus, the Hermite–Bloch variety is fibered over the product of two elliptic curves with the fibers isomorphic to $\mathbb{C}/2\pi i\mathbb{Z}$.

Acknowledgement. We are grateful to A.P. Veselov for stimulating discussions and to Yu. Berest, S. Ruijsenaars and K. Takemura for useful comments. P.E. thanks C. De Concini for a discussion which was useful for the proof of Theorem 3.8. The work of O.C. was supported by EPSRC. The work of A.O. was supported by Russian Foundation for Basic Research (grant RFBR-01-01-00803). The work of P.E. was partially supported by the NSF grant DMS-9988796, and partly done for the Clay Mathematics Institute.

References 1. Novikov, S.P.: A periodic problem for the Korteweg–de Vries equation I. Funct. Analis i ego Pril. 8(3), 54–66 (1974) 2. Dubrovin, B.A., Krichever, I.M., Novikov, S.P.: Integrable systems I. In: Dynamical Systems IV. Arnold, V.I., Novikov, S.P. (eds) Berlin: Springer, 1990, pp. 173–280 3. Nakayashiki, A.: Structure of Baker–Akhiezer modules of principally polarized abelian varieties, commuting partial differential operators and associated integrable systems. Duke Math. J. 62(2), 315–358 (1991) 4. Nakayashiki, A.: Commuting partial differential operators and vector bundles over abelian varieties. Amer. J. Math. 116, 65–100 (1994) 5. Parshin, A.N.: Integrable systems and local fields. Comm. Algebra 29(9), 4157–4181 (2001) 6. Osipov, D.V.: The Krichever correspondence for algebraic varieties. Izv. Math. 65(5), 941–975 (2001) 7. Rothstein, M.: The Fourier–Mukai transform and equations of KP-type in several variables. Preprint, 2002; mathAG/0201066 8. Chalykh, O.A., Veselov, A.P.: Commutative rings of partial differential operators and Lie algebras. Commun. Math. Phys. 126, 597–611 (1990) 9. Olshanetsky, M.A., Perelomov, A.M.: Quantum integrable systems related to Lie algebras. Phys. Rep. 94, 313–404 (1983) 10. Veselov, A.P., Styrkas, K.L., Chalykh, O.A.: Algebraic integrability for the Schr¨odinger equation and finite reflection groups. Theor. Math. Phys. 94, 253–275 (1993) 11. Braverman, A., Etingof, P., Gaitsgory, D.: Quantum integrable systems and differential Galois theory. Transfor. Groups 2, 31–57 (1997) 12. Chalykh, O.A.: Darboux transformations for multidimensional Schr¨odinger operators. Russian Math. Surveys 53(2), 167–168 (1998) 13. Chalykh, O.A., Feigin, M.V., Veselov, A.P.: Multidimensional Baker–Akhiezer functions and Huygens’ principle. Commun. Math. Phys. 206, 533–566 (1999) 14. Gesztesy, F., Weikard, R.: Picard potentials and Hill’s equation on torus. Acta Math. 176, 73–107 (1996) 15. 
Cherednik, I.: Elliptic quantum many-body problem and double affine Knizhnik–Zamolodchikov equation. Commun. Math. Phys. 169(2), 441–461 (1995) 16. Chalykh, O.A., Veselov, A.P.: Locus configurations and ∨-systems. Phys. Lett. A 285(5–6), 339–349 (2001) 17. Krichever, I.M.: Elliptic solutions of the KP equations and integrable systems of particles. Funktsional. Anal. i Prilozhen. 14(4), 45–54 (1980) 18. Felder, G., Varchenko, A.: Three formulae for eigenfunctions of integrable Schr¨odinger operators. Compos. Math. 107, 143–175 (1997) 19. Hietarinta, J.: Pure quantum integrability. Phys. Lett. A 246, 97–104 (1998) 20. Berest, Yu.Yu., Lutsenko, I.M.: Huygens’ principle in Minkowski spaces and soliton solutions of the KdV equation. Commun. Math. Phys. 190, 113–132 (1997) 21. Inozemtsev, V.I.: Solution to three-magnon problem for S = 1/2 periodic quantum spin chains with elliptic exchange. J. Math. Phys. 37(1), 147–159 (1996) 22. Dubrovin, B.A., Krichever, I.M., Novikov, S.P.: The Schr¨odinger equation in a periodic field and Riemann surfaces. Dokl. Akad. Nauk SSSR 229(1), 15–18 (1976) 23. Veselov, A.P., Novikov, S.P.: Finite-gap two-dimensional Schr¨odinger operators. Potential operators. Dokl. Akad. Nauk SSSR 279(4), 784–788 (1984) 24. Airault, H., McKean, H.P., Moser, J.: Rational and elliptic solutions of the Korteweg– de Vries equation and a related many body problem. Commun. Pure Appl. Math. 30(1), 95–148 (1977)


25. Gesztesy, F., Weikard, R.: Elliptic algebro-geometric solutions of the KdV and AKNS hierarchies – an analytic approach. Bull. Amer. Math. Soc. (N.S.) 35(4), 271–317 (1998) 26. Serre, J.-P.: Local fields. NY–Berlin: Springer-Verlag, 1979 27. Duistermaat, J.J., Grünbaum, F.A.: Differential equations in the spectral parameter. Commun. Math. Phys. 103(2), 177–240 (1986) 28. Deligne, P.: Equations differentielles a points singuliers reguliers. LNM 163, NY–Berlin: Springer-Verlag, 1970 29. Kaplansky, I.: Introduction to differential algebra. Paris: Hermann, 1957 30. Krichever, I., Zabrodin, A.: Spin generalization of the Ruijsenaars-Schneider model, the nonabelian two-dimensionalized Toda lattice, and representations of the Sklyanin algebra. Russian Math. Surveys 50(6), 1101–1150 (1995) 31. Segal, G., Wilson, G.: Loop groups and equations of KdV type. IHES Publ. 61, 5–65 (1985) 32. Khodarinova, L.A.: On quantum elliptic Calogero-Moser problem. Vestnik Mosc. Univ., Ser. Math. and Mech. 53(5), 16–19 (1998) 33. Khodarinova, L.A., Prikhodsky, I.A.: On algebraic integrability of the deformed elliptic Calogero-Moser problem. J. Nonlin. Math. Phys. 8(1), 1–4 (2001) 34. Oshima, T., Sekiguchi, H.: Commuting families of differential operators invariant under the action of a Weyl group. J. Math. Sci. Univ. Tokyo 2(1), 1–75 (1995) 35. Mumford, D.: Tata lectures on theta I. Progress in Mathematics, 28. Boston: Birkhäuser, 1983 36. Feldman, J., Knörrer, H., Trubowitz, E.: There is no two dimensional analogue of Lamé's equation. Math. Ann. 294, 295–324 (1992) 37. Looijenga, E.: Root systems and elliptic curves. Invent. Math. 38(1), 17–32 (1976) 38. Bernstein, I.N., Shvartsman, O.V.: Chevalley's theorem for complex crystallographic Coxeter groups. Funktsional. Anal. i Prilozhen. 12(4), 79–80 (1978) 39. Berest, Yu., Etingof, P., Ginzburg, V.: Cherednik algebras and differential operators on quasi-invariants. Preprint, 2001; mathQA/0111005 (to appear in Duke Math. J.) 40. 
Etingof, P., Ginzburg, V.: On m-quasiinvariants of a Coxeter group. Preprint, 2001; mathQA/0106175 41. Schmidt, M.U., Veselov, A.P.: Quantum elliptic Calogero–Moser problem and deformations of algebraic surfaces. Preprint, 1996 42. Chalykh, O.A.: Bispectrality for the quantum Ruijsenaars model and its integrable deformation. J. Math. Phys. 41(8), 5139–5167 (2000) 43. Whittaker, E.T., Watson, G.N. A course of modern analysis. Cambridge: Cambridge U.P., 1986 44. Oblomkov, A.A.: Integrability of some quantum problems related to root system B2 . Vestnik Mosc. Univ., Ser. Math. Mech. 54(2), 6–8 (1999) 45. Khodarinova, L.A., Prikhodsky, I.A.: Algebraic spectral relations for elliptic quantum Calogero– Moser problems. J. Nonlin. Math. Phys. 6(3), 263–268 (1999) 46. Komori, Y., Takemura, K.: The perturbation of the quantum Calogero–Moser–Sutherland system and related results. Commun. Math. Phys. 227(1), 93–118 (2002) 47. Ruijsenaars, S.N.M.: Complete integrability of relativistic Calogero–Moser systems and elliptic functions identities. Commun. Math. Phys. 110, 191–213 (1987) 48. Koornwinder, T.H.: Askey-Wilson polynomials for root systems of type BC. In: Hypergeometric functions on domains of positivity. Richards, D.St.P (ed), Jack polynomials, and applications. Contemp. Math. 138, 1992, pp. 189–204 49. van Diejen, J.F.: Integrability of difference Calogero–Moser systems. J. Math. Phys. 35, 2983–3004 (1994) 50. Komori,Y., Hikami, K.: Conserved operators of the generalized elliptic Ruijsenaars models. J. Math. Phys. 39, 6175–6190 (1998) 51. Macdonald, I.G.: Orthogonal polynomials associated with root systems. Preprint, 1988; mathQA/0011046 52. Ruijsenaars, S.: Relativistic Lam´e functions: The special case g = 2. J. Phys. A 32(9), 1737–1772 (1999) 53. Komori, Y.: Essential self-adjointness of the elliptic Ruijsenaars models. J. Math. Phys. 42(9), 4523– 4553 (2001) 54. Feigin, M.V.: Multidimensional integrable Schrodinger operators. 
PhD Thesis, Moscow, 2001

Communicated by L. Takhtajan

Commun. Math. Phys. 239, 155–182 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0883-8

Communications in

Mathematical Physics

Solutions of the WDVV Equations and Integrable Hierarchies of KP Type

Henrik Aratyn¹, Johan van de Leur²

¹ Department of Physics, University of Illinois at Chicago, 845 W. Taylor St., Chicago, IL 60607-7059, USA. E-mail: [email protected]
² Mathematical Institute, University of Utrecht, P.O. Box 80010, 3508 TA Utrecht, The Netherlands. E-mail: [email protected]

Received: 17 April 2001 / Accepted: 17 February 2003
Published online: 23 June 2003 – © Springer-Verlag 2003

Abstract: We show that reductions of KP hierarchies related to the loop algebra of SLn with homogeneous gradation give solutions of the Darboux-Egoroff system of PDE’s. Using explicit dressing matrices of the Riemann-Hilbert problem generalized to include a set of commuting additional symmetries, we construct solutions of the Witten– Dijkgraaf–E. Verlinde–H. Verlinde equations.

1. Introduction

This paper deals with symmetries of integrable models and their connection to topological field theory and Frobenius manifolds. Our results establish a new and fundamental link between Darboux-Egoroff metric systems and additional symmetry flows of the integrable hierarchies of KP type. This connection goes beyond a dispersionless limit.

The description of n-orthogonal curvilinear coordinate systems in $\mathbb{R}^n$ was one of the classical puzzles of the 19th century. Well-known mathematicians such as Gauss, Lamé, Cayley and Darboux contributed to its solution. The problem is formulated as follows. Find all coordinate systems
\[
u_i = u_i(x^1, x^2, \ldots, x^n), \qquad \det\Bigl(\frac{\partial u_i}{\partial x^j}\Bigr)_{1 \le i,j \le n} \neq 0,
\]
satisfying the orthogonality condition
\[
\sum_{k=1}^{n} \frac{\partial u_i}{\partial x^k}\,\frac{\partial u_j}{\partial x^k} = 0 \quad \text{for } i \neq j.
\tag{1.1}
\]
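To make condition (1.1) concrete, the following minimal sketch tests it numerically for plane polar coordinates, a standard orthogonal system chosen here purely as an illustration. The helper names `polar`, `jacobian` and `orthogonality_defect` are hypothetical, and the Jacobian is only approximated by central differences.

```python
import math

def polar(x, y):
    """Curvilinear coordinates (u_1, u_2) = (r, theta) as functions of (x^1, x^2)."""
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian(coords, point, step=1e-6):
    """Central-difference approximation of the matrix (d u_i / d x^j)."""
    n = len(point)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        forward = list(point)
        backward = list(point)
        forward[j] += step
        backward[j] -= step
        u_plus = coords(*forward)
        u_minus = coords(*backward)
        for i in range(n):
            jac[i][j] = (u_plus[i] - u_minus[i]) / (2 * step)
    return jac

def orthogonality_defect(jac):
    """Largest |sum_k (du_i/dx^k)(du_j/dx^k)| over i != j; zero when (1.1) holds."""
    n = len(jac)
    return max(
        abs(sum(jac[i][k] * jac[j][k] for k in range(n)))
        for i in range(n) for j in range(n) if i != j
    )

point = (1.2, 0.7)  # an arbitrary point away from the origin and the branch cut
jac = jacobian(polar, point)
determinant = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
print("orthogonality defect:", orthogonality_defect(jac))
print("Jacobian determinant:", determinant)
```

The defect should vanish up to discretisation error, while the determinant stays away from zero (for polar coordinates it equals 1/r). A skew system such as (x, x + y) gives a nonzero defect.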


It was Darboux who gathered most results on this topic and published them in 1910 in his Leçons sur les systèmes orthogonaux et les coordonnées curvilignes [12]. For more historical details see the paper [30] by Zakharov. The metric tensor in $\mathbb{R}^n$ in the coordinate system $u_i$ is diagonal:
\[
ds^2 = \sum_{i=1}^{n} h_i^2(u)\,(du_i)^2, \qquad u = (u_1, \ldots, u_n),
\tag{1.2}
\]
where
\[
h_i^2(u) = \sum_{k=1}^{n} \Bigl(\frac{\partial x^k}{\partial u_i}\Bigr)^2 .
\tag{1.3}
\]
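For orientation, here is the simplest worked instance, assuming plane polar coordinates $u_1 = r$, $u_2 = \theta$ with $x^1 = r\cos\theta$, $x^2 = r\sin\theta$ (an illustrative choice, not taken from the text). The Lamé coefficients, obtained by summing the squared derivatives of the Cartesian coordinates with respect to each $u_i$, are
\[
h_1^2 = \cos^2\theta + \sin^2\theta = 1, \qquad
h_2^2 = r^2\sin^2\theta + r^2\cos^2\theta = r^2,
\]
so that the diagonal form (1.2) becomes the familiar $ds^2 = dr^2 + r^2\,d\theta^2$.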

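The flatness of the metric imposes a condition on the Lamé coefficients $h_i$ of the following form:
\[
\begin{aligned}
&\partial_k \beta_{ij}(u) = \beta_{ik}(u)\,\beta_{kj}(u), && i \neq k \neq j,\\
&\frac{\partial \beta_{ij}(u)}{\partial u_i} + \frac{\partial \beta_{ji}(u)}{\partial u_j} + \sum_{k \neq i,j} \beta_{ki}\,\beta_{kj} = 0, && i \neq j;
\end{aligned}
\tag{1.4}
\]
here $\partial_k = \partial/\partial u_k$ and this is expressed in Darboux' rotation coefficients
\[
\beta_{ij}(u) = \frac{1}{h_i(u)}\,\frac{\partial h_j(u)}{\partial u_i}.
\tag{1.5}
\]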
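As a small sanity check of (1.4) and (1.5), consider again plane polar coordinates $u_1 = r$, $u_2 = \theta$, for which $h_1 = 1$ and $h_2 = r$ (an illustrative example under this assumption). The rotation coefficients are
\[
\beta_{12} = \frac{1}{h_1}\frac{\partial h_2}{\partial u_1} = 1, \qquad
\beta_{21} = \frac{1}{h_2}\frac{\partial h_1}{\partial u_2} = 0 .
\]
For n = 2 the first equation in (1.4) is vacuous and the sum over $k \neq i, j$ in the second is empty, so flatness reduces to
\[
\frac{\partial \beta_{12}}{\partial u_1} + \frac{\partial \beta_{21}}{\partial u_2} = 0 + 0 = 0,
\]
which indeed holds. Note that here $\beta_{12} \neq \beta_{21}$, so the rotation coefficients of a flat diagonal metric need not be symmetric.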

For rotation coefficients $\beta_{ij}$ that satisfy
\[
\beta_{ij}(u) = \beta_{ji}(u),
\tag{1.6}
\]

the corresponding metric is called an Egoroff metric, in honour of D.F. Egoroff who gave its complete description [18]. B. Dubrovin noticed [15, 16] that the Eqs. (1.4) and (1.6), which he called the Darboux-Egoroff system, also describe the local classification of massive topological field theories.

An important role in the physics literature on two-dimensional field theory is played by a remarkable system of partial differential equations commonly known as the Witten-Dijkgraaf-Verlinde-Verlinde (WDVV) equations [14, 29]. These equations determine deformations of 2-dimensional topological field theories. Dubrovin [15, 16] connected the Frobenius structure to any solution of the WDVV equations, thus providing a "coordinate-free" approach to topological field theory. Frobenius manifolds are objects from differential geometry which arise in a number of different areas of mathematics, such as quantum cohomology, Gromov-Witten invariants, theory of singularities, Hurwitz spaces and Coxeter groups. One of the most interesting connections from a physics point of view is a link to integrable systems.

We obtain a class of integrable hierarchies which provides a setting for solutions to the Darboux-Egoroff metric systems. The relevant integrable models turn out to be connected with generalizations of the integrable structure related to the nonlinear Schrödinger equation. In the pseudo-differential calculus framework of the Sato theory the relevant models originate from the Lax structure given by
\[
L = \partial_x + \sum_{i=1}^{m} \Phi_i\,\partial_x^{-1}\,\Psi_i .
\tag{1.7}
\]


The corresponding Baker-Akhiezer (BA) wave function $\psi_{BA}$, which enters the linear spectral problem $L\psi_{BA} = \lambda\psi_{BA}$, is given as
\[
\psi_{BA}(t, \lambda) = \frac{\tau(t - [\lambda^{-1}])}{\tau(t)}\; e^{\sum_{j=1}^{\infty} \lambda^j t_j},
\tag{1.8}
\]
in terms of the $\tau$-function such that $\partial_x\,\partial \ln\tau/\partial t_n = \mathrm{Res}(L^n)$. In (1.8) we used the multi-time notation $(t - [\lambda^{-1}]) = (t_1 - 1/\lambda,\; t_2 - 1/(2\lambda^2), \ldots)$. For the Lax operator in (1.7) the action of the isospectral flows on the Lax operator L is given by
\[
\frac{\partial}{\partial t_n} L = [(L^n)_+, L],
\tag{1.9}
\]
where as usual $(L^n)_+$ stands for the differential part of the pseudo-differential operator $L^n$. The isospectral flows from (1.9) induce the following flows on the eigenfunctions $\Phi_i$ and adjoint eigenfunctions $\Psi_i$ with i = 1, ..., m:
\[
\frac{\partial}{\partial t_n}\Phi_i = (L^n)_+(\Phi_i); \qquad \frac{\partial}{\partial t_n}\Psi_i = -(L^n)^*_+(\Psi_i).
\tag{1.10}
\]

These models possess a rich structure of symmetries that are additional with respect to the isospectral flows. These symmetry flows will be studied within the pseudo-differential calculus framework, where they are generated by the squared eigenfunction potentials [8], and within the algebraic formalism obtained by extending the standard Riemann-Hilbert problem. In the latter formalism the relevant systems are characterized through their Lie algebraic context as integrable models arising through a generalized Drinfeld-Sokolov scheme from the sl(n) algebras with the homogeneous gradation. After further reduction to the CKP sub-hierarchy [13], which brings the conditions $\Phi_i = \Psi_i$, we obtain a model whose abelian subalgebra of additional symmetries provides canonical coordinates of the related Darboux-Egoroff systems. This connection will be used to explicitly construct solutions to the Darboux-Egoroff system and the Witten-Dijkgraaf-Verlinde-Verlinde equations in topological field theory.

In a separate publication we will address the issue of how several structures of the underlying integrable models (including Virasoro symmetry) carry over to the Darboux-Egoroff system.

In Sect. 2, we discuss the Darboux-Egoroff metrics and the underlying linear spectral system from which the Darboux-Egoroff system can be obtained via compatibility equations. Section 3 defines the constrained integrable KP hierarchy, which is the subject of our study, in terms of the Sato Grassmannian and within the framework of the pseudo-differential operator calculus. In Sect. 4, the Riemann-Hilbert problem is extended to incorporate the constrained integrable KP hierarchy augmented by additional symmetry flows. In this section we succeed in finding an explicit expression for the dressing matrix, allowing us to solve the extended Riemann-Hilbert problem in terms of the $\tau$-function, the eigenfunctions $\Phi_i$ and the adjoint eigenfunctions $\Psi_i$. Section 5 introduces the CKP reduction of the constrained integrable KP hierarchy, imposing equality of the eigenfunctions $\Phi_i$ and the adjoint eigenfunctions $\Psi_i$. Finally, in Sect. 6, we reproduce the fundamental objects defining the Darboux-Egoroff system presented in Sect. 2 in terms of the integrable structure derived in the previous sections.


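2. The Darboux-Egoroff System

We now describe in some more detail a relation between the Darboux-Egoroff system (1.4), (1.6) and the local classification of massive topological field theories [15, 16]. First of all, the flat coordinates $x^1, \ldots, x^n$ of the Darboux-Egoroff metric (1.2) can be found from the linear system
\[
\begin{aligned}
\frac{\partial^2 x^k}{\partial u_i\,\partial u_j} &= \Gamma^i_{ij}\,\frac{\partial x^k}{\partial u_i} + \Gamma^j_{ji}\,\frac{\partial x^k}{\partial u_j}, \qquad i \neq j;\\
\frac{\partial^2 x^k}{\partial u_i^2} &= \sum_{j=1}^{n} \Gamma^j_{ii}\,\frac{\partial x^k}{\partial u_j},
\end{aligned}
\tag{2.1}
\]
where $\Gamma^k_{ij}$ are the Christoffel symbols of the Levi-Civita connection:
\[
\Gamma^i_{ij} = \frac{1}{h_i}\,\frac{\partial h_i}{\partial u_j}, \qquad
\Gamma^j_{ii} = (2\delta_{ij} - 1)\,\frac{h_i}{h_j^2}\,\frac{\partial h_i}{\partial u_j}.
\tag{2.2}
\]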
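A brief worked example may help in reading the index pattern of these formulas. Assume plane polar coordinates $u_1 = r$, $u_2 = \theta$ with $h_1 = 1$, $h_2 = r$ (an illustrative choice). The only nonvanishing Christoffel symbols produced by the two formulas above are
\[
\Gamma^2_{21} = \frac{1}{h_2}\frac{\partial h_2}{\partial u_1} = \frac{1}{r}, \qquad
\Gamma^1_{22} = -\frac{h_2}{h_1^2}\frac{\partial h_2}{\partial u_1} = -r .
\]
For the flat coordinate $x^1 = r\cos\theta$ the linear system then reads
\[
\frac{\partial^2 x^1}{\partial r\,\partial\theta} = -\sin\theta = \Gamma^2_{21}\,\frac{\partial x^1}{\partial\theta}, \qquad
\frac{\partial^2 x^1}{\partial\theta^2} = -r\cos\theta = \Gamma^1_{22}\,\frac{\partial x^1}{\partial r},
\]
and both identities hold, as expected for the Cartesian coordinates of the plane.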

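For the flat coordinates $x^i$, $1 \le i \le n$, of the metric (1.2), the functions
\[
c^m_{k\ell}(x) = \sum_{i=1}^{n} \frac{\partial x^m}{\partial u_i}\,\frac{\partial u_i}{\partial x^k}\,\frac{\partial u_i}{\partial x^\ell}
\tag{2.3}
\]
form the structure constants of the commutative algebra of the primary fields $\phi_i$ of a topological field theory, i.e.
\[
\phi_k\,\phi_\ell = \sum_{m} c^m_{k\ell}\,\phi_m .
\]
Associativity of this algebra imposes on the structure constants a relation:
\[
\sum_{k=1}^{n} c^k_{ij}(x)\,c^\ell_{km}(x) = \sum_{k=1}^{n} c^k_{jm}(x)\,c^\ell_{ik}(x).
\tag{2.4}
\]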
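The associativity relation (2.4) is a finite set of quadratic identities, so it can be tested mechanically on any candidate table of structure constants. The sketch below does this with plain nested lists. The function names and both example algebras (a truncated polynomial algebra and a deliberately non-associative commutative table) are hypothetical illustrations, not objects from the paper.

```python
from itertools import product

def is_associative(c, tol=1e-12):
    """Check relation (2.4) for c[i][j][k] = c_{ij}^k:
    sum_k c_{ij}^k c_{km}^l == sum_k c_{jm}^k c_{ik}^l for all i, j, m, l."""
    n = len(c)
    for i, j, m, l in product(range(n), repeat=4):
        left = sum(c[i][j][k] * c[k][m][l] for k in range(n))   # (phi_i phi_j) phi_m
        right = sum(c[j][m][k] * c[i][k][l] for k in range(n))  # phi_i (phi_j phi_m)
        if abs(left - right) > tol:
            return False
    return True

def truncated_polynomial_algebra(n):
    """Basis phi_i = x^i (i = 0..n-1) with x^n = 0, so phi_i phi_j = phi_{i+j} or 0."""
    return [[[1.0 if i + j == k else 0.0 for k in range(n)]
             for j in range(n)] for i in range(n)]

# A commutative but non-associative table: phi_0 phi_0 = phi_1, phi_1 phi_1 = phi_0,
# and phi_0 phi_1 = 0.
broken = [[[0.0, 1.0], [0.0, 0.0]],
          [[0.0, 0.0], [1.0, 0.0]]]

print(is_associative(truncated_polynomial_algebra(3)))
print(is_associative(broken))
```

The first table passes the check, and the second fails because $(\phi_0\phi_0)\phi_1 = \phi_0$ while $\phi_0(\phi_0\phi_1) = 0$.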

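If one writes down these equations for the function F(x) for which
\[
\frac{\partial^3 F(x)}{\partial x^k\,\partial x^\ell\,\partial x^m} = c_{k\ell m}(x) = \sum_{i=1}^{n} \eta_{mi}\,c^i_{k\ell}(x),
\qquad \text{where} \qquad
\eta_{pq} = \sum_{i=1}^{n} h_i^2(u)\,\frac{\partial u_i}{\partial x^p}\,\frac{\partial u_i}{\partial x^q},
\tag{2.5}
\]
with the constraint
\[
\frac{\partial^3 F(x)}{\partial x^1\,\partial x^\ell\,\partial x^m} = \eta_{\ell m},
\]
one obtains the well-known Witten-Dijkgraaf-E. Verlinde-H. Verlinde (WDVV) equations [29, 14] for the prepotential F(x). In fact, Dubrovin showed [15] that one can find these $c_{k\ell m}(x)$ and $\eta_{pq}$ of (2.5) as follows. The Darboux-Egoroff system can be represented as the compatibility equations of the following linear system depending on a spectral parameter $\lambda$:
\[
\begin{aligned}
\frac{\partial \psi_{ik}(u, \lambda)}{\partial u_j} &= \beta_{ij}(u)\,\psi_{jk}(u, \lambda), \qquad i \neq j,\\
\sum_{k=1}^{n} \frac{\partial \psi_{ij}(u, \lambda)}{\partial u_k} &= \lambda\,\psi_{ij}(u, \lambda).
\end{aligned}
\tag{2.6}
\]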


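Solving this system for $\lambda = 0$, i.e., finding $\psi_{ij}(u)$ that satisfy
\[
\begin{aligned}
\frac{\partial \psi_{ik}(u)}{\partial u_j} &= \beta_{ij}(u)\,\psi_{jk}(u), \qquad i \neq j,\\
\sum_{k=1}^{n} \frac{\partial \psi_{ij}(u)}{\partial u_k} &= 0,
\end{aligned}
\tag{2.7}
\]
for a given solution $\beta_{ij}(u)$ of the Darboux-Egoroff system, leads to "a local classification of complex semisimple Frobenius manifolds":

Proposition 2.1 ([15]). On the domain $u_i \neq u_j$ and $\psi_{11}\psi_{21}\cdots\psi_{n1} \neq 0$, one has
\[
\begin{aligned}
h_i(u) &= \psi_{i1}(u),\\
\eta_{\alpha\beta} &= \sum_{i=1}^{n} \psi_{i\alpha}(u)\,\psi_{i\beta}(u),\\
\eta_{\alpha\beta}\,\frac{\partial x^\beta(u)}{\partial u_i} &= \psi_{i1}(u)\,\psi_{i\alpha}(u) \quad \text{(summation over the repeated index } \beta\text{)},\\
c_{\alpha\beta\gamma}(x(u)) &= \sum_{i=1}^{n} \frac{\psi_{i\alpha}(u)\,\psi_{i\beta}(u)\,\psi_{i\gamma}(u)}{\psi_{i1}(u)}.
\end{aligned}
\tag{2.8}
\]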
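The algebraic content of these formulas can be illustrated at a single point u. In the sketch below the matrix $\psi$ is simply taken to be a 2×2 rotation matrix, which is an assumption made for illustration only (it corresponds to vanishing rotation coefficients). The helper `frobenius_data` is a hypothetical name. The code evaluates the formulas for $\eta_{\alpha\beta}$ and $c_{\alpha\beta\gamma}$ and confirms that $c_{1\beta\gamma} = \eta_{\beta\gamma}$, in line with the constraint on the third derivatives of F with one index equal to 1.

```python
import math

def frobenius_data(psi):
    """Evaluate eta_{ab} = sum_i psi_{ia} psi_{ib} and
    c_{abg} = sum_i psi_{ia} psi_{ib} psi_{ig} / psi_{i1} from a matrix psi.
    Index 0 in the code plays the role of the distinguished index 1 in the text."""
    n = len(psi)
    eta = [[sum(psi[i][a] * psi[i][b] for i in range(n))
            for b in range(n)] for a in range(n)]
    c = [[[sum(psi[i][a] * psi[i][b] * psi[i][g] / psi[i][0] for i in range(n))
           for g in range(n)] for b in range(n)] for a in range(n)]
    return eta, c

angle = 0.4  # any angle with cos and sin both nonzero keeps psi_{i1} != 0
psi = [[math.cos(angle), -math.sin(angle)],
       [math.sin(angle), math.cos(angle)]]
eta, c = frobenius_data(psi)

print("eta =", eta)
print("c_{1 b g} =", c[0])
```

Because the sample matrix is orthogonal, $\eta$ comes out as the identity matrix, and the slice `c[0]` coincides with it.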

We refer the reader to [15] or [16] for the definition of a Frobenius manifold. This definition is different from the one given in e.g. [17]. We do not assume the quasihomogeneity condition here. Note that this system (2.8) is not unique. For any given solution $\beta_{ij}(u)$ of the Darboux-Egoroff system, there exists an n-parameter family of Lamé coefficients and of Egoroff metrics.

Assume for simplicity that $\eta_{\alpha\beta} = \delta_{\alpha\beta}$, so $c_{k\ell m} = c^m_{k\ell}$. Following Akhmetshin, Krichever and Volvovski [2] (see also [24] and [25]), we can even construct the prepotential F(x). There exist n unique solutions $\psi_{ij}(u, \lambda)$, $1 \le j \le n$, of (2.6) with initial conditions $\psi_{ij}(0, \lambda) = \frac{\partial x^j}{\partial u_i}(0)$, such that $\sum_{j=1}^{n} \frac{\partial x^j}{\partial u_i}(0)\,\frac{\partial x^j}{\partial u_k}(0) = \delta_{ik}$, $x^j(0) = 0$ and $\psi_{i1}(u, 0)\,\psi_{ij}(u, \lambda)$ has the expansion
\[
\psi_{i1}(u, 0)\,\psi_{ij}(u, \lambda) = \sum_{k=0}^{\infty} \frac{\partial \xi^j_k(u)}{\partial u_i}\,\lambda^k, \qquad \text{with } \xi^j_0(u) = x^j(u).
\]

Definition 2.1. The matrix $\Xi$ and row-vector $\Theta$ are defined as:
\[
\Xi(u, \lambda) = \bigl(\psi_{i1}(u, 0)\,\psi_{ij}(u, \lambda)\bigr)_{ij}, \qquad
\lambda\,\Theta(u, \lambda) = \Bigl(\sum_{i=1}^{n} \psi_{i1}(u, 0)\,\psi_{ij}(u, \lambda)\Bigr)_{j}.
\]

From Eqs. (2.6) and (2.7) one deduces
\[
\frac{\partial \Theta_j(u, \lambda)}{\partial u_i} = \Xi(u, \lambda)_{ij},
\]


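and the row-vector $\Theta$ of Definition 2.1 satisfies
\[
\begin{aligned}
\frac{\partial^2 \Theta(u, \lambda)}{\partial u_i\,\partial u_j} &= \Gamma^i_{ij}(u)\,\frac{\partial \Theta(u, \lambda)}{\partial u_i} + \Gamma^j_{ji}(u)\,\frac{\partial \Theta(u, \lambda)}{\partial u_j}, \qquad i \neq j,\\
\frac{\partial^2 \Theta(u, \lambda)}{\partial (u_i)^2} &= \sum_{j=1}^{n} \Gamma^j_{ii}(u)\,\frac{\partial \Theta(u, \lambda)}{\partial u_j} + \lambda\,\frac{\partial \Theta(u, \lambda)}{\partial u_i},
\end{aligned}
\tag{2.9}
\]
where the Christoffel symbols are given by (2.2), and hence that
\[
\frac{\partial^2 \Theta(u, \lambda)}{\partial x^k\,\partial x^\ell} = \lambda \sum_{m=1}^{n} c^m_{k\ell}(u)\,\frac{\partial \Theta(u, \lambda)}{\partial x^m}.
\tag{2.10}
\]
Thus $\Theta$ is the generating series for the flat sections of the connection $\nabla_k = \frac{\partial}{\partial x^k} - \lambda\,c^m_{k\ell}$:
\[
\Theta_j = \delta_{j1}\,\lambda^{-1} + x^j(u) + \sum_{i=1}^{\infty} \xi^j_i\,\lambda^i .
\]
Moreover, with $\Xi$ the matrix of Definition 2.1,
\[
\Xi(u, \lambda)\,\Xi(u, -\lambda)^T = \sum_{i=1}^{n} h_i(u)^2 E_{ii}, \qquad
\lambda\,\Theta(u, \lambda)\,\Xi(u, -\lambda)^T = \bigl(h_1^2(u)\;\; h_2^2(u)\;\; \ldots\;\; h_n^2(u)\bigr).
\tag{2.11}
\]
Since $\frac{\partial \Theta(u, \lambda)}{\partial x^i}$ is a linear combination of the $\frac{\partial \Theta(u, \lambda)}{\partial u_k}$'s,
\[
\lambda\,\Theta(u, \lambda)\,\frac{\partial \Theta(u, -\lambda)^T}{\partial x^k}
\]
is independent of $\lambda$, which means that all coefficients, except the constant coefficient, are zero. In particular the coefficient of $\lambda^2$ gives:
\[
\xi^m_1(u) = -\frac{\partial \xi^1_2(u)}{\partial x^m} + \sum_{i=1}^{n} x^i(u)\,\frac{\partial \xi^i_1(u)}{\partial x^m}.
\]
The coefficient of $\lambda$ of (2.10) leads to
\[
\frac{\partial^2 \xi^m_1(u)}{\partial x^k\,\partial x^\ell} = c^m_{k\ell}(u),
\]
hence $\frac{\partial F(u)}{\partial x^m} = \xi^m_1(u)$ and we obtain the Theorem of [2] (see also [25]).

Theorem 2.1. The function F(u) = F(x(u)) defined by
\[
F(u) = -\frac{1}{2}\,\xi^1_2(u) + \frac{1}{2}\sum_{i=1}^{n} x^i(u)\,\xi^i_1(u)
\]
satisfies Eq. (2.5).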
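The step from the formula for F to the identity $\partial F/\partial x^m = \xi^m_1$ can be spelled out as follows (a sketch, with $\xi^j_k$ denoting the coefficients of the $\lambda$-expansion introduced above and using only the relation obtained from the coefficient of $\lambda^2$). Differentiating F with respect to the flat coordinate $x^m$ gives
\[
\frac{\partial F}{\partial x^m}
= -\frac{1}{2}\frac{\partial \xi^1_2}{\partial x^m}
+ \frac{1}{2}\,\xi^m_1
+ \frac{1}{2}\sum_{i=1}^{n} x^i\,\frac{\partial \xi^i_1}{\partial x^m}
= \frac{1}{2}\,\xi^m_1 + \frac{1}{2}\Bigl(-\frac{\partial \xi^1_2}{\partial x^m} + \sum_{i=1}^{n} x^i\,\frac{\partial \xi^i_1}{\partial x^m}\Bigr)
= \xi^m_1 ,
\]
where the bracket equals $\xi^m_1$ by the $\lambda^2$ relation. Two further derivatives then reproduce the structure constants, $\partial^3 F/\partial x^k\partial x^\ell\partial x^m = c^m_{k\ell}$, which is the first equality of (2.5) in the case $\eta_{\alpha\beta} = \delta_{\alpha\beta}$.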


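3. The Sato Grassmannian and the Constrained KP Hierarchy

In this section we describe the Sato Grassmannian. Consider the spaces
\[
H_- = \lambda^{-1}\mathbb{C}[[\lambda^{-1}]] = \Bigl\{\sum_{j=1}^{\infty} a_j\lambda^{-j} \;\Big|\; a_j \in \mathbb{C}\Bigr\}
\quad \text{and} \quad
H_+ = \mathbb{C}[\lambda] = \Bigl\{\sum_{i=0}^{k} b_i\lambda^{i} \;\Big|\; b_i \in \mathbb{C}\Bigr\}.
\]
Hence $H = H_+ \oplus H_-$ is the quotient field $\mathbb{C}((\lambda^{-1}))$ of $\mathbb{C}[[\lambda^{-1}]]$. On the space H we have a bilinear form, viz. if $f(\lambda) = \sum_j a_j\lambda^j$ and $g(\lambda) = \sum_j b_j\lambda^j$ are in H, then we define
\[
(f(\lambda), g(\lambda)) = \mathrm{Res}_\lambda\, f(\lambda)\,g(\lambda) = \sum_{j} a_j\,b_{-j-1}.
\tag{3.1}
\]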
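The residue pairing (3.1) is easy to experiment with on finite Laurent polynomials. The sketch below stores a series as a dictionary from exponents to coefficients, which is a representation chosen here for illustration only, and `residue_pairing` is a hypothetical helper name. It shows that $\lambda^j$ pairs nontrivially only with $\lambda^{-j-1}$, so that $H_+$ pairs to zero with itself.

```python
def residue_pairing(f, g):
    """Residue pairing (f, g) = sum_j a_j * b_{-j-1} for finite Laurent polynomials.

    f and g map integer exponents j to the coefficients a_j and b_j."""
    return sum(a * g.get(-j - 1, 0) for j, a in f.items())

# lambda^2 pairs with lambda^{-3}: the product is lambda^{-1}, whose residue is 1.
print(residue_pairing({2: 3}, {-3: 5}))

# Two elements of H_+ (only non-negative powers) always pair to zero.
print(residue_pairing({0: 1, 1: 2}, {0: 4, 3: 1}))
```

The form is symmetric, since the coefficient of $\lambda^{-1}$ in the product f g does not depend on the order of the factors.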

Let $p_+ : H \to H_+$ be the projection
\[
p_+\Bigl(\sum_{j} a_j\lambda^j\Bigr) = \sum_{j \ge 0} a_j\lambda^j .
\]
Then the Sato Grassmannian Gr(H) consists of all linear subspaces $W \subset H$ that are of a size comparable to $H_+$, i.e.,
\[
Gr(H) = \bigl\{ W \subset H \;\big|\; p_+ : W \to H_+ \text{ has a finite dimensional kernel and cokernel} \bigr\}.
\]
The space Gr(H) has a subdivision into different components:
\[
Gr^{(k)}(H) = \{ W \in Gr(H) \mid \dim(\mathrm{Coker}(p_+|_W)) - \dim(\mathrm{Ker}(p_+|_W)) = k \}.
\]
Clearly, the subspace $\lambda^k H_+$ belongs to $Gr^{(k)}(H)$ and one easily verifies that this also holds for all subspaces in Gr(H) that project bijectively onto $\lambda^k H_+$, i.e. all W belonging to the "big cell". For $W \in Gr(H)$, let $W^\perp$ be the orthocomplement of W in H w.r.t. the bilinear form (3.1). Then, with the above given description, $W^\perp$ also belongs to Gr(H), see [20].

We write $x = t_1$, $t = (t_1, t_2, t_3, \ldots)$ and $\hat t = (t_2, t_3, t_4, \ldots)$. Consider now a wave function $\psi_{BA}$ of the KP-hierarchy and its dual $\psi^*_{BA}$ with
\[
\psi_{BA}(x, \hat t, \lambda) = \Bigl\{\sum_{j \le 0} a_j(x, \hat t)\,\lambda^j\Bigr\}\,\lambda^{l}\, e^{x\lambda + \sum_{i>1} t_i\lambda^i}
\quad \text{and} \quad
\psi^*_{BA}(x, \hat t, \lambda) = \Bigl\{\sum_{m \le 0} b_m(x, \hat t)\,\lambda^m\Bigr\}\,\lambda^{-l}\, e^{-x\lambda - \sum_{i>1} t_i\lambda^i}.
\]
We assume from now on in this section that there exists an $\alpha(x, \hat t) \in \mathbb{C}[[x, \hat t]]$ of the form
\[
\alpha(x, 0) = x^N + \sum_{j > N} a_j x^j,
\tag{3.2}
\]
such that for all $m \le 0$ and all $j \le 0$,
\[
\alpha(x, \hat t)\,a_j(x, \hat t) \in \mathbb{C}[[x, \hat t]] \quad \text{and} \quad \alpha(x, \hat t)\,b_m(x, \hat t) \in \mathbb{C}[[x, \hat t]].
\tag{3.3}
\]


Note that throughout this section $x^j$ stands for the j-th power of x and not for the flat coordinate $x^j$ of Sect. 2. Such wave functions are called regularizable, see [28, 20]. For regularizable wave functions the Laurent series in x of $\psi_{BA}$ and $\psi^*_{BA}$ have the form
\[
\psi_{BA}(x, \hat t, \lambda) = \sum_{j \ge -N} w_j(\hat t, \lambda)\,x^j\, e^{x\lambda + \sum_{i=2}^{\infty} t_i\lambda^i},
\quad \text{where } w_j(\hat t, \lambda) = \sum_{l=-\infty}^{N_1} v_l\lambda^l, \text{ with } v_l \in \mathbb{C}[[\hat t]],
\]
\[
\psi^*_{BA}(x, \hat t, \lambda) = \sum_{j \ge -N} w^*_j(\hat t, \lambda)\,x^j\, e^{-x\lambda - \sum_{i=2}^{\infty} t_i\lambda^i},
\quad \text{where } w^*_j(\hat t, \lambda) = \sum_{l=-\infty}^{N_2} v^*_l\lambda^l, \text{ with } v^*_l \in \mathbb{C}[[\hat t]],
\]
and moreover $W = \mathrm{Span}\{w_j(0, \lambda),\; j \ge -N\}$ and $W^* = \mathrm{Span}\{w^*_j(0, \lambda),\; j \ge -N\}$ belong to Gr(H). It was Sato who realized that the space W determines $\psi_{BA}$, for according to [28] there holds

Proposition 3.1. The map that associates to a regularizable wave function $\psi_{BA}$ of the KP-hierarchy the span of the coefficients at $\hat t = 0$ of the Laurent series of $\psi_{BA}$ in x is a bijection between this class of wave functions and Gr(H). The wave functions that satisfy the conditions in (3.3) for $\alpha = 1$ correspond to the big cell.

For each $W \in Gr(H)$ we denote the wave function corresponding to W by $\psi_W$. The dual wave function of $\psi_W$, which we denote by $\psi^*_W$, can be characterized as follows [28, 20]:

Proposition 3.2. Let W and $\tilde W$ be two subspaces in Gr(H). Then $\tilde W$ is the space $W^*$ corresponding to the dual wave function, if and only if $\tilde W = W^\perp$ with $W^\perp$ the orthocomplement of W w.r.t. the bilinear form (3.1) on H. Moreover
\[
\bigl(\psi_W(t, \lambda), \psi^*_W(s, \lambda)\bigr) = 0.
\]

Let $W \in Gr^{(k)}(H)$; then
\[
\psi_W(t, \lambda) = P_W(t, \partial_x)\, e^{\sum_{j=1}^{\infty} t_j\lambda^j}, \qquad
\psi^*_W(t, \lambda) = P_W^{*-1}(t, \partial_x)\, e^{-\sum_{j=1}^{\infty} t_j\lambda^j},
\]
where $P_W(t, \partial_x)$ is a k-th order pseudo-differential operator. The corresponding KP Lax operator $L_W$ is equal to
\[
L_W(t, \partial_x) = P_W(t, \partial_x)\,\partial_x\,P_W^{-1}(t, \partial_x).
\tag{3.4}
\]

From now on we will use the notation ψW and LW instead of ψBA and L whenever we want to emphasize its dependence on a point W of the Sato Grassmannian Gr(H ).


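Eigenfunctions and adjoint eigenfunctions of the KP Lax operator (3.4), see (1.10), can be expressed in wave and adjoint wave functions, viz. there exist functions $f, g \in H$ such that
\[
\Phi(t) = (\psi_W(t, \lambda), f(\lambda)), \qquad \Psi(t) = (\psi^*_W(t, \lambda), g(\lambda)).
\tag{3.5}
\]
Such (adjoint) eigenfunctions induce elementary Bäcklund–Darboux transformations [20]. Assume that we have the following data: $W \in Gr^{(k)}(H)$, $W^\perp$, $\psi_W(t, \lambda)$ and $\psi^*_W(t, \lambda)$; then the (adjoint) eigenfunctions (3.5) induce new KP wave functions:
\[
\begin{aligned}
\psi_{W'}(t, \lambda) &= \Phi(t)\,\partial_x\,\Phi(t)^{-1}\,\psi_W(t, \lambda), &
\psi^*_{W'}(t, \lambda) &= \bigl(\Phi(t)\,\partial_x\,\Phi(t)^{-1}\bigr)^{*-1}\psi^*_W(t, \lambda),\\
\psi_{W''}(t, \lambda) &= -\bigl(\Psi(t)\,\partial_x\,\Psi(t)^{-1}\bigr)^{*-1}\psi_W(t, \lambda), &
\psi^*_{W''}(t, \lambda) &= -\Psi(t)\,\partial_x\,\Psi(t)^{-1}\,\psi^*_W(t, \lambda),
\end{aligned}
\tag{3.6}
\]
where
\[
\begin{aligned}
W' &= \{w \in W \mid (w(\lambda), f(\lambda)) = 0\} \in Gr^{(k+1)}(H), & W'^{\perp} &= W^\perp + \mathbb{C}f,\\
W'' &= W + \mathbb{C}g \in Gr^{(k-1)}(H), & W''^{\perp} &= \{w \in W^\perp \mid (w(\lambda), g(\lambda)) = 0\}.
\end{aligned}
\tag{3.7}
\]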

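Now assume that we have a Lax operator $L_W$ with $W \in Gr^{(k)}(H)$ of the form (1.7), (1.10), with m minimal. Then
\[
\begin{aligned}
\psi_{\lambda W} = \lambda\psi_W = L_W\psi_W
&= \Bigl(\partial_x + \sum_{i=1}^{m} \Phi_i\,\partial_x^{-1}\,\Psi_i\Bigr)\psi_W\\
&= \partial_x(\psi_W) + \sum_{i=1}^{m} \Phi_i\Psi_i\,\bigl(-\Psi_i\,\partial_x\,\Psi_i^{-1}\bigr)^{*-1}\psi_W\\
&= \partial_x(\psi_W) + \sum_{i=1}^{m} \Phi_i\Psi_i\,\psi_{W_i},
\end{aligned}
\]
where $W \subset W_i$ of codimension 1. Hence there exists a $W' = \sum_{i=1}^{m} W_i = W + \lambda W$ such that
\[
\begin{aligned}
&W \in Gr^{(k)}(H), && W \subset W' \quad (\text{codimension } m),\\
&W' = W + \lambda W \in Gr^{(k-m)}(H), && \lambda W \subset W' \quad (\text{codimension } m + 1).
\end{aligned}
\tag{3.8}
\]
The converse is also true. If $L_W$ is a Lax operator such that W satisfies (3.8), then there exists an inverse Bäcklund–Darboux transformation, i.e. the inverse of an m-th order differential operator $L_m$, mapping W into $W'$, and an (m + 1)-th order Bäcklund–Darboux transformation $L_{m+1}$ mapping $W'$ into $\lambda W$:
\[
L_m^{-1}\psi_W = \psi_{W'}, \qquad L_{m+1}\psi_{W'} = \psi_{\lambda W} = \lambda\psi_W .
\tag{3.9}
\]
Hence,
\[
L_W = L_{m+1}L_m^{-1}
\tag{3.10}
\]
is a first order pseudo-differential operator. Before we continue, we first state a small lemma which will be important later on: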


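Lemma 3.1. There exist m independent functions $v_i \in W'$, $v_i \notin W$, respectively $u_i \in W^\perp$, $u_i \notin W'^{\perp}$, and m + 1 independent functions $f_j \in \lambda^{-1}W^\perp$, $f_j \notin W'^{\perp}$, respectively $g_j \in \lambda^{-1}W'$, $g_j \notin W$, such that the functions $\varphi_i$ span $\mathrm{Ker}(L_m)$, $\psi_i$ span $\mathrm{Ker}(L_m^*)$, $\bar\varphi_j$ span $\mathrm{Ker}(L_{m+1})$ and $\bar\psi_j$ span $\mathrm{Ker}(L_{m+1}^*)$, for $1 \le i \le m$, $1 \le j \le m + 1$, where
\[
\begin{aligned}
\varphi_i(t) &= (\psi_{W'}(t, \lambda), u_i(\lambda)), & \psi_i(t) &= (\psi^*_W(t, \lambda), v_i(\lambda)),\\
\bar\varphi_j(t) &= (\psi_{W'}(t, \lambda), f_j(\lambda)), & \bar\psi_j(t) &= (\psi^*_W(t, \lambda), g_j(\lambda)).
\end{aligned}
\]

Proof. This follows from (3.6)–(3.8) and the observation that $(\lambda W)^\perp = \lambda^{-1}W^\perp$.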

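We proceed to construct $L_W$. Choose independent vectors $v_i \in W' = W + \lambda W$, as in Lemma 3.1, such that
\[
W'^{\perp} = \{w \in W^\perp \mid (w(\lambda), v_j(\lambda)) = 0 \text{ for all } 1 \le j \le m\}.
\]
Since $L_W$ can be obtained by two Bäcklund–Darboux transformations $L_m$ and $L_{m+1}$ such that (3.10) holds, the general theory of such Bäcklund–Darboux transformations [20] shows that
\[
(L_W)_- = \sum_{j=1}^{m} a_j(t)\,\partial_x^{-1}\,(\psi^*_W(t, \lambda), v_j(\lambda)).
\]
Thus
\[
\lambda\psi_W(t, \lambda) = \partial_x(\psi_W(t, \lambda)) + \sum_{j=1}^{m} a_j\,(\psi^*_W(t, \lambda), v_j)\,\psi_{W + \mathbb{C}v_j}(t, \lambda).
\]
Let now $U_j = W + \mathbb{C}v_1 + \mathbb{C}v_2 + \cdots + \mathbb{C}v_{j-1} + \mathbb{C}v_{j+1} + \cdots + \mathbb{C}v_m$. Choose $u_j \in W^\perp$ such that $U_j = \{w \in W' \mid (w(\lambda), u_j(\lambda)) = 0\}$. Then
\[
\begin{aligned}
(\lambda\psi_W(t, \lambda), u_i(\lambda)) &= (\lambda\psi_W(t, \lambda) - \partial_x(\psi_W(t, \lambda)), u_i(\lambda))\\
&= \sum_{j=1}^{m} a_j\,(\psi^*_W(t, \lambda), v_j(\lambda))\,(\psi_{W + \mathbb{C}v_j}(t, \lambda), u_i(\lambda))\\
&= a_i\,(\psi^*_W(t, \lambda), v_i(\lambda))\,(\psi_{W + \mathbb{C}v_i}(t, \lambda), u_i(\lambda)),
\end{aligned}
\]
from which we can deduce that
\[
a_i = \frac{(\lambda\psi_W(t, \lambda), u_i(\lambda))}{(\psi^*_W(t, \lambda), v_i(\lambda))\,(\psi_{W + \mathbb{C}v_i}(t, \lambda), u_i(\lambda))}.
\]
Since eigenfunctions which produce elementary Bäcklund–Darboux transformations are unique up to a scalar factor [20, 22],
\[
\begin{aligned}
\psi_{W + \mathbb{C}v_i}(t, \lambda) &= (\psi^*_W(t, \lambda), v_i(\lambda))^{-1}\,\partial_x^{-1}\,(\psi^*_W(t, \lambda), v_i(\lambda))\,\psi_W(t, \lambda),\\
\psi_W(t, \lambda) &= (\psi_{W + \mathbb{C}v_i}(t, \lambda), u_i(\lambda))\,\partial_x\,(\psi_{W + \mathbb{C}v_i}(t, \lambda), u_i(\lambda))^{-1}\,\psi_{W + \mathbb{C}v_i}(t, \lambda)
\end{aligned}
\]
and
\[
\frac{\partial}{\partial t_n}(\psi^*_W(t, \lambda), v_i(\lambda))^{-1} = \bigl(L^n_{W + \mathbb{C}v_i}\bigr)_+\bigl((\psi^*_W(t, \lambda), v_i(\lambda))^{-1}\bigr),
\]
we find that
\[
(\psi^*_W(t, \lambda), v_i(\lambda))^{-1} = c_i\,(\psi_{W + \mathbb{C}v_i}(t, \lambda), u_i(\lambda))
\]
and hence
\[
a_i = d_i\,(\lambda\psi_W(t, \lambda), u_i(\lambda)).
\]
Now replace $d_i u_i$ by $u_i$ and we obtain that
\[
(L_W)_- = \sum_{i=1}^{m} (\lambda\psi_W(t, \lambda), u_i(\lambda))\,\partial_x^{-1}\,(\psi^*_W(t, \lambda), v_i(\lambda))
= \sum_{i=1}^{m} (\psi_W(t, \lambda), \lambda u_i(\lambda))\,\partial_x^{-1}\,(\psi^*_W(t, \lambda), v_i(\lambda)).
\]
With these choices, we set
\[
\Phi_i(t) = (\psi_W(t, \lambda), \lambda u_i(\lambda)), \qquad \Psi_i(t) = (\psi^*_W(t, \lambda), v_i(\lambda)),
\tag{3.11}
\]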

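and thus one obtains the desired Lax operator of the form given in (1.7). Notice that $(v_j(\lambda), u_i(\lambda)) = 0$ for $i \neq j$ and $(v_i(\lambda), u_i(\lambda)) \neq 0$. This construction for the Segal-Wilson Grassmannian was given in [21], see also [22].

For the Baker-Akhiezer wave function $\psi_{BA}$ the linear spectral problem $L\psi_{BA} = \lambda\psi_{BA}$ can be decomposed into a set of differential equations:
\[
\partial_x\psi_{BA}(t, \lambda) + \sum_{i=1}^{m} \Phi_i\,S_i(t, \lambda) = \lambda\psi_{BA}(t, \lambda); \qquad
\partial_x S_i(t, \lambda) = \Psi_i(t)\,\psi_{BA}(t, \lambda).
\tag{3.12}
\]
Similarly, we introduce the conjugated linear problem for $L^* = -\partial_x - \sum_{i=1}^{m} \Psi_i\,\partial_x^{-1}\,\Phi_i$. For the conjugated Baker-Akhiezer wave function
\[
\psi^*_{BA}(t, \lambda) = \frac{\tau(t + [\lambda^{-1}])}{\tau(t)}\; e^{-\sum_{j=1}^{\infty} \lambda^j t_j},
\tag{3.13}
\]
the conjugated spectral problem $L^*(\psi^*_{BA}) = \lambda\psi^*_{BA}$ can be rewritten as:
\[
\partial_x S^*_i(t, \lambda) = \Phi_i(t)\,\psi^*_{BA}(t, \lambda); \qquad
-\partial_x\psi^*_{BA}(t, \lambda) - \sum_{i=1}^{m} \Psi_i\,S^*_i(t, \lambda) = \lambda\psi^*_{BA}(t, \lambda).
\tag{3.14}
\]
Recall from [8, 20] that in the Sato formalism the squared eigenfunction potentials $S_i(t, \lambda)$ and $S^*_i(t, \lambda)$ with i = 1, ..., m are given by:
\[
S_i(t, \lambda) = \frac{1}{\lambda}\,\Psi_i(t - [\lambda^{-1}])\,\frac{\tau(t - [\lambda^{-1}])}{\tau(t)}\; e^{\sum_{j=1}^{\infty} \lambda^j t_j},
\tag{3.15}
\]
\[
S^*_i(t, \lambda) = -\frac{1}{\lambda}\,\Phi_i(t + [\lambda^{-1}])\,\frac{\tau(t + [\lambda^{-1}])}{\tau(t)}\; e^{-\sum_{j=1}^{\infty} \lambda^j t_j}.
\tag{3.16}
\]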

The Lax operator formalism possesses additional commuting symmetry flows [19]:


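Definition 3.1. Mutually commuting additional symmetry flows of the integrable hierarchy defined in terms of the pseudo-differential calculus are defined as:

\[
\partial_{k,n} L = [M_k^{(n)}, L]; \qquad M_k^{(n)} = \sum_{l=0}^{n-1} L^{l}(\Phi_k)\,\partial_x^{-1}\,L^{*\,n-1-l}(\Psi_k), \tag{3.17}
\]

for n = 1, 2, . . . and k = 1, . . ., m. These flows induce:

\[
\partial_{k,n} L^{r}(\Phi_i) = \sum_{l=0}^{n-1} L^{l}(\Phi_k)\,\partial_x^{-1}\big(L^{*\,n-1-l}(\Psi_k)\,L^{r}(\Phi_i)\big) - L^{r+n}(\Phi_i)\,\delta_{ki}, \tag{3.18}
\]

\[
\partial_{k,n} L^{*\,r}(\Psi_i) = \sum_{l=0}^{n-1} L^{*\,l}(\Psi_k)\,\partial_x^{-1}\big(L^{n-1-l}(\Phi_k)\,L^{*\,r}(\Psi_i)\big) + L^{*\,r+n}(\Psi_i)\,\delta_{ki}. \tag{3.19}
\]

Especially, for n = 1 we get:

\[
\partial_{k,1} L^{r}(\Phi_i) = \Phi_k\,\beta_{ki}^{(r)} - L^{r+1}(\Phi_k)\,\delta_{ik}; \qquad
\partial_{k,1} L^{*\,r}(\Psi_i) = \Psi_k\,\beta_{ik}^{*\,(r)} + L^{*\,r+1}(\Psi_k)\,\delta_{ik}, \tag{3.20}
\]

where

\[
\beta_{ij}^{(k)} \equiv \partial_x^{-1}\big(L^{k}(\Phi_j)\,\Psi_i\big); \qquad \beta_{ij}^{*\,(k)} \equiv \partial_x^{-1}\big(\Phi_j\,L^{*\,k}(\Psi_i)\big) \tag{3.21}
\]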

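with notation that for k = 0 we write $\beta_{ij} = \beta_{ij}^{(0)}$. Due to $\sum_{k=1}^{m} M_k^{(n)} = (L^n)_-$ we have

\[
\Big(\frac{\partial}{\partial t_n} + \sum_{k=1}^{m} \partial_{k,n}\Big) L = [L^n, L] = 0 \tag{3.22}
\]

or

\[
\sum_{k=1}^{m+1} \partial_{k,n} L = 0 \quad\text{with}\quad \partial_{m+1,n} \equiv \frac{\partial}{\partial t_n}. \tag{3.23}
\]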
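The step leading to (3.22) can be spelled out in one line. The sketch below assumes that the isospectral flows are given by the standard Lax equation $\partial L/\partial t_n = [(L^n)_+, L]$ (presumably the content of Eq. (1.9), which is not reproduced in this section) and uses only Definition 3.1 and the splitting $L^n = (L^n)_+ + (L^n)_-$.

```latex
\begin{align*}
\Big(\frac{\partial}{\partial t_n} + \sum_{k=1}^{m}\partial_{k,n}\Big) L
  &= [(L^n)_+, L] + \sum_{k=1}^{m} [M_k^{(n)}, L] \\
  &= [(L^n)_+ + (L^n)_-,\, L] \\
  &= [L^n, L] = 0 .
\end{align*}
```

In words: the additional symmetry flows together generate the negative part of $L^n$, so adding them to the $n$-th isospectral flow produces the commutator of $L$ with a power of itself, which vanishes.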

Accordingly, it holds that:

\[
\sum_{i=1}^{m+1} \partial_{i,n}\Phi_j = 0, \qquad \sum_{i=1}^{m+1} \partial_{i,n}\Psi_j = 0; \qquad j = 1, \ldots, m. \tag{3.24}
\]

We now extend Definition 3.1 to

\[
\partial_{k,n} L^{-1} = [M_k^{(n)}, L^{-1}] . \tag{3.25}
\]

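Recall, from (3.10) and e.g. [4, 3], that $L^{-1} = \sum_{j=1}^{m+1} L_m(\bar\varphi_j)\,\partial_x^{-1}\,\bar\psi_j$ in terms of $\{\bar\varphi_j\}_{j=1}^{m+1}$ and $\{\bar\psi_j\}_{j=1}^{m+1}$ in $\mathrm{Ker}(L_{m+1})$ and $\mathrm{Ker}(L^*_{m+1})$.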


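We introduce notation:

\[
\Phi_j^{(-n)} = L^{-n+1}\big(L_m(\bar\varphi_j)\big), \qquad \Psi_j^{(-n)} = L^{*\,-n+1}\big(\bar\psi_j\big), \qquad j = 1, \ldots, m+1, \quad n = 1, \ldots . \tag{3.26}
\]

Clearly, $\Phi_j^{(-1)} = L_m(\bar\varphi_j)$, $L(\Phi_j^{(-1)}) = 0$ and $L^*(\Psi_j^{(-1)}) = 0$. As a result of the definition (3.25) (with n = 1) we obtain:

\[
\partial_{i,1}\Phi_j^{(-n)} = \Phi_i\,\partial_x^{-1}\big(\Psi_i\,\Phi_j^{(-n)}\big), \qquad
\partial_{i,1}\Psi_j^{(-n)} = -\Psi_i\,\partial_x^{-1}\big(\Phi_i\,\Psi_j^{(-n)}\big), \tag{3.27}
\]

for i = 1, . . ., m, j = 1, . . ., m + 1.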

Definition 3.2. Define the (m + 1) × (m + 1) matrix M = (Mij )1≤i,j ≤m+1 by Mm+1 j =

∞

(−n)

λn−1 j

,

Mij = ∂x−1 i Mm+1 j ,

n=1

i = 1, . . ., m, j = 1, . . ., m + 1. (−1)

As pointed out in [3], due to the fact that L(j

(3.28) ) = 0, Mm+1 j ’s satisfy

L(Mm+1 j ) = λMm+1 j ,

(3.29)

and in view of (3.27) this can be rewritten as ∂x +

m

∂i,1 Mm+1 j = λMm+1 j .

(3.30)

i = k = 1, . . ., m ∂x−1 k j Mij for i Mij for k = m + 1, i = 1, . . ., m

(3.31)

i=1

It follows that ∂i,1 Mkj =

for arbitrary j = 1, . . ., m + 1. Furthermore, due to (3.28) and (3.30) we find: ∂x Mij = i Mm+1 j , i = 1, . . ., m, m i Mij = 0, j = 1, . . ., m + 1 . (∂x − λ) Mm+1 j +

(3.32)

i=1

As will be shown in the next section, this matrix will appear in the Riemann-Hilbert problem and together with the dressing matrix will turn out to be a crucial element in the integrable structure behind the Darboux-Egoroff system.


4. Dressing Formalism and Integrable Hierarchy in the Homogeneous Gradation

In this section we first briefly describe the Riemann-Hilbert problem (see, for instance [1] and references therein) resulting in the integrable hierarchy associated to the affine Lie algebra $\mathcal{G} = \widehat{sl}(m+1)$ with the homogeneous gradation. We then derive an expression for the dressing matrix (see [5] and references therein) in terms of underlying τ- and potential-functions and additional symmetry flows by comparing with the results obtained for the same hierarchy in the previous section.

Let G be a Lie group associated to the Lie algebra $\mathcal{G} = \widehat{sl}(m+1)$ with the homogeneous gradation. We define two subgroups of G as:

\[
G_- = \Big\{ g \in G \;\Big|\; g(\lambda) = 1 + \sum_{i<0} g^{(i)} \Big\}, \tag{4.1}
\]

\[
G_+ = \Big\{ g \in G \;\Big|\; g(\lambda) = \sum_{i\ge 0} g^{(i)} \Big\}, \tag{4.2}
\]

where $g^{(i)}$ has gradation i with respect to the gradation operator $d = \lambda\, d/d\lambda$. Also $G_+ \cap G_- = I$.

Definition 4.1. The (extended) Riemann-Hilbert problem for $\mathcal{G} = \widehat{sl}(m+1)$ with the homogeneous gradation is defined as [7]:

\[
\exp\Big( \sum_{j=1}^{m+1}\sum_{n=1}^{\infty} E_{jj}^{(n)} u_j^{(n)} \Big)\, g = g_-\, g_+ = \Theta^{-1} M \tag{4.3}
\]

with g being a constant element in $G_- G_+$, $(E_{rs})_{ij} = \delta_{ir}\delta_{js}$ and $E_{jj}^{(n)} = \lambda^n E_{jj}$. We find:

\[
\frac{\partial}{\partial u_j^{(n)}} \Big[ \exp\Big( \sum_{k=1}^{m+1}\sum_{l=1}^{\infty} E_{kk}^{(l)} u_k^{(l)} \Big)\, g \Big]
= \Big( \frac{\partial}{\partial u_j^{(n)}} g_-(u) \Big) g_+(u) + g_-(u)\, \frac{\partial}{\partial u_j^{(n)}} g_+(u) \tag{4.4}
\]

or

\[
g_-^{-1}(u)\, E_{jj}^{(n)}\, g_-(u) = g_-^{-1}(u)\, \frac{\partial}{\partial u_j^{(n)}} g_-(u) + \Big( \frac{\partial}{\partial u_j^{(n)}} g_+(u) \Big) g_+^{-1}(u) . \tag{4.5}
\]

Note that $g_-^{-1}\, \partial g_- / \partial u_j^{(n)}$ is in $\mathcal{G}_-$ and $(\partial g_+ / \partial u_j^{(n)})\, g_+^{-1}$ is in $\mathcal{G}_+$, where $\mathcal{G}_\pm$ are positive/negative subalgebras of the graded Lie algebra $\mathcal{G}$. The identity (4.5) implies therefore that:

\[
g_-^{-1}\, \frac{\partial}{\partial u_j^{(n)}} g_- = \big( g_-^{-1}\, E_{jj}^{(n)}\, g_- \big)_- . \tag{4.6}
\]

Let $(u) = (u_1, \ldots, u_{m+1})$ denote m + 1 multi-times (or flows) $u_j$ with each argument $u_j$ denoting one of the m + 1 multi-times $(u_j^{(1)}, u_j^{(2)}, \ldots)$.

In terms of $\Theta(u,\lambda) = g_-^{-1}(u,\lambda)$ we obtain the following lemma:

Lemma 4.1. Mutually commuting $u_j^{(n)}$-flows in the dressing formalism of the integrable hierarchy are defined through:

\[
\frac{\partial}{\partial u_j^{(n)}} \Theta(u,\lambda) = -\big( \Theta \lambda^n E_{jj} \Theta^{-1} \big)_-\, \Theta(u,\lambda); \qquad j = 1, \ldots, m+1 . \tag{4.7}
\]

Let us note that inserting the non-diagonal matrices $E_{ij}$ with $i \ne j$ on the right-hand side of (4.7) would correspond to nonabelian symmetry flows from the Borel loop subalgebra of $\widehat{sl}(m+1)$, described recently in [5, 4]. For $M(u,\lambda) = g_+(u,\lambda)$ we obtain from Eq. (4.5):

\[
\frac{\partial}{\partial u_j^{(n)}} M(u,\lambda) = \big( \Theta E_{jj}^{(n)} \Theta^{-1} \big)_+\, M(u,\lambda). \tag{4.8}
\]

Consider (4.8) for n = 1 and j = m + 1. From now on we will identify $u_j = u_j^{(1)}$. The result can be written (see also [25]) as:

\[
\Big( \frac{\partial}{\partial u_{m+1}} - \lambda E_{m+1\,m+1} + [\theta^{(-1)}, E] \Big) M = 0 , \tag{4.9}
\]

where E is a semisimple, grade-one element of $\mathcal{G}$,

\[
E \equiv \frac{\lambda}{m+1}\, I - \lambda E_{m+1\,m+1} , \tag{4.10}
\]

and $\theta^{(-1)}$ is the term of grade −1 in $\Theta = 1 + \theta^{(-1)} + \ldots$. Since the grade-zero matrix $A \equiv [\theta^{(-1)}, E]$ is in the image of ad(E) it can therefore be parametrized as:

\[
A = \sum_{i=1}^{m} \big( -\Psi_i E_{i\,m+1} + \Phi_i E_{m+1\,i} \big) . \tag{4.11}
\]

A special role in this formalism is played by the $u_{m+1}^{(n)}$-flows. From (4.10) we find that:

\[
\frac{\partial}{\partial u_{m+1}^{(n)}} \Theta(u,\lambda) = \big( \Theta \lambda^{n-1} E\, \Theta^{-1} \big)_-\, \Theta(u,\lambda) , \tag{4.12}
\]

which shows that the $u_{m+1}^{(n)}$-flows are the isospectral deformations of the underlying integrable hierarchy [5]. Consequently,

\[
u_{m+1}^{(n)} = t_n \quad\text{and}\quad x = t_1 = u_{m+1} . \tag{4.13}
\]

The standard solution to Eq. (4.9) was given in the literature in the form of the formal path ordered integral (see e.g. [3] and references therein). Here we notice that with the above identification and with parametrization as in Eq. (4.11) relation (4.9) agrees with Eq. (3.32) for the matrix M introduced in the previous section.
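The parametrization (4.11) can be checked mechanically: the commutator of a grade −1 matrix with E from (4.10) only keeps the last column (with a minus sign) and the last row of the λ^{-1} coefficient of the dressing matrix. The following is a minimal sketch with exact rational arithmetic; the function names and the sample matrix `theta1` are illustrative assumptions, not objects from the paper.

```python
from fractions import Fraction


def unit(n, r, s):
    """Matrix unit E_rs of size n x n (0-based indices)."""
    return [[Fraction(int(i == r and j == s)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def grade_one_E(m, lam):
    """E = lam/(m+1) * I - lam * E_{m+1,m+1}, as in (4.10)."""
    n = m + 1
    e = unit(n, m, m)
    return [[(lam / n if i == j else 0) - lam * e[i][j] for j in range(n)] for i in range(n)]


def potential_A(theta1, lam):
    """A = [theta^(-1), E] with theta^(-1) = theta1 / lam (theta1 is hypothetical sample data)."""
    m = len(theta1) - 1
    theta_minus1 = [[x / lam for x in row] for row in theta1]
    return commutator(theta_minus1, grade_one_E(m, lam))


# Arbitrary sample coefficient of lam^{-1}, chosen only for illustration (m = 2).
theta1 = [[Fraction(v) for v in row] for row in ([2, -3, 5], [7, 11, -13], [17, 19, 23])]
A = potential_A(theta1, Fraction(7))
for row in A:
    print([str(x) for x in row])
```

Comparing with (4.11), the last column of the sample matrix plays the role of the functions multiplying $-E_{i\,m+1}$ and the last row plays the role of those multiplying $E_{m+1\,i}$; all other entries of A vanish, and the result does not depend on the numerical value chosen for λ.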


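An important property:

\[
\sum_{j=1}^{m+1} \frac{\partial}{\partial u_j^{(n)}} \Theta = 0, \qquad n = 1, \ldots \tag{4.14}
\]

is another consequence of Lemma 4.1. For $\Theta$ we obtain the dressing expression:

\[
\frac{\partial}{\partial u_{m+1}} + E + A = \Theta \Big( \frac{\partial}{\partial u_{m+1}} + E \Big) \Theta^{-1} , \tag{4.15}
\]

by taking n = 1 in Eq. (4.7). Hence we have shown that Ad($\Theta$) maps the matrix operator $\partial_x + E$ with $\partial_x = \partial/\partial u_{m+1}$ into the Lax operator:

\[
L = \partial_x + E + A , \tag{4.16}
\]

according to

\[
\partial_x + E \;\to\; \Theta\, (\partial_x + E)\, \Theta^{-1} = \partial_x + E + A . \tag{4.17}
\]

This is called the dressing procedure [5]. One also finds that the action of the $u_j^{(n)}$-flows on the potential A is given by:

\[
\frac{\partial}{\partial u_j^{(n)}} A = \big[ \big( \Theta \lambda^n E_{jj} \Theta^{-1} \big)_+ ,\, L \big] . \tag{4.18}
\]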

(n)

The uj -flows are the symmetry flows of the underlying integrable hierarchy due to the fact that they commute with the isospectral flows. As explained in [5], the dressing (n) formalism is connected with the tau-function, τ being a function of all the uj -times. The connection is expressed by the formula:

∂ (4.19) Resλ tr(Eλn−1 Ejj −1 ) = − (n) ∂x ln τ (u) . ∂uj Our goal in this section is to derive the expression for the dressing matrix in terms of the τ (u) function and the matrix elements i , i of A from (4.11). A simple way to do it relies on equivalence between the above algebraic formulation and the one in which the above integrable model is represented in the Sato formalism by the pseudodifferential Lax operator as in the previous section. In reference [6] this equivalence was established by showing that recursion operators of both hierarchies are identical. A key to our derivation is the following proposition, which shows that the additional symmetry flows of both approaches agree. Proposition 4.1. We have ∂ (n)

∂ui

= ∂i,n i = 1, . . ., m + 1 ,

(4.20)

meaning that both flows have identical actions on the fields i , i , i = 1, . . ., m parametrizing the constrained KP hierarchy.


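Proof. For i = m + 1 this was already established in relation (4.13). For other values of index i the relation follows from the direct calculation based on (4.18).

We will now use the equivalence of these two formalisms to calculate the dressing matrix $\Theta$. For simplicity, we first will work with m = 2 for which case the Lax matrix operator in (4.16) becomes:

\[
L = \partial_x + E + A = \partial_x + \frac{\lambda}{3} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & -\Psi_1 \\ 0 & 0 & -\Psi_2 \\ \Phi_1 & \Phi_2 & 0 \end{pmatrix} . \tag{4.21}
\]

We start with the corresponding un-dressed linear spectral problem:

\[
(\partial_x + E)\, \exp\Big( -\sum_{n} E^{(n)} t_n \Big)\, \chi = 0 , \tag{4.22}
\]

where χ is any constant column vector, $E = E^{(1)}$ and

\[
E^{(n)} \equiv \frac{\lambda^n}{3} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix} . \tag{4.23}
\]

We will be working with three types of χ:

\[
\chi_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}; \qquad \chi_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}; \qquad \chi_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} . \tag{4.24}
\]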

Choosing χ3 in (4.22) and multiplying from the left by we get information about the last (3rd ) column of the matrix :   θ13

(∂x + E + A)  θ23  exp (2/3) (4.25) λn tn = 0 . θ33 This equation takes the form of the linear spectral problem as in (3.12) with θ13 = 1 e−

λn tn

; θ23 = 2 e−

λn tn

; θ33 = ψBA e−

λn tn

,

for i , i = 1, 2 as in Eq. (3.12). For the choice χ = χ2 in (4.22) we find:   θ12

λn tn /3 = 0 . (∂x + E + A)  θ22  exp − θ32

(4.26)

(4.27)

After multiplying from the left by −2 λn tn /3 we obtain: 

     λ00 0 0 −1 θ12

∂x +  0 λ 0  +  0 0 −2   θ22  exp − λn tn = 0 . 000 1 2 0 θ32 

(4.28)


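In components this is equivalent to

\begin{align}
0 &= \partial_x \theta_{12} - \Psi_1 \theta_{32} , \tag{4.29} \\
0 &= \partial_x \theta_{22} - \Psi_2 \theta_{32} , \tag{4.30} \\
\lambda \theta_{32} &= \partial_x \theta_{32} + \Phi_1 \theta_{12} + \Phi_2 \theta_{22} . \tag{4.31}
\end{align}

Solving the first two equations we get:

\[
\theta_{12} = \partial_x^{-1}(\Psi_1 \theta_{32}) , \tag{4.32}
\]

\[
\theta_{22} = 1 + \partial_x^{-1}(\Psi_2 \theta_{32}) . \tag{4.33}
\]

Note that the integration constant in (4.33) has been chosen so that the diagonal element $\theta_{22}$ starts with 1. Plugging this back into (4.31) we get an expression:

\[
L(\theta_{32}) = \lambda \theta_{32} - \Phi_2 \tag{4.34}
\]

whose solution is:

\[
\theta_{32} = \sum_{i=1}^{\infty} \lambda^{-i} L^{i-1}(\Phi_2) . \tag{4.35}
\]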

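Correspondingly,

\[
\theta_{12} = \sum_{i=1}^{\infty} \lambda^{-i} \beta_{12}^{(i-1)} ; \qquad \theta_{22} = 1 + \sum_{i=1}^{\infty} \lambda^{-i} \beta_{22}^{(i-1)} \tag{4.36}
\]

with $\beta_{ij}^{(k)}$ as in (3.21). A similar technique yields for the choice χ = χ1:

\[
\theta_{31} = \sum_{i=1}^{\infty} \lambda^{-i} L^{i-1}(\Phi_1) \tag{4.37}
\]

and

\[
\theta_{11} = 1 + \sum_{i=1}^{\infty} \lambda^{-i} \beta_{11}^{(i-1)} ; \qquad \theta_{21} = \sum_{i=1}^{\infty} \lambda^{-i} \beta_{21}^{(i-1)} . \tag{4.38}
\]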

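These results complete the construction of the dressing matrix:

\[
\Theta = \begin{pmatrix}
1 + \sum_{i=1}^{\infty} \lambda^{-i} \beta_{11}^{(i-1)} & \sum_{i=1}^{\infty} \lambda^{-i} \beta_{12}^{(i-1)} & \Psi_1(t - [\lambda^{-1}])\, \dfrac{\tau(t-[\lambda^{-1}])}{\lambda \tau(t)} \\[2mm]
\sum_{i=1}^{\infty} \lambda^{-i} \beta_{21}^{(i-1)} & 1 + \sum_{i=1}^{\infty} \lambda^{-i} \beta_{22}^{(i-1)} & \Psi_2(t - [\lambda^{-1}])\, \dfrac{\tau(t-[\lambda^{-1}])}{\lambda \tau(t)} \\[2mm]
\sum_{i=1}^{\infty} \lambda^{-i} L^{i-1}(\Phi_1) & \sum_{i=1}^{\infty} \lambda^{-i} L^{i-1}(\Phi_2) & \dfrac{\tau(t-[\lambda^{-1}])}{\tau(t)}
\end{pmatrix} . \tag{4.39}
\]

For the first two diagonal elements we find from [10, 11]:

\[
1 + \sum_{i=1}^{\infty} \lambda^{-i} \beta_{jj}^{(i-1)} = \frac{\tau(u_j - [\lambda^{-1}])}{\tau(t)} ; \qquad j = 1, 2 . \tag{4.40}
\]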


Note that on the right-hand side we have only shown the multi-times $u_j$ which are shifted according to $(u_j - [\lambda^{-1}]) = (u_j^{(1)} - 1/\lambda,\; u_j^{(2)} - 1/2\lambda^2, \ldots)$. Moreover, from [9–11] we find for the first two elements of the last row of $\Theta$:

\[
\sum_{i=1}^{\infty} \lambda^{-i} L^{i-1}(\Phi_j) = \Phi_j(u_j - [\lambda^{-1}])\, \frac{\tau(u_j - [\lambda^{-1}])}{\lambda \tau(t)} ; \qquad j = 1, 2. \tag{4.41}
\]
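The shift $[\lambda^{-1}] = (1/\lambda, 1/2\lambda^2, 1/3\lambda^3, \ldots)$ used in (4.40) and (4.41) has a simple effect on the exponential phase $\xi(t,\mu) = \sum_n t_n \mu^n$: it multiplies $e^{\xi(t,\mu)}$ by $1 - \mu/\lambda$. The following is a small numerical sketch of that standard identity; the truncation order and the sample times are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import math


def miwa_shift(lam, order):
    """First `order` components of [lam^{-1}] = (1/lam, 1/(2 lam^2), 1/(3 lam^3), ...)."""
    return [1.0 / (n * lam**n) for n in range(1, order + 1)]


def xi(times, mu):
    """Truncated phase xi(t, mu) = sum_{n>=1} t_n mu^n."""
    return sum(t * mu**n for n, t in enumerate(times, start=1))


def shifted_phase_ratio(times, lam, mu):
    """exp(xi(t - [lam^{-1}], mu)) / exp(xi(t, mu)), computed with truncated series."""
    shift = miwa_shift(lam, len(times))
    shifted = [t - s for t, s in zip(times, shift)]
    return math.exp(xi(shifted, mu) - xi(times, mu))


# Sample data: 60 arbitrary times, |mu / lam| < 1 so the logarithmic series converges.
times = [0.3 / n for n in range(1, 61)]
ratio = shifted_phase_ratio(times, lam=2.0, mu=0.5)
print(ratio, 1 - 0.5 / 2.0)
```

The agreement reflects the series $\sum_{n\ge 1} (\mu/\lambda)^n / n = -\ln(1 - \mu/\lambda)$, which is why shifts of this form convert τ-function ratios into generating series in $\lambda^{-1}$.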

Relations in (4.41) follow from the ones in (4.40) by the Darboux-B¨acklund transformation generated by Tj = j ∂x j applied to the Lax operator: L˜ = Tj LTj−1 = ∂x +

2

˜ i ∂x−1 ˜ i.

(4.42)

i=1

Inserting the transformed quantities into identity (4.40) one obtains relations (4.41). Another set of identities can be obtained from (4.40) using the binary Darboux-B¨acklund transformations [8]. As an illustration we consider: L → L˜ = T2 LT2−1 ,

T2 = 2 ∂x −1 2 , ¯ i ∂x−1 ¯ i , T¯1 = ˜ −1 , ˜ 1 ∂x L˜ → L¯ = T¯1∗ −1 L˜ T¯1∗ = ∂x + 1

(4.43) (4.44)

i

where ¯ 1 = 2 /∂x−1 (2 , 1 ), ¯ 1 = T¯1 T ∗ −1 L∗ (1 ), 2 ¯ 2 = T¯ ∗ −1 T2 L(2 ), ¯ 2 = T¯1 (1/2 ), 1

(4.45) (4.46)

˜ 1 = −∂x−1 (2 , 1 )/2 . The τ -function transforms as τ → −∂x−1 (2 1 )τ and and for j = 2 (4.40) becomes: 1+

∂ −1 ( )(u − [λ−1 ])τ (u − [λ−1 ]) 2 1 2 2 ¯ 2) ¯2 = x . (4.47) λ−i ∂x−1 L¯ i−1 ( −1 ( )(t)τ (t) ∂ 2 1 x i=1

∞

By inserting definitions (4.45)–(4.46) one can now show that

∂ −1 Li (2 ) 1 −1 ¯ i−1 ¯ ¯ ∂x L (2 ) 2 = x −1 ∂x (2 , 1 )

(4.48)

and the identity: ∞

λ−i β12

(i−1)

= β12 (u2 − [λ−1 ])

i=1

τ (u2 − [λ−1 ]) λτ (t)

(4.49)

follows from (4.47) after dividing both sides by λ and multiplying by ∂x−1 (2 1 ). Similarly, we obtain ∞ i=1

λ−i β21

(i−1)

= β21 (u1 − [λ−1 ])

τ (u1 − [λ−1 ]) . λτ (t)

(4.50)


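To summarize, we have found the following dressing matrix:

\[
\Theta = \frac{1}{\tau(t)} \begin{pmatrix}
\tau(u_1 - [\lambda^{-1}]) & \beta_{12}(u_2 - [\lambda^{-1}])\, \dfrac{\tau(u_2 - [\lambda^{-1}])}{\lambda} & \Psi_1(t - [\lambda^{-1}])\, \dfrac{\tau(t - [\lambda^{-1}])}{\lambda} \\[2mm]
\beta_{21}(u_1 - [\lambda^{-1}])\, \dfrac{\tau(u_1 - [\lambda^{-1}])}{\lambda} & \tau(u_2 - [\lambda^{-1}]) & \Psi_2(t - [\lambda^{-1}])\, \dfrac{\tau(t - [\lambda^{-1}])}{\lambda} \\[2mm]
\Phi_1(u_1 - [\lambda^{-1}])\, \dfrac{\tau(u_1 - [\lambda^{-1}])}{\lambda} & \Phi_2(u_2 - [\lambda^{-1}])\, \dfrac{\tau(u_2 - [\lambda^{-1}])}{\lambda} & \tau(t - [\lambda^{-1}])
\end{pmatrix} . \tag{4.51}
\]

We will now generalize the above matrix to the general case of m ≥ 2. For this purpose we introduce the following definition which extends the definition of coefficients $\beta_{ij} = \beta_{ij}^{(k=0)}$ previously given in (3.21):

Definition 4.2. Let

\[
\beta_{ij}(u) = \partial_x^{-1}\big( \Phi_j \Psi_i \big)(u) ; \qquad \beta_{j\,m+1}(u) = \Psi_j(u) ; \qquad \beta_{m+1\,i}(u) = \Phi_i(u) \tag{4.52}
\]

for $i \ne j$ and i, j = 1, . . ., m.

First note that with this identification Eqs. (3.31) and (3.32) take formally the form of the Darboux-Egoroff system in (2.6) provided we also make coefficients $\beta_{ij}$ symmetric (see the next section). Moreover, based on Definition 4.2 we can now formulate the following proposition:

Proposition 4.2. The matrix elements of the dressing matrix $\Theta = (\theta_{ij})_{1 \le i,j \le m+1}$ are given by:

\[
\theta_{k\ell}(u,\lambda) = \beta_{k\ell}(u_\ell - [\lambda^{-1}])\, \frac{\tau(u_\ell - [\lambda^{-1}])}{\lambda \tau(u)} , \qquad \ell \ne k, \quad k, \ell = 1, \ldots, m+1, \tag{4.53}
\]

\[
\theta_{kk}(u,\lambda) = \frac{\tau(u_k - [\lambda^{-1}])}{\tau(u)} , \qquad k = 1, \ldots, m+1, \tag{4.54}
\]

while the matrix elements of the inverse dressing matrix $\Theta^{-1} = (\theta^{-1}_{ij})_{1 \le i,j \le m+1}$ are:

\[
\theta^{-1}_{k\ell}(u,\lambda) = -\beta_{k\ell}(u_k + [\lambda^{-1}])\, \frac{\tau(u_k + [\lambda^{-1}])}{\lambda \tau(u)} , \qquad \ell \ne k, \quad k, \ell = 1, \ldots, m+1, \tag{4.55}
\]

\[
\theta^{-1}_{kk}(u,\lambda) = \frac{\tau(u_k + [\lambda^{-1}])}{\tau(u)} , \qquad k = 1, \ldots, m+1. \tag{4.56}
\]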

Definition 4.3. The matrix is defined as: (u, λ) = (u, λ) D(u, λ);

Dij (u, λ) = δij e

∞

n (n) n=1 λ uj

,

(4.57)

k, i = 1, . . ., m + 1.

(4.58)

or in components for = (γij )1≤i,j ≤m+1 , γi k (u, λ) = θi k (u, λ) e

∞

n (n) n=1 λ uk

,

For n = 1 and with notation $u_i \equiv u_i^{(n=1)}$ one derives the following proposition:

Proposition 4.3. The matrix coefficients of satisfy the following equations:

m+1 j =1

∂ γj k (u, λ) = βj i (u) γi k (u, λ); i = j = 1, . . ., m + 1, ∂ui

(4.59)

∂ γi k (u, λ) = λ γi k (u, λ)); i = 1, . . ., m + 1 ∂uj

(4.60)

for each k = 1, . . ., m + 1. Proof. Equation (4.60) follows immediately from relation ∂ (n)

∂uj

= −(λn Ejj −1 )− + (u, λ)λn Ejj D(u, λ) = (λn Ejj −1 )+ , (4.61)

which can also be rewritten as ∂ (n)

∂uj

(u, λ) = (λn Ejj −1 )+ (u, λ).

(4.62)

To prove Eq. (4.59) one can either use (4.62) or Eqs. (3.20) with the identification (4.20). Equations in Proposition 4.3 give rise to the following compatibility conditions : Corollary 4.1. The rotation coefficients βij satisfy ∂ βij = βik βkj , i = k = j ∂uk

m+1 k=1

∂ βij = 0, i = j . ∂uk

(4.63)

Similarly, the inverse of the matrix −1 = (γij−1 )1≤i,j ≤m+1 is given by: −1 (u, λ) = D −1 (u, λ)−1 (u, λ)

(4.64)

or in components −1 − γk−1 i (u, λ) = θk i (u, λ) e

∞

n (n) n=1 λ uk

k, i = 1, . . ., m + 1

(4.65)

for which we derive the proposition: Proposition 4.4. The matrix coefficients of −1 satisfy the following equations:

m+1 j =1

∂ −1 γ (u, λ) = γk−1 i (u, λ) βij (u); i = j = 1, . . ., m + 1, ∂ui k j

(4.66)

∂ −1 γ (u, λ) = −λ γk−1 i (u, λ); i = 1, . . ., m + 1 . ∂uj k i

(4.67)


Furthermore we also have the following: Proposition 4.5. The and −1 matrices satisfy the following two bilinear identities:

Resλ −1 (u, λ)(u , λ) = 0, (4.68)

Resλ (u, λ) −1 (u , λ) = 0. (4.69) Proof. Since −1 (u, λ)(u, λ) = (u, λ) −1 (u, λ) = I ,

(4.70)

one determines that for k ≥ 0:

Resλ λk −1 (u, λ)(u, λ) = 0

(4.71)

Resλ λk (u, λ) −1 (u, λ) = 0,

(4.72)

and

and hence from relation (4.62) one gets: Resλ

−1

(u, λ)

∂(u, λ)

(n)

∂uj

= 0,

(4.73)

and a similar formula for Eq. (4.72). The conclusion follows now from the Taylor expansion in (u − u) in relations (4.68)–(4.69). Notice, now that   m+1 m+1 ∞ ∞ n (n) (n) (n) exp  Ejj uj  = Ejj e n=1 λ uj = D(u, λ), j =1 n=1

(4.74)

j =1

where the matrix D(u, λ) was previously defined in terms of its matrix elements Dij in (4.57). Hence the Riemann-Hilbert problem can be recast in the form: D(u, λ)g = g− g+ = −1 M

(4.75)

(u, λ)g = M(u, λ) ,

(4.76)

or

which provides another proof that (u, λ) satisfies the evolution Eqs. (4.62) identical to those satisfied by M(u, λ) in (4.8). In [25 and 26] a matrix similar to M(u, λ) was defined via formula (4.76) for (u, λ), the n-component KP wave function of [23]. As a corollary of relation (4.76) we find that the matrix M(u, λ) also satisfies Proposition 4.3.
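The identity (4.74) is an instance of the exponential of a diagonal matrix. As a brief sketch, write $c_j = \sum_{n=1}^{\infty} \lambda^n u_j^{(n)}$ (a shorthand introduced here only for this check) and use that the matrix units $E_{jj}$ are commuting idempotents with $E_{jj} E_{\ell\ell} = \delta_{j\ell} E_{jj}$ and $\sum_j E_{jj} = I$.

```latex
\begin{align*}
\exp\Big( \sum_{j=1}^{m+1} c_j E_{jj} \Big)
  &= \sum_{k=0}^{\infty} \frac{1}{k!} \Big( \sum_{j=1}^{m+1} c_j E_{jj} \Big)^{k}
   = \sum_{j=1}^{m+1} \Big( \sum_{k=0}^{\infty} \frac{c_j^{\,k}}{k!} \Big) E_{jj} \\
  &= \sum_{j=1}^{m+1} e^{c_j} E_{jj}
   = \sum_{j=1}^{m+1} E_{jj}\, e^{\sum_{n=1}^{\infty} \lambda^n u_j^{(n)}} ,
\end{align*}
```

which is the diagonal matrix with entries $D_{ij}(u,\lambda) = \delta_{ij}\, e^{\sum_n \lambda^n u_j^{(n)}}$ of (4.57).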


5. Reduction to the CKP Hierarchy

In this section we will describe the form of reduction of the integrable hierarchy presented in the previous section which renders the $\beta_{ij}$-coefficients from Definition 4.2 symmetric. Clearly, this is accomplished for

\[
\Phi_i = \Psi_i ; \qquad i = 1, \ldots, m , \tag{5.1}
\]

which results in the CKP condition [13, 27]

\[
L^* = -L \tag{5.2}
\]

for the pseudo-differential Lax operator from (1.7). This condition implies that

\[
\big( (L^n)^* \big)_+ = (-)^n (L^n)_+ , \qquad M_k^{(n)\,*} = (-)^n M_k^{(n)} , \tag{5.3}
\]

and accordingly the evolution equations (1.9) and (3.17) are only consistent with condition (5.2) for odd n. Hence, we are interested in the reduced integrable hierarchy which is obtained from the integrable structure of Sect. 2 by embedding it in the CKP hierarchy by imposing condition (5.1) and introducing dependence on odd flows $u_i^{(2k+1)}$ with i = 1, . . ., m + 1 only. It was shown in [13] that all tau-functions corresponding to the CKP hierarchy can be obtained from special KP tau-functions, viz., the ones that satisfy $\tau(t_1, t_2, t_3, t_4, \ldots) = \tau(t_1, -t_2, t_3, -t_4, \ldots)$, by putting all $t_{2j} = 0$. From (1.8) one easily deduces that the corresponding KP wave functions satisfy

\[
\psi_{BA}((-)^{i+1} t_i, -\lambda) = \psi^*_{BA}(t, \lambda). \tag{5.4}
\]
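The restriction to odd flows under the CKP condition $L^* = -L$ can be made explicit. The sketch below assumes the standard Lax form $\partial L / \partial t_n = [(L^n)_+, L]$ of the isospectral flows (presumably Eq. (1.9), not reproduced in this section), the rule $(AB)^* = B^* A^*$, and the fact that taking the adjoint commutes with the projection $(\cdot)_+$.

```latex
\begin{align*}
(L^n)^* &= (L^*)^n = (-L)^n = (-1)^n L^n , \\
\frac{\partial L^*}{\partial t_n}
  &= [(L^n)_+, L]^* = -\big[ ((L^n)^*)_+ ,\, L^* \big]
   = -\big[ (-1)^n (L^n)_+ ,\, -L \big]
   = (-1)^n\, \frac{\partial L}{\partial t_n} .
\end{align*}
```

Since $L^* = -L$ requires $\partial L^* / \partial t_n = -\partial L / \partial t_n$, a nontrivial flow preserves the condition only when $(-1)^n = -1$, that is, for odd n.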

Now let W ∈ Gr (0) (H ), such that ψW satisfies (5.4), then (f (λ), g(−λ)) = 0 for all f, g ∈ W. All such elements W form a Grassmann submanifold of Gr (0) which we denote by C(H ), the Grassmannian of the CKP hierarchy. To obtain the Darboux–Egoroff system one needs to impose condition (5.1) on the Lax operator (1.7). In the following we arrive at this condition within the Grassmannian formalism. Assume now that W ∈ C(H ) which satisfies (3.8). In the discussion in Sect. 3 we have found m independent vectors vj (λ) ∈ W and m vectors uj (λ) ∈ W ⊥ , such that (vi (λ), uj (λ)) = 0 for i = j and (vi (λ), ui (λ)) = 0. Multiply the uj with a scalar such that (vi (λ), uj (λ)) = δij . Note that this introduces extra constants in (3.11). Since uj (λ) ∈ W ⊥ , we observe that uj (−λ) ∈ W . Express vj (λ) = aj (λ) + λbj (λ) with both aj , bj ∈ W and moreover the bj = 0 and linearly independent (otherwise W ⊂ W not codimension m). Then δij =(vj (λ), ui (λ)) =(aj (λ) + λbj (λ), ui (λ)) =(λbj (λ), ui (λ)).


So we see that we may replace in (3.11) vj (λ) by λbj (λ) this does not change the ∗ (t, λ), a (λ)) = 0. Now notice that for b, c ∈ W , definition of j since (ψW j (λb(λ), c(−λ)) = −(−λb(−λ), c(λ)) = (λb(−λ), c(λ)) = (λc(λ), b(−λ)),

(5.5)

hence (λuj (−λ), bj (−λ)) = 0 and therefore λuj (−λ) ∈ W for all j . Write λuj (−λ) = wj (λ) +

m

Bj λb (λ),

=1

with wj ∈ W , then clearly,

ψW (t, λ), λuj (λ) = ψW (t, λ),

m

Bj λb (−λ) ,

=1

since wj (−λ) ∈ W ⊥ . So we can replace uj (λ) in (3.11) by m

Bj b (−λ);

=1

this does not change the eigenfunctions. We calculate δij = (uj (λ), vi (λ)) = (uj (λ), ai (λ) + λbi (λ)) = (uj (λ), λbi (λ)) = (λuj (λ), bi (λ)) m = −wj (−λ) + Bj λb (−λ), bi (λ)

=1

= =

m

Bj λb (−λ), bi (λ)

=1 m

Bj (λb (−λ), bi (λ)).

=1

In other words, let B = (Bij )1≤i,j ≤m , then B(λb (−λ), bi (λ)))1≤ ,i≤m = Im and thus (λb (−λ), bi (λ)))1≤ ,i≤m is invertible. Now denote A = ((λbi (−λ), bj (λ))1≤i,j ≤m = ((λbj (λ), bi (−λ)))1≤i,j ≤m .


Substituting b = bj , c = bi in (5.5) one sees that (λbi (−λ), bj (λ)) = (λbj (−λ), bi (λ)) so A is symmetric (but not necessarily real). Hence one can find a new basis hi (λ) ∈

m

Cbj (λ),

j =1

hi ∈ W such that (λhi (−λ), hj (λ)) = δij . Choosing this basis instead of the original one, our Lax operator has the form LW = ∂x +

m

∗ ci (ψW (t, λ), λhi (−λ))∂x−1 (ψW (t, λ), λhi (λ)), for certain ci ∈ C.

i=1

Now multiplying hi with

√

ci , we obtain

Proposition 5.1. Let W ∈ C(H ) satisfying (3.8), then there exist m independent functions hi (λ) ∈ W with (λhi (−λ), hj (λ)) = δij ci ,

0 = ci ∈ C,

(5.6)

such that LW = ∂x +

m

i ∂x−1 i ,

with

i=1

i (t) = (ψW (t, λ), λhi (−λ)), ∗ i (t) = (ψW (t, λ), λhi (λ))

(5.7)

= (ψW ((−)j +1 tj , −λ), λhi (λ)) = (ψW ((−)j +1 tj , λ), λhi (−λ)). So we have constructed (adjoint) eigenfunctions i and i that satisfy i (t)|t2j =0 for all j ≥1 = i (t)|t2j =0 for all j ≥1 .

(5.8)

The CKP or reduced hierarchy is now obtained by putting not only all t2j = 0, but (2j ) also all uk = 0 for all 1 ≤ k ≤ m + 1, j = 1, 2, 3, . . . (cf. Definition 3.1). We will assume this to hold from now on. As a consequence of the above we find that Proposition 5.2. For the reduced hierachy the rotation coefficients βij (u) satisfy the Darboux–Egoroff system (1.4), (1.6). One also has −1 (u, λ) = T (u, −λ), −1 (u, λ) = T (u, −λ), (5.9)

(5.10) (u, λ) T (u , −λ) = 0, Resλ T (u, −λ)(u , λ) = 0.

Resλ


6. Construction of the WDVV Prepotential In this section we want to construct the WDVV prepotential F , from Sect. 2, in terms (2j ) of the data of the reduced hierarchy, which is obtained by putting all uk = 0 for all 1 ≤ k ≤ m + 1, j = 1, 2, 3, . . . . Recall from Proposition 5.2 that the rotation coefficients βij satisfy the Darboux–Egoroff system (1.4), (1.6). Definition 6.1. Choose some fixed u such that u = u, for which

Resλ λ−1 (u, λ) T (u , −λ) = 0,

(6.1)

then define (u, λ) = (u, λ) T (u , −λ) .

(6.2)

It is always possible to find such a u due to (u, λ) T (u, −λ) = I.

(6.3)

In view of Proposition 4.3 the following results hold Proposition 6.1. The matrix (u, λ) is a positive power series in λ, whose matrix coefficients satisfy Eqs. (2.6), i.e.,

m+1 j =1

∂ j k (u, λ) = βj i (u) i k (u, λ), i = j = 1, . . ., m + 1, ∂ui

(6.4)

∂ i k (u, λ) = λ i k (u, λ)) , i = 1, . . ., m + 1 ∂uj

(6.5)

for each k = 1, . . ., m + 1. Proof. Equations (6.4) and (6.5) follow immediately from Proposition 4.3. Next observe from (5.10) that Resλ λj (u, λ) = 0

(6.6)

for j = 0. Then using (6.5), we see that (6.6) even holds for all j ≥ 0, i.e., (u, λ) is a power series in λ. Using formula (4.76) we also have that (u, λ) = M(u, λ)M −1 (u , λ).

(6.7)

So that (u, λ) can be expressed by from G− as well as by M from G+ . This provides a different way to observe that (u, λ) is a positive expansion in λ. As a consequence of Proposition 6.1 the constant coefficients i j (u, 0) of the matrix coefficients i j (u, λ) satisfy (2.7) and therefore provide the data of Proposition 2.1. In particular the Lam´e coefficients are equal to hi (u) = i 1 (u, 0). (2k+1)

The higher times uj holds that:

(6.8)

for k > 0 play the role of parameters. Also, due to (4.70) it

(u, λ) T (u, −λ) = T (u, −λ)(u, λ) = I. From this we deduce that in fact ηαβ = δαβ . Now define as in Definition 2.1

(6.9)


Definition 6.2. The matrix and row-vector are defined as: (u, λ) = i1 (u, 0)ij (u, λ) ij , m+1 (u, λ) = (u, λ)j j = λ−1 i1 (u, 0)ij (u, λ) , i=1

(u, λ)j = δj 1 λ−1 + x j (u) +

∞

j

j

ξi (u)λi .

i=1

Then these and satisfy (2.9)–(2.11) and hence we obtain (see Proposition 2.1 and Theorem 2.1) Theorem 6.1. For the reduced hierarchy the rotation coefficients βij (u) satisfy the (1) (1) = uj and Darboux–Egoroff system (1.4), (1.6). On the domain ui 11 (u, 0)21 (u, 0) · · · n1 (u, 0) = 0, one has hi (u) = i1 (u, 0), ηαβ = δα,β , x α (u) =

m+1

i1 (u, 0)Resλ λ−2 iα (u, λ) ,

i=1 γ

cαβ (u) = cαβγ (u) =

m+1 i=1

iα (u, 0)iβ (u, 0)iγ (u, 0) . i1 (u, 0)

The function F (u) defined by m+1 1 1 i F (u) = − ξ21 (u) + x (u)ξ1i (u), 2 2 i=1

with j

ξi (u) =

m+1

k1 (u, 0)Resλ λ−2−i kj (u, λ)

k=1

satisfies the WDVV-equations (2.5). Acknowledgement. H.A. is partially supported by NSF (PHY-9820663).

References

1. Adams, M.R., Krešić-Jurić, S.: Hamiltonians and zero-curvature equations for integrable partial differential equations. J. Math. Phys. 42, 213–224 (2001)
2. Akhmetshin, A.A., Krichever, I.M., Volvovski, Y.S.: A generating formula for solutions of associativity equations. Russ. Math. Surv. 54(2), 427–429 (1999), hep-th/9904028
3. Aratyn, H., Ferreira, L.A., Gomes, J.F., Zimerman, A.H.: The complex sine-Gordon equation as a symmetry flow of the AKNS hierarchy. J. Phys. A33, L331–L337 (2000), nlin.SI/0007002
4. Aratyn, H., Gomes, J.F., Nissimov, E., Pacheva, S.: Loop-algebra and Virasoro symmetries of integrable hierarchies of KP type. Appl. Anal. 78(3-4), 233–253 (2001), nlin.SI/0004040
5. Aratyn, H., Gomes, J.F., Nissimov, E., Pacheva, S., Zimerman, A.H.: Symmetry flows, conservation laws and dressing approach to the integrable models. In: Integrable Hierarchies and Modern Physics Theories, Proceedings of the Advanced Research Workshop, UIC, July 2000, (Aratyn, H., Sorin, A.S. eds), Kluwer Academic Publishers, Dordrecht, 2001, nlin.SI/0012042
6. Aratyn, H., Gomes, J.F., Zimerman, A.H.: Affine Lie algebraic origin of constrained KP hierarchies. J. Math. Phys. 36, 3419–3442 (1995), hep-th/9408104
7. Aratyn, H., Gomes, J.F., Zimerman, A.H.: Multidimensional Toda equations and topological-antitopological fusion. J. Geom. Phys. 46, 21–47 (2003)
8. Aratyn, H., Nissimov, E., Pacheva, S.: Method of squared eigenfunction potentials in integrable hierarchies of KP type. Commun. Math. Phys. 193, 493–525 (1998), solv-int/9701017
9. Aratyn, H., Nissimov, E., Pacheva, S.: A new “dual” symmetry structure of the KP hierarchy. Phys. Lett. A 244, 245–256 (1998), solv-int/9712012
10. Aratyn, H., Nissimov, E., Pacheva, S.: From one-component KP hierarchy to two-component KP hierarchy and back. In: “Topics in Theoretical Phys. vol. II” Festschrift for A.H. Zimerman, IFT-São Paulo, SP-1998, pp. 25–33, solv-int/9808003
11. Aratyn, H., Nissimov, E., Pacheva, S.: Multi-component matrix KP hierarchies as symmetry-enhanced scalar KP hierarchies and their Darboux-Bäcklund solutions. In: CRM Proceedings and Lecture Notes, vol. 29, (Coley, A., Levi, D., Milson, R., Rogers, C., Winternitz, P. eds) AMS 2001, pp. 109–120, solv-int/9904024
12. Darboux, G.: Leçons sur les systèmes orthogonaux et les coordonnées curvilignes, Gauthier-Villars, Paris, 1910
13. Date, E., Jimbo, M., Kashiwara, M., Miwa, T.: Transformation groups for soliton equations. 6(50), 3813–1818 (1981)
14. Dijkgraaf, R., Verlinde, E., Verlinde, H.: Topological strings in d < 1. Nucl. Phys. B325, 59 (1991)
15. Dubrovin, B.: Integrable systems in topological field theory. Nucl. Phys. B379, 627–689 (1992)
16. Dubrovin, B.: Integrable systems and classification of 2-dimensional topological field theories. In: Integrable Systems, Proceedings of Luminy 1991 Conference Dedicated to the Memory of J.-L. Verdier, (Babelon, O., Cartier, O., Kosmann-Schwarzbach, Y. eds), Birkhäuser, 1993
17. Dubrovin, B.: Geometry on 2D topological field theories. In: Integrable Systems and Quantum Groups (Montecatini Terme, 1993). Lecture Notes in Math. 1620, Berlin: Springer, 1996, pp. 120–348
18. Egoroff, D.F.: A Class of Orthogonal Systems. Uch. Zap. Mosk. Univ. Otd. Fiz.-Mat. 18, 1–239 (1901)
19. Enriquez, B., Orlov, A.Y., Rubtsov, V.N.: Dispersionful analogues of Benney’s equations and N-wave systems. Inverse Problems 12, 241–250 (1996), solv-int/9510002
20. Helminck, G.F., van de Leur, J.W.: Geometric Bäcklund–Darboux transformations for the KP hierarchy. Publ. Res. Inst. Math. Sci. 37(4), 479–519 (2001)
21. Helminck, G.F., van de Leur, J.W.: An analytic description of the vector constrained KP hierarchy. Commun. Math. Phys. 193, 627–641 (1998)
22. Helminck, G.F., van de Leur, J.W.: Constrained and Rational Reductions of the KP hierarchy. In: “Supersymmetry and Integrable Models”, H. Aratyn, T.D. Imbo, W.-Y. Keung, U. Sukhatme, (eds), Lecture Notes in Physics 502, Berlin-Heidelberg-New York: Springer, 1998, pp. 167–182
23. Kac, V.G., van de Leur, J.W.: The n-component KP hierarchy and representation theory. In: Important Developments in Soliton Theory, A.S. Fokas and V.E. Zakharov (eds), Springer Series in Nonlinear Dynamics, Berlin-Heidelberg-New York: Springer, 1993, pp. 302–343
24. Krichever, I.M.: Algebraic-geometric n-orthogonal curvilinear coordinate systems and the solution of associativity equations. Funct. Anal. Appl. 31, 25–39 (1997)
25. van de Leur, J.W.: Twisted GLn Loop Group Orbit and Solutions of WDVV Equations. Internat. Math. Res. Notices (11), 551–573 (2001), nlin.SI/0004021
26. van de Leur, J.W., Martini, R.: The construction of Frobenius Manifolds from KP tau-Functions. Commun. Math. Phys. 205, 587–616 (1999)
27. Loris, I.: On reduced CKP equations. Inverse Problems 15, 1099–1109 (1999)
28. Shiota, T.: Prym varieties and soliton equations. In: Infinite-dimensional Lie Algebras and Groups (Luminy-Marseille, 1988), Adv. Ser. Math. Phys. 7, Teaneck, NJ, World Sci. Publishing, 1989, pp. 407–448
29. Witten, E.: On the structure of the topological phase of two-dimensional gravity. Nucl. Phys. B 340, 281–332 (1990)
30. Zakharov, V.E.: Description of the n-orthogonal curvilinear coordinate systems and Hamiltonian integrable systems of hydrodynamic type. I. Integration of the Lamé equations. Duke Math. J. 94, 103–139 (1998)

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 239, 183–240 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0878-5


Liouville Action and Weil-Petersson Metric on Deformation Spaces, Global Kleinian Reciprocity and Holography

Leon A. Takhtajan1, Lee-Peng Teo2

1 Department of Mathematics, SUNY at Stony Brook, Stony Brook, NY 11794-3651, USA. E-mail: [email protected]
2 Department of Applied Mathematics, National Chiao Tung University, 1001, Ta-Hsueh Road, Hsinchu City, 30050, Taiwan, R.O.C. E-mail: [email protected]

Received: 5 September 2002 / Accepted: 21 February 2003 Published online: 23 June 2003 – © Springer-Verlag 2003

Abstract: We rigorously define the Liouville action functional for the finitely generated, purely loxodromic quasi-Fuchsian group using homology and cohomology double complexes naturally associated with the group action. We prove that the classical action – the critical value of the Liouville action functional, considered as a function on the quasi-Fuchsian deformation space – is an antiderivative of a 1-form given by the difference of Fuchsian and quasi-Fuchsian projective connections. This result can be considered as global quasi-Fuchsian reciprocity which implies McMullen’s quasi-Fuchsian reciprocity. We prove that the classical action is a Kähler potential of the Weil-Petersson metric. We also prove that the Liouville action functional satisfies the holography principle, i.e., it is a regularized limit of the hyperbolic volume of a 3-manifold associated with a quasi-Fuchsian group. We generalize these results to a large class of Kleinian groups including finitely generated, purely loxodromic Schottky and quasi-Fuchsian groups, and their free combinations.

Contents

1. Introduction
2. Liouville Action Functional
   2.1 Homology and cohomology set-up
   2.2 The Fuchsian case
   2.3 The quasi-Fuchsian case
3. Deformation Theory
   3.1 The deformation space
   3.2 Variational formulas
4. Variation of the Classical Action
   4.1 Classical action
   4.2 First variation
   4.3 Second variation
   4.4 Quasi-Fuchsian reciprocity
5. Holography
   5.1 Homology and cohomology set-up
   5.2 Regularized Einstein-Hilbert action
   5.3 Epstein map
6. Generalization to Kleinian Groups
   6.1 Kleinian groups of Class A
   6.2 Einstein-Hilbert and Liouville functionals
   6.3 Variation of the classical action
   6.4 Kleinian Reciprocity

1. Introduction

Fuchsian uniformization of Riemann surfaces plays an important role in the Teichmüller theory. In particular, it is built into the definition of the Weil-Petersson metric on Teichmüller spaces. This role became even more prominent with the advent of string theory, started with Polyakov’s approach to non-critical bosonic strings [Pol81]. It is natural to consider the hyperbolic metric on a given Riemann surface as a critical point of a certain functional defined on the space of all smooth conformal metrics on it. In string theory this functional is called the Liouville action functional and its critical value – the classical action. This functional defines the two-dimensional theory of gravity with cosmological term on a Riemann surface, also known as Liouville theory.

From a mathematical point of view, the relation between Liouville theory and complex geometry of moduli spaces of Riemann surfaces was established by P. Zograf and the first author in [ZT85, ZT87a, ZT87b]. It was proved that the classical action is a Kähler potential of the Weil-Petersson metric on moduli spaces of pointed rational curves [ZT87a], and on Schottky spaces [ZT87b]. In the rational case the classical action is a generating function of accessory parameters of Klein and Poincaré. In the case of compact Riemann surfaces, the classical action is an antiderivative of a 1-form on the Schottky space given by the difference of Fuchsian and Schottky projective connections. In turn, this 1-form is an antiderivative of the Weil-Petersson symplectic form on the Schottky space.

C. McMullen [McM00] has considered another 1-form on Teichmüller space given by the difference of Fuchsian and quasi-Fuchsian projective connections, the latter corresponds to Bers’ simultaneous uniformization of a pair of Riemann surfaces.
By establishing quasi-Fuchsian reciprocity, McMullen proved that this 1-form is also an antiderivative of the Weil-Petersson symplectic form, which is bounded on the Teichmüller space due to the Kraus-Nehari inequality. The latter result is important in proving that the moduli space of Riemann surfaces is Kähler hyperbolic [McM00].

In this paper we extend McMullen’s results along the lines of [ZT87a, ZT87b] by using the homological algebra machinery developed by E. Aldrovandi and the first author in [AT97]. We explicitly construct a smooth function on the quasi-Fuchsian deformation space and prove that it is an antiderivative of the 1-form given by the difference of Fuchsian and quasi-Fuchsian projective connections. This function is defined as a classical action for Liouville theory for a quasi-Fuchsian group. The symmetry property of this function is the global quasi-Fuchsian reciprocity, and McMullen’s quasi-Fuchsian reciprocity [McM00] is its immediate corollary. We also prove that this function is a Kähler potential of the Weil-Petersson metric on the quasi-Fuchsian deformation space. As it will be explained below, construction of the Liouville action functional is not a trivial issue and it requires homological algebra methods developed in [AT97]. Furthermore, we show that the Liouville action functional satisfies the holography principle in string theory (also called AdS/CFT correspondence). Specifically, we prove that the Liouville action functional is a regularized limit of the hyperbolic volume of a 3-manifold associated with a quasi-Fuchsian group. Finally, we generalize these results to a large class of Kleinian groups including finitely generated, purely loxodromic Schottky and quasi-Fuchsian groups, and their free combinations. Namely, we define the Liouville action functional, establish the holography principle, and prove that the classical action is an antiderivative of a 1-form on the deformation space given by the difference of Fuchsian and Kleinian projective connections, thus establishing global Kleinian reciprocity. We also prove that the classical action is a Kähler potential of the Weil-Petersson metric.

Here is a more detailed description of the paper. Let $X$ be a Riemann surface of genus $g > 1$, and let $\{U_\alpha\}_{\alpha\in A}$ be its open cover with charts $U_\alpha$, local coordinates $z_\alpha : U_\alpha \to \mathbb{C}$, and transition functions $f_{\alpha\beta} : U_\alpha \cap U_\beta \to \mathbb{C}$. A (holomorphic) projective connection on $X$ is a collection $P = \{p_\alpha\}_{\alpha\in A}$, where $p_\alpha$ are holomorphic functions on $U_\alpha$ which on every $U_\alpha \cap U_\beta$ satisfy
\[
p_\beta = p_\alpha \circ f_{\alpha\beta}\,(f'_{\alpha\beta})^2 + S(f_{\alpha\beta}),
\]
where prime indicates derivative. Here $S(f)$ is the Schwarzian derivative,
\[
S(f) = \frac{f'''}{f'} - \frac{3}{2}\left(\frac{f''}{f'}\right)^2 .
\]
The space $P(X)$ of projective connections on $X$ is an affine space modeled on the vector space of holomorphic quadratic differentials on $X$. The Schwarzian derivative satisfies the following properties:

SD1 $S(f \circ g) = S(f)\circ g\,(g')^2 + S(g)$.
SD2 $S(\gamma) = 0$ for all $\gamma \in \mathrm{PSL}(2, \mathbb{C})$.

Let $\pi : \Omega \to X$ be a holomorphic covering of a compact Riemann surface $X$ by a domain $\Omega \subset \hat{\mathbb{C}}$ with a group of deck transformations being a subgroup of $\mathrm{PSL}(2, \mathbb{C})$. It follows from SD1-SD2 that every such covering defines a projective connection on $X$ by
\[
P_\pi = \{S_{z_\alpha}(\pi^{-1})\}_{\alpha\in A}.
\]
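Property SD2 can be checked by a direct computation. The following is a short sketch added for illustration; the entries $a, b, c, d$ with $ad - bc = 1$ are notation introduced here and do not appear in the text above.

```latex
\gamma(z) = \frac{az+b}{cz+d},\quad ad - bc = 1:
\qquad
\gamma' = \frac{1}{(cz+d)^2},\quad
\gamma'' = \frac{-2c}{(cz+d)^3},\quad
\gamma''' = \frac{6c^2}{(cz+d)^4},
\\[4pt]
S(\gamma) = \frac{\gamma'''}{\gamma'} - \frac{3}{2}\left(\frac{\gamma''}{\gamma'}\right)^2
          = \frac{6c^2}{(cz+d)^2} - \frac{3}{2}\cdot\frac{4c^2}{(cz+d)^2} = 0 .
```

Combined with SD1, this gives $S(\gamma\circ h) = S(\gamma)\circ h\,(h')^2 + S(h) = S(h)$, so two local branches of $\pi^{-1}$ that differ by a deck transformation have the same Schwarzian derivative. This is why the collection $P_\pi$ does not depend on the choice of branch.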
The Fuchsian uniformization $X \simeq \Gamma\backslash\mathbb{U}$ is the covering $\pi_F : \mathbb{U} \to X$ by the upper half-plane $\mathbb{U}$ where the group of deck transformations is a Fuchsian group $\Gamma$, and it defines the Fuchsian projective connection $P_F$. The Schottky uniformization $X \simeq \Gamma\backslash\Omega$ is the covering $\pi_S : \Omega \to X$ by a connected domain $\Omega \subset \hat{\mathbb{C}}$, where the group of deck transformations $\Gamma$ is a Schottky group – finitely-generated, strictly loxodromic, free Kleinian group. It defines the Schottky projective connection $P_S$.

Let $T_g$ be the Teichmüller space of marked Riemann surfaces of genus $g > 1$ (with a given marked Riemann surface as the origin), defined as the space of marked normalized Fuchsian groups, and let $S_g$ be the Schottky space, defined as the space of marked normalized Schottky groups with $g$ free generators. These spaces are complex manifolds of dimension $3g - 3$ carrying Weil-Petersson Kähler metrics, and the natural projection map $T_g \to S_g$ is a complex-analytic covering. Denote by $\omega_{WP}$ the symplectic form of the Weil-Petersson metric on spaces $T_g$ and $S_g$, and by $d = \partial + \bar\partial$ – the de Rham differential and its decomposition. The affine spaces $P(X)$ for varying Riemann surfaces $X$ glue together to an affine bundle $P_g \to T_g$, modeled over the holomorphic cotangent bundle of $T_g$. The Fuchsian projective connection $P_F$ is a canonical section of the affine bundle $P_g \to T_g$, invariant under the action of the Teichmüller modular group $\mathrm{Mod}_g$. The Schottky projective connection is a canonical section of the affine bundle $P_g \to S_g$, and the difference $P_F - P_S$, where $P_F$ is considered as a section of $P_g \to S_g$, is a $(1, 0)$-form on $S_g$. This 1-form has the following properties [ZT87b]. First, it is $\partial$-exact – there exists a smooth function $S : S_g \to \mathbb{R}$ such that
\[
P_F - P_S = \frac{1}{2}\,\partial S. \tag{1.1}
\]
Second, it is a $\bar\partial$-antiderivative, and hence a $d$-antiderivative by (1.1), of the Weil-Petersson symplectic form on $S_g$,
\[
\bar\partial(P_F - P_S) = -i\,\omega_{WP}. \tag{1.2}
\]
It immediately follows from (1.1) and (1.2) that the function $-S$ is a Kähler potential for the Weil-Petersson metric on $S_g$, and hence on $T_g$,
\[
\partial\bar\partial S = 2i\,\omega_{WP}. \tag{1.3}
\]
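The step from (1.1) and (1.2) to (1.3) can be spelled out as follows. It uses only $\partial^2 = 0$ and $\partial\bar\partial = -\bar\partial\partial$ on functions.

```latex
d(P_F - P_S) = \tfrac{1}{2}\,(\partial + \bar\partial)\,\partial S
             = \tfrac{1}{2}\,\bar\partial\partial S
             = \bar\partial(P_F - P_S)
\qquad\text{by (1.1), since } \partial^2 = 0,
\\[4pt]
-i\,\omega_{WP} = \bar\partial(P_F - P_S)
               = \tfrac{1}{2}\,\bar\partial\partial S
               = -\tfrac{1}{2}\,\partial\bar\partial S
\qquad\text{by (1.2) and (1.1)},
\\[4pt]
\Longrightarrow\quad \partial\bar\partial S = 2i\,\omega_{WP}.
```

The first line also shows why a $\bar\partial$-antiderivative of $\omega_{WP}$ that is $\partial$-exact is automatically a $d$-antiderivative.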

Arguments using the quantum Liouville theory (see, e.g., [Tak92] and references therein) confirm formula (1.1) with function $S$ given by the classical Liouville action, as was already proved in [ZT87b]. However, the general mathematical definition of the Liouville action functional on a Riemann surface $X$ is a non-trivial problem interesting in its own right (and for rigorous applications to the quantum Liouville theory). Let $\mathcal{CM}(X)$ be a space (actually a cone) of smooth conformal metrics on a Riemann surface $X$. Every $ds^2 \in \mathcal{CM}(X)$ is a collection $\{e^{\phi_\alpha}|dz_\alpha|^2\}_{\alpha\in A}$, where functions $\phi_\alpha \in C^\infty(U_\alpha, \mathbb{R})$ satisfy
\[
\phi_\alpha \circ f_{\alpha\beta} + \log|f'_{\alpha\beta}|^2 = \phi_\beta \quad\text{on}\quad U_\alpha \cap U_\beta. \tag{1.4}
\]
According to the uniformization theorem, $X$ has a unique conformal metric of constant negative curvature $-1$, called hyperbolic, or Poincaré metric. Gaussian curvature $-1$ condition is equivalent to the following nonlinear PDE for functions $\phi_\alpha$ on $U_\alpha$,
\[
\frac{\partial^2\phi_\alpha}{\partial z_\alpha\,\partial\bar z_\alpha} = \frac{1}{2}\,e^{\phi_\alpha}. \tag{1.5}
\]

In string theory this PDE is called the Liouville equation. The problem is to define the Liouville action functional on Riemann surface $X$ – a smooth functional $S : \mathcal{CM}(X) \to \mathbb{R}$ such that its Euler-Lagrange equation is the Liouville equation. At first glance it looks like an easy task. Set $U = U_\alpha$, $z = z_\alpha$ and $\phi = \phi_\alpha$, so that $ds^2 = e^\phi|dz|^2$ in $U$. Elementary calculus of variations shows that the Euler-Lagrange equation for the functional
\[
\frac{i}{2}\iint_U \left(|\phi_z|^2 + e^\phi\right) dz\wedge d\bar z,
\]
where $\phi_z = \partial\phi/\partial z$, is indeed the Liouville equation on $U$. Therefore, it seems that the functional $\frac{i}{2}\iint_X \omega$, where $\omega$ is a 2-form on $X$ such that
\[
\omega|_{U_\alpha} = \omega_\alpha = \left(\left|\frac{\partial\phi_\alpha}{\partial z_\alpha}\right|^2 + e^{\phi_\alpha}\right) dz_\alpha\wedge d\bar z_\alpha, \tag{1.6}
\]
does the job. However, due to the transformation law (1.4) the first terms in local 2-forms $\omega_\alpha$ do not glue properly on $U_\alpha \cap U_\beta$ and a 2-form $\omega$ on $X$ satisfying (1.6) does not exist!

Though the Liouville action functional can not be defined in terms of a Riemann surface $X$ only, it can be defined in terms of planar coverings of $X$. Namely, let $\Gamma$ be a Kleinian group with the region of discontinuity $\Omega$ such that $\Gamma\backslash\Omega \simeq X_1 \sqcup\cdots\sqcup X_n$ – a disjoint union of compact Riemann surfaces of genera $> 1$ including Riemann surface $X$. The covering $\Omega \to X_1 \sqcup\cdots\sqcup X_n$ introduces a global “étale” coordinate, and for large variety of Kleinian groups (Class A defined below) it is possible, using methods [AT97], to define the Liouville action functional $S : \mathcal{CM}(X_1 \sqcup\cdots\sqcup X_n) \to \mathbb{R}$ such that its critical value is a well-defined function on the deformation space $D(\Gamma)$. In the simplest case when $X$ is a punctured Riemann sphere such a global coordinate exists already on $X$, and the Liouville action functional is just $\frac{i}{2}\iint_X \omega$, appropriately regularized at the punctures [ZT87a]. When $X$ is compact, one possibility is to use the “minimal” planar cover of $X$ given by the Schottky uniformization $X \simeq \Gamma\backslash\Omega$, as in [ZT87b]. Namely, identify $\mathcal{CM}(X)$ with the affine space of smooth real-valued functions $\phi$ on $\Omega$ satisfying
\[
\phi\circ\gamma + \log|\gamma'|^2 = \phi \quad\text{for all}\quad \gamma\in\Gamma, \tag{1.7}
\]
and consider the 2-form $\omega[\phi] = (|\phi_z|^2 + e^\phi)\,dz\wedge d\bar z$ on $\Omega$. The 2-form $\omega[\phi]$ can not be pushed forward on $X$, so that the integral $\frac{i}{2}\iint_F \omega$ depends on the choice of fundamental domain $F$ for the marked Schottky group $\Gamma$. However, one can add boundary terms to this integral to ensure the independence of the choice of fundamental domain for the marked Schottky group $\Gamma$, and to guarantee that its Euler-Lagrange equation is the Liouville equation on $\Gamma\backslash\Omega$. The result is the following functional introduced in [ZT87b]:
\[
\begin{aligned}
S[\phi] ={}& \frac{i}{2}\iint_F \left(|\phi_z|^2 + e^\phi\right) dz\wedge d\bar z \\
&+ \frac{i}{2}\sum_{k=1}^{g}\int_{C_k}\left(\phi - \frac{1}{2}\log|\gamma_k'|^2\right)\left(\frac{\gamma_k''}{\gamma_k'}\,dz - \frac{\overline{\gamma_k''}}{\overline{\gamma_k'}}\,d\bar z\right) \\
&+ 4\pi\sum_{k=1}^{g}\log|c(\gamma_k)|^2 .
\end{aligned}
\tag{1.8}
\]

Here $F$ is the fundamental domain of the marked Schottky group $\Gamma$ with free generators $\gamma_1, \ldots, \gamma_g$, bounded by $2g$ non-intersecting closed Jordan curves $C_1, \ldots, C_g, C_1', \ldots, C_g'$ such that $C_k' = -\gamma_k(C_k)$, $k = 1, \ldots, g$, and $c(\gamma) = c$ for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Classical action $S : S_g \to \mathbb{R}$ that enters (1.1) is the critical value of this functional.

In [McM00] McMullen considered the quasi-Fuchsian projective connection $P_{QF}$ on a Riemann surface $X$ which is given by Bers’ simultaneous uniformization of $X$ and fixed Riemann surface $Y$ of the same genus and opposite orientation. Similar to formula (1.2), he proved
\[
d(P_F - P_{QF}) = -i\,\omega_{WP}, \tag{1.9}
\]
so that the 1-form $P_F - P_{QF}$ on $T_g$ is a $d$-antiderivative of the Weil-Petersson symplectic form, bounded in Teichmüller and Weil-Petersson metrics due to the Kraus-Nehari inequality. Part $\bar\partial(P_F - P_{QF}) = -i\,\omega_{WP}$ of (1.9) actually follows from (1.1) since $P_S - P_{QF}$ is a holomorphic $(1, 0)$-form on $S_g$. Part $\partial(P_F - P_{QF}) = 0$ follows from McMullen’s quasi-Fuchsian reciprocity.

Our first result is an analog of the formula (1.1) for the quasi-Fuchsian case. Namely, let $\Gamma$ be a finitely generated, purely loxodromic quasi-Fuchsian group with region of discontinuity $\Omega$, so that $\Gamma\backslash\Omega$ is the disjoint union of two compact Riemann surfaces with the same genus $g > 1$ and opposite orientations. Denote by $D(\Gamma)$ the deformation space of $\Gamma$ – a complex manifold of complex dimension $6g - 6$, and by $\omega_{WP}$ – the symplectic form of the Weil-Petersson metric on $D(\Gamma)$. To every point $\Gamma' \in D(\Gamma)$ with the region of discontinuity $\Omega'$ there corresponds a pair $X, Y$ of compact Riemann surfaces with opposite orientations simultaneously uniformized by $\Gamma'$, that is, $X \sqcup Y \simeq \Gamma'\backslash\Omega'$. We will continue to denote by $P_F$ and $P_{QF}$ the projective connections on $X \sqcup Y$ given by Fuchsian uniformizations of $X$ and $Y$ and Bers’ simultaneous uniformization of $X$ and $Y$ respectively. Similarly to (1.1), we prove in Theorem 4.1 that there exists a smooth function $S : D(\Gamma) \to \mathbb{R}$ such that
\[
P_F - P_{QF} = \frac{1}{2}\,\partial S. \tag{1.10}
\]

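The function $S$ is a classical Liouville action for the quasi-Fuchsian group $\Gamma$ – the critical value of the Liouville action functional $S$ on $\mathcal{CM}(X \sqcup Y)$. Its construction uses double homology and cohomology complexes naturally associated with the $\Gamma$-action on $\Omega$. Namely, the homology double complex $\mathsf{K}_{\bullet,\bullet}$ is defined as a tensor product over the integral group ring $\mathbb{Z}\Gamma$ of the standard singular chain complex of $\Omega$ and the canonical bar-resolution complex for $\Gamma$, and the cohomology double complex $\mathsf{C}^{\bullet,\bullet}$ is bar-de Rham complex on $\Omega$. The cohomology construction starts with the 2-form $\omega[\phi] \in \mathsf{C}^{2,0}$, where $\phi$ satisfies (1.7), and introduces $\theta[\phi] \in \mathsf{C}^{1,1}$ and $u \in \mathsf{C}^{1,2}$ by
\[
\theta_{\gamma^{-1}}[\phi] = \left(\phi - \frac{1}{2}\log|\gamma'|^2\right)\left(\frac{\gamma''}{\gamma'}\,dz - \frac{\overline{\gamma''}}{\overline{\gamma'}}\,d\bar z\right),
\]
and
\[
\begin{aligned}
u_{\gamma_1^{-1},\gamma_2^{-1}} ={}& -\frac{1}{2}\log|\gamma_1'|^2\left(\frac{\gamma_2''}{\gamma_2'}\circ\gamma_1\,\gamma_1'\,dz - \overline{\frac{\gamma_2''}{\gamma_2'}\circ\gamma_1\,\gamma_1'}\,d\bar z\right) \\
&+ \frac{1}{2}\log|\gamma_2'\circ\gamma_1|^2\left(\frac{\gamma_1''}{\gamma_1'}\,dz - \frac{\overline{\gamma_1''}}{\overline{\gamma_1'}}\,d\bar z\right).
\end{aligned}
\]
Define $\Theta \in \mathsf{C}^{0,2}$ to be a group 2-cocycle satisfying $d\Theta = u$. The resulting cochain $\Psi[\phi] = \omega[\phi] - \theta[\phi] - \Theta$ is a cocycle of degree 2 in the total complex $\mathrm{Tot}\,\mathsf{C}$. The corresponding homology construction starts with a fundamental domain $F \in \mathsf{K}_{2,0}$ for $\Gamma$ in $\Omega$ and introduces chains $L \in \mathsf{K}_{1,1}$ and $V \in \mathsf{K}_{0,2}$ such that $\Sigma = F + L - V$ is a cycle of degree 2 in the total homology complex $\mathrm{Tot}\,\mathsf{K}$. The Liouville action functional is given by the evaluation map,
\[
S[\phi] = \frac{i}{2}\,\langle\Psi[\phi], \Sigma\rangle, \tag{1.11}
\]
where $\langle\ ,\ \rangle$ is the natural pairing between $\mathsf{C}^{p,q}$ and $\mathsf{K}_{p,q}$.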
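Since the pairing is nonzero only between components of the same bidegree, the evaluation (1.11) splits into three terms. Writing $\Psi[\phi] = \omega[\phi] - \theta[\phi] - \Theta$ for the cocycle and $\Sigma = F + L - V$ for the cycle, as in the construction just described, the expansion reads as follows (a bookkeeping sketch, not an additional result):

```latex
\langle\Psi[\phi], \Sigma\rangle
  = \langle\omega[\phi] - \theta[\phi] - \Theta,\; F + L - V\rangle
  = \langle\omega[\phi], F\rangle - \langle\theta[\phi], L\rangle + \langle\Theta, V\rangle ,
\\[4pt]
\omega[\phi]\in\mathsf{C}^{2,0},\ F\in\mathsf{K}_{2,0};\qquad
\theta[\phi]\in\mathsf{C}^{1,1},\ L\in\mathsf{K}_{1,1};\qquad
\Theta\in\mathsf{C}^{0,2},\ V\in\mathsf{K}_{0,2}.
```

The three terms are, respectively, a bulk integral over the fundamental domain, line integrals over its edges, and point evaluations at its vertices.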

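In the case when $\Gamma$ is a Fuchsian group, the Liouville action functional on $X \simeq \Gamma\backslash\mathbb{U}$, similar to (1.8), can be written explicitly as follows:
\[
\begin{aligned}
S[\phi] ={}& \frac{i}{2}\iint_F \omega[\phi] + \frac{i}{2}\sum_{k=1}^{g}\left(\int_{a_k}\theta_{\alpha_k}[\phi] - \int_{b_k}\theta_{\beta_k}[\phi]\right) \\
&+ \frac{i}{2}\sum_{k=1}^{g}\left(\Theta_{\alpha_k,\beta_k}(a_k(0)) - \Theta_{\beta_k,\alpha_k}(b_k(0)) + \Theta_{\gamma_k^{-1},\alpha_k\beta_k}(b_k(0))\right) \\
&- \frac{i}{2}\sum_{k=1}^{g-1}\Theta_{\gamma_g^{-1}\cdots\gamma_{k+1}^{-1},\gamma_k^{-1}}(b_g(0)),
\end{aligned}
\]
where
\[
\Theta_{\gamma_1,\gamma_2}(z) = \int_p^z u_{\gamma_1,\gamma_2} + 4\pi i\,\varepsilon_{\gamma_1,\gamma_2}\left(2\log 2 + \log|c(\gamma_2)|^2\right),
\]
$p \in \mathbb{R}\setminus\Gamma(\infty)$ and
\[
\varepsilon_{\gamma_1,\gamma_2} =
\begin{cases}
1 & \text{if } p < \gamma_2(\infty) < \gamma_1^{-1}p, \\
-1 & \text{if } p > \gamma_2(\infty) > \gamma_1^{-1}p, \\
0 & \text{otherwise.}
\end{cases}
\]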

Here $a_k$ and $b_k$ are the edges of the fundamental domain $F$ for $\Gamma$ in $\mathbb{U}$ (see Sect. 2.2.1) with initial points $a_k(0)$ and $b_k(0)$, $\alpha_k$ and $\beta_k$ are corresponding generators of $\Gamma$ and $\gamma_k = \alpha_k\beta_k\alpha_k^{-1}\beta_k^{-1}$. The action functional does not depend on the choice of the fundamental domain $F$ for $\Gamma$, nor on the choice of $p \in \mathbb{R}\setminus\Gamma(\infty)$. Liouville action for quasi-Fuchsian group $\Gamma$ is defined by a similar construction where both components of $\Omega$ are used (see Sect. 2.3.3). Equation (1.10) is the global quasi-Fuchsian reciprocity. McMullen’s quasi-Fuchsian reciprocity, as well as the equation $\partial(P_F - P_{QF}) = 0$, immediately follow from it. The classical action $S : D(\Gamma) \to \mathbb{R}$ is symmetric with respect to Riemann surfaces $X$ and $Y$,
\[
S(X, Y) = S(\bar Y, \bar X), \tag{1.12}
\]
where $\bar X$ is the mirror image of $X$, and this property manifests the global quasi-Fuchsian reciprocity. Equation (1.9) now follows from (1.10) and (1.1). Its direct proof along the lines of [ZT87a, ZT87b] is given in Theorem 4.2. As an immediate corollary of (1.9) and (1.10), we obtain that the function $-S$ is a Kähler potential of the Weil-Petersson metric on $D(\Gamma)$.

Our second result is a precise relation between two and three-dimensional constructions which establishes the holography principle for the quasi-Fuchsian case. Let $\mathbb{U}^3 = \{Z = (x, y, t) \in \mathbb{R}^3 \mid t > 0\}$ be hyperbolic 3-space. The quasi-Fuchsian group $\Gamma$ acts discontinuously on $\mathbb{U}^3 \cup \Omega$ and the quotient $M \simeq \Gamma\backslash(\mathbb{U}^3 \cup \Omega)$ is a hyperbolic 3-manifold with boundary $\Gamma\backslash\Omega \simeq X \sqcup Y$. According to the holography principle (see, e.g., [MM02] for a mathematically oriented exposition), the regularized hyperbolic volume of $M$ – on-shell Einstein-Hilbert action with a cosmological term, is related to the Liouville action functional $S[\phi]$.

In the case when $\Gamma$ is a classical Schottky group, i.e., when it has a fundamental domain bounded by Euclidean circles, the holography principle was established by K. Krasnov in [Kra00]. Namely, let $M \simeq \Gamma\backslash(\mathbb{U}^3 \cup \Omega)$ be the corresponding hyperbolic 3-manifold (realized using the Ford fundamental region) with boundary $X \simeq \Gamma\backslash\Omega$ – a compact Riemann surface of genus $g > 1$. For every $ds^2 = e^\phi|dz|^2 \in \mathcal{CM}(X)$ consider the family $H_\varepsilon$ of surfaces given by the equation $f(Z) = t\,e^{\phi(z)/2} = \varepsilon > 0$, where $z = x + iy$, and let $M_\varepsilon = M \cap H_\varepsilon$. Denote by $V_\varepsilon[\phi]$ the hyperbolic volume of $M_\varepsilon$, by $A_\varepsilon[\phi]$ – the area of the boundary of $M_\varepsilon$ in the metric on $H_\varepsilon$ induced by the hyperbolic metric on $\mathbb{U}^3$, and by $A[\phi]$ – the area of $X$ in the metric $ds^2$. In [Kra00] K. Krasnov obtained the following formula:
\[
\lim_{\varepsilon\to 0}\left(V_\varepsilon[\phi] - \frac{1}{2}A_\varepsilon[\phi] + (2g - 2)\pi\log\varepsilon\right) = -\frac{1}{4}\left(S[\phi] - A[\phi]\right). \tag{1.13}
\]
It relates three-dimensional data – the regularized volume of $M$, to the two-dimensional data – the Liouville action functional $S[\phi]$, thus establishing the holography principle. Note that the metric $ds^2$ on the boundary of $M$ appears entirely through regularization by means of the surfaces $H_\varepsilon$, which are not $\Gamma$-invariant. As a result, arguments in [Kra00] work only for classical Schottky groups.

We extend homological algebra methods in [AT97] to the three-dimensional case when $\Gamma$ is a quasi-Fuchsian group. Namely, we construct the $\Gamma$-invariant cut-off function $f$ using a partition of unity for $\Gamma$, and prove in Theorem 5.1 that the on-shell regularized Einstein-Hilbert action functional
\[
E[\phi] = -4\lim_{\varepsilon\to 0}\left(V_\varepsilon[\phi] - \frac{1}{2}A_\varepsilon[\phi] + 2\pi(2g - 2)\log\varepsilon\right),
\]
is well-defined and satisfies the quasi-Fuchsian holography principle
\[
E[\phi] = S[\phi] - \iint_{\Gamma\backslash\Omega} e^\phi\,d^2z - 8\pi(2g - 2)\log 2.
\]

As an immediate corollary, we get another proof that the Liouville action functional $S[\phi]$ does not depend on the choice of fundamental domain $F$ of $\Gamma$ in $\Omega$, provided it is the intersection of the fundamental region of $\Gamma$ in $\mathbb{U}^3 \cup \Omega$ with $\Omega$. We also show that $\Gamma$-invariant cut-off surfaces $H_\varepsilon$ can be chosen to be Epstein surfaces, which are naturally associated with the family of metrics $ds_\varepsilon^2 = 4\varepsilon^{-2}e^\phi|dz|^2 \in \mathcal{CM}(X)$ by the inverse of the “hyperbolic Gauss map” [Eps84, Eps86] (see also [And98]). This construction also gives a geometric interpretation of the density $|\phi_z|^2 + e^\phi$ in terms of Epstein surfaces.

The Schottky and quasi-Fuchsian groups considered above are basically the only examples of geometrically finite, purely loxodromic Kleinian groups with finitely many components. Indeed, according to a theorem of Maskit [Mas88], every geometrically finite, purely loxodromic Kleinian group which has finitely many components in fact has at most two components. The one-component case corresponds to the Schottky groups and the two-component case – to the Fuchsian or quasi-Fuchsian groups and their $\mathbb{Z}_2$-extensions.

The third result of the paper is the generalization of the main results for quasi-Fuchsian groups – Theorems 4.1, 4.2 and 5.1, to Kleinian groups. Namely, we introduce the notion of a Kleinian group of Class A for which this generalization holds. By definition, a non-elementary, purely loxodromic, geometrically finite Kleinian group $\Gamma$ is of Class A if it has fundamental region $R$ in $\mathbb{U}^3 \cup \Omega$ which is a finite three-dimensional CW-complex with no vertices in $\mathbb{U}^3$. The Schottky, Fuchsian, quasi-Fuchsian groups, and their free combinations are of Class A, and Class A is stable under quasiconformal deformations. We extend three-dimensional homological methods developed in Sect. 5 to the case of the Kleinian group $\Gamma$ of Class A acting on $\mathbb{U}^3 \cup \Omega$. Namely, starting from the fundamental region $R$ for $\Gamma$ in $\mathbb{U}^3 \cup \Omega$, we construct a chain of degree 3 in total homology complex $\mathrm{Tot}\,\mathsf{K}$, whose boundary in $\Omega$ is a cycle $\Sigma$ of degree 2 for the corresponding total homology complex of the region of discontinuity $\Omega$. In Theorem 6.1 we establish the holography principle for Kleinian groups: we prove that the on-shell regularized Einstein-Hilbert action for the 3-manifold $M \simeq \Gamma\backslash(\mathbb{U}^3 \cup \Omega)$ is well-defined and is related to the Liouville action functional for $\Gamma$, defined by the evaluation map (1.11). When $\Gamma$ is a Schottky group, we get the functional (1.8) introduced in [ZT87b]. As in the quasi-Fuchsian case, the Liouville action functional does not depend on the choice of a fundamental domain $F$ for $\Gamma$ in $\Omega$, as long as it is the intersection of a fundamental region of $\Gamma$ in $\mathbb{U}^3 \cup \Omega$ with $\Omega$.

Denote by $D(\Gamma)$ the deformation space of a Kleinian group $\Gamma$. To every point $\Gamma' \in D(\Gamma)$ with the region of discontinuity $\Omega'$ there corresponds a disjoint union $X_1 \sqcup\cdots\sqcup X_n \simeq \Gamma'\backslash\Omega'$ of compact Riemann surfaces simultaneously uniformized by the Kleinian group $\Gamma'$. Conversely, by a theorem of Maskit [Mas88], for a given sequence of compact Riemann surfaces $X_1, \ldots, X_n$ there is a Kleinian group which simultaneously uniformizes them. Using the same notation, we denote by $P_F$ the projective connection on $X_1 \sqcup\cdots\sqcup X_n$ given by the Fuchsian uniformization of these Riemann surfaces and by $P_K$ – the projective connection given by their simultaneous uniformization by a Kleinian group ($P_K = P_{QF}$ for the quasi-Fuchsian case). Let $S : D(\Gamma) \to \mathbb{R}$ be the classical Liouville action. Theorem 6.2 states that
\[
P_F - P_K = \frac{1}{2}\,\partial S,
\]
which is the ultimate generalization of (1.1). Similarly, Theorem 6.3 is the statement
\[
\bar\partial(P_F - P_K) = -i\,\omega_{WP},
\]
which implies that $-S$ is a Kähler potential of the Weil-Petersson metric on $D(\Gamma)$. As another immediate corollary of Theorem 6.2 we get McMullen’s Kleinian reciprocity – Theorem 6.4.

Finally, we observe that our methods and results, with appropriate modifications, can be generalized to the case when quasi-Fuchsian and Class A Kleinian groups have torsion and contain parabolic elements. Our method also works for the Bers’ universal Teichmüller space $T(1)$ and the related infinite-dimensional Kähler manifold $\mathrm{Diff}_+(S^1)/\text{Möb}(S^1)$. We plan to discuss these generalizations elsewhere.

The content of the paper is the following. In Sect. 2 we give construction of the Liouville action functional following the method in [AT97], which we review briefly in 2.1. In Sect. 2.2 we define and establish the main properties of the Liouville action functional in the model case when $\Gamma$ is a Fuchsian group, and in Sect. 2.3 we consider the technically more involved quasi-Fuchsian case. In Sect. 3 we recall all necessary basic facts from deformation theory. In Sect. 4 we prove our first main result – Theorems 4.1 and 4.2. In Sect. 5 we prove the second main result – Theorem 5.1 on the quasi-Fuchsian holography. Finally in Sect. 6 we generalize these results for Kleinian groups of Class A: we define the Liouville action functional and prove Theorems 6.1, 6.2 and 6.3.

2. Liouville Action Functional

Let $\Gamma$ be a normalized, marked, purely loxodromic quasi-Fuchsian group of genus $g > 1$ with region of discontinuity $\Omega$, so that $\Gamma\backslash\Omega \simeq X \sqcup Y$, where $X$ and $Y$ are compact Riemann surfaces of genus $g > 1$ with opposite orientations. Here we define the Liouville action functional $S$ for the group $\Gamma$ as a functional on the space of smooth conformal metrics on $X \sqcup Y$ with the property that its Euler-Lagrange equation is the Liouville equation on $X \sqcup Y$. Its definition is based on the homological algebra methods developed in [AT97].

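2.1. Homology and cohomology set-up. Let $\Gamma$ be a group acting properly on a smooth manifold $M$. To this data one canonically associates double homology and cohomology complexes (see, e.g., [AT97] and references therein).

Let $S_\bullet \equiv S_\bullet(M)$ be the standard singular chain complex of $M$ with the differential $\partial'$. The group action on $M$ induces a left $\Gamma$-action on $S_\bullet$ by translating the chains and $S_\bullet$ becomes a complex of left $\Gamma$-modules. Since the action of $\Gamma$ on $M$ is proper, $S_\bullet$ is a complex of free left $\mathbb{Z}\Gamma$-modules, where $\mathbb{Z}\Gamma$ is the integral group ring of the group $\Gamma$. The complex $S_\bullet$ is endowed with a right $\mathbb{Z}\Gamma$-module structure in the standard fashion: $c\cdot\gamma = \gamma^{-1}(c)$.

Let $B_\bullet \equiv B_\bullet(\mathbb{Z}\Gamma)$ be the canonical “bar” resolution complex for $\Gamma$ with differential $\partial''$. Each $B_n(\mathbb{Z}\Gamma)$ is a free left $\Gamma$-module on generators $[\gamma_1|\ldots|\gamma_n]$, with the differential $\partial'' : B_n \to B_{n-1}$ given by
\[
\begin{aligned}
\partial''[\gamma_1|\ldots|\gamma_n] ={}& \gamma_1[\gamma_2|\ldots|\gamma_n] + \sum_{k=1}^{n-1}(-1)^k[\gamma_1|\ldots|\gamma_k\gamma_{k+1}|\ldots|\gamma_n] \\
&+ (-1)^n[\gamma_1|\ldots|\gamma_{n-1}], \quad n > 1, \\
\partial''[\gamma] ={}& \gamma[\ ] - [\ ], \quad n = 1,
\end{aligned}
\]
where $[\gamma_1|\ldots|\gamma_n]$ is zero if some $\gamma_i$ equals the unit element id in $\Gamma$. Here $B_0(\mathbb{Z}\Gamma)$ is a $\mathbb{Z}\Gamma$-module on one generator $[\ ]$ and it can be identified with $\mathbb{Z}\Gamma$ under the isomorphism that sends $[\ ]$ to 1; by definition, $\partial''[\ ] = 0$.

The double homology complex $\mathsf{K}_{\bullet,\bullet}$ is defined as $S_\bullet \otimes_{\mathbb{Z}\Gamma} B_\bullet$, where the tensor product over $\mathbb{Z}\Gamma$ uses the right $\Gamma$-module structure on $S_\bullet$. The associated total complex $\mathrm{Tot}\,\mathsf{K}$ is equipped with the total differential $\partial = \partial' + (-1)^p\partial''$ on $\mathsf{K}_{p,q}$, and the complex $S_\bullet$ is identified with $S_\bullet \otimes_{\mathbb{Z}\Gamma} B_0$ by the isomorphism $c \mapsto c \otimes [\ ]$.

The corresponding double complex in cohomology is defined as follows. Denote by $A^\bullet \equiv A^\bullet_{\mathbb{C}}(M)$ the complexified de Rham complex on $M$. Each $A^n$ is a left $\Gamma$-module with the pull-back action of $\Gamma$, i.e., $\gamma\cdot\varpi = (\gamma^{-1})^*\varpi$ for $\varpi \in A^\bullet$ and $\gamma \in \Gamma$. Define the double complex $\mathsf{C}^{p,q} = \mathrm{Hom}_{\mathbb{C}}(B_q, A^p)$ with differentials $d$, the usual de Rham differential, and $\delta = (\partial'')^*$, the group coboundary. Specifically, for $\Phi \in \mathsf{C}^{p,q}$,
\[
(\delta\Phi)_{\gamma_1,\ldots,\gamma_{q+1}} = \gamma_1\cdot\Phi_{\gamma_2,\ldots,\gamma_{q+1}} + \sum_{k=1}^{q}(-1)^k\,\Phi_{\gamma_1,\ldots,\gamma_k\gamma_{k+1},\ldots,\gamma_{q+1}} + (-1)^{q+1}\,\Phi_{\gamma_1,\ldots,\gamma_q}.
\]
We write the total differential on $\mathsf{C}^{p,q}$ as $D = d + (-1)^p\delta$.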
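The bar differential defined above can be experimented with on a small finite group. The following is a minimal sketch, assuming the symmetric group on three letters as a toy stand-in for the group (this choice, the dictionary encoding of chains, and all function names are hypothetical and introduced only for illustration). It implements the formula for the bar differential, with generators containing the identity treated as zero, and checks that applying it twice gives zero.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy group for illustration: the symmetric group S3, with
# elements stored as permutation tuples and composition as the product.
IDENTITY = (0, 1, 2)
S3 = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]


def mul(g, h):
    """Group product: the composition (g * h)(i) = g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(h)))


def bar_boundary(chain):
    """Apply the bar differential to a chain.

    A chain is a dict {(g, (g1, ..., gn)): coefficient}, standing for the
    sum of coefficient * g[g1|...|gn]. Generators that contain the identity
    are treated as zero, and the differential of [ ] is zero.
    """
    out = defaultdict(int)
    for (g, bar), coeff in chain.items():
        n = len(bar)
        if n == 0 or IDENTITY in bar:
            continue
        # First term: g1 acts on the left, giving (g * g1)[g2|...|gn].
        terms = [((mul(g, bar[0]), bar[1:]), 1)]
        # Middle terms: (-1)^k [g1|...|gk g(k+1)|...|gn] for k = 1..n-1.
        for k in range(1, n):
            merged = bar[:k - 1] + (mul(bar[k - 1], bar[k]),) + bar[k + 1:]
            terms.append(((g, merged), (-1) ** k))
        # Last term: (-1)^n [g1|...|g(n-1)].
        terms.append(((g, bar[:-1]), (-1) ** n))
        for key, sign in terms:
            if IDENTITY in key[1]:
                continue  # degenerate generators vanish
            out[key] += sign * coeff
    return {key: value for key, value in out.items() if value != 0}


def squares_to_zero(degree):
    """Check that the differential applied twice kills every generator of a degree."""
    return all(
        bar_boundary(bar_boundary({(IDENTITY, bar): 1})) == {}
        for bar in product(S3, repeat=degree)
    )
```

For instance, `squares_to_zero(3)` runs over all 216 generators of degree 3. The same routine works for any finite group once `S3`, `IDENTITY` and `mul` are replaced accordingly.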

There is a natural pairing between $\mathsf{C}^{p,q}$ and $\mathsf{K}_{p,q}$ which assigns to the pair $(\Phi, c \otimes [\gamma_1|\ldots|\gamma_q])$ the evaluation of the $p$-form $\Phi_{\gamma_1,\ldots,\gamma_q}$ over the $p$-cycle $c$,
\[
\langle\Phi, c \otimes [\gamma_1|\ldots|\gamma_q]\rangle = \int_c \Phi_{\gamma_1,\ldots,\gamma_q}.
\]
By definition, $\langle\delta\Phi, c\rangle = \langle\Phi, \partial''c\rangle$, so that using Stokes’ theorem we get $\langle D\Phi, c\rangle = \langle\Phi, \partial c\rangle$. This pairing defines a non-degenerate pairing between corresponding cohomology and homology groups $H^\bullet(\mathrm{Tot}\,\mathsf{C})$ and $H_\bullet(\mathrm{Tot}\,\mathsf{K})$, which we continue to denote by $\langle\ ,\ \rangle$. In particular, if $\Phi$ is a cocycle in $(\mathrm{Tot}\,\mathsf{C})^n$ and $C$ is a cycle in $(\mathrm{Tot}\,\mathsf{K})_n$, then the pairing $\langle\Phi, C\rangle$ depends only on cohomology classes $[\Phi]$ and $[C]$ and not on their representatives. It is this property that will allow us to define the Liouville action functional by constructing the corresponding cocycle $\Psi$ and cycle $\Sigma$. Specifically, we consider the following two cases.

1. $\Gamma$ is purely hyperbolic Fuchsian group of genus $g > 1$ and $M = \mathbb{U}$ – the upper half-plane of the complex plane $\mathbb{C}$. In this case, since $\mathbb{U}$ is acyclic, we have [AT97]
\[
H_\bullet(X, \mathbb{Z}) \cong H_\bullet(\Gamma, \mathbb{Z}) \cong H_\bullet(\mathrm{Tot}\,\mathsf{K}),
\]
where the three homologies are: the singular homology of $X \simeq \Gamma\backslash\mathbb{U}$, a compact Riemann surface of genus $g > 1$, the group homology of $\Gamma$, and the homology of the complex $\mathrm{Tot}\,\mathsf{K}$ with respect to the total differential $\partial$. Similarly, for $M = \mathbb{L}$ – the lower half-plane of the complex plane $\mathbb{C}$, we have
\[
H_\bullet(\bar X, \mathbb{Z}) \cong H_\bullet(\Gamma, \mathbb{Z}) \cong H_\bullet(\mathrm{Tot}\,\mathsf{K}),
\]
where $\bar X \simeq \Gamma\backslash\mathbb{L}$ is the mirror image of $X$ – a complex-conjugate of the Riemann surface $X$.

2. $\Gamma$ is purely loxodromic quasi-Fuchsian group of genus $g > 1$ with region of discontinuity $\Omega$ consisting of two simply-connected components $\Omega_1$ and $\Omega_2$ separated by a quasi-circle $\mathcal{C}$. The same isomorphisms hold, where $X \simeq \Gamma\backslash\Omega_1$ and $\bar X$ is replaced by $Y \simeq \Gamma\backslash\Omega_2$.

2.2. The Fuchsian case. Let $\Gamma$ be a marked, normalized, purely hyperbolic Fuchsian group of genus $g > 1$, let $X \simeq \Gamma\backslash\mathbb{U}$ be the corresponding marked compact Riemann surface of genus $g$, and let $\bar X \simeq \Gamma\backslash\mathbb{L}$ be its mirror image. In this case it is possible to define Liouville action functionals on Riemann surfaces $X$ and $\bar X$ separately. The definition will be based on the following specialization of the general construction in Sect. 2.1.

2.2.1. Homology computation. Here is a representation of the fundamental class $[X]$ of the Riemann surface $X$ in $H_2(X, \mathbb{Z})$ as a cycle $\Sigma$ of total degree 2 in the homology complex $\mathrm{Tot}\,\mathsf{K}$ [AT97]. Recall that the marking of $\Gamma$ is given by a system of $2g$ standard generators $\alpha_1, \ldots, \alpha_g, \beta_1, \ldots, \beta_g$ satisfying the single relation $\gamma_1\cdots\gamma_g = \mathrm{id}$, where $\gamma_k = [\alpha_k, \beta_k] = \alpha_k\beta_k\alpha_k^{-1}\beta_k^{-1}$. The marked group $\Gamma$ is normalized, if the attracting and repelling fixed points of $\alpha_1$ are, respectively, 0 and $\infty$, and the attracting fixed point of $\beta_1$ is 1. Every marked Fuchsian group $\Gamma$ is conjugated in $\mathrm{PSL}(2, \mathbb{R})$ to a normalized marked Fuchsian group. For a given marking there is a standard choice of the fundamental domain $F \subset \mathbb{U}$ for $\Gamma$ as a closed non-Euclidean polygon with $4g$ edges labeled by $a_k, a_k', b_k', b_k$ satisfying $\alpha_k(a_k') = a_k$, $\beta_k(b_k') = b_k$, $k = 1, 2, \ldots, g$ (see Fig. 1). The orientation of the edges is chosen such that
\[
\partial' F = \sum_{k=1}^{g}(a_k + b_k' - a_k' - b_k).
\]
Set $\partial' a_k = a_k(1) - a_k(0)$, $\partial' b_k = b_k(1) - b_k(0)$, so that $a_k(0) = b_{k-1}(0)$. The relations between the vertices of $F$ and the generators of $\Gamma$ are the following: $\alpha_k^{-1}(a_k(0)) = b_k(1)$, $\beta_k^{-1}(b_k(0)) = a_k(1)$, $\gamma_k(b_k(0)) = b_{k-1}(0)$, where $b_0(0) = b_g(0)$. According to the isomorphism $S_\bullet \simeq \mathsf{K}_{\bullet,0}$, the fundamental domain $F$ is identified with $F \otimes [\ ] \in \mathsf{K}_{2,0}$. We have $\partial'' F = 0$ and, as it follows from the previous formula,
\[
\partial' F = \sum_{k=1}^{g}\left(\beta_k^{-1}(b_k) - b_k - \alpha_k^{-1}(a_k) + a_k\right) = \partial'' L,
\]
where $L \in \mathsf{K}_{1,1}$ is given by
\[
L = \sum_{k=1}^{g}\left(b_k \otimes [\beta_k] - a_k \otimes [\alpha_k]\right). \tag{2.1}
\]
There exists $V \in \mathsf{K}_{0,2}$ such that $\partial'' V = \partial' L$. A straightforward computation gives the following explicit expression:
\[
\begin{aligned}
V ={}& \sum_{k=1}^{g}\left(a_k(0) \otimes [\alpha_k|\beta_k] - b_k(0) \otimes [\beta_k|\alpha_k] + b_k(0) \otimes [\gamma_k^{-1}|\alpha_k\beta_k]\right) \\
&- \sum_{k=1}^{g-1} b_g(0) \otimes [\gamma_g^{-1}\cdots\gamma_{k+1}^{-1}|\gamma_k^{-1}].
\end{aligned}
\tag{2.2}
\]
Using $\partial'' F = 0$, $\partial' F = \partial'' L$, $\partial'' V = \partial' L$, and $\partial' V = 0$, we obtain that the element $\Sigma = F + L - V$ of total degree 2 is a cycle in $\mathrm{Tot}\,\mathsf{K}$, that is $\partial\Sigma = 0$. The cycle $\Sigma \in (\mathrm{Tot}\,\mathsf{K})_2$ represents the fundamental class $[X]$. It is proved in [AT97] that the corresponding homology class $[\Sigma]$ in $H_\bullet(\mathrm{Tot}\,\mathsf{K})$ does not depend on the choice of the fundamental domain $F$ for the group $\Gamma$.
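The sign bookkeeping behind the cycle condition can be made explicit. In the sketch below, $\Sigma = F + L - V$, the singular differential is written $\partial'$, the bar differential is written $\partial''$, and the total differential is $\partial = \partial' + (-1)^p\partial''$ on $\mathsf{K}_{p,q}$, as in Sect. 2.1.

```latex
\partial F = \partial' F + \partial'' F = \partial' F \qquad (F\in\mathsf{K}_{2,0},\ p = 2,\ \partial'' F = 0),
\\[2pt]
\partial L = \partial' L - \partial'' L \qquad (L\in\mathsf{K}_{1,1},\ p = 1),
\\[2pt]
\partial V = \partial' V + \partial'' V = \partial'' V \qquad (V\in\mathsf{K}_{0,2},\ p = 0,\ \partial' V = 0),
\\[4pt]
\partial\Sigma = \partial F + \partial L - \partial V
             = \left(\partial' F - \partial'' L\right) + \left(\partial' L - \partial'' V\right) = 0 .
```

Each bracket vanishes by one of the two relations $\partial' F = \partial'' L$ and $\partial' L = \partial'' V$, which is why $V$ enters $\Sigma$ with a minus sign.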

[Figure omitted: fundamental polygon with edges labeled $a_1, a_1', b_1, b_1', a_2, a_2', b_2, b_2'$.]

Fig. 1. Conventions for the fundamental domain $F$

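2.2.2. Cohomology computation. The corresponding construction in cohomology is the following. Start with the space $\mathcal{CM}(X)$ of all conformal metrics on $X \simeq \Gamma\backslash\mathbb{U}$. Every $ds^2 \in \mathcal{CM}(X)$ can be represented as $ds^2 = e^\phi|dz|^2$, where $\phi \in C^\infty(\mathbb{U}, \mathbb{R})$ satisfies
\[
\phi\circ\gamma + \log|\gamma'|^2 = \phi \quad\text{for all}\quad \gamma\in\Gamma. \tag{2.3}
\]
In what follows we will always identify $\mathcal{CM}(X)$ with the affine subspace of $C^\infty(\mathbb{U}, \mathbb{R})$ defined by (2.3). The “bulk” 2-form $\omega$ for the Liouville action is given by
\[
\omega[\phi] = \left(|\phi_z|^2 + e^\phi\right) dz\wedge d\bar z, \tag{2.4}
\]
where $\phi \in \mathcal{CM}(X)$. Considering it as an element in $\mathsf{C}^{2,0}$ and using (2.3) we get $\delta\omega[\phi] = d\theta[\phi]$, where $\theta[\phi] \in \mathsf{C}^{1,1}$ is given explicitly by
\[
\theta_{\gamma^{-1}}[\phi] = \left(\phi - \frac{1}{2}\log|\gamma'|^2\right)\left(\frac{\gamma''}{\gamma'}\,dz - \frac{\overline{\gamma''}}{\overline{\gamma'}}\,d\bar z\right). \tag{2.5}
\]
Next, set $u = \delta\theta[\phi] \in \mathsf{C}^{1,2}$. From the definition of $\theta$ and $\delta^2 = 0$ it follows that the 1-form $u$ is closed. An explicit calculation gives
\[
\begin{aligned}
u_{\gamma_1^{-1},\gamma_2^{-1}} ={}& -\frac{1}{2}\log|\gamma_1'|^2\left(\frac{\gamma_2''}{\gamma_2'}\circ\gamma_1\,\gamma_1'\,dz - \overline{\frac{\gamma_2''}{\gamma_2'}\circ\gamma_1\,\gamma_1'}\,d\bar z\right) \\
&+ \frac{1}{2}\log|\gamma_2'\circ\gamma_1|^2\left(\frac{\gamma_1''}{\gamma_1'}\,dz - \frac{\overline{\gamma_1''}}{\overline{\gamma_1'}}\,d\bar z\right),
\end{aligned}
\tag{2.6}
\]
and shows that $u$ does not depend on $\phi \in \mathcal{CM}(X)$.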


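Remark 2.1. The explicit formulas above are valid in the general case, when the domain $\Omega \subset \hat{\mathbb{C}}$ is invariant under the action of a Kleinian group $\Gamma$. Namely, define the 2-form $\omega$ by formula (2.4), where $\phi$ satisfies (2.3) in $\Omega$. Then the solution $\theta$ to the equation $\delta \omega[\phi] = d\theta[\phi]$ is given by formula (2.5), and $u = \delta \theta[\phi]$ by (2.6).

There exists a cochain $\Theta \in C^{0,2}$ satisfying $d\Theta = u$ and $\delta \Theta = 0$. Indeed, since the 1-form $u$ is closed and $\mathbb{U}$ is simply connected, $\Theta$ can be defined as a particular antiderivative of $u$ satisfying $\delta \Theta = 0$. This can be done as follows. Consider the hyperbolic (Poincaré) metric on $\mathbb{U}$,

\[ e^{\phi_{hyp}(z)} |dz|^2 = \frac{|dz|^2}{y^2}, \quad z = x + iy \in \mathbb{U}. \]

This metric is $\mathrm{PSL}(2, \mathbb{R})$-invariant and its push-forward to $X$ is a hyperbolic metric on $X$. Explicit computation yields $\omega[\phi_{hyp}] = 2 e^{\phi_{hyp}}\, dz \wedge d\bar z$, so that $\delta \omega[\phi_{hyp}] = 0$. Thus the 1-form $\theta[\phi_{hyp}]$ on $\mathbb{U}$ is closed and, therefore, exact: $\theta[\phi_{hyp}] = dl$ for some $l \in C^{0,1}$. Set

\[ \Theta = \delta l. \tag{2.7} \]

It is now immediate that $\delta \Theta = 0$ and $\delta \theta[\phi] = u = d\Theta$ for all $\phi \in \mathcal{CM}(X)$. Thus $\Psi[\phi] = \omega[\phi] - \theta[\phi] - \Theta$ is a 2-cocycle in the cohomology complex Tot C, that is, $D\Psi[\phi] = 0$.

Remark 2.2. For every $\gamma \in \mathrm{PSL}(2, \mathbb{R})$ define the 1-form $\theta_\gamma[\phi_{hyp}]$ by the same formula (2.5),

\[ \theta_{\gamma^{-1}}[\phi_{hyp}] = -\left( 2 \log y + \tfrac{1}{2} \log |\gamma'|^2 \right) \left( \frac{\gamma''}{\gamma'}\, dz - \frac{\overline{\gamma''}}{\overline{\gamma'}}\, d\bar z \right). \tag{2.8} \]

Since for every $\gamma \in \mathrm{PSL}(2, \mathbb{R})$,

\[ (\delta \log y)_{\gamma^{-1}} = \log(y \circ \gamma) - \log y = \tfrac{1}{2} \log |\gamma'|^2, \]

the 1-form $u = \delta \theta[\phi]$ is still given by (2.6) and is an $A^1(\mathbb{U})$-valued group 2-cocycle for $\mathrm{PSL}(2, \mathbb{R})$, that is, $(\delta u)_{\gamma_1, \gamma_2, \gamma_3} = 0$ for all $\gamma_1, \gamma_2, \gamma_3 \in \mathrm{PSL}(2, \mathbb{R})$. Also, the 0-form $\Theta$ given by (2.7) satisfies $d\Theta = u$ and is an $A^0(\mathbb{U})$-valued group 2-cocycle for $\mathrm{PSL}(2, \mathbb{R})$.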
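The identity for $\delta \log y$ in Remark 2.2, and with it the fact that $\phi_{hyp} = -2 \log y$ satisfies the transformation law (2.3), can be spot-checked numerically. The following is only a sketch: the matrix `g` and the sample point `z` are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

def mobius(g, z):
    """Apply the Moebius map with matrix g = (a, b, c, d) to z."""
    a, b, c, d = g
    return (a * z + b) / (c * z + d)

def mobius_derivative(g, z):
    """Derivative of the Moebius map: (ad - bc) / (cz + d)^2."""
    a, b, c, d = g
    return (a * d - b * c) / (c * z + d) ** 2

def phi_hyp(z):
    """Hyperbolic metric potential on the upper half-plane: -2 log(Im z)."""
    return -2.0 * math.log(z.imag)

g = (2.0, 1.0, 1.0, 1.0)   # illustrative element of SL(2, R): det = 1
z = complex(0.3, 0.7)      # illustrative point with Im z > 0
w = mobius(g, z)

# log(y o gamma) - log y  versus  (1/2) log |gamma'|^2
delta_log_y = math.log(w.imag) - math.log(z.imag)
half_log_deriv = 0.5 * math.log(abs(mobius_derivative(g, z)) ** 2)

# Residual of (2.3): phi o gamma + log |gamma'|^2 - phi
residual_23 = phi_hyp(w) + math.log(abs(mobius_derivative(g, z)) ** 2) - phi_hyp(z)
```

Both `delta_log_y - half_log_deriv` and `residual_23` should vanish up to rounding for any real matrix of determinant 1 and any point of the upper half-plane.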


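2.2.3. The action functional. The evaluation map $\langle \Psi[\phi], \Sigma \rangle$ does not depend on the choice of the fundamental domain $F$ for $\Gamma$ [AT97]. It also does not depend on a particular choice of antiderivative $l$, since by Stokes' theorem

\[ \langle \Theta, V \rangle = \langle \delta l, V \rangle = \langle l, \partial'' V \rangle = \langle l, \partial' L \rangle = \langle \theta[\phi_{hyp}], L \rangle. \tag{2.9} \]

This justifies the following definition.

Definition 2.1. The Liouville action functional $S[\,\cdot\,; X] : \mathcal{CM}(X) \to \mathbb{R}$ is defined by the evaluation map

\[ S[\phi; X] = \frac{i}{2} \langle \Psi[\phi], \Sigma \rangle, \quad \phi \in \mathcal{CM}(X). \]

For brevity, set $S[\phi] = S[\phi; X]$. The following lemma shows that the difference of any two values of the functional $S$ is given by the bulk term only.

Lemma 2.1. For all $\phi \in \mathcal{CM}(X)$ and $\sigma \in C^\infty(X, \mathbb{R})$,

\[ S[\phi + \sigma] - S[\phi] = \iint_F \left( |\sigma_z|^2 + \left( e^{\sigma} + K\sigma - 1 \right) e^{\phi} \right) d^2 z, \]

where $d^2 z = dx \wedge dy$ is the Lebesgue measure and $K = -2 e^{-\phi} \phi_{z \bar z}$ is the Gaussian curvature of the metric $e^{\phi} |dz|^2$.

Proof. We have

\[ \omega[\phi + \sigma] - \omega[\phi] = \tilde\omega[\phi; \sigma] + d\tilde\theta, \]

where

\[ \tilde\omega[\phi; \sigma] = \left( |\sigma_z|^2 + \left( e^{\sigma} + K\sigma - 1 \right) e^{\phi} \right) dz \wedge d\bar z \quad \text{and} \quad \tilde\theta = \sigma \left( \phi_{\bar z}\, d\bar z - \phi_z\, dz \right). \]

Since

\[ \delta \tilde\theta_{\gamma^{-1}} = \sigma \left( \frac{\gamma''}{\gamma'}\, dz - \frac{\overline{\gamma''}}{\overline{\gamma'}}\, d\bar z \right) = \theta_{\gamma^{-1}}[\phi + \sigma] - \theta_{\gamma^{-1}}[\phi], \]

the assertion of the lemma follows from Stokes' theorem.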
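The pointwise identity behind the proof of Lemma 2.1, namely that the coefficient of $\omega[\phi+\sigma] - \omega[\phi] - d\tilde\theta$ equals $|\sigma_z|^2 + (e^\sigma + K\sigma - 1)e^\phi$, can be tested with finite differences. This is a rough numerical sketch, not part of the argument: the test functions `phi` and `sigma`, the step `H`, and all helper names are assumptions made for illustration only.

```python
import math

H = 1e-3  # finite-difference step (illustrative choice)

def d_dz(f, x, y, h=H):
    """Wirtinger derivative (f_x - i f_y) / 2 by central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return 0.5 * (fx - 1j * fy)

def d_dzbar(f, x, y, h=H):
    """Wirtinger derivative (f_x + i f_y) / 2 by central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return 0.5 * (fx + 1j * fy)

def phi(x, y):      # hypothetical smooth potential on the upper half-plane
    return -2.0 * math.log(y) + 0.3 * math.sin(x)

def sigma(x, y):    # hypothetical smooth perturbation
    return 0.2 * math.cos(x) * y

def phi_z(x, y):
    return d_dz(phi, x, y)

def phi_zbar(x, y):
    return d_dzbar(phi, x, y)

def omega_coeff(f, x, y):
    """Coefficient of dz ^ dzbar in omega[f] = (|f_z|^2 + e^f) dz ^ dzbar."""
    return abs(d_dz(f, x, y)) ** 2 + math.exp(f(x, y))

def theta_dz(x, y):      # dz-component of theta-tilde
    return -sigma(x, y) * phi_z(x, y)

def theta_dzbar(x, y):   # dzbar-component of theta-tilde
    return sigma(x, y) * phi_zbar(x, y)

def lemma_residual(x, y):
    shifted = lambda s, t: phi(s, t) + sigma(s, t)
    # d(A dz + B dzbar) = (B_z - A_zbar) dz ^ dzbar
    d_theta = d_dz(theta_dzbar, x, y) - d_dzbar(theta_dz, x, y)
    lhs = omega_coeff(shifted, x, y) - omega_coeff(phi, x, y) - d_theta
    curvature = -2.0 * math.exp(-phi(x, y)) * d_dz(phi_zbar, x, y)
    rhs = abs(d_dz(sigma, x, y)) ** 2 + (
        math.exp(sigma(x, y)) + curvature * sigma(x, y) - 1.0
    ) * math.exp(phi(x, y))
    return abs(lhs - rhs)

residual = lemma_residual(0.4, 0.9)
```

The residual is limited only by the truncation error of the nested central differences, so it shrinks as `H` is reduced (until rounding takes over).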

Corollary 2.1. The Euler-Lagrange equation for the functional S is the Liouville equation, the critical point of S – the hyperbolic metric φhyp , is non-degenerate, and the classical action – the critical value of S, is twice the hyperbolic area of X, that is, 4π(2g − 2).


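Proof. As follows from Lemma 2.1,

\[ \left. \frac{dS[\phi + t\sigma]}{dt} \right|_{t=0} = \iint_F (K + 1)\, \sigma e^{\phi}\, d^2 z, \]

so that the Euler-Lagrange equation is the Liouville equation $K = -1$. Since

\[ \left. \frac{d^2 S[\phi_{hyp} + t\sigma]}{dt^2} \right|_{t=0} = \iint_F \left( 2 |\sigma_z|^2 + \sigma^2 e^{\phi_{hyp}} \right) d^2 z > 0 \quad \text{if } \sigma \neq 0, \]

the critical point $\phi_{hyp}$ is non-degenerate. Using (2.9) we get

\[ S[\phi_{hyp}] = \frac{i}{2} \langle \Psi[\phi_{hyp}], \Sigma \rangle = \frac{i}{2} \langle \omega[\phi_{hyp}], F \rangle = 2 \iint_F \frac{d^2 z}{y^2} = 4\pi(2g - 2). \]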
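As a sanity check on the curvature convention $K = -2e^{-\phi}\phi_{z\bar z}$ used in Lemma 2.1 and Corollary 2.1, one can verify numerically that the Poincaré metric $\phi_{hyp} = -2\log y$ has $K = -1$, and tabulate the classical action $4\pi(2g-2)$. The sketch below uses a five-point Laplacian; the step size, sample point and function names are illustrative assumptions.

```python
import math

def phi_hyp(x, y):
    """Potential of the Poincare metric |dz|^2 / y^2."""
    return -2.0 * math.log(y)

def gaussian_curvature(phi, x, y, h=1e-4):
    """K = -2 e^{-phi} phi_{z zbar}, with phi_{z zbar} = (phi_xx + phi_yy) / 4."""
    laplacian = (
        phi(x + h, y) + phi(x - h, y) + phi(x, y + h) + phi(x, y - h)
        - 4.0 * phi(x, y)
    ) / h ** 2
    return -2.0 * math.exp(-phi(x, y)) * 0.25 * laplacian

def hyperbolic_area(genus):
    """Gauss-Bonnet area 2*pi*(2g - 2) of a compact hyperbolic surface."""
    return 2.0 * math.pi * (2 * genus - 2)

def classical_action(genus):
    """Critical value of S from Corollary 2.1: twice the hyperbolic area."""
    return 2.0 * hyperbolic_area(genus)

K = gaussian_curvature(phi_hyp, 0.3, 0.8)
```

For genus 2 this gives area $4\pi$ and classical action $8\pi$.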

Remark 2.3. Let $\Delta[\phi] = -e^{-\phi} \partial_z \partial_{\bar z}$ be the Laplace operator of the metric $ds^2 = e^{\phi} |dz|^2$ acting on functions on $X$, and let $\det \Delta[\phi]$ be its zeta-function regularized determinant (see, e.g., [OPS88] for details). Denote by $A[\phi]$ the area of $X$ with respect to the metric $ds^2$ and set

\[ I[\phi] = \log \frac{\det \Delta[\phi]}{A[\phi]}. \]

Polyakov's "conformal anomaly" formula [Pol81] reads

\[ I[\phi + \sigma] - I[\phi] = -\frac{1}{12\pi} \iint_F \left( |\sigma_z|^2 + K\sigma e^{\phi} \right) d^2 z, \]

where $\sigma \in C^\infty(X, \mathbb{R})$ (see [OPS88] for a rigorous proof). Comparing it with Lemma 2.1 we get

\[ I[\phi + \sigma] + \frac{1}{12\pi} \check S[\phi + \sigma] = I[\phi] + \frac{1}{12\pi} \check S[\phi], \]

where $\check S[\phi] = S[\phi] - A[\phi]$.

Lemma 2.1, Corollary 2.1 (without the assertion on the classical action) and Remark 2.3 remain valid if $\Theta$ is replaced by $\Theta + c$, where $c$ is an arbitrary group 2-cocycle with values in $\mathbb{C}$. The choice (2.7), or rather its analog for the quasi-Fuchsian case, will be important in Sect. 4, where we consider the classical action for families of Riemann surfaces. For this purpose, we present an explicit formula for $\Theta$ as a particular antiderivative of the 1-form $u$. Let $p \in \overline{\mathbb{U}}$ be an arbitrary point on the closure of $\mathbb{U}$ in $\hat{\mathbb{C}}$ (nothing will depend on the choice of $p$). Set

\[ l_\gamma(z) = \int_p^z \theta_\gamma[\phi_{hyp}] \quad \text{for all } \gamma \in \Gamma, \tag{2.10} \]


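where the path of integration $P$ connects points $p$ and $z$ and, possibly except $p$, lies entirely in $\mathbb{U}$. If $p \in \mathbb{R}_\infty = \mathbb{R} \cup \{\infty\}$, it is assumed that $P$ is smooth and is not tangent to $\mathbb{R}_\infty$ at $p$. Such paths are called admissible. A 1-form $\vartheta$ on $\mathbb{U}$ is called integrable along the admissible path $P$ with the endpoint $p \in \mathbb{R}_\infty$ if the limit of $\int_{p'}^{z} \vartheta$, as $p' \to p$ along $P$, exists. Similarly, a path $P$ is called $\Gamma$-closed if its endpoints are $p$ and $\gamma p$ for some $\gamma \in \Gamma$, and $P \setminus \{p, \gamma p\} \subset \mathbb{U}$. A $\Gamma$-closed path $P$ with endpoints $p$ and $\gamma p$, $p \in \mathbb{R}_\infty$, is called admissible if it is not tangent to $\mathbb{R}_\infty$ at $p$ and there exists $p' \in P$ such that the translate by $\gamma$ of the part of $P$ between the points $p'$ and $p$ belongs to $P$. A 1-form $\vartheta$ is integrable along the $\Gamma$-closed admissible path $P$ if the limit of $\int_{p'}^{\gamma p'} \vartheta$, as $p' \to p$ along $P$, exists. Let

\[ W = \sum_{k=1}^{g} \left( P_{k-1} \otimes [\alpha_k | \beta_k] - P_k \otimes [\beta_k | \alpha_k] + P_k \otimes [\gamma_k^{-1} | \alpha_k \beta_k] \right) - \sum_{k=1}^{g-1} P_g \otimes [\gamma_g^{-1} \cdots \gamma_{k+1}^{-1} | \gamma_k^{-1}] \in K_{1,2}, \tag{2.11} \]

where $P_k$ is any admissible path from $p$ to $b_k(0)$, $k = 1, \dots, g$, and $P_g = P_0$. Since $P_k(1) = b_k(0) = a_{k+1}(0)$, we have $\partial' W = V - U$, where

\[ U = \sum_{k=1}^{g} \left( p \otimes [\alpha_k | \beta_k] - p \otimes [\beta_k | \alpha_k] + p \otimes [\gamma_k^{-1} | \alpha_k \beta_k] \right) - \sum_{k=1}^{g-1} p \otimes [\gamma_g^{-1} \cdots \gamma_{k+1}^{-1} | \gamma_k^{-1}] \in K_{0,2}. \tag{2.12} \]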

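We have the following statement.

Lemma 2.2. Let $\vartheta \in C^{1,1}$ be a closed 1-form on $\mathbb{U}$ and $p \in \overline{\mathbb{U}}$. In case $p \in \mathbb{R}_\infty$ suppose that $\delta\vartheta$ is integrable along any admissible path with endpoints in $\Gamma \cdot p$ and $\vartheta$ is integrable along any $\Gamma$-closed admissible path with endpoints in $\Gamma \cdot p$. Then

\[ \langle \vartheta, L \rangle = \langle \delta\vartheta, W \rangle + \sum_{k=1}^{g} \left( \int_p^{\alpha_k^{-1} p} \vartheta_{\beta_k} - \int_p^{\beta_k^{-1} p} \vartheta_{\alpha_k} + \int_p^{\gamma_k p} \vartheta_{\alpha_k \beta_k} \right) - \sum_{k=1}^{g-1} \int_p^{\gamma_{k+1} \cdots \gamma_g p} \vartheta_{\gamma_k^{-1}}, \]

where paths of integration are admissible if $p \in \mathbb{R}_\infty$.

Proof. Since $\vartheta_\gamma$ is closed and $\mathbb{U}$ is simply connected, we can define a function $l_\gamma$ on $\mathbb{U}$ by

\[ l_\gamma(z) = \int_p^z \vartheta_\gamma, \]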


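where $p \in \mathbb{U}$. We have, using Stokes' theorem and $d(\delta l) = \delta(dl) = \delta\vartheta$,

\[ \langle \vartheta, L \rangle = \langle dl, L \rangle = \langle l, \partial' L \rangle = \langle l, \partial'' V \rangle = \langle \delta l, V \rangle = \langle \delta l, \partial' W \rangle + \langle \delta l, U \rangle = \langle d(\delta l), W \rangle + \langle \delta l, U \rangle = \langle \delta\vartheta, W \rangle + \langle \delta l, U \rangle. \]

Since

\[ (\delta l)_{\gamma_1, \gamma_2}(p) = \int_p^{\gamma_1^{-1} p} \vartheta_{\gamma_2}, \]

we get the statement of the lemma if $p \in \mathbb{U}$. In case $p \in \mathbb{R}_\infty$, replace $p$ by $p' \in \mathbb{U}$. The conditions of the lemma guarantee the convergence of the integrals as $p' \to p$ along the corresponding paths.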

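Remark 2.4. The expression $\langle \delta l, U \rangle$, which appears in the proof of the lemma, does not depend on the choice of a particular antiderivative of the closed 1-form $\vartheta$. The same statement holds if we only assume that the 1-form $\delta\vartheta$ is integrable along admissible paths with endpoints in $\Gamma \cdot p$, and the 1-form $\vartheta$ has an antiderivative $l$ (not necessarily vanishing at $p$) such that the limit of $(\delta l)_{\gamma_1, \gamma_2}(p')$, as $p' \to p$ along admissible paths, exists.

Lemma 2.3. We have

\[ \Theta_{\gamma_1, \gamma_2}(z) = \int_p^z u_{\gamma_1, \gamma_2} + \eta(p)_{\gamma_1, \gamma_2}, \tag{2.13} \]

where $p \in \mathbb{R} \setminus \Gamma(\infty)$ and integration goes along admissible paths. The integration constants $\eta \in C^{0,2}$ are given by

\[ \eta(p)_{\gamma_1, \gamma_2} = 4\pi i\, \varepsilon(p)_{\gamma_1, \gamma_2} \left( 2 \log 2 + \log |c(\gamma_2)|^2 \right), \tag{2.14} \]

and

\[ \varepsilon(p)_{\gamma_1, \gamma_2} = \begin{cases} 1 & \text{if } p < \gamma_2(\infty) < \gamma_1^{-1} p, \\ -1 & \text{if } p > \gamma_2(\infty) > \gamma_1^{-1} p, \\ 0 & \text{otherwise.} \end{cases} \]

Here for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we set $c(\gamma) = c$.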
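The case distinction in (2.14) is easy to misread, so here is a direct transcription into code. It is a sketch under stated assumptions: group elements are represented as real 4-tuples `(a, b, c, d)` of determinant 1, $p$ is a finite real number not in the orbit of $\infty$, and the matrices in the usage lines are arbitrary elements of $\mathrm{SL}(2,\mathbb{R})$ chosen only to exercise each branch (they are not taken from a Fuchsian group in the text). All function names are hypothetical.

```python
import math

def inverse(g):
    """Inverse of g = (a, b, c, d), assuming det g = 1."""
    a, b, c, d = g
    return (d, -b, -c, a)

def mobius_real(g, x):
    """Action of g on a real point x; math.inf stands for the point at infinity."""
    a, b, c, d = g
    denominator = c * x + d
    return math.inf if denominator == 0 else (a * x + b) / denominator

def image_of_infinity(g):
    """gamma(infinity) = a / c, or infinity when c = 0."""
    a, _, c, _ = g
    return math.inf if c == 0 else a / c

def epsilon(p, g1, g2):
    """Sign factor from Lemma 2.3, comparing p, g2(inf) and g1^{-1} p on the real line."""
    q = image_of_infinity(g2)
    r = mobius_real(inverse(g1), p)
    if p < q < r:
        return 1
    if p > q > r:
        return -1
    return 0

def eta(p, g1, g2):
    """Integration constant (2.14): 4*pi*i * eps * (2 log 2 + log |c(g2)|^2)."""
    eps = epsilon(p, g1, g2)
    if eps == 0:
        return 0j  # also covers c(g2) = 0, where g2(inf) = inf is never strictly between
    c = g2[2]
    return 4j * math.pi * eps * (2.0 * math.log(2.0) + math.log(abs(c) ** 2))

# Illustrative data: g1^{-1}(0) = 3 and g2(inf) = 2, so 0 < 2 < 3.
g1 = (1.0, -3.0, 0.0, 1.0)
g2 = (2.0, 1.0, 1.0, 1.0)
eta_value = eta(0.0, g1, g2)
```

With these choices `epsilon` returns 1 and, since $c(\gamma_2) = 1$, `eta_value` equals $8\pi i \log 2$.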

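Proof. Since

\[ \Theta_{\gamma_1, \gamma_2}(z) = \int_p^z u_{\gamma_1, \gamma_2} + \int_p^{\gamma_1^{-1} p} \theta_{\gamma_2}[\phi_{hyp}], \]

it is sufficient to verify that

\[ \frac{1}{2\pi i} \int_p^{\gamma_1 p} \theta_{\gamma_2^{-1}}[\phi_{hyp}] = \begin{cases} 4 \log 2 + 2 \log |c(\gamma_2)|^2 & \text{if } p < \gamma_2^{-1}(\infty) < \gamma_1 p, \\ -4 \log 2 - 2 \log |c(\gamma_2)|^2 & \text{if } p > \gamma_2^{-1}(\infty) > \gamma_1 p, \\ 0 & \text{otherwise.} \end{cases} \]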


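From (2.8) it follows that $\theta_{\gamma^{-1}}[\phi_{hyp}]$ is a closed 1-form on $\mathbb{U}$, integrable along admissible paths with $p \in \mathbb{R} \setminus \{\gamma^{-1}(\infty)\}$. Denote by $\theta^{(\varepsilon)}_{\gamma^{-1}}$ its restriction to the line $y = \varepsilon > 0$, $z = x + iy$. When $x \neq \gamma_2^{-1}(\infty)$, we obviously have

\[ \lim_{\varepsilon \to 0} \theta^{(\varepsilon)}_{\gamma_2^{-1}} = 0, \]

uniformly in $x$ on compact subsets of $\mathbb{R} \setminus \{\gamma_2^{-1}(\infty)\}$. If $\gamma_2^{-1}(\infty)$ does not lie between the points $p$ and $\gamma_1 p$ on $\mathbb{R}$, we can approximate the path of integration by the interval on the line $y = \varepsilon$, and the integral over it tends to $0$ as $\varepsilon \to 0$. If $\gamma_2^{-1}(\infty)$ lies between the points $p$ and $\gamma_1 p$, we have to go around the point $\gamma_2^{-1}(\infty)$ via a small half-circle, so that

\[ \int_p^{\gamma_1 p} \theta_{\gamma_2^{-1}}[\phi_{hyp}] = \lim_{r \to 0} \int_{C_r} \theta_{\gamma_2^{-1}}[\phi_{hyp}], \]

where $C_r$ is the upper half of the circle of radius $r$ with center at $\gamma_2^{-1}(\infty)$, oriented clockwise if $p < \gamma_2^{-1}(\infty) < \gamma_1 p$. Evaluating the limit using the elementary formula

\[ \int_0^{\pi} \log \sin t \, dt = -\pi \log 2, \]

and the Cauchy theorem, we get the formula.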
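The elementary integral invoked at the end of this proof, $\int_0^\pi \log\sin t\,dt = -\pi\log 2$, can be confirmed numerically. The sketch below uses a plain midpoint rule, which never evaluates the integrand at the endpoints where it has an integrable logarithmic singularity; the number of nodes is an arbitrary illustrative choice.

```python
import math

def log_sin_integral(n=200_000):
    """Midpoint-rule approximation of the integral of log(sin t) over [0, pi]."""
    h = math.pi / n
    return h * sum(math.log(math.sin((k + 0.5) * h)) for k in range(n))

approx = log_sin_integral()
exact = -math.pi * math.log(2.0)
```

Because of the endpoint singularities the midpoint rule converges only at first order in the step here, so the agreement with `exact` improves roughly linearly as `n` grows.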

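Corollary 2.2. The Liouville action functional has the following explicit representation:

\[ S[\phi] = \frac{i}{2} \left( \langle \omega[\phi], F \rangle - \langle \theta[\phi], L \rangle + \langle u, W \rangle + \langle \eta, V \rangle \right). \]

Remark 2.5. Since $\langle \Theta, V \rangle = \langle u, W \rangle + \langle \eta, V \rangle$, it immediately follows from (2.9) that the Liouville action functional does not depend on the choice of point $p \in \mathbb{R} \setminus \Gamma(\infty)$ (actually it is sufficient to assume that $p \neq \gamma_1(\infty), (\gamma_1 \gamma_2)(\infty)$ for all $\gamma_1, \gamma_2 \in \Gamma$ such that $V_{\gamma_1, \gamma_2} \neq 0$). This can also be proved by direct computation using Remark 2.2. Namely, let $p' \in \mathbb{R}_\infty$ be another choice, $p' = \sigma^{-1} p \in \mathbb{R}_\infty$ for some $\sigma \in \mathrm{PSL}(2, \mathbb{R})$. Setting $z = p$ in the equation $(\delta\Theta)_{\sigma, \gamma_1, \gamma_2} = 0$ and using $(\delta u)_{\sigma, \gamma_1, \gamma_2} = 0$, where $\gamma_1, \gamma_2 \in \Gamma$, we get

\[ \int_p^{\sigma^{-1} p} u_{\gamma_1, \gamma_2} = -(\delta\eta(p))_{\sigma, \gamma_1, \gamma_2}, \tag{2.15} \]

where all paths of integration are admissible. Using $\eta(p)_{\sigma\gamma_1, \gamma_2} = \eta(\sigma^{-1} p)_{\gamma_1, \gamma_2} + \eta(p)_{\sigma, \gamma_2}$, we get from (2.15) that

\[ \int_p^z u_{\gamma_1, \gamma_2} + \eta(p)_{\gamma_1, \gamma_2} = \int_{p'}^z u_{\gamma_1, \gamma_2} + \eta(p')_{\gamma_1, \gamma_2} + (\delta\eta_\sigma)_{\gamma_1, \gamma_2}, \]

where $(\eta_\sigma)_\gamma = \eta(p)_{\sigma, \gamma}$ is a constant group 1-cochain. The statement now follows from

\[ \langle \delta\eta_\sigma, V \rangle = \langle \eta_\sigma, \partial'' V \rangle = \langle \eta_\sigma, \partial' L \rangle = \langle d\eta_\sigma, L \rangle = 0. \]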


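Another consequence of Lemmas 2.2 and 2.3 is the following.

Corollary 2.3. Set

\[ \kappa_{\gamma^{-1}} = \frac{\gamma''}{\gamma'}\, dz - \frac{\overline{\gamma''}}{\overline{\gamma'}}\, d\bar z \in C^{1,1}. \]

Then

\[ \langle \kappa, L \rangle = 4\pi i \langle \varepsilon, V \rangle = 4\pi i\, \chi(X), \]

where $\chi(X) = 2 - 2g$ is the Euler characteristic of the Riemann surface $X \simeq \Gamma \backslash \mathbb{U}$.

Proof. Since $\delta\kappa = 0$, the first equation immediately follows from the proofs of Lemmas 2.2 and 2.3. To prove the second equation, observe that

\[ \kappa = \delta\kappa_1, \quad \text{where } \kappa_1 = -\phi_z\, dz + \phi_{\bar z}\, d\bar z \quad \text{and} \quad d\kappa_1 = 2\phi_{z\bar z}\, dz \wedge d\bar z. \]

Therefore

\[ \langle \kappa, L \rangle = \langle \delta\kappa_1, L \rangle = \langle \kappa_1, \partial'' L \rangle = \langle \kappa_1, \partial' F \rangle = \langle d\kappa_1, F \rangle. \]

The Gaussian curvature of the metric $ds^2 = e^{\phi} |dz|^2$ is $K = -2e^{-\phi} \phi_{z\bar z}$, so by Gauss-Bonnet we get

\[ \langle d\kappa_1, F \rangle = 2 \iint_F \phi_{z\bar z}\, dz \wedge d\bar z = 2i \iint_{\Gamma \backslash \mathbb{U}} K e^{\phi}\, d^2 z = 4\pi i\, \chi(X). \]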

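Using this corollary, we can "absorb" the integration constants $\eta$ by shifting $\theta[\phi] \in C^{1,1}$ by a multiple of the closed 1-form $\kappa$. Indeed, the 1-form $\theta[\phi]$ satisfies the equation $\delta\omega[\phi] = d\theta[\phi]$ and is defined up to addition of a closed 1-form. Set

\[ \check\theta_\gamma[\phi] = \theta_\gamma[\phi] - \left( 2 \log 2 + \log |c(\gamma)|^2 \right) \kappa_\gamma, \tag{2.16} \]

and define $\check u = \delta\check\theta[\phi]$. Explicitly,

\[ \check u_{\gamma_1^{-1}, \gamma_2^{-1}} = u_{\gamma_1^{-1}, \gamma_2^{-1}} - \log \frac{|c(\gamma_2)|^2}{|c(\gamma_2 \gamma_1)|^2} \left( \frac{\gamma_2''}{\gamma_2'} \circ \gamma_1\, \gamma_1'\, dz - \frac{\overline{\gamma_2''}}{\overline{\gamma_2'}} \circ \gamma_1\, \overline{\gamma_1'}\, d\bar z \right) + \log \frac{|c(\gamma_2 \gamma_1)|^2}{|c(\gamma_1)|^2} \left( \frac{\gamma_1''}{\gamma_1'}\, dz - \frac{\overline{\gamma_1''}}{\overline{\gamma_1'}}\, d\bar z \right), \tag{2.17} \]

where $u$ is given by (2.6). As follows from Lemma 2.2 and Corollary 2.3,

\[ S[\phi] = \frac{i}{2} \left( \langle \omega[\phi], F \rangle - \langle \check\theta[\phi], L \rangle + \langle \check u, W \rangle \right). \tag{2.18} \]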

The Liouville action functional for the mirror image $\bar X$ is defined similarly. Namely, for every chain $c$ in the upper half-plane $\mathbb{U}$ denote by $\bar c$ its mirror image in the lower half-plane $\mathbb{L}$; the chain $\bar c$ has the opposite orientation to $c$. Set $\bar\Sigma = \bar F + \bar L - \bar V$, so that $\partial\bar\Sigma = 0$. For $\phi \in \mathcal{CM}(\bar X)$, considered as a smooth real-valued function on $\mathbb{L}$ satisfying


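(2.3), define $\omega[\phi] \in C^{2,0}$, $\theta[\phi] \in C^{1,1}$ and $\Theta \in C^{0,2}$ by the same formulas (2.4), (2.5) and (2.7). Lemma 2.3 has an obvious analog for the lower half-plane $\mathbb{L}$; the analog of formula (2.13) for $z \in \mathbb{L}$ is

\[ \Theta_{\gamma_1, \gamma_2}(z) = \int_p^z u_{\gamma_1, \gamma_2} - \eta(p)_{\gamma_1, \gamma_2}, \tag{2.19} \]

where the negative sign comes from the opposite orientation.

Remark 2.6. Similarly to (2.15) we get

\[ \int_p^{\sigma^{-1} p} u_{\gamma_1, \gamma_2} = (\delta\eta(p))_{\sigma, \gamma_1, \gamma_2}, \tag{2.20} \]

where the path of integration, except the endpoints, lies in $\mathbb{L}$. From (2.15) and (2.20) we obtain

\[ \oint_C u_{\gamma_1, \gamma_2} = -2(\delta\eta(p))_{\sigma, \gamma_1, \gamma_2}, \tag{2.21} \]

where the path of integration $C$ is a loop that starts at $p$, goes to $\sigma^{-1} p$ inside $\mathbb{U}$, continues inside $\mathbb{L}$ and ends at $p$. Note that formula (2.21) can also be verified directly using Stokes' theorem. Indeed, the 1-form $u_{\gamma_1, \gamma_2}$ is closed and regular everywhere except at the points $\gamma_1(\infty)$ and $(\gamma_1\gamma_2)(\infty)$. Integrating over small circles around these points if they lie inside $C$ and using (2.14), we get the result.

Set $\Psi[\phi] = \omega[\phi] - \theta[\phi] - \Theta$, so that $D\Psi[\phi] = 0$. The Liouville action functional for $\bar X$ is defined by

\[ S[\phi; \bar X] = -\frac{i}{2} \langle \Psi[\phi], \bar\Sigma \rangle. \]

Using an analog of Lemma 2.2 in the lower half-plane $\mathbb{L}$ and $\langle \eta, \bar V \rangle = \langle \eta, V \rangle$, we obtain

\[ S[\phi; \bar X] = -\frac{i}{2} \left( \langle \omega[\phi], \bar F \rangle - \langle \theta[\phi], \bar L \rangle + \langle u, \bar W \rangle - \langle \eta, V \rangle \right). \]

Finally, we have the following definition.

Definition 2.2. The Liouville action functional $S_\Gamma : \mathcal{CM}(X \sqcup \bar X) \to \mathbb{R}$ for the Fuchsian group $\Gamma$ acting on $\mathbb{U} \cup \mathbb{L}$ is defined by

\[ S_\Gamma[\phi] = S[\phi; X] + S[\phi; \bar X] = \frac{i}{2} \langle \Psi[\phi], \Sigma - \bar\Sigma \rangle = \frac{i}{2} \left( \langle \omega[\phi], F - \bar F \rangle - \langle \theta[\phi], L - \bar L \rangle + \langle u, W - \bar W \rangle + 2 \langle \eta, V \rangle \right), \]

where $\phi \in \mathcal{CM}(X \sqcup \bar X)$.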


The functional $S_\Gamma$ satisfies an obvious analog of Lemma 2.1. Its Euler-Lagrange equation is the Liouville equation, so that its single non-degenerate critical point is the hyperbolic metric on $\mathbb{U} \cup \mathbb{L}$. The corresponding classical action is $8\pi(2g - 2)$, twice the hyperbolic area of $X \sqcup \bar X$. Similarly to (2.18) we have

\[ S_\Gamma[\phi] = \frac{i}{2} \left( \langle \omega[\phi], F - \bar F \rangle - \langle \check\theta[\phi], L - \bar L \rangle + \langle \check u, W - \bar W \rangle \right). \tag{2.22} \]

Remark 2.7. In the definition of $S_\Gamma$ it is not necessary to choose a fundamental domain for $\Gamma$ in $\mathbb{L}$ to be the mirror image of the fundamental domain in $\mathbb{U}$, since the corresponding homology class $[\Sigma - \bar\Sigma]$ does not depend on the choice of the fundamental domain of $\Gamma$ in $\mathbb{U} \cup \mathbb{L}$.

2.3. The quasi-Fuchsian case. Let $\Gamma$ be a marked, normalized, purely loxodromic quasi-Fuchsian group of genus $g > 1$. Its region of discontinuity $\Omega$ has two invariant components $\Omega_1$ and $\Omega_2$ separated by a quasi-circle $\mathcal{C}$. By definition, there exists a quasiconformal homeomorphism $J_1$ of $\hat{\mathbb{C}}$ with the following properties:

QF1 The mapping $J_1$ is holomorphic on $\mathbb{U}$ and $J_1(\mathbb{U}) = \Omega_1$, $J_1(\mathbb{L}) = \Omega_2$, and $J_1(\mathbb{R}_\infty) = \mathcal{C}$.
QF2 The mapping $J_1$ fixes $0$, $1$ and $\infty$.
QF3 The group $\tilde\Gamma = J_1^{-1} \circ \Gamma \circ J_1$ is Fuchsian.

Due to the normalization, any two maps satisfying QF1–QF3 agree on $\mathbb{U}$, so that the group $\tilde\Gamma$ is independent of the choice of the map $J_1$. Setting $X \simeq \tilde\Gamma \backslash \mathbb{U}$, we get $\tilde\Gamma \backslash (\mathbb{U} \cup \mathbb{L}) \simeq X \sqcup \bar X$ and $\Gamma \backslash \Omega \simeq X \sqcup Y$, where $X$ and $Y$ are marked compact Riemann surfaces of genus $g > 1$ with opposite orientations. Conversely, according to Bers' simultaneous uniformization theorem [Ber60], for any pair of marked compact Riemann surfaces $X$ and $Y$ of genus $g > 1$ with opposite orientations there exists a unique, up to conjugation in $\mathrm{PSL}(2, \mathbb{C})$, quasi-Fuchsian group $\Gamma$ such that $\Gamma \backslash \Omega \simeq X \sqcup Y$.

Remark 2.8. It is customary (see, e.g., [Ahl87]) to define quasi-Fuchsian groups by requiring that the map $J_1$ is holomorphic in the lower half-plane $\mathbb{L}$. We will see in Sect. 4 that the above definition is somewhat more convenient.

Let $\mu$ be the Beltrami coefficient for the quasiconformal map $J_1$,

\[ \mu = \frac{(J_1)_{\bar z}}{(J_1)_z}, \]

that is, $J_1 = f^{\mu}$, the unique normalized solution of the Beltrami equation on $\hat{\mathbb{C}}$ with Beltrami coefficient $\mu$. Obviously, $\mu = 0$ on $\mathbb{U}$. Define another Beltrami coefficient $\hat\mu$ by

\[ \hat\mu(z) = \begin{cases} \overline{\mu(\bar z)} & \text{if } z \in \mathbb{U}, \\ \mu(z) & \text{if } z \in \mathbb{L}. \end{cases} \]

Since $\hat\mu$ is symmetric, the normalized solution $f^{\hat\mu}$ of the Beltrami equation

\[ f^{\hat\mu}_{\bar z}(z) = \hat\mu(z)\, f^{\hat\mu}_z(z) \]


is a quasiconformal homeomorphism of $\hat{\mathbb{C}}$ which preserves $\mathbb{U}$ and $\mathbb{L}$. The quasiconformal map $J_2 = J_1 \circ (f^{\hat\mu})^{-1}$ is then conformal on the lower half-plane $\mathbb{L}$ and has properties similar to QF1–QF3. In particular, $J_2^{-1} \circ \Gamma \circ J_2 = \hat\Gamma = f^{\hat\mu} \circ \tilde\Gamma \circ (f^{\hat\mu})^{-1}$ is a Fuchsian group and $\hat\Gamma \backslash \mathbb{L} \simeq Y$. Thus for a given $\Gamma$ the restriction of the map $J_2$ to $\mathbb{L}$ does not depend on the choice of $J_2$ (and hence of $J_1$). These properties can be summarized by the following commutative diagram:

\[
\begin{array}{ccc}
\mathbb{U} \cup \mathbb{R}_\infty \cup \mathbb{L} & \xrightarrow{\; J_1 = f^{\mu} \;} & \Omega_1 \cup \mathcal{C} \cup \Omega_2 \\
\Big\downarrow {\scriptstyle f^{\hat\mu}} & & \Big\uparrow {\scriptstyle J_2} \\
\mathbb{U} \cup \mathbb{R}_\infty \cup \mathbb{L} & \xrightarrow{\;\; = \;\;} & \mathbb{U} \cup \mathbb{R}_\infty \cup \mathbb{L}
\end{array}
\]

where the maps $J_1$, $J_2$ and $f^{\hat\mu}$ intertwine the corresponding pairs of groups $\Gamma$, $\tilde\Gamma$ and $\hat\Gamma$.

2.3.1. Homology construction. The map $J_1$ induces a chain map between the double complexes $K_{\bullet,\bullet} = S_\bullet \otimes_{\mathbb{Z}\Gamma} B_\bullet$ for the pairs $\mathbb{U} \cup \mathbb{L}, \tilde\Gamma$ and $\Omega, \Gamma$, by pushing forward chains $S_\bullet(\mathbb{U} \cup \mathbb{L}) \ni c \mapsto J_1(c) \in S_\bullet(\Omega)$ and group elements $\tilde\Gamma \ni \gamma \mapsto J_1 \circ \gamma \circ J_1^{-1} \in \Gamma$. We will continue to denote this chain map by $J_1$. Obviously, the chain map $J_1$ induces an isomorphism between the homology groups of the corresponding total complexes Tot K.

Let $\Sigma = F + L - V$ be the total cycle of degree 2 representing the fundamental class of $X$ in the total homology complex for the pair $\mathbb{U}, \tilde\Gamma$, constructed in the previous section, and let $\Sigma' = F' + L' - V'$ be the corresponding cycle for $\bar X$. The total cycle $\Sigma(\Gamma)$ of degree 2 representing the fundamental class of $X \sqcup Y$ in the total complex for the pair $\Omega, \Gamma$ can be realized as a push-forward of the total cycle $\Sigma(\tilde\Gamma) = \Sigma - \Sigma'$ by $J_1$,

\[ \Sigma(\Gamma) = J_1(\Sigma(\tilde\Gamma)) = J_1(\Sigma) - J_1(\Sigma'). \]

We will denote the push-forwards by $J_1$ of the chains $F, L, V$ in $\mathbb{U}$ by $F_1, L_1, V_1$, and the push-forwards of the corresponding chains $F', L', V'$ in $\mathbb{L}$ by $F_2, L_2, V_2$, where the indices 1 and 2 refer, respectively, to the domains $\Omega_1$ and $\Omega_2$.

The definition of the chains $W_i$ is more subtle. Namely, the quasi-circle $\mathcal{C}$ is not generally smooth or even rectifiable, so that an arbitrary path from an interior point of $\Omega_i$ to $p \in \mathcal{C}$ inside $\Omega_i$ is not rectifiable either. Thus if we define $W_1$ as a push-forward by $J_1$ of $W$ constructed using arbitrary admissible paths in $\mathbb{U}$, the paths in $W_1$ in general will no longer be rectifiable. The same applies to the push-forward by $J_1$ of the corresponding chain in $\mathbb{L}$. However, the definition of $\langle u, W_1 \rangle$ uses integration of the 1-form $u_{\gamma_1, \gamma_2}$ along the paths in $W_1$, and these paths should be rectifiable in order that $\langle u, W_1 \rangle$ is well-defined. The invariant construction of such paths in $\Omega_i$ is based on the following elegant observation communicated to us by M. Lyubich.

Since the quasi-Fuchsian group $\Gamma$ is normalized, it follows from QF2 that the Fuchsian group $\tilde\Gamma = J_1^{-1} \circ \Gamma \circ J_1$ is also normalized and $\tilde\alpha_1 \in \tilde\Gamma$ is a dilation $\tilde\alpha_1 z = \tilde\lambda z$ with the axis $i\mathbb{R}_{\geq 0}$ and $0 < \tilde\lambda < 1$. The corresponding loxodromic element $\alpha_1 = J_1 \circ \tilde\alpha_1 \circ J_1^{-1} \in \Gamma$ is also a dilation $\alpha_1 z = \lambda z$, where $0 < |\lambda| < 1$. Choose $\tilde z_0 \in i\mathbb{R}_{>0}$ and denote by $\tilde I = [\tilde z_0, 0]$ the interval on $i\mathbb{R}_{\geq 0}$ with endpoints $\tilde z_0$ and $0$, the attracting fixed point of $\tilde\alpha_1$. Set $z_0 = J_1(\tilde z_0)$ and $I = J_1(\tilde I)$. The path $I$ connects the points $z_0 \in \Omega_1$ and $0 = J_1(0) \in \mathcal{C}$ inside $\Omega_1$, is smooth everywhere except the endpoint $0$, and is rectifiable. Indeed, set $\tilde I_0 = [\tilde z_0, \tilde\lambda \tilde z_0] \subset i\mathbb{R}_{>0}$ and cover the interval $\tilde I$ by subintervals $\tilde I_n$ defined


by $\tilde I_{n+1} = \tilde\alpha_1(\tilde I_n)$, $n = 0, 1, \dots$. The corresponding paths $I_n = J_1(\tilde I_n)$ cover the path $I$, and due to the property $I_{n+1} = \alpha_1(I_n)$, which follows from QF3, we have

\[ I = \bigcup_{n=0}^{\infty} \alpha_1^n(I_0). \]

Thus

\[ l(I) = \sum_{n=0}^{\infty} |\lambda^n|\, l(I_0) = \frac{l(I_0)}{1 - |\lambda|} < \infty, \]

where $l(P)$ denotes the Euclidean length of a smooth path $P$. The same construction works for every $p \in \mathcal{C} \setminus \{\infty\}$ which is a fixed point of an element in $\Gamma$, and we define $\Gamma$-contracting paths in $\Omega_1$ at $p$ as follows.

Definition 2.3. A path $P$ connecting points $z \in \Omega_1$ and $p \in \mathcal{C} \setminus \{\infty\}$ inside $\Omega_1$ is called $\Gamma$-contracting in $\Omega_1$ at $p$ if the following conditions are satisfied.

C1 The path $P$ is smooth except at the point $p$.
C2 The point $p$ is a fixed point for $\Gamma$.
C3 There exist $p' \in P$ and an arc $P_0$ on the path $P$ such that the iterates $\gamma^n(P_0)$, $n \in \mathbb{N}$, where $\gamma \in \Gamma$ has $p$ as the attracting fixed point, entirely cover the part of $P$ from the point $p'$ to the point $p$.

As in Sect. 2.2, we define $\Gamma$-closed paths and $\Gamma$-closed contracting paths in $\Omega_1$ at $p$. The definition of $\Gamma$-contracting paths in $\Omega_2$ is analogous. Finally, we define $\Gamma$-contracting paths in $\Omega$ as follows.

Definition 2.4. A path $P$ is called $\Gamma$-contracting in $\Omega$ if $P = P_1 \cup P_2$, where $P_1 \cap P_2 = p \in \mathcal{C}$, and $P_1 \setminus \{p\} \subset \Omega_1$ and $P_2 \setminus \{p\} \subset \Omega_2$ are $\Gamma$-contracting paths at $p$ in the sense of the previous definition.

$\Gamma$-contracting paths are rectifiable.

Lemma 2.4. Let $\Gamma$ and $\Gamma'$ be two marked normalized quasi-Fuchsian groups with regions of discontinuity $\Omega$ and $\Omega'$, and let $f$ be a normalized quasiconformal homeomorphism of $\hat{\mathbb{C}}$ which intertwines $\Gamma$ and $\Gamma'$ and is smooth in $\Omega$. Then the push-forward by $f$ of a $\Gamma$-contracting path in $\Omega$ is a $\Gamma'$-contracting path in $\Omega'$.

Proof. Obvious: if $p$ is the attracting fixed point for $\gamma \in \Gamma$, then $p' = f(p)$ is the attracting fixed point for $\gamma' = f \circ \gamma \circ f^{-1} \in \Gamma'$.
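The rectifiability estimate $l(I) = l(I_0)/(1-|\lambda|)$ for the path $I$ covered by the iterates $\alpha_1^n(I_0)$ rests only on the fact that the dilation $z \mapsto \lambda z$ scales Euclidean lengths by $|\lambda|$. A toy numerical illustration follows, under the simplifying assumption that the arc $I_0$ is the straight segment from $z_0$ to $\lambda z_0$ (in the text $I_0 = J_1(\tilde I_0)$ is a general smooth arc); the values of `lam` and `z0` are arbitrary.

```python
import cmath

lam = 0.6 * cmath.exp(0.9j)   # illustrative multiplier with 0 < |lam| < 1
z0 = 1.0 + 0.5j               # illustrative starting point

# Length of the initial arc I_0, modeled here as the segment [z0, lam * z0].
initial_length = abs(lam * z0 - z0)

# The n-th iterate alpha_1^n(I_0) is the segment [lam^n z0, lam^(n+1) z0].
partial_sum = sum(
    abs(lam ** (n + 1) * z0 - lam ** n * z0) for n in range(200)
)

# Closed form of the geometric series: l(I_0) / (1 - |lam|).
closed_form = initial_length / (1.0 - abs(lam))
```

The polygonal path spirals into the fixed point $0$ when $\lambda$ is not real, yet its total length is finite and matches `closed_form`.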

Now define a chain $W$ for the Fuchsian group $\tilde\Gamma$ by first connecting the points $P_1(1), \dots, P_g(1)$ to some point $\tilde z_0 \in i\mathbb{R}_{>0}$ by smooth paths inside $\mathbb{U}$ and then connecting this point to $0$ by $\tilde I$. The chain $W'$ in $\mathbb{L}$ is defined similarly. Setting $W_1 = J_1(W)$ and $W_2 = J_1(W')$, we see that the chain $W_1 - W_2$ in $\Omega$ consists of $\Gamma$-contracting paths in $\Omega$ at $0$. Connecting $P_1(1), \dots, P_g(1)$ to $0$ by arbitrary $\Gamma$-contracting paths at $0$ results in 1-chains which are homotopic to the 1-chains $W_1$ and $W_2$ in the components $\Omega_1$ and $\Omega_2$ respectively. Finally, we define the chain $U_1 = U_2$ as the push-forward by $J_1$ of the corresponding chain $U = U'$ with $p = 0$.


2.3.2. Cohomology construction. Let $\mathcal{CM}(X \sqcup Y)$ be the space of all conformal metrics $ds^2 = e^{\phi} |dz|^2$ on $X \sqcup Y$, which we will always identify with the affine space of smooth real-valued functions $\phi$ on $\Omega$ satisfying (2.3). For $\phi \in \mathcal{CM}(X \sqcup Y)$ we define the cochains $\omega[\phi]$, $\theta[\phi]$, $u$, $\eta$ and $\Theta$ in the total cohomology complex Tot C for the pair $\Omega, \Gamma$ by the same formulas (2.4), (2.5), (2.6), (2.14) and (2.13), (2.19) as in the Fuchsian case, where $p = 0 \in \mathcal{C}$, integration goes over $\Gamma$-contracting paths at $0$, and $\tilde\gamma \in \tilde\Gamma$ are replaced by $\gamma = J_1 \circ \tilde\gamma \circ J_1^{-1} \in \Gamma$. The ordering of points on $\mathcal{C}$ used in the definition (2.14) of the constants of integration $\eta_{\gamma_1, \gamma_2}$ is defined by the orientation of $\mathcal{C}$.

Remark 2.9. Since the 1-form $u$ is closed and regular in $\Omega_1 \cup \Omega_2$, it follows from Stokes' theorem that in the definition (2.13) and (2.19) of the cochain $\Theta \in C^{0,2}$ we can use any rectifiable path from $z$ to $0$ inside $\Omega_1$ and $\Omega_2$ respectively.

As opposed to the Fuchsian case, we can no longer guarantee that the cochain $\omega[\phi] - \theta[\phi] - \Theta$ is a 2-cocycle in the total cohomology complex Tot C. Indeed, we have, using $\delta u = 0$,

\[ (\delta\Theta)_{\gamma_1, \gamma_2, \gamma_3}(z) = \begin{cases} \displaystyle\int_{P_1} u_{\gamma_2, \gamma_3} + (\delta\eta)_{\gamma_1, \gamma_2, \gamma_3} = (d_1)_{\gamma_1, \gamma_2, \gamma_3} & \text{if } z \in \Omega_1, \\[2ex] \displaystyle\int_{P_2} u_{\gamma_2, \gamma_3} - (\delta\eta)_{\gamma_1, \gamma_2, \gamma_3} = (d_2)_{\gamma_1, \gamma_2, \gamma_3} & \text{if } z \in \Omega_2, \end{cases} \tag{2.23} \]

where the paths of integration $P_1$ and $P_2$ are $\Gamma$-closed contracting paths connecting the points $0$ and $\gamma_1^{-1}(0)$ inside $\Omega_1$ and $\Omega_2$ respectively. Since the analog of Lemma 2.3 does not hold in the quasi-Fuchsian case, we cannot conclude that $d_1 = d_2 = 0$. However, $d_1, d_2 \in C^{0,3}$ are $z$-independent group 3-cocycles and

\[ (d_1 - d_2)_{\gamma_1, \gamma_2, \gamma_3} = \oint_C u_{\gamma_2, \gamma_3} + 2(\delta\eta)_{\gamma_1, \gamma_2, \gamma_3}, \tag{2.24} \]

where $C = P_1 - P_2$ is a loop that starts at $0$, goes to $\gamma_1^{-1}(0)$ inside $\Omega_1$, continues inside $\Omega_2$ and ends at $0$. In the Fuchsian case we have Eq. (2.21), which can be derived using Stokes' theorem (see Remark 2.6). The same derivation repeats verbatim for the quasi-Fuchsian case, and we get

\[ \oint_C u_{\gamma_2, \gamma_3} = -2(\delta\eta)_{\gamma_1, \gamma_2, \gamma_3}, \]

so that $d_1 = d_2$. Since $H^3(\Gamma, \mathbb{C}) = 0$, there exists a constant 2-cochain $\kappa$ such that $\delta\kappa = -d_1 = -d_2$. Then $\Theta + \kappa$ is a group 2-cocycle, that is, $\delta(\Theta + \kappa) = 0$. As a result, we obtain that

\[ \Psi[\phi] = \omega[\phi] - \theta[\phi] - \Theta - \kappa \in (\mathrm{Tot}\, C)^2 \]

is a 2-cocycle in the total cohomology complex Tot C for the pair $\Omega, \Gamma$, that is, $D\Psi[\phi] = 0$.

Remark 2.10. The map $J_1$ induces a cochain map between the double cohomology complexes Tot C for the pairs $\mathbb{U} \cup \mathbb{L}, \tilde\Gamma$ and $\Omega, \Gamma$, by pulling back cochains and group elements,

\[ (J_1 \cdot \varpi)_{\tilde\gamma_1, \dots, \tilde\gamma_q} = J_1^* \varpi_{\gamma_1, \dots, \gamma_q} \in C^{p,q}(\mathbb{U} \cup \mathbb{L}), \]


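where $\varpi \in C^{p,q}(\Omega)$ and $\tilde\gamma = J_1^{-1} \circ \gamma \circ J_1$. This cochain map induces an isomorphism of the cohomology groups of the corresponding total complexes Tot C. The map $J_1$ also induces a natural isomorphism between the affine spaces $\mathcal{CM}(X \sqcup Y)$ and $\mathcal{CM}(X \sqcup \bar X)$,

\[ J_1 \cdot \phi = \phi \circ J_1 + \log |(J_1)_z|^2 \in \mathcal{CM}(X \sqcup \bar X), \]

where $\phi \in \mathcal{CM}(X \sqcup Y)$. However,

\[ |(J_1 \cdot \phi)_z|^2\, dz \wedge d\bar z \neq J_1^* \left( |\phi_z|^2\, dz \wedge d\bar z \right), \]

and the cochains $\omega[\phi]$, $\theta[\phi]$, $u$ and $\Theta$ for the pair $\Omega, \Gamma$ are not pull-backs of the cochains for the pair $\mathbb{U} \cup \mathbb{L}, \tilde\Gamma$ corresponding to $J_1 \cdot \phi \in \mathcal{CM}(X \sqcup \bar X)$.

2.3.3. The Liouville action functional. The discussion in the previous section justifies the following definition.

Definition 2.5. The Liouville action functional $S_\Gamma : \mathcal{CM}(X \sqcup Y) \to \mathbb{R}$ for the quasi-Fuchsian group $\Gamma$ is defined by

\[ S_\Gamma[\phi] = \frac{i}{2} \langle \Psi[\phi], \Sigma(\Gamma) \rangle = \frac{i}{2} \langle \Psi[\phi], \Sigma_1 - \Sigma_2 \rangle = \frac{i}{2} \left( \langle \omega[\phi], F_1 - F_2 \rangle - \langle \theta[\phi], L_1 - L_2 \rangle + \langle \Theta + \kappa, V_1 - V_2 \rangle \right), \]

where $\phi \in \mathcal{CM}(X \sqcup Y)$.

Remark 2.11. Since $\Psi[\phi]$ is a total 2-cocycle, the Liouville action functional $S_\Gamma$ does not depend on the choice of fundamental domain for $\Gamma$ in $\Omega$, i.e., on the choice of fundamental domains $F_1$ and $F_2$ for $\Gamma$ in $\Omega_1$ and $\Omega_2$. In particular, if $\Sigma_1$ and $\Sigma_2$ are push-forwards by the map $J_1$ of the total cycle $\Sigma$ and its mirror image $\bar\Sigma$, then $\langle \kappa, V_1 - V_2 \rangle = 0$ and we have

\[ S_\Gamma[\phi] = \frac{i}{2} \left( \langle \omega[\phi], F_1 - F_2 \rangle - \langle \theta[\phi], L_1 - L_2 \rangle + \langle u, W_1 - W_2 \rangle + 2 \langle \eta, V_1 \rangle \right). \tag{2.25} \]

In general, the constant group 2-cocycle $\kappa$ drops out from the definition for any choice of fundamental domains $F_1$ and $F_2$ which is associated with the same marking of $\Gamma$, i.e., when the same choice of standard generators $\alpha_1, \dots, \alpha_g, \beta_1, \dots, \beta_g$ is used both in $\Omega_1$ and in $\Omega_2$. Indeed, in this case $V_1$ and $V_2$ have the same $B_2(\mathbb{Z}\Gamma)$-structure and $\langle \kappa, V_1 - V_2 \rangle = 0$. Moreover, since the 1-form $u$ is closed and regular in $\Omega_1 \cup \Omega_2$, we can use arbitrary rectifiable paths with endpoint $0$ inside $\Omega_1$ and $\Omega_2$ in the definition of the chains $W_1$ and $W_2$ respectively.

Remark 2.12. We can also define the chains $W_1$ and $W_2$ by using $\Gamma$-contracting paths at any $\Gamma$-fixed point $p \in \mathcal{C} \setminus \{\infty\}$. As in Remark 2.5 it is easy to show that

\[ \langle \Theta, V_1 - V_2 \rangle = \langle u, W_1 - W_2 \rangle + 2 \langle \eta, V_1 \rangle \]

does not depend on the choice of a fixed point $p$.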


As in the Fuchsian case, the Euler-Lagrange equation for the functional $S_\Gamma$ is the Liouville equation and the hyperbolic metric $e^{\phi_{hyp}} |dz|^2$ on $\Omega$ is its single non-degenerate critical point. It is explicitly given by

\[ e^{\phi_{hyp}(z)} = \frac{|(J_i^{-1})'(z)|^2}{(\mathrm{Im}\, J_i^{-1}(z))^2} \quad \text{if } z \in \Omega_i, \ i = 1, 2. \tag{2.26} \]

Remark 2.13. The corresponding classical action $S_\Gamma[\phi_{hyp}]$ is no longer twice the hyperbolic area of $X \sqcup Y$, as it was in the Fuchsian case, but rather depends non-trivially on $\Gamma$. This is due to the fact that in the quasi-Fuchsian case the (1, 1)-form $\omega[\phi_{hyp}]$ on $\Omega$ is not a (1, 1)-tensor for $\Gamma$, as it was in the Fuchsian case.

Similarly to (2.22) we have

\[ S_\Gamma[\phi] = \frac{i}{2} \left( \langle \omega[\phi], F_1 - F_2 \rangle - \langle \check\theta[\phi], L_1 - L_2 \rangle + \langle \check u, W_1 - W_2 \rangle \right), \tag{2.27} \]

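where $F_1$ and $F_2$ are fundamental domains for the marked group $\Gamma$ in $\Omega_1$ and $\Omega_2$ respectively.

3. Deformation Theory

3.1. The deformation space. Here we collect the basic facts from the deformation theory of Kleinian groups (see, e.g., [Ahl87, Ber70, Ber71, Kra72b]). Let $\Gamma$ be a non-elementary, finitely generated, purely loxodromic Kleinian group, let $\Omega$ be its region of discontinuity, and let $\Lambda = \hat{\mathbb{C}} \setminus \Omega$ be its limit set. The deformation space $D(\Gamma)$ is defined as follows. Let $A^{-1,1}(\Gamma)$ be the space of Beltrami differentials for $\Gamma$, that is, the Banach space of $\mu \in L^\infty(\mathbb{C})$ satisfying

\[ \mu(\gamma(z))\, \frac{\overline{\gamma'(z)}}{\gamma'(z)} = \mu(z) \quad \text{for all } \gamma \in \Gamma, \]

and $\mu|_\Lambda = 0$. Denote by $B^{-1,1}(\Gamma)$ the open unit ball in $A^{-1,1}(\Gamma)$ with respect to the $\|\cdot\|_\infty$ norm,

\[ \|\mu\|_\infty = \sup_{z \in \mathbb{C}} |\mu(z)| < 1. \]

For each Beltrami coefficient $\mu \in B^{-1,1}(\Gamma)$ there exists a unique homeomorphism $f^{\mu} : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ satisfying the Beltrami equation

\[ f^{\mu}_{\bar z} = \mu f^{\mu}_z \]

and fixing the points $0$, $1$ and $\infty$. Set $\Gamma^{\mu} = f^{\mu} \circ \Gamma \circ (f^{\mu})^{-1}$ and define

\[ D(\Gamma) = B^{-1,1}(\Gamma) / \sim, \]

where $\mu \sim \nu$ if and only if $f^{\mu} = f^{\nu}$ on $\Lambda$, which is equivalent to the condition $f^{\mu} \circ \gamma \circ (f^{\mu})^{-1} = f^{\nu} \circ \gamma \circ (f^{\nu})^{-1}$ for all $\gamma \in \Gamma$.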


Similarly, if $\Delta$ is a union of invariant components of $\Gamma$, the deformation space $D(\Gamma, \Delta)$ is defined using Beltrami coefficients supported on $\Delta$. By the Ahlfors finiteness theorem, $\Omega$ has finitely many non-equivalent components $\Omega_1, \dots, \Omega_n$. Let $\Gamma_i$ be the stabilizer subgroup of the component $\Omega_i$, $\Gamma_i = \{\gamma \in \Gamma \mid \gamma(\Omega_i) = \Omega_i\}$, and let $X_i \simeq \Gamma_i \backslash \Omega_i$ be the corresponding compact Riemann surface of genus $g_i > 1$, $i = 1, \dots, n$. The decomposition

\[ \Gamma \backslash \Omega = \Gamma_1 \backslash \Omega_1 \sqcup \cdots \sqcup \Gamma_n \backslash \Omega_n \]

establishes the isomorphism [Kra72b]

\[ D(\Gamma) \simeq D(\Gamma_1, \Omega_1) \times \cdots \times D(\Gamma_n, \Omega_n). \]

Remark 3.1. When $\Gamma$ is a purely hyperbolic Fuchsian group of genus $g > 1$, $D(\Gamma, \mathbb{U}) = T(\Gamma)$, the Teichmüller space of $\Gamma$. Every conformal bijection $\Gamma \backslash \mathbb{U} \to X$ establishes an isomorphism between $T(\Gamma)$ and $T(X)$, the Teichmüller space of the marked Riemann surface $X$. Similarly, $D(\Gamma, \mathbb{L}) = \bar T(\Gamma)$, the mirror image of $T(\Gamma)$, which is the complex manifold complex conjugate to $T(\Gamma)$. Correspondingly, $\Gamma \backslash \mathbb{L} \to \bar X$ establishes the isomorphism $\bar T(\Gamma) \simeq T(\bar X)$, so that

\[ D(\Gamma) \simeq T(X) \times T(\bar X). \]

The deformation space $D(\Gamma)$ is "twice larger" than the Teichmüller space $T(\Gamma)$ because its definition uses all Beltrami coefficients $\mu$ for $\Gamma$, and not only those satisfying the reflection property $\mu(\bar z) = \overline{\mu(z)}$ used in the definition of $T(\Gamma)$.

The deformation space $D(\Gamma)$ has a natural structure of a complex manifold, explicitly described as follows (see, e.g., [Ahl87]). Let $H^{-1,1}(\Gamma)$ be the Hilbert space of Beltrami differentials for $\Gamma$ with the following scalar product:

\[ (\mu_1, \mu_2) = \iint_{\Gamma \backslash \Omega} \mu_1 \bar\mu_2\, \rho = \iint_{\Gamma \backslash \Omega} \mu_1(z) \overline{\mu_2(z)}\, \rho(z)\, d^2 z, \tag{3.1} \]

where $\mu_1, \mu_2 \in H^{-1,1}(\Gamma)$ and $\rho = e^{\phi_{hyp}}$ is the density of the hyperbolic metric on $\Gamma \backslash \Omega$. Denote by $\Omega^{-1,1}(\Gamma)$ the finite-dimensional subspace of harmonic Beltrami differentials with respect to the hyperbolic metric. It consists of $\mu \in H^{-1,1}(\Gamma)$ satisfying

\[ \partial_z(\rho\mu) = 0. \]

The complex vector space $\Omega^{-1,1}(\Gamma)$ is identified with the holomorphic tangent space to $D(\Gamma)$ at the origin. Choose a basis $\mu_1, \dots, \mu_d$ for $\Omega^{-1,1}(\Gamma)$, let $\mu = \varepsilon_1 \mu_1 + \cdots + \varepsilon_d \mu_d$, and let $f^{\mu}$ be the normalized solution of the Beltrami equation. Then the correspondence $(\varepsilon_1, \dots, \varepsilon_d) \mapsto \Gamma^{\mu} = f^{\mu} \circ \Gamma \circ (f^{\mu})^{-1}$ defines complex coordinates in a neighborhood of the origin in $D(\Gamma)$, called Bers coordinates. The holomorphic cotangent space to $D(\Gamma)$ at the origin can be naturally identified with the vector space $\Omega^{2,0}(\Gamma)$ of holomorphic quadratic differentials, that is, holomorphic functions $q$ on $\Omega$ satisfying

\[ q(\gamma z)\, \gamma'(z)^2 = q(z) \quad \text{for all } \gamma \in \Gamma. \]


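The pairing between the holomorphic cotangent and tangent spaces to $D(\Gamma)$ at the origin is given by

\[ q(\mu) = \iint_{\Gamma \backslash \Omega} q\mu = \iint_{\Gamma \backslash \Omega} q(z)\mu(z)\, d^2 z. \]

There is a natural isomorphism $\Phi^{\mu}$ between the deformation spaces $D(\Gamma)$ and $D(\Gamma^{\mu})$, which maps $\Gamma^{\nu} \in D(\Gamma)$ to $(\Gamma^{\mu})^{\lambda} \in D(\Gamma^{\mu})$, where, in accordance with $f^{\nu} = f^{\lambda} \circ f^{\mu}$,

\[ \lambda = \left( \frac{\nu - \mu}{1 - \nu\bar\mu}\, \frac{f^{\mu}_z}{\overline{f^{\mu}_z}} \right) \circ (f^{\mu})^{-1}. \]

The isomorphism $\Phi^{\mu}$ allows us to identify the holomorphic tangent space to $D(\Gamma)$ at $\Gamma^{\mu}$ with the complex vector space $\Omega^{-1,1}(\Gamma^{\mu})$, and the holomorphic cotangent space to $D(\Gamma)$ at $\Gamma^{\mu}$ with the complex vector space $\Omega^{2,0}(\Gamma^{\mu})$. It also allows us to introduce the Bers coordinates in the neighborhood of $\Gamma^{\mu}$ in $D(\Gamma)$, and to show directly that these coordinates transform complex-analytically. For the de Rham differential $d$ on $D(\Gamma)$ we denote by $d = \partial + \bar\partial$ the decomposition into (1, 0) and (0, 1) components.

The differential of the isomorphism $\Phi^{\mu} : D(\Gamma) \simeq D(\Gamma^{\mu})$ at $\nu = \mu$ is given by the linear map $D^{\mu} : \Omega^{-1,1}(\Gamma) \to \Omega^{-1,1}(\Gamma^{\mu})$,

\[ \nu \mapsto D^{\mu}\nu = P_{-1,1} \left[ \left( \frac{\nu}{1 - |\mu|^2}\, \frac{f^{\mu}_z}{\overline{f^{\mu}_z}} \right) \circ (f^{\mu})^{-1} \right], \]

where $P_{-1,1}$ is the orthogonal projection from $H^{-1,1}(\Gamma^{\mu})$ to $\Omega^{-1,1}(\Gamma^{\mu})$. The map $D^{\mu}$ allows us to extend a tangent vector $\nu$ at the origin of $D(\Gamma)$ to a local vector field $\partial/\partial\varepsilon_\nu$ on the coordinate neighborhood of the origin,

\[ \left. \frac{\partial}{\partial\varepsilon_\nu} \right|_{\Gamma^{\mu}} = D^{\mu}\nu \in \Omega^{-1,1}(\Gamma^{\mu}). \]

The scalar product (3.1) in $\Omega^{-1,1}(\Gamma^{\mu})$ defines a Hermitian metric on the deformation space $D(\Gamma)$. This metric is called the Weil-Petersson metric and it is Kähler. We denote its symplectic form by $\omega_{WP}$,

\[ \omega_{WP} \left( \frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\bar\varepsilon_\nu} \right) \bigg|_{\Gamma^{\lambda}} = \frac{i}{2} \left( D^{\lambda}\mu, D^{\lambda}\nu \right), \quad \mu, \nu \in \Omega^{-1,1}(\Gamma). \]

3.2. Variational formulas. Here we collect the necessary variational formulas. Let $l$ and $m$ be integers. A tensor of type $(l, m)$ for $\Gamma$ is a $C^\infty$-function $\omega$ on $\Omega$ satisfying

\[ \omega(\gamma z)\, \gamma'(z)^l\, \overline{\gamma'(z)}^{\,m} = \omega(z) \quad \text{for all } \gamma \in \Gamma. \]

Let $\omega^{\varepsilon}$ be a smooth family of tensors of type $(l, m)$ for $\Gamma^{\varepsilon\mu}$, where $\mu \in \Omega^{-1,1}(\Gamma)$ and $\varepsilon \in \mathbb{C}$ is sufficiently small. Set

\[ (f^{\varepsilon\mu})^*(\omega^{\varepsilon}) = \omega^{\varepsilon} \circ f^{\varepsilon\mu}\, (f^{\varepsilon\mu}_z)^l\, (\overline{f^{\varepsilon\mu}_z})^m, \]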


L.A. Takhtajan, L.-P. Teo

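which is a tensor of type $(l, m)$ for $\Gamma$ – a pull-back of the tensor $\omega^{\varepsilon}$ by $f^{\varepsilon\mu}$. The Lie derivatives of the family $\omega^{\varepsilon}$ along the vector fields $\partial/\partial\varepsilon_\mu$ and $\partial/\partial\bar\varepsilon_\mu$ are defined in the standard way,
\[ L_\mu\omega = \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}(f^{\varepsilon\mu})^*(\omega^{\varepsilon}) \quad\text{and}\quad L_{\bar\mu}\omega = \frac{\partial}{\partial\bar\varepsilon}\bigg|_{\varepsilon=0}(f^{\varepsilon\mu})^*(\omega^{\varepsilon}). \]
When $\omega$ is a function on $D(\Gamma)$ – a tensor of type $(0, 0)$, Lie derivatives reduce to directional derivatives
\[ L_\mu\omega = \partial\omega(\mu) \quad\text{and}\quad L_{\bar\mu}\omega = \bar\partial\omega(\bar\mu) \]
– the evaluation of 1-forms $\partial\omega$ and $\bar\partial\omega$ on tangent vectors $\mu$ and $\bar\mu$. For the Lie derivatives of vector fields $\nu^{\mu} = D^{\mu}\nu$ we get [Wol86] that $L_\mu\nu = 0$ and $L_{\bar\mu}\nu$ is orthogonal to $\Omega^{-1,1}(\Gamma)$. In other words,
\[ \left[\frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\varepsilon_\nu}\right] = \left[\frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\bar\varepsilon_\nu}\right] = 0 \]
at the point $\Gamma$ in $D(\Gamma)$. For every $\Gamma^{\mu} \in D(\Gamma)$, the density $\rho^{\mu}$ of the hyperbolic metric on $\Omega^{\mu}$ is a $(1, 1)$-tensor for $\Gamma^{\mu}$. Lie derivatives of the smooth family of $(1, 1)$-tensors $\rho$ parameterized by $D(\Gamma)$ are given by the following lemma of Ahlfors.

Lemma 3.1. For every $\mu \in \Omega^{-1,1}(\Gamma)$,
\[ L_\mu\rho = L_{\bar\mu}\rho = 0. \]
Proof. Let $\Omega_1, \dots, \Omega_n$ be the maximal set of non-equivalent components of $\Omega$ and let $\Gamma_1, \dots, \Gamma_n$ be the corresponding stabilizer groups,
\[ \Gamma\backslash\Omega = \Gamma_1\backslash\Omega_1 \sqcup \cdots \sqcup \Gamma_n\backslash\Omega_n \simeq X_1 \sqcup \cdots \sqcup X_n. \]
For every $\Omega_i$ denote by $J_i : U \to \Omega_i$ the corresponding covering map and by $\tilde\Gamma_i$ – the Fuchsian model of group $\Gamma_i$, characterized by the condition $\tilde\Gamma_i\backslash U \simeq \Gamma_i\backslash\Omega_i \simeq X_i$ (see, e.g., [Kra72b]). Let $\mu \in \Omega^{-1,1}(\Gamma)$. For every component $\Omega_i$ the quasiconformal map $f^{\varepsilon\mu}$ gives rise to the following commutative diagram:
\[ \begin{CD} U @>F^{\varepsilon\hat\mu_i}>> U \\ @VJ_iVV @VVJ_i^{\varepsilon\mu}V \\ \Omega_i @>f^{\varepsilon\mu}>> \Omega_i^{\varepsilon\mu} \end{CD} \tag{3.2} \]
where $F^{\varepsilon\hat\mu_i}$ is the normalized quasiconformal homeomorphism of $U$ with Beltrami differential $\hat\mu_i = J_i^*\mu$ for the Fuchsian group $\tilde\Gamma_i$. Let $\hat\rho$ be the density of the hyperbolic metric on $U$; it satisfies $\hat\rho = J_i^*\rho$, where $\rho$ is the density of the hyperbolic metric on $\Omega_i$. Therefore, the Beltrami differential $\hat\mu_i$ is harmonic with respect to the hyperbolic metric on $U$. It follows from the commutativity of the diagram that
\[ (f^{\varepsilon\mu})^*\rho^{\varepsilon\mu} = \left((J_i^{\varepsilon\mu})^{-1}\circ f^{\varepsilon\mu}\right)^*\hat\rho = \left(F^{\varepsilon\hat\mu_i}\circ J_i^{-1}\right)^*\hat\rho = (J_i^{-1})^*(F^{\varepsilon\hat\mu_i})^*\hat\rho. \]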


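Now the assertion of the lemma reduces to
\[ \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}(F^{\varepsilon\hat\mu_i})^*\hat\rho = 0, \]
which is the classical result of Ahlfors [Ahl61].

Set
\[ \dot f = \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0} f^{\varepsilon\mu}, \]
then
\[ \dot f(z) = -\frac{1}{\pi}\iint_{\mathbb{C}} \frac{z(z-1)\,\mu(w)}{(w-z)\,w\,(w-1)}\, d^2w. \tag{3.3} \]
We have $\dot f_{\bar z} = \mu$ and also
\[ \frac{\partial}{\partial\bar\varepsilon}\bigg|_{\varepsilon=0} f^{\varepsilon\mu} = 0. \]
As it follows from Ahlfors lemma,
\[ \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\left(\rho^{\varepsilon\mu}\circ f^{\varepsilon\mu}\,|f^{\varepsilon\mu}_z|^2\right) = 0. \]
Using $\rho = e^{\phi_{hyp}}$ and the fact that $f^{\varepsilon\mu}$ depends holomorphically on $\varepsilon$, we get
\[ \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\left(\phi^{\varepsilon\mu}_{hyp}\circ f^{\varepsilon\mu}\right) = -\dot f_z. \tag{3.4} \]
Differentiation with respect to $z$ and $\bar z$ yields
\[ \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\left((\phi^{\varepsilon\mu}_{hyp})_z\circ f^{\varepsilon\mu}\, f^{\varepsilon\mu}_z\right) = -\dot f_{zz}, \tag{3.5} \]
and
\[ \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\left((\phi^{\varepsilon\mu}_{hyp})_{\bar z}\circ f^{\varepsilon\mu}\, \bar f^{\varepsilon\mu}_{\bar z}\right) = -\left((\phi_{hyp})_z\,\dot f_{\bar z} + \dot f_{\bar z z}\right). \tag{3.6} \]
For $\gamma \in \Gamma$ set $\gamma^{\varepsilon\mu} = f^{\varepsilon\mu}\circ\gamma\circ(f^{\varepsilon\mu})^{-1} \in \Gamma^{\varepsilon\mu}$. We have
\[ (\gamma^{\varepsilon\mu})'\circ f^{\varepsilon\mu}\, f^{\varepsilon\mu}_z = f^{\varepsilon\mu}_z\circ\gamma\;\gamma', \]
and
\[ \log|(\gamma^{\varepsilon\mu})'\circ f^{\varepsilon\mu}|^2 + \log|f^{\varepsilon\mu}_z|^2 = \log|f^{\varepsilon\mu}_z\circ\gamma|^2 + \log|\gamma'|^2. \]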


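Therefore
\[ \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\log|(\gamma^{\varepsilon\mu})'\circ f^{\varepsilon\mu}|^2 = \dot f_z\circ\gamma - \dot f_z, \tag{3.7} \]
and, differentiating with respect to $z$,
\[ \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\left(\frac{(\gamma^{\varepsilon\mu})''}{(\gamma^{\varepsilon\mu})'}\circ f^{\varepsilon\mu}\, f^{\varepsilon\mu}_z\right) = \dot f_{zz}\circ\gamma\;\gamma' - \dot f_{zz}. \tag{3.8} \]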
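As a reader's check (a sketch, not part of the original argument), formula (3.7) can be recovered from the logarithmic identity preceding it. The only input assumed here is the first-order expansion $f^{\varepsilon\mu}_z = 1 + \varepsilon\dot f_z + o(|\varepsilon|)$, which uses $f^{0} = \mathrm{id}$ and the vanishing of the $\bar\varepsilon$-derivative of $f^{\varepsilon\mu}$ at $\varepsilon = 0$.

```latex
\begin{align*}
\log|f^{\varepsilon\mu}_z|^2
  &= \log f^{\varepsilon\mu}_z + \log\overline{f^{\varepsilon\mu}_z}
   = \varepsilon\,\dot f_z + \bar\varepsilon\,\overline{\dot f_z} + o(|\varepsilon|),\\
\frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\log|f^{\varepsilon\mu}_z|^2
  &= \dot f_z,
\qquad
\frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\log|f^{\varepsilon\mu}_z\circ\gamma|^2
   = \dot f_z\circ\gamma,\\
\frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}
  \log|(\gamma^{\varepsilon\mu})'\circ f^{\varepsilon\mu}|^2
  &= \dot f_z\circ\gamma - \dot f_z,
\end{align*}
% the term \log|\gamma'|^2 does not depend on \varepsilon and drops out.
```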

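Denote by
\[ \mathcal{S}(h) = \left(\frac{h_{zz}}{h_z}\right)_z - \frac{1}{2}\left(\frac{h_{zz}}{h_z}\right)^2 = \frac{h_{zzz}}{h_z} - \frac{3}{2}\left(\frac{h_{zz}}{h_z}\right)^2 \]
the Schwarzian derivative of the function $h$.

Lemma 3.2. Set
\[ \dot\gamma = \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0}\gamma^{\varepsilon\mu}, \quad \gamma \in \Gamma. \]
Then for all $\gamma \in \Gamma$,
\[ \dot f_z\circ\gamma - \dot f_z = \frac{\dot f\circ\gamma\;\gamma''}{(\gamma')^2} + \left(\frac{\dot\gamma}{\gamma'}\right)', \tag{i} \]
and is well-defined on the limit set $\Lambda$. Also we have
\[ \dot f_{\bar z z}\circ\gamma\;\overline{\gamma'} - \dot f_{\bar z z} = \frac{\gamma''}{\gamma'}\,\dot f_{\bar z}, \tag{ii} \]
\[ \dot f_{zz}\circ\gamma\;\gamma' - \dot f_{zz} = \frac{1}{2}\,\frac{\gamma''}{\gamma'}\left(\dot f_z\circ\gamma + \dot f_z\right) - \frac{2\dot c}{cz+d}, \quad\text{for all } \gamma \in \Gamma. \tag{iii} \]
Proof. To prove formula (i), consider the equation
\[ \dot f\circ\gamma = \dot\gamma + \gamma'\dot f, \tag{3.9} \]
which follows from $\gamma^{\varepsilon\mu}\circ f^{\varepsilon\mu} = f^{\varepsilon\mu}\circ\gamma$. Differentiating with respect to $z$ gives (i). Since $\dot f$ is a homeomorphism of $\hat{\mathbb{C}}$ and $\dot\gamma/\gamma'$ is a quadratic polynomial in $z$, formula (i) shows that $\dot f_z\circ\gamma - \dot f_z$ is well-defined on $\Lambda$. The formula (ii) immediately follows from $\dot f_{\bar z} = \mu$ and
\[ \mu\circ\gamma\,\frac{\overline{\gamma'}}{\gamma'} = \mu, \quad \gamma \in \Gamma. \]
To derive formula (iii), twice differentiating (3.9) with respect to $z$ we obtain
\[ \dot f_z\circ\gamma\;\gamma' = \dot\gamma' + \gamma''\dot f + \gamma'\dot f_z, \]
and
\[ \gamma'\left(\dot f_{zz}\circ\gamma\;\gamma' - \dot f_{zz}\right) = \dot\gamma'' + \gamma'''\dot f + 2\gamma''\dot f_z - \dot f_z\circ\gamma\;\gamma''. \]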

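Since
\[ \gamma''' = \frac{3}{2}\,\frac{(\gamma'')^2}{\gamma'}, \]
as it follows from $\mathcal{S}(\gamma) = 0$, we can eliminate $\gamma''\dot f$ from the two formulas above and obtain
\[ \dot f_{zz}\circ\gamma\;\gamma' - \dot f_{zz} = \frac{1}{2}\,\frac{\gamma''}{\gamma'}\left(\dot f_z\circ\gamma + \dot f_z\right) + \frac{\dot\gamma''}{\gamma'} - \frac{3}{2}\,\frac{\gamma''\dot\gamma'}{(\gamma')^2}. \]
Using $2c = -\gamma''/(\gamma')^{3/2}$, we see that the last two terms in this equation are equal to $-2\dot c/(cz+d)$, which proves the lemma.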
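The two Möbius identities used at the end of this proof can be verified by direct differentiation. The following worked computation is a reader's sketch; it assumes the normalization $\gamma(z) = (az+b)/(cz+d)$ with $ad - bc = 1$, and writes $\dot c, \dot d$ for the $\varepsilon$-derivatives of the entries of $\gamma^{\varepsilon\mu}$ at $\varepsilon = 0$ (notation introduced here for illustration).

```latex
\begin{align*}
\gamma' &= (cz+d)^{-2}, \qquad
\gamma'' = -2c\,(cz+d)^{-3}, \qquad
\gamma''' = 6c^2 (cz+d)^{-4},\\
\frac{\gamma'''}{\gamma'} &= \frac{6c^2}{(cz+d)^2}
  = \frac{3}{2}\left(\frac{\gamma''}{\gamma'}\right)^{2},
\qquad
-\frac{\gamma''}{(\gamma')^{3/2}} = 2c,\\
\dot\gamma' &= -2(\dot c z+\dot d)(cz+d)^{-3}, \qquad
\dot\gamma'' = -2\dot c\,(cz+d)^{-3} + 6c\,(\dot c z+\dot d)(cz+d)^{-4},\\
\frac{\dot\gamma''}{\gamma'} - \frac{3}{2}\,\frac{\gamma''\dot\gamma'}{(\gamma')^{2}}
  &= -\frac{2\dot c}{cz+d} + \frac{6c(\dot c z+\dot d)}{(cz+d)^{2}}
     - \frac{6c(\dot c z+\dot d)}{(cz+d)^{2}}
   = -\frac{2\dot c}{cz+d}.
\end{align*}
```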

Finally, we present the following formulas by Ahlfors [Ahl61]. Let $F^{\varepsilon\hat\mu}$ be the quasiconformal homeomorphism of $U$ with Beltrami differential $\hat\mu$ for the Fuchsian group $\Gamma$. If $\hat\mu$ is harmonic on $U$ with respect to the hyperbolic metric, then
\[ \frac{\partial}{\partial\varepsilon}\bigg|_{\varepsilon=0} F^{\varepsilon\hat\mu}_{zzz}(z) = 0, \tag{3.10} \]
\[ \frac{\partial}{\partial\bar\varepsilon}\bigg|_{\varepsilon=0} F^{\varepsilon\hat\mu}_{zzz}(z) = -\frac{1}{2}\,\hat\rho\,\overline{\hat\mu(z)}. \tag{3.11} \]

4. Variation of the Classical Action

4.1. Classical action. Let $\Gamma$ be a marked, normalized, purely loxodromic quasi-Fuchsian group of genus $g > 1$ with region of discontinuity $\Omega = \Omega_1 \cup \Omega_2$, let $X \sqcup Y \simeq \Gamma\backslash\Omega$ be corresponding marked Riemann surfaces with opposite orientations and let $D(\Gamma) \simeq D(\Gamma, \Omega_1)\times D(\Gamma, \Omega_2)$ be the deformation space of $\Gamma$. Spaces $D(\Gamma, \Omega_1)$ and $D(\Gamma, \Omega_2)$ are isomorphic to the Teichmüller spaces $T(X)$ and $T(Y)$ – they are their quasi-Fuchsian models which use Bers' simultaneous uniformization of the varying Riemann surface in $T(X)$ and fixed $Y$ and, respectively, fixed $X$ and the varying Riemann surface in $T(Y)$. Therefore,
\[ D(\Gamma) \simeq T(X)\times T(Y). \tag{4.1} \]
Denote by $P(\Gamma) \to D(\Gamma)$ the corresponding affine bundle of projective connections, modeled over the holomorphic cotangent bundle of $D(\Gamma)$. We have
\[ P(\Gamma) \simeq P(X)\times P(Y). \tag{4.2} \]
For every $\Gamma^{\mu} \in D(\Gamma)$ denote by $S_{\Gamma^{\mu}} = S_{\Gamma^{\mu}}[\phi_{hyp}]$ the classical Liouville action. It follows from the results in Sect. 2.3.3 that $S_{\Gamma^{\mu}}$ gives rise to a well-defined real-valued function $S$ on $D(\Gamma)$. Indeed, if $\mu \sim \nu$, then the corresponding total cycles $f^{\mu}(\Sigma(\Gamma))$ and $f^{\nu}(\Sigma(\Gamma))$ represent the same class in the total homology complex $\mathrm{Tot}\,K$ for the pair $\Omega^{\mu}, \Gamma^{\mu}$, so that
\[ \left\langle \Psi[\phi_{hyp}], f^{\mu}(\Sigma(\Gamma))\right\rangle = \left\langle \Psi[\phi_{hyp}], f^{\nu}(\Sigma(\Gamma))\right\rangle. \]


Moreover, real-analytic dependence of solutions of the Beltrami equation on parameters ensures that the classical action $S$ is a real-analytic function on $D(\Gamma)$. To every $\Gamma' \in D(\Gamma)$ with the region of discontinuity $\Omega'$ there corresponds a pair of marked Riemann surfaces $X'$ and $Y'$ simultaneously uniformized by $\Gamma'$, $X' \sqcup Y' \simeq \Gamma'\backslash\Omega'$. Set $S(X', Y') = S_{\Gamma'}$ and denote by $S_Y$ and $S_X$ restrictions of the function $S : D(\Gamma) \to \mathbb{R}$ onto $T(X)$ and $T(Y)$ respectively. Let $\iota$ be the complex conjugation and let $\bar\Gamma = \iota(\Gamma)$ be the quasi-Fuchsian group complex conjugated to $\Gamma$. The correspondence $\mu \mapsto \iota\circ\mu\circ\iota$ establishes the complex-analytic anti-isomorphism
\[ D(\Gamma) \simeq D(\bar\Gamma) \simeq T(\bar Y)\times T(\bar X). \]
The classical Liouville action has the symmetry property
\[ S(X', Y') = S(\bar Y', \bar X'). \tag{4.3} \]
For every $\phi \in CM(\Gamma\backslash\Omega)$ set $\vartheta[\phi] = 2\phi_{zz} - \phi_z^2$. It follows from the Liouville equation that $\vartheta = \vartheta[\phi_{hyp}] \in \Omega^{2,0}(\Gamma)$, i.e., is a holomorphic quadratic differential for $\Gamma$. It follows from (2.26) that
\[ \vartheta(z) = \begin{cases} 2\,\mathcal{S}(J_1^{-1})(z) & \text{if } z \in \Omega_1,\\ 2\,\mathcal{S}(J_2^{-1})(z) & \text{if } z \in \Omega_2. \end{cases} \tag{4.4} \]
Define a $(1, 0)$-form $\vartheta$ on the deformation space $D(\Gamma)$ by assigning to every $\Gamma' \in D(\Gamma)$ the corresponding $\vartheta[\phi'_{hyp}] \in \Omega^{2,0}(\Gamma')$ – a vector in the holomorphic cotangent space to $D(\Gamma)$ at $\Gamma'$. For every $\Gamma' \in D(\Gamma)$ let $P_F$ and $P_{QF}$ be Fuchsian and quasi-Fuchsian projective connections on $X' \sqcup Y' \simeq \Gamma'\backslash\Omega'$, defined by the coverings $\pi_F : U \cup L \to X' \sqcup Y'$ and $\pi_{QF} : \Omega'_1 \cup \Omega'_2 \to X' \sqcup Y'$ respectively. We will continue to denote corresponding sections of the affine bundle $P(\Gamma) \to D(\Gamma)$ by $P_F$ and $P_{QF}$ respectively. The difference $P_F - P_{QF}$ is a $(1, 0)$-form on $D(\Gamma)$.

Lemma 4.1. On the deformation space $D(\Gamma)$,
\[ \vartheta = 2(P_F - P_{QF}). \]
Proof. Consider the following commutative diagram:
\[ \begin{CD} U \cup L @>J>> \Omega_1 \cup \Omega_2 \\ @V\pi_FVV @VV\pi_{QF}V \\ X \sqcup Y @>=>> X \sqcup Y, \end{CD} \]
where the covering map $J$ is equal to the map $J_1$ on the component $U$ and to the map $J_2$ on the component $L$. As explained in the Introduction, $P_F = \mathcal{S}(\pi_F^{-1})$ and $P_{QF} = \mathcal{S}(\pi_{QF}^{-1})$, and it follows from the property SD1 and commutativity of the diagram that
\[ \left(\mathcal{S}(\pi_F^{-1}) - \mathcal{S}(\pi_{QF}^{-1})\right)\circ\pi_{QF}\,(\pi'_{QF})^2 = \mathcal{S}(J^{-1}). \]


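4.2. First variation. Here we compute the $(1, 0)$-form $\partial S$ on $D(\Gamma)$.

Theorem 4.1. On the deformation space $D(\Gamma)$,
\[ \partial S = 2(P_F - P_{QF}). \]
Proof. It is sufficient to prove that for every $\mu \in \Omega^{-1,1}(\Gamma)$
\[ L_\mu S = \vartheta(\mu) = \iint_{\Gamma\backslash\Omega} \vartheta\mu. \tag{4.5} \]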

Indeed, using the isomorphism ν : D() → D( ν ), it is easy to see that the variation formula (4.5) is valid at every point ν ∈ D() if it is valid at the origin. The actual computation of Lµ S is quite similar to that in [ZT87b] for the case of Schottky groups, with the clarifying role of a homological algebra. Let ˜ be the Fuchsian group corresponding to and let = F + L − V be the corresponding total cycle of degree 2 representing the fundamental class of X in ˜ As in Sect. 2.3.1, set () = J1 ( − ). ¯ The the total complex Tot K for the pair U, . corresponding total cycle for the pair εµ , εµ = f εµ ◦ ◦ (f εµ )−1 can be chosen as

( εµ ) = f εµ ( ()). According to Remark 2.11, ! i εµ φhyp , f εµ ( ()) . S εµ = 2 Moreover, as it follows from Lemma 2.4, we can choose εµ -contracting at 0 paths of εµ εµ integration in the definition of εµ or, equivalently, paths in the definition of W1 −W2 , εµ to be the push-forwards by f ofthe corresponding -contracting at 0 paths. Denoting εµ εµ εµ εµ ˇ ω = ω φhyp , θ = θˇ φhyp , and using (2.27) we have S εµ =

i εµ εµ εµ εµ εµ εµ εµ ω , F1 − F2 − θˇ εµ , L1 − L2 + uˇ εµ , W1 − W2 . 2

Changing variables and formally differentiating under the integral sign in the term εµ εµ uˇ εµ , W1 − W2 , we obtain

∂

Lµ S = S εµ ∂ε ε=0 i = ˇ W1 − W2 . Lµ ω, F1 − F2 − Lµ θˇ , L1 − L2 + Lµ u, 2 We will justify this formula at the end of the proof. Here we observe that though ωεµ , θˇ εµ and uˇ εµ are not tensors for εµ , they are differential forms on εµ so that their Lie derivatives are given by the same formulas as in Sect. 3.2. Using Ahlfors lemma and formulas (3.4)–(3.6), we get

Lµ ω = − φhyp z¯ f˙zz + φhyp z φhyp z f˙z¯ + f˙z¯z dz ∧ d z¯ = ϑµ dz ∧ d z¯ − dξ, where

ξ = 2 φhyp z f˙z¯ d z¯ − φhyp d f˙z .

(4.6)


Since ϑµ is a (1, 1)-tensor for , δ(ϑµ dz ∧ d z¯ ) = 0, so that δLµ ω = −δdξ . We have dξ, F1 − F2 = ξ, ∂ (F1 − F2 ) = ξ, ∂ (L1 − L2 ) = δξ, L1 − L2 . Set χ = δξ + Lµ θˇ . The 1-form χ on is closed, dχ = δ(dξ ) + Lµ d θˇ = δ(−Lµ ω) + Lµ δω = 0, and satisfies δχ = δ(Lµ θˇ + δξ ) = Lµ δ θˇ = Lµ u. ˇ Using (3.4), (3.7), (3.8) and part (ii) of Lemma 3.2, we get

γ ˙ γ γ ˙ ˙ f dz + f dz + φ ◦ γ γ − f d z ¯ + f˙z d z¯ hyp zz zz z¯ γ γ γ γ

1 dz − log |γ |2 f˙zz ◦ γ γ − f˙zz dz − f˙z ◦ γ − f˙z + 2 γ γ

2γ ˙ − log |γ | fz¯ d z¯ + f˙z ◦ γ − f˙z d z¯ γ γ γ c(γ ˙ ) γ − log |c(γ )|2 + 2 log 2 d f˙z ◦ γ − f˙z − dz − d z¯ c(γ ) γ γ

γ γ 1 2 ˙ ˙ ˙ ˙ = − fz dz + fz ◦ γ d z¯ − d log |γ | fz ◦ γ − fz γ 2 γ

+ φhyp d f˙z ◦ γ − f˙z − log |c(γ )|2 + 2 log 2 d f˙z ◦ γ − f˙z c(γ ˙ ) γ γ − dz − d z¯ . c(γ ) γ γ

Lµ θˇγ −1 = − f˙z

Using δξγ −1 = −2

γ ˙ fz¯ d z¯ − φhyp d f˙z ◦ γ − f˙z + log |γ |2 d f˙z ◦ γ , γ

we get

χγ −1

1 log |γ |2 f˙z ◦ γ + f˙z − log |c(γ )|2 + 2 log 2 f˙z ◦ γ − f˙z =d 2

γ γ γ γ c(γ ˙ ) − f˙z ◦ γ + f˙z dz − 2 f˙z¯ d z¯ − dz − d z¯ . γ γ c(γ ) γ γ

Using parts (ii) and (iii) of Lemma 3.2 and −

2c˙ c˙ γ (z), = cz + d c γ


we finally obtain 1 c(γ ˙ ) 2 ˙ ˙ =d log |γ | fz ◦ γ + fz + 2 2 c(γ ) − log |c(γ )|2 + 2 + 2 log 2 f˙z ◦ γ − f˙z

χγ −1

= dlγ −1 . We have dξ, F1 − F2 + Lµ θˇ , L1 − L2 = χ , L1 − L2 = dl, L1 − L2 = l, ∂ (L1 − L2 ) = l, ∂ (V1 − V2 ) = δl, V1 − V2 . Using Lµ uˇ = dδl we get Lµ u, ˇ W1 − W2 = δl, ∂ (W1 − W2 ) = δl, V1 − V2 so that Lµ S =

i ϑµ dz ∧ d z¯ , F1 − F2 , 2

as asserted. Finally, we justify the differentiation under the integral sign. Set lγ = lγ(0) + lγ(1) , where (0)

c(γ ˙ ) log |γ |2 − log |c(γ )|2 + 2 + 2 log 2 (f˙z ◦ γ − f˙z ), c(γ )

1 = log |γ |2 f˙z ◦ γ + f˙z . 2

lγ −1 = (1)

lγ −1

(0)

Next, we use part (i) of Lemma 3.2. According to it, the function lγ is continuous on C \ {γ (∞)}. Since 1 (δl (1) )γ −1 ,γ −1 = log |γ2 ◦ γ1 |2 (f˙z ◦ γ1 − f˙z ) − log |γ1 |2 (f˙z ◦ γ2 γ1 − f˙z ◦ γ1 ) , 1 2 2 we also conclude that (δl (1) )γ1 ,γ2 , and hence the function (δl)γ1 ,γ2 , are continuous on (n) (n) C \{γ1 (∞), (γ1 γ2 )(∞)}. Now let W1 W1 and W2 W2 be a sequence of 1-chains in 1 and 2 obtained from W1 and W2 by “cutting” -contracting at 0 paths at points pn ∈ 1 and pn ∈ 2 , where pn , pn → 0 as n → ∞. Clearly, S = lim Sn , n→∞

where Sn =

i (n) (n) ω, F1 − F2 − θˇ , L1 − L2 + u, ˇ W1 − W2 . 2


Our previous arguments show that i ϑµ dz ∧ d z¯ , F1 − F2 − (δl)(pn ), U1 + (δl)(pn ), U2 . 2 Since the function δl is continuous at p = 0 and U1 = U2 , we get Lµ S n =

i ϑµ, F1 − F2 . 2 Moreover, the convergence is uniform in some neighborhood of in D(), since f εµ is holomorphic at ε = 0. Thus lim Lµ Sn =

n→∞

Lµ S = lim Lµ Sn , n→∞

which completes the proof.

For fixed Riemann surface $Y$ denote by $P_F$ and $P_{QF}$ sections of $P(X) \to T(X)$ corresponding to the Fuchsian uniformization of $X' \in T(X)$ and to the simultaneous uniformization of $X' \in T(X)$ and $Y$ respectively.

Corollary 4.1. On the Teichmüller space $T(X)$,
\[ P_F - P_{QF} = \frac{1}{2}\,\partial S_Y. \]
Remark 4.1. Conversely, Theorem 4.1 follows from Corollary 4.1 and the symmetry property (4.3).

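Remark 4.2. In the Fuchsian case the maps $J_1$ and $J_2$ are identities and a similar computation shows that $\vartheta = 0$, in accordance with $S = 8\pi(2g-2)$ being a constant function on $T(X)\times T(\bar X)$.

4.3. Second variation. Here we compute $d\vartheta = \bar\partial\vartheta$. First, we have the following statement.

Lemma 4.2. The quasi-Fuchsian projective connection $P_{QF}$ is a holomorphic section of the affine bundle $P(\Gamma) \to D(\Gamma)$.

Proof. Consider the following commutative diagram
\[ \begin{CD} \Omega @>f^{\varepsilon\mu}>> \Omega^{\varepsilon\mu} \\ @V\pi_{QF}VV @VV\pi^{\varepsilon\mu}_{QF}V \\ X \sqcup Y @>F^{\varepsilon\mu}>> X^{\varepsilon\mu} \sqcup Y^{\varepsilon\mu} \end{CD} \]
where $\mu \in \Omega^{-1,1}(\Gamma)$. We have
\[ \mathcal{S}\left((\pi^{\varepsilon\mu}_{QF})^{-1}\right)\circ F^{\varepsilon\mu}\,(F^{\varepsilon\mu}_z)^2 + \mathcal{S}(F^{\varepsilon\mu}) = \mathcal{S}(f^{\varepsilon\mu})\circ\pi_{QF}^{-1}\,\left((\pi_{QF}^{-1})_z\right)^2 + \mathcal{S}(\pi_{QF}^{-1}). \]
Since $f^{\varepsilon\mu}$ and, obviously, $F^{\varepsilon\mu}$ are holomorphic at $\varepsilon = 0$, we get
\[ \frac{\partial}{\partial\bar\varepsilon}\bigg|_{\varepsilon=0}\mathcal{S}\left((\pi^{\varepsilon\mu}_{QF})^{-1}\right) = 0. \]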


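Using Corollary 4.1, Lemma 4.3 and the result [ZT87b]
\[ \bar\partial P_F = -i\,\omega_{WP}, \]
which follows from (1.2) since $P_S$ is a holomorphic section of $\mathcal{P}_g \to \mathfrak{S}_g$, we immediately get

Corollary 4.2. For fixed $Y$
\[ \partial\bar\partial S_Y = -2\bar\partial(P_F - P_{QF}) = -2d(P_F - P_{QF}) = 2i\,\omega_{WP}, \]
so that $-S_Y$ is a Kähler potential for the Weil-Petersson metric on $T(X)$.

Remark 4.3. The equation $d(P_F - P_{QF}) = -i\,\omega_{WP}$ was first proved in [McM00] and was used for the proof that moduli spaces are Kähler hyperbolic (note that the symplectic form $\omega_{WP}$ used there is twice the one we are using here, and there is a missing factor 1/2 in the computation in [McM00]). Specifically, the Kraus-Nehari inequality asserts that $P_F - P_{QF}$ is a bounded antiderivative of $-i\,\omega_{WP}$ with respect to Teichmüller and Weil-Petersson metrics [McM00]. In this regard, it is interesting to estimate the Kähler potential $S_Y$ on $T(X)$. From the basic inequality of the distortion theorem (see, e.g., [Dur83])
\[ \left|\frac{h''(z)}{h'(z)} - \frac{2\bar z}{1-|z|^2}\right| \le \frac{4}{1-|z|^2}, \]
where $h$ is a univalent function in the unit disk, we immediately get
\[ |(\phi_{hyp})_z|^2 \le 4e^{\phi_{hyp}}, \]
so that the bulk term in $S_Y$ is bounded on $T(X)$ by $20\pi(2g-2)$. It can also be shown that other terms in $S_Y$ have at most "linear growth" on $T(X)$, in accordance with the boundedness of $\partial S_Y$.

The following result follows from Corollary 4.2 and the symmetry property (4.3). For completeness, we give its proof in the form that is generalized verbatim to Kleinian groups.

Theorem 4.2. The following formula holds on $D(\Gamma)$,
\[ d\vartheta = \bar\partial\partial S = -2i\,\omega_{WP}, \]
so that $-S$ is a Kähler potential of the Weil-Petersson metric on $D(\Gamma)$.

Proof. Let $\mu, \nu \in \Omega^{-1,1}(\Gamma)$. First, using the Cartan formula, we get
\begin{align*}
d\vartheta\left(\frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\varepsilon_\nu}\right) &= L_\mu(\vartheta(\nu)) - L_\nu(\vartheta(\mu)) - \vartheta\left(\left[\frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\varepsilon_\nu}\right]\right)\\
&= L_\mu(L_\nu S) - L_\nu(L_\mu S) = 0,
\end{align*}
which just manifests that $\partial^2 = 0$. On the other hand,
\begin{align*}
d\vartheta\left(\frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\bar\varepsilon_\nu}\right) &= L_\mu(\vartheta(\bar\nu)) - L_{\bar\nu}(\vartheta(\mu)) - \vartheta\left(\left[\frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\bar\varepsilon_\nu}\right]\right)\\
&= -L_{\bar\nu}\iint_{\Gamma\backslash\Omega}\vartheta\mu = -\iint_{\Gamma\backslash\Omega}(L_{\bar\nu}\vartheta)\mu,
\end{align*}
since $\vartheta$ is a $(1, 0)$-form. The computation of $L_{\bar\nu}\vartheta$ repeats verbatim the one given in [ZT87b]. Namely, consider the commutative diagram (3.2) with $i = 1, 2$, and, for brevity, omit the index $i$. Since $(J^{\varepsilon\nu})^{-1}\circ f^{\varepsilon\nu} = F^{\varepsilon\hat\nu}\circ J^{-1}$, the property SD1 of the Schwarzian derivative (applicable when at least one of the functions is holomorphic) yields
\[ \mathcal{S}\left((J^{\varepsilon\nu})^{-1}\right)\circ f^{\varepsilon\nu}\,(f^{\varepsilon\nu}_z)^2 + \mathcal{S}(f^{\varepsilon\nu}) = \mathcal{S}(F^{\varepsilon\hat\nu})\circ J^{-1}\,(J^{-1}_z)^2 + \mathcal{S}(J^{-1}). \tag{4.7} \]
We obtain
\begin{align*}
\frac{\partial}{\partial\bar\varepsilon_\nu}\bigg|_{\varepsilon=0}\mathcal{S}\left((J^{\varepsilon\nu})^{-1}\right)\circ f^{\varepsilon\nu}\,(f^{\varepsilon\nu}_z)^2 &= \frac{\partial}{\partial\bar\varepsilon_\nu}\bigg|_{\varepsilon=0}\mathcal{S}(F^{\varepsilon\hat\nu})\circ J^{-1}\,(J^{-1}_z)^2\\
&= \left(\frac{\partial}{\partial\bar\varepsilon_\nu}\bigg|_{\varepsilon=0}F^{\varepsilon\hat\nu}_{zzz}\right)\circ J^{-1}\,(J^{-1}_z)^2\\
&= -\frac{1}{2}\,\rho\,\overline{\nu(z)},
\end{align*}
where in the last line we have used Ahlfors formula (3.11). Finally,
\[ d\vartheta\left(\frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\bar\varepsilon_\nu}\right) = \iint_{\Gamma\backslash\Omega}\mu\bar\nu\rho = -2i\,\omega_{WP}\left(\frac{\partial}{\partial\varepsilon_\mu}, \frac{\partial}{\partial\bar\varepsilon_\nu}\right). \]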

4.4. Quasi-Fuchsian reciprocity. The existence of the function $S$ on the deformation space $D(\Gamma)$ satisfying the statement of Theorem 4.1 is a global form of quasi-Fuchsian reciprocity. Quasi-Fuchsian reciprocity of McMullen [McM00] follows from it as an immediate corollary. Let $\mu, \nu \in \Omega^{-1,1}(\Gamma)$ be such that $\mu$ vanishes outside $\Omega_1$ and $\nu$ – outside $\Omega_2$, so that Lie derivatives $L_\mu$ and $L_\nu$ stand for the variation of $X$ for fixed $Y$ and variation of $Y$ for fixed $X$ respectively.

Theorem 4.3. (McMullen's quasi-Fuchsian reciprocity).
\[ \iint_X \left(L_\nu\mathcal{S}(J_1^{-1})\right)\mu = \iint_Y \left(L_\mu\mathcal{S}(J_2^{-1})\right)\nu. \]
Proof. Immediately follows from Theorem 4.1, since
\[ L_\nu L_\mu S = 2\iint_X \left(L_\nu\mathcal{S}(J_1^{-1})\right)\mu, \qquad L_\mu L_\nu S = 2\iint_Y \left(L_\mu\mathcal{S}(J_2^{-1})\right)\nu, \]
and $[L_\mu, L_\nu] = 0$.

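In [McM00], quasi-Fuchsian reciprocity was used to prove that $d(P_F - P_{QF}) = -i\,\omega_{WP}$. For completeness, we give here another proof of this result using the earlier approach in [ZT87a], which admits generalization to other deformation spaces.

Proposition 4.1. On the deformation space $D(\Gamma)$,
\[ \partial\vartheta = 0. \]
Proof. Using the same identity (4.7) which follows from the commutative diagram (3.2), we have
\begin{align*}
\frac{\partial}{\partial\varepsilon_\nu}\bigg|_{\varepsilon=0}\mathcal{S}\left((J^{\varepsilon\nu})^{-1}\right)\circ f^{\varepsilon\nu}\,(f^{\varepsilon\nu}_z)^2 &= \frac{\partial}{\partial\varepsilon_\nu}\bigg|_{\varepsilon=0}\mathcal{S}(F^{\varepsilon\hat\nu})\circ J^{-1}\,(J^{-1}_z)^2 - \frac{\partial}{\partial\varepsilon_\nu}\bigg|_{\varepsilon=0}\mathcal{S}(f^{\varepsilon\nu})\\
&= \left(\frac{\partial}{\partial\varepsilon_\nu}\bigg|_{\varepsilon=0}F^{\varepsilon\hat\nu}_{zzz}\right)\circ J^{-1}\,(J^{-1}_z)^2 - \frac{\partial}{\partial\varepsilon_\nu}\bigg|_{\varepsilon=0}f^{\varepsilon\nu}_{zzz},
\end{align*}
where we replaced $\mu$ by $\nu$ and omit index $i = 1, 2$. Differentiating (3.3) three times with respect to $z$ we get
\[ \frac{\partial}{\partial\varepsilon_\nu}\bigg|_{\varepsilon=0}f^{\varepsilon\nu}_{zzz}(z) = -\frac{6}{\pi}\iint_{\mathbb{C}}\frac{\nu(w)}{(z-w)^4}\,d^2w = -\frac{6}{\pi}\iint_{\Gamma\backslash\Omega}K(z, w)\,\nu(w)\,d^2w, \tag{4.8} \]
where
\[ K(z, w) = \sum_{\gamma\in\Gamma}\frac{\gamma'(w)^2}{(z - \gamma w)^4}. \]
It is well-known that for harmonic $\nu$ the integral in (4.8) is understood in the principal value sense (as $\lim_{\delta\to 0}$ of the integral over $\mathbb{C}\setminus\{|w - z| \le \delta\}$). Therefore, using Ahlfors formula (3.10) we obtain
\[ (L_\nu\vartheta)(z) = \frac{12}{\pi}\iint_{\Gamma\backslash\Omega}K(z, w)\,\nu(w)\,d^2w, \]
and
\begin{align*}
\partial\vartheta(\mu, \nu) &= L_\mu\vartheta(\nu) - L_\nu\vartheta(\mu)\\
&= \iint_{\Gamma\backslash\Omega}(L_\mu\vartheta)(z)\,\nu(z)\,d^2z - \iint_{\Gamma\backslash\Omega}(L_\nu\vartheta)(w)\,\mu(w)\,d^2w = 0,
\end{align*}
since kernel $K(z, w)$ is obviously symmetric in $z$ and $w$, $K(z, w) = K(w, z)$.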
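The step "differentiating (3.3) three times" in this proof can be made explicit by a partial-fraction decomposition of the kernel of (3.3) in the variable $z$. This is a reader's sketch and takes the differentiation under the integral sign for granted, in the principal value sense mentioned in the proof.

```latex
\begin{align*}
z(z-1) &= (z-w)(z+w-1) + w(w-1),\\
\frac{z(z-1)}{(w-z)\,w\,(w-1)} &= \frac{1}{w-z} - \frac{z+w-1}{w(w-1)},\\
\frac{\partial^3}{\partial z^3}\,\frac{z(z-1)}{(w-z)\,w\,(w-1)}
  &= \frac{\partial^3}{\partial z^3}\,\frac{1}{w-z} = \frac{6}{(w-z)^4} = \frac{6}{(z-w)^4},\\
\dot f_{zzz}(z) &= -\frac{6}{\pi}\iint_{\mathbb{C}}\frac{\nu(w)}{(z-w)^4}\,d^2w .
\end{align*}
% The term linear in z is annihilated by the third derivative,
% which is why only the 1/(w-z) part contributes to (4.8).
```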


5. Holography

Let $\Gamma$ be a marked, normalized, purely loxodromic quasi-Fuchsian group of genus $g > 1$. The group $\Gamma \subset \mathrm{PSL}(2, \mathbb{C})$ acts on the closure $\overline{U}^3 = U^3 \cup \hat{\mathbb{C}}$ of the hyperbolic 3-space $U^3 = \{Z = (x, y, t) \in \mathbb{R}^3 \mid t > 0\}$. The action is discontinuous on $U^3 \cup \Omega$ and $M = \Gamma\backslash(U^3 \cup \Omega)$ is a hyperbolic 3-manifold, compact in the relative topology of $\overline{U}^3$, with the boundary $X \sqcup Y \simeq \Gamma\backslash\Omega$. According to the holography principle, the on-shell gravity theory on $M$, given by the Einstein-Hilbert action functional with the cosmological term, is equivalent to the "off-shell" gravity theory on its boundary $X \sqcup Y$, given by the Liouville action functional. Here we give a precise mathematical formulation of this principle.

5.1. Homology and cohomology set-up. We start by generalizing homological algebra methods in Sect. 2 to the three-dimensional case.

5.1.1. Homology computation. Denote by $S_\bullet \equiv S_\bullet(U^3 \cup \Omega)$ the standard singular chain complex of $U^3 \cup \Omega$, and let $R$ be a fundamental region of $\Gamma$ in $U^3 \cup \Omega$ such that $R \cap \Omega$ is the fundamental domain $F = F_1 - F_2$ for the group $\Gamma$ in $\Omega$ (see Sect. 2). To have a better picture, consider first the case when $\Gamma$ is a Fuchsian group. Then $R$ is a region in $\overline{U}^3$ bounded by the hemispheres which intersect $\hat{\mathbb{C}}$ along the circles that are orthogonal to $\mathbb{R}$ and bound the fundamental domain $F$ (see Sect. 2.2.1). The fundamental region $R$ is a three-dimensional CW-complex with a single 3-cell given by the interior of $R$. The 2-cells – the faces $D_k$, $D'_k$, $E_k$ and $E'_k$, $k = 1, \dots, g$, are given by the parts of the boundary of $R$ bounded by the intersections of the hemispheres and the arcs $a_k - \bar a_k$, $a'_k - \bar a'_k$, $b_k - \bar b_k$ and $b'_k - \bar b'_k$ respectively (see Fig. 1). The 1-cells – the edges, are given by the 1-cells of $F_1 - F_2$ and by $e^0_k$, $e^1_k$, $f^0_k$, $f^1_k$ and $d_k$, $k = 1, \dots, g$, defined as follows.
The edges ek0 are intersections of the faces Ek−1 and Dk joining the vertices a¯ k (0) to ak (0), the edges ek1 are intersections of the faces Dk and Ek joining the vertices a¯ k (1) 0 to ak (1); fk0 = ek+1 are intersections of Ek and Dk+1 joining b¯k (0) to bk (0), fk1 are intersections of Dk and Ek joining b¯k (1) to bk (1), and dk are intersections of Ek and Dk joining a¯ k (1) to ak (1). Finally, the 0-cells – the vertices, are given by the vertices of F . This property means that the edges of R do not intersect in U3 . When is a quasi-Fuchsian group, the fundamental region R is a topological polyhedron homeomorphic to the ˜ geodesic polyhedron for the corresponding Fuchsian group . As in the two-dimensional case, we construct the 3-chain representing M in the total complex Tot K of the double homology complex K•,• = S• ⊗Z B• as follows. First, identify R with R ⊗ [ ] ∈ K3,0 . We have ∂ R = 0 and ∂ R = −F +

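\[ \sum_{k=1}^{g}\left(D_k - D'_k - E_k + E'_k\right) = -F + \partial'' S, \]
where $S \in K_{2,1}$ is given by
\[ S = \sum_{k=1}^{g}\left(E_k\otimes[\beta_k] - D_k\otimes[\alpha_k]\right). \]
Secondly,
\begin{align*}
\partial' S &= \sum_{k=1}^{g}\left((b_k - \bar b_k)\otimes[\beta_k] - (a_k - \bar a_k)\otimes[\alpha_k]\right) - \sum_{k=1}^{g}\left((f^1_k - f^0_k)\otimes[\beta_k] - (e^1_k - e^0_k)\otimes[\alpha_k]\right)\\
&= L - \partial'' E,
\end{align*}
where $L = L_1 - L_2$ and $E \in K_{1,2}$ is given by
\[ E = \sum_{k=1}^{g}\left(e^0_k\otimes[\alpha_k|\beta_k] - f^0_k\otimes[\beta_k|\alpha_k] + f^0_k\otimes[\gamma_k^{-1}|\alpha_k\beta_k]\right) - \sum_{k=1}^{g-1} f^0_g\otimes[\gamma_g^{-1}\cdots\gamma_{k+1}^{-1}|\gamma_k^{-1}]. \]
Therefore $\partial' E = V = V_1 - V_2$ and the 3-chain $R - S + E \in (\mathrm{Tot}\,K)_3$ satisfies
\[ \partial(R - S + E) = -F - L + V = -\Sigma, \tag{5.1} \]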

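as asserted.

5.1.2. Cohomology computation. The $\mathrm{PSL}(2, \mathbb{C})$-action on $U^3$ is the following. Represent $Z = (z, t) \in U^3$ by a quaternion
\[ Z = x\cdot 1 + y\cdot i + t\cdot j = \begin{pmatrix} z & -t\\ t & \bar z\end{pmatrix}, \]
and for every $c \in \mathbb{C}$ set
\[ c = \mathrm{Re}\,c\cdot 1 + \mathrm{Im}\,c\cdot i = \begin{pmatrix} c & 0\\ 0 & \bar c\end{pmatrix}. \]
Then for $\gamma = \begin{pmatrix} a & b\\ c & d\end{pmatrix} \in \mathrm{PSL}(2, \mathbb{C})$ the action $Z \mapsto \gamma Z$ is given by
\[ Z \mapsto (aZ + b)(cZ + d)^{-1}. \]
Explicitly, for $Z = (z, t) \in U^3$ setting $z(Z) = z$ and $t(Z) = t$ gives
\[ z(\gamma Z) = \left((az + b)\overline{(cz + d)} + a\bar c\,t^2\right)J_\gamma(Z), \tag{5.2} \]
\[ t(\gamma Z) = t\,J_\gamma(Z), \tag{5.3} \]
where
\[ J_\gamma(Z) = \frac{1}{|cz + d|^2 + |ct|^2}. \]
Note that $J_\gamma^{3/2}(Z)$ is the Jacobian of the map $Z \mapsto \gamma Z$, hence it satisfies the transformation property
\[ J_{\gamma_1\circ\gamma_2}(Z) = J_{\gamma_1}(\gamma_2 Z)\,J_{\gamma_2}(Z). \tag{5.4} \]
From (5.2) and (5.3) we get the following formulas for the derivatives:
\[ \frac{\partial z(\gamma Z)}{\partial z} = \overline{(cz + d)}^{\,2}J_\gamma^2(Z), \tag{5.5} \]
\[ \frac{\partial z(\gamma Z)}{\partial\bar z} = -(\bar c\,t)^2J_\gamma^2(Z), \tag{5.6} \]
\[ \frac{\partial z(\gamma Z)}{\partial t} = 2t\,\bar c\,\overline{(cz + d)}\,J_\gamma^2(Z). \tag{5.7} \]
In particular,
\[ \frac{\partial z(\gamma Z)}{\partial z} = \gamma'(z) + O(t^2), \quad \frac{\partial z(\gamma Z)}{\partial\bar z} = O(t^2), \quad \frac{\partial z(\gamma Z)}{\partial t} = O(t), \tag{5.8} \]
as $t \to 0$ and $z \in \hat{\mathbb{C}}\setminus\{\gamma^{-1}(\infty)\}$, where for $z \in \mathbb{C}$ we continue to use the two-dimensional notations
\[ \gamma(z) = \frac{az + b}{cz + d} \quad\text{and}\quad \gamma'(z) = \frac{1}{(cz + d)^2}, \quad \frac{\gamma''}{\gamma'}(z) = \frac{-2c}{cz + d}. \]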
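Formulas (5.2)–(5.4) lend themselves to a quick numerical sanity check. The sketch below is illustrative only: the helper names (`jacobian_factor`, `act`, `compose`, `unimodular`) and the sample matrices are hypothetical choices, not taken from the paper. It evaluates the action of a determinant-one matrix on a point $(z, t)$ with $t > 0$ and compares both sides of the cocycle identity (5.4).

```python
# Illustrative sketch; helper names and sample data are hypothetical.

def jacobian_factor(g, Z):
    """J_gamma(Z) = 1 / (|cz + d|^2 + |ct|^2) for g = (a, b, c, d)."""
    a, b, c, d = g
    z, t = Z
    return 1.0 / (abs(c * z + d) ** 2 + abs(c * t) ** 2)


def act(g, Z):
    """Apply g = (a, b, c, d), ad - bc = 1, to Z = (z, t) via (5.2)-(5.3)."""
    a, b, c, d = (complex(x) for x in g)
    z, t = Z
    J = jacobian_factor(g, Z)
    z_new = ((a * z + b) * (c * z + d).conjugate()
             + a * c.conjugate() * t * t) * J
    return (z_new, t * J)


def compose(g1, g2):
    """Matrix product g1 * g2, corresponding to 'first g2, then g1'."""
    a1, b1, c1, d1 = g1
    a2, b2, c2, d2 = g2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


def unimodular(a, b, c):
    """Complete (a, b, c) to a matrix of determinant 1 (needs a != 0)."""
    return (a, b, c, (1 + b * c) / a)


g1 = unimodular(1 + 1j, 2.0, 1j)
g2 = unimodular(2.0, 1j, 1 - 1j)
Z = (0.3 + 0.4j, 0.7)

lhs = jacobian_factor(compose(g1, g2), Z)                      # J_{g1 g2}(Z)
rhs = jacobian_factor(g1, act(g2, Z)) * jacobian_factor(g2, Z)  # J_{g1}(g2 Z) J_{g2}(Z)
print(abs(lhs - rhs))  # should be at the level of floating-point rounding
```

The same helpers can be used to confirm that `act(compose(g1, g2), Z)` agrees with `act(g1, act(g2, Z))`, i.e. that (5.2)–(5.3) define a group action.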

The hyperbolic metric on $U^3$ is given by
\[ ds^2 = \frac{|dz|^2 + dt^2}{t^2}, \]
and is $\mathrm{PSL}(2, \mathbb{C})$-invariant. Denote by
\[ w_3 = \frac{1}{t^3}\,dx\wedge dy\wedge dt = \frac{i}{2t^3}\,dz\wedge d\bar z\wedge dt \]
the corresponding volume form on $U^3$. The form $w_3$ is exact on $U^3$,
\[ w_3 = dw_2, \quad\text{where}\quad w_2 = -\frac{i}{4t^2}\,dz\wedge d\bar z. \tag{5.9} \]
The 2-form w2 ∈ C2,0 is no longer PSL(2, C)-invariant. A straightforward computation using (5.5)–(5.7) gives for γ = ( ac db ) ∈ PSL(2, C), (δw2 )γ −1 =γ ∗ w2 − w2 i c(cz + d) c(cz ¯ + d) 2 = Jγ (Z) |c| dz ∧ d z¯ − dz ∧ dt + d z¯ ∧ dt . 2 t t Since dδw2 = δdw2 = δw3 = 0 and U3 is simply connected, this implies that there exists w1 ∈ C1,1 such that dw1 = δw2 . Explicitly, γ γ i 2 dz − d z¯ . (5.10) (w1 )γ −1 = − log |ct| Jγ (Z) 8 γ γ

Liouville Action and Weil-Petersson Metric

227

Using (5.4) and (5.8) we get for δw1 ∈ C 1,2 , (δw1 )γ −1 ,γ −1 1

2

γ2 γ2 i |c(γ2 )|2 =− ◦ γ1 γ1 dz − ◦ γ1 γ1 d z¯ log Jγ1 (Z) + log 8 |c(γ2 γ1 )|2 γ2 γ2 γ1 γ1 i |c(γ2 γ1 )|2 − d z¯ − dz log Jγ2 (γ1 Z) + log 8 |c(γ1 )|2 γ1 γ 1

+ Bγ −1 ,γ −1 (Z). 1

(5.11)

2

Here Bγ −1 ,γ −1 (Z) = O(t log t) as t → 0, uniformly on compact subsets of C \ 1

2

{γ1−1 (∞), (γ2 γ1 )−1 (∞)}. Clearly the 1-form δw1 is closed,

d(δw1 ) = δ(dw1 ) = δ(δw2 ) = 0. Since U3 is simply connected, there exists w0 ∈ C0,2 such that w1 = dw0 . Moreover, using H 3 (, C) = 0 we can always choose the antiderivative w0 such that δw0 = 0. Finally, set = w2 − w1 − w0 ∈ (Tot C)2 , so that D = w3 .

(5.12)

5.2. Regularized Einstein-Hilbert action. In two dimensions, the critical value of the Liouville action for a Riemann surface $X \simeq \Gamma\backslash U$ is proportional to the hyperbolic area of the surface (see Sect. 2). It is expected that in three dimensions the critical value of the Einstein-Hilbert action functional with cosmological term is proportional to the hyperbolic volume of the 3-manifold $M \simeq \Gamma\backslash(U^3 \cup \Omega)$ (plus a term proportional to the induced area of the boundary). However, the hyperbolic metric diverges at the boundary of $\overline{U}^3$ and for quasi-Fuchsian group $\Gamma$ (as well as for general Kleinian group¹) the hyperbolic volume of $\Gamma\backslash(U^3 \cup \Omega)$ is infinite. In [Wit98], Witten proposed a regularization of the action functional by truncating the 3-manifold $M$ by the surface $f = \varepsilon$, where the cut-off function $f \in C^\infty(U^3, \mathbb{R}_{>0})$ vanishes to the first order on the boundary of $\overline{U}^3$. Every choice of the function $f$ defines a metric on $U^3$
\[ ds^2 = \frac{f^2}{t^2}\left(|dz|^2 + dt^2\right), \]
belonging to the conformal class of the hyperbolic metric. On the boundary of $\overline{U}^3$ it induces the metric
\[ \lim_{t\to 0}\frac{f^2(z, t)}{t^2}\,|dz|^2. \]
Clearly for the case of quasi-Fuchsian group $\Gamma$ (or for the general Kleinian case considered in the next section), the cut-off function $f$ should be $\Gamma$-automorphic. Existence of such a function is guaranteed by the following result, which we formulate for the general Kleinian case.

¹ Note that we are using the definition of the Kleinian groups as in [Mas88]. In the theory of hyperbolic 3-manifolds these groups are called Kleinian groups of the second kind.


Lemma 5.1. Let $\Gamma$ be a non-elementary purely loxodromic, geometrically finite Kleinian group with region of discontinuity $\Omega$, normalized so that $\infty \notin \Omega$. For every $\phi \in CM(\Gamma\backslash\Omega)$ there exists a $\Gamma$-automorphic function $f \in C^\infty(U^3 \cup \Omega)$ which is positive on $U^3$ and satisfies
\[ f(Z) = te^{\phi(z)/2} + O(t^3), \quad\text{as } t \to 0, \]
uniformly on compact subsets of $\Omega$.

Proof. Note that $\Gamma\backslash\Omega$ is isomorphic to a finite disjoint union of compact Riemann surfaces. Let $R$ be a fundamental region of $\Gamma$ in $U^3 \cup \Omega$ which is compact in the relative topology of $\overline{U}^3$. I. Kra has proved in [Kra72a] (the construction in [Kra72a] suggested by M. Kuga generalizes verbatim to our case) that there exist a bounded open set $V$ in $\overline{U}^3$ such that $R \subset V$ and a function $\eta \in C^\infty(U^3 \cup \Omega)$ – partition of unity for $\Gamma$ on $U^3 \cup \Omega$, satisfying the following properties.
(i) $0 \le \eta \le 1$ and $\mathrm{supp}\,\eta \subset V$.
(ii) For each $Z \in U^3 \cup \Omega$ there is a neighborhood $U$ of $Z$ and a finite subset $J$ of $\Gamma$ such that $\eta|_{\gamma(U)} = 0$ for each $\gamma \in \Gamma\setminus J$.
(iii) $\sum_{\gamma\in\Gamma}\eta(\gamma Z) = 1$ for all $Z \in U^3 \cup \Omega$.
Let $B = V \cap \{(z, t) \mid z \in \Lambda\}$. Since $R \cap \Omega$ is compact, then (shrinking $V$ if necessary) there exists a $t_0 > 0$ such that $B$ does not intersect the region $\{(z, t) \in V \mid t \le t_0\}$. Define the function $\hat f : V \to \mathbb{R}$ by
\[ \hat f(z, t) = \begin{cases} te^{\phi(z)/2} & \text{if } (z, t) \in V \text{ and } t \le t_0/2,\\ 1 & \text{if } (z, t) \in V \text{ and } t \ge t_0, \end{cases} \]
and extend it to a smooth function $\hat f$ on $V$, positive on $V \cap U^3$. Set
\[ f(Z) = \sum_{\gamma\in\Gamma}\eta(\gamma Z)\,\hat f(\gamma Z). \]
By the property (ii), for every $Z \in U^3 \cup \Omega$ this sum contains only finitely many non-zero terms, so that the function $f$ is well-defined. By properties (i) and (iii) it is positive on $U^3$. To prove the asymptotic behavior, we use elementary formulas
\begin{align*}
z(\gamma Z) &= \frac{az + b}{cz + d} + O(t^2) = \gamma(z) + O(t^2),\\
t(\gamma Z) &= \frac{t}{|cz + d|^2} + O(t^3) \quad\text{as } t \to 0,
\end{align*}
where $z \ne \gamma^{-1}(\infty)$. Since $\phi$ is smooth on $\Omega$ and
\[ e^{\phi(\gamma z)/2} = e^{\phi(z)/2}\,|cz + d|^2, \]
we get for $z \in \Omega$ such that $\gamma Z \in V$ and $t$ is small enough,
\[ \hat f(\gamma Z) = \left(\frac{t}{|cz + d|^2} + O(t^3)\right)\left(e^{\phi(\gamma z)/2} + O(t^2)\right) = te^{\phi(z)/2} + O(t^3), \]
where the $O$-term depends on $\gamma$. Using properties (ii) and (iii) we finally obtain
\[ f(Z) = \sum_{\gamma\in\Gamma}\eta(\gamma Z)\left(te^{\phi(z)/2} + O(t^3)\right) = te^{\phi(z)/2} + O(t^3), \]
uniformly on compact subsets of $\Omega$.

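Returning to the case when $\Gamma$ is a normalized purely loxodromic quasi-Fuchsian group, for every $\phi \in CM(\Gamma\backslash\Omega)$ let $f$ be a function given by the lemma. For $\varepsilon > 0$ let $R_\varepsilon = R \cap \{f \ge \varepsilon\}$ be the truncated fundamental region. For every chain $c$ in $U^3$ let $c_\varepsilon = c \cap \{f \ge \varepsilon\}$ be the corresponding truncated chain. Also let $F_\varepsilon = \partial' R_\varepsilon \cap \{f = \varepsilon\}$ be the boundary of $R_\varepsilon$ on the surface $f = \varepsilon$ and define chains $L_\varepsilon$ and $V_\varepsilon$ on $f = \varepsilon$ by the same equations $\partial' F_\varepsilon = \partial'' L_\varepsilon$ and $\partial' L_\varepsilon = \partial'' V_\varepsilon$ as chains $L$ and $V$ (see Sects. 2.2.1 and 2.3.1). Since the truncation is $\Gamma$-invariant, for every chain $c \in S_\bullet(U^3)$ and $\gamma \in \Gamma$ we have $(\gamma c)_\varepsilon = \gamma c_\varepsilon$. In particular, relations between the chains, derived in Sect. 5.1, hold for truncated chains as well. Let $M_\varepsilon$ be the truncated 3-manifold with the boundary $\partial M_\varepsilon$. For $\varepsilon$ sufficiently small $\partial M_\varepsilon = X_\varepsilon \sqcup Y_\varepsilon$ is diffeomorphic to $X \sqcup Y$. Denote by $V_\varepsilon[\phi]$ the hyperbolic volume of $M_\varepsilon$. The hyperbolic metric induces a metric on $\partial M_\varepsilon$, and $A_\varepsilon[\phi]$ denotes the area of $\partial M_\varepsilon$ in the induced metric.

Definition 5.1. The regularized on-shell Einstein-Hilbert action functional is defined by
\[ \mathcal{E}_\Gamma[\phi] = -4\lim_{\varepsilon\to 0}\left(V_\varepsilon[\phi] - \frac{1}{2}A_\varepsilon[\phi] - 2\pi\chi(X)\log\varepsilon\right), \]
where $\chi(X) = \chi(Y) = 2 - 2g$ is the Euler characteristic of $X$.

The main result of this section is the following.

Theorem 5.1. (Quasi-Fuchsian holography) For every $\phi \in CM(\Gamma\backslash\Omega)$ the regularized Einstein-Hilbert action is well-defined and
\[ \mathcal{E}_\Gamma[\phi] = \check S_\Gamma[\phi], \]
where $\check S_\Gamma[\phi]$ is the modified Liouville action functional without the area term,
\[ \check S_\Gamma[\phi] = S_\Gamma[\phi] - \iint_{\Gamma\backslash\Omega}e^{\phi}\,d^2z - 8\pi(2g - 2)\log 2. \]
Proof. It is sufficient to verify the formula
\[ V_\varepsilon[\phi] - \frac{1}{2}A_\varepsilon[\phi] = 2\pi\chi(X)\log\varepsilon - \frac{1}{4}\check S_\Gamma[\phi] + o(1) \quad\text{as } \varepsilon \to 0, \tag{5.13} \]
which is a counterpart of the formula (1.13) for quasi-Fuchsian groups.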


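The area form induced by the hyperbolic metric on the surface $f(Z) = \varepsilon$ is given by
\[ \sqrt{1 + \left(\frac{f_x}{f_t}\right)^2 + \left(\frac{f_y}{f_t}\right)^2}\;\frac{dx\wedge dy}{t^2}. \]
Using
\[ \frac{f_x}{f_t}(Z) = \frac{t}{2}\,\phi_x(z) + O(t^3) \quad\text{and}\quad \frac{f_y}{f_t}(Z) = \frac{t}{2}\,\phi_y(z) + O(t^3), \]
we have as $\varepsilon \to 0$,
\begin{align*}
A_\varepsilon[\phi] &= \iint_{F_\varepsilon}\sqrt{1 + \frac{t^2}{4}\left(\phi_x^2 + \phi_y^2\right)(z) + O(t^4)}\;\frac{dx\wedge dy}{t^2}\\
&= \iint_{F_\varepsilon}\frac{dx\wedge dy}{t^2} + \frac{1}{2}\iint_{F}\phi_z\phi_{\bar z}\,dx\wedge dy + o(1)\\
&= \iint_{F_\varepsilon}\frac{dx\wedge dy}{t^2} + \frac{i}{4}\left\langle\check\omega[\phi], F\right\rangle + o(1).
\end{align*}
Here we have introduced
\[ \check\omega[\phi] = \omega[\phi] - e^{\phi}\,dz\wedge d\bar z = |\phi_z|^2\,dz\wedge d\bar z, \tag{5.14} \]
and have used that for $Z \in F_\varepsilon$,
\[ t = \varepsilon e^{-\phi(z)/2} + O(\varepsilon^3), \tag{5.15} \]
uniformly for $Z = (z, t)$ where $z \in F$. Next, using (5.1) and (5.12) we have,
\begin{align*}
V_\varepsilon[\phi] &= \langle w_3, R_\varepsilon\rangle = \langle w_3, R_\varepsilon - S_\varepsilon + E_\varepsilon\rangle\\
&= \langle D(w_2 - w_1 - w_0), R_\varepsilon - S_\varepsilon + E_\varepsilon\rangle\\
&= \langle w_2 - w_1 - w_0, \partial(R_\varepsilon - S_\varepsilon + E_\varepsilon)\rangle\\
&= -\langle w_2, F_\varepsilon\rangle + \langle w_1, L_\varepsilon\rangle - \langle w_0, V_\varepsilon\rangle.
\end{align*}
The terms in this formula simplify as $\varepsilon \to 0$. First of all, it follows from (5.9) that
\[ -\langle w_2, F_\varepsilon\rangle = \frac{1}{2}\iint_{F_\varepsilon}\frac{dx\wedge dy}{t^2}. \]
Secondly, using (5.15) and $J_\gamma(Z) = |\gamma'(z)| + O(t^2)$ as $t \to 0$, we have on $L_\varepsilon$,
\begin{align*}
(w_1)_{\gamma^{-1}} &= -\frac{i}{8}\log\left(|c\varepsilon|^2e^{-\phi}|\gamma'(z)|\right)\left(\frac{\gamma''}{\gamma'}\,dz - \frac{\overline{\gamma''}}{\overline{\gamma'}}\,d\bar z\right) + o(1)\\
&= -\frac{i}{8}\left(2\log\varepsilon - \phi + \frac{1}{2}\log|\gamma'|^2 + \log|c(\gamma)|^2\right)\left(\frac{\gamma''}{\gamma'}\,dz - \frac{\overline{\gamma''}}{\overline{\gamma'}}\,d\bar z\right) + o(1).
\end{align*}
Therefore, as $\varepsilon \to 0$,
\[ \langle w_1, L_\varepsilon\rangle = -\frac{i}{4}\langle\kappa, L\rangle(\log\varepsilon - \log 2) + \frac{i}{8}\left\langle\check\theta[\phi], L\right\rangle + o(1), \]
where 1-forms $\kappa_\gamma$ and $\check\theta_\gamma[\phi]$ were introduced in Corollary 2.3 and formula (2.16) respectively. Finally,
\[ \langle w_0, V_\varepsilon\rangle = \langle w_0, \partial' E_\varepsilon\rangle = \langle dw_0, E_\varepsilon\rangle = \langle\delta w_1, E_\varepsilon\rangle = \langle\delta w_1, E\rangle + o(1), \]
where we used that the 1-form $\delta w_1$ is smooth on $U^3$ and continuous on $\mathbb{C}\setminus\Gamma(\infty)$. Since it is closed, we can replace the 1-chain $E$ by the 1-chain $W = W_1 - W_2$ consisting of $\Gamma$-contracting paths at 0 (see Sect. 2.3). It follows from (5.11) that $\delta w_1 = \frac{i}{8}\check u + o(1)$ as $t \to 0$, where the 1-form $\check u_{\gamma_1,\gamma_2}$ was introduced in (2.17), so that
\[ -\langle w_0, V_\varepsilon\rangle = -\frac{i}{8}\langle\check u, W\rangle + o(1). \]
Putting everything together, we have as $\varepsilon \to 0$,
\[ V_\varepsilon[\phi] - \frac{1}{2}A_\varepsilon[\phi] = -\frac{i}{4}\langle\kappa, L\rangle(\log\varepsilon - \log 2) - \frac{i}{8}\left(\left\langle\check\omega[\phi], F\right\rangle - \left\langle\check\theta[\phi], L\right\rangle + \langle\check u, W\rangle\right) + o(1). \]
Using Corollary 2.3, trivially modified for the quasi-Fuchsian case, and (2.27) concludes the proof.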
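The two expansions used for the area term in the proof above can be traced back to the asymptotics of Lemma 5.1. The following is a reader's sketch; it assumes that the remainder in $f(Z) = te^{\phi(z)/2} + O(t^3)$ may be differentiated term by term, which is not stated explicitly in the lemma.

```latex
\begin{align*}
f_x &= \tfrac{t}{2}\,\phi_x\,e^{\phi/2} + O(t^3), \qquad
f_t = e^{\phi/2} + O(t^2), \qquad
\frac{f_x}{f_t} = \frac{t}{2}\,\phi_x + O(t^3),\\
\sqrt{1 + \tfrac{t^2}{4}\left(\phi_x^2+\phi_y^2\right) + O(t^4)}
  &= 1 + \frac{t^2}{8}\left(\phi_x^2+\phi_y^2\right) + O(t^4)
   = 1 + \frac{t^2}{2}\,\phi_z\phi_{\bar z} + O(t^4),\\
\frac{i}{4}\,|\phi_z|^2\,dz\wedge d\bar z
  &= \frac{i}{4}\,(-2i)\,|\phi_z|^2\,dx\wedge dy
   = \frac{1}{2}\,\phi_z\phi_{\bar z}\,dx\wedge dy,\\
te^{\phi(z)/2} + O(t^3) = \varepsilon
  \;&\Longrightarrow\;
  t = \varepsilon e^{-\phi(z)/2} + O(\varepsilon^3).
\end{align*}
% Here \phi_x^2 + \phi_y^2 = 4\phi_z\phi_{\bar z} for real-valued \phi;
% dividing the second line by t^2 gives the integrand of A_\varepsilon[\phi].
```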

A fundamental domain $F$ for $\Gamma$ in $\Omega$ is called admissible, if it is the boundary in $\mathbb{C}$ of a fundamental region $R$ for $\Gamma$ in $U^3 \cup \Omega$. As an immediate consequence of the theorem we get the following.

Corollary 5.1. The Liouville action functional $S_\Gamma[\phi]$ is independent of the choice of admissible fundamental domain.

Proof. Since $V_\varepsilon[\phi]$, $A_\varepsilon[\phi]$ are intrinsically associated with the quotient manifolds $M \simeq \Gamma\backslash(U^3 \cup \Omega)$ and $X \sqcup Y \simeq \Gamma\backslash\Omega$, the statement follows from the definition of the Einstein-Hilbert action and the theorem.

Although we proved the same result in Sect. 2 using methods of homological algebra, the above argument easily generalizes to other Kleinian groups.

Remark 5.1. The truncation of the 3-manifold $M$ by the function $f$ does depend on the choice of the realization of the fundamental group of $M$ as a normalized discrete subgroup of $\mathrm{PSL}(2, \mathbb{C})$. Different realizations of $\pi_1(M)$ result in different choices of the function $f$, since $f$ has to satisfy the asymptotic behavior in Lemma 5.1, where the leading term $te^{\phi(z)/2}$ is not a well-defined function on $M$.

Remark 5.2. The cochain $w_0 \in C^{0,2}$ was defined as a solution of the equation $dw_0 = \delta w_1$ satisfying $\delta w_0 = 0$. However, in the computation in Theorem 5.1 this condition is not needed – any choice of an antiderivative for $\delta w_1$ will suffice. This is due to the fact that the chain in $(\mathrm{Tot}\,K)_3$ that starts with $R \in K_{3,0}$ does not contain a term in $K_{0,3}$, hence $\partial' E = V$. Thus we can trivially add the term $\langle\delta w_0, R_\varepsilon - S_\varepsilon + E_\varepsilon\rangle = 0$ to $V_\varepsilon[\phi]$, which through the equation $D\Phi = w_3 - \delta w_0$ still gives $\langle w_0, V\rangle = \langle dw_0, E\rangle$. Thus the absence of $K_{0,3}$-components in the chain in $(\mathrm{Tot}\,K)_3$ implies that each term in $E$ produces two boundary terms in $V$ which cancel out the integration constants in the definition of $w_0$. As a result, $S_\Gamma[\phi]$ does not depend on the choice of $w_0$. In the next section we generalize the Liouville action functional to Kleinian groups having the same property.


5.3. Epstein map. The construction of the regularized Einstein-Hilbert action in the previous section works for a larger class of cut-off surfaces than those given by the equation $f = \varepsilon$. Namely, it follows from the proof of Theorem 5.1 that the statement holds for any family $S_\varepsilon$ of cut-off surfaces such that for $Z = (z, t) \in S_\varepsilon$,

$$t = \varepsilon e^{-\phi(z)/2} + O(\varepsilon^3) \quad \text{as } \varepsilon \to 0, \tag{5.16}$$

uniformly for $z \in F$.

Given a conformal metric $ds^2 = e^{\phi(z)}|dz|^2$ on $\Omega \subset \hat{\mathbb{C}}$ there is a natural surface in $\mathbb{U}^3$ associated to it through the inverse of the hyperbolic Gauss map. The corresponding construction is due to C. Epstein [Eps84, Eps86] (see also [And98]) and is the following. For every $Z \in \mathbb{U}^3$ and $z \in \hat{\mathbb{C}}$ there is a unique horosphere $H$ based at the point $z$ and passing through the point $Z$: $H$ is a Euclidean sphere in $\mathbb{U}^3$ tangent to $z \in \mathbb{C}$ and passing through $Z$, or is a Euclidean plane parallel to the complex plane for $z = \infty$. Denote by $[Z, z]$ an affine parameter of the horosphere $H$, that is, the hyperbolic distance between the point $(0, 1) \in \mathbb{U}^3$ and the horosphere $H$, considered as positive if the point $(0, 1)$ is outside $H$ and negative otherwise. Denote the corresponding horosphere by $H(z, [Z, z])$. The Epstein map $G : \Omega \to \mathbb{U}^3$ is defined by

$$e^{\phi(z)/2}|dz| = e^{[G(z), z]}\,\frac{2|dz|}{1 + |z|^2}$$

and it is $\Gamma$-invariant, $G \circ \gamma = \gamma \circ G$ for all $\gamma \in \Gamma$, if $\phi \in CM(\Gamma\backslash\Omega)$.

Remark 5.3. Note that our definition of the Epstein map corresponds to the case $f = \mathrm{id}$ in Definition 3.9 in [And98]. Geometrically, the image of the Epstein map is the Epstein surface $\mathcal{H} = G(\Omega)$, which is the envelope of the family of horospheres $H(z, \varrho(z))$ with

$$\varrho(z) = \log\frac{1}{2}(1 + |z|^2) + \frac{\phi(z)}{2},$$

parametrized by $z \in \Omega$, where $G(z)$ is the point of tangency of the horosphere $H(z, \varrho(z))$ with the surface $\mathcal{H}$. Explicit computation gives

$$G(w) = \left(w + \frac{2\phi_{\bar w}(w)}{e^{\phi(w)} + |\phi_w(w)|^2},\ \frac{2e^{\phi(w)/2}}{e^{\phi(w)} + |\phi_w(w)|^2}\right), \quad w \in \Omega. \tag{5.17}$$

Remark 5.4. The square of the Euclidean distance between the points $w$ and $G(w)$ in $\bar{\mathbb{U}}^3$ is $4/(e^{\phi(w)} + |\phi_w(w)|^2)$. This gives a geometric interpretation of the density $|\phi_z|^2 + e^{\phi}$ of the (1,1)-form $\omega$ introduced in (2.4).

Now to a given $\phi \in CM(\Gamma\backslash\Omega)$ we associate the family $\phi_\varepsilon = \phi + 2\log 2 - 2\log\varepsilon \in CM(\Gamma\backslash\Omega)$ with $\varepsilon > 0$, which corresponds to the family of conformal metrics


$$ds_\varepsilon^2 = 4\varepsilon^{-2} e^{\phi(w)}|dw|^2,$$

and consider the corresponding $\Gamma$-invariant family $\mathcal{H}_\varepsilon$ of Epstein surfaces. It follows from the parametric representation

$$z = w + \frac{2\varepsilon^2\phi_{\bar w}(w)}{4e^{\phi(w)} + \varepsilon^2|\phi_w(w)|^2}, \tag{5.18}$$

$$t = \frac{4\varepsilon e^{\phi(w)/2}}{4e^{\phi(w)} + \varepsilon^2|\phi_w(w)|^2}, \tag{5.19}$$

that for $\varepsilon$ small the surfaces $\mathcal{H}_\varepsilon$ embed smoothly in $\mathbb{U}^3$ and as $\varepsilon \to 0$,

$$z = w + O(\varepsilon^2), \qquad t = \varepsilon e^{-\phi(w)/2} + O(\varepsilon^3),$$

uniformly for $w$ in compact subsets of $\Omega$. These formulas immediately give the desired asymptotic behavior (5.16). The choice of Epstein surfaces $\mathcal{H}_\varepsilon$ as cut-off surfaces for the definition of the regularized Einstein-Hilbert action seems to be the most natural. It is quite remarkable that, independently, Eqs. (5.18), (5.19) appear in [Kra01] in relation with a general solution of "asymptotically AdS three-dimensional gravity".
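The asymptotics of the Epstein surfaces can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the parametrization (5.18), (5.19) for a sample conformal factor and prints the rescaled gap $(t - \varepsilon e^{-\phi/2})/\varepsilon^3$, which should stay bounded as $\varepsilon \to 0$. The function names and the test function $\phi(w) = -|w|^2$ are assumptions chosen for illustration only; this $\phi$ is not automorphic for any Kleinian group.

```python
import math


def epstein_point(phi, phi_w, w, eps):
    """Return (z, t) on the Epstein surface over w, following (5.18)-(5.19).

    phi   -- real-valued conformal factor, called as phi(w)
    phi_w -- its complex derivative d(phi)/dw; because phi is real,
             d(phi)/d(conj w) is the complex conjugate of phi_w
    """
    p = phi(w)
    pw = phi_w(w)
    denom = 4.0 * math.exp(p) + eps**2 * abs(pw) ** 2
    z = w + 2.0 * eps**2 * pw.conjugate() / denom
    t = 4.0 * eps * math.exp(p / 2.0) / denom
    return z, t


def scaled_gap(phi, phi_w, w, eps):
    """(t - eps * exp(-phi/2)) / eps**3, expected to stay bounded as eps -> 0."""
    _, t = epstein_point(phi, phi_w, w, eps)
    return (t - eps * math.exp(-phi(w) / 2.0)) / eps**3


# Hypothetical test data (an assumption): phi(w) = -|w|^2, so phi_w = -conj(w).
def sample_phi(w):
    return -abs(w) ** 2


def sample_phi_w(w):
    return -w.conjugate()


if __name__ == "__main__":
    w0 = 0.3 + 0.4j
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, scaled_gap(sample_phi, sample_phi_w, w0, eps))
```

Expanding (5.19) in $\varepsilon$ shows that the rescaled gap tends to $-\frac{1}{4}|\phi_w|^2 e^{-3\phi/2}$. Setting $\varepsilon = 2$ gives $\phi_\varepsilon = \phi$, so the same function also reproduces the distance formula of Remark 5.4.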

6. Generalization to Kleinian Groups

6.1. Kleinian groups of Class A. Let $\Gamma$ be a finitely generated Kleinian group with the region of discontinuity $\Omega$, a maximal set of non-equivalent components $\Omega_1, \ldots, \Omega_n$ of $\Omega$, and the limit set $\Lambda = \hat{\mathbb{C}} \setminus \Omega$. As in the quasi-Fuchsian case, a path $P$ is called $\Gamma$-contracting in $\Omega$ if $P = P_1 \cup P_2$, where $p \in \Lambda \setminus \{\infty\}$ is a fixed point for $\Gamma$, and the paths $P_1 \setminus \{p\}$ and $P_2 \setminus \{p\}$ lie entirely in distinct components of $\Omega$ and are $\Gamma$-contracting at $p$ in the sense of Definition 2.3. It follows from the arguments in Sect. 2.3.1 that $\Gamma$-contracting paths in $\Omega$ are rectifiable.

Definition 6.1. A Kleinian group $\Gamma$ is of Class A if it satisfies the following conditions.
A1 $\Gamma$ is non-elementary and purely loxodromic.
A2 $\Gamma$ is geometrically finite.
A3 $\Gamma$ has a fundamental region $R$ in $\mathbb{U}^3 \cup \Omega$ which is a finite three-dimensional CW complex with no 0-dimensional cells in $\mathbb{U}^3$ and such that $R \cap \Omega \subset \Omega_1 \cup \cdots \cup \Omega_n$.

In particular, Property A1 implies that $\Gamma$ is torsion-free and does not contain parabolic elements, and Property A2 asserts that $\Gamma$ has a fundamental region $R$ in $\mathbb{U}^3 \cup \Omega$ which is a finite topological polyhedron. Property A3 means that the region $R$ can be chosen such that the vertices of $R$ (the endpoints of edges of $R$) lie on $\Omega \subset \hat{\mathbb{C}}$ and the boundary of $R$ in $\hat{\mathbb{C}}$, which is a fundamental domain for $\Gamma$ in $\Omega$, is not too "exotic".

The class A is rather large: it clearly contains all purely loxodromic Schottky groups (for which Property A3 is vacuous), Fuchsian groups, quasi-Fuchsian groups, and free combinations of these groups. As in the previous section, we say that the Kleinian group $\Gamma$ is normalized if $\infty \in \Lambda$.


6.2. Einstein-Hilbert and Liouville functionals. For a finitely generated Kleinian group $\Gamma$ let $M \simeq \Gamma\backslash(\mathbb{U}^3 \cup \Omega)$ be a corresponding hyperbolic 3-manifold, and let $\Gamma_1, \ldots, \Gamma_n$ be the stabilizer groups of the maximal set $\Omega_1, \ldots, \Omega_n$ of non-equivalent components of $\Omega$. We have

$$\Gamma\backslash\Omega = \Gamma_1\backslash\Omega_1 \sqcup \cdots \sqcup \Gamma_n\backslash\Omega_n \simeq X_1 \sqcup \cdots \sqcup X_n,$$

so that the Riemann surfaces $X_1, \ldots, X_n$ are simultaneously uniformized by $\Gamma$. The manifold $M$ is compact in the relative topology of $\bar{\mathbb{U}}^3$ with the disjoint union $X_1 \sqcup \cdots \sqcup X_n$ as the boundary.

6.2.1. Homology and cohomology set-up. Let $S_\bullet \equiv S_\bullet(\mathbb{U}^3 \cup \Omega)$, $B_\bullet \equiv B_\bullet(\mathbb{Z}\Gamma)$ be the standard singular chain and bar-resolution homology complexes and $K_{\bullet,\bullet} \equiv S_\bullet \otimes_{\mathbb{Z}\Gamma} B_\bullet$ the corresponding double complex. When $\Gamma$ is a Kleinian group of Class A, we can generalize the homology construction from the previous section and define the corresponding chains $R, S, E, F, L, V$ in the total complex $\mathrm{Tot}\,K$ as follows. Let $R$ be a fundamental region for $\Gamma$ in $\mathbb{U}^3$, a closed topological polyhedron in $\bar{\mathbb{U}}^3$ satisfying Property A3. The group $\Gamma$ is generated by side pairing transformations of $R \cap \mathbb{U}^3$ and we define the chain $S \in K_{2,1}$ as the sum of terms $-s \otimes \gamma^{-1}$ for each pair of sides $s, s'$ of $R \cap \mathbb{U}^3$ identified by a transformation $\gamma$, i.e., $s' = -\gamma s$. The sides are oriented as components of the boundary and the negative sign stands for the opposite orientation. We have

$$\partial' R = -F + \partial'' S, \tag{6.1}$$

where $F = \partial' R \cap \Omega \in K_{2,0}$. Note that it is immaterial whether we choose $-s \otimes \gamma^{-1}$ or $-s' \otimes \gamma$ in the definition of $S$, since these terms differ by a $\partial''$-coboundary. Next, relations between generators of $\Gamma$ determine the $\Gamma$-action on the edges of $R$, which, in turn, determines the chain $E \in K_{1,2}$ through the equation

$$\partial' S = L - \partial'' E. \tag{6.2}$$

Here $L = \partial' S \cap \Omega \in K_{1,1}$. Finally, Property A3 implies that

$$\partial' E = V, \tag{6.3}$$

where the chain $V \in K_{0,2}$ lies in $\Omega$. Next, let the 1-chain $W \in K_{1,2}$ be a "proper projection" of the 1-chain $E$ onto $\Omega$, i.e., $W$ is defined by connecting every two vertices belonging to the same edge of $R$ either by a smooth path lying entirely in one component of $\Omega$, or by a $\Gamma$-contracting path, so that $\partial' W = V$. The existence of such a 1-chain $W$ is guaranteed by Property A3 and the following lemma, which is of independent interest.

Lemma 6.1. Let $\Gamma$ be a normalized, geometrically finite, purely loxodromic Kleinian group, and let $R$ be a fundamental region of $\Gamma$ in $\mathbb{U}^3 \cup \Omega$ such that $R \cap \Omega \subset \Omega_1 \cup \cdots \cup \Omega_n$, a union of the maximal set of non-equivalent components of $\Omega$. If an edge $e$ of $R \cap \mathbb{U}^3$ has endpoints $v_0$ and $v_1$ belonging to two distinct components $\Omega_i$ and $\Omega_j$, then there exists a $\Gamma$-contracting path in $\Omega$ joining the vertices $v_0$ and $v_1$. In particular, $\Omega_i$ and $\Omega_j$ have at least one common boundary point, which is a fixed point for $\Gamma$.


Proof. There exist sides $s_1$ and $s_2$ of $R$ such that $e \subset s_1 \cap s_2$. For each of these sides there exists a group element identifying it with another side of $R$. Let $\gamma \in \Gamma$ be such an element for $s_1$. Since $\Gamma$ is torsion-free and $v_0, v_1 \in \Omega$, the element $\gamma$ identifies the edge $e$ with the edge $e'$ of $R$ with endpoints $\gamma(v_0) = v_0'$ and $\gamma(v_1) = v_1'$. Since, by assumption, $R \cap \Omega \subset \Omega_1 \cup \cdots \cup \Omega_n$, we have that $\gamma(v_0) \in \Omega_i$ and $\gamma$ fixes $\Omega_i$. Similarly, $\gamma(v_1) \in \Omega_j$ and $\gamma$ fixes $\Omega_j$. Now assume that the attracting fixed point $p$ of $\gamma$ is not $\infty$ (otherwise we replace $\gamma$ by $\gamma^{-1}$). Join $v_0$ and $\gamma(v_0)$ by a smooth path $P_1^0$ inside $\Omega_i$, and let $P_1^n = \gamma^n(P_1^0)$ be its $n$th $\gamma$-iterate. Since $\gamma$ fixes $\Omega_i$, the path $P_1^n$ lies entirely inside $\Omega_i$. Since $\lim_{n\to\infty}\gamma^n(v_0) = p$, the path $P_1 = \bigcup_{n=0}^{\infty} P_1^n$ joins $v_0$ and $p$, and except for the endpoint $p$ lies entirely in $\Omega_i$. Clearly the path $P_1^0$ can be chosen so that the path $P_1$ is smooth everywhere except at $p$. The path $P_2$ joining the points $v_1$ and $p$ inside $\Omega_j$ is defined similarly, and the path $P = P_1 \cup P_2$ is $\Gamma$-contracting in $\Omega$.

Setting $\Sigma = F + L - V$ we get from (6.1)–(6.3) that

$$\partial(R - S + E) = -\Sigma.$$

Remark 6.1. Since $\mathbb{U}^3$ is acyclic, it follows from general arguments in [AT97] that for any geometrically finite purely loxodromic Kleinian group $\Gamma$ with fundamental region $R$ given by a closed topological polyhedron, there exist chains $S \in K_{2,1}$, $E \in K_{1,2}$, $T \in K_{0,3}$ and chains $F \in K_{2,0}$, $L \in K_{1,1}$, $V \in K_{0,2}$ on $\Omega$, satisfying

$$\partial' R = -F + \partial'' S, \qquad \partial' S = L - \partial'' E, \qquad \partial' E = V + \partial'' T.$$

Property A3 asserts that $T = 0$, and we get Eqs. (6.1)–(6.3).

Correspondingly, let $A^\bullet \equiv A^\bullet_{\mathbb{C}}(\mathbb{U}^3 \cup \Omega)$ and $C^{\bullet,\bullet} \equiv \mathrm{Hom}(B_\bullet, A^\bullet)$ be the de Rham complex on $\mathbb{U}^3 \cup \Omega$ and the bar-de Rham complex respectively. The cochains $w_3, w_2, w_1, \delta w_1, w_0$ are defined by the same formulas as in Sect. 5.1. For $\phi \in CM(\Gamma\backslash\Omega)$ define the cochains $\omega[\phi], \theta[\phi], u$ by the same formulas (2.4), (2.5), (2.6), with the group elements belonging to $\Gamma$. Finally, define the cochains $\check\theta[\phi], \check u$ by (2.16) and (2.17).
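The identity $\partial(R - S + E) = -\Sigma$ can be verified term by term. The following is a sketch under stated assumptions: $\partial'$ denotes the singular boundary and $\partial''$ the group (bar) boundary, and the total boundary on $K_{p,q}$ is taken with the common sign convention $\partial = \partial' + (-1)^p\partial''$. The convention is an assumption here, chosen because it makes the signs in (6.1)–(6.3) cancel; the paper fixes its own convention in an earlier section.

$$
\begin{aligned}
\partial R &= \partial' R = -F + \partial'' S, && R \in K_{3,0},\\
\partial S &= \partial' S + \partial'' S = L - \partial'' E + \partial'' S, && S \in K_{2,1},\\
\partial E &= \partial' E - \partial'' E = V - \partial'' E, && E \in K_{1,2},
\end{aligned}
$$

so that

$$
\partial(R - S + E) = \left(-F + \partial'' S\right) - \left(L - \partial'' E + \partial'' S\right) + \left(V - \partial'' E\right) = -F - L + V = -\Sigma .
$$

The $\partial''$-terms cancel in pairs, which is exactly where the absence of a $K_{0,3}$-component (Property A3) is used: a nonzero $T$ would contribute an extra term $\partial'' T$ from $\partial' E$.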

6.2.2. Action functionals. Let $\Gamma$ be a normalized Class A Kleinian group. For each $\phi \in CM(\Gamma\backslash\Omega)$ let $f$ be the function constructed in Lemma 5.1. As in Sect. 5.2, we truncate the manifold $M$ by the cut-off function $f$ and define $V_\varepsilon[\phi]$, $A_\varepsilon[\phi]$.

Definition 6.2. The regularized on-shell Einstein-Hilbert action functional for a normalized Class A Kleinian group $\Gamma$ is defined by

$$\mathcal{E}_\Gamma[\phi] = -4\lim_{\varepsilon\to 0}\left(V_\varepsilon[\phi] - \frac{1}{2}A_\varepsilon[\phi] - \pi\left(\chi(X_1) + \cdots + \chi(X_n)\right)\log\varepsilon\right).$$

As in the quasi-Fuchsian case, a fundamental domain $F$ for a Kleinian group $\Gamma$ in $\Omega$ is called admissible if it is the boundary in $\mathbb{C}$ of a fundamental region $R$ for $\Gamma$ in $\mathbb{U}^3$ satisfying Property A3.


Definition 6.3. The Liouville action functional $S_\Gamma : CM(\Gamma\backslash\Omega) \to \mathbb{R}$ for a normalized Class A Kleinian group $\Gamma$ is defined by

$$S_\Gamma[\phi] = \frac{i}{2}\left(\langle \omega[\phi], F\rangle - \langle \check\theta[\phi], L\rangle + \langle \check u, W\rangle\right), \tag{6.4}$$

where $F$ is an admissible fundamental domain for $\Gamma$ in $\Omega$.

Remark 6.2. When $\Gamma$ is a purely loxodromic Schottky group (not necessarily a classical Schottky group), the Liouville action functional defined above is, up to the constant term $4\pi(2g - 2)\log 2$, the functional (1.8), introduced by P. Zograf and the first author [ZT87b].

Using these definitions and repeating verbatim the arguments in Sect. 5 we have the following result.

Theorem 6.1. (Kleinian holography) For every $\phi \in CM(\Gamma\backslash\Omega)$ the regularized Einstein-Hilbert action is well-defined and

$$\mathcal{E}_\Gamma[\phi] = \check S_\Gamma[\phi] = S_\Gamma[\phi] - \iint_{\Gamma\backslash\Omega} e^{\phi}\,d^2z + 4\pi\left(\chi(X_1) + \cdots + \chi(X_n)\right)\log 2.$$

Corollary 6.1. The definition of a Liouville action functional does not depend on the choice of the admissible fundamental domain $F$ for $\Gamma$.

As in the Fuchsian and quasi-Fuchsian cases, the Euler-Lagrange equation for the functional $S_\Gamma$ is the Liouville equation, and its single critical point given by the hyperbolic metric $e^{\phi_{hyp}}|dz|^2$ on $\Gamma\backslash\Omega$ is non-degenerate. For every component $\Omega_i$ denote by $J_i : \mathbb{U} \to \Omega_i$ the corresponding covering map (unique up to a $\mathrm{PSL}(2, \mathbb{R})$-action on $\mathbb{U}$). Then the density $e^{\phi_{hyp}}$ of the hyperbolic metric is given by

$$e^{\phi_{hyp}(z)} = \frac{|(J_i^{-1})'(z)|^2}{(\mathrm{Im}\,J_i^{-1}(z))^2} \quad \text{if } z \in \Omega_i,\ i = 1, \ldots, n. \tag{6.5}$$

Remark 6.3. As in Remark 2.3, let $\Delta[\phi] = -e^{-\phi}\partial_z\partial_{\bar z}$ be the Laplace operator of the metric $ds^2 = e^{\phi}|dz|^2$ acting on functions on $X_1 \sqcup \cdots \sqcup X_n$, let $\det\Delta[\phi]$ be its zeta-function regularized determinant, and let

$$I[\phi] = \log\frac{\det\Delta[\phi]}{A[\phi]}.$$

Polyakov's "conformal anomaly" formula and Theorem 6.1 give the following relation between the Einstein-Hilbert action $\mathcal{E}[\phi]$ for $M \simeq \Gamma\backslash(\mathbb{U}^3 \cup \Omega)$ and the "analytic torsion" $I[\phi]$ on its boundary $X_1 \sqcup \cdots \sqcup X_n \simeq \Gamma\backslash\Omega$,

$$I[\phi + \sigma] + \frac{1}{12\pi}\mathcal{E}[\phi + \sigma] = I[\phi] + \frac{1}{12\pi}\mathcal{E}[\phi], \quad \sigma \in C^\infty(X_1 \sqcup \cdots \sqcup X_n, \mathbb{R}).$$

6.3. Variation of the classical action. Here we generalize the theorems in Sect. 4 for quasi-Fuchsian groups to Kleinian groups.


6.3.1. Classical action. Let $\Gamma$ be a normalized Class A Kleinian group and let $\mathfrak{D}(\Gamma)$ be its deformation space. For every Beltrami coefficient $\mu \in \mathcal{B}^{-1,1}(\Gamma)$ the normalized quasiconformal map $f^\mu : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ descends to an orientation preserving homeomorphism of the quotient Riemann surfaces $\Gamma\backslash\Omega$ and $\Gamma^\mu\backslash\Omega^\mu$. This homeomorphism extends to a homeomorphism of the corresponding 3-manifolds $\Gamma\backslash(\mathbb{U}^3 \cup \Omega)$ and $\Gamma^\mu\backslash(\mathbb{U}^3 \cup \Omega^\mu)$, which can be lifted to an orientation preserving homeomorphism of $\mathbb{U}^3$. In particular, a fundamental region of $\Gamma$ is mapped to a fundamental region of $\Gamma^\mu$. Hence Property A3 is stable and every group in $\mathfrak{D}(\Gamma)$ is of Class A. Moreover, since $\infty$ is a fixed point of $f^\mu$, every group in $\mathfrak{D}(\Gamma)$ is normalized.

For every $\Gamma' \in \mathfrak{D}(\Gamma)$ let $S_{\Gamma'} = S_{\Gamma'}[\phi'_{hyp}]$ be the classical Liouville action for $\Gamma'$. Since the property of the fundamental domain $F$ being admissible is stable, Corollary 6.1 asserts that the classical action gives rise to a well-defined real-analytic function $S : \mathfrak{D}(\Gamma) \to \mathbb{R}$.

As in Sect. 4, let $\vartheta \in \Omega^{2,0}(\Gamma)$ be the holomorphic quadratic differential on $\Gamma\backslash\Omega$ defined by

$$\vartheta = 2(\phi_{hyp})_{zz} - (\phi_{hyp})_z^2.$$

It follows from (6.5) that

$$\vartheta(z) = 2\mathcal{S}(J_i^{-1})(z) \quad \text{if } z \in \Omega_i,\ i = 1, \ldots, n.$$

Define a (1, 0)-form $\vartheta$ on $\mathfrak{D}(\Gamma)$ by assigning to every $\Gamma' \in \mathfrak{D}(\Gamma)$ the corresponding $\vartheta \in \Omega^{2,0}(\Gamma')$. For every $\Gamma' \in \mathfrak{D}(\Gamma)$ let $P_F$ and $P_K$ be the Fuchsian and Kleinian projective connections on $X_1' \sqcup \cdots \sqcup X_n' \simeq \Gamma'\backslash\Omega'$, defined by the Fuchsian uniformizations of the Riemann surfaces $X_1', \ldots, X_n'$ and by their simultaneous uniformization by the Kleinian group $\Gamma'$. We will continue to denote the corresponding sections of the affine bundle $\mathcal{P}(\Gamma) \to \mathfrak{D}(\Gamma)$ by $P_F$ and $P_K$ respectively. The difference $P_F - P_K$ is a (1, 0)-form on $\mathfrak{D}(\Gamma)$. As in Sect. 4.1,

$$\vartheta = 2(P_F - P_K).$$

Correspondingly, the isomorphism

$$\mathfrak{D}(\Gamma) \simeq \mathfrak{D}(\Gamma_1, \Omega_1) \times \cdots \times \mathfrak{D}(\Gamma_n, \Omega_n)$$

defines embeddings $\mathfrak{D}(\Gamma_i, \Omega_i) \to \mathfrak{D}(\Gamma)$ and pull-backs $S_i$ and $(P_F - P_K)_i$ of the function $S$ and the (1, 0)-form $P_F - P_K$. The deformation space $\mathfrak{D}(\Gamma_i, \Omega_i)$ describes the simultaneous Kleinian uniformization of the Riemann surfaces $X_1, \ldots, X_n$ by varying the complex structure on $X_i$ and keeping the complex structures on the other Riemann surfaces fixed, and the (1, 0)-form $(P_F - P_K)_i$ is the difference of the corresponding projective connections.
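The relation between the quadratic differential $\vartheta$ and the Schwarzian derivative of the inverse covering maps can be checked by a short direct computation. The sketch below writes $h = J_i^{-1}$ on a component of the region of discontinuity and assumes the standard definition of the Schwarzian derivative, $\mathcal{S}(h) = (h''/h')' - \frac{1}{2}(h''/h')^2$; the shorthand $h$ is introduced here for illustration only. From (6.5), $\phi_{hyp} = \log|h'|^2 - 2\log\mathrm{Im}\,h$, and since $h$ is holomorphic, $(\mathrm{Im}\,h)_z = h'/(2i)$. Hence

$$
\begin{aligned}
(\phi_{hyp})_z &= \frac{h''}{h'} + \frac{i h'}{\mathrm{Im}\,h},\\
(\phi_{hyp})_{zz} &= \left(\frac{h''}{h'}\right)' + \frac{i h''}{\mathrm{Im}\,h} - \frac{(h')^2}{2(\mathrm{Im}\,h)^2},\\
(\phi_{hyp})_z^2 &= \left(\frac{h''}{h'}\right)^2 + \frac{2i h''}{\mathrm{Im}\,h} - \frac{(h')^2}{(\mathrm{Im}\,h)^2},
\end{aligned}
$$

and therefore

$$
2(\phi_{hyp})_{zz} - (\phi_{hyp})_z^2 = 2\left(\frac{h''}{h'}\right)' - \left(\frac{h''}{h'}\right)^2 = 2\mathcal{S}(h).
$$

All terms involving $\mathrm{Im}\,h$ cancel, which is why the resulting quadratic differential is holomorphic.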


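6.3.2. First variation. Here we compute the (1, 0)-form $\partial S$ on $\mathfrak{D}(\Gamma)$.

Theorem 6.2. On the deformation space $\mathfrak{D}(\Gamma)$,

$$\partial S = 2(P_F - P_K).$$

Proof. Since $F^{\varepsilon\mu} = f^{\varepsilon\mu}(F)$ is an admissible fundamental domain for $\Gamma^{\varepsilon\mu}$, and, according to Lemma 2.4, the 1-chain $W^{\varepsilon\mu} = f^{\varepsilon\mu}(W)$ consists of $\Gamma^{\varepsilon\mu}$-contracting paths in $\Omega^{\varepsilon\mu}$, the proof repeats verbatim the proof of Theorem 4.1. Namely, after the change of variables we get

$$L_\mu S = \frac{i}{2}\left(\langle L_\mu\omega, F\rangle - \langle L_\mu\check\theta, L\rangle + \langle L_\mu\check u, W\rangle\right),$$

where $L_\mu\omega = \vartheta\mu\,dz \wedge d\bar z - d\xi$ and the 1-form $\xi$ is given by (4.6). As in the proof of Theorem 4.1, setting $\chi = \delta\xi + L_\mu\check\theta$ we get that the 1-form $\chi$ on $\Omega$ is closed,

$$d\chi = \delta(d\xi) + L_\mu d\check\theta = \delta(-L_\mu\omega) + L_\mu\delta\omega = 0,$$

and satisfies

$$\delta\chi = \delta(L_\mu\check\theta + \delta\xi) = L_\mu\delta\check\theta = L_\mu\check u = d\delta l.$$

Since the 1-chain $W$ consists either of smooth paths or of $\Gamma$-contracting paths in $\Omega$, and the function $\delta l$ is continuous on $W$, the same arguments as in the proof of Theorem 4.1 allow us to conclude that

$$L_\mu S = \frac{i}{2}\langle \vartheta\mu\,dz \wedge d\bar z, F\rangle.$$

Corollary 6.2. Let $X_1, \ldots, X_n$ be Riemann surfaces simultaneously uniformized by a Kleinian group $\Gamma$ of Class A. Then on $\mathfrak{D}(\Gamma_i, \Omega_i)$,

$$(P_F - P_K)_i = \frac{1}{2}\partial S_i.$$

6.3.3. Second variation.

Theorem 6.3. On the deformation space $\mathfrak{D}(\Gamma)$,

$$d\vartheta = \bar\partial\partial S = -2i\,\omega_{WP},$$

so that $-S$ is a Kähler potential of the Weil-Petersson metric on $\mathfrak{D}(\Gamma)$.

The proof is the same as the proof of Theorem 4.2.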


6.4. Kleinian Reciprocity. Let $\mu \in \Omega^{-1,1}(\Gamma)$ be a harmonic Beltrami differential, $f^{\varepsilon\mu}$ be a corresponding normalized solution of the Beltrami equation, and let $\upsilon = \dot f$ be the corresponding vector field on $\hat{\mathbb{C}}$,

$$\upsilon(z) = -\frac{1}{\pi}\iint_{\mathbb{C}} \frac{\mu(w)\,z(z - 1)}{(w - z)\,w(w - 1)}\,d^2w$$

(see Sect. 3.2). Then

$$\varphi_\mu(z) = \frac{\partial^3}{\partial z^3}\upsilon(z) = -\frac{6}{\pi}\iint_{\mathbb{C}} \frac{\mu(w)}{(w - z)^4}\,d^2w$$

is a quadratic differential on $\Gamma\backslash\Omega$, holomorphic outside the support of $\mu$. In [McM00] McMullen proposed the following generalization of quasi-Fuchsian reciprocity.

Theorem 6.4. (McMullen's Kleinian Reciprocity) Let $\Gamma$ be a finitely generated Kleinian group. Then for every $\mu, \nu \in \Omega^{-1,1}(\Gamma)$,

$$\iint_{\Gamma\backslash\Omega} \varphi_\mu\,\nu = \iint_{\Gamma\backslash\Omega} \varphi_\nu\,\mu.$$

The proof in [McM00] is based on the symmetry of the kernel $K(z, w)$, defined in Sect. 4.2. Here we note that Theorem 6.2 provides a global form of Kleinian reciprocity for Class A groups from which Theorem 6.4 follows immediately. Indeed, when $\Gamma$ is a normalized Class A Kleinian group, Kleinian reciprocity is the statement

$$L_\mu L_\nu S = L_\nu L_\mu S,$$

since, according to (4.8),

$$-\frac{1}{2}L_\mu\vartheta(z) = -\frac{6}{\pi}\iint_{\mathbb{C}} \frac{\mu(w)}{(w - z)^4}\,d^2w = \varphi_\mu(z)$$

and

$$\iint_{\Gamma\backslash\Omega} \varphi_\mu\,\nu = -\frac{1}{2}L_\mu L_\nu S.$$
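The passage from the vector field $\upsilon$ to the kernel $(w - z)^{-4}$ in the formula for $\varphi_\mu$ rests on an elementary identity, sketched here for $z$ outside the support of $\mu$, where differentiation under the integral sign is unproblematic. Polynomial division in $z$ gives

$$
\frac{z(z - 1)}{w - z} = \frac{w(w - 1)}{w - z} - (z + w - 1),
$$

since $(w - z)(z + w - 1) = w(w - 1) - z(z - 1)$. The second term is linear in $z$ and is annihilated by three $z$-derivatives, while $\frac{\partial^3}{\partial z^3}\frac{1}{w - z} = \frac{6}{(w - z)^4}$. Therefore

$$
\frac{\partial^3}{\partial z^3}\,\frac{z(z - 1)}{(w - z)\,w(w - 1)} = \frac{6}{(w - z)^4},
$$

and integrating against $-\frac{1}{\pi}\mu(w)\,d^2w$ reproduces the stated expression with the factor $-\frac{6}{\pi}$.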

Acknowledgements. We greatly appreciate stimulating discussions with E. Aldrovandi on homological methods, and with I. Kra, M. Lyubich and B. Maskit on various aspects of the theory of Kleinian groups. We are grateful to K. Krasnov for valuable comments and to C. McMullen for the insightful suggestion to use Epstein surfaces for regularizing the Einstein-Hilbert action. The work of the first author was partially supported by NSF grants DMS-9802574 and DMS-0204628.

References

[Ahl61] Ahlfors, L.V.: Some remarks on Teichmüller's space of Riemann surfaces. Ann. Math. 74(2), 171–191 (1961)
[Ahl87] Ahlfors, L.V.: Lectures on quasiconformal mappings. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software, 1987. With the assistance of Clifford J. Earle Jr. Reprint of the 1966 original
[And98] Anderson, G.C.: Projective structures on Riemann surfaces and developing maps to H^3 and CP^n. Ph.D. thesis, Berkeley, 1998
[AT97] Aldrovandi, E., Takhtajan, L.A.: Generating functional in CFT and effective action for two-dimensional quantum gravity on higher genus Riemann surfaces. Commun. Math. Phys. 188(1), 29–67 (1997)
[Ber60] Bers, L.: Simultaneous uniformization. Bull. Am. Math. Soc. 66, 94–97 (1960)
[Ber70] Bers, L.: Spaces of Kleinian groups. In: Several Complex Variables, I (Proc. Conf. Univ. of Maryland, College Park, Md. 1970). Berlin: Springer, 1970, pp. 9–34
[Ber71] Bers, L.: Extremal quasiconformal mappings. In: Advances in the Theory of Riemann Surfaces (Proc. Conf., Stony Brook, N.Y. 1969). Princeton, NJ: Princeton Univ. Press, 1971, pp. 27–52
[Dur83] Duren, P.L.: Univalent functions. New York: Springer-Verlag, 1983
[Eps84] Epstein, C.L.: Envelopes of horospheres and Weingarten surfaces in hyperbolic 3-space. Preprint, 1984
[Eps86] Epstein, C.L.: The hyperbolic Gauss map and quasiconformal reflections. J. Reine Angew. Math. 372, 96–135 (1986)
[Kra72a] Kra, I.: Automorphic forms and Kleinian groups. Reading, MA: W. A. Benjamin Inc., 1972, Mathematics Lecture Note Series
[Kra72b] Kra, I.: On spaces of Kleinian groups. Comment. Math. Helv. 47, 53–69 (1972)
[Kra00] Krasnov, K.: Holography and Riemann surfaces. Adv. Theor. Math. Phys. 4(4), 929–979 (2000)
[Kra01] Krasnov, K.: On holomorphic factorization in asymptotically AdS 3d gravity. e-preprint hep-th/0109198, 2001
[Mas88] Maskit, B.: Kleinian groups. Berlin: Springer-Verlag, 1988
[McM00] McMullen, C.T.: The moduli space of Riemann surfaces is Kähler hyperbolic. Ann. Math. (2) 151(1), 327–357 (2000)
[MM02] Manin, Y.I., Marcolli, M.: Holography principle and arithmetic of algebraic curves. Adv. Theor. Math. Phys. 5, 617–650 (2002)
[OPS88] Osgood, B., Phillips, R., Sarnak, P.: Extremals of determinants of Laplacians. J. Funct. Anal. 80(1), 148–211 (1988)
[Pol81] Polyakov, A.M.: Quantum geometry of bosonic strings. Phys. Lett. B 103(3), 207–210 (1981)
[Tak92] Takhtajan, L.: Semi-classical Liouville theory, complex geometry of moduli spaces, and uniformization of Riemann surfaces. In: New Symmetry Principles in Quantum Field Theory (Cargèse 1991). New York: Plenum, 1992, pp. 383–406
[Wit98] Witten, E.: Anti de Sitter space and holography. Adv. Theor. Math. Phys. 2(2), 253–291 (1998)
[Wol86] Wolpert, S.A.: Chern forms and the Riemann tensor for the moduli space of curves. Invent. Math. 85(1), 119–145 (1986)
[ZT85] Zograf, P.G., Takhtadzhyan, L.A.: Action of the Liouville equation as generating function for accessory parameters and the potential of the Weil-Petersson metric on Teichmüller space. Funkt. Anal. i Prilozhen. 19(3), 67–68 (1985) (Russian); English transl. in: Funct. Anal. Appl. 19(3), 219–220 (1985)
[ZT87a] Zograf, P.G., Takhtadzhyan, L.A.: On the Liouville equation, accessory parameters and the geometry of Teichmüller space for Riemann surfaces of genus 0. Mat. Sb. (N.S.) 132(174)(2), 147–166 (1987) (Russian); English transl. in: Math. USSR Sb. 60(1), 143–161 (1988)
[ZT87b] Zograf, P.G., Takhtadzhyan, L.A.: On the uniformization of Riemann surfaces and on the Weil-Petersson metric on the Teichmüller and Schottky spaces. Mat. Sb. (N.S.) 132(174)(3), 297–313 (1987) (Russian); English transl. in: Math. USSR Sb. 60(2), 297–313 (1988)

Communicated by P. Sarnak

Commun. Math. Phys. 239, 241–259 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0873-x


Classical Properties of Infinite Quantum Open Systems

P. Ługiewicz, R. Olkiewicz

Institute of Theoretical Physics, University of Wrocław, 50-204 Wrocław, Poland.

Received: 30 September 2002 / Accepted: 27 February 2003. Published online: 3 June 2003 – © Springer-Verlag 2003

Abstract: Long time asymptotic properties of a class of environmentally induced dynamical semigroups on arbitrary von Neumann algebras are discussed. Such a semigroup selects observables, called effective observables, which are immune to the process of decoherence and so evolve in a reversible automorphic way. In particular, it is shown that effective observables of the quantum system in the thermodynamic limit, subjected to a specific interaction with another quantum system, obey classical dynamics.

1. Introduction

The problem of transition from a microscopic to a macroscopic description of Nature is a fundamental one in the discussion of interpretation of quantum mechanics. In particular, the emergence of classical dynamics described by differential, and hence local, equations of motion from the evolution of delocalized quantum states is at the center of this issue. The origin of deterministic laws that govern the classical domain of our everyday experience has attracted much attention in recent years. For example, the question in which asymptotic regime non-relativistic quantum mechanics reduces to its ancestor, i.e. Hamiltonian mechanics, was addressed in [10, 11]. It was shown there that for very many bosons with weak two-body interactions there is a class of states such that time evolution of expectation values of certain operators in these states is approximately described by a non-linear Hartree equation. The problem under what circumstances such an equation reduces to the Newtonian mechanics of point particles was also discussed. A program of deriving irreversible transport equations for macroscopic quantum systems was also carried out. For example, in [6] time evolution of a spinless quantum particle moving in a Gaussian random environment was discussed. It was shown there that in the weak coupling limit the Wigner distribution of a wave function converges globally in time to a solution of the linear Boltzmann equation. A different point of view was

This work was supported by the KBN research grant no 5P03B 081 21


taken in a seminal paper by Gell-Mann and Hartle [12]. They gave a thorough analysis of the role of decoherence in the derivation of phenomenological classical equations of motion. Various forms of decoherence (weak, strong) and realistic mechanisms for the emergence of various degrees of classicality were also presented. Since quantum interferences are damped in the presence of an environment, one may hope that the classical $\hbar \to 0$ limit for quantum dissipative dynamics may exist for arbitrarily large time. Such a problem was discussed in [16].

In this work we adopt a different point of view and follow the idea of environmentally induced decoherence, which is one of the four types of mechanisms (the others being decoherence by Bremsstrahlung [5], collapse induced by gravity [25], and spontaneous localization theories [13]) leading to the reduction of a wave packet. In recent years environmental decoherence has been widely discussed and accepted as a mechanism responsible for the appearance of classicality in quantum measurements and the absence, in the real world, of Schrödinger-cat-like states [4, 14, 17, 31]. The basic principle of decoherence is that classicality is an emergent property induced in quantum open systems by their environment. It is marked by the dynamical transition of the vast majority of pure states of the system to statistical mixtures. It accepts the wave function description of the combined state of a system and its environment but contends that it is practically impossible to distinguish it from the corresponding statistical mixture. Therefore, this line of argument was termed by Bell a FAPP (for all practical purposes) solution to the measurement problem. It is believed that such effects should be the most transparent in quantum systems consisting of many particles [1].

In order to study decoherence, the analysis of the evolution of reduced density matrices obtained by tracing out the environmental variables is the most convenient strategy. For a large class of physical phenomena this evolution can be described by a dynamical semigroup, whose generator is given by a Markovian master equation. In a recent paper [23] a thorough analysis of the superselection structure induced by a dynamical semigroup on the algebra of all bounded operators in a Hilbert space, which is also contractive in the trace norm, was presented. It was achieved by the use of the isometric-sweeping decomposition, which singles out a subalgebra, say $M_1$, of the algebra of all observables whose elements are immune to the process of decoherence and so evolve in a unitary way according to Schrödinger dynamics in the Heisenberg picture. Other observables decay in time (in the appropriate topology) to elements in $M_1$. Therefore, when decoherence happens almost instantaneously, the subalgebra $M_1$ represents effective observables of the quantum system [24]. In a particular case, when decoherence affects all but a subset of the so-called pointer states, the algebra of effective observables becomes commutative, isomorphic to the algebra of bounded sequences on a countable discrete set. And, as was shown in [24], this is the only possible Abelian subalgebra which can be induced by environmental decoherence. Because there are no non-trivial derivations on this algebra of bounded sequences, the evolution, when restricted to $M_1$, must be trivial. Hence, the above scheme, although fruitful in the discussion of quantum measurements and the absence of Schrödinger-cat-like states, cannot be used for the derivation of time continuous classical dynamics or continuous superselection rules. However, it should be noted that those results concern only quantum systems with a finite number of degrees of freedom, whose observables, due to the Stone-von Neumann uniqueness theorem, form a factor (a von Neumann algebra with a trivial center) of type I.

A new perspective is obtained when we pass to the thermodynamic limit. For example, a system of an infinite number of spin-$\frac{1}{2}$ particles in the GNS representation associated with the trace state is described by a hyperfinite factor of type II$_1$. Also, as was shown by


Araki [2], the von Neumann algebra of all creation and annihilation operators in the quantum theory of an infinite free Bose gas is a factor of type III when no condensation is present. Such continuous factors provide room for the dynamical emergence of continuous classical properties of the corresponding quantum system. For example, a bi-contractive semigroup on an infinite spin system may select the algebra of effective observables isomorphic to the algebra of essentially bounded functions on a circle whose evolution is induced by a uniform rotation of that circle [21]. However, due to the existence of a trace state on any finite factor one cannot generate in this way the evolution associated with a flow in a non-compact space. To this end we have to consider more complex systems, and this is the main objective of the present paper. The following theorem (Sect. 2.4) is our basic result.

Theorem. Suppose $(T_t)_{t \geq 0}$ is a σ-weakly continuous semigroup on a von Neumann algebra $M$. If for all $t \geq 0$, the operators $T_t$ satisfy assumptions A1–A6 from Sect. 2.2, then $M = M_1 \oplus M_2$, where $M_1$ is a σ-weakly closed *-subalgebra and $M_2$ is a σ-weakly closed linear *-subspace in $M$. Both $M_1$ and $M_2$ are $T_t$-invariant. The restriction of $T_t$ to $M_1$ extends in time to a one parameter group of *-automorphisms, whereas

$$\lim_{t \to \infty} \psi(T_t x) = 0$$

for all $\psi \in M_*$ and all $x \in M_2 \cap C$, where $C$ is a σ-weakly dense C*-subalgebra in $M$. If a weight $\phi$ in Assumption A4 is a state, then $\psi(T_t x) \to 0$ for all $x \in M_2$. If the predual semigroup of $(T_t)$ is relatively compact in the strong operator topology, then $\psi(T_t x) \to 0$ for all $x \in M_2$, uniformly on bounded sets in $M_2$. If $M_1 \neq 0$, then the projection from $M$ onto $M_1$ is a conditional expectation, and it is a $\phi$-compatible conditional expectation whenever $1 \in M_1$.

Since $M_*$ contains all statistical states of the system, the property $\psi(T_t x) \to 0$ for all $x \in M_2$, which in particular cases may hold without any topological assumptions about the semigroup $(T_t)$, implies that expectation values of all operators from $M_2$ decrease in time to zero, and so observables belonging to $M_1$ are precisely those which can be detected in practice. This justifies the name of effective observables of the system, at least in the case when the process of decoherence is efficient. It is worth pointing out that even if $M$ is a factor, the algebra $M_1$ may possess central elements, the classical observables in the sense of [3]. Hence the above decomposition provides a mathematical description of the appearance of classical properties in the system. A novelty of this decomposition stems from the fact that for its derivation only algebraic methods have been used. It is also worth noting that this result is optimal in the following sense. There is a σ-weakly continuous semigroup of operators on $B(H)$ satisfying A1–A6 and such that $M_1 = 0$, $M_2 = B(H)$, $\psi(T_t x) \to 0$ for all $\psi \in \mathrm{Tr}(H)$ and all $x \in K(H)$, and $\psi(T_t 1) = 1$ for all $t \geq 0$. Here $B(H)$ denotes the von Neumann algebra of all bounded operators in $H$, $\mathrm{Tr}(H)$ is the Banach space of trace class operators, the predual space to $B(H)$, and $K(H)$ stands for the C*-algebra of compact operators.

The above decomposition $M = M_1 \oplus M_2$ (called the isometric-sweeping decomposition) is obviously related to the asymptotic properties of the semigroup $(T_t)$.
Such properties for positive or completely positive semigroups having a faithful normal stationary state (or a faithful family of subinvariant normal states) have been studied by many authors. For example, in [7] and [22] the problem of the approach to equilibrium was addressed. In [9, 15, 19, 29, 30] the existence of the mean ergodic projection on a von Neumann algebra $M$ was considered. Such a projection, being a conditional expectation onto the fixed point subalgebra $M^T$, provides another decomposition, namely $M = M^T \oplus N$, with the obvious inclusion $M^T \subset M_1$. However, the evolution restricted to $M^T$ is trivial, while on $M_1$ it is given by a group of automorphisms. Moreover, in general the restriction of the dynamics to $N$ cannot be controlled. For a partial result in this direction see [8]. From this point of view the isometric-sweeping decomposition is closer to the so-called Jacobs-deLeeuw-Glicksberg splitting into the so-called reversible and flight parts, which holds whenever the semigroup is relatively compact in the weak operator topology, see for example [18]. However, in such a case there is no clear physical interpretation of the flight vectors, which are characterized by the property that 0 is a weak accumulation point of their trajectory $\{T_t x\}$. It should also be pointed out that the isometric-sweeping decomposition may exist even when the Jacobs-deLeeuw-Glicksberg splitting fails to hold [23]. This is due to the fact that our Assumption A4 about the existence of a subinvariant faithful normal weight (see Sect. 2.2), contrary to the existence of a subinvariant faithful normal state, has no direct topological consequences for the semigroup $(T_t)$ or its predual.

Finally, in Sect. 3, we present an example of the environment induced semigroup on a factor of type II$_\infty$ such that the dynamical system $(M_1, T_t|_{M_1})$ is isomorphic with a commutative dynamical system $(L^\infty(\mathbb{R}^3), g_t)$, where $g_t$ is a uniform motion, i.e. with a constant velocity, in $\mathbb{R}^3$. Since any factor of type II$_\infty$ may be written as the tensor product of II$_1$ and I$_\infty$ factors, it is the latter which is responsible for the appearance of non-compact spaces. It should be pointed out here that this result neither compels us to change the view of how quantum mechanics is interpreted nor shows that quantum mechanics reduces to classical mechanics. It demonstrates, however, that in specific circumstances infinite quantum systems may be, for all practical purposes, described as classical ones.

2. The Decomposition of T

2.1.
Notation. Let $M$ be a von Neumann algebra with a distinguished normal semifinite and faithful weight $\phi$. The predual space of $M$ we shall denote by $M_*$, with the duality between $M_*$ and $M$ given by $\psi(x)$, $\psi \in M_*$, $x \in M$. For any subset $N \subset M$ we shall denote by $N_h$ (respectively $N_+$) the set of all Hermitian (respectively positive) operators from $N$. The same will apply for sets in $M_*$. We shall use the standard notation for objects associated with $\phi$ in the Tomita-Takesaki theory (because the weight $\phi$ is fixed we omit the subscript $\phi$ in notation) such as $\mathfrak{N} = \{x \in M : \phi(x^*x) < \infty\}$, $\mathfrak{M} = \mathrm{span}\{y^*x : x, y \in \mathfrak{N}\} = \mathrm{span}\{x \in M_+ : \phi(x) < \infty\}$, $\Lambda$ the canonical injection of $\mathfrak{N}$ into its Hilbert space completion $H$ with the scalar product given by $\langle \Lambda(x), \Lambda(y)\rangle = \phi(y^*x)$, $x, y \in \mathfrak{N}$, $\pi$ the canonical representation of $M$ in $H$ being an isometry and $\sigma$-weak homeomorphism of $M$ onto $\pi(M)$, $\Delta$ the modular operator in $H$ arising from the left Hilbert algebra $\mathfrak{U} = \Lambda(\mathfrak{N} \cap \mathfrak{N}^*)$, $J$ the corresponding isometric involution in $H$, and $(\sigma_t)_{t\in\mathbb{R}}$ the group of modular automorphisms on $M$ associated with $\phi$. By $S$ (respectively $F$) we shall denote the corresponding sharp (respectively flat) operators in $H$, i.e. for any $\xi \in D_S$, $S(\xi) = \xi^\sharp$, and for any $\eta \in D_F$, $F(\eta) = \eta^\flat$. By the same symbol $\pi$ we shall denote the canonical faithful representation of $\mathfrak{U}$ in $B(H)$ given by $\pi(\xi)\eta = \xi\eta$, $\xi, \eta \in \mathfrak{U}$. Thus, for any $x \in \mathfrak{N}$, $\pi(\Lambda(x)) = \pi(x)$. The operator norm in $M$ we shall denote by $\|\cdot\|_\infty$, the norm in $M_*$ by $\|\cdot\|_1$, and the norm in $H$ induced by the scalar product by $\|\cdot\|_2$. The operator norm in $B(H)$ will be denoted by $\|\cdot\|_{op}$. Thus $\|\pi(x)\|_{op} = \|x\|_\infty$ for all $x \in M$. Because in the following

we use operators acting in all three spaces $M$, $M_*$, and $H$, in order to avoid confusion we shall mark operators acting in $M_*$ with the subscript 1, and those acting in $H$ with the subscript 2. The closure of $\mathfrak{M}$ in the operator norm we shall denote by $C$. Since $\mathfrak{M}$ is a $*$-algebra, $C$ is a $\sigma$-weakly dense $C^*$-subalgebra in $M$. The classes of right (left) bounded elements in $H$ we shall denote by $\mathfrak{U}'$ and $\mathfrak{U}''$ respectively, i.e.
\[ \mathfrak{U}' = \{\eta \in D_F : \exists c > 0\ \ \|\pi(\xi)\eta\|_2 \le c\|\xi\|_2\ \ \forall \xi \in \mathfrak{U}\}, \]
\[ \mathfrak{U}'' = \{\xi \in D_S : \exists c > 0\ \ \|\pi'(\eta)\xi\|_2 \le c\|\eta\|_2\ \ \forall \eta \in \mathfrak{U}'\}, \]
where $\pi'(\eta)\xi = \pi(\xi)\eta$, $\xi \in \mathfrak{U}$. Because the algebra $\mathfrak{U}$ is achieved, $\mathfrak{U}'' = \mathfrak{U}$. Finally, the injection of $\mathfrak{M}$ into $M_*$, $x \mapsto \phi_x$, given by
\[ \phi_x(u) = \sum_{i=1}^{n} \langle J\pi(u)^*J\Lambda(y_i), \Lambda(z_i)\rangle, \tag{2.1} \]
where $x = \sum_i z_i^* y_i$, $z_i, y_i \in \mathfrak{N}$, and $u \in M$, we shall denote by $\Phi$. As was shown in [28], $\phi_x$ is well defined, i.e. independent of the representation of $x$, and $\Phi$ is an injective linear positive map onto a norm dense subspace in $M_*$. In the following we also use other expressions for $\phi_x$ given in [27]:
\[ \phi_x(z^*y) = \langle J\pi(x)^*J\Lambda(y), \Lambda(z)\rangle, \quad y, z \in \mathfrak{N}, \tag{2.2} \]
\[ \phi_x(y) = \langle \Lambda(y), J\Lambda(x)\rangle, \quad y \in \mathfrak{N}, \tag{2.3} \]
\[ \phi_x(y) = \phi_y(x), \quad y \in \mathfrak{M}. \tag{2.4} \]
The closure of $\mathfrak{M}$ in the norm $\|x\|_L = \max\{\|x\|_\infty, \|\phi_x\|_1\}$ we shall denote by $L$. As was shown in [27], $\Phi$ extends to an injection from $L$ into $M_*$.

2.2. Dynamics. Since the generalization to the time continuous case is straightforward we restrict our considerations to the discrete time case. Suppose that a bounded operator $T : M \to M$ satisfies the following assumptions:

A1) $T$ is two-positive,
A2) $T$ is normal with the predual $T_{*1} : M_* \to M_*$,
A3) $T(1) \le 1$, where 1 is the identity operator in $M$,
A4) $\phi \circ T \le \phi$, i.e. $\phi(Tx) \le \phi(x)$ for all $x \in M_+$,
A5) $T \circ \sigma_t = \sigma_t \circ T$ for all $t \in \mathbb{R}$,
A6) $T_{*1}$ preserves the space $\Phi(\mathfrak{M})$, i.e. $T_{*1} : \Phi(\mathfrak{M}) \to \Phi(\mathfrak{M})$.

Let us comment on the above assumptions. An environmentally induced semigroup is a semigroup describing the reduced dynamics of the joint system which consists of a quantum system and its environment. It is given by the Markovian approximation for the family of maps (Tt ) being the composition of a conditional expectation with respect to the reservoir variables and a unitary evolution of the joint system. Hence each operator Tt is normal completely positive and preserves the identity operator in M. Also the existence of a Tt -invariant faithful normal state has been established in many models. So the first four of the assumptions only generalize these properties. The last two are technical ones. They are the consequences of arbitrariness of the weight φ, and are superfluous in the case when φ is a tracial weight. Moreover, the Assumptions A1–A6 clearly generalize those in [23 and 21].


Proposition 1. Suppose $M$ is a semifinite factor with a faithful normal and semifinite trace $\tau$. Suppose further that $T : M \to M$ is a two-positive normal operator such that:
(i) $\|Tx\|_\infty \le \|x\|_\infty$ for all $x \in M$,
(ii) $\|Tx\|_1 \le \|x\|_1$ for all $x \in \mathfrak{M}$.
Then $T$ satisfies A1–A6 with $\phi = \tau$.

Proof. Since $\phi = \tau$, $\sigma_t = \mathrm{id}$. Hence only point A6 needs to be shown. Let $x \in \mathfrak{M}_+$. Then, for any $y \in M$,
\[ \Phi(x)(y) = \langle J\pi(y)^*J\Lambda(x^{1/2}), \Lambda(x^{1/2})\rangle = \tau(xy) \equiv \tau_x(y). \]
So, by linearity, $\Phi(x) = \tau_x$ for all $x \in \mathfrak{M}$. By (ii), $T$ extends to a contractive operator $T_1$ in $L^1(M)$, the predual space of $M$. Its dual operator $T_1^* : M \to M$, by Proposition 1 in [30], satisfies $\tau(T_1^*x) \le \tau(x)$ for all $x \in M_+$, and so $T_1^* : \mathfrak{M} \to \mathfrak{M}$. Because
\[ \Phi(T_1^*x) = T_{*1}\Phi(x) \]
for all $x \in \mathfrak{M}$, we obtain $T_{*1} : \Phi(\mathfrak{M}) \to \Phi(\mathfrak{M})$.

Let us turn to the general case. In short, the isometric-sweeping decomposition will be obtained in the following way: firstly, by implementing the dynamics in a Hilbert space associated with a subinvariant faithful weight, where the unitary-completely nonunitary decomposition holds; secondly, by extending this decomposition to the predual space; and, finally, by transferring this result by duality to the algebra. Suppose that $T$ satisfies A1–A6. Then, by A1 and A3, $T$ satisfies the Schwarz inequality $T(x)^*T(x) \le T(x^*x)$, and so is contractive in the operator norm. By the Schwarz inequality and A4, $T : \mathfrak{N} \to \mathfrak{N}$. The map $T_2\Lambda(x) = \Lambda(Tx)$, $x \in \mathfrak{N}$, is contractive in $\|\cdot\|_2$, and so extends to the whole space $H$. This extension we denote also by $T_2$. By A4, $T : \mathfrak{M} \to \mathfrak{M}$, and so the map $T_1 : \Phi(\mathfrak{M}) \to \Phi(\mathfrak{M})$ given by $T_1(\phi_x) = \phi_{Tx}$, $x \in \mathfrak{M}$, is well defined.

Theorem 1. $T_1$ extends to a linear two-positive and contractive operator on $M_*$ (denoted also by $T_1$).

Proof. Step 1. First we show that for any $x \in \mathfrak{M}$, $\|T_1\phi_x\|_1 \le 2\|\phi_x\|_1$. If $x \in \mathfrak{M}_h$, then $Tx \in \mathfrak{M}_h$, and so
\[ \|\phi_{Tx}\|_1 = \inf\{\phi(h) + \phi(k) : h - k = Tx,\ h, k \in \mathfrak{M}_+\} \le \inf\{\phi(Ty) + \phi(Tz) : y - z = x,\ y, z \in \mathfrak{M}_+\} \le \inf\{\phi(y) + \phi(z) : y - z = x,\ y, z \in \mathfrak{M}_+\} = \|\phi_x\|_1. \]
Hence, using the Hermitian decomposition and the property $\|\phi_{x^*}\|_1 = \|\phi_x\|_1$, we arrive at $\|\phi_{Tx}\|_1 \le 2\|\phi_x\|_1$ for all $x \in \mathfrak{M}$. The bounded extension of $T_1$ onto $M_*$ we denote also by $T_1$.

Step 2. We show that $\Phi$ is completely positive. Suppose $\tilde{x} \in \mathfrak{M} \otimes M_{n\times n}$ and $\tilde{x} \ge 0$, where $M_{n\times n}$ is the algebra of $n \times n$ matrices. A functional $(\phi_{x_{ij}})_{i,j} = \Phi \otimes \mathrm{id}(\tilde{x})$ is positive on $M \otimes M_{n\times n}$ if and only if for any $y_1, \ldots, y_n \in \mathfrak{N}$ there is
\[ \sum_{i,j=1}^{n} \phi_{x_{ij}}(y_j^*y_i) \ge 0. \]
Let $\tilde{\xi} = (J\Lambda(y_1), \ldots, J\Lambda(y_n)) \in H \otimes \mathbb{C}^n$. Because $\pi \otimes \mathrm{id}(\tilde{x})$ is positive,
\[ \langle \pi \otimes \mathrm{id}(\tilde{x})\tilde{\xi}, \tilde{\xi}\rangle = \sum_{i,j=1}^{n} \langle J\pi(x_{ij})J\Lambda(y_i), \Lambda(y_j)\rangle = \sum_{i,j=1}^{n} \phi_{x_{ij}}(y_j^*y_i) \ge 0. \]


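Step 3. Because $\Phi$ is completely positive, $T_1$ is two-positive. Hence the dual operator $T_1^* : M \to M$ is also bounded and two-positive. By Step 1, $\|T_1\psi\|_1 \le \|\psi\|_1$ for all $\psi \in M_{*+}$. Hence $\psi(1) - \psi(T_1^*1) \ge 0$, and so $T_1^*1 \le 1$. Thus $T_1^*$ satisfies the Schwarz inequality and so is contractive in the operator norm. Hence $\|T_1\psi\|_1 \le \|\psi\|_1$ for all $\psi \in M_*$.

Let us next consider the dual operator $T_1^* : M \to M$. Suppose that $x \in \mathfrak{M}$. Then, for any $y \in \mathfrak{M}$,
\[ \phi_y(T_1^*x) = \phi_{Ty}(x) = \phi_x(Ty) = (T_{*1}\phi_x)(y). \]
By Proposition 7 in [27], we infer that $T_1^*x \in L$ and $\phi_{T_1^*x} = T_{*1}\phi_x$. By Assumption A6, $T_1^*x \in \mathfrak{M}$. If $x \in \mathfrak{M}_+$, then
\[ \phi(T_1^*x) = \langle J\pi(1)^*J\Lambda((T_1^*x)^{1/2}), \Lambda((T_1^*x)^{1/2})\rangle = \phi_{T_1^*x}(1) = \|\phi_{T_1^*x}\|_1 = \|T_{*1}\phi_x\|_1 \le \|\phi_x\|_1 = \phi(x), \]
and so $\phi \circ T_1^* \le \phi$. In particular, $T_1^* : \mathfrak{M} \to \mathfrak{M}$ and the map $(T_1^*)_1(\phi_x) = \phi_{T_1^*x}$, $x \in \mathfrak{M}$, is well defined and extends to a bounded operator on $M_*$. Let $x \in \mathfrak{M}$ and $y \in M$. Then
\[ (T_1^*)_1(\phi_x)(y) = \phi_{T_1^*x}(y) = \phi_x(Ty) = (T_{*1}\phi_x)(y). \]
Because $\Phi(\mathfrak{M})$ is norm dense in $M_*$, $(T_1^*)_1 = T_{*1}$. This identification allows us to denote the operator $T_1^*$ on $M$ by $T_*$. Using the same argument as for the operator $T$ we obtain that the operator $T_{*2} : \Lambda(\mathfrak{N}) \to \Lambda(\mathfrak{N})$, $T_{*2}\Lambda(x) = \Lambda(T_*x)$, extends to a contraction on $H$ which we denote also by $T_{*2}$.

Theorem 2. $T_2^* = T_{*2}$, where $T_2^*$ is the adjoint operator of $T_2$.

Proof. Step 1. Suppose that $x, y \in \mathfrak{M}$. Then
\[ (T_{*1}\phi_x)(y) = \phi_x(Ty) = \langle T_2\Lambda(y), J\Lambda(x)\rangle. \]
On the other hand
\[ (T_{*1}\phi_x)(y) = \phi_{T_*x}(y) = \langle \Lambda(y), JT_{*2}\Lambda(x)\rangle. \]
Because $\Lambda(\mathfrak{M})$ is dense in $H$, $JT_{*2}J = T_2^*$, and hence $(T_{*2})^* = JT_2J$.

Step 2. $T_2 : D(\Delta^{1/2}) \to D(\Delta^{1/2})$ and $T_2 \circ \Delta^{1/2} = \Delta^{1/2} \circ T_2|_{D(\Delta^{1/2})}$. Suppose $x \in \mathfrak{N} \cap \mathfrak{N}^*$. Then $\pi(\Delta^{it}\Lambda(x)) = \pi(\sigma_t x)$. Hence $\Delta^{it}\Lambda(x) \in \mathfrak{U}$ and $\Delta^{it}\Lambda(x) = \Lambda(\sigma_t x)$. Thus, by Assumption A5,
\[ T_2\Delta^{it}\Lambda(x) = \Lambda(T\sigma_t x) = \Lambda(\sigma_t Tx) = \Delta^{it}T_2\Lambda(x). \]
Because $\Delta^{it}$ and $T_2$ are contractions and $\mathfrak{U}$ is dense in $H$, $T_2\Delta^{it} = \Delta^{it}T_2$. Let $A = \ln\Delta$ and let $A = \int \lambda\, dE(\lambda)$ be its spectral decomposition. Then $T_2E(B) = E(B)T_2$ for any Borel set $B \subset \mathbb{R}$. Because $\Delta^{1/2} = \int e^{\lambda/2}\, dE(\lambda)$, the assertion follows.

Step 3. Suppose again that $x \in \mathfrak{N} \cap \mathfrak{N}^*$. Then $(T_2S)\Lambda(x) = (ST_2)\Lambda(x)$. Because $S = J\Delta^{1/2}$ and $D(\Delta^{1/2}) = D_S$,
\[ (T_2J\Delta^{1/2})\Lambda(x) = (J\Delta^{1/2}T_2)\Lambda(x) = (JT_2\Delta^{1/2})\Lambda(x). \]
Because $\Delta^{1/2}\Lambda(x) = J\Lambda(x^*)$, the set $\Delta^{1/2}\mathfrak{U}$ is dense in $H$. Thus $T_2J = JT_2$ and $T_{*2}J = JT_{*2}$. By Step 1, $T_{*2} = T_2^*$, which completes the proof.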


Summing up this section: In each of the spaces $M$, $M_*$ and $H$ there is a pair of contractive (in the corresponding norms) operators
\[ T,\ T_* : M \to M \quad \text{two-positive and normal}, \tag{2.5} \]
\[ T_1,\ T_{*1} : M_* \to M_* \quad \text{two-positive}, \tag{2.6} \]
\[ T_2,\ T_{*2} : H \to H, \tag{2.7} \]
such that $T$ is dual to $T_{*1}$, $T_*$ is dual to $T_1$, and $T_2$ and $T_{*2}$ are adjoint in $H$. Moreover, both $T$ and $T_*$ leave the space $\mathfrak{M}$ invariant, and
\[ \phi \circ T \le \phi, \qquad \phi \circ T_* \le \phi. \tag{2.8} \]

2.3. The unitary decomposition of $T_2$. Suppose $S_{2,m} = T_{*2}^m T_2^m$, $m \in \mathbb{N}$, and let $K^m = \{\xi \in H : S_{2,m}\xi = \xi\}$. It is clear that $K^m$ is a closed linear subspace in $H$. If $\xi \in K^m$ and $m \ge 2$, then
\[ \|\xi\|_2 = \|S_{2,m}\xi\|_2 \le \|T_2^{m-1}\xi\|_2 \le \|\xi\|_2. \]
Hence $\|T_2^{m-1}\xi\|_2 = \|\xi\|_2$. Because
\[ \|S_{2,(m-1)}\xi - \xi\|_2^2 = \|S_{2,(m-1)}\xi\|_2^2 - 2\|T_2^{m-1}\xi\|_2^2 + \|\xi\|_2^2 \le 0, \]
we obtain $S_{2,(m-1)}\xi = \xi$. Thus $(K^m)$ is a decreasing sequence of closed subspaces in $H$ and we define $K^\infty = \bigcap_{m=1}^{\infty} K^m$. Let $P_{2,m}$ and $P_2$ be the orthogonal projections in $H$ onto $K^m$ and $K^\infty$ respectively. It is well known, see for example [18], that for any $\xi \in H$,
\[ P_{2,m}\xi = \lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} S_{2,m}^k\xi, \]
and the limit exists in the norm in $H$. From Step 3 in the proof of Theorem 2 we obtain that $P_{2,m} \circ J = J \circ P_{2,m}$, and $P_2 \circ J = J \circ P_2$.

Theorem 3. Suppose $S_m = T_*^m T^m$. Then, for any $x \in \mathfrak{M}$, the mean average limit converges $\sigma$-strongly$^*$ to a two-positive and contractive in the operator norm projection $P_m : \mathfrak{M} \to \mathfrak{M}$,
\[ P_m x = \lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} S_m^k x. \]
Moreover, $\phi(P_m x) \le \phi(x)$ for all $x \in \mathfrak{M}_+$.

Proof. Let us define
\[ x_n = \frac{1}{n}\sum_{k=0}^{n-1} S_m^k x, \qquad \xi_n = \Lambda(x_n) = \frac{1}{n}\sum_{k=0}^{n-1} S_{2,m}^k\Lambda(x). \]
Then $x_n \in \mathfrak{M}$ and $\xi_n \in \mathfrak{U}$. Let $\xi = \lim_{n\to\infty}\xi_n = P_{2,m}\Lambda(x)$.

Step 1. $\xi, \xi^\sharp \in \mathfrak{U}$. Because $S(S_{2,m}^k\Lambda(x)) = S_{2,m}^k(\Lambda(x^*))$, the sequence $\xi_n^\sharp$ converges in $H$. Since $S$


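is closed, $\xi \in D_S$ and $\xi_n^\sharp \to \xi^\sharp$. Hence $\pi(\xi)$ is affiliated to $\pi(M)$. Suppose that $\eta \in \mathfrak{U}'$. Then
\[ \|\pi'(\eta)\xi_n\|_2 = \|\pi(\xi_n)\eta\|_2 \le \|\pi(\xi_n)\|_{op}\|\eta\|_2. \]
However, $\|\pi(\xi_n)\|_{op} = \|x_n\|_\infty \le \|x\|_\infty$. Hence $\|\pi'(\eta)\xi_n\|_2 \le \|x\|_\infty\|\eta\|_2$. Because $\xi_n \to \xi$ and $\pi'(\eta)$ is bounded, also $\|\pi'(\eta)\xi\|_2 \le \|x\|_\infty\|\eta\|_2$. Thus $\pi(\xi)$ is bounded, which implies that $\xi \in \mathfrak{U}$. Replacing $x$ by $x^*$ one may check in the same way that $\xi^\sharp \in \mathfrak{U}$.

Step 2. Let $y = \Lambda^{-1}(\xi)$. Then $x_n \to y$ in the $\sigma$-strong$^*$ topology. Clearly, it is sufficient to consider the operators $\pi(x_n)$ and $\pi(y)$. Suppose that $\eta \in \mathfrak{U}'$. Then
\[ \|\pi(x_n)\eta - \pi(y)\eta\|_2 = \|\pi'(\eta)(\xi_n - \xi)\|_2 \to 0. \]
Because $\|\pi(x_n)\|_{op} \le \|x\|_\infty$ and $\mathfrak{U}'$ is dense in $H$, $\pi(x_n) \to \pi(y)$ strongly. Since $\xi_n^\sharp \to \xi^\sharp \in \mathfrak{U}$ and $\pi(x_n)^*\eta = \xi_n^\sharp\eta$, $\pi(y)^*\eta = \xi^\sharp\eta$ for all $\eta \in H$, we get $\pi(x_n) \to \pi(y)$ strongly$^*$. In particular, $\|\pi(y)\|_{op} \le \|x\|_\infty$. Since the $\sigma$-strong$^*$ and strong$^*$ topologies coincide on bounded sets, the assertion follows.

Step 3. $y \in \mathfrak{M}$. Suppose that $z \in \mathfrak{M}$ and let $z_n = \frac{1}{n}\sum_{k=0}^{n-1} S_m^k z$. Then
\[ |\phi_z(y)| = |\langle \Lambda(y), J\Lambda(z)\rangle| = |\langle P_{2,m}\Lambda(x), J\Lambda(z)\rangle| = |\langle \Lambda(x), JP_{2,m}\Lambda(z)\rangle| = \lim_{n\to\infty}|\langle \Lambda(x), J\Lambda(z_n)\rangle| = \lim_{n\to\infty}|\phi_{z_n}(x)| = \lim_{n\to\infty}|\phi_x(z_n)| \le \|\phi_x\|_1\|z\|_\infty. \]
Hence, by Corollary 17 in [27], $y \in L$. Because $x = x_1 - x_2 + i(x_3 - x_4)$, where $x_j \in \mathfrak{M}_+$, we have $y_j := \Lambda^{-1}(P_{2,m}\Lambda(x_j)) \in L_+$, and $(x_j)_n \to y_j$ $\sigma$-strongly. Since the weight $\phi$ is $\sigma$-weakly lower continuous, $\phi(y_j) \le \liminf \phi((x_j)_n)$. However, by formula (2.8),
\[ \phi((x_j)_n) = \frac{1}{n}\sum_{k=0}^{n-1}\phi(S_m^k x_j) \le \phi(x_j). \]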

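Hence $y_j \in \mathfrak{M}_+$ and so, by linearity, $y \in \mathfrak{M}$.

Step 4. Finally, let us define a map $P_m : \mathfrak{M} \to \mathfrak{M}$ by $P_m x = \Lambda^{-1}(P_{2,m}\Lambda(x))$. Then $P_m$ is a positive and contractive in the operator norm projection such that $P_m x = \lim_{n\to\infty} x_n$ in the $\sigma$-strong$^*$ topology. Moreover, $\phi(P_m x) \le \phi(x)$ for all $x \in \mathfrak{M}_+$. Thus only two-positivity remains to be shown. Let $\tilde{M} = M \otimes M_{2\times 2}$ and let $\tilde{\phi} = \phi \otimes \mathrm{Tr}$, where Tr is the standard trace on $M_{2\times 2}$, the algebra of $2 \times 2$ matrices. Replacing $T$ and $T_*$ by $\tilde{T} = T \otimes \mathrm{id}$ and $\tilde{T}_* = T_* \otimes \mathrm{id}$, and noting that $\tilde{S}_{2,m} = \tilde{T}_{*2}^m\tilde{T}_2^m$ is contractive in the corresponding Hilbert space and $\tilde{S}_m = \tilde{T}_*^m\tilde{T}^m$ is positive on $\tilde{\mathfrak{M}} = \mathfrak{M} \otimes M_{2\times 2}$, we conclude that $P_m \otimes \mathrm{id}$ is positive, and so $P_m$ is two-positive.

Suppose now that $x, y \in \mathfrak{M}$. Then
\[ \phi_{P_m x}(y) = \langle \Lambda(y), JP_{2,m}\Lambda(x)\rangle = \langle \Lambda(P_m y), J\Lambda(x)\rangle = \phi_x(P_m y) \tag{2.9} \]
because $P_{2,m}$ commutes with $J$. Using the above property we define a contractive projection $P_{1,m}$ on $M_*$ as follows. Let $P_{1,m}\phi_x = \phi_{P_m x}$, $x \in \mathfrak{M}$. Then
\[ \|P_{1,m}\phi_x\|_1 = \sup_{y \in \mathfrak{M},\ \|y\|_\infty \le 1}|\phi_{P_m x}(y)| = \sup_{y \in \mathfrak{M},\ \|y\|_\infty \le 1}|\phi_x(P_m y)| \le \|\phi_x\|_1. \]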


Because $\Phi(\mathfrak{M})$ is norm dense in $M_*$, $P_{1,m}$ extends to a two-positive contractive projection on $M_*$ which we denote also by $P_{1,m}$. Let $P_{1,m}^* : M \to M$ be the dual projection. By formula (2.9), $P_{1,m}^*|_{\mathfrak{M}} = P_m$. Hence there exists a unique extension of $P_m$ to a two-positive normal and contractive in the operator norm projection on $M$ which we denote also by $P_m$. Using the properties of the projections $P_{2,m}$ and the fact that $P_2\xi = \lim_{m\to\infty} P_{2,m}\xi$ one may show in the same way as above the existence of two-positive and contractive (in the corresponding norms) projections $P_1 : M_* \to M_*$, and its dual $P : M \to M$, associated with the orthogonal projection $P_2$ in $H$. Two-positivity follows actually from the fact that for all $x \in \mathfrak{M}$, $P_m(x) \to P(x)$ in the $\sigma$-strong$^*$ topology, and so
\[ \sum_{i,j=1}^{2} y_j^*P(x_j^*x_i)y_i = \lim_{m\to\infty}\sum_{i,j=1}^{2} y_j^*P_m(x_j^*x_i)y_i \ge 0 \]
for any $x_i, y_i \in \mathfrak{M}$, $i = 1, 2$. It is also clear that $P : \mathfrak{M} \to \mathfrak{M}$ and $\phi \circ P \le \phi$. For $x \in \mathfrak{M}$, $Px$ is given by $Px = \Lambda^{-1}(P_2\Lambda(x))$.

Suppose now that $R_{2,m} = P_2T_2^mT_{*2}^mP_2$, $m \in \mathbb{N}$. Clearly, $R_{2,m}$ is a contraction in $H$, and let $K_m$ be its fixed point space, i.e. $K_m = \{\xi \in H : R_{2,m}\xi = \xi\}$. Then $K_{m+1} \subset K_m \subset K^\infty$. Let
\[ K = \bigcap_{m=1}^{\infty} K_m. \]
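The mean ergodic formula used above for $P_{2,m}$ (and, by the same argument, for the fixed point spaces $K_m$) can be checked numerically in a finite-dimensional toy case. The sketch below is only an illustration under assumptions that are not part of the paper: the Hilbert space is $\mathbb{R}^3$, and the positive contraction `S_DIAG` is a hypothetical diagonal matrix with one eigenvalue equal to 1, so that its fixed point space is the first coordinate axis.

```python
def apply_diag(diag, vec):
    """Apply a diagonal operator, given by its diagonal entries, to a vector."""
    return [d * v for d, v in zip(diag, vec)]


def cesaro_mean(diag, vec, n):
    """Return (1/n) * sum_{k=0}^{n-1} S^k vec for the diagonal operator S."""
    total = [0.0] * len(vec)
    current = list(vec)  # S^0 vec
    for _ in range(n):
        total = [t + c for t, c in zip(total, current)]
        current = apply_diag(diag, current)
    return [t / n for t in total]


# Hypothetical contraction: eigenvalue 1 on the first axis, strictly smaller elsewhere.
S_DIAG = [1.0, 0.5, 0.9]
xi = [2.0, 3.0, -1.0]

approx = cesaro_mean(S_DIAG, xi, 20000)
print(approx)  # first entry stays 2.0, the other two entries are small
```

The averages approach the orthogonal projection of `xi` onto the fixed points of the contraction, which is the finite-dimensional counterpart of the limit defining $P_{2,m}$.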

Proposition 2. $K$ is a unitary space for $T_2$.

Proof. Suppose $\xi \in K$. Because $\xi \in K^\infty$, $\|T_2^m\xi\|_2 = \|\xi\|_2$ for all $m \in \mathbb{N}$. Because $\xi \in K_m$, $\|T_{*2}^mP_2\xi\|_2 = \|\xi\|_2$. However, $P_2\xi = \xi$ and $T_{*2} = T_2^*$, which implies that $\|T_2^{*m}\xi\|_2 = \|\xi\|_2$ for all $m \in \mathbb{N}$. Conversely, suppose that for all $m \in \mathbb{N}$ there is $\|T_2^m\xi\|_2 = \|T_2^{*m}\xi\|_2 = \|\xi\|_2$. Then $T_{*2}^mT_2^m\xi = T_2^mT_{*2}^m\xi = \xi$, and so $P_2\xi = \xi$, which implies that $R_{2,m}\xi = \xi$. Hence $\xi \in K$.

Let $Q_{2,m}$ and $Q_2$ be the orthogonal projections in $H$ onto $K_m$ and $K$ respectively. Let $R_m = PT^mT_*^mP$. Since the operators $R_m : \mathfrak{M} \to \mathfrak{M}$ possess the same properties as $S_m$, by repeating the steps of Theorem 3 and the discussion after it we arrive at the following result, which we state without a proof.

Theorem 4. There are two-positive and contractive (in the corresponding norms) projections $Q_1 : M_* \to M_*$ and its dual $Q : M \to M$, and an orthogonal projection $Q_2$ from $H$ onto $K$ such that
\[ Q_1\Phi(x) = \Phi(Qx) \quad \forall x \in \mathfrak{M}, \tag{2.10} \]
\[ Q_2\Lambda(x) = \Lambda(Qx) \quad \forall x \in \mathfrak{N} \cap \mathfrak{N}^*, \tag{2.11} \]
\[ Q : \mathfrak{M} \to \mathfrak{M} \quad \text{and} \quad \phi \circ Q \le \phi. \tag{2.12} \]

Proposition 3. $Q(1) \le 1$. $Q(1) = 1$ if and only if the projection $Q$ is $\phi$-compatible.

Proof. Assume that $Q(1) = 1$. Then, for any $x \in \mathfrak{M}_+$, $\phi(Qx) = \phi_{Qx}(1) = (Q_1\phi_x)(1) = \phi_x(Q(1)) = \phi_x(1) = \phi(x)$. Moreover, $\phi|_{Q(M)}$ is semifinite. Conversely, suppose that $\phi(Qx) = \phi(x)$ for all $x \in \mathfrak{M}_+$. Then $\phi_x(Q(1)) = \phi_x(1)$ for all $x \in \mathfrak{M}$. Since $\Phi(\mathfrak{M})$ is norm dense in $M_*$, the assertion follows. The inequality $Q(1) \le 1$ follows from formula (2.12).


2.4. The isometric-sweeping decomposition. We start with the following observation.

Proposition 4.
\[ QT = TQ, \qquad QT_* = T_*Q, \]
\[ Q_1T_1 = T_1Q_1, \qquad Q_1T_{*1} = T_{*1}Q_1. \]

Proof. By duality, it is sufficient to check only the first line. But this follows from the property $Q_2T_2 = T_2Q_2$ (since $K$ is the unitary space for $T_2$), formula (2.11), the fact that all three operators $Q$, $T$ and $T_*$ are normal, and the fact that $\mathfrak{M}$ is $\sigma$-weakly dense in $M$.

We are now in a position to formulate our main results.

Theorem 5. $M_* = M_{*1} \oplus M_{*2}$, where $M_{*1}$ and $M_{*2}$ are norm closed $T_{*1}$- and $T_1$-invariant $*$-subspaces. The restriction of $T_1$ to $M_{*1}$ is an invertible isometry and $T_{*1}|_{M_{*1}}$ is its inverse. For any $\psi \in M_{*2}$ and all $x \in C$ there is
\[ \lim_{n\to\infty}(T_1^n\psi)(x) = \lim_{n\to\infty}(T_{*1}^n\psi)(x) = 0. \tag{2.13} \]
If $(T_{*1}^n)$ is relatively compact in the strong operator topology, then
\[ \lim_{n\to\infty}\|T_{*1}^n\psi\|_1 = 0 \tag{2.14} \]
for all $\psi \in M_{*2}$.

Remark 1. If $\psi \in M_{*2}$, then for any $\phi$-finite projection $e \in M$ we have $(T_{*1}^n\psi)(e) \to 0$, which justifies the name of sweeping. $M_{*1}$ is called the isometric part.

Proof. Let $M_{*1}$ be the image of the projection $Q_1$ and let $M_{*2} = \{\psi - Q_1\psi : \psi \in M_*\}$. The first part of the theorem is clear. If $x \in \mathfrak{M}$ and $Qx = x$, then $\Lambda(T_*Tx) = T_{*2}T_2\Lambda(x) = \Lambda(x) = \Lambda(TT_*x)$. Hence $T_*Tx = TT_*x = x$, and so $T_{*1}T_1\phi_x = T_1T_{*1}\phi_x = \phi_x$. Because the space $\{\phi_x : Qx = x\}$ is norm dense in $M_{*1}$,
\[ T_{*1}|_{M_{*1}}T_1|_{M_{*1}} = T_1|_{M_{*1}}T_{*1}|_{M_{*1}} = \mathrm{id}|_{M_{*1}}. \]
Since both $T_1$ and $T_{*1}$ are contractive, they are isometric operators on $M_{*1}$. Suppose now that $x, z \in \mathfrak{M}$. Then $Q_2^\perp\Lambda(x) \in K^\perp$, the completely nonunitary subspace for $T_2$, and so
\[ (T_1^n(\phi_x - Q_1\phi_x))(z) = (T_1^n\phi_{x-Qx})(z) = \langle \Lambda(z), J\Lambda(T^n(x - Qx))\rangle = \langle \Lambda(z), JT_2^nQ_2^\perp\Lambda(x)\rangle \to 0 \]
when $n \to \infty$. The case of $T_{*1}$ may be shown in the same way. Since the space $\{\phi_x - Q_1\phi_x\}$ is norm dense in $M_{*2}$, and $\mathfrak{M}$ is norm dense in $C$, the formula (2.13) follows.

Finally, suppose that $(T_{*1}^n)$ is relatively compact in the strong operator topology. Then, for any $\psi \in M_{*2}$, the set $\{T_{*1}^n\psi\}_{n=0}^{\infty}$ is relatively compact in the norm topology in $M_*$. Let $\psi_0$ be a strong accumulation point of this set. Then there exists a subsequence $(m_n)$ of natural numbers such that
\[ \lim_{n\to\infty}\|T_{*1}^{m_n}\psi - \psi_0\|_1 = 0. \]
Hence, for any $x \in \mathfrak{M}$, $(T_{*1}^{m_n}\psi)(x) \to \psi_0(x)$. By formula (2.13), $\psi_0(x) = 0$. However, $\psi_0$ is normal and $\mathfrak{M}$ is $\sigma$-weakly dense in $M$, so $\psi_0 = 0$. Thus $\|T_{*1}^{m_n}\psi\|_1 \to 0$. Since $T_{*1}$ is norm contractive, the formula (2.14) follows, and the proof is complete.


In the dual picture there is a similar decomposition.

Theorem 6. $M = M_1 \oplus M_2$, where $M_1$ is a $\sigma$-weakly closed $*$-subalgebra and $M_2$ is a $\sigma$-weakly closed linear $*$-subspace in $M$. Both $M_1$ and $M_2$ are $T$- and $T_*$-invariant. The restriction of $T$ to $M_1$ is a $*$-automorphism, whereas
\[ \lim_{n\to\infty}\psi(T^nx) = 0 \tag{2.15} \]
for all $\psi \in M_*$ and all $x \in M_2 \cap C$. If the predual semigroup $(T_{*1}^n)$ is relatively compact in the strong operator topology, then
\[ \lim_{n\to\infty}\psi(T^nx) = 0 \tag{2.16} \]
for all $x \in M_2$, uniformly on bounded sets in $M_2$. If $M_1 \neq 0$, then the projection from $M$ onto $M_1$ is a conditional expectation, and it is a $\phi$-compatible conditional expectation whenever $1 \in M_1$.

Proof. Let $M_1 = \{Qx : x \in M\}$ and let $M_2 = \{x - Qx : x \in M\}$. Because the projection $Q$ is two-positive, normal and commutes with both $T$ and $T_*$, the decomposition of $M$ follows. To proceed further we first show that $\mathfrak{M}_1 = \{Qx : x \in \mathfrak{M}\}$ is a $*$-algebra. By linearity, it is sufficient to check that $x^*x \in \mathfrak{M}_1$ whenever $x \in \mathfrak{M}_1$. Suppose that $x \in \mathfrak{M}_1$. Then $Q(x) = x$ and $Q(x^*) = x^*$. By Proposition 3 and two-positivity, the projection $Q$ satisfies the Schwarz inequality. Hence $Q(x^*x) \ge Q(x^*)Q(x) = x^*x$, and so $\phi(Q(x^*x)) \ge \phi(x^*x)$. On the other hand $\phi(Q(x^*x)) \le \phi(x^*x)$. Thus $\phi(Q(x^*x) - x^*x) = 0$ and so, by faithfulness of $\phi$, $Q(x^*x) = x^*x$. Hence $x^*x \in \mathfrak{M}_1$, which implies that $\mathfrak{M}_1$ is a $*$-algebra. Since $M_1$ is the $\sigma$-weak closure of $\mathfrak{M}_1$, it is also a $*$-algebra. Because $T(x^*x) \ge (Tx)^*Tx$, for any $x \in \mathfrak{M}_1$,
\[ 0 \le \phi(T(x^*x)) - \phi((Tx)^*Tx) \le \|\Lambda(x)\|_2^2 - \|T_2\Lambda(x)\|_2^2 = 0. \]
Hence, by faithfulness of $\phi$, $T(x^*x) = (Tx)^*Tx$. This implies that for any $\psi \in M_{*+}$, the positive form $b$ on $M$ given by $b(x, y) = \psi(T(x^*y) - T(x^*)T(y))$ vanishes on $\mathfrak{M}_1$. Because $\psi$ was arbitrary we conclude that $T(xy) = T(x)T(y)$ for all $x, y \in \mathfrak{M}_1$. However, $\sigma$-weak continuity of $T$ implies that $T(xy) = T(x)T(y)$ for all $x, y \in M_1$. If $x \in \mathfrak{M}_1$, then $Q(x^*x) = x^*x$, and so $T_*T(x^*x) = TT_*(x^*x) = x^*x$. Again, by $\sigma$-weak continuity of $T$ and $T_*$, this equality holds for all $x \in M_1$. Thus $T|_{M_1}$ is a $*$-automorphism. If $y \in M_2 \cap C$, then formula (2.13) yields (2.15). Finally, suppose that $(T_{*1}^n)$ is relatively compact in the strong operator topology. Then, by formula (2.14), for any $y \in M_2$ with $\|y\|_\infty \le 1$, and all $\psi \in M_*$ there is
\[ \lim_{n\to\infty}|\psi(T^ny)| = \lim_{n\to\infty}|\psi(T^n(\mathrm{id} - Q)y)| = \lim_{n\to\infty}|(T_{*1}^n(\mathrm{id} - Q_1)\psi)(y)| \le \lim_{n\to\infty}\|T_{*1}^n(\mathrm{id} - Q_1)\psi\|_1 = 0 \]
since $(\mathrm{id} - Q_1)\psi \in M_{*2}$. If $M_{*1} \neq 0$, then $Q$ is a norm one projection, and so it is a conditional expectation onto the algebra $M_1$. If $1 \in M_1$, then $M_1$ is a von Neumann algebra, i.e. $M_1 = M_1''$, and, by Proposition 3, $Q$ is a $\phi$-compatible conditional expectation onto $M_1$.
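A finite-dimensional toy model may help to visualize the decomposition $M = M_1 \oplus M_2$. The sketch below is an illustration under assumptions that are not taken from the paper: $M$ is replaced by the real $3 \times 3$ matrices with the usual trace in the role of $\phi$, and the hypothetical map `damp` multiplies the entry $(g, h)$ by $e^{-c|g-h|}$, the same damping pattern that appears in formula (3.20) of Sect. 3. In this toy case $Q$ is the restriction to the diagonal, $M_1$ consists of the diagonal matrices, on which the map acts as the identity, and $M_2$ consists of the matrices with zero diagonal, which are swept to zero.

```python
import math


def damp(x, c=1.0):
    """Hypothetical map T: multiply entry (g, h) by exp(-c * |g - h|)."""
    n = len(x)
    return [[math.exp(-c * abs(g - h)) * x[g][h] for h in range(n)] for g in range(n)]


def diagonal_part(x):
    """Toy projection Q: keep the diagonal, drop everything else."""
    n = len(x)
    return [[x[g][h] if g == h else 0.0 for h in range(n)] for g in range(n)]


def iterate(x, steps, c=1.0):
    """Return T^steps applied to x."""
    for _ in range(steps):
        x = damp(x, c)
    return x


x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

x1 = diagonal_part(x)                                         # component in M_1
x2 = [[a - b for a, b in zip(r, s)] for r, s in zip(x, x1)]   # component in M_2

print(iterate(x1, 50) == x1)                                  # T leaves M_1 untouched
print(max(abs(v) for row in iterate(x2, 50) for v in row))    # tiny: M_2 is swept away
```

The example only mirrors the structure of the theorem; none of the analytic subtleties of the infinite-dimensional case (weights, domains, topologies) are visible in it.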


3. Example

3.1. The joint system. First we describe the quantum system. Suppose $D$ is the group of dyadic numbers in $\mathbb{R}$, i.e. $D = \bigcup_n D_n$, where $D_n = \{k/2^n : k \in \mathbb{Z}\}$, $n \in \mathbb{N}$, and let $G = D^3$. Clearly, $G$ is a countable and dense subset in $\mathbb{R}^3$. It is also an Abelian group with respect to the usual sum in $\mathbb{R}^3$. Its neutral element we denote by 0. The Hilbert space $H$ of the system is given by $H = \oplus_G L^2(\mathbb{R}^3, dm)$, where $dm$ is the Lebesgue measure in $\mathbb{R}^3$. The scalar product in $L^2(\mathbb{R}^3, dm)$ we shall denote by $\langle\cdot,\cdot\rangle$. An element of $H$ is a map $\xi : G \to L^2(\mathbb{R}^3, dm)$ such that $\sum_G \|\xi(g)\|_{L^2}^2 < \infty$. There is a natural action of the group $G$ on the von Neumann algebra $L^\infty(\mathbb{R}^3, dm)$ ($L^\infty$ in short),
\[ \alpha_g f(a) = f(a + g), \quad a = (a_1, a_2, a_3) \in \mathbb{R}^3,\ g = (g_1, g_2, g_3) \in G. \]
The Lebesgue integral of a function $f \in L^\infty_+$ we denote by $\int f$. Suppose the algebra of the system is given by the crossed product $M = L^\infty \otimes_\alpha G$. Since the action of $G$ on $L^\infty$ is free and ergodic, $M$ is a factor. It is also clear that $\tau : M_+ \to [0, \infty]$, $\tau(x) = \int x(0, 0)$, is a faithful normal and semifinite tracial weight. Hence, $M$ is of type II$_\infty$. It should be noted that such a factor has no direct physical interpretation. In order to deal with infinite quantum systems one has to consider factors of type III given by the temperature representations of Bose or Fermi fields. Nevertheless, it suggests the existence of the so-called collective variable evolving in a classical way. The $\sigma$-weakly dense subalgebra of finite elements [26] will be denoted by $M_f$. The canonical normal $*$-isomorphism from $L^\infty$ into $M$ we shall denote by $\pi_\alpha$. It follows that $\pi_\alpha(L^\infty)$ is a commutative von Neumann subalgebra in $M$. Suppose further that this system interacts with a quantum particle moving in $\mathbb{R}^3$. Hence, $H_E = L^2(\mathbb{R}^3, dm)$ and $M_E = B(H_E)$. The algebra of the joint system is given by the tensor product $M \otimes M_E$ in $H \otimes H_E$. It is worth noting that in a more realistic situation the environment would also be an infinite quantum system like, for example, a phonon field. However, this simplified model is sufficient for our purpose.

Let us now consider the dynamics of the joint system. Suppose
\[ D_0 = \{\tilde{\xi} \in H \otimes H_E : \tilde{\xi}(g) \in S(\mathbb{R}^3 \times \mathbb{R}^3),\ \tilde{\xi}(g) = 0 \text{ for a.a. } g \in G\}. \tag{3.17} \]
Here a.a. stands for almost all, and $S$ is the space of Schwartz functions.

Proposition 5. Suppose $c = (c_1, c_2, c_3)$ with $c_k > 0$ for all $k = 1, 2, 3$, $v = (v_1, v_2, v_3) \in \mathbb{R}^3$, $\hat{x} = (\hat{x}_1, \hat{x}_2, \hat{x}_3)$ is the position operator, and $\hat{p} = (\hat{p}_1, \hat{p}_2, \hat{p}_3)$ is the momentum operator in $L^2(\mathbb{R}^3, dm)$. Then
\[ H_{SE} = -v \cdot \hat{P} \otimes 1_E + \frac{1}{2}\, 1 \otimes \hat{p} \cdot \hat{p} + \sum_{k=1}^{3} c_k\pi_\alpha(\hat{x}_k) \otimes \hat{p}_k, \tag{3.18} \]
where $\hat{P} = (\hat{P}_1, \hat{P}_2, \hat{P}_3)$, $\hat{P}_k = \oplus_G\hat{p}_k$, $\pi_\alpha(\hat{x}_k) = \int \lambda\,\pi_\alpha(dE_k(\lambda))$, where $dE_k$ is the spectral measure of $\hat{x}_k$, is essentially self-adjoint on $D_0$.

Proof. Because $\pi_\alpha(\hat{x}_k) = \oplus_{g\in G}(\hat{x}_k - g_k1)$ and $\hat{P}_k = \oplus_G\hat{p}_k$, it is sufficient to prove that for any $g = (g_1, g_2, g_3) \in G$, the operator
\[ \sum_{k=1}^{3}\left( iv_k\frac{\partial}{\partial a_k} - \frac{1}{2}\frac{\partial^2}{\partial b_k^2} - ic_k(a_k - g_k)\frac{\partial}{\partial b_k}\right) \]
defined on $S(\mathbb{R}^6)$ is essentially self-adjoint in $L^2(\mathbb{R}^6)$. Suppose first that $v_1 \neq 0$, $v_2 \neq 0$, $v_3 \neq 0$, and let a function $F : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3 \times \mathbb{R}^3$ be given by $F(a, b) = (a', b')$,


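where $a' = -a$, $b_k' = -(2v_k)^{-1}(a_k - g_k)^2 - (c_k)^{-1}b_k$. Then the operator $V : L^2(\mathbb{R}^6) \to L^2(\mathbb{R}^6)$, $V\psi = (c_1c_2c_3)^{-1/2}\,\psi \circ F$, is unitary and preserves the space $S(\mathbb{R}^6)$. Moreover, by direct calculations, one may show that
\[ iV^*\left( v_k\frac{\partial}{\partial a_k} - c_k(a_k - g_k)\frac{\partial}{\partial b_k}\right)V = -iv_k\frac{\partial}{\partial a_k} \]
and
\[ V^*\frac{\partial^2}{\partial b_k^2}V = \frac{1}{c_k^2}\frac{\partial^2}{\partial b_k^2}. \]
Hence
\[ \sum_{k=1}^{3} V^*\left( iv_k\frac{\partial}{\partial a_k} - \frac{1}{2}\frac{\partial^2}{\partial b_k^2} - ic_k(a_k - g_k)\frac{\partial}{\partial b_k}\right)V = \sum_{k=1}^{3}\left( -\frac{1}{2c_k^2}\frac{\partial^2}{\partial b_k^2} - iv_k\frac{\partial}{\partial a_k}\right), \]
which is obviously essentially self-adjoint on $S(\mathbb{R}^6)$. If some of the $v_k$ vanish, say $v_3 = 0$, then we define $F : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^2 \times \mathbb{R}^2$ as above, and next compute the Fourier transform with respect to the $(b_1, b_2, b_3)$ variables. The transformed operator now reads
\[ -iv_1\frac{\partial}{\partial a_1} - iv_2\frac{\partial}{\partial a_2} + p(a_3, b_1, b_2, b_3), \]
where $p$ is a polynomial of degree 2. If all $v_k = 0$, then the Fourier transform with respect to the $(b_1, b_2, b_3)$ variables converts the operator to a polynomial of degree 2, an essentially self-adjoint operator on $S(\mathbb{R}^6)$. Its closure we denote also by $H_{SE}$.

3.2. The reduced dynamics. Suppose $\psi \in L^2(\mathbb{R}^3, dm)$, $\psi(a) = \prod_k\psi_0(a_k)$, where
\[ \psi_0(s) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{e^{ips}\,dp}{\sqrt{\pi(1 + p^2)}}, \]
and let $\omega_E = |\psi\rangle\langle\psi|$ be the corresponding state of the environment. The reduced dynamics of any $x \in M$ is given by
\[ T_t(x) = \Pi_{\omega_E}\left( e^{itH_{SE}}\, x \otimes 1_E\, e^{-itH_{SE}}\right), \tag{3.19} \]
where $\Pi_{\omega_E} : M \otimes M_E \to M$ is the conditional expectation with respect to the reference state $\omega_E$.

Theorem 7. $(T_t)_{t\ge 0}$ is a dynamical semigroup on $M$ with $T_t$ satisfying Assumptions A1–A6 for all $t \ge 0$, and such that
\[ (T_tx)(g, h) = e^{-t\sum_k c_k|g_k - h_k|}\,\beta_t(x(g, h)), \quad g, h \in G, \tag{3.20} \]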


where $\beta_t(f) = V_tfV_t^*$ with $V_t = e^{-it(v\cdot\hat{p})}$, $f \in L^\infty$. Moreover, its generator $\bar{L}$ is the operator closure of $L = \delta + L_0$, where
\[ \delta(x)(g, h) = -(v \cdot \nabla)x(g, h), \tag{3.21} \]
\[ L_0(x)(g, h) = -\sum_{k=1}^{3} c_k|g_k - h_k|\,x(g, h), \tag{3.22} \]
\[ x \in D(L) = \{x \in M_f : \mathcal{F}(x(g, h)) \in \mathcal{D}(\mathbb{R}^3)\ \ \forall g, h\}, \tag{3.23} \]
where $\mathcal{F}$ denotes the Fourier transform and $\mathcal{D}(\mathbb{R}^3)$ stands for the space of smooth compactly supported functions in $\mathbb{R}^3$.

Proof. From the very definition (formula (3.19)) we obtain that each $T_t$ is normal, completely positive and preserves the identity operator. Moreover,
\[ \tau(T_tx) = \int (T_tx)(0, 0) = \int \beta_t(x(0, 0)) = \int x(0, 0) = \tau(x) \]
for all $x \in M_+$. Thus Assumptions A1–A6 are satisfied.

Step 1. In order to simplify notation we define $U_k(t) = e^{itc_k\pi_\alpha(\hat{x}_k)\otimes\hat{p}_k}$, $V_k(t) = e^{itv_k\hat{P}_k\otimes 1_E}$, $W_k(t) = e^{-\frac{1}{2}itc_kv_k\,1\otimes\hat{p}_k}$. Moreover, let $U(t) = \prod_{k=1}^{3}U_k(t)$ and let $V(t)$, $W(t)$ be defined in the same way. It is clear that $U_k(t)W_k(s) = W_k(s)U_k(t)$, $V_k(t)W_k(s) = W_k(s)V_k(t)$. We show that $U_k(t)V_k(s) = V_k(s)U_k(t)W_k(2st)$ for all $k = 1, 2, 3$. Suppose that
\[ D_{00} = \mathrm{span}\{\varphi_n^g \otimes h_m : g \in G,\ n, m \in \mathbb{N}^3\}, \]
where $\varphi_n^g(g') = \delta_{gg'}h_n$, and $h_n(a) = \prod_{k=1}^{3}H_{n_k}(a_k)e^{-\frac{1}{2}a_k^2}$ with the multi-index $n = (n_1, n_2, n_3)$ and $H_{n_k}$ being the Hermite polynomials. Clearly, $D_{00}$ is dense in $H \otimes H_E$. One can check that the vectors $e^{isv_k\hat{P}_k}\varphi_n^g \otimes h_m$ are analytical for the operator $c_k\pi_\alpha(\hat{x}_k) \otimes \hat{p}_k$ and that $(\hat{x}_k - g_k)^le^{isv_k\hat{p}_k}h_n \otimes \hat{p}_k^lh_m$ are analytical vectors for the operator $v_k\hat{p}_k \otimes 1_E$. Hence
\[ (V_k(-s)U_k(t)V_k(s)\varphi_n^g \otimes h_m)(g') = \delta_{gg'}\sum_{l=0}^{\infty}(itc_k)^l\sum_{b+b'=l}\frac{1}{b!\,b'!}(-sv_k)^b(\hat{x}_k - g_k)^{b'}h_n \otimes \hat{p}_k^lh_m. \]
Because the series is absolutely convergent and the operators $\hat{p}_k^l$ are closed, for sufficiently small $|t|$ and $|s|$ we can rearrange the order of summation. Thus
\[ (V_k(-s)U_k(t)V_k(s)\varphi_n^g \otimes h_m)(g') = \delta_{gg'}\sum_{l=0}^{\infty}\frac{(itc_k)^l}{l!}\left((\hat{x}_k - g_k)^l \otimes \hat{p}_k^l\right)\left( h_n \otimes \sum_{b=0}^{\infty}\frac{(-istc_kv_k)^b}{b!}\hat{p}_k^bh_m\right) = (U_k(t)W_k(2st)\varphi_n^g \otimes h_m)(g'). \]


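Step 2. Let $U_t = W(-t^2)V(-t)U(t)$, $t \in \mathbb{R}$. By the commutation relations from Step 1, $(U_t)$ is a one parameter strongly continuous group of unitary operators. Moreover, for any $\Psi \in D_{00}$,
\[ \frac{1}{i}\frac{d}{dt}U_t\Psi\Big|_{t=0} = \left( -v \cdot \hat{P} \otimes 1_E + \sum_{k=1}^{3}c_k\pi_\alpha(\hat{x}_k) \otimes \hat{p}_k\right)\Psi. \]
Since $D_{00}$ is a core for the self-adjoint operator $-v \cdot \hat{P} \otimes 1_E + \sum_{k=1}^{3}c_k\pi_\alpha(\hat{x}_k) \otimes \hat{p}_k$,
\[ U_t = e^{it\left( -v\cdot\hat{P}\otimes 1_E + \sum_{k=1}^{3}c_k\pi_\alpha(\hat{x}_k)\otimes\hat{p}_k\right)}. \]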

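Step 3. Let $\tilde{\beta}_t : M \to M$, $(\tilde{\beta}_tx)(g, h) = \beta_t(x(g, h))$. Then $(\tilde{\beta}_t)$ is a one parameter group of automorphisms of $M$ and $\delta$ (see formula (3.21)) is its generator. We show that $D(L)$ is a core for $\delta$. Suppose that $x \in D(L)$. Because $x \in M_f$, $x = \sum_{i=1}^{N}\pi_\alpha(f_i)\lambda(g_i)$, where $\lambda(g_i)(g, h) = \delta_{g,h+g_i}$. Since $\mathcal{F}(f_i) \in \mathcal{D}(\mathbb{R}^3)$, there exists $M > 0$ such that $\mathrm{supp}\,\mathcal{F}(f_i) \subset [-M, M]^3$ for all $i = 1, \ldots, N$. Hence
\[ \|\delta^mx\|_\infty \le \sum_{i=1}^{N}\|\pi_\alpha((v\cdot\nabla)^mf_i)\lambda(g_i)\|_\infty = \sum_{i=1}^{N}\|(v\cdot\nabla)^mf_i\|_{L^\infty} \le \sum_{i=1}^{N}(2\pi)^{-3/2}M^m\Big(\sum_{k=1}^{3}|v_k|\Big)^m\|\mathcal{F}(f_i)\|_{L^1} \le (2\pi)^{-3/2}C(x)M^m\Big(\sum_{k=1}^{3}|v_k|\Big)^m, \]
where $C(x) = \sum_{i=1}^{N}\|\mathcal{F}(f_i)\|_{L^1}$. Thus $D(L)$ is a core for $\delta$.

Step 4. Since $1 \otimes \hat{p}\cdot\hat{p}$ commutes with the other terms in the Hamiltonian $H_{SE}$,
\[ T_t(x) = \Pi_{\omega_E}\big(e^{itH_{SE}}\,x \otimes 1_E\,e^{-itH_{SE}}\big) = \Pi_{\omega_E}\big(U_t\,x \otimes 1_E\,U_t^*\big) = \tilde{\beta}_t\big(\Pi_{\omega_E}(U(t)\,x \otimes 1_E\,U(t)^*)\big) \]
for any $x \in M$, where in the second equality $H_{SE}$ has been replaced by $-v\cdot\hat{P}\otimes 1_E + \sum_{k=1}^{3}c_k\pi_\alpha(\hat{x}_k)\otimes\hat{p}_k$, whose unitary group is $(U_t)$ by Step 2.

Step 5. For any $n \in \mathbb{N}$ we define a sequence of numbers $d_j^{(n)} = j\cdot 2^{-n}$, $j = -2^{2n}, -2^{2n} + 1, \ldots, 2^{2n} - 1$, and a sequence of intervals $\Delta_j^{(n)} = [d_j^{(n)}, d_{j+1}^{(n)})$. Finally, let us define a sequence of unitary operators $A^{(n)}(t) = \prod_{k=1}^{3}A_k^{(n)}(t)$, where
\[ A_k^{(n)}(t) = \sum_{l=-2^{2n}}^{2^{2n}-1}\pi_\alpha\big(E_k(\Delta_l^{(n)})\big) \otimes e^{itc_kd_l^{(n)}\hat{p}_k}, \]
$k = 1, 2, 3$. Then, for any $\xi_1, \xi_2 \in H$, and any $\zeta_1, \zeta_2 \in H_E$ there is
\[ \lim_{n\to\infty}\langle \xi_1 \otimes \zeta_1, A_k^{(n)}(t)\xi_2 \otimes \zeta_2\rangle = \lim_{n\to\infty}\sum_{l=-2^{2n}}^{2^{2n}-1}\langle \mathcal{F}(\zeta_1), e^{itc_kd_l^{(n)}p_k}\mathcal{F}(\zeta_2)\rangle\langle \xi_1, \pi_\alpha(E_k(\Delta_l^{(n)}))\xi_2\rangle = \int \langle \mathcal{F}(\zeta_1), e^{itc_k\lambda p_k}\mathcal{F}(\zeta_2)\rangle\langle \xi_1, \pi_\alpha(dE_k(\lambda))\xi_2\rangle = \langle \xi_1 \otimes \zeta_1, U_k(t)\xi_2 \otimes \zeta_2\rangle. \]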

Classical Properties of Infinite Quantum Open Systems


Hence A(n) (t) converges σ -strongly to U (t). Step 6. Let St (x) = ωE (U (t)x ⊗ 1E U (t)∗ ). By Step 5, St (x) = lim ωE A(n) (t)x ⊗ 1E A(n)∗ (t) n→∞

in the σ -weak topology. Because for all t ≥ 0, 3 3 (n) (n) itck d (n) −dm(n) pˆk −tck dl −dmk lk k k ωE = e e k=1

so

k=1

ωE A(n) (t)x ⊗ 1E A(n)∗ (t) =

2n −1 2

e

−t

3

(n) (n) k=1 ck |dlk −dmk |

(n) xπα E (n) , πα E l m

l1 ,l2 ,l3 , m1 ,m2 ,m3 =−22n

(n) (n) (n) (n) where l = 3k=1 lk , l = (l1 , l2 , l3 ), and E(l ) = 3k=1 Ek (lk ). Using the identity (n) (n) πα E l xπα E (n) (g, h) = E l + g E (n) m m + h x(g, h) we arrive at

(n) ωE A(n) (t)x ⊗ 1E A(n)∗ (t) (g, h) = t (g, h)x(g, h),

where (n) t (g,

h) =

2n −1 2

e

−t

3

(n) (n) k=1 ck dlk −dmk

(n) E l + g E (n) + h . m

l1 ,l2 ,l3 , m1 ,m2 ,m3 =−22n

(n)

(n)

Because t (g, h) = α−h t (g − h, 0), and lim t (g, 0) = e−t (n)

3

k=1 ck |gk |

n→∞

so St (x)(g, h) = e−t

3

k=1 ck |gk −hk |

x(g, h),

$g, h \in G$ and $t \ge 0$. Clearly, $S_t$ is a σ-weakly continuous semigroup and $L_0$ (see formula (3.22)) is its generator. More precisely, the generator of $S_t$ is the operator closure of $(L_0, D(L))$.

Step 7. Since $\tilde\beta_t$ and $S_t$ commute, $T_t$ is a σ-weakly continuous semigroup on $M$, and its action is given by formula (3.20). Using the estimates from Step 3 one may check that $D(L)$ consists of analytic vectors for the operator $\delta + L_0$. Hence the generator of $T_t$ is given by the operator closure of $L = \delta + L_0$.
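The action of $S_t$ obtained in Step 6 is a pointwise multiplication of $x(g,h)$ by the factor $e^{-t\sum_k c_k|g_k-h_k|}$. The following minimal sketch only evaluates this scalar factor and checks its semigroup property numerically; the function name `decay_factor` and the sample values of `c`, `g`, `h` are illustrative assumptions, not taken from the paper.

```python
import math

def decay_factor(t, c, g, h):
    """Scalar factor exp(-t * sum_k c_k * |g_k - h_k|) multiplying x(g, h).

    t: non-negative time; c, g, h: sequences of three real numbers
    (illustrative stand-ins for the constants c_k and the labels g, h).
    """
    if t < 0:
        raise ValueError("the semigroup is only defined for t >= 0")
    exponent = sum(ck * abs(gk - hk) for ck, gk, hk in zip(c, g, h))
    return math.exp(-t * exponent)

# Diagonal entries (g == h) are left untouched, off-diagonal ones decay.
c = (1.0, 2.0, 3.0)
diag = decay_factor(5.0, c, (0.5, 0.0, 1.0), (0.5, 0.0, 1.0))
off_diag = decay_factor(1.0, c, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
```

With these sample values, `diag` equals 1 and `off_diag` equals $e^{-1}$, which mirrors the statement that only the entries with $g = h$ survive as $t \to \infty$.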


3.3. The algebra of effective observables. By formula (3.20), $M_1 = \pi_\alpha(L^\infty)$. It is also clear that $(M_1, T(t)|_{M_1}) \cong (L^\infty, \gamma_t)$, where $\gamma_t(a) = a + tv$, $a \in \mathbb{R}^3$. Finally, we show that all expectation values of observables belonging to $M_2$ tend to zero.

Theorem 8. For all $\varphi \in M_*$ and any $y \in M_2$,
\[
\lim_{t\to\infty}\varphi(T_t y) = 0. \tag{3.24}
\]

Proof. We use the notation of Sect. 2, keeping in mind that τ is a tracial weight. By linearity, it is sufficient to consider only φ ∈ M∗+ . Step 1. Suppose that x ∈ M. We show first that x(g, 0) ∈ L1 ∩L∞ for all g ∈ G. Without any loss of generality we may assume that x = z∗ y with z, y ∈ N . Because z, y ∈ N so for any g ∈ G, z(g, 0), y(g, 0) ∈ L2 (R3 , dm), and g z(g, 0) 2L2 = (z) 22 , 2 2 g y(g, 0) L2 = (y) 2 . The element x(g, 0) is given by x(g, 0) = α−g z(h − g, 0)y(h, 0), h∈G

and the series converges σ -strongly to a function in L∞ . Because each summand belongs to L1 ∩ L∞ and 1

z(h − g, 0) 2L2 + y(h, 0) 2L2 α−g z(h − g, 0)y(h, 0) 1 ≤ L 2 so, by the Fatou property, x(g, 0) ∈ L1 . Step 2. If x ∈ M ∩ Mf , then φx (Tt y) → 0 for all y ∈ M2 . Clearly, without any loss of generality we may assume that x(g, 0) = 0 only for one value of g ∈ G, −t ck |gk | x(g, 0)βt (y(g, 0)). φx (Tt y) = (xTt y)(0, 0) = e Because y(0, 0) = 0, x(g, 0) ∈ L1 , y(g, 0) ∈ L∞ , so φx (Tt y) → 0 when t → ∞. Step 3. Next we show that the set {φx : x ∈ M+ ∩ Mf } is norm dense in M∗+ . Since

(M+ ) is norm dense in M∗+ , it is sufficient to show that for any x ∈ M+ there is a sequence (xn ), xn ∈ M+ ∩ Mf such that φx − φxn 1 → 0. Suppose that x ∈ M+ . Then x 1/2 ∈ N+ . Let n = { nl : l ∈ Z, |l| ≤ n2 }, n ∈ N, and let zn (g, 0) = x 1/2 (g, 0) if g ∈ 3n , and 0 otherwise. It is clear that zn ∈ N ∩ Mf , and (zn ) 2 ≤ (x 1/2 ) 2 . Moreover, 2 2 1/2 lim x 1/2 − (zn ) = lim x (g, 0) 2 = 0. n→∞

2

n→∞

g ∈ / 3n

L

Let xn = yn2 . Then xn ∈ M+ ∩ Mf , and φx − φx = τ |x − xn | n 1 = τ |x 1/2 x 1/2 − yn + x 1/2 − yn yn ≤ x 1/2 x 1/2 − (yn ) + (yn ) 2 x 1/2 −(yn ) . 2

2

2

Since (yn ) 2 are uniformly bounded, φx − φxn 1 → 0. Step 4. Finally, suppose that φ ∈ M∗+ . Then, for any > 0 there exists x ∈ M+ ∩ Mf


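such that $\|\varphi - \varphi_x\|_1 < \epsilon$. By Step 2, there exists $t_0 > 0$ such that $|\varphi_x(T_t y)| < \epsilon$ for all $t > t_0$. Hence
\[
|\varphi(T_t y)| \le \|\varphi - \varphi_x\|_1\,\|T_t y\|_\infty + |\varphi_x(T_t y)| < \epsilon\,(\|y\|_\infty + 1),
\]
and so formula (3.24) follows.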

In [20] the cross products of a commutative algebra with a suitable group of its automorphisms were used to quantize the corresponding classical system. In a sense, the above example may be thought of as a reverse procedure, a dynamical de-quantization of a quantum system.

References

1. Alicki, R.: Phys. Rev. A 65, 034104 (2002)
2. Araki, H.: Progr. Theor. Phys. 32, 956 (1964)
3. Araki, H.: Progr. Theor. Phys. 64, 719 (1980)
4. Blanchard, Ph., et al. (eds.): Decoherence: Theoretical, Experimental and Conceptual Problems. Berlin: Springer, 2000
5. Breuer, H.-P., Petruccione, F.: Phys. Rev. A 63, 032102 (2001)
6. Erdős, L., Yau, H.-T.: Commun. Pure Appl. Math. 53, 667 (2000)
7. Frigerio, A.: Lett. Math. Phys. 2, 79 (1977)
8. Frigerio, A.: Commun. Math. Phys. 63, 269 (1978)
9. Frigerio, A., Veri, M.: Math. Z. 180, 275 (1982)
10. Fröhlich, J., Tsai, T.-P., Yau, H.-T.: Geom. Funct. Anal. Special Volume GAFA, 57 (2000)
11. Fröhlich, J., Tsai, T.-P., Yau, H.-T.: Commun. Math. Phys. 225, 223 (2002)
12. Gell-Mann, M., Hartle, J.B.: Phys. Rev. D 47, 3345 (1993)
13. Ghirardi, G.C., Rimini, A., Weber, T.: Phys. Rev. D 34, 470 (1986)
14. Giulini, D., et al. (eds.): Decoherence and the Appearance of a Classical World in Quantum Theory. Berlin: Springer, 1996
15. Groh, U.: In: One-Parameter Semigroups of Positive Operators. LNM Vol. 1184, R. Nagel (ed.), Berlin: Springer, 1986
16. Haba, Z.: Lett. Math. Phys. 44, 121 (1998)
17. Joos, E., Zeh, H.D.: Z. Phys. B 59, 223 (1985)
18. Krengel, U.: Ergodic Theorems. Berlin: Walter de Gruyter, 1985
19. Kümmerer, B., Nagel, R.: Acta Sci. Math. 41, 151 (1979)
20. Landsmann, N.P.: Rev. Math. Phys. 2, 45 (1990)
21. Ługiewicz, P., Olkiewicz, R.: J. Phys. A 35, 6695 (2002)
22. Majewski, W.A.: J. Stat. Phys. 55, 417 (1989)
23. Olkiewicz, R.: Commun. Math. Phys. 208, 245 (1999)
24. Olkiewicz, R.: Ann. Phys. 286, 10 (2000)
25. Penrose, R.: In: Mathematical Physics 2000. A. Fokas et al. (eds.), London: Imperial College Press, 2000
26. Sunder, V.S.: An Invitation to von Neumann Algebras. New York: Springer, 1986
27. Terp, M.: J. Operator. Th. 8, 327 (1982)
28. Walter, M.E.: Math. Scand. 37, 145 (1975)
29. Watanabe, S.: Hokkaido Math. J. 8, 176 (1979)
30. Yeadon, F.J.: J. London Math. Soc. 16(2), 326 (1977)
31. Zurek, W.H.: Phys. Rev. D 26, 1862 (1982)

Communicated by H.-T. Yau

Commun. Math. Phys. 239, 261–285 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0874-9

Communications in

Mathematical Physics

Viscous Shock Wave and Boundary Layer Solution to an Inflow Problem for Compressible Viscous Gas

Feimin Huang (1,2), Akitaka Matsumura (1), Xiaoding Shi (1,3)

1 Department of Mathematics, Graduate School of Science, Osaka University, Osaka 560-0045, Japan. E-mail: [email protected]
2 Institute of Applied Mathematics, AMSS, Academia Sinica, Beijing 100080, P.R. China
3 Department of Mathematics and Information Science, Beijing University of Chemical Technology, Beijing 100029, P.R. China

Received: 16 July 2001 / Accepted: 27 February 2003 Published online: 11 June 2003 – © Springer-Verlag 2003

Abstract: The inflow problem for a one-dimensional compressible viscous gas on the half-line $(0,+\infty)$ is investigated. The asymptotic stability of both the viscous shock wave and a superposition of the viscous shock wave and the boundary layer solution is established under some smallness conditions. The proofs are given by an elementary energy method.

1. Introduction

The inflow problem for a one-dimensional compressible flow on the half-line $\mathbb{R}_+$ is described by the following system in the Eulerian coordinates:
\[
\begin{cases}
\rho_t + (\rho u)_{\tilde x} = 0, & \text{in } \mathbb{R}_+\times\mathbb{R}_+,\\
(\rho u)_t + (\rho u^2 + p)_{\tilde x} = \mu u_{\tilde x\tilde x}, & \text{in } \mathbb{R}_+\times\mathbb{R}_+,\\
(\rho, u)|_{\tilde x=0} = (\rho_-, u_-), \quad u_- > 0,\\
(\rho, u)|_{t=0} = (\rho_0, u_0) \to (\rho_+, u_+), & \text{as } \tilde x \to \infty.
\end{cases} \tag{1.1}
\]
Here $u(\tilde x,t)$ is the velocity, $\rho(\tilde x,t) > 0$ is the density, $p(\rho) = \rho^\gamma$ is the pressure, $\gamma \ge 1$ is the adiabatic constant, $\mu > 0$ is the viscosity constant, and $\rho_\pm\,(>0)$, $u_\pm$ are prescribed constants. We assume the initial data satisfy the boundary condition as a compatibility condition. The assumption $u_- > 0$ implies that, through the boundary $\tilde x = 0$, the fluid with the density $\rho_-$ flows into the region $\mathbb{R}_+$, and thus the problem (1.1) is called the inflow problem. The cases $u_- = 0$ and $u_- < 0$, in which the condition $\rho|_{\tilde x=0} = \rho_-$ is removed, are called the impermeable wall problem and the outflow problem, respectively. For the impermeable wall problem, Matsumura and Nishihara [11] and Matsumura and Mei [7] have proved that the solution to (1.1) tends to the rarefaction wave as $t$ tends to infinity when $u_+ > u_- = 0$ without any smallness conditions, and to the viscous shock wave when $u_+ < u_- = 0$ under some smallness conditions. In the setting of $u_- \ne 0$, the problems become complicated and a new wave, called the boundary layer solution, or simply the BL-solution, appears in the solutions due to the presence of the boundary. Matsumura [6] classified all possible large time behaviors of the solutions in terms of the boundary values. In the case of $u_- < 0$, Kawashima and Nishibata [3] showed the asymptotic stability of the BL-solution. More recently, Matsumura and Nishihara [12] established the asymptotic stability of the BL-solution and of the superposition of a BL-solution and a rarefaction wave for the inflow problem when $(\rho_-, u_-) \in \Omega_{sub}$ (see (1.3) and (1.6)). Shi [13] studied the rarefaction wave case when $(\rho_-, u_-), (\rho_+, u_+) \in \Omega_{super}$. However, there has been no result concerning the viscous shock wave for either the inflow problem or the outflow problem up to now. The main difficulty of the inflow problem is to control the value $\psi^2(0,t)$ (see (3.1)) on the boundary, as pointed out by Matsumura and Nishihara [12], where $\psi(x,t)$ is a function corresponding to the anti-derivative of the velocity field. Roughly speaking, this difficulty is caused by the fact that the gas flows into the right-hand side through the boundary. Such a difficulty does not exist for the outflow problem and the impermeable wall problem. However, determining the phase shift of the viscous shock wave for the outflow problem would be difficult.

In this paper, we concentrate on the viscous shock wave for the inflow problem. We establish the asymptotic stability of both the viscous shock wave and a superposition of the viscous shock wave and the BL-solution when $(\rho_-, u_-) \in \Omega_{sub}$, provided that the viscous shock profile is initially far from the boundary and that the strength of the BL-solution and the initial perturbation are small. In order to overcome the difficulty from the term $\psi^2(0,t)$, we introduce a new variable $\bar\psi(x,t)$ (see (4.12)) so that the convection term corresponding to $\bar\psi$ disappears. Because there are some cancellations for the new variable $\bar\psi(x,t)$ on the boundary, the reformulated system (4.12) corresponds to the impermeable wall problem in some sense. Thus the boundary term $\bar\psi^2(0,t)$, which is virtually equivalent to $\psi^2(0,t)$, does not appear in our analysis. Precisely speaking, when the energy method is applied to the new system, the first energy inequality does not contain the term $\bar\psi^2(0,t)$, provided $|\rho_- u_-|$ is small. Namely, the estimates for the term $\psi^2(0,t)$ can be bypassed entirely. Thus we obtain our desired a priori estimates. It should be noted that the estimates for the term $\psi^2(0,t)$ are also obtained after the stability theorems are established.

We now state our main results. As in [12], we transform (1.1) to the problem in the Lagrangian coordinate
\[
\begin{cases}
v_t - u_x = 0, & x > s_- t,\ t > 0,\\
u_t + p(v)_x = \mu\left(\dfrac{u_x}{v}\right)_x, & x > s_- t,\ t > 0,\\
(v, u)|_{x=s_- t} = (v_-, u_-), \quad v_- = \dfrac{1}{\rho_-},\ u_- > 0,\\
(v, u)|_{t=0} = (v_0, u_0)(x) \to (v_+, u_+) = \left(\dfrac{1}{\rho_+}, u_+\right), & \text{as } x \to \infty,
\end{cases} \tag{1.2}
\]

where
\[
v = \frac{1}{\rho}, \qquad s_- = -\frac{u_-}{v_-} < 0. \tag{1.3}
\]

We now consider the inflow problem (1.2) above. The characteristic speeds of the corresponding hyperbolic system without viscosity are
\[
\lambda_1 = -\sqrt{-p'(v)}, \qquad \lambda_2 = \sqrt{-p'(v)}, \tag{1.4}
\]

Inflow Problem for ID Compressible Viscous Gas


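and the sound speed $c(v)$ is defined by
\[
c(v) = v\sqrt{-p'(v)} = \sqrt{\gamma}\, v^{-\frac{\gamma-1}{2}}. \tag{1.5}
\]
Comparing $|u|$ with $c(v)$, we divide the $(v,u)$ space into three regions,
\[
\begin{aligned}
\Omega_{sub} &= \{(v,u) \,|\, |u| < c(v),\ v > 0,\ u > 0\},\\
\Omega_{trans} &= \{(v,u) \,|\, |u| = c(v),\ v > 0,\ u > 0\},\\
\Omega_{super} &= \{(v,u) \,|\, |u| > c(v),\ v > 0,\ u > 0\}.
\end{aligned} \tag{1.6}
\]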

We call them the subsonic, transonic and supersonic regions, respectively. When $(v_-, u_-) \in \Omega_{sub}$, since the first wave speed $\lambda_1(v_-)$ is less than the boundary speed $s_-$, we can expect a BL-solution which connects $(v_-, u_-)$ and some $(v_+, u_+)$. In fact, by the arguments in [12], such a BL-solution exists if $(v_+, u_+)$ is on the BL-solution line defined in (1.7) below. In the phase plane, the BL-solution line and the 2-shock wave curve through $(v_-, u_-)$ are defined by
\[
BL(v_-, u_-) = \Big\{(v,u) \in \Omega_{sub}\cup\Omega_{trans} \,\Big|\, \frac{u}{v} = \frac{u_-}{v_-} = -s_-\Big\}, \tag{1.7}
\]
\[
S_2(v_-, u_-) = \big\{(v,u) \in \mathbb{R}_+\times\mathbb{R}_+ \,\big|\, u = u_- - s(v - v_-),\ v_- < v\big\}, \tag{1.8}
\]
with $s = \sqrt{\dfrac{p(v_-) - p(v)}{v - v_-}} > 0$.

Our main results are, roughly speaking, as follows.

(I) If $(v_+, u_+) \in S_2(v_-, u_-)$, then the viscous shock wave is asymptotically stable provided that the conditions of Theorem 2.1 hold.

(II) If $(v_+, u_+) \in BLS_2(v_-, u_-)$, then there exists $(\bar v, \bar u) \in BL(v_-, u_-)$ such that $(v_+, u_+) \in S_2(\bar v, \bar u)$, and the superposition of the BL-solution connecting $(v_-, u_-)$ with $(\bar v, \bar u)$ and the 2-viscous shock wave connecting $(\bar v, \bar u)$ with $(v_+, u_+)$ is asymptotically stable provided that $|v_- - \bar v|$ is small and the conditions of Theorem 2.2 hold. That is, the BL-solution is weak and the shock wave is not necessarily weak.

The plan of this paper is as follows. In Sect. 2, after introducing some properties of the viscous shock wave and the BL-solution, we state the main theorems. In Sects. 3–4, we prove the asymptotic stability of the viscous shock wave. Precisely speaking, in Sect. 3, we reformulate the original problem as a new initial boundary value problem. In Sect. 4, we establish the a priori estimates by the energy method. In Sect. 5, Case (II) will be treated.

Notations. Throughout this paper, several positive generic constants are denoted by $c$, $C$ without confusion. For function spaces, $L^p(\Omega)$, $1 \le p \le \infty$, denotes the usual Lebesgue space on $\Omega \subset \mathbb{R} = (-\infty, \infty)$ with its norm
\[
\|f\|_{L^p(\Omega)} = \Big(\int_\Omega |f(x)|^p\,dx\Big)^{\frac1p},\ 1 \le p < \infty, \qquad \|f\|_{L^\infty(\Omega)} = \sup_\Omega |f(x)|. \tag{1.9}
\]
$H^l(\Omega)$ denotes the $l$-th order Sobolev space with its norm
\[
\|f\|_l = \Big(\sum_{j=0}^{l}\|\partial_x^j f\|^2\Big)^{\frac12}, \quad \text{where } \|\cdot\| := \|\cdot\|_{L^2(\Omega)}. \tag{1.10}
\]
The domain $\Omega$ will often be abbreviated without confusion.
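The division of the $(v,u)$ space in (1.6) can be illustrated numerically. The sketch below assumes the pressure law $p(v) = v^{-\gamma}$ (that is, $p = \rho^\gamma$ with $v = 1/\rho$) and evaluates the sound speed (1.5); the function names and the tolerance `tol` are illustrative choices, not part of the paper.

```python
import math

def pressure_derivative(v, gamma):
    """p'(v) for the assumed pressure law p(v) = v**(-gamma)."""
    return -gamma * v ** (-gamma - 1.0)

def sound_speed(v, gamma):
    """c(v) = v * sqrt(-p'(v)), which equals sqrt(gamma) * v**(-(gamma - 1) / 2)."""
    return v * math.sqrt(-pressure_derivative(v, gamma))

def classify_state(v, u, gamma, tol=1e-12):
    """Return 'subsonic', 'transonic' or 'supersonic' for a state with v > 0, u > 0."""
    if v <= 0 or u <= 0:
        raise ValueError("the regions in (1.6) require v > 0 and u > 0")
    c = sound_speed(v, gamma)
    if abs(abs(u) - c) <= tol:
        return "transonic"
    return "subsonic" if abs(u) < c else "supersonic"

# Example: with gamma = 1.4 and v = 1 the sound speed is sqrt(1.4).
region_slow = classify_state(1.0, 0.5, 1.4)
region_fast = classify_state(1.0, 2.0, 1.4)
```

Here `region_slow` falls in the subsonic region and `region_fast` in the supersonic one, since $c(1) = \sqrt{1.4} \approx 1.18$.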


2. Preliminaries and Main Results

In this section, we first recall the properties of the viscous shock wave. It is well known that the travelling wave $(v,u) = (V_s, U_s)(\eta)$, $\eta = x - st$, $s > 0$, satisfying $(V_s, U_s)(\pm\infty) = (v_\pm, u_\pm)$ exists and is unique up to a shift, under the Rankine-Hugoniot condition
\[
\begin{cases}
s(v_+ - v_-) = u_- - u_+,\\
s(u_+ - u_-) = p(v_+) - p(v_-),
\end{cases} \tag{2.1}
\]
and the entropy condition
\[
u_+ < u_-. \tag{2.2}
\]
Namely, $(V_s, U_s)$ satisfies
\[
\begin{cases}
-sV_s' - U_s' = 0,\\
-sU_s' + p(V_s)' = \mu\left(\dfrac{U_s'}{V_s}\right)',\\
(V_s, U_s)(\pm\infty) = (v_\pm, u_\pm),
\end{cases} \tag{2.3}
\]
which yields
\[
\begin{cases}
U_s = -s(V_s - v_\pm) + u_\pm,\\
\dfrac{s\mu V_s'}{V_s} = -s^2 V_s - p(V_s) - b =: h(V_s),\\
V_s(\pm\infty) = v_\pm,
\end{cases} \tag{2.4}
\]
where $b = -s^2 v_\pm - p(v_\pm)$. Thus, we have

Proposition 2.1. For any $(v_+, u_+)$, $(v_-, u_-)$, $s > 0$ satisfying $v_+ > v_- > 0$, $u_- > 0$, the Rankine-Hugoniot condition (2.1) and the entropy condition (2.2), there exists a unique shock profile $(V_s, U_s)(\eta)$, $\eta = x - st$, up to a shift, which connects $(v_-, u_-)$ and $(v_+, u_+)$, and
\[
\begin{aligned}
&0 < v_- < V_s(\eta) < v_+, \qquad u_+ < U_s(\eta) < u_-,\\
&h(V_s) > 0, \qquad V_s' = \frac{V_s\,h(V_s)}{s\mu} > 0,\\
&|V_s(\eta) - v_\pm| = O(1)|v_+ - v_-|e^{-c_\pm|\eta|}, \qquad |U_s(\eta) - u_\pm| = O(1)|v_+ - v_-|e^{-c_\pm|\eta|},
\end{aligned} \tag{2.5}
\]
as $\eta \to \pm\infty$, where $c_\pm = \dfrac{v_\pm|p'(v_\pm) + s^2|}{\mu s} > 0$.
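The properties listed in Proposition 2.1 can be checked numerically on an example. The following sketch assumes $p(v) = v^{-\gamma}$, computes the shock speed from (2.1), and integrates $V_s' = V_s h(V_s)/(s\mu)$ from (2.4) with a classical fourth-order Runge-Kutta step, starting from the midpoint of $(v_-, v_+)$ (a choice that fixes the shift). The step size, the integration length and all function names are illustrative assumptions.

```python
import math

def pressure(v, gamma):
    """Assumed pressure law p(v) = v**(-gamma)."""
    return v ** (-gamma)

def shock_speed(v_minus, v_plus, gamma):
    """s > 0 from the Rankine-Hugoniot condition: s^2 (v+ - v-) = p(v-) - p(v+)."""
    return math.sqrt((pressure(v_minus, gamma) - pressure(v_plus, gamma)) / (v_plus - v_minus))

def h(V, s, v_minus, gamma):
    """h(V) = -s^2 V - p(V) - b with b = -s^2 v_- - p(v_-), as in (2.4)."""
    b = -s * s * v_minus - pressure(v_minus, gamma)
    return -s * s * V - pressure(V, gamma) - b

def decay_rate(v_end, s, gamma, mu):
    """c_pm = v_pm |p'(v_pm) + s^2| / (mu s), as in (2.5)."""
    dp = -gamma * v_end ** (-gamma - 1.0)
    return v_end * abs(dp + s * s) / (mu * s)

def integrate_profile(v_minus, v_plus, gamma, mu, eta_max=40.0, steps=4000):
    """Integrate V' = V h(V) / (s mu) forward in eta with RK4, from the midpoint."""
    s = shock_speed(v_minus, v_plus, gamma)

    def rhs(V):
        return V * h(V, s, v_minus, gamma) / (s * mu)

    d = eta_max / steps
    V = 0.5 * (v_minus + v_plus)
    values = [V]
    for _ in range(steps):
        k1 = rhs(V)
        k2 = rhs(V + 0.5 * d * k1)
        k3 = rhs(V + 0.5 * d * k2)
        k4 = rhs(V + d * k3)
        V += d * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        values.append(V)
    return s, d, values

# Sample data: v_- = 1, v_+ = 2, gamma = 1.4, mu = 1.
s, d, profile = integrate_profile(1.0, 2.0, 1.4, 1.0)
```

For these sample data the computed profile increases monotonically from the midpoint towards $v_+$, and the gap $v_+ - V_s(\eta)$ shrinks by roughly the factor $e^{-c_+}$ per unit of $\eta$ for large $\eta$, in line with (2.5).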

On the other hand, there exists a boundary layer solution of the form $(v,u) = (V_b, U_b)(x - s_- t)$ with $(V_b, U_b)(0) = (v_-, u_-)$, $(V_b, U_b)(+\infty) = (v_+, u_+)$, if $(v_-, u_-) \in \Omega_{sub}$ and $(v_+, u_+) \in BL(v_-, u_-)$, due to Matsumura and Nishihara [12]. The BL-solution $(V_b, U_b)$ satisfies
\[
\begin{cases}
-s_- V_b' - U_b' = 0,\\
-s_- U_b' + p(V_b)' = \mu\left(\dfrac{U_b'}{V_b}\right)',\\
(V_b, U_b)(0) = (v_-, u_-), \quad (V_b, U_b)(+\infty) = (v_+, u_+).
\end{cases} \tag{2.6}
\]
Furthermore, we have

Proposition 2.2. Let $(v_-, u_-) \in \Omega_{sub}$, $(v_+, u_+) \in BL(v_-, u_-) \cap \Omega_{sub}$. Then there exists a unique solution $(V_b, U_b)(\eta)$, $\eta = x - s_- t$, to (2.6), which satisfies
\[
\big|\big(V_b(x - s_- t) - v_+,\ U_b(x - s_- t) - u_+\big)\big| \le C|v_+ - v_-|e^{-c|x - s_- t|}, \tag{2.7}
\]
with some $c > 0$.

We now make a coordinate transformation, which makes the problem (1.2) easier to handle:
\[
t = t, \qquad \xi = x - s_- t. \tag{2.8}
\]
Thus, the problem (1.2) becomes
\[
\begin{cases}
v_t - s_- v_\xi - u_\xi = 0, & \xi > 0,\ t > 0,\\
u_t - s_- u_\xi + p(v)_\xi = \mu\left(\dfrac{u_\xi}{v}\right)_\xi, & \xi > 0,\ t > 0,\\
(v,u)|_{\xi=0} = (v_-, u_-),\\
(v,u)|_{t=0} = (v_0, u_0) \to (v_+, u_+), & \text{as } \xi \to +\infty.
\end{cases} \tag{2.9}
\]
We consider the case
\[
(v_-, u_-) \in \Omega_{sub}, \qquad (v_+, u_+) \in BLS_2(v_-, u_-). \tag{2.10}
\]
Obviously, the large time behavior of the solutions to (2.9) should be expected to be the superposition of a 2-viscous shock wave and a BL-solution. In this case, there is $(\bar v, \bar u) \in BL(v_-, u_-)$ such that $(v_+, u_+) \in S_2(\bar v, \bar u)$. We consider the situation where the initial data $(v_0(x), u_0(x))$ are given in a neighborhood of $(V_b(\xi) + V_s(\xi - \beta) - \bar v,\ U_b(\xi) + U_s(\xi - \beta) - \bar u)$ for some large constant $\beta > 0$. Namely, we require that the viscous shock wave be far from the boundary initially. The next question is how to determine the shift $\alpha$ such that the solution $(v,u)$ to (2.9) is expected to tend to $(V_b(\xi) + V_s(\xi - (s - s_-)t + \alpha - \beta) - \bar v,\ U_b(\xi) + U_s(\xi - (s - s_-)t + \alpha - \beta) - \bar u)$. It is known that determining the shift $\alpha$ is difficult even for scalar viscous conservation laws. Fortunately, Matsumura and Nishihara [12] have shown how to determine the shift $\alpha$ for the system (2.9). Their results are

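\[
\alpha = \frac{1}{v_+ - \bar v}\Big\{\int_0^\infty [v_0(\xi) - V_b(\xi) - V_s(\xi - \beta) + \bar v]\,d\xi - (s - s_-)\int_0^\infty [V_s((s_- - s)t - \beta) - \bar v]\,dt\Big\}, \tag{2.11}
\]
and
\[
\int_0^\infty [v(\xi,t) - V(\xi,t;\alpha,\beta)]\,d\xi = (s - s_-)\int_t^\infty \big[V_s((s_- - s)\tau + \alpha - \beta) - \bar v\big]\,d\tau \longrightarrow 0 \quad \text{as } t \to \infty, \tag{2.12}
\]
where
\[
V(\xi,t;\alpha,\beta) = V_b(\xi) + V_s(\xi - (s - s_-)t + \alpha - \beta) - \bar v. \tag{2.13}
\]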


It is noted that if $(v_+, u_+) \in S_2(v_-, u_-)$,
\[
\alpha = \frac{1}{v_+ - v_-}\Big\{\int_0^\infty [v_0(\xi) - V_s(\xi - \beta)]\,d\xi - (s - s_-)\int_0^\infty [V_s((s_- - s)t - \beta) - v_-]\,dt\Big\}. \tag{2.14}
\]
Let
\[
U(\xi,t;\alpha,\beta) = U_b(\xi) + U_s(\xi - (s - s_-)t + \alpha - \beta) - \bar u. \tag{2.15}
\]
To state our main theorems, we suppose that for some $\beta > 0$,
\[
v_0(\xi) - V(\xi,0;0,\beta) \in H^1\cap L^1, \qquad u_0(\xi) - U(\xi,0;0,\beta) \in H^1\cap L^1, \tag{2.16}
\]
and suppose the compatibility condition
\[
v_0(0) = v_-, \qquad u_0(0) = u_-, \tag{2.17}
\]
holds. We set
\[
(\Phi_0, \Psi_0)(\xi) = -\int_\xi^\infty \big(v_0(y) - V(y,0;0,\beta),\ u_0(y) - U(y,0;0,\beta)\big)\,dy. \tag{2.18}
\]
Assume that
\[
(\Phi_0, \Psi_0) \in L^2. \tag{2.19}
\]

We now give our main results.

Theorem 2.1. Suppose that $1 \le \gamma \le 3$, $(v_-, u_-) \in \Omega_{sub}$, $(v_+, u_+) \in S_2(v_-, u_-)$, with $u_- > 0$, $s > 0$. Assume that (2.16), (2.17) and (2.19) hold and
\[
(\gamma - 1)^2(v_+ - v_-) < 2\gamma v_-. \tag{2.20}
\]
Then there exists a positive constant $\delta_0$ depending on $v_-$ and $v_+$ such that, for any given $0 < u_- = \delta < \delta_0$, there is a positive constant $\varepsilon_0(\delta)$ such that, if
\[
\|(\Phi_0, \Psi_0)\|_2 + e^{-c_-\beta} < \varepsilon_0(\delta), \tag{2.21}
\]
then (2.9) has a unique global solution $(v,u)(\xi,t)$ satisfying
\[
v(\xi,t) - V(\xi,t;\alpha,\beta) \in C^0([0,\infty), H^1)\cap L^2(0,\infty; H^1), \tag{2.22}
\]
\[
u(\xi,t) - U(\xi,t;\alpha,\beta) \in C^0([0,\infty), H^1)\cap L^2(0,\infty; H^2), \tag{2.23}
\]
and
\[
\sup_{\xi\in\mathbb{R}_+}\big|(v,u)(\xi,t) - (V,U)(\xi,t;\alpha,\beta)\big| \longrightarrow 0, \quad \text{as } t \to +\infty, \tag{2.24}
\]
where $\alpha = \alpha(\beta)$ is determined by (2.14).
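The structural hypotheses of Theorem 2.1 on $\gamma$, $v_-$ and $v_+$ are simple algebraic inequalities. The sketch below merely tests $1 \le \gamma \le 3$, $0 < v_- < v_+$ and condition (2.20) for given numbers; it says nothing about the smallness conditions on $u_-$, the initial data or $\beta$. The function name is a hypothetical helper, not taken from the paper.

```python
def satisfies_condition_2_20(gamma, v_minus, v_plus):
    """Check 1 <= gamma <= 3, 0 < v_- < v_+ and (gamma - 1)^2 (v_+ - v_-) < 2 gamma v_-."""
    if not (1.0 <= gamma <= 3.0):
        return False
    if not (0.0 < v_minus < v_plus):
        return False
    return (gamma - 1.0) ** 2 * (v_plus - v_minus) < 2.0 * gamma * v_minus

# For gamma = 1 the inequality (2.20) holds for every shock strength;
# for gamma = 3 and v_- = 1 it requires v_+ < 2.5.
isothermal_ok = satisfies_condition_2_20(1.0, 1.0, 100.0)
strong_shock_ok = satisfies_condition_2_20(3.0, 1.0, 3.0)
```

In this example `isothermal_ok` is true and `strong_shock_ok` is false, which illustrates that (2.20) restricts the shock strength only when $\gamma > 1$.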


Theorem 2.2. Suppose that $1 \le \gamma \le 3$, $(v_-, u_-) \in \Omega_{sub}$, $(v_+, u_+) \in BLS_2(v_-, u_-)$ with $u_- > 0$. Then there exists $(\bar v, \bar u)$ such that $(\bar v, \bar u) \in BL(v_-, u_-)$ and $(v_+, u_+) \in S_2(\bar v, \bar u)$. Assume that (2.16), (2.17) and (2.19) hold and
\[
(\gamma - 1)^2(v_+ - \bar v) < 2\gamma\bar v. \tag{2.25}
\]
Then there exists a positive constant $\delta_0$ depending on $v_-$ and $v_+$ such that, for any given $0 < u_- = \delta < \delta_0$, there exist positive constants $\varepsilon_0(\delta)$ and $\varepsilon_1(\delta)$ such that, if
\[
\|(\Phi_0, \Psi_0)\|_2 + e^{-c_-\beta} < \varepsilon_0(\delta), \tag{2.26}
\]
\[
|v_- - \bar v| < \varepsilon_1(\delta), \tag{2.27}
\]
then (2.9) has a unique global solution $(v,u)(\xi,t)$ satisfying
\[
v(\xi,t) - V(\xi,t;\alpha,\beta) \in C^0([0,\infty), H^1)\cap L^2(0,\infty; H^1), \tag{2.28}
\]
\[
u(\xi,t) - U(\xi,t;\alpha,\beta) \in C^0([0,\infty), H^1)\cap L^2(0,\infty; H^2), \tag{2.29}
\]
and
\[
\sup_{\xi\in\mathbb{R}_+}\big|(v,u)(\xi,t) - (V,U)(\xi,t;\alpha,\beta)\big| \longrightarrow 0, \quad \text{as } t \to +\infty, \tag{2.30}
\]
where $\alpha = \alpha(\beta)$ is determined by (2.11).

3. Reformulation of the Original Problem

In this section, we focus our attention on Case (I), i.e. $(v_+, u_+) \in S_2(v_-, u_-)$. In this case, $(V,U)(\xi,t;\alpha,\beta) = (V_s, U_s)(\xi - (s - s_-)t + \alpha - \beta)$. Let
\[
\varphi(\xi,t) = -\int_\xi^\infty \big[v(y,t) - V(y,t;\alpha,\beta)\big]\,dy, \qquad
\psi(\xi,t) = -\int_\xi^\infty \big[u(y,t) - U(y,t;\alpha,\beta)\big]\,dy, \tag{3.1}
\]
which means we seek the solution $(v,u)(\xi,t)$ in the form
\[
v(\xi,t) = \varphi_\xi(\xi,t) + V(\xi,t;\alpha,\beta), \qquad u(\xi,t) = \psi_\xi(\xi,t) + U(\xi,t;\alpha,\beta). \tag{3.2}
\]
Substituting (3.2) into (2.9), and integrating the system on $[\xi, +\infty)$ with respect to $\xi$, we have
\[
\begin{cases}
\varphi_t - s_-\varphi_\xi - \psi_\xi = 0, & \text{in } \mathbb{R}_+\times\mathbb{R}_+,\\
\psi_t - s_-\psi_\xi + p(V + \varphi_\xi) - p(V) = \mu\left(\dfrac{U_\xi + \psi_{\xi\xi}}{V + \varphi_\xi} - \dfrac{U_\xi}{V}\right), & \text{in } \mathbb{R}_+\times\mathbb{R}_+.
\end{cases} \tag{3.3}
\]


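By (3.1), the initial data satisfy
\[
\begin{aligned}
\varphi(\xi,0) &= -\int_\xi^{+\infty}[v_0(y) - V(y,0;\alpha,\beta)]\,dy \\
&= \Phi_0(\xi) + \int_\xi^\infty [V(y,0;\alpha,\beta) - V(y,0;0,\beta)]\,dy \\
&= \Phi_0(\xi) + \int_\xi^\infty\!\!\int_0^\alpha V'(y + \theta - \beta)\,d\theta\,dy \\
&= \Phi_0(\xi) + \int_0^\alpha [v_+ - V(\xi + \theta - \beta)]\,d\theta =: \varphi_0(\xi),
\end{aligned} \tag{3.4}
\]
\[
\begin{aligned}
\psi(\xi,0) &= -\int_\xi^{+\infty}[u_0(y) - U(y,0;\alpha,\beta)]\,dy \\
&= \Psi_0(\xi) + \int_\xi^\infty [U(y,0;\alpha,\beta) - U(y,0;0,\beta)]\,dy \\
&= \Psi_0(\xi) + \int_0^\alpha [u_+ - U(\xi + \theta - \beta)]\,d\theta =: \psi_0(\xi).
\end{aligned} \tag{3.5}
\]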

Furthermore, we have

Proposition 3.1. Under the assumptions (2.16), (2.17) and (2.19), the initial perturbations $(\varphi_0, \psi_0) \in H^2$ and satisfy
\[
\|(\varphi_0, \psi_0)\|_2 \to 0 \quad \text{as } \|(\Phi_0, \Psi_0)\|_2 \le o\Big(\frac{1}{\sqrt\beta}\Big) \text{ and } \beta \to +\infty. \tag{3.6}
\]
Although the proof of Proposition 3.1 is similar to that of Lemma 3.1 of [7], we give it here.

Proof. Let $\chi_1(x) = \int_0^\alpha [v_+ - V(x - \beta + \theta)]\,d\theta$. Then it follows from (2.5) that, if $x \ge \beta - \theta$,
\[
|v_+ - V(x - \beta + \theta)| \le Ce^{-c_+(x - \beta + \theta)}. \tag{3.7}
\]
Thus, we have
\[
\|\chi_1\|^2 \le C\int_0^{\beta-\theta}\alpha^2\,dx + C\int_{\beta-\theta}^\infty \alpha^2 e^{-2c_+(x - \beta + \theta)}\,dx \le C\alpha^2(\beta + 1),
\]
where $C$ is independent of $\alpha$ and $\beta$. Similarly, we can prove that $\|\chi_1'\|^2, \|\chi_1''\|^2 \le C\alpha^2(\beta + 1)$. On the other hand, (2.14) gives
\[
|\alpha| \le C\big(|\Phi_0(0)| + e^{-c_-\beta}\big) \le C\big(\|\Phi_0\|_2 + e^{-c_-\beta}\big).
\]
Thus, we have
\[
\|\chi_1\|_2 \to 0 \quad \text{as } \|\Phi_0\|_2 \le o\Big(\frac{1}{\sqrt\beta}\Big) \text{ and } \beta \to +\infty.
\]
In the same way, $\chi_2(x) = \int_0^\alpha [u_+ - U(x - \beta + \theta)]\,d\theta$ satisfies $\|\chi_2\|_2^2 \le C\alpha^2(\beta + 1)$. It is noted that $\|(\varphi_0, \psi_0)\|_2 \le \|(\Phi_0, \Psi_0)\|_2 + \|(\chi_1, \chi_2)\|_2$. Hence, Proposition 3.1 is proved.

By (2.14), the boundary data satisfy
\[
\varphi(0,t) = -\int_0^{+\infty}[v(y,t) - V(y,t;\alpha,\beta)]\,dy = -(s - s_-)\int_t^\infty \big[V((s_- - s)\tau + \alpha - \beta) - v_-\big]\,d\tau =: A(t), \tag{3.8}
\]
\[
\psi_\xi(0,t) = u(0,t) - U(0,t;\alpha,\beta) = u_- - U((s_- - s)t + \alpha - \beta) = A'(t) + s_-\big(V((s_- - s)t + \alpha - \beta) - v_-\big). \tag{3.9}
\]

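It is noted that if (3.8) and (3.9) hold, then $\varphi_\xi(0,t) = v_- - V(0,t;\alpha,\beta)$ automatically holds by Eq. (3.3). Hence we regard (3.9) as a Neumann boundary condition. We now rewrite the system (3.3) in the form
\[
\begin{cases}
\varphi_t - s_-\varphi_\xi - \psi_\xi = 0, & \text{in } \mathbb{R}_+\times\mathbb{R}_+,\\
\psi_t - s_-\psi_\xi - f(V)\varphi_\xi - \dfrac{\mu}{V}\psi_{\xi\xi} = F, & \text{in } \mathbb{R}_+\times\mathbb{R}_+,\\
(\varphi, \psi)|_{t=0} = (\varphi_0, \psi_0),\\
\varphi|_{\xi=0} = A(t),\\
\psi_\xi|_{\xi=0} = A'(t) + s_-\big(V(-(s - s_-)t + \alpha - \beta) - v_-\big),
\end{cases} \tag{3.10}
\]
where
\[
f(V) = -p'(V) + \frac{\mu s V'}{V^2} = \frac{h(V) - p'(V)V}{V} \equiv \frac{K(V)}{V}, \tag{3.11}
\]
\[
F = -\{p(V + \varphi_\xi) - p(V) - p'(V)\varphi_\xi\} + \big(\mu\psi_{\xi\xi} + h(V)\varphi_\xi\big)\Big(\frac{1}{V + \varphi_\xi} - \frac{1}{V}\Big). \tag{3.12}
\]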

For any interval $I \subset \mathbb{R}_+$, we define the solution space $X(I)$ by
\[
X(I) = \Big\{(\varphi, \psi) \in C^0(I; H^2);\ \varphi_\xi \in L^2(I; H^1),\ \psi_\xi \in L^2(I; H^2),\ \sup_{t\in I}\|(\varphi, \psi)(t)\|_2 \le \varepsilon_1\Big\}, \tag{3.13}
\]


where $\varepsilon_1 = \frac12 v_-$. Let
\[
N(t) = \sup_{0\le\tau\le t}\big(\|\varphi(\tau)\|_2 + \|\psi(\tau)\|_2\big), \qquad N_0 = \|\varphi_0\|_2 + \|\psi_0\|_2. \tag{3.14}
\]
By the Sobolev embedding theorem, for $(\varphi, \psi) \in X([0,T])$, one obtains
\[
(V + \varphi_\xi)(\xi,t) \ge v_- - \|\varphi_\xi\|_1 \ge \frac12 v_-, \qquad (\xi,t) \in \mathbb{R}_+\times[0,T],
\]
which ensures that the system (3.10) is uniformly nonsingular on $[0,T]$, and
\[
|F| = O\big(|\varphi_\xi|^2 + |\varphi_\xi|\cdot|\psi_{\xi\xi}|\big). \tag{3.15}
\]

Proposition 3.2. (Local Existence). For any $\tau \ge 0$, consider the problem
\[
\begin{cases}
\varphi_t - s_-\varphi_\xi - \psi_\xi = 0, & \text{in } \mathbb{R}_+\times[\tau,\infty),\\
\psi_t - s_-\psi_\xi - f(V)\varphi_\xi - \dfrac{\mu}{V}\psi_{\xi\xi} = F, & \text{in } \mathbb{R}_+\times[\tau,\infty),\\
(\varphi, \psi)|_{t=\tau} = (\varphi_\tau, \psi_\tau) \in H^2,\\
\varphi|_{\xi=0} = A(t), & t \ge \tau,\\
\psi_\xi|_{\xi=0} = f(t) = A'(t) + s_-\big(V(0,t;\alpha,\beta) - v_-\big), & t \ge \tau,
\end{cases} \tag{3.16}
\]
subject to the compatibility condition $\psi_\xi(0,\tau) = f(\tau)$. Then there exists a positive constant $C_0$ independent of $\tau$ such that: for any $\varepsilon \in (0, \varepsilon_1/C_0]$ and $\beta > 1$, there exists a positive constant $T_0$ depending on $\varepsilon$ and $\beta$ but not on $\tau$ such that, if $\|(\varphi_\tau, \psi_\tau)\|_2 \le \varepsilon$ and $\sup_{t\ge0}(|f(t)| + |f'(t)|) \le \varepsilon$, then the problem (3.16) has a unique solution $(\varphi, \psi) \in X([\tau, \tau + T_0])$ satisfying $\|(\varphi, \psi)(t)\|_2 \le C_0\varepsilon$ for $t \in [\tau, \tau + T_0]$.

Proposition 3.2 can be verified by a standard argument, such as Leray-Schauder's fixed-point theorem, and we omit the proof here.

4. Proof of the a Priori Estimates

Let $(\varphi, \psi) \in X([0,T])$ be a solution of (3.10) for a positive constant $T$. Without loss of generality, we may restrict $N(T) < \varepsilon_1$, $\beta > 1$ and $|\alpha| < 1$. Throughout this section, we use $C$ to denote a positive constant which is independent of $T$, $\beta$ and $\alpha$.

Proposition 4.1. (A Priori Estimates). There exists a positive constant $\delta_0$ such that, for any given $0 < u_- = \delta < \delta_0$, there exist positive constants $\delta_1(\delta)$ ($\delta_1 \le \varepsilon_1$) and $C(\delta)$ such that if $(\varphi, \psi) \in X([0,T])$ is a solution of (3.10) for some positive $T$ and $N(T) < \delta_1$, then $(\varphi, \psi)$ satisfies the a priori estimates
\[
\|(\varphi, \psi)(t)\|_2^2 + \int_0^t \big(\|\varphi_\xi(\tau)\|_1^2 + \|\psi_\xi(\tau)\|_2^2\big)\,d\tau \le C(\delta)\big(\|(\varphi_0, \psi_0)\|_2^2 + e^{-c_-\beta}\big), \tag{4.1}
\]
\[
\int_0^t \Big(\Big|\frac{d}{d\tau}\|\varphi_\xi(\tau)\|^2\Big| + \Big|\frac{d}{d\tau}\|\psi_\xi(\tau)\|^2\Big|\Big)\,d\tau \le C(\delta)\big(\|(\varphi_0, \psi_0)\|_2^2 + e^{-c_-\beta}\big). \tag{4.2}
\]
Before proving Proposition 4.1, we first give some lemmas.


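Lemma 4.1. For $0 \le t \le T$, the following inequalities hold:
\[
\begin{aligned}
&\Big|\int_0^t (\varphi\psi)|_{\xi=0}\,d\tau\Big|,\ \Big|\int_0^t (\psi\psi_\xi)|_{\xi=0}\,d\tau\Big|,\ \Big|\int_0^t (\varphi^2)|_{\xi=0}\,d\tau\Big| \le Ce^{-c_-\beta},\\
&\Big|\int_0^t (\psi_\xi\psi_t)|_{\xi=0}\,d\tau\Big|,\ \Big|\int_0^t (\varphi\psi_\xi)|_{\xi=0}\,d\tau\Big|,\ \Big|\int_0^t (\varphi_\xi^2)|_{\xi=0}\,d\tau\Big| \le Ce^{-c_-\beta},\\
&\Big|\int_0^t (\psi_\xi)^2|_{\xi=0}\,d\tau\Big|,\ \Big|\int_0^t (\varphi_\xi\varphi)|_{\xi=0}\,d\tau\Big|,\ \Big|\int_0^t (\varphi_{\xi t}^2)|_{\xi=0}\,d\tau\Big| \le Ce^{-c_-\beta},\\
&\Big|\int_0^t (\varphi_\xi\psi)|_{\xi=0}\,d\tau\Big|,\ \Big|\int_0^t (\psi_{\xi t}^2)|_{\xi=0}\,d\tau\Big|,\ \Big|\int_0^t (\varphi_{\xi t}\psi_\xi)|_{\xi=0}\,d\tau\Big| \le Ce^{-c_-\beta},
\end{aligned} \tag{4.3}
\]
where $c_- = \dfrac{v_-|p'(v_-) + s^2|}{\mu s} > 0$.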

Proof. By using (3.8), (3.9) and Proposition 2.1, we have |φ(0, t)| = A(t) ≤ O(1)e−c− β e−c− (s−s− )t , |ψ(0, t)| ≤ sup |ψ(ξ, t)| ≤ CN (T ) ≤ C,

(4.4)

ξ ∈+

and

k d ≤ O(1)e−c− β e−c− (s−s− )t , k = 0, 1, 2, 3, A(t) dt k |φξ (0, t)| = |v− − V (s − s− )t + α − β | ≤ O(1)e−c− β e−c− (s−s− )t , (4.5) |φξ t (0, t)| = v− − V (s − s− )t + α − β ≤ O(1)e−c− β e−c− (s−s− )t , |ψξ (0, t)| = |u− − U (s − s− )t + α − β | ≤ O(1)e−c− β e−c− (s−s− )t , (4.6) |ψξ t (0, t)| = u− − U (s − s− )t + α − β ≤ O(1)e−c− β e−c− (s−s− )t ,

By using (4.4)–(4.6), t t (φψ) dτ ≤ O(1)e−c− β e−c− (s−s− )τ |ψ(0, τ )|dτ ξ =0 0 0 t ≤ CN (T ) e−c− β e−c− (s−s− )τ dτ 0

≤ Ce−c− β , t t (ψψξ ) dτ ≤ O(1)e−c− β e−c− (s−s− )τ |ψ(0, τ )|dτ ξ =0 0 0 t ≤ CN (T ) e−c− β e−c− (s−s− )τ dτ 0

≤ Ce−c− β , t t (φ 2 ) dτ ≤ O(1)e−c− β e−c− (s−s− )τ |φξ (0, τ )|dτ ξ =0 0

0

≤ Ce−c− β ,


t t (ψξ ψt ) dτ = ψξ (0, t)ψ(0, t) − ψξ (0, 0)ψ(0, 0) − ψ (0, τ )ψ(0, τ )dτ ξτ ξ =0 0 0 ! " t ≤ CN (T ) e−c− β + |ψξ τ (0, τ )|dτ 0

$\le Ce^{-c_-\beta}$. In the same way, it is easy to get the other inequalities. Lemma 4.1 is proved.

Remark. Lemma 4.1 is formally obtained. It can be rigorously proved by mollifying the functions and then taking the limit.

Lemma 4.2. Suppose $V_s(\xi - (s - s_-)t + \alpha - \beta)$ is the viscous shock profile. Then
\[
0 \le \frac{h(V_s)}{V_s} \le \frac{s^2(v_+ - v_-)}{v_-}, \tag{4.7}
\]
\[
0 < -p'(v_+) \le f(V_s) \le -p'(v_-) + \frac{s^2(v_+ - v_-)}{v_-}, \tag{4.8}
\]
\[
f(V_s) - \frac{h(V_s)}{2V_s} \ge -p'(v_+) > 0. \tag{4.9}
\]

The proof can be found in [7]. We now establish the a priori estimates. The following is our key lemma. Lemma 4.3. There exists a positive constant δ0 which only depends on v− and v+ . For any given 0 < u− < δ0 , the following holds

t ψξ 2 dτ ≤ C (φ0 , ψ0 ) 2 + e−c− β + |s− | φξ 2 dτ 0 t + N (T ) [ φξ 2 + ψξ ξ 2 ]dτ . (4.10)

t

(φ, ψ)(t) 2 + 0

0

Remark 4.1. If we directly multiply $(3.10)_2$ by $f(V)^{-1}\psi$, then we cannot get (4.10) due to the bad sign of $s_-\psi^2(0,t)$. As pointed out in [12], it is difficult to control the term $\psi(0,t)$ from the boundary. Since the term $s_-\psi_\xi$ in (3.10) is caused by the coordinate transformation, it may be possible to eliminate $\psi(0,t)$ in a new system by introducing a new variable $\bar\psi(x,t)$ instead of $\psi(x,t)$. This is our key point. It should be noted that

t

0

t 0

ψ 2 (0, τ )dτ ≤ C(δ)( (φ0 , ψ0 ) 22 + e−c− β ),

∞ 0

(4.11) Vξ φ dξ dτ ≤ 2

C(δ)( (φ0 , ψ0 ) 22

+e

−c− β

),

provided that Proposition 4.1 holds. We shall prove (4.11) in the end of this section.


Proof. Let ψ = ψ − s− φ, then (3.10) becomes  in + × + ,  φt − 2s− φξ − ψ ξ = 0, µ  µ 2 )φ − ψ  ψ − (f (V ) − s − s φ = F, in + × + ,  ξ − ξ ξ t ξ ξ −  V V (φ, ψ)t=0 = (φ0 , ψ0 − s− φ0 ),   φ ξ =0 = A(t),     ψ ξ ξ =0 = A (t). 2 −1 ψ, then we have Multiplying (4.12)1 by φ and (4.12)2 by f (V ) − s− # 1 2 1 1 2 φ − s− φ 2 − (φψ)ξ + ψ 2 ξ 2 2 f (V ) − s− t t $ $ % % 1 µ 2 − Vt ψ − ψ ψ 2) 2) ξ 2(f (V ) − s− V (f (V ) − s− ξ $ % µ µ 2 + ψξ + Vξ ψ ξ ψ 2 2) V (f (V ) − s− ) V (f (V ) − s− $ % µ µ − s− φ ξ ψ + s− φ ψ 2 2) ξ ξ V (f (V ) − s− ) V (f (V ) − s− ξ $ % µ + s− V ξ φξ ψ 2) V (f (V ) − s− −1 2 . = F ψ f (V ) − s− By the definition of f (V ), one has $ % µ V ψ ψ ξ ξ = 2) V (f (V ) − s−


(4.12)

(4.13)

µ(K (V ) − s 2 ) − V ψ ψ 2 V )2 ξ ξ (K(V ) − s−

≤ε

2 )2 V 2 µ(K (V ) − s− µ 2 2 ξ ψ + ψ , ξ 2 2 3 V (f (V ) − s− ) 4ε(K(V ) − s− V ) (4.14)

for any ε > 0, which will be determined later. Substituting this inequality into (4.13) yields # 1 2 1 2 ψ φ + 2) 2 2(f (V ) − s− t # µ µ 2 − s− φ + φψ + ψ ψ + s− φ ψ 2) ξ 2) ξ V (f (V ) − s− V (f (V ) − s− ξ µ 2 2 +(1 − ε − |s− |ε) ψ + [Z(V ) + O(1)|s− |]Vξ ψ 2) ξ V (f (V ) − s− −1 µ|s− | 1 2 ≤ F ψ f (V ) − s− + c+ (4.15) φ2, 2) ξ 4ε V (f (V ) − s−


where s K(V ) − V K (V ) µK (V )2 Vξ − . (4.16) 2 K(V )2 4εK(V )3 In view of (2.3), (2.4) and (3.11), one gets

1 Z(V ) = 2s 2 γ 3 p(V )2 + h(V )2 + γ s 2 Vp(V ) 3 4sK(V ) ! " (γ − 1) γ 2 (γ − 1)2 2 + 2s γ (γ + 1) − p(V )h(V ) − h(V )p(V )2 ε εV ! " 1 4 + 2s 1 − V h(V ) , (4.17) 2ε Z(V ) =

where O(1) only depends on v− , v+ . By using Lemma 4.2, one obtains

1 2 2 2 Z(V ) ≥ 2s + γ s Vp(V ) h(V ) 4sK(V )3 ! " (v+ − v− ) 2 2 2 + 2s γ γ − (γ − 1) p(V )2 2v− ε ! " 1 + 2s 2 γ γ + 1 − (γ − 1)ε −1 p(V )h(V ) + 2s 4 1 − V h(V ) . 2ε Choosing

(γ − 1)2 (v+ − v− ) 1 max ≤ ε < 1, , 2γ v− 2 one has Z(V ) ≥ C > 0,

(4.18)

(4.19)

(4.20)

due to 1 ≤ γ ≤ 3. Thus, there exists a positive constant δ0 depending on v− , v+ . For any given 0 < u− < δ0 (or 0 < |s− | ≤ v− δ0 ), integrating (4.15) over [0, +∞) × [0, t], we have # +∞ 2 t +∞ 2 φ ψ 1 2 + Z(V )Vξ ψ dξ dτ dξ + 2) 2 2 2(f (V ) − s 0 0 0 − 2 t +∞ (1 − ε − |s− |ε)µψ ξ + dξ dτ 2) V (f (V ) − s− 0 0 +∞ 2 φ02 + ψ 0 dξ ≤C 0 # t ψ µψ µφ ψ s ξ − ξ 2 + s− φ + φψ + dτ + 2) 2) 0 V (f (V ) − s− V (f (V ) − s− ξ =0 t +∞ −1 2 + F ψ f (V ) − s− dξ dτ 0 t 0 +∞ cµ|s− | (4.21) φ 2 dξ dτ. + 2) ξ V (f (V ) − s 0 0 − By using Lemma 4.1, Lemma 4.2, we get the energy estimate (4.10) due to ψ = ψ −s− φ.


Lemma 4.4. For any given 0 < u− < δ0 , ∀t ∈ [0, T ], t φξ 2 + φξ (τ ) 2 dτ 0

t ≤ C (φ0 , ψ0 ) 21 + e−c− β + N (T ) φξ 2 + ψξ 21 dτ .

(4.22)

0

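Proof. From system (3.10), we have
$$\frac{\mu}{V}\phi_{\xi t}-s_-\frac{\mu}{V}\phi_{\xi\xi}+f(V)\phi_\xi+s_-\psi_\xi=\psi_t-F. \tag{4.23}$$
Multiplying (4.23) by $\phi_\xi$ yields
$$\Big(\frac{\mu}{2V}\phi_\xi^{2}\Big)_t-s_-\Big(\frac{\mu}{2V}\phi_\xi^{2}\Big)_\xi-\frac{(s-s_-)\mu}{2V^{2}}V_\xi\phi_\xi^{2}-\frac{s_-\mu}{2V^{2}}V_\xi\phi_\xi^{2}+f(V)\phi_\xi^{2}+s_-\psi_\xi\phi_\xi=\psi_t\phi_\xi-F\phi_\xi. \tag{4.24}$$
Using (2.4), one obtains
$$\Big(\frac{\mu}{2V}\phi_\xi^{2}\Big)_t+\Big(f(V)-\frac{h(V)}{2V}\Big)\phi_\xi^{2}+s_-\psi_\xi\phi_\xi-s_-\Big(\frac{\mu}{2V}\phi_\xi^{2}\Big)_\xi=\psi_t\phi_\xi-F\phi_\xi. \tag{4.25}$$
System (3.10)$_1$ gives
$$\psi_t\phi_\xi=(\psi\phi_\xi)_t-\psi\phi_{\xi t}=(\psi\phi_\xi)_t-\psi(s_-\phi_{\xi\xi}+\psi_{\xi\xi})=(\psi\phi_\xi)_t-(s_-\psi\phi_\xi)_\xi-(\psi\psi_\xi)_\xi+s_-\psi_\xi\phi_\xi+\psi_\xi^{2}. \tag{4.26}$$
Substituting (4.26) into (4.25), and using Lemma 4.2, we get
$$\Big(\frac{\mu}{2V}\phi_\xi^{2}-\psi\phi_\xi\Big)_t+\Big(f(V)-\frac{h(V)}{2V}\Big)\phi_\xi^{2}+\Big(\psi\psi_\xi+s_-\psi\phi_\xi-s_-\frac{\mu}{2V}\phi_\xi^{2}\Big)_\xi=\psi_\xi^{2}-F\phi_\xi. \tag{4.27}$$
Integrating (4.27) over $[0,+\infty)\times[0,t]$, and using Proposition 2.1, Lemma 4.2 and the inequality
$$\int_0^{+\infty}|\psi\phi_\xi|\,d\xi\le\frac{\mu}{4v_+}\|\phi_\xi(t)\|^{2}+\frac{v_+}{\mu}\|\psi(t)\|^{2},$$
we have
$$\frac{\mu}{4v_+}\|\phi_\xi(t)\|^{2}+|p'(v_+)|\int_0^t\|\phi_\xi(\tau)\|^{2}\,d\tau\le\frac{\mu}{2v_-}\|\phi_{0,\xi}\|^{2}+\frac12\|\psi_0\|^{2}+\frac12\|\phi_{0,\xi}\|^{2}+\int_0^t\Big\{\psi\psi_\xi+s_-\psi\phi_\xi-s_-\frac{\mu}{2V}\phi_\xi^{2}\Big\}\Big|_{\xi=0}d\tau+\frac{v_+}{\mu}\|\psi(t)\|^{2}+\int_0^t\|\psi_\xi(\tau)\|^{2}\,d\tau+\int_0^t\int_0^{+\infty}|F\phi_\xi|\,d\xi\,d\tau. \tag{4.28}$$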


Applying Lemma 4.1 and Lemma 4.3 to this inequality yields the estimate (4.22). Lemma 4.4 is proved.

Lemma 4.5. For any given $0<u_-<\delta_0$ and all $t\in[0,T]$,
$$\|\psi_\xi(t)\|^{2}+\int_0^t\|\psi_{\xi\xi}\|^{2}\,d\tau\le C\Big\{\|(\phi_0,\psi_0)\|_1^{2}+e^{-c_-\beta}+N(T)\int_0^t\big(\|\phi_\xi(\tau)\|^{2}+\|\psi_\xi(\tau)\|_1^{2}\big)\,d\tau\Big\}. \tag{4.29}$$

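Proof. Multiplying (3.10)$_2$ by $-\psi_{\xi\xi}$, one obtains
$$\Big(\frac12\psi_\xi^{2}\Big)_t+\Big(-\psi_\xi\psi_t+\frac{s_-}{2}\psi_\xi^{2}\Big)_\xi+f(V)\phi_\xi\psi_{\xi\xi}+\frac{\mu}{V}\psi_{\xi\xi}^{2}=-\psi_{\xi\xi}F. \tag{4.30}$$
Lemma 4.2 and the Cauchy inequality yield
$$|f(V)\phi_\xi\psi_{\xi\xi}|\le\frac{\mu}{2v_+}\psi_{\xi\xi}^{2}+\frac{c_0^{2}v_+}{2\mu}\phi_\xi^{2}, \tag{4.31}$$
and (3.15) and the Cauchy inequality yield
$$|-F\psi_{\xi\xi}|\le C\big(|\phi_\xi|^{2}+|\phi_\xi|\cdot|\psi_{\xi\xi}|\big)|\psi_{\xi\xi}|\le C|\phi_\xi|\big(|\phi_\xi|^{2}+|\psi_{\xi\xi}|^{2}\big). \tag{4.32}$$
Substituting (4.32), (4.31) into (4.30) and using Proposition 2.1, we have
$$\frac12(\psi_\xi^{2})_t+\Big(\frac{s_-}{2}\psi_\xi^{2}-\psi_\xi\psi_t\Big)_\xi+\frac{\mu}{2v_+}\psi_{\xi\xi}^{2}\le\frac{c_0^{2}v_+}{2\mu}\phi_\xi^{2}+C|\phi_\xi|\big(|\phi_\xi|^{2}+|\psi_{\xi\xi}|^{2}\big). \tag{4.33}$$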

Integrating (4.33) over $[0,+\infty)\times[0,t]$, and using Lemma 4.1, Lemma 4.3 and Lemma 4.4, we have the estimate (4.29).

From Lemmas 4.3–4.4, we get the following inequality.

Lemma 4.6. For any given $0<u_-<\delta_0$ and all $t\in[0,T]$,
$$\|(\phi,\psi)(t)\|_1^{2}+\int_0^t\big(\|\phi_\xi\|^{2}+\|\psi_\xi\|_1^{2}\big)\,d\tau\le C\Big\{\|(\phi_0,\psi_0)\|_1^{2}+e^{-c_-\beta}+N(T)\int_0^t\big(\|\phi_\xi(\tau)\|^{2}+\|\psi_\xi(\tau)\|_1^{2}\big)\,d\tau\Big\}. \tag{4.34}$$

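Lemma 4.7. For any given $0<u_-<\delta_0$ and all $t\in[0,T]$,
$$\|\phi_{\xi\xi}(t)\|^{2}+\int_0^t\|\phi_{\xi\xi}\|^{2}\,d\tau\le\frac{C}{|s_-|^{2+\sigma}}\Big\{\|(\phi_0,\psi_0)\|_2^{2}+e^{-c_-\beta}+N(T)\int_0^t\big(\|\phi_\xi(\tau)\|^{2}+\|\psi_\xi(\tau)\|_1^{2}\big)\,d\tau\Big\}+C\Big\{\int_0^t\|F_\xi(\tau)\|^{2}\,d\tau+|s_-|^{\sigma}\int_0^t\|\psi_{\xi\xi\xi}\|^{2}\,d\tau\Big\}, \tag{4.35}$$
where $\sigma$ is an arbitrary positive constant.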


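Proof. Differentiating (4.23) with respect to $\xi$, one gets
$$\frac{\mu}{V}\phi_{\xi\xi t}-\frac{\mu V_\xi}{V^{2}}\phi_{\xi t}-s_-\frac{\mu}{V}\phi_{\xi\xi\xi}+s_-\frac{\mu V_\xi}{V^{2}}\phi_{\xi\xi}+f(V)\phi_{\xi\xi}+f'(V)V_\xi\phi_\xi+s_-\psi_{\xi\xi}=\psi_{\xi t}-F_\xi. \tag{4.36}$$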

Multiplying (4.36) by φξ ξ , we have s µ µ µVξ h(V ) − 2 φξ ξ + f (V ) − φξ2ξ − φξ2ξ − 2 φξ t φξ ξ 2V 2V 2V V t ξ s− µVξ 2 + φξ ξ + f (V )Vξ φξ ξ φξ + s− ψξ ξ φξ ξ = ψξ t φξ ξ − Fξ φξ ξ . V2 From systems (3.10)1 , one has t 2 s− φ dτ ξ ξ ξ =0 0 t 1 = (φξ t − ψξ ξ )2 ξ =0 dτ |s− | 0 t 2 φξ2t ξ =0 + ψξ2ξ ξ =0 dτ ≤ |s− | 0 t +∞ C −c− β 1 d 2 ≤ + e ψξ ξ dξ dτ |s− | s− 0 0 dξ " t! C −c− β 1 1 2 ≤ e |s− |1+σ ψξ ξ ξ 2 + dτ + ψ ξξ |s− | 2|s− | 0 |s− |1+σ t t C −c− β 1 1 ≤ + |s− |σ ψξ ξ ξ 2 dτ + ψξ ξ 2 dτ, e 2+σ |s− | 2 |s | − 0 0

(4.37)

(4.38)

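where $\sigma>0$ is an arbitrary positive constant. Using Lemma 4.2 and (2.4), we obtain
$$\Big|\frac{\mu V_\xi}{V^{2}}\phi_{\xi t}\phi_{\xi\xi}\Big|=\Big|\frac{\mu V_\xi}{V^{2}}(s_-\phi_{\xi\xi}+\psi_{\xi\xi})\phi_{\xi\xi}\Big|\le\frac18|p'(v_+)|\phi_{\xi\xi}^{2}+C\psi_{\xi\xi}^{2}, \tag{4.39}$$
$$|s_-\psi_{\xi\xi}\phi_{\xi\xi}|\le O(1)|s_-|\phi_{\xi\xi}^{2}+C\psi_{\xi\xi}^{2}, \tag{4.40}$$
$$|f'(V)V_\xi\phi_{\xi\xi}\phi_\xi|\le\frac18|p'(v_+)|\phi_{\xi\xi}^{2}+C\phi_\xi^{2}, \tag{4.41}$$
$$|F_\xi\phi_{\xi\xi}|\le\frac18|p'(v_+)|\phi_{\xi\xi}^{2}+C|F_\xi|^{2}, \tag{4.42}$$
$$\psi_{\xi t}\phi_{\xi\xi}=(\psi_\xi\phi_{\xi\xi})_t-\psi_\xi\phi_{\xi\xi t}=(\psi_\xi\phi_{\xi\xi})_t-\psi_\xi(s_-\phi_{\xi\xi\xi}+\psi_{\xi\xi\xi})=(\psi_\xi\phi_{\xi\xi})_t-\big(\psi_\xi(s_-\phi_{\xi\xi}+\psi_{\xi\xi})\big)_\xi+s_-\phi_{\xi\xi}\psi_{\xi\xi}+\psi_{\xi\xi}^{2}=(\psi_\xi\phi_{\xi\xi})_t-(\psi_\xi\phi_{\xi t})_\xi+s_-\phi_{\xi\xi}\psi_{\xi\xi}+\psi_{\xi\xi}^{2}. \tag{4.43}$$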

Substituting (4.38)–(4.43) into (4.37), integrating it over [0, +∞) × [0, t], and making use of Lemma 4.2 and Lemma 4.6, we obtain the inequality (4.35). Lemma 4.7 is proved.
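The way the boundary value of $\psi_{\xi\xi}$ is traded for interior norms in (4.38) can be sketched as follows. This is an illustration of the mechanism only, with constants not tracked; the weight $\lambda$ is a name introduced here, and the sketch assumes $\psi_{\xi\xi}(\xi,\tau)\to0$ as $\xi\to+\infty$.

```latex
\begin{align*}
\psi_{\xi\xi}^{2}(0,\tau)
  &= -\int_0^{+\infty}\frac{d}{d\xi}\,\psi_{\xi\xi}^{2}\,d\xi
   = -2\int_0^{+\infty}\psi_{\xi\xi}\psi_{\xi\xi\xi}\,d\xi
   \le 2\,\|\psi_{\xi\xi}\|\,\|\psi_{\xi\xi\xi}\| \\
  &\le \lambda\,\|\psi_{\xi\xi\xi}\|^{2}+\lambda^{-1}\|\psi_{\xi\xi}\|^{2},
   \qquad \lambda=|s_-|^{1+\sigma}, \\
\frac{1}{|s_-|}\,\psi_{\xi\xi}^{2}(0,\tau)
  &\le |s_-|^{\sigma}\,\|\psi_{\xi\xi\xi}\|^{2}
     +\frac{1}{|s_-|^{2+\sigma}}\,\|\psi_{\xi\xi}\|^{2}.
\end{align*}
```

The powers $|s_-|^{\sigma}$ and $|s_-|^{-(2+\sigma)}$ obtained this way are the ones that appear on the right-hand side of (4.35).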


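Lemma 4.8. For any given $0<u_-<\delta_0$ and all $t\in[0,T]$,
$$\|\psi_{\xi\xi}(t)\|^{2}+\int_0^t\|\psi_{\xi\xi\xi}\|^{2}\,d\tau\le\frac{C}{|s_-|^{2+\sigma}}\Big\{\|(\phi_0,\psi_0)\|_2^{2}+e^{-c_-\beta}+N(T)\int_0^t\big(\|\phi_\xi(\tau)\|^{2}+\|\psi_\xi(\tau)\|_1^{2}\big)\,d\tau\Big\}+C\int_0^t\|F_\xi(\tau)\|^{2}\,d\tau. \tag{4.44}$$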

Proof. Differentiating (3.10)2 with respect to ξ , multiplying the derivative by −ψξ ξ ξ , we have s 1 2 − 2 ψξ ξ + ψξ ξ − ψξ t ψξ ξ + f (V )φξ ξ ψξ ξ ξ 2 2 ξ t µ 2 µ +f (V )Vξ φξ ψξ ξ ξ + ψξ ξ ξ − 2 Vξ ψξ ξ ψξ ξ ξ = −Fξ ψξ ξ ξ . (4.45) V V From systems (3.10)1 , one has t +∞ t s− s− d 2 2 = ψ dτ dξ dτ ψ ξ ξ ξ =0 ξξ 2 2 dξ 0 0 0 t t |s− | ≤ ψξ ξ ξ 2 dτ + ψξ ξ 2 dτ , 4 0 0 µ ψξ ξ ξ 2 + C ψξ ξ 2 + Cψξ2t |ξ =0 , 8v+ µ |ψξ ξ ξ |2 , |f (V )φξ ξ ψξ ξ ξ | ≤ C|φξ ξ |2 + 8v+ µ |ψξ ξ ξ |2 , |f (V )Vξ φξ ψξ ξ ξ | ≤ C|φξ |2 + 8v+ µ |ψξ ξ ξ |2 . |Fξ ψξ ξ ξ | ≤ C|Fξ |2 + 8v+ ψξ t ψξ ξ |ξ =0 ≤

(4.46)

(4.47) (4.48) (4.49) (4.50)

Substituting (4.46)–(4.50) into (4.45), integrating it over [0, +∞) × [0, t], and making use of Lemma 4.2 and Lemma 4.7, we obtain the inequality (4.44). Lemma 4.8 is proved. Proof of Proposition 4.1. By using the Sobolev embedding theorem, (3.11) and (3.15), we have +∞ φξ4 + φξ2 φξ2ξ + ψξ2ξ φξ2ξ + ψξ2ξ ξ φξ2 + φξ2 ψξ2ξ dξ Fξ 2 ≤ C 0 ! +∞ " +∞ φξ2ξ dξ φξ2 + φξ2ξ + ψξ2ξ ξ + ψξ2ξ dξ + ψξ2ξ ≤ C sup φξ2 ξ ∈+

0

≤ C φξ 21 φξ 2 + ψξ ξ ξ 2 + ψξ ξ 21 φξ ξ 2 ≤ C φξ 21 φξ 21 + ψξ 22 ≤ CN (T ) φξ 21 + ψξ 22 .

0

(4.51)


Assume that $0<u_-<\delta$ is given. Choose $\delta_1=O(1)\delta^{3+\sigma}$. Let $N(T)\le\delta_1$; then Lemmas 4.3–4.8 imply the inequality (4.1). To prove inequality (4.2), we differentiate the system (3.10)$_1$ with respect to $\xi$, multiply it by $\phi_\xi$, and integrate the resulting equality with respect to $\xi$. We get
$$\frac{d}{dt}\|\phi_\xi(t)\|^{2}=2\int_0^{+\infty}\psi_{\xi\xi}\phi_\xi\,d\xi+2s_-\int_0^{+\infty}\phi_\xi\phi_{\xi\xi}\,d\xi. \tag{4.52}$$
Integrating (4.52) over $[0,+\infty)$ and using (4.1), we prove (4.2) for $\phi$. In the same way, we can also prove (4.2) for $\psi$. Proposition 4.1 is proved.

Theorem 4.1. Suppose that the assumptions of Theorem 2.1 hold. Then there exists a positive constant $\varepsilon_0(\delta)$ such that if (2.20) and (2.21) are satisfied, then the initial-boundary value problem (3.10) has a unique global solution $(\phi,\psi)\in X([0,+\infty))$ satisfying inequalities (4.1) and (4.2) for any $t\ge0$. Moreover, the solution is asymptotically stable:
$$\sup_{\xi\in\mathbb{R}_+}|(\phi_\xi,\psi_\xi)(\xi,t)|\longrightarrow0\quad\text{as }t\to+\infty.$$
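The decay of the sup norm rests on the elementary half-line inequality sup f² ≤ 2‖f‖‖f_ξ‖ for functions vanishing at infinity, which the proof applies to φ_ξ and ψ_ξ. The following sketch checks that inequality numerically on two sample profiles; the profiles, the truncation length and the function names are choices made for this illustration, not taken from the paper.

```python
import math


def trapezoid(values, step):
    """Composite trapezoidal rule on a uniform grid."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def sobolev_check(f, df, length=40.0, n=40000):
    """Return (sup f^2, 2*||f||*||f'||) on [0, length], L2 norms by quadrature."""
    step = length / n
    grid = [i * step for i in range(n + 1)]
    fv = [f(x) for x in grid]
    dfv = [df(x) for x in grid]
    sup_sq = max(v * v for v in fv)
    norm_f = math.sqrt(trapezoid([v * v for v in fv], step))
    norm_df = math.sqrt(trapezoid([v * v for v in dfv], step))
    return sup_sq, 2.0 * norm_f * norm_df


# f(x) = exp(-x): the inequality is sharp (both sides equal 1 up to quadrature error).
sharp = sobolev_check(lambda x: math.exp(-x), lambda x: -math.exp(-x))

# f(x) = x exp(-x): a strict inequality.
strict = sobolev_check(lambda x: x * math.exp(-x),
                       lambda x: (1.0 - x) * math.exp(-x))
```

Since the right-hand side is a product, it tends to zero as soon as one factor decays while the other stays bounded, which is how uniform boundedness of the second derivatives is used.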

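Proof. From Proposition 3.2 and Proposition 4.1, we get the existence of a unique global solution $(\phi,\psi)\in X([0,+\infty))$ satisfying inequalities (4.1) and (4.2) for any $t\ge0$, provided that $\|(\phi_0,\psi_0)\|_2$ and $\beta^{-1}$ are chosen small enough. Furthermore, $\|(\phi_{\xi\xi},\psi_{\xi\xi})(t)\|$ is uniformly bounded over $[0,+\infty)$ due to (4.1). By the Sobolev embedding theorem, we obtain
$$\sup_{\xi\in\mathbb{R}_+}|(\phi_\xi,\psi_\xi)(\xi,t)|^{2}\le2\big\{\|\phi_\xi(t)\|\,\|\phi_{\xi\xi}(t)\|+\|\psi_\xi(t)\|\,\|\psi_{\xi\xi}(t)\|\big\}\longrightarrow0\quad\text{as }t\to+\infty.$$
This completes the proof of Theorem 4.1.

Proof of Theorem 2.1. From Theorem 4.1, Theorem 2.1 is obtained at once.

Remark. In terms of Proposition 4.1, naturally we have
$$\int_0^t\psi^{2}(0,\tau)\,d\tau\le\frac{C}{|s_-|^{3+\sigma}}\Big\{\|(\phi_0,\psi_0)\|_2^{2}+e^{-c_-\beta}\Big\}, \tag{4.53}$$
$$\int_0^t\int_0^\infty V_\xi\phi^{2}\,d\xi\,d\tau\le\frac{C}{|s_-|^{2+\sigma}}\Big\{\|(\phi_0,\psi_0)\|_2^{2}+e^{-c_-\beta}\Big\}. \tag{4.54}$$
In fact, multiplying (3.10)$_1$ by $f(V)\phi$ and (3.10)$_2$ by $\psi$, we have
$$-\frac{s}{2}f'(V)V_\xi\phi^{2}+\Big(\frac{s_-}{2}\psi^{2}\Big)_\xi=\Big(\frac12f(V)\phi^{2}+\frac12\psi^{2}\Big)_t+\frac{\mu}{V}\psi_\xi^{2}-\Big(\frac{s_-}{2}f(V)\phi^{2}+f(V)\phi\psi+\frac{\mu}{V}\psi_\xi\psi\Big)_\xi+f'(V)V_\xi\phi\psi-\frac{\mu}{V^{2}}V_\xi\psi_\xi\psi-F\psi. \tag{4.55}$$
It is observed that
$$\int_0^t\int_0^\infty V_\xi\bar\psi^{2}\,d\xi\,d\tau\ge\frac12\int_0^t\int_0^\infty V_\xi\psi^{2}\,d\xi\,d\tau-c|s_-|^{2}\int_0^t\int_0^\infty V_\xi\phi^{2}\,d\xi\,d\tau, \tag{4.56}$$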


and
$$f'(V)=-p''(V)+\frac{h'(V)V-h(V)}{V^{2}}\le-p''(V)+\frac{h'(V)}{V}\le-p''(V)-\frac{s^{2}+p'(V)}{V}\le-\gamma^{2}v_+^{-\gamma-2}. \tag{4.57}$$

Integrating (4.55) over $[0,+\infty)\times[0,t]$, from (4.21), (4.56), (4.57) and Proposition 4.1, we have
$$\int_0^t\int_0^\infty V_\xi\phi^{2}\,d\xi\,d\tau+|s_-|\int_0^t\psi^{2}(0,\tau)\,d\tau\le\frac{C}{|s_-|^{2+\sigma}}\Big\{\|(\phi_0,\psi_0)\|_2^{2}+e^{-c_-\beta}\Big\}. \tag{4.58}$$

5. Superposition of Viscous Shock Wave and BL-Solution

In this section, we investigate the case
$$(v_-,u_-)\in\Omega_{\mathrm{sub}},\qquad(v_+,u_+)\in BLS_2(v_-,u_-), \tag{5.1}$$
where a superposition of the 2-viscous shock wave and the BL-solution is expected to be an asymptotic state of the solution to (1.2). In this case, there is $(\bar v,\bar u)\in BL(v_-,u_-)$ such that $(v_+,u_+)\in S_2(\bar v,\bar u)$, and there are the BL-solution $(V_b,U_b)(\xi)$ satisfying (2.6) and the 2-viscous shock wave $(V_s,U_s)(\xi-(s-s_-)t+\alpha-\beta)$ satisfying (2.3), where $\alpha=\alpha(\beta)$ is determined by (2.11). Let
$$V(\xi,t;\alpha,\beta)=V_b(\xi)+V_s(\xi-(s-s_-)t+\alpha-\beta)-\bar v,\qquad U(\xi,t;\alpha,\beta)=U_b(\xi)+U_s(\xi-(s-s_-)t+\alpha-\beta)-\bar u, \tag{5.2}$$
and
$$\phi(\xi,t)=-\int_\xi^\infty\big[v(y,t)-V(y,t;\alpha,\beta)\big]\,dy,\qquad\psi(\xi,t)=-\int_\xi^\infty\big[u(y,t)-U(y,t;\alpha,\beta)\big]\,dy. \tag{5.3}$$

Substituting (5.3) into (2.9), and integrating the system on $[\xi,+\infty)$ with respect to $\xi$, we have
$$\begin{cases}\phi_t-s_-\phi_\xi-\psi_\xi=0,&\text{in }\mathbb{R}_+\times\mathbb{R}_+,\\ \psi_t-s_-\psi_\xi-f_1(V)\phi_\xi-\dfrac{\mu}{V}\psi_{\xi\xi}=F_1+Q_1,&\text{in }\mathbb{R}_+\times\mathbb{R}_+,\\ (\phi,\psi)|_{t=0}=(\phi_0,\psi_0),\\ \phi|_{\xi=0}=A(t),\\ \psi_\xi|_{\xi=0}=A'(t)+s_-\big(V(-(s-s_-)t+\alpha-\beta)-v_-\big),\end{cases} \tag{5.4}$$


where
$$f_1(V)=f_1(V_s,V_b)=-p'(V)+\frac{h(V_s)}{V}+\frac{g(V_b)}{V}, \tag{5.5}$$
$$F_1=-\{p(V+\phi_\xi)-p(V)-p'(V)\phi_\xi\}+\big(\mu\psi_{\xi\xi}+h(V_s)\phi_\xi+g(V_b)\phi_\xi\big)\Big(\frac{1}{V+\phi_\xi}-\frac1V\Big), \tag{5.6}$$
$$Q_1=-\{p(V)-p(V_s)-p(V_b)+p(\bar v)\}+h(V_s)\frac{V_b-\bar v}{V+\phi_\xi}+g(V_b)\frac{V_s-\bar v}{V+\phi_\xi}, \tag{5.7}$$
$$A(t)=-(s-s_-)\int_t^\infty\big[V_s((s_--s)\tau+\alpha-\beta)-\bar v\big]\,d\tau, \tag{5.8}$$
with $h(V_s)=\dfrac{\mu sV_s'}{V_s}$ and $g(V_b)=\dfrac{\mu s_-V_b'}{V_b}$. It is easy to see that (3.15) and Lemma 4.1 still hold here. Furthermore, from Proposition 2.1 and 2.2, we have
$$\int_0^\infty\int_0^\infty|Q_1|\,d\xi\,dt\le\int_0^\infty\int_0^\infty|V_s(\xi-(s-s_-)t+\alpha-\beta)-\bar v|\,|V_b(\xi)-\bar v|\,d\xi\,dt\le C\int_0^\infty\int_0^{(s-s_-)t+\beta-\alpha}e^{-c|\xi|}e^{-c_-|\xi-(s-s_-)t+\alpha-\beta|}\,d\xi\,dt+C\int_0^\infty\int_{(s-s_-)t+\beta-\alpha}^\infty e^{-c|\xi|}\,d\xi\,dt\le Ce^{-c\beta}. \tag{5.9}$$

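As in Lemma 4.3, let $\bar\psi=\psi-s_-\phi$; then (5.4) becomes
$$\begin{cases}\phi_t-2s_-\phi_\xi-\bar\psi_\xi=0,&\text{in }\mathbb{R}_+\times\mathbb{R}_+,\\ \bar\psi_t-(f_1(V)-s_-^{2})\phi_\xi-\dfrac{\mu}{V}\bar\psi_{\xi\xi}-s_-\dfrac{\mu}{V}\phi_{\xi\xi}=F_1+Q_1,&\text{in }\mathbb{R}_+\times\mathbb{R}_+,\\ (\phi,\bar\psi)|_{t=0}=(\phi_0,\psi_0-s_-\phi_0),\\ \phi|_{\xi=0}=A(t),\\ \bar\psi_\xi|_{\xi=0}=A'(t).\end{cases} \tag{5.10}$$
Multiplying (5.10)$_1$ by $\phi$ and (5.10)$_2$ by $(f_1(V)-s_-^{2})^{-1}\bar\psi$, we have
$$\Big(\frac12\phi^{2}\Big)_t-s_-(\phi^{2})_\xi-(\phi\bar\psi)_\xi+\Big(\frac{\bar\psi^{2}}{2(f_1(V)-s_-^{2})}\Big)_t-\Big(\frac{1}{2(f_1(V)-s_-^{2})}\Big)_t\bar\psi^{2}-\Big(\frac{\mu\bar\psi_\xi\bar\psi}{V(f_1(V)-s_-^{2})}\Big)_\xi+\frac{\mu\bar\psi_\xi^{2}}{V(f_1(V)-s_-^{2})}+\Big(\frac{\mu}{V(f_1(V)-s_-^{2})}\Big)_\xi\bar\psi_\xi\bar\psi-\Big(\frac{s_-\mu\phi_\xi\bar\psi}{V(f_1(V)-s_-^{2})}\Big)_\xi+\frac{s_-\mu\phi_\xi\bar\psi_\xi}{V(f_1(V)-s_-^{2})}+s_-\phi_\xi\bar\psi\Big(\frac{\mu}{V(f_1(V)-s_-^{2})}\Big)_\xi=(F_1+Q_1)\bar\psi\,(f_1(V)-s_-^{2})^{-1}. \tag{5.11}$$
Let $|v_--\bar v|=\tilde\delta$. Noting that $V_t=V_{st}$, the term $-\Big(\dfrac{1}{2(f_1(V)-s_-^{2})}\Big)_t\bar\psi^{2}$ can only control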

the term which includes Vsξ . By the discussion above, we have # t +∞ +∞ 2 2 φ ψ ˜ s ) + O(1)(|s− | + δ) ˜ Vsξ ψ 2 dξ dτ Z(V + dξ + 2) 2 2(f1 (V ) − s− 0 0 0 2 t +∞ cµψ ξ + dξ dτ 2) V (f1 (V ) − s− 0 0 t +∞ +∞ 2 2 2 ¯2 φ0 + ψ 0 dξ + C ψ dξ dτ Vbξ ≤C 0 0 0 # t ψ µψ µφ ψ s ξ − ξ 2 + s− φ + φψ + dτ + 2 2 V (f1 (V ) − s− ) V (f1 (V ) − s− ) 0 ξ =0 t +∞ −1 2 (F1 + Q1 )ψ f1 (V ) − s− dξ dτ + 0 0 t +∞ cµ|s− | (5.12) φξ2 dξ dτ, + 2 V (f 0 0 1 (V ) − s− ) where ˜ s) = s Z(V 2

Vs K(Vs )

−

µK (Vs )2 Vsξ ≥ C > 0, 4εK(Vs )3

(5.13)

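with
$$K(V_s)=-p'(V_s)V_s+h(V_s), \tag{5.14}$$
if (2.25) holds and
$$\max\Big\{\frac{(\gamma-1)^{2}(v_+-\bar v)}{2\gamma\bar v},\ \frac12\Big\}\le\varepsilon<1. \tag{5.15}$$
Now we estimate the term $\int_0^t\int_0^{+\infty}V_{b\xi}^{2}\bar\psi^{2}\,d\xi\,d\tau$ by the idea of Kawashima and Nikkuni [2]. Since
$$\psi(\xi,t)=\psi(0,t)+\int_0^\xi\psi_\xi(y,t)\,dy, \tag{5.16}$$
we have
$$|\psi(\xi,t)|\le C\big(|\psi(0,t)|+\xi^{1/2}\|\psi_\xi\|\big), \tag{5.17}$$
which yields, from Proposition 2.2,
$$\int_0^t\int_0^{+\infty}V_{b\xi}^{2}\bar\psi^{2}\,d\xi\,d\tau\le C\int_0^t\int_0^{+\infty}V_{b\xi}^{2}\big\{\psi(0,\tau)^{2}+|s_-|^{2}\phi(0,\tau)^{2}+\xi\big(|s_-|^{2}\|\phi_\xi\|^{2}+\|\psi_\xi\|^{2}\big)\big\}\,d\xi\,d\tau\le C\tilde\delta^{2}\int_0^t\big(\psi(0,\tau)^{2}+\|\psi_\xi\|^{2}+|s_-|^{2}\|\phi_\xi\|^{2}\big)\,d\tau+Ce^{-c\beta}. \tag{5.18}$$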

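Hence, (5.9) and (5.12) give
$$\|(\phi,\psi)(t)\|^{2}+\int_0^t\|\psi_\xi\|^{2}\,d\tau+\int_0^t\int_0^\infty\big[\tilde Z(V_s)+O(1)(|s_-|+\tilde\delta)\big]V_{s\xi}\psi^{2}\,d\xi\,d\tau\le C\Big\{\|(\phi_0,\psi_0)\|^{2}+e^{-c_-\beta}+|s_-|\int_0^t\|\phi_\xi\|^{2}\,d\tau+|s_-|^{2}\int_0^t\int_0^\infty V_{s\xi}\phi^{2}\,d\xi\,d\tau+\tilde\delta^{2}\int_0^t\psi(0,\tau)^{2}\,d\tau+N(T)\int_0^t\big(\|\phi_\xi\|^{2}+\|\psi_{\xi\xi}\|^{2}\big)\,d\tau\Big\}, \tag{5.19}$$
where we have used the fact that
$$\int_0^t\int_0^\infty V_{s\xi}\bar\psi^{2}\,d\xi\,d\tau\ge\frac12\int_0^t\int_0^\infty V_{s\xi}\psi^{2}\,d\xi\,d\tau-c|s_-|^{2}\int_0^t\int_0^\infty V_{s\xi}\phi^{2}\,d\xi\,d\tau. \tag{5.20}$$
Suppose that $\tilde\delta$ is small enough such that $\tilde\delta\le o(|s_-|)$. Multiplying (5.4)$_1$ by $f_1(V)\phi$ and (5.4)$_2$ by $\psi$ yields
$$-\frac{s-s_-}{2}\frac{\partial}{\partial V_s}f_1(V_s,V_b)V_{s\xi}\phi^{2}-\frac12s_-[f_1(V)]_\xi\phi^{2}+\Big(\frac{s_-}{2}\psi^{2}\Big)_\xi=\Big(\frac12f_1(V)\phi^{2}+\frac12\psi^{2}\Big)_t+\frac{\mu}{V}\psi_\xi^{2}-\Big(\frac{s_-}{2}f_1(V)\phi^{2}+f_1(V)\phi\psi+\frac{\mu}{V}\psi_\xi\psi\Big)_\xi+[f_1(V)]_\xi\phi\psi+\Big(\frac{\mu}{V}\Big)_\xi\psi_\xi\psi-(F_1+Q_1)\psi. \tag{5.21}$$
The Cauchy inequality yields
$$|[f_1(V)]_\xi\phi\psi|\le\varepsilon V_{s\xi}\phi^{2}+C\big(V_{s\xi}\psi^{2}+V_{b\xi}\phi^{2}+V_{b\xi}\psi^{2}\big)\le\varepsilon V_{s\xi}\phi^{2}+CV_{s\xi}\psi^{2}+CV_{b\xi}\big\{\psi(0,t)^{2}+\phi(0,t)^{2}+\xi\big(\|\phi_\xi\|^{2}+\|\psi_\xi\|^{2}\big)\big\}, \tag{5.22}$$
with small constant $\varepsilon$, and
$$\Big|\Big(\frac{\mu}{V}\Big)_\xi\psi_\xi\psi\Big|\le C\big(\psi_\xi^{2}+V_{s\xi}\psi^{2}+V_{b\xi}\psi^{2}\big). \tag{5.23}$$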


Integrating (5.21) over $[0,+\infty)\times[0,t]$ and using (5.18) and (5.19), we have
$$\int_0^t\int_0^\infty V_{s\xi}\phi^{2}\,d\xi\,d\tau+|s_-|\int_0^t\psi(0,\tau)^{2}\,d\tau\le C\Big\{\|(\phi,\psi)(t)\|^{2}+\int_0^t\|\psi_\xi\|^{2}\,d\tau+\int_0^t\int_0^\infty V_{s\xi}\psi^{2}\,d\xi\,d\tau\Big\}+C\Big\{\|(\phi_0,\psi_0)\|^{2}+e^{-c_-\beta}+|s_-|\int_0^t\|\phi_\xi\|^{2}\,d\tau+N(T)\int_0^t\big(\|\phi_\xi\|^{2}+\|\psi_{\xi\xi}\|^{2}\big)\,d\tau\Big\}, \tag{5.24}$$
where we have used the fact that $\frac{\partial}{\partial V_s}f_1(V_s,V_b)=f'(V_s)+O(1)\tilde\delta$ with $f'(V_s)<-\gamma^{2}\bar v^{-\gamma-2}$ due to (4.57). Substituting (5.24) into (5.19), we have
$$\Big\{1-c\Big(|s_-|+\frac{\tilde\delta^{2}}{|s_-|}\Big)\Big\}\Big\{\|(\phi,\psi)(t)\|^{2}+\int_0^t\|\psi_\xi\|^{2}\,d\tau+\int_0^t\int_0^\infty V_{s\xi}\psi^{2}\,d\xi\,d\tau\Big\}\le C\Big\{\|(\phi_0,\psi_0)\|^{2}+e^{-c_-\beta}+|s_-|\int_0^t\|\phi_\xi\|^{2}\,d\tau+N(T)\int_0^t\big(\|\phi_\xi\|^{2}+\|\psi_{\xi\xi}\|^{2}\big)\,d\tau\Big\}, \tag{5.25}$$
which yields, if $|s_-|$ is small,
$$\|(\phi,\psi)(t)\|^{2}+\int_0^t\|\psi_\xi\|^{2}\,d\tau+\int_0^t\int_0^\infty V_{s\xi}\psi^{2}\,d\xi\,d\tau\le C\Big\{\|(\phi_0,\psi_0)\|^{2}+e^{-c_-\beta}+|s_-|\int_0^t\|\phi_\xi\|^{2}\,d\tau+N(T)\int_0^t\big[\|\phi_\xi\|^{2}+\|\psi_{\xi\xi}\|^{2}\big]\,d\tau\Big\}. \tag{5.26}$$
The estimates of higher order derivatives are obtained in the same way, though the calculations are rather tedious. Therefore Theorem 2.2 is proved.

Remark. Theorem 2.2 requires the BL-solution to be weak, whereas the shock wave need not be weak.

Acknowledgements. The work of F. Huang was supported in part by the JSPS Research Fellowship for foreign researchers and Grant-in-Aid No. P-00269 for JSPS from the Ministry of Education, Science, Sports and Culture of Japan.

References
1. Goodman, J.: Nonlinear asymptotic stability of viscous shock profiles for conservation laws. Arch. Rat. Mech. Anal. 95, 325–344 (1986)
2. Kawashima, S., Nikkuni, Y.: Stability of stationary solutions to the half-space problem for the discrete Boltzmann equation with multiple collision. Kyushu J. Math., to appear
3. Kawashima, S., Nishibata, S.: Stability of stationary waves for compressible Navier-Stokes equations in the half space. In preparation
4. Liu, T.P.: Nonlinear stability of shock wave for viscous conservation laws. Mem. Amer. Math. Soc. 56, (1985)
5. Liu, T.P., Matsumura, A., Nishihara, K.: Behaviors of solutions for the Burgers equation with boundary corresponding to rarefaction waves. SIAM J. Math. Anal. 29, 293–308 (1998)
6. Matsumura, A.: Inflow and outflow problems in the half space for a one-dimensional isentropic model system of compressible viscous gas. Proceedings of IMS conference on Differential Equations from Mechanics, Hong Kong, 1999, to appear
7. Matsumura, A., Mei, M.: Convergence to travelling fronts of solutions of the p-system with viscosity in the presence of a boundary. Arch. Rat. Mech. Anal. 146, 1–22 (1999)
8. Matsumura, A., Nishihara, K.: On the stability of traveling wave solutions of a one-dimensional model system for compressible viscous gas. Japan J. Appl. Math. 2, 17–25 (1985)
9. Matsumura, A., Nishihara, K.: Asymptotics toward the rarefaction wave of the solutions of a one-dimensional model system for compressible viscous gas. Japan J. Appl. Math. 3, 1–13 (1986)
10. Matsumura, A., Nishihara, K.: Global stability of the rarefaction wave of a one-dimensional model system for compressible viscous gas. Commun. Math. Phys. 144, 325–335 (1992)
11. Matsumura, A., Nishihara, K.: Global asymptotics toward the rarefaction wave for solutions of viscous p-system with boundary effect. Q. Appl. Math. LVIII, 69–83 (2000)
12. Matsumura, A., Nishihara, K.: Large time behaviors of solutions to an inflow problem in the half space for a one-dimensional system of compressible viscous gas. To appear in Commun. Math. Phys.
13. Shi, X.D.: Global asymptotics toward the rarefaction wave to an inflow problem on the half line for solutions of viscous p-system. Preprint, 2001
14. Szepessy, A., Xin, Z.P.: Nonlinear stability of viscous shock waves. Arch. Rat. Mech. Anal. 122, 53–103 (1993)
15. Xin, Z.P.: Asymptotic stability of rarefaction waves for 2 × 2 viscous hyperbolic conservation laws – two-mode case. J. Differ. Eqs. 78, 191–219 (1989)

Communicated by P. Constantin

Commun. Math. Phys. 239, 287–307 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0876-7

Communications in

Mathematical Physics

Asymptotics for the Burgers Equation with Pumping

Nakao Hayashi¹, Pavel I. Naumkin²

¹ Department of Mathematics, Graduate School of Science, Osaka University, Osaka, Toyonaka, 560-0043, Japan. E-mail: [email protected]
² Instituto de Matemáticas, UNAM Campus Morelia, AP 61-3 (Xangari), Morelia, CP 58089, Michoacán, Mexico. E-mail: [email protected]

Received: 6 September 2002 / Accepted: 27 February 2003 Published online: 11 June 2003 – © Springer-Verlag 2003

Abstract: We consider the large time asymptotic behavior of solutions to the initial-boundary value problem
$$\begin{cases}u_t+e^{t}uu_x-u_{xx}=0,&x\in\mathbb{R},\ t>0,\\ u(0,x)=u_0(x),&x\in\mathbb{R},\\ u(t,x)\to a_\pm,&x\to\pm\infty,\ t>0.\end{cases}$$
We find large time asymptotic formulas of solutions for three different cases: 1) $a_\pm=\pm1$, 2) $a_\pm=\mp1$ and 3) $a_\pm=0$.

1. Introduction

We study the large time asymptotic behavior of solutions to the initial-boundary value problem
$$\begin{cases}u_t+e^{t}uu_x-u_{xx}=0,&x\in\mathbb{R},\ t>0,\\ u(0,x)=u_0(x),&x\in\mathbb{R},\\ u(t,x)\to a_\pm,&x\to\pm\infty,\ t>0.\end{cases} \tag{1}$$
Equation (1) is obtained from the Burgers equation with a pumping $\tilde u_t+\tilde u\tilde u_x-\tilde u-\tilde u_{xx}=0$ via a change of the dependent variable $\tilde u(t,x)=e^{t}u(t,x)$. This equation is interesting as an example of a simple model of nonlinear interaction of long wave pumping with short wave dissipation. Such types of models were considered earlier in [13] and [19]. By the change of dependent and independent variables $u(t,x)=\beta v\big(\beta^{2}t,\beta x-\alpha\beta e^{t}\big)+\alpha$, where $\alpha=\frac12(a_++a_-)$, $\beta=\frac12(a_+-a_-)$ if $a_+\ne a_-$ and $\beta=1$ otherwise, we can reduce problem (1) to the following three different cases: 1) $a_\pm=\pm1$, 2) $a_\pm=\mp1$ and 3) $a_\pm=0$. We hope that the methods of investigation of large time asymptotic behavior of solutions to (1) developed in the present paper can be applied also to the Kuramoto and Sivashinskiy equation $u_t+uu_x+u_{xx}+u_{xxxx}=0$.
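The passage from the Burgers equation with pumping to Eq. (1) is a one-line computation. In the sketch below $\tilde u$ denotes the solution of the equation with pumping; the tilde is notation chosen here to keep the two unknowns apart.

```latex
\begin{align*}
\tilde u &= e^{t}u
  \;\Longrightarrow\;
  \tilde u_t = e^{t}(u_t+u),\qquad
  \tilde u\,\tilde u_x = e^{2t}uu_x,\qquad
  \tilde u_{xx} = e^{t}u_{xx},\\
0 &= \tilde u_t+\tilde u\,\tilde u_x-\tilde u-\tilde u_{xx}
   = e^{t}\bigl(u_t+u+e^{t}uu_x-u-u_{xx}\bigr)
   = e^{t}\bigl(u_t+e^{t}uu_x-u_{xx}\bigr).
\end{align*}
```

The linear pumping term is absorbed by the exponential factor, at the price of a nonlinearity whose coefficient $e^{t}$ grows in time.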


Now let us mention known results about the large time asymptotic behavior of solutions for some problems similar to (1). Solutions of the Burgers equation $u_t+uu_x-u_{xx}=0$ are found by the Hopf-Cole substitution $u=-2\partial_x\log\phi$, which reduces it to the linear heat equation $\phi_t-\phi_{xx}=0$ with initial data $\phi(0,x)=\phi_0(x)\equiv\exp\big(-\frac12\int_0^xu_0(y)\,dy\big)$. Therefore, analyzing the explicit representation of solutions, we can find the large time asymptotics $u=At^{-1/2}e^{-x^{2}/4t}+O\big(t^{-1/2-\gamma}\big)$ for $t\to\infty$ uniformly with respect to $x\in\mathbb{R}$, where $\gamma>0$, in the case of initial data $u_0(x)$ which tend to zero sufficiently rapidly as $x\to\pm\infty$. If the initial data $u_0(x)$ have the form of a step, $u_0(x)\to a_\pm=\mp1$ as $x\to\pm\infty$, then the solution of the Burgers equation tends for large time to the stationary solution $\varphi_0(x)=-\tanh x$ (the so-called shock wave). And if $u_0(x)\to a_\pm=\pm1$, then the solution of the Burgers equation converges as $t\to\infty$ to the rarefaction wave.

In paper [15] it is shown that the solutions of the one-dimensional barotropic model system for compressible viscous gas tend toward the centered rarefaction waves, provided that the initial data are suitably close to the rarefaction wave at the initial time. In paper [16] the authors prove results on the existence, regularity, and asymptotic behavior of solutions of the initial-value problem for the Burgers type equation $u_t+f(u)_x=\mu(|u_x|^{p}u_x)_x$ on $\mathbb{R}\times\mathbb{R}_+$, where $p>0$ and the initial data $u(x,0)=u_0(x)$ satisfy $u_0(x)\to u_\pm$ as $x\to\pm\infty$, under the assumption $u_-\le u_+$. Using a priori energy estimates, they proved that a solution $u(t,x)$ tends as $t\to\infty$, uniformly in $x\in\mathbb{R}$, to the rarefaction wave $\varphi(t,x)$, which is the entropy solution of the inviscid equation (with $\mu=0$) with initial data $\varphi(0,x)=u_-$ if $x<0$, and $\varphi(0,x)=u_+$ if $x>0$.

In paper [17] the asymptotic stability of traveling wave solutions with shock profile was considered for scalar viscous conservation laws $u_t+f(u)_x=\mu u_{xx}$ with initial data $u_0$ which tend to constant states $u_\pm$ as $x\to\pm\infty$. By applying an elementary weighted energy method to the integrated equation of the original one, the stability results were obtained in the case of $f$ nonconvex and when the shock speed $s=f'(u_\pm)$. Moreover, the rate of asymptotics in time is investigated. For the case $f'(u_+)<s<f'(u_-)$, if the integral of the initial disturbances over $(-\infty,x)$ is small and decays at an algebraic rate as $|x|\to\infty$, then the solution approaches a traveling wave at the corresponding rate as $t\to\infty$.

Paper [14] considered the large time asymptotic behavior of solutions of the initial-boundary value problem for the generalized Burgers equation $u_t+f(u)_x=u_{xx}$ on the half-line with the conditions $u(t,0)=u_-$, $u(t,\infty)=u_+$, where the corresponding Cauchy problem admits a rarefaction wave as an asymptotic state. Because of the Dirichlet boundary condition, the asymptotic states in this problem are divided into five cases depending on the signs of the characteristic speeds $f'(u_\pm)$ of the boundary state $u_-=u(0)$ and the far field state $u_+=u(\infty)$. In the case $f'(u_-)<0<f'(u_+)$ the solution behaves as a sum of a viscous shock wave as boundary layer and a rarefaction wave propagating away from the boundary.

Large time behavior of positive solutions for the nonlinear heat equation $u_t-\Delta u=-u^{1+\sigma}$, $\sigma>0$, was studied extensively (see paper [10] for the super critical case $\sigma>\frac2n$, [5] for the critical case $\sigma=\frac2n$ and papers [2, 3, 6, 11] for the sub critical case $\sigma\in\big(0,\frac2n\big)$). In paper [6] for the sub critical case $\sigma\in\big(0,\frac2n\big)$ it was proved that if the initial data are nonnegative, $u_0\ge0$, $u_0\in L^{1}$, and decay slowly at infinity such that $\lim_{x\to\pm\infty}|x|^{2/\sigma}u_0(x)=+\infty$, then the solution has the asymptotic representation $u(t,x)=t^{-1/\sigma}\sigma^{-1/\sigma}+o\big(t^{-1/\sigma}\big)$ as $t\to\infty$ uniformly in domains $\{x\in\mathbb{R}^{n};\ |x|\le C\sqrt t\}$ with any $C>0$. On the other hand, papers [2, 3] considered initial data $u_0\in L^{1}$, $u_0\ne0$ such that $\lim_{|x|\to\infty}|x|^{2/\sigma}u_0(x)=\kappa\ge0$; it was then shown that the main term of the asymptotics of solutions has a self-similar character, $u(t,x)=t^{-1/\sigma}w_\kappa\big(\frac{x}{\sqrt t}\big)+o\big(t^{-1/\sigma}\big)$ as $t\to\infty$ uniformly with respect to $x\in\mathbb{R}^{n}$, where $w_\kappa(\xi)$ is a positive solution of the equation $-\Delta w-\frac12\xi\nabla w+w^{1+\sigma}=\frac1\sigma w$ such that $\lim_{|\xi|\to\infty}|\xi|^{2/\sigma}w_\kappa(\xi)=\kappa$. Note that there was no restriction on the size of the initial data in these papers. However, the methods of these papers are not applicable to the Cauchy problem (1).

Note that the nonlinearity in Eq. (1) is sub critical for large time asymptotic behavior (due to the coefficient $e^{t}$ growing with time), so we cannot find the large time asymptotic expansion of solutions by the usual methods, which treat the nonlinearity as a small perturbation of the linear theory. One should expect that the large time asymptotics differs essentially from the corresponding linearized case. In the case $a_\pm=0$ there are global existence results (see [18]) and smoothing properties for solutions, and as far as we know the large time asymptotic behavior of solutions is an open question.

We organize the rest of our paper as follows. In Sect. 2 we will show that if the initial data are monotonically increasing and have small higher order derivatives, then solutions tend to the rarefaction wave as $t\to\infty$. In Sect. 3 we consider the case of the shock wave $a_+<a_-$ and we will show that solutions tend as $t\to\infty$ to the self-similar solution $-\tanh(xe^{t})$. The most difficult and intriguing case $a_+=a_-=0$ is treated in Sect. 4. Finally we would like to make a comment that the results of the present paper could be extended to a more general nonlinearity $e^{t}u^{m}u_x$ with $m\ge1$, odd, instead of $e^{t}uu_x$ in Eq. (1). The nonconvex case of even $m\ge1$ remains an open problem.
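The Hopf-Cole substitution recalled in the introduction can be tested on an explicit heat solution. The sketch below takes φ(t, x) = 1 + exp(kx + k²t), which satisfies φ_t = φ_xx, forms u = -2 φ_x / φ, and measures the residual of the Burgers equation u_t + u u_x - u_xx = 0 by central finite differences. The particular heat solution, the step size and the function names are choices of this example.

```python
import math


def hopf_cole_u(t, x, k=1.0):
    """u = -2 * phi_x / phi for the heat solution phi = 1 + exp(k x + k^2 t)."""
    e = math.exp(k * x + k * k * t)
    return -2.0 * k * e / (1.0 + e)


def burgers_residual(t, x, k=1.0, d=1e-3):
    """Central-difference approximation of u_t + u u_x - u_xx at (t, x)."""
    u0 = hopf_cole_u(t, x, k)
    u_t = (hopf_cole_u(t + d, x, k) - hopf_cole_u(t - d, x, k)) / (2.0 * d)
    u_x = (hopf_cole_u(t, x + d, k) - hopf_cole_u(t, x - d, k)) / (2.0 * d)
    u_xx = (hopf_cole_u(t, x + d, k) - 2.0 * u0 + hopf_cole_u(t, x - d, k)) / (d * d)
    return u_t + u0 * u_x - u_xx


def max_residual(k=1.0):
    """Largest residual over a small grid of sample points."""
    points = [(t, 0.5 * j) for t in (0.0, 0.5, 1.0) for j in range(-10, 11)]
    return max(abs(burgers_residual(t, x, k)) for t, x in points)
```

The residual is of the size of the finite-difference truncation error, so the substitution indeed maps this heat solution to a (travelling wave) solution of the Burgers equation.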

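Denote the usual Lebesgue space $L^{p}=\{\phi\in\mathcal{S}';\ \|\phi\|_p<\infty\}$, where the norm $\|\phi\|_p=\big(\int_{\mathbb{R}}|\phi(x)|^{p}\,dx\big)^{1/p}$ if $1\le p<\infty$ and $\|\phi\|_\infty=\operatorname{ess\,sup}_{x\in\mathbb{R}}|\phi(x)|$ if $p=\infty$. For simplicity we write $\|\cdot\|=\|\cdot\|_2=\|\cdot\|_{L^{2}}$. The weighted Sobolev space is $H^{m,k}=\{\phi\in\mathcal{S}':\ \|\phi\|_{m,k}\equiv\|\langle x\rangle^{k}\langle i\partial\rangle^{m}\phi\|<\infty\}$, $m,k\in\mathbb{R}$, $\langle x\rangle=\sqrt{1+x^{2}}$. Different positive constants we denote by the same letter $C$.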

2. Rarefaction Wave

In this section we consider the case of the rarefaction wave $a_\pm=\pm1$. Consider the Cauchy problem for the Hopf equation
$$\begin{cases}\varphi_t+e^{t}\varphi\varphi_x=0,&x\in\mathbb{R},\ t>0,\\ \varphi(0,x)=\varphi_0(x),&x\in\mathbb{R},\end{cases} \tag{2}$$
where the initial data $\varphi_0(x)\in C^{2}(\mathbb{R})$ are monotonically increasing, $0<\varphi_0'(x)<C$ for all $x\in\mathbb{R}$, and $\varphi_0(x)\to\pm1$ as $x\to\pm\infty$ and $\varphi_0(0)=0$. The solution to problem (2) is given by $\varphi(t,y(t,\xi))=\varphi_0(\xi)$, where the characteristics are $y(t,\xi)=\xi+\varphi_0(\xi)(e^{t}-1)$ for $\xi\in\mathbb{R}$, $t>0$. Note that $\varphi(t,0)=0$,
$$\varphi_x(t,x)=\frac{\varphi_0'(\xi)}{1+\varphi_0'(\xi)(e^{t}-1)}>0$$


for all x ∈ R, t > 0 and

2 ϕ0 (ξ ) dξ

ϕx (t) = 2

≤

2

1 + ϕ0 (ξ ) (et − 1)

ϕ0 (ξ ) dξ sup

ϕ0 (ξ )

2 1 + ϕ0 (ξ ) (et − 1) ϕ0 (ξ ) = ϕ0 (+∞) − ϕ0 (−∞) sup 2 ξ ∈R 1 + ϕ (ξ ) (et − 1) 0 ξ ∈R

≤ Ce−t

for all $t>0$. By a direct calculation we have
$$\varphi_{xx}(t)=\varphi_0''\big(1+\varphi_0'(e^{t}-1)\big)^{-3},$$
and we assume also that
$$\int_t^\infty\|\varphi_{xx}(\tau)\|_\infty\,d\tau=\int_t^\infty\Big\|\varphi_0''\big(1+\varphi_0'(e^{\tau}-1)\big)^{-3}\Big\|_\infty d\tau\to0$$
as $t\to\infty$. First we give a sufficiently general result about the convergence as $t\to\infty$ of solutions $u(t,x)$ of problem (1) to the rarefaction wave $\varphi(t,x)$.

Theorem 2.1. Let $u_0-\varphi_0\in L^{2}$. Then $u(t,x)\to\varphi(t,x)$ as $t\to\infty$ uniformly with respect to $x\in\mathbb{R}$.

Proof. For the difference $w=u-\varphi$ we get the Cauchy problem
$$\begin{cases}w_t+e^{t}ww_x+e^{t}(\varphi w)_x-w_{xx}-\varphi_{xx}=0,&x\in\mathbb{R},\ t>0,\\ w(0,x)=w_0(x),&x\in\mathbb{R},\end{cases} \tag{3}$$
where $w_0(x)=u_0(x)-\varphi_0(x)\in L^{2}$. By the methods of book [18] we can easily prove the existence of a unique solution $w(t,x)\in C^{\infty}\big((0,\infty);H^{\infty,0}\big)\cap C\big([0,\infty);L^{2}\big)$ to the Cauchy problem (3). We now derive energy type a priori estimates for the solution $w$ (multiplying Eq. (3) by $w$ and integrating with respect to $x$ over $\mathbb{R}$):
$$\frac{d}{dt}\|w\|^{2}+e^{t}\int\varphi_xw^{2}\,dx+2\|w_x\|^{2}+2\int w_x\varphi_x\,dx=0,$$
whence by the Cauchy inequality and the estimate $\|\varphi_x(t)\|^{2}\le Ce^{-t}$ we have
$$\frac{d}{dt}\|w\|^{2}+e^{t}\int\varphi_xw^{2}\,dx+\|w_x\|^{2}\le Ce^{-t}.$$
Integration with respect to time $t>0$ yields
$$\|w(t)\|^{2}+\int_0^td\tau\Big(e^{\tau}\int\varphi_xw^{2}(\tau,x)\,dx+\|w_x(\tau)\|^{2}\Big)\le C. \tag{4}$$
Estimate (4) shows that $\|w(t)\|\le C$ for all $t>0$, and via the inequalities $\|w\|_\infty^{4}\le\|w\|^{2}\|w_x\|^{2}\le C\|w_x\|^{2}$ we obtain $\int_0^\infty\|w(t)\|_\infty^{4}\,dt<C$. Therefore $\|w(t_k)\|_\infty\to0$


for some sequence $t_k\to\infty$. In order to prove that $\|w(t)\|_\infty\to0$ as $t\to\infty$, let us estimate $m(t)=\inf_{x\in\mathbb{R}}w(t,x)$ and $M(t)=\sup_{x\in\mathbb{R}}w(t,x)$. Since $w\in C\big((0,\infty);H^{1,0}\big)$ we see that $\lim_{|x|\to\infty}w(t,x)=0$, hence we have $m(t)\le0$ and $M(t)\ge0$ for all $t\in(0,\infty)$. By the method of paper [1] we prove the following result.

Lemma 2.2. Let $w\in C^{1}\big((T_1,T_2);H^{1,0}\big)$ and $m(t)<0$ for all $t\in(T_1,T_2)$. Then there exists a point $\zeta(t)\in\mathbb{R}$ such that $m(t)=w(t,\zeta(t))$; moreover $m'(t)=w_t(t,\zeta(t))$ almost everywhere on $(T_1,T_2)$.

Proof. Since $m(t)=\inf_{x\in\mathbb{R}}w(t,x)<0$ and $w(t,x)\to0$ as $|x|\to\infty$, we see that there exists a point $\zeta(t)\in\mathbb{R}$ such that $m(t)=w(t,\zeta(t))$ for all $t\in(T_1,T_2)$. By virtue of the estimate
$$m(s)-m(t)\le w(s,\zeta(t))-w(t,\zeta(t))\le\|w(s)-w(t)\|_\infty\le\Big|\int_t^s\|w_\tau(\tau)\|_\infty\,d\tau\Big|\le|t-s|\sup_{\tau\in(s,t)}\|w_\tau(\tau)\|_{1,0}$$
for all $s,t\in(T_1,T_2)$ we see that $m(t)$ has a bounded variation on $(T_1,T_2)$ and hence is almost everywhere differentiable on $(T_1,T_2)$ (see [12]). Then we have
$$m'(t)=\lim_{s\to t+0}\frac{m(s)-m(t)}{s-t}\le\lim_{s\to t+0}\frac{w(s,\zeta(t))-w(t,\zeta(t))}{s-t}=w_t(t,\zeta(t))$$
and
$$m'(t)=\lim_{s\to t-0}\frac{m(t)-m(s)}{t-s}\ge\lim_{s\to t-0}\frac{w(t,\zeta(t))-w(s,\zeta(t))}{t-s}=w_t(t,\zeta(t))$$
almost everywhere on $(T_1,T_2)$, since by the Sobolev embedding theorem $w_t(t,x)=\lim_{s\to t}\frac{1}{t-s}\big(w(t,x)-w(s,x)\big)$ uniformly with respect to $x\in\mathbb{R}$. Therefore $m'(t)=w_t(t,\zeta(t))$ almost everywhere on $(T_1,T_2)$. Lemma 2.2 is proved.

We now prove that $m(t)\to0$ as $t\to\infty$. Recall that $\|w(t_k)\|_\infty\to0$ for some sequence $t_k\to\infty$. Consider a time interval $T_2>T_1\ge t_k$ such that $m(t)<0$ for all $t\in(T_1,T_2)$. By virtue of Lemma 2.2 we get from Eq. (3)
$$m'+e^{t}\varphi_xm-w_{xx}(t,\zeta(t))-\varphi_{xx}(t,\zeta(t))=0$$
for almost all $t\in(T_1,T_2)$, where we have used the fact that $w_x(t,\zeta(t))=0$. Whence, integrating with respect to $t\in(T_1,T_2)$ and using $w_{xx}(t,\zeta(t))\ge0$ and $-\varphi_xm(t)\ge0$, we have
$$m(t)\ge m(T_1)+\int_{T_1}^{t}\varphi_{xx}(\tau,\zeta(\tau))\,d\tau.$$
Since $m(T_1)=0$ or $m(T_1)=m(t_k)$ and $m(t_k)\to0$ as $t_k\to\infty$, we have $m(T_1)\to0$ as $T_1\to\infty$. Also by our assumption $\big|\int_{T_1}^{t}\varphi_{xx}(\tau,\zeta(\tau))\,d\tau\big|\le\int_{T_1}^{\infty}\|\varphi_{xx}(t)\|_\infty\,dt=o(1)$ as $T_1\to\infty$. Therefore $m(t)\to0$ as $t\to\infty$. Similarly we prove that $M(t)=\sup_{x\in\mathbb{R}}w(t,x)\to0$ as $t\to\infty$. Hence $\|w(t)\|_\infty\to0$ as $t\to\infty$. Theorem 2.1 is proved.

Now we suppose some more conditions to be fulfilled for the initial data $u_0(x)$ and compute more precisely the asymptotic behavior of the solution $u(t,x)$ to problem (1). We


assume that the initial data $u_0(x)$ monotonically increase and are slowly varying, so that the higher order derivatives are less than the first one, such that
$$\sum_{k=1}^{3}\int\Big(\Big(\frac{1}{u_0'(x)}\frac{d}{dx}\Big)^{k}u_0'(x)\Big)^{2}\big(u_0'(x)\big)^{k-\frac32}\,dx<\varepsilon, \tag{5}$$
where $\varepsilon>0$ is sufficiently small. For example, we can take $u_0'(x)=\varepsilon C_0\langle\varepsilon x\rangle^{-n}$ with $n<2$, where $C_0>0$ can be chosen independently of $\varepsilon>0$, such that $u_0(x)\to a_\pm=\pm1$ as $x\to\pm\infty$. By Theorem 2.1 we know that solutions of (1) are similar to those of the Hopf equation (2). Therefore the nonlinearity in Eq. (1) grows with time more rapidly than the term with the second derivative, hence the large time behavior of solutions should be determined by the first two terms in Eq. (1). So we try to solve Eq. (1) by the method of characteristics. We define characteristics $y(t,\xi)$ as solutions to the following Cauchy problem:

$$\begin{cases}y_t=e^{t}u(t,y)-\dfrac{u_{yy}}{u_y},&t>0,\ \xi\in\mathbb{R},\\ y(0,\xi)=\xi,&\xi\in\mathbb{R}.\end{cases}$$
Then from Eq. (1) we get a simple equation
$$w_t(t,\xi)=u_t+u_yy_t=u_t+u_y\Big(e^{t}u-\frac{u_{yy}}{u_y}\Big)=u_t-u_{yy}+e^{t}uu_y=0$$
for the new dependent variable $w(t,\xi)=u(t,y(t,\xi))$. Hence $w(t,\xi)=u_0(\xi)$ for all $t>0$, $\xi\in\mathbb{R}$. By a straightforward calculation we have
$$\partial_yu=\frac{u_0'(\xi)}{y_\xi(t,\xi)},\qquad\partial_y^{2}u=\frac{u_0''(\xi)}{y_\xi^{2}(t,\xi)}-\frac{u_0'(\xi)\,y_{\xi\xi}(t,\xi)}{y_\xi^{3}(t,\xi)},$$
whence
$$y_t=e^{t}u_0(\xi)-\frac{1}{u_0'(\xi)}\partial_\xi\frac{u_0'(\xi)}{y_\xi(t,\xi)}.$$
We now change the independent variable $\eta=u_0(\xi)$; then the real axis $\xi\in\mathbb{R}$ is transformed one-to-one to the segment $(-1,1)$ by the assumption on $u_0(\xi)$. Denote $m(\eta)=u_0'(\xi)$, $\tau=e^{t}-1\ge0$ and $Z(\tau,\eta)=u_y(t,y)=\dfrac{m(\eta)}{y_\xi(t,\xi)}$. Then for $Z(\tau,\eta)$ we get the following initial-boundary value problem:
$$\begin{cases}Z_\tau=-Z^{2}(1+A),&\tau>0,\ \eta\in(-1,1),\\ Z(0,\eta)=m(\eta),&\eta\in(-1,1),\\ \partial_\eta^{k}Z\big|_{\eta=\pm1}=0,&\tau>0,\ k\ge1,\end{cases} \tag{6}$$
where $A(\tau,\eta)=-(1+\tau)^{-1}\partial_\eta^{2}Z(\tau,\eta)$. In what follows we only use the cases $k=1,2,3$. From the existence of the unique solution $u(t,x)$ to problem (1) it follows that there exists a unique global solution $Z(\tau,\eta)\in C\big([0,\infty),C^{2}((-1,1))\big)\cap C^{1}\big((0,\infty),C((-1,1))\big)$


to the initial-boundary problem (6). Integrating Eq. (6) with respect to time $\tau>0$ we get the representation
$$Z(\tau,\eta)=m(\eta)\Big(1+m(\eta)\Big(\tau+\int_0^\tau A(\tau',\eta)\,d\tau'\Big)\Big)^{-1}. \tag{7}$$
We prove the following result.

Theorem 2.3. Let condition (5) for the initial data $u_0(x)$ be fulfilled with sufficiently small $\varepsilon>0$. Then the estimate
$$\sup_{\eta\in(-1,1)}\Big|\int_0^\tau A(\tau',\eta)\,d\tau'\Big|\le C\sqrt\varepsilon\log(1+\tau)$$
is true for all $\tau\ge0$.

By virtue of representation (7) and the estimate of Theorem 2.3 we get the asymptotics
$$Z(\tau,\eta)=\frac{m(\eta)}{1+m(\eta)\tau\big(1+O(\tau^{-1}\log\tau)\big)}\quad\text{as }\tau\to\infty,$$
whence
$$u_y(t,y(t,\xi))=\frac{u_0'(\xi)}{1+u_0'(\xi)e^{t}\big(1+O(te^{-t})\big)}, \tag{8}$$
where
$$y(t,\xi)=\xi+u_0(\xi)(e^{t}-1)-\frac{1}{u_0'(\xi)}\partial_\xi\int_0^{t}u_y(t',y(t',\xi))\,dt'. \tag{9}$$

Thus we see that the solution $u(t,x)$ to problem (1) asymptotically behaves as a solution of the Hopf equation (2). Note that via formulas (8) and (9) we can obtain a higher-order asymptotic expansion of the solution $u(t,x)$.

Proof. We make a regularization of problem (6). We define $Y$ as a solution of the following problem:
$$\begin{cases}Y_\tau=-Y^{2}+aY^{2}Y_{\eta\eta},&\tau>0,\ \eta\in(-1,1),\\ Y(0,\eta)=m(\eta)+\delta,&\eta\in(-1,1),\\ \partial_\eta^{k}Y\big|_{\eta=\pm1}=0,&\tau>0,\ k\ge1,\end{cases} \tag{10}$$
where $a=(1+\tau)^{-1}$, $\delta>0$. By the methods of the book [18] it is easy to prove that there exists a solution $Y(\tau,\eta)\in C\big([0,\infty),C^{2}((-1,1))\big)\cap C^{1}\big((0,\infty),C((-1,1))\big)$ to problem (10) such that $Y(\tau,\eta)\ge(m(\eta)+\delta)\big(1+2(m(\eta)+\delta)\tau\big)^{-1}>0$ for all $\tau\ge0$, $\eta\in(-1,1)$; therefore the integrals
$$J_k(\tau)=\int\big(\partial_\eta^{k}Y(\tau,\eta)\big)^{2}\big(Y(\tau,\eta)\big)^{k-\frac52}\,d\eta,\qquad k=1,2,3,$$
are convergent for all $\tau\ge0$. We now prove that
$$J_k(\tau)<\varepsilon \tag{11}$$

294

N. Hayashi, P.I. Naumkin

for all τ ≥ 0, k = 1, 2, 3. For τ = 0 estimate (11) follows from (5). By the contradiction in view of the continuity of Jk (τ ) , we suppose that there exists a time T > 0 such that Jk (T ) = ε and Jk (τ ) ≤ ε

(12)

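for all τ ∈ [0, T]. By the Cauchy inequality we have

$$\begin{aligned} \left\| Y_\eta^2(\tau) Y^{-1}(\tau) \right\|_\infty &\le \int \left| \partial_\eta \left( Y_\eta^2(\tau, \eta) Y^{-1}(\tau, \eta) \right) \right| d\eta \\ &\le 2 \int \left| Y_{\eta\eta}(\tau, \eta) Y_\eta(\tau, \eta) Y^{-1}(\tau, \eta) \right| d\eta + \int \left| Y_\eta(\tau, \eta) \right|^3 Y^{-2}(\tau, \eta)\, d\eta \\ &\le 2 (J_1 J_2)^{\frac{1}{2}} + J_1 \left\| Y_\eta^2(\tau) Y^{-1}(\tau) \right\|_\infty^{\frac{1}{2}}, \end{aligned}$$

whence by virtue of (12) we obtain

$$\left\| Y_\eta^2(\tau) Y^{-1}(\tau) \right\|_\infty \le C\varepsilon \tag{13}$$

for all τ ∈ [0, T]. In the same manner

$$\left\| Y_{\eta\eta}(\tau) \right\|_\infty^2 \le 2 \int \left| Y_{\eta\eta}(\tau, \eta) Y_{\eta\eta\eta}(\tau, \eta) \right| d\eta \le 2 (J_2 J_3)^{\frac{1}{2}} \le 2\varepsilon. \tag{14}$$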

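Now differentiating $J_k(\tau)$ with respect to τ, using Eq. (10) and integrating by parts we obtain

$$\frac{dJ_1}{d\tau} = \int \left( -\frac{5}{2} Y^{-\frac{1}{2}} Y_\eta^2 - 2a Y^{\frac{1}{2}} Y_{\eta\eta}^2 + \frac{a}{4} Y^{-\frac{3}{2}} Y_\eta^4 \right) d\eta,$$

$$\frac{dJ_2}{d\tau} = \int \left( -\frac{2}{3} Y^{-\frac{3}{2}} Y_\eta^4 - \frac{7}{2} Y^{\frac{1}{2}} Y_{\eta\eta}^2 - 2a Y^{\frac{3}{2}} Y_{\eta\eta\eta}^2 + \frac{11}{4} a Y^{-\frac{1}{2}} Y_\eta^2 Y_{\eta\eta}^2 + a Y^{\frac{1}{2}} Y_{\eta\eta}^3 \right) d\eta,$$

and

$$\begin{aligned} \frac{dJ_3}{d\tau} = \int \Big( &-\frac{9}{2} Y^{\frac{3}{2}} Y_{\eta\eta\eta}^2 - 2a Y^{\frac{5}{2}} Y_{\eta\eta\eta\eta}^2 - 4a Y^{\frac{1}{2}} Y_{\eta\eta}^4 + 3 Y^{-\frac{1}{2}} Y_\eta^2 Y_{\eta\eta}^2 + 6 Y^{\frac{1}{2}} Y_{\eta\eta}^3 \\ &- 2a Y^{-\frac{1}{2}} Y_\eta^2 Y_{\eta\eta}^3 + \frac{27}{4} a Y^{\frac{1}{2}} Y_\eta^2 Y_{\eta\eta\eta}^2 + 13 a Y^{\frac{3}{2}} Y_{\eta\eta} Y_{\eta\eta\eta}^2 \Big)\, d\eta. \end{aligned}$$

Therefore we have

$$\frac{d}{d\tau}(J_1 + J_2 + J_3) \le -I_1 + I_2,$$

where

$$I_1 = \int \left( Y^{-\frac{1}{2}} Y_\eta^2 + Y^{\frac{1}{2}} Y_{\eta\eta}^2 + Y^{\frac{3}{2}} Y_{\eta\eta\eta}^2 \right) d\eta \ge 0$$

and

$$\begin{aligned} I_2 = \int \Big( &3 Y^{-\frac{1}{2}} Y_\eta^2 Y_{\eta\eta}^2 + 6 Y^{\frac{1}{2}} Y_{\eta\eta}^3 + \frac{a}{4} Y^{-\frac{3}{2}} Y_\eta^4 + \frac{11}{4} a Y^{-\frac{1}{2}} Y_\eta^2 Y_{\eta\eta}^2 + a Y^{\frac{1}{2}} Y_{\eta\eta}^3 \\ &- 2a Y^{-\frac{1}{2}} Y_\eta^2 Y_{\eta\eta}^3 + \frac{27}{4} a Y^{\frac{1}{2}} Y_\eta^2 Y_{\eta\eta\eta}^2 + 13 a Y^{\frac{3}{2}} Y_{\eta\eta} Y_{\eta\eta\eta}^2 \Big)\, d\eta. \end{aligned}$$

By estimates (12), (13), (14) we have

$$|I_2| \le C \left( \left\| Y_\eta^2 Y^{-1} \right\|_\infty + \left\| Y_{\eta\eta} \right\|_\infty + \left\| Y_{\eta\eta} \right\|_\infty^2 + \left\| Y_{\eta\eta} \right\|_\infty^3 \right) I_1 \le C\varepsilon I_1,$$

therefore we get $\frac{d}{d\tau}(J_1 + J_2 + J_3) \le 0$ if ε > 0 is sufficiently small, hence estimate (11) is true for all τ ∈ [0, T]. The contradiction obtained proves estimate (11) for all τ ≥ 0. Now taking the limit δ → 0 we see that estimates (11) are valid for the function Z. Now by estimates (11) and (14) we get $\sup_{\eta \in (-1, 1)} \left| \int_0^\tau A(\tau', \eta)\, d\tau' \right| \le C \sqrt{\varepsilon}\, \log(\tau + 1)$ for all τ ≥ 0. Theorem 2.3 is proved.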

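3. Shock Wave

We now consider another type of boundary condition, $a_\pm = \mp 1$, which corresponds to the shock wave solutions. We introduce the approximate solution

$$V(t, y) = \sum_{k=0}^{n} e^{-2kt} \varphi_k(y),$$

where $y = x e^t$, n ≥ 2, and the functions $\varphi_k(y)$, k ≥ 0, are defined recurrently via the equations $\varphi_0 \varphi_0' - \varphi_0'' = 0$ with boundary conditions $\varphi_0(y) \to \mp 1$ as y → ±∞, whence $\varphi_0(y) = -\tanh \frac{y}{2}$, and

$$-2(k - 1)\varphi_{k-1} + y \varphi_{k-1}' + \frac{1}{2} \partial_y \sum_{l=0}^{k} \varphi_{k-l} \varphi_l - \varphi_k'' = 0 \tag{15}$$

with boundary conditions $\varphi_k(y) \to 0$ for y → ±∞, k ≥ 1, whence integrating the identity with respect to y over (−∞, y) we obtain

$$\varphi_k' = \varphi_k \varphi_0 + \frac{1}{2} \sum_{l=1}^{k-1} \varphi_{k-l} \varphi_l - 2(k - 1) \int_{-\infty}^{y} \varphi_{k-1}(\tilde\eta)\, d\tilde\eta + \int_{-\infty}^{y} \tilde\eta\, \varphi_{k-1}'(\tilde\eta)\, d\tilde\eta.$$

Multiplying both sides of the above by $\cosh^2 \frac{y}{2}$ and integrating the resulting equation with respect to y over (0, y) again we have

$$\varphi_k(y) = \frac{1}{\cosh^2 \frac{y}{2}} \int_0^y \cosh^2 \frac{\eta}{2} \left( -2(k - 1) \int_{-\infty}^{\eta} \varphi_{k-1}(\tilde\eta)\, d\tilde\eta + \int_{-\infty}^{\eta} \tilde\eta\, \varphi_{k-1}'(\tilde\eta)\, d\tilde\eta + \frac{1}{2} \sum_{l=1}^{k-1} \varphi_{k-l}(\eta) \varphi_l(\eta) \right) d\eta.$$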
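The leading profile can be checked directly. The sketch below is a numerical illustration only (the grid, the step size h and the function names are assumptions made here): it evaluates the residual $\varphi_0 \varphi_0' - \varphi_0''$ for $\varphi_0(y) = -\tanh \frac{y}{2}$ with central differences and confirms the limits ∓1 as y → ±∞.

```python
import math


def phi0(y):
    """Leading shock profile phi_0(y) = -tanh(y / 2)."""
    return -math.tanh(0.5 * y)


def residual(y, h=1e-3):
    """Central-difference approximation of phi0 * phi0' - phi0'' at the point y."""
    d1 = (phi0(y + h) - phi0(y - h)) / (2.0 * h)
    d2 = (phi0(y + h) - 2.0 * phi0(y) + phi0(y - h)) / (h * h)
    return phi0(y) * d1 - d2


def max_residual(points=None):
    """Largest residual on a sample grid (default: y in [-6, 6], step 0.25)."""
    if points is None:
        points = [-6.0 + 0.25 * i for i in range(49)]
    return max(abs(residual(y)) for y in points)


if __name__ == "__main__":
    print("max residual:", max_residual())
    print("phi0(-50), phi0(50):", phi0(-50.0), phi0(50.0))
```

The residual is at the level of the finite-difference error, as expected for an exact solution of the profile equation.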


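We find that $\varphi_k(y)$ is an odd function for any k ≥ 0. Indeed, if $\varphi_{k-1}$ is an odd function, then

$$\int_{-\eta}^{\eta} \varphi_{k-1}(\tilde\eta)\, d\tilde\eta = \int_{-\eta}^{\eta} \tilde\eta\, \varphi_{k-1}'(\tilde\eta)\, d\tilde\eta = 0$$

and $\varphi_{k-l}(\eta) \varphi_l(\eta) = \varphi_{k-l}(-\eta) \varphi_l(-\eta)$, which imply that $\varphi_k(y)$ is an odd function. The function V(t, y) is close to the shock wave $\varphi_0(y) = -\tanh \frac{y}{2}$ for large time t → ∞; however, the derivatives with respect to x of V(t, y) are not close to those of $\varphi_0$ as t → ∞. This is the reason why we introduce the higher-order corrections $\varphi_k(y) e^{-2kt}$, k ≥ 1, considering convergence with derivatives of the solution u(t, x) as t → ∞. By virtue of (1) and (15) we find for the difference v(t, x) = u(t, x) − V(t, y),

$$\begin{aligned} &\partial_t v + e^t v v_x + e^t \partial_x (vV) - v_{xx} + V_t + e^t V V_x - V_{xx} \\ &\quad = \partial_t v + e^t v v_x + e^t \partial_x (vV) - v_{xx} + \sum_{k=0}^{n} (-2k) e^{-2kt} \varphi_k + \sum_{k=0}^{n} e^{-2kt} y \varphi_k' \\ &\qquad + e^{2t} \sum_{k=0}^{n} e^{-2kt} \varphi_k \sum_{k=0}^{n} e^{-2kt} \varphi_k' - e^{2t} \sum_{k=0}^{n} e^{-2kt} \varphi_k'' \\ &\quad = \partial_t v + e^t v v_x + e^t \partial_x (vV) - v_{xx} - 2n e^{-2nt} \varphi_n + e^{-2nt} y \varphi_n' + \sum_{\substack{k, l = 1 \\ k + l > n}}^{n} e^{t - 2(k+l)t} \varphi_k \partial_x \varphi_l = 0, \end{aligned}$$

whence integrating with respect to x we get

$$w_t + \frac{1}{2} e^t (w_x)^2 + e^t V w_x - w_{xx} + R = 0, \tag{16}$$

where $w(t, x) = \int_{-\infty}^{x} v(t, x')\, dx'$ and

$$R(t, y) = e^{-2nt - t} \left( y \varphi_n(y) - (2n + 1) \int_{-\infty}^{y} \varphi_n(y')\, dy' \right) + \frac{1}{2} \sum_{\substack{k, l = 1 \\ k + l > n}}^{n} e^{t - 2(k+l)t} \varphi_k(y) \varphi_l(y).$$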

We suppose that the initial data $u_0(x)$ for the problem (1) are close to the approximate solution V(t, y) so that $w(t_0, x) \cosh x \in H^{2,0}$ for some time $t = t_0$, where the initial time $t_0 > 0$ is chosen to be sufficiently large. This means that from the beginning the nonlinear effects dominate the linear ones (we could replace this requirement by considering a large coefficient at the nonlinear term in Eq. (1)). We now prove the following result.

Theorem 3.1. Let the initial time $t_0 > 0$ be sufficiently large and the initial data $u(t_0, x) \in H^{2,0}$ be close to the shock wave $V(t_0, x e^{t_0})$, that is

$$\cosh x \int_{-\infty}^{x} \left( u(t_0, x') - V(t_0, x' e^{t_0}) \right) dx' \in H^{2,0}.$$

Then there exists a unique solution u(t, x) to the Cauchy problem (1) such that $\cosh x \int_{-\infty}^{x} \left( u(t, x') - V(t, x' e^{t}) \right) dx' \in C([t_0, \infty); H^{2,0})$ and the estimate

$$\left\| \cosh x \int_{-\infty}^{x} \left( u(t, x') - V(t, x' e^{t}) \right) dx' \right\|_{2,0} \le C e^{3t - 2nt}$$

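is true for all $t \ge t_0$. Thus we see that the solution u(t, x) of the Cauchy problem (1) tends to the shock wave V(t, y) as t → ∞ uniformly with respect to x ∈ R.

Proof. By virtue of Eq. (16) we have for the function g(t, x) = w(t, x) cosh x,

$$g_t + \frac{e^t}{2 \cosh x} (g_x - g \tanh x)^2 + \chi g + \psi g_x - g_{xx} + R \cosh x = 0, \tag{17}$$

where

$$\chi = \frac{2}{\cosh^2 x} - 1 - e^t V(t, y) \tanh x, \qquad \psi = 2 \tanh x + e^t V(t, y).$$

Denote $g^{(m)} = \partial_x^m g(t, x)$. We differentiate Eq. (17) m times with respect to x, multiply by $g^{(m)}$ and integrate with respect to x over R:

$$\frac{d}{dt} \left\| g^{(m)} \right\|^2 + 2 \int dx\, g^{(m)} \left( \partial_x^m \left( \frac{e^t}{2 \cosh x} (g_x - g \tanh x)^2 \right) + \partial_x^m (\chi g + \psi g_x) + \partial_x^m (R \cosh x) \right) + 2 \left\| g^{(m+1)} \right\|^2 = 0. \tag{18}$$

Denote $I(t) = \sum_{k=0}^{2} e^{4nt - 3kt} \left\| g^{(k)}(t) \right\|^2$ and prove that

$$I(t) < C \tag{19}$$

for all $t > t_0$. Arguing by contradiction, we suppose that there exists $T > t_0$ such that I(T) = C and

$$I(t) \le C \tag{20}$$

for all $t \in [t_0, T]$. By (20) we have $\left\| g^{(k)}(t) \right\| \le C e^{\frac{3}{2} kt - 2nt}$, k = 0, 1, 2, for all $t \in [t_0, T]$. We now estimate the different terms in Eq. (18):

$$\left\| g^{(m)} \partial_x^m (R \cosh x) \right\|_1 \le \left\| g^{(m)} \right\| \left\| \partial_x^m (R \cosh x) \right\| \le e^{-\frac{3}{2} t - 2nt + mt} \left\| g^{(m)} \right\|$$

for m = 0, 1, 2. Further we have $\left\| \partial_x^k \chi \right\|_\infty + \left\| \partial_x^k \psi \right\|_\infty \le C e^{t + kt}$, whence

$$\begin{aligned} \left| \int g^{(m)} \partial_x^m (\chi g + \psi g_x)\, dx \right| &\le C \left\| g^{(m)} \right\| \sum_{k=0}^{m} e^{t + kt} \left\| g^{(m-k)} \right\| + C \left\| g^{(m)} \right\| \sum_{k=1}^{m} e^{t + kt} \left\| g^{(m+1-k)} \right\| \\ &\le C e^{2t} \left\| g^{(m)} \right\|^2 + C e^{2mt} \left\| g^{(1)} \right\|^2 + C e^{2mt} \left\| g^{(0)} \right\|^2 \end{aligned}$$

for m = 1, 2. By (20) we estimate the nonlinearity as follows:

$$\left| \int dx\, g^{(m)} \partial_x^m \left( \frac{e^t}{2 \cosh x} (g_x - g \tanh x)^2 \right) \right| \le \left\| g^{(m+1)} \right\|^2 + C e^{-t - 4nt + 3mt}$$

for m = 0, 1, 2, since n ≥ 2. Thus from (18) we get

$$\frac{d}{dt} \| g_x \|^2 \le - \| g_{xx} \|^2 + C e^{2t} \| g_x \|^2 + C e^{2t} \| g \|^2 + C e^{2t - 4nt}$$

and

$$\frac{d}{dt} \| g_{xx} \|^2 \le C e^{2t} \| g_{xx} \|^2 + C e^{4t} \| g_x \|^2 + C e^{4t} \| g \|^2 + C e^{5t - 4nt}.$$

For the case m = 0 we apply the estimate

$$\begin{aligned} \chi - \frac{1}{2} \psi_x &= \frac{1}{\cosh^2 x} - 1 - e^t V(t, y) \tanh x - \frac{1}{2} e^{2t} V_y(t, y) \\ &= e^t \tanh \frac{y}{2} \tanh x + \frac{e^{2t}}{4 \cosh^2 \frac{y}{2}} + O(1) \ge C \log t \ge 4n + 1 \end{aligned}$$

for all $t \ge t_0$, if $t_0 > 0$ is sufficiently large, whence

$$\int g (\chi g + \psi g_x)\, dx = \int \left( \chi - \frac{1}{2} \psi_x \right) g^2\, dx \ge (4n + 1) \| g \|^2.$$

Then from (18) we get

$$\frac{d}{dt} \| g \|^2 \le -2 \| g_x \|^2 - (4n + 1) \| g \|^2 + C e^{-4nt - t}.$$

Therefore we have for $I(t) = \sum_{k=0}^{2} e^{4nt - 3kt} \left\| g^{(k)}(t) \right\|^2$,

$$\frac{d}{dt} I \le -e^{4nt} \left( 1 - C e^{-t} \right) \left( \| g \|^2 + \| g_x \|^2 + e^{-3t} \| g_{xx} \|^2 \right) + C e^{-t} \le -I + C e^{-t}$$

for all $t \in [t_0, T]$. Therefore

$$I(t) \le e^{-(t - t_0)} I(t_0) + C (t - t_0) e^{-t} < C$$

for all $t \in [t_0, T]$. The contradiction obtained proves estimate (19) for all $t \ge t_0$, hence the result of the theorem is true. Theorem 3.1 is proved.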
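The last step of the proof is a Gronwall-type integration of $I' \le -I + C e^{-t}$. The sketch below illustrates it numerically under assumed sample values of $t_0$, $I(t_0)$ and C (none of them taken from the paper): the equality case reproduces the bound $e^{-(t - t_0)} I(t_0) + C (t - t_0) e^{-t}$ exactly, and a strictly dissipative case, modelled here by a hypothetical extra damping `slack`, stays below it.

```python
import math


def gronwall_bound(t, t0, i0, c):
    """Right-hand side e^{-(t - t0)} I(t0) + C (t - t0) e^{-t} of the final estimate."""
    return math.exp(-(t - t0)) * i0 + c * (t - t0) * math.exp(-t)


def integrate_energy(t0, t_end, i0, c, slack=0.0, steps=20000):
    """RK4 for I' = -(1 + slack) I + C e^{-t}; slack > 0 models a strict inequality."""
    def rhs(t, i):
        return -(1.0 + slack) * i + c * math.exp(-t)

    h = (t_end - t0) / steps
    t, i = t0, i0
    for _ in range(steps):
        k1 = rhs(t, i)
        k2 = rhs(t + 0.5 * h, i + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, i + 0.5 * h * k2)
        k4 = rhs(t + h, i + h * k3)
        i += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return i


if __name__ == "__main__":
    t0, t_end, i0, c = 2.0, 10.0, 1.0, 3.0  # illustrative values only
    print("equality case:", integrate_energy(t0, t_end, i0, c))
    print("bound        :", gronwall_bound(t_end, t0, i0, c))
```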


4. Zero Boundary Conditions

We now consider the most difficult and intriguing type of boundary conditions, $a_\pm = 0$. We know (see book [18]) that there exists a unique solution $u(t, x) \in C([0, \infty); L^2) \cap C^\infty((0, \infty); H^{\infty,0})$ to the Cauchy problem

$$\begin{cases} u_t + e^t u u_x - u_{xx} = 0, & x \in \mathbb{R},\ t > 0, \\ u(0, x) = u_0(x), & x \in \mathbb{R}, \end{cases}$$

if the initial data $u_0 \in L^2$. If the initial datum $u_0(x)$ is an odd function, then the solution u(t, x) remains an odd function for all t > 0 and it can be obtained as an odd prolongation of the solution to the following Dirichlet boundary-value problem

$$\begin{cases} u_t + e^t u u_x - u_{xx} = 0, & x \in (-\infty, 0),\ t > 0, \\ u(t, -\infty) = 0,\ u(t, 0) = 0, & t > 0, \\ u(0, x) = u_0(x), & x \in (-\infty, 0). \end{cases} \tag{21}$$

Define ϕ(t, x) as a rarefaction wave constructed in Sect. 2

ϕt + et ϕϕx − ϕxx = 0, x ∈ R, t > 0, ϕ (0, x) = ϕ0 (x) , x ∈ R, where the initial data ϕ0 (x) are monotonically increasing ϕ0 (x) > 0 for all x ∈ R and ϕ0 (x) → 0 as x → −∞. Now we define r (t, x) as a solution to the Dirichlet boundary value problem   rt + et ϕrrx − rxx + et ϕx r (r − 1) − 2ϕϕx rx = 0, x ∈ (−∞, 0) , t > 0, (22) r (t, −∞) = 1, r (t, 0) = 0, t > 0,  r (0, x) = r0 (x) , x ∈ (−∞, 0) . Then the function u = ϕr satisfy problem (21). For example, we suppose that the initial data ϕ0 (x) decay at infinity as ϕ0 (x) = − x1 + −|x| as x → −∞. The more general case ϕ0 (x) = |x|−α +O e−|x| as x → −∞, O e where α > 0, also can be considered by our method. We use the method of characteristics applied in Sect. 2 (see formulas (7) and (9)) to derive asymptotic expansions for the −1 solution ϕ (x, t). We have ϕ (t, y (t, ξ )) = ϕ0 (ξ ) , ϕy (t, y (t, ξ )) = ϕ0 (ξ ) yξ (t, ξ ) and t t 1 y (t, ξ ) = ξ + ϕ0 (ξ ) e − 1 − ϕy t , y t , ξ dt . ∂ξ ϕ0 (ξ ) 0 Define the curve ξ0 (t) such that y (t, ξ0 (t)) = 0. We easily see that ξ0 (t) → −∞ as t → ∞. Using the estimate of Theorem 2.3 we obtain t t t −1 y (t, ξ ) = ξ − e − 1 η − ∂η η2 1 + η2 et + O 1 + t dt + O e−e 0

for ξ → −∞, η =

→ 0. In the first approximation we write y (t, ξ ) = ξ − et η + t t O (ηt) , hence ξ02 (t) = et + O (t) , and ξ0 (t) = −e 2 + O te− 2 as t → ∞. In the second approximation we take t −1 y (t, ξ ) = ξ − et − 1 η − ∂η η2 1 + η2 et dt + O t 2 e−2t , 1 ξ

0


whence ξ02 (t) = et − 2t + 2 log 2 + O t 2 e−t . Therefore the asymptotic expansions t t 3t ξ0 (t) = −e 2 − (t − log 2) e− 2 + O t 2 e− 2 , t 3t 5t ϕ (t, 0) = e− 2 − (t − log 2) e− 2 + O t 2 e− 2 , ϕx (t, 0) = e−t − 2 (t − log 2) e−2t + O t 2 e−3t are valid for t → ∞. Then by virtue of the Taylor formula ϕ (t, x) = ϕ (t, 0) + xϕx (t, 0) + 21 x 2 ϕxx (t, x ) we get 5t t 3t ϕ (t, x) = e− 2 − (t − y − log 2) e− 2 + O t 2 + y 2 e− 2 , ϕx (t, x) = e−t − (2t − y − 2 log 2) e−2t + O t 2 + y 2 e−3t for t → ∞. Continuing this procedure we obtain the asymptotic expansions t

e 2 ϕ (t, x) =

n

ak (t, y) e−tk + O (t + |y|)n+1 e−tn−t ,

k=0

et ∂x ϕ (t, x) =

n

bk (t, y) e−tk + O (t + |y|)n+1 e−tn−t ,

k=0 t

e2

2ϕx (t, x) = ϕ (t, x)

n

ck (t, y) e−tk + O (t + |y|)n+1 e−tn−t

k=0 t

for t → ∞, where n ≥ 3, y = xe 2 ; ak (t, y) , bk (t, y) , and ck (t, y) are polynomials with respect to t, y of order less than k. Now as in Sect. 3 we construct an approximate solution (t, y) to problem (22) in the form (t, y) = nk=0 φk (t, y) e−tk . Changing the dependent variable r (t, y) = t 2 r (t, x) with y = xe we get 3 y 2ϕx t ry + e 2 t ϕ rt + r ry − e t ryy + et ϕx r ( r − 1) − ry = 0, e 2 2 ϕ

then for the difference w (t, x) = r (t, y) − (t, y) we obtain 1 1 wt − wxx + et (ϕw)x + et ϕw 2 + et ϕx w 2 x 2 2 2ϕx t wx + R = 0, + e ϕx ( − 1) w − ϕ

(23)

where the remainder term R ≡ t +

3 y 2ϕx t y − et yy + e 2 t ϕy + et ϕx ( − 1) − e 2 y 2 ϕ

and the functions φk (t, y) , 0 ≤ k ≤ n, are defined recurrently via equations (which are obtained by comparing terms containing et−tk ) φ0 − a0 φ0 φ0 = 0, φk − a0 (φ0 φk ) = zk , k ≥ 1,

(24)


where zk (t, y) =

j k−1

k−1 ak−j φj −l φl + bk−1−j φj −l φl + a0 φk−l φl

j =0 l=0

−

l=1

k−1 j =0

y bk−1−j φj + ck−1−j φj + ∂t φk−1 − (k − 1) φk−1 − φk−1 2

for k ≥ 1. By the boundary conditions we have φ0 (y) → 1, φk (y) → 0, k ≥ 1 for y → −∞ and φk (0) = 0, k ≥ 0. Integrating Eq. (24) with a0 = 1, we get φ0 (y) = − tanh y2 and y η 1 2 η cosh zk η , t dη dη, for k ≥ 1. φk (t, y) = y 2 2 −∞ cosh 2 0 k We have the estimates |φk (t, y)| ≤ C 1 + t + y 2 e−|y| for k ≥ 1. Therefore for the remainder term R we get R=−

3n−1

e−tk

k+1

min(n,j )

ak+1−j φj −l φl

j =k+1−n l=max(0,j −n)

k=n

min(n,j 3n k ) y −tk ∂t φn − nφn − φn + +e e bk−j φj −l φl 2 k=n j =k−n l=max(0,j −n) min(k,n) 2n n t −tk t −tk y − bk−j φj + ck−j φj + e e 2 ϕ − e ak e −tn

j =k−n

k=n

+ e ϕx − t

n

bk e

−tk

k=0 j

( − 1) − 2 n

tj 2

k=0

2ϕx t e2 − ϕ

n

ck e

−tk

y ,

k=0

1+t +y e −nt e−|y| is true. Denote W (t, x) = x = −∞ dx and integrate Eq. (23) from −∞ to x, then we

hence the estimate ∂x R = O ∂x−1 (w (t, x)) , where ∂x−1 find

1 1 Wt − Wxx + et ϕWx + et ϕw 2 + et ∂x−1 ϕx w 2 + et ϕx ( − 1) 2 2 ϕx 2ϕx t −1 −1 W − e ∂x (ϕx ( − 1))x W − Wx + 2∂x w + R1 = 0 ϕ ϕ x (25) with boundary condition Wx (t, 0) = w (t, 0) = 0, where R1 = ∂x−1 R = the Neumann t n O 1 + t + y 2 e− 2 −nt e−|y| . We suppose that the initial data r (t0 , x) are sufficiently close to (t, y) so that the function W (t, x) cosh 2x ∈ H2,0 ((−∞, 0)) and the initial time t = t0 is sufficiently large. The last requirement can be replaced by a sufficiently large coefficient at the nonlinear term in Eq. (1) so that nonlinear effects dominate the linear ones from the beginning.


We prove the following result. Theorem 4.1. Let the initial time t0 > 0 be sufficiently large, initial data u0 (x) ∈ and the H2,0 be an odd function and close to the shock wave t0 , xet0 /2 , that is x u0 x t0 /2 − t0 , e x dx ∈ H2,0 ((−∞, 0)) , cosh 2x ) ϕ (x 0 −∞ where ϕ0 (x) is such that ϕ0 (x) > 0 for all x ∈ (−∞, 0) and ϕ0 (x) = − x1 +O e−|x| as x → −∞. Then a unique solution u (t, x) to the Cauchy problem (1) has the asymptotic representation u (t, x) = ϕ (t, − |x|) t, −et/2 |x| sign (x) + O e−t for t → ∞ uniformly with respect to x ∈ R. Since the solution u (t, x) is represented as u (t, x) = r (t, x) ϕ (t, x), then the result of Theorem 4.1 follows from the next lemma. Lemma 4.2. Let the initial time t0 > 0 be sufficiently large and the initial data r (t0 , x) ∈ H2,0 ((−∞, 0)) be close to the shock wave t0 , xet0 /2 , that is x cosh 2x r t0 , x − t0 , et0 /2 x dx ∈ H2,0 ((−∞, 0)) . −∞

Then there x exists a unique solution r (t, x) to the Cauchy problem (22) such that cosh x −∞ u t, x − V t, et/2 x dx ∈ C [t0 , ∞) ; H2,0 and the estimate x t/2 cosh x r t, x − t, e x dx ≤ Ce2t−nt −∞

2,0

is true for all t ≥ t0 , where n ≥ 3. Thus we see that the solution r (t, x) to problem (22) tends to the shock wave (t, y) as t → ∞ uniformly with respect to x ∈ (−∞, 0) . Proof. Denote g (t, x) = W (t, x) cosh 2x, h = w (t, x) cosh 2x, v (t, x) = W (t, x) cosh x and s (t, x) = w (t, x) cosh x. We prove that I (t) ≡ e−21t g 2∞ + e2nt v 2 + e−t s 2 + e−2t sx 2 < C (26) for all t > t0 . By the contradiction we suppose that there exists T > t0 such that I (T ) = C and I (t) ≤ C

(27)

for all t ∈ [t0 , T ]. By (25) we get

g gt − gxx + χ1 gx − ψ1 g − et cosh (2x) ∂x−1 (ϕx ( − 1))x cosh x 2 t ϕs 2ϕ ϕ e s x x − +et cosh (2x) ∂x−1 − h ϕ 2 cosh x 2 cosh2 x 2h ϕx + cosh (2x) ∂x−1 = −R1 cosh 2x ϕ x cosh 2x

(28)


with boundary condition gx (t, 0) = 0, where χ1 = 4 tanh 2x + et ϕ, 4 + 2et ϕ tanh 2x − et ϕx ( − 1) . ψ1 = 4 tanh2 2x − cosh2 2x Since ϕ ∞ ≤ Ce− 2 , ϕx ∞ ≤ Ce−t , we obtain t

ψ1 +

1 1 1 3 (χ1 )x = 4 tanh2 2x + 2et ϕ tanh 2x + et ϕx + e 2 t ϕy ≤ 5 2 2 2 3t

for t ≥ t0 if t0 > 0 is sufficiently large. Also via (27) s ∞ ≤ Ce 4 −nt , hence by the Young inequality we have the estimates g t e cosh (2x) ∂x−1 (ϕx ( − 1))x cosh x 3t ≤ Cet ϕxx ∞ − 1 1 + Ce 2 ϕx ∞ y 1 g ≤ Ce−nt , t ϕx s 2 ≤ Cet ϕx ∞ s ∞ s ≤ e−nt e cosh (2x) ∂ −1 x 2 2 cosh x and

2h ϕx −t ≤ C ϕx cosh (2x) ∂ −1 h x ≤ Ce 2 h , ϕ ϕ x cosh 2x x t ϕs 2ϕx t e − ϕ − 2 cosh x h ≤ Ce 2 h .

Applying the energy method, taking into account the boundary conditions gx (t, 0) = χ1 (t, 0) = 0, we get from (28), d g (t) 2 + h 2 ≤ 20 g 2 + Ce−2nt . dt Now for the function h = gx − 2g tanh 2x we have from Eq. (23) et ϕ hhx + R cosh 2x = 0 cosh 2x with boundary condition h (t, 0) = 0, where ht − hxx + χ2 hx − ψ2 h +

χ2 = 4 tanh 2x + et ϕ −

2ϕx , ϕ

4 − et (ϕ)x − et ϕx ( − 1) cosh2 2x e t ϕx s ϕs tanh 2x 4ϕx +2et ϕ tanh 2x − tanh 2x + 2et − . ϕ cosh x cosh x

ψ2 = 4 tanh2 2x −

We have ψ2 +

1 2

(χ2 )x ≤ Cet and 0 ϕ 3 et ϕ 2 t h h dx ≤ Ce h x cosh 2x x −∞ cosh 2x 1 t

≤ Ce 2 s ∞ h 2 ≤ C h 2 .

(29)

(30)


Hence by the energy method, taking into account the boundary condition h (t, 0) = 0, we get from (30) d h (t) 2 + 2 hx 2 ≤ Cet h 2 + Ce−2nt . dt

(31)

By virtue of (29) and (31) we obtain d g (t) 2 + e−2t h (t) 2 ≤ 20 g (t) 2 + e−2t h (t) 2 + Ce−2nt , dt whence integrating with respect to t we get g (t) 2 + e−2t h (t) 2 ≤ Ce20t , therefore g 2∞ ≤ C g 2 + C g h ≤ Ce21t

(32)

for all t ∈ [t0 , T ] . We now estimate the norms v , s and sx . By (25) we find for v (t, x) = W (t, x) cosh x, ϕx s vt − vxx + χ3 vx − ψ3 v + 2 cosh x∂x−1 ϕ x cosh x v et ϕs 2 − et cosh x∂x−1 (ϕx ( − 1))x + cosh x 2 cosh x 2 s ϕ x + R1 cosh x = 0, + et cosh x∂x−1 (33) 2 cosh2 x where χ3 = 2 tanh x + et ϕ − ψ3 = tanh2 x −

2ϕx , ϕ

1 2ϕx + et ϕ tanh x − et ϕx ( − 1) − tanh x. 2 ϕ cosh x

By the energy method (we multiply Eq. (33) by v and integrate with respect to x over (−∞, 0) taking into account the boundary condition vx (t, 0) = 0) we get 0 d 2 2 v + s − (34) (χ3 )x + 2ψ3 v 2 dx ≤ C v 2 + Ce−t−2nt , dt −∞ since

t 2 e ϕs t 1 2 cosh x ≤ Ce 2 s ∞ s ≤ 4 s ,

and R1 cosh x ≤ Ce− 2 −nt , also via the Young inequality we have the estimates s ϕx cosh x∂ −1 ≤ Ce−t s ≤ 1 s , x ϕ x cosh x 4 v t e cosh x∂x−1 (ϕx ( − 1))x cosh x 3t ≤ Cet ϕxx ∞ − 1 1 v + Ce 2 ϕx ∞ y 1 v ≤ C v t


and


t ϕx s 2 ≤ Cet ϕx ∞ s ∞ s ≤ 1 s . e cosh x∂ −1 x 2 4 2 cosh x

Further we estimate

3t

e2 y (χ3 )x + 2ψ3 = −ϕ 2e tanh tanh x + 2 2 cosh2

+ O (1) ≤ −C log t

t

y 2

for all |x| ≤ e 2 since ϕ (t, x) ≥ e− 2 in the region |x| ≤ e 2 . The main point here is that by estimate (27) we have g 2∞ ≤ Ce21t , therefore t

−et/2

−∞

t

v (t, x) dx = 2

≤

−et/2 −∞

t

e−2|x| e2|x| v 2 (t, x) dx

C g (t) 2∞

−et/2 −∞

e−2|x| dx ≤ Ce21t−e

t/2

≤ Ce−2t−2nt ,

whence we get

0

−∞

(χ3 )x + 2ψ3 v 2 dx ≤ −C v 2 log t + Ce−t−2nt ,

and from (34) we obtain d v 2 + s 2 ≤ − (2n + 1) v 2 + Ce−t−2nt dt

(35)

for all t ∈ [t0 , T ] . Let us consider now the estimates for the function s = vx − v tanh x = w cosh x. From Eq. (23) we have st − sxx + χ4 sx − ψ4 s +

ϕx − ϕ tanh x 2 et ϕssx + et s + R cosh x = 0, cosh x cosh x

(36)

with the boundary condition s (t, 0) = 0, where χ4 = 2 tanh x + et ϕ −

2ϕx , ϕ

1 2ϕx − tanh2 x + et ϕ tanh x − et (ϕ)x − tanh x − et ϕx ( − 1) . 2 ϕ cosh x

ψ4 =

Applying the energy method to (36) we obtain d s 2 + sx 2 ≤ 3et s 2 + Ce−2nt dt since

0

−∞

R cosh x ≤ Ce−nt , s 2 (χ4 )x + 2ψ4 ≤ 2et s 2 , 1 et ϕ 2 ϕ ≤ Ce 2t s ∞ s 2 ≤ C s 2 s sx dx = Cet s3 cosh x cosh x x 1

(37)


and

ϕ t ϕ x et − tanh x s 2 ≤ Ce 2 s ∞ s . cosh x cosh x

In the same way we obtain from (36) d sx 2 + sxx 2 ≤ e2t s 2 + 3et sx 2 + Ce−2nt , dt since

0

−∞

(38)

3t sx (χ4 sx − ψ4 s)x dx ≤ e 2 s sx + 2et sx 2 ,

and by use of the inequality sx 2∞ ≤ 2 sx sxx + 2 sx 2 we have 0 ϕss ϕs x 2 sx dx ≤ Cet et (sx ) cosh x x cosh x x 1 −∞ t

≤ Ce 2 sx ∞ sx 2 + C s ∞ sx 2 1 ≤ C sx 2 + sxx 2 . 2 Also

ϕx ϕ tanh x 2 ≤ Ce 2t s ∞ s 2 ≤ C s 2 + C sx 2 e sx − s 1,0 cosh x cosh x x 1 t

and ∂x (R cosh x) ≤ Ce−nt . Thus for J (t) = 15 v (t) 2 + 4e−t s (t) 2 + e−2t sx (t) 2 from (35), (37) and (38) we get d J ≤ 15 − (2n + 1) v 2 − s 2 + Ce−t−2nt dt +4e−t − sx 2 + 3et s 2 + Ce−2nt +e−2t e2t s 2 + 3et sx 2 + Ce−2nt + Ce−2nt ≤ −30n v 2 − s 2 − e−t sx 2 + Ce−2nt ≤ − (2n + 1) J + Ce−2nt , whence integrating we get J (t) ≤ e−(2n+1)(t−t0 ) J (t0 ) + Ce−(2n+1)t et − et0 < C for all t ∈ [t0 , T ]. Therefore in view of (32) we see that estimate (26) is valid for all t ∈ [t0 , T ] . The contradiction obtained proves estimate (26) for all t ≥ t0 . Whence the result of the Lemma follows. Lemma 4.2 is proved. Acknowledgements. This work of P.I.N. is partially supported by CONACYT. We are grateful to an unknown referee for useful suggestions and comments.


References

1. Constantin, A., Escher, J.: Wave breaking for nonlinear nonlocal shallow water equations. Acta Mathematica 181(2), 229–243 (1998)
2. Escobedo, M., Kavian, O.: Asymptotic behavior of positive solutions of a non-linear heat equation. Houston J. Math. 13(4), 39–50 (1987)
3. Escobedo, M., Kavian, O., Matano, H.: Large time behavior of solutions of a dissipative nonlinear heat equation. Comm. Partial Diff. Eqs 20, 1427–1452 (1995)
4. Fujita, H.: On the blowing-up of solutions of the Cauchy problem for $u_t = \Delta u + u^{1+\alpha}$. J. Fac. Sci. Univ. of Tokyo, Sect. I, 13, 109–124 (1966)
5. Galaktionov, V.A., Kurdyumov, S.P., Samarskii, A.A.: On asymptotic eigenfunctions of the Cauchy problem for a nonlinear parabolic equation. Math. USSR Sbornik 54, 421–455 (1986)
6. Gmira, A., Veron, L.: Large time behavior of the solutions of a semilinear parabolic equation in $\mathbb{R}^N$. J. Diff. Eq. 53, 258–276 (1984)
7. Hayashi, N., Kaikina, E.I., Naumkin, P.I.: Large time behavior of solutions to the dissipative nonlinear Schrödinger equation. Proceedings of the Royal Soc. Edinburgh 130A, 1029–1043 (2000)
8. Hayashi, N., Kaikina, E.I., Naumkin, P.I.: Large time behavior of solutions to the Landau-Ginzburg type equation. Funcialaj Ekvacioj 44, 171–200 (2001)
9. Hayashi, N., Kaikina, E.I., Naumkin, P.I.: Global existence and time decay of small solutions to the Landau-Ginzburg type equations. J. Anal. Math. (in press)
10. Kamin, S., Peletier, L.A.: Large time behaviour of solutions of the heat equation with absorption. Ann. Scuola Norm. Sup. Pisa 12, 393–408 (1985)
11. Kavian, O.: Remarks on the large time behavior of a nonlinear diffusion equation. Ann. Inst. Henri Poincaré, Analyse non linéaire 4(5), 423–452 (1987)
12. Kolmogorov, A.N., Fomin, S.V.: Elements of the Theory of Functions and Functional Analysis. N.Y.: Graylock Press, 1957
13. Kuramoto, Y.: Chemical Oscillations, Waves and Turbulence. Berlin: Springer-Verlag, 1982
14. Liu, T.-P., Matsumura, A., Nishihara, K.: Behavior of solutions for the Burgers equation with boundary corresponding to rarefaction waves. SIAM J. Math. Anal. 29(2), 293–308 (1998)
15. Matsumura, A., Nishihara, K.: Asymptotics toward the rarefaction waves of the solutions of a one-dimensional model system for compressible viscous gas. Japan J. Appl. Math. 3(1), 1–13 (1986)
16. Matsumura, A., Nishihara, K.: Asymptotics toward the rarefaction wave of the solutions of Burgers' equation with nonlinear degenerate viscosity. Nonlinear Anal. 23(5), 605–614 (1994)
17. Matsumura, A., Nishihara, K.: Asymptotic stability of travelling waves for scalar viscous conservation laws with non-convex nonlinearity. Commun. Math. Phys. 165(1), 83–96 (1994)
18. Naumkin, P.I., Shishmarev, I.A.: Nonlinear Nonlocal Equations in the Theory of Waves. Translations of Monographs 133, Providence, R.I.: A.M.S., 1994
19. Sivashinsky, G.I.: Instabilities, pattern formation and turbulence formation. Ann. Rev. Fluid. Mech. 15, 179–199 (1983)
20. Weissler, F.B.: Existence and non-existence of global solutions to a nonlinear heat equation. Israel J. Math. 38, 29–40 (1988)

Communicated by P. Constantin

Commun. Math. Phys. 239, 309–341 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0880-y


The Camassa-Holm Hierarchy, N-Dimensional Integrable Systems, and Algebro-Geometric Solution on a Symplectic Submanifold

Zhijun Qiao 1,2

1 T-7 and CNLS, MS B-284, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
2 Institute of Mathematics, Fudan University, Shanghai 200433, P.R. China. E-mail: [email protected]

Received: 20 September 2002 / Accepted: 27 February 2003 Published online: 16 June 2003 – © Springer-Verlag 2003

Abstract: This paper shows that the Camassa-Holm (CH) spectral problem yields two different integrable hierarchies of nonlinear evolution equations (NLEEs): one is the negative order CH hierarchy and the other is the positive order CH hierarchy. The two CH hierarchies possess zero curvature representations, obtained by solving a key matrix equation. We see that the well-known CH equation is included in the negative order CH hierarchy while the Dym type equation is included in the positive order CH hierarchy. Furthermore, under two constraint conditions between the potentials and the eigenfunctions, the CH spectral problem is cast into: 1. a new Neumann-like N-dimensional system when it is restricted to a symplectic submanifold of $\mathbb{R}^{2N}$, which is proven to be integrable by using the Dirac-Poisson bracket and the r-matrix process; and 2. a new Bargmann-like N-dimensional system when it is considered in the whole $\mathbb{R}^{2N}$, which is proven to be integrable by using the standard Poisson bracket and the r-matrix process. In the paper, we present two 4 × 4 instead of N × N r-matrix structures. One is for the Neumann-like system (not the peaked CH system) related to the negative order CH hierarchy, while the other is for the Bargmann-like system (not the peaked CH system, either) related to the positive order hierarchy. The whole CH hierarchy (an integro-differential hierarchy, both positive and negative order) is shown to have parametric solutions which obey the corresponding constraint relation. In particular, the CH equation, constrained to a symplectic submanifold in $\mathbb{R}^{2N}$, and the Dym type equation have parametric solutions. Moreover, we see that this kind of parametric solution of the CH equation is not gauge equivalent to the peakons. Solving the parametric representation of the solution on the symplectic submanifold gives a new class of algebro-geometric solutions of the CH equation.


1. Introduction

The shallow water equation derived by Camassa and Holm (CH) in 1993 [7] is a new integrable system. This equation possesses a bi-Hamiltonian structure, a Lax pair and peakon solutions, and retains higher order terms of derivatives in a small amplitude expansion of incompressible Euler's equations for unidirectional motion of waves at the free surface under the influence of gravity. In 1995 Calogero [8] extended the class of mechanical systems of this type. Later, Ragnisco and Bruschi [20] and Suris [22] showed that the CH equation yields the dynamics of the peakons in terms of an N-dimensional completely integrable Hamiltonian system. Such a dynamical system has a Lax pair and an N × N r-matrix structure [20]. Recently, the algebro-geometric solutions of the CH equation and the CH hierarchy have attracted much attention. This kind of solution for most classical integrable PDEs can be obtained by using the inverse spectral transform theory, see Dubrovin 1981 [12], Ablowitz and Segur 1981 [1], Novikov et al. 1984 [17], Newell 1985 [16]. This is usually done by adopting the spectral technique associated with the corresponding PDE. Alber and Fedorov [4, 5] studied the stationary and the time-dependent quasi-periodic solution for the CH equation and Dym type equation using the methods of trace formula [3] and Abel mapping and functional analysis on the Riemann surfaces. Later, Alber, Camassa, Fedorov, Holm and Marsden [2] considered the trace formula under the nonstandard Abel-Jacobi equations and, by introducing new parameters, presented the so-called weak finite-gap piecewise-smooth solutions of the integrable CH equation and Dym type equations. Very recently, Gesztesy and Holden [14] discussed the algebro-geometric solutions for the CH hierarchy using the polynomial recursion formalism and the trace formula, and connected a Riccati equation to the Lax pair of the CH equation.

The present paper provides another approach to algebro-geometric solutions of the CH equation which is constrained to some symplectic submanifold. Our approach differs from the ones pursued in Refs. [2–5, 14] and we will outline the differences next. Based on the nonlinearization technique [9], we constrain the CH hierarchy to some symplectic submanifold and use the constraint between the potentials and the eigenfunctions first to give the parametric solution and then to give the algebro-geometric solution of the CH equation on the symplectic submanifold. The main results of this paper are twofold.

• First, we extend the CH equation to the negative order CH hierarchy, which is a hierarchy of integrable integro-differential equations, through constructing the inverse recursion operator. This hierarchy is proven to have a Lax pair through solving a key matrix equation. The CH spectral problem associated with this hierarchy is constrained to a symplectic submanifold and naturally gives a constraint between the spectral function and the potential. Under this constraint, the CH spectral problem (linear problem) is nonlinearized as a new N-dimensional canonical Hamiltonian system of Neumann type. This N-dimensional Neumann-like system is not the peaked dynamical system of the CH equation because the peakons do not come from the CH spectral problem. The Neumann-like CH system is shown to be integrable by using the so-called Dirac-Poisson brackets on the symplectic submanifold in $\mathbb{R}^{2N}$ and the r-matrix process. Here we present a 4 × 4 r-matrix structure for the Neumann-like system, which is available to get the algebro-geometric solution of the CH equation on this symplectic submanifold. The negative order CH hierarchy is proven to have the parametric solution through employing the Neumann-like constraint relation. This parametric solution does not contain the peakons [7], and vice versa. Furthermore, solving the parametric representation of the solution on the symplectic submanifold gives an algebro-geometric solution for the CH equation. We point out that our algebro-geometric solution (see Eq. (3.104) and Remarks 3 and 4) is different from the ones in Refs. [2–5, 14], and simpler in form.

• Second, based on the negative case, we naturally give the positive order CH hierarchy by considering the recursion operator. This hierarchy is also shown to be integrable by solving the same key matrix equation. The CH spectral problem, related to this positive order CH hierarchy, yields a new integrable N-dimensional system of Bargmann type (instead of Neumann type) by using the standard Poisson bracket and r-matrix procedure in $\mathbb{R}^{2N}$. A 4 × 4 r-matrix structure is also presented for the Bargmann-like system (not the peaked CH system, either), which is available to get the parametric solution of a Dym type equation contained in the positive order CH hierarchy. This hierarchy also possesses the parametric solution using the Bargmann constraint relation.

Roughly speaking, our method works in the following steps (also see [19]):

– Start from the spectral problem.
– Find some constraint condition between the potentials and the eigenfunctions. Here, for the negative CH hierarchy, we restrict it to a symplectic manifold in $\mathbb{R}^{2N}$, but for the positive CH hierarchy, we will have the constraint condition in the whole $\mathbb{R}^{2N}$.
– Prove that the constrained spectral problem is finite-dimensional integrable. Usually we use a Lax matrix and r-matrix procedure.
– Verify that the above constrained potential(s) is (are) a parametric solution of the hierarchy.
– Solve the parametric representation of a solution in an explicit form, then give the algebro-geometric solutions of the equations on the symplectic manifold. In this process, we separate the variables of the Jacobi-Hamiltonian system [21], then construct the action variables and angle-coordinates on the symplectic submanifold, and the residues at two infinity points for some composed Riemann-Theta functions give the algebro-geometric solutions.
The paper is organized as follows. The next section gives a general structure of the zero curvature representations of the all vector fields for a given isospectral problem. The key point is to construct a key matrix equation. In Sect. 3, we present the negative order CH hierarchy based on the inverse recursion operator. The well-known CH equation is included in the negative order hierarchy, and the CH spectral problem yields a new Neumann-like system which is constrained to a symplectic manifold. This system has canonical form and is integrable by using the Dirac-Poisson bracket and the r-matrix process. Here we obtain a 4 × 4 r-matrix structure for the Neumann-like CH system. Furthermore, the whole negative order CH hierarchy constrained on the symplectic submanifold has a parametric solution. In particular, the CH equation has a parametric solution on the submanifold. Finally we give an algebro-geometric solution of the CH equation on the submanifold. In Sect. 4, we deal with the positive order integrable CH hierarchy and give a new Bargmann-like integrable system. By the use of a similar process as in Sect. 3, the CH spectral problem is nonlinearized to be an integrable system under a Bargmann constraint. This integrable Bargmann system also has an r-matrix structure of 4 × 4. Moreover, the positive order CH hierarchy is also shown to have the parametric solution which obeys the Bargmann constraint relation. In particular, the Dym type equation in the positive order CH hierarchy has the parametric solution.


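Let us now give some symbols and conventions used in this paper:
$$ f^{(k)} = \begin{cases} \partial^k f = \dfrac{\partial^k f}{\partial x^k} = f_{kx}, & k = 0, 1, 2, \ldots, \\[1ex] \underbrace{\int\cdots\int}_{-k} f\,dx, & k = -1, -2, \ldots, \end{cases} $$
$f_t = \frac{\partial f}{\partial t}$, $f_{kxt} = \frac{\partial^{k+1} f}{\partial t\,\partial x^k}$ $(k = 0, 1, 2, \ldots)$, $\partial = \frac{\partial}{\partial x}$, $\partial^{-1}$ is the inverse of $\partial$, i.e. $\partial\partial^{-1} = \partial^{-1}\partial = 1$, and $\partial^k f$ means that the operator $\partial^k f$ acts on some function g, i.e.
$$ \partial^k f\cdot g = \partial^k(fg) = \begin{cases} \dfrac{\partial^k}{\partial x^k}(fg) = (fg)_{kx}, & k = 0, 1, 2, \ldots, \\[1ex] \underbrace{\int\cdots\int}_{-k} fg\,dx, & k = -1, -2, \ldots. \end{cases} $$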

In the following, the function m stands for the potential, λ is assumed to be a spectral parameter, and the domain of the spatial variable x is Ω, which is either (−∞, +∞) or (0, T), while the domain of the time variable $t_k$ is the positive time axis $\mathbb{R}_+ = \{t_k \mid t_k \in \mathbb{R},\ t_k \ge 0,\ k = 0, \pm1, \pm2, \ldots\}$. In the case Ω = (−∞, +∞), the decaying condition at infinity is imposed on the potential function, and in the case Ω = (0, T), the periodicity condition is imposed. $(\mathbb{R}^{2N}, dp \wedge dq)$ stands for the standard symplectic structure in the Euclidean space $\mathbb{R}^{2N} = \{(p, q) \mid p = (p_1, \ldots, p_N),\ q = (q_1, \ldots, q_N)\}$, where $p_j, q_j$ $(j = 1, \ldots, N)$ are N pairs of canonical coordinates and $\langle\cdot,\cdot\rangle$ is the standard inner product in $\mathbb{R}^N$; in $(\mathbb{R}^{2N}, dp \wedge dq)$, the Poisson bracket of two Hamiltonian functions F, H is defined by [6]
$$ \{F, H\} = \sum_{j=1}^{N}\left(\frac{\partial F}{\partial q_j}\frac{\partial H}{\partial p_j} - \frac{\partial F}{\partial p_j}\frac{\partial H}{\partial q_j}\right) = \left\langle\frac{\partial F}{\partial q}, \frac{\partial H}{\partial p}\right\rangle - \left\langle\frac{\partial F}{\partial p}, \frac{\partial H}{\partial q}\right\rangle. \tag{1.1} $$
$\lambda_1, \ldots, \lambda_N$ are assumed to be N distinct spectral parameters, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$, and $I_{2\times2} = \mathrm{diag}(1, 1)$. Denote all infinitely differentiable functions on the real field $\mathbb{R}$ and all integers by $C^\infty(\mathbb{R})$ and by $\mathbb{Z}$, respectively.

2. The Camassa-Holm (CH) Spectral Problem and Zero Curvature Representation

Let us consider the Camassa-Holm (CH) spectral problem [7]:
$$ \psi_{xx} = \frac14\psi - \frac12 m\lambda\psi \tag{2.1} $$

with the potential function m. Equation (2.1) is equivalent to
$$ y_x = Uy, \qquad U = U(m, \lambda) = \begin{pmatrix} 0 & 1 \\ \frac14 - \frac12 m\lambda & 0 \end{pmatrix}, \tag{2.2} $$
where $y = (y_1, y_2)^T = (\psi, \psi_x)^T$. It is easy to see that the spectral gradient of Eq. (2.2),
$$ \nabla\lambda \equiv \frac{\delta\lambda}{\delta m} = \lambda y_1^2, \tag{2.3} $$
satisfies the following Lenard eigenvalue problem
$$ K\cdot\nabla\lambda = \lambda J\cdot\nabla\lambda \tag{2.4} $$
with the pair of Lenard's operators
$$ K = -\partial^3 + \partial, \qquad J = \partial m + m\partial. \tag{2.5} $$
They yield the recursion operator
$$ L = J^{-1}K = (\partial m + m\partial)^{-1}(\partial - \partial^3), \tag{2.6} $$
which also has the product form $L = \frac12 m^{-\frac12}\partial^{-1}m^{-\frac12}(\partial - \partial^3)$. The Gateaux derivative matrix $U_*(\xi)$ of the spectral matrix U in the direction $\xi \in C^\infty(\mathbb{R})$ at the point m is
$$ U_*(\xi) = \frac{d}{d\epsilon}\Big|_{\epsilon=0} U(m + \epsilon\xi) = \begin{pmatrix} 0 & 0 \\ -\frac12\lambda\xi & 0 \end{pmatrix}, \tag{2.7} $$
which is obviously an injective homomorphism. For any given $C^\infty$-function G, we construct the following matrix equation with respect to V = V(G):
$$ V_x - [U, V] = U_*(K\cdot G - \lambda J\cdot G). \tag{2.8} $$
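The Lenard relation (2.4) can be checked directly from the spectral problem (2.1). The computation below is a sketch under the assumption that ψ is a smooth solution of (2.1) and that λ does not depend on x; the shorthand $w = \frac14 - \frac12 m\lambda$ is introduced here only for convenience and is not part of the original notation.

```latex
\begin{align*}
G &:= \nabla\lambda = \lambda\psi^2, \qquad \psi_{xx} = w\psi, \qquad w := \tfrac14 - \tfrac12 m\lambda,\\
G_x &= 2\lambda\psi\psi_x,\\
G_{xx} &= 2\lambda\psi_x^2 + 2\lambda w\psi^2,\\
G_{xxx} &= 8\lambda w\psi\psi_x + 2\lambda w_x\psi^2,\\
K\cdot G &= G_x - G_{xxx} = 2\lambda(1 - 4w)\psi\psi_x - 2\lambda w_x\psi^2
          = \lambda\bigl(4m\lambda\psi\psi_x + \lambda m_x\psi^2\bigr),\\
\lambda J\cdot G &= \lambda\bigl(m_x G + 2mG_x\bigr)
          = \lambda\bigl(\lambda m_x\psi^2 + 4m\lambda\psi\psi_x\bigr).
\end{align*}
```

The last two lines agree term by term, using $1 - 4w = 2m\lambda$ and $w_x = -\frac12 m_x\lambda$.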

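Theorem 1. For the CH spectral problem (2.2) and an arbitrary $C^\infty$-function G, the matrix equation (2.8) has the following solution:
$$ V = V(G) = \lambda\begin{pmatrix} \frac12 G_x & -G \\ \frac12 G_{xx} - \frac14 G + \frac12 m\lambda G & -\frac12 G_x \end{pmatrix}. \tag{2.9} $$
Proof. Directly substituting Eqs. (2.9), (2.5) and (2.7) into Eq. (2.8), we can complete the proof of this theorem.

Theorem 2. Let $G_0 \in \mathrm{Ker}\,J = \{G \in C^\infty(\mathbb{R}) \mid JG = 0\}$ and $G_{-1} \in \mathrm{Ker}\,K = \{G \in C^\infty(\mathbb{R}) \mid KG = 0\}$. We define Lenard's sequences as follows:
$$ G_j = \begin{cases} L^j\cdot G_0, & j \ge 0, \\ L^{j+1}\cdot G_{-1}, & j < 0, \end{cases} \qquad j \in \mathbb{Z}. \tag{2.10} $$
Then,
1. all the vector fields $X_k = J\cdot G_k$, $k \in \mathbb{Z}$, satisfy the following commutator representation:
$$ V_{k,x} - [U, V_k] = U_*(X_k), \qquad \forall k \in \mathbb{Z}; \tag{2.11} $$
2. the following hierarchy of nonlinear evolution equations
$$ m_{t_k} = X_k = J\cdot G_k, \qquad \forall k \in \mathbb{Z}, \tag{2.12} $$
possesses the zero curvature representation
$$ U_{t_k} - V_{k,x} + [U, V_k] = 0, \qquad \forall k \in \mathbb{Z}, \tag{2.13} $$
where
$$ V_k = \sum V_j\lambda^{k-j-1}, \qquad \sum = \begin{cases} \sum_{j=0}^{k-1}, & k > 0, \\ 0, & k = 0, \\ -\sum_{j=k}^{-1}, & k < 0, \end{cases} \tag{2.14} $$
and $V_j = V(G_j)$ is given by Eq. (2.9) with $G = G_j$.

Proof. 1. For k = 0, it is obvious. For k < 0, we use $J\cdot G_j = K\cdot G_{j-1}$ and $K\cdot G_{-1} = 0$ to obtain
$$ \begin{aligned} V_{k,x} - [U, V_k] &= -\sum_{j=k}^{-1}\bigl(V_{j,x} - [U, V_j]\bigr)\lambda^{k-j-1} \\ &= -\sum_{j=k}^{-1} U_*\bigl(K\cdot G_j - \lambda K\cdot G_{j-1}\bigr)\lambda^{k-j-1} \\ &= U_*\Bigl(\sum_{j=k}^{-1}\bigl(K\cdot G_{j-1}\lambda^{k-j} - K\cdot G_j\lambda^{k-j-1}\bigr)\Bigr) \\ &= U_*\bigl(K\cdot G_{k-1} - K\cdot G_{-1}\lambda^{k}\bigr) \\ &= U_*(K\cdot G_{k-1}) = U_*(X_k). \end{aligned} $$
For the case of k > 0, it is proved similarly.
2. Noticing $U_{t_k} = U_*(m_{t_k})$, we obtain
$$ U_{t_k} - V_{k,x} + [U, V_k] = U_*(m_{t_k} - X_k). $$
The injectiveness of $U_*$ implies that item 2 holds.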
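Theorem 1 can be spot-checked with exact arithmetic. The sketch below is not part of the original paper: it assumes polynomial test functions m and G (chosen arbitrarily), a rational value of λ, and takes the diagonal entries of V(G) to be $+\frac12\lambda G_x$ and $-\frac12\lambda G_x$, the placement that reproduces the scalar Lax pair (3.15). All helper names are hypothetical. It evaluates the residual $V_x - [U, V] - U_*(K\cdot G - \lambda J\cdot G)$ entry by entry.

```python
from fractions import Fraction as Fr

# Polynomials in x are coefficient lists [c0, c1, ...]; [] is the zero polynomial.
def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def dx(p):
    """Derivative with respect to x."""
    return trim([i * a for i, a in enumerate(p)][1:])

def mat_mul(A, B):
    return [[add(mul(A[i][0], B[0][j]), mul(A[i][1], B[1][j]))
             for j in range(2)] for i in range(2)]

def mat_lin(A, c, B):
    """Entrywise A + c*B for 2x2 matrices of polynomials."""
    return [[add(A[i][j], scale(c, B[i][j])) for j in range(2)] for i in range(2)]

def residual(m, G, lam):
    """V_x - [U, V] - U_*(K.G - lam J.G), with V(G) as assumed in the lead-in."""
    Gx, Gxx = dx(G), dx(dx(G))
    w = add([Fr(1, 4)], scale(-lam / 2, m))              # 1/4 - m*lam/2
    U = [[[], [Fr(1)]], [w, []]]
    c = add(add(scale(Fr(1, 2), Gxx), scale(Fr(-1, 4), G)),
            scale(lam / 2, mul(m, G)))                   # lower-left entry / lam
    V = [[scale(lam / 2, Gx), scale(-lam, G)],
         [scale(lam, c), scale(-lam / 2, Gx)]]
    KG = add(Gx, scale(-1, dx(Gxx)))                     # K = d - d^3
    JG = add(dx(mul(m, G)), mul(m, Gx))                  # J = d m + m d
    rhs = [[[], []], [scale(-lam / 2, add(KG, scale(-lam, JG))), []]]
    Vx = [[dx(V[i][j]) for j in range(2)] for i in range(2)]
    comm = mat_lin(mat_mul(U, V), -1, mat_mul(V, U))     # [U, V]
    return mat_lin(mat_lin(Vx, -1, comm), -1, rhs)

m = [Fr(1), Fr(0), Fr(1)]             # m = 1 + x^2 (arbitrary test potential)
G = [Fr(5), Fr(-2), Fr(0), Fr(1)]     # G = 5 - 2x + x^3 (arbitrary test function)
R = residual(m, G, Fr(3, 7))
print(all(entry == [] for row in R for entry in row))
```

Because the arithmetic is exact, an all-zero residual for generic polynomial inputs is a meaningful consistency check, although it is of course not a proof.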

3. Negative Order CH Hierarchy, Integrable Neumann-like System and Algebro-Geometric Solution

3.1. Negative order CH hierarchy. Let us first give the negative order hierarchy of the CH spectral problem (2.2) by considering the kernel element of Lenard's operator K. The kernel of the operator K has the following three seed functions:
$$ G_{-1}^1 = 1, \tag{3.1} $$
$$ G_{-1}^2 = e^{x}, \tag{3.2} $$
$$ G_{-1}^3 = e^{-x}, \tag{3.3} $$
where all possible linear combinations form the whole kernel of K. Let $G_{-1} \in \mathrm{Ker}\,K$, then
$$ G_{-1} = \sum_{l=1}^{3} a_l G_{-1}^l, \tag{3.4} $$
where $a_l = a_l(t_n)$, l = 1, 2, 3, are three arbitrarily given $C^\infty$-functions with respect to the time variables $t_n$ $(n < 0, n \in \mathbb{Z})$, but independent of the spatial variable x. Therefore, $G_{-1}$ directly generates an isospectral ($\lambda_{t_k} = 0$, $k < 0$, $k \in \mathbb{Z}$) hierarchy of nonlinear evolution equations for the CH spectral problem (2.2),
$$ m_{t_k} = JL^{k+1}\cdot G_{-1}, \qquad k < 0,\ k \in \mathbb{Z}, \tag{3.5} $$

which is called the negative order CH hierarchy because of k < 0. In Eq. (3.5), the operator J is defined by Eq. (2.5) and L−1 is given by L−1 = K −1 J = ∂ −1 ex ∂ −1 e−2x ∂ −1 ex (∂m + m∂).

(3.6)

K −1 = ∂ −1 ex ∂ −1 e−2x ∂ −1 ex .

(3.7)

Here,

With setting m = u − uxx , we obtain another form of L−1 : L−1 = u + ex ∂ −1 e−2x ∂ −1 ex u∂ + 2ux + ∂ −1 m ∂.

(3.8)

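By Theorem 2, the negative CH hierarchy (3.5) has the zero curvature representation
$$ U_{t_k} - V_{k,x} + [U, V_k] = 0, \qquad k < 0,\ k \in \mathbb{Z}, \tag{3.9} $$
$$ V_k = -\sum_{j=k}^{-1} V_j\lambda^{k-j-1}, \tag{3.10} $$
i.e.
$$ \begin{cases} y_x = \begin{pmatrix} 0 & 1 \\ \frac14 - \frac12 m\lambda & 0 \end{pmatrix} y, \\[2ex] y_{t_k} = -\displaystyle\sum_{j=k}^{-1}\begin{pmatrix} \frac12 G_{j,x} & -G_j \\ \frac12 G_{j,xx} - \frac14 G_j + \frac12 m\lambda G_j & -\frac12 G_{j,x} \end{pmatrix}\lambda^{k-j}\,y, \end{cases} \qquad k = -1, -2, \ldots, \tag{3.11} $$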

where $G_j = L^{j+1}\cdot G_{-1}$, $j < 0$, $j \in \mathbb{Z}$. Thus, all nonlinear equations in the negative CH hierarchy (3.5) are integrable. Let us now give some special reductions of Eq. (3.5).
• In the case of $a_1 = -1$, $a_2 = a_3 = 0$, i.e. $G_{-1} = -G_{-1}^1 = -1$, because $G_{-2} = L^{-1}\cdot G_{-1} = -u$, the second equation of Eq. (3.5) reads
$$ m_{t_{-2}} = -(\partial m + m\partial)\cdot u, \tag{3.12} $$
i.e. (here noticing $m = u - u_{xx}$)
$$ u_{t_{-2}} - u_{xx,t_{-2}} + 3uu_x = 2u_xu_{xx} + uu_{xxx}, \tag{3.13} $$
which is exactly the Camassa-Holm equation [7]. According to Eq. (3.11), the CH equation (3.13) possesses the following zero curvature representation:
$$ \begin{cases} y_x = \begin{pmatrix} 0 & 1 \\ \frac14 - \frac12 m\lambda & 0 \end{pmatrix} y, \\[2ex] y_{t_{-2}} = \begin{pmatrix} \frac12 u_x & -u - \lambda^{-1} \\ \frac12 mu\lambda + \frac14 u - \frac14\lambda^{-1} & -\frac12 u_x \end{pmatrix} y, \end{cases} \tag{3.14} $$
which is equivalent to
$$ \begin{cases} \psi_{xx} = \frac14\psi - \frac12 m\lambda\psi, \\ \psi_{t_{-2}} = \frac12 u_x\psi - u\psi_x - \lambda^{-1}\psi_x. \end{cases} \tag{3.15} $$

Equation (3.15) coincides with the one in Ref. [7]. • In the cases of a1 = 0, a2 = 1, a3 = 0 and a1 = 0, a2 = 0, a3 = 1, i.e. G−1 = ex , e−x , we can write them in a uniform expression: G−1 = ex , = ±1. The first equation of Eq. (3.5) reads mt−1 = (mx + 2m)ex ,

(3.16)

which is a linear PDE. Because G−2 = L−1 · G−1 = u + u(−1) ex , the second equation of Eq. (3.5) reads mt−2 = mx u + u(−1) + 2m ux + 2u + u(−1) ex , (3.17) where m = u − uxx . This equation has the following zero curvature representation:  0 1  y, yx = 1 1 (3.18) 4 − 2 mλ 0  yt−2 = V−2 y, where V−2 = −V (G−2 )λ−1 − V (G−1 )λ−2   1 u + 2u + u(−1) + λ−1 (−1) ) + λ−1 u + u x 2 . = ex  3 7 3 (−1) + 1 m u+u(−1) λ + 1 λ−1 − 1 u +2u + u(−1) + λ−1 x 2 ux + 4 u+ 4 u 2 4 2

Equation (3.18) can be changed to the following Lax form: ψxx = 41 ψ − 21 mλψ, (3.19) ψt−2 = u + u(−1) + λ−1 ex ψx − 21 ux + 2u + u(−1) + λ−1 ex ψ. Both of the two cases: = ±1 for Eq. (3.17) are integrable.
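For the first reduction above, the step from Eq. (3.12) to the CH equation (3.13) is a short expansion. The worked lines below are a sketch that uses only $m = u - u_{xx}$ and the definition $J = \partial m + m\partial$ from Eq. (2.5); no new notation is introduced.

```latex
\begin{align*}
(\partial m + m\partial)\cdot u &= (mu)_x + mu_x = m_x u + 2mu_x\\
  &= (u_x - u_{xxx})u + 2(u - u_{xx})u_x\\
  &= 3uu_x - 2u_xu_{xx} - uu_{xxx},\\
m_{t_{-2}} = u_{t_{-2}} - u_{xx,t_{-2}} &= -3uu_x + 2u_xu_{xx} + uu_{xxx}.
\end{align*}
```

Moving $3uu_x$ to the left-hand side gives Eq. (3.13).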


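3.2. r-matrix structure for the Neumann-like CH system. Consider the following matrix (called the "negative" Lax matrix)
$$ L_-(\lambda) = \begin{pmatrix} A_-(\lambda) & B_-(\lambda) \\ C_-(\lambda) & -A_-(\lambda) \end{pmatrix}, \tag{3.20} $$
where
$$ A_-(\lambda) = -\langle p, q\rangle\lambda^{-1} + \sum_{j=1}^{N}\frac{p_jq_j}{\lambda - \lambda_j}, \tag{3.21} $$
$$ B_-(\lambda) = \lambda^{-2} + \langle q, q\rangle\lambda^{-1} - \sum_{j=1}^{N}\frac{q_j^2}{\lambda - \lambda_j}, \tag{3.22} $$
$$ C_-(\lambda) = \frac14\lambda^{-2} - \langle p, p\rangle\lambda^{-1} + \sum_{j=1}^{N}\frac{p_j^2}{\lambda - \lambda_j}. \tag{3.23} $$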

We calculate the determinant of L− (λ): 1 2 1 1 λ det L− (λ) = − λ2 TrL2− (λ) = − λ2 A2− (λ) + B− (λ) C− (λ) 2 4 2 1 N

Ej− = Hj λj + , (3.24) λ − λj j =−2

j =1

where Tr stands for the trace of a matrix, and 1 H−2 = − , 8 1 1 H−1 = p, p − q, q , 2 8 H0 = p, q p, q − p, q2 , H1 = − p, q −1 p, q + p, q2 , 1 q, q λj + 1 pj2 Ej− = p, q λj pj qj − 2 1 1 1 p, p λj − qj2 + λ2j j− , j = 1, 2, . . . , N, − 2 4 2 j− =

(3.25)

(3.26)

N

(pj ql − pl qj )2 . λj − λ l

l=j,l=1

Let Fk =

N

j =1

− λk+1 j Ej , k = −1, −2, . . . ,

(3.27)


then it reads Fk =

1 1 k+1 p, p − k+1 q, q 2 8 −2 1

+ j +2 q, q k−j p, p − j +2 p, q k−j p, q , 2

(3.28)

j =k

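$k = -1, -2, -3, \ldots$. Obviously, $F_{-1} = H_{-1}$. Now, we consider the following symplectic submanifold in $\mathbb{R}^{2N}$,
$$ M = \Bigl\{(q, p) \in \mathbb{R}^{2N} \Bigm| F \equiv \tfrac12\bigl(\langle\Lambda q, q\rangle - 1\bigr) = 0,\ G \equiv \langle\Lambda q, p\rangle = 0\Bigr\}, \tag{3.29} $$
and introduce the Dirac bracket on M,
$$ \{f, g\}_D = \{f, g\} + \frac{1}{\langle\Lambda^2 q, q\rangle}\bigl(\{f, F\}\{G, g\} - \{f, G\}\{F, g\}\bigr), \tag{3.30} $$
which is easily proven to be bilinear, skew-symmetric and to satisfy the Jacobi identity. In particular, the Hamiltonian system $(H_{-1})_D$: $q_x = \{q, H_{-1}\}_D$, $p_x = \{p, H_{-1}\}_D$ on M reads
$$ (H_{-1})_D:\quad \begin{cases} q_x = p, \\[0.5ex] p_x = \dfrac14 q - \dfrac{1 + 4\langle\Lambda p, p\rangle}{4\langle\Lambda^2 q, q\rangle}\Lambda q, \\[1.5ex] \langle\Lambda q, p\rangle = 0, \quad \langle\Lambda q, q\rangle = 1. \end{cases} \tag{3.31} $$
We call this a Neumann-like system on M. Let
$$ m = \frac{1 + 4\langle\Lambda p, p\rangle}{2\langle\Lambda^2 q, q\rangle}, \tag{3.32} $$
$$ y_1 = q_j, \quad y_2 = p_j, \quad \lambda = \lambda_j, \qquad j = 1, \ldots, N. \tag{3.33} $$
Then, the Neumann-like flow $(H_{-1})_D$ on M exactly becomes
$$ y_x = U(m, \lambda)y, \qquad y = (y_1, y_2)^T, \tag{3.34} $$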

which is nothing else but the CH spectral problem (2.2) with the potential function m. Therefore, we can call the canonical Hamiltonian system (3.31) the Neumann-like CH system on M. A long but direct computation leads to the following key equalities: {A− (λ), A− (µ)}D = {B− (λ), B− (µ)}D = 0, 1 + 4 p, p λ µ A− (λ) − A− (µ) {C− (λ), C− (µ)}D = µ λ 2 q, q 4λµ (C− (λ)A− (µ) − C− (µ)A− (λ)) , + 2 q, q


2 2 2 (B− (µ) − B− (λ)) + B− (µ) + B− (λ) µ−λ λ µ 2λµ B− (λ)B− (µ), − 2 q, q 2 2 2 {A− (λ), C− (µ)}D = (C− (λ) − C− (µ)) − C− (µ) − C− (λ) µ−λ λ µ (1 + 4 p, p)λ 2λµ B− (λ)C− (µ) − B− (λ), + 2 q, q 2 2 q, q µ 4 4 4 {B− (λ), C− (µ)}D = (A− (µ) − A− (λ)) + A− (µ) + A− (λ) µ−λ λ µ 4λµ B− (λ)A− (µ). − 2 q, q {A− (λ), B− (µ)}D =

− Let L− 1 (λ) = L− (λ) ⊗ I2×2 , L2 (µ) = I2×2 ⊗ L− (µ), where L− (λ) , L− (µ) are given through Eq. (3.20). In the following, we search for a general 4 × 4 r-matrix − structure r12 (λ, µ) such that the fundamental Dirac-Poisson bracket: " ! − " ! − − (3.35) L− (λ) ⊗, L− (µ) D = r12 (λ, µ) , L− 1 (λ) − r21 (µ, λ) , L2 (µ) holds, where the entries of the 4 × 4 matrix L− (λ) ⊗, L− (µ) D are L− (λ) ⊗, L− (µ) D = L− (λ)km , L− (µ)ln D , k, l, m, n = 1, 2, kl,mn

and

− r21 (λ, µ)

=

− P r12 (λ, µ) P ,



with 



1 1 0  σj ⊗ σ j  =  I2×2 + P = 0 2 j =1 0 3

0 0 1 0

0 1 0 0

 0 0 , 0 1

where σj are the Pauli matrices. Theorem 3.

2λ P + S− µ(µ − λ) is an r-matrix structure satisfying Eq. (3.35), where λ (1 + 4 p, p) 0 0 00 − S = ⊗ 10 10 2µ 2 q, q 2λµ 00 −B− (λ) 0 + 2 ⊗ 2A− (µ) B− (λ) q, q 0 1   0 0 0 0   0 0 0 0   = . 0 0 − 2λµ B (λ) 0 − 2 q,q   λ(1+4p,p) 4λµ 2λµ 0 A B (µ) (λ) 2µ2 q,q 2 q,q − 2 q,q − − r12 (λ, µ) =

(3.36)

Apparently, our r-matrix structure (3.36) is 4 × 4 and is different from the one in Ref. [20].


3.3. Integrability. Because there is an r-matrix structure satisfying Eq. (3.35), & % ! − " ! − " − = r¯12 (3.37) L2− (λ) ⊗, L2− (µ) (λ, µ) , L− 1 (λ) − r¯21 (µ, λ) , L2 (µ) , D

where r¯ij− (λ, µ) =

1 1

L− 1

1−k

1−l k l (λ) L− (µ) rij− (λ, µ) L− (λ) L− 2 1 2 (µ) ,

k=0 l=0

ij = 12, 21. Thus,

& & % % = Tr L2− (λ) ⊗, L2− (µ) = 0. 4 TrL2− (λ) , TrL2− (µ) D

D

(3.38)

So, by Eq. (3.24) we immediately obtain the following theorem.

Theorem 4. The following equalities
$$ \{E_i^-, E_j^-\}_D = 0, \qquad \{H_l, E_j^-\}_D = 0, \qquad \{F_k, E_j^-\}_D = 0, \tag{3.39} $$
$i, j = 1, 2, \ldots, N$, $l = -2, -1, 0, 1$, $k = -1, -2, \ldots$, hold. Hence, the Hamiltonian systems $(H_l)_D$ and $(F_k)_D$ on M,
$$ (H_l)_D:\quad q_x = \{q, H_l\}_D, \quad p_x = \{p, H_l\}_D, \qquad l = -2, -1, 0, 1, \tag{3.40} $$
$$ (F_k)_D:\quad q_{t_k} = \{q, F_k\}_D, \quad p_{t_k} = \{p, F_k\}_D, \qquad k = -1, -2, \ldots, \tag{3.41} $$
are completely integrable.

In particular, we obtain the following results.

Corollary 1. The Hamiltonian system $(H_{-1})_D$ defined by Eq. (3.31) is completely integrable.

Corollary 2. All composition functions $f(H_l, F_k)$, $f \in C^\infty(\mathbb{R})$, $k = -1, -2, \ldots$, are completely integrable Hamiltonians on M.

3.4. Parametric solution of the negative order CH hierarchy restricted onto M. In the following, we consider the relation between the constraint and the nonlinear equations in the negative order CH hierarchy (3.5). Let us start from the following setting:
$$ G_{-1}^1 = \sum_{j=1}^{N}\nabla\lambda_j, \tag{3.42} $$
where $G_{-1}^1 = 1$, and $\nabla\lambda_j = \lambda_jq_j^2$ is the functional gradient of the CH spectral problem (2.2) corresponding to the spectral parameter $\lambda_j$ $(j = 1, \ldots, N)$. Eq. (3.42) reads
$$ \langle\Lambda q, q\rangle = 1. \tag{3.43} $$
After differentiating once with respect to x, we get
$$ \langle\Lambda p, q\rangle = 0, \qquad p = q_x. \tag{3.44} $$
This equality together with Eq. (3.43) forms the symplectic submanifold M we need. Differentiating Eq. (3.44) once more leads to the constraint relation (3.32).
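The invariance of M under the Neumann-like flow can be illustrated numerically. The following sketch is not from the paper: it assumes the reading $\langle\Lambda q, q\rangle = 1$, $\langle\Lambda q, p\rangle = 0$ of the constraints and $m = (1 + 4\langle\Lambda p, p\rangle)/(2\langle\Lambda^2 q, q\rangle)$ for Eq. (3.32), picks hypothetical values of $\lambda_j$ and of the initial data, and integrates $q_x = p$, $p_x = (\frac14 - \frac12 m\lambda_j)q_j$ with a classical fourth-order Runge-Kutta step. It then reports the two constraints and $H_{-1} = \frac12\langle p, p\rangle - \frac18\langle q, q\rangle$ at both ends of the interval.

```python
import math

LAM = [1.0, 2.0, 3.5]  # hypothetical distinct spectral parameters lambda_j

def inner(w, a, b):
    """Weighted inner product sum_j w_j a_j b_j."""
    return sum(wj * aj * bj for wj, aj, bj in zip(w, a, b))

def potential(q, p):
    """m as read from Eq. (3.32): (1 + 4<Lam p,p>) / (2 <Lam^2 q,q>)."""
    lam2 = [l * l for l in LAM]
    return (1.0 + 4.0 * inner(LAM, p, p)) / (2.0 * inner(lam2, q, q))

def rhs(state):
    """Right-hand side of q_x = p, p_x = (1/4 - m*lambda_j/2) q_j."""
    n = len(LAM)
    q, p = state[:n], state[n:]
    m = potential(q, p)
    return list(p) + [(0.25 - 0.5 * m * l) * qj for l, qj in zip(LAM, q)]

def rk4_step(state, h):
    def shifted(s, k, c):
        return [a + c * b for a, b in zip(s, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def invariants(state):
    """Return (<Lam q,q> - 1, <Lam q,p>, H_{-1})."""
    n = len(LAM)
    q, p = state[:n], state[n:]
    ones = [1.0] * n
    energy = 0.5 * inner(ones, p, p) - 0.125 * inner(ones, q, q)
    return inner(LAM, q, q) - 1.0, inner(LAM, q, p), energy

def initial_state():
    """Arbitrary point on M: the last components are solved from the constraints."""
    q1, q2 = 0.5, 0.4
    q3 = math.sqrt((1.0 - LAM[0] * q1 * q1 - LAM[1] * q2 * q2) / LAM[2])
    p1, p2 = 0.3, -0.2
    p3 = -(LAM[0] * q1 * p1 + LAM[1] * q2 * p2) / (LAM[2] * q3)
    return [q1, q2, q3, p1, p2, p3]

def integrate(state, h=1e-3, steps=2000):
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

start = initial_state()
end = integrate(start)
print(invariants(start))
print(invariants(end))
```

Both constraint values should stay at the level of rounding and discretization error, and the third entry should be unchanged between the two printed tuples, which is consistent with $(H_{-1})_D$ being a flow on M.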


Remark 1. Because M defined by Eq. (3.29) is not the usual tangent bundle, i.e. $M \ne TS^{N-1} = \bigl\{(q, p) \in \mathbb{R}^{2N} \bigm| \tilde F \equiv \langle q, q\rangle - 1 = 0,\ \tilde G \equiv \langle q, p\rangle = 0\bigr\}$, and Eq. (3.32) is not the usual Neumann constraint on $TS^{N-1}$, Eq. (3.31) is therefore a new kind of Neumann system. In Subsect. 3.3 we have proven its integrability.

Since the Hamiltonian flows $(H_{-1})_D$ and $(F_k)_D$ on M are completely integrable and their Poisson brackets satisfy $\{H_{-1}, F_k\}_D = 0$ $(k = -1, -2, \ldots)$, their phase flows $g^x_{H_{-1}}$, $g^{t_k}_{F_k}$ commute [6]. Thus, we can define their compatible solution as follows:
$$ \begin{pmatrix} q(x, t_k) \\ p(x, t_k) \end{pmatrix} = g^x_{H_{-1}}\,g^{t_k}_{F_k}\begin{pmatrix} q(x^0, t_k^0) \\ p(x^0, t_k^0) \end{pmatrix}, \qquad k = -1, -2, \ldots, \tag{3.45} $$
where $x^0$, $t_k^0$ are the initial values of the phase flows $g^x_{H_{-1}}$, $g^{t_k}_{F_k}$.

Theorem 5. Let $q(x, t_k)$, $p(x, t_k)$ be a solution of the compatible Hamiltonian systems $(H_{-1})_D$ and $(F_k)_D$ on M. Then
$$ m = \frac{1 + 4\langle\Lambda p(x, t_k), p(x, t_k)\rangle}{2\langle\Lambda^2 q(x, t_k), q(x, t_k)\rangle} \tag{3.46} $$
satisfies the negative order CH hierarchy
$$ m_{t_k} = JL^{k+1}\cdot 1, \qquad k = -1, -2, \ldots, \tag{3.47} $$

where the operators J and L−1 are given by Eqs. (2.5) and (3.6), respectively. Proof. On one hand, the recursion operator L acts on Eq. (3.42) and results in the following: J Lk+1 · G1−1 = J · k+2 q, q = mx k+2 q, q + 4m k+2 q, p 2(1 + 4 p, p) 2 = q, q k+2 q, p − 2 q, p k+2 q, q . 2 2 q, q (3.48) In this procedure, Eqs. (2.4) and (3.31) are used. On the other hand, we directly make the derivative of Eq. (3.46) with respect to tk . Then we obtain 4 2 q, q 2 p, ptk − (1 + 4 p, p) 2 q, qtk mtk = , (3.49) 2 2 q, q where q = q(x, tk ), p = p(x, tk ). But, qtk = {q, Fk }D , ptk = {p, Fk }D , k = −1, −2, . . . ,

(3.50)


where Fk are defined by Eq. (3.28), i.e. qtk =

−1

j +2 q, q k−j p − j +2 q, p k−j q ,

(3.51)

j =k

ptk =

1 + 4 p, p 2 k+1 k+2 q, q q − q, q q 4 2 q, q +

j +2 q, p k−j p − j +2 p, p k−j q .

−1

(3.52)

j =k

Therefore after substituting them into Eq. (3.49) and calculating it, we have mtk =

2(1 + 4 p, p) 2 q, q k+2 q, p − 2 q, p k+2 q, q 2 2 q, q

which completes the proof.

Lemma 1. Let q, p satisfy the integrable Hamiltonian system $(H_{-1})_D$. Then on the symplectic submanifold M, we have
1.
$$ \langle q, q\rangle - 4\langle p, p\rangle = 0. \tag{3.53} $$
2.
$$ u = \langle q(x, t_k), q(x, t_k)\rangle, \qquad k = -1, -2, \ldots, \tag{3.54} $$
satisfies the equation $m = u - u_{xx}$, where m is given by Eq. (3.46), and $q(x, t_k)$, $p(x, t_k)$ is a solution of the compatible integrable Hamiltonian systems $(H_{-1})_D$ and $(F_k)_D$ on M.

Proof.
$$ \bigl(\langle q, q\rangle - 4\langle p, p\rangle\bigr)_x = 2\langle q, p\rangle - 8\langle p, p_x\rangle = 0 $$
completes the proof of the first equality. As for the second one, we have
$$ \begin{aligned} u - u_{xx} &= \langle q, q\rangle - 2\langle p, p\rangle - 2\Bigl\langle q, \frac14 q - \frac12 m\Lambda q\Bigr\rangle \\ &= \frac12\langle q, q\rangle - 2\langle p, p\rangle + m \\ &= m. \end{aligned} $$
In particular, with k = −2, we obtain the following corollary.


Corollary 3. Let q(x, t−2 ), p(x, t−2 ) be a solution of the compatible integrable Hamiltonian systems (H−1 )D and (F−2 )D on M. Then 1 + 4 p(x, t−2 ), p(x, t−2 ) , 2 2 q(x, t−2 ), q(x, t−2 ) u = u(x, t−2 ) = q(x, t−2 ), q(x, t−2 ) ,

m = m(x, t−2 ) = −

(3.55) (3.56)

satisfy the CH equation (3.12). Therefore, u = q(x, t−2 ), q(x, t−2 ), is a solution of the CH equation (3.13) on M. Here H−1 and F−2 are given by 1 1 H−1 = p, p − q, q , 2 8 1 1 1 −1 q, q p, p − q, p2 . F−2 = p, p − −1 q, q + 2 8 2 Proof. Via some direct calculations, we obtain 2(1 + 4 p, p) 2 2 q, q, mt−2 = − q, q p − q, p q . 2 2 q, q And Lemma 1 gives −J · u = −J L−1 · G1−1 = J · < q, q > 2(1 + 4 p, p) 2 =− q, q q, p − 2 q, p q, q 2 2 q, q = mt−2 , which completes the proof.

By Theorem 5, the constraint relation given by Eq. (3.32) is actually a solution of the negative order CH hierarchy (3.5). Thus, we also call the system $(H_{-1})_D$ (i.e. Eq. (3.31)) a negative order restricted CH flow (Neumann-like) of the spectral problem (2.2) on the symplectic submanifold M. All Hamiltonian systems $(F_k)_D$, $k < 0$, $k \in \mathbb{Z}$, are therefore called the negative order integrable restricted flows (Neumann-type) on M.

Remark 2. Of course, we can also consider the integrable Bargmann-like CH system associated with the positive order CH hierarchy (4.1). Please see Sect. 4. A systematic approach to generate new integrable negative order hierarchies can be seen in Ref. [18].

3.5. Comparing parametric solution with peakons. Let us now compare the Neumann-like CH system $(H_{-1})_D$ with the peakon dynamical system. Let $P_j$, $Q_j$ $(j = 1, 2, \ldots, N)$ be the peakon dynamical variables of the CH equation (3.13). Then, with [7]
$$ u(x, t) = \sum_{j=1}^{N} P_j(t)e^{-|x - Q_j(t)|}, \tag{3.57} $$
$$ m(x, t) = \sum_{j=1}^{N} P_j(t)\delta(x - Q_j(t)), \tag{3.58} $$
where $t = t_{-2}$ and $\delta(x)$ is the δ-function, the CH equation (3.13) yields a canonical peaked Hamiltonian system
$$ \begin{cases} \dot Q_j(t) = \dfrac{\partial H}{\partial P_j} = \displaystyle\sum_{k=1}^{N} P_k(t)e^{-|Q_k(t) - Q_j(t)|}, \\[2ex] \dot P_j(t) = -\dfrac{\partial H}{\partial Q_j} = -P_j\displaystyle\sum_{k=1}^{N} P_k(t)\,\mathrm{sgn}\bigl(Q_k(t) - Q_j(t)\bigr)e^{-|Q_k(t) - Q_j(t)|}, \end{cases} \tag{3.59} $$
with
$$ H(t) = \frac12\sum_{i,j=1}^{N} P_i(t)P_j(t)e^{-|Q_i(t) - Q_j(t)|}. \tag{3.60} $$
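The peaked Hamiltonian system (3.59) is easy to integrate numerically, and the Hamiltonian (3.60) provides a built-in check. The sketch below is an illustration only, not the author's procedure: it assumes two peakons with hypothetical positive amplitudes (so that they do not collide and the right-hand side stays smooth), uses a classical fourth-order Runge-Kutta step, and compares H and the total momentum $\sum_j P_j$ before and after the run. The function names are invented for this example.

```python
import math

def sgn(z):
    """Sign function with sgn(0) = 0."""
    return (z > 0) - (z < 0)

def peakon_rhs(Q, P):
    """Right-hand side of the peakon system (3.59)."""
    n = len(Q)
    dQ, dP = [], []
    for j in range(n):
        kernel = [P[k] * math.exp(-abs(Q[k] - Q[j])) for k in range(n)]
        dQ.append(sum(kernel))
        dP.append(-P[j] * sum(sgn(Q[k] - Q[j]) * kernel[k] for k in range(n)))
    return dQ, dP

def hamiltonian(Q, P):
    """H of Eq. (3.60)."""
    n = len(Q)
    return 0.5 * sum(P[i] * P[j] * math.exp(-abs(Q[i] - Q[j]))
                     for i in range(n) for j in range(n))

def rk4_step(Q, P, h):
    def shift(A, dA, c):
        return [a + c * d for a, d in zip(A, dA)]
    k1q, k1p = peakon_rhs(Q, P)
    k2q, k2p = peakon_rhs(shift(Q, k1q, h / 2), shift(P, k1p, h / 2))
    k3q, k3p = peakon_rhs(shift(Q, k2q, h / 2), shift(P, k2p, h / 2))
    k4q, k4p = peakon_rhs(shift(Q, k3q, h), shift(P, k3p, h))
    Qn = [q + h / 6 * (a + 2 * b + 2 * c + d)
          for q, a, b, c, d in zip(Q, k1q, k2q, k3q, k4q)]
    Pn = [p + h / 6 * (a + 2 * b + 2 * c + d)
          for p, a, b, c, d in zip(P, k1p, k2p, k3p, k4p)]
    return Qn, Pn

def simulate(Q, P, h=2e-3, steps=5000):
    for _ in range(steps):
        Q, P = rk4_step(Q, P, h)
    return Q, P

Q0, P0 = [-2.0, 2.0], [2.0, 1.0]   # hypothetical initial positions and amplitudes
Q1, P1 = simulate(Q0, P0)
print(hamiltonian(Q0, P0), hamiltonian(Q1, P1))
print(sum(P0), sum(P1))
```

With these initial data the faster peakon starts behind the slower one, so the run covers an interaction, and both printed pairs should agree up to the integration error.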

In Eq. (3.59), "sgn" means the sign function. The same meaning is used in the following. A natural question is: what is the relationship between the peaked Hamiltonian system (3.59) and the Neumann-like systems (3.31) and $(F_{-2})$? Apparently, the peaked system (3.59) does not include the systems (3.31) and $(F_{-2})$ because the system (3.59) is only concerned with the time part. By Corollary 3, we know that
$$ u = u(x, t) = \langle q(x, t), q(x, t)\rangle = \sum_{j=1}^{N} q_j^2(x, t) \tag{3.61} $$
is a solution of the CH equation (3.13) on M, where we set $t_{-2} = t$, and $q_j(x, t)$, $p_j = \frac{\partial q_j(x, t)}{\partial x}$ satisfy the two integrable commuting systems $(H_{-1})_D$, $(F_{-2})_D$ on the symplectic submanifold M. Assume $P_j(t)$, $Q_j(t)$ are solutions of the peaked system (3.59); we make the following transformation (when $P_j(t) < 0$, we use $\sqrt{P_j(t)} = i\sqrt{-P_j(t)}$):
$$ q_j(x, t) = \sqrt{P_j(t)}\,e^{-\frac12|x - Q_j(t)|}, \tag{3.62} $$
which implies
$$ p_j(x, t) = \frac{d}{dx}q_j(x, t) = -\frac12\,\mathrm{sgn}(x - Q_j(t))\,q_j. \tag{3.63} $$
Hence, we obtain
$$ \frac{d}{dx}p_j(x, t) = \frac{d^2}{dx^2}q_j(x, t) = -\delta(x - Q_j(t))\,q_j + \frac14 q_j. \tag{3.64} $$
However, on M we have the constraints $\langle\Lambda q, q\rangle = 1$, $\langle\Lambda q, p\rangle = 0$, which imply
$$ \sum_{j=1}^{N}\lambda_jq_j^2\,\delta(x - Q_j) = \frac12, \qquad \forall x \in \mathbb{R}. \tag{3.65} $$
This equality is obviously not true! So, the CH peakon system (3.59) does not coincide with the nonlinearized CH spectral problem (3.31).


Let us now compute the derivative with respect to t. Inserting the peakon system (3.59), we get
$$ \dot q_j(x, t) = \frac{d}{dt}q_j(x, t) = \frac12 q_j\sum_{k=1}^{N} P_k(t)\bigl[\mathrm{sgn}(x - Q_j(t)) - \mathrm{sgn}\bigl(Q_k(t) - Q_j(t)\bigr)\bigr]e^{-|Q_k(t) - Q_j(t)|}. \tag{3.66} $$
On the other hand, from the Neumann-like system $(F_{-2})_D$ we have
$$ \dot q_j(x, t) = \{q_j, F_{-2}\}_D = \lambda_j^{-1}p_j - \langle q, p\rangle q_j = \frac12 q_j\Bigl(\sum_{k=1}^{N} P_k(t)\,\mathrm{sgn}(x - Q_k(t))\,e^{-|x - Q_k(t)|} - \lambda_j^{-1}\,\mathrm{sgn}(x - Q_j(t))\Bigr). \tag{3.67} $$
Apparently, (3.66) and (3.67) agree only when $x = Q_j(t)$; for other x they are not equal. Thus, the peakon system (3.59) is not the Neumann-like Hamiltonian system $(F_{-2})_D$, either. So, by the above analysis, we conclude: the two solutions (3.57) and (3.61) of the CH equation (3.13) are not gauge equivalent. In the next subsection we will concretely solve Eq. (3.61) on M in the form of Riemann-Theta functions.

3.6. Algebro-geometric solution of the CH equation on M. Now, let us reconsider the Hamiltonian system $(H_{-1})_D$ on M under the substitution $\lambda \to \lambda^{-1}$, $\lambda_j \to \lambda_j^{-1}$ (here we choose non-zero λ, $\lambda_j$). Then, the Lax matrix (3.20) has the following simple form:
$$ L_-(\lambda) = -\lambda^3L_{CH}(\lambda), \tag{3.68} $$

where

N

λ−1 LCH (λ) = + λ − λj j =1 ACH (λ) BCH (λ) ≡ , CCH (λ) −ACH (λ)

0 −λ−1 1 −1 −4λ 0

)

pj qj −qj2 pj2 pj qj

*

(3.69)

and the symplectic submanifold M becomes 1 −1 2N −1 q, q − 1 = 0, G ≡ q, p = 0 , M = (q, p) ∈ R F ≡ 2 (3.70) −1 where −1 = diag(λ−1 1 , . . . , λN ).


A direct calculation yields the following theorem. Theorem 6. On the symplectic submanifold M the Hamiltonian system (H−1 )D defined by Eq. (3.31) has the Lax representation: ∂ LCH = [MCH , LCH ], ∂x where

) MCH =

0 1 4

− 1+4p,p λ−1 42 q,q

(3.71) * 1 0 .

(3.72)

Notice. On M the Hamiltonian system (H−1 )D is 2N − 2-dimensional, that is, there only exist 2N − 2 independent dynamical variables in all 2N variables q1 , . . . , qN ; p1 , . . . , pN . Without loss of generality, we assume the Hamiltonian system (H−1 )D has the independent dynamical variables q1 , . . . , qN−1 ; p1 , . . . , pN−1 (N > 1) on M. Then, on M we have N−1

2 qN = λN −

j =1

pN = −

N−1

j =1

λN 2 q , λj j

(3.73)

λN qj p j , λj q N

(3.74)

where qN in the latter is given by the former in terms of q1 , . . . , qN−1 . We will concretely give the expression u = q(x, t), q(x, t) , t = t−2 in an explicit form. By Eq. (3.73), rewrite the entry BCH (λ) in the Lax matrix (3.69) as   N−1

(λj − λN )λ−1 1 j 1 + BCH (λ) = − qj2  λ − λN λ − λj j =1

Q (λ) ≡− , K (λ)

(3.75)

where Q(λ) =

N−1 +

λ − λj +

j =1

K (λ) =

N +

N−1

λj − λ N 2 qj λj

j =1

N−1 +

(λ − λk ) ,

(3.76)

k=1,k=j

(λ − λα ) .

(3.77)

α=1

Apparently, Q(λ) is a N − 1 (N > 1) order polynomial of λ. Choosing its N − 1 distinct zero points µ1 , . . . , µN−1 , we have Q (λ) =

N−1 +

λ − µj ,

(3.78)

j =1

q, q =

N

α=1

λα −

N−1

j =1

µj .

(3.79)


Additionally, choosing λ = λj in Eqs. (3.76) and (3.78) leads to an explicit form of qj in terms of µl : qj2 = αj N

N −1 +

λj

l=1

k=1 (λj

(λj − µl ), αj N = ,N

− λk )

(3.80)

,

which is similar to the result in Ref. [2]. By Eq. (3.79), we get an identity about µl : N

αj N

j =1

N−1 +

N

l=1

α=1

(λj − µl ) =

N−1

λα −

µj .

j =1

Remark 3. The dynamical variable pj corresponding to qj is pj =

N−1 dqj αj N dµk =− dx 2qj dx k=1

N−1 +

(λj − µl ),

(3.81)

l=1,l=k

therefore,  pj2 =

αj N

4

,N−1 l=1

(λj − µl )



N−1

k=1

dµk dx

N−1 +

2 (λj − µl ) .

Substituting Eqs. (3.79) and (3.82) into the Hamiltonian H−1 = directly gives an expression in terms of µl :

H−1

 N N−1

dµk αj N 1

 = ,N−1 8 dx l=1 (λj − µl ) j =1

k=1

N−1 +

(3.82)

l=1,l=k

2 (λj − µl ) +

l=1,l=k

1 2

p, p −

1 8

q, q

N−1 N 1

1

µj − λk . 8 8 j =1

k=1

(3.83) This is evidently different from the Hamiltonian function in Ref. [2] (see there Sect. 3). Here our H−1 comes from the nonlinearized CH spectral problemn, i.e. it is composing of a Neumann-like system on M, which is shown integrable in subsection 3.3 by r-matrix process. It is because of this difference that our parametric solution does not include the peakons (also see last subsection) and we will in the following procedure present a class of new algebro-geometric solution for the CH equation constrained on the symplectic submanifold M. Let

$$ \pi_j = A_{CH}(\mu_j), \qquad j = 1, \ldots, N - 1, \tag{3.84} $$
then it is easy to prove the following proposition.

Proposition 1.
$$ \{\mu_i, \mu_j\}_D = \{\pi_i, \pi_j\}_D = 0, \qquad \{\pi_j, \mu_i\}_D = \delta_{ij}, \qquad i, j = 1, 2, \ldots, N - 1, \tag{3.85} $$
i.e. $\pi_j$, $\mu_j$ are conjugate, and thus they are variables which can be separated [21].


Write − det LCH (λ) = A2CH (λ) + BCH (λ) CCH (λ) ) * N 1 1 Eα = 2 + λ 4 λ − λα α=1 * ) N−1 1 1 (λα − λN )λ−1 α = Eα + λ(λ − λN ) 4 λ − λα α=1

P (λ) ≡ , λK (λ)

(3.86)

where Eα is defined by 1 Eα = −pα2 + qα2 − α , α = 4

N

k=1,k=α

N−1

λj − λ N N−1 1 + P (λ) = qj2 λ − λj + 4 λj j =1

j =1

(pα qk − qα pk )2 , λα − λ k N−1 +

(λ − λk ) ,

(3.87)

(3.88)

k=1,k=j

and obviously P (λ) is an N − 1 order polynomial of λ whose first term’s coefficient is 1 4 . Then we have πj2

P µj , j = 1, . . . , N − 1. = µj K µj

Notice. On M we always have are independent. Then,

N

−1 α=1 λα Eα

EN =

(3.89)

= 41 . Therefore, we assume E1 , . . . , EN−1

N−1

λN 1 λN − Eα . 4 λα α=1

Actually in Eq. (3.86) we already used this fact. Now, we choose the generating function W =

N−1

Wj µj , {Eα }N−1 α=1

j =1

=

N−1

µj j =1

µ0

P (λ) dλ, λK (λ)

(3.90)

where µ0 is an arbitrarily given constant. Let us view Eα (α = 1, . . . , N − 1) as action variables, then angle-coordinates Qα are chosen as Qα =

∂W , ∂Eα

α = 1, . . . , N − 1,


i.e. Qα =

N−1

µk k=1

≡

µ0

K(λ) λα − λN dλ · 2 λK (λ) P (λ) λα (λ − λα )(λ − λN ) √

N−1 λα − λN µk ω˜ α , λα µ0

(3.91)

k=1

where

,N−1

k=α,k=1 (λ − λk )

ω˜ α =

dλ, α = 1, . . . , N − 1. √ 2 λK (λ) P (λ) Therefore, on the symplectic submanifold M 2N−2 , dEα ∧ dQα the Hamiltonian function N 1 1 1

p, p − q, q = − Eα 2 8 2

H−1 =

α=1

1 1 = − λN − 8 2

N−1

α=1

λα − λ N Eα λα

(3.92)

produces a linearized x-flow of the CH equation λα −λN −1 Qα,x = ∂H ∂Eα = − 2λα ; Eα,x = 0,

(3.93)

as well, the Hamiltonian function F−2 =

N 1

1 1 1 p, p − q, q + q, q p, p − q, p2 = − λα E α 2 8 2 2 α=1

1 1 = − λ2N − 8 2

N−1

λ2α

− λ2N λα

α=1

Eα

yields a linearized t-flow of the CH equation . λ2α −λ2N −2 Qα,t = ∂F ∂Eα = − 2λα , Eα,t = 0.

(3.94)

(3.95)

The above two flows imply 0 λN − λ α / x + (λN + λα )t − 2Q0α , 2λα Eα = constant, α = 1, . . . , N − 1,

Qα =

(3.96) (3.97)

where Q0α is an arbitrarily chosen constant. Therefore we have N−1

µk k=1

µ0

ω˜ α = −

0 1/ x + (λN + λα )t + Q0α . 2

(3.98)


Choose a basic system of closed paths αi , βi (i = 1, . . . , N − 1) of Riemann surface ¯ µ2 = λP (λ) K (λ) with N − 1 handles. ω˜ j (j = 1, . . . , N − 1) are exactly N − 1 : linearly independent holomorphic differentials of the first kind on the Riemann surface ¯ Let ω˜ j be normalized as ωj = N−1 . ˜ l , i.e. ωj satisfy l=1 rj,l ω 1 1 ωj = δij , ωj = Bij , αi

βi

where B = Bij (N−1)×(N−1) is symmetric and the imaginary part ImB of B is a positive definite matrix. By the Riemann Theorem [15] we know: µk satisfy N−1

µk k=1

ω j = φj ,

µ0 d

φj = φj (x, t) =

N−1

l=1

0 1/ rj,l Q0l − x + (λN + λl )t , 2

j = 1, . . . , N − 1, ˜ (P ) = (A (P ) − φ − K) iff µk are the zero points of the Riemann-Theta function which has exactly N − 1 zero points, where A (P ) =

P P0

ω1 , · · · ,

T

P

ωN−1

,

P0

φ = φ (x, t) = (φ1 (x, t) , · · · , φN−1 (x, t))T , K = (K1 , . . . , KN−1 )T ∈ CN−1 is the Riemann constant vector, P0 is an arbitrarily given point on the Riemann surface ¯ (-function and the related properties are seen in the Appendix). Because of [10], 1 1 ˜ (P ) = C1 ¯ , λd ln (3.99) 2πi γ where the constant C1 ¯ has nothing to do with φ; γ is the boundary of a simple connected domain obtained through cutting the Riemann surface ¯ along closed paths αi , βi . Thus, we have a key equality N−1

˜ (P ) − Resλ=∞2 λd ln ˜ (P ) , µk = C1 ¯ − Resλ=∞1 λd ln

k=1

where Resλ=∞k (k = 1, 2) mean the residue at points ∞k : ( d ∞1 = 0, z−1 P z−1 K z−1 , z=0 ( d −1 −1 −1 . K z ∞2 = 0, − z P z z=0

Now, we need to calculate the above two residues.

(3.100)


Lemma 2. ˜ (P ) = −2 Resλ=∞1 λd ln ˜ (P ) = 2 Resλ=∞2 λd ln

∂ ln (φ + K + η1 ) , ∂x

∂ ln (φ + K + η2 ) , ∂x

(3.101) (3.102)

where η1 , η2 are two different N − 1 dimensional constant vectors. ¯ µ2 = λP (λ) K (λ). Proof. Consider the following smooth superelliptic curve : th Because λP (λ) K (λ) is a 2N order polynomial with respect to λ, ∞ is not a branch point, i.e. on ¯ there are two infinity points ∞1 and ∞2 . All points P on ¯ are denoted by (λ, ±µ). On ¯ we choose a group of basis of normalized closed paths α1 , · · · , αN−1 ; β1 , · · · , βN−1 . They are mutually independent, and their intersection number are αi ◦ αj = βi ◦ βj = 0, αi ◦ βj = δij . It is easy to see that ω˜ j (j = 1, . . . , N − 1) are N − 1 linearly independent holomorphic ¯ differential forms on . ˜ (P ) at the two infinity points: Let us now come to calculate the residues of λd ln ˜ (z) produces the following result through ∞1 , ∞2 . At ∞1 , the j th variable Ij of multiplying by −1 (please note that the local coordinate at ∞1 and ∞2 is z = λ−1 ): −Ij = φj + Kj + η1,j −

N−1

0

l=1

= φj + Kj + η1,j +

N−1

N−1

0

= φj + Kj + η1,j +

0

= φj + Kj + η1,j +

z

rj,l 0

l=1 N−1

z

rj,l

l=1 N−1

z

rj,l

l=1

= φj + Kj + η1,j +

z

rj,l

ω˜ l z−2 dz √ 2 λP (λ) K (λ) −1 λ=z z−N − ( N−1 λ − λl )z−N+1 + · · · j =1 j dz √ z−2N + · · · ,N−1

α=l,α=1 (λ − λα )

1 + O (z) dz √ 1 + O (z)

rj,l z + O z2 ,

l=1

where η1,j = Because

2 P0

∞1

ωj , j = 1, · · · , N − 1. N−1 N−1 ∂ 1 ∂ rj,l = ∂x 2 ∂Ij j =1 l=1

˜ (z) has the expansion formula and ˜ (z) = (φ + K + η1 ) −

−1 N−1

N

j =1 l=1

∂ rj,l z + O z2 , ∂Ij


where η1 = (η1,1 , . . . , η1,N−1 )T . Therefore, ˜ (z) = (φ + K + η1 ) − 2

∂ z + O z2 . ∂x

So, we obtain the following residue: ˜ (P ) = Resz=0 z−1 d ln ˜ (z) = Resz=0 Resλ=∞1 λd ln

˜ z (z) 1 ˜ (z) z

1 −2x + O (z) z − 2x z + O z2 1 −2x + O (z) = Resz=0 z 1 − 2−1 x z + O z2 −2x ∂ = = −2 ln (φ + K + η1 ) . ∂x = Resz=0

In a similar way, we can obtain the second residue formula. So, by Eq. (3.79) and this lemma, we immediately have

q (x, t) , q (x, t) =

N

α=1

∂ λα − C1 ¯ + 2 ∂x

(φ + K + η2 ) ln , (φ + K + η1 )

(3.103)

2P where the j th component of ηi (i = 1, 2) is ηi,j = ∞0i ωj . So, the CH equation (3.13) has the following explicit solution, called the algebrogeometric solution: ∂ u(x, t) = R + 2 ∂x where R =

N

α=1 λα

(φ + K + η2 ) ln , (φ + K + η1 )

− C1 ¯ is a constant.

(3.104)

Theorem 7. The algebro-geometric solution of the CH equation (3.13) can be given through Eq. (3.104).

Remark 4. Here the algebro-geometric solution (3.104) is smooth and is expressed through a derivative in the x-direction (the spatial variable), and it apparently differs from the piecewise smooth algebro-geometric solution expressed through t-direction derivatives given in Ref. [2]. It is also different from the results in Refs. [3–5, 14], because we are studying the CH equation constrained on M. In this paper, we do not need to calculate each $q_j$ $(j = 1, \ldots, N - 1)$ but only the sum $\sum_{j=1}^{N-1} q_j^2$, which we know from Eqs. (3.79) and (3.56). From the comparison in the above subsection and these comments, we therefore think that Eq. (3.104) is a new class of solutions of the CH equation (3.13). Apparently, Eq. (3.104) is simpler in form than the one in Ref. [2].


4. Positive Order CH Hierarchy, Integrable Bargmann System, and Parametric Solution

4.1. Positive order CH hierarchy. Let us now give the positive order hierarchy of Eq. (2.2) through employing the kernel element of Lenard's operator J. Obviously, the functions $G_0 = cm^{-\frac12}$ form all kernel elements of J, where $c = c(t_n) \in C^\infty(\mathbb{R})$ is an arbitrarily given function with respect to the time variables $t_n$ $(n \ge 0, n \in \mathbb{Z})$, but independent of the spatial variable x. $G_0$ and the recursion operator (2.6) yield the following hierarchy of the CH spectral problem (2.2):
$$ m_{t_k} = cJL^k\cdot m^{-\frac12}, \qquad k = 0, 1, 2, \ldots, \tag{4.1} $$
where the operators J and L are defined by Eqs. (2.5) and (2.6), respectively, and L and $J^{-1}$ have the further product forms
$$ L = J^{-1}K = \frac12 m^{-\frac12}\partial^{-1}m^{-\frac12}(\partial - \partial^3), \tag{4.2} $$
$$ J^{-1} = \frac12 m^{-\frac12}\partial^{-1}m^{-\frac12}. \tag{4.3} $$
Because Eq. (4.1) is related to the Camassa-Holm spectral problem (2.2) for the case of $k \ge 0$, $k \in \mathbb{Z}$, it is called the positive order Camassa-Holm (CH) hierarchy. Equation (4.1) has the following representative equations:
$$ m_{t_0} = 0 \quad \text{(trivial case)}, \tag{4.4} $$
$$ m_{t_1} = -c\bigl(m^{-\frac12}\bigr)_{xxx} + c\bigl(m^{-\frac12}\bigr)_x. \tag{4.5} $$
With c = −1, Eq. (4.5) becomes a Dym type equation
$$ m_{t_1} = \bigl(m^{-\frac12}\bigr)_{xxx} - \bigl(m^{-\frac12}\bigr)_x, \tag{4.6} $$
which has the extra term $-\bigl(m^{-\frac12}\bigr)_x$ compared with the usual Harry-Dym equation $m_{t_1} = \bigl(m^{-\frac12}\bigr)_{xxx}$. Therefore, Eq. (4.1) gives an extended Dym hierarchy corresponding to the isospectral case: $\lambda_{t_k} = 0$. By Theorem 2, the whole positive order CH hierarchy (4.1) has the zero curvature representation
$$ U_{t_k} - V_{k,x} + [U, V_k] = 0, \qquad k > 0,\ k \in \mathbb{Z}, \tag{4.7} $$
$$ V_k = \sum_{j=0}^{k-1} V_j\lambda^{k-j-1}, \tag{4.8} $$
where U is given by Eq. (2.2), and $V_j = V(G_j)$ is given by Eq. (2.9) with $G = G_j = L^j\cdot G_0 = cL^j\cdot m^{-\frac12}$, $j \ge 0$, $j \in \mathbb{Z}$. Thus, all nonlinear equations in the positive CH hierarchy (4.1) are integrable. Therefore we obtain the following theorem:


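Theorem 8. The positive order CH hierarchy (4.1) possesses the Lax pair

$$ y_x = \begin{pmatrix} 0 & 1 \\ \frac14 - \frac12 m\lambda & 0 \end{pmatrix} y, \qquad
y_{t_k} = \sum_{j=0}^{k-1} \begin{pmatrix} -\frac12 G_{j,x} & -G_j \\ \frac12 G_{j,xx} - \frac14 G_j + \frac12 m\lambda G_j & \frac12 G_{j,x} \end{pmatrix} \lambda^{k-j}\, y, \quad k = 0, 1, 2, \ldots. \tag{4.9} $$

Equation (4.9) can be reduced to the following Lax pair:

$$ \psi_{xx} = \tfrac14\psi - \tfrac12 m\lambda\psi, \qquad
\psi_{t_k} = \sum_{j=0}^{k-1}\left(\tfrac12 G_{j,x}\,\lambda^{k-j}\psi - G_j\,\lambda^{k-j}\psi_x\right), \quad k = 0, 1, \ldots. \tag{4.10} $$

In particular, the Dym type equation (4.6) has the Lax pair

$$ \psi_{xx} = \tfrac14\psi - \tfrac12 m\lambda\psi, \qquad
\psi_{t_1} = -\tfrac12\left(m^{-1/2}\right)_x \lambda\psi + m^{-1/2}\lambda\psi_x. \tag{4.11} $$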

Remark 5. This Lax pair coincides with the one obtained by the usual method of finite power expansion with respect to the spectral parameter λ. Here, however, we derive the positive hierarchy (4.1) mainly from the pair of Lenard's operators satisfying Eq. (2.4). Because Eq. (2.4) contains the spectral gradient ∇λ, this procedure of generating evolution equations from a given spectral problem is called the spectral gradient method. In this method, the determination of a pair of Lenard's operators associated with the given spectral problem depends mainly on the concrete forms of the spectral problem and the spectral gradient, and on some computational techniques. By this method, we have derived both the negative order CH hierarchy and the positive order CH hierarchy.

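4.2. r-matrix structure and integrability for the Bargmann CH system. Consider the following matrix (called the "positive" Lax matrix):

$$ L_+(\lambda) = \begin{pmatrix} A_+(\lambda) & B_+(\lambda) \\ C_+(\lambda) & -A_+(\lambda) \end{pmatrix}, \tag{4.12} $$

where

$$ A_+(\lambda) = \sum_{j=1}^{N} \frac{\lambda_j p_j q_j}{\lambda - \lambda_j}, \tag{4.13} $$

$$ B_+(\lambda) = -\sum_{j=1}^{N} \frac{\lambda_j q_j^2}{\lambda - \lambda_j}, \tag{4.14} $$

$$ C_+(\lambda) = \frac{1}{2\langle \Lambda q, q\rangle} + \sum_{j=1}^{N} \frac{\lambda_j p_j^2}{\lambda - \lambda_j}, \tag{4.15} $$

where, as before, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$ and $\langle\cdot,\cdot\rangle$ is the standard inner product in $\mathbb{R}^N$.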


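We calculate the determinant of $L_+(\lambda)$:

$$ -\tfrac12 \det L_+(\lambda) = \tfrac14\, \mathrm{Tr}\, L_+^2(\lambda) = \tfrac12\left(A_+^2(\lambda) + B_+(\lambda)\, C_+(\lambda)\right) = \sum_{j=1}^{N} \frac{E_j^+}{\lambda - \lambda_j}, \tag{4.16} $$

where Tr stands for the trace of a matrix, and

$$ E_j^+ = -\frac14 \frac{\lambda_j q_j^2}{\langle \Lambda q, q\rangle} - \frac12\, \Gamma_j^+, \quad j = 1, 2, \ldots, N, \tag{4.17} $$

$$ \Gamma_j^+ = \sum_{l \ne j,\ l = 1}^{N} \frac{\lambda_j \lambda_l\, (p_j q_l - p_l q_j)^2}{\lambda_j - \lambda_l}. \tag{4.18} $$

Let

$$ F_k = \sum_{j=1}^{N} \lambda_j^k E_j^+, \quad k = 0, 1, 2, \ldots, \tag{4.19} $$

then it reads

$$ F_k = -\frac14 \frac{\langle \Lambda^{k+1} q, q\rangle}{\langle \Lambda q, q\rangle} + \frac12 \sum_{j=0}^{k-1} \left( \langle \Lambda^{j+1} q, p\rangle \langle \Lambda^{k-j} q, p\rangle - \langle \Lambda^{j+1} q, q\rangle \langle \Lambda^{k-j} p, p\rangle \right), \tag{4.20} $$

where $k = 0, 1, 2, \ldots$. A long but direct computation leads to the following key equalities:

$$ \begin{aligned}
\{A_+(\lambda), A_+(\mu)\} &= \{B_+(\lambda), B_+(\mu)\} = 0, \\
\{C_+(\lambda), C_+(\mu)\} &= \frac{2}{\langle \Lambda q, q\rangle^2} \left(\lambda A_+(\lambda) - \mu A_+(\mu)\right), \\
\{A_+(\lambda), B_+(\mu)\} &= \frac{2}{\mu - \lambda} \left(\mu B_+(\mu) - \lambda B_+(\lambda)\right), \\
\{A_+(\lambda), C_+(\mu)\} &= \frac{2}{\mu - \lambda} \left(\lambda C_+(\lambda) - \mu C_+(\mu)\right) - \frac{1}{\langle \Lambda q, q\rangle^2}\, \lambda B_+(\lambda), \\
\{B_+(\lambda), C_+(\mu)\} &= \frac{4}{\mu - \lambda} \left(\mu A_+(\mu) - \lambda A_+(\lambda)\right).
\end{aligned} $$

Let $L_1^+(\lambda) = L_+(\lambda) \otimes I_{2\times 2}$, $L_2^+(\mu) = I_{2\times 2} \otimes L_+(\mu)$, where $L_+(\lambda)$, $L_+(\mu)$ are given through Eq. (4.12). In the following, we search for a $4 \times 4$ r-matrix structure $r_{12}^+(\lambda, \mu)$ such that the fundamental Poisson bracket [13]:

$$ \left\{ L_+(\lambda) \overset{\otimes}{,} L_+(\mu) \right\} = \left[ r_{12}^+(\lambda, \mu),\, L_1^+(\lambda) \right] - \left[ r_{21}^+(\mu, \lambda),\, L_2^+(\mu) \right] \tag{4.21} $$

holds, where the entries of the $4 \times 4$ matrix $\left\{ L_+(\lambda) \overset{\otimes}{,} L_+(\mu) \right\}$ are

$$ \left\{ L_+(\lambda) \overset{\otimes}{,} L_+(\mu) \right\}_{kl,mn} = \left\{ L_+(\lambda)_{km},\, L_+(\mu)_{ln} \right\}, \quad k, l, m, n = 1, 2, $$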


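and $r_{21}(\lambda, \mu) = P\, r_{12}(\lambda, \mu)\, P$, with

$$ P = \frac12 \left( I_{4\times 4} + \sum_{j=1}^{3} \sigma_j \otimes \sigma_j \right) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, $$

where $\sigma_j$ are the Pauli matrices.

Theorem 9.

$$ r_{12}^+(\lambda, \mu) = \frac{2\lambda}{\mu(\mu - \lambda)}\, P + \frac{\lambda}{\langle \Lambda q, q\rangle^2}\, S^+ \tag{4.22} $$

is an r-matrix structure satisfying Eq. (4.21), where

$$ S^+ = \sigma_- \otimes \sigma_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \otimes \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. $$

In fact, the r-matrix satisfying Eq. (4.21) is not unique [19]. Obviously, this r-matrix structure is also of size $4 \times 4$ and different from the one in Ref. [20]. Because there is an r-matrix structure satisfying Eq. (4.21),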

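$$ \left\{ L_+^2(\lambda) \overset{\otimes}{,} L_+^2(\mu) \right\} = \left[ \bar r_{12}^+(\lambda, \mu),\, L_1^+(\lambda) \right] - \left[ \bar r_{21}^+(\mu, \lambda),\, L_2^+(\mu) \right], \tag{4.23} $$

where

$$ \bar r_{ij}^+(\lambda, \mu) = \sum_{k=0}^{1} \sum_{l=0}^{1} \left(L_1^+\right)^{1-k}(\lambda) \left(L_2^+\right)^{1-l}(\mu)\; r_{ij}^+(\lambda, \mu)\; \left(L_1^+\right)^{k}(\lambda) \left(L_2^+\right)^{l}(\mu), \quad ij = 12, 21. $$

Thus,

$$ 4 \left\{ \mathrm{Tr}\, L_+^2(\lambda),\, \mathrm{Tr}\, L_+^2(\mu) \right\} = \mathrm{Tr} \left\{ L_+^2(\lambda) \overset{\otimes}{,} L_+^2(\mu) \right\} = 0. \tag{4.24} $$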

So, by Eq. (4.16) we immediately obtain the following theorem.

Theorem 10. The following equalities

$$ \{E_i^+, E_j^+\} = 0, \quad \{F_k, E_j^+\} = 0, \quad i, j = 1, 2, \ldots, N,\ k = 0, 1, 2, \ldots, \tag{4.25} $$

hold. Hence, all Hamiltonian systems $(F_k)$

$$ (F_k):\quad q_{t_k} = \{q, F_k\} = \frac{\partial F_k}{\partial p}, \quad p_{t_k} = \{p, F_k\} = -\frac{\partial F_k}{\partial q}, \quad k = 0, 1, 2, \ldots, \tag{4.26} $$

are completely integrable.
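The involutivity in Theorem 10 and the closed form (4.20) can be spot-checked numerically. The following is a minimal sketch, not the author's method: it assumes that $\langle\Lambda q, q\rangle = \sum_j \lambda_j q_j^2$, that the bracket is the canonical one, $\{f, g\} = \sum_k (\partial_{q_k} f\,\partial_{p_k} g - \partial_{p_k} f\,\partial_{q_k} g)$, and it uses central finite differences at one hypothetical sample point. All function names and sample values are illustrative.

```python
import random

def inner(a, b, lam, power):
    """<Lambda^power a, b> with Lambda = diag(lam)."""
    return sum(l ** power * x * y for l, x, y in zip(lam, a, b))

def E_plus(j, q, p, lam):
    """E_j^+ from (4.17)-(4.18)."""
    gamma = sum(
        lam[j] * lam[l] * (p[j] * q[l] - p[l] * q[j]) ** 2 / (lam[j] - lam[l])
        for l in range(len(q)) if l != j
    )
    return -0.25 * lam[j] * q[j] ** 2 / inner(q, q, lam, 1) - 0.5 * gamma

def F_sum(k, q, p, lam):
    """F_k as the weighted sum (4.19)."""
    return sum(lam[j] ** k * E_plus(j, q, p, lam) for j in range(len(q)))

def F_closed(k, q, p, lam):
    """F_k in the closed form (4.20)."""
    s = sum(
        inner(q, p, lam, j + 1) * inner(q, p, lam, k - j)
        - inner(q, q, lam, j + 1) * inner(p, p, lam, k - j)
        for j in range(k)
    )
    return -0.25 * inner(q, q, lam, k + 1) / inner(q, q, lam, 1) + 0.5 * s

def grad(f, q, p, h=1e-5):
    """Central-difference gradients (df/dq, df/dp) of f(q, p)."""
    dq, dp = [], []
    for k in range(len(q)):
        qp = q[:k] + [q[k] + h] + q[k + 1:]
        qm = q[:k] + [q[k] - h] + q[k + 1:]
        dq.append((f(qp, p) - f(qm, p)) / (2 * h))
        pp = p[:k] + [p[k] + h] + p[k + 1:]
        pm = p[:k] + [p[k] - h] + p[k + 1:]
        dp.append((f(q, pp) - f(q, pm)) / (2 * h))
    return dq, dp

def poisson(f, g, q, p):
    """Canonical bracket {f, g} = sum_k (f_q g_p - f_p g_q)."""
    fq, fp = grad(f, q, p)
    gq, gp = grad(g, q, p)
    return sum(a * d - b * c for a, b, c, d in zip(fq, fp, gq, gp))

def max_bracket(n=3, seed=0):
    """Largest |{E_i^+, E_j^+}| at one random sample point."""
    rng = random.Random(seed)
    lam = [1.0 + 1.5 * j for j in range(n)]          # distinct lambda_j
    q = [rng.uniform(0.5, 1.5) for _ in range(n)]
    p = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    Es = [lambda q_, p_, j=j: E_plus(j, q_, p_, lam) for j in range(n)]
    return max(abs(poisson(Es[i], Es[j], q, p))
               for i in range(n) for j in range(n))
```

Calling `max_bracket()` should return a value at the level of finite-difference noise, and `F_sum(k, ...)` should agree with `F_closed(k, ...)` up to rounding; for $k = 0$ both reduce to the constant $-\tfrac14$.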


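Furthermore, we find that the following Hamiltonian function

$$ H^+ = \frac12 \langle p, p\rangle - \frac18 \langle q, q\rangle - \frac{1}{4\langle \Lambda q, q\rangle} \tag{4.27} $$

is involutive with $E_j^+$, $F_k$, i.e.

$$ \{H^+, E_j^+\} = 0, \quad \{H^+, F_k\} = 0, \quad j = 1, 2, \ldots, N,\ k = 0, 1, 2, \ldots. \tag{4.28} $$

Here, $E_j^+$ are $N$ independent functions. Therefore, we obtain the following results.

Corollary 4. The canonical Hamiltonian system $(H^+)$:

$$ (H^+):\quad q_x = \frac{\partial H^+}{\partial p} = p, \qquad p_x = -\frac{\partial H^+}{\partial q} = \frac14\, q - \frac{1}{2\langle \Lambda q, q\rangle^2}\, \Lambda q \tag{4.29} $$

is completely integrable.

Corollary 5. All composition functions $f(H^+, F_k^+)$, $f \in C^\infty(\mathbb{R})$, $k = 0, 1, 2, \ldots$, are completely integrable Hamiltonians.

Let

$$ m = \frac{1}{\langle \Lambda q, q\rangle^2}, \tag{4.30} $$

$$ \psi = q_j, \quad \lambda = \lambda_j, \quad j = 1, \ldots, N. \tag{4.31} $$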

Then, the integrable flow $(H^+)$ defined by Eq. (4.29) becomes exactly the CH spectral problem (2.1) with the potential function $m$.

Remark 6. Equation (4.30) is a Bargmann constraint in the whole space $\mathbb{R}^{2N}$. Therefore, the Hamiltonian system $(H^+)$ is of integrable Bargmann type.

4.3. Parametric solution of the positive order CH hierarchy. In the following, we shall consider the relation between the constraint and the nonlinear equations in the positive order CH hierarchy (4.1). Let us start from the following setting:

$$ G_0 = -\sum_{j=1}^{N} \nabla\lambda_j, \tag{4.32} $$

where $G_0 = -m^{-1/2}$, and $\nabla\lambda_j = \lambda_j q_j^2$ is the functional gradient of the CH spectral problem (2.2) corresponding to the spectral parameter $\lambda_j$ ($j = 1, \ldots, N$). Apparently Eq. (4.32) reads

$$ m = \frac{1}{\langle \Lambda q, q\rangle^2}, \tag{4.33} $$

which coincides with the constraint relation (4.30).
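The link between the flow $(H^+)$ and the spectral problem can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original argument: it takes $H^+ = \tfrac12\langle p,p\rangle - \tfrac18\langle q,q\rangle - \tfrac{1}{4\langle\Lambda q,q\rangle}$ with $\langle\Lambda q, q\rangle = \sum_j \lambda_j q_j^2$, differentiates it by central differences, and compares $p_x = -\partial H^+/\partial q$ with the right-hand side of $\psi_{xx} = \tfrac14\psi - \tfrac12 m\lambda\psi$ under the constraint $m = 1/\langle\Lambda q, q\rangle^2$, $\psi = q_j$, $\lambda = \lambda_j$. Function names and sample values are hypothetical.

```python
def H_plus(q, p, lam):
    """H^+ = <p,p>/2 - <q,q>/8 - 1/(4 <Lambda q, q>)."""
    lqq = sum(l * x * x for l, x in zip(lam, q))
    return (0.5 * sum(y * y for y in p)
            - 0.125 * sum(x * x for x in q)
            - 0.25 / lqq)

def p_x_numeric(q, p, lam, h=1e-6):
    """p_x = -dH^+/dq, evaluated by central differences."""
    out = []
    for k in range(len(q)):
        qp = q[:k] + [q[k] + h] + q[k + 1:]
        qm = q[:k] + [q[k] - h] + q[k + 1:]
        out.append(-(H_plus(qp, p, lam) - H_plus(qm, p, lam)) / (2 * h))
    return out

def p_x_spectral(q, lam):
    """Right-hand side psi/4 - m*lambda*psi/2 with psi = q_j, lambda = lam_j."""
    m = 1.0 / sum(l * x * x for l, x in zip(lam, q)) ** 2
    return [0.25 * x - 0.5 * m * l * x for l, x in zip(lam, q)]
```

Since $q_x = p$, agreement of the two lists means that each component $q_j$ satisfies $q_{j,xx} = \tfrac14 q_j - \tfrac12 m \lambda_j q_j$ along the flow.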


Since the Hamiltonian flows $(H^+)$ and $(F_k)$ are completely integrable and their Poisson brackets $\{H^+, F_k\} = 0$ ($k = 0, 1, 2, \ldots$), their phase flows $g_{H^+}^{x}$, $g_{F_k}^{t_k}$ commute [6]. Thus, we can define their compatible solution as follows:

$$ \begin{pmatrix} q(x, t_k) \\ p(x, t_k) \end{pmatrix} = g_{H^+}^{x}\, g_{F_k}^{t_k} \begin{pmatrix} q(x^0, t_k^0) \\ p(x^0, t_k^0) \end{pmatrix}, \quad k = 0, 1, 2, \ldots, \tag{4.34} $$

where $x^0$, $t_k^0$ are the initial values of the phase flows $g_{H^+}^{x}$, $g_{F_k}^{t_k}$.

Theorem 11. Let $q(x, t_k)$, $p(x, t_k)$ be a solution of the compatible Hamiltonian systems $(H^+)$ and $(F_k)$. Then

$$ m = \frac{1}{\langle \Lambda q(x, t_k), q(x, t_k)\rangle^2} \tag{4.35} $$

satisfies the positive order CH equation

$$ m_{t_k} = -J L^k \cdot m^{-1/2}, \quad k = 0, 1, 2, \ldots, \tag{4.36} $$

where the operators $J$ and $L$ are given by Eqs. (2.5) and (2.6), respectively.

Proof. This proof is similar to the negative case.

In particular, with $k = 1$, we obtain the following corollary.

Corollary 6. Let $q(x, t_1)$, $p(x, t_1)$ be a solution of the compatible integrable Hamiltonian systems $(H^+)$ and $(F_1)$. Then

$$ m = m(x, t_1) = \frac{1}{\langle \Lambda q(x, t_1), q(x, t_1)\rangle^2} \tag{4.37} $$

is a solution of the Dym type equation (4.6). Here $H^+$ and $F_1$ are given by

$$ H^+ = \frac12 \langle p, p\rangle - \frac18 \langle q, q\rangle - \frac{1}{4\langle \Lambda q, q\rangle}, \qquad
F_1 = -\frac{\langle \Lambda^2 q, q\rangle}{4\langle \Lambda q, q\rangle} + \frac12 \left( \langle \Lambda q, p\rangle^2 - \langle \Lambda q, q\rangle \langle \Lambda p, p\rangle \right). $$

By Theorem 11, the Bargmann constraint given by Eq. (4.30) is actually a solution of the positive order CH hierarchy (4.1). Thus, we also call the system $(H^+)$ (i.e. Eq. (4.29)) a positive order constrained CH flow (i.e. of Bargmann type) of the spectral problem (2.2). All Hamiltonian systems $(F_k)$, $k \ge 0$, $k \in \mathbb{Z}$, are therefore called the positive order integrable Bargmann flows in the whole $\mathbb{R}^{2N}$. In a further procedure, we can also discuss the algebro-geometric solutions for the positive order CH hierarchy.

Acknowledgements. The author would like to express his sincere thanks to Dr. Darryl Holm and Dr. Roberto Camassa for their fruitful discussions and also to Dr. Fritz Gesztesy for his good suggestions. This work was supported by the U.S. Department of Energy under contracts W-7405-ENG-36 and the Applied Mathematical Sciences Program KC-07-01-01; the Special Grant of National Excellent Doctoral Dissertation of China; and also the Doctoral Programme Foundation of the Institution of High Education of China.


Appendix. Abel Mapping and the Θ-Function

1. If the genus of a Riemann surface is $g$, this surface is homeomorphic to a sphere with $g$ handles. A basic system of closed paths (or contours) $\alpha_1, \ldots, \alpha_g, \beta_1, \ldots, \beta_g$ can be chosen such that the only intersections among them are those of $\alpha_i$ and $\beta_i$ with the same number $i$. Let the Riemann surface be covered with charts $(U_i, z_i)$, where $z_i$ are local parameters in open domains $U_i$, the transition from $z_i$ to $z_j$ in intersections $U_i \cap U_j$ being holomorphic. If in any $U_i$ a differential $\varphi_i(z_i)\,dz_i$ with meromorphic $\varphi_i(z_i)$ is given and in the common parts $U_i \cap U_j$, $\varphi_i(z_i)\,dz_i = \varphi_j(z_j)\,dz_j$, then we say that there is an Abel differential $\Omega$ on the whole surface with restrictions $\Omega|_{U_i} = \varphi_i(z_i)\,dz_i$. The Abel differential is of the first kind if all the $\varphi_i(z_i)$ are holomorphic. There are exactly $g$ linearly independent differentials of the first kind $\omega_1, \ldots, \omega_g$. They are normed if $\oint_{\alpha_i} \omega_j = \delta_{ij}$, which condition determines them uniquely. We shall always assume them normed. The numbers $\oint_{\beta_i} \omega_j = B_{ij}$ are called β-periods. The matrix $B = (B_{ij})$ has the following properties: 1) $B_{ij} = B_{ji}$, 2) $\tau = \mathrm{Im}\, B$ is a positive definite matrix.

We consider a $g$-dimensional vector $A(P) = \left( \int_{P_0}^{P} \omega_j \right)$, where $P_0$ is a fixed point of the Riemann surface and $P$ is an arbitrary point. This vector is not uniquely determined, but depends on the path of integration. If the latter is changed then a linear combination of α- and β-periods with integer coefficients can be added: $(A(P))_j \to (A(P))_j + \sum_{i=1}^{g} n_i \delta_{ij} + \sum_{i=1}^{g} m_i B_{ij}$, i.e. $A(P) \to A(P) + \sum n_i \delta_i + \sum m_i B_i$, where $\delta_i$ is the vector with coordinates $\delta_{ij}$ and $B_i$ is the vector with coordinates $B_{ij}$. Thus $A(P)$ determines a mapping of the Riemann surface on the torus $J = \mathbb{C}^g / T$, where $T$ is the lattice generated by the $2g$ vectors $\{\delta_i, B_i\}$ (which are linearly independent over $\mathbb{R}$). This mapping is called the Abel mapping, and the torus $J$ is the Jacobi manifold or the Jacobian of the Riemann surface.

The Abel mapping extends by linearity to the divisors:

$$ A\left( \sum n_k P_k \right) = \sum n_k A(P_k). $$

2. Abel Theorem. Those and only those divisors go to zero of the Jacobian by the Abel mapping which are principal. The latter means that they are divisors of zeros and poles of meromorphic functions on the surface. If $P_k$ is a zero of the function, then $n_k > 0$ and $n_k$ is the degree of this zero. If $P_k$ is a pole, then $n_k < 0$ and $|n_k|$ is the degree of this pole. Of special interest is the case of divisors of degree $g$ with all $n_k = 1$, i.e. of non-ordered sets of $g$ points $P_1, \ldots, P_g$ of the Riemann surface. All the sets of such kind form the symmetrical $g$th power of the Riemann surface. The Abel mapping has the form

$$ A(P_1, \ldots, P_g) = \left( \sum_{i=1}^{g} \int_{P_0}^{P_i} \omega_j \right), \quad j = 1, \ldots, g. $$

The symmetrical $g$th power is a complex manifold of complex dimension $g$ and it is mapped on the Jacobian, which is a manifold of the same dimension. A problem of inversion arises (the Jacobi inverse problem). It can be solved with the help of the Θ-function.


3. For arbitrary $P \in \mathbb{C}^g$, let

$$ \Theta(P) = \sum_{Z \in \mathbb{Z}^g} \exp\left\{ \pi i\,(BZ, Z) + 2\pi i\,(P, Z) \right\}, $$

$$ (BZ, Z) = \sum_{i,j=1}^{g} B_{ij} z_i z_j, \quad Z = (z_1, \ldots, z_g)^T, \qquad (P, Z) = \sum_{i=1}^{g} p_i z_i, \quad P = (p_1, \ldots, p_g)^T. $$

The series converges owing to the properties of the matrix $B$. The Θ-function has the properties

$$ \Theta(-P) = \Theta(P), \qquad \Theta(P + \delta_k) = \Theta(P), \qquad \Theta(P + B_k) = \Theta(P) \exp\left\{ -\pi i\,(B_{kk} + 2p_k) \right\}. $$

Note that the Θ-function is not defined on the Jacobian because of the latter property.

4. Riemann Theorem. There are constants $K = \{k_i\}$, $i = 1, \ldots, g$ (Riemann constants) determined by the Riemann surface such that the set of points $P_1, \ldots, P_g$ is a solution of the system of equations

$$ \sum_{i=1}^{g} \int_{P_0}^{P_i} \omega_j = l_j, \quad L = \{l_j\} \in J, \quad j = 1, \ldots, g, $$

if and only if $P_1, \ldots, P_g$ are the zeros of the function $\tilde\Theta(P) = \Theta(A(P) - L - K)$ (which has exactly $g$ zeros). Note that while the function $\tilde\Theta(P)$ is not uniquely determined on the Riemann surface (it is multivalued), its zeros are uniquely determined, since distinct branches of $\tilde\Theta(P)$ differ by exponential factors.

5. We define now Abel differentials of the second and of the third kind. The Abel differential of the second kind, $\Omega_P^{(k)}$, $k = 1, 2, \ldots$, has its only singularity at the point $P$, which is a pole of order $k + 1$. The differential can be represented at this point as $d z^{-k} + {}$(holomorphic differential), $z$ being the local parameter at this point. Such a differential is uniquely determined if it is normed: $\oint_{\alpha_i} \Omega_P^{(k)} = 0$, $\forall i$. The Abel differential of the third kind $\Omega_{PQ}$ has only singularities which are simple poles at the points $P$ and $Q$ with the residues $+1$ and $-1$, respectively. It is uniquely determined by the same condition.

6. Proposition. If $z$ is a local parameter in a neighbourhood of the point $P$ and $\omega_i = \varphi_i(z)\,dz$ is the Abel differential of the first kind, then

$$ \frac{1}{2\pi i} \oint_{\beta_i} \Omega_P^{(k)} = -\frac{1}{(k-1)!} \left. \frac{d^{k-1}}{dz^{k-1}} \varphi_i(z) \right|_{z=0}, \quad i = 1, \ldots, g, $$

and

$$ \frac{1}{2\pi i} \oint_{\beta_i} \Omega_{PQ} = \int_{Q}^{P} \omega_i, \quad i = 1, \ldots, g, $$

which is also seen in Ref. [11].
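The symmetry and quasi-periodicity properties of the Θ-function listed in item 3 can be checked numerically in the simplest case. The sketch below is only an illustration and makes two assumptions that are not in the text: it restricts to genus $g = 1$, so that $B$ is a single complex number with $\mathrm{Im}\,B > 0$, and it truncates the series to $|z| \le 30$. The sample values of $P$ and $B$ are hypothetical.

```python
import cmath

def theta(P, B, terms=30):
    """Genus-one Theta-function: sum over integers z of exp(pi*i*B*z^2 + 2*pi*i*P*z)."""
    return sum(cmath.exp(1j * cmath.pi * (B * z * z + 2 * P * z))
               for z in range(-terms, terms + 1))

def check_properties(P=0.17 - 0.05j, B=0.3 + 1.2j):
    """Return the residuals of the three properties of item 3 for g = 1."""
    t = theta(P, B)
    even = abs(theta(-P, B) - t)                       # Theta(-P) = Theta(P)
    period = abs(theta(P + 1, B) - t)                  # Theta(P + delta) = Theta(P)
    quasi = abs(theta(P + B, B)
                - t * cmath.exp(-1j * cmath.pi * (B + 2 * P)))  # shift by B
    return even, period, quasi
```

All three residuals returned by `check_properties()` should be at the level of rounding error. The third one illustrates why Θ is not a function on the Jacobian itself: a shift by the period $B$ multiplies it by a nonconstant exponential factor.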


References

1. Ablowitz, M.J., Segur, H.: Soliton and the Inverse Scattering Transform. Philadelphia: SIAM, 1981
2. Alber, M.S., Camassa, R., Fedorov, Y.N., Holm, D.D., Marsden, J.E.: The complex geometry of weak piecewise smooth solutions of integrable nonlinear PDE's of shallow water and Dym type. Commun. Math. Phys. 221, 197–227 (2001)
3. Alber, M.S., Camassa, R., Holm, D.D., Marsden, J.E.: The geometry of peaked solitons and billiard solutions of a class of integrable PDEs. Lett. Math. Phys. 32, 137–151 (1994)
4. Alber, M.S., Fedorov, Y.N.: Wave solution of evolution equations and Hamiltonian flows on nonlinear subvarieties of generalized Jacobians. J. Phys. A: Math. Gen. 33, 8409–8425 (2000)
5. Alber, M.S., Fedorov, Y.N.: Algebraic geometrical solutions for certain evolution equations and Hamiltonian flows on nonlinear subvarieties of generalized Jacobians. Inv. Prob. 17 (2001), to appear
6. Arnol'd, V.I.: Mathematical Methods of Classical Mechanics. Berlin: Springer-Verlag, 1978
7. Camassa, R., Holm, D.D.: An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 71, 1661–1664 (1993)
8. Calogero, F.: An integrable Hamiltonian system. Phys. Lett. A 201, 306–310 (1995)
9. Cao, C.W.: Nonlinearization of Lax system for the AKNS hierarchy. Sci. China A (in Chinese) 32, 701–707 (1989); also see English Edition: Nonlinearization of Lax system for the AKNS hierarchy. Sci. Sin. A 33, 528–536 (1990)
10. Dickey, L.A.: Soliton Equations and Hamiltonian Systems. Singapore: World Scientific, 1991
11. Dubrovin, B.: Riemann surfaces and nonlinear equations. I. Moscow: MGU, 1986
12. Dubrovin, B.: Theta-functions and nonlinear equations. Russ. Math. Surv. 36, 11–92 (1981)
13. Faddeev, L.D., Takhtajan, L.A.: Hamiltonian Methods in the Theory of Solitons. Berlin: Springer-Verlag, 1987
14. Gesztesy, F., Holden, H.: Algebraic-geometric solutions of the Camassa-Holm hierarchy. Private communication, to appear in Revista Mat. Iberoamericana
15. Griffiths, P., Harris, J.: Principles of Algebraic Geometry. New York: Wiley, 1978
16. Newell, A.C.: Soliton in Mathematical Physics. Philadelphia: SIAM, 1985
17. Novikov, S.P., Manakov, S.V., Pitaevskii, L.P., Zakharov, V.E.: Theory of Solitons. The Inverse Scattering Method. New York: Plenum, 1984
18. Qiao, Z.J., Cao, C.W., Strampp, W.F.: Category of nonlinear evolution equations, algebraic structure, and r-matrix. J. Math. Phys. 44, 701–722 (2003)
19. Qiao, Z.J.: Generalized r-matrix structure and algebro-geometric solutions for integrable systems. Rev. Math. Phys. 13, 545–586 (2001)
20. Ragnisco, O., Bruschi, M.: Peakons, r-matrix and Toda Lattice. Physica A 228, 150–159 (1996)
21. Sklyanin, E.K.: Separation of variables. Prog. Theor. Phys. Suppl. 118, 35–60 (1995)
22. Suris, Y.B.: A discrete time peakons lattice. Phys. Lett. A 217, 321–329 (1996)

Communicated by P. Constantin

Commun. Math. Phys. 239, 343–382 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0875-8

Communications in

Mathematical Physics

Landau-de Gennes Model of Liquid Crystals and Critical Wave Number

Xing-Bin Pan¹,²

¹ Department of Mathematics, National University of Singapore, Singapore 119260, Singapore
² Department of Mathematics, Zhejiang University, Hangzhou 310027, P.R. China. E-mail: [email protected]

Received: 31 March 2002 / Accepted: 3 March 2003 Published online: 16 June 2003 – © Springer-Verlag 2003

Abstract: We study some variational problems for the Landau-de Gennes functional, which describes phase transitions of liquid crystals, under Neumann or Dirichlet boundary conditions. We investigate the effect of the parameters, especially the chirality and the wave number, on the behavior of the minimizers. In order to describe bifurcation of a smectic phase from a nematic phase we introduce the critical wave number $Q_{c_3}$ and give various estimates. We examine the behavior of minimizers with small chirality and large elastic coefficients.

1. Introduction

In this paper we study the variational problems of the Landau-de Gennes model which describes phase transitions of liquid crystals. Our main focus is the effect of the physical parameters in the model on the behavior of minimizers. This work was motivated by Lin [L1] and Aviles-Giga [AG1,2] on the phase transitions of liquid crystals, and by Calderer [C] and Bauman-Calderer-Liu-Phillips [BCLP] on the Landau-de Gennes model.

According to de Gennes's theory [dG2, dGP], the state of a liquid crystal can be described (at least for temperatures close to the transition point) by a complex-valued function $\Psi$ called the order parameter, a unit vector field $n$ called the director field, and a real number $q$ called the wave number (or wave length parameter), which depends on the material and temperature. $\Psi = 0$ for a nematic phase, and $\Psi \neq 0$ for a smectic phase. $(\Psi, n)$ minimizes the energy functional called the Landau-de Gennes energy (see [dGP, C, BCLP]):

$$ \mathcal{L}[\Psi, n] = \int_\Omega \left\{ c\,|\nabla_{qn} \Psi|^2 + F_A(|\Psi|) + F_N(n, \nabla n) \right\} dx, $$

where $\Omega$ is the region occupied by the liquid crystals, $F_A(|\Psi|)$ and $F_N(n, \nabla n)$ denote the smectic energy density and the nematic Oseen-Frank energy density respectively, $c$ is a real constant, and $\nabla_{qn} = \nabla - iqn$. In the physics literature, the smectic energy density is often taken to be a polynomial in $|\Psi|^2$ such as

$$ F_A(|\Psi|) = r|\Psi|^2 + \frac{u}{2} |\Psi|^4, $$

where $r < 0$ and $u > 0$. The Oseen-Frank energy density is given by

$$ F_N(n, \nabla n) = K_1 |\mathrm{div}\, n|^2 + K_2 |n \cdot \mathrm{curl}\, n + \tau|^2 + K_3 |n \wedge \mathrm{curl}\, n + \sigma(n)|^2 + (K_2 + K_4)\left[ \mathrm{tr}(\nabla n)^2 - (\mathrm{div}\, n)^2 \right], $$

where $K_j$, $j = 1, 2, 3, 4$, called elastic coefficients, are material constants, among them $K_1$, $K_2$ and $K_3$ are positive; $\tau$ is a real number referring to the chiral pitch in some liquid crystal materials, and $\sigma(n)$ refers to the energy associated with external electromagnetic fields.

Throughout this paper we assume that $\Omega$ is a bounded, simply-connected domain in $\mathbb{R}^3$ with a smooth boundary. We shall consider two types of boundary conditions:

(i) Neumann boundary conditions for both $n$ and $\Psi$.
(ii) Dirichlet boundary condition for $n$ and Neumann boundary condition for $\Psi$.

It is tempting to look for minimizers in the Sobolev spaces. Then one would wish the Oseen-Frank energy were coercive in $W^{1,2}(\Omega, \mathbb{R}^3)$: $F_N(n, \nabla n) \ge c_0 |\nabla n|^2$ for some constant $c_0 > 0$. However, this is not true without additional conditions on the coefficients $K_j$. When considering the Dirichlet boundary condition on $n$, this difficulty can be overcome by using the idea of Hardt-Kinderlehrer-Lin [HKL], who observed that the last term $\mathrm{tr}(\nabla n)^2 - (\mathrm{div}\, n)^2$ in the Oseen-Frank energy density is a divergence term and can be reduced to a surface integral; they thus added a null multiplier to the Oseen-Frank functional and obtained an equivalent energy functional which is coercive in $W^{1,2}(\Omega, \mathbb{S}^2)$. Following this idea, when we consider the Dirichlet boundary condition on $n$, we will drop the last term in the Oseen-Frank functional and assume that

$$ F_N(n, \nabla n) = K_1 |\mathrm{div}\, n|^2 + K_2 |n \cdot \mathrm{curl}\, n + \tau|^2 + K_3 |n \wedge \mathrm{curl}\, n + \sigma(n)|^2. $$

When we consider the Neumann boundary condition (i), we shall also take the simplified form of $F_N$, as we believe that the simplified energy captures most difficulties of the full Oseen-Frank energy, and captures the main feature of the model.
Instead of working on Sobolev spaces, we shall work on the admissible set $W^{1,2}(\Omega, \mathbb{C}) \times V(\Omega, \mathbb{S}^2)$, see Sect. 3 for the definition of $V(\Omega, \mathbb{S}^2)$. Working on this space, one may encounter a lack of compactness. Fortunately, the freedom to make gauge transforms in the order parameter part will often help us to handle the problem.

The de Gennes theory predicts $K_2, K_3 \to +\infty$ for the nematic-smectic-A phase transition, and $K_1, K_2, K_3 \to +\infty$ for the nematic-smectic-C phase transition, see [dG3] and [dGP] (p. 515)¹. In order to understand the phase transitions, we shall examine the behavior of the minimizers when these coefficients tend to $+\infty$.

In this paper we shall only consider the case without external electromagnetic fields, hence $\sigma(n) = 0$. For simplicity we assume $K_2 = K_3$. Recall that, for a unit vector field $n$ it holds that $|\mathrm{curl}\, n|^2 = |n \cdot \mathrm{curl}\, n|^2 + |n \wedge \mathrm{curl}\, n|^2$, and hence

$$ |n \cdot \mathrm{curl}\, n + \tau|^2 + |n \wedge \mathrm{curl}\, n|^2 = |\mathrm{curl}\, n + \tau n|^2. $$

After making the rescaling

$$ \Psi = \sqrt{\frac{|r|}{u}}\, \psi, \qquad \kappa = \sqrt{\frac{|r|}{c}}, \qquad K_j \ \text{replaced by}\ \frac{u}{c|r|} K_j, \quad j = 1, 2, 3, $$

and replacing $\mathcal{L}$ by $E = \frac{u}{c|r|} \mathcal{L}$, we are led to the following functional:

$$ E[\psi, n] = \int_\Omega \left\{ |\nabla_{qn} \psi|^2 - \kappa^2 |\psi|^2 + \frac{\kappa^2}{2} |\psi|^4 + K_1 |\mathrm{div}\, n|^2 + K_2 |\mathrm{curl}\, n + \tau n|^2 \right\} dx. \tag{1.1} $$
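The passage from the Landau-de Gennes energy to the functional (1.1) can be followed term by term. The short derivation below is a sketch that assumes the polynomial form $F_A(|\Psi|) = r|\Psi|^2 + \frac{u}{2}|\Psi|^4$ with $r < 0$, the substitution $\Psi = \sqrt{|r|/u}\,\psi$, and multiplication of the whole energy by $u/(c|r|)$.

```latex
\begin{align*}
\frac{u}{c|r|}\, c\,|\nabla_{qn}\Psi|^2
  &= \frac{u}{c|r|}\cdot\frac{c|r|}{u}\,|\nabla_{qn}\psi|^2 = |\nabla_{qn}\psi|^2,\\
\frac{u}{c|r|}\, r\,|\Psi|^2
  &= -\frac{u}{c|r|}\cdot\frac{|r|^2}{u}\,|\psi|^2 = -\frac{|r|}{c}\,|\psi|^2 = -\kappa^2|\psi|^2,\\
\frac{u}{c|r|}\cdot\frac{u}{2}\,|\Psi|^4
  &= \frac{u}{c|r|}\cdot\frac{|r|^2}{2u}\,|\psi|^4 = \frac{\kappa^2}{2}\,|\psi|^4,
  \qquad \kappa^2 = \frac{|r|}{c}.
\end{align*}
```

The elastic terms pick up the same overall factor $u/(c|r|)$, which is absorbed into the rescaled coefficients $K_j$; with $\sigma(n) = 0$ and $K_2 = K_3$, the twist and bend terms combine into $K_2|\mathrm{curl}\,n + \tau n|^2$ by the identity for unit vector fields.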

We expect that, as $K_1$ and $K_2 \to +\infty$ (resp. $K_2 \to +\infty$), the asymptotic behavior of minimizers of (1.1) under suitable boundary conditions will provide a mathematical understanding of the phase transition process of liquid crystals between a nematic phase and a smectic-C (resp. smectic-A) phase. Note that the functional $E$ has a critical point $(0, n)$, where $n$ is a critical point of the (simplified) Oseen-Frank functional. It corresponds to a nematic phase. We are interested in non-trivial minimizers $(\psi, n)$ with $\psi \not\equiv 0$, and in the nucleation of a nontrivial minimizer from a trivial one, which corresponds to the phase transition from a nematic phase to a smectic phase.

We believe that the wave number $q$ could be an important indicator to distinguish the smectic and nematic states. Namely, if the wave number is larger than some critical value, the sample is in a nematic state; otherwise it is in a smectic phase. This observation led us to introduce the critical wave number $Q_{c_3}$, a critical value of the wave number $q$, to distinguish trivial and non-trivial minimizers, and to explore the possibility of nucleation of a smectic phase from a uniform nematic phase. See Sect. 3 for a precise description of $Q_{c_3}$ in the Neumann boundary condition case. We should mention that Calderer [C] introduced the critical fields $h_{c_1}$ and $h_{c_2}$ for the chirality field $h = \tau K_2$ in order to develop a mathematical theory for the twist grain boundary phase when $h_{c_1} < h < h_{c_2}$. The critical wave number $Q_{c_3}$ introduced in this paper is a critical value for the wave number $q$.
The introduction of the critical wave number $Q_{c_3}$ was motivated by the analogy between the Landau-de Gennes energy $E$ of liquid crystals and the Ginzburg-Landau energy $GL$ of superconductors:

$$ GL[\psi, A] = \int_\Omega \left\{ |\nabla_{\kappa A} \psi|^2 - \kappa^2 |\psi|^2 + \frac{\kappa^2}{2} |\psi|^4 \right\} dx + \kappa^2 \int_{\mathbb{R}^3} |\mathrm{curl}\, A - H_{\mathrm{appl}}|^2\, dx, \tag{1.2} $$

where $\psi$ is the order parameter and $A$ is the magnetic potential, $\kappa$ is the Ginzburg-Landau parameter of the sample, and $H_{\mathrm{appl}}$ is the applied magnetic field (see for instance [dG1, CHO, DGP]). The analogy was discovered by de Gennes² in [dG2] (also see [dGP]) and W. McMillan [Mc]. De Gennes compared the superconducting and normal states with the smectic and nematic phases respectively, and hence introduced various characteristic lengths and critical fields. Later on Renn-Lubensky [RL] compared the effect of twists upon liquid crystals with the effect of applied magnetic fields upon superconductors, and hence predicted that a twisted smectic, with an Abrikosov lattice of screw dislocations, is the analogue of the vortex state of a superconductor and could exist. They called it twist grain boundary. This phase was indeed found one year later by Goodby and co-workers [GW1,2] in the smectic A∗ phase (a smectic A phase with chiral molecules)³.

We are interested in the dependence of the minimizers on the parameters. In Sect. 4 we examine the behavior of the minimizers when the elastic coefficients $K_j$ are large and the chirality $\tau$ is small. We shall see that, if $\kappa > 0$ is fixed and the wave number $q$ is far below $Q_{c_3}$, the director field approaches a unit constant vector and the order parameter remains non-zero. We may call such a phase a perfect smectic phase, in analogy with the Meissner state of superconductivity, or the perfect superconducting state, of a type I superconductor.

It is natural to expect that if $q$ is close to $Q_{c_3}$ and if $\kappa$ is large, the liquid crystal will show some analogy with a type II superconductor. It is well-known that, when a type II superconductor is placed in an applied field with magnitude between the second critical field $H_{C_2}$ and the upper critical field $H_{C_3}$, the superconductor is in a surface superconducting state, namely, a superconducting sheath of scale $O(1/\kappa)$ will form surrounding the surface of the sample, while the bulk material will remain in a normal state (see for instance [P1]). It will be interesting to ask what the analogue of a surface superconducting state is for a liquid crystal.

¹ According to Chen-Lubensky [ChL], it was predicted by de Gennes [dG3] and by Jähnig and Brochard, and verified by a number of experiments, that $K_2$ and $K_3$ diverge and that $K_1$ undergoes no violent change at the nematic-smectic-A transition temperature $T_{NA}$; and it was predicted by de Gennes [dG3] and supported by some experimental evidence that all three coefficients $K_1$, $K_2$ and $K_3$ diverge at the nematic-smectic-C transition temperature $T_{NC}$.
Let us call such an analogue a surface smectic phase, or SSP for short, if it exists. One may expect that an SSP exhibits a boundary layer of smectic surrounding the surface of the sample, while the bulk is in a nematic phase. Then we have to find the regime of parameters for which an SSP could exist, and find possible behaviors of an SSP. We believe that $Q_{c_3}$ can help us to determine the possible existence of an SSP.

Despite many analogies, there exist two important differences between the Landau-de Gennes functional $E$ and the Ginzburg-Landau functional $GL$. The first one is the gauge invariance that the Ginzburg-Landau functional has but the Landau-de Gennes functional does not have. The second difference is more important: for the Landau-de Gennes model it is required that the director field satisfies the pointwise restriction

$$ |n(x)| = 1 \quad \text{a.e. in } \Omega. \tag{1.3} $$

It is this restriction that makes the functional $E$ very different from the Ginzburg-Landau functional for superconductivity, and makes the study of the questions mentioned above more difficult. The work presented in this paper is preliminary, and we hope it will initiate further study in this direction.

² We would like to quote a paragraph in an article, The master of analogies, from the official web site of the Nobel Foundation, http://www.nobel.se/physics/educational/poster/1991/master.html, "De Gennes discovered that although phase transitions in different materials give rise to widely different phenomena, and are governed by different parameters, such as temperature, concentration, magnetic or electrical field, they can be described in a very general way. Whether the structure is a liquid crystal, ferromagnet, superconductor or polymer, universal features can be identified and explained by simple scaling laws."

³ The discovery of the smectic A∗ phase is analogous to the discovery of the vortex state of type II superconductors, which was predicted by Abrikosov in 1957 and observed in the laboratory several years later.

The outline of the paper is the following. Sections 3 and 4 are devoted to the variational problem for the functional $E$ under Neumann boundary conditions. In Sect. 3 we


give some estimates of $Q_{c_3}$ for small $\tau$ (Theorem 3.6). In order to obtain the estimate we need some results about the lowest eigenvalue of the Schrödinger operator with a magnetic field, which are summarized in Sect. 2 for the reader's convenience. In Sect. 4 we discuss the asymptotic behavior of non-trivial minimizers of the functional $E$ under the Neumann boundary condition, with small chirality $\tau$ and various choices of $q$. We shall only consider the case where both $K_1$ and $K_2$ are large (consistent with de Gennes's prediction on the divergence of the elastic coefficients in the nematic-smectic-C transition), and leave other cases to future investigation.

In Sect. 5 we study the variational problem under Dirichlet boundary conditions for the director fields. Our main concern there is the dependence of the minimizers $(\psi, n)$ on the parameters $K_1$ and $K_2$. We shall see that the consistency between the boundary data and the domain geometry will be important to detect the asymptotic behavior of the minimizers, see Theorem 5.4.

Some further remarks and open problems will be given in Sect. 6. The links of our problem studied in Sect. 5 with some existing mathematical models of liquid crystals will also be discussed in Sect. 6.

2. Preliminaries: Eigenvalues of $-\nabla^2_{bF_h}$ in a Bounded Domain

Given a vector field $A$, let $\mu(A) \equiv \mu(A, \Omega)$ be the lowest eigenvalue of

$$ -\nabla_A^2 \phi = \mu(A)\phi \quad \text{in } \Omega, \qquad \nabla_A \phi \cdot \nu = 0 \quad \text{on } \partial\Omega. \tag{2.1} $$

Here we use the notation $\nabla_A^2 \phi = \Delta\phi - i\left[ 2A \cdot \nabla\phi + \phi\, \mathrm{div}\, A \right] - |A|^2 \phi$. In this section we shall collect some known estimates of $\mu(bF_h)$ for small or large parameter $b$. Here, for a unit vector $h = (h_1, h_2, h_3)$, $F_h$ is a vector field satisfying⁴

$$ \mathrm{curl}\, F_h = h, \qquad \mathrm{div}\, F_h = 0, \qquad F_h \cdot h = 0, \qquad F_{h^-} = -F_h \quad \text{in } \Omega, \tag{2.2} $$

where we use the notation

$$ h^- = -h. \tag{2.3} $$

For example we can choose

$$ F_h(x) = \tfrac12 \left( h_2 x_3 - h_3 x_2,\ h_3 x_1 - h_1 x_3,\ h_1 x_2 - h_2 x_1 \right). $$
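The four conditions in (2.2) are easy to confirm for this particular choice of $F_h$. The following sketch evaluates the curl and divergence by central differences, which are exact up to rounding here because $F_h$ is linear in $x$; the function names and the sample vectors are hypothetical.

```python
def F(h, x):
    """F_h(x) = (1/2)(h2*x3 - h3*x2, h3*x1 - h1*x3, h1*x2 - h2*x1)."""
    h1, h2, h3 = h
    x1, x2, x3 = x
    return (0.5 * (h2 * x3 - h3 * x2),
            0.5 * (h3 * x1 - h1 * x3),
            0.5 * (h1 * x2 - h2 * x1))

def jacobian(h, x, eps=1e-3):
    """J[i][k] = dF_i/dx_k by central differences."""
    cols = []
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += eps
        xm[k] -= eps
        Fp, Fm = F(h, xp), F(h, xm)
        cols.append([(a - b) / (2 * eps) for a, b in zip(Fp, Fm)])
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def curl_and_div(h, x):
    """Return (curl F_h, div F_h) at the point x."""
    J = jacobian(h, x)
    curl = (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])
    div = J[0][0] + J[1][1] + J[2][2]
    return curl, div
```

For any unit vector `h`, `curl_and_div(h, x)` should return `h` itself and a vanishing divergence, while the dot product of `F(h, x)` with `h` vanishes and `F` changes sign with `h`.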

The eigenvalue problem (2.1) for $A = bF_h$ in a 2- or 3-dimensional domain arose in the study of surface nucleation of superconductivity in an applied field decreasing from the upper critical field $H_{C_3}$, and has been studied by many authors. Especially for a 3-dimensional domain, the value of $\mu(bF_h)$ has been estimated by Giorgi-Phillips [GP], Lu-Pan [LP3], Helffer-Morame [HM], and Pan [P2-3]. In the following we shall cite two estimates of $\mu(bF_h)$, one for small $b$ ([P2]) and the other for large $b$ ([LP3]).

⁴ In [LP3] and [P2-3] we did not require the condition $F_h \cdot h = 0$.

Let us begin with the estimate for small $b$. Given a smooth, bounded domain $\Omega$ in $\mathbb{R}^3$ and a unit vector $h$, we define

$$ \lambda(h) \equiv \lambda(h, \Omega) = \inf_{\phi \in W^{1,2}(\Omega)} \frac{1}{|\Omega|} \int_\Omega |\nabla\phi - F_h|^2\, dx. \tag{2.4} $$

It is easy to see that $\lambda(h)$ is achieved by the unique solution $w_h$ of the following equation:

$$ \Delta w_h = 0 \quad \text{in } \Omega, \qquad \frac{\partial w_h}{\partial\nu} = F_h \cdot \nu \quad \text{on } \partial\Omega, \qquad \int_\Omega w_h\, dx = 0. \tag{2.5} $$

Since $F_{h^-} = -F_h$, we have $w_{h^-} = -w_h$. Thus $\lambda(-h) = \lambda(h)$. The following conclusion was proved in [P3].

Lemma 2.1. $\mu(bF_h) = \lambda(h)\, b^2 + O(b^3)$ as $b \to 0$.

Let us define

$$ \lambda_* \equiv \lambda_*(\Omega) = \inf_{h \in \mathbb{S}^2} \lambda(h, \Omega). \tag{2.6} $$

This number will be useful in Sect. 3. Next we consider the estimate for large $b$. For any fixed number $z$, let $\beta(z)$ be the lowest eigenvalue of the following problem:

$$ -u'' + (z + t)^2 u = \beta(z)\, u \quad \text{for } t > 0, \qquad u'(0) = 0, \qquad u(+\infty) = 0. \tag{2.7} $$
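The function $\beta(z)$ defined by (2.7) and its infimum can be estimated numerically. The sketch below is a rough illustration and not the method of the cited references: it assumes a truncation of the half-line to $(0, T)$ with a Dirichlet condition at $t = T$, a cell-centred second-order finite-difference discretization, and a Sturm-sequence bisection for the smallest eigenvalue of the resulting tridiagonal matrix. The truncation length, grid size and scan range are hypothetical choices.

```python
def beta(z, T=8.0, n=300, iters=50):
    """Approximate lowest eigenvalue of -u'' + (z+t)^2 u on (0, T), u'(0) = 0, u(T) = 0."""
    h = T / n
    # Cell-centred grid t_i = (i + 1/2) h; diagonal of the tridiagonal matrix.
    diag = [2.0 / h ** 2 + (z + (i + 0.5) * h) ** 2 for i in range(n)]
    diag[0] -= 1.0 / h ** 2    # Neumann condition at t = 0 (ghost value u_{-1} = u_0)
    diag[-1] += 1.0 / h ** 2   # Dirichlet condition at t = T (ghost value u_n = -u_{n-1})
    off2 = 1.0 / h ** 4        # square of the off-diagonal entry -1/h^2

    def count_below(x):
        """Sturm count: number of eigenvalues strictly below x."""
        count, d = 0, 1.0
        for i in range(n):
            d = diag[i] - x - (off2 / d if i > 0 else 0.0)
            if d == 0.0:
                d = 1e-300
            if d < 0.0:
                count += 1
        return count

    lo, hi = 0.0, max(diag) + 2.0 / h ** 2   # Gershgorin upper bound
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def scan_beta0(z_min=-1.2, z_max=-0.4, step=0.01):
    """Scan z on a grid and return (min beta, minimizing z)."""
    steps = int(round((z_max - z_min) / step))
    return min((beta(z_min + k * step), z_min + k * step) for k in range(steps + 1))
```

A useful sanity check is `beta(0.0)`, which should be close to 1, the ground-state energy of the harmonic oscillator restricted to even functions. The value returned by `scan_beta0()` gives a numerical estimate of $\inf_z \beta(z)$ together with the approximate minimizer.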

It was proved by Helffer-Dauge [DH] that there is a unique $z_0$, $z_0 < 0$, such that $\beta(z_0) = \inf_{z \in \mathbb{R}} \beta(z) = \beta_0$ (also see [LP1]). Moreover, $0.5 < \beta_0 < 0.76$ (see [LP2] (Prop. 2.4)), and $\beta_0 = |z_0|^2$.

Lemma 2.2. As $b \to +\infty$ we have

$$ \mu(bF_h) = \beta_0\, b + o(b). \tag{2.8} $$

Moreover, (2.8) holds uniformly for all $h \in \mathbb{S}^2$.

Proof. For fixed $h$, the estimate was given in [LP3]. We shall use the method in [LP3] to show that (2.8) holds uniformly for all $h \in \mathbb{S}^2$. We choose $F_h$ as given above. Note that

$$ \mu(bF_h) = \inf_{\phi \in W^{1,2}(\Omega)} \frac{\int_\Omega |\nabla_{bF_h} \phi|^2\, dx}{\int_\Omega |\phi|^2\, dx}. $$

By choosing a test function properly (see [LP3], Appendix) we can show that there exists a constant $C$ independent of $h$ such that, for all $h \in \mathbb{S}^2$ and large $b$, $\mu(bF_h) \le \beta_0 b + C b^{1/3}$. Thus we have a uniform upper bound. In particular,

$$ \limsup_{b \to +\infty}\ \max_{h \in \mathbb{S}^2} \frac{\mu(bF_h)}{b} \le \beta_0. $$

To derive a lower bound, we define

$$ \mu_*(b) \equiv \mu_*(b, \Omega) = \inf_{h \in \mathbb{S}^2} \mu(bF_h, \Omega). \tag{2.9} $$

Liquid Crystals

349

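Let $\lambda = \liminf_{b\to+\infty}\mu_*(b)/b$. Then $\lambda \le \beta_0$. It is easy to show that for any $b > 0$, $\mu_*(b)$ is achieved. Choose a sequence $b_j \to +\infty$ and $h_j \in S^2$ such that
\[
\mu(b_jF_{h_j}) = \mu_*(b_j), \qquad \lim_{j\to\infty}\frac{\mu_*(b_j)}{b_j} = \lambda.
\]
Let $\mu_j$ denote $\mu(b_jF_{h_j})$ and let $\phi_j$ be an eigenfunction associated with $\mu(b_jF_{h_j})$ such that $\|\phi_j\|_{L^\infty(\Omega)} = 1$. From (2.1), $\phi_j$ satisfies
\[
-\nabla^2_{b_jF_{h_j}(x)}\phi_j = \mu_j\phi_j \ \text{in } \Omega, \qquad \nabla_{b_jF_{h_j}(x)}\phi_j\cdot\nu = 0 \ \text{on } \partial\Omega.
\]
From the elliptic estimate we see that $\phi_j \in C^{2+\alpha}(\bar\Omega)$. Let $x_j$ be the maximum point of $|\phi_j(x)|$. Passing to a subsequence we may assume that $x_j \to x_0$ and $h_j \to h_0$ as $j \to \infty$. Let $\varepsilon_j = 1/\sqrt{b_j}$, and define
\[
\varphi_j(y) = \phi_j(x_j + \varepsilon_j y)\,e^{ib_jF_{h_j}(x_j)\cdot y}.
\]
Note that $F_{h_j}(x_j + \varepsilon_j y) = F_{h_j}(x_j) + \varepsilon_jF_{h_j}(y)$. We see that $\varphi_j$ satisfies the equation
\[
-\nabla^2_{F_{h_j}(y)}\varphi_j = \frac{\mu_j}{b_j}\varphi_j \ \text{in } \Omega_j, \qquad \nabla_{F_{h_j}(y)}\varphi_j\cdot\nu = 0 \ \text{on } \partial\Omega_j,
\]
where $\Omega_j = (\Omega - \{x_j\})/\varepsilon_j$. Since $\mu_j/b_j \to \lambda \le \beta_0 < 1$ as $j \to \infty$, using Theorems 4.1 and 4.2 in [LP3] we can show that $x_0 \in \partial\Omega$, and $\{\varphi_j\}$ is bounded in $C^{2+\alpha}_{loc}(\mathbb{R}^3_+)$. Passing to a subsequence we may assume that $\varphi_j \to \varphi_0$ in $C^{2+\alpha}_{loc}(\mathbb{R}^3_+)$ as $j \to \infty$, where $\varphi_0(0) = 1$, $\|\varphi_0\|_{L^\infty(\mathbb{R}^3_+)} = 1$, and $\varphi_0$ satisfies the equation
\[
-\nabla^2_{F_{h_0}}\varphi_0 = \lambda\varphi_0 \ \text{in } \mathbb{R}^3_+, \qquad \nabla_{F_{h_0}}\varphi_0\cdot\nu = 0 \ \text{on } \partial\mathbb{R}^3_+.
\]
Then, using Theorem 4.2 in [LP3] we get $\lambda \ge \beta_0$. Thus $\lambda = \beta_0$, and $\lim_{b\to+\infty}\mu_*(b)/b = \beta_0$. Therefore (2.8) holds uniformly for $h \in S^2$.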

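Corollary 2.3. For $\mu_*(b)$ defined in (2.9), let
\[
a_0(\kappa) = \min\{b > 0 : \mu_*(b) = \kappa^2\}, \qquad a_1(\kappa) = \max\{b > 0 : \mu_*(b) = \kappa^2\}. \tag{2.10}
\]
We have
\[
\lim_{\kappa\to+\infty}\frac{a_0(\kappa)}{\kappa^2} = \lim_{\kappa\to+\infty}\frac{a_1(\kappa)}{\kappa^2} = \frac{1}{\beta_0}.
\]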
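A brief sketch of why Corollary 2.3 is consistent with Lemma 2.2. It rests on an assumption that is not stated explicitly here, namely that $a_j(\kappa) \to +\infty$ as $\kappa \to +\infty$ (which would follow if $\mu_*$ is bounded on bounded intervals of $b$):

```latex
% Assumption (not from the text): a_j(\kappa) \to +\infty as \kappa \to +\infty.
% The uniform estimate (2.8) gives \mu_*(b) = \beta_0 b + o(b) as b \to +\infty.
\[
\kappa^2 = \mu_*\bigl(a_j(\kappa)\bigr) = \beta_0\,a_j(\kappa)\,\bigl(1 + o(1)\bigr)
\quad\Longrightarrow\quad
\frac{a_j(\kappa)}{\kappa^2} = \frac{1}{\beta_0\,(1 + o(1))} \longrightarrow \frac{1}{\beta_0},
\qquad j = 0, 1.
\]
```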


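3. Neumann Problem and Critical Wave Number $Q_{c3}$

In this section we consider the variational problem for the functional $E$ under the Neumann boundary conditions for both $n$ and $\psi$. As mentioned in Sect. 1, we shall consider the variational problem on the set $\mathcal{V}(\Omega) = W^{1,2}(\Omega,\mathbb{C}) \times V(\Omega,S^2)$, where $W^{1,2}(\Omega,\mathbb{C})$ is the Sobolev space of complex-valued functions, and$^5$
\[
V(\Omega,\mathbb{R}^3) = \{u \in L^2(\Omega,\mathbb{R}^3) : \operatorname{div} u \in L^2(\Omega),\ \operatorname{curl} u \in L^2(\Omega)\},
\]
\[
V(\Omega,S^2) = \{n \in V(\Omega,\mathbb{R}^3) : |n(x)| = 1 \ \text{a.e. in } \Omega\}.
\]
The topology in $V(\Omega,\mathbb{R}^3)$ is induced by the norm
\[
\|n\|_{V(\Omega)} = \bigl(\|n\|^2_{L^2(\Omega)} + \|\operatorname{div} n\|^2_{L^2(\Omega)} + \|\operatorname{curl} n\|^2_{L^2(\Omega)}\bigr)^{1/2}.
\]
Given a set of parameters $K_1 > 0$, $K_2 > 0$, $\kappa > 0$, $\tau$ and $q \in \mathbb{R}$, we let
\[
C_n(K_1,K_2,\kappa,\tau,q) = \inf_{(\psi,n)\in\mathcal{V}(\Omega)} E[\psi,n].
\]
We shall show that the minimizers exist. The Euler equations of the minimizers are:
\[
\begin{cases}
-\nabla^2_{qn}\psi = \kappa^2(1 - |\psi|^2)\psi, \\
-K_1\nabla(\operatorname{div} n) + K_2(\operatorname{curl}^2 n + 2\tau\operatorname{curl} n + \tau^2 n) - q\,\Im\{\bar\psi\nabla_{qn}\psi\} = \lambda_1(x)n & \text{in } \Omega, \\
\nabla_{qn}\psi\cdot\nu = 0, \\
K_1(\operatorname{div} n)\nu + K_2(\operatorname{curl} n + \tau n)\times\nu = \lambda_2(x)n & \text{on } \partial\Omega,
\end{cases} \tag{3.1}
\]
where $\lambda_1(x)$ and $\lambda_2(x)$ are the Lagrangian multipliers and they depend on $x$ as well as the unknown vector field $n$.

Let us recall that there exists a linear operator $\gamma_\nu$ from $V(\Omega,\mathbb{R}^3)$ to $H^{-1/2}(\partial\Omega)$ such that, if $n$ is smooth, then $\gamma_\nu n$ is the restriction of $n\cdot\nu$ to $\partial\Omega$. Moreover,
\[
\|\gamma_\nu n\|_{H^{-1/2}(\partial\Omega)} \le C_1(\Omega)\{\|n\|_{L^2(\Omega)} + \|\operatorname{div} n\|_{L^2(\Omega)}\}. \tag{3.2}
\]
For a smooth, bounded domain $\Omega \subset \mathbb{R}^3$, $u \in W^{1,2}(\Omega,\mathbb{R}^3)$ if and only if $u \in L^2(\Omega)$, $\operatorname{div} u \in L^2(\Omega)$, $\operatorname{curl} u \in L^2(\Omega)$ and $\gamma_\nu u \in H^{1/2}(\partial\Omega)$. For every $u \in W^{1,2}(\Omega,\mathbb{R}^3)$,
\[
\|u\|_{W^{1,2}(\Omega)} \le C_2(\Omega)\{\|u\|_{L^2(\Omega)} + \|\operatorname{div} u\|_{L^2(\Omega)} + \|\operatorname{curl} u\|_{L^2(\Omega)} + \|\gamma_\nu u\|_{H^{1/2}(\partial\Omega)}\}, \tag{3.3}
\]
see [T] (Ch. 1, Theorem 1.2, and Appendix I, Prop. 1.4).

5 In this paper $L^p(\Omega)$ denotes both the Lebesgue space of scalar functions and the Lebesgue space of vector fields.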


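Lemma 3.1. Let $\Omega$ be a bounded, simply-connected domain in $\mathbb{R}^3$ with a smooth boundary. Every $n \in V(\Omega,\mathbb{R}^3)$ has the following decomposition: $n = u + \nabla\xi$, where $\xi \in W^{2,2}_{loc}(\Omega) \cap W^{1,2}(\Omega)$ is the unique solution of the following equation:
\[
\Delta\xi = \operatorname{div} n \ \text{in } \Omega, \qquad \frac{\partial\xi}{\partial\nu} = \gamma_\nu n \ \text{on } \partial\Omega, \qquad \int_\Omega \xi\,dx = 0, \tag{3.4}
\]
and $u \in W^{1,2}(\Omega,\mathbb{R}^3)$ is a real vector field satisfying
\[
\operatorname{div} u = 0, \quad \operatorname{curl} u = \operatorname{curl} n \ \text{in } \Omega, \qquad \gamma_\nu u = 0 \ \text{on } \partial\Omega. \tag{3.5}
\]
Moreover, there exists a constant $C = C(\Omega)$ independent of $n$ such that
\[
\|\xi\|_{W^{1,2}(\Omega)} \le C\{\|\operatorname{div} n\|_{L^2(\Omega)} + \|n\|_{L^2(\Omega)}\}, \qquad \|u\|_{W^{1,2}(\Omega)} \le C\|\operatorname{curl} n\|_{L^2(\Omega)}. \tag{3.6}
\]

Proof. Using (3.2) we easily see that, for every $n \in V(\Omega,\mathbb{R}^3)$, Eq. (3.4) has a unique solution $\xi$. Since $\int_\Omega \xi\,dx = 0$, from the Poincaré inequality we have
\[
\|\xi\|_{W^{1,2}(\Omega)} \le c(\Omega)\|\nabla\xi\|_{L^2(\Omega)}.
\]
Multiplying (3.4) by $\xi$ and integrating, we have
\[
\begin{aligned}
\|\nabla\xi\|^2_{L^2(\Omega)} &= -\int_\Omega \xi\operatorname{div} n\,dx + \int_{\partial\Omega}\xi\,\gamma_\nu n\,ds \\
&\le \|\xi\|_{L^2(\Omega)}\|\operatorname{div} n\|_{L^2(\Omega)} + \|\xi\|_{H^{1/2}(\partial\Omega)}\|\gamma_\nu n\|_{H^{-1/2}(\partial\Omega)} \\
&\le c_1\|\xi\|_{W^{1,2}(\Omega)}\{\|\operatorname{div} n\|_{L^2(\Omega)} + \|n\|_{L^2(\Omega)}\} \\
&\le c_2\|\nabla\xi\|_{L^2(\Omega)}\{\|\operatorname{div} n\|_{L^2(\Omega)} + \|n\|_{L^2(\Omega)}\}.
\end{aligned}
\]
So $\xi$ satisfies the first inequality in (3.6). Let $u = n - \nabla\xi$. Then $u \in L^2(\Omega)$, $\operatorname{curl} u \in L^2(\Omega)$, $\operatorname{div} u = 0$ in $\Omega$ and $\gamma_\nu u = 0$ on $\partial\Omega$. From [T] (Appendix I, Lemma 1.6), $u \in W^{1,2}(\Omega,\mathbb{R}^3)$ and the second inequality in (3.6) holds.

In the following we shall call the decomposition described in Lemma 3.1 the canonical decomposition of $n \in V(\Omega,\mathbb{R}^3)$.

Lemma 3.2. $C_n(K_1,K_2,\kappa,\tau,q)$ is achieved in $\mathcal{V}(\Omega)$. Moreover, every minimizing sequence is pre-compact in $\mathcal{V}(\Omega)$.

Proof. In the proof, $C$ denotes a generic constant which varies from line to line. Let $\{(\psi_j,n_j)\} \subset \mathcal{V}(\Omega)$ be a minimizing sequence. Then there exists a constant $C > 0$ such that
\[
\int_\Omega\Bigl\{|\nabla_{qn_j}\psi_j|^2 + \frac{\kappa^2}{2}\bigl(1 - |\psi_j|^2\bigr)^2 + K_1|\operatorname{div} n_j|^2 + K_2|\operatorname{curl} n_j + \tau n_j|^2\Bigr\}dx \le C_n(K_1,K_2,\kappa,\tau,q) + \frac{\kappa^2|\Omega|}{2} + o(1) \le C. \tag{3.7}
\]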


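We first estimate $n_j$. Let $n_j = u_j + \nabla\xi_j$ be the canonical decomposition as defined in Lemma 3.1. From (3.6), $\{u_j\}$ is bounded in $W^{1,2}(\Omega,\mathbb{R}^3)$ and $\{\xi_j\}$ is bounded in $W^{1,2}(\Omega)$. After passing to a subsequence, we may assume that, as $j \to +\infty$, $u_j \to u_0$ weakly in $W^{1,2}(\Omega,\mathbb{R}^3)$ and strongly in $L^4(\Omega)$, $\xi_j \to \xi_0$ weakly in $W^{1,2}(\Omega)$ and $\Delta\xi_j \to \Delta\xi_0$ weakly in $L^2(\Omega)$.

Now we show that $\{\xi_j\}$ has a subsequence that is weakly convergent in $W^{2,2}_{loc}(\Omega)$. For a small $\varepsilon > 0$, let $\Omega_\varepsilon = \{x \in \Omega : \operatorname{dist}(x,\partial\Omega) > \varepsilon\}$. Choose a smooth cut-off function $\eta_\varepsilon$ such that $0 \le \eta_\varepsilon(x) \le 1$, $\eta_\varepsilon(x) = 0$ if $\operatorname{dist}(x,\partial\Omega) \le \varepsilon/2$, $\eta_\varepsilon(x) = 1$ if $\operatorname{dist}(x,\partial\Omega) \ge \varepsilon$, $|\nabla\eta_\varepsilon(x)| \le C/\varepsilon$ and $|\Delta\eta_\varepsilon(x)| \le C/\varepsilon^2$, where $C > 0$ is independent of $\varepsilon$. Using (3.3) and (3.7) we have $\|\nabla(\eta_\varepsilon\xi_j)\|_{W^{1,2}(\Omega)} \le C/\varepsilon^2$. Therefore
\[
\|\xi_j\|_{W^{2,2}(\Omega_\varepsilon)} \le \frac{C}{\varepsilon^2}.
\]
Choosing a sequence $\varepsilon_k \to 0$ and using the diagonal argument we can find a subsequence of $\xi_j$, still denoted by $\xi_j$, such that, as $j \to \infty$,
\[
\xi_j \to \xi_0 \quad \text{weakly in } W^{2,2}_{loc}(\Omega) \text{ and } W^{1,2}(\Omega), \text{ strongly in } W^{1,2}_{loc}(\Omega).
\]
From this and passing to a subsequence again, we have
\[
n_j = u_j + \nabla\xi_j \to u_0 + \nabla\xi_0 \equiv n_0 \quad \text{strongly in } W^{1,2}_{loc}(\Omega,\mathbb{R}^3) \text{ as } j \to \infty.
\]
Therefore $|n_0(x)| = 1$ a.e. in $\Omega$ and hence $n_0 \in V(\Omega,S^2)$. Since $n_j \to n_0$ weakly in $L^2(\Omega)$ and $\|n_j\|_{L^2(\Omega)} = \|n_0\|_{L^2(\Omega)}$, we see that $n_j \to n_0$ strongly in $L^2(\Omega)$. Since $|n_j(x)| = 1$ a.e., we see that
\[
n_j \to n_0 \quad \text{strongly in } L^p(\Omega) \text{ for all } 1 \le p < \infty.
\]
Hence
\[
\xi_j \to \xi_0 \quad \text{weakly in } W^{2,2}_{loc}(\Omega) \text{ and strongly in } W^{1,2}(\Omega).
\]
For every small $\varepsilon > 0$, since $\operatorname{div} n_j = \Delta\xi_j \to \Delta\xi_0 = \operatorname{div} n_0$ weakly in $W^{1,2}(\Omega_\varepsilon)$, $\operatorname{curl} n_j = \operatorname{curl} u_j \to \operatorname{curl} u_0 = \operatorname{curl} n_0$ weakly in $W^{1,2}(\Omega_\varepsilon,\mathbb{R}^3)$, and $n_j \to n_0$ strongly in $L^2(\Omega)$, we have
\[
\int_{\Omega_\varepsilon}|\operatorname{div} n_0|^2dx \le \liminf_{j\to\infty}\int_{\Omega_\varepsilon}|\operatorname{div} n_j|^2dx \le \liminf_{j\to\infty}\int_\Omega|\operatorname{div} n_j|^2dx,
\]
\[
\int_{\Omega_\varepsilon}|\operatorname{curl} n_0 + \tau n_0|^2dx \le \liminf_{j\to\infty}\int_{\Omega_\varepsilon}|\operatorname{curl} n_j + \tau n_j|^2dx \le \liminf_{j\to\infty}\int_\Omega|\operatorname{curl} n_j + \tau n_j|^2dx.
\]
Letting $\varepsilon \to 0$ we get
\[
\int_\Omega|\operatorname{div} n_0|^2dx \le \liminf_{j\to\infty}\int_\Omega|\operatorname{div} n_j|^2dx, \qquad \int_\Omega|\operatorname{curl} n_0 + \tau n_0|^2dx \le \liminf_{j\to\infty}\int_\Omega|\operatorname{curl} n_j + \tau n_j|^2dx. \tag{3.8}
\]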


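Next we estimate $\psi_j$. From (3.7), $\{|\nabla_{qn_j}\psi_j|\}$ is bounded in $L^2(\Omega)$, and $\{|\psi_j|\}$ is bounded in $L^4(\Omega)$. Since $|n_j(x)| = 1$, $\{n_j\psi_j\}$ is bounded in $L^2(\Omega)$. Hence $\nabla\psi_j = \nabla_{qn_j}\psi_j + iqn_j\psi_j$ is bounded in $L^2(\Omega)$. Thus $\psi_j$ is bounded in $W^{1,2}(\Omega,\mathbb{C})$. We can find a subsequence, still denoted by $\psi_j$, such that $\psi_j \to \psi_0$ weakly in $W^{1,2}(\Omega,\mathbb{C})$ and strongly in $L^4(\Omega)$. Since $n_j \to n_0$ strongly in $L^2(\Omega)$, we have $n_j\psi_j \to n_0\psi_0$ strongly in $L^2(\Omega)$. Hence $\nabla_{qn_j}\psi_j \to \nabla_{qn_0}\psi_0$ weakly in $L^2(\Omega)$, and
\[
\int_\Omega|\nabla_{qn_0}\psi_0|^2dx \le \liminf_{j\to\infty}\int_\Omega|\nabla_{qn_j}\psi_j|^2dx, \qquad \int_\Omega\bigl(1 - |\psi_0|^2\bigr)^2dx = \lim_{j\to\infty}\int_\Omega\bigl(1 - |\psi_j|^2\bigr)^2dx. \tag{3.9}
\]
From (3.8), (3.9) we get
\[
E[\psi_0,n_0] \le \liminf_{j\to\infty}E[\psi_j,n_j] = C_n(K_1,K_2,\kappa,\tau,q).
\]
Therefore $(\psi_0,n_0) \in \mathcal{V}(\Omega)$ is a minimizer. So $E[\psi_0,n_0] = C_n(K_1,K_2,\kappa,\tau,q)$ and
\[
\lim_{j\to\infty}E[\psi_j,n_j] = E[\psi_0,n_0].
\]
From this, (3.8) and (3.9) we see that
\[
\lim_{j\to\infty}\int_\Omega|\nabla_{qn_j}\psi_j|^2dx = \int_\Omega|\nabla_{qn_0}\psi_0|^2dx, \quad
\lim_{j\to\infty}\int_\Omega|\operatorname{div} n_j|^2dx = \int_\Omega|\operatorname{div} n_0|^2dx, \quad
\lim_{j\to\infty}\int_\Omega|\operatorname{curl} n_j + \tau n_j|^2dx = \int_\Omega|\operatorname{curl} n_0 + \tau n_0|^2dx.
\]
Since each integrand converges weakly in $L^2(\Omega)$, the above equalities imply that each integrand converges strongly in $L^2(\Omega)$. Therefore $n_j \to n_0$ strongly in $V(\Omega,S^2)$ as $j \to \infty$. Since $\nabla_{qn_j}\psi_j \to \nabla_{qn_0}\psi_0$ and $n_j\psi_j \to n_0\psi_0$ strongly in $L^2(\Omega)$, we have
\[
\nabla\psi_j = \nabla_{qn_j}\psi_j + iqn_j\psi_j \to \nabla_{qn_0}\psi_0 + iqn_0\psi_0 = \nabla\psi_0.
\]
Hence $\psi_j \to \psi_0$ strongly in $W^{1,2}(\Omega,\mathbb{C})$. Thus the selected subsequence converges strongly in $\mathcal{V}(\Omega)$.

In the following we shall discuss the behavior of minimizers of $E$. Let us begin with the trivial critical points $(0,n)$ of $E$, where $n \in V(\Omega,S^2)$ is a solution of the equation
\[
\operatorname{curl} n + \tau n = 0, \qquad |n(x)| = 1 \quad \text{in } \Omega. \tag{3.10}
\]
Recall that $\Omega$ is simply-connected. If $\tau = 0$, the solutions of (3.10) are given by the gradient fields of the solutions of the eikonal equation
\[
|\nabla\phi| = 1 \quad \text{on } \Omega.
\]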


If $\tau \ne 0$, then, up to a rotation, $n = N_\tau$, where
\[
N_\tau = (\cos\tau x_3,\ \sin\tau x_3,\ 0). \tag{3.11}
\]
More precisely, the set $\mathcal{C}(\tau)$ of all solutions $n \in V(\Omega,S^2)$ of (3.10) is
\[
\mathcal{C}(\tau) = \{QN_\tau(Q^tx) : Q \in SO(3)\},
\]
see Bauman-Calderer-Liu-Phillips [BCLP] (Lemma 3),$^6$ also see [E,O]. In the following we only consider the case where $\tau > 0$. A trivial solution $(0,n)$, where $n \in \mathcal{C}(\tau)$, corresponds to a purely chiral nematic phase, and $E[0,n] = 0$.

Recall the notation $n^- = -n$ (see (2.3)). Write $e_1 = (1,0,0)$, $e_2 = (0,1,0)$ and $e_3 = (0,0,1)$, and $e_j^- = -e_j$, $j = 1,2,3$. Choose $F_{e_1^-} = (0,x_3,0)$. Note that $\operatorname{curl} F_{e_1^-} = e_1^- = -e_1$. We have, for small $\tau > 0$,
\[
N_\tau(x) = e_1 + \tau F_{e_1^-} + \tau^2A_\tau, \tag{3.12}
\]
where $|A_\tau(x)|$ is uniformly bounded. In fact, $N_\tau(x) = e_1 + \tau x_3e_2 - \frac12(\tau x_3)^2e_1 + O(\tau^3)$.

In order to understand the regime of $q$ for the existence of non-trivial minimizers, and to understand the bifurcation of non-trivial minimizers from a trivial one, let us define
\[
Q_{c3}(K_1,K_2,\kappa,\tau) = \inf\{q > 0 : E \text{ has only trivial minimizers}\},
\]
\[
Q_{c3} \equiv Q_{c3}(\kappa,\tau) = \inf_{K_1>0,\,K_2>0}Q_{c3}(K_1,K_2,\kappa,\tau) = \lim_{K_1,K_2\to+\infty}Q_{c3}(K_1,K_2,\kappa,\tau).
\]
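As a quick check of (3.10) and (3.12), the following sketch verifies directly that $N_\tau$ solves the trivial-state equation and recovers the expansion by Taylor's formula. The remainders are uniform because $\Omega$ is bounded; the component-wise curl convention is the standard one and is assumed here.

```latex
% N_\tau solves (3.10): |N_\tau| = 1 and
\[
\operatorname{curl} N_\tau
= \bigl(-\partial_3\sin\tau x_3,\ \partial_3\cos\tau x_3,\ 0\bigr)
= (-\tau\cos\tau x_3,\ -\tau\sin\tau x_3,\ 0) = -\tau N_\tau .
\]
% Taylor expansion in \tau, uniform for x in the bounded domain \Omega:
\[
N_\tau(x) = \bigl(1 - \tfrac12(\tau x_3)^2 + O(\tau^4),\ \tau x_3 + O(\tau^3),\ 0\bigr)
= e_1 + \tau\,(0,x_3,0) - \tfrac12(\tau x_3)^2e_1 + O(\tau^3).
\]
% The first-order field has the curl required of F_{e_1^-}:
\[
\operatorname{curl}(0,x_3,0) = (-\partial_3 x_3,\ 0,\ 0) = -e_1 = e_1^- .
\]
```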

We can show that, for any fixed $\tau > 0$, $Q_{c3}(K_1,K_2,\kappa,\tau)$ and $Q_{c3}(\kappa,\tau)$ are finite. For any $0 < q < Q_{c3}(K_1,K_2,\kappa,\tau)$, the minimizers of the functional $E$ are non-trivial, which means that the liquid crystal is in a smectic phase. Roughly speaking, when the parameters vary such that $q - Q_{c3}(K_1,K_2,\kappa,\tau)$ changes from a positive value to a negative value, we see nucleation of non-trivial minimizers from a trivial one, and the liquid crystal changes its state from a nematic phase to a smectic phase. We wish to know how the value of $Q_{c3}(\kappa,\tau)$ depends on the parameters and on the domain geometry (see [LP2,3] for the discussions of the upper critical field $H_{C_3}$ for superconductivity).

In order to discuss the existence of non-trivial minimizers, we consider the eigenvalue problem (2.1) for $A = qn$, where $q \ne 0$ and $n \in V(\Omega,S^2)$. Let $\mu(qn) = \mu(qn,\Omega)$ denote the lowest eigenvalue of (2.1) for $A = qn$. We define
\[
\mu_*(q,\tau) \equiv \mu_*(q,\tau,\Omega) = \inf_{n\in\mathcal{C}(\tau)}\mu(qn,\Omega). \tag{3.13}
\]
Note that, for any positive numbers $q$ and $\tau$, $\mu_*(q,\tau)$ is achieved.

Lemma 3.3. Given $q,\tau,\kappa > 0$, if $\mu_*(q,\tau) < \kappa^2$, the functional $E$ has non-trivial minimizers for all positive constants $K_1$, $K_2$.

6 Although [BCLP] consider the solutions of (3.10) in $W^{1,2}(\Omega,S^2)$, the proof given there works as well for the solutions in $V(\Omega,S^2)$.


Proof. A similar statement was given in [LP2] for the Ginzburg-Landau system of superconductivity. Here we include a brief proof for the reader's convenience. Given $q,\tau > 0$, choose $n^* \in \mathcal{C}(\tau)$ such that $\mu(qn^*) = \mu_*(q,\tau)$. Let $\phi^*$ be an eigenfunction of (2.1) for $A = qn^*$. Take $(t\phi^*,n^*)$ as a test function, where
\[
t = \frac{\sqrt{\kappa^2 - \mu(qn^*)}\;\|\phi^*\|_{L^2(\Omega)}}{\kappa\,\|\phi^*\|^2_{L^4(\Omega)}}.
\]
Note that $\operatorname{div} n^* = 0$. We find
\[
C_n(K_1,K_2,\kappa,\tau,q) \le E[t\phi^*,n^*] \le -\frac{1}{2\kappa^2}\bigl(\kappa^2 - \mu(qn^*)\bigr)^2\frac{\|\phi^*\|^4_{L^2(\Omega)}}{\|\phi^*\|^4_{L^4(\Omega)}} < 0.
\]
So the minimizers are non-trivial.
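The choice of $t$ in the proof of Lemma 3.3 can be motivated by the following short computation. It is a sketch that assumes the order-parameter part of $E$ has the Ginzburg-Landau form $|\nabla_{qn}\psi|^2 - \kappa^2|\psi|^2 + \frac{\kappa^2}{2}|\psi|^4$, which is consistent with (3.7), and that $\phi^*$ is an eigenfunction for the lowest eigenvalue $\mu(qn^*)$.

```latex
% Since div n^* = 0 and curl n^* + \tau n^* = 0, the elastic terms vanish, and
% \int_\Omega |\nabla_{qn^*}\phi^*|^2 dx = \mu(qn^*) \|\phi^*\|_{L^2}^2, so
\[
E[t\phi^*,n^*] = t^2\bigl(\mu(qn^*) - \kappa^2\bigr)\|\phi^*\|^2_{L^2(\Omega)}
+ \frac{\kappa^2}{2}\,t^4\,\|\phi^*\|^4_{L^4(\Omega)} .
\]
% This is a quadratic in s = t^2; it is minimized at
\[
t^2 = \frac{\bigl(\kappa^2 - \mu(qn^*)\bigr)\|\phi^*\|^2_{L^2(\Omega)}}{\kappa^2\,\|\phi^*\|^4_{L^4(\Omega)}},
\qquad
E[t\phi^*,n^*] = -\frac{\bigl(\kappa^2 - \mu(qn^*)\bigr)^2}{2\kappa^2}\,
\frac{\|\phi^*\|^4_{L^2(\Omega)}}{\|\phi^*\|^4_{L^4(\Omega)}} < 0 .
\]
```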

For $\mu_*(q,\tau)$ defined in (3.13), let $q_0(\kappa,\tau)$ and $q_1(\kappa,\tau)$ be the least and largest solutions of the equation $\mu_*(q,\tau) = \kappa^2$, namely,
\[
q_0(\kappa,\tau) = \min\{q > 0 : \mu_*(q,\tau) = \kappa^2\}, \qquad q_1(\kappa,\tau) = \max\{q > 0 : \mu_*(q,\tau) = \kappa^2\}. \tag{3.14}
\]
From Lemma 3.3 we have
\[
Q_{c3}(\kappa,\tau) \ge q_0(\kappa,\tau). \tag{3.15}
\]
In fact, given $\kappa > 0$ and $\tau > 0$, for any $0 < q < q_0(\kappa,\tau)$, we have $\mu_*(q,\tau) < \kappa^2$. From Lemma 3.3, $E$ has a non-trivial minimizer for any positive constants $K_1$ and $K_2$. So $Q_{c3}(\kappa,\tau) \ge q$. In order to estimate the value of $Q_{c3}(\kappa,\tau)$ we shall estimate $\mu_*(q,\tau)$, $q_0(\kappa,\tau)$ and $q_1(\kappa,\tau)$.

Lemma 3.4. For $\mu_*(q,\tau)$ defined in (3.13), we have:
\[
\mu_*(q,\tau) =
\begin{cases}
\lambda_*q^2\tau^2 + O(\tau^3) & \text{if } \tau \to 0^+ \text{ and } q \text{ is bounded}, \\
\lambda_*q^2\tau^2 + O(q\tau^2 + q^3\tau^3) & \text{if } \tau \to 0^+,\ q \to +\infty \text{ and } q\tau \to 0, \\
\mu_*(b) + o(1) & \text{if } \tau \to 0^+ \text{ and } q\tau \to b > 0, \\
\beta_0q\tau + o(q\tau) & \text{if } \tau > 0 \text{ is bounded and } q\tau \to +\infty,
\end{cases}
\]
where $\lambda_*$ was given in (2.6), $\mu_*(b)$ was given in (2.9), and $\beta_0$ was given in (2.8).

Proof. Step 1. We estimate $\mu(qN_\tau,\Omega)$ for small $\tau$. For the vector field $F_{e_1^-}$ defined above, we write, as in (2.4),
\[
\lambda(e_1^-) \equiv \lambda(e_1^-,\Omega) = \inf_{\phi\in W^{1,2}(\Omega)}\fint_\Omega|\nabla\phi - F_{e_1^-}|^2dx.
\]


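We claim that
\[
\mu(qN_\tau,\Omega) =
\begin{cases}
\lambda(e_1^-,\Omega)q^2\tau^2 + O(\tau^3) & \text{if } \tau \to 0^+ \text{ and } q \text{ is bounded}, \\
\lambda(e_1^-,\Omega)q^2\tau^2 + O(q\tau^2 + q^3\tau^3) & \text{if } \tau \to 0^+,\ q \to +\infty \text{ and } q\tau \to 0, \\
\mu(bF_{e_1^-},\Omega) + O(\tau) & \text{if } \tau \to 0^+ \text{ and } q\tau \to b > 0, \\
\beta_0q\tau + o(q\tau) & \text{if } \tau > 0 \text{ is bounded and } q\tau \to \infty.
\end{cases} \tag{3.16}
\]
The error estimates in (3.16) are also valid on every domain $Q^t\Omega$, uniformly for $Q \in SO(3)$.

To prove this, we use (3.12), and note that $e_1 = \nabla x_1$. Up to a gauge transformation, we have
\[
\mu(qN_\tau) = \mu(qN_\tau - qe_1) = \mu\bigl(q\tau F_{e_1^-} + q\tau^2A_\tau\bigr).
\]
Note that
\[
\int_\Omega|\nabla_{q\tau F_{e_1^-}+q\tau^2A_\tau}\phi|^2dx = \int_\Omega\Bigl\{|\nabla_{q\tau F_{e_1^-}}\phi|^2 + 2q\tau^2A_\tau\cdot\Re\bigl(i\bar\phi\,\nabla_{q\tau F_{e_1^-}}\phi\bigr) + q^2\tau^4|A_\tau\phi|^2\Bigr\}dx. \tag{3.17}
\]
If $q$ is bounded, we take $\phi = \phi_\tau$ in (3.17) to be the eigenfunction for $\mu(q\tau F_{e_1^-})$ with $\|\phi_\tau\|_{L^2(\Omega)} = 1$. Using the fact $\mu(q\tau F_{e_1^-}) = O(\tau^2)$ and the variational characterization of the eigenvalue, we get
\[
\begin{aligned}
\mu\bigl(q\tau F_{e_1^-} + q\tau^2A_\tau\bigr) &\le \int_\Omega|\nabla_{q\tau F_{e_1^-}+q\tau^2A_\tau}\phi_\tau|^2dx \\
&\le \int_\Omega|\nabla_{q\tau F_{e_1^-}}\phi_\tau|^2dx + 2q\tau^2C\|\phi_\tau\|_{L^2(\Omega)}\|\nabla_{q\tau F_{e_1^-}}\phi_\tau\|_{L^2(\Omega)} + C^2q^2\tau^4\|\phi_\tau\|^2_{L^2(\Omega)} \\
&\le \mu(q\tau F_{e_1^-}) + O(\tau^3) = \lambda(e_1^-)q^2\tau^2 + O(\tau^3).
\end{aligned}
\]
On the other hand, let $\phi^\tau$ be the eigenfunction for $\mu(q\tau F_{e_1^-} + q\tau^2A_\tau)$ with $\|\phi^\tau\|_{L^2(\Omega)} = 1$. Using the proved fact $\mu(q\tau F_{e_1^-} + q\tau^2A_\tau) = O(\tau^2)$, similar to (3.17) we have
\[
\begin{aligned}
\mu(q\tau F_{e_1^-}) &\le \int_\Omega|\nabla_{q\tau F_{e_1^-}}\phi^\tau|^2dx \\
&\le \int_\Omega\Bigl\{|\nabla_{q\tau F_{e_1^-}+q\tau^2A_\tau}\phi^\tau|^2 - 2q\tau^2A_\tau\cdot\Re\bigl(i\bar\phi^\tau\nabla_{q\tau F_{e_1^-}+q\tau^2A_\tau}\phi^\tau\bigr) + q^2\tau^4|A_\tau\phi^\tau|^2\Bigr\}dx \\
&\le \mu\bigl(q\tau F_{e_1^-} + q\tau^2A_\tau\bigr) + 2q\tau^2C\|\phi^\tau\|_{L^2(\Omega)}\|\nabla_{q\tau F_{e_1^-}+q\tau^2A_\tau}\phi^\tau\|_{L^2(\Omega)} + C^2q^2\tau^4\|\phi^\tau\|^2_{L^2(\Omega)} \\
&\le \mu\bigl(q\tau F_{e_1^-} + q\tau^2A_\tau\bigr) + O(\tau^3).
\end{aligned}
\]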
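The integrand in (3.17) comes from a pointwise expansion of the magnetic gradient. The sketch below assumes the convention $\nabla_A\phi = \nabla\phi - iA\phi$ for a real vector field $A$, which matches the identity $\nabla\psi_j = \nabla_{qn_j}\psi_j + iqn_j\psi_j$ used in the proof of Lemma 3.2; the real-part symbol $\Re$ is a reconstruction and should be read as an assumption.

```latex
% For real vector fields A, B and complex \phi, write a = \nabla_A\phi and b = -iB\phi,
% and use |a + b|^2 = |a|^2 + 2\Re(\bar b \cdot a) + |b|^2 with \bar b = iB\bar\phi:
\[
|\nabla_{A+B}\phi|^2 = |\nabla_A\phi - iB\phi|^2
= |\nabla_A\phi|^2 + 2B\cdot\Re\bigl(i\bar\phi\,\nabla_A\phi\bigr) + |B\phi|^2 .
\]
% With A = q\tau F_{e_1^-} and B = q\tau^2 A_\tau this is the integrand of (3.17).
% Since |A_\tau| \le C, the Cauchy-Schwarz inequality bounds the cross term:
\[
\Bigl|\int_\Omega 2q\tau^2A_\tau\cdot\Re\bigl(i\bar\phi\,\nabla_A\phi\bigr)dx\Bigr|
\le 2q\tau^2C\,\|\phi\|_{L^2(\Omega)}\,\|\nabla_A\phi\|_{L^2(\Omega)} .
\]
```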


Thus
\[
\mu\bigl(q\tau F_{e_1^-} + q\tau^2A_\tau\bigr) \ge \mu\bigl(q\tau F_{e_1^-}\bigr) - O(\tau^3).
\]
So we get the first conclusion in (3.16). If $q \to +\infty$ and $q\tau$ is bounded, then $\mu(q\tau F_{e_1^-})$ is bounded. We use (3.17) to get
\[
\mu(qN_\tau) = \mu\bigl(q\tau F_{e_1^-}\bigr) + O(q\tau^2).
\]
If $q\tau \to 0^+$, this equality and Lemma 2.1 together give the second conclusion in (3.16). If $q\tau \to b > 0$, this equality gives the third conclusion in (3.16). If $q\tau \to +\infty$, we rescale the eigenfunctions and consider the limiting equation as in the proof of Lemma 2.2, and show that
\[
\lim_{q\tau\to+\infty}\frac{\mu(qN_\tau)}{q\tau} = \beta_0.
\]
Thus the fourth conclusion of (3.16) is true. From the proof we see that the error estimates in (3.16) are valid on every domain $Q^t\Omega$, uniformly for $Q \in SO(3)$.

Step 2. For every $n \in \mathcal{C}(\tau)$, there exists $Q \in SO(3)$ such that $n(x) = QN_\tau(Q^tx)$. Therefore
\[
\mu(qn,\Omega) = \inf_{\phi\in W^{1,2}(\Omega)}\frac{\int_\Omega|\nabla\phi - iqQN_\tau(Q^tx)\phi|^2dx}{\int_\Omega|\phi|^2dx}
= \inf_{\psi\in W^{1,2}(Q^t\Omega)}\frac{\int_{Q^t\Omega}|\nabla_{qN_\tau}\psi|^2dx}{\int_{Q^t\Omega}|\psi|^2dx} = \mu(qN_\tau,Q^t\Omega).
\]
Hence
\[
\mu_*(q,\tau) \equiv \min_{n\in\mathcal{C}(\tau)}\mu(qn,\Omega) = \min_{Q\in SO(3)}\mu(qN_\tau,Q^t\Omega). \tag{3.18}
\]
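The change of domain behind (3.18) is a rotation of coordinates. A minimal sketch, under the convention $\nabla_A\phi = \nabla\phi - iA\phi$ (the symbols $y$ and the pairing $\phi \leftrightarrow \psi$ are introduced here only for illustration):

```latex
% For Q \in SO(3) and \psi defined on Q^t\Omega, set y = Q^t x and \phi(x) = \psi(Q^t x).
% The chain rule gives \nabla_x\phi(x) = Q(\nabla_y\psi)(y), hence
\[
\nabla\phi(x) - iq\,QN_\tau(Q^tx)\,\phi(x)
= Q\bigl[(\nabla_y\psi)(y) - iqN_\tau(y)\psi(y)\bigr] .
\]
% Q is real orthogonal, so it preserves the norm of complex vectors, and \det Q = 1:
\[
\bigl|\nabla\phi - iq\,QN_\tau(Q^tx)\phi\bigr|^2(x) = |\nabla_{qN_\tau}\psi|^2(y),
\qquad |\phi(x)|^2 = |\psi(y)|^2, \qquad dx = dy .
\]
% Therefore the Rayleigh quotients on \Omega and on Q^t\Omega coincide.
```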

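For any positive constants $q$ and $\tau$, choose $n_{q,\tau} \in \mathcal{C}(\tau)$ such that $\mu(qn_{q,\tau},\Omega) = \mu_*(q,\tau)$. Choose $Q_{q,\tau} \in SO(3)$ such that $n_{q,\tau}(x) = Q_{q,\tau}N_\tau(Q^t_{q,\tau}x)$. Then $\mu_*(q,\tau) = \mu(qn_{q,\tau},\Omega) = \mu(qN_\tau,Q^t_{q,\tau}\Omega)$. We repeat the argument in Step 1 to estimate $\mu(qN_\tau,Q^t_{q,\tau}\Omega)$. Recall that (3.16) holds uniformly for $Q^t\Omega$, thus the last equality is true. If $q\tau \to 0^+$, we have
\[
\begin{aligned}
\mu_*(q,\tau) &= \min_{Q\in SO(3)}\lambda(e_1^-,Q^t\Omega)\,q^2\tau^2 + O(q\tau^2 + q^3\tau^3) \\
&= \min_{h\in S^2}\lambda(h,\Omega)\,q^2\tau^2 + O(q\tau^2 + q^3\tau^3) = \lambda_*q^2\tau^2 + O(q\tau^2 + q^3\tau^3).
\end{aligned}
\]
If $q\tau \to b > 0$, then
\[
\mu_*(q,\tau) = \min_{Q\in SO(3)}\bigl\{\mu(bF_{e_1^-},Q^t\Omega) + O(\tau)\bigr\} = \min_{h\in S^2}\{\mu(bF_h,\Omega) + O(\tau)\} = \mu_*(b) + O(\tau).
\]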


Lemma 3.5. Let $q_j(\kappa,\tau)$, $j = 0,1$, be the numbers defined in (3.14). We have
\[
q_j(\kappa,\tau) =
\begin{cases}
\dfrac{a_j(\kappa)}{\tau} + o\Bigl(\dfrac{1}{\tau}\Bigr) & \text{if } \kappa > 0 \text{ is fixed, and } \tau \to 0^+, \\[2mm]
\dfrac{\kappa}{\tau\sqrt{\lambda_*}} + O\Bigl(1 + \dfrac{\kappa^2}{\tau}\Bigr) & \text{if } \tau,\kappa \to 0^+,\ \kappa \gg \tau, \\[2mm]
\dfrac{\kappa}{\tau\sqrt{\lambda_*}} + O(\tau) & \text{if } \tau,\kappa \to 0^+,\ \kappa = O(\tau), \\[2mm]
\dfrac{\kappa^2}{\beta_0\tau} + o\Bigl(\dfrac{\kappa^2}{\tau}\Bigr) & \text{if } \tau > 0 \text{ is bounded and } \kappa \to +\infty,
\end{cases} \tag{3.19}
\]
where $a_j(\kappa)$ was given in (2.10), $\lambda_*$ was given in (2.6), and $\beta_0$ was given in (2.8).

Proof. From (3.14) we see that
\[
\mu_*(q_0(\kappa,\tau),\tau) = \mu_*(q_1(\kappa,\tau),\tau) = \kappa^2. \tag{3.20}
\]
Step 1. Let $\kappa > 0$ be fixed, and $\tau \to 0^+$. We prove the first equality in (3.19) for $j = 0$. Since $\mu_*(q_0(\kappa,\tau),\tau)$ is bounded, from Lemma 3.4, $\tau q_0(\kappa,\tau)$ remains bounded as $\tau \to 0$. Choose a subsequence $\tau_j \to 0$ such that $\lim_{j\to\infty}\tau_jq_0(\kappa,\tau_j) = \liminf_{\tau\to0^+}\tau q_0(\kappa,\tau) = b$. From Lemma 3.4 (third equality), $\mu_*(q_0(\kappa,\tau_j),\tau_j) = \mu_*(b) + o(1)$, where $\mu_*(b)$ is the number defined in (2.9). Comparing this with (3.20) we find $\mu_*(b) = \kappa^2$, hence $b \ge a_0(\kappa)$ since $a_0(\kappa)$ is the least root of this equation. Thus
\[
\liminf_{\tau\to0^+}\tau q_0(\kappa,\tau) \ge a_0(\kappa).
\]
On the other hand, when $\tau \to 0^+$, we choose $q(\tau) = a_0(\kappa)/\tau$ and use Lemma 3.4 (third equality) to find that
\[
\mu_*(q(\tau),\tau) = \mu_*(a_0(\kappa)) + o(1) = \kappa^2 + o(1).
\]
Note that $\mu_*(q,\tau)$ is continuous in $q > 0$. So we can find $0 < q \le a_0(\kappa)/\tau + o(1)$ such that $\mu_*(q,\tau) = \kappa^2$. From the definition of $q_0(\kappa,\tau)$ we get
\[
q_0(\kappa,\tau) \le q(\tau) + o\Bigl(\frac{1}{\tau}\Bigr) = \frac{a_0(\kappa)}{\tau} + o\Bigl(\frac{1}{\tau}\Bigr).
\]
Thus the first equality in (3.19) is proved for $q_0(\kappa,\tau)$. The proof for $q_1(\kappa,\tau)$ is similar.

Step 2. Let $\kappa,\tau \to 0^+$. Let $q_\tau$ denote either $q_0(\kappa,\tau)$ or $q_1(\kappa,\tau)$. From (3.14) we have $\mu_*(q_\tau,\tau) = \kappa^2 \to 0$. Hence, from Lemma 3.4 we find $\tau q_\tau \to 0$, and
\[
\kappa^2 = \mu_*(q_\tau,\tau) = \lambda_*q_\tau^2\tau^2 + O\bigl(q_\tau\tau^2 + q_\tau^3\tau^3\bigr). \tag{3.21}
\]
If $\kappa \gg \tau$, (3.21) implies that $q_\tau \to +\infty$ and $q_\tau\tau = O(\kappa)$ as $\tau \to 0$. Plugging $q_\tau\tau = O(\kappa)$ into the last term in the right-hand side of (3.21), we get $\kappa^2 = \lambda_*q_\tau^2\tau^2 + O(\kappa\tau + \kappa^3)$. Thus
\[
q_\tau = \frac{\kappa}{\tau\sqrt{\lambda_*}} + O\Bigl(1 + \frac{\kappa^2}{\tau}\Bigr).
\]
We get the second equality in (3.19). If $\kappa = O(\tau)$, from (3.21) we see that $q_\tau$ is bounded. From the first equality in Lemma 3.4 we have $\kappa^2 = \mu_*(q_\tau,\tau) = \lambda_*q_\tau^2\tau^2 + O(\tau^3)$ and hence
\[
q_\tau = \frac{\kappa}{\tau\sqrt{\lambda_*}} + O(\tau).
\]
The third equality in (3.19) is true.
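For the regime $\kappa \gg \tau$ in Step 2, the passage from $\kappa^2 = \lambda_*q_\tau^2\tau^2 + O(\kappa\tau + \kappa^3)$ to the stated remainder can be spelled out as follows. This is a sketch that only uses $\tau/\kappa \to 0$, $\kappa \to 0$ and $\sqrt{1+x} = 1 + O(x)$ for small $x$.

```latex
% Solve \kappa^2 = \lambda_* q_\tau^2 \tau^2 + O(\kappa\tau + \kappa^3) for q_\tau:
\[
q_\tau^2 = \frac{\kappa^2}{\lambda_*\tau^2}\Bigl(1 + O\Bigl(\frac{\tau}{\kappa} + \kappa\Bigr)\Bigr)
\quad\Longrightarrow\quad
q_\tau = \frac{\kappa}{\tau\sqrt{\lambda_*}}\Bigl(1 + O\Bigl(\frac{\tau}{\kappa} + \kappa\Bigr)\Bigr)
= \frac{\kappa}{\tau\sqrt{\lambda_*}} + O\Bigl(1 + \frac{\kappa^2}{\tau}\Bigr).
\]
```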


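Step 3. Let $\tau$ be bounded and $\kappa \to +\infty$. Again we let $q_\tau$ denote either $q_0(\kappa,\tau)$ or $q_1(\kappa,\tau)$. From (3.20) and Lemma 3.4 we see that $q_\tau\tau \to +\infty$, and $\kappa^2 = \mu_*(q_\tau,\tau) = \beta_0q_\tau\tau + o(q_\tau\tau)$. Thus $q_\tau\tau = \kappa^2/\beta_0 + o(\kappa^2)$, and
\[
q_\tau = \frac{\kappa^2}{\tau\beta_0} + o\Bigl(\frac{\kappa^2}{\tau}\Bigr).
\]
The last equality in (3.19) follows.

The following theorem presents our estimates of $Q_{c3}(\kappa,\tau)$ for positive $\kappa$ and $\tau$.

Theorem 3.6. (i) If $\kappa > 0$ is fixed and $\tau \to 0^+$, we have
\[
\frac{a_0(\kappa)}{\tau} + o\Bigl(\frac{1}{\tau}\Bigr) \le Q_{c3}(\kappa,\tau) \le \frac{a_1(\kappa)}{\tau} + o\Bigl(\frac{1}{\tau}\Bigr), \tag{3.22}
\]
where $a_0(\kappa)$, $a_1(\kappa)$ were given in (2.10).
(ii) If $\kappa$ and $\tau \to 0^+$, we have
\[
Q_{c3}(\kappa,\tau) = \frac{\kappa}{\tau\sqrt{\lambda_*}} + o\Bigl(\frac{\kappa}{\tau}\Bigr), \tag{3.23}
\]
where $\lambda_*$ was given in (2.6).
(iii) If $\tau > 0$ is bounded and $\kappa \to +\infty$, we have
\[
Q_{c3}(\kappa,\tau) \ge \frac{\kappa^2}{\beta_0\tau} + o\Bigl(\frac{\kappa^2}{\tau}\Bigr), \tag{3.24}
\]
where $\beta_0$ was given in (2.8).

Proof. From (3.15) and (3.19) we see that (3.22), (3.23) and (3.24) give correct lower bounds of $Q_{c3}(\kappa,\tau)$. We shall show that (3.22) and (3.23) also give correct upper bounds.

Step 1. Let $\kappa > 0$ be fixed and $\tau \to 0^+$. We choose $K_1 = K_2 = K_\tau \gg 1/\tau^2$ and we shall prove that
\[
Q_{c3}(K_\tau,K_\tau,\kappa,\tau) \le \frac{a_1(\kappa)}{\tau} + o\Bigl(\frac{1}{\tau}\Bigr) \quad \text{as } \tau \to 0. \tag{3.25}
\]
Let us choose $q_\tau$ such that
\[
0 < q_\tau < Q_{c3}(K_\tau,K_\tau,\kappa,\tau) \quad \text{and} \quad \tau q_\tau \to b > 0 \quad \text{as } \tau \to 0. \tag{3.26}
\]
Then $E$ has non-trivial minimizers $(\psi_\tau,n_\tau)$, and
\[
\int_\Omega\Bigl\{|\nabla_{q_\tau n_\tau}\psi_\tau|^2 + \frac{\kappa^2}{2}\bigl(|\psi_\tau|^2 - 1\bigr)^2 + K_\tau|\operatorname{div} n_\tau|^2 + K_\tau|\operatorname{curl} n_\tau + \tau n_\tau|^2\Bigr\}dx
= E[\psi_\tau,n_\tau] + \frac{\kappa^2}{2}|\Omega| \le E[0,N_\tau] + \frac{\kappa^2}{2}|\Omega| = \frac{\kappa^2}{2}|\Omega|. \tag{3.27}
\]
Thus, as $\tau \to 0^+$,
\[
\int_\Omega\bigl\{|\operatorname{div} n_\tau|^2 + |\operatorname{curl} n_\tau + \tau n_\tau|^2\bigr\}dx \le \frac{\kappa^2|\Omega|}{2K_\tau} \to 0. \tag{3.28}
\]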


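Step 1.1. Estimates of $n_\tau$. Let $n_\tau = u_\tau + \nabla\xi_\tau$ be the canonical decomposition as defined in Lemma 3.1. From (3.6) and (3.28), $\{\xi_\tau\}$ is bounded in $W^{1,2}(\Omega)$, and $u_\tau \to 0$ in $W^{1,2}(\Omega,\mathbb{R}^3)$. Using the argument in the proof of Lemma 3.2 we can find a subsequence, still denoted by $(\xi_\tau,u_\tau)$, such that, as $\tau \to 0^+$, $\xi_\tau \to \xi_0$ weakly in $W^{2,2}_{loc}(\Omega)$ and strongly in $W^{1,2}(\Omega)$, and $n_\tau = u_\tau + \nabla\xi_\tau \to \nabla\xi_0 \equiv n_0$ weakly in $V(\Omega,\mathbb{R}^3)$, strongly in $W^{1,2}_{loc}(\Omega,\mathbb{R}^3)$ and $L^p(\Omega)$ for $1 \le p < \infty$, where $\xi_0$ satisfies
\[
\xi_0 \in W^{2,2}_{loc}(\Omega) \cap W^{1,2}(\Omega), \qquad \Delta\xi_0 = 0 \ \text{in } L^2(\Omega), \qquad |\nabla\xi_0| = 1 \ \text{a.e. in } \Omega. \tag{3.29}
\]
Claim. If $\xi_0$ satisfies (3.29) in a connected domain $\Omega$, then $n_0$ is a constant unit vector.

To prove the claim, note that, since $\xi_0 \in W^{2,2}_{loc}(\Omega)$, by the elliptic estimate we know that $\xi_0$ is smooth in $\Omega$. Hence, $\Delta\xi_0 = 0$ and $|\nabla\xi_0|^2 = 1$ hold everywhere in $\Omega$. So
\[
0 = \Delta|\nabla\xi_0|^2 = 2\sum_{i,j=1}^3\Bigl(\frac{\partial^2\xi_0}{\partial x_i\partial x_j}\Bigr)^2 + 2\sum_{i=1}^3\frac{\partial\xi_0}{\partial x_i}\frac{\partial(\Delta\xi_0)}{\partial x_i} = 2\sum_{i,j=1}^3\Bigl(\frac{\partial^2\xi_0}{\partial x_i\partial x_j}\Bigr)^2.
\]
Therefore $\frac{\partial^2\xi_0}{\partial x_i\partial x_j} = 0$ for all $i,j$. Thus $\frac{\partial\xi_0}{\partial x_i}$ is constant, i.e. $n_0 = \nabla\xi_0$ is a constant vector.

Step 1.2. Let
\[
\frac{1}{\tau}(n_\tau - n_0) = v_\tau + \nabla\zeta_\tau
\]
be the canonical decomposition of $(n_\tau - n_0)/\tau$ as defined in Lemma 3.1. Since $n_0$ is a constant vector, from Lemma 3.1 and (3.28), we have
\[
\|v_\tau\|^2_{W^{1,2}(\Omega)} \le \frac{C}{\tau^2}\|\operatorname{curl} n_\tau\|^2_{L^2(\Omega)} \le C\Bigl(1 + \frac{1}{\tau^2K_\tau}\Bigr) \le C.
\]
Passing to a subsequence again we may assume that $v_\tau \to v_0$ weakly in $W^{1,2}(\Omega,\mathbb{R}^3)$ and strongly in $L^2(\Omega)$ as $\tau \to 0^+$, where $\operatorname{div} v_0 = 0$ in $\Omega$, and $\gamma_\nu v_0 = 0$ on $\partial\Omega$ since $\gamma_\nu v_\tau = 0$ on $\partial\Omega$ for all $\tau$. Write $w_\tau = v_\tau - v_0$; then $\operatorname{div} w_\tau = 0$ in $\Omega$, $\gamma_\nu w_\tau = 0$ on $\partial\Omega$, and $w_\tau \to 0$ in $L^2(\Omega)$ strongly. From (3.28),
\[
\|\operatorname{curl} w_\tau\|_{L^2(\Omega)} = \|\operatorname{curl}(v_\tau - v_0)\|_{L^2(\Omega)} = \|\operatorname{curl} v_\tau + n_0\|_{L^2(\Omega)} \le \|\operatorname{curl} v_\tau + n_\tau\|_{L^2(\Omega)} + \|n_\tau - n_0\|_{L^2(\Omega)} \to 0 \quad \text{as } \tau \to 0^+.
\]
From this and (3.3) we have, as $\tau \to 0^+$,
\[
\|w_\tau\|_{W^{1,2}(\Omega)} \le C\bigl\{\|w_\tau\|_{L^2(\Omega)} + \|\operatorname{curl} w_\tau\|_{L^2(\Omega)}\bigr\} \to 0.
\]
Since $\operatorname{curl} v_0 = -n_0 = n_0^-$ and $\Omega$ is simply-connected, there exists a function $\eta$ such that
\[
v_0 = F_{n_0^-} + \nabla\eta, \quad \text{and} \quad \Delta\eta = \operatorname{div} v_0 - \operatorname{div} F_{n_0^-} = 0.
\]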
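The $W^{1,2}$ bound on $v_\tau$ in Step 1.2 combines the second inequality in (3.6) with (3.28) and the constraint $|n_\tau| = 1$. A short sketch of the intermediate step, with $C$ a generic constant depending on $\Omega$ and $\kappa$:

```latex
% n_0 is constant, so curl n_0 = 0 and curl((n_\tau - n_0)/\tau) = \tau^{-1} curl n_\tau.
% Split curl n_\tau = (curl n_\tau + \tau n_\tau) - \tau n_\tau and use |n_\tau| = 1 with (3.28):
\[
\|\operatorname{curl} n_\tau\|^2_{L^2(\Omega)}
\le 2\|\operatorname{curl} n_\tau + \tau n_\tau\|^2_{L^2(\Omega)} + 2\tau^2\|n_\tau\|^2_{L^2(\Omega)}
\le \frac{\kappa^2|\Omega|}{K_\tau} + 2\tau^2|\Omega| .
\]
% Dividing by \tau^2 and applying (3.6) to (n_\tau - n_0)/\tau:
\[
\|v_\tau\|^2_{W^{1,2}(\Omega)} \le \frac{C}{\tau^2}\|\operatorname{curl} n_\tau\|^2_{L^2(\Omega)}
\le C\Bigl(1 + \frac{1}{\tau^2K_\tau}\Bigr),
\]
% which stays bounded because K_\tau \gg \tau^{-2}.
```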


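Step 1.3. Let $\chi_\tau = \zeta_\tau + \eta$. We can write
\[
\psi_\tau = e^{iq_\tau(n_0\cdot x + \tau\chi_\tau)}\varphi_\tau, \qquad n_\tau = n_0 + \tau F_{n_0^-} + \tau w_\tau + \tau\nabla\chi_\tau. \tag{3.30}
\]
From (3.28) we have
\[
\int_\Omega|\Delta\chi_\tau|^2dx \le \frac{\kappa^2|\Omega|}{2\tau^2K_\tau}. \tag{3.31}
\]
We compute
\[
\nabla_{q_\tau n_\tau}\psi_\tau = e^{iq_\tau(n_0\cdot x + \tau\chi_\tau)}\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\varphi_\tau,
\]
\[
E[\psi_\tau,n_\tau] = \int_\Omega\Bigl\{|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\varphi_\tau|^2 - \kappa^2|\varphi_\tau|^2 + \frac{\kappa^2}{2}|\varphi_\tau|^4 + K_\tau\tau^2|\Delta\chi_\tau|^2 + K_\tau\tau^2\bigl|\operatorname{curl} w_\tau + \tau(F_{n_0^-} + w_\tau + \nabla\chi_\tau)\bigr|^2\Bigr\}dx < 0.
\]
Hence
\[
\int_\Omega|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\varphi_\tau|^2dx < \kappa^2\|\varphi_\tau\|^2_{L^2(\Omega)}.
\]
Let $\mu_\tau = \mu(\tau q_\tau(F_{n_0^-} + w_\tau))$ be the eigenvalue of (2.1) for $A = \tau q_\tau(F_{n_0^-} + w_\tau)$. Since $\varphi_\tau \not\equiv 0$, we have
\[
\mu_\tau < \kappa^2. \tag{3.32}
\]
Let $\phi_\tau$ be the eigenfunction of (2.1) associated with $\mu_\tau$ such that $\|\phi_\tau\|_{L^2(\Omega)} = 1$. From Kato's inequality we have
\[
\mu_\tau = \int_\Omega|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\phi_\tau|^2dx \ge \int_\Omega|\nabla|\phi_\tau||^2dx.
\]
Hence
\[
\||\phi_\tau|\|_{W^{1,2}(\Omega)} \le C\bigl(1 + \sqrt{\mu_\tau}\bigr) \le C(1 + \kappa).
\]
We also have
\[
\begin{aligned}
\|\nabla\phi_\tau\|_{L^2(\Omega)} &= \|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\phi_\tau + i\tau q_\tau(F_{n_0^-} + w_\tau)\phi_\tau\|_{L^2(\Omega)} \\
&\le \|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\phi_\tau\|_{L^2(\Omega)} + \|\tau q_\tau(F_{n_0^-} + w_\tau)\phi_\tau\|_{L^2(\Omega)} \\
&\le \sqrt{\mu_\tau} + \tau q_\tau\|F_{n_0^-} + w_\tau\|_{L^4(\Omega)}\|\phi_\tau\|_{L^4(\Omega)} \\
&\le \sqrt{\mu_\tau} + C\tau q_\tau\|F_{n_0^-} + w_\tau\|_{W^{1,2}(\Omega)}\||\phi_\tau|\|_{W^{1,2}(\Omega)} \\
&\le C(1 + \kappa)(1 + \tau q_\tau).
\end{aligned}
\]
From this and (3.26) we see that $\{\phi_\tau\}$ is bounded in $W^{1,2}(\Omega,\mathbb{C})$. After passing to a subsequence we may assume that $\phi_\tau \to \phi_0$ weakly in $W^{1,2}(\Omega,\mathbb{C})$ and strongly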


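in $L^4(\Omega)$ as $\tau \to 0^+$, and $\|\phi_0\|_{L^2(\Omega)} = 1$. Since $\tau q_\tau \to b$ and $w_\tau \to 0$ strongly in $W^{1,2}(\Omega,\mathbb{R}^3)$, we see that $\tau q_\tau(F_{n_0^-} + w_\tau)\phi_\tau \to bF_{n_0^-}\phi_0$ strongly in $L^2(\Omega)$, and $\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\phi_\tau \to \nabla_{bF_{n_0^-}}\phi_0$ weakly in $L^2(\Omega)$. Hence, from (3.32),
\[
\int_\Omega|\nabla_{bF_{n_0^-}}\phi_0|^2dx \le \liminf_{\tau\to0}\int_\Omega|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\phi_\tau|^2dx = \liminf_{\tau\to0}\mu_\tau \le \kappa^2.
\]
So $\mu(bF_{n_0^-}) \le \kappa^2$ and hence $\mu_*(b) \le \mu(bF_{n_0^-}) \le \kappa^2$. Note that $\mu_*(b)$ is continuous in $b > 0$. Thus there exists a number $b' \ge b$ such that $\mu_*(b') = \kappa^2$. From the definition of $a_1(\kappa)$, we find $a_1(\kappa) \ge b' \ge b$. We have proved that, for any sequence $\{q_\tau\}$ with $0 < q_\tau < Q_{c3}(\kappa,\tau)$ and $\tau q_\tau \to b$ we always have $b \le a_1(\kappa)$. So
\[
\limsup_{\tau\to0^+}\tau Q_{c3}(\kappa,\tau) \le a_1(\kappa).
\]
Thus (3.25) is true. Now (3.22) is proved.

Step 2. Next we consider the case where $\kappa$ and $\tau$ are small. As in Step 1, let $K_1 = K_2 = K_\tau \gg 1/\tau^2$. Choose $q_\tau$ such that
\[
0 < q_\tau < Q_{c3}(K_\tau,K_\tau,\kappa,\tau), \qquad \tau q_\tau \to 0.
\]
Let $(\psi_\tau,n_\tau)$ be the minimizer of $E$ (note that the minimizers also depend on $\kappa$). We repeat the discussion in Step 1 and find that (3.30) and (3.32) remain true with $\|w_\tau\|_{W^{1,2}(\Omega)} \to 0$. Again we let $\phi_\tau$ be the eigenfunction of (2.1) for $A = \tau q_\tau(F_{n_0^-} + w_\tau)$ associated with $\mu_\tau = \mu(\tau q_\tau(F_{n_0^-} + w_\tau))$ and $\|\phi_\tau\|_{L^2(\Omega)} = 1$. Again we can show that $\{\phi_\tau\}$ is bounded in $W^{1,2}(\Omega,\mathbb{C})$. Now we use the technique in the proof of Lemma 3.4 to compute
\[
\begin{aligned}
\mu\bigl(\tau q_\tau F_{n_0^-}\bigr) &\le \int_\Omega\Bigl\{|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\phi_\tau|^2 - 2\tau q_\tau w_\tau\cdot\Re\bigl(i\bar\phi_\tau\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\phi_\tau\bigr) + \tau^2q_\tau^2|w_\tau\phi_\tau|^2\Bigr\}dx \\
&\le \mu\bigl(\tau q_\tau(F_{n_0^-} + w_\tau)\bigr) + 2\tau q_\tau\|w_\tau\phi_\tau\|_{L^2(\Omega)}\|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\phi_\tau\|_{L^2(\Omega)} + \tau^2q_\tau^2\|w_\tau\phi_\tau\|^2_{L^2(\Omega)} \\
&\le \kappa^2 + o(\tau q_\tau\kappa) + o(\tau^2q_\tau^2) = (1 + o(1))\kappa^2 + o(\tau^2q_\tau^2).
\end{aligned}
\]
Using this and Lemma 2.1 we find
\[
\bigl\{\lambda(n_0^-) + o(1)\bigr\}\tau^2q_\tau^2 \le (1 + o(1))\kappa^2.
\]
Thus
\[
(\lambda_* + o(1))\tau^2q_\tau^2 \le (1 + o(1))\kappa^2, \qquad q_\tau \le (1 + o(1))\frac{\kappa}{\tau\sqrt{\lambda_*}}.
\]
This implies
\[
Q_{c3}(\kappa,\tau) \le (1 + o(1))\frac{\kappa}{\tau\sqrt{\lambda_*}}.
\]
Hence (3.23) is true.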


Remark 3.7. For $n_0$ given in (3.30), we choose $Q \in SO(3)$ such that $Qe_1 = n_0$. From (3.12), we can write the second expansion in (3.30) (which is true for a subsequence) as
\[
n_\tau = QN_\tau(Q^tx) + \tau\hat w_\tau + \tau\nabla\chi_\tau, \tag{3.33}
\]
where $\hat w_\tau = w_\tau - \tau QA_\tau(Q^tx) \to 0$ strongly in $W^{1,2}(\Omega,\mathbb{R}^3)$ as $\tau \to 0^+$, and
\[
\Delta\chi_\tau \to 0 \quad \text{in } L^2(\Omega). \tag{3.34}
\]
Let us call a sequence $\chi_\tau$ satisfying (3.34) an approximately harmonic sequence. Then (3.33) says that the director fields $n_\tau$ are approximated by the solutions of (3.10) up to an additive approximately harmonic sequence.

Bauman-Calderer-Liu-Phillips [BCLP] investigated the minimizers of the Landau-de Gennes functional in the space $W^{1,2}(\Omega,\mathbb{C}) \times W^{1,2}(\Omega,S^2)$, with some restrictions on the elastic coefficients $K_j$'s. It was proved in [BCLP] that there exist positive constants $\beta_1$, $K_*$, $\beta_2$, $K^*$ and $\lambda$ such that, if $\min\{q\tau,(q\tau)^2\} \le \kappa^2/\beta_1$, $q \ge \tau$ and $K_2 \ge K_*$, then the minimizers are non-trivial; and if $\min\{q\tau,(q\tau)^2\} \ge \kappa^2/\beta_2$, $q > \lambda\tau$ and $K_2 > K^*$, then the minimizers are trivial. These results imply the following estimate: if $K_2$ is large and $q \ge \tau$, then
\[
\max\Bigl\{\frac{\kappa^2}{\tau\beta_1},\ \frac{\kappa}{\tau\sqrt{\beta_1}}\Bigr\} \le Q_{c3}(K_1,K_2,\kappa,\tau) \le \max\Bigl\{\frac{\kappa^2}{\tau\beta_2},\ \frac{\kappa}{\tau\sqrt{\beta_2}},\ \lambda\tau\Bigr\}.
\]
These results are improved by Theorem 3.6 in this paper.

4. Minimizers Under Neumann Condition and with Small Chirality

In this section we examine the asymptotic behavior, as $\tau \to 0^+$, of the minimizers of the functional $E$ subject to the Neumann boundary condition for both order parameters and director fields. The asymptotic behavior of the minimizers depends on the ratios between $\kappa$, $q$, $K_1$, $K_2$ and $\tau$. In this paper we restrict ourselves to the case where $\kappa > 0$ is fixed and $K_1$, $K_2$ are large.$^7$ We shall see that if $K_1, K_2 \gg \tau^{-2}$, the director fields are approximated by the solutions of (3.10) within error $o(\tau)$ and up to an additive approximately harmonic sequence. However, this is not true if $K_1, K_2 = O(\tau^{-2})$. Let us begin with the case where
\[
\tau \to 0^+, \qquad K_1 \gg \frac{1}{\tau^2}, \qquad K_2 \gg \frac{1}{\tau^2}, \qquad \kappa > 0 \text{ is fixed}. \tag{4.1}
\]
This case has been discussed in the proof of Theorem 3.6, and we know the minimizers $(\psi_\tau,n_\tau)$ have the expansions (3.30). To find more information on the behavior of $\psi_\tau$, we need the following Ginzburg-Landau type functional associated with the vector field $A$:
\[
G_A[\phi] = \int_\Omega\Bigl\{|\nabla_A\phi|^2 - \kappa^2|\phi|^2 + \frac{\kappa^2}{2}|\phi|^4\Bigr\}dx. \tag{4.2}
\]

7 It is consistent with de Gennes' prediction on the elastic coefficients $K_j$'s at the nematic-smectic-C transition. However, here we investigate the asymptotic behavior of non-trivial minimizers, and examine the transition from a smectic phase to a nematic phase.


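If the eigenvalue $\mu(A) < \kappa^2$, then $G_A$ has a non-trivial minimizer in $W^{1,2}(\Omega,\mathbb{C})$. For $b > 0$, we define
\[
c(b) = \inf_{h\in S^2}\ \inf_{\phi\in W^{1,2}(\Omega,\mathbb{C})}G_{bF_h}[\phi].
\]
Let $H_b$ be the subset of $S^2$ consisting of all the unit vectors $h$ such that
\[
\inf_{\phi\in W^{1,2}(\Omega,\mathbb{C})}G_{bF_h}[\phi] = c(b).
\]
Theorem 4.1. Assume (4.1) holds and $q = q_\tau$, where $0 < q_\tau < Q_{c3}(\kappa,\tau)$ and $\tau q_\tau \to b > 0$ as $\tau \to 0^+$. Let $(\psi_\tau,n_\tau)$ be the minimizers of the functional $E$ in $\mathcal{V}(\Omega)$. We have
\[
E[\psi_\tau,n_\tau] = c(b) + o(1).
\]
For any sequence $\tau \to 0^+$, there exist a subsequence $\tau_j \to 0^+$ and $h \in H_b$ such that
\[
\psi_{\tau_j} = e^{-iq_{\tau_j}(h\cdot x - \tau_j\chi_{\tau_j})}(\varphi_0 + z_{\tau_j}), \qquad n_{\tau_j} = -h + \tau_j\bigl(F_h + w_{\tau_j} + \nabla\chi_{\tau_j}\bigr), \tag{4.3}
\]
where $F_h$ is the vector field satisfying (2.2), $\varphi_0 \in W^{1,2}(\Omega,\mathbb{C})$ is a minimizer of $G_{bF_h}$, $w_{\tau_j} \to 0$ in $W^{1,2}(\Omega,\mathbb{R}^3)$, $z_{\tau_j} \to 0$ in $W^{1,2}(\Omega,\mathbb{C})$, and $\Delta\chi_{\tau_j} \to 0$ in $L^2(\Omega)$.

Proof. We first derive an energy upper bound
\[
E[\psi_\tau,n_\tau] \le c(b) + o(1).
\]
Choose $h_b \in H_b$ and $\phi_b \in W^{1,2}(\Omega,\mathbb{C})$ such that $G_{bF_{h_b}}[\phi_b] = c(b)$. Then choose $Q_b \in SO(3)$ such that $Q_be_1 = h_b^- = -h_b$. Let
\[
n_{b\tau}(x) = Q_bN_\tau\bigl(Q_b^tx\bigr), \qquad \psi_{b\tau}(x) = e^{-iq_\tau h_b\cdot x}\phi_b(x).
\]
Then $n_{b\tau} \in \mathcal{C}(\tau)$. From (3.12),
\[
n_{b\tau}(x) = Q_b\bigl\{e_1 + \tau F_{e_1^-}\bigl(Q_b^tx\bigr) + \tau^2A_\tau\bigl(Q_b^tx\bigr)\bigr\} = -h_b + \tau F_{h_b}(x) + \tau^2Q_bA_\tau\bigl(Q_b^tx\bigr).
\]
Hence $q_\tau n_{b\tau}(x) = -q_\tau h_b + bF_{h_b}(x) + B_\tau(x)$, where $B_\tau(x) = (\tau q_\tau - b)F_{h_b}(x) + \tau^2q_\tau Q_bA_\tau(Q_b^tx)$. So
\[
E[\psi_\tau,n_\tau] \le E[\psi_{b\tau},n_{b\tau}] = G_{q_\tau n_{b\tau}}[\psi_{b\tau}] = G_{bF_{h_b}}[\phi_b] + o(1) = c(b) + o(1).
\]
Thus the claimed upper bound is true.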


From the proof of Theorem 3.6 we see that, for any sequence $\tau \to 0^+$, we can find a subsequence of the minimizers, still denoted by $(\psi_\tau,n_\tau)$, and a constant vector $n_0$, such that (3.30) holds. Therefore,
\[
c(b) + o(1) \ge E[\psi_\tau,n_\tau] \ge \int_\Omega\Bigl\{|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\varphi_\tau|^2 - \kappa^2|\varphi_\tau|^2 + \frac{\kappa^2}{2}|\varphi_\tau|^4\Bigr\}dx.
\]
Recall that $\tau q_\tau \to b$ and $w_\tau \to 0$ strongly in $W^{1,2}(\Omega,\mathbb{R}^3)$. As in the proof of Theorem 3.6, we can show that $\{\varphi_\tau\}$ is bounded in $W^{1,2}(\Omega,\mathbb{C})$. After passing to a subsequence we have $\varphi_\tau \to \varphi_0$ weakly in $W^{1,2}(\Omega,\mathbb{C})$ and strongly in $L^4(\Omega)$, and
\[
G_{bF_{n_0^-}}[\varphi_0] \le \liminf_{\tau\to0^+}E[\psi_\tau,n_\tau] \le c(b).
\]
By the definition of $c(b)$ we have $n_0^- \in H_b$, and $G_{bF_{n_0^-}}[\varphi_0] = c(b)$. Then we must have
\[
\lim_{\tau\to0^+}\int_\Omega|\nabla_{\tau q_\tau(F_{n_0^-}+w_\tau)}\varphi_\tau|^2dx = \int_\Omega|\nabla_{bF_{n_0^-}}\varphi_0|^2dx.
\]
From this we can show that $\varphi_\tau \to \varphi_0$ strongly in $W^{1,2}(\Omega,\mathbb{C})$. Write $h = n_0^-$. Then we get the expansions in (4.3).

Remark 4.2. In Theorem 4.1, if $\mu_*(b) \ge \kappa^2$, then $c(b) = 0$, so $\varphi_0 = 0$, and the expansions in (4.3) show that the liquid crystal approaches a nematic phase. If $\mu_*(b) < \kappa^2$, then $c(b) < 0$, so $\varphi_0 \ne 0$, and the liquid crystal remains in a smectic phase.

Next we consider the case where
\[
\tau \to 0^+, \qquad \kappa > 0 \text{ is fixed}, \qquad K_1 = \frac{\mu}{\tau^2}, \qquad K_2 = \frac{\rho}{\tau^2}, \qquad q = \frac{a}{\tau}, \tag{4.4}
\]
where $\mu$, $\rho$ and $a$ are fixed positive constants. To describe the asymptotic behavior of the minimizers of $E$, we define a functional associated with a unit vector $h$,
\[
J_h^0[\phi,v] = \int_\Omega\Bigl\{|\nabla_{av}\phi|^2 - \kappa^2|\phi|^2 + \frac{\kappa^2}{2}|\phi|^4 + \mu|\operatorname{div} v|^2 + \rho|\operatorname{curl} v + h|^2\Bigr\}dx,
\]
\[
M^0(h) = \inf\bigl\{J_h^0[\phi,v] : (\phi,v) \in W^{1,2}(\Omega,\mathbb{C}) \times V(\Omega,\mathbb{R}^3),\ v(x) \perp h \ \text{a.e. in } \Omega\bigr\},
\]
\[
M^0 \equiv M^0(\mu,\rho,a,\kappa) = \inf_{h\in S^2}M^0(h).
\]
Obviously
\[
M^0(h) \le \inf_{\phi\in W^{1,2}(\Omega,\mathbb{C})}G_{aF_{h^-}}[\phi], \qquad M^0 \le c(a).
\]
$J_h^0$ is similar to $E$ in form. However, now we do not require the unit-length constraint (1.3). Instead, we require the orthogonality constraint
\[
v(x)\cdot h = 0 \quad \text{a.e. in } \Omega. \tag{4.5}
\]
Recall that in the proof of Lemma 3.2, for the canonical decomposition $n = u + \nabla\xi$, the fact $|n| = 1$ is used to derive a global $W^{1,2}$ estimate for the divergence part $\xi$. If $v$ satisfies (4.5) and if $v = u + \nabla\xi$ is the canonical decomposition, in order to get a global $W^{1,2}$ estimate for the divergence part, we need other techniques.


2,2 Lemma 4.3. Given h ∈ S2 , u ∈ W 1,2 (, R3 ) and ξ ∈ Wloc () ∩ W 1,2 () such that 2,2 (u + ∇ξ ) · h = 0 a.e. in , there exists a function ζ ∈ W () such that 2 (u + ∇ζ ) · h = 0 in , | ζ | dx ≤ | ξ |2 dx, ζ W 1,2 () ≤ C u W 1,2 () ,

(4.6) where C is independent of h, u, ξ and ζ . ¯ R3 ) and ξ ∈ C 2 (). ¯ Without loss Proof. We first prove the conclusion for u ∈ C 2 (, of generality we assume h = e3 . Write u = (u1 , u2 , u3 ). The first equality in (4.6) becomes ∂ζ = −u3 . ∂x3

(4.7)

For simplicity we assume that 0 ∈ , and has the form = {(x1 , x2 , x3 ) : (x1 , x2 ) ∈ D, f1 (x1 , x2 ) < x3 < f2 (x1 , x2 )} , where D is a bounded domain in R2 with smooth boundary. The argument below can be easily modified for a general domain. Extend u3 if necessary. Introduce x3 u3 (x1 , x2 , s)ds, g(x) = U (x), U (x) = 0

G(x1 , x2 ) =

1 f2 (x1 , x2 ) − f1 (x1 , x2 )

f2 (x1 ,x2 )

g(x1 , x2 , x3 )dx3 . f1 (x1 ,x2 )

Any function ζ satisfying (4.7) has the form ζ (x) = φ(x1 , x2 ) − U (x). We use to denote both 2- and 3-dimensional Laplacian operators. Note that f2 (x1 ,x2 ) (G − g)dx3 = 0. f1 (x1 ,x2 )

¯ we have Thus for any φ(x1 , x2 ) ∈ C 2 (D) | φ(x1 , x2 ) − g(x)|2 dx = | φ(x1 , x2 ) − G(x1 , x2 )|2 +|G(x1 , x2 )−g(x)|2 dx

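Choose $\phi_0(x_1, x_2)$ such that
\[ \Delta\phi_0(x_1, x_2) = G(x_1, x_2) \text{ in } D, \qquad \phi_0(x_1, x_2) = 0 \text{ on } \partial D. \]
We have
\[ \|\phi_0\|_{W^{1,2}(D)} \le C_1 \|G\|_{H^{-1}(D)} \le C_2 \|g\|_{H^{-1}(\Omega)} \le C_3 \|U\|_{W^{1,2}(\Omega)} \le C_4 \|u\|_{W^{1,2}(\Omega)}, \]
where the $C_j$'s depend only on $\Omega$. Let $\zeta_0(x) = \phi_0(x_1, x_2) - U(x)$. Then $\zeta_0 \in W^{2,2}(\Omega)$, and
\[ \|\zeta_0\|_{W^{1,2}(\Omega)} \le C_5 \big\{ \|\phi_0\|_{W^{1,2}(D)} + \|U\|_{W^{1,2}(\Omega)} \big\} \le C_6 \|u\|_{W^{1,2}(\Omega)}. \]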


For all $C^2$ functions $\zeta$ satisfying (4.7) we have
\[ \int_\Omega |\Delta\zeta|^2\, dx \ge \int_\Omega |G(x_1, x_2) - g(x)|^2\, dx = \int_\Omega |\Delta\zeta_0|^2\, dx. \]
In particular this holds for the given function $\xi$. To complete the proof for $u \in W^{1,2}(\Omega, \mathbb{R}^3)$ and $\xi \in W^{2,2}_{loc}(\Omega) \cap W^{1,2}(\Omega)$, we approximate them by $C^2$ vector fields and $C^2$ functions that satisfy (4.7).

Lemma 4.4. (i) For any $h \in S^2$, $M^0(h)$ is achieved. (ii) There exists $h^0 \in S^2$ such that
\[ M^0(h^0) = M^0 = \min_{h \in S^2} M^0(h). \tag{4.8} \]

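Proof. (i) Let $\psi_j \in W^{1,2}(\Omega, \mathbb{C})$, $v_j \in V(\Omega, \mathbb{R}^3)$, $v_j \cdot h = 0$ a.e. in $\Omega$, and $J_h^0[\psi_j, v_j] \to M^0(h)$ as $j \to \infty$. Let $v_j = u_j + \nabla\xi_j$ be the canonical decomposition of $v_j$ as defined in Lemma 3.1. As in the proof of Lemma 3.2, we can pass to a subsequence and assume $u_j \to u_0$ weakly in $W^{1,2}(\Omega, \mathbb{R}^3)$ and strongly in $L^4(\Omega)$. Using Lemma 4.3, we can replace $\xi_j$ by $\zeta_j$ such that $(u_j + \nabla\zeta_j) \cdot h = 0$ a.e. in $\Omega$, $\|\zeta_j\|_{W^{1,2}(\Omega)} \le C \|u_j\|_{W^{1,2}(\Omega)}$ and $\|\Delta\zeta_j\|_{L^2(\Omega)} \le \|\Delta\xi_j\|_{L^2(\Omega)}$. Thus $\{\zeta_j\}$ is bounded in $W^{1,2}(\Omega)$ and $\{\Delta\zeta_j\}$ is bounded in $L^2(\Omega)$. From this, and using the argument in the proof of Lemma 3.2, we see that $\{\zeta_j\}$ is pre-compact in $W^{2,2}_{loc}(\Omega)$. Therefore, passing to a subsequence again we may assume that $\zeta_j \to \zeta_0$ weakly in $W^{1,2}(\Omega)$ and $W^{2,2}_{loc}(\Omega)$, strongly in $W^{1,2}_{loc}(\Omega)$, and $\Delta\zeta_j \to \Delta\zeta_0$ weakly in $L^2(\Omega)$. So $(u_0 + \nabla\zeta_0) \cdot h = 0$ a.e. in $\Omega$. Let $\phi_j = e^{-ia\xi_j}\psi_j$. We have
\[ \int_\Omega \Big\{ |\nabla_{au_j}\phi_j|^2 - \kappa^2|\phi_j|^2 + \frac{\kappa^2}{2}|\phi_j|^4 + \mu|\Delta\zeta_j|^2 + \rho|\mathrm{curl}\, u_j + h|^2 \Big\}\, dx \le J_h^0[\psi_j, v_j] = M^0(h) + o(1). \]
In particular $\{\nabla_{au_j}\phi_j\}$ is bounded in $L^2(\Omega)$ and $\{\phi_j\}$ is bounded in $L^4(\Omega)$. Then we find that $\{\nabla\phi_j\}$ is bounded in $L^2(\Omega)$. Passing to a subsequence again we have $\phi_j \to \phi_0$ weakly in $W^{1,2}(\Omega, \mathbb{C})$ and strongly in $L^4(\Omega)$, and
\[ \int_\Omega \Big\{ |\nabla_{au_0}\phi_0|^2 - \kappa^2|\phi_0|^2 + \frac{\kappa^2}{2}|\phi_0|^4 + \mu|\Delta\zeta_0|^2 + \rho|\mathrm{curl}\, u_0 + h|^2 \Big\}\, dx \le M^0(h). \]
Let
\[ \psi_0 = e^{ia\zeta_0}\phi_0, \qquad v_0 = u_0 + \nabla\zeta_0. \]
Then $J_h^0[\psi_0, v_0] \le M^0(h)$. Since $(\psi_0, v_0)$ is admissible, the equality must hold, namely, $(\psi_0, v_0)$ is a minimizer.

(ii) Choose $h_j \in S^2$, $(\psi_j, v_j) \in W^{1,2}(\Omega, \mathbb{C}) \times V(\Omega, \mathbb{R}^3)$, $v_j \cdot h_j = 0$ a.e. in $\Omega$, such that
\[ M^0(h_j) = J^0_{h_j}[\psi_j, v_j] \to M^0 \quad \text{as } j \to \infty. \]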


Passing to a subsequence we may assume that $h_j \to h^0$, where $h^0 \in S^2$. Using the argument in the proof of (i) involving the canonical decomposition of $v_j$, with $h$ replaced by $h_j$, we find $(\psi_0, v_0) \in W^{1,2}(\Omega, \mathbb{C}) \times V(\Omega, \mathbb{R}^3)$, $v_0 \cdot h^0 = 0$ a.e. in $\Omega$, such that $J^0_{h^0}[\psi_0, v_0] \le M^0$. Then we must have equality $M^0 = M^0(h^0) = J^0_{h^0}[\psi_0, v_0]$.

Let $h^0$ be the unit vector satisfying (4.8) and
\[ \mathcal{M}^0(h^0) = \big\{ (\phi, v) \in W^{1,2}(\Omega, \mathbb{C}) \times V(\Omega, \mathbb{R}^3) : v \perp h^0 \text{ a.e. in } \Omega,\ J^0_{h^0}[\phi, v] = M^0(h^0) = M^0 \big\}. \]

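Theorem 4.5. Assume (4.4) holds and $(\psi_\tau, n_\tau)$ are minimizers of $E$ in $\mathcal{V}(\Omega)$. We have $E[\psi_\tau, n_\tau] \le M^0 + o(1)$. For any sequence $\tau \to 0^+$, there exists a subsequence $\tau_j \to 0^+$ such that one of the following two cases happens.

Case 1. There exist a unit vector $n_0$ satisfying $M^0(n_0) = M^0$, and a minimizer $(\psi_0, u_0) \in \mathcal{M}^0(n_0)$, such that the following expansions hold:
\[ \psi_{\tau_j} = (\psi_0 + z_{\tau_j})\, e^{ia\left( \frac{n_0 \cdot x}{\tau_j} + \chi_{\tau_j} \right)}, \qquad n_{\tau_j} = n_0 + \tau_j (u_0 + w_{\tau_j} + \nabla\chi_{\tau_j}), \tag{4.9} \]
where $w_{\tau_j} \to 0$ in $W^{1,2}(\Omega, \mathbb{R}^3)$, $z_{\tau_j} \to 0$ in $W^{1,2}(\Omega, \mathbb{C})$, and $\Delta\chi_{\tau_j} \to 0$ in $L^2(\Omega)$. Moreover, $E[\psi_{\tau_j}, n_{\tau_j}] = M^0 + o(1)$.

Case 2. There exists a unit vector $h$ such that $n_{\tau_j} = h + \tau_j v_{\tau_j} + \tau_j \rho_{\tau_j} \nabla\xi_{\tau_j}$, where $v_{\tau_j} \to v_0$ strongly in $W^{1,2}(\Omega, \mathbb{R}^3)$, $1 \ll \rho_{\tau_j} \ll \frac{1}{\tau_j}$, $\|\nabla\xi_{\tau_j}\|_{L^2(\Omega)} = 1$, $\xi_{\tau_j} \to \xi_0$ weakly in $W^{2,2}_{loc}(\Omega)$, $\Delta\xi_0 = 0$ and $h \cdot \nabla\xi_0 = 0$ in $\Omega$.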

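Proof. We first prove an upper bound of the energy. Let $h^0$ be a unit vector satisfying (4.8) and $(\phi^0, u_0) \in \mathcal{M}^0(h^0)$. Let us fix a large $m > 0$ and define
\[ u_m(x) = \begin{cases} u_0(x) & \text{if } |u_0(x)| \le m, \\[2pt] m\, \dfrac{u_0(x)}{|u_0(x)|} & \text{if } |u_0(x)| > m. \end{cases} \]
Then $u_m \in V(\Omega, \mathbb{R}^3)$ and $u_m \cdot h^0 = 0$ a.e. in $\Omega$. For $0 < \tau < 1/\sqrt{m}$, choose test fields
\[ \psi^\tau = e^{\frac{ia}{\tau} h^0 \cdot x} \phi^0, \qquad n_{m\tau} = (1 - b_{m\tau}\tau^2)\, h^0 + \tau u_m, \]
where
\[ b_{m\tau}(x) = \frac{|u_m(x)|^2}{1 + \sqrt{1 - \tau^2 |u_m(x)|^2}}. \]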


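Then $|n_{m\tau}(x)| = 1$ a.e. in $\Omega$ and
\[ E[\psi^\tau, n_{m\tau}] = E[\phi^0, n_{m\tau} - h^0] = \int_\Omega \Big\{ |\nabla_{au_m}\phi^0|^2 - \kappa^2|\phi^0|^2 + \frac{\kappa^2}{2}|\phi^0|^4 + \mu|\mathrm{div}\, u_m|^2 + \rho|\mathrm{curl}\, u_m + h^0|^2 \Big\}\, dx + o(1). \]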
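The unit-length claim for the test field can be checked by hand. The sketch below assumes, as in the construction, that $|h^0| = 1$ and $u_m \cdot h^0 = 0$, and additionally that $\tau |u_m| \le 1$ pointwise so that the square root is real. The closed form used for $b_{m\tau}$, namely $|u_m|^2 / (1 + \sqrt{1 - \tau^2|u_m|^2})$, is the one forced by the unit-length requirement.

```latex
% Unit length of n_{m\tau} = (1 - b_{m\tau}\tau^2) h^0 + \tau u_m,
% with b_{m\tau} = |u_m|^2 / (1 + \sqrt{1 - \tau^2 |u_m|^2}).
% Abbreviate s = \sqrt{1 - \tau^2 |u_m|^2}, so that 1 - \tau^2 |u_m|^2 = s^2.
\begin{align*}
1 - b_{m\tau}\tau^2
  &= \frac{1 + s - \tau^2 |u_m|^2}{1 + s}
   = \frac{s^2 + s}{1 + s}
   = s, \\
|n_{m\tau}|^2
  &= (1 - b_{m\tau}\tau^2)^2\,|h^0|^2 + \tau^2 |u_m|^2
   = \bigl(1 - \tau^2 |u_m|^2\bigr) + \tau^2 |u_m|^2
   = 1 .
\end{align*}
% The cross term 2\tau (1 - b_{m\tau}\tau^2)\, h^0 \cdot u_m vanishes by orthogonality.
```

Since $b_{m\tau} \le |u_m|^2 \le m^2$, the correction $b_{m\tau}\tau^2 h^0$ is of order $\tau^2$ for fixed $m$, so it does not affect the leading terms of the energy.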

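Therefore, for any sequence of minimizers $(\psi_\tau, n_\tau)$ we have
\[ \limsup_{\tau \to 0^+} E[\psi_\tau, n_\tau] \le \int_\Omega \Big\{ |\nabla_{au_m}\phi^0|^2 - \kappa^2|\phi^0|^2 + \frac{\kappa^2}{2}|\phi^0|^4 + \mu|\mathrm{div}\, u_m|^2 + \rho|\mathrm{curl}\, u_m + h^0|^2 \Big\}\, dx. \]
Then we let $m \to \infty$ and find
\[ \limsup_{\tau \to 0^+} E[\psi_\tau, n_\tau] \le M^0. \tag{4.10} \]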

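For the given sequence of minimizers $(\psi_\tau, n_\tau)$, as in the proof of Theorem 3.6, we use (4.10) to show that, after passing to a subsequence, there exists a unit vector $n_0$ such that, as $\tau \to 0^+$, $n_\tau \to n_0$ weakly in $W^{1,2}_{loc}(\Omega, \mathbb{R}^3)$ and strongly in $L^p(\Omega)$ for $1 \le p < +\infty$. Let
\[ u_\tau = \frac{1}{\tau}(n_\tau - n_0) = v_\tau + \nabla\zeta_\tau \]
be the canonical decomposition as defined in Lemma 3.1, and let $\varphi_\tau = e^{-i\left( \frac{a}{\tau} n_0 \cdot x + a\zeta_\tau \right)} \psi_\tau$. From (4.10), we have
\[ \int_\Omega \Big\{ |\nabla_{av_\tau}\varphi_\tau|^2 - \kappa^2|\varphi_\tau|^2 + \frac{\kappa^2}{2}|\varphi_\tau|^4 + \mu|\Delta\zeta_\tau|^2 + \rho|\mathrm{curl}\, v_\tau + n_0 + \tau(v_\tau + \nabla\zeta_\tau)|^2 \Big\}\, dx \le M^0 + o(1). \]
Since $\tau(v_\tau + \nabla\zeta_\tau) \to 0$ strongly in $L^2(\Omega)$, we find
\[ \int_\Omega \Big\{ |\nabla_{av_\tau}\varphi_\tau|^2 - \kappa^2|\varphi_\tau|^2 + \frac{\kappa^2}{2}|\varphi_\tau|^4 + \mu|\Delta\zeta_\tau|^2 + \rho|\mathrm{curl}\, v_\tau + n_0|^2 \Big\}\, dx \le M^0 + o(1). \tag{4.11} \]
Now we have two cases.

Case 1. $\|\nabla\zeta_\tau\|_{L^2(\Omega)}$ is bounded as $\tau \to 0^+$. Using (4.11) and applying the argument in the proof of Lemma 3.2, we conclude that $\{v_\tau\}$ is bounded in $W^{1,2}(\Omega, \mathbb{R}^3)$ and $\{\zeta_\tau\}$ is bounded in $W^{2,2}_{loc}(\Omega)$. So we can pass to a subsequence again and assume that
\[ v_\tau \to v_0 \quad \text{weakly in } W^{1,2}(\Omega, \mathbb{R}^3) \text{ and strongly in } L^4(\Omega), \]
\[ \zeta_\tau \to \zeta_0 \quad \text{weakly in } W^{2,2}_{loc}(\Omega) \text{ and strongly in } W^{1,2}_{loc}(\Omega), \]
\[ \varphi_\tau \to \varphi_0 \quad \text{weakly in } W^{1,2}(\Omega, \mathbb{C}) \text{ and strongly in } L^4(\Omega), \]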


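where $\zeta_0 \in W^{2,2}_{loc}(\Omega)$ with $\Delta\zeta_0 \in L^2(\Omega)$. Let $u_0 = v_0 + \nabla\zeta_0$; then $u_0 \in V(\Omega, \mathbb{R}^3)$, and
\[ \int_\Omega \Big\{ |\nabla_{av_0}\varphi_0|^2 - \kappa^2|\varphi_0|^2 + \frac{\kappa^2}{2}|\varphi_0|^4 + \mu|\Delta\zeta_0|^2 + \rho|\mathrm{curl}\, v_0 + n_0|^2 \Big\}\, dx \le M^0. \tag{4.12} \]
Let $\psi_0 = e^{ia\zeta_0}\varphi_0$. From (4.12), $J^0_{n_0}[\psi_0, u_0] \le M^0$. Note that
\[ 1 = |n_\tau|^2 = |n_0 + \tau u_\tau|^2 = 1 + 2\tau\, u_\tau \cdot n_0 + \tau^2 |u_\tau|^2. \]
So $u_\tau(x) \cdot n_0 = -\frac{\tau}{2}|u_\tau(x)|^2$. Since $u_\tau \to u_0$ strongly in $L^2_{loc}(\Omega)$, we find $u_0(x) \cdot n_0 = 0$ a.e. in $\Omega$. Thus $(\psi_0, u_0)$ is admissible, and hence $J^0_{n_0}[\psi_0, u_0] \ge M^0(n_0) \ge M^0$. Thus the equalities hold, $n_0$ satisfies (4.8), and $(\psi_0, u_0) \in \mathcal{M}^0(n_0)$. It further implies that the left side of (4.11) converges to the left side of (4.12). Using the argument in the last part of the proof of Lemma 3.2, we conclude that every term on the left of (4.11) converges to the corresponding term in (4.12), and hence the involved integrands must converge strongly in $L^2(\Omega)$, namely
\[ \nabla_{av_\tau}\varphi_\tau \to \nabla_{av_0}\varphi_0, \qquad \Delta\zeta_\tau \to \Delta\zeta_0 \qquad \text{and} \qquad \mathrm{curl}\, v_\tau \to \mathrm{curl}\, v_0 \qquad \text{strongly in } L^2(\Omega). \]
Hence $v_\tau \to v_0$ strongly in $W^{1,2}(\Omega, \mathbb{R}^3)$, $\varphi_\tau \to \varphi_0$ strongly in $W^{1,2}(\Omega, \mathbb{C})$. Write
\[ w_\tau = v_\tau - v_0, \qquad z_\tau = \varphi_\tau - \varphi_0, \qquad \chi_\tau = \zeta_\tau - \zeta_0. \]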

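We see that (4.9) is true.

Case 2. $\rho_\tau = \|\nabla\zeta_\tau\|_{L^2(\Omega)} \to \infty$ as $\tau \to 0^+$. Let $\zeta_\tau = \rho_\tau \xi_\tau$, where $\|\nabla\xi_\tau\|_{L^2(\Omega)} = 1$. We have $\tau\rho_\tau \to 0$ since $n_\tau \to n_0$. As in Case 1 we can show that, after passing to a subsequence, $\xi_\tau \to \xi_0$ weakly in $W^{2,2}_{loc}(\Omega)$ and strongly in $W^{1,2}_{loc}(\Omega)$. From (4.11) we have $\Delta\xi_0 = 0$. Since $|n_\tau|^2 = 1$ we have
\[ n_0 \cdot \Big( \frac{1}{\rho_\tau} v_\tau + \nabla\xi_\tau \Big) + \frac{1}{2}\tau\rho_\tau \Big| \frac{1}{\rho_\tau} v_\tau + \nabla\xi_\tau \Big|^2 = 0. \]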
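The identity above is the unit-length constraint rewritten in the rescaled variables. A short derivation, using only $|n_\tau| = |n_0| = 1$, $n_\tau = n_0 + \tau u_\tau$ and $u_\tau = v_\tau + \rho_\tau \nabla\xi_\tau$ from the proof (the intermediate lines are added here for illustration):

```latex
\begin{align*}
1 = |n_0 + \tau u_\tau|^2 = 1 + 2\tau\, n_0\cdot u_\tau + \tau^2 |u_\tau|^2
  &\;\Longrightarrow\; n_0\cdot u_\tau + \tfrac{1}{2}\,\tau\,|u_\tau|^2 = 0, \\
u_\tau = \rho_\tau\Bigl(\tfrac{1}{\rho_\tau}\, v_\tau + \nabla\xi_\tau\Bigr)
  &\;\Longrightarrow\;
  n_0\cdot\Bigl(\tfrac{1}{\rho_\tau}\, v_\tau + \nabla\xi_\tau\Bigr)
  + \tfrac{1}{2}\,\tau\rho_\tau\,\Bigl|\tfrac{1}{\rho_\tau}\, v_\tau + \nabla\xi_\tau\Bigr|^2 = 0 .
\end{align*}
% Second line: substitute u_\tau into the first identity and divide by \rho_\tau > 0.
```

Because $\tau\rho_\tau \to 0$ and $1/\rho_\tau \to 0$, the quadratic term and the $v_\tau$ term drop out in the limit, leaving the condition on $n_0 \cdot \nabla\xi_0$.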

Taking the limit we get $n_0 \cdot \nabla\xi_0 = 0$ in $\Omega$.

Remark 4.6. Define
\[ Q_*(\mu, \rho, \kappa) = \inf\{ a > 0 : M^0(\mu, \rho, a, \kappa) = 0 \}, \qquad Q^*(\mu, \rho, \kappa) = \sup\{ a > 0 : M^0(\mu, \rho, a, \kappa) \ne 0 \}. \]
Under the condition of Theorem 4.5, let $(\psi_\tau, n_\tau)$ be a subsequence of minimizers having expansions given in (4.9). If $0 < a < Q_*(\mu, \rho, \kappa)$, then $\psi_0 \not\equiv 0$, and $(\psi_\tau, n_\tau)$ are non-trivial for all $\tau$ small; hence the liquid crystal remains in a smectic phase. If $a \ge Q^*(\mu, \rho, \kappa)$, then $\psi_0 = 0$, $u_0 = F_h$ for some $h$, and $\psi_\tau \to 0$ in $L^2(\Omega)$; the liquid crystal approaches a nematic state.

Next we consider the case where the wave number is far below the critical number $Q_{c3}$:
\[ \tau \to 0^+, \quad K_1 = \frac{\mu}{\tau}, \quad K_2 = \frac{\rho}{\tau}, \quad q = \frac{a}{\sqrt{\tau}}, \quad \kappa > 0 \text{ is fixed}, \tag{4.13} \]


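where $\mu$, $\rho$ and $a$ are fixed positive constants. Now we need the following functional associated with a unit vector $h$:
\[ J_h^{(1)}[\phi, v] = \int_\Omega \big\{ a^2 |\nabla\phi - v|^2 + \mu|\mathrm{div}\, v|^2 + \rho|\mathrm{curl}\, v + h|^2 \big\}\, dx, \]
\[ M^{(1)}(h) = \inf\big\{ J_h^{(1)}[\phi, v] : (\phi, v) \in W^{1,2}(\Omega) \times V(\Omega, \mathbb{R}^3),\ v(x) \perp h \text{ a.e. in } \Omega \big\}, \]
\[ M^{(1)} \equiv M^{(1)}(\mu, \rho, a) = \inf_{h \in S^2} M^{(1)}(h). \]
As in Lemma 4.4, we can show that, for every $h \in S^2$, $M^{(1)}(h)$ is achieved, and there exists $h^{(1)} \in S^2$ such that
\[ M^{(1)}(h^{(1)}) = M^{(1)}. \tag{4.14} \]
Let
\[ \mathcal{M}^{(1)}(h^{(1)}) = \Big\{ (\phi, u) \in W^{1,2}(\Omega) \times V(\Omega, \mathbb{R}^3) : u(x) \perp h^{(1)} \text{ a.e. in } \Omega,\ \int_\Omega \phi\, dx = 0,\ J^{(1)}_{h^{(1)}}[\phi, u] = M^{(1)}(h^{(1)}) = M^{(1)} \Big\}. \]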

Using the argument in the proof of Theorem 4.5, we can prove the following conclusion.

Theorem 4.7. Assume (4.13) holds and $(\psi_\tau, n_\tau)$ are minimizers of $E$ in $\mathcal{V}(\Omega)$. We have
\[ E[\psi_\tau, n_\tau] \le -\frac{\kappa^2}{2}|\Omega| + M^{(1)}\tau + o(\tau). \]
For any sequence $\tau \to 0^+$, there exists a subsequence $\tau_j \to 0^+$ such that one of the following two cases happens.

Case 1. There exist a unit vector $n_0$ satisfying $M^{(1)}(n_0) = M^{(1)}$, and a minimizer $(\phi_0, u_0) \in \mathcal{M}^{(1)}$, such that the following expansions hold:

√

ψτj = 1 + ia τj φ0 + ατj + zτj e nτj = n0 + τj u0 + wτj + ∇χτj ,

n ·x √ ia c+ √0τ + τj (ζ0 +χτj ) j

, (4.15)

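where $\alpha_{\tau_j}$ is a complex number, $\zeta_0 \in W^{2,2}_{loc}(\Omega)$, $w_{\tau_j} \to 0$ in $W^{1,2}(\Omega, \mathbb{R}^3)$, $z_{\tau_j} \to 0$ in $W^{1,2}(\Omega, \mathbb{C})$, and $\Delta\chi_{\tau_j} \to 0$ in $L^2(\Omega)$. Moreover,
\[ E[\psi_{\tau_j}, n_{\tau_j}] = -\frac{\kappa^2}{2}|\Omega| + M^{(1)}\tau_j + o(\tau_j). \]
Case 2. Same as Case 2 in Theorem 4.5.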

Applying the argument in the proofs of Theorems 4.1 and 4.5, we get the following conclusion:

Theorem 4.8. Assume that
\[ \tau \to 0^+, \quad K_1 = \frac{\mu}{\tau}, \quad K_2 = \frac{\rho}{\tau}, \quad q > 0 \text{ and } \kappa > 0 \text{ are fixed}, \tag{4.16} \]


where µ and ρ are fixed positive numbers. Let (ψτ, nτ) be the minimizers of E in 𝒱(Ω). Then we have

E[ψτ, nτ] = −(κ²/2)|Ω| + q²τ²λ∗(Ω)|Ω| + o(τ²).

For any sequence τ → 0+, there exists a subsequence τj → 0+ and a unit vector n0 satisfying λ(n0) = λ∗(Ω), such that the following expansions hold:

iq c+n ·x+τ χ

0 j τj ψτj = 1 + iqτj wn− + ατj + zτj e , 0

nτj = n0 + τj Fn− + wτj + ∇χτj ,

(4.17)

0

where Fn− is the vector field satisfying (2.2) and wn− is the solution of (2.5) for h = n0− , 0

0

$w_{\tau_j} \to 0$ in $W^{1,2}(\Omega, \mathbb{R}^3)$, $z_{\tau_j} \to 0$ in $W^{1,2}(\Omega, \mathbb{C})$, $\Delta\chi_{\tau_j} \to 0$ in $L^2(\Omega)$, $c$ is a real number, $\alpha_{\tau_j}$ is a complex number, and $\tau_j \alpha_{\tau_j} \to 0$.

5. Minimizers under Dirichlet Condition for Director Fields

In this section we consider the variational problem for the functional $E$ under the Dirichlet boundary condition for director fields. Throughout this section we assume that:
\[ \Omega \text{ is a bounded, smooth and simply-connected domain in } \mathbb{R}^3, \text{ and } u_0 \text{ is a smooth unit vector field on } \partial\Omega. \tag{5.1} \]

The parameter $\tau$ is not important now. Thus in this section we assume $\tau = 0$ and denote the functional by $E_0$ instead of $E$:
\[ E_0[\psi, n] = \int_\Omega \Big\{ |\nabla_{qn}\psi|^2 - \kappa^2|\psi|^2 + \frac{\kappa^2}{2}|\psi|^4 + K_1|\mathrm{div}\, n|^2 + K_2|\mathrm{curl}\, n|^2 \Big\}\, dx. \tag{5.2} \]
Recall that, although the Oseen-Frank energy functional with general elastic coefficients may not be coercive in $W^{1,2}(\Omega, S^2)$, the difficulty can be overcome by using the idea of Hardt-Kinderlehrer-Lin [HKL] to consider an equivalent functional which is coercive in $W^{1,2}(\Omega, \mathbb{R}^3)$. Hence the discussion in this section can be extended to the Landau-de Gennes functional with general elastic coefficients. For the given field $u_0 : \partial\Omega \to S^2$, let
\[ W^{1,2}(\Omega, S^2, u_0) = \{ u \in W^{1,2}(\Omega, S^2) : u = u_0 \text{ on } \partial\Omega \}, \]
\[ \mathcal{W}(\Omega, u_0) = W^{1,2}(\Omega, \mathbb{C}) \times W^{1,2}(\Omega, S^2, u_0). \]
We consider the following variational problem:
\[ C_d(K_1, K_2, \kappa, q) = \inf_{(\psi, n) \in \mathcal{W}(\Omega, u_0)} E_0[\psi, n]. \tag{5.3} \]

Using the inequality (3.3), or modifying the proof of Theorem 1.5 in [HKL], we can show that, under the condition (5.1), for any positive numbers K1 , K2 and κ and for any


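real number $q$, the minimizers of the variational problem (5.3) exist, and they satisfy the following Euler equations:
\[ \begin{cases} -\nabla_{qn}^2 \psi = \kappa^2 (1 - |\psi|^2)\psi, \\ -K_1 \nabla(\mathrm{div}\, n) + K_2\, \mathrm{curl}^2\, n - q\, \Im(\bar\psi \nabla_{qn}\psi) = \lambda(x)\, n & \text{in } \Omega, \\ \nabla_{qn}\psi \cdot \nu = 0, \quad n = u_0 & \text{on } \partial\Omega. \end{cases} \tag{5.4} \]
Here $\lambda(x)$ is the Lagrangian multiplier and it depends on $x$ as well as the unknown field $n$. To understand the behavior of the minimizers, we introduce
\[ F_0[n] = \int_\Omega \big\{ K_1|\mathrm{div}\, n|^2 + K_2|\mathrm{curl}\, n|^2 \big\}\, dx, \qquad N(K_1, K_2, u_0) = \inf_{u \in W^{1,2}(\Omega, S^2, u_0)} F_0[u]. \]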

Using the observation above we can show that $N(K_1, K_2, u_0)$ is achieved. Let $n \in W^{1,2}(\Omega, S^2, u_0)$ be such a minimizer; then $(0, n)$ is a trivial critical point of $E_0$, which corresponds to a nematic state. We wish to know if the (global) minimizers of $E_0$ on $\mathcal{W}(\Omega, u_0)$ are non-trivial. We also want to know the behavior of the minimizers when $K_1$ or $K_2$ is large. If $K_1$, $K_2$ tend to infinity at the same speed (corresponding to the nematic-smectic-C transition), the answer is simple, and we give the following result without proof. Recall the functional $G_A[\psi]$ defined in (4.2).

Lemma 5.1. Let $(\psi_j, n_j) \in \mathcal{W}(\Omega, u_0)$ be a minimizer of $E_0$ for $\kappa > 0$, $q \in \mathbb{R}$, $K_1 = K_1^{(j)}$ and $K_2 = K_2^{(j)}$, where
\[ K_1^{(j)} \to +\infty, \qquad K_2^{(j)} \to +\infty, \qquad \frac{K_1^{(j)}}{K_2^{(j)}} \to a > 0 \quad \text{as } j \to \infty. \]

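There exists a subsequence $j_l \to \infty$ such that $(\psi_{j_l}, n_{j_l}) \to (\psi_0, n_0)$ strongly in $\mathcal{W}(\Omega, u_0)$, where $n_0$ is a minimizer of the functional $\int_\Omega \{ a|\mathrm{div}\, n|^2 + |\mathrm{curl}\, n|^2 \}\, dx$ on $W^{1,2}(\Omega, S^2, u_0)$, and $\psi_0$ is a minimizer of the functional $G_{qn_0}$ on $W^{1,2}(\Omega, \mathbb{C})$.

In contrast, if $K_1$ and $K_2$ have different scales, for example, one of them diverges and the other remains bounded, the problem is non-trivial, and we have only partially understood it. Before we present our results, let us begin with two auxiliary problems. We define
\[ \mathcal{G}(u_0) \equiv \mathcal{G}(u_0, \Omega) = \{ \phi \in W^{2,2}(\Omega) : |\nabla\phi| = 1 \text{ a.e. in } \Omega,\ \nabla\phi = u_0 \text{ on } \partial\Omega \}. \tag{5.5} \]
If $\mathcal{G}(u_0) \ne \emptyset$, we set
\[ G(u_0) \equiv G(u_0, \Omega) = \inf_{\phi \in \mathcal{G}(u_0)} \int_\Omega |\Delta\phi|^2\, dx. \tag{5.6} \]
Lemma 5.2. Assume $\mathcal{G}(u_0) \ne \emptyset$.
(i) $G(u_0)$ is achieved in $\mathcal{G}(u_0)$.
(ii) $\lim_{K_2 \to +\infty} N(K_1, K_2, u_0) = K_1 G(u_0)$.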

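(iii) Let $u_j$ be a minimizer of $N(K_1, K_2, u_0)$ for $K_2 = K_2^{(j)}$, where $K_2^{(j)} \to +\infty$. Then there exists a subsequence $\{u_{j_l}\}$ and $\varphi \in \mathcal{G}_0(u_0)$ which is a minimizer of $G(u_0)$, such that $u_{j_l} \to \nabla\varphi$ strongly in $W^{1,2}(\Omega, \mathbb{R}^3)$.

Proof. We first prove (i). Choose $\phi_0 \in \mathcal{G}(u_0)$. Without loss of generality we may assume that $\int_{\partial\Omega} \phi_0\, dS = 0$. Let
\[ \mathcal{G}_0(u_0) = \Big\{ \phi \in W^{2,2}(\Omega) : |\nabla\phi| = 1 \text{ a.e. in } \Omega,\ \phi = \phi_0 \text{ on } \partial\Omega,\ \frac{\partial\phi}{\partial\nu} = \frac{\partial\phi_0}{\partial\nu} \text{ on } \partial\Omega \Big\}. \]
We have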

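\[ G(u_0) = \inf_{\phi \in \mathcal{G}_0(u_0)} \int_\Omega |\Delta\phi|^2\, dx. \]
Let $\{\phi_n\} \subset \mathcal{G}_0(u_0)$ be a minimizing sequence such that $\lim_{n \to \infty} \|\Delta\phi_n\|^2_{L^2(\Omega)} = G(u_0)$. We claim that $\{\phi_n\}$ is bounded in $W^{2,2}(\Omega)$. In fact, since $\|\nabla\phi_n\|_{L^2(\Omega)} = \sqrt{|\Omega|}$ and $\phi_n = \phi_0$ on $\partial\Omega$, we see that $\|\phi_n\|_{L^2(\Omega)}$ is bounded. Applying (3.3) to $\nabla\phi_n$ we conclude that
\[ \|\phi_n\|_{W^{2,2}(\Omega)} \le C \big\{ \|\Delta\phi_n\|_{L^2(\Omega)} + \|\phi_n\|_{L^2(\Omega)} + \|\phi_0\|_{W^{2,2}(\Omega)} \big\} \le C'. \]
Now we pass to a subsequence and assume that $\phi_n \to \varphi$ weakly in $W^{2,2}(\Omega)$ and strongly in $W^{1,2}(\Omega)$ as $n \to \infty$. Thus $|\nabla\varphi| = 1$ a.e. in $\Omega$, and
\[ \int_\Omega |\Delta\varphi|^2\, dx \le \liminf_{n \to \infty} \int_\Omega |\Delta\phi_n|^2\, dx = G(u_0). \tag{5.7} \]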

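Using the trace theorem we see that $\varphi = \phi_0$ and $\frac{\partial\varphi}{\partial\nu} = \frac{\partial\phi_0}{\partial\nu}$ on $\partial\Omega$. Hence $\varphi \in \mathcal{G}(u_0)$, and thus $\int_\Omega |\Delta\varphi|^2\, dx \ge G(u_0)$. Combining this with (5.7) we conclude that the equality holds and $\varphi$ achieves $G(u_0)$.

Next we prove (ii) and (iii). Let $\varphi$ be a minimizer of $G(u_0)$. We take $u = \nabla\varphi$ as a test function to find $N(K_1, K_2, u_0) \le K_1 G(u_0)$. Let $u_j \in W^{1,2}(\Omega, S^2, u_0)$ be a minimizer of $F_0$ for $K_2 = K_2^{(j)}$. Then
\[ \int_\Omega \big\{ K_1|\mathrm{div}\, u_j|^2 + K_2^{(j)}|\mathrm{curl}\, u_j|^2 \big\}\, dx \le K_1 G(u_0), \]
so
\[ \int_\Omega |\mathrm{div}\, u_j|^2\, dx \le G(u_0), \qquad \int_\Omega |\mathrm{curl}\, u_j|^2\, dx \le \frac{K_1}{K_2^{(j)}} G(u_0) \to 0. \]
Using (3.3) we see that $\{u_j\}$ is bounded in $W^{1,2}(\Omega, \mathbb{R}^3)$. After passing to a subsequence, we may assume that $u_j \to \tilde u$ weakly in $W^{1,2}(\Omega, \mathbb{R}^3)$ and strongly in $L^2(\Omega)$ as $j \to \infty$. Hence $|\tilde u(x)| = 1$ and $\mathrm{curl}\, \tilde u(x) = 0$ a.e. in $\Omega$, $\tilde u = u_0$ on $\partial\Omega$, and $\int_\Omega |\mathrm{div}\, \tilde u|^2\, dx \le G(u_0)$. On the other hand, since $\Omega$ is simply-connected, there exists $\tilde\phi \in W^{2,2}(\Omega)$ such that $\tilde u = \nabla\tilde\phi$. Then $\tilde\phi \in \mathcal{G}_0(u_0)$. So we have
\[ G(u_0) \le \int_\Omega |\Delta\tilde\phi|^2\, dx = \int_\Omega |\mathrm{div}\, \tilde u|^2\, dx \le G(u_0). \]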


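Thus $\tilde\phi$ achieves $G(u_0)$. Since
\[ \int_\Omega |\mathrm{div}\, \tilde u|^2\, dx \le \liminf_{j \to \infty} \int_\Omega |\mathrm{div}\, u_j|^2\, dx \le \limsup_{j \to \infty} \int_\Omega |\mathrm{div}\, u_j|^2\, dx \le G(u_0) = \int_\Omega |\mathrm{div}\, \tilde u|^2\, dx, \]
we find
\[ \lim_{j \to \infty} \int_\Omega |\mathrm{div}\, u_j|^2\, dx = \int_\Omega |\mathrm{div}\, \tilde u|^2\, dx. \]
Thus $\mathrm{div}\,(u_j - \tilde u) \to 0$ in $L^2(\Omega)$. Recall that $\mathrm{curl}\,(u_j - \tilde u) \to 0$ and $u_j - \tilde u \to 0$ in $L^2(\Omega)$. Using (3.3) for $u_j - \tilde u$ we find $u_j \to \tilde u = \nabla\tilde\phi$ strongly in $W^{1,2}(\Omega, \mathbb{R}^3)$ as $j \to \infty$, and
\[ K_1 G(u_0) \ge N\big(K_1, K_2^{(j)}, u_0\big) \ge K_1 \int_\Omega |\mathrm{div}\, u_j|^2\, dx \to K_1 \int_\Omega |\mathrm{div}\, \tilde u|^2\, dx = K_1 G(u_0). \]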

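Our next auxiliary problem is to minimize the curl functional. Define
\[ \mathcal{R}(u_0) \equiv \mathcal{R}(u_0, \Omega) = \{ u \in W^{1,2}(\Omega, S^2, u_0) : \mathrm{div}\, u = 0 \text{ a.e. in } \Omega \}. \tag{5.8} \]
If $\mathcal{R}(u_0) \ne \emptyset$, we define
\[ R(u_0) \equiv R(u_0, \Omega) = \inf_{u \in \mathcal{R}(u_0)} \int_\Omega |\mathrm{curl}\, u|^2\, dx, \tag{5.9} \]
\[ d(K_2, \kappa, q, u_0) = \inf_{\phi \in W^{1,2}(\Omega, \mathbb{C}),\ u \in \mathcal{R}(u_0)} \int_\Omega \Big\{ |\nabla_{qu}\phi|^2 - \kappa^2|\phi|^2 + \frac{\kappa^2}{2}|\phi|^4 + K_2|\mathrm{curl}\, u|^2 \Big\}\, dx. \tag{5.10} \]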

Similar to Lemma 5.2 we have

Lemma 5.3. Assume $\mathcal{R}(u_0) \ne \emptyset$.
(i) $R(u_0)$ is achieved.
(ii) $\lim_{K_1 \to +\infty} N(K_1, K_2, u_0) = K_2 R(u_0)$.
(iii) Let $u_j$ be a minimizer of $N(K_1, K_2, u_0)$ for $K_1 = K_1^{(j)}$, where $K_1^{(j)} \to +\infty$. Then there exists a subsequence $\{u_{j_l}\}$ and $\tilde u \in \mathcal{R}(u_0)$ which is a minimizer of $R(u_0)$, such that $u_{j_l} \to \tilde u$ strongly in $W^{1,2}(\Omega, \mathbb{R}^3)$.

Now we can state our results on the behavior of the minimizers of $E_0$.

Theorem 5.4. Assume that condition (5.1) holds and $\tau = 0$.
(1) If $\mathcal{G}(u_0) \ne \emptyset$, we have the following conclusions:
(1i) The minimizers of $C_d(K_1, K_2, \kappa, q)$ are non-trivial if
\[ K_1 G(u_0) - N(K_1, K_2, u_0) < \frac{\kappa^2}{2}|\Omega|. \tag{5.11} \]

(1ii) Let $(\psi_j, n_j) \in \mathcal{W}(\Omega, u_0)$ be a minimizer of $C_d(K_1, K_2^{(j)}, \kappa, q)$ for $K_1 > 0$, $K_2 = K_2^{(j)} > 0$, $\kappa > 0$ and $q \in \mathbb{R}$, where $K_2^{(j)} \to +\infty$. Then, as $j \to \infty$,
\[ E_0[\psi_j, n_j] = K_1 G(u_0) - \frac{\kappa^2}{2}|\Omega| + o(1). \]
Moreover, there exist a subsequence $(\psi_{j_l}, n_{j_l})$ and a function $\varphi_0 \in \mathcal{G}(u_0)$ which is a minimizer of $G(u_0)$, such that $(\psi_{j_l}, n_{j_l}) \to (e^{iq\varphi_0}, \nabla\varphi_0)$ strongly in $\mathcal{W}(\Omega, u_0)$.
(2) If $\mathcal{R}(u_0) \ne \emptyset$, we have the following conclusions:
(2i) The minimizers of $C_d(K_1, K_2, \kappa, q)$ are non-trivial if
\[ d(K_2, \kappa, q, u_0) < N(K_1, K_2, u_0). \tag{5.12} \]
(2ii) Let $(\psi_j, n_j) \in \mathcal{W}(\Omega, u_0)$ be a minimizer of $C_d(K_1^{(j)}, K_2, \kappa, q)$ for $K_1 = K_1^{(j)} > 0$, $K_2 > 0$, $\kappa > 0$ and $q \in \mathbb{R}$, where $K_1^{(j)} \to +\infty$. Then, as $j \to \infty$,

\[ E_0[\psi_j, n_j] = d(K_2, \kappa, q, u_0) + o(1). \]
Moreover, there exist a subsequence $(\psi_{j_l}, n_{j_l})$ and $(\psi_0, n_0) \in W^{1,2}(\Omega, \mathbb{C}) \times \mathcal{R}(u_0)$ which is a minimizer of $d(K_2, \kappa, q, u_0)$, such that $(\psi_{j_l}, n_{j_l}) \to (\psi_0, n_0)$ strongly in $\mathcal{W}(\Omega, u_0)$.

Proof. (1) Assume $\mathcal{G}(u_0) \ne \emptyset$. To prove (1i) we look for an upper bound of the energy $E_0$. Let $\phi_0 \in \mathcal{G}(u_0)$ be a minimizer of $G(u_0)$. We choose $\psi = e^{iq\phi_0}$ and $n = \nabla\phi_0$ as test fields and find that
\[ C_d(K_1, K_2, \kappa, q) \le K_1 G(u_0) - \frac{\kappa^2}{2}|\Omega|. \tag{5.13} \]
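The bound (5.13) comes from evaluating $E_0$ directly on the test pair. The computation below is a sketch that assumes the covariant gradient is $\nabla_A\psi = \nabla\psi - iA\psi$ (the convention suggested by the gauge factors used in the proofs; it is not restated in this section) and uses $|\nabla\phi_0| = 1$, $\mathrm{curl}\,\nabla\phi_0 = 0$ and $\int_\Omega |\Delta\phi_0|^2\,dx = G(u_0)$.

```latex
% Test pair: \psi = e^{iq\phi_0}, n = \nabla\phi_0, with \phi_0 a minimizer of G(u_0).
% Assumed convention: \nabla_A \psi = \nabla\psi - iA\psi.
\begin{align*}
\nabla_{qn}\psi
  &= iq\,\nabla\phi_0\, e^{iq\phi_0} - iq\,\nabla\phi_0\, e^{iq\phi_0} = 0,
  \qquad |\psi| = 1, \\
E_0[\psi, n]
  &= \int_\Omega \Bigl\{ 0 - \kappa^2 + \frac{\kappa^2}{2}
       + K_1 |\Delta\phi_0|^2 + K_2 \cdot 0 \Bigr\}\,dx
   = K_1 G(u_0) - \frac{\kappa^2}{2}\,|\Omega| .
\end{align*}
```

Since $(\psi, n)$ belongs to $\mathcal{W}(\Omega, u_0)$, the infimum $C_d$ cannot exceed this value.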

On the other hand, if $u$ is a minimizer of $N(K_1, K_2, u_0)$, then $E_0[0, u] = N(K_1, K_2, u_0)$. Under the condition (5.11) we have $C_d(K_1, K_2, \kappa, q) < E_0[0, u]$. Hence the minimizers are non-trivial.

Now we prove (1ii). Let $(\psi_j, n_j) \in \mathcal{W}(\Omega, u_0)$ be a minimizer of $C_d(K_1, K_2^{(j)}, \kappa, q)$, where $K_2^{(j)} \to +\infty$. From (5.13) we find $E_0[\psi_j, n_j] \le K_1 G(u_0) - \frac{\kappa^2}{2}|\Omega|$. So $\mathrm{curl}\, n_j \to 0$ in $L^2(\Omega)$ and $\{\mathrm{div}\, n_j\}$ is bounded in $L^2(\Omega)$. Using (3.3), $\{n_j\}$ is bounded in $W^{1,2}(\Omega, \mathbb{R}^3)$. Passing to a subsequence we may assume that $n_j \to n_0$ weakly in $W^{1,2}(\Omega, \mathbb{R}^3)$ and strongly in $L^2(\Omega)$ as $j \to \infty$. Hence $|n_0| = 1$ a.e. in $\Omega$ and $n_0 = u_0$ a.e. on $\partial\Omega$. Moreover,
\[ \int_\Omega |\mathrm{curl}\, n_0|^2\, dx \le \liminf_{j \to \infty} \int_\Omega |\mathrm{curl}\, n_j|^2\, dx = 0. \]
Since $\Omega$ is simply-connected, there exists $\varphi_0 \in W^{2,2}(\Omega)$ such that $n_0 = \nabla\varphi_0$. Thus $\varphi_0 \in \mathcal{G}(u_0)$. So
\[ G(u_0) \le \int_\Omega |\Delta\varphi_0|^2\, dx = \int_\Omega |\mathrm{div}\, n_0|^2\, dx \le \liminf_{j \to \infty} \int_\Omega |\mathrm{div}\, n_j|^2\, dx. \tag{5.14} \]
On the other hand, from (5.13) we see that $\{\nabla_{qn_j}\psi_j\}$ is bounded in $L^2(\Omega)$ and $\{\psi_j\}$ is bounded in $L^4(\Omega)$. Thus $\{\psi_j\}$ is bounded in $W^{1,2}(\Omega, \mathbb{C})$. Passing to a subsequence


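again we may assume that $\psi_j \to \psi_0$ weakly in $W^{1,2}(\Omega, \mathbb{C})$ and strongly in $L^4(\Omega, \mathbb{C})$ as $j \to \infty$. From this and (5.14) we find
\[ \int_\Omega \Big\{ |\nabla_{qn_0}\psi_0|^2 + \frac{\kappa^2}{2}(|\psi_0|^2 - 1)^2 \Big\}\, dx + K_1 G(u_0) \le \liminf_{j \to \infty} E_0[\psi_j, n_j] + \frac{\kappa^2}{2}|\Omega| \le \limsup_{j \to \infty} E_0[\psi_j, n_j] + \frac{\kappa^2}{2}|\Omega| \le K_1 G(u_0), \]
which implies
\[ \lim_{j \to \infty} E_0[\psi_j, n_j] = K_1 G(u_0) - \frac{\kappa^2}{2}|\Omega|, \]
\[ \lim_{j \to \infty} \int_\Omega \Big\{ |\nabla_{qn_j}\psi_j|^2 + \frac{\kappa^2}{2}(|\psi_j|^2 - 1)^2 \Big\}\, dx = \int_\Omega \Big\{ |\nabla_{qn_0}\psi_0|^2 + \frac{\kappa^2}{2}(|\psi_0|^2 - 1)^2 \Big\}\, dx = 0, \]
\[ \lim_{j \to \infty} \int_\Omega |\mathrm{div}\, n_j|^2\, dx = \int_\Omega |\mathrm{div}\, n_0|^2\, dx = \int_\Omega |\Delta\varphi_0|^2\, dx = G(u_0), \]
\[ \lim_{j \to \infty} K_2^{(j)} \int_\Omega |\mathrm{curl}\, n_j|^2\, dx = 0. \tag{5.15} \]
Since $n_0 = \nabla\varphi_0$, from the second equality in (5.15) we find $\psi_j \to \psi_0 = e^{iq\varphi_0 + ic}$ strongly in $W^{1,2}(\Omega, \mathbb{C})$, where $c$ is a real constant. We may assume $c = 0$, otherwise replace $\varphi_0$ by $\varphi_0 + c$. From the last two equalities in (5.15) and using (3.3) we find that $n_j \to n_0 = \nabla\varphi_0$ strongly in $W^{1,2}(\Omega, \mathbb{R}^3)$, and $\varphi_0 \in \mathcal{G}(u_0)$ is a minimizer of $G(u_0)$.

(2) Assume $\mathcal{R}(u_0) \ne \emptyset$. We take a minimizer of $d(K_2, \kappa, q, u_0)$ as test fields for $E_0$ and find that
\[ C_d(K_1, K_2, \kappa, q) \le d(K_2, \kappa, q, u_0). \tag{5.16} \]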

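As in (1i) we see that, if $d(K_2, \kappa, q, u_0) < N(K_1, K_2, u_0)$, then the minimizers of $E_0$ are non-trivial. So (2i) is true.

Next we prove (2ii). Let $(\psi_j, n_j) \in \mathcal{W}(\Omega, u_0)$ be a minimizer of $C_d(K_1^{(j)}, K_2, \kappa, q)$, where $K_1^{(j)} \to +\infty$. From (5.16) we find
\[ E_0[\psi_j, n_j] \le d(K_2, \kappa, q, u_0). \tag{5.17} \]
Hence $\{\psi_j\}$ is bounded in $W^{1,2}(\Omega, \mathbb{C})$, $\{\mathrm{curl}\, n_j\}$ is bounded in $L^2(\Omega)$, and $\mathrm{div}\, n_j \to 0$ in $L^2(\Omega)$. From (3.3), $\{n_j\}$ is bounded in $W^{1,2}(\Omega, S^2)$. Passing to a subsequence we may assume that, as $j \to \infty$, $n_j \to n_0$ weakly in $W^{1,2}(\Omega)$ and strongly in $L^2(\Omega)$, and $\psi_j \to \psi_0$ weakly in $W^{1,2}(\Omega, \mathbb{C})$ and strongly in $L^4(\Omega, \mathbb{C})$. Then $|n_0| = 1$ a.e. in $\Omega$, $n_0 = u_0$ a.e. on $\partial\Omega$, and $\mathrm{div}\, n_0 = 0$ a.e. in $\Omega$. So $n_0 \in \mathcal{R}(u_0)$, and we have
\[ d(K_2, \kappa, q, u_0) \le \int_\Omega \Big\{ |\nabla_{qn_0}\psi_0|^2 - \kappa^2|\psi_0|^2 + \frac{\kappa^2}{2}|\psi_0|^4 + K_2|\mathrm{curl}\, n_0|^2 \Big\}\, dx \le \liminf_{j \to \infty} E_0[\psi_j, n_j] \le \limsup_{j \to \infty} E_0[\psi_j, n_j] \le d(K_2, \kappa, q, u_0). \]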


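From these inequalities we find that
\[ \lim_{j \to \infty} E_0[\psi_j, n_j] = d(K_2, \kappa, q, u_0), \]
\[ \lim_{j \to \infty} \Big\{ G_{qn_j}[\psi_j] + K_2 \int_\Omega |\mathrm{curl}\, n_j|^2\, dx \Big\} = G_{qn_0}[\psi_0] + K_2 \int_\Omega |\mathrm{curl}\, n_0|^2\, dx = d(K_2, \kappa, q, u_0), \]
\[ \lim_{j \to \infty} K_1^{(j)} \int_\Omega |\mathrm{div}\, n_j|^2\, dx = 0. \tag{5.18} \]
Hence $(\psi_0, n_0)$ achieves $d(K_2, \kappa, q, u_0)$. Since
\[ \liminf_{j \to \infty} G_{qn_j}[\psi_j] \ge G_{qn_0}[\psi_0], \qquad \liminf_{j \to \infty} \int_\Omega |\mathrm{curl}\, n_j|^2\, dx \ge \int_\Omega |\mathrm{curl}\, n_0|^2\, dx, \]
the second line in (5.18) further implies that
\[ \lim_{j \to \infty} G_{qn_j}[\psi_j] = G_{qn_0}[\psi_0], \qquad \lim_{j \to \infty} \int_\Omega |\mathrm{curl}\, n_j|^2\, dx = \int_\Omega |\mathrm{curl}\, n_0|^2\, dx. \tag{5.19} \]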

From the last equality in (5.18) and the second equality in (5.19) we find $\mathrm{div}\, n_j \to \mathrm{div}\, n_0$ and $\mathrm{curl}\, n_j \to \mathrm{curl}\, n_0$ strongly in $L^2(\Omega)$. Then we use (3.3) to find $n_j \to n_0$ strongly in $W^{1,2}(\Omega, \mathbb{R}^3)$. From the first equality of (5.19) we see that $\nabla_{qn_j}\psi_j \to \nabla_{qn_0}\psi_0$ strongly in $L^2(\Omega)$, and hence $\psi_j \to \psi_0$ strongly in $W^{1,2}(\Omega, \mathbb{C})$.

Remark 5.5. If $\mathcal{G}(u_0) \ne \emptyset$, then $N(K_1, K_2, u_0) \to K_1 G(u_0)$ as $K_2 \to +\infty$, see Lemma 5.2. Hence for any fixed $\kappa > 0$ and $q \in \mathbb{R}$, (5.11) holds for all large $K_2$. If $\mathcal{R}(u_0) \ne \emptyset$, and if
\[ \inf_{u \in \mathcal{R}^*(u_0)} \mu(qu) < \kappa^2, \tag{5.20} \]
where $\mathcal{R}^*(u_0) \equiv \mathcal{R}^*(u_0, \Omega)$ denotes the set of the minimizers of $R(u_0)$ and $\mu(qu)$ is the lowest eigenvalue of (2.1) for $A = qu$, then (5.12) holds for all large $K_1$. In fact, $N(K_1, K_2, u_0) \to K_2 R(u_0)$ as $K_1 \to +\infty$, see Lemma 5.3. Obviously
\[ d(K_2, \kappa, q, u_0) \le K_2 R(u_0) + \inf_{u \in \mathcal{R}^*(u_0)}\ \inf_{\phi \in W^{1,2}(\Omega, \mathbb{C})} G_{qu}[\phi]. \]
Under the condition (5.20), we have
\[ \inf_{u \in \mathcal{R}^*(u_0)}\ \inf_{\phi \in W^{1,2}(\Omega, \mathbb{C})} G_{qu}[\phi] < 0, \]
and hence (5.12) is true for large $K_1$.


6. Further Remarks

From Theorem 5.4 we see that, if either
\[ \mathcal{G}(u_0, \Omega) \ne \emptyset, \tag{6.1} \]
or if
\[ \mathcal{R}(u_0, \Omega) \ne \emptyset, \tag{6.2} \]
then we can easily obtain the asymptotic behavior of the minimizers. Conditions (6.1) and (6.2) are conditions of consistency between the boundary data $u_0$ and the domain $\Omega$. If $u_0$ and $\Omega$ do not have such consistency, the asymptotic behavior of the minimizers will be complicated.

Problem 1. Find conditions on $\Omega$ and $u_0$ such that (6.1) holds (see footnote 8).

A necessary condition for $\mathcal{G}(u_0)$ to be non-empty is that the 1-form associated with $u_0$ is exact, see [PQ] (Prop. 2.10). Let us assume this condition and set $\varphi_1 = \gamma_\nu u_0$,
\[ W^{2,2}(\Omega, \varphi_0, \varphi_1) = \Big\{ \phi \in W^{2,2}(\Omega) : \phi = \varphi_0 \text{ and } \frac{\partial\phi}{\partial\nu} = \varphi_1 \text{ on } \partial\Omega \Big\}. \]
One may use the following functional to detect whether $\mathcal{G}(u_0)$ is non-empty:

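\[ J_\varepsilon[\phi] = \int_\Omega \Big\{ |\Delta\phi|^2 + \frac{1}{\varepsilon^2}\big(1 - |\nabla\phi|^2\big)^2 \Big\}\, dx, \qquad c_g(\varepsilon) = \inf_{\phi \in W^{2,2}(\Omega, \varphi_0, \varphi_1)} J_\varepsilon[\phi]. \]
If $\mathcal{G}(u_0) \ne \emptyset$, then $c_g(\varepsilon) \to G(u_0)$ as $\varepsilon \to 0$. If minimizers exist for all small $\varepsilon$, then one can extract a subsequence which converges in the $W^{2,2}$ norm to a minimizer of $G(u_0)$.

Problem 2. Find conditions on $\Omega$ and $u_0$ such that (6.2) holds (see footnote 9).

Let us define
\[ W(\Omega, \mathrm{div}, u_0) = \{ u \in W^{1,2}(\Omega, \mathbb{R}^3) : \mathrm{div}\, u = 0 \text{ a.e. in } \Omega \text{ and } u = u_0 \text{ on } \partial\Omega \}. \]
One may use the following functional to detect whether $\mathcal{R}(u_0)$ is non-empty:
\[ J_\varepsilon^*[u] = \int_\Omega \Big\{ |\mathrm{curl}\, u|^2 + \frac{1}{\varepsilon^2}\big(1 - |u|^2\big)^2 \Big\}\, dx, \qquad c_r(\varepsilon) = \inf_{u \in W(\Omega, \mathrm{div}, u_0)} J_\varepsilon^*[u]. \]
If $\mathcal{R}(u_0) \ne \emptyset$, then $c_r(\varepsilon) \to R(u_0)$ as $\varepsilon \to 0$. If the minimizers exist for all small $\varepsilon$, then one can extract a subsequence which converges in the $W^{1,2}$ norm to a minimizer of $R(u_0)$.

Footnote 8. In some special cases (6.1) can be easily verified. For example, if $\Omega$ is a ball $B_R(0)$ and $u_0 = \pm\nu$, where $\nu$ is the unit outer normal on $\partial\Omega$, then $\phi_0(x) = \pm|x|$ is an element of $\mathcal{G}(u_0)$.

Footnote 9. In the 2-dimensional case, this problem is conjugate to Problem 1. In fact, for $u_0 = (u_{01}, u_{02})$, let $u_0^\perp = (-u_{02}, u_{01})$. Then $\mathcal{R}(u_0) \ne \emptyset$ if and only if $\mathcal{G}(u_0^\perp) \ne \emptyset$.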
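The example in footnote 8 can be verified by a direct computation. The sketch below takes $\Omega = B_R(0) \subset \mathbb{R}^3$ and $u_0 = \nu$; the number obtained is only an upper bound for $G(u_0)$, since $|x|$ is one admissible function and is not claimed here to be the minimizer.

```latex
% Check that \phi_0(x) = |x| is admissible for \Omega = B_R(0) \subset \mathbb{R}^3, u_0 = \nu.
\begin{align*}
\nabla\phi_0 &= \frac{x}{|x|}, \qquad |\nabla\phi_0| = 1 \ \ (x \neq 0), \qquad
  \nabla\phi_0\big|_{|x| = R} = \frac{x}{R} = \nu, \\
D^2\phi_0 &= \frac{1}{|x|}\Bigl(I - \frac{x \otimes x}{|x|^2}\Bigr), \qquad
  \Delta\phi_0 = \frac{2}{|x|}, \\
\int_{B_R(0)} |\Delta\phi_0|^2\,dx
  &= \int_0^R \frac{4}{r^2}\, 4\pi r^2\,dr = 16\pi R .
\end{align*}
% The second derivatives are O(1/|x|), which is square-integrable near the origin
% in three dimensions, so \phi_0 \in W^{2,2}(B_R(0)).
```

Hence $\mathcal{G}(u_0)$ is non-empty in this case and $G(u_0) \le 16\pi R$. The same computation with $-|x|$ covers $u_0 = -\nu$.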


Note that the variational problem for $J_\varepsilon$ can be written as follows: Minimize the functional
\[ \int_\Omega \Big\{ |\nabla u|^2 + \frac{1}{\varepsilon^2}\big(1 - |u|^2\big)^2 \Big\}\, dx, \qquad u = u_0 \text{ on } \partial\Omega \tag{6.3} \]
subject to the pointwise constraint
\[ \mathrm{curl}\, u = 0 \quad \text{in } \Omega. \tag{6.4} \]
The variational problem for $J_\varepsilon^*$, in turn, is to minimize the Ginzburg-Landau functional (6.3) subject to the pointwise constraint
\[ \mathrm{div}\, u = 0 \quad \text{in } \Omega. \tag{6.5} \]
Without the pointwise constraint, the variational problem for (6.3) has been studied by many authors, see for instance Bethuel-Brezis-Hélein [BBH], Lin [L2] and references therein. In contrast, the variational problem for (6.3) under the pointwise constraint (6.4) or (6.5) has not been well understood. In the 2-dimensional case, the variational problem for $J_\varepsilon$ was studied by Aviles-Giga in [AG1-2], Ambrosio-DeLellis-Mantegazza [ADM], DeSimone-Kohn-Müller-Otto [DKMO] and Jin-Kohn [JK]. To our knowledge, the 3-dimensional problem with constraint (6.4) or (6.5) has not been investigated yet.

Problem 3. Study the asymptotic behavior of the minimizers of $C_d(K_1, K_2, \kappa, q)$ when either $\mathcal{R}(u_0) = \emptyset$ and $K_1 \to +\infty$; or $\mathcal{G}(u_0) = \emptyset$ and $K_2 \to +\infty$.

If $\mathcal{R}(u_0) = \emptyset$, to understand the limiting behavior of the minimizers, one has to investigate the following variational problem:
\[ \mathcal{L}(\Omega, \mathrm{curl}, u_0) = \{ u : \mathrm{curl}\, u \in L^2(\Omega),\ |u(x)| = 1 \text{ a.e. in } \Omega,\ u = u_0 \text{ on } \partial\Omega \}, \qquad R_l(u_0) = \inf_{u \in \mathcal{L}(\Omega, \mathrm{curl}, u_0)} \int_\Omega |\mathrm{curl}\, u|^2\, dx. \tag{6.6} \]

Problem 4. Study the variational problem (6.6).

Problem 4 is linked to the following singular perturbation functional
\[ \int_\Omega \Big\{ \varepsilon|\nabla u|^2 + |\mathrm{curl}\, u|^2 + \frac{1}{\varepsilon}\big(|u|^2 - 1\big)^2 \Big\}\, dx. \]
When $R_l(u_0) > 0$, we have little knowledge about the variational problem (6.6), and we do not know whether minimizers exist except in a few special cases, see [PQ]. Some numerical computations in the two-dimensional case were obtained in [GLP], which show that the complexity of minimizers grows rapidly when the rotation number of $u_0$ increases.

Acknowledgements. The author would like to thank Professor F. H. Lin and Professor Y. Giga for many valuable discussions on the mathematical theory of liquid crystals, and thank the referee for many nice comments on the manuscript. Part of this work was reported in the Third Northeastern Symposium on Mathematical Analysis held in Sapporo, February 2002, and the author would like to thank the organizers for the invitation. This work was partially supported by National Natural Science Foundation of China, Science Foundation of the Ministry of Education of China, Zhejiang Provincial Natural Science Foundation of China, and NUS grant R-146-000-033-112.


References

[AG1] Aviles, P., Giga, Y.: A mathematical problem related to the physical theory of liquid crystal configurations. Proc. Centre Math. Anal. Austral. Nat. Univ. 12, 1–16 (1987)
[AG2] Aviles, P., Giga, Y.: On lower semicontinuity of a defect energy obtained by a singular limit of the Ginzburg-Landau type energy for gradient fields. Proc. Royal Soc. Edinburgh 129 A(1), 1–17 (1999)
[ADM] Ambrosio, L., DeLellis, C., Mantegazza, C.: Line energies for gradient vector fields in the plane. Calc. Var. 9, 327–355 (1999)
[BBH] Bethuel, F., Brezis, H., Hélein, F.: Ginzburg-Landau Vortices. Birkhäuser, Boston-Basel-Berlin, 1994
[BCLP] Bauman, P., Calderer, M., Liu, C., Phillips, D.: The phase transition between chiral nematic and smectic A∗ liquid crystals. Arch. Rational Mech. Anal. 165, 161–186 (2002)
[C] Calderer, M.C.: Studies of layering and chirality of smectic A∗ liquid crystals. Math. Comput. Modelling 34, 1273–1288 (2001)
[ChL] Chen, J.-H., Lubensky, T.: Landau-Ginzburg mean-field theory for the nematic to smectic-C and nematic to smectic-A phase transitions. Phys. Rev. A 14(3), 1202–1207 (1976)
[CHO] Chapman, S.J., Howison, S.D., Ockendon, J.R.: Macroscopic models for superconductivity. SIAM Rev. 34, 529–560 (1992)
[dG1] de Gennes, P.G.: Superconductivity of Metals and Alloys. New York: W.A. Benjamin, Inc., 1966
[dG2] de Gennes, P.G.: An analogy between superconductors and smectics A. Solid State Commun. 10, 753–756 (1972)
[dG3] de Gennes, P.G.: Mol. Cryst. Liq. Cryst. 21, 49 (1973)
[dGP] de Gennes, P.G., Prost, J.: The Physics of Liquid Crystals. Second edition, Oxford: Oxford Science Publications, 1993
[DGP] Du, Q., Gunzburger, M., Peterson, J.: Analysis and approximation of the Ginzburg-Landau model of superconductivity. SIAM Rev. 34, 45–81 (1992)
[DH] Dauge, M., Helffer, B.: Eigenvalues variation, I, Neumann problem for Sturm-Liouville operators. J. Differential Equations 104, 243–262 (1993)
[DKMO] DeSimone, A., Kohn, R., Müller, S., Otto, F.: A compactness result in the gradient theory of phase transitions. Proc. Roy. Soc. Edinburgh 131 A(4), 833–844 (2001)
[E] Ericksen, J.: General solutions in the hydrostatic theory of liquid crystals. Trans. Soc. Rheology 11(1), 5–14 (1967)
[GLP] Glowinski, R., Lin, P., Pan, X.-B.: An operator-splitting method for a liquid crystal model. Computer Physics Communications 152, 242–252 (2003)
[GP] Giorgi, T., Phillips, D.: The breakdown of superconductivity due to strong fields for the Ginzburg–Landau model. SIAM J. Math. Anal. 30, 341–359 (1999)
[GW1] Goodby, J., Waugh, M., et al.: Characterization of a new helical smectic liquid-crystal. Nature 337(6206), 449–452 (1989)
[GW2] Goodby, J., Waugh, M., et al.: A new molecular ordering in helical liquid-crystals. J. Amer. Chem. Soc. 111(21), 8119–8125 (1989)
[HKL] Hardt, R., Kinderlehrer, D., Lin, F.H.: Existence and partial regularity of static liquid crystal configurations. Commun. Math. Phys. 105, 547–570 (1986)
[HM] Helffer, B., Morame, A.: Magnetic bottles for the Neumann problem: curvature effects in the case of dimension 3, preprint
[JK] Jin, W., Kohn, R.: Singular perturbation and the energy of folds. J. Nonlinear Sci. 10, 355–390 (2000)
[L1] Lin, F.H.: Nonlinear theory of defects in nematic liquid crystals, phase transitions and flow phenomena. Commun. Pure Appl. Math. 42, 789–814 (1989)
[L2] Lin, F.H.: Solutions of Ginzburg-Landau equations and critical points of the renormalized energy. Ann. Inst. Henri Poincaré Anal. Non Linéaire 12, 599–622 (1995)
[LP1] Lu, K., Pan, X.B.: Gauge invariant eigenvalue problems in $\mathbb{R}^2$ and in $\mathbb{R}^2_+$. Trans. Amer. Math. Soc. 352, 1247–1276 (2000)
[LP2] Lu, K., Pan, X.B.: Estimates of the upper critical field for the Ginzburg-Landau equations of superconductivity. Physica D 127(1–2), 73–104 (1999)
[LP3] Lu, K., Pan, X.B.: Surface nucleation of superconductivity in 3-dimension. J. Differential Equations 168, 386–452 (2000)
[LR] Lubensky, T., Renn, S.: Twist-grain-boundary phases near the nematic–smectic-A–smectic-C point in liquid crystals. Phys. Rev. A 41(8), 4392–4401 (1990)
[Mc] McMillan, W.L.: Measurement of smectic-A-phase order-parameter fluctuations in the nematic phase of p-n-octyloxybenzylidene-p′-toluidine. Phys. Rev. A 7(5), 1673–1678 (1973)
[O] Ou, B.: Examinations on a three-dimensional differentiable vector field that equals its own curl, preprint
[P1] Pan, X.B.: Surface superconductivity in applied fields above $H_{C_2}$. Commun. Math. Phys. 228, 327–370 (2002)
[P2] Pan, X.B.: Surface superconductivity in 3-dimensions, preprint
[P3] Pan, X.B.: Superconductivity near critical temperature. J. Math. Phys. 44, 2639–2678 (2003)
[PQ] Pan, X.B., Qi, Y.: Asymptotics of minimizers of variational problems involving curl functional. J. Math. Phys. 41, 5033–5063 (2000)
[RL] Renn, S., Lubensky, T.: Abrikosov dislocation lattice in a model of the cholesteric-to-smectic-A transition. Phys. Rev. A 38(4), 2132–2147 (1988)
[T] Temam, R.: Navier-Stokes Equations, Theory and Numerical Analysis, 3rd ed. AMS Chelsea Publishing, Providence, 2000

Communicated by A. Kupiainen

Commun. Math. Phys. 239, 383–404 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0887-4


Constrained Wave Equations and Wave Maps

Jalal Shatah$^1$, Chongchun Zeng$^2$

$^1$ Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY 10012, USA. E-mail: [email protected]
$^2$ Department of Mathematics, University of Virginia, Charlottesville, VA 22904, USA. E-mail: [email protected]

Received: 23 July 2002 / Accepted: 3 February 2003
Published online: 10 July 2003 – © Springer-Verlag 2003

Abstract: In this paper we establish that wave maps can be obtained by a penalization method if the initial data is well prepared. When the data is not well prepared, we prove that the solution of the penalized equation converges weakly to the solution of the system of coupled equations obtained in [11] by a multi-scale formal analysis. In particular, the interaction between the rapid normal oscillations and the tangential motions creates a new term in the limit system whose well-posedness is proved by using the Nash-Moser Implicit Function Theorem.

1. Introduction

Wave maps are maps from Minkowski space $(\mathbb{R}^{n+1}, \eta)$ into an m-dimensional Riemannian manifold (M, g) that satisfy the covariant wave equation. If the target manifold $M \subset \mathbb{R}^{m+\ell}$ is isometrically embedded with the induced Riemannian structure $\langle\cdot,\cdot\rangle$, then for any curve $p(s) \in M \subset \mathbb{R}^{m+\ell}$ and any tangent vector $V(s) \in T_{p(s)}M$, the covariant derivative $D_sV(s)$ is given by

$$D_sV(s) = P(p(s))\frac{d}{ds}V(s) = \frac{d}{ds}V(s) + \sum_{a=1}^{\ell}\Big\langle V(s), \frac{d}{ds}n_a(s)\Big\rangle n_a(s),$$

where $P(p): \mathbb{R}^{m+\ell} \to T_pM$ denotes the orthogonal projection and $\{n_a(s)\}$ denotes an orthonormal basis of $N_pM$ at p(s), the orthogonal complement of $T_pM$ in $\mathbb{R}^{m+\ell}$. Thus

The first author is funded in part by NSF DMS 0203485. The second author is funded in part by NSF DMS 0101969.


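if we use $(x^0, x^1, \dots, x^n) = (t, x)$ as a coordinate system on $\mathbb{R}\times\mathbb{R}^n$ with the metric $(\eta_{\alpha\beta}) = \mathrm{diag}(-1, 1, \dots, 1)$ and consider a map $p: \mathbb{R}\times\mathbb{R}^n \to M \subset \mathbb{R}^{m+\ell}$, then $\partial_\alpha p$ is a tangent vector to M at p and the Cauchy problem for the wave equation for the map p is given by$^1$

$$-\eta^{\alpha\beta}D_\beta\partial_\alpha p = D_0\partial_0 p - D_i\partial^i p = \Box p + \langle\partial_\alpha p, \partial^\alpha n_a(p)\rangle\, n_a(p) = 0,$$
$$p(0,x) = p_0(x), \qquad \partial_t p(0,x) = p_1(x) \in T_{p_0(x)}M,$$

where $\Box = \partial_0^2 - \Delta$ is the linear wave operator. The wave map equation with a smooth potential $w: \mathbb{R}^{m+\ell} \to \mathbb{R}$ is given by

$$D_0\partial_0 p - D_i\partial^i p + P(p)w'(p) = 0, \tag{1.1}$$

which has a conserved energy

$$E_0(p, p_t) = \int \frac12|p_t|^2 + \frac12|\nabla p|^2 + w(p)\,dx, \tag{1.2}$$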

defined on the tangent bundle TM. The Cauchy problem for wave maps has been studied extensively. There is a considerable body of literature on this subject, and recently the local well-posedness in the critical norm $\dot H^{n/2}$ has been almost completely solved [12, 16, 15, 20–22]. In this paper we are interested in studying the problem of how to construct solutions to Eq. (1.1). A natural way of constructing solutions is by penalization: in a tubular neighborhood U of $M \subset \mathbb{R}^{m+\ell}$ let $G: U \to \mathbb{R}^+$ be a smooth function such that

• $M = \{u;\ G(u) = 0\}$,
• $\nabla G|_M = 0$,
• at any $p \in M$, $\nabla^2G(p)$ is uniformly positive definite when restricted to $N_pM$.

An example which satisfies these conditions is

$$G(u) = \frac12 d(u, M)^2, \quad \text{for } u \in U,$$

where d(u, M) is the distance from u to M. Given such a function G we consider the penalized problem

$$\Box u + w'(u) + \frac{1}{\varepsilon^2}G'(u) = 0, \qquad u \in \mathbb{R}^{m+\ell}, \tag{1.3}$$

which is a semi-linear wave equation with a conserved energy

$$E_\varepsilon(u, u_t) = \int \frac12|u_t|^2 + \frac12|\nabla u|^2 + w(u) + \frac{1}{\varepsilon^2}G(u)\,dx, \tag{1.4}$$

and thus is globally well-posed for finite energy initial data [15].

$^1$ We employ standard index notation where we use the metric to raise and lower indices, and sum over repeated indices. For the Lorentz metric we use Greek letters, while for the Euclidean metric we use Latin letters.
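The penalization idea can be tried out numerically in the simplest setting, where the solution does not depend on x (the ODE case). The sketch below is only an illustration under assumptions that are not in the paper: M is taken to be the unit circle in the plane, w = 0, G(u) = (|u| - 1)^2 / 2, the data are tangent to M, and the time integrator (velocity Verlet) and the function name `penalized_circle` are choices made here. The constrained motion for these data is p(t) = (cos t, sin t).

```python
import math

def penalized_circle(eps, T=1.0, steps_per_period=200):
    """Velocity-Verlet integration of u'' + G'(u)/eps**2 = 0 in the plane,
    with G(u) = (|u| - 1)**2 / 2, so that M is the unit circle.
    Data: u(0) = (1, 0) on M and u'(0) = (0, 1) tangent to M.
    Returns u(T) and the largest observed distance from M."""
    dt0 = 2.0 * math.pi * eps / steps_per_period  # resolve the fast normal period
    n_steps = max(1, math.ceil(T / dt0))
    dt = T / n_steps

    def accel(x, y):
        r = math.hypot(x, y)
        c = -(r - 1.0) / (eps * eps * r)  # -G'(u)/eps^2 = -(|u| - 1) u / (|u| eps^2)
        return c * x, c * y

    x, y, vx, vy = 1.0, 0.0, 0.0, 1.0
    ax, ay = accel(x, y)
    max_dist = 0.0
    for _ in range(n_steps):
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        max_dist = max(max_dist, abs(math.hypot(x, y) - 1.0))
    return (x, y), max_dist

for eps in (0.1, 0.05, 0.025):
    (x, y), dist = penalized_circle(eps)
    err = math.hypot(x - math.cos(1.0), y - math.sin(1.0))
    print(f"eps={eps}: max distance to M = {dist:.2e}, |u(1) - p(1)| = {err:.2e}")
```

With tangential data both printed quantities should shrink as eps decreases, which is the behavior the penalization method is expected to have in this toy setting.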


If we consider solutions to (1.3) with finite $E_\varepsilon$ data, then the penalization term $\frac{1}{\varepsilon^2}G(u)$ should constrain the solution to a neighborhood of M. In the limit, as ε → 0, we expect the solution u to converge to p, a solution of (1.1). This simple approach of constructing solutions to constrained Hamiltonian motion works for a variety of problems. When the solution is independent of x (i.e. the ODE case), the convergence was established by Rubin and Ungar in [13], with various generalizations obtained by Takens [19]. Moreover, Ebin [6, 7] proved similar results for ODE problems in infinite dimensional settings with tangential initial data. As an application, he obtained the convergence of slightly compressible barotropic fluids to Euler flows, which can be viewed as geodesic flows on the space of volume preserving diffeomorphisms. Similar convergence results under strong constraining forces were also considered for problems arising in quantum mechanics [4, 5, 9, 10, 23]. For the PDE problem, weak convergence of solutions of (1.3) to solutions of (1.1) was obtained for $M = S^k$ and all types of initial data [14]. Later this result was generalized to compact homogeneous spaces [8]. Moreover, some results similar to those given in Sect. 2 were obtained by the first author and Chechkin [3].

Here, in Theorem 1, we will prove that in one space dimension this intuitive idea on how to construct global solutions works for an arbitrary manifold, provided the initial data has finite energy and is tangent to M. In higher space dimensions, this result can be generalized to construct local solutions with smooth initial data. Such a result is optimal, for smooth solutions, due to the development of finite time singularities for wave maps in higher dimensions [14].

When the data is not tangential the convergence of u → p is less clear. In this case the normal part of the data introduces rapid oscillations in the system which do not converge to zero in a weak sense.
Indeed, for ODEs Rubin and Ungar proved that these rapid normal oscillations introduce a driving term in the equations of the tangential components. For the PDE problem Keller and Rubinstein [11], using a formal asymptotic expansion, were able to derive a set of equations that the limit should satisfy. These equations (3.2) are a coupled system of transport and wave equations. The well-posedness of the Cauchy problem of this system in Sobolev spaces is not obvious. Here, in Theorem 2, we will prove that the system derived by Keller and Rubinstein is well-posed, and in Theorem 3 we prove that solutions of the penalized equations converge to solutions of system (3.2). The well-posedness proof of Theorem 2 is based on the Nash-Moser implicit function theorem, and the convergence proof is based on energy methods.

Although all the results in this paper are stated in one space dimension, there are no conceptual difficulties in extending them to higher space dimensions. However, in more than one space dimension, the convergence to wave maps can only be local in time, since the global well-posedness of wave maps is known to be false in higher space dimensions.

We note that the constraining potential in the wave equation (1.3) penalizes the $L^2$ norm of the distance function to the manifold M. When the penalty depends on the $H^1$ norm of the distance, some of the characteristics of the constrained equation are different from those of the wave equation. The limit solutions with non-tangential data satisfy an equation, different from (3.2), in the spirit of Rubin and Ungar [13] and Takens [19]. This problem is studied in a separate paper [18].

2. Tangential Initial Data

Without loss of generality, assume $0 \in M \subset \mathbb{R}^{m+\ell}$ and w(0) = 0,


$x \in \mathbb{R}$, and that the solutions u of (1.3) and $p_*$ of Eq. (1.1) are compactly supported in space for every fixed time. The second assumption is possible by using finite speed of propagation.

Theorem 1. Let u(t, x) and $p_*(t, x)$ be solutions of (1.3) and (1.1). Suppose, at t = 0,

$$u(0,\cdot) \to p_*(0,\cdot) \ \text{in } H^1_{loc}, \tag{2.1a}$$
$$u_t(0,\cdot) \to p_{*t}(0,\cdot), \quad \frac{1}{\varepsilon}d(u(0,\cdot), M) \to 0 \ \text{in } L^2_{loc}, \tag{2.1b}$$

as ε → 0. Then for any R > 0 and T > 0,

$$u \to p_* \ \text{in } L^\infty_T H^1_R, \qquad u_t \to p_{*t}, \quad \frac{1}{\varepsilon}d(u, M) \to 0 \ \text{in } L^\infty_T L^2_R,$$

where $L^\infty_T H^1_R = L^\infty([-T, T]; H^1[-R, R])$ and $L^\infty_T L^2_R = L^\infty([-T, T]; L^2[-R, R])$.

Proof. The proof will be carried out by constructing an appropriate coordinate system near M and using energy estimates and mixed norm estimates on the tangential components of u.

Coordinates near $M \subset \mathbb{R}^{m+\ell}$. Let NM denote the normal bundle over M in $\mathbb{R}^{m+\ell}$. By the Tubular Neighborhood Theorem,

$$u = p + n, \qquad p \in M,\ n \in N_pM, \tag{2.2}$$

is a smooth coordinate system of $\mathbb{R}^{m+\ell}$ in a small tubular neighborhood of M. In this coordinate system, the function G(u) can be written as

$$G(p, n) = \frac12\langle A(p)n, n\rangle + G_1(p, n) \ge \frac14 a|n|^2 \ge 0,$$

where $A(p) = G''(p, 0)$ is a symmetric matrix which is positive definite on the normal bundle, $A(p)|_{N_pM} \ge a > 0$, and whose kernel is $\mathrm{Ker}\,A(p) = T_pM$, and where $G_1$ is a smooth function that satisfies

$$G_1(p, n) = O(|n|^3), \tag{2.3}$$

for all $p \in M$. Let $\Pi(p)$ be the second fundamental form of M, i.e.

$$\Pi(p)(n, X) = P(p)\nabla_X n, \quad \text{for any } X \in T_pM \text{ and any } n \in N_pM,$$

and let

$$\nabla^\perp_X n \equiv (I - P(p))\nabla_X n$$

denote the normal component of $\nabla_X n$. Moreover, if we consider p and n as mappings with the variable u given by (2.2), then it is easy to see that their linearizations satisfy

$$\frac{\partial p}{\partial u} = (I + \Pi(n,\cdot))^{-1}P, \qquad (I - P)\frac{\partial n}{\partial u} = I - P.$$

Thus, we can conclude that

$$(I - P)G'(p, n) = A(p)n + O(|n|^2), \qquad PG'(p, n) = O(|n|^2). \tag{2.4}$$
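The tubular coordinates u = p + n and the expansion of G' can be checked on a concrete example. The following sketch is an illustration only: it assumes M is the unit sphere in R^3 and uses the model penalty G(u) = d(u, M)^2 / 2, for which the gradient of G equals n exactly, so its tangential part vanishes and its normal part is A(p)n with A(p) = I - P. The helper names (`tubular_coordinates`, `grad_G`, `dot`) and the finite-difference gradient are choices made here, not part of the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tubular_coordinates(u):
    """Split u = p + n for the unit sphere M in R^3: p is the nearest point
    on M and n = u - p is normal to M at p."""
    r = math.sqrt(dot(u, u))
    p = [c / r for c in u]
    n = [c - pc for c, pc in zip(u, p)]
    return p, n

def G(u):
    """Model penalty G(u) = d(u, M)^2 / 2 for the unit sphere."""
    return 0.5 * (math.sqrt(dot(u, u)) - 1.0) ** 2

def grad_G(u, h=1e-6):
    """Central-difference approximation of the gradient G'(u)."""
    g = []
    for i in range(len(u)):
        up, um = list(u), list(u)
        up[i] += h
        um[i] -= h
        g.append((G(up) - G(um)) / (2.0 * h))
    return g

u = [0.6, -0.3, 0.9]
p, n = tubular_coordinates(u)
g = grad_G(u)
# Tangential part P G'(u) = G'(u) - <G'(u), p> p (p is the unit normal at p).
PG = [gi - dot(g, p) * pi for gi, pi in zip(g, p)]
print("G'(u) - n :", [gi - ni for gi, ni in zip(g, n)])
print("P G'(u)   :", PG)
```

Both printed vectors should be zero up to finite-difference error, consistent with (2.4) in the special case where the remainder terms vanish.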


Using characteristic coordinates

$$\xi = \frac12(t - x), \qquad \eta = \frac12(t + x),$$

Eq. (1.3) can be written as

$$u_{\xi\eta} + w'(u) + \frac{1}{\varepsilon^2}G'(u) = 0, \tag{2.5}$$
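The reduction to (2.5) rests on the identity that, with ξ = (t - x)/2 and η = (t + x)/2, the mixed derivative in (ξ, η) equals the wave operator in (t, x). A quick numerical sanity check is sketched below; the test function `u` and the finite-difference step are arbitrary choices made for this illustration.

```python
import math

def u(t, x):
    # Arbitrary smooth test function (an assumption for this check).
    return math.sin(t) * math.cos(2.0 * x) + t * x * x

def box_u(t, x):
    # Exact u_tt - u_xx for the test function above.
    return 3.0 * math.sin(t) * math.cos(2.0 * x) - 2.0 * t

def u_char(xi, eta):
    # Inverse change of variables: t = xi + eta, x = eta - xi.
    return u(xi + eta, eta - xi)

def mixed_derivative(f, xi, eta, h=1e-3):
    # Central-difference approximation of the mixed derivative f_{xi eta}.
    return (f(xi + h, eta + h) - f(xi + h, eta - h)
            - f(xi - h, eta + h) + f(xi - h, eta - h)) / (4.0 * h * h)

t, x = 0.7, -0.4
xi, eta = 0.5 * (t - x), 0.5 * (t + x)
lhs = mixed_derivative(u_char, xi, eta)
rhs = box_u(t, x)
print(f"u_xi_eta = {lhs:.6f}, u_tt - u_xx = {rhs:.6f}")
```

The two printed numbers should agree up to the discretization error of the difference quotient.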

and writing u = p + n, we decompose the above equation into tangential and normal directions

$$D_\eta\big((I + \Pi(n,\cdot))p_\xi\big) = -\Pi(\nabla^\perp_\xi n, p_\eta) - Pw'(p, n) - \frac{1}{\varepsilon^2}PG'(p, n), \tag{2.6a}$$
$$\nabla^\perp_\eta\nabla^\perp_\xi n + \frac{1}{\varepsilon^2}(I - P)G'(p, n) = \Pi(\cdot, p_\eta)^*p_\xi + \Pi(\cdot, p_\eta)^*\Pi(n, p_\xi) - (I - P)w'(p, n), \tag{2.6b}$$

where $\Pi(\cdot, p_\eta)^*$ is the conjugate map.

Energy Bounds. Let u = p + n be a solution to the penalized problem (1.3). Since the data are compactly supported, assumption (2.1) implies

$$p(0,\cdot) \to p_*(0,\cdot), \quad n(0,\cdot) \to 0 \ \text{in } H^1,$$
$$p_t(0,\cdot) \to p_{*t}(0,\cdot), \quad \nabla^\perp_t n(0,\cdot) \to 0, \quad \frac{n(0,\cdot)}{\varepsilon} \to 0 \ \text{in } L^2.$$

In the (p, n) coordinate system, the Hamiltonian $E_\varepsilon$ can be written as

$$E_\varepsilon = \int_{\mathbb{R}} \frac12|(I + \Pi(n,\cdot))p_t|^2 + \frac12|(I + \Pi(n,\cdot))p_x|^2 + \frac12|\nabla^\perp_t n|^2 + \frac12|\nabla^\perp_x n|^2 + w(p, n) + \frac{1}{2\varepsilon^2}\langle A(p)n, n\rangle + \frac{1}{\varepsilon^2}G_1(p, n)\,dx. \tag{2.7}$$

Therefore from conservation of energy, $E_\varepsilon(t) = E_\varepsilon(0) \le C_0$, and Sobolev embedding we conclude that for any $t \in \mathbb{R}$,

$$\int_{\mathbb{R}} |\nabla^\perp_t n|^2 + |\nabla^\perp_x n|^2 \le C, \qquad |n(t,\cdot)|_{L^2} \lesssim \varepsilon, \qquad |n(t, x)| \lesssim \sqrt{\varepsilon}, \tag{2.8}$$

where C is independent of t and ε. Here we introduced the notation $A \lesssim B$ to mean $A \le CB$, where C is a constant independent of ε. This $L^\infty$ bound on n and the fact that for any variable α of u we have

$$|u_\alpha|^2 = |(I + \Pi(n,\cdot))p_\alpha|^2 + |\nabla^\perp_\alpha n|^2$$

imply that any estimate on $u_\alpha$ also holds for $p_\alpha$ and $\nabla^\perp_\alpha n$. We can also obtain energy estimates on light cones using characteristic coordinates. For $(\xi, \eta) \in \mathbb{R}^2$, t = ξ + η ≥ 0, let C(ξ, η) be the characteristic cone with vertex (ξ, η)


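and the bottom side BC(ξ, η) on the x-axis. Let LC(ξ, η), RC(ξ, η) be the left (parallel to the η-axis) and right (parallel to the ξ-axis) sides of the cone, respectively. For any (ξ, η), t = ξ + η ≥ 0, multiply Eq. (2.5) by $u_\xi$ and integrate over C(ξ, η), to obtain

$$\int_{RC(\xi,\eta)} \frac12|u_\xi|^2\,d\xi + \int_{LC(\xi,\eta)} w(u) + \frac{1}{\varepsilon^2}G(u)\,d\eta = \int_{BC(\xi,\eta)} \frac12|u_\xi|^2 + w(u) + \frac{1}{\varepsilon^2}G(u),$$

which implies

$$|u_\xi|^2_{L^\infty_\eta L^2_\xi(C(\xi,\eta))} + \Big|\frac{n}{\varepsilon}\Big|^2_{L^\infty_\xi L^2_\eta(C(\xi,\eta))} \lesssim \Big(t + |u_\xi|^2_{L^2(BC(\xi,\eta))} + \Big|\frac{n}{\varepsilon}\Big|^2_{L^2(BC(\xi,\eta))}\Big). \tag{2.9}$$

Similarly, we have

$$|u_\eta|^2_{L^\infty_\xi L^2_\eta(C(\xi,\eta))} + \Big|\frac{n}{\varepsilon}\Big|^2_{L^\infty_\eta L^2_\xi(C(\xi,\eta))} \lesssim \Big(t + |u_\eta|^2_{L^2(BC(\xi,\eta))} + \Big|\frac{n}{\varepsilon}\Big|^2_{L^2(BC(\xi,\eta))}\Big). \tag{2.10}$$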

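From the energy bounds (2.9) and (2.10) we have, for any cone C(ξ, η), t = ξ + η > 0, that n is bounded in $H^1_{\xi\eta}(C(\xi,\eta))$ and n → 0 in $L^2_{\xi\eta}(C(\xi,\eta))$ as ε → 0; consequently n → 0 weakly in $H^1_{\xi\eta}(C(\xi,\eta))$.

$L^2L^\infty$ Bounds. Using Eqs. (2.6) and the energy estimates on C(ξ, η), we can obtain mixed norm estimates on $p_\xi$ and $p_\eta$. In fact, Eq. (2.6a) implies that for any (ξ, η), t = ξ + η > 0,

$$|(I + \Pi(n,\cdot))p_\xi(\xi,\eta)| \lesssim |(I + \Pi(n,\cdot))p_\xi(\xi,-\xi)| + \int_{-\xi}^{\eta}\Big|\Pi(\nabla^\perp_\xi n, p_\eta) + P\nabla w(p, n) + \frac{1}{\varepsilon^2}P\nabla G(p, n)\Big|(\xi,\eta')\,d\eta'$$
$$\lesssim |p_\xi(\xi,-\xi)| + \int_{LC(\xi,\eta)} 1 + |p_\eta||\nabla^\perp_\xi n| + \frac{|n|^2}{\varepsilon^2}\,d\eta \qquad \text{by (2.4)}$$
$$\lesssim |p_\xi(\xi,-\xi)| + \Big(t + |p_\eta|_{L^2_\eta(LC(\xi,\eta))}|\nabla^\perp_\xi n|_{L^2_\eta(LC(\xi,\eta))} + |u_\xi|^2_{L^2(BC(\xi,\eta))} + \Big|\frac{n}{\varepsilon}\Big|^2_{L^2(BC(\xi,\eta))}\Big) \qquad \text{by (2.8) and (2.9)}.$$

This implies that

$$|p_\xi|_{L^\infty_\eta(LC(\xi,\eta))} \lesssim \Big(t + |p_\xi(\xi,-\xi)| + |u_\xi|^2_{L^2(BC(\xi,\eta))} + \Big|\frac{n}{\varepsilon}\Big|^2_{L^2(BC(\xi,\eta))} + |p_\eta|_{L^2_\eta(LC(\xi,\eta))}|\nabla^\perp_\xi n|_{L^2_\eta(LC(\xi,\eta))}\Big).$$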


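By the energy estimate on C(ξ, η) we obtain

$$|p_\xi|_{L^2_\xi L^\infty_\eta(C(\xi,\eta))} \lesssim \Big(t^{3/2} + |p_\xi|_{L^2(BC(\xi,\eta))} + \sqrt{t}\,|u_\xi|^2_{L^2(BC(\xi,\eta))} + \sqrt{t}\,\Big|\frac{n}{\varepsilon}\Big|^2_{L^2(BC(\xi,\eta))} + |p_\eta|_{L^\infty_\xi L^2_\eta(C(\xi,\eta))}|\nabla^\perp_\xi n|_{L^2_\xi L^2_\eta(C(\xi,\eta))}\Big)$$
$$\lesssim \Big(t^{3/2} + |p_\xi|_{L^2(BC(\xi,\eta))} + \sqrt{t}\,\Big|\frac{n}{\varepsilon}\Big|^2_{L^2(BC(\xi,\eta))} + \sqrt{t}\,|u_\xi|^2_{L^2(BC(\xi,\eta))} + \sqrt{t}\,|u_\eta|^2_{L^2(BC(\xi,\eta))}\Big),$$

and thus

$$|p_\xi|_{L^2_\xi L^\infty_\eta(C(\xi,\eta))} \le C(t, E_\varepsilon). \tag{2.11}$$

Similarly we can obtain

$$|p_\eta|_{L^2_\eta L^\infty_\xi(C(\xi,\eta))} \le C(t, E_\varepsilon). \tag{2.12}$$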

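Energy Bounds on n. With the above bounds, we conclude that the right side of Eq. (2.6b) is in $L^2_{\xi\eta}$. To obtain the energy estimate for n, multiply (2.6b) by $\nabla^\perp_\xi n$ to obtain

$$\frac12\partial_\eta|\nabla^\perp_\xi n|^2 + \frac{1}{\varepsilon^2}\partial_\xi G(p, n) = \langle\Pi(\nabla^\perp_\xi n, p_\eta), p_\xi\rangle + \langle\Pi(\nabla^\perp_\xi n, p_\eta), \Pi(n, p_\xi)\rangle - \langle w'(p, n), \nabla^\perp_\xi n\rangle + \frac{1}{\varepsilon^2}\langle PG'(p, n), (I + \Pi(n,\cdot))p_\xi\rangle$$
$$= f_1(\xi, \eta, \nabla^\perp_\xi n) + O\Big\{|\nabla^\perp_\xi n|\big(|p_\eta||p_\xi||p - p_*| + |p_\eta||p_\xi - p_{*\xi}| + |p_{*\xi}||p_\eta - p_{*\eta}| + |n||p_\eta||p_\xi| + |n| + |p - p_*|\big) + |p_\xi|\frac{|n|^2}{\varepsilon^2}\Big\}, \tag{2.13}$$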

where f1 (ξ, η, ∇ξ⊥ n) =< (p∗ )(∇ξ⊥ n, p∗η ), p∗ξ > − < w (p∗ ), ∇ξ⊥ n > and p∗ = p∗ (ξ, η) ∈ M is the finite energy solution of (1.1). In the above, |p − p∗ | is the distance in the Euclidean space R m+ , which is locally equivalent to the distance on M. Similarly, multiplying (2.6b) by ∇η⊥ n and adding it to (2.13), we obtain 1 1 (∂η |∇ξ⊥ n|2 + ∂ξ |∇η⊥ n|2 + 2 (∂ξ + ∂η )G(p, n)) 2 ε = f1 (ξ, η, ∇ξ⊥ n + ∇η⊥ n) +O{(|∇ξ⊥ n| + |∇η⊥ n|) × (|pη ||pξ ||p − p∗ | +|pη ||pξ − p∗ξ | + |p∗ξ ||pη − p∗η | + |n| |n|2 }. (2.14) ε2 From estimates similar to those in (2.9), (2.10), and (2.11) we have |p∗ξ ||p∗η | ∈ L2ξ η (C(ξ, η)). Thus, f1 (ξ, η, ∇ξ⊥ n + ∇η⊥ n)dξ dη → 0 +|n||pη ||pξ | + |p − p∗ |) + (|pξ | + |pη |)

C(ξ,η)


as ε → 0 uniformly in (ξ, η) on any bounded set. Thus integrating inequality (2.14) over a cone C(ξ, η) and using (2.9), (2.10), (2.11), we obtain n n |∇ξ⊥ n|2L2 (RC(ξ,η)) + |∇η⊥ n|2L2 (LC(ξ,η)) + | |2L2 (RC(ξ,η)) + | |2L2 (LC(ξ,η)) η ξ ε ξ ε η η+ξ n {|∇ξ⊥ n|2L2 (RC(ξ,s−ξ )) + |pη |L∞ | |2 (RC(ξ,s−ξ )) ξ (RC(ξ,s−ξ )) ε L2 ξ ξ 0 n + |∇η⊥ n|2L2 (LC(s−η,η)) + |pξ |L∞ | |2L2 (LC(s−η,η)) }ds η (LC(s−η,η)) η ε η 2 + o(1) + (|pξ − p∗ξ |L∞ L2 (C(ξ,η)) + |pη − p∗η |2L∞ L2 (C(ξ,η)) ), η

η

ξ

ξ

where we estimated the right-hand side in the following manner: √ |p − p∗ |L∞ |p − p∗ |L∞ (BC(ξ,η)) + t|pξ − p∗ξ |L∞ L2 (C(ξ,η)) ξ η (C(ξ,η)) η ξ √ |p − p∗ |H 1 + t|pξ − p∗ξ |L∞ L2 (C(ξ,η)) , η

t=0

and

ξ

|∇ξ⊥ n||pη ||pξ − p∗ξ |dξ dη |∇ξ⊥ n|2 dξ dη + |pη |2 |pξ − p∗ξ |2 dξ dη

C(ξ,η)

C(ξ,η)

C(ξ,η)

C(ξ,η)

C(ξ,η)

|∇ξ⊥ n|2 dξ dη + |∇ξ⊥ n|2 dξ dη +

C(ξ,η) η ξ

−ξ η −ξ

−η

|pη |2 |pξ − p∗ξ |2 dξ dη

|pη |2L∞ (RC(ξ,η )) |pξ − p∗ξ |2L2 (RC(ξ,η )) dη ξ

ξ

|∇ξ⊥ n|2 dξ dη + |pη |2L2 L∞ (C(ξ,η)) |pξ − p∗ξ |2L∞ L2 (C(ξ,η)) , η ξ

η

ξ

with similar estimates for the remaining terms. In the above inequalities, the constant C in may depend on the size of the support and the norms of the initial data and the range of t = ξ + η, but uniform in (ξ, η) and ε. The term o(1), which approaches 0 as ε → 0, comes from the integral of f1 , the convergence of the initial data, and the smallness of n. Therefore, when t = ξ + η > 0 is bounded, using (2.11), we obtain n n |∇ξ⊥ n|2L∞ L2 (C(ξ,η)) + |∇η⊥ n|2L∞ L2 (C(ξ,η)) + | |2L∞ L2 (C(ξ,η)) + | |2L∞ L2 (C(ξ,η)) η ξ η ξ ε η ξ ε ξ η 2 2 o(1) + (|pξ − p∗ξ |L∞ L2 (C(ξ,η)) + |pη − p∗η |L∞ L2 (C(ξ,η)) ). (2.15) η

ξ

ξ

η

Convergence of p. In order to estimate $p - p_*$, it is easier to write Eqs. (1.1) and (2.6a) without using covariant derivatives:

$$p_{*\xi\eta} = -\Pi(\cdot, p_{*\eta})^*p_{*\xi} - P(p_*)w'(p_*),$$
$$\partial_\eta\big((I + \Pi(n,\cdot))p_\xi\big) = -\Pi(\cdot, p_\eta)^*p_\xi - \Pi(\cdot, p_\eta)^*\Pi(n, p_\xi) - \Pi(\nabla^\perp_\xi n, p_\eta) - P(p)w'(p, n) - \frac{1}{\varepsilon^2}PG'(p, n).$$


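Taking the difference of these two equations, we obtain

$$\partial_\eta\big(p_\xi - p_{*\xi} + \Pi(n, p_\xi)\big) = O\Big(|p_\eta||p_\xi||p - p_*| + |p_\eta||p_\xi - p_{*\xi}| + |p_{*\xi}||p_\eta - p_{*\eta}| + |n||p_\eta||p_\xi| + |p_\eta||\nabla^\perp_\xi n| + |n| + |p - p_*| + \frac{|n|^2}{\varepsilon^2}\Big). \tag{2.16}$$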

For any fixed (ξ, η), t = ξ +η > 0, it follows from integrating Eq. (2.16) along LC(ξ, η) that |pξ − p∗ξ + (n, pξ ))|L∞ η

√ |(pξ − p∗ξ )(ξ, −ξ )| + ( ε|pξ (ξ, −ξ )| + (1 + |pξ |L∞ |pη |L2η )|p − p∗ |L∞ η η √ ∞ ∞ ∞ + |pη |L2η |pξ − p∗ξ |Lη + |p∗ξ |Lη |pη − p∗η |L2η + ε|pξ |Lη |pη |L2η √ n + |pη |L2η |∇ξ⊥ n|L2η + ε + | |2L2 ), ε η

2 ∞ where the norms L2η and L∞ η represent Lη (LC(ξ, η)) and Lη (LC(ξ, η)), respectively. 2 2 ∞ Note that L∞ ξ Lη norm is controlled by the Lη Lξ norm and

|p − p∗ |H 1 + |p − p∗ |L∞ η (LC(ξ,η)) t=0

√

t|pη − p∗η |L2η (LC(ξ,η)) .

Therefore, from (2.9), (2.10), (2.11), and (2.15), we have |pξ − p∗ξ |2L2 L∞ ξ

η

{(t + |p∗ξ |2L2 L∞ )|pη − p∗η |2L∞ L2 + |pη |2L∞ L2 |pξ − p∗ξ |2L2 L∞ ξ

η

ξ

η

ξ

η

ξ

η

n + t| |4L∞ L2 } + o(1) ε ξ η 2 {(t + |p∗ξ |L2 L∞ )|pη − p∗η |2L2 L∞ + (t + |pη |2L∞ L2 )|pξ − p∗ξ |2L2 L∞ + |pη |2L∞ L2 |∇ξ⊥ n|2L2 η ξ ξη ξ

η ξ

η

+ t (|pξ − p∗ξ |4L2 L∞ ξ η

ξ

η

ξ

η

+ |pη − p∗η |4L∞ L2 )} + o(1). η ξ

Here all the norms are on C(ξ, η). As in (2.15), the constant C in $\lesssim$ is uniform in ε and (ξ, η), and the o(1) term is also uniform in (ξ, η) when t is bounded. The constants may depend on the norm and the size of the support of the initial data. A similar inequality can be obtained for $|p_\eta - p_{*\eta}|_{L^2_\eta L^\infty_\xi}$. Combining these inequalities with (2.11), we obtain that, when t = ξ + η > 0 is reasonably small,

$$|p_\xi - p_{*\xi}|_{L^2_\xi L^\infty_\eta(C(\xi,\eta))} + |p_\eta - p_{*\eta}|_{L^2_\eta L^\infty_\xi(C(\xi,\eta))} \to 0 \tag{2.17}$$

as ε → 0. In fact, since p and $p_*$ are compactly supported in x, (2.17) is sufficient to yield that, for t ∈ [0, δ], where δ is a small number independent of ε,

$$p \to p_* \ \text{in } L^\infty_t H^1_x, \qquad p_t \to p_{*t} \ \text{in } L^\infty_t L^2_x.$$


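In order to prove the same convergence for n, we write the n equation (2.6b) in (x, t) coordinates:

$$\nabla^\perp_t\nabla^\perp_t n - \nabla^\perp_x\nabla^\perp_x n + \frac{1}{\varepsilon^2}(I - P)G'(p, n) = \Pi(\cdot, p_\eta)^*p_\xi + \frac12\big[\Pi(\cdot, p_\eta)^*\Pi(n, p_\xi) + \Pi(\cdot, p_\xi)^*\Pi(n, p_\eta)\big] - (I - P)w'(p, n). \tag{2.18}$$

Since all these functions are compactly supported in x, from (2.9), (2.10), (2.11), and (2.15), the right side of (2.18) is bounded in $L^2(\mathbb{R}\times[0,\delta])$ and $\nabla^\perp_t n \to 0$ in $L^2(\mathbb{R}\times[0,\delta])$. Note

$$\langle\nabla^\perp_t n, \nabla^\perp_x\nabla^\perp_x n\rangle = \partial_x\langle\nabla^\perp_t n, \nabla^\perp_x n\rangle - \langle\partial_x\nabla^\perp_t n, \nabla^\perp_x n\rangle$$
$$= \partial_x\langle\nabla^\perp_t n, \nabla^\perp_x n\rangle - \langle\partial_{tx}n, \nabla^\perp_x n\rangle - \langle\Pi(\nabla^\perp_x n, p_x), \Pi(n, p_t)\rangle$$
$$= \partial_x\langle\nabla^\perp_t n, \nabla^\perp_x n\rangle - \frac12\partial_t|\nabla^\perp_x n|^2 + \langle\Pi(\nabla^\perp_x n, p_t), \Pi(n, p_x)\rangle - \langle\Pi(\nabla^\perp_x n, p_x), \Pi(n, p_t)\rangle$$
$$= \partial_x\langle\nabla^\perp_t n, \nabla^\perp_x n\rangle - \frac12\partial_t|\nabla^\perp_x n|^2 + \frac12\langle\Pi(\nabla^\perp_x n, p_\xi), \Pi(n, p_\eta)\rangle - \frac12\langle\Pi(\nabla^\perp_x n, p_\eta), \Pi(n, p_\xi)\rangle.$$

Therefore, the extra terms coming from the integration by parts in the above converge to 0 in $L^1(\mathbb{R}\times[0,\delta])$. Multiplying (2.18) by $\nabla^\perp_t n$ and integrating on $\mathbb{R}\times[0,\delta]$, we obtain that

$$|\nabla^\perp_t n|_{L^\infty_t L^2_x} + |\nabla^\perp_x n|_{L^\infty_t L^2_x} + \Big|\frac{n}{\varepsilon}\Big|_{L^\infty_t L^2_x} \to 0.$$

By a continuation argument, these convergences of p and n hold for t ∈ [0, T] and this finishes the proof of Theorem 1.

3. Nontangential Data

In this section we will assume $0 \in M \subset \mathbb{R}^{m+1}$ is an orientable hypersurface with tubular neighborhood coordinates given by (p, d),

$$u = p + n = p + d\nu,$$

where ν = ν(p) is a smooth unit normal vector field and $d \in \mathbb{R}$. The second fundamental form is given by

$$\Pi(p)(n, X) = d\,\nu'(p)X \overset{\mathrm{def}}{=} d\,\Pi X.$$

Moreover, and without any loss of generality, we'll assume that the potential w = 0, the initial data has compact support, and that

$$G(u) = \frac12 a(p)d^2, \qquad a(p) > 4,$$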


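where the constant 4 is chosen simply for convenience. The penalized equation

$$\Box u + \frac{1}{\varepsilon^2}G'(u) = 0$$

can be written in terms of (p, d) as

$$\Box d + \frac{1}{\varepsilon^2}a(p)d = -\langle p_\alpha, \Pi p^\alpha\rangle - d\,\Pi p_\alpha\cdot\Pi p^\alpha, \tag{3.1a}$$
$$(I + d\Pi)D^\alpha p_\alpha + 2\Pi(d^\alpha p_\alpha) + d(\Pi' p^\alpha)p_\alpha + \frac{d^2}{2\varepsilon^2}a'(p) = 0. \tag{3.1b}$$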

The limit system. When the initial data for u is assumed to be of bounded energy, but not necessarily tangential to M, we have, roughly,

$$d(0) = \varepsilon d_0, \qquad d_t(0) = d_1, \qquad p(0) = p_0, \qquad p_t(0) = p_1,$$

where $d_0 \ne 0$ or $d_1 \ne 0$. To find the limit of the above system as ε → 0, for such initial data, Keller and Rubinstein derived a system of equations that the limit should satisfy using a formal asymptotic expansion [11]

$$d = \mathrm{Re}\{\varepsilon R e^{i(\frac{\theta}{\varepsilon} + \alpha)} + \dots\}.$$

From this ansatz one can easily derive a set of equations for R and θ from (3.1a),

$$\theta_t^2 - \theta_x^2 = a(p), \qquad \partial_t(R^2\theta_t) - \partial_x(R^2\theta_x) = 0,$$

with initial data

$$\theta(0) = 0, \qquad R(0) = \sqrt{d_0^2 + \frac{d_1^2}{a(p_0)}}.$$

Moreover, using this asymptotic expansion we conclude that in some averaged sense

$$\frac{d^2}{\varepsilon^2} \to \frac{R^2}{2} \overset{\mathrm{def}}{=} A.$$

Therefore as ε → 0, at least heuristically, $\theta \to \theta_*$, $\frac{R^2}{2} \to A_*$, and $p \to p_*$, which is given by the system

$$\theta_{*t}^2 - \theta_{*x}^2 = a_*, \qquad \theta_*(0) = 0, \tag{3.2a}$$
$$\theta_{*t}\partial_tA_* - \theta_{*x}\partial_xA_* + (\Box\theta_*)A_* = 0, \qquad A_*(0) = A_{*0} = \frac12\Big(d_0^2 + \frac{d_1^2}{a(p_0)}\Big), \tag{3.2b}$$
$$D_tp_{*t} - D_xp_{*x} + \frac12A_*a'_* = 0, \qquad p_*(0) = p_0, \quad p_{*t}(0) = p_1, \tag{3.2c}$$

where we have introduced the notation, for any function f of p, $f_* = f(p_*)$. Thus $a_* = a(p_*)$, $a'_* = a'(p_*)$, and so on.
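The averaged limit of d^2/ε^2 and the initial value of A can be illustrated on a toy model of the normal equation. The sketch below is an assumption-laden simplification, not the paper's system: it freezes a(p) to a constant a > 4 and drops the right side of (3.1a), so that d solves d'' + a d/ε^2 = 0 with d(0) = ε d_0, d'(0) = d_1, which has an explicit solution. The function name and the sample values of a, d_0, d_1 are chosen here for illustration.

```python
import math

def normal_oscillation_averages(eps, a=5.0, d0=0.3, d1=0.8, T=1.0, n=200000):
    """For constant a, d'' + a d / eps^2 = 0 with d(0) = eps*d0, d'(0) = d1 has
    the exact solution d = eps*d0*cos(w t) + (d1/w)*sin(w t), w = sqrt(a)/eps.
    Returns the time averages over [0, T] of d^2/eps^2 and of d_t^2,
    computed with the midpoint rule."""
    w = math.sqrt(a) / eps
    dt = T / n
    sum_d2 = 0.0
    sum_dt2 = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        c, s = math.cos(w * t), math.sin(w * t)
        d = eps * d0 * c + (d1 / w) * s
        d_t = -eps * d0 * w * s + d1 * c
        sum_d2 += (d / eps) ** 2
        sum_dt2 += d_t ** 2
    return sum_d2 / n, sum_dt2 / n

a, d0, d1 = 5.0, 0.3, 0.8
A0 = 0.5 * (d0 ** 2 + d1 ** 2 / a)  # predicted average of d^2/eps^2
avg_d2, avg_dt2 = normal_oscillation_averages(1e-3, a, d0, d1)
print(f"average of d^2/eps^2 = {avg_d2:.5f}, predicted A = {A0:.5f}")
print(f"average of d_t^2     = {avg_dt2:.5f}, a * A       = {a * A0:.5f}")
```

In this toy model the average of d^2/ε^2 matches (d_0^2 + d_1^2/a)/2, and the kinetic part d_t^2 averages to a times that value, so the normal energy splits evenly between its kinetic and potential parts.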


The above system of equations, although hyperbolic, doesn't seem to yield useful a priori estimates because of the presence of $\Box\theta$ in Eq. (3.2b). In fact all of the estimates we were able to derive had a loss of derivatives. For this reason we will use the Nash-Moser implicit function theorem to prove that system (3.2) has a local solution for smooth data.

Theorem 2. Fix a large integer k and consider data $(0, A_0, p_0, p_1)$ in $H^k$ such that $A_0 \ge 0$. Then there exists T > 0 such that system (3.2) has a unique solution $(\theta_*, A_*, p_*) \in H^{k-2}([0, T]\times\mathbb{R})$.

Proof. Using finite speed of propagation we will first localize the solution in space, and then apply the Nash-Moser implicit function theorem in a neighborhood of an approximate solution.

Localization. For any $x_0 \in \mathbb{R}$ we consider the light cone $C(x_0, \delta)$ whose base is given by $[x_0 - \delta, x_0 + \delta]$ and vertex $(x_0, \delta)$. For $(x, t) \in C(x_0, \delta)$ we translate $x_0$ to the origin and scale t → δt, x → δx and θ → δθ. In these variables the equations become

$$F(z) \overset{\mathrm{def}}{=} \begin{pmatrix} \theta_t^2 - \theta_x^2 - a(p) \\ \theta_t\partial_tA - \theta_x\partial_xA + (\Box\theta)A \\ D_tp_t - D_xp_x + \delta^2\frac{A}{2}a'(p) \end{pmatrix} = 0, \tag{3.3}$$

where z = (θ, A, p) and (x, t) ∈ C(0, 1), with initial data

$$\theta(0, x) = 0, \qquad A(0, x) = A_0(\delta x), \qquad p(0, x) = p_0(\delta x), \qquad p_t(0, x) = \delta p_1(\delta x).$$

Since θ(0, x) = 0, then from (3.2a) $\theta_t(0, x) = \sqrt{a_*} \ge 2$ and therefore the characteristics of the θ and the A equations have slopes $\frac{dx}{dt}$ less than 1. Thus solutions at any point inside the cone C(0, 1) depend only on the data at the base of the cone.

Function spaces. Since we expect some derivative loss, we will set up the problem in the space

$$z \in \mathcal{H}^k \overset{\mathrm{def}}{=} H^k(C(0,1)) \times H^{k-2}(C(0,1)) \times H^k(C(0,1))$$

for k large. Let $z_I = (\theta_I, A_I, p_I)$ be a solution to

$$\theta_{It}^2 - \theta_{Ix}^2 = a(p_I), \qquad \theta_{It}\partial_tA_I - \theta_{Ix}\partial_xA_I + (\Box\theta_I)A_I = 0, \qquad D_t\partial_tp_I - D_x\partial_xp_I = 0,$$

with the same initial data given above. For δ small the data is almost constant, and thus $p_I$ is slowly varying in x. This implies that the $\theta_I$ equation does not develop shocks on a time interval [0, T] for some T > 1. Thus the above decoupled system has a solution $z_I \in \mathcal{H}^k$. To proceed in setting up the problem to use the hard implicit function theorem, let

$$\tilde F(\tilde z) = F(\tilde z + z_I): B^k \to \mathcal{H}^{k-2}$$

be defined on the unit ball $B^k \subset \mathcal{H}^k$ with zero data on the base of the cone C(0, 1). Clearly $\tilde F$ is $C^2(B^k)$ with bounds

$$\|\tilde F\|_{C^2(B^k)} \le C, \qquad |\tilde F(0)|_{\mathcal{H}^k} = \delta^2|a'(p_I)A_I|_{\mathcal{H}^k} \le C\delta^2, \tag{3.4}$$

with C depending on $|z_I|_{\mathcal{H}^k}$.


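The linearized operator. Given any $z = (\theta, A, p) \in B^k$, the derivative of $\tilde F$ at z acting on $\hat z = (\hat\theta, \hat A, \hat p)$ is given by

$$L(z)\hat z = \tilde F'(z)\hat z = \begin{pmatrix} 2W(\theta)\hat\theta - a'(p)\hat p \\ W(\theta)\hat A + W(A)\hat\theta + A\,\Box\hat\theta + (\Box\theta)\hat A \\ \Box\hat p + Q_1^\alpha\hat p_\alpha + \big(Q_2 + \delta^2\frac{A}{2}a''(p)\big)\hat p + \frac{\delta^2}{2}a'(p)\hat A \end{pmatrix},$$

where $W(\theta) = \theta_t\partial_t - \theta_x\partial_x$, $Q_1^\alpha$ and $Q_2$ depend on p and ∂p, and $\hat z$ has zero initial data. The difficulty in solving $L(z)\hat z = \hat h$, i.e.,

$$2W(\theta)\hat\theta - a'(p)\hat p = \hat h_1, \tag{3.5a}$$
$$W(\theta)\hat A + W(A)\hat\theta + A\,\Box\hat\theta + (\Box\theta)\hat A = \hat h_2, \tag{3.5b}$$
$$\Box\hat p + Q_1^\alpha\hat p_\alpha + \Big(Q_2 + \delta^2\frac{A}{2}a''(p)\Big)\hat p + \frac{\delta^2}{2}a'(p)\hat A = \hat h_3, \tag{3.5c}$$

is again the presence of $\Box\hat\theta$ in Eq. (3.5b). To overcome this difficulty we observe that if we let $Z(\theta) = -\theta_x\partial_t + \theta_t\partial_x$, then

$$\partial_t = \frac{\theta_t}{\alpha}W + \frac{\theta_x}{\alpha}Z, \qquad \partial_x = \frac{\theta_x}{\alpha}W + \frac{\theta_t}{\alpha}Z,$$

where $\alpha = \theta_t^2 - \theta_x^2 > c_0$ for $z \in B^k$ and δ sufficiently small.$^2$ Using the W and Z vector fields we can express the wave operator as

$$\Box = \frac{1}{\alpha}(W^2 - Z^2) + b_0Z + b_1W, \tag{3.6}$$

where the $b_k$ depend on θ, ∂θ, and $\partial^2\theta$. To handle the term $\Box\hat\theta$, we apply the vector field W to $\hat\gamma \overset{\mathrm{def}}{=} \Box\hat\theta$ to obtain

$$W\hat\gamma = \Box W\hat\theta + [W, \Box]\hat\theta; \tag{3.7}$$

the commutator consists of terms

$$W^2\hat\theta, \quad Z^2\hat\theta, \quad ZW\hat\theta, \quad Z\hat\theta, \quad W\hat\theta$$

with coefficients depending on θ, ∂θ, $\partial^2\theta$, and $\partial^3\theta$. The term $Z^2\hat\theta$ can be expressed using Eq. (3.6) as

$$Z^2\hat\theta = W^2\hat\theta - \alpha\Box\hat\theta + \alpha b_0Z\hat\theta + \alpha b_1W\hat\theta,$$

and therefore Eq. (3.7) can be written as

$$W\hat\gamma = \Box W\hat\theta + b_2\alpha\hat\gamma + b_3W^2\hat\theta + b_4ZW\hat\theta + b_5Z\hat\theta + b_6W\hat\theta.$$

$^2$ Here we used a(p) > 4 and that $B^k$ is the unit ball. If $a(p) > c_0$, an arbitrary constant, we simply adjust the radius of $B^k$ to ensure that α is strictly positive.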


The terms W2 θˆ and Wθˆ can be expressed using Eqs. (3.5a) and (3.5c) as 1 W2 θˆ = W(a (p)pˆ + hˆ 1 ), 2 1 1 ˆ 1 ˆ 1 ˆ Wθ = (a (p)p) h1 = b7 Aˆ + b8 pˆ + b9 ∂ pˆ + h1 + a (p)hˆ 3 , ˆ + 2 2 2 2 where the coefficients depend on A, p, ∂p, and ∂ 2 p. Therefore the equation for Wγˆ can be written hˆ 1 Wγˆ = b2 α γˆ + b10 ∂ pˆ + b11 Zθˆ + b12 Aˆ + b13 pˆ + b14 1 + a (p)hˆ 3 + b15 ∂ hˆ 1 + b16 hˆ 1 , 2

(3.8)

ˆ The data for γˆ = where we used Eq. (3.5a) to substitute for Wθ. θ can be computed from differentiating Eq. (3.5a) with respect to t. ˆ we augment system (3.5) by (3.8) to obtain a coupled transportTo solve L(z)ˆz = h, ˆ ˆ wave system for (θ , A, γˆ , p). ˆ The coefficients of the θˆ equation depend on p and ∂θ; the coefficients of the Aˆ equation depend on ∂A and second derivatives of θ ; the coefficients of the γˆ equation depend on second derivatives of p and third derivatives of θ; and the coefficients of the pˆ equation depend on p, ∂p, and A. The initial value problem for this augmented system can be solved as follows: Let χ (s; x, t) represent the characteristic curve of the vector field W(θ ) passing through the point (x, t) and parametrized by s, and let χ0 (x, t) denote the intersection of the characteristic curve with the x-axis. Then system (3.5) and Eq. (3.8) can be written as 1 ˆθ (x, t) = (a (p)pˆ + hˆ 1 ), 2 χ ˆ ˆ θ)A), A(x, t) = (hˆ 2 − W(A)θˆ + Aγˆ − ( χ γˆ (x, t) = γˆ0 (χ0 (x, t)) + R1 , p(x, ˆ t) =

t

χ x+t−s

R2 , 0

x−t+s

where R1 and R2 are from Eqs. (3.5c) and (3.8) respectively. The integral equations system can be solved for t ∈ [0, T ] by a fixed point argument since the solutions satisfy the a priori estimates |θˆ |Htxm ≤ T C |hˆ 1 |Htxm + |p| ˆ Htxm , ˆ m−1 ≤ T C |hˆ 2 | m−1 + |θˆ |H m + |A| ˆ m−1 + |γˆ | m−1 , |A| Htx Htx Htx Htx tx ˆ m−1 + |p| ˆ H m + |γˆ | m−1 , |γˆ − γˆ0 |H m−1 ≤ T C |hˆ 1 |H m+1 + |hˆ 3 |H m−1 + |A| ˆ Htxm + |θ| H H tx tx tx tx tx tx ˆ m−1 , |p| ˆ Htxm ≤ T C |hˆ 3 |H m−1 + |p| ˆ Htxm + |A| H tx

tx

m is the Sobolev space H m on the cone where C depends on |z|H m+2 for m ≥ 3 and Htx tx C(0, 1).


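To show that the solution to the integral system solves $L(z)\hat z = \hat h$ we only need to verify that $\hat\gamma = \Box\hat\theta$. To do this we note that in deriving (3.8) we used (3.5a) and (3.5c), which also hold by solving the integral system. Thus (3.8), (3.5a), and (3.5c) imply that (3.7) holds,

$$W\hat\gamma = \Box W\hat\theta + [W, \Box]\hat\theta,$$

and since $\hat\gamma(x, 0) = (\Box\hat\theta)(x, 0)$, this implies $\hat\gamma = \Box\hat\theta$. Thus L is invertible on $\mathcal{H}^m$ for $3 \le m \le k - 2$ with the estimate on $L^{-1}$,

$$|L^{-1}(z)\hat h|_{\mathcal{H}^m} \le C|\hat h|_{\mathcal{H}^{m+1}},$$

where C depends on the $\mathcal{H}^{m+2}$ norm of z. Since $|\tilde F(0)|_{\mathcal{H}^k} \le C\delta^2$ can be made arbitrarily small compared to the bounds on $\tilde F$, L, and $L^{-1}$ restricted to $B^k$, then by the Nash-Moser implicit function theorem there is a unique solution to system (3.3) in $B^k$ [17].

Convergence. Consider the penalized system (3.1) with $H^2_x \times H^1_x$ initial data $(u_{\varepsilon 0}, u_{\varepsilon 1})$, depending on ε, such that when expressed in terms of $(p_{\varepsilon 0}, p_{\varepsilon 1}, d_{\varepsilon 0}, d_{\varepsilon 1})$ the data satisfies

$$|u_{\varepsilon 0}|_{H^2} + |u_{\varepsilon 1}|_{H^1} \le C_0,$$
$$|p_{\varepsilon 0} - p_0|_{H^1} + |p_{\varepsilon 1} - p_1|_{L^2} + |d_{\varepsilon 0}|_{H^1} + \Big|\frac{d_{\varepsilon 0}^2}{2\varepsilon^2} + \frac{d_{\varepsilon 1}^2}{2a(p_{\varepsilon 0})} - A_0\Big|_{L^2} \lesssim \varepsilon, \tag{3.9}$$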

for some smooth functions $(p_0, p_1, A_0) \in H^k \times H^{k-1} \times H^{k-1}$.

Theorem 3. Let the data for the penalized system

$$\Box u + \frac{1}{\varepsilon^2}G'(u) = 0$$

satisfy (3.9), and assume that the solution $(\theta_*, A_*, p_*)$ of the limit system (3.2),

$$\theta_{*t}^2 - \theta_{*x}^2 = a_*, \qquad \theta_{*t}\partial_tA_* - \theta_{*x}\partial_xA_* + (\Box\theta_*)A_* = 0, \qquad D_tp_{*t} - D_xp_{*x} + \frac12A_*a'_* = 0,$$

with data $(0, A_0, p_0, p_1)$ exists on a time interval [0, T]. Then the solution p converges to $p_*$ as ε → 0, i.e.,

$$q \overset{\mathrm{def}}{=} p - p_* = O(\varepsilon) \ \text{in } L^\infty_t H^1_x \text{ on } [0, T]\times\mathbb{R}.$$

Proof. The proof will proceed in the following manner: 1) We introduce a change of variables (x, t) → (y, τ) that will restrict the rapid oscillations present in the system to the τ variable. 2) Using the new coordinates, we obtain $H^2$ bounds on (p, d), avoiding the term $d_{\tau\tau}$ which blows up as ε → 0. 3) We show that $\frac{d^2}{\varepsilon^2} - A_* = O(\varepsilon)$ in an averaged sense. 4) Using energy estimates and the computed weak limits of $\frac{d^2}{\varepsilon^2}$ we conclude the convergence of $q = p - p_* \to 0$.

Change of variables. Since rapid oscillations are present in the form of $e^{i\theta_*(t,x)/\varepsilon}$ (see the beginning of this section), we introduce a new coordinate system (τ, y), where $\tau = \theta_*$, the solution of (3.2a), and $y = \zeta_*(x, t)$ is given by

$$\theta_{*t}\partial_t\zeta_* - \theta_{*x}\partial_x\zeta_* = 0, \qquad \zeta_*(0, x) = x.$$

This is a nondegenerate coordinate system for t ∈ [0, T ] and x ∈ R. In fact, let s (t, x) be the flow defined by the vector field (θ∗t , −θ∗x ), then ζ∗ ◦ s = ζ∗ for all s. Therefore, ζ∗x (0, x) = 1 implies ζ∗ (s (0, x)) = 0 for all s and x. Thus using the fact θ∗t = 0, we conclude from the ζ∗ equation that ζ∗x (s (0, x)) = 0. The nondegeneracy of the transformation (t, x) → (τ, y) is given by the Jacobian J = a∗θζ∗t∗x = 0. Moreover, since dtt only appear in the expression of dτ τ for t = 0, hypothesis (3.9) on the initial data becomes for τ = 0, |uε0 |Hy2 + |uετ |Hy1 C0 , d2 p1 |pε0 − p0 |Hy1 + pετ − √a(p + |dε0 |Hy1 + 2εε02 + ) L2 0

y

2 dετ 2

− A0 L2 ε.

(3.10)

y

In these coordinates the penalized wave operator is given by a(p) 1 a(p) − a∗ + 2 = a∗ ∂τ2 − f1 ∂y2 + f2 ∂τ + f3 ∂y + 2 + ε ε a∗ ε 2 1 def + 2 (1 + m) , = a∗ ε where f1 =

2 − ζ2 ζ∗x ∗t > 0, a∗

f2 =

θ∗ , a∗

f3 =

ζ∗ , a∗

m = O(q),

and the penalized system (3.1) is given by p + 2dτ ∗ p∗τ + d +

d 2 a∗ = Q∗ + Q 1 , ε 2 2a∗

1+m d = g + Q2 , ε2

(3.11a) (3.11b)

where Q∗ = − < ∗ p∗τ , p∗τ > ν∗ + f1 < ∗ p∗y , p∗y > ν∗ , Q1 = < ∗ p∗τ , p∗τ > ν∗ − < pτ , pτ > ν − f1 < ∗ p∗y , p∗y > ν∗ + f1 < py , py > ν d 2 a d 2 a + 2 ∗ 2 2ε a∗ 2ε a∗ −1 − (I + d ) [−2f1 dy py + d( (pτ ))pτ − f1 d( (py ))py ], − (I + d )−1 (2dτ pτ ) + 2dτ ∗ p∗τ − (I + d )−1

Q2 = d| pτ |2 − f1 d| py |2 , g = < pτ , pτ > ν − f1 < py , py > ν.


The limit system (3.2) in these coordinates is given by a∗ = Q∗ , a∗ ∂τ A∗ + f2 A∗ = 0,

p∗ + 21 A∗

(3.12a) (3.12b)

with initial data $\big(0, A_0, p_0, \frac{p_1}{\sqrt{a(p_0)}}\big)$. From the expressions for Q and g one can see that

$$|Q_1| \lesssim |\partial q||\partial p| + |q||\partial p|^2 + |\partial d||\partial q| + |\partial d||\partial p||q| + |d||\partial p|^2 + |d_y||p_y| + |d||\partial d||\partial p| + \frac{d^2}{\varepsilon^2}(|d| + |q|), \tag{3.13a}$$
$$|Q_2| \lesssim |d||\partial p|^2, \tag{3.13b}$$
$$|g| \lesssim |\partial p|^2, \tag{3.13c}$$

where, as before, $\partial p = (p_\tau, p_y)$ and $\partial d = (d_\tau, d_y)$. Since we plan to show that q and d are O(ε) in $L^\infty_\tau H^1_y$ and p is bounded in $L^\infty_\tau H^2_y$, then $Q_1$ and $Q_2$ are O(ε), and g is bounded.

$H^2$ Bounds. Note that throughout the proof all norms in the τ variable are meant to be on a small fixed interval in τ, all the constants that depend on $|\frac{q}{\varepsilon}|_{H^1_{\tau y}}$ are denoted by M, and constants that depend on $p_*$, which include the constants in $\lesssim$, are denoted by C, often included in $\lesssim$. To obtain $H^2$ bounds we first recall the energy estimates in the (τ, y) variables

$$|\partial p|_{L^\infty_\tau L^2_y} + |\partial d|_{L^\infty_\tau L^2_y} + \Big|\frac{d}{\varepsilon}\Big|_{L^\infty_\tau L^2_y} \lesssim E_0,$$

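where we used the fact that $f_1 \ge c_0 > 0$. Differentiating system (3.11) with respect to $y$, multiplying by $p_{y\tau}$ and $d_{y\tau}$ respectively, and integrating with respect to $y$ we obtain

$$\frac{d}{d\tau} H \overset{\text{def}}{=} \frac{d}{d\tau}\int \Big(|\partial_\tau p_y|^2 + f_1|\partial_y p_y|^2 + |\partial_\tau d_y|^2 + f_1|\partial_y d_y|^2 + \frac{(1+m)d_y^2}{\varepsilon^2}\Big)\,dy \lesssim 1 + H^2 + \frac{|q_y|^2}{\varepsilon^2}.$$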

Though one cannot apply Gronwall's inequality directly, for small $\tau$, bounds on $H$ in terms of $|q/\varepsilon|_{H^1_{\tau y}}$ can be obtained by standard ODE arguments. Moreover, from Eq. (3.11a) we can bound $|p_{\tau\tau}|_{L^\infty_\tau L^2_y}$, thus

$$E_2 = |\partial^2 p|_{L^\infty_\tau L^2_y} + |\partial d_y|_{L^\infty_\tau L^2_y} + \Big|\frac{d_y}{\varepsilon}\Big|_{L^\infty_\tau L^2_y} \lesssim M. \tag{3.14}$$

Here we again used the fact $f_1 \ge c_0 > 0$. Equation (3.14) states that the $H^2$ norm of the solution can be controlled by the $H^1$ norm of $q/\varepsilon$.


Weak Limits of $d$. As we expect $d$ and $d_y$ to be $O(\varepsilon)$ because the rapid oscillations are in the $\tau$ direction, the main ideas in computing these weak limits are the following: 1) the energy of $d$ splits into roughly two equal quantities $d_\tau^2$ and $d^2/\varepsilon^2$, and 2) from the energy splitting, the weak limit of $d^2/\varepsilon^2$ would be $A_*$ provided some higher power of $d \to 0$ fast, namely $d\,d_\tau^2$ and $d^3/\varepsilon^2$ are $O(\varepsilon^2)$ in a weak sense. If one can justify the above statements then $p \to p_*$.

To obtain weak limits of $d_\tau^2$, $d^2/\varepsilon^2$ and so on we multiply Eq. (3.11b) by $d$ to obtain

$$-d_\tau^2 + f_1 d_y^2 + \frac{(1+m)d^2}{\varepsilon^2} = -\partial_\tau(d_\tau d) + \partial_y(f_1 d_y d) + O\big\{d\,(|\partial d| + |\partial p|^2)\big\}, \tag{3.15}$$

where the term $O\{\cdots\}$ is of order $\varepsilon M$ in $L^\infty_\tau L^2_y$. Equation (3.15) implies the equal partition of energy in the sense that $d_\tau^2 - d^2/\varepsilon^2$ is bounded by $\varepsilon M$ in a weak sense. Multiplying Eq. (3.11b) by $d_\tau$, we obtain

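$$\frac12\,\partial_\tau\Big(d_\tau^2 + f_1 d_y^2 + \frac{(1+m)d^2}{\varepsilon^2}\Big) + f_2 d_\tau^2 = \partial_y(f_1 d_y d_\tau) + \partial_\tau(g d) + O\Big\{|d_y||d_\tau| + |d_y|^2 + d|\partial p|\big(|\partial p|^2 + |\partial^2 p|\big) + \frac{d^2}{\varepsilon^2}|\partial q| + |d||\partial d||\partial p|^2\Big\}, \tag{3.16}$$

where all the terms in $O\{\cdots\}$ except the last one are from integration by parts and the last term is the bound for $Q_2 d_\tau$. Moreover, the term $O\{\cdots\}$ is bounded by $\varepsilon M$ in $L^2_\tau L^2_y$. By considering $\frac12 f_2\,(3.15) + (3.16)$ we obtain an equation for $E = \frac12\big(d_\tau^2 + f_1 d_y^2 + \frac{(1+m)d^2}{\varepsilon^2}\big)$,

$$\partial_\tau E + f_2 E = \partial_\tau\Big(g d - \frac{f_2}{2}\, d\, d_\tau\Big) + \partial_y\Big(f_1 d_y d_\tau + \frac{f_1 f_2}{2}\, d\, d_y\Big) + R_1 \overset{\text{def}}{=} \partial_\tau I + \partial_y\Big(f_1 d_y d_\tau + \frac{f_1 f_2}{2}\, d\, d_y\Big) + R_1, \tag{3.17}$$

which we can solve by introducing the integrating factor $e^{\mu}$, where $\mu = \int_0^\tau f_2\, d\tau$:

$$E(\tau) = e^{-\mu}E(0) + I(\tau) - e^{-\mu}I(0) + \partial_y\, e^{-\mu}\int_0^\tau e^{\mu}\Big(f_1 d_y d_\tau + \frac{f_1 f_2}{2}\, d\, d_y\Big)\, d\tau + \tilde R_1.$$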

Since + |R˜ 1 |L∞ |I |L∞ 2 εM, τy τ Ly = 1 (dτ2 + d2 ) from the above equation to obtain then using (3.14) we can express E 2 ε τ ) = e−µ E(0) + ∂y e−µ E(τ eµ f1 dy dτ + f12f2 ddy dτ + R˜ 2 , 2

0

and solving for A∗ from Eq. (3.12b) where |R˜ 2 |L∞ 2 εM. Using this expression for E τ Ly we obtain τ def E = E − A∗ = e−µ E(0) + ∂y e−µ eµ f1 dy dτ + f12f2 ddy + R˜ 2 . 0


Therefore τ Eϕdy dτ εM |ϕ|L1τ L2y + |ϕy |L1τ L2y , 0τ Eϕτ dy dτ εM τ |ϕy (τ )|L2y + |ϕτ |L1τ L2y + |ϕy |L1τ L2y ,

(3.18) (3.19)

0

where the term $\tau\,|\phi_y(\tau)|_{L^2_y}$ is from integrating by parts twice on the second term in the expression of $E$; first with respect to $y$ and then with respect to $\tau$. Thus, (3.15), (3.18), and (3.19) imply that $d^2/\varepsilon^2 - A_*$ is bounded by $\varepsilon M$ in a weak sense.

Energy inequality for $q$. To estimate $q = p - p_*$ we subtract Eq. (3.12a) from (3.11a) to obtain

a∗ 1 d2 q + 2dτ ∗ p∗τ + − A = Q1 . (3.20) ∗ 2 2 ε a∗ From (3.13a) we have |Q1 |L2τ L2y εM and the energy identity (3.20) implies 2 a 1 d d 2 2 |qτ | + f1 |qy | dy + 2 dτ ∗ p∗τ qτ dy + − A∗ ∗ qτ dy 2 dτ 2 ε a∗ 2 = Q1 qτ + f1τ |qy | dy, (3.21) where

τ

0

2 Q1 qτ + f1τ |qy | dy dτ ε2 M. 2

a

Thus to estimate q we need to estimate the average of ( dε2 − A∗ ) a∗∗ qτ and dτ ∗ p∗τ qτ . 2

Though we have concluded that $d^2/\varepsilon^2 - A_*$ is bounded by $\varepsilon M$ in a weak sense, there are extra difficulties since the multiplier contains the term $q_\tau$, which depends on the solution itself.
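The notion of "equal partition of energy in a weak sense" used above can be illustrated on a toy model. The sketch below is not the system (3.11); it assumes the simplest oscillator $d'' + d/\varepsilon^2 = 0$ with the explicit solution $d(\tau) = \varepsilon\cos(\tau/\varepsilon)$ (both the model and the function name `weak_average` are illustrative assumptions). Pointwise, $d_\tau^2$ and $d^2/\varepsilon^2$ are both of size $O(1)$, yet the integral of their difference over a fixed interval is $O(\varepsilon)$.

```python
import math

def weak_average(eps, T=1.0, n=100_000):
    """Midpoint-rule value of int_0^T (d_tau^2 - d^2/eps^2) dtau for the
    toy solution d(tau) = eps*cos(tau/eps) of d'' + d/eps^2 = 0.
    (Toy model only; this is an assumption, not the penalized system.)"""
    h = T / n
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) * h
        d = eps * math.cos(tau / eps)      # displacement, O(eps)
        d_tau = -math.sin(tau / eps)       # velocity, O(1)
        total += (d_tau**2 - d**2 / eps**2) * h
    return total

if __name__ == "__main__":
    for eps in (0.1, 0.01):
        # Exact value of the integral is -(eps/2)*sin(2T/eps), hence O(eps).
        print(eps, weak_average(eps), -0.5 * eps * math.sin(2.0 / eps))
```

In this toy case the difference $d_\tau^2 - d^2/\varepsilon^2 = -\cos(2\tau/\varepsilon)$ does not converge pointwise, which is why only weak (averaged) statements are available.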

2

Estimate on ( dε2 − A∗ ) a∗∗ qτ . From (3.18) and (3.19) with ϕ =

τ

0

a∗ a∗ q

we have

d2 a dτ2 ∂q + 2 − A∗ ∗ qτ ε 2 M(1 + τ | |L∞ 2) 2 2ε a∗ ε τ Ly ∂q ε2 M(1 + | |L∞ 2 ), ε τ Ly

for bounded τ . From Eq. (3.15) we have 0

τ

− dτ2 +

τ a∗ d 2 a∗ dd q dy dτ = qτ τ + R 2 , τ τ ε 2 a∗ a∗ 0

(3.22)


where |R2 | ε 2 M(1 + | ∂q 2 ). Using Eq. (3.20) to substitute for qτ τ in the above ε |L∞ τ Ly equation, we obtain after integrating by parts on the f1 qyy term,

τ

τ a d 2 a∗ ddτ2 ∗ ∗ p∗τ + R3 , + 2 qτ dy dτ = −2 ε a∗ a∗ 0

− dτ2

0

(3.23)

a

2 ∗ p , multiply Eq. (3.15) by where |R3 | ε2 M(1 + | ∂q 2 ). To estimate ddτ ε |L∞ a∗ ∗ ∗τ τ Ly a

d a∗∗ ∗ p∗τ to obtain

τ

− 2dτ2 d

0

d 3 a∗ + 2 ∗ p∗τ dy dτ ε2 M. ε a∗

(3.24)

a

From Eq. (3.18) with ϕ = d a∗∗ ∗ p∗τ we have

τ

0

dτ2 a∗ 1 d3 ε2 M. − A d p d+ ∗ ∗ ∗τ 2 2 ε2 a∗

(3.25)

Now since

0

τ

τ a∗ a∗ 2 A∗ d ∗ p∗τ = ε A∗ ( d + . . . ) ∗ p∗τ ε2 M, a∗ a∗ 0

Eqs. (3.24) and (3.25) imply

τ

τ 2 a∗ dτ d ∗ p∗τ + a∗ 0

0

d 3 a∗ ε2 M. p ∗ ∗τ ε 2 a∗

Therefore Eqs. (3.23) and (3.22) imply

τ

0

a∗ d2 ∂q − A qτ ε2 M(1 + | |L∞ 2 ). ∗ 2 ε a∗ ε τ Ly

(3.26)

Estimate on dτ (p∗ )p∗τ qτ . This term can be estimated in a similar fashion. Integrate by parts on τ ,

τ

dτ ∗ p∗τ qτ = −

0

d ∗ p∗τ qτ τ + R4 ,

where R4 ε2 M(1 + | ∂q 2 ). From Eq. (3.20) we substitute for qτ τ to obtain ε |L∞ τ Ly

0

τ

∂q dτ ∗ p∗τ qτ ε2 M(1 + | |L∞ 2 ). ε τ Ly

(3.27)


Estimates on $q$ in $H^1$. Let $z(\tau) = |\partial q/\varepsilon|_{L^\infty_\tau([0,\tau])L^2_y}$. Substituting Eqs. (3.26) and (3.27) into Eq. (3.21), we obtain

$$z(\tau)^2 \le C + M\big(\tau^{1/2} z(\tau)\big)\big(1 + z(\tau)\big), \tag{3.28}$$

where we recall that $M(s)$ can be taken to be a positive smooth increasing function depending only on $(\theta_*, A_*, p_*)$. Let

$$z_0 = \frac12\Big[1 + M(0) + \sqrt{(1 + M(0))^2 + 4(1 + C + M(0))}\Big]$$

and

$$\tau_0 = \inf\{\tau \mid M(\tau^{1/2} z_0) \ge 1 + M(0)\} > 0.$$

Note that $z(0) < z_0$ from (3.28) and the definition of $z_0$. We claim

$$\Big|\frac{\partial q}{\varepsilon}\Big|_{L^\infty_\tau([0,\tau])L^2_y} = z(\tau) \le z_0 \qquad \forall\,\tau \in [0, \tau_0]. \tag{3.29}$$

Otherwise, we have

$$\tau_1 \overset{\text{def}}{=} \inf\{\tau \mid z(\tau) > z_0\} < \tau_0.$$

Since $z(\tau)$ is continuous for each fixed $\varepsilon > 0$, we have $\tau_1 \in (0, \tau_0)$ and $z(\tau_1) = z_0$. However, this contradicts (3.28): since $M(\tau_1^{1/2} z_0) < 1 + M(0)$, inequality (3.28) would give $z_0^2 < C + (1 + M(0))(1 + z_0) = z_0^2$. Thus (3.29) holds.

Convergence of $p \to p_*$. From the above estimate we have $p \to p_*$ locally in $L^\infty_\tau L^2_y$. To show that the convergence actually occurs on $[0, T]$, the interval of existence of the solution $p_*$, we use finite propagation speed and the fact that $\tau_0$ depends only on the solution $(\theta_*, A_*, p_*)$.
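The continuity argument for (3.29) rests on a small piece of algebra: $z_0$ is the larger root of $z^2 = (1 + M(0))z + (1 + C + M(0))$, so the right-hand side of (3.28), evaluated at $z = z_0$ with any value of $M$ strictly below $1 + M(0)$, is strictly smaller than $z_0^2$. The following sketch checks this numerically; the sample values of $M(0)$ and $C$ and the helper names `z0` and `rhs_328` are illustrative assumptions, and the square root in $z_0$ is taken as in the quadratic formula.

```python
import math

def z0(M0, C):
    """Larger root of z^2 = (1 + M0)*z + (1 + C + M0)."""
    return 0.5 * ((1 + M0) + math.sqrt((1 + M0) ** 2 + 4 * (1 + C + M0)))

def rhs_328(z, C, M_value):
    """Right-hand side of (3.28) with M(tau^{1/2} z) frozen at M_value."""
    return C + M_value * (1 + z)

if __name__ == "__main__":
    M0, C = 0.5, 1.0               # hypothetical sample constants
    z = z0(M0, C)
    # Equality at the threshold value M = 1 + M0 ...
    print(z * z, rhs_328(z, C, 1 + M0))
    # ... and strict inequality below it, which excludes z(tau_1) = z0.
    print(rhs_328(z, C, M0 + 0.9) < z * z)
```

The same computation with $M$ frozen at $M(0)$ shows why $z(0) < z_0$.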

Acknowledgement. We would like to thank R. E. L. DeVille and R. Kohn for pointing out reference [11] to us.

References

1. Bornemann, F.: Homogenization in time of singularly perturbed mechanical systems. Lecture Notes in Mathematics 1687, Berlin: Springer-Verlag, 1998, 156 pp
2. Bornemann, F., Schütte, C.: Homogenization of Hamiltonian systems with a strong constraining potential. Phys. D 102, 57–77 (1997)
3. Chechkin, F.A.: Ph. D. thesis, Courant Institute, New York University, 1999
4. da Costa, R.C.T.: Quantum mechanics of a constrained particle. Phys. Rev. A 23, 1982–1987 (1981)
5. da Costa, R.C.T.: Constraints in quantum mechanics. Phys. Rev. A 25, 2893–2900 (1982)
6. Ebin, D.G.: The motion of slightly compressible fluids viewed as a motion with strong constraining force. Ann. Math. 105(2), 141–200 (1977)
7. Ebin, D.G.: Motion of slightly compressible fluids in a bounded domain. I. Comm. Pure Appl. Math. 35, 451–485 (1982)
8. Freire, A.: Solutions of the wave map system to compact homogeneous spaces. Manuscripta 91, 525–533 (1996)
9. Froese, R., Herbst, I.: Realizing holonomic constraints in classical and quantum mechanics. AMS/IP Studies in Advanced Mathematics 16, Providence, RI: AMS, 2000, pp. 121–131
10. Froese, R., Herbst, I.: Realizing holonomic constraints in classical and quantum mechanics. Submitted


11. Keller, J.B., Rubinstein, J.: Nonlinear wave motion in a strong potential. Wave Motion 13(3), 291–302 (1991)
12. Klainerman, S., Rodnianski, I.: On the global regularity of wave maps in the critical Sobolev norm. Internat. Math. Res. Notices (13), 655–677 (2001)
13. Rubin, H., Ungar, P.: Motion under a strong constraining force. Comm. Pure Appl. Math. 10, 65–87 (1957)
14. Shatah, J.: Weak solutions and development of singularities of the SU(2) σ-model. Comm. Pure Appl. Math. 41, 459–469 (1988)
15. Shatah, J., Struwe, M.: Geometric wave equations. Courant Lecture Notes in Mathematics, 2, New York: New York University, Courant Institute of Mathematical Sciences, 1998, viii+153 pp.
16. Shatah, J., Struwe, M.: The Cauchy problem for wave maps. Internat. Math. Res. Notices 11, 555–571 (2002)
17. Schwartz, J.T.: Nonlinear functional analysis. Notes on Mathematics and Its Applications. New York-London-Paris: Gordon and Breach, 1969
18. Shatah, J., Zeng, C.: Wave equations with strong $H^1$ penalties. Preprint, 2002
19. Takens, F.: Motion under the influence of a strong constraining force. In: Global theory of dynamical systems. Springer Lecture Notes in Mathematics, 819, Berlin-Heidelberg-New York: Springer, 1980, pp. 425–445
20. Tao, T.: Global regularity of wave maps. I. Small critical Sobolev norm in high dimension. Internat. Math. Res. Notices (6), 299–328 (2001)
21. Tao, T.: Global regularity of wave maps. II. Small energy in two dimensions. Commun. Math. Phys. (to appear)
22. Tataru, D.: Local and global results for wave maps. Comm. Partial Differ. Eqs. 23, 1781–1793 (1998)
23. Tolar, J.: On a quantum mechanical d'Alembert principle. In: Group theoretical methods in physics (Varna, 1987), Lecture Notes in Phys. 313, Berlin: Springer, 1988, pp. 268–274

Communicated by P. Constantin

Commun. Math. Phys. 239, 405–447 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0860-2

Communications in

Mathematical Physics

The Elliptic Algebra $U_{q,p}(\widehat{sl}_N)$ and the Drinfeld Realization of the Elliptic Quantum Group $B_{q,\lambda}(\widehat{sl}_N)$

Takeo Kojima (1,3), Hitoshi Konno (2,3)

1 Department of Mathematics, College of Science and Technology, Nihon University, Chiyoda-ku, Tokyo 101-0062, Japan. E-mail: [email protected]
2 Department of Mathematics, Faculty of Integrated Arts and Sciences, Hiroshima University, Higashi-Hiroshima 739-8521, Japan. E-mail: [email protected]
3 Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, UK

Received: 15 October 2002 / Accepted: 10 February 2003 Published online: 2 July 2003 – © Springer-Verlag 2003

Abstract: By using the elliptic analogue of the Drinfeld currents in the elliptic algebra $U_{q,p}(\widehat{sl}_N)$, we construct an L-operator which satisfies the RLL-relations characterizing the face type elliptic quantum group $B_{q,\lambda}(\widehat{sl}_N)$. For this purpose, we introduce a set of new currents $K_j(v)$ $(1 \le j \le N)$ in $U_{q,p}(\widehat{sl}_N)$. As in the $N = 2$ case, we find a structure of $U_{q,p}(\widehat{sl}_N)$ as a certain tensor product of $B_{q,\lambda}(\widehat{sl}_N)$ and a Heisenberg algebra. In the level-one representation, we give a free field realization of the currents in $U_{q,p}(\widehat{sl}_N)$. Using the coalgebra structure of $B_{q,\lambda}(\widehat{sl}_N)$ and the above tensor structure, we derive a free field realization of the $U_{q,p}(\widehat{sl}_N)$-analogue of $B_{q,\lambda}(\widehat{sl}_N)$-intertwining operators. The resultant operators coincide with those of the vertex operators in the $A^{(1)}_{N-1}$-type face model.

1. Introduction

In recent papers [1–5], the notion of elliptic quantum groups has been proposed. There are two types of elliptic quantum groups, the vertex type $A_{q,p}(\widehat{sl}_N)$ and the face type $B_{q,\lambda}(g)$, where $g$ is a Kac-Moody algebra associated with a symmetrizable generalized Cartan matrix. The elliptic quantum groups have the structure of quasi-triangular quasi-Hopf algebras introduced by Drinfeld [6]. Since certain finite dimensional representations of the universal R-matrices of these elliptic quantum groups yield known elliptic Boltzmann weights including, for example, those of the eight vertex model [7] and the Andrews-Baxter-Forrester (ABF) face model [8], we expect that we can perform an algebraic analysis of both types of elliptic lattice models based on the corresponding elliptic quantum groups. Here, algebraic analysis means, in a restricted sense, a method of studying two dimensional solvable lattice models based on the representation theory of infinite dimensional quantum groups [9]. It can be regarded as an off-critical extension of conformal field theory, where the representation theory of the Virasoro algebras and/or affine Lie algebras


plays an essential role. In fact, quite a lot of, but not all, solvable lattice models allow us, in the thermodynamic limit, to identify the space of states of the models with the infinite dimensional modules of certain quantum groups. Then two types of intertwining operators, type I and type II, of such modules become important. The type I intertwiner provides a realization of local operators, such as spin operators for example, on the infinite dimensional modules of quantum groups. And type II plays the role of creation operator of physical excitations. Due to the coalgebra structure of quantum groups, these intertwiners can be determined uniquely. Realizing these ingredients in certain forms, such as the free field realization for example, one can perform a calculation of correlation functions as well as form factors of the models. Through experience of the analysis of trigonometric models, such as the six vertex model, or equivalently the XXZ spin chain model (see the references in [9]), we know that a formulation of quantum groups in terms of the Drinfeld currents[10] provides a convenient framework. This is because one can construct a free field realization of type I and II intertwining operators starting from a free field realization of the Drinfeld currents. In addition, the Drinfeld currents have a formal, but deep, resemblance to the currents in affine Kac-Moody algebras so that we can easily compare the results with those in conformal field theory. Hence to perform an algebraic analysis of the elliptic lattice models, it is an important step to find a new realization of both elliptic quantum groups Aq,p ( slN ) and Bq,λ (g), g being an affine Lie algebra, in terms of the Drinfeld currents. In [11], one of the authors has introduced an elliptic analogue of the Drinfeld currents of Uq ( sl2 ) independently from the formulation of the elliptic quantum groups. The algebra of the currents is called the elliptic algebra Uq,p ( sl2 ). 
Here the parameter $p$ is the elliptic nome of the elliptic functions. Later, in [12], it was shown that $U_{q,p}(\widehat{sl}_2)$ can be regarded essentially as the Drinfeld currents, which give a new realization of the face type elliptic algebra $B_{q,\lambda}(\widehat{sl}_2)$. According to this result, type I and type II vertex operators of $U_{q,p}(\widehat{sl}_2)$, the analogues of the intertwining operators of $B_{q,\lambda}(\widehat{sl}_2)$, have been realized by the free bosonic fields. The resultant expressions coincide with those of the vertex operators of the ABF model obtained by Lukyanov and Pugai [13]. Hence a representation theoretical foundation for Lukyanov and Pugai's free field approach to the ABF model has been established.

The purpose of this paper is to extend this result to the higher rank case. We investigate a higher rank elliptic algebra $U_{q,p}(\widehat{sl}_N)$, and show that $U_{q,p}(\widehat{sl}_N)$ provides a new realization of the face type elliptic algebra $B_{q,\lambda}(\widehat{sl}_N)$ in terms of the elliptic Drinfeld currents. Our strategy is parallel to the one in [12]. We first give a definition of $U_{q,p}(\widehat{sl}_N)$, introducing the new currents $K_j(v)$ $(1 \le j \le N)$ (Sect. 3). This gives a completion of the definition of $U_{q,p}(\widehat{sl}_N)$ given in Appendix A of [12]. As an example, a realization of $U_{q,p}(\widehat{sl}_N)$ as a certain tensor product of the algebra $U_q(\widehat{sl}_N)$ and a Heisenberg algebra $\mathbb{C}\{\hat{H}\}$ is given. Then we define the "half currents" of the generating functions (total currents) $E_j(v)$, $F_j(v)$, $K_j(v)$ of the algebra $U_{q,p}(\widehat{sl}_N)$ (Sect. 4). The half currents allow us to construct an L-operator as a Gauss decomposed form of an operator valued matrix (5.1) [14, 4, 15]. We then argue that the L-operator thus obtained satisfies the RLL-relation which characterizes the algebra $B_{q,\lambda}(\widehat{sl}_N)$, when the generators of the mentioned Heisenberg algebra are reduced to a set of parameters (dynamical parameters) by properly removing half of the conjugate variables (Sect. 5). Hence, one can regard the algebra $U_{q,p}(\widehat{sl}_N)$ as a tensor product of the algebra $B_{q,\lambda}(\widehat{sl}_N)$ and the Heisenberg algebra $\mathbb{C}\{\hat{H}\}$.

Analysis of Elliptic Lattice Models Based on Elliptic Quantum Groups


The L-operator and the coalgebra structure of $B_{q,\lambda}(\widehat{sl}_N)$ allow us to construct a free field realization of the vertex operators of $U_{q,p}(\widehat{sl}_N)$, which are extensions of type I and II intertwining operators of $B_{q,\lambda}(\widehat{sl}_N)$ by adding elements of the Heisenberg algebra, acting on the $U_{q,p}(\widehat{sl}_N)$-modules. In the level-one representation, we derive such a realization starting from a free field realization of the total currents of $U_{q,p}(\widehat{sl}_N)$. The resultant expressions coincide with those of type I and II vertex operators obtained in [16] and [17]. We also show that they satisfy the required commutation relations. We thus give a representation theoretical meaning to the vertex operators of the $A^{(1)}_{N-1}$ type face model [18]. Conversely, as a composition of type I and type II intertwiners, one can construct an L-operator which satisfies the RLL-relations [19, 20]. As a check of our free field realization, we investigate a connection between the two L-operators, the one constructed by a composition of the vertex operators and the other by the half currents, in the level-one representation. We then give a proof of our argument in Sect. 5 at $c = 1$.

The article is organized as follows. In the next section, we review some basic facts on the face type elliptic quantum group $B_{q,\lambda}(\widehat{sl}_N)$. In Sect. 3, we present a definition of the elliptic algebra $U_{q,p}(\widehat{sl}_N)$. New currents $K_j(u)$ $(1 \le j \le N)$ are introduced there. A realization of $U_{q,p}(\widehat{sl}_N)$ using the Drinfeld currents of $U_q(\widehat{sl}_N)$ and a Heisenberg algebra is also given. In Sect. 4, we introduce a set of half currents defined from $U_{q,p}(\widehat{sl}_N)$ and derive their commutation relations. In Sect. 5, constructing an L-operator in terms of the half currents, we show that it satisfies the required RLL-relation for $B_{q,\lambda}(\widehat{sl}_N)$. According to this result, in Sect. 6, we discuss a free field realization of the two types of vertex operators of the level-one $U_{q,p}(\widehat{sl}_N)$-modules. In addition, we have four appendices.
Appendix A is devoted to a list of operator product expansions used in the text. In Appendix B, we give a proof of some formulae of commutation relations of the half currents. In Appendix C, we give a derivation of some formulae contained in the RLL-relation. Finally, in Appendix D, we give a summary of the $N$ dimensional evaluation representation of $U_{q,p}(\widehat{sl}_N)$.

2. The Elliptic Quantum Group $B_{q,\lambda}(\widehat{sl}_N)$

In this section, we give a review of the face type elliptic quantum group $B_{q,\lambda}(\widehat{sl}_N)$ based on the results in [5].

2.1. Notations. Throughout this article, we fix a complex number $q \neq 0$, $|q| < 1$. We often use the parameters

$$p = q^{2r} = e^{-\frac{2\pi i}{\tau}}, \qquad p^* = p\,q^{-2c} = q^{2r^*} = e^{-\frac{2\pi i}{\tau^*}} \qquad (r^* = r - c;\ r, r^* \in \mathbb{R}_{>0},\ r\tau = r^*\tau^*).$$

The following notation is standard:

$$\Theta_p(z) = (z; p)_\infty\,(p z^{-1}; p)_\infty\,(p; p)_\infty, \qquad (z; t_1, \cdots, t_k)_\infty = \prod_{n_1, \cdots, n_k \ge 0} (1 - z\, t_1^{n_1} \cdots t_k^{n_k}).$$
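As a numerical illustration of this notation, the sketch below evaluates truncated infinite products for $\Theta_p(z)$ and the theta function $[v] = q^{v^2/r - v}\,\Theta_p(q^{2v})/(p;p)_\infty^3$ that is used in the remainder of this section, and checks the oddness $[-v] = -[v]$ and the quasi-periodicity $[v + r] = -[v]$. The function names, the truncation length and the real sample values of $q$, $r$, $v$ are assumptions made for the illustration only.

```python
def qpoch(z, p, terms=60):
    """Truncated (z; p)_infinity = prod_{n>=0} (1 - z p^n); adequate when
    |p| is well below 1 (increase `terms` for p close to 1)."""
    prod = 1.0
    for n in range(terms):
        prod *= 1.0 - z * p**n
    return prod

def theta(z, p):
    """Theta_p(z) = (z;p)_inf (p/z;p)_inf (p;p)_inf."""
    return qpoch(z, p) * qpoch(p / z, p) * qpoch(p, p)

def bracket(v, q, r):
    """[v] = q^{v^2/r - v} Theta_p(q^{2v}) / (p;p)_inf^3 with p = q^{2r};
    real 0 < q < 1 is assumed here so that all powers are real."""
    p = q ** (2 * r)
    return q ** (v * v / r - v) * theta(q ** (2 * v), p) / qpoch(p, p) ** 3

if __name__ == "__main__":
    q, r, v = 0.5, 3.0, 0.7          # hypothetical sample values
    print(bracket(v, q, r), bracket(-v, q, r), bracket(v + r, q, r))
```

The second quasi-period, $v \to v + r\tau$, involves complex powers of $q$ and a choice of branch, so it is left out of this real-valued sketch.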


We also use the Jacobi theta functions

$$[v] = q^{\frac{v^2}{r} - v}\,\frac{\Theta_p(q^{2v})}{(p; p)_\infty^3}, \qquad [v]^* = q^{\frac{v^2}{r^*} - v}\,\frac{\Theta_{p^*}(q^{2v})}{(p^*; p^*)_\infty^3},$$

which satisfy $[-v] = -[v]$ and the quasi-periodicity property

$$[v + r] = -[v], \qquad [v + r\tau] = -e^{-\pi i \tau - \frac{2\pi i v}{r}}\,[v]. \tag{2.1}$$

We take the normalization of the theta function to be

$$\oint_{C_0} \frac{dz}{2\pi i z}\,\frac{1}{[-v]} = 1, \tag{2.2}$$
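where $C_0$ is a simple closed curve in the $v$-plane encircling $v = 0$ anticlockwise. The same holds for $[v]^*$, with $r$ replaced by $r^*$, except for the normalization

$$\oint_{C_0} \frac{dz}{2\pi i z}\,\frac{1}{[-v]^*} = \left.\frac{[v]}{[v]^*}\right|_{v \to 0}.$$

2.2. Definition of the elliptic quantum group $B_{q,\lambda}(\widehat{sl}_N)$. Let $U_q = U_q(\widehat{sl}_N)$ be the standard affine quantum group. Namely, $U_q(\widehat{sl}_N)$ is a quasi-triangular Hopf algebra equipped with the standard coproduct $\Delta$, counit $\varepsilon$, antipode $S$ and universal R matrix $\mathcal{R}$. Our conventions on the coalgebra structure follow [5]. Let $h$ and $\bar{h}$ be the Cartan subalgebras of $\widehat{sl}_N$ and $sl_N$, respectively. We denote a basis and its dual basis of $h$ by $\{\hat{h}_l\}$ and $\{\hat{h}^l\}$, respectively. More explicitly, they are given by $\{\hat{h}_l\} = \{d, c, h_j\}$ and $\{\hat{h}^l\} = \{c, d, h^j\}$ $(1 \le j \le N - 1)$, where $c$ and $d$ are a central element and a derivation operator of $\widehat{sl}_N$, respectively, and $\{h_j\}$ and $\{h^j\}$ are a basis and a dual basis of $\bar{h}$.

The face type elliptic quantum group $B_{q,\lambda}(\widehat{sl}_N)$ is a quasi-Hopf deformation of $U_q(\widehat{sl}_N)$ by the face type twistor $F(\lambda)$ $(\lambda \in h)$. The twistor $F(\lambda)$ is an invertible element in $U_q \otimes U_q$ satisfying

$$(\mathrm{id} \otimes \varepsilon)F(\lambda) = 1 = F(\lambda)(\varepsilon \otimes \mathrm{id}), \tag{2.3}$$

$$F^{(12)}(\lambda)\,(\Delta \otimes \mathrm{id})F(\lambda) = F^{(23)}(\lambda + h^{(1)})\,(\mathrm{id} \otimes \Delta)F(\lambda), \tag{2.4}$$

where $\lambda = \sum_l \lambda_l \hat{h}^l$ $(\lambda_l \in \mathbb{C})$, $\lambda + h^{(1)} = \sum_l (\lambda_l + \hat{h}_l^{(1)})\hat{h}^l$ and $\hat{h}_l^{(1)} = \hat{h}_l \otimes 1 \otimes 1$. An explicit construction of the twistor $F(\lambda)$ is given in [5].

A quasi-Hopf deformation means that as an associative algebra, $B_{q,\lambda}(\widehat{sl}_N)$ is isomorphic to $U_q(\widehat{sl}_N)$, but the coalgebra structure is deformed. Namely, the coproduct $\Delta$ is changed to the new one given by

$$\Delta_\lambda(x) = F(\lambda)\,\Delta(x)\,F(\lambda)^{-1} \qquad \forall x \in U_q(\widehat{sl}_N). \tag{2.5}$$

$\Delta_\lambda$ satisfies a weaker coassociativity

$$(\mathrm{id} \otimes \Delta_\lambda)\Delta_\lambda(x) = \Phi(\lambda)\,(\Delta_\lambda \otimes \mathrm{id})\Delta_\lambda(x)\,\Phi(\lambda)^{-1} \qquad \forall x \in U_q(\widehat{sl}_N), \tag{2.6}$$

$$\Phi(\lambda) = F^{(23)}(\lambda)\,F^{(23)}(\lambda + h^{(1)})^{-1}. \tag{2.7}$$

The universal R-matrix is also deformed to

$$\mathcal{R}(\lambda) = F^{(21)}(\lambda)\,\mathcal{R}\,F^{(12)}(\lambda)^{-1}. \tag{2.8}$$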


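Definition 2.1 (Elliptic quantum group $B_{q,\lambda}(\widehat{sl}_N)$ [5]). The face type elliptic quantum group $B_{q,\lambda}(\widehat{sl}_N)$ is a quasi-triangular quasi-Hopf algebra $(B_{q,\lambda}(\widehat{sl}_N), \Delta_\lambda, \varepsilon, S, \Phi(\lambda), \alpha, \beta, \mathcal{R}(\lambda))$, where $\alpha$, $\beta$ are defined by

$$\alpha = \sum_i S(k_i)\,l_i, \qquad \beta = \sum_i m_i\,S(n_i). \tag{2.9}$$

Here we set

$$\sum_i k_i \otimes l_i = F(\lambda)^{-1}, \qquad \sum_i m_i \otimes n_i = F(\lambda).$$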

A characteristic feature of Bq,λ ( slN ) is that the universal R matrix R(λ) satisfies the dynamical Yang-Baxter equation, R(12) (λ + h(3) )R(13) (λ)R(23) (λ + h(1) ) = R(23) (λ)R(13) (λ + h(2) )R(12) (λ). (2.10) Let (πV ,z , Vz ), Vz = V ⊗ C[z, z−1 ] be a (finite dimensional) evaluation representation of Uq . Taking images of R, we have a R-matrix RV+W (z, λ) and a L-operator L+ V (z, λ) as follows: (2.11) RV+W (z1 /z2 , λ) = πV ,z1 ⊗ πW,z2 q c⊗d+d⊗c R(λ), c⊗d+d⊗c + R(λ). (2.12) LV (z, λ) = πV ,z ⊗ id q Then from (2.10), we have the following dynamical RLL-relation. + (1) RV+W (z1 /z2 , λ + h)L+ V (z1 , λ)LW (z2 , λ + h ) + + + = LW (z2 , λ)LV (z1 , λ + h(2) )RV W (z1 /z2 , λ). (2.13) − (21) (λ)−1 q −c⊗d−d⊗c are slN ), L+ Note that in Bq,λ ( V (z, λ) and LV (z, λ) = πV ,z ⊗ id R not independent operators (Proposition 4.3 in [5]). Hence just one dynamical RLL-relation (2.13) characterizes the algebra Bq,λ ( slN ) completely in the sense of Reshetikhin and Semenov-Tian-Shansky [21]. Hereafter we parametrize the dynamical variable λ as

λ = (r ∗ + N )d + s c +

N−1

(sj + 1)hj

(s ∈ C, r ∗ ≡ r − c).

(2.14)

j =1

Under this, we set F (r ∗ , {sj }) ≡ F (λ) and R(r ∗ , {sj }) ≡ R(λ). Since c is central, no s

j dependence should appear. The dynamical shift λ → λ + h with h = cd + N j =1 hj h , ∗ changes the universal R-matrix R(r , {sj }) to R(r, {sj + hj }) ≡ R(λ + h). Note r ∗ = r − c. Let us now take (πV ,z , Vz ) to be the evaluation representation associated with the vector representation V ∼ = CN of Uq (slN ) (see Appendix D). We set R + (v, s + h) = (πV ,z1 ⊗ πV ,z2 )q c⊗d+d⊗c R(r, {sj + hj }), L+ (v, s) = (πV ,z ⊗ id)q c⊗d+d⊗c R(r ∗ , {sj }), where zi = q 2vi (i = 1, 2), v = v1 − v2 . One can obtain the finite dimensional representation of the twistor F (r, {sj }) by solving the difference equation for (πV ,z1 ⊗ πV ,z2 )F (r, {sj + hj }) (Eq.(2.30) in [5]) derived by using the explicit realization of F (λ), under the parametrization (2.14). Then noting the relation R(r, {sj + hj }) =

410

T. Kojima, H. Konno

F (21) (r, {sj + hj })RF (12) (r, {sj + hj })−1 , we obtain the R-matrix R + (v, s + h), up to a certain gauge transformation, as ¯ s + h), R + (v, s + h) = ρ + (v)R(v, ¯ s + h) = R(v,

N j =1

+

Ejj ⊗ Ejj +

(2.15)

¯ b(v, sj,l + hj,l )Ejj ⊗ Ell + b(v)E ll ⊗ Ejj

1≤j

c(v, sj,l + hj,l )Ej l ⊗ Elj + c(v, ¯ sj,l + hj,l )Elj ⊗ Ej l ,

1≤j
(2.16) where sj,l =

l−1

m=j sj ,

hj,l =

l−1 m=j

hj (1 ≤ j < l ≤ N ) and

$$b(u, s) = \frac{[s+1][s-1][u]}{[s]^2\,[u+1]}, \qquad \bar{b}(u) = \frac{[u]}{[u+1]}, \tag{2.17}$$

$$c(u, s) = \frac{[1][s+u]}{[s][u+1]}, \qquad \bar{c}(u, s) = \frac{[1][s-u]}{[s][u+1]}. \tag{2.18}$$

The function $\rho^+(v)$ is chosen as

$$\rho^+(v) = q^{\frac{N-1}{N}}\, z^{\frac{N-1}{rN}}\, \frac{\{pq^2 z\}\{pq^{2N-2} z\}\{1/z\}\{q^{2N}/z\}}{\{pz\}\{pq^{2N} z\}\{q^2/z\}\{q^{2N-2}/z\}}, \tag{2.19}$$

where

$$\{z\} = (z; p, q^{2N})_\infty. \tag{2.20}$$

Up to a gauge transformation, the R-matrix $R^+(v, s + h)$ is nothing but the Boltzmann weight of the $A^{(1)}_{N-1}$ type face model introduced in [18]. The R-matrix $R^{+*}(v, s) = (\pi_{V,z_1} \otimes \pi_{V,z_2})\,\mathcal{R}(r^*, \{s_j\})$ is obtained from $R^+(v, s)$ by the replacement $r \to r^*$. Hence, under the parametrization (2.14), the dynamical RLL-relation takes the form

$$R^{+(12)}(v, s + h)\,L^{+(1)}(v_1, s)\,L^{+(2)}(v_2, s + h^{(1)}) = L^{+(2)}(v_2, s)\,L^{+(1)}(v_1, s + h^{(2)})\,R^{+*(12)}(v, s). \tag{2.21}$$

2.3. Intertwining operators. Let F, F be highest weight Uq -modules. We denote the type-I and type II intertwining operators of Uq -modules by (z) and ∗ (z), respectively. (z) : F −→ F ⊗ Wz ,

∗ (z) : Wz ⊗ F −→ F .

(2.22)

Twisting these operators by F (r ∗ , s), we obtain the corresponding intertwining operators (v, s) and ∗ (v, s) of Bq,λ -modules, W (v, s) = (id ⊗ πW,z )F (r ∗ , {sj })(z), ∗ W (v, s)

∗

∗

= (z)(πW,z ⊗ id)F (r , {sj })

(2.23) −1

.

(2.24)

From the intertwining relation satisfied by (z) and ∗ (z), one can derive the following dynamical intertwining relation for the new intertwiners [5],


c (3) +(1) W (v2 + , s)LV (v1 , s) 2 c +(13) +(1) (3) = RV W (v, s + h)LV (v1 , s)W (v2 + , s + h(1) ), 2 +(1) ∗(2) LV (v1 , s)W (z2 , s + h(1) ) ∗(2) +(1) +∗(12) = W (z2 , s)LV (v1 , s + h(2) )RV W (v1 − v2 , s).


(2.25)

(2.26)

Note that (2.25) and (2.26) are the relations for the operators $V_{z_1} \otimes F \to V_{z_1} \otimes F' \otimes W_{z_2}$ and $V_{z_1} \otimes W_{z_2} \otimes F \to V_{z_1} \otimes F'$, respectively.

3. The Elliptic Algebra $U_{q,p}(\widehat{sl}_N)$

In this section, we give a definition of the elliptic algebra $U_{q,p}(\widehat{sl}_N)$. To define the algebra, we follow mainly the idea given in Appendix A of [12]. Namely, we first introduce the elliptic currents $e_i(z, p)$, $f_i(z, p)$ and $\psi_i^\pm(z, p)$ of $U_q(\widehat{sl}_N)$ by modifying the Drinfeld currents of $U_q(\widehat{sl}_N)$. Then we extend them to the currents of $U_{q,p}(\widehat{sl}_N)$ by taking a tensor product with a Heisenberg algebra $\mathbb{C}\{\hat{H}\}$ given in Sect. 3.4.1. Our definition is an extended version of the one given in [12], introducing new currents $K_j(v)$ $(1 \le j \le N)$. The currents $\{K_j(v)\}$ play an essential role in the construction of the L-operators (Sect. 5).

3.1. Drinfeld currents of $U_q(\widehat{sl}_N)$. Let us first recall the Drinfeld currents of $U_q(\widehat{sl}_N)$ [10]. We use the standard symbol for the $q$-integer,

$$[n]_q = \frac{q^n - q^{-n}}{q - q^{-1}}. \tag{3.1}$$

We also use the symbol $A = (A_{jk})$ to express the Cartan matrix of $sl_N$.

Definition 3.1 (Drinfeld currents). The algebra $U_q(\widehat{sl}_N)$ is a $\mathbb{C}$-algebra generated by the generators $h_i$, $a_{i,m}$, $x^\pm_{i,n}$ $(i = 1, \cdots, N - 1;\ m \in \mathbb{Z}_{\neq 0},\ n \in \mathbb{Z})$, $c$, $d$. In terms of the generating functions

$$x_i^\pm(z) = \sum_{n \in \mathbb{Z}} x^\pm_{i,n}\, z^{-n}, \tag{3.2}$$

$$\psi_i(q^{\frac{c}{2}} z) = q^{h_i} \exp\Big((q - q^{-1}) \sum_{m > 0} a_{i,m}\, z^{-m}\Big), \tag{3.3}$$

$$\varphi_i(q^{-\frac{c}{2}} z) = q^{-h_i} \exp\Big(-(q - q^{-1}) \sum_{m > 0} a_{i,-m}\, z^{m}\Big) \qquad (i = 1, \cdots, N - 1), \tag{3.4}$$

the defining relations of $U_q(\widehat{sl}_N)$ are given by

$$c : \text{central}, \tag{3.5}$$

$$[h_i, d] = [d, h_i] = [d, a_{i,m}] = [a_{i,m}, d] = 0, \tag{3.6}$$

$$[d, x^\pm_{i,n}] = n\, x^\pm_{i,n}, \qquad [h_i, a_{j,m}] = [a_{j,m}, h_i] = 0, \tag{3.7}$$

$$[h_i, x_j^\pm(z)] = \pm A_{ij}\, x_j^\pm(z), \tag{3.8}$$

$$[a_{i,m}, a_{j,n}] = \frac{[A_{ij} m]_q\, [cm]_q}{m}\, q^{-c|m|}\, \delta_{n+m,0}, \tag{3.9}$$


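$$[a_{i,m}, x_j^+(z)] = \frac{[A_{ij} m]_q}{m}\, q^{-c|m|}\, z^m\, x_j^+(z), \tag{3.10}$$

$$[a_{i,m}, x_j^-(z)] = -\frac{[A_{ij} m]_q}{m}\, z^m\, x_j^-(z), \tag{3.11}$$

$$(z_1 - q^{\pm A_{ij}} z_2)\, x_i^\pm(z_1)\, x_j^\pm(z_2) = (q^{\pm A_{ij}} z_1 - z_2)\, x_j^\pm(z_2)\, x_i^\pm(z_1), \tag{3.12}$$

$$[x_i^+(z_1), x_j^-(z_2)] = \frac{\delta_{i,j}}{q - q^{-1}} \Big(\delta(q^{-c} z_1/z_2)\, \psi_i(q^{\frac{c}{2}} z_2) - \delta(q^{c} z_1/z_2)\, \varphi_i(q^{-\frac{c}{2}} z_2)\Big), \tag{3.13}$$

$$\big(x_i^\pm(z_1) x_i^\pm(z_2) x_j^\pm(z) - [2]_q\, x_i^\pm(z_1) x_j^\pm(z) x_i^\pm(z_2) + x_j^\pm(z) x_i^\pm(z_1) x_i^\pm(z_2)\big) + \big(x_i^\pm(z_2) x_i^\pm(z_1) x_j^\pm(z) - [2]_q\, x_i^\pm(z_2) x_j^\pm(z) x_i^\pm(z_1) + x_j^\pm(z) x_i^\pm(z_2) x_i^\pm(z_1)\big) = 0 \quad \text{for } |i - j| = 1. \tag{3.14}$$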
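The $q$-integers and the Cartan matrix enter the relations above only through scalar coefficients, which are easy to tabulate. The following sketch computes $[n]_q$ from (3.1) and the coefficient of $\delta_{n+m,0}$ in the Heisenberg-type relation (3.9). It assumes that the central element $c$ acts as a number (the level) and that $A$ is the finite-type Cartan matrix of $sl_N$ with entries $2$, $-1$, $0$; the function names are hypothetical.

```python
def q_int(n, q):
    """q-integer [n]_q = (q^n - q^{-n}) / (q - q^{-1}), see (3.1)."""
    return (q**n - q**(-n)) / (q - 1.0 / q)

def cartan_sl(N):
    """Cartan matrix of sl_N: 2 on the diagonal, -1 next to it, 0 elsewhere."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(N - 1)] for i in range(N - 1)]

def heisenberg_coeff(i, j, m, q, c, N):
    """Coefficient of delta_{n+m,0} in [a_{i,m}, a_{j,n}] from (3.9);
    i, j run over 1..N-1, m is a nonzero integer, c is the level (assumed scalar)."""
    A = cartan_sl(N)
    return q_int(A[i - 1][j - 1] * m, q) * q_int(c * m, q) * q ** (-c * abs(m)) / m

if __name__ == "__main__":
    q = 0.6
    print(q_int(2, q), q + 1 / q)                 # [2]_q = q + 1/q
    print(heisenberg_coeff(1, 2, 3, q, 1, 4))     # neighbouring nodes
    print(heisenberg_coeff(1, 3, 3, q, 1, 4))     # distant nodes commute
```

The coefficient changes sign under $m \to -m$, as it must for a commutator, and it is symmetric in $i$ and $j$ because $A$ is symmetric.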

Here $\delta(z)$ denotes the delta function $\delta(z) = \sum_{m \in \mathbb{Z}} z^m$. We call the generators $h_j$, $a_{j,m}$, $x^\pm_{j,n}$, $c$, $d$ the Drinfeld generators of $U_q(\widehat{sl}_N)$ and the generating functions $x_i^\pm(z)$, $\psi_i(z)$ and $\varphi_i(z)$ the Drinfeld currents.

3.2. Elliptic currents of $U_q(\widehat{sl}_N)$. We next introduce an elliptic modification of the currents $x_i^\pm(z)$, $\psi_i(z)$ and $\varphi_i(z)$ according to [12]. Let us define the auxiliary currents $u_i^\pm(z, p)$ by

$$u_i^+(z, p) = \exp\Big(\sum_{m > 0} \frac{1}{[r^* m]_q}\, a_{i,-m}\, (q^r z)^m\Big), \tag{3.15}$$

$$u_i^-(z, p) = \exp\Big(-\sum_{m > 0} \frac{1}{[r m]_q}\, a_{i,m}\, (q^{-r} z)^{-m}\Big). \tag{3.16}$$

Proposition 3.1. The following commutation relations hold:

$$u_i^+(z_1, p)\, x_j^+(z_2) = \frac{(p^* q^{A_{ij}} z_1/z_2; p^*)_\infty}{(p^* q^{-A_{ij}} z_1/z_2; p^*)_\infty}\, x_j^+(z_2)\, u_i^+(z_1, p), \tag{3.17}$$

$$u_i^+(z_1, p)\, x_j^-(z_2) = \frac{(p^* q^{c - A_{ij}} z_1/z_2; p^*)_\infty}{(p^* q^{c + A_{ij}} z_1/z_2; p^*)_\infty}\, x_j^-(z_2)\, u_i^+(z_1, p), \tag{3.18}$$

$$u_i^-(z_1, p)\, x_j^+(z_2) = \frac{(p q^{-c - A_{ij}} z_2/z_1; p)_\infty}{(p q^{-c + A_{ij}} z_2/z_1; p)_\infty}\, x_j^+(z_2)\, u_i^-(z_1, p), \tag{3.19}$$

$$u_i^-(z_1, p)\, x_j^-(z_2) = \frac{(p q^{A_{ij}} z_2/z_1; p)_\infty}{(p q^{-A_{ij}} z_2/z_1; p)_\infty}\, x_j^-(z_2)\, u_i^-(z_1, p), \tag{3.20}$$

$$u_i^+(z_1, p)\, u_j^-(z_2, p) = \frac{(p q^{-c - A_{ij}} z_1/z_2; p)_\infty\, (p^* q^{c + A_{ij}} z_1/z_2; p^*)_\infty}{(p q^{-c + A_{ij}} z_1/z_2; p)_\infty\, (p^* q^{c - A_{ij}} z_1/z_2; p^*)_\infty}\, u_j^-(z_2, p)\, u_i^+(z_1, p). \tag{3.21}$$


Definition 3.2 (Elliptic currents). Let us define "dressed" currents $e_i(z, p)$, $f_i(z, p)$, $\psi_i^\pm(z, p)$ $(i = 1, \cdots, N - 1)$ by

$$e_i(z, p) = u_i^+(z, p)\, x_i^+(z), \tag{3.22}$$

$$f_i(z, p) = x_i^-(z)\, u_i^-(z, p), \tag{3.23}$$

$$\psi_i^+(z, p) = u_i^+(q^{\frac{c}{2}} z, p)\, \psi_i(z)\, u_i^-(q^{-\frac{c}{2}} z, p), \tag{3.24}$$

$$\psi_i^-(z, p) = u_i^+(q^{-\frac{c}{2}} z, p)\, \varphi_i(z)\, u_i^-(q^{\frac{c}{2}} z, p). \tag{3.25}$$

We call the currents ei (z, p), fi (z, p) and ψi± (z, p) the elliptic currents of Uq ( slN ). The reason why we call them “elliptic” is because the dressing operation specified by u± i (z, p) changes the commutation relation of the Drinfeld currents to the elliptic ones. Proposition 3.2. The elliptic currents satisfy the following relations: z1 p∗ (q Aij z2 /z1 )ei (z1 , p)ej (z2 , p) = −z2 p∗ (q Aij z1 /z2 )ej (z2 , p)ei (z1 , p),

(3.26)

z1 p (q Aij z2 /z1 )fi (z1 , p)fj (z2 , p) = −z2 p (q Aij z1 /z2 )fj (z2 , p)fi (z1 , p),

(3.27)

[ei (z1 , p), fj (z2 , p)] =

c δi,j −c δ(q z1 /z2 )ψj+ (q 2 z2 , p) q − q −1 c −δ(q c z1 /z2 )ψj− (q − 2 z2 , p) ,

q −hj ψj+ (q −r+ 2 z, p) = q hj ψj− (q r− 2 z, p), c

c

ψi+ (z1 , p)ψj+ (z2 , p) =

(3.28) (3.29)

p (q −Aij z1 /z2 )p∗ (q Aij z1 /z2 ) + ψ (z2 , p)ψi+ (z1 , p), (3.30) p (q Aij z1 /z2 )p∗ (q −Aij z1 /z2 ) j p∗ (q Aij + 2 z1 /z2 ) c

ψi+ (z1 , p)ej (z2 , p) =

ej (z2 , p)ψi+ (z1 , p),

(3.31)

fj (z2 , p)ψi+ (z1 , p),

(3.32)

p∗ (q −Aij + 2 z1 /z2 ) c

p (q −Aij − 2 z1 /z2 ) c

ψi+ (z1 , p)fj (z2 , p) =

p (q

Aij − 2c

z1 /z2 )

(p∗ q 2 z2 /z1 ; p∗ )∞ (p ∗ q −1 z/z1 ; p∗ )∞ (p ∗ q −1 z/z2 ; p∗ )∞ ei (z1 , p)ei (z2 , p)ej (z, p) (p ∗ q −2 z2 /z1 ; p∗ )∞ (p ∗ qz/z1 ; p∗ )∞ (p ∗ qz/z2 ; p∗ )∞ (p ∗ q −1 z/z1 ; p∗ )∞ (p ∗ q −1 z2 /z; p∗ )∞ ei (z1 , p)ej (z, p)ei (z2 , p) (p ∗ qz/z1 ; p∗ )∞ (p ∗ qz2 /z; p∗ )∞ (p∗ q −1 z1 /z; p∗ )∞ (p ∗ q −1 z2 /z; p∗ )∞ + ej (z, p)ei (z1 , p)ei (z2 , p) (p ∗ qz1 /z; p∗ )∞ (p ∗ qz2 /z; p∗ )∞ − [2]q

+ (z1 ↔ z2 ) = 0,

(3.33)


\[ \frac{(pq^{-2}z_2/z_1;p)_\infty}{(pq^{2}z_2/z_1;p)_\infty}\Big\{ \frac{(pqz/z_1;p)_\infty(pqz/z_2;p)_\infty}{(pq^{-1}z/z_1;p)_\infty(pq^{-1}z/z_2;p)_\infty}\,f_i(z_1,p)f_i(z_2,p)f_j(z,p) \]
\[ \qquad - [2]_q\frac{(pqz/z_1;p)_\infty(pqz_2/z;p)_\infty}{(pq^{-1}z/z_1;p)_\infty(pq^{-1}z_2/z;p)_\infty}\,f_i(z_1,p)f_j(z,p)f_i(z_2,p) \]
\[ \qquad + \frac{(pqz_1/z;p)_\infty(pqz_2/z;p)_\infty}{(pq^{-1}z_1/z;p)_\infty(pq^{-1}z_2/z;p)_\infty}\,f_j(z,p)f_i(z_1,p)f_i(z_2,p)\Big\} + (z_1\leftrightarrow z_2) = 0, \quad \text{for } |i-j|=1. \tag{3.34} \]

3.3. New currents $k_j(v)$. In this subsection, we consider a decomposition of the elliptic currents $\psi_j^\pm(z,p)$ $(1\le j\le N-1)$ corresponding to the decomposition (3.58). For this purpose, we introduce new currents $k_j(v)$ $(1\le j\le N)$. We first note that the currents $\psi_j^\pm(z,p)$ are expressed by using the Drinfeld generators $a_{j,m}$ as follows:

\[ \psi_j^\pm(q^{\mp(r-\frac c2)}z,p) = q^{\pm h_j} : \exp\Big\{-\sum_{m\neq 0}\frac{1}{[r^*m]_q}\,b_{j,m}\,(q^{N-j}z)^{-m}\Big\} :, \tag{3.35} \]

where we set

\[ b_{j,m} = \begin{cases} \dfrac{[r^*m]_q}{[rm]_q}\,a_{j,m} & m>0,\\[2mm] q^{c|m|}\,a_{j,m} & m<0. \end{cases} \tag{3.36} \]

The colons in (3.35) denote the standard normal ordering.

Let us introduce new generators $B_m^j$ $(j=1,\dots,N;\ m\in\mathbb{Z})$ according to the formula

\[ -B_m^j + B_m^{j+1} = \frac{m}{[m]_q}\,b_{j,m}\,q^{(N-j)m}, \qquad \sum_{j=1}^{N} q^{2jm}B_m^j = 0, \tag{3.37} \]

or more explicitly,

\[ B_m^j = \frac{m}{[m]_q[Nm]_q}\Big(\sum_{k=1}^{j-1}[km]_q\,b_{k,m} - q^{Nm}\sum_{k=j}^{N-1}[(N-k)m]_q\,b_{k,m}\Big). \tag{3.38} \]
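The passage from (3.37) to (3.38) is a linear solve, so it can be checked numerically. The following is a minimal sketch, not part of the original derivation: it replaces the operators $b_{k,m}$ by random real numbers (legitimate only because both formulas are linear in them), and the helper names `qnum`, `solve_B` and `residuals` are hypothetical.

```python
import random


def qnum(x, q):
    """q-number [x]_q = (q^x - q^(-x)) / (q - q^(-1))."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def solve_B(b, m, q):
    """Evaluate the right-hand side of (3.38).

    b[k-1] is a numeric stand-in for b_{k,m} (k = 1..N-1);
    the result lists B_m^1, ..., B_m^N.
    """
    N = len(b) + 1
    prefactor = m / (qnum(m, q) * qnum(N * m, q))
    B = []
    for j in range(1, N + 1):
        low = sum(qnum(k * m, q) * b[k - 1] for k in range(1, j))
        high = sum(qnum((N - k) * m, q) * b[k - 1] for k in range(j, N))
        B.append(prefactor * (low - q**(N * m) * high))
    return B


def residuals(b, m, q):
    """Largest violation of the two conditions in (3.37)."""
    N = len(b) + 1
    B = solve_B(b, m, q)
    step = max(
        abs(-B[j - 1] + B[j] - m / qnum(m, q) * b[j - 1] * q**((N - j) * m))
        for j in range(1, N)
    )
    weighted = abs(sum(q**(2 * j * m) * B[j - 1] for j in range(1, N + 1)))
    return step, weighted


if __name__ == "__main__":
    random.seed(1)
    sample = [random.uniform(-1.0, 1.0) for _ in range(3)]  # N = 4
    print(residuals(sample, 2, 1.3))  # both entries should be at rounding level
```

Both residuals vanish up to floating-point rounding for any $N$, any nonzero $m$ and generic $q$, which is the content of the step "or more explicitly".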

From this and (3.9)–(3.11), we derive the following commutation relations: Proposition 3.3. For m, m ∈ Z=0 , j, k = 1, · · · , N , the following commutation relations hold:

[r ∗ m]q [cm]q (j = k) [(N − 1)m]q j k

[Bm , Bm ] = mδm+m ,0 (3.39) × −q −mNsgn(j−k) [m]q (j = k), [rm]q [m]q [N m]q ∗ [r m]q (m > 0) j q [Bm , xj± (z)] = ∓q m(N+1−j −c) zm xj± (z) × [rm] (3.40) q cm (m < 0), ∗ [r m]q (m > 0) j +1 ± m(N−1−j −c) m ± q [Bm , xj (z)] = ±q z xj (z) × [rm] (3.41) cm q (m < 0), k , xj± (z)] = 0 [Bm

(k = j, j + 1).

(3.42)

Analysis of Elliptic Lattice Models Based on Elliptic Quantum Groups


We now define new currents $k_j(z,p)$ $(1\le j\le N)$ by

\[ k_j(z,p) = : \exp\Big\{\sum_{m\neq 0}\frac{[m]_q}{m[r^*m]_q}\,B_m^j\,z^{-m}\Big\} : . \tag{3.43} \]

Then, from (3.35) and (3.37), we have the following decomposition:

\[ \psi_j^\pm(q^{\mp(r-\frac c2)}z,p) = \kappa\,q^{\pm h_j}\,k_j(q^{N-j}z,p)\,k_{j+1}(q^{N-j}z,p)^{-1}, \qquad \kappa = \frac{(p;p)_\infty(p^*q^{2};p^*)_\infty}{(p^*;p^*)_\infty(pq^{2};p)_\infty}. \tag{3.44} \]

It is also easy to verify the following commutation relations.

Proposition 3.4.

\[ k_j(z_1,p)k_j(z_2,p) = \Big(\frac{z_1}{z_2}\Big)^{\frac{N-1}{N}(\frac1r-\frac1{r^*})}\rho(v_1-v_2)\,k_j(z_2,p)k_j(z_1,p), \tag{3.45} \]

\[ k_{j_1}(z_1,p)k_{j_2}(z_2,p) = \Big(\frac{z_1}{z_2}\Big)^{\frac{N-1}{N}(\frac1r-\frac1{r^*})}\rho(v_1-v_2)\,\frac{\Theta_p(z_1/z_2)\,\Theta_{p^*}(q^{-2}z_1/z_2)}{\Theta_p(q^{-2}z_1/z_2)\,\Theta_{p^*}(z_1/z_2)}\,k_{j_2}(z_2,p)k_{j_1}(z_1,p) \quad (1\le j_1<j_2\le N), \tag{3.46} \]

\[ k_1(z,p)\,k_2(q^{2}z,p)\cdots k_{N-1}(q^{2(N-2)}z,p)\,k_N(q^{2(N-1)}z,p) = c_N^{(N-1)}, \tag{3.47} \]

\[ k_j(z_1,p)e_j(z_2,p) = \frac{\Theta_{p^*}(q^{j-N+r^*}z_1/z_2)}{\Theta_{p^*}(q^{j-N+r^*-2}z_1/z_2)}\,e_j(z_2,p)k_j(z_1,p), \tag{3.48} \]

\[ k_{j+1}(z_1,p)e_j(z_2,p) = \frac{\Theta_{p^*}(q^{j-N+r^*}z_1/z_2)}{\Theta_{p^*}(q^{j-N+r^*+2}z_1/z_2)}\,e_j(z_2,p)k_{j+1}(z_1,p), \tag{3.49} \]

\[ k_i(z_1,p)e_j(z_2,p) = e_j(z_2,p)k_i(z_1,p) \quad (i\neq j,\,j+1), \tag{3.50} \]

\[ k_j(z_1,p)f_j(z_2,p) = \frac{\Theta_{p}(q^{j-N+r-2}z_1/z_2)}{\Theta_{p}(q^{j-N+r}z_1/z_2)}\,f_j(z_2,p)k_j(z_1,p), \tag{3.51} \]

\[ k_{j+1}(z_1,p)f_j(z_2,p) = \frac{\Theta_{p}(q^{j-N+r+2}z_1/z_2)}{\Theta_{p}(q^{j-N+r}z_1/z_2)}\,f_j(z_2,p)k_{j+1}(z_1,p), \tag{3.52} \]

\[ k_i(z_1,p)f_j(z_2,p) = f_j(z_2,p)k_i(z_1,p) \quad (i\neq j,\,j+1). \tag{3.53} \]

Here we set

\[ \rho(v) = \frac{\rho^{+}(v)}{\rho^{+*}(v)}, \tag{3.54} \]

\[ c_N = \frac{\{pq^{2N+4}\}\{pq^{2N}\}\{p^*q^{2N+2}\}^{*2}}{\{pq^{2N+2}\}^{2}\{p^*q^{2N+4}\}^{*}\{p^*q^{2N}\}^{*}}, \tag{3.55} \]

with $\rho^{+}(v)$ given in (2.19) and $\rho^{+*}(v) = \rho^{+}(v)|_{r\to r^*}$.


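3.4. Definition of the elliptic algebra $U_{q,p}(\widehat{\mathfrak{sl}}_N)$. Now we give a definition of the elliptic algebra $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ by considering a tensor product of the elliptic currents of $U_q(\widehat{\mathfrak{sl}}_N)$ with a Heisenberg algebra. In order to keep the defining relations of the algebra $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ with the new currents $K_j(v)$ the same as those given in Appendix A of [12], we need to make a central extension of the Heisenberg algebra.

3.4.1. The Heisenberg algebra $\mathbb{C}\{\mathcal{H}\}$ and its extension $\mathbb{C}\{\hat{\mathcal{H}}\}$. Let $\epsilon_j$ $(1\le j\le N)$ be the orthonormal basis in $\mathbb{R}^N$ with the inner product $\langle\epsilon_j,\epsilon_k\rangle = \delta_{j,k}$. Setting

\[ \bar\epsilon_j = \epsilon_j - \epsilon, \qquad \epsilon = \frac1N\sum_{j=1}^{N}\epsilon_j, \tag{3.56} \]

we have the weight lattice $P$ of $A_{N-1}$ type

\[ P = \oplus_{j=1}^{N}\,\mathbb{Z}\,\bar\epsilon_j. \tag{3.57} \]

Then the simple roots $\alpha_j$ $(1\le j\le N-1)$ of $\mathfrak{sl}_N$ are given by

\[ \alpha_j = -\bar\epsilon_j + \bar\epsilon_{j+1}. \tag{3.58} \]

Let us introduce operators $h_\alpha$, $\beta$ $(\alpha,\beta\in P)$ by

\[ [h_{\bar\epsilon_j},\bar\epsilon_k] = \langle\bar\epsilon_j,\bar\epsilon_k\rangle, \qquad [h_{\bar\epsilon_j},h_{\bar\epsilon_k}] = 0 = [\bar\epsilon_j,\bar\epsilon_k], \tag{3.59} \]

where $\langle\bar\epsilon_j,\bar\epsilon_k\rangle = \delta_{j,k}-\frac1N$, $h_\alpha = \sum_j n_j h_{\bar\epsilon_j}$ for $\alpha = \sum_j n_j\bar\epsilon_j$, and $h_0 = 0$. Note that $[h_{\alpha_j},\alpha_k] = 2\delta_{j,k}-\delta_{j,k+1}-\delta_{j,k-1} = A_{jk}$. Hence, we identify $h_{\alpha_j} = -h_{\bar\epsilon_j}+h_{\bar\epsilon_{j+1}}$ with $h_j$ in the Drinfeld generators of $U_q(\widehat{\mathfrak{sl}}_N)$ (Sect. 3.1). Noting $\sum_{j=1}^{N}h_{\bar\epsilon_j} = 0$, one can solve the set of equations $h_j = -h_{\bar\epsilon_j}+h_{\bar\epsilon_{j+1}}$ $(1\le j\le N-1)$ for $h_{\bar\epsilon_j}$:

\[ h_{\bar\epsilon_j} = \sum_{k=1}^{j-1}h_k - \frac1N\sum_{k=1}^{N-1}(N-k)\,h_k. \tag{3.60} \]

From this and (3.6)–(3.8), one can verify the following commutation relations with the Drinfeld generators $c$, $d$, $a_{j,m}$, $x^\pm_{j,m}$ of $U_q(\widehat{\mathfrak{sl}}_N)$:

\[ [h_{\bar\epsilon_i},a_{j,m}] = [h_{\bar\epsilon_i},d] = [h_{\bar\epsilon_i},c] = 0, \tag{3.61} \]

\[ [h_{\bar\epsilon_i},x^\pm_{j,m}] = \pm(-\delta_{i,j}+\delta_{i,j+1})\,x^\pm_{j,m}. \tag{3.62} \]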
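The lattice data behind (3.56)–(3.60) can be checked with exact rational arithmetic. The sketch below is illustrative only: it realizes $\bar\epsilon_j$ as vectors in $\mathbb{R}^N$, treats $h_1,\dots,h_{N-1}$ as a formal basis, and uses hypothetical helper names (`eps_bar`, `cartan`, `h_eps_bar`).

```python
from fractions import Fraction


def eps_bar(N):
    """Rows are eps_bar_j = eps_j - (1/N) * sum_k eps_k in the standard basis of R^N."""
    return [[Fraction(int(i == j)) - Fraction(1, N) for i in range(N)]
            for j in range(N)]


def dot(u, v):
    """Standard inner product, for which the eps_j are orthonormal."""
    return sum(a * b for a, b in zip(u, v))


def simple_roots(N):
    """alpha_j = -eps_bar_j + eps_bar_{j+1}, as in (3.58)."""
    e = eps_bar(N)
    return [[-a + b for a, b in zip(e[j], e[j + 1])] for j in range(N - 1)]


def cartan(N):
    """Gram matrix of the simple roots; should equal A_{jk} of type A_{N-1}."""
    roots = simple_roots(N)
    return [[dot(a, b) for b in roots] for a in roots]


def h_eps_bar(N):
    """Coefficients of h_{eps_bar_j} on the basis h_1..h_{N-1}, read off from (3.60)."""
    return [[Fraction(int(k < j)) - Fraction(N - k, N) for k in range(1, N)]
            for j in range(1, N + 1)]


if __name__ == "__main__":
    print(cartan(4))
    print(h_eps_bar(3))
```

With these definitions, `cartan(N)` reproduces $2\delta_{j,k}-\delta_{j,k+1}-\delta_{j,k-1}$, and the rows of `h_eps_bar(N)` sum to zero while consecutive differences give back $h_j$, in line with the constraints used to derive (3.60).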

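Now let us introduce another Heisenberg algebra $\mathbb{C}\{\mathcal{H}\}$ generated by $P_\alpha$ and $Q_\beta$ $(\alpha,\beta\in P)$ satisfying the commutation relations

\[ [P_{\bar\epsilon_j},Q_{\bar\epsilon_k}] = \langle\bar\epsilon_j,\bar\epsilon_k\rangle, \qquad [P_{\bar\epsilon_j},P_{\bar\epsilon_k}] = 0 = [Q_{\bar\epsilon_j},Q_{\bar\epsilon_k}], \tag{3.63} \]

where $P_\alpha = \sum_j n_jP_{\bar\epsilon_j}$ for $\alpha = \sum_j n_j\bar\epsilon_j$ and $P_0 = 0$. We also impose that $\mathbb{C}\{\mathcal{H}\}$ commutes with $U_q(\widehat{\mathfrak{sl}}_N)$:

\[ [P_{\bar\epsilon_j},\alpha] = [Q_{\bar\epsilon_j},\alpha] = 0, \tag{3.64} \]

\[ [P_{\bar\epsilon_j},U_q(\widehat{\mathfrak{sl}}_N)] = [Q_{\bar\epsilon_j},U_q(\widehat{\mathfrak{sl}}_N)] = 0. \tag{3.65} \]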


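Definition 3.5. We define an extension $\mathbb{C}\{\hat{\mathcal{H}}\}$ of the Heisenberg algebra $\mathbb{C}\{\mathcal{H}\}$ by introducing new generators $\eta_j$ $(1\le j\le N)$ and modifying the relations (3.63) to the following ones:

\[ [P_{\bar\epsilon_j},P_{\bar\epsilon_k}] = 0, \qquad [P_{\bar\epsilon_j},Q_{\bar\epsilon_k}] = \langle\bar\epsilon_j,\bar\epsilon_k\rangle, \tag{3.66} \]

\[ [Q_{\bar\epsilon_j},Q_{\bar\epsilon_k}] = \Big(\frac1r-\frac1{r^*}\Big)\,\mathrm{sgn}(j-k)\log q, \tag{3.67} \]

\[ [Q_{\bar\epsilon_j},\eta_k] = \frac1r\,\mathrm{sgn}(j-k)\log q, \tag{3.68} \]

\[ [\eta_j,\eta_k] = \frac1r\,\mathrm{sgn}(j-k)\log q, \tag{3.69} \]

\[ [P_{\bar\epsilon_j},\eta_k] = 0, \qquad \sum_{j=1}^{N}\eta_j = 0. \tag{3.70} \]

We also impose the following commutation relations:

\[ [\eta_j,\alpha] = [\eta_j,U_q(\widehat{\mathfrak{sl}}_N)] = 0. \tag{3.71} \]

If we set $\bar\alpha_j = -\eta_j+\eta_{j+1}$, we have

Proposition 3.6.

\[ [Q_{\alpha_j},Q_{\alpha_k}] = \Big(\frac1r-\frac1{r^*}\Big)\big(\delta_{j,k+1}-\delta_{j,k-1}\big)\log q, \tag{3.72} \]

\[ [Q_{\bar\epsilon_j},Q_{\alpha_k}] = -\Big(\frac1r-\frac1{r^*}\Big)\big(\delta_{j,k}+\delta_{j,k+1}\big)\log q, \tag{3.73} \]

\[ [Q_{\bar\epsilon_j},\bar\alpha_k] = -\frac1r\big(\delta_{j,k}+\delta_{j,k+1}\big)\log q, \tag{3.74} \]

\[ [Q_{\alpha_j},\bar\alpha_k] = \frac1r\big(\delta_{j,k+1}-\delta_{j,k-1}\big)\log q, \tag{3.75} \]

\[ [\bar\alpha_j,\bar\alpha_k] = \frac1r\big(\delta_{j,k+1}-\delta_{j,k-1}\big)\log q, \tag{3.76} \]

\[ [\bar\alpha_j,P_{\bar\epsilon_k}] = [\bar\alpha_j,U_q(\widehat{\mathfrak{sl}}_N)] = 0. \tag{3.77} \]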
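Proposition 3.6 follows from Definition 3.5 by bilinearity alone, since every commutator there is a scalar multiple of $\mathrm{sgn}(j-k)$. A minimal sketch of that bookkeeping follows; the scalar prefactors such as $(\frac1r-\frac1{r^*})\log q$ are factored out, and the function names are hypothetical.

```python
def sgn(x):
    """Sign function with sgn(0) = 0."""
    return (x > 0) - (x < 0)


def sgn_matrix(N):
    """S[j][k] = sgn(j - k): the common index structure of (3.67)-(3.69)."""
    return [[sgn(j - k) for k in range(1, N + 1)] for j in range(1, N + 1)]


def right_roots(M):
    """Replace the second argument Y_k by Y_{alpha_k} = -Y_k + Y_{k+1}."""
    return [[-row[k] + row[k + 1] for k in range(len(row) - 1)] for row in M]


def left_roots(M):
    """Replace the first argument X_j by X_{alpha_j} = -X_j + X_{j+1}."""
    return [[-a + b for a, b in zip(M[j], M[j + 1])] for j in range(len(M) - 1)]


if __name__ == "__main__":
    S = sgn_matrix(4)
    print(right_roots(S))              # index structure of (3.73) and (3.74)
    print(left_roots(right_roots(S)))  # index structure of (3.72), (3.75), (3.76)
```

The first printed matrix has entries $-(\delta_{j,k}+\delta_{j,k+1})$ and the second $\delta_{j,k+1}-\delta_{j,k-1}$, which are exactly the index patterns appearing in (3.72)–(3.76).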

3.4.2. Definition of $U_{q,p}(\widehat{\mathfrak{sl}}_N)$. Now we are ready to define the currents $E_j(v)$, $F_j(v)$, $H_j^\pm(v)$ $(1\le j\le N-1)$ and $K_j(v)$ $(1\le j\le N)$:

\[ E_j(v) = e_j(z,p)\,e^{\bar\alpha_j}\,e^{-Q_{\alpha_j}}\,(q^{-j+N}z)^{-\frac{P_{\alpha_j}-1}{r^*}}, \tag{3.78} \]

\[ F_j(v) = f_j(z,p)\,e^{-\bar\alpha_j}\,(q^{-j+N}z)^{\frac{P_{\alpha_j}-1}{r}+\frac{h_j}{r}}, \tag{3.79} \]

\[ H_j^\pm(v) = \psi_j^\pm(z,p)\,q^{\mp h_j}\,e^{-Q_{\alpha_j}}\,(q^{-j+N\pm(r-\frac c2)}z)^{(-\frac1{r^*}+\frac1r)(P_{\alpha_j}-1)+\frac1r h_j}, \tag{3.80} \]

\[ K_j(v) = k_j(z,p)\,e^{Q_{\bar\epsilon_j}}\,z^{(\frac1{r^*}-\frac1r)P_{\bar\epsilon_j}-\frac1r h_{\bar\epsilon_j}}\,z^{(\frac1{r^*}-\frac1r)\frac{N-1}{2N}}. \tag{3.81} \]

Here the currents $e_j(z,p)$, $f_j(z,p)$, $\psi_j^\pm(z,p)$ and $k_j(z,p)$ are the elliptic currents of $U_q(\widehat{\mathfrak{sl}}_N)$ given in (3.22)–(3.25) and (3.43), whereas $\bar\alpha_j$, $P_\alpha$, $Q_\beta$ $(\alpha,\beta\in P)$ are the elements in the Heisenberg algebra $\mathbb{C}\{\hat{\mathcal{H}}\}$. From (3.26)–(3.28), (3.45)–(3.53) and (3.66)–(3.71), we can verify the following relations.


Proposition 3.7.

\[ E_i(v_1)E_j(v_2) = \frac{[v_1-v_2+\frac{A_{ij}}{2}]^*}{[v_1-v_2-\frac{A_{ij}}{2}]^*}\,E_j(v_2)E_i(v_1), \tag{3.82} \]

\[ F_i(v_1)F_j(v_2) = \frac{[v_1-v_2-\frac{A_{ij}}{2}]}{[v_1-v_2+\frac{A_{ij}}{2}]}\,F_j(v_2)F_i(v_1), \tag{3.83} \]

\[ [E_i(v_1),F_j(v_2)] = \frac{\delta_{i,j}}{q-q^{-1}}\Big(\delta(q^{-c}z_1/z_2)\,H_j^+\Big(v_2+\frac c4\Big) - \delta(q^{c}z_1/z_2)\,H_j^-\Big(v_2-\frac c4\Big)\Big), \tag{3.84} \]

\[ H_j^\pm\Big(v\mp\frac12\Big(r-\frac c2\Big)\Big) = \kappa\,K_j\Big(v+\frac{N-j}{2}\Big)K_{j+1}\Big(v+\frac{N-j}{2}\Big)^{-1}, \tag{3.85} \]

\[ K_j(v_1)K_j(v_2) = \rho(v_1-v_2)\,K_j(v_2)K_j(v_1), \tag{3.86} \]

\[ K_{j_1}(v_1)K_{j_2}(v_2) = \rho(v_1-v_2)\,\frac{[v_1-v_2-1]^*\,[v_1-v_2]}{[v_1-v_2]^*\,[v_1-v_2-1]}\,K_{j_2}(v_2)K_{j_1}(v_1) \quad (1\le j_1<j_2\le N), \tag{3.87} \]

\[ K_j(v_1)E_j(v_2) = \frac{[v_1-v_2+\frac{j+r^*-N}{2}]^*}{[v_1-v_2+\frac{j+r^*-N}{2}-1]^*}\,E_j(v_2)K_j(v_1), \tag{3.88} \]

\[ K_{j+1}(v_1)E_j(v_2) = \frac{[v_1-v_2+\frac{j+r^*-N}{2}]^*}{[v_1-v_2+\frac{j+r^*-N}{2}+1]^*}\,E_j(v_2)K_{j+1}(v_1), \tag{3.89} \]

\[ K_{j_1}(v_1)E_{j_2}(v_2) = E_{j_2}(v_2)K_{j_1}(v_1) \quad (j_1\neq j_2,\,j_2+1), \tag{3.90} \]

\[ K_j(v_1)F_j(v_2) = \frac{[v_1-v_2+\frac{j+r-N}{2}-1]}{[v_1-v_2+\frac{j+r-N}{2}]}\,F_j(v_2)K_j(v_1), \tag{3.91} \]

\[ K_{j+1}(v_1)F_j(v_2) = \frac{[v_1-v_2+\frac{j+r-N}{2}+1]}{[v_1-v_2+\frac{j+r-N}{2}]}\,F_j(v_2)K_{j+1}(v_1), \tag{3.92} \]

\[ K_{j_1}(v_1)F_{j_2}(v_2) = F_{j_2}(v_2)K_{j_1}(v_1) \quad (j_1\neq j_2,\,j_2+1), \tag{3.93} \]

\[ z_1^{-\frac1{r^*}}\frac{(p^*q^{2}z_2/z_1;p^*)_\infty}{(p^*q^{-2}z_2/z_1;p^*)_\infty}\Big\{ (z/z_2)^{\frac1{r^*}}\frac{(p^*q^{-1}z/z_1;p^*)_\infty(p^*q^{-1}z/z_2;p^*)_\infty}{(p^*qz/z_1;p^*)_\infty(p^*qz/z_2;p^*)_\infty}\,E_i(v_1)E_i(v_2)E_j(v) \]
\[ \qquad - [2]_q\frac{(p^*q^{-1}z/z_1;p^*)_\infty(p^*q^{-1}z_2/z;p^*)_\infty}{(p^*qz/z_1;p^*)_\infty(p^*qz_2/z;p^*)_\infty}\,E_i(v_1)E_j(v)E_i(v_2) \]
\[ \qquad + (z/z_1)^{\frac1{r^*}}\frac{(p^*q^{-1}z_1/z;p^*)_\infty(p^*q^{-1}z_2/z;p^*)_\infty}{(p^*qz_1/z;p^*)_\infty(p^*qz_2/z;p^*)_\infty}\,E_j(v)E_i(v_1)E_i(v_2)\Big\} + (z_1\leftrightarrow z_2) = 0, \tag{3.94} \]

\[ z_1^{\frac1r}\frac{(pq^{-2}z_2/z_1;p)_\infty}{(pq^{2}z_2/z_1;p)_\infty}\Big\{ (z/z_2)^{\frac1r}\frac{(pqz/z_1;p)_\infty(pqz/z_2;p)_\infty}{(pq^{-1}z/z_1;p)_\infty(pq^{-1}z/z_2;p)_\infty}\,F_i(v_1)F_i(v_2)F_j(v) \]
\[ \qquad - [2]_q\frac{(pqz/z_1;p)_\infty(pqz_2/z;p)_\infty}{(pq^{-1}z/z_1;p)_\infty(pq^{-1}z_2/z;p)_\infty}\,F_i(v_1)F_j(v)F_i(v_2) \]
\[ \qquad + (z_1/z)^{\frac1r}\frac{(pqz_1/z;p)_\infty(pqz_2/z;p)_\infty}{(pq^{-1}z_1/z;p)_\infty(pq^{-1}z_2/z;p)_\infty}\,F_j(v)F_i(v_1)F_i(v_2)\Big\} + (z_1\leftrightarrow z_2) = 0 \quad (|i-j|=1). \tag{3.95} \]

Here the constant $\kappa$ and the function $\rho(v)$ are given in (3.44) and (3.54), respectively. The following relations among $H_j^\pm(v)$ are also useful.

Proposition 3.8.

\[ H_j^+\Big(v-r+\frac c4\Big) = H_j^-\Big(v-\frac c4\Big), \tag{3.96} \]

\[ H_i^+(v_1)H_j^+(v_2) = \frac{[v_1-v_2-\frac{A_{ij}}{2}]\,[v_1-v_2+\frac{A_{ij}}{2}]^*}{[v_1-v_2+\frac{A_{ij}}{2}]\,[v_1-v_2-\frac{A_{ij}}{2}]^*}\,H_j^+(v_2)H_i^+(v_1), \tag{3.97} \]

\[ H_i^+(v_1)E_j(v_2) = \frac{[v_1-v_2+\frac{A_{ij}}{2}+\frac c4]^*}{[v_1-v_2-\frac{A_{ij}}{2}+\frac c4]^*}\,E_j(v_2)H_i^+(v_1), \tag{3.98} \]

\[ H_i^+(v_1)F_j(v_2) = \frac{[v_1-v_2-\frac{A_{ij}}{2}-\frac c4]}{[v_1-v_2+\frac{A_{ij}}{2}-\frac c4]}\,F_j(v_2)H_i^+(v_1). \tag{3.99} \]

Definition 3.9 (Elliptic algebra $U_{q,p}(\widehat{\mathfrak{sl}}_N)$). We define the elliptic algebra $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ to be the associative algebra of the currents $E_j(v)$, $F_j(v)$ $(1\le j\le N-1)$ and $K_j(v)$ $(1\le j\le N)$ satisfying the relations (3.82)–(3.95).

Proposition 3.10. The construction of $E_j(v)$, $F_j(v)$ and $K_j(v)$ given in (3.78)–(3.81) is a realization of the elliptic algebra $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ in terms of the Drinfeld generators of $U_q(\widehat{\mathfrak{sl}}_N)$, $h_j$, $B_m^j$, $x^\pm_{i,n}$ $(i=1,\dots,N-1;\ m\in\mathbb{Z}_{\neq 0},\ n\in\mathbb{Z})$, $c$, $d$ and the Heisenberg algebra $\mathbb{C}\{\hat{\mathcal{H}}\}$ generated by $P_{\bar\epsilon_j}$, $Q_{\bar\epsilon_j}$ $(1\le j\le N)$ and $\bar\alpha_j$ $(1\le j\le N-1)$.

Remark. In Appendix A of [12], a realization of the elliptic algebra $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ is given by using the Drinfeld currents of $U_q(\widehat{\mathfrak{sl}}_N)$ and the Heisenberg algebra generated by $\{P_j,Q_j\}$ satisfying

\[ [P_i,Q_j] = -\frac{A_{ij}}{2}, \tag{3.100} \]

which has no central extension. The relation between $\{P_j,Q_j\}$ and $\{P_{\alpha_j},Q_{\alpha_j}\}$ in $\mathbb{C}\{\mathcal{H}\}$ is $P_{\alpha_j} = P_j$ and $Q_{\alpha_j} = -2Q_j$. The role of the central extension and of the additional elements $\eta_j$ (3.67)–(3.69) is to suppress some extra fractional powers of $q$ in the relations in Proposition 3.7. As for the problem of realizing the L-operators satisfying the dynamical RLL-relation (Sect. 5), such $q$-factors can be absorbed into a choice of the gauge expressing the R-matrix. Conversely, in a gauge expressing the R-matrix components as

\[ b(u,s) = q^{-\frac1r}\,\frac{[s+1][s-1][u]}{[s]^2[u+1]}, \qquad \bar b(u) = q^{\frac1r}\,\frac{[u]}{[u+1]}, \]

and the others remaining the same as in (2.18), we need neither the central extension nor the addition of $\eta_j$.


4. Half Currents

In order to construct an L-operator, we here introduce the half currents $E^+_{l,j}(v)$, $F^+_{j,l}(v)$ and $K^+_j(v)$ and investigate their commutation relations. We follow the idea of [14, 4, 12]. We often use the abbreviations

\[ P_{j,l} = -P_{\bar\epsilon_j}+P_{\bar\epsilon_l} = P_{\alpha_j}+P_{\alpha_{j+1}}+\cdots+P_{\alpha_{l-1}}, \tag{4.1} \]

\[ h_{j,l} = -h_{\bar\epsilon_j}+h_{\bar\epsilon_l} = h_j+h_{j+1}+\cdots+h_{l-1} \tag{4.2} \]

for $j<l$. From the definition of $\mathbb{C}\{\hat{\mathcal{H}}\}$ and (3.78)–(3.81), we have

\[ [K_j(v),P_{k,l}] = (\delta_{j,k}-\delta_{j,l})\,K_j(v) = [K_j(v),P_{k,l}+h_{k,l}], \tag{4.3} \]

\[ [E_j(v),P_{k,l}] = (\delta_{j,k}+\delta_{j+1,l}-\delta_{j,l}-\delta_{j+1,k})\,E_j(v), \tag{4.4} \]

\[ [F_j(v),P_{k,l}+h_{k,l}] = (\delta_{j,k}+\delta_{j+1,l}-\delta_{j,l}-\delta_{j+1,k})\,F_j(v), \tag{4.5} \]

\[ [F_j(v),P_{k,l}] = 0 = [E_j(v),P_{k,l}+h_{k,l}]. \tag{4.6} \]
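The coefficients in (4.3) and (4.4) can be traced back to the pairing $[P_{\bar\epsilon_a},Q_{\bar\epsilon_b}] = \delta_{a,b}-\frac1N$. The sketch below assumes, as read from (3.78) and (3.81), that the only factors not commuting with $P_{k,l}$ are $e^{Q_{\bar\epsilon_j}}$ in $K_j(v)$ and $e^{-Q_{\alpha_j}}$ in $E_j(v)$, and it uses the identity $[e^X,P] = -[P,X]\,e^X$, valid when $[P,X]$ is central. All function names are hypothetical.

```python
from fractions import Fraction


def pairing(N):
    """G[a][b] = [P_{eps_bar_a}, Q_{eps_bar_b}] = delta_{ab} - 1/N, cf. (3.66)."""
    return [[Fraction(int(a == b)) - Fraction(1, N) for b in range(N)]
            for a in range(N)]


def P_interval(N, k, l):
    """Coefficients of P_{k,l} = -P_{eps_bar_k} + P_{eps_bar_l} (1-indexed, k < l)."""
    p = [0] * N
    p[k - 1] -= 1
    p[l - 1] += 1
    return p


def Q_combination(N, terms):
    """Coefficients of X = sum_b x_b Q_{eps_bar_b}; terms maps a 1-based index to x_b."""
    x = [0] * N
    for index, coeff in terms.items():
        x[index - 1] += coeff
    return x


def shift(p, x):
    """Scalar c with [e^X, P] = c e^X, using [e^X, P] = -[P, X] e^X."""
    G = pairing(len(p))
    return -sum(p[a] * G[a][b] * x[b]
                for a in range(len(p)) for b in range(len(x)))


if __name__ == "__main__":
    N = 4
    # K_2 against P_{2,4}: expect delta_{2,2} - delta_{2,4} = 1
    print(shift(P_interval(N, 2, 4), Q_combination(N, {2: 1})))
```

The same function also reproduces (5.10): for $X = Q_{\bar\epsilon_j}+Q_{\bar\epsilon_l}$ and $P = P_{j,l}$ the shift is zero.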

Now we define the half currents of $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ as follows.

Definition 4.1 (Half currents). We define the half currents $F^+_{j,l}(v)$, $E^+_{l,j}(v)$ $(1\le j<l\le N)$ and $K^+_j(v)$ $(j=1,\dots,N)$ by

\[ K^+_j(v) = K_j\Big(v+\frac{r+1}{2}\Big) \quad (1\le j\le N), \tag{4.7} \]

\[ F^+_{j,l}(v) = a_{j,l}\oint_{C(j,l)}\prod_{m=j}^{l-1}\frac{dz_m}{2\pi i z_m}\,F_{l-1}(v_{l-1})F_{l-2}(v_{l-2})\cdots F_j(v_j) \]
\[ \qquad\times\frac{[v-v_{l-1}+P_{j,l}+h_{j,l}+\frac{l-N}{2}-1]\,[1]}{[v-v_{l-1}+\frac{l-N}{2}]\,[P_{j,l}+h_{j,l}-1]}\ \prod_{m=j}^{l-2}\frac{[v_{m+1}-v_m+P_{j,m+1}+h_{j,m+1}-\frac12]\,[1]}{[v_{m+1}-v_m+\frac12]\,[P_{j,m+1}+h_{j,m+1}]}, \tag{4.8} \]

\[ E^+_{l,j}(v) = a^*_{j,l}\oint_{C^*(j,l)}\prod_{m=j}^{l-1}\frac{dz_m}{2\pi i z_m}\,E_j(v_j)E_{j+1}(v_{j+1})\cdots E_{l-1}(v_{l-1}) \]
\[ \qquad\times\frac{[v-v_{l-1}-P_{j,l}+\frac{l-N}{2}+\frac c2+1]^*\,[1]^*}{[v-v_{l-1}+\frac{l-N}{2}+\frac c2]^*\,[P_{j,l}-1]^*}\ \prod_{m=j}^{l-2}\frac{[v_{m+1}-v_m-P_{j,m+1}+\frac12]^*\,[1]^*}{[v_{m+1}-v_m+\frac12]^*\,[P_{j,m+1}-1]^*}. \tag{4.9} \]

Here the integration contours $C(j,l)$ and $C^*(j,l)$ are given by

\[ C(j,l):\ |pq^{l-N}z|<|z_{l-1}|<|q^{l-N}z|, \qquad |pqz_{k+1}|<|z_k|<|qz_{k+1}|, \tag{4.10} \]

\[ C^*(j,l):\ |p^*q^{l-N+c}z|<|z_{l-1}|<|q^{l-N+c}z|, \qquad |p^*qz_{k+1}|<|z_k|<|qz_{k+1}|, \tag{4.11} \]

where $k=j,j+1,\dots,l-2$. The constants $a_{j,l}$ and $a^*_{j,l}$ are chosen to satisfy

\[ \frac{\kappa\,a_{j,l}\,a^*_{j,l}\,[1]}{q-q^{-1}} = 1. \tag{4.12} \]

Then we can verify the following commutation relations.

Theorem 4.1. The half currents $E^+_{l,j}(v)$, $F^+_{j,l}(v)$ and $K^+_j(v)$ $(1\le j<l\le N)$ satisfy the following relations:

\[ K^+_j(v_1)K^+_j(v_2) = \rho(v)\,K^+_j(v_2)K^+_j(v_1) \quad (1\le j\le N), \tag{4.13} \]

\[ K^+_j(v_1)K^+_l(v_2) = \rho(v)\,\frac{[v-1]^*\,[v]}{[v]^*\,[v-1]}\,K^+_l(v_2)K^+_j(v_1) \quad (1\le j<l\le N), \tag{4.14} \]

\[ K^+_l(v_1)^{-1}E^+_{l,j}(v_2)K^+_l(v_1) = E^+_{l,j}(v_2)\,\frac{1}{\bar b^*(v)} - E^+_{l,j}(v_1)\,\frac{c^*(v,P_{j,l})}{\bar b^*(v)}, \tag{4.15} \]

\[ K^+_l(v_1)F^+_{j,l}(v_2)K^+_l(v_1)^{-1} = \frac{1}{\bar b(v)}\,F^+_{j,l}(v_2) - \frac{\bar c(v,P_{j,l}+h_{j,l})}{\bar b(v)}\,F^+_{j,l}(v_1), \tag{4.16} \]

\[ E^+_{l,j}(v_1)E^+_{l,j}(v_2)\,\frac{[1-v]^*}{[v]^*} + E^+_{l,j}(v_2)E^+_{l,j}(v_1)\,\frac{[1+v]^*}{[v]^*} = E^+_{l,j}(v_1)^2\,\frac{[1]^*[P_{j,l}-2+v]^*}{[P_{j,l}-2]^*[v]^*} + E^+_{l,j}(v_2)^2\,\frac{[1]^*[P_{j,l}-2-v]^*}{[P_{j,l}-2]^*[v]^*}, \tag{4.17} \]

\[ \frac{[1+v]}{[v]}\,F^+_{j,l}(v_1)F^+_{j,l}(v_2) + \frac{[1-v]}{[v]}\,F^+_{j,l}(v_2)F^+_{j,l}(v_1) = \frac{[1][P_{j,l}+h_{j,l}-2-v]}{[P_{j,l}+h_{j,l}-2][v]}\,F^+_{j,l}(v_1)^2 + \frac{[1][P_{j,l}+h_{j,l}-2+v]}{[P_{j,l}+h_{j,l}-2][v]}\,F^+_{j,l}(v_2)^2, \tag{4.18} \]

\[ K^+_l(v_2)^{-1}E^+_{l,k}(v_1)K^+_l(v_2)E^+_{l,j}(v_2) = K^+_l(v_1)^{-1}E^+_{l,j}(v_2)K^+_l(v_1)E^+_{l,k}(v_1)\,\bar R^{*kj}_{kj}(v,P_{j,k}) \]
\[ \qquad + K^+_l(v_1)^{-1}E^+_{l,k}(v_2)K^+_l(v_1)E^+_{l,j}(v_1)\,\bar R^{*kj}_{jk}(v,P_{j,k}) \quad (j\neq k), \tag{4.19} \]

\[ F^+_{k,l}(v_2)K^+_l(v_2)F^+_{j,l}(v_1)K^+_l(v_2)^{-1} = \bar R^{jk}_{jk}(v,P_{j,k}+h_{j,k})\,F^+_{j,l}(v_1)K^+_l(v_1)F^+_{k,l}(v_2)K^+_l(v_1)^{-1} \]
\[ \qquad + \bar R^{kj}_{jk}(v,P_{j,k}+h_{j,k})\,F^+_{k,l}(v_1)K^+_l(v_1)F^+_{j,l}(v_2)K^+_l(v_1)^{-1} \quad (j\neq k), \tag{4.20} \]

\[ [E^+_{l,l-1}(v_1),F^+_{j,l}(v_2)] = F^+_{j,l-1}(v_2)K^+_{l-1}(v_2)K^+_l(v_2)^{-1}\,\frac{[P_{l-1,l}-v-1]^*[1]^*}{[v]^*[P_{l-1,l}-1]^*} \]
\[ \qquad - F^+_{j,l-1}(v_1)K^+_l(v_1)^{-1}K^+_{l-1}(v_1)\,\frac{[P_{j,l}+h_{j,l}-v-1][1]}{[v_1-v_2][P_{j,l}+h_{j,l}-1]}, \tag{4.21} \]

\[ [E^+_{l,j}(v_1),F^+_{l-1,l}(v_2)] = K^+_{l-1}(v_2)K^+_l(v_2)^{-1}E^+_{l-1,j}(v_2)\,\frac{[P_{j,l}-v-1]^*[1]^*}{[v]^*[P_{j,l}-1]^*} \]
\[ \qquad - K^+_l(v_1)^{-1}K^+_{l-1}(v_1)E^+_{l-1,j}(v_1)\,\frac{[P_{l-1,l}+h_{l-1,l}-v-1][1]}{[v][P_{l-1,l}+h_{l-1,l}-1]}, \tag{4.22} \]

where $v = v_1-v_2$.


Proof. The relations (4.13) and (4.14) are direct consequences of (3.86) and (3.87). We show the relation (4.16). The relation (4.15) can be proved in the same way. Setting $\pi_{l,j} = P_{j,l}+h_{j,l}$, we have from (3.91)–(3.92) and (4.3),

\[ K^+_l(v_1)F^+_{j,l}(v_2)K^+_l(v_1)^{-1} = a_{j,l}\oint_{C(j,l)}\prod_{k=j}^{l-1}\frac{dz_k}{2\pi i z_k}\,F_{l-1}(v_{l-1})F_{l-2}(v_{l-2})\cdots F_j(v_j) \]
\[ \qquad\times\frac{[v_1-v_{l-1}+\frac{l-N}{2}+1]\,[v_2-v_{l-1}+\pi_{l,j}+\frac{l-N}{2}-2]\,[1]}{[v_1-v_{l-1}+\frac{l-N}{2}]\,[v_2-v_{l-1}+\frac{l-N}{2}]\,[\pi_{l,j}-2]}\times A(v_{l-1},\dots,v_j;\pi_{l-1,j},\dots,\pi_{j+1,j}), \]

where we set

\[ A(v_{l-1},\dots,v_j;\pi_{l-1,j},\dots,\pi_{j+1,j}) = \prod_{k=j}^{l-2}\frac{[v_{k+1}-v_k+\pi_{k+1,j}-\frac12]\,[1]}{[v_{k+1}-v_k+\frac12]\,[\pi_{k+1,j}]}. \tag{4.23} \]

Then the relation (4.16) follows from the theta function identity

\[ \frac{[u_1+t][u_2+s]}{[u_1][u_2][s]} = \frac{[u_1-u_2+t][u_2+s+t]}{[u_1-u_2][u_2][s+t]} + \frac{[u_2-u_1+s][u_1+s+t][t]}{[u_2-u_1][u_1][s][s+t]} \tag{4.24} \]

with the replacement $u_i = v_i-v_{l-1}+\frac{l-N}{2}$ $(i=1,2)$, $s = \pi_{l,j}-2$, $t = 1$. Proofs of (4.17)–(4.18) and (4.19)–(4.20) are lengthy. We put them in Appendix B.

Next let us consider the relation (4.21). Integrating the delta function appearing from (3.84), we have

$(a_{j,l}\,a^*_{j,l})^{-1}(q-q^{-1})\,[E^+_{l,l-1}(v_1),F^+_{j,l}(v_2)]$

dzj dzl−1 c +

vl−1 ··· Hl−1 + =

+ 2πiz 2πizj 4 Cl−1 l−1 ∗ [1]∗ − π + 1] [u l,l−1 1 +

(vl−1 ) · · · Fj+ (vj ) × Fl−1 [u1 ]∗ [πl,l−i − 1]∗ dz

dzl−1 c j −

− · · · H − v l−1 − 2πiz

4 2πizj l−1 Cl−1 l−1 [u1 − πl,l−1 + 1 + c]∗ [1]∗ +

(vl−1 ) · · · Fj+ (vj ) × Fl−1 [u1 + c]∗ [πl,l−i − 1]∗

× A(vl−1 , .., vj ; πl−1,j , .., πj +1,j ). ± Here the contours Cl−1 are now + encloses z1 p ∗n , z2 p n Cl−1

(n = 1, 2, ...),

− Cl−1

n

2c ∗n

encloses z1 q p , z2 p

(n = 1, 2, ...).


Then in the second term, changing the variable $z_{l-1}\to pz_{l-1}$ and using the relation $H^-(v+r-c/4) = H^+(v+c/4)$, we have the same integrand as the first term, but the integration contour $C^-_{l-1}$ becomes

$C^-_{l-1}$ encloses $z_1p^{*n}$, $z_2p^{n}$ $(n=0,1,2,\dots)$.

Therefore, taking the residue at $z_{l-1} = z_1, z_2$ and using the relation (3.85), we get (4.21).
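The theta function identity (4.24), on which the proof of (4.16) rests, can be tested numerically. The sketch below models the symbol $[u]$ by the Jacobi theta function $\theta_1(\pi u/r)$ with an arbitrary real nome. This normalization is an assumption made only for illustration: the definition of $[u]$ used in this paper may differ from it by a constant factor, which does not affect (4.24) because both sides are homogeneous of the same degree.

```python
import math


def theta1(z, nome, terms=20):
    """Jacobi theta_1(z | nome) = 2 * sum_n (-1)^n nome^((n+1/2)^2) sin((2n+1) z)."""
    return 2.0 * sum(
        (-1) ** n * nome ** ((n + 0.5) ** 2) * math.sin((2 * n + 1) * z)
        for n in range(terms)
    )


def bracket(u, r, nome):
    """Stand-in for [u]: an odd theta function with zeros at integer multiples of r."""
    return theta1(math.pi * u / r, nome)


def identity_sides(u1, u2, s, t, r, nome):
    """Return the left- and right-hand sides of (4.24)."""
    def b(x):
        return bracket(x, r, nome)

    lhs = b(u1 + t) * b(u2 + s) / (b(u1) * b(u2) * b(s))
    rhs = (
        b(u1 - u2 + t) * b(u2 + s + t) / (b(u1 - u2) * b(u2) * b(s + t))
        + b(u2 - u1 + s) * b(u1 + s + t) * b(t)
        / (b(u2 - u1) * b(u1) * b(s) * b(s + t))
    )
    return lhs, rhs


if __name__ == "__main__":
    print(identity_sides(0.37, 1.21, 2.6, 1.0, 5.3, 0.15))
```

After clearing denominators, (4.24) is an instance of the Weierstrass three-term identity for $\theta_1$, so the two printed numbers agree to rounding error for generic arguments.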

5. The L-Operator of $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ and Relation to $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$

In this section, we construct an L-operator $\hat L^+(u)$ by using the half currents and show that it satisfies the dynamical RLL-relation (2.21), which characterizes the algebra $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$. We then clarify the relation between the two elliptic algebras $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ and $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$.

5.1. L-operator.

Definition 5.1 (L-operator). By using the half currents, we define the L-operator $\hat L^+(v)\in\mathrm{End}(\mathbb{C}^N)\otimes U_{q,p}(\widehat{\mathfrak{sl}}_N)$ as follows:

\[ \hat L^+(u) = \begin{pmatrix} 1 & F^+_{1,2}(u) & F^+_{1,3}(u) & \cdots & F^+_{1,N}(u)\\ 0 & 1 & F^+_{2,3}(u) & \cdots & F^+_{2,N}(u)\\ \vdots & \ddots & \ddots & \ddots & \vdots\\ \vdots & & \ddots & 1 & F^+_{N-1,N}(u)\\ 0 & \cdots & \cdots & 0 & 1 \end{pmatrix} \begin{pmatrix} K^+_1(u) & 0 & \cdots & 0\\ 0 & K^+_2(u) & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & K^+_N(u) \end{pmatrix} \]
\[ \qquad\times \begin{pmatrix} 1 & 0 & \cdots & \cdots & 0\\ E^+_{2,1}(u) & 1 & \ddots & & \vdots\\ E^+_{3,1}(u) & E^+_{3,2}(u) & \ddots & \ddots & \vdots\\ \vdots & \vdots & \ddots & 1 & 0\\ E^+_{N,1}(u) & E^+_{N,2}(u) & \cdots & E^+_{N,N-1}(u) & 1 \end{pmatrix}. \tag{5.1} \]

Here $E^+_{l,j}(v)$, $F^+_{j,l}(v)$ and $K^+_j(v)$ are the half currents given in Sect. 4.
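The triangular factorization (5.1) fixes which products of half currents sit in the last row and column of the L-operator. The following sketch multiplies out the three factors, with random $2\times2$ integer matrices standing in for the noncommuting half currents. These stand-ins are purely illustrative and carry none of the relations of Theorem 4.1; the helper names are hypothetical.

```python
import random

ZERO = ((0, 0), (0, 0))
ONE = ((1, 0), (0, 1))


def mul(a, b):
    """Product of two 2x2 integer matrices (noncommutative stand-ins)."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def block_mul(A, B):
    """Product of two N x N matrices whose entries are 2x2 matrices."""
    n = len(A)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            acc = ZERO
            for k in range(n):
                acc = add(acc, mul(A[i][k], B[k][j]))
            row.append(acc)
        out.append(row)
    return out


def l_operator(F, K, E):
    """Multiply out (5.1). F[(j, l)] and E[(l, j)] are keyed by 1 <= j < l <= N."""
    N = len(K)
    idx = range(1, N + 1)
    upper = [[ONE if a == b else (F[(a, b)] if a < b else ZERO) for b in idx] for a in idx]
    diag = [[K[a - 1] if a == b else ZERO for b in idx] for a in idx]
    lower = [[ONE if a == b else (E[(a, b)] if a > b else ZERO) for b in idx] for a in idx]
    return block_mul(block_mul(upper, diag), lower)


def random_entry(rng):
    """A random 2x2 integer matrix."""
    return tuple(tuple(rng.randint(-3, 3) for _ in range(2)) for _ in range(2))


if __name__ == "__main__":
    rng = random.Random(0)
    N = 3
    K = [random_entry(rng) for _ in range(N)]
    F = {(j, l): random_entry(rng) for j in range(1, N) for l in range(j + 1, N + 1)}
    E = {(l, j): random_entry(rng) for j in range(1, N) for l in range(j + 1, N + 1)}
    L = l_operator(F, K, E)
    print(L[N - 1][N - 1] == K[N - 1])  # the (N, N) entry is K_N
```

In particular the $(N,N)$ entry is $K^+_N$, the $(l,N)$ entry is $F^+_{l,N}K^+_N$ and the $(N,j)$ entry is $K^+_NE^+_{N,j}$, while the other entries pick up additional terms such as $F^+_{1,2}K^+_2E^+_{2,1}$.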

Let $(\pi_z,V_z)$, $V_z = V\otimes\mathbb{C}[z,z^{-1}]$, be the evaluation representation of $U_q(\widehat{\mathfrak{sl}}_N)$ based on the vector representation $V\cong\mathbb{C}^N$ (see Appendix D). The image of the universal R-matrix $\mathcal{R}(r,\{s_j\})$ of $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$ in the evaluation representation $(\pi_{V,z}\otimes\pi_{V,1})$ is given by the R-matrix $R^+(v,P)$ in (2.15). Then from a direct comparison with the relations of the half currents in Theorem 4.1, we conjecture the following property of the L-operator.

Conjecture 5.1. The L-operator $\hat L^+(v)$ satisfies the following $RLL = LLR^*$ relation:

\[ R^{+(12)}(v_1-v_2,P+h)\,\hat L^{+(1)}(v_1)\,\hat L^{+(2)}(v_2) = \hat L^{+(2)}(v_2)\,\hat L^{+(1)}(v_1)\,R^{+*(12)}(v_1-v_2,P). \tag{5.2} \]

In Appendix C, we give a derivation of some of the relations among the half currents involved in (5.2) and discuss their direct comparison with those in Theorem 4.1. In Sect. 6.3, we give a proof of this statement in the case c = 1.


5.2. $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ and $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$. Based on the conjecture, we give a relation between $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ and $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$. We argue that the RLL relation (5.2) is equivalent to the dynamical RLL relation of $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$. Hence we can regard the elliptic currents in $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ as an elliptic analogue of the Drinfeld currents in $U_q(\widehat{\mathfrak{sl}}_N)$, providing a new realization of the elliptic quantum group $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$. In order to show this, we consider the realization of $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ given in (3.78)–(3.81) and modify the half currents in such a way that they have no $Q_{\bar\epsilon_j}$, $\eta_j$ $(1\le j\le N)$ dependence. Let us define the modified half currents $k^+_j(v,P)$ $(1\le j\le N)$ and $e^+_{l,j}(v,P)$, $f^+_{j,l}(v,P)$ $(1\le j<l\le N)$ as follows:

\[ k^+_j(v,P) = K^+_j(v)\,e^{-Q_{\bar\epsilon_j}}, \tag{5.3} \]

\[ e^+_{l,j}(v,P) = e^{Q_{\bar\epsilon_l}-\eta_l}\,E^+_{l,j}(v)\,e^{-Q_{\bar\epsilon_j}+\eta_j}, \tag{5.4} \]

\[ f^+_{j,l}(v,P) = e^{-\eta_j}\,F^+_{j,l}(v)\,e^{\eta_l}. \tag{5.5} \]

Then it is easy to see from (3.78)–(3.81) and (4.7)–(4.9) that the modified half currents depend on neither $Q_{\bar\epsilon_j}$ nor $\eta_j$ and commute with $P_{\bar\epsilon_j}$ for all $j$. We hence regard them as the currents in $U_q(\widehat{\mathfrak{sl}}_N)$ with parameters $P_{\bar\epsilon_j}$ and $r$. Now we define a modified L-operator $L^+(v,P)$ by

\[ L^+(u,P) = \begin{pmatrix} 1 & f^+_{1,2}(u,P) & f^+_{1,3}(u,P) & \cdots & f^+_{1,N}(u,P)\\ 0 & 1 & f^+_{2,3}(u,P) & \cdots & f^+_{2,N}(u,P)\\ \vdots & \ddots & \ddots & \ddots & \vdots\\ \vdots & & \ddots & 1 & f^+_{N-1,N}(u,P)\\ 0 & \cdots & \cdots & 0 & 1 \end{pmatrix} \begin{pmatrix} k^+_1(u,P) & 0 & \cdots & 0\\ 0 & k^+_2(u,P) & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & k^+_N(u,P) \end{pmatrix} \]
\[ \qquad\times \begin{pmatrix} 1 & 0 & \cdots & \cdots & 0\\ e^+_{2,1}(u,P) & 1 & \ddots & & \vdots\\ e^+_{3,1}(u,P) & e^+_{3,2}(u,P) & \ddots & \ddots & \vdots\\ \vdots & \vdots & \ddots & 1 & 0\\ e^+_{N,1}(u,P) & e^+_{N,2}(u,P) & \cdots & e^+_{N,N-1}(u,P) & 1 \end{pmatrix}. \tag{5.6} \]

Then the L-operator $\hat L^+(v)$ and the modified one $L^+(v,P)$ are related by

\[ L^+(v,P) = \hat L^+(v)\begin{pmatrix} e^{-Q_{\bar\epsilon_1}} & 0 & \cdots & 0\\ 0 & e^{-Q_{\bar\epsilon_2}} & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & e^{-Q_{\bar\epsilon_N}} \end{pmatrix} = \hat L^+(v)\exp\Big\{\sum_{m=1}^{N}h^{(1)}_{\epsilon_m}Q_{\bar\epsilon_m}\Big\}. \tag{5.7} \]

Here $h^{(1)}_j = h_j\otimes 1$ and $h_{\epsilon_m}\equiv -E_{mm}$ (an $N\times N$ matrix unit). The reader should not confuse $h_{\epsilon_m}$ with $h_{\bar\epsilon_m}$, but note $h_j = -h_{\epsilon_j}+h_{\epsilon_{j+1}} = -h_{\bar\epsilon_j}+h_{\bar\epsilon_{j+1}}$ on $V$. Substituting (5.7) into (5.2) and noting the commutation relations

\[ P_{j,l}\exp\Big\{\sum_{m=1}^{N}h^{(k)}_{\epsilon_m}Q_{\bar\epsilon_m}\Big\} = \exp\Big\{\sum_{m=1}^{N}h^{(k)}_{\epsilon_m}Q_{\bar\epsilon_m}\Big\}\,(P_{j,l}+h^{(k)}_{j,l}) \tag{5.8} \]

and

\[ \Big[\sum_{m=1}^{N}(h^{(1)}_{\epsilon_m}+h^{(2)}_{\epsilon_m})\,Q_{\bar\epsilon_m},\ R^{+*(12)}(v,P)\Big] = 0, \tag{5.9} \]

or equivalently

\[ [Q_{\bar\epsilon_j}+Q_{\bar\epsilon_l},\ P_{j,l}] = 0, \tag{5.10} \]

we can move each factor $\exp\{-\sum_{m=1}^{N}h^{(k)}_{\epsilon_m}Q_{\bar\epsilon_m}\}$ $(k=1,2)$ to the right end on both sides. We then obtain the following statement.

Corollary 5.2. The modified L-operator $L^+(v,P)$ satisfies the dynamical RLL relation

\[ R^{+(12)}(v,P+h)\,L^{+(1)}(v_1,P)\,L^{+(2)}(v_2,P+h^{(1)}) = L^{+(2)}(v_2,P)\,L^{+(1)}(v_1,P+h^{(2)})\,R^{+*(12)}(v,P), \tag{5.11} \]

where $v = v_1-v_2$.

Comparing this with (2.21), we identify our $L^+(v,P)$ with $L^+(v,s)$ in (2.21) and $s_j$ with $P_{\alpha_j}$. Note the parametrization (2.14). As a consequence of this result, we regard the elliptic currents $E_j(v)$, $F_j(v)$ $(1\le j\le N-1)$ and $K_j(v)$ $(1\le j\le N)$ in $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ as the Drinfeld currents of the elliptic quantum group $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$ up to tensoring with the Heisenberg algebra. Conversely, this indicates that $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ is an extension of the algebra $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$ by tensoring the Heisenberg algebra $\mathbb{C}\{\hat{\mathcal{H}}\}$ generated by $\{P_{\bar\epsilon_j},Q_{\bar\epsilon_j},\eta_j\}$. Namely, $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ is obtained from $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$, first by tensoring half of the generators $e^{nQ_{\bar\epsilon_j}}e^{m\eta_l}$ $(1\le j,l\le N;\ n,m\in\mathbb{Z})$, then regarding $s_j = P_{\alpha_j}$ and imposing the commutation relations (3.66)–(3.71). In this sense, we can regard $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ as $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)\otimes_{\mathbb{C}\{P_{\bar\epsilon_1},P_{\bar\epsilon_2},\dots,P_{\bar\epsilon_{N-1}}\}}\mathbb{C}\{\hat{\mathcal{H}}\}$.

6. Vertex Operators of $U_{q,p}(\widehat{\mathfrak{sl}}_N)$

Tensoring the Heisenberg algebra breaks down the coalgebra structure of $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$ [12]. But we can still define $U_{q,p}(\widehat{\mathfrak{sl}}_N)$ counterparts of the intertwining operators of $B_{q,\lambda}(\widehat{\mathfrak{sl}}_N)$. We call such operators the vertex operators of $U_{q,p}(\widehat{\mathfrak{sl}}_N)$. In this section, we study such vertex operators and compare them with those of the $A^{(1)}_{N-1}$-type face model obtained in the papers [16, 17].


6.1. Intertwining relations. Here we derive Uq,p ( slN ) counterparts of the dynamical intertwining relations (2.25)–(2.26). In the next subsection, we use such relations to derive a free field realization of the vertex operators. Let us first define an extension of the Uq modules by & = F F ⊗ eµ1 Q¯1 +···+µN −1 Q¯N −1 . µ1 ,··· ,µN −1 ∈Z ∗ (z, P ) be the type I and type II intertwining operators of B Let W (z, P ) and W slN ) q,λ ( ∗ (2.23) –(2.24). We define type I and type II vertex operators W (v), W (v) of Uq,p ( slN ) slN ): as the following extensions of the corresponding intertwining operators of Bq,λ (

−→ F ⊗ Wz , :F   N   ∗ ∗ −→ F . W : Wz ⊗ F (v) = W (z, P ) exp h(1) Q ¯ j j   W (v) = W (q c z, P )

(6.1) (6.2)

j =1

W (v) and ∗ (v) satisfy From the relations (5.7) and (2.25)–(2.26), the new operators W the following “intertwining relations”: +(1) (v1 ) = R +(13) (v1 − v2 , P + h)L +(1) (v1 ) (3) (v2 ), (3) (v2 )L W V VW V W

(6.3)

+(1) (v1 )R +∗(12) (v1 − v2 , P − h(1) − h(2) ). (6.4) +(1) (v1 ) ∗(2) (v2 ) = ∗(2) (v2 )L L V W W V VW Now we restrict ourselves to the vector representation V and investigate the relations (6.3)–(6.4) in detail. We denote a basis of V by {vm }N m=1 . In this representation, the R+ (v) by L + (v) matrix RV+V (v, P ) is given by R + (v, P ) in (2.15) and the L-operator L V in (5.1). We define the components of the vertex operators by V

1 u− 2

=

N

V∗ m (u) ⊗ vm ,

m=1

c+1 ∗ u− (vm ⊗ ·) = m (u), (6.5) 2

+ (u) by and the matrix elements of the L-operator L + (u)vj = L vm L+ (u)mj .

(6.6)

1≤m≤N

Using these components, Eq. (6.3) reads as follows:

\[ \Phi_m(v_2)\,L^+_{mj}(v_1) = \rho^+(v_1-v_2+1/2)\,L^+_{mj}(v_1)\,\Phi_m(v_2), \tag{6.7} \]

\[ \rho^+(v_1-v_2+1/2)^{-1}\,\Phi_m(v_2)\,L^+_{lj}(v_1) = b(v_1-v_2+1/2,\,P_{l,m}+h_{l,m})\,L^+_{lj}(v_1)\,\Phi_m(v_2) + c(v_1-v_2+1/2,\,P_{l,m}+h_{l,m})\,L^+_{mj}(v_1)\,\Phi_l(v_2), \tag{6.8} \]

\[ \rho^+(v_1-v_2+1/2)^{-1}\,\Phi_l(v_2)\,L^+_{mj}(v_1) = \bar b(v_1-v_2+1/2)\,L^+_{mj}(v_1)\,\Phi_l(v_2) + \bar c(v_1-v_2+1/2,\,P_{l,m}+h_{l,m})\,L^+_{lj}(v_1)\,\Phi_m(v_2), \tag{6.9} \]


for $1\le l<m\le N$ and $1\le j\le N$. For type II, we have the following set of equations arising from Eq. (6.4):

\[ L^+_{jm}(v_1)\,\Psi^*_m(v_2) = \rho^{+*}(v_1-v_2+1)\,\Psi^*_m(v_2)\,L^+_{jm}(v_1), \tag{6.10} \]

\[ \rho^{+*}(v_1-v_2+1)^{-1}\,L^+_{jl}(v_1)\,\Psi^*_m(v_2) = \Psi^*_m(v_2)\,L^+_{jl}(v_1)\,b^*(v_1-v_2+1,\,P_{l,m}) + \Psi^*_l(v_2)\,L^+_{jm}(v_1)\,\bar c^*(v_1-v_2+1,\,P_{l,m}), \tag{6.11} \]

\[ \rho^{+*}(v_1-v_2+1)^{-1}\,L^+_{jm}(v_1)\,\Psi^*_l(v_2) = \Psi^*_l(v_2)\,L^+_{jm}(v_1)\,\bar b^*(v_1-v_2+1) + \Psi^*_m(v_2)\,L^+_{jl}(v_1)\,c^*(v_1-v_2+1,\,P_{l,m}), \tag{6.12} \]

for $1\le l<m\le N$ and $1\le j\le N$.

Let us investigate equations (6.7)–(6.9) in detail. From the component $j=m=N$ of Eq. (6.7), we have

\[ \Phi_N(v_2)\,K^+_N(v_1) = \rho^+\Big(v_1-v_2+\frac12\Big)\,K^+_N(v_1)\,\Phi_N(v_2). \tag{6.13} \]

Setting $1\le j<m=N$ in (6.7), we have

\[ \Phi_N(v_2)\,E^+_{N,j}(v_1) = E^+_{N,j}(v_1)\,\Phi_N(v_2). \tag{6.14} \]

The following relations turn out to be sufficient conditions for (6.14) to hold:

\[ \Phi_N(v_2)\,E_j(v_1) = E_j(v_1)\,\Phi_N(v_2) \quad (1\le j\le N-1). \tag{6.15} \]

Next let us consider the component l < m = j = N of Eq. (6.8). We set ρ + (v) = −

[v + 1] , ϕ(v)

ϕ(v) = (q r z)

N +1 rN

[v − 1]

(6.16) {pz}{pq 2N z}{q 2N+2 /z}{q −2 /z} . {pq 2N+2 z}{pq −2 z}{1/z}{q 2N /z}

(6.17)

Then (6.8) with m = j = N can be written as + ϕ(v1 − v2 + 1/2) N (v2 ) Fl,N (v1 )KN+ (v1 )

=

[Pl,N + hl,N − 1][Pl,N + hl,N + 1][v1 − v2 + 1/2] + Fl,N (v1 )KN+ (v1 )m (v2 ) [Pl,N + hl,N ]2 [Pl,N + hl,N + v1 − v2 + 1/2][1] + + (6.18) KN (v1 )l (v2 ) . [Pl,N + hl,N ]

In order to solve (6.18), let us assume that the operator product KN+ (v1 )N (v2 ) does not have a pole at v1 − v2 + 3/2 + r = 0. Later we will check that, for c = 1, this assumption is satisfied in a free field realization. Then from relations (6.13) and (6.16), we conclude that the product N (v2 )KN+ (v1 ) in the LHS of (6.18) has zero at v1 − v2 + 3/2 + r = 0. Therefore, setting v1 − v2 + 3/2 + r = 0 in (6.18), we have l (v2 ) = KN+ (v1 )−1

[Pj,N + hj,N + 1] + Fl,N (v1 ) KN+ (v1 )N (v2 ) [Pj,N + hj,N ]

+ = Fl,N (v2 − 1/2 − r) N (v2 )

(1 ≤ l ≤ N − 1).

(6.19)


Note that the shift of $v$ by $r$ in $F^+_{l,N}(v)$ yields a change of contour (see (6.42)). Substituting (6.13) and (6.19) into (6.8) for $l<m=j=N$, and using Riemann's theta identity, we find that (6.19) and the following relations are sufficient conditions for (6.8) with $l<m=j=N$:

\[ F_{N-1}(v_1)\,\Phi_N(v_2) = \frac{[v_1-v_2+\frac12]}{[v_1-v_2-\frac12]}\,\Phi_N(v_2)\,F_{N-1}(v_1), \tag{6.20} \]

\[ F_l(v_1)\,\Phi_N(v_2) = \Phi_N(v_2)\,F_l(v_1) \quad (1\le l\le N-2), \tag{6.21} \]

\[ [\Phi_N(v),\,P_{j,k}+h_{j,k}] = -\delta_{k,N}\,\Phi_N(v) \quad (j<k). \tag{6.22} \]

In the next section, we construct a free field realization of the type I vertex operators using relations (6.13), (6.15) and (6.19)–(6.22) for $c=1$. We then check that the resulting vertex operators satisfy the remaining relations in (6.8) and (6.9).

Similarly, from the $j=m=N$ component of (6.10), we have for the type II vertex operator

\[ K^+_N(v_1)\,\Psi^*_N(v_2) = \rho^{+*}(v_1-v_2+1)\,\Psi^*_N(v_2)\,K^+_N(v_1) \tag{6.23} \]

and from the $1\le j<m=N$ component of (6.10),

\[ F^+_{j,N}(v_1)\,\Psi^*_N(v_2) = \Psi^*_N(v_2)\,F^+_{j,N}(v_1) \quad (1\le j\le N-1). \tag{6.24} \]

We find the following as sufficient conditions for (6.24):

\[ F_j(v_1)\,\Psi^*_N(v_2) = \Psi^*_N(v_2)\,F_j(v_1) \quad (1\le j\le N-1). \tag{6.25} \]

To solve Eq. (6.11) with $l<j=m=N$, we assume that the product $\Psi^*_N(v_2)K^+_N(v_1)$ has no pole at $v_1-v_2+2+r^*=0$. Then the product $K^+_N(v_1)\Psi^*_N(v_2)$ in the LHS has a zero at $v_1-v_2+2+r^*=0$ for the same reason as in the type I case. Therefore, from (6.11) with $l<j=m=N$ and setting $v_1-v_2+2+r^*=0$, we have

\[ \Psi^*_l(v) = \Psi^*_N(v)\,E^+_{N,l}\Big(v-\frac{c+1}{2}-r^*\Big) \quad (1\le l\le N-1). \tag{6.26} \]

Then (6.26) and the following relations turn out to be sufficient conditions for (6.11) and (6.12):

\[ E_{N-1}(v_1)\,\Psi^*_N(v_2) = \frac{[v_1-v_2-\frac12]^*}{[v_1-v_2+\frac12]^*}\,\Psi^*_N(v_2)\,E_{N-1}(v_1), \tag{6.27} \]

\[ E_j(v_1)\,\Psi^*_N(v_2) = \Psi^*_N(v_2)\,E_j(v_1) \quad (1\le j\le N-2), \tag{6.28} \]

\[ [\Psi^*_N(v),\,P_{j,k}] = \delta_{k,N}\,\Psi^*_N(v) \quad (j<k). \tag{6.29} \]

6.2. Free field realizations. Now we construct a free field realization of the vertex operators, fixing the representation level $c=1$. For this purpose, we first consider the simple root operator $\alpha_j$ introduced in Sect. 3.4.1. We make the following standard central extension:

\[ [\alpha_j,\alpha_k] = i\pi A_{jk}. \tag{6.30} \]

Setting $\hat\alpha_j = \alpha_j+\bar\alpha_j$, where $\bar\alpha_j$ is an element of the Heisenberg algebra $\mathbb{C}\{\hat{\mathcal{H}}\}$, we have

\[ [\hat\alpha_j,\hat\alpha_k] = i\pi A_{jk} + \frac1r\big(\delta_{j,k+1}-\delta_{j,k-1}\big)\log q, \tag{6.31} \]

\[ [h_{\bar\epsilon_j},\hat\alpha_k] = -\delta_{j,k}+\delta_{j,k+1}, \tag{6.32} \]

\[ [Q_{\bar\epsilon_j},\hat\alpha_k] = -\frac1r\big(\delta_{j,k}+\delta_{j,k+1}\big)\log q, \tag{6.33} \]

\[ [Q_{\alpha_j},\hat\alpha_k] = \frac1r\big(\delta_{j,k+1}-\delta_{j,k-1}\big)\log q, \tag{6.34} \]

\[ [\hat\alpha_j,P_{\bar\epsilon_k}] = 0. \tag{6.35} \]

Then the following statement holds. Proposition 6.1. The currents Ej (v) and Fj (v) given by  [rm]q j j +1 (−Bm + Bm )(q N−j z)−m  : Ej (v) = : exp − m[r ∗ m]q 

m=0

Pαj −1

−Q

(6.36) eαˆ j zhj e αj (q −j +N z)− r ∗ ,   Pαj −1 hj 1 j j +1 Fj (v) = : exp  (−Bm + Bm )(q N−j z)−m  : e−αˆ j z−hj (q −j +N z) r + r , m m=0

(6.37) together with Hj± (v), Kj (v) given in (3.80)–(3.81) satisfy the commutation relations in Proposition 3.7 for level c = 1. Hence they give a free field realization of the level one elliptic algebra Uq,p ( slN ). Now substituting the free field realization of Ej (v), Fj (v), Kj (v) into (4.7)–(4.9), we obtain a realization of the half currents Ej+ (v), Fj+ (v), Kj+ (v) as well as the L+ (v) satisfying the RLL-relation (5.2) for c = 1. Using such a L-operator in operator L the “intertwining relations”, (6.13)–(6.22) for type I and (6.23)–(6.29) for type II, one can solve them for the vertex operators. The results are stated as follows. Theorem 6.2. The highest components of type I and type II vertex operators N (v), ∗ (v) are realized in terms of a free field by N  1 1 1 1 N −1 ¯ N −m  : eN −1 z(1− r )h¯N z− r P¯N z(1− r ) 2N , z Bm N (v) = : exp − m m=0   [rm]q ∗ N (v) = : exp  B N z−m  : m[r ∗ m]q m 

(6.38)

m=0

¯

1

1

e−N −1 z−h¯N eQ¯N z r ∗ P¯N z(1+ r ∗ )

N −1 2N

1

1 N −1 N

q( r − 2 )

,

(6.39)

where ¯ N−1 =

1 (αˆ 1 + 2αˆ 2 + · · · + (N − 1)αˆ N−1 ). N

(6.40)


For the other components of the type I vertex operator j (v) (j = 1, · · · , N ), we obtain from (6.19) N−1 dzm j (v) = aj,N N (v)FN−1 (vN−1 ) · · · Fj (vj ) 2πizm C m=j

×

N−1

[vm+1 − vm + Pj,m+1 + hj,m+1 − 21 ][1]

m=j

[vm+1 − vm + 21 ][Pj,m+1 + hj,m+1 ]

N−1 dzm = aj,N Fj (vj ) · · · FN−1 (vN−1 )N (v) 2πizm C m=j

×

N−1

[vm+1 − vm + Pj,m+1 + hj,m+1 − 21 ][1]

m=j

[vm+1 − vm − 21 ][Pj,m+1 + hj,m+1 ]

,

(6.41)

where v = vN and the integration contour C is specified by the condition |q −1 z| < |zN−1 | < |p−1 q −1 z|, |pqzm+1 | < |zm | < |qzm+1 | For the type II vertex

j∗ (v)

∗ j∗ (v) = aj,N

×

(j = 1, · · · , N), we obtain from (6.26),

N−1 C∗

(j ≤ m ≤ N − 2).

m=j

dzm ∗ Ej (vj ) · · · EN−1 (vN−1 )N (vN ) 2πizm

N−1

[vm+1 − vm − Pj,m+1 + 21 ]∗ [1]∗

m=j

[vm+1 − vm + 21 ]∗ [Pj,m+1 − 1]∗

∗ = aj,N

×

(6.42)

N−1 C ∗ m=j

dzm ∗ (vN )EN−1 (vN−1 ) · · · Ej (vj ) 2πizm N

N−1

[vm+1 − vm − Pj,m+1 + 21 ]∗ [1]∗

m=j

[vm+1 − vm − 21 ]∗ [Pj,m+1 − 1]∗

.

(6.43)

The integration contour C ∗ is specified as follows: |p∗ q −1 zm+1 |, |q −1 zm+1 | < |zm | < |qzm+1 |, |p∗−1 qzm+1 | (j ≤ m ≤ N − 1). (6.44) Here the integration variable zm (j ≤ m ≤ N −1) should encircle the poles p ∗ q −1 zm+1 , q −1 zm+1 but not the poles p ∗−1 qzm+1 , qzm+1 . In addition, we have the following commutation relations. ∗ (v) satisfy Proposition 6.3. The highest components N (v) and N ∗ [N (v), Pj1 ,j2 ] = [N (v), Pj1 ,j2 + hj1 ,j2 ] = 0, ∗ ∗ N (v1 )N (v2 ) = χ (v1 − v2 )N (v2 )N (v1 ), (qz) q 2N . χ (v) = z−1 q 2N (q/z)

(6.45) (6.46) (6.47)


Remark. The free field realizations of the vertex operators in Theorem 6.2 are essen(1) tially the same as those of the AN−1 -type face model obtained in [16, 17]. There are two differences between ours and those in [16, 17]; the choice of the gauge expressing the R-matrices and the zero-mode operators. Due to our gauge, we have the extra fac' 'N−1 [1] [1]∗ tors N−1 m=j [Pj,m+1 +hj,m+1 ] and m=j [Pj,m+1 −1]∗ in type I and type II vertex operators, respectively. As for the zero-modes, the correspondence between ours P¯j , Q¯j , hj , αˆ j and those in [16, 17], Pαj , PωN , Qαj , QωN is given by (

r∗ P  ( r αj  r∗  r PωN (  r  r ∗ Pαj

(

r r ∗ PωN





    ↔    

r∗ 1 r hj − r Pαj ∗ r 1 r h¯N − r P¯N 1 hj − r ∗ Pαj h¯N − r1∗ P¯N

 ( ∗ i r Q  ( r αj  ∗   i rr QωN ,  (    i rr∗ Qαj  ( i rr∗ QωN 



  αˆ j    ¯ N−1 ↔   αˆ j − Qαj  ¯ N−1 − Q¯N 

   . (6.48) 

One should note that to define the currents Kj (v) by factoring the operators Hj± (v), the use of two sets of the Heisenberg operators {P¯j , Q¯j } and {hj , αˆ j } is essential. 6.3. Commutation relations. We next investigate commutation relations of the vertex operators and show that our realization satisfies the full intertwining relations for c = 1. Theorem 6.4. The free field realizations of the type-I vertex operator µ (v) (6.41) and the type-II vertex operator µ∗ (v) (6.43) satisfy the following commutation relations: j2 (v2 )j1 (v1 ) = j∗1 (v1 )j∗2 (v2 ) =

N j1 ,j2 =1 N j1 ,j2 =1

j j

Rj11j22 (v1 − v2 , P + h) j (v1 )j (v2 ), 1

2

∗j j

j∗ (v2 )j∗ (v1 ) Rj j1 2 (v1 − v2 , P ), 2

1

(6.49)

(6.50)

1 2

$$\Phi_j(v_1)\Psi^*_k(v_2) = \chi(v_1 - v_2)\, \Psi^*_k(v_2)\Phi_j(v_1). \tag{6.51}$$

Here we set

$$R(v, P + h) = \mu(v)\bar R(v, P + h), \qquad R^*(v, P) = \mu^*(v)\bar R^*(v, P), \tag{6.52}$$

with

$$\mu(v) = z^{(\frac{1}{r} - 1)\frac{N-1}{N}}\, \frac{\{pq^{2N-2}z\}\{q^2z\}\{p/z\}\{q^{2N}/z\}}{\{pz\}\{q^{2N}z\}\{pq^{2N-2}/z\}\{q^2/z\}}, \tag{6.53}$$

and

$$\mu^*(v) = \mu(v)|_{r \to r^*}. \tag{6.54}$$
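As a consistency check (a sketch only, not part of the original argument), the theta-function part of $\mu(v)$ in (6.53) can be recovered from the two-point function (A.31) of the boson part $\phi_N$ of the type-I vertex operator listed in Appendix A. The identification $z = z_1/z_2$ with $v = v_1 - v_2$, and the attribution of the fractional power of $z$ to the zero modes, are assumptions made here for illustration.

```latex
% Assumptions: z = z_1/z_2 = q^{2(v_1 - v_2)}, and {x} = (x; p, q^{2N})_\infty.
\begin{align*}
\langle \phi_N(v_1)\phi_N(v_2) \rangle
  &= \frac{\{pq^{2N-2}/z\}\{q^{2}/z\}}{\{q^{2N}/z\}\{p/z\}}, \\
\langle \phi_N(v_2)\phi_N(v_1) \rangle
  &= \frac{\{pq^{2N-2}z\}\{q^{2}z\}}{\{q^{2N}z\}\{pz\}}, \\
\frac{\langle \phi_N(v_2)\phi_N(v_1) \rangle}{\langle \phi_N(v_1)\phi_N(v_2) \rangle}
  &= \frac{\{pq^{2N-2}z\}\{q^{2}z\}\{p/z\}\{q^{2N}/z\}}
          {\{pz\}\{q^{2N}z\}\{pq^{2N-2}/z\}\{q^{2}/z\}}
   = z^{-(\frac{1}{r}-1)\frac{N-1}{N}}\,\mu(v).
\end{align*}
```

The ratio reproduces the eight curly-bracket factors of (6.53) exactly, which is what one expects if the scalar factor of the exchange relation for the type-I vertex operators comes from reordering the bosonic exponentials.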

Proof. Using the formulae (6.19)–(6.22) and (6.26)–(6.29), the commutation relations (6.49) and (6.50) are reduced to the relations among the half currents (B.1), (B.2), (4.19) and (4.20). Then the proofs of the latter relations are given in Appendix B.


T. Kojima, H. Konno

Let us consider the relation (6.51). The case $j = N$ or $k = N$ is a direct consequence of (6.14), (6.24) and (6.46). The simplest non-trivial case is $j = k = N-1$. From (6.19), (6.26) and (3.84), we have the following equation after integrating the delta functions:

$$\begin{aligned}
&\Phi_{N-1}(v_1)\Psi^*_{N-1}(v_2) - \chi(v_1 - v_2)\, \Psi^*_{N-1}(v_2)\Phi_{N-1}(v_1) \\
&\quad = -\frac{a_{N-1,N}\, a^*_{N-1,N}}{q - q^{-1}}\, \Phi_N(v_1)\Psi^*_N(v_2) \\
&\qquad \times \Biggl( \oint_{C_+} \frac{dz'}{2\pi i z'}\, H^+_{N-1}\bigl(v' + \tfrac14\bigr)\, \frac{[v_2 - v' - P_{N-1,N}]^*[1]^*\,[v_1 - v' + P_{N-1,N} + h_{N-1,N} - \frac12][1]}{[v_2 - v' - 1]^*[P_{N-1,N} - 1]^*\,[v_1 - v' + \frac12][P_{N-1,N} + h_{N-1,N}]} \\
&\qquad\quad - \oint_{C_-} \frac{dz'}{2\pi i z'}\, H^-_{N-1}\bigl(v' - \tfrac14\bigr)\, \frac{[v_2 - v' - P_{N-1,N} + 1]^*[1]^*\,[v_1 - v' + P_{N-1,N} + h_{N-1,N} - \frac12][1]}{[v_2 - v']^*[P_{N-1,N} - 1]^*\,[v_1 - v' + \frac12][P_{N-1,N} + h_{N-1,N}]} \Biggr).
\end{aligned} \tag{6.55}$$

The contours are specified by

$$C_+ : |qz_1|, |q^{-2}z_2| < |z'| < |p^{-1}qz_1|, |p^{*-1}q^{-2}z_2|, |p^{-1}q^{-1}z_1|, |p^{*-1}z_2|, \tag{6.56}$$

$$C_- : |qz_1|, |z_2| < |z'| < |p^{-1}qz_1|, |p^{*-1}z_2|, |q^{-1}z_1|, |q^2z_2|. \tag{6.57}$$

Here the conditions $|z'| < |p^{-1}q^{-1}z_1|, |p^{*-1}z_2|$ for $C_+$ and $|z'| < |q^{-1}z_1|, |q^2z_2|$ for $C_-$ are added because of the convergence of the operator products $\Phi_N(v_1)\Psi^*_N(v_2)H^+_{N-1}(v' + 1/4)$ and $\Phi_N(v_1)\Psi^*_N(v_2)H^-_{N-1}(v' - 1/4)$, respectively. Changing the integration variable $z' \to pz'$ in the second term and using the periodicity of $[v]$, $[v]^*$ and the relation $H^-_{N-1}(v + r - \frac14) = H^+_{N-1}(v + \frac14)$, we see that the integrand in the second term coincides with the one in the first term, but the contour in the second term is changed to

$$\tilde C_- : |p^{-1}qz_1|, |p^{*-1}q^{-2}z_2| < |z'| < |p^{-2}qz_1|, |p^{*-2}q^{-2}z_2|, |p^{-1}q^{-1}z_1|, |p^{*-1}z_2|.$$

Here $\tilde C_-$ encircles the same poles as $C_+$. In addition, $\tilde C_-$ would encircle two extra poles at $z' = p^{-1}qz_1,\ p^{*-1}q^{-2}z_2$, if the operator product $\Phi_N(v_1)\Psi^*_N(v_2)H^+_{N-1}(v' + \frac14)$ had no zeros which cancel these extra poles. In fact, the operator product does have zeros at the required points. Therefore the RHS of (6.55) vanishes. The proof for the general $1 \le j_1, j_2 \le N-1$ case is similar.

Now let us investigate the intertwining relation for level $c = 1$. For this purpose, we remind the reader of the fact that in the trigonometric case, i.e. $U_q(\widehat{sl}_N)$, the L-operator can be constructed as a composition of type I and II vertex operators [19, 20]. The following theorem is an elliptic analogue of such a construction.

Theorem 6.5. For $c = 1$, the L-operator $\hat L^+(v)$ is given by a product of the type-I and type-II vertex operators,

$$\hat L^+_{jk}(v) = \frac{1}{g_N}\, \Psi^*_k(v + r)\, \Phi_j\bigl(v + r + \tfrac12\bigr) \qquad (1 \le j, k \le N). \tag{6.58}$$


Here we set

$$g_N = \frac{(q^2; q^{2N})_\infty}{(q^{2N}; q^{2N})_\infty}. \tag{6.59}$$

Proof. For the special component $j = k = N$ of (6.58), we have

$$K_N^+(v) = g_N^{-1}\, \Psi^*_N(v + r)\, \Phi_N\bigl(v + r + \tfrac12\bigr). \tag{6.60}$$

This is a direct consequence of (6.38) and (6.39). Let us consider the j < k = N component of (6.58). After a few calculations using (6.24) and (6.60), we can reduce this to relation (6.19). Similarly, the N = j > k component of (6.58) is reduced to relation (6.26). Next, let us study the simplest non-trivial component j = k = N − 1. From (6.19), (6.24)–(6.28), (3.89)–(3.90) and (3.84), we have the following equation after integrating the delta functions: −1 ∗ + + gN N−1 (v + r)N−1 (v + r + 1/2) − FN−1,N (v)KN+ (v)EN,N −1 (v)

=

∗ aN−1,N aN−1,N

q − q −1 dz

1 +

v + × H KN+ (v)

N −1 4 C+ 2πiz [v − v − PN−1,N + 1]∗ [1]∗ [v − v + PN−1,N + hN−1,N ][1] × [v − v + 1]∗ [PN−1,N − 1]∗ [v − v ][PN−1,N + hN−1,N ] dz

1 −

− v H − KN+ (v)

N−1 2πiz 4 C− [v − v − PN−1,N + 2]∗ [1]∗ [v − v + PN−1,N + hN−1,N ][1] × . [v − v + 2]∗ [PN−1,N − 1]∗ [v − v ][PN−1,N + hN−1,N ]

(6.61)

Here the contours are

$$C_+ : |pz|, |p^*z| < |z'| < |z|, \tag{6.62}$$

$$C_- : |pz| < |z'| < |z|, |q^2z|. \tag{6.63}$$

Changing the integration variable $z' \to pz'$ in the second term and using the periodicity of $[v]$, $[v]^*$ and the relation $H^-_{N-1}(v + r - 1/4) = H^+_{N-1}(v + 1/4)$, we see that the integrand in the second term coincides with the first term, but the contour in the second term is changed to

$$\tilde C_- : |z| < |z'| < |p^{-1}z|, |p^{*-1}z|. \tag{6.64}$$

The contour $\tilde C_-$ encircles the same poles as $C_+$ together with one additional pole at $z' = z$. Hence the RHS of (6.61) becomes the residue at $z' = z$. We thus obtain

$$g_N^{-1}\, \Psi^*_{N-1}(v + r)\, \Phi_{N-1}(v + r + 1/2) = F^+_{N-1,N}(v) K_N^+(v) E^+_{N,N-1}(v) + K^+_{N-1}(v). \tag{6.65}$$

The RHS coincides with the $(N-1, N-1)$ component of $\hat L^+(v)$. The proof of the general case $1 \le j, k \le N-1$ is similar.


Corollary 6.6. For $c = 1$, the L-operator $\hat L^+(v)$ satisfies the $RLL = LLR^*$ relation (5.2).

Proof. Let us substitute the expression (6.58) into the LHS of (5.2). Then using the commutation relations of the vertex operators (6.49)–(6.51) and the formula

$$\frac{\rho^+(v)}{\rho^{+*}(v)} = \frac{\mu(v)}{\mu^*(v)}\, \frac{\chi(\frac12 - v)}{\chi(\frac12 + v)}, \tag{6.66}$$

one gets the desired result.

In the same way, we have ∗ (v) V (v), Corollary 6.7. For c = 1, the type-I and the type II vertex operators V N ∼ satisfy the full intertwining relations (6.3) and (6.4) with V = W = C . Acknowledgements. The authors would like to thank Robert Weston and colleagues in the Department of Mathematics, Heriot-Watt University, where a part of this work was done, for their kind hospitality. H.K is also grateful to JSPS and the Royal Society for the exchange fellowship. This work is also supported by Grant-in-Aid for Scientific Research (C) (11640030, 14540028) and Grant-in-Aid for Young Scientist (B) (14740107) from the Ministry of Education, Science, Sports and Culture.

A. Operator Product Expansions

Here we list formulae of operator product expansions (OPE) used in Sect. 3.3 and 6.2. For operators $A(z)$, $B(z)$, we write

$$A(z_1)B(z_2) = \langle A(z_1)B(z_2) \rangle\, {:}A(z_1)B(z_2){:}\, .$$

(I) In Sect. 3.3, we used the OPE's of the currents $\psi_j^\pm(z, p)$ (3.24)–(3.25) and $k_j(z, p)$ (3.43) for generic $c$:

\begin{align*}
\langle k_j(z_1,p) k_j(z_2,p) \rangle &= \frac{\{pq^2z_2/z_1\}\{pq^{2N-2}z_2/z_1\}\{p^*q^{2N}z_2/z_1\}^*\{p^*z_2/z_1\}^*}{\{pq^{2N}z_2/z_1\}\{pz_2/z_1\}\{p^*q^2z_2/z_1\}^*\{p^*q^{2N-2}z_2/z_1\}^*}, \tag{A.1}\\
\langle k_{j_1}(z_1,p) k_{j_2}(z_2,p) \rangle &= \frac{\{pq^{2N+2}z_2/z_1\}\{pq^{2N-2}z_2/z_1\}\{p^*q^{2N}z_2/z_1\}^{*2}}{\{pq^{2N}z_2/z_1\}^2\{p^*q^{2N+2}z_2/z_1\}^*\{p^*q^{2N-2}z_2/z_1\}^*} \quad (j_1 < j_2), \tag{A.2}\\
\langle k_{j_2}(z_1,p) k_{j_1}(z_2,p) \rangle &= \frac{\{pq^{-2}z_2/z_1\}\{pq^{2}z_2/z_1\}\{p^*z_2/z_1\}^{*2}}{\{pz_2/z_1\}^2\{p^*q^{2}z_2/z_1\}^*\{p^*q^{-2}z_2/z_1\}^*} \quad (j_1 < j_2), \tag{A.3}\\
\langle \psi_j^+(z_1,p) \psi_j^+(z_2,p) \rangle &= \frac{(pq^2z_2/z_1; p)_\infty (p^*q^{-2}z_2/z_1; p^*)_\infty}{(pq^{-2}z_2/z_1; p)_\infty (p^*q^{2}z_2/z_1; p^*)_\infty}, \tag{A.4}\\
\langle \psi_j^+(z_1,p) \psi_{j+1}^+(z_2,p) \rangle &= \frac{(pq^{-1}z_2/z_1; p)_\infty (p^*qz_2/z_1; p^*)_\infty}{(pqz_2/z_1; p)_\infty (p^*q^{-1}z_2/z_1; p^*)_\infty}, \tag{A.5}\\
\langle \psi_{j+1}^+(z_1,p) \psi_j^+(z_2,p) \rangle &= \frac{(pq^{-1}z_2/z_1; p)_\infty (p^*qz_2/z_1; p^*)_\infty}{(pqz_2/z_1; p)_\infty (p^*q^{-1}z_2/z_1; p^*)_\infty}. \tag{A.6}
\end{align*}

(II) In Sect. 6.2, we used the OPE's of the currents $\Phi_N(v)$, $\Psi^*_N(v)$, $E_j(v)$, $F_j(v)$, $K_j^+(v)$ for $c = 1$. We here list their boson part only. Namely, let us define the boson part of them by

\begin{align*}
\phi_N(v) &= {:}\exp\Bigl( -\sum_{m \neq 0} \frac{1}{m} B^N_m z^{-m} \Bigr){:}, \tag{A.7}\\
\psi^*_N(v) &= {:}\exp\Bigl( \sum_{m \neq 0} \frac{[rm]_q}{m[r^*m]_q} B^N_m z^{-m} \Bigr){:}, \tag{A.8}\\
e_j(v) &= {:}\exp\Bigl( -\sum_{m \neq 0} \frac{[rm]_q}{m[r^*m]_q} (-B^j_m + B^{j+1}_m)(q^{N-j}z)^{-m} \Bigr){:}, \tag{A.9}\\
f_j(v) &= {:}\exp\Bigl( \sum_{m \neq 0} \frac{1}{m} (-B^j_m + B^{j+1}_m)(q^{N-j}z)^{-m} \Bigr){:}. \tag{A.10}
\end{align*}

Then the OPE's of them are given by

\begin{align*}
\langle K_j^+(v_1) f_j(v_2) \rangle &= \frac{(q^{N-j+1}z_2/z_1; p)_\infty}{(q^{N-j-1}z_2/z_1; p)_\infty}, \tag{A.11}\\
\langle K_{j+1}^+(v_1) f_j(v_2) \rangle &= \frac{(q^{N-j-3}z_2/z_1; p)_\infty}{(q^{N-j-1}z_2/z_1; p)_\infty}, \tag{A.12}\\
\langle f_j(v_1) K_j^+(v_2) \rangle &= \frac{(q^{-N+j-1}z_2/z_1; p)_\infty}{(q^{-N+j-3}z_2/z_1; p)_\infty}, \tag{A.13}\\
\langle f_j(v_1) K_{j+1}^+(v_2) \rangle &= \frac{(q^{-N+j-1}z_2/z_1; p)_\infty}{(q^{-N+j+1}z_2/z_1; p)_\infty}, \tag{A.14}\\
\langle K_j^+(v_1) e_j(v_2) \rangle &= \frac{(q^{N-j-2}z_2/z_1; p^*)_\infty}{(q^{N-j}z_2/z_1; p^*)_\infty}, \tag{A.15}\\
\langle K_{j+1}^+(v_1) e_j(v_2) \rangle &= \frac{(q^{N-j-2}z_2/z_1; p^*)_\infty}{(q^{N-j-4}z_2/z_1; p^*)_\infty}, \tag{A.16}\\
\langle e_j(v_1) K_j^+(v_2) \rangle &= \frac{(q^{-N+j-4}z_2/z_1; p^*)_\infty}{(q^{-N+j-2}z_2/z_1; p^*)_\infty}, \tag{A.17}\\
\langle e_j(v_1) K_{j+1}^+(v_2) \rangle &= \frac{(q^{-N+j}z_2/z_1; p^*)_\infty}{(q^{-N+j-2}z_2/z_1; p^*)_\infty}, \tag{A.18}\\
\langle \phi_N(v_1) K_N^+(v_2) \rangle &= \frac{\{pq^3z_2/z_1\}\{pq^{2N-1}z_2/z_1\}}{\{pqz_2/z_1\}\{pq^{2N+1}z_2/z_1\}}, \tag{A.19}\\
\langle K_N^+(v_2) \phi_N(v_1) \rangle &= \frac{\{qz_1/z_2\}\{q^{2N-3}z_1/z_2\}}{\{q^{-1}z_1/z_2\}\{q^{2N-1}z_1/z_2\}}, \tag{A.20}\\
\langle \phi_N(v_1) K_j^+(v_2) \rangle &= \frac{\{pq^3z_2/z_1\}\{pq^{-1}z_2/z_1\}}{\{pqz_2/z_1\}^2}, \tag{A.21}\\
\langle K_j^+(v_2) \phi_N(v_1) \rangle &= \frac{\{q^{2N+1}z_1/z_2\}\{q^{2N-3}z_1/z_2\}}{\{q^{2N-1}z_1/z_2\}^2}, \tag{A.22}\\
\langle \psi^*_N(v_1) K_N^+(v_2) \rangle &= \frac{\{p^*q^2z_2/z_1\}^*\{p^*q^{2N+2}z_2/z_1\}^*}{\{p^*q^4z_2/z_1\}^*\{p^*q^{2N}z_2/z_1\}^*}, \tag{A.23}\\
\langle K_N^+(v_2) \psi^*_N(v_1) \rangle &= \frac{\{q^{-2}z_1/z_2\}^*\{q^{2N-2}z_1/z_2\}^*}{\{z_1/z_2\}^*\{q^{2N-4}z_1/z_2\}^*}, \tag{A.24}\\
\langle \psi^*_N(v_1) K_j^+(v_2) \rangle &= \frac{\{p^*q^2z_2/z_1\}^{*2}}{\{p^*q^4z_2/z_1\}^*\{p^*z_2/z_1\}^*}, \tag{A.25}\\
\langle K_j^+(v_2) \psi^*_N(v_1) \rangle &= \frac{\{q^{2N-2}z_1/z_2\}^{*2}}{\{q^{2N}z_1/z_2\}^*\{q^{2N-4}z_1/z_2\}^*}, \tag{A.26}\\
\langle f_{N-1}(v_1) \phi_N(v_2) \rangle &= \frac{(pq^{-1}z_2/z_1; p)_\infty}{(qz_2/z_1; p)_\infty}, \tag{A.27}\\
\langle \phi_N(v_1) f_{N-1}(v_2) \rangle &= \frac{(pq^{-1}z_2/z_1; p)_\infty}{(qz_2/z_1; p)_\infty}, \tag{A.28}\\
\langle e_{N-1}(v_1) \psi^*_N(v_2) \rangle &= \frac{(p^*qz_2/z_1; p^*)_\infty}{(q^{-1}z_2/z_1; p^*)_\infty}, \tag{A.29}\\
\langle \psi^*_N(v_1) e_{N-1}(v_2) \rangle &= \frac{(p^*qz_2/z_1; p^*)_\infty}{(q^{-1}z_2/z_1; p^*)_\infty}, \tag{A.30}\\
\langle \phi_N(v_1) \phi_N(v_2) \rangle &= \frac{\{pq^{2N-2}z_2/z_1\}\{q^2z_2/z_1\}}{\{q^{2N}z_2/z_1\}\{pz_2/z_1\}}, \tag{A.31}\\
\langle \psi^*_N(v_2) \psi^*_N(v_1) \rangle &= \frac{\{p^*q^{2N}z_2/z_1\}^*\{z_2/z_1\}^*}{\{p^*q^2z_2/z_1\}^*\{q^{2N-2}z_2/z_1\}^*}, \tag{A.32}\\
\langle \phi_N(v_1) \psi^*_N(v_2) \rangle &= \frac{(q^{2N-1}z_2/z_1; q^{2N})_\infty}{(qz_2/z_1; q^{2N})_\infty}, \tag{A.33}\\
\langle \psi^*_N(v_1) \phi_N(v_2) \rangle &= \frac{(q^{2N-1}z_2/z_1; q^{2N})_\infty}{(qz_2/z_1; q^{2N})_\infty}, \tag{A.34}\\
\langle e_j(v_1) e_j(v_2) \rangle &= \frac{(z_2/z_1; p^*)_\infty (q^{-2}z_2/z_1; p^*)_\infty}{(p^*q^2z_2/z_1; p^*)_\infty (p^*z_2/z_1; p^*)_\infty}, \tag{A.35}\\
\langle e_j(v_1) e_{j+1}(v_2) \rangle &= \frac{(p^*qz_2/z_1; p^*)_\infty}{(q^{-1}z_2/z_1; p^*)_\infty}, \tag{A.36}\\
\langle e_{j+1}(v_1) e_j(v_2) \rangle &= \frac{(p^*qz_2/z_1; p^*)_\infty}{(q^{-1}z_2/z_1; p^*)_\infty}, \tag{A.37}\\
\langle f_j(v_1) f_j(v_2) \rangle &= \frac{(z_2/z_1; p)_\infty (q^2z_2/z_1; p)_\infty}{(pz_2/z_1; p)_\infty (pq^{-2}z_2/z_1; p)_\infty}, \tag{A.38}\\
\langle f_j(v_1) f_{j+1}(v_2) \rangle &= \frac{(pq^{-1}z_2/z_1; p)_\infty}{(qz_2/z_1; p)_\infty}, \tag{A.39}\\
\langle f_{j+1}(v_1) f_j(v_2) \rangle &= \frac{(pq^{-1}z_2/z_1; p)_\infty}{(qz_2/z_1; p)_\infty}. \tag{A.40}
\end{align*}
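The two mixed OPE's of $\phi_N$ and $\psi^*_N$ in the list above, the pair with right-hand side $(q^{2N-1}z_2/z_1; q^{2N})_\infty / (qz_2/z_1; q^{2N})_\infty$, can be used to cross-check the theta-function part of the factor $\chi$ in the exchange relation (6.51). The sketch below assumes the standard convention $\Theta_s(x) = (x; s)_\infty (s/x; s)_\infty (s; s)_\infty$ and $z = z_1/z_2$; it tracks only the boson parts, so any power of $z$ contributed by zero modes is not visible here.

```latex
% Assumptions: z = z_1/z_2, \Theta_s(x) = (x;s)_\infty (s/x;s)_\infty (s;s)_\infty.
\begin{align*}
\langle \phi_N(v_1)\psi^*_N(v_2) \rangle
  &= \frac{(q^{2N-1}/z;\, q^{2N})_\infty}{(q/z;\, q^{2N})_\infty}, \qquad
\langle \psi^*_N(v_2)\phi_N(v_1) \rangle
   = \frac{(q^{2N-1}z;\, q^{2N})_\infty}{(qz;\, q^{2N})_\infty}, \\
\frac{\langle \phi_N(v_1)\psi^*_N(v_2) \rangle}{\langle \psi^*_N(v_2)\phi_N(v_1) \rangle}
  &= \frac{(qz;\, q^{2N})_\infty\, (q^{2N}/(qz);\, q^{2N})_\infty}
          {(q/z;\, q^{2N})_\infty\, (q^{2N}/(q/z);\, q^{2N})_\infty}
   = \frac{\Theta_{q^{2N}}(qz)}{\Theta_{q^{2N}}(q/z)} .
\end{align*}
```

The second expression follows from the first by exchanging the labels 1 and 2, and the common factor $(q^{2N}; q^{2N})_\infty$ cancels in the final ratio.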


B. Proof of the Relations (4.17)–(4.18) and (4.19)–(4.20)

Let us consider the relations

$$K_l^+(v_2)^{-1} E^+_{l,j}(v_1) K_l^+(v_2) E^+_{l,j}(v_2) = K_l^+(v_1)^{-1} E^+_{l,j}(v_2) K_l^+(v_1) E^+_{l,j}(v_1), \tag{B.1}$$

$$F^+_{j,l}(v_1) K_l^+(v_1) F^+_{j,l}(v_2) K_l^+(v_1)^{-1} = F^+_{j,l}(v_2) K_l^+(v_2) F^+_{j,l}(v_1) K_l^+(v_2)^{-1}, \tag{B.2}$$

for $1 \le j < l \le N$. Then the relations (4.17) and (4.18) follow from these relations and (4.15), (4.16). In this Appendix, we give proofs of the relations (B.2) and (4.20). The proofs of the other cases (B.1) and (4.19) are similar. Let us set

$$f(v, w) = \frac{[v + \frac12 - w]}{[v - \frac12]}, \tag{B.3}$$

$$h(v) = \frac{[v - 1]}{[v + 1]}. \tag{B.4}$$

Recall that the half current $F^+_{j,l}(v)$ is given by

$$F^+_{j,l}(v) = a_{j,l} \oint_{C(j,l)} \prod_{m=j}^{l-1} \frac{dz_m}{2\pi i z_m}\, F_{l-1}(v_{l-1}) F_{l-2}(v_{l-2}) \cdots F_j(v_j) \prod_{m=j}^{l-1} f(v_m - v_{m+1}, \pi_{m+1,j})\, \frac{[1]}{[\pi_{m+1,j} - \delta_{m,l-1}]}, \tag{B.5}$$

where we set $v_l = v + \frac{l-N-1}{2}$. Recall also $z_m = q^{2v_m}$ and $\pi_{l,j} = P_{j,l} + h_{j,l}$. We call $F_{l-1}(v_{l-1}) F_{l-2}(v_{l-2}) \cdots F_j(v_j)$ the operator part, and $\prod_{m=j}^{l-1} f(v_m - v_{m+1}, \pi_{m+1,j}) \frac{[1]}{[\pi_{m+1,j} - \delta_{m,l-1}]}$ the coefficient part. We keep coefficient parts to the right of operator parts. In the coefficient part, we represent $\prod_{m=j}^{l-1} f(v_m - v_{m+1}, \pi_{m+1,j})$ by the diagram

$$v_j \xrightarrow{\ \pi_{j+1,j}\ } v_{j+1} \xrightarrow{\ \pi_{j+2,j}\ } \cdots \xrightarrow{\ \pi_{l-1,j}\ } v_{l-1} \xrightarrow{\ \pi_{l,j}\ } v_l .$$
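To make the diagrammatic shorthand concrete, the following sketch unpacks the shortest non-trivial chain. It reads each arrow from $v_m$ to $v_{m+1}$ with label $\pi_{m+1,j}$ as one factor $f(v_m - v_{m+1}, \pi_{m+1,j})$ and uses the definition (B.3) of $f$; the choice $l = j+2$ is made here purely for illustration.

```latex
% Illustration only: the case l = j + 2 of the chain diagram.
\begin{align*}
v_j \xrightarrow{\ \pi_{j+1,j}\ } v_{j+1} \xrightarrow{\ \pi_{j+2,j}\ } v_{j+2}
 \;&\longleftrightarrow\;
 f(v_j - v_{j+1}, \pi_{j+1,j})\, f(v_{j+1} - v_{j+2}, \pi_{j+2,j}) \\
 &= \frac{[v_j - v_{j+1} + \tfrac12 - \pi_{j+1,j}]\,[v_{j+1} - v_{j+2} + \tfrac12 - \pi_{j+2,j}]}
         {[v_j - v_{j+1} - \tfrac12]\,[v_{j+1} - v_{j+2} - \tfrac12]} .
\end{align*}
```

Longer chains simply append one such ratio of theta-function symbols per arrow.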

According to the relation (3.83) with $i = j$, we have the equality

$$\oint \frac{dz_j}{2\pi i z_j} \oint \frac{dz'_j}{2\pi i z'_j}\, F_j(v_j) F_j(v'_j)\, A(v_j, v'_j) = \oint \frac{dz_j}{2\pi i z_j} \oint \frac{dz'_j}{2\pi i z'_j}\, F_j(v_j) F_j(v'_j)\, h(v'_j - v_j)\, A(v'_j, v_j),$$

when the integration contours for $z_j$ and $z'_j$ are the same. We define “weak equality” in the following sense [16]. The two coefficient functions $A(v_j, v'_j)$ and $B(v_j, v'_j)$ coupled to $F_j(v_j) F_j(v'_j)$ in integrals are equal in the weak sense if

$$A(v_j, v'_j) + h(v'_j - v_j) A(v'_j, v_j) = B(v_j, v'_j) + h(v'_j - v_j) B(v'_j, v_j).$$
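The reason this condition is sufficient can be spelled out in one line. In the sketch below, $\tilde A$ is a shorthand introduced here (not in the original) for the coefficient that appears on the right-hand side of the integral equality above, namely $A$ with its two arguments exchanged and multiplied by the $h$-factor; $\oint\oint$ abbreviates the double contour integral with its measure.

```latex
% \tilde A: hypothetical shorthand for "h-factor times A with arguments exchanged".
\begin{align*}
I_A &:= \oint\!\!\oint F_j(v_j)F_j(v'_j)\, A(v_j, v'_j)
      = \oint\!\!\oint F_j(v_j)F_j(v'_j)\, \tilde A(v_j, v'_j) \\
    &\;= \frac12 \oint\!\!\oint F_j(v_j)F_j(v'_j)\,
         \bigl( A(v_j, v'_j) + \tilde A(v_j, v'_j) \bigr), \\
A + \tilde A = B + \tilde B
    &\;\Longrightarrow\; I_A = I_B .
\end{align*}
```

So the integral depends on the coefficient function only through the combination $A + \tilde A$, and two coefficient functions that agree in the weak sense give the same integrated operator.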


We write the weak equality as $A(v_j, v'_j) \sim B(v_j, v'_j)$. To prove the equalities (B.2) and (4.20), it is enough to show the equalities of coefficient parts in the weak sense. Let us recall the following two lemmas [16].

Lemma B.1. The coefficient function
vj

πj +1,j −1

πj +2,j −1

- vj +1

-

πl−2,j −1

-

···

HH 0 HH0 HH H j j πj +1,j H πj +2,jH - vj +1 vj

vl−2

HH j H πl−2,µ 0

···

πl−1,j −1

-

HH j H πl−1,µ -

vl−1

0

vl−2

(B.6)

vl−1

is invariant in the weak sense when vl−1 and vl−1 are exchanged.

Lemma B.2.

vk−1

πk,j +1

- vk

Q

0QQ s

∼

πk+1,j

- vk+1

Q

Q 0

vk

Q s Q - v

k+1

πk+1,k

πk+2,j

- ···

Q

Q 0 Q s Q - ··· π

πl−2,j

- vl−2

Q

k+2,k

Q

Q s Q - v

l−2

0

πl−2,k

πl−1,j

- vl−1

Q

Q

0

Q s - v

l−1

πl−1,k

1

,π ) β(vl−1 − vl−1 k,j

vk−1

πk,j

- v

k

×

πk+1,j

- v

k+1 3

,π ) γ (vl−1 − vl−1 k,j −

β(vl−1 − vl−1 , πk,j ) πk,j

- v

vk−1 k

×

···

πl−2,j

- v

l−2 3

πl−1,j

- vl−1 3

0

πk+1,k

πk+1,j

- v

k+1 3

vk

-

3 0 0 0 - vk+1 - ··· - vl−2 - v

l−1 π π π

vk

πk+2,j

0

- vk+1

πk+1,k

k+2,k

πk+2,j

-

l−2,k

···

πl−2,j

- v

l−2 3

l−1,k

πl−1,j

- v

l−1 3

3 0 0 0 - ··· - vl−2 - vl−1 , π π π

k+2,k

l−2,k

l−1,k

(B.7)

where

$$\beta(v, w) = \frac{[v][w - 1]}{[v + 1][w]}, \qquad \gamma(v, w) = \frac{[v + w][1]}{[v + 1][w]}.$$
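The proof below exchanges two spectral parameters inside $\beta$ at the cost of a factor $h$. A short check of the underlying identity, using only (B.4), the definition of $\beta$ above, and the oddness $[-v] = -[v]$ of the theta-function symbol (assumed here as the standard property of $[v]$):

```latex
% Assumption: [-v] = -[v].
\begin{align*}
\beta(-u, w) &= \frac{[-u][w-1]}{[-u+1][w]} = \frac{[u][w-1]}{[u-1][w]}, \\
h(u)\,\beta(-u, w) &= \frac{[u-1]}{[u+1]} \cdot \frac{[u][w-1]}{[u-1][w]}
                    = \frac{[u][w-1]}{[u+1][w]} = \beta(u, w), \\
h(u)\,h(-u) &= \frac{[u-1]}{[u+1]} \cdot \frac{[-u-1]}{[-u+1]} = 1 .
\end{align*}
```

The last line shows that applying the exchange twice returns the original coefficient, as it must for the weak equality to be well defined.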

Now let us show the relation (B.2). By using (3.91)–(3.93), (3.83), (4.3) and (4.5) in

)F the LHS of (B.2), the operator part can be arranged to Fl−1 (vl−1 )Fl−1 (vl−1 l−2 (vl−2 )


) · · · F (v )F (v ). Then the coefficient part is given by the product of the Fl−2 (vl−2 j j j j ' [1]2 factors l−1 m=j [πm+1,j −1−2δk,l−1 ][πm+1,j −2δk,l−1 ] and the one represented by the diagram πj +1,j −1

vj

- vj +1

H

H0

πl−2,j −1

H

H

- ···

0

HH -

H - vj +1

j πj +1,jH

vj

πj +2,j −1

j πj +2,jH

···

-

vl−2

0

HH j -

πl−2,µ

vl−2

πl−1,j −1

-

H

πl,j −2

vl−1

*

vl

2

0

HH j -

πl−1,µ

vl−1

π

l,µ −1

-

vl . (B.8)

The relation (B.2) denotes that (B.8) is invariant, at least in the weak sense, when vl and vl are exchanged. Applying Lemma B.1 to the corresponding part of (B.8), it is enough to show the weak equality for the rest:

f (vl−1 − vl , πl,j − 2)f (vl−1 − vl , πl,j − 1)f (vl−1 − vl , 2)

∼ f (vl−1 − vl , πl,j − 2)f (vl−1 − vl , πl,j − 1)f (vl−1 − vl , 2).

(B.9)

Let us set v = vl−1 , v = vl−1 , w = πl,j and denote the LHS and the RHS by A(v, v ) and B(v, v ), respectively. Then from the theta function identity such as (4.24), we have A(v, v ) − B(v, v ) =

[v − v + 1][v + v − vl − vl − w][vl − vl ][w − 2] [v − vl − 21 ][v − vl − 21 ][v − vl − 21 ][v − vl − 21 ]

.

Then it is easy to show h(v − v)(A(v , v) − B(v , v)) = −(A(v, v ) − B(v, v )). Therefore we get the weak equality (B.9). Next we prove (4.20) (j < k < l). The equality follows from the weak equality (A)+(B)+(C)∼0, where (A) =

vk−1

πk,j

- vk

vk (B) =

vk−1

−b(vl

πk+1,j

π

π

π

π

−1

k+2,j l−1,j l,j

- vk+1 - · · · l−2,j - vl−2 - vl−1 - vl

3 Q 3 3 3 Q 0 0 2 Q 0 0 s Q - vk+1 - ··· - vl−2 - vl−1 - vl , π π π π k+2,k

l−2,k

l−1,k

πl,k −1

πk+2,j

πl−2,j

πl−1,j

πl,j −1

k+1,k

[πk,j ] × − vl , πk,j ) [πk,j + 1]

πk,j +1

- vk

Q

0QQ s

πk+1,j

Q

- vk+1

Q 0

vk

Q s Q

- vk+1

πk+1,k

Q

- ···

Q 0 Q s Q - ··· π k+2,k

- vl−2

Q

Q 0

- vl−1

- vl

3

Q

Q

2

πl−2,k

πl−1,k

πl,k −1

π

π

π

Q s Q

- vl−2

0

Q s

- vl−1

- vl ,

(C) = −c(vl − vl , πk,j ) ×

vk−1

πk,j

- vk

vk

πk+1,j

π

−1

k+2,j l−1,j l,j

- vk+1 - · · · l−2,j - vl−2 - vl−1 - vl 3 Q 3 3 3 Q 0 0 2 Q 0 0 s Q - vk+1 - ··· - vl−2 - vl−1 - vl . π π π π k+1,k

k+2,k

l−2,k

l−1,k

πl,k −1

(B.10)


Using the weak equality in Lemma B.2, we modify (B) to (A ) + (C ), where (A ) = −

b(vl − vl , πk,j ) [πk,j ] ×

β(vl−1 − vl−1 , πk,j ) [πk,j + 1]

πk,j

vk−1

- v

k

vk (C ) =

b(vl

πk,j

vk−1

πk+1,j

- vk+1 3 0

πk+2,j

-

− vl , πk,j )γ (vl−1 −

β(vl−1 − vl−1 , πk,j )

vk

π −1

π

πl,k −1

vl−1 , πk,j )

- v

k

πl−2,j

l−1,j l,j

- vl−2 - vl−1 - v

l 3 3 3 3 2 0 0 0

- vk+1 - ··· - vl−2 - vl−1 - vl , πk+1,k πk+2,k πl−2,k πl−1,k

···

[πk,j ] × [πk,j + 1]

πl,k −1 l−1,j

- vl−2 - vl−1 - vl 3 Q 3 3 Q 0 0 2 Q 0 s Q - vk+1 - ··· - vl−2 - vl−1 - v . l πk+1,k πk+2,k πl−2,k πl−1,k πk+1,j

- vk+1 3 0

πk+2,j

-

···

πl−2,j

π

πl,j −1

)β(v

Noting that h(vl−1 − vl−1 l−1 − vl−1 , w) = β(vl−1 − vl−1 , w), we can exchange

vl−1 and vl−1 in (A ). Let (A ) be the term we thus obtain. Note that (A ) ∼ (A

). Using the equality

− vl , 2) − f (vl−1

=

b(vl − vl , w) [w] f (vl−1 − vl , 2)

β(vl−1 − vl−1 , w) [w + 1]

1 3

][v − v

[1][vl − vl + vl−1 − vl−1 l l−1 + 2 ][vl − vl−1 + 2 ] 1 1

][v − v + 1][v

[vl−1 − vl−1 l l l−1 − vl − 2 ][vl−1 − vl − 2 ]

we have (A) + (A

) =

vk−1

πk,j

1 3

][v − v

[1][vl − vl + vl−1 − vl−1 l l−1 + 2 ][vl − vl−1 + 2 ] 1 1

][v − v + 1][v

[vl−1 − vl−1 l l l−1 − vl − 2 ][vl−1 − vl − 2 ]

- v

k

πk+1,j

π

π

l−1,j - · · · l−2,j- v

- v

l−2 l−1 3 3 3 0 0 0 - vl−1 vl−2 vk π ··· π π k+1,k

l−2,k

l−1,k

×

πl,j −1

- v

l (B.11)

- vl .

πl,k −1

On the other hand, to calculate (C) + (C ), we use the equality

, w )b(v − v , w ) c(vl−1 − vl−1 1 l 1 l

[w1 ]

β(vl−1 − vl−1 , w1 ) [w1 + 1]

× f (vl−1 − vl , w2 − 1)f (vl−1 − vl , w1 + w2 − 1) − c(vl − vl , w1 )

× f (vl−1 − vl , w2 − 1)f (vl−1 − vl , w1 + w2 − 1)

[1][vl − vl + vl−1 − vl−1 ][vl−1 − vl + 23 − w2 ][vl−1 − vl + 23 − w1 =− 1 1

][v − v + 1][v

[vl−1 − vl−1 l l l−1 − vl − 2 ][vl−1 − vl − 2 ]

− w2 ]

.


Then we have (C) + (C ) 3 3

][v

[1][vl − vl + vl−1 − vl−1 l−1 − vl + 2 − w2 ][vl−1 − vl + 2 − w1 − w2 ] =− × 1 1

][v − v + 1][v

[vl−1 − vl−1 l l l−1 − vl − 2 ][vl−1 − vl − 2 ]

vk−1

πk,j

- v

k

πk+1,j

π

π

l−1,j - · · · l−2,j- v

- v

l−2 l−1 3 3 3 0 0 0 vk π - · · · π - vl−2 π - vl−1 . k+1,k

l−2,k

Comparing (B.11) and (B.12), we have

2

- v

l (B.12)

l−1,k

(A) + (A

) + (C) + (C )

= 0.

C. $RLL = LLR^*$ Relation

We here derive some of the relations of the half currents involved in the RLL-relation (5.2), and compare them with those in Theorem 4.1. From the definition (5.1), the components of the L-operator $\hat L^+(v)$ are given by

$$\hat L^+_{ll}(v) = K_l^+(v) + \sum_{m=l+1}^{N} F^+_{l,m}(v) K_m^+(v) E^+_{m,l}(v), \tag{C.1}$$

$$\hat L^+_{kl}(v) = \begin{cases} F^+_{k,l}(v) K_l^+(v) + \sum_{m=l+1}^{N} F^+_{k,m}(v) K_m^+(v) E^+_{m,l}(v) & (k < l), \\[4pt] K_k^+(v) E^+_{k,l}(v) + \sum_{m=k+1}^{N} F^+_{k,m}(v) K_m^+(v) E^+_{m,l}(v) & (k > l). \end{cases} \tag{C.2}$$

It is convenient to introduce the reduced R-matrix and L-operators, $R^+(v, s|j)$ and $\hat L^+(v|j)$ $(1 \le j \le N)$, by

$$R^+(v, s|j) = \bigl( R^{+mn}_{kl}(v, s) \bigr)_{j \le k,l,m,n \le N}, \tag{C.3}$$

$$\hat L^+(v|j) = \bigl( \hat L^+_{kl}(v) \bigr)_{j \le k,l \le N}. \tag{C.4}$$

Then the inverse of $L^+(v|j)$ is given by

$$L^+(v|j)^{-1} = \begin{pmatrix}
K_j^{+-1} & -K_j^{+-1} F^+_{j,j+1} & K_j^{+-1} x_j & * \\
-E^+_{j+1,j} K_j^{+-1} & E^+_{j+1,j} K_j^{+-1} F^+_{j,j+1} + K_{j+1}^{+-1} & -E^+_{j+1,j} K_j^{+-1} x_j - K_{j+1}^{+-1} F^+_{j+1,j+2} & * \\
y_j K_j^{+-1} & -y_j K_j^{+-1} F^+_{j,j+1} - E^+_{j+2,j+1} K_{j+1}^{+-1} & * & * \\
* & * & * & *
\end{pmatrix}. \tag{C.5}$$

Here we omitted the argument $v$ and set

$$x_j(v) = F^+_{j,j+1}(v) F^+_{j+1,j+2}(v) - F^+_{j,j+2}(v), \tag{C.6}$$

$$y_j(v) = E^+_{j+2,j+1}(v) E^+_{j+1,j}(v) - E^+_{j+2,j}(v). \tag{C.7}$$
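The upper-left block of the inverse can be verified by hand in the smallest case. The sketch below takes $j = N-1$, so that the reduced L-operator is the $2 \times 2$ matrix read off from (C.1) and (C.2); the abbreviations $K = K_{N-1}^+$, $K' = K_N^+$, $F = F^+_{N-1,N}$, $E = E^+_{N,N-1}$ are introduced here only to shorten the formulas, and the order of the non-commuting factors is kept throughout.

```latex
% Hypothetical shorthand: K = K^+_{N-1}, K' = K^+_N, F = F^+_{N-1,N}, E = E^+_{N,N-1}.
\begin{align*}
L &= \begin{pmatrix} K + F K' E & F K' \\ K' E & K' \end{pmatrix}, \qquad
M = \begin{pmatrix} K^{-1} & -K^{-1} F \\ -E K^{-1} & E K^{-1} F + K'^{-1} \end{pmatrix}, \\
(LM)_{11} &= (K + F K' E) K^{-1} - F K' E K^{-1} = 1, \\
(LM)_{12} &= -(K + F K' E) K^{-1} F + F K' (E K^{-1} F + K'^{-1})
           = -F + F = 0, \\
(LM)_{21} &= K' E K^{-1} - K' E K^{-1} = 0, \\
(LM)_{22} &= -K' E K^{-1} F + K' (E K^{-1} F + K'^{-1}) = 1 .
\end{align*}
```

Hence $M = L^{-1}$, which matches the first two rows and columns of (C.5). The general pattern is that of inverting a product of an upper unipotent, a diagonal and a lower unipotent matrix.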

Due to the speciality of the form of the R-matrix (2.16), we have the reduced relation

$$R^{+(1,2)}(v, P + h|j)\, L^{+(1)}(v_1|j)\, L^{+(2)}(v_2|j) = L^{+(2)}(v_2|j)\, L^{+(1)}(v_1|j)\, R^{*+}(v, P|j). \tag{C.8}$$

Below, we use this rather in its inverted form:

$$L^{+(1)}(v_1|j)^{-1} L^{+(2)}(v_2|j)^{-1} R^{+(1,2)}(v, P + h|j) = R^{*+(1,2)}(v, P|j)\, L^{+(2)}(v_2|j)^{-1} L^{+(1)}(v_1|j)^{-1}, \tag{C.9}$$

$$L^{+(2)}(v_2|j)^{-1} R^{+(1,2)}(v, P + h|j)\, L^{+(1)}(v_1|j) = L^{+(1)}(v_1|j)\, R^{*+(1,2)}(v, P|j)\, L^{+(2)}(v_2|j)^{-1}. \tag{C.10}$$

C.1. Relations among $K_j^+(v)$'s. Now some of the relations among $K_j^+(v)$ $(1 \le j \le N)$ are derived as follows. The $(N, N), (N, N)$ component of the $RLL = LLR^*$ relation (5.2) yields

$$K_N^+(v_1) K_N^+(v_2) = \rho(v_1 - v_2)\, K_N^+(v_2) K_N^+(v_1). \tag{C.11}$$

Similarly, the $(j, j), (j, j)$ component of the $L_j^{-1} L_j^{-1} R = R^* L_j^{-1} L_j^{-1}$ relation (C.9) $(1 \le j \le N-1)$ yields

$$K_j^+(v_1) K_j^+(v_2) = \rho(v_1 - v_2)\, K_j^+(v_2) K_j^+(v_1) \tag{C.12}$$

and the $(N, j), (N, j)$ component of the $L_j^{-1} R L_j = L_j R^* L_j^{-1}$ relation (C.9) $(1 \le j \le N-1)$ yields

$$K_j^+(v_1) K_N^+(v_2) = \rho(v_1 - v_2)\, \frac{[v_1 - v_2 - 1]^* [v_1 - v_2]}{[v_1 - v_2]^* [v_1 - v_2 - 1]}\, K_N^+(v_2) K_j^+(v_1). \tag{C.13}$$

These relations coincide with the relations (4.13) and (4.14).

C.2. Relations between $K_N^+(v)$ and $E^+_{N,j}(v)$. The $(N, N), (N, j)$ components of the $RLL = LLR^*$ relation (5.2) $(1 \le j \le N-1)$ yield

$$K_N^+(v_1)^{-1} E^+_{N,j}(v_2) K_N^+(v_1) = E^+_{N,j}(v_2)\, \frac{1}{\bar b^*(v_1 - v_2)} - E^+_{N,j}(v_1)\, \frac{c^*(v_1 - v_2, P_{j,N})}{\bar b^*(v_1 - v_2)}. \tag{C.14}$$

This coincides with the case $l = N$ of (4.15).

C.3. Relations between $K_N^+(v)$ and $F^+_{j,N}(v)$. The $(N, j), (N, N)$ components of the $RLL = LLR^*$ relation (5.2) $(1 \le j \le N-1)$ yield

$$K_N^+(v_1) F^+_{j,N}(v_2) K_N^+(v_1)^{-1} = \frac{1}{\bar b(v_1 - v_2)}\, F^+_{j,N}(v_2) - \frac{\bar c(v_1 - v_2, P_{j,N} + h_{j,N})}{\bar b(v_1 - v_2)}\, F^+_{j,N}(v_1). \tag{C.15}$$

This coincides with the case $l = N$ of (4.16).


C.4. Relations among $E^+_{l,j}(v)$'s. The $(N, N), (j, j)$ component of the $RLL = LLR^*$ relation (5.2) $(1 \le j \le N-1)$ yields

$$K_N^+(v_1) E^+_{N,j}(v_1) K_N^+(v_2) E^+_{N,j}(v_2) = \rho(v_1 - v_2)\, K_N^+(v_2) E^+_{N,j}(v_2) K_N^+(v_1) E^+_{N,j}(v_1). \tag{C.16}$$

The (N, N ), (k, j ) component of the RLL = LLR ∗ relation (5.2) (1 ≤ j, k ≤ N − 1, j = k) yields + + (v1 )KN+ (v2 )EN,j (v2 ) ρ + (v1 − v2 )KN+ (v1 )EN,k ∗kj

+ + = KN+ (v2 )EN,j (v2 )KN+ (v1 )EN,k (v1 )Rkj (v1 − v2 , Pj,k ) ∗kj

+ + + KN+ (v2 )EN,k (v2 )KN+ (v1 )EN,j (v1 )Rj k (v1 − v2 , Pj,k ).

(C.17)

After a little calculation using (4.13), (C.16) and (C.17) coincide with the case $l = N$ in (B.1) and (4.19), respectively.

C.5. Relations among $F^+_{j,l}(v)$'s. The $(j, j), (N, N)$ component of the $RLL = LLR^*$ relation (5.2) $(1 \le j \le N-1)$ yields

$$F^+_{j,N}(v_1) K_N^+(v_1) F^+_{j,N}(v_2) K_N^+(v_2) = \rho(v_1 - v_2)\, F^+_{j,N}(v_2) K_N^+(v_2) F^+_{j,N}(v_1) K_N^+(v_1). \tag{C.18}$$

The (j, k), (N, N ) component of the RLL = LLR ∗ relation (5.2) (1 ≤ j, k ≤ N − 1, j = k) yields + + (v2 )KN+ (v2 )Fj,N (v1 )KN+ (v1 ) ρ +∗ (v1 − v2 )Fk,N + + = Rj k (v, Pj,k + hj,k )Fj,N (v1 )KN+ (v1 )Fk,N (v2 )KN+ (v2 ) jk

+ + + Rj k (v, Pj,k + hj,k )Fk,N (v1 )KN+ (v1 )Fj,N (v2 )KN+ (v2 ). kj

(C.19)

After a little calculation using (4.13), (C.18) and (C.19) coincide with the case l = N in (B.2) and (4.20), respectively. + + C.6. The relations between El,j ’s and Fk,l ’s. The (j, N ), (N, N − 1) component together with the (j, N ), (N, N ) and (N, N ), (N, N − 1) components of the RLL = LLR ∗ relation (5.2) (1 ≤ j ≤ N − 2) yield + + [EN,N−1 (v2 ), Fj,N (v1 )] = KN+ (v2 )−1

c(v1 − v2 , Pj,N + hj,N ) + + (v2 ) Fj,N−1 (v2 )KN−1 ¯ 1 − v2 ) b(v

+ + − Fj,N−1 (v1 )KN−1 (v1 )

c∗ (v1 − v2 , PN−1,N ) + KN (v1 )−1 . b¯ ∗ (v1 − v2 ) (C.20)


The (j +1, j ), (j, j +1) component together with the (j +1, j ), (j, j ) and (j, j ), (j, j + −1 ∗ −1 −1 1) components of the L−1 j Lj R = R Lj Lj relation (C.9) (1 ≤ j ≤ N − 1) yield c¯∗ (v1 − v2 , Pj,j +1 ) + Kj +1 (v2 )−1 b¯ ∗ (v1 − v2 ) c(v ¯ 1 − v2 , Pj,j +1 + hj,j +1 ) + Kj (v1 ). − Kj++1 (v1 )−1 ¯ 1 − v2 ) b(v (C.21)

+ + [Ej++1,j (v1 ), Fj,j +1 (v2 )] = Kj (v2 )

These equations (C.20) and (C.21) coincide with the cases l = N and l = j + 1 of (4.21), respectively. Similarly, the (N − 1, N ), (N, j ) component together with the (N − 1, N ), (N, N ) and (N, N ), (N, j ) components of the RLL = LLR ∗ relation (5.2) (1 ≤ j ≤ N − 2) yield + + [EN,j (v2 ), FN−1,N (v1 )]

c(v1 − v2 , PN−1,N + hN−1,N ) + + (v2 ) KN−1 (v2 )EN−1,j ¯ 1 − v2 ) b(v c∗ (v1 − v2 , Pj,N ) + + − KN+−1 (v1 )EN−1,j (v1 ) KN (v1 )−1 . b¯ ∗ (v1 − v2 )

= KN+ (v2 )−1

(C.22)

The (j, j + 1), (j + 1, j ) component together with the (j, j ), (j + 1, j ) and (j, j + −1 ∗ −1 −1 1), (j, j ) components of the L−1 j Lj R = R Lj Lj relation (C.9)(1 ≤ j ≤ N − 1) yield + + −1 [Ej++1,j (v2 ), Fj,j +1 (v1 )] = Kj +1 (v2 )

− Kj+ (v1 )

c(v1 − v2 , Pj,j +1 + hj,j +1 ) + Kj (v2 ) ¯ 1 − v2 ) b(v

c∗ (v1 − v2 , Pj,j +1 ) + Kj +1 (v1 )−1 . b¯ ∗ (v1 − v2 )

(C.23)

These equations (C.22) and (C.23) coincide with the cases l = N and l = j + 1 of (4.22), respectively. Finally, the following relations with j ≤ N − 2 are examples of those which we have not yet checked for our half currents: + + [EN,j (v2 ), Fj,N (v1 )]

= KN+ (v2 )−1 +

N−1

c(v1 − v2 , Pj,N + hj,N ) + c∗ (v1 − v2 , Pj,N ) + Kj (v2 ) − Kj+ (v1 ) KN (v1 )−1 ¯ 1 − v2 ) b(v b¯ ∗ (v1 − v2 )

c(v1 − v2 , Pj,N + hj,N ) + + (v2 ) Fj,k (v2 )Kk+ (v2 )Ek,j ¯ 1 − v2 ) b(v k=j +1 c∗ (v1 − v2 , Pj,N ) + + + −1 −Fj,k . (v1 )Kk+ (v1 )Ek,j (v1 ) (v ) K N 1 b¯ ∗ (v1 − v2 ) KN+ (v2 )−1

(C.24)

These are derived from the (j, N ), (N, j ) components together with the (j, N ), (N, N ) and (N, N ), (N, j ) components of the RLL = LLR ∗ relation (5.2) (1 ≤ j ≤ N − 1).


D. Evaluation Module

We here summarize the evaluation module $(\pi_{V,z}, V_z = V[z, z^{-1}])$ of $U_q(\widehat{sl}_N)$ associated with the vector representation $V = \mathbb{C}^N$. The evaluation module $(\pi_z, V_z)$, in terms of the Drinfeld generators, is defined by the following formulae:

$$\pi_z(c) = 0, \qquad \pi_z(d) = z \frac{d}{dz}, \tag{D.1}$$

$$\pi_z(a_{j,n}) = \frac{[n]}{n}\, (q^{j-N+1}z)^n\, (q^{-n} E_{jj} - q^{n} E_{j+1\,j+1}), \tag{D.2}$$

$$\pi_z(x^+_{j,n}) = (q^{j-N+1}z)^n E_{j\,j+1}, \tag{D.3}$$

$$\pi_z(x^-_{j,n}) = (q^{j-N+1}z)^n E_{j+1\,j}, \tag{D.4}$$

$$\pi_z(h_j) = E_{jj} - E_{j+1\,j+1}, \qquad \pi_z(\bar h_j) = -E_{jj}. \tag{D.5}$$
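As a quick illustration of these formulae (a specialization chosen here, not taken from the original), set $N = 2$ and $j = 1$, so that $q^{j-N+1} = 1$. The check at the end uses only matrix units, with $E_{ab}E_{cd} = \delta_{bc}E_{ad}$, and the expected weight relation $[h_1, x^\pm_{1,n}] = \pm 2 x^\pm_{1,n}$ of the Drinfeld generators.

```latex
% Illustration: N = 2, j = 1, so q^{j-N+1} = 1.
\begin{align*}
\pi_z(a_{1,n}) &= \frac{[n]}{n}\, z^n\, (q^{-n}E_{11} - q^{n}E_{22}), &
\pi_z(h_1) &= E_{11} - E_{22}, \\
\pi_z(x^+_{1,n}) &= z^n E_{12}, &
\pi_z(x^-_{1,n}) &= z^n E_{21}, \\
[\pi_z(h_1), \pi_z(x^+_{1,n})] &= z^n (E_{12} + E_{12}) = 2\,\pi_z(x^+_{1,n}), &
[\pi_z(h_1), \pi_z(x^-_{1,n})] &= -2\,\pi_z(x^-_{1,n}) .
\end{align*}
```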

slN ) defined Then the elliptic currents kj (w, p), ψj± (w, p), ej (w, p), fj (w, p) of Uq ( in (3.22)–(3.25) are represented by {q r+2N+1 z/w}{q r+1 z/w}{q r−1 w/z}{q r−2N+3 w/z} {q r+2N−1 z/w}{q r+3 z/w}{q r+1 w/z}{q r−2N+1 w/z}   j −1 N p (q r+1 z/w) p (q r+3 z/w) × Ekk + Ejj + Ekk  , p (q r−1 z/w) p (q r+1 z/w)

πz (kj (w, p)) =

k=j +1

k=1

(D.6) πz (ψj± (q ∓r w, p)) = q ±hj

(q r−j +2hj +N−1 w/z)

p p (q r−j +N−1 w/z)

,

(pq 2 ; p)∞ δ(q j −N+1 z/w), (p; p)∞ (pq −2 ; p)∞ πz (fj (w, p)) = Ej +1j δ(q j −N+1 z/w), (p; p)∞

πz (ej (w, p)) = Ejj +1

(D.7) (D.8) (D.9)

where {z} = (z; p, q 2N )∞ . Especially, the auxiliary currents u± j (w, p) are represented by (pq −j +2hj +N−1 w/z; p)∞ , (pq −j +N−1 w/z; p)∞ (pq j −2hj −N+1 z/w; p)∞ πz (u− (w, p)) = . j (pq j −N+1 z/w; p)∞

πz (u+ j (w, p)) =

(D.10)

Due to this representation, we can obtain the representation of the half currents. After getting rid of some unpleasant fractional power factors of q and z by a certain gauge transformation, we have the following result:


πv2 (Kj+ (v1 )e

−Q¯j

) = ρ + (v1 − v2 )   j −1 N [v − v ] − v − 1] [v 1 2 1 2 × Ekk + Ejj + Ekk  , [v1 − v2 + 1] [v1 − v2 ] k=j +1

k=1

(D.11) [v1 − v2 + Pj,l − 1][1] + πv2 (e−ηj Fj,l , (v1 )eηl ) = Elj [v1 − v2 ][Pj,l − 1] [v1 − v2 − Pj,l ][1] −Q + πv2 (eQ¯l −ηl El,j , (v1 )e ¯j +ηj ) = −Ej l [v1 − v2 ][Pj,l ]

(D.12) (D.13)

where z = q 2v . It is easy to check that these quantities satisfy the commutation relations + + of the half currents Kj+ (v), Fj,l (v) and El,j (v). Finally, let us check the results by calculating the R-matrix as the image of the L-operator Lˆ + (v) in (5.1), R + (v1 − v2 , P ) = (πv2 ⊗ id)Lˆ + (v1 ).

(D.14)

Using (4.3) and Riemann's theta identity, we obtain the following:

$$R^+(v, P) = \rho^+(v) \begin{pmatrix} R_{11}(v, P) & \cdots & \cdots & R_{1N}(v, P) \\ R_{21}(v, P) & \cdots & \cdots & R_{2N}(v, P) \\ \vdots & \vdots & & \vdots \\ R_{N1}(v, P) & \cdots & \cdots & R_{NN}(v, P) \end{pmatrix}, \tag{D.15}$$

where

$$R_{jj}(v, P) = \bar b(v) \sum_{k=1}^{j-1} E_{kk} + E_{jj} + \sum_{k=j+1}^{N} b(v, P_{j,k})\, E_{kk} \qquad (1 \le j \le N), \tag{D.16}$$

$$R_{jl}(v, P) = c(v, P_{j,l})\, E_{lj}, \tag{D.17}$$

$$R_{lj}(v, P) = \bar c(v, P_{j,l})\, E_{jl} \qquad (1 \le j < l \le N). \tag{D.18}$$

This expression coincides with the R-matrix given by (2.15).

References

1. Foda, O., Iohara, K., Jimbo, M., Kedem, R., Miwa, T., Yan, H.: An elliptic quantum algebra for $\widehat{sl}_2$. Lett. Math. Phys. 32, 259–268 (1994)
2. Felder, G.: Elliptic quantum groups. In: Proc. ICMP Paris 1994, Cambridge-Hong Kong: International Press, 1995, pp. 211–218
3. Frønsdal, C.: Quasi-Hopf deformation of quantum groups. Lett. Math. Phys. 40, 117–134 (1997)
4. Enriquez, B., Felder, G.: Elliptic quantum groups $E_{\tau,\eta}(sl_2)$ and quasi-Hopf algebras. Commun. Math. Phys. 195, 651–689 (1998)
5. Jimbo, M., Konno, H., Odake, S., Shiraishi, J.: Quasi-Hopf twistors for elliptic quantum groups. Transformation Groups 4, 303–327 (1999)
6. Drinfeld, V.G.: Quasi-Hopf algebras. Leningrad Math. J. 1, 1419–1457 (1990)
7. Baxter, R.J.: Partition function of the eight-vertex lattice model. Ann. Phys. 70, 193–228 (1972)
8. Andrews, G.E., Baxter, R.J., Forrester, P.J.: Eight-vertex SOS model and generalized Rogers-Ramanujan-type identities. J. Stat. Phys. 35, 193–266 (1984)
9. Jimbo, M., Miwa, T.: Algebraic Analysis of Solvable Lattice Models. CBMS Regional Conference Series in Mathematics, Vol. 85, Providence, RI: AMS, 1994
10. Drinfeld, V.G.: A new realization of Yangians and quantized affine algebras. Soviet. Math. Dokl. 36, 212–216 (1988)
11. Konno, H.: An elliptic algebra $U_{q,p}(\widehat{sl}_2)$ and the fusion RSOS model. Commun. Math. Phys. 195, 373–403 (1998)
12. Jimbo, M., Konno, H., Odake, S., Shiraishi, J.: Elliptic algebra $U_{q,p}(\widehat{sl}_2)$: Drinfeld currents and vertex operators. Commun. Math. Phys. 199, 605–647 (1999)
13. Lukyanov, S., Pugai, Y.: Multi-point local height probabilities in the integrable RSOS model. Nucl. Phys. B473[FS], 631–658 (1996)
14. Ding, J., Frenkel, I.B.: Isomorphism of two realizations of quantum affine $U_q(sl_N)$. Commun. Math. Phys. 156, 277–300 (1993)
15. Khoroshkin, S., Lebedev, D., Pakuliak, S.: Elliptic algebra $A_{q,p}(\widehat{sl}_2)$ in the scaling limit. Commun. Math. Phys. 190, 597–627 (1998); Khoroshkin, S., Lebedev, D., Pakuliak, S., Stolin, A., Tolstoy, V.: Classical limit of the scaled elliptic algebra $A_{\hbar,\eta}(\widehat{sl}_2)$. Compositio Math. 115, 205–230 (1999)
16. Asai, Y., Jimbo, M., Miwa, T., Pugai, Y.: Bosonization of vertex operators for the $A^{(1)}_{n-1}$ face model. J. Phys. A29, 6595–6616 (1996)
17. Furutsu, H., Kojima, T., Quano, Y.-H.: Type-II vertex operators for the $A^{(1)}_{n-1}$ face model. Int. J. Mod. Phys. A15, 1533–1556 (2000)
18. Jimbo, M., Miwa, T., Okado, M.: Solvable lattice models whose states are dominant integral weights of $A^{(1)}_{n-1}$. Lett. Math. Phys. 14, 123–131 (1987)
19. Miki, K.: Creation/Annihilation operators and form factors of the XXZ model. Phys. Lett. A186, 217–224 (1994)
20. Foda, O., Iohara, K., Jimbo, M., Kedem, R., Miwa, T., Yan, H.: Notes on highest weight modules of the elliptic algebra $A_{q,p}(\widehat{sl}_2)$. In: Quantum field theory, integrable models and beyond (Kyoto 1994). Progr. Theor. Phys. Suppl. 118, 1–34 (1995)
21. Reshetikhin, N.Yu., Semenov-Tian-Shansky, M.: Central extensions of quantum current groups. Lett. Math. Phys. 19, 133–142 (1990)

Communicated by L. Takhtajan

Commun. Math. Phys. 239, 449–492 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0888-3

Communications in

Mathematical Physics

Scarred Eigenstates for Quantum Cat Maps of Minimal Periods

Frédéric Faure (1), Stéphane Nonnenmacher (2), Stephan De Bièvre (3)

1 Laboratoire de Physique et Modélisation des Milieux Condensés (LPM2C) (Maison des Magistères Jean Perrin, CNRS), BP 166, 38042 Grenoble Cédex 9, France. E-mail: [email protected]
2 Service de Physique Théorique, CEA/DSM/PhT Unité de recherche associée au CNRS CEA/Saclay, 91191 Gif-sur-Yvette Cédex, France. E-mail: [email protected]
3 UFR de Math.–UMR AGAT, Université des Sciences et Technologies de Lille, 59655 Villeneuve d’Ascq, France. E-mail: [email protected]

Received: 5 August 2002 / Accepted: 21 February 2003 Published online: 2 July 2003 – © Springer-Verlag 2003

Abstract: In this paper we construct a sequence of eigenfunctions of the “quantum Arnold’s cat map” that, in the semiclassical limit, shows a strong scarring phenomenon on the periodic orbits of the dynamics. More precisely, those states have a semiclassical limit measure that is the sum of 1/2 the normalized Lebesgue measure on the torus plus 1/2 the normalized Dirac measure concentrated on any a priori given periodic orbit of the dynamics. It is known (the Schnirelman theorem) that “most” sequences of eigenfunctions equidistribute on the torus. The sequences we construct therefore provide an example of an exception to this general rule. Our method of construction and proof exploits the existence of special values of $\hbar$ for which the quantum period of the map is relatively “short”, and a sharp control on the evolution of coherent states up to this time scale. We also provide a pointwise description of these states in phase space, which uncovers their “hyperbolic” structure in the vicinity of the fixed points and yields more precise localization estimates.

1. Introduction One of the main problems in quantum chaos is the understanding of the semiclassical behaviour of the eigenfunctions of quantum dynamical systems having a chaotic classical limit. The main theorem in this context is the Schnirelman theorem [Sc, CdV, Z1, HMR, BouDB]. It roughly states that “most” eigenfunctions equidistribute on the available phase space in the classical limit. This leaves open the question of the existence of exceptional sequences of eigenfunctions with a different limit. In the case of “hard chaos” (uniformly hyperbolic systems), numerical computations have shown the presence of “scars” on certain eigenfunctions [He], i.e. a visual enhancement of the wavefunction on an unstable periodic orbit. Up to now all theories of this phenomenon have required some kind of averaging over a (semiclassically large) set of eigenfunctions [Bog, Ber, He, KH]. In addition, scarring is often described in the physics literature as a


weak type of localization, compatible with Schnirelman’s (measure-theoretic) equidistribution, as opposed to “strong scarring” [RS], which implies that the limiting measure has a component supported on a periodic orbit and therefore does not equidistribute. We show in this paper that, for the quantized “Arnold’s cat map”, strongly scarred sequences do indeed exist for any periodic orbit (more generally, for any finite union of periodic orbits). This is, to the best of our knowledge, the first example of this kind in hyperbolic systems. A construction of exceptional sequences of eigenfunctions not equidistributing in the semiclassical limit was recently announced [CKS] for the quantization of certain ergodic piecewise affine transformations on the torus, but these do not correspond to “scars” since the systems in question have no periodic orbits.

Our construction is based on intuitively clear ideas that we now briefly sketch. For unfamiliar notation, we refer to Sects. 2–4. Precise statements of our results will be given below. Let $M \in SL(2,\mathbb{Z})$ be a hyperbolic automorphism of the 2-dimensional torus $\mathbb{T}$ and $\hat M$ its quantization on the $N$-dimensional quantum Hilbert space $\mathcal{H}_{N,\theta}$, where $2\pi\hbar N = 1$. We will construct strongly scarred quasimodes of $\hat M$ that, for certain values of $N$, will be shown to be eigenfunctions. For that purpose we will use three ingredients. First, the time-energy uncertainty relation in the following simple form ($T \in \mathbb{N}$, $\phi \in \mathbb{R}$):

$$\Big\| (\hat M - e^{i\phi}\hat I) \sum_{t=-T}^{T-1} e^{-i\phi t}\hat M^{t} \Big\| = \big\| e^{-2i\phi T}\hat M^{2T} - \hat I \big\| \le 2. \tag{1}$$

Second, precise estimates on intuitively clear phase space localization properties of coherent states. Third, a remark on the quantum period of $\hat M$ [BonDB1] (Sect. 8).

Let $x_0, x_1 = Mx_0, \dots, x_\tau = M^\tau x_0 = x_0$ be a periodic orbit of period $\tau$ of $M$. Let $|x_0,\tilde c_0,\theta\rangle$ be a “squeezed” coherent state in $\mathcal{H}_{N,\theta}$ centered on the point $x_0$ and consider $\hat M^t|x_0,\tilde c_0,\theta\rangle$ for $t \in \mathbb{Z}$. Note first that this state is still a squeezed coherent state and that, for small enough $t$, it is localized around $x_t$. In fact, the support of the Husimi function of this state is an ellipse stretched along the unstable direction of the dynamics through the point $x_t$, with its major axis roughly of size $\sqrt{\hbar}\,e^{\lambda t}$, where $\lambda$ is the (positive) Lyapounov exponent of the dynamics (Sect. 4). Introducing the Ehrenfest time $T = |\ln\hbar|/\lambda$, the support is therefore microscopic as long as $t \le (1-\epsilon)T/2$. For longer times, between $T/2$ and $T$, the support of the Husimi function of $\hat M^t|x_0,\tilde c_0,\theta\rangle$ starts to wrap around the torus and it was shown in [BonDB1] that it equidistributes on that time scale. We shall consider the “discrete time quasimode”

$$|\Phi^{disc}_{\phi}\rangle = \sum_{t=-T}^{T-1} e^{-i\phi t}\hat M^{t}|x_0,\tilde c_0,\theta\rangle = \sum_{j=1}^{4} |\Phi^{disc}_{j,\phi}\rangle \tag{2}$$

and its “components”

$$|\Phi^{disc}_{j,\phi}\rangle = \sum_{t=-T+(j-1)\frac{T}{2}}^{-T+j\frac{T}{2}-1} e^{-i\phi t}\hat M^{t}|x_0,\tilde c_0,\theta\rangle. \tag{3}$$
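The identity (1) can be checked on each eigenvector of $\hat M$: if $\hat M$ acts as multiplication by a unimodular number $\mu$, the sum over $t$ is a finite geometric series. The following Python sketch is an illustration added here, not part of the original article; the function names are hypothetical. It compares both sides of (1) for such a scalar $\mu$.

```python
import cmath


def lhs_of_eq1(mu, phi, T):
    """|(mu - e^{i phi}) * sum_{t=-T}^{T-1} e^{-i phi t} mu^t| for a unimodular mu."""
    total = sum(cmath.exp(-1j * phi * t) * mu ** t for t in range(-T, T))
    return abs((mu - cmath.exp(1j * phi)) * total)


def rhs_of_eq1(mu, phi, T):
    """|e^{-2 i phi T} mu^{2T} - 1|, which never exceeds 2 when |mu| = 1."""
    return abs(cmath.exp(-2j * phi * T) * mu ** (2 * T) - 1)


if __name__ == "__main__":
    mu = cmath.exp(0.37j)  # an arbitrary eigenvalue on the unit circle
    for phi in (0.0, 0.5, 2.0):
        print(phi, lhs_of_eq1(mu, phi, 8), rhs_of_eq1(mu, phi, 8))
```

Since a unitary operator is diagonalizable, agreement for every unimodular $\mu$ is equivalent to the operator identity; the bound by 2 is just the triangle inequality.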

We note that similar states were considered before in the study of scars, see for instance [dPBB, KH] and references therein. We shall introduce a “continuous time” version $|\Phi^{cont}_{\phi}\rangle$ of those quasimodes later. We will write $|\Phi_\phi\rangle$ in statements true for both the discrete and continuous time quasimodes.

[Figure 1: time axis $t$ marked at $-T$, $-T/2$, $0$, $T/2$, $T$; the four successive intervals carry $|\Phi_1\rangle$, $|\Phi_2\rangle$, $|\Phi_3\rangle$, $|\Phi_4\rangle$; the outer pair forms $|\Phi_{erg}\rangle$, the inner pair $|\Phi_{loc}\rangle$, and the whole interval $|\Phi\rangle$]
Fig. 1. Partition of the time interval $[-T, T]$ into four equal parts, and of the quasimode $|\Phi_\phi\rangle$ into corresponding components

Let us for simplicity concentrate on the case where $x_0 = 0$, $\tau = 1$. Our crucial technical estimate (Sect. 4, Proposition 1) says that there exists $C > 0$ so that

$$\langle\tilde c_0,\theta|\hat M^{t}|\tilde c_0,\theta\rangle = \frac{1}{\sqrt{\cosh(\lambda t)}} + I(t), \quad\text{with } |I(t)| \le C\,e^{-\lambda\left(T - \frac{|t|}{2}\right)}. \tag{4}$$

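This implies rather easily (Proposition 2) the existence of a smooth, strictly positive function $S_1(\phi,\lambda)$ so that $\langle\Phi_\phi|\Phi_\phi\rangle \sim 2S_1(\phi,\lambda)T$. Using (1) one concludes readily that

$$\big\|(\hat M - e^{i\phi}\hat I)|\Phi_\phi\rangle_n\big\| \le \sqrt{\frac{2}{S_1(\phi,\lambda)T}}\left(1 + \frac{O(1)}{S_1(\phi,\lambda)T}\right), \tag{5}$$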

justifying the name “quasimode”. Here we used the notation $|\psi\rangle_n = |\psi\rangle/\sqrt{\langle\psi|\psi\rangle}$ for any non-zero $|\psi\rangle \in \mathcal{H}_{N,\theta}$.

To analyze the phase space properties of the above quasimodes, we first show as a further consequence of (4) that the four states $|\Phi_{j,\phi}\rangle$ have the same norm, asymptotically proportional to $\sqrt{T}$ as $\hbar$ goes to 0, and that they are asymptotically orthogonal in the semiclassical limit. In fact, this is easily understood intuitively by noting for example that the Husimi function of $|\Phi_{1,\phi}\rangle$ is supported along the stable manifold of the periodic orbit, and that of $|\Phi_{4,\phi}\rangle$ along the unstable one, so that they have essentially disjoint supports, which is at the origin of their orthogonality. To put it differently, since the unstable and stable manifolds intersect at homoclinic points, our results show that the contribution of these intersections in the phase space integral expressing the overlap $\langle\Phi_{1,\phi}|\Phi_{4,\phi}\rangle$ is small for small $\hbar$. Note that although the homoclinic interferences do not contribute significantly to the above integral, they are nevertheless clearly visible on the pointwise behaviour of the Husimi distribution of $|\Phi_\phi\rangle$, which is represented in Fig. 2 and that will be further studied in Sect. 6 (for “continuous time” quasimodes). The pointwise estimates obtained there will show that the Husimi density concentrates along “classical hyperbolas” asymptotic to the stable and unstable manifolds; they will at the same time provide estimates on the rate of convergence to the limit measure, as well as other localization indicators (namely, $L^s$ norms of the Husimi density).

It is furthermore clear from the previous discussion on the phase space localization properties of the evolved coherent states that $|\Phi_{1,\phi}\rangle$ and $|\Phi_{4,\phi}\rangle$ are sums of states that

Fig. 2a, b. Husimi distribution of the state $|\Phi_\phi\rangle_n$, constructed for the cat map (21) on the orbit of period 3 starting from $x_0 = (0, 0.5)$. The quantum parameters read $N = 1/(2\pi\hbar) = 500$, $\phi = 0$. (a) 3D plot on a linear scale. (b) 2D plot in logarithmic scale (darker = higher values). Axes: $q$ (horizontal), $p$ (vertical)

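equidistribute on the torus, whereas $|\Phi_{2,\phi}\rangle$ and $|\Phi_{3,\phi}\rangle$ are sums of states that localize on the periodic orbit. One therefore expects (and we shall prove in Sects. 5–7) that

$$\lim_{\hbar\to0}\ {}_n\langle\Phi_{j,\phi}|\hat f|\Phi_{j,\phi}\rangle_n = \int_{\mathbb{T}} f(x)\,dx \quad\text{if } j = 1, 4,$$

and that

$$\lim_{\hbar\to0}\ {}_n\langle\Phi_{j,\phi}|\hat f|\Phi_{j,\phi}\rangle_n = \frac{1}{\tau}\sum_{i=0}^{\tau-1} f(x_i) \quad\text{if } j = 2, 3.$$

Here $\hat f$ is either the Weyl or anti-Wick quantization of $f \in C^\infty(\mathbb{T})$. In other words, the Wigner and hence also the Husimi function of $|\Phi_{2,3,\phi}\rangle$ converge (weakly) to the Dirac measure on the periodic orbit, whereas the ones of $|\Phi_{1,4,\phi}\rangle$ equidistribute, i.e. converge to the Lebesgue measure. This suggests grouping these states two by two, defining:

$$|\Phi_{erg,\phi}\rangle = |\Phi_{1,\phi}\rangle + |\Phi_{4,\phi}\rangle \quad\text{and}\quad |\Phi_{loc,\phi}\rangle = |\Phi_{2,\phi}\rangle + |\Phi_{3,\phi}\rangle. \tag{6}$$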

Using the above information we shall finally prove (Propositions 7 and 12) that, for any $\phi \in [-\pi,\pi]$,

$$\lim_{\hbar\to0}\ {}_n\langle\Phi_\phi|\hat f|\Phi_\phi\rangle_n = \frac12\int_{\mathbb{T}^2} f(x)\,dx + \frac12\,\frac1\tau\sum_{j=0}^{\tau-1} f(x_j). \tag{7}$$

In other words, the semiclassical limit measure of the sequence of quasimodes $|\Phi_\phi\rangle_n$ is the measure

$$\frac12\,dx + \frac12\,\frac1\tau\sum_{j=0}^{\tau-1}\delta_{x_j}.$$

This shows that the quasimodes $|\Phi_\phi\rangle_n$ are strongly scarred.

We then conclude using a particular property of the quantum period of $\hat M$. We recall that the quantum cat map $\hat M$ has an $\hbar$-dependent “quantum period” $P$, i.e. $\hat M^{P} = e^{-i\varphi}\hat I$


for some $\varphi \in [0, 2\pi[$. The eigenvalues of $\hat M$ on $\mathcal{H}_{N,\theta}$ are therefore all of the form $e^{-i\phi_j}$, with $\phi_j = \varphi/P + 2\pi j/P$, $j = 1, \dots, P$. Note that $P$ plays the role here of the Heisenberg time of the system, since $\Delta\phi_j \sim 1/P$. Since, for general $\hbar$, the quantum period $P$ is of order $\hbar^{-1}$ [Ke], it is considerably longer than the Ehrenfest time $T$, which grows only logarithmically in $\hbar^{-1}$. Nevertheless, developing an argument in [BonDB1], we will show that, for any hyperbolic matrix in $SL(2,\mathbb{Z})$ there exists a subsequence $(\hbar_k)_{k\in\mathbb{N}}$ of values of $\hbar$ tending to zero for which $P = 2T + O(1)$ (see also [KR2]). For those values the Heisenberg and Ehrenfest times of the system coincide and the $|\Phi_\phi\rangle_n$ therefore constitute a sequence of eigenfunctions of $\hat M$ that strongly scar, provided $\phi = \phi_j$ for some $j \in \{1, \dots, P\}$. It should be noted that, for the values of $\hbar$ considered, the number of distinct eigenvalues $\phi_j$ is of order $|\ln\hbar|$, so that the eigenvalue degeneracy is very large, namely of order $(\hbar|\ln\hbar|)^{-1}$. Our main result can finally be summarized as follows:

Theorem 1. Let $M$ and $(\hbar_k)_{k\in\mathbb{N}}$ be as above. Let $0 \le \beta \le 1/2$ and let $\mathcal{P} = \{x_0, \dots, x_{\tau-1}\}$ be a periodic orbit of $M$. Then there exists a sequence $(\psi_k)_{k\in\mathbb{N}}$ of eigenfunctions of $\hat M$ on $\mathcal{H}_{N_k,\theta}$ with the property that, for all $f \in C^\infty(\mathbb{T}^2)$,

$$\lim_{k\to\infty}\ {}_n\langle\psi_k|\hat f|\psi_k\rangle_n = \beta\,\frac1\tau\sum_{j=0}^{\tau-1} f(x_j) + (1-\beta)\int_{\mathbb{T}^2} f(x)\,dx. \tag{8}$$
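To make the right-hand side of (8) concrete, the sketch below builds the period-3 orbit of the cat map (21) starting from $x_0 = (0, 0.5)$ (the orbit used in Fig. 2) and evaluates the limit measure on a test function. It is an added illustration, not code from the article: the function names are hypothetical, and the torus integral of the test function is supplied by the caller rather than computed.

```python
from fractions import Fraction
import math

CAT = ((2, 1), (1, 1))  # Arnold's cat map, Eq. (21)


def step(x, M=CAT):
    """One iteration x -> M x on the torus R^2/Z^2, in exact rational arithmetic."""
    q, p = x
    return ((M[0][0] * q + M[0][1] * p) % 1, (M[1][0] * q + M[1][1] * p) % 1)


def periodic_orbit(x0, M=CAT, max_period=1000):
    """Return [x_0, ..., x_{tau-1}] for a rational starting point x0."""
    orbit, x = [x0], step(x0, M)
    while x != x0:
        if len(orbit) >= max_period:
            raise ValueError("period exceeds max_period")
        orbit.append(x)
        x = step(x, M)
    return orbit


def limit_measure(f, torus_integral, orbit, beta):
    """beta * (orbit average of f) + (1 - beta) * (torus integral of f), as in (8)."""
    orbit_avg = sum(f(float(q), float(p)) for q, p in orbit) / len(orbit)
    return beta * orbit_avg + (1 - beta) * torus_integral


if __name__ == "__main__":
    orbit = periodic_orbit((Fraction(0), Fraction(1, 2)))
    f = lambda q, p: math.cos(2 * math.pi * q)  # its integral over the torus is 0
    print(len(orbit), limit_measure(f, 0.0, orbit, 0.5))
```

Exact fractions are used for the orbit so that the return to $x_0$ is detected without rounding issues.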

Our result helps to complete the picture of the semiclassical eigenfunction behaviour of quantized toral automorphisms known to date. Indeed, beyond the general Schnirelman theorem for these models [BouDB] the following results are known. First, suppose $M$ is of “checkerboard form”, meaning $AB \equiv 0 \equiv CD \bmod 2$. Then all eigenfunctions of $\hat M$ semiclassically equidistribute, provided one takes the limit along a density-one subsequence of values of $N$ [KR2], for which the quantum period is larger than $\sqrt{N}$. Note that this sequence excludes the values $N_k$ for which the period is very short. Second, it is shown in [KR1, Me] that for such $M$ there exists a basis of eigenfunctions that equidistribute as $N$ tends to infinity, without restrictions on $N$. This basis is constructed as a common eigenbasis for $\hat M$ and its “quantum symmetries”, which are shown in [KR1] to be sufficiently numerous to drastically reduce (if not to lift) the degeneracies of the eigenvalues. Finally, one may wonder if it would be possible to construct a sequence of eigenfunctions of $\hat M$ that has as a limit measure

$$\beta\,\frac1\tau\sum_{j=0}^{\tau-1}\delta_{x_j} + (1-\beta)\,dx,$$

with $\beta > 1/2$. It is proven in [FN1] that this is impossible, so that the above quasimodes are in a sense maximally localized (the bound $\beta > (\sqrt5 - 1)/2 \cong 0.62$ had been previously obtained by [BonDB2]).

2. Linear Dynamics on the Plane

In this section we recall some known results we will need in the sequel. For details not given here we refer to [F].


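2.1. Classical linear flow. The most general quadratic Hamiltonian on $\mathbb{R}^2$ is ($\alpha, \beta, \gamma \in \mathbb{R}$):

$$H(q,p) = \frac12\alpha q^2 + \frac12\beta p^2 + \gamma qp. \tag{9}$$

Assuming $\gamma^2 > \alpha\beta$, $H$ generates a hyperbolic flow $x(t) = (q(t), p(t))$ on $\mathbb{R}^2$, given by $x(t) = M(t)x(0)$ ($t \in \mathbb{R}$), where for each $t \neq 0$, $M(t)$ is a hyperbolic matrix in $SL(2,\mathbb{R})$. Explicitly, for $t = 1$,

$$M \overset{def}{=} M(1) = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in SL(2,\mathbb{R}), \tag{10}$$

i.e. $AD - BC = 1$, and

$$A = \cosh\lambda + \frac{\gamma}{\lambda}\sinh\lambda,\quad B = \frac{\beta}{\lambda}\sinh\lambda,\quad C = -\frac{\alpha}{\lambda}\sinh\lambda,\quad D = \cosh\lambda - \frac{\gamma}{\lambda}\sinh\lambda, \tag{11}$$

where $\lambda = \sqrt{\gamma^2 - \alpha\beta} > 0$ is the Lyapounov exponent. Note that $M$ has two real eigenvalues $e^{\pm\lambda}$ and hence two real eigenvectors corresponding to an unstable and a stable direction for the dynamics. They have respective slopes $s_+ = \tan\psi_+$, $s_- = \tan\psi_-$. Clearly, any hyperbolic matrix $M \in SL(2,\mathbb{R})$ with $\mathrm{Tr}\,M > 2$ is of the above form for a unique $\alpha, \beta, \gamma$ (the case $\mathrm{Tr}\,M < -2$ is treated by using the map $-M$). The expressions in (10)–(11) still make sense in the elliptic case, when $\gamma^2 < \alpha\beta$ and $-2 < \mathrm{Tr}\,M < 2$. In terms of the complex coordinate $z = (q + ip)/\sqrt2$, the Hamiltonian (9) reads

$$H = \frac{c}{2}z^2 + \frac{\bar c}{2}\bar z^2 + b\,z\bar z, \quad\text{with } b = \frac12(\alpha+\beta) \in \mathbb{R},\quad c = \frac12(\alpha-\beta) - i\gamma \in \mathbb{C}, \tag{12}$$

and $\lambda = \sqrt{|c|^2 - b^2}$. We shall write $M_{(c,b)}$ for the matrix $M$ constructed via (10)–(12), whenever $b^2 \neq |c|^2$. We will make use of the following convenient decomposition of a general hyperbolic matrix $M$ ($\mathrm{Tr}\,M > 2$). We first introduce some notation. For $\mu \in \mathbb{R}_+$ we define:

$$D(\mu) \overset{def}{=} M_{(c=-i\mu,\,b=0)},\qquad B(\mu) \overset{def}{=} M_{(c=-\mu,\,b=0)},\qquad R(\mu) \overset{def}{=} M_{(c=0,\,b=-\mu)}.$$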
Clearly, $D(\mu)$ is hyperbolic, with the $q$ and $p$ axes as unstable and stable axes. $B(\mu)$ is also hyperbolic, with eigenaxes forming angles $\psi_+ = \frac12\arg(-i\bar c) = \frac\pi4 = -\psi_-$ with the horizontal. $R(\mu)$, on the other hand, is just a rotation of angle $\mu$ and hence elliptic. Any hyperbolic matrix $M_{(c,b)}$ as in (10) can be decomposed as:

$$M_{(c,b)} = Q\,D(\lambda)\,Q^{-1}, \quad\text{with } Q = R(b_1)B(b_2), \tag{13}$$

where $b_1$ (between $-\pi/2$ and $\pi/2$) and $b_2 \in \mathbb{R}$ are defined as follows. We denote by $\phi_1$ (between $-\pi/2$ and $\pi/2$) the angle between the $q$ axis and the bisector between the stable and unstable axes of $M_{(c,b)}$, and by $\phi_2$ (between $0$ and $\pi/4$) the angle between the bisector and the stable axis of $M_{(c,b)}$ (Fig. 3). In terms of those, one has:

$$\sinh(2b_2) = \frac{1}{\tan(2\phi_2)},\qquad b_1 = \phi_1 - \frac\pi4. \tag{14}$$

This last decomposition has the following interpretation. The general hyperbolic map $M_{(c,b)}$ is obtained from the special case $D(\lambda)$ ($\lambda = \sqrt{|c|^2 - b^2} > 0$) by a change of coordinates $Q$ yielding a transformation from the $(q,p)$ frame into the unstable-stable frame. The unstable (respectively stable) direction is given by the vectors $v_+ = Qe_q$, $v_- = Qe_p$ (which are, in general, not normalized). Above, we decomposed $Q$ into the transformation $B(b_2)$ which changes the angle between the stable and unstable axis, and the rotation $R(b_1)$ which rotates the whole frame (Fig. 3). We finally remark, for later purposes, that there exists another decomposition:

$$\text{given } M \in SL(2,\mathbb{R}),\ \exists!\,\tilde c \in \mathbb{C},\ \mu \in\, ]-\pi,\pi] \ \text{ so that } M = M_{(\tilde c,0)}R(\mu). \tag{15}$$

[Figure 3: panels “Special Map D(λ)”, “Boost B(b2)”, “Rotation R(φ1 − π/4)” and “General Hyperbolic map M”, with the angles φ1, φ2, ψ+, ψ− marked]
Fig. 3. Decomposition of the general linear hyperbolic map $M_{(c,b)}$ as in (13)

2.2. Linear quantum dynamics. In terms of the usual annihilation and number operators $a = \frac{1}{\sqrt{2\hbar}}(\hat q + i\hat p)$ and $\hat n = \frac12(a^\dagger a + aa^\dagger)$, the Weyl (or canonical) quantization of $H$ in (9) is defined as the self-adjoint operator $\hat H$ on $L^2(\mathbb{R})$ given by:

$$\hat H = \frac12\alpha\hat q^2 + \frac12\beta\hat p^2 + \frac12\gamma(\hat q\hat p + \hat p\hat q) = \hbar\left(\frac{c}{2}a^2 + \frac{\bar c}{2}a^{\dagger2} + b\,\hat n\right). \tag{16}$$

The quantum evolution operator for time $t = 1$ which corresponds to $M_{(c,b)}$ is then:

$$\hat M_{(c,b)} = \exp\left(-\frac{i}{\hbar}\hat H\right). \tag{17}$$

The quantization of the matrix $-M_{(c,b)} = M_{(c,b)}R(\pi)$ can be defined as $\hat M_{(c,b)}\hat P = \hat P\hat M_{(c,b)}$, where $\hat P = -i\hat R(\pi)$ is the parity operator. The unitary operators $\hat M_{(c,b)}$, $\hat M_{(c,b)}\hat P$ yield a projective representation of $SL(2,\mathbb{R})$ (which resembles the metaplectic representation). We will in most of the paper omit to indicate the $\hbar$-dependence of the operators $\hat H$ and $\hat M_{(c,b)}$.

Let $v = v_1e_q + v_2e_p \in \mathbb{R}^2$ and let $T_v : \mathbb{R}^2 \to \mathbb{R}^2$ denote the translation on classical phase space by $v$. The corresponding quantum translation operator is defined by:

$$\hat T_v = \exp\left(-\frac{i}{\hbar}(v_1\hat p - v_2\hat q)\right). \tag{18}$$

These quantum translations satisfy the algebraic identity

$$\hat T_v\hat T_{v'} = e^{iS/\hbar}\,\hat T_{v+v'}, \tag{19}$$

with $S = \frac12(v_2v'_1 - v_1v'_2) = -\frac12 v\wedge v'$, so they generate an (irreducible) unitary representation of the Heisenberg group. For any matrix $M \in SL(2,\mathbb{R})$, one trivially has $MT_vM^{-1} = T_{Mv}$. This intertwining persists at the quantum level:

$$\hat M\hat T_v\hat M^{-1} = \hat T_{Mv}. \tag{20}$$
3. Classical and Quantum Automorphisms of the Torus

3.1. Classical automorphisms and their invariant manifolds. Consider the torus $\mathbb{T} = \mathbb{R}^2/\mathbb{Z}^2$ as a symplectic manifold with the two-form $dq\wedge dp$. Then any $M \in SL(2,\mathbb{Z})$ defines a (discrete) symplectic dynamics on $\mathbb{T}$ in the obvious way. We are interested in the case where $M$ is hyperbolic: the corresponding dynamical system is then an Anosov system [AA]. The stable and unstable manifolds of any point $x \in \mathbb{T}$ are obtained by wrapping the lines with slopes $s_\pm$ passing through $x$ around the torus. We present here some properties of these manifolds that we will need in subsequent sections. A simple example we will use for numerical illustrations is the so called “Arnold’s cat map” [AA]

$$M_{Arnold} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}. \tag{21}$$

Its Lyapounov coefficient is $\lambda_0 = \log\frac{3+\sqrt5}{2} \approx 0.9624$. The stable and unstable manifolds of the fixed point $x = 0$ are depicted in Fig. 4. For any hyperbolic matrix $M$, the slopes $s_+$ and $s_-$ of the unstable and stable directions are quadratic irrationals (i.e. the solutions of a quadratic equation with integer coefficients). It is well known [Kh] that any quadratic irrational $s$ satisfies the following diophantine inequality:

$$\exists C(s) > 0,\ \forall k \in \mathbb{Z},\ \forall l \in \mathbb{N}^*,\quad \left|s - \frac{k}{l}\right| \ge C(s)\frac{1}{l^2} \iff |ls - k| \ge C(s)\frac1l.$$

This means that quadratic irrationals are poorly approximated by rationals, in the sense that, to get an approximation with an error $\epsilon$, you need a rational with a denominator of order at least $\epsilon^{-1/2}$.

Fig. 4. The stable and unstable axes through 0 of the map $M_{Arnold}$ wrap around the torus at infinity. We have only represented the first six occurrences. Axes: $q$ (horizontal) and $p$ (vertical), each running from −0.5 to 0.5
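The diophantine inequality can be explored numerically for the cat map (21). In the sketch below (an added illustration; the names are hypothetical), the slopes of the unstable and stable directions are computed from the eigenvectors of (21) rather than quoted from the article, and the quantity $l\,|ls - k|$ is minimized over denominators up to a chosen bound, which gives an empirical value for the constant $C(s)$ on that range.

```python
import math

SQRT5 = math.sqrt(5.0)
LYAPUNOV = math.log((3 + SQRT5) / 2)  # lambda_0 of the cat map (21)
S_PLUS = (SQRT5 - 1) / 2              # slope of the unstable direction of (21)
S_MINUS = -(SQRT5 + 1) / 2            # slope of the stable direction of (21)


def diophantine_constant(s, max_l):
    """min over 1 <= l <= max_l of l * |l*s - k|, with k the integer nearest to l*s."""
    return min(l * abs(l * s - round(l * s)) for l in range(1, max_l + 1))


if __name__ == "__main__":
    print(LYAPUNOV)
    print(diophantine_constant(S_PLUS, 2000), diophantine_constant(S_MINUS, 2000))
```

The minimum stays bounded away from zero as the bound on $l$ grows, in line with the inequality; a finite scan of this kind is of course only a consistency check, not a proof.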


This inequality will be used in the following manner. Consider the eigenvectors $v_\pm$ of $M_{(c,b)}$ defined as $v_+ = Qe_q$, $v_- = Qe_p$ (with $Q$ the matrix defined in Eq. (13)). As usual, their dual basis $u_\pm$ (defined as $v_+\cdot u_+ = 1$, $v_+\cdot u_- = 0$, etc.) can be used to express the coordinates of a point $x$ in the basis $v_\pm$:

$$x = q'(x)\,v_+ + p'(x)\,v_-, \quad\text{with}\quad q'(x) = x\cdot u_+,\quad p'(x) = x\cdot u_-. \tag{22}$$

We call $d(x, \mathbb{Z}^2_*)$ the distance between a point $x \in \mathbb{R}^2$ and $\mathbb{Z}^2_* = \mathbb{Z}^2\setminus\{0\}$, and we will estimate it for $x$ on the (un)stable axis:

$$\exists C > 0,\ \forall x \in \mathbb{R}v_\pm,\quad d(x, \mathbb{Z}^2_*) \ge \frac{C}{\|x\| + 1}. \tag{23}$$

To prove this, first note that, for any $n \in \mathbb{Z}^2$ s.t. $n_q \neq 0$, we have

$$|p'(n)| = |n\cdot u_-| = |u_{-,p}|\left|\frac{u_{-,q}}{u_{-,p}}\,n_q + n_p\right| \ge \frac{C(s_+)\,|u_{-,p}|}{|n_q|}, \tag{24}$$

where we have used the fact that $u_{-,q}/u_{-,p} = s_+$ is a quadratic irrational. Interchanging $n_q$ and $n_p$, we obtain a first set of inequalities:

Lemma 1. There is a constant $C$ (depending on $M$) such that, for any integer lattice point $n \neq 0$,

$$|p'(n)| \ge \frac{C}{\|n\|} \quad\text{and}\quad |q'(n)| \ge \frac{C}{\|n\|}.$$

We can now prove (23) as follows. For each $x \in \mathbb{R}v_+$, there exists an $n \in \mathbb{Z}^2_*$ so that

$$d(x, \mathbb{Z}^2_*) = \|n - x\| \ge \frac{|n\cdot u_-|}{\|u_-\|} \ge \frac{C_\pm}{\|u_-\|\,\|n\|}.$$

Since, obviously, $\|n - x\| \le 1/\sqrt2$, (23) follows easily. We will in addition need a slightly refined statement. If the lattice point $n \neq 0$ is in a sufficiently thin strip around the unstable axis, it satisfies $\|p'(n)v_-\| \le 1/2 \le \|n\|/2$, which implies the lower bound $|q'(n)| \ge \frac{\|n\|}{2\|v_+\|}$. Together with the above lemma, this entails $|p'(n)| \ge \frac{C_1}{|q'(n)|}$ for a certain $C_1$. Interchanging $p' \leftrightarrow q'$, we see that the same inequality holds for points in a sufficiently thin strip around the stable axis. Outside the union of these strips, this inequality can be violated by at most a finite set of lattice points; therefore, upon reducing the constant $C_1$ we obtain the main technical result of this section:

Lemma 2. There exists a constant $C_o > 0$ (depending on $M$) such that, for any integer points $n \neq m$ of the plane, their coordinates along the (un)stable directions satisfy:

$$|q'(n) - q'(m)| \ge \frac{C_o}{|p'(n) - p'(m)|}. \tag{25}$$

These inequalities precisely control the sparseness of the lattice points inside a strip around the unstable axis: the narrower the strip, the farther successive lattice points have to be from each other.
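Lemma 2 can be illustrated on the cat map (21). Since the coordinates $q'$, $p'$ are linear, it suffices to bound $|q'(n)\,p'(n)|$ from below over nonzero integer points $n$. The sketch below is an added illustration under one stated assumption: because the matrix (21) is symmetric, its stable and unstable directions are orthogonal, so the basis $v_\pm$ is taken to be the orthonormal eigenbasis and the dual basis coincides with it. All names are hypothetical.

```python
import math

S_PLUS = (math.sqrt(5.0) - 1) / 2     # unstable slope of the cat map (21)
NORM = math.sqrt(1 + S_PLUS ** 2)
U_PLUS = (1 / NORM, S_PLUS / NORM)    # unit vector along the unstable axis
U_MINUS = (-S_PLUS / NORM, 1 / NORM)  # unit vector along the stable axis (orthogonal to U_PLUS)


def coords(n):
    """(q'(n), p'(n)): components of n along the unstable and stable axes."""
    return (n[0] * U_PLUS[0] + n[1] * U_PLUS[1],
            n[0] * U_MINUS[0] + n[1] * U_MINUS[1])


def min_product(radius):
    """min of |q'(n) * p'(n)| over nonzero integer points with |n_q|, |n_p| <= radius."""
    best = math.inf
    for a in range(-radius, radius + 1):
        for b in range(-radius, radius + 1):
            if (a, b) != (0, 0):
                qp, pp = coords((a, b))
                best = min(best, abs(qp * pp))
    return best


if __name__ == "__main__":
    print(min_product(60), 1 / math.sqrt(5))
```

For this particular map the product $q'(n)p'(n)$ equals $(n_p^2 + n_qn_p - n_q^2)/\sqrt5$, a nonzero integer divided by $\sqrt5$, so the scan returns $1/\sqrt5$ whatever the radius.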


3.2. Quantum mechanics on the torus. We recall as briefly as possible the basic setting for the quantum mechanics of a system with $\mathbb{T}$ as phase space, as well as the quantization of the automorphism $M$, referring to [HB, DE, BouDB] and references therein for further details. In order to define the Hilbert space associated to $\mathbb{T}$, we first consider the translation operators $\hat T_1 = \hat T_{(1,0)}$, $\hat T_2 = \hat T_{(0,1)}$, which satisfy $\hat T_1\hat T_2 = e^{-i/\hbar}\,\hat T_2\hat T_1$ as a result of (19). So for the values of $\hbar$ defined as:

$$N = \frac{1}{2\pi\hbar} \in \mathbb{N}^*, \tag{26}$$

one has the property $[\hat T_1, \hat T_2] = 0$. The Hilbert space $L^2(\mathbb{R})$ may then be decomposed as a direct integral of the joint eigenspaces of $\hat T_1$ and $\hat T_2$:

$$L^2(\mathbb{R}) = \int^{\oplus}\mathcal{H}_{N,\theta}\,\frac{d^2\theta}{(2\pi)^2},\qquad \mathcal{H}_{N,\theta} = \left\{|\psi\rangle \in \mathcal{S}'(\mathbb{R})\ \big|\ \hat T_1|\psi\rangle = e^{i\theta_1}|\psi\rangle,\ \hat T_2|\psi\rangle = e^{i\theta_2}|\psi\rangle\right\}. \tag{27}$$

The “angle” $\theta = (\theta_1, \theta_2) \in [0, 2\pi[^2$ thus describes the periodicity properties of the wave function under translations by an elementary cell. $\mathcal{H}_{N,\theta}$ is $N$-dimensional. We can define a projector $\hat P_\theta$ from $\mathcal{S}(\mathbb{R})$ onto the space $\mathcal{H}_{N,\theta}$:

$$\hat P_\theta = \sum_{(n_1,n_2)\in\mathbb{Z}^2} e^{-in_1\theta_1 - in_2\theta_2}\,\hat T_1^{n_1}\hat T_2^{n_2} = \sum_{n\in\mathbb{Z}^2} e^{-i\theta\cdot n + i\delta_n}\,\hat T_n. \tag{28}$$

The phase $\delta_n = -n_1n_2N\pi$ comes from the decomposition $\hat T_n = e^{-i\delta_n}\hat T_1^{n_1}\hat T_2^{n_2}$. The Weyl quantization of a function $f(x) = \sum_{k\in\mathbb{Z}^2} f_k\,e^{2i\pi(x\wedge k)}$ is an operator on $\mathcal{H}_{N,\theta}$ defined by

$$\hat f = \sum_{k\in\mathbb{Z}^2} f_k\,\hat T_{k/N}. \tag{29}$$

For $|\psi\rangle \in \mathcal{H}_{N,\theta}$, its “Wigner function” $W_\psi(x)$ is the distribution implicitly defined via

$$\langle\psi|\hat f|\psi\rangle = \int_{\mathbb{T}} f(x)\,W_\psi(x)\,dx, \quad\text{so that}\quad \tilde W_\psi(k) = \langle\psi|\hat T_{k/N}|\psi\rangle, \tag{30}$$

where the $\tilde W_\psi(k) = \int_{\mathbb{T}} e^{2i\pi(x\wedge k)}\,W_\psi(x)\,dx$ are the Fourier coefficients of $W_\psi$. Let now $M \in SL(2,\mathbb{Z})$, so that $A, B, C, D$ (see Eq. (10)) are integers. One then easily deduces from (20) and (28) that the quantum map $\hat M$ satisfies:

$$\hat M\hat P_\theta = \hat P_{\theta'}\hat M, \quad\text{with}\quad \theta' = \theta M^{-1} + 2\pi\frac{N}{2}(CD, AB). \tag{31}$$

The constant shift on the right-hand side (RHS) is due to the phases $\delta_n$ appearing in (28). $\hat M$ will define an endomorphism in $\mathcal{H}_{N,\theta}$ provided $\theta' \equiv \theta \bmod 2\pi$, i.e. provided $\theta$ is a fixed point of the dual map defined in (31). Given a hyperbolic matrix $M$, such a fixed point exists for any $N$ [DE]. In particular, for any matrix $M$ the angle $\theta = (0, 0)$ (periodic wavefunctions) is a fixed point if $N$ is even, while $\theta = (\pi, \pi)$ (antiperiodic wavefunctions) is a fixed point for $N$ odd. We will always make this choice for our numerical examples.
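The fixed-point condition on $\theta$ can be tested directly from (31). The sketch below is an added illustration with hypothetical names. It assumes that $\theta M^{-1}$ denotes a row vector multiplied by the matrix $M^{-1}$, works with $\theta/\pi$ as exact fractions, and reduces the result mod 2 (that is, $\theta'$ mod $2\pi$).

```python
from fractions import Fraction


def dual_angle(theta_over_pi, M, N):
    """theta'/pi from Eq. (31), reduced mod 2, for theta given in units of pi."""
    (A, B), (C, D) = M
    t1, t2 = theta_over_pi
    # Row vector theta times M^{-1} = [[D, -B], [-C, A]] (det M = 1).
    r1 = t1 * D - t2 * C
    r2 = -t1 * B + t2 * A
    # The shift 2*pi*(N/2)*(CD, AB) equals pi*N*(CD, AB).
    return ((r1 + N * C * D) % 2, (r2 + N * A * B) % 2)


if __name__ == "__main__":
    CAT = ((2, 1), (1, 1))
    zero = (Fraction(0), Fraction(0))
    pi_pi = (Fraction(1), Fraction(1))
    print(dual_angle(zero, CAT, 500), dual_angle(pi_pi, CAT, 501))
```

For the cat map (21) this reproduces the choices described above: $\theta = (0,0)$ is fixed for even $N$ and $\theta = (\pi,\pi)$ for odd $N$.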


From now on, we will assume that $M = \pm M_{(c,b)} \in SL(2,\mathbb{Z})$ is a fixed hyperbolic matrix defining a dynamics on the plane and on the torus. We will therefore no longer indicate its dependence on $(c, b)$. We will also assume that $\hbar$ is such that (26) holds, and for this $\hbar$ we select an angle $\theta$ such that $\theta' \equiv \theta$. In general, $\theta$ can depend on $\hbar$, but we will not indicate this dependence.

4. Coherent States and Their Evolution

4.1. Standard and squeezed coherent states. With the normalized state $|0\rangle$ defined by $a|0\rangle = 0$, a “standard” coherent state is

$$|x\rangle = \hat T_x|0\rangle,\qquad x = (q, p) \in \mathbb{R}^2. \tag{32}$$

More generally, we define for each $\tilde c \in \mathbb{C}^*$ the “squeezed” coherent states $|x, \tilde c\rangle$ by

$$|\tilde c\rangle = |0, \tilde c\rangle = \hat M_{(\tilde c,0)}|0\rangle,\qquad |x, \tilde c\rangle = \hat T_x|\tilde c\rangle, \tag{33}$$

where the “squeezing operator” $\hat M_{(\tilde c,0)}$ is defined by (17), with $\tilde b = 0$. Note that, in view of (15), given $M \in SL(2,\mathbb{R})$, $\exists!\,\tilde c \in \mathbb{C}$, $\sigma \in [0, 2\pi[$ such that

$$\hat M|0\rangle = e^{i\sigma}|\tilde c\rangle. \tag{34}$$

For more details on coherent states, we refer to [Z, Pe]. To avoid confusion, we will use a tilde for the parameters of the squeezing operator $\hat M_{(\tilde c,0)}$, and keep untilde notations for the parameters of the dynamics defined by the matrix $M \overset{def}{=} \pm M_{(c,b)}$ that are at any rate kept fixed throughout the further discussion. In the $L^2(\mathbb{R})$ representation, the state $|x, \tilde c\rangle$ is a Gaussian wave packet with mean position $q$. Its Fourier transform is centered around the mean momentum $p$. For any state $|\psi\rangle \in L^2(\mathbb{R})$, we define its Bargmann function as $x \mapsto \langle x, \tilde c|\psi\rangle$, and its Husimi function to be the positive function $H_{\tilde c,\psi}$ defined on phase space $\mathbb{R}^2$ by:

$$H_{\tilde c,\psi}(x) = \frac{|\langle x, \tilde c|\psi\rangle|^2}{2\pi\hbar}, \quad\text{which satisfies}\quad \int_{\mathbb{R}^2} H_{\tilde c,\psi}(x)\,dx = \|\psi\|^2_{L^2(\mathbb{R})}. \tag{35}$$

Note that for given $|\psi\rangle$, the Bargmann and Husimi functions depend on the choice of $\tilde c$. Also, the function $x \mapsto \langle x, \tilde c|\psi\rangle$ is the product of a Gaussian factor with a function holomorphic with respect to a $\tilde c$-dependent holomorphic structure. The term Bargmann function is usually reserved for the holomorphic factor, but we find it convenient to adopt here a slightly different convention. We will need the explicit expression of the (standard) Bargmann and Husimi functions of the squeezed coherent state $|\tilde c\rangle$:

$$\langle x, 0|\tilde c\rangle = \frac{1}{\sqrt{\cosh|\tilde c|}}\,\exp\left[-\frac12\left(\frac{\tilde q^2}{\Delta\tilde q^2} + \frac{\tilde p^2}{\Delta\tilde p^2}\right)\right]\exp\left(-i\,\frac{\tilde q\,\tilde p\,\tanh|\tilde c|}{2\hbar}\right). \tag{36}$$

Here the unstable-stable frame $(\tilde q, \tilde p)$ of the symmetric matrix $M_{(\tilde c,0)}$ is easily seen from the formulas in Sect. 2 to be obtained from $(q, p)$ by a rotation of angle $\tilde\psi_+$ (Fig. 5), and the widths are given by

$$\Delta\tilde q^2 = \frac{2\hbar}{1 - \tanh|\tilde c|},\qquad \Delta\tilde p^2 = \frac{2\hbar}{1 + \tanh|\tilde c|}. \tag{37}$$

Fig. 5a, b. Modulus square of the Bargmann function of a squeezed coherent state $|\tilde c\rangle$, as given in (36). The inverse Planck’s constant $N = 1/h = 40$, and the squeezing parameter $\tilde c = -i|\tilde c|\,e^{-2i\tilde\psi_+}$ with $|\tilde c| = 0.962$, $\tilde\psi_+ = 32^\circ$ (this corresponds to $\tilde c_1$ for the map $\hat M_{Arnold}$). (a) Three dimensional picture. (b) Typical size and orientation of the distribution: ellipse “supporting” the distribution, with axes $\tilde q$, $\tilde p$ rotated by $\tilde\psi_+$ with respect to $q$, $p$ and widths $\Delta\tilde q$, $\Delta\tilde p$

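Standard and squeezed coherent states on the torus are defined to be the images of the previous coherent states by the projector $\hat P_\theta$. We use the notation:

$$|x, \tilde c, \theta\rangle = \hat P_\theta|x, \tilde c\rangle \in \mathcal{H}_{N,\theta}. \tag{38}$$

These states are asymptotically normalized:

$$\langle x, \tilde c, \theta|x, \tilde c, \theta\rangle = 1 + O(e^{-C(\tilde c)/\hbar})$$

and satisfy a resolution of the identity on the Hilbert spaces $\mathcal{H}_{N,\theta}$ [BouDB]:

$$\int_{\mathbb{T}}\frac{dq\,dp}{2\pi\hbar}\,|x, \tilde c, \theta\rangle\langle x, \tilde c, \theta| = \hat I_{\mathcal{H}_{N,\theta}}. \tag{39}$$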

Similarly as above, one defines for any $|\psi\rangle \in \mathcal{H}_{N,\theta}$ its Bargmann “function” $x \mapsto \langle x, \tilde c, \theta|\psi\rangle$ (which is actually a section of a suitable line bundle over $\mathbb{T}$, i.e. a quasiperiodic function on $\mathbb{R}^2$, but this shall not interest us here), and Husimi function $H_{\tilde c,\psi,\theta}(x) = N\,|\langle x, \tilde c, \theta|\psi\rangle|^2$, a bona fide function on the torus (of which we omit to indicate the $N$-dependence).

4.2. The evolution of coherent states. Before turning to quasimodes, we need to study in detail the quantum evolution of the squeezed coherent state $|\tilde c, \theta\rangle$ which is given by $|t; \tilde c, \theta\rangle \overset{def}{=} \hat M^t|\tilde c, \theta\rangle$, $t \in \mathbb{Z}$. We will extend this notation to any real time, by $|t; \tilde c, \theta\rangle = \hat P_\theta\,e^{-i\hat Ht/\hbar}|\tilde c\rangle$. Due to (34), the states $|t; \tilde c, \theta\rangle$ are again squeezed coherent states (up to a global phase), so this evolution defines a time flow $\tilde c(t)$ on the family of squeezed coherent states centered at the origin. All squeezed states at the origin have even parity: $\hat P|\tilde c\rangle = |\tilde c\rangle$, so that the evolution of $|\tilde c\rangle$ through the map $\hat M\hat P$ is the same as through $\hat M$ (yet, these two maps might require different values for $\theta$, see Eq. (31)). It will turn out that $|t; \tilde c, \theta\rangle$ will be most simply described if the initial squeezed state $|\tilde c_0, \theta\rangle$ at time $t = 0$ is well chosen in terms of the decomposition (13). Defining, with the notations of (13)–(14), $\tilde c_0 \overset{def}{=} -b_2\,e^{-2ib_1}$, it is easy to check that $|\tilde c_0\rangle = e^{-ib_1/2}\,\hat Q|0\rangle$ since $\hat M_{(\tilde c_0,0)} = \hat R(b_1)\hat B(b_2)\hat R(-b_1)$ and $\hat R(-b_1)|0\rangle = e^{-ib_1/2}|0\rangle$. Then, with $\hat M = \hat Q\hat D(\lambda)\hat Q^{-1}$,

$$\hat M^t|\tilde c_0\rangle = e^{-ib_1/2}\,\hat Q\hat D(\lambda t)|0\rangle,\qquad \langle\tilde c_0|\hat M^t|\tilde c_0\rangle = \langle0|\hat D(\lambda t)|0\rangle = \frac{1}{\sqrt{\cosh(\lambda t)}} \in \mathbb{R}_+, \tag{40}$$

so the overlap $\langle\tilde c_0|\hat M^t|\tilde c_0\rangle$ is real positive for all times. For later purposes we note that, defining, for $s \in \mathbb{R}$, $\tilde c_s \in \mathbb{C}$, $\sigma_s \in [0, 2\pi[$ by

$$e^{-i\hat Hs/\hbar}|\tilde c_0\rangle = e^{i\sigma_s}|\tilde c_s\rangle, \tag{41}$$

(see (34)), it is clear that $\langle\tilde c_s|e^{-i\hat Ht/\hbar}|\tilde c_s\rangle$ is real positive for all $t$. In fact, it can be shown that the $\tilde c_s$ are the only values of $\tilde c$ with this property. Among all $s$, $s = 0$ maximizes $|\langle0|\tilde c_s\rangle|^2$, so $|\tilde c_0\rangle$ is in a sense the most localized state among all $|\tilde c_s\rangle$. In this paper, we will almost exclusively build quasimodes from coherent states with “squeezing” $\tilde c_0$; this choice is made for pure convenience, and our main semiclassical results apply to more general squeezings as well (see Sect. 6.6 and Appendix 10.2).

Before turning to $|t; \tilde c, \theta\rangle \in \mathcal{H}_{N,\theta}$, we first describe the evolved state $|t; \tilde c_0\rangle \overset{def}{=} e^{-i\hat Ht/\hbar}|\tilde c_0\rangle \in L^2(\mathbb{R})$, by studying its Husimi function on the plane, as defined in (35). It will be convenient (but again not absolutely necessary for our results, see Sect. 10.2) to adapt the choice of $\tilde c$ in the definition of this Husimi function to the dynamics $M$ by putting $\tilde c = \tilde c_0$. One then computes

$$H_{\tilde c_0,t}(x) \overset{def}{=} \frac{|\langle\tilde c_0|\hat T_x^\dagger|t; \tilde c_0\rangle|^2}{2\pi\hbar} = \frac{|\langle0|\hat Q^\dagger\hat T_x^\dagger\hat Q\hat D(\lambda t)|0\rangle|^2}{2\pi\hbar} = \frac{|\langle0|\hat T^\dagger_{Q^{-1}x}\hat D(\lambda t)|0\rangle|^2}{2\pi\hbar}. \tag{42}$$

It is now natural to use the coordinates $(q', p') = Q^{-1}(q, p) \in \mathbb{R}^2$ attached to the unstable-stable basis $(v_+, v_-)$ (see Eq. (22)). In terms of these, the Husimi function is a Gaussian drawn on the unstable and stable axes:

$$H_{\tilde c_0,t}(x) = \frac{1}{2\pi\hbar\cosh(\lambda t)}\exp\left(-\frac{q'^2}{\Delta q'^2} - \frac{p'^2}{\Delta p'^2}\right), \tag{43}$$

with

$$\Delta q'^2 = \frac{2\hbar}{1 - \tanh(\lambda t)} \overset{t\to\infty}{\sim} \hbar\,e^{2\lambda t},\qquad \Delta p'^2 = \frac{2\hbar}{1 + \tanh(\lambda t)} = e^{-2\lambda t}\,\Delta q'^2 \overset{t\to\infty}{\to} \hbar. \tag{44}$$
The Husimi distribution of the evolved state $|t; \tilde c_0\rangle$ therefore spreads exponentially (with rate $\lambda$) in the unstable direction of the map, and has a finite transverse width $\sqrt\hbar$. It “lives” in an elliptic region of phase space centered on the origin and of area $\Delta q'\Delta p' \sim \hbar\,e^{\lambda t}$. Due to conservation of the total probability, the height of the distribution decreases exponentially.

We now turn to $|t; \tilde c_0, \theta\rangle = \hat M^t|\tilde c_0, \theta\rangle$, $t \ge 0$ and its Husimi function $H_{\tilde c_0,t,\theta}(x) = N\,|\langle x, \tilde c_0|\hat P_\theta|t; \tilde c_0\rangle|^2$.

Fig. 6. Husimi function of the state $|t; \tilde c_0, \theta\rangle$ for the dynamics (21) and $N = 1/(2\pi\hbar) = 500$, shown at times $t = -6, -3, 0, 3, 6$ (axes $q$, $p$). One has $T \approx 8.37$

It is clear from (28) that $\langle x, \tilde c_0|\hat P_\theta|t; \tilde c_0\rangle$ is obtained by summing (up to some phases) the translates of the function $\langle x, \tilde c_0|t; \tilde c_0\rangle$ into the different phase space cells of size 1 centered on the points of $\mathbb{Z}^2$ (the cell around 0 will be called the fundamental cell $\mathcal{F}$). Consequently, it follows from (43)–(44) that this function is non-negligible at a point $x \in \mathcal{F}$ only if $x$ lies within a distance $\sqrt\hbar$ from a stretch of length $\Delta q' \approx \sqrt\hbar\,e^{\lambda t} = e^{\lambda(t - T/2)}$ of the unstable manifold through 0 (Fig. 6). Here we introduced the Ehrenfest time as

$$T \overset{def}{=} \frac{|\log\hbar|}{\lambda}. \tag{45}$$
Since at time | log |/(2λ) = T /2, q reaches the size 1 (i.e. the size of the torus), it is clear that for shorter √ λttimes the Husimi function Hc˜0 ,t,θ lives in an elliptic region of shrinking diameter e around 0. For times larger than T /2, this Husimi function starts to wrap itself around the torus along the unstable axis or, equivalently, the support of some of the translates x + n, c˜0 |t; c˜0 start to enter into the fundamental cell. The diophantine properties guarantee that the branches of the piece of length q of the unstable manifold passing through the origin are roughly at a distance 1/q = e−λ(t−T /2) from each other (Fig. 4). Consequently, as long as p << e−λ(t−T /2) , i.e. as long as t ≤ (1 − )T , the main contribution to x, c˜0 |Pˆθ |t; c˜0 and hence to the Husimi function Hc˜0 ,t,θ comes from a single term x + n, c˜0 |t; c˜0 for most x ∈ F. We say there are no interference effects. The regime (1 + )T /2 ≤ t ≤ (1 − )T was studied in [BonDB1] where it was proven that on that time scale the Husimi function equidistributes on the torus. For longer times t ≥ (1 + )T , when the area p q occupied by the support of Hc˜0 ,t becomes larger than the area of the torus itself, several terms may contribute equally to x, c˜0 |Pˆθ |t; c˜0 . In the next subsection we give a detailed control on the onset of this “interference regime” up to time 2T for the Husimi function of |t; c˜0 , θ evaluated at the origin x = 0; we shall show that the interferences remain “small” up to the time 2T . As a last remark, we point out that the above discussion is symmetric with respect to time reversal. For negative times, Hc˜0 ,t,θ spreads along the stable direction, reaches the boundary of F around −T /2, and will interfere with itself for t ≤ −T . 4.3. Estimating the interference effects. As explained in the introduction, our crucial technical estimate concerns the autocorrelation function for the state |c˜0 , θ , given by c˜0 , θ|Mˆ t |c˜0 , θ. 
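More generally, we will need control on

$$\langle\tilde c_s,\theta|\hat M^t|\tilde c_s,\theta\rangle = \langle\tilde c_s|\hat P_\theta\hat M^t|\tilde c_s\rangle = \langle\tilde c_s|\hat M^t|\tilde c_s\rangle + I(t,s), \tag{46}$$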

Scarred Eigenstates for Quantum Cat Maps of Minimal Periods


where we separated the contribution of the term $n=(0,0)$ (the “plane overlap”) from the remaining terms:

$$I(t,s) \overset{\mathrm{def}}{=} \sum_{n\in\mathbb Z^2_*} e^{-in\cdot\theta+i\delta_n}\,\langle\tilde c_s|\hat T_n\hat M^t|\tilde c_s\rangle. \tag{47}$$

This remainder represents the interference of the evolved plane coherent state with the lattice-translated initial state. We will show that these contributions tend to 0 as $N\to\infty$, uniformly for all times $|t| \le 2(1-\epsilon)T$, for any fixed $\epsilon>0$. A trivial upper bound is

$$|I(t,s)| \le \sum_{n\in\mathbb Z^2_*}\bigl|\langle n,\tilde c_s|\,e^{-\frac{i}{\hbar}\hat H t}\,|\tilde c_s\rangle\bigr| \overset{\mathrm{def}}{=} J_0(t,s), \tag{48}$$

and we shall estimate the RHS. Note that we extended $I(t,s)$ in the natural way to real times $t$. The detailed proofs of the estimates below are given in Appendix 10.1; here we limit ourselves to explaining the underlying ideas and to an instructive comparison with a numerical example. For simplicity, we will concentrate on the case $s=0$. We define a time-dependent metric on the plane adapted to the Gaussian in (43):

$$\|x\|_t^2 \overset{\mathrm{def}}{=} \frac12\left(\frac{q'(x)}{\Delta q'(t)}\right)^2 + \frac12\left(\frac{p'(x)}{\Delta p'(t)}\right)^2.$$

The RHS of (48) is simply the sum of this Gaussian of height $H_t = (\cosh\lambda t)^{-1/2}$ evaluated at all nonzero integer lattice points. The diophantine properties proven in Sect. 3.1 provide information on the position of the integer lattice with respect to the ellipse $\{\|x\|_t^2 = 1\}$ and allow us to prove the following estimates:

• for relatively short times (meaning $|t| \le (1-\epsilon)T$), all lattice points $n\neq0$ are far outside the support of the Gaussian, so that $\|n\|_t$ is large. In fact, the distance $\|n\|_t$ reaches its minimum for a single point $n_o$ (more precisely a finite number $\mathcal N$ of points), with $\|n_o\|_t^2 > c\,e^{-\lambda|t|}/\hbar \gg 1$. Note that, here and in the following, we write $f(\hbar) \ll g(\hbar)$ when $\lim_{\hbar\to0} f(\hbar)/g(\hbar) = 0$. $J_0(t,0)$ is dominated by the contribution of this finite set of points, given by $\mathcal N H_t \exp(-\|n_o\|_t^2)$, the contributions of farther points being much smaller. The precise bound proven in the appendix reads:

$$|t|\le T \;\Longrightarrow\; |I(t,0)| \le 2\sqrt2\,e^{-\lambda|t|/2}\exp\!\left(-C_o\,\frac{e^{-\lambda|t|}}{2\hbar}\right)\left(1 + C\,e^{\lambda(|t|-T)/2}\right), \tag{49}$$

where the constant $C_o$ is the parameter of the diophantine equation (25), and $C$ can be computed explicitly (it depends only on $M$).

• For times $|t| \ge T$, a large number of lattice points ($N_t = \Delta q'(t)\,\Delta p'(t) \sim e^{\lambda(|t|-T)}$) are contained in the ellipse (i.e. satisfy $\|n\|_t \le 1$), and their collective contribution dominates the RHS of (48): $|I(t)| \lesssim N_t H_t \sim e^{\lambda(-T+|t|/2)}$. This is indeed essentially what we prove:

$$T\le|t| \;\Longrightarrow\; |I(t,0)| \le \frac{2\pi\sqrt2}{C_o}\,e^{\lambda(-T+|t|/2)}\left(1 + C\,e^{\lambda(T-|t|)/2}\right), \tag{50}$$

where $C$ can be computed explicitly in terms of $M$. This upper bound becomes of order unity for $|t| \approx 2T$.


F. Faure, S. Nonnenmacher, S. De Bièvre

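• From the definition (46), we have trivially for any time

$$|I(t,0)| \le \langle\tilde c_0,\theta|\tilde c_0,\theta\rangle + \bigl|\langle\tilde c_0|\hat M^t|\tilde c_0\rangle\bigr| \le 1 + O\bigl(e^{-C(\tilde c_0)/\hbar}\bigr) + \frac{1}{\sqrt{\cosh(\lambda|t|)}}.$$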
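The lattice sum behind these three regimes can be explored numerically. The sketch below evaluates $H_t\sum_{n\neq0}\exp(-\|n\|_t^2)$ by brute force. Several ingredients are assumptions made for illustration only: the map is taken to be $M = \begin{pmatrix}2&1\\1&1\end{pmatrix}$ (whose eigenvectors are orthonormal), $2\pi\hbar = 1/N$, and the widths are normalised as $\Delta q'(t)^2 = 2\hbar\,e^{\lambda t}\cosh\lambda t$ and $\Delta p'(t)^2 = 2\hbar\,e^{-\lambda t}\cosh\lambda t$, so that $\Delta q'\Delta p' \sim e^{\lambda(|t|-T)}$. The sum is truncated to a finite box, so it is only a sketch of $J_0(t,0)$.

```python
import math

LAM = math.log((3 + math.sqrt(5)) / 2)   # Lyapunov exponent of the assumed map

def j0(t, N):
    """Brute-force lattice sum H_t * sum_{n != 0} exp(-||n||_t^2) (truncated)."""
    hbar = 1.0 / (2 * math.pi * N)        # assumption: 2*pi*hbar = 1/N
    mu = math.exp(LAM)
    norm = math.hypot(1.0, mu - 2.0)
    vu = (1.0 / norm, (mu - 2.0) / norm)  # unstable direction of [[2,1],[1,1]]
    vs = (-vu[1], vu[0])                  # stable direction (orthogonal to vu)
    th = math.tanh(LAM * t)
    dq = math.sqrt(2 * hbar / (1 - th))   # assumed width along the unstable axis
    dp = math.sqrt(2 * hbar / (1 + th))   # assumed width along the stable axis
    height = 1.0 / math.sqrt(math.cosh(LAM * t))
    R = int(8 * max(dq, dp)) + 2          # truncation box; farther points are negligible
    total = 0.0
    for n1 in range(-R, R + 1):
        for n2 in range(-R, R + 1):
            if n1 == 0 and n2 == 0:
                continue                  # the n = 0 term is the "plane overlap"
            q = n1 * vu[0] + n2 * vu[1]
            p = n1 * vs[0] + n2 * vs[1]
            total += math.exp(-0.5 * ((q / dq) ** 2 + (p / dp) ** 2))
    return height * total

N = 100
T = math.log(2 * math.pi * N) / LAM
for frac in (0.0, 0.5, 1.0):
    print(f"t = {frac:.1f} T : J0 ~ {j0(frac * T, N):.3e}")
```

With these conventions the sum is negligible at short times and only reaches a sizeable fraction of unity around $t \approx T$, in line with the qualitative picture of the first two bullets.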

Combining these estimates (generalized to $s\neq0$), one obtains the following proposition:

Proposition 1. There exist positive constants $C, C', C''$ such that for all times $t\in\mathbb R$, and for all $s$ in a bounded interval

$$|I(t,s)| \le J_0(t,s) \le \min\Bigl(C\,\hbar\,e^{\lambda|t|/2},\; 1 + \sqrt2\,e^{-\lambda|t|/2} + C'\,e^{-C''/\hbar}\Bigr). \tag{51}$$

This shows that the interferences remain small until times of order $2T$. The existence of “short quantum periods” for certain values of $\hbar$ (see the introduction and Sect. 8) implies that $I(t,0)$ is of order 1 at $t = P \approx 2T$ for these values of $\hbar$. This is further illustrated in Fig. 7. Figure 7 shows numerical calculations of $\log|I(t,0)|$ for values of Planck’s “constant” $N = 9349 \to 9359$ and compares them to $F(t)$, which is essentially given by the upper bounds (49)–(50). We observe that, whereas (49) is close to optimal, the same is not true for (50) for most values of $N$: there is a “plateau” $\log|I(t,0)| \approx \log(\hbar^{1/2})$ for $t > T'$, where $T' = \log(N)/\lambda$ is a shifted Ehrenfest time. This plateau can be explained by assuming that the phases which multiply the different terms in $I(t,0)$ are uncorrelated, like independent random phases. For $t \gg T'$, the RHS of (47) could then be replaced by a sum of many ($\approx N_t$) terms with identical moduli $H_t$ but random uncorrelated phases, similar to a 2-dimensional random walk. The modulus of the sum (i.e. the length of the random walk) has a typical value $|I(t,0)| \sim \sqrt{N_t}\,H_t \sim \hbar^{1/2}$, independent of time: this is indeed what we see numerically. However, for the value of $N = 9349$, corresponding to a “short quantum period” $P = 19$, as discussed in Sect. 8, $\log|I(t,0)|$ is close to the upper bounds (49)–(50)

[Figure 7: curves of log |I(t,0)| versus t for 1/h = N = 9349 and for N = 9350, ..., 9359, together with the bound F(t) and the level log(h^{1/2}); the times T′ and 2T′ are marked on the t axis.]

Fig. 7. Numerical calculations of $\log|I(t,0)|$, for the map (21). The heuristic upper bound $F(t)$ (solid line) is defined in terms of the shifted Ehrenfest time $T' = \log(N)/\lambda$: $F(t) = -\frac{\lambda t}{2} - e^{\lambda(T'-t)-1}$ for $0 < t < T'$ and $F(t) = \frac{\lambda}{2}(t - 2T') + 0.5$ for $T' < t < 2T'$. The horizontal dashed line at $\log(h^{1/2})$ gives the order of magnitude of the plateau for $t > T'$


up to time $P \approx 2T$. In such exceptional cases (crucial in this paper) there appear strong correlations between the phases in the sum $I(t,0)$: the random walk somehow becomes “rigid”, which makes its total length of the same order as the sum of individual lengths, $|I(t,0)| \sim J_0(t,0) \sim N_t H_t$. This rigidity can actually be analyzed directly from the explicit expression for the phases [FN2]: one first finds that for these special values of Planck’s constant $N = N_k$ and $t$ in the interval $T < t < 2T$, the phases corresponding to the relevant $\sim N_t$ lattice points are all close to $2d$-th roots of unity, where $d = (\operatorname{tr}M)^2 - 4$ (in the example $M = M_{Arnold}$ and $N_k = 9349$, the relevant phases are all close to unity). Then, the sum of these $\sim N_t$ phases behaves like $G(M,N_k)\,N_t$ for $N_t \gg 1$, and one can check that the prefactor $G(M,N_k)$ (a Gauss sum) is bounded away from zero uniformly (e.g. $G(M_{Arnold}, 9349) = 1$). This explains the behaviour $|I(t,0)| \sim J_0(t,0)$. This situation drastically differs from the case of a “generic” $N$, where the relevant phases are more or less equidistributed over the circle.

5. Quasimodes at the Origin

5.1. Continuous time versus discrete time quasimodes. We are now ready to study the quasimodes (2) and (6) “associated” with the periodic orbits of the dynamics generated by $M$, as discussed in the introduction. To alleviate the notations, we start with the case where the orbit is simply the fixed point $(0,0)\in\mathbb T$. The rather straightforward generalization to arbitrary orbits is given in Sect. 7. Note that the Ehrenfest time $T = |\ln\hbar|/\lambda$ is in general not an integer: whenever $T$ or $T/2$ appears in a sum boundary, it should therefore be replaced by the nearest integer. It will be convenient to also consider slightly modified quasimodes, for which the initial state is not the squeezed coherent state $|\tilde c_0,\theta\rangle$ as in (2), but rather the following superposition of squeezed coherent states:

$$\hat P_\theta\int_0^1 dt\; e^{-i\phi t}\,e^{-\frac{i}{\hbar}\hat H t}\,|\tilde c_0\rangle. \tag{52}$$

The “continuous time” version of the quasimodes defined in (2) then reads:

$$|\Phi^{cont}_\phi\rangle \overset{\mathrm{def}}{=} \hat P_{-T,T,\phi}\,\hat P_\theta\int_0^1 dt\; e^{-i\phi t}\,e^{-\frac{i}{\hbar}\hat H t}\,|\tilde c_0\rangle \tag{53}$$

$$\phantom{|\Phi^{cont}_\phi\rangle} = \hat P_\theta\int_{-T}^{T} dt\; e^{-i\phi t}\,e^{-\frac{i}{\hbar}\hat H t}\,|\tilde c_0\rangle. \tag{54}$$

Here we introduced, for any $\phi\in\mathbb R$, $t_0 < t_1 \in \mathbb Z$, the operator

$$\hat P_{t_0,t_1,\phi} = \sum_{t=t_0}^{t_1-1} e^{-it\phi}\,\hat M^t, \tag{55}$$

and the equality (54) follows from a trivial computation. These quasimodes can also be decomposed into 4 parts $|\Phi^{cont}_{j,\phi}\rangle$, obtained by integrating in $t$ over time intervals of length $T/2$, then projecting the obtained state in $\mathcal H_{N,\theta}$. A remarkable and useful property (derived from Poisson’s formula) is that we can recover the “discrete time” quasimodes $|\Phi^{disc}_\phi\rangle$ defined in (2) from the “continuous time” ones:

$$|\Phi^{disc}_\phi\rangle = \sum_{k\in\mathbb Z}|\Phi^{cont}_{\phi+2\pi k}\rangle.$$


Notice that the state in (52) is not $2\pi$-periodic with respect to $\phi$, so that the quasimodes $|\Phi^{cont}_\phi\rangle$ depend on the “quasienergy” $\phi\in\mathbb R$. The main reason for considering continuous time quasimodes is that they are easily connected with generalized eigenstates of the Hamiltonian $\hat H$, which allows us to describe their Husimi densities pointwise, a task we turn to in Sect. 6. In the next subsection, we start our study of the above quasimodes. We will use from now on the notation $|\Phi_\phi\rangle$ in statements that are valid both for $|\Phi^{disc}_\phi\rangle$ and $|\Phi^{cont}_\phi\rangle$ (and similarly for $|\Phi_{j,\phi}\rangle$).

5.2. Orthogonality of the states $|\Phi_{j,\phi}\rangle_n$ at fixed $\phi$.

Proposition 2. (i) The states $|\Phi_{j,\phi}\rangle$, $j = 1,2,3,4$, and $|\Phi_\phi\rangle$ satisfy, as $\hbar\to0$,

$$\langle\Phi_{j,\phi}|\Phi_{j,\phi}\rangle = \frac{T}{2}\,S_1(\lambda,\phi) + O(1),\qquad \langle\Phi_\phi|\Phi_\phi\rangle = 2T\,S_1(\lambda,\phi) + O(1), \tag{56}$$

where the smooth function $S_1(\lambda,\phi)$ is strictly positive for all $\phi\in\mathbb R$ and $O(1)$ is uniformly bounded in $\phi$. In particular these states do not vanish for small enough $\hbar$ and the normalized quasimodes $|\Phi_\phi\rangle_n$ satisfy (5).

(ii) Furthermore, for all $\phi\in\mathbb R$, the $|\Phi_{j,\phi}\rangle_n$ become mutually orthogonal in the semiclassical limit: for all $j\neq k\in\{1,\dots,4\}$

$$\lim_{\hbar\to0}\,{}_n\langle\Phi_{j,\phi}|\Phi_{k,\phi}\rangle_n = 0. \tag{57}$$

The limit is uniform for all $\phi$ in a bounded interval.

(iii) Consequently, for all $\phi\in\mathbb R$, ${}_n\langle\Phi_\phi|\Phi_{j,\phi}\rangle_n \to 1/2$, ${}_n\langle\Phi_\phi|\Phi_{erg,\phi}\rangle_n \to 1/\sqrt2$ and ${}_n\langle\Phi_\phi|\Phi_{loc,\phi}\rangle_n \to 1/\sqrt2$.

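Proof. (i) We first give a detailed proof for the “continuous time” quasimodes. Writing $k = j - i \in \{0,1,2,3\}$, a simple computation yields (see (41))

$$\langle\Phi^{cont}_{i,\phi}|\Phi^{cont}_{j,\phi}\rangle = \sum_{t=0}^{T/2-1}\sum_{t'=0}^{T/2-1}\int_0^1 ds\int_0^1 ds'\; e^{-i(t-t'+s-s'+kT/2)\phi}\,\langle\tilde c_0|\hat P_\theta\,e^{-\frac{i}{\hbar}\hat H(t-t'+s-s'+kT/2)}|\tilde c_0\rangle.$$

Using (46) and (48) this becomes:

$$\langle\Phi^{cont}_{i,\phi}|\Phi^{cont}_{j,\phi}\rangle = \int_{-T/2}^{T/2} ds\,\Bigl(\frac T2 - |s|\Bigr)\,e^{-i(s+kT/2)\phi}\,\langle\tilde c_0|\,e^{-\frac{i}{\hbar}\hat H(s+kT/2)}\,|\tilde c_0\rangle + \text{error}, \tag{58}$$

where

$$\text{error} \le \int_0^1 ds\int_0^1 ds'\sum_{t=-T/2}^{T/2-1}\Bigl(\frac T2 - |s|\Bigr)\,J_0\Bigl(t + k\frac T2 + s - s',\, s'\Bigr).$$

Using the bound (51), one readily finds that the second term is $O(\hbar^{\frac{3-k}{4}})$.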


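To estimate the norm of $|\Phi^{cont}_j\rangle$, there remains to compute the integral in (58) in the case $i = j$, that is $k = 0$:

$$\int_{-T/2}^{T/2} ds\,(T/2 - |s|)\,e^{-is\phi}\,\langle\tilde c_0|\,e^{-\frac{i}{\hbar}\hat H s}\,|\tilde c_0\rangle = \int_{-T/2}^{T/2} ds\,(T/2-|s|)\,\frac{e^{-is\phi}}{\sqrt{\cosh(\lambda s)}} = \frac T2\,S_1^{cont}(\lambda,\phi,T/2) - S_2^{cont}(\lambda,\phi,T/2),$$

where the (real) functions $S_1^{cont}$, $S_2^{cont}$ are defined as follows:

$$S_1^{cont}(\lambda,\phi,\tau) \overset{\mathrm{def}}{=} \int_{-\tau}^{\tau} dt\,\frac{e^{-it\phi}}{\sqrt{\cosh(\lambda t)}},\qquad S_2^{cont}(\lambda,\phi,\tau) \overset{\mathrm{def}}{=} \int_{-\tau}^{\tau} dt\,\frac{|t|\,e^{-it\phi}}{\sqrt{\cosh(\lambda t)}}. \tag{59}$$

The limits of $S_i^{cont}(\lambda,\phi,\tau)$ as $\tau\to\infty$ clearly exist. We only give the value for $S_1^{cont}$, the most relevant one for our purposes [BaTIT]:

$$S_1^{cont}(\lambda,\phi) \overset{\mathrm{def}}{=} \lim_{\tau\to\infty} S_1^{cont}(\lambda,\phi,\tau) = \frac{1}{\lambda\sqrt{2\pi}}\,\Bigl|\Gamma\Bigl(\frac14 + i\frac{\phi}{2\lambda}\Bigr)\Bigr|^2. \tag{60}$$

For fixed $\lambda$, this function is maximal for $\phi = 0$ (with value $\approx 5.244/\lambda$), and decreases as $\sqrt{\frac{4\pi}{\lambda|\phi|}}\,e^{-\pi|\phi|/2\lambda}$ for $|\phi|\to\infty$. A crucial property is the strict positivity of this function, for all values $\lambda > 0$, $\phi\in\mathbb R$. The computation of $\langle\Phi_\phi|\Phi_\phi\rangle$ is similar.

(ii) We now estimate the overlaps $\langle\Phi_{i,\phi}|\Phi_{j,\phi}\rangle$ for $j\neq i$, by estimating the first integral of (58) in the cases $3\ge k\ge1$:

$$\left|\int_{-T/2}^{T/2} ds\,(T/2-|s|)\,\frac{e^{-i(s+kT/2)\phi}}{\sqrt{\cosh(\lambda(s+kT/2))}}\right| \le \frac{4\sqrt2}{\lambda^2}\,e^{-\frac{\lambda(k-1)T}{4}} = O(\hbar^{\frac{k-1}{4}}).$$

Taking into account the estimate of the error in (58), we see that for any $i\neq j$, the overlap $\langle\Phi^{cont}_{i,\phi}|\Phi^{cont}_{j,\phi}\rangle$ is bounded by a constant (even by $O(\hbar^{1/4})$ for $|i-j| = 2$). As a result,

$$\forall i\neq j,\qquad \bigl|{}_n\langle\Phi^{cont}_{i,\phi}|\Phi^{cont}_{j,\phi}\rangle_n\bigr| = \frac{\bigl|\langle\Phi^{cont}_{i,\phi}|\Phi^{cont}_{j,\phi}\rangle\bigr|}{\langle\Phi^{cont}_{i,\phi}|\Phi^{cont}_{i,\phi}\rangle} \le \frac{C}{T}. \tag{61}$$

This proves (ii). Part (iii) is now obvious. To treat the case of the discrete quasimodes, the integrals over time have to be replaced by sums over integers. For instance, the expressions defined in (59) are replaced by

$$S_1^{disc}(\lambda,\phi,\tau) \overset{\mathrm{def}}{=} \sum_{|t|\le\tau}\frac{e^{-it\phi}}{\sqrt{\cosh(\lambda t)}},$$

and similarly for $S_2$. The sum $S_1^{disc}(\lambda,\phi) = \lim_{\tau\to\infty} S_1^{disc}(\lambda,\phi,\tau)$ is also strictly positive for all $\lambda>0$, $\phi\in[-\pi,\pi]$. Indeed, Poisson’s formula induces the identity

$$S_1^{disc}(\lambda,\phi) = \sum_{k\in\mathbb Z} S_1^{cont}(\lambda,\phi+2k\pi),$$

in which every term on the right-hand side is strictly positive. The norms of the discrete quasimodes therefore satisfy an estimate similar to (57), upon replacing $S_1^{cont}$ by $S_1^{disc}$. The other estimates are identical to those for the continuous version.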
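The two facts used above, the value of $S_1^{cont}(\lambda,0)$ given by (60) and the Poisson identity relating $S_1^{disc}$ to $S_1^{cont}$, can be checked numerically. The following sketch uses a plain trapezoidal rule with illustrative cutoffs (`t_max`, `h`, and the range of $k$ are choices made here, not values from the paper), and compares with the closed form only at $\phi=0$, where the Gamma function has a real argument.

```python
import math

def integrand(lam, phi, t):
    # Real part of e^{-i t phi} / sqrt(cosh(lam t)); the imaginary part is odd in t.
    return math.cos(t * phi) / math.sqrt(math.cosh(lam * t))

def s1_cont(lam, phi, t_max=120.0, h=0.05):
    """Trapezoidal approximation of S1^cont(lam, phi), using the symmetry t -> -t."""
    n = int(round(t_max / h))
    inner = sum(integrand(lam, phi, j * h) for j in range(1, n))
    return h * (integrand(lam, phi, 0.0) + 2.0 * inner + integrand(lam, phi, n * h))

def s1_disc(lam, phi, t_max=200):
    """Truncated sum over integer times defining S1^disc(lam, phi)."""
    return sum(integrand(lam, phi, t) for t in range(-t_max, t_max + 1))

def s1_closed_form_at_zero(lam):
    """Right-hand side of (60) at phi = 0: Gamma(1/4)^2 / (lam * sqrt(2 pi))."""
    return math.gamma(0.25) ** 2 / (lam * math.sqrt(2.0 * math.pi))

lam = math.log((3 + math.sqrt(5)) / 2)   # assumption: Lyapunov exponent of a trace-3 map
phi = 0.7
poisson_sum = sum(s1_cont(lam, phi + 2 * math.pi * k) for k in range(-4, 5))
print(lam * s1_cont(lam, 0.0), lam * s1_closed_form_at_zero(lam))
print(s1_disc(lam, phi), poisson_sum)
```

The terms with $k \neq 0$ in the Poisson sum are exponentially small in $1/\lambda$, so for moderate $\lambda$ the discrete and continuous functions are numerically very close.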


5.3. Quasimodes of different quasienergies. We now compare quasimodes $|\Phi_\phi\rangle$ of different quasienergies and show:

Proposition 3. Let $\phi_0$ be an arbitrary angle in $[0,2\pi[$, and

$$\phi_k = \phi_0 + \frac{\pi}{T}\,k,\qquad k = 1,\dots,2T.$$

The $2T$ quasimodes $|\Phi_{\phi_k}\rangle_n$ become mutually orthogonal in the semiclassical limit: $\forall k'\neq k$, ${}_n\langle\Phi_{\phi_{k'}}|\Phi_{\phi_k}\rangle_n = O(1/T)$.

This is an immediate consequence of the following finer estimate:

Proposition 4. Let $I\subset\mathbb R$ be a fixed bounded interval. There exists a constant $C>0$ such that, given any semiclassically vanishing function $\theta(\hbar)$ and $n\in\mathbb Z_*$, if $\phi,\phi'\in I$, and if the phase shift $\Delta\phi = \phi'-\phi$ satisfies $\bigl|\Delta\phi - \frac{n\pi}{T}\bigr| \le \theta(\hbar)\frac{|n|}{T}$, then we have, for small enough $\hbar$, $\bigl|{}_n\langle\Phi_{\phi'}|\Phi_\phi\rangle_n\bigr| \le C\bigl(\theta(\hbar) + \frac1T\bigr)$.

Proof. As before, we write the proof for the continuous time quasimodes. The overlap $\langle\Phi^{cont}_{\phi'}|\Phi^{cont}_\phi\rangle$ is given by an expression similar to (58). Using the estimate (51) for $I(t,s)$, we obtain

$$\langle\Phi^{cont}_{\phi'}|\Phi^{cont}_\phi\rangle = \int_{-T}^{T} dt'\int_{-T}^{T} dt\,\frac{e^{i(t'\phi'-t\phi)}}{\sqrt{\cosh\lambda(t-t')}} + O(1) = \int_{-2T}^{2T} ds\,\frac{e^{is\bar\phi}}{\sqrt{\cosh(\lambda s)}}\,\frac{\sin\{\Delta\phi\,(T-|s|/2)\}}{\Delta\phi/2} + O(1),$$

where we introduced $\bar\phi \overset{\mathrm{def}}{=} \frac{\phi+\phi'}{2}$. This integral is bounded above by $\frac{2S_1(\lambda,0)}{|\Delta\phi|}$, so that for a phase difference bounded away from zero (i.e. $|\Delta\phi| \ge c > 0$), the scalar product of the normalized states is ${}_n\langle\Phi_{\phi'}|\Phi_\phi\rangle_n = O(T^{-1})$. We are however more interested in the case where $\Delta\phi$ is $\hbar$-dependent and semiclassically small: $\Delta\phi\to0$. Inserting $|\sin\{\Delta\phi(T-|s|/2)\} - \sin\{\Delta\phi\,T\}| \le |s|\,|\Delta\phi|/2$ in the integral and using (56), we get for $\hbar\to0$, $\Delta\phi\to0$:

$${}_n\langle\Phi^{cont}_{\phi'}|\Phi^{cont}_\phi\rangle_n = \frac{S^{cont}(\bar\phi)}{\sqrt{S^{cont}(\phi)\,S^{cont}(\phi')}}\,\frac{\sin(T\Delta\phi)}{T\Delta\phi} + O(1/T).$$

The first term can be as large as 1, for $\Delta\phi \ll T^{-1}$. It will also be large for values $\Delta\phi = \frac{\pi(n+1/2)}{T}$ with $n$ an integer, $|n| \ll T$, where it takes the value $\pm1/(T\Delta\phi)$. At the opposite extreme, the term vanishes for $\Delta\phi = \frac{n\pi}{T}$, $n$ a nonzero integer, and close to this value it behaves like $(-1)^n\frac{T}{\pi n}\bigl(\Delta\phi - \frac{n\pi}{T}\bigr)$.

We are now set to analyze, in the next subsections, the phase space distributions of the quasimodes $|\Phi_\phi\rangle_n$ and of their components.
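The behaviour of the leading term $\sin(T\Delta\phi)/(T\Delta\phi)$ described in the proof can be tabulated directly. The sketch below ignores the prefactor built from $S^{cont}$ (it is assumed close to 1 for small $\Delta\phi$) and uses an arbitrary illustrative value of $T$.

```python
import math

def leading_overlap(delta_phi, T):
    """sin(T*dphi)/(T*dphi), the leading term of the normalized overlap."""
    x = T * delta_phi
    return 1.0 if x == 0.0 else math.sin(x) / x

T = 12.0  # illustrative Ehrenfest time
zeros = [leading_overlap(n * math.pi / T, T) for n in range(1, 5)]
peaks = [leading_overlap((n + 0.5) * math.pi / T, T) for n in range(1, 5)]
for n, (z, p) in enumerate(zeros_and_peaks := list(zip(zeros, peaks)), start=1):
    print(f"n = {n}: value at n*pi/T = {z:+.1e}, value at (n+1/2)*pi/T = {p:+.4f}")
```

The zeros at $\Delta\phi = n\pi/T$ are what makes the $2T$ quasienergies $\phi_k$ of Proposition 3 a natural choice.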


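5.4. Localization of $|\Phi_{loc,\phi}\rangle_n$ near the origin. Recall that $|\Phi_{loc,\phi}\rangle = |\Phi_{2,\phi}\rangle + |\Phi_{3,\phi}\rangle$. We will show the following:

Proposition 5. Let $\phi\in\mathbb R$. Then, for any $f\in C^\infty(\mathbb T^2)$,

$$\lim_{\hbar\to0}\,{}_n\langle\Phi_{loc,\phi}|\hat f|\Phi_{loc,\phi}\rangle_n = f(0)\quad\text{and}\quad \lim_{\hbar\to0}\int_{\mathbb T} f(x)\,H_{\tilde c_0,loc,\phi,\theta}(x)\,dx = f(0), \tag{62}$$

where $H_{\tilde c_0,loc,\phi,\theta}(x) = N\,|\langle x,\tilde c_0,\theta|\Phi_{loc,\phi}\rangle_n|^2$ is the Husimi function of $|\Phi_{loc,\phi}\rangle_n$. It follows that the semiclassical measures $H_{\tilde c_0,loc,\phi,\theta}(x)\,dx$ and the Wigner distribution converge to the delta measure at the origin. All limits are uniform for $\phi$ in a bounded interval.

Using a more physical terminology, one can say that the quasimodes $|\Phi_{loc,\phi}\rangle_n$ strongly scar (or localize) on the fixed point $0\in\mathbb T^2$ of the map $M$.

Proof. As before, we write the proof for $|\Phi^{cont}_{loc,\phi}\rangle$, given by

$$|\Phi^{cont}_{loc,\phi}\rangle = \hat P_\theta\int_{-T/2}^{T/2} dt\; e^{-i\phi t}\,|t;\tilde c_0\rangle.$$

This is a sum of evolved coherent states for times $|t|\le T/2$. At this maximal time, the length $\Delta q'$ of the Husimi function of $e^{-\frac{i}{\hbar}\hat H t}|\tilde c_0\rangle$ reaches the size of the torus. To control the contribution of the nonlocalized states at $t\approx T/2$, we first select a function $\Theta(\hbar)$ such that in the small-$\hbar$ limit $1 \ll \Theta(\hbar) \ll T$. We then split $|\Phi^{cont}_{loc,\phi}\rangle$ in two pieces:

$$|\Phi^{cont}_{loc,\phi}\rangle = \hat P_\theta\int_{|t|\le\tau_*} dt\; e^{-i\phi t}\,|t;\tilde c_0\rangle + \hat P_\theta\int_{\tau_*\le|t|\le T/2} dt\; e^{-i\phi t}\,|t;\tilde c_0\rangle = |\Phi'\rangle + |\Phi''\rangle,$$

where $\tau_* \overset{\mathrm{def}}{=} [T/2 - \Theta(\hbar)/\lambda]$. From the proof of Proposition 2 it is clear that

$$\langle\Phi'|\Phi'\rangle \sim 2\tau_*\,S_1^{cont}(\lambda,\phi) \sim T\,S_1^{cont}(\lambda,\phi) \sim \langle\Phi_{loc,\phi}|\Phi_{loc,\phi}\rangle\quad\text{when } \hbar\to0. \tag{63}$$

The norm of the remainder $|\Phi''\rangle$ is estimated similarly:

$$\langle\Phi''|\Phi''\rangle \lesssim \frac{\Theta(\hbar)}{\lambda}\,S_1^{cont}(\lambda,\phi) \le C\,\Theta(\hbar) = o(T). \tag{64}$$

In the interval $|t|\le\tau_*$, the ellipses supporting the states $|t;\tilde c_0\rangle$ have lengths $\sqrt\hbar\,e^{\lambda t} \le e^{-\Theta(\hbar)} \to 0$. Considering the disk $D_\hbar$ centered at the origin and of radius $e^{-\Theta(\hbar)/2}$, the Husimi functions of these states are therefore semiclassically concentrated inside $D_\hbar$. We will show below that $|\Phi^{cont}_{loc,\phi}\rangle_n$ is also concentrated inside this disk. Using (63) and (64), together with the obvious $|a+b|^2 \le 2(|a|^2+|b|^2)$, one finds

$$\int_{\mathbb T\setminus D_\hbar} N\,|\langle x,\tilde c_0,\theta|\Phi^{cont}_{loc,\phi}\rangle_n|^2\,dx \le \frac{C}{T}\int_{\mathbb T\setminus D_\hbar} N\bigl(|\langle x,\tilde c_0,\theta|\Phi'\rangle|^2 + |\langle x,\tilde c_0,\theta|\Phi''\rangle|^2\bigr)\,dx$$

$$\le \frac{C}{T}\left(\int_{\mathbb T\setminus D_\hbar} N\,|\langle x,\tilde c_0,\theta|\Phi'\rangle|^2\,dx + \langle\Phi''|\Phi''\rangle\right) \tag{65}$$

$$\le C\,\frac{\Theta(\hbar)}{T}. \tag{66}$$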


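The last inequality comes from the observation that the Bargmann function $\langle x,\tilde c_0,\theta|\Phi'\rangle$ is a sum of Gaussians of widths smaller than $e^{-\Theta(\hbar)}$, so simple analysis shows that the integral in (65) is $O\bigl(N\exp(-c\,e^{\Theta(\hbar)})\bigr)$. Consequently, (66) holds and yields the proposition provided we choose $\log\log N \ll \Theta(\hbar) \ll \log N$. For discrete quasimodes, we only need to replace $S_1^{cont}$ by $S_1^{disc}$ in the above estimate.

For later purpose, we notice that the previous proof can be applied to the states

$$|\Phi^{cont}_{t_1,t_2}\rangle = \hat P_\theta\int_{t_1}^{t_2} dt\; e^{i\phi t}\,|t;\tilde c_0\rangle,\qquad\text{with } -\frac T2 \le t_1 \le 0 \le t_2 \le \frac T2. \tag{67}$$

These states indeed localize at the origin in the sense of Eqs. (62) and (66). The same is obviously true for the discrete analogues of these states: $|\Phi^{disc}_{t_1,t_2}\rangle = \hat P_{t_1,t_2}|\tilde c_0,\theta\rangle$. Note that $|\Phi_{2,\phi}\rangle$ and $|\Phi_{3,\phi}\rangle$ are of this type.

5.5. Equidistribution of $|\Phi_{erg,\phi}\rangle_n$. Recalling that $|\Phi_{erg,\phi}\rangle = |\Phi_{1,\phi}\rangle + |\Phi_{4,\phi}\rangle$, we have

Proposition 6. Let $\phi\in\mathbb R$. Then, for any $f\in C^\infty(\mathbb T^2)$

$$\lim_{\hbar\to0}\,{}_n\langle\Phi_{erg,\phi}|\hat f|\Phi_{erg,\phi}\rangle_n = \int_{\mathbb T} f(x)\,dx = \lim_{\hbar\to0}\int_{\mathbb T} f(x)\,H_{\tilde c_0,erg,\phi,\theta}(x)\,dx, \tag{68}$$

where $H_{\tilde c_0,erg,\phi,\theta}(x) = N\,|\langle x,\tilde c_0,\theta|\Phi_{erg,\phi}\rangle_n|^2$ is the Husimi function of $|\Phi_{erg,\phi}\rangle_n$. It follows that the Husimi measure $H_{erg,\phi}(x)\,dx$ and the Wigner distribution converge to the Liouville measure on the torus. The limits are uniform for $\phi$ in a bounded interval.

The states $|\Phi_{erg,\phi}\rangle$ are said to semiclassically equidistribute on the torus.

Proof. We will use the algebraic structure of the quantized automorphisms in the proof. We will drop the index $\phi$ from the notations. It is clearly enough to show that, for each $k\in\mathbb Z^2_*$, we have

$$\lim_{\hbar\to0}\langle\Phi_{erg}|\hat T_{k/N}|\Phi_{erg}\rangle = 0.$$

For that purpose, we write

$$\langle\Phi_{erg}|\hat T_{k/N}|\Phi_{erg}\rangle = \langle\Phi_1|\hat T_{k/N}|\Phi_1\rangle + \langle\Phi_4|\hat T_{k/N}|\Phi_4\rangle + \langle\Phi_1|\hat T_{k/N}|\Phi_4\rangle + \langle\Phi_4|\hat T_{k/N}|\Phi_1\rangle. \tag{69}$$

We first estimate the two diagonal terms of the RHS. Using $|\Phi_1\rangle = e^{i\phi T}\hat M^{-T}|\Phi_3\rangle$, $|\Phi_4\rangle = e^{-i\phi T}\hat M^{T}|\Phi_2\rangle$ and the intertwining property (20), we get

$$\langle\Phi_1|\hat T_{k/N}|\Phi_1\rangle + \langle\Phi_4|\hat T_{k/N}|\Phi_4\rangle = \langle\Phi_3|\hat T_{k_+}|\Phi_3\rangle + \langle\Phi_2|\hat T_{k_-}|\Phi_2\rangle.$$

Here $k_\pm \overset{\mathrm{def}}{=} M^{\pm T}k/N \in (\mathbb Z/N)^2$ are of order 1 (see below), so that we transformed the “microscopic” translation by $k/N$ (of order $\hbar$) into “macroscopic” ones. Each term is therefore the overlap between the state $|\Phi_2\rangle$ or $|\Phi_3\rangle$ localized in a small disc $D_\hbar$ centered at the origin of the torus (cf. Eqs. (65,67)), and a translated state localized in the disc $D_{\hbar,\pm} \overset{\mathrm{def}}{=} D_\hbar + k_\pm$ centered at the point $k_\pm \bmod \mathbb Z^2$. This overlap will consequently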


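be small provided $k_\pm$ is sufficiently far away from the integer lattice. We prove this fact using (22):

$$k_+ = q'(k)\,\frac{e^{T\lambda}}{N}\,v_+ + p'(k)\,\frac{e^{-T\lambda}}{N}\,v_- = C(N)\,q'(k)\,v_+ + O(\hbar^2)\,v_-,$$

$$k_- = q'(k)\,\frac{e^{-T\lambda}}{N}\,v_+ + p'(k)\,\frac{e^{T\lambda}}{N}\,v_- = C(N)\,p'(k)\,v_- + O(\hbar^2)\,v_+,$$

where $2\pi e^{-\lambda} \le C(N) \le 2\pi e^{\lambda}$ since $T/2$ is the closest integer to $|\ln\hbar|/2\lambda$. Now, $q'(k)\neq0\neq p'(k)$ since the slopes of $v_\pm$ are irrational. Consequently, $k_\pm$ are at a finite distance from $\mathbb Z^2$ for small enough $\hbar$, and the disks $D_\hbar$ and $D_{\hbar,\pm}$ do not intersect each other. We can thus estimate the overlap:

$${}_n\langle\Phi_3|\hat T_{k_+}|\Phi_3\rangle_n = \int_{\mathbb T}{}_n\langle\Phi_3|x,\tilde c_0,\theta\rangle\langle x,\tilde c_0,\theta|\hat T_{k_+}|\Phi_3\rangle_n\,N\,dx = \left(\int_{\mathbb T\setminus D_\hbar} + \int_{D_\hbar}\right){}_n\langle\Phi_3|x,\tilde c_0,\theta\rangle\langle x,\tilde c_0,\theta|\hat T_{k_+}|\Phi_3\rangle_n\,N\,dx.$$

Using the Cauchy-Schwarz inequality, the first integral is bounded as

$$\left|\int_{\mathbb T\setminus D_\hbar}{}_n\langle\Phi_3|x,\tilde c_0,\theta\rangle\langle x,\tilde c_0,\theta|\hat T_{k_+}|\Phi_3\rangle_n\,N\,dx\right| \le \sqrt{\int_{\mathbb T\setminus D_\hbar} H_{\tilde c_0,\Phi_3,\theta}(x)\,dx}\;\sqrt{\int_{\mathbb T\setminus D_\hbar} H_{\tilde c_0,\Phi_3,\theta}(x-k_+)\,dx} \le C\sqrt{\frac{\Theta(\hbar)}{T}},$$

where we used (65) applied to $|\Phi_3\rangle$. The integral over $D_\hbar$ is treated similarly, exchanging the roles of both factors: now the second factor semiclassically converges to zero due to the inclusion $D_\hbar \subset (\mathbb T\setminus D_{\hbar,+})$. In the end, we get for $\log\log N \ll \Theta(\hbar) \ll \log N$,

$${}_n\langle\Phi_1|\hat T_{k/N}|\Phi_1\rangle_n = {}_n\langle\Phi_3|\hat M^{T}\hat T_{k/N}\hat M^{-T}|\Phi_3\rangle_n = O\!\left(\sqrt{\frac{\Theta(\hbar)}{T}}\right), \tag{70}$$

uniformly for $\phi$ in a finite interval. The proof goes through unaltered for the second overlap ${}_n\langle\Phi_4|\hat T_{k/N}|\Phi_4\rangle_n$ and in fact for any $\hat M^{T}|\Phi_{t_1,t_2}\rangle_n$ as in (67), leading to:

Lemma 3. Consider a semiclassically diverging function $\log|\log\hbar| \ll \Theta(\hbar) \ll |\log\hbar|$ and $k\in\mathbb Z^2_*$. Given a bounded interval, there exists a constant $C$ so that for all $\phi$ in the interval

$$\bigl|{}_n\langle\Phi_{t_1,t_2}|\hat M^{-T}\hat T_{k/N}\hat M^{T}|\Phi_{t_1,t_2}\rangle_n\bigr| \le C\sqrt{\frac{\Theta(\hbar)}{T}}.$$

As a result, the states $\hat M^{T}|\Phi_{t_1,t_2}\rangle_n$ equidistribute as $\hbar\to0$, which implies that the integral of their Husimi function over a fixed domain of area $A$ converges to $A$.

We now use this information to finish the proof of Proposition 6. We enlarge Fig. 1 and define the additional state $|\Phi_5\rangle = e^{-i\phi T}\hat M^{T}|\Phi_3\rangle$, which, according to Lemma 3, equidistributes. Now, using the same intertwining property as above, we rewrite the nondiagonal terms in the RHS of (69) as

$$\langle\Phi_2|\hat M^{T/2}\hat T_{k/N}\hat M^{-T/2}|\Phi_5\rangle + \langle\Phi_5|\hat M^{T/2}\hat T_{k/N}\hat M^{-T/2}|\Phi_2\rangle = \langle\Phi_2|\hat T_{k'}|\Phi_5\rangle + \langle\Phi_5|\hat T_{k'}|\Phi_2\rangle,$$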

with the vector $k' \overset{\mathrm{def}}{=} M^{T/2}k/N$. Each term is the overlap between a state localized near the origin (e.g. $\langle\Phi_2|$) and an equidistributed one (e.g. $\hat T_{k'}|\Phi_5\rangle$). It is natural to expect that they are asymptotically orthogonal. To prove this fact, we proceed as above:

$$\bigl|{}_n\langle\Phi_2|\hat T_{k'}|\Phi_5\rangle_n\bigr| \le \sqrt{\int_{\mathbb T\setminus D(r)} dx\,H_{\tilde c_0,\Phi_2,\theta}(x)}\;\sqrt{\int_{\mathbb T\setminus D(r)} dx\,H_{\tilde c_0,\Phi_5,\theta}(x-k')} + \sqrt{\int_{D(r)} dx\,H_{\tilde c_0,\Phi_2,\theta}(x)}\;\sqrt{\int_{D(r)} dx\,H_{\tilde c_0,\Phi_5,\theta}(x-k')}, \tag{71}$$

where $D(r)$ is the disc of radius $r$ centered at 0. Using the semiclassical localization of $|\Phi_2\rangle_n$ at the origin and the equidistribution of $|\Phi_5\rangle_n$, we find

$$\limsup_{\hbar\to0}\bigl|{}_n\langle\Phi_2|\hat T_{k'}|\Phi_5\rangle_n\bigr| \le \sqrt\pi\,r.$$

Since this is true for any $r>0$, $\lim_{\hbar\to0}|{}_n\langle\Phi_2|\hat T_{k'}|\Phi_5\rangle_n| = 0$. We now control all the terms of (69) and after taking care of the normalizations we obtain Proposition 6.

5.6. Semiclassical properties of $|\Phi_\phi\rangle = |\Phi_{loc,\phi}\rangle + |\Phi_{erg,\phi}\rangle$. We now finally consider the “full” quasimode $|\Phi_\phi\rangle$. It is the sum of two states, one localized, the second equidistributed.

Proposition 7. For any $\phi\in\mathbb R$, (7) holds with τ = 1, $x_0 = 0$. The limit is uniform for $\phi$ belonging to a bounded interval.

Proof. It is again enough to study ${}_n\langle\Phi|\hat T_{k/N}|\Phi\rangle_n$ and to show

$$\lim_{\hbar\to0}\,{}_n\langle\Phi|\hat T_{k/N}|\Phi\rangle_n = \frac12\,(1+\delta_{k,0}).$$

The results of the previous subsections imply immediately that this reduces to showing

$$\lim_{\hbar\to0}\,{}_n\langle\Phi_{loc}|\hat T_{k/N}|\Phi_{erg}\rangle_n = 0.$$

This in turn is proven as in the previous subsection through the use of the Cauchy-Schwarz inequality and cutting the integral over $\mathbb T$ into the integral over a small disc around the origin and an integral over the complement (see (71)).

To conclude this section, let us remark that the semiclassical properties of the various quasimodes we introduced are not altered if we replace $T$ in the sum or integration boundaries by an integer that differs from it by a finite amount, bounded as $\hbar$ goes to zero. This will occasionally be useful in the sequel.


6. Pointwise Description of the Quasimodes

In the last sections, we showed that the Husimi and Wigner functions of the quasimodes $|\Phi_\phi\rangle_n$ converge to the measure $\frac{1+\delta_0}{2}$ in the semiclassical limit. The crucial tools of the proof were, on the one hand, precise estimates of the overlaps $\langle\tilde c_0,\theta|\hat M^t|\tilde c_0,\theta\rangle$ (obtained using the diophantine properties of the invariant axes), on the other hand the algebraic intertwining between $\hat M$ and the quantum translations. Still, it would be interesting to know the speed at which this convergence takes place, or to compute more refined “indicators” of the localization of the quasimodes. In this section, we will use a more “direct” yet slightly more cumbersome route which will yield more precise information on the phase space distribution of the “continuous time” quasimodes. The main step of this route is the pointwise description of the Bargmann and Husimi functions of $|\Phi^{cont}_\phi\rangle$. This description will then provide an estimate of the speed of convergence to the limit semiclassical measure; at the same time, it will allow us to compute alternative localization indicators, like the $L^s$-norms of the Husimi functions. The pointwise estimates will also uncover the “hyperbolic” structure of the Husimi functions near the origin, a structure already emphasized by several authors for finite-time quasimodes [KH, WBVB] and for spectral Wigner and Husimi functions [ROdA].

6.1. Plane quasimodes. Our final objective is to estimate the Bargmann function $\langle x,\tilde c_0,\theta|\Phi^{cont}_\phi\rangle$ for $x\in F$ the fundamental domain. For this purpose, we start from quasimodes of the Hamiltonian $\hat H$:

$$|\Psi_{\phi,t}\rangle \overset{\mathrm{def}}{=} \int_{-t}^{t} ds\; e^{-i\phi s}\,e^{-i\hat H s/\hbar}\,|\tilde c_0\rangle. \tag{72}$$

The torus quasimode $|\Phi^{cont}_\phi\rangle$ is obtained by projecting $|\Psi_{\phi,T}\rangle$ onto $\mathcal H_{N,\theta}$ (cf. Eq. (54)). In this subsection, we will study the Bargmann function of the plane quasimode $|\Psi_{\phi,T}\rangle$. Using the rescaled variable $Z \overset{\mathrm{def}}{=} \frac{q'-ip'}{2\sqrt\hbar}$, this function is given by the following integral:

$$\Psi_{\phi,T}(x) \overset{\mathrm{def}}{=} \langle x,\tilde c_0|\Psi_{\phi,T}\rangle = e^{-|Z|^2}\int_{-T}^{T} ds\,\frac{e^{-i\phi s}}{\sqrt{\cosh\lambda s}}\,e^{Z^2\tanh\lambda s}. \tag{73}$$

Through the change of variables $U = Z^2(1-\tanh\lambda s)$, and using the parameter $\mu \overset{\mathrm{def}}{=} 1/4 + i\phi/2\lambda$, this integral may be rewritten as

$$\Psi_{\phi,T}(x) = \frac{e^{Z^2-|Z|^2}}{\lambda\,2^{\mu+1/2}\,Z^{2\mu}}\int_{U_0}^{U_1}\frac{dU}{U}\,U^{\mu}\,e^{-U}\left(1-\frac{U}{2Z^2}\right)^{-\mu-1/2}, \tag{74}$$

with the boundaries $U_0 = Z^2(1-\tanh\lambda T) \approx 2Z^2\hbar^2$, $U_1 = Z^2(1+\tanh\lambda T) \approx 2Z^2(1-\hbar^2)$. This function satisfies the following symmetries (with obvious notations):

$$\Psi_{\phi,T}(Z) = \Psi_{\phi,T}(-Z) = \Psi_{-\phi,T}(iZ). \tag{75}$$
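The symmetries (75) follow from (73) by the substitutions $Z\to-Z$ and $(Z,s)\to(iZ,-s)$, and can be checked numerically. The sketch below evaluates the integral (73) directly as a function of a complex number $Z$ with a trapezoidal rule; the parameter values and the number of quadrature steps are arbitrary illustrative choices.

```python
import cmath
import math

def plane_bargmann(Z, phi, lam, T, steps=4000):
    """Trapezoidal approximation of the integral (73) for a complex argument Z."""
    h = 2.0 * T / steps
    total = 0j
    for j in range(steps + 1):
        s = -T + j * h
        weight = 0.5 if j in (0, steps) else 1.0   # half weight at both endpoints
        phase = cmath.exp(-1j * phi * s + Z * Z * math.tanh(lam * s))
        total += weight * phase / math.sqrt(math.cosh(lam * s))
    return math.exp(-abs(Z) ** 2) * h * total

Z, phi, lam, T = 0.8 + 0.3j, 0.5, 0.96, 6.0
a = plane_bargmann(Z, phi, lam, T)
b = plane_bargmann(-Z, phi, lam, T)
c = plane_bargmann(1j * Z, -phi, lam, T)
print(a, b, c)
```

Because the quadrature grid is symmetric under $s\to-s$, the three values agree up to rounding errors, independently of the accuracy of the quadrature itself.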

The hyperbolic Hamiltonian $\hat H$ admits no bound state in $L^2(\mathbb R)$, but for any real energy $E = -\hbar\phi$, it has two independent generalized eigenstates, distinguished by their parity. In the limit $t\to\infty$, the quasimode $|\Psi_{\phi,t}\rangle$ converges (in a sense explained below) to the even eigenstate, that we denote by $|\Psi^{(even)}_\phi\rangle$. From the identities $H(x) = \lambda q'p'$, $\hat H = \lambda\,\hat Q\,\frac{\hat q\hat p+\hat p\hat q}{2}\,\hat Q^{-1}$, the Bargmann function of $|\Psi^{(even)}_\phi\rangle$ can be expressed in terms of parabolic cylinder functions [NV1, BaHTF]:

$$\langle x,\tilde c_0|\Psi^{(even)}_\phi\rangle = C_\phi\,e^{-|Z|^2}\bigl(D_{-1+2\mu}(2Z) + D_{-1+2\mu}(-2Z)\bigr). \tag{76}$$

The normalization coefficient $C_\phi = \pi\,\bigl(2^{\mu}\cosh(\pi\phi/\lambda)\,\Gamma(\mu+1/2)\bigr)^{-1}$ can be computed from the value at $Z=0$. For fixed $\phi$ and small $\hbar$, this Bargmann function takes its largest values close to the origin (where it takes the value $S_1^{cont}(\lambda,\phi)$), and is otherwise concentrated along the hyperbola $\{q'p' = -\hbar\phi/\lambda\}$, which is the classical energy surface $\{H(x) = -\hbar\phi\}$ (see below and Sect. 6.4 for more details). The Husimi functions of two of these generalized eigenstates are displayed in Fig. 8 in terms of the coordinates $(Q',P') = \frac{(q',p')}{\sqrt\hbar}$.

From the integral expression (73), we see that the Bargmann functions of $|\Psi^{(even)}_\phi\rangle$ and $|\Psi_{\phi,T}\rangle$ are semiclassically close to each other:

$$\Psi_{\phi,T}(x) - \Psi^{(even)}_\phi(x) = O(\hbar^{1/2})\quad\text{uniformly with respect to } x \text{ and } \phi. \tag{77}$$

This equation together with (76) yields a uniform approximation for $\Psi_{\phi,T}(x)$. One cannot simplify this expression in the central region $\{x = O(\sqrt\hbar)\}$. On the other extreme, one can obtain asymptotic expansions for (74) in the region $\{|x| \gg \sqrt\hbar\}$ ($\{|Z| \gg 1\}$). We will give formulas uniformly valid in the “positive sector” $S_+ \overset{\mathrm{def}}{=} \{Z \mid |\arg(Z)| \le \frac\pi4(1-\epsilon)\}$, where $\epsilon>0$ is fixed. The symmetries (75) then allow us to fill the remaining three sectors (around the angles $\pi/4 + n\pi/2$, the function is exponentially small).

Fig. 8. Husimi functions of two generalized eigenstates $|\Psi^{(even)}_\phi\rangle$, in the coordinates $(Q',P')$. The densities are plotted in linear scale, the contour step depending on the plot. The classical energy hyperbolas are drawn in thick curves


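Expanding the last factor in the integral (74) into powers of $1/Z^2$, we get a sum of incomplete Gamma functions [BaHTF, Chap. 9]:

$$\int_{U_0}^{U_1}\frac{dU}{U}\,U^{\mu}\,e^{-U}\left(1-\frac{U}{2Z^2}\right)^{-\mu-1/2} = \gamma(\mu,U_0) - \gamma(\mu,U_1) + \frac{\mu+1/2}{2Z^2}\bigl(\gamma(\mu+1,U_0) - \gamma(\mu+1,U_1)\bigr) + \dots.$$

These gamma functions have simple asymptotics in two regimes:

• for $U_0 \ll 1 \ll U_1$, that is, $x\in S_+$, $\sqrt\hbar \ll |x| \ll \frac{1}{\sqrt\hbar}$, they yield

$$\Psi_{\phi,T}(x) = \frac{\Gamma(\mu)\,e^{Z^2-|Z|^2}}{\lambda\,2^{\mu+1/2}\,Z^{2\mu}}\left(1 + O\Bigl(\frac{1}{|Z|^2}\Bigr) + O\bigl(\sqrt{\hbar|Z|}\bigr)\right) = \frac{\Gamma(\mu)\,\hbar^{\mu}\,e^{-\frac{p'^2+iq'p'}{2\hbar}}}{\lambda\,2^{1/2-\mu}\,(q'-ip')^{2\mu}}\left(1 + O\Bigl(\frac{\hbar}{|x|^2}\Bigr) + O\bigl(\sqrt{\hbar^{1/2}|x|}\bigr)\right). \tag{78}$$

This asymptotics also holds for the Bargmann function of $|\Psi^{(even)}_\phi\rangle$ in the sector $|Z| \gg 1$, $Z\in S_+$: it indeed corresponds to known asymptotics of the parabolic cylinder functions $D_{-1+2\mu}$ [BaHTF, Chap. 8]. This gives for the Husimi function:

$$\frac{|\Psi_{\phi,T}(x)|^2}{2\pi\hbar} \sim \frac{S_1^{cont}(\lambda,\phi)}{2\lambda\sqrt{\pi\hbar}}\,\frac{1}{|q'-ip'|}\,e^{-\frac{p'^2}{\hbar} - 2\frac{\phi}{\lambda}\frac{p'}{q'}}. \tag{79}$$

For fixed $q' \gg \sqrt\hbar$, the $p'$-Gaussian of width $\sqrt\hbar$ is centered on the point $p' = -\hbar\phi/\lambda q'$, that is on the classical hyperbola. The function decreases as $\frac{1}{q'}$ along the “crest”.

• in the region $|x| \gg \frac{1}{\sqrt\hbar}$, $x\in S_+$, the Bargmann function is “dominated” by the coherent state at time $T$:

$$\Psi_{\phi,T}(x) = \frac{\sqrt2\;e^{-\frac{\hbar q'^2}{2}}\;e^{-\frac{p'^2(1-\hbar^2)}{2\hbar} - i\frac{q'p'(1-2\hbar^2)}{2\hbar}}}{\lambda\,\hbar^{1/2-i\phi/\lambda}\,(q'-ip')^2}\left(1 + O\Bigl(\frac{1}{\hbar|x|^2}\Bigr) + O(\hbar^2)\right). \tag{80}$$

The crossover between the $\frac{1}{\sqrt{q'}}$ decay and the $e^{-\frac{\hbar q'^2}{2}}$ decay is governed by the function $\gamma(\mu,U_0)$, with $U_0 \sim \hbar q'^2/2$ varying from small to large values.

6.2. Pointwise description of the torus quasimodes. Using the results in the last section, we will now derive semiclassical estimates for the Bargmann function of the torus quasimode $|\Phi^{cont}_\phi\rangle$:

$$\Phi_\phi(x) \overset{\mathrm{def}}{=} \langle x,\tilde c_0,\theta|\Phi^{cont}_\phi\rangle = \langle x,\tilde c_0|\hat P_\theta|\Psi_{\phi,T}\rangle = \sum_{n\in\mathbb Z^2} e^{i\vartheta(x,n)}\,\Psi_{\phi,T}(x+n),$$

with the phases

$$\vartheta(x,n) = n\cdot\theta + i\delta_n - i\pi N\,x\wedge n. \tag{81}$$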


From now on we restrict $x$ to the fundamental domain $F$. We will split the above sum between a few “dominant terms” and a “remainder”, which we then bound from above by using similar methods as in Appendix 10.1. We will only provide a sketch of the proof. From the last subsection, we know that the function $\Psi_{\phi,T}(x)$ is concentrated along the hyperbola $\{p' = -\hbar\phi/\lambda q'\}$, which is itself $\sqrt\hbar$-close to the stable and unstable axes. We therefore define two strips $B_u$, $B_s$ around these axes:

$$B_u \overset{\mathrm{def}}{=} \left\{x\in\mathbb R^2,\; |p'(x)| \le 2\sqrt{\hbar T}\ \text{and}\ |q'(x)| \le \frac{C_o}{9\sqrt{\hbar T}}\right\},\qquad B_s = \{q' \leftrightarrow p'\}.$$

We call $B = B_u\cup B_s$ the union of these strips, $Sq = B_u\cap B_s$ the “central square” and $B_u^{\mathbb T}$, $B_s^{\mathbb T}$ and $B^{\mathbb T}$ their periodizations on $\mathbb T$ or $F$. The coefficient $C_o/9$ in the above definition is chosen such that $B_u$ (resp. $B_s$) does not intersect any of its integer translates (see Eq. (25)). As a consequence, for any $x\in F$ the intersection between the lattice $x+\mathbb Z^2$ and $B_u$ (resp. $B_s$) is either empty, or it contains a single point denoted $x+n_{u,x}$ (resp. $x+n_{s,x}$), with $n_{u/s,x}\in\mathbb Z^2$. These (possible) points define our “dominant terms” in (81). The remainder thus consists in the sum over $n\in\mathbb Z^2$ such that $(x+n)\notin B$. In order to state the pointwise estimate, we define the “modified characteristic functions” $\chi_u(x)$, $\chi_s(x)$ on $F$ as

$$\chi_u(x) = \begin{cases} e^{i\vartheta(x,n_{u,x})} & \text{if } x\in B_u^{\mathbb T},\\ 0 & \text{otherwise,}\end{cases}\qquad \chi_s(x) = \begin{cases} e^{i\vartheta(x,n_{s,x})} & \text{if } x\in B_s^{\mathbb T}\setminus Sq,\\ 0 & \text{otherwise}\end{cases}$$

(this definition is consistent: $n_{u,x}$ is well-defined iff $x\in B_u^{\mathbb T}$). The slight asymmetry between $\chi_u$ and $\chi_s$ will prevent double counting for $x$ in the central square.

Proposition 8. The Bargmann functions of the quasimodes $|\Phi^{cont}_\phi\rangle$ have the following expression, uniformly for $x\in F$ and $\phi$ in a bounded interval:

$$\langle x,\tilde c_0,\theta|\Phi^{cont}_\phi\rangle = \chi_u(x)\,\langle x+n_{u,x},\tilde c_0|\Psi_{\phi,T}\rangle + \chi_s(x)\,\langle x+n_{s,x},\tilde c_0|\Psi_{\phi,T}\rangle + O(\hbar^{1/2}T^{1/4}). \tag{82}$$

On the RHS, $|\Psi_{\phi,T}\rangle$ may be replaced by $|\Psi^{(even)}_\phi\rangle$.

Notice that $\Psi_{\phi,T}(x)$ at the “edge” of $B_u$ or $B_s$ is of order $O(\hbar^{1/2}T^{1/4})$, so that the above estimate of the remainder is sharp. This equation gives precise information for $x\in B^{\mathbb T}$, but also a nontrivial upper bound in $\mathbb T\setminus B^{\mathbb T}$. It implies that the Bargmann (and Husimi) function of $|\Phi^{cont}_\phi\rangle$ is concentrated along (a portion of) the periodized classical hyperbola, itself asymptotically close to the invariant axes (see Fig. 9 and compare with Fig. 8). These features were not visible in the framework of Sect. 5.

Sketch of proof. We have to find an upper bound for the sum $\sum_{n\in\mathbb Z^2,\,x+n\notin B}|\Psi_{\phi,T}(x+n)|$. We first consider the points $x+n$ in the sector $S_+$; since they satisfy $|x+n| \gg \sqrt\hbar$, the Bargmann function is described by formulas (78)–(80). As in Appendix 10.1, we split the region $S_+\setminus B$ into a union of strips parallel to the unstable axis, of width $\delta p' = \sqrt\hbar$. The results of Sect. 3.1 imply that two points $(x+n)$, $(x+m)$ in such a strip are separated by at least $|q'(n-m)| \ge C_o\,\hbar^{-1/2}$. Summing the estimates (78,80) in these strips, we obtain the ($x$-independent) upper bound $O(\sqrt\hbar\,T^{1/4})$ for points in $S_+$. From (75), the sum over the three other sectors leads to the same bound.

[Figure 9: three panels (a), (b) and an unlabelled third panel, each plotted in the coordinates (q, p).]

Fig. 9. Husimi functions of quasimodes $|\Phi^{cont}_{\phi=0}\rangle$ (left: 3D linear scale; center: logarithmic scale) and $|\Phi^{cont}_{\phi=2\lambda}\rangle$ (right: logarithmic scale) of the map $\hat M_{Arnold}$ ($N = 500$)

6.3. Controlling the speed of convergence. Using the pointwise formula (82), we can now directly compute the Fourier coefficients of the Husimi function of $|\Phi^{\rm cont}_\phi\rangle$:

\[ \tilde H_{\tilde c_0,\Phi^{\rm cont}_\phi}(k) \stackrel{\rm def}{=} \int_{\mathcal F} dx\; e^{2i\pi x\wedge k}\, N\,|\langle x,\tilde c_0,\theta|\Phi^{\rm cont}_\phi\rangle|^2, \qquad k\in\mathbb Z^2. \]

We will prove the following estimate: Proposition 9. The Fourier coefficients of the (non-normalized) Husimi function for the quasimode $|\Phi^{\rm cont}_\phi\rangle$ satisfy, uniformly for φ in a bounded interval and $k\in\mathbb Z^2$, $|k|\le e^{\sqrt T}$:

\[ \tilde H_{\tilde c_0,\Phi^{\rm cont}_\phi}(k) = S_1^{\rm cont}(\lambda,\phi)\,T\,(1+\delta_{k,0}) + O(\sqrt T). \tag{83} \]

This formula yields at the same time the norm of $|\Phi^{\rm cont}_\phi\rangle$, the convergence of the normalized quasimode to the measure $\frac{1+\delta_0}{2}$, but also the remainder $O(T^{-1/2})$ in this convergence (which we could not obtain with previous methods). We do not know whether this estimate is sharp; in any case, we believe that the remainder cannot be smaller than $O(T^{-1})$. Using the same methods, we can show that the remainder in the convergence of $|\Phi^{\rm cont}_{\rm loc}\rangle_n$ to its limit measure $\delta_0$ behaves as $F(k)\,T^{-1}$, with a function $F(k)\not\equiv 0$.

Proof. From Eq. (82), we split Hc˜0 ,φ (x) into 3 components: Hdiag (x) = N |χu (x)φ,T (x + nu,x )|2 + |χs (x)φ,T (x + ns,x )|2 , Hinterf (x) = N χu (x)χs (x)φ,T (x + nu,x )φ,T (x + ns,x ) + c.c. , Hremain (x) = O(−1/2 T 1/4 ) |χu (x)φ,T (x + nu,x )| √ +|χs (x)φ,T (x + ns,x )| + O( T ).

(84) (85) (86)

We will show that the integrals over F of the “remainder” and the “interference” components are O(T 1/2 ), while the integral of e2iπx∧k Hdiag (x) yields the dominant contribution in (83). The integral of Hremain on F is easy to treat. It involves B dx |φ,T (x)|, which we √ estimate by using the asymptotics (78) in the domain x ∈ B, |x| >>√ . This yields 1/2 T −1/4 ), so the integral of Hremain is an O( T ). B dx|φ,T (x)| = O(


F. Faure, S. Nonnenmacher, S. De Bi`evre

Homoclinic intersections. To understand the “interference component” Hinterf (x), we have to describe a little bit the set (BuT ∩ BsT ) \ Sq. It is composed of a large number of small “squares” surrounding homoclinic intersections (some of them are clearly visible in Fig. 9). Each of these squares is indexed by a couple of (nonequal) integer vectors (nu , ns ) (finitely many such couples correspond to an actual square in B T ): def

Sqnu ,ns = (Bu − nu ) ∩ (Bs − n√ s) √ = {|q (x) + q (ns )| ≤ 2 T , |p (x) + p (nu )| ≤ 2 T }. Since we have excluded the central square, one can use the asymptotics (78) for φ,T (x + nu/s ). The integral of |Hinterf (x)| on Sqnu ,ns is then smaller than

dq dp

C Sqnu ,ns

−1/2 |q (x + nu )p (x + ns )|

e−

q 2 (x+ns ) 2

e−

p 2 (x+nu ) 2

,

which admits the upper bound

C 1/2 |q (nu − ns )p (ns − nu )|

1/2

≤C

1 1 + . |q (nu − ns )| |p (ns − nu )|

We now want to sum the RHS over all homoclinic squares in B T . To compute the sum over 1/q (resp. 1/p ), we consider the squares as subsets of Bu (resp. Bs ), which orders them along the strip. Two√successive squares do not overlap, so their centers in Bu (resp. in Bs ) satisfy |δq | ≥ 4 T . As a result, the total number of squares is less than CT , and summing their contributions we get T

|Hinterf (x)| dx ≤ 4C 1/2

C/ T j =1

√ 1 1 = O √ | log(T )| = O( T ). √ T j 4 T

Notice that we ignored the phases present in Hinterf (x), as we had done in Sect. 4.3 to estimate I (t, s). Diagonal contribution. We now finish the proof by computing the integral 2iπx∧k diag e dx H (x) = N dx e2iπx∧k |φ,T (x)|2 . F

B

q + k

The wedge product 2πx ∧ k is rewritten ku s p in the adapted coordinates. If k = 0, then ku = 2π v+ ∧ k, ks = 2πv− ∧ k are bounded away from zero (cf. Sect. 3.1). We give some details for the computation of the integral in the positive sector S+ . Let () be a semiclassically increasing function s.t. 1 << () << T 1/4 . The integral √ diag

2 in the central region (|q | < of H √ ) admits the obvious upper bound O( ).

In the region {x ∈ S+ , q > }, one can apply the asymptotics (79). After integrating over p , we obtain

S1cont (λ, φ) 2λ

Co √ 9 T

eiku q dq √ q

1 + O(e

−4T

2 ) + O 2 + O(ks ) . q


This integral is easy to estimate: • for k = 0, it yields C √ S1cont (λ, φ) S cont (λ, φ) log √ T + O log( T ) . + O(−2 ) = 1 2λ 2 T • for k = 0, it has the asymptotics [BaHTF, Chap. 9] √ S cont (λ, φ) S1cont (λ, φ) | log( ku )| + O(1) = 1 T + O(log(ku )). 2λ 4 Taking the 3 remaining sectors into account, we obtain the proposition.

6.4. Husimi function close to the origin and Ls norms. Besides providing the limit semiclassical measure, the pointwise formula (82) allows us to compute different indicators s of localization for the quasimode |cont φ n , namely the L norms of its Husimi function [Pr, NV2]: 1/s def s . (s > 0) H s = [H (x)] dx T

For s = 2, this defines a phase space analogy of the “inverse participation ratio” used in condensed-matter physics; in the limit s → 1+ , it yields the Wehrl entropy of the state; for s → ∞, this is sup-norm of the Husimi density. Proposition 10. For any fixed ∞ ≥ s > 1 and φ in a bounded interval, the Ls norms of the quasimodes |cont φ n behave in the semiclassical limit as " " C(s, φ/λ) " " . " ∼ "Hc˜0 ,|cont 1 n φ s 1− s | log | ˜ θ behave as C (s, c) ˜ −1+1/s as By comparison, the Ls -norms of a coherent state |c, → 0, c˜ in a bounded set [NV2]. In the case of the sup-norm, we have a more precise statement (see Fig. 8): Proposition 11. For small enough , the maximum of Hc˜0 ,|cont n (x) is at the origin φ

2 for |φ|/λ < 0.5, and C(∞, φ/λ) = |(1/4+iφ/2λ)| . Conversely, for |φ|/λ >> 1, 25/2 π 3/2

= −P = ±√φ/λ on the hyperbola, and the maximum is close to the points Q √ C(∞, φ/λ) ∼ (25/2 π|φ|/λ)−1 .

Sketch of proof. For any s > 1, the decrease ∼ |x|1 s of the Husimi function along the s hyperbola implies that most of the weight in the integral F H cont is supported near the φ origin, so that this integral is close to R2 Hs (even) . This yields the proposition, with the φ

coefficients C(s, φ/λ) given as integrals of parabolic cylinder functions. The statements on the maxima derive from known results about parabolic cylinder functions.


6.5. Odd-parity quasimodes. The connection (82) between torus quasimodes |cont φ (even)

hints at a property we have not used much, namely parity. We have already mentioned that for each energy E = −φ, Hˆ admits two indeand generalized eigenstates |φ

(even)

pendent generalized eigenstates, |φ

of even parity, and a second one of odd parity,

(odd) . |φ

On the one hand, the Bargmann function of the latter can which we denote by be expressed similarly as in Eq. (76): (odd)

x, c˜0 |φ

= Cφ e−|Z|

2

D−1+2µ (2Z) − D−1+2µ (−2Z) .

(even)

, we can build this odd eigenstate by propaOn the other hand, as we did for |φ gating an “odd” coherent state at the origin, i.e. replacing the initial |c˜0 in Eq. (72) by the first excited squeezed state def |c˜0 1 = Mˆ (c˜0 ,0) a † |0.

The Bargmann function of the corresponding quasimode |φ,T 1 is given by an integral

similar to (73), with the integrand multiplied by the factor √Q −iP : this is therefore an 2 cosh λs

(odd)

odd function of x, semiclassically close to x, c˜0 |φ

.

Projecting this plane odd quasimode to the torus through Pˆθ , one obtains a quasi(odd) mode |φ of Mˆ with quasiangle φ. Provided one has selected periodicity conditions θ ≡ (0, 0) mod π , parity is conserved by Pˆθ , so that the Bargmann function (odd)

x, c˜0 , θ|φ (resp. x, c˜0 , θ |φ ) is an even (resp. an odd) function of x. As a result, these two quasimodes are mutually orthogonal. The Bargmann and Husimi func(odd) tions of |φ can be described as precisely as for its even counterpart, in particular its normalized Husimi and Wigner functions converge as well to the measure a remainder O(T −1/2 ).

1+δ0 2 , with

Fig. 10. Husimi functions of the odd eigenstate with φ = 0 (linear scale) and of the odd torus quasimode with φ = 0 (logarithmic scale) for N = 500, in the (q, p) plane. Notice the zero at the origin


6.6. On the “robustness” of continuous quasimodes. We want to show that the conodd tinuous quasimodes |cont φ , |φ are “stable” with respect to a change of the initial state (|c˜0 and |c˜0 1 , respectively). One can indeed obtain an even quasimode very close to |cont φ by propagating a different initial state |ψ0 : this state needs to be of even parity, sufficiently localized (e.g. a finite combination of excited coherent states), and taken away from a subspace of “bad” initial states. These remarks will be made more quantitative in Appendix 10.2, which treats the case where |ψ0 is a squeezed coherent state of arbitrary squeezing. To explain this “robustness”, we notice that the operator ∞ def ˆ cont Pˆ −∞,∞,φ = ds e−iφs e−iH s/ −∞

(even)

(odd) and |φ . Any (even) 2 L (R) will thus be projected onto Cφ (ψ0 )|φ , with the prefactor

projects L2 (R) onto the 2-dimensional space spanned by |φ even state |ψ0 ∈

(even)

Cφ (ψ0 ) =

φ

|ψ0

(even) φ |c˜0

.

This prefactor vanishes iff there exists a state |ϕ0 ∈ L2 (R) such that |ψ0 = (Hˆ +φ)|ϕ0 ; such |ψ0 form a “bad” subspace of codimension 1 inside the space of even states. If |ψ0 is localized inside a disk of radius C1/2+ at the origin, one can describe the cont plane quasimode Pˆ −T ,T ,φ |ψ0 as in (77): cont x, c˜0 |Pˆ −T ,T ,φ |ψ0 = Cφ (ψ0 )x, c˜0 |φ

(even)

+ O(1/2− )

uniformly in x ∈ R2 . (87)

cont If Cφ (ψ0 ) is of order unity, this estimate shows that Pˆ −T ,T ,φ |ψ0 resembles the quasimode |φ,T . One can then show (as in Sect. 6.2) that the torus state Pˆθ Pˆ cont |ψ0 is −T ,T ,φ

close to the quasimode Cφ (ψ0 )|cont φ . As an example, consider the case φ = 0: one can start from any (finitely) excited 4n coherent state of the form |c˜0 4n ∝ Mˆ (c˜0 ,0) a † |0 to obtain a quasimode asymptotically close to |cont 0 . On the opposite, the states |c˜0 4n+2 are “bad” initial states, because they are in the range of Hˆ . This discussion straightforwardly transposes to the construction of the odd quasimodes |odd φ starting from odd localized states. 7. Quasimodes on a General Periodic Orbit We have so far described the construction of quasimodes localized on the fixed point 0 of the classical map M. We will now generalize this construction to a general periodic orbit of M. The associated Husimi densities will be shown to be (semiclassically) partly localized on the orbit and partly equidistributed. The proofs require some minor changes with respect to the previous case, but no fundamentally new ingredients. We consider a fixed periodic orbit P = {x ∈ F}τ=0 of (primitive) period τ , in other words, for 0 ≤ < τ , Mx = x+1 mod Z2 and xτ = x0 . Note that M τ x = x mod Z2 ,


so that all x , when viewed as points on the torus, are fixed points of M = M τ . Furthermore, for all 0 ≤ ≤ τ , there exist m ∈ Z2 so that x = M x0 − m . We will first introduce the discrete time quasimode defined in (2) and will consider its continuous time analog below: |disc φ =

T −1

e−iφt Mˆ t Pˆθ Tˆx0 |c˜0 .

t=−T

Letting T be the integer multiple of τ that is closest to | ln |/λ, and setting T = T /τ , a simple computation yields   τ −1 T −1 k disc ˆ e−iφ |disc e−iφτ k Mˆ  Pˆθ Tˆx0 |c˜0 . |disc φ = where | = M k=−T

=0

(88) It is easy to see that Mˆ Pˆθ Tˆx0 = Pˆθ Tˆx +m Mˆ = eiS Pˆθ Tˆx Mˆ ,

(89)

where S = θ · ml + iδml + iπN ml ∧ xl (see (19)). This phase can partly be interpreted in terms of the action along the classical orbit; however, the θ-term is non-classical, akin to the quantum phase due to a pointwise magnetic flux tube on a charged particle (Aharonov-Bohm effect) [KM]. Hence   T −1 k iS  e−iφτ k Mˆ  Pˆθ Tˆx Mˆ |c˜0 . |disc (90) =e k=−T

ˆ This suggests that |disc is a quasimode of quasiangle φτ for M , associated to the

fixed point x of M . This is basically the content of Proposition 12. There is another instructive way of rewriting |disc which corroborates this idea. For that purpose, we first draw from Eq. (89), Mˆ τ k Pˆθ Tˆx0 = eikSτ Pˆθ Tˆx0 Mˆ τ k

and

Mˆ τ k+ Pˆθ Tˆx0 = ei(kSτ +S ) Pˆθ Tˆx Mˆ τ k+ . (91)

Using this, one can write  |disc

=e

iS

Tˆx Pˆθ˜ 

−1 T

 e

−i(φτ −Sτ )k

k

Mˆ  Mˆ |c˜0 ,

(92)

k=−T

where we used Pˆθ Tˆx = Tˆx Pˆθ˜ with θ˜ = θ + 2πN (p , −q ). A simple computation shows that, because x is a fixed point for M τ , θ˜ is a fixed point for the map θ → θ defined in (31), with M replaced by M . Consequently, |disc φ is the x translate of a quasimode for Mˆ at the origin with quasiangle φτ − Sτ , of the type studied in the previous sections.
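The orbit data used in this section (the points x_ℓ, the primitive period τ and the integer vectors m_ℓ with x_ℓ = M^ℓ x_0 − m_ℓ) can be computed exactly for any rational starting point. The following is only an illustrative sketch: it assumes that M is Arnold's matrix and that the fundamental domain is [0,1)², and the function names are ours, not the paper's.

```python
from fractions import Fraction

# Assumption: Arnold's cat map; any hyperbolic matrix in SL(2,Z) could be used.
M = ((2, 1), (1, 1))

def apply(mat, v):
    """Apply a 2x2 integer matrix to a point of the plane (no reduction)."""
    return (mat[0][0] * v[0] + mat[0][1] * v[1],
            mat[1][0] * v[0] + mat[1][1] * v[1])

def reduce_to_F(v):
    """Reduce a point modulo Z^2 into the fundamental domain [0,1)^2."""
    return tuple(c - (c.numerator // c.denominator) for c in v)

def periodic_orbit(x0):
    """Return (orbit, winding) with x_l = M^l x_0 - m_l for 0 <= l <= tau.

    orbit   = [x_0, ..., x_{tau-1}]
    winding = [m_0, ..., m_tau]
    """
    orbit, winding = [x0], [(0, 0)]
    lifted = x0                       # M^l x_0, computed in the plane
    while True:
        lifted = apply(M, lifted)
        x = reduce_to_F(lifted)
        winding.append(tuple(int(a - b) for a, b in zip(lifted, x)))
        if x == x0:                   # the orbit has closed up
            return orbit, winding
        orbit.append(x)

x0 = (Fraction(1, 2), Fraction(1, 2))
orbit, winding = periodic_orbit(x0)
tau = len(orbit)                      # primitive period of x0
```

The loop always terminates for a rational x_0, since M permutes the finite set of torus points with a given common denominator.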


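To build continuous time quasimodes, we replace in all the above formulas $|\tilde c_0\rangle$ by

\[ \frac{1}{\tau}\int_0^{\tau} dt\; e^{-it\tilde\phi}\, e^{-\frac{i}{\hbar}\hat H t}\,|\tilde c_0\rangle, \tag{93} \]

where the “quasienergy” $\tilde\phi\in\mathbb R$ is chosen so that

\[ \tau\tilde\phi \equiv \tau\phi - S_\tau \mod 2\pi. \tag{94} \]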

Whereas the quasiangle φ is defined modulo 2π, the quasienergy φ˜ is chosen in R. The continuous quasimode reads: T i ˆ ˜ cont iS ˆ ˆ 1 e | = Tx Pθ˜ dt e−iφt e− H t Mˆ |c˜0 . (95) τ −T All the above quasimodes can of course in obvious ways be split into a localized and an equidistributing part, as before. For both the discrete and continuous time quasimodes we have the following estimates: Proposition 12. For all 0 ≤ < ≤ τ − 1, for all f ∈ C0∞ (T2 ), for all k ∈ Z, | = 2T S1 (φτ − Sτ , τ λ) + O(1), 1 1 lim n |fˆ| n = f (x ) + f (x)dx, 2 2 T2 →0 lim n |Tˆk/N | n = 0.

(96) (97) (98)

→0

The quasimodes |φ satisfy (7), the limit being uniform for φ, φ˜ in a bounded interval. Starting from (95) a pointwise analysis of the continuous time quasimode |cont P ,φ can be performed as well, √ along the lines of Sect. 6.2. One should notice that the Husimi function of |cont P ,φ in the -vicinity of a periodic point xl is dominated by the contricont bution of |l ; it is concentrated on a hyperbola which depends on the quasienergy φ˜ rather than on the quasiangle φ. Proof of the proposition. We write the proof for the discrete time quasimodes only. Equation (92) immediately implies (96) and (97) as a consequence of the results of Sect. 5. To prove (98) when k = 0, i.e. the asymptotic orthogonality of the | , we write, using (88) and (91), disc disc |

−1 T −1 T

e−i(φτ −Sτ )(k−k )+iS− c˜0 |Tˆ−x0 Pˆθ Tˆx− Mˆ (− )+τ (k−k ) |c˜0 , = k =−T k=−T

so that

T −1 disc disc | ≤

≤

−1 T

k =−T k=−T m∈Z2

−1 T −1 T

Jr (τ (k − k ) + − , 0) ≤ C,

k =−T k=−T

|c˜0 |Tˆ−x0 Tˆm Tˆx− Mˆ (− )+τ (k−k ) |c˜0 | (99)


where r = x0 − x− , and where we used the estimate Jr (t, 0) ≤ C eλ|t|/2 extracted from Appendix 10.1. To prove (98) when k = 0, one repeats the arguments of Sect. 5: we omit the details. For continuous quasimodes, the proofs are analogous, using this time the same estimate on Jr (t, s). The proof of (7) follows immediately. Convex combinations of limit measures. We can further enlarge the set of semiclassical limit measures by taking finite convex combinations of the previous ones. Consider a finite set of periodic orbits {P1 , . . . , Pf }, and complex coefficients {α1 , . . . , αf } f satisfying i=1 |αi |2 = 1. Let |Pi ,φ be quasimodes (discrete or continuous time) associated to Pi , as defined above, with the same quasiangle φ. We can then combine them into the quasimode f def | = αi |Pi ,φ n . i=1

One readily shows along the lines of the proof of Proposition 12 that for i = j , and for all k ∈ Z2 , one has lim n Pi |Tˆk/N |Pj n = 0. →0

This together with (7) that the Husimi shows and Wigner functions of |n converge to f 1 2 the limit measure 2 dx + i=1 |αi | δPi . 8. Scarred Eigenstates for Quantum Cat Maps of Short Quantum Periods We will now slightly extend an argument from [BonDB1] in order to show that the quasimodes we have built and studied in the previous sections are exact eigenstates of the quantum map Mˆ for certain special values of and we will prove Theorem 1. For that purpose, we first recall a few facts about quantum cat maps [HB]. For a given value of N = (2π)−1 , every quantum map Mˆ has a “quantum period” P (N ) defined to be the smallest nonnegative integer such that Mˆ P (N) = eiϕ(N) IˆHN,θ

for a certain ϕ(N ) ∈ [−π, π [.

It follows that, if φ is of the type

\[ \phi=\phi_j=\frac{\varphi(N)+2\pi j}{P(N)}, \tag{100} \]

then $\frac{1}{P(N)}\hat P_{t_1,\,t_1+P(N),\,\phi_j}$ is independent of $t_1$, and is the spectral projector onto the eigenspace of $\hat M$ inside $\mathcal H_{N,\theta}$ associated to the eigenvalue $e^{i\phi_j}$ (the normalization factor 1/P(N) ensures that it is indeed a projector). All eigenvalues of $\hat M$ on $\mathcal H_{N,\theta}$ are necessarily of that form. The function P(N) depends on N in an erratic way, and no closed formula exists for it [Ke]. It satisfies the general bounds

\[ \exists C>0,\ \forall N\in\mathbb N^*,\qquad \frac{2}{\lambda}\log N - C \;\le\; P(N) \;\le\; C\,N\log\log N. \tag{101} \]

It is moreover known that, for “almost all” integers, $P(N)\ge\sqrt N$ [KR2]. We will now give an elementary argument to show that, given any hyperbolic matrix in SL(2, Z), there exists an infinite sequence of integers $N_k$ for which the quantum period is very short in the sense that it saturates the above lower bound:

\[ P(N_k) = 2\,\frac{\log N_k}{\lambda} + O(1) = 2T_k + O(1), \tag{102} \]

where the Ehrenfest time T was defined in (45).
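The projector property stated above can be checked numerically on a toy example. The sketch below does not implement the quantum cat map: as a stand-in (our assumption) it uses the cyclic shift U on C^P, which satisfies U^P = I, so that ϕ = 0 and the admissible eigenangles are φ_j = 2πj/P. It then forms (1/P) Σ_{t=0}^{P−1} e^{−iφ_j t} U^t, and the assertions in the test block confirm that the result is idempotent and maps into the eigenspace of eigenvalue e^{iφ_j}.

```python
import cmath

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def spectral_projector(U, period, phi):
    """(1/P) * sum_{t=0}^{P-1} e^{-i phi t} U^t, for U with U^P proportional to I."""
    n = len(U)
    total = [[0j] * n for _ in range(n)]
    power = identity(n)                      # holds U^t
    for t in range(period):
        phase = cmath.exp(-1j * phi * t)
        for i in range(n):
            for j in range(n):
                total[i][j] += phase * power[i][j] / period
        power = matmul(power, U)
    return total

def max_diff(A, B):
    """Largest entrywise distance between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Toy stand-in for the quantum map (hypothetical): cyclic shift with U^P = I.
P = 6
U = [[1 if i == (j + 1) % P else 0 for j in range(P)] for i in range(P)]
j = 2
phi_j = 2 * cmath.pi * j / P                 # eigenangle of the form (100), varphi = 0
Pi = spectral_projector(U, P, phi_j)
```

For an angle φ that is not of the form (100) the same sum is no longer a projector, which is why the special values φ_j matter in the construction of exact eigenstates.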


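Let us first recall that, for all $k\in\mathbb N^*$, one has

\[ M^k = p_k\,M - p_{k-1}, \qquad\text{where}\qquad p_k=\frac{e^{\lambda k}-e^{-\lambda k}}{e^{\lambda}-e^{-\lambda}},\quad p_0=0. \]

It was proven in [BonDB1] that, for all k ≥ 1, the integer $\tilde N_k = {\rm GCD}(p_k,\,p_{k-1}+1)$ satisfies

\[ \frac{2\log\tilde N_k}{\lambda} = k + O(1), \tag{103} \]

and that

\[ M^k = I + \tilde N_k\,M_k, \qquad\text{with } M_k \text{ an integer matrix.} \tag{104} \]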
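The integers p_k and Ñ_k are easy to tabulate. The sketch below assumes Arnold's matrix as the example. It uses the recurrence p_{k+1} = Tr(M) p_k − p_{k−1}, which follows from the Cayley-Hamilton identity M² = Tr(M) M − I and is equivalent to the closed formula for p_k. It then checks (104) entry by entry and prints the deviation 2 log Ñ_k/λ − k appearing in (103). The helper names are ours.

```python
from math import gcd, log, sqrt

# Assumption: Arnold's cat map; only its trace enters the recurrence.
M = ((2, 1), (1, 1))
TRACE = M[0][0] + M[1][1]

def p_sequence(kmax):
    """Return [p_0, ..., p_kmax] with p_0 = 0, p_1 = 1, p_{k+1} = Tr(M) p_k - p_{k-1}."""
    p = [0, 1]
    while len(p) <= kmax:
        p.append(TRACE * p[-1] - p[-2])
    return p

def matpow(k):
    """M^k for k >= 1, computed as p_k M - p_{k-1} I."""
    p = p_sequence(k)
    return ((p[k] * M[0][0] - p[k - 1], p[k] * M[0][1]),
            (p[k] * M[1][0], p[k] * M[1][1] - p[k - 1]))

def n_tilde(k):
    """The integer N~_k = GCD(p_k, p_{k-1} + 1) for k >= 1."""
    p = p_sequence(k)
    return gcd(p[k], p[k - 1] + 1)

lam = log((TRACE + sqrt(TRACE ** 2 - 4)) / 2)     # Lyapunov exponent of M

for k in range(1, 13):
    Nt = n_tilde(k)
    Mk = matpow(k)
    # (104): every entry of M^k - I is divisible by N~_k
    assert all((Mk[i][j] - (i == j)) % Nt == 0
               for i in range(2) for j in range(2))
    print(k, Nt, round(2 * log(Nt) / lam - k, 3))
```

The divisibility in (104) is immediate from M^k − I = p_k M − (p_{k−1} + 1) I, since Ñ_k divides both coefficients.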

We now set $N_k=\tilde N_k$ if $\tilde N_k$ is odd, $N_k=\tilde N_k/2$ if $\tilde N_k$ is even. Choosing the periodicity angle θ = (0, 0) when $N_k$ is even and θ = (π, π) when $N_k$ is odd (which makes sense, cf. the end of Sect. 3.2), we prove below the following lemma: Lemma 4. With $N_k$, θ given as above,

\[ \hat M^k = e^{i\varphi}\,\hat I_{\mathcal H_{N_k,\theta}} \qquad\text{for a certain } \varphi\in[-\pi,\pi[. \]

This means that the quantum period P (Nk ) of Mˆ on HNk ,θ divides k. Comparing (101) with (103) entails that for k large enough, P (Nk ) = k and (102) holds. Proof of the lemma. The case N˜ k = 2Nk , θ = (0, 0) was treated in [HB]. We give a different proof, which works for both cases. Lemma and the irreducibility of the Tˆn/Nk , it suffices to show that From Schur’s Mˆ k , Tˆn/Nk = 0 on HNk ,θ , for all n ∈ Z2 . Setting N˜ k = Nk , θ = (π, π ) and using the definition of Pˆθ , Eqs. (19) and (20), one readily computes Mˆ k Tˆn/Nk Mˆ −k Pˆθ = eiπ (n∧Mk n) Tˆn/Nk TˆMk n Pˆθ

= (−1)(n∧Mk n)+ [(Mk n)1 +(Mk n)2 +(Mk n)1 (Mk n)2 ] Tˆn/Nk Pˆθ .

This phase is trivial if = 2. In the case = = 1 (that is, N˜ k odd), one must consider the 6 possible values of M modulo 2: in all cases, the phase is trivial. If we now consider such a value Nk together with an admissible eigenangle φjk , the eigenstates |k =

P (N k )/2−1

e−iφjk t Mˆ t |x0 , c˜0

t=−P (Nk )/2

are (discrete time) quasimodes of the quantum map as studied in the previous sections. Indeed, as discussed at the end of Sect. 5, since T and P (Nk )/2 differ by a bounded number of terms in the semiclassical limit, we can replace one by the other in (2), without affecting any of the semiclassical properties of the quasimodes. One can similarly construct eigenfunctions that are continuous time quasimodes.


Proof of Theorem 1. The previous arguments settle the case β = 1/2. To treat the general case, we recall that the Schnirelman theorem implies the existence of a sequence of eigenfunctions |ϕk n of Mˆ on HNk ,θ (with corresponding eigenvalues (φjk )k∈N ) that equidistribute as k → ∞. We then construct, for 0 ≤ α ≤ 1: |ψk = α|k n + 1 − α 2 |ϕk n . If we show that, for all n ∈ Z2 , lim nϕk |Tˆn/Nk |k n = 0,

→0

a simple computation implies that the |ψk n satisfy (8) with β = α 2 /2. We have lim nϕk |Tˆn/Nk |k n = lim nϕk |Tˆn/Nk |k,erg n + nϕk |Tˆn/Nk |k,loc n . →0

→0

The second limit vanishes with an argument as in (71), whereas for the first, we use the further decomposition |k,erg = |k,1 + |k,4 with |k,1 = eiφjk T /2 Mˆ −T /2 |k,2 , |k,4 = e−iφjk T /2 Mˆ T /2 |k,3 (see (3)). Now, since |ϕjk n is an eigenfunction, we have nϕk |Tˆn/N |k,4 n = nϕk |Mˆ −T /2 Tˆn/N Mˆ T /2 |k,3 n . k k As in the proof of Proposition 6, and more specifically Eq. (71), this tends to 0 with . For matrices M of “checkerboard structure”, the results of [KR1, Me] imply that, given an arbitrary sequence of eigenvalues (φjk )k∈N , there exists a corresponding sequence of eigenvectors |ϕk ∈ HNk ,θ that semiclassically equidistribute. One can then construct for the same eigenvalues eigenstates |ψk satisfying Eq. (8). The P (Nk ) eigenstates with distinct eigenvalues constructed above are of course exactly orthogonal to each other, and not just asymptotically as proven in Sect. 5.3. On the other hand, two continuous time eigenstates of identical eigenangle φj but different quasienergies φ˜ − φ˜ = sπ/T , s = 0 become orthogonal in the semiclassical limit. This is also the case for two eigenstates with the same eigenangle supported on different periodic orbits P = P . 9. Conclusion In this article we have constructed and analyzed a certain class of “quasimodes” of hyperbolic quantized torus isomorphisms, which for certain values of become exact eigenstates. The characteristic property of these quasimodes is that their “quantum limit”, that is the weak limit of their Husimi densities, does not yield the Liouville measure, but contains a singular component supported on a (finite union of) periodic orbit(s). In our case, this singular component has a relative weight β ≤ 1/2, less than or equal to the weight of the Liouville part. As explained in the introduction, no limit measure of eigenstates can have a “larger” singular component. We further conjecture that no ∧−T ,T ,φ ) can have a more singular sequence of quasimodes (i.e. 
images of the operators P limit measure either. The strong scarring of eigenstates exhibited in this paper is directly linked to the very large degeneracies of the eigenvalues of Mˆ for certain special values of Planck’s


constant. Therefore, such sequences of eigenstates are very probably absent as soon as one considers nonlinear perturbations of the dynamics, for instance $\hat M_\epsilon = e^{-i\epsilon\hat H_1/\hbar}\hat M$, for any periodic Hamiltonian $H_1(x)$ and ε > 0 small enough. Such a perturbation of the classical map is known to preserve the uniform hyperbolicity, but it destroys the “action degeneracies” characteristic of the (linear) cat map. As a consequence the spectrum of the perturbed map exhibits Random Matrix statistics, in particular “repulsion” between eigenangles [KM], which forbids degeneracies. The precise characterization of some weaker form of scarring for individual eigenstates that would remain valid for $\hat M_\epsilon$ therefore remains an open problem. Nevertheless, it might be interesting to study the phase space distribution of the “nonlinear” quasimodes of the type $\sum_{t=-T}^{T} e^{-i\phi t}\hat M_\epsilon^{\,t}|x_0,\tilde c\rangle$, for $x_0$ a periodic point of $M_\epsilon$, which may not be as simple to describe as for the linear map. Acknowledgements. We have benefited from useful conversations with Y. Colin de Verdière, C. Gérard, E. Vergini, A. Voros, D. Wisniacki. F. F. acknowledges the kind hospitality of the Service de Physique Théorique, CEA Saclay where part of this work was accomplished.

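10. Appendices 10.1. Estimate of the interference term I(t, s). In this appendix we prove Proposition 1. For the purpose of Sect. 7 we will at the same time give a bound for the more general overlap (t, s ∈ R)

\[ \big|\langle r,\tilde c_s|\,\hat P_\theta\, e^{-i\hat H t/\hbar}\,|\tilde c_s\rangle\big| \;\le\; \sum_{n\in\mathbb Z^2}\big|\langle r+n,\tilde c_s|\, e^{-\frac{i}{\hbar}\hat H t}\,|\tilde c_s\rangle\big|, \tag{105} \]

where r ∈ F (the fundamental domain) belongs to the lattice $\frac{1}{D}\mathbb Z^2$, with $D\in\mathbb N^*$, and where θ ∈ [0, 2π[×[0, 2π[ is arbitrary (in other words, θ need not be equal to the fixed point of the map (31)). We define

\[ J_r(t,s) \stackrel{\rm def}{=} \sum_{n\in\mathbb Z^2,\ r+n\neq 0}\big|\langle r+n,\tilde c_s|\, e^{-\frac{i}{\hbar}\hat H t}\,|\tilde c_s\rangle\big|. \tag{106} \]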

√ √ We first consider the case s = 0, t ≥ 0. Since ≤ p ≤ 2 for all positive times, only the points r + n near the unstable axis can significantly contribute. Therefore, we subdivide the plane into strips parallel to this axis: the “outer” strips ∀l ≥ 1, S±l = x | al ≤ ±p (x) < al+1 , with al = W0 +(l −1)W , and the central strip S0 = x = 0 | p (x)| < W0 . The widths W0 , W will be explicitly set below. We start by estimating the contribution of the points r + n ∈ Sl with l ≥ 1. Due to the diophantine condition (25), as long as W is small enough, two points in this strip satisfy the property |q (r + n) − q (r + m)| > Co /W . Ordering these points according to their abscissas: q (r + nj ) < q (r + nj +1 ) < q (r + nj +2 ), we have for any α > 0 : j Co 2

2 . (107) exp −αq (r + nj ) ≤ exp −α W j ∈Z

j ∈Z


The sum on the RHS is a one-dimensional theta function, which has the upper bound (optimal for 0 < α small enough):

\[ \sum_{j\in\mathbb Z} e^{-\alpha j^2} \;\le\; 1+\sqrt{\frac{\pi}{\alpha}}. \tag{108} \]
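The bound (108) follows from comparing the sum over j ≥ 1 with the integral of e^{−αx²} over the half-line, and it can be checked numerically. The short sketch below is only an illustration: the truncation rule is our choice and drops a tail of relative size below e^{−100}.

```python
from math import exp, pi, sqrt

def theta_sum(alpha, cutoff=None):
    """Truncated sum over j in Z of exp(-alpha * j^2).

    The default cutoff keeps all terms with |j| <= 10/sqrt(alpha) + 10,
    so the neglected tail is smaller than exp(-100) per term.
    """
    if cutoff is None:
        cutoff = int(10 / sqrt(alpha)) + 10
    return 1.0 + 2.0 * sum(exp(-alpha * j * j) for j in range(1, cutoff + 1))

def theta_bound(alpha):
    """Right-hand side of (108)."""
    return 1.0 + sqrt(pi / alpha)

for alpha in (1e-4, 1e-2, 1.0, 10.0):
    print(alpha, theta_sum(alpha), theta_bound(alpha))
```

For small α the ratio of the two sides tends to 1, which is the sense in which the bound is optimal.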

As a result, using (43) it becomes clear that the contribution to Jr (t) of the points r + n ∈ Sl is bounded above by ( '

q (r + nj )2 1 1 p (r + nj )2 + exp − √ 2 p 2 q 2 cosh λt r+nj ∈Sl 2 ' ( √ −λt/2 − al 2 √ W q 2p e 1 + 2π . (109) ≤ 2e Co The estimate (108) can then be applied to the sum over the strips Sl , l = 0, to obtain ˆ (remind |t; c˜0 = e−iH t/ |c˜0 ) |r + n, c˜0 |t; c˜0 | l=0 r+n∈Sl

)

2

W √ − 0 ≤ 2 e−λt/2 e 2p 2

√ 2+

2πp W

*)

√ 1+

* 2π W q . Co

p For each time t, we can minimize the RHS with respect to W by taking W 2 = Co 2q

= Co 2

e−λt , which leads to the bound

√

|r + n, c˜0 |t; c˜0 | ≤ 2 2 e

l=0 r+n∈Sl

−

W02 2p 2

' (2 π

λt/2 e−λt/2 1 + e p . (110) Co

Notice that this upper bound is independent of the point r. We now estimate the contribution of the strip S0 , which requires more care, and will 2 depend on r. For any point r = 0 on the lattice D1 Z sufficiently close to the unstable axis, the diophantine property (25) implies |p (r )| ≥ D 2 |qC o(r )| . As a consequence, the quadratic form appearing in (43) may be bounded inside S0 by p (r + n)2 q (r + n)2 Co2 q (r + n)2 def + ≥ + = ft (q (r + n)).

2

2

2 4

2

2 2q 2p 2q 2D p q (r + n) (111) qD√e−λt/2 Co e−λt The function ft satisfies the scaling property ft (q) = 2D , with 2 p 2 f C o

def

q −2 .

f (q) = + This function f (q) is bounded below for all positive q by the parabola g(q) = 2 + (q − 1)2 , so after rescaling we get q2

∀q > 0,

def

ft (q) ≥ gt (q) =

2 e−2λt Co e−λt q − Co eλt/2 /D . + 2

2

2 D p 2p
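The elementary inequality used here, f(q) = q² + q⁻² ≥ g(q) = 2 + (q − 1)² for all q > 0, can be verified directly: the difference equals q⁻² + 2q − 3 = (q − 1)²(2q + 1)/q², which is nonnegative and vanishes only at q = 1. A minimal numerical check, given as an illustration only:

```python
def f(q):
    """The function f(q) = q^2 + q^(-2), defined for q > 0."""
    return q ** 2 + q ** -2

def g(q):
    """The parabola g(q) = 2 + (q - 1)^2 that bounds f from below."""
    return 2 + (q - 1) ** 2

def gap(q):
    """f(q) - g(q), which equals (q - 1)^2 (2q + 1) / q^2."""
    return f(q) - g(q)

# Sample the half-line q > 0 on a grid from 0.01 to 10.
grid = [k / 100 for k in range(1, 1001)]
smallest_gap = min(gap(q) for q in grid)
```

The two curves touch at q = 1, so the parabola is the best lower bound of this shape near the minimum of f.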


We consider the contributions of the points r +n in S0 such that q (r +n) > 0 (the points with negative q can be treated identically). We order these points as 0 < q0 < q1 < . . . :

each contribution is bounded above by the quantity (cosh λt)−1/2 e−gt (qj ) , which is max√ imal for the qj close to Co eλt/2 /D. The diophantine inequality |qj − qj +1 | ≥ Co /W0 together with the estimates (107,108) then yield ! √

√ Co e−λt 2π W0 p eλt −λt/2 e |r + n, c˜0 |t; c˜0 | ≤ 2 2 exp − 2 1+ . D p 2 Co r+n∈S0

(112) This contribution now depends on the rational point r through its denominator D: the upper bound increases with D. The full sum Jr (t) is bounded above by the sum of the RHS in (110)–(112). For each time t, we adjust the value of W0 to minimize that sum. We do not search the exact minimum, but only its order of magnitude. We have to distinguish two time intervals: • for short times (t << T ), the behaviour of (112) is governed by the first exponential (since p 2 ≤ 2). We take W0 such that the first√exponential in (110) is much smaller than that factor, for instance by taking W0 = 2 Co e−λt/2 /D. Being careful for times around t T , we find

√ Co e−λt −λt/2 λ(t−T )/2 e e 0 ≤ t ≤ T ⇒ Jr (t, 0) ≤ 2 2 exp − 2 1 + C , D p 2 (t) where the constant C is independent on the denominator D. One may replace p 2 (t) by its maximum 2 for positive times. The RHS increases with the denominator D. • for times t ≥ T , the RHS of (112) is now governed by the factor between brackets, and we want to make sure that (110) is not larger than it. Still taking W0 = e−λt/2 leads to the estimate: √ 2π 2 λt/2 1 + C eλ(T −t)/2 . T ≤ t ≤ 2T ⇒ Jr (t, 0) ≤ e Co The constant C is independent of r, so this bound applies uniformly to any point x ∈ T: it yields a L∞ -bound for the Bargmann (or the Husimi) function of Mˆ t |c˜0 , θ . The same bounds apply as well to Jr (t, s) with s = 0. Indeed, replacing the initial squeezing c˜0 by its s-evolved value amounts to dilating the coordinates of the points as q (r + n) → eλt1 q (r + n), p (r + n) → e−λt1 p (r + n). One easily checks that this dilation does not modify the above bounds. The negative times are treated thanks to the identity Jr (t, s) = J−M −t r (−t, s), and noticing that the above bounds only depend on the denominator D, common to r and M −t r. 10.2. Changing the initial squeezing. We chose from the beginning to construct quasimodes starting from the coherent state |c˜0 defined in Sect. 4.2. The definition was motivated by the positivity property (40) of the overlap c˜0 |Mˆ t |c˜0 , and by choosing the “smallest” parameter c˜ sharing this property. The simple expression (40) was then used to control the “interferences” I (t, s) (cf. Appendix 10.1), and to obtain from there the

490

F. Faure, S. Nonnenmacher, S. De Bi`evre

asymptotic norm of the quasimode (Sect. 5.2), a crucial step for further estimates. Similarly, we also chose to analyze the quasimodes using the $\tilde c_0$-Bargmann representation, because of the relatively simple formulas for $\langle x,\tilde c_0|\hat M^t|\tilde c_0\rangle$ (see (43)). We want to stress (as we did in Sect. 6.6 for the continuous quasimodes) that both these choices were made purely for convenience, and are not crucial for the results of this paper. The construction of quasimodes can be extended in many ways. In this appendix, we will consider discrete or continuous quasimodes starting from a squeezed state $|\tilde c_1\rangle$, with an arbitrary (possibly ħ-dependent) squeezing $\tilde c_1$. We also want to analyze these quasimodes using the Bargmann function $\langle x,\tilde c_2,\theta|\cdot\rangle$ for some $\tilde c_2\in\mathbb C$ which could depend on ħ as well. Proposition 13. The convergence (7) holds for the above quasimodes, as long as $\tilde c_1$ and $\tilde c_2$ stay in a fixed compact set K ⊂ C for all ħ. Sketch of proof. For an initial state $|\tilde c_1\rangle$, the overlap $\langle x,\tilde c_1|\,e^{-it\hat H/\hbar}\,|\tilde c_1\rangle$, crucial in the calculation of I(t, s), is still given by closed formulas. We only give it for the simpler case x = 0:

−1/2 ˆ ˆ c˜1 | e−it H / |c˜1 = c˜1 |D(λt)| c˜1 = cosh(λt) + iR(c˜1 ) sinh(λt) , ˆ −1 |c˜1 and R(c) = −#(c) sinh(2|c|) . In general, this overlap is therefore where |c˜1 ∝ Q 2|c| not real. However, it still decreases exponentially fast with time, and its average S1 (c˜1 , λ, φ) =

ˆ

R

dt e−iφt c˜1 | e−it H / |c˜1

can be√ easily related with S1 (λ, φ) through a change of variable. One gets S1 (c˜1 , λ, φ) = e−φτ1 cos(λτ1 ) S1 (λ, φ) with the “complex time” τ1 = arctan{R(c˜1 )}/λ. For x = 0, the expression for x, c˜1 |Mˆ t |c˜1 is more cumbersome than √ (43). Yet, √ it is still a Gaussian having an elliptic profile of width ∼ , length ∼ eλ|t| and height ∼ e−λ|t|/2 , and its long axis is asymptotically lined up with the unstable direction for t → ∞. As a result, the results of Sects. 4.3–5.3 still hold (replacing S1 (λ, φ) by S1 (c˜1 , λ, φ)). The localization property (65) holds as well, even if one replaces in the bras c˜0 by c˜2 , as long as c˜2 remains bounded. The rest of the proof to obtain (7) (Sects. 5.5–5.6) goes through unaltered. cont Following Sect. 6.6, the plane quasimode Pˆ −T pointwise ,T ,φ |c˜1 can be analyzed √ through the estimate (87); one now has explicitly Cφ (|c˜1 ) = e−φτ1 cos(λτ1 ). One may replace c˜0 by c˜2 in that estimate. As opposed to Eq. (76), the Bargmann function (even) is not given in terms of cylinder parabolic functions. Yet, its behaviour x, c˜2 |φ “far” from the origin will be similar to (78). As a consequence, the pointwise estimate cont (82) (with c˜0 → c˜2 in the bras) will apply to the torus quasimode Pˆθ Pˆ −T ,T ,φ |c˜1 as well, upon taking the prefactor Cφ (|c˜1 ) into account and replacing in the bras c˜0 → c˜2 on both sides. The estimates of Sects. 6.3–6.4 may be generalized as well to the present case.


References

[AA] Arnold, V.I., Avez, A.: Ergodic Problems in Classical Mechanics. New York: Benjamin, 1968
[BaTIT] Tables of integral transforms. A. Erdélyi (ed), New York: McGraw-Hill, 1954
[BaHTF] Higher transcendental functions. A. Erdélyi (ed), New York: McGraw-Hill, 1953
[Bog] Bogomolny, E.B.: Smoothed wave functions of chaotic quantum systems. Physica 31 D, 169–189 (1988)
[Ber] Berry, M.V.: Quantum scars of classical closed orbits in phase space. Proc. R. Soc. Lond. A 423, 219–231 (1989)
[BonDB1] Bonechi, F., De Bièvre, S.: Exponential mixing and |ln ħ| timescales in quantized hyperbolic maps on the torus. Commun. Math. Phys. 211, 659–686 (2000)
[BonDB2] Bonechi, F., De Bièvre, S.: Controlling strong scarring for quantized ergodic toral automorphisms. Duke Math. J. 117, 571–587 (2003)
[BouDB] Bouzouina, A., De Bièvre, S.: Equipartition of the eigenfunctions of quantized ergodic maps on the torus. Commun. Math. Phys. 178, 83–105 (1996)
[CKS] Chang, C.-H., Krüger, T., Schubert, R.: Quantizations of piecewise affine maps on the torus and their quantum limits. In preparation
[CdV] Colin de Verdière, Y.: Ergodicité et fonctions propres du Laplacien. Commun. Math. Phys. 102, 497–502 (1985)
[DE] Degli Esposti, M.: Quantization of the orientation preserving automorphisms of the torus. Ann. Inst. Henri Poincaré 58, 323–341 (1993)
[DEGI] Degli Esposti, M., Graffi, S., Isola, S.: Stochastic properties of the quantum Arnold cat in the classical limit. Commun. Math. Phys. 167, 471–509 (1995)
[FN1] Faure, F., Nonnenmacher, S.: On the maximal scarring for quantum cat map eigenstates. To be published
[FN2] Faure, F., Nonnenmacher, S.: In preparation
[F] Folland, G.: Harmonic analysis in phase space. Princeton NJ: Princeton University Press, 1988
[HB] Hannay, J.H., Berry, M.V.: Quantization of linear maps-Fresnel diffraction by a periodic grating. Physica D 1, 267–290 (1980)
[He] Heller, E.J.: Bound-state eigenfunctions of classically chaotic hamiltonian systems: scars of periodic orbits. Phys. Rev. Lett. 53, 1515–1518 (1984)
[HMR] Helffer, B., Martinez, A., Robert, D.: Ergodicité et limite semi-classique. Commun. Math. Phys. 109, 313–326 (1987)
[KH] Kaplan, L., Heller, E.J.: Measuring scars of periodic orbits. Phys. Rev. E 59, 6609–6628 (1999)
[Ke] Keating, J.P.: Asymptotic properties of the periodic orbits of the cat maps. Nonlinearity 4, 277–307 (1990)
[KM] Keating, J.P., Mezzadri, F.: Pseudo-symmetries of Anosov maps and Spectral statistics. Nonlinearity 13, 747–775 (2000)
[Kh] Khinchin, A.: Continued Fractions. Chicago and London: The University of Chicago Press, 1964
[KR1] Kurlberg, P., Rudnick, Z.: Hecke theory and equidistribution for the quantization of linear maps of the torus. Duke Math. J. 103, 47–77 (2000)
[KR2] Kurlberg, P., Rudnick, Z.: On quantum ergodicity for linear maps of the torus. Commun. Math. Phys. 222, 201–227 (2001)
[Me] Mezzadri, F.: On the multiplicativity of quantum cat maps. Nonlinearity 15, 905–922 (2002)
[NV1] Nonnenmacher, S., Voros, A.: Eigenstate structures around a hyperbolic point. J. Phys. A 30, 295–315 (1997)
[NV2] Nonnenmacher, S., Voros, A.: Chaotic Eigenfunctions in Phase Space. J. Stat. Phys. 92, 431–518 (1998)
[dPBB] de Polavieja, G.G., Borondo, F., Benito, R.M.: Phys. Rev. Lett. 73, 1613–1616 (1994)
[Pe] Perelomov, A.: Generalized coherent states and their applications. Berlin: Springer-Verlag, 1986
[Pr] Prosen, T.: Quantum surface of section method: eigenstates and unitary quantum Poincaré evolution. Physica D 91, 244–277 (1996)
[ROdA] Rivas, A.M.F., Ozorio de Almeida, A.M.: Hyperbolic scar patterns in phase space. Nonlinearity 15, 681–693 (2002)
[RS] Rudnick, Z., Sarnak, P.: The behaviour of eigenstates of arithmetic hyperbolic manifolds. Commun. Math. Phys. 161, 195–213 (1994)
[Sc] Schnirelman, A.: Ergodic properties of eigenfunctions. Usp. Math. Nauk 29, 181–182 (1974)
[WBVB] Wisniacki, D.A., Borondo, F., Vergini, E., Benito, R.M.: Localization properties of groups of eigenstates in chaotic systems. Phys. Rev. E 63, 066220 (2001)


[Z1]

Zelditch, S.: Uniform distribution of the eigenfunctions on compact hyperbolic surfaces. Duke Math. J 55, 919–941 (1987) Zhang, W.-M., Feng, D.H., Gilmore, R.: Coherent states: Theory and some applications. Rev. Mod. Phys. 62, 867–927 (1990)

[Z]

Communicated by P. Sarnak

Commun. Math. Phys. 239, 493–521 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0881-x


Conformal Field Theories of Stochastic Loewner Evolutions

Michel Bauer, Denis Bernard

Service de Physique Théorique de Saclay, CEA/DSM/SPhT, Unité de recherche associée au CNRS, CEA-Saclay, 91191 Gif-sur-Yvette, France. E-mail: [email protected]; [email protected]

Received: 14 October 2002 / Accepted: 4 March 2003
Published online: 10 July 2003 – © Springer-Verlag 2003

Abstract: Stochastic Loewner evolutions (SLEκ) are random growth processes of sets, called hulls, embedded in the two dimensional upper half plane. We elaborate and develop a relation between SLEκ evolutions and conformal field theories (CFT) which is based on a group theoretical formulation of SLEκ processes and on the identification of the proper hull boundary states. This allows us to define an infinite set of SLEκ zero modes, or martingales, whose existence is a consequence of the existence of a null vector in the appropriate Virasoro modules. This identification leads, for instance, to linear systems for generalized crossing probabilities whose coefficients are multipoint CFT correlation functions. It provides a direct link between conformal correlation functions and probabilities of stopping time events in SLEκ evolutions. We point out a relation between SLEκ processes and two dimensional gravity and conjecture a reconstruction procedure of conformal field theories from SLEκ data.

1. Introduction

Two dimensional conformal field theories [2] have produced an enormous amount of exact results for multifractal properties of conformally invariant critical clusters. See e.g. refs. [15, 5, 9] and references therein. These in particular include the famous Cardy formula giving the probability for the existence of a connected cluster percolating between two opposite sides of a rectangle in two dimensional critical percolation [4]. This set of results triggered the search for a probabilistic, and more rigorous, formulation of the statistical laws governing critical clusters. It led O. Schramm to introduce [18] the notion of stochastic Loewner evolution (SLEκ). These are conformally covariant processes which describe the evolutions of random sets, called the SLEκ hulls. Two classes of stochastic Loewner evolutions have been defined [18]: the radial and the chordal SLEκ. They differ by the positions of the end points connected by the random sets.
Below, we shall only consider the so-called chordal SLEκ .

Member of the CNRS


The probabilistic approach has already led to many new results, including the Brownian intersection exponents obtained by Lawler, Werner and Schramm [14]. It is also directly related to Smirnov’s proof of Cardy’s formula and of the conformal invariance of the scaling limit of critical percolation [20]. Although motivated by results obtained using conformal field theory (CFT) techniques, the relation between SLEκ evolutions and conformal field theories in the sense of ref. [2] remains elusive and indirect. A step toward formulating this relation was taken in ref. [1]. There we presented a group theoretical formulation of SLEκ processes and we exhibited a relation between zero modes of the SLEκ evolutions and conformal null vectors. The aim of this paper is to develop this connection and to make it more explicit.

Our approach is based on an algebraic – and group theoretical – formulation of SLEκ growths which takes into account the fact that these processes are encoded in random conformal transformations which are closely connected to special random vector fields defined over H. This leads naturally to lifting the SLEκ evolutions to Markov processes in a group, whose Lie algebra is a Borel subalgebra of the Virasoro algebra, and which may be identified with a Borel subgroup of the group of conformal transformations. By adjoint action on conformal fields, the flows of these lifted processes induce that of the original SLEκ evolutions.

The next key step consists in coupling the lifted SLEκ processes to boundary conformal field theories with specific Virasoro central charges cκ < 1 depending on κ. These CFTs are defined on random domains Ht which are the complement of the SLEκ hulls in the upper half plane, see below for details. Fluctuations and evolutions of the domains Ht, or of the SLEκ hulls, are then naturally encoded in boundary states, called the hull boundary states, in the CFT representation spaces. These states possess a natural geometrical interpretation in the lattice statistical models underlying SLEκ evolutions. We show that these hull boundary states are zero modes of the SLEκ evolutions, meaning that they are conserved in mean. In probabilistic terms, this also means that all components of these states – and there is an infinite number of them – are local martingales of the SLEκ evolutions.

This martingale property allows us to compute crossing probabilities in purely algebraic terms. We show that, thanks to this property, the generalized crossing probabilities are solutions of linear systems whose coefficients are multipoint conformal correlation functions. Stated differently, the martingale property of the hull boundary states provides a direct link between SLEκ stopping time problems and CFT conformal blocks.

Conformal field theories coupled to SLEκ hulls may be thought of as CFTs in the presence of 2D gravity since they are coupled to random geometry. By analysing the behavior of the conformal correlation functions close to the SLEκ hulls, this alternative viewpoint leads us to an interesting connection between operator product expansions in the presence of the SLEκ hulls and the KPZ formula for gravitationally dressed scaling dimensions. Relations between scaling properties of critical clusters and 2D gravity are crucial for the approach developed in ref. [10] and it would be interesting to connect both approaches.

It is clear that the approach we present in this paper does not reach the level of rigor of ref. [14]. The aim of this paper is mainly to show that a bridge linking the algebraic and the probabilistic formulations of CFT may be developed. As is well known, it is a delicate matter to specify the appropriate group of conformal transformations, and we do not give below a precise definition of the group we use. However, given the germs at infinity of the conformal transformations ft (z), Eq.
(2), which code for the SLEκ processes, it is possible to give a constructive definition of the group elements, which


we shall call Gt, Eq. (3), when acting on highest weight vector representations of the Virasoro algebra. We also do not specify precisely the set of functions F(Gt) for which the stochastic equation (4) may be applied, although matrix elements of Gt in highest weight vector representations of the Virasoro algebra clearly belong to this set. We would consider it an important research project for mathematical physicists to make the connection between SLEκ and CFT presented in this paper rigorous. For some work in this direction see [21].

The paper is organized as follows. In Sect. 2, we recall the definition of the stochastic Loewner evolutions and a few of their basic properties. We then present the group theoretical formulation of these processes. Section 3 is devoted to the algebraic construction of SLEκ martingales, i.e. zero modes of the SLEκ evolution operator, and to their link with null vectors in degenerate Virasoro modules. In Sect. 4, we give the geometrical interpretation of these martingales and explain how they may be identified with the hull boundary states. We also give there the connection between CFTs coupled to SLEκ hulls and 2D gravity. The algebraic derivation of generalized crossing probabilities is described in Sect. 5. Section 6 gathers information on the behavior of conformal correlators when the hull swallows a domain. A conjectural reconstruction scheme for conformal field theories based on the relation between generalized crossing probabilities and multipoint CFT correlation functions is proposed in the conclusions.

2. Basics of SLE

The aim of this section is to recall basic properties of stochastic Loewner evolutions (SLE) and their generalizations that we shall need in the following. Most results that we recall can be found in [17, 13, 14]. See [6] for a nice introduction to SLE for physicists.

2.1. SLEκ processes.
Stochastic Loewner evolutions SLEκ are growth processes defined via conformal maps which are solutions of Loewner’s equation:

$$\partial_t g_t(z) = \frac{2}{g_t(z)-\xi_t}, \qquad g_{t=0}(z) = z. \tag{1}$$

When $\xi_t$ is a smooth real-valued function, the map $g_t(z)$ is the uniformizing map for a simply connected domain $H_t$ embedded in the upper half plane $H$, $\mathrm{Im}\,z > 0$. At infinity $g_t(z) = z + 2t/z + \cdots$. For fixed $z$, $g_t(z)$ is well-defined up to the time $\tau_z$ for which $g_{\tau_z}(z) = \xi_{\tau_z}$. Following refs. [17, 13], define the sets $K_t = \{z\in H : \tau_z \le t\}$. They form an increasing sequence of sets, $K_{t'}\subset K_t$ for $t' < t$, and for a smooth enough driving source $\xi_t$, they are simple curves embedded in $H$. The domain $H_t$ is $H\setminus K_t$.

SLEκ processes are defined [18] by choosing a Brownian motion as a driving term in Loewner’s equation: $\xi_t = \sqrt{\kappa}\,B_t$ with $B_t$ a normalized Brownian motion and κ a real positive parameter so that $E[\xi_t\,\xi_s] = \kappa\,\min(t,s)$ ¹. The growth processes are then that of the sets $K_t$ which are called the hulls of the process. See refs. [17, 18, 14, 13] for more details concerning properties of the SLEκ evolutions. It will be convenient to introduce the function $f_t(z)\equiv g_t(z)-\xi_t$ whose Itô derivative is:

$$df_t(z) = \frac{2}{f_t(z)}\,dt - d\xi_t. \tag{2}$$

¹ Here and in the following, E[· · ·] denotes expectation (with respect to the SLEκ measure), and P[· · ·] refers to probability.


Fig. 1. The different possible configurations of the SLEκ traces depending on κ (three panels, labelled 0 < κ < 4, 4 < κ < 8 and κ > 8, showing the trace γ(t) and the hull Kt)

Let $g_t^{-1}(z)$ and $f_t^{-1}(z)$ be the inverses of $g_t(z)$ and $f_t(z)$ respectively. One has $g_t^{-1}(z) = f_t^{-1}(z-\xi_t)$. For $t > s$ both $(f_t\circ f_s^{-1})(z)$ and $f_{t-s}(z)$ have identical distributions. In a way similar to the two dimensional Brownian motion, $a^{-1}f_{a^2t}(az)$ and $f_t(z)$ also have identical distributions.

The trace $\gamma[0,t]$ of SLEκ is defined by $\gamma(t) = \lim_{\epsilon\to0^+} f_t^{-1}(i\epsilon)$. Its basic properties were deciphered in ref. [17]. It is known that it is almost surely a curve. It is almost surely a simple (non-intersecting) path for $0 < \kappa \le 4$ and it then coincides with the hull $K_t$; for $4 < \kappa < 8$, it almost surely possesses double points but never crosses itself, it goes back an infinite number of times to the real axis and $H_t$ is then the unbounded component of $H\setminus\gamma[0,t]$; the trace of SLEκ is space filling for $\kappa\ge 8$. See Fig. 1. The time $\tau_z$, which we shall refer to as the swallowing time of the point $z$, is such that $f_{\tau_z}(z) = 0$. For $0 < \kappa\le 4$, the time $\tau_z$ is almost surely infinite since the trace is a simple path, while for $\kappa > 4$ it is finite with probability one. For $\kappa\neq 8$, the trace goes almost surely to infinity: $\lim_{t\to\infty}|\gamma(t)| = \infty$. A duality conjecture [10, 17] states that the boundary $\partial K_t$ of the hull $K_t$ of SLEκ with $\kappa > 4$ is statistically equivalent to the trace of the SLE$_{\tilde\kappa}$ process with dual parameter $\tilde\kappa = 16/\kappa < 4$.

SLEκ evolutions may be defined on any simply connected domain $D\subset H$ by conformal transformations. Namely, let $a, b\in\partial D$ be two points of the boundary of $D$ and $\varphi$ be a conformal uniformizing transformation mapping $D$ onto $H$ such that $\varphi(a) = 0$ and $\varphi(b) = \infty$. Then the SLEκ growth from $a$ to $b$ in $D$ is defined as that of the hull of the random map $g_t\circ\varphi$ from $D$ to $H$. Two simple examples of deterministic Loewner maps are presented in Appendix A to help understand the underlying geometry.
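As an illustration of Loewner's equation (1) in the simplest deterministic situation, the following Python sketch (not taken from the paper) integrates the equation numerically for the driving function ξt ≡ 0 and compares the result with the closed form gt(z) = (z² + 4t)^{1/2}, whose hull is the vertical slit from 0 to 2i√t. The function names `integrate_loewner` and `exact_zero_drive`, the Runge–Kutta scheme and the step count are choices made here for illustration only.

```python
import cmath


def integrate_loewner(z, t_final, xi=lambda t: 0.0, n_steps=2000):
    """Integrate dg/dt = 2 / (g - xi(t)), g(0) = z, with a fixed-step RK4 scheme.

    Assumes the point z is not swallowed before t_final, so that
    g - xi(t) stays away from zero along the integration.
    """
    g = complex(z)
    dt = t_final / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1 = 2.0 / (g - xi(t))
        k2 = 2.0 / (g + 0.5 * dt * k1 - xi(t + 0.5 * dt))
        k3 = 2.0 / (g + 0.5 * dt * k2 - xi(t + 0.5 * dt))
        k4 = 2.0 / (g + dt * k3 - xi(t + dt))
        g += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += dt
    return g


def exact_zero_drive(z, t):
    """Closed form for xi = 0: g_t(z) = sqrt(z^2 + 4t), branch in the upper half plane."""
    root = cmath.sqrt(complex(z) ** 2 + 4.0 * t)
    return root if root.imag >= 0 else -root


if __name__ == "__main__":
    z, t = 1.0 + 1.0j, 0.5
    print(integrate_loewner(z, t), exact_zero_drive(z, t))
```

For |z| large the same routine reproduces the expansion gt(z) = z + 2t/z + · · · quoted above. A Brownian driving term would require a stochastic integration scheme, which this sketch does not attempt.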

2.2. Lifted SLEκ processes. We now formulate the group theoretical presentation of SLEκ processes introduced in ref. [1]. It consists in viewing these evolutions as Markov processes in the group of conformal transformations. Let $L_n$ be the generators of the Virasoro algebra vir – the central extension of the Lie algebra of conformal transformations – with commutation relations

$$[L_n, L_m] = (n-m)\,L_{n+m} + \frac{c}{12}\,n(n^2-1)\,\delta_{n+m,0},$$

with $c$, the central charge, in the center of vir. Let $\mathrm{Vir}_-$ be the formal group obtained by exponentiating the generators $L_n$, $n < 0$, of negative grade of the Virasoro algebra. A possibly more rigorous but less global definition of this group consists in identifying it with the group of germs of conformal transformations $z\to z+\sum_{n>0}a_n z^{1-n}$ at infinity, the group law being composition.
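The commutation relations above define a Lie algebra only if the Jacobi identity holds, which constrains the form of the central term. The short Python sketch below (an illustration added here, not part of the paper) checks the identity on a finite range of indices with exact rational arithmetic. The helper names `bracket` and `jacobi_defect` are hypothetical, and an element of the algebra is represented simply by the pair (coefficient of $L_{n+m}$, central part).

```python
from fractions import Fraction


def bracket(n, m, c):
    """[L_n, L_m] as (coefficient of L_{n+m}, central term)."""
    coeff = n - m
    central = Fraction(c) * n * (n * n - 1) / 12 if n + m == 0 else Fraction(0)
    return coeff, central


def jacobi_defect(a, b, d, c):
    """[L_a,[L_b,L_d]] + cyclic permutations, as (coefficient of L_{a+b+d}, central term).

    The central part of the inner bracket commutes with everything,
    so only the L_{y+z} component of the inner bracket contributes.
    """
    total_coeff = 0
    total_central = Fraction(0)
    for x, y, z in ((a, b, d), (b, d, a), (d, a, b)):
        inner_coeff, _ = bracket(y, z, c)
        outer_coeff, outer_central = bracket(x, y + z, c)
        total_coeff += inner_coeff * outer_coeff
        total_central += inner_coeff * outer_central
    return total_coeff, total_central


if __name__ == "__main__":
    c = Fraction(1, 2)
    indices = range(-4, 5)
    print(all(jacobi_defect(a, b, d, c) == (0, 0)
              for a in indices for b in indices for d in indices))
```

The check is only a finite sanity test; the general statement follows from the cubic form n(n² − 1) of the central term.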


We define a formal stochastic Markov process on $\mathrm{Vir}_-$ by the first order stochastic differential equation generalizing random walks on Lie groups:

$$G_t^{-1}\,dG_t = -2\,dt\,L_{-2} + d\xi_t\,L_{-1}, \qquad G_{t=0} = 1. \tag{3}$$

The elements $G_t$ belong to the formal group $\mathrm{Vir}_-$. We refer to the formal group $\mathrm{Vir}_-$ because we do not specify it precisely. We present in Sect. 3.3 simple computations illustrating the relation between Eqs. (3) and (1) which are based on identifying $\mathrm{Vir}_-$ with the group of germs of conformal transformations at infinity. See also ref. [21]. Remark that for $t > s$, both $G_s^{-1}G_t$ and $G_{t-s}$ have identical distributions.

This definition is motivated by the fact that Eq. (2) may be written as $\dot f_t(z) = 2/f_t(z) - \dot\xi_t$ with $d\xi_t = \dot\xi_t\,dt$. In this form Eq. (2) is slightly ill-defined, since $\dot\xi_t$ is not regular enough, but it nevertheless indicates that SLEκ describe flows for the vector field $\frac{2}{w}\partial_w - \dot\xi_t\,\partial_w$. Equation (3) is written using the Stratonovich convention for stochastic differential calculus. This can be done using Itô integrals as well.

Observables of the random process $G_t$ may be thought of as functions $F(G_t)$ on $\mathrm{Vir}_-$. Using standard rules of stochastic calculus ², one finds their Itô differentials:

$$dF(G_t) = A\cdot F(G_t)\,dt + d\xi_t\,\nabla_{-1}F(G_t) \tag{4}$$

with $A$ the quadratic differential operator

$$A \equiv -2\nabla_{-2} + \frac{\kappa}{2}\,\nabla_{-1}^2, \tag{5}$$

where $\nabla_n$ are the left invariant vector fields associated to the elements $L_n$ in vir defined by $(\nabla_nF)(G) = \frac{d}{du}F(G\,e^{uL_n})\big|_{u=0}$ for any appropriate function $F$ on $\mathrm{Vir}_-$. Since left invariant Lie derivatives form a representation of vir, the evolution operator $A$ may also be written as:

$$A = -2L_{-2} + \frac{\kappa}{2}\,L_{-1}^2 \tag{6}$$

with the Virasoro generators $L_n$ acting on the appropriate representation space. In particular, the time evolution of expectation values of observables $F(G_t)$ reads:

$$\partial_t\,E[F(G_t)] = E[A\cdot F(G_t)]. \tag{7}$$

Notice that, in agreement with the scaling property of SLEκ, $A$ is homogeneous of degree two if we assign degree $n$ to $\nabla_{-n}$.

² We use formal rules extending those valid in finite dimensional Lie groups, and we do not try to specify the most general class of functions on which we may apply these rules.

To test properties of the process $G_t$, we shall couple it to a boundary conformal field theory (CFT) defined over $H_t$. We refer to [3] for an introduction to boundary CFTs that we shall mainly deal with in the operator formalism – see also Sect. 4. Its stress tensor, $T(z) = \sum_n L_n\,z^{-n-2}$, is the generator of conformal transformations and conformal (primary) operators are Virasoro intertwiners. We shall consider two kinds of conformal fields depending on whether they are localized on the boundary or in the bulk of the domain. More precisely,


let $\Psi_\delta(x)$ be a boundary intertwiner with dimension $\delta$. In our case the boundary of $H_t$, away from the hull, is the real axis so that $\Psi_\delta(x)$ depends on the real variable $x$. By definition, it is an operator mapping a Virasoro module $V_1$ into another, possibly different, Virasoro module $V_2$ and satisfying the intertwining relations:

$$[L_n, \Psi_\delta(x)] = \ell_n^\delta(x)\cdot\Psi_\delta(x), \qquad \ell_n^\delta(x) \equiv x^{n+1}\partial_x + \delta(n+1)x^n. \tag{8}$$

Bulk intertwiners $\Phi_{h,\bar h}(z,\bar z)$ with conformal dimensions $h$ and $\bar h$ are localized in the bulk of $H_t$ and depend on the complex coordinates $z$ and $\bar z$. They act from a Virasoro module to another one and they satisfy the intertwining relations:

$$[L_n, \Phi_{h,\bar h}(z,\bar z)] = \big(\ell_n^h(z) + \bar\ell_n^{\bar h}(\bar z)\big)\cdot\Phi_{h,\bar h}(z,\bar z), \qquad \ell_n^h(z) \equiv z^{n+1}\partial_z + h(n+1)z^n. \tag{9}$$

In the following, we shall only consider highest weight Virasoro modules. More details on properties of intertwiners, including in particular the fusion rules they satisfy and the differential equations their correlators obey, are briefly summarized in Appendix B. We shall need these properties in the following sections.

The above relations mean that $\Psi_\delta(x)(dx)^\delta$ or $\Phi_{h,\bar h}(z,\bar z)(dz)^h(d\bar z)^{\bar h}$ transform covariantly under conformal transformations. As a consequence, the group Vir acts on conformal primary fields. In particular for the flows (3), one has:

$$G_t^{-1}\,\Psi_\delta(x)\,G_t = [f_t'(x)]^\delta\;\Psi_\delta(f_t(x)),$$
$$G_t^{-1}\,\Phi_{h,\bar h}(z,\bar z)\,G_t = [f_t'(z)]^h\,[\bar f_t'(\bar z)]^{\bar h}\;\Phi_{h,\bar h}(f_t(z),\bar f_t(\bar z)), \tag{10}$$

with $f_t(z)$ the solution of Loewner’s equation (2) and $f_t'(z)\equiv\partial_z f_t(z)$. In words, by adjoint action the flows of the lifted SLEκ implement the conformal transformations of the SLEκ evolutions on the primary fields. Equations (10) follow by construction but they may also be checked by taking their derivatives. Consider for instance a boundary primary field $\Psi_\delta(x)$. By Eq. (4), the Itô derivative of the left-hand side of Eq. (10) reads:

$$d(\text{l.h.s.}) = [2L_{-2}\,dt - d\xi_t\,L_{-1},\; G_t^{-1}\Psi_\delta(x)G_t] + \frac{\kappa}{2}\,dt\,[L_{-1},[L_{-1},\,G_t^{-1}\Psi_\delta(x)G_t]].$$

Similarly, the chain rule together with Eq. (2) and the intertwining relations (8) gives for the Itô derivative of the right-hand side of (10):

$$d(\text{r.h.s.}) = [f_t'(x)]^\delta\,\Big\{[2L_{-2}\,dt - d\xi_t\,L_{-1},\;\Psi_\delta(f_t(x))] + \frac{\kappa}{2}\,dt\,[L_{-1},[L_{-1},\,\Psi_\delta(f_t(x))]]\Big\}.$$

Making Eq. (10) precise requires stating precisely what the domain of $G_t$ is, but Eq. (10) clearly makes sense as long as the conformal fields are located in the definition domain of the conformal map $f_t(z)$. The occurrence of the Virasoro algebra in SLEκ evolutions may also be seen by implementing perturbative computations valid close to $z = \infty$, as explained below in Sect. 3.3.

3. Martingales and Null Vectors

The aim of this section is to show that martingales for SLEκ processes, which may be thought of as SLEκ observables which are conserved in mean, are closely related to null vectors of appropriate Verma modules of the Virasoro algebra.


3.1. Modules and null vectors. Let us first recall a few basic facts concerning highest weight modules of the Virasoro algebra [2]. Such a module is generated by the iterated action of negative grade Virasoro generators $L_n$, $n\le -1$, on a reference state, say $|h\rangle$, which is annihilated by the positively graded Virasoro generators, $L_n|h\rangle = 0$, $n > 0$, and has a given conformal weight $h$, $L_0|h\rangle = h|h\rangle$. These modules are quotients of Verma modules. By definition, a Virasoro Verma module, denoted $V_h$, is a highest weight module which is the free linear span of vectors $\prod_p L_{-n_p}|h\rangle$ obtained by acting with the $L_n$’s, $n < 0$, on $|h\rangle$. Thus $V_h$ may be identified with $(\mathrm{Vir}_-)|h\rangle$. Generically Verma modules are irreducible.

The Virasoro algebra acts on the space $V_h^*$ dual to the Verma module $V_h$ by $\langle L_n v^*| = \langle v^*|L_{-n}$ for any $v^*\in V_h^*$. The state $\langle h|\in V_h^*$, dual of the highest weight vector $|h\rangle$, satisfies $\langle h|L_n = 0$ for any $n < 0$ and $\langle h|L_0 = h\langle h|$. The vacuum state $|0\rangle$ is a highest weight vector with zero conformal weight so that $L_n|0\rangle = 0$ for all $n\ge -1$. Its dual $\langle 0|$ satisfies $\langle 0|L_n = 0$ for $n\le 1$.

With SLEκ evolutions in view, we shall consider modules with central charges

$$c_\kappa = \frac{(3\kappa-8)(6-\kappa)}{2\kappa} = 1 - \frac{3(4-\kappa)^2}{2\kappa}. \tag{11}$$

It is then customary to parametrize the conformal weights as

$$h^\kappa_{r,s} = \frac{(r\kappa-4s)^2 - (\kappa-4)^2}{16\kappa}. \tag{12}$$

Notice that $c_\kappa = 16\,h^\kappa_{1,2}\,h^\kappa_{2,1} < 1$ and that $h^\kappa_{r,s} = h^{\tilde\kappa}_{s,r}$ with $\tilde\kappa = 16/\kappa$ the dual parameter. For $r$ and $s$ positive integers, the corresponding Virasoro Verma module $V_{h^\kappa_{r,s}}$ possesses a highest weight vector submodule and is reducible. This means that there exists a state $|n_{r,s}\rangle\in V_{h^\kappa_{r,s}}$, usually called a null vector, which is annihilated by the $L_n$, $n > 0$. If we assign degree $n$ to the $L_{-n}$, this state is at degree $rs$. For a generic value of κ, the quotient $V_{h_{r,s}}/(\mathrm{Vir}_-)|n_{r,s}\rangle$ is then an irreducible module, and we shall denote it by $H_{r,s}$.

3.2. Zero modes and martingales. By definition, zero modes are observables which are eigenvectors of the evolution operator $A$ with zero eigenvalue so that their expectation is conserved in mean. We shall need details on the module $V_{h^\kappa_{1,2}}$ with weight

$$h^\kappa_{1,2} = \frac{6-\kappa}{2\kappa}.$$

We label its highest weight vector as $|\omega\rangle$ so that $L_n|\omega\rangle = 0$ for $n > 0$ and $L_0|\omega\rangle = h^\kappa_{1,2}|\omega\rangle$. The null vector is at degree two and it is equal to $|n_{1,2}\rangle = (-2L_{-2} + \frac{\kappa}{2}L_{-1}^2)|\omega\rangle$. Indeed, let $A = -2L_{-2} + \frac{\kappa}{2}L_{-1}^2$ as in Eq. (6), then $|n_{1,2}\rangle = A|\omega\rangle$ and

$$[L_n, A] = \Big(-2(n+2) + \frac{\kappa}{2}\,n(n+1)\Big)L_{n-2} + \kappa(n+1)\,L_{-1}L_{n-1} - c\,\delta_{n,2}.$$

Hence $L_n|n_{1,2}\rangle = [L_n, A]|\omega\rangle = 0$ for all $n > 0$ since $2\kappa h^\kappa_{1,2} = 6-\kappa$ and $c_\kappa = h^\kappa_{1,2}(3\kappa-8)$. The quotient $H_{1,2}\equiv V_{h_{1,2}}/(\mathrm{Vir}_-)|n_{1,2}\rangle$ is generically irreducible, and $A|\omega\rangle = 0$ as a vector in $H_{1,2}$.
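The algebraic identities used above are elementary and can be checked mechanically. The following Python sketch (added for illustration, not taken from the paper) verifies, with exact rational arithmetic, the two forms of the central charge (11), the value of $h^\kappa_{1,2}$ obtained from (12), the duality under κ → 16/κ, and the two scalar conditions $-6+\kappa+2\kappa h = 0$ and $(3\kappa-8)h - c = 0$ that come out of acting with $L_1$ and $L_2$ on $A|\omega\rangle$ using the commutator formula. The function names are hypothetical.

```python
from fractions import Fraction


def central_charge(kappa):
    """c_kappa = (3 kappa - 8)(6 - kappa) / (2 kappa), Eq. (11)."""
    k = Fraction(kappa)
    return (3 * k - 8) * (6 - k) / (2 * k)


def weight(r, s, kappa):
    """h_{r,s} = ((r kappa - 4 s)^2 - (kappa - 4)^2) / (16 kappa), Eq. (12)."""
    k = Fraction(kappa)
    return ((r * k - 4 * s) ** 2 - (k - 4) ** 2) / (16 * k)


def null_vector_residues(kappa):
    """Scalars multiplying L_{-1}|omega> and |omega> in L_1 A|omega> and L_2 A|omega>."""
    k = Fraction(kappa)
    h = weight(1, 2, k)
    c = central_charge(k)
    res_l1 = -6 + k + 2 * k * h   # from [L_1, A] acting on |omega>
    res_l2 = (3 * k - 8) * h - c  # from [L_2, A] acting on |omega>
    return res_l1, res_l2


if __name__ == "__main__":
    for kappa in (2, Fraction(8, 3), 3, 4, 6, 8):
        print(kappa, central_charge(kappa), weight(1, 2, kappa),
              null_vector_residues(kappa))
```

Both residues vanish identically in κ, which is the statement that $A|\omega\rangle$ is a highest weight vector at degree two.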


Let us now define the observable $F_\omega(G_t)\equiv G_t|\omega\rangle$, where the group element $G_t$ is viewed as acting on the irreducible module $H_{1,2}$ – not on the reducible Verma module $V_{h_{1,2}}$. In particular, $F_\omega(G_t)$ is a vector in the infinite dimensional space $H_{1,2}$. By construction, $F_\omega(G_t)$ is a zero mode. Indeed $(A\cdot F_\omega)(G_t) = G_t\,(-2L_{-2} + \frac{\kappa}{2}L_{-1}^2)|\omega\rangle = G_t|n_{1,2}\rangle$ vanishes as a vector in $H_{1,2}$, since $|n_{1,2}\rangle = A|\omega\rangle$ does in $H_{1,2}$. By Eq. (7), the expectation $E[F_\omega(G_t)]$ is thus conserved:

$$\partial_t\,E[\,G_t|\omega\rangle\,] = 0. \tag{13}$$

The stationary property of the expectation of $F_\omega(G_t)$ is a direct consequence of the existence of a null vector in the corresponding Virasoro Verma module. Summarizing leads to the following ³

Proposition. Let $|\omega\rangle$ be the highest weight vector of the irreducible Virasoro module with central charge $c_\kappa = (3\kappa-8)(6-\kappa)/2\kappa$ and conformal weight $h^\kappa_{1,2} = (6-\kappa)/2\kappa$. Then $\{F_\omega(G_t)\equiv G_t|\omega\rangle\}_{t\ge0}$ is a martingale of the SLEκ evolution, meaning that for $t > s$:

$$E[\,G_t|\omega\rangle\;|\;\{G_u\}_{u\le s}\,] = G_s|\omega\rangle. \tag{14}$$

This is a direct consequence of the conservation law (13) and of the fact that $G_s^{-1}G_t$ and $G_{t-s}$ are identically distributed. Using Dynkin’s formula, see ref. [16] p. 118, the above proposition has the following

Corollary. Let τ be a stopping time such that $E[\tau] < \infty$, then:

$$E[\,G_\tau|\omega\rangle\,] = |\omega\rangle. \tag{15}$$

All components of the vector $E[\,G_t|\omega\rangle\,]$ are conserved and we may choose the vector on which we project it at will depending on the problem. The most convenient choices of vectors will be those generated by products of conformal operators. Namely,

$$\langle\chi|\,\prod_j\Phi_{h_j,\bar h_j}(z_j,\bar z_j)\cdot\prod_p\Psi_{\delta_p}(x_p)$$

with $\langle\chi|$ the dual of a highest weight vector with weight $h_\chi$. Since $\langle\chi|L_n = 0$ for any $n < 0$, we have $\langle\chi|G_t^{-1} = \langle\chi|$ because $G_t\in\mathrm{Vir}_-$. As a consequence, the conservation law (13) projected on these vectors reads:

$$\partial_t\,E\Big[\langle\chi|\,\prod_j[f_t'(z_j)]^{h_j}[\bar f_t'(\bar z_j)]^{\bar h_j}\,\Phi_{h_j,\bar h_j}(f_t(z_j),\bar f_t(\bar z_j))\cdot\prod_p[f_t'(x_p)]^{\delta_p}\,\Psi_{\delta_p}(f_t(x_p))\,|\omega\rangle\Big] = 0, \tag{16}$$

where we moved $G_t$ to the left using the intertwining relations (10).

³ For simplicity, we shall write “martingales” instead of the more appropriate denomination “local martingales”.


Example. The simplest example is provided by considering correlations of the stress tensor $T(z)$. For instance, from Eq. (13) we have: $E[\,\langle\omega|T(z)G_t|\omega\rangle\,] = \langle\omega|T(z)|\omega\rangle$. For non-vanishing central charge, $T(z)$ transforms anomalously under conformal transformations [2] so that:

$$G_t^{-1}\,T(z)\,G_t = [f_t'(z)]^2\,T(f_t(z)) + \frac{c_\kappa}{12}\,\{f_t(z), z\}$$

with $\{f_t(z), z\}$ the Schwarzian derivative of $f_t(z)$. Since $\langle\omega|T(z)|\omega\rangle = z^{-2}h^\kappa_{1,2}$ and $\langle\omega|G_t^{-1}T(z)|\omega\rangle = \langle\omega|T(z)|\omega\rangle$, we get:

$$E\Big[\,h^\kappa_{1,2}\Big(\frac{f_t'(z)}{f_t(z)}\Big)^2 + \frac{c_\kappa}{12}\,\{f_t(z), z\}\Big] = \frac{h^\kappa_{1,2}}{z^2}.$$

This may be checked by a perturbative expansion in $1/z$ around $z = \infty$. The extension to an arbitrary number of stress tensor insertions is straightforward.

3.3. Perturbative computations in SLE. The group Vir acts on germs of meromorphic functions with a pole at infinity $z(1 + a_1/z + a_2/z^2 + \cdots)$ with $a_k$ as coordinates, similar to Fock space coordinates. The Virasoro generators are then differential operators in the $a_k$’s. See ref. [21].

Concretely, the SLEκ equation $\dot g_t(z) = \frac{2}{g_t(z)-\xi_t}$ turns into a hierarchy of ordinary differential equations for the coefficients of the expansion of $g_t(z)$ at infinity. Writing $g_t(z)\equiv z\big(1+\sum_{i\ge2}a_i z^{-i}\big)$ and $a_1\equiv-\xi_t$ leads to

$$\dot a_2 = 2, \qquad \dot a_j = -\sum_{i=1}^{j-2} a_i\,\dot a_{j-i}, \quad j\ge3.$$

Define polynomials $p_j$ in the variables $a_i$ by

$$p_1 = 0, \qquad p_2 = 1, \qquad p_j = -\sum_{i=1}^{j-2} a_i\,p_{j-i}, \quad j\ge3.$$

Then by construction

$$\dot a_j = 2\,p_j(a_1,\cdots), \quad j\ge2.$$

If one assigns degree $i$ to $a_i$, $p_j$ is homogeneous of degree $j-2$. Using the fact that $a_1(t)$ is a Brownian motion and Itô’s formula, one derives a general formula of Fokker-Planck type to compute the expectation of any (polynomial, say) function $q(a_1(t), a_2(t),\cdots)$:

$$E[\,q(a_1(t), a_2(t),\cdots)\,] = \Big(e^{t\mathcal{A}}\,q(a_1, a_2,\cdots)\Big)\Big|_{0=a_1=a_2=\cdots},$$

where $\mathcal{A}$ is the differential operator

$$\mathcal{A} = \frac{\kappa}{2}\,\frac{\partial^2}{\partial a_1^2} + 2\sum_{j\ge2}p_j\,\frac{\partial}{\partial a_j}. \tag{17}$$
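The recursion defining the polynomials $p_j$ that enter the operator (17) is easy to unroll by machine. The Python sketch below (an illustration added here, not from the paper) builds $p_2,\dots,p_{j_{\max}}$ symbolically, storing a polynomial as a dictionary that maps a monomial, written as a sorted tuple of indices so that `(1, 1, 2)` stands for $a_1^2a_2$, to its integer coefficient. The names `times_variable` and `loewner_polynomials` and this representation are choices made for the example.

```python
from collections import defaultdict


def times_variable(poly, i):
    """Multiply a polynomial by the variable a_i."""
    out = defaultdict(int)
    for mono, coeff in poly.items():
        out[tuple(sorted(mono + (i,)))] += coeff
    return dict(out)


def loewner_polynomials(j_max):
    """Return {j: p_j} for 2 <= j <= j_max, with p_2 = 1 and
    p_j = -sum_{i=1}^{j-2} a_i p_{j-i} for j >= 3."""
    p = {2: {(): 1}}
    for j in range(3, j_max + 1):
        acc = defaultdict(int)
        for i in range(1, j - 1):  # i = 1, ..., j-2
            for mono, coeff in times_variable(p[j - i], i).items():
                acc[mono] -= coeff
        p[j] = {m: c for m, c in acc.items() if c != 0}
    return p


if __name__ == "__main__":
    for j, poly in loewner_polynomials(6).items():
        print(j, poly)
```

For instance the recursion gives $p_3 = -a_1$, $p_4 = a_1^2 - a_2$ and $p_5 = -a_1^3 + 2a_1a_2 - a_3$, each homogeneous of degree $j-2$ as stated. Setting $a_1 = 0$ (no driving term) gives $\dot a_4 = 2p_4 = -2a_2 = -4t$, hence $a_4 = -2t^2$, in agreement with the expansion of $(z^2+4t)^{1/2}$ at infinity.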


In fact, the differential operator $\mathcal{A}$ of Eq. (17) is yet another avatar of the evolution operator $A = -2L_{-2} + \frac{\kappa}{2}L_{-1}^2$ from Eq. (6), this time acting on polynomials of the $a_i$’s. To check this, remember that $w = f_t(z)$ describes the integral curve starting from $z$ at $t = 0$ for the time dependent meromorphic vector field $\frac{2}{w}\partial_w - \dot\xi_t\,\partial_w$. Then by definition $a_i = \oint w(z)\,z^{i-2}\,dz$ and the vector field $-w^{n+1}\partial_w$ acts as

$$\delta_n a_i = \oint\delta_n w(z)\,z^{i-2}\,dz = -\oint w(z)^{n+1}\,z^{i-2}\,dz.$$

For $n\le-1$ the right-hand side vanishes for $i\le0$ which is a consistency condition. One checks that $\delta_{-1}a_i = -\delta_{i,1}$ and $\delta_{-2}a_i = -p_i$, $i\ge2$, so on polynomials in the variables $a_1, a_2,\cdots$, $L_{-1}$ acts as $-\frac{\partial}{\partial a_1}$ and $L_{-2}$ as $-\sum_{j\ge2}p_j\frac{\partial}{\partial a_j}$.

Using the representation (17) for the stochastic evolution operator $-2L_{-2} + \frac{\kappa}{2}L_{-1}^2$, one may show that polynomial martingales – which are polynomials in the $a_i$ in the kernel of $\mathcal{A}$ – are in one-to-one correspondence with states in $H_{1,2}$. Namely, let us assign degree $j$ to $a_j$ and set $\dim_q\mathrm{Ker}\,\mathcal{A} = \sum_{n\ge0}q^n D(n)$ with $D(n)$ the number of independent homogeneous polynomials of degree $n$ in $\mathrm{Ker}\,\mathcal{A}$. Then:

$$\dim_q\mathrm{Ker}\,\mathcal{A} = \frac{1-q^2}{\prod_{n\ge1}(1-q^n)}.$$

This coincides with the graded character of $H_{1,2}$. Furthermore, polynomials in $a_1, a_2,\cdots$ do not vanish when the variables $a_i$ are replaced by their explicit expressions in terms of the Brownian motion as obtained by solving Loewner’s equation. This indicates that $G_t|\omega\rangle$ is a universal martingale – meaning that it contains all martingales. See ref. [21] for more details.

4. Geometrical Interpretation and 2D Gravity

The aim of this section is to explain that the state $G_t|\omega\rangle$ involved in the definition of the above martingales possesses a simple geometrical interpretation. For each realization, the domain $H_t = H\setminus K_t$ is conformally equivalent to the upper half plane. A boundary CFT defined over it is coupled to the random SLEκ hulls. Correlation functions of boundary operators $\Psi_{\delta_p}(x_p)$ and of bulk operators $\Phi_{h_j}(z_j,\bar z_j)$ (with identical left and right conformal dimensions $h_j = \bar h_j$ to keep formulæ readable), are naturally defined by:

$$\Big\langle\prod_p\Psi_{\delta_p}(x_p)\cdot\prod_j\Phi_{h_j}(z_j,\bar z_j)\Big\rangle_{\omega;H_t} \equiv \langle0|\,\prod_p\Psi_{\delta_p}(x_p)\cdot\prod_j\Phi_{h_j}(z_j,\bar z_j)\cdot G_t|\omega\rangle, \tag{18}$$

where the subscript $\omega; H_t$ refers to the domain of definition and to the boundary conditions imposed on the conformal field theory, as we shall discuss. The r.h.s. are products of intertwiner operators and $\langle0|$ is the conformally invariant dual vacuum with $\langle0|L_n = 0$ for $n\le1$. Using Eq. (10) to implement the conformal transformations represented by $G_t$, these correlators may also be presented as

$$\langle0|\,\prod_p[f_t'(x_p)]^{\delta_p}\,\Psi_{\delta_p}(f_t(x_p))\cdot\prod_j|f_t'(z_j)|^{2h_j}\,\Phi_{h_j}(f_t(z_j),\bar f_t(\bar z_j))\,|\omega\rangle. \tag{19}$$


Although natural as a definition for a CFT over $\mathbb{H}_t$, the peculiarity of Eq. (19) is that the conformal transformations $G_t$ do not act on the state $|\omega\rangle$ $^{4}$.

4.1. The hull boundary state. We first present an interpretation in terms of statistical models of the occurrence of the state $|\omega\rangle$ with conformal weight $h^\kappa_{1,2}$. We then give a meaning to the state $G_t|\omega\rangle$ in terms of boundary CFT.

Let us recall a few basic elements concerning the microscopic definition of the lattice statistical models that are believed to underlie SLE$_\kappa$ processes. Consider for instance the $Q$-state Potts model, defined over some lattice, whose partition function is:
$$Z = \sum_{\{s(r)\}} \exp\Big[ J \sum_{\langle r r'\rangle} \delta_{s(r),s(r')} \Big],$$
where the sum is over all spin configurations and the symbol $\langle r r'\rangle$ refers to neighbor sites $r$ and $r'$ on the lattice. The spin $s(r)$ at site $r$ takes $Q$ possible values from 1 to $Q$. By expanding the exponential factor in the above expression using $\exp[J(\delta_{s(r),s(r')} - 1)] = (1-p) + p\,\delta_{s(r),s(r')}$ with $p = 1 - e^{-J}$, the partition function may be rewritten following Fortuin-Kasteleyn [11] as a sum over cluster configurations
$$Z = e^{JL} \sum_{C} p^{\|C\|} (1-p)^{L-\|C\|}\, Q^{N_C},$$
where $L$ is the number of links of the lattice, $N_C$ the number of clusters in the configuration $C$ and $\|C\|$ the number of links inside the $N_C$ clusters. In each of these so-called FK-clusters all spins take arbitrary but identical values.

Imagine now considering the $Q$-state Potts model on a lattice covering the upper half plane with boundary conditions on the real line such that all spins to the left of the origin are frozen to the same value while spins to the right of the origin are free, with non-assigned values. In each FK-cluster configuration there exists a cluster growing from the negative half real axis into the upper half plane whose boundary starts at the origin. In the continuum limit, this boundary curve is conjectured to be statistically equivalent to an SLE$_\kappa$ trace [18, 17]. See Fig. 2 $^{5}$. Being the boundary of an FK-cluster, the spin boundary conditions on the two sides of the SLE$_\kappa$ trace are not identical. Indeed, from the above microscopic description we infer that on one side the spins are fixed whereas on the other side they are free. This change in boundary conditions corresponds to the insertion of a boundary changing operator [3], which is naturally identified with the operator $\Psi_\omega$ creating the state $|\omega\rangle$:

$$\Psi_\omega(0)|0\rangle = |\omega\rangle$$
with $|0\rangle$ the vacuum of the CFT defined over $\mathbb{H}$. Note that creating a state at the origin in $\mathbb{H}$ corresponds indeed to creating a state at the tip of the SLE$_\kappa$ trace since $\gamma(t) = f_t^{-1}(0)$. This explains the occurrence of the state $|\omega\rangle$ in Eqs. (18) and (19). The relation between $Q$ and $\kappa$ can be found by matching the known value of the dimension of the boundary changing operator for the $Q$-state Potts model with $h^\kappa_{1,2}$. One gets:
$$Q = 4\cos^2\Big(\frac{4\pi}{\kappa}\Big), \qquad \kappa \ge 4,$$
so that $Q = 1$ for $\kappa = 6$, as it should be for percolation, and $Q = 2$ with $\kappa = 16/3$ for the Ising model.

4 This is actually necessary as the transformations $f_t(z)$ are singular at the origin.
5 We thank A. Kupiainen for his explanations concerning this point.

Fig. 2. An FK-cluster configuration in the Potts models. The SLE$_\kappa$ trace $\gamma(t)$ is the boundary of the FK-cluster connected to the negative real axis

To formulate the boundary CFT coupled to the SLE$_\kappa$ hulls, imagine mapping back the domain $\mathbb{H}_t$ away from the hulls to a strip of width $\pi$ such that half circles are mapped onto straight lines. The map is $z = e^{\sigma + i\theta} \to w = \log z$ with $\theta \in [0, \pi]$. There is a natural operator formalism for the CFT defined on the strip, which is usually referred to as the open string formalism, see e.g. ref. [8], chapters 6 and 11. Boundary conditions on the two sides $\theta = 0$ and $\theta = \pi$ of the strip are those inherited from the boundary conditions on the two sides of the SLE$_\kappa$ traces that we just described. Constant "time" slices, over which conformal Hilbert spaces are defined, are $\mathrm{Re}\, w \equiv \sigma = \mathrm{const.}$, with the associated Hamiltonian operators generating evolutions in the $\sigma$ direction. Under this map, the SLE$_\kappa$ hulls appear as disturbances localized in the far past region, with $\sigma$ small enough, thus generating states belonging to the conformal Hilbert spaces. See Fig. 3.

Similarly, the domain $\mathbb{H}_t$ is mapped onto the upper half plane using the by now familiar transformation $z \to f_t(z)$, sending the tip of the SLE$_\kappa$ traces to the origin. Under composition of these two maps, the images of the $\sigma = \mathrm{const.}$ slices are curves topologically equivalent to half circles around the origin. The "time" quantization valid in the strip is replaced by the "radial" quantization with conformal Hilbert spaces defined over these curves or over the constant radius half circles. The SLE$_\kappa$ hulls, which are then local perturbations localized around the origin, generate states, which we denote $|K_t\rangle$, in the radial quantization Hilbert spaces. From Eq. (10), we learn that $G_t$ intertwines the conformal field theories defined over $\mathbb{H}_t$ and $\mathbb{H}$, so that $G_t^{-1}|K_t\rangle$ is constant under SLE$_\kappa$ evolution. This leads to the identification
$$|K_t\rangle = G_t|\omega\rangle, \tag{20}$$
where we introduce the state $|\omega\rangle$ to keep track of the boundary conditions we described above. Of course, up to the particular role played by the state $|\omega\rangle$, this identification only reflects the fact that conformal correlation functions on $\mathbb{H}_t$ are defined via the conformal map $f_t(z)$ represented by $G_t$. This explains our identification (18).
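The Fortuin-Kasteleyn rewriting of the Potts partition function recalled in this subsection can be checked by brute force on a very small graph. The following is a minimal sketch, assuming a hypothetical four-site square as the "lattice"; the helper names (`potts_Z`, `fk_Z`, `count_clusters`, `potts_Q`) are illustrative and not taken from the paper. It also evaluates the relation $Q = 4\cos^2(4\pi/\kappa)$ at the two values of $\kappa$ quoted in the text.

```python
import itertools
import math


def potts_Z(n_sites, edges, Q, J):
    """Spin representation: sum over all Q^n spin configurations."""
    total = 0.0
    for s in itertools.product(range(Q), repeat=n_sites):
        total += math.exp(J * sum(s[a] == s[b] for a, b in edges))
    return total


def count_clusters(n_sites, open_edges):
    """Number of connected components (isolated sites included), via union-find."""
    parent = list(range(n_sites))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in open_edges:
        parent[find(a)] = find(b)
    return len({find(i) for i in range(n_sites)})


def fk_Z(n_sites, edges, Q, J):
    """Cluster representation: e^{JL} sum_C p^{|C|} (1-p)^{L-|C|} Q^{N_C}."""
    L = len(edges)
    p = 1.0 - math.exp(-J)
    total = 0.0
    for mask in itertools.product((0, 1), repeat=L):
        open_edges = [e for e, m in zip(edges, mask) if m]
        k = len(open_edges)
        total += p**k * (1.0 - p) ** (L - k) * Q ** count_clusters(n_sites, open_edges)
    return math.exp(J * L) * total


def potts_Q(kappa):
    """Q = 4 cos^2(4 pi / kappa), as quoted in the text for kappa >= 4."""
    return 4.0 * math.cos(4.0 * math.pi / kappa) ** 2


if __name__ == "__main__":
    square = [(0, 1), (1, 2), (2, 3), (3, 0)]  # hypothetical 4-site lattice
    print(potts_Z(4, square, 3, 0.7), fk_Z(4, square, 3, 0.7))
    print(potts_Q(6.0), potts_Q(16.0 / 3.0))
```

The two partition functions agree because each link factor $e^{J\delta}$ equals $e^{J}[(1-p) + p\,\delta]$; expanding the product over links and summing over spins leaves a factor $Q$ per cluster.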


Fig. 3. A representation of the boundary hull state and of the maps between different formulations of CFTs. The shaded domains represent the SLEκ hulls and the dashed lines the “time” slices used to define CFT Hilbert spaces

4.2. SLE and 2D gravity. We may interpret conformal field theories coupled to SLE$_\kappa$ hulls as CFTs in random environments, either by viewing them as CFTs on random domains $\mathbb{H}_t$ but with a trivial metric, or by viewing them in the fixed domain $\mathbb{H}$, the upper half plane, but with random metrics $|f_t'(z)|^2\, dz\, d\bar z$ at points $f_t(z)$. These points of view are similar to those adopted in studying 2D gravity. The conservation law (13) then translates into a statement concerning the averaged conformal correlators in these random environments. Namely:
$$\mathbf{E}\Big[ \Big\langle \prod_p \Psi_{\delta_p}(x_p) \cdot \prod_j \Phi_{h_j}(z_j, \bar z_j) \Big\rangle_{\omega;\mathbb{H}_t} \Big] = \langle 0| \prod_p \Psi_{\delta_p}(x_p) \cdot \prod_j \Phi_{h_j}(z_j, \bar z_j)\,|\omega\rangle.$$
There is an intriguing link between the influence of the SLE$_\kappa$ hulls on conformal scaling properties and the KPZ formula for gravitationally dressed dimensions [12, 7]. As we are going to argue below, on average, scaling properties of a boundary field $\Psi_\delta(x)$ of dimension $\delta$ close to the hulls are dressed such that:
$$\mathbf{E}\big[ \langle \cdots \Psi_\delta(x) \rangle_{\omega;\mathbb{H}_t} \big] \simeq_{x\to 0} \mathrm{const.}\; x^{\Delta}, \tag{21}$$
where $\cdots$ refers to the insertion of any operators away from the hulls. The small $x$ limit mimics the approach of the field $\Psi_\delta(x)$ close to the hull since $\gamma(t) = f_t^{-1}(0)$. The dressed dimension $\Delta$ is determined by fusion rules and is a solution of the quadratic equation
$$\Delta\Big(\Delta - \frac{\kappa - 4}{\kappa}\Big) = \frac{4}{\kappa}\,\delta \tag{22}$$
with solutions
$$\Delta_\pm(\delta) = \frac{1}{2\kappa}\Big[ \kappa - 4 \pm \sqrt{(\kappa - 4)^2 + 16\kappa\delta}\, \Big]. \tag{23}$$
As a consequence, Eq. (23) also determines the conformal weights of the intermediate states propagating in the correlation functions we shall consider in the following sections.


Equation (21) may be derived as follows. From Eq. (13) we have on average: $\mathbf{E}[ \langle \cdots \Psi_\delta(x) \rangle_{\omega;\mathbb{H}_t} ] = \langle 0| \cdots \Psi_\delta(x)|\omega\rangle$. For $x \to 0$, the leading contribution to the r.h.s. comes from the highest weight vector, say $|\alpha\rangle$, created by $\Psi_\delta(x)$ acting on $|\omega\rangle$. Hence, $\langle 0| \cdots \Psi_\delta(x)|\omega\rangle \simeq \langle 0| \cdots |\alpha\rangle \langle\alpha|\Psi_\delta(x)|\omega\rangle$. As recalled in Appendix B, the null vector relation $(-2L_{-2} + \frac{\kappa}{2} L_{-1}^2)|\omega\rangle = 0$ imposes constraints on the possible states $|\alpha\rangle$ and on the possible scalings of $\langle\alpha|\Psi_\delta(x)|\omega\rangle$. One has:
$$\Big[ 2\,\ell^{\delta}_{-2}(x) + \frac{\kappa}{2}\, \ell^{\delta}_{-1}(x)^2 \Big] \langle\alpha|\Psi_\delta(x)|\omega\rangle = 0.$$
This gives $\langle\alpha|\Psi_\delta(x)|\omega\rangle = \mathrm{const.}\; x^{\Delta}$ with $\Delta$ the solution of Eq. (22).

Equation (22) is the famous KPZ relation linking the scaling dimension in flat space with that dressed by 2D gravity [12, 7], at least for $\kappa \le 4$. The KPZ relation is usually written as
$$(\Delta - \gamma_{\rm str.})\,\Delta = (1 - \gamma_{\rm str.})\,\delta,$$
where $\gamma_{\rm str.}$ takes one of the two possible values
$$\gamma^{\pm}_{\rm str.} = \Big( c - 1 \pm \sqrt{(1 - c)(25 - c)}\, \Big)/12.$$
The choice $\gamma^{-}_{\rm str.}$ is usually picked based on semi-classical arguments valid for $c \to -\infty$. Being a function of the central charge, $\gamma^{\pm}_{\rm str.}$ is invariant under duality, $\gamma^{\pm}_{\rm str.}(\kappa) = \gamma^{\pm}_{\rm str.}(\hat\kappa)$ for $\hat\kappa = 16/\kappa$. One has:
$$\gamma^{-}_{\rm str.}(\kappa) = \frac{\kappa - 4}{\kappa}, \qquad \gamma^{+}_{\rm str.}(\kappa) = \frac{4 - \kappa}{4}; \qquad \kappa \le 4.$$
Hence, Eq. (22) matches the KPZ equation with $\gamma_{\rm str.} = \gamma^{-}_{\rm str.}$ for $\kappa \le 4$. To reconcile the two equations for $\kappa \ge 4$ one may try to invoke that the other branch for $\gamma_{\rm str.}$ has to be used above the selfdual point. The fact that the scaling relation (21) reproduces that predicted by the KPZ relation suggests that, at least for $\kappa \le 4$, the SLE$_\kappa$ average samples a part of the 2D gravity phase space large enough to test, and to exhibit, scaling behaviors in the presence of gravity. It would be interesting to turn the above observation into more complete statements.
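The match between Eq. (22) and the KPZ relation for $\kappa \le 4$ can be verified numerically. The sketch below assumes the standard SLE central charge $c_\kappa = (6-\kappa)(3\kappa-8)/(2\kappa)$, which is what Eq. (11) of the paper is taken to be here since it is not reproduced in this section; the function names are illustrative.

```python
import math


def central_charge(kappa):
    """Assumed form of c_kappa (standard SLE/CFT relation; Eq. (11) is not shown here)."""
    return (6.0 - kappa) * (3.0 * kappa - 8.0) / (2.0 * kappa)


def gamma_str(c, sign):
    """String susceptibility exponent gamma_str^{+/-} as a function of c."""
    return (c - 1.0 + sign * math.sqrt((1.0 - c) * (25.0 - c))) / 12.0


def dressed_dimension(kappa, delta, sign=+1):
    """Solutions Delta_{+/-}(delta) of Eq. (22), i.e. Eq. (23)."""
    disc = (kappa - 4.0) ** 2 + 16.0 * kappa * delta
    return (kappa - 4.0 + sign * math.sqrt(disc)) / (2.0 * kappa)


def kpz_residual(kappa, delta, sign=+1):
    """(Delta - gamma^-) Delta - (1 - gamma^-) delta; should vanish for kappa <= 4."""
    g = gamma_str(central_charge(kappa), -1)
    d = dressed_dimension(kappa, delta, sign)
    return (d - g) * d - (1.0 - g) * delta


if __name__ == "__main__":
    for kappa in (2.0, 3.0, 4.0):
        c = central_charge(kappa)
        print(kappa, gamma_str(c, -1), gamma_str(c, +1), kpz_residual(kappa, 0.5))
```

For $\kappa = 3$ and $\delta = 1/2$, for instance, $c = 1/2$, $\gamma^-_{\rm str.} = -1/3$ and $\Delta_+ = 2/3$, and both sides of the KPZ relation equal $2/3$.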

5. SLE Crossing Probabilities from CFT

In this section we show on a few examples how to use the martingale properties (14,15) to compute crossing probabilities first computed in refs. [14, 18, 19]. The approach consists in choosing, in an appropriate way depending on the problem, the vector $\langle v|$ on which Eq. (15) is projected such that the expectation $\mathbf{E}[\langle v|G_\tau|\omega\rangle] = \langle v|\omega\rangle$ may be computed in a simple way in terms of the crossing probabilities. In other words, given an event $\mathcal{E}$ associated to a stopping time $\tau$, we shall identify a vector $\langle v_{\mathcal{E}}|$ such that
$$\langle v_{\mathcal{E}}|G_\tau|\omega\rangle = \mathbf{1}_{\mathcal{E}}.$$
This leads to linear systems for these probabilities whose coefficients are correlation functions of a conformal field theory defined over the upper half plane. Although our approach and that of refs. [14, 18, 19] are of course linked, they are in a way reversed one from the other. Indeed, the latter evaluate these crossing probabilities using the differential equations they satisfy (because they are associated to martingales), while we compute them by identifying them with CFT correlation functions (because they are associated to martingales), and as such they satisfy the differential equations. For the events $\mathcal{E}$ we shall consider below, the vectors $\langle v_{\mathcal{E}}|$ are constructed using conformal fields. Since we use operator product expansion properties [2] of conformal fields to show that the vectors $\langle v_{\mathcal{E}}|$ satisfy the appropriate requirements, our approach is more algebraic than that of refs. [14, 18, 19] but less rigorous.

5.1. Generalized Cardy's formula. The most famous crossing probability is that of Cardy [4], which gives the probability that there exists a percolating cluster in critical percolation connecting two opposite sides of a rectangle. Cardy's formula for critical percolation applies to SLE$_6$. It may be extended [14] to a formula valid in SLE$_\kappa$ for any $\kappa > 4$. The problem is then formulated as follows. Let $-\infty < a < 0 < b < \infty$ and define the stopping times $\tau_a$ and $\tau_b$ as the first times at which the SLE$_\kappa$ trace $\gamma(t)$ touches the interval $(-\infty, a]$ and $[b, +\infty)$ respectively:
$$\tau_a = \inf\{t > 0;\ \gamma(t) \in (-\infty, a]\}, \qquad \tau_b = \inf\{t > 0;\ \gamma(t) \in [b, +\infty)\}.$$
By definition, Cardy's crossing probability is the probability that the trace hits first the interval $(-\infty, a]$, that is: $\mathbf{P}[\tau_a < \tau_b]$.

Let $F^{(1)}_t(a, b)$ be the following correlation function
$$F^{(1)}_t(a, b) \equiv \langle\omega|\Psi_{\delta=0}(a)\,\Psi_{\delta=0}(b)\,G_t|\omega\rangle$$
with $\Psi_{\delta=0}(x)$ a boundary conformal field of scaling dimension zero and $\langle\omega|$ the dual of the highest weight vector $|\omega\rangle$. There actually exist two linearly independent correlators, one of them being constant, but we shall not specify yet which non-constant correlation function we pick. Recall that $G_{t=0} = 1$. By dimensional analysis, or because of the commutation relations with $L_0$, $F^{(1)}_{t=0}(a, b)$ is only a function of the dimensionless ratio $a/(a - b)$:
$$F^{(1)}_{t=0}(a, b) = \widetilde F_{\delta=0}(u), \qquad u \equiv \frac{a}{a - b}.$$
The cross ratio $u$, $0 < u < 1$, has a simple interpretation: it is the image of the origin, the starting point of the SLE$_\kappa$ trace, by the homographic transformation fixing the point at infinity and mapping the point $a$ to 0 and $b$ to 1. We apply the martingale property (15) for the stopping time $\tau = \min(\tau_a, \tau_b)$ so that:
$$\mathbf{E}[ F^{(1)}_\tau(a, b) ] = F^{(1)}_{t=0}(a, b) = \widetilde F_{\delta=0}(u). \tag{24}$$
Similarly as in Eq. (16), we compute $F^{(1)}_\tau(a, b)$ by using the intertwining relation (10) to move $G_\tau$ to the left of the correlators. This gives:
$$F^{(1)}_\tau(a, b) = F^{(1)}_{t=0}(f_\tau(a), f_\tau(b)) = \widetilde F_{\delta=0}(u_\tau), \qquad u_\tau = \frac{f_\tau(a)}{f_\tau(a) - f_\tau(b)}.$$


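The important observation is that $u_\tau$ takes two simple non-random values depending on whether $\tau$ equals $\tau_a$ or $\tau_b$:
$$\tau = \tau_a \Leftrightarrow \tau_a < \tau_b:\ u_\tau = 0, \qquad \tau = \tau_b \Leftrightarrow \tau_a > \tau_b:\ u_\tau = 1. \tag{25}$$
Indeed, $\tau = \tau_a$ means that the trace $\gamma(t)$ hits first the interval $(-\infty, a]$. As illustrated in the example of Appendix A, at the instant at which the trace gets back to the real axis, all points which have been surrounded by it are mapped to the origin by $f_t(z)$. Therefore, if $\tau = \tau_a$ then $f_\tau(a) = 0$ while $f_\tau(b)$ remains finite and thus $u_\tau = 0$. The case $\tau = \tau_b$ is analysed similarly. Thus
$$F^{(1)}_\tau(a, b) = \mathbf{1}_{\{\tau_a < \tau_b\}}\, \widetilde F_{\delta=0}(0) + \mathbf{1}_{\{\tau_a > \tau_b\}}\, \widetilde F_{\delta=0}(1)$$
and we can compute $\mathbf{E}[F^{(1)}_\tau(a, b)]$ in terms of $\mathbf{P}[\tau_a < \tau_b]$:
$$\mathbf{E}[F^{(1)}_\tau(a, b)] = \mathbf{P}[\tau_a < \tau_b]\, \widetilde F_{\delta=0}(0) + (1 - \mathbf{P}[\tau_a < \tau_b])\, \widetilde F_{\delta=0}(1),$$
where we used that $\mathbf{P}[\tau_a > \tau_b] = 1 - \mathbf{P}[\tau_a < \tau_b]$. Together with the basic martingale equation (24) we get Cardy's formula:
$$\mathbf{P}[\tau_a < \tau_b] = \frac{\widetilde F_{\delta=0}(u) - \widetilde F_{\delta=0}(1)}{\widetilde F_{\delta=0}(0) - \widetilde F_{\delta=0}(1)}. \tag{26}$$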
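Formula (26) can be evaluated numerically once a non-constant $\widetilde F_{\delta=0}$ is fixed. The sketch below assumes the $\delta = 0$ case of the second order differential equation quoted later in this section, which integrates to $\widetilde F_{\delta=0}'(u) \propto [u(1-u)]^{-4/\kappa}$; the function names and the midpoint quadrature are illustrative choices, not the authors' method. The substitution $s = w^{p}$ with $p = 1/(1 - 4/\kappa)$ removes the integrable singularity at $s = 0$.

```python
import math


def _beta_head(x, a, n=4000):
    """Integral of [s(1-s)]^(-a) over [0, x], for 0 <= x <= 1/2 and 0 <= a < 1.

    With s = w^p and p = 1/(1-a) the integrand becomes p (1 - w^p)^(-a),
    which is smooth on the integration range; a midpoint rule is then adequate.
    """
    p = 1.0 / (1.0 - a)
    w_max = x ** (1.0 - a)  # w = s^(1/p)
    h = w_max / n
    total = 0.0
    for k in range(n):
        w = (k + 0.5) * h
        total += (1.0 - w**p) ** (-a)
    return p * h * total


def cardy_probability(u, kappa):
    """P[tau_a < tau_b] from Eq. (26), with u = a/(a-b) in (0, 1) and kappa > 4."""
    a = 4.0 / kappa
    total = 2.0 * _beta_head(0.5, a)  # integral over [0, 1], by symmetry
    partial = _beta_head(u, a) if u <= 0.5 else total - _beta_head(1.0 - u, a)
    # F(u) = partial + const, so (F(u) - F(1)) / (F(0) - F(1)) = 1 - partial/total
    return 1.0 - partial / total


if __name__ == "__main__":
    print(cardy_probability(0.25, 6.0))  # percolation, kappa = 6
    print(cardy_probability(0.25, 8.0))
```

For $\kappa = 8$ the integral is elementary, $\int_0^u [s(1-s)]^{-1/2} ds = 2\arcsin\sqrt{u}$, which gives $1 - \frac{2}{\pi}\arcsin\sqrt{u}$ and provides a simple check of the quadrature.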

The correlation function $\widetilde F_{\delta=0}(u)$ is shown below to be a hypergeometric function.

Cardy's formula can be further generalized [14] by considering for instance the expectation:
$$\mathbf{E}\Big[ \mathbf{1}_{\{\tau_a < \tau_b\}} \Big( \frac{f_\tau'(b)}{f_\tau(b) - f_\tau(a)} \Big)^{\delta} \Big].$$
This may be computed as above but considering correlation functions with the insertions of two boundary operators, one of dimension zero and the other one of dimension $\delta$. Namely $^{6}$:
$$F^{(2)}_t(a, b) \equiv \langle\omega|\Psi_{\delta=0}(a)\,\Psi_{\delta}(b)\,G_t|\omega\rangle. \tag{27}$$
Again there exist two independent such correlators, corresponding to two different conformal blocks [2], and we shall specify in a while which one we choose. See Appendix B for further details. Since $G_{t=0} = 1$, dimensional analysis tells that:
$$F^{(2)}_{t=0}(a, b) = \frac{1}{(b - a)^{\delta}}\, \widetilde F_{\delta}(u), \qquad u = \frac{a}{a - b}.$$
As above, we start from the martingale relation (15):
$$\mathbf{E}[ F^{(2)}_\tau(a, b) ] = F^{(2)}_{t=0}(a, b)$$
with $\tau = \min(\tau_a, \tau_b)$. Moving $G_\tau$ to the left using the intertwining relation (10) gives:
$$F^{(2)}_\tau(a, b) = \Big( \frac{f_\tau'(b)}{f_\tau(b) - f_\tau(a)} \Big)^{\delta}\, \widetilde F_{\delta}(u_\tau), \qquad u_\tau = \frac{f_\tau(a)}{f_\tau(a) - f_\tau(b)}.$$

6 To be precise, the state $\langle\omega|$ in Eq. (27) belongs, for generic $\delta$, to the dual space of the contravariant representation $\widetilde V_{h^\kappa_{1,2}}$, cf. Appendix B. We however still denote it by $\langle\omega|$ to avoid cumbersome notations.


The difference with the previous computation for Cardy's formula is the occurrence of the Jacobian $f_t'(b)$ since $\delta$ is non-vanishing. The argument (25) concerning the possible values of $u_\tau$ still applies. So, choosing the correlation function $\widetilde F_\delta(u)$ which vanishes at $u = 1$ selects the case $\tau_a < \tau_b$ as the only contribution to the expectation $\mathbf{E}[ F^{(2)}_\tau(a, b) ]$. Hence:
$$\widetilde F_\delta(0)\ \mathbf{E}\Big[ \mathbf{1}_{\{\tau_a < \tau_b\}} \Big( \frac{f_\tau'(b)}{f_\tau(b) - f_\tau(a)} \Big)^{\delta} \Big] = \frac{1}{(b - a)^{\delta}}\, \widetilde F_\delta(u) \tag{28}$$
for $\widetilde F_\delta(u = 1) = 0$. This agrees with ref. [14]. Choosing different boundary conditions for $\widetilde F_\delta(u)$ would give different weights to the events $\{\tau_a < \tau_b\}$ and $\{\tau_a > \tau_b\}$.

As rederived in Appendix B, the correlation function $\langle\omega|\Psi_{\delta=0}(a)\Psi_\delta(b)|\omega\rangle$ satisfies a differential equation as a consequence of the existence of the null vector $|n_{1,2}\rangle$ at level two in the Verma module generated over $|\omega\rangle$. For the function $\widetilde F_\delta(u)$ this differential equation translates into:
$$\Big[ -4\delta + 4(u - 1)(2u - 1)\partial_u + \kappa\, u (u - 1)^2 \partial_u^2 \Big] \widetilde F_\delta(u) = 0.$$
The two solutions correspond to the two possible conformal blocks, and as discussed above we select one of them by demanding $\widetilde F_\delta(1) = 0$. As is well known, the solutions may be written in terms of hypergeometric functions.

Notice that the states propagating in the intermediate channel of the correlation $\langle\omega|\Psi_{\delta=0}(a)\Psi_\delta(b)|\omega\rangle$ are those created by $\Psi_\delta$ acting on $|\omega\rangle$ with conformal weights $\Delta_\pm(\delta) + \delta + h^\kappa_{1,2}$, Eq. (23), as explained in Appendix B. As $b \to 0$, or equivalently $u \to 1$, the leading contributions are governed by these intermediate states so that:
$$\langle\omega|\Psi_{\delta=0}(a)\Psi_\delta(b)|\omega\rangle \simeq_{b\to 0} b^{\Delta_\pm(\delta)}\, C^{h_\pm(\delta)}_{\delta\,\omega}\ \langle\omega|\Psi_{\delta=0}(a)|h_\pm(\delta)\rangle + \cdots$$
with $h_\pm(\delta) = \delta + h^\kappa_{1,2} + \Delta_\pm(\delta)$ and $C^{h_\pm(\delta)}_{\delta\,\omega}$ the structure constants. As recalled in Appendix B, the two possibilities correspond to the two possible choices of intertwiners acting on $\mathcal{H}_{1,2}$. Since $\Delta_+(\delta) > 0$ while $\Delta_-(\delta) < 0$ for $\delta > 0$, imposing $\widetilde F_\delta(1) = 0$ selects the conformal block creating the states $|h_+(\delta)\rangle$, i.e. $\Psi_\delta(x) : \mathcal{H}_{1,2} \to V_{h_+(\delta)}$, for $\delta > 0$.

5.2. Boundary excursion probability. As another application of the conformal machinery to a probabilistic stopping time problem, consider the following question, already answered by standard methods in ref. [17]. If $\gamma(t)$ is an SLE$_\kappa$ path, $\kappa \in ]4, 8[$, and $u$ a point on the positive real axis, let $x = \inf\{[u, +\infty[\ \cap\ \gamma[0, +\infty[\}$ be the first point $\ge u$ touched by the SLE$_\kappa$ path. What is the distribution of $x$? This question is similar to the one answered by Cardy's formula. Note that in fact, $x$ is the position of the SLE$_\kappa$ trace at time $\tau_u$, the last time for which the Loewner map is well-defined at $z = u$.

To prepare for the answer to this question, we study the two point correlator $\langle\omega|\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)G_t|\omega\rangle$ for $0 < u < v < +\infty$. First, suppose that $t = 0$. If $u$ comes close to 0, we can expand this function by computing the operator product expansion of $\Psi_{\delta=0}(u)|\omega\rangle$. As already discussed at length, this can involve at most two conformal families. The conformal family of $|\omega\rangle$ could be one of these but we shall remove it by demanding that it does not appear in the operator product expansion. This fixes the boundary conditions we shall impose on the correlation functions. Then $\Psi_{\delta=0}(u)|\omega\rangle \sim u^{\frac{\kappa-4}{\kappa}}\, |\Psi_{\delta=\frac{\kappa-2}{2\kappa}}\rangle$. This goes to 0 iff $\kappa > 4$.

If the points $u$ and $v$ come close together, the operator product expansion $\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)$ is more involved. General rules of conformal field theory ensure that the identity operator contributes, but apart from that, there is no a priori restriction on the conformal families $\Psi_\delta$ that may appear. However, only those for which $\langle\omega|\Psi_\delta|\omega\rangle \neq 0$ remain, and this restricts to two conformal families, the identity and $\Psi_{\delta=h^\kappa_{1,3}}$. See Appendix B. Thus, when $u$ and $v$ come close together, the dominant contribution to $\langle\omega|\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)|\omega\rangle$ is either 1 or $(v - u)^{\frac{8-\kappa}{\kappa}} \langle\omega|\Psi_{\delta=h^\kappa_{1,3}}|\omega\rangle$ depending on whether $\kappa < 8$ or $\kappa > 8$.

Hence, if $\kappa \in ]4, 8[$, the correlation function $\langle\omega|\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)|\omega\rangle$ vanishes at $u = 0$ and takes value 1 at $u = v$. For nonzero $t$, we write
$$\langle\omega|\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)G_t|\omega\rangle = \langle\omega|\Psi_{\delta=0}(f_t(v))\Psi_{\delta=0}(f_t(u))|\omega\rangle = \langle\omega|\Psi_{\delta=0}(1)\Psi_{\delta=0}(f_t(u)/f_t(v))|\omega\rangle.$$
The last equality follows by dimensional analysis. Now if $x$, the position of the SLE$_\kappa$ trace at $t = \tau_u$, satisfies $u < x < v$, then $f_{\tau_u}(v)$ remains away from the origin but $f_{\tau_u}(u) = 0$ and the correlation function vanishes. On the other hand, if $v \le x$, it is a general property of hulls that $\lim_{t\nearrow\tau_u} f_t(u)/f_t(v) = 1$ and the correlation function is unity. To summarize,
$$\langle\omega|\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)G_{\tau_u}|\omega\rangle = \mathbf{1}_{\{x \ge v\}}.$$
From the martingale property (15) we infer that the probability distribution function of $x$ is
$$\mathbf{E}[\mathbf{1}_{\{x \ge v\}}] = \langle\omega|\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)|\omega\rangle. \tag{29}$$
As recalled in Appendix B, the fact that
$$\langle\omega|\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)\Big(-2L_{-2} + \frac{\kappa}{2} L_{-1}^2\Big)|\omega\rangle = 0$$
translates into the differential equation
$$\Big[ \frac{2}{u}\partial_u + \frac{2}{v}\partial_v + \frac{\kappa}{2}(\partial_u + \partial_v)^2 \Big] \langle\omega|\Psi_{\delta=0}(v)\Psi_{\delta=0}(u)|\omega\rangle = 0.$$
Since the correlation function depends only on $s = u/v$, we derive that
$$\Big[ \frac{d^2}{ds^2} + \Big( \frac{4}{\kappa s} + \frac{2(4 - \kappa)}{\kappa(1 - s)} \Big) \frac{d}{ds} \Big] \langle\omega|\Psi_{\delta=0}(1)\Psi_{\delta=0}(s)|\omega\rangle = 0.$$
The differential operator annihilates the constants, a remnant of the fact that the identity operator has weight 0. With the normalization chosen for $\Psi_{\delta=0}$, the relevant solution vanishes at the origin. The integration is straightforward. Finally,
$$\mathbf{P}[x \ge u/s] \equiv \mathbf{E}[\mathbf{1}_{\{x \ge u/s\}}] = \frac{s^{\frac{\kappa-4}{\kappa}}\ \Gamma\big(\frac{4}{\kappa}\big)}{\Gamma\big(\frac{\kappa-4}{\kappa}\big)\, \Gamma\big(\frac{8-\kappa}{\kappa}\big)} \int_0^1 d\sigma\ \sigma^{-\frac{4}{\kappa}} (1 - s\sigma)^{2\frac{4-\kappa}{\kappa}}.$$
This example is instructive, because it shows in a fairly simple case that the thresholds $\kappa = 4, 8$ for topological properties of SLE$_\kappa$ appear in the CFT framework as thresholds at which divergences emerge in operator product expansions.
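The final formula of Sect. 5.2 is easy to evaluate numerically. The sketch below is an illustration under the stated range $\kappa \in ]4, 8[$ and $0 < s < 1$; the function name and the midpoint quadrature are choices made here, not part of the paper. The substitution $\sigma = w^{p}$ with $p = \kappa/(\kappa - 4)$ removes the integrable singularity $\sigma^{-4/\kappa}$ at the origin.

```python
import math


def excursion_tail(s, kappa, n=4000):
    """P[x >= u/s] for 0 < s < 1 and 4 < kappa < 8, from the closed formula of Sect. 5.2.

    With sigma = w^p, p = 1/(1 - 4/kappa), the integrand sigma^(-4/kappa) dsigma
    becomes p dw, leaving the smooth factor (1 - s w^p)^(2(4-kappa)/kappa).
    """
    a = 4.0 / kappa
    c = 2.0 * (4.0 - kappa) / kappa
    p = 1.0 / (1.0 - a)
    h = 1.0 / n
    integral = 0.0
    for k in range(n):
        w = (k + 0.5) * h
        integral += (1.0 - s * w**p) ** c
    integral *= p * h
    # Gamma(4/k) / (Gamma((k-4)/k) Gamma((8-k)/k)), using (8-k)/k = 1 + c
    norm = math.gamma(a) / (math.gamma(1.0 - a) * math.gamma(1.0 + c))
    return s ** (1.0 - a) * norm * integral


if __name__ == "__main__":
    for s in (0.1, 0.5, 0.9):
        print(s, excursion_tail(s, 6.0))
```

For $\kappa = 6$ the formula reduces to a regularized incomplete Beta function with both parameters equal to $1/3$, so the value at $s = 1/2$ must be $1/2$ by symmetry; this is a convenient sanity check.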


5.3. Domain wall probability. We now present another percolation formula involving bulk operators. For the underlying statistical model defined on the upper half plane as in Sect. 4, this formula gives the probability for the position of the domain wall limiting the FK-cluster connected to the negative real axis relative to a given point in $\mathbb{H}$. It was first proved in [19] using the SLE$_\kappa$ formulation and had not yet been derived within conformal field theory.

Let $\kappa < 8$ and $z$ be a point in $\mathbb{H}$. One looks for the probability that the SLE$_\kappa$ trace goes to the left or to the right of that point. Following ref. [19], it is convenient to reformulate this problem as a statement concerning the behavior of the SLE$_\kappa$ trace at the time $\tau_z$ at which the point $z$ is swallowed by the hull:
$$\tau_z = \inf\{t > 0;\ f_t(z) = 0\}.$$
At time $t \to \tau_z^-$, the point $z$ is mapped to the origin by $f_t$. Furthermore, with $f_t(z) = x_t(z) + i y_t(z)$, almost surely the ratio $x_t(z)/y_t(z)$ goes to $\pm\infty$, as illustrated in Appendix A. The sign $\pm\infty$ depends on whether the point is surrounded by the trace $\gamma(t)$ from the right or from the left; by convention, we say that the trace goes to the right if it loops around $z$ positively. Hence, if $\theta_z \equiv \lim_{t\to\tau_z^-} x_t(z)/y_t(z)$ then [19]:
$$\mathbf{P}[\gamma(t)\ \text{to the left [right] of}\ z] = \mathbf{P}[\theta_z = -\infty\ [+\infty]\,]$$
with $\mathbf{P}[\theta_z = +\infty] + \mathbf{P}[\theta_z = -\infty] = 1$. We shall derive these probabilities using the martingale equation (15) with the correlation function
$$C^{(1)}_t(z, \bar z) \equiv \langle\omega|\Phi_{h=0}(z, \bar z)\, G_t|\omega\rangle \tag{30}$$
with $\Phi_{h=0}(z, \bar z)$ a bulk operator of scaling dimension zero. As for Cardy's formula, we choose arbitrarily any non-constant such correlator. Since $G_{t=0} = 1$, dimensional analysis implies that $C^{(1)}_{t=0}(z, \bar z)$ only depends on a dimensionless ratio:
$$C^{(1)}_{t=0}(z, \bar z) = \widetilde C_{h=0}(x/y), \qquad z = x + iy.$$
The martingale equation (15) applied to $C^{(1)}_t(z, \bar z)$ at $t \to \tau_z^-$ yields:
$$\mathbf{E}[ C^{(1)}_{\tau_z}(z, \bar z) ] = \widetilde C_{h=0}(x/y). \tag{31}$$
As above, $G_t$ may be moved to the left by using the intertwining relation (10) so that:
$$C^{(1)}_{\tau_z}(z, \bar z) = \lim_{t\to\tau_z} C^{(1)}_{t=0}(f_t(z), \bar f_t(\bar z)) = \widetilde C_{h=0}(\theta_z).$$
Since $\theta_z$ takes the two simple non-random values $\theta_z = \pm\infty$, the expectation $\mathbf{E}[C^{(1)}_{\tau_z}(z, \bar z)]$ is computable:
$$\mathbf{E}[ C^{(1)}_{\tau_z}(z, \bar z) ] = \mathbf{P}[\theta_z = +\infty]\, \widetilde C_{h=0}(+\infty) + \mathbf{P}[\theta_z = -\infty]\, \widetilde C_{h=0}(-\infty).$$
With the martingale equation (31), it implies:
$$\mathbf{P}[\theta_z = +\infty] = \frac{\widetilde C_{h=0}(x/y) - \widetilde C_{h=0}(-\infty)}{\widetilde C_{h=0}(+\infty) - \widetilde C_{h=0}(-\infty)}. \tag{32}$$


As we recall below, the function $\widetilde C_{h=0}(x/y)$ satisfies a second order differential equation and it is expressible as a hypergeometric function. Equation (32) agrees with that of ref. [19].

As with Cardy's formula, the above crossing probability admits a simple generalization. Let us define
$$C^{(2)}_t(z, \bar z) \equiv \langle\omega|\Phi_h(z, \bar z)\, G_t|\omega\rangle \tag{33}$$
with $\Phi_h(z, \bar z)$ a bulk operator of scaling dimension $h$. Scaling analysis implies that at initial time
$$C^{(2)}_{t=0}(z, \bar z) = y^{-2h}\, \widetilde C_h(x/y), \qquad z = x + iy.$$
As before, the martingale equation (15) for $t \to \tau_z^-$ gives:
$$\mathbf{E}\Big[ \Big| \frac{f_{\tau_z}'(z)}{\mathrm{Im} f_{\tau_z}(z)} \Big|^{2h}\, \widetilde C_h(\theta_z) \Big] = y^{-2h}\, \widetilde C_h(x/y) \tag{34}$$
with as before $\theta_z = \lim_{t\to\tau_z^-} x_t(z)/y_t(z)$. Since $\theta_z$ takes only two values $\pm\infty$ depending on whether the trace loops positively or negatively around $z$, Eq. (34) becomes:
$$\mathbf{E}\Big[ \Big| \frac{f_{\tau_z}'(z)}{\mathrm{Im} f_{\tau_z}(z)} \Big|^{2h} \Big( \mathbf{1}_{\{\theta_z = +\infty\}}\, \widetilde C_h(+\infty) + \mathbf{1}_{\{\theta_z = -\infty\}}\, \widetilde C_h(-\infty) \Big) \Big] = y^{-2h}\, \widetilde C_h(x/y). \tag{35}$$
There exist two conformal blocks for the correlation function (33) and therefore two possible independent choices for $C^{(2)}_t(z, \bar z)$. As a basis, one may select the even and odd correlations $\widetilde C_h(x/y)$. Each choice leads to different information on the expectation (34). The simplest is provided by choosing the even conformal block so that the $\theta_z$ dependence in Eq. (34) factorizes. This yields:
$$\widetilde C^{[\rm even]}_h(\infty)\ \mathbf{E}\Big[ \Big| \frac{f_{\tau_z}'(z)}{\mathrm{Im} f_{\tau_z}(z)} \Big|^{2h} \Big] = y^{-2h}\, \widetilde C^{[\rm even]}_h(x/y). \tag{36}$$
As recalled in Appendix B, $\widetilde C_h(x/y)$ satisfies a second order differential equation due to the existence of the null vector $|n_{1,2}\rangle$ at level two. Its form is
$$\Big[ 16h + 8(1 + \eta^2)\,\eta\,\partial_\eta + \kappa (1 + \eta^2)^2 \partial_\eta^2 \Big] \widetilde C_h(\eta) = 0$$
with $\eta = x/y$. As usual its solutions are hypergeometric functions. To conclude for the expectation (34) one has to evaluate $\widetilde C_h(\pm\infty)$, and to check whether it is finite, zero or infinite. As discussed in ref. [17], this depends on the values of $h$ and $\kappa$. The different cases are distinguished according to the conformal weights of the intermediate states propagating in the correlations (33). They may also be understood in terms of fusion rules. The limit $x/y \to \infty$ may be thought of as the limit in which the bulk operator $\Phi_h(z, \bar z)$ gets close to the real axis. By operator product expansion, it then decomposes on boundary operators $\Psi_\delta(x)$. Since we are taking matrix elements between $|\omega\rangle$ and $|h^\kappa_{1,2}\rangle$, only the operators with $\delta = 0$ or $\delta = h^\kappa_{1,3}$ give non-vanishing contributions, see Appendix B. Thus:
$$y^{2h}\, \Phi_h(z, \bar z) \simeq_{y\to 0} B^{0}_{h}\, [ \Psi_{\delta=0}(x) + \cdots ] + y^{h^\kappa_{1,3}}\, B^{h^\kappa_{1,3}}_{h}\, [ \Psi_{\delta=h^\kappa_{1,3}}(x) + \cdots ] + \cdots.$$
The different cases depend on the finiteness of the two bulk-boundary coupling constants $B^{0}_{h}$ and $B^{h^\kappa_{1,3}}_{h}$ and on the sign of $h^\kappa_{1,3} = \frac{8-\kappa}{\kappa}$, that is whether $\kappa > 8$ or $\kappa < 8$.


6. Exterior/Interior Hull Relations

As the examples we have treated in detail show, the behavior of conformal correlators when points are swallowed by the SLE$_\kappa$ hull is crucial for the probabilistic interpretation. In this section we gather a few general remarks on this behavior. Suppose that a domain $D$ is swallowed by the hull at time $t = t_c$. Then it is known that

i) in the unbounded component, the Loewner map $g_t$ has a limit. This limit maps the unbounded component conformally onto $\mathbb{H}$,
ii) in the bounded component $D$, the Loewner map has a limit which is a constant map, so that the image of $D$ collapses to the point $\xi_{t_c}$ on the real axis,
iii) for $z, z' \in D$, $\lim_{t\nearrow t_c} \frac{g_t(z) - \xi_t}{g_t(z') - \xi_t} = 1$.

Let us elaborate a little bit on property iii). We concentrate on the case when a simple path $\gamma_t$ starts from the origin at $t = 0$ and touches the real axis at $t = t_c$ $^{7}$. Then the boundary of $D$ contains an open interval $I$ of the real axis. Let $x$ be a point in this interval. By appropriate fractional linear transformations on $z$ and $g_t(z)$, one can exchange the role of the bounded and the unbounded component and then use property i) $^{8}$. This leads to

iv) For $z \in D$ and $x \in I$, $\lim_{t\nearrow t_c} \frac{g_t(z) - g_t(x)}{g_t'(x)}$ exists and maps $D$ conformally onto $\mathbb{H}$.

Put together, properties ii), iii) and iv) imply that there exist two functions of $t$, $\varepsilon^{(0)}_t = o(1)$ and $\varepsilon^{(1)}_t = o(\varepsilon^{(0)}_t)$, such that
$$\lim_{t\nearrow t_c} \frac{g_t(z) - \xi_t - \varepsilon^{(0)}_t}{\varepsilon^{(1)}_t} \equiv \check g_{t_c}(z)$$
exists and maps $D$ conformally onto $\mathbb{H}$.

For the semicircle example treated in detail in the appendix, one can take $\varepsilon^{(0)}_t = -\sqrt{2(t_c - t)}$, $\varepsilon^{(1)}_t = 2(t_c - t)$ and $\check g_{t_c}(z) = \frac{z}{(r+z)^2}$. In fact heuristic arguments suggest that $\varepsilon^{(0)}_t \sim \pm\sqrt{2(t_c - t)}$ and $\varepsilon^{(1)}_t \sim 2(t_c - t)$ if the curve $\gamma_t$ is smooth enough $^{9}$. This smoothness assumption is not natural in the context of SLE$_\kappa$ so we work with $\varepsilon^{(0)}_t$ and $\varepsilon^{(1)}_t$.

Suppose there is a collection of fields in $\mathbb{H}$, with the convention that the "i(n)" arguments are inside $D$ or $I$, and the "o(ut)" arguments sit outside. Inserting resolutions of the identity and using covariance under similarities as detailed in Appendix C, we find that the dominant contribution to the correlator
$$\langle 0| \prod_{p_o} \Psi_{\delta_{p_o}}(x_{p_o}) \prod_{j_o} \Phi_{h_{j_o}}(z_{j_o}, \bar z_{j_o}) \cdot \prod_{p_i} \Psi_{\delta_{p_i}}(x_{p_i}) \prod_{j_i} \Phi_{h_{j_i}}(z_{j_i}, \bar z_{j_i}) \cdot G_t|\alpha\rangle$$
for $t$ close to $t_c$ is given by
$$\sum_{\beta,\eta} C^{\beta}_{\alpha\,\eta}\, (\varepsilon^{(0)}_t)^{\delta_\beta - \delta_\alpha - \delta_\eta}\, (\varepsilon^{(1)}_t)^{\delta_\eta}\ \langle 0| \prod_{p_o} \Psi_{\delta_{p_o}}(x_{p_o}) \prod_{j_o} \Phi_{h_{j_o}}(z_{j_o}, \bar z_{j_o}) \cdot G_{t_c}|\beta\rangle \cdot \langle\eta|\check G^{-1}_{t_c} \cdot \prod_{p_i} \Psi_{\delta_{p_i}}(x_{p_i}) \prod_{j_i} \Phi_{h_{j_i}}(z_{j_i}, \bar z_{j_i}) \cdot \check G_{t_c}|0\rangle, \tag{37}$$
where $C^{\beta}_{\alpha\,\eta}$ is the structure constant of the operator algebra, and $\check G_{t_c}$ the element of $Vir$ representing the map $\check g_{t_c}(z)$ defined on $D$. In fact only the fields for which
$$C^{\beta}_{\alpha\,\eta}\, (\varepsilon^{(0)}_t)^{\delta_\beta - \delta_\alpha - \delta_\eta}\, (\varepsilon^{(1)}_t)^{\delta_\eta}$$
is dominant are really significant in the above sum. Note that from the conformal field theory viewpoint the scaling $\varepsilon^{(1)}_t \sim (\varepsilon^{(0)}_t)^2$ (which is also suggested by independent arguments for smooth $\gamma_t$) yields a formula for which the domain $D$ and its complement appear symmetrically. We do not know if this scaling relation holds for a typical SLE$_\kappa$ trace ($\kappa \in ]4, 8[$), but when subdomains of $\mathbb{H}$ containing field insertions are swallowed by the SLE$_\kappa$ hull, full conformal field theory correlation functions survive the swallowing.

7 In more complicated cases, the path has a self intersection at $t_c$ and/or has already made a (finite) number of self intersections before $t_c$, but choosing any $t_o$ such that $\gamma_t$ is simple for $t \in [t_o, t_c]$, the study of $g_t \circ g_{t_o}^{-1}$ instead of $g_t$ reduces the problem to the situation that we treat in some detail.
8 For instance, one may choose any $x'$ such that $x x' < 0$ and set $z' = \frac{x' z}{z - x}$. The fractional linear transformation for $g_t(z)$ is then fixed by the normalization condition at infinity.
9 The sign is $+$ if the boundary of $D$ is oriented clockwise by the path and $-$ otherwise.

7. Conclusions and Conjectural Perspectives

We have established a precise connection between SLE$_\kappa$ evolutions and boundary conformal field theories, with central charge $c_\kappa$, Eq. (11). Geometrically, this relation is based on identifying the hull boundary state $|K_t\rangle$, Eq. (20), which encodes the evolution of the SLE$_\kappa$ hull. The key point is that this state, which belongs to a Virasoro module possessing a null vector at level two, is conserved in mean. It is thus a generating function for SLE$_\kappa$ martingales.

The simplest illustration of this property is provided by the algebraic derivation of generalized crossing probabilities that we gave in Sect. 5. It yields solutions of SLE$_\kappa$ stopping time problems in terms of CFT correlation functions. It is clear that the method we described can be generalized to deal with many more examples, involving multipoint correlation functions, than just the ones we presented in this paper.

Although we point out a direct relation between SLE$_\kappa$ evolution and 2D gravity via the KPZ formula (22), this clearly calls for further developments, in particular in order to make contact with ref. [10].

Knowing the relation between crossing probabilities, or more generally stopping time problems, and conformal correlation functions, it is natural to wonder whether conformal field theories can be reconstructed from SLE$_\kappa$ data. In view of the explicit examples, e.g. Eqs.
(28,29,35), and their multipoint generalizations, it is tempting to conjecture that appropriate choices for the stopping time $\tau$ and the state $\langle v^*|$ in the relation $\mathbf{E}[\langle v^*|G_\tau|\omega\rangle] = \langle v^*|\omega\rangle$ should lead to a reconstruction formula for, say, the boundary fields, expressing
$$\langle 0| \prod_j \Psi_{\delta_j}(x_j)|\omega\rangle$$
as a linear combination of SLE$_\kappa$ expectations of the form
$$\mathbf{E}[\mathbf{1}_{r(\tau_1, \cdots, \tau_n)}\ C(x_1, \cdots, x_n)_{\delta_1, \cdots, \delta_n}],$$
where $r(\tau_1, \cdots, \tau_n)$ specifies a relation between the swallowing times of the points $x_1, \cdots, x_n$ and $C(x_1, \cdots, x_n)_{\delta_1, \cdots, \delta_n}$ is an appropriate function depending locally on the $f_{\tau_j}(x_j)$ as in Eqs. (28) and (29). The analogous ansatz for bulk operators would also involve the events coding for the position of the SLE$_\kappa$ trace with respect to the insertion points of the bulk operators, as in Eq. (35) and its multipoint generalizations. We plan to report on this problem in the near future [22].

The existence of such a reconstruction scheme would indicate that SLE$_\kappa$ evolutions, or their avatars, provide alternative formulations of statistical field theories, at least conformal ones. Contrary to usual approaches, the peculiarity of this, yet virtual, reformulation of field theories would be that its elementary objects are non-local and extended, with, at an even more hypothetical level, possible applications to (a dual formulation of) gauge theories.

Appendix A: Deterministic Loewner Evolutions

The simplest example of deterministic Loewner evolution is when the driving term vanishes, so that the equation reduces to $\dot g = 2/g$, which combined with $g_{t=0} = z$ leads to $g^2 = z^2 + 4t$. When $z$ is real, $g$ is real as well, but $g$ is real also when $z$ is pure imaginary, $z = 2i\sqrt{t'}$, $t' \in ]0, t]$. So the hull is $K_t = \{2i\sqrt{t'},\ t' \in [0, t]\}$, and with a slight abuse of notation, we write $g = \sqrt{z^2 + 4t}$, it being understood that the determination of the square root is chosen to make $g$ continuous on $\mathbb{H} \setminus K_t$ and behave like $z$ at $\infty$.

This trivial example allows one to construct a more instructive one: by appropriate fractional linear transformations in the source and image of $g(z)$, one can construct a conformal representation for the situation when the hull is an arc of circle of radius $r$ centered at the origin and emerging from the real axis at $r$. Straightforward computations lead to a family of maps
$$g_\lambda(z) := -r\ \frac{\dfrac{2\lambda+1}{\lambda+1}\sqrt{\lambda + \Big(\dfrac{z-r}{z+r}\Big)^2} - \dfrac{2\lambda-1}{\sqrt{\lambda+1}}}{\sqrt{\lambda + \Big(\dfrac{z-r}{z+r}\Big)^2} - \sqrt{\lambda+1}}.$$
The determination of $\sqrt{\lambda + \big(\frac{z-r}{z+r}\big)^2}$ is chosen in such a way that the behavior at large $z$ is $g_\lambda(z) = z + O(1/z)$. The parameter $r$ simply sets the scale, but the nonnegative parameter $\lambda$ sets the angular extension $\theta$ of the arc: apart from the real axis, the points $z$ such that $g_\lambda(z)$ is real are of the form $z = re^{i\vartheta}$, $\vartheta \in [0, \theta]$ with $\tan^2 \theta/2 = \lambda$. When $\lambda \to +\infty$, $\theta \to \pi$, and the free end of the arc approaches the real axis at the point $-r$. Our aim is to observe what happens in this limit, especially to the points that are being "swallowed" (i.e. the points inside the open half disc $D_r$ of radius $r$ centered at the origin).

The determination of $\sqrt{\lambda + \big(\frac{z-r}{z+r}\big)^2}$ is such that on the two sides of the cut it is equal to $\pm\sqrt{\tan^2 \theta/2 - \tan^2 \vartheta/2}$ for $z = r^{\pm} e^{i\vartheta}$. Explicit computation shows that
$$g_\lambda(-r) = -r\, \frac{2\lambda + 1}{\lambda + 1}, \qquad g_\lambda(re^{i\theta}) = -r\, \frac{2\lambda - 1}{\lambda + 1},$$
and
$$g_\lambda(r^{\pm}) = \pm 2r \sqrt{\frac{\lambda}{\lambda+1}} + \frac{r}{\lambda+1}.$$


Moreover, $\sqrt{\lambda + \left(\frac{z-r}{z+r}\right)^{2}}$ is pure imaginary when $z = re^{i\vartheta}$, $\vartheta \in [\theta,\pi]$. As $g_\lambda$ is a real homographic function of $\sqrt{\lambda + \left(\frac{z-r}{z+r}\right)^{2}}$, we conclude that $\{re^{i\vartheta},\ \vartheta\in[\theta,\pi]\}$ is mapped to a semicircle by $g_\lambda$, and $D_r$ is mapped to the open half disc bounded by this semicircle and a segment of the real axis. The important observation is that for large $\lambda$, $g_\lambda(r^+) \to 2r$, while $g_\lambda(-r)$, $g_\lambda(r^-)$ and $g_\lambda(re^{i\theta})$ have the same limit $-2r$, so that the image of $D_r$ shrinks to a point. On the other hand, if $z \in \mathbb{H}\setminus \overline{D}_r$, $\lim_{\lambda\to+\infty} g_\lambda(z) = z + r^2/z$, so that the image of $\mathbb{H}\setminus\overline{D}_r$ is $\mathbb{H}$:
\[
\lim_{\lambda\to\infty} g_\lambda(z) =
\begin{cases}
z + r^2/z, & \text{for } |z| \ge r,\\
-2r, & \text{for } |z| < r.
\end{cases}
\]
The approach to the limit is interesting. Let $g_\lambda(z) = -2r + x_\lambda(z) + i y_\lambda(z)$ with $y_\lambda(z) > 0$ by construction. For $z \in D_r$, we have $x_\lambda(z) = 2r/\lambda + O(1/\lambda^2)$ and $y_\lambda(z) = O(1/\lambda^2)$, so that $\lim_{\lambda\to\infty} x_\lambda(z)/y_\lambda(z) = +\infty$, as expected for a loop surrounding the point $z$ positively.

The rate at which the points in $D_r$ go to $-2r$ is also interesting. If $z \in D_r$ one checks that, for large $\lambda$, $g_\lambda(z) - g_\lambda(re^{i\theta}) = -r/\lambda + O(1/\lambda^2)$, so that if we take another $z' \in D_r$, the ratio
\[
\frac{g_\lambda(z) - g_\lambda(re^{i\theta})}{g_\lambda(z') - g_\lambda(re^{i\theta})} \to 1,
\]
showing that the points $z$ and $z'$ come close to each other faster than they approach the collapse point. Going one step further in the expansion gives
\[
g_\lambda(z) - g_\lambda(re^{i\theta}) = -\frac{r}{\lambda} + \frac{r}{\lambda^2}\left(1 + \frac{rz}{(r+z)^2}\right) + O(1/\lambda^3),
\]
and one checks that the map that appears at order $\lambda^{-2}$ maps $D_r$ conformally onto $\mathbb{H}$.

Up to now, we have not given a description of this example in terms of Loewner evolution. Explicit computation shows that
\[
\frac{\partial g_\lambda(z)}{\partial\lambda} = \frac{2r^2}{(\lambda+1)^3}\,\frac{1}{g_\lambda(z) + r\,\frac{2\lambda-1}{\lambda+1}}.
\]
So, to deal with a normalized Loewner evolution, we need to make a change of evolution parameter. We set
\[
t \equiv \frac{r^2}{2}\left(1 - \frac{1}{(\lambda+1)^2}\right), \qquad \xi_t \equiv -r\,\frac{2\lambda-1}{\lambda+1},
\]
and, with an abuse of notation, write $g_t$ for $g_{\lambda(t)}$. Then $\partial_t g_t(z) = 2/(g_t(z) - \xi_t)$ as usual. The semicircle closes at $t_c = r^2/2$. For $z \in D_r$ and $t$ close to $t_c$,
\[
g_t(z) - \xi_t = -(2(t_c - t))^{1/2} + 2(t_c - t)\,\frac{z}{(r+z)^2} + O\bigl((t_c-t)^{3/2}\bigr).
\]
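The zero-drive example at the start of this appendix is easy to check numerically. The following is a minimal sketch, not part of the original derivation: it evaluates $g_t(z) = \sqrt{z^2+4t}$ with the branch selected by requiring $\mathrm{Im}\, g \ge 0$ (one way to implement the determination described above), and compares a centered finite difference in $t$ with $2/g_t(z)$. The function names and the step size `h` are illustrative choices.

```python
import cmath


def loewner_map_zero_drive(z: complex, t: float) -> complex:
    """Return g_t(z) = sqrt(z^2 + 4t) for Im z > 0, on the branch with Im g >= 0.

    For z in the upper half plane minus the hull, z^2 + 4t never lies on the
    non-negative real axis, so exactly one of the two square roots has
    positive imaginary part; that root behaves like z at infinity.
    """
    w = cmath.sqrt(z * z + 4.0 * t)
    return w if w.imag >= 0 else -w


def loewner_residual(z: complex, t: float, h: float = 1e-6) -> float:
    """Return |d g_t(z)/dt - 2/g_t(z)|, using a centered difference in t."""
    dg = (loewner_map_zero_drive(z, t + h) - loewner_map_zero_drive(z, t - h)) / (2.0 * h)
    return abs(dg - 2.0 / loewner_map_zero_drive(z, t))
```

The residual should be at the level of the finite-difference error for any $z$ off the hull, on either side of the imaginary axis, and the map should reduce to the identity at $t = 0$.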


Appendix B: Virasoro Intertwiners and Their Correlation Functions

Here, we gather some basic information on Virasoro intertwiners. Let us first deal with boundary operators $\Psi_\delta(x)$, acting from one highest weight Virasoro module $V_r$ to another module $V_l$ with respective highest weight vectors $|h_r\rangle$ and $|h_l\rangle$:
\[
\Psi_\delta(x) : V_r \longrightarrow V_l .
\]
They are constrained by the intertwining relations (8). When $V_r$ and $V_l$ are irreducible Verma modules, the space of intertwiners between them is one dimensional. When $V_l$ is an irreducible module, this space is 0 or 1 dimensional. This is so because in these cases, all the matrix elements of $\Psi_\delta(x)$ can be obtained from the scalar products
\[
\langle h_l|\ \prod_p L_{n_p}\ \Psi_\delta(x)\ \prod_q L_{-n_q}\ |h_r\rangle
\]
and these are fixed by the three-point function $\langle h_l|\Psi_\delta(x)|h_r\rangle$ and the intertwining relations (8). The commutation relation with $L_0$ fixes the scaling form of the three-point function:
\[
\langle h_l|\Psi_\delta(x)|h_r\rangle = C^{h_l}_{\delta\, h_r}\ x^{h_l - h_r - \delta}.
\]
The constant $C^{h_l}_{\delta\, h_r}$ is called the structure constant. The intertwiner $\Psi_\delta(x)$ from $V_r$ to $V_l$ exists whenever this constant does not vanish. The fusion rules are those imposed by demanding that $C^{h_l}_{\delta\, h_r}$ be non-zero. If $\delta$ is generic and $V_l$ and $V_r$ are the possibly reducible Virasoro Verma modules, no constraint is imposed on the possible values of the weights $h_l$, $h_r$ and $\delta$.

Constraints on $C^{h_l}_{\delta\, h_r}$ arise when the Virasoro modules possess null vectors. As we shall only deal with null vectors at level two, let us concentrate on the case where $V_r = H_{1,2}$ with $|h_r\rangle = |\omega\rangle$ and $\delta$ is generic. Recall that $|\omega\rangle$ has weight $h^\kappa_{1,2} = (6-\kappa)/2\kappa$. Hence,
\[
\Psi_\delta(x) : H_{1,2} \longrightarrow V_l .
\]
Since $|n_{1,2}\rangle = (-2L_{-2} + \frac{\kappa}{2}L_{-1}^2)|\omega\rangle$ vanishes in $H_{1,2}$ we have:
\[
\langle h_l|\Psi_\delta(x)\,\bigl(-2L_{-2} + \tfrac{\kappa}{2}L_{-1}^2\bigr)|\omega\rangle = 0.
\]
By using the intertwining relation (8) to move the $L_{-n}$ to the left, this translates into
\[
\Bigl(2\,\ell^{\delta}_{-2}(x) + \frac{\kappa}{2}\,\ell^{\delta}_{-1}(x)^2\Bigr)\,\langle h_l|\Psi_\delta(x)|\omega\rangle = 0.
\]
This either imposes $C^{h_l}_{\delta\,\omega}$ to vanish, or $h_l - \delta - h^\kappa_{1,2}$ to be one of the two solutions $\Delta_\pm(\delta)$ of Eq. (22). In other words, the only Virasoro intertwiners acting on $H_{1,2}$ are
\[
\Psi_\delta(x) : H_{1,2} \longrightarrow V_{h_\pm(\delta)}, \qquad h_\pm(\delta) = \delta + h^\kappa_{1,2} + \Delta_\pm(\delta),
\]
with $\Delta_\pm(\delta)$ given in Eq. (23). These are the fusion rules. For instance, in the particular case $\delta = h^\kappa_{1,2}$, $h_\pm(h^\kappa_{1,2})$ takes the two possible values:
\[
h_\pm(h^\kappa_{1,2}) \in \Bigl\{0\ ;\ h^\kappa_{1,3} = \frac{8-\kappa}{\kappa}\Bigr\}.
\]
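The fusion rules above can be checked numerically under one explicit assumption. Eq. (8) is not reproduced in this appendix; the sketch below assumes the standard form of the differential operators for a primary field of weight $\delta$, namely $\ell^\delta_n(x) = x^{n+1}\partial_x + \delta(n+1)x^n$. Acting with $2\ell^\delta_{-2} + \frac{\kappa}{2}(\ell^\delta_{-1})^2$ on the power law $x^{\Delta}$ then gives the quadratic $\frac{\kappa}{2}\Delta(\Delta-1) + 2\Delta - 2\delta = 0$, whose two roots play the role of $\Delta_\pm(\delta)$. The function names are illustrative, and the real square root restricts the sketch to values of $\delta$ with non-negative discriminant.

```python
import math


def h12(kappa: float) -> float:
    """Weight h^kappa_{1,2} = (6 - kappa) / (2 kappa) quoted in the text."""
    return (6.0 - kappa) / (2.0 * kappa)


def h13(kappa: float) -> float:
    """Weight h^kappa_{1,3} = (8 - kappa) / kappa quoted in the text."""
    return (8.0 - kappa) / kappa


def delta_pm(delta: float, kappa: float) -> tuple:
    """Roots of (kappa/2) D (D - 1) + 2 D - 2 delta = 0.

    Assumption: this quadratic comes from the standard operators
    l_n = x^(n+1) d/dx + delta (n+1) x^n, not from Eq. (22) itself.
    """
    a = kappa / 2.0
    b = 2.0 - kappa / 2.0
    c = -2.0 * delta
    disc = math.sqrt(b * b - 4.0 * a * c)
    return ((-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a))


def fused_weights(delta: float, kappa: float) -> tuple:
    """Weights h_pm(delta) = delta + h_{1,2} + Delta_pm(delta)."""
    return tuple(delta + h12(kappa) + d for d in delta_pm(delta, kappa))
```

With $\delta = h^\kappa_{1,2}$ the two weights returned by `fused_weights` should reproduce the pair $\{0,\ h^\kappa_{1,3}\}$ stated above.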


Symmetrically, the only intertwiners $\Psi_\delta(x) : H_{1,2} \to V_{h^\kappa_{1,2}}$ which couple $|\omega\rangle$ to the state $|h^\kappa_{1,2}\rangle$ with identical conformal weights should have $\delta = 0$ or $\delta = h^\kappa_{1,3}$, since then $\Delta_\pm(\delta) + \delta = 0$.

Multipoint correlators are computed by composing intertwiners. For instance the four-point function $\langle h|\Psi_{\delta_1}(x_1)\Psi_{\delta_2}(x_2)|\omega\rangle$ may be thought of as a matrix element of the composition of two intertwiners:
\[
\Psi_{\delta_1}(x_1)\circ\Psi_{\delta_2}(x_2) : H_{1,2} \longrightarrow V_{h_\pm(\delta_2)} \longrightarrow V_h .
\]
The fact that there are two possible choices for $h_\pm(\delta_2)$ is what is meant by the fact that there are two possible conformal blocks. The differential equations satisfied by the four-point functions also follow from the existence of the null vector $|n_{1,2}\rangle$ since:
\[
\langle h|\Psi_{\delta_1}(x_1)\Psi_{\delta_2}(x_2)\,\bigl(-2L_{-2} + \tfrac{\kappa}{2}L_{-1}^2\bigr)|\omega\rangle = 0.
\]
Using again the intertwining relations (8) to move the Virasoro generators to the left gives:
\[
\Bigl(2\bigl(\ell^{\delta_1}_{-2}(x_1) + \ell^{\delta_2}_{-2}(x_2)\bigr) + \frac{\kappa}{2}\bigl(\ell^{\delta_1}_{-1}(x_1) + \ell^{\delta_2}_{-1}(x_2)\bigr)^2\Bigr)\,\langle h|\Psi_{\delta_1}(x_1)\Psi_{\delta_2}(x_2)|\omega\rangle = 0.
\]
These are partial differential equations which turn into second order ordinary differential equations, once the trivial scaling behaviors have been factorized. Some examples appear in the main text.

Consider now bulk intertwiners $\Phi_h(z,\bar z)$ of dimension $h$. The logic is the same as before except that the intertwining relations (9) are slightly more involved. In particular, $\Phi_h(z,\bar z)$ acting from one module to another one, $\Phi_h(z,\bar z) : V_r \longrightarrow V_l$, is also determined by its three-point function $\langle h_l|\Phi_h(z,\bar z)|h_r\rangle$ although, contrary to boundary operators, the intertwining relations (9) do not completely fix it. $\langle h_l|\Phi_h(z,\bar z)|h_r\rangle$ is generally more involved than simply a power law. As before, no constraint is imposed if $h$ is generic and $V_l$ and $V_r$ are Verma modules. Constraints arise if one of the two modules possesses null vectors. For simplicity, assume that $V_r = H_{1,2}$ and $|h_r\rangle = |\omega\rangle$ as almost surely everywhere in this paper, so that:
\[
\Phi_h(z,\bar z) : H_{1,2} \longrightarrow V_l .
\]
Then, since $|n_{1,2}\rangle$ vanishes in $H_{1,2}$,
\[
\langle h_l|\Phi_h(z,\bar z)\,\bigl(-2L_{-2} + \tfrac{\kappa}{2}L_{-1}^2\bigr)|\omega\rangle = 0
\]
or equivalently,
\[
\Bigl(2\bigl(\ell^{h}_{-2}(z) + \ell^{h}_{-2}(\bar z)\bigr) + \frac{\kappa}{2}\bigl(\ell^{h}_{-1}(z) + \ell^{h}_{-1}(\bar z)\bigr)^2\Bigr)\,\langle h_l|\Phi_h(z,\bar z)|\omega\rangle = 0. \tag{38}
\]
Generically this equation has two independent solutions, implying that the space of Virasoro intertwiners acting in $H_{1,2}$ is two dimensional. This is what is meant by the fact that there are two conformal blocks in the bulk.


Further constraints, implying fusion rules, arise if we demand that the image space $V_l$ also possesses null vectors. Bulk operators may be thought of as the compositions of two chiral intertwiners depending respectively on $z$ and $\bar z$.

Finally, besides the highest weight Verma modules $V_h$ one may also consider the contravariant representations $\widetilde V_h$ induced on their dual spaces. For generic $h$, $V_h$ and $\widetilde V_h$ are isomorphic as Virasoro modules, but they are not if $V_h$ possesses sub-modules. In particular $V_{h^\kappa_{1,2}}$ and $\widetilde V_{h^\kappa_{1,2}}$ are not isomorphic but $\widetilde V_{h^\kappa_{1,2}}$ contains a submodule isomorphic to $H_{1,2}$. Contravariant modules arise when defining conformal fields. Namely, one may show that the intertwiners $\Psi_\delta(x)$,
\[
\Psi_\delta(x) : V_h \longrightarrow \widetilde V_{h^\kappa_{1,2}},
\]
from the Verma module $V_h$ to the contravariant module $\widetilde V_{h^\kappa_{1,2}}$ exist, while there are no intertwiners from $V_h$ to $V_{h^\kappa_{1,2}}$, for generic $\delta$. If the fusion rules are satisfied, then $\Psi_\delta(x)$ maps $V_h$ into $H_{1,2}$, viewed as a submodule of $\widetilde V_{h^\kappa_{1,2}}$. This is the subtlety alluded to in footnote 6, Eq. (27).

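Appendix C: Derivation of Formula (37)

To derive Eq. (37), we manipulate
\[
\langle 0|\ \prod_{p_o}\Psi_{\delta_{p_o}}(x_{p_o})\ \prod_{j_o}\Phi_{h_{j_o}}(z_{j_o},\bar z_{j_o})\ \cdot\ \prod_{p_i}\Psi_{\delta_{p_i}}(x_{p_i})\ \prod_{j_i}\Phi_{h_{j_i}}(z_{j_i},\bar z_{j_i})\ \cdot\ G_t|\alpha\rangle .
\]
Remember that the "i(n)" arguments are inside D or I, and the "o(ut)" arguments sit outside. We conjugate by $G_t$ and then insert the identity operator $1 = \sum_\beta |\beta\rangle\langle\beta|$ to separate the "i(n)" and the "o(ut)" contributions, leading to:
\[
\sum_\beta\ \langle 0|\ \prod_{p_o}[f_t'(x_{p_o})]^{\delta_{p_o}}\,\Psi_{\delta_{p_o}}(f_t(x_{p_o}))\ \cdot\ \prod_{j_o}|f_t'(z_{j_o})|^{2h_{j_o}}\,\Phi_{h_{j_o}}(f_t(z_{j_o}),\bar f_t(z_{j_o}))\ |\beta\rangle
\]
\[
\times\ \langle\beta|\ \prod_{p_i}[f_t'(x_{p_i})]^{\delta_{p_i}}\,\Psi_{\delta_{p_i}}(f_t(x_{p_i}))\ \cdot\ \prod_{j_i}|f_t'(z_{j_i})|^{2h_{j_i}}\,\Phi_{h_{j_i}}(f_t(z_{j_i}),\bar f_t(z_{j_i}))\ |\alpha\rangle .
\]
The "o(ut)" part behaves nicely when $t \nearrow t_c$, but for the "i(n)" part we write $f_t(z) = \varepsilon_t^{(0)} + \varepsilon_t^{(1)}\check f_t(z)$ and use the covariance under similarities:
\[
\langle\beta|\ \prod_{p_i}[f_t'(x_{p_i})]^{\delta_{p_i}}\,\Psi_{\delta_{p_i}}(f_t(x_{p_i}))\ \cdot\ \prod_{j_i}|f_t'(z_{j_i})|^{2h_{j_i}}\,\Phi_{h_{j_i}}(f_t(z_{j_i}),\bar f_t(z_{j_i}))\ |\alpha\rangle
\]
\[
= (\varepsilon_t^{(1)})^{\delta_\beta-\delta_\alpha}\ \langle\beta|\ \Psi_{\delta_\alpha}\bigl(-\varepsilon_t^{(0)}/\varepsilon_t^{(1)}\bigr)\ \prod_{p_i}[\check f_t'(x_{p_i})]^{\delta_{p_i}}\,\Psi_{\delta_{p_i}}(\check f_t(x_{p_i}))\ \cdot\ \prod_{j_i}|\check f_t'(z_{j_i})|^{2h_{j_i}}\,\Phi_{h_{j_i}}(\check f_t(z_{j_i}),\bar{\check f}_t(z_{j_i}))\ |0\rangle ,
\]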


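where $\Psi_{\delta_\alpha}$ is the field such that $\Psi_{\delta_\alpha}(0)|0\rangle = |\alpha\rangle$. Upon insertion of the identity operator $1 = \sum_\eta |\eta\rangle\langle\eta|$ to separate $\Psi_{\delta_\alpha}$ from the "i(n)" fields, we find
\[
\langle\beta|\ \prod_{p_i}[f_t'(x_{p_i})]^{\delta_{p_i}}\,\Psi_{\delta_{p_i}}(f_t(x_{p_i}))\ \cdot\ \prod_{j_i}|f_t'(z_{j_i})|^{2h_{j_i}}\,\Phi_{h_{j_i}}(f_t(z_{j_i}),\bar f_t(z_{j_i}))\ |\alpha\rangle
\]
\[
= (\varepsilon_t^{(1)})^{\delta_\beta-\delta_\alpha}\ \sum_\eta\ \Bigl(-\frac{\varepsilon_t^{(0)}}{\varepsilon_t^{(1)}}\Bigr)^{\delta_\beta-\delta_\alpha-\delta_\eta}\ \langle\beta|\Psi_{\delta_\alpha}(1)|\eta\rangle
\]
\[
\cdot\ \langle\eta|\ \prod_{p_i}[\check f_t'(x_{p_i})]^{\delta_{p_i}}\,\Psi_{\delta_{p_i}}(\check f_t(x_{p_i}))\ \cdot\ \prod_{j_i}|\check f_t'(z_{j_i})|^{2h_{j_i}}\,\Phi_{h_{j_i}}(\check f_t(z_{j_i}),\bar{\check f}_t(z_{j_i}))\ |0\rangle .
\]
In fact $\langle\beta|\Psi_{\delta_\alpha}(1)|\eta\rangle = C^{\beta}_{\alpha\,\eta}$ is the structure constant of the operator algebra. When $t \nearrow t_c$, $\check f_t$ has a limit which maps conformally D onto the upper half plane, and we may represent it by an operator $\check G_{t_c}$. This leads to formula (37).

Acknowledgement. We thank John Cardy, Philippe Di Francesco, Antti Kupiainen, Vincent Pasquier and Jean-Bernard Zuber for discussions and explanations on conformal field theories and SLE$_\kappa$ processes. This research is supported in part by the European EC contract HPRN-CT-2002-00325.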

References

1. Bauer, M., Bernard, D.: SLE growth processes and conformal field theories. Phys. Lett. B543, 135–138 (2002)
2. Belavin, A., Polyakov, A., Zamolodchikov, A.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B241, 333–380 (1984)
3. Cardy, J.: Conformal invariance and surface critical behavior. Nucl. Phys. B240, 514–522 (1984); Cardy, J.: Boundary conditions, fusion rules and the Verlinde formula. Nucl. Phys. B324, 581–596 (1989)
4. Cardy, J.: Critical percolation in finite geometry. J. Phys. A25, L201–206 (1992)
5. Cardy, J.: Conformal invariance and percolation. arXiv:math-ph/0103018
6. Cardy, J.: Conformal invariance in percolation, self-avoiding walks and related problems. arXiv:cond-mat/0209638
7. David, F.: Mod. Phys. Lett. A3, 1651 (1988); Distler, J., Kawai, H.: Nucl. Phys. B321, 509 (1988)
8. Di Francesco, Ph., Mathieu, P., Senechal, D.: Conformal field theory. Berlin-Heidelberg-New York: Springer, 1996
9. Duplantier, B.: Conformally invariant fractals and potential theory. Phys. Rev. Lett. 84, 1363–1367 (2000)
10. Duplantier, B.: Higher conformal multifractality. And references therein, to appear in J. Stat. Phys.
11. Fortuin, C., Kasteleyn, P.: J. Phys. Soc. Japan 26, 11 (1969)
12. Knizhnik, V., Polyakov, A., Zamolodchikov, A.: Mod. Phys. Lett. A3, 819 (1988)
13. Lawler, G.: Introduction to the Stochastic Loewner Evolution. URL http://www.math.duke.edu/~jose/papers.html, and references therein
14. Lawler, G., Schramm, O., Werner, W.: Values of Brownian intersection exponents. I, II and III. arXiv:math.PR/9911084, math.PR/0003156, and math.PR/0005294
15. Nienhuis, B.: Critical behavior of two-dimensional spin models and charge asymmetry in the Coulomb gas. J. Stat. Phys. 34, 731–761 (1983)
16. Øksendal, B.: Stochastic differential equations. Berlin-Heidelberg-New York: Springer, 1998
17. Rohde, S., Schramm, O.: Basic properties of SLE. And references therein. arXiv:math.PR/0106036
18. Schramm, O.: Scaling limits of loop-erased random walks and uniform spanning trees. Israel J. Math. 118, 221–288 (2000)
19. Schramm, O.: A percolation formula. arXiv:math.PR/0107096
20. Smirnov, S.: Critical percolation in the plane: conformal invariance, Cardy's formula, scaling limits. C.R. Acad. Sci. Paris 333, 239–244 (2001)
21. Bauer, M., Bernard, D.: SLE martingales and the Virasoro algebra. Phys. Lett. B 557, 309–316 (2003)
22. Bauer, M., Bernard, D.: Reconstructing CFTs from SLEs data. In preparation

Communicated by A. Kupiainen

Commun. Math. Phys. 239, 523–547 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0885-6

Communications in

Mathematical Physics

End-to-End Distance from the Green's Function for a Hierarchical Self-Avoiding Walk in Four Dimensions

David C. Brydges (1,2), John Z. Imbrie (2)

1 University of British Columbia, Mathematics Department, #121-1984 Mathematics Road, Vancouver, B.C. V6T 1Z2, Canada. E-mail: [email protected]
2 Department of Mathematics, Kerchof Hall, P. O. Box 400137, University of Virginia, Charlottesville, VA 22904-4137, USA. E-mail: [email protected]

Received: 21 May 2002 / Accepted: 25 March 2003 Published online: 2 July 2003 – © Springer-Verlag 2003

Abstract: In [BEI92] we introduced a Levy process on a hierarchical lattice which is four dimensional, in the sense that the Green's function for the process equals $1/|x|^2$. If the process is modified so as to be weakly self-repelling, it was shown that at the critical killing rate (mass-squared) $\beta^c$, the Green's function behaves like the free one. Now we analyze the end-to-end distance of the model and show that its expected value grows as a constant times $\sqrt{T}\,\log^{1/8} T\,\bigl(1 + O\bigl(\frac{\log\log T}{\log T}\bigr)\bigr)$, which is the same law as has been conjectured for self-avoiding walks on the simple cubic lattice $\mathbb{Z}^4$. The proof uses inverse Laplace transforms to obtain the end-to-end distance from the Green's function, and requires detailed properties of the Green's function throughout a sector of the complex $\beta$ plane. These estimates are derived in a companion paper [BI02].

Contents
1. Introduction
1.1 Main results
1.2 Green's functions and the end-to-end distance
1.3 Additional remarks
2. End-to-End Distance for the Non-Interacting Walk
3. End-to-End Distance for the Self-Avoiding Walk
4. The Coupling Constant Recursion and Its Fixed Point

1. Introduction

1.1. Main results. Precise calculations by theoretical physicists have established, with the aid of some reasonable assumptions, that the end-to-end distance of a self-avoiding walk at time $T$ should be asymptotic to a constant times $T^{1/2}\log^{1/8} T$ as $T$ tends to infinity. See for example [BLZ73] and additional references in [MS93]. These arguments form a starting point for complete proofs. In our previous paper [BEI92] of this series, we started such a program but with two major simplifications. The first is to study processes which repel weakly as opposed to being strictly self-avoiding. The second is to replace the simple cubic lattice by another state space, a "hierarchical lattice," specifically designed to facilitate the use of the renormalization group. While the renormalization group is proposed for proving these results also on the simple cubic lattice, the method is considerably simpler to apply on the hierarchical lattice.

Research supported by NSF grant DMS-9706166 and NSERC of Canada.

The hierarchical lattice and some of its history have been described at length in [BEI92]. Here we summarize that discussion and specialize it to four dimensions. The hierarchical lattice $\mathcal{G}$ is the direct sum of infinitely many copies of $\mathbb{Z}_n$, where $n = L^4$ for some integer $L > 1$ which characterizes the lattice. A typical element $x \in \mathcal{G}$ has the form $x = (\ldots, x_2, x_1, x_0)$ with $x_i \in \mathbb{Z}_n = \{0, 1, \ldots, n-1\}$. All but finitely many elements of the sequence $x$ vanish. Let $x_{N-1}$ be the first element, reading from the left, which does not vanish. We define a $\mathcal{G}$-invariant ultra-metric on $\mathcal{G}$ by
\[
\mathrm{dist}(x,y) \equiv |x-y|, \qquad
|x| \equiv
\begin{cases}
0 & \text{if } x = (\ldots, 0),\\
L^N & \text{if } x = (\ldots, 0, x_{N-1}, x_{N-2}, \ldots, x_0).
\end{cases}
\tag{1.1}
\]
Let $\omega(t)$ be a Levy process on $\mathcal{G}$ such that
\[
P(\omega(t+dt) = y \mid \omega(t) = x) = C|x-y|^{-6}\,dt, \tag{1.2}
\]
if $x \ne y$. In [BEI92], Proposition 2.3, we show that, with the right choice of $C = C(L)$, the 0-potential (Green's function) for this process is given by
\[
G_0(x-y) \equiv \int_0^\infty dT\ E_x\bigl(\mathbb{1}_{\{\omega(T)=y\}}\bigr) =
\begin{cases}
\dfrac{1-L^{-4}}{1-L^{-2}} & \text{if } x = y;\\[2ex]
\dfrac{1}{|x-y|^2} & \text{if } x \ne y.
\end{cases}
\tag{1.3}
\]
The process $\omega(t)$ is "four dimensional" in the sense that its Green's function is $\frac{1}{|x-y|^2}$ for $x \ne y$. The slow decay in the law (1.2) is an ugly contrast with the simplicity of the nearest neighbor random walk on the simple cubic lattice, but it is a necessary price for a state space with an ultra-metric. (On such a space, a process with finite range jumps cannot leave the ball whose radius equals the range and which is centered on the starting position.) One consequence of (1.2) is that $\omega(t)$ does not have second moments. Thus we will measure end-to-end distance by $E_0(|\omega(T)|^\alpha)^{1/\alpha}$ with $0 < \alpha < 2$. At first one might expect that if this quantity is normalized by $\frac{1}{\sqrt{T}}$ it would have a limit as $T \to \infty$. Instead the behavior is asymptotically periodic in $\log T$, as the following proposition shows.

Proposition 1.1. Fix $L > 1$. Then for each $\alpha$, $0 < \alpha < 2$, and each $T \ge 0$,
\[
\lim_{m\to\infty}\ \frac{1}{\sqrt{L^{2m}T}}\ E_0\bigl(|\omega(L^{2m}T)|^\alpha\bigr)^{1/\alpha}
\]
exists and is a strictly positive, non-constant, bounded function $F_\alpha(T)$ which satisfies $F_\alpha(L^2T) = F_\alpha(T)$.
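The ultra-metric (1.1), which underlies the scaling $T \to L^2 T$ in Proposition 1.1, can be made concrete with a minimal sketch. The representation is an assumption made for illustration only: an element of the lattice is stored as a finite list of digits `[x_0, x_1, ...]` (lowest index first, trailing zeros omitted), and the group operation is componentwise arithmetic modulo $n = L^4$. The function names are hypothetical.

```python
def hier_norm(x, L):
    """Return |x| as in (1.1): 0 for the zero element, else L**N,
    where x_{N-1} is the highest-index non-zero digit of x."""
    top = max((i for i, digit in enumerate(x) if digit != 0), default=-1)
    return 0 if top < 0 else L ** (top + 1)


def hier_sub(x, y, L):
    """Componentwise difference x - y in the direct sum of copies of Z_n, n = L**4."""
    n = L ** 4
    m = max(len(x), len(y))
    xs = list(x) + [0] * (m - len(x))
    ys = list(y) + [0] * (m - len(y))
    return [(a - b) % n for a, b in zip(xs, ys)]


def hier_dist(x, y, L):
    """Return dist(x, y) = |x - y|."""
    return hier_norm(hier_sub(x, y, L), L)
```

Since the norm only records the position of the highest non-zero digit, the strong triangle inequality $\mathrm{dist}(x,y) \le \max\{\mathrm{dist}(x,z), \mathrm{dist}(z,y)\}$ holds automatically, and distances take values in $\{0, L, L^2, \ldots\}$ only.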


We postpone the proof of this proposition and turn our attention to the self-repelling process. Let us define $\tau(x) \equiv \tau^{(T)}(x)$ as the local time (up to time $T$) that the process spends at state $x$:
\[
\tau^{(T)}(x) \equiv \int_0^T ds\ \mathbb{1}_{\{\omega(s)=x\}}. \tag{1.4}
\]
Let
\[
\tau^2(\mathcal{G}) \equiv \int_{\mathcal{G}} dx\ \tau^2(x) = \int_0^T\!\!\int_0^T ds\,dt\ \mathbb{1}_{\{\omega(s)=\omega(t)\}}, \tag{1.5}
\]
where $dx$ is Haar measure, i.e., counting measure on $\mathcal{G}$. Clearly, $\tau^2(\mathcal{G})$ is a measure of how much time the process spends in self-intersecting. For each choice of a parameter $\lambda \ge 0$ we define a new "self repelling" process $\omega_\lambda$ whose expectation $E^T_{x,\lambda}$ is given by
\[
E^T_{x,\lambda}(\,\cdot\,) \equiv \frac{E_x\bigl(e^{-\lambda\tau^2(\mathcal{G})}\,(\,\cdot\,)\bigr)}{E_x\bigl(e^{-\lambda\tau^2(\mathcal{G})}\bigr)}. \tag{1.6}
\]
(Recall $\tau = \tau^{(T)}$.) We are able to control this expectation for $\lambda$ in a sector of the complex plane containing the positive reals, although the measure may no longer be real. The main result of this paper is

Theorem 1.2. Fix an integer $L \ge 2$ and choose any $0 < \alpha < 2$. If $\lambda$ is sufficiently small with $|\arg\lambda| < \frac{\pi}{3}$, then
\[
E^T_{0,\lambda}\bigl(|\omega(T)|^\alpha\bigr)^{1/\alpha} = \Bigl(1 + \frac{O(\lambda)}{\ell(T^{-1})}\Bigr)\, E_0\Bigl(\bigl|\omega\bigl(T\,\ell(T^{-1})^{1/4}\bigr)\bigr|^\alpha\Bigr)^{1/\alpha}, \tag{1.7}
\]
where with $T > 1$, $B \equiv 1 - L^{-4}$, the logarithmic factor is
\[
\ell(T^{-1}) = 1 + O(\lambda) + B\lambda\bigl(4\log T + \log|1 + \lambda\log T|\bigr). \tag{1.8}
\]
The proof relies on two results from [BI02] (henceforth referred to as paper II). See Proposition II.6.1 and Theorem II.1.1 in the next subsection.

Conventions. In this paper log refers to the base $L$ logarithm. While we can take any $L \ge 2$ as in [BEI92], for simplicity we restrict to the case where $L$ is a fixed, large integer, and $\lambda$ is taken to be sufficiently small, depending on $L$. Proposition II.6.1, in particular, is easier to state under these assumptions.

Theorem 1.2 describes how, if a weak repulsion is switched on, the effect relative to the process without repulsion is to rescale time by the slowly varying $\ell(T^{-1})^{1/4}$. Thus if we say that Proposition 1.1 gives a sense in which
\[
|\omega(T)| \approx c\sqrt{T}, \tag{1.9}
\]
then in an equivalent sense, for some $c(L,\lambda)$,
\[
|\omega_\lambda(T)| \approx c(L,\lambda)\,\sqrt{T}\,\log^{1/8} T\,\Bigl(1 + \frac{\log\log T}{32\log T} + O\Bigl(\frac{1}{\lambda\log T}\Bigr)\Bigr) \tag{1.10}
\]
as $T \to \infty$.


1.2. Green's functions and the end-to-end distance. We will be using the field-theoretic representation of the self-avoiding walk, see [BEI92]. In this representation, the length of the walk $T$ is integrated over, as in (1.3). We may define the Green's function as a Laplace transform as follows:
\[
G_\lambda(\beta,x) \equiv \int_0^\infty dT\ e^{-\beta T}\, E_0\bigl(e^{-\lambda\tau^2(\mathcal{G})}\,\mathbb{1}_{\{\omega(T)=x\}}\bigr). \tag{1.11}
\]
Then, after obtaining detailed estimates of the behavior of $G_\lambda(\beta,x)$ we can prove Theorem 1.2 by inverting the Laplace transform to recover fixed-$T$ quantities. This is done in Sect. 3.

To see how this works, consider a simple random walk on $\mathbb{Z}^d$, the process whose generator is the lattice Laplacian $\Delta$. For this model we have
\[
G(\beta,x) = \int_0^\infty dT\ e^{-\beta T}\, e^{T\Delta}(0,x) = (-\Delta+\beta)^{-1}(0,x).
\]
We may compute
\[
\sum_x G(\beta,x) = \frac{1}{p^2+\beta}\Big|_{p=0} = \beta^{-1}, \qquad
\sum_x x^2\, G(\beta,x) = -\sum_{j=1}^{d}\frac{d^2}{dp_j^2}\,\frac{1}{p^2+\beta}\Big|_{p=0} = 2d\beta^{-2}
\]
(the lattice expressions reduce to these at $p = 0$). Then we may use inverse Laplace transforms to recover the fixed-$T$ quantities. With $a > 0$ we find
\[
\sum_x P(T,x) = \int_{a-i\infty}^{a+i\infty}\frac{d\beta}{2\pi i}\ e^{\beta T}\beta^{-1} = 1, \qquad
\sum_x x^2 P(T,x) = \int_{a-i\infty}^{a+i\infty}\frac{d\beta}{2\pi i}\ e^{\beta T}\, 2d\beta^{-2} = 2dT.
\]
Here we use the residue theorem to evaluate these contour integrals. Now taking the ratio we see that the expected value of $\omega(T)^2$ is $2dT$. The plan is to show that this argument for the case $\lambda = 0$ applies to the case $\lambda \ne 0$ by approximating $G_\lambda(\beta,x)$ by $G(\beta_{\mathrm{eff},N(x)},x)$, where $\beta_{\mathrm{eff},N(x)}$ depends on $\lambda$, $\beta$ and weakly on $x$.

Returning to the model on the hierarchical lattice, note that in [BEI92], cf. p. 85, we studied
\[
U_\lambda(a,x) \equiv \lim_{\Lambda\nearrow\mathcal{G}}\ \int_0^\infty dT\ E_0\bigl(e^{-\lambda\tau^2(\Lambda) - a\tau(\Lambda)}\,\mathbb{1}_{\{\omega(T)=x\}}\bigr), \tag{1.12}
\]
where
\[
\tau^2(\Lambda) \equiv \int_\Lambda dx\ \tau^2(x) = \int_0^T\!\!\int_0^T ds\,dt\ \mathbb{1}_{\{\omega(s)=\omega(t)\in\Lambda\}}, \tag{1.13}
\]
\[
\tau(\Lambda) = \int_\Lambda dx\ \tau(x) = \int_0^T ds\ \mathbb{1}_{\{\omega(s)\in\Lambda\}}. \tag{1.14}
\]
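The simple random walk computation above gives $\sum_x x^2 P(T,x) = 2dT$. A minimal sketch for $d = 1$ checks this directly in the time domain, without the Laplace transform: it integrates $\partial_T P = \Delta P$ by forward Euler steps on a finite window of $\mathbb{Z}$, starting from a point mass at the origin. The step size, the window size and the function name are illustrative choices. For this linear equation each Euler step raises the second moment by exactly $2\,dt$, so the check is exact up to rounding as long as no mass reaches the edge of the window.

```python
def srw_1d_mass_and_second_moment(T, dt=0.005):
    """Integrate dP/dT = P(x+1) + P(x-1) - 2 P(x) on a window of Z.

    Returns (total mass, second moment) at time T for a walk started at 0.
    The window is wide enough that no mass reaches its boundary.
    """
    steps = round(T / dt)
    half_width = steps + 2          # mass moves at most one site per Euler step
    n = 2 * half_width + 1
    p = [0.0] * n
    p[half_width] = 1.0             # point mass at the origin
    for _ in range(steps):
        q = p[:]
        for i in range(1, n - 1):
            q[i] = p[i] + dt * (p[i - 1] + p[i + 1] - 2.0 * p[i])
        p = q
    mass = sum(p)
    second_moment = sum((i - half_width) ** 2 * p[i] for i in range(n))
    return mass, second_moment
```

For example, `srw_1d_mass_and_second_moment(0.5)` should return a total mass of 1 and a second moment equal to $2T = 1$, in agreement with the inverse Laplace transform of $2d\beta^{-2}$ for $d = 1$.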


Hence the difference between $U_\lambda$ and $G_\lambda$ lies in whether $\lim_{\Lambda\nearrow\mathcal{G}}$ lies inside or outside $\int dT\, E_0$. In [BEI92] it was shown that there exists, for $\lambda$ small, a special value $a_c(\lambda)$ with the property that
\[
U_\lambda(a_c(\lambda),x) \approx \frac{\mathrm{Const.}}{|x|^2}, \qquad x \to \infty.
\]
Note that at $\lambda = 0$, $a_c(\lambda) = 0$ by (1.3). It is a by-product of this paper that this $a_c(\lambda)$ is the same as $\beta^c(\lambda)$ which appears in the next proposition and that $G_\lambda = U_\lambda$ for $\beta$ in a sector to the right of $\beta^c(\lambda)$.

We study the interacting Green's function $G_\lambda(\beta,x)$ for $(\lambda,\beta)$ in certain complex domains. Let us introduce the notation
\[
\begin{aligned}
D_\beta &= \{\beta \ne 0 : |\arg\beta| < b_\beta\},\\
D_\lambda &= \{\lambda : 0 < |\lambda| < \delta \text{ and } |\arg\lambda| < b_\lambda\},\\
\bar D_\beta &= \{\beta \ne 0 : |\arg\beta| < b_\beta + \tfrac14 b_\lambda + \epsilon\},\\
\bar D_\lambda &= \{\lambda : 0 < |\lambda| < \bar\delta \text{ and } |\arg\lambda| < b_\lambda + \epsilon\},\\
B(\rho) &= \{\beta : |\beta| < \rho\},\\
\bar D_\beta(\rho) &= \bar D_\beta + B(\rho),
\end{aligned}
\tag{1.15}
\]
where $b_\beta > 0$, $b_\lambda > 0$ are fixed so as to satisfy $2b_\beta + \frac32 b_\lambda < \frac{3\pi}{2}$. In particular, this means $b_\lambda < \pi$, $b_\beta < \frac{3\pi}{4}$. The number $\epsilon$ is fixed and small enough so that $2(b_\beta+\epsilon) + \frac32(b_\lambda+\epsilon) < \frac{3\pi}{2}$ also. The number $\bar\delta$ is chosen to satisfy the hypotheses of Proposition II.6.1 below, and $\delta < \bar\delta$ is chosen after (depending on $b_\lambda$). $\rho = \frac12$ by default. In order to invert the Laplace transform with good bounds we shall require $b_\beta > \frac{\pi}{2}$, and so $b_\lambda < \frac{\pi}{3}$. For example, $(b_\beta, b_\lambda) = \bigl(\frac{5\pi}{8}, \frac{\pi}{8}\bigr)$ defines an acceptable pair of domains $(D_\beta, D_\lambda)$. As $b_\beta$, $b_\lambda$, $\epsilon$, $L$ are taken as fixed, we will usually not make explicit the dependence of constants on these parameters.

Remark. A somewhat larger domain for $(\beta,\lambda)$ defined by the conditions $|2\arg\beta - \frac32\arg\lambda| < \frac{3\pi}{2}$, $|\arg\lambda| < \pi$, $|\arg\beta| < \pi$ could be used but for simplicity we have taken domains which are in product form.

Our main theorem for $G_\lambda$ refers to a sequence $(\beta_j,\lambda_j)_{j=0,1,\ldots}$ generated by a recursion defined in paper II [BI02]. The following proposition (proven in paper II) gives all the properties of the recursion that will be needed in this paper.

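Proposition II.6.1. Let $(\beta_0,\lambda_0) = (\beta,\lambda)$ be in the domain $\bar D_\beta(\frac12)\times\bar D_\lambda$ with $\bar\delta$ sufficiently small. The sequence $(\beta_j,\lambda_j)_{j=0,1,\ldots,M}$ is such that
\[
\lambda_{j+1} = \lambda_j - \frac{8B\lambda_j^2}{(1+\beta_j)^2} + \epsilon_{\lambda,j}, \qquad
\beta_{j+1} = L^2\Bigl(\beta_j + \frac{2B}{1+\beta_j}\,\lambda_j\Bigr) + \epsilon_{\beta,j}, \tag{1.16}
\]
where $\epsilon_{\lambda,j}$, $\epsilon_{\beta,j}$ are analytic functions of $(\beta,\lambda)$ satisfying
\[
|\epsilon_{\lambda,j}| \le c_L|\lambda_j|^3\,|1+\beta_j|^{-1}, \qquad
|\epsilon_{\beta,j}| \le c_L|\lambda_j|^2\,|1+\beta_j|^{-2}. \tag{1.17}
\]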


Here $B = 1 - L^{-4}$, and $M$ is the first integer such that $(\beta_M,\lambda_M)$ is not in the domain $\bar D_\beta(\frac12)\times\bar D_\lambda$. If no such integer exists, then $M = \infty$.

The next proposition constructs the "stable manifold" $\beta^c(\lambda)$ for the recursion above.

Proposition 1.3. For each $\lambda \in D_\lambda$ there exists $\beta^c(\lambda) = O(\lambda)$ with the property that $\beta^c_n \equiv \beta_n(\beta^c(\lambda)) = O(\lambda_n(\beta^c(\lambda))) \to 0$ as $n \to \infty$. Furthermore, if $\beta \in D_\beta + \beta^c(\lambda)$, then $\beta_n \in D_\beta + \beta^c_n$ and $\lambda_n \in D_\lambda$ for all $n$.

This $\beta^c(\lambda)$ is called the critical killing rate. (It is negative if $\lambda > 0$.) We define new variables $\hat\beta = \beta - \beta^c(\lambda)$ and $\hat\beta_j = \beta_j - \beta^c_j$.

We will relate the interacting Green's function $G_\lambda(\beta,x)$, $\lambda \ne 0$, to the free Green's function $G_0(\hat\beta,x)$. As we shall see, $G_0(\beta,x)$ is analytic in $\beta$ except for a sequence of poles which lie in the interval $[-1,0)$ and which accumulate at zero. For small $|x|$, that is, $|\beta|\,|x|^2 < 1$, it resembles $|x|^{-2}$. For large $|x|$, that is, $|\beta|\,|x|^2 \ge 1$, it decays as $|x|^{-6}$. Thus $G_0$ has "range" $\beta^{-1/2}$. Our next result gives the detailed behavior of $G_0$ (see Sect. 2 for the proof).

Proposition 1.4. The following statements hold for all $\beta \in \bar D_\beta$.

(1)
\[
G_0(\beta,x) = \sum_{j\ge0} L^{-2j}\,\frac{(1-L^{-4})(1-L^{-2-2j})}{|x|^2\,(1+\beta|x|^2L^{-2})(1+\beta|x|^2L^{2j})}, \qquad x \ne 0.
\]
(2)
\[
G_0(\beta,0) = \sum_{j\ge0} L^{-2j}\,\frac{1-L^{-4}}{1+L^{2j}\beta}.
\]
(3) There are positive ($L$-dependent) constants $c_1$, $c_2$ such that
\[
\frac{c_1}{|x|^2(1+|\beta|\,|x|^2)^2} \le |G_0(\beta,x)| \le \frac{c_2}{|x|^2(1+|\beta|\,|x|^2)^2}, \quad x \ne 0, \qquad
\frac{c_1}{1+|\beta|} \le |G_0(\beta,0)| \le \frac{c_2}{1+|\beta|}.
\]

The next theorem shows how well $G_\lambda$ may be approximated by $G_0$. Provided an effective $\beta$ is used for $G_0$, the error in the approximation is proportional to an effective $\lambda$. The proof is based on the renormalization group and the field theory representation for $G_\lambda$. It will be treated in paper II.

Theorem II.1.1. Let $\lambda \in D_\lambda$ with $\delta$ sufficiently small. Then $G_\lambda(\beta,x)$ is analytic in $\beta$ in the domain $D_\beta + \beta^c(\lambda)$ and
\[
|G_\lambda(\beta,x) - G_0(\beta_{\mathrm{eff},N(x)},x)| \le O(\lambda_{N(x)})\,|G_0(\beta_{\mathrm{eff},N(x)},x)|. \tag{1.18}
\]

Here $N(x) = \log|x|$ for $x \ne 0$, $N(0) = 0$, and $\beta_{\mathrm{eff},j} = L^{-2j}\hat\beta_j$.

As the behavior of $G_0$ is described accurately in Proposition 1.4, this theorem gives a correspondingly accurate picture of $G_\lambda$. We may interpret $\beta_{\mathrm{eff},N(x)}$ as the value of $\hat\beta$ which would evolve to $\hat\beta_{N(x)}$ after $N(x)$ steps of the trivial ($\lambda = 0$) recursion $\hat\beta_{j+1} = L^2\hat\beta_j$. The integer $N(x)$ is the number of steps needed to "bring 0 and $x$ together" when scaling and decimating the lattice as in [BEI92, p. 99].

The next proposition shows that $\lambda_{N(x)}$ is something like $N(x)^{-1} = (\log|x|)^{-1}$ for $|x| \le \hat\beta^{-1/2}$. Hence the difference $G_\lambda - G_0$ in (1.18) decays more rapidly than either term by itself (at least out to the range $\approx \hat\beta^{-1/2}$).

Proposition 1.5. For all $(\hat\beta,\lambda) \in D_\beta\times D_\lambda$, the following statements hold for $k = 0, 1, 2, \ldots$:

(1) $(\hat\beta_k,\lambda_k) \in D_\beta\times D_\lambda$.

(2) Let $k_{\hat\beta}$ be the largest $k$ such that $|\hat\beta_k| \le 1$ (if no such integer exists, then $k_{\hat\beta} = 0$). Then
\[
k_{\hat\beta} = O(1) + \tfrac12\log(1+|\hat\beta|^{-1}) + \tfrac18\log\bigl|1 + 4B\lambda\log(1+|\hat\beta|^{-1})\bigr|.
\]
(3) Let $\hat k = \min\{k, k_{\hat\beta}\}$. Then
\[
\Bigl|\lambda_k - \frac{\lambda}{1+8B\lambda\hat k}\Bigr| \le c_1\,\Bigl|\frac{\lambda}{1+8B\lambda\hat k}\Bigr|^2\,\bigl(1 + \ln(1+|\lambda|\hat k)\bigr).
\]
(4) Let $\ell_k(\hat\beta)^{-1/4} \equiv \beta_{\mathrm{eff},k}/\hat\beta = \hat\beta_k L^{-2k}/\hat\beta$. Then
\[
\ell_k(\hat\beta) = (1+8B\lambda\hat k)\,e^{O(\lambda)}.
\]
(5) $\lambda_{\hat\beta} \equiv \lim_{k\to\infty}\lambda_k$ and $\beta_{\mathrm{eff},\infty} \equiv \lim_{k\to\infty}\beta_{\mathrm{eff},k}$ exist, as does
\[
\ell(\hat\beta) \equiv \lim_{k\to\infty}\ell_k(\hat\beta) = (\beta_{\mathrm{eff},\infty}/\hat\beta)^{-4} = (1+8B\lambda k_{\hat\beta})\,e^{O(\lambda)}.
\]
(6) Let $|\hat\beta T| \ge 1$. Then
\[
1 + O(\lambda) \le \Bigl|\frac{\lambda_k(\hat\beta)}{\lambda_k(T^{-1})}\Bigr|,\ \Bigl|\frac{\ell_k(T^{-1})}{\ell_k(\hat\beta)}\Bigr| \le 1 + O(\lambda)\bigl(1+\log|\hat\beta T|\bigr).
\]
(7)
\[
\Bigl|\hat\beta\,\frac{d\ln\ell(\hat\beta)}{d\hat\beta}\Bigr| \le c_2|\lambda_{\hat\beta}|.
\]
(8)
\[
\Bigl|\ln\frac{\ell_k(\hat\beta)}{\ell(\hat\beta)}\Bigr| \le O(\lambda_k)\bigl(1 + \log(1+|\hat\beta_k|^{-1})\bigr).
\]

This proposition plays a role in the proofs of Theorem 1.2 and of Theorem II.1.1. It will be proven along with Proposition 1.3 in Sect. 4.

It turns out that we can use $\beta_{\mathrm{eff},\infty}$ in place of $\beta_{\mathrm{eff},N(x)}$ in Theorem II.1.1, as the following result shows (see Sect. 2 for a proof).

Corollary 1.6. Under the same assumptions as in Theorem II.1.1,
\[
|G_\lambda(\beta,x) - G_0(\beta_{\mathrm{eff},\infty},x)| \le O(\lambda_{N(x)})\,|G_0(\beta_{\mathrm{eff},\infty},x)|. \tag{1.19}
\]
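The logarithmic decay of the running coupling in Proposition 1.5(3) can be illustrated with a minimal sketch under simplifying assumptions that are not those of the proposition itself: the error terms of the recursion (1.16) are dropped, and $\beta_j$ is set to zero in the $\lambda$ equation, which is a caricature of the critical trajectory where $\beta_j = O(\lambda_j)$. With these simplifications the recursion reads $\lambda_{j+1} = \lambda_j - 8B\lambda_j^2$. The function names are illustrative.

```python
def lambda_flow(lam0, L, steps):
    """Iterate lam_{j+1} = lam_j - 8 B lam_j^2 with B = 1 - L^-4.

    Simplified form of (1.16): error terms dropped and beta_j set to 0.
    Returns the list [lam_0, lam_1, ..., lam_steps].
    """
    B = 1.0 - L ** -4
    lam = lam0
    trajectory = [lam]
    for _ in range(steps):
        lam = lam - 8.0 * B * lam * lam
        trajectory.append(lam)
    return trajectory


def lambda_leading(lam0, L, k):
    """Leading behaviour lam0 / (1 + 8 B lam0 k) appearing in Proposition 1.5(3)."""
    B = 1.0 - L ** -4
    return lam0 / (1.0 + 8.0 * B * lam0 * k)
```

Since $1/\lambda_{j+1} = 1/\lambda_j + 8B + O(\lambda_j)$ for this simplified recursion, the iterates stay slightly below `lambda_leading`, with a relative gap that is logarithmic in $k$, which is the shape of the bound in Proposition 1.5(3).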


1.3. Additional remarks. In this paper we use a strategy of analyzing inverse Laplace transforms in order to obtain asymptotics as $T \to \infty$. As a by-product we find it necessary to prove the needed Green's function estimates throughout a sector of the complex $\beta$-plane. For some models, it may be inconvenient to have complex coupling constants, so a natural question to ask is whether there are other ways of relating the asymptotics as $\beta$ tends to zero to the asymptotics as $T \to \infty$.

Tauberian theorems [Fel71] provide one answer, albeit a limited one. Working on the real axis, one can show that if $G(\beta)$ is the Laplace transform of a measure $\mu$, and $G$ varies regularly at 0, then $\mu\{[0,T]\}$ varies regularly at infinity and has an asymptotic behavior dual to the behavior of $G$ near zero. So, for example, if $G(\beta) \sim \beta^{-a}(\log\beta)^b$, then $\mu\{[0,T]\} \sim T^a(\log T)^b$.

The first problem we encounter is that in the hierarchical model, none of the quantities we work with behave regularly as $\beta \to 0$ or as $T \to \infty$. We need only look at Proposition 1.1 to see the type of behavior characteristic of a hierarchical model: asymptotically periodic in $\log T$ or $\log\beta$. One could perhaps get around this feature and prove a Tauberian theorem tailored to this situation, or work in a non-hierarchical model. However, there is still the problem of relating the asymptotics of $\mu$ to the asymptotics of the end-to-end distance. Tauberian theorems really only relate one type of average (the Laplace transform) to another ($\mu\{[0,T]\}$). To obtain results about the fixed $T$ ensemble of walks, one needs to learn about the density for $\mu$. In the situation at hand, $E^T_{0,\lambda}(|\omega(T)|^\alpha)$ is actually a ratio of two quantities, $\int dx\, P_\lambda(T,x)|x|^\alpha$ and $\int dx\, P_\lambda(T,x)$. These are inverse Laplace transforms of
\[
\int dx\ G_\lambda(\beta,x)\,|x|^{\tilde\alpha} \sim \bigl(\beta\,\ell(\beta)^{-1/4}\bigr)^{-1-\tilde\alpha/2}
\]
with $\tilde\alpha = \alpha$ or 0, respectively. Thus while the measures behave as $\bigl(T\,\ell(T^{-1})^{1/4}\bigr)^{1+\tilde\alpha/2}$, we need to know that the densities behave as $1/T$ times this, or $T^{\tilde\alpha/2}\,\ell(T^{-1})^{(2+\tilde\alpha)/8}$. Only with this information can we take the ratio and deduce that
\[
E^T_{0,\lambda}(|\omega(T)|^\alpha) \sim \bigl(T^{1/2}\,\ell(T^{-1})^{1/8}\bigr)^\alpha,
\]
as described in Theorem 1.2. Without further assumptions, such as monotonicity, one cannot conclude much about the density knowing only the behavior of the measure. One can say that if the density has reasonable asymptotics as $T \to \infty$, then they follow that of $\mu$. It should be clear, however, that working in the complex plane provides the most complete picture of the relation between the Green's function and the end-to-end distance.

Related work. Iagolnitzer and Magnen [IM94] have given detailed estimates on the decay of the critical Green's function for the Edwards model of weakly self-repelling polymers in four dimensions. Golowich [Gol02] extended their method into the region $D_\beta\setminus B(\varepsilon)$ with $\varepsilon > 0$. Hara and Slade [HS92] have proved that the strictly self-avoiding walk on a simple cubic lattice $\mathbb{Z}^d$ for $d \ge 5$ has an end-to-end distance that is asymptotic to a constant times $\sqrt{T}$ and a scaling limit that is Brownian motion. Golowich and Imbrie [GI95] obtained results on the critical behavior of the broken phase ($\beta < \beta^c(\lambda)$) of the hierarchical self-avoiding walk in four dimensions. Hattori and Tsuda [HT02] have detailed results on self-avoiding walks on the Sierpiński gasket.


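2. End-to-End Distance for the Non-Interacting Walk

In this section we prove Proposition 1.4 (behavior of $G_0$) and then use the Laplace inversion formula to obtain the end-to-end distance and prove Proposition 1.1. We also establish Corollary 1.6.

Proof of Proposition 1.4. From (2.15) of [BEI92] we have the following formula for $d = 4$:

\[ G_0(\beta, x) = \sum_{k \ge 0} L^{-2k}\, \frac{1}{1 + L^{2k}\beta} \left( \mathbf{1}_{\{|x/L^k| = 0\}} - L^{-4}\, \mathbf{1}_{\{|x/L^k| \le L\}} \right). \tag{2.1} \]

For $x \ne 0$ this can be written as

\[ G_0(\beta, x) = \sum_{k \ge N-1} L^{-2k}\, \frac{1 - L^{-4}}{1 + L^{2k}\beta} - L^{-2(N-1)}\, \frac{1}{1 + L^{2(N-1)}\beta}, \tag{2.2} \]

where $N = N(x) = \log |x|$. (Recall that $x \to x/L$ means shifting the components of $x$ so that $x/L \equiv (\ldots, 0, 0, x_{N-1}, x_{N-2}, \ldots, x_1)$.)

We manipulate this expression in order to manifest cancellations between the two terms. Writing $\frac{1}{1+a} = \frac{1}{a} - \frac{1}{a(1+a)}$ with $a = L^{2k}\beta$ and using $\sum_{k \ge N-1} L^{-4k}(1 - L^{-4}) = L^{-4(N-1)}$ twice, we obtain

\[ G_0(\beta, x) = -\sum_{k \ge N-1} L^{-4k}\, \frac{1 - L^{-4}}{\beta(1 + L^{2k}\beta)} + L^{-4(N-1)}\, \frac{1}{\beta(1 + L^{2(N-1)}\beta)} \]
\[ \phantom{G_0(\beta, x)} = \sum_{k \ge N-1} L^{-4k}(1 - L^{-4}) \left[ \frac{1}{\beta(1 + L^{2(N-1)}\beta)} - \frac{1}{\beta(1 + L^{2k}\beta)} \right]. \]

Clearing denominators and using $|x| = L^N$, $j = k - N$, we obtain

\[ G_0(\beta, x) = \sum_{k \ge N-1} L^{-4k}(1 - L^{-4})\, \frac{L^{2k} - L^{2(N-1)}}{(1 + L^{2(N-1)}\beta)(1 + L^{2k}\beta)} \]
\[ \phantom{G_0(\beta, x)} = \sum_{j \ge 0} L^{-2j}\, \frac{(1 - L^{-4})(1 - L^{-2-2j})}{|x|^2 (1 + \beta|x|^2 L^{-2})(1 + \beta|x|^2 L^{2j})} \]
\[ \phantom{G_0(\beta, x)} = \frac{(1 - L^{-4})(1 - L^{-2})}{|x|^2 (1 + \beta|x|^2 L^{-2})(1 + \beta|x|^2)} \left[ 1 + \sum_{j \ge 1} L^{-2j}\, \frac{1 - L^{-2-2j}}{1 - L^{-2}}\, \frac{1 + \beta|x|^2}{1 + \beta|x|^2 L^{2j}} \right], \tag{2.3} \]

which leads to Proposition 1.4(1). For (2) we set $x = 0$ in (2.1):

\[ G_0(\beta, 0) = \sum_{k \ge 0} L^{-2k}\, \frac{1 - L^{-4}}{1 + L^{2k}\beta} = \frac{1 - L^{-4}}{1 + \beta} \left[ 1 + \sum_{k \ge 1} L^{-2k}\, \frac{1 + \beta}{1 + L^{2k}\beta} \right]. \tag{2.4} \]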
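The algebra leading from (2.2) to (2.3) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes $|x| = L^N$ with $N \ge 1$, truncates the infinite sums (the neglected tails are geometrically small), and uses the hypothetical function names `g0_series`, `g0_closed` and `g0_factored` for (2.2), the second line of (2.3) and the last line of (2.3), respectively.

```python
def g0_series(beta, N, L=2, terms=60):
    """Truncated form of (2.2) for |x| = L**N, N >= 1."""
    s = sum(L ** (-2 * k) * (1 - L ** -4) / (1 + L ** (2 * k) * beta)
            for k in range(N - 1, N + terms))
    return s - L ** (-2 * (N - 1)) / (1 + L ** (2 * (N - 1)) * beta)


def g0_closed(beta, N, L=2, terms=60):
    """Truncated form of the second line of (2.3)."""
    x2 = L ** (2 * N)  # |x|^2
    return sum(
        L ** (-2 * j) * (1 - L ** -4) * (1 - L ** (-2 - 2 * j))
        / (x2 * (1 + beta * x2 * L ** -2) * (1 + beta * x2 * L ** (2 * j)))
        for j in range(terms)
    )


def g0_factored(beta, N, L=2, terms=60):
    """Truncated form of the last line of (2.3): prefactor times bracket."""
    x2 = L ** (2 * N)
    prefactor = ((1 - L ** -4) * (1 - L ** -2)
                 / (x2 * (1 + beta * x2 * L ** -2) * (1 + beta * x2)))
    bracket = 1 + sum(
        L ** (-2 * j) * (1 - L ** (-2 - 2 * j)) / (1 - L ** -2)
        * (1 + beta * x2) / (1 + beta * x2 * L ** (2 * j))
        for j in range(1, terms)
    )
    return prefactor * bracket
```

The three functions accept complex `beta` as well, so the agreement can also be explored away from the positive real axis, as long as no denominator vanishes.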


D.C. Brydges, J.Z. Imbrie

Proceeding to (3), we bound (2.3) from above, noting that any $\chi \in D_\beta$ has $|\arg \chi| < b_\beta + \frac14 b_\lambda + \epsilon < \frac{3\pi}{4}$ and hence satisfies $|1 + \chi| > 2^{-1/2}$. Thus, both $(1 + \beta|x|^2)$ and $(1 + \beta|x|^2 L^{-2})$ are bounded below by $c^{-1} L^{-2} (1 + |\beta|\,|x|^2)$, and in addition,

\[ \left| \frac{1 + \beta|x|^2}{1 + \beta|x|^2 L^{2j}} \right| \le c, \]

uniformly in $\beta$, $|x|$, $j$, $L$. Hence the sum on $j$ converges, and the desired bound $c^2 L^2 |x|^{-2} (1 + |\beta|\,|x|^2)^{-2}$ as in (3) follows. For the lower bound, we need only observe that for each $j$, $\arg(1 + \beta|x|^2 L^{2j})$ lies between $\arg(1 + \beta|x|^2)$ and $\arg \beta$ (and all three have the same sign). Hence each factor $(1 + \beta|x|^2)/(1 + \beta|x|^2 L^{2j})$ is in $D_\beta$ and on the same side of the real axis. So any positive linear combination of these factors is in $D_\beta$. Using again the fact that $|1 + \chi| \ge 2^{-1/2}$ for any $\chi \in D_\beta$, we obtain a lower bound of the same form as the upper bound. Similar arguments can be applied to the second line in (2.4), and the desired bounds on $G_0(\beta, 0)$ follow.

We need to control derivatives of $G_0(\beta, x)$ as well.

Proposition 2.1. If $\beta \in D_\beta$, then for $x \ne 0$,

\[ \left| \beta \frac{d}{d\beta} G_0(\beta, x) \right| \le \frac{c\,u\,(1 + \log(1 + u^{-1}))}{|x|^2 (1 + u)^3}, \tag{2.5} \]

where $u = |\beta|\,|x|^2$. For $x = 0$, put $v = |\beta|$ and then

\[ \left| \beta \frac{d}{d\beta} G_0(\beta, 0) \right| \le \frac{c\,v\,(1 + \log(1 + v^{-1}))}{(1 + v)^2}. \tag{2.6} \]

Note that (2.5) improves the naive bound $c|x|^{-2}(1 + u)^{-2}$ that would follow from Proposition 1.4(3). This is possible because the Green’s function is relatively insensitive to changes in $\beta$ for smaller values of $|x|$.

Proof. Consider what happens when $\beta \frac{d}{d\beta}$ is applied to the right-hand side of Proposition 1.4(1). Wherever the derivative acts, a new factor $\frac{uL^{2j}}{1 + uL^{2j}}$ appears after taking absolute values. When $j = -1$, this is a constant times $\frac{u}{1+u}$ times our previous estimate, $c|x|^{-2}(1 + u)^{-2}$. For $j \ge 0$, the $L^{-2j}$ which previously controlled the sum on $j$ is cancelled out, leaving a bound

\[ \sum_{j \ge 0} \frac{c\,u}{(1 + u)(1 + uL^{2j})^2}. \]

If $u > 1$ this is still a geometric series, but for $u < 1$ there are $O(1 + \log(1 + u^{-1}))$ terms of approximately the same magnitude before convergence sets in, and this leads to the form of the bound (2.5).

The same steps can be applied when estimating $\beta \frac{d}{d\beta} G_0(\beta, 0)$. Differentiation of (2.4) yields

\[ \left| \beta \frac{d}{d\beta} G_0(\beta, 0) \right| \le \sum_{j \ge 0} \frac{c}{(1 + vL^{2j})^2}, \]

and proceeding as above we obtain (2.6), and the proof is complete.
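The counting argument in this proof (roughly $\log(1 + u^{-1})$ terms of comparable size when $u < 1$) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes $L = 2$, sets the constant $c$ to 1, truncates the sum at a hypothetical cutoff `terms`, and compares the sum over $j \ge 0$ with the $u$-dependent part of (2.5), leaving the common factor $|x|^{-2}$ aside.

```python
import math


def derivative_sum(u, L=2, terms=120):
    """Sum over j >= 0 of u / ((1 + u) (1 + u L^{2j})^2), with c = 1."""
    return sum(u / ((1 + u) * (1 + u * L ** (2 * j)) ** 2) for j in range(terms))


def log_bound(u):
    """The u-dependent part of (2.5): u (1 + log(1 + 1/u)) / (1 + u)^3."""
    return u * (1 + math.log(1 + 1 / u)) / (1 + u) ** 3


def ratio(u, L=2):
    """Ratio of the sum to the bound; it should stay of order one."""
    return derivative_sum(u, L) / log_bound(u)
```

Evaluating `ratio(u)` over many decades of `u` shows a quantity that stays bounded above and below, which is the content of the step from the sum to (2.5).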


Proof of Corollary 1.6. Use the bound from Proposition II.6.1,

\[ |G_\lambda(\beta, x) - G_0(\beta_{\mathrm{eff},N(x)}, x)| \le O(\lambda_{N(x)})\,|G_0(\beta_{\mathrm{eff},N(x)}, x)|. \]

Consider first $x \ne 0$ and let $N = N(x)$. We may apply Proposition 1.4 to the right-hand side. Proposition 1.5(4) shows that $|\ell_k(\hat\beta)|$ is essentially an increasing function of $k$. Hence $|\ell_N(\hat\beta)| \le c|\ell(\hat\beta)|$, so that $|\beta_{\mathrm{eff},N}| \ge c^{-1}|\beta_{\mathrm{eff},\infty}|$ and

\[ \frac{O(\lambda_N)}{|x|^2 (1 + |\beta_{\mathrm{eff},N}|\,|x|^2)^2} \le \frac{O(\lambda_N)}{|x|^2 (1 + |\beta_{\mathrm{eff},\infty}|\,|x|^2)^2} \le O(\lambda_N)\,|G_0(\beta_{\mathrm{eff},\infty}, x)|. \tag{2.7} \]

We also need to estimate

\[ |G_0(\beta_{\mathrm{eff},N}, x) - G_0(\beta_{\mathrm{eff},\infty}, x)| = \left| \int_{\beta_{\mathrm{eff},N}}^{\beta_{\mathrm{eff},\infty}} \frac{d\tilde\beta}{\tilde\beta}\, \tilde\beta \frac{d}{d\tilde\beta} G_0(\tilde\beta, x) \right| \le O(\lambda_N)\,(1 + \log(1 + |\hat\beta_N|^{-1}))\, \sup_{\tilde\beta} \frac{u_{\tilde\beta}\,(1 + \log(1 + u_{\tilde\beta}^{-1}))}{|x|^2 (1 + u_{\tilde\beta})^3}, \]

where we have used Proposition 1.5(8) and (2.5) and put $u_{\tilde\beta} = |\tilde\beta|\,|x|^2$. Let $u_N = |\hat\beta_N| = |\beta_{\mathrm{eff},N}|\,|x|^2 \ge c^{-1} u_{\tilde\beta}$. Assuming $u_N < 1$, we can use monotonicity to replace $u_{\tilde\beta}$ with $u_N$ in the sup. The result is

\[ |G_0(\beta_{\mathrm{eff},N}, x) - G_0(\beta_{\mathrm{eff},\infty}, x)| \le \frac{O(\lambda_N)}{|x|^2 (1 + u_N)^2} \cdot \frac{u_N\,(1 + \log(1 + u_N^{-1}))^2}{1 + u_N}. \]

The second factor on the right-hand side is uniformly bounded, and the first factor is bounded by (2.7). If $u_N \ge 1$, then $\log(1 + |\hat\beta_N|^{-1}) \le c$, $u_{\tilde\beta}(1 + u_{\tilde\beta})^{-1}(1 + \log(1 + u_{\tilde\beta}^{-1})) \le c$, and $(1 + u_{\tilde\beta})^{-2} \le (1 + c^{-1}|\beta_{\mathrm{eff},\infty}|\,|x|^2)^{-2}$, so we are still able to obtain the bound of (2.7). This establishes (1.19) for $x \ne 0$. The case $x = 0$ can be handled similarly. When (2.6) is combined with

\[ \left| \ln \frac{\beta}{\beta_{\mathrm{eff},\infty}} \right| \le O(\lambda)\,\big(1 + \log(1 + v^{-1})\big) \]

as above (cf. Proposition 1.5(8) with $k = 0$), we obtain (1.19). This completes the proof.

Proof of Proposition 1.1. Let $P_0(T, x)$ be the transition probability for the Lévy process. From the definition of $G_0(\beta, x)$ and the Laplace transform inversion formula, we have

\[ P_0(T, x) = \int \frac{d\beta}{2\pi i}\, e^{\beta T} G_0(\beta, x), \tag{2.8} \]

where the contour is $\{\beta : \beta = a + i\alpha,\ \alpha \in \mathbb{R},\ a > 0\}$. We can move the contour to the left and close it so that it encircles the poles in $[-1, 0)$, cf. (2.2). By interchanging the integral over $\beta$ with the sum in (2.2) and applying the residue formula, we obtain for $|x| = L^N$, $N \ge 1$,

\[ P_0(T, x) = \sum_{k \ge N-1} L^{-4k}(1 - L^{-4})\, e^{-L^{-2k} T} - L^{-4(N-1)}\, e^{-L^{-2(N-1)} T}. \]

Using $j = k - N$ and $\sum_{j \ge 0} L^{-4j}(1 - L^{-4}) = 1$, this becomes

\[ P_0(T, x) = L^{-4N} \sum_{j \ge 0} L^{-4j}(1 - L^{-4}) \left[ e^{-L^{-2j} L^{-2N} T} - e^{-L^{-2(N-1)} T} \right] = |x|^{-4} f(t), \tag{2.9} \]

where $t = T/|x|^2$ and

\[ f(t) = \sum_{j \ge 0} L^{-4j}(1 - L^{-4}) \left[ e^{-L^{-2j} t} - e^{-L^2 t} \right]. \tag{2.10} \]
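The passage from the residue sum to the scaling form (2.9)–(2.10) can be explored with a short numerical sketch. It is an illustration only: it assumes $|x| = L^N$ with $N \ge 1$, truncates the sums at a hypothetical cutoff `terms`, and uses the hypothetical names `p0_residue`, `f` and `p0_scaled`. The last helper compares $f(t)$ with the envelope $t^{-2}(1 + t^{-1})^{-3}$, which is the shape that governs $|x|^{-4} f(T/|x|^2)$.

```python
import math


def p0_residue(T, N, L=2, terms=80):
    """Residue sum for P_0(T, x) with |x| = L**N (expression before (2.9))."""
    s = sum(L ** (-4 * k) * (1 - L ** -4) * math.exp(-L ** (-2 * k) * T)
            for k in range(N - 1, N + terms))
    return s - L ** (-4 * (N - 1)) * math.exp(-L ** (-2 * (N - 1)) * T)


def f(t, L=2, terms=80):
    """Scaling function f(t) of (2.10), truncated."""
    return sum(
        L ** (-4 * j) * (1 - L ** -4)
        * (math.exp(-L ** (-2 * j) * t) - math.exp(-L ** 2 * t))
        for j in range(terms)
    )


def p0_scaled(T, N, L=2):
    """Right-hand side of (2.9): |x|^{-4} f(T / |x|^2)."""
    return L ** (-4 * N) * f(T / L ** (2 * N), L)


def envelope_ratio(t, L=2):
    """f(t) divided by t^{-2} (1 + 1/t)^{-3}; expected to stay of order one."""
    return f(t, L) / (t ** -2 * (1 + 1 / t) ** -3)
```

For example, `p0_residue(T, N)` and `p0_scaled(T, N)` can be compared for a few values of `T` and `N`, and `envelope_ratio` can be tabulated over several decades of `t`.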

The following proposition gives an accurate picture of the shape of $P_0(T, x)$.

Proposition 2.2. Let $x \ne 0$. Then there are constants $c_1$, $c_2$ such that

\[ \frac{c_1}{T^2 \left( 1 + \frac{|x|^2}{T} \right)^3} \le P_0(T, x) \le \frac{c_2}{T^2 \left( 1 + \frac{|x|^2}{T} \right)^3}. \tag{2.11} \]

This estimate holds also for $x = 0$, provided $T \ge 1$. For small $T$, $P_0(T, 0) \sim 1 - O(T)$.

Proof. Note that for $t < 1$, $f(t) \sim t$. For $t > 1$, the sum in (2.10) is dominated by the term with $L^{-2j} t \approx 1$, and so $f(t) \sim t^{-2}$. Overall, $f(t)$ is bounded above and below by positive multiples of $t^{-2}(1 + t^{-1})^{-3}$, which implies (2.11). To handle the case $x = 0$, we use Proposition 1.4(2) and (2.8) to obtain

\[ P_0(T, 0) = \sum_{k=0}^{\infty} L^{-4k}(1 - L^{-4})\, e^{-L^{-2k} T}, \]

which behaves as $T^{-2}$ for $T \ge 1$ and $1 - O(T)$ for $T < 1$. Thus (2.11) holds for $x = 0$, provided $T \ge 1$.

Continuing with the proof of Proposition 1.1, note that from (2.9), for $0 < \alpha < 2$, we have

\[ E_0\!\left( \frac{|\omega(T)|^\alpha}{T^{\alpha/2}} \right) \equiv \int dx\, P_0(T, x)\, \frac{|x|^\alpha}{T^{\alpha/2}} = \sum_{N \ge 1} L^{4N}(1 - L^{-4})\, P_0(T, x)\big|_{|x| = L^N}\, \frac{L^{\alpha N}}{T^{\alpha/2}} = \sum_{N \ge 1} f_\alpha(T/L^{2N}), \tag{2.12} \]

where $f_\alpha(t) = t^{-\alpha/2}(1 - L^{-4}) f(t)$. Now we replace $T$ by $L^{2m} T$ in (2.12) and find that as $m \to \infty$,

\[ E_0\!\left( \frac{|x|^\alpha}{(L^{2m} T)^{\alpha/2}} \right) = \sum_{j \ge 1-m} f_\alpha(T/L^{2j}) \to \sum_{j=-\infty}^{\infty} f_\alpha(T/L^{2j}). \]

Since $f_\alpha(t)$ goes to zero at 0 and $\infty$ as a power of $t$, the sum on $j$ converges at both ends and defines a function with the properties claimed in Proposition 1.1.

3. End-to-End Distance for the Self-Avoiding Walk

We begin with a detailed statement of the behavior of the (unnormalized) transition probability function for the interacting model. Let

\[ P_\lambda(T, x) \equiv E_0\!\left( e^{-\lambda \tau^2(\mathcal{G}) - \beta^c(\lambda) T}\, \mathbf{1}_{\{\omega(T) = x\}} \right). \tag{3.1} \]

Then $G_\lambda(\beta, x)$ is the Laplace transform of $P_\lambda(T, x)$, so as in (2.8) we have

\[ P_\lambda(T, x) = \int \frac{d\beta}{2\pi i}\, e^{(\beta - \beta^c(\lambda)) T} G_\lambda(\beta, x) = \int \frac{d\hat\beta}{2\pi i}\, e^{\hat\beta T} G_\lambda(\beta, x), \tag{3.2} \]

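where $\hat\beta = \beta - \beta^c(\lambda)$. In this equation we may, by Theorem II.1.1 and Proposition 1.4, choose the contour to be $T^{-1}\Gamma$, where $\Gamma$ consists of the two rays $\{z : |z| \ge 1 \text{ and } \arg z = \pm b_\beta\}$ joined by an arc of the unit circle which passes across the positive real axis. Recall that $\frac{\pi}{2} < b_\beta < \frac{3\pi}{4}$ and that $b_\lambda < \pi - \frac43 b_\beta < \frac{\pi}{3}$.

Proposition 3.1. Let $k = \max\{0, \log |x|\}$ and put $\hat\beta = T^{-1}$ in $\lambda_k = \lambda_k(T^{-1})$. Likewise, define $\ell = \ell(T^{-1})$, where $\ell(\hat\beta) = (\beta_{\mathrm{eff},\infty}/\hat\beta)^{-4}$ as per Proposition 1.5. Then with $t \equiv T\ell^{1/4} > 1$, the following estimate holds uniformly in $x$, $T$ and $\lambda \in D_\lambda$:

\[ P_\lambda(T, x) = \ell^{1/4} \left[ P_0(t, x) + \frac{O(\lambda_k)}{t\,(1 + |x|^2)\,(1 + |x|^2/t)^2} \right] = \ell^{1/4} P_0(t, x) \left[ 1 + O(\lambda_k)\, \frac{t + |x|^2}{1 + |x|^2} \right]. \tag{3.3} \]

Proof. Corollary 1.6 estimates $G_\lambda$ in terms of $G_0$:

\[ G_\lambda(\beta, x) = G_0(\beta_{\mathrm{eff},\infty}(\hat\beta), x)\,(1 + O(\lambda_k(\hat\beta))). \]

We need to replace $\hat\beta$ with $T^{-1}$ in part of this expression. To simplify formulas, let us put

\[ \mathbf{G}_0(\zeta) = G_0(\hat\beta\,\ell(\zeta)^{-1/4}, x), \]

so that $G_0(\beta_{\mathrm{eff},\infty}(\hat\beta), x) = \mathbf{G}_0(\hat\beta)$. Then we have

\[ P_\lambda(T, x) = \int \frac{d\hat\beta}{2\pi i}\, e^{\hat\beta T} \left[ \mathbf{G}_0(T^{-1}) + O(\lambda_k(\hat\beta))\,\mathbf{G}_0(\hat\beta) + (\mathbf{G}_0(\hat\beta) - \mathbf{G}_0(T^{-1})) \right] \]
\[ \phantom{P_\lambda(T, x)} = \int \frac{d\hat\beta}{2\pi i}\, e^{\hat\beta T} \left[ G_0(\hat\beta\,\ell^{-1/4}, x) + \hat e_1(x, \hat\beta) + \hat e_2(T, x, \hat\beta) \right] \]
\[ \phantom{P_\lambda(T, x)} = \ell^{1/4} \int \frac{d\beta'}{2\pi i}\, e^{\beta' t} G_0(\beta', x) + e(T, x) \]
\[ \phantom{P_\lambda(T, x)} = \ell^{1/4} P_0(t, x) + e(T, x), \]

where $\hat e_1 = O(\lambda_k(\hat\beta))\,\mathbf{G}_0(\hat\beta)$, $\hat e_2 = \mathbf{G}_0(\hat\beta) - \mathbf{G}_0(T^{-1})$, and $e(T, x)$ is the inverse Laplace transform of their sum. We have

\[ |\hat e_2(T, x, \hat\beta)| = \left| \int_{T^{-1}}^{\hat\beta} d\tilde\beta\, \frac{d}{d\tilde\beta} \mathbf{G}_0(\tilde\beta) \right| = \left| \int_{T^{-1}}^{\hat\beta} \frac{d\tilde\beta}{\tilde\beta} \left( -\frac14\, \tilde\beta \frac{d}{d\tilde\beta} \ln \ell(\tilde\beta) \right) w \frac{\partial}{\partial w} G_0(w, x) \right|, \tag{3.4} \]

where $w = \hat\beta\,\ell(\tilde\beta)^{-1/4}$. Put $u = |w|\,|x|^2$. Then if $x \ne 0$, Proposition 2.1 implies that

\[ \left| w \frac{\partial}{\partial w} G_0(w, x) \right| \le \frac{c\,u\,(1 + \log(1 + u^{-1}))}{|x|^2 (1 + u)^3} \le \frac{c}{|x|^2 (1 + u)^2} = \frac{c}{|x|^2 (1 + |\hat\beta\,\ell(\tilde\beta)^{-1/4}|\,|x|^2)^2} \le \frac{c}{|x|^2 (1 + |\hat\beta\,\ell^{-1/4}|\,|x|^2)^2}, \]

where in the last step we have used Proposition 1.5(6). For $x = 0$, this bound has to be replaced with $c\,(1 + |\hat\beta\,\ell^{-1/4}|)^{-1}$.

Continuing under the assumption that $x \ne 0$, we use this bound and Proposition 1.5(7) to estimate (3.4) by

\[ |\hat e_2(T, x, \hat\beta)| \le \frac{\left( \ln|\hat\beta T| + \frac{3\pi}{4} \right) \sup_{\tilde\beta} O(\lambda_{\tilde\beta})}{|x|^2 (1 + |\hat\beta\,\ell^{-1/4}|\,|x|^2)^2} \le \frac{\left( \ln|\hat\beta T| + \frac{3\pi}{4} \right) \sup_{\tilde\beta} O(\lambda_k(\tilde\beta))}{|x|^2 (1 + |t|^{-1} |x|^2)^2}. \]

In the second inequality, we have used $|\tilde\beta| \ge T^{-1}$, $t = T\ell^{1/4}$, and the fact that $|\lambda_k(\tilde\beta)|$ is essentially a decreasing function of $k$ (cf. Proposition 1.5(3)). Note that if we use Proposition 1.4 to estimate $\mathbf{G}_0(\hat\beta)$, we find that $\hat e_1(x, \hat\beta)$ is bounded by this same expression, only with $O(\lambda_k(\tilde\beta))$ replaced by $O(\lambda_k(\hat\beta))$. Hence we combine the two error terms and estimate $|\lambda_k(\tilde\beta)| \le \lambda_k (1 + O(\lambda)(1 + \log|\tilde\beta T|)) \le \lambda_k (1 + \log|\hat\beta T|)$ (cf. Proposition 1.5(6)) to obtain

\[ |e(T, x)| \le \frac{O(\lambda_k)}{T |x|^2 (1 + |x|^2/t)^2} \int |d(\hat\beta T)|\, |e^{\hat\beta T}|\, (1 + \ln|\hat\beta T|)^2. \]

As $e^{\hat\beta T}$ decays exponentially on the rays $|\arg \beta| = b_\beta$, the integral is $O(1)$ and so

\[ |e(T, x)| \le \frac{O(\lambda_k)}{T |x|^2 (1 + |x|^2/t)^2} = \frac{O(\lambda_k)\,\ell^{1/4}}{t\,(1 + |x|^2)\,(1 + |x|^2/t)^2}, \tag{3.5} \]

which implies (3.3). The second statement in (3.3) follows from this by using (2.11), with $T$ replaced by $t = T\ell^{1/4}$. We note that

\[ |\arg t| = |\arg \ell^{1/4}| = \tfrac14 |\arg(1 + 8B\lambda k_{1/T})| + O(\lambda) < \tfrac14 |\arg \lambda| + O(\lambda) < \tfrac{\pi}{12} + O(\lambda) \]

(cf. Proposition 1.5(5)), and that the proof of Proposition 2.2 extends to the continuation of $P_0(T, x)$ into this sector. The case $x = 0$ is handled similarly, only $|x|$ has to be replaced with 1 and the power of $(1 + |x|^2/t)$ is reduced from 2 to 1. The final bound in (3.5) remains valid, however.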
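The contour used in this section (two rays at angles $\pm b_\beta$ joined by an arc across the positive real axis, scaled by $T^{-1}$) can be tried out on a single pole term of the kind that appears in (2.2). The sketch below is a rough numerical illustration, not the method of the paper: it assumes the sample value $b_\beta = 2.0$ (which lies between $\pi/2$ and $3\pi/4$), truncates the rays at a hypothetical radius, uses a plain midpoint rule, and the function name `inverse_laplace_on_contour` is invented for this example. For $F(\beta) = 1/(1 + L^{2k}\beta)$ the residue formula gives $L^{-2k} e^{-L^{-2k} T}$.

```python
import cmath
import math


def inverse_laplace_on_contour(F, T, b=2.0, ray_factor=60.0, n=20000):
    """Midpoint-rule estimate of (1 / 2 pi i) * integral of e^{beta T} F(beta)
    over the contour made of the arc |beta| = 1/T, |arg beta| <= b, and the
    two rays arg beta = +b and -b with |beta| >= 1/T (truncated)."""
    rho = 1.0 / T
    total = 0j
    # Arc: beta = rho e^{i theta}, theta from -b to b, d(beta) = i beta d(theta).
    h = 2 * b / n
    for m in range(n):
        theta = -b + (m + 0.5) * h
        beta = rho * cmath.exp(1j * theta)
        total += cmath.exp(beta * T) * F(beta) * 1j * beta * h
    # Rays: beta = r e^{+ib} traversed outward, beta = r e^{-ib} traversed inward.
    r_max = ray_factor * rho  # e^{beta T} is negligible beyond this radius
    hr = (r_max - rho) / n
    up = cmath.exp(1j * b)
    down = cmath.exp(-1j * b)
    for m in range(n):
        r = rho + (m + 0.5) * hr
        total += cmath.exp(r * up * T) * F(r * up) * up * hr
        total -= cmath.exp(r * down * T) * F(r * down) * down * hr
    return total / (2j * math.pi)
```

With `F = lambda beta: 1 / (1 + 4 * beta)` (that is, $L = 2$, $k = 1$) the result can be compared with `0.25 * math.exp(-T / 4)`.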

Remark. The error term in (3.3) behaves as $t^{-1}(1 + |x|^2)^{-1}$ for $|x|^2 < t$, which is not the behavior one would expect (namely $t^{-2}$, the small-$x$ behavior of $P_0(t, x)$). This is an artifact of the proof, which takes an absolute value of $G_0$ on the contour, thereby spoiling the cancellations needed to get a bound proportional to $t^{-2}$, and leading to “Green’s function-like” rather than “transition probability-like” behavior. While (3.3) is adequate for obtaining our main theorem on the end-to-end distance, it may be of some interest to indicate how a better bound might be proven. Let $\ell_k(\hat\beta)$ be as in Proposition 1.5 and put $\ell_k = \ell_k(T^{-1})$ and $t_k = T\ell_k^{1/4}$, with $k = \max\{0, \log |x|\}$ as in Proposition 3.1. Then we conjecture

\[ P_\lambda(T, x) = \ell_k^{1/4} P_0(t_k, x)\,(1 + O(\lambda_k)). \tag{3.6} \]

To get this, one need only consider |x|2 < tk as the arguments above give it for |x|2 ≥ tk . Write d βˆ βT 1 ˆ Pλ (T , x) = 2 e Gλ (β, x), T 2πi ˆ where primes denote β-derivatives. One should be able to replace Gλ (β, x) with ˆ (w, x), λ (β)G ˆ (w, x), and λ (β)G ˆ 0 (w, x), G0 (w, x) plus error terms of order λk (β)G 0 0 k k 1 1 −4 − ˆ 4 (1+O(λk (β))) ˆ where w = βk (β) . Each β-derivative of G0 (w, x) is actually k (β) ˆ which times the corresponding w-derivative, the correction term being βˆ dˆ ln k (β), dβ

ˆ Extending the proof of Proposition 2.1 to higher as in Proposition 1.5(7), is O(λk (β)). derivatives, we have c c|w| ∂ c log(1 + u−1 ) G = , | (w, x)| ≤ , 0 |x|2 (1 + u)2 u(1 + u)2 ∂w (1 + u)3 2 3 ∂ c ∂ c | 2 G0 (w, x)| ≤ , | 3 G0 (w, x)| ≤ . 3 2 ∂w |w|(1 + u) ∂w |w| (1 + u)3 |G0 (w, x)| ≤

Extending the arguments of Lemma 4.2 to second derivatives, we expect ˆ ≤ O(λk (β) ˆ 2 )βk = O(λk (β) ˆ 2 )βˆk /β, ˆ |λk (β)| ˆ ≤ O(λk (β) ˆ 3 )βk 2 = O(λk (β) ˆ 3 )βˆk2 /βˆ 2 . |λk (β)| ˆ 2k k (β) ˆ − 41 | = |w| |x|2 = u in λ , λ conWe shall see that the factors of |βk | = |βL k k trol the dangerous u−1 and log(1 + u−1 ) factors in G0 , G0 respectively. Noting that ˆ 41 , we find that eˆ1 (x, β) ˆ = G − G is bounded by ˆ β/w = k (β) λ

0


D.C. Brydges, J.Z. Imbrie 1 ˆ 2 ) log(1+u−1 ) ˆ 3 )w u O(λk (β) ˆ − 4 + u2 O(λk (β) k (β) (1+u)3 βˆ βˆ 2 u(1+u)2 1 1 ˆ βˆ −1 βˆ |k (β)| ˆ − 2 + |λk (β)| ˆ |k (β)| ˆ − 4 + |λk (β)| ˆ 2 w O(λk (β)) w βˆ

1 ˆ O(λk (β)) ˆ −2 (β) w(1+u)3 k

≤

+

−1

ˆ βˆ −1 k (β) ˆ − 4 ≤ O(λk )βˆ −1 4 (1 + log |βT ˆ |) 4 . ≤ O(λk (β)) k ˆ satisfies the same bound because β˜ d ln k (β) ˜ ≤ O(λk (β)) ˜ Furthermore, e2 (T , x, β) ˜ 1

5

dβ

and because the bound on wG 0 is the same as the one on G0 . One can perform the inverse Laplace transform on this and estimate it as in the proof of Proposition 3.1. The result is

−2 1 − 41 − 41 −2 = t k k ≈ k2 P0 (tk , x) |e(T , x)| ≤ O(λk )k , which when multiplied by T

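(cf. Proposition 2.2), leads to (3.6).

Proof of Theorem 1.2. By (3.3), we have

\[ \int dx\, P_\lambda(T, x)|x|^\alpha = \ell^{1/4} \left[ \int dx\, P_0(t, x)|x|^\alpha + \sum_k \frac{O(\lambda_k)(L^{4k} - 1) L^{\alpha k}}{t\,(1 + L^{2k})(1 + L^{2k}/t)^2} \right] \]
\[ \phantom{\int dx\, P_\lambda(T, x)|x|^\alpha} = \ell^{1/4} \left[ E_0(|\omega(t)|^\alpha) + O(\lambda_{k_{1/T}})\, t^{\alpha/2} \right] \]
\[ \phantom{\int dx\, P_\lambda(T, x)|x|^\alpha} = \ell^{1/4} E_0(|\omega(t)|^\alpha) \left[ 1 + O(\lambda_{k_{1/T}}) \right]. \]

Since $\lambda_k$ varies slowly with $k$ and $0 \le \alpha < 2$, the sum on $k$ first increases geometrically, then decreases geometrically, so that the sum on $k$ is estimated by the largest term $\bar k$ for which $L^{2\bar k} \approx t$. We have replaced $\bar k$ with $k_{1/T}$, which is allowable because at $k = \bar k$, $\hat\beta = T^{-1}$,

\[ \hat\beta_{\bar k} = \hat\beta L^{2\bar k}\, \ell_{\bar k}(\hat\beta)^{-1/4} \Big|_{\hat\beta = T^{-1}} \approx T^{-1}\, t\, \ell^{-1/4} = 1, \]

so that $\bar k \approx k_{1/T}$. Note that Proposition 1.5(2) relates $k_{1/T}$ to $T$:

\[ k_{1/T} = O(1) + \tfrac12 \log(1 + T) + \tfrac18 \log |1 + 4B\lambda \log(1 + T)|. \tag{3.7} \]

In fact, we can use Proposition 1.5(3, 4, 8) to write

\[ \lambda_{k_{1/T}} \approx \frac{\lambda}{1 + 8B\lambda k_{1/T}} \approx \lambda\,\ell_{k_{1/T}}^{-1} \approx \lambda\,\ell^{-1} \]

(equality to within a factor $e^{O(\lambda)}$). Hence

\[ \int dx\, P_\lambda(T, x)|x|^\alpha = \ell^{1/4} \big( E_0(|\omega(t)|^\alpha) + O(\lambda\ell^{-1})\, t^{\alpha/2} \big) = \ell^{1/4} E_0(|\omega(t)|^\alpha)\,(1 + O(\lambda\ell^{-1})), \tag{3.8} \]

where we have used Proposition 1.1. Using (3.8) for numerator and denominator, we obtain

\[ E^T_{0,\lambda}(|\omega(T)|^\alpha) = E_0(|\omega(t)|^\alpha)\,(1 + O(\lambda\ell^{-1})), \]

which leads immediately to (1.7). We have $\ell = \ell(T^{-1}) = (1 + 8B\lambda k_{1/T})\, e^{O(\lambda)}$, by Proposition 1.5(5), and if we insert (3.7) into this, we obtain (1.8).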
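To get a feel for how (3.7) feeds the logarithmic correction, one can evaluate the leading terms numerically. This is a sketch under loud assumptions: the $O(1)$ term in (3.7) and the factor $e^{O(\lambda)}$ are dropped, $\lambda$ is taken real and positive, the logarithm is taken to base $L$, and the values of `B`, `lam` and `L` are placeholders chosen only for illustration. The function names are hypothetical.

```python
import math


def log_base(y, L):
    """Logarithm to base L (assumed convention for log in (3.7))."""
    return math.log(y) / math.log(L)


def k_one_over_T(T, lam, B, L):
    """Leading terms of (3.7), with the O(1) term dropped."""
    big_log = log_base(1 + T, L)
    return 0.5 * big_log + 0.125 * log_base(abs(1 + 4 * B * lam * big_log), L)


def ell(T, lam, B, L):
    """Leading form 1 + 8 B lam k_{1/T}, with the e^{O(lam)} factor dropped."""
    return 1 + 8 * B * lam * k_one_over_T(T, lam, B, L)


def distance_scale(T, lam, B, L):
    """The combination T^{1/2} ell^{1/8} appearing in the end-to-end distance."""
    return math.sqrt(T) * ell(T, lam, B, L) ** 0.125
```

For large `T` the quantity `ell` approaches `4 * B * lam * log_base(T, L)` in ratio, so `distance_scale(T, ...) / math.sqrt(T)` grows like the one-eighth power of a logarithm.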


4. The Coupling Constant Recursion and Its Fixed Point

This section begins with an inverse function theorem construction of the fixed point $\beta^c(\lambda)$, as specified in Proposition 1.3. Then the shifted recursion for $\hat\beta = \beta - \beta^c(\lambda)$ is controlled in some detail, and Proposition 1.3 can be established. Finally, these results are used to prove Proposition 1.5. As we shall see, one can prove accurate estimates on $\lambda_k$, $\beta_k$ by working inductively on domains which extend slightly into the “dangerous” region left of $\beta^c(\lambda)$. Precise control of $\beta_k$ is needed in order to obtain the right domain of analyticity for Cauchy estimates. As $k \to \infty$, the domain shrinks back to $D_\beta + \beta^c(\lambda)$ as the singularity at $\beta^c(\lambda)$ asserts itself. Proposition II.6.1 provides the necessary input.

We wish to construct $\beta^c(\lambda)$ as the limit of the decreasing sequence of open sets $\beta_k^{-1}(B(\frac12))$. But we must show that the map $\beta_k(\beta)$ and its inverse are defined in appropriate domains. We establish the following lemma inductively (keep in mind that $\lambda$ is fixed in $D_\lambda$; $\lambda_k$ and $\beta_k$ are regarded as functions of $\beta \equiv \beta_0 \in D(\frac12)$, with primes denoting $\beta$-derivatives). We use the notation

\[ l_k = \left| \exp\!\left( \sum_{j=0}^{k-1} \frac{8B}{\lambda^{-1} + 8Bj} \right) \right|, \tag{4.1} \]

and note that this is a function of $\lambda$, $k$ only. By integral approximation, it can easily be shown that $l_k = |1 + 8B\lambda k|\, e^{O(\lambda)}$.

Begin by considering $\lambda_0 \in D_\lambda$, $\beta_0 \in B_0 := B(\frac12)$ so that $\beta_1$, $\lambda_1$ are defined from Proposition II.6.1 and satisfy the recursion (1.16) and bounds (1.17). Then we prove bounds on $\beta_1$, $\lambda_1$ and their derivatives, guaranteeing that $B_1 = \beta_1^{-1}(B(\frac12))$ is a nonempty subset of $B(\frac13)$ on which $\lambda_1 \in D_\lambda$. Then $\beta_2$, $\lambda_2$ can be defined on $B_1$ and the process continues. Once we have established a domain for $\beta_{k-1}$, $\lambda_{k-1}$, then Proposition II.6.1 can be used to generate $\beta_k$, $\lambda_k$.

Lemma 4.1. Let $k \ge 1$. Assume that for $1 \le j < k$,

(1) $\lambda_j \in D_\lambda$ for $\beta \in B_{j-1}$ and

\[ \left| \lambda_j - \frac{1}{\lambda^{-1} + 8Bj} \right| \le c_1 \left| \frac{1}{\lambda^{-1} + 8Bj} \right|^2 (1 + \ln(1 + |\lambda| j)), \tag{4.2} \]

with $c_1$ a constant independent of $j$.

(2) For $\beta \in \beta_{j-1}^{-1}(B(\frac13))$, $|\lambda_j'| \le c_2 |\lambda_j^2 \beta_j'|$ and $|\beta_j'| = L^{2j}\, l_j^{-1/4}\, e^{O(\lambda)}$. Here $O(\lambda)$ denotes a quantity bounded by $c_3 |\lambda|$, and $c_2$, $c_3$ are independent of $j$ and $\beta$.

(3) $B_j := \beta_j^{-1}(B(\frac12))$ is a nonempty subset of $\beta_{j-1}^{-1}(B(\frac13)) \subset B_{j-1}$.

Then Proposition II.6.1 defines $\beta_k$, $\lambda_k$ on $B_{k-1}$ and (1)–(3) hold for $j = k$.

Proof. To prove (4.2), rewrite the $\lambda$ recursion as

\[ \lambda_{j+1}^{-1} = \lambda_j^{-1} + \frac{8B}{(1 + \beta_j)^2} + O(\lambda_j), \]


where we have used the fact that βj ∈ B( 21 ) for all 0 ≤ j < k to avoid writing some (1 + βj )−1 factors. This implies that λ−1 k

=λ

−1

+

k−1 j =0

−1

=λ

8B + O(λj ) (1 + βj )2

+ 8Bk + O(1)(1 + ln(1 + |λ|k)),

(4.3)

where we have used k−1 k−1 k−1 1 = − 1 O(β ) ≤ O(β ) + O(λj ), j k−1 (1 + β )2 j j =0

j =0

j =0

k−1 k−1 1 O(λj ) ≤ O(1) λ−1 + 8Bj ≤ O(1) ln(1 + |λ|k). j =0

j =0

The first of these bounds follows by bounding separately the set of j ’s such that |βj | > |λj |. Once this inequality holds, it holds for all larger j ’s (with geometric growth of βj ) as is clear from (1.16). The second bound follows from (1) for smaller values of j , keeping in mind that Dλ is contained in a sector which does not include the negative reals, so ˜ λ˜ −1 − λ−1 ) λ−1 and 8Bk never come close to canceling. Using the identity λ − λ˜ = λλ( we have 1 1 = λk−1 O(1)(1 + ln(1 + |λ|k)), λk − (4.4) −1 −1 λ + 8Bk λ + 8Bk and (4.2) follows for λk . We now prove that λk ∈ Dλ . Note that if δ (which defines the maximum |λ| in Dλ ) is chosen small enough, then by (4.2), |λk+1 | ≤ δ. The sequence λ˜ j = (λ−1 + 8Bj )−1 follows a circle tangent to the real axis at 0, so that |arg λ˜ j | is decreasing in j . Furthermore, the bound (4.2) shows that any increase in |arg λj | in the exact recursion is at most O(λ). Thus, while λk may leave Dλ , it remains in Dλ . We have now established (1). To check (2), differentiate (1.16): λj +1 = λj − βj +1 = L2

16B(λj λj − λ2j (1 + βj )−1 βj )

+ λ,j , 2B(λj − λj (1 + βj )−1 βj ) βj + + β,j . 1 + βj

(4.5)

(1 + βj )2

(4.6) 1

By the βj bound in (2), the domain βj−1 (B( 21 )) includes balls of size 16 L−2j lj4 . Hence (1.17), Cauchy’s bound, and (2) imply − 41

≤ c|λj |3 |1 + βj |−1 |βj |,

(4.7)

− 41

≤ c|λj |2 |1 + βj |−2 |βj |,

(4.8)

| λ,j | ≤ c|λj |3 |1 + βj |−1 L2j lj

| ≤ c|λj |2 |1 + βj |−2 L2j lj | β,j


for β ∈ βj−1 (B( 13 )). Inserting the bound (4.7) into (4.5) and using (2), we obtain βj +1 = L2 βj [1 − 2Bλj + O(βj λj ) + O(λ2j )],

(4.9)

which can be written in exponential form:   k−1 βk = L2k exp  (−2Bλj + O(βj λj ) + O(λ2j )) .

(4.10)

j =0

Replacing λj with λ−1 + 8Bj as per (1), we pick up an error ∼ λ2j (1 + ln(1 + |λ|j )), which, however, is summable in j . The other terms in (4.10) also sum to O(λ), so the βk bound in (2) follows. Moving on to the λk bound, we insert (2) into (4.5): |λj +1 | ≤ |λ2j βj |(c2 + O(λj ) + O(1)),

(4.11)

has been bounded using (4.7). Putting j + 1 = k, we obtain |λ | ≤ c |λ2 β |, where λ,j 2 k k k −2 O(λ) provided c2 is chosen large enough so that L e (c2 + O(1)) ≤ c2 . −1 To complete the induction, we establish (3). Consider the one-step map βk (βk−1 ( · )). 1 On B( 3 ), this has been shown, cf. (2), to be defined with precise control on βk . We −1 on βk−2 (B( 13 )). Hence the composition has derivative have previousy estimated βk−1 −1 2 (0)) is O(λk ). Hence L + O(λ). In addition, the recursion (1.16) shows that βk (βk−1 −1 −1 −1 1 1 1 1 βk (βk−1 (B( 3 ))) covers B( 2 ) and so βk (B( 2 )) ⊂ βk−1 (B( 3 )). This is of course contained in Bk−1 , so the proof of (3) and the lemma is complete.
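The leading-order content of Lemma 4.1(1) and of the integral approximation for $l_k$ in (4.1) can be illustrated with a toy iteration. The sketch below is a simplification, not the recursion of the paper: it sets $\beta_j = 0$ and drops the remainder term, so that $\lambda_{j+1} = \lambda_j - 8B\lambda_j^2$, and it takes $\lambda$ real and positive. The value of `B` and the function names are placeholders for illustration.

```python
import math


def leading_order_flow(lam, B, steps):
    """Iterate lam_{j+1} = lam_j - 8 B lam_j^2 (beta_j = 0, no remainder)."""
    lams = [lam]
    for _ in range(steps):
        lams.append(lams[-1] - 8 * B * lams[-1] ** 2)
    return lams


def approx_lambda(lam, B, k):
    """The comparison sequence 1 / (lam^{-1} + 8 B k) from (4.2)."""
    return 1.0 / (1.0 / lam + 8 * B * k)


def l_k(lam, B, k):
    """l_k of (4.1) for real positive lam."""
    return abs(math.exp(sum(8 * B / (1.0 / lam + 8 * B * j) for j in range(k))))


def scaled_gap(lam, B, k, lams):
    """|lam_k - approx| divided by approx^2 (1 + ln(1 + lam k)), cf. (4.2)."""
    a = approx_lambda(lam, B, k)
    return abs(lams[k] - a) / (a ** 2 * (1 + math.log(1 + lam * k)))
```

In this toy setting `scaled_gap` stays bounded as `k` grows, and `l_k(lam, B, k) / (1 + 8 * B * lam * k)` stays within a factor $e^{O(\lambda)}$ of 1, which mirrors the two statements in the text.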

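Proof of Proposition 1.3. We may define

\[ \beta^c(\lambda) = \bigcap_{j=1}^{\infty} B_j, \]

since Lemma 4.1(2,3) implies that these sets are a decreasing sequence of open sets with diameter $\le c L^{-2j}\, l_j^{1/4}$. Furthermore, at $\beta^c(\lambda)$, Lemma 4.1(1) holds for all $j$, so $\lambda_j(\beta^c(\lambda)) \to 0$ as $j \to \infty$. Consider the sequence $\beta_n^c = \beta_n(\beta^c(\lambda))$. By construction, this is a bounded sequence obeying $\beta_{n+1}^c = L^2 \beta_n^c + O(\lambda_n)$ (cf. (1.16)) and as such it must satisfy $\beta_n^c = O(\lambda_n) \to 0$. In particular, $\beta^c(\lambda) = \beta_0^c = O(\lambda)$.

In order to complete the proof of Proposition 1.3, we compute the shifted recursion which applies to $\hat\beta = \beta - \beta^c(\lambda)$. Let $\hat\beta_j(\hat\beta) = \beta_j(\hat\beta + \beta^c(\lambda)) - \beta_j^c$ denote the difference between the flow from $\beta$ and the critical flow from $\beta^c(\lambda)$. Then (1.16) becomes

\[ \lambda_{j+1} = \lambda_j - \frac{8B\lambda_j^2}{(1 + \hat\beta_j + \beta_j^c)^2} + \epsilon_{\lambda,j}, \]
\[ \hat\beta_{j+1} = L^2 \left[ \hat\beta_j + 2B \left( \frac{1}{1 + \hat\beta_j + \beta_j^c} - \frac{1}{1 + \beta_j^c} \right) \lambda_j + \epsilon_{\beta,j}(\hat\beta + \beta^c(\lambda)) - \epsilon_{\beta,j}(\beta^c(\lambda)) \right]. \]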


We control the global behavior of this recursion with another lemma. Some additional definitions will be needed. Let kβˆ be the largest k such that |βˆk | ≤ 1 (if no such integer exists, then k ˆ = 0). Then with kˆ = min{k, k ˆ }, we define β

β

 ˆ k−1  ˆ lk (β) = exp j =0

 8B , λ−1 + 8Bj

(4.12)

ˆ = l ˆ , cf. (4.1). Again, integral approximation shows that lk (β) ˆ = and observe that |lk (β)| k O(λ) ˆ (1 + 8Bλk)e . 1

ˆ λ) ∈ Dβ ( 1 L−2k l 4 ) × Dλ , the Lemma 4.2. Let Dβ (ρ) = Dβ + B(ρ). Then for (β, k 4 following bounds hold with k-independent constants: (1) λk ∈ Dλ and λk −

2 1 ≤ c1 (1 + ln(1 + |λ|k)). ˆ λ−1 + 8B kˆ λ−1 + 8B kˆ 1

ˆ 2k lk (β) ˆ − 41 eO(λ) ∈ Dβ ( 1 ). If βˆ ∈ Dβ , then βˆk ∈ Dβ . (2) βˆk = βL 3 ˆ − 41 eO(λ) . (Note that βˆ = β .) (3) |λk | ≤ c2 |λ2k βk | |1 + βˆk |−1 , βk = L2k lk (β) k k (4) The recursion relations

2Bλk βˆk+1 = L2 βˆk 1 − + ˆβ,k , 1 + βˆk λk+1 = λk −

8Bλ2k + ˆλ,k , (1 + βˆk )2

hold with ˆβ,k , ˆλ,k analytic in βˆ and satisfying |ˆ β,k | ≤ c3 |λk |2 |1 + βˆk |−1 , |ˆ λ,k | ≤ c4 |λk |3 |1 + βˆk |−1 . In addition, for βˆ ∈ Dβ

1

1 −2k 4 lk 5L

,

|ˆ β,k | ≤ c6 |λ3k βk | |1 + βˆk |−1 . | ≤ c5 |λ2k βk | |1 + βˆk |−2 , |ˆ λ,k

Lemma 4.2 shows that if β ∈ Dβ + β c (λ) and λ ∈ Dλ , then (2) holds for all k. Thus βk ∈ Dβ + βkc , which completes the proof of Proposition 1.3. Proof of Lemma 4.2. We begin by showing (1), (2), (3) imply (4). We may assume Lemma 4.2 for smaller values of k. Since (βˆj , λj ) ∈ Dβ 13 × Dλ for j = 1, . . . , k, and since βˆj − βj = βjc = O(λ), the assumption in Proposition II.6.1 holds and the recursion relations (1.16), (1.17) are valid.


As βkc = O(λk ), and as 1 + βˆk is never going near 0, we can expand in βkc in the λ recursion, with all but the zeroth order going into the remainder. For the β recursion, we write 1 −βˆk 1 − = c 1 + βk 1 + βˆk + βkc (1 + βkc )(1 + βˆk + βkc ) βˆk βkc (2 + βˆk + βkc ) −βˆk = + , 1 + βˆk (1 + βˆk )(1 + βkc )(1 + βˆk + βkc ) with the second term going into the remainder, as it is proportional to βkc = O(λk ). The result is 8Bλ2k + ˆλ,k (1 + βˆk )2 2Bλk = L2 βˆk 1 − + ˆβ,k , 1 + βˆk

λk+1 = λk − βˆk+1

with ˆλ,k still of order |λk |3 |1 + βk |−1 ≈ |λk |3 |1 + βˆk |−1 , and with ˆβ,k =

2Bλk βkc (2 + βˆk + βˆkc ) β,k (βˆ + β c (λ)) − β,k (β c (λ)) + . (1 + βˆk )(1 + β c )(1 + βˆk + β c ) βˆk k

(4.13)

k

The first term in ˆβ,k is O(λ2k )|1 + βˆk |−1 . To bound the second term, consider two 1 cases. First, if |βˆk | < 10 , then write the second term as

1

dθ 0

βˆ β,k (θ βˆ + β c (λ)). βˆk

ˆ so (2) implies that βˆ ∈ B Note that in this case kˆ = k, lk = |lk (β)|,

1

1 −2k 4 O(λ) lk e 10 L

.

Double the size of this ball, so that Cauchy’s bound may be check used. To

1 1 −2k 4 O(λ) the assumptions of Proposition II.6.1, observe that for 0 ≤ j ≤ k, B 5 L lk e ⊂

1 Dβ 41 L−2j lj4 , so that (2) holds, and in particular βˆj ∈ B( 13 ). Hence (1.17) holds and | β,k | ≤ O(λ2k ). Cauchy’s estimate then implies βˆ βˆ −1 c β,k (θ βˆ + β (λ)) ≤ O(λ2k )L2k lk 4 ≤ O(λ2k ). βˆk βˆk In the second case (|βˆk | ≥

1 10 )

each β,j term can be estimated separately. Note that 1

Lemma 4.2(2) applies for 0 ≤ j ≤ k, since Dβ ( 41 L−2k lk4 ) is decreasing in k. Hence (1.17) holds, so that (βˆ + β c (λ)) − (β c (λ)) β,k β,k (4.14) ≤ O(λ2k )|1 + βˆk |−2 . βˆk


Proceeding to the derivatives, we use Cauchy’s estimate with the bounds just estab1

lished on ˆβ,k , ˆλ,k . Thus if we shrink the domain to Dβ ( 15 L−2k lk4 ), we have 1

− |ˆ λ,k | ≤ c|λk |3 |1 + βˆk |−1 L2k lk 4 ≤ c|λk |3 |1 + βˆk |−1 |βk |, − 41

| ≤ c|λk |2 |1 + βˆk |−2 L2k lk |ˆ β,k

≤ c|λk |2 |1 + βˆk |−2 |βk |, 1

ˆ lk | ≤ O(1) to relate |β | to L2k l − 4 . The ˆ bound where we have used (3) and |lk (β)/ k k β,k was obtained by differentiating the first term in (4.13) explicitly, and using (4.14) on the second term. This completes the proof of (4). It also gets the induction started, since (1), (2), (3) are trivial for k = 0. To complete the cycle, we show that (4) (with k + 1 replaced by k) implies (1), (2), and (3). To prove (1), proceed as in (4.3)–(4.4). In this case we have k−1 O(λj ) 8B ˆ = 8B kˆ + O(1)(1 + ln(1 + |λ|k)), + ˆ 2 |1 + βˆj | j =0 (1 + βj ) and the bound in (1) follows. The argument for λk ∈ Dλ is unchanged. To obtain (2), express the iteration of (4) in exponential form:   k−1 2) O(λ −2Bλ j j . + βˆk = βˆk L2k exp  1 + βˆj 1 + βˆj j =0

The geometric growth of βˆj and (1) show that this may be expressed as in (2). In order to prove that βˆk ∈ Dβ ( 13 ), we need to allow for the phase change from ˆ O(λ) , we have |arg lk (β)| ˆ − 41 in the bound of (2). Since lk (β) ˆ = (1 + 8Bλk)e ˆ ≤ lk (β) 1 ˆ ˆ ˆ |arg λ| + O(λ). Thus if |arg β| < bβ , then |arg βk | < bβ + 4 bλ + O(λ), so that βk ∈ Dβ for all βˆ ∈ Dβ . 1

Before we may conclude that βˆk ∈ Dβ ( 13 ) for all βˆ ∈ Dβ ( 41 L−2k lk4 ), we need to ˆ with βˆ allow for the spilling out of βˆk from Dβ ( 41 ) due to the slow variation of lk (β) 1

in the bound of (2). Consider a ball of radius 41 L−2k lk4 and centered at βˆ ∈ Dβ . The bound in (2) shows that in the βˆk plane, it scales up to an approximate ball of radius 1 1 1 1 4 ˆ 41 | = 1 |l 4 / l 4 |. As this ball may be larger than the ball of radius 1 centered |l / lk (β) 4 k

4 k

kˆ

4

at βˆk , some widening of the opening angle in Dβ ( 13 ) is needed. This is only a problem if k > kˆ ≡ min{k, kβˆ }, in which case βˆk > 1, by the definition of kβˆ . We claim that 1

1

|lk4 / l ˆ4 | − 1 ≤ O(λ)|βˆk |, which implies that an O(λ) increase in opening angle is suffik k−k cient. For a proof, observe first that |βˆk | ≥ c−1 L βˆ . This is a consequence of the fact that βˆk has geometric growth with ratio close to L2 , and the fact that by definition, βˆkβˆ

is no smaller than L−2 (1 + O(λ)) = c−1 . Second, a crude estimate on (4.1) gives 1

1

ˆ

|lk4 / l ˆ4 | ≤ eO(λ)(k−k) ≤ |cβˆk |O(λ) . k


Letting y = ln |cβˆk |, we may use the fact that eay − 1 < aey for a, y ≥ 0 to conclude 1 1 that |l 4 / l 4 | − 1 ≤ O(λ)|βˆk | as claimed. As a result, we have that |βˆk − z| < 1 for some k

kˆ

z with |arg z| < bβ + 41 bλ + , and so βˆk ∈ Dβ ( 13 ). We proceed to the proof of (3). Differentiating (4), we obtain

3

16B(λk λk − λ2k (1 + βˆk )−1 βk ) + ˆλ,k , (1 + βˆk )2 βˆ ˆβ,k λk βˆk k 2B βˆk λk 2 λk + − + ˆβ,k + . = L βk 1 − βk βk 1 + βˆk 1 + βˆk

λk+1 = λk − βk+1

From (3) (applied to λk ) and (4) we see that |λk+1 | =

|λ2k βk | (c2 + O(λk ) + O(1)), |1 + βˆk |

and as before, cf. (4.11), by choosing c2 large enough we obtain the desired bound on λk+1 . Likewise we apply the inductive assumptions to each term in the βk+1 equation to obtain O(λ2k )βˆk O(λ2k ) 2Bλk O(λk )βˆk 2 βk+1 = L βk 1 − . + + + 1 + βˆk |1 + βˆk |2 |1 + βˆk |2 |1 + βˆk | We put this in exponential form: βk+1

  kˆ ˆ λ ) O( β k k  eO(λ) −2Bλk + = L2(k+1) exp  ˆk | |1 + β j =0 ˆ − 4 eO(λ) . = L2(k+1) lk (β) 1

The error from replacing $\lambda_k$ with $(\lambda^{-1} + 8B\hat k)^{-1}$ (as with all the other error terms) is summable to $O(\lambda)$.

Corollary 4.3. If $(\hat\beta, \lambda) \in D_\beta \times D_\lambda$, then

\[ k_{\hat\beta} = O(1) + \tfrac12 \log(1 + |\hat\beta|^{-1}) + \tfrac18 \log \left| 1 + 4B\lambda \log(1 + |\hat\beta|^{-1}) \right|. \tag{4.15} \]

Proof. If $|\hat\beta| \ge 1$, then $k_{\hat\beta} = 0$ and (4.15) is valid. If $|\hat\beta| < 1$, then we need to solve for $k$ in the equation $\hat\beta_k = O(1)$. By Lemma 4.2(2) and the fact that $|l_k(\hat\beta)| = |1 + 8B\lambda k|\, e^{O(\lambda)}$, this can be written as

\[ |\hat\beta|\, L^{2k}\, |1 + 8B\lambda k|^{-1/4} = O(1). \]

Rewrite this as

\[ k = O(1) + \tfrac12 \log |\hat\beta|^{-1} + \tfrac18 \log |1 + 8B\lambda k|, \]

and solve by repeated substitution. The result can be expressed as in (4.15).
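The repeated substitution in this proof is a fixed-point iteration, and a small sketch shows how quickly it settles. The assumptions are mine, not the paper's: the $O(1)$ term is set to zero, $\lambda$ and $\hat\beta$ are taken real and positive, the logarithm is taken to base $L$ (so that $L^{2k}$ inverts to $\frac12 \log$), and the values of `B`, `lam` and `L` are placeholders. The function names are hypothetical.

```python
import math


def log_base(y, L):
    """Logarithm to base L (assumed convention for log in (4.15))."""
    return math.log(y) / math.log(L)


def solve_k(beta_hat, lam, B, L, iterations=50):
    """Solve k = (1/2) log(1/beta_hat) + (1/8) log|1 + 8 B lam k| by
    repeated substitution, starting from the first term alone."""
    base = 0.5 * log_base(1.0 / beta_hat, L)
    k = base
    for _ in range(iterations):
        k = base + 0.125 * log_base(abs(1 + 8 * B * lam * k), L)
    return k


def k_closed_form(beta_hat, lam, B, L):
    """Leading terms of (4.15), with the O(1) term dropped."""
    big_log = log_base(1 + 1.0 / beta_hat, L)
    return 0.5 * big_log + 0.125 * log_base(abs(1 + 4 * B * lam * big_log), L)
```

For small `beta_hat` the two functions agree up to a bounded difference, which is the sense in which the solution "can be expressed as in (4.15)".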


Proof of Proposition 1.5. (1) is just the shifted version of a statement in Proposition 1.3. (2) is Corollary 4.3. (3) is a restatement of Lemma 4.2(1). To obtain (4), note that by Lemma 4.2(2), ˆ − 4 eO(λ) . ˆ − 4 eO(λ) = (1 + 8Bλk) k (β)− 4 = βk L−2k /βˆ = lk (β) 1

1

1

(4.16)

(5) follows immediately from the geometric growth of βˆk and the recursion relation and bounds in Lemma 4.2(4). To obtain (6), consider first the ratio 1 + 8Bλkˆ k (T −1 ) 1 = eO(λ) (4.17) , ˆ ˆ k (β) 1 + 8Bλk2 ˆ | > 1, then where kˆ1 = min{k, k1/T } and kˆ2 = min{k, kβˆ }. By Corollary 4.3, if |βT ˆ |. O(1) ≤ k1/T − kβˆ ≤ O(1) + ( 21 + ) log |βT The same bounds hold for kˆ1 − kˆ2 , so (4.17) implies k (T −1 ) ≤ eO(λ) (1 + O(λ)(1 + log |βT ˆ |)). eO(λ) ≤ ˆ k (β) ˆ k (T −1 )|, note that Lemma 4.2(1) and (4.16) imply To get the same bounds on |λk (β)/λ ˆ = λk (β)

λ 1 + 8Bλkˆ

=

λ ˆ k (β)

eO(λ) ,

ˆ k (T −1 ) bound is really the same as the k (T −1 )/k (β) ˆ bound. so the λk (β)/λ To obtain (7), apply the recursion relations of Lemma 4.2(4) ad infinitum: ˆ = (β) ˆ = (ln (β))

∞ " k=0 ∞ k=0

2Bλk 1− + ˆβ,k , 1 + βˆk

∞ −2Bλk 2Bλk βk O(λk )βk eO(λ) = + + ˆβ,k . ˆ 2 1 + βˆk (1 + βˆk )2 k=0 (1 + βk )

By Lemma 4.2(2, 3), we have βk = βˆk βˆ −1 eO(λ) , so this can be written as βˆ −1

∞

O(λk )βˆk |1 + βˆk |−2 = βˆ −1 O(λkβˆ ) = βˆ −1 O(λβˆ ),

k=0

and (7) is an immediate consequence. Proceeding to (8), note that Lemma 4.2(4) implies that ∞ " 2Bλj βeff,∞ 1− = + ˆβ,j . βeff,k 1 + βˆj j =k


Thus we may obtain (8) from the following sequence of bounds: ∞ O(λ2j ) βeff,k 2Bλ j ln = + β ˆj eff,∞ 1 + β 1 + βˆj j =k k

≤

βˆ

∞ O(λ ) βˆ

O(λj ) +

j =k

j =kβˆ

|1 + βˆj |

≤ O(λk )(1 + max{kβˆ − k, 0}) ≤ O(λk )(1 + log(1 + |βˆk |−1 )). In the last step we have used the fact that since kβˆ is defined so that βˆkβˆ ≤ 1, the recursion −(2− )(kβˆ −k) for k < kβˆ . implies that |βˆk | ≤ L

References

[BEI92] Brydges, D.C., Evans, S.E., Imbrie, J.Z.: Self-avoiding walk on a hierarchical lattice in four dimensions. Ann. Probab. 20, 82–124 (1992)
[BI02] Brydges, D.C., Imbrie, J.Z.: The Green’s function for a hierarchical self-avoiding walk in four dimensions. Commun. Math. Phys.; DOI 10.1007/s00220-003-0886-5
[BLZ73] Brézin, E., Le Guillou, J.C., Zinn-Justin, J.: Approach to scaling in renormalized perturbation theory. Phys. Rev. D 8, 2418–2430 (1973)
[Fel71] Feller, W.: An Introduction to Probability Theory and its Applications, Vol. II, 2nd ed. New York: Wiley, 1971
[Gol02] Golowich, S.E.: The nearest-neighbor self-avoiding walk with complex killing rates. Ann. Henri Poincaré, to appear
[GI95] Golowich, S.E., Imbrie, J.Z.: The broken supersymmetry phase of a self-avoiding walk. Commun. Math. Phys. 168, 265–320 (1995)
[HS92] Hara, T., Slade, G.: Self-avoiding walk in five or more dimensions. I. The critical behavior. Commun. Math. Phys. 147, 101–136 (1992); The lace expansion for self-avoiding walk in five or more dimensions. Rev. Math. Phys. 4, 235–327 (1992)
[HT02] Hattori, T., Tsuda, T.: Renormalization group analysis of the self-avoiding paths on the d-dimensional Sierpiński gasket. J. Stat. Phys. 109, 39–66 (2002), mp arc:02-225
[IM94] Iagolnitzer, D., Magnen, J.: Polymers in a weak random potential in dimension four: Rigorous renormalization group analysis. Commun. Math. Phys. 162, 85–121 (1994)
[MS93] Madras, N., Slade, G.: The self-avoiding walk. Boston: Birkhäuser, 1993

Communicated by M. Aizenman

Commun. Math. Phys. 239, 549–584 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0886-5


Green’s Function for a Hierarchical Self-Avoiding Walk in Four Dimensions

David C. Brydges¹,²,*, John Z. Imbrie²

¹ University of British Columbia, Mathematics Department, #121-1984 Mathematics Road, Vancouver, B.C. V6T 1Z2, Canada. E-mail: [email protected]
² Department of Mathematics, University of Virginia, Kerchof Hall, P. O. Box 400137, Charlottesville, VA 22904-4137, USA. E-mail: [email protected]

Received: 21 May 2002 / Accepted: 25 March 2003 Published online: 2 July 2003 – © Springer-Verlag 2003

Abstract: This is the second of two papers on the end-to-end distance of a weakly self-repelling walk on a four dimensional hierarchical lattice. It completes the proof that the expected value grows as a constant times $\sqrt{T}\,\log^{1/8} T\,\bigl(1 + O\bigl(\frac{\log\log T}{\log T}\bigr)\bigr)$, which is the same law as has been conjectured for self-avoiding walks on the simple cubic lattice $\mathbb{Z}^4$. Apart from completing the program in the first paper, the main result is that the Green’s function is almost equal to the Green’s function for the Markov process with no self-repulsion, but at a different value of the killing rate $\beta$ which can be accurately calculated when the interaction is small. Furthermore, the Green’s function is analytic in $\beta$ in a sector in the complex plane with opening angle greater than $\pi$.

Contents

1. Introduction
2. τ Isomorphism
3. Renormalization Transformations
4. Coordinates for Interactions
5. Second Order Perturbation Theory
6. The Green’s Function
7. Estimates on a Renormalization Group Step
A. Convolution of Forms and Supersymmetry
B. Dirichlet Boundary Conditions
C. Calculations for Proposition 5.1

* Research supported by NSF grant DMS-9706166 and NSERC of Canada.


1. Introduction

This is the second in a series of two papers in which we study the asymptotic end-to-end distance of weakly self-avoiding walk on the four-dimensional hierarchical lattice $\mathcal{G}$. The reader is referred to [BEI92] or the first paper [BI02], henceforth referred to as paper I, for definitions of these terms. Results from the first paper have the prefix “I”.

In paper I we proved that the self-avoidance causes a $\log^{1/8} T$ correction in the expected end-to-end distance after time $T$ relative to the $\sqrt{T}$ law of a simple random walk. Paper I was devoted to the problem of how to recover the end-to-end distribution by taking the inverse Laplace transform of the Green’s function, assuming that the Green’s function has certain properties, which are proved in this paper in Theorem 1.1 and Proposition 6.1. These properties are of independent interest and, with minor changes, should also hold for the Green’s function for the simple cubic four dimensional lattice. We prove they hold for the hierarchical problem in this paper.

The interacting Green’s function is defined by the Laplace transform

$$
G_\lambda(\beta, x) = \int_0^\infty e^{-\beta T}\, E_0\!\left(\mathbb{1}_{\omega(T)=x}\, e^{-\lambda \int_{\mathcal{G}} \tau_x^2\, dx}\right) dT, \tag{1.1}
$$

where $\tau_x = \tau_x^{(T)}$ is the time up to $T$ that $\omega(t)$ is at site $x$. Our main result is Theorem 1.1. It says that the interacting Green’s function is almost equal to the Green’s function for the Markov process, $\lambda = 0$, but at a different value of the $\beta$ parameter which depends on $x$ and $(\beta, \lambda)$ and which can be accurately calculated when $\lambda$ is small. The error in the approximation decays more rapidly than the Green’s function because it contains, as a prefactor, the “running coupling constant” $\lambda_{N(x)}$. Furthermore the Green’s function is analytic in $\beta$ in a sector in the complex plane with opening angle greater than $\pi$. This very large domain of analyticity seems to be needed for accurately inverting the Laplace transform to calculate the end-to-end distance.

The main theorem refers to domains

$$
D_\beta = \{\beta \neq 0 : |\arg \beta| < b_\beta\}; \qquad
D_\lambda = \{\lambda : 0 < |\lambda| < \delta \text{ and } |\arg \lambda| < b_\lambda\}. \tag{1.2}
$$

For details see paper I, but, for example, we can choose $(b_\beta, b_\lambda) = \bigl(\frac{5\pi}{8}, \frac{\pi}{8}\bigr)$. The main theorem also refers to a recursion: given $(\beta_0, \lambda_0)$ we define the Renormalization Group (RG) recursion $(\beta_j, \lambda_j)$ in Sects. 3 and 4 and establish the recursive properties in Proposition 6.1. Having established these estimates, we know from paper I that the recursion has various properties. In particular, from Proposition I.1.3, for each $\lambda_0$ in the domain there exists a special choice $\beta_0 = \beta^c(\lambda_0)$ for the initial $\beta$ such that the RG recursion $(\beta_n^c, \lambda_n)$ is defined for all $n$ and $\beta_n^c \to 0$. This should be viewed as a partial description of a stable manifold for the fixed point $(0, 0)$, but note that our RG recursion is not autonomous because there are other degrees of freedom, for example, the $r$ in Sect. 4, which have been projected out in this simplified description. We called $(\beta_n^c, \lambda_n)$ the critical trajectory. For $\beta_0$ some other choice of initial data, we defined the deviation $\hat\beta_n := \beta_n - \beta_n^c$ of its trajectory from the critical trajectory. The main result is


Theorem 1.1. Let $\lambda_0 \in D_\lambda$ with $\delta$ sufficiently small. Then $G_{\lambda_0}(\beta_0, x)$ is analytic in $\beta_0$ in the domain $D_\beta + \beta^c(\lambda_0)$ and

$$
|G_{\lambda_0}(\beta_0, x) - G_0(\beta_{\mathrm{eff},N(x)}, x)| \le O(\lambda_{N(x)})\,|G_0(\beta_{\mathrm{eff},N(x)}, x)|. \tag{1.3}
$$

Here $N(x) = \log_L |x|$ for $x \neq 0$, $N(0) = 0$, and $\beta_{\mathrm{eff},j} = L^{-2j}\hat\beta_j$.

This theorem and Proposition 6.1 are the two results needed to complete the results in paper I.

The paper begins in Sect. 2 with a review of an isomorphism that recasts the Green’s function as an almost Gaussian integral with very special properties (supersymmetry). The virtue of this representation is that it leads to a precise definition of a Renormalization Group (RG) transformation which relates the Green’s function with given interaction to a rescaled Green’s function with smaller interaction. The RG transformation is defined in Sect. 3 and its effect on the interaction is further described in Sect. 4. To use this RG transformation we need an approximate calculation of its effect on the interaction. This is carried out in Sect. 5. The proof of the main Theorem 1.1 is in Sect. 6. All the analysis in this paper is in Sect. 7. The methods introduced there have several noteworthy features: (i) the demonstration that analysis with supersymmetric integrals containing differential forms ≡ Fermions is possible; (ii) the use of rotation of contours of integration in the supersymmetric integral to obtain the large domain of analyticity needed for inverting the Laplace transform; and (iii) simultaneous control over behavior in $x$ and behavior in $\beta$ both large and small.

With Steven Evans we wrote an earlier paper [BEI92] on this same model which studied the Green’s function but only at the critical value of the killing rate. Here we have opted for some repetition of ideas in that paper because we have since learned that the Grassmann algebras used in [BEI92] are natural differential forms and we wanted to incorporate this insight systematically.

2. τ Isomorphism

In a precise sense that we will now review, the differential form

$$
\phi_x \bar\phi_x + \frac{1}{2\pi i}\, d\phi_x\, d\bar\phi_x
$$

represents the time $\tau_x$ a finite state Markov process occupies state $x$. The following is a distillation of ideas in papers [McK80, PS80, Lut83, LJ87]. The forms $d\phi_x$, $d\bar\phi_x$ are multiplied by the wedge product. To connect with notation in [BEI92, BMM91], we set

$$
\psi_x = (2\pi i)^{-1/2}\, d\phi_x, \qquad \bar\psi_x = (2\pi i)^{-1/2}\, d\bar\phi_x,
$$

where $(2\pi i)^{-1/2}$ is a fixed choice of square root. Let $\Lambda$ be a finite set. Given any matrix $A_{xy}$ with indices $x, y \in \Lambda$ we define the even differential form

$$
S_A = \sum_{x,y\in\Lambda} \phi_x A_{xy} \bar\phi_y + \sum_{x,y\in\Lambda} \psi_x A_{xy} \bar\psi_y, \tag{2.1}
$$

and then the exponential of this form is defined by the Taylor series

$$
e^{-S_A} = e^{-\sum \phi_x A_{xy} \bar\phi_y} \sum_n \frac{1}{n!} \Bigl(-\sum \psi_x A_{xy} \bar\psi_y\Bigr)^n.
$$


The series terminates after finitely many terms because the anticommutative wedge product vanishes if the degree of the form exceeds the real dimension $2|\Lambda|$ of $\mathbb{C}^\Lambda$. By definition $\int_{\mathbb{C}^\Lambda}$ vanishes on forms that are of degree less than the real dimension of $\mathbb{C}^\Lambda$. For example, taking $|\Lambda| = 1$ and $A_{xy} = A > 0$, we find that $\int_{\mathbb{C}} e^{-S_A} = 1$ because only the $n = 1$ term in the expansion of the exponential contributes and this term is

$$
-(2\pi i)^{-1} A \int_{\mathbb{C}} e^{-\phi A \bar\phi}\, d\phi\, d\bar\phi = \frac{A}{\pi} \int e^{-A(u^2+v^2)}\, du\, dv = 1,
$$

using $\phi = u + iv$ and $d\phi\, d\bar\phi = -2i\, du\, dv$. Thus these integrals are self-normalizing. This feature generalizes:

Lemma 2.1. Suppose that $A$ has positive real part, meaning $\mathrm{Re} \sum \phi_x A_{xy} \bar\phi_y > 0$ for $\phi \neq 0$. Then

$$
\int_{\mathbb{C}^\Lambda} e^{-S_A} = 1, \qquad \int_{\mathbb{C}^\Lambda} e^{-S_A}\, \phi_a \bar\phi_b = C_{ab},
$$

where $C = A^{-1}$.

The second part of the lemma follows from the first part together with the standard fact that the covariance of a normalized Gaussian measure is the inverse of the matrix in the exponent. The first part is a corollary of Lemma 2.2 given below.

Let $\tau_x = \phi_x \bar\phi_x + \psi_x \bar\psi_x$, and let $\tau$ be the collection $(\tau_x)_{x\in\Lambda}$. Given any smooth function $F(t)$ defined on $\mathbb{R}^\Lambda$, we use the terminating Taylor series

$$
F(\tau) = \sum_\alpha \frac{1}{\alpha!}\, F^{(\alpha)}(\phi\bar\phi)\, (\psi\bar\psi)^\alpha
$$

to define the form $F(\tau)$, where $\phi\bar\phi = (\phi_x \bar\phi_x)_{x\in\Lambda}$ and $(\psi\bar\psi)^\alpha = \prod_x (\psi_x \bar\psi_x)^{\alpha_x}$. The even degree of $\tau_x$ relieves us of any necessity to specify an order for the product over forms.

Supersymmetry. There is a flow on $\mathbb{C}^\Lambda$ given by $\phi_x \to \exp(-2\pi i t)\phi_x$. This flow is generated by the vector field $X$ such that $X(\phi_x) = -2\pi i \phi_x$ and $X(\bar\phi_x) = 2\pi i \bar\phi_x$. A form $\omega$ is invariant (under this flow) if it is unchanged by the substitution $\phi_x \to \exp(-2\pi i t)\phi_x$. The Lie derivative $L_X \omega$ of a form $\omega$ is obtained by differentiating with respect to the flow at $t = 0$, so invariance is equivalent to $L_X \omega = 0$. Let $i_X$ be the interior product with the vector field $X$. The supersymmetry operator [Wit92, AB84]

$$
Q = d + i_X \tag{2.2}
$$

is an anti-derivation on forms with the property that $Q^2 = d\, i_X + i_X d = L_X$ is the Lie derivative. Therefore $Q^2 = 0$ on forms $\omega$ which are invariant. A form $\omega$ that satisfies the stronger property $Q\omega = 0$ is said to be supersymmetric. For example $S_A$ is supersymmetric and the derivation property implies that $\exp(-S_A)$ is also supersymmetric.


Since $Q\,\phi_x\, d\bar\phi_x = d\phi_x\, d\bar\phi_x + (2\pi i)\phi_x \bar\phi_x$, there is a form $u_x$ such that $\tau_x = Q u_x$. $S_A$ is also in the image of $Q$. For any form $u$ whose coefficients decay sufficiently rapidly at infinity,

$$
\int_{\mathbb{C}^\Lambda} Q u = 0,
$$

because the integral of $du$ is zero by Stokes theorem, while the integral of $i_X u$ is zero because it cannot contain a form of degree $2|\Lambda|$. Let $F \in C_0^\infty(\mathbb{R}^\Lambda)$. Then $\int e^{-S_A} F(\lambda\tau)$ is independent of $\lambda$, because the $\lambda$ derivative has the form

$$
\sum_x \int_{\mathbb{C}^\Lambda} e^{-S_A} F_{t_x}(\lambda\tau)\, Q u_x = \sum_x \int_{\mathbb{C}^\Lambda} Q\bigl(e^{-S_A} F_{t_x}(\lambda\tau)\, u_x\bigr) = 0.
$$

The compact support condition is a simple way to be sure that there are no boundary terms at infinity. Adequate decay of the integrand and its partial derivatives is all that is needed. If the exponential has better decay then there is no need for such a strong condition on $F$. Thus

Lemma 2.2. If $A$ has positive real part and $F$ is smooth on $\mathbb{R}^\Lambda$ with bounded derivatives, then

$$
\int_{\mathbb{C}^\Lambda} e^{-S_A} F(\tau) = F(0).
$$

Part (1) of Lemma 2.1 is obtained when $F = 1$. The following proposition will be called the τ isomorphism. It is the main result of this section.

Proposition 2.3 ([PS79, McK80]). Suppose that $A$ generates a Markov process with killing on first exit from $\Lambda$. Let $E_a(\cdot)$ denote the associated expectation over paths $\omega(t)$ such that $\omega(0) = a$. Let $F \in C_0^\infty(\mathbb{R}^\Lambda)$, then

$$
\int_{\mathbb{C}^\Lambda} e^{-S_A} F(\tau)\, \phi_a \bar\phi_b = \int_0^\infty dT\, E_a\bigl(F(\tau)\, \mathbb{1}_{\omega(T)=b}\bigr).
$$

On the right-hand side $\tau_x = \tau_x^T$ is the time up to $T$ that the stochastic process $\omega(t)$ is at site $x$,

$$
\tau_x^T = \int_0^T \mathbb{1}_{\omega(t)=x}\, dt.
$$

On the left-hand side it is the form $\phi_x \bar\phi_x + \psi_x \bar\psi_x$. As noted above, the compact support condition on $F$ is stronger than necessary. The left-hand side is a linear combination of integrals that involve finitely many derivatives of $F$. It is still valid if these derivatives have adequate decay at infinity, as will be the case for functions used in this paper.
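For $F \equiv 1$ the τ isomorphism, combined with Lemma 2.1, reduces to the identity $(A^{-1})_{ab} = \int_0^\infty (e^{-TA})_{ab}\, dT$, whose right-hand side is the expected time spent at $b$ by a walk started at $a$. The following minimal sketch checks this numerically for a two-state chain. The matrix entries, step size and helper names are illustrative assumptions, not taken from the paper.

```python
# Check A^{-1} = integral_0^infinity exp(-T A) dT for a two-state chain.
# The rates below are hypothetical and chosen only for illustration.

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def exp_minus(A, t, terms=30):
    """exp(-t A) from its Taylor series; adequate when t * |A| is small."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        step = [[-t * A[i][j] / n for j in range(2)] for i in range(2)]
        term = matmul(term, step)
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def integrated_semigroup(A, dt=0.01, steps=4000):
    """Trapezoid rule for the integral of exp(-T A) over [0, steps * dt]."""
    one_step = exp_minus(A, dt)
    current = [[1.0, 0.0], [0.0, 1.0]]
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(steps):
        nxt = matmul(current, one_step)  # semigroup property: exp(-(T+dt)A)
        total = [[total[i][j] + 0.5 * dt * (current[i][j] + nxt[i][j])
                  for j in range(2)] for i in range(2)]
        current = nxt
    return total

# A = kappa * I - (jump generator): jump rate 1 between the two sites,
# killing rate kappa = 0.5 (both values are assumptions).
A = [[1.5, -1.0], [-1.0, 1.5]]
C_exact = inverse(A)
C_quadrature = integrated_semigroup(A)
```

In this example each row of `C_exact` sums to 2, the expected lifetime $1/\kappa$ of a walk killed at rate $\kappa = 0.5$, which is what the occupation-time interpretation of the right-hand side suggests.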


Proof. We can assume, with no loss of generality, that

$$
\mathrm{Re} \sum \phi_x A_{xy} \bar\phi_y > 0 \quad \text{when } \phi \neq 0, \tag{2.3}
$$

because both sides are unchanged by $F(t) \to F(t)\, e^{\kappa \sum t_x}$, $A \to A + \kappa I$. Consider the special case where $F(\tau) = \exp(i \sum k_x \tau_x)$. Then

$$
\int_{\mathbb{C}^\Lambda} e^{-S_A} F(\tau)\, \phi_a \bar\phi_b = \int_{\mathbb{C}^\Lambda} e^{-S_{A-iK}}\, \phi_a \bar\phi_b, \tag{2.4}
$$

where $K$ is the diagonal matrix $k_x \delta_{xy}$. By Lemma 2.1 the right-hand side equals $(A - iK)^{-1}_{ab}$ which is

$$
\int_0^\infty dT\, \bigl(e^{-T[A-iK]}\bigr)_{ab} = \int_0^\infty dT\, E_a\bigl(F(\tau)\, \mathbb{1}_{\omega(T)=b}\bigr)
$$

by the Feynman-Kac formula¹ and $F(\tau) = \exp(i \sum k_x \tau_x) = \exp\bigl(i \int_0^T k_{\omega(s)}\, ds\bigr)$. By (2.4) the proposition is proven for $F(\tau) = \exp(i \sum k_x \tau_x)$. Both sides of the proposition are linear in $F$ so we can generalize to $F \in C_0^\infty$ by substituting the Fourier inversion formula $F(\tau) = (2\pi)^{-n} \int \hat F(k)\, e^{i \sum k_x \tau_x}\, d^{|\Lambda|}k$ into $\int \exp(-S_A) F(\tau) \phi_a \bar\phi_b$. Since the $k$ and $\phi$ integrals, by (2.3), are absolutely convergent, the integral over $k$ may be interchanged with the $\phi$ integrals.

¹ The Feynman-Kac formula is given in [Sim79] for Brownian motion. The proof that uses the Trotter product formula is valid for finite state Markov processes.

3. Renormalization Transformations

Since the τ isomorphism is only applicable when the state space is finite we have to study the Green’s function as a limit of processes with finite state spaces. Let $N$ be a positive integer and let $E_0^\Lambda(\cdot)$ be the expectation for the hierarchical Lévy process $\omega(t)$ killed on first exit from $\Lambda = \mathcal{G}_N$. Define the finite volume interacting Green’s function

$$
G_\lambda^\Lambda(\beta, x) = \int_0^\infty e^{-\beta T}\, E_0^\Lambda\!\left(\mathbb{1}_{\omega(T)=x}\, e^{-\lambda \int_{\mathcal{G}} \tau_x^2\, dx}\right) dT. \tag{3.1}
$$

When $\lambda = 0$, $G^\Lambda_{\lambda=0}(\beta, x)$ is the $\beta$ potential for the hierarchical Lévy process $\omega(t)$ killed on first exit from $\Lambda = \mathcal{G}_N$. In this section we single out this important object with the notation $U_\Lambda(\beta, x) = G^\Lambda_{\lambda=0}(\beta, x)$. Given a bounded smooth function $g(t)$ we define a generalization of the Green’s function

$$
G_g^\Lambda(\beta, x) = \int_0^\infty e^{-\beta T}\, E_0^\Lambda\bigl(g^\Lambda\, \mathbb{1}_{\omega(T)=x}\bigr)\, dT, \tag{3.2}
$$

where $g^\Lambda = \prod_{x\in\Lambda} g(\tau_x)$. When $g$ is the function $g(t) = \exp(-\lambda t^2)$ this is the Green’s function (3.1). By the τ isomorphism,

$$
G_g^\Lambda(\beta, x) = \int_{\mathbb{C}^\Lambda} \mu_\Lambda\, e^{-\beta \int_\Lambda \tau}\, g^\Lambda\, \phi_0 \bar\phi_x, \tag{3.3}
$$

where $\mu_\Lambda$ is the Gaussian form $\exp(-S_A)$ with $A$ equal to the inverse of $U_\Lambda$ with $\beta = 0$.

Scaling. This is a transformation $x \to L^{-1}x$ that maps the hierarchical lattice to itself by identifying all points that lie in the same ball of diameter $L$ in the hierarchical lattice so that they become a single point in a new hierarchical lattice. Thus it is the canonical projection (of groups), $\mathcal{G}_N \to \mathcal{G}_N / \mathcal{G}_1$, rewritten as

$$
L^{-1}x = (\ldots, x_3, x_2, x_1), \quad \text{for } x = (\ldots, x_3, x_2, x_1, x_0),
$$

which maps $\Lambda = \mathcal{G}_N$ to $\Lambda/L := \mathcal{G}_{N-1}$. Associated to these lattices we have manifolds $\mathbb{C}^\Lambda$ and $\mathbb{C}^{\Lambda/L}$. A point in $\mathbb{C}^\Lambda$ is specified by $(\phi_x)_{x\in\Lambda}$. Scaling therefore maps a point $\phi$ in $\mathbb{C}^{\Lambda/L}$ backward to a point $S\phi$ in $\mathbb{C}^\Lambda$ according to

$$
S\phi_x \equiv (S\phi)_x = L^{-1}\phi_{L^{-1}x},
$$

so $S\phi$ is constant on cosets $x + \mathcal{G}_1$. The prefactor $L^{-1}$ is put there to make the RG map, to be defined below, autonomous. Functions and forms on $\mathbb{C}^\Lambda$ are mapped forward. For example, for $x \in \Lambda$, $d\phi_x$ is a form on $\mathbb{C}^\Lambda$. Under scaling we get $S(d\phi_x) = L^{-1} d\phi_{x/L}$, which is a form on $\mathbb{C}^{\Lambda/L}$. We define scaling on covariances by

$$
SC(x, y) \equiv (SC_{\Lambda/L})(x, y) = L^{-2}\, C_{\Lambda/L}(L^{-1}x, L^{-1}y).
$$

Covariances are functions on the lattices so they are mapped backwards. To summarize: the direction of maps may appear reversed, but observe that the projection $\Lambda \to \Lambda/L$ induces a map backwards of manifolds $\mathbb{C}^{\Lambda/L} \to \mathbb{C}^\Lambda$ because $\mathbb{C}^X = \mathrm{map}(X \to \mathbb{C})$, and so forms/functions on the manifold $\mathbb{C}^\Lambda$ are mapped forward to forms/functions on the manifold $\mathbb{C}^{\Lambda/L}$.

The renormalization group rests on the following scaling decomposition

$$
U_\Lambda(\beta, x) = SU_{\Lambda/L}(L^2\beta, x) + \Gamma(\beta, x). \tag{3.4}
$$

The important properties of $\Gamma$ defined by this formula are that $\Gamma$ is positive semi-definite and finite range. In Appendix B we prove that

$$
\Gamma(\beta, x) = \frac{1}{1+\beta}\bigl(\mathbb{1}_{\mathcal{G}_0}(x) - L^{-4}\, \mathbb{1}_{\mathcal{G}_1}(x)\bigr). \tag{3.5}
$$

$\Gamma$ also has the inessential properties $\sum_y \Gamma(\beta, y) = 0$ and $\Gamma(\beta, y) = \Gamma(\beta, y')$ for $y, y' \neq 0$, which lead to simplifications specific to this model, notably Lemma 3.5. (Both properties can be read off from (3.5), and the first is the statement $B_1 = 0$ of (5.2).)

Let $C_{xy}$ with $x, y \in \Lambda$ be an invertible matrix and let $A$ be the inverse of $C_{xy}$. Define $\mu_C = \exp(-S_A)$. According to Lemma 2.1, $\int \mu_C\, \phi_a \bar\phi_b = C_{ab}$ whenever the inverse $A$ has positive-definite real part. The $\Gamma$ appearing in (3.4) is only positive semi-definite because, as an operator on $\mathbb{C}^\Lambda$, it has the kernel $\Xi^\perp$ consisting of all $\phi_x$ that are constant on cosets $x + \mathcal{G}_1$, because $\sum_y \Gamma_{x-y} = 0$. Let $\Xi \subset \mathbb{C}^\Lambda$ be the subspace orthogonal to the kernel of $\Gamma$. We may restrict $\Gamma$ to the invariant subspace $\Xi$; the restricted $\Gamma$ is then positive-definite and invertible. We choose coordinates $\zeta$ in $\Xi$ by picking any basis and define $\mu_\Gamma$ as a form on $\Xi$ using the inverse of $\Gamma$ computed in this basis. The form $\mu_\Gamma$ is independent of this choice of basis because forms are coordinate invariant. The matrix $SU_{\Lambda/L}$ also has no inverse, but by the same reasoning defines a form $\mu_{SU_{\Lambda/L}}$ on $\Xi^\perp$. The action of $S$ on covariances was defined so that

$$
\int_{\Xi^\perp} \mu_{SU}\, u = \int_{\mathbb{C}^{\Lambda/L}} \mu_U\, Su, \tag{3.6}
$$

where $U = U_{\Lambda/L}$ and $u$ is a form on $\Xi^\perp$.

Let $g(\phi) = \sum g_{\alpha,\bar\alpha}(\phi)\, d\phi^\alpha\, d\bar\phi^{\bar\alpha}$ be a form on $\mathbb{C}^\Lambda$. Then $g(\phi + \zeta)$ is a form on $\Xi^\perp \times \Xi$ defined by pullback, i.e., substituting $\phi + \zeta$ for $\phi$. Define the form $\mu_\Gamma * g$ on $\Xi^\perp$ by

$$
\mu_\Gamma * g(\phi) = \int_\Xi \mu_\Gamma(\zeta)\, g(\phi + \zeta).
$$

The scale decomposition (3.4) leads to the convolution property

$$
\int_{\mathbb{C}^\Lambda} \mu_U(\phi)\, F(\phi) = \int_{\Xi^\perp} \mu_{SU}(\phi)\, \mu_\Gamma * F(\phi), \tag{3.7}
$$

which is valid for $F(\phi)$ any smooth bounded form. This claim follows by changing variables $(u, v) = (\phi + \zeta, \zeta)$ and integrating out $v$ using Corollary A.3.

Definition 3.1. Define the linear operator $T_\beta$ that maps forms on $\mathbb{C}^\Lambda$ to forms on $\mathbb{C}^{\Lambda/L}$ by $T_\beta u = S\mu_{\Gamma(\beta)} * u$. $T_\beta$ is called a renormalization group (RG) transformation.

Proposition 3.2. With $\mu_\Lambda$ as defined below (3.3),

$$
\int_{\mathbb{C}^\Lambda} \mu_\Lambda\, e^{-\beta \int_\Lambda \tau}\, u = \int_{\mathbb{C}^{\Lambda/L}} \mu_{\Lambda/L}\, e^{-(L^2\beta) \int_{\Lambda/L} \tau}\, T_\beta u.
$$

Proof. By the τ isomorphism, the covariance of the Gaussian $\mu_\Lambda \exp(-\beta \int_\Lambda \tau)$ is the same as the covariance of $\mu_U$ when $U$ has parameters $\Lambda$, $\beta$. Therefore the two Gaussian forms are equal. By the convolution property (3.7), $\int \mu_U u = \int \mu_{SU}\, \mu_\Gamma * u$, where $U$ in $SU$ has scaled parameters $\Lambda/L$ and $L^2\beta$. Apply (3.6).

Lemma 3.3. $T_\beta$ commutes with the supersymmetry operator $Q$ defined in (2.2). Furthermore, if $u$ is an even supersymmetric form on $\mathbb{C}^{x+\mathcal{G}_1}$, then there is a unique function $f$ such that $T_\beta u = f(\tau_x)$.

Proof. $Q$ commutes with $S$ because $S$ is a pullback and the pullback commutes with the flow $\phi_x \to \exp(-2\pi i t)\phi_x$. By Lemma A.2, $Q$ also commutes with integrating out. Thus $[T_\beta, Q] = 0$ because $\mu_\Gamma$ is supersymmetric. The existence of $f$ follows from Lemma A.4.

Let $X$ be a subset of $\Lambda$. A form $F_X$ is said to be localized in $X$ if it is a form on $\mathbb{C}^X$. Since Gaussian random variables are independent if their covariance vanishes, we have the independence property

$$
\mu_\Gamma * (F_X G_Y) = (\mu_\Gamma * F_X)\,(\mu_\Gamma * G_Y),
$$

whenever the hierarchical distance between $X$ and $Y$ exceeds the range of $\Gamma$. Given forms $g_x$ localized at single sites $\{x\}$ let

$$
g^X = \prod_{x\in X} g_x.
$$

Then the independence property implies

$$
T_\beta\, g^\Lambda = \prod_{x\in\Lambda/L} T_\beta\, g^{x+\mathcal{G}_1}. \tag{3.8}
$$

$T_\beta\, g^{x+\mathcal{G}_1}$ is a form on $\mathbb{C}^{\{x\}}$. By Lemma 3.3, it has the form $g_{\mathrm{new}}(\tau_x)$ for some function $g_{\mathrm{new}}(t)$ on $\mathbb{R}$. This is a marvelous property of the hierarchical lattice because it means that the RG map preserves the multiplicativity of the interaction:

$$
T_\beta\, g^\Lambda = g_{\mathrm{new}}^{\Lambda/L},
$$

and therefore $T_\beta$ can be described by the map $g \to g_{\mathrm{new}}$. This has come about because the hierarchical topology has no overlapping neighborhoods: any pair of neighborhoods are either disjoint or nested. However, there is some redundancy in the pair $(\beta, g)$: as noted in (2.3), the Green’s function $G_g(\beta, x)$ depends only on the combination $g(t)\exp(-\beta t)$. We will remove this redundancy by imposing the normalization condition

$$
g(t) = 1 + O(t^2) \quad \text{as } t \to 0. \tag{3.9}
$$

This normalization assumes that $g_{\mathrm{new}}(0) = 1$. This is true for the initial interaction $g(t) = \exp(-\lambda t^2)$ and, by Lemma 2.2, it also holds for $g_{\mathrm{new}}$. After each map by $T_\beta$, which, referring to Proposition 3.2, takes $(\beta, g)$ to $(L^2\beta, g_{\mathrm{new}})$, the pair $(L^2\beta, g_{\mathrm{new}})$ is replaced by an equivalent pair $(\beta', g') = (L^2\beta + \nu', \exp(\nu' t)\, g_{\mathrm{new}})$ to restore (3.9).

Definition 3.4. Let $g$ be a smooth bounded function on $\mathbb{R}$ and let $g_x = g(\tau_x)$. $T_\beta g$ is the function on $\mathbb{R}$ given by

$$
T_\beta g(\tau_x) = e^{\nu' \tau_x}\, T_\beta\, g^{x+\mathcal{G}_1},
$$

where $\nu'$ is chosen so that $T_\beta g(t) = 1 + O(t^2)$.

In (3.3) there is also the factor $\phi_0 \bar\phi_x$. By the independence property (3.8), the RG acts on each of the factors $\phi_0$ and $\phi_x$ independently if $|x| > L$. In fact, in this model, the RG acts simply by scaling:

Lemma 3.5. Suppose that $g = a(\phi) + b(\phi)\, d\phi\, d\bar\phi$ is an even form on $\mathbb{C}$. Define $g_x$ by replacing $\phi$ by $\phi_x$, then

$$
T_\beta\, g^{x+\mathcal{G}_1} \phi_x = (T_\beta\, g^{x+\mathcal{G}_1})\, S\phi_x.
$$

Proof. Let $G = g^{x+\mathcal{G}_1}$ with $L^{-2}\phi_{x/L} + \zeta_x$ substituted in place of $\phi_x$, then

$$
T_\beta\, g^{x+\mathcal{G}_1} \phi_x = S \int \mu_\Gamma\, G\, S\phi_x + S \int \mu_\Gamma\, G\, \zeta_x,
$$

so we have to prove that $\int \mu_\Gamma G \zeta_x = 0$. By $\mathcal{G}_1$ invariance, $\int \mu_\Gamma G \zeta_x = \int \mu_\Gamma G \zeta_y$ for all $y \in x + \mathcal{G}_1$. Since $\zeta \in \Xi$, $\sum_{y\in x+\mathcal{G}_1} \zeta_y = 0$.


By (3.3), Proposition 3.2 and Lemma 3.5 we have proved that

Proposition 3.6. For $|x| > L$,

$$
G_g^\Lambda(\beta, x) = S\, G_{g'}^{\Lambda/L}(\beta', x), \tag{3.10}
$$

where $g' = T_\beta g$ and $\beta' = L^2\beta + \nu'$ with $\nu'$ as in Definition 3.4.

What happens if $|x| \le L$? The transformation of the factor $\phi_0 \bar\phi_x$ is no longer simple, but the next result says that it reproduces itself together with supersymmetric corrections.

Lemma 3.7. If $u$ is a smooth supersymmetric even form on $\mathbb{C}^{\mathcal{G}_1}$, then there are unique functions $f_1$, $f_2$ such that

$$
T_\beta\, u\, \phi_0 \bar\phi_x = f_1(\tau_0) + f_2(\tau_0)\, \phi_0 \bar\phi_0.
$$

Proof. Let $v = T_\beta\, u\, \phi_0 \bar\phi_x$ and $w = Qv$, then $w$ is a supersymmetric form of odd degree because $Q^2$ annihilates invariant forms. By Lemma A.4, $w = a(\tau)(\phi\, d\bar\phi + \bar\phi\, d\phi)$. The solutions of $w = Qv$ are $v = -(2\pi i)^{-1} a(\tau)\, \phi\bar\phi + b(\tau)$. $a$ and $b$ are unique because the degree zero part of $v$ determines the combination $-(2\pi i)^{-1} a(t)\, t + b(t)$ and the degree two part determines $-(2\pi i)^{-1} a'(t)\, t + b'(t)$.
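The scaling map $x \to L^{-1}x$ used throughout this section, and the iterates $x_j = L^{-j}x$ that appear in the next section, can be sketched on digit strings. The encoding below is an assumption modelled on the hierarchical lattice of [BEI92]: a site is a tuple of digits $(\ldots, x_2, x_1, x_0)$ with each digit in `range(L**4)`, and $|x| = L^p$ where $p$ counts the digits up to and including the highest non-zero one. The function names are hypothetical.

```python
# Sketch of the hierarchical scaling map on digit tuples (assumed encoding).

L = 2  # block side length; L**4 sites per block in four dimensions

def scale(x):
    """The map x -> L^{-1} x: drop the lowest-order digit x_0."""
    return x[:-1]

def norm(x):
    """Hierarchical norm |x| = L^p, p = position of the highest non-zero digit plus one; |0| = 0."""
    for index, digit in enumerate(x):
        if digit != 0:
            return L ** (len(x) - index)
    return 0

def iterations_to_origin(x):
    """Number of scalings needed to send x to 0; equals log_L |x| for x != 0."""
    count = 0
    while any(x):
        x = scale(x)
        count += 1
    return count

x = (0, 3, 0, 7)            # digits (x_3, x_2, x_1, x_0)
n_of_x = iterations_to_origin(x)
iterates = [x]
for _ in range(n_of_x):
    iterates.append(scale(iterates[-1]))
```

With this encoding the iterates $x_j$ satisfy $|x_j| \ge L$ exactly for $j \le N(x) - 1$, which is the range in which the observable is transformed by pure scaling.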

4. Coordinates for Interactions

The parameters in the Green’s function (3.2) are $\beta$, a smooth function $g$, and the volume $\Lambda$. $g$ defines a coupling constant $\lambda$ and a smooth function $r(t)$ by

$$
g(t) = e^{-\lambda t^2} + r(t) \quad \text{with } r(t) = O(t^3) \text{ as } t \to 0,
$$

because by Definition 3.4, $\beta$ is adjusted so that $g(t) = 1 + O(t^2)$. $r$ will be called the remainder. Consequently, we may describe the map $g \to T_\beta g$ by its action on the parameters $\beta, \lambda, r \to \beta', \lambda', r'$, where $\beta'$, $\lambda'$ and $r'$ solve

$$
e^{-L^2\beta\tau}\, T_\beta\, (e^{-\lambda\tau^2} + r)^{\mathcal{G}_1} = e^{-\beta'\tau}\bigl(e^{-\lambda'\tau^2} + r'\bigr), \qquad r'(t) = O(t^3). \tag{4.1}
$$

Iteration of this map defines a finite sequence $(\beta_j, \lambda_j, r_j, \Lambda_j)_{j=0,\ldots,N-1}$. The sequence terminates because the initial $\Lambda_0 = \mathcal{G}_N$ is scaled down by $L$ with each RG map and eventually becomes $\mathcal{G}_1$. Then there is one final integration. This sequence exists for any initial choice of parameters with $\mathrm{Re}\,\lambda_0 > 0$ because $\Lambda_0$ is finite. This hierarchical model has the nice feature that enlarging the initial volume $\Lambda_0$ merely extends the sequence: the longer sequence coincides with the shorter for shared indices $j$. Therefore the sequences consistently extend to an infinite sequence.

In (3.3) we rewrite the observables $\phi_0$ and $\phi_x$ as if they were part of the interaction by taking the derivative with respect to a new parameter $\gamma$ and setting $\gamma = 0$:

$$
g^\Lambda\, \phi_0 \bar\phi_x = \frac{d}{d\gamma}\Big|_0\, g^\Lambda e^{\gamma O},
$$

with $O = b_1 \phi_0 \bar\phi_x$ and $b_1 = 1$. Then Lemma 3.5 asserts that $T_\beta$ acts on $\gamma O$ at order $\gamma$ by $x \to L^{-1}x$ and $b_1 \to L^{-2} b_1$ to produce iterates $x_j := L^{-j}x$ and $b_{1,j} := L^{-2j} b_1$ for $j = 1, \ldots, N(x) - 1$, such that $|x_j| \ge L$. $N(x)$ is the number of iterations, $\log_L |x|$, that are needed to scale $x$ to $0$. For $j \ge N(x) - 1$, Lemma 3.7 asserts that $T_j := T_{\beta_j}$ maps $\gamma O_j$ into

$$
O_{j+1} = f_{1,j+1}(\tau_0) + f_{2,j+1}(\tau_0)\, \phi_0 \bar\phi_0.
$$

In this and subsequent calculations functions of $\gamma$ are identified with their linearizations because we only need to know the derivative with respect to $\gamma$ at $\gamma = 0$. The Green’s function can be accurately calculated without complete knowledge of $f_{1,j}$ and $f_{2,j}$. To this end we define

$$
v_x = \begin{cases} \lambda \tau_x^2 & \text{if } x \neq 0 \\ \lambda \tau_0^2 - \gamma\,(b_0 + b_1 \phi_0 \bar\phi_0 + b_2 \tau_0 \phi_0 \bar\phi_0 + b_3 \tau_0) & \text{if } x = 0 \end{cases} \tag{4.2}
$$

and consider the action of $T_\beta$ on $g^\Lambda$ when $g_x = e^{-v_x} + r_x$. Part of the observable $O$ is in $v$, and the rest of it, which is the part we will not need to calculate in detail, is in $r$. The split is uniquely determined by

Definition 4.1. Let $r$ be a form localized at a single lattice site. We say $r$ is normalized if $\frac{d^q}{dt^q} r_t = 0$ at $t = 0$ for $q = 0, 1, \ldots, 5$, where $r_t$ is defined by replacing $\phi$ by $t\phi$ in $r$ including in $d\phi$ and $d\bar\phi$. The interaction $v$ is said to be normalized if $v = \lambda\tau^2 + \gamma O$ for some $\lambda$ and $O$ is an even polynomial form of degree less than or equal to four.

The action of $T_\beta$ on $g^\Lambda$ is now completely described by the action on parameters $(\beta, \lambda, b, r, \Lambda) \to (\beta', \lambda', b', r', \Lambda')$, where $b = (b_0, b_1, b_2, b_3)$. The Green’s function is

$$
G_{\lambda_0}^{\Lambda_0}(\beta_0, x)
= \frac{d}{d\gamma}\Big|_0 \int_{\mathbb{C}^{\Lambda_0}} \mu_{\Lambda_0}\, e^{-\beta_0 \int \tau - \lambda_0 \int \tau^2}\, e^{\gamma \phi_0 \bar\phi_x}
= \frac{d}{d\gamma}\Big|_0 \int_{\mathbb{C}^{\mathcal{G}_1}} \mu_{\mathcal{G}_1}\, e^{-\beta_{N-1}\tau}\, [e^{-v_{N-1}} + r_{N-1}]
=: b_{0,N}, \tag{4.3}
$$

because the final integration with respect to $\mu_{\mathcal{G}_1}$ is being considered to be a final RG map² followed by setting $\phi$ and $d\phi$ to zero so that only the $b_0$ part of $v$ ends up in the final result. Note that the normalization condition is designed so that there is no $\gamma$ dependence in the sequence $(\beta_j, \lambda_j)$.

A surprising fact is that the $b_3\tau$ term never plays any role beyond being there! No term of the form $\gamma F(\tau)$ with $F(0) = 0$ contributes to the Green’s function because $(d/d\gamma)_0 \int \mu\, G(\tau) \exp(\gamma F(\tau)) = G(0) F(0)$ by Lemma 2.2. For this reason, we leave out the $b_3$ terms in the rest of this paper.

² with $\Gamma$ replaced by $U_{\mathcal{G}_1}$


5. Second Order Perturbation Theory

We have shown that the RG induces a map from parameters $(\beta, \lambda, b, r)$ to $(\beta', \lambda', b', r')$. We will also write $v \to v'$ recalling that $v$ is determined by (4.2). In this section we construct an approximation

$$
v \to \tilde v, \qquad (\beta, \lambda, b) \to (\tilde\beta, \tilde\lambda, \tilde b)
$$

to the exact map using second order perturbation theory in powers of $\lambda$. This approximation plays a major role in determining the $\log^{1/8}$ corrections in the end-to-end distance of the interacting walk. Define

$$
B = 1 - L^{-4}, \qquad B_p = \int \Gamma(y)^p\, dy. \tag{5.1}
$$

$B_p$ is a function of $\beta$ through $\Gamma$. In particular,

$$
B_1 = 0, \qquad B_2 = B(1+\beta)^{-2}, \qquad \Gamma(0) = B(1+\beta)^{-1}. \tag{5.2}
$$
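The values in (5.2) follow from the explicit formula (3.5) for $\Gamma$ by a finite sum. The sketch below carries out that sum in exact rational arithmetic. It assumes that $\mathcal{G}_0 = \{0\}$, that $\mathcal{G}_1$ contains $L^4$ sites, and that $dy$ is counting measure; the function names are hypothetical.

```python
from fractions import Fraction

def gamma(beta, L, in_G0, in_G1):
    """Gamma(beta, x) from (3.5), given the membership of x in G_0 and G_1."""
    indicator_part = Fraction(int(in_G0)) - Fraction(int(in_G1), L ** 4)
    return indicator_part / (1 + beta)

def B_p(p, beta, L):
    """B_p = sum over y of Gamma(beta, y)^p; Gamma vanishes outside G_1."""
    at_origin = gamma(beta, L, True, True)     # y = 0 lies in G_0 and G_1
    elsewhere = gamma(beta, L, False, True)    # the other L^4 - 1 sites of G_1
    return at_origin ** p + (L ** 4 - 1) * elsewhere ** p

L = 2
beta = Fraction(1, 3)                          # arbitrary illustrative value
B = 1 - Fraction(1, L ** 4)
B1 = B_p(1, beta, L)
B2 = B_p(2, beta, L)
gamma_at_zero = gamma(beta, L, True, True)
```

The cancellation $B_1 = 0$ is exact: the single site at the origin contributes $(1 - L^{-4})/(1+\beta)$ and the remaining $L^4 - 1$ sites contribute $-L^{-4}/(1+\beta)$ each.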

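We will show that the second order approximation to $(\beta, \lambda) \to (\beta', \lambda')$ is

$$
\tilde\beta = L^2\bigl(\beta + 2\Gamma(0)\lambda + O(\Gamma^3)\lambda^2\bigr), \qquad
\tilde\lambda = \lambda - 8 B_2 \lambda^2, \tag{5.3}
$$

where $O(\Gamma^p)$ is an analytic function³ of $\beta$ and $\lambda$ that is bounded in absolute value by $c\,|1+\beta|^{-p}$, for $|\lambda|$ bounded. Likewise, the parameters $b$ have the approximate recursion

$$
\begin{aligned}
\tilde b_{0,j+1} &= b_{0,j} + \Gamma_j(x_j)\, b_{1,j} + O(\Gamma_j^3)\, \lambda_j b_{1,j} + O(\Gamma_j^2)\, b_{2,j}, \\
\tilde b_{1,j+1} &= L^{-2} b_{1,j} + O(\Gamma_j^2)\, \lambda_j b_{1,j} + O(\Gamma_j)\, b_{2,j}, \\
\tilde b_{2,j+1} &= L^{-4} b_{2,j} + O(\Gamma_j^2)\, \lambda_j b_{2,j},
\end{aligned} \tag{5.4}
$$

where the $j$ subscript on $\Gamma$ means that $\beta = \beta_j$. All the terms involving $\Gamma$ vanish for $j < N(x) - 1$. See Sect. 4.

Proposition 5.1. Let $r_{\mathrm{main}} = T_\beta (e^{-v})^{\mathcal{G}_1} - e^{-\tilde v - \tilde\nu\tau}$ with $\tilde\nu = \tilde\beta - L^2\beta$. Then $r_{\mathrm{main}} = O(\lambda^3) + O(\lambda^2\gamma)$ as a formal series in powers of $\lambda$.

To prove Proposition 5.1 we introduce the following Laplacian

$$
\Delta_\Gamma = \sum_{x,y} \Gamma(x-y) \left( \frac{\partial}{\partial\phi_x} \frac{\partial}{\partial\bar\phi_y} + \frac{\partial}{\partial\psi_x} \frac{\partial}{\partial\bar\psi_y} \right). \tag{5.5}
$$

The partial derivatives $\partial/\partial\psi_x$ are formal anti-derivatives (equivalently, interior products with vector fields dual to the forms $\psi_x$). Let $F$ be a smooth bounded form, then

$$
\mu_{t\Gamma} * F = F + t\,\Delta_\Gamma F + O(t^2) \quad \text{as } t \to 0. \tag{5.6}
$$

³ They are integrals of $p$ or more covariances $\Gamma(\beta)$, and polynomial in $\lambda$. They can be computed explicitly using Feynman diagrams as in Appendix C.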


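Given forms $X$, $Y$ we define a new form $X \overleftrightarrow{\Delta}_\Gamma Y$ by

$$
\Delta_\Gamma (XY) = (\Delta_\Gamma X)\,Y + X \overleftrightarrow{\Delta}_\Gamma Y + X\, \Delta_\Gamma Y.
$$

Therefore, denoting partial derivatives with respect to $\phi_x$ and $\bar\phi_x$ by subscripts,

$$
X \overleftrightarrow{\Delta}_\Gamma Y = \sum_{x,y} \Gamma(x-y) \Bigl( X_{\phi_x} Y_{\bar\phi_y} + X_{\bar\phi_x} Y_{\phi_y} + (-1)^{\mathrm{sgn}(X)} \bigl( X_{\psi_x} Y_{\bar\psi_y} - X_{\bar\psi_x} Y_{\psi_y} \bigr) \Bigr).
$$

Thus we can define $X \overleftrightarrow{\Delta}_\Gamma \overleftrightarrow{\Delta}_\Gamma Y$ by applying the second $\overleftrightarrow{\Delta}_\Gamma$ to each term on the right-hand side of this equation.

A polynomial form is a form whose coefficients are polynomials in $\phi$ and $\bar\phi$. The essential property of such forms is that they are annihilated by $\Delta_\Gamma^j$ for $j \gg 1$. Let $V$ be any polynomial form and set

$$
V_t = e^{t\Delta_\Gamma} V, \qquad \mathcal{L} = \frac{\partial}{\partial t} - \Delta_\Gamma.
$$

Then $V_t$ has the important property $\mathcal{L} V_t = 0$.

Lemma 5.2. For any polynomial form $V$,

$$
Q_t = \frac{1}{2} \sum_{j\ge 1} \frac{1}{j!}\, V_t\, \bigl(t\overleftrightarrow{\Delta}_\Gamma\bigr)^j\, V_t
$$

satisfies

$$
\mathcal{L}[-V_t + Q_t] = \frac{1}{2}\, V_t \overleftrightarrow{\Delta}_\Gamma V_t.
$$

Proof. Since $V$ is a polynomial form, the sum over $j$ terminates after finitely many terms, so $Q$ is defined. Furthermore $\mathcal{L} V_t = 0$. Therefore,

$$
\begin{aligned}
\mathcal{L}\Bigl[ V_t \bigl(t\overleftrightarrow{\Delta}_\Gamma\bigr) V_t \Bigr] &= V_t \overleftrightarrow{\Delta}_\Gamma V_t - V_t \overleftrightarrow{\Delta}_\Gamma \bigl(t\overleftrightarrow{\Delta}_\Gamma\bigr) V_t, \\
\mathcal{L}\Bigl[ \frac{1}{2!} V_t \bigl(t\overleftrightarrow{\Delta}_\Gamma\bigr)^2 V_t \Bigr] &= V_t \overleftrightarrow{\Delta}_\Gamma \bigl(t\overleftrightarrow{\Delta}_\Gamma\bigr) V_t - \frac{1}{2!} V_t \overleftrightarrow{\Delta}_\Gamma \bigl(t\overleftrightarrow{\Delta}_\Gamma\bigr)^2 V_t,
\end{aligned}
$$

together with similar equations for the remaining terms in $Q_t$. Add these equations; the sum telescopes.

Lemma 5.3. Let $\hat V_t = V_t - Q_t$. Then $\mu_\Gamma * e^{-V} - e^{-\hat V_1} = O(V^3)$, where $O(V^3)$ means that the formal power series in powers of $\alpha$ obtained by replacing $V$ by $\alpha V$ in the left-hand side is $O(\alpha^3)$.

We call $\hat V_1$ the second order perturbative effective interaction.

Proof. We use the Duhamel formula: Let $W_t$ be any smooth family of forms with $W_0 = \exp(-V)$. Then, using (5.6),

$$
\mu_\Gamma * e^{-V} - W_1 = -\int_0^1 dt\, \frac{d}{dt}\, \mu_{(1-t)\Gamma} * W_t = -\int_0^1 dt\, \mu_{(1-t)\Gamma} * \mathcal{L} W_t.
$$

Choose $W_t = \exp(-\hat V_t)$. By Lemma 5.2, $\mathcal{L} W_t = O(\lambda^3)$ because

$$
\mathcal{L}\, e^{-\hat V_t} = e^{-\hat V_t} \left( \frac{1}{2} V_t \overleftrightarrow{\Delta}_\Gamma V_t - \frac{1}{2} \hat V_t \overleftrightarrow{\Delta}_\Gamma \hat V_t \right)
= e^{-\hat V_t} \left( Q_t \overleftrightarrow{\Delta}_\Gamma V_t - \frac{1}{2} Q_t \overleftrightarrow{\Delta}_\Gamma Q_t \right) = O(V^3). \tag{5.7}
$$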

Proof of Proposition 5.1. By Lemma 5.3 applied to $V := \int_{x+\mathcal{G}_1} v_y\, dy$, with $v_y$ as defined in (4.2),

$$
e^{-L^2\beta\tau_x}\, T_\beta\, e^{-V} = e^{-L^2\beta\tau_x}\, S e^{-\hat V_1} = e^{-L^2\beta\tau_x}\, e^{-S\hat V_1}.
$$

$S\hat V_1$ is a priori a polynomial of degree six, but in fact the degree six part vanishes because it contains, from $\overleftrightarrow{\Delta}_\Gamma$, a factor $B_1 = 0$. Therefore, $S\hat V_1$ is a polynomial form of degree four. It contains a part $\tilde\nu\tau_x$ which is absorbed into $\tilde\beta$, so that $\tilde\beta = L^2\beta + \tilde\nu$, and the remaining part $S\hat V_1 - \tilde\nu\tau_x =: \tilde v_x$ is in the form (4.2) with coefficients $(\tilde\lambda, \tilde b)$.

Details for deriving the formulas (5.3, 5.4) for $(\tilde\lambda, \tilde b)$ are in Appendix C. The main points are as follows: Suppose coefficients $b$, $\lambda$ are assigned minus the degree of the monomial they preface. Thus $b_0$ has degree zero, $b_1$, $\nu$ have degree $-2$ and $b_2$, $\lambda$ have degree $-4$. Let $\Gamma$ have degree 2. Then, for example, a term such as $b_2 \lambda\, O(\Gamma^2)$ can appear in the right-hand side of the $b_2$ recursion because it has degree $-4 - 4 + 4 = -4$, which equals the degree of $b_2$. There are certain terms that do not appear because they contain a factor in the form $B_1 = \int \Gamma(y)\, dy = 0$. In fact, for this reason the $b$ recursion is triangular.
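The $\lambda$ part of the second order recursion (5.3) can be iterated numerically to see how slowly the coupling decays. The sketch below holds $\beta$ at $0$, a simplifying assumption made here only for illustration (so that $B_2 = B$), and drops all higher order terms. The function name and the initial value are hypothetical.

```python
def lambda_flow(lam0, L=2, steps=1000):
    """Iterate lambda -> lambda - 8 * B * lambda**2 with beta held at 0 (assumption)."""
    B = 1.0 - L ** (-4)
    lams = [lam0]
    for _ in range(steps):
        lam = lams[-1]
        lams.append(lam - 8.0 * B * lam * lam)
    return B, lams

B, lams = lambda_flow(0.01)
j = len(lams) - 1
# Since 1/lam_{j+1} = 1/lam_j + 8B / (1 - 8B lam_j) >= 1/lam_j + 8B,
# the inverse coupling grows at least linearly in j.
linear_lower_bound = 1.0 / lams[0] + 8.0 * B * j
inverse_final = 1.0 / lams[-1]
```

Under these assumptions $1/\lambda_j$ exceeds $1/\lambda_0 + 8Bj$ only by a correction that grows logarithmically in $j$, so $\lambda_j$ decays roughly like $1/(8Bj)$ rather than geometrically.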

The Large Field Problem. We are confining ourselves at present to formal power series statements because $W_t = \exp(-\hat V_t)$ is not integrable: unlike $S\hat V$, the sixth degree part of $\hat V$ does not vanish and $W_t$ consequently fails to be integrable. When we prove estimates on remainders in Proposition 7.8, we will use another choice of $W_t$ which is the same up to order $O(V^3)$.

6. The Green’s Function

In this section we will prove the main result in this paper, Theorem 1.1. The theme of the proof is that accurate control of the flow of the coupling constants together with the algebraic properties of the renormalization group developed in Sects. 3 and 4 imply Theorem 1.1. Recall that in paper I we introduced the enlarged domains

$$
\begin{aligned}
\bar D_\beta &= \{\beta \neq 0 : |\arg\beta| < b_\beta + \tfrac{1}{4} b_\lambda + \epsilon\}, \\
\bar D_\beta(\rho) &= \bar D_\beta + B(\rho) \quad \text{with } B(\rho) = \{\beta : |\beta| < \rho\}, \\
\bar D_\lambda &= \{\lambda : 0 < |\lambda| < \bar\delta \text{ and } |\arg\lambda| < b_\lambda + \epsilon\}.
\end{aligned}
$$

We will need

Proposition 6.1. Let $(\beta_0, \lambda_0)$ be in the domain $\bar D_\beta(\tfrac{1}{2}) \times \bar D_\lambda$ with $\bar\delta$ sufficiently small. The sequence $(\beta_j, \lambda_j)_{j=0,1,\ldots,M}$ is such that

$$
\begin{aligned}
\lambda_{j+1} &= \lambda_j - \frac{8B\lambda_j^2}{(1+\beta_j)^2} + \epsilon_{\lambda,j}, \\
\beta_{j+1} &= L^2 \left( \beta_j + \frac{2B}{1+\beta_j}\, \lambda_j + \epsilon_{\beta,j} \right),
\end{aligned} \tag{6.1}
$$

where the $\epsilon_{\lambda,j}$, $\epsilon_{\beta,j}$ defined by these equations are analytic functions of $(\beta_0, \lambda_0)$ satisfying

$$
|\epsilon_{\lambda,j}| \le c_L\, |\lambda_j|^3\, |1+\beta_j|^{-1}, \qquad
|\epsilon_{\beta,j}| \le c_L\, |\lambda_j|^2\, |1+\beta_j|^{-2}. \tag{6.2}
$$

Here $B = 1 - L^{-4}$ and $M$ is the first integer such that $(\beta_M, \lambda_M)$ is not in the domain $D_\beta(\tfrac12) \times D_\lambda$. If no such integer exists, then $M = \infty$. The formulas (6.1) were already obtained in (5.3). The new content is the estimate on the errors.

Recall also that there are observable parameters $b_j := (b_{0,j}, b_{1,j}, b_{2,j})$. Let $\epsilon_{*,j} = b_{*,j+1} - \tilde b_{*,j+1}$, where $* = 0, 1, 2$, be the errors between the exact recursions for these parameters and the second order perturbative recursions defined in (5.4). Since we defined the observable to be a derivative at $\gamma = 0$, we may suppose, without loss of generality, that the $\epsilon_{*,j}$ are linear in $b_j$. Higher order terms will drop out when $\gamma$ is set to zero. $O(\Gamma_j^p)$ denotes an analytic function of $\beta_0$ and $\lambda_0$ which is defined on $D_\beta(\tfrac12) \times D_\lambda$ and bounded in absolute value by $c_L |1+\beta_j|^{-p}$.

Proposition 6.2. For $M \ge j \ge N(x) - 1$ and $q = 0, 1, 2$,

\[ |\epsilon_{q,j}| \le O(\Gamma_j^{3-q})\,|\lambda_j|^2\,|b_j|, \]

where $|b_j| := |b_{1,j}| + |b_{2,j}|$. These two propositions will be proved in the next section. By invoking these propositions, we also gain the right to use Propositions I.1.5 and I.1.3 from Paper I, because these are corollaries of Proposition 6.1. In the next proof $c_L$ denotes constants chosen after L is fixed, whereas c denotes constants chosen before L is fixed. The values of these constants are not relevant to the proof so these symbols can change values from one appearance to the next.

Proof of Theorem 1.1. During this proof we will write $\bar\lambda := \lambda_{N(x)}$, $\bar\Gamma := \Gamma_{N(x)}$ and $k := j - N(x) + 1$. We fix $\xi \in (1, 2)$. The constant $\bar\delta$ that controls the size of $|\lambda|$ in the domain $D_\lambda$ will be the minimum of a finite number of choices that achieve bounds in generic inductive steps. The choices depend on L and ξ. Thus, throughout the proof we will be using the following principles:

1. $\exists c : |\lambda_j| \le c|\bar\lambda|$ for $j \ge N(x)$ by Proposition I.1.5.
2. $|c_L \lambda_j| < c$ for any c because the domain for $\lambda_0$ is chosen after L is fixed.
3. $1 + c_L|\lambda_j| < \xi$ because the domain for $\lambda_0$ is chosen after ξ is fixed.
4. $O(\Gamma_j) \le c_L$ because $\beta_j \in D_\beta(\tfrac12)$.

¯ because it holds when |βN(x) | ≤ 1 since βj ∈ 5. For j ≥ N(x), O( j ) ≤ cO( ) Dβ (1/2) and it also holds when |βN(x) | > 1 because then the β recursion causes βj to grow exponentially. Recall from Sect. 4 that for j ≤ N (x) − 1, b0,j and b2,j vanish, while b1,j = L−2j . These values will start inductive arguments at j = N (x)−1. The term recursion denotes the perturbative recursion (5.4) combined with the bound on the error, Proposition 6.2. ¯ λ| ¯ 2 ξ k L−2j . Claim 1. For j ≥ N(x), |b1,j | ≤ ξ k L−2j , |b2,j | ≤ O( )| Proof. By induction using the recursion.

¯ Claim 2. For j ≥ N(x), |L2j b1,j − 1| ≤ ξ k O( ¯ 2 )|λ|. Proof. Let rk := |L2j b1,j − 1|. Then r0 = 0. By the recursion and Claim 1, rk+1 ≤ ¯ λ|ξ ¯ k . This implies the claim. rk + O( )|

Claim 3. Define a sequence $a_j$ such that $a_j = 0$ for $j = N(x) - 1$ and $a_{j+1} = a_j + L^{-2j}\,\Gamma(\beta_j, L^{-2j}x)$ for $j \ge N(x)$. Then

\[ |b_{0,j} - a_j| \le L^{-2N(x)}\, O(\bar\Gamma^3)\,|\bar\lambda|. \]

Proof. Let $r_j = |b_{0,j} - a_j|$. Then $r_j = 0$ for $j = N(x) - 1$. By the recursion and the previous claims,

\[ r_{j+1} \le r_j + L^{-2j}\,\xi^k\, O(\bar\Gamma^3)\,|\bar\lambda|. \]

This proves the claim.
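The inductions behind Claims 2 and 3 both close by summing a geometric series. As a sketch of that step, write $a$ for the $k$-independent prefactor of the increment (the letter $a$ is a placeholder introduced here, not notation from the paper), and take $r_0 = 0$, $\xi \in (1,2)$ and $L \ge 2$:

```latex
% increments growing like \xi^k
r_{k+1} \le r_k + a\,\xi^k
  \;\Longrightarrow\;
  r_k \le a \sum_{i=0}^{k-1} \xi^i = a\,\frac{\xi^k - 1}{\xi - 1} \le \frac{a}{\xi - 1}\,\xi^k ,
\\
% increments carrying L^{-2j} with j = k + N(x) - 1
\sum_{k \ge 0} L^{-2(k + N(x) - 1)}\,\xi^k
  = \frac{L^{-2(N(x)-1)}}{1 - \xi L^{-2}}
  \le 2 L^{2}\, L^{-2N(x)} .
```

Since ξ is fixed, the factor $1/(\xi - 1)$ is a constant, and the $L$-dependent factor in the second line can be absorbed into a constant of type $c_L$.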

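Recall the definition of $\hat\beta_j = \beta_j - \beta_j^c$ in Proposition I.1.3. Let $u_j := L^{2[k-1]}\hat\beta_{N(x)}$. By Proposition I.1.5, $u_j \in D_\beta$.

Claim 4. For $j \ge N(x) - 1$,

\[ \left|\frac{1}{1+\beta_j} - \frac{1}{1+u_j}\right| = \left|\frac{u_j - \beta_j}{(1+\beta_j)(1+u_j)}\right| \le |\bar\lambda|\,\xi^k\, O(\bar\Gamma^2). \]

Proof. By Proposition I.1.3, $(1+\beta_j)^{-1}$ is indistinguishable from $(1+\hat\beta_j)^{-1}$ up to an error that can be absorbed in the right-hand side of our claim. By Lemma I.4.2, part (4),

\[ \hat\beta_j = u_j \prod_{l=N(x)}^{j-1} \big(1 + O(\lambda_l\, O(\Gamma_l))\big), \]

so

\[ |\hat\beta_j - u_j| \le \big(e^{k|\bar\lambda| O(\bar\Gamma)} - 1\big)\,|u_j| \le |\bar\lambda|\, O(\bar\Gamma)\,\xi^k\, |u_j|. \]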

The claim follows because $u/(1+u)$ is bounded for $u \in D_\beta$.

Claim 5. $|a_N - G_0(\beta_{\mathrm{eff},N(x)}, x)| \le L^{-2N(x)}\,|\bar\lambda|\, O(\bar\Gamma^2)$.

Proof. By the definition of $u_j$ and $\beta_{\mathrm{eff},l} = L^{-2l}\hat\beta_l$,

\[ G_0(\beta_{\mathrm{eff},N(x)}, x) = \sum_l L^{-2l}\,\Gamma(u_l, L^{-l}x). \]

$a_N$ is the same expression but with $u_l$ replaced by $\beta_l$. Thus the claim is a consequence of Claim 4.

In our first paper we proved that the sequences $(\beta_j, \lambda_j)$ in Theorem 1.1 never exit the domain $D_\beta(1/2) \times D_\lambda$, so $M = \infty$. Recall from (4.3) that the Green’s function equals $b_{0,N}$. By Claims 3 and 5,

\[ |b_{0,N} - G_0(\beta_{\mathrm{eff},N(x)}, x)| \le L^{-2N(x)}\,|\bar\lambda|\, O(\bar\Gamma^2). \]

By Proposition I.1.4, part (3), the right-hand side is less than $|G_0(\beta_{\mathrm{eff},N(x)}, x)|$.

7. Estimates on a Renormalization Group Step

In this section we will prove Propositions 6.1 and 6.2. The main principle used here is that there is a part of the RG map $S\mu *$ that can be calculated explicitly by second order perturbation theory and there is a remainder r. In dealing with the remainder we need only look closely at the linearized action of the RG. The main results say that if r is small, then r remains small under the action of the RG map for as long as the parameter λ remains small. The main ideas take place in the proof of Proposition 7.9 and in rough outline are as follows. By expanding in powers of r,

\[ S\mu * (e^{-v} + r)^{G_1} = S\mu * (e^{-v})^{G_1} + S\mu * \sum_{x\in G_1} (e^{-v})^{G_1\setminus\{x\}}\, r_x + O(r^2). \tag{7.1} \]

The first term has an explicit integrand which we feed to second order perturbation theory,

\[ S\mu * (e^{-v})^{G_1} = e^{-\tilde v - \tilde\nu\tau} + r_{\mathrm{main}}. \]

We prove that $r_{\mathrm{main}}$ is the leading contribution to the remainder r after the RG transformation. Since it is the same calculation at every application of the RG map, we can build an induction as follows.

(a) Establish an $O(\lambda^3)$ estimate on $r_{\mathrm{main}}$ (Lemma 7.8).
(b) Inductive assumption: r is comparable in norm to $r_{\mathrm{main}}$.
(c) Inductive step: the second and third terms in (7.1) remain small compared with $r_{\mathrm{main}}$.

For example, consider the second term: by Lemma 7.3 we can find how large it is by

\[ S\mu * \sum_{x\in G_1} (e^{-v})^{G_1\setminus\{x\}}\, r_x \approx S \sum_{x\in G_1} (e^{-v})^{G_1\setminus\{x\}}\, r_x = L^4\,(S e^{-v})^{L^4-1}\, S r_x \]

because S makes all terms in the sum the same and there are $L^4$ elements in $G_1$. At this point the normalization of r, Definition 4.1, plays a key role: by Lemma 7.6 the action of S on $r_x$ reduces the norm of $r_x$ in size by a factor $L^{-6}$ which more than offsets the dangerous $L^4$ that arose from the sum over $x \in G_1$. Finally the $O(r^2)$ term looks frightening but is actually insignificant: Part (i) of Lemma 7.2 easily shows that it is minute compared with the second term because two or more factors of r are much smaller than one factor times any L dependent constant. This is because the maximum size $\bar\delta$ of λ is chosen after L is fixed.
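The power counting in this outline can be mimicked by a scalar toy recursion. The sketch below is an illustration only and not the norm estimates of the paper: it iterates r → s + c·L⁴·L⁻⁶·r, where s stands in for the size of the main remainder. The function name `iterate_remainder_bound` and the parameters `source`, `c` and `r0` are hypothetical names chosen for this example.

```python
def iterate_remainder_bound(L, source, steps, c=1.0, r0=0.0):
    """Iterate the scalar toy bound r -> source + c * L**4 * L**-6 * r.

    L**4 counts the sites of G1 and L**-6 is the gain from normalization,
    so the net factor is c * L**-2. All arguments are illustrative.
    """
    contraction = c * L**4 * L**-6
    r = r0
    history = [r]
    for _ in range(steps):
        r = source + contraction * r
        history.append(r)
    return contraction, history


contraction, history = iterate_remainder_bound(L=4, source=1e-3, steps=50)
limit = 1e-3 / (1.0 - contraction)  # fixed point of the toy recursion
print(contraction, history[-1], limit)
```

As long as the net factor is below one, the iterates increase towards the fixed point s/(1 − cL⁻²) and never exceed it, which is the scalar analogue of "r stays comparable to the main remainder".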

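The $\mu *$ in the second (and third) terms in (7.1) makes them lose the normalization so we need an implicit function theorem (Lemma 7.7) to restore it.

Let X be a subset of the lattice. A form $F_X$ is said to be localized in X if it is a form on $\mathbb{C}^X$. Thus $F_X = \sum_\alpha F_{X,\alpha}(\phi)\,\psi^\alpha$, where $F_{X,\alpha}$ is a smooth function on $\mathbb{C}^X$ and $\psi^\alpha = \prod_{x\in X} \psi_x^{\alpha_x}\bar\psi_x^{\bar\alpha_x}$. In particular, $g^X$ defined by $g^X = \prod_{x\in X} g_x$ is localized in X, when the forms $g_x$ are localized at single sites {x}.

Definition 7.1. Given $h \ge 0$, $w > 0$, $w^{-X} = \prod_{x\in X} w^{-1}(\phi_x)$, let

\[ \|F_X\|_{w,h} = \sum_{\alpha,\beta} \frac{h^{\alpha+\beta}}{\alpha!\,\beta!}\, \sup_\phi |\partial_\phi^\beta F_{X,\alpha}|\, w^{-X}, \qquad |F_X|_h = \sum_{\alpha,\beta} \frac{h^{\alpha+\beta}}{\alpha!\,\beta!}\, |\partial_\phi^\beta F_{X,\alpha}(0)|. \]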

These norms measure large- and small-field behaviors, respectively.

Complex Covariances and Weights. The weight w is a positive function used to track decay or growth of the interaction at φ = ∞. In this paper we use $w(A, \kappa) := A\exp(-\kappa|\phi|^2)$ with $A \ge 1$ and, for tracking decay, κ is positive. We want to see how decay is affected by convolution with Gaussian functions and forms. Let

\[ \rho_\alpha(\phi) := \det(\pi\alpha C)^{-1}\, e^{-\alpha^{-1}\sum_{x,y}\phi_x C^{-1}_{xy}\bar\phi_y}, \tag{7.2} \]

where $\alpha = \exp(i\theta)$ has unit modulus and positive real part and $C_{xy}$ is a positive-definite matrix with indices $x, y \in X$. By Gaussian integration, $|\rho_\alpha| * w(A, \kappa)^X$ is bounded by $w(A', \kappa')^X$, where $A' = 2A/\cos\theta$ and $\kappa' = \kappa/2$, if $\kappa\|C\|_{\mathrm{operator}}$ is small enough. Thus there is a new weight $w_{\alpha C}$ such that $w_{\alpha C}^X \ge |\rho_\alpha| * w^X$. Then, for any smooth function $f_X$, $|\rho_\alpha * f_X| \le (|\rho_\alpha| * w^X)\,\|w^{-X} f_X\|_\infty$. By applying this estimate also to derivatives of $f_X$,

\[ \|\rho_\alpha * f_X\|_{w_{\alpha C},h} \le \|f_X\|_{w,h}. \tag{7.3} \]

We use the notation $\partial_\psi^\alpha = \prod_x \partial_{\psi_x}^{\alpha_x}\partial_{\bar\psi_x}^{\bar\alpha_x}$, where $\partial_{\psi_x}$ is a formal anti-derivative with respect to $\psi_x$.

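Lemma 7.2. Properties of the norms:

(i) $\|g^X\|_{w,h} \le \|g\|^X_{w,h}$, where $\|g\|^X_{w,h} = \prod_{x\in X}\|g_x\|_{w,h}$,
(ii) $\|f_x g_x\|_{w,h} = \|f_x\|_{w_f,h}\,\|g_x\|_{w_g,h}$ when $w_f w_g = w$,
(iii) $\|S F_X\|_{\bar w, Sh} \le \|F_X\|_{w,h}$ where $\bar w \ge S w^{|G_1|}$ and $Sh := Lh$,
(iv) For $0 \le h' < h$, $\|\partial_\psi^\alpha \partial_\phi^\beta F_X\|_{w,h'} \le (\alpha+\beta)!\,(h - h')^{-\alpha-\beta}\,\|F_X\|_{w,h}$,
(v) $\|\mu_C * F_X\|_{w_{\alpha C},h} \le \exp\big(\sum_{x,y\in X} |C_{xy}|\,h^{-2}\big)\,\|F_X\|_{w,h}$.

(i) – (iv) are valid when $\|\cdot\|_{*,h}$ is replaced by $|\cdot|_h$.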

Proof. Properties (i) – (iv) need only be proved for the w, h norm because the | · |h norm is the limit as κ → −∞ of the · w,h norm, when w = w(1, κ). Parts (i) and (ii) are straightforward consequences of the Leibniz rule, see p.103 and Appendix A of [BEI92]. (iii) is easy. (iv) is proved on p.104 of [BEI92]. It reflects the fact that · w,h behaves like the supremum over a polystrip, so derivatives are bounded by powers of the distance to the edge of the strip. (v) is also proved starting on p.104, noting that (7.3) is (6.4) in [BEI92].

The results of this section are organized by increasing number of hypotheses. Each proposition assumes the hypotheses that have been given earlier. The important point is that each hypothesis is satisfied when: (a) L ≥ 2 is sufficiently large; (b) |λ| is sufficiently small depending on L; and (c) h = |λ|−1/4 . The constants c, cL , cq , . . . that appear in hypotheses and conclusions are numbers in (0, ∞). c∗ denotes a number in (0, ∞) whose value is permitted to depend on ∗, whereas c is a number determined independently of all parameters L, λ, β and others that appear in the theorems. These symbols are permitted to change value from one appearance to the next. Constants that always have the same value in all appearances will be denoted by a letter other than c. Hypothesis (µ). µ = µeiθ C , where Cxy is a positive semi-definite matrix such that Cxy = 0 if |x − y| > L, cos θ ≥ c1 and |Cx,y | ≤ c2 . The constants c1 , c2 in this hypothesis can be chosen arbitrarily. Once they are fixed, estimates below are uniform in the choice of µ. Constants appearing without qualification in hypotheses can be chosen arbitrarily. In the following lemma, = is the Laplacian defined in (5.5) with replaced by exp(iθ )C and C is the maximum of |Cxy |. Eq () is the power series n≤q n /n! for exp() truncated at order q. X is a subset of a G1 coset. Hypothesis(η). 0 ≤ η ≤ h. The parameter for the small-field norm |·|h specifies how large a “small field” can be. In most cases this is determined by the covariance C (so h ≈ C1/2 ) but occasionally it is useful to let it be determined by the interaction (so h = h ≈ λ−1/4 ). X for t ∈ [0, 1]. Lemma 7.3. (Sµ ≈ S): Let w¯ ≥ SwtC −2q Cq gX , (i) Sµ ∗ g X − SEq−1 ()g X w, ¯ S h ≤ cq,L h w,2h X X −2q q gX , q = 1, 2, . . . . (ii) |Sµ ∗ g − SEq−1 ()g |h ≤ cq,L h w(0)C ¯ w,h

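Proof. Let $\mu_t$ be µ with $e^{i\theta}C$ replaced by $t e^{i\theta}C$. Then

\[ S\mu * g^X - S E_{q-1}(\Delta)\, g^X = R_q, \quad\text{with}\quad R_q = \int_0^1 \frac{(1-t)^{q-1}}{(q-1)!}\; S\mu_t * \Delta^q g^X \, dt. \]

By parts (iii) – (v) of Lemma 7.2,

\[ \|R_q\|_{\bar w, Sh} \le c_{q,L}\, h^{-2q}\, \|C\|^q\, \|g^X\|_{w,2h}. \]

In part (ii), $|R_q|_h$ is bounded by $\bar w(0)\,\|R_q\|_{\bar w, Sh/2}$.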
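The remainder in this proof is the integral form of the Taylor remainder of the exponential. As a sanity check under a simplifying assumption, the sketch below replaces the Laplacian by a complex number x (an assumption made only for this illustration) and compares exp(x) − E_{q−1}(x) with the integral of (1−t)^{q−1}/(q−1)! · exp(tx) · x^q over [0, 1], evaluated by Simpson's rule. The function names are hypothetical.

```python
import cmath
import math


def truncated_exp(x, q):
    """E_q(x) = sum over n <= q of x**n / n!."""
    return sum(x**n / math.factorial(n) for n in range(q + 1))


def integral_remainder(x, q, panels=2000):
    """Composite Simpson approximation of
    int_0^1 (1 - t)**(q - 1) / (q - 1)! * exp(t * x) * x**q dt.
    `panels` must be even."""
    def integrand(t):
        return (1 - t) ** (q - 1) / math.factorial(q - 1) * cmath.exp(t * x) * x**q

    step = 1.0 / panels
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3


x = 0.7 * cmath.exp(0.3j)  # scalar stand-in for the complex covariance factor
for q in (1, 2, 5):
    exact = cmath.exp(x) - truncated_exp(x, q - 1)
    print(q, abs(exact - integral_remainder(x, q)))
```

The printed differences are at the level of the quadrature error, which is consistent with the identity used to define the remainder.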

Hypothesis (λ). λ lies in a sector |λ| ≤ c1 Re λ in the right half of the complex plane and |λ| ≤ c, where c is small.

The qualification “where c is small” means that each of the following results of this section may require a hypothesis |λ| ≤ c, with c determined in the proof. Hypothesis (h ≈ λ−1/4 ). |λ|h4 ∈ [c−1 , c]. The default choice is h = |λ|−1/4 but this assumption allows us to assume the same estimates for e.g., 2h. Hypothesis (v, ˆ w). |ν| ≤ c1 h−2 , vˆ = λτ 2 + ντ + γ O and w(φ) = e−a|φ/ h| with 4c1 ≤ a ≤ c2 . The default choice is a = 1 and ν ≤ 21 h−2 . The observable O is a polynomial as in (4.2) such that |γ b1 |h4 + |γ b2 |h4 ≤ c, where c is small.4 2

The hypothesis on the observable ensures that it is small relative to λτ 2 . In principle γ is infinitesimal because we only use the derivative with respect to γ at zero, but later we use a Cauchy estimate that needs a finite variation of γ . The (v, ˆ w) hypothesis implies that |v| ˆ h ≤ c. Referring to the paragraph Complex Covariances and Weights, we find that the weight w in the (v, ˆ w) hypothesis is w(A, κ) with A = 1 and κ = ah−2 . Thus κ ≈ |λ|1/2 is small, as required, and we can choose w¯ = Sw(A , κ )X in Lemma 7.3. Next, we can reduce to A = 1 in w(A , κ ) by putting a constant cL = A X in front of the norm. Finally, the rescaling S maps w(1, κ )X into w(1, L2 κ ), which brings the decay rate κ back to better than the original value. In particular we can choose w¯ = w 2 for L > 2. To summarize: for λ small, the rescaling more than restores the loss of decay caused by convolution so that the whole RG map will actually strengthen this parameter in the norm. Lemma 7.4 (Integrability of e−vˆ ). If M is a polynomial form with degree m, then Me−vˆ w,h ≤ cm (h/h)m |M|h, |f evˆ |h ≤ c|f |h. ˆ h and |v| ˆ h ≤ c|v| ˆ h ≤ c by Proof. For the second estimate we use |f evˆ |h ≤ |f |he|v| hypotheses. The first estimate is reduced to the case h = 1, λ = c by scaling φ = hφ . See the proof of (7.4) in [BEI92]. In effect, each φ or ψ in M is replaced by h.

Similarly we can bound the exp(−ντ ) in exp(−v) ˆ for ν either positive or negative, but we must then have a weight that allows growth at infinity as in Lemma 7.5 (Integrability of eντ ). e−ντ w−1 ,h ≤ c and | exp(−ντ )|h ≤ c. Lemma 7.6 (Interaction scales down). If r is normalized, then r exp(−v) ˆ w,h ≤ cL−6 rw,S h/2 and |r|h ≤ c[h /(h − h )]6 |r|h for h < h. More generally, if (cf. Definition 4.1), rt = O(t p ), then L−6 is replaced by L−p . Proof. By Taylor’s theorem in t applied to rt = r(tφ) it suffices to estimate the w, h norm of 0

1

(1 − t)5 d 6 rt e−vˆ dt, 5! dt 6

When |x| = L, the observable is γ φ0 φ¯ x . But as far as estimates are concerned, the behavior is the same as that of γ2 (φ0 φ¯ 0 + φx φ¯ x ). 4

which is bounded by c times terms of the form supt rt Me−vˆ w,h , where r (6) denotes a sixth φ, ψ derivative of r and M is a monomial in φ, dφ of degree 6. This is less than (6)

sup rt wt ,h Me−vˆ w/wt ,h = sup r (6) w,th Me−vˆ w/wt ,h (6)

t

t

≤ r (6) w,h Me−vˆ w,h , where wt = w(tφ), because the norm is decreasing in w and w/wt ≥ w. r (6) is bounded by c(Sh)−6 times the w, Sh/2 norm by Lemma 7.2 (iv) and Me−vˆ is bounded by h6 using Lemma 7.4. The second inequality is proved by the same steps with the h norms.

Lemma 7.7 (Extraction). The solution $(\nu', v', r')$ to

\[ e^{-\hat v} + \hat r = e^{-\nu'\tau}\,(e^{-v'} + r'), \qquad (v', r') \ \text{normalized} \tag{7.4} \]

satisfies

(i) $|\hat v - \nu'\tau - v'|_h \le c|\hat r|_h$,
(ii) $|\nu' - \nu|h^2 + |\lambda' - \lambda|h^4 \le c|\hat r|_h$ where $\gamma = 0$ in $\hat r$,
(iii) $\|r'\|_{w,h} \le c\|\hat r\|_{w^2,h}$,
(iv) $|r'|_h \le c|\hat r|_h$,

provided that for each estimate the right-hand side is less than a constant c which is small.

Proof of (i). Define $|\cdot|_h^{(p)}$ by truncating the sum in the definition of $|\cdot|_h$ so that only derivatives of order p or less are included. This seminorm satisfies $|F_1 F_2|_h^{(p)} \le |F_1|_h^{(p)}\,|F_2|_h^{(p)}$. Rewrite the $(\nu', v', r')$ equation,

\[ \hat v - \nu'\tau - v' = \ln\big(1 + e^{\hat v}\hat r - e^{\hat v - \nu'\tau}\, r'\big). \tag{7.5} \]

Expand the ln, take the $|\cdot|_h^{(p)}$ with $p = 4$:

\[ |\hat v - \nu'\tau - v'|_h^{(p)} \le \sum_j \frac{x^j}{j}, \qquad x = |e^{\hat v}\hat r|_h^{(p)} + |e^{\hat v - \nu'\tau}|_h^{(p)}\,|r'|_h^{(p)}. \]

It suffices to prove the estimate for $p = 4$ because $|\hat v - \nu'\tau - v'|_h^{(4)} = |\hat v - \nu'\tau - v'|_h$, whereas $|\hat r|_h^{(p)} \le |\hat r|_h$. $x = |e^{\hat v}\hat r|_h^{(p)}$ because the first five derivatives of $r'$ are zero by normalization. Also, $|e^{\hat v}\hat r|_h^{(p)} \le c|\hat r|_h$ as in the proof of Lemma 7.4. By hypothesis, the sum is dominated by the first term so that

\[ |\hat v - \nu'\tau - v'|_h^{(p)} \le c|\hat r|_h. \]

Proof of (ii). Bound the left-hand side by $c|\hat v - \nu'\tau - v'|_h^{(p)}$ and use part (i).

Proof of (iii). Let

r (α) = eν τ (e−vˆ + α rˆ ) − e−v (α)

(7.6)

be the solution when rˆ is replaced by α rˆ . For α in the disk α rˆ w2 ,h ≤ c1 , with c1 small, we show that r (α)w,h is bounded by c. First, exp(ν τ − v) ˆ and exp(−v ) satisfy a (v, ˆ w) hypothesis by (a) parts (ii) and (i) with h = h, and (b) |ˆr |h ≤ ˆr w2 ,h . Second, exp(ν τ )(α rˆ )w,h ≤ exp(ν τ )w−1 ,h α rˆ w2 ,h ≤ c, using Lemma 7.5. Since r (α) vanishes at α = 0 we have, by analyticity in α, r w,h ≤ cˆr w2 ,h . Proof of (iv). By taking the norm of (7.6),

ˆh |r (α)|h ≤ e|ν τ |h (e|v| + |α rˆ |h) + e|v |h ≤ c.

Therefore, as in part (iii), |r |h ≤ c|ˆr |h.

For the remainder of this section we fix vˆ = v+ ˜ ν˜ τ to be the second order perturbation theory calculation of the new interaction as in Proposition 5.1. Define rmain = constant and linear in γ part of SµC ∗ (e−v )G1 − e−vˆ .

(7.7)

The right-hand side is expanded in a series in powers of γ and truncated at order γ . For computing the observable, we only need the derivative in γ at zero so there is no loss in such truncation. We showed in Proposition 5.1 that rmain is O(λ3 ) + O(λ2 γ ) as a formal power series. We now extend this result to norm estimates on rmain . Lemma 7.8 (Remainder after perturbation theory is small in norm). Suppose that 1 2 y Cxy = 0 and |h| ≤ cL C . Then rmain w,h ≤ cL C2 |λ|,

|rmain |h ≤ cL C3 |λ|3 .

The main obstacle to proving this lemma is the “large field problem” described at the end of Sect. 5. Using the notation of Lemma 5.3, let Vˆt = −Vt + Qt with Q linearized q in γ and let Eq (A) = j =0 Aq /q! denote the exponential truncated at order q. We solve the large field problem by splitting Vˆt = J¯t + J t and using the approximation exp(J t )Eq (J¯t ) in place of exp(Vˆt ). J¯t will contain the polynomial of degree six which would ruin integrability if fully exponentiated. j Qt is a sum of terms of the form Px Cx,y Py + O(γ ), where Px is a polynomial in φx , ψx and conjugates. Let Qt be the same sum with Py replaced by Px . The degree 6 term that seems to be in Qt is actually zero because y Cxy = 0. We choose J t = −Vt + Qt . Then J¯t = Qt − Qt has the property that S J¯t = 0 and vˆ = S Vˆt=1 = −SJ t=1 .

(7.8)

Since λ is small we can assume the same (v, ˆ w) hypothesis for v and v. ˆ Note that J t contains the τ 2 terms. We keep them in a standard exponent because exp(−λτ 2 ) is useful for suppressing large values of φ, unlike Eq (−λτ 2 ).

Proof of Lemma 7.8. Equation (7.8) implies that at t = 1, S exp(J t )Eq (J¯t ) = e−vˆ . Recall from Sect. 5 the Duhamel formula 1 rmain = S dt µ(1−t) ∗ LeJ t Eq (J¯t ). (7.9) 0

By a calculation based on the differentiation rule DEq (J¯) = Eq−1 (J¯)D J¯ , 1 J ¯q 1 ˆ ↔ ˆ 1 ↔ J ¯ ¯ ˆ Le Eq (J ) = e Eq−1 (J ) LV − V V + e J LJ − J J 2 q! 2 ↔ 1 + (7.10) eJ J¯q−1 J¯ J¯. 2(q − 1)! J

We have omitted t subscripts. We choose q = 2. As in (5.7), ↔ 1 ↔ 1 1 ↔ LeJ E2 (J¯) = eJ E1 (J¯) Q V − Q Q + eJ J¯2 LJ − J J 2 2 2 1 J ¯¯↔ ¯ + e JJ J. (7.11) 2 ↔

Consider the term exp(J )E1 (J¯)Qt Vt in (7.11). Since E1 (J¯) = 1 + J¯ it splits into ↔ two terms, one of which is F := exp(J )Qt Vt . F is a sum of products of terms of the form (Me−v )G1 as in Lemma 7.4 because there are three sites in G1 where there are monomials corresponding to the three factors vx , vy , vz in F . If we consider the λτ 2 parts of v then we find that the total degree of the monomials is 3 × 4 − 4 = 8. The −4 ↔

↔

is because F contains two , one explicit and one in Qt . Associated with these two are Cx,y and Cy,z . There is also λ3 . Therefore, by Lemmas 7.2 – 7.4, (Me−v )G1 w,h is bounded by cL h8 |λ|3 C2 = cL |λ|C2 . The cL includes factors L4 = |G1 | from sums over x, y, z. ↔ The other term in exp(J )E1 (J¯)Qt Vt is J¯F . This is smaller by an additional factor cL h6 |λ|2 C2 by the same methods. In fact, all of the terms in (7.9,7.11) can be estimated by this procedure by cL |λ|C2 . The O(γ ) terms may be absorbed into these bounds because, according to the (v, ˆ w) hypothesis, |γ b| ≤ c|λ|. The | · |h norm is estimated in the same way but taking into account the hypothesis on h, (i) |λ|3 hp ≤ cL,p |λ|3 and (ii) if M is a monomial of order 2p then |M|h ≤ ch2p ≤ cL,p Cp . There are at least two factors of C in every term and for terms with only two at least one more is contributed by (ii).

Hypothesis (λ, r small depending on L). |λ|, rw,h and (h/h)4 |r|h ≤ cL , where cL is determined in the proof of the next theorem. √ Proposition 7.9. Let v be given by (4.2), h, h = |λ|−1/4 , |λ |−1/4 and cL C ≥ h ≥ √ h ≥ L C. Then the map from (v, r) to (v , r , ν ) defined by

Sµ ∗ (e−v + r)G1 = e−ν τ (e−v + r ), with v , r normalized, satisfies

(i) r w,h ≤ cL |λ|C2 + cL−2 rw,h , 6 (ii) |r |h ≤ cL |λ|3 C3 + cL−2 hh |r|h + cL C5 h−10 rw,h , 6 ˜ 2 + |λ − λ|h ˜ 4 ≤ cL |λ|3 C3 + cL−2 h |r|h + cL h−10 C5 rw,h . (iii) |β − β|h h Proof of (i) and (ii). Recalling that vˆ = v˜ + ν˜ τ is the second-order calculation of the ˆ rˆ . new interaction, let rˆ be determined by the equation Tβ (exp(−v)+r)G1 = exp(−v)+ Then rˆ should obey third-order estimates. After extraction, these will carry over to r . We may write (e−v )G1 \{x} rx + SEq−1 () (e−v )G1 \X r X rˆ = rmain + SEq−1 () x∈G1

+ (Tβ − SEq−1 ())

X⊂G1 ,|X|>1

(e

−v G1 \X X

)

r .

(7.12)

X⊂G1

First term in 7.12. By Lemma 7.8 rmain w2 ,h ≤ cL C2 |λ|). Second term in (7.12). We choose q = 1. Let X = G1 \ {x}. S(exp(−v))X equals ˆ w) hypothesis5 . By Lemma 7.6, Lemma 7.2 exp(−vnew ), where vnew still satisfies the (v, part (iii) and w2 ≥ SwG1 , −v X S(e ) Srx ≤ cL4 L−6 Srw2 ,S h ≤ cL−2 rw,h . (7.13) x∈G1

w2 ,2h

Third term in (7.12). By Lemmas 7.2 and 7.4, third termw2 ,2h ≤ cL r2w,h . Fourth term in (7.12). By Lemma 7.3, with w¯ = w 2 as explained below the (v, ˆ w) hypothesis, followed by Lemmas 7.2 and 7.4, fourth termw2 ,2h ≤ h−2 rw,h because there are less than cL terms in the sum over X and each has one or more factors of r and q = 1. Therefore ˆr w2 ,2h ≤ cL |λ|C2 + cL−2 rw,h , (7.14) √ where we used the hypotheses (r and h−2 = |λ| small depending on L) to put the estimates for the third and fourth terms inside cL−2 rw,h . To estimate the h norm, we use (7.12) with q = 5. Consider the second term in (7.12). Let (e−v )G1 \{x} rx . F = x∈G1

Note that Ft = O(t 6 ) is normalized, using the subscript t notation of Definition 4.1. However, (n F )t = O(t 6−2n ). By Lemma 7.6, with h replaced by Lh/2, and Lemma 7.2 part (iii), 6−2n n |Sn F |h ≤ c h /(Lh) | F |h/2, 5

because it equals λnew τ 2 + observable with λnew = λ(1 − L−4 ).

for n = 0, 1, 2. When n = 3, 4, 5 we still have this bound (by Lemma 7.2 (iii)) because h /(Lh) ≤ 1. By Lemma 7.4, | exp(−v)|h ≤ 1 + L−4 for |λ| < cL . Therefore, by Lemma 7.2, n 6−2n Ch−2 |r|h, |Sn F |h ≤ cL4 h /(Lh) √ and then, by h ≥ L C, |second term in (7.12)|h ≤ cL−2 [h /h]6 |r|h. The third term in (7.12) is absorbed into the same bound by using the same argument. There are merely more factors of r which make it much smaller by the hypothesis on |r|h. The fourth term in (7.12) is bounded by Lemma 7.3 so that |ˆr |h ≤ cL |λ|3 C3 + cL−2 (h /h)6 |r|h + cL C5 h−10 rw,h .

(7.15)

The proof of (i) and (ii) is completed by applying Lemma 7.7 to (7.14, 7.15). Proof of (iii). Immediate consequence of Lemma 7.7 and (7.14,7.15).

0 Proof of Proposition 6.1. Recall that the finite volume Green’s function G

λ0 (β0 , x) has been expressed as an integral, (4.3). This integral is convergent for all β0 ∈ C, provided the real part of λ0 is positive, which is the case for λ0 ∈ Dλ . In this case we can also define (βj , λj , rj ) by conglomerating the j RG steps into one giant step (for RG), with covariance equal to the sum of all the covariances of the individual RG steps, and applying it to the initial interaction. The big RG step is a convolution of a Gaussian into an interaction that decays faster than any Gaussian so it is convergent and the result defines (βj , λj , rj ). Furthermore, rj is entire in φ. The sequence (βj , λj , rj ) is defined for all j by the remark below (4.1). In order to obtain useful estimates on the sequence, we rotate the contours of integration of each φx (both real and imaginary parts). For any |θ| ≤ bβ +bλ /4+9/8−π/2 we may rotate φx → exp(iθ/2)φx and φ¯ x → exp(iθ/2)φ¯ x . Likewise ψx = (2π i)−1/2 dφx and ψ¯ x = (2π i)−1/2 d φ¯ x pick up factors exp(iθ/2). The effect of the rotation is to replace β, λ, A with

β(θ ) = eiθ β, λ(θ ) = e2iθ λ, A(θ ) = eiθ A.

Since the covariance $A^{-1}_{xy} = U(\beta, x - y)$ is determined by Γ through (3.4), the rotation of A is equivalent to an anti-rotation $\Gamma(\beta) \to \Gamma(\theta) = \Gamma(\beta)\exp(-i\theta)$. Recalling the discussion of $b_\beta$, $b_\lambda$ below (I.1.15), the above condition on θ is sufficient to ensure that $|\arg\lambda(\theta)| < \pi/2 - \varepsilon/4$ for any $|\arg\lambda| < b_\lambda + \varepsilon$, since by assumption,

\[ 2(b_\beta + \varepsilon) + \tfrac32 (b_\lambda + \varepsilon) < \tfrac{3\pi}{2}. \]
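For the reader's convenience, the angle count behind this claim can be written out. This is only a sketch: ε denotes the small positive angle in the definitions of the domains for β and λ, and we use $\lambda(\theta) = e^{2i\theta}\lambda$ together with the bound $|\theta| \le b_\beta + b_\lambda/4 + 9\varepsilon/8 - \pi/2$.

```latex
|\arg\lambda(\theta)| \le |\arg\lambda| + 2|\theta|
  < (b_\lambda + \varepsilon) + 2\Bigl(b_\beta + \tfrac14 b_\lambda + \tfrac98\varepsilon - \tfrac{\pi}{2}\Bigr)
  = 2 b_\beta + \tfrac32 b_\lambda + \tfrac{13}{4}\varepsilon - \pi
  < \Bigl(\tfrac{3\pi}{2} - \tfrac72\varepsilon\Bigr) + \tfrac{13}{4}\varepsilon - \pi
  = \tfrac{\pi}{2} - \tfrac{\varepsilon}{4}.
```

The last inequality is the assumption rewritten as $2b_\beta + \tfrac32 b_\lambda < \tfrac{3\pi}{2} - \tfrac72\varepsilon$.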

Hence the integrals remain rapidly convergent at ∞ and there is no contribution from the contours at ∞. We obtain the functional equation

\[ G_{\lambda,\Gamma}(\beta, x) = G_{\lambda(\theta),\Gamma(\theta)}(\beta(\theta), x)\, e^{i\theta}. \tag{7.16} \]

The sequence (βj , λj , rj ) transforms to (βj (θ ), λj (θ ), rj (θ )), where βj (θ ) = βj exp(iθ ), λj (θ ) = λj exp(2iθ ), and rj (θ, φ) = rj (φ exp(iθ )). It is evident that βj , λj transform in the same manner as β, λ because shifts in the coupling were defined in

Sect. 4 as derivatives of functions of φ, now φ exp(iθ/2) and each derivative introduces a factor exp(iθ/2). Define a decomposition $D_\beta(\tfrac12) = H^+ \cup H^-$ of the β domain into two overlapping fattened sectors. Here

\[ H^\pm = \{\beta \neq 0 : |\arg\beta \mp \theta| < \tfrac{\pi}{2} - \tfrac{\varepsilon}{8}\} + B(\tfrac12), \]

where θ = bβ + bλ /4 + 9/8 − π/2. Claim 1. Suppose that β0 is in H− and λ0 is in Dλ . Let hj = L (βj −1 )1/2 . For L sufficiently large, there is a constant kL such that 1

rj (θ )w,hj ≤ kL |λj −1 | (βj −1 ) 2

(7.17)

|rj (θ )|hj ≤ kL |λj −1 | (βj −1 ) , 3

3

for all j ≥ 1 such that (βj −1 , λj −1 ) lies in H− × Dλ . The analogous claim with H− replaced by H+ and rj (θ ) replaced by rj (−θ ) is also true. Proof by induction. Since λj −1 ∈ Dλ , we have as above that | arg λj −1 (θ )| < π/2−/8 and so the λ hypothesis is valid for λj (θ ). The covariance is

(θ ) = (βj −1 )e−iθ =

e−iθ e−i(θ+arg(1+βj −1 ))

(0),

(0) = 1 + βj −1 |1 + βj −1 |

and we verify the µ hypothesis. First we observe that (0) is√ positive semi-definite. Second, since bβ + < 3π/4, we have that |1 + βj −1 | > ( 2 − 1)/2. Third, our definition of H − guarantees that |θ + arg(1 + βj −1 )| < π/2 − /8. As a result, Proposition 7.9 is applicable to the rotated parameters. In each case we get bounds of the form (contraction by cL−2 ) + (small), which enables us to control the induction. By part (ii), |rj +1 (θ )|hj +1 ≤ 21 kL |λj |3 (βj )3 + 41 (hj +1 /hj )6 |rj (θ )|hj + cL (βj )5 h−10 rj (θ )w,hj . We made a j independent choice of L large to get the second term in this particular form. The first term has the constant kL because we choose kL in the inductive hypothesis to make it so. This bound proves the second inequality of Claim 1 for j = 1 because r0 = 0. In the right-hand side, substitute the inductive assumptions (7.17) to obtain |rj +1 (θ )|hj +1 ≤ kL 21 |λj |3 (βj )3 + 38 |λj −1 |3 (βj )3 , where cL (βj )5 h−10 rj (θ )w,hj went into the second term by a j independent choice of Dλ so that |λ| is small, making h large and consequently h−10 |λj |2 . By the λ j recursion 38 |λj −1 | ≤ 48 |λj |. Thus we have advanced j to j + 1 in the second estimate of Claim 1. The advance of j to j + 1 in the first estimate is proved similarly. But first observe that there exists an L-independent constant k such that (βj −1 ) 1 + βj 1 + βj ≤ k 2 L2 = = −2 (βj ) 1 + βj −1 1 + L βj + O(λj −1 )

for all βj ∈ Dβ (1/2). (The relation between βj −1 and βj follows from Proposition 7.9 (iii) and the inductive assumption (7.17)). Hence, by Proposition 7.9 (i) and (7.17), we have 1 rj +1 (θ )w,hj +1 ≤ kL (βj ) 2 21 |λj | + cL−2 kL|λj −1 | , so we may choose L > 4ck to complete the induction.

For later use we observe that if, in the above argument, we use (7.14, 7.15) in place of Proposition 7.9 (i) and (ii), we obtain Claim 1 . Equation (7.17) holds when rj +1 is replaced by rˆj . Claim 2. Let (βj , λj ) with j = 0, 1, . . . , M − 1 be an RG sequence in Dβ (1/2) × Dλ . If βj lies in H− \ H+ , then βj +1 is not in H+ . The idea is that the factor L2 in the β recursion causes a large expansion in β in a direction that takes βj further from H+ . There are other terms in the β recursion, but they are small by Claim 1. Proof of Claim 2. By the hypothesis on βj , Re βj (−θ − /8) ≤ − 41 . Therefore, Re L2 βj (−θ − /8) ≤ − 41 L2 , so no β within distance one of L2 βj can be in H+ . On the other hand, by Claim 1 and part (iii) of Proposition 7.9, |βj +1 (θ ) − β˜j +1 (θ )| 1. Therefore |βj +1 − L2 βj | 1. Consequently βj +1 is not in H+ .

By Claim 2, either the whole sequence (βj , λj ) with j = 0, 1, . . . , M − 1 lies in H− or it lies in H+ or in both. Therefore Proposition 6.1 follows from Claim 1 applied with either H+ or H− , Proposition 7.9 (iii), along with (5.3).
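As a numerical illustration of the flow controlled by Proposition 6.1, the following minimal sketch iterates the second-order part of the recursion (6.1) with the error terms dropped. It assumes the reading λ_{j+1} = λ_j − 8Bλ_j²/(1+β_j)² and β_{j+1} = L²[β_j + 2Bλ_j/(1+β_j)] with B = 1 − L⁻⁴. The function names `rg_step` and `rg_flow` and the starting values are hypothetical choices made for this example.

```python
def rg_step(beta, lam, L):
    """One step of the second-order recursion with the error terms dropped."""
    B = 1 - L**-4
    lam_next = lam - 8 * B * lam**2 / (1 + beta) ** 2
    beta_next = L**2 * (beta + 2 * B * lam / (1 + beta))
    return beta_next, lam_next


def rg_flow(beta0, lam0, L, steps):
    """Return the list [(beta_0, lambda_0), ..., (beta_steps, lambda_steps)]."""
    flow = [(beta0, lam0)]
    for _ in range(steps):
        flow.append(rg_step(*flow[-1], L))
    return flow


# Illustrative real starting point; complex values are accepted as well.
flow = rg_flow(beta0=0.0, lam0=0.01, L=2, steps=8)
for j, (beta, lam) in enumerate(flow):
    print(j, beta, lam)
```

For this real, non-critical starting point β grows by roughly a factor L² per step, while λ decreases and then stalls once 1 + β is large, which is the qualitative behaviour the estimates above are designed to track.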

Remark. The functional equation (7.16) provides an analytic continuation of Gλ into the larger domain |2 arg β − (3/2) arg λ| < 3π/2, | arg β| < π , | arg λ| < π , even though the defining integral fails to converge for | arg λ| > π/2. However, to obtain the end-to-end distance from Gλ , we require bβ > π/2 and so bλ < π/3. Proposition 6.2 involves derivatives with respect to the observable parameter γ at γ = 0. Derivatives at zero will be denoted by γ subscripts as in vγ and rγ . We will repeatedly use four principles: (1) distinguishing partial dependences on γ by subscripts on γ ; (2) the total γ derivative is the sum of the partial derivatives with respect to partial dependences; (3) Functions of γ may be replaced by linearizations; and (4) Cauchy estimates on derivatives. These are illustrated in detail in the proof of Corollary 7.10 and are subsequently used without comment. Corollary 7.10 (to Lemma 7.7).

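(i) $|\hat v_\gamma - v'_\gamma|_h \le c\,\big(|\hat r|_h |\hat v_\gamma|_h + |\hat r_\gamma|_h\big)$,

(ii) $|r'_\gamma|_h \le c\,\big(|\hat r_\gamma|_h + |\hat r|_h |\hat v_\gamma|_h\big)$,

(iii) $\|r'_\gamma\|_{w,h} \le c\,\big(\|\hat r_\gamma\|_{w^2,h} + \|\hat r\|_{w^2,h} |\hat v_\gamma|_h\big)$.

Proof of (i). Firstly, $\hat v - \nu'\tau - v'$ is a function of $\gamma_{\hat v}$ when $\hat v$ is replaced by $\hat v + \gamma_{\hat v}\hat v_\gamma$. It is h norm analytic and bounded by c uniformly in the disk $|\gamma_{\hat v}\hat v_\gamma|_h \le c$. By the Cauchy formula for the derivative at $\gamma = 0$ and Lemma 7.7, part (i), $|(\hat v - \nu'\tau - v')_{\gamma_{\hat v}}|_h \le c|\hat r|_h |\hat v_\gamma|_h$. Secondly, $\hat v - \nu'\tau - v'$ is a function of $\gamma_{\hat r}$ when $\hat r$ is replaced by $\hat r + \gamma_{\hat r}\hat r_\gamma$. It is norm analytic and, by Lemma 7.7, part (i), is bounded by c uniformly in the disk $|\gamma_{\hat r}\hat r_\gamma|_h \le |\hat r|_h$. By the Cauchy formula $|(\hat v - \nu'\tau - v')_{\gamma_{\hat r}}|_h \le c|\hat r_\gamma|_h$. Since the γ derivative of $\hat v - \nu'\tau - v'$ is bounded by the sum of these two estimates and since $\nu'$ is independent of γ, part (i) follows.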

Proof of (ii) and (iii). Apply the same argument to r as a function of γvˆ and γrˆ . Part (iii) uses uniform bounds by c on the disks γrˆ rˆγ w2 ,h ≤ c1 and |γvˆ vˆγ |h ≤ c1 .
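Principle (4), the Cauchy estimate, bounds a derivative at zero by the supremum over a disk divided by its radius. The following sketch illustrates this for an ordinary analytic function of one complex variable (the exponential, chosen here only as an example). It is not the norm-valued setting of the corollary, and the function names are hypothetical.

```python
import cmath
import math


def derivative_at_zero(f, radius, samples=64):
    """Approximate f'(0) by the Cauchy integral over |z| = radius.

    With z = radius * exp(i * angle), f'(0) is the average of f(z) / z
    over the circle; the trapezoid rule is used for the average.
    """
    points = [radius * cmath.exp(2j * math.pi * k / samples) for k in range(samples)]
    return sum(f(z) / z for z in points) / samples


def cauchy_bound(f, radius, samples=64):
    """Sampled version of sup|f| / radius, the Cauchy estimate for |f'(0)|."""
    points = [radius * cmath.exp(2j * math.pi * k / samples) for k in range(samples)]
    return max(abs(f(z)) for z in points) / radius


radius = 0.5
estimate = derivative_at_zero(cmath.exp, radius)
bound = cauchy_bound(cmath.exp, radius)
print(abs(estimate), bound)
```

The same mechanism is what turns a uniform bound on a disk in the auxiliary parameter γ into a bound on the γ derivative at zero: the smaller the admissible disk, the weaker the resulting estimate.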

Proof of Proposition 6.2. As in the proof of Theorem 6.1 all parameters are rotated by θ in the complex plane, but here we simplify notation by writing b, β, λ, r, in place of b(θ ), β(θ ), λ(θ ), r(θ ), (θ ). Claim 3. Define hj as before (above (7.17)). There exists cL such that 1

rj,γ w,hj ≤ cL (βj −1 ) 2 |bj −1 |

(7.18)

|rj,γ |hj ≤ cL |λj | (βj −1 ) |bj −1 |. 2

3

Proof. By (7.12), rˆ is a function of γrmain and γv and γr , and note that rˆγrmain = rmain ,γv . By Lemma 7.8, there are uniform bounds on the w2 , h and h norms of rmain in the disk |γrmain ||b| ≤ ch−4 . By the Cauchy estimate and (7.12), rmain ,γv w2 ,h ≤ cL C2 |b|,

|rmain ,γv |h ≤ cL C3 |λ|2 |b|.

We obtain the same estimates on the γv derivative of rˆ because (7.14, 7.15) are uniform on a disk |γv ||b| ≤ ch−4 . Consider the γr derivative of rˆ . The derivative of the second term in (7.12) is SEq−1 ()(e−v )G1 \{x} rx,γr , where there is no sum over x and x = 0 because rx,γr = 0 unless x = 0 and then rx,γr = rγr . The absence of the sum over x saves a factor of L4 when the argument leading to (7.13) is repeated so that S(e−v )G1 \{x} Srx,γr w2 ,2h ≤ cL−6 rγr w,h , and the same factor L4 is also saved in estimating the h norm. These terms are the dominant contribution to the γr derivative of rˆ : recall the w 2 , 2h estimate on the third term in (7.12) in the proof of Proposition 7.9. It is uniform on the disk γr rγr w,h ≤ rw,h , so that the γr derivative of this term is bounded by cL rw,h rγr w,h . This can be absorbed into a small change in the constant c in cL−6 rγr w,h because r is small by (7.17). The γr derivative of the fourth term can also be absorbed because it is down in norm by h−2q with q = 1. In summary, ˆrγ w2 ,2h ≤ cL C2 |b| + cL−6 rγ w,h |ˆrγ |h ≤ cL C |λ| |b| + cL 3

2

−6

(7.19) q −2q

(h /h) |rγ |h + cL C h 6

rγ w,h ,

where the second equation is obtained in the same way using the h norm and q = 5. By this estimate and parts (ii) and (iii) of Corollary 7.10, 1

rj +1,γ w,hj +1 ≤cL (βj ) 2 |bj | + cL−6 rj,γ w,hj , |rj +1,γ |hj +1 ≤cL (βj )3 |λj |2 |bj | + cL−6 (hj +1 /hj )6 |rj,γ |hj −2q

+ cL j q hj

rj,γ w,hj ,

where, in the first estimate, the term ˆr w2 ,h |vˆγ |h of Corollary 7.10 was absorbed by a change of constant into cL (β)1/2 |b|, using Claim 1 and |λ||vˆγ |h ≤ c|b|. Likewise, in the second estimate, the term |ˆr |h|vˆγ |h was absorbed into cL C3 |λ|2 |b|. Claim 3 (7.18) follows by induction (cf. the proof of (7.17)).

By substituting (7.18) into (7.19), we obtain Claim 3 . Equation (7.18) holds with r replaced by rˆ . By Corollary 7.10 (i) and Claims 1 and 3 , |vˆγ − vγ |hj ≤ cL j −1 3 |λj −1 |2 |bj −1 |. We change the indices j − 1 to j on the right by increasing cL . By the definition of ∗,j in Sect. 6, |vˆγ − vγ |hj dominates |0,j |, |1,j |h2j and |2,j |h4j . Solving the inequality for the ∗ concludes the proof of Proposition 6.2.

A. Convolution of Forms and Supersymmetry

Convolution $f, g \mapsto \int f(u-v)\,g(v)\,dv$ of functions has a natural extension to forms. Furthermore many associated facts such as the closure of Gaussian functions under convolution also have form analogues. This is because functions and forms both pull back under the map

$$u, v \mapsto u - v, \qquad \mathbb{C}^N \times \mathbb{C}^N \to \mathbb{C}^N. \tag{A.1}$$

To make this extension we first must define the integral of a form over a linear manifold of dimension less than the form. Thus, suppose $V$ is a complex linear subspace of $\mathbb{C}^m$ and $\omega$ is an integrable form on $\mathbb{C}^m$. Given any complementary subspace $V^\perp$ we define a form $\int_V \omega$ on $V^\perp$ by requiring that

$$\int_{V^\perp} \omega^\perp \int_V \omega = \int_{\mathbb{C}^m} \omega^\perp\, \omega$$

holds for all $\omega^\perp$ on $V^\perp$, where, on the right-hand side, $\omega^\perp$ is the form on $\mathbb{C}^m$ obtained by pulling back the projection from $\mathbb{C}^m \approx V^\perp \oplus V$ to $V^\perp$. Taking $V = \mathbb{C}^M$ as a subspace of $\mathbb{C}^{N+M}$ we have the following form analogue of a familiar Gaussian identity:

Proposition A.1. Let $A$ be a matrix on $\mathbb{C}^{N+M} = \mathbb{C}^N \times \mathbb{C}^M$ with positive real part. Define the quadratic form $S_A$ as in (2.1). Then

$$\int_{\mathbb{C}^M} e^{-S_A} = e^{-S_{A_u}},$$

where $A_u$ is the $N \times N$ matrix obtained by inverting $(A^{-1})_{ij}$ with indices restricted to $1 \le i, j \le N$.

The partial integration over complex linear subspaces has the following desirable property:

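Lemma A.2. If $\omega$ is a smooth, rapidly decaying form, then

$$Q \int_V \omega = \int_V Q\omega.$$

Proof.

$$\int_{V^\perp} \omega^\perp \int_V Q\omega = \int_{\mathbb{C}^m} \omega^\perp\, Q\omega.$$

Since $Q$ is a derivation and $\int_{\mathbb{C}^m} Q(\omega^\perp \omega) = 0$, the right-hand side is $-\int_{\mathbb{C}^m} (Q\omega^\perp)\,\omega$. Since $V$ and $V^\perp$ are complex linear, this is the same as

$$-\int_{V^\perp} (Q\omega^\perp) \int_V \omega = \int_{V^\perp} \omega^\perp\, Q \int_V \omega.$$

Therefore,

$$\int_{V^\perp} \omega^\perp \int_V Q\omega = \int_{V^\perp} \omega^\perp\, Q \int_V \omega.$$

This proves the claim because $\omega^\perp$ is arbitrary.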

Proof of Proposition A.1. Let $(u, v) \in \mathbb{C}^N \times \mathbb{C}^M$. $\exp(-S_A)$ is a Gaussian times a sum of constant forms. Since Gaussian functions remain Gaussian when variables are integrated out,

$$\int_{\mathbb{C}^M} e^{-S_A} = e^{-u A_u \bar u} \times \text{some constant form } \omega.$$

The covariance of a normalized Gaussian is always the inverse of the matrix in the exponent. This identifies $A_u$ because the covariance of the $u$ variables is not changed by integrating out $v$. By Lemma A.2, $Q(\exp(-u A_u \bar u)\,\omega) = 0$. Multiply both sides of this equation by $\exp(u A_u \bar u + du\, A_u\, d\bar u/(2\pi i))$ which is supersymmetric and therefore commutes past $Q$. Therefore,

$$Q\big(e^{du\, A_u\, d\bar u/(2\pi i)}\,\omega\big) = 0.$$

Since the only constant forms annihilated by $Q$ are constants,

$$\omega = \text{const}\; e^{-du\, A_u\, d\bar u/(2\pi i)}.$$

Therefore

$$\int_{\mathbb{C}^M} e^{-S_A} = \text{const}\; e^{-u A_u \bar u - du\, A_u\, d\bar u/(2\pi i)}.$$

The constant is one because the integral of both sides over $\mathbb{C}^N$ is one by Lemma 2.1.
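The matrix $A_u$ of Proposition A.1 can be checked numerically in a small case. The following is only an illustrative sketch, not part of the original argument: it uses a real symmetric positive definite $3 \times 3$ matrix chosen for convenience (with $N = 2$, $M = 1$), and the helper names are hypothetical. It confirms that inverting the upper-left block of $A^{-1}$ reproduces the Schur complement $A_{uu} - A_{uv} A_{vv}^{-1} A_{vu}$, which is the familiar result of integrating out $v$ in an ordinary Gaussian.

```python
from fractions import Fraction

def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def reduced_matrix(a, n_u):
    """A_u as in Proposition A.1: invert A, keep indices < n_u, invert again."""
    a_inv = inverse(a)
    return inverse([row[:n_u] for row in a_inv[:n_u]])

def schur_complement(a, n_u):
    """A_uu - A_uv A_vv^{-1} A_vu for the block split at index n_u."""
    a_uu = [row[:n_u] for row in a[:n_u]]
    a_uv = [row[n_u:] for row in a[:n_u]]
    a_vu = [row[:n_u] for row in a[n_u:]]
    a_vv = [row[n_u:] for row in a[n_u:]]
    corr = matmul(matmul(a_uv, inverse(a_vv)), a_vu)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(a_uu, corr)]

# Illustrative matrix (an assumption, not taken from the paper): N = 2, M = 1.
A = [[Fraction(x) for x in row] for row in [[4, 1, 1], [1, 3, 0], [1, 0, 2]]]
A_u = reduced_matrix(A, 2)
```

Exact rational arithmetic is used so that the two computations can be compared with `==` rather than up to rounding.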

Next comes the form analogue of the well known closure property of Gaussian convolutions. $S_A(u - v)$ denotes the pullback of $S_A$ by the map (A.1).

Corollary A.3.

$$\int e^{-S_A(u-v)}\, e^{-S_B(v)} = e^{-S_C(u)},$$

with $C^{-1} = A^{-1} + B^{-1}$. The integral is over $v \in \mathbb{C}^N$.

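Proof. Apply Proposition A.1 noting that the left-hand side is a Gaussian form $\exp(-S_{\tilde A})$ with

$$\tilde A = \begin{pmatrix} A & -A \\ -A & A + B \end{pmatrix}, \quad \text{whose inverse is} \quad \begin{pmatrix} A^{-1} + B^{-1} & B^{-1} \\ B^{-1} & B^{-1} \end{pmatrix}.$$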
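The block inverse quoted in this proof is easy to confirm in the scalar case $N = 1$. The sketch below is illustrative only: the values of `a` and `b` are arbitrary positive rationals standing in for $A$ and $B$, and the function names are hypothetical. It multiplies $\tilde A$ by the claimed inverse and then reads off $C$ from the $(u,u)$ entry.

```python
from fractions import Fraction

def conv_matrix(a, b):
    """Scalar version of A-tilde: the joint matrix in the variables (u, v)."""
    return [[a, -a], [-a, a + b]]

def claimed_inverse(a, b):
    """Scalar version of the inverse stated in the proof of Corollary A.3."""
    return [[1 / a + 1 / b, 1 / b], [1 / b, 1 / b]]

def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = Fraction(3), Fraction(5)   # arbitrary illustrative values
product = matmul2(conv_matrix(a, b), claimed_inverse(a, b))

# Integrating out v keeps the (u, u) entry of the inverse (Proposition A.1),
# so C is the reciprocal of that entry.
c = 1 / claimed_inverse(a, b)[0][0]
```

With these values `c` comes out as $ab/(a+b)$, the scalar form of $C^{-1} = A^{-1} + B^{-1}$.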

We conclude with a characterization of supersymmetric forms on $\mathbb{C}$.

Lemma A.4. (i) If $\omega$ is a smooth even supersymmetric form on $\mathbb{C}$ then $\omega = f(\tau)$ for some smooth function $f$. (ii) If $\omega$ is a smooth odd degree supersymmetric form on $\mathbb{C}$ then $\omega = f(\tau)(\phi\, d\bar\phi + \bar\phi\, d\phi)$ for some smooth function $f$.

Proof. (i) Any even form can be written as $\omega = a + b\, d\phi\, d\bar\phi$, where $a, b$ are functions of $\phi$. Then

$$Q\omega = a_\phi\, d\phi + a_{\bar\phi}\, d\bar\phi - (2\pi i)\, b\, \phi\, d\bar\phi - (2\pi i)\, b\, d\phi\, \bar\phi.$$

Therefore supersymmetry implies that the partial derivatives satisfy $a_\phi = 2\pi i\, \bar\phi\, b$, $a_{\bar\phi} = 2\pi i\, \phi\, b$ which implies there is $f$ such that $a = f(\phi\bar\phi)$ and $b = (2\pi i)^{-1} f'(\phi\bar\phi)$. Therefore $\omega = f(\tau)$. (ii) Away from $\phi = 0$ we can write $\omega = a\, \phi\, d\bar\phi + b\, \bar\phi\, d\phi$. $Q\omega = 0$ implies $a = b$ and then $Qa = 0$, so $a = a(\tau)$.

B. Dirichlet Boundary Conditions

$U^{\Lambda,\beta}(x) = G_{\lambda=0}(\beta, x)$ is the $\beta$ potential for the hierarchical Lévy process $\omega(t)$ killed on first exit. Recall that the hierarchical lattice is invariant under the map $L^{-1}$ which is the shift $x \mapsto x + G_1$, or equivalently, when $x = (\dots, x_2, x_1, x_0)$, $L^{-1}x = (\dots, x_3, x_2, x_1)$. This map induces a scaling $S$ that acts on Green's functions by $SU(x) = L^{-2} U(L^{-1}x)$ and on $\beta$ by $S\beta = L^2\beta$. We also define $\Lambda/L = \{Sx \mid x \in \Lambda\}$. The main result for this section is the scale decomposition

$$U^{\Lambda}(\beta, x) = S U^{\Lambda/L}(L^2\beta, x) + \Gamma(\beta, x),$$

where

$$\Gamma(\beta, x) = \frac{1}{\gamma + \beta}\Big(\mathbb{1}_{G_0} - \frac{1}{n}\,\mathbb{1}_{G_1}\Big),$$

and $\gamma = 1$ for the four dimensional hierarchical process. This is the only property of the hierarchical Lévy process that is used in the renormalization group analysis. The remainder of this section is provided for completeness but plays no further role in the paper. Recall from [BEI92] that, conditional on jumping, the probability of jumping from $x$ to $y$ is proportional to $|x - y|^{-6}$. This jump law was chosen because it makes the process scale in the same way as a random walk on a four dimensional simple cubic lattice, namely for $N = \infty$, we have $U^{\beta=0}(x) \propto |x|^{-2}$. The jump rate $r$ is chosen so that the constant of proportionality is one. Equation (3.4) is an immediate consequence of

Proposition B.1.

$$U^{\Lambda,\beta}(x) = \sum_{j=0}^{N-1} S^j\, \Gamma(S^j\beta, x) + (r + S^N\beta)^{-1} S^N \mathbb{1}_{G_0}(x).$$
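As a consistency check, the sum in Proposition B.1 can be evaluated exactly and compared with one step of the scale decomposition. This is a minimal sketch under stated assumptions, not the authors' computation: it takes $n = L^4$ and $\gamma = 1$, picks illustrative values for $L$ and $r$, and represents a point $x$ only by its level $k$ (meaning $|x| = L^k$, with $k = 0$ for $x = 0$), so that $x/L^j$ has level $\max(k - j, 0)$. All function names are hypothetical.

```python
from fractions import Fraction

L = 2                # illustrative lattice scale (assumption)
R = Fraction(1)      # illustrative jump rate r (assumption)
GAMMA = Fraction(1)  # gamma = 1 for the four dimensional process
N_BALL = L ** 4      # n = L^d with d = 4

def fluct(beta, k):
    """Gamma(beta, x) for a point x of level k."""
    in_g0 = 1 if k <= 0 else 0
    in_g1 = 1 if k <= 1 else 0
    return (in_g0 - Fraction(in_g1, N_BALL)) / (GAMMA + beta)

def potential(beta, k, N):
    """Right-hand side of Proposition B.1, with S^j U(x) = L^{-2j} U(x / L^j)."""
    total = Fraction(0)
    for j in range(N):
        scale = L ** (2 * j)
        total += Fraction(1, scale) * fluct(scale * beta, max(k - j, 0))
    in_g0 = 1 if k <= N else 0   # indicator of G_0 evaluated at x / L^N
    total += Fraction(in_g0, L ** (2 * N)) / (R + L ** (2 * N) * beta)
    return total

def one_step(beta, k, N):
    """S U^{Lambda/L}(L^2 beta, x) + Gamma(beta, x), with N - 1 scales left."""
    coarse = potential(L ** 2 * beta, max(k - 1, 0), N - 1)
    return Fraction(1, L ** 2) * coarse + fluct(beta, k)
```

For any level $k \le N$ the two functions agree, which is just the statement that iterating the scale decomposition $N$ times yields the sum in the proposition.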

Proof. We will prove a more general result by considering a unit ball $G_1$ with $n$ elements and $q(x - y) = c|x - y|^{-\alpha}$, $= 0$ if $y = x$. Note that

$$\int_{G_k} q(x)\, dx = 1 - n^k L^{-\alpha k}.$$

By repeating the proof of Lemma 2.2 on p. 89 of [BEI92], taking into account the killing, we find $E\langle \omega(t), \xi \rangle = \exp(-t\psi(\xi))$, with

$$\psi(\xi) = r - r \int_{G_N} q(x)\, \langle x, \xi \rangle\, dx.$$

Since $\int_{G} q\, dx = 1$,

$$\psi(\xi) = r \int_{G_N} q(x)\,[1 - \langle x, \xi \rangle]\, dx + r \int_{G \setminus G_N} q(x)\, dx.$$

Case. $\xi \in H_N$. Recall from [BEI92] that the dual ball $H_j$ is defined by: $\xi \in H_j$ if $\langle x, \xi \rangle = 1$ for every $x \in G_j$. Therefore, for $\xi \in H_N$,

$$\psi(\xi) = r \int_{G \setminus G_N} q(x)\, dx = r n^N L^{-\alpha N}.$$

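Case. $\xi \in H_j \setminus H_{j+1}$, with $j = 0, 1, \dots, N-1$. By the calculation starting at the bottom of p. 89 of [BEI92],

$$\int_{G_N} q(x)\,[1 - \langle x, \xi \rangle]\, dx = c\, n^{j+1} L^{-\alpha(j+1)} + c \sum_{k=j+2}^{N} L^{-\alpha k}(n^k - n^{k-1}) = c\, n^{j+1} L^{-\alpha(j+1)} + \int_{G_N \setminus G_{j+1}} q(x)\, dx.$$

Therefore

$$\psi(\xi) = r c\, n^{j+1} L^{-\alpha(j+1)} + r \int_{G \setminus G_{j+1}} q(x)\, dx = \gamma\, n^j L^{-\alpha j},$$

where $\gamma = r(1 + c)\, n L^{-\alpha}$ is the same as the $\gamma$ in [BEI92].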
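The two displayed identities for $\int_{G_k} q$ and for $\psi(\xi)$ can be verified with exact arithmetic. The sketch below is illustrative and rests on assumptions that are not spelled out in the text: the shell $G_i \setminus G_{i-1}$ is taken to contain $n^i - n^{i-1}$ points on which $q = cL^{-\alpha i}$, and $c$ is fixed by requiring total mass one. The sample parameters $L = 2$, $\alpha = 6$, $n = 16$, $r = 20/21$ and the function names are hypothetical choices.

```python
from fractions import Fraction

def normalizing_constant(L, alpha, n):
    """c such that q = c |x|^{-alpha} has total mass 1 over all shells i >= 1."""
    x = Fraction(n, L ** alpha)   # ratio n L^{-alpha}, assumed < 1
    # sum over i >= 1 of c (1 - 1/n) x^i = c (1 - 1/n) x / (1 - x) must equal 1
    return (1 - x) / ((1 - Fraction(1, n)) * x)

def ball_mass(L, alpha, n, k):
    """Integral of q over G_k, summed shell by shell."""
    c = normalizing_constant(L, alpha, n)
    shells = (c * Fraction(n ** i - n ** (i - 1), L ** (alpha * i))
              for i in range(1, k + 1))
    return sum(shells, Fraction(0))

def psi(r, L, alpha, n, j):
    """psi(xi) for xi in H_j minus H_{j+1}, assembled as in the display above."""
    c = normalizing_constant(L, alpha, n)
    x = Fraction(n, L ** alpha)
    outside = 1 - ball_mass(L, alpha, n, j + 1)   # mass of q outside G_{j+1}
    return r * c * x ** (j + 1) + r * outside

L, alpha, n, r = 2, 6, 16, Fraction(20, 21)
c = normalizing_constant(L, alpha, n)
gamma = r * (1 + c) * Fraction(n, L ** alpha)
```

With these sample values `gamma` evaluates to 1, matching the value quoted for the four dimensional process.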

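Therefore, using $\mathbb{1}_{H_j} - \mathbb{1}_{H_{j+1}}$ to isolate these cases:

$$(\beta + \psi(\xi))^{-1} = \sum_{j=0}^{N-1} (\beta + \gamma\, n^j L^{-\alpha j})^{-1}\big(\mathbb{1}_{H_j} - \mathbb{1}_{H_{j+1}}\big) + (\beta + r n^N L^{-\alpha N})^{-1}\, \mathbb{1}_{H_N}.$$

Then we invert the Fourier transform using Lemma 2.1 in [BEI92],

$$U^{\Lambda,\beta} = \sum_{j=0}^{N-1} (\beta + \gamma\, n^j L^{-\alpha j})^{-1}\, n^{-j}\Big(\mathbb{1}_{G_j} - \frac{1}{n}\,\mathbb{1}_{G_{j+1}}\Big) + (\beta + r n^N L^{-\alpha N})^{-1}\, n^{-N}\, \mathbb{1}_{G_N}.$$

The proposition now follows by choosing $\alpha, n$ so that $n = L^{\alpha - 2}$, which gives

$$U^{\Lambda,\beta}(x) = \sum_{j=0}^{N-1} n^{-j} L^{2j}\, \Gamma\Big(\beta L^{2j}, \frac{x}{L^j}\Big) + n^{-N} L^{2N} (r + \beta L^{2N})^{-1}\, \mathbb{1}_{G_0}\Big(\frac{x}{L^N}\Big),$$

and then setting $\alpha = 6$.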

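Remark. Choosing $\alpha, n$ so that $n = L^{\alpha - 2}$ and $n = L^d$ gives the canonical scaling factor $n^{-j} L^{2j} = L^{-(d-2)j}$ so that the infinite volume potential ($N \to \infty$) with $\beta = 0$ is

$$U^0(x) = \sum_{j=0}^{\infty} L^{-(d-2)j}\, \Gamma\Big(0, \frac{x}{L^j}\Big) = \rho\, |x|^{-(d-2)}$$

for $x \ne 0$ and $d > 2$. We choose $r$ and thereby $\gamma = r(1 + c)\, n L^{-\alpha}$ so that $\rho = 1$. This requires, as in [BEI92],

$$\gamma = \frac{1 - L^{-2}}{1 - L^{4-\alpha}}, \qquad r = \frac{1 - L^{-2}}{1 - L^{4-\alpha}} \cdot \frac{1 - L^{2-\alpha}}{1 - L^{-\alpha}}.$$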
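The claim $\rho = 1$ can be checked by summing the series in the Remark exactly. The following sketch is illustrative and uses hypothetical function names. It assumes, as in Appendix B, that $\Gamma(0, x) = \gamma^{-1}(\mathbb{1}_{G_0} - n^{-1}\mathbb{1}_{G_1})$ and that $|x| = L^k$ with $k \ge 1$, so that $\Gamma(0, x/L^j)$ equals $-1/(n\gamma)$ for $j = k - 1$, equals $(1 - 1/n)/\gamma$ for $j \ge k$, and vanishes otherwise. The geometric tail is summed in closed form.

```python
from fractions import Fraction

def gamma_and_rate(L, alpha):
    """gamma and r from the Remark, as exact fractions (requires alpha > 4)."""
    gamma = (1 - Fraction(1, L ** 2)) / (1 - Fraction(L ** 4, L ** alpha))
    r = gamma * (1 - Fraction(L ** 2, L ** alpha)) / (1 - Fraction(1, L ** alpha))
    return gamma, r

def potential_at_level(L, alpha, k):
    """U^0(x) for |x| = L^k with k >= 1, using n = L^{alpha - 2} = L^d."""
    n = L ** (alpha - 2)
    s = Fraction(L ** 4, L ** alpha)   # scaling factor L^{-(d-2)}, d = alpha - 2
    gamma, _ = gamma_and_rate(L, alpha)
    boundary = -s ** (k - 1) / n                        # the single term j = k - 1
    tail = (1 - Fraction(1, n)) * s ** k / (1 - s)      # sum over j >= k
    return (boundary + tail) / gamma
```

For $L = 2$, $\alpha = 6$ this gives $\gamma = 1$, $r = 20/21$ and $U^0(x) = L^{-2k} = |x|^{-2}$ at every level $k \ge 1$.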

C. Calculations for Proposition 5.1

To classify the different algebraic expressions that result from carrying out the derivatives in the formula in Lemma 5.2 we introduce the Feynman diagram notation. The diagrams

[diagrams not reproduced]

all represent $\tau_x$. The incoming vector symbolizes the $\phi_x$ and the $\psi_x$ and the outgoing vector symbolizes the $\bar\phi_x$ and $\bar\psi_x$. The vectors are called legs. The common vertex signifies the single sum over $x$ and the sum over a term $\phi_x \bar\phi_x$ and a term $\psi_x \bar\psi_x$, whereas two vertices close together as in the two diagrams

[diagrams not reproduced]

represent $\tau_x^2$, in which there is a single sum over $x$ but each $\tau_x$ is a sum of two contributions $\phi_x \bar\phi_x$ and $\psi_x \bar\psi_x$. The action of the Laplacian on $\tau_x^2$ is symbolized by joining an outgoing leg to an incoming leg as in

[diagrams not reproduced] (C.1)

The second two diagrams each contain a closed loop. This closed loop is a factor of $\Delta_\Gamma \tau = 0$ in the algebraic expression represented by the diagram: the anticommuting $\psi$ derivatives in $\Delta_\Gamma$ give a contribution that exactly cancels the contribution from the $\phi$ derivatives. Closed loops are always a factor which is the sum of two canceling contributions so diagrams containing closed loops will not be exhibited. We classify all the terms that arise by applying $\Delta_\Gamma^k$ by drawing all possible diagrams with $k$ pairs of consistently oriented legs joined. A joined pair of legs is called a line. Each line is associated to a factor $\Gamma(x, y)$ coming from $\Delta_\Gamma$. Consistently, a line joining legs at the same vertex carries the factor $\Gamma(x - x)$ from $\Delta_\Gamma$. Set $\gamma = 0$ so that there is no observable and consider

$$V = \sum_{y \in x + G_1} v_y.$$

By (C.1), $\Delta_\Gamma V$ involves two non-vanishing diagrams, but since both diagrams represent the same algebraic expression, $\lambda \Gamma(0) \tau_x$, we write one diagram with the combinatoric coefficient 2,

$V_1 := e^{\Delta_\Gamma} V$ = [diagram not reproduced] + 2 [diagram not reproduced].

The graphical representation for $\frac{1}{2} V_1 \overleftrightarrow{\Delta}_\Gamma V_1$ is

4 [diagram not reproduced] + 8 [diagram not reproduced] + 4 [diagram not reproduced].

First diagram: any of 4 legs in left-hand vertex pairs with either of 2 legs in right-hand vertex and prefactor $\frac{1}{2}$. Second diagram: any of four legs pairs with one leg, prefactors $\frac{1}{2} \cdot 2$ and a factor 2 because we can interchange the 4 leg vertex with the 2 leg vertex. Third diagram: either of 2 legs can pair with one leg, prefactors $\frac{1}{2} \cdot 4$.

The graphical representation for $\frac{1}{2} \frac{1}{2!} V_1 \overleftrightarrow{\Delta}_\Gamma^{2} V_1$ is

2 [diagram not reproduced] + 2 [diagram not reproduced] + 4 [diagram not reproduced] + 4 [diagram not reproduced].

First diagram: there is already a factor of 4 in the first diagram in $\frac{1}{2} V_1 \overleftrightarrow{\Delta}_\Gamma V_1$, there is one way to add an additional line to obtain this topology, prefactor $\frac{1}{2!}$. Second diagram: same. Third diagram: there is a factor of 4 in the second diagram in $\frac{1}{2} V_1 \overleftrightarrow{\Delta}_\Gamma V_1$, there are two ways to add an additional line to obtain this topology, prefactor $\frac{1}{2!}$. Fourth diagram: there is a factor of 8 in the third diagram in $\frac{1}{2} V_1 \overleftrightarrow{\Delta}_\Gamma V_1$, there is one way to add an additional line to obtain this topology, prefactor $\frac{1}{2!}$.

The graphical representation for $\frac{1}{2} \frac{1}{3!} V_1 \overleftrightarrow{\Delta}_\Gamma^{3} V_1$ is

4 [diagram not reproduced].

There is 2 in the first diagram in $\frac{1}{2} V_1 (\overleftrightarrow{\Delta}_\Gamma^{2}/2) V_1$, there are 2 ways to add one more line. There is no way to add an additional line to the second diagram in $\frac{1}{2} V_1 (\overleftrightarrow{\Delta}_\Gamma^{2}/2) V_1$. There is 4 in the third diagram in $\frac{1}{2} V_1 (\overleftrightarrow{\Delta}_\Gamma^{2}/2) V_1$, there are 2 ways to add one more line. The total is 12, there is prefactor $\frac{1}{3}$ in $\frac{1}{3!}$.

For $p > 3$, $V_1 \overleftrightarrow{\Delta}_\Gamma^{p} V_1 = 0$, therefore

$$S\, \tfrac{1}{2} V_1 \overleftrightarrow{\Delta}_\Gamma V_1 = L^{-2} B_1 \lambda^2 \tau_x^3 = 0 \quad \text{because } B_1 = 0,$$

$$S\, \tfrac{1}{2} \tfrac{1}{2!} V_1 \overleftrightarrow{\Delta}_\Gamma^{2} V_1 = 8 B_2 \lambda^2 \tau_x^2 + 4 B_2 B_0 L^2 \lambda^2 \tau_x,$$

$$S\, \tfrac{1}{2} \tfrac{1}{3!} V_1 \overleftrightarrow{\Delta}_\Gamma^{3} V_1 = 4 B_3 L^2 \lambda^2 \tau_x,$$

so that

$$\tilde\lambda = \lambda - 8 B_2 \lambda^2, \qquad \tilde\beta = L^2 \beta + 2 B_0 L^2 \lambda - 4 L^2 B_2 B_0 \lambda^2 - 4 L^2 B_3 \lambda^2.$$

There are three cases to consider for the observable.

Case (1) there have been fewer than $N(x) - 1$ iterations so that the observable is $-\gamma\, b_{1,j}\, \phi_0 \bar\phi_{x_j}$ with $|x_j| > L$. Here we know by Lemma 3.5 that $b_{i,j} = L^{-2j}$ and $b_{1,j} = b_{2,j} = 0$ and there is no need to calculate anything.

Case (2) $j \ge N(x)$. For this case $v_0$ contains the additional terms $-\gamma(b_0 + b_1 \phi_0 \bar\phi_0 + b_2 \tau_0 \phi_0 \bar\phi_0 + b_3 \tau_0)$. Therefore in $S V_1$ we have the additional terms

$$-b_0 - b_1 L^{-2} \phi\bar\phi - b_2 L^{-4} \tau \phi\bar\phi - \Gamma(0) b_1 - O(\Gamma^2) b_2 - 2\Gamma(0) b_2 \phi\bar\phi,$$

where we have omitted 0 and $j$ subscripts. Terms of the form $b\tau$ have been omitted for reasons explained at the end of Sect. 4. In the second order part $S Q_1$ we have additional terms

$$O(\Gamma^3) \lambda b_1 + O(\Gamma^4) \lambda b_2 + O(\Gamma^2) \lambda b_1 + O(\Gamma^3) \lambda b_2 L^{-2} \phi\bar\phi + O(\Gamma^2) \lambda b_2 L^{-4} \phi\bar\phi\tau.$$

There is no $O(\Gamma) \lambda b_1$ contribution to $b_2$ because $B_1 = 0$.

Case (3) $j = N(x) - 1$. This is almost the same as Case (2), but with $b_{2,j}$ vanishing.

References

[AB84] Atiyah, M.F., Bott, R.: The moment map and equivariant cohomology. Topology 23, 1–28 (1984)
[BEI92] Brydges, D.C., Evans, S., Imbrie, J.Z.: Self-avoiding walk on a hierarchical lattice in four dimensions. Ann. Probab. 20, 82–124 (1992)
[BI02] Brydges, D.C., Imbrie, J.Z.: End-to-end distance from the Green’s function for a hierarchical self-avoiding walk in four dimensions. Commun. Math. Phys.; DOI 10.1007/s00220-003-0885-6
[BMM91] Brydges, D.C., Munoz-Maya, I.: An application of Berezin integration to large deviations. J. Theor. Probab. 4, 371–389 (1991)
[LJ87] Le Jan, Y.: Temps local et superchamp. Séminaire de Probabilités XXI. In: Lecture Notes in Mathematics, Vol. 1247. Berlin, New York: Springer-Verlag, 1987
[Lut83] Luttinger, J.M.: The asymptotic evaluation of a class of path integrals. II. J. Math. Phys. 24, 2070–2073 (1983)
[McK80] McKane, A.J.: Reformulation of n → 0 models using anticommuting scalar fields. Phys. Lett. A 76, 22–24 (1980)
[PS79] Parisi, G., Sourlas, N.: Random magnetic fields, supersymmetry, and negative dimensions. Phys. Rev. Lett. 43, 744–745 (1979)
[PS80] Parisi, G., Sourlas, N.: Self-avoiding walk and supersymmetry. J. Physique Lett. 41, 1403–1406 (1980)
[Sim79] Simon, B.: Functional integration and quantum physics. New York: Academic Press, 1979
[Wit92] Witten, E.: Two-dimensional gauge theories revisited. J. Geom. Phys. 9, 303–368 (1992) http://arXiv:hep-th/9204083

Communicated by M. Aizenman