Communications in Mathematical Physics - Volume 307

Commun. Math. Phys. 307, 1–16 (2011)
Digital Object Identifier (DOI) 10.1007/s00220-011-1298-6

Author: M. Aizenman (Chief Editor)



On Majorana Representations of $A_6$ and $A_7$

A. A. Ivanov

Department of Mathematics, Imperial College of Science, Technology and Medicine, 180 Queens Gate, London SW7 2AZ, UK. E-mail: [email protected]

Received: 6 October 2010 / Accepted: 18 March 2011
Published online: 10 August 2011 – © Springer-Verlag 2011

“What is our life? A game!” (A.S. Pushkin, “The Queen of Spades”)

Abstract: The Majorana representations of groups were introduced in Ivanov (The Monster Group and Majorana Involutions, 2009) by axiomatising some properties of the 2A-axial vectors of the 196 884-dimensional Monster algebra, inspired by the sensational classification of such representations for the dihedral groups achieved by Sakuma (Int Math Res Notes, 2007). This classification took place in the heart of the theory of Vertex Operator Algebras and expanded earlier results by Miyamoto (J Alg 268:653–671, 2003). Every subgroup $G$ of the Monster which is generated by its intersection with the conjugacy class of 2A-involutions possesses the (possibly unfaithful) Majorana representation obtained by restricting to $G$ the action of the Monster on its algebra. This representation of $G$ is said to be based on an embedding of $G$ in the Monster. So far the Majorana representations have been classified for the groups $G$ isomorphic to the symmetric group $S_4$ of degree 4 (Ivanov et al. in J Alg 324:2432–2463, 2010), the alternating group $A_5$ of degree 5 (Ivanov AA, Seress Á in Majorana Representations of $A_5$, 2010), and the general linear group $GL_3(2)$ in dimension 3 over the field of two elements (Ivanov AA, Shpectorov S in Majorana Representations of $L_3(2)$, 2010). All these representations are based on embeddings in the Monster of either the group $G$ itself or of its direct product with a cyclic group of order 2. The dimensions and shapes of these representations are given in the following table:

Shape S4 A5 L3(2)

(2 A, 3A) 13 26 49

(2 A, 3C) 9 20 21

(2B, 3A) 13 46 –

(2B, 3C) 6 21 –

In the present note the classification is expanded to the groups A6 and A7 (subject to invariance of the 3A-axial vectors and the absence of 3C-subalgebras).


1. Introduction

A tuple $R = (G, T, V, (\ ,\ ), \cdot\,, \varphi, \psi)$ is said to be a Majorana representation (of $G$ or, rather, of $G/C_G(T)$)¹ if the following conditions hold: $G$ is a finite group; $T$ is a set of involutions (elements of order 2) in $G$ which generates $G$ and is stable under conjugation by elements of $G$ (this means that $T$ is a generating union of conjugacy classes of involutions in $G$); $V$ is a real vector space equipped with an inner product $(\ ,\ )$ and with a commutative non-associative algebra product $\cdot$, which associate with each other (that is, $(u \cdot v, w) = (u, v \cdot w)$ for all $u, v, w \in V$) and satisfy the Norton inequality

$(u \cdot u, v \cdot v) \ge (u \cdot v, u \cdot v)$ for all $u, v \in V$;

$\varphi : G \to GL(V)$ is a faithful homomorphism, whose image preserves $(\ ,\ )$ and $\cdot$; and $\psi$ is a rule which injectively assigns to every $t \in T$ a vector $a_t = \psi(t)$ which is a Majorana axis in $(V, (\ ,\ ), \cdot)$ (cf. the definition in the next paragraph) and such that the Majorana involution associated with $a_t$ coincides with $\varphi(t)$. It is further assumed that $V$ is generated by the elements $a_t$ taken for all $t \in T$, and $\dim(V)$ is said to be the dimension of $R$. It is assumed that $\varphi$ and $\psi$ commute in the sense that $a_{g^{-1}tg} = (a_t)^{\varphi(g)}$ for every $g \in G$. For a subset $X$ of vectors in $V$ we denote the linear span of $X$ in $V$ and the algebra closure of $X$ in $(V, \cdot)$ by $\langle X \rangle$ and $\langle\langle X \rangle\rangle$, respectively. Thus $\langle\langle X \rangle\rangle$ is the smallest subspace $Y$ in $V$ containing $X$ such that $y_1 \cdot y_2 \in Y$ whenever $y_1, y_2 \in Y$.

By the definition in [Iv09] a Majorana axis $a$ in $(V, (\ ,\ ), \cdot)$ is an idempotent of length 1, whose adjoint operator $\mathrm{ad}_a : v \mapsto a \cdot v$ is semi-simple with spectrum $\{1, 0, \frac{1}{4}, \frac{1}{32}\}$ (when talking about eigenvectors of $a$ one really means eigenvectors of $\mathrm{ad}_a$). The following conditions concerning the eigenspaces are imposed. The 1-eigenvectors of $a$ are precisely the scalar multiples of $a$. The Majorana involution $\tau(a)$ associated with $a$ is the linear transformation of $V$ which negates every $\frac{1}{32}$-eigenvector and centralizes the other eigenvectors.

By the above definition the Majorana involution is an automorphism of $(V, (\ ,\ ), \cdot)$. This condition is equivalent to the fusion rules involving $\frac{1}{32}$-eigenvectors: if $v$ and $u$ are $\frac{1}{32}$-eigenvectors of $a$, and if $x$ and $y$ are $\lambda$- and $\mu$-eigenvectors for $\lambda, \mu \in \{1, 0, \frac{1}{4}\}$, then $v \cdot x$ is a $\frac{1}{32}$-eigenvector, and both $v \cdot u$ and $x \cdot y$ project to zero in the $\frac{1}{32}$-eigenspace. The remaining fusion rules are as follows: if $\alpha_1$ and $\alpha_2$ are 0-eigenvectors of $a$, and $\beta_1$ and $\beta_2$ are $\frac{1}{4}$-eigenvectors of $a$, then

$\alpha_1 \cdot \alpha_2$, $\beta_1 \cdot \beta_2 - (\beta_1 \cdot \beta_2, a)\,a$ and $\alpha_1 \cdot \beta_1$

are $\lambda$-eigenvectors of $a$, where $\lambda = 0$, $0$ and $\frac{1}{4}$, respectively.

Footnote 1: It was rightly pointed out by the referee that the centre of $G$ acts trivially on $V$.


Sakuma’s theorem [Sak07], together with Norton’s explicit description of the subalgebras in the Monster algebra generated by pairs of transposition axes [N96], implies the following proposition, which will now be installed in the axiomatics of the Majorana theory.

Proposition 1 [N96, Sak07]. Let $R = (G, T, V, (\ ,\ ), \cdot\,, \varphi, \psi)$ be a Majorana representation. For two distinct involutions $t_0$ and $t_1$ in $T$ put $a_0 = \psi(t_0)$, $a_1 = \psi(t_1)$, $\tau_0 = \varphi(t_0)$, $\tau_1 = \varphi(t_1)$, $\rho = t_0 t_1$. Let $D$ be the dihedral subgroup in $GL(V)$ generated by $\tau_0$ and $\tau_1$, and let $|D| = 2N$. Then the subalgebra $Y = \langle\langle a_0, a_1 \rangle\rangle$ is isomorphic to one of the eight Norton–Sakuma algebras in Table 1 (more specifically to an algebra of type $NX$, where $N$ is as above and $X \in \{A, B, C\}$). For an integer $i$ and $\varepsilon \in \{0, 1\}$ the vector $a_{2i+\varepsilon}$ is the image of $a_\varepsilon$ under the $i$-th power of $\rho$ (so that $\tau(a_{2i+\varepsilon}) = \rho^{-i} \tau_\varepsilon \rho^{i}$), and the remaining vectors in the basis of $Y$, given in the second column in Table 1, are centralized by $D$. The kernel of the action of $D$ on $Y$ coincides with the centre of $D$.

For the case of algebras generated by two Majorana axes it was shown in [IPSS10] (cf. the beginning of Sect. 2) that the Norton inequality is a consequence of the other Majorana axioms.

The shape of a Majorana representation $R$ is a rule which specifies the type of the Norton–Sakuma subalgebra $\langle\langle \psi(t_0), \psi(t_1) \rangle\rangle$ for every pair $t_0, t_1$ of involutions in $T$. The rule must be stable under conjugation by the elements of $G$ and must respect the embeddings of the algebras:

2A → 4B, 2A → 6A, 2B → 4A, 3A → 6A.

The 2A-subalgebras in the Monster possess an important property which we axiomatise:

(2A) The following conditions hold (where $t_0, t_1, t_2 \in T$ and $a_i = \psi(t_i)$ for $0 \le i \le 2$):
(a) if $t_0 t_1 t_2 = 1$, then $\langle\langle a_0, a_1 \rangle\rangle$ is of type 2A and $a_2 = a_\rho := a_0 + a_1 - 8\,a_0 \cdot a_1$;
(b) if $\langle\langle a_0, a_1 \rangle\rangle$ is of type 2A, 4B or 6A, then $t_0 t_1$, $(t_0 t_1)^2$ or $(t_0 t_1)^3$ belongs to $T$, and $\psi(t_0 t_1)$, $\psi((t_0 t_1)^2)$ or $\psi((t_0 t_1)^3)$ coincides with $a_\rho$, $a_{\rho^2}$ or $a_{\rho^3}$.

The vectors $u_\rho$, $v_\rho$ and $\pm w_\rho$ in the 3A-, 4A- and 5A-subalgebras (cf. Table 1) in the Monster case are called 3A-, 4A- and 5A-axial vectors, respectively. We apply this terminology to an arbitrary Majorana representation. In the case of the Monster group $M$, the axial vector associated with $\rho$ can be specified inside the centralizer of $C_M(\rho)$ in the 196 884-dimensional vector space. In the present setting the axial vectors are defined through the subalgebras generated by pairs of Majorana generators. It is natural to assume that every axis is independent of the choice of the subalgebra involved in its definition. In the groups we have considered so far ($S_4$, $A_5$ and $L_3(2)$), every cyclic subgroup of order $n$ is contained in at most one dihedral subgroup of order $2n$. The situation in $A_6$ is different since it contains a pair of $D_6$-subgroups intersecting in a cyclic subgroup of order 3. This forces us to axiomatise yet another property of the Majorana representation of the Monster (we don’t know whether or not this axiom is independent of the previous Majorana axioms).

(3A) The following condition holds (where $t_0, t_1, t_2, t_3 \in T$, $a_i = \psi(t_i)$ for $0 \le i \le 3$): if
(a) $\langle t_0, t_1 \rangle \cong \langle t_2, t_3 \rangle \cong D_6$;
(b) $\langle t_0 t_1 \rangle = \langle t_2 t_3 \rangle$;
(c) both $\langle\langle a_0, a_1 \rangle\rangle$ and $\langle\langle a_2, a_3 \rangle\rangle$ have type 3A;
then the 3A-axial vectors in the subalgebras in (c) are equal.
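The formula for $a_\rho$ in condition (2A)(a) can be read as the 2A product rule of Table 1 solved for the third axis. The one-line rearrangement below is only a sketch of that observation:

```latex
a_0 \cdot a_1 = \tfrac{1}{2^3}\,(a_0 + a_1 - a_\rho)
\quad\Longleftrightarrow\quad
a_\rho = a_0 + a_1 - 8\, a_0 \cdot a_1 .
```

So, in a subalgebra of type 2A, the third axis is already determined by $a_0$, $a_1$ and their product.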


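Table 1. Norton–Sakuma algebras

| Type | Basis | Products and angles |
|---|---|---|
| 2A | $a_0, a_1, a_\rho$ | $a_0 \cdot a_1 = \frac{1}{2^3}(a_0 + a_1 - a_\rho)$, $a_0 \cdot a_\rho = \frac{1}{2^3}(a_0 + a_\rho - a_1)$; $(a_0, a_1) = (a_0, a_\rho) = (a_1, a_\rho) = \frac{1}{2^3}$ |
| 2B | $a_0, a_1$ | $a_0 \cdot a_1 = 0$, $(a_0, a_1) = 0$ |
| 3A | $a_{-1}, a_0, a_1, u_\rho$ | $a_0 \cdot a_1 = \frac{1}{2^5}(2a_0 + 2a_1 + a_{-1}) - \frac{3^3 \cdot 5}{2^{11}} u_\rho$; $a_0 \cdot u_\rho = \frac{1}{3^2}(2a_0 - a_1 - a_{-1}) + \frac{5}{2^5} u_\rho$; $u_\rho \cdot u_\rho = u_\rho$; $(a_0, a_1) = \frac{13}{2^8}$, $(a_0, u_\rho) = \frac{1}{2^2}$, $(u_\rho, u_\rho) = \frac{2^3}{5}$ |
| 3C | $a_{-1}, a_0, a_1$ | $a_0 \cdot a_1 = \frac{1}{2^6}(a_0 + a_1 - a_{-1})$, $(a_0, a_1) = \frac{1}{2^6}$ |
| 4A | $a_{-1}, a_0, a_1, a_2, v_\rho$ | $a_0 \cdot a_1 = \frac{1}{2^6}(3a_0 + 3a_1 + a_2 + a_{-1} - 3v_\rho)$; $a_0 \cdot v_\rho = \frac{1}{2^4}(5a_0 - 2a_1 - a_2 - 2a_{-1} + 3v_\rho)$; $v_\rho \cdot v_\rho = v_\rho$, $a_0 \cdot a_2 = 0$; $(a_0, a_1) = \frac{1}{2^5}$, $(a_0, a_2) = 0$, $(a_0, v_\rho) = \frac{3}{2^3}$, $(v_\rho, v_\rho) = 2$ |
| 4B | $a_{-1}, a_0, a_1, a_2, a_{\rho^2}$ | $a_0 \cdot a_1 = \frac{1}{2^6}(a_0 + a_1 - a_{-1} - a_2 + a_{\rho^2})$; $a_0 \cdot a_2 = \frac{1}{2^3}(a_0 + a_2 - a_{\rho^2})$; $(a_0, a_1) = \frac{1}{2^6}$, $(a_0, a_2) = (a_0, a_{\rho^2}) = \frac{1}{2^3}$ |
| 5A | $a_{-2}, a_{-1}, a_0, a_1, a_2, w_\rho$ | $a_0 \cdot a_1 = \frac{1}{2^7}(3a_0 + 3a_1 - a_2 - a_{-1} - a_{-2}) + w_\rho$; $a_0 \cdot a_2 = \frac{1}{2^7}(3a_0 + 3a_2 - a_1 - a_{-1} - a_{-2}) - w_\rho$; $a_0 \cdot w_\rho = \frac{7}{2^{12}}(a_1 + a_{-1} - a_2 - a_{-2}) + \frac{7}{2^5} w_\rho$; $w_\rho \cdot w_\rho = \frac{5^2 \cdot 7}{2^{19}}(a_{-2} + a_{-1} + a_0 + a_1 + a_2)$; $(a_0, a_1) = \frac{3}{2^7}$, $(a_0, w_\rho) = 0$, $(w_\rho, w_\rho) = \frac{5^3 \cdot 7}{2^{19}}$ |
| 6A | $a_{-2}, a_{-1}, a_0, a_1, a_2, a_3, a_{\rho^3}, u_{\rho^2}$ | $a_0 \cdot a_1 = \frac{1}{2^6}(a_0 + a_1 - a_{-2} - a_{-1} - a_2 - a_3 + a_{\rho^3}) + \frac{3^2 \cdot 5}{2^{11}} u_{\rho^2}$; $a_0 \cdot a_2 = \frac{1}{2^5}(2a_0 + 2a_2 + a_{-2}) - \frac{3^3 \cdot 5}{2^{11}} u_{\rho^2}$; $a_0 \cdot u_{\rho^2} = \frac{1}{3^2}(2a_0 - a_2 - a_{-2}) + \frac{5}{2^5} u_{\rho^2}$; $a_0 \cdot a_3 = \frac{1}{2^3}(a_0 + a_3 - a_{\rho^3})$, $a_{\rho^3} \cdot u_{\rho^2} = 0$, $(a_{\rho^3}, u_{\rho^2}) = 0$; $(a_0, a_1) = \frac{5}{2^8}$, $(a_0, a_2) = \frac{13}{2^8}$, $(a_0, a_3) = \frac{1}{2^3}$ |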
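As a sanity check on the 2A row of Table 1, the sketch below builds the three-dimensional 2A algebra in exact arithmetic and tests the eigenvector conditions for $a_0$. The helper names (`basis_product`, `mul`, `inner`) are illustrative only. The product $a_1 \cdot a_\rho$, which the table does not list, is filled in here by assuming the obvious symmetry between the three axes.

```python
from fractions import Fraction as F

# Coordinates are taken with respect to the basis (a0, a1, a_rho).
E = [(F(1), F(0), F(0)), (F(0), F(1), F(0)), (F(0), F(0), F(1))]

def add(x, y):
    return tuple(p + q for p, q in zip(x, y))

def scale(c, x):
    return tuple(c * p for p in x)

def basis_product(i, j):
    # Axes are idempotents; distinct axes multiply as in the 2A row:
    # e_i . e_j = (e_i + e_j - e_k) / 8, with k the remaining index.
    # (For the pair (a1, a_rho) this is an assumption based on symmetry.)
    if i == j:
        return E[i]
    k = 3 - i - j
    return scale(F(1, 8), add(add(E[i], E[j]), scale(F(-1), E[k])))

def mul(x, y):
    # Bilinear extension of the products of basis vectors.
    out = (F(0), F(0), F(0))
    for i in range(3):
        for j in range(3):
            out = add(out, scale(x[i] * y[j], basis_product(i, j)))
    return out

def inner(x, y):
    # Gram matrix of the 2A row: axes have length 1, distinct axes have angle 1/8.
    return sum(x[i] * y[j] * (F(1) if i == j else F(1, 8))
               for i in range(3) for j in range(3))

a0, a1, a_rho = E
zero_vec = add(add(a1, a_rho), scale(F(-1, 4), a0))   # a1 + a_rho - a0/4
quarter_vec = add(a1, scale(F(-1), a_rho))            # a1 - a_rho

print(mul(a0, zero_vec))                   # image of the candidate 0-eigenvector
print(mul(a0, quarter_vec), quarter_vec)   # image of the candidate 1/4-eigenvector
```

In this algebra $a_0$ has no $\frac{1}{32}$-eigenvectors, so only the eigenvalues $1$, $0$ and $\frac{1}{4}$ are visible; the same pattern of check applies to the larger rows of the table.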

If $G$ is $A_6$ or $A_7$, then every cyclic subgroup of order 5 of $G$ is contained in a unique dihedral subgroup of order 10, so the uniqueness condition for 5A is not necessary in these cases. In fact, the uniqueness property for 2A-generated $D_{10}$-subgroups holds in the whole of the Monster, so we might never need to axiomatise it. On the other hand, it is doubtful that the Majorana representations of $A_8$ can be classified without a uniqueness assumption on the 4A-axial vectors.

Let $R = (G, T, V, (\ ,\ ), \cdot\,, \varphi, \psi)$ be a Majorana representation of a group $G$ and let $\zeta : g \mapsto g^{\zeta}$ be an automorphism of $G$ preserving $T$. Then $R$ is said to be $\zeta$-stable if the permutation which maps $\psi(t)$ onto $\psi(t^{\zeta})$ for every $t \in T$ extends to an automorphism of $(V, (\ ,\ ), \cdot)$.


$S$-stability (where $S$ is a group of automorphisms of $G$) has the obvious meaning. By definition every Majorana representation of $G$ is stable under the inner automorphisms of $G$.

It is implicit in Table 1 that every element of order 5 in the Monster which is inverted by a 2A-involution has type 5A. All simple subgroups in the Monster containing 5A-elements have been classified by Simon Norton. The classification proof remains unpublished but the list of subgroups can be found in [N98]. It can be seen from this list that the Monster contains a single class of $A_6$-subgroups and a single class of $A_7$-subgroups which are generated by 2A-involutions. These subgroups are 6- and 5-point stabilizers in the ‘standard’ $A_{12}$-subgroup in the Monster, which is the centralizer of an $A_5$-subgroup of type (2A, 3A, 5A). All 3-elements in the $A_6$- and $A_7$-subgroups in question are of type 3A. Finally, each of these two subgroups is fully normalized in the Monster in the sense that every automorphism is induced by a suitable element from the normalizer. This property can be seen from the structure of the Monstralizers quoted in the proof of Lemma 5.1.

Theorem 1. Let $G$ be isomorphic to the alternating group of degree 6 or 7, and let $R = (G, T, V, (\ ,\ ), \cdot\,, \varphi, \psi)$ be a Majorana representation of $G$ satisfying the conditions (2A) and (3A). Suppose further that whenever $t_0, t_1$ are involutions in $T$ which generate a $D_6$-subgroup, the subalgebra $\langle\langle \psi(t_0), \psi(t_1) \rangle\rangle$ has type 3A (rather than 3C). Then
(i) $R$ is uniquely determined by $G$;
(ii) $R$ is based on an embedding of $G$ in the Monster and it is $\zeta$-stable for every $\zeta \in \mathrm{Aut}(G)$;
(iii) if $G \cong A_6$ then $\dim(V)$ is either 75 or 76;
(iv) if $G \cong A_7$ then $\dim(V)$ is either 196 or 197.

It follows from the properties of the Monster stated in the paragraph before the theorem that the second assertion would follow from the first.² Having been challenged by the referee’s remark on an earlier version of the paper I was able to prove the following.

Proposition 1.1 [Iv11]. Suppose that there exists a Majorana representation $R = (A_6, T, V, (\ ,\ ), \cdot\,, \varphi, \psi)$ such that the subalgebra $\langle\langle \psi(t_0), \psi(t_1) \rangle\rangle$ has type 3C for at least one pair $t_0, t_1$ of involutions in $T$ which generate a $D_6$-subgroup. Then
(i) $\langle\langle \psi(t), \psi(s) \rangle\rangle$ has type 3C for all pairs $t, s$ of involutions in $T$ which generate a $D_6$-subgroup;
(ii) $R$ is 2-closed and 70-dimensional;
(iii) $R$ is determined uniquely up to isomorphism.

Corollary 1.2. There exist no Majorana representations of $A_7$ for which the subalgebra $\langle\langle \psi(t_0), \psi(t_1) \rangle\rangle$ has type 3C for at least one pair $t_0, t_1$ of involutions generating a $D_6$-subgroup.

Proof. If $t_0 t_1$ is a 3-cycle, then $\langle t_0, t_1 \rangle$ is contained in a $D_{12}$-subgroup and hence $\langle\langle \psi(t_0), \psi(t_1) \rangle\rangle$ must be of type 3A, since the 3C-algebra is not contained in the 6A-type algebra. If $\langle\langle \psi(t_0), \psi(t_1) \rangle\rangle$ has type 3C when $t_0 t_1$ is the product of two 3-cycles, then either the preceding argument applies or the restriction of the Majorana representation of $A_7$ to an $A_6$-subgroup involves both 3A- and 3C-types, contrary to Proposition 1.1 (i).

Footnote 2: Recently the representations of Theorem 1 have been explicitly constructed by Ákos Seress (private communication). Their dimensions turned out to be 76 and 196, respectively.


In the proof of Theorem 1 we make use of the fact that the restriction of $R$ to a subgroup $S_4$, $A_5$ or $L_3(2)$ in $G$ is a known Majorana representation of shape (2A, 3A), explicitly described in [IPSS10], [ISe10] or [ISh10], respectively. This fact enables us to calculate many pairwise products in $V$. The remaining ones are recovered by a formula introduced in [IPSS10] under the name of the resurrection principle. This principle is an immediate consequence of the fusion rules and we reproduce it below for the sake of completeness (compare Lemma 1.7 in [IPSS10]).

Lemma 1.3. Let $R = (G, T, V, (\ ,\ ), \cdot\,, \varphi, \psi)$ be a Majorana representation, let $a$ be a Majorana axis in $V$, and let $X$ be an $a$-stable subspace of $V$ in the sense that $a \cdot x \in X$ for every $x \in X$. Suppose that $s \in V$ and that

$\alpha_s = s + x_0$ and $\beta_s = s + x_{1/4}$

are, respectively, 0- and $\frac{1}{4}$-eigenvectors of $a$ for some $x_0, x_{1/4} \in X$. Then

$s = -4a \cdot (x_0 - x_{1/4}) - x_{1/4}$.
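The resurrection formula follows directly from the two eigenvector conditions. The short derivation below is a sketch using only the hypotheses of Lemma 1.3; the sign of the last term is fixed by these two equations.

```latex
\begin{aligned}
a \cdot (s + x_0) = 0
  &\;\Longrightarrow\; a \cdot s = -\,a \cdot x_0, \\
a \cdot (s + x_{1/4}) = \tfrac{1}{4}\,(s + x_{1/4})
  &\;\Longrightarrow\; \tfrac{1}{4}\, s = a \cdot s + a \cdot x_{1/4} - \tfrac{1}{4}\, x_{1/4}, \\
\text{hence}\quad s
  &= -4\, a \cdot (x_0 - x_{1/4}) - x_{1/4}.
\end{aligned}
```

Since $X$ is $a$-stable, the right-hand side lies in $X$, so $s$ is expressed entirely through vectors of $X$ and their products with $a$.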

2. 2-Closure

Let $R = (G, T, V, (\ ,\ ), \cdot\,, \varphi, \psi)$ be a Majorana representation. As above, for $t \in T$ we denote by $a_t$ the corresponding Majorana axis $\psi(t)$. Put

$A = \{a_t \mid t \in T\}$ and $A^2 = \{a_t \cdot a_s \mid t, s \in T\}$

(note that $A^2$ contains $A$, since the Majorana axes are idempotents). By definition $V = \langle\langle A \rangle\rangle$. A Majorana representation is said to be 2-closed if $V = \langle A^2 \rangle$. It is clear that $R$ is 2-closed if and only if the algebra product $\cdot$ is closed on the linear span $\langle A^2 \rangle$ of $A^2$. As a consequence of the main results of [IPSS10, ISh10, ISe10] we have the following result.

Lemma 2.1. Let $H$ be $S_4$, $A_5$ or $L_3(2)$, and let $R$ be a Majorana representation of $H$. Then either
(i) $R$ is 2-closed, or
(ii) $H \cong S_4$ and $R$ is of shape (2B, 3A) and the algebra product is closed on the linear span of the set $\{a_t \cdot (a_s \cdot a_r) \mid t, s, r \in T\}$.

From now on we assume that $G$ is the alternating group of degree 6 or 7 and that $R = (G, T, V, (\ ,\ ), \cdot\,, \varphi, \psi)$ is a Majorana representation of $G$ which satisfies the hypothesis of Theorem 1. All the involutions in $G$ are conjugate, which implies that $T$ is the set of all involutions in $G$, so all the involutions in $G$ are of type 2A. We are going to show that $\langle A^2 \rangle$ contains a spanning set whose vectors are indexed by the subgroups of order 2, 3 and 5 in $G$. By definition $\langle A^2 \rangle$ is linearly spanned by $A$ together with the subalgebras $Y = \langle\langle a_t, a_s \rangle\rangle$ taken for all $a_t, a_s \in A$. By Proposition 1, unless $a_t = a_s$, the subalgebra $Y$ is isomorphic to a Norton–Sakuma algebra of type $NX$ in Table 1, where $N$ is the order of $ts$. We analyse the possibilities for $Y$, depending on $N$.

If $N = 2$ then $ts \in T$, since $T$ contains all the involutions of $G$, and by (2A) $Y$ must be of type 2A with

$Y = \langle a_t, a_s, a_{ts} \rangle \subseteq \langle A \rangle$.

If $N = 3$ then by the hypothesis of Theorem 1 $Y$ is of type 3A. By Proposition 1 and since $\varphi$ and $\psi$ commute, we have

$Y = \langle a_t, a_s = a_{hth^{-1}}, a_{h^{-1}th}, u_h(t) \rangle$,

where $h = ts$ and

$u_h(t) = \frac{2^{11}}{3^3 \cdot 5} \left( \frac{1}{2^5}(2a_t + 2a_{hth^{-1}} + a_{h^{-1}th}) - a_t \cdot a_{hth^{-1}} \right)$

is the unique idempotent in $Y$ of squared norm $\frac{8}{5}$. Since $Y$ is stable under the $\varphi$-image of $D = \langle t, s \rangle \cong D_6$ and since $\varphi(t)$ maps $a_{hth^{-1}}$ onto $a_{h^{-1}th}$ centralizing $a_t$, the vector $u_{h^{-1}}(t)$ defined via the pair $(a_t, a_{h^{-1}th})$ is equal to $u_h(t)$ by the last assertion in Proposition 1. By (3A) the vector $u_h(t)$ does not depend on the particular choice of the dihedral group $D = \langle h, t \rangle$ of order 6 containing $h$ and can be denoted simply by $u_h$ (by the above $u_h = u_{h^{-1}}$). Finally, every subgroup of order 3 in $G$ is inverted by an involution and hence it is contained in a $D_6$-subgroup. Therefore with every subgroup of order 3 in $G$ we have associated a vector $u_h$ (where $h$ is a generator of the subgroup) so that the linear span of the union of these vectors with the set $A$ contains the linear span of the 2-generated subalgebras $Y$ of type 3A.

If $N = 4$ then $Y$ must be of type 4B, since the 4A-algebra contains a 2B-subalgebra, which is not present by the $N = 2$ case. By (2A) we have

$Y = \langle a_t, a_s, a_{tst}, a_{sts}, a_{(ts)^2} \rangle \subseteq \langle A \rangle$.

If $N = 5$ then $Y$ is of type 5A and with $f = ts$ we have

$Y = \langle a_t, a_s = a_{f^2 t f^3}, a_{f^3 t f^2}, a_{f t f^{-1}}, a_{f^{-1} t f}, w_f(t) \rangle$,

where

$w_f(t) = a_t \cdot a_{f^2 t f^3} - \frac{1}{2^7}(3a_t + 3a_{f^2 t f^3} - a_{f t f^{-1}} - a_{f^{-1} t f} - a_{f^3 t f^2})$.

If $F$ is the automorphism group of the algebra structure on $Y$, preserving $\{a_{f^i t f^{5-i}} \mid 1 \le i \le 4\}$, then $F$ is easily seen to be $F_{20}$, the Frobenius group of order 20, and, up to rescaling, $w_f(t)$ is the unique vector in $Y$ which is centralized by the elements of the $D_{10}$-subgroup in $F$ and negated by the remaining elements of $F$. The structure of the 5A-algebra and its invariance under $\langle t, s \rangle \cong D_{10}$ show that

$w_f(t) = -w_{f^2}(t) = -w_{f^3}(t) = w_{f^{-1}}(t)$.

It is clear that $w_f(t)$ does not depend on the particular choice of the involution $t$ inside the $D_{10}$-subgroup containing $f$ and we may therefore write it as $w_f$. Since

$N_{A_6}(Z_5) \cong D_{10}$, $N_{A_7}(Z_5) \cong F_{20}$,


this $D_{10}$-subgroup is uniquely determined by $f$ and (without the (5A)-condition) with every subgroup of order 5 in $G$ we can associate a 1-dimensional subspace of $\langle A^2 \rangle$ spanned by $w_f$, where $f$ is a generator of the subgroup.

If $N = 6$ then $Y$ is of type 6A and with $g = ts$ and by (2A) we have

$Y = \langle a_t, a_s, a_{gtg^{-1}}, a_{g^{-1}tg}, a_{gsg^{-1}}, a_{g^{-1}sg}, a_{(st)^3}, u_{g^2} \rangle$.

Since $\langle\langle a_t, a_{gtg^{-1}} \rangle\rangle$ is a subalgebra of type 3A which contains $u_{g^2}$ (since $g^2 = tg^{-1}tg$), the algebras of type 6A are already generated as vector spaces by their 2A-axials and the vectors produced in the 3A-subalgebras.

For $i = 2$, 3 or 5 let $G^{(i)}$ denote a set of elements of order $i$ in $G$ which contains one representative from every subgroup of order $i$ in $G$; if $x, y \in G^{(i)}$ and the subgroups $\langle x \rangle$ and $\langle y \rangle$ are conjugate in $G$ we choose $x$ and $y$ to be $G$-conjugate. Clearly $G^{(2)}$ is the set $T$ of involutions in $G$. The above considerations can be summarised in the following lemma.

Lemma 2.2. Let $A = \{a_t \mid t \in G^{(2)}\}$, $U = \{u_h \mid h \in G^{(3)}\}$ and $W = \{w_f \mid f \in G^{(5)}\}$. Then $\langle A^2 \rangle = \langle A \cup U \cup W \rangle$.
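The normalizer orders used above ($|N_{A_6}(Z_5)| = 10$ and $|N_{A_7}(Z_5)| = 20$) can be confirmed by brute force. The sketch below is not part of the original argument; the helper names are illustrative, and permutations are stored as tuples acting on $\{0, \dots, n-1\}$.

```python
from itertools import permutations

def compose(p, q):
    # Apply p first, then q.
    return tuple(q[x] for x in p)

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def is_even(p):
    # A cycle of length L contributes L - 1 transpositions.
    seen = [False] * len(p)
    parity = 0
    for i in range(len(p)):
        j, length = i, 0
        while not seen[j]:
            seen[j] = True
            j = p[j]
            length += 1
        if length:
            parity += length - 1
    return parity % 2 == 0

def normalizer_order(n):
    # Order of the normalizer in A_n of the subgroup generated by a 5-cycle.
    f = tuple([1, 2, 3, 4, 0] + list(range(5, n)))
    nontrivial_powers = set()
    g = f
    for _ in range(4):
        nontrivial_powers.add(g)
        g = compose(g, f)
    count = 0
    for g in permutations(range(n)):
        conjugate = compose(compose(inverse(g), f), g)
        if is_even(g) and conjugate in nontrivial_powers:
            count += 1
    return count

print(normalizer_order(6), normalizer_order(7))
```

The two counts agree with the orders of $D_{10}$ and $F_{20}$; the check confirms only the orders, not the isomorphism types.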

For a subgroup $H$ of $G$ and $i \in \{2, 3, 5\}$ put $H^{(i)} = H \cap G^{(i)}$ and define

$A(H) = \{a_t \mid t \in H^{(2)}\}$, $U(H) = \{u_h \mid h \in H^{(3)}\}$, $W(H) = \{w_f \mid f \in H^{(5)}\}$,
$V(H) = A(H) \cup U(H) \cup W(H)$.

Lemma 2.3. Let $H$ be a subgroup of $G$ isomorphic to $D_{12}$, $A_4$, $S_4$, $A_5$ or $L_3(2)$. Then $\langle\langle V(H) \rangle\rangle = \langle V(H) \rangle$.

Proof. If $H \cong D_{12}$ then Proposition 1 is applicable. If $H$ is $S_4$, $A_5$ or $L_3(2)$ then the restriction of $R$ to $H$ is a Majorana representation of $H$ of shape (2A, 3A) and Lemma 2.1 is applicable. Every $A_4$-subgroup of $G$ is contained in an $S_4$-subgroup, and by Lemma 4.16 in [IPSS10] in the Majorana representation of $S_4$ the algebra product is closed on $\langle V(A_4) \rangle$. This provides the required reduction.

Lemma 2.4. Let $F_{21}$ be a Frobenius subgroup of order 21 in $A_7$. Then $F_{21}$ is contained in a subgroup $H \cong L_3(2)$ of $A_7$ and $\langle\langle V(F_{21}) \rangle\rangle \subseteq \langle V(H) \rangle$.

Proof. Since $F_{21}$ is the normalizer of a Sylow 7-subgroup in $A_7$, the inclusion of subgroups follows. By Lemma 2.3 the algebra product is closed on $\langle V(H) \rangle$, hence we also have the required inclusion of subspaces.

Note that the explicit product formulae on $\langle V(H) \rangle$ in the above lemma can be read from Table 1, [IPSS10, ISh10, ISe10]. Of course the $L_3(2)$-subgroup appears only when $G$ is $A_7$.

As the crucial step in our treatment we show that the 5A-axial vectors in $\langle A^2 \rangle$ together add at most one dimension to the subspace $\langle A \cup U \rangle$ spanned by the Majorana and 3A-axial vectors. This conclusion is a consequence of Norton’s relation stated on p. 300 of [N96]. It was, incidentally, noticed by Sophie Decelle that the statement of Norton’s relation in [N96] contains a sign error (in the sign of the terms beginning 16 ).

Lemma 2.5. The codimension of $\langle A \cup U \rangle$ in $\langle A \cup U \cup W \rangle$ is at most 1.


Proof. A fundamental property of the Majorana representation of $H \cong A_5$ of the shape (2A, 3A) is that $\langle A(H) \cup U(H) \rangle$ has codimension 1 in $\langle V(H) \rangle$ (cf. p. 300 in [N96] and Lemma 4.4 in [ISe10]). Thus, in order to establish the assertion, it suffices to prove the connectivity of the graph $\Gamma(G)$ on the set of order 5 subgroups in $G$, where two such subgroups are adjacent if they are contained in a common $A_5$-subgroup. It is clear that the subgroups in a connected component of $\Gamma(G)$ generate a nontrivial normal subgroup of $G$. Since both $A_6$ and $A_7$ are simple, the connectivity follows.

Lemma 2.6. The Majorana representation $R$ considered is 2-closed if and only if

$(A \cup U)^2 \subseteq \langle A \cup U \cup W \rangle$.

Proof. The ‘only if’ assertion is obvious. Suppose that the inclusion holds. In order to prove the 2-closure, in view of Lemma 2.2 it is sufficient to show that

$x \cdot w_f \in \langle A \cup U \cup W \rangle$

for every $f \in G^{(5)}$ and $x \in A \cup U$. Suppose that $x = a_t$ for some $t \in G^{(2)}$ (the case when $x = u_h$ can be treated similarly). Let $g$ be an element of order 5 in $G$ such that $t$ and $g$ are contained in a common $A_5$-subgroup $H$. Then $a_t \cdot w_g \in \langle A(H) \cup U(H) \cup W(H) \rangle$ by Lemma 2.3, while by Lemma 2.5 $w_f = w_g + y$ for some $y \in \langle A \cup U \rangle$. Now the assumed inclusion gives the result.

We are going to prove the following enhanced version of the 2-closure condition stated in the above lemma.

Proposition 2.7. The following assertions hold:
(i) if $t, s \in G^{(2)}$ then $a_t \cdot a_s \in \langle A \cup U \cup W \rangle$;
(ii) if $t \in G^{(2)}$ and $h \in G^{(3)}$ then $a_t \cdot u_h \in \langle A \cup U \cup W \rangle$;
(iii) if $h, k \in G^{(3)}$ then $u_h \cdot u_k \in \langle A \cup U \cup W \rangle$;
(iv) in each of the above cases the product is uniquely determined by the factors.

Proof. The assertion (i) is immediate from Lemma 2.2. Roughly speaking (ii) holds because neither $A_6$ nor $A_7$ is generated by an element of order 2 together with an element of order 3. We start with the case $G \cong A_6$. For $t \in G^{(2)}$ and $h \in G^{(3)}$ let $o(th)$ denote the order of the group product of $t$ and $h$ and let $\langle t, h \rangle$ denote the isomorphism type of the subgroup in $G$ generated by $t$ and $h$. Then it is an elementary exercise to check that the possibilities are as described in (the first three columns of) the following table:

| # | $o(th)$ | $\langle t, h \rangle$ | $(a_t, u_h)$ |
|---|---|---|---|
| 1 | 2 | $D_6$ | $\frac{1}{4}$ |
| 2 | 3 | $A_4$ | $\frac{1}{9}$ |
| 3 | 4 | $S_4$ | $\frac{1}{36}$ |
| 4 | 5 | $A_5$ | $\frac{1}{18}$ |
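The first three columns of this table can be checked by machine. The sketch below (helper names are illustrative) fixes one involution of $A_6$, which loses nothing since all involutions are conjugate, runs over all 80 elements of order 3, and records the order of $th$ together with the order of the generated subgroup. It compares subgroup orders only, not isomorphism types.

```python
from itertools import permutations

def compose(p, q):
    # Apply p first, then q.
    return tuple(q[x] for x in p)

def order(p):
    identity = tuple(range(len(p)))
    g, n = p, 1
    while g != identity:
        g = compose(g, p)
        n += 1
    return n

def is_even(p):
    seen = [False] * len(p)
    parity = 0
    for i in range(len(p)):
        j, length = i, 0
        while not seen[j]:
            seen[j] = True
            j = p[j]
            length += 1
        if length:
            parity += length - 1
    return parity % 2 == 0

def generated(gens):
    # Closure under right multiplication by the generators (enough in a finite group).
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

t = (1, 0, 3, 2, 4, 5)  # the involution (0 1)(2 3) of A6
order_three = [h for h in permutations(range(6)) if is_even(h) and order(h) == 3]

observed = {}
for h in order_three:
    observed.setdefault(order(compose(t, h)), set()).add(len(generated([t, h])))

print(observed)
```

Each product order should correspond to a single subgroup order, namely $|D_6| = 6$, $|A_4| = 12$, $|S_4| = 24$ and $|A_5| = 60$.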


Thus in the $A_6$-case assertion (ii) follows from Lemma 2.3. In the situation corresponding to the first row $t$ normalizes $\langle h \rangle$ and thus $a_t \cdot u_h$ is contained in the 3A-subalgebra $\langle\langle a_t, a_{h^{-1}th} \rangle\rangle$. The entries in the last column of the above table can be deduced from [N96] Table 3 by rescaling or read from [IPSS10] and [ISe10].

Let us turn to the case $G \cong A_7$. If $h$ is a 3-cycle then either $t$ and $h$ are contained in a common $A_6$-subgroup and the assertion has already been established, or $t$ and $h$ have disjoint supports in the permutation action on seven points. In that case they are simultaneously inverted by an involution, so they are contained in a $D_{12}$-subgroup and again Lemma 2.3 is applicable (in this case $a_t$ and $u_h$ are perpendicular). We may therefore assume that $h$ is a product of two 3-cycles. It can be shown that under conjugation by $S_7$ there is a single orbit on pairs $(t, h)$ with $o(th) = 7$. On the other hand, such a pair can be found inside an $L_3(2)$-subgroup (it is a well known fact that $A_7$ is not a (2, 3, 7)-group). Thus the table describing the possibilities for $A_7$ is the one for $A_6$ extended by the following two rows:

| # | $o(th)$ | $\langle t, h \rangle$ | $(a_t, u_h)$ |
|---|---|---|---|
| 5 | 6 | $Z_6$ | $0$ |
| 6 | 7 | $L_3(2)$ | $\frac{1}{24}$ |

The rightmost entry of the last row can be deduced from [N96] Table 3 by rescaling or read from [ISh10]. Again Lemma 2.3 is applicable, which completes the proof of (ii). The proof of (iii) will be accomplished in Sect. 4 after some detailed analysis of 3A-algebras in the next section, but first we state explicitly what we have achieved so far.

Corollary 2.8. The subspace $\langle A \cup U \cup W \rangle$ is Majorana stable, in the sense that for every $t \in T$ it contains the product $a_t \cdot v$ whenever $v$ is taken from this subspace.

Proof. The assertion is implied by Proposition 2.7 (i) and (ii) together with Lemma 2.6.

3. Calculating with 3A-Subalgebras

We start with the following lemma, which can be justified either by direct calculations with the formulae in Table 1 or by referring to Table 4 in [IPSS10].

Lemma 3.1. Let $t \in T$, $h \in G^{(3)}$ and suppose that $t$ inverts $h$, that is $tht = h^{-1}$. Then the vectors

$\alpha_h^{(t)} = u_h - \frac{2 \cdot 5}{3^3}\, a_t + \frac{2^5}{3^3}\,(a_{hth^{-1}} + a_{h^{-1}th})$

and

$\beta_h^{(t)} = u_h - \frac{2^3}{3^2 \cdot 5}\, a_t - \frac{2^5}{3^2 \cdot 5}\,(a_{hth^{-1}} + a_{h^{-1}th})$

are, respectively, 0- and $\frac{1}{4}$-eigenvectors of $a_t$ in the algebra $\langle\langle a_t, u_h \rangle\rangle$ of type 3A.
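The eigenvector claims of Lemma 3.1 can be checked directly against the 3A row of Table 1. The sketch below works in the basis $(a_0, a_1, a_{-1}, u_\rho)$ with $a_0 = a_t$; the function name `ad_a0` is illustrative, and the product $a_0 \cdot a_{-1}$ is obtained from the table entry for $a_0 \cdot a_1$ by swapping the roles of $a_1$ and $a_{-1}$ (an assumption based on the symmetry of the algebra).

```python
from fractions import Fraction as F

C = F(3**3 * 5, 2**11)  # coefficient of u_rho in a0 . a1

# Images of the basis vectors (a0, a1, a_{-1}, u) under v -> a0 . v.
IMAGES = [
    (F(1), F(0), F(0), F(0)),                # a0 . a0 = a0
    (F(2, 32), F(2, 32), F(1, 32), -C),      # a0 . a1
    (F(2, 32), F(1, 32), F(2, 32), -C),      # a0 . a_{-1} (by symmetry)
    (F(2, 9), F(-1, 9), F(-1, 9), F(5, 32)), # a0 . u
]

def ad_a0(v):
    # Linear extension of the table above.
    return tuple(sum(v[i] * IMAGES[i][j] for i in range(4)) for j in range(4))

def scale(c, v):
    return tuple(c * x for x in v)

alpha = (F(-10, 27), F(32, 27), F(32, 27), F(1))    # u - (10/27) a0 + (32/27)(a1 + a_{-1})
beta = (F(-8, 45), F(-32, 45), F(-32, 45), F(1))    # u - (8/45) a0 - (32/45)(a1 + a_{-1})
gamma = (F(0), F(1), F(-1), F(0))                   # a1 - a_{-1}

print(ad_a0(alpha))
print(ad_a0(beta) == scale(F(1, 4), beta))
print(ad_a0(gamma) == scale(F(1, 32), gamma))
```

Together with $a_0$ itself, the three vectors exhibit all four eigenvalues $1$, $0$, $\frac{1}{4}$ and $\frac{1}{32}$ in the four-dimensional 3A algebra.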

Proposition 3.2. Let $t \in T$, and let $h, k \in G^{(3)}$. Suppose that $h$ and $k$ generate distinct subgroups in $G$ and that $t$ inverts both $h$ and $k$. Then the following assertions hold:

(i) $(u_h, u_k) = \frac{2^4}{3^5 \cdot 5}\,(5 - 2^3 \cdot 3^2 \cdot p + 2^7 \cdot q)$, where $p = (u_h, a_{ktk^{-1}})$, $q = (a_{hth^{-1}}, a_{ktk^{-1}}) + (a_{hth^{-1}}, a_{k^{-1}tk})$;

(ii) $u_h \cdot u_k = -4a_t \cdot \big( (\alpha_h^{(t)} \cdot \alpha_k^{(t)} - u_h \cdot u_k) - (\alpha_h^{(t)} \cdot \beta_k^{(t)} - u_h \cdot u_k) \big) - (\alpha_h^{(t)} \cdot \beta_k^{(t)} - u_h \cdot u_k)$, where $\alpha_h^{(t)}$, $\alpha_k^{(t)}$, $\beta_h^{(t)}$ and $\beta_k^{(t)}$ are defined as in Lemma 3.1;

(iii) if the subspace $\langle A \cup U \cup W \rangle$ is Majorana stable then it contains the product $u_h \cdot u_k$.

Proof. Since the inner and algebra products associate by the Majorana representation axiomatics, eigenvectors with distinct eigenvalues are perpendicular, and hence $(\alpha_h^{(t)}, \beta_k^{(t)}) = 0$. Expanding this equality in terms of the expressions for $\alpha_h^{(t)}$ and $\beta_k^{(t)}$ given in Lemma 3.1, we obtain (i). Note that the equality $(u_h, a_{ktk^{-1}}) = (u_k, a_{hth^{-1}})$ also follows from the associativity between the inner and algebra products. The assertion (ii) is an immediate consequence of the resurrection principle (Lemma 1.3) for

$s = u_h \cdot u_k$, $\alpha_s^{(t)} = \alpha_h^{(t)} \cdot \alpha_k^{(t)}$, $\beta_s^{(t)} = \alpha_h^{(t)} \cdot \beta_k^{(t)}$.

It follows immediately from the shape of the eigenvectors in Lemma 3.1 that both $\alpha_h^{(t)} \cdot \alpha_k^{(t)} - u_h \cdot u_k$ and $\alpha_h^{(t)} \cdot \beta_k^{(t)} - u_h \cdot u_k$ are linear combinations of products of the form $a_r \cdot v$, where $r \in T$ and $v \in A \cup U$, and hence (iii) follows.
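The expansion behind Proposition 3.2 (i) can be reproduced mechanically. The sketch below treats every inner product as a linear form in $1$, $p$, $q$ and the unknown $X = (u_h, u_k)$, expands $(\alpha_h^{(t)}, \beta_k^{(t)}) = 0$ bilinearly and solves for $X$. The symbol `"A"` stands for the sum $a_{hth^{-1}} + a_{h^{-1}th}$ (or its analogue for $k$). The Gram entries are the 3A values of Table 1 combined with the symmetry under $\varphi(t)$; this bookkeeping is an assumption of the sketch rather than a quotation from the paper.

```python
from fractions import Fraction as F

def form(const=0, p=0, q=0, X=0):
    # Linear form const + p*P + q*Q + X*(u_h, u_k), stored as a 4-tuple.
    return (F(const), F(p), F(q), F(X))

# Inner products between an h-side symbol and a k-side symbol.
# "u" = 3A-axial vector, "a" = a_t, "A" = sum of the two conjugate axes.
GRAM = {
    ("u", "u"): form(X=1),
    ("u", "a"): form(F(1, 4)),
    ("u", "A"): form(p=2),
    ("a", "u"): form(F(1, 4)),
    ("a", "a"): form(1),
    ("a", "A"): form(F(13, 128)),   # 2 * 13/2^8
    ("A", "u"): form(p=2),
    ("A", "a"): form(F(13, 128)),
    ("A", "A"): form(q=2),
}

alpha_h = {"u": F(1), "a": F(-10, 27), "A": F(32, 27)}
beta_k = {"u": F(1), "a": F(-8, 45), "A": F(-32, 45)}

total = [F(0)] * 4
for s, cs in alpha_h.items():
    for r, cr in beta_k.items():
        for i in range(4):
            total[i] += cs * cr * GRAM[(s, r)][i]

c0, cp, cq, cX = total
# (alpha_h, beta_k) = 0  gives  X = -(c0 + cp*p + cq*q) / cX.
coefficients = (-c0 / cX, -cp / cX, -cq / cX)
print(coefficients)
```

The three coefficients should equal $\frac{2^4}{3^5 \cdot 5}$ times $5$, $-2^3 \cdot 3^2$ and $2^7$, respectively.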

The next lemma is an important specialization of the above proposition.

Lemma 3.3. With $t, h, k$ as in Proposition 3.2, suppose that $h$ and $k$ commute. Then $u_h$ and $u_k$ are perpendicular and annihilate each other:

$(u_h, u_k) = u_h \cdot u_k = 0$.

Proof. The subgroup $P = \langle t, h, k \rangle$ generated by $t$, $h$ and $k$ is a semi-direct product of the elementary abelian group $Q$ of order 9 generated by $h$ and $k$, and the order 2 group generated by $t$, where $t$ inverts every element of $Q$. The group $P$ possesses a unique Majorana representation satisfying the (3A)-condition. The vector space underlying this representation is spanned by four 3A-axis vectors $u_h$, $u_k$, $u_{hk}$, $u_{hk^{-1}}$ together with nine Majorana axis vectors $a_{h^i k^j t k^{-j} h^{-i}}$ for $0 \le i, j \le 2$. Any pair of distinct Majorana axes generate a 3A-subalgebra and so do a Majorana axis together with a 3A-axis. Therefore, by Table 1, in the notation of Proposition 3.2 we have $p = \frac{1}{4}$, $q = \frac{13}{2^7}$ and calculating with the formulae therein we obtain the orthogonality. The annihilation property requires longer calculations with products of eigenvectors, but this is still doable by hand and fairly straightforward.

The four 3A-axes and nine Majorana axes in the representation of the group $P \cong 3^2 : 2$ defined in the proof of Lemma 3.3 are not linearly independent. The vector

$45\,(u_h + u_k + u_{hk} + u_{hk^{-1}}) - 32 \sum_{0 \le i, j \le 2} a_{h^i k^j t k^{-j} h^{-i}}$

is the zero vector, since its length is zero and $(\ ,\ )$, being an inner product, is positive definite. This relation was shown to me by Dima Pasechnik and hence I call it Pasechnik’s relation.³ The relation has the following important consequence vital for our proof.

Lemma 3.4. Suppose that $h$ and $k$ are disjoint 3-cycles in $G$. Then the sum $u_h + u_k$ is a linear combination of some Majorana axes and 3A-axial vectors $u_g$, for $g$ a product of two disjoint 3-cycles.
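The vanishing length behind Pasechnik's relation is a short computation with the Gram data. The sketch below uses the 3A values of Table 1 and the orthogonality of distinct 3A-axes from Lemma 3.3; the variable names are illustrative. It assumes, as in the proof of Lemma 3.3, that every Majorana axis of $P$ generates a 3A algebra with every 3A-axis, and that the sum runs over all nine Majorana axes.

```python
from fractions import Fraction as F

# Inner products in the representation of P = 3^2 : 2.
uu_same, uu_diff = F(8, 5), F(0)     # (u, u) and (u, u') for distinct 3A-axes
aa_same, aa_diff = F(1), F(13, 256)  # (a, a) and (a, a') for distinct Majorana axes
ua = F(1, 4)                         # (u, a) for any 3A-axis and Majorana axis

n_u, n_a = 4, 9
coef_u, coef_a = 45, -32

# Squared length of  45 * (sum of the u's) - 32 * (sum of the a's).
squared_length = (
    coef_u**2 * (n_u * uu_same + n_u * (n_u - 1) * uu_diff)
    + 2 * coef_u * coef_a * (n_u * n_a * ua)
    + coef_a**2 * (n_a * aa_same + n_a * (n_a - 1) * aa_diff)
)
print(squared_length)
```

The three contributions are $12960$, $-25920$ and $12960$, so the squared length is zero.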

Footnote 3: The referee has pointed out that this relation is actually known in VOA theory since $u_h + u_k + u_{hk} + u_{hk^{-1}}$ and $\sum_{0 \le i, j \le 2} a_{h^i k^j t k^{-j} h^{-i}}$ are both scalar multiples of the Virasoro element.


We conclude this section with yet another specialization of Proposition 3.2 (i). It is easy to check that under conjugation by $\mathrm{Aut}(A_6)$ there is a unique orbit on pairs $(h, k)$ of elements of order 3 generating $A_6$, with representative

$h = (1, 2, 3)(4, 5, 6)$, $k = (1, 2, 4)$.

(These two elements are simultaneously inverted by $t = (1, 2)(5, 6)$.)

Lemma 3.5. With $t, h, k$ as in Proposition 3.2, suppose that $h$ and $k$ generate $A_6$. Then $(u_h, u_k) = \frac{32}{405}$.

Proof. By the paragraph just before the lemma, we assume that $h = (1, 2, 3)(4, 5, 6)$, $k = (1, 2, 4)$, and $t = (1, 2)(5, 6)$. We apply Proposition 3.2 (i). The product of $h$ and $ktk^{-1}$ has order 5, which gives $p = \frac{1}{18}$ (cf. the table in the proof of Proposition 2.7 (ii)). The products of $hth^{-1}$ with $ktk^{-1}$ and $k^{-1}tk$ have orders 5 and 4, respectively. By Table 1 this gives $q = \frac{3}{2^7} + \frac{1}{2^6} = \frac{5}{2^7}$. Now the formula in Proposition 3.2 (i) gives the claimed value for the inner product.
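The calculations in Lemmas 3.3 and 3.5 follow one pattern: conjugate $t$, read off product orders, look up inner products, and apply Proposition 3.2 (i). The sketch below automates this. The function `u_inner` and the lookup tables are illustrative names; the tables assume the shape established in Sect. 2 (dihedral pairs of types 2A, 3A, 4B, 5A, 6A) and the values $(a_t, u_h)$ listed in the proof of Proposition 2.7 (ii).

```python
from fractions import Fraction as F

def from_cycles(cycles, n=7):
    # Build a permutation of {0,...,n-1} from cycles written on {1,...,n}.
    p = list(range(n))
    for c in cycles:
        for i, x in enumerate(c):
            p[x - 1] = c[(i + 1) % len(c)] - 1
    return tuple(p)

def compose(p, q):
    # Apply p first, then q.
    return tuple(q[x] for x in p)

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def order(p):
    identity = tuple(range(len(p)))
    g, n = p, 1
    while g != identity:
        g = compose(g, p)
        n += 1
    return n

def conj(g, t):
    # The conjugate of t whose cycles are the images under g of the cycles of t.
    return compose(compose(inverse(g), t), g)

# (a_t, a_s) indexed by the order of ts (types 2A, 3A, 4B, 5A, 6A of Table 1).
AXIS_AXIS = {1: F(1), 2: F(1, 8), 3: F(13, 256), 4: F(1, 64), 5: F(3, 128), 6: F(5, 256)}
# (a_t, u_h) indexed by the order of th (tables in the proof of Proposition 2.7).
AXIS_U = {2: F(1, 4), 3: F(1, 9), 4: F(1, 36), 5: F(1, 18), 6: F(0), 7: F(1, 24)}

def u_inner(t, h, k):
    # Proposition 3.2 (i); t must invert both h and k.
    if conj(t, h) != inverse(h) or conj(t, k) != inverse(k):
        raise ValueError("t must invert both h and k")
    ht = conj(h, t)
    kt_plus, kt_minus = conj(k, t), conj(inverse(k), t)
    p = AXIS_U[order(compose(kt_plus, h))]
    q = (AXIS_AXIS[order(compose(ht, kt_plus))]
         + AXIS_AXIS[order(compose(ht, kt_minus))])
    return F(16, 1215) * (5 - 72 * p + 128 * q)

# Lemma 3.3: commuting 3-cycles h = (1,2,3), k = (4,5,6), inverted by t = (1,2)(4,5).
lemma_3_3 = u_inner(from_cycles([(1, 2), (4, 5)]),
                    from_cycles([(1, 2, 3)]), from_cycles([(4, 5, 6)]))
# Lemma 3.5: the generating pair of A6, inverted by t = (1,2)(5,6).
lemma_3_5 = u_inner(from_cycles([(1, 2), (5, 6)]),
                    from_cycles([(1, 2, 3), (4, 5, 6)]), from_cycles([(1, 2, 4)]))
print(lemma_3_3, lemma_3_5)
```

The function applies only when an involution inverting both elements is available, which is exactly the hypothesis of Proposition 3.2.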

Finding an explicit formula for the product $u_h \cdot u_k$ for a generating pair of 3-elements appears far too complicated to be done by hand. The following lemma can also be proved by the methods developed in this section.

Lemma 3.6. The following assertions hold:
(i) if $h = (1, 2, 3)(4, 5, 6)$ and $k = (1, 2, 7)$ then $(u_h, u_k) = \frac{64}{405}$;
(ii) if $h = (1, 2, 3)(4, 5, 6)$ and $k = (1, 4, 7)$ then $(u_h, u_k) = \frac{32}{405}$.

Note that Lemmas 3.3, 3.5 and 3.6 contain representatives of all the orbits of $A_7$ on pairs of order 3 subgroups with one being generated by a single 3-cycle and the other one by two such cycles.

4. Proof of Proposition 2.7 (iii)

We consider the two alternating groups in separate subsections.

4.1. $A_6$-case. By Corollary 2.8 and Proposition 3.2 (iii), the assertion would follow if we can show that any pair of elements of order 3 in $G$ are simultaneously inverted by an involution. This is indeed the case when $G \cong A_6$.

Lemma 4.1. Any two elements $h$ and $k$ of order 3 in $A_6$ are simultaneously inverted by an involution.

Proof. This is indeed a very elementary fact. A possible argument is as follows. First suppose that $h$ and $k$ are conjugate in $A_6$. Then taking into account the outer automorphisms of $A_6$, we may assume that both $h$ and $k$ are 3-cycles (and certainly distinct). Then up to conjugation and inversion we just have the following three possibilities, where $t$ is the required involution:

$h = (1, 2, 3)$, $k = (1, 2, 4)$, $t = (1, 2)(5, 6)$;
$h = (1, 2, 3)$, $k = (1, 4, 5)$, $t = (2, 3)(4, 5)$;
$h = (1, 2, 3)$, $k = (4, 5, 6)$, $t = (1, 2)(4, 5)$.

If $h$ and $k$ are from different conjugacy classes, then up to inversion they conjugate to the pair given just before Lemma 3.5, which are also simultaneously inverted as shown there, or to $(1, 2, 3)(4, 5, 6)$ and $(1, 2, 3)$, which are simultaneously inverted by $(1, 2)(4, 5)$.
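Lemma 4.1 is small enough to confirm exhaustively. The sketch below (helper names are illustrative) lists the elements of order 3 and the involutions of $A_6$ and checks that every pair of elements of order 3 has a common inverting involution.

```python
from itertools import permutations

def compose(p, q):
    # Apply p first, then q.
    return tuple(q[x] for x in p)

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def order(p):
    identity = tuple(range(len(p)))
    g, n = p, 1
    while g != identity:
        g = compose(g, p)
        n += 1
    return n

def is_even(p):
    seen = [False] * len(p)
    parity = 0
    for i in range(len(p)):
        j, length = i, 0
        while not seen[j]:
            seen[j] = True
            j = p[j]
            length += 1
        if length:
            parity += length - 1
    return parity % 2 == 0

a6 = [p for p in permutations(range(6)) if is_even(p)]
order_three = [p for p in a6 if order(p) == 3]
involutions = [p for p in a6 if order(p) == 2]

def inverts(t, h):
    # For an involution t, t h t = h^{-1}.
    return compose(compose(t, h), t) == inverse(h)

# For every element of order 3, the set of (indices of) involutions inverting it.
inverters = {h: frozenset(i for i, t in enumerate(involutions) if inverts(t, h))
             for h in order_three}

all_pairs_inverted = all(inverters[h] & inverters[k]
                         for h in order_three for k in order_three)
print(len(order_three), len(involutions), all_pairs_inverted)
```

The check covers all $80 \times 80$ ordered pairs, including the degenerate cases $k = h$ and $k = h^{-1}$.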


4.2. A7-case. The analogue of Lemma 4.1 fails for A7, since this group contains the following elements which are not simultaneously inverted by an involution of A7: h = (1, 2, 3)(4, 5, 6)(7), k = (1, 2, 4)(3, 5, 7)(6) (in fact there is not even an involution centralizing one and inverting the other, which would enable us to apply another version of the resurrection principle). On the other hand, this pair generates L3(2), rather than the whole of A7, and hence Lemma 2.3 is applicable. Let us get it all organized.

We start by applying Pasechnik's relation in the form of Lemma 3.4. For i = 1 or 2 put $U^{(i)} = \{u_h \mid h \in G^{(3)} \text{ and } h \text{ has } i \text{ cycles}\}$.

Lemma 4.2. $\langle A \cup U^{(2)} \rangle = \langle A \cup U \rangle$.

Proof. Let h_1 = (1, 2, 3), h_2 = (4, 5, 6), h_3 = (1, 2, 7), h_4 = (3, 4, 6), h_5 = (2, 5, 7), h_6 = (1, 3, 4), h_7 = (5, 6, 7), h_8 = h_1. If 1 ≤ i ≤ 7 then $[h_i, h_{i+1}] = 1$ and by Lemma 3.4 $(u_{h_i} + u_{h_{i+1}}) \in \langle A \cup U^{(2)} \rangle$. Now the obvious equality

$$\sum_{1 \le i \le 7} (-1)^i \left(u_{h_i} + u_{h_{i+1}}\right) = -2u_{h_1}$$

proves the assertion.
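Both ingredients of the proof of Lemma 4.2 are easy to confirm by machine. The sketch below (illustrative, with hypothetical variable names) checks that consecutive 3-cycles in the chain have disjoint supports, which implies that they commute, and then collects the coefficient of each u_{h_j} in the alternating sum.

```python
from collections import Counter

# The chain h_1, ..., h_7 from the proof, closed up by h_8 = h_1
chain = [(1, 2, 3), (4, 5, 6), (1, 2, 7), (3, 4, 6),
         (2, 5, 7), (1, 3, 4), (5, 6, 7)]
chain.append(chain[0])

# Disjoint cycles commute, so disjoint supports give [h_i, h_{i+1}] = 1
consecutive_disjoint = [set(chain[i]).isdisjoint(chain[i + 1])
                        for i in range(7)]

# Coefficient of each u_h in sum_{i=1..7} (-1)^i (u_{h_i} + u_{h_{i+1}})
coefficients = Counter()
for i in range(1, 8):
    sign = (-1) ** i
    coefficients[chain[i - 1]] += sign   # u_{h_i}
    coefficients[chain[i]] += sign       # u_{h_{i+1}}

nonzero = {cycle: c for cycle, c in coefficients.items() if c != 0}
print(consecutive_disjoint, nonzero)
```

The sum telescopes because the chain has odd length: every intermediate term cancels and only the two copies of u_{h_1} survive, both with sign −1.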

The finishing touch of this subsection is the following lemma.

Lemma 4.3. Let h and k be elements of order 3 in A7 generated by products of two commuting 3-cycles. Then, up to reordering and inverting, the pair (h, k) corresponds to a row in the following table (the entry in the fifth column indicates whether h and k are simultaneously inverted by an involution of A7).

#       o(hk)   o(hk^{-1})   ⟨h, k⟩      ±    (u_h, u_k)
1       1       3            Z3          +    8/5
2       3       3            Z3 × Z3     +    0
3, 4    2       3            A4          +    136/405
5       3       7            F21         −    4/27
6       3       6            A4 × Z3     +    64/405
7       5       5            A5          +    16/405
8       4       4            L3(2)       +    32/405
9       4       7            L3(2)       −    4/81
10      5       6            A7          +    8/81

The group A7 contains two classes of A4 -subgroups generated by pairs of elements of order 3 having two 3-cycles: they preserve partitions 7=1+6 and 7=3+4, respectively. This explains the entry 3, 4 in the first column of the above table.


Proof. The classification of pairs (h, k) in question is fairly elementary and can be achieved by hand calculations, although computer calculations proved to be more efficient. The permutation character has been computed by Dima Pasechnik, and Igor Faradjev has observed that the pair $\{o(hk), o(hk^{-1})\}$ of product orders determines the isomorphism type of the generated subgroup. Then Dima listed the representatives of the ten orbits on pairs and determined the isomorphism type of the subgroups they generate. This information is collected in the first five columns of the above table.

Let us turn to the last column containing the inner product. In the terminology of [ISh10] the orbits numbered 5, 8 and 9 correspond to pairs of 3-, 2- and 4-related anti-flags in the relevant projective plane of order 2. After Lemmas 2.3 and 3.3 are applied and the inner products calculated in [IPSS10], [ISe10] and [ISh10] are inserted, we are left with two cases to deal with: the orbits numbered 6 and 10. In each of these two cases there is an involution simultaneously inverting the elements of order 3, and hence Proposition 3.2 (i) is applicable. The calculations are similar to those in the proof of Lemma 3.3. The details are as follows. The pair h = (1, 2, 3)(4, 5, 6), k = (1, 2, 7)(4, 5, 6) represents orbit number 6 and t = (1, 2)(4, 5) inverts both h and k. In terms of Proposition 3.2 (i) we have $p = \frac{1}{36}$, $q = \frac{9}{2^7}$, which gives $(u_h, u_k) = \frac{64}{405}$. For orbit number 10 we have h = (1, 2, 3)(4, 5, 6), k = (1, 2, 4)(5, 6, 7), t = (1, 2)(5, 6), $p = \frac{1}{24}$, $q = \frac{11}{2^8}$, which gives $(u_h, u_k) = \frac{8}{81}$.

Proof of Proposition 2.7, completion. To prove assertion (iii) first recall that, by Corollary 2.8, $\langle A \cup U \cup W \rangle$ is Majorana stable. Let $h, k \in G^{(3)}$. If G ≅ A6 then by Lemma 4.1 h and k are simultaneously inverted by an involution t of A6 and Proposition 3.2 (iii) is applicable. If G ≅ A7 then by Lemma 4.2 we can assume that $u_h, u_k \in U^{(2)}$, so that h and k are generated by products of two 3-cycles. By Lemma 4.3 and the table therein we have one of the following two possibilities: either h and k are contained in a subgroup H with H ∈ {D12, A4, S4, A5, L3(2), F21}, or h and k are simultaneously inverted by an involution of A7. In the former case we apply Lemmas 2.3 and 2.4, and in the latter case Proposition 3.2 (iii) can be applied exactly as in the A6-case discussed above. The situation when h and k commute is a particular case of the second possibility (in this case the product u_h · u_k is zero by Lemma 3.3). This completes the proof of assertion (iii).

Assertion (iv) now follows from the observation that the products in V(H), with H as in the previous paragraph, and also the products obtained via Proposition 3.2 (iii), are uniquely determined by the factors.

It is worth mentioning that a priori we did not assume that the Majorana representations of A6 and A7 are stable under Aut (A6 ) and S7 , but this can be seen a posteriori. 5. Dimensions and Monstralizers There is a standard procedure for obtaining an upper bound for the dimension of a Majorana representation of a group G based on its embedding in the Monster (cf. Lemma 2.2 in [ISe10]). Lemma 5.1. Let M be the Monster group, let = 0 ⊕ 1 be the vector space underlying the Monster algebra, where 0 is the 1-dimensional trivial M-module, while 1 is the minimal faithful M-module of dimension 196 883. Let A6 and A7 be alternating


subgroups in M generated by 2 A-involutions and let M6 = C M (A6 ) and M7 = C M (A7 ). Then d6 := dim C 1 (M6 ) = 77 and d7 := dim C 1 (M7 ) = 204. Proof. The Monstralizers M6 and M7 are contained in the maximal subgroups of the Monster of shape (A6 × A6 × A6 ).(2 × S4 ) and (A7 × A5 × A5 ).D8 , respectively. The dimension d6 has been calculated by independent methods by Simon Norton and Dima Pasechnik, while d7 was calculated by Sergey Shpectorov.

Since M6 centralizes the set A := {a_t | t ∈ A6} of Majorana axes, it centralizes the closure of A (where the closure is taken in the Monster algebra). Hence the dimension of the Majorana representation of A6 based on the embedding in the Monster is at most 78 = d6 + 1, where the extra 1 is the dimension of the trivial M-module, and similarly for A7. On the other hand, the dimensions are equal to the ranks of the Gram matrices of the corresponding spanning sets and thus can be calculated precisely.

Proposition 5.2. Let G be isomorphic to the alternating group of degree 6 or 7 and let R = (G, T, V, ( , ), · , ϕ, ψ) be a Majorana representation of G satisfying the hypothesis of Theorem 1. Let A = {a_t | t ∈ T} be the set of Majorana axes and let $U = \{u_h \mid h \in G^{(3)}\}$ be the set of 3A-axial vectors. Then
(i) R is uniquely determined by G, based on an embedding of G in the Monster, and Aut(G)-stable;
(ii) $\langle A \cup U \rangle$ has co-dimension 0 or 1 in V;
(iii) if G ≅ A6 then $\dim \langle A \cup U \rangle = 75$;
(iv) if G ≅ A7 then $\dim \langle A \cup U \rangle = 196$.

Proof. Assertion (i) follows from Proposition 2.7 (iii), (iv), Lemma 2.6 and the fact that the Monster contains 2A-generated subgroups A6 and A7. Assertion (ii) is by Lemma 2.5. Assertion (iii) has been established by Dima Pasechnik. The details are as follows: there are 10 Pasechnik relations, one for every Sylow 3-subgroup in A6. These relations were shown to be linearly independent and it turned out that there are no further relations other than linear combinations of these 10. Hence $\dim \langle A(A_6) \cup U(A_6) \rangle = |A(A_6)| + |U(A_6)| - 10 = 45 + 40 - 10 = 75$.

In case (iv) the calculations were performed by Igor Faradjev. Each of the sets A(A7) and U(A7) turned out to be linearly independent and hence spans in V(A7) the permutation module of A7 (or rather of S7), respectively on involutions and order 3 subgroups generated by pairs of 3-cycles. By the upper bound in Lemma 5.1 the whole of $A(A_7) \cup U^{(2)}(A_7)$ cannot be linearly independent since it contains 105 + 140 = 245 vectors. Specifically, it was shown that inside V(A7) the permutation module on A(A7) and the permutation module on $U^{(2)}(A_7)$ intersect in a 49-dimensional subspace, which is the direct sum of a 14- and a 35-dimensional A7-module. Thus $\dim \langle A(A_7) \cup U(A_7) \rangle = |A(A_7)| + |U^{(2)}(A_7)| - 49 = 196$, and there are 49 linearly independent Faradjev relations in V(A7).
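The cardinalities used in the proof of Proposition 5.2 follow from elementary counting. The sketch below is an illustration with hypothetical function names; it assumes, as the proof indicates, that A is indexed by the involutions of G and U by its subgroups of order 3. The numbers of relations (10 and 49) and the bounds from Lemma 5.1 are taken from the text, not recomputed.

```python
from math import comb

def involutions(n):
    """Number of double transpositions (a b)(c d) on n points.
    For n = 6, 7 these are exactly the involutions of the alternating group."""
    return comb(n, 2) * comb(n - 2, 2) // 2

def subgroups_single_3cycle(n):
    """Order-3 subgroups generated by one 3-cycle: one per 3-subset."""
    return comb(n, 3)

def subgroups_double_3cycle(n):
    """Order-3 subgroups generated by a product of two disjoint 3-cycles:
    choose an unordered pair of disjoint 3-subsets (4 elements of this
    shape on each pair), then divide by the 2 generators per subgroup."""
    elements = comb(n, 3) * comb(n - 3, 3) // 2 * 4
    return elements // 2

size_A_A6 = involutions(6)
size_U_A6 = subgroups_single_3cycle(6) + subgroups_double_3cycle(6)
dim_A6 = size_A_A6 + size_U_A6 - 10       # 10 Pasechnik relations

size_A_A7 = involutions(7)
size_U2_A7 = subgroups_double_3cycle(7)
dim_A7 = size_A_A7 + size_U2_A7 - 49      # 49 Faradjev relations

# Upper bounds d + 1 coming from Lemma 5.1
bound_A6, bound_A7 = 77 + 1, 204 + 1
print(size_A_A6, size_U_A6, dim_A6, size_A_A7, size_U2_A7, dim_A7)
```

In both cases the computed dimension of the span stays below the bound of Lemma 5.1, consistent with the co-dimension statement in assertion (ii).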

By Proposition 5.2 and Lemma 2.5 assertions (iii) and (iv) of Theorem 1 follow.


Acknowledgements. The results in this note would not have been achieved without essential computational and conceptual support provided by Igor Faradjev, Simon Norton, Dima Pasechnik, Ákos Seress, and Sergey Shpectorov. I am thankful to Karina Kirkina for her most careful proofreading. I also give thanks to two anonymous referees whose numerous rounds of comments helped me to improve the exposition.

References

[C84] Conway, J.H.: A simple construction for the Fischer–Griess monster group. Invent. Math. 79, 513–540 (1984)
[CCNPW] Conway, J.H., Curtis, R.T., Norton, S.P., Parker, R.A., Wilson, R.A.: Atlas of Finite Groups. Oxford: Clarendon Press, 1985
[Iv09] Ivanov, A.A.: The Monster Group and Majorana Involutions. Cambridge: Cambridge Univ. Press, 2009
[Iv11] Ivanov, A.A.: Majorana representations of A6 involving 3C-algebras. Manuscript, 2011
[IPSS10] Ivanov, A.A., Pasechnik, D.V., Seress, Á., Shpectorov, S.: Majorana representations of the symmetric group of degree 4. J. Alg. 324, 2432–2463 (2010)
[ISh10] Ivanov, A.A., Shpectorov, S.: Majorana Representations of L3(2). Preprint, 2010
[ISe10] Ivanov, A.A., Seress, Á.: Majorana Representations of A5. Preprint, 2010
[Miy03] Miyamoto, M.: Vertex operator algebras generated by two conformal vectors whose τ-involutions generate S3. J. Alg. 268, 653–671 (2003)
[N96] Norton, S.P.: The Monster algebra: some new formulae. In: Moonshine, the Monster and Related Topics, Contemp. Math. 193, Providence, RI: Amer. Math. Soc., 1996, pp. 297–306
[N98] Norton, S.P.: Anatomy of the Monster I. In: The Atlas of Finite Groups: Ten Years On, LMS Lect. Notes Ser. 249, Cambridge: Cambridge Univ. Press, 1998, pp. 198–214
[Sak07] Sakuma, S.: 6-Transposition property of τ-involutions of Vertex Operator Algebras. International Math. Research Notes 2007, article rnm030, 19 pages, doi:10.1093/imrn/rnm030

Communicated by Y. Kawahigashi

Commun. Math. Phys. 307, 17–63 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1254-5


Stability and Instability of Extreme Reissner-Nordström Black Hole Spacetimes for Linear Scalar Perturbations I Stefanos Aretakis University of Cambridge, Department of Pure Mathematics and Mathematical Statistics, Wilberforce Road, Cambridge CB3 0WB, United Kingdom. E-mail: [email protected] Received: 18 October 2010 / Accepted: 5 December 2010 Published online: 20 May 2011 – © Springer-Verlag 2011

Abstract: We study the problem of stability and instability of extreme Reissner-Nordström spacetimes for linear scalar perturbations. Specifically, we consider solutions to the linear wave equation $\Box_g \psi = 0$ on a suitable globally hyperbolic subset of such a spacetime, arising from regular initial data prescribed on a Cauchy hypersurface $\Sigma_0$ crossing the future event horizon H+. We obtain boundedness, decay and non-decay results. Our estimates hold up to and including the horizon H+. The fundamental new aspect of this problem is the degeneracy of the redshift on H+. Several new analytical features of degenerate horizons are also presented.

1. Introduction

Black holes are one of the most celebrated predictions of General Relativity and one of the most intriguing objects of Mathematical Physics. A particularly interesting but peculiar example of a black hole spacetime is given by the so-called extreme Reissner-Nordström metric, which in local coordinates (t, r, θ, φ) takes the form

$$g = -D\,dt^2 + \frac{1}{D}\,dr^2 + r^2 g_{\mathbb{S}^2}, \tag{1.1}$$

where

$$D = D(r) = \left(1 - \frac{M}{r}\right)^2,$$

$g_{\mathbb{S}^2}$ is the standard metric on $\mathbb{S}^2$ and M > 0. This spacetime has been the object of considerable study in the physics literature (see for instance the recent [37]). In this series of papers, we investigate the linear stability and instability of extreme Reissner-Nordström for scalar perturbations, that is to say we shall attempt a more or less complete treatment of the wave equation

$$\Box_g \psi = 0 \tag{1.2}$$

on extreme Reissner-Nordström exterior backgrounds. This resolves Open Problem 4 (for extreme Reissner-Nordström) from Sect. 8 of [24]. The fundamentally new aspect of this problem is the degeneracy of the redshift on the event horizon H+ . Several new analytical features of degenerate event horizons are also presented. The main results (see Sect. 4) of the present paper include: (1) Local integrated decay of energy, up to and including the event horizon H+ (Theorem 1). (2) Energy and pointwise uniform boundedness of solutions, up to and including H+ (Theorems 2, 4). (3) Sharp second order L 2 estimates, up to and including H+ (Theorem 3). (4) Non-Decay of higher order translation invariant quantities along H+ (Theorem 5). Note that the last result is a statement of instability. In the companion paper [3], where we shall provide the complete picture of the linear stability and instability of extreme Reissner-Nordström spacetimes, we obtain energy and pointwise decay of solutions up to and including H+ and non-decay and blow-up results for higher order derivatives of solutions along H+ . Note that the latter blow-up estimates are in sharp contrast with the non-extreme case, for which decay holds for all higher order derivatives of ψ along H+ . 1.1. Preliminaries. Before we discuss in detail our results, let us present the distinguishing properties of extreme Reissner-Nordström and put this spacetime in the context of previous results. 1.1.1. Extreme black holes. We briefly describe here the geometry of the horizon of extreme Reissner-Nordström. (For a nice introduction to the relevant notions, we refer the reader to [32]). The event horizon H+ corresponds to r = M, where the (t, r ) coordinate system (1.1) breaks down. The coordinate vector field ∂t , however, extends to a regular null Killing vector field T on H+ . The integral curves of T on H+ are in fact affinely parametrised: ∇T T = 0.

(1.3)

More generally, if an event horizon admits a Killing tangent vector field T for which (1.3) holds, then the horizon is called degenerate and the black hole extreme. In other words, a black hole is called extreme if the surface gravity vanishes on H+ (see Sect. 2.3). Under suitable circumstances, the notion of extreme black holes can in fact be defined even in case the spacetime does not admit a Killing vector field (see [19]). There has also been rapid progress as regards the problem of uniqueness of extreme black hole spacetimes. We refer the reader to [15] for the classification of static electro-vacuum solutions with degenerate components. The extreme Reissner-Nordström corresponds to the M = e subfamily of the two parameter Reissner-Nordström family with parameter mass M > 0 and charge e > 0. It sits between the non-extreme black hole case e < M and the so-called naked singularity case M < e. Note that the physical relevance of the black hole notion rests in the expectation that black holes are “stable” objects in the context of the dynamics of the Cauchy


problem for the Einstein equations. On the other hand, the so-called weak cosmic censorship conjecture suggests that naked singularities are dynamically unstable (see the discussion in [25]). That is to say, extreme black holes are expected to have both stable and unstable properties; this makes their analysis very interesting and challenging. 1.1.2. Linear scalar perturbations. The first step in understanding the non-linear stability (or instability) of a spacetime is by considering the wave equation (1.2) (we refer the reader to [10] for the details of the proof of the stability of Minkowski space). This is precisely the motivation of the present paper. Indeed, to show stability one would have to prove that solutions of the wave equation decay sufficiently fast. For potential future applications all methods should ideally be robust and the resulting estimates quantitative. Robust means that the methods still apply when the background metric is replaced by a nearby metric, and quantitative means that any estimate for ψ must be in terms of uniform constants and (weighted Sobolev) norms of the initial data. Note also that it is essential to obtain non-degenerate estimates for ψ on H+ and to consider initial data that do not vanish on the horizon. As we shall see, the issues at the horizon turn out to be the most challenging part in understanding the evolution of waves on extreme Reissner-Nordström. 1.1.3. Previous results for waves on non-extreme black holes. The wave equation (1.2) on black hole spacetimes has been studied for a long time beginning with the pioneering work of Regge and Wheeler [44] for Schwarzschild. Subsequently, a series of heuristic and numerical arguments were put forth for obtaining decay results for ψ (see [5,42]). However, the first complete quantitative result (uniform boundedness) was obtained only in 1989 by Kay and Wald [55], extending the restricted result of [53]. 
Note that the proof of [55] heavily depends on the exact symmetries of the Schwarzschild spacetime. During the last decade, the wave equation on black hole spacetimes has become a very active area in mathematical physics. As regards the Schwarzschild spacetime, “X estimates” providing local integrated energy decay (see Sect. 1.2.1 below) were derived in [6,7,20]. Note that [20] introduced a vector field estimate which captures in a stable manner the so-called redshift effect, which allowed the authors to obtain quantitative pointwise estimates on the horizon H+ . Refinements for Schwarzschild were achieved in [22] and [38]. Similar estimates to [6] were derived in [8] for the whole parameter range of Reissner-Nordström including the extreme case. However, these estimates degenerate on H+ and require the initial data to be supported away from H+ . The first boundedness result for solutions of the wave equation on slowly rotating Kerr (|a| M) spacetimes was proved in [23] and decay results were derived in [2,24,49]. Decay results for general subextreme Kerr spacetimes (|a| < M) are proven in [28]. Two new methods were presented recently for obtaining sharp decay of energy flux and pointwise decay on black hole spacetimes; see [26,50]. For results on the coupled wave equation see [19]. For other results see [29,30,34]. For an exhaustive list of references, see [24]. Note that all previous arguments for obtaining boundedness and decay results on non-extreme black hole spacetimes near the horizon would break down in our case (see Sects. 2.3, 10.1). The reason for this is precisely the degeneracy of the redshift on H+ . 1.2. Overview of results and techniques. We use the robust vector field method (see Sect. 5). Our methods at various points rely on the spherical symmetry but not on other


unstable properties of the spacetimes (such as the complete integrability of the geodesic flow, the separability of the wave equation, etc.) 1.2.1. Zeroth order Morawetz and X estimates. Our analysis begins with local L 2 spacetime estimates. We refer to local spacetime estimates controlling the derivatives of ψ as “X estimates” and ψ itself as “zeroth order Morawetz estimate”. Both these types of estimates have a long history (see [24]) beginning with the seminal work of Morawetz [40] for the wave equation on Minkowski spacetime. They arise from the spacetime term of energy currents JμX associated to a vector field X (see Sect. 5 for the definition of energy currents). For Schwarzschild, such estimates appeared in [6,7,9,20,22] and for Reissner-Nordström in [8]. The biggest difficulty in deriving an X estimate for black hole spacetimes has to do with the trapping effect. Indeed, from a continuity argument one can infer the existence of null geodesics which neither cross H+ nor terminate at I + . In our case, a class of such geodesics lie in a hypersurface of constant radius (see Sect. 2.2) known as the photon sphere. From the analytical point of view, trapping affects the derivatives tangential to the photon sphere and any non-degenerate spacetime estimate must lose (tangential) derivatives (i.e. must require high regularity for ψ). In this paper, we first (making minimal use of the spherical decomposition) derive a zeroth order Morawetz estimate for ψ which does not degenerate at the photon sphere. For the case l ≥ 1 (where l is related to the eigenvalues of the spherical Laplacian, see Sect. 7) we use a suitable current which captures the trapping in the extreme case. As regards the zeroth spherical harmonics (l = 0), we present a method which is robust and uses only geometric properties of the domain of outer communications. Our argument (for l = 0) applies for a wider class of black holes spacetimes and, in particular, it applies for Schwarzschild. 
Note that no unphysical conditions are imposed on the initial data which, in particular, are not required to be compactly supported or supported away from H+ . Once this Morawetz estimate is established, we then show how to derive a degenerate (at the photon sphere) X estimate which does not require higher regularity and a non-degenerate X estimate (for which we need, however, to commute with the Killing vector field T ). These estimates, however, degenerate on H+ ; this degeneracy will be eliminated later (see Sect. 1.2.4). See Theorem 1 of Sect. 4. 1.2.2. Uniform boundedness of non-degenerate energy. The vector field T = ∂t is causal and Killing and the energy flux of the current JμT is non-negative definite (and bounded) but degenerates on the horizon (see Sect. 8). Moreover, in view of the lack of redshift along H+ , the divergence of the energy current JμN associated to the redshift vector field N , first introduced in [20], is not positive definite near H+ (see Sect. 10). For this reason we appropriately modify JμN so the new bulk term is non-negative definite near H+ . Note that the bulk term is not positive far away from H+ and so to control these terms we use the X and zeroth order Morawetz estimates. The arising boundary terms can be bounded using Hardy-like inequalities. It is important here to mention that a Hardy inequality (in the first form presented in Sect. 6) allows us to bound the local L 2 norm of ψ on hypersurfaces crossing H+ using the (conserved) degenerate energy of T . See Theorem 2 of Sect. 4. 1.2.3. Non-decay along H+ . We next show that the degeneracy of redshift gives rise to a conservation law along H+ for the zeroth spherical harmonics. This result implies that for generic waves a higher order translation invariant geometric quantity does not


decay along H+ (see Theorem 5 of Sect. 4). This low frequency obstruction will be crucial for obtaining the definitive statement of instability. 1.2.4. Local integrated energy decay. We next return to the problem of retrieving the derivative transversal to H+ in the X estimate in a neighbourhood of H+ ; see Theorem 3 of Sect. 4. We first show that on top of the above low frequency obstruction comes another new feature of degenerate event horizons. Indeed, to obtain non-degenerate spacetime estimates, one needs to require higher regularity for ψ and commute with the vector field transversal to H+ . This shows that H+ exhibits phenomena characteristic of trapping (see also the discussion in Sect. 1.3.2). Then by using appropriate modifications of the redshift current and Hardy inequalities along H+ we obtain the sharpest possible result. See Sect. 12. Note that although (an appropriate modification of) the redshift current can be used as a multiplier for all angular frequencies, the redshift vector field N can only be used as a commutator1 for ψ supported on the frequencies l ≥ 1. These results will be further investigated in [3]. 1.2.5. Pointwise uniform boundedness. Using the above higher order energy estimates and appropriate Sobolev inequalities we finally obtain uniform pointwise boundedness of solutions up to and including H+ (see Theorem 4 of Sect. 4). We note that in [3] we show that the argument of Kay and Wald [55] cannot be applied in the extreme case for obtaining similar boundedness results, i.e. for generic ψ, there does not exist a Cauchy hypersurface crossing H+ and a solution ψ˜ such that T ψ˜ = ψ in the causal future of . 1.3. Remarks on the analysis of extreme black holes. We conclude this introductory section by discussing several new features of degenerate event horizons. 1.3.1. Dispersion vs redshift. 
In [24], it was shown that for a wide variety of nonextreme black holes, the redshift on H+ suffices to yield uniform boundedness (up to and including the horizon) of waves ψ without any need of understanding the dispersion properties of ψ. However, in the extreme case, the degeneracy of the redshift makes the understanding of the dispersion of ψ essential even for the problem of boundedness. In particular, one has to derive spacetime integral estimates for ψ and its derivatives. Moreover, we show that dispersion can be completely decoupled from the redshift effect. 1.3.2. Trapping effect on H+ . According to the results of Sects. 10.2 and 12, in order to obtain L 2 estimates in neighbourhoods of H+ one must require higher regularity for ψ and commute with the vector field transversal to H+ (which is not Killing). This loss of a derivative is characteristic of trapping. Geometrically, this is related to the fact that the null generators of H+ viewed as integral curves of the Killing vector field T are affinely parametrised. The trapping properties of the photon sphere have a different analytical flavour. Indeed, in order to obtain L 2 estimates in regions which include the photon sphere, one 1 The redshift vector field was used as a commutator for the first time in [23].


needs to commute with either T or the generators of the Lie algebra so(3) (note that all these vector fields are Killing). Only the high angular frequencies are trapped on the photon sphere (and for the low frequencies no commutation is required) while all the angular frequencies are trapped (in the above sense) on H+.

2. Geometry of Extreme Reissner-Nordström spacetime

The unique family of spherically symmetric asymptotically flat solutions of the coupled Einstein-Maxwell equations is the Reissner-Nordström family of two parameter 4-dimensional Lorentzian manifolds $(\mathcal{N}_{M,e}, g_{M,e})$, where the parameters M and e are called mass and (electromagnetic) charge, respectively. The Reissner-Nordström metric was first written in local coordinates (t, r, θ, φ) in 1916 [45] and 1918 [41]. In these coordinates,

$$g = g_{M,e} = -D\,dt^2 + \frac{1}{D}\,dr^2 + r^2 g_{\mathbb{S}^2},$$

where $D = D(r) = 1 - \frac{2M}{r} + \frac{e^2}{r^2}$ and $g_{\mathbb{S}^2}$ is the standard metric on $\mathbb{S}^2$. The extreme case corresponds to M = |e|. From now on we restrict our attention to the extreme case.

Clearly, SO(3) acts by isometry on these spacetimes. We will refer to the SO(3)-orbits as (symmetry) spheres. The coordinate r is defined intrinsically such that the area of the spheres of symmetry is $4\pi r^2$ (and thus should be thought of as a purely geometric function of the spacetime). In view of the coordinate singularity at r = M (note that M is the double root of D in the extreme case), we introduce the so-called tortoise coordinate $r^*$ given by $\frac{\partial r^*(r)}{\partial r} = \frac{1}{D}$. Note that in the extreme case $r^*$ is inverse linear (instead of logarithmic in the non-extreme case). The metric with respect to the system $(t, r^*)$ then becomes $g = -D\,dt^2 + D\,(dr^*)^2 + r^2 g_{\mathbb{S}^2}$. The way to extend the metric beyond r = M is by considering the ingoing Eddington-Finkelstein coordinates (v, r), where $v = t + r^*$. In these coordinates the metric is given by

$$g = -D\,dv^2 + 2\,dv\,dr + r^2 g_{\mathbb{S}^2}, \tag{2.1}$$

where $D = D(r) = \left(1 - \frac{M}{r}\right)^2$ and $g_{\mathbb{S}^2}$ is the standard metric on $\mathbb{S}^2$. The radial curves v = c, where c is a constant, are the ingoing radial null geodesics. This means that the null coordinate vector field $\partial_r$ differentiates with respect to r on these null hypersurfaces. This geometric property of $\partial_r$ makes this vector field very useful for understanding the behaviour of solutions to the wave equation close to H+. The Penrose diagram of the spacetime $\mathcal{N}$ covered by this coordinate system for $v \in \mathbb{R}$, $r \in \mathbb{R}^+$ is

[Figure: Penrose diagram of $\mathcal{N}$]


We will refer to the hypersurface r = M as the event horizon (and denote it by H+) and the region r ≤ M as the black hole region. The region where M < r corresponds to the domain of outer communications. In view of the existence of the timelike curvature singularity {r = 0} 'inside' the black hole (thought of here as a singular boundary of the black hole region) and its unstable behaviour, we are only interested in studying the wave equation in the domain of outer communications including the horizon H+. Note that the study of the horizon is of fundamental importance since any attempt to prove the nonlinear stability or instability of the exterior of black holes must come to terms with the structure of the horizon.

We consider a connected asymptotically flat SO(3)-invariant spacelike hypersurface $\Sigma_0$ in $\mathcal{N}$ terminating at $i^0$ with boundary such that $\partial\Sigma_0 = \Sigma_0 \cap \mathcal{H}^+$. We also assume that if n is its future directed unit normal and $T = \partial_v$, then there exist positive constants $C_1 < C_2$ such that

$$C_1 < -g(n, n) < C_2, \qquad C_1 < -g(n, T) < C_2.$$

Let $\mathcal{M}$ be the domain of dependence of $\Sigma_0$. Then, using the coordinate system (v, r) we have

$$\mathcal{M} = \left((-\infty, +\infty) \times [M, +\infty) \times \mathbb{S}^2\right) \cap J^+(\Sigma_0), \tag{2.2}$$

where $J^+(\Sigma_0)$ is the causal future of $\Sigma_0$ (which by our convention includes $\Sigma_0$). Note that $\mathcal{M}$ is a manifold with stratified (piecewise smooth) boundary $\partial\mathcal{M} = (\mathcal{H}^+ \cap \mathcal{M}) \cup \Sigma_0$.

Another coordinate system that partially covers $\tilde{\mathcal{M}}$ is the null system (u, v), where $u = t - r^*$, $v = t + r^*$ and with respect to which the metric is $g = -D\,du\,dv + r^2 g_{\mathbb{S}^2}$. The hypersurfaces v = c and u = c are null and thus this system is useful for applying the method of characteristics.

2.1. The foliations $\Sigma_\tau$ and $\tilde{\Sigma}_\tau$. We consider the foliation $\Sigma_\tau = \varphi_\tau(\Sigma_0)$, where $\varphi_\tau$ is the flow of $T = \partial_v$. Of course, since T is Killing, the hypersurfaces $\Sigma_\tau$ are all isometric to $\Sigma_0$.


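We define the region $\mathcal{R}(0, \tau) = \cup_{0 \le \tilde{\tau} \le \tau} \Sigma_{\tilde{\tau}}$. On $\Sigma_\tau$ we have an induced Lie propagated coordinate system (ρ, ω) such that ρ ∈ [M, +∞) and $\omega \in \mathbb{S}^2$. These coordinates are defined such that if $Q \in \Sigma_\tau$ and $Q = (v_Q, r_Q, \omega_Q)$, then $\rho = r_Q$ and $\omega = \omega_Q$. Our assumption for the normal $n_{\Sigma_0}$ (and thus for $n_{\Sigma_\tau}$) implies that there exists a bounded function $g_1$ such that

$$\partial_\rho = g_1 \partial_v + \partial_r. \tag{2.3}$$

This defines a coordinate system since $[\partial_\rho, \partial_\theta] = [\partial_\rho, \partial_\phi] = 0$. Moreover, the volume form of $\Sigma_\tau$ is

$$dg_{\Sigma_\tau} = V \rho^2\, d\rho\, d\omega, \tag{2.4}$$

where V is a positive bounded function.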

In the companion paper [3] we shall make use of another foliation denoted by $\tilde{\Sigma}_\tau$ whose leaves terminate at I+ and thus "follow" the waves to the future.

One can similarly define an induced coordinate system on $\tilde{\Sigma}_\tau$. Note that only local elliptic estimates are to be applied on $\tilde{\Sigma}_\tau$.

2.2. The photon sphere and trapping effect. One can easily see that there exist orbiting future directed null geodesics, i.e. null geodesics that neither cross the horizon H+ nor meet null infinity I+. A class of such geodesics γ are of the form

$$\gamma : \mathbb{R} \to \mathcal{M}, \qquad \tau \mapsto \gamma(\tau) = \left(t(\tau), Q, \frac{\pi}{2}, \phi(\tau)\right).$$

The conditions $\nabla_{\dot{\gamma}} \dot{\gamma} = 0$ and $g(\dot{\gamma}, \dot{\gamma}) = 0$ imply that Q = 2M, which is the radius of the so-called photon sphere. The functions t, φ depend linearly on τ.
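The value Q = 2M can be cross-checked numerically. The sketch below relies on a standard fact that is not derived in the text: for a metric of the form (1.1), circular null geodesics sit at critical points of V(r) = D(r)/r². The function names and the choice M = 1 are assumptions made for illustration only.

```python
M = 1.0  # mass parameter; an arbitrary value chosen for the check

def D(r):
    """Extreme Reissner-Nordstrom metric function D(r) = (1 - M/r)^2."""
    return (1.0 - M / r) ** 2

def dD(r):
    """Derivative D'(r) = 2 (1 - M/r) M / r^2."""
    return 2.0 * (1.0 - M / r) * M / r**2

def dV(r):
    """Derivative of the effective potential V(r) = D(r) / r^2."""
    return dD(r) / r**2 - 2.0 * D(r) / r**3

def bisect(f, lo, hi, steps=200):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Search strictly outside the horizon r = M, where dV also vanishes
r_photon = bisect(dV, 1.5 * M, 3.0 * M)
print(r_photon)
```

The bracket [1.5M, 3M] is chosen to exclude the horizon r = M, at which D and its derivative both vanish in the extreme case.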

In fact, from any point there is a codimension-one subset of future directed null directions whose corresponding geodesics approach the photon sphere to the future. The existence of this “sphere” (which is in fact a 3-dimensional timelike hypersurface) implies that the energy of some photons is not scattered to null infinity or the black hole region. This is the so-called trapping effect. As we shall see, this effect forces us to require higher regularity for the waves in order to achieve decay results.

2.3. The redshift effect and surface gravity of $\mathcal{H}^+$. The Killing vector field $\partial_v$ becomes null on the event horizon $\mathcal{H}^+$ and is also tangent to it, and thus $\mathcal{H}^+$ is a Killing horizon. In general, if there exists a Killing vector field $V$ which is normal to a null hypersurface then

\[ \nabla_V V = \kappa V \tag{2.5} \]

on the hypersurface. Since $V$ is Killing, the function $\kappa$ is constant along the integral curves of $V$. This can be seen by taking the pushforward of (2.5) via the flow of $V$ and noting that, since the flow of $V$ consists of isometries, the pushforward of the Levi-Civita connection is the same connection. The quantity $\kappa$ is called the surface gravity² of the null hypersurface.³ In the Reissner-Nordström family, the surface gravities of the two horizons $\{r=r_-\}$ and $\{r=r_+\}$ are given by

\[ \kappa_\pm = \frac{r_\pm - r_\mp}{2r_\pm^2} = \frac{1}{2}\left.\frac{dD(r)}{dr}\right|_{r=r_\pm}, \tag{2.6} \]

where $D$ is given in Sect. 2 (and $r_+$, $r_-$ are the roots of $D$). Note that in extreme Reissner-Nordström spacetime we have $D(r)=\left(1-\frac{M}{r}\right)^2$ and so $r_+=r_-=M$, which implies that the surface gravity vanishes. A horizon whose surface gravity vanishes is called degenerate. Physically, the surface gravity is related to the so-called redshift effect that is observed along (and close to) $\mathcal{H}^+$. According to this effect, the wavelength of radiation close to $\mathcal{H}^+$ becomes longer as $v$ increases and thus the radiation gets less energetic. This effect has a long history in the heuristic analysis of waves but only in the last decade has it been used mathematically. For example, Price’s law (see [19]) and the stability and instability of Cauchy horizons in an appropriate setting (see [17]) were proved using, in particular, this effect. (Note that for the latter, one also needs to use the dual blueshift effect which is present in the interior of black holes.)

² This plays a significant role in black hole “thermodynamics” (see also [54] and [43]).
³ Note that in Riemannian geometry, any Killing vector field that satisfies (2.5) must have $\kappa=0$.
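As a quick numerical illustration of (2.6), the following sketch evaluates $\kappa_\pm$ for a Reissner-Nordström black hole. It assumes the standard form $D(r)=1-\frac{2M}{r}+\frac{e^2}{r^2}$ with charge parameter $e$ (this section defers the definition of $D$ to Sect. 2, so that form, the symbol $e$ and all function names are assumptions of the example). It compares the two expressions in (2.6) and shows that both surface gravities vanish when $|e|=M$.

```python
import math


def horizon_radii(M, e):
    """Roots r_- <= r_+ of D(r) = 1 - 2M/r + e^2/r^2 (assumed standard RN form)."""
    if not 0 < abs(e) <= M:
        raise ValueError("need 0 < |e| <= M for two positive horizon radii")
    disc = math.sqrt(M * M - e * e)
    return M - disc, M + disc


def surface_gravities(M, e):
    """Return (kappa_+, kappa_-) from kappa_pm = (r_pm - r_mp) / (2 r_pm^2), cf. (2.6)."""
    r_minus, r_plus = horizon_radii(M, e)
    kappa_plus = (r_plus - r_minus) / (2 * r_plus ** 2)
    kappa_minus = (r_minus - r_plus) / (2 * r_minus ** 2)
    return kappa_plus, kappa_minus


def half_dD_dr(M, e, r):
    """(1/2) dD/dr for the assumed D(r); agrees with kappa_pm at r = r_pm."""
    return M / r ** 2 - e * e / r ** 3


if __name__ == "__main__":
    print(surface_gravities(1.0, 0.6))  # subextreme: kappa_+ > 0 > kappa_-
    print(surface_gravities(1.0, 1.0))  # extreme: both surface gravities vanish
```

In the subextreme example the two horizons have surface gravities of opposite sign, while in the extreme case the roots coincide and the numerator $r_\pm-r_\mp$ is zero.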

3. The Cauchy Problem for the Wave Equation

We consider solutions of the Cauchy problem of the wave equation (1.2) with initial data

\[ \psi|_{\Sigma_0}=\psi_0\in H^{k}_{loc}(\Sigma_0),\qquad n_{\Sigma_0}\psi|_{\Sigma_0}=\psi_1\in H^{k-1}_{loc}(\Sigma_0), \tag{3.1} \]

where the hypersurface $\Sigma_0$ is as defined in Sect. 2 and $n_{\Sigma_0}$ denotes the future unit normal of $\Sigma_0$. In view of the global hyperbolicity of $\mathcal{M}$, there exists a unique solution to the above equation. Moreover, as long as $k\ge 1$, we have that for any spacelike hypersurface $S$,

\[ \psi|_{S}\in H^{k}_{loc}(S),\qquad n_S\psi|_{S}\in H^{k-1}_{loc}(S). \]

In this paper we will be interested in the case where $k\ge 2$. Moreover, we assume that

\[ \lim_{x\to i^0} r\psi^2(x)=0. \tag{3.2} \]

For simplicity, from now on, when we say “for all solutions ψ of the wave equation” we will assume that ψ satisfies the above conditions. Note that for obtaining sharp decay results we will have to consider even higher regularity for ψ.

4. The Main Theorems

We consider the Cauchy problem for the wave equation (see Sect. 3) on the extreme Reissner-Nordström spacetime. This spacetime is partially covered by the coordinate systems $(t,r)$, $(t,r^*)$, $(v,r)$ and $(u,v)$ described in Sect. 2. Recall that $M$ is a positive parameter and $D=D(r)=\left(1-\frac{M}{r}\right)^2$. Recall also that the horizon $\mathcal{H}^+$ is located at $\{r=M\}$ and the photon sphere at $\{r=2M\}$. We denote $T=\partial_v=\partial_t$, where $\partial_v$ corresponds to the system $(v,r)$ and $\partial_t$ corresponds to $(t,r)$. From now on, $\partial_v$, $\partial_r$ are the coordinate vector fields corresponding to $(v,r)$, unless otherwise stated. Note that $\partial_{r^*}=\partial_v=T$ on $\mathcal{H}^+$ and therefore it is not transversal to $\mathcal{H}^+$, whereas $\partial_r$ is transversal to $\mathcal{H}^+$. The foliation $\Sigma_\tau$ is defined in Sect. 2.1 and the current $J^V$ associated to the vector field $V$ is defined in Sect. 5.1. Note that every time we use such a current we refer to $V$ as a multiplier vector field. For reference, we mention that

\[ J^{T}_\mu[\psi]\,n^\mu_{\Sigma} \sim (T\psi)^2+\left(1-\frac{M}{r}\right)^2(\partial_r\psi)^2+|\slashed{\nabla}\psi|^2, \]

which degenerates on $\mathcal{H}^+$, whereas

\[ J^{n_{\Sigma}}_\mu[\psi]\,n^\mu_{\Sigma} \sim (T\psi)^2+(\partial_r\psi)^2+|\slashed{\nabla}\psi|^2, \]

which does not degenerate on $\mathcal{H}^+$. The Fourier decomposition of ψ on $\mathbb{S}^2(r)$ is discussed in Sect. 7, where we also define what it means for a function to be supported on a given range of angular frequencies. Note that all the integrals are considered with respect to the volume form. The initial data are assumed to be as in Sect. 3 and sufficiently regular such that the right hand sides of the estimates below are all finite. Then we have the following:

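Theorem 1 (Morawetz and X Estimates). Let $\delta>0$. There exists a constant $C_\delta>0$ which depends on $M$, $\delta$ and $\Sigma_0$ such that for all solutions ψ of the wave equation the following estimates hold:

(1) Non-Degenerate Zeroth Order Morawetz Estimate:

\[ \int_{\mathcal{R}(0,\tau)}\frac{1}{r^{3+\delta}}\psi^2 \le C_\delta\int_{\Sigma_0}J^{n_{\Sigma_0}}_\mu[\psi]\,n^\mu_{\Sigma_0}. \]

(2) X Estimate with Degeneracy at $\mathcal{H}^+$ and Photon Sphere:

\[ \int_{\mathcal{R}(0,\tau)}\left(\frac{(r-2M)^2\cdot\sqrt{D}}{r^{3+\delta}}\left[(\partial_v\psi)^2+|\slashed{\nabla}\psi|^2+D^2(\partial_r\psi)^2\right]+\chi_{\left[\frac{3M}{2},\frac{5M}{2}\right]}\cdot(\partial_{r^*}\psi)^2\right) \le C_\delta\int_{\Sigma_0}J^{T}_\mu[\psi]\,n^\mu_{\Sigma_0}, \]

where $\chi_{\left[\frac{3M}{2},\frac{5M}{2}\right]}(r)$ is the indicator function of the interval $\left[\frac{3M}{2},\frac{5M}{2}\right]$.

(3) X Estimate with Degeneracy at $\mathcal{H}^+$:

\[ \int_{\mathcal{R}(0,\tau)}\left(\frac{\sqrt{D}}{r^{1+\delta}}(\partial_v\psi)^2+\frac{D^2\sqrt{D}}{r^{1+\delta}}(\partial_r\psi)^2+\frac{\sqrt{D}}{r}|\slashed{\nabla}\psi|^2\right) \le C_\delta\int_{\Sigma_0}\left(J^{T}_\mu[\psi]\,n^\mu_{\Sigma_0}+J^{T}_\mu[T\psi]\,n^\mu_{\Sigma_0}\right). \]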

For proving statement (1), we first derive in Sect. 9.7 an estimate which degenerates on $\mathcal{H}^+$. Note that the estimate of Sect. 9.7 requires applying only the Killing vector field $T$ as a multiplier. Later, in Sect. 14, we construct a timelike multiplier and we use an appropriate version of Hardy’s inequality to eliminate this degeneracy (see Theorem 3). Note that estimate (2) degenerates on the photon sphere with respect to all derivatives except precisely $\partial_{r^*}$. Note moreover that the same estimate degenerates on $\mathcal{H}^+$ with respect to all derivatives. As we shall see, unlike the subextreme case, the degeneracy on $\mathcal{H}^+$ with respect to the transversal derivative $\partial_r$ cannot be eliminated even if we apply a timelike multiplier (see also Theorem 2). In Sect. 9.6 we commute with the vector field $T$ to retrieve the derivatives tangential to the photon sphere, thus obtaining statement (3).

Theorem 2 (Uniform Boundedness of Non-Degenerate Energy). There exists $r_0$ such that $M<r_0<2M$ and a constant $C>0$ which depends on $M$ and $\Sigma_0$ such that if $\mathcal{A}=\mathcal{R}(0,\tau)\cap\{M\le r\le r_0\}$ then for all solutions ψ of the wave equation we have

\[ \int_{\mathcal{A}}\left(\sqrt{D}(\partial_r\psi)^2+(\partial_v\psi)^2+|\slashed{\nabla}\psi|^2\right)+\int_{\Sigma_\tau}J^{n_{\Sigma_\tau}}_\mu[\psi]\,n^\mu_{\Sigma_\tau} \le C\int_{\Sigma_0}J^{n_{\Sigma_0}}_\mu[\psi]\,n^\mu_{\Sigma_0}. \]

The above theorem is proved in Sect. 11 using a novel redshift current constructed in Sect. 10 and a new version of Hardy’s inequality (see Sect. 6). Note that this theorem shows that we cannot eliminate the degeneracy of $\partial_r$ in spacetime neighbourhoods of $\mathcal{H}^+$ even if we apply timelike multipliers. To eliminate this degeneracy we need to reveal a new feature of degenerate horizons captured in the next theorem.

Theorem 3 (Trapping Effect on the Event Horizon $\mathcal{H}^+$). Let $\mathcal{A}$ be the spacetime region defined in Theorem 2 and $\delta>0$. Then there exists a constant $C>0$ (and $C_\delta>0$) which depends on $M$ and $\Sigma_0$ (and $\delta$) such that for all solutions ψ with vanishing spherical mean (i.e. for all ψ supported on the angular frequencies $l\ge 1$), the following hold:

(1) Sharp Second Order $L^2$ Estimates:

\[ \begin{aligned} &\int_{\Sigma_\tau\cap\mathcal{A}}\left((\partial_v\partial_r\psi)^2+(\partial_r\partial_r\psi)^2+|\slashed{\nabla}\partial_r\psi|^2\right)+\int_{\mathcal{H}^+}\left((\partial_v\partial_r\psi)^2+\chi_1|\slashed{\nabla}\partial_r\psi|^2\right)\\ &\qquad+\int_{\mathcal{A}}\left((\partial_v\partial_r\psi)^2+\sqrt{D}(\partial_r\partial_r\psi)^2+|\slashed{\nabla}\partial_r\psi|^2\right)\\ &\quad\le C\int_{\Sigma_0}J^{n_{\Sigma_0}}_\mu[\psi]\,n^\mu_{\Sigma_0}+C\int_{\Sigma_0}J^{n_{\Sigma_0}}_\mu[T\psi]\,n^\mu_{\Sigma_0}+C\int_{\Sigma_0\cap\mathcal{A}}J^{n_{\Sigma_0}}_\mu[\partial_r\psi]\,n^\mu_{\Sigma_0}, \end{aligned} \]

where $\chi_1=0$ if ψ is supported on $l=1$ and $\chi_1=1$ if ψ is supported on $l\ge 2$.

(2) Local Integrated Energy Decay:

\[ \int_{\mathcal{R}(0,\tau)}\frac{1}{r^{1+\delta}}J^{n_{\Sigma_\tau}}_\mu[\psi]\,n^\mu_{\Sigma_\tau} \le C_\delta\int_{\Sigma_0}J^{n_{\Sigma_0}}_\mu[\psi]\,n^\mu_{\Sigma_0}+C_\delta\int_{\Sigma_0}J^{n_{\Sigma_0}}_\mu[T\psi]\,n^\mu_{\Sigma_0}+C_\delta\int_{\Sigma_0\cap\mathcal{A}}J^{n_{\Sigma_0}}_\mu[\partial_r\psi]\,n^\mu_{\Sigma_0}. \]

Statement (1) of Theorem 3 is proved in Sect. 12, where we construct a new current and derive appropriate Hardy inequalities along $\mathcal{H}^+$. Note that the restriction on frequencies $l\ge 1$ is required in view of a new low-frequency phenomenon of degenerate horizons (see also Theorem 5). In particular, we note that in [3] we show that there is no constant $C$ such that statement (1) holds for all solutions ψ of the wave equation. Therefore, the assumption on the frequency range is sharp. In statement (2) we have eliminated the degeneracy of $\partial_r$ in neighbourhoods of $\mathcal{H}^+$ at the expense, however, of commuting the wave equation with $\partial_r$ and thus requiring higher regularity for ψ. This allows us to conclude that the event horizon $\mathcal{H}^+$ exhibits trapping.

Theorem 4 (Pointwise Boundedness). There exists a constant $C$ which depends on $M$ and $\Sigma_0$ such that for all solutions ψ of the wave equation we have

\[ |\psi|\le C\sqrt{\tilde{E}_4} \]

everywhere in $\mathcal{R}$, where

\[ \tilde{E}_4=\int_{\Sigma_0}J^{n_{\Sigma_0}}_\mu[\psi]\,n^\mu_{\Sigma_0}+C\int_{\Sigma_0}J^{n_{\Sigma_0}}_\mu[n_{\Sigma_0}\psi]\,n^\mu_{\Sigma_0}. \]

This theorem is proved in Sect. 14 using Theorem 2 and Sobolev inequalities.

Theorem 5 (Non-Decay). For generic initial data which give rise to solutions ψ of the wave equation, the quantity $\psi^2+(\partial_r\psi)^2$ does not decay along $\mathcal{H}^+$.

This theorem arises from a new low-frequency phenomenon of degenerate horizons. In particular, in Sect. 12.1 we show that an appropriate expression which corresponds to the spherical mean of ψ is conserved along $\mathcal{H}^+$ under the evolution. This low-frequency “instability” will be extensively investigated in the companion papers [3,4].

5. The Vector Field Method

For understanding the evolution of waves we will use the so-called vector field method. This is a geometric and robust method and involves mainly $L^2$ estimates. The main idea is to construct appropriate (0,1) currents and use Stokes’ theorem (see Appendix B) in appropriate regions. For a nice recent exposition see [1]. We briefly recall here the method to set notation. Given a (0,1) current $P_\mu$ we have the continuity equation

\[ \int_{\Sigma_0}P_\mu n^\mu_{\Sigma_0}=\int_{\Sigma_\tau}P_\mu n^\mu_{\Sigma_\tau}+\int_{\mathcal{H}^+(0,\tau)}P_\mu n^\mu_{\mathcal{H}^+}+\int_{\mathcal{R}(0,\tau)}\nabla^\mu P_\mu, \tag{5.1} \]

where the unit normals $n_{\Sigma_\tau}$ are future directed. All the integrals are to be considered with respect to the induced volume form and thus we omit writing the measure. Our choice is $n_{\mathcal{H}^+}=\partial_v=T$ (and thus the volume element on $\mathcal{H}^+$ is chosen so that (5.1) holds).

5.1. The compatible currents J, K and the current E. We usually consider currents $P_\mu$ that depend on the geometry of $(\mathcal{M},g)$ and are such that both $P_\mu$ and $\nabla^\mu P_\mu$ depend only on the 1-jet of ψ. This can be achieved by using the wave equation to make all second order derivatives disappear. There is a general method for producing such currents using the energy momentum tensor $\mathbf{T}$. Indeed, the Lagrangian structure of the wave equation gives us the following energy momentum tensor:

\[ \mathbf{T}_{\mu\nu}[\psi]=\partial_\mu\psi\,\partial_\nu\psi-\frac{1}{2}g_{\mu\nu}\,\partial^a\psi\,\partial_a\psi, \tag{5.2} \]

which is a symmetric divergence free (0,2) tensor. We will in fact consider this tensor for general functions $\psi:\mathcal{M}\to\mathbb{R}$, in which case we have the identity

\[ \mathrm{Div}\,\mathbf{T}[\psi]=(\Box_g\psi)\,d\psi. \tag{5.3} \]

Since $\mathbf{T}$ is a (0,2) tensor we need to contract it with vector fields of $\mathcal{M}$. It is here where the geometry of $\mathcal{M}$ makes its appearance. Given a vector field $V$ we define the $J^V$ current by

\[ J^V_\mu[\psi]=\mathbf{T}_{\mu\nu}[\psi]V^\nu. \tag{5.4} \]

We say that we use the vector field $V$ as a multiplier⁴ if we apply (5.1) for the current $J^V_\mu$. The divergence of this current is $\mathrm{Div}(J)=\mathrm{Div}(\mathbf{T})V+\mathbf{T}(\nabla V)$, where $(\nabla V)^{ij}=(g^{ki}\nabla_kV)^j=(\nabla^iV)^j$. If ψ is a solution of the wave equation then $\mathrm{Div}(\mathbf{T})=0$ and, therefore, $\nabla^\mu J^V_\mu$ is an expression of the 1-jet of ψ. Given a vector field $V$ the scalar current $K^V$ is defined by

\[ K^V[\psi]=\mathbf{T}[\psi](\nabla V)=\mathbf{T}_{ij}[\psi](\nabla^iV)^j. \tag{5.5} \]

Note that from the symmetry of $\mathbf{T}$ we have $K^V[\psi]=\mathbf{T}_{\mu\nu}[\psi]\,\pi^{\mu\nu}_V$, where $\pi^{\mu\nu}_V=(\mathcal{L}_Vg)^{\mu\nu}$ is the deformation tensor of $V$. Clearly if ψ satisfies the wave equation then $K^V[\psi]=\nabla^\mu J^V_\mu[\psi]$. Thus if we use Killing vector fields as multipliers then the divergence vanishes and so we obtain a conservation law. This is partly the content of a deep theorem of Noether.⁵ In general we define the scalar current $E^V$ by

\[ E^V[\psi]=\mathrm{Div}(\mathbf{T})V=(\Box_g\psi)\,d\psi(V)=(\Box_g\psi)(V\psi). \tag{5.6} \]

⁴ The name comes from the fact that the tensor $\mathbf{T}$ is multiplied by $V$.

5.2. The hyperbolicity of the wave equation. The hyperbolicity of the wave equation is captured by the following proposition.

Proposition 5.2.1. Let $V_1$, $V_2$ be two future directed timelike vectors. Then the quadratic expression $\mathbf{T}(V_1,V_2)$ is positive definite in $d\psi$. By continuity, if one of these vectors is null then $\mathbf{T}(V_1,V_2)$ is non-negative definite in $d\psi$.

Proof. Consider a point $p\in\mathcal{M}$ and normal coordinates around this point, and assume without loss of generality that $V_2=(1,0,0,\dots,0)$. Then the proposition is an application of the Cauchy-Schwarz inequality.

An important application of the vector field method and the hyperbolicity of the wave equation is the domain of dependence property. As we shall see, the exact dependence of $\mathbf{T}$ on the derivatives of ψ will be crucial later. For a general computation see Appendix A.2.
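Proposition 5.2.1 can be illustrated numerically at a single point. The sketch below works in normal coordinates, where the metric at the point is assumed to be the Minkowski metric diag(−1, 1, 1, 1), evaluates $\mathbf{T}(V_1,V_2)$ from (5.2) for random future directed timelike vectors and random $d\psi$, and reports the smallest value found. The function names, the sampling scheme and the `margin` parameter are choices made for this example only; a random test is of course not a proof.

```python
import random

# Minkowski metric diag(-1, 1, 1, 1); in normal coordinates g takes this form at a point.
ETA = (-1.0, 1.0, 1.0, 1.0)


def metric(a, b):
    """g(a, b) for two vectors a, b at the point."""
    return sum(s * x * y for s, x, y in zip(ETA, a, b))


def energy_momentum(dpsi, v1, v2):
    """T(V1, V2) = (V1 psi)(V2 psi) - (1/2) g(V1, V2) g^{-1}(dpsi, dpsi), cf. (5.2)."""
    v1_psi = sum(x * p for x, p in zip(v1, dpsi))
    v2_psi = sum(x * p for x, p in zip(v2, dpsi))
    grad_sq = sum(s * p * p for s, p in zip(ETA, dpsi))  # the inverse metric has the same diagonal
    return v1_psi * v2_psi - 0.5 * metric(v1, v2) * grad_sq


def random_future_timelike(rng, margin=0.1):
    """Future directed timelike vector: time component exceeds the spatial norm by at least margin."""
    spatial = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    norm = sum(x * x for x in spatial) ** 0.5
    return [norm + margin + rng.random()] + spatial


def min_energy(samples=1000, seed=0):
    """Smallest value of T(V1, V2) over random timelike pairs and random dpsi."""
    rng = random.Random(seed)
    values = []
    for _ in range(samples):
        v1, v2 = random_future_timelike(rng), random_future_timelike(rng)
        dpsi = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        values.append(energy_momentum(dpsi, v1, v2))
    return min(values)
```

With $V_1=V_2=(1,0,0,0)$ and $d\psi=(1,0,0,0)$ the function returns $\tfrac12$. With the null vector $V_1=(1,1,0,0)$, $V_2=(1,0,0,0)$ and $d\psi=(1,-1,0,0)$ it returns $0$, which shows why only non-negativity survives in the null case.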

6. Hardy Inequalities

In this section we establish three Hardy inequalities which, as we shall see, will be crucial for obtaining sharp estimates. They do not require ψ to satisfy the wave equation and so we only assume that ψ satisfies the regularity assumptions described in Sect. 3 and (3.2). From now on, $\Sigma_\tau$ is the foliation introduced in Sect. 2.1 (however, these inequalities hold also for the foliation $\tilde{\Sigma}_\tau$).

Proposition 6.0.2 (First Hardy Inequality). For all functions ψ which satisfy the regularity assumptions of Sect. 3 we have

\[ \int_{\Sigma_\tau}\frac{1}{r^2}\psi^2\le C\int_{\Sigma_\tau}D\left[(\partial_v\psi)^2+(\partial_r\psi)^2\right], \]

where the constant $C$ depends only on $M$ and $\Sigma_0$.

⁵ According to this theorem, any continuous family of isometries gives rise to a conservation law.

Proof. Consider the induced coordinate system $(\rho,\omega)$ introduced in Sect. 2.1. We use the 1-dimensional identity

\[ \int_M^{+\infty}(\partial_\rho h)\psi^2\,d\rho=\left[h\psi^2\right]_{r=M}^{+\infty}-2\int_M^{+\infty}h\psi(\partial_\rho\psi)\,d\rho \tag{6.1} \]

with $h=r-M$. In view of the assumptions on ψ, the boundary terms vanish. Thus, Cauchy-Schwarz gives

\[ \int_{\{r\ge M\}}\psi^2\,d\rho=-2\int_{\{r\ge M\}}(r-M)\psi(\partial_\rho\psi)\,d\rho\le 2\left(\int_{\{r\ge M\}}\psi^2\,d\rho\right)^{\frac12}\left(\int_{\{r\ge M\}}(r-M)^2(\partial_\rho\psi)^2\,d\rho\right)^{\frac12}. \]

Therefore,

\[ \int_{\{r\ge M\}}\psi^2\,d\rho\le 4\int_{\{r\ge M\}}(r-M)^2(\partial_\rho\psi)^2\,d\rho. \]

Since $(r-M)^2=D\rho^2$ along $\Sigma_\tau$, integrating this inequality over $\mathbb{S}^2$ gives us

\[ \int_{\mathbb{S}^2}\int_{\{r\ge M\}}\frac{1}{\rho^2}\psi^2\,\rho^2\,d\rho\,d\omega\le 4\int_{\mathbb{S}^2}\int_{\{r\ge M\}}D(\partial_\rho\psi)^2\,\rho^2\,d\rho\,d\omega. \]

The result follows from the fact that each $\Sigma_\tau$ is diffeomorphic to $\mathbb{S}^2\times[M,+\infty)$, from (2.3) and the boundedness of the factor $V$ in the volume form (2.4) of $\Sigma_\tau$.

The importance of the above inequality lies in the weights. The weight of $(\partial_\rho\psi)^2$ vanishes to second order on $\mathcal{H}^+$ but does not degenerate at infinity,⁶ whereas the weight of ψ degenerates at infinity but not at $r=M$. Similarly, one might derive estimates for the non-extreme case.⁷ We also mention that the right hand side is bounded by the (conserved) flux of $T$ through $\Sigma_\tau$ (see Sect. 8).

Proposition 6.0.3 (Second Hardy Inequality). Let $r_0\in(M,2M)$. Then for all functions ψ which satisfy the regularity assumptions of Sect. 3 and any positive number $\epsilon$ we have

\[ \int_{\mathcal{H}^+\cap\Sigma_\tau}\psi^2\le\epsilon\int_{\Sigma_\tau\cap\{r\le r_0\}}\left[(\partial_v\psi)^2+(\partial_r\psi)^2\right]+C_\epsilon\int_{\Sigma_\tau\cap\{r\le r_0\}}\psi^2, \]

where the constant $C_\epsilon$ depends on $M$, $\epsilon$, $r_0$ and $\Sigma_0$.

Proof. We use as before the 1-dimensional identity (6.1), now on the interval $[M,r_0]$ and with $h=r-r_0$. Then

\[ (r_0-M)\psi^2(M)=\int_M^{r_0}\left(\psi^2+2h\psi(\partial_\rho\psi)\right)d\rho\le\epsilon\int_M^{r_0}(\partial_\rho\psi)^2\,d\rho+\int_M^{r_0}\left(1+\frac{h^2}{\epsilon}\right)\psi^2\,d\rho \]

for any $\epsilon>0$. By integrating over $\mathbb{S}^2$, using (2.3) and noting that the $\rho^2$ factor that appears in the volume form of $\Sigma_\tau\cap\{r\le r_0\}$ is bounded, we obtain the required result.

⁶ Note that if we replace $r-M$ with another function $h$ then we will not be able to make this weight degenerate fast enough without obtaining non-trivial boundary terms.
⁷ These estimates turn out to be stronger since the weight of ψ may diverge at $r=M$ in an integrable manner.
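The one-dimensional inequality $\int\psi^2\,d\rho\le 4\int(r-M)^2(\partial_\rho\psi)^2\,d\rho$ at the heart of the proof of Proposition 6.0.2 is easy to test numerically. The sketch below is only an illustration under assumptions that are not in the text: the profile ψ and its derivative are supplied by hand, the integrals are truncated to a finite interval (so ψ must decay fast enough for the tail to be negligible), and a plain midpoint rule is used.

```python
import math


def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def hardy_sides(psi, dpsi, M=1.0, length=40.0):
    """Return (int psi^2 d rho, 4 int (rho - M)^2 (psi')^2 d rho) over [M, M + length]."""
    lhs = midpoint_integral(lambda rho: psi(rho) ** 2, M, M + length)
    rhs = 4.0 * midpoint_integral(lambda rho: (rho - M) ** 2 * dpsi(rho) ** 2, M, M + length)
    return lhs, rhs


if __name__ == "__main__":
    # Example profile psi = exp(-(rho - M)) with M = 1: the exact values are 1/2 and 1.
    lhs, rhs = hardy_sides(lambda rho: math.exp(-(rho - 1.0)),
                           lambda rho: -math.exp(-(rho - 1.0)))
    print(lhs, rhs)
```

For the exponential profile the left side is $\tfrac12$ and the right side is $1$, so the inequality holds with room to spare. For $\psi=\rho^{-2}$ with $M=1$ the two sides are $\tfrac13$ and $\tfrac{8}{15}$.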

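The previous two inequalities concern the hypersurfaces $\Sigma_\tau$ that cross $\mathcal{H}^+$. The next estimate concerns spacetime neighbourhoods of $\mathcal{H}^+$.

Proposition 6.0.4 (Third Hardy Inequality). Let $r_0$, $r_1$ be such that $M<r_0<r_1$. We define the regions

\[ \mathcal{A}=\mathcal{R}(0,\tau)\cap\{M\le r\le r_0\},\qquad \mathcal{B}=\mathcal{R}(0,\tau)\cap\{r_0\le r\le r_1\}. \]

Then for all functions ψ which satisfy the regularity assumptions of Sect. 3 we have

\[ \int_{\mathcal{A}}\psi^2\le C\int_{\mathcal{B}}\psi^2+C\int_{\mathcal{A}\cup\mathcal{B}}D\left[(\partial_v\psi)^2+(\partial_r\psi)^2\right], \]

where the constant $C$ depends on $M$, $r_0$, $r_1$ and $\Sigma_0$.

Proof. We use again (6.1), now on the interval $[M,r_1]$, with $h$ such that $h=2(\rho-M)$ in $[M,r_0]$ and $h(r_1)=0$. Then

\[ 2\int_M^{r_0}\psi^2\,d\rho\le\int_M^{r_0}\psi^2\,d\rho+\int_{r_0}^{r_1}\left(1-(\partial_\rho h)\right)\psi^2\,d\rho+\int_M^{r_1}h^2(\partial_\rho\psi)^2\,d\rho. \]

Thus, integrating over $\mathbb{S}^2$ we obtain

\[ \int_{\Sigma_{\tilde\tau}\cap\mathcal{A}}\psi^2\le C\int_{\Sigma_{\tilde\tau}\cap(\mathcal{A}\cup\mathcal{B})}D\left[(\partial_v\psi)^2+(\partial_r\psi)^2\right]+C\int_{\Sigma_{\tilde\tau}\cap\mathcal{B}}\psi^2 \]

for all $\tilde\tau\ge 0$, where the constant $C$ depends on $M$, $r_0$, $r_1$ and $\Sigma_0$. Therefore, integrating over $\tilde\tau\in[0,\tau]$ and using the coarea formula

\[ \int_0^\tau\left(\int_{\Sigma_{\tilde\tau}}f\right)d\tilde\tau\sim\int_{\cup_{0\le\tilde\tau\le\tau}\Sigma_{\tilde\tau}}f \]

completes the proof of the proposition.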

7. Elliptic Theory on $\mathbb{S}^2(r)$, $r>0$

In view of the symmetries of the spacetime, it is important to understand the behaviour of functions on the orbits of the action of SO(3), which are isometric to $\mathbb{S}^2(r)$ for $r>0$. Recall that the eigenvalues of the spherical Laplacian $\slashed{\Delta}$ are equal to $-\frac{l(l+1)}{r^2}$, $l\in\mathbb{N}$. The dimension of the eigenspaces $E^l$ is equal to $2l+1$ and the corresponding eigenvectors are denoted by $Y^{m,l}$, $-l\le m\le l$, and called spherical harmonics. We have $L^2(\mathbb{S}^2(r))=\oplus_{l\ge 0}E^l$ and, therefore, any function $\psi\in L^2(\mathbb{S}^2(r))$ can be written as

\[ \psi=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\psi_{m,l}Y^{m,l}. \tag{7.1} \]

The right-hand side converges to ψ in $L^2$ of the spheres and under stronger regularity assumptions the convergence is pointwise. Let us denote by $\psi_l$ the projection of ψ onto $E^l$, i.e.

\[ \psi_l=\sum_{m=-l}^{l}\psi_{m,l}Y^{m,l}. \]

Since each eigenspace $E^l$ is finite dimensional (and hence complete and closed), we have $L^2(\mathbb{S}^2(r))=E^l\oplus(E^l)^{\perp}$. Therefore, ψ can be written uniquely as $\psi=\psi_0+\psi_{\ge 1}$, where $\psi_{\ge 1}\in(E^0)^{\perp}=\oplus_{l\ge 1}E^l$. We also have the following:

Proposition 7.0.5 (Poincaré inequality). If $\psi\in L^2(\mathbb{S}^2(r))$ and $\psi_l=0$ for all $l\le L-1$ for some finite natural number $L$, then we have

\[ \frac{L(L+1)}{r^2}\int_{\mathbb{S}^2(r)}\psi^2\le\int_{\mathbb{S}^2(r)}|\slashed{\nabla}\psi|^2 \]

and equality holds if and only if $\psi_l=0$ for all $l\ne L$.

Proof. We have $\psi=\sum_{l\ge L}\psi_l$. Therefore,

\[ \begin{aligned} \int_{\mathbb{S}^2(r)}|\slashed{\nabla}\psi|^2&=-\int_{\mathbb{S}^2(r)}\psi\cdot\slashed{\Delta}\psi=\int_{\mathbb{S}^2(r)}\left(\sum_{l\ge L}\psi_l\right)\left(\sum_{l'\ge L}\frac{l'(l'+1)}{r^2}\psi_{l'}\right)\\ &=\int_{\mathbb{S}^2(r)}\sum_{l\ge L}\frac{l(l+1)}{r^2}\psi_l^2\ge\frac{L(L+1)}{r^2}\int_{\mathbb{S}^2(r)}\sum_{l\ge L}\psi_l^2=\frac{L(L+1)}{r^2}\int_{\mathbb{S}^2(r)}\psi^2, \end{aligned} \]

where we have used the orthogonality of distinct eigenspaces.
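The Poincaré inequality and its equality case can be checked on a simple family of functions. The sketch below restricts to axisymmetric functions $\psi=aP_1(\cos\theta)+bP_2(\cos\theta)$, which have no $l=0$ component, so that $L=1$ applies; $P_1$, $P_2$ are Legendre polynomials. This restriction, the midpoint quadrature and the function names are assumptions made for the example.

```python
import math


def sphere_integral(f, n=4000):
    """Integral over the unit sphere of an axisymmetric function f(theta), midpoint rule in theta."""
    h = math.pi / n
    total = sum(f((i + 0.5) * h) * math.sin((i + 0.5) * h) for i in range(n))
    return 2.0 * math.pi * h * total


def poincare_sides(a, b, r=1.0, L=1):
    """For psi = a P_1(cos theta) + b P_2(cos theta) on the sphere of radius r, return
    (L(L+1)/r^2 * int psi^2, int |grad psi|^2)."""
    def psi(t):
        c = math.cos(t)
        return a * c + b * 0.5 * (3.0 * c * c - 1.0)

    def dpsi(t):  # derivative with respect to theta
        return -a * math.sin(t) - 3.0 * b * math.cos(t) * math.sin(t)

    int_psi_sq = r * r * sphere_integral(lambda t: psi(t) ** 2)
    # The r^2 in the area element cancels the 1/r^2 in the angular gradient squared.
    int_grad_sq = sphere_integral(lambda t: dpsi(t) ** 2)
    return L * (L + 1) / (r * r) * int_psi_sq, int_grad_sq
```

For $(a,b)=(1,0)$ both sides equal $8\pi/3$, which is the equality case $\psi\in E^1$. For $(a,b)=(1,1)$ the left side is $64\pi/15$ and the right side is $112\pi/15$, so the inequality is strict once an $l=2$ component is present.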

Returning to our 4-dimensional problem, if ψ is sufficiently regular in $\mathcal{R}$ then its restriction at each sphere can be written as in (7.1), where $\psi_{m,l}=\psi_{m,l}(v,r)$ and $Y^{m,l}=Y^{m,l}(\theta,\phi)$. We will not worry about the convergence of the series since we may assume that ψ is sufficiently regular.⁸ First observe that for each summand in (7.1) we have

\[ \Box_g\left(\psi_{m,l}Y^{m,l}\right)=\left(S\psi_{m,l}-\frac{l(l+1)}{r^2}\psi_{m,l}\right)Y^{m,l}, \]

where $S$ is an operator on the quotient $\mathcal{M}/SO(3)$. Therefore, if ψ satisfies the wave equation then, in view of the linear independence of the $Y^{m,l}$’s, the terms $\psi_{m,l}$ satisfy $S\psi_{m,l}=\frac{l(l+1)}{r^2}\psi_{m,l}$ and, therefore, each summand also satisfies the wave equation. This implies that if $\psi_{m,l}=0$ initially then $\psi_{m,l}=0$ everywhere. From now on, we will say that the wave ψ is supported on the angular frequencies $l\ge L$ if $\psi_i=0$, $i=0,\dots,L-1$, initially (and thus everywhere). Similarly, we will also say that ψ is supported on the angular frequency $l=L$ if $\psi\in E^L$. Another important observation is that the operators $\partial_v$ and $\partial_r$ are endomorphisms of $E^l$ for all $l$. Indeed, $\partial_v$ commutes with $\slashed{\Delta}$ and if $\psi\in E^l$, then

\[ \slashed{\Delta}\partial_r\psi=\partial_r\slashed{\Delta}\psi+\frac{2}{r}\slashed{\Delta}\psi=-\frac{l(l+1)}{r^2}\partial_r\psi \]

and thus $\partial_r\psi\in E^l$. Therefore, if ψ is supported on frequencies $l\ge L$ then the same holds for $\partial_r\psi$.

⁸ Indeed, we may work only with functions which are smooth and such that the non-zero terms in (7.1) are finitely many. Then, since all of our results are quantitative and all the constants involved do not depend on ψ, by a density argument we may lower the regularity of ψ requiring only certain norms depending on the initial data of ψ to be finite.

8. The Vector Field T

One can easily see that $\partial_v=\partial_t=\partial_{t^*}$ in the intersection of the corresponding coordinate systems. Here $\partial_t$ corresponds to the coordinate basis vector field of either $(t,r)$ or $(t,r^*)$. Recall that the region $\mathcal{M}$ where we want to understand the behaviour of waves is covered by the system $(v,r,\theta,\phi)$. Therefore we define $T=\partial_v$. It can be easily seen from (2.1) that $T$ is a Killing vector field and is timelike everywhere⁹ except on the horizon, where it is null.

8.1. Uniform boundedness of degenerate energy. Recall that $K^T=\mathbf{T}_{\mu\nu}\pi^{\mu\nu}_T$, where $\pi^{\mu\nu}_T$ is the deformation tensor of $T$. Since $T$ is a Killing vector field, its deformation tensor is zero and so $K^T=0$. Therefore, the divergence identity in the region $\mathcal{R}(0,\tau)$ gives us the following conservation law:

\[ \int_{\Sigma_\tau}J^T_\mu[\psi]\,n^\mu_{\Sigma_\tau}+\int_{\mathcal{H}^+}J^T_\mu[\psi]\,n^\mu_{\mathcal{H}^+}=\int_{\Sigma_0}J^T_\mu[\psi]\,n^\mu_{\Sigma_0}. \tag{8.1} \]

Since $T$ is null on the horizon we have that $J^T_\mu n^\mu_{\mathcal{H}^+}\ge 0$. More precisely, since $n_{\mathcal{H}^+}=T$, from (A.1) (see Appendix A.2) we have $J^T_\mu n^\mu_{\mathcal{H}^+}=(\partial_v\psi)^2$, thus proving the following proposition:

Proposition 8.1.1. For all solutions ψ of the wave equation we have

\[ \int_{\Sigma_\tau}J^T_\mu[\psi]\,n^\mu_{\Sigma_\tau}\le\int_{\Sigma_0}J^T_\mu[\psi]\,n^\mu_{\Sigma_0}. \tag{8.2} \]

We know by Proposition 5.2.1 that $J^T_\mu n^\mu_{\Sigma_\tau}$ is non-negative definite. However, we need to know the exact way $J^T_\mu n^\mu_{\Sigma_\tau}$ depends on $d\psi$. Note that $\xi_{n}$ (see Appendix A.2 for the definition of $\xi$) is strictly positive and uniformly bounded. However, $\xi_T=-\frac{1}{2}g(T,T)=\frac{1}{2}D$, which clearly vanishes (to second order) at the horizon. Therefore, (A.1) implies

\[ J^T_\mu n^\mu_{\Sigma}\sim(\partial_v\psi)^2+D(\partial_r\psi)^2+|\slashed{\nabla}\psi|^2, \]

where the constants in $\sim$ depend on the mass $M$ and $\Sigma_0$. Note that all these relations are invariant under the flow of $T$.

⁹ Note that in the subextreme range $T$ becomes spacelike in the region bounded by the two horizons, which however coincide in the extreme case.

On $\mathcal{H}^+$, the energy estimate (8.2) degenerates with respect to the transversal derivative $\partial_r\psi$. It is exactly this that does not allow us to use estimate (8.2) to obtain the boundedness result for the waves in the whole region $\mathcal{R}$. However, if we restrict our attention to the region where $r\ge r_0>M$, then commuting the wave equation with $T$ and estimate (8.2) in conjunction with elliptic and Sobolev estimates give us the boundedness of ψ in this region. This result is not satisfactory since it provides no information about the behaviour of waves on the horizon and so it is not sufficient for non-linear stability problems.

9. Morawetz and X Estimates

In view of the absence of the redshift along $\mathcal{H}^+$, any proof of uniform boundedness of solutions to the wave equation essentially relies on the dispersion of waves. Therefore, we must derive estimates which capture these dispersive properties. The first result we prove in this direction is that there exists a constant $C$ which depends on $M$ and $\Sigma_0$ such that

\[ \int_{\mathcal{R}}\chi\cdot\left(J^T_\mu[\psi]\,n^\mu_{\Sigma}+\psi^2\right)\le C\int_{\Sigma_0}J^T_\mu[\psi]\,n^\mu_{\Sigma_0}, \]

where the weight $\chi=\chi(r)$ degenerates only at $\mathcal{H}^+$, at $r=2M$ (photon sphere) and at infinity. The degeneracy at the photon sphere is expected in view of trapping. In particular, as we shall see, such an estimate degenerates only with respect to the derivatives tangential to the photon sphere. This degeneracy can be overcome at the expense of commuting with $T$. The degeneracy at $\mathcal{H}^+$ will be overcome in Sect. 12 by imposing necessary conditions on the spherical decomposition of ψ.

9.1. The vector field X. We are looking for a vector field which gives rise to a current whose divergence is non-negative (upon integration on the spheres of symmetry). We will be working with vector fields of the form $X=f(r^*)\,\partial_{r^*}$ (for the coordinate system $(t,r^*)$ see Sect. 2).

9.1.1. The spacetime term $K^X$. We compute

\[ K^X=\left(\frac{f'}{2D}+\frac{f}{r}\right)(\partial_t\psi)^2+\left(\frac{f'}{2D}-\frac{f}{r}\right)(\partial_{r^*}\psi)^2+\left(-\frac{f'+2fH}{2}\right)|\slashed{\nabla}\psi|^2, \]

where $f'=\frac{df(r^*)}{dr^*}$ and $H=\frac{1}{2}\frac{dD(r)}{dr}$. Note that all the derivatives in this section will be considered with respect to $r^*$ unless otherwise stated. Unfortunately, the trapping obstruction does not allow us to obtain a positive definite current $K^X$ so easily. Indeed, if we assume that all these coefficients are positive then we have

\[ f'>0,\quad f<0,\quad -\frac{fH}{D}>\frac{f'}{2D}>-\frac{f}{r}\ \Rightarrow\ \left(-\frac{H}{D}+\frac{1}{r}\right)f>0\ \Rightarrow\ \frac{H}{D}-\frac{1}{r}>0. \]

However, the quantity $\frac{H}{D}-\frac{1}{r}$ changes sign exactly at the radius $Q$ of the photon sphere. Therefore, there is no way to make all the above coefficients positive. For simplicity, let us define

\[ \frac{P(r)}{r^2}=-H\cdot r+D=\frac{(r-M)(r-2M)}{r^2}. \]
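The two algebraic facts used here, namely the identity $-H\cdot r+D=\frac{(r-M)(r-2M)}{r^2}$ and the sign change of $\frac{H}{D}-\frac{1}{r}$ at $r=2M$, can be verified directly for the extreme metric function $D=\left(1-\frac{M}{r}\right)^2$. The following is a small sketch for that purpose; the function names are introduced only for this example.

```python
def D(r, M=1.0):
    """Extreme Reissner-Nordstrom metric function D(r) = (1 - M/r)^2."""
    return (1.0 - M / r) ** 2


def H(r, M=1.0):
    """H = (1/2) dD/dr = (M / r^2) (1 - M/r)."""
    return (M / r ** 2) * (1.0 - M / r)


def P_over_r2(r, M=1.0):
    """Closed form (r - M)(r - 2M) / r^2 of P(r) / r^2."""
    return (r - M) * (r - 2.0 * M) / r ** 2


def trapping_quantity(r, M=1.0):
    """H/D - 1/r, which equals (2M - r) / (r (r - M)) for r > M."""
    return H(r, M) / D(r, M) - 1.0 / r


if __name__ == "__main__":
    for r in (1.5, 2.0, 3.0, 10.0):
        print(r, D(r) - H(r) * r, P_over_r2(r), trapping_quantity(r))
```

With $M=1$ the quantity $\frac{H}{D}-\frac{1}{r}$ is positive at $r=1.5$, vanishes at $r=2$ and is negative at $r=3$, which is the obstruction described above.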

9.2. The case $l\ge 1$. We first consider the case where ψ is supported on the frequencies $l\ge 1$.

9.2.1. The currents $J^{X,1}_\mu$ and $K^{X,1}$. To assist in overcoming the obstacle of the photon sphere we shall introduce zeroth order terms in order to modify the coefficient of $(\partial_t\psi)^2$ and create another which is more flexible. Let us consider the current

\[ J^{X,h_1,h_2,w}_\mu=J^X_\mu+h_1(r)\,\psi\nabla_\mu\psi+h_2(r)\,\psi^2(\nabla_\mu w), \]

where $h_1$, $h_2$, $w$ are functions on $\mathcal{M}$. Then we have

\[ \begin{aligned} \tilde{K}^X=\nabla^\mu J^{X,h_1,h_2,w}_\mu&=K^X+\nabla^\mu(h_1\psi\nabla_\mu\psi)+\nabla^\mu(h_2\psi^2\nabla_\mu w)\\ &=K^X+h_1(\nabla^a\psi\nabla_a\psi)+(\nabla^\mu h_1+2h_2\nabla^\mu w)\,\psi\nabla_\mu\psi+(\nabla_\mu h_2\nabla^\mu w+h_2\,\Box_gw)\,\psi^2. \end{aligned} \]

By taking $h_2=1$, $h_1=2G$, $w=-G$ we make the coefficient of $\psi\nabla_\mu\psi$ vanish. Therefore, let us define

\[ J^{X,1}_\mu\doteq J^X_\mu+2G\psi(\nabla_\mu\psi)-(\nabla_\mu G)\psi^2. \tag{9.1} \]

Then

\[ K^{X,1}\doteq\nabla^\mu J^{X,1}_\mu=K^X+2G\left(-\frac{1}{D}(\partial_t\psi)^2+\frac{1}{D}(\partial_{r^*}\psi)^2+|\slashed{\nabla}\psi|^2\right)-(\Box_gG)\psi^2. \]

Therefore, if we take $G$ such that

\[ (2G)\left(-\frac{1}{D}\right)=-\left(\frac{f'}{2D}+\frac{f}{r}\right)\ \Rightarrow\ G=\frac{f'}{4}+\frac{f\cdot D}{2r}, \tag{9.2} \]

then

\[ \begin{aligned} K^{X,1}&=\frac{f'}{D}(\partial_{r^*}\psi)^2+\frac{f\cdot P}{r^3}|\slashed{\nabla}\psi|^2-(\Box_gG)\psi^2\\ &=\frac{f'}{D}(\partial_{r^*}\psi)^2+\frac{f\cdot P}{r^3}|\slashed{\nabla}\psi|^2-\left(\frac{1}{4D}f'''+\frac{1}{r}f''+\frac{D'}{D\cdot r}f'+\left(\frac{D''}{2D\cdot r}-\frac{D'}{2r^2}\right)f\right)\psi^2. \end{aligned} \]

Note that, since $f$ must be bounded and $f'$ positive, $-f'''$ must become negative and therefore has the wrong sign. Note also the factor $D$ in the denominator, which degenerates at the horizon. The way that turns out to work is by borrowing from the coefficient of $(\partial_{r^*}\psi)^2$, which is accomplished by introducing a third current. Note that this current exploits an algebraic property of the trapping in the extreme case (see the computation (9.4) of the term $I$ below).

9.2.2. The current $J^{X,2}_\mu$ and estimates for $K^{X,2}$. We define

\[ J^{X,2}_\mu[\psi]=J^{X,1}_\mu[\psi]+\frac{f'}{D\cdot f}\,\beta\psi^2X_\mu, \]

where $\beta$ will be a function of $r^*$ to be defined below. Since $\mathrm{Div}(\partial_{r^*})=\frac{D'}{D}+\frac{2D}{r}$, we have

\[ \begin{aligned} K^{X,2}&=K^{X,1}+\mathrm{Div}\left(\frac{f'}{D}\beta\psi^2\partial_{r^*}\right)\\ &=\frac{f'}{D}(\partial_{r^*}\psi+\beta\psi)^2+\frac{f\cdot P}{r^3}|\slashed{\nabla}\psi|^2\\ &\quad+\left[-\frac{1}{4}\frac{f'''}{D}+\left(\beta-\frac{D}{r}\right)\frac{f''}{D}+\left(\beta'-\beta^2+\frac{2D}{r}\beta-\frac{D'}{r}\right)\frac{f'}{D}+\left(\frac{D'}{2r^2}-\frac{D''}{2D\cdot r}\right)f\right]\psi^2. \end{aligned} \]

Note that the coefficient of $f\psi^2$ is independent of the choice of the function $\beta$. Let us now take $\beta=\frac{D}{r}-\frac{x}{\alpha^2+x^2}$, where $x=r^*-\alpha-\sqrt{\alpha}$ and $\alpha>0$ is a sufficiently large number to be chosen appropriately. As we shall see, the reason for introducing this shifted coordinate $x$ is that we want the origin $x=0$ to be far away from the photon sphere. We clearly need to choose a function $f$ that is strictly increasing and changes sign at the photon sphere. So, if we choose this function (which we call $f^\alpha$) such that $(f^\alpha)'=\frac{1}{\alpha^2+x^2}$, $f^\alpha(r^*=0)=0$, then

\[ F:=-\frac{1}{4}\frac{(f^\alpha)'''}{D}-\frac{x}{\alpha^2+x^2}\frac{(f^\alpha)''}{D}-\frac{\alpha^2}{(\alpha^2+x^2)^2}\frac{(f^\alpha)'}{D}=\frac{1}{2D}\frac{x^2-\alpha^2}{(x^2+\alpha^2)^3}. \]

If we define¹⁰ $X^\alpha=f^\alpha\partial_{r^*}$ and $I=\left(\frac{D'}{2r^2}-\frac{D''}{2D\cdot r}\right)f^\alpha$, then

\[ K^{X^\alpha,2}[\psi]=\frac{(f^\alpha)'}{D}(\partial_{r^*}\psi+\beta\psi)^2+\frac{f^\alpha\cdot P}{r^3}|\slashed{\nabla}\psi|^2+(F+I)\psi^2. \]

9.2.3. Nonnegativity of $K^{X^\alpha,2}$. We have the following important proposition.

Proposition 9.2.1. There exists a constant $C$ which depends only on $M$ such that for all solutions ψ to the wave equation which are supported on the frequencies $l\ge 1$ we have

\[ \int_{\mathbb{S}^2}\left(\frac{(r-M)(r-2M)^2}{r^4}|\slashed{\nabla}\psi|^2+\frac{1}{D}\frac{1}{((r^*)^2+1)^2}\psi^2\right)\le C\int_{\mathbb{S}^2}K^{X^\alpha,2}[\psi]. \tag{9.3} \]

Proof. We have

\[ \frac{f^\alpha\cdot P}{r^3}|\slashed{\nabla}\psi|^2+(F+I)\psi^2\le K^{X^\alpha,2}. \]

We compute the term $I$:

\[ I=\left(\frac{D'}{2r^2}-\frac{D''}{2D\cdot r}\right)f^\alpha=\frac{3M}{r^6}\left(1-\frac{M}{r}\right)f^\alpha\cdot P. \tag{9.4} \]

¹⁰ Note that the role of the parameter $\alpha$ as an index in our notation in this section is to emphasise that the corresponding tensors (i.e. functions, vector fields, etc.) depend on the choice of $\alpha$.

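Therefore, $I\ge 0$ and vanishes (to second order) at the horizon and the photon sphere. Note now that the term $F$ is positive whenever $x$ is not in the interval $[-\alpha,\alpha]$. On the other hand, the photon sphere is far away from the region where $x\in[-\alpha,\alpha]$, so $I$ is positive there. However, $I$ behaves like $\frac{1}{r^4}$ and so it is not sufficient to compensate the negativity of $F$. That is why we need to borrow from the coefficient of $\slashed{\nabla}\psi$, which behaves like $\frac{1}{r}$. Indeed, the Poincaré inequality gives us $\int_{\mathbb{S}^2}\frac{2}{r^2}\psi^2\le\int_{\mathbb{S}^2}|\slashed{\nabla}\psi|^2$ and, therefore, it suffices to prove that $\frac{2f^\alpha\cdot P}{r^5}+F>0$ for all $x\in[-\alpha,\alpha]$ or, equivalently, $r^*\in[\sqrt{\alpha},2\alpha+\sqrt{\alpha}]$. But

\[ F=\frac{1}{2D}\frac{x^2-\alpha^2}{(x^2+\alpha^2)^3}=\epsilon_\alpha\frac{x^2-\alpha^2}{2(x^2+\alpha^2)^3} \]

with $1\le\epsilon_\alpha\to 1$ as $\alpha\to+\infty$. Moreover, since $r^*\ge r$ for big $r$, by taking $\alpha$ sufficiently big we have

\[ \frac{2f^\alpha\cdot P}{r^5}\ge 2\delta_\alpha\frac{f^\alpha}{(r^*)^3} \]

with $1\ge\delta_\alpha\to 1$ as $\alpha\to+\infty$. Therefore, we need to establish that

\[ \frac{\delta_\alpha}{\epsilon_\alpha}>\frac{(\alpha^2-x^2)(x+\alpha+\sqrt{\alpha})^3}{4(x^2+\alpha^2)^3f^\alpha} \]

for all $x\in[-\alpha,\alpha]$. In view of the asymptotic behaviour of the constants $\epsilon_\alpha$, $\delta_\alpha$ it suffices to prove that the right-hand side of the above inequality is strictly less than 1.

Lemma 9.2.1. Given the function $f^\alpha(r^*)=f^\alpha(r^*(x))$ defined above, we have the following: If $-\alpha\le x\le 0$ then $f^\alpha>\frac{x+\alpha}{2\alpha^2}$. Also, if $0\le x\le\alpha$ then $f^\alpha>\frac{x+\alpha}{x^2+\alpha^2}-\frac{1}{2\alpha}$.

Proof. For $-\alpha\le x\le 0$ we have

\[ f^\alpha(x)=\int_{-\alpha-\sqrt{\alpha}}^{x}\frac{1}{\tilde{x}^2+\alpha^2}\,d\tilde{x}>\int_{-\alpha}^{x}\frac{1}{\tilde{x}^2+\alpha^2}\,d\tilde{x}>\int_{-\alpha}^{x}\frac{1}{2\alpha^2}\,d\tilde{x}=\frac{x+\alpha}{2\alpha^2}. \]

Now for $0\le x\le\alpha$ we have

\[ \begin{aligned} f^\alpha(x)&>\int_{-\alpha}^{x}\frac{1}{\tilde{x}^2+\alpha^2}\,d\tilde{x}\ge\int_{-\alpha}^{-x}\frac{1}{\tilde{x}^2+\alpha^2}\,d\tilde{x}+\int_{-x}^{x}\frac{1}{x^2+\alpha^2}\,d\tilde{x}\\ &=\int_{-\alpha}^{x}\frac{1}{x^2+\alpha^2}\,d\tilde{x}-\int_{-\alpha}^{-x}\left(\frac{1}{x^2+\alpha^2}-\frac{1}{\tilde{x}^2+\alpha^2}\right)d\tilde{x}>\frac{x+\alpha}{x^2+\alpha^2}-\frac{1}{2\alpha}. \end{aligned} \]

Consequently, if $-\alpha\le x\le 0$ and $x=-\lambda\alpha$, then $0\le\lambda\le 1$ and

\[ \frac{(\alpha^2-x^2)(x+\alpha+\sqrt{\alpha})^3}{4(x^2+\alpha^2)^3f^\alpha}<\frac{\alpha^2(\alpha-x)(x+\alpha+\sqrt{\alpha})^3}{2(x^2+\alpha^2)^3}=\frac{(1-\lambda)\left(1+\lambda+\alpha^{-\frac{1}{2}}\right)^3}{2(\lambda^2+1)^3}=d_\alpha\frac{(1-\lambda)(1+\lambda)^3}{2(\lambda^2+1)^3}<\frac{2}{3}d_\alpha<\frac{9}{10}, \]

since $d_\alpha\to 1$ as $\alpha\to+\infty$. Similarly, if $0\le x\le\alpha$ and $x=\lambda\alpha$ then $0\le\lambda\le 1$ and

\[ \frac{(\alpha^2-x^2)(x+\alpha+\sqrt{\alpha})^3}{4(x^2+\alpha^2)^3f^\alpha}<\frac{\alpha(\alpha^2-x^2)(x+\alpha+\sqrt{\alpha})^3}{2(x^2+\alpha^2)^2(\alpha^2+2\alpha x-x^2)}<\frac{\alpha(\alpha^2-x^2)(x+\alpha+\sqrt{\alpha})^3}{2(x^2+\alpha^2)^3}=\tilde{d}_\alpha\frac{(1-\lambda^2)(1+\lambda)^3}{2(1+\lambda^2)^3}<\frac{9}{10}, \]

since $\tilde{d}_\alpha\to 1$ as $\alpha\to+\infty$.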
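The lower bounds of Lemma 9.2.1 and the target inequality can be probed numerically, since $f^\alpha$ has the closed form $f^\alpha(x)=\frac{1}{\alpha}\left(\arctan\frac{x}{\alpha}-\arctan\frac{-\alpha-\sqrt{\alpha}}{\alpha}\right)$. The sketch below evaluates the bounds and the ratio $\frac{(\alpha^2-x^2)(x+\alpha+\sqrt{\alpha})^3}{4(x^2+\alpha^2)^3f^\alpha}$ on a grid. The choice $\alpha=100$, the grid and the function names are assumptions of this illustration and do not replace the argument above.

```python
import math


def f_alpha(x, alpha):
    """f^alpha(x) = int_{-alpha - sqrt(alpha)}^{x} dx~ / (x~^2 + alpha^2), in closed form."""
    lower = -alpha - math.sqrt(alpha)
    return (math.atan(x / alpha) - math.atan(lower / alpha)) / alpha


def lemma_lower_bound(x, alpha):
    """Lower bound for f^alpha stated in Lemma 9.2.1 on [-alpha, alpha]."""
    if -alpha <= x <= 0:
        return (x + alpha) / (2.0 * alpha ** 2)
    if 0 < x <= alpha:
        return (x + alpha) / (x ** 2 + alpha ** 2) - 1.0 / (2.0 * alpha)
    raise ValueError("x must lie in [-alpha, alpha]")


def target_ratio(x, alpha):
    """(alpha^2 - x^2)(x + alpha + sqrt(alpha))^3 / (4 (x^2 + alpha^2)^3 f^alpha(x))."""
    num = (alpha ** 2 - x ** 2) * (x + alpha + math.sqrt(alpha)) ** 3
    return num / (4.0 * (x ** 2 + alpha ** 2) ** 3 * f_alpha(x, alpha))


if __name__ == "__main__":
    alpha = 100.0
    grid = [-alpha + 2.0 * k for k in range(101)]
    print(all(f_alpha(x, alpha) > lemma_lower_bound(x, alpha) for x in grid))
    print(max(target_ratio(x, alpha) for x in grid))
```

Using the exact $f^\alpha$ rather than the lemma's lower bounds, the ratio stays comfortably below the threshold $\frac{9}{10}$ used in the proof for this value of $\alpha$.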

Note, however, that although the coefficient of ψ does not degenerate on the photon sphere, the coefficient of the angular derivatives vanishes at the photon sphere to second order. Having this estimate for ψ, we obtain estimates for its derivatives (in Sect. 9.7 we derive a similar estimate for ψ for the case $l=0$).

Proposition 9.2.2. There exists a positive constant $C$ which depends only on $M$ such that for all solutions ψ of the wave equation which are supported on the frequencies $l\ge 1$ we have

\[ \int_{\mathbb{S}^2}\left(\frac{(r-M)}{r^4}(\partial_t\psi)^2+\frac{(r-M)}{r^4}\psi^2\right)\le C\int_{\mathbb{S}^2}\sum_{i=0}^{1}K^{X^\alpha,2}\left[T^i\psi\right]. \]

Proof. From (9.3) we have that the coefficient of $\psi^2$ does not degenerate at the photon sphere. The weights at $\mathcal{H}^+$ and infinity are given by the Poincaré inequality. Commuting the wave equation with $T$ completes the proof of the proposition.

9.2.4. The Lagrangian current $L^f_\mu$. In order to retrieve the remaining derivatives we consider the Lagrangian¹¹ current

\[ L^f_\mu=f\psi\nabla_\mu\psi, \]

where $f=\frac{1}{r^3}D^{\frac{3}{2}}$.

Proposition 9.2.3. There exists a positive constant $C$ which depends only on $M$ such that for all solutions ψ of the wave equation which are supported on the frequencies $l\ge 1$ we have

\[ \int_{\mathbb{S}^2}\left(\frac{\sqrt{D}}{r^3}(\partial_t\psi)^2+\frac{\sqrt{D}}{2r^3}(\partial_{r^*}\psi)^2+\frac{D^{3/2}}{r^3}|\slashed{\nabla}\psi|^2\right)\le\int_{\mathbb{S}^2}\left(\mathrm{Div}(L^f_\mu)+C\sum_{i=0}^{1}K^{X^\alpha,2}\left[T^i\psi\right]\right). \]

Proof. We have

\[ \begin{aligned} \mathrm{Div}(L^f_\mu)&=f\nabla^\mu\psi\nabla_\mu\psi+\nabla^\mu f\,\psi\nabla_\mu\psi\\ &\ge\frac{\sqrt{D}}{r^3}(\partial_{r^*}\psi)^2+\frac{D^{3/2}}{r^3}|\slashed{\nabla}\psi|^2-\frac{\sqrt{D}}{r^3}(\partial_t\psi)^2-\frac{1}{\epsilon}\frac{D}{r^4}\psi^2-\epsilon\frac{D}{r^4}(\partial_{r^*}\psi)^2, \end{aligned} \]

where $\epsilon>0$ is such that $\epsilon\sqrt{D}<\frac{r}{2}$. Therefore, in view of Proposition 9.2.2, we have the required result.

¹¹ The name Lagrangian comes from the fact that if ψ satisfies the wave equation and $\mathcal{L}$ denotes the Lagrangian that corresponds to the wave equation, then $\mathcal{L}(\psi,d\psi,g^{-1})=g^{\mu\nu}\partial_\mu\psi\partial_\nu\psi=\mathrm{Div}(\psi\nabla_\mu\psi)$.

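Proposition 9.2.4. There exists a positive constant $C$ which depends only on $M$ such that for all solutions ψ of the wave equation which are supported on the frequencies $l\ge 1$ we have

\[ \int_{\mathbb{S}^2}\left(\frac{\sqrt{D}}{r^3}(\partial_t\psi)^2+\frac{\sqrt{D}}{2r^3}(\partial_{r^*}\psi)^2+\frac{\sqrt{D}}{r}|\slashed{\nabla}\psi|^2+\frac{\sqrt{D}}{r^3}\psi^2\right)\le\int_{\mathbb{S}^2}\left(\mathrm{Div}(L^f_\mu)+C\sum_{i=0}^{1}K^{X^\alpha,2}\left[T^i\psi\right]\right). \]

Proof. Immediate from Propositions 9.2.1 and 9.2.3.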

9.2.5. The current $J_\mu^{X^d,1}$. In case we allow some degeneracy at the photon sphere we can obtain similar estimates without commuting the wave equation with $T$. This will be very useful for estimating error terms in spacetime regions which do not contain the photon sphere. We define the function $f^d$ such that

\[ (f^d)' = \frac{1}{(r^*)^2+1}, \quad f^d(r^*=0) = 0. \]

Proposition 9.2.5. There exists a positive constant C which depends only on M such that for all solutions ψ of the wave equation which are supported on the frequencies l ≥ 1 we have

\[ \frac{1}{C}\int_{\mathbb{S}^2}\left(\frac{1}{r^2}(\partial_{r^*}\psi)^2 + \frac{P\cdot(r-2M)}{r^4}|\slashed{\nabla}\psi|^2\right) \le \int_{\mathbb{S}^2}\left(K^{X^d,1}[\psi] + C K^{X^\alpha,2}[\psi]\right), \]

where $X^d = f^d\partial_{r^*}$ and the current $J_\mu^{X,1}$ is as defined in Sect. 9.2.1.

Proof. Note that the coefficient of $\psi^2$ in $K^{X^d,1}$ vanishes to first order on $\mathcal{H}^+$ (see also Lemma 9.4.1) and behaves like $\frac{1}{r^4}$ for large $r$. Note also that the coefficient of $(\partial_{r^*}\psi)^2$, which equals $(f^d)'/D$, converges to $\frac{1}{M^2}$ at $\mathcal{H}^+$ (see again Lemma 9.4.1). The result now follows from Proposition 9.2.1.

In order to retrieve the $\partial_t$-derivative we introduce the current $L_\mu^{h^d} = h^d\psi\nabla_\mu\psi$, where $h^d$ is such that:

\[ h^d = -\frac{1}{(r^*)^2+1} \ \text{ for } M \le r \le r_0 < 2M, \qquad h^d < 0 \ \text{ for } r_0 < r < 2M, \]

$h^d = 0$ for $r = 2M$ to second order, $h^d < 0$ for 2M
Linear Stability and Instability of Extreme Reissner-Nordström I


Proof. We have as before

\[ \mathrm{Div}(L_\mu^{h^d}) = -\frac{h^d}{D}(\partial_t\psi)^2 + \frac{h^d}{D}(\partial_{r^*}\psi)^2 + h^d|\slashed{\nabla}\psi|^2 + \frac{(h^d)'}{D}\psi(\partial_{r^*}\psi). \]

Since the coefficient of $\psi(\partial_{r^*}\psi)$ vanishes to first order on the horizon, the Cauchy-Schwarz inequality and Proposition 9.2.5 imply the result.

Finally, we obtain:

Proposition 9.2.7. There exists a positive constant C which depends only on M such that for all solutions ψ of the wave equation which are supported on the frequencies l ≥ 1 we have

\[ \frac{1}{C}\int_{\mathbb{S}^2}\left(\frac{P\cdot(r-2M)}{r^2}(\partial_t\psi)^2 + \frac{1}{r^4}(\partial_{r^*}\psi)^2 + \frac{P\cdot(r-2M)}{r^4}|\slashed{\nabla}\psi|^2 + \frac{(r-M)}{r^5}\psi^2\right) \le \int_{\mathbb{S}^2}\left(\mathrm{Div}(L_\mu^{g^d}) + K^{X^d,1}[\psi] + C K^{X^\alpha,2}[\psi]\right). \]

Proof. Immediate from Propositions 9.2.1 and 9.2.6.

9.3. The case l = 0. In case l = 0 the wave is not trapped. Indeed, if we consider the vector field $X^0 = f^0\partial_{r^*}$ with $f^0 = -\frac{1}{r^3}$ then we have:

Proposition 9.3.1. For all spherically symmetric solutions ψ of the wave equation we have 5 1 0 (∂t ψ)2 + 4 (∂r ∗ ψ)2 = K X . r4 r Proof. Immediate from the expression of K X and the above choice of f = f 0 . 9.4. The boundary terms. The positivity of the bulk terms shown so far will be useful only if we can control the arising boundary terms. μ

9.4.1. Estimates for $J_\mu^X n_S^\mu$

Proposition 9.4.1. Let $X = f\partial_{r^*}$ where $f = f(r^*)$ is bounded and $S$ be either an $SO(3)$ invariant spacelike hypersurface (that may cross $\mathcal{H}^+$) or an $SO(3)$ invariant null hypersurface. Then there exists a uniform constant C that depends on M, S and the function f such that for all ψ we have

\[ \left|\int_S J_\mu^X[\psi]n_S^\mu\right| \le C\int_S J_\mu^T[\psi]n_S^\mu. \]

Proof. We work using the coordinate system $(v, r, \theta, \phi)$. First note that $X = f\partial_{r^*} = f\partial_v + f\cdot D\,\partial_r$. Now

\[ \begin{aligned} J_\mu^X n_S^\mu &= T_{\mu\nu}X^\nu n_S^\mu = T_{\mu v}f\,n_S^\mu + T_{\mu r}fD\,n_S^\mu \\ &= f n^v(\partial_v\psi)^2 + Df n^v(\partial_v\psi)(\partial_r\psi) + D\left[\tfrac{1}{2}Df n^v - \tfrac{1}{2}f n^r - \tfrac{1}{2}Df n^v + f n^r\right](\partial_r\psi)^2 \\ &\quad + \left[\tfrac{1}{2}Df n^v - \tfrac{1}{2}Df n^v - \tfrac{1}{2}f n^r\right]|\slashed{\nabla}\psi|^2. \end{aligned} \]

The result now follows from the boundedness of f.
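The bracketed coefficients in the last display collapse considerably. As a sketch, assuming the ingoing form of the metric $g = -D\,dv^2 + 2\,dv\,dr + r^2 g_{\mathbb{S}^2}$ (an assumption consistent with the relation $\partial_{r^*} = \partial_v + D\partial_r$ used above, but not restated in this section), one has $g(X, n_S) = f n^r$ and therefore:

```latex
\begin{aligned}
J^X_\mu n^\mu_S
 &= (X\psi)(n_S\psi) - \tfrac{1}{2}\, g(X,n_S)\,\nabla^a\psi\nabla_a\psi \\
 &= f n^v (\partial_v\psi)^2 + D f n^v (\partial_v\psi)(\partial_r\psi)
    + \tfrac{1}{2} D f n^r (\partial_r\psi)^2
    - \tfrac{1}{2} f n^r\, |\slashed{\nabla}\psi|^2 .
\end{aligned}
```

Every coefficient is a bounded multiple of $f$ times a component of $n_S$, which is the only feature of the expression that the boundedness argument uses.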


9.4.2. Estimates for $J_\mu^{X^\alpha,i} n_S^\mu$, $i = 1, 2$.

Proposition 9.4.2. There exists a uniform constant C that depends on M and S such that

\[ \left|\int_S J_\mu^{X^\alpha,i}[\psi]n_S^\mu\right| \le C\int_S J_\mu^T[\psi]n_S^\mu, \quad i = 1, 2, \]

where S is as in Proposition 9.4.1.

Proof. It suffices to prove

\[ \left|\int_S\left(2G^\alpha\psi\nabla_\mu\psi - (\nabla_\mu G^\alpha)\psi^2 + \frac{(f^\alpha)'}{D}\beta\psi^2(\partial_{r^*})_\mu\right)n_S^\mu\right| \le B\int_S J_\mu^T[\psi]n_S^\mu. \]

For this we first prove the following lemma, which is true only in the case of extreme Reissner-Nordström (and not in the subextreme range).

Lemma 9.4.1. The function

\[ F = \frac{1}{D}\,\frac{1}{(r^*)^2+1} \]

is bounded in $\mathcal{R}\cup\mathcal{H}^+$.

Proof. For the tortoise coordinate $r^*$ we have

\[ r^*(r) = r + 2M\ln(r-M) - \frac{M^2}{r-M} + C. \]

Clearly, $F \to 0$ as $r \to +\infty$. Moreover, in a neighbourhood of $\mathcal{H}^+$,

\[ F \sim \frac{r^2}{(r-M)^2}\,\frac{1}{\frac{M^4}{(r-M)^2} + r^2 + 4M^2(\ln(r-M))^2} \to \frac{1}{M^2} < \infty. \]

This implies the required result.

An immediate corollary of this lemma is that the functions $G_1^\alpha = r\frac{G^\alpha}{D}$ and $\frac{(f^\alpha)'}{D}$ are bounded in $\mathcal{R}\cup\mathcal{H}^+$. Furthermore, for $M \ll r$ we have $(f^\alpha)' \sim \frac{1}{r^2}$, $D \sim 1$ and $\partial_r D \sim \frac{1}{r^2}$. Therefore, $G_2^\alpha = r^2\nabla_r G^\alpha$ is also bounded. Finally, $\nabla_v G^\alpha = 0$. The above bounds and the first Hardy inequality complete the proof of the proposition.
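The boundedness of F in Lemma 9.4.1 can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes $D = (1 - M/r)^2$ (the extreme Reissner-Nordström value, not restated in this section), sets the integration constant in $r^*$ to zero, and uses hypothetical function names. With these choices, $M^2 F$ evaluated close to the horizon approaches 1, and F becomes small for large r.

```python
import math


def tortoise(r, M=1.0, C=0.0):
    """Tortoise coordinate r*(r) = r + 2M ln(r - M) - M^2/(r - M) + C, valid for r > M."""
    return r + 2.0 * M * math.log(r - M) - M * M / (r - M) + C


def F(r, M=1.0):
    """F = 1 / (D ((r*)^2 + 1)), with D = ((r - M)/r)^2 (assumed extreme case)."""
    D = ((r - M) / r) ** 2
    return 1.0 / (D * (tortoise(r, M) ** 2 + 1.0))


if __name__ == "__main__":
    M = 2.0
    # Approach the horizon r = M from outside and watch M^2 * F.
    for eps in (1e-2, 1e-4, 1e-6):
        print(f"r - M = {eps:g}:  M^2 * F = {M * M * F(M + eps, M):.6f}")
    # Far region: F should be small.
    print(f"r = 1e6:  F = {F(1e6, M):.3e}")
```

The logarithmic term in $r^*$ makes the convergence slow, which is why the check uses a loose tolerance rather than expecting many matching digits.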

9.5. A degenerate X estimate. We first obtain an estimate which does not lose derivatives but degenerates at the photon sphere.

Theorem 9.1. There exists a constant C which depends on M and $\Sigma_0$ such that for all solutions ψ of the wave equation we have

\[ \int_{\mathcal{R}(0,\tau)}\left(\frac{1}{r^4}(\partial_{r^*}\psi)^2 + \frac{(r-M)(r-2M)^2}{r^7}\left((\partial_t\psi)^2 + |\slashed{\nabla}\psi|^2\right)\right) \le C\int_{\Sigma_0}J_\mu^T[\psi]n_{\Sigma_0}^\mu. \tag{9.5} \]


Proof. We first decompose ψ as $\psi = \psi_{\ge 1} + \psi_0$. We apply Stokes' theorem for the current

\[ J_\mu^d[\psi_{\ge 1}] = J_\mu^{X^d,1}[\psi_{\ge 1}] + J_\mu^{X^\alpha,2}[\psi_{\ge 1}] + L_\mu^{g^d}[\psi_{\ge 1}] \]

in the spacetime region $\mathcal{R}(0,\tau)$ and use Propositions 9.2.7 and 9.4.2. We also apply Stokes' theorem in $\mathcal{R}(0,\tau)$ for the current $J_\mu^{X^0}[\psi_0]$ and use Proposition 9.4.1. Adding these two estimates, we obtain the required result.

Remark 9.1. One can further improve the weights at infinity. Indeed, if we apply Stokes' theorem in the region $\{r \ge R\}$ for sufficiently large R for the current

\[ J_\mu = J_\mu^{X^{f_1},1} + J_\mu^{X^{f_2}}, \]

where $f_1 = 1 - \frac{1}{r^\delta}$, $f_2 = \frac{\delta}{2+\delta}\frac{1}{r^\delta}$ and $\delta > 0$, then an easy calculation shows that there exists a constant $C_\delta > 0$ such that for $r \ge R$,

\[ \nabla^\mu J_\mu \ge C_\delta\left(r^{-1-\delta}(\partial_t\psi)^2 + r^{-1-\delta}(\partial_{r^*}\psi)^2 + r^{-1}|\slashed{\nabla}\psi|^2 + r^{-3-\delta}\psi^2\right). \]

The above remark and theorem and the identity $\partial_{r^*} = \partial_v + D\partial_r$ imply statement (2) of Theorem 1 of Sect. 4.

9.6. A non-degenerate X estimate. We derive an $L^2$ estimate which does not degenerate at the photon sphere but requires higher regularity for ψ.

Theorem 9.2. There exists a constant C which depends on M and $\Sigma_0$ such that for all solutions ψ of the wave equation we have

\[ \int_{\mathcal{R}(0,\tau)}\left(\frac{\sqrt{D}}{r^4}(\partial_t\psi)^2 + \frac{\sqrt{D}}{2r^4}(\partial_{r^*}\psi)^2 + \frac{\sqrt{D}}{r}|\slashed{\nabla}\psi|^2\right) \le C\int_{\Sigma_0}\left(J_\mu^T[\psi]n_{\Sigma_0}^\mu + J_\mu^T[T\psi]n_{\Sigma_0}^\mu\right). \tag{9.6} \]

Proof. We again decompose ψ as $\psi = \psi_{\ge 1} + \psi_0$ and apply Stokes' theorem for the current

\[ J_\mu[\psi_{\ge 1}] = L_\mu^f[\psi_{\ge 1}] + C J_\mu^{X^\alpha,2}[\psi_{\ge 1}] + C J_\mu^{X^\alpha,2}[T\psi_{\ge 1}] \]

in the spacetime region $\mathcal{R}(0,\tau)$, where C is the constant of Proposition 9.2.4, and use Propositions 9.2.4 and 9.4.2. Finally, we apply Stokes' theorem in $\mathcal{R}(0,\tau)$ for the current $J_\mu^{X^0}[\psi_0]$ and use Proposition 9.4.1. Adding these two estimates completes the proof.

The proof of statement (3) of Theorem 1 of Sect. 4 is immediate from the above proposition, Theorem 9.1 and Remark 9.1.

Remark 9.2. All the above estimates hold if $\Sigma_\tau$ is replaced by $\tilde{\Sigma}_\tau$. Indeed, the results of Sect. 9.4 allow us to bound the corresponding boundary terms in this case. Moreover, the additional boundary term on $\mathcal{I}^+$ can be bounded by the conserved T-flux.


9.7. Zeroth order Morawetz estimate for ψ. We now prove weighted $L^2$ estimates of the wave ψ itself. In Sect. 9.2.3, we obtained such an estimate for $\psi_{\ge 1}$. Next we derive a similar estimate for the zeroth spherical harmonic $\psi_0$. We first prove the following lemma.

Lemma 9.7.1. Fix R > 2M. There exists a constant C which depends on M, $\Sigma_0$ and R such that for all spherically symmetric solutions ψ of the wave equation

\[ \int_{\{r=R\}\cap\mathcal{R}(0,\tau)}\left((\partial_t\psi)^2 + (\partial_{r^*}\psi)^2\right) \le C_R\int_{\Sigma_0}J_\mu^T[\psi]n_{\Sigma_0}^\mu. \]

Proof. Consider the region $\mathcal{F}(0,\tau) = \{\mathcal{R}(0,\tau)\cap\{M \le r \le R\}\}$. By applying the vector field $X = \partial_{r^*}$ as a multiplier in the region $\mathcal{F}(0,\tau)$ we obtain

\[ \int_{\mathcal{F}\cap\mathcal{H}^+}J_\mu^X[\psi]n_{\mathcal{H}^+}^\mu + \int_{\mathcal{F}}K^X[\psi] + \int_{\Sigma_\tau\cap\mathcal{F}}J_\mu^X[\psi]n_{\Sigma_\tau}^\mu + \int_{\{r=R\}\cap\mathcal{R}(0,\tau)}J_\mu^X[\psi]n_{\mathcal{F}}^\mu = \int_{\Sigma_0\cap\mathcal{F}}J_\mu^X[\psi]n_{\Sigma_0}^\mu, \]

where $n_{\mathcal{F}}^\mu$ denotes the unit normal vector to $\{r = R\}$ pointing in the interior of $\mathcal{F}$.

In view of the spatial compactness of the region $\mathcal{F}$, the corresponding spacetime integral can be estimated using Propositions 9.3.1 and 9.4.1. The boundary integrals over $\Sigma_0$ and $\Sigma_\tau$ can be estimated using Proposition 9.4.1. Moreover, for spherically symmetric waves ψ we have $J_\mu^X[\psi]n_{\mathcal{F}}^\mu = -\frac{1}{2\sqrt{D}}\left((\partial_t\psi)^2 + (\partial_{r^*}\psi)^2\right)$, which completes the proof.

Consider now the region $\mathcal{G} = \mathcal{R}\cap\{R \le r\}$.

Proposition 9.7.1. Fix R > 2M. There exists a constant C which depends on M, $\Sigma_0$ and R such that for all spherically symmetric solutions ψ of the wave equation

\[ \int_{\{r=R\}}\psi^2 + \int_{\mathcal{G}}\frac{1}{r^4}\psi^2 \le C_R\int_{\Sigma_0}J_\mu^T[\psi]n_{\Sigma_0}^\mu. \]

Proof. By applying Stokes' theorem for the current $J_\mu^{X,1}[\psi]$ in the region $\mathcal{G}(0,\tau)$, where again $X = \partial_{r^*}$ (and, therefore, f = 1), we obtain

\[ \int_{\mathcal{G}(0,\tau)}K^{X,1}[\psi] + \int_{\{r=R\}\cap\mathcal{G}(0,\tau)}J_\mu^{X,1}[\psi]n_{\mathcal{G}}^\mu \le C\int_{\Sigma_0}J_\mu^T[\psi]n_{\Sigma_0}^\mu, \]


where we have used Proposition 9.4.2 to estimate the boundary integral over $\Sigma_\tau\cap\mathcal{G}$. Note that again $n_{\mathcal{G}}^\mu$ denotes the unit normal vector to $\{r = R\}$ pointing in the interior of $\mathcal{G}$. For f = 1 we have $K^{X,1}[\psi] = I\psi^2$, where $I > 0$ and $I \sim \frac{1}{r^4}$ for $r \ge R > 2M$. If G is the function defined in Sect. 9.2.1 then $G = \frac{D}{2r}$ and therefore for sufficiently large R we have $\partial_{r^*}G < 0$. Then,

\[ J_\mu^{X,1}[\psi]n_{\mathcal{G}}^\mu = J_\mu^X n_{\mathcal{G}}^\mu + 2G\psi(\nabla_\mu\psi)n_{\mathcal{G}}^\mu - (\nabla_\mu G)\psi^2 n_{\mathcal{G}}^\mu. \]

Since $n_{\mathcal{G}} = \frac{1}{\sqrt{D}}\partial_{r^*}$, by applying Cauchy-Schwarz for the second term on the right hand side (and since the third term is positive) we obtain

\[ J_\mu^{X,1}[\psi]n_{\mathcal{G}}^\mu \sim \psi^2 - \frac{1}{\epsilon}\left((\partial_t\psi)^2 + (\partial_{r^*}\psi)^2\right), \]

for a sufficiently small $\epsilon$. Lemma 9.7.1 completes the proof.

It remains to obtain a (weighted) $L^2$ estimate for ψ in the region $\mathcal{F}$.

Proposition 9.7.2. Fix sufficiently large R > 2M. Then, there exists a constant C which depends on M, $\Sigma_0$ and R such that for all spherically symmetric solutions ψ of the wave equation

\[ \int_{\mathcal{F}}D\psi^2 \le C_R\int_{\Sigma_0}J_\mu^T[\psi]n_{\Sigma_0}^\mu. \]

Proof. We apply Stokes' theorem for the current $J_\mu^H[\psi] = (\nabla_\mu H)\psi^2 - 2H\psi\nabla_\mu\psi$, where $H = (r-M)^2$. All the boundary integrals can be estimated using Propositions 9.4.2 and 9.7.1. Note also that

\[ \nabla^\mu J_\mu^H = (\Box_g H)\psi^2 - \frac{2H}{D}\left((\partial_{r^*}\psi)^2 - (\partial_t\psi)^2\right). \]

Since $\Box_g H \sim D$ for $M \le r \le R$ and $\frac{H}{D}$ is bounded, the result follows in view of the spatial compactness of $\mathcal{F}$ and Proposition 9.3.1.

We finally have the following zeroth order Morawetz estimate:

Theorem 9.3. There exists a constant C that depends on M and $\Sigma_0$ such that for all solutions ψ of the wave equation we have

\[ \int_{\mathcal{R}}\frac{D}{r^4}\psi^2 \le C\int_{\Sigma_0}J_\mu^T[\psi]n_{\Sigma_0}^\mu. \tag{9.7} \]

Proof. Write $\psi = \psi_0 + \psi_{\ge 1}$ and use Propositions 9.2.1, 9.7.2 and 9.7.1.

Clearly, the lemma used for Proposition 9.7.1 holds strictly for the case l = 0. In general, one could have argued by averaging the X estimate (9.5) in R, which would imply that there exists a value $R_0$ of r such that all the derivatives are controlled on the hypersurface of constant radius $R_0$. This makes our argument work for all ψ without recourse to the spherical decomposition.

Remark 9.3. The key property used in Proposition 9.7.1 is that the d'Alembertian of $\frac{1}{r}$ is always negative, something not true for larger powers of $\frac{1}{r}$. Note that this unstable behaviour of $\frac{1}{r}$ is expected since it is the static solution of the wave equation in Minkowski.
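The sign claim in Remark 9.3 can be made explicit. The following is a sketch, assuming $D = (1 - M/r)^2$ and that on functions of r alone the wave operator reduces to $\Box_g u = r^{-2}\partial_r(r^2 D\,\partial_r u)$; neither formula is restated in this section, so both are labeled assumptions here.

```latex
\begin{aligned}
\Box_g\!\left(\frac{1}{r}\right)
  &= \frac{1}{r^2}\,\partial_r\!\left(r^2 D\,\partial_r \frac{1}{r}\right)
   = -\frac{D'}{r^2}
   = -\frac{2M(r-M)}{r^5} \;\le\; 0, \\[4pt]
\Box_g\!\left(\frac{1}{r^p}\right)
  &= -\frac{p\,D'}{r^{p+1}} + \frac{p(p-1)\,D}{r^{p+2}}, \qquad p > 0 .
\end{aligned}
```

For $p > 1$ the second term is positive and decays more slowly than the first as $r \to \infty$, so the sign is lost for large r. For $p = 1$ it vanishes identically, and in Minkowski space ($M = 0$) the whole expression is zero, which is the static-solution statement of the remark.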


9.8. Discussion. The current of Sect. 9.2.2 was first introduced in [20] and subsequently in [22], where a non-degenerate X estimate is established for Schwarzschild. In [20], the authors show that for each fixed spherical number l there is a corresponding “effective photon sphere” centred at $r = -\gamma_l$. It is also shown that $-\gamma_l \to 3M$ as $l \to +\infty$. In our case, our computation (9.4) shows that all “effective photon spheres” coincide with the photon sphere, i.e. for all l we have $-\gamma_l = 2M$. Note also that in [22], one needs to commute with the generators of the Lie algebra so(3). Moreover, the boundary terms could not be controlled by the flux of T but one needed a small portion of the redshift estimate. In our case the above geometric symmetry of the trapping and the asymptotic behavior of $r^*$ allowed us to completely decouple the dispersion from the redshift effect. For a nice exposition of previous work on X estimates see [24].

10. The Vector Field N

It is clear that in order to obtain an estimate for the non-degenerate energy of a local observer we need to use timelike multipliers at the horizon. Then uniform boundedness of energy would follow provided we can control the spacetime terms which arise. For a suitable class of non-degenerate black hole spacetimes, not only do the bulk terms have the right sign close to $\mathcal{H}^+$ but they in fact control the non-degenerate energy. Indeed, in [24] the following is proved:

Proposition 10.0.1. Let $\mathcal{H}^+$ be a Killing horizon with positive surface gravity and let V be the Killing vector field tangent to $\mathcal{H}^+$. Then there exist constants b, B > 0 and a timelike vector field N which is $\phi_\tau^V$-invariant (i.e. $\mathcal{L}_V N = 0$) such that for all functions ψ we have

\[ b\,J_\mu^N[\psi]n_{\Sigma_\tau}^\mu \le K^N[\psi] \le B\,J_\mu^N[\psi]n_{\Sigma_\tau}^\mu \tag{10.1} \]

on $\mathcal{H}^+$.

The construction of the above vector field does not require the global existence of a causal Killing vector field and the positivity of the surface gravity suffices. Under suitable circumstances, one can prove that the N flux is uniformly bounded without understanding the structure of the trapping (i.e. no X or Morawetz estimate is required). However, in our case, in view of the lack of redshift along $\mathcal{H}^+$ the situation is completely different. Indeed, we will see that in extreme Reissner-Nordström a vector field satisfying the properties of Proposition 10.0.1 does not exist. Not only will we show that there is no $\phi_\tau^T$-invariant vector field N satisfying (10.1) on $\mathcal{H}^+$ but we will in fact prove that there is no $\phi_\tau^T$-invariant timelike vector field N such that $K^N[\psi] \ge 0$ on $\mathcal{H}^+$. Our resolution to this problem uses an appropriate modification of $J_\mu^N$ and Hardy inequalities and thus is still robust.

10.1. The effect of vanishing redshift on linear waves. Let $N = N^v(r)\partial_v + N^r(r)\partial_r$ be a future directed timelike $\phi_\tau^T$-invariant vector field. Then

\[ K^N[\psi] = F_{vv}(\partial_v\psi)^2 + F_{rr}(\partial_r\psi)^2 + F_{\slashed{\nabla}}|\slashed{\nabla}\psi|^2 + F_{vr}(\partial_v\psi)(\partial_r\psi), \]


where the coefficients are given by

\[ F_{vv} = \partial_r N^v, \quad F_{rr} = D\left[\frac{(\partial_r N^r)}{2} - \frac{N^r}{r}\right] - \frac{N^r D'}{2}, \quad F_{\slashed{\nabla}} = -\frac{1}{2}\partial_r N^r, \quad F_{vr} = D\,\partial_r N^v - \frac{2N^r}{r}, \tag{10.2} \]

where $D' = \frac{dD}{dr}$. Note that $g(N,N) = -D(N^v)^2 + 2N^v N^r$ and $g(N,T) = -DN^v + N^r$, and so $N^r(r = M)$ cannot be zero (otherwise the vector field N would not be timelike). Therefore, looking back at the list (10.2) we see that the coefficient of $(\partial_r\psi)^2$ vanishes on the horizon $\mathcal{H}^+$ whereas the coefficient of $\partial_v\psi\,\partial_r\psi$ is equal to $-\frac{2N^r(M)}{M}$, which is not zero. Therefore, $K^N[\psi]$ is linear with respect to $\partial_r\psi$ on the horizon $\mathcal{H}^+$ and thus it necessarily fails to be non-negative definite. This linearity is a characteristic feature of the geometry of the event horizon of extreme Reissner-Nordström and degenerate black hole spacetimes more generally. This proves that a vector field satisfying the properties of Proposition 10.0.1 does not exist.

10.2. A locally non-negative spacetime current. In view of the above discussion, we need to modify the bulk term by introducing new terms that will counteract the presence of $\partial_v\psi\,\partial_r\psi$. We define

\[ J_\mu \equiv J_\mu^{N,h} = J_\mu^N + h(r)\psi\nabla_\mu\psi, \tag{10.3} \]

where h is a function on $\mathcal{M}$. Then we have

\[ K \equiv K^{N,h} = \nabla^\mu J_\mu = K^N + (\nabla^\mu h)\psi\nabla_\mu\psi + h\left(\nabla^a\psi\nabla_a\psi\right), \]

provided ψ is a solution of the wave equation. Let us suppose that $h(r) = \frac{N^r(M)}{M}$. Then

\[ \begin{aligned} K^{N,h}[\psi] &= K^N[\psi] + h\left(\nabla^a\psi\nabla_a\psi\right) = K^N[\psi] + h\left(2\partial_v\psi\,\partial_r\psi + D(\partial_r\psi)^2 + |\slashed{\nabla}\psi|^2\right) \\ &= F_{vv}(\partial_v\psi)^2 + \left[F_{rr} + hD\right](\partial_r\psi)^2 + \left[-\frac{\partial_r N^r}{2} + h\right]|\slashed{\nabla}\psi|^2 + \left[F_{vr} + 2h\right](\partial_v\psi\,\partial_r\psi). \end{aligned} \tag{10.4} \]

Note that by taking h to be constant we managed to have no zeroth order terms in the current K. Let us denote the above coefficients of $(\partial_a\psi\,\partial_b\psi)$ by $G_{ab}$, where $a, b \in \{v, r, \slashed{\nabla}\}$, and define the vector field N in the region $M \le r \le \frac{9M}{8}$ to be such that

\[ N^v(r) = 16r, \quad N^r(r) = -\frac{3}{2}r + M \tag{10.5} \]

and, therefore, $h = -\frac{1}{2}$. Clearly, N is a timelike future directed vector field. We have the following

Proposition 10.2.1. For all functions ψ, the current $K^{N,-\frac{1}{2}}[\psi]$ defined by (10.4) is non-negative definite in the region $\mathcal{A}_N = \left\{M \le r \le \frac{9M}{8}\right\}$ and, in particular, there is a positive constant C that depends only on M such that

\[ K^{N,-\frac{1}{2}}[\psi] \ge C\left((\partial_v\psi)^2 + \sqrt{D}(\partial_r\psi)^2 + |\slashed{\nabla}\psi|^2\right). \]


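Proof. We first observe that the coefficient $G_{\slashed{\nabla}}$ of $|\slashed{\nabla}\psi|^2$ is equal to $G_{\slashed{\nabla}} = \frac{1}{4}$. Clearly, $G_{vv} = 16$ and $G_{rr}$ is non-negative since the factor of the dominant term $D'$ (which vanishes to first order on $\mathcal{H}^+$) is positive. As regards the coefficient of the mixed term we have $G_{vr} = 16D + 2\sqrt{D} = \epsilon_1\cdot\epsilon_2$, where $\epsilon_1 = \sqrt{D}$, $\epsilon_2 = 16\sqrt{D} + 2$. We will show that in region $\mathcal{A}_N$ we have $\epsilon_1^2 \le G_{rr}$, $\epsilon_2^2 < G_{vv}$. Indeed, if we set $\lambda = \frac{M}{r}$ we have

\[ \epsilon_1^2 \le G_{rr} \Leftrightarrow (1-\lambda) \le (1-\lambda)\left(\frac{1}{4} - \lambda\right) - \lambda\left(-\frac{3}{2} + \lambda\right) \Leftrightarrow \lambda \ge \frac{3}{5}, \]

which holds, since $\lambda \ge \frac{8}{9}$ in $\mathcal{A}_N$. Note also that $\epsilon_1^2$ vanishes to second order on $\mathcal{H}^+$ whereas $G_{rr}$ vanishes to first order. Therefore, in region $\mathcal{A}_N$ we have $G_{rr} - \epsilon_1^2 \sim \sqrt{D}$. Similarly,

\[ \epsilon_2^2 < G_{vv} \Leftrightarrow \left(16\left(1 - \frac{M}{r}\right) + 2\right)^2 < 16 \Leftrightarrow \lambda > \frac{7}{8}, \]

which again holds. The proposition follows from the spatial compactness of $\mathcal{A}_N$ and that $a^2 + ab + b^2 \ge 0$ for all $a, b \in \mathbb{R}$.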
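The algebra behind this proof is easy to check with exact rational arithmetic. The sketch below is illustrative only: it assumes $D = (1 - M/r)^2$, takes the coefficient formulas (10.2) and (10.4) and the choice (10.5) at face value, and uses hypothetical function names. It evaluates the $G_{ab}$ at $r = M/\lambda$ and tests the quadratic-form condition $G_{vr}^2 \le 4\,G_{vv}G_{rr}$ on the $(\partial_v\psi, \partial_r\psi)$ block.

```python
from fractions import Fraction


def coefficients(lam, M=Fraction(1)):
    """Coefficients G_ab of K^{N,-1/2} at r = M/lam for N^v = 16 r, N^r = -3r/2 + M."""
    r = M / lam
    D = (1 - M / r) ** 2
    dD = 2 * M * (r - M) / r ** 3            # D' = dD/dr for D = (1 - M/r)^2
    Nr = -Fraction(3, 2) * r + M             # N^r(r)
    dNv, dNr = Fraction(16), -Fraction(3, 2) # d(N^v)/dr and d(N^r)/dr
    h = (-Fraction(3, 2) * M + M) / M        # h = N^r(M)/M = -1/2
    F_vv = dNv
    F_rr = D * (dNr / 2 - Nr / r) - Nr * dD / 2
    F_ang = -dNr / 2
    F_vr = D * dNv - 2 * Nr / r
    return {"vv": F_vv, "rr": F_rr + h * D, "ang": F_ang + h, "vr": F_vr + 2 * h}


def is_nonnegative(lam, M=Fraction(1)):
    """True if the (v, r) block and the angular coefficient are non-negative at r = M/lam."""
    G = coefficients(lam, M)
    return (G["vv"] > 0 and G["rr"] >= 0 and G["ang"] > 0
            and G["vr"] ** 2 <= 4 * G["vv"] * G["rr"])


if __name__ == "__main__":
    for lam in (Fraction(8, 9), Fraction(17, 18), Fraction(1)):
        print(lam, coefficients(lam), is_nonnegative(lam))
```

Under these assumptions the code reproduces $G_{vv} = 16$, $G_{\slashed{\nabla}} = \frac{1}{4}$, $G_{vr} = 16D + 2\sqrt{D}$ and the closed form $G_{rr} = \frac{1}{4}(1 - \lambda^2)$, from which the equivalence with $\lambda \ge \frac{3}{5}$ follows by dividing $\epsilon_1^2 \le G_{rr}$ by $(1 - \lambda)$.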

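10.3. The cut-off δ and the current $J_\mu^{N,\delta,-\frac{1}{2}}$. Clearly, the current $K^{N,-\frac{1}{2}}$ will not be non-negative far away from $\mathcal{H}^+$ and thus is not useful there. For this reason we extend $N^v$, $N^r$ such that

\[ N^v(r) > 0 \text{ for all } r \ge M \text{ and } N^v(r) = 1 \text{ for all } r \ge \frac{8M}{7}, \]
\[ N^r(r) \le 0 \text{ for all } r \ge M \text{ and } N^r(r) = 0 \text{ for all } r \ge \frac{8M}{7}. \]

N as defined is a future directed timelike $\phi_\tau^T$-invariant vector field. Similarly, the modification term in the current $J_\mu^{N,-\frac{1}{2}}$ is not useful for consideration far away from $\mathcal{H}^+$. We thus introduce a smooth cut-off function $\delta : [M, +\infty) \to \mathbb{R}$ such that $\delta(r) = 1$ for $r \in \left[M, \frac{9M}{8}\right]$ and $\delta(r) = 0$ for $r \in \left[\frac{8M}{7}, +\infty\right)$, and consider the currents

\[ J_\mu^{N,\delta,-\frac{1}{2}} \doteq J_\mu^N - \frac{1}{2}\delta\psi\nabla_\mu\psi, \quad K^{N,\delta,-\frac{1}{2}} \doteq \nabla^\mu J_\mu^{N,\delta,-\frac{1}{2}}. \tag{10.6} \]

We now consider the three regions $\mathcal{A}_N$, $\mathcal{B}_N$, $\mathcal{C}_N$: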


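In region $\mathcal{C}_N = \left\{r \ge \frac{8M}{7}\right\}$, where δ = 0 and N = T, we have $K^{N,\delta,-\frac{1}{2}} = 0$. However, this spacetime current, which depends on the 1-jet of ψ, will generally be negative in region $\mathcal{B}_N$ and thus will be controlled by the X and Morawetz estimates (which, of course, are non-degenerate in $\mathcal{B}_N$).

We next control the Sobolev norm

\[ \|\psi\|^2_{\dot{H}^1(\Sigma_\tau)} \doteq \int_{\Sigma_\tau}(\partial_v\psi)^2 + (\partial_r\psi)^2 + |\slashed{\nabla}\psi|^2. \]

Since N and $n_{\Sigma_\tau}^\mu$ are timelike everywhere in $\mathcal{R}$ and since $\xi_N$, $\xi_n$ are positive and uniformly bounded, (A.1) of Appendix A.2 implies $J_\mu^N[\psi]n_{\Sigma_\tau}^\mu \sim (\partial_v\psi)^2 + (\partial_r\psi)^2 + |\slashed{\nabla}\psi|^2$, where, in view of the $\phi_\tau^T$-invariance of $\Sigma_\tau$ and N, the constants in ∼ depend only on M and $\Sigma_0$. Therefore, it suffices to estimate the flux of N through $\Sigma_\tau$. We have:

Proposition 10.3.1. There exists a constant C > 0 which depends on M and $\Sigma_0$ such that for all functions ψ,

\[ \int_{\Sigma_\tau}J_\mu^N[\psi]n^\mu \le 2\int_{\Sigma_\tau}J_\mu^{N,\delta,-\frac{1}{2}}[\psi]n^\mu + C\int_{\Sigma_\tau}J_\mu^T[\psi]n^\mu. \tag{10.7} \]

Proof. We have

\[ \begin{aligned} J_\mu^{N,\delta,-\frac{1}{2}}n^\mu &= J_\mu^N n^\mu - \frac{1}{2}\delta\psi\,\partial_\mu\psi\,n^\mu = J_\mu^N n^\mu - \frac{1}{2}\delta\psi\,\partial_v\psi\,n^v - \frac{1}{2}\delta\psi\,\partial_r\psi\,n^r \\ &\ge J_\mu^N n^\mu - \epsilon\delta(\partial_v\psi)^2 - \epsilon\delta(\partial_r\psi)^2 - \frac{\delta}{\epsilon}\psi^2 \ge \frac{1}{2}J_\mu^N n^\mu - \frac{\delta}{\epsilon}\psi^2 \end{aligned} \]

for a sufficiently small $\epsilon$. The result follows from the first Hardy inequality.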

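Corollary 10.1. There exists a constant C > 0 which depends on M and $\Sigma_0$ such that for all functions ψ,

\[ \int_{\Sigma_\tau}J_\mu^{N,\delta,-\frac{1}{2}}[\psi]n^\mu \le C\int_{\Sigma_\tau}J_\mu^N[\psi]n^\mu. \tag{10.8} \]

Proof. Note that

\[ J_\mu^{N,\delta,-\frac{1}{2}}n^\mu = J_\mu^N n^\mu - \frac{1}{2}\delta\psi\,\partial_\mu\psi\,n^\mu \le J_\mu^N n^\mu + \delta(\partial_v\psi)^2 + \delta(\partial_r\psi)^2 + C\delta\psi^2, \]

and use the first Hardy inequality.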

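10.4. Lower estimate for an integral over $\mathcal{H}^+$. As regards the integral over $\mathcal{H}^+$, we have

Proposition 10.4.1. For all functions ψ and any $\epsilon > 0$ we have

\[ \int_{\mathcal{H}^+}J_\mu^{N,\delta,-\frac{1}{2}}[\psi]n_{\mathcal{H}^+}^\mu \ge \int_{\mathcal{H}^+}J_\mu^N[\psi]n_{\mathcal{H}^+}^\mu - C_\epsilon\int_{\Sigma_\tau}J_\mu^T[\psi]n_{\Sigma_\tau}^\mu - \epsilon\int_{\Sigma_\tau}J_\mu^N[\psi]n_{\Sigma_\tau}^\mu, \]

where $C_\epsilon$ depends on M, $\Sigma_0$ and $\epsilon$.

Proof. Recall our convention $n_{\mathcal{H}^+} = T$. On $\mathcal{H}^+$ we have δ = 1. Therefore, $J_\mu^{N,\delta,-\frac{1}{2}}n_{\mathcal{H}^+}^\mu = J_\mu^N n_{\mathcal{H}^+}^\mu - \frac{1}{2}\psi\,\partial_v\psi$. However,

\[ -2\int_{\mathcal{H}^+}\psi\,\partial_v\psi = -\int_{\mathcal{H}^+}\partial_v\psi^2 = \int_{\mathcal{H}^+\cap\Sigma_0}\psi^2 - \int_{\mathcal{H}^+\cap\Sigma_\tau}\psi^2. \]

From the first and second Hardy inequality we have

\[ \int_{\mathcal{H}^+\cap\Sigma}\psi^2 \le C_\epsilon\int_{\Sigma}J_\mu^T n_\Sigma^\mu + \epsilon\int_{\Sigma}\left((\partial_v\psi)^2 + (\partial_r\psi)^2\right) \le C_\epsilon\int_{\Sigma}J_\mu^T n_\Sigma^\mu + \epsilon\int_{\Sigma}J_\mu^N n_\Sigma^\mu, \]

which completes the proof.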

11. Uniform Boundedness of Local Observer's Energy

We have all tools in place in order to prove the following theorem.

Theorem 11.1. There exists a constant C > 0 which depends on M and $\Sigma_0$ such that for all solutions ψ of the wave equation

\[ \int_{\Sigma_\tau}J_\mu^N[\psi]n_{\Sigma_\tau}^\mu + \int_{\mathcal{H}^+}J_\mu^N[\psi]n_{\mathcal{H}^+}^\mu \le C\int_{\Sigma_0}J_\mu^N[\psi]n_{\Sigma_0}^\mu. \tag{11.1} \]

Proof. Stokes' theorem for the current $J_\mu^{N,\delta,-\frac{1}{2}}$ in region $\mathcal{R}(0,\tau)$ gives us

\[ \int_{\Sigma_\tau}J_\mu^{N,\delta,-\frac{1}{2}}n_{\Sigma_\tau}^\mu + \int_{\mathcal{R}}K^{N,\delta,-\frac{1}{2}} + \int_{\mathcal{H}^+}J_\mu^{N,\delta,-\frac{1}{2}}n_{\mathcal{H}^+}^\mu = \int_{\Sigma_0}J_\mu^{N,\delta,-\frac{1}{2}}n_{\Sigma_0}^\mu. \]

First observe that the right hand side is controlled by the right hand side of (10.8). As regards the left hand side, the boundary integrals can be estimated using Propositions 10.3.1 and 10.4.1. The spacetime term is non-negative (and thus has the right sign) in region $\mathcal{A}_N$, vanishes in region $\mathcal{C}_N$ and can be estimated in the spatially compact region $\mathcal{B}_N$ (which does not contain the photon sphere) by the X estimate (9.5) and Morawetz estimate (9.7). The result follows from the boundedness of T-flux through $\Sigma_\tau$.

Corollary 11.1. There exists a constant C > 0 which depends on M and $\Sigma_0$ such that for all solutions ψ of the wave equation

\[ \int_{\mathcal{A}_N}K^{N,-\frac{1}{2}}[\psi] \le C\int_{\Sigma_0}J_\mu^N[\psi]n_{\Sigma_0}^\mu. \tag{11.2} \]

Theorem 2 of Sect. 4 is then implied by (11.1), Proposition 10.2.1 and the above corollary. Note also that Corollary 11.1 and estimate (9.6) give us a spacetime integral where the only weight that locally degenerates (to first order) is that of the derivative transversal to $\mathcal{H}^+$. Recall that in the subextreme Reissner-Nordström case there is no such degeneration. In the next section, this degeneracy is eliminated provided $\psi_0 = 0$. This condition is necessary as is shown in Sect. 12.1. One application of the above theorem is the following Morawetz estimate which does not degenerate at $\mathcal{H}^+$.


Proposition 11.0.2. There exists a constant C > 0 which depends on M and $\Sigma_0$ such that for all solutions ψ of the wave equation

\[ \int_{\mathcal{A}_N}\psi^2 \le C\int_{\Sigma_0}J_\mu^N[\psi]n_{\Sigma_0}^\mu. \tag{11.3} \]

Proof. The third Hardy inequality gives us

\[ \int_{\mathcal{A}_N}\psi^2 \le C\int_{\mathcal{B}_N}\psi^2 + C\int_{\mathcal{A}_N\cup\mathcal{B}_N}D\left((\partial_v\psi)^2 + (\partial_r\psi)^2\right), \]

where C is a uniform positive constant that depends only on M. The integral over $\mathcal{B}_N$ of $\psi^2$ can be estimated using (9.7) and the last integral on the right hand side can be estimated using (9.5), Proposition 10.2.1 and Corollary 11.1.

The above proposition in conjunction with Remark 9.2 and Theorem 9.3 implies statement (1) of Theorem 1 of Sect. 4.

Remark 11.1. Note that all the above estimates hold if we replace the foliation $\Sigma_\tau$ with the foliation $\tilde{\Sigma}_\tau$ which terminates at $\mathcal{I}^+$ since the only difference is a boundary integral over $\mathcal{I}^+$ of the right sign that arises every time we apply Stokes' theorem. The remaining local estimates are exactly the same.

12. Commuting with a Vector Field Transversal to $\mathcal{H}^+$

As we have noticed, the estimate (11.2) degenerates with respect to the transversal derivative $\partial_r$ to $\mathcal{H}^+$. In order to remove this degeneracy we commute the wave equation $\Box_g\psi = 0$ with $\partial_r$ aiming at additionally controlling all the second derivatives of ψ (on the spacelike hypersurfaces and the spacetime region up to and including the horizon $\mathcal{H}^+$). Such commutations first appeared in [23].

12.1. The spherically symmetric case. Let us first consider spherically symmetric waves. We have the following

Proposition 12.1.1. For all spherically symmetric solutions ψ to the wave equation the quantity

\[ \partial_r\psi + \frac{1}{M}\psi \tag{12.1} \]

is conserved along $\mathcal{H}^+$. Therefore, for generic initial data this quantity does not decay.

Proof. Since ψ solves $\Box_g\psi = 0$ and since $\slashed{\Delta}\psi = 0$, we have $\partial_v\partial_r\psi + \frac{1}{M}\partial_v\psi = 0$ on $\mathcal{H}^+$ and, since $\partial_v$ is tangential to $\mathcal{H}^+$, this implies that $\partial_r\psi + \frac{1}{M}\psi$ remains constant along $\mathcal{H}^+$, and clearly for generic initial data it is not equal to zero.

By projecting on the zeroth spherical harmonic we deduce that for a generic wave ψ either $\psi^2$ or $(\partial_r\psi)^2$ does not decay along $\mathcal{H}^+$, completing the proof of Theorem 5 of Sect. 4.

The nature of the above proposition (which turns out to be of fundamental importance for extreme black hole spacetimes) will be investigated in great detail in the companion papers [3,4], where, in particular, a similar non-decay result is shown to hold even for “generic” waves whose zeroth spherical harmonic vanishes. Clearly, this proposition indicates that if one is to commute with $\partial_r$ then one must exclude the spherically symmetric waves and thus consider only angular frequencies l ≥ 1. Indeed, we next obtain the sharpest possible result.
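The conservation law of Proposition 12.1.1 can be traced to the form of the wave operator on the horizon. The following is a sketch under stated assumptions: the metric is taken in ingoing coordinates as $g = -D\,dv^2 + 2\,dv\,dr + r^2 g_{\mathbb{S}^2}$ with $D = (1 - M/r)^2$, which is not restated in this section, and $R = D' + \frac{2D}{r}$ is the function that appears in Sect. 12.2.

```latex
\begin{aligned}
\Box_g\psi &= D\,\partial_r\partial_r\psi + 2\,\partial_v\partial_r\psi
            + \frac{2}{r}\,\partial_v\psi + R\,\partial_r\psi + \slashed{\Delta}\psi ,
            \qquad R = D' + \frac{2D}{r}, \\[4pt]
\Box_g\psi\big|_{r=M} &= 2\,\partial_v\partial_r\psi + \frac{2}{M}\,\partial_v\psi + \slashed{\Delta}\psi
            \qquad (\text{since } D(M) = D'(M) = R(M) = 0), \\[4pt]
\slashed{\Delta}\psi = 0 \;&\Longrightarrow\;
            \partial_v\!\left(\partial_r\psi + \frac{1}{M}\psi\right) = 0
            \quad\text{on } \mathcal{H}^+ .
\end{aligned}
```

The double zero of D at $r = M$ is what removes both the $\partial_r\partial_r\psi$ and the $\partial_r\psi$ terms on the horizon. In the subextreme case $D'(r_+) \neq 0$, so the $R\,\partial_r\psi$ term survives and no such conserved quantity arises from this computation.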


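12.2. Commutation with the vector field $\partial_r$. We compute the commutator $\left[\Box_g, \partial_r\right]$. First note that $\left[\slashed{\Delta}, \partial_r\right]\psi = \frac{2}{r}\slashed{\Delta}\psi$. Therefore, if $R = D' + \frac{2D}{r}$, $D' = \frac{dD}{dr}$, then

\[ \left[\Box_g, \partial_r\right]\psi = -D'\,\partial_r\partial_r\psi + \frac{2}{r^2}\partial_v\psi - R'\,\partial_r\psi + \frac{2}{r}\slashed{\Delta}\psi, \tag{12.2} \]

where $R' = \frac{dR}{dr}$.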

12.3. The multiplier L and the energy identity. For any solution ψ of the wave equation we have complete control of the second order derivatives of ψ away from H+ since μ μ T ∂a ∂b ψ2L 2 ( ∩{M
0

0

0

/ }. Note that we have used elliptic where C depends on M, r0 and 0 and a, b ∈ {v, r, ∇ estimates since T is timelike away from H+ (the zeroth order terms can be estimated by the first Hardy inequality). Similarly, away from H+ , we have control of the bulk integrals of the second order derivatives since (9.7) and local elliptic estimates imply μ μ 2 T ∂a ∂b ψ L 2 (R(0,τ )∩{M
1

0

0

(12.4) The above estimates will be required for bounding several spacetime error terms away from H+ . In order to estimate the second order derivatives of ψ in a neighbourhood of H+ we will construct an appropriate future directed timelike φτT -invariant vector field L = L v ∂v + L r ∂r , which will be used as our multiplier. Since we are interested in the region M ≤ r ≤ r0 < r1 , L will be such that L = 0 in r ≥ r1 and L timelike in the region M ≤ r < r1 . Note that r0 , r1 are constants to be determined later on. The regions A, B are depicted below

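For simplicity we will write $\mathcal{R}$ instead of $\mathcal{R}(0,\tau)$, $\mathcal{A}$ instead of $\mathcal{A}(0,\tau)$, etc. The “energy” identity for the current $J_\mu^L[\partial_r\psi]$ is

\[ \int_{\Sigma_\tau}J_\mu^L[\partial_r\psi]n_{\Sigma_\tau}^\mu + \int_{\mathcal{R}}\nabla^\mu J_\mu^L[\partial_r\psi] + \int_{\mathcal{H}^+}J_\mu^L[\partial_r\psi]n_{\mathcal{H}^+}^\mu = \int_{\Sigma_0}J_\mu^L[\partial_r\psi]n_{\Sigma_0}^\mu. \tag{12.5} \]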


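The right-hand side is controlled by the initial data and thus bounded. Also since L is timelike in the compact region $\mathcal{A}$ we have from (A.1) of Appendix A.2:

\[ J_\mu^L[\partial_r\psi]n_{\Sigma_\tau}^\mu \sim (\partial_v\partial_r\psi)^2 + (\partial_r\partial_r\psi)^2 + |\slashed{\nabla}\partial_r\psi|^2, \tag{12.6} \]

where the constants in ∼ depend on M, $\Sigma_0$ and L. Furthermore, on $\mathcal{H}^+$ we have

\[ J_\mu^L[\partial_r\psi]n_{\mathcal{H}^+}^\mu = L^v(M)(\partial_v\partial_r\psi)^2 - \frac{L^r(M)}{2}|\slashed{\nabla}\partial_r\psi|^2. \tag{12.7} \]

Therefore, the term that remains to be understood is the bulk integral. Since $\partial_r\psi$ does not satisfy the wave equation, we have

\[ \nabla^\mu J_\mu^L[\partial_r\psi] = K^L[\partial_r\psi] + \mathcal{E}^L[\partial_r\psi] = K^L[\partial_r\psi] + \left(\Box_g(\partial_r\psi)\right)L(\partial_r\psi). \]

We know that

\[ K^L[\partial_r\psi] = F_{vv}(\partial_v\partial_r\psi)^2 + F_{rr}(\partial_r\partial_r\psi)^2 + F_{\slashed{\nabla}}|\slashed{\nabla}\partial_r\psi|^2 + F_{vr}(\partial_v\partial_r\psi)(\partial_r\partial_r\psi), \]

where the coefficients are as in (10.2). In view of Eq. (12.2) we have

\[ \begin{aligned} \mathcal{E}^L[\partial_r\psi] = {} & -D'L^r(\partial_r\partial_r\psi)^2 - D'L^v(\partial_v\partial_r\psi)(\partial_r\partial_r\psi) - R'L^v(\partial_v\partial_r\psi)(\partial_r\psi) \\ & + 2\frac{L^v}{r^2}(\partial_v\partial_r\psi)(\partial_v\psi) + 2\frac{L^r}{r^2}(\partial_r\partial_r\psi)(\partial_v\psi) - R'L^r(\partial_r\partial_r\psi)(\partial_r\psi) \\ & + 2\frac{L^v}{r}(\partial_v\partial_r\psi)\slashed{\Delta}\psi + 2\frac{L^r}{r}(\partial_r\partial_r\psi)\slashed{\Delta}\psi. \end{aligned} \]

Therefore, we can write

\[ \begin{aligned} \nabla^\mu J_\mu^L[\partial_r\psi] = {} & H_1(\partial_v\partial_r\psi)^2 + H_2(\partial_r\partial_r\psi)^2 + H_3|\slashed{\nabla}\partial_r\psi|^2 + H_4(\partial_v\partial_r\psi)(\partial_v\psi) \\ & + H_5(\partial_v\partial_r\psi)(\partial_r\psi) + H_6(\partial_r\partial_r\psi)(\partial_v\psi) + H_7(\partial_v\partial_r\psi)\slashed{\Delta}\psi \\ & + H_8(\partial_r\partial_r\psi)\slashed{\Delta}\psi + H_9(\partial_v\partial_r\psi)(\partial_r\partial_r\psi) + H_{10}(\partial_r\partial_r\psi)(\partial_r\psi), \end{aligned} \]

where the coefficients $H_i$, $i = 1, \ldots, 10$ are given by

\[ \begin{aligned} & H_1 = \partial_r L^v, \quad H_2 = D\left[\frac{(\partial_r L^r)}{2} - \frac{L^r}{r}\right] - \frac{3D'}{2}L^r, \quad H_3 = -\frac{1}{2}\partial_r L^r, \\ & H_4 = 2\frac{L^v}{r^2}, \quad H_5 = -L^v R', \quad H_6 = 2\frac{L^r}{r^2}, \quad H_7 = 2\frac{L^v}{r}, \quad H_8 = 2\frac{L^r}{r}, \\ & H_9 = D\,\partial_r L^v - D'L^v - 2\frac{L^r}{r}, \quad H_{10} = -L^r R'. \end{aligned} \tag{12.8} \]

Since L is a future directed timelike vector field we have $L^v(r) > 0$ and $L^r(r) < 0$ near $\mathcal{H}^+$. By taking $\partial_r L^v(M)$ sufficiently large we can make $H_1$ positive close to the horizon $\mathcal{H}^+$. Also since the term D vanishes on the horizon to second order and the terms R, $D'$ to first order and since $L^r(M) < 0$, the coefficient $H_2$ is positive close to $\mathcal{H}^+$ (and vanishes to first order on it). For the same reason we have $H_9 D \le \frac{H_2}{10}$ and $(H_9 R)^2 \le \frac{H_2}{10}$ close to $\mathcal{H}^+$. Moreover, by taking $-\partial_r L^r(M)$ sufficiently large we can also make the coefficient $H_3$ positive close to $\mathcal{H}^+$ such that $H_9 < \frac{H_3}{10}$. Indeed, it suffices to consider L such that $-\frac{L^r(M)}{M} < -\frac{\partial_r L^r(M)}{25}$, and then by continuity we have the previous inequality close to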


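$\mathcal{H}^+$. Therefore, we consider $M < r_0 < 2M$ such that in region $\mathcal{A} = \{M \le r \le r_0\}$ we have

\[ \begin{aligned} & L^v > 1, \quad \partial_r L^v > 1, \quad -L^r > 1, \quad H_1 > 1, \quad H_2 \ge 0, \quad H_3 > 1, \\ & H_8 < \frac{H_3}{10}, \quad H_9 D \le \frac{H_2}{10}, \quad (H_9 R)^2 \le \frac{H_2}{10}, \quad H_9 < \frac{H_3}{10}. \end{aligned} \tag{12.9} \]

Clearly, $r_0$ depends only on M and the precise choice for L close to $\mathcal{H}^+$. In order to define L globally, we just extend $L^v$ and $L^r$ such that

\[ L^v > 0 \text{ for all } r < r_1 \text{ and } L^v = 0 \text{ for all } r \ge r_1, \qquad -L^r > 0 \text{ for all } r < r_1 \text{ and } L^r = 0 \text{ for all } r \ge r_1, \]

for some $r_1$ such that $r_0 < r_1 < 2M$. Again, $r_1$ depends only on M (and the precise choice for L). Clearly, L depends only on M and thus all the functions that involve the components of L depend only on M.

12.4. Estimates of the spacetime integrals. It suffices to estimate the remaining 7 integrals with coefficients $H_i$ with $i = 4, \ldots, 10$. Note that all these coefficients do not vanish on the horizon. We will prove that each of these integrals can be estimated by the N flux for ψ and Tψ and a small (epsilon) portion of the good terms in $K^L[\partial_r\psi]$. First we prove the following propositions.

Proposition 12.4.1. For all solutions ψ of the wave equation and any positive number $\epsilon$ we have

\[ \int_{\mathcal{A}}(\partial_r\psi)^2 \le \epsilon\int_{\mathcal{A}}\left(H_1(\partial_v\partial_r\psi)^2 + H_2(\partial_r\partial_r\psi)^2\right) + C_\epsilon\int_{\Sigma_0}J_\mu^N[\psi]n_{\Sigma_0}^\mu + C_\epsilon\int_{\Sigma_0}J_\mu^N[T\psi]n_{\Sigma_0}^\mu, \]

where the constant $C_\epsilon$ depends on M, $\Sigma_0$ and $\epsilon$.

Proof. By applying the third Hardy inequality for the regions $\mathcal{A}$, $\mathcal{B}$ we obtain

\[ \int_{\mathcal{A}}(\partial_r\psi)^2 \le C\int_{\mathcal{B}}(\partial_r\psi)^2 + C\int_{\mathcal{A}\cup\mathcal{B}}D\left((\partial_v\partial_r\psi)^2 + (\partial_r\partial_r\psi)^2\right), \]

where the constant C depends on M and $\Sigma_0$. Moreover, since CD vanishes to second order at $\mathcal{H}^+$, there exists $r_\epsilon$ with $M < r_\epsilon \le r_0$ such that in the region $\{M \le r \le r_\epsilon\}$ we have $CD < \epsilon H_1$ and $CD \le \epsilon H_2$. Therefore,

\[ \int_{\mathcal{A}}(\partial_r\psi)^2 \le C\int_{\mathcal{B}}(\partial_r\psi)^2 + \epsilon\int_{\{M \le r \le r_\epsilon\}}\left(H_1(\partial_v\partial_r\psi)^2 + H_2(\partial_r\partial_r\psi)^2\right) + C\int_{\{r_\epsilon \le r \le r_1\}}D\left((\partial_v\partial_r\psi)^2 + (\partial_r\partial_r\psi)^2\right). \]

The first integral on the right hand side is estimated using (9.5) and the last integral using the local elliptic estimate (12.4), since T is timelike in region $\{r_\epsilon \le r \le r_1\}$ because $M < r_\epsilon$. The result follows from the inclusion $\{M \le r \le r_\epsilon\} \subseteq \mathcal{A}$ and the non-negativity of $H_i$, $i = 1, 2$ in $\mathcal{A}$.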


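Proposition 12.4.2. For all solutions ψ of the wave equation and any positive number $\epsilon$ we have

\[ \left|\int_{\mathcal{H}^+}(\partial_v\psi)(\partial_r\psi)\right| \le C_\epsilon\int_{\Sigma_0}J_\mu^N[\psi]n_{\Sigma_0}^\mu + \epsilon\int_{\Sigma_0\cup\Sigma_\tau}J_\mu^L[\partial_r\psi]n^\mu, \]

where the positive constant $C_\epsilon$ depends on M, $\Sigma_0$ and $\epsilon$.

Proof. Integrating by parts gives us

\[ \int_{\mathcal{H}^+}(\partial_v\psi)(\partial_r\psi) = -\int_{\mathcal{H}^+}\psi(\partial_v\partial_r\psi) + \int_{\mathcal{H}^+\cap\Sigma_\tau}\psi(\partial_r\psi) - \int_{\mathcal{H}^+\cap\Sigma_0}\psi(\partial_r\psi). \]

Since ψ solves the wave equation, it satisfies $-\partial_v\partial_r\psi = \frac{1}{M}(\partial_v\psi) + \frac{1}{2}\slashed{\Delta}\psi$ on $\mathcal{H}^+$. Therefore,

\[ \begin{aligned} -\int_{\mathcal{H}^+}\psi(\partial_v\partial_r\psi) &= \frac{1}{M}\int_{\mathcal{H}^+}\psi(\partial_v\psi) + \frac{1}{2}\int_{\mathcal{H}^+}\psi(\slashed{\Delta}\psi) \\ &= \frac{1}{2M}\int_{\mathcal{H}^+\cap\Sigma_\tau}\psi^2 - \frac{1}{2M}\int_{\mathcal{H}^+\cap\Sigma_0}\psi^2 - \frac{1}{2}\int_{\mathcal{H}^+}|\slashed{\nabla}\psi|^2. \end{aligned} \]

All the integrals on the right hand side can be estimated by $\int_{\Sigma_0}J_\mu^N[\psi]n_{\Sigma_0}^\mu$, using the second Hardy inequality and (11.1). Furthermore,

\[ \int_{\mathcal{H}^+\cap\Sigma}\psi(\partial_r\psi) \le \int_{\mathcal{H}^+\cap\Sigma}\psi^2 + \int_{\mathcal{H}^+\cap\Sigma}(\partial_r\psi)^2. \]

From the first and second Hardy inequality we have

\[ \int_{\mathcal{H}^+\cap\Sigma}\psi^2 \le C\int_{\Sigma_0}J_\mu^N[\psi]n_{\Sigma_0}^\mu, \]

where C depends on M and $\Sigma_0$. In addition, from the second Hardy inequality we have that for any positive number $\epsilon$ there exists a constant $C_\epsilon$ which depends on M, $\Sigma_0$ and $\epsilon$ such that

\[ \int_{\mathcal{H}^+\cap\Sigma}(\partial_r\psi)^2 \le C_\epsilon\int_{\Sigma}J_\mu^N[\psi]n_\Sigma^\mu + \epsilon\int_{\Sigma}J_\mu^L[\partial_r\psi]n_\Sigma^\mu. \]

Proposition 12.4.3. For all solutions ψ of the wave equation and any positive number $\epsilon$ we have

\[ \left|\int_{\mathcal{H}^+}(\slashed{\Delta}\psi)(\partial_r\psi)\right| \le C_\epsilon\int_{\Sigma_0}J_\mu^N[\psi]n_{\Sigma_0}^\mu + \epsilon\int_{\Sigma_0\cup\Sigma_\tau}J_\mu^L[\partial_r\psi]n^\mu, \]

where the positive constant $C_\epsilon$ depends on M, $\Sigma_0$ and $\epsilon$.

Proof. In view of the wave equation on $\mathcal{H}^+$ we have

\[ \int_{\mathcal{H}^+}(\slashed{\Delta}\psi)(\partial_r\psi) = -\frac{2}{M}\int_{\mathcal{H}^+}(\partial_v\psi)(\partial_r\psi) - 2\int_{\mathcal{H}^+}(\partial_v\partial_r\psi)(\partial_r\psi). \]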


The first integral on the right hand side can be estimated using Proposition 12.4.2. For the second integral we have 2 2 2 (∂v ∂r ψ)(∂r ψ) = ∂v (∂r ψ) = (∂r ψ) − (∂r ψ)2 H+ H+ H+ ∩τ H+ ∩0 μ μ μ N L ≤ C Jμ [ψ]n 0 + Jμ [∂r ψ]n 0 + JμL [∂r ψ]n τ , 0

0

τ

where, as above, is any positive number and C depends on M, 0 and . Estimate for R H4 (∂v ∂r ψ) (∂v ψ).

For any > 0 we have 2 H4 (∂v ∂r ψ) (∂v ψ) ≤ ∂ ψ) + −1 H42 (∂v ψ)2 (∂ v r A A 0 μ 2 ≤ JμN [ψ]n 0 , (∂v ∂r ψ) + C A

0

where the constant C depends only on M,0 and . Estimate for R H5 (∂v ∂r ψ) (∂r ψ).

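As above, for any $\epsilon > 0$,
\[
\int_{\mathcal{A}}H_5(\partial_v\partial_r\psi)(\partial_r\psi) \le \epsilon\int_{\mathcal{A}}(\partial_v\partial_r\psi)^2 + \epsilon^{-1}\int_{\mathcal{A}}H_5^2(\partial_r\psi)^2 \le \epsilon\int_{\mathcal{A}}(\partial_v\partial_r\psi)^2 + m\int_{\mathcal{A}}(\partial_r\psi)^2,
\]
where $m = \max_{\mathcal{A}}\epsilon^{-1}H_5^2$. Then from Proposition 12.4.1 (where $\epsilon$ is replaced with $\epsilon/m$) we obtain
\[
\int_{\mathcal{A}}H_5(\partial_v\partial_r\psi)(\partial_r\psi) \le \epsilon\int_{\mathcal{A}}(\partial_v\partial_r\psi)^2 + \epsilon\int_{\mathcal{A}}H_1(\partial_v\partial_r\psi)^2 + \epsilon\int_{\mathcal{A}}H_2(\partial_r\partial_r\psi)^2 + C_\epsilon\int_{\Sigma_0}J^N_\mu[\psi]n^\mu_{\Sigma_0} + C_\epsilon\int_{\Sigma_0}J^N_\mu[T\psi]n^\mu_{\Sigma_0}, \qquad (12.10)
\]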

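where $C_\epsilon$ depends on $M$, $\Sigma_0$ and $\epsilon$.

Estimate for $\int_{\mathcal{R}}H_6(\partial_r\partial_r\psi)(\partial_v\psi)$.

Since $\mathrm{Div}\,\partial_r = \frac{2}{r}$, Stokes' theorem yields
\[
\int_{\mathcal{R}}H_6(\partial_v\psi)(\partial_r\partial_r\psi) + \int_{\mathcal{R}}\partial_r(H_6\partial_v\psi)(\partial_r\psi) + \int_{\mathcal{R}}\frac{2}{r}H_6(\partial_v\psi)(\partial_r\psi)
\]
\[
= \int_{\Sigma_0}H_6(\partial_v\psi)(\partial_r\psi)\,\partial_r\cdot n_{\Sigma_0} - \int_{\Sigma_\tau}H_6(\partial_v\psi)(\partial_r\psi)\,\partial_r\cdot n_{\Sigma_\tau} - \int_{\mathcal{H}^+}H_6(\partial_v\psi)(\partial_r\psi)\,\partial_r\cdot n_{\mathcal{H}^+}.
\]
The boundary term over $\Sigma_\tau$ can be estimated by the $N$-flux whereas the boundary integral over $\mathcal{H}^+$ can be estimated using Proposition 12.4.2. If $Q = \partial_rH_6 + \frac{2}{r}H_6$, then it remains to estimate
\[
\int_{\mathcal{R}}Q(\partial_v\psi)(\partial_r\psi) + \int_{\mathcal{R}}H_6(\partial_v\partial_r\psi)(\partial_r\psi).
\]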

Linear Stability and Instability of Extreme Reissner-Nordström I


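The first integral can be estimated using Cauchy-Schwarz, (11.2) and Proposition 12.4.1. The second integral is estimated by (12.10) (where $H_5$ is replaced with $H_6$).

Estimate for $\int_{\mathcal{R}}H_7(\partial_v\partial_r\psi)\slashed{\Delta}\psi$.

Stokes' theorem in region $\mathcal{R}$ gives us
\[
\int_{\mathcal{R}}H_7(\partial_v\partial_r\psi)\slashed{\Delta}\psi + \int_{\mathcal{R}}H_7(\partial_r\psi)(\slashed{\Delta}\partial_v\psi) = \int_{\Sigma_0}H_7(\partial_r\psi)(\slashed{\Delta}\psi)\,\partial_v\cdot n_{\Sigma_0} - \int_{\Sigma_\tau}H_7(\partial_r\psi)(\slashed{\Delta}\psi)\,\partial_v\cdot n_{\Sigma_\tau}.
\]
For the boundary integrals we have the estimate
\[
\int_{\mathcal{A}\cap\Sigma}H_7(\partial_r\psi)\slashed{\Delta}\psi = -\int_{\mathcal{A}\cap\Sigma}H_7\slashed{\nabla}\partial_r\psi\cdot\slashed{\nabla}\psi \le \epsilon\int_{\mathcal{A}\cap\Sigma}J^L_\mu[\partial_r\psi]n^\mu_{\Sigma} + C_\epsilon\int_{\Sigma}J^N_\mu[\psi]n^\mu_{\Sigma},
\]
where $C_\epsilon$ depends on $M$, $\Sigma_0$ and $\epsilon$. As regards the second bulk integral, after applying Stokes' theorem on $\mathbb{S}^2(r)$ we obtain
\[
\int_{\mathcal{A}}H_7\slashed{\nabla}\partial_r\psi\cdot\slashed{\nabla}\partial_v\psi \le \epsilon\int_{\mathcal{A}}|\slashed{\nabla}\partial_r\psi|^2 + C_\epsilon\int_{\mathcal{A}}|\slashed{\nabla}\partial_v\psi|^2,
\]
where $C_\epsilon$ depends on $M$, $\Sigma_0$ and $\epsilon$. Note that the second integral on the right hand side can be bounded by $\int_{\Sigma_0}J^N_\mu[T\psi]n^\mu_{\Sigma_0}$. Indeed, we commute with $T$ and use Proposition 11.1. (Another way, without having to commute with $T$, is by solving with respect to $\slashed{\Delta}\psi$ in the wave equation. As we shall see, this will be crucial in obtaining higher order estimates without losing derivatives.)

Estimate for $\int_{\mathcal{R}}H_8(\partial_r\partial_r\psi)\slashed{\Delta}\psi$.

We have
\[
\int_{\mathcal{R}}\partial_r(H_8\slashed{\Delta}\psi)(\partial_r\psi) + \int_{\mathcal{R}}H_8(\slashed{\Delta}\psi)(\partial_r\partial_r\psi) + \int_{\mathcal{R}}\frac{2}{r}H_8(\slashed{\Delta}\psi)(\partial_r\psi)
\]
\[
= \int_{\Sigma_0}H_8(\slashed{\Delta}\psi)(\partial_r\psi)\,\partial_r\cdot n_{\Sigma_0} - \int_{\Sigma_\tau}H_8(\slashed{\Delta}\psi)(\partial_r\psi)\,\partial_r\cdot n_{\Sigma_\tau} - \int_{\mathcal{H}^+}H_8(\slashed{\Delta}\psi)(\partial_r\psi)\,\partial_r\cdot n_{\mathcal{H}^+}.
\]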

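The integral over $\mathcal{H}^+$ is estimated in Proposition 12.4.3. Furthermore, the Cauchy-Schwarz inequality implies
\[
\int_{\Sigma_\tau\cap\mathcal{A}}H_8(\slashed{\Delta}\psi)(\partial_r\psi)\,\partial_r\cdot n_{\Sigma_\tau} = -\int_{\Sigma_\tau\cap\mathcal{A}}H_8(\slashed{\nabla}\psi\cdot\slashed{\nabla}\partial_r\psi)\,\partial_r\cdot n_{\Sigma_\tau} \le C_\epsilon\int_{\Sigma_0}J^N_\mu[\psi]n^\mu_{\Sigma_0} + \epsilon\int_{\Sigma_0}J^L_\mu[\partial_r\psi]n^\mu_{\Sigma_0},
\]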

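where $\epsilon$ is any positive number and $C_\epsilon$ depends only on $M$, $\Sigma_0$ and $\epsilon$. For the remaining two spacetime integrals we have
\[
\int_{\mathcal{A}}\partial_r(H_8\slashed{\Delta}\psi)(\partial_r\psi) + \int_{\mathcal{A}}\frac{2}{r}H_8(\slashed{\Delta}\psi)(\partial_r\psi) = \int_{\mathcal{A}}-H_8|\slashed{\nabla}\partial_r\psi|^2 + \int_{\mathcal{A}}-\partial_rH_8(\slashed{\nabla}\psi\cdot\slashed{\nabla}\partial_r\psi).
\]
The conditions (12.9) assure us that $|H_8| < \frac{H_3}{10}$ and thus the first integral above can be estimated. For the second integral we apply the Cauchy-Schwarz inequality.

Estimate for $\int_{\mathcal{R}}H_9(\partial_v\partial_r\psi)(\partial_r\partial_r\psi)$.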

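In view of the wave equation we have $\partial_v\partial_r\psi = -\frac{D}{2}\partial_r\partial_r\psi - \frac{1}{r}\partial_v\psi - \frac{R}{2}\partial_r\psi - \frac{1}{2}\slashed{\Delta}\psi$. Therefore,
\[
H_9(\partial_v\partial_r\psi)(\partial_r\partial_r\psi) = -H_9\left[\frac{D}{2}(\partial_r\partial_r\psi)^2 + \frac{1}{r}(\partial_v\psi)(\partial_r\partial_r\psi) + \frac{R}{2}(\partial_r\psi)(\partial_r\partial_r\psi) + \frac{1}{2}(\slashed{\Delta}\psi)(\partial_r\partial_r\psi)\right].
\]
Note that in $\mathcal{A}$ we have $H_9D \le \frac{H_2}{10}$ and thus the first term on the right hand side poses no problem. Similarly, in $\mathcal{A}$ we have
\[
-H_9\frac{R}{2}(\partial_r\psi)(\partial_r\partial_r\psi) \le (\partial_r\psi)^2 + (H_9R)^2(\partial_r\partial_r\psi)^2, \qquad (H_9R)^2 \le \frac{H_2}{10}.
\]
According to what we have proved above, the integral $\int_{\mathcal{A}}-\frac{H_9}{r}(\partial_v\psi)(\partial_r\partial_r\psi)$ can be estimated (provided we replace $H_6$ with $-\frac{H_9}{r}$). Similarly, we have seen that the integral $\int_{\mathcal{A}}-\frac{H_9}{2}(\partial_r\partial_r\psi)\slashed{\Delta}\psi$ can be estimated provided we have $H_9 \le \frac{H_3}{10}$, which holds by the definition of $L$.

Estimate for $\int_{\mathcal{R}}H_{10}(\partial_r\partial_r\psi)(\partial_r\psi)$.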

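We have
\[
2\int_{\mathcal{R}}H_{10}(\partial_r\partial_r\psi)(\partial_r\psi) + \int_{\mathcal{R}}\left(\partial_rH_{10} + \frac{2}{r}H_{10}\right)(\partial_r\psi)^2 = \int_{\Sigma_0}H_{10}(\partial_r\psi)^2\,\partial_r\cdot n_{\Sigma_0} - \int_{\Sigma_\tau}H_{10}(\partial_r\psi)^2\,\partial_r\cdot n_{\Sigma_\tau} - \int_{\mathcal{H}^+}H_{10}(\partial_r\psi)^2.
\]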

The second spacetime integral can be estimated using Proposition 12.4.1 whereas the boundary integral over $\Sigma_\tau$ can be estimated using the $N$-flux. Finally we need to estimate the integral over $\mathcal{H}^+$. It turns out that this integral is the most problematic. Since $R' = D'' + \frac{2D'}{r} - \frac{2D}{r^2}$ we have $R'(M) = \frac{2}{M^2}$ and thus
\[
H_{10}(M) = -L^r(M)R'(M) = -L^r(M)\frac{2}{M^2} > 0.
\]
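The value $R'(M) = 2/M^2$ can be checked by hand. The short computation below is only a sketch: it assumes the extreme Reissner-Nordström form $D(r) = (1 - M/r)^2$, which is not restated in this section, together with $R = D' + 2D/r$ from Appendix A.1.

```latex
\begin{align*}
D(r)   &= \Big(1-\frac{M}{r}\Big)^2, &
D'(r)  &= \frac{2M}{r^2}\Big(1-\frac{M}{r}\Big), &
D''(r) &= \frac{6M^2}{r^4}-\frac{4M}{r^3},\\
D(M)   &= 0, &
D'(M)  &= 0, &
D''(M) &= \frac{6}{M^2}-\frac{4}{M^2}=\frac{2}{M^2},
\end{align*}
\[
R'(M) = D''(M) + \frac{2D'(M)}{M} - \frac{2D(M)}{M^2} = \frac{2}{M^2}.
\]
```

In particular $D$ and $D'$ both vanish on the horizon $r = M$, so only the $D''$ term contributes.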

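In order to estimate the integral along $\mathcal{H}^+$ we apply the Poincaré inequality
\[
-\int_{\mathcal{H}^+}\frac{H_{10}(M)}{2}(\partial_r\psi)^2 \le \frac{H_{10}(M)}{2}\,\frac{M^2}{l(l+1)}\int_{\mathcal{H}^+}|\slashed{\nabla}\partial_r\psi|^2.
\]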

(12.11)

Therefore, in view of (12.7) it suffices to have
\[
\frac{M^2}{2l(l+1)}H_{10}(M) \le -\frac{L^r(M)}{2} \;\overset{(12.11)}{\Longleftrightarrow}\; l \ge 1.
\]
Note that for $l = 1$ we need to use up all of the good term over $\mathcal{H}^+$ which appears in identity (12.5). This is not by accident. Indeed, in [3] we will show that even for $l = 1$ we have another conservation law on $\mathcal{H}^+$. This law is not implied only by the degeneracy of the horizon (as is the case for the conservation law for $l = 0$ of Sect. 12.1) but also uses an additional property of the metric tensor on $\mathcal{H}^+$. This law then implies that one needs to use up all of precisely this good term over $\mathcal{H}^+$. This is something we can do, since we have not used this term in order to estimate other integrals. That was possible via successive use of Hardy inequalities.

12.5. $L^2$ estimates for the second order derivatives. We can now prove the statement (1) of Theorem 3 of Sect. 4. First note that $L \sim n_{\Sigma_\tau}$ in region $\mathcal{A}$. Therefore, the theorem follows from the estimates (12.3) and (12.4), the estimates that we derived in Sect. 12.4 (by taking $\epsilon$ sufficiently small) and the energy identity (12.5). Note that the right hand side of the estimate of statement (1) of Theorem 3 is bounded by
\[
C\int_{\Sigma_0}J^N_\mu[\psi]n^\mu_{\Sigma_0} + C\int_{\Sigma_0}J^N_\mu[N\psi]n^\mu_{\Sigma_0}.
\]
This means that equivalently we can apply $N$ as multiplier but also as commutator for frequencies $l \ge 1$. As we shall see in the companion paper [3], such commutation does not yield the above estimate for $l = 0$. One can also obtain similar spacetime estimates for spatially compact regions which include $\mathcal{A}$ by commuting with $T$ and $T^2$ and applying $X$ as a multiplier. Note that we need to commute with $T^2$ in view of the photon sphere. Note also that no commutation with the generators of the Lie algebra so(3) is required.

13. Integrated Weighted Energy Decay

We are now in position to eliminate the degeneracy of (11.2).

Proposition 13.0.1. There exists a uniform constant $C > 0$ which depends on $M$ and $\Sigma_0$ such that for all solutions $\psi$ of the wave equation which are supported on the frequencies $l \ge 1$ we have
\[
\int_{\mathcal{A}}(\partial_r\psi)^2 \le C\int_{\Sigma_0}J^N_\mu[\psi]n^\mu_{\Sigma_0} + C\int_{\Sigma_0}J^N_\mu[T\psi]n^\mu_{\Sigma_0} + C\int_{\Sigma_0\cap\mathcal{A}}J^N_\mu[\partial_r\psi]n^\mu_{\Sigma_0}.
\]

Proof. Immediate from the third Hardy inequality, the statement (1) of Theorem 3 and Theorem 9.2. Note that the above proposition shows that in order to obtain this non-degenerate estimate in a neighbourhood of H+ one needs to commute the wave equation with ∂r and thus require higher regularity for ψ. This allows us to conclude that trapping takes place along H+ . Remark that all frequencies exhibit trapped behaviour in contrast to the photon sphere where only high frequencies are trapped. This phenomenon is absent in the non-extreme case. The statement (2) of Theorem 3 of Sect. 4 follows from the above proposition and the results of Sect. 9.

14. Uniform Pointwise Boundedness

It remains to show that all solutions $\psi$ of the wave equation are uniformly bounded.

Proof of Theorem 4 of Sect. 4. We decompose $\psi = \psi_0 + \psi_{\ge 1}$ and prove that each projection is uniformly bounded by a norm that depends only on initial data. Indeed, by applying the following Sobolev inequality on the hypersurfaces $\Sigma_\tau$ we have
\[
\|\psi_{\ge 1}\|_{L^\infty(\Sigma_\tau)} \le C\left(\|\psi_{\ge 1}\|_{\dot{H}^1(\Sigma_\tau)} + \|\psi_{\ge 1}\|_{\dot{H}^2(\Sigma_\tau)} + \lim_{x\to i^0}|\psi_{\ge 1}|\right), \qquad (14.1)
\]
where $C$ depends only on $\Sigma_0$. Note also that we use the Sobolev inequality that does not involve the $L^2$-norms of zeroth order terms. We observe that the vector field $\partial_v - \partial_r$ is timelike since $g(\partial_v - \partial_r, \partial_v - \partial_r) = -D - 2$ and, therefore, by an elliptic estimate there exists a uniform positive constant $C$ which depends on $M$ and $\Sigma_0$ such that
\[
\|\psi_{\ge 1}\|^2_{\dot{H}^1(\Sigma_\tau)} + \|\psi_{\ge 1}\|^2_{\dot{H}^2(\Sigma_\tau)} \le C\int_{\Sigma_\tau}J^N_\mu[\psi_{\ge 1}]n^\mu_{\Sigma_\tau} + C\int_{\Sigma_\tau}J^N_\mu[T\psi_{\ge 1}]n^\mu_{\Sigma_\tau} + C\int_{\Sigma_\tau}J^N_\mu[\partial_r\psi_{\ge 1}]n^\mu_{\Sigma_\tau}.
\]

It remains to derive a pointwise bound for $\psi_0$. However, from the 1-dimensional Sobolev inequality we have
\[
4\pi\psi_0^2(r_0,\omega) \le \frac{C}{r_0}\int_{\Sigma_\tau\cap\{r\ge r_0\}}J^N_\mu[\psi_0]n^\mu_{\Sigma_\tau},
\]
where $C$ is a constant that depends only on $M$ and $\Sigma_0$. This completes the proof of the uniform boundedness of $\psi$.

Acknowledgements. I would like to thank Mihalis Dafermos for introducing the problem to me and for his teaching and advice. I also thank Igor Rodnianski for sharing useful insights. I am supported by a Bodossaki Grant and a Grant from the European Research Council.

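Appendix A. Useful Reissner-Nordström Computations

A.1. The wave operator. The wave operator in $(v, r, \theta, \phi)$ coordinates is
\[
\Box_g\psi = D\partial_r\partial_r\psi + 2\partial_v\partial_r\psi + \frac{2}{r}\partial_v\psi + R\partial_r\psi + \slashed{\Delta}\psi,
\]
where $R = D' + \frac{2D}{r}$ and $D' = \frac{dD}{dr}$. In $(u, v)$ coordinates,
\[
\Box_g\psi = -\frac{4}{Dr}\partial_u\partial_v(r\psi) - \frac{D'}{r}\psi + \slashed{\Delta}\psi.
\]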

A.2. The non-negativity of the energy-momentum tensor T. We use the coordinate system (v, r ) and suppose that V = (V v , V r , 0, 0) , n = (n v , nr , 0, 0). The reader can easily verify that if ξV = 1v 2 (−g (V, V )) and similarly for n then 2(V )

D2 2 v v ξ V · ξn = V n 1− 2 (∂v ψ) + V n (∂r ψ)2 D + 2ξV · ξn 2 ⎛ ⎞2 2 2 + 2ξ · ξ D D 1 V n / ψ|2 + ⎝ + − g (V, n) |∇ · ∂v ψ + · ∂r ψ ⎠ . 2 D 2 + 2ξV · ξn 4 JμV n μ

v v

(A.1)

Appendix B. Stokes' Theorem on Lorentzian Manifolds

If $\mathcal{R}$ is a pseudo-Riemannian manifold and $P$ is a vector field on it then we have the divergence identity
\[
\int_{\mathcal{R}}\nabla_\mu P^\mu = \int_{\partial\mathcal{R}}P\cdot n_{\partial\mathcal{R}}.
\]
Both integrals are taken with respect to the induced volume form. Note that $n_{\partial\mathcal{R}}$ is the unit normal to $\partial\mathcal{R}$ and its direction depends on the convention of the signature of the metric. For Lorentzian metrics with signature $(-,+,+,+)$ the vector $n_{\partial\mathcal{R}}$ is the inward directed unit normal to $\partial\mathcal{R}$ in case $\partial\mathcal{R}$ is spacelike and the outward directed unit normal in case $\partial\mathcal{R}$ is timelike. If $\partial\mathcal{R}$ (or a piece of it) is null then we take a past (future) directed null normal to $\partial\mathcal{R}$ if it is future (past) boundary. The following diagram is embedded in $\mathbb{R}^{1+1}$:

[Figure: diagram of the region and its boundary normals; not reproduced.]

References

1. Alinhac, S.: Geometric analysis of hyperbolic differential equations: An introduction. The London Mathematical Society, Lecture Note Series 374, Cambridge: Cambridge Univ. Press, 2010
2. Andersson, L., Blue, P.: Hidden symmetries and decay for the wave equation on the Kerr spacetime. http://arxiv.org/abs/0908.2265v2 [math.AP], 2009
3. Aretakis, S.: Stability and Instability of Extreme Reissner-Nordström Black Hole Spacetimes for Linear Scalar Perturbations II. Preprint
4. Aretakis, S.: The Price Law for Self-Gravitating Scalar Fields on Extreme Black Hole Spacetimes. Preprint
5. Bičák, J.: Gravitational collapse with charge and small asymmetries I: scalar perturbations. Gen. Rel. Grav. 3(4), 331–349 (1972)
6. Blue, P., Soffer, A.: Semilinear wave equations on the Schwarzschild manifold. I. Local decay estimates. Adv. Diff. Eqs. 8(5), 595–614 (2003)
7. Blue, P., Sterbenz, J.: Uniform decay of local energy and the semi-linear wave equation on Schwarzschild space. Commun. Math. Phys. 268(2), 481–504 (2006)
8. Blue, P., Soffer, A.: Phase space analysis on some black hole manifolds. J. Func. Anal. 256(1), 1–90 (2009)
9. Blue, P., Soffer, A.: Improved decay rates with small regularity loss for the wave equation about a Schwarzschild black hole. http://arxiv.org/abs/math/0612168v1 [math.AP], 2006
10. Christodoulou, D., Klainerman, S.: The Global Nonlinear Stability of the Minkowski Space. Princeton, NJ: Princeton University Press, 1994
11. Christodoulou, D.: On the global initial value problem and the issue of singularities. Class. Quant. Grav. 16(12A), A23–A35 (1999)

12. Christodoulou, D.: The instability of naked singularities in the gravitational collapse of a scalar field. Ann. of Math. 149(1), 183–217 (1999)
13. Christodoulou, D.: The Action Principle and Partial Differential Equations. Princeton, NJ: Princeton University Press, 2000
14. Christodoulou, D.: The Formation of Black Holes in General Relativity. Zurich: European Mathematical Society Publishing House, 2009
15. Chruściel, P.T., Tod, K.P.: The classification of static electro-vacuum space-times containing an asymptotically flat spacelike hypersurface with compact interior. Commun. Math. Phys. 271, 577–589 (2007)
16. Chruściel, P.T., Nguyen, L.: A uniqueness theorem for degenerate Kerr-Newman black holes. http://arxiv.org/abs/1002.1737v1 [gr-qc], 2010
17. Dafermos, M.: Stability and instability of the Cauchy horizon for the spherically symmetric Einstein-Maxwell-scalar field equations. Ann. of Math. 158(3), 875–928 (2003)
18. Dafermos, M.: The interior of charged black holes and the problem of uniqueness in general relativity. Comm. Pure App. Math. 58(4), 445–504 (2005)
19. Dafermos, M., Rodnianski, I.: A proof of Price's law for the collapse of a self-gravitating scalar field. Invent. Math. 162, 381–457 (2005)
20. Dafermos, M., Rodnianski, I.: The redshift effect and radiation decay on black hole spacetimes. Comm. Pure Appl. Math. 62, 859–919 (2009)
21. Dafermos, M., Rodnianski, I.: The wave equation on Schwarzschild-de Sitter spacetimes. http://arxiv.org/abs/0709.2766v1 [gr-qc], 2007
22. Dafermos, M., Rodnianski, I.: A note on energy currents and decay for the wave equation on a Schwarzschild background. http://arxiv.org/abs/0710.0171v1 [math.AP], 2007
23. Dafermos, M., Rodnianski, I.: A proof of the uniform boundedness of solutions to the wave equation on slowly rotating Kerr backgrounds. http://arxiv.org/abs/0805.4309v2 [gr-qc], 2008
24. Dafermos, M., Rodnianski, I.: Lectures on Black Holes and Linear Waves. http://arxiv.org/abs/0811.0354v1 [gr-qc], 2008
25. Dafermos, M.: The evolution problem in general relativity. Current developments in mathematics 2008, Somerville, MA: Int. Press, 2009, pp. 1–66
26. Dafermos, M., Rodnianski, I.: A new physical-space approach to decay for the wave equation with applications to black hole spacetimes. http://arxiv.org/abs/0910.4957v1 [math.AP], 2009, Submitted to Proc. ICMP, Prague, Aug, 2009
27. Dafermos, M., Rodnianski, I.: Decay for solutions of the wave equation on Kerr exterior spacetimes I–II: The cases |a| ≪ M or axisymmetry. http://arxiv.org/abs/1010.5132v1 [gr-qc], 2010
28. Dafermos, M., Rodnianski, I.: The black holes stability problem for linear scalar perturbations. http://arxiv.org/abs/1010.5137v1 [gr-qc], 2010
29. Donninger, R., Schlag, W., Soffer, A.: On pointwise decay of linear waves on a Schwarzschild black hole background. http://arxiv.org/abs/0911.3179v1 [math.AP], 2009
30. Finster, F., Kamran, N., Smoller, J., Yau, S.T.: Decay of solutions of the wave equations in the Kerr geometry. Commun. Math. Phys. 264(2), 221–255 (2008)
31. Graves, J.C., Brill, D.R.: Oscillatory Character of Reissner-Nordström Metric for an Ideal Charged Wormhole. Phys. Rev. 120, 1507–1513 (1960)
32. Hawking, S.W., Ellis, G.F.R.: The large scale structure of spacetime. Cambridge Monographs on Mathematical Physics, No. 1, London-New York: Cambridge University Press, 1973
33. Klainerman, S.: Uniform decay estimates and the Lorentz invariance of the classical wave equation. Commun. Pure Appl. Math. 38, 321–332 (1985)
34. Kronthaler, J.: Decay rates for spherical scalar waves in a Schwarzschild geometry. http://arxiv.org/abs/0709.3703v1 [gr-qc], 2007
35. Lindblad, H., Rodnianski, I.: The global stability of Minkowski spacetime in harmonic gauge. Ann. of Math. 171(3), 1401–1477 (2010)
36. Luk, J.: Improved decay for solutions to the linear wave equation on a Schwarzschild black hole. Ann. H. Poincaré 11, 805–880 (2010)
37. Marolf, D.: The danger of extremes. http://arxiv.org/abs/1005.2999v2 [gr-qc], 2010
38. Marzuola, J., Metcalfe, J., Tataru, D., Tohaneanu, M.: Strichartz estimates on Schwarzschild black hole backgrounds. Commun. Math. Phys. 293(1), 37–83 (2010)
39. Morawetz, C.S.: The limiting amplitude principle. Comm. Pure Appl. Math. 15, 349–361 (1962)
40. Morawetz, C.S.: Time Decay for Nonlinear Klein-Gordon Equation. Proc. Roy. Soc. London 306, 291–296 (1968)
41. Nordström, G.: On the Energy of the Gravitational Field in Einstein's Theory. Verhandl. Koninkl. Ned. Akad. Wetenschap., Afdel. Natuurk., Amsterdam 26, 1201–1208 (1918)
42. Price, R.: Nonspherical perturbations of relativistic gravitational collapse. I. Scalar and gravitational perturbations. Phys. Rev. D 5(3), 2419–2438 (1972)

43. Poisson, E.: A Relativist's Toolkit: The Mathematics of Black-Hole Mechanics. Cambridge: Cambridge University Press, 2004
44. Regge, T., Wheeler, J.: Stability of a Schwarzschild singularity. Phys. Rev. 108, 1063–1069 (1957)
45. Reissner, H.: Über die Eigengravitation des elektrischen Feldes nach der Einstein'schen Theorie. Annalen der Physik 50, 106–120 (1916)
46. Rodnianski, I., Speck, J.: The Stability of the Irrotational Euler-Einstein System with a Positive Cosmological Constant. http://arxiv.org/abs/0911.5501v2 [math-ph], 2010
47. Schlue, V.: Linear waves on higher dimensional Schwarzschild black holes. Rayleigh Smith Knight Essay 2010, University of Cambridge, January 2010
48. Sogge, C.: Lectures on nonlinear wave equations. Boston: International Press, 1995
49. Tataru, D., Tohaneanu, M.: Local energy estimate on Kerr black hole backgrounds. http://arxiv.org/abs/0810.5766v2 [math.AP], 2008
50. Tataru, D.: Local decay of waves on asymptotically flat stationary space-times. http://arxiv.org/abs/0910.5290v2 [math.AP], 2010
51. Taylor, M.E.: Partial Differential Equations I. New York: Springer, 1996
52. Twainy, F.: The Time Decay of Solutions to the Scalar Wave Equation in Schwarzschild Background. Thesis. San Diego: University of California, 1989
53. Wald, R.M.: Note on the stability of the Schwarzschild metric. J. Math. Phys. 20, 1056–1058 (1979)
54. Wald, R.M.: General Relativity. The University of Chicago Press, 1984
55. Wald, R., Kay, B.: Linear stability of Schwarzschild under perturbations which are nonvanishing on the bifurcation 2-sphere. Classical Quantum Gravity 4(4), 893–898 (1987)

Communicated by P.T. Chruściel

Commun. Math. Phys. 307, 65–100 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1297-7

Communications in

Mathematical Physics

Random Time-Dependent Quantum Walks

Alain Joye

UJF-Grenoble 1, CNRS Institut Fourier UMR 5582, 38402 Grenoble, France. E-mail: [email protected]

Received: 19 October 2010 / Accepted: 15 February 2011
Published online: 2 July 2011 – © Springer-Verlag 2011

Partially supported by the Agence Nationale de la Recherche, grant ANR-09-BLAN-0098-01.

Abstract: We consider the discrete time unitary dynamics given by a quantum walk on the lattice $\mathbb{Z}^d$ performed by a quantum particle with internal degree of freedom, called coin state, according to the following iterated rule: a unitary update of the coin state takes place, followed by a shift on the lattice, conditioned on the coin state of the particle. We study the large time behavior of the quantum mechanical probability distribution of the position observable in $\mathbb{Z}^d$ when the sequence of unitary updates is given by an i.i.d. sequence of random matrices. When averaged over the randomness, this distribution is shown to display a drift proportional to the time and its centered counterpart is shown to display a diffusive behavior with a diffusion matrix we compute. A moderate deviation principle is also proven to hold for the averaged distribution and the limit of the suitably rescaled corresponding characteristic function is shown to satisfy a diffusion equation. A generalization to unitary updates distributed according to a Markov process is also provided. An example of i.i.d. random updates for which the analysis of the distribution can be performed without averaging is worked out. The distribution also displays a deterministic drift proportional to time and its centered counterpart gives rise to a random diffusion matrix whose law we compute. A large deviation principle is shown to hold for this example. We finally show that, in general, the expectation of the random diffusion matrix equals the diffusion matrix of the averaged distribution.

1. Introduction

Quantum walks are models of discrete time quantum evolution taking place on a d-dimensional lattice. Their implementation as unitary discrete dynamical systems on a Hilbert space is typically the following. A quantum particle with internal degree of freedom moves on an infinite d-dimensional lattice according to the following rule. The one-step motion consists in an update of the internal degree of freedom by means of a unitary transform in the relevant part of the Hilbert space followed by a finite range shift on the lattice, conditioned on the internal degree of freedom of the particle. Quantum walks constructed this way can be considered as quantum analogs of classical random walks on lattices. Therefore, in this context, the space of the internal degree of freedom is called coin space, the degree of freedom is the coin state and the unitary operators performing the update are coin matrices. Due to the important role played by classical random walks in theoretical computer science, quantum walks have enjoyed an increasing popularity in the quantum computing community in recent years, see for example [3,21,23,28]. Their particular features for search algorithms are described in [4,27,34] and in the review [31]. In addition, quantum walks can be considered as effective dynamics of quantum systems in certain asymptotic regimes. See e.g. [1,9,11,26,28,30] for a few models of this type, and [5,6,8,12,14] for their mathematical analysis. Moreover, quantum walk dynamics have been shown to be an experimental reality for systems of cold atoms trapped in suitably monitored optical lattices [19], and ions caught in monitored Paul traps [37]. While several variants and generalizations of the quantum dynamics described above are possible, we will focus on the case where the underlying lattice is $\mathbb{Z}^d$ and where the dimension of the coin space is $2d$. We are interested in the long time behavior of quantum mechanical expectation values of observables that are non-trivial on the lattice only, i.e. that do not depend on the internal degree of freedom of the quantum walker. Equivalently, this amounts to studying a family of random vectors $X_n$ on the lattice $\mathbb{Z}^d$, indexed by the discrete time variable, with probability laws $P(X_n = k) = W_k(n)$ defined by the prescriptions of quantum mechanics.
The initial state of the quantum walker is described by a density matrix. The case where the unitary update of the coin variable is performed at each time step by means of the same coin matrix is well known. It leads to a ballistic behavior of the expectation of the position variable characterized by $\mathbb{E}_{W(n)}(X_n) \simeq nV$ when $n$ is large, for some vector $V$. This vector and further properties of the motion can be read off the Fourier transform of the one step unitary evolution operator. In this paper, we consider the situation where the coin matrices used to update the coin variable depend on the time step in a random fashion, that is a situation of temporal disorder. Let us describe our results informally here, referring the reader to the relevant sections for precise statements.

We assume the sequence of coin matrices consists of random unitary matrices which are independent and identically distributed (i.i.d.) and we analyze the large $n$ behavior of the corresponding random distribution $W^\omega(n)$ of $X_n^\omega$. We do so by studying the characteristic function $\Phi_n^\omega(y) = \mathbb{E}_{W^\omega(n)}(e^{iyX_n^\omega})$. In Sect. 2, we first show a deterministic result saying that the characteristic function at time $n$ can be expressed in terms of a product of $n$ matrices, $M_j$, each $M_j$ depending on the coin operator at step $j$ only, in the spirit of the GNS construction, see Propositions 2.9, 2.13. In the random case, the $M_j$'s become i.i.d. random matrices $M_\omega$. Then we address the behavior of the averaged distribution $w(n) = \mathbb{E}_\omega(W^\omega(n))$ of $X_n$, for $n$ large, in Sect. 3. Theorem 3.10 says under certain natural spectral assumptions on the matrices $\mathbb{E}_\omega(M_\omega)$ that $X_n$ displays a ballistic behavior $\mathbb{E}_{w(n)}(X_n) \simeq nr \in \mathbb{R}^d$, where $r$ is a drift vector depending only on the properties of the deterministic shift operation following the random update of the coin state. Moreover, the centered random vector $(X_n - nr)$ is shown to display a diffusive behavior characterized by a diffusion matrix $D$ we compute:
\[
\mathbb{E}_{w(n)}\big((X_n - nr)_i(X_n - nr)_j\big) \simeq nD_{ij}, \qquad i, j = 1, 2, \ldots, d.
\]
We also show in Theorem 3.10 that for any $t > 0$, $y \in \mathbb{R}^d$, the averaged and rescaled characteristic function $e^{-i[tn]ry/\sqrt{n}}\,\mathbb{E}_\omega(\Phi^\omega_{[tn]}(y/\sqrt{n}))$ converges for large $n$, in a certain sense, to the Fourier transform of superpositions of solutions to a diffusion equation, with diffusion matrix $D(v)$, $v \in \mathbb{T}^d$, the d-dimensional torus:
\[
e^{-i[tn]ry/\sqrt{n}}\,\mathbb{E}_\omega\big(\Phi^\omega_{[tn]}(y/\sqrt{n})\big) \to \int_{\mathbb{T}^d}e^{-\frac{t}{2}\langle y|D(v)y\rangle}\,\frac{dv}{(2\pi)^d}.
\]

In Sect. 4, we briefly discuss the relationship between the drift vector $r$ and the diffusion matrix $D$ in case the deterministic shift can take arbitrarily large values. Then we investigate finer properties of the behavior in $n$ of the averaged distribution in Sect. 5. Theorem 5.2 states that a moderate deviation principle holds for $w(n)$: there exists a rate function $\Lambda^* : \mathbb{R}^d \to [0, \infty]$ such that, for any set $\Gamma \subset \mathbb{R}^d$ and any $0 < \alpha < 1$, as $n \to \infty$,
\[
P\big(X_n - nr \in n^{(\alpha+1)/2}\Gamma\big) \simeq e^{-n^\alpha\inf_{x\in\Gamma}\Lambda^*(x)}.
\]
In Sect. 6, we consider a distribution of coin operators which allows us to analyze the random distribution $W^\omega(n)$, without averaging over the temporal disorder. This distribution is supported, essentially, on the unitary permutation matrices. We show that in this case $W^\omega(n)$ coincides with the distribution of a Markov chain with finite state space whose transition matrix we compute explicitly. Consequently, we get that the centered random vector $X_n^\omega - nr$ converges in distribution to a normal law $\mathcal{N}(0, \Sigma)$, with an explicit correlation matrix $\Sigma$, and an explicit deterministic drift vector $r$ given in Theorem 6.6. In turn, this allows us to show in Corollary 6.8 the existence of a random diffusion matrix $D^\omega$ such that
\[
\mathbb{E}_{W^\omega(n)}\big((X_n^\omega - nr)_i(X_n^\omega - nr)_j\big) \simeq nD^\omega_{ij}, \qquad i, j = 1, 2, \ldots, d,
\]
whose matrix elements $D^\omega_{ij}$ are distributed according to the law of $X_i^\omega X_j^\omega$, where the vector $X^\omega$ is distributed according to $\mathcal{N}(0, \Sigma)$. Finally, a large deviation principle for the random distribution $W^\omega(n)$ is stated as Theorem 6.14. This example also shows that we cannot expect almost sure convergence results for random quantum walks. We close the paper by showing how to generalize the results of Sects. 3 and 5 to the case where the random coin matrices are not independent anymore and are distributed according to a Markov process with a finite number of states. See Sect. 7.

Let us comment about the literature. In a sense, the situation we address corresponds to the cases considered in [15,17,29] where the dynamics is generated by a quantum Hamiltonian with a time dependent potential generated by a random process. For quantum walks, the role of the random time dependent potential is played by the random coin operators whereas the role of the deterministic kinetic energy is played by the shift. Quantum walks with unitary random coin operators have been tackled in some numerical works, see [32], for example. On the analytical side, we can mention [24] (see also [16]) where particular hypotheses on the coin matrices reduce the problem to the study of correlated random walks. During the completion of the paper, the preprint [2] appeared. It reviews and addresses several types of quantum walks, deterministic and random, decoherent and unitary. In particular, the averaged dynamics of random quantum walks of the type studied in the present paper are tackled, by means of a similar approach. The results we prove, however, are more detailed and go beyond those of [2]. We finally note that there exists another instance of random quantum walks in which the randomness lies in space rather than in time. In the present context, this means the coin operators depend on the sites of the lattice and are chosen according to some law, in the same spirit as for the Anderson model. Dynamical or spectral localization are the phenomena of interest there. See for example [16,22,33] and references therein for results about such questions.

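2. General Setup

Let $\mathcal{H} = \mathbb{C}^{2d} \otimes l^2(\mathbb{Z}^d)$ be the Hilbert space of the quantum walker in $\mathbb{Z}^d$ with $2d$ internal degrees of freedom. We denote the canonical basis of $\mathbb{C}^{2d}$ by $\{|\tau\rangle\}_{\tau\in I_\pm}$, where $I_\pm = \{\pm 1, \pm 2, \ldots, \pm d\}$, so that the orthogonal projectors on the basis vectors are noted $P_\tau = |\tau\rangle\langle\tau|$, $\tau \in I_\pm$. We denote the canonical basis of $l^2(\mathbb{Z}^d)$ by $\{|x\rangle\}_{x\in\mathbb{Z}^d}$. We shall write for a vector $\psi \in \mathcal{H}$, $\psi = \sum_{x\in\mathbb{Z}^d}\psi(x)|x\rangle$, where $\psi(x) = \langle x|\psi\rangle \in \mathbb{C}^{2d}$ and $\sum_{x\in\mathbb{Z}^d}\|\psi(x)\|^2_{\mathbb{C}^{2d}} = \|\psi\|^2 < \infty$. We shall abuse notations by using the same symbols $\langle\cdot|\cdot\rangle$ for scalar products and corresponding "bra" and "ket" vectors on $\mathcal{H}$, $\mathbb{C}^{2d}$ and $l^2(\mathbb{Z}^d)$, the context allowing us to determine which spaces we are talking about. Also, we will often drop the subscript $\mathbb{C}^{2d}$ of the norm.

A coin matrix acting on the internal degrees of freedom, or coin state, is a unitary matrix $C \in M_{2d}(\mathbb{C})$ and a jump function is a function $r : I_\pm \to \mathbb{Z}^d$. The corresponding one step unitary evolution $U$ of the walker on $\mathcal{H} = \mathbb{C}^{2d} \otimes l^2(\mathbb{Z}^d)$ is given by
\[
U = S\,(C \otimes \mathbb{I}), \qquad (2.1)
\]
where $\mathbb{I}$ denotes the identity operator and the shift $S$ is defined on $\mathcal{H}$ by
\[
S = \sum_{x\in\mathbb{Z}^d}\sum_{\tau\in\{1,\ldots,d\}}\Big(P_\tau \otimes |x + r(\tau)\rangle\langle x| + P_{-\tau} \otimes |x + r(-\tau)\rangle\langle x|\Big) = \sum_{x\in\mathbb{Z}^d}\sum_{\tau\in I_\pm}P_\tau \otimes |x + r(\tau)\rangle\langle x|. \qquad (2.2)
\]
By construction, a walker at site $y$ with internal degree of freedom $\tau$ represented by the vector $|\tau\rangle \otimes |y\rangle \in \mathcal{H}$ is just sent by $S$ to one of the neighboring sites depending on $\tau$, determined by the jump function $r(\tau)$:
\[
S\,|\tau\rangle \otimes |y\rangle = |\tau\rangle \otimes |y + r(\tau)\rangle. \qquad (2.3)
\]
The composition by $C \otimes \mathbb{I}$ reshuffles or updates the coin state so that the pieces of the wave function corresponding to different internal states are shifted to different directions, depending on the internal state. We can write
\[
U = \sum_{x\in\mathbb{Z}^d}\sum_{\tau\in I_\pm}P_\tau C \otimes |x + r(\tau)\rangle\langle x|. \qquad (2.4)
\]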
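The action of the one-step operator $U = S\,(C \otimes \mathbb{I})$ can be illustrated by a minimal sketch in the simplest case $d = 1$. Everything specific below is an assumption made for illustration only: the jump function $r(\pm 1) = \pm 1$, the Hadamard coin, and the names `JUMP`, `apply_coin`, `step` and `norm_sq` are not taken from the paper.

```python
import math

# Coin basis order: index 0 <-> tau = +1, index 1 <-> tau = -1.
# Assumed jump function for d = 1: r(+1) = +1, r(-1) = -1.
JUMP = {0: +1, 1: -1}


def apply_coin(C, amp):
    """Apply the 2x2 coin matrix C to a 2-component coin vector."""
    return [C[0][0] * amp[0] + C[0][1] * amp[1],
            C[1][0] * amp[0] + C[1][1] * amp[1]]


def step(state, C):
    """One step U = S (C x I): coin update, then coin-conditioned shift.

    `state` maps a lattice site x to its coin vector psi(x).
    """
    new_state = {}
    for x, amp in state.items():
        updated = apply_coin(C, amp)
        for tau, a in enumerate(updated):
            if a == 0:
                continue  # nothing to move for this coin component
            y = x + JUMP[tau]
            new_state.setdefault(y, [0j, 0j])[tau] += a
    return new_state


def norm_sq(state):
    """Squared norm of the state in C^2 (x) l^2(Z)."""
    return sum(abs(a) ** 2 for amp in state.values() for a in amp)


# Hadamard coin, chosen here only as a familiar unitary example.
HADAMARD = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
            [1 / math.sqrt(2), -1 / math.sqrt(2)]]

psi0 = {0: [1 + 0j, 0j]}          # walker at site 0 with coin state |+1>
psi1 = step(psi0, HADAMARD)       # amplitude splits to sites +1 and -1
```

Since $U$ is unitary, `norm_sq` stays equal to 1 after any number of calls to `step` with unitary coins; storing only the occupied sites works because the jump function has finite range.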

Given a set of $n > 0$ unitary coin matrices $C_k \in M_{2d}(\mathbb{C})$, $k = 1, \ldots, n$, we define the corresponding discrete evolution from time zero to time $n$ by
\[
U(n, 0) = U_nU_{n-1}\cdots U_1, \quad \text{where } U_k = S\,(C_k \otimes \mathbb{I}). \qquad (2.5)
\]
Let $f : \mathbb{Z}^d \to \mathbb{C}$ and define the multiplication operator $F : D(F) \to \mathcal{H}$ on its domain $D(F) \subset \mathcal{H}$ by $(F\psi)(x) = f(x)\psi(x)$, $\forall x \in \mathbb{Z}^d$, where $\psi \in D(F)$ is equivalent to $\sum_{x\in\mathbb{Z}^d}|f(x)|^2\|\psi(x)\|^2_{\mathbb{C}^{2d}} < \infty$. Note that $F$ acts trivially on the coin state. When $f$ is real valued, $F$ is self-adjoint and will be called a lattice observable.

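2.1. Vector states. In particular, consider a walker characterized at time zero by the normalized vector $\psi_0 = \varphi_0 \otimes |0\rangle$, i.e. which sits on site 0 with coin state $\varphi_0$. The quantum mechanical expectation value of a lattice observable $F$ at time $n$ is given by $\langle F\rangle_{\psi_0}(n) = \langle\psi_0|U(n,0)^*FU(n,0)\psi_0\rangle$. As in [16], a straightforward computation yields

Lemma 2.1. With the notations above,
\[
U(n, 0) = \sum_{x\in\mathbb{Z}^d}\sum_{(\tau_1,\tau_2,\ldots,\tau_n)\in I_\pm^n}P_{\tau_n}C_nP_{\tau_{n-1}}C_{n-1}\cdots P_{\tau_1}C_1 \otimes |x + r(\tau_1) + \cdots + r(\tau_n)\rangle\langle x| \equiv \sum_{x\in\mathbb{Z}^d}\sum_{k\in\mathbb{Z}^d}J_k(n) \otimes |x + k\rangle\langle x|, \qquad (2.6)
\]
where
\[
J_k(n) = \sum_{\substack{(\tau_1,\tau_2,\ldots,\tau_n)\in I_\pm^n \\ \sum_{s=1}^n r(\tau_s) = k}}P_{\tau_n}C_nP_{\tau_{n-1}}C_{n-1}\cdots P_{\tau_1}C_1 \in M_{2d}(\mathbb{C}) \qquad (2.7)
\]
and $J_k(n) = 0$ if $\sum_{s=1}^n r(\tau_s) \ne k$ for every choice of $(\tau_1, \ldots, \tau_n)$. Moreover, for any lattice observable $F$, and any normalized vector $\psi_0 = \varphi_0 \otimes |0\rangle$,
\[
\langle F\rangle_{\psi_0}(n) = \langle\psi_0|U^*(n,0)FU(n,0)\psi_0\rangle = \sum_{k\in\mathbb{Z}^d}f(k)\langle\varphi_0|J_k(n)^*J_k(n)\varphi_0\rangle \equiv \sum_{k\in\mathbb{Z}^d}f(k)W_k(n), \qquad (2.8)
\]
where $W_k(n) = \|J_k(n)\varphi_0\|^2_{\mathbb{C}^{2d}}$ satisfy
\[
\sum_{k\in\mathbb{Z}^d}W_k(n) = \sum_{k\in\mathbb{Z}^d}\|J_k(n)\varphi_0\|^2_{\mathbb{C}^{2d}} = \|\psi_0\|^2_{\mathcal{H}} = 1. \qquad (2.9)
\]

Remark 2.2. We view the non-negative quantities $\{W_k(n)\}_{n\in\mathbb{N}^*}$ as the probability distributions of a sequence of $\mathbb{Z}^d$-valued random variables $\{X_n\}_{n\in\mathbb{N}^*}$, with
\[
\mathrm{Prob}(X_n = k) = W_k(n) = \langle\psi_0|U(n,0)^*(\mathbb{I} \otimes |k\rangle\langle k|)U(n,0)\psi_0\rangle = \|J_k(n)\varphi_0\|^2_{\mathbb{C}^{2d}}, \qquad (2.10)
\]
in keeping with (2.8). In particular, $\langle F\rangle_{\psi_0}(n) = \mathbb{E}_{W_k(n)}(f(X_n))$. We shall use freely both notations.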
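The distribution $W_k(n) = \|J_k(n)\varphi_0\|^2$ of Remark 2.2 can be computed numerically by evolving the coin vector attached to each site, which is exactly $J_k(n)\varphi_0$. The sketch below is illustrative only and rests on assumptions not made in the paper: $d = 1$, jump function $r(\pm 1) = \pm 1$, and real rotation matrices with pseudo-random angles as the time-dependent coins $C_1, \ldots, C_n$. The names `walk_distribution` and `rotation` are hypothetical.

```python
import math
import random

JUMP = (+1, -1)  # assumed jump function: coin index 0 moves right, 1 moves left


def walk_distribution(coins, phi0):
    """Return {k: W_k(n)} for the walk started at site 0 in coin state phi0.

    `coins` is the list [C_1, ..., C_n] of 2x2 unitary coin matrices.
    The coin vector stored at site k after n steps is J_k(n) phi0.
    """
    state = {0: [complex(phi0[0]), complex(phi0[1])]}
    for C in coins:
        new_state = {}
        for x, (a, b) in state.items():
            updated = (C[0][0] * a + C[0][1] * b,
                       C[1][0] * a + C[1][1] * b)
            for tau in (0, 1):
                site = new_state.setdefault(x + JUMP[tau], [0j, 0j])
                site[tau] += updated[tau]
        state = new_state
    return {k: abs(amp[0]) ** 2 + abs(amp[1]) ** 2 for k, amp in state.items()}


def rotation(theta):
    """A real 2x2 rotation, used here as a simple family of unitary coins."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


rng = random.Random(0)  # fixed seed so the example is reproducible
coins = [rotation(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(20)]
W = walk_distribution(coins, (1.0, 0.0))
total = sum(W.values())  # equals 1 up to rounding, as in (2.9)
```

The support of `W` lies in $\{-n, \ldots, n\}$, in line with the finite range of the jump function, and for this nearest-neighbour choice only sites with the parity of $n$ are reached.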

Remark 2.3. All sums over $k \in \mathbb{Z}^d$ are finite since $J_k(n) = 0$ if $\max_{j=1,\ldots,d}|k_j| > \rho n$, for some $\rho > 0$, since the jump functions have finite range.

We are particularly interested in the long time behavior, $n \gg 1$, of $\langle X^2\rangle_{\psi_0}(n)$, the expectation of the observable $X^2$ corresponding to the function $f(x) = x^2$ on $\mathbb{Z}^d$ with initial condition $\psi_0$. Or, in other words, in the second moments of the distributions $\{W_k(n)\}_{n\in\mathbb{N}^*}$. Let us proceed by expressing the probabilities $W_k(n)$ in terms of the $C_k$'s, $k = 1, \ldots, n$. We need to introduce some more notations. Let $I_n(k) = \{\tau_1, \ldots, \tau_n\}$, where $\tau_l \in I_\pm$, $l = 1, \ldots, n$ and $\sum_{l=1}^n r(\tau_l) = k$. In other words, $I_n(k)$ denotes the set of paths that link the origin to $k \in \mathbb{Z}^d$ in $n$ steps via the jump function $r$. Let us write $\varphi_0 = \sum_{\tau\in I_\pm}a_\tau|\tau\rangle$.

Lemma 2.4.
\[
W_k(n) = \sum_{\tau_0,\{\tau_1,\ldots,\tau_n\}\in I_n(k)}\;\sum_{\substack{\tau'_0,\{\tau'_1,\ldots,\tau'_n\}\in I_n(k) \\ \text{s.t. } \tau'_n = \tau_n}}a_{\tau_0}\overline{a_{\tau'_0}}\,\langle\tau'_0|C_1^*\tau'_1\rangle\langle\tau_1|C_1\tau_0\rangle\prod_{s=2}^n\langle\tau'_{s-1}|C_s^*\tau'_s\rangle\langle\tau_s|C_s\tau_{s-1}\rangle. \qquad (2.11)
\]
We approach the problem through the characteristic functions $\Phi_n$ of the probability distributions $\{W_\cdot(n)\}_{n\in\mathbb{N}^*}$ defined by the periodic function
\[
\Phi_n(y) = \mathbb{E}_{W(n)}(e^{iyX_n}) = \sum_{k\in\mathbb{Z}^d}W_k(n)e^{iyk}, \quad \text{where } y \in [0, 2\pi)^d. \qquad (2.12)
\]
To emphasize the dependence in the initial state, we will sometimes write $\Phi_n^{\varphi_0}$ and/or $W_k^{\varphi_0}(n)$. All periodic functions will be viewed as functions defined on the torus, i.e. $[0, 2\pi)^d \simeq \mathbb{T}^d$. The asymptotic properties of the quantum walk emerge from the analysis of the limit in an appropriate sense as $n \to \infty$ of the characteristic function in the diffusive scaling
\[
\lim_{n\to\infty}\Phi_n(y/\sqrt{n}). \qquad (2.13)
\]

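Looking for a relation between $W_k(n)$ and $W_k(n+1)$, we find that the condition $\tau_n' = \tau_n$ is a nuisance we can relax now and deal with later. Consider the set of paths $G_n(K)$ in $\mathbb{Z}^{2d}$ from the origin to $K = \begin{pmatrix} k\\ k'\end{pmatrix}\in\mathbb{Z}^{2d}$ via the (extended) jump function defined by
\[
R: I_\pm^2\to\mathbb{Z}^{2d},\qquad R\begin{pmatrix}\tau_s\\ \tau_s'\end{pmatrix} = \begin{pmatrix} r(\tau_s)\\ r(\tau_s')\end{pmatrix}, \tag{2.14}
\]
that is, paths of the form $(T_1,\dots,T_{n-1},T_n)$, where $T_s = \begin{pmatrix}\tau_s\\ \tau_s'\end{pmatrix}\in I_\pm^2$, $s = 1,2,\dots,n$, and $\sum_{s=1}^n R(T_s) = K$. Then note that the generic term in Lemma 2.4 reads
\[
\langle\tau_{s-1}'|C_s^*\tau_s'\rangle\langle\tau_s|C_s\tau_{s-1}\rangle = \langle\tau_s|C_s\tau_{s-1}\rangle\,\overline{\langle\tau_s'|C_s\tau_{s-1}'\rangle} \equiv \langle\tau_s\otimes\tau_s'|(C_s\otimes\overline{C_s})\,\tau_{s-1}\otimes\tau_{s-1}'\rangle, \tag{2.15}
\]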

Random Time-Dependent Quantum Walks

71

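where, in the last expression, we introduced the unitary tensor product
\[
V(s) \equiv C_s\otimes\overline{C_s}\quad\text{in } \mathbb{C}^{2d}\otimes\mathbb{C}^{2d}. \tag{2.16}
\]
The canonical basis of $\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}$ is $\{|\tau\otimes\tau'\rangle\}_{(\tau,\tau')\in I_\pm^2}$, hence, with the identification
\[
T = \begin{pmatrix}\tau\\ \tau'\end{pmatrix} \simeq |\tau\otimes\tau'\rangle, \tag{2.17}
\]
we can write the matrix elements of $V(s)$ as
\[
\langle\sigma\otimes\sigma'|(C_s\otimes\overline{C_s})\,\tau\otimes\tau'\rangle \equiv V(s)_{ST},\qquad S = \begin{pmatrix}\sigma\\ \sigma'\end{pmatrix},\ T = \begin{pmatrix}\tau\\ \tau'\end{pmatrix}\in I_\pm^2. \tag{2.18}
\]
With the decompositions
\[
\varphi_0 = \sum_{\tau\in I_\pm} a_\tau|\tau\rangle\ \Rightarrow\ \chi_0 = \varphi_0\otimes\overline{\varphi_0} = \sum_{(\tau,\tau')\in I_\pm^2} a_\tau\overline{a_{\tau'}}\,|\tau\otimes\tau'\rangle \tag{2.19}
\]
we can write
\[
\langle T_1|V(1)\chi_0\rangle = \sum_{T_0\in I_\pm^2} V(1)_{T_1T_0}A_{T_0},\quad\text{where } A_{T_0} = a_{\tau_0}\overline{a_{\tau_0'}}\ \text{ for } T_0 = \begin{pmatrix}\tau_0\\ \tau_0'\end{pmatrix}. \tag{2.20}
\]
With these notations, we consider the weight of $n$-step paths in $\mathbb{Z}^{2d}$ from the origin to $K$, with last step $T$, defined by
\[
W_K^T(n) = \sum_{\substack{(T_1,\dots,T_{n-1})\in (I_\pm^2)^{n-1}\ \text{s.t.}\\ (T_1,\dots,T_{n-1},T)\in G_n(K)}} V(n)_{TT_{n-1}}\cdots V(2)_{T_2T_1}\,\langle T_1|V(1)\chi_0\rangle. \tag{2.21}
\]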

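Note that by construction, see Lemma 2.4,
\[
W_k(n) = \sum_{T\in H_\pm} W^T_{K_k}(n)\quad\text{with } K_k = \begin{pmatrix}k\\ k\end{pmatrix}\ \text{and}\ H_\pm = \left\{\begin{pmatrix}\tau\\ \tau\end{pmatrix},\ \tau\in I_\pm\right\}. \tag{2.22}
\]
We also introduce corresponding periodic functions $\Phi_n^T$, $n > 0$, by
\[
\Phi_n^T(Y) = \sum_{K\in\mathbb{Z}^{2d}} e^{iYK}\,W_K^T(n),\quad\text{where } Y\in\mathbb{T}^{2d}, \tag{2.23}
\]
and, see (2.20),
\[
\Phi_0^T(Y) = A_T. \tag{2.24}
\]
These definitions lead to the sought for relationships:

Proposition 2.5. For all $n\in\mathbb{N}$, $T\in I_\pm^2$, $K\in\mathbb{Z}^{2d}$ and $Y\in\mathbb{T}^{2d}$,
\[
W_K^T(n+1) = \sum_{S\in I_\pm^2} V(n+1)_{TS}\,W^S_{K-R(T)}(n), \tag{2.25}
\]
\[
\Phi^T_{n+1}(Y) = \sum_{S\in I_\pm^2} e^{iYR(T)}\,V(n+1)_{TS}\,\Phi_n^S(Y). \tag{2.26}
\]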


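We can express these relationships in a yet more concise way as follows. Recall that for each basis vector $T \simeq \tau\otimes\tau'$ and each vector $Y = (y,y')\in\mathbb{T}^d\times\mathbb{T}^d$, we have
\[
YR(T) = y\,r(\tau) + y'\,r(\tau')\in\mathbb{R}. \tag{2.27}
\]
Introduce the vectors in $\mathbb{C}^{4d^2}\simeq\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}$, with $Y\in\mathbb{T}^{2d}$ and $n\ge 0$,
\[
\boldsymbol{\Phi}_n(Y) = \sum_{T=(\tau,\tau')\in I_\pm^2}\Phi_n^T(Y)\,|\tau\otimes\tau'\rangle\quad\text{and}\quad \boldsymbol{\Phi}_0 = \sum_{T=(\tau,\tau')\in I_\pm^2} A_T\,|\tau\otimes\tau'\rangle, \tag{2.28}
\]
where $\boldsymbol{\Phi}_0$ is determined by the internal state $\varphi_0$ only and is independent of $Y$. With the matrices on $\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}\simeq\mathbb{C}^{4d^2}$ expressed in the ordered basis $I_\pm^2$ by
\[
D(Y) = \sum_{T=(\tau,\tau')\in I_\pm^2} e^{iYR(T)}\,|\tau\otimes\tau'\rangle\langle\tau\otimes\tau'|,\quad\text{with } Y\in\mathbb{T}^{2d}, \tag{2.29}
\]
and
\[
V(s) = C_s\otimes\overline{C_s},\qquad M_s(Y) = D(Y)V(s), \tag{2.30}
\]
we get

Corollary 2.6. For any $n\ge 0$ and $Y\in\mathbb{T}^{2d}$,
\[
\boldsymbol{\Phi}_n(Y) = M_n(Y)M_{n-1}(Y)\cdots M_1(Y)\,\boldsymbol{\Phi}_0. \tag{2.31}
\]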

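Remark 2.7. The matrix $D(Y)$ can be expressed as a tensor product of unitary diagonal matrices. Let $Y = (y,y')\in\mathbb{T}^d\times\mathbb{T}^d$. Then
\[
D(y,y') = \sum_{(\tau,\tau')\in I_\pm^2} e^{i(yr(\tau)+y'r(\tau'))}\,|\tau\otimes\tau'\rangle\langle\tau\otimes\tau'|
= \sum_{\tau\in I_\pm} e^{iyr(\tau)}|\tau\rangle\langle\tau|\ \otimes \sum_{\tau'\in I_\pm} e^{iy'r(\tau')}|\tau'\rangle\langle\tau'|
\equiv d(y)\otimes d(y'),\quad\text{where } d(y) = \sum_{\tau\in I_\pm} e^{iyr(\tau)}|\tau\rangle\langle\tau|. \tag{2.32}
\]
Consequently, we can write
\[
M_s(y,y') = d(y)C_s\otimes\overline{d(-y')C_s}. \tag{2.33}
\]
Together with the fact that $\boldsymbol{\Phi}_0 = \varphi_0\otimes\overline{\varphi_0}$, this yields
\[
\boldsymbol{\Phi}_n(y,y') = d(y)C_n\cdots d(y)C_1\varphi_0\otimes\overline{d(-y')C_n\cdots d(-y')C_1\varphi_0}
\equiv J_n(y)\varphi_0\otimes\overline{J_n(-y')\varphi_0} = \left(J_n(y)\otimes\overline{J_n(-y')}\right)\varphi_0\otimes\overline{\varphi_0}. \tag{2.34}
\]
We note here for future reference that $J_n(y)$ is the Fourier transform of $J_k(n)$:

Lemma 2.8. For any $y\in\mathbb{T}^d$,
\[
J_n(y) = \sum_{k\in\mathbb{Z}^d} e^{iyk}J_k(n). \tag{2.35}
\]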


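Proof. With the convention (2.27), the right hand side reads
\[
\sum_{k\in\mathbb{Z}^d}\ \sum_{\substack{(\tau_1,\tau_2,\dots,\tau_n)\in I_\pm^n\\ \sum_{s=1}^n r(\tau_s)=k}} e^{iy(r(\tau_1)+\cdots+r(\tau_n))}\,P_{\tau_n}C_nP_{\tau_{n-1}}C_{n-1}\cdots P_{\tau_1}C_1
\]
\[
= \sum_{(\tau_1,\tau_2,\dots,\tau_n)\in I_\pm^n} e^{iyr(\tau_n)}|\tau_n\rangle\langle\tau_n|C_n\ e^{iyr(\tau_{n-1})}|\tau_{n-1}\rangle\langle\tau_{n-1}|C_{n-1}\cdots e^{iyr(\tau_1)}|\tau_1\rangle\langle\tau_1|C_1
\]
\[
= d(y)C_n\,d(y)C_{n-1}\cdots d(y)C_1 = J_n(y). \tag{2.36}
\]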
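The recursion behind Lemma 2.1 and the Fourier identity of Lemma 2.8 lend themselves to a quick numerical check. The sketch below is only an illustration under assumptions that are not in the text: $d = 1$, the jump function $r(\pm 1) = \pm 1$, and a time-dependent sequence of rotation coins picked arbitrarily. The helper names (`step`, `walk`, `fourier_of_J`, `product_formula`) are hypothetical. It builds the matrices $J_k(n)$ through $J_k(n+1) = \sum_\tau P_\tau C_{n+1}J_{k-r(\tau)}(n)$, which follows from (2.6) and (2.7), and then compares both sides of (2.35) and the normalization (2.9).

```python
import cmath
import math

# Assumed setting (not from the text): d = 1, coin index 0 <-> tau = -1,
# coin index 1 <-> tau = +1, jump function r(tau) = tau.
JUMP = {0: -1, 1: +1}


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]


def rotation(theta):
    """A unitary coin matrix; rotations are an arbitrary illustrative choice."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def step(J, coin):
    """One time step: J_k(n+1) = sum_tau P_tau C_{n+1} J_{k - r(tau)}(n)."""
    new = {}
    for k, Jk in J.items():
        CJ = matmul(coin, Jk)
        for tau, jump in JUMP.items():
            # P_tau = |tau><tau| keeps row tau of C J_k and zeroes the other row.
            proj = [[CJ[i][j] if i == tau else 0.0 for j in range(2)]
                    for i in range(2)]
            old = new.get(k + jump, [[0.0, 0.0], [0.0, 0.0]])
            new[k + jump] = [[old[i][j] + proj[i][j] for j in range(2)]
                             for i in range(2)]
    return new


def walk(coins):
    """Return {k: J_k(n)} for the coin sequence C_1, ..., C_n."""
    J = {0: [[1.0, 0.0], [0.0, 1.0]]}  # J_k(0) is the identity at k = 0
    for coin in coins:
        J = step(J, coin)
    return J


def probabilities(J, phi0):
    """W_k(n) = ||J_k(n) phi0||^2, as in (2.8)."""
    W = {}
    for k, Jk in J.items():
        v = [Jk[i][0] * phi0[0] + Jk[i][1] * phi0[1] for i in range(2)]
        W[k] = abs(v[0]) ** 2 + abs(v[1]) ** 2
    return W


def fourier_of_J(J, y):
    """Right-hand side of (2.35): sum_k e^{iyk} J_k(n)."""
    return [[sum(cmath.exp(1j * y * k) * Jk[i][j] for k, Jk in J.items())
             for j in range(2)] for i in range(2)]


def product_formula(coins, y):
    """Left-hand side of (2.35): d(y) C_n ... d(y) C_1."""
    d = [[cmath.exp(-1j * y), 0.0], [0.0, cmath.exp(1j * y)]]
    out = [[1.0, 0.0], [0.0, 1.0]]
    for coin in coins:
        out = matmul(matmul(d, coin), out)
    return out


coins = [rotation(0.1 + 0.3 * s) for s in range(1, 9)]  # C_1, ..., C_8
phi0 = [1 / math.sqrt(2), 1j / math.sqrt(2)]            # normalized coin state
J = walk(coins)
total = sum(probabilities(J, phi0).values())
lhs, rhs = product_formula(coins, 0.7), fourier_of_J(J, 0.7)
gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(total, gap)
```

Up to rounding, `total` should equal 1 and `gap` should vanish. Storing the $J_k(n)$ in a dictionary keyed by $k$ reflects Remark 2.3: only finitely many of them are non-zero at each time $n$.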

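Eventually, the characteristic function $\Phi_n(y)$ we are interested in, see (2.12), can be obtained from $\boldsymbol{\Phi}_n(Y)$. We shall denote the normalized measure on the torus $\mathbb{T}^d$ by $d\tilde v = \frac{dv}{(2\pi)^d}$.

Proposition 2.9. For any $n\in\mathbb{N}$ and $y\in\mathbb{T}^d$,
\[
\Phi_n^{\varphi_0}(y) = \int_{\mathbb{T}^d}\langle\Psi_1|M_n(y-v,v)M_{n-1}(y-v,v)\cdots M_1(y-v,v)\boldsymbol{\Phi}_0\rangle\,d\tilde v, \tag{2.37}
\]
where
\[
\Psi_1 = \sum_{T\in H_\pm}|T\rangle = \sum_{\tau\in I_\pm}|\tau\otimes\tau\rangle. \tag{2.38}
\]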

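Proof. Following (2.22), to get $\Phi_n(y)$ from $\{\Phi_n^T(Y)\}_{T\in I_\pm^2}$ we need to restrict the sum in (2.23) to $K = K_k$, $k\in\mathbb{Z}^d$, and to sum on all last steps $T\in H_\pm$. Let $\alpha$ be the distribution on $C^\infty(\mathbb{T}^d\times\mathbb{T}^d)$ defined by
\[
\alpha(f(\cdot,\cdot)) = \int_{\mathbb{T}^d} f(v,-v)\,d\tilde v. \tag{2.39}
\]
Its Fourier coefficients satisfy $\hat\alpha(K) = \delta_{k,k'}$ for all $K = (k,k')\in\mathbb{Z}^{2d}$, so that
\[
\sum_{K=(k,k')} e^{iYK}\,W_K^T(n)\,\delta_{k,k'} = (\alpha * \Phi_n^T)(Y) = \alpha(\Phi_n^T(Y-\cdot)). \tag{2.40}
\]
One concludes using periodicity and by observing that the form $\langle\Psi_1|$ yields the summation on $T\in H_\pm$.

Remark 2.10. Noting that $\Psi_1 = \sum_{\tau\in I_\pm}|\tau\otimes\tau\rangle$, we get
\[
\Phi_n^{\varphi_0}(y) = \int_{\mathbb{T}^d}\sum_{\tau\in I_\pm}\langle\tau|J_n(y-v)\varphi_0\rangle\,\overline{\langle\tau|J_n(-v)\varphi_0\rangle}\,d\tilde v
= \int_{\mathbb{T}^d}\mathrm{Tr}\big(J_n(y-v)|\varphi_0\rangle\langle\varphi_0|J_n^*(-v)\big)\,d\tilde v
= \int_{\mathbb{T}^d}\mathrm{Tr}\big(J_n^*(-v)J_n(y-v)|\varphi_0\rangle\langle\varphi_0|\big)\,d\tilde v. \tag{2.41}
\]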


2.2. Density matrices. The analysis above can easily be adapted in order to accommodate more general initial vectors or density matrices. A density matrix $\rho$ is a trace class non-negative operator on $\mathcal{H} = \mathbb{C}^{2d}\otimes l^2(\mathbb{Z}^d)$ which can be represented by its kernel
\[
\rho = (\rho(x,y))_{(x,y)\in\mathbb{Z}^d\times\mathbb{Z}^d},\quad\text{where } \rho(x,y)\in M_{2d}(\mathbb{C}), \tag{2.42}
\]
such that
\[
\rho = \sum_{(x,y)\in\mathbb{Z}^{2d}}\rho(x,y)\otimes|x\rangle\langle y|. \tag{2.43}
\]
The matrix $\rho(x,y)$ satisfies
\[
\rho(x,y) = \rho^*(y,x)\ \Rightarrow\ \rho(x,x) = \rho^*(x,x)\ge 0 \tag{2.44}
\]
and its elements are given by $\rho_{\sigma,\tau}(x,y)$, $(\sigma,\tau)\in I_\pm^2$, so that
\[
\langle\sigma\otimes x|\rho\ \tau\otimes y\rangle = \rho_{\sigma,\tau}(x,y). \tag{2.45}
\]
Since $\rho\ge 0$ is trace class, and $\mathbb{C}^{2d}$ is finite dimensional, we have
\[
\sum_{x\in\mathbb{Z}^d}\mathrm{Tr}\,\rho(x,x) = \|\rho\|_1 < \infty, \tag{2.46}
\]
\[
\|\rho(x,x)\|\le\mathrm{Tr}\,\rho(x,x)\le 2d\,\|\rho(x,x)\|. \tag{2.47}
\]
The expectation value of a lattice observable $F = \mathbb{I}\otimes f$ in the state corresponding to $\rho$ reads
\[
\langle F\rangle_\rho = \mathrm{Tr}(\rho(\mathbb{I}\otimes f)) = \sum_{x\in\mathbb{Z}^d} f(x)\,\mathrm{Tr}(\rho(x,x)), \tag{2.48}
\]
where the first trace is on $\mathcal{H}$ and the second on $\mathbb{C}^{2d}$, assuming that the sum converges. If $\rho_0$ denotes the initial density matrix, its evolution at time $n$ under $U(n,0)$ defined by (2.5) is given by
\[
\rho_n = U(n,0)\rho_0U^*(n,0) \tag{2.49}
\]
and the expectation of the lattice observable $F$ is denoted by
\[
\langle F\rangle_{\rho_0}(n) = \mathrm{Tr}(\rho_n(\mathbb{I}\otimes f)), \tag{2.50}
\]
if it exists. Let us specify regularity properties on the lattice observable $F = \mathbb{I}\otimes f$ and the initial density matrix $\rho_0$ which imply that all manipulations below are legitimate.

Assumption R. a) The lattice observable is such that, for any $\mu<\infty$, there exists $C_\mu<\infty$ such that
\[
|f(x+y)|\le C_\mu|f(x)|,\quad\forall\,(x,y)\in\mathbb{Z}^d\times\mathbb{Z}^d\ \text{with}\ \|y\|\le\mu. \tag{2.51}
\]


b) The kernel $\rho_0(x,y)$ is such that
\[
\sum_{(x,y)\in\mathbb{Z}^d\times\mathbb{Z}^d}\|\rho_0(x,y)\| < \infty, \tag{2.52}
\]
\[
\sum_{x\in\mathbb{Z}^d}|f(x)|\,\|\rho_0(x,x)\| < \infty. \tag{2.53}
\]

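In a similar fashion to Lemma 2.1, we can express $\langle F\rangle_{\rho_0}(n)$ in the following way.

Lemma 2.11. The kernel of $\rho_n$ reads
\[
\rho_n(x,y) = \sum_{(k,k')\in\mathbb{Z}^d\times\mathbb{Z}^d} J_k(n)\,\rho_0(x-k,y-k')\,J_{k'}^*(n). \tag{2.54}
\]
Let $F = \mathbb{I}\otimes f$ and $\rho_0$ satisfy Assumption R. Then $\rho_n$ satisfies Assumption R b) and
\[
\langle F\rangle_{\rho_0}(n) = \sum_{z\in\mathbb{Z}^d} f(z)\sum_{(k,k')\in\mathbb{Z}^d\times\mathbb{Z}^d}\mathrm{Tr}\big(J_k(n)\rho_0(z-k,z-k')J_{k'}^*(n)\big)
= \sum_{z\in\mathbb{Z}^d} f(z)\sum_{(k,k')\in\mathbb{Z}^d\times\mathbb{Z}^d}\mathrm{Tr}\big(J_{k'}^*(n)J_k(n)\rho_0(z-k,z-k')\big). \tag{2.55}
\]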

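Proof. Since for all $n\in\mathbb{N}$ the summations on $k$ and $k'$ are restricted to $\|k\|\le\mu(n)$, $\|k'\|\le\mu(n)$, and $\|J_k(n)\|\le c(n)$, we need to control the sum
\[
\sum_{z\in\mathbb{Z}^d}|f(z)|\sum_{(k,k')\in\mathbb{Z}^d\times\mathbb{Z}^d}\|\rho_0(z-k,z-k')\| \tag{2.56}
\]
under Assumption R. Now, $\rho_0\ge 0$ implies $P_{xy}\rho_0P_{xy}\ge 0$, where, for $x\ne y$, $P_{xy}$ is the orthogonal projector on $\mathcal{H}$ given by
\[
P_{xy} = \mathbb{I}\otimes|x\rangle\langle x| + \mathbb{I}\otimes|y\rangle\langle y|. \tag{2.57}
\]
In other words, the following $4d\times 4d$ block matrix is non-negative:
\[
\begin{pmatrix}\rho(x,x) & \rho(x,y)\\ \rho^*(x,y) & \rho(y,y)\end{pmatrix}\ge 0. \tag{2.58}
\]
According to Lemma 1.21 in [36], this is equivalent to $\rho(x,x)\ge 0$, $\rho(y,y)\ge 0$ and the existence of $W$ with $\|W\|\le 1$ such that $\rho(x,y) = \rho(x,x)^{1/2}\,W\,\rho(y,y)^{1/2}$. Hence,
\[
\|\rho(x,y)\|\le\|\rho(x,x)\|^{1/2}\,\|\rho(y,y)\|^{1/2}. \tag{2.59}
\]
Applied to (2.56), this yields, together with Assumption R and the Cauchy-Schwarz inequality,
\[
\sum_{z\in\mathbb{Z}^d}|f(z)|\sum_{(k,k')}\|\rho_0(z-k,z-k')\|
\le\sum_{z\in\mathbb{Z}^d}\sum_{(k,k')\in\mathbb{Z}^d\times\mathbb{Z}^d}|f(z)|^{1/2}|f(z)|^{1/2}\,\|\rho_0(z-k,z-k)\|^{1/2}\,\|\rho_0(z-k',z-k')\|^{1/2}
\]
\[
\le C_{\mu(n)}\sum_{\substack{(k,k')\in\mathbb{Z}^d\times\mathbb{Z}^d\\ \|k\|\le\mu(n),\ \|k'\|\le\mu(n)}}\ \sum_{z\in\mathbb{Z}^d}\big(|f(z-k)|\,\|\rho_0(z-k,z-k)\|\big)^{1/2}\big(|f(z-k')|\,\|\rho_0(z-k',z-k')\|\big)^{1/2}
\]
\[
\le C_{\mu(n)}\Bigg(\sum_{\|k\|\le\mu(n),\ \|k'\|\le\mu(n)} 1\Bigg)\sum_{x\in\mathbb{Z}^d}|f(x)|\,\|\rho_0(x,x)\| < \infty. \tag{2.60}
\]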

This generalization of Lemma 2.1 allows us to give an interpretation in terms of a classical random walk on $\mathbb{Z}^d$,
\[
\mathbb{P}(X_n = z) = \mathrm{Tr}(\rho_n(z,z)) \equiv W_z^{\rho_0}(n), \tag{2.61}
\]
with corresponding characteristic function $\Phi_n^{\rho_0}(y) = \sum_{z\in\mathbb{Z}^d} e^{iyz}\,W_z^{\rho_0}(n)$. Due to the expression of $W_z(n)$ as a convolution, the characteristic function will be expressed as a product of Fourier transforms. For $Y = (y,y')\in\mathbb{T}^d\times\mathbb{T}^d$, we define the matrix valued Fourier transform of a density matrix $\rho_0$ by
\[
R_0(Y) = \sum_{(k,k')\in\mathbb{Z}^d\times\mathbb{Z}^d} e^{i(yk+y'k')}\,\rho_0(k,k')\in M_{2d}(\mathbb{C}). \tag{2.62}
\]
Because of (2.46) and (2.44), $R_0$ is uniformly continuous in $Y$ and satisfies
\[
R_0^*(y,y') = R_0(-y',-y). \tag{2.63}
\]
Then, the Fourier transform of $J_k(n)$ being $J_n(y)$ by Lemma 2.8, the Fourier transform $R_n$ of $\rho_n$ reads
\[
R_n(y,y') = \sum_{(x,x')\in\mathbb{Z}^d\times\mathbb{Z}^d} e^{i(yx+y'x')}\sum_{(k,k')\in\mathbb{Z}^d\times\mathbb{Z}^d} J_k(n)\,\rho_0(x-k,x'-k')\,J_{k'}^*(n)
= J_n(y)\,R_0(y,y')\,J_n^*(-y'). \tag{2.64}
\]

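Proceeding as above, we arrive at the generalization of (2.41).

Lemma 2.12. For any $y\in\mathbb{T}^d$,
\[
\Phi_n^{\rho_0}(y) = \int_{\mathbb{T}^d}\mathrm{Tr}\,R_n(y-v,v)\,d\tilde v
= \int_{\mathbb{T}^d}\mathrm{Tr}\big(J_n(y-v)R_0(y-v,v)J_n^*(-v)\big)\,d\tilde v
= \int_{\mathbb{T}^d}\mathrm{Tr}\big(J_n^*(-v)J_n(y-v)R_0(y-v,v)\big)\,d\tilde v. \tag{2.65}
\]
Let $\mathbf{R}_0(y,y')$ denote the vector whose components are the matrix elements of $R_0(y,y')$,
\[
\mathbf{R}_0(y,y') = \sum_{(\tau,\tau')\in I_\pm^2}\langle\tau|R_0(y,y')\tau'\rangle\,|\tau\otimes\tau'\rangle\in\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}. \tag{2.66}
\]
Then, making use of the identity
\[
M_n(y,y')M_{n-1}(y,y')\cdots M_1(y,y') = J_n(y)\otimes\overline{J_n(-y')}, \tag{2.67}
\]
it is straightforward to get from (2.65) the following generalization of Proposition 2.9:

Proposition 2.13.
\[
\Phi_n^{\rho_0}(y) = \int_{\mathbb{T}^d}\langle\Psi_1|M_n(y-v,v)M_{n-1}(y-v,v)\cdots M_1(y-v,v)\,\mathbf{R}_0(y-v,v)\rangle\,d\tilde v. \tag{2.68}
\]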

Remark 2.14. The map $y\mapsto R_0(y-v,v)$ is continuous only under (2.46). Under Assumption R for an observable increasing at infinity, this map becomes more regular.

Remark 2.15. The procedure consisting in extending the space $\mathbb{C}^{2d}$, where the $C_j$'s act, to the tensor product $\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}$, where we consider $C_j\otimes\overline{C_j}$, is parallel to the GNS construction. Once the Fourier transform in position space is taken, it allows us to write the action of the one-step dynamics in the coin variables as the action of a unitary matrix in $\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}$ and to replace the density matrix by a vector.

3. Random Framework

For a deterministic non periodic set of coin operators, not much can be said about $\langle F\rangle_{\psi_0}(n)$ in general. Therefore we consider the following random quantum dynamical system, which defines a quantum walk with random update of the internal degrees of freedom at each time step. Let $C(\omega)$ be a random unitary matrix on $\mathbb{C}^{2d}$ with probability space $(\Omega,\sigma,d\mu)$, where $d\mu$ is a probability measure. We consider the random evolution operator obtained from sequences of i.i.d. coin matrices on $(\Omega^{\mathbb{N}^*},\mathcal{F},d\mathbb{P})$, where $\mathcal{F}$ is the $\sigma$-algebra generated by cylinders and $d\mathbb{P} = \otimes_{k\in\mathbb{N}^*}d\mu$, by
\[
U_\omega(n,0) = U_n(\omega)U_{n-1}(\omega)\cdots U_1(\omega),\quad\text{where } U_k(\omega) = S\,(C(\omega_k)\otimes\mathbb{I}), \tag{3.1}
\]
and $\omega = (\omega_1,\omega_2,\omega_3,\dots)\in\Omega^{\mathbb{N}^*}$. The evolution operator at time $n$ is now given by a product of i.i.d. unitary operators on $\mathcal{H}$. We shall denote statistical expectation values with respect to $\mathbb{P}$ by $\mathbb{E}$. All results of the previous section apply, with each occurrence of $C_s$ replaced by $C(\omega_s)$. A superscript $\omega$ will indicate the resulting randomness of the different quantities encountered. In particular, the random dynamical system at hand yields random matrices $J_k^\omega(n)\in M_{2d}(\mathbb{C})$, which, in turn, define random probability distributions $\{W_k^\omega(n)\}_{n\in\mathbb{N}^*}$ on $\mathbb{Z}^d$ which satisfy (2.9) for all $n\in\mathbb{N}^*$ and $\omega\in\Omega^{\mathbb{N}^*}$. The corresponding characteristic functions $\Phi_n^\omega$ become random Fourier series, whereas $\boldsymbol{\Phi}_n^\omega(Y)$ is obtained by the following product of i.i.d. random matrices
\[
\boldsymbol{\Phi}_n^\omega(Y) = M_{\omega_n}(Y)M_{\omega_{n-1}}(Y)\cdots M_{\omega_1}(Y)\,\boldsymbol{\Phi}_0, \tag{3.2}
\]
where, with $Y = (y,y')$,
\[
M_{\omega_s}(Y) = D(Y)V(\omega_s) = d(y)C(\omega_s)\otimes\overline{d(-y')C(\omega_s)} \tag{3.3}
\]
are distributed according to the image of $d\mu$ by the inverse mapping $C\mapsto D(Y)\,C\otimes\overline{C}$.


3.1. Diffusive averaged dynamics. We consider in this section the statistical average of the motion performed by the random quantum walk, and, more specifically, its diffusive characteristics. For the lattice observable $X^2$, we will derive results regarding the long time behavior of
\[
\mathbb{E}(\langle X^2\rangle^\omega_{\psi_0})(n) = \mathbb{E}\,\langle\psi_0|U_\omega^*(n,0)\,X^2\,U_\omega(n,0)\psi_0\rangle = \mathbb{E}\big(\mathbb{E}_{W_k^\omega(n)}(X_n^2)\big). \tag{3.4}
\]
It means, see (2.8), that we consider the motion corresponding to the averaged probability distributions defined by
\[
w_k(n) := \mathbb{E}(W_k^\omega(n)),\quad k\in\mathbb{Z}^d,\ n\in\mathbb{N}^*, \tag{3.5}
\]
with corresponding characteristic function
\[
\Phi_n(y) = \mathbb{E}_{w(n)}(e^{iyX_n}) = \sum_{k\in\mathbb{Z}^d} w_k(n)\,e^{iky}. \tag{3.6}
\]

Remark 3.1. To stress the dependence on $\omega$ in the distribution $W_k^\omega(n)$, we shall denote the corresponding random vector on the lattice by $X_n^\omega$. When we consider the averaged distribution $w(n)$ instead, we shall write $X_n$ for the corresponding random vector.

Our results generalize those of [24] in the sense that the distribution of the random coin matrices considered here is arbitrary. As a consequence, the analysis cannot be mapped to that of a persistent or correlated classical random walk on the lattice, as was observed in [24]. Let $\mathcal{E}$ and $M(Y)$ be the matrices defined by
\[
\mathcal{E} = \mathbb{E}(V(\omega)) = \mathbb{E}\big(C(\omega)\otimes\overline{C(\omega)}\big)\quad\text{and}\quad M(Y) = D(Y)\,\mathcal{E}. \tag{3.7}
\]
Note that while $V(\omega)$ is a unitary tensor product, its expectation $\mathcal{E}$ is neither unitary, nor a tensor product in general. But $\|\mathcal{E}\|\le 1$ and therefore $\|M(Y)\|\le 1$. Since the $\{C(\omega_s)\}_{s\in\mathbb{N}^*}$ are i.i.d., we immediately get from Corollary 2.6 and Proposition 2.9 that
\[
\mathbb{E}(\boldsymbol{\Phi}_n^\omega)(Y) = (M(Y))^n\,\boldsymbol{\Phi}_0, \tag{3.8}
\]
so that
\[
\mathbb{E}(\Phi_n^\omega)(y) = \int_{\mathbb{T}^d}\langle\Psi_1|(M(y-v,v))^n\,\boldsymbol{\Phi}_0\rangle\,d\tilde v. \tag{3.9}
\]
The analysis of the diffusive scaling limit (2.13) now relies on the spectral properties of the matrices $\mathcal{E}$ and $M(Y)$.


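3.2. Spectral properties. The structure of these matrices implies the following deterministic and averaged statements:

Lemma 3.2. Let $V(\omega) = C(\omega)\otimes\overline{C(\omega)}$, $\mathcal{E} = \mathbb{E}(V(\omega))$, $M(Y) = D(Y)\mathcal{E}$ and let $\mathcal{S}$ denote the unitary involution defined by $\mathcal{S}\,\varphi\otimes\psi = \psi\otimes\varphi$, for all $\varphi,\psi\in\mathbb{C}^{2d}$. Then, for all $\omega$ and all $y$, $\Psi_1 = \sum_{\tau\in I_\pm}|\tau\otimes\tau\rangle$ is invariant under $V(\omega)$, $\mathcal{E}$, $M(y,-y)$, $\mathcal{S}$ and their adjoints. Consequently
\[
\|\mathcal{E}\| = \mathrm{Spr}(\mathcal{E}) = \|M(y,-y)\| = \mathrm{Spr}(M(y,-y)) = 1, \tag{3.10}
\]
where Spr denotes the spectral radius. Moreover,
\[
\mathcal{S}\,V(\omega)\,\mathcal{S} = \overline{V(\omega)},\qquad \mathcal{S}\,M(y,-y)\,\mathcal{S} = \overline{M(y,-y)},\qquad \mathcal{S}\,\mathcal{E}\,\mathcal{S} = \overline{\mathcal{E}}, \tag{3.11}
\]
so that
\[
\sigma(V(\omega)) = \overline{\sigma(V(\omega))},\qquad \sigma(\mathcal{E}) = \overline{\sigma(\mathcal{E})},\qquad \sigma(M(y,-y)) = \overline{\sigma(M(y,-y))}. \tag{3.12}
\]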
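The deterministic part of Lemma 3.2 can be illustrated on a single coin matrix. The following sketch is a minimal check, assuming $d = 1$ and an arbitrarily parametrized $2\times 2$ unitary (the parametrization and the helper names `unitary`, `kron`, `SWAP` are illustrative choices, not taken from the text). It verifies that $\Psi_1 = \sum_\tau|\tau\otimes\tau\rangle$ is fixed by $V = C\otimes\overline{C}$ and that conjugating $V$ with the exchange operator gives $\overline{V}$, as in (3.11).

```python
import cmath
import math


def unitary(theta, alpha, beta, phi):
    """A 2x2 unitary [[a, b], [-e^{i phi} conj(b), e^{i phi} conj(a)]]."""
    a = math.cos(theta) * cmath.exp(1j * alpha)
    b = math.sin(theta) * cmath.exp(1j * beta)
    phase = cmath.exp(1j * phi)
    return [[a, b], [-phase * b.conjugate(), phase * a.conjugate()]]


def conj(m):
    """Entrywise complex conjugate."""
    return [[complex(z).conjugate() for z in row] for row in m]


def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


# Basis order of the tensor product: (0,0), (0,1), (1,0), (1,1).
PSI1 = [1.0, 0.0, 0.0, 1.0]          # sum over tau of |tau (x) tau>
SWAP = [[1, 0, 0, 0],                # exchange operator: phi (x) psi -> psi (x) phi
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]

C = unitary(0.4, 0.3, -1.1, 0.8)
V = kron(C, conj(C))

# Invariance of Psi_1 under V.
V_psi1 = [sum(V[i][j] * PSI1[j] for j in range(4)) for i in range(4)]
invariance_gap = max(abs(V_psi1[i] - PSI1[i]) for i in range(4))

# Symmetry (3.11): SWAP V SWAP equals the complex conjugate of V.
SVS = matmul(matmul(SWAP, V), SWAP)
Vbar = conj(V)
symmetry_gap = max(abs(SVS[i][j] - Vbar[i][j])
                   for i in range(4) for j in range(4))
print(invariance_gap, symmetry_gap)
```

Both gaps should vanish up to rounding for any choice of the four angles, since only unitarity of `C` is used. Averaging such matrices over a coin distribution preserves both properties, which is the content of the averaged statements in the lemma.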

Remark 3.3. For $M(Y)$, we only have
\[
\mathrm{Spr}(M(Y))\le\|M(Y)\|\le 1. \tag{3.13}
\]

Remark 3.4. Before taking expectation values, 1 is an eigenvalue at least $2d$ times degenerate for the unitary matrices $V(\omega)$ and $M_\omega(y,-y)$, for all $\omega$ and $y$, because of their tensor product structure (2.16), (2.33).

We shall work under an assumption which implies that in the long run, the averaged quantum walk loses track of interferences and acquires a universal diffusive behavior. At the spectral level, this is expressed by the fact that after taking expectation values, 1 is the only eigenvalue of $M(y,-y) = \mathbb{E}(M_\omega(y,-y))$ on the unit circle and it is simple. Let $D(z,r)\subset\mathbb{C}$ denote the open disc of radius $r$ centered at $z\in\mathbb{C}$.

Assumption S. For all $v\in[0,2\pi)^d = \mathbb{T}^d$,
\[
\sigma(M(-v,v))\cap\partial D(0,1) = \{1\}\ \text{and the eigenvalue 1 is simple.} \tag{3.14}
\]

Remark 3.5. Actually, because of the form of (3.9), it is enough to consider the spectrum of the restriction of $M(-v,v)$ to the $M^*(Y)$-cyclic subspace generated by $\Psi_1$. Set
\[
\mathcal{I} = \mathrm{Span}\,\{M^*(Y)^k\Psi_1,\ k\in\mathbb{N},\ Y\in\mathbb{T}^d\times\mathbb{T}^d\}, \tag{3.15}
\]
and let $P_{\mathcal{I}} = P_{\mathcal{I}}^*$ be the orthogonal projector onto $\mathcal{I}$. If $P_{\mathcal{I}}\ne\mathbb{I}$, we can work under the weaker

Assumption S'. For all $v\in[0,2\pi)^d = \mathbb{T}^d$,
\[
\sigma(M(-v,v)|_{\mathcal{I}})\cap\partial D(0,1) = \{1\}\ \text{and the eigenvalue 1 is simple.} \tag{3.16}
\]
Indeed, note that $M^*(Y)P_{\mathcal{I}} = P_{\mathcal{I}}M^*(Y)P_{\mathcal{I}}$ so that, at the level of linear forms,
\[
\langle\Psi_1|M(Y)^n = \langle M^*(Y)^nP_{\mathcal{I}}\Psi_1| = \langle\Psi_1|(P_{\mathcal{I}}M(Y)P_{\mathcal{I}})^n = \langle\Psi_1|\big(M(Y)|_{\mathcal{I}}\big)^n. \tag{3.17}
\]
While it is often necessary in applications to use S', see the examples, we keep working under S below in order not to burden the notation.


Let $\Psi_0 = \Psi_1/\|\Psi_1\|$. Under assumption S, $\Psi_0$ spans the one dimensional spectral subspace of $M(-v,v)$ associated with the eigenvalue 1. Moreover, by Lemma 3.2, the corresponding rank one spectral projector reads $P = |\Psi_0\rangle\langle\Psi_0|$ and is $v$-independent. With $Q = \mathbb{I} - P$, we have the spectral decomposition
\[
M(-v,v) = P + Q\,M(-v,v)\,Q, \tag{3.18}
\]
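The decomposition (3.18) predicts that powers of $M(-v,v)$ converge to $P$ whenever the spectral radius of $Q\,M(-v,v)\,Q$ is smaller than one. The sketch below illustrates this numerically for $d = 1$, using the coin distribution of the example in Sect. 3.3 (weights $p/2$, $p/2$, $q = 1-p$ on the identity, the flip and the Hadamard matrix, with $p = 1/\sqrt{2}$) and $r(\pm 1) = \pm 1$. The function names, the value of $v$ and the exponent 200 are arbitrary illustrative choices.

```python
import cmath
import math


def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def averaged_matrix(p):
    """E(C (x) conj(C)) for the three real coins of the example in Sect. 3.3."""
    q = 1.0 - p
    s = 1.0 / math.sqrt(2.0)
    ident = [[1.0, 0.0], [0.0, 1.0]]
    flip = [[0.0, 1.0], [1.0, 0.0]]
    hadamard = [[s, s], [s, -s]]
    terms = [(p / 2, ident), (p / 2, flip), (q, hadamard)]
    # The coins are real, so conj(C) = C in each tensor product.
    mats = [(w, kron(c, c)) for w, c in terms]
    return [[sum(w * m[i][j] for w, m in mats) for j in range(4)]
            for i in range(4)]


def m_minus_v_v(v, p):
    """M(-v, v) = (d(-v) (x) d(v)) E, with d(y) = diag(e^{-iy}, e^{iy})."""
    E = averaged_matrix(p)
    d_minus = [cmath.exp(1j * v), cmath.exp(-1j * v)]   # diagonal of d(-v)
    d_plus = [cmath.exp(-1j * v), cmath.exp(1j * v)]    # diagonal of d(v)
    diag = [d_minus[i] * d_plus[j] for i in range(2) for j in range(2)]
    return [[diag[i] * E[i][j] for j in range(4)] for i in range(4)]


def power(m, n):
    """m^n by repeated multiplication (n >= 1)."""
    out = m
    for _ in range(n - 1):
        out = matmul(out, m)
    return out


# P = |Psi_0><Psi_0| with Psi_0 = (|00> + |11>) / sqrt(2).
P = [[0.5, 0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0, 0.5]]

M = m_minus_v_v(0.9, 1.0 / math.sqrt(2.0))
M200 = power(M, 200)
deviation = max(abs(M200[i][j] - P[i][j]) for i in range(4) for j in range(4))
print(deviation)
```

For this particular distribution the restriction of $M(-v,v)$ to the range of $Q$ has norm strictly below one for every real $v$, so `deviation` decays geometrically in the exponent. This is a numerical illustration of Assumption S for one example, not a proof.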

where, under assumption S, the spectral radius $\mathrm{Spr}\,(Q\,M(-v,v)\,Q)$ is bounded above by a constant strictly smaller than 1, independent of $v\in\mathbb{T}^d$. In keeping with (3.9) and the diffusive scaling (2.13) to be used below, we perform a perturbative analysis of the spectrum of $M(y-v,v)$ for small values of $y$, uniformly in $v\in\mathbb{T}^d$. Let us introduce the following notation for $(y,v)\in\mathbb{T}^d\times\mathbb{T}^d$:
\[
M_v(y) = M(y-v,v),\quad\text{so that } M_v(0) = M(-v,v). \tag{3.19}
\]
Now, with $D(y,y') = d(y)\otimes d(y')$, see (2.32),
\[
M_v(y) = D(y,0)\,M_v(0) = M_v(0) + \sum_{\tau\in I_\pm}(e^{iyr(\tau)}-1)\,|\tau\rangle\langle\tau|\otimes\mathbb{I}\ M_v(0) \equiv M_v(0) + F(y)M_v(0), \tag{3.20}
\]
where $\|M_v(0)\| = 1$ and $\|F(y)M_v(0)\|\le c\|y\|$, with $c$ independent of $v$. Since the map $(y,v)\mapsto M_v(y)$ is actually analytic in $\mathbb{C}^d\times\mathbb{C}^d$, we can say more. For $\nu>0$, let $\mathbb{T}^d_\nu = \{z\in\mathbb{C}^d\ |\ \Re z\in\mathbb{T}^d,\ |\Im z_j|<\nu,\ j=1,\dots,d\}\subset\mathbb{C}^d$ be a complex neighborhood of $\mathbb{T}^d$. For $y_0>0$, let $B(0,y_0) = \{y\in\mathbb{C}^d\ |\ \|y\|\le y_0\}$. Analytic perturbation theory, see [18], then yields the following

Lemma 3.6. Under assumption S, there exist $0<\delta<1$, $\nu = \nu(\delta)>0$ and $y_0 = y_0(\delta)>0$ such that $(y,v)\in(\mathbb{T}^d_\nu\cap B(0,y_0))\times\mathbb{T}^d_\nu$ implies
\[
\sigma(M_v(y))\cap D(1,\delta) = \{\lambda_1(y,v)\}, \tag{3.21}
\]
\[
\sigma(M_v(y))\setminus\{\lambda_1(y,v)\}\subset D(0,1-\delta). \tag{3.22}
\]
Moreover, $\lambda_1(y,v)$ is simple, analytic in $(\mathbb{T}^d_\nu\cap B(0,y_0))\times\mathbb{T}^d_\nu$ and $\lambda_1(0,v) = 1$ for all $v\in\mathbb{T}^d_\nu$. The corresponding spectral decomposition reads
\[
M_v(y) = \lambda_1(y,v)P(y,v) + M_Q(y,v), \tag{3.23}
\]
where $P(y,v)$ is analytic in $(\mathbb{T}^d_\nu\cap B(0,y_0))\times\mathbb{T}^d_\nu$ and $P(0,v) = P$ for all $v\in\mathbb{T}^d_\nu$. With $Q(y,v) = \mathbb{I} - P(y,v)$, the restriction $M_Q(y,v) = Q(y,v)M_v(y)Q(y,v)$ satisfies $\mathrm{Spr}\,(M_Q(y,v)) < 1-\delta$.

We need to compute $\lambda_1(y,v) = \mathrm{Tr}(P(y,v)M_v(y))$ to second order in $y$. We expand $F(y)$ as
\[
F(y) = F_1(y) + F_2(y) + O(\|y\|^3) = i\sum_{\tau\in I_\pm} yr(\tau)\,|\tau\rangle\langle\tau|\otimes\mathbb{I} - \sum_{\tau\in I_\pm}\frac{(yr(\tau))^2}{2}\,|\tau\rangle\langle\tau|\otimes\mathbb{I} + O(\|y\|^3) \tag{3.24}
\]
and introduce the (unperturbed) reduced resolvent $S_v(z)$ for $v\in\mathbb{T}^d_\nu$ and $z$ in a neighborhood of 1 such that
\[
(M_v(0)-z)^{-1} = \frac{P}{1-z} + S_v(z)\quad\text{with } P = |\Psi_0\rangle\langle\Psi_0|. \tag{3.25}
\]


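We have for a simple eigenvalue (see [18] p. 69)
\[
\lambda_1(y,v) = 1 + \mathrm{Tr}(F_1(y)M_v(0)P) + \mathrm{Tr}\big(F_2(y)M_v(0)P - F_1(y)M_v(0)S_v(1)F_1(y)M_v(0)P\big) + O_v(\|y\|^3). \tag{3.26}
\]
Explicit computations with symmetry considerations yield

Lemma 3.7. For all $v\in\mathbb{T}^d_\nu$ and $y\in B(0,y_0)$, there exists a symmetric matrix $D(v)\in M_d(\mathbb{C})$ such that
\[
\lambda_1(y,v) = 1 + \frac{i}{2d}\sum_{\tau\in I_\pm} yr(\tau) + \frac{1}{2d}\sum_{\tau\in I_\pm}\frac{(yr(\tau))^2}{2} - \Bigg(\frac{1}{2d}\sum_{\tau\in I_\pm} yr(\tau)\Bigg)^{2} + \frac{1}{2d}\sum_{\tau,\tau'\in I_\pm}(yr(\tau))(yr(\tau'))\,\langle\tau\otimes\tau|S_v(1)\,\tau'\otimes\tau'\rangle + O_v(\|y\|^3)
\]
\[
\equiv 1 + \frac{i}{2d}\sum_{\tau\in I_\pm} yr(\tau) - \frac{1}{2}\langle y|D(v)y\rangle + O_v(\|y\|^3). \tag{3.27}
\]
The map $v\mapsto D(v)$ is analytic in $\mathbb{T}^d_\nu$; when $v\in\mathbb{T}^d$, $D(v)\in M_d(\mathbb{R})$ is non-negative and $D(v)_{j,k} = -\frac{\partial^2}{\partial y_j\partial y_k}\lambda_1(0,v)$, $j,k\in\{1,2,\dots,d\}$. Moreover, $O_v(\|y\|^3)$ is uniform in $v\in\mathbb{T}^d_\nu$.

Proof. Existence and analyticity in $v$ of $D(v)$ follow from analyticity of $\lambda_1$ in $y$ and analyticity of $S_v(1)$ in $v$, see (3.25). Since $D(v)_{j,k} = -\frac{\partial^2}{\partial y_j\partial y_k}\lambda_1(0,v)$, the matrix is symmetric. For $v\in\mathbb{T}^d$, the symmetry (3.11) implies $\mathcal{S}\,S_v(1)\,\mathcal{S} = \overline{S_v(1)}$ so that
\[
\langle\tau\otimes\tau|S_v(1)\,\tau'\otimes\tau'\rangle = \langle\tau\otimes\tau|\mathcal{S}\,S_v(1)\,\mathcal{S}\,\tau'\otimes\tau'\rangle = \overline{\langle\tau\otimes\tau|S_v(1)\,\tau'\otimes\tau'\rangle}. \tag{3.28}
\]
Hence the matrix elements of $D(v)$ for $v\in\mathbb{T}^d$ are real as well. Finally, (3.13) implies that $\langle y|D(v)y\rangle\ge 0$ for all $y\in\mathbb{T}^d$.

Remark 3.8. Using the notation $\bar f = \frac{1}{2d}\sum_{\tau\in I_\pm} f(\tau)$ for any function on $I_\pm$, $D(v)$ reads
\[
D(v) = 2\,|\bar r\rangle\langle\bar r| - \overline{|r\rangle\langle r|} - \frac{1}{d}\sum_{\tau,\tau'\in I_\pm}|r(\tau)\rangle\,\langle\tau\otimes\tau|S_v(1)\,\tau'\otimes\tau'\rangle\,\langle r(\tau')|, \tag{3.29}
\]
where the "bra" and "ket" notation is understood in $\mathbb{R}^d$ for vectors $r(\tau)$ and in $\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}$ for $\tau\otimes\tau$. We are now set to prove

Proposition 3.9. Under assumption S, uniformly in $v\in\mathbb{T}^d_\nu$, in $y$ in compact sets of $\mathbb{C}^d$ and in $t$ in compact sets of $\mathbb{R}_+^*$,
\[
\lim_{n\to\infty} M_v^{[tn]}(y/n) = e^{ity\bar r}\,P, \tag{3.30}
\]
\[
\lim_{n\to\infty} M_v^{[tn]}(y/\sqrt{n})\,e^{-i[tn]\bar r y/\sqrt{n}} = e^{-\frac{t}{2}\langle y|D(v)y\rangle}\,P. \tag{3.31}
\]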


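Proof. Let $v\in\mathbb{T}^d_\nu$, $t\in G\subset\mathbb{R}_+^*$, and $y\in K\subset\mathbb{C}^d$, with $G$ and $K$ compact. We consider $n$ large enough so that $y/\sqrt{n}$ and $y/n$ belong to $B(0,y_0)$, uniformly in $y\in K$. The decomposition (3.23) implies, for any $n$ large enough,
\[
M_v^{[tn]}(y) = \lambda_1^{[tn]}(y,v)\,P(y,v) + M_Q^{[tn]}(y,v), \tag{3.32}
\]
where Lemma 3.6 implies the existence of $c>0$ and $1>\delta'>1-\delta$, uniform in $(y,v,t)\in\mathbb{T}^d_\nu\times K\times G$, such that
\[
\|M_Q^{[tn]}(y,v)\|\le c\,\delta'^{\,tn}. \tag{3.33}
\]
Moreover, by Lemma 3.7,
\[
\lambda_1^{[tn]}(y/n,v) = \Big(1 + i\frac{\bar r y}{n} + O_v(\|y\|^2/n^2)\Big)^{[tn]}\ \xrightarrow[n\to\infty]{}\ e^{ity\bar r}, \tag{3.34}
\]
\[
\lambda_1^{[tn]}(y/\sqrt{n},v)\,e^{-i[tn]\frac{\bar r y}{\sqrt{n}}} = \Bigg(\Big(1 + i\frac{\bar r y}{\sqrt{n}} - \frac{\langle y|D(v)y\rangle}{2n} + O_v\Big(\frac{\|y\|^3}{n^{3/2}}\Big)\Big)\,e^{-i\frac{\bar r y}{\sqrt{n}}}\Bigg)^{[tn]}\ \xrightarrow[n\to\infty]{}\ e^{-\frac{t}{2}\langle y|D(v)y\rangle}, \tag{3.35}
\]
and both $P(y/\sqrt{n},v)$ and $P(y/n,v)$ tend to $P$ as $n\to\infty$, uniformly in $(v,y)\in\mathbb{T}^d_\nu\times K$.

With these technical results behind us, we come to the main results of this section, which are the existence of a diffusion matrix and central limit type behaviors. Let $\mathcal{N}(0,\Sigma)$ denote the centered normal law in $\mathbb{R}^d$ with positive definite covariance matrix $\Sigma$ and let us write $X^\omega\sim\mathcal{N}(0,\Sigma)$ for a random vector $X^\omega\in\mathbb{R}^d$ with distribution $\mathcal{N}(0,\Sigma)$. The superscript $\omega$ can be thought of as a vector in $\mathbb{R}^d$ such that for any Borel set $A\subset\mathbb{R}^d$,
\[
\mathbb{P}(X^\omega\in A) = \frac{1}{(2\pi)^{d/2}\sqrt{\det(\Sigma)}}\int_A e^{-\frac{1}{2}\langle\omega|\Sigma^{-1}\omega\rangle}\,d\omega. \tag{3.36}
\]
The corresponding characteristic function is $\Phi^{\mathcal{N}}(y) = \mathbb{E}(e^{iyX^\omega}) = e^{-\frac{1}{2}\langle y|\Sigma y\rangle}$. The first result concerning the asymptotics of the random variable $X_n$ reads as follows: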

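Theorem 3.10. Under Assumption S, uniformly in $y$ in compact sets of $\mathbb{C}^d$ and in $t$ in compact sets of $\mathbb{R}_+^*$,
\[
\lim_{n\to\infty}\Phi^{\varphi_0}_{[tn]}(y/n) = e^{ity\bar r}, \tag{3.37}
\]
\[
\lim_{n\to\infty} e^{-i[tn]\frac{\bar r y}{\sqrt{n}}}\,\Phi^{\varphi_0}_{[tn]}(y/\sqrt{n}) = \int_{\mathbb{T}^d} e^{-\frac{t}{2}\langle y|D(v)y\rangle}\,d\tilde v, \tag{3.38}
\]
where the right-hand side admits an analytic continuation in $(t,y)\in\mathbb{C}\times\mathbb{C}^d$. In particular, for any $(i,j)\in\{1,2,\dots,d\}^2$,
\[
\lim_{n\to\infty}\frac{\langle X_i\rangle_{\psi_0}(n)}{n} = \bar r_i, \tag{3.39}
\]
\[
\lim_{n\to\infty}\frac{\langle(X-n\bar r)_i(X-n\bar r)_j\rangle_{\psi_0}(n)}{n} = \int_{\mathbb{T}^d} D_{ij}(v)\,d\tilde v. \tag{3.40}
\]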


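Remark 3.11. We will call diffusion matrices both $D(v)$ and $\mathbb{D} = \int_{\mathbb{T}^d} D(v)\,d\tilde v$. For any $s = (s_1,s_2,\dots,s_d)\in\mathbb{N}^d$ with $|s| = \sum_{j=1}^d s_j$ and $D_y^s = \frac{\partial^{s_1}}{\partial y_1^{s_1}}\cdots\frac{\partial^{s_d}}{\partial y_d^{s_d}}$,
\[
\lim_{n\to\infty}\frac{\langle X_1^{s_1}X_2^{s_2}\cdots X_d^{s_d}\rangle_{\psi_0}(n)}{n^{|s|}} = \bar r_1^{\,s_1}\bar r_2^{\,s_2}\cdots\bar r_d^{\,s_d}, \tag{3.41}
\]
\[
\lim_{n\to\infty}\frac{\langle(X-n\bar r)_1^{s_1}(X-n\bar r)_2^{s_2}\cdots(X-n\bar r)_d^{s_d}\rangle_{\psi_0}(n)}{n^{|s|/2}} = (-i)^{|s|}\int_{\mathbb{T}^d}\big(D_y^s\,e^{-\frac{1}{2}\langle y|D(v)y\rangle}\big)\big|_{y=0}\,d\tilde v, \tag{3.42}
\]
which shows that all odd moments ($|s|$ odd) of the centered variable are zero, whereas all even moments can be computed explicitly.

Proof. This is a direct consequence of Proposition 3.9 and Definition (3.9). The uniformity of the convergence in the variables $(v,y,t)$ in compact sets provides analyticity after integration in $v\in\mathbb{T}^d$ and commutation of the limit and derivations.

For initial conditions corresponding to a density matrix $\rho_0$, we have

Corollary 3.12. Under Assumption S, for any $t\ge 0$,
\[
\lim_{n\to\infty}\Phi^{\rho_0}_{[tn]}(y/n) = e^{ity\bar r}, \tag{3.43}
\]
\[
\lim_{n\to\infty} e^{-i[tn]\frac{\bar r y}{\sqrt{n}}}\,\Phi^{\rho_0}_{[tn]}(y/\sqrt{n}) = \int_{\mathbb{T}^d} e^{-\frac{t}{2}\langle y|D(v)y\rangle}\,\langle\Psi_1|\mathbf{R}_0(-v,v)\rangle\,d\tilde v = \int_{\mathbb{T}^d} e^{-\frac{t}{2}\langle y|D(v)y\rangle}\,\mathrm{Tr}\,(R_0(-v,v))\,d\tilde v, \tag{3.44}
\]
where $\mathbf{R}_0$ is the vector (2.66) built from the matrix elements of $R_0$ and
\[
R_0(-v,v) = \sum_{(k,l)\in\mathbb{Z}^d\times\mathbb{Z}^d} e^{ivl}\,\rho_0(k,k+l). \tag{3.45}
\]

Remark 3.13. Under Assumption R for the observable $X^2$, we deduce that for any $(i,j)\in\{1,2,\dots,d\}^2$,
\[
\lim_{n\to\infty}\frac{\langle X_i\rangle_{\rho_0}(n)}{n} = \bar r_i, \tag{3.46}
\]
\[
\lim_{n\to\infty}\frac{\langle(X-n\bar r)_i(X-n\bar r)_j\rangle_{\rho_0}(n)}{n} = \int_{\mathbb{T}^d} D_{ij}(v)\,\mathrm{Tr}\,(R_0(-v,v))\,d\tilde v. \tag{3.47}
\]

From Corollary 3.12 and Theorem 3.10, we gather that the characteristic function of the centered variable $X_n - n\bar r$ in the diffusive scaling $T = nt$, $Y = y/\sqrt{n}$, where $n\to\infty$, converges to
\[
\int_{\mathbb{T}^d}\mathcal{F}\Bigg(\frac{e^{-\frac{1}{2t}\langle x|D^{-1}(v)x\rangle}}{(2\pi t)^{d/2}\sqrt{\det D(v)}}\Bigg)(y)\ \mathrm{Tr}\,(R_0(-v,v))\,d\tilde v, \tag{3.48}
\]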


where the function under the Fourier transform symbol $\mathcal{F}$ is a solution to the diffusion equation
\[
\frac{\partial\varphi}{\partial t} = \frac{1}{2}\sum_{i,j=1}^{d} D_{ij}(v)\,\frac{\partial^2\varphi}{\partial x_i\partial x_j}. \tag{3.49}
\]
As explained in [15,17], it follows that the position space density $\sum_{k} w_k([nt])\,\delta(\sqrt{n}\,x-k)$ converges in the sense of distributions to a superposition of solutions to the diffusion equations (3.49) as $n\to\infty$. In the case where the diffusion matrix $D(v) = \mathbb{D}$ is independent of $v$, Theorem 3.10 with $t = 1$ says that the characteristic function of the rescaled variable $(X_n - n\bar r)/\sqrt{n}$, defined by $\mathbb{E}_{w_k(n)}(e^{iy(X_n-n\bar r)/\sqrt{n}})$, converges to $e^{-\frac{1}{2}\langle y|\mathbb{D}y\rangle}$, which is the characteristic function of the normal law $\mathcal{N}(0,\mathbb{D})$. Hence, by Lévy's continuity theorem, see e.g. Theorem 7.6 in [7],

Corollary 3.14. Assume S and suppose $D(v) = \mathbb{D} > 0$ is independent of $v\in\mathbb{T}^d$. Then, for any initial vector $\psi_0 = \varphi_0\otimes|0\rangle$, we have as $n\to\infty$, in distribution,
\[
\frac{X_n - n\bar r}{\sqrt{n}}\ \longrightarrow\ X^\omega\sim\mathcal{N}(0,\mathbb{D}). \tag{3.50}
\]

Remark 3.15. All results of this section hold under Assumption S' only, mutatis mutandis. In particular, if the invariant subspace $\mathcal{I}$ coincides with $\mathrm{span}\,\{|\sigma\otimes\sigma\rangle\}_{\sigma\in I_\pm}$, the matrix $M_v(y)$ is actually independent of $v$, because $D(-v,v)$ acts like the identity on the latter space. Consequently, the diffusion matrix is independent of $v$ as well.

Remark 3.16. We have chosen to randomize the coin state updates only, but it is possible to adapt the method to deal with random jump functions as well.

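3.3. Example. For $d = 1$, consider the set of three unitary matrices in $\mathbb{C}^2$ given by
\[
\mathbb{I},\qquad \begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix},\qquad \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ 1 & -1\end{pmatrix},
\]
and the distribution which assigns the probability $p/2 > 0$ to the first and second matrices and $q = 1-p$ to the third one. Let $r$ be the jump function defined by $r(\pm 1) = \pm 1$, so that $\bar r = 0$. Then, in the ordered basis $\{|-1\otimes-1\rangle, |1\otimes 1\rangle, |-1\otimes 1\rangle, |1\otimes-1\rangle\}$, the corresponding matrix $\mathcal{E}$ reads
\[
\mathcal{E} = \frac{1}{2}\begin{pmatrix} 1 & 1 & q & q\\ 1 & 1 & -q & -q\\ q & -q & p-q & 1\\ q & -q & 1 & p-q\end{pmatrix} \tag{3.51}
\]
and $D(-v,v) = \mathrm{diag}(1,1,e^{i2v},e^{-i2v})$. We introduce the following orthonormal basis whose first vector is $\Psi_0$:
\[
\big\{(|-1\otimes-1\rangle + |1\otimes 1\rangle)/\sqrt{2},\ (|-1\otimes-1\rangle - |1\otimes 1\rangle)/\sqrt{2},\ |-1\otimes 1\rangle,\ |1\otimes-1\rangle\big\} \equiv \{\Psi_0,\varphi_1,\varphi_2,\varphi_3\}. \tag{3.52}
\]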


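In this basis, $M_v(0) = D(-v,v)\mathcal{E}$ is written as
\[
M_v(0) = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & \frac{q}{\sqrt{2}} & \frac{q}{\sqrt{2}}\\ 0 & e^{i2v}\frac{q}{\sqrt{2}} & e^{i2v}\frac{(p-q)}{2} & e^{i2v}\frac{1}{2}\\ 0 & e^{-i2v}\frac{q}{\sqrt{2}} & e^{-i2v}\frac{1}{2} & e^{-i2v}\frac{(p-q)}{2}\end{pmatrix} \equiv 1\oplus N_v, \tag{3.53}
\]
where $N_v$ is the restriction of $M_v(0)$ to the subspace orthogonal to $\mathbb{C}\Psi_0$. In order to make computations easier, we specialize to the case $p = 1/\sqrt{2}$, so that
\[
q/\sqrt{2} = (p-q)/2 = (\sqrt{2}-1)/2 \equiv \gamma. \tag{3.54}
\]
This way we can write, in $(\mathbb{C}\Psi_0)^\perp$,
\[
N_v - \mathbb{I} = \begin{pmatrix} -1 & \gamma & \gamma\\ e^{i2v}\gamma & e^{i2v}\gamma-1 & e^{i2v}\frac{1}{2}\\ e^{-i2v}\gamma & e^{-i2v}\frac{1}{2} & e^{-i2v}\gamma-1\end{pmatrix}. \tag{3.55}
\]
We have $\det(N_v-\mathbb{I}) = 2\cos(2v)(\gamma^2+\gamma) - (2\gamma^3+3/4) < 0$ for all $v\in\mathbb{T}$, so that 1 is not an eigenvalue of $N_v$ and $\mathbb{C}\Psi_0$ is the only subspace of vectors invariant under $M_v(0)$. Hence $S_v(1) = (N_v-\mathbb{I})^{-1}$. To get the diffusion constant $D(v)$ we need to compute $\langle\tau\otimes\tau|S(1)\,\tau'\otimes\tau'\rangle$ for $\tau,\tau' = \pm 1$, where $S(1)$ is defined on $(\mathbb{C}\Psi_0)^\perp = Q\,\mathbb{C}^4$. We have $Q\,|\pm 1\otimes\pm 1\rangle = \mp\varphi_1/\sqrt{2}$, so that
\[
\langle\tau\otimes\tau|S(1)\,\tau'\otimes\tau'\rangle = \frac{\tau\tau'}{2}\,\langle\varphi_1|(N_v-\mathbb{I})^{-1}\varphi_1\rangle = \frac{\tau\tau'}{2}\ \frac{\gamma^2 - 2\cos(2v)\gamma + 3/4}{2\cos(2v)(\gamma^2+\gamma) - (2\gamma^3+3/4)}. \tag{3.56}
\]
Taking into account that $r(\tau) = \tau$ and $\bar r = 0$ in formula (3.27), the sum over $\tau,\tau' = \pm 1$ contributes four equal terms, and we get
\[
D(v)\,y^2 = -y^2\big(1 + 2\,\langle\varphi_1|(N_v-\mathbb{I})^{-1}\varphi_1\rangle\big), \tag{3.57}
\]
which, plugging in the value of $\gamma$, eventually yields
\[
D(v) = \frac{16 - 9\sqrt{2} + 2\cos(2v)(5-4\sqrt{2})}{5\sqrt{2} - 4 - 2\cos(2v)} > 0. \tag{3.58}
\]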
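The closed form (3.58) can be cross-checked numerically from the definitions. The sketch below is an illustration only, and its function names are hypothetical. It builds $\mathcal{E}$ from the three coin matrices, forms $M_v(0) = D(-v,v)\mathcal{E}$, restricts it to the span of $\{\varphi_1,\varphi_2,\varphi_3\}$, and evaluates $D(v) = -(1 + 2\langle\varphi_1|(N_v-\mathbb{I})^{-1}\varphi_1\rangle)$. The factor 2 in front of the resolvent term is the one that reproduces (3.58) and the value $D(0) = 2\sqrt{2}-1$ quoted in Sect. 5. The tensor basis is ordered as $(-,-), (-,+), (+,-), (+,+)$, which differs from the ordering used in (3.51) but does not affect the result.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)


def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def averaged_matrix(p):
    """E(C (x) conj(C)) for the identity, the flip and the Hadamard coin."""
    q = 1.0 - p
    s = 1.0 / SQRT2
    ident = [[1.0, 0.0], [0.0, 1.0]]
    flip = [[0.0, 1.0], [1.0, 0.0]]
    hadamard = [[s, s], [s, -s]]
    # All three coins are real, so conj(C) = C.
    mats = [(p / 2, kron(ident, ident)), (p / 2, kron(flip, flip)),
            (q, kron(hadamard, hadamard))]
    return [[sum(w * m[i][j] for w, m in mats) for j in range(4)]
            for i in range(4)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def diffusion_numeric(v, p=1.0 / SQRT2):
    """D(v) from the reduced resolvent, for d = 1 and r(tau) = tau."""
    E = averaged_matrix(p)
    d_minus = [cmath.exp(1j * v), cmath.exp(-1j * v)]   # diagonal of d(-v)
    d_plus = [cmath.exp(-1j * v), cmath.exp(1j * v)]    # diagonal of d(v)
    diag = [d_minus[i] * d_plus[j] for i in range(2) for j in range(2)]
    M = [[diag[i] * E[i][j] for j in range(4)] for i in range(4)]
    s = 1.0 / SQRT2
    # phi_1 = (|--> - |++>)/sqrt(2), phi_2 = |-+>, phi_3 = |+->
    basis = [[s, 0.0, 0.0, -s], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    N = [[sum(basis[a][i] * M[i][j] * basis[b][j]
              for i in range(4) for j in range(4))
          for b in range(3)] for a in range(3)]
    A = [[N[a][b] - (1.0 if a == b else 0.0) for b in range(3)]
         for a in range(3)]
    # (1,1) entry of the inverse of A = cofactor / determinant.
    resolvent11 = (A[1][1] * A[2][2] - A[1][2] * A[2][1]) / det3(A)
    return -(1.0 + 2.0 * resolvent11.real)


def diffusion_closed_form(v):
    """Right-hand side of (3.58), valid for p = 1/sqrt(2)."""
    c = math.cos(2.0 * v)
    return (16 - 9 * SQRT2 + 2 * c * (5 - 4 * SQRT2)) / (5 * SQRT2 - 4 - 2 * c)


for v in (0.0, 0.3, 1.0, 2.5):
    print(v, diffusion_numeric(v), diffusion_closed_form(v))
```

The two columns should agree up to rounding for every real $v$. Passing a different `p` to `diffusion_numeric` gives the diffusion constant for other weights, for which (3.58) no longer applies.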

4. Einstein's Relation

An interesting feature of the previous results is that the asymptotic averaged velocity $\bar r$ depends on the jump function $r$ only and is independent of the coin distribution. This is reminiscent of the asymptotic velocity $v(F)$ reached by a particle subject to a deterministic force of amplitude $F$ in a random dissipative environment modeled by random forces. Given an asymptotic velocity, the mobility vector $\mu$ is defined as the ratio
\[
\mu = \lim_{F\to 0} v(F)/F. \tag{4.1}
\]
This mobility $\mu$ is then related to the fluctuations of the system around the asymptotic trajectory by Einstein's relation which says that the diffusion matrix is proportional to $\mu$.


In the present framework, neither dissipation nor forces can be directly traced back to describe the asymptotic motion
\[
\langle X\rangle_{\psi_0}(n) = n\bar r + o(n),\quad n\to\infty. \tag{4.2}
\]
Moreover, the motion taking place on a lattice, the asymptotic velocity $\bar r$ has a minimal amplitude $1/(2d)$, if it is non zero, which prevents a behavior similar to (4.1). Nevertheless, the jump function $r$ which characterizes the deterministic motion can be thought of as an external control parameter, similar to a driving force. In order to get an asymptotic velocity which vanishes with the exterior control parameter, we rescale the lattice $\mathbb{Z}^d$ to $(\mathbb{Z}/l)^d$, with $l>0$. This means we consider the variable
\[
Y_n = X_n/l\in(\mathbb{Z}/l)^d. \tag{4.3}
\]
Then we introduce a parameter $s\in\mathbb{N}$ as follows. Let $r_1$ and $r_0$ be two non-zero jump functions such that
\[
\bar r_1 = 0\quad\text{and}\quad\bar r_0\ne 0. \tag{4.4}
\]
We define a new $s$-dependent jump function by
\[
r_s(\tau) = s\,r_1(\tau) + r_0(\tau)\in\mathbb{Z}^d\quad\text{such that } \bar r_s = \bar r_0, \tag{4.5}
\]
and we will consider the large $s$ limit. Hence the rescaled variable $Y_n$ satisfies
\[
\lim_{n\to\infty}\frac{\langle Y\rangle_{\psi_0}(n)}{n} = \frac{\bar r_0}{l} := v^Y, \tag{4.6}
\]
\[
\lim_{n\to\infty}\frac{\langle(Y-n\bar r_s)_i(Y-n\bar r_s)_j\rangle_{\psi_0}(n)}{n} = \frac{s^2}{l^2}\Bigg(\int_{\mathbb{T}^d} D^{(1)}_{ij}(v)\,d\tilde v + O(1/s)\Bigg) := \mathbb{D}^Y_{ij}, \tag{4.7}
\]
where $D^{(1)}(v)$ is the diffusion matrix computed by means of the jump function $r_1$ and the remainder term is uniform in $v\in\mathbb{T}^d$. Therefore, choosing the scale $l = s\in\mathbb{N}$, we get that the diffusion matrix $\mathbb{D}^Y$ is finite for large $s$ whereas the asymptotic velocity tends to zero. Hence, setting $F = 1/s$, we get for $s$ large,
\[
\mu = \lim_{F=1/s\to 0} v^Y/F = \bar r_0, \tag{4.8}
\]
\[
\mathbb{D}^Y = \int_{\mathbb{T}^d} D^{(1)}(v)\,d\tilde v + O(1/s)\ \to\ K\mu,\quad\text{as } s\to\infty. \tag{4.9}
\]
The last formula is admittedly a consequence of a rather ad hoc construction. On the other hand, assuming that $\bar r_1\ne 0$, we get with the same scaling $v^Y = \bar r_1$, which never vanishes.


5. Moderate Deviations

The spectral properties of the matrix $M(y-v,v)$ proven in Sect. 3.2 allow us to obtain further results on the behavior with $n$ of the distribution of the random variable $X_n$ defined by (2.2). This section is devoted to establishing some moderate deviations results on the random variable $X_n$. We consider initial conditions of the form $\psi_0 = \varphi_0\otimes|0\rangle$ and we will be concerned with $X_n - n\bar r$. Moderate deviations results depend on asymptotic behaviors in different regimes of the logarithmic generating function of $X_n - n\bar r$, defined for $y\in\mathbb{R}^d$ by
\[
\Lambda_n(y) = \ln\big(\mathbb{E}_{w(n)}(e^{y(X_n-n\bar r)})\big)\in(-\infty,\infty]. \tag{5.1}
\]
This function $\Lambda_n$ is convex and $\Lambda_n(0) = 0$. Let $\{a_n\}_{n\in\mathbb{N}}$ be a positive valued sequence such that
\[
\lim_{n\to\infty} a_n = \infty\quad\text{and}\quad\lim_{n\to\infty} a_n/n = 0. \tag{5.2}
\]
Define $Y_n = (X_n - n\bar r)/\sqrt{n a_n}$ and, for any $y\in\mathbb{R}^d$, let $\tilde\Lambda_n(y) = \ln(\mathbb{E}_{w(n)}(e^{yY_n}))$ be the logarithmic generating function of $Y_n$.

Proposition 5.1. Assume S and further suppose $D(v)>0$ for all $v\in\mathbb{T}^d$. Let $y\in\mathbb{R}^d\setminus\{0\}$ and assume the real analytic map $\mathbb{T}^d\ni v\mapsto\langle y|D(v)y\rangle\in\mathbb{R}_+^*$ is either constant or admits a finite set $\{v_j(y)\}_{j=1,\dots,J}$ of non-degenerate maximum points in $\mathbb{T}^d$. Then, for any $y\in\mathbb{R}^d$,
\[
\lim_{n\to\infty}\frac{1}{a_n}\tilde\Lambda_n(a_ny) = \frac{1}{2}\langle y|D(v_1(y))y\rangle, \tag{5.3}
\]
which is a smooth convex function of $y$.

Proof. This proposition essentially follows from Lemmas 3.6 and 3.7 and the asymptotic evaluation of an integral. Let $b_n = \sqrt{a_n/n}$, so that $\lim_{n\to\infty} b_n = 0$. By construction, $\tilde\Lambda_n(a_ny) = \Lambda_n(b_ny)$, where, according to Lemmas 3.6 and 3.7, there exists $\gamma>0$ such that for $n$ large enough,
\[
\exp(\Lambda_n(b_ny)) = \int_{\mathbb{T}^d}\Big(1 + \frac{b_n^2}{2}\langle y|D(v)y\rangle + O_v(b_n^3\|y\|^3)\Big)^{n}\big(1 + O_v(b_n\|y\|)\big)\,d\tilde v + O(e^{-\gamma n})
\]
\[
= \int_{\mathbb{T}^d} e^{\frac{a_n}{2}\langle y|D(v)y\rangle + O_v(a_nb_n\|y\|^3)}\big(1 + O_v(b_n\|y\|)\big)\,d\tilde v + O(e^{-\gamma n}). \tag{5.4}
\]
All remainder terms $O_v(\cdots)$ are analytic in $v\in\mathbb{T}_\nu$, as well as $D(v)$. An application of Laplace's method around each of the non-degenerate maximum points yields, for $1/3<\alpha<1/2$,
\[
\exp(\Lambda_n(b_ny)) = \sum_{j=1}^{J} e^{\frac{a_n}{2}\langle y|D(v_j(y))y\rangle + O(b_n\|y\|^3)}\Big(G_j(y)/a_n^{d/2} + O(b_n\|y\|) + O(1/a_n^{3\alpha-1})\Big) + O(e^{-\gamma n}) + O(e^{-Ka_n^{1-2\alpha}}), \tag{5.5}
\]
where $G_j(y)>0$, $K>0$, from which the result follows. The case where $D(v)$ is independent of $v$ follows directly from (5.4). The convexity of the limit follows from the convexity of the functions $\tilde\Lambda_n$. The assumed non-degeneracy of the maximum points ensures that the functions $\mathbb{R}^d\setminus\{0\}\ni y\mapsto v_j(y)$ are all smooth, by the implicit function theorem.


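Let us recall a few definitions and notations. A rate function $I$ is a lower semicontinuous map from $\mathbb{R}^d$ to $[0, \infty]$ s.t. for all $\alpha \geq 0$, the level sets $\{x \mid I(x) \leq \alpha\}$ are closed. When the level sets are compact, the rate function $I$ is called good. For any $\Gamma \subset \mathbb{R}^d$, $\Gamma^0$ denotes the interior of $\Gamma$, while $\bar\Gamma$ denotes its closure. As a direct consequence of the Gärtner-Ellis Theorem, see [13] Sect. 2.3, we get

Theorem 5.2. Define $\Lambda^*(x) = \sup_{y\in\mathbb{R}^d}\big(\langle y|x\rangle - \frac{1}{2}\langle y|\mathbb{D}(v_1(y))y\rangle\big)$, for all $x \in \mathbb{R}^d$. Then $\Lambda^*$ is a good rate function and, for any positive valued sequence $\{a_n\}_{n\in\mathbb{N}}$ satisfying (5.2) and all $\Gamma \subset \mathbb{R}^d$,

$$-\inf_{x\in\Gamma^0}\Lambda^*(x) \leq \liminf_{n\to\infty}\frac{1}{a_n}\ln\big(\mathbb{P}((X_n - n\bar r) \in \sqrt{n a_n}\,\Gamma)\big) \leq \limsup_{n\to\infty}\frac{1}{a_n}\ln\big(\mathbb{P}((X_n - n\bar r) \in \sqrt{n a_n}\,\Gamma)\big) \leq -\inf_{x\in\bar\Gamma}\Lambda^*(x). \tag{5.6}$$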

Remark 5.3. As a particular case, when $\mathbb{D}(v) = \mathbb{D} > 0$ is constant, we get

$$\Lambda^*(x) = \frac{1}{2}\langle x|\mathbb{D}^{-1}x\rangle. \tag{5.7}$$

Remark 5.4. Specializing the sequence $\{a_n\}_{n\in\mathbb{N}}$ to a power law, i.e. taking $a_n = n^\alpha$, we can express the content of Theorem 5.2 in an informal way as follows. For $0 < \alpha < 1$,

$$\mathbb{P}\big((X_n - n\bar r) \in n^{(\alpha+1)/2}\,\Gamma\big) \simeq e^{-n^{\alpha}\inf_{x\in\Gamma}\Lambda^*(x)}. \tag{5.8}$$

For $\alpha$ close to zero, we get results compatible with the Central Limit Theorem and for $\alpha$ close to one, we get results compatible with those obtained from a large deviation principle.

Let us come back to the example in Sect. 3.3. The diffusion coefficient $\mathbb{D}(v)$ given in (3.58) admits, as a function of $v \in \mathbb{T}$, a single non-degenerate maximum at $v = 0$, where it takes the value $\mathbb{D}(0) = 2\sqrt{2} - 1$. Thus we get from the foregoing that a moderate deviation principle holds for this example, with the good rate function $\Lambda^*(x) = \frac{x^2}{2\mathbb{D}(0)} = \frac{x^2}{2(2\sqrt{2}-1)}$.
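As a quick sanity check of Remark 5.3 in dimension $d = 1$, the following minimal sketch approximates the Legendre transform $\sup_y (yx - \frac{1}{2}Dy^2)$ on a grid and compares it with the closed form $x^2/(2D)$, using the value $D(0) = 2\sqrt{2} - 1$ quoted above. The function names, the grid bounds and the tolerance are illustrative assumptions, not part of the original text.

```python
import math

# Value D(0) = 2*sqrt(2) - 1 quoted for the example of Sect. 3.3.
D0 = 2.0 * math.sqrt(2.0) - 1.0

def legendre_quadratic(x, D, y_max=10.0, steps=20001):
    """Grid approximation of sup_y (y*x - 0.5*D*y**2) over [-y_max, y_max]."""
    best = -math.inf
    for i in range(steps):
        y = -y_max + 2.0 * y_max * i / (steps - 1)
        best = max(best, y * x - 0.5 * D * y * y)
    return best

def rate_closed_form(x, D):
    """Closed form x^2 / (2 D) of (5.7) for a constant scalar D > 0."""
    return x * x / (2.0 * D)

for x in (-2.0, 0.5, 3.0):
    print(x, legendre_quadratic(x, D0), rate_closed_form(x, D0))
```

The grid search is only a crude stand-in for the supremum; it is adequate here because the maximizer $y = x/D$ lies well inside the chosen interval.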

6. Example of Diffusive Random Dynamics

The results obtained so far can be viewed, essentially, as an adaptation to the quantum walk dynamics setup of those proven in [15,17,29] for the averaged dynamics and as an extension of [16,24]. In this section we consider a specific example of measure $d\mu$ on $U(2d)$, the set of coin matrices, for which we can prove convergence results on the associated random quantum dynamical system (3.1) for large times, in distribution rather than in average. In particular, our example shows that almost sure convergence results cannot be expected in general. As noted in Sect. 3.2, the spectra of $V(\omega)$ and $M_\omega(y, -y)$ lying on the unit circle and admitting 1 as a $2d$-fold eigenvalue prevent us from using the same spectral methods as above. For the same reason, the results about products of random contractions in [10] do not apply. However, the structure of the example at hand allows for a direct approach which, eventually, reduces the analysis to that of a central limit theorem for a Markov chain.


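6.1. Permutation matrices. Let $S_{2d}$ be the set of permutations of the $2d$ elements of $I_\pm = \{\pm 1, \pm 2, \dots, \pm d\}$. For $\pi \in S_{2d}$ and $\Theta = \{\theta_j\}_{j\in I_\pm} \in \mathbb{T}^{2d}$, define

$$C(\pi, \Theta) = \sum_{\tau\in I_\pm} e^{i\theta_{\pi(\tau)}}\,|\pi(\tau)\rangle\langle\tau| \in U(2d) \quad\text{so that}\quad C_{\sigma\tau}(\pi, \Theta) = e^{i\theta_\sigma}\delta_{\sigma,\pi(\tau)}. \tag{6.1}$$

With $\mathrm{diag}(e^{i\theta_j})_{j\in I_\pm}$ the diagonal matrix of phases and $C(\pi) \equiv C(\pi, 0)$, we can write

$$C(\pi, \Theta) = \mathrm{diag}(e^{i\theta_j})\, C(\pi), \tag{6.2}$$

where $C(\pi)$ is a permutation matrix associated with $\pi$. We recall the following elementary properties: for any $\pi, \sigma \in S_{2d}$,

$$C(\mathbb{I}) = \mathbb{I}, \qquad C^*(\pi) = C^T(\pi) = C(\pi^{-1}), \qquad C(\pi)C(\sigma) = C(\pi\sigma). \tag{6.3}$$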
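The elementary properties (6.3) and the factorization (6.2) can be checked numerically. The sketch below is an illustration only: it labels the $2d$ elements of $I_\pm$ by the indices $0, \dots, 2d-1$ (an assumption made for convenience), stores a permutation as the list of its images, and uses hypothetical helper names.

```python
import cmath

def perm_matrix(pi):
    """C(pi): entry (s, t) equals 1 if s == pi[t], else 0."""
    m = len(pi)
    return [[1 if s == pi[t] else 0 for t in range(m)] for s in range(m)]

def coin_matrix(pi, theta):
    """C(pi, Theta): entry (s, t) equals exp(i*theta[s]) if s == pi[t], else 0."""
    m = len(pi)
    return [[cmath.exp(1j * theta[s]) if s == pi[t] else 0j for t in range(m)]
            for s in range(m)]

def matmul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def compose(pi, sigma):
    """Images of the composed permutation t -> pi(sigma(t))."""
    return [pi[sigma[t]] for t in range(len(pi))]

def inverse(pi):
    inv = [0] * len(pi)
    for t, s in enumerate(pi):
        inv[s] = t
    return inv

def is_unitary(C, tol=1e-12):
    """Check C^* C = identity entrywise, up to tol."""
    m = len(C)
    adj = [[C[j][i].conjugate() for j in range(m)] for i in range(m)]
    prod = matmul(adj, C)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(m) for j in range(m))

pi, sigma = [1, 2, 0, 3], [3, 0, 1, 2]          # two permutations for 2d = 4
print(matmul(perm_matrix(pi), perm_matrix(sigma)) == perm_matrix(compose(pi, sigma)))
print(transpose(perm_matrix(pi)) == perm_matrix(inverse(pi)))
print(is_unitary(coin_matrix(pi, [0.1, 0.7, -1.3, 2.0])))
```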

Moreover, the Birkhoff-von Neumann theorem asserts that the set of doubly stochastic matrices of order $n$ is the convex hull of the set of permutation matrices of order $n$, and that its extreme points coincide with the permutation matrices. The matrices $C(\pi, \Theta)$ allow for explicit computations of the relevant quantities introduced in Sect. 2. It is easy to derive the next

Lemma 6.1. Let $r : I_\pm \to \mathbb{Z}^d$ be a jump function. Given a sequence of $n$ permutations $\pi_1, \pi_2, \dots, \pi_n$, let $(\tau_1, \tau_2, \dots, \tau_n) \in I_\pm^n$ be the sequence parametrized by $\tau_1$ given by $(\tau_1, \pi_2(\tau_1), \pi_3(\tau_2), \dots, \pi_n(\tau_{n-1}))$, i.e. such that

$$\tau_j = (\pi_j \pi_{j-1} \cdots \pi_2)(\tau_1), \qquad j = 2, \dots, n. \tag{6.4}$$

Let $\Theta_1, \Theta_2, \dots, \Theta_n$ be a set of phases, $\Theta_j = (\theta_1(j), \dots, \theta_n(j))$. Then, with the convention $C_j = C(\pi_j, \Theta_j)$, we get for all $k \in \mathbb{Z}^d$,

$$J_k(n) = \sum_{\substack{\tau_1\in I_\pm \text{ s.t.} \\ \sum_{j=1}^n r(\tau_j) = k}} e^{i(\theta_{\tau_1} + \cdots + \theta_{\tau_n})}\,|\tau_n\rangle\langle\pi_1^{-1}(\tau_1)|, \tag{6.5}$$

and $J_k(n) = 0$ if $\sum_{j=1}^n r(\tau_j) \neq k$ for every $\tau_1 \in I_\pm$.

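Consequently, the non-zero probabilities $W_k(n)$ on $\mathbb{Z}^d$ read, for any normalized internal state vector $\varphi_0$ and any density matrix $\rho_0$,

$$\begin{aligned}
W_k^{\varphi_0}(n) &= \|J_k(n)\varphi_0\|^2 = \sum_{\substack{\tau_1\in I_\pm \text{ s.t.} \\ \sum_{j=1}^n r(\tau_j) = k}} |\langle\pi_1^{-1}(\tau_1)|\varphi_0\rangle|^2, \\
W_k^{\rho_0}(n) &= \mathrm{Tr}\,\rho_n(k, k) = \sum_{j\in\mathbb{Z}^d}\ \sum_{\substack{\tau_1\in I_\pm \\ j = \sum_{s=1}^n r(\tau_s)}} \langle\pi_1^{-1}(\tau_1)|\rho_0(k - j, k - j)\,\pi_1^{-1}(\tau_1)\rangle.
\end{aligned} \tag{6.6}$$

Note that the sets of phases $\Theta_j$, $j = 1, \dots, n$, play no role in the computation of expectation values of lattice observables. We set $\tau_1 = \pi_1(\tau_0)$ and note

$$\varphi_0 = \sum_{\tau_0\in I_\pm} a_{\tau_0}|\tau_0\rangle \ \Rightarrow\ |\langle\pi_1^{-1}(\tau_1)|\varphi_0\rangle|^2 = \sum_{\tau_0\in I_\pm} |a_{\tau_0}|^2\,\delta_{\tau_1,\pi_1(\tau_0)}. \tag{6.7}$$

Hence $W_k^{\varphi_0}(n) = \sum_{\tau_0\in I_\pm} |a_{\tau_0}|^2\,\delta_{\sum_{j=1}^n r(\tau_j),\,k}$, so that for $F = \mathbb{I}\otimes f$ and $\psi_0 = \varphi_0 \otimes |0\rangle\langle 0|$,

$$\langle F\rangle_{\psi_0}(n) = \sum_{k\in\mathbb{Z}^d} W_k^{\varphi_0}(n) f(k) = \sum_{\tau_0\in I_\pm} |a_{\tau_0}|^2\, f\Big(\sum_{j=1}^n r(\tau_j)\Big). \tag{6.8}$$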
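The formula for $W_k^{\varphi_0}(n)$ above says that, once the permutations are fixed, each initial internal state $\tau_0$ follows one deterministic path and deposits its weight $|a_{\tau_0}|^2$ at the site $\sum_j r(\tau_j)$. A minimal sketch for $d = 1$, with integer-valued jumps, follows; the function name and the dictionary encoding of permutations and amplitudes are assumptions made for illustration.

```python
from collections import defaultdict

def walk_distribution(perms, amplitudes, r):
    """Return {k: W_k(n)} for a fixed list of permutations.

    perms      : list of dicts, perms[j-1][tau] = pi_j(tau)
    amplitudes : dict tau0 -> a_{tau0} (complex), assumed normalized
    r          : dict tau -> integer jump r(tau)  (d = 1 only)
    """
    W = defaultdict(float)
    for tau0, a in amplitudes.items():
        tau, k = tau0, 0
        for pi in perms:
            tau = pi[tau]          # tau_j = pi_j(tau_{j-1})
            k += r[tau]            # accumulate the jump r(tau_j)
        W[k] += abs(a) ** 2        # weight |a_{tau0}|^2 lands on site k
    return dict(W)

identity = {1: 1, -1: -1}
swap = {1: -1, -1: 1}
r = {1: 1, -1: -1}                 # jump function r(+-1) = +-1

W = walk_distribution([identity, swap, swap], {1: 0.6, -1: 0.8j}, r)
print(W)
```

In this toy run the state starting at $+1$ ends on site $1$ and the state starting at $-1$ ends on site $-1$, so the phases of the amplitudes play no role, in line with the remark made above about the sets of phases.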

Remark 6.2. In other words, given a set of $n$ permutations, there is no more quantum randomness in the variable $X_n$, except in the initial state. Therefore the characteristic functions take the form

Corollary 6.3. With $\tau_j = (\pi_j \pi_{j-1} \cdots \pi_1)(\tau_0)$, for $j = 1, \dots, n$,

$$\Phi_n^{\varphi_0}(y) = \sum_{\tau_0\in I_\pm} e^{iy\sum_{j=1}^n r(\tau_j)}\,|a_{\tau_0}|^2, \tag{6.9}$$

$$\Phi_n^{\rho_0}(y) = \sum_{\tau_0\in I_\pm} e^{iy\sum_{j=1}^n r(\tau_j)} \int_{\mathbb{T}^d} \langle\tau_0|R_0(y - v, v)\tau_0\rangle\, d\tilde v. \tag{6.10}$$

The dynamical information is contained in the sum $S_n = \sum_{j=1}^n r(\tau_j)$ which appears in the phase. The next section is devoted to its study, in the random version of this model where the coin matrices are i.i.d. random variables with values in $\{C(\pi, \Theta),\ \pi \in S_{2d},\ \Theta \in \mathbb{T}^{2d}\}$.

6.2. Random dynamics. Assume a random variable $C(\omega)$ with values in $\{C(\pi, \Theta) \in U(2d),\ (\pi, \Theta) \in S_{2d}\times\mathbb{T}^{2d}\}$ is defined on a probability space $(\Omega, \sigma, d\nu)$. The foregoing shows that only the marginal $\alpha$ defined on the discrete set $\{C(\pi) \in U(2d),\ \pi \in S_{2d}\}$ or, equivalently, on $\{\pi \in S_{2d}\}$, matters:

$$\mu(\pi) \equiv \mu(\alpha(\omega) = \pi) = \nu(\{C(\omega) = C(\pi, \Theta) \mid \Theta \in \mathbb{T}^{2d}\}), \qquad \pi \in S_{2d}. \tag{6.11}$$

We shall use the notation $\alpha(\omega) \equiv \omega \in S_{2d}$ and $\Omega = S_{2d}$. The corresponding process is denoted by $\omega = (\omega_1, \omega_2, \omega_3, \dots) \in \Omega^{\mathbb{N}^*}$ and $d\mathbb{P} = \otimes_{k\in\mathbb{N}^*}\, d\mu$. Given $\varphi_0 \in \mathbb{C}^{2d}$ an initial internal state and a random sequence of permutation matrices $(\omega_1, \dots, \omega_n)$, the random variable $S_n(\omega) = \sum_{j=1}^n r(\tau_j(\omega)) \in \mathbb{Z}^d$ is a sum over the random variables $\tau_j(\omega)$, $j = 1, \dots, n$, whose properties are given in the next lemma:

Lemma 6.4. Let $\varphi_0 = \sum_{\tau_0} a_{\tau_0}|\tau_0\rangle$ be the initial condition. The path $(\tau_0, \tau_1, \dots, \tau_n)$ is a Markov chain with finite state space $I_\pm$ characterized by the initial probability distribution $p_0(\tau_0 = \sigma_0) = |a_{\sigma_0}|^2$ and by the stationary transition probabilities

$$P(\sigma', \sigma) = \mathrm{Prob}(\tau_k(\omega) = \sigma \mid \tau_{k-1}(\omega) = \sigma'), \qquad k = 2, 3, \dots, n, \tag{6.12}$$

given by

$$P(\sigma', \sigma) = \sum_{\pi\in S_{2d}} \mu(\pi)\,\delta_{\sigma,\pi(\sigma')}. \tag{6.13}$$

The corresponding transition matrix $P = (P(\sigma', \sigma)) \in M_{2d}(\mathbb{R}^+)$ is doubly stochastic and

$$P = \mathbb{E}(C^T(\omega)). \tag{6.14}$$
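Formula (6.13) is straightforward to tabulate. The sketch below builds $P$ from a finitely supported $\mu$ and checks that its rows and columns sum to one, as stated in Lemma 6.4. The measure used in the example ($d = 2$, three permutations with weights 0.5, 0.3, 0.2) is a made-up illustration, and the function names are hypothetical.

```python
def transition_matrix(mu, states):
    """P[s_from][s_to] = sum over (pi, prob) in mu of prob * [s_to == pi(s_from)]."""
    P = {s: {t: 0.0 for t in states} for s in states}
    for pi, prob in mu:
        for s in states:
            P[s][pi[s]] += prob
    return P

def is_doubly_stochastic(P, states, tol=1e-12):
    rows = all(abs(sum(P[s][t] for t in states) - 1.0) < tol for s in states)
    cols = all(abs(sum(P[s][t] for s in states) - 1.0) < tol for t in states)
    return rows and cols

states = [1, -1, 2, -2]                              # I_pm for d = 2
identity = {s: s for s in states}
cycle = {1: -1, -1: 2, 2: -2, -2: 1}                 # a 4-cycle
swap = {1: 2, 2: 1, -1: -1, -2: -2}                  # a transposition
mu = [(identity, 0.5), (cycle, 0.3), (swap, 0.2)]    # hypothetical measure on S_4

P = transition_matrix(mu, states)
print(P[1])
print(is_doubly_stochastic(P, states))
```

Double stochasticity holds for any choice of $\mu$ here because each permutation contributes exactly one unit entry to every row and every column.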


Remark 6.5. The transition matrix $P$ is unitary iff $\mu(\pi) = \delta_{\pi_0,\pi}$ for some $\pi_0$.

Proof. By Lemma 6.1, $\tau_k(\omega)$ only depends on $\{\omega_j\}_{j=1,\dots,k}$ and $\tau_0$ and is given by $\tau_k(\omega) = \omega_k(\tau_{k-1})$. Hence

$$\mathrm{Prob}(\tau_k(\omega) = \sigma \mid \tau_{k-1}(\omega) = \sigma') = \mathrm{Prob}(\omega_k(\tau_{k-1}) = \sigma \mid \tau_{k-1} = \sigma') = \mu(\{\omega \mid \omega(\sigma') = \sigma\}) = \sum_{\pi\in S_{2d}} \mu(\pi)\,\delta_{\sigma,\pi(\sigma')}, \tag{6.15}$$

where we used the independence of the $\omega_k$. Finally, (6.1) shows that the right-hand side is the expectation of $C^T(\omega)$ w.r.t. $\mu$.

Considering the diffusive scaling (2.13), we are thus naturally led to investigate the large $n$ behavior of the quantity

$$\frac{1}{\sqrt n} S_n(\omega) = \frac{1}{\sqrt n}\sum_{j=1}^n r(\tau_j(\omega)), \tag{6.16}$$

i.e. to a functional central limit theorem for the Markov chain $(\tau_0, \tau_1, \tau_2, \dots)$ with finite state space $I_\pm$, initial probability $p_0$ and transition matrix $P$. There are simple conditions under which a functional central limit theorem holds for Markov chains with finite state space, see e.g. [7].

Let us recall the few basic notions and results associated with Markov chains with finite state space $F$, characterized by a transition matrix $P \in M_{|F|}(\mathbb{R}^+)$ s.t. $\sum_{\tau\in F} P(\sigma, \tau) = 1$, that we will need below. A transition matrix $P$ is irreducible if, $\forall\, \sigma, \tau \in F$, $\exists\, n \in \mathbb{N}^*$ such that $P^n(\sigma, \tau) > 0$. A probability distribution $p_0$, considered as a vector in $\mathbb{R}^{|F|}$, is invariant for the transition matrix $P$ if $P^T p_0 = p_0$. If $p_0$ is invariant, then the Markov process is stationary,

$$\mathrm{Prob}((\tau_0, \tau_1, \dots) \in B) = \mathrm{Prob}((\tau_k, \tau_{k+1}, \dots) \in B), \qquad \forall\, k \in \mathbb{N},\ B \subset F^{\mathbb{N}}. \tag{6.17}$$

If $P$ is irreducible, the invariant distribution $p_0$ is unique and $p_0(\tau) > 0$, $\forall \tau \in F$. If $P$ is furthermore doubly stochastic, the invariant distribution $u_0$ is uniform, $u_0(\tau) = 1/|F|$, $\forall \tau \in F$. In terms of spectral properties, an irreducible stochastic matrix $P$ admits 1 as a simple eigenvalue. If it is furthermore doubly stochastic, the uniform vector $u_0$ is invariant under both $P$ and $P^*$. Hence, if we take as initial distribution the uniform measure $u_0(\tau) = 1/(2d)$ $\forall\, \tau \in I_\pm$, which is invariant for the doubly stochastic transition matrix $P = \mathbb{E}_\mu(C^T(\omega))$, the Markov process is stationary. Moreover,

$$\mathbb{E}_{u_0}(r(\tau_0)) = \frac{1}{2d}\sum_{\tau\in I_\pm} r(\tau) = \bar r. \tag{6.18}$$

From Thm. 20.1 in [7] and its applications p. 177, or [25], we have

Theorem 6.6. Let $\varphi_0 = \sum_{\tau\in I_\pm} a_\tau|\tau\rangle$ and $p_0$ s.t. $p_0(\tau) = |a_\tau|^2$. Assume the transition matrix $P = \mathbb{E}(C^T(\omega))$ is irreducible. Then $\lim_{n\to\infty}\frac{1}{n}\sum_{j=1}^n r(\tau_j(\omega)) = \bar r$ almost surely and, with convergence in distribution,

$$\frac{1}{\sqrt n}\sum_{j=1}^n \big(r(\tau_j(\omega)) - \bar r\big) \xrightarrow{\ n\to\infty\ } X^\omega \simeq N(0, \Sigma), \tag{6.19}$$

provided the covariance matrix

$$\Sigma_{ij} = -\frac{1}{2d}\langle r_i|r_j\rangle + \bar r_i\bar r_j - \frac{1}{2d}\big(\langle r_i|S(1)r_j\rangle + \langle r_j|S(1)r_i\rangle\big) \tag{6.20}$$

is positive definite, where $S(1)$ denotes the reduced resolvent of $P$ at 1.

Remark 6.7. An alternative formulation for $\Sigma$ is

$$\Sigma_{ij} = \frac{1}{4d}\big(\langle r_i|(\mathbb{I}_Q - P_Q)^{-1}(\mathbb{I}_Q + P_Q)r_j\rangle + \langle r_j|(\mathbb{I}_Q - P_Q)^{-1}(\mathbb{I}_Q + P_Q)r_i\rangle\big), \tag{6.21}$$

where the projector $Q$ and the operator $P_Q$ are defined in the spectral decomposition of $P$,

$$P = 2d\,|u_0\rangle\langle u_0| + P_Q, \quad\text{with } P_Q = QPQ \text{ and } Q = Q^2 = Q^* = \mathbb{I} - 2d\,|u_0\rangle\langle u_0|. \tag{6.22}$$

Proof. Our assumptions imply that $P$ is irreducible, doubly stochastic and that $u_0$ is invariant for $P$ and $P^*$. This together with (6.18) allows us to apply Thm. 20.1 of [7] and the remarks p. 177 or the results of [25]. It remains to compute the covariance matrix. Let us define the centered random vector

$$\tilde r(\tau(\omega)) = r(\tau(\omega)) - \bar r, \tag{6.23}$$

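such that $\mathbb{E}_{u_0}(\tilde r(\tau_0)) = 0$, where $\mathbb{E}_{u_0}$ denotes the expectation with invariant initial measure $u_0$. The first mentioned reference yields the following expression for the covariance matrix:

$$\Sigma_{ij} = \mathbb{E}_{u_0}\big(\tilde r(\tau_0)_i\,\tilde r(\tau_0)_j\big) + \sum_{k=1}^\infty \mathbb{E}_{u_0}\big(\tilde r(\tau_0)_i\,\tilde r(\tau_k(\omega))_j + \tilde r(\tau_k(\omega))_i\,\tilde r(\tau_0)_j\big), \tag{6.24}$$

for $i, j = 1, 2, \dots, d$, where $\tilde r(\tau)_j$ denotes the $j$th component of $\tilde r(\tau) \in \mathbb{Z}^d$. We compute for any $k \in \mathbb{N}$,

$$\mathbb{E}_{u_0}\big(\tilde r(\tau_0)_i\,\tilde r(\tau_k)_j\big) = \frac{1}{2d}\langle r_i|P^k r_j\rangle - \bar r_i\bar r_j. \tag{6.25}$$

Note that the right-hand side of (6.25) is equal to

$$\frac{1}{2d}\langle r_i|P^k(r_j - 2d\,\bar r_j u_0)\rangle \quad\text{with}\quad \langle u_0|(r_j - 2d\,\bar r_j u_0)\rangle = 0. \tag{6.26}$$

By (6.22), for any $v, w \in \mathbb{C}^{2d}$,

$$v - 2d\,\bar v\, u_0 = Qv \quad\text{and}\quad \langle w|Qv\rangle = \langle w|v\rangle - 2d\,\bar w\,\bar v. \tag{6.27}$$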


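Thanks to (6.26), we can write

$$\sum_{k=1}^\infty P^k(r_j - 2d\,\bar r_j u_0) = \sum_{k=1}^\infty P_Q^k(r_j - 2d\,\bar r_j u_0) = (\mathbb{I}_Q - P_Q)^{-1}P_Q\,(r_j - 2d\,\bar r_j u_0), \tag{6.28}$$

where $\mathbb{I}_Q$ is the identity reduced to the subspace $Q\mathbb{C}^{2d}$. Therefore,

$$(\mathbb{I}_Q - P_Q)^{-1}P_Q \equiv -(S(1) + \mathbb{I}_Q), \tag{6.29}$$

where $S(1) = QS(1)Q = (P_Q - \mathbb{I}_Q)^{-1}|_{Q\mathbb{C}^{2d}}$ denotes the reduced resolvent of $P$ at 1. Hence

$$\begin{aligned}
\Sigma_{ij} &= \frac{1}{2d}\langle r_i|r_j\rangle - \bar r_i\bar r_j - \frac{1}{2d}\big(\langle r_i|(S(1) + \mathbb{I}_Q)(r_j - 2d\,\bar r_j u_0)\rangle + \langle r_j|(S(1) + \mathbb{I}_Q)(r_i - 2d\,\bar r_i u_0)\rangle\big) \\
&= -\frac{1}{2d}\langle r_i|Qr_j\rangle - \frac{1}{2d}\big(\langle r_i|S(1)r_j\rangle + \langle r_j|S(1)r_i\rangle\big) \\
&= -\frac{1}{4d}\big(\langle r_i|(Q + 2S(1))r_j\rangle + \langle r_j|(Q + 2S(1))r_i\rangle\big) \\
&= \frac{1}{4d}\big(\langle r_i|(\mathbb{I}_Q - P_Q)^{-1}(\mathbb{I}_Q + P_Q)r_j\rangle + \langle r_j|(\mathbb{I}_Q - P_Q)^{-1}(\mathbb{I}_Q + P_Q)r_i\rangle\big).
\end{aligned} \tag{6.30}$$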

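The convergence of $\frac{1}{\sqrt n}\sum_{j=1}^n (r(\tau_j(\omega)) - \bar r)$, for any initial measure $p_0$ s.t. $p_0(\tau) = |a_\tau|^2$, in distribution to $X^\omega \simeq N(0, \Sigma)$ implies the convergence of the characteristic function and of its derivatives, which are continuous functions of the random variable $\frac{1}{\sqrt n}\sum_{j=1}^n (r(\tau_j(\omega)) - \bar r)$. In particular

$$-\partial_{y_j}\partial_{y_k}\Big(e^{-iy\bar r\sqrt n}\,\Phi_n^{\varphi_0}(y/\sqrt n)\Big)\Big|_{y=0} = \frac{1}{n}\sum_{\tau_0\in I_\pm} |a_{\tau_0}|^2 \sum_{l=1}^n \big(r(\tau_l(\omega)) - \bar r\big)_j \sum_{l=1}^n \big(r(\tau_l(\omega)) - \bar r\big)_k, \tag{6.31}$$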

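whose limit, as $n \to \infty$, yields the elements of the random diffusion matrix. We have

Corollary 6.8. Under the assumptions of Theorem 6.6, the following random variables converge in distribution as $n \to \infty$: the random rescaled characteristic functions

$$e^{-iy\bar r\sqrt n}\,\Phi_n^{\varphi_0}(y/\sqrt n) = \sum_{\tau_0\in I_\pm} |a_{\tau_0}|^2\, e^{iy\frac{1}{\sqrt n}\sum_{j=1}^n (r(\tau_j(\omega)) - \bar r)} \longrightarrow e^{iyX^\omega}, \tag{6.32}$$

$$e^{-iy\bar r\sqrt n}\,\Phi_n^{\rho_0}(y/\sqrt n) = \sum_{\tau_0\in I_\pm} e^{iy\frac{1}{\sqrt n}\sum_{j=1}^n (r(\tau_j(\omega)) - \bar r)} \int_{\mathbb{T}^d} \langle\tau_0|R_0(y/\sqrt n - v, v)\tau_0\rangle\, d\tilde v \longrightarrow e^{iyX^\omega}, \quad\text{where } X^\omega \simeq N(0, \Sigma), \tag{6.33}$$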

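and the random diffusion constants

$$\frac{1}{n}\sum_{\tau_0\in I_\pm} |a_{\tau_0}|^2 \sum_{l=1}^n \big(r(\tau_l(\omega)) - \bar r\big)_j \sum_{l=1}^n \big(r(\tau_l(\omega)) - \bar r\big)_k \longrightarrow \mathbb{D}^\omega_{jk}, \tag{6.34}$$

$$\frac{1}{n}\sum_{\tau_0\in I_\pm} \int_{\mathbb{T}^d} \langle\tau_0|R_0(y/\sqrt n - v, v)\tau_0\rangle\, d\tilde v\ \sum_{l=1}^n \big(r(\tau_l(\omega)) - \bar r\big)_j \sum_{l=1}^n \big(r(\tau_l(\omega)) - \bar r\big)_k \longrightarrow \mathbb{D}^\omega_{jk}, \tag{6.35}$$

where $\mathbb{D}^\omega_{jk}$ is distributed according to the law of $X^\omega_j X^\omega_k$, where $X^\omega \simeq N(0, \Sigma)$.

Remark 6.9. In particular we get

$$\mathbb{E}_\omega(\mathbb{D}^\omega_{jk}) = \Sigma_{jk}. \tag{6.36}$$

Proof. For the case of initial density matrix $\rho_0$, it is enough to note that

$$\sum_{\tau_0\in I_\pm} \int_{\mathbb{T}^d} \langle\tau_0|R_0(-v, v)\tau_0\rangle\, d\tilde v = \int_{\mathbb{T}^d} \mathrm{Tr}\,R_0(-v, v)\, d\tilde v = \Phi_0^{\rho_0}(0) = 1. \tag{6.37}$$

One concludes using the convergence results stated in [7], p. 28 and 30.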

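At this point one may wonder if the assumption $P = \mathbb{E}_\mu(C^T(\omega))$ is enough to apply Theorem 3.10 and compare the results concerning the averaged distribution $w(n) = \mathbb{E}(W^\omega(n))$. The next proposition answers this question positively.

Proposition 6.10. Under the hypotheses of Theorem 6.6, assumption S' holds. Moreover, the diffusion constant $\mathbb{D}(v)$ given by Theorem 3.10 is independent of $v \in \mathbb{T}^d$.

Proof. We need to consider

$$M(y, y') = D(y, y')\,\mathcal{E} = \sum_{\tau,\tau'\in I_\pm} e^{i(yr(\tau) + y'r(\tau'))}\,|\tau\otimes\tau'\rangle\langle\tau\otimes\tau'|\ \mathcal{E}, \tag{6.38}$$

where $\mathcal{E} = \mathbb{E}_\nu(C_\omega \otimes \overline{C_\omega})$, with $\nu$ and $C(\omega)$ defined above (6.11). We first observe that the $M^*(Y)$-cyclic subspace generated by $\Psi_1$, denoted $\mathcal{I}$, is given by $\mathcal{I} = \mathrm{span}\{|\sigma\otimes\sigma\rangle\}_{\sigma\in I_\pm}$. Indeed, $\Psi_1 \in \mathcal{I}$ and $\mathcal{I}$ is invariant under $D(Y)$ for any $Y = (y, y') \in \mathbb{T}^{2d}$ since $D(y, y')|\sigma\otimes\sigma\rangle = e^{i(y + y')r(\sigma)}|\sigma\otimes\sigma\rangle$. Then, for any $C(\pi, \Theta) = \sum_{\tau\in I_\pm} e^{i\theta_{\pi(\tau)}}|\pi(\tau)\rangle\langle\tau|$ and any $\{\alpha_\sigma\}_{\sigma\in I_\pm}$, $\alpha_\sigma \in \mathbb{C}$, we compute

$$C(\pi, \Theta)\otimes\overline{C(\pi, \Theta)} \sum_{\sigma\in I_\pm} \alpha_\sigma|\sigma\otimes\sigma\rangle = \sum_{\sigma\in I_\pm} \alpha_\sigma|\pi(\sigma)\otimes\pi(\sigma)\rangle \equiv C(\pi)\otimes C(\pi) \sum_{\sigma\in I_\pm} \alpha_\sigma|\sigma\otimes\sigma\rangle. \tag{6.39}$$

This shows that $\mathcal{I}$ is invariant under $(C_\omega\otimes\overline{C_\omega})^*$, and thus under its expectation as well, which is enough to prove the claim. Moreover, (6.39) shows that, when restricted to $\mathcal{I}$, any matrix $C(\pi, \Theta)\otimes\overline{C(\pi, \Theta)}$ acts like $C(\pi)\otimes C(\pi)$ does. Consequently, we can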


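consider the finite measure $\mu$ on $S_{2d}$ defined by (6.11) instead of the original measure $\nu$. Altogether we get

$$M(-v, v)|_{\mathcal{I}} = D(-v, v)|_{\mathcal{I}} \sum_{\pi\in S_{2d}} \mu(\pi)\,C(\pi)\otimes C(\pi)|_{\mathcal{I}} = \sum_{\pi\in S_{2d}} \mu(\pi)\,C(\pi)\otimes C(\pi)|_{\mathcal{I}} = \mathcal{E}|_{\mathcal{I}}, \tag{6.40}$$

which is independent of $v \in \mathbb{T}^d$. It remains to show that $\Psi_1$ is the only invariant vector under $\mathcal{E}|_{\mathcal{I}}$. But the equation for the coefficients $\alpha_\sigma$,

$$\sum_{\pi\in S_{2d}} \mu(\pi)\,C(\pi)\otimes C(\pi) \sum_{\sigma\in I_\pm} \alpha_\sigma|\sigma\otimes\sigma\rangle = \sum_{\sigma\in I_\pm} \alpha_\sigma|\sigma\otimes\sigma\rangle, \tag{6.41}$$

is equivalent to $\sum_{\pi\in S_{2d}} \mu(\pi)\,\alpha_{\pi^{-1}(\sigma)} = \alpha_\sigma$, for all $\sigma$. This is in turn equivalent to requiring that the vector $\alpha \in \mathbb{C}^{2d}$ with components $\{\alpha_\sigma\}_{\sigma\in I_\pm}$ be invariant under $P^T$. $P$ being irreducible by assumption, the only invariant vectors have constant components and thus are proportional to $\Psi_1$.

Remark 6.11. The assumption S doesn't always hold for the case under study. In case $\Theta \equiv 0$, the vector $\sum_{(\tau,\tau')\in I_\pm^2} |\tau\otimes\tau'\rangle$ is also invariant under $M(0, 0) = \mathcal{E}$.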

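We close this section by providing a general relationship between the diffusion matrix $\mathbb{D}^\omega$ computed by means of $W^\omega(n)$ and the diffusion matrix $\mathbb{D}$ computed by means of the averaged distribution $w(n) = \mathbb{E}_\omega(W^\omega(n))$ in Theorem 3.10, provided both exist. To deal with the whole diffusion matrix at once, we let $z \in \mathbb{R}^d \setminus \{0\}$ be fixed and consider the non-negative random variables

$$D = \langle z|\mathbb{D}z\rangle, \qquad D_n^\omega = \langle z|Y_n^\omega\rangle\langle Y_n^\omega|z\rangle, \quad\text{where } Y_n^\omega = \frac{X_n^\omega - n\bar r}{\sqrt n}. \tag{6.42}$$

With these notations, $\mathbb{E}_{w(n)}(D_n) = \mathbb{E}_\omega\mathbb{E}_{W^\omega(n)}(D_n^\omega) \to D$ by Theorem 3.10 and $\mathbb{E}_{W^\omega(n)}(D_n^\omega) \to D^\omega$, where $D^\omega = \langle z|\mathbb{D}^\omega z\rangle$, as $n \to \infty$, if the limit exists.

Proposition 6.12. Consider the initial condition $\Psi_0 = \varphi_0\otimes|0\rangle$ and assume the hypotheses of Theorem 3.10 hold for the distribution $w(n)$. Further assume the random variables defined in (6.42) satisfy $\mathbb{E}_{W^\omega(n)}(D_n^\omega) \to D^\omega$ in distribution. Then

$$\mathbb{E}_\omega(D^\omega) = \lim_{n\to\infty} \mathbb{E}_\omega\mathbb{E}_{W^\omega(n)}(D_n^\omega), \tag{6.43}$$

which implies $\mathbb{E}_\omega(\mathbb{D}^\omega) = \mathbb{D}$.

Proof. Let $d_n^\omega = \mathbb{E}_{W^\omega(n)}(D_n^\omega) \geq 0$, which converges in distribution to $D^\omega$. By Theorem 5.4 of [7], if $\sup_n \mathbb{E}_\omega((d_n^\omega)^{1+\epsilon}) < \infty$ for some $\epsilon > 0$, then the limit and expectation commute: $\lim_{n\to\infty}\mathbb{E}_\omega(d_n^\omega) = \mathbb{E}_\omega(D^\omega)$, which yields the result. Now, Remark 3.11 below Theorem 3.10 implies the condition with $\epsilon = 1$.

Remark 6.13. When applied to the case discussed in the section, the proposition yields

$$\mathbb{E}_\omega(\mathbb{D}^\omega) = \Sigma = \int_{\mathbb{T}^d} \mathbb{D}\, d\tilde v = \mathbb{D}. \tag{6.44}$$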


6.3. Large deviations. We can complete the picture for the model at hand by looking at its large deviations properties. Because the analysis reduces to the study of a finite state Markov chain, a large deviation principle is true for the model, see [13], Theorem 3.1.2. For $\lambda \in \mathbb{R}^d$, let $\Pi_\lambda \in M_{2d}(\mathbb{R}^+)$ be the matrix whose non-negative elements read

$$\Pi_\lambda(\sigma, \tau) = P(\sigma, \tau)\,e^{\langle\lambda|r(\tau)\rangle}, \tag{6.45}$$

where $P$ is the transition matrix and $r$ is the jump function. As $P$ is irreducible, $\Pi_\lambda$ is irreducible as well. Hence, by the Perron-Frobenius theorem, for all $\lambda \in \mathbb{R}^d$, the largest real eigenvalue of $\Pi_\lambda$, $\rho(\lambda) > 0$, is simple and $\sigma(\Pi_\lambda) \subset D(0, \rho(\lambda))$. As a function of $\lambda \in \mathbb{R}^d$, $\rho(\lambda)$ is real analytic. For every $x \in \mathbb{R}^d$, define

$$I(x) = \sup_{\lambda\in\mathbb{R}^d}\big(\langle\lambda|x\rangle - \ln(\rho(\lambda))\big). \tag{6.46}$$
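To make the definition (6.46) concrete, the following sketch evaluates it numerically for the two-state chain treated in Sect. 6.4 ($d = 1$, $r(\pm 1) = \pm 1$, transition matrix with diagonal entries $p$ and off-diagonal entries $q = 1 - p$) and compares the result with the closed form (6.53) derived there. The grid search over $\lambda$, its bounds, and the function names are illustrative assumptions.

```python
import math

def rho(lam, p):
    """Largest eigenvalue of the tilted 2x2 matrix
    [[p e^lam, q e^-lam], [q e^lam, p e^-lam]], from its trace and determinant."""
    q = 1.0 - p
    tr = 2.0 * p * math.cosh(lam)
    det = p * p - q * q                      # equals p - q since p + q = 1
    return 0.5 * (tr + math.sqrt(tr * tr - 4.0 * det))

def rate_numeric(x, p, lam_max=20.0, steps=40001):
    """Grid approximation of sup_lam (lam*x - ln rho(lam)) for |x| < 1."""
    best = -math.inf
    for i in range(steps):
        lam = -lam_max + 2.0 * lam_max * i / (steps - 1)
        best = max(best, lam * x - math.log(rho(lam, p)))
    return best

def rate_closed_form(x, p):
    """Closed form (6.53); infinite outside |x| < 1."""
    q = 1.0 - p
    if abs(x) >= 1.0:
        return math.inf
    s = math.sqrt(1.0 - x * x)
    return (x * math.asinh(q * x / (p * s))
            - math.log((q + math.sqrt(p * p + x * x * (q - p))) / s))

for x in (0.0, 0.5, -0.8):
    print(x, rate_numeric(x, 0.3), rate_closed_form(x, 0.3))
```

The two columns should agree up to the grid resolution; in particular both vanish at $x = 0 = \bar r$, as expected for a rate function at the mean.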

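The function $I$ is a good rate function.

Theorem 6.14. Under the hypotheses of Theorem 6.6, the random variable $Z_n = \frac{1}{n}\sum_{j=1}^n r(\tau_j)$ satisfies a large deviation principle with convex good rate function $I$: for any $\sigma \in I_\pm$ and any $\Gamma \subset \mathbb{R}^d$,

$$-\inf_{x\in\Gamma^0} I(x) \leq \liminf_{n\to\infty}\frac{1}{n}\ln(\mathbb{P}_\sigma(Z_n \in \Gamma)) \leq \limsup_{n\to\infty}\frac{1}{n}\ln(\mathbb{P}_\sigma(Z_n \in \Gamma)) \leq -\inf_{x\in\bar\Gamma} I(x), \tag{6.47}$$

where $\mathbb{P}_\sigma$ refers to the initial law $p_0(\tau) = \delta_{\sigma,\tau}$.

6.4. Example. We end this section by a simple example which allows us to make explicit all quantities encountered so far. In dimension $d = 1$, we consider the jump function $r$ defined by $r(\pm 1) = \pm 1$, so that $\bar r = 0$. We define the distribution $\mu$ on $S_2$ by the Bernoulli process which assigns probability $p > 0$ to the identity matrix and $q = 1 - p$ to the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. The corresponding transition matrix reads

$$P = \begin{pmatrix} p & q \\ q & p \end{pmatrix} \quad\text{s.t.}\quad (P - z)^{-1} = \frac{|\psi_0\rangle\langle\psi_0|}{1 - z} + \frac{|\psi_1\rangle\langle\psi_1|}{(p - q) - z}, \tag{6.48}$$

with $\psi_0 = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\psi_1 = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$. Consequently $S(1) = \frac{|\psi_1\rangle\langle\psi_1|}{p - q - 1}$. Thus, the averaged diffusion constant $\mathbb{D} = \Sigma$ computed from (6.20) with $r = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ reads

$$\Sigma = -\frac{1}{2}\langle r|r\rangle - \frac{1}{2}\,2\langle r|S(1)r\rangle = \frac{p}{q}. \tag{6.49}$$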
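The value $\Sigma = p/q$ can be cross-checked without the reduced resolvent, by summing the correlation series (6.24) with the terms (6.25) directly. The sketch below does this for the two-state chain; truncating the series after a fixed number of terms is an assumption that is harmless here because the terms decay like $|p - q|^k$. The helper names are hypothetical.

```python
def matvec(P, v):
    return [sum(P[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def covariance_series(P, r, terms=2000):
    """Scalar version (d = 1) of (6.24)-(6.25), truncated after `terms` lags."""
    m = len(r)                                   # m = 2d states
    rbar = sum(r) / m
    sigma = dot(r, r) / m - rbar ** 2            # k = 0 term
    v = list(r)
    for _ in range(terms):
        v = matvec(P, v)                         # v = P^k r
        sigma += 2.0 * (dot(r, v) / m - rbar ** 2)   # the two cross terms coincide for d = 1
    return sigma

def covariance_resolvent(p):
    """Formula (6.49): <r|r> = 2 and <r|S(1)r> = 2 / (p - q - 1)."""
    q = 1.0 - p
    return -0.5 * 2.0 - 0.5 * 2.0 * (2.0 / (p - q - 1.0))

p = 0.3
q = 1.0 - p
P = [[p, q], [q, p]]
r = [1.0, -1.0]
print(covariance_series(P, r), covariance_resolvent(p), p / q)
```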

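Therefore the random variable $\mathbb{E}_{W^\omega(n)}\big(\frac{X_n^\omega}{\sqrt n}\big)$ converges in distribution to $X^\omega \simeq N(0, p/q)$ and the corresponding random diffusion constant $\mathbb{D}^\omega$ is distributed according to the chi-square law with density $\frac{q}{p} f\big(\cdot\,\frac{q}{p}\big)$, where

$$f(t) = \frac{e^{-t/2}}{\sqrt{2\pi t}}, \qquad t \geq 0. \tag{6.50}$$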

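Next we compute the rate function $I$ such that $\mathbb{P}_\sigma\big(\mathbb{E}_{W^\omega(n)}\big(\frac{X_n^\omega}{n}\big) \in \Gamma\big) \simeq e^{-n\inf_{x\in\Gamma} I(x)}$, for $n$ large. With our choice of jump function, the matrix $\Pi_\lambda$ reads

$$\Pi_\lambda = \begin{pmatrix} p\,e^{\lambda} & q\,e^{-\lambda} \\ q\,e^{\lambda} & p\,e^{-\lambda} \end{pmatrix} \quad\text{s.t.}\quad \det\Pi_\lambda = p - q \ \text{ and }\ \mathrm{Tr}\,\Pi_\lambda = 2p\cosh(\lambda). \tag{6.51}$$

Thus we have

$$\rho(\lambda) = p\cosh(\lambda) + \sqrt{p^2\sinh^2(\lambda) + q^2}, \tag{6.52}$$

so that $\sup_{\lambda\in\mathbb{R}}(x\lambda - \ln(\rho(\lambda)))$ is reached at $\lambda(x) = \mathrm{arsinh}\Big(\frac{qx}{p\sqrt{1 - x^2}}\Big)$ for $|x| < 1$. Hence

$$I(x) = x\,\mathrm{arsinh}\Big(\frac{qx}{p\sqrt{1 - x^2}}\Big) - \ln\Big(\frac{q + \sqrt{p^2 + x^2(q - p)}}{\sqrt{1 - x^2}}\Big) \quad\text{if } |x| < 1, \tag{6.53}$$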

and $I(x) = \infty$ otherwise.

7. Generalization

In the light of the last example, it is natural to generalize the results of Sect. 3 to the case where the random coin matrices $C(\omega)$ are distributed according to a Markov process, in the spirit of [15,17,29]. We briefly do so in this last section, mentioning the main modifications only and considering finitely many coin matrices for simplicity.

Consider a finite set $\{C_1, C_2, \dots, C_F\}$ of unitary coin matrices on $\mathbb{C}^{2d}$ and assume that for any $n \in \mathbb{N}$, the set of random matrices $\{C(\omega_1), \dots, C(\omega_n)\}$ is determined by a Markov chain characterized by an initial distribution $\{p_0(j)\}_{j=1}^F$ and an irreducible transition matrix $\{P(j, k)\}_{j,k\in\{1,\dots,F\}}$. Correspondingly, for any $Y \in \mathbb{T}^d\times\mathbb{T}^d$, the sequence of matrices $\{M_{\omega_j}(Y)\}_{j\in\mathbb{N}}$ has the same distribution, assuming the $C_j\otimes C_k$ are distinct, so that

$$\mathbb{P}\big(\{M_{\omega_n}(Y), M_{\omega_{n-1}}(Y), \dots, M_{\omega_1}(Y)\} = \{M_{k_n}(Y), M_{k_{n-1}}(Y), \dots, M_{k_1}(Y)\}\big) = p_0(k_1)P(k_1, k_2)\cdots P(k_{n-1}, k_n). \tag{7.1}$$

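We introduce the matrices

$$M_{jk}(Y) = P^T(j, k)\,M_k(Y), \qquad j, k \in \{1, \dots, F\}, \tag{7.2}$$

acting on $\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}$ and the operator acting on $\mathbb{C}^F\otimes(\mathbb{C}^{2d}\otimes\mathbb{C}^{2d})$,

$$\mathcal{M}(Y) = \sum_{j,k\in\{1,\dots,F\}} |e_j\rangle\langle e_k|\otimes M_{jk}(Y), \tag{7.3}$$

where the $e_j$'s are the canonical basis vectors of $\mathbb{C}^F$ and the "bra"s and "ket"s refer to the usual scalar product in $\mathbb{C}^F$. Similarly, let $\mathcal{M}_0(Y)$ be the vector in $M_{4d^2}(\mathbb{C})^F$ defined by

$$\mathcal{M}_0(Y) = \sum_{j\in\{1,\dots,F\}} p_0(j)\,|e_j\rangle\otimes M_j(Y). \tag{7.4}$$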


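If $\chi_1 = \sum_{j\in\{1,\dots,F\}} |e_j\rangle \in \mathbb{C}^F$ and $\mathbb{I}$ is the identity operator on $\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}$, we obtain

$$\mathbb{E}\big(M_{\omega_n}(Y)M_{\omega_{n-1}}(Y)\cdots M_{\omega_1}(Y)\big) = \langle\chi_1\otimes\mathbb{I}\,|\,\mathcal{M}(Y)^{n-1}\mathcal{M}_0(Y)\rangle. \tag{7.5}$$

Hence, with $\Psi_1 = \sum_{\tau\in I_\pm} |\tau\otimes\tau\rangle$ and $\Psi_0 \in \mathbb{C}^{2d}\otimes\mathbb{C}^{2d}$, we can write

$$\mathbb{E}(\Phi_n^\omega(Y)) = \langle\chi_1\otimes\Psi_1\,|\,\mathcal{M}(Y)^{n-1}\mathcal{M}_0(Y)\Psi_0\rangle, \tag{7.6}$$

where $\chi_1\otimes\Psi_1$ and $\mathcal{M}_0(Y)\Psi_0 = \sum_j p_0(j)\,|e_j\rangle\otimes M_j(Y)\Psi_0$ belong to $\mathbb{C}^F\otimes(\mathbb{C}^{2d}\otimes\mathbb{C}^{2d})$ and "bra"s and "ket"s should be interpreted accordingly. This brings us back to the study of large powers of an operator, $\mathcal{M}(Y)$, which has essentially the same properties as $M(Y)$:

Lemma 7.1. The matrix $\mathcal{M}(Y)$ acting on $\mathbb{C}^F\otimes(\mathbb{C}^{2d}\otimes\mathbb{C}^{2d})$ is analytic in $Y \in \mathbb{C}^{2d}\times\mathbb{C}^{2d}$. Assume $P$ is irreducible, and let $\chi_p \in \mathbb{C}^F$ be the unique real valued vector s.t. $P^T\chi_p = \chi_p$ and $\langle\chi_p|\chi_1\rangle = 1$. Then for any $v \in \mathbb{T}^d$,

$$\mathcal{M}(-v, v)\,|\chi_p\otimes\Psi_1\rangle = |\chi_p\otimes\Psi_1\rangle \quad\text{and}\quad \mathcal{M}^*(-v, v)\,|\chi_1\otimes\Psi_1\rangle = |\chi_1\otimes\Psi_1\rangle. \tag{7.7}$$

For any $Y \in \mathbb{T}^d\times\mathbb{T}^d$,

$$\mathrm{Spr}\,\mathcal{M}(Y) = \mathrm{Spr}\,\mathcal{M}^*(Y) \leq 1. \tag{7.8}$$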

Proof. The first two identities follow from explicit computations. The second property is a consequence of the fact that if one endows $\mathbb{C}^F\otimes(\mathbb{C}^{2d}\otimes\mathbb{C}^{2d})$ with the norm $\|\sum_j e_j\otimes\Psi_j\|_\infty := \max_j \|\Psi_j\|_{\mathbb{C}^{2d}\otimes\mathbb{C}^{2d}}$, then $\mathcal{M}^*(Y)$ becomes a contraction. This is due to the fact that $P$, as a stochastic matrix, is a contraction with the sup norm on $\mathbb{C}^F$. The spectral radius of $\mathcal{M}^*(Y)$ thus cannot exceed one.

We shall work under the

Assumption S''. For all $v \in \mathbb{T}^d$,

$$\sigma(\mathcal{M}(-v, v)|_{\mathcal{I}^*}) \cap \partial D(0, 1) = \{1\} \ \text{ and the eigenvalue 1 is simple}, \tag{7.9}$$

where $\mathcal{I}^*$ is the $\mathcal{M}^*(Y)$-cyclic subspace generated by $\chi_1\otimes\Psi_1$.

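Then we can proceed with a spectral analysis similar to that of Sect. 3. Ignoring the restriction $|_{\mathcal{I}^*}$ in the notation, we note that Assumption S'' implies that for all $v \in \mathbb{T}^d_\nu$, a complex neighborhood of $\mathbb{T}^d$, we can write

$$(\mathcal{M}(-v, v) - z)^{-1} = \frac{\tilde P}{1 - z} + \tilde S_v(z), \qquad z \notin \sigma(\mathcal{M}(-v, v)), \tag{7.10}$$

where $\tilde P = \frac{1}{d}|\chi_p\otimes\Psi_1\rangle\langle\chi_1\otimes\Psi_1|$ is independent of $v$, and the reduced resolvent $\tilde S_v(z)$ has the same analyticity properties as that of $M(-v, v)$. Moreover, introducing

$$\mathcal{D}(Y) = \sum_j |e_j\rangle\langle e_j|\otimes D(Y), \tag{7.11}$$

we see that $\mathcal{M}(y - v, v) = \mathcal{D}(y, 0)\mathcal{M}(-v, v)$, where

$$\mathcal{D}(y, 0) = \sum_j |e_j\rangle\langle e_j|\otimes\big(\mathbb{I} + F_1(y) + F_2(y) + O(\|y\|^3)\big), \tag{7.12}$$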


see (3.24). Therefore, if $(y, v) \in B(0, y_0)\times\mathbb{T}^d_\nu$, for $y_0 > 0$ and $\nu > 0$ small enough, Lemma 3.6 holds for $\mathcal{M}(y - v, v)$. Applying the same perturbation formulas for the isolated eigenvalue of $\mathcal{M}(y - v, v)$, noted $\lambda_1(y, v)$ again, for $y \in \mathbb{C}^d$ small enough, we reach the same conclusions by explicit computations:

$$\lambda_1(y, v) = 1 + \frac{i}{2d}\sum_{\tau\in I_\pm} y\,r(\tau) - \frac{1}{2}\langle y|\mathbb{D}(v)y\rangle + O_v(\|y\|^3), \tag{7.13}$$

for all v ∈ Tνd . The first order term in y is the same as the one of (3.27). On the other hand, the explicit form of the quadratic term in y which defines the (analytic) diffusion matrix D(v), depends on the transition matrix P, 1 (yr (τ ))2 d 2 τ ∈I± ⎛ ⎞ 1⎝ 1 ⎠. − (yr (τ ))(yr (τ )) χ1 ⊗ τ ⊗ τ | S˜v (1)χ p ⊗ τ ⊗ τ − d 2d

y|D(v)y = −

τ,τ ∈I±

(7.14)

Moreover, the corresponding rank one projector $\tilde P(y, v)$ is analytic and thus tends to $\tilde P$ uniformly in $v \in \mathbb{T}^d_\nu$, as $y \to 0$. With $\tilde Q(y, v) = \mathbb{I} - \tilde P(y, v)$, the matrix $\tilde Q(y, v)\mathcal{M}(y - v, v)\tilde Q(y, v)$ has spectral radius strictly smaller than one, for $v \in \mathbb{T}^d_\nu$ and $y \in \mathbb{C}^d$ small enough. Therefore we can state that all conclusions drawn in Sect. 3, e.g. Theorem 3.10, and in Sect. 5, e.g. Theorem 5.2, for i.i.d. coin matrices under Assumption S are true for finitely many coin matrices forming a Markov chain, under Assumption S'', mutatis mutandis.

Acknowledgements. It is a pleasure to thank L. Bruneau, E. Hamza, M. Merkli and C.A. Pillet for fruitful discussions and suggestions about this work.

References

1. Aharonov, Y., Davidovich, L., Zagury, N.: Quantum random walks. Phys. Rev. A 48, 1687–1690 (1993)
2. Ahlbrecht, A., Vogts, H., Werner, A.H., Werner, R.F.: Asymptotic evolution of quantum walks with random coin. J. Math. Phys. 52, 042201 (2011)
3. Ambainis, A., Aharonov, D., Kempe, J., Vazirani, U.: Quantum Walks on Graphs. In: Proc. 33rd ACM STOC, 2001, pp. 50–59
4. Ambainis, A., Kempe, J., Rivosh, A.: Coins make quantum walks faster. In: Proceedings of SODA'05, 2005, pp. 1099–1108
5. Asch, J., Bourget, O., Joye, A.: Localization Properties of the Chalker-Coddington Model. Ann. H. Poincaré 11(7), 1341–1373 (2010)
6. Asch, J., Duclos, P., Exner, P.: Stability of driven systems with growing gaps, quantum rings, and Wannier ladders. J. Stat. Phys. 92, 1053–1070 (1998)
7. Billingsley, P.: Convergence of Probability Measures. New York: John Wiley and Sons, 1968
8. Bourget, O., Howland, J.S., Joye, A.: Spectral analysis of unitary band matrices. Commun. Math. Phys. 234, 191–227 (2003)
9. Blatter, G., Browne, D.: Zener tunneling and localization in small conducting rings. Phys. Rev. B 37, 3856 (1988)
10. Bruneau, L., Joye, A., Merkli, M.: Infinite Products of Random Matrices and Repeated Interaction Dynamics. Ann. Inst. Henri Poincaré (B) Prob. Stat. 46, 442–464 (2010)
11. Chalker, J.T., Coddington, P.D.: Percolation, quantum tunneling and the integer Hall effect. J. Phys. C 21, 2665–2679 (1988)
12. de Oliveira, C.R., Simsen, M.S.: A Floquet Operator with Purely Point Spectrum and Energy Instability. Ann. H. Poincaré 7, 1255–1277 (2008)
13. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Berlin-Heidelberg-New York: Springer, 1998
14. Hamza, E., Joye, A., Stolz, G.: Dynamical Localization for Unitary Anderson Models. Math. Phys., Anal. Geom. 12, 381–444 (2009)
15. Hamza, E., Kang, Y., Schenker, J.: Diffusive propagation of wave packets in a fluctuating periodic potential. Lett. Math. Phys. 95, 53–66 (2011)
16. Joye, A., Merkli, M.: Dynamical Localization of Quantum Walks in Random Environments. J. Stat. Phys. 140, 1025–1053 (2010)
17. Kang, Y., Schenker, J.: Diffusion of wave packets in a Markov random potential. J. Stat. Phys. 134, 1005–1022 (2009)
18. Kato, T.: Perturbation Theory for Linear Operators. Berlin-Heidelberg-New York: Springer, 1980
19. Karski, M., Förster, L., Choi, J.M., Steffen, A., Alt, W., Meschede, D., Widera, A.: Quantum Walk in Position Space with Single Optically Trapped Atoms. Science 325, 174–177 (2009)
20. Keating, J.P., Linden, N., Matthews, J.C.F., Winter, A.: Localization and its consequences for quantum walk algorithms and quantum communication. Phys. Rev. A 76, 012315 (2007)
21. Kempe, J.: Quantum random walks - an introductory overview. Contemp. Phys. 44, 307–327 (2003)
22. Konno, N.: One-dimensional discrete-time quantum walks on random environments. Quantum Inf. Process. 8, 387–399 (2009)
23. Konno, N.: Quantum Walks. In: Quantum Potential Theory. Franz, Schürmann eds., Lecture Notes in Mathematics, 1954, Berlin-Heidelberg-New York: Springer, 2009, pp. 309–452
24. Košík, J., Bužek, V., Hillery, M.: Quantum walks with random phase shifts. Phys. Rev. A 74, 022310 (2006)
25. Landim, C.: Central Limit Theorem for Markov Processes. In: From Classical to Modern Probability, CIMPA Summer School 2001, Picco, Pierre; San Martin, Jaime (Eds.), Progress in Probability 54, Basel: Birkhäuser, 2003, pp. 147–207
26. Lenstra, D., van Haeringen, W.: Elastic scattering in a normal-metal loop causing resistive electronic behavior. Phys. Rev. Lett. 57, 1623–1626 (1986)
27. Magniez, F., Nayak, A., Richter, P.C., Santha, M.: On the hitting times of quantum versus random walks. 20th SODA, Philadelphia PA: SIAM, 2009, pp. 86–95
28. Meyer, D.: From quantum cellular automata to quantum lattice gases. J. Stat. Phys. 85, 551–574 (1996)
29. Pillet, C.A.: Some Results on the Quantum Dynamics of a Particle in a Markovian Potential. Commun. Math. Phys. 102, 237–254 (1985)
30. Ryu, J.-W., Hur, G., Kim, S.W.: Quantum Localization in Open Chaotic Systems. Phys. Rev. E 78, 037201 (2008)
31. Santha, M.: Quantum walk based search algorithms, 5th TAMC (Xian, 2008), LNCS 4978, Berlin-Heidelberg-New York: Springer Verlag, 2008, pp. 31–46
32. Shapira, D., Biham, O., Bracken, A.J., Hackett, M.: One dimensional quantum walk with unitary noise. Phys. Rev. A 68, 062315 (2003)
33. Shikano, Y., Katsura, H.: Localization and fractality in inhomogeneous quantum walks with self-duality. Phys. Rev. E 82, 031122 (2010)
34. Shenvi, N., Kempe, J., Whaley, K.B.: Quantum random-walk search algorithm. Phys. Rev. A 67, 052307 (2003)
35. Yin, Y., Katsanos, D.E., Evangelou, S.N.: Quantum Walks on a Random Environment. Phys. Rev. A 77, 022302 (2008)
36. Zhan, X.: Matrix Inequalities. LNM 1790, Berlin-Heidelberg-New York: Springer, 2002
37. Zähringer, F., Kirchmair, G., Gerritsma, R., Solano, E., Blatt, R., Roos, C.F.: Realization of a quantum walk with one and two trapped ions. Phys. Rev. Lett. 104, 100503 (2010)

Communicated by H. Spohn

Commun. Math. Phys. 307, 101–131 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1301-2


Quantum Isometries of the Finite Noncommutative Geometry of the Standard Model

Jyotishman Bhowmick (1), Francesco D'Andrea (2), Ludwik Dąbrowski (2)

(1) Abdus Salam International Center for Theoretical Physics (ICTP), Strada Costiera 11, 34151 Trieste, Italy. E-mail: [email protected]
(2) Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy. E-mail: [email protected]; [email protected]

Received: 12 November 2010 / Accepted: 15 March 2011
Published online: 14 July 2011 – © Springer-Verlag 2011

Abstract: We compute the quantum isometry group of the finite noncommutative geometry $F$ describing the internal degrees of freedom in the Standard Model of particle physics. We show that this provides genuine quantum symmetries of the spectral triple corresponding to $M \times F$, where $M$ is a compact spin manifold. We also prove that the bosonic and fermionic parts of the spectral action are preserved by these symmetries.

1. Introduction

In modern theoretical physics, symmetries play a fundamental role in determining the dynamics of a theory. In the two foremost examples, namely General Relativity and the Standard Model of elementary particles, the dynamics is dictated by invariance under diffeomorphisms and under local gauge transformations respectively. As a way to unify external (i.e. diffeomorphisms) and internal (i.e. local gauge) symmetries, Connes and Chamseddine proposed a model from Noncommutative Geometry [15] based on the product of the canonical commutative spectral triple of a compact Riemannian spin manifold $M$ and a finite dimensional noncommutative one, describing an "internal" finite noncommutative space $F$ [12,13,18,20]. In this picture, diffeomorphisms are realized as outer automorphisms of the algebra, while inner automorphisms correspond to the gauge transformations. Inner fluctuations of the Dirac operator are divided in two classes: the 1-forms coming from commutators with the Dirac operator of $M$ give the gauge bosons, while the 1-forms coming from the Dirac operator of $F$ give the Higgs field. The gravitational and bosonic part $S_b$ of the action is encoded in the spectrum of the gauged Dirac operator, which is invariant under isometries of the Hilbert space. The fermionic part $S_f$ is also defined in terms of the spectral data. The result is a Euclidean version of the Standard Model minimally coupled to gravity (cf. [20] and references therein).

In his "Erlangen program", Klein linked the study of geometry with the analysis of its group of symmetries.
Dealing with quantum geometries, it is natural to study quantum


J. Bhowmick, F. D’Andrea, L. Dąbrowski

symmetries. The idea of using quantum group symmetries to understand the conceptual significance of the finite geometry F is mentioned in a final remark by Connes in [17]. Preliminary studies on the Hopf-algebra level appeared in [21,26,30]. Following Connes’ suggestion, quantum automorphisms of finite-dimensional complex C ∗ -algebras were introduced by Wang in [37,38] and later the quantum permutation groups of finite sets and graphs have been studied by a number of mathematicians, see e.g. [3,4,11,34]. These are compact quantum groups in the sense of Woronowicz [41]. The notion of compact quantum symmetries for “continuous” mathematical structures, like commutative and noncommutative manifolds (spectral triples), first appeared in [28], where quantum isometry groups were defined in terms of a Laplacian, followed by the definition of “quantum groups of orientation preserving isometries” based on the theory of spectral triples in [7], and on spectral triples with a real structure in [29]. Computations of these compact quantum groups were done for several examples, including the tori, spheres, Podle´s quantum spheres, and Rieffel deformations of compact Riemannian spin manifolds. For these studies we refer to [6–10] and references therein. The finite noncommutative geometry F = (A F , H F , D F , γ F , J F ) describing the internal space of the Standard Model is given by a unital real spectral triple over the finite-dimensional real C ∗ -algebra A F = C ⊕ H ⊕ M3 (C), with H the field of quaternions. Let B F ⊂ B(H) be the smallest complex C ∗ -algebra containing A F as a real C ∗ -subalgebra. In this article we first compute the quantum group of orientation and real structure preserving isometries of the spectral triple (B F , H F , D F , γ F , J F ); next we show that this quantum symmetry can be extended to get quantum isometries of the product of this spectral triple with the canonical spectral triple of M. 
Thus, we have genuine quantum symmetries of the full spectral triple of the Standard Model. Moreover these quantum symmetries preserve the spectral action in a suitable sense. Finally we compute the maximal quantum subgroup of the quantum isometry group whose coaction is a quantum automorphism of the real $C^*$-algebra $A_F$.

The plan of this article is as follows. We start by recalling in Sect. 2 some basic definitions and facts about compact quantum groups and quantum isometries. In Sect. 3 we introduce the spectral triple F and state the main result. Since quantum groups, coactions, etc. are defined in the framework of complex ($C^*$-)algebras, we replace $A_F$ by $B_F$ and compute the quantum isometry group of the latter in the sense of [29]. As shown in Sect. 3.2, this is given by the free product $C(U(1)) * A_{\mathrm{aut}}(M_3(\mathbb{C}))$, where $A_{\mathrm{aut}}(M_n(\mathbb{C}))$ is Wang’s quantum automorphism group of $M_n(\mathbb{C})$ [37]. In Sect. 4, we discuss the invariance of the spectral action under quantum isometries. In Sect. 5 we explain how the result changes if we work with real instead of complex algebras. The final section deals with the proof of the main result, that is, Proposition 3.4.

Throughout the paper, by the symbol $\otimes_{\mathrm{alg}}$ we always mean the algebraic tensor product over $\mathbb{C}$, by $\otimes$ the minimal tensor product of complex $C^*$-algebras or the completed tensor product of Hilbert modules over complex $C^*$-algebras. The symbol $\otimes_{\mathbb{R}}$ denotes the tensor product over the real numbers. Unless otherwise stated, all algebras are assumed to be unital complex associative involutive algebras. We denote by $N^*$ the set of all bounded linear functionals $N \to \mathbb{C}$ on the normed linear space $N$, by $M(A)$ the multiplier algebra of the complex $C^*$-algebra $A$, by $\mathcal{L}(H)$ the adjointable operators on the Hilbert module $H$ and by $\mathcal{K}(H)$ the compact operators on the Hilbert space $H$.
For a unital complex $C^*$-algebra $A$, we implicitly use the identification of $M(\mathcal{K}(H) \otimes A)$ with the set of all adjointable operators on the Hilbert $A$-module $H \otimes A$. By abelianization of $A$ we mean the quotient of $A$ by its commutator $C^*$-ideal. Given a matrix $u$ with entries $u_{ij}$ in a $C^*$-algebra $A$, we denote by $u_{ij}^* = (u_{ij})^*$ the conjugate of the

Quantum Isometries of the Standard Model


element $u_{ij}$, and by $(u^*)_{ij} = u_{ji}^*$ the entry $(i, j)$ of the adjoint matrix $u^*$. Lastly, we want to attract the reader’s attention to a choice of notation. The notation $\widetilde{\mathrm{QISO}}^+_J$ used in this article is the same as $\widetilde{\mathrm{QISO}}^+_{\mathrm{real}}$ of [29]. We do this to avoid confusion with the newly defined object $\widetilde{\mathrm{QISO}}^+_{\mathbb{R}}$ of Sect. 5 in the context of quantum isometries of real $C^*$-algebras.

2. Compact Quantum Groups and Quantum Isometries

2.1. Some generalities on compact quantum groups. We begin by recalling the definition of compact quantum groups and their coactions from [40,41]. We shall use most of the terminology of [36], for example Woronowicz $C^*$-subalgebra, Woronowicz $C^*$-ideal, etc., with the exception that Woronowicz $C^*$-algebras will be called compact quantum groups, and we will not use the term compact quantum groups for the dual objects as done in [36].

Definition 2.1. A compact quantum group (to be denoted by CQG from now on) is a pair $(Q, \Delta)$ given by a complex unital $C^*$-algebra $Q$ and a unital $C^*$-algebra morphism $\Delta : Q \to Q \otimes Q$ such that
i) $\Delta$ is coassociative, i.e. $(\Delta \otimes \mathrm{id}) \circ \Delta = (\mathrm{id} \otimes \Delta) \circ \Delta$ as equality of maps $Q \to Q \otimes Q \otimes Q$;
ii) $\mathrm{Span}\{(a \otimes 1_Q)\Delta(b) \mid a, b \in Q\}$ and $\mathrm{Span}\{(1_Q \otimes a)\Delta(b) \mid a, b \in Q\}$ are norm-dense in $Q \otimes Q$.

For $Q = C(G)$, where $G$ is a compact topological group, conditions i) and ii) correspond to the associativity and the cancellation property of the product in $G$, respectively.

Definition 2.2. A unitary corepresentation of a compact quantum group $(Q, \Delta)$ on a Hilbert space $H$ is a unitary element $U \in M(\mathcal{K}(H) \otimes Q)$ satisfying $(\mathrm{id} \otimes \Delta)U = U_{(12)} U_{(13)}$, where we use the standard leg numbering notation (see e.g. [32]).

If $Q = C(G)$, $U$ corresponds to a strongly continuous unitary representation of $G$.
For any compact quantum group $Q$ (see [40,41]), there always exists a canonical dense ∗-subalgebra $Q_0 \subset Q$ which is spanned by the matrix coefficients of the finite dimensional unitary corepresentations of $Q$, and two maps $\epsilon : Q_0 \to \mathbb{C}$ (counit) and $\kappa : Q_0 \to Q_0$ (antipode) which make $Q_0$ a Hopf ∗-algebra.

Definition 2.3. A Woronowicz $C^*$-ideal of a CQG $(Q, \Delta)$ is a $C^*$-ideal $I$ of $Q$ such that $\Delta(I) \subset \ker(\pi_I \otimes \pi_I)$, where $\pi_I : Q \to Q/I$ is the projection map. The quotient $Q/I$ is a CQG with the induced coproduct.

If $Q = C(G)$ is the algebra of continuous functions on a compact topological group $G$, closed subgroups of $G$ correspond to the quotients of $Q$ by its Woronowicz $C^*$-ideals. While quotients $Q/I$ give “compact quantum subgroups”, $C^*$-subalgebras $Q' \subset Q$ such that $\Delta(Q') \subset Q' \otimes Q'$ describe “quotient quantum groups”.
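As a purely illustrative sketch (not part of the original argument), condition i) of Definition 2.1 can be checked by hand in the commutative case $Q = C(G)$ for a finite group. The choice $G = S_3$, the dictionary encoding of functions, and all function names below are assumptions made for this example only; the coproduct is taken to be $\Delta(f)(g, h) = f(gh)$.

```python
from itertools import permutations, product

# Toy model (an assumption for illustration): G = S_3, with each element
# stored as a tuple p where p[i] is the image of i.
G = list(permutations(range(3)))

def mul(g, h):
    """Composition g∘h of two permutations."""
    return tuple(g[h[i]] for i in range(3))

def coproduct(f):
    """Delta(f)(g, h) = f(gh): a function on G x G, stored as a dict."""
    return {(g, h): f[mul(g, h)] for g, h in product(G, G)}

def delta_on_first_leg(F):
    """(Delta ⊗ id)F, i.e. the function (g, h, k) -> F(gh, k)."""
    return {(g, h, k): F[(mul(g, h), k)] for g, h, k in product(G, G, G)}

def delta_on_second_leg(F):
    """(id ⊗ Delta)F, i.e. the function (g, h, k) -> F(g, hk)."""
    return {(g, h, k): F[(g, mul(h, k))] for g, h, k in product(G, G, G)}

# An arbitrary test function on G taking six distinct values.
f = {g: float(i * i + 1) for i, g in enumerate(G)}

lhs = delta_on_first_leg(coproduct(f))
rhs = delta_on_second_leg(coproduct(f))
coassociative = (lhs == rhs)  # reduces to associativity of the product in G
```

Both sides evaluate to $f((gh)k)$ and $f(g(hk))$ respectively, so the comparison succeeds exactly because the product in $G$ is associative.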


Definition 2.4. We say that a CQG $(Q, \Delta)$ coacts on a unital $C^*$-algebra $A$ if there is a unital $C^*$-homomorphism (called a coaction) $\alpha : A \to A \otimes Q$ such that:
i) $(\alpha \otimes \mathrm{id})\alpha = (\mathrm{id} \otimes \Delta)\alpha$,
ii) $\mathrm{Span}\{\alpha(a)(1_A \otimes b) \mid a \in A, b \in Q\}$ is norm-dense in $A \otimes Q$.
The coaction is faithful if any compact quantum group $Q' \subset Q$ coacting on $A$ coincides with $Q$.

It is well known (cf. [33,37]) that condition (ii) in Def. 2.4 is equivalent to the existence of a norm-dense unital ∗-subalgebra $A_0$ of $A$ such that $\alpha(A_0) \subset A_0 \otimes_{\mathrm{alg}} Q_0$ and $(\mathrm{id} \otimes \epsilon)\alpha = \mathrm{id}$ on $A_0$. For later use, let us now recall the concept of universal CQGs $A_u(R)$ as defined in [35,38] and references therein.

Definition 2.5. For a fixed $n \times n$ positive invertible matrix $R$, $A_u(R)$ is the universal $C^*$-algebra generated by $\{u_{ij},\ i, j = 1, \ldots, n\}$ such that
\[
u u^* = u^* u = I_n, \qquad u^t (R \bar{u} R^{-1}) = (R \bar{u} R^{-1}) u^t = I_n,
\]
where $u := ((u_{ij}))$, $u^* := ((u_{ji}^*))$ and $\bar{u} := (u^*)^t = ((u_{ij}^*))$. The coproduct is given by
\[
\Delta(u_{ij}) = \sum_k u_{ik} \otimes u_{kj}.
\]

Note that $u$ is a unitary corepresentation of $A_u(R)$ on $\mathbb{C}^n$. The $A_u(R)$’s are universal in the sense that every compact matrix quantum group (i.e. every CQG generated by the matrix entries of a finite-dimensional unitary corepresentation) is a quantum subgroup of $A_u(R)$ for some $R > 0$ [38]. It may also be noted that $A_u(R)$ is the universal object in the category of CQGs which admit a unitary corepresentation on $\mathbb{C}^n$ such that the adjoint coaction on the finite-dimensional $C^*$-algebra $M_n(\mathbb{C})$ preserves the functional $M_n(\mathbb{C}) \ni m \mapsto \mathrm{Tr}(R^t m)$ (see [39]). We observe the following elementary fact which is going to be used in the sequel.

Lemma 2.6. Let $H = \mathbb{C}^n$, $n \in \mathbb{N}$, and let $B \in M_n(\mathcal{B})$ be a matrix with entries in a unital ∗-algebra $\mathcal{B}$. Then $(\mathrm{Tr}_H \otimes \mathrm{id})\big(B(L \otimes 1)B^*\big) = \mathrm{Tr}_H(L) \cdot 1_{\mathcal{B}}$ for any linear operator $L$ on $H$ if and only if $B^t$ is unitary.

A matrix $B$ (with entries in a unital ∗-algebra $\mathcal{B}$) such that both $B$ and $B^t$ are unitary is called a biunitary [5]. We remark that the CQG $A_u(n) := A_u(I_n)$, called the free quantum unitary group, is generated by the biunitary matrix $u$ given in Def. 2.5. We refer to [38] for a detailed discussion on the structure and classification of such quantum groups. The analogue of projective unitary groups was introduced in [2] (see also Sect. 3 of [5]). Let us recall the definition.

Definition 2.7. We denote by $PA_u(n)$ the $C^*$-subalgebra of $A_u(n)$ generated by $\{(u_{ij})^* u_{kl} : i, j, k, l = 1, \ldots, n\}$. This is a CQG with the coproduct induced from $A_u(n)$.
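The trace identity of Lemma 2.6 can be sanity-checked numerically in the simplest situation, where the ∗-algebra of entries is just the complex numbers. This is only the commutative shadow of the lemma (for scalar entries, $B$ is unitary exactly when $B^t$ is), and the matrices, helper names and tolerances below are assumptions chosen for illustration.

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def conjugated_trace(B, L):
    """Tr(B L B*), the scalar version of (Tr ⊗ id)(B (L ⊗ 1) B*)."""
    return trace(matmul(matmul(B, L), dagger(B)))

# An arbitrary operator L on C^2.
L = [[1, 2 + 1j], [0.5j, -3]]

# A unitary matrix: a rotation times a phase.
phase = cmath.exp(0.7j)
c, s = math.cos(0.3), math.sin(0.3)
B_unitary = [[phase * c, -phase * s], [phase * s, phase * c]]

# A non-unitary matrix (a shear), for which the identity should fail.
B_shear = [[1, 1], [0, 1]]

gap_unitary = abs(conjugated_trace(B_unitary, L) - trace(L))
gap_shear = abs(conjugated_trace(B_shear, L) - trace(L))
```

In the genuinely noncommutative setting the entries of $B$ do not commute, which is why the lemma singles out unitarity of $B^t$ rather than of $B$; the sketch above cannot see that distinction.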


Remark 2.8. The projective version of any quantum subgroup of $A_u(n)$ can be defined similarly.

In [37], Wang defines the quantum automorphism group of $M_n(\mathbb{C})$, denoted by $A_{\mathrm{aut}}(M_n(\mathbb{C}))$, to be the universal object in the category of CQGs with a coaction on $M_n(\mathbb{C})$ preserving the trace (and with morphisms given by CQG homomorphisms intertwining the coactions). The explicit definition is in Theorem 4.1 of [37]. In the following proposition we recall Théorème 1(iv) of [2] (cf. also Prop. 3.1(3) of [5]).

Proposition 2.9 ([2,5]). We have $PA_u(n) \simeq A_{\mathrm{aut}}(M_n(\mathbb{C}))$.

Definition 2.10. We denote by $Q_n(n')$ the amalgamated free product of $n$ copies of $A_u(n')$ over the common Woronowicz $C^*$-subalgebra $PA_u(n')$. This is the CQG generated by the matrix entries of $n$ biunitary matrices $u_m$ ($m = 1, \ldots, n$) of size $n'$, with relations
\[
(u_m^*)_{i,j}(u_m)_{k,l} = (u_{m'}^*)_{i,j}(u_{m'})_{k,l} \qquad \forall\ i, j, k, l = 1, \ldots, n',\ \ m, m' = 1, \ldots, n,
\]
and with standard matrix coproduct: $\Delta((u_m)_{ij}) = \sum_{k=1}^{n'} (u_m)_{ik} \otimes (u_m)_{kj}$ for all $m = 1, \ldots, n$.

The next lemma will be needed later on.

Lemma 2.11. Let $Q$ be a CQG and let $X, Y \in M_N(Q)$, $N \in \mathbb{N}$, be matrices with entries in $Q$ satisfying $\Delta(X_{ik}) = \sum_{j=1}^N X_{ij} \otimes X_{jk}$ and $\Delta(Y_{ik}) = \sum_{j=1}^N Y_{ij} \otimes Y_{jk}$. Let $A \in M_N(\mathbb{C})$. Then the ideal $I \subset Q$ generated by the matrix entries of the matrix $XA - AY$ is a Woronowicz $C^*$-ideal.

Proof. We now prove that $\Delta(I) \subset Q \otimes I + I \otimes Q \subset \ker(\pi_I \otimes \pi_I)$, where $\pi_I : Q \to Q/I$ is the quotient map, and hence $I$ is a Woronowicz $C^*$-ideal. Since $I$ is a (two-sided) ideal and $\Delta$ a $C^*$-algebra homomorphism, it is enough to give the proof for the generators $Z_{ij} := \sum_{k=1}^N (X_{ik} A_{kj} - A_{ik} Y_{kj})$ of $I$. The following algebraic identity holds:
\begin{align*}
\Delta(Z_{il}) &= \sum_{j=1}^N \Delta(X_{ij} A_{jl} - A_{ij} Y_{jl}) \\
&= \sum_{j,k=1}^N \big( X_{ij} \otimes X_{jk} A_{kl} - A_{ij} Y_{jk} \otimes Y_{kl} \big) \\
&= \sum_{j,k=1}^N \big( X_{ij} \otimes (X_{jk} A_{kl} - A_{jk} Y_{kl}) + (X_{ij} A_{jk} - A_{ij} Y_{jk}) \otimes Y_{kl} \big) \\
&= \sum_{j=1}^N \big( X_{ij} \otimes Z_{jl} + Z_{ij} \otimes Y_{jl} \big).
\end{align*}
This concludes the proof.

2.2. Noncommutative geometry and quantum isometries. In noncommutative geometry, compact Riemannian spin manifolds are replaced by real spectral triples. Recall that a unital spectral triple (A, H, D) is the datum of: a complex Hilbert space H, a complex unital associative involutive algebra A with a faithful unital ∗-representation π : A → B(H) (the representation symbol is usually omitted), a (possibly unbounded) selfadjoint operator D on H with compact resolvent and having bounded commutators with


all $a \in A$. The canonical commutative example is given by $(C^\infty(M), L^2(M, S), \slashed{D})$, where $C^\infty(M)$ is the algebra of complex-valued smooth functions on a compact Riemannian spin manifold with no boundary, $L^2(M, S)$ is the Hilbert space of square integrable spinors and $\slashed{D}$ is the Dirac operator. A spectral triple is even if there is a $\mathbb{Z}_2$-grading $\gamma$ on $H$ commuting with $A$ and anticommuting with $D$. We will set $\gamma = 1$ when the spectral triple is odd. A spectral triple is real if there is an antilinear isometry $J : H \to H$, called the real structure, such that
\[
J^2 = \epsilon 1, \qquad J D = \epsilon' D J, \qquad J \gamma = \epsilon'' \gamma J, \tag{2.1}
\]
and
\[
[a, J b J^{-1}] = 0, \qquad [[D, a], J b J^{-1}] = 0, \tag{2.2}
\]
for all $a, b \in A$.¹ Here $\epsilon$, $\epsilon'$ and $\epsilon''$ are signs and determine the KO-dimension of the space [16]. For the finite part of the Standard Model $\epsilon = +1$, $\epsilon' = +1$, $\epsilon'' = -1$ and the KO-dimension is 6 [14]. Imposing a few additional conditions, it is possible to reconstruct a compact Riemannian spin manifold from any commutative real spectral triple [19].

In the example $(C^\infty(M), L^2(M, S), \slashed{D}, J, \gamma)$ of the spectral triple associated to a compact Riemannian spin manifold $M$ with no boundary, there exists a covering group $\widetilde{G}$ of the group $G$ of orientation preserving isometries of $M$ having a unitary representation $U$ on the Hilbert space of spinors $L^2(M, S)$ commuting with $\slashed{D}$, $J$, $\gamma$, whose adjoint action $\mathrm{Ad}_U$ on $B(L^2(M, S))$ preserves the subalgebra $C^\infty(M)$. This picture is used to generalize the notion of isometries as follows (cf. Def. 3 and 4 of [29]).

Definition 2.12. A compact quantum group $Q$ coacts by “orientation and real structure preserving isometries” on the spectral triple $(A, H, D, \gamma, J)$ if there is a unitary corepresentation $U \in M(\mathcal{K}(H) \otimes Q)$ such that
\begin{align}
& U \text{ commutes with } D \otimes 1 \text{ and } \gamma \otimes 1; \tag{2.3a} \\
& (J \otimes *)U(\xi \otimes 1_Q) = U(J\xi \otimes 1_Q) \text{ for all } \xi \in H; \tag{2.3b} \\
& (\mathrm{id} \otimes \varphi)\mathrm{Ad}_U(a) \in A'' \text{ for all } a \in A \text{ and every state } \varphi \text{ on } Q, \tag{2.3c}
\end{align}
where $\mathrm{Ad}_U = U(\,\cdot \otimes 1_Q)U^*$ is the adjoint coaction and $A''$ is the double commutant of $A$.

Note that in Definition 4 of [29] two antilinear operators $J$ and $\tilde{J}$ appear. $\tilde{J}$ is a generalized real structure (it is not assumed to be an isometry) and $J$ is its antiunitary part. Since in this article the real structure is an antilinear isometry, $J$ and $\tilde{J}$ coincide, and hence our definition is a particular instance of Definition 4 of [29].

We end this section by recalling Theorem 1 of [29]. Let $(A, H, D, \gamma, J)$ be a real spectral triple with $\epsilon' = 1$ and let $\mathbf{C}_J$ be the category with objects $(Q, U)$ as in Definition 2.12 and morphisms given by CQG morphisms intertwining the corresponding corepresentations. We recall that an object $(Q, U)$ in the category $\mathbf{C}_J$ is said to be a sub-object of $(Q_0, U_0)$ in the same category if there exists a CQG morphism $\varphi : Q_0 \to Q$ such that $(\mathrm{id} \otimes \varphi)(U_0) = U$. An object $(Q_0, U_0)$ is universal if for any other object $(Q, U)$ in $\mathbf{C}_J$ there exists a unique such $\varphi$.

¹ Notice that in some examples, although not in the present case, condition (2.2) has to be slightly relaxed, cf. [22–25].


Theorem 2.13 ([29]). The category $\mathbf{C}_J$ has a universal object denoted by $\widetilde{\mathrm{QISO}}^+_J(A, H, D, \gamma, J)$ (or simply $\widetilde{\mathrm{QISO}}^+_J(D)$) whose unitary corepresentation, say $U_0$, is faithful. The quantum isometry group, denoted by $\mathrm{QISO}^+_J(A, H, D, \gamma, J)$ (or simply $\mathrm{QISO}^+_J(D)$), is given by the Woronowicz $C^*$-subalgebra of $\widetilde{\mathrm{QISO}}^+_J(D)$ generated by the elements $\langle \xi \otimes 1, \mathrm{Ad}_{U_0}(a)(\eta \otimes 1) \rangle$, where $a \in A$, $\xi, \eta \in H$ and $\langle\,,\rangle$ is the $\widetilde{\mathrm{QISO}}^+_J(D)$-valued inner product on the Hilbert module $H \otimes \widetilde{\mathrm{QISO}}^+_J(D)$ (cf. Def. 5 in [29]).

$\widetilde{\mathrm{QISO}}^+_J(D)$ is the quantum analogue of the covering $\widetilde{G}$ of the classical group $G$ of orientation preserving isometries of a spin manifold $M$. Its projective version (in the sense of Sect. 3 of [5]) is the quantum group $\mathrm{QISO}^+_J(D)$, which is the quantum analogue of $G$.

3. Quantum Isometries of the Internal Non-commutative Space of the Standard Model

3.1. The finite non-commutative space F. The spectral triple $(A_F, H_F, D_F, \gamma_F, J_F)$ describing the internal space $F$ of the Standard Model is defined as follows (cf. [20] and references therein). The algebra $A_F$ is
\[
A_F := \mathbb{C} \oplus \mathbb{H} \oplus M_3(\mathbb{C}), \tag{3.1}
\]
where we identify $\mathbb{H}$ with the real subalgebra of $M_2(\mathbb{C})$ with elements
\[
q = \begin{pmatrix} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{pmatrix} \tag{3.2}
\]
for $\alpha, \beta \in \mathbb{C}$ (cf. Cayley-Dickson construction). Let us denote by $\mathbb{C}[v_1, \ldots, v_k] \simeq \mathbb{C}^k$ the vector space with basis $v_1, \ldots, v_k$. For our convenience, we adopt the following notation for the Hilbert space $H_F$. It can be written as a tensor product $H_F := \mathbb{C}^2 \otimes \mathbb{C}^4 \otimes \mathbb{C}^4 \otimes \mathbb{C}^n$, where, in the notations of [20], we have

i) the first two factors $\mathbb{C}^2 \otimes \mathbb{C}^4$ with
\[
\mathbb{C}^2 = \mathbb{C}[\uparrow, \downarrow], \qquad \mathbb{C}^4 = \mathbb{C}[\ell, \{q_c\}_{c=1,2,3}],
\]
where $\uparrow$ and $\downarrow$ stand for weak isospin up and down, and $\ell$ and $q_c$ stand for lepton and quark of color $c$ respectively. These may be combined into $\mathbb{C}^8 = \mathbb{C}[\nu, e, \{u_c, d_c\}_{c=1,2,3}]$, where $\nu$ stands for “neutrino”, $e$ for “electron”, $u_c$ and $d_c$ for quarks with weak isospin $+1/2$ and $-1/2$ respectively and of color $c$. Explicitly, the isomorphism $\mathbb{C}^2 \otimes \mathbb{C}^4 \to \mathbb{C}^8$ is the map
\[
\uparrow \otimes\, \ell \mapsto \nu, \qquad \downarrow \otimes\, \ell \mapsto e, \qquad \uparrow \otimes\, q_c \mapsto u_c, \qquad \downarrow \otimes\, q_c \mapsto d_c.
\]


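ii) a factor $\mathbb{C}^4 = \mathbb{C}[p_L, \bar{p}_R, \bar{p}_L, p_R]$, where $L, R$ stand for the two chiralities, $p$ for “particle” and $\bar{p}$ for “antiparticle”;

iii) a factor $\mathbb{C}^n$ since each particle comes in $n$ generations. Presently only 3 generations have been observed, but for the sake of generality we will work with an arbitrary $n \geq 3$.

From a physical point of view, rays (lines through the origin) of $H_F$ are states describing the internal degrees of freedom of the elementary fermions. The charge conjugation $J_F$ changes a particle into its antiparticle, and is the composition of the componentwise complex conjugation on $H_F$ with the linear operator
\[
J_0 := 1 \otimes 1 \otimes \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \otimes 1. \tag{3.3}
\]
The grading is $\gamma_F := 1 \otimes 1 \otimes \mathrm{diag}(1, 1, -1, -1) \otimes 1$. The element $a = (\lambda, q, m) \in A_F$ (with $\lambda \in \mathbb{C}$, $q \in \mathbb{H}$ and $m \in M_3(\mathbb{C})$) is represented by
\[
\pi(a) = q \otimes 1 \otimes e_{11} \otimes 1 + \begin{pmatrix} \lambda & 0 \\ 0 & \bar{\lambda} \end{pmatrix} \otimes 1 \otimes e_{44} \otimes 1 + 1 \otimes \begin{pmatrix} \lambda & 0 & 0 & 0 \\ 0 & & & \\ 0 & & m & \\ 0 & & & \end{pmatrix} \otimes (e_{22} + e_{33}) \otimes 1, \tag{3.4}
\]
where $m$ is a $3 \times 3$ block and $\{e_{ij}\}_{i,j=1,\ldots,k}$ is the canonical basis of $M_k(\mathbb{C})$ ($e_{ij}$ is the matrix with 1 in the $(i, j)$-th position and 0 everywhere else). In particular, in (3.4) $e_{11}$ projects on the space $\mathbb{C}[p_L]$ of particles with left chirality, $e_{22}$ on $\mathbb{C}[\bar{p}_R]$, $e_{33}$ on $\mathbb{C}[\bar{p}_L]$ and $e_{44}$ on $\mathbb{C}[p_R]$. The Dirac operator is
\begin{align}
D_F := {} & e_{11} \otimes e_{11} \otimes \begin{pmatrix} 0 & 0 & 0 & \Upsilon_\nu \\ 0 & 0 & \Upsilon_\nu^t & \Upsilon_R \\ 0 & \bar{\Upsilon}_\nu & 0 & 0 \\ \Upsilon_\nu^* & \Upsilon_R^* & 0 & 0 \end{pmatrix} + e_{11} \otimes (1 - e_{11}) \otimes \begin{pmatrix} 0 & 0 & 0 & \Upsilon_u \\ 0 & 0 & \Upsilon_u^t & 0 \\ 0 & \bar{\Upsilon}_u & 0 & 0 \\ \Upsilon_u^* & 0 & 0 & 0 \end{pmatrix} \notag \\
& + e_{22} \otimes e_{11} \otimes \begin{pmatrix} 0 & 0 & 0 & \Upsilon_e \\ 0 & 0 & \Upsilon_e^t & 0 \\ 0 & \bar{\Upsilon}_e & 0 & 0 \\ \Upsilon_e^* & 0 & 0 & 0 \end{pmatrix} + e_{22} \otimes (1 - e_{11}) \otimes \begin{pmatrix} 0 & 0 & 0 & \Upsilon_d \\ 0 & 0 & \Upsilon_d^t & 0 \\ 0 & \bar{\Upsilon}_d & 0 & 0 \\ \Upsilon_d^* & 0 & 0 & 0 \end{pmatrix}, \tag{3.5}
\end{align}
where each of the $\Upsilon$ matrices is in $M_n(\mathbb{C})$, $\bar{m} := (m^*)^t$ is the matrix obtained from $m$ by conjugating each entry, and we identify $B(H_F) = M_2(\mathbb{C}) \otimes M_4(\mathbb{C}) \otimes (M_4(\mathbb{C}) \otimes M_n(\mathbb{C}))$ with $M_2(\mathbb{C}) \otimes M_4(\mathbb{C}) \otimes M_{4n}(\mathbb{C})$ by writing $M_{4n}(\mathbb{C})$ as a $4 \times 4$ matrix with entries in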


$M_n(\mathbb{C})$; in particular $e_{ij} \otimes m \in M_4(\mathbb{C}) \otimes M_n(\mathbb{C})$ is the matrix with the $n \times n$ block $m$ in position $(i, j)$. The matrix $\Upsilon_R$ is symmetric, the other $\Upsilon$ matrices are positive. Their physical meaning is explained in Sect. 17.4 of [20]: for $x = e, u, d$ the eigenvalues of $\Upsilon_x^* \Upsilon_x$ give the square of the masses of the $n$ generations of the particle $x$; the eigenvalues of $\Upsilon_\nu^* \Upsilon_\nu$ give the Dirac masses of neutrinos; the eigenvalues of $\Upsilon_R^* \Upsilon_R$ give the Majorana masses of neutrinos.

If we replace a spectral triple with one that is unitarily equivalent we do not change the symmetries. From Theorem 1.187(3) (and analogously to Lemma 1.190) of [20] it follows that, modulo a unitary equivalence, we can diagonalize one element of each pair $(\Upsilon_\nu, \Upsilon_e)$ and $(\Upsilon_u, \Upsilon_d)$. We choose to diagonalize $\Upsilon_u$ and $\Upsilon_e$. Thus, we make the following hypothesis on the $\Upsilon$ matrices:

• $\Upsilon_u$ and $\Upsilon_e$ are positive, diagonal and their eigenvalues are non-zero.
• $\Upsilon_d$ and $\Upsilon_\nu$ are positive, the eigenvalues of $\Upsilon_d$ are non-zero. Let us denote by $C$ the $SU(n)$ matrix such that $\Upsilon_d = C \delta_\downarrow C^*$, where $\delta_\downarrow$ is a diagonal matrix with non-negative eigenvalues. $C$ is the so-called Cabibbo-Kobayashi-Maskawa matrix, responsible for the quark mixing, cf. Sect. 9.3 of [20]. Similarly the unitary diagonalizing $\Upsilon_\nu$ is the so-called Pontecorvo-Maki-Nakagawa-Sakata matrix, responsible for the neutrino mixing, cf. Sect. 9.6 of [20].
• $\Upsilon_R$ is symmetric.
• For physical reasons, we assume that: $\Upsilon_x$ and $\Upsilon_y$ have distinct eigenvalues, for all $x, y \in \{\nu, e, u, d\}$ with $x \neq y$; eigenvalues of $\Upsilon_e$, $\Upsilon_u$ and $\Upsilon_d$ are non-zero and with multiplicity one.

Remark 3.1. We will often use the fact that $\Upsilon^t = \bar{\Upsilon}$ for any positive matrix $\Upsilon$.

3.2. Quantum isometries of F. Since the definition of quantum isometry group is given for spectral triples over complex ∗-algebras, we first need to explain how to canonically associate one to any spectral triple over a real ∗-algebra.

Lemma 3.2. To any real spectral triple $(A, H, D, \gamma, J)$ over a real ∗-algebra $A$ we can associate a real spectral triple $(B, H, D, \gamma, J)$ over the complex ∗-algebra $B \simeq A_{\mathbb{C}} / \ker \pi_{\mathbb{C}}$, where $A_{\mathbb{C}} \simeq A \otimes_{\mathbb{R}} \mathbb{C}$ is the complexification of $A$, with conjugation defined by $(a \otimes_{\mathbb{R}} z)^* = a^* \otimes_{\mathbb{R}} \bar{z}$ for $a \in A$ and $z \in \mathbb{C}$, and $\pi_{\mathbb{C}} : A_{\mathbb{C}} \to B(H)$ is the ∗-representation
\[
\pi_{\mathbb{C}}(a \otimes_{\mathbb{R}} z) = z \pi(a), \qquad a \in A,\ z \in \mathbb{C}. \tag{3.6}
\]

Notice that $\ker \pi_{\mathbb{C}}$ may be nontrivial since the representation $\pi_{\mathbb{C}}$ is not always faithful. For example, if $A$ is itself a complex ∗-algebra (every complex ∗-algebra is also a real ∗-algebra) and $\pi$ is complex linear, then for any $a \in A$ the element $a \otimes_{\mathbb{R}} 1 + ia \otimes_{\mathbb{R}} i$ of $A_{\mathbb{C}}$ is in the kernel of $\pi_{\mathbb{C}}$. This happens in the Standard Model case, where the complexification of $A_F = \mathbb{C} \oplus \mathbb{H} \oplus M_3(\mathbb{C})$ is the algebra $(A_F)_{\mathbb{C}} \simeq \mathbb{C} \oplus \mathbb{C} \oplus M_2(\mathbb{C}) \oplus M_3(\mathbb{C}) \oplus M_3(\mathbb{C})$, where we have used the complex ∗-algebra isomorphism $M_n(\mathbb{C}) \otimes_{\mathbb{R}} \mathbb{C} \to M_n(\mathbb{C}) \oplus M_n(\mathbb{C})$ given by $m \otimes_{\mathbb{R}} z \mapsto (mz, \bar{m}z)$, having inverse
\[
(m, m') \mapsto \frac{m + \bar{m}'}{2} \otimes_{\mathbb{R}} 1 + \frac{m - \bar{m}'}{2i} \otimes_{\mathbb{R}} i \tag{3.7}
\]
for all $m, m' \in M_n(\mathbb{C})$, $z \in \mathbb{C}$.
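The isomorphism (3.7) and the kernel element mentioned above can be checked numerically. The following is a minimal sketch under stated assumptions: an element $x \otimes_{\mathbb{R}} 1 + y \otimes_{\mathbb{R}} i$ of $M_n(\mathbb{C}) \otimes_{\mathbb{R}} \mathbb{C}$ is encoded as a pair `(x, y)` of complex matrices, $n = 2$, and all helper names and sample matrices are hypothetical.

```python
def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mscale(z, A):
    return [[z * a for a in row] for row in A]

def mconj(A):
    """Entrywise complex conjugate (no transpose)."""
    return [[a.conjugate() for a in row] for row in A]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def phi(x, y):
    """Image of x⊗1 + y⊗i under m⊗z -> (m z, conj(m) z)."""
    return (madd(x, mscale(1j, y)),
            madd(mconj(x), mscale(1j, mconj(y))))

def phi_inverse(m, mp):
    """The inverse map (3.7), returned as the pair (coefficient of ⊗1, of ⊗i)."""
    x = mscale(0.5, madd(m, mconj(mp)))
    y = mscale(1 / 2j, madd(m, mscale(-1, mconj(mp))))
    return x, y

def tensor_mul(a, b):
    """Product in M_n(C) ⊗_R C: (x⊗1 + y⊗i)(x'⊗1 + y'⊗i)."""
    (x, y), (xp, yp) = a, b
    return (madd(matmul(x, xp), mscale(-1, matmul(y, yp))),
            madd(matmul(x, yp), matmul(y, xp)))

x = [[1 + 2j, 0.5], [-1j, 3]]
y = [[2, 1j], [1 - 1j, -0.5j]]
x2 = [[0, 1], [1j, 2 - 1j]]
y2 = [[1j, -1], [0.25, 1]]

# Round trip: (3.7) really inverts m⊗z -> (mz, conj(m) z).
back = phi_inverse(*phi(x, y))
round_trip_ok = close(back[0], x) and close(back[1], y)

# Multiplicativity: phi of a product equals the componentwise product.
lhs = phi(*tensor_mul((x, y), (x2, y2)))
rhs = tuple(matmul(p, q) for p, q in zip(phi(x, y), phi(x2, y2)))
multiplicative_ok = close(lhs[0], rhs[0]) and close(lhs[1], rhs[1])

# The element a⊗1 + (ia)⊗i is sent to (0, 2 conj(a)): its first component,
# the only one a complex-linear representation sees, vanishes.
kernel_image = phi(x, mscale(1j, x))
```

This matches the statement in the text: a complex-linear $\pi$ factors through the first summand, so the second copy of $M_3(\mathbb{C})$ in $(A_F)_{\mathbb{C}}$ lies in $\ker \pi_{\mathbb{C}}$.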


Using (3.6), (3.7) and (3.4) we get $\pi_{\mathbb{C}}(\lambda, \lambda', q, m, m') = \langle \lambda, \lambda', q, m \rangle$, where
\[
\langle \lambda, \lambda', q, m \rangle := q \otimes 1 \otimes e_{11} \otimes 1 + \begin{pmatrix} \lambda & 0 \\ 0 & \lambda' \end{pmatrix} \otimes 1 \otimes e_{44} \otimes 1 + 1 \otimes \begin{pmatrix} \lambda & 0 & 0 & 0 \\ 0 & & & \\ 0 & & m & \\ 0 & & & \end{pmatrix} \otimes (e_{22} + e_{33}) \otimes 1. \tag{3.8}
\]

The complex ∗-algebra $B_F := (A_F)_{\mathbb{C}} / \ker \pi_{\mathbb{C}}$ is simply the algebra $B_F \simeq \mathbb{C} \oplus \mathbb{C} \oplus M_2(\mathbb{C}) \oplus M_3(\mathbb{C})$ with elements $\langle \lambda, \lambda', q, m \rangle$. With $A_F$ replaced by $B_F$, we can now study quantum isometries. We notice that in the case of the spectral triple of the internal part of the Standard Model, the conditions (2.3b–2.3c) are equivalent to
\begin{align}
& (J_0 \otimes 1)\bar{U} = U(J_0 \otimes 1); \tag{3.9a} \\
& \mathrm{Ad}_U(B_F) \subset B_F \otimes_{\mathrm{alg}} Q; \tag{3.9b}
\end{align}
with $J_0$ given by (3.3). The equivalence between (2.3b) and (3.9a) is an immediate consequence of the definition of $J_F$. The equivalence between (2.3c) and (3.9b) follows from the equality of $B_F$ and $B_F''$, since the latter is a finite-dimensional $C^*$-algebra. We need a preparatory lemma before our main proposition.

Lemma 3.3. Let $Q$ be the universal $C^*$-algebra generated by unitary elements $x_k$ ($k = 0, \ldots, n$), the matrix entries of $3 \times 3$ biunitaries $T_m$ ($m = 1, \ldots, n$) and of an $n \times n$ biunitary $V$, with relations
\begin{align}
& \mathrm{diag}(x_0 x_1, \ldots, x_0 x_n)\Upsilon_\nu = \Upsilon_\nu\, \mathrm{diag}(x_0 x_1, \ldots, x_0 x_n) = \bar{V} \Upsilon_\nu = \Upsilon_\nu \bar{V}, \qquad V \Upsilon_R = \Upsilon_R \bar{V}, \tag{3.10a} \\
& \sum_{m=1}^n C_{rm} \bar{C}_{sm} (T_m)_{j,k} = 0, \qquad \forall\ r \neq s \quad (r, s = 1, \ldots, n;\ j, k = 1, 2, 3), \tag{3.10b} \\
& (T_m^*)_{i,j}(T_m)_{k,l} = (T_{m'}^*)_{i,j}(T_{m'})_{k,l}, \qquad \forall\ m, m' \quad (i, j, k, l = 1, 2, 3;\ m, m' = 1, \ldots, n), \tag{3.10c}
\end{align}
where $C = ((C_{r,s}))$ is the CKM matrix. Then $Q$ with matrix coproduct
\[
\Delta(x_k) = x_k \otimes x_k, \qquad \Delta((T_m)_{ij}) = \sum_{l=1,2,3} (T_m)_{il} \otimes (T_m)_{lj}, \qquad \Delta(V_{ij}) = \sum_{l=1,\ldots,n} V_{il} \otimes V_{lj}, \tag{3.11}
\]
is a quantum subgroup of the free product
\[
\underbrace{C(U(1)) * C(U(1)) * \cdots * C(U(1))}_{n+1} * Q_n(3) * A_u(n). \tag{3.12}
\]
The Woronowicz $C^*$-ideal of (3.12) defining $Q$ is determined by the relations (3.10a) and (3.10b).


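Proof. $Q_n(3)$ is by definition generated by $3 \times 3$ biunitaries $T'_m$ ($m = 1, \ldots, n$) with the relation (3.10c), $A_u(n)$ is generated by the matrix entries of an $n \times n$ biunitary $V'$, and $C(U(1)) * C(U(1)) * \cdots * C(U(1))$ is freely generated by unitary elements $x'_k$ ($k = 0, \ldots, n$). The map $T'_m \mapsto T_m$, $V' \mapsto V$ and $x'_k \mapsto x_k$ defines a surjective $C^*$-algebra morphism from the CQG in (3.12) to $Q$. From Lemma 2.11, it follows that the kernel of the morphism $(V', x'_k) \mapsto (V, x_k)$ is a Woronowicz $C^*$-ideal, i.e. the relations (3.10a) define a quantum subgroup of $C(U(1)) * C(U(1)) * \cdots * C(U(1)) * A_u(n)$ (apply the lemma to $A = \Upsilon_\nu$ and $X, Y \in \{\mathrm{diag}(x_0 x_1, \ldots, x_0 x_n), \bar{V}\}$). It remains to prove that the kernel $I$ of the morphism $T'_m \mapsto T_m$ is also a Woronowicz $C^*$-ideal, i.e. the quotient of $Q_n(3)$ by the relation (3.10b) is a CQG. The ideal $I$ is generated by the elements $X_{r,s,j,k} := \sum_{m=1}^n C_{rm} \bar{C}_{sm} (T'_m)_{j,k}$ for all $j, k = 1, 2, 3$, $r, s = 1, \ldots, n$ and $r \neq s$. An easy computation shows that
\begin{align*}
\Delta(X_{r,s,j,k}) &= \sum_{m=1}^n \sum_{l=1}^3 C_{rm} \bar{C}_{sm} (T'_m)_{j,l} \otimes (T'_m)_{l,k} \\
&= \sum_{l=1}^3 \sum_{p=1}^n \Big( \sum_{m=1}^n C_{rm} \bar{C}_{pm} (T'_m)_{j,l} \Big) \otimes \Big( \sum_{m'=1}^n C_{pm'} \bar{C}_{sm'} (T'_{m'})_{l,k} \Big) \\
&= \sum_{l=1}^3 \sum_{p=1}^n X_{r,p,j,l} \otimes X_{p,s,l,k},
\end{align*}
where the second equality follows from $\sum_{p=1}^n \bar{C}_{pm} C_{pm'} = (C^* C)_{mm'} = \delta_{mm'}$ (recall that $C$ is a unitary matrix). Hence $\Delta(I) \subset I \otimes I$, so that $I$ is a Woronowicz $C^*$-ideal. This concludes the proof.

Proposition 3.4. The universal object $\widetilde{\mathrm{QISO}}^+_J(D_F)$ of the category $\mathbf{C}_J$ is given by the CQG in Lemma 3.3 with corepresentation
\begin{align}
U = {} & e_{11} \otimes e_{11} \otimes e_{11} \otimes \sum_{k=1}^n e_{kk} \otimes x_0 x_k + e_{22} \otimes e_{11} \otimes (e_{11} + e_{44}) \otimes \sum_{k=1}^n e_{kk} \otimes x_k \notag \\
& + e_{11} \otimes e_{11} \otimes e_{33} \otimes \sum_{k=1}^n e_{kk} \otimes x_k^* x_0^* + e_{22} \otimes e_{11} \otimes (e_{22} + e_{33}) \otimes \sum_{k=1}^n e_{kk} \otimes x_k^* \notag \\
& + e_{11} \otimes e_{11} \otimes e_{22} \otimes \sum_{j,k=1}^n e_{jk} \otimes V_{jk} + e_{11} \otimes e_{11} \otimes e_{44} \otimes \sum_{j,k=1}^n e_{jk} \otimes \bar{V}_{jk} \notag \\
& + e_{11} \otimes \sum_{j,k=1,2,3} e_{j+1,k+1} \otimes (e_{11} + e_{44}) \otimes \sum_{m=1}^n e_{mm} \otimes (T_m)_{j,k} \notag \\
& + e_{22} \otimes \sum_{j,k=1,2,3} e_{j+1,k+1} \otimes (e_{11} + e_{44}) \otimes \sum_{m=1}^n e_{mm} \otimes x_0^* (T_m)_{j,k} \notag \\
& + e_{11} \otimes \sum_{j,k=1,2,3} e_{j+1,k+1} \otimes (e_{22} + e_{33}) \otimes \sum_{m=1}^n e_{mm} \otimes (\bar{T}_m)_{j,k} \notag \\
& + e_{22} \otimes \sum_{j,k=1,2,3} e_{j+1,k+1} \otimes (e_{22} + e_{33}) \otimes \sum_{m=1}^n e_{mm} \otimes (\bar{T}_m)_{j,k}\, x_0. \tag{3.13}
\end{align}
$\widetilde{\mathrm{QISO}}^+_J(D_F)$ coacts trivially on the two summands $\mathbb{C}$ of $B_F = \mathbb{C} \oplus \mathbb{C} \oplus M_2(\mathbb{C}) \oplus M_3(\mathbb{C})$, while on the remaining summands the coaction is
\begin{align}
\alpha(\langle 0, 0, e_{ii}, 0 \rangle) &= \langle 0, 0, e_{ii}, 0 \rangle \otimes 1, \tag{3.14a} \\
\alpha(\langle 0, 0, e_{12}, 0 \rangle) &= \langle 0, 0, e_{12}, 0 \rangle \otimes x_0, \tag{3.14b} \\
\alpha(\langle 0, 0, e_{21}, 0 \rangle) &= \langle 0, 0, e_{21}, 0 \rangle \otimes x_0^*, \tag{3.14c} \\
\alpha(\langle 0, 0, 0, e_{ij} \rangle) &= \sum_{k,l=1,2,3} \langle 0, 0, 0, e_{kl} \rangle \otimes (T_1^*)_{i,k}(T_1)_{l,j}. \tag{3.14d}
\end{align}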

Proof. The proof is in Sect. 6.

Definition 3.5. Let $Q_{n,C}(3)$ be the quantum subgroup of $Q_n(3)$, cf. Def. 2.10, defined by the relation $\sum_{m=1}^n C_{rm} \bar{C}_{sm} (u_m)_{j,k} = 0$.

Remark 3.6. It is easy to see that $Q_{n,C}(3)$ is noncommutative as a $C^*$-algebra. Indeed, if $u$ is a $3 \times 3$ biunitary generating $A_u(3)$, the map $(u_m)_{jk} \mapsto u_{jk}$, $\forall m = 1, \ldots, n$, $j, k = 1, 2, 3$, is a $C^*$-algebra morphism ($C$ is a unitary matrix, hence (3.10b) and (3.10c) are automatically satisfied). Thus $A_u(3)$ is a quantum subgroup of $Q_{n,C}(3)$.

Proposition 3.7. The quantum isometry group of the internal space of the Standard Model is
\[
\mathrm{QISO}^+_J(D_F) = C(U(1)) * A_{\mathrm{aut}}(M_3(\mathbb{C})).
\]
Its abelianization is given by (complex functions on) the classical group $U(1) \times PU(3)$.

Proof. From (3.14) it follows that $\mathrm{QISO}^+_J(D_F)$ is generated by $x_0$ and $(T_1^*)_{i,k}(T_1)_{l,j}$: then $\mathrm{QISO}^+_J(D_F)$ is a quantum subgroup of $C(U(1)) * PA_u(3)$. On the other hand $A_u(3)$ is a quantum subgroup of $Q_{n,C}(3)$ (Rem. 3.6), and with the map $x_0 \mapsto x_0$, $x_i \mapsto 1$ ($i = 1, \ldots, n$), $V \mapsto x_0 1_n$ and $(T_m)_{jk} \mapsto u_{jk}$ $\forall m = 1, \ldots, n$ (with $u_{jk}$ the usual generators of $A_u(3)$), one proves that $C(U(1)) * A_u(3)$ is a sub-object of $\widetilde{\mathrm{QISO}}^+_J(D_F)$ in the category $\mathbf{C}_J$ and $C(U(1)) * PA_u(3)$ is a quantum subgroup of $\mathrm{QISO}^+_J(D_F)$; hence $\mathrm{QISO}^+_J(D_F)$ and $C(U(1)) * PA_u(3)$ coincide. Recalling that $PA_u(3) \simeq A_{\mathrm{aut}}(M_3(\mathbb{C}))$ (cf. Def. 2.7 and Prop. 2.9) the proof is concluded.

Although $\widetilde{\mathrm{QISO}}^+_J(D_F)$ depends on $\Upsilon_\nu$, $\Upsilon_R$ and the CKM matrix $C$ (cf. (3.10)), the quantum group $\mathrm{QISO}^+_J(D_F)$ does not depend on the explicit form of these three matrices. We stress the importance of this result, since neutrino masses are not known (at the moment, we only know that they are all distinct [1,27]). Also, $\mathrm{QISO}^+_J(D_F)$ is independent of the number of generations.


Let us conclude this section by explaining how elementary particles transform under the corepresentation $U$ in physics notation. As explained in Sect. 3.1, we have
\begin{align*}
\nu_{L,k} &:= e_1 \otimes e_1 \otimes e_1 \otimes e_k, && \text{(left-handed neutrino, generation $k$)} \\
\nu_{R,k} &:= e_1 \otimes e_1 \otimes e_4 \otimes e_k, && \text{(right-handed neutrino, generation $k$)} \\
e_{L,k} &:= e_2 \otimes e_1 \otimes e_1 \otimes e_k, && \text{(left-handed electron, generation $k$)} \\
e_{R,k} &:= e_2 \otimes e_1 \otimes e_4 \otimes e_k, && \text{(right-handed electron, generation $k$)} \\
u_{L,c,k} &:= e_1 \otimes e_{c+1} \otimes e_1 \otimes e_k, && \text{(left-handed up-quark, color $c$, generation $k$)} \\
u_{R,c,k} &:= e_1 \otimes e_{c+1} \otimes e_4 \otimes e_k, && \text{(right-handed up-quark, color $c$, generation $k$)} \\
d_{L,c,k} &:= e_2 \otimes e_{c+1} \otimes e_1 \otimes e_k, && \text{(left-handed down-quark, color $c$, generation $k$)} \\
d_{R,c,k} &:= e_2 \otimes e_{c+1} \otimes e_4 \otimes e_k, && \text{(right-handed down-quark, color $c$, generation $k$)}
\end{align*}
where $\{e_i,\ i = 1, \ldots, r\}$ is the canonical orthonormal basis of $\mathbb{C}^r$, $c = 1, 2, 3$ and $k = 1, \ldots, n$. These together with the corresponding antiparticles form a linear basis of $H_F$. A straightforward computation using (3.13) proves that we have the following transformation laws:
\begin{align*}
U(\nu_{L,k}) &:= \nu_{L,k} \otimes x_0 x_k, & U(\nu_{R,k}) &:= \sum_{j=1}^n \nu_{R,j} \otimes \bar{V}_{jk}, \\
U(e_{L,k}) &:= e_{L,k} \otimes x_k, & U(e_{R,k}) &:= e_{R,k} \otimes x_k, \\
U(u_{L,c,k}) &:= \sum_{c'=1}^3 u_{L,c',k} \otimes (T_k)_{c'c}, & U(u_{R,c,k}) &:= \sum_{c'=1}^3 u_{R,c',k} \otimes (T_k)_{c'c}, \\
U(d_{L,c,k}) &:= \sum_{c'=1}^3 d_{L,c',k} \otimes x_0^* (T_k)_{c'c}, & U(d_{R,c,k}) &:= \sum_{c'=1}^3 d_{R,c',k} \otimes x_0^* (T_k)_{c'c},
\end{align*}
where $U(v)$, $v \in H_F$, is a shorthand notation for $U(v \otimes 1_Q)$. Antiparticles transform according to the conjugate corepresentations. We comment now on the meaning of $\widetilde{\mathrm{QISO}}^+_J(D_F)$.

Remark 3.8. Let $z$ be the generator of $C(U(1))$, $T = ((T_{jk}))$ be the generators of $A_u(3)$ and consider the corepresentation $H_F \to H_F \otimes (C(U(1)) * A_u(3))$ determined by
\begin{align}
& \nu_{\bullet,k} \mapsto \nu_{\bullet,k} \otimes 1, \qquad e_{\bullet,k} \mapsto e_{\bullet,k} \otimes (z^*)^3, \tag{3.15a} \\
& u_{\bullet,c,k} \mapsto \sum_{c'=1}^3 u_{\bullet,c',k} \otimes z^2 T_{c'c}, \qquad d_{\bullet,c,k} \mapsto \sum_{c'=1}^3 d_{\bullet,c',k} \otimes z^* T_{c'c}, \tag{3.15b}
\end{align}
where $\bullet$ is $L$ or $R$. Let $q$ be a third root of unity, and consider the $\mathbb{Z}_3$ action on $C(U(1)) * A_u(3)$ given by $z \mapsto qz$, $T_{jk} \mapsto qT_{jk}$. The elements appearing in the image of the above corepresentation generate the fixed point subalgebra for this action, that is $\{C(U(1)) * A_u(3)\}^{\mathbb{Z}_3}$. The quantum group $\{C(U(1)) * A_u(3)\}^{\mathbb{Z}_3}$ with the corepresentation above is a sub-object of $\widetilde{\mathrm{QISO}}^+_J(D_F)$ in the category $\mathbf{C}_J$. The surjective CQG homomorphism $\widetilde{\mathrm{QISO}}^+_J(D_F) \to \{C(U(1)) * A_u(3)\}^{\mathbb{Z}_3}$ is given by
\[
x_0 \mapsto z^3, \qquad x_m \mapsto (z^*)^3\ \ \forall m = 1, \ldots, n, \qquad (T_m)_{jk} \mapsto z^2 T_{jk}\ \ \forall m = 1, \ldots, n, \qquad V \mapsto 1_n.
\]


The kernel of this map (the ideal generated by $V_{jk}$ and by products $x_0 x_k$ and $(T_m^* T_{m'})_{kl}$ for all $m \neq m'$) is given by elements that do not appear in the adjoint coaction on $B_F$. Roughly speaking, modulo terms “commuting” with the algebra $B_F$, we have that $\widetilde{\mathrm{QISO}}^+_J(D_F) \sim \{C(U(1)) * A_u(3)\}^{\mathbb{Z}_3}$ is the “free version” of the ordinary gauge group after symmetry breaking. If we pass to the abelianization $C(U(1) \times U(3))^{\mathbb{Z}_3} \simeq C\big((U(1) \times U(3))/\mathbb{Z}_3\big)$ of $\{C(U(1)) * A_u(3)\}^{\mathbb{Z}_3}$ and from the corresponding corepresentation to the dual representation of $(\tau, g) \in U(1) \times U(3)$, from (3.15) we find the usual global gauge transformations after symmetry breaking:
\[
\nu_{\bullet,k} \mapsto \nu_{\bullet,k}, \qquad e_{\bullet,k} \mapsto (\tau^*)^3 e_{\bullet,k}, \qquad u_{\bullet,c,k} \mapsto \sum_{c'=1}^3 \tau^2 g_{c'c}\, u_{\bullet,c',k}, \qquad d_{\bullet,c,k} \mapsto \sum_{c'=1}^3 \tau^* g_{c'c}\, d_{\bullet,c',k}.
\]
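The bookkeeping in Remark 3.8 can be sketched in a few lines. The encoding below is an assumption made for illustration: a monomial is recorded only by its degrees `(a, b)` in $z$ and in the entries of $T$ (a starred factor counts negatively), which forgets the noncommutative structure but is enough to test $\mathbb{Z}_3$-invariance, since $z \mapsto qz$, $T_{jk} \mapsto qT_{jk}$ rescales such a monomial by $q^{a+b}$. Reading $a/3$ as the electric charge is the standard physics convention, not a statement taken from the text.

```python
from fractions import Fraction

def times(*factors):
    """Degrees of a product of monomials: degrees add."""
    return (sum(f[0] for f in factors), sum(f[1] for f in factors))

def star(m):
    """Degrees of the adjoint of a monomial."""
    return (-m[0], -m[1])

# Images of the generators under x_0 -> z^3, x_k -> (z*)^3, T_k -> z^2 T.
X0 = (3, 0)
XK = (-3, 0)
TK = (2, 1)

# Coefficients in the transformation laws: nu_L ~ x_0 x_k, e ~ x_k,
# u ~ T_k, d ~ x_0^* T_k, pushed through the homomorphism.
image = {
    "nu_L": times(X0, XK),
    "e": XK,
    "u": TK,
    "d": times(star(X0), TK),
}

# Degrees read off directly from (3.15a)-(3.15b).
expected = {"nu_L": (0, 0), "e": (-3, 0), "u": (2, 1), "d": (-1, 1)}

# A monomial of degrees (a, b) is fixed by the Z_3 action iff a + b = 0 mod 3.
z3_invariant = {name: (a + b) % 3 == 0 for name, (a, b) in image.items()}

# Power of tau divided by 3, i.e. the charge under the U(1) factor.
charge = {name: Fraction(a, 3) for name, (a, b) in image.items()}
```

Under these assumptions the four images agree with (3.15), all of them are $\mathbb{Z}_3$-invariant, and the resulting charges are $0$, $-1$, $2/3$ and $-1/3$.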

3.3. $\widetilde{\mathrm{QISO}}^+_J$ in two special cases. As we already noticed, $\widetilde{\mathrm{QISO}}^+_J(D_F)$ depends upon the explicit form of $\Upsilon_\nu$, $\Upsilon_R$ and $C$. In particular, on one extreme we have the case when $\Upsilon_\nu$ is invertible (this is the case of the Dirac operator in the moduli space as in Prop. 1.192 of [20]) and on the other extreme we have the case $\Upsilon_\nu = 0$.

Proposition 3.9. If $\Upsilon_\nu$ is invertible, $\widetilde{\mathrm{QISO}}^+_J(D_F)$ is the free product of $Q_{n,C}(3)$ with the quotient of
\[
\underbrace{C(U(1)) * C(U(1)) * \cdots * C(U(1))}_{n+1}
\]
by the relations
\begin{align*}
x_i^* x_0^* &= x_0 x_j \qquad \forall\ i, j \text{ such that } (\Upsilon_R)_{ij} \neq 0, \\
x_i &= x_j \qquad\ \ \, \forall\ i, j \text{ such that } (\Upsilon_\nu)_{ij} \neq 0.
\end{align*}

Proof. If $\Upsilon_\nu$ is invertible, the first equation in (3.10a) gives $V = \mathrm{diag}(x_1^* x_0^*, \ldots, x_n^* x_0^*)$ (so that the factor $A_u(n)$ in (3.12) disappears) and also $(\Upsilon_\nu)_{ij}\, x_0 (x_i - x_j) = 0$. The latter implies $x_i = x_j$ whenever $(\Upsilon_\nu)_{ij} \neq 0$. The second equation in (3.10a) becomes $(\Upsilon_R)_{ij}(x_i^* x_0^* - x_0 x_j) = 0$, which implies $x_i^* x_0^* = x_0 x_j$ whenever $(\Upsilon_R)_{ij} \neq 0$.

Although disproved by experiment, it is an interesting exercise to study the case of massless ($\Upsilon_\nu = 0$) left-handed neutrinos, that is the so-called minimal Standard Model.

Proposition 3.10. If $\Upsilon_\nu = 0$, $\widetilde{\mathrm{QISO}}^+_J(D_F)$ is isomorphic to
\[
\underbrace{C(U(1)) * C(U(1)) * \cdots * C(U(1))}_{n+1} * Q_{n,C}(3) * A',
\]
where $A' := A_u(n)/\sim$, $A_u(n)$ is generated by the $n \times n$ biunitary $V$ and “$\sim$” is the relation $V \Upsilon_R = \Upsilon_R \bar{V}$.

As a consequence of Noether’s Theorem, any Lie group symmetry is associated to a corresponding conservation law. We shall see in Sect. 4.2 that $\widetilde{\mathrm{QISO}}^+_J(D_F)$ is indeed a symmetry of the dynamics. In Rem. 3.8, roughly speaking, we discussed the part of


$\widetilde{QISO}^+_J(D_F)$ that is relevant in the coaction on the algebra $B_F$: it is the free version of the gauge group which corresponds to the conservation of color and electric charge. We complete here the analysis by discussing the additional symmetries that are present in the case of the minimal Standard Model in Prop. 3.10. The factor $A$ coacts only on the subspace $(e_{11} \otimes e_{11} \otimes (e_{22} + e_{44}) \otimes 1) H_F$ of right-handed neutrinos, and can be neglected in the minimal Standard Model (where we consider only left-handed neutrinos). As a consequence of Noether’s Theorem, there exists a conservation law corresponding to each classical group of symmetries. It is easy to give an interpretation to the $C(U(1))$ factors generated by $x_i$, $i = 1, \dots, n$. Passing from the $C(U(1))$ coaction to the dual $U(1)$ action, one easily sees that for $i > 0$, $x_i$ gives a phase transformation of the $i$-th generation of $\nu_L$, $e_L$, $e_R$ (plus the opposite transformation for the antiparticles). In the minimal Standard Model, which has only left-handed (massless) neutrinos, these symmetries give the conservation laws of the total number of leptons in each generation (electron number, muon number, tau number, plus other $n - 3$ for the other families of leptons).

To conclude the list of conservation laws, there is still one classical $U(1)$ subgroup of the factor $Q_{n,C}(3)$ that should be mentioned. If we denote by $y$ the unitary generator of $C(U(1))$, a surjective CQG homomorphism $\varphi : \widetilde{QISO}^+_J(D_F) \to C(U(1))$ is given by

\[
x_0 \mapsto 1, \qquad x_i \mapsto 1, \qquad V_{j,k} \mapsto \delta_{j,k}, \qquad (T_i)_{j,k} \mapsto \delta_{j,k}\, y,
\]

for all $i = 1, \dots, n$ and $j, k = 1, 2, 3$. From $U$ we get the following corepresentation of this $U(1)$ subgroup on $H_F$:

\[
(\mathrm{id} \otimes \varphi)(U) = 1 \otimes e_{11} \otimes 1 \otimes 1 \otimes 1_{C(U(1))}
+ 1 \otimes (1 - e_{11}) \otimes (e_{11} + e_{44}) \otimes 1 \otimes y
+ 1 \otimes (1 - e_{11}) \otimes (e_{22} + e_{33}) \otimes 1 \otimes y^* .
\]

The representation of $U(1)$ dual to this corepresentation of $C(U(1))$ is given by a phase transformation on the subspace $\mathbb{C}^2 \otimes (1 - e_{11})\mathbb{C}^4 \otimes (e_{11} + e_{44})\mathbb{C}^4 \otimes \mathbb{C}^n$ of quarks and the inverse transformation on the subspace $\mathbb{C}^2 \otimes (1 - e_{11})\mathbb{C}^4 \otimes (e_{22} + e_{33})\mathbb{C}^4 \otimes \mathbb{C}^n$ of anti-quarks and is called in physics the “baryon phase symmetry”. It corresponds to the conservation of the baryon number (total number of quarks minus the number of anti-quarks).

In this section we discussed conservation laws associated to classical subgroups of $\widetilde{QISO}^+_J(D_F)$ in the massless neutrino case. It would be interesting to extend this study to the full quantum group $\widetilde{QISO}^+_J(D_F)$ in the sense of a suitable Noether analysis extended to the quantum group framework. If we consider massive neutrinos, we lose a lot of classical symmetries, but we still have many quantum symmetries. A natural question is whether quantum symmetries are suitable for deriving conservation laws (i.e. physical predictions). A first step in this direction is to investigate whether the spectral action is invariant under quantum isometries. We discuss this point in the next section.

4. Quantum Isometries of M × F

4.1. Quantum isometries of a product of spectral triples. Before discussing the spectral action, we want to understand whether the quantum isometry group of the finite geometry $F$ is also a quantum group of orientation preserving isometries of the full spectral


triple of the Standard Model, that is the product of $F$ with the canonical spectral triple of a compact Riemannian spin manifold $M$ with no boundary. The answer is affirmative and we can prove it in a more general situation. Let $(A_1, H_1, D_1, \gamma_1, J_1)$ be any unital real spectral triple ($\gamma_1 = 1$ if the spectral triple is odd). Let $(A_2, H_2, D_2, \gamma_2, J_2)$ be a finite-dimensional unital even real spectral triple. Let $(A, H, D, \gamma, J)$ be the product triple, i.e.

\[
A := A_1 \otimes_{\mathrm{alg}} A_2, \quad H := H_1 \otimes H_2, \quad D := D_1 \otimes \gamma_2 + 1 \otimes D_2, \quad \gamma := \gamma_1 \otimes \gamma_2, \quad J := J_1 \otimes J_2 .
\]

In the case of the Standard Model, $(A_1, H_1, D_1, \gamma_1, J_1)$ and $(A_2, H_2, D_2, \gamma_2, J_2)$ will be the canonical spectral triple of $M$ and the spectral triple $(B_F, H_F, D_F, \gamma_F, J_F)$ respectively. We claim that:

Lemma 4.1. $\widetilde{QISO}^+(A_2, H_2, D_2, \gamma_2, J_2)$ coacts by “orientation and real structure preserving isometries” on the product triple $(A, H, D, \gamma, J)$.

Proof. Let $Q_0$ be the quantum group $\widetilde{QISO}^+_{J_2}(D_2)$ and $U$ its corepresentation on $H_2$. Then $\hat U := 1 \otimes U$ is a unitary corepresentation on $H_1 \otimes H_2$, and we need to prove that it satisfies (2.3a), (2.3b), and (2.3c). The first two conditions are easy to check. Indeed, if $U$ commutes with $D_2$ and $\gamma_2$, clearly $1 \otimes U$ commutes with $D = D_1 \otimes \gamma_2 + 1 \otimes D_2$ and $\gamma = \gamma_1 \otimes \gamma_2$. Moreover, for any vector $\xi = \xi_1 \otimes \xi_2 \in H_1 \otimes H_2$,

\[
\begin{aligned}
(J \otimes *)\hat U(\xi \otimes 1) &= (J_1 \otimes J_2 \otimes *)(1 \otimes U)(\xi_1 \otimes \xi_2 \otimes 1) \\
&= J_1 \xi_1 \otimes (J_2 \otimes *) U(\xi_2 \otimes 1) \\
&= J_1 \xi_1 \otimes U(J_2 \xi_2 \otimes 1) \\
&= (1 \otimes U)(J_1 \xi_1 \otimes J_2 \xi_2 \otimes 1) = \hat U(J\xi \otimes 1),
\end{aligned}
\]

and thus (2.3b) is proved. Any element of $A$ is a finite sum of tensors $a_1 \otimes a_2$, with $a_1 \in A_1$ and $a_2 \in A_2$, and since finite dimensionality of $A_2$ implies $U(a_2 \otimes 1_{Q_0})U^* \in A_2 \otimes_{\mathrm{alg}} Q_0$, we have

\[
\mathrm{Ad}_{\hat U}(a_1 \otimes a_2) = \hat U(a_1 \otimes a_2 \otimes 1_{Q_0})\hat U^* = a_1 \otimes U(a_2 \otimes 1_{Q_0})U^* \in A_1 \otimes_{\mathrm{alg}} A_2 \otimes_{\mathrm{alg}} Q_0,
\]

which implies (2.3c).

4.2. Invariance of the spectral action. The dynamics of a unital spectral triple $(A, H, D, \gamma)$, with $\gamma = 1$ in the odd case, is governed by an action functional [12],

\[
S[A, \psi] := S_b[A] + S_f[A, \psi],
\]

whose variables are a self-adjoint one-form $A \in \Omega^{1,\mathrm{s.a.}}_D \subset B(H)$ and $\psi$ either in $H$ or in $H_+ := (1 + \gamma)H$. While one uses $H$ in Yang-Mills theories, the reduction to $H_+$ is employed in the Standard Model to solve the fermion doubling problem [20,31]. The fermionic part of the spectral action is either

\[
S_f[A, \psi] = \langle \psi, D_A \psi \rangle, \qquad D_A := D + A, \tag{4.1}
\]

or for a real spectral triple

\[
S_f[A, \psi] := \langle J\psi, D_A \psi \rangle, \qquad D_A := D + A + \epsilon' J A J^{-1}, \tag{4.2}
\]

where $\epsilon'$ is the sign in (2.1). The bosonic part is $S_b[A] = \mathrm{Tr}\, f(D_A/\Lambda)$, where $D_A$ is either the operator in (4.1) or (4.2), and $f$ is a suitable cut-off function (with $\Lambda > 0$). More precisely, $f$ is a smooth approximation of the characteristic function of the interval $[-1, 1]$, so that $f(D_A/\Lambda)$, defined via the continuous functional calculus, is a trace class operator on $H$ and $S_b[A]$ is well defined. In the rest of the section we focus on the fermionic action $S_f$ and the operator $D_A$ given by (4.2), although all the proofs can be repeated in the case (4.1) as well.

Assume that $Q$ is a CQG with a unitary corepresentation $\hat U$ on $H$ commuting with $D$ and $\gamma$, and such that $\mathrm{Ad}_{\hat U}$ maps $A$ into $A \otimes_{\mathrm{alg}} Q$ (rather than (2.3c)). Then $H_+$ is preserved by $\hat U$, and for any 1-form $A = \sum_i a_i [D, b_i]$, with $a_i, b_i \in A$, the operator

\[
\mathrm{Ad}_{\hat U}(A) = \hat U(A \otimes 1)\hat U^* = \sum_i \mathrm{Ad}_{\hat U}(a_i)\, [D \otimes 1, \mathrm{Ad}_{\hat U}(b_i)]
\]

is an element of $\Omega^1_D \otimes_{\mathrm{alg}} Q$. Therefore a coaction of $Q$ on $\Omega^1_D \oplus H_+$ is given by

\[
\beta : (A, \psi) \mapsto \big( \hat U(A \otimes 1)\hat U^*,\ \hat U(\psi \otimes 1) \big).
\]

To discuss the (co)invariance of the spectral action we need to extend it to the latter space. There is a natural way to do it. The inner product $\langle\ ,\ \rangle : H_+ \otimes H_+ \to \mathbb{C}$ can be extended in a unique way to an Hermitian structure $\langle\ ,\ \rangle_Q : M \otimes M \to Q$ on the right $Q$-module $M := H_+ \otimes Q$ by the rule $\langle \psi \otimes q, \psi' \otimes q' \rangle_Q = q^* q' \langle \psi, \psi' \rangle$. Unitary (resp. antiunitary) maps $L$ on $H_+$ are extended in a unique way to $Q$-linear (resp. antilinear) maps on $M$ as $L \otimes 1$ (resp. $L \otimes *$). The corresponding extension of the spectral action is given by the $Q$-valued functional

\[
\tilde S[\tilde A, \tilde\psi] := \tilde S_b[\tilde A] + \tilde S_f[\tilde A, \tilde\psi],
\]

where

\[
\tilde S_b[\tilde A] := (\mathrm{Tr}_H \otimes \mathrm{id})\, f(D_{\tilde A}/\Lambda), \qquad
\tilde S_f[\tilde A, \tilde\psi] := \langle (J \otimes *)\tilde\psi, D_{\tilde A} \tilde\psi \rangle_Q,
\]

and $\tilde A$ is a self-adjoint element of $\Omega^1_D \otimes_{\mathrm{alg}} Q$, $\tilde\psi \in H_+ \otimes Q$, $D_{\tilde A} := D \otimes 1 + \tilde A + \epsilon' (J \otimes *)\tilde A (J \otimes *)^{-1}$. Here $f(D_{\tilde A}/\Lambda)$ is defined in the following way: if $L^2(Q)$ is the GNS representation associated to the Haar state of $Q$, then $\tilde A + \epsilon' (J \otimes *)\tilde A (J \otimes *)^{-1}$ is a bounded self-adjoint operator on $H \otimes L^2(Q)$ and $D_{\tilde A}$ is a (unbounded) self-adjoint operator on the Hilbert space $H \otimes L^2(Q)$. The operator $f(D_{\tilde A}/\Lambda)$ is then defined using the continuous functional calculus. By (co)invariance of the action functional we mean the property

\[
\tilde S[\beta(A, \psi)] = S[A, \psi] \cdot 1_Q . \tag{4.3}
\]
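For orientation, the bosonic term $S_b[A] = \mathrm{Tr}\, f(D_A/\Lambda)$ can be evaluated directly once the spectrum of $D_A$ is known, since the functional calculus gives $\mathrm{Tr}\, f(D_A/\Lambda) = \sum_\lambda f(\lambda/\Lambda)$. The following is only a sketch in the finite-dimensional, classical setting: the specific cut-off $f(x) = e^{-x^8}$ and the function names are assumptions made for illustration, not the choice made in the text.

```python
import math

def cutoff(x):
    """Hypothetical smooth even approximation of the indicator of [-1, 1]."""
    if abs(x) > 10.0:
        # exp(-x**8) is far below double precision here; avoid overflow in x**8.
        return 0.0
    return math.exp(-x ** 8)

def bosonic_action(eigenvalues, scale):
    """Tr f(D / scale) for a self-adjoint D given by its list of eigenvalues."""
    if scale <= 0:
        raise ValueError("the cut-off scale must be positive")
    return sum(cutoff(ev / scale) for ev in eigenvalues)

# Toy spectrum: three eigenvalues well inside the cut-off, two well outside.
spectrum = [-3.0, -0.1, 0.0, 0.2, 5.0]
value = bosonic_action(spectrum, 1.0)
```

Because the value depends only on the eigenvalues, it is unchanged when $D_A$ is conjugated by a unitary; this is the classical shadow of the invariance property (4.3).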


Notice that since $A$ is a self-adjoint 1-form, $\tilde A = \hat U(A \otimes 1)\hat U^*$ is a self-adjoint element of $\Omega^1_D \otimes_{\mathrm{alg}} Q$ as required above so that $\tilde S[\beta(A, \psi)]$ is well defined. In the remaining part of the section we discuss the invariance of the action. We study separately the fermionic and the bosonic part.

Proposition 4.2. If $\hat U$ satisfies (2.3a) and (2.3b), then

\[
\tilde S_f[\beta(A, \psi)] = S_f[A, \psi] \cdot 1_Q
\]

for all $(A, \psi) \in \Omega^{1,\mathrm{s.a.}}_D \oplus H_+$.

Proof. This is a simple algebraic identity. Since $\hat U$ commutes with $D$ and $J \otimes *$, we have

\[
D_{\hat U(A \otimes 1)\hat U^*} = D \otimes 1 + \hat U(A \otimes 1)\hat U^* + \epsilon' (J \otimes *)\hat U(A \otimes 1)\hat U^* (J \otimes *)^{-1} = \hat U(D_A \otimes 1)\hat U^* . \tag{4.4}
\]

Thus,

\[
\begin{aligned}
\tilde S_f[\beta(A, \psi)] &= \big\langle (J \otimes *)\hat U(\psi \otimes 1_Q),\ D_{\hat U(A \otimes 1)\hat U^*}\, \hat U(\psi \otimes 1_Q) \big\rangle_Q \\
&= \big\langle \hat U(J\psi \otimes 1_Q),\ \hat U(D_A \psi \otimes 1_Q) \big\rangle_Q \\
&= \langle J\psi, D_A \psi \rangle \cdot 1_Q = S_f[A, \psi] \cdot 1_Q,
\end{aligned}
\]

by the unitarity of $\hat U$.

For the rest of the subsection, we will assume that $(A, H, D, J, \gamma)$ is the product of two real spectral triples, one of them being even and finite-dimensional. In fact, we will use the notations in Subsect. 4.1. Moreover we assume that $\hat U := 1 \otimes U$, where $U$ is a unitary corepresentation of the compact quantum group $Q$ such that $(Q, U)$ coacts by orientation and real structure preserving isometries on the finite dimensional spectral triple $(A_2, H_2, D_2, \gamma_2, J_2)$. Under these assumptions, we now establish the invariance for the bosonic part.

Lemma 4.3. For any trace-class operator $L$ on $H = H_1 \otimes H_2$,

\[
(\mathrm{Tr}_H \otimes \mathrm{id})\, \hat U(L \otimes 1)\hat U^* = \mathrm{Tr}_H(L) \cdot 1_Q .
\]

Proof. Let $L = L_1 \otimes L_2$ with $L_1 \in \mathcal{L}^1(H_1)$ and $L_2 \in B(H_2)$. Since $\hat U(L \otimes 1)\hat U^* = L_1 \otimes U(L_2 \otimes 1)U^*$, by Lemma 2.6, we have:

\[
(\mathrm{Tr}_{H_1 \otimes H_2} \otimes \mathrm{id})\, \hat U(L \otimes 1)\hat U^* = \mathrm{Tr}_{H_1}(L_1) \cdot (\mathrm{Tr}_{H_2} \otimes \mathrm{id})\big( U(L_2 \otimes 1)U^* \big) = \mathrm{Tr}_{H_1 \otimes H_2}(L) \cdot 1_Q .
\]

Since $H_2$ is finite dimensional, any element of $\mathcal{L}^1(H_1 \otimes H_2)$ is a finite sum of elements of the form $L := L_1 \otimes L_2$, with $L_1 \in \mathcal{L}^1(H_1)$ and $L_2 \in B(H_2)$, and thus by the linearity of the trace, the proof is finished.


Proposition 4.4. For any $A \in \Omega^{1,\mathrm{s.a.}}_D$,

\[
\tilde S_b[\mathrm{Ad}_{\hat U}(A)] = S_b[A] \cdot 1_Q .
\]

Proof. From (4.4) we have

\[
\tilde S_b[\hat U(A \otimes 1)\hat U^*] = (\mathrm{Tr}_H \otimes \mathrm{id})\, f(D_{\hat U(A \otimes 1)\hat U^*}/\Lambda) = (\mathrm{Tr}_H \otimes \mathrm{id})\, f\big( \hat U(D_A \otimes 1)\hat U^*/\Lambda \big).
\]

By continuous functional calculus,

\[
f\big( \hat U(D_A \otimes 1)\hat U^*/\Lambda \big) = \hat U f\big( (D_A \otimes 1)/\Lambda \big) \hat U^* = \hat U \big( f(D_A/\Lambda) \otimes 1 \big) \hat U^*,
\]

and applying Lemma 4.3 to the trace-class operator $L := f(D_A/\Lambda)$ we get

\[
\tilde S_b[\hat U(A \otimes 1)\hat U^*] = (\mathrm{Tr}_H \otimes \mathrm{id})\, \hat U(L \otimes 1)\hat U^* = \mathrm{Tr}_H(L) \cdot 1_Q \equiv \mathrm{Tr}_H f(D_A/\Lambda) \cdot 1_Q = S_b[A] \cdot 1_Q,
\]

which concludes the proof.

Proposition 4.5. The bosonic and the fermionic part of the spectral action of the Standard Model are preserved by the compact quantum group $\widetilde{QISO}^+(B_F, H_F, D_F, \gamma_F, J_F)$.

Proof. The compact quantum group $Q := \widetilde{QISO}^+(B_F, H_F, D_F, \gamma_F, J_F)$ has a corepresentation preserving $H_+$ and it satisfies the hypothesis of Lemma 4.1 and Prop. 4.2 and 4.4, hence the result follows.

5. Some Remarks on Real ∗-Algebras and Their Symmetries

In Sect. 3.2 we computed the quantum isometry group of the finite part of the Standard Model by replacing the real $C^*$-algebra $A_F$ with the complex $C^*$-algebra $B_F$. Here we explain what happens if we work with $A_F$. Any real ∗-algebra $A$ (i.e. unital, associative, involutive algebra over $\mathbb{R}$) can be thought of as the fixed point subalgebra of its complexification $A_{\mathbb{C}} = A \otimes_{\mathbb{R}} \mathbb{C}$ with respect to the involutive (conjugate-linear) real ∗-algebra automorphism $\sigma$ defined by

\[
\sigma(a \otimes_{\mathbb{R}} z) = a \otimes_{\mathbb{R}} \bar z \qquad \forall\, a \in A,\ z \in \mathbb{C}, \tag{5.1}
\]

that is $A = \{ a \in A_{\mathbb{C}} : \sigma(a) = a \}$. A crucial observation is that we can characterize the automorphisms of $A$ as those automorphisms of $A_{\mathbb{C}}$ which commute with $\sigma$, as proved in the following lemma.

Lemma 5.1. For any real ∗-algebra $A$,

\[
\mathrm{Aut}(A) \simeq \{ \phi \in \mathrm{Aut}(A_{\mathbb{C}}) : \sigma\phi = \phi\sigma \}. \tag{5.2}
\]


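Proof. If $\varphi$ is any (real) ∗-algebra morphism of $A$, $\phi(a \otimes_{\mathbb{R}} z) := \varphi(a) \otimes_{\mathbb{R}} z$ defines a (complex) ∗-algebra morphism of $A_{\mathbb{C}}$ clearly satisfying $\sigma\phi = \phi\sigma$. The map $\varphi \mapsto \phi$ gives an inclusion of the left hand side of (5.2) into the right hand side. Conversely, if $\phi \in \mathrm{Aut}(A_{\mathbb{C}})$ satisfies $\sigma\phi = \phi\sigma$, then it maps the real subalgebra $A \simeq A \otimes_{\mathbb{R}} 1 \subset A_{\mathbb{C}}$ into itself, since $\sigma\phi(a \otimes_{\mathbb{R}} 1) = \phi\sigma(a \otimes_{\mathbb{R}} 1) = \phi(a \otimes_{\mathbb{R}} 1)$ for any $a \in A$. Therefore, we can define an element $\varphi \in \mathrm{Aut}(A)$ by $\varphi(a) \otimes_{\mathbb{R}} 1 := \phi(a \otimes_{\mathbb{R}} 1)$. The two group homomorphisms $\varphi \mapsto \phi$ and $\phi \mapsto \varphi$ are the inverses of each other and thus, we have the isomorphism in (5.2).

From a dual point of view, if $G = \mathrm{Aut}(A)$, the right coaction of $C(G)$ on $A_{\mathbb{C}}$ is the map $\alpha : A_{\mathbb{C}} \to A_{\mathbb{C}} \otimes C(G) \simeq C(G; A_{\mathbb{C}})$ defined by $(\mathrm{id} \otimes \mathrm{ev}_\phi)\alpha(a) := \phi(a)$, $\phi \in G$, $a \in A_{\mathbb{C}}$. We can rephrase Lemma 5.1 as follows.

Lemma 5.2. For a finite dimensional real $C^*$-algebra $A$, the condition $\sigma\phi = \phi\sigma\ \forall\, \phi \in G$ is equivalent to $(\sigma \otimes *_{C(G)})\alpha = \alpha\sigma$.

Proof. Let $\phi \in G$ and $a \in A_{\mathbb{C}}$. Let us suppose that $(\sigma \otimes *_{C(G)})\alpha = \alpha\sigma$. Then

\[
\phi\sigma(a) = (\mathrm{id} \otimes \mathrm{ev}_\phi)\alpha\sigma(a) = (\sigma \otimes \mathrm{ev}_\phi \circ *_{C(G)})\alpha(a) = (\sigma \otimes *_{\mathbb{C}} \circ \mathrm{ev}_\phi)\alpha(a) = \sigma\phi(a)
\]

by the antilinearity of $\sigma$. Conversely, if $\sigma\phi = \phi\sigma\ \forall\, \phi \in G$, then for all $\phi$, $(\mathrm{id} \otimes \mathrm{ev}_\phi)\alpha(\sigma(a)) = (\sigma \otimes \mathrm{ev}_\phi)(\alpha(a))$. Thus,

\[
(\sigma \otimes \mathrm{ev}_\phi \circ *_{C(G)})\alpha(a) = (\sigma \otimes *_{\mathbb{C}} \circ \mathrm{ev}_\phi)\alpha(a) = \sigma\big( (\mathrm{id} \otimes \mathrm{ev}_\phi)\alpha(a) \big) = \sigma\phi(a) = \phi\sigma(a) = (\mathrm{id} \otimes \mathrm{ev}_\phi)\alpha(\sigma(a)).
\]

As $\{ \mathrm{ev}_\phi : \phi \in G \}$ separates points on $G$, this proves $(\sigma \otimes *_{C(G)})\alpha = \alpha\sigma$.

Motivated by this lemma, we consider the category $C_{J,\mathbb{R}}$ of CQGs coacting by orientation and real structure preserving isometries via a unitary corepresentation $U$ (in the sense of Def. 2.12) on the spectral triple $(B_F, H_F, D_F, \gamma_F, J_F)$ whose adjoint coaction $\mathrm{Ad}_U$ can be extended to a coaction $\alpha$ on $(A_F)_{\mathbb{C}} = A_F \otimes_{\mathbb{R}} \mathbb{C}$ satisfying

\[
(\sigma \otimes *)\alpha = \alpha\sigma. \tag{5.3}
\]

We notice that it is a subcategory of $C_J$: objects of $C_{J,\mathbb{R}}$ are those objects of $C_J$ compatible with $\sigma$ in the sense explained above, and the morphisms in the two categories are the same. Thus any object, say $Q$, of $C_{J,\mathbb{R}}$ satisfies the relations of the universal object $\widetilde{QISO}^+_J(D_F)$ of $C_J$ in Prop. 3.4. In the rest of this subsection, with a slight abuse of notation, we will continue to denote the generators of $Q$ by the same symbols as in Prop. 3.4.

Theorem 5.3. A compact quantum group $Q$ is an object in $C_{J,\mathbb{R}}$ if and only if the generators satisfy

\[
(T_m)_{jk} (T_m)^*_{j'k'} (T_m)_{j''k''} = (T_m)_{j''k''} (T_m)^*_{j'k'} (T_m)_{jk} \tag{5.4}
\]

for all $m = 1, \dots, n$ and all $j, j', j'', k, k', k'' \in \{1, 2, 3\}$.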


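Proof. The real algebra $A_F = \mathbb{C} \oplus \mathbb{H} \oplus M_3(\mathbb{C})$ is the fixed point subalgebra of $(A_F)_{\mathbb{C}} \simeq \mathbb{C} \oplus \mathbb{C} \oplus M_2(\mathbb{C}) \oplus M_3(\mathbb{C}) \oplus M_3(\mathbb{C})$ with respect to the automorphism

\[
\sigma(\lambda, \lambda', q, m, m') = (\bar\lambda', \bar\lambda, \sigma_2 \bar q \sigma_2, \bar m', \bar m),
\]

where $\sigma_2$ is the second Pauli matrix:

\[
\sigma_2 := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}.
\]

It is easy to check that $q \in M_2(\mathbb{C})$ satisfies $\sigma_2 \bar q \sigma_2 = q$ if and only if it is of the form (3.2), and that under the isomorphism (3.7) $\mathbb{C}$ is identified with the real subalgebra of $\mathbb{C} \oplus \mathbb{C}$ with elements $(\lambda, \bar\lambda)$ and $M_3(\mathbb{C})$ with the real subalgebra of $M_3(\mathbb{C}) \oplus M_3(\mathbb{C})$ with elements $(m, \bar m)$. The coaction on the factor $B_F \subset (A_F)_{\mathbb{C}}$ is given by (3.14), and an extension $\widetilde{\mathrm{Ad}}_U$ to $(A_F)_{\mathbb{C}}$ satisfying (5.3) exists if and only if

\[
\begin{aligned}
\widetilde{\mathrm{Ad}}_U(0, 0, 0, 0, e_{ij}) &= (\sigma \otimes *)\, \widetilde{\mathrm{Ad}}_U\, \sigma(0, 0, 0, 0, e_{ij}) = (\sigma \otimes *)\, \widetilde{\mathrm{Ad}}_U(0, 0, 0, e_{ij}, 0) \\
&= (\sigma \otimes *) \big( \mathrm{Ad}_U(\langle 0, 0, 0, e_{ij} \rangle), 0 \big) \\
&= (\sigma \otimes *) \sum_{k,l=1,2,3} (0, 0, 0, e_{kl}, 0) \otimes (T_1)^*_{ki} (T_1)_{lj} \\
&= \sum_{k,l=1,2,3} (0, 0, 0, 0, e_{kl}) \otimes (T_1)^*_{lj} (T_1)_{ki} .
\end{aligned}
\]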
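The fixed-point condition $\sigma_2 \bar q \sigma_2 = q$ used above can be verified on examples. The sketch below assumes that the form (3.2), which is not reproduced in this section, is the standard embedding of a quaternion as $\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$; all function names are hypothetical.

```python
SIGMA2 = ((0, -1j), (1j, 0))  # second Pauli matrix

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def conj2(m):
    """Entrywise complex conjugate of a 2x2 matrix."""
    return tuple(tuple(x.conjugate() for x in row) for row in m)

def sigma(q):
    """The conjugate-linear map q -> sigma_2 * conj(q) * sigma_2."""
    return matmul2(matmul2(SIGMA2, conj2(q)), SIGMA2)

def quaternion(alpha, beta):
    """Assumed form (3.2): [[alpha, beta], [-conj(beta), conj(alpha)]]."""
    return ((alpha, beta), (-beta.conjugate(), alpha.conjugate()))

def is_fixed(q, tol=1e-12):
    """True when sigma(q) equals q up to rounding."""
    s = sigma(q)
    return all(abs(s[i][j] - q[i][j]) < tol for i in range(2) for j in range(2))
```

For example, `is_fixed(quaternion(1 + 2j, 3 - 1j))` holds, while a generic complex matrix such as `((1, 0), (0, 2))` is not fixed; applying `sigma` twice returns the original matrix, as expected of an involution.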

The only condition left to impose is that this extension is a coaction of a CQG. As it is already a coaction on $B_F$, we need to impose it for the coaction on the second copy of $M_3(\mathbb{C})$, which has to be preserved by $\widetilde{\mathrm{Ad}}_U$. At this point, we note that as $\widetilde{\mathrm{Ad}}_U$ is an extension of $\mathrm{Ad}_U$, which preserves the trace on the first copy of $M_3(\mathbb{C})$, the formula $\widetilde{\mathrm{Ad}}_U(0, 0, 0, 0, e_{ij}) = \sum_{k,l=1,2,3} (0, 0, 0, 0, e_{kl}) \otimes (T_1)^*_{lj} (T_1)_{ki}$ forces $\widetilde{\mathrm{Ad}}_U$ to preserve the trace on the second copy of $M_3(\mathbb{C})$. Thus, by Theorem 4.1 of [37], it suffices to impose the conditions (4.1–4.5) in that paper with $a^{kl}_{ij}$ replaced by $(T_m)^*_{lj} (T_m)_{ki}$. It is easy to check that (4.3–4.5) are automatically satisfied. The only non trivial conditions come from (4.1) and (4.2). From (4.1), we get

\[
\sum_{v=1}^{3} (T_m)^*_{vj} (T_m)_{ki} (T_m)^*_{ls} (T_m)_{vr} = \delta_{jr}\, (T_m)^*_{ls} (T_m)_{ki} . \tag{5.5}
\]

From (4.2), we get the same relation with $(T_m)^t$ instead of $T_m$. Now we show that (5.5) and (5.4) are equivalent, which will finish the proof since if $T_m$ satisfies (5.4), then $(T_m)^t$ satisfies it too. If we multiply both sides of (5.5) by $(T_m)_{qj}$ from the left and sum over $j$, we get

\[
\sum_{v=1}^{3} \delta_{vq}\, (T_m)_{ki} (T_m)^*_{ls} (T_m)_{vr} = \sum_{j=1}^{3} \delta_{jr}\, (T_m)_{qj} (T_m)^*_{ls} (T_m)_{ki},
\]

using biunitarity of $T_m$. The last equation is clearly equivalent to (5.4). To prove that (5.4) implies (5.5), it is enough to multiply both sides by $(T_m)^*_{j''k'''}$ from the left, then sum over $j''$ and use the biunitarity of $T_m$ again.

It is easy to check that (5.4) defines a Woronowicz $C^*$-ideal, and hence the quotient of $\widetilde{QISO}^+_J(D_F)$ by (5.4) is a CQG. This leads to the following corollary.

Corollary 5.4. Let $\widetilde{QISO}^+_{\mathbb{R}}(D_F)$ be the quantum subgroup of the CQG $\widetilde{QISO}^+_J(D_F)$ in Prop. 3.4 defined by the relations (5.4). Then $\widetilde{QISO}^+_{\mathbb{R}}(D_F)$ is the universal object in the category $C_{J,\mathbb{R}}$.

Motivated by (5.4), we give the following definition.

Definition 5.5. For a fixed $N$, we call $A^*_u(N)$ the universal unital $C^*$-algebra generated by an $N \times N$ biunitary $u = ((u_{ij}))$ with relations

\[
a b^* c = c b^* a, \qquad \forall\, a, b, c \in \{ u_{ij},\ i, j = 1, \dots, N \}. \tag{5.6}
\]

$A^*_u(N)$ is a CQG with coproduct given by $\Delta(u_{ij}) = \sum_k u_{ik} \otimes u_{kj}$.

We will call $A^*_u(N)$ the $N$-dimensional half-liberated unitary group. This is similar to the half-liberated orthogonal group $A^*_o(N)$, that can be obtained by imposing the further relation $a = a^*$ for all $a \in \{ u_{ij},\ i, j = 1, \dots, N \}$ (cf. [5]).

Remark 5.6. We notice that there are two other possible ways to “half-liberate” the free unitary group. Instead of $a b^* c = c b^* a$ (which by adjunction is equivalent to $a^* b c^* = c^* b a^*$), one can consider respectively the relation $a^* b c = c b a^*$ (which is equivalent to $a b c^* = c^* b a$ and to the adjoints $a b^* c^* = c^* b^* a$ and $a^* b^* c = c b^* a^*$) or $a b c = c b a$ (equivalent to $a^* b^* c^* = c^* b^* a^*$) for any triple $a, b, c \in \{ u_{ij},\ i, j = 1, \dots, N \}$.

Like $A^*_o(N)$, the projective version of $A^*_u(N)$ is also commutative, as proved in the next proposition.

Proposition 5.7. The CQG $P A^*_u(N)$ is isomorphic to $C(PU(N))$.

Proof. We recall (Rem. 2.8) that for a CQG $Q$ generated by a biunitary $u = ((u_{ij}))$, the projective version is the $C^*$-subalgebra generated by products $u^*_{ij} u_{kl}$. Clearly $C(U(N))$ is a quantum subgroup of $A^*_u(N)$, and the latter is a quantum subgroup of $A_u(N)$. Thus, $C(PU(N))$ is a quantum subgroup of $P A^*_u(N)$, which is a quantum subgroup of $P A_u(N)$. Since the abelianization of $P A_u(N)$ is exactly $C(PU(N))$, any commutative (as a $C^*$-algebra) quantum subgroup of $P A_u(N)$ containing $C(PU(N))$ coincides with $C(PU(N))$. Thus, the proof will be over if we can show that the $C^*$-algebra of $P A^*_u(N)$ is commutative, i.e. $P A^*_u(N)$ is the space of continuous functions on a compact group. This is a simple computation. Using first (5.6) and then its adjoint we get:

\[
(u^*_{ij} u_{kl})(u^*_{pq} u_{rs}) = u^*_{ij} (u_{kl} u^*_{pq} u_{rs}) = u^*_{ij} (u_{rs} u^*_{pq} u_{kl}) = (u^*_{ij} u_{rs} u^*_{pq}) u_{kl} = (u^*_{pq} u_{rs} u^*_{ij}) u_{kl} = (u^*_{pq} u_{rs})(u^*_{ij} u_{kl}).
\]

This proves that the generators of $P A^*_u(N)$ commute, which concludes the proof.
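The mechanism behind Proposition 5.7 can be seen in a toy matrix model. The sketch below uses anti-diagonal $2 \times 2$ complex matrices, which satisfy the half-commutation relation (5.6) without commuting with each other. These matrices are an illustrative assumption only: they are not the generators of $A^*_u(N)$ and are not required to be unitary.

```python
def antidiag(p, q):
    """The 2x2 anti-diagonal matrix [[0, p], [q, 0]]."""
    return ((0, p), (q, 0))

def mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adj(a):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(a[j][i].conjugate() for j in range(2)) for i in range(2))

def close(a, b, tol=1e-12):
    """Entrywise comparison up to rounding."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Four arbitrary anti-diagonal matrices playing the role of a, b, c, d.
a = antidiag(1 + 2j, 0.5 - 1j)
b = antidiag(2 - 1j, 3j)
c = antidiag(-1 + 1j, 4 + 0j)
d = antidiag(0.25j, 1 - 3j)

half_commute = close(mul(mul(a, adj(b)), c), mul(mul(c, adj(b)), a))
fully_commute = close(mul(a, b), mul(b, a))
projective_commute = close(
    mul(mul(adj(a), b), mul(adj(c), d)),
    mul(mul(adj(c), d), mul(adj(a), b)),
)
```

Here `half_commute` and `projective_commute` hold while `fully_commute` fails: products of the form $a^* b$ are diagonal in this model, which mirrors the computation in the proof that the elements $u^*_{ij} u_{kl}$ commute.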


In complete analogy with (3.12), if we call $Q^*_n(n')$ the amalgamated free product of $n$ copies of $A^*_u(n')$ over the common Woronowicz $C^*$-subalgebra $C(PU(n'))$, then we have:

Corollary 5.8. $\widetilde{QISO}^+_{\mathbb{R}}(D_F)$ is a quantum subgroup of the free product

\[
\underbrace{C(U(1)) * C(U(1)) * \dots * C(U(1))}_{n+1} * Q^*_n(3) * A_u(n).
\]

The Woronowicz $C^*$-ideal of this CQG defining $\widetilde{QISO}^+_{\mathbb{R}}(D_F)$ is determined by (3.10a) and (3.10b).

As in the complex case, let us denote by $QISO^+_{\mathbb{R}}(D_F)$ the $C^*$-subalgebra of $\widetilde{QISO}^+_{\mathbb{R}}(D_F)$ generated by $\langle \xi \otimes 1, \mathrm{Ad}_{U_{\mathbb{R}}}(a)(\eta \otimes 1) \rangle$, where $a \in B_F$, $\xi, \eta \in H_F$ and $U_{\mathbb{R}}$ is the corepresentation of $\widetilde{QISO}^+_{\mathbb{R}}(D_F)$. An immediate corollary of Prop. 5.7 and Corollary 5.8 is the following.

Corollary 5.9. $QISO^+_{\mathbb{R}}(D_F) = C(U(1)) * C(PU(3))$.

Remark 5.10. Since $\widetilde{QISO}^+_{\mathbb{R}}(D_F)$ is a quantum subgroup of $\widetilde{QISO}^+_J(D_F)$, its coaction still preserves the spectral action.

A detailed study of quantum automorphisms for finite-dimensional real $C^*$-algebras, along the lines of the discussion in this section, will be reported elsewhere.

6. Proof of Proposition 3.4

In this section, we prove the main result, that is, Proposition 3.4. Throughout this section, $(Q, U)$ will denote an object in $C_J$. We start by exploiting the conditions regarding $\gamma_F$ and $J_F$, then we use the conditions regarding $D_F$ and $\mathrm{Ad}_U$ to get a neater expression for $U$ in Lemma 6.2, 6.3 and 6.4 and then using these simplified expressions in the next lemmas, we derive the desired form of $U$ from which we can identify the quantum isometry group. We will use Remark 3.1 in this section without mentioning it. Recall that $B(H_F) = M_2(\mathbb{C}) \otimes M_4(\mathbb{C}) \otimes M_4(\mathbb{C}) \otimes M_n(\mathbb{C})$, where $n$ is the number of generations.

Lemma 6.1. $U \in B(H_F) \otimes Q$ satisfies $(\gamma_F \otimes 1)U = U(\gamma_F \otimes 1)$ and $(J_0 \otimes 1)U = U(J_0 \otimes 1)$ iff

\[
U = \sum_{I,J} (e_{i_1 j_1} \otimes e_{i_2 j_2} \otimes e_{i_3 j_3} \otimes e_{i_4 j_4}) \otimes u_{IJ}
+ \sum_{I,J} (e_{i_1 j_1} \otimes e_{i_2 j_2} \otimes e_{i_3+2, j_3+2} \otimes e_{i_4 j_4}) \otimes (u_{IJ})^*, \tag{6.1}
\]

where the multi-indices $I = (i_1, \dots, i_4)$, $J = (j_1, \dots, j_4)$, etc. run in $\{1, 2\} \times \{1, 2, 3, 4\} \times \{1, 2\} \times \{1, 2, \dots, n\}$.

Proof. The condition $(\gamma_F \otimes 1)U = U(\gamma_F \otimes 1)$ implies that $u_{i_1, j_1, i_2, j_2, i_3, j_3, i_4, j_4} = 0$ unless $i_3, j_3$ are both less than or equal to 2 or both greater than or equal to 3. Using the reduced form of $U$ obtained from this observation, we impose $(J_0 \otimes 1)U = U(J_0 \otimes 1)$ and get $u_{i_1, j_1, i_2, j_2, i_3, j_3, i_4, j_4} = (u_{i_1, j_1, i_2, j_2, i_3 - 2, j_3 - 2, i_4, j_4})^*$ for all $i_3, j_3 \geq 3$, which proves the lemma.


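Let $V_1, V_2, V_3, V_4$ denote the subspaces $(e_{11} \otimes e_{11} \otimes 1 \otimes 1)H$, $(e_{22} \otimes e_{11} \otimes 1 \otimes 1)H$, $(e_{11} \otimes (1 - e_{11}) \otimes 1 \otimes 1)H$, and $(e_{22} \otimes (1 - e_{11}) \otimes 1 \otimes 1)H$ respectively.

Lemma 6.2. If $U$ is of the form (6.1) and commutes with $D_F$, the subspaces $V_i$, $i = 1, 2, 3, 4$ are kept invariant by $U$ and thus (6.1) becomes

\[
U = \sum_{i=1,2} e_{ii} \otimes e_{11} \otimes
\begin{pmatrix}
\alpha^i_{11} & \alpha^i_{12} & 0 & 0 \\
\alpha^i_{21} & \alpha^i_{22} & 0 & 0 \\
0 & 0 & \bar\alpha^i_{11} & \bar\alpha^i_{12} \\
0 & 0 & \bar\alpha^i_{21} & \bar\alpha^i_{22}
\end{pmatrix}
+ \sum_{\substack{i=1,2 \\ j,k=1,2,3}} e_{ii} \otimes e_{j+1,k+1} \otimes
\begin{pmatrix}
\beta^{i,j,k}_{11} & \beta^{i,j,k}_{12} & 0 & 0 \\
\beta^{i,j,k}_{21} & \beta^{i,j,k}_{22} & 0 & 0 \\
0 & 0 & \bar\beta^{i,j,k}_{11} & \bar\beta^{i,j,k}_{12} \\
0 & 0 & \bar\beta^{i,j,k}_{21} & \bar\beta^{i,j,k}_{22}
\end{pmatrix}, \tag{6.2}
\]

where, as in (3.5) we identify $M_4(\mathbb{C}) \otimes M_n(\mathbb{C}) \otimes Q$ with $M_{4n}(Q)$, we call $\alpha^i_{j_1 k_1}$ the $n \times n$ matrix with entries $(\alpha^i_{j_1 k_1})_{j_2 k_2} := u_{JK}$ with $J = (i, 1, j_1, j_2)$ and $K = (i, 1, k_1, k_2)$ and we call $\beta^{i,j_0,k_0}_{j_1 k_1}$ the $n \times n$ matrix with entries $(\beta^{i,j_0,k_0}_{j_1 k_1})_{j_2 k_2} := u_{JK}$ with $J = (i, j_0 + 1, j_1, j_2)$ and $K = (i, k_0 + 1, k_1, k_2)$.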

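Proof. The subspaces $V_i$, $i = 1, 2, 3, 4$ are $D_F$-invariant and correspond to distinct sets of eigenvalues (masses of the generations of $\nu$, $e$, $u$ and $d$ respectively). Since $(D_F \otimes 1)U = U(D_F \otimes 1)$ these four subspaces must be preserved by $U$ and this completes the proof of the lemma.

Lemma 6.3. Let $Q$ be any CQG with $U$ as in (6.1) and satisfying (3.9b). Then each one of the four summands in $B_F = \mathbb{C} \oplus \mathbb{C} \oplus M_2(\mathbb{C}) \oplus M_3(\mathbb{C})$ is a coinvariant subalgebra under the adjoint coaction $\mathrm{Ad}_U(a) = U(a \otimes 1)U^*$ of $Q$.

Proof. We start with the basis element $\langle 0, 1, 0, 0 \rangle$ of the second copy of $\mathbb{C}$. Equation (3.9b) means that

\[
\mathrm{Ad}_U(\langle 0, 1, 0, 0 \rangle) = \langle 1, 0, 0, 0 \rangle \otimes a_{\langle 1,0,0,0 \rangle} + \langle 0, 1, 0, 0 \rangle \otimes a_{\langle 0,1,0,0 \rangle}
+ \sum_{i,j=1,2} \langle 0, 0, e_{ij}, 0 \rangle \otimes a_{\langle 0,0,e_{ij},0 \rangle}
+ \sum_{i,j=1,2,3} \langle 0, 0, 0, e_{ij} \rangle \otimes a_{\langle 0,0,0,e_{ij} \rangle}, \tag{6.3}
\]

where $a_{\langle \cdot \rangle}$ are some elements of $Q$. By (6.2), $U(\langle 0, 1, 0, 0 \rangle \otimes 1)U^*$ has $e_{22}$ in the first position and $e_{jk}$ in the third, with $j, k = 3, 4$. Therefore, $U(\langle 0, 1, 0, 0 \rangle \otimes 1)U^*$ vanishes on the subspaces $(e_{11} \otimes 1 \otimes e_{44} \otimes 1)H_F$, $(1 \otimes 1 \otimes e_{11} \otimes 1)H_F$ and $(1 \otimes (1 - e_{11}) \otimes (e_{22} + e_{33}) \otimes 1)H_F$. Applying (6.3) on these three subspaces and using (3.8) we get respectively:

\[
\begin{aligned}
0 &= (e_{11} \otimes 1 \otimes e_{44} \otimes 1) \otimes a_{\langle 1,0,0,0 \rangle} + 0 + 0 + 0, \\
0 &= 0 + 0 + \sum_{i,j=1,2} \langle 0, 0, e_{ij}, 0 \rangle \otimes a_{\langle 0,0,e_{ij},0 \rangle} + 0, \\
0 &= 0 + 0 + 0 + \sum_{i,j=1,2,3} \langle 0, 0, 0, e_{ij} \rangle \otimes a_{\langle 0,0,0,e_{ij} \rangle} .
\end{aligned}
\]

Therefore $a_{\langle 1,0,0,0 \rangle} = a_{\langle 0,0,e_{ij},0 \rangle} = a_{\langle 0,0,0,e_{ij} \rangle} = 0$ and $\mathrm{Ad}_U(\langle 0, 1, 0, 0 \rangle) \subset \langle 0, 1, 0, 0 \rangle \otimes Q$. The proof for the other three factors is similar. For the rest of the proof, let $\lambda \in \mathbb{C}$, $q \in M_2(\mathbb{C})$, $m \in M_3(\mathbb{C})$ be arbitrary.

$U(\langle 1, 0, 0, 0 \rangle \otimes 1)U^*$ vanishes on the subspaces $(e_{22} \otimes (1 - e_{11}) \otimes e_{44} \otimes 1)H_F$, $(1 \otimes e_{11} \otimes e_{11} \otimes 1)H_F$, and $(1 \otimes (1 - e_{11}) \otimes e_{22} \otimes 1)H_F$, and hence this implies respectively that the coefficients of $\langle 0, \lambda, 0, 0 \rangle$, $\langle 0, 0, q, 0 \rangle$, $\langle 0, 0, 0, m \rangle$ in $\mathrm{Ad}_U(\langle 1, 0, 0, 0 \rangle)$ are zero.

$U(\langle 0, 0, q, 0 \rangle \otimes 1)U^*$ vanishes on the subspaces $(e_{11} \otimes 1 \otimes e_{44} \otimes 1)H_F$, $(e_{22} \otimes 1 \otimes e_{44} \otimes 1)H_F$, and $(1 \otimes (1 - e_{11}) \otimes e_{33} \otimes 1)H_F$, and hence this implies respectively that the coefficients of $\langle \lambda, 0, 0, 0 \rangle$, $\langle 0, \lambda, 0, 0 \rangle$, $\langle 0, 0, 0, m \rangle$ in $\mathrm{Ad}_U(\langle 0, 0, q, 0 \rangle)$ are zero.

Finally, $U(\langle 0, 0, 0, m \rangle \otimes 1)U^*$ vanishes on the subspaces $(e_{11} \otimes e_{11} \otimes e_{44} \otimes 1)H_F$, $(e_{22} \otimes e_{11} \otimes e_{44} \otimes 1)H_F$ and $(1 \otimes e_{11} \otimes e_{11} \otimes 1)H_F$, which implies respectively that the coefficients of $\langle \lambda, 0, 0, 0 \rangle$, $\langle 0, \lambda, 0, 0 \rangle$, $\langle 0, 0, q, 0 \rangle$ in $\mathrm{Ad}_U(\langle 0, 0, 0, m \rangle)$ are zero.

Lemma 6.4. If (3.9b) is satisfied, the matrices $\alpha^i_{j_1 k_1}$ and $\beta^{i,j_0,k_0}_{j_1 k_1}$ in (6.2) are zero for all $j_1 \neq k_1$.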

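Proof. We use Lemma 6.3. Since $\mathrm{Ad}_U(\langle 1, 0, 0, 0 \rangle) \subset \langle 1, 0, 0, 0 \rangle \otimes Q$, it is easy to see that $(e_{ii} \otimes e_{11} \otimes e_{11} \otimes 1 \otimes 1_Q)\, \mathrm{Ad}_U(\langle 1, 0, 0, 0 \rangle)$ equals zero for all $i = 1, 2$. On the other hand, straightforward computation gives

\[
(e_{ii} \otimes e_{11} \otimes e_{11} \otimes 1 \otimes 1_Q)\, \mathrm{Ad}_U(\langle 1, 0, 0, 0 \rangle) = e_{ii} \otimes e_{11} \otimes e_{11} \otimes \alpha^i_{12} (\alpha^i_{12})^* = 0,
\]

from which it follows that $\alpha^i_{12} = 0$ for all $i = 1, 2$. Similarly,

\[
(1 \otimes e_{11} \otimes e_{22} \otimes 1 \otimes 1_Q)\, \mathrm{Ad}_U(\langle 0, 0, e_{ii}, 0 \rangle) = e_{ii} \otimes e_{11} \otimes e_{22} \otimes \alpha^i_{21} (\alpha^i_{21})^* = 0
\]

gives $\alpha^i_{21} = 0$ for all $i = 1, 2$. Finally, applying $\mathrm{Ad}_U(\langle 0, 0, 0, e_{k_0 l_0} \rangle)$ to the projections $1 \otimes 1 \otimes e_{11} \otimes 1$ and $1 \otimes 1 \otimes e_{44} \otimes 1$, we get the conditions

\[
\beta^{i,j_0,k_0}_{12} (\beta^{i,l_0,n_0}_{12})^* = \bar\beta^{i,j_0,k_0}_{21} (\beta^{i,l_0,n_0}_{21})^t = 0
\]

for all $i, j_0, k_0, l_0, n_0$. In particular setting $j_0 = l_0$ and $k_0 = n_0$ we get $\beta^{i,j_0,k_0}_{12} = \beta^{i,j_0,k_0}_{21} = 0$.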

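Now we impose $U(D_F \otimes 1) = (D_F \otimes 1)U$, with $D_F$ as in (3.5), $U$ as in (6.2) and using Lemma 6.4.

Lemma 6.5. Any $U$ of the form (6.2), and with $\alpha^i_{j_1 k_1} = \beta^{i,j_0,k_0}_{j_1 k_1} = 0$ for all $j_1 \neq k_1$, satisfies $U(D_F \otimes 1) = (D_F \otimes 1)U$ if and only if

1. all $\alpha^2_{ss}$ and $\beta^{1,j,k}_{rr}$ are diagonal $n \times n$ matrices,
2. $\alpha^2_{22} = \bar\alpha^2_{11}$, $\beta^{1,j,k}_{11} = \bar\beta^{1,j,k}_{22}$, $\beta^{2,j,r}_{22} = \bar\beta^{2,j,r}_{11}$,
3. $\alpha^1_{11} \Upsilon_\nu = \Upsilon_\nu \alpha^1_{11} = \bar\alpha^1_{22} \Upsilon_\nu = \Upsilon_\nu \bar\alpha^1_{22}$, $\alpha^1_{22} \Upsilon_R = \Upsilon_R \bar\alpha^1_{22}$,
4. $C^* \beta^{2,j,k}_{11} C$ is a diagonal matrix.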


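Proof. The condition $U(D_F \otimes 1) = (D_F \otimes 1)U$ is equivalent to the following sets of equations:

\[
\alpha^1_{11} \Upsilon_\nu = \Upsilon_\nu \bar\alpha^1_{22}, \qquad
\bar\alpha^1_{22} \Upsilon_\nu = \Upsilon_\nu \alpha^1_{11}, \qquad
\alpha^1_{22} \Upsilon_R = \Upsilon_R \bar\alpha^1_{22}, \tag{6.4a}
\]
\[
\alpha^2_{11} \Upsilon_e = \Upsilon_e \bar\alpha^2_{22}, \qquad
\alpha^2_{22} \Upsilon_e = \Upsilon_e \bar\alpha^2_{11}, \qquad
\beta^{1,j,k}_{11} \Upsilon_u = \Upsilon_u \bar\beta^{1,j,k}_{22}, \tag{6.4b}
\]
\[
\bar\beta^{1,j,k}_{22} \Upsilon_u = \Upsilon_u \beta^{1,j,k}_{11}, \qquad
\beta^{2,j,k}_{11} \Upsilon_d = \Upsilon_d \bar\beta^{2,j,k}_{22}, \qquad
\bar\beta^{2,j,k}_{22} \Upsilon_d = \Upsilon_d \beta^{2,j,k}_{11} . \tag{6.4c}
\]

Actually, there are an additional 9 relations that (recalling that $\Upsilon_x$ ($x = e, u, d, \nu$) are positive, $\Upsilon_e$, $\Upsilon_u$ are diagonal and $\Upsilon_R$ is symmetric) turn out to be the “bar” of previous ones and hence they do not give any new information.

From the first two equations in (6.4a), we deduce that $\alpha^1_{11}$ commutes with $\Upsilon_\nu^2$:

\[
\alpha^1_{11} \Upsilon_\nu^2 = \Upsilon_\nu \bar\alpha^1_{22} \Upsilon_\nu = \Upsilon_\nu^2 \alpha^1_{11},
\]

and hence it commutes with its positive square root $\Upsilon_\nu$. The same argument applies to $\bar\alpha^1_{22}$, and the conditions (6.4a) turn out to be equivalent to point 3 of the lemma.

In a similar way from (6.4b) and (6.4c) we deduce that all $\alpha^2_{ss}$ commute with $\Upsilon_e^2$ and all $\beta^{1,j,k}_{rr}$ commute with $\Upsilon_u^2$. Since $\Upsilon_x$ ($x = e, u$) are diagonal with distinct eigenvalues, we deduce that all $\alpha^2_{ss}$ and $\beta^{1,j,k}_{rr}$ must be diagonal $n \times n$ matrices. This proves 1.

As all $\alpha^2_{ss}$ and $\beta^{1,j,k}_{rr}$ are diagonal, (6.4b) implies that $\alpha^2_{11} = \bar\alpha^2_{22}$ and $\beta^{1,j,k}_{11} = \bar\beta^{1,j,k}_{22}$, where we have used that $\Upsilon_e$ and $\Upsilon_u$ are diagonal invertible matrices. Thus the first two equations of 2 are proved. The second and third equation of (6.4c) imply respectively

\[
\beta^{2,j,k}_{22} = (\Upsilon_d^t)^{-1} \bar\beta^{2,j,k}_{11} \Upsilon_d^t, \qquad
\beta^{2,j,k}_{22} = \Upsilon_d^t \bar\beta^{2,j,k}_{11} (\Upsilon_d^t)^{-1} . \tag{6.5}
\]

These two equations taken together mean

\[
(\Upsilon_d^t)^{-1} \bar\beta^{2,j,k}_{11} \Upsilon_d^t = \Upsilon_d^t \bar\beta^{2,j,k}_{11} (\Upsilon_d^t)^{-1} .
\]

Thus, $\beta^{2,j,k}_{11} = \Upsilon_d \Upsilon_d^* \beta^{2,j,k}_{11} (\Upsilon_d \Upsilon_d^*)^{-1}$. But, $\Upsilon_d \Upsilon_d^* = C \delta_\downarrow C^* C \delta_\downarrow^* C^* = C \delta_\downarrow \delta_\downarrow^* C^*$. Thus, $C^* \beta^{2,j,k}_{11} C$ commutes with $\delta_\downarrow \delta_\downarrow^*$. As the latter is a diagonal matrix with distinct eigenvalues, $C^* \beta^{2,j,k}_{11} C$ is a diagonal matrix. The fact that $C^* \beta^{2,j,k}_{11} C$ is diagonal implies that $(\Upsilon_d^t)^{-1} \bar\beta^{2,j,k}_{11} \Upsilon_d^t = \Upsilon_d^t \bar\beta^{2,j,k}_{11} (\Upsilon_d^t)^{-1}$. Thus, (6.5) is equivalent to 4.

The equation remaining to be proved is the third equation of 2 which follows by part 4 and (6.5). Indeed, by (6.5),

\[
\beta^{2,j,r}_{22} = \Upsilon_d^t \bar\beta^{2,j,r}_{11} (\Upsilon_d^t)^{-1} = \bar C \delta_\downarrow C^t \bar\beta^{2,j,r}_{11} \bar C (\delta_\downarrow)^{-1} C^t .
\]

As $C^* \beta^{2,j,r}_{11} C$ is diagonal by part 4, $C^t \bar\beta^{2,j,r}_{11} \bar C$ is also diagonal and hence it commutes with $\delta_\downarrow$ and thus we get $\beta^{2,j,r}_{22} = \bar\beta^{2,j,r}_{11}$.

Conversely, if 1 – 4 of this lemma are satisfied, then it can be easily verified that (6.4a), (6.4b) and (6.4c) are satisfied and hence $U$ commutes with $D_F$.

In view of Lemma 6.5, we define elements $x_k$ and $3 \times 3$ matrices $T_m$ by

\[
\alpha^2_{11} = \sum_{k=1}^{n} e_{kk} \otimes x_k, \qquad
\beta^{1,j,k}_{11} = \sum_{m=1}^{n} e_{mm} \otimes (T_m)_{j,k} .
\]

Hence, by part 2 of Lemma 6.5,

\[
\beta^{1,j,k}_{22} = \sum_{m=1}^{n} e_{mm} \otimes (\bar T_m)_{j,k} .
\]

Moreover, let

\[
X(s, m) = \sum_{i,j} e_{ij} \otimes (\beta^{2,i,j}_{11})_{s,m} .
\]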
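The proof of Lemma 6.5 repeatedly uses the fact that a matrix commuting with a diagonal matrix with distinct eigenvalues must itself be diagonal. The identity behind it is $(AD - DA)_{ij} = a_{ij}(d_j - d_i)$. A minimal sketch with scalar entries follows; in the text the entries lie in $Q$, so this is only the commutative shadow of the argument, and the function names are hypothetical.

```python
def commutator_with_diagonal(a, d):
    """Entries of A*D - D*A, where D = diag(d) and A is a square matrix."""
    n = len(d)
    return [[a[i][j] * (d[j] - d[i]) for j in range(n)] for i in range(n)]

def commutes_with_diagonal(a, d, tol=1e-12):
    """True when A commutes with diag(d) up to rounding."""
    comm = commutator_with_diagonal(a, d)
    return all(abs(x) < tol for row in comm for x in row)

distinct = [1.0, 2.0, 3.5]     # distinct eigenvalues, like the masses in Upsilon_e
degenerate = [1.0, 1.0, 3.5]   # a repeated eigenvalue

diagonal_a = [[2, 0, 0], [0, 5, 0], [0, 0, 7]]
mixing_a = [[2, 1, 0], [1, 5, 0], [0, 0, 7]]  # mixes the first two basis vectors
```

With `distinct` only `diagonal_a` commutes, whereas with `degenerate` the matrix `mixing_a` commutes as well. This is why the hypothesis of distinct eigenvalues is needed to conclude that $\alpha^2_{ss}$ and $\beta^{1,j,k}_{rr}$ are diagonal.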

Lemma 6.6. If $U$ is a unitary corepresentation satisfying the hypothesis of Lemma 6.5, then the matrices $\alpha^i_{rr}$, $T_m$ and $X(m, m)$ are biunitaries. In particular, $\{x_1, x_2, \dots, x_n\}$ are unitary elements.

Proof. The condition $UU^* = 1 \otimes 1$ implies that for $r = 1, 2$,

\[
\alpha^i_{rr} (\alpha^i_{rr})^* = \bar\alpha^i_{rr} (\bar\alpha^i_{rr})^* = 1, \qquad
\sum_k \beta^{i,j,k}_{rr} (\beta^{i,l,k}_{rr})^* = \sum_k \bar\beta^{i,j,k}_{rr} (\bar\beta^{i,l,k}_{rr})^* = \delta_{jl} .
\]

Similarly, from $U^*U = 1 \otimes 1$ we get the relations

\[
(\alpha^i_{rr})^* \alpha^i_{rr} = (\bar\alpha^i_{rr})^* \bar\alpha^i_{rr} = 1, \qquad
\sum_k (\beta^{i,l,k}_{rr})^* \beta^{i,j,k}_{rr} = \sum_k (\bar\beta^{i,l,k}_{rr})^* \bar\beta^{i,j,k}_{rr} = \delta_{jl} .
\]

Thus, the matrices $\alpha^i_{rr}$, $T_m$ and $X(m, m)$ are biunitaries.

We note that in Lemma 6.4 we provide a necessary condition for (3.9b). The next lemma gives conditions that are necessary and sufficient.

Lemma 6.7. Assume $U$ satisfies the hypothesis of Lemma 6.5 and 6.6. The condition (3.9b) is satisfied, i.e. the coaction $\mathrm{Ad}_U$ preserves the subalgebra $B_F$, iff there exists a unitary $x_0$ such that

\[
\alpha^1_{11} = \mathrm{diag}(x_0 x_1, \dots, x_0 x_n), \qquad \alpha^2_{22} = \mathrm{diag}(x_1^*, \dots, x_n^*), \tag{6.6a}
\]
\[
\beta^{2,i,j}_{11} = \mathrm{diag}\big( x_0^* (T_1)_{i,j}, \dots, x_0^* (T_n)_{i,j} \big), \tag{6.6b}
\]
\[
\sum_{m=1}^{n} C_{rm} \bar C_{sm} (T_m)_{j,k} = 0 \qquad \forall\, r \neq s \quad (r, s = 1, \dots, n;\ j, k = 1, 2, 3), \tag{6.6c}
\]
\[
(T_m^*)_{i,j} (T_m)_{k,l} = (T_{m'}^*)_{i,j} (T_{m'})_{k,l} \qquad \forall\, m, m' \quad (i, j, k, l = 1, 2, 3;\ m, m' = 1, \dots, n), \tag{6.6d}
\]

and the adjoint coaction is

\[
\mathrm{Ad}_U(\langle 0, 0, e_{12}, 0 \rangle) = \langle 0, 0, e_{12}, 0 \rangle \otimes x_0, \tag{6.7a}
\]
\[
\mathrm{Ad}_U(\langle 0, 0, e_{21}, 0 \rangle) = \langle 0, 0, e_{21}, 0 \rangle \otimes x_0^*, \tag{6.7b}
\]
\[
\mathrm{Ad}_U(\langle 0, 0, 0, e_{ij} \rangle) = \sum_{k,l} \langle 0, 0, 0, e_{kl} \rangle \otimes ((T_1)_{k,i})^* (T_1)_{l,j} . \tag{6.7c}
\]

Moreover, $\langle 1, 0, 0, 0 \rangle$, $\langle 0, 1, 0, 0 \rangle$ and $\langle 0, 0, e_{ii}, 0 \rangle$ ($i = 1, 2$) are coinvariant.

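Proof. We use the notations of the previous lemmas. The coinvariance of $\langle 1,0,0,0\rangle$, $\langle 0,1,0,0\rangle$ and $\langle 0,0,e_{ii},0\rangle$ (i = 1, 2) follows automatically from unitarity of U. Since

$$\mathrm{Ad}_U(\langle 0,0,e_{12},0\rangle) = e_{12}\otimes e_{11}\otimes e_{11}\otimes \alpha^1_{11}(\alpha^2_{11})^* + \sum_{ijk} e_{12}\otimes e_{i+1,k+1}\otimes e_{11}\otimes \beta^{1,i,j}_{11}(\beta^{2,k,j}_{11})^*,$$

condition (3.9b) implies that there exists $x_0 \in Q$ such that

$$\alpha^1_{11}(\alpha^2_{11})^* = \sum_{i=1}^n e_{ii}\otimes x_0, \qquad \sum_j \beta^{1,i,j}_{11}(\beta^{2,k,j}_{11})^* = \delta_{i,k}\Big(\sum_{i=1}^n e_{ii}\otimes x_0\Big). \tag{6.8}$$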

Unitarity of $\alpha^i_{rr}$ implies unitarity of $x_0$. Moreover, we have $\alpha^1_{11} = \mathrm{diag}(x_0x_1, \ldots, x_0x_n)$. Using the relation $\alpha^2_{11} = \bar\alpha^2_{22}$ in Lemma 6.5, we deduce that $\alpha^2_{22} = \sum_{k=1}^n e_{kk}\otimes x_k^*$. We get $\mathrm{Ad}_U(\langle 0,0,e_{12},0\rangle) = \langle 0,0,e_{12},0\rangle \otimes x_0$ and $\mathrm{Ad}_U(\langle 0,0,e_{21},0\rangle) = \langle 0,0,e_{21},0\rangle \otimes x_0^*$. From the second equation of (6.8), we deduce that $\sum_j (T_m)_{i,j}\big((\beta^{2,k,j}_{11})^*\big)_{sm} = \delta_{m,s}\delta_{i,k}x_0$. Thus, $\sum_j (T_m)_{i,j}(X(s,m)^*)_{k,j} = \delta_{m,s}\delta_{i,k}x_0$, and in particular $T_mX(s,m)^* = \delta_{m,s}\,\mathrm{diag}(x_0,x_0,x_0)$, which implies

$X(s,m) = 0$ if $s \neq m$ and $X(s,s) = \mathrm{diag}(x_0^*,x_0^*,x_0^*)\,T_s$, which translates into

$$\beta^{2,i,j}_{11} = \mathrm{diag}\big(x_0^*(T_1)_{i,j}, \ldots, x_0^*(T_n)_{i,j}\big). \tag{6.9}$$

Moreover, as $C^*\beta^{2}_{jk}C$ is diagonal from 4 of Lemma 6.5, we get for all j, k = 1, 2, 3,

$$\sum_m C_{rm}\overline{C}_{sm}(T_m)_{j,k} = 0 \quad \text{if } r \neq s.$$

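Now we compute

$$\mathrm{Ad}_U(\langle 0,0,0,e_{rs}\rangle) = \sum_{j,a,c} e_{11}\otimes e_{j+1,c+1}\otimes (e_{22}+e_{33})\otimes e_{aa}\otimes ((T_a)_{j,r})^*(T_a)_{c,s} + \sum_{j,a,c,p,b} e_{22}\otimes e_{j+1,c+1}\otimes (e_{22}+e_{33})\otimes e_{ap}\otimes (\beta^{2,j,r}_{22})_{a,b}\,(\beta^{2,c,s}_{22})^*_{p,b}.$$

Coinvariance of $M_3(\mathbb{C})$ gives the following relations for all j, r, c, s:

$$\sum_b (\beta^{2,j,r}_{22})_{a,b}\,(\beta^{2,c,s}_{22})^*_{p,b} \ \text{is 0 unless } a = p, \tag{6.10a}$$

$$\sum_b (\bar\beta^{2,j,r}_{11})_{a,b}\,(\beta^{2,c,s}_{11})_{p,b} \ \text{is 0 unless } a = p, \tag{6.10b}$$

$$(T_a)^*_{j,r}(T_a)_{c,s} = (T_b)^*_{j,r}(T_b)_{c,s} \quad \forall\, a, b, \tag{6.10c}$$

$$(T_a)^*_{j,r}(T_a)_{c,s} = \sum_b (\beta^{2,j,r}_{22})_{a,b}\,(\beta^{2,c,s}_{22})^*_{a,b} \quad \forall\, a, \tag{6.10d}$$

$$\sum_b (\beta^{2,j,r}_{22})_{a,b}\,(\beta^{2,c,s}_{22})^*_{a,b} = \sum_b (\bar\beta^{2,j,r}_{11})_{a',b}\,(\beta^{2,c,s}_{11})_{a',b} \quad \forall\, a, a'. \tag{6.10e}$$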


However, it turns out that (6.10c) is the only new information. Indeed, (6.10a) and (6.10b) are consequences of the facts that $\beta^{2,i,j}_{11}$ is diagonal (see (6.9)) and $\beta^{2,j,r}_{22} = \beta^{2,j,r}_{11}$ (part 2 of Lemma 6.5). Equation (6.10d) follows again from (6.9). Finally (6.10e) follows from (6.10c) and (6.10d) taken together. Equations (6.10a)–(6.10e) show that $\mathrm{Ad}_U(\langle 0,0,0,e_{rs}\rangle)$ is given by (6.7c). This completes the proof.

We are now in the position to prove Proposition 3.4, i.e. that the universal object in the category $C_J$ is the CQG given in Lemma 3.3 with corepresentation U as in (3.13).

Proof of Proposition 3.4. The proof is in two steps: 1. we need to prove that the CQG in Lemma 3.3 with corepresentation (3.13) is an object of the category $C_J$, and 2. we need to prove that this object is universal.

1. First we notice that the operator U in (3.13) is indeed a unitary corepresentation: the unitaries/biunitaries $x_k$, $x_0x_k$, $T_m$, $x_0^*T_m$, $V$ and their "bar" define unitary corepresentations due to (3.11), and they coact on orthogonal subspaces of $H_F$ in (3.13), so that U is an orthogonal direct sum of unitary corepresentations. Since our U is of the form (6.1), by Lemma 6.1 it satisfies the compatibility conditions with $\gamma_F$ and $J_F$. Since U is of the form (6.2), with parameters

$$\alpha^1_{11} := \sum_{k=1}^n e_{kk}\otimes x_0x_k, \qquad \alpha^1_{22} := \sum_{i,j=1}^n e_{ij}\otimes V_{ij}, \qquad \alpha^2_{11} := (\alpha^2_{22})^* := \sum_{k=1}^n e_{kk}\otimes x_k,$$

$$\beta^{1,j,k}_{22} := \sum_{m=1}^n e_{mm}\otimes (T_m)_{j,k}, \qquad \beta^{1,j,k}_{11} := (\beta^{1,j,k}_{22})^*, \qquad \beta^{2,j,k}_{11} := \sum_{m=1}^n e_{mm}\otimes x_0^*(T_m)_{j,k},$$

$$\alpha^i_{j_1,k_1} := \beta^{i,j_0,k_0}_{j_1,k_1} := 0 \quad \text{if } j_1 \neq k_1,$$

satisfying 1–4 of Lemma 6.5, it follows by Lemmas 6.2 and 6.5 that U commutes with $D_F$. Since the parameters defined above satisfy the conditions in Lemma 6.7 too, the adjoint coaction preserves $B_F$, and then our CQG with corepresentation (3.13) is an object of the category $C_J$.

2. Now we pass to universality. From Lemmas 6.1–6.7 it follows that any object (Q, U) in the category $C_J$ must be generated by the matrix entries of a corepresentation U of the form (3.13), with matrix entries satisfying (3.10). In particular (3.10b) coincides with (6.6c), (3.10c) coincides with (6.6d), and (3.10a) coincides with point 3 of Lemma 6.5 after the parameter substitution in (6.6a)–(6.6b) and after renaming $V$ the matrix $\alpha^1_{22}$. Different summands in (3.13) coact on orthogonal subspaces of $H_F$: hence from unitarity of U we deduce the unitarity of the $x_k$'s and of the matrices $T_m$, $V$ and their "bar", i.e. they must be biunitary. This proves that any object in the category is a quotient of the CQG in Prop. 3.4.

Acknowledgements. We would like to thank Prof. J.W. Barrett and Prof. A.H. Chamseddine for useful conversations and comments. L.D. was partially supported by GSQS 230836 (IRSES, EU) and PRIN 2008 (MIUR, Italy).

Communicated by A. Connes

Commun. Math. Phys. 307, 133–156 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1291-0

Communications in

Mathematical Physics

Suitable Solutions for the Navier–Stokes Problem with an Homogeneous Initial Value

Pierre Gilles Lemarié–Rieusset, Frédéric Lelièvre

Équipe Analyse et Probabilités, EA2172, Université d’Evry Val d’Essonne, 91025 Evry cedex, France. E-mail: [email protected]

Received: 22 November 2010 / Accepted: 16 January 2011
Published online: 26 June 2011 – © Springer-Verlag 2011

Abstract: This paper is devoted to the study of strong or weak solutions of the Navier–Stokes equations in the case of homogeneous initial data. The case of small initial data is discussed. For large initial data, an approximation is developed, in the spirit of a paper of Vishik and Fursikov. Qualitative convergence is obtained by use of the theory of Muckenhoupt weights.

1. Introduction

In this paper, we shall study the Cauchy problem for the incompressible 3D Navier–Stokes equations with no boundary and no external force

$$\begin{cases} \partial_t \vec u + \vec u\cdot\vec\nabla\,\vec u = \Delta \vec u - \vec\nabla p, \\ \vec u|_{t=0} = \vec u_0, \\ \vec\nabla\cdot\vec u = 0, \end{cases} \tag{1}$$

where $\vec u(t,x)$ is a time-dependent divergence-free vector field on $\mathbb{R}^3$ ($t > 0$, $x \in \mathbb{R}^3$) and where the initial value $\vec u_0$ is homogeneous: for $\lambda > 0$,

$$\lambda\,\vec u_0(\lambda x) = \vec u_0(x). \tag{2}$$
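The homogeneity condition (2) is chosen to match the scaling of (1). As a short check, written as a sketch in which the rescaled pressure $p_\lambda$ is notation assumed here (the text does not name it), every term of the first line of (1) picks up the same factor $\lambda^3$:

```latex
% Scaling check for (1). p_\lambda is notation assumed for this sketch.
\[
\vec u_\lambda(t,x) = \lambda\,\vec u(\lambda^2 t,\lambda x), \qquad
p_\lambda(t,x) = \lambda^2\,p(\lambda^2 t,\lambda x).
\]
\begin{align*}
\partial_t \vec u_\lambda(t,x) &= \lambda^3\,(\partial_t \vec u)(\lambda^2 t,\lambda x),\\
(\vec u_\lambda\cdot\vec\nabla)\vec u_\lambda(t,x) &= \lambda^3\,\big((\vec u\cdot\vec\nabla)\vec u\big)(\lambda^2 t,\lambda x),\\
\Delta \vec u_\lambda(t,x) &= \lambda^3\,(\Delta \vec u)(\lambda^2 t,\lambda x),\\
\vec\nabla p_\lambda(t,x) &= \lambda^3\,(\vec\nabla p)(\lambda^2 t,\lambda x).
\end{align*}
% Moreover \vec\nabla\cdot\vec u_\lambda(t,x) = \lambda^2 (\vec\nabla\cdot\vec u)(\lambda^2 t,\lambda x) = 0,
% and \vec u_\lambda(0,x) = \lambda\,\vec u_0(\lambda x), which equals \vec u_0(x)
% exactly when (2) holds.
```

So a homogeneous initial value is left unchanged by the rescaling, which is why self–similar solutions are the natural candidates.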

The homogeneity condition on $\vec u_0$ fits the scaling property of the Navier–Stokes equations: if $\vec u$ is a solution of the Cauchy problem with initial value $\vec u_0$, then $\vec u_\lambda$ defined by $\vec u_\lambda(t,x) = \lambda\,\vec u(\lambda^2 t, \lambda x)$ (where $\lambda > 0$) is a solution of the Cauchy problem with initial value $\lambda\,\vec u_0(\lambda x)$. Of course, we aim to exhibit self–similar solutions ($\lambda\,\vec u(\lambda^2 t, \lambda x) = \vec u(t,x)$), but up to now this can be done only for small initial values. For such small values, the formalism of mild solutions [KAT 84, CAN 95], based on Banach’s contraction principle, provides solutions together with a uniqueness property which guarantees self–similarity. When


we deal with large initial values, the formalism of mild solutions breaks down and we can only exhibit weak solutions, through a compactness argument based on some energy estimates. For such solutions, we have no uniqueness, so that we cannot conclude that they are self–similar. Those energy estimates cannot be a direct consequence of Leray’s theory [LER 34], since, when $\vec u_0 \neq 0$, homogeneity implies that $\|\vec u_0\|_2 = +\infty$. We thus have to replace Leray’s energy inequality by Scheffer’s local energy inequality [SCH 77]. We shall describe some consequences of this inequality, through the use of Caffarelli, Kohn and Nirenberg’s regularity criterion [CAF 82]. Weak solutions are usually obtained through mollification [LEM 02] or truncation [LEM 99, BAS 06]. In the last sections, we shall describe another approximation of the Navier–Stokes equations which provides suitable solutions and preserves the scaling property of the equations. Those approximations are modifications of a model studied by Vishik and Fursikov [VIS 77].

2. Suitable Solutions

In this section, we recall previous results from [LEM 99, LEM 02 and LEM 07] on weak solutions for the Navier–Stokes equations.

In Leray’s theory [LER 34], a weak solution of Eqs. (1) is a solution $\vec u \in L^\infty_t L^2_x \cap L^2_t \dot H^1_x$ defined on $(0,+\infty)\times\mathbb{R}^3$ which satisfies the energy inequality

$$\|\vec u(t,.)\|_2^2 + 2\int_0^t \|\vec\nabla\otimes\vec u\|_2^2\,ds \le \|\vec u_0\|_2^2, \tag{3}$$

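where the initial value $\vec u_0$ is a square-integrable divergence-free vector field. In that case, the pressure $p(t,x)$ belongs to $L^2_t L^{3/2}_x$ and can be recovered from $\vec u$ by the formula

$$p = -\sum_{i=1}^3\sum_{j=1}^3 \frac{1}{\Delta}\partial_i\partial_j(u_iu_j). \tag{4}$$

In particular, a Leray solution $\vec u$ belongs to $L^{8/3}_t L^4_x$. If we assume more regularity on the solution $\vec u$ ($\vec u \in L^4_t L^4_x$), then the inequality (3) becomes an equality. Indeed, in that case we have $p \in L^2_t L^2_x$ and thus $\partial_t \vec u \in L^2_t \dot H^{-1}_x$ (rewriting $\vec u\cdot\vec\nabla\,\vec u$ as $\vec\nabla\cdot(\vec u\otimes\vec u)$). Thus,

$$\frac{d}{dt}\|\vec u(t,.)\|_2^2 = 2\langle \partial_t\vec u(t,.)\,|\,\vec u(t,.)\rangle_{\dot H^{-1},\dot H^1} = -2\int |\vec\nabla\otimes\vec u|^2\,dx - 2\int \vec u\cdot(\vec u\cdot\vec\nabla)\vec u\,dx - 2\int \vec u\cdot\vec\nabla p\,dx, \tag{5}$$

where

$$\int \vec u\cdot\vec\nabla p\,dx = -\int p\,\vec\nabla\cdot\vec u\,dx = 0 \tag{6}$$

and

$$\int \vec u\cdot(\vec u\cdot\vec\nabla)\vec u\,dx = -\int \vec u\cdot(\vec u\cdot\vec\nabla)\vec u\,dx - \int |\vec u|^2\,\vec\nabla\cdot\vec u\,dx = -\frac{1}{2}\int |\vec u|^2\,\vec\nabla\cdot\vec u\,dx = 0, \tag{7}$$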


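since $\vec u$ is divergence free. The integrations by parts involved in (5), (6) and (7) may be clarified by describing the distribution $\partial_t|\vec u|^2 + 2|\vec\nabla\otimes\vec u|^2$ as a divergence:

$$\partial_t|\vec u|^2 + 2|\vec\nabla\otimes\vec u|^2 = \sum_{i=1}^3 \partial_i\big(2\,\vec u\cdot\partial_i\vec u - (|\vec u|^2 + 2p)\,u_i\big) + R, \tag{8}$$

where

$$R = (|\vec u|^2 + 2p)\,\vec\nabla\cdot\vec u = 0. \tag{9}$$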
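The identity (8) can be checked term by term. The following is a formal sketch, assuming $\vec u$ and $p$ are smooth enough for the product rule to apply; it multiplies the first line of (1) by $2\vec u$ and rewrites each product as a divergence plus a remainder:

```latex
% Formal derivation of (8)-(9), assuming \vec u and p are smooth.
\begin{align*}
\partial_t|\vec u|^2 &= 2\,\vec u\cdot\partial_t\vec u
  = 2\,\vec u\cdot\Delta\vec u - 2\,\vec u\cdot(\vec u\cdot\vec\nabla)\vec u
    - 2\,\vec u\cdot\vec\nabla p,\\
2\,\vec u\cdot\Delta\vec u &= \sum_{i=1}^3 \partial_i\big(2\,\vec u\cdot\partial_i\vec u\big)
  - 2\,|\vec\nabla\otimes\vec u|^2,\\
-2\,\vec u\cdot(\vec u\cdot\vec\nabla)\vec u &= -\sum_{i=1}^3 \partial_i\big(|\vec u|^2 u_i\big)
  + |\vec u|^2\,\vec\nabla\cdot\vec u,\\
-2\,\vec u\cdot\vec\nabla p &= -\sum_{i=1}^3 \partial_i\big(2p\,u_i\big)
  + 2p\,\vec\nabla\cdot\vec u.
\end{align*}
% Adding the last three lines and moving 2|\vec\nabla\otimes\vec u|^2 to the left
% gives (8), with R = (|\vec u|^2 + 2p)\,\vec\nabla\cdot\vec u as in (9).
```

The remainder vanishes only because $\vec\nabla\cdot\vec u = 0$; this is the term that reappears when the divergence-free constraint is relaxed.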

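When $\vec u$ is a Leray solution but does not belong to $L^4_t L^4_x$, we cannot write $\partial_t|\vec u|^2 = 2\,\partial_t\vec u\cdot\vec u$. The energy equality is not fulfilled (or, at least, is not known to be fulfilled). A usual way to exhibit Leray solutions consists in mollifying the nonlinearity: we take a nonnegative $\omega \in \mathcal{D}(\mathbb{R}^3)$ such that $\int \omega\,dx = 1$, we define, for $\epsilon > 0$, $\omega_\epsilon(x) = \epsilon^{-3}\omega(\epsilon^{-1}x)$ and we change Eqs. (1) into

$$\begin{cases} \partial_t \vec u_\epsilon + (\omega_\epsilon * \vec u_\epsilon)\cdot\vec\nabla\,\vec u_\epsilon = \Delta \vec u_\epsilon - \vec\nabla p_\epsilon, \\ \vec u_\epsilon|_{t=0} = \vec u_0, \\ \vec\nabla\cdot\vec u_\epsilon = 0, \end{cases} \tag{10}$$

and we find a solution $\vec u_\epsilon \in L^\infty_t L^2_x \cap L^2_t \dot H^1_x$ such that $\partial_t \vec u_\epsilon \in L^2_t \dot H^{-1}_x$. We get that

$$\partial_t|\vec u_\epsilon|^2 + 2|\vec\nabla\otimes\vec u_\epsilon|^2 = \sum_{i=1}^3 \partial_i\big(2\,\vec u_\epsilon\cdot\partial_i\vec u_\epsilon - |\vec u_\epsilon|^2\,\omega_\epsilon * u_{\epsilon,i} - 2p_\epsilon\,u_{\epsilon,i}\big) + R_\epsilon, \tag{11}$$

where

$$R_\epsilon = |\vec u_\epsilon|^2\,\omega_\epsilon * (\vec\nabla\cdot\vec u_\epsilon) + 2p_\epsilon\,\vec\nabla\cdot\vec u_\epsilon = 0. \tag{12}$$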

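By a compactness argument based on Rellich’s theorem (for details, we refer to [LEM 02], Chaps. 13 and 14), there is a sequence $\epsilon_k \to 0$ and a distribution $\vec u \in L^\infty_t L^2_x \cap L^2_t \dot H^1_x$ such that $\vec u_{\epsilon_k}$ converges to $\vec u$ weakly in $L^2_t \dot H^1_x$ and strongly in the $L^2$ norm on every compact subset of $(0,+\infty)\times\mathbb{R}^3$. Thus, we have (in $\mathcal{D}'((0,+\infty)\times\mathbb{R}^3)$)

$$\lim_{\epsilon_k\to 0}\ \partial_t\vec u_{\epsilon_k} + (\omega_{\epsilon_k} * \vec u_{\epsilon_k})\cdot\vec\nabla\,\vec u_{\epsilon_k} - \Delta\vec u_{\epsilon_k} + \vec\nabla p_{\epsilon_k} = \partial_t\vec u + \vec u\cdot\vec\nabla\,\vec u - \Delta\vec u + \vec\nabla p = 0, \tag{13}$$

and

$$\lim_{\epsilon_k\to 0}\ \partial_t|\vec u_{\epsilon_k}|^2 + \sum_{i=1}^3 \partial_i\big(-2\,\vec u_{\epsilon_k}\cdot\partial_i\vec u_{\epsilon_k} + |\vec u_{\epsilon_k}|^2\,\omega_{\epsilon_k} * u_{\epsilon_k,i} + 2p_{\epsilon_k}u_{\epsilon_k,i}\big) = \partial_t|\vec u|^2 + \sum_{i=1}^3 \partial_i\big(-2\,\vec u\cdot\partial_i\vec u + |\vec u|^2u_i + 2p\,u_i\big). \tag{14}$$

However, there is no reason that $|\vec\nabla\otimes\vec u_{\epsilon_k}|^2$ should converge to $|\vec\nabla\otimes\vec u|^2$. The best we can get is

$$\lim_{\epsilon_k\to 0} |\vec\nabla\otimes\vec u_{\epsilon_k}|^2 = |\vec\nabla\otimes\vec u|^2 + \mu, \tag{15}$$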


where $\mu$ is a non-negative distribution on $(0,+\infty)\times\mathbb{R}^3$ (hence a locally finite non-negative measure). This gives

$$\partial_t|\vec u|^2 + 2|\vec\nabla\otimes\vec u|^2 = \Delta|\vec u|^2 - \vec\nabla\cdot\big((|\vec u|^2 + 2p)\,\vec u\big) - \mu. \tag{16}$$

This is Scheffer’s local energy inequality [SCH 77]. In contrast with Leray’s inequality (3), inequality (16) does not require $\vec u$ to be square-integrable. Solutions of the Navier–Stokes equations which satisfy (16) will be called suitable (following [CAF 82]):

Definition 1. A time-dependent divergence-free vector field $\vec u$ defined on $(0,T)\times\mathbb{R}^3$ will be a suitable solution of the Navier–Stokes equations if
i) $\vec u$ belongs locally (in time and space) to $L^\infty_t L^2_x \cap L^2_t \dot H^1_x$,
ii) there exists a distribution $p \in \mathcal{D}'((0,T)\times\mathbb{R}^3)$ such that $\partial_t\vec u + \vec u\cdot\vec\nabla\,\vec u - \Delta\vec u + \vec\nabla p = 0$,
iii) locally in time and space, $\vec u$ belongs to $L^3_t L^3_x$ and $p$ belongs to $L^{3/2}_t L^{3/2}_x$,
iv) $\vec u$ satisfies Scheffer’s inequality: there exists a locally finite non-negative measure $\mu$ on $(0,T)\times\mathbb{R}^3$ such that $\partial_t|\vec u|^2 + 2|\vec\nabla\otimes\vec u|^2 = \Delta|\vec u|^2 - \vec\nabla\cdot((|\vec u|^2 + 2p)\,\vec u) - \mu$.

The main interest of suitable solutions is the regularity criterion of Caffarelli, Kohn and Nirenberg [CAF 82]:

Theorem 1. There exist two constants $\epsilon_0 > 0$ and $C_0 > 0$ such that if $T > 0$, if $x_0 \in \mathbb{R}^3$, if $0 < r^2 < t_0 < T$, if $0 < \epsilon < \epsilon_0$, if $\vec u$ is a suitable solution of the Navier–Stokes equations on $(0,T)\times\mathbb{R}^3$ such that

$$\int_{t_0-r^2}^{t_0}\int_{|x-x_0|<r} |\vec u(t,x)|^3 + |p(t,x)|^{3/2}\,dx\,dt < \epsilon\,r^2 \tag{17}$$

then

$$\sup_{|x-x_0|<r/2,\ t_0-r^2/4<t<t_0} |\vec u(t,x)| < C_0\,\epsilon^{1/3}\,r^{-1}. \tag{18}$$
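The factors $r^2$ in (17) and $r^{-1}$ in (18) are exactly those that make Theorem 1 invariant under the rescaling of the equations. A sketch of the check follows; the parabolic cylinder $Q_r(t_0,x_0) = (t_0 - r^2, t_0)\times B(x_0,r)$ and the rescaled pair $(\vec u_\lambda, p_\lambda)$ are notation assumed here for illustration:

```latex
% Scale invariance of the quantities in (17)-(18).
% Q_r(t_0,x_0), \vec u_\lambda and p_\lambda are notation assumed for this sketch.
\[
\vec u_\lambda(t,x) = \lambda\,\vec u(\lambda^2 t,\lambda x), \qquad
p_\lambda(t,x) = \lambda^2\,p(\lambda^2 t,\lambda x).
\]
% Change of variables s = \lambda^2 t, y = \lambda x, so dx\,dt = \lambda^{-5}\,dy\,ds:
\begin{align*}
\iint_{Q_r(t_0,x_0)} |\vec u_\lambda|^3 + |p_\lambda|^{3/2}\,dx\,dt
 &= \lambda^{3}\,\lambda^{-5} \iint_{Q_{\lambda r}(\lambda^2 t_0,\lambda x_0)}
    |\vec u|^3 + |p|^{3/2}\,dy\,ds,\\
\frac{1}{r^2}\iint_{Q_r(t_0,x_0)} |\vec u_\lambda|^3 + |p_\lambda|^{3/2}\,dx\,dt
 &= \frac{1}{(\lambda r)^2}\iint_{Q_{\lambda r}(\lambda^2 t_0,\lambda x_0)}
    |\vec u|^3 + |p|^{3/2}\,dy\,ds,\\
r\,|\vec u_\lambda(t,x)| &= (\lambda r)\,|\vec u(\lambda^2 t,\lambda x)|.
\end{align*}
```

Hence the hypothesis and the conclusion hold for $\vec u_\lambda$ at scale $r$ if and only if they hold for $\vec u$ at scale $\lambda r$, with the same $\epsilon$ and $C_0$.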

Inequality (16) is a key tool to develop a theory of weak solutions for initial values $\vec u_0$ with infinite energy ($\|\vec u_0\|_2 = +\infty$). In [LEM 99, LEM 02] a theory has been developed to exhibit suitable weak solutions associated to an initial value $\vec u_0$ which is uniformly locally square integrable ($\sup_{x_0\in\mathbb{R}^3}\int_{|x-x_0|\le 1}|\vec u_0(x)|^2\,dx < +\infty$). The basic idea of the proof is to consider the mollified equations (10) and to compute the $L^2_{uloc}$ norm of $\vec u_\epsilon$ as

$$\|\vec u_\epsilon\|_{L^2_{uloc}} = \sup_{x_0\in\mathbb{R}^3}\|\varphi_0(x-x_0)\,\vec u_\epsilon\|_2 \tag{19}$$

for some $\varphi_0 \in \mathcal{D}(\mathbb{R}^3)$ (with $\varphi_0 \neq 0$). In contrast with the finite-energy case, $p_\epsilon$ cannot be computed as $p_\epsilon = -\sum_{i=1}^3\sum_{j=1}^3 \frac{1}{\Delta}\partial_i\partial_j\big((\omega_\epsilon * u_{\epsilon,i})\,u_{\epsilon,j}\big)$ since the kernel of the convolution operator $\frac{1}{\Delta}\partial_i\partial_j$ has slow decay at infinity, hence is not defined on $L^1_{uloc}$. But $p_\epsilon$ is well defined up to a constant additive term, so that $\vec\nabla p_\epsilon$ is well defined: the kernel of $\partial_k\frac{1}{\Delta}\partial_i\partial_j$ has enough decay at infinity to operate on $L^1_{uloc}$. Then formulas (11) and (12) remain true. Carefully integrated against test functions $\varphi(x) = \varphi_0^2(x-x_0)$, they give a control independent of $\epsilon$: we start from the identity

$$\int \varphi(x)|\vec u_\epsilon(t,x)|^2\,dx + 2\int_0^t\!\!\int \varphi(x)|\vec\nabla\otimes\vec u_\epsilon(s,x)|^2\,dx\,ds = \int \varphi(x)|\vec u_0(x)|^2\,dx + I_\epsilon(t) \tag{20}$$

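with

$$I_\epsilon(t) = \int_0^t\!\!\int |\vec u_\epsilon(s,x)|^2\,\Delta\varphi(x)\,dx\,ds + \int_0^t\!\!\int |\vec u_\epsilon|^2\,(\omega_\epsilon * \vec u_\epsilon)\cdot\vec\nabla\varphi\,dx\,ds + 2\int_0^t\!\!\int p_\epsilon\,\vec u_\epsilon\cdot\vec\nabla\varphi\,dx\,ds, \tag{21}$$

and defining

$$\alpha_\epsilon(t) = \sup_{x_0\in\mathbb{R}^3}\int \varphi_0^2(x-x_0)|\vec u_\epsilon(t,x)|^2\,dx \quad\text{and}\quad \beta_\epsilon(t) = \sup_{x_0\in\mathbb{R}^3}\int_0^t\!\!\int \varphi_0^2(x-x_0)|\vec\nabla\otimes\vec u_\epsilon(s,x)|^2\,dx\,ds, \tag{22}$$

we find that

$$I_\epsilon(t) \le C\Big(\int_0^t \alpha_\epsilon(s)\,ds + \Big(\int_0^t \alpha_\epsilon^3(s)\,ds\Big)^{1/4}\Big(\beta_\epsilon(t) + \int_0^t \alpha_\epsilon(s)\,ds\Big)^{3/4}\Big). \tag{23}$$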

In [LEM 02], we show that inequalities (20) and (23) provide a control uniform in $\epsilon$ on a time interval $(0,T)$ with $T = O(\min(1, \|\vec u_0\|^{-2}_{L^2_{uloc}}))$. Then the same compactness argument as in the case of finite-energy initial values allows us to show that:

Theorem 2. Let $\vec u_0 \in (L^2_{uloc}(\mathbb{R}^3))^3$ be such that $\vec\nabla\cdot\vec u_0 = 0$. Then, there exists a positive constant $C_0$ (which does not depend on $\vec u_0$) such that, defining $T_0 = \dfrac{1}{C_0^4\,\sup(1, \|\vec u_0\|^2_{L^2_{uloc}})}$, Eqs. (1) have a suitable solution $\vec u$ on $(0,T_0)\times\mathbb{R}^3$ such that for all $0 < t < T_0$, we have

$$\|\vec u(t,.)\|_{L^2_{uloc}} \le C_0\,\|\vec u_0\|_{L^2_{uloc}}\Big(1 - \frac{t}{T_0}\Big)^{-1/4} \tag{24}$$

and

$$\sup_{x_0\in\mathbb{R}^3}\int_0^t\!\!\int_{|x-x_0|<1} |\vec\nabla\otimes\vec u(s,x)|^2\,dx\,ds \le C_0\,\|\vec u_0\|^2_{L^2_{uloc}}\Big(1 - \frac{t}{T_0}\Big)^{-1/2}. \tag{25}$$

Moreover, the decay rate, when x goes to infinity, may be controlled [LEM 02]:

Theorem 3. Let $\vec u_0 \in (L^2_{uloc}(\mathbb{R}^3))^3$ be such that $\vec\nabla\cdot\vec u_0 = 0$. For some positive $T^*$, let $\vec u$ be a suitable solution for the Navier–Stokes problem on $(0,T^*)\times\mathbb{R}^3$ with initial value $\vec u_0$ (such that $\vec\nabla p$ is given by $\vec\nabla p = -\sum_{i=1}^3\sum_{j=1}^3 \vec\nabla\frac{1}{\Delta}\partial_i\partial_j(u_iu_j)$). Let $\theta \in \mathcal{D}(\mathbb{R}^3)$ be equal to 1 on a neighbourhood of 0. For $R > 0$, let $\chi_R(x) = 1 - \theta(\frac{x}{R})$. Then, for every $T \in (0,T^*)$, there exists a positive constant $C_T$ such that, for all $0 < t < T$ and all $R > 1$, we have

$$\|\chi_R\,\vec u(t,.)\|_{L^2_{uloc}} \le C_T\Big(\|\chi_R\,\vec u_0\|_{L^2_{uloc}} + \frac{1+\ln R}{R}\Big) \tag{26}$$

and

$$\sup_{x_0\in\mathbb{R}^3}\int_0^t\!\!\int_{|x-x_0|<1} \chi_R(x)\,|\vec\nabla\otimes\vec u(s,x)|^2\,dx\,ds \le C_T\Big(\|\chi_R\,\vec u_0\|^2_{L^2_{uloc}} + \frac{1+\ln R}{R}\Big). \tag{27}$$


The constant C T depends only on T , sup0
T 0

x−x0 <1

Now, if we want to study the Navier–Stokes equations with an initial value $\vec u_0$ which is homogeneous (as given by (2)) and uniformly locally square-integrable, this initial value will belong to a Morrey–Campanato space:

Definition 2. For $1 < p \le q < \infty$, the homogeneous Morrey–Campanato space $\dot M^{p,q}(\mathbb{R}^3)$ is defined as the space of locally p-integrable functions f such that

$$\sup_{x_0\in\mathbb{R}^3}\ \sup_{0<R<+\infty} R^{3(1/q-1/p)}\Big(\int_{|x-x_0|<R} |f(x)|^p\,dx\Big)^{1/p} < \infty, \tag{28}$$

or equivalently

$$\sup_{R>0} R^{3/q}\,\|f(Rx)\|_{L^p_{uloc}} < +\infty. \tag{29}$$
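With criterion (29), the claim that a homogeneous, uniformly locally square-integrable initial value lies in a Morrey–Campanato space is a one-line computation. The sketch below takes $p = 2$, $q = 3$ and uses the homogeneity (2) in the form $f(Rx) = R^{-1}f(x)$; the symbol $f$ stands for any component of $\vec u_0$:

```latex
% Homogeneous data of degree -1 in L^2_uloc belong to \dot M^{2,3}; p = 2, q = 3 in (29).
% f denotes a component of \vec u_0, so (2) reads f(Rx) = R^{-1} f(x) for R > 0.
\begin{align*}
R^{3/q}\,\|f(R\,\cdot)\|_{L^2_{uloc}}
  &= R^{3/3}\,\|R^{-1} f\|_{L^2_{uloc}}
   = \|f\|_{L^2_{uloc}} \qquad\text{for every } R > 0,\\
\sup_{R>0} R^{3/q}\,\|f(R\,\cdot)\|_{L^2_{uloc}} &= \|f\|_{L^2_{uloc}} < +\infty .
\end{align*}
```

The quantity in (29) does not depend on $R$ at all in this case, which reflects the fact that $\dot M^{2,3}$ has the same scaling as the initial value problem.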

A direct consequence of (29) is that, when $\vec u_0 \in \dot M^{2,3}$, the proof of Theorem 2 can be adapted to any scale, hence will provide a solution on any time interval $(0,T)$, and finally (through a diagonal extraction process) a global solution [LEM 07]:

Theorem 4. Let $\vec u_0 \in (\dot M^{2,3}(\mathbb{R}^3))^3$ be such that $\vec\nabla\cdot\vec u_0 = 0$. Then, there exists a positive constant $C_0$ (which does not depend on $\vec u_0$) such that, defining $T_0 = \dfrac{1}{C_0^4\,\sup(1, \|\vec u_0\|^2_{\dot M^{2,3}})}$, Eqs. (1) have a suitable solution $\vec u$ on $(0,+\infty)\times\mathbb{R}^3$ such that

$$\sup_{x_0\in\mathbb{R}^3,\ t>0} \sqrt{\frac{T_0}{t}}\int_{|x-x_0|<\sqrt{t/T_0}} |\vec u(t,x)|^2\,dx \le C_0\,\|\vec u_0\|^2_{\dot M^{2,3}} \tag{30}$$

and

$$\sup_{x_0\in\mathbb{R}^3,\ t>0} \sqrt{\frac{T_0}{t}}\int_0^t\!\!\int_{|x-x_0|<\sqrt{t/T_0}} |\vec\nabla\otimes\vec u(s,x)|^2\,dx\,ds \le C_0\,\|\vec u_0\|^2_{\dot M^{2,3}}. \tag{31}$$

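3. Small Solutions

When $\vec u_0$ is small, we have many further results on the solutions of Eqs. (1):

Theorem 5. Let $\vec u_0 \in (\dot M^{2,3}(\mathbb{R}^3))^3$ be such that $\vec\nabla\cdot\vec u_0 = 0$. Then, there exist constants $C_0$, $C_1$ and $\epsilon_0$ (which don’t depend on $\vec u_0$) such that, if $\|\vec u_0\|_{\dot M^{2,3}} < \epsilon_0$, the following assertions are true:

(A) [Existence] Equations (1) have a suitable solution $\vec u$ on $(0,+\infty)\times\mathbb{R}^3$ (with pressure p given by $\vec\nabla p = -\sum_{i=1}^3\sum_{j=1}^3 \vec\nabla\frac{1}{\Delta}\partial_i\partial_j(u_iu_j)$) such that

$$\sup_{x_0\in\mathbb{R}^3,\ t>0,\ t>s>0} \frac{1}{\sqrt t}\int_{|x-x_0|<\sqrt t} |\vec u(s,x)|^2\,dx \le C_0\,\|\vec u_0\|^2_{\dot M^{2,3}} \tag{32}$$

and

$$\sup_{x_0\in\mathbb{R}^3,\ t>0} \frac{1}{\sqrt t}\int_0^t\!\!\int_{|x-x_0|<\sqrt t} |\vec\nabla\otimes\vec u(s,x)|^2\,dx\,ds \le C_0\,\|\vec u_0\|^2_{\dot M^{2,3}}. \tag{33}$$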


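(B) [Uniqueness] If $\vec u$ and $\vec v$ are two suitable solutions of (1) which satisfy (32) and (33) then $\vec u = \vec v$.

(C) [Regularity] The solution $\vec u$ satisfies

$$\sup_{t>0}\sqrt t\,\|\vec u(t,.)\|_\infty + \sup_{t>0}\|\vec u(t,.)\|_{\dot M^{2,3}} \le C_1\,\|\vec u_0\|_{\dot M^{2,3}}. \tag{34}$$

(D) [Convergence] If $\vec u_\epsilon$ is the solution of the mollified equations (10), then $\vec u_\epsilon$ converges to $\vec u$ in $\mathcal{D}'((0,+\infty)\times\mathbb{R}^3)$ as $\epsilon$ goes to 0. If moreover $\vec u_0 \in (E^2)^3$, where

$$f \in E^2 \iff f \in L^2_{uloc} \ \text{and}\ \lim_{x_0\to\infty}\int_{|x-x_0|<1} |f(x)|^2\,dx = 0 \tag{35}$$

then we have, for all $T > 0$,

$$\lim_{\epsilon\to 0}\ \sup_{0<t<T} \|\vec u_\epsilon(t,.) - \vec u(t,.)\|_{L^2_{uloc}} = 0. \tag{36}$$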

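(E) [Self–similarity] If $\vec u_0$ is homogeneous ($\lambda\,\vec u_0(\lambda x) = \vec u_0(x)$ for $\lambda > 0$) then $\vec u$ is self–similar ($\lambda\,\vec u(\lambda^2 t, \lambda x) = \vec u(t,x)$).

Proof. Let $\vec u_0 \in (\dot M^{2,3}(\mathbb{R}^3))^3$ be such that $\vec\nabla\cdot\vec u_0 = 0$. If $\|\vec u_0\|_{\dot M^{2,3}} < 1$, (30) and (31) in Theorem 4 give us (32) and (33). Equations (32) and (33) give us that for some constant $C_2$,

$$\sup_{x_0\in\mathbb{R}^3,\ t>0} \frac{1}{t}\int_0^t\!\!\int_{|x-x_0|<\sqrt t} |\vec u(s,x)|^3\,dx\,ds \le C_2\,\|\vec u_0\|^3_{\dot M^{2,3}}. \tag{37}$$

Moreover Lemma 32.2 in [LEM 02] shows us that, for given $x_0 \in \mathbb{R}^3$ and $t > 0$, we may modify p on $B(x_0,\sqrt t)\times(0,t)$ such that

$$\frac{1}{t}\int_0^t\!\!\int_{|x-x_0|<\sqrt t} |p(s,x)|^{3/2}\,dx\,ds \le C_2\,\|\vec u_0\|^3_{\dot M^{2,3}}. \tag{38}$$

Thus, if $\vec u_0$ is small enough, we are allowed to use Theorem 1 and to find that, for almost every $|x-x_0| < \sqrt t/2$ and almost every $3t/4 < s < t$, we have $|\vec u(s,x)| \le C_1\,\|\vec u_0\|_{\dot M^{2,3}}\frac{1}{\sqrt t}$. This gives that

$$\sup_{0<t} \sqrt t\,\|\vec u(t,.)\|_\infty \le C_1\,\|\vec u_0\|_{\dot M^{2,3}}. \tag{39}$$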
Now, if we want to estimate R1 |x−x0 |

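The following estimates are classical and easy to prove [LEM 02] for $T \in (0,+\infty)$ (with a constant $C_0$ which does not depend on T):

$$\sup_{0<t<T}\|B(\vec f,\vec g)\|_{L^2_{uloc}} + \|B(\vec g,\vec f)\|_{L^2_{uloc}} \le C_0 \sup_{0<t<T}\|\vec f(t,.)\|_{L^2_{uloc}}\ \sup_{0<t<T}\sqrt t\,\|\vec g(t,.)\|_\infty, \tag{41}$$

$$\sup_{0<t<T}\|B(\vec f,\vec g)\|_{\dot M^{2,3}} + \|B(\vec g,\vec f)\|_{\dot M^{2,3}} \le C_0 \sup_{0<t<T}\|\vec f(t,.)\|_{\dot M^{2,3}}\ \sup_{0<t<T}\sqrt t\,\|\vec g(t,.)\|_\infty, \tag{42}$$

and

$$\sup_{0<t<T}\sqrt t\,\|B(\vec f,\vec g)\|_\infty \le C_0\Big(\sup_{0<t<T}\|\vec f(t,.)\|_{\dot M^{2,3}}\ \sup_{0<t<T}\sqrt t\,\|\vec g(t,.)\|_\infty + \sup_{0<t<T}\|\vec g(t,.)\|_{\dot M^{2,3}}\ \sup_{0<t<T}\sqrt t\,\|\vec f(t,.)\|_\infty\Big). \tag{43}$$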

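We go back to $\vec u$ and $\vec v$. We define $\vec w = \vec u - \vec v$. We have

$$\vec u = e^{t\Delta}\vec u_0 - B(\vec u,\vec u) \quad\text{and}\quad \vec v = e^{t\Delta}\vec u_0 - B(\vec v,\vec v), \tag{44}$$

hence

$$\vec w = B(\vec v,\vec v) - B(\vec u,\vec u) = -B(\vec w,\vec v) - B(\vec u,\vec w). \tag{45}$$

Combining (41) and (39) we find that

$$\sup_{0<t<T}\|\vec w(t,.)\|_{L^2_{uloc}} \le 2C_0C_1\,\|\vec u_0\|_{\dot M^{2,3}} \sup_{0<t<T}\|\vec w(t,.)\|_{L^2_{uloc}}, \tag{45}$$

so that $\vec w = 0$ if $2C_0C_1\,\|\vec u_0\|_{\dot M^{2,3}} < 1$.

Uniqueness implies self–similarity when $\vec u_0$ is homogeneous: if $\vec u_0$ is homogeneous and $\vec u$ is a solution of (1) which satisfies (32) and (33), then $\lambda\,\vec u(\lambda^2 t, \lambda x)$ is still a solution of (1) which satisfies (32) and (33); by uniqueness, we get $\vec u(t,x) = \lambda\,\vec u(\lambda^2 t, \lambda x)$.

We now prove point (D). The set $(\vec u_\epsilon)_{\epsilon>0}$ is a relatively compact subset of $(\mathcal{D}'(\mathbb{R}^3))^3$. If $\vec u$ is the limit of some sequence $\vec u_{\epsilon_k}$ with $\epsilon_k \to 0$, then $\vec u$ will be a solution of (1) and satisfy (32) and (33); but such a solution has been seen to be unique. Thus, any sequence $\vec u_{\epsilon_k}$ with $\epsilon_k \to 0$ will converge to the same limit $\vec u$. This limit $\vec u$ belongs to the space $X^3$ where

$$\|f\|_X = \sup_{0<t}\|f(t,.)\|_{\dot M^{2,3}} + \sup_{0<t}\sqrt t\,\|f(t,.)\|_\infty. \tag{46}$$

From (42) and (43), we see that

$$\|B(\vec f,\vec g)\|_X \le 2C_0\,\|\vec f\|_X\,\|\vec g\|_X, \tag{47}$$

so that, if $\|e^{t\Delta}\vec u_0\|_X < \frac{1}{8C_0}$, there is a unique solution $\vec u$ in the ball $\|\vec u\|_X < \frac{1}{4C_0}$ of the fixed–point equation $\vec u = e^{t\Delta}\vec u_0 - B(\vec u,\vec u)$. This solution will belong to $(C((0,+\infty), L^2_{uloc}))^3$ for a general $\vec u_0 \in (\dot M^{2,3})^3$ (small enough to ensure that $\|e^{t\Delta}\vec u_0\|_X < \frac{1}{8C_0}$); when $\vec u_0 \in (E^2)^3$, then $\vec u$ belongs to $(C([0,+\infty), E^2))^3$. Moreover, $\vec u_\epsilon$ is a solution of the fixed–point problem $\vec u_\epsilon = e^{t\Delta}\vec u_0 - B(\omega_\epsilon * \vec u_\epsilon, \vec u_\epsilon)$. Since $\|f * \omega_\epsilon\|_X \le \|f\|_X$, we see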

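that if $\|e^{t\Delta}\vec u_0\|_X \le \delta < \frac{1}{8C_0}$, then $\|\vec u\|_X \le 2\delta$ and $\|\vec u_\epsilon\|_X \le 2\delta$. We define $\vec w_\epsilon = \vec u_\epsilon - \vec u$. Then we have

$$\vec w_\epsilon = B(\vec u,\vec u) - B(\vec u_\epsilon * \omega_\epsilon, \vec u_\epsilon) = -B(\vec u_\epsilon * \omega_\epsilon, \vec w_\epsilon) - B(\vec w_\epsilon * \omega_\epsilon, \vec u) - B(\vec u * \omega_\epsilon - \vec u, \vec u). \tag{48}$$

We use (41) and get

$$\sup_{0<t<T}\|\vec w_\epsilon\|_{L^2_{uloc}} \le 4C_0\delta \sup_{0<t<T}\|\vec w_\epsilon\|_{L^2_{uloc}} + 2C_0\delta \sup_{0<t<T}\|\vec u * \omega_\epsilon - \vec u\|_{L^2_{uloc}}. \tag{49}$$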
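The constants in (49) can be traced term by term from (48) and (41). The sketch below assumes, as is standard for a nonnegative mollifier of integral 1, that convolution with $\omega_\epsilon$ increases neither the $L^\infty$ norm nor the $L^2_{uloc}$ norm, and it writes $\|\cdot\|_{T}$ for $\sup_{0<t<T}\|\cdot\|_{L^2_{uloc}}$ (a shorthand introduced only here):

```latex
% Term-by-term bound of (48) via (41). \|\cdot\|_T is shorthand assumed for this
% sketch: \|f\|_T = \sup_{0<t<T} \|f(t,.)\|_{L^2_{uloc}}.
\begin{align*}
\|B(\vec u_\epsilon * \omega_\epsilon, \vec w_\epsilon)\|_T
  &\le C_0\,\|\vec w_\epsilon\|_T\ \sup_{0<t<T}\sqrt t\,\|\vec u_\epsilon * \omega_\epsilon\|_\infty
   \le C_0\,\|\vec u_\epsilon\|_X\,\|\vec w_\epsilon\|_T
   \le 2C_0\delta\,\|\vec w_\epsilon\|_T,\\
\|B(\vec w_\epsilon * \omega_\epsilon, \vec u)\|_T
  &\le C_0\,\|\vec w_\epsilon * \omega_\epsilon\|_T\ \sup_{0<t<T}\sqrt t\,\|\vec u\|_\infty
   \le C_0\,\|\vec u\|_X\,\|\vec w_\epsilon\|_T
   \le 2C_0\delta\,\|\vec w_\epsilon\|_T,\\
\|B(\vec u * \omega_\epsilon - \vec u, \vec u)\|_T
  &\le C_0\,\|\vec u * \omega_\epsilon - \vec u\|_T\ \sup_{0<t<T}\sqrt t\,\|\vec u\|_\infty
   \le 2C_0\delta\,\|\vec u * \omega_\epsilon - \vec u\|_T .
\end{align*}
% Adding the three lines gives (49). Since \delta < 1/(8C_0), we have 4C_0\delta < 1/2,
% so the first term on the right of (49) can be absorbed:
% \|\vec w_\epsilon\|_T \le 4C_0\delta\,\|\vec u * \omega_\epsilon - \vec u\|_T.
```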

u ∗ ω − If u0 ∈ (E 2 )3 , we have u ∈ (C([0, +∞), E 2 ))3 , hence lim→0 sup00, t>0 R + 0 T0 and

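$$\sup_{x_0\in\mathbb{R}^3,\ t>0} \sqrt{\frac{T_0}{t}}\int_0^t\!\!\int_{|x-x_0|<\sqrt{t/T_0}} |\vec\nabla\otimes\vec u(s,x)|^2\,dx\,ds \le C_0\,\|\vec u_0\|^2_{\dot M^{2,3}}, \tag{51}$$

where $T_0 = \dfrac{1}{C_0^4\,\sup(1, \|\vec u_0\|^2_{\dot M^{2,3}})}$.

(B) [Local spatial boundedness] For every $T > 0$ and every compact subset K of $\mathbb{R}^3$ we have $\vec u \in L^1_t L^\infty_x((0,T)\times K)$.

(C) [Boundedness] If moreover $\vec u_0 \in (E^2)^3$ then for almost every positive t the function $\vec u(t,.)$ belongs to $L^\infty$.

(D) [Homogeneous data] If $\vec u_0$ is homogeneous, then $\vec u_0 \in (E^2)^3$ and solutions $\vec u$ described in point (A) satisfy

$$\sup_{t>0} \frac{1}{\sqrt t}\int_0^t \|\vec u(s,.)\|_\infty\,ds < +\infty \tag{52}$$

and

$$\text{for some } R > 0, \text{ and for every } (t,x) \text{ such that } |x| > R\sqrt t,\quad |\vec u(t,x)| < \frac{1}{\sqrt t}. \tag{53}$$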


(E) [Self–similarity] If $\vec u$ is self–similar ($\lambda\,\vec u(\lambda^2 t, \lambda x) = \vec u(t,x)$), then the profile $\vec u(1,.) = \vec U$ is a bounded function.

Proof. Point (A) is given by Theorem 4. Point (B) is a consequence of point (A): if $\omega_K$ is a function in $\mathcal{D}(\mathbb{R}^3)$ which is equal to 1 on $\{x \in \mathbb{R}^3 \,/\, d(x,K) \le 1\}$, we may estimate $\vec u$ on $(0,T)\times K$ by writing

$$\vec u = e^{t\Delta}\vec u_0 - B(\omega_K\vec u, \vec u) - B((1-\omega_K)\vec u, \vec u), \tag{54}$$

where

$$|e^{t\Delta}\vec u_0| \le C\,\|\vec u_0\|_{\dot M^{2,3}}\frac{1}{\sqrt t} \tag{55}$$

and

$$|B((1-\omega_K)\vec u, \vec u)| \le C\int_0^t\!\!\int_{|x-y|\ge 1} \frac{1}{|x-y|^4}\,|\vec u(s,y)|^2\,dy\,ds \le C\,t \sup_{0<s<t}\|\vec u(s,.)\|^2_{L^2_{uloc}}. \tag{56}$$

In order to control $B(\omega_K\vec u, \vec u)$, we write $K_1$ for the support of $\omega_K$; on $(0,T)\times K_1$, $\vec u$ belongs to $L^2_t \dot H^1_x$ so that $\vec u\otimes\vec u$ belongs to $L^1_t \dot B^{1/2,1}_2$ and thus $\frac{1}{\Delta}\mathbb{P}\vec\nabla\cdot(\omega_K\,\vec u\otimes\vec u)$ belongs to $L^1_t \dot B^{3/2,1}_2$ on $(0,T)\times\mathbb{R}^3$; this gives (see [LEM 02]) that

$$\int_0^T \|B(\omega_K\vec u, \vec u)\|_{\dot B^{3/2,1}_2}\,dt \le C_K\int_0^T\!\!\int_{K_1} |\vec u|^2 + |\vec\nabla\otimes\vec u|^2\,dx\,dt, \tag{57}$$

and we have proved (B), since $\dot B^{3/2,1}_2 \subset L^\infty$.

(C) is then a consequence of Theorems 3 and 1 (same proof as inequality (34) in Theorem 5).

Let $f \in L^2_{uloc}$; then f is homogeneous (of homogeneity exponent −1) if and only if $f(x) = F(\frac{x}{|x|})\frac{1}{|x|}$, where $F \in L^2(S^2)$ (see [LEM 02]). Moreover, we have

$$\int_{|x-x_0|<1} \Big|F\Big(\frac{x}{|x|}\Big)\Big|^2\frac{1}{|x|^2}\,dx \le C\int_{|\sigma - \frac{x_0}{|x_0|}| \le C\frac{1}{|x_0|}} |F(\sigma)|^2\,d\sigma \tag{58}$$

so that $f \in E^2$. Thus, we may apply point (C) and find that for some $R > 0$ we have $|\vec u(t,x)| \le 1$ for $1/2 < t < 1$ and $|x| > R$. The value of R depends only on $\vec u_0$ and not on the specific solution $\vec u$. Thus, since $\lambda\,\vec u(\lambda^2 t, \lambda x)$ is a solution as well which satisfies (50) and (51), we find that $|\vec u(t,x)| \le \lambda^{-1}$ for $t \in (\lambda^2/2, \lambda^2)$ and $|x| > R\lambda$. This gives (53). Then, we use (B) to get that $\vec u$ belongs to $L^1_t L^\infty_x$ on $(0,1)\times B(0,R)$ and (53) to get that $\vec u$ belongs to $L^1_t L^\infty_x$ on $(0,1)\times(\mathbb{R}^3 - B(0,R))$. Thus, $\vec u \in L^1_t L^\infty_x$ on $(0,1)\times\mathbb{R}^3$ and moreover its norm is controlled by a constant which depends only on $\vec u_0$ and not on the specific solution $\vec u$. Thus, rescaling, we find that $\int_0^{\lambda^2}\|\vec u(t,.)\|_\infty\,dt \le C\lambda$. Thus (D) is proved.

(E) is a direct consequence of (D): if $\vec u(t,x) = \frac{1}{\sqrt t}\vec U(\frac{x}{\sqrt t})$, then $\|\vec U\|_\infty = \frac{1}{2}\int_0^1 \|\vec u(t,.)\|_\infty\,dt$. Thus, Theorem 6 is proved.

Remark. Grujić [GRU 06] proved that the profile $\vec U$ of a self–similar suitable solution of Eqs. (1) must be bounded on any compact subset of $\mathbb{R}^3$.
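The last step of the proof of Theorem 6, relating the profile to the time integral of the sup norm, is a direct computation. As a sketch, for a self–similar field written in the form used in the proof:

```latex
% Sup norm of a self-similar field and its time integral.
\begin{align*}
\vec u(t,x) = \frac{1}{\sqrt t}\,\vec U\Big(\frac{x}{\sqrt t}\Big)
 \ &\Longrightarrow\ \|\vec u(t,.)\|_\infty = \frac{1}{\sqrt t}\,\|\vec U\|_\infty,\\
\int_0^1 \|\vec u(t,.)\|_\infty\,dt
 &= \|\vec U\|_\infty \int_0^1 t^{-1/2}\,dt = 2\,\|\vec U\|_\infty,\\
\frac{1}{\sqrt t}\int_0^t \|\vec u(s,.)\|_\infty\,ds
 &= \frac{1}{\sqrt t}\,\|\vec U\|_\infty\cdot 2\sqrt t = 2\,\|\vec U\|_\infty
 \qquad\text{for every } t > 0.
\end{align*}
```

So for a self–similar solution the quantity in (52) is constant in $t$ and finite exactly when the profile is bounded.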


5. A Scale-Preserving Approximation to the Navier–Stokes Problem

One of the main difficulties in the Navier–Stokes equations is the fact that p depends in a nonlocal way on $\vec u$. In formula (4), p is expressed through the use of singular integral operators whose kernels are supported by the whole space. In order to turn the equations into local equations, we will consider the following modification of the Navier–Stokes equations, associated to a positive $\epsilon$:

$$\begin{cases} \partial_t \vec u_\epsilon + \vec\nabla\cdot(\vec u_\epsilon\otimes\vec u_\epsilon) = \Delta \vec u_\epsilon - \vec\nabla p_\epsilon, \\ \vec u_\epsilon|_{t=0} = \vec u_0, \\ p_\epsilon = -\frac{1}{\epsilon}\vec\nabla\cdot\vec u_\epsilon. \end{cases} \tag{59}$$

We will show that those equations are well fitted for small regular initial values and provide solutions which converge to the solution of Eqs. (1). An important feature is that Eqs. (59) are invariant under the same rescaling as Eqs. (1). In particular, when $\vec u_0$ is homogeneous (and small), we shall have self–similar solutions for (59).

We shall work with an initial value $\vec u_0 \in (\dot B^{3/q-1,\infty}_q)^3$ for some $q \in [1,3)$. Such Besov spaces have been studied by Cannone [CAN 95] as a good frame to exhibit self–similar solutions. Let us remark that the function $1/|x|$ belongs to $\dot B^{3/q-1,\infty}_q$ for every $q \ge 1$.

Theorem 7. Let $q \in [1,3)$. Then there exists a constant $C_q$ such that, for every $\vec u_0 \in (\dot B^{3/q-1,\infty}_q(\mathbb{R}^3))^3$ such that $\vec\nabla\cdot\vec u_0 = 0$ and $\|\vec u_0\|_{\dot B^{3/q-1,\infty}_q} < C_q$, the following assertions are true:

(A) [Existence] Eqs. (1) have a unique solution $\vec u$ on $(0,+\infty)\times\mathbb{R}^3$ such that

$$\sup_{t>0}\|\vec u(t,.)\|_{\dot B^{3/q-1,\infty}_q} \le 2\,\|\vec u_0\|_{\dot B^{3/q-1,\infty}_q} \tag{60}$$

(with pressure p given by $\vec\nabla p = -\sum_{i=1}^3\sum_{j=1}^3 \vec\nabla\frac{1}{\Delta}\partial_i\partial_j(u_iu_j)$).

(B) [Existence for the modified equations] For every $\epsilon > 0$, Eqs. (59) have a unique solution $\vec u_\epsilon$ on $(0,+\infty)\times\mathbb{R}^3$ such that

$$\sup_{t>0}\|\vec u_\epsilon(t,.)\|_{\dot B^{3/q-1,\infty}_q} \le 2\,\|\vec u_0\|_{\dot B^{3/q-1,\infty}_q}. \tag{61}$$

(C) [Convergence] When $\epsilon$ goes to 0, the solutions $(\vec u_\epsilon, p_\epsilon)$ converge (in the sense of distributions) to the solution $(\vec u, p)$ of Eqs. (1).

(D) [Self–similarity] If $\vec u_0$ is homogeneous, then $\vec u$ and $\vec u_\epsilon$ are self–similar.

Proof. We define L as the operator

$$Lf = \int_0^t e^{(t-s)\Delta}\,\Delta f(s,.)\,ds \tag{62}$$

and, for $\lambda > 0$, $\tau_\lambda$ as the operator

$$(\tau_\lambda f)(t,x) = f(\lambda t, x). \tag{63}$$


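Thus $(\vec u,p)$ is a solution of (1) with $\vec u\in L^\infty_t\dot B^{3/q-1,\infty}_q$ and $p\in L^\infty_t\dot B^{3/q-2,\infty}_q$ if and only if
\[ \begin{cases} \vec u=e^{t\Delta}\vec u_0-L\big(\frac1\Delta\vec\nabla.(\vec u\otimes\vec u)\big)-L\big(\frac1\Delta\vec\nabla p\big),\\ p=-\frac1\Delta\vec\nabla\otimes\vec\nabla.(\vec u\otimes\vec u). \end{cases} \tag{64} \]
Similarly, taking the divergence of (59), we find that
\[ \partial_tp_\epsilon=\Big(1+\frac1\epsilon\Big)\Delta p_\epsilon+\frac1\epsilon\vec\nabla\otimes\vec\nabla.(\vec u_\epsilon\otimes\vec u_\epsilon), \tag{65} \]
so that (since $p_\epsilon|_{t=0}=0$) $(\vec u_\epsilon,p_\epsilon)$ is a solution of (59) if and only if
\[ \begin{cases} \vec u_\epsilon=e^{t\Delta}\vec u_0-L\big(\frac1\Delta\vec\nabla.(\vec u_\epsilon\otimes\vec u_\epsilon)\big)-L\big(\frac1\Delta\vec\nabla p_\epsilon\big),\\ p_\epsilon=\int_0^te^{(1+\frac1\epsilon)(t-s)\Delta}\frac1\epsilon\vec\nabla\otimes\vec\nabla.(\vec u_\epsilon\otimes\vec u_\epsilon)\,ds=\frac{1}{1+\epsilon}\,\tau_{1+\frac1\epsilon}L\Big(\frac1\Delta\vec\nabla\otimes\vec\nabla.\big(\tau^{-1}_{1+\frac1\epsilon}\vec u_\epsilon\otimes\tau^{-1}_{1+\frac1\epsilon}\vec u_\epsilon\big)\Big). \end{cases} \tag{66} \]
It is now easy to conclude by using classical estimates [LEM 02] for $1\le q<3$:
\[ \|fg\|_{\dot B^{3/q-2,\infty}_q}\le C_q\|f\|_{\dot B^{3/q-1,\infty}_q}\|g\|_{\dot B^{3/q-1,\infty}_q}, \tag{67} \]
\[ \Big\|\frac1\Delta\partial_if\Big\|_{\dot B^{3/q-1,\infty}_q}\le C_q\|f\|_{\dot B^{3/q-2,\infty}_q}, \tag{68} \]
\[ \Big\|\frac1\Delta\partial_i\partial_jf\Big\|_{\dot B^{3/q-2,\infty}_q}\le C_q\|f\|_{\dot B^{3/q-2,\infty}_q}, \tag{69} \]
\[ \|L(f)\|_{L^\infty_t\dot B^{3/q-1,\infty}_q}\le C_q\|f\|_{L^\infty_t\dot B^{3/q-1,\infty}_q}, \]
\[ \|L(f)\|_{L^\infty_t\dot B^{3/q-2,\infty}_q}\le C_q\|f\|_{L^\infty_t\dot B^{3/q-2,\infty}_q}, \tag{70} \]
\[ \|\tau_\lambda(f)\|_{L^\infty_t\dot B^{3/q-1,\infty}_q}=\|f\|_{L^\infty_t\dot B^{3/q-1,\infty}_q}, \tag{71} \]
\[ \|\tau_\lambda(f)\|_{L^\infty_t\dot B^{3/q-2,\infty}_q}=\|f\|_{L^\infty_t\dot B^{3/q-2,\infty}_q}. \tag{72} \]
Thus, we find that the bilinear operators
\[ A(\vec u,\vec v)=L\Big(\frac1\Delta\vec\nabla.(\vec u\otimes\vec v)\Big)-L\Big(\frac1\Delta\vec\nabla\Big(\frac1\Delta\vec\nabla\otimes\vec\nabla.(\vec u\otimes\vec v)\Big)\Big) \tag{73} \]
and
\[ A_\epsilon(\vec u,\vec v)=L\Big(\frac1\Delta\vec\nabla.(\vec u\otimes\vec v)\Big)+\frac{1}{1+\epsilon}L\Big(\frac1\Delta\vec\nabla\Big(\tau_{1+\frac1\epsilon}L\Big(\frac1\Delta\vec\nabla\otimes\vec\nabla.\big(\tau^{-1}_{1+\frac1\epsilon}\vec u\otimes\tau^{-1}_{1+\frac1\epsilon}\vec v\big)\Big)\Big)\Big) \tag{74} \]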

are an equicontinuous family of bounded bilinear operators on $(L^\infty_t\dot B^{3/q-1,\infty}_q)^3$. Thus, if $\vec u_0$ is small enough, we may find a solution to $\vec u=e^{t\Delta}\vec u_0-A(\vec u,\vec u)$ or to $\vec u_\epsilon=e^{t\Delta}\vec u_0-A_\epsilon(\vec u_\epsilon,\vec u_\epsilon)$. Thus, (A) and (B) (and (D)) are proved.

Since $\vec u_\epsilon$ remain bounded in $L^\infty_t\dot B^{3/q-1,\infty}_q$ and $p_\epsilon$ remain bounded in $L^\infty_t\dot B^{3/q-2,\infty}_q$, they belong to a relatively compact subset of $\mathcal D'((0,+\infty)\times\mathbb R^3)$. Hence in order to prove convergence it is enough to check that if $(\vec v,q)$ is a limit of some sequence $(\vec u_{\epsilon_k},p_{\epsilon_k})$ with $\epsilon_k\to0$, then $\vec v=\vec u$ and $q=p$. It is even enough to show that $(\vec v,q)$ is a solution of (1). Since $p_\epsilon$ is bounded in $L^\infty_t\dot B^{3/q-2,\infty}_q$, we have that $\vec\nabla.\vec v=-\lim_{\epsilon_k\to0}\epsilon_kp_{\epsilon_k}=0$. Moreover, writing $\vec v_k=\vec u_{\epsilon_k}$, we have $\partial_t\vec v=\lim_{\epsilon_k\to0}\partial_t\vec v_k=\Delta\vec v-\vec\nabla q-\lim_{\epsilon_k\to0}\vec\nabla.(\vec v_k\otimes\vec v_k)$. But $\vec v_k$ is bounded in $L^\infty_t\dot B^{3/q-1,\infty}_q$ and $\partial_t\vec v_k$ is bounded in $L^\infty_t\dot B^{3/q-3,\infty}_q$, hence there exist $\sigma<0<\tau$ such that, on any compact subset $K$ of $(0,+\infty)\times\mathbb R^3$, $\vec v_k$ is bounded in $L^2_tH^\tau_x$ and $\partial_t\vec v_k$ is bounded in $L^2_tH^\sigma$, so that [LEM 02] $\vec v_k$ is bounded in $H^\rho_{t,x}(K)$ for some positive $\rho$; by Rellich's theorem, we get that $\vec v_k$ is strongly convergent in $L^2_{t,x}(K)$ for every compact $K$, hence $\lim_{\epsilon_k\to0}\vec\nabla.(\vec v_k\otimes\vec v_k)=\vec\nabla.(\vec v\otimes\vec v)$. Hence, $\vec v$ is a solution of $\partial_t\vec v=\Delta\vec v-\vec\nabla.(\vec v\otimes\vec v)-\vec\nabla q$ and $\vec\nabla.\vec v=0$, and is small in $L^\infty_t\dot B^{3/q-1,\infty}_q$; but then $\vec v$ is smooth for $t>0$ and thus $\vec\nabla.(\vec v\otimes\vec v)=(\vec v.\vec\nabla)\vec v$. Theorem 7 is proved.
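The second expression for $p_\epsilon$ in (66) is a change of time variable. The following sketch spells it out, under the assumption (suggested by the boundedness estimates (70)) that $L$ carries a factor $\Delta$, i.e. $Lf=\int_0^te^{(t-s)\Delta}\Delta f\,ds$. The shorthands $\lambda=1+\frac1\epsilon$ and $g=\vec\nabla\otimes\vec\nabla.(\vec u_\epsilon\otimes\vec u_\epsilon)$ are introduced here only for readability. Substituting $\sigma=\lambda s$ and using $\epsilon\lambda=1+\epsilon$ and $(\tau_\lambda^{-1}g)(\sigma,.)=g(\sigma/\lambda,.)$:
\[ \int_0^te^{\lambda(t-s)\Delta}\,\frac1\epsilon\,g(s,.)\,ds=\frac{1}{\epsilon\lambda}\int_0^{\lambda t}e^{(\lambda t-\sigma)\Delta}\,g(\sigma/\lambda,.)\,d\sigma=\frac{1}{1+\epsilon}\Big[L\Big(\frac1\Delta\tau_\lambda^{-1}g\Big)\Big](\lambda t,.)=\frac{1}{1+\epsilon}\Big[\tau_\lambda L\Big(\frac1\Delta\tau_\lambda^{-1}g\Big)\Big](t,.). \]
Since $\tau_\lambda^{-1}$ acts only on the time variable, $\tau_\lambda^{-1}g=\vec\nabla\otimes\vec\nabla.(\tau_\lambda^{-1}\vec u_\epsilon\otimes\tau_\lambda^{-1}\vec u_\epsilon)$. The prefactor $\frac{1}{1+\epsilon}$ is at most 1 and $\tau_\lambda$ is an isometry by (71) and (72), which is why the bounds on $A_\epsilon$ do not depend on $\epsilon$.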

6. Another Scale-Preserving Approximation to the Navier–Stokes Problem

Equations (59) are not good when dealing with large data (and thus looking for weak solutions). Indeed, when we write the energy balance (8) for $\vec u_\epsilon$, the remainder $R$ given by formula (9) is no longer equal to 0, since $\vec u_\epsilon$ is no longer divergence-free. The term $|\vec u_\epsilon|^2\,\vec\nabla.\vec u_\epsilon$ provides the worst contribution to the energy since it cannot be controlled by the (local) $L^\infty_tL^2_x$ and $L^2_t\dot H^1_x$ norms. We have to add a damping term to ensure that $|\vec u_\epsilon|^2$ belongs (locally) to $L^2_tL^2_x$. Vishik and Fursikov proposed the following approximation [VIS 77]:
\[ \begin{cases} \partial_t\vec u_{\epsilon,\alpha}+(\vec u_{\epsilon,\alpha}.\vec\nabla)\vec u_{\epsilon,\alpha}=\Delta\vec u_{\epsilon,\alpha}-\alpha|\vec u_{\epsilon,\alpha}|^4\vec u_{\epsilon,\alpha}-\vec\nabla p_{\epsilon,\alpha},\\ \vec u_{\epsilon,\alpha}|_{t=0}=\vec u_0,\\ p_{\epsilon,\alpha}=-\frac1\epsilon\vec\nabla.\vec u_{\epsilon,\alpha}, \end{cases} \tag{75} \]
where $\epsilon>0$ and $\alpha>0$. This model, which provides unique solutions for a large class of initial values, is not well fitted to homogeneous initial values, since the equations are not invariant through the rescaling. Thus, we shall study a modified model:
\[ \begin{cases} \partial_t\vec u_{\epsilon,\alpha}+(\vec u_{\epsilon,\alpha}.\vec\nabla)\vec u_{\epsilon,\alpha}=\Delta\vec u_{\epsilon,\alpha}-\alpha|\vec u_{\epsilon,\alpha}|^2\vec u_{\epsilon,\alpha}-\vec\nabla p_{\epsilon,\alpha},\\ \vec u_{\epsilon,\alpha}|_{t=0}=\vec u_0,\\ p_{\epsilon,\alpha}=-\frac1\epsilon\vec\nabla.\vec u_{\epsilon,\alpha}, \end{cases} \tag{76} \]
under some additional restriction on $\alpha$ and $\epsilon$ (namely, $\epsilon<4\alpha$). We shall lose uniqueness, but keep scaling invariance (as for (1)) and get energy equality (in contrast to (1)). Moreover, we shall prove convergence to suitable solutions of (1) when $\epsilon$ goes to 0 (and when $\vec u_0\in(\dot M^{2,3})^3$).

Equations (76) have been studied by F. Lelièvre [LEL 10] in the case of finite energy ($\vec u_0\in(L^2)^3$) and in the case of uniformly locally square-integrable initial value ($\vec u_0\in(L^2_{uloc})^3$). He proved the existence of global weak solutions in the first case and of local weak solutions in the second case (with existence time depending on $\alpha$). Those solutions are locally $L^4_tL^4_x$ and we can write the energy balance
\[ \partial_t|\vec u_{\epsilon,\alpha}|^2+2|\vec\nabla\otimes\vec u_{\epsilon,\alpha}|^2=\sum_{i=1}^3\partial_i\big(2\vec u_{\epsilon,\alpha}.\partial_i\vec u_{\epsilon,\alpha}-(|\vec u_{\epsilon,\alpha}|^2+2p_{\epsilon,\alpha})u_{\epsilon,\alpha,i}\big)+R, \tag{77} \]


where
\[ R=-2\alpha|\vec u_{\epsilon,\alpha}|^4+|\vec u_{\epsilon,\alpha}|^2\,\vec\nabla.\vec u_{\epsilon,\alpha}-\frac2\epsilon|\vec\nabla.\vec u_{\epsilon,\alpha}|^2\le-\alpha|\vec u_{\epsilon,\alpha}|^4+\Big(\frac{1}{4\alpha}-\frac2\epsilon\Big)|\vec\nabla.\vec u_{\epsilon,\alpha}|^2\le-\alpha|\vec u_{\epsilon,\alpha}|^4-\frac1\epsilon|\vec\nabla.\vec u_{\epsilon,\alpha}|^2\le0. \tag{78} \]
This is the key tool to exhibit weak solutions. Let us recall the main results of [LEL 10]:

Theorem 8. Let $0<\epsilon<4\alpha$. Let $\vec u_0\in(L^2(\mathbb R^3))^3$ be such that $\vec\nabla.\vec u_0=0$. Then the following assertions are true:

(A) [Existence] Equations (76) have a solution $\vec u_{\epsilon,\alpha}$ on $(0,+\infty)\times\mathbb R^3$ such that $\vec u_{\epsilon,\alpha}\in(L^\infty_tL^2_x)^3\cap(L^2_t\dot H^1_x)^3\cap(L^4_tL^4_x)^3$.

(B) [Energy inequality] For every $t>0$, we have
\[ \|\vec u_{\epsilon,\alpha}(t,.)\|_2^2+2\int_0^t\|\vec\nabla\otimes\vec u_{\epsilon,\alpha}\|_2^2\,ds+\alpha\int_0^t\|\vec u_{\epsilon,\alpha}\|_4^4\,ds+\frac1\epsilon\int_0^t\|\vec\nabla.\vec u_{\epsilon,\alpha}\|_2^2\,ds\le\|\vec u_0\|_2^2. \tag{79} \]

(C) [Convergence] There exists a sequence $(\alpha_k,\epsilon_k)$ going to $(0,0)$ such that $(\vec u_{\epsilon_k,\alpha_k},p_{\epsilon_k,\alpha_k})$ converges (in the sense of distributions) to a suitable solution of (1).

Theorem 9. Let $0<\epsilon<4\alpha$. Let $\vec u_0\in(L^2_{uloc}(\mathbb R^3))^3$ be such that $\vec\nabla.\vec u_0=0$. Then the following assertions are true:

(A) [Existence] Equations (76) have a solution $\vec u_{\epsilon,\alpha}$ on $(0,T)\times\mathbb R^3$ such that $\vec u_{\epsilon,\alpha}\in((L^\infty_tL^2_x)_{uloc})^3\cap((L^2_t\dot H^1_x)_{uloc})^3\cap((L^4_tL^4_x)_{uloc})^3$.

(B) [Uniqueness] If $\alpha>6$ and $\epsilon<\alpha/6$, then if $\vec u$ and $\vec v$ are two solutions of (76) on $(0,T)\times\mathbb R^3$ which belong to $((L^\infty_tL^2_x)_{uloc})^3\cap((L^2_t\dot H^1_x)_{uloc})^3\cap((L^4_tL^4_x)_{uloc})^3$, then $\vec u=\vec v$.

(C) [Self-similarity] If $\alpha>6$ and $\epsilon<\alpha/6$ and if $\vec u_0$ is homogeneous, then $\vec u_{\epsilon,\alpha}$ is self-similar.

The problem in Theorem 9 is that we have no control when $(\alpha,\epsilon)$ goes to $(0,0)$, neither on the existence time $T$ nor on the size of the solution. Indeed, we have to control the pressure $p_{\epsilon,\alpha}$; in the case of finite energy solutions, this can be done through a result on maximal regularity of the heat kernel; this result breaks down in the case of uniformly locally square-integrable solutions. But we can recover it in the case of an initial value in the Morrey–Campanato space:

Theorem 10. Let $0<\epsilon<4\alpha$. Let $\vec u_0\in(\dot M^{2,3}(\mathbb R^3))^3$ be such that $\vec\nabla.\vec u_0=0$. Then the following assertions are true:

(A) [Existence] Equations (76) have a solution $\vec u_{\epsilon,\alpha}$ on $(0,+\infty)\times\mathbb R^3$ such that for every $T>0$ we have $\vec u_{\epsilon,\alpha}\in((L^\infty_tL^2_x)_{uloc})^3\cap((L^2_t\dot H^1_x)_{uloc})^3\cap((L^4_tL^4_x)_{uloc})^3$ on $(0,T)\times\mathbb R^3$.

(B) [Convergence] There exists a sequence $(\alpha_k,\epsilon_k)$ going to $(0,0)$ such that $(\vec u_{\epsilon_k,\alpha_k},p_{\epsilon_k,\alpha_k})$ converges (in the sense of distributions) to a suitable solution of (1).
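The hypothesis $0<\epsilon<4\alpha$ shared by Theorems 8, 9 and 10 is exactly what makes the remainder $R$ of (78) non-positive. As a short check, write $a=|\vec u_{\epsilon,\alpha}|^2$ and $b=|\vec\nabla.\vec u_{\epsilon,\alpha}|$ (shorthands introduced here). Young's inequality $ab\le\alpha a^2+\frac{1}{4\alpha}b^2$, which is just $(\sqrt\alpha\,a-\frac{1}{2\sqrt\alpha}b)^2\ge0$, gives
\[ R\le-2\alpha a^2+ab-\frac2\epsilon b^2\le-\alpha a^2+\Big(\frac{1}{4\alpha}-\frac2\epsilon\Big)b^2, \]
and
\[ \frac{1}{4\alpha}-\frac2\epsilon\le-\frac1\epsilon\iff\frac{1}{4\alpha}\le\frac1\epsilon\iff\epsilon\le4\alpha. \]
So under $\epsilon<4\alpha$ both the damping term $\alpha a^2$ and the divergence term $\frac1\epsilon b^2$ survive with a favourable sign, which is where the last two integrals in the energy inequality (79) come from.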


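Proof. From Theorem 9, we may derive by rescaling, compactness and extraction that Eqs. (76) with initial data in $\dot M^{2,3}$ have a global solution $\vec u_{\epsilon,\alpha}$ whose size depends on $\epsilon$ and $\alpha$ as well as on $\|\vec u_0\|_{\dot M^{2,3}}$. As this diagonal extraction must be done very carefully (since it interacts with multiple rescalings), we describe it in the next section. Let us take this existence for granted.

We shall try to get rid of the dependence on $\epsilon$ and $\alpha$. The problem is to find a space $X$ such that $\dot M^{2,3}\subset X$ and such that the solution $\vec u_{\epsilon,\alpha}$ of Eqs. (76) will be well controlled through the estimates on $X$ norms on some strip $(0,T)\times\mathbb R^3$ for some time $T$ independent from $\epsilon$ and $\alpha$ (with a (local in $t$ and $x$) control in $L^\infty_tL^2\cap L^2\dot H^1_x$ independent from $\epsilon$ and $\alpha$).

The space $X$ we shall consider is the space $X=L^2(\omega(x)\,dx)$, where $\omega(x)=(1+|x|^2)^{-\lambda/2}$ with $\lambda\in(1,2)$. Since $\lambda>1$, it is easy to check that $\int|\vec u_0|^2\omega(x)\,dx<C\|\vec u_0\|^2_{\dot M^{2,3}}$. We shall use another weight: $\Phi(x)=(1+|x|^2)^{-\mu/2}$ with $\mu\in(\frac{3\lambda}{2},\frac32+\frac{3\lambda}{4})\subset(3/2,3)$. We start from the identity
\[ \partial_t(|\vec u_{\epsilon,\alpha}|^2\omega)=2\omega\vec u_{\epsilon,\alpha}.\Delta\vec u_{\epsilon,\alpha}-2\omega\vec u_{\epsilon,\alpha}.\big((\vec u_{\epsilon,\alpha}.\vec\nabla)\vec u_{\epsilon,\alpha}\big)-2\alpha|\vec u_{\epsilon,\alpha}|^4\omega+\frac2\epsilon\omega\,\vec u_{\epsilon,\alpha}.\vec\nabla(\vec\nabla.\vec u_{\epsilon,\alpha}) \tag{80} \]
with
\[ 2\omega\vec u_{\epsilon,\alpha}.\Delta\vec u_{\epsilon,\alpha}=\Delta(\omega|\vec u_{\epsilon,\alpha}|^2)-|\vec u_{\epsilon,\alpha}|^2\Delta\omega-2|\vec\nabla\otimes\vec u_{\epsilon,\alpha}|^2\omega-2\sum_{i=1}^3\partial_i|\vec u_{\epsilon,\alpha}|^2\,\partial_i\omega, \tag{81} \]
\[ 2\omega\vec u_{\epsilon,\alpha}.\big((\vec u_{\epsilon,\alpha}.\vec\nabla)\vec u_{\epsilon,\alpha}\big)=\vec\nabla.\big(\omega|\vec u_{\epsilon,\alpha}|^2\vec u_{\epsilon,\alpha}\big)-\omega|\vec u_{\epsilon,\alpha}|^2\,\vec\nabla.\vec u_{\epsilon,\alpha}-|\vec u_{\epsilon,\alpha}|^2\sum_{i=1}^3u_{\epsilon,\alpha,i}\,\partial_i\omega, \tag{82} \]
and
\[ \frac2\epsilon\omega\,\vec u_{\epsilon,\alpha}.\vec\nabla(\vec\nabla.\vec u_{\epsilon,\alpha})=\frac2\epsilon\vec\nabla.\big(\omega(\vec\nabla.\vec u_{\epsilon,\alpha})\vec u_{\epsilon,\alpha}\big)-\frac2\epsilon|\vec\nabla.\vec u_{\epsilon,\alpha}|^2\omega-\frac2\epsilon\,\vec\nabla.\vec u_{\epsilon,\alpha}\sum_{i=1}^3u_{\epsilon,\alpha,i}\,\partial_i\omega. \tag{83} \]
We then introduce
\[ U(t)=\int|\vec u_{\epsilon,\alpha}(t,x)|^2\omega(x)\,dx, \tag{84} \]
\[ V(t)=\int_0^t\!\!\int|\vec\nabla\otimes\vec u_{\epsilon,\alpha}(s,x)|^2\omega(x)\,ds\,dx, \tag{85} \]
\[ W(t)=\alpha\int_0^t\!\!\int|\vec u_{\epsilon,\alpha}(s,x)|^4\omega(x)\,ds\,dx, \tag{86} \]
\[ X(t)=\frac1\epsilon\int_0^t\!\!\int|\vec\nabla.\vec u_{\epsilon,\alpha}(s,x)|^2\omega(x)\,ds\,dx, \tag{87} \]
\[ Y(t)=\frac{1}{\epsilon^{3/2}}\int_0^t\!\!\int|\vec\nabla.\vec u_{\epsilon,\alpha}(s,x)|^{3/2}\Phi(x)\,ds\,dx. \tag{88} \]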


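Integrating (80) in $t$ and $x$ (and writing $|\vec u_{\epsilon,\alpha}|^2|\vec\nabla.\vec u_{\epsilon,\alpha}|\le\alpha|\vec u_{\epsilon,\alpha}|^4+\frac{1}{4\alpha}|\vec\nabla.\vec u_{\epsilon,\alpha}|^2\le\alpha|\vec u_{\epsilon,\alpha}|^4+\frac1\epsilon|\vec\nabla.\vec u_{\epsilon,\alpha}|^2$), we get
\[ U(t)+2V(t)+W(t)+X(t)\le U(0)+Z(t) \tag{89} \]
with
\[ Z=\int_0^t\!\!\int|\vec u_{\epsilon,\alpha}|^2\Delta\omega\,ds\,dx+\int_0^t\!\!\int|\vec u_{\epsilon,\alpha}|^2\sum_{i=1}^3u_{\epsilon,\alpha,i}\,\partial_i\omega\,ds\,dx-\frac2\epsilon\int_0^t\!\!\int\vec\nabla.\vec u_{\epsilon,\alpha}\sum_{i=1}^3u_{\epsilon,\alpha,i}\,\partial_i\omega\,ds\,dx=Z_1+Z_2+Z_3. \tag{90} \]
We have $|\Delta\omega(x)|\le C\omega(x)$, so that:
\[ Z_1\le C\int_0^tU(s)\,ds. \tag{91} \]
Since $\lambda<2$, we have $|\vec\nabla\omega(x)|\le\lambda(1+|x|^2)^{-1/2}\omega(x)\le\lambda\,\omega(x)^{3/2}$ so that:
\[ Z_2\le C\int_0^t\!\!\int|\vec u_{\epsilon,\alpha}(s,x)|^3\omega^{3/2}\,ds\,dx, \tag{92} \]
hence
\[ Z_2\le C\int_0^tU(s)^{3/4}\|\sqrt\omega\,\vec u_{\epsilon,\alpha}\|_{H^1}^{3/2}\,ds\le\frac14\int_0^t\|\sqrt\omega\,\vec u_{\epsilon,\alpha}\|_{H^1}^2\,ds+C\int_0^tU(s)^3\,ds, \tag{93} \]
and finally
\[ Z_2\le\frac14V(t)+C\int_0^tU(s)\,ds+C\int_0^tU^3(s)\,ds. \tag{94} \]
Since $2\mu/3\le1+\lambda/2$, we have $|\vec\nabla\omega(x)|\le\lambda(1+|x|^2)^{-1/2}\omega(x)\le\lambda\,\omega^{1/2}\Phi^{2/3}$ so that
\[ Z_3\le C\int_0^t\|\sqrt\omega\,\vec u_{\epsilon,\alpha}\|_3\,\Big\|\Phi^{2/3}\frac1\epsilon\vec\nabla.\vec u_{\epsilon,\alpha}\Big\|_{3/2}\,ds\le\frac14V(t)+C\int_0^tU(s)\,ds+C\int_0^tU^3(s)\,ds+CY(t). \tag{95} \]
The main problem we have to deal with is to control the size of $Y(t)$ independently of $\epsilon$ and $\alpha$. We define $p_{\epsilon,\alpha}=-\frac1\epsilon\vec\nabla.\vec u_{\epsilon,\alpha}$ (so that $Y(t)=\int_0^t\!\int|p_{\epsilon,\alpha}(s,x)|^{3/2}\Phi(x)\,ds\,dx$) and we write
\[ p_{\epsilon,\alpha}=-\int_0^te^{(1+\frac1\epsilon)(t-s)\Delta}\Big(1+\frac1\epsilon\Big)\Delta\,\frac{1}{1+\epsilon}\,\frac{1}{-\Delta}\vec\nabla.\big((\vec u_{\epsilon,\alpha}.\vec\nabla)\vec u_{\epsilon,\alpha}+\alpha|\vec u_{\epsilon,\alpha}|^2\vec u_{\epsilon,\alpha}\big)\,ds. \tag{96} \]
Since $0<\mu<3$, the weight $\Phi$ belongs to the Muckenhoupt class $A_{3/2}$ [STE 93]; thus, we have maximal regularity for the heat kernel on $L^{3/2}_tL^{3/2}_x(\Phi(x)\,dx)$: the operator $T$ defined by
\[ f\mapsto T(f)=\int_0^te^{(t-s)\Delta}\Delta f(s,.)\,ds \tag{97} \]
is a Calderón–Zygmund operator on the homogeneous-type space $\mathbb R\times\mathbb R^3$ (endowed with the Lebesgue measure $dt\,dx$ and the pseudo-distance $\delta((t,x),(t',x'))=(|t-t'|^2+|x-x'|^4)^{1/4}$) [LEM 02]; since $\Phi(x)$ is a Muckenhoupt weight on $\mathbb R\times\mathbb R^3$, the operator $T$ is bounded on $L^{3/2}(\Phi\,dt\,dx)$ [STE 93, PRA 07]. We thus get
\[ Y(t)\le C\int_0^t\!\!\int\Big|\frac1\Delta\vec\nabla.\big((\vec u_{\epsilon,\alpha}.\vec\nabla)\vec u_{\epsilon,\alpha}\big)\Big|^{3/2}\Phi(x)\,ds\,dx+C\int_0^t\!\!\int\Big|\frac1\Delta\vec\nabla.\big(\alpha|\vec u_{\epsilon,\alpha}|^2\vec u_{\epsilon,\alpha}\big)\Big|^{3/2}\Phi(x)\,ds\,dx. \tag{98} \]
We write $f_1=|\vec\nabla\otimes\vec u_{\epsilon,\alpha}|$, $f_2=\sqrt\alpha\,|\vec u_{\epsilon,\alpha}|^2$ and $g=|\vec u_{\epsilon,\alpha}|$ and thus we have
\[ Y(t)\le C\int_0^t\!\!\int\Big|\frac{1}{\sqrt{-\Delta}}(f_1g)\Big|^{3/2}\Phi(x)\,ds\,dx+C\alpha^{3/4}\int_0^t\!\!\int\Big|\frac{1}{\sqrt{-\Delta}}(f_2g)\Big|^{3/2}\Phi(x)\,ds\,dx. \tag{99} \]
We define $E_j=B(0,2^j)$, $F_j=B(0,3\cdot2^j)$ and $G_j=\mathbb R^3-F_j$. We write
\[ \Phi^{2/3}\,\frac{1}{\sqrt{-\Delta}}(f_ig)\le C\sum_{j=0}^{+\infty}2^{-2\mu j/3}\,1_{E_j}\,\frac{1}{\sqrt{-\Delta}}(f_ig) \tag{100} \]
with
\[ 2^{-\frac{2\mu j}{3}}1_{E_j}\frac{1}{\sqrt{-\Delta}}(f_ig)\le2^{-\frac{2\mu j}{3}}1_{E_j}\frac{1}{\sqrt{-\Delta}}(1_{F_j}f_ig)+2^{-\frac{2\mu j}{3}}1_{E_j}\frac{1}{\sqrt{-\Delta}}(1_{G_j}f_ig)=A_j+B_j. \tag{101} \]
In order to estimate $A_j$, we write
\[ A_j\le C2^{j(\lambda-\frac{2\mu}{3})}\,1_{B(0,2^j)}\frac{1}{\sqrt{-\Delta}}(\omega f_ig). \tag{102} \]
Since $\sqrt\omega\,g$ belongs to $L^2\cap L^6$, we find that, for $q\in(2,6)$, $\sqrt\omega\,g\in L^q$ so that $\omega f_ig\in L^r$ with $1/r=1/q+1/2$, which gives $\frac{1}{\sqrt{-\Delta}}(\omega f_ig)\in L^\rho$ with $1/\rho=1/q+1/6$ and finally
\[ \|A_j\|_{L^{3/2}(dx)}\le C2^{j(\lambda-\frac{2\mu}{3}+\frac32-\frac3q)}\,\|\sqrt\omega f_i\|_2\,\|\sqrt\omega g\|_q. \tag{103} \]
Let $\theta=\frac{2\mu}{3}-\lambda-\frac32+\frac3q$. As $\lambda<\frac{2\mu}{3}$, we may choose $q$ close enough to 2 to ensure that $\theta>0$. We then have
\[ \int_0^t\!\!\int\Big|\sum_{j=0}^{+\infty}A_j\Big|^{3/2}ds\,dx\le C\Big(\sum_{j=0}^{+\infty}2^{-j\theta}\Big)^{3/2}\int_0^t\|\sqrt\omega f_i\|_2^{3/2}\,\|\sqrt\omega g\|_q^{3/2}\,ds. \tag{104} \]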


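If $2<q<18/7$, we may find an exponent $R>1$ such that, for all $\kappa>0$,
\[ \|\sqrt\omega f_i\|_2^{3/2}\,\|\sqrt\omega g\|_q^{3/2}\le\kappa\|\sqrt\omega f_i\|_2^2+\kappa\|\sqrt\omega g\|_6^2+C_{\kappa,q}\|\sqrt\omega g\|_2^{2R}. \tag{105} \]
In order to estimate $B_j$, we write that
\[ B_j\le C2^{-\frac{2j\mu}{3}}\,1_{B(0,2^j)}\sum_{k=0}^{+\infty}2^{-2(j+k)}\,2^{(j+k)\lambda}\int_{3\cdot2^{j+k}<|y|<3\cdot2^{j+k+1}}(\omega f_ig)\,dy \tag{106} \]
and, since $\lambda<2$, we get that
\[ B_j\le C2^{-\frac{2j\mu}{3}}\,1_{B(0,2^j)}\,\|\sqrt\omega f_i\|_2\,\|\sqrt\omega g\|_2\sum_{k=0}^{+\infty}2^{-2(j+k)}\,2^{(j+k)\lambda}\le C'2^{j(\lambda-2-\frac{2\mu}{3})}\,1_{B(0,2^j)}\,\|\sqrt\omega f_i\|_2\,\|\sqrt\omega g\|_2. \tag{107} \]
Since $\|B_j\|_{L^{3/2}(dx)}\le C2^{2j}\|B_j\|_\infty$ and $\lambda<2\mu/3$, we get that
\[ \int_0^t\!\!\int\Big|\sum_{j=0}^{+\infty}B_j\Big|^{3/2}ds\,dx\le C\Big(\sum_{j=0}^{+\infty}2^{j(\lambda-\frac{2\mu}{3})}\Big)^{3/2}\int_0^t\|\sqrt\omega f_i\|_2^{3/2}\,\|\sqrt\omega g\|_2^{3/2}\,ds \tag{108} \]
where, for all $\kappa>0$,
\[ \|\sqrt\omega f_i\|_2^{3/2}\,\|\sqrt\omega g\|_2^{3/2}\le\frac34\kappa\|\sqrt\omega f_i\|_2^2+\frac{1}{4\kappa^3}\|\sqrt\omega g\|_2^6. \tag{109} \]
Finally, we get
\[ Z(t)\le V(t)+\frac12W(t)+C\int_0^t\big(U(s)+U^3(s)+U^R(s)\big)\,ds, \tag{110} \]
so that
\[ U(t)+V(t)+\frac12W(t)+X(t)\le U(0)+C\int_0^t\big(U(s)+U^3(s)+U^R(s)\big)\,ds \tag{111} \]
and we may conclude.

Thus far, we have got size estimates of the solution $\vec u_{\epsilon,\alpha}$ independent from $\epsilon$ and $\alpha$ for $(t,x)$ in a strip $(0,T_0)\times\mathbb R^3$, where $T_0$ depends on the size of $\|\vec u_0\|_{\dot M^{2,3}}$. But we may rescale $\vec u_0$ in $\vec u_0(x)=\gamma\vec u_{0,\gamma}(\gamma x)$ with $\|\vec u_{0,\gamma}\|_{\dot M^{2,3}}=\|\vec u_0\|_{\dot M^{2,3}}$; this gives estimates on the strip $0<t<\gamma^{-2}T_0$. Let us notice that the weights $\omega$ and $\omega(x/\gamma)$ give way to the same Lebesgue spaces so that we have controls in $L^2(\omega\,dx)$ norm on $(0,\gamma^{-2}T_0)$ for every $\gamma>0$. Thus, there exists a sequence $(\alpha_k,\epsilon_k)$ going to $(0,0)$ such that $(\vec u_{\epsilon_k,\alpha_k})$ converges (in the sense of distributions) to some function $\vec u$ while $(p_{\epsilon_k,\alpha_k})$ converges weakly to some function $p$. Since moreover $\alpha^{1/4}\vec u_{\epsilon,\alpha}$ is bounded in every $L^4((0,T),L^4(\omega\,dx))$ and $\epsilon^{-1/2}\vec\nabla.\vec u_{\epsilon,\alpha}$ is bounded in every $L^2((0,T),L^2(\omega\,dx))$, we find that all terms but one in Eq. (76) have a limit in $\mathcal D'$ (and thus the last one as well) and we get the equation
\[ \begin{cases} \partial_t\vec u+\lim\,(\vec u_{\epsilon_k,\alpha_k}.\vec\nabla)\vec u_{\epsilon_k,\alpha_k}=\Delta\vec u-\vec\nabla p,\\ \vec u|_{t=0}=\vec u_0,\\ \vec\nabla.\vec u=0. \end{cases} \tag{112} \]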


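Of course, we want to show that
\[ \lim\,(\vec u_{\epsilon_k,\alpha_k}.\vec\nabla)\vec u_{\epsilon_k,\alpha_k}=(\vec u.\vec\nabla)\vec u. \tag{113} \]
Thus, we need to prove that $\vec u_{\epsilon_k,\alpha_k}$ converges to $\vec u$ strongly locally in $L^2_{t,x}$. We split $\vec u_{\epsilon_k,\alpha_k}$ in $P\vec u_{\epsilon_k,\alpha_k}+(Id-P)\vec u_{\epsilon_k,\alpha_k}$, where $P$ is the Leray projection operator. The operator $P$ is bounded on $L^2(\omega\,dx)$, since $\omega$ belongs to the Muckenhoupt class $A_2$. On every compact subset of $[0,+\infty)\times\mathbb R^3$, we control (uniformly in $\epsilon$ and $\alpha$) the size of $P\vec u_{\epsilon,\alpha}$ in $L^2_tH^1_x$ and the size of $\partial_tP\vec u_{\epsilon,\alpha}$ in $L^2_tH^{-2}_x$. A classical Rellich compactness argument [LEM 02] then ensures that $P(\vec u_{\epsilon_k,\alpha_k})$ converges strongly locally in $L^2_{t,x}$ norm.

We now consider the remaining part $\vec v_{\epsilon_k,\alpha_k}=(Id-P)\vec u_{\epsilon_k,\alpha_k}$. We have
\[ \vec v_{\epsilon_k,\alpha_k}=\frac{1}{\Delta}\vec\nabla\big(\vec\nabla.\vec u_{\epsilon_k,\alpha_k}\big). \tag{114} \]
We want to show that, for every $x_0\in\mathbb R^3$ and every $T>0$, $\vec v_{\epsilon_k,\alpha_k}$ converges to 0 in $L^2((0,T)\times B(x_0,1))$. We consider a function $\theta\in\mathcal D(\mathbb R^3)$ such that $\theta(x)=1$ on $B(0,1)$ and $\theta(x)=0$ for $|x|>2$. We write $\theta_k(x)=\theta(\frac{x-x_0}{R_k})$ for some $R_k>10$ which will be fixed later. We define $\omega_0(x)=(1+|x|^2)^{-\lambda_0/2}$ with $\lambda_0\in(\lambda,2)$. We then write $\vec v_{\epsilon_k,\alpha_k}=\vec A_k+\vec B_k+\vec C_k$ with
\[ \vec A_k=\frac{1}{\Delta}\vec\nabla\big(\theta_k\,\vec\nabla.\vec u_{\epsilon_k,\alpha_k}\big),\quad\vec B_k=\frac{1}{\Delta}\vec\nabla\big(\vec u_{\epsilon_k,\alpha_k}.\vec\nabla\theta_k\big),\quad\vec C_k=\frac{1}{\Delta}\vec\nabla\,\vec\nabla.\big((1-\theta_k)\vec u_{\epsilon_k,\alpha_k}\big). \tag{115} \]
We then write
\[ \|\vec A_k\|_{L^2((0,T)\times B(x_0,1))}\le C\|\vec A_k\|_{L^2_tL^6_x((0,T)\times\mathbb R^3)}\le C\|\theta_k\,\vec\nabla.\vec u_{\epsilon_k,\alpha_k}\|_{L^2((0,T)\times\mathbb R^3)}, \tag{116} \]
which gives
\[ \|\vec A_k\|_{L^2((0,T)\times B(x_0,1))}\le C(R_k+|x_0|)^{\lambda/2}\,\epsilon_k^{1/2}\,\|\epsilon_k^{-1/2}\vec\nabla.\vec u_{\epsilon_k,\alpha_k}\|_{L^2((0,T),L^2(\omega\,dx))}\le C_T(R_k+|x_0|)^{\lambda/2}\,\epsilon_k^{1/2}, \tag{117} \]
where $C_T$ depends only on $T$ and on $\|\vec u_0\|_{\dot M^{2,3}}$. We write in a similar way
\[ \|\vec B_k\|_{L^2((0,T)\times B(x_0,1))}\le C\|\vec B_k\|_{L^2_tL^6_x((0,T)\times\mathbb R^3)}\le C\|\vec u_{\epsilon_k,\alpha_k}.\vec\nabla\theta_k\|_{L^2((0,T)\times\mathbb R^3)}, \tag{118} \]
which gives
\[ \|\vec B_k\|_{L^2((0,T)\times B(x_0,1))}\le C(R_k+|x_0|)^{\lambda/2}R_k^{-1}\,\|\vec u_{\epsilon_k,\alpha_k}\|_{L^2((0,T),L^2(\omega\,dx))}\le C_T(R_k+|x_0|)^{\lambda/2}R_k^{-1}. \tag{119} \]
Finally, we use the fact that $\omega_0$ belongs to the Muckenhoupt class $A_2$ and we write
\[ \|\vec C_k\|_{L^2((0,T)\times B(x_0,1))}\le C(1+|x_0|)^{\lambda_0/2}\|\vec C_k\|_{L^2((0,T),L^2(\omega_0\,dx))}\le C'(1+|x_0|)^{\lambda_0/2}\|(1-\theta_k)\vec u_{\epsilon_k,\alpha_k}\|_{L^2((0,T),L^2(\omega_0\,dx))} \tag{120} \]
which gives
\[ \|\vec C_k\|_{L^2((0,T)\times B(x_0,1))}\le C(1+|x_0|)^{\lambda_0/2}(R_k+|x_0|)^{(\lambda-\lambda_0)/2}\|\vec u_{\epsilon_k,\alpha_k}\|_{L^2((0,T),L^2(\omega\,dx))}\le C_T(1+|x_0|)^{\lambda_0/2}(R_k+|x_0|)^{(\lambda-\lambda_0)/2}. \tag{121} \]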

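We take $R_k=\epsilon_k^{-1/2}$ and we find that
\[ \|\vec v_{\epsilon_k,\alpha_k}\|_{L^2((0,T)\times B(x_0,1))}=O\big(\epsilon_k^{\frac{2-\lambda}{4}}\big)+O\big(\epsilon_k^{\frac{\lambda_0-\lambda}{4}}\big), \tag{122} \]
so that we have $\lim_{k\to+\infty}\|\vec v_{\epsilon_k,\alpha_k}\|_{L^2((0,T)\times B(x_0,1))}=0$.

It remains to show that the solution $\vec u$ we have obtained is a suitable one. Starting from the equation
\[ \partial_t(|\vec u_{\epsilon,\alpha}|^2)=\Delta(|\vec u_{\epsilon,\alpha}|^2)-2|\vec\nabla\otimes\vec u_{\epsilon,\alpha}|^2-2\vec u_{\epsilon,\alpha}.\big((\vec u_{\epsilon,\alpha}.\vec\nabla)\vec u_{\epsilon,\alpha}\big)-2\alpha|\vec u_{\epsilon,\alpha}|^4-2\vec u_{\epsilon,\alpha}.\vec\nabla p_{\epsilon,\alpha} \tag{123} \]
rewritten as
\[ \partial_t(|\vec u_{\epsilon,\alpha}|^2)=\Delta(|\vec u_{\epsilon,\alpha}|^2)-2|\vec\nabla\otimes\vec u_{\epsilon,\alpha}|^2-\vec\nabla.\big((|\vec u_{\epsilon,\alpha}|^2+2p_{\epsilon,\alpha})\vec u_{\epsilon,\alpha}\big)+|\vec u_{\epsilon,\alpha}|^2\,\vec\nabla.\vec u_{\epsilon,\alpha}-2\alpha|\vec u_{\epsilon,\alpha}|^4+2p_{\epsilon,\alpha}\,\vec\nabla.\vec u_{\epsilon,\alpha}, \tag{124} \]
and noticing that
\[ |\vec u_{\epsilon,\alpha}|^2\,\vec\nabla.\vec u_{\epsilon,\alpha}-2\alpha|\vec u_{\epsilon,\alpha}|^4+2p_{\epsilon,\alpha}\,\vec\nabla.\vec u_{\epsilon,\alpha}\le-\frac\epsilon2\big(|\vec u_{\epsilon,\alpha}|^4+4p_{\epsilon,\alpha}^2+2p_{\epsilon,\alpha}|\vec u_{\epsilon,\alpha}|^2\big)\le0, \tag{125} \]
we find (for some subsequence $(\epsilon_k,\alpha_k)$) that $\partial_t(|\vec u_{\epsilon,\alpha}|^2)$ converges to $\partial_t|\vec u|^2$, that $\Delta(|\vec u_{\epsilon,\alpha}|^2)$ converges to $\Delta|\vec u|^2$, that $|\vec\nabla\otimes\vec u_{\epsilon,\alpha}|^2$ converges to $|\vec\nabla\otimes\vec u|^2+\mu_1$, where $\mu_1$ is a non-negative local measure, that $\vec\nabla.((|\vec u_{\epsilon,\alpha}|^2+2p_{\epsilon,\alpha})\vec u_{\epsilon,\alpha})$ converges to $\vec\nabla.((|\vec u|^2+2p)\vec u)$ [since $\vec u_{\epsilon,\alpha}$ is locally strongly convergent in $L^3_{t,x}$], so that finally $|\vec u_{\epsilon,\alpha}|^2\,\vec\nabla.\vec u_{\epsilon,\alpha}-2\alpha|\vec u_{\epsilon,\alpha}|^4+2p_{\epsilon,\alpha}\,\vec\nabla.\vec u_{\epsilon,\alpha}$ is bound to converge in $\mathcal D'$ to some distribution $-\mu_2$, where $\mu_2$ is a non-negative local measure. This proves that the solution $\vec u$ of (1) is suitable.

Remark. A different equation has been considered by Plecháč and Šverák [PLE 03] in the case of radially symmetric compactly supported initial data. They modify Eq. (59) into
\[ \begin{cases} \partial_t\vec u_\epsilon+\vec\nabla.(\vec u_\epsilon\otimes\vec u_\epsilon)-\frac12(\vec\nabla.\vec u_\epsilon)\vec u_\epsilon=\Delta\vec u_\epsilon-\vec\nabla p_\epsilon,\\ \vec u_\epsilon|_{t=0}=\vec u_0,\\ p_\epsilon=-\frac1\epsilon\vec\nabla.\vec u_\epsilon. \end{cases} \tag{126} \]
When we write the energy balance (8), the remainder $R$ given by formula (9) is no longer equal to 0, since $\vec u_\epsilon$ is no longer divergence-free, but the bad term $|\vec u_\epsilon|^2\,\vec\nabla.\vec u_\epsilon$ has been removed. $R$ is just given by
\[ R=2p_\epsilon\,\vec\nabla.\vec u_\epsilon=-2\epsilon p_\epsilon^2\le0. \tag{127} \]
However, we preferred to study the modified Vishik and Fursikov equations (76), since the damping ensures that the solution $\vec u_{\epsilon,\alpha}$ is locally $L^4_tL^4_x$, so that we may use the energy estimates more easily.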
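To see why the extra term in (126) removes the bad contribution, one can compute the nonlinear part of the energy balance componentwise. The following is only a sketch, with the index $\epsilon$ dropped and the convention $(\vec\nabla.(\vec u\otimes\vec u))_i=\sum_j\partial_j(u_ju_i)$ assumed:
\[ 2\vec u.\big(\vec\nabla.(\vec u\otimes\vec u)\big)=2\sum_{i,j}u_i\,\partial_j(u_ju_i)=2|\vec u|^2\,\vec\nabla.\vec u+\sum_ju_j\,\partial_j|\vec u|^2=\vec\nabla.(|\vec u|^2\vec u)+|\vec u|^2\,\vec\nabla.\vec u, \]
\[ 2\vec u.\Big(-\frac12(\vec\nabla.\vec u)\vec u\Big)=-|\vec u|^2\,\vec\nabla.\vec u. \]
The two contributions $\pm|\vec u|^2\,\vec\nabla.\vec u$ cancel, leaving the pure divergence $\vec\nabla.(|\vec u|^2\vec u)$. The pressure term gives $-2\vec u.\vec\nabla p=-\vec\nabla.(2p\vec u)+2p\,\vec\nabla.\vec u$, and with $\vec\nabla.\vec u=-\epsilon p$ the only non-divergence term left is $2p\,\vec\nabla.\vec u=-2\epsilon p^2$, which is the remainder (127).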


7. Scaling and Extractions

The solutions to Eqs. (76) in Theorems 8, 9 and 10 are constructed through a constant reiteration of the following Rellich compactness criterion [LEM 02]:

Lemma 1. If $T\in(0,+\infty)$ and if $(v_\theta)_{\theta>0}$ is a family of distributions on $(0,T)\times\mathbb R^3$ such that, for some positive $s$ and $\sigma$, for every $\varphi\in\mathcal D((0,T)\times\mathbb R^3)$ the family $(\varphi v_\theta)_{\theta>0}$ is bounded in $L^2_t((0,T),H^s_x)$ and the family $(\partial_t(\varphi v_\theta))_{\theta>0}$ is bounded in $L^2((0,T),H^{-\sigma}_x)$, then there exists a sequence $(\theta_n)_{n\in\mathbb N}$ such that $\lim_{n\to+\infty}\theta_n=0$ and such that $v_{\theta_n}$ converges to a distribution $v\in\mathcal D'((0,T)\times\mathbb R^3)$ and $v_{\theta_n}$ converges strongly to $v$ in $L^2_tL^2_x$ norm on every compact subset of $(0,T)\times\mathbb R^3$.

Let us explain the way solutions to (76) are constructed when the initial data belongs to $\dot M^{2,3}$. Theorem 8 will give us a solution on $(0,T)\times\mathbb R^3$, where $T$ is bounded from below in a way that depends only on $\epsilon$ and on the norm of $\vec u_0$ in $\dot M^{2,3}$. We first try to construct a solution $\vec u_\gamma$ which we may control on $(0,\gamma^{-2}T)\times\mathbb R^3$ by considering Eq. (76) with initial data $\vec u_{0,\gamma}=\gamma^{-1}\vec u_0(\gamma^{-1}x)$; Theorem 8 gives a solution $\vec v_\gamma$ on $(0,T)\times\mathbb R^3$ associated to $\vec u_{0,\gamma}$, then by rescaling a solution $\vec u_\gamma=\gamma\vec v_\gamma(\gamma^2t,\gamma x)$ associated to $\vec u_0$ and defined on $(0,\gamma^{-2}T)\times\mathbb R^3$. If we want to use Lemma 1 to extract a global solution, we need to provide for every $R>1$ a uniform control of $\vec u_\gamma$ on every compact subset of $(0,RT)\times\mathbb R^3$ for $\gamma\le R^{-2}$.

In order to control $\vec u_\gamma$, we need to understand how $\vec v_\gamma$ is produced. We truncate $\vec u_{0,\gamma}$ by multiplication with a function $\theta(x/R)$, where $\theta\in\mathcal D(\mathbb R^3)$ is equal to 1 on a neighbourhood of 0, then we project the truncated initial value on the solenoidal vector fields through Leray's projection operator and get $\vec u_{0,\gamma,R}=P(\theta(x/R)\vec u_{0,\gamma})$. This is not a good way to process an initial value in $L^2_{uloc}$ [BAS 06] since the Leray projection operator is not bounded on $L^2_{uloc}$, but this is a good way for an initial value in $\dot M^{2,3}$: we have $\|\vec u_{0,\gamma,R}\|_{\dot M^{2,3}}\le C\|\vec u_0\|_{\dot M^{2,3}}$ for a constant $C$ which does not depend on $\vec u_0$, $\gamma$ or $R$, and we have the *-weak convergence of $\vec u_{0,\gamma,R}$ to $\vec u_{0,\gamma}$. Theorem 8 gives a solution $\vec v_{\gamma,R}$ on $(0,T)\times\mathbb R^3$ associated to $\vec u_{0,\gamma,R}$, and we have indeed a uniform control of $\vec v_{\gamma,R}$ on every compact subset of $(0,T)\times\mathbb R^3$, this control depending only on $\|\vec u_{0,\gamma,R}\|_{\dot M^{2,3}}$ (and thus on $\|\vec u_0\|_{\dot M^{2,3}}$). If we look directly at the consequences of Theorem 8 however, this control does not appear precise enough to control $\vec u_\gamma$ uniformly with respect to $\gamma$: roughly speaking, we get a control of the $L^\infty_tL^2_x\cap L^2_tH^1_x$ norm of $\vec v_\gamma$ only on domains $(0,T)\times B(x_0,1)$ (with an estimate $O(1)$), hence of the $L^2_tL^2_x\cap L^2_tH^1_x$ norm of $\vec u_\gamma$ on domains $(0,\gamma^{-2}T)\times B(x_0,\gamma^{-1})$ with estimates $O(\gamma^{-1/2})$.

In order to get a better control on $\vec u_\gamma$, we need to explain more precisely how $\vec v_{\gamma,R}$ is produced. Following [LEL 10], we consider a mollified equation
\[ \begin{cases} \partial_t\vec v_{\gamma,R,\eta}+\omega_\eta*\big((\vec v_{\gamma,R,\eta,\infty}.\vec\nabla)\vec v_{\gamma,R,\eta,\infty}\big)=\Delta\vec v_{\gamma,R,\eta}-\alpha\,\omega_\eta*\big(|\vec v_{\gamma,R,\eta,\infty}|^2\vec v_{\gamma,R,\eta,\infty}\big)-\vec\nabla p_{\gamma,R,\eta},\\ \vec v_{\gamma,R,\eta}|_{t=0}=\vec u_{0,\gamma,R},\\ \vec v_{\gamma,R,\eta,\infty}=\omega_\eta*\vec v_{\gamma,R,\eta},\\ p_{\gamma,R,\eta}=-\frac1\epsilon\,\omega_\eta*\vec\nabla.\vec v_{\gamma,R,\eta,\infty}, \end{cases} \tag{128$_\eta$} \]
where $\omega\in\mathcal D(\mathbb R^3)$ satisfies $\int\omega(x)\,dx=1$ and where $\omega_\eta(x)=\eta^{-3}\omega(\eta^{-1}x)$. We get (through energy estimates) a control of the solution $\vec v_{\gamma,R,\eta}$ in $L^\infty_tL^2_x\cap L^2\dot H^1_x$ norm uniformly with respect to $\eta>0$ on $(0,+\infty)\times\mathbb R^3$. Lemma 1 then gives Theorem 8. Note that for Eqs. (128$_\eta$) we have uniqueness of the solution.
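The contrast between (75) and (76) under rescaling can be checked term by term. As a sketch, take the Navier–Stokes scaling used throughout this section, $\vec u_\mu(t,x)=\mu\vec u(\mu^2t,\mu x)$ and $p_\mu(t,x)=\mu^2p(\mu^2t,\mu x)$ (the subscript $\mu$ is notation introduced here), and count powers of $\mu$, every right-hand side being evaluated at $(\mu^2t,\mu x)$:
\[ \partial_t\vec u_\mu=\mu^3\,\partial_t\vec u,\quad(\vec u_\mu.\vec\nabla)\vec u_\mu=\mu^3\,(\vec u.\vec\nabla)\vec u,\quad\Delta\vec u_\mu=\mu^3\,\Delta\vec u,\quad\vec\nabla p_\mu=\mu^3\,\vec\nabla p, \]
\[ \alpha|\vec u_\mu|^2\vec u_\mu=\mu^3\,\alpha|\vec u|^2\vec u,\qquad\alpha|\vec u_\mu|^4\vec u_\mu=\mu^5\,\alpha|\vec u|^4\vec u,\qquad\vec\nabla.\vec u_\mu=\mu^2\,\vec\nabla.\vec u. \]
Every term of (76) scales like $\mu^3$, and the pressure law $p=-\frac1\epsilon\vec\nabla.\vec u$ is preserved since both sides scale like $\mu^2$, so $(\vec u_\mu,p_\mu)$ solves (76) with the same $\epsilon$ and $\alpha$. In (75) the damping term scales like $\mu^5$ instead, which is the loss of invariance that motivates replacing $|\vec u|^4$ by $|\vec u|^2$.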


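We shall call $S_{\gamma,R}$ the set of solutions $\vec v$ to (76) which are obtained from the initial value $\vec u_{0,\gamma,R}$ as a weak limit $\lim_{\eta_n\to0}\vec v_{\gamma,R,\eta_n}$. Now, the precise mechanism of the proof is the following one: for $\mu>0$, let us consider $\mu\vec v_{\gamma,R,\eta}(\mu^2t,\mu x)$; Eqs. (128$_\eta$) are no longer invariant through the rescaling, because of the convolutions with $\omega_\eta$; we have to rescale $\omega_\eta$ into $\omega_{\eta/\mu}$; thus $\mu\vec v_{\gamma,R,\eta}(\mu^2t,\mu x)$ is a solution to Eqs. (128$_{\eta/\mu}$) with initial value $\mu\vec u_{0,\gamma,R}(\mu x)=\vec u_{0,\gamma/\mu,R/\mu}(x)$. Thus, we find that
\[ \mu\vec v_{\gamma,R,\eta}(\mu^2t,\mu x)=\vec v_{\gamma/\mu,R/\mu,\eta/\mu}(t,x). \tag{129} \]
Using another process of extractions, we find that for each $\vec v\in S_{\gamma,R}$ and every $\mu>0$ there exists $\vec w\in S_{\gamma/\mu,R/\mu}$ such that
\[ \mu\vec v(\mu^2t,\mu x)=\vec w(t,x). \tag{130} \]
We write $Q_{T,x_0,\rho}=(0,T)\times B(x_0,\rho)$. We get through the proof of Theorem 9 [LEL 10] that
\[ \sup_{\gamma>0,R>0}\ \sup_{\vec v\in S_{\gamma,R}}\ \sup_{x_0\in\mathbb R^3}\ \|\vec v\|_{L^\infty_tL^2_x(Q_{T,x_0,1})}+\|\vec\nabla\otimes\vec v\|_{L^2_tL^2_x(Q_{T,x_0,1})}\le C(\epsilon,\|\vec u_0\|_{\dot M^{2,3}}). \tag{131} \]
Using (130) and (131) we get
\[ \sup_{\gamma>0,R>0,\mu>0}\ \sup_{\vec v\in S_{\gamma,R}}\ \sup_{x_0\in\mathbb R^3}\ \frac{\|\vec v\|_{L^\infty_tL^2_x(Q_{\mu^2T,x_0,\mu})}+\|\vec\nabla\otimes\vec v\|_{L^2_tL^2_x(Q_{\mu^2T,x_0,\mu})}}{\sqrt\mu}\le C(\epsilon,\|\vec u_0\|_{\dot M^{2,3}}). \tag{132} \]
Thus, we have for any $\gamma>0$ and $R>0$ a solution $\vec v_{\gamma,R}$ of (76) (with initial value $\vec u_{0,\gamma,R}$) on the strip $(0,T)\times\mathbb R^3$ with a uniform control given by (131) and (132):
\[ \sup_{\gamma>0,R>0}\ \sup_{x_0\in\mathbb R^3}\ \|\vec v_{\gamma,R}\|_{L^\infty_tL^2_x(Q_{T,x_0,1})}+\|\vec\nabla\otimes\vec v_{\gamma,R}\|_{L^2_tL^2_x(Q_{T,x_0,1})}\le C(\epsilon,\|\vec u_0\|_{\dot M^{2,3}}), \tag{133} \]
and
\[ \sup_{\gamma>0,R>0,0<\mu\le1}\ \sup_{x_0\in\mathbb R^3}\ \frac{\|\vec v_{\gamma,R}\|_{L^\infty_tL^2_x(Q_{\mu^2T,x_0,\mu})}+\|\vec\nabla\otimes\vec v_{\gamma,R}\|_{L^2_tL^2_x(Q_{\mu^2T,x_0,\mu})}}{\sqrt\mu}\le C(\epsilon,\|\vec u_0\|_{\dot M^{2,3}}). \tag{134} \]
Letting $1/R$ go to 0, we apply Lemma 1 and extract a weak limit $\vec v_\gamma$ on $(0,T)\times\mathbb R^3$ with strong local convergence in the $L^2_tL^2_x$ norm. $\vec v_\gamma$ is a solution of (76) (with initial value $\vec u_{0,\gamma}$) on the strip $(0,T)\times\mathbb R^3$ and, from (133) and (134), we get
\[ \sup_{\gamma>0}\ \sup_{x_0\in\mathbb R^3}\ \|\vec v_\gamma\|_{L^\infty_tL^2_x(Q_{T,x_0,1})}+\|\vec\nabla\otimes\vec v_\gamma\|_{L^2_tL^2_x(Q_{T,x_0,1})}\le C(\epsilon,\|\vec u_0\|_{\dot M^{2,3}}), \tag{135} \]
and
\[ \sup_{\gamma>0,0<\mu\le1}\ \sup_{x_0\in\mathbb R^3}\ \frac{\|\vec v_\gamma\|_{L^\infty_tL^2_x(Q_{\mu^2T,x_0,\mu})}+\|\vec\nabla\otimes\vec v_\gamma\|_{L^2_tL^2_x(Q_{\mu^2T,x_0,\mu})}}{\sqrt\mu}\le C(\epsilon,\|\vec u_0\|_{\dot M^{2,3}}). \tag{136} \]
Now, we rescale $\vec v_\gamma$ into $\vec u_\gamma$ by writing $\vec u_\gamma(t,x)=\gamma\vec v_\gamma(\gamma^2t,\gamma x)$. $\vec u_\gamma$ is a solution of (76) (with initial value $\vec u_0$) on the strip $(0,\gamma^{-2}T)\times\mathbb R^3$ and, from (136), we get
\[ \sup_{\gamma>0,0<\mu\le1}\ \sup_{x_0\in\mathbb R^3}\ \frac{\|\vec u_\gamma\|_{L^\infty_tL^2_x(Q_{\mu^2\gamma^{-2}T,x_0,\mu\gamma^{-1}})}+\|\vec\nabla\otimes\vec u_\gamma\|_{L^2_tL^2_x(Q_{\mu^2\gamma^{-2}T,x_0,\mu\gamma^{-1}})}}{\sqrt{\mu\gamma^{-1}}}\le C(\epsilon,\|\vec u_0\|_{\dot M^{2,3}}). \tag{137} \]
Now, we let $\gamma$ go to 0. We have uniform control for larger and larger domains as $\gamma$ goes to 0; then, Lemma 1 and a diagonal extraction process allow us to extract a weak limit $\vec u$ on $(0,+\infty)\times\mathbb R^3$ with strong local convergence in $L^2_tL^2_x$ norm. $\vec u$ is a solution of (76) (with initial value $\vec u_0$) on the strip $(0,+\infty)\times\mathbb R^3$ and, from (133) and (137), we get
\[ \sup_{R>0}\ \sup_{x_0\in\mathbb R^3}\ \frac{\|\vec u\|_{L^\infty_tL^2_x(Q_{R^2T,x_0,R})}+\|\vec\nabla\otimes\vec u\|_{L^2_tL^2_x(Q_{R^2T,x_0,R})}}{\sqrt R}\le C(\epsilon,\|\vec u_0\|_{\dot M^{2,3}}). \tag{138} \]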

8. Conclusion

In the search for self-similar solutions $\vec u$ for Eqs. (1), in the case of large initial data, a strategy could be to investigate the existence of self-similar solutions $\vec u_{\epsilon,\alpha}$ for Eqs. (76). If we were able to prove that (76) has self-similar solutions for every $\epsilon>0$ and $\alpha>0$ (with $\epsilon<4\alpha$), then Theorem 10 would give us a self-similar solution to (1). The small benefits we may find in studying (76) instead of (1) are the energy equality and the simple expression of the pressure. But the problem of finding large self-similar solutions of (76) is very similar to the problem of finding large self-similar solutions to (1): the self-similarity precludes any contractivity argument (in contrast to the case of small initial data), and thus any easy uniqueness argument to prove self-similarity.

References

[BAS 06] Basson, A.: Homogeneous statistical solutions and local energy inequality for 3D Navier–Stokes equations. Commun. Math. Phys. 266, 17–35 (2006)
[CAF 82] Caffarelli, L., Kohn, R., Nirenberg, L.: Partial regularity of suitable weak solutions of the Navier–Stokes equations. Comm. Pure Appl. Math. 35, 771–831 (1982)
[CAN 95] Cannone, M.: Ondelettes, paraproduits et Navier–Stokes. Paris: Diderot Editeur, 1995
[GRU 06] Grujić, Z.: Regularity of forward-in-time self-similar solutions to the 3D Navier–Stokes equations. Discrete Cont. Dyn. Systems 14, 837–843 (2006)
[KAT 84] Kato, T.: Strong $L^p$ solutions of the Navier–Stokes equations in $\mathbb R^m$ with applications to weak solutions. Math. Zeit. 187, 471–480 (1984)
[LEL 10] Lelièvre, F.: A scaling and energy equality preserving approximation for the 3D Navier–Stokes equations in the finite energy case. Nonlinear Anal. TMA (2011, to appear)
[LEM 99] Lemarié-Rieusset, P.G.: Solutions faibles d'énergie infinie pour les équations de Navier–Stokes dans $\mathbb R^3$. C. R. Acad. Sc. Paris 328, série I, 1133–1138 (1999)
[LEM 02] Lemarié-Rieusset, P.G.: Recent developments in the Navier–Stokes problem. Boca Raton, FL: Chapman & Hall/CRC, 2002
[LEM 07] Lemarié-Rieusset, P.G.: The Navier–Stokes equations in the critical Morrey–Campanato space. Rev. Mat. Iberoamer. 23, 897–930 (2007)
[LER 34] Leray, J.: Essai sur le mouvement d'un fluide visqueux emplissant l'espace. Acta Math. 63, 193–248 (1934)
[PLE 03] Plecháč, P., Šverák, V.: Singular and regular solutions of a nonlinear parabolic system. Nonlinearity 16, 2093–2097 (2003)
[PRA 07] Pradolini, G., Salinas, O.: Commutators of singular integrals on spaces of homogeneous type. Czech. Math. J. 57, 75–93 (2007)
[SCH 77] Scheffer, V.: Hausdorff measures and the Navier–Stokes equations. Commun. Math. Phys. 55, 97–112 (1977)
[STE 93] Stein, E.M.: Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals. Princeton, NJ: Princeton University Press, 1993
[VIS 77] Vishik, M.I., Fursikov, A.V.: Solutions statistiques homogènes des systèmes différentiels paraboliques et du système de Navier–Stokes. Ann. Scuola Norm. Sup. Pisa, série IV, IV, 531–576 (1977)

Communicated by P. Constantin

Commun. Math. Phys. 307, 157–183 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1306-x

Communications in

Mathematical Physics

Effective Stability for Gevrey and Finitely Differentiable Prevalent Hamiltonians Abed Bounemoura IMPA, Estrada Dona Castorina 110, 22460-320 Rio de Janeiro, Brazil. E-mail: [email protected] Received: 24 November 2010 / Accepted: 17 March 2011 Published online: 28 July 2011 – © Springer-Verlag 2011

Abstract: For perturbations of integrable Hamiltonian systems, the Nekhoroshev theorem shows that all solutions are stable for an exponentially long interval of time, provided the integrable part satisfies a steepness condition and the system is analytic. This fundamental result has been extended in two distinct directions. The first one is due to Niederman, who showed that under the analyticity assumption, the result holds true for a prevalent class of integrable systems which is much wider than the steep systems. The second one is due to Marco-Sauzin but it is limited to quasi-convex integrable systems, for which they showed exponential stability if the system is assumed to be only Gevrey regular. If the system is finitely differentiable, we showed polynomial stability, still in the quasi-convex case. The goal of this work is to generalize all these results in a unified way, by proving exponential or polynomial stability for Gevrey or finitely differentiable perturbations of prevalent integrable Hamiltonian systems.

1. Introduction

1. Consider a near-integrable Hamiltonian system, that is a perturbation of an integrable Hamiltonian system, which is of the form
\[ H(\theta,I)=h(I)+f(\theta,I),\qquad|h|=1,\quad|f|<\varepsilon\ll1. \]
Here $(\theta,I)\in\mathbb T^n\times\mathbb R^n$ are angle-action coordinates, and $f$ is a small perturbation, of size $\varepsilon$, in some suitable topology defined by a norm $|\,.\,|$. In the absence of perturbation, that is when $\varepsilon$ is zero, the action variables $I(t)$ are integrals of motion and all solutions are quasi-periodic. Therefore it is a natural question, which is in fact motivated by concrete problems of stability in celestial mechanics and in mathematical physics in general, to study the evolution of the action variables $I(t)$ after perturbation, that is for $\varepsilon>0$ but arbitrarily small.


2. If the system is analytic and if $h$ satisfies a steepness condition, which is a quantitative transversality condition, it is a remarkable result due to Nekhoroshev ([Nek77,Nek79]) that the action variables are stable for an exponentially long interval of time with respect to the inverse of the size of the perturbation: one has
\[ |I(t)-I_0|\le c_1\varepsilon^b,\qquad|t|\le\exp(c_2\varepsilon^{-a}), \]
for some positive constants $c_1$, $c_2$, $a$ and $b$, and provided that the size of the perturbation $\varepsilon$ is smaller than a threshold $\varepsilon_0$. This is a major result in the theory of perturbations of Hamiltonian systems, which gives additional information to two previous and equally important results.

The first one is that "most" solutions (in a measure-theoretical sense) are quasi-periodic and close to the unperturbed ones, hence stable for all time in the sense that
\[ |I(t)-I_0|\le c\sqrt\varepsilon,\qquad t\in\mathbb R, \]
for some positive constant $c$, provided that the system is sufficiently regular (analytic, Gevrey or even $C^k$ for $k>2n$), $h$ satisfies a mild non-degeneracy assumption and $\varepsilon$ is smaller than a threshold $\varepsilon_0$. This is the content of KAM theory, see [Kol54] for the original statement and, for instance, [Rüs01, Sal04 and Pop04] among the enormous literature, for various improvements regarding the hypotheses of non-degeneracy and regularity. Hence Nekhoroshev estimates give new information for solutions living on the part of phase space not covered by KAM theory, and even though the latter has a small measure, it is usually topologically large (at least for $n\ge3$).

The second important previous result is that there exist "unstable" solutions, satisfying
\[ |I(\tau)-I_0|\ge1,\qquad\tau=\tau(\varepsilon)>0. \]
This was discovered by Arnold in his famous paper [Arn64], and it has become widely known as "Arnold diffusion". Even though this phenomenon has been intensively studied, very little is known.
Here Nekhoroshev estimates give an exponentially large lower bound on the time of instability τ (ε), explaining (in part) why such instability properties are so hard to detect. 3. Now returning to Nekhoroshev estimates, the original proof is rather long and complicated. It is naturally divided into two parts. The first part, which is analytic, is the construction of general resonant normal forms, up to an exponentially small remainder (this is where the analyticity of the system is used), on local domains of the action space where one has a suitable control on the so-called small divisors. Then, the second part, which is geometric, consists in the construction of a partition of the action space where one can use such normal forms. This is where the steepness of the integrable system enters, basically it rules out the existence of solutions which cannot be controlled by these normal forms, so eventually the exponentially small remainders easily translate into an exponentially long time of stability for all solutions. 4. It has been noticed by the Italian school ([BGG85,BG86]) that using preservation of energy, the geometric part of the proof can be simplified for the simplest steep Hamiltonians, namely strictly convex or strictly quasi-convex Hamiltonians (recall that quasi-convexity means that the energy sub-levels are convex subsets). Then, much work has been devoted to this special case. In particular, Lochak introduced in [Loc92] a new method leading in particular to an extremely simple and elegant proof of these estimates under the quasi-convexity assumption. His approach only relies on averaging


along periodic frequencies, which enables him to construct special resonant normal forms (periodic frequencies are just resonant frequencies of codimension-one multiplicities), and the use of the most basic result in simultaneous Diophantine approximation, namely Dirichlet’s theorem, to cover the whole action space by domains where such special normal forms can be used. This also brought to light the surprising phenomenon of “stabilization by resonances”, which implies that the more resonant the initial condition is, the more stable (in a finite time-scale) the solution will be. This method had several applications, and an important one, that we shall be concerned with here, was the extension of these stability estimates for non-analytic systems. Indeed, using Lochak’s strategy, it was shown in [MS02] that the exponential estimates are also satisfied for Gevrey regular systems, and in [Bou10b] it was proved that polynomial estimates hold true if the system is only of finite differentiability (this is obviously the best one can expect under such a weak regularity assumption). Let us point out that for non-analytic systems, the construction of these normal forms, which is the only new ingredient one has to add since the geometric part of the proof is insensible to the regularity of the system, is more difficult than for an analytic system (this is especially true for Gevrey systems). Indeed, one cannot simply work with C 0 -norms (on some complex strip, then by the usual Cauchy estimates one has a control on all derivatives on smaller complex strips), so one has to work directly with all the derivatives (and moreover keep a control on the growth of these derivatives in the Gevrey case). 5. However, all these results were restricted by the quasi-convexity hypothesis. A study of Nekhoroshev estimates under more general assumptions on the integrable part has been initiated by Niederman. 
First, in [Nie04], he introduced new geometric arguments based on simultaneous Diophantine approximation, leading to a great simplification in the geometric part of the proof of Nekhoroshev’s result under the steepness condition. Then, in [Nie07], he realized that this method in fact yields the result for a much wider class of unperturbed Hamiltonians, which he called “Diophantine steep”, and which are prevalent (in the sense of Hunt, Sauer and Yorke; recall that prevalence is a possible generalization of full Lebesgue measure for infinite dimensional linear spaces). However, the analytic part of Niederman’s proof was still based on averaging along general frequencies and hence required the construction of general resonant normal forms (which was taken from [Pös93]). In fact, in the non-convex case, a simple averaging along a periodic frequency, which corresponds to studying the dynamics in the neighbourhood of a resonance of codimension-one multiplicity, cannot be enough since solutions will necessarily explore resonances associated to different, and possibly all, multiplicities. But then in [BN10] we were able to construct normal forms associated to any multiplicities by making suitable compositions of periodic averagings, with periodic vectors which are independent and sufficiently close to each other. This was an extension of Lochak’s method, in the sense that no small divisors were involved: only (a composition of) periodic averagings and simultaneous Diophantine approximation were used. This proof was not only simpler than the previous one, but also opened the way to several applications. For instance, in [Bou10a], it was shown how one can easily obtain more general results of stability in the vicinity of linearly stable quasi-periodic invariant tori.

6. The aim of this paper is to extend the above results by proving stability estimates for Hamiltonian systems with a prevalent integrable part, but which are not necessarily analytic.
In the Gevrey case, this will lead to exponential estimates of stability for perturbation of a generic integrable Hamiltonian, as stated below.


Theorem 1.1. For $\alpha \ge 1$, consider an arbitrary $\alpha$-Gevrey integrable Hamiltonian $h$ defined on an open ball in $\mathbb{R}^n$. Then for almost any $\xi \in \mathbb{R}^n$, the integrable Hamiltonian $h_\xi(I) = h(I) - \xi.I$ is exponentially stable.

This will be a direct consequence of Theorems 2.2 and 2.4 below. This result generalizes the main results of [Nie07] and [BN10] which were restricted by the analyticity assumption ($\alpha = 1$), and the main stability result of [MS02] which was restricted by the quasi-convexity assumption (our condition on the integrable part is much more general than quasi-convexity). In the finitely differentiable case, we will obtain polynomial estimates of stability for perturbation of a generic integrable Hamiltonian.

Theorem 1.2. For $k > 2n + 2$, consider an arbitrary $C^k$ integrable Hamiltonian $h$ defined on an open ball in $\mathbb{R}^n$. Then for almost any $\xi \in \mathbb{R}^n$, the integrable Hamiltonian $h_\xi(I) = h(I) - \xi.I$ is polynomially stable.

Once again, this will be a direct consequence of Theorems 2.2 and 2.5 below, and the above result extends the main result of [Bou10b] which was only valid for quasi-convex integrable systems.

2. Main Results

In order to state our results, we now describe our setting more precisely. We let $B_R = B(0, R)$ be the open ball of $\mathbb{R}^n$, centered at the origin, of radius $R > 0$ with respect to the supremum norm $|\,.\,|$. Our phase space will be the domain $D_R = \mathbb{T}^n \times B_R$.

1. Let us first explain our prevalent condition on the unperturbed Hamiltonian $h$, which comes from [BN10]. Let $G(n, k)$ be the Grassmannian of all vector subspaces of $\mathbb{R}^n$ of dimension $k$. We equip $\mathbb{R}^n$ with the Euclidean scalar product, $\|\,.\,\|$ stands for the Euclidean norm, and given an integer $L \in \mathbb{N}^*$, we define $G^L(n, k)$ as the subset of $G(n, k)$ consisting of those subspaces whose orthogonal complement can be spanned by vectors $k \in \mathbb{Z}^n \setminus \{0\}$ with $|k|_1 \le L$, where $|\,.\,|_1$ is the $\ell^1$-norm.

Definition 2.1. A function $h \in C^2(B_R)$ is said to be Diophantine Morse if there exist $\gamma > 0$ and $\tau \ge 0$ such that for any $L \in \mathbb{N}^*$, any $k \in \{1, \dots, n\}$ and any $\Lambda \in G^L(n, k)$, there exists $(e_1, \dots, e_k)$ (resp. $(f_1, \dots, f_{n-k})$), an orthonormal basis of $\Lambda$ (resp. of $\Lambda^\perp$), such that the function $h_\Lambda$ defined on $B_R$ by

$$h_\Lambda(\alpha, \beta) = h(\alpha_1 e_1 + \cdots + \alpha_k e_k + \beta_1 f_1 + \cdots + \beta_{n-k} f_{n-k})$$

satisfies the following property: for any $(\alpha, \beta) \in B_R$,

$$\|\partial_\alpha h_\Lambda(\alpha, \beta)\| \le \gamma L^{-\tau} \implies \|\partial_{\alpha\alpha} h_\Lambda(\alpha, \beta).\eta\| > \gamma L^{-\tau} \|\eta\|$$

for any $\eta \in \mathbb{R}^n \setminus \{0\}$. In other words, for any $(\alpha, \beta) \in B_R$, we have the following alternative: either $\|\partial_\alpha h_\Lambda(\alpha, \beta)\| > \gamma L^{-\tau}$ or $\|\partial_{\alpha\alpha} h_\Lambda(\alpha, \beta).\eta\| > \gamma L^{-\tau} \|\eta\|$ for any $\eta \in \mathbb{R}^n \setminus \{0\}$.

This technical definition is basically a quantitative transversality condition which is stated in adapted coordinates. It is inspired on the one hand by the steepness condition introduced by Nekhoroshev ([Nek77]) where one has to look at the projection of the gradient map $\nabla h$ onto affine subspaces, and on the other hand by the quantitative Morse-Sard theory of Yomdin ([Yom83,YC04]) where critical or “nearly-critical” points of $h$ have


to be quantitatively non degenerate. The above condition is in fact equivalent to the condition introduced by Niederman in [Nie07]: there the author considered the subset $\tilde{G}^L(n, k)$ of $G(n, k)$ consisting of those subspaces which can be spanned by vectors $k \in \mathbb{Z}^n \setminus \{0\}$ with $|k|_1 \le L$, but one can check that $G^L(n, k)$ is included in $\tilde{G}^{L^k}(n, k)$ (similarly, $\tilde{G}^L(n, k)$ is included in $G^{L^k}(n, k)$). Hence we will stick with the terminology “Diophantine Morse” of [Nie07], and the equivalent term “Simultaneous Diophantine Morse” introduced in [BN10] will not be used any more. The set of Diophantine Morse functions on $B_R$ with respect to $\gamma > 0$ and $\tau \ge 0$ will be denoted by $DM_\gamma^\tau(B_R)$, and we will also use the notations

$$DM^\tau(B_R) = \bigcup_{\gamma > 0} DM_\gamma^\tau(B_R), \quad DM(B_R) = \bigcup_{\tau \ge 0} DM^\tau(B_R).$$
We recall the following two results from [Nie07] (see also [BN10]).

Theorem 2.2. Let $\tau > 2(n^2 + 1)$ and $h \in C^{2n+2}(B_R)$. Then for Lebesgue almost all $\xi \in \mathbb{R}^n$, the function $h_\xi(I) = h(I) - \xi.I$ belongs to $DM^\tau(B_R)$.

We already mentioned that there is a good notion of “full measure” in an infinite dimensional vector space, which is called prevalence (see [OY05] and [HK10] for nice surveys), and the previous theorem has the following immediate corollary.

Corollary 2.3. For $\tau > 2(n^2 + 1)$, $DM^\tau(B_R)$ is prevalent in $C^{2n+2}(B_R)$.

2. Now let us introduce our regularity assumption, starting with the Gevrey case. Given $\alpha \ge 1$ and $L > 0$, a real-valued function $H \in C^\infty(D_R)$ is $(\alpha, L)$-Gevrey if, using the standard multi-index notation, we have

$$|H|_{G^{\alpha,L}(D_R)} = \sum_{l \in \mathbb{N}^{2n}} L^{|l|\alpha} (l!)^{-\alpha} |\partial^l H|_{C^0(D_R)} < \infty,$$

where $|\,.\,|_{C^0(D_R)}$ is the usual supremum norm for functions on $D_R$. The space of such functions, with the above norm, is a Banach algebra that we denote by $G^{\alpha,L}(D_R)$, and in the sequel we shall simply write $|\,.\,|_{\alpha,L} = |\,.\,|_{G^{\alpha,L}(D_R)}$. Analytic functions are a particular case of Gevrey functions, as one can check that $G^{1,L}(D_R)$ is exactly the space of bounded real-analytic functions on $D_R$ which extend as bounded holomorphic functions on the complex domain

$$V_L(D_R) = \{(\theta, I) \in (\mathbb{C}^n/\mathbb{Z}^n) \times \mathbb{C}^n \mid |\mathcal{I}(\theta)| < L,\ d(I, B_R) < L\},$$

where $\mathcal{I}(\theta)$ is the imaginary part of $\theta$, $|\,.\,|$ the supremum norm on $\mathbb{C}^n$ and $d$ the associated distance on $\mathbb{C}^n$.

3. Therefore we shall consider a Hamiltonian

$$H(\theta, I) = h(I) + f(\theta, I), \quad (\theta, I) \in D_R, \quad |h|_{\alpha,L} = 1, \quad |f|_{\alpha,L} < \varepsilon. \tag{$*$}$$

Our main result in the Gevrey case is the following.


Theorem 2.4. Let $H$ be as in ($*$), and assume that the integrable part $h$ belongs to $DM_\gamma^\tau(B)$, with $\tau \ge 2$ and $\gamma \le 1$. Let us define $a = b = 3^{-1}(2(n + 1)\tau)^{-n}$. Then there exists a constant $\varepsilon_0 > 0$, depending on $n$, $R$, $L$, $\alpha$, $\gamma$ and $\tau$, such that if $\varepsilon \le \varepsilon_0$, for every initial action $I(0) \in B_{R/2}$ the following estimates:

$$|I(t) - I(0)| < (n + 1)^2 \varepsilon^{b}, \quad |t| \le \exp\left(\varepsilon^{-\alpha^{-1} a}\right),$$
hold true. For α = 1, we exactly recover the main theorem of [BN10] including the value of the exponents a and b, therefore the latter result is generalized to the Gevrey classes. Moreover, quasi-convex Hamiltonians are a very particular case of our class of Morse Diophantine Hamiltonians, hence the stability result of [MS02] is also generalized, but not with the same exponents (we did not try to improve our exponents). Let us now explain several consequences of our result. First, note that the only property used on the integrable part h to derive these estimates is a specific steepness property, therefore the proof is also valid assuming a Diophantine steepness condition as in [Nie07], which is much more general than the original steepness condition of Nekhoroshev. Indeed, the class of Diophantine Morse functions (and a fortiori the class of Diophantine steep functions) contains fairly degenerate Hamiltonians, as for instance linear Hamiltonians with a Diophantine frequency, which of course are far from being steep. As a direct consequence, our main theorem also gives an alternative proof of exponential stability in the neighbourhood of a Gevrey Lagrangian quasi-periodic invariant torus, a fact which was only recently proved by Mitev and Popov in [MP10] by the construction of a Gevrey Birkhoff normal form. Then, as it was proved in [Bou10a], the method we are using is relatively intrinsic and does not depend much on the choice of coordinates. This remark is particularly useful when studying the stability in the neighbourhood of an elliptic fixed point, and more generally in the neighbourhood of a linearly stable lower-dimensional torus, under the common assumptions of isotropicity and reducibility (which are automatic for a fixed point or a Lagrangian torus). 
As in [Bou10a], one can easily prove results of exponential stability in the Gevrey case under an appropriate Diophantine condition, therefore extending the results of exponential stability obtained in [Bou10a] which were valid in the analytic case. This also gives an extension of the stability result of [MP10] which is only available for a Gevrey Lagrangian torus. In [Bou10a], using the idea of Morbidelli and Giorgilli ([MG95]) to combine Birkhoff normal forms and Nekhoroshev estimates, we also had results of super-exponential stability under a Diophantine condition on the frequency and a prevalent condition on the formal series of Birkhoff invariants. Using the Gevrey Birkhoff normal form of [MP10], we can also extend this super-exponential stability result to Gevrey classes, but only for a Lagrangian torus. For a more general linearly stable torus (isotropic, reducible), the existence of a Gevrey Birkhoff normal form undoubtedly holds true but it is still unknown. As a last remark, we would like to point out that one can also extend a fairly different result of stability, which is due to Berti, Bolle and Biasco ([BBB03]). This concerns perturbations of a priori unstable Hamiltonian systems, which have been intensively studied since instability properties in this context are much simpler to exhibit. In the analytic


case, if the size of the perturbation is $\mu$, it was proved in [BBB03] that the optimal time of instability is $\tau(\mu) \sim \mu^{-1} \ln \mu^{-1}$. The upper bound $\tau(\mu) \lesssim \mu^{-1} \ln \mu^{-1}$ follows from a specific construction of an unstable solution, while the lower bound $\tau(\mu) \gtrsim \mu^{-1} \ln \mu^{-1}$ was a consequence of a stability result, where the analyticity of the system was only necessary to apply Nekhoroshev estimates both in the quasi-convex and steep case on certain regions of the phase space (but because of the presence of “hyperbolicity”, the global stability time is far from being exponentially large). In [BP11], we introduced yet another technique (which pertains more to dynamical systems, as opposed to the variational arguments of [BBB03]) to construct a solution for which $\tau(\mu) \lesssim \mu^{-1} \ln \mu^{-1}$, but only in the Gevrey case. Now having at our disposal Nekhoroshev estimates in the Gevrey case for both quasi-convex and steep integrable systems, this implies that the lower bound $\tau(\mu) \gtrsim \mu^{-1} \ln \mu^{-1}$ can also be obtained and that the time of instability $\tau(\mu) \sim \mu^{-1} \ln \mu^{-1}$ is also optimal in the Gevrey case (in fact, using Theorem 2.5 below, this remains true if the system is $C^k$ regular, for $k$ large enough). This justifies the optimality we claimed in [BP11].

4. Let us now explain our result in the finitely differentiable case. Here we assume that $H$ is of class $C^k$, i.e. it is $k$-times differentiable and all its derivatives up to order $k$ extend continuously to the closure $\overline{D_R}$. In order to have non-trivial results, we shall assume a minimal amount of regularity, that is $k \ge n + 1$, and it will be convenient to introduce another parameter of regularity $k^* \in \mathbb{N}^*$ satisfying $k \ge k^* n + 1$. We denote by $C^k(D_R)$ the space of functions of class $C^k$ on $D_R$, which is a Banach algebra with the norm

$$|H|_{C^k(D_R)} = \sum_{|l| \le k} (l!)^{-1} |\partial^l H|_{C^0(D_R)},$$
where we have used the standard multi-index notation and where $|\,.\,|_{C^0(D_R)}$ still denotes the usual supremum norm for functions on $D_R$. Once again, for simplicity, we shall only write $|\,.\,|_k = |\,.\,|_{C^k(D_R)}$.

5. So now we consider a Hamiltonian of the form

$$H(\theta, I) = h(I) + f(\theta, I), \quad (\theta, I) \in D_R, \quad |h|_k = 1, \quad |f|_k < \varepsilon. \tag{$**$}$$

Our main result in the finitely differentiable case is the following.

Theorem 2.5. Let $H$ be as in ($**$), assume that the integrable part $h$ belongs to $DM_\gamma^\tau(B)$, with $\tau \ge 2$ and $\gamma \le 1$, and that $k \ge k^* n + 1$ for some $k^* \in \mathbb{N}^*$. Let us define $a = b = 3^{-1}(2(n + 1)\tau)^{-n}$. Then there exists a constant $\varepsilon_0 > 0$, depending on $n$, $R$, $k$, $\gamma$ and $\tau$, such that if $\varepsilon \le \varepsilon_0$, for every initial action $I(0) \in B_{R/2}$ the following estimates:

$$|I(t) - I(0)| < (n + 1)^2 \varepsilon^{b}, \quad |t| \le \varepsilon^{-k^* a},$$

hold true.
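To get a feeling for the size of the exponents in Theorems 2.4 and 2.5, the following minimal sketch simply evaluates the stated formulas. The function names and the sample values of n, τ, α, k* and ε are illustrative assumptions, and since the threshold ε0 is not explicit the numbers say nothing about whether ε ≤ ε0 actually holds.

```python
def exponents(n, tau):
    """Return (a, b) with a = b = (1/3) * (2(n+1)tau)^(-n), as in Theorems 2.4 and 2.5."""
    a = 1.0 / (3.0 * (2.0 * (n + 1) * tau) ** n)
    return a, a


def confinement_radius(n, tau, eps):
    """Right-hand side (n+1)^2 * eps^b of the bound on |I(t) - I(0)|."""
    _, b = exponents(n, tau)
    return (n + 1) ** 2 * eps ** b


def log_gevrey_time(n, tau, alpha, eps):
    """Logarithm of the Gevrey stability time exp(eps^(-a/alpha)), i.e. eps^(-a/alpha)."""
    a, _ = exponents(n, tau)
    return eps ** (-a / alpha)


def ck_time(n, tau, k_star, eps):
    """Polynomial stability time eps^(-k_star * a) of the finitely differentiable case."""
    a, _ = exponents(n, tau)
    return eps ** (-k_star * a)


if __name__ == "__main__":
    # Illustrative values only (hypothetical, not taken from the paper).
    n, tau, eps = 2, 2.0, 1e-10
    print("a = b =", exponents(n, tau)[0])
    print("confinement bound:", confinement_radius(n, tau, eps))
    print("log of Gevrey time (alpha = 2):", log_gevrey_time(n, tau, 2.0, eps))
    print("C^k time (k* = 3):", ck_time(n, tau, 3, eps))
```

For n = 2 and τ = 2 one gets a = b = 1/432, which illustrates how small the exponents already are in low dimension; this is consistent with the remark that no attempt was made to improve them.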


The above theorem extends the main result of [Bou10b], which was only valid for quasi-convex integrable Hamiltonians. Let us point out that in this result (as in the one contained in [Bou10b]) we have decided to consider only the case of integer values of $k$, but the results can also be extended to real values (that is, to Hölder spaces) and this would have given a slightly more precise time of stability in terms of the regularity of the system, but we decided not to pursue this further.

6. Let us now conclude with some notations that we shall use throughout the text. First, we have defined norms for Gevrey and $C^k$ functions, but we shall need corresponding norms for vector-valued functions (in particular for diffeomorphisms). Hence given a vector-valued function $F : D_R \to \mathbb{R}^m$, $m \in \mathbb{N}^*$ and $F = (F_1, \dots, F_m)$, we say that $F$ is $(\alpha, L)$-Gevrey if $F_i \in G^{\alpha,L}(D_R)$, for $1 \le i \le m$, and we will write $|F|_{\alpha,L} = \sum_{i=1}^m |F_i|_{\alpha,L}$. Similarly, $F$ is of class $C^k$ if $F_i \in C^k(D_R)$, for $1 \le i \le m$, and we will write $|F|_k = \sum_{i=1}^m |F_i|_k$. Then, to avoid cumbersome expressions, we will replace constants depending only on $n$, $R$, $L$, $\alpha$, $\gamma$ and $\tau$ (resp. on $n$, $R$, $k$, $\gamma$ and $\tau$) in the Gevrey case (resp. in the $C^k$ case) with a dot. More precisely, an assertion of the form “there exists a constant $c \ge 1$ depending on the above parameters such that $u < cv$” will be simply replaced with “u <· v”, when the context is clear.

3. Analytical Part

In this part, we shall describe and prove some normal forms that we will need for the proofs of Theorem 2.4 and Theorem 2.5. More precisely, Gevrey Hamiltonians will be considered in Sect. 3.1 and finitely differentiable Hamiltonians in Sect. 3.2, and eventually in Sect. 3.3 we will explain the dynamical consequences of these normal forms. But first we need to recall the following basic definition, which will be crucial to us.

Definition 3.1. A vector $\omega \in \mathbb{R}^n \setminus \{0\}$ is said to be periodic if there exists a real number $t > 0$ such that $t\omega \in \mathbb{Z}^n$.
In this case, the number $T = \inf\{t > 0 \mid t\omega \in \mathbb{Z}^n\}$ is called the period of $\omega$. The easiest example is given by a vector with rational components: when the components are written in lowest terms and the numerators have no common factor, the period is just the least common multiple of the denominators (in general one divides this number by the greatest common divisor of the numerators). Geometrically, an invariant torus carrying a linear flow with a $T$-periodic frequency vector is filled with $T$-periodic orbits.

3.1. The Gevrey case. As in [MS02], we shall start with perturbations of linear integrable Hamiltonians, for which we will obtain global normal forms (Lemma 3.4 below), and then the latter will be used to obtain local normal forms for perturbations of general Hamiltonians (Proposition 3.5 below).

1. Let $\omega_1 \in \mathbb{R}^n \setminus \{0\}$ be a $T_1$-periodic vector, and let $l_1(I) = \omega_1.I$ be the linear integrable Hamiltonian with frequency $\omega_1$. In the following, we shall consider a “large” positive integer $m \in \mathbb{N}^*$ and a “small” parameter $\mu_1 > 0$, which will eventually depend on $\varepsilon$. We shall also use a real number $\rho_1 > 0$ independent of $\varepsilon$, to be fixed below. The following result is due to Marco and Sauzin ([MS02]).
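Since the periods of the vectors ω_i enter every smallness condition used below, here is a minimal sketch of how the period of Definition 3.1 can be computed for a vector with rational components. The function name `period` is a hypothetical helper, and the computation uses exact rational arithmetic: one writes ω = v/Q with Q the least common multiple of the denominators and v an integer vector, and the period is then Q divided by the greatest common divisor of the entries of v.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def period(omega):
    """Period T = inf{t > 0 : t * omega in Z^n} of a nonzero rational vector.

    `omega` is an iterable of ints, Fractions or strings such as "1/200"
    (hypothetical input convention for this sketch).
    """
    fracs = [Fraction(x) for x in omega]
    if all(f == 0 for f in fracs):
        raise ValueError("omega must be a nonzero vector")
    # Q = least common multiple of the denominators, so that omega = v / Q.
    q = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in fracs), 1)
    v = [int(f * q) for f in fracs]
    # t * omega is an integer vector exactly when t * g / Q is an integer.
    g = reduce(gcd, (abs(x) for x in v))
    return Fraction(q, g)


if __name__ == "__main__":
    print(period([Fraction(1, 2), Fraction(1, 3)]))  # denominators 2 and 3
    print(period([Fraction(2, 3), Fraction(4, 3)]))  # numerators share the factor 2
```

In the second example the least common multiple of the denominators is 3 while the period is 3/2, because the numerators share the factor 2; the least common multiple of the denominators gives the period exactly when the numerators have no common factor.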


Lemma 3.2. (Marco-Sauzin) Consider the Hamiltonian $H = l_1 + f$ defined on $D_{3\rho_1}$, with $f \in G^{\alpha,L}(D_{3\rho_1})$ and $|f|_{\alpha,L} < \mu_1$. Assume that

$$m T_1 \mu_1 \cdot\!< 1. \tag{1}$$

Then there exist $L_1 = CL$, for some constant $0 < C < 1$, and an $(\alpha, L_1)$-Gevrey symplectic transformation $\Phi^1 : D_{2\rho_1} \to D_{3\rho_1}$ with $|\Phi^1 - \mathrm{Id}|_{\alpha,L_1} <\!\cdot\ T_1 \mu_1$ such that

$$H^1 = H \circ \Phi^1 = l_1 + g^1 + f^1$$

with $\{g^1, l_1\} = 0$ and the estimates

$$|g^1|_{\alpha,L_1} <\!\cdot\ \mu_1, \quad |f^1|_{\alpha,L_1} <\!\cdot\ e^{-m^{1/\alpha}} \mu_1$$

hold true.

One can choose the constant $C = 16^{-1}(2n)^{\frac{1-\alpha}{\alpha}}$. The statement above is exactly Proposition 3.2 in [MS02], where the authors state their result for $m T_1 \mu_1 \cdot\!= 1$, but it also holds trivially for smaller $m$, that is when $m T_1 \mu_1 \cdot\!< 1$. The use of this artificial parameter $m \in \mathbb{N}^*$ will make subsequent arguments easier. It is perhaps useful to understand this lemma in the very special case where $\omega_1 = e_1$ is the first vector of the canonical basis of $\mathbb{R}^n$: the equality $\{g^1, l_1\} = 0$ simply means that $g^1$ is independent of the first angle $\theta_1$, and therefore the evolution of the first action component $I_1$ is only governed by the remainder $f^1$.

2. Now we are going to make suitable “compositions” of the above lemma, but first we shall explain heuristically what we are planning to do formally in the sequel. So we consider another periodic vector $\omega_2 \in \mathbb{R}^n \setminus \{0\}$, with period $T_2$, which is independent of $\omega_1$, and we let $l_2(I) = \omega_2.I$. If the suitable hypotheses are met, by Lemma 3.2 we can transform $H = l_2 + f$, where the size of $f$ is of order $\mu_2$, into

$$H^2 = H \circ \Phi^2 = l_2 + g^2 + f^2, \quad \{g^2, l_2\} = 0,$$

with $g^2$ of order $\mu_2$ and $f^2$ of order $e^{-m^{1/\alpha}} \mu_2$. Now if $\omega_2$ is close enough to $\omega_1$, that is $|\omega_2 - \omega_1| < \mu_1$, and if $\mu_2 < \mu_1$, we can write

$$H^2 = (l_2 + l_1 - l_1) + g^2 + f^2 = l_1 + (l_2 - l_1 + g^2) + f^2 = l_1 + \tilde{f} + f^2,$$

where $\tilde{f} = l_2 - l_1 + g^2$ satisfies $\{\tilde{f}, l_2\} = 0$ and its size is of order $\mu_1$. For a moment, let us forget about $f^2$, which is already exponentially small with respect to $m$, and consider $l_1 + \tilde{f}$ as a perturbation of $l_1$. Under the suitable assumptions, we can apply once again Lemma 3.2 and find a transformation $\Phi^1$ that sends $l_1 + \tilde{f}$ into $l_1 + g^1 + f^1$, where $f^1$ is exponentially small with respect to $m$ and $\{g^1, l_1\} = 0$. Now the key point is the following: as $\tilde{f}$ satisfies $\{\tilde{f}, l_2\} = 0$, $g^1$ and $f^1$ also satisfy $\{g^1, l_2\} = \{f^1, l_2\} = 0$, hence $\{g^1, l_1\} = \{g^1, l_2\} = 0$.
Indeed, denoting by $(\Phi^{l_1}_s)_{s \in \mathbb{R}}$ the Hamiltonian flow of $l_1$, it is enough to show that if $\{\tilde{f}, l_2\} = 0$ then

$$[\tilde{f}]_1 = \frac{1}{T_1} \int_0^{T_1} \tilde{f} \circ \Phi^{l_1}_s \, ds$$

and

$$\chi_1 = \frac{1}{T_1} \int_0^{T_1} (\tilde{f} - [\tilde{f}]_1) \circ \Phi^{l_1}_s \, s \, ds$$

also satisfy $\{[\tilde{f}]_1, l_2\} = 0$ and $\{\chi_1, l_2\} = 0$. This simple but important property was first used (to our knowledge) in [Bam99], then stated more clearly in [Pös99b] and rediscovered (independently) in [BN10]. It can be easily proved by direct computations, but this is a general fact in normal form theory and a nicer way to see this goes as follows. Since $\{l_1, l_2\} = 0$, the linear operators $L_{l_1} = \{., l_1\}$ and $L_{l_2} = \{., l_2\}$ commute, so that the kernel of $L_{l_2}$ is invariant by $L_{l_1}$, and as $L_{l_1}$ is semi-simple, the kernel of $L_{l_2}$ is also invariant under the projection onto the kernel of $L_{l_1}$ which is given by the map $[\,.\,]_1$. This explains why $\{[\tilde{f}]_1, l_2\} = 0$. Now $\tilde{f} - [\tilde{f}]_1$ is in the kernel of $L_{l_2}$, and its unique pre-image by $L_{l_1}$ is given by $\chi_1$, hence $\{\chi_1, l_2\} = 0$. Put differently, if a Hamiltonian (above $l_1 + \tilde{f}$) has an integral (in our case, $l_2$), then the integral is invariant under the normalizing transformation (in our case, $l_2 \circ \Phi^1 = l_2$) and $l_2$ remains an integral of the normalized Hamiltonian (that is $\{l_1 + g^1 + f^1, l_2\} = 0$). Let us state this simple algebraic property as a lemma, which complements Lemma 3.2.

Lemma 3.3. Under the assumptions of Lemma 3.2, suppose that $S$ is an integral of $H = l_1 + f$ with $\{l_1, S\} = \{f, S\} = 0$. Then, in the conclusions of Lemma 3.2, we have $S \circ \Phi^1 = S$ and $\{l_1, S\} = \{g^1, S\} = \{f^1, S\} = 0$.
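The “direct computation” alluded to above can be sketched as follows; this is a reconstruction under stated assumptions, not the argument of the cited references. We write $\Phi^{l_i}_s$ for the Hamiltonian flow of $l_i$ (a notation assumed here). Since $l_1$ and $l_2$ are linear in the actions, their flows are translations of the angles and therefore commute, and $\{\tilde f, l_2\} = 0$ amounts to $\tilde f \circ \Phi^{l_2}_t = \tilde f$ for all $t$. Then, for every $t \in \mathbb{R}$:

```latex
\begin{align*}
[\tilde f]_1 \circ \Phi^{l_2}_t
  &= \frac{1}{T_1}\int_0^{T_1} \tilde f \circ \Phi^{l_1}_s \circ \Phi^{l_2}_t \, ds
   = \frac{1}{T_1}\int_0^{T_1} \tilde f \circ \Phi^{l_2}_t \circ \Phi^{l_1}_s \, ds \\
  &= \frac{1}{T_1}\int_0^{T_1} \tilde f \circ \Phi^{l_1}_s \, ds
   = [\tilde f]_1 .
\end{align*}
```

Differentiating at $t = 0$ gives $\{[\tilde f]_1, l_2\} = 0$, whatever sign convention is used for the Poisson bracket. The same computation applied to $\tilde f - [\tilde f]_1$, with the extra weight $s$ inside the integral, gives $\chi_1 \circ \Phi^{l_2}_t = \chi_1$ and hence $\{\chi_1, l_2\} = 0$.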

Now to conclude our informal discussion, taking into account $f^2$, the map $\Phi^1$ sends $f^2$ to $f^2 \circ \Phi^1$ which remains exponentially small with respect to $m$, and so is $f_2 = f^1 + f^2 \circ \Phi^1$. Therefore

$$H^2 \circ \Phi^1 = l_1 + g^1 + f_2, \quad \{g^1, l_1\} = \{g^1, l_2\} = 0,$$

then setting $g_2 = l_1 - l_2 + g^1$, we can write again

$$H^2 \circ \Phi^1 = l_2 + g_2 + f_2, \quad \{g_2, l_1\} = \{g_2, l_2\} = 0.$$

Finally, we have found $\Phi_2 = \Phi^2 \circ \Phi^1$ such that

$$H_2 = H \circ \Phi_2 = H^2 \circ \Phi^1 = H \circ \Phi^2 \circ \Phi^1 = l_2 + g_2 + f_2$$

with $\{g_2, l_1\} = \{g_2, l_2\} = 0$ and $f_2$ is exponentially small.

3. Now let us make our previous discussion rigorous. For $i \in \{1, \dots, n\}$, let $\omega_i \in \mathbb{R}^n \setminus \{0\}$ be independent $T_i$-periodic vectors, and we denote by $l_i$ the linear integrable Hamiltonian of frequency $\omega_i$. We consider a positive integer $m \in \mathbb{N}^*$ and a sequence of small parameters $\mu_i > 0$, for $i \in \{1, \dots, n\}$. As before, $m \in \mathbb{N}^*$ and $\mu_i > 0$, for $i \in \{1, \dots, n\}$, will eventually depend on $\varepsilon$. Now to fix the ideas, we define the increasing sequence $\rho_i = 2^{i-1}$, $i \in \{1, \dots, n\}$. We shall need some assumptions on these parameters, so we define the condition $(A_i)$ for $i \in \{1, \dots, n\}$ by

$$m T_1 \mu_1 \cdot\!< 1 \tag{$A_1$}$$


and for $i \in \{2, \dots, n\}$,

$$m T_i \mu_i \cdot\!< 1, \quad |\omega_i - \omega_{i-1}| <\!\cdot\ \mu_{i-1}, \quad \mu_i \cdot\!< \mu_{i-1}. \tag{$A_i$}$$
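The conditions (A_i) are straightforward to test numerically once the implicit constants are fixed. The following minimal sketch does this under a loud assumption: the constants hidden in the dotted inequalities are not specified here, so they are all replaced by a single hypothetical parameter `c` (default 1), and the helper name `check_A` is likewise an assumption. The distance between frequencies is measured in the supremum norm.

```python
def check_A(m, T, mu, omega, c=1.0):
    """Check the conditions (A_1), ..., (A_n) with every hidden constant set to c.

    T[i], mu[i], omega[i] hold the period, the small parameter and the frequency
    vector of index i + 1. Returns one boolean per condition.
    """
    results = []
    for i in range(len(T)):
        # m * T_i * mu_i .< 1, read here as c * m * T_i * mu_i < 1.
        ok = c * m * T[i] * mu[i] < 1
        if i > 0:
            # |omega_i - omega_{i-1}| <. mu_{i-1}, in the supremum norm.
            dist = max(abs(x - y) for x, y in zip(omega[i], omega[i - 1]))
            ok = ok and dist < c * mu[i - 1]
            # mu_i .< mu_{i-1}, read here as c * mu_i < mu_{i-1}.
            ok = ok and c * mu[i] < mu[i - 1]
        results.append(ok)
    return results


if __name__ == "__main__":
    # Hypothetical data: omega_1 = (1, 0) has period 1, omega_2 = (1, 1/200) has period 200.
    T = [1.0, 200.0]
    mu = [0.01, 0.001]
    omega = [(1.0, 0.0), (1.0, 0.005)]
    print(check_A(4, T, mu, omega))   # all conditions hold for m = 4
    print(check_A(10, T, mu, omega))  # (A_2) fails: m * T_2 * mu_2 is too large
```

The example shows the trade-off built into the conditions: a periodic vector close to the previous one typically has a large period, so the parameter of the same index must be small enough for the product with m and the period to stay below 1.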

Recalling the constant $C > 0$ that appeared in Lemma 3.2, we shall also define the decreasing sequence $L_i = C^i L$, for $i \in \{0, \dots, n\}$. In the proof of the following lemma, we shall use Lemma A.1 of Appendix A.

Lemma 3.4. Let $j \in \{1, \dots, n\}$, and consider the Hamiltonian $H = l_j + f$ defined on $D_{3\rho_j}$, with $f \in G^{\alpha,L}(D_{3\rho_j})$ and $|f|_{\alpha,L} <\!\cdot\ \mu_j$. Assume that $(A_i)$ is satisfied for $i \in \{1, \dots, j\}$. Then there exists a symplectic transformation

$$\Phi_j = \Phi^j \circ \cdots \circ \Phi^1 : D_{2\rho_1} \to D_{3\rho_j},$$

where $\Phi^i : D_{2\rho_i} \to D_{3\rho_i}$ is $(\alpha, L_{j-i+1})$-Gevrey and $|\Phi^i - \mathrm{Id}|_{\alpha,L_{j-i+1}} <\!\cdot\ T_i \mu_i$ for $i \in \{1, \dots, j\}$, such that

$$H_j = H \circ \Phi_j = l_j + g_j + f_j \in G^{\alpha,L_j}(D_{2\rho_1})$$

with $\{g_j, l_i\} = 0$, for $i \in \{1, \dots, j\}$, and the estimates

$$|g_j|_{\alpha,L_j} <\!\cdot\ \mu_1, \quad |f_j|_{\alpha,L_j} <\!\cdot\ e^{-m^{1/\alpha}} \mu_1$$

hold true.

Proof. The proof goes by induction. For $j = 1$, this is nothing but Lemma 3.2 with $g_1 = g^1$ and $f_1 = f^1$. Now assume that the result holds true for some $j - 1 \in \{1, \dots, n - 1\}$, and let us show that it remains true for $j \in \{2, \dots, n\}$. By assumption, $(A_j)$ is satisfied, in particular $m T_j \mu_j \cdot\!< 1$, hence condition (1) of Lemma 3.2 is satisfied. Therefore, there exists an $(\alpha, L_1)$-Gevrey symplectic transformation $\Phi^j : D_{2\rho_j} \to D_{3\rho_j}$ with $|\Phi^j - \mathrm{Id}|_{\alpha,L_1} <\!\cdot\ T_j \mu_j$ such that

$$H^j = H \circ \Phi^j = l_j + g^j + f^j$$

with $\{g^j, l_j\} = 0$ and the estimates

$$|g^j|_{\alpha,L_1} <\!\cdot\ \mu_j, \quad |f^j|_{\alpha,L_1} <\!\cdot\ e^{-m^{1/\alpha}} \mu_j$$

hold true. Now let us introduce $\tilde{f} = l_j - l_{j-1} + g^j$. Obviously, we have $\{\tilde{f}, l_j\} = 0$. Moreover, by assumption $(A_j)$ we have $|\omega_j - \omega_{j-1}| <\!\cdot\ \mu_{j-1}$ and $\mu_j \cdot\!< \mu_{j-1}$ so that

$$|\tilde{f}|_{\alpha,L_1} \le |l_j - l_{j-1}|_{\alpha,L_1} + |g^j|_{\alpha,L_1} <\!\cdot\ |\omega_j - \omega_{j-1}| + \mu_j <\!\cdot\ \mu_{j-1}.$$


Then we can write $H^j = l_{j-1} + \tilde{f} + f^j$. Furthermore, as $D_{3\rho_{j-1}} \subseteq D_{2\rho_j}$, the Hamiltonian $l_{j-1} + \tilde{f}$ is well-defined on $D_{3\rho_{j-1}}$, and we have $\tilde{f} \in G^{\alpha,L_1}(D_{3\rho_{j-1}})$ with $|\tilde{f}|_{\alpha,L_1} <\!\cdot\ \mu_{j-1}$. Now recall that $(A_i)$ holds true for $i \in \{1, \dots, j - 1\}$, hence we can eventually apply our hypothesis of induction to the Hamiltonian $l_{j-1} + \tilde{f}$: there exists a symplectic transformation

$$\Phi_{j-1} = \Phi^{j-1} \circ \cdots \circ \Phi^1 : D_{2\rho_1} \to D_{3\rho_{j-1}},$$

where $\Phi^i : D_{2\rho_i} \to D_{3\rho_i}$ is $(\alpha, L_{j-i+1})$-Gevrey and $|\Phi^i - \mathrm{Id}|_{\alpha,L_{j-i+1}} <\!\cdot\ T_i \mu_i$ for $i \in \{1, \dots, j - 1\}$, such that

$$(l_{j-1} + \tilde{f}) \circ \Phi_{j-1} = l_{j-1} + g_{j-1} + f_{j-1} \in G^{\alpha,L_{j-1}}(D_{2\rho_1})$$

with $\{g_{j-1}, l_i\} = 0$, for $i \in \{1, \dots, j - 1\}$, and the estimates

$$|g_{j-1}|_{\alpha,L_{j-1}} <\!\cdot\ \mu_1, \quad |f_{j-1}|_{\alpha,L_{j-1}} <\!\cdot\ e^{-m^{1/\alpha}} \mu_1$$

hold true. Moreover, as $\{\tilde{f}, l_j\} = 0$, we can use Lemma 3.3 (with $S = l_j$) to obtain $\{g_{j-1}, l_j\} = 0$, and therefore $\{g_{j-1}, l_i\} = 0$ for $i \in \{1, \dots, j\}$. Then we set $\Phi_j = \Phi^j \circ \Phi_{j-1}$ so that $\Phi_j = \Phi^j \circ \cdots \circ \Phi^1 : D_{2\rho_1} \to D_{3\rho_j}$. Now

$$H_j = H \circ \Phi_j = H^j \circ \Phi_{j-1} = (l_{j-1} + \tilde{f}) \circ \Phi_{j-1} + f^j \circ \Phi_{j-1} = l_{j-1} + g_{j-1} + f_{j-1} + f^j \circ \Phi_{j-1}.$$

We will prove below that $f^j \circ \Phi_{j-1} \in G^{\alpha,L_j}(D_{2\rho_1})$, and since we know that the function $l_{j-1} + g_{j-1} + f_{j-1} \in G^{\alpha,L_{j-1}}(D_{2\rho_1})$, this easily implies that $H_j \in G^{\alpha,L_j}(D_{2\rho_1})$. Now let us define $g_j = l_{j-1} - l_j + g_{j-1}$ and $f_j = f_{j-1} + f^j \circ \Phi_{j-1}$ so that we can eventually write $H_j = l_j + g_j + f_j$. Since $\{g_{j-1}, l_i\} = 0$ for $i \in \{1, \dots, j\}$, then $\{g_j, l_i\} = 0$ for $i \in \{1, \dots, j\}$. Therefore it remains to prove the estimates. First, we have

$$\begin{aligned}
|g_j|_{\alpha,L_j} &\le |l_{j-1} - l_j|_{\alpha,L_j} + |g_{j-1}|_{\alpha,L_j} \\
&\le |l_{j-1} - l_j|_{\alpha,L_j} + |g_{j-1}|_{\alpha,L_{j-1}} \\
&<\!\cdot\ |\omega_{j-1} - \omega_j| + \mu_1 \\
&<\!\cdot\ \mu_{j-1} + \mu_1 \\
&<\!\cdot\ \mu_1.
\end{aligned}$$


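Then, we know that $|f_{j-1}|_{\alpha,L_{j-1}} <\!\cdot\ e^{-m^{1/\alpha}} \mu_1$, so we only need to estimate $|f^j \circ \Phi_{j-1}|_{\alpha,L_j}$. For that, recall that

$$\Phi_{j-1} = \Phi^{j-1} \circ \cdots \circ \Phi^1,$$

and for $i \in \{1, \dots, j - 1\}$, we have the estimates $|\Phi^i - \mathrm{Id}|_{\alpha,L_{j-i+1}} <\!\cdot\ T_i \mu_i <\!\cdot\ 1$. Therefore a repeated use of Lemma A.1 yields

$$\begin{aligned}
|f^j \circ \Phi_{j-1}|_{\alpha,L_j} &= |f^j \circ \Phi^{j-1} \circ \cdots \circ \Phi^1|_{\alpha,L_j} \\
&\le |f^j \circ \Phi^{j-1} \circ \cdots \circ \Phi^2|_{\alpha,L_{j-1}} \\
&\le |f^j \circ \Phi^{j-1} \circ \cdots \circ \Phi^3|_{\alpha,L_{j-2}} \\
&\ \ \vdots \\
&\le |f^j|_{\alpha,L_1} <\!\cdot\ e^{-m^{1/\alpha}} \mu_j.
\end{aligned}$$

Hence

$$\begin{aligned}
|f_j|_{\alpha,L_j} &\le |f_{j-1}|_{\alpha,L_j} + |f^j \circ \Phi_{j-1}|_{\alpha,L_j} \\
&\le |f_{j-1}|_{\alpha,L_{j-1}} + |f^j \circ \Phi_{j-1}|_{\alpha,L_j} \\
&<\!\cdot\ e^{-m^{1/\alpha}} (\mu_1 + \mu_j) \\
&<\!\cdot\ e^{-m^{1/\alpha}} \mu_1,
\end{aligned}$$

which is the required estimate.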

Here also, it is perhaps useful to understand this lemma in the special case where (ω1 , . . . , ωn ) is the canonical basis of Rn : the equality {g j , li } = 0 for i ∈ {1, . . . , j} means that g j is independent of the first j angles θ1 , . . . , θ j , and therefore the evolution of the first j action components I1 , . . . , I j is only governed by the remainder f j . Now in the general case, since we are assuming that (ω1 , . . . , ωn ) are linearly independent, then for j = n, gn is integrable and the action variables can only evolve according to f n . 4. Now we shall come back to our original setting (∗), that is H (θ, I ) = h(I ) + f (θ, I ), (θ, I ) ∈ D R , |h|α,L = 1, | f |α,L < ε. For i ∈ {1, . . . , n}, we still consider a sequence of Ti -periodic vectors ωi , a sequence of small parameters μi and an integer m ∈ N∗ . Let us fix i ∈ {1, . . . , n}. If we were able to find a Ti -periodic action Ii ∈ B R linked to ωi , that is satisfying ∇h(Ii ) = ωi , then on a small ball of radius μi around Ii , we could perform some standard scalings to reduce the study of perturbations of h to the study of perturbations of the linear Hamiltonian li (I ) = ωi .I , and so we could use the results of the previous section. However, in the sequel we will construct ωi , but since we are not assuming that the gradient map of h is invertible, we will not be able to construct a corresponding action. In fact, this is not a serious problem, but this is just


meant to explain why we will need to use some slightly twisted arguments below. In [BN10], we used the idea, coming from [Nie07], to define domains directly in the space of frequencies, but unfortunately this led to a rather cumbersome definition of domains. Here we shall use a simpler approach that will enable us to work in the space of actions. For $i \in \{1, \dots, n\}$, we consider a sequence of actions $I_i$ which is $\mu_i$-linked to a sequence of independent periodic vectors $\omega_i$, in the sense that $|\nabla h(I_i) - \omega_i| < \mu_i$. By the construction of our periodic vectors, such actions will indeed exist. Taking into account this sequence of actions $(I_1, \dots, I_n)$ and the size of the perturbation $\varepsilon$, we define some new assumptions $(B_i)$, for $i \in \{1, \dots, n\}$, by

$$m T_1 \mu_1 \cdot\!< 1, \quad \mu_1 \cdot\!< 1, \quad \varepsilon < \mu_1^2, \quad |\nabla h(I_1) - \omega_1| < \mu_1, \tag{$B_1$}$$

and for $i \in \{2, \dots, n\}$,

$$m T_i \mu_i \cdot\!< 1, \quad |\omega_i - \omega_{i-1}| <\!\cdot\ \mu_{i-1}, \quad \mu_i \cdot\!< \mu_{i-1}, \quad \mu_i \cdot\!< 1, \quad \varepsilon < \mu_i^2, \quad |\nabla h(I_i) - \omega_i| < \mu_i. \tag{$B_i$}$$

In the proposition below, we shall denote by $\Pi_I : D_R \to B_R$ the projection onto the action space, and in the proof we shall make use of Lemma A.2 in Appendix A.

Proposition 3.5. Suppose $H$ is as in ($*$), and assume that $(B_i)$ is satisfied for $i \in \{1, \dots, j\}$. Then there exists a $C^\infty$ symplectic transformation

$$\Phi_j : \mathbb{T}^n \times B(I_j, 2\rho_1 \mu_j) \to \mathbb{T}^n \times B(I_j, 3\rho_j \mu_j)$$

with $|\Pi_I \Phi_j - \mathrm{Id}_I|_{C^0(B(I_j, 2\rho_1 \mu_j))} \cdot\!< \mu_j$ such that

$$H \circ \Phi_j = h + g_j + f_j,$$

with $\{g_j, l_i\} = 0$, for $i \in \{1, \dots, j\}$, and the estimate

$$|\partial_\theta f_j|_{C^0(\mathbb{T}^n \times B(I_j, 2\rho_1 \mu_j))} < e^{-m^{1/\alpha}} \mu_j$$

holds true.

Note that the proof gives in fact slightly better estimates, and also an estimate on the size of the function $g_j$, but this will not be needed in the following. The case $j = 1$ is due to Marco and Sauzin ([MS02]), and together with an estimate on $g_j$ this case is sufficient to prove effective stability for quasi-convex unperturbed systems. But here we shall need this result for any $j \in \{1, \dots, n\}$: indeed, since we are assuming the periodic vectors to be independent, the above proposition will give us information on resonances of any multiplicities (the case $j \in \{1, \dots, n\}$ corresponds to a resonance of multiplicity $n - j$).


Proof. To analyze our Hamiltonian $H$ in a neighbourhood of size $\mu_j$ around $I_j$, we translate and rescale the action variables using the conformally symplectic map
$$\sigma_j : (\theta, \tilde I) \longmapsto (\theta, I) = (\theta, I_j + \mu_j \tilde I),$$
which sends the domain $D_{3\rho_j} = \mathbb{T}^n \times B_{3\rho_j}$ onto $\mathbb{T}^n \times B(I_j, 3\rho_j\mu_j)$. By the condition $\mu_j \cdot\!< 1$ in $(B_j)$, we can assume that the latter domain is included in $D_R$. Let $\tilde H = \mu_j^{-1}(H \circ \sigma_j)$ be the rescaled Hamiltonian, so $\tilde H$ is defined on $D_{3\rho_j}$ and reads
$$\tilde H(\theta, \tilde I) = \mu_j^{-1} H(\theta, I_j + \mu_j \tilde I) = \mu_j^{-1} h(I_j + \mu_j \tilde I) + \mu_j^{-1} f(\theta, I_j + \mu_j \tilde I)$$
for $(\theta, \tilde I) \in D_{3\rho_j}$. Now using Taylor's formula we can expand $h$ around $I_j$ and, assuming with no loss of generality that $h(I_j) = 0$, we obtain
$$\begin{aligned}
h(I_j + \mu_j \tilde I) &= \mu_j \nabla h(I_j).\tilde I + \mu_j^2 \int_0^1 (1-t)\nabla^2 h(I_j + t\mu_j \tilde I)\tilde I.\tilde I\,dt \\
&= \mu_j \omega_j.\tilde I + \mu_j(\nabla h(I_j) - \omega_j).\tilde I + \mu_j^2 \int_0^1 (1-t)\nabla^2 h(I_j + t\mu_j \tilde I)\tilde I.\tilde I\,dt \\
&= \mu_j \omega_j.\tilde I + \mu_j \tilde h(\tilde I),
\end{aligned}$$
where we have defined
$$\tilde h(\tilde I) = (\nabla h(I_j) - \omega_j).\tilde I + \mu_j \int_0^1 (1-t)\nabla^2 h(I_j + t\mu_j \tilde I)\tilde I.\tilde I\,dt.$$
Therefore we can write $\tilde H = l_j + \tilde f$ with
$$\tilde f = \tilde h + \mu_j^{-1}(f \circ \sigma_j).$$
Let us estimate the norm of $\tilde f$. Setting $\tilde L = L/2$, by using Lemma A.2 (with $p = 2$) we can bound the $(\alpha, \tilde L)$-norm of $\nabla^2 h$ in terms of the $(\alpha, L)$-norm of $h$. Now recalling that $|h|_{\alpha,L} = 1$, we obtain $|\nabla^2 h|_{\alpha,\tilde L} <\!\cdot\, 1$, and as $|\nabla h(I_j) - \omega_j| < \mu_j$ (this is part of assumption $(B_j)$), we have
$$|\tilde h|_{\alpha,\tilde L} <\!\cdot\, \mu_j.$$
Then, by $(B_j)$ again, $\varepsilon < \mu_j^2$, hence
$$|\mu_j^{-1}(f \circ \sigma_j)|_{\alpha,L} \le \mu_j^{-1}\varepsilon < \mu_j,$$

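and therefore
$$|\tilde f|_{\alpha,\tilde L} \le |\tilde h|_{\alpha,\tilde L} + |\mu_j^{-1}(f \circ \sigma_j)|_{\alpha,L} <\!\cdot\, \mu_j.$$
As $(B_i)$ implies $(A_i)$ for $i \in \{1,\dots,j\}$, we can apply Lemma 3.4 to the Hamiltonian $\tilde H = l_j + \tilde f$ (replacing $L$ by $\tilde L$ and $L_j$ by $\tilde L_j$): there exists a symplectic transformation
$$\tilde\Phi^j = \tilde\Phi_j \circ \cdots \circ \tilde\Phi_1 : D_{2\rho_1} \to D_{3\rho_j},$$
where $\tilde\Phi_i : D_{2\rho_i} \to D_{3\rho_i}$ is $(\alpha, \tilde L_{j-i+1})$-Gevrey and $|\tilde\Phi_i - \mathrm{Id}|_{\alpha,\tilde L_{j-i+1}} <\!\cdot\, T_i\mu_i$ for $i \in \{1,\dots,j\}$, such that
$$\tilde H_j = \tilde H \circ \tilde\Phi^j = l_j + \tilde g_j + \tilde f_j \in G^{\alpha,\tilde L_j}(D_{2\rho_1})$$
with $\{\tilde g_j, l_i\} = 0$ for $i \in \{1,\dots,j\}$, and the estimates
$$|\tilde g_j|_{\alpha,\tilde L_j} <\!\cdot\, \mu_1, \quad |\tilde f_j|_{\alpha,\tilde L_j} <\!\cdot\, e^{-m^{1/\alpha}}\mu_1$$
hold true. Moreover, if we introduce
$$\tilde s_j = \tilde g_j - \tilde h,$$
we still have $\{\tilde s_j, l_i\} = 0$ for $i \in \{1,\dots,j\}$ (since $\tilde h$ depends only on the actions), and so the transformed Hamiltonian can also be written as
$$\tilde H_j = \tilde H \circ \tilde\Phi^j = l_j + \tilde h + \tilde s_j + \tilde f_j.$$
Now scaling back to our original coordinates, we define $\Phi_j = \sigma_j \circ \tilde\Phi^j \circ \sigma_j^{-1}$, therefore
$$\Phi_j : \mathbb{T}^n \times B(I_j, 2\rho_1\mu_j) \longrightarrow \mathbb{T}^n \times B(I_j, 3\rho_j\mu_j)$$
and
$$\begin{aligned}
H \circ \Phi_j &= \mu_j \tilde H \circ \tilde\Phi^j \circ \sigma_j^{-1} \\
&= \mu_j(l_j + \tilde h + \tilde s_j + \tilde f_j) \circ \sigma_j^{-1} \\
&= \mu_j(l_j + \tilde h) \circ \sigma_j^{-1} + \mu_j \tilde s_j \circ \sigma_j^{-1} + \mu_j \tilde f_j \circ \sigma_j^{-1}.
\end{aligned}$$
Observe that $\mu_j(l_j + \tilde h) \circ \sigma_j^{-1} = h$, so we may set
$$g_j = \mu_j \tilde s_j \circ \sigma_j^{-1}, \quad f_j = \mu_j \tilde f_j \circ \sigma_j^{-1},$$
and write
$$H \circ \Phi_j = h + g_j + f_j.$$
It is clear that $\{g_j, l_i\} = 0$ for $i \in \{1,\dots,j\}$, and as $\partial_\theta f_j = \mu_j \partial_\theta \tilde f_j$, then
$$|\partial_\theta f_j|_{C^0(\mathbb{T}^n \times B(I_j,2\rho_1\mu_j))} <\!\cdot\, e^{-m^{1/\alpha}}\mu_j\mu_1 < e^{-m^{1/\alpha}}\mu_j.$$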

Finally, since
$$\tilde\Phi^j = \tilde\Phi_j \circ \cdots \circ \tilde\Phi_1 : D_{2\rho_1} \to D_{3\rho_j}$$
with $|\tilde\Phi_i - \mathrm{Id}|_{\alpha,\tilde L_{j-i+1}} <\!\cdot\, T_i\mu_i$ for $i \in \{1,\dots,j\}$, then
$$|\Pi_I\tilde\Phi^j - \mathrm{Id}_I|_{C^0(D_{2\rho_1})} <\!\cdot\, \max_{i=1,\dots,j}\{T_i\mu_i\},$$
hence
$$|\Pi_I\Phi_j - \mathrm{Id}_I|_{C^0(\mathbb{T}^n \times B(I_j,2\rho_1\mu_j))} <\!\cdot\, \mu_j \max_{i=1,\dots,j}\{T_i\mu_i\} \cdot\!< \mu_j,$$
where the last estimate follows from the fact that $T_i\mu_i \cdot\!< 1$ for $i \in \{1,\dots,j\}$ with a suitable implicit constant. This ends the proof.

3.2. The $C^k$-case. Now let us explain how one can obtain similar normal forms for finitely differentiable Hamiltonians, with of course only a polynomial bound on the remainder. Here also, as in [Bou10b], we shall first construct global normal forms for perturbations of linear integrable Hamiltonians (Lemma 3.7 below) and then recover local normal forms for perturbations of general Hamiltonians (Proposition 3.8 below).

5. We use the same notations as in the previous section, that is $\omega_1 \in \mathbb{R}^n \setminus \{0\}$ is a $T_1$-periodic vector, $l_1(I) = \omega_1.I$ and we have parameters $m \in \mathbb{N}^*$ and $\mu_1 > 0$, while $\rho_1$ is already fixed. Given $k \ge n + 1$, we consider $k^* \in \mathbb{N}^*$ such that $k \ge k^* n + 1$. The following result is due to the author ([Bou10b]).

Lemma 3.6. Consider the Hamiltonian $H = l_1 + f$ defined on $D_{3\rho_1}$, with $f \in C^k(D_{3\rho_1})$ and $|f|_k < \mu_1$. Assume that
$$m T_1 \mu_1 \cdot\!< 1.$$
Then there exists a $C^{k-k^*}$ symplectic transformation $\Phi_1 : D_{2\rho_1} \to D_{3\rho_1}$ with $|\Phi_1 - \mathrm{Id}|_{k-k^*} <\!\cdot\, T_1\mu_1$ such that
$$H_1 = H \circ \Phi_1 = l_1 + g_1 + f_1$$
with $\{g_1, l_1\} = 0$ and the estimates
$$|g_1|_{k-k^*} <\!\cdot\, \mu_1, \quad |f_1|_{k-k^*} <\!\cdot\, m^{-k^*}\mu_1$$
hold true.

The statement above is exactly Proposition 3.2 in [Bou10b], where it is stated for $m T_1 \mu_1 \cdot\!= 1$ and $k^* = k - 2$, but of course it also holds for smaller $m$, that is when $m T_1 \mu_1 \cdot\!< 1$, and for any integer $k^* \in \mathbb{N}^*$ smaller than $k$.

6. Now as in the previous section, performing suitable "compositions" of the above lemma readily gives us the following statement.

Lemma 3.7. Let $j \in \{1,\dots,n\}$, and consider the Hamiltonian $H = l_j + f$ defined on $D_{3\rho_j}$, with $f \in C^k(D_{3\rho_j})$ and $|f|_k <\!\cdot\, \mu_j$. Assume that $(A_i)$ is satisfied for $i \in \{1,\dots,j\}$. Then there exists a symplectic transformation
$$\Phi^j = \Phi_j \circ \cdots \circ \Phi_1 : D_{2\rho_1} \to D_{3\rho_j},$$
where $\Phi_i : D_{2\rho_i} \to D_{3\rho_i}$ is $C^{k-k^* i}$ and $|\Phi_i - \mathrm{Id}|_{k-k^* i} <\!\cdot\, T_i\mu_i$ for $i \in \{1,\dots,j\}$, such that
$$H_j = H \circ \Phi^j = l_j + g_j + f_j \in C^{k-k^* j}(D_{2\rho_1})$$
with $\{g_j, l_i\} = 0$ for $i \in \{1,\dots,j\}$, and the estimates
$$|g_j|_{k-k^* j} <\!\cdot\, \mu_1, \quad |f_j|_{k-k^* j} <\!\cdot\, m^{-k^*}\mu_1$$
hold true.

The proof is completely identical to the proof of Lemma 3.4 (using Lemma A.3 instead of Lemma A.1), hence we do not repeat the details.

7. Finally we come back to the original setting, that is
$$H(\theta, I) = h(I) + f(\theta, I), \quad (\theta, I) \in D_R, \quad |h|_k = 1, \quad |f|_k < \varepsilon, \tag{$**$}$$
for which we have the following proposition.

Proposition 3.8. Suppose $H$ is as in (∗∗), and assume that $(B_i)$ is satisfied for $i \in \{1,\dots,j\}$. Then there exists a $C^{k-k^* j}$ symplectic transformation
$$\Phi_j : \mathbb{T}^n \times B(I_j, 2\rho_1\mu_j) \to \mathbb{T}^n \times B(I_j, 3\rho_j\mu_j)$$
with $|\Pi_I\Phi_j - \mathrm{Id}_I|_{C^0(B(I_j,2\rho_1\mu_j))} \cdot\!< \mu_j$ such that
$$H \circ \Phi_j = h + g_j + f_j,$$
with $\{g_j, l_i\} = 0$ for $i \in \{1,\dots,j\}$, and the estimate
$$|\partial_\theta f_j|_{C^0(\mathbb{T}^n \times B(I_j,2\rho_1\mu_j))} < m^{-k^*}\mu_j$$
holds true.

Once again, the proof is completely analogous to the proof of Proposition 3.5 (using Lemma A.4 instead of Lemma A.2), hence there is no need to give further details. Let us notice that the proof in fact gives
$$|f_j|_{k-k^* j} < m^{-k^*}\mu_j,$$
but since $k - k^* j \ge 1$ for any $j \in \{1,\dots,n\}$, this yields in particular the estimate stated in the above proposition. Here the case $j = 1$ is due to the author ([Bou10b]) and is enough to prove effective stability for quasi-convex unperturbed systems, but as we already explained, we shall need this result for any $j \in \{1,\dots,n\}$.

3.3. Dynamical consequences. Let us now examine the dynamical consequences of our normal forms. As usual, they will be used to control the directions, if any, in which the action variables in these new coordinates can actually drift. Below we shall state a result in the coordinates given by our normal forms, and we shall come back to our original coordinates at the beginning of the next section. In order to treat the Gevrey case and the $C^k$ case in a unified way, we introduce yet another parameter $\tau_m > 0$ which will eventually give us the time of stability. For $\alpha$-Gevrey Hamiltonians, we set
$$\tau_m = e^{m^{1/\alpha}},$$
and for $C^k$-Hamiltonians, with $k \ge k^* n + 1$ for some $k^* \in \mathbb{N}^*$, we set
$$\tau_m = m^{k^*}.$$
Under the assumptions of Proposition 3.5 or Proposition 3.8, consider the Hamiltonian
$$H_j = H \circ \Phi_j = h + g_j + f_j$$
defined on the domain $\mathbb{T}^n \times B(I_j, 2\mu_j)$ (from now on we shall use the fact that we have defined $\rho_1 = 1$). Let $M_j$ be the $\mathbb{Z}$-module
$$M_j = \{k \in \mathbb{Z}^n \mid k.\omega_i = 0, \ i \in \{1,\dots,j\}\}.$$
Since the periodic vectors are independent, the rank of $M_j$ is $n - j$, and it is also the dimension of the vector space $\Lambda_j = M_j \otimes \mathbb{R}$ spanned by $M_j$. We shall need the following lemma, which follows immediately from the definition of the Poisson bracket.

Lemma 3.9. The equality $\{g_j, l_i\} = 0$, for all $i \in \{1,\dots,j\}$, is equivalent to $\partial_\theta g_j(\theta, I) \in \Lambda_j$, for $(\theta, I) \in \mathbb{T}^n \times B(I_j, 2\mu_j)$.

Now consider a solution $(\theta^j(t), I^j(t))$ of the Hamiltonian $H_j$ starting at $I^j(t_j) \in B(I_j, 2\mu_j)$ for some $t_j \in \mathbb{R}$, and define the time of escape of this solution as the smallest time $\tilde t_j \in\, ]t_j, +\infty]$ for which $I^j(\tilde t_j) \notin B(I_j, 2\mu_j)$. The only information we shall use from our normal forms is contained in the next proposition, where we shall denote by $\Pi_j$ the projection onto the linear subspace $\Lambda_j$.

Proposition 3.10. Let $H_j = h + g_j + f_j$ be a Hamiltonian defined on the domain $\mathbb{T}^n \times B(I_j, 2\mu_j)$, with $\{g_j, l_i\} = 0$ for $i \in \{1,\dots,j\}$, and such that the estimate
$$|\partial_\theta f_j|_{C^0(\mathbb{T}^n \times B(I_j,2\mu_j))} < \tau_m^{-1}\mu_j$$
holds true. Then, with the previous notations, we have
$$|I^j(t) - I^j(t_j) - \Pi_j(I^j(t) - I^j(t_j))| < \mu_j, \quad t \in [t_j, \tau_m] \cap [t_j, \tilde t_j[.$$
In particular,
$$|I^n(t) - I^n(t_n)| < \mu_n, \quad t \in [t_n, \tau_m].$$

Proof. Let $\Pi_j^\perp$ be the projection onto the orthogonal complement of $\Lambda_j$, so that $\Pi_j + \Pi_j^\perp$ is the identity and therefore
$$|I^j(t) - I^j(t_j) - \Pi_j(I^j(t) - I^j(t_j))| = |\Pi_j^\perp(I^j(t) - I^j(t_j))|.$$
Now, as long as $t < \tilde t_j$, the equations of motion for $H_j = h + g_j + f_j$ and the mean value theorem give
$$|I^j(t) - I^j(t_j)| \le |t - t_j|\,|\partial_\theta(g_j + f_j)|_{C^0(\mathbb{T}^n \times B(I_j,2\mu_j))}.$$
But $\{g_j, l_i\} = 0$ for $i \in \{1,\dots,j\}$, so by Lemma 3.9 we have $\partial_\theta g_j(\theta, I) \in \Lambda_j$ for any $(\theta, I) \in \mathbb{T}^n \times B(I_j, 2\mu_j)$, hence if we first project the equations onto the orthogonal complement of $\Lambda_j$ we have
$$|\Pi_j^\perp(I^j(t) - I^j(t_j))| \le |t - t_j|\,|\partial_\theta f_j|_{C^0(\mathbb{T}^n \times B(I_j,2\mu_j))}.$$
Now since $|t - t_j| \le \tau_m$ and $|\partial_\theta f_j|_{C^0(\mathbb{T}^n \times B(I_j,2\mu_j))} < \tau_m^{-1}\mu_j$, the previous estimate gives
$$|\Pi_j^\perp(I^j(t) - I^j(t_j))| < \mu_j, \quad t \in [t_j, \tau_m[\, \cap\, [t_j, \tilde t_j[,$$
and therefore
$$|I^j(t) - I^j(t_j) - \Pi_j(I^j(t) - I^j(t_j))| < \mu_j, \quad t \in [t_j, \tau_m[\, \cap\, [t_j, \tilde t_j[.$$
Finally, for $j = n$, note that $g_n$ is integrable and $\Pi_n$ is identically zero, so that the mean value theorem immediately gives $\tilde t_n \ge \tau_m$ and the estimate
$$|I^n(t) - I^n(t_n)| < \mu_n, \quad t \in [t_n, \tau_m],$$
follows easily. This concludes the proof.
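The splitting of a drift into its components along and transversal to the resonant direction space can be illustrated numerically. The following is only a minimal sketch, under the assumption that the space spanned by the module $M_j$ coincides with the orthogonal complement of the span of the periodic vectors (which is consistent with its dimension being $n - j$); the function name `project_onto_lambda` and the sample vectors are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def dot(u, v):
    """Euclidean scalar product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


def project_onto_lambda(omegas, x):
    """Orthogonal projection of x onto the orthogonal complement of span(omegas).

    Assumption of this sketch: that complement plays the role of the space
    spanned by the resonant module, so the result is the part of x along it.
    """
    # Gram-Schmidt: build an orthogonal basis of span(omegas) in exact arithmetic.
    basis = []
    for w in omegas:
        u = [Fraction(c) for c in w]
        for b in basis:
            coeff = dot(u, b) / dot(b, b)
            u = [ui - coeff * bi for ui, bi in zip(u, b)]
        if any(u):  # skip vectors that are linearly dependent on earlier ones
            basis.append(u)
    # Remove from x its components along span(omegas).
    y = [Fraction(c) for c in x]
    for b in basis:
        coeff = dot(y, b) / dot(b, b)
        y = [yi - coeff * bi for yi, bi in zip(y, b)]
    return y


# Hypothetical data: two independent periodic vectors in dimension n = 3.
omegas = [(1, 0, 0), (1, 1, 0)]
drift = (3, 4, 5)
along = project_onto_lambda(omegas, drift)            # component in the resonant space
transversal = [d - a for d, a in zip(drift, along)]   # component controlled by the estimate
```

In this toy example the transversal part is the one that the proposition keeps small over the time $\tau_m$, while the part stored in `along` is left uncontrolled by the normal form.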

The interpretation of the above proposition is the following: if $\lambda_j$ is the affine subspace passing through $I^j(t_j)$ with direction space $\Lambda_j$, then as long as $I^j(t)$ remains in the domain of definition, it is $\mu_j$-close to $\lambda_j$ during an interval of time of length $\tau_m$. This means that for that interval of time, there is almost no variation of the action components in the direction transversal to $\lambda_j$, so that any potential drift has to occur along that space. If we were assuming that the energy sub-levels of the integrable Hamiltonian are convex, then some direct arguments using the preservation of energy would give us a complete stability result for such solutions. But in our more general situation, we will have to use indirect and more complicated geometric arguments.

4. Geometric Part

In this section, we shall describe some geometric arguments, first introduced by Niederman ([Nie04], see also [BN10] for a somewhat clearer exposition), that will lead to the proof of both Theorem 2.4 and Theorem 2.5. Without loss of generality, we will consider only solutions $(\theta(t), I(t))$ starting at time $t_0 = 0$ and evolving in positive time $t > 0$. In Sect. 4.1, we will introduce a class of solutions, which we call "restrained", and for which the stability of the action variables is easily proved. Then, in Sect. 4.2, we introduce the notion of "drifting" solutions, which by definition do not satisfy the stability properties implied by restrained solutions. We will then show that drifting solutions can in fact be restrained provided some assumptions are satisfied, hence leading to the nonexistence of such drifting solutions. This will eventually give us a proof of Theorem 2.4 and Theorem 2.5 in Sect. 4.3. As these geometric arguments are not affected at all by the regularity of the system, they will be similar for Gevrey or $C^k$ Hamiltonians. In fact, they are also the same for analytic Hamiltonians which were studied in [BN10], therefore we shall merely state and explain the relevant results, for which detailed proofs are available in [BN10].

4.1. Restrained solutions. 1. In order to define our restrained solutions, we shall need some notations. Recall from Proposition 3.5 that if the suitable assumptions are met, we can define transformations
$$\Phi_j : \mathbb{T}^n \times B(I_j, 2\mu_j) \to \mathbb{T}^n \times B(I_j, 3\rho_j\mu_j)$$
with $|\Pi_I\Phi_j - \mathrm{Id}_I|_{C^0(B(I_j,2\mu_j))} \cdot\!< \mu_j$, for $j \in \{1,\dots,n\}$. In particular, we can easily ensure that the image of $\Phi_j$ contains the domain $\mathbb{T}^n \times B(I_j, \mu_j)$. From now on, we shall write
$$B_j = B(I_j, \mu_j), \quad \tilde B_j = B(I_j, 2\mu_j), \quad j \in \{1,\dots,n\},$$
so that $\mathbb{T}^n \times B_j \subseteq \Phi_j(\mathbb{T}^n \times \tilde B_j)$, and for completeness we set $\tilde B_0 = B_R$. Now consider a solution $(\theta(t), I(t)) \in \mathbb{T}^n \times \tilde B_0$ of our Hamiltonian $H$, starting at time $t_0 = 0$. If at some time $t_1 \ge 0$ we can find a periodic vector $\omega_1 \in \mathbb{R}^n \setminus \{0\}$ such that $|\nabla h(I(t_1)) - \omega_1| < \mu_1$, then setting $I_1 = I(t_1)$, we have trivially $(\theta(t_1), I(t_1)) \in \mathbb{T}^n \times B_1$. Therefore, if assumption $(B_1)$ is satisfied, we can define a normalized solution $(\theta^1(t), I^1(t))$, for $t \ge t_1$, by
$$\Phi_1(\theta^1(t), I^1(t)) = (\theta(t), I(t)),$$
as long as $I(t) \in B_1$. Now we can start again, but this time with the solution $(\theta^1(t), I^1(t)) \in \mathbb{T}^n \times \tilde B_1$ of the Hamiltonian $H_1 = H \circ \Phi_1$ starting at time $t_1$: if at some time $t_2 \ge t_1$ we can find a periodic vector $\omega_2 \in \mathbb{R}^n \setminus \{0\}$, independent of $\omega_1$, such that $|\nabla h(I^1(t_2)) - \omega_2| < \mu_2$, then setting $I_2 = I^1(t_2)$, we have $(\theta^1(t_2), I^1(t_2)) \in \mathbb{T}^n \times B_2$, and provided $(B_2)$ holds, we can define yet another normalized solution $(\theta^2(t), I^2(t))$, for $t \ge t_2$, by
$$\Phi_2(\theta^2(t), I^2(t)) = (\theta^1(t), I^1(t)),$$
as long as $I^1(t) \in B_2$. Inductively, setting $(\theta^0(t), I^0(t)) = (\theta(t), I(t))$, for $j \in \{1,\dots,n\}$ we can define the averaged solution $(\theta^j(t), I^j(t))$, for $t \ge t_j$, by
$$\Phi_j(\theta^j(t), I^j(t)) = (\theta^{j-1}(t), I^{j-1}(t)),$$
as long as $I^{j-1}(t) \in B_j$, provided we have found independent periodic vectors $\omega_1,\dots,\omega_j$ such that $|\nabla h(I_j) - \omega_j| < \mu_j$, with $I_j = I^{j-1}(t_j)$ and assuming $(B_j)$ is satisfied. Moreover, using our estimate on $\Phi_j$ we have
$$|I^j(t) - I^{j-1}(t)| \cdot\!< \mu_j, \quad j \in \{1,\dots,n\},$$
during that time interval.

2. We can now state our definition.

Definition 4.1. Given $\mu_0 > 0$ and $m \in \mathbb{N}^*$, a solution $(\theta(t), I(t))$ of the Hamiltonian (∗) or (∗∗), starting at time $t_0 = 0$, is said to be restrained (by $\mu_0$, up to time $\tau_m$) if we can find sequences of:
(1) radii $(\mu_1,\dots,\mu_n)$, with $0 < \mu_n < \cdots < \mu_1 < \mu_0$;
(2) independent periodic vectors $(\omega_1,\dots,\omega_n)$, with periods $(T_1,\dots,T_n)$;
(3) times $(t_1,\dots,t_n)$, with $0 = t_0 \le t_1 \le \cdots \le t_n \le t_{n+1} = \tau_m$,
satisfying, for $j \in \{0,\dots,n-1\}$, conditions $(B_{j+1})$ and the following conditions $(C_j)$ defined by
$$|I^j(t) - I^j(t_j)| < \mu_j, \ t \in [t_j, t_{j+1}], \quad |\nabla h(I^j(t_{j+1})) - \omega_{j+1}| < \mu_{j+1}. \tag{$C_j$}$$

Let us notice that for $j \in \{0,\dots,n-2\}$, setting $I_{j+1} = I^j(t_{j+1})$, the second condition of $(C_j)$ gives part of condition $(B_{j+1})$, and also it ensures that the first condition of $(C_{j+1})$ is indeed well-defined.

3. The terminology "restrained" was introduced in [BN10] because for such solutions, the actions $I(t)$ (or some properly normalized actions $I^j(t)$) are forced to pass close to a resonance at the time $t = t_j$, the multiplicity of which decreases as $j$ increases (since the periodic vectors are assumed to be independent), and moreover the variation of these (normalized) actions is controlled on each time interval $[t_j, t_{j+1}]$. Hence after the time $t_n$, the actions are in a domain free of resonances and are easily confined in view of the last part of Proposition 3.10. This is the content of the following proposition.

Proposition 4.2. Consider a restrained solution $(\theta(t), I(t))$, with an initial action $I(0) \in B_{R/2}$. If $\mu_0 \cdot\!< 1$, then the estimates
$$|I(t) - I(0)| < (n+1)^2\mu_0, \quad 0 \le t \le \tau_m$$
hold true.

This is exactly Proposition 3.7 in [BN10], to which we refer for the easy proof.

4.2. Drifting solutions. Restrained solutions are stable for an exponentially long interval of time with respect to $m$, and now we will show that this is in fact true for all solutions.

4. The following definition will be useful in the sequel.

Definition 4.3. Given $\mu_0 > 0$ and $m \in \mathbb{N}^*$, a solution $(\theta(t), I(t))$ of the Hamiltonian (∗) or (∗∗), starting at time $t_0 = 0$, is said to be drifting (to $\mu_0$, before time $\tau_m$) if there exists a time $t_*$ satisfying
$$|I(t_*) - I(0)| = (n+1)^2\mu_0, \quad 0 < t_* \le \tau_m.$$
Of course, this definition makes sense only if $(n+1)^2\mu_0 < R/2$. In view of Proposition 4.2, drifting solutions cannot be restrained. However, we will prove below that if such a drifting solution exists, it has to be restrained under some assumptions on $\mu_0$, $m$ and $\varepsilon$, which will eventually prove that drifting solutions do not exist so that all solutions are indeed exponentially stable. More precisely, assuming the existence of a drifting solution, we will construct a sequence of radii $(\mu_1,\dots,\mu_n)$, an increasing sequence of times $(t_1,\dots,t_n)$ and a sequence of linearly independent vectors $(\omega_1,\dots,\omega_n)$, with periods $(T_1,\dots,T_n)$ satisfying, for $j \in \{0,\dots,n-1\}$, assumptions $(B_{j+1})$ and $(C_j)$. All sequences will be built inductively, and we first describe two lemmas that we shall need.

5. For $j \in \{1,\dots,n\}$, recall that $\Lambda_j$ is the vector space spanned by $M_j = \{k \in \mathbb{Z}^n \mid k.\omega_i = 0, \ i \in \{1,\dots,j\}\}$, and that $\Pi_j$ (resp. $\Pi_j^\perp$) is the projection onto $\Lambda_j$ (resp. $\Lambda_j^\perp$). Let us define the integer
$$L_j = \sup_{i \in \{1,\dots,j\}}\{|T_i\omega_i|\} \in \mathbb{N}^*, \quad j \in \{1,\dots,n-1\}.$$
For completeness, we set $\Lambda_0 = \mathbb{R}^n$, $L_0 = 1$ and in this case $\Pi_0$ is nothing but the identity. The first lemma will allow us to construct the sequence of times, and for that we will rely on the fact that our integrable part $h$ belongs to $DM_\gamma^\tau(B)$, so that it satisfies the following steepness property (see Lemma 3.9 in [BN10]).

Lemma 4.4. For $j \in \{0,\dots,n-1\}$, let $\lambda_j$ be any affine subspace with direction $\Lambda_j$, and take $c_j < 1$. Then for any continuous curve $\Gamma_j : [t_j, t_j^*] \to \lambda_j \cap B_R$ with length
$$|\Gamma_j(t_j^*) - \Gamma_j(t_j)| = c_j \cdot\!< \gamma L_j^{-\tau},$$
there exists a time $t_{j+1} \in [t_j, t_j^*]$ such that
$$|\Gamma_j(t) - \Gamma_j(t_j)| < c_j, \ t \in [t_j, t_{j+1}], \quad |\Pi_j(\nabla h(\Gamma_j(t_{j+1})))| \cdot\!> c_j^2.$$

Now to construct the sequence of periodic vectors, we shall use the following lemma, which is a straightforward application of Dirichlet's theorem on simultaneous Diophantine approximation (see Lemma 3.10 in [BN10]).

Lemma 4.5. Given any vector $v \in \mathbb{R}^n$ and any real number $Q > 1$, there exists a $T$-periodic vector $\omega$ satisfying
$$|v - \omega| \le T^{-1}Q^{-\frac{1}{n-1}}, \quad |v|^{-1} \le T \le Q|v|^{-1}.$$

6. Now we have the necessary tools to show that drifting solutions cannot exist, provided suitable assumptions on the parameters are satisfied. This will be done inductively, and for technical reasons we separate the first step (Proposition 4.6) from the general inductive step (Proposition 4.7).

Proposition 4.6. Let $(\theta(t), I(t))$ be a drifting solution. If $\mu_0 \cdot\!< \gamma$, then there exist a time $t_1$, a $T_1$-periodic vector $\omega_1$ and $\mu_1 =\!\cdot\, T_1^{-1}\varepsilon^{a_1}$ for some positive constant $a_1$, satisfying
(a) $|I(t) - I(0)| < \mu_0$, $t \in [0, t_1]$;
(b) $|\nabla h(I(t_1)) - \omega_1| < \mu_1$.
Moreover, we have the estimate
$$1 <\!\cdot\, T_1 <\!\cdot\, \varepsilon^{-a_1(n-1)}\mu_0^{-2}, \quad 1 \le L_1 <\!\cdot\, \varepsilon^{-a_1(n-1)}\mu_0^{-2}.$$

The proof is completely analogous to the proof of Proposition 3.11 in [BN10]: it uses the fact that our solution is drifting, Lemma 4.4 applied to the curve $\Gamma_0(t) = I(t)$ with $c_0 = \mu_0$, and Lemma 4.5.

Proposition 4.7. Let $(\theta(t), I(t))$ be a drifting solution, $j \in \{1,\dots,n-1\}$ and assume that there exist sequences $(t_1,\dots,t_j)$, $(\omega_1,\dots,\omega_j)$ linearly independent and $(\mu_1,\dots,\mu_j)$, satisfying assumptions $(B_i)$ and $(C_{i-1})$, for $i \in \{1,\dots,j\}$. Assume also that
(i) $(T_j\mu_jL_j^{-1})^\tau \cdot\!< \mu_j$;
(ii) $(T_j\mu_jL_j^{-1})^\tau \cdot\!< \gamma L_j^{-\tau}$;
(iii) $\mu_1 \cdot\!< \mu_0^2$.
Then there exist a time $t_{j+1}$, a $T_{j+1}$-periodic vector $\omega_{j+1}$ and $\mu_{j+1} =\!\cdot\, T_{j+1}^{-1}\varepsilon^{a_{j+1}}$ for some positive constant $a_{j+1}$, satisfying
(a) $|I^j(t) - I^j(t_j)| < \mu_j$, $t \in [t_j, t_{j+1}]$;
(b) $|\nabla h(I^j(t_{j+1})) - \omega_{j+1}| < \mu_{j+1}$;
(c) $|\omega_{j+1} - \omega_j| <\!\cdot\, \mu_j$.
Moreover, we have the estimates
$$1 <\!\cdot\, T_{j+1} <\!\cdot\, \varepsilon^{-a_{j+1}(n-1)}\mu_0^{-2}, \quad 1 \le L_{j+1} <\!\cdot\, \max_{i \in \{1,\dots,j+1\}}\{\varepsilon^{-a_i(n-1)}\}\mu_0^{-2},$$
and if
(iv) $\mu_{j+1} \cdot\!< (T_j\mu_jL_j^{-1})^{2\tau}$,
then $\omega_{j+1}$ is linearly independent of $(\omega_1,\dots,\omega_j)$.

Once again, the proof is completely similar to the proof of Proposition 3.12 in [BN10]. As before, it uses the fact that our solution is drifting, Lemma 4.4 applied to the curve $\Gamma_j(t) = I^j(t_j) + \Pi_j(I^j(t) - I^j(t_j))$ with $c_j = (T_j\mu_jL_j^{-1})^\tau$, Lemma 4.5 and Proposition 3.10.
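The approximation by periodic vectors in Lemma 4.5 can be explored numerically. What follows is only a sketch, not the argument of [BN10]: it assumes that $|\cdot|$ is the supremum norm, normalizes $v$ by its largest component, and searches by brute force for a denominator $q \le Q$ of the kind whose existence Dirichlet's theorem guarantees (for instance when $Q$ is the $(n-1)$-th power of an integer). The function name `periodic_approximation` and the test vector are hypothetical.

```python
import math


def periodic_approximation(v, Q):
    """Return (omega, T) with T*omega an integer vector close to T*v.

    Sketch under the assumption that |.| is the supremum norm:
    |v - omega| <= 1 / (T * Q**(1/(n-1)))  and  1/|v| <= T <= Q/|v|.
    """
    n = len(v)
    norm = max(abs(c) for c in v)  # supremum norm of v
    if n < 2 or norm == 0 or Q <= 1:
        raise ValueError("need n >= 2, v != 0 and Q > 1")
    x = [c / norm for c in v]  # the largest component becomes +1 or -1
    tol = Q ** (-1.0 / (n - 1))
    for q in range(1, math.floor(Q) + 1):
        p = [round(q * xi) for xi in x]  # nearest integers to q * x
        if max(abs(q * xi - pi) for xi, pi in zip(x, p)) <= tol:
            omega = [norm * pi / q for pi in p]
            return omega, q / norm  # T = q / |v|, so that T * omega = p
    raise ValueError("no admissible denominator found up to Q")


# Hypothetical example in dimension n = 3 with Q = 10**2.
v = [1.0, math.sqrt(2.0), math.pi]
omega, T = periodic_approximation(v, 100.0)
```

Taking the smallest admissible $q$ ensures that the integer vector $T\omega$ has coprime components, so that $T$ is indeed the minimal period of $\omega$ in this sketch.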

4.3. Proof of Theorem 2.4 and Theorem 2.5. We can finally prove our theorems, and once again we refer to the proof of Theorem 2.4 in [BN10] for some more details.

Proof of Theorem 2.4 and Theorem 2.5. As a consequence of Propositions 4.2, 4.6 and a repeated use of Proposition 4.7, we know that
$$|I(t) - I(0)| < (n+1)^2\mu_0, \quad 0 \le t \le \tau_m,$$
provided that the parameters $\mu_0$, $m$ and $\varepsilon$ satisfy the following ten conditions:
(i) $\mu_{j+1} \cdot\!< (T_j\mu_jL_j^{-1})^{2\tau}$, for $j \in \{1,\dots,n-1\}$;
(ii) $(T_j\mu_jL_j^{-1})^\tau \cdot\!< \mu_j$, for $j \in \{1,\dots,n-1\}$;
(iii) $mT_j\mu_j \cdot\!< 1$, for $j \in \{1,\dots,n\}$;
(iv) $\mu_1 \cdot\!< \mu_0^2$;
(v) $\varepsilon < \mu_j^2$, for $j \in \{1,\dots,n\}$;
(vi) $(T_j\mu_jL_j^{-1})^\tau \cdot\!< \gamma L_j^{-\tau}$, for $j \in \{1,\dots,n-1\}$;
(vii) $\mu_j \cdot\!< 1$, for $j \in \{1,\dots,n\}$;
(viii) $\mu_j \cdot\!< \mu_{j-1}$, for $j \in \{2,\dots,n\}$;
(ix) $\mu_0 \cdot\!< \gamma$;
(x) $\mu_0 \cdot\!< 1$;
where $\mu_j =\!\cdot\, T_j^{-1}\varepsilon^{a_j}$, with $a_j$ to be chosen for $j \in \{1,\dots,n\}$, and
$$1 <\!\cdot\, T_j <\!\cdot\, \varepsilon^{-a_j(n-1)}\mu_0^{-2}, \quad 1 \le L_j <\!\cdot\, \max_{i \in \{1,\dots,j\}}\{\varepsilon^{-a_i(n-1)}\}\mu_0^{-2}.$$
So let us choose $m \cdot\!= \varepsilon^{-a}$ and $\mu_0 = \varepsilon^b$, for two positive constants $a$ and $b$. One can check (this is done in detail in [BN10]) that all these conditions hold true if we choose
$$a_j = (2\tau(n+1))^{-n-1+j}, \quad j \in \{1,\dots,n\}$$
and
$$a = b = 3^{-1}(2\tau(n+1))^{-n},$$
provided $\varepsilon \le \varepsilon_0$, with a sufficiently small $\varepsilon_0$ depending on $n, R, \alpha, L, E, M, \gamma$ and $\tau$ (resp. on $n, R, k, M, \gamma$ and $\tau$) in the Gevrey case (resp. in the $C^k$ case). Now recalling that for $\alpha$-Gevrey Hamiltonians,
$$\tau_m = e^{m^{1/\alpha}},$$
and for $C^k$-Hamiltonians, with $k \ge k^* n + 1$ for $k^* \in \mathbb{N}^*$,
$$\tau_m = m^{k^*},$$
this completes the proof of both theorems.
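To get a feeling for the size of the exponents chosen in the proof, one can evaluate them exactly for small parameters. The snippet below is a hedged illustration only: the helper `stability_exponents` is a hypothetical name, and the values $n = 2$, $\tau = 1$ are chosen for readability and need not be admissible values of $\tau$ in the theorems.

```python
from fractions import Fraction


def stability_exponents(n, tau):
    """Exact values of the exponents a_1, ..., a_n and a = b from the proof.

    a_j = (2*tau*(n+1))**(-n-1+j) and a = b = (1/3) * (2*tau*(n+1))**(-n).
    The arguments are assumed to be integers or Fractions (illustration only).
    """
    base = Fraction(2 * tau * (n + 1))
    a_j = {j: base ** (-n - 1 + j) for j in range(1, n + 1)}
    a = b = Fraction(1, 3) * base ** (-n)
    return a_j, a, b


# Hypothetical sample values: n = 2 degrees of freedom, tau = 1.
a_j, a, b = stability_exponents(n=2, tau=1)
```

With these definitions $a = b = a_1/3$, and the exponents $a_j$ increase with $j$, which is consistent with the radii $\mu_j$ being decreasing for small $\varepsilon$.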

Acknowledgements. This work has been written up while the author served as a Research Fellow at Warwick University, through CODY network, and revised at IMPA, supported by a PDJ grant from CNPq.

A. Technical Estimates In this short appendix, we give some technical estimates concerning Gevrey and finitely differentiable functions that we used in Sect. 3. 1. First in the proof of Lemma 3.4, we used the following estimate concerning the composition of Gevrey functions. Lemma A.1. Let 0 < ρ < ρ and L = C L. Suppose that g ∈ G α,L (Dρ ) and that : Dρ → Dρ is (α, L )-Gevrey. If | − Id|α,L <· 1, then f ◦ ∈ G

α,L

(Dρ ) and | f ◦ |α,L ≤ | f |α,L .

This lemma is contained in the statement of Corollary A.1, Appendix A.2, in [MS02], to which we refer for a proof and a possible choice of implicit constant. In fact, we could have used a more elaborate statement concerning compositions of Gevrey vector-valued functions (as in Proposition A.1 in [MS02]), which would have implied that the diffeomorphisms $\Phi^j$ constructed in Lemma 3.4 are in fact $(\alpha, L_j)$-Gevrey, with an estimate on their distance to the identity. However, such a statement is not very elegant to state and it is not needed here.

2. Then, in the proof of Proposition 3.5, we used the following lemma which enabled us to bound the Gevrey norm of the derivatives of a function in terms of the Gevrey norm of the function.

Lemma A.2. Let $\rho > 0$ and $g \in G^{\alpha,L}(D_\rho)$. For $p \in \mathbb{N}$, we have
$$\sum_{l \in \mathbb{N}^{2n},\ |l| = p} |\partial^l g|_{\alpha,L/2} <\!\cdot\, |g|_{\alpha,L}.$$

For an easy proof and a possible implicit constant, we refer to Lemma A.1, Appendix A.1 in [MS02]. 3. Finally, we shall also need corresponding estimates for the C k norms, which are well-known and much easier to prove. The following lemma is the analogue of Lemma A.1, and it is needed in the proof of Lemma 3.7. Lemma A.3. Let 0 < ρ < ρ, suppose that g ∈ C k (Dρ ) and that : Dρ → Dρ is of class C k . If | − Id|k <· 1, then f ◦ ∈ C k (Dρ ) and | f ◦ |k ≤ | f |k . Finally, here’s an easy analogue of Lemma A.2 which is useful in the proof of Proposition 3.8. Lemma A.4. Let ρ > 0 and g ∈ C k (Dρ ). For 0 ≤ p ≤ k, we have |∂ l g|k− p <· |g|k . l∈N2n , |l|= p

References

[Arn64] Arnold, V.I.: Instability of dynamical systems with several degrees of freedom. Sov. Math. Doklady 5, 581–585 (1964)
[Bam99] Bambusi, D.: Nekhoroshev theorem for small amplitude solutions in nonlinear Schrödinger equations. Math. Z. 230(2), 345–387 (1999)
[BBB03] Berti, M., Biasco, L., Bolle, P.: Drift in phase space: a new variational mechanism with optimal diffusion time. J. Math. Pures Appl. (9) 82(6), 613–664 (2003)
[BG86] Benettin, G., Gallavotti, G.: Stability of motions near resonances in quasi-integrable Hamiltonian systems. J. Stat. Phys. 44, 293–338 (1986)
[BGG85] Benettin, G., Galgani, L., Giorgilli, A.: A proof of Nekhoroshev's theorem for the stability times in nearly integrable Hamiltonian systems. Celestial Mech. 37, 1–25 (1985)
[BN10] Bounemoura, A., Niederman, L.: Generic Nekhoroshev theory without small divisors. Ann. Inst. Fourier, to appear, available at http://w3.impa.br/~abed/NwsdFinal.pdf, Nov. 2010
[Bou10a] Bounemoura, A.: Generic super-exponential stability of invariant tori. Erg. Th. Dyn. Syst. (2010), to appear, doi:10.1017/s0143385710000441, Oct. 2010
[Bou10b] Bounemoura, A.: Nekhoroshev theory for finitely differentiable quasi-convex Hamiltonians. J. Diff. Eqs. 249(11), 2905–2920 (2010)
[BP11] Bounemoura, A., Pennamen, E.: Instability for a priori unstable Hamiltonian systems: a dynamical approach. Discrete and Continuous Dynamical Systems - Series A (2011), to appear, available at http://w3.impa.br/~abed/poly4.pdf, Oct 2010
[HK10] Hunt, B., Kaloshin, V.: Prevalence. H. Broer, F. Takens, B. Hasselblatt (eds.), Handbook of Dynamical Systems Volume 3, Amsterdam: North Holland Title, Elsevier, 2010
[Kol54] Kolmogorov, A.N.: On the preservation of conditionally periodic motions for a small change in Hamilton's function. Dokl. Akad. Nauk. SSSR 98, 527–530 (1954)
[Loc92] Lochak, P.: Canonical perturbation theory via simultaneous approximation. Russ. Math. Surv. 47(6), 57–133 (1992)
[MG95] Morbidelli, A., Giorgilli, A.: Superexponential stability of KAM tori. J. Stat. Phys. 78, 1607–1617 (1995)
[MP10] Mitev, T., Popov, G.: Gevrey normal form and effective stability of Lagrangian tori. Discrete Contin. Dyn. Syst. 3(4), 643–666 (2010)
[MS02] Marco, J.-P., Sauzin, D.: Stability and instability for Gevrey quasi-convex near-integrable Hamiltonian systems. Publ. Math. Inst. Hautes Études Sci. 96, 199–275 (2002)
[Nek77] Nekhoroshev, N.N.: An exponential estimate of the time of stability of nearly integrable Hamiltonian systems. Russian Math. Surveys 32(6), 1–65 (1977)
[Nek79] Nekhoroshev, N.N.: An exponential estimate of the time of stability of nearly integrable Hamiltonian systems II. Trudy Sem. Petrovs 5, 5–50 (1979)
[Nie04] Niederman, L.: Exponential stability for small perturbations of steep integrable Hamiltonian systems. Erg. Th. Dyn. Sys. 24(2), 593–608 (2004)
[Nie07] Niederman, L.: Prevalence of exponential stability among nearly integrable Hamiltonian systems. Erg. Th. Dyn. Sys. 27(3), 905–928 (2007)
[OY05] Ott, W., Yorke, J.A.: Prevalence. Bull. of the Amer. Math. Soc. 42(3), 263–290 (2005)
[Pop04] Popov, G.: KAM theorem for Gevrey Hamiltonians. Erg. Th. Dyn. Sys. 24(5), 1753–1786 (2004)
[Pös93] Pöschel, J.: Nekhoroshev estimates for quasi-convex Hamiltonian systems. Math. Z. 213, 187–216 (1993)
[Pös99b] Pöschel, J.: On Nekhoroshev estimates for a nonlinear Schrödinger equation and a theorem by Bambusi. Nonlinearity 12(6), 1587–1600 (1999)
[Rüs01] Rüssmann, H.: Invariant tori in non-degenerate nearly integrable Hamiltonian systems. Regul. Chaotic Dyn. 6(2), 119–204 (2001)
[Sal04] Salamon, D.A.: The Kolmogorov-Arnold-Moser theorem. Math. Phys. Elect. J. 10(2), 1–37 (2004)
[YC04] Yomdin, Y., Comte, G.: Tame geometry with application in smooth analysis. Lecture Notes in Mathematics, Berlin: Springer Verlag, 2004
[Yom83] Yomdin, Y.: The geometry of critical and near-critical values of differentiable mappings. Math. Ann. 264, 495–515 (1983)

Communicated by G. Gallavotti

Commun. Math. Phys. 307, 185–227 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1290-1

Communications in

Mathematical Physics

One-Dimensional Chern-Simons Theory Anton Alekseev1 , Pavel Mnëv2,3 1 Section of Mathematics, University of Geneva, 2-4 Rue du Lièvre, C.P. 64, 1211 Genève 4, Switzerland.

E-mail: [email protected]

2 Petersburg Department of V. A. Steklov Institute of Mathematics, Fontanka 27, 191023 St. Petersburg,

Russia. E-mail: [email protected]

3 Institut für Mathematik, Universität Zürich-Irchel, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland

Received: 1 December 2010 / Accepted: 31 December 2010 Published online: 29 June 2011 – © Springer-Verlag 2011

Abstract: We study a one-dimensional toy version of the Chern-Simons theory. We construct its simplicial version which comprises features of a low-energy effective gauge theory and of a topological quantum field theory in the sense of Atiyah.

Contents

1. Introduction
1.1 The logic of the paper
1.2 The main results
1.3 A speculation: towards generalizations to topological quantum field theories in higher dimensions
1.4 Authorship
2. Simplicial Chern-Simons Theory on the Circle
2.1 Continuum theory on the circle: fields, BV structure, action
2.2 Effective action on the cohomology of the circle
2.2.1 Harmonic gauge
2.2.2 Effective action on cohomology
2.3 Simplicial Chern-Simons action on circle
2.3.1 Cyclic Whitney gauge
2.3.2 Chain homotopy, dressed chain homotopy
2.3.3 Simplicial action
2.3.4 Remarks
3. Approach Through Operator Formalism
3.1 One-dimensional Chern-Simons theory in operator formalism
3.1.1 First approximation
3.1.2 Imposing the cyclic Whitney gauge
3.1.3 Consistency check: the effective action on cohomology
3.1.4 Consistency check: the case of mutually commuting {A_k} and ψ = 0
3.2 One-dimensional Chern-Simons with boundary
3.2.1 One-dimensional simplicial Chern-Simons in the operator formalism. Concatenations
3.2.2 Simplicial aggregations
3.2.3 Quantum master equation
4. Back to Path Integral
4.1 Representation of Cl(g), complex polarization of g
4.2 One-dimensional Chern-Simons in terms of Atiyah-Segal's axioms
4.3 Integrating out the bulk fields
4.4 From operator formalism to path integral
4.4.1 Abelian one-dimensional Chern-Simons theory
4.4.2 Path integral for the non-abelian one-dimensional Chern-Simons theory in the cyclic Whitney gauge. End of proof of Theorem 2
4.5 Simplicial action on an interval
References

1. Introduction

We begin by considering a one-dimensional version of the Chern-Simons theory on a circle. This is a gauge theory in the Batalin-Vilkovisky formalism defined by the action

$$S(A,\psi) = \frac{1}{2}\int_{S^1}\big((\psi, d\psi) + (\psi, [A,\psi])\big), \tag{1}$$

where the field $\psi$ is an odd function on the circle with values in a quadratic Lie algebra $\mathfrak{g}$ (that is, $\mathfrak{g}$ has an invariant non-degenerate scalar product, e.g. the Killing form in the case of $\mathfrak{g}$ semisimple), and the field $A$ is an even 1-form with values in $\mathfrak{g}$. We address the problem of constructing an effective BV action induced by a triangulation of the circle (that is, a splitting of the circle into a finite number of segments). This problem is interesting by itself since it is related to discretization of differential geometry. In fact, the action (1) encodes the structure of a unimodular cyclic differential graded Lie algebra (DGLA) on $\mathfrak{g}$-valued differential forms on the circle (for details, see [7]). Then, the effective action is a generating function of a discretized (homotopy) version of this DGLA structure induced on the chains of a triangulation (which can be viewed as discrete analogs of differential forms).

Another motivation for studying the effective action in the simple model defined by action (1) is the hope of gaining new insight into constructing a discrete version of the 3-dimensional Chern-Simons theory. Such a discrete Chern-Simons theory would allow defining quantum invariants of 3-manifolds as finite-dimensional integrals, and it would be compatible with the gauge symmetry (i.e. it would satisfy the Batalin-Vilkovisky quantum master equation); cf. Sect. 1.3.

We show that the effective action of the one-dimensional Chern-Simons theory on a triangulated circle is given by an explicit, albeit somewhat cumbersome, formula (42). It immediately raises a number of questions.
Indeed, the result is expected to satisfy the quantum master equation (QME) and to be compatible with simplicial aggregations (merging several 1-simplices of the triangulation). How can we check this directly? Another desire is to represent the effective action in a “simplicially-local” form (that is, as a sum of contributions where each term depends only on two neighboring segments). It turns out that answers to these questions come from the following construction. First, we give a new definition of the one-dimensional simplicial Chern-Simons theory

One-Dimensional Chern-Simons Theory


using the operator formalism, i.e. in the language of Clifford algebras $Cl(\mathfrak{g})$ (Sect. 3.2.1). The partition function for a simplicial complex is then an element of $Cl(\mathfrak{g})^{\otimes i}$ (where $i$ is half the number of boundary points of the simplicial complex), and it is a product of local $Cl(\mathfrak{g})$-valued expressions (61) for 1-simplices. In particular, for a triangulated circle the partition function is a scalar (depending on simplicial “bulk fields”). In Sect. 4.4.2, we establish the equivalence between the operator formalism and the path integral formalism of Sect. 2. In the operator formalism, the consistency with simplicial aggregations is checked in a straightforward manner (Sect. 3.2.2). The partition function $Z_I$ on the interval $I$ satisfies Eq. (68):

$$\hbar\,\frac{\partial}{\partial\tilde\psi^a}\frac{\partial}{\partial A^a}\,Z_I + \frac{1}{\hbar}\left[\frac{1}{6} f^{abc}\hat\psi^a\hat\psi^b\hat\psi^c,\ Z_I\right]_{Cl(\mathfrak{g})} = 0,$$

which is a version of the quantum master equation (QME) adjusted for the presence of the boundary. It immediately implies the QME with boundary contributions for an arbitrary one-dimensional simplicial complex (73), and the usual QME for the triangulated circle (44).

In order to formulate the one-dimensional simplicial Chern-Simons theory in the spirit of Atiyah’s axioms of TQFT (Sect. 4.2), we choose a complex polarization of $\mathfrak{g}$: $\mathfrak{g}_{\mathbb{C}} = \mathfrak{h} \oplus \bar{\mathfrak{h}}$. This is always possible if $\mathfrak{g}$ is even-dimensional. Note however that in this way we break the original $O(\mathfrak{g})$-symmetry of the problem. The (complexified) Clifford algebra $Cl(\mathfrak{g}_{\mathbb{C}})$ is isomorphic to the matrix (super-)algebra $\mathrm{End}(\wedge^\bullet \mathfrak{h}) = \mathrm{End}(\mathrm{Fun}(\mathfrak{h}))$. Therefore, the space of states associated to a point in the one-dimensional Chern-Simons theory is $\mathcal{H}_{pt} = \mathrm{Fun}(\mathfrak{h})$, the super vector space of polynomials in $(\dim \mathfrak{g})/2$ odd variables. The super-space $\mathcal{H}_{pt}$ is endowed with an odd third-order differential operator $\delta$. One-dimensional “cobordisms” are now equipped with triangulations. To a triangulated cobordism we associate the “space of bulk fields” $\mathcal{F}_{bulk}$, equipped with the BV Laplacian $\Delta_{bulk}$.
The partition function for a triangulated cobordism satisfies the quantum master equation (82). In addition to the operations of gluing and disjoint union (which are standard in Atiyah’s picture), simplicial aggregations are allowed for triangulated cobordisms. The original continuum theory can be thought of as the simplicial theory in the limit of dense triangulation. Matrix elements of the partition function for a triangulated cobordism can be written as path integrals for the one-dimensional Chern-Simons theory with BV gauge fixing in the bulk and holomorphic-antiholomorphic boundary conditions (Sect. 4.4.2). This brings us back to the formalism of effective BV actions. The action for an interval is given by a Gaussian integral, and one can compute it explicitly (see Eq. (117)). 1.1. The logic of the paper. In Sect. 2, we construct an effective BV action for the one-dimensional Chern-Simons theory on a triangulated circle. Section 2.2 contains a warm-up calculation of the effective BV action induced on the de Rham cohomology of the circle. In Sects. 2.3.1 and 2.3.2, we handle the gauge fixing in the path integral. The final result of Sect. 2 is the explicit formula (42) for the simplicial action on a triangulated circle (Theorem 1). In Sect. 3, we address the Chern-Simons theory on a circle via the quantum mechanical operator formalism. The resulting effective BV action on a triangulation is given by formulae (51, 52). The comparison with the path integral formalism (Theorem 2) is


postponed until Sect. 4.4.2. Two quick consistency checks of Theorem 2 are performed in Sect. 3.1.3 and Sect. 3.1.4. In Sect. 3.2.1, we explain how the operator formalism helps to give an Atiyah-style axiomatic formulation of one-dimensional simplicial Chern-Simons theory. In more detail, in this approach one associates operator-valued effective actions to triangulated one-dimensional cobordisms. This picture comprises the features of an Atiyah TQFT (concatenation of cobordisms is sent to the composition of operators) and of a simplicial theory (simplicial aggregations are sent to certain BV integrals for the effective actions, cf. Sect. 3.2.2). In Sect. 3.2.3, we prove that a version of the Batalin-Vilkovisky quantum master equation (QME) is fulfilled for the partition function of a triangulated 1-cobordism (Theorem 3). This implies the usual QME for the simplicial action (42) on the triangulated circle (see Corollary 1). In Sect. 4, we prove Theorem 2. The proof follows the standard route of using the heat kernel expansion of the evolution operator in order to recover the path integral representation (Sect. 4.4). Along the way, we identify the space of states associated to a point in the one-dimensional Chern-Simons theory (Sect. 4.2). In order to do that (and to define integral kernels of evolution operators) one needs to introduce a complex polarization of the gauge Lie algebra g (Sect. 4.1). In Sect. 4.5, we compute the Chern-Simons partition function for an interval by perturbative techniques (Proposition 4).

1.2. The main results.

• Theorem 1: an explicit formula (42) for the effective BV action of the one-dimensional Chern-Simons theory on a circle induced on cochains of a triangulation of the circle. An essential part of this result is the unique gauge fixing in the path integral given in Sect. 2.3.1.
• Expressions (51, 52) for the exponential of the simplicial action on the triangulated circle in terms of the Clifford algebra Cl(g), and Theorem 2 giving a comparison with (42).
• Construction of the one-dimensional Chern-Simons theory as an Atiyah-style TQFT defined on triangulated 1-cobordisms. It comprises features of an Atiyah TQFT (concatenations), of a simplicial theory (simplicial aggregations) and of a Lagrangian gauge theory (quantum master equation). The partition function of this theory on a triangulated circle is the exponential of the simplicial action considered in Sect. 2. The construction is presented in Sects. 3.2.1, 4.2.
• Theorem 3: the quantum master equation with boundary terms.

1.3. A speculation: towards generalizations to topological quantum field theories in higher dimensions. Our discussion of the one-dimensional Chern-Simons theory in Sect. 4.2 inspires a hope that a hybrid Atiyah-Lagrange picture can apply in other topological quantum field theories, e.g. in more general AKSZ theories (which include the 3-dimensional Chern-Simons theory and the 2-dimensional Poisson sigma model). This is the subject of the work in progress by A. Cattaneo, P. M. and N. Reshetikhin [8]. By the loose term “hybrid Atiyah-Lagrange picture” we mean a functor which associates to a $D$-cobordism $B_{in} \to B_{out}$ a partition function $Z \in \mathcal{H}^*_{B_{in}} \otimes \mathcal{H}_{B_{out}} \otimes \mathrm{Fun}(\mathcal{F}_{bulk})$. It is a function on the “space of bulk fields” $\mathcal{F}_{bulk}$ (a finite-dimensional BV manifold) associated to the cobordism, and it takes values in linear operators from the space


of states associated to the incoming boundary $\mathcal{H}_{B_{in}}$ to the space of states associated to the outgoing boundary $\mathcal{H}_{B_{out}}$. Gluing of two cobordisms along the common boundary should be mapped to the composition of operators (as in Atiyah’s picture). Gauge symmetry should be present in the form of a quantum master equation with boundary terms,

$$\big(\hbar\,\Delta_{bulk} - \hbar^{-1}\,\delta^*_{B_{in}} + \hbar^{-1}\,\delta_{B_{out}}\big)\, Z = 0,$$

cf. (82). Here $\Delta_{bulk}$ is the BV Laplacian on $\mathrm{Fun}(\mathcal{F}_{bulk})$ and $\delta_B$ is a certain coboundary operator on the space of states $\mathcal{H}_B$ (the coboundary property $\delta_B^2 = 0$ might get spoiled by an anomaly, cf. (83)). Thus, a TQFT in the Atiyah-Lagrange picture would have features of both Atiyah’s TQFT and of an effective gauge theory in the Lagrangian Batalin-Vilkovisky formalism.

One important example of an Atiyah-Lagrange TQFT would be an effective theory on zero modes of a continuum TQFT in Lagrangian formalism, with matrix elements of the partition function given by perturbative path integrals. A more ambitious task would be to construct a discrete TQFT in the Atiyah-Lagrange picture, where cobordisms and boundary components carry “discretization data” (e.g. triangulations). Then, spaces of bulk fields and spaces of states depend on the discretization data, and aggregations of discretization data are mapped to finite-dimensional fiber BV integrals at the level of partition functions. One hopes to construct such discrete TQFTs from continuum ones by perturbative path integrals (this would be the higher-dimensional version of the integrals (104), (105)).

For an AKSZ sigma model on Maps(•, M), our study gives an educated guess for the space of states $\mathcal{H}_B$ and for the coboundary operator $\delta_B$ appearing in the quantum master equation. In more detail, the space of states $\mathcal{H}_B$ should be the geometric quantization of the symplectic supermanifold $\Phi_B = \mathrm{Maps}(B, M)$ (the phase super-space).
The AKSZ construction endows $\Phi_B$ with a cohomological hamiltonian vector field $Q_B = \{S_B, \bullet\}$, where $S_B$ is a certain odd function on $\Phi_B$ satisfying the Maurer-Cartan equation $\{S_B, S_B\} = 0$. A natural candidate for the coboundary operator $\delta_B : \mathcal{H}_B \to \mathcal{H}_B$ is the quantization of the Maurer-Cartan element $S_B$. In the case of the one-dimensional Chern-Simons theory, we have the target space $M = \Pi\mathfrak{g}$, the phase space associated to a point $\Phi_{pt} = \Pi\mathfrak{g}$, the Maurer-Cartan element (4) $S_{pt} = \frac{1}{6} f^{abc} X^a X^b X^c$, and the space of states for a point (80) $\mathcal{H}_{pt} = \mathrm{Fun}(\mathfrak{h})$, the geometric quantization space for $\Phi_{pt} = \Pi\mathfrak{g}$ (cf. Remark 22). The coboundary (up to possible anomaly (83)) operator for a point $\delta_{pt}$ is the quantization of $S_{pt}$, see (81). For other AKSZ theories, spaces of states constructed in this way will typically be infinite-dimensional complexes. However, one may hope to pass to finite-dimensional deformation retracts of $(\mathcal{H}_B, \delta_B)$ by means of homological perturbation theory.

1.4. Authorship. The idea of looking at the one-dimensional Chern-Simons theory and some parts of Sect. 3 (operator formalism approach) are joint work of both authors (the idea to use operator formalism for the one-dimensional Chern-Simons theory was suggested by A. A.). Other parts of the paper are due to P. M.

2. Simplicial Chern-Simons Theory on the Circle

In this section we study the Chern-Simons theory on the circle in the Batalin-Vilkovisky (BV) formalism and construct an effective BV action induced on cochains of a triangulation.


Much of this discussion is inspired by [7] and [15]. In particular, the reader is referred to Sects. 2 and 3.2 of [7] for details of the effective BV action construction.

2.1. Continuum theory on the circle: fields, BV structure, action. Let $\mathfrak{g}$ be a quadratic Lie algebra with Lie bracket $[\,,]$ and non-degenerate ad-invariant pairing $(\,,)$. We will denote by $\{T^a\}$ an orthonormal basis in $\mathfrak{g}$ and by $f^{abc} = (T^a, [T^b, T^c])$ the structure constants in this basis. We will also use the Einstein summation convention for the Lie algebra indices.

The Chern-Simons theory on a 3-manifold $M$ can be constructed as an AKSZ sigma model [1] with the space of fields $\mathcal{F} = \mathrm{Maps}(\Pi T M, \Pi\mathfrak{g}) = \Pi\mathfrak{g} \otimes \Omega^\bullet(M)$. That is, $\mathcal{F}$ is the space of maps of super-manifolds from the parity-shifted tangent bundle of $M$ to the parity-shifted Lie algebra. Equivalently, this is the space of differential forms on $M$ with values in $\mathfrak{g}$. From the canonical integration measure on $\Pi T M$ and the even symplectic structure $\omega_{\Pi\mathfrak{g}} = \frac{1}{2}\,\delta X^a \wedge \delta X^a$ on $\Pi\mathfrak{g}$ (we denote by $\{X^a\}$ the set of odd coordinates on $\Pi\mathfrak{g}$ associated to the orthonormal basis $\{T^a\}$ on $\mathfrak{g}$) one constructs an odd symplectic form (the “BV 2-form”) on $\mathcal{F}$:

$$\omega = \frac{1}{2}\int_M (\delta\alpha, \delta\alpha). \tag{2}$$

Here the superfield $\alpha$ is the canonical odd map (the parity-shifted identity operator)

$$\alpha : \mathcal{F} \to \mathfrak{g} \otimes \Omega^\bullet(M) \tag{3}$$

which can be viewed as the generating function for coordinates on $\mathcal{F}$ with values in $\mathfrak{g}$-valued differential forms on $M$. By splitting $\alpha$ into components according to the degrees of differential forms we obtain $\alpha = A^{(0)} + A^{(1)} + A^{(2)} + A^{(3)}$, where $A^{(p)}$ takes values in $\mathfrak{g}$-valued $p$-forms.¹ Since $\alpha$ is totally odd, $A^{(0)}$ and $A^{(2)}$ are intrinsically odd, while $A^{(1)}$ and $A^{(3)}$ are intrinsically even (the intrinsic parity is the total parity minus the de Rham degree modulo 2). The Chern-Simons action is built of the 1-form $\frac{1}{2} X^a \wedge \delta X^a$ on $\Pi\mathfrak{g}$ (which is a primitive for $\omega_{\Pi\mathfrak{g}}$) and the odd function on $\Pi\mathfrak{g}$

$$\theta = \frac{1}{6} f^{abc} X^a X^b X^c \tag{4}$$

which satisfies $\{\theta, \theta\}_{\Pi\mathfrak{g}} = 0$. The action is given by the formula

$$S = \int_M \left(\frac{1}{2}(\alpha, d\alpha) + \frac{1}{6}(\alpha, [\alpha, \alpha])\right). \tag{5}$$

By the general construction [1], S satisfies the classical master equation {S, S} = 0, where {, } is the BV anti-bracket on functions on F defined by the odd symplectic form ω. 1 In the BV formalism, A(1) is the “classical field”, A(0) is the “ghost”; A(2) and A(3) are the “anti-fields” for A(1) and A(0) , respectively.
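The statement $\{\theta,\theta\} = 0$ for the cubic function (4) can be checked on a concrete example. The sketch below rests on two assumptions that are not spelled out in the text: that the bracket induced by $\frac12\delta X^a\wedge\delta X^a$ makes $\{\theta,\theta\}$ proportional to $f^{abc}f^{ade}X^bX^cX^dX^e$, so that it vanishes exactly when the total antisymmetrization of $f^{abc}f^{ade}$ in $(b,c,d,e)$ does (a form of the Jacobi identity), and that $\mathfrak{g} = so(4)$ with pairing $-\frac12\mathrm{tr}(XY)$ is a fair test case. All function names are illustrative.

```python
import itertools

def so_basis(n):
    """Basis E_ij - E_ji (i < j) of so(n), orthonormal for (X, Y) = -tr(XY)/2."""
    basis = []
    for i in range(n):
        for j in range(i + 1, n):
            m = [[0.0] * n for _ in range(n)]
            m[i][j], m[j][i] = 1.0, -1.0
            basis.append(m)
    return basis

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def pairing(x, y):
    xy = matmul(x, y)
    return -0.5 * sum(xy[i][i] for i in range(len(xy)))

def structure_constants(basis):
    """f[a][b][c] = (T^a, [T^b, T^c]) in the given orthonormal basis."""
    d = len(basis)
    f = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for b in range(d):
        for c in range(d):
            bc, cb = matmul(basis[b], basis[c]), matmul(basis[c], basis[b])
            n = len(bc)
            comm = [[bc[i][j] - cb[i][j] for j in range(n)] for i in range(n)]
            for a in range(d):
                f[a][b][c] = pairing(basis[a], comm)
    return f

def perm_sign(p):
    """Sign of a permutation of 0..len(p)-1, computed by counting inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def max_antisymmetrized_quartic(f):
    """Largest |coefficient| of f^{abc} f^{ade} antisymmetrized in (b, c, d, e).

    Only index tuples with four distinct values can contribute.
    """
    d = len(f)
    perms = [(p, perm_sign(p)) for p in itertools.permutations(range(4))]
    worst = 0.0
    for idx in itertools.combinations(range(d), 4):
        total = 0.0
        for p, s in perms:
            b, c, dd, e = idx[p[0]], idx[p[1]], idx[p[2]], idx[p[3]]
            total += s * sum(f[a][b][c] * f[a][dd][e] for a in range(d))
        worst = max(worst, abs(total))
    return worst

f = structure_constants(so_basis(4))
print(max_antisymmetrized_quartic(f))
```

The algebra $so(3)$ would not be a useful test here: with only three odd coordinates every quartic monomial vanishes identically, so a six-dimensional algebra is used instead.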


We would like to define the one-dimensional Chern-Simons theory on the circle $S^1$ by substituting $M = S^1$ into the construction described above. Then, the space of fields becomes $\mathcal{F} = \mathrm{Maps}(\Pi T S^1, \Pi\mathfrak{g}) = \Pi\mathfrak{g} \otimes \Omega^0(S^1) \oplus \Pi\mathfrak{g} \otimes \Omega^1(S^1)$. The superfield $\alpha$ can now be written as

$$\alpha = \psi + A, \tag{6}$$

where the component $\psi = T^a \psi^a(\tau)$ takes values in $\mathfrak{g}$-valued functions on the circle and is intrinsically odd ($\tau$ is the coordinate on $S^1$); and the component $A = d\tau\, T^a A^a(\tau)$ takes values in $\mathfrak{g}$-valued 1-forms on the circle and is intrinsically even. Thus, $\{\psi^a(\tau), A^a(\tau)\}$ are odd and even coordinates on $\mathcal{F}$, respectively. The space $\mathcal{F}$ is equipped with an odd symplectic structure (2):

$$\omega = \int_{S^1} (\delta\psi, \delta A), \tag{7}$$

defining the anti-bracket $\{\bullet,\bullet\} : \mathrm{Fun}(\mathcal{F}) \times \mathrm{Fun}(\mathcal{F}) \to \mathrm{Fun}(\mathcal{F})$,

$$\{f, g\} = \int_{S^1} d\tau\; f \left( \frac{\overleftarrow{\delta}}{\delta\psi^a(\tau)} \frac{\overrightarrow{\delta}}{\delta A^a(\tau)} - \frac{\overleftarrow{\delta}}{\delta A^a(\tau)} \frac{\overrightarrow{\delta}}{\delta\psi^a(\tau)} \right) g$$

and the BV Laplacian $\Delta : \mathrm{Fun}(\mathcal{F}) \to \mathrm{Fun}(\mathcal{F})$,

$$\Delta f = \int_{S^1} d\tau\, \frac{\delta}{\delta\psi^a(\tau)} \frac{\delta}{\delta A^a(\tau)}\, f.$$

Note that the operator $\Delta$ is ill-defined on local functionals. The action (5) can be written in terms of components (6) of the superfield as

$$S = \frac{1}{2}\int_{S^1} \big((\psi, d\psi) + (\psi, [A, \psi])\big). \tag{8}$$

Here $d$ is the de Rham differential on $S^1$. By the general AKSZ construction,² the action $S$ satisfies the classical master equation $\{S, S\} = 0$. Naïvely, one could also say that the unimodularity of $\mathfrak{g}$ implies unimodularity of $\mathfrak{g} \otimes \Omega^\bullet(S^1)$, and therefore the quantum master equation is fulfilled:

$$\frac{1}{2}\{S, S\} + \hbar\,\Delta S = 0.$$

However, $\Delta S$ is ill-defined in the continuum theory.

Remark 1. The $\mathbb{Z}_2$-grading on the space of fields of the Chern-Simons theory on a 3-manifold can be promoted to a $\mathbb{Z}$-grading (by setting $\mathcal{F} = \mathrm{Maps}(T[1]M, \mathfrak{g}[1])$) in such a way that the odd symplectic form attains grade³ $-1$ (so that the anti-bracket has degree $+1$) and the action $S$ is in degree zero. However, this does not apply to the one-dimensional Chern-Simons theory, which is essentially $\mathbb{Z}_2$-graded: there is no consistent $\mathbb{Z}$-grading on the space of fields.

² Or, in the algebraic language, due to relations (Leibniz identity, Jacobi identity, cyclicity of differential, cyclicity of Lie bracket) in the cyclic dg Lie algebra $\mathfrak{g} \otimes \Omega^\bullet(S^1)$, cf. [7].
³ Grade is defined as the total degree minus the de Rham degree of a differential form.


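2.2. Effective action on the cohomology of the circle.

2.2.1. Harmonic gauge. Let’s split the space of differential forms on the circle into constant 0- and 1-forms and those with vanishing integral⁴: $\Omega^\bullet(S^1) = \Omega'^\bullet(S^1) \oplus \Omega''^\bullet(S^1)$, where

$$\Omega'^\bullet(S^1) = \{ f' + d\tau\, g' \mid f', g' \in \mathbb{R} \}, \tag{9}$$

$$\Omega''^\bullet(S^1) = \Big\{ f''(\tau) + d\tau\, g''(\tau) \ \Big|\ \int_{S^1} d\tau\, f''(\tau) = 0,\ \int_{S^1} d\tau\, g''(\tau) = 0 \Big\}. \tag{10}$$

It induces the splitting for fields into infrared and ultraviolet parts $\mathcal{F} = \mathcal{F}' \oplus \mathcal{F}''$, where

$$\mathcal{F}' = \{ \psi_0 + d\tau\, A_0 \mid \psi_0 \in \mathfrak{g},\ A_0 \in \mathfrak{g} \}, \qquad \mathcal{F}'' = \Big\{ \psi'' + A'' \ \Big|\ \int_{S^1} d\tau\, \psi''(\tau) = 0,\ \int_{S^1} A'' = 0 \Big\}.$$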

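This splitting respects both the BV 2-form and the de Rham differential. We define the Lagrangian subspace $\mathcal{L} \subset \mathcal{F}''$ as

$$\mathcal{L} = \{ \psi'' + A'' \mid A'' = 0 \}. \tag{11}$$

2.2.2. Effective action on cohomology. We define the effective action $W$ on $\mathcal{F}'$ ⁵ by the fiber BV integral

$$e^{\frac{1}{\hbar} W(\psi_0, A_0, \hbar)} = \int_{\mathcal{L}} e^{\frac{1}{\hbar} S(\psi_0 + \psi'',\ d\tau A_0 + A'')} = \int \mathcal{D}\psi''\; e^{\frac{1}{2\hbar} \int_{S^1} \left(\psi_0 + \psi'',\ (d + d\tau\, \mathrm{ad}_{A_0})(\psi_0 + \psi'')\right)}. \tag{12}$$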

The Lagrangian subspace $\mathcal{L} \subset \mathcal{F}''$ is uniquely⁶ fixed by the requirement that the “free” part of the action $\frac{1}{2}\int_{S^1} (\psi, d\psi)$ be non-degenerate when restricted to $\mathcal{L}$. Integral (12) is Gaussian, and it yields the following result.

Proposition 1. The effective BV action $W$ of the one-dimensional Chern-Simons theory is given by

$$e^{\frac{1}{\hbar} W(\psi_0, A_0, \hbar)} = {\det}_{\mathfrak{g}}^{1/2} \left( \frac{\sinh \frac{\mathrm{ad}_{A_0}}{2}}{\frac{\mathrm{ad}_{A_0}}{2}} \right) \cdot e^{-\frac{1}{2\hbar} (\psi_0,\ \mathrm{ad}_{A_0} \psi_0)}. \tag{13}$$

⁴ Properties of being constant for a 1-form and being of integral zero for a 0-form are non-covariant. This is not a problem as choosing a gauge always relies on introducing some additional structure. In the case of harmonic gauge, this extra structure is the parametrization of the circle.
⁵ More precisely, we consider the effective action as a function on $\Pi\mathfrak{g} \otimes H^\bullet(S^1)$, i.e. on the parity-shifted de Rham cohomology of the circle (which we represented by harmonic forms) with coefficients in $\mathfrak{g}$.
⁶ This is a special property of the one-dimensional theory related to the fact that the de Rham operator $d : \Omega''^0(S^1) \to \Omega''^1(S^1)$ is an isomorphism, and there is a unique chain homotopy $K = d^{-1} : \Omega''^1(S^1) \to \Omega''^0(S^1)$.
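The determinant factor in (13) can be sanity-checked one eigenvalue at a time. The sketch below assumes (this is an interpretation, not a statement from the text) that on the non-constant Fourier modes $e^{2\pi i k\tau}$ the regularized determinant reduces, for each eigenvalue $\lambda$ of $\mathrm{ad}_{A_0}$, to the product over $k \neq 0$ of $(1 + \lambda/(2\pi i k))$. Pairing $k$ with $-k$ gives real factors $1 + \lambda^2/(2\pi k)^2$, and Euler's product formula then yields $\sinh(\lambda/2)/(\lambda/2)$. The function names and the cutoff are illustrative.

```python
import math

def mode_product(lam_sq, cutoff):
    """Product over k = 1..cutoff of (1 + lam^2 / (2 pi k)^2), given lam^2.

    Passing lam^2 rather than lam covers both real eigenvalues (lam^2 > 0)
    and the purely imaginary ones of an antisymmetric matrix (lam^2 < 0).
    """
    prod = 1.0
    for k in range(1, cutoff + 1):
        prod *= 1.0 + lam_sq / (2.0 * math.pi * k) ** 2
    return prod

def closed_form(lam_sq):
    """sinh(lam/2) / (lam/2), written as a function of lam^2."""
    if lam_sq > 0:
        half = math.sqrt(lam_sq) / 2.0
        return math.sinh(half) / half
    if lam_sq < 0:
        half = math.sqrt(-lam_sq) / 2.0
        return math.sin(half) / half
    return 1.0

for lam_sq in (1.5, -2.0):
    print(lam_sq, mode_product(lam_sq, 20000), closed_form(lam_sq))
```

The truncated product converges slowly (the relative error decays like the inverse of the cutoff), so the comparison is only meant to hold to a few digits.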


A simple form of the 0-loop part is due to the fact that multiplication by constant 1-forms respects the splitting of forms into infrared and ultraviolet parts (9, 10).⁷ The functional determinant is easily computed e.g. by using the exponential basis $\{e^{2\pi i k\tau}\}$ for 0-forms on the circle. The effective action (13) satisfies the quantum master equation

$$\frac{\partial}{\partial\psi_0^a} \frac{\partial}{\partial A_0^a}\; e^{\frac{1}{\hbar} W(\psi_0, A_0, \hbar)} = 0.$$

The classical master equation is implied by the Jacobi identity and by the cyclicity property of the Lie bracket on $\mathfrak{g}$; the quantum part of the master equation follows from the fact that the one-loop part of $W$ is manifestly ad-invariant.

2.3. Simplicial Chern-Simons action on circle. Let’s assume that the circle $S^1$ is glued from $n$ intervals $I_1 = [p_1, p_2], \ldots, I_n = [p_n, p_1]$ (where $p_1, \ldots, p_n$ is a cyclically ordered collection of points on the circle), and that each interval $I_k$ is equipped with a coordinate function $\tau : I_k \to [0, 1]$. We will denote a point of $I_k$ with coordinate $\tau$ by $(k, \tau)$. We will assume $n$ to be odd (otherwise, our gauge will be inconsistent, see Remark 2). We denote this “triangulation” of the circle by $\Xi_n$.

2.3.1. Cyclic Whitney gauge. Splitting of fields into infrared and ultraviolet parts is defined by splitting for differential forms on the circle $\Omega^\bullet(S^1) = \Omega'^\bullet_{\Xi_n}(S^1) \oplus \Omega''^\bullet_{\Xi_n}(S^1)$, where we split 0-forms into continuous piecewise-linear ones and those with vanishing integrals over each $I_k$, and we split 1-forms into piecewise-constant ones and the orthogonal complement of piecewise-linear 0-forms:

$$\begin{aligned}
\Omega'^\bullet_{\Xi_n}(S^1) &= \{ f' + d\tau\, g' \mid f'|_{I_k} = (1-\tau) f_k + \tau f_{k+1},\ g'|_{I_k} = g_k \ \ \forall k \}, \\
\Omega''^\bullet_{\Xi_n}(S^1) &= \Big\{ f'' + d\tau\, g'' \ \Big|\ \int_{I_k} d\tau\, f'' = 0,\ \int_{I_k} \tau\, d\tau\, g'' + \int_{I_{k+1}} (1-\tau)\, d\tau\, g'' = 0 \ \ \forall k \Big\},
\end{aligned} \tag{14}$$

where $f_k, g_k \in \mathbb{R}$ are numbers. Thus, $\Omega'^\bullet_{\Xi_n}(S^1) \cong \mathbb{R}^{n|n}$ (as a super-space). As a cochain complex, $\Omega'^\bullet_{\Xi_n}(S^1)$ is isomorphic to $C^\bullet(\Xi_n)$, the cochain complex of the simplicial complex $\Xi_n$. As in Sect. 2.2.1, this splitting agrees with the de Rham differential, and the associated splitting for fields $\mathcal{F} = \mathcal{F}' \oplus \mathcal{F}''$ agrees with the BV 2-form.

We call splitting (14) the “cyclic Whitney gauge”, because our representatives $\Omega'^\bullet$ for cell cochains of the triangulation are exactly the Whitney forms [17] for $\Xi_n$. The word “cyclic” indicates that $\Omega''^\bullet$ is constructed as an orthogonal complement of $\Omega'^\bullet$ with respect to the Poincaré pairing $\int_{S^1} \bullet \wedge \bullet$. Coordinates on $\mathcal{F}'$ are given by values of $\psi'$ at the vertices of the triangulation:

$$\psi_k = \psi'(p_k) \in \mathfrak{g}, \tag{15}$$

and by integrals of $A'$ over intervals:

$$A_k = \int_{I_k} A' \in \mathfrak{g}. \tag{16}$$

⁷ Higher order terms in the 0-loop effective action would correspond to Massey operations on the de Rham cohomology (cf. [7,15]). In the case of the circle, Massey operations vanish.
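The role of the parity of $n$ in the cyclic Whitney gauge can be seen from a small computation. A continuous piecewise-linear 0-form is fixed by its vertex values $f_k$, and its integral over $I_k$ is the average $(f_k + f_{k+1})/2$; the ultraviolet 0-forms are those whose integrals over every $I_k$ vanish. The splitting of 0-forms is therefore direct only if the cyclic averaging map is invertible. The sketch below (helper names are illustrative) inverts this map for odd $n$ by an alternating sum, which follows from $f_k + f_{k+1} = 2\,\mathrm{avg}_k$, and exhibits the kernel for even $n$.

```python
def interval_averages(f):
    """Integrals over I_k of the piecewise-linear interpolation of vertex values f."""
    n = len(f)
    return [0.5 * (f[k] + f[(k + 1) % n]) for k in range(n)]

def vertex_values(avgs):
    """Recover vertex values from interval averages; possible only for odd n.

    For odd n the alternating sum f_k = sum_{j=0}^{n-1} (-1)^j avg_{k+j}
    telescopes to f_k, because the last term wraps around with a plus sign.
    """
    n = len(avgs)
    if n % 2 == 0:
        raise ValueError("n must be odd: the averaging map has a kernel for even n")
    return [sum((-1) ** j * avgs[(k + j) % n] for j in range(n)) for k in range(n)]

f = [1.0, -2.0, 0.5, 3.0, 4.0]              # n = 5 vertex values
print(vertex_values(interval_averages(f)))  # reproduces f up to rounding

# For even n the alternating vertex values are invisible to the averages.
print(interval_averages([1.0, -1.0, 1.0, -1.0]))
```

For even $n$ the alternating configuration is a piecewise-linear function with zero integral over every interval, so it lies in both the infrared and the ultraviolet subspace.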


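The BV 2-form on $\mathcal{F}'$ is given by

$$\omega' = \sum_{k=1}^{n} \frac{\delta\psi_k^a + \delta\psi_{k+1}^a}{2} \wedge \delta A_k^a. \tag{17}$$

We will denote $\mathcal{F}' = \mathcal{F}_{\Xi_n}$ to emphasize its dependence on $n$. We have $\mathcal{F}_{\Xi_n} \cong \mathfrak{g} \otimes C^\bullet(\Xi_n)$.

Remark 2. The requirement that $n$ be odd is needed since for $n$ even the piecewise-linear 0-form $f(k, \tau) = (-1)^k (\tau - 1/2)$ belongs to both the infrared and ultraviolet subspaces.

As in Sect. 2.2.1, we define the Lagrangian subspace $\mathcal{L} \subset \mathcal{F}''$ by setting the 1-form part of the ultraviolet field to zero (11).

2.3.2. Chain homotopy, dressed chain homotopy. Let us define the infrared projector $P' : \Omega^\bullet(S^1) \to \Omega'^\bullet_{\Xi_n}(S^1)$ by the formula

$$\begin{aligned}
f + d\tau \cdot g \ \mapsto\ & \sum_{k=1}^{n} \left( \int_{I_k} d\tau\, f - (1 - 2\tau) \cdot \Big( \int_{I_{k+1}} d\tau\, f - \int_{I_{k+2}} d\tau\, f + \cdots + \int_{I_{k-2}} d\tau\, f - \int_{I_{k-1}} d\tau\, f \Big) \right) \theta_{I_k} \\
& + d\tau \sum_{k=1}^{n} \left( \int_{I_k} d\tau\, g + \int_{I_{k+1}} (1 - 2\tau)\, d\tau\, g - \int_{I_{k+2}} (1 - 2\tau)\, d\tau\, g + \cdots - \int_{I_{k-1}} (1 - 2\tau)\, d\tau\, g \right) \theta_{I_k},
\end{aligned} \tag{18}$$

where $\theta_{I_k}$ is the function on the circle with value 1 on $I_k$ and zero elsewhere. The chain homotopy $\kappa : \Omega^1(S^1) \to \Omega^0(S^1)$ is uniquely defined by the properties

$$d\,\kappa + \kappa\, d = \mathrm{id} - P', \tag{19}$$
$$P' \kappa = 0, \tag{20}$$
$$\kappa P' = 0. \tag{21}$$

Lemma 1. The operator $\kappa$ defined by relations (19), (20), (21) acts on the 1-form $d\tau \cdot g \in \Omega^1(S^1)$ by⁸

$$\kappa(d\tau \cdot g)(k, \tau) = \sum_{k'=1}^{n} \int_{I_{k'}} d\tau'\; \kappa((k, \tau), (k', \tau'))\; g(k', \tau'), \tag{22}$$

⁸ Recall that $(k, \tau)$ denotes a point on the circle which belongs to the interval $I_k$ and which has a local coordinate $\tau$. Hence, the integral kernel here is actually a function on $S^1 \times S^1$.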


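where the integral kernel is given by

$$\kappa((k, \tau), (k', \tau')) = \begin{cases} \theta(\tau - \tau') - \frac{1}{2} - \tau + \tau' & \text{if } k = k', \\[4pt] (-1)^{k - k'}\, 2 \left(\frac{1}{2} - \tau\right) \left(\frac{1}{2} - \tau'\right) & \text{if } k' < k < k' + n, \end{cases} \tag{23}$$

and $\theta$ is the unit step function. In addition, the kernel has the anti-symmetry property: $\kappa((k', \tau'), (k, \tau)) = -\kappa((k, \tau), (k', \tau'))$.

To obtain formula (23), one observes that relations (19) and (20) imply the differential equation

$$\frac{\partial}{\partial\tau}\, \kappa((k, \tau), (k', \tau')) = \delta_{k,k'}\, \delta(\tau - \tau') + C_k(k', \tau') \tag{24}$$

subject to conditions

$$\int_0^1 d\tau\; \kappa((k, \tau), (k', \tau')) = 0, \qquad \kappa((k, 1), (k', \tau')) = \kappa((k+1, 0), (k', \tau')) \quad \forall k. \tag{25}$$

Here $C_k(k', \tau')$ are some functions independent of $\tau$. Solving (24) together with (25) immediately yields (23). This proves the uniqueness property. In order to prove existence, one checks that (23) satisfies (19), (20), (21).

We will also need the “dressed” chain homotopy (i.e. dressed by the connection) $\kappa_{A'} : \mathfrak{g} \otimes \Omega^1(S^1) \to \mathfrak{g} \otimes \Omega^0(S^1)$, where $A' = \sum_{k=1}^{n} A_k\, \theta_{I_k}$ is a piecewise-constant $\mathfrak{g}$-valued 1-form on the circle. The operator $\kappa_{A'}$ is uniquely defined by the properties

$$(\mathrm{id} - P')\, d_{A'}\, \kappa_{A'} + \kappa_{A'}\, d_{A'}\, (\mathrm{id} - P') = \mathrm{id} - P', \tag{26}$$
$$P' \kappa_{A'} = 0, \tag{27}$$
$$\kappa_{A'} P' = 0, \tag{28}$$

where $d_{A'} = d + \mathrm{ad}_{A'}$. One can summarize these properties by saying that $\kappa_{A'} : \mathfrak{g} \otimes \Omega''^1_{\Xi_n}(S^1) \to \mathfrak{g} \otimes \Omega''^0_{\Xi_n}(S^1)$ is the inverse of the operator $(\mathrm{id} - P')\, d_{A'}\, (\mathrm{id} - P') : \mathfrak{g} \otimes \Omega''^0_{\Xi_n}(S^1) \to \mathfrak{g} \otimes \Omega''^1_{\Xi_n}(S^1)$. The dressed chain homotopy can either be expressed perturbatively as a series

$$\kappa_{A'} = \kappa - \kappa\, \mathrm{ad}_{A'}\, \kappa + \kappa\, \mathrm{ad}_{A'}\, \kappa\, \mathrm{ad}_{A'}\, \kappa - \cdots,$$

or it can be computed explicitly by solving the differential equation (26). To present an explicit formula for the integral kernel of $\kappa_{A'}$ (defined as in (22)), we have to first introduce some notation. Let $F(A, \tau)$ be an $\mathrm{End}(\mathfrak{g})$-valued function on the interval $[0, 1]$ depending on an anti-symmetric matrix $A \in so(\mathfrak{g}) \subset \mathrm{End}(\mathfrak{g})$ and defined by the following properties:

$$F(A, \bullet) \in \mathrm{Span}_{\mathrm{End}(\mathfrak{g})}(1, e^{-A\tau}), \qquad \int_0^1 d\tau\, F(A, \tau) = 0, \qquad F(A, 0) = 1. \tag{29}$$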


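Property (29) is equivalently stated as $(d + A) F(A, \bullet) = \mathrm{const}$ (this constant depends on $A$). It is convenient to introduce the notation $R(A) = -F(A, 1) \in \mathrm{End}(\mathfrak{g})$. More explicitly, we have

$$F(A, \tau) = \frac{\left(\frac{1}{2} \coth\frac{A}{2} + \frac{1}{2}\right) e^{-A\tau} - A^{-1}}{\frac{1}{2} \coth\frac{A}{2} + \frac{1}{2} - A^{-1}}, \tag{30}$$

$$R(A) = -\,\frac{A^{-1} + \frac{1}{2} - \frac{1}{2} \coth\frac{A}{2}}{A^{-1} - \frac{1}{2} - \frac{1}{2} \coth\frac{A}{2}}. \tag{31}$$

Since the function $x^{-1} - \frac{1}{2} \coth\frac{x}{2}$ is analytic at $x = 0$, expressions (30), (31) are in fact well-defined for non-invertible $A$. We will use the notation

$$\mu_k(A') = R(\mathrm{ad}_{A_{k-1}})\, R(\mathrm{ad}_{A_{k-2}}) \cdots R(\mathrm{ad}_{A_{k+1}})\, R(\mathrm{ad}_{A_k}) \in \mathrm{End}(\mathfrak{g}). \tag{32}$$

The following reflection properties

$$F(-A, \tau) = -\frac{F(A, 1 - \tau)}{R(A)}, \qquad R(-A) = R(A)^{-1}, \qquad \mu_k(A')^T = \mu_k(A')^{-1}$$

mean that $R(A)$ and $\mu_k(A')$ take values in orthogonal matrices $O(\mathfrak{g}) \subset \mathrm{End}(\mathfrak{g})$. We can now present the result for the integral kernel of the dressed chain homotopy $\kappa_{A'}$:

Lemma 2. The operator $\kappa_{A'}$ defined by relations (26), (27), (28) acts on a $\mathfrak{g}$-valued 1-form $d\tau \cdot \beta \in \mathfrak{g} \otimes \Omega^1(S^1)$ by the formula

$$\kappa_{A'}(d\tau \cdot \beta)(k, \tau) = \sum_{k'=1}^{n} \int_{I_{k'}} d\tau'\; \kappa_{A'}((k, \tau), (k', \tau')) \circ \beta(k', \tau'), \tag{33}$$

where the integral kernel is given by

$$\kappa_{A'}((k, \tau), (k', \tau')) = \begin{cases} \left( \theta(\tau - \tau') - \frac{1}{2} + \frac{1}{2} \coth\frac{\mathrm{ad}_{A_k}}{2} \right) e^{(\tau' - \tau)\, \mathrm{ad}_{A_k}} - (\mathrm{ad}_{A_k})^{-1} \\[4pt] \quad + F(\mathrm{ad}_{A_k}, \tau) \left( \frac{1}{1 + \mu_k(A')} - \frac{1}{1 + R(\mathrm{ad}_{A_k})} \right) F(-\mathrm{ad}_{A_k}, \tau') & \text{if } k = k', \\[8pt] (-1)^{k - k'}\, F(\mathrm{ad}_{A_k}, \tau)\, R(\mathrm{ad}_{A_{k-1}}) \cdots R(\mathrm{ad}_{A_{k'}})\, \frac{1}{1 + \mu_{k'}(A')}\, F(-\mathrm{ad}_{A_{k'}}, \tau') & \text{if } k' < k < k' + n. \end{cases} \tag{34}$$

This integral kernel is anti-symmetric: $\kappa_{A'}((k', \tau'), (k, \tau))^T = -\kappa_{A'}((k, \tau), (k', \tau'))$.

To obtain (34) we proceed as in the derivation of (23). The differential equation implied by (26) is as follows:

$$\left( \frac{\partial}{\partial\tau} + \mathrm{ad}_{A_k} \right) \kappa_{A'}((k, \tau), (k', \tau')) = \delta_{k,k'}\, \delta(\tau - \tau') + C_k(k', \tau'). \tag{35}$$

Conditions (25) are still fulfilled. Solving (35) with these conditions imposed yields formula (34).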
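The defining properties (29) and the reflection properties of $F$ and $R$ can be tested numerically in the simplest setting, where the matrix $A$ is replaced by a single eigenvalue $x$. This is a scalar sketch only; it assumes that checking the identities eigenvalue by eigenvalue is representative of the matrix statement for a diagonalizable $A$, and the function names are illustrative. A purely imaginary $x$ mimics an eigenvalue of an anti-symmetric matrix, for which orthogonality of $R(A)$ becomes $|R(x)| = 1$.

```python
import cmath

def coth(z):
    return 1.0 / cmath.tanh(z)

def F(x, tau):
    """Scalar version of (30): the matrix A is replaced by one eigenvalue x != 0."""
    c = 0.5 * coth(x / 2) + 0.5
    return (c * cmath.exp(-x * tau) - 1 / x) / (c - 1 / x)

def R(x):
    """Scalar version of R(A) = -F(A, 1)."""
    return -F(x, 1.0)

def integral_of_F(x, steps=2000):
    """Composite Simpson rule for the integral of F(x, tau) over [0, 1] (steps even)."""
    h = 1.0 / steps
    total = F(x, 0.0) + F(x, 1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * F(x, i * h)
    return total * h / 3

x = 0.7j                                   # eigenvalue of an anti-symmetric matrix
print(abs(F(x, 0.0) - 1))                  # F(A, 0) = 1
print(abs(integral_of_F(x)))               # vanishing integral
print(abs(R(x) * R(-x) - 1))               # R(-A) = R(A)^{-1}
print(abs(abs(R(x)) - 1))                  # |R| = 1 on the imaginary axis
print(abs(F(-x, 0.3) + F(x, 0.7) / R(x)))  # F(-A, tau) = -F(A, 1 - tau) / R(A)
print(abs(F(1e-3, 0.3) - (1 - 2 * 0.3)))   # small-x limit: F -> 1 - 2 tau
```

The last line illustrates how the dressed kernel (34) degenerates to (23): for small $x$ one has $F(x,\tau) \approx 1 - 2\tau$, $R \approx 1$ and $\mu_k \approx 1$, so the off-diagonal entries become $(-1)^{k-k'}\,2(\frac12-\tau)(\frac12-\tau')$.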


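2.3.3. Simplicial action. The simplicial Chern-Simons action $S_{\Xi_n}$ for the triangulation $\Xi_n$ of the circle is defined by the fiber BV integral

$$e^{\frac{1}{\hbar} S_{\Xi_n}(\psi', A', \hbar)} = \int_{\mathcal{L}} e^{\frac{1}{\hbar} S(\psi' + \psi'',\ A' + A'')}. \tag{36}$$

As in the case of induction to cohomology, one puts $A''|_{\mathcal{L}} = 0$, and the integral becomes Gaussian:

$$e^{\frac{1}{\hbar} S_{\Xi_n}(\psi', A', \hbar)} = \int \mathcal{D}\psi''\; e^{\frac{1}{2\hbar} \int_{S^1} (\psi' + \psi'',\ d_{A'}(\psi' + \psi''))}. \tag{37}$$

Expanding the integrand as $(\psi', d_{A'}\psi') + (\psi'', d\psi'') + (\psi'', \mathrm{ad}_{A'}\psi'') + 2(\psi'', \mathrm{ad}_{A'}\psi')$, one can consider the second term as the “free” part of the action and the third and fourth terms as a perturbation. In this way, we arrive at the following Feynman diagram expansion for $S_{\Xi_n}$:

$$\begin{aligned}
S_{\Xi_n} = {} & \underbrace{\int_{S^1} \Big( \tfrac{1}{2}(\psi', d\psi') + \tfrac{1}{2}(\psi', \mathrm{ad}_{A'}\psi') - \tfrac{1}{2}(\psi', \mathrm{ad}_{A'}\, \kappa\, \mathrm{ad}_{A'}\psi') + \tfrac{1}{2}(\psi', \mathrm{ad}_{A'}\, \kappa\, \mathrm{ad}_{A'}\, \kappa\, \mathrm{ad}_{A'}\psi') - \cdots \Big)}_{S^0_{\Xi_n}} \\
& + \hbar\, \underbrace{\Big( \tfrac{1}{2 \cdot 2}\, \mathrm{tr}_{\mathfrak{g} \otimes \Omega^0(S^1)} (\kappa\, \mathrm{ad}_{A'}\, \kappa\, \mathrm{ad}_{A'}) - \tfrac{1}{2 \cdot 3}\, \mathrm{tr}_{\mathfrak{g} \otimes \Omega^0(S^1)} (\kappa\, \mathrm{ad}_{A'}\, \kappa\, \mathrm{ad}_{A'}\, \kappa\, \mathrm{ad}_{A'}) + \cdots \Big)}_{S^1_{\Xi_n}}.
\end{aligned} \tag{38}$$

Here the first line is the sum of “tree” diagrams and the second line is the sum of “wheel” diagrams. The tree part of $S_{\Xi_n}$ can be expressed in terms of the dressed chain homotopy,

$$S^0_{\Xi_n} = \frac{1}{2} \int_{S^1} (\psi', d\psi') + \frac{1}{2} \int_{S^1} (\psi', \mathrm{ad}_{A'}\psi') - \frac{1}{2} \int_{S^1} (\psi', \mathrm{ad}_{A'}\, \kappa_{A'}\, \mathrm{ad}_{A'}\psi'). \tag{39}$$

For the 1-loop part, we first write

$$S^1_{\Xi_n} = \frac{1}{2}\, \mathrm{tr}_{\mathfrak{g} \otimes \Omega^0(S^1)} \log(1 + \kappa\, \mathrm{ad}_{A'}).$$

Then, by using the general formula

$$\frac{\partial}{\partial s}\, \mathrm{tr} \log M_s = \mathrm{tr} \left( M_s^{-1}\, \frac{\partial}{\partial s} M_s \right)$$

one obtains

$$S^1_{\Xi_n} = \int_0^1 ds\; \frac{\partial}{\partial s}\, \frac{1}{2}\, \mathrm{tr}_{\mathfrak{g} \otimes \Omega^0(S^1)} \log(1 + \kappa\, \mathrm{ad}_{sA'}) = \frac{1}{2} \int_0^1 ds\; \mathrm{tr}_{\mathfrak{g} \otimes \Omega^0(S^1)} \Big( \underbrace{(1 + \kappa\, \mathrm{ad}_{sA'})^{-1} \kappa}_{\kappa_{sA'}}\; \mathrm{ad}_{A'} \Big).$$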


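One can evaluate the functional trace in the integrand in the coordinate representation (i.e. in the basis of delta-functions) by expressing it in terms of the integral kernel (34) restricted to the diagonal $S^1 \subset S^1 \times S^1$:

$$S^1_{\Xi_n} = \frac{1}{2} \int_0^1 ds \sum_{k=1}^{n} \int_0^1 d\tau\; \mathrm{tr}_{\mathfrak{g}} \Big( \kappa_{sA'}((k, \tau), (k, \tau))\; \mathrm{ad}_{A_k} \Big). \tag{40}$$

There is an ambiguity in this expression since the integral kernel (34) is discontinuous on the diagonal. We regularize this ambiguity by using the convention

$$\theta(0) = \frac{1}{2}. \tag{41}$$

Observe that the value assigned to $\theta(0)$ does not really matter: changing the convention for $\theta(0)$ changes $S^1_{\Xi_n}$ by $\propto \mathrm{tr}_{\mathfrak{g}}\, \mathrm{ad}_{A_k} = 0$. It is interesting that the integral over the auxiliary parameter $s$ in (40) can be computed explicitly. By putting together (39) and (40) and substituting (34) we obtain the following explicit result for the simplicial Chern-Simons action on the triangulated circle.

Theorem 1. The simplicial Chern-Simons action on the triangulated circle is given by

$$\begin{aligned}
S_{\Xi_n} = {} & -\frac{1}{2} \sum_{k=1}^{n} \Big( (\psi_k, \psi_{k+1}) + \frac{1}{3} (\psi_k, \mathrm{ad}_{A_k} \psi_k) + \frac{1}{3} (\psi_{k+1}, \mathrm{ad}_{A_k} \psi_{k+1}) + \frac{1}{3} (\psi_k, \mathrm{ad}_{A_k} \psi_{k+1}) \Big) \\
& + \frac{1}{2} \sum_{k=1}^{n} \Big( \psi_{k+1} - \psi_k,\ \Big[ \frac{1 - R(\mathrm{ad}_{A_k})}{2} \Big( \frac{1}{1 + \mu_k(A')} - \frac{1}{1 + R(\mathrm{ad}_{A_k})} \Big) \frac{1 - R(\mathrm{ad}_{A_k})}{2 R(\mathrm{ad}_{A_k})} \\
& \qquad\qquad + (\mathrm{ad}_{A_k})^{-1} + \frac{1}{12}\, \mathrm{ad}_{A_k} - \frac{1}{2} \coth\frac{\mathrm{ad}_{A_k}}{2} \Big] \circ (\psi_{k+1} - \psi_k) \Big) \\
& + \frac{1}{2} \sum_{k'=1}^{n} \sum_{k=k'+1}^{k'+n-1} (-1)^{k-k'} \Big( \psi_{k+1} - \psi_k,\ \frac{1 - R(\mathrm{ad}_{A_k})}{2}\; R(\mathrm{ad}_{A_{k-1}}) \cdots R(\mathrm{ad}_{A_{k'}}) \cdot \frac{1}{1 + \mu_{k'}(A')} \cdot \frac{1 - R(\mathrm{ad}_{A_{k'}})}{2 R(\mathrm{ad}_{A_{k'}})} \circ (\psi_{k'+1} - \psi_{k'}) \Big) \\
& + \hbar\, \frac{1}{2}\, \mathrm{tr}_{\mathfrak{g}} \log \left( (1 + \mu_\bullet(A')) \cdot \prod_{k=1}^{n} \left( \frac{1}{1 + R(\mathrm{ad}_{A_k})} \cdot \frac{\sinh\frac{\mathrm{ad}_{A_k}}{2}}{\frac{\mathrm{ad}_{A_k}}{2}} \right) \right).
\end{aligned} \tag{42}$$

In the last term (the 1-loop contribution) $\mu_\bullet(A')$ stands for $\mu_l(A')$ for arbitrary $l$. For different $l$, these matrices differ by conjugation. Hence, the expression $\det_{\mathfrak{g}}(1 + \mu_\bullet(A'))$ is well defined.

2.3.4. Remarks.

Remark 3. In deriving formula (42) we were sloppy about additive constants (we did not pay attention to normalization of the measure in the functional integral (37)). We chose an ad hoc normalization

$$S_{\Xi_n}(0, 0, \hbar) = -\hbar\, \frac{n-1}{2}\, \dim\mathfrak{g}\, \log 2$$

which turns out to be consistent with the operator formalism (see Sect. 3.1).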


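Remark 4. Setting $n = 1$ in (42) yields the effective action on cohomology (13).

Remark 5. By expanding (42) in a power series with respect to $A'$ we get back the perturbative expansion (38),

$$\begin{aligned}
S_{\Xi_n} = {} & -\frac{1}{2} \sum_{k=1}^{n} (\psi_k, \psi_{k+1}) - \frac{1}{2} \sum_{k=1}^{n} \Big( \frac{1}{3} (\psi_k, \mathrm{ad}_{A_k} \psi_k) + \frac{1}{3} (\psi_{k+1}, \mathrm{ad}_{A_k} \psi_{k+1}) + \frac{1}{3} (\psi_k, \mathrm{ad}_{A_k} \psi_{k+1}) \Big) \\
& + \frac{1}{2} \sum_{k'=1}^{n} \sum_{k=k'+1}^{k'+n-1} \frac{1}{72}\, (-1)^{k-k'} \big( \psi_{k+1} - \psi_k,\ \mathrm{ad}_{A_k}\, \mathrm{ad}_{A_{k'}} (\psi_{k'+1} - \psi_{k'}) \big) \\
& - \hbar\, \frac{n-1}{2}\, \dim\mathfrak{g}\, \log 2 + \hbar\, \frac{1}{2 \cdot 2} \left( \frac{1}{12} \sum_{k=1}^{n} \mathrm{tr}_{\mathfrak{g}} (\mathrm{ad}_{A_k})^2 + \frac{1}{36} \sum_{k'=1}^{n} \sum_{k=k'+1}^{k'+n-1} \mathrm{tr}_{\mathfrak{g}} (\mathrm{ad}_{A_k}\, \mathrm{ad}_{A_{k'}}) \right) + O((A')^3).
\end{aligned} \tag{43}$$

Remark 6. Naïvely, at large $n$ the simplicial action $S_{\Xi_n}$ can be viewed as a lattice approximation to the continuum action (8). For $\psi$ and $A$ fixed, we have

$$S_{\Xi_n} \Big( \{\psi_k = \psi(k/n)\},\ \Big\{ A_k = \int_{k/n}^{(k+1)/n} A \Big\} \Big) \ \longrightarrow\ S(\psi, A) + O(1/n)$$

when $n$ tends to infinity. The point is that rather than being just an approximation, the simplicial theory $(\mathcal{F}_{\Xi_n}, S_{\Xi_n})$ is exactly equivalent to the continuum theory $(\mathcal{F}, S)$ for any finite $n$.

Remark 7. The simplicial theory is constructed by the fiber BV integral from the continuum theory. Hence, we expect the simplicial action to satisfy the quantum master equation

$$\Delta_{\Xi_n}\, e^{\frac{1}{\hbar} S_{\Xi_n}} = 0 \tag{44}$$

with the BV Laplacian associated to the BV 2-form (17),

$$\Delta_{\Xi_n} = \sum_{k=1}^{n} \frac{\partial}{\partial\tilde\psi_k^a} \frac{\partial}{\partial A_k^a}, \tag{45}$$

where

$$\tilde\psi_k = \frac{\psi_k + \psi_{k+1}}{2}. \tag{46}$$

However, the BV Laplacian in the continuum theory is ill-defined. Therefore, the quantum master equation (44) is not automatic, and it must be checked independently. It is easy to do it in low degrees in $A'$ by using the expansion (43). We will prove the quantum master equation via the operator approach in Sect. 3.2.3.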


A. Alekseev, P. Mnëv

Remark 8. Another property we expect from the simplicial theory is its compatibility with simplicial aggregations n+2 → n . We will prove this property in Sect. 3.2.2. Remark 9. The gauge choice (14) is actually rigid up to diffeomorphisms of intervals Ik . More exactly, if we require compatibility of the splitting = ⊕ with the de Rham differential and with the pairing S 1 • ∧ • (i.e. so that and are subcomplexes of , and is orthogonal to ), then the splitting is completely determined by the images of basis 1-cochains of n in 1 (S 1 ). If in addition, we require that the image of each basis 1-cochain ek(1) be supported exactly on the respective interval Ik ⊂ S 1 (a kind of simplicial locality property), we obtain the splitting (14) up to diffeomorphisms of (1) intervals Ik taking the representatives of basis 1-cochains ek to constant forms θIk dτ . It is important to note that we are implicitly assuming that the splitting for fields is induced by the splitting for real-valued differential forms. If we drop this assumption and allow splittings of g ⊗ which are not obtained from a splitting of by tensoring with g, we can introduce other gauges. Remark 10. The embedding of cell cochains of n to the space of differential forms in (14) is the same as in the 1-dimensional simplicial BF theory [15]: 0-cochains are represented by continuous piecewise-linear functions, and 1-cochains are represented by piecewise-constant 1-forms. However, the projection (18) is very different. In fact, it is non-local: for 0-forms instead of evaluation at vertices (as in simplicial BF) we have a sum of integrals over all intervals Ik with certain signs. And for 1-forms, instead of integration over one interval we have an integral over the whole circle with a certain piecewise-linear integral kernel. Thus, in the splitting = ⊕ the infrared part is like in the simplicial BF theory, but the ultraviolet part is different. Remark 11. 
The action (42) can be viewed as a generating function of a certain infinitystructure on g ⊗ C • (n ). In more detail, this is a structure of a loop-enhanced (or “quantum”, or “unimodular”) cyclic L ∞ algebra with the structure maps (operations) (l) ck : ∧k (g ⊗ C • (n )) → R related to the action (42) by

Sn =

1 ∞ l l=0 k=2

k!

(l)

ck (ψ + A , · · · , ψ + A ), k

where the superfield ψ + A is understood as the parity-shifted identity map Fn → g⊗C • (n ) (as in (3)). The quantum master equation (44) generates a family of structure (l) (0) equations on operations ck . In particular, ck satisfy the structure equations of the usual (nonunimodular) cyclic L ∞ algebra. This algebraic structure on g-valued cochains of n can be viewed as a homotopy transfer of the unimodular cyclic DGLA structure on g ⊗ • (S 1 ). Only the first two cyclic operations, c2(0) : ∧2 (g ⊗ C • (n )) → R and (0) c3 : ∧3 (g ⊗ C • (n )) → R are simplicially-local. All the other operations are nonlocal. We can also use the pairing on g ⊗ C • (n ) (induced by the pairing S 1 (•, •) on g-valued differential forms) to invert one input in cyclic operations. This gives an oriented (non-cyclic) version of the unimodular L ∞ structure (see e.g. [11]) on g ⊗ C • (n ) with (0) (1) structure maps lk : ∧k (g ⊗ C • (n )) → g ⊗ C • (n ) and lk : ∧k (g ⊗ C • (n )) → R


(l)

related to the cyclic operations ck by the following formula: ⎛ (0)

(0)

⎞

ck+1 (ψ + A , · · · , ψ + A ) = ⎝ψ + A , lk (ψ + A , · · · , ψ + A )⎠ , k+1

k≥1

k (1) ck

=

(1) lk ,

k ≥ 2. (0)

In this unimodular L ∞ algebra, only the differential l1 is local while all other operations (including the binary bracket l2(0) : ∧2 (g ⊗ C • (n )) → g ⊗ C • (n )) are non-local — unlike in the simplicial BF theory [15] where all higher operations are simplicially-local. Remark 12. The dressed chain homotopy κ A can be used to construct the non-linear map U : Fn → F, U : ψ + A → ψ + A − κ A ad A ψ . It sends the infrared field ψ + A to the conditional extremum of the continuum action S restricted to {ψ + A } ⊕ L. The tree part of the simplicial action can be expressed in terms of U , 0 S = S(U (ψ + A )) n 1 1 (U (ψ + A ), d U (ψ + A )) + (U (ψ + A ), [U (ψ + A ), U (ψ + A )]). = 6 S1 2

In the language of infinity algebras, U is an L ∞ morphism intertwining the DGLA structure on g ⊗ • (S 1 ) and the L ∞ structure on g ⊗ C • (n ).

Remark 13. There are two natural systems of g-valued coordinates on the space of simplicial BV fields Fn : {ψk , Ak } and {ψ˜ k , Ak } (where variables ψ˜ k are defined by (46)). The first coordinate system is associated to the realization of the space of fields through the cochain complex of a triangulation: Fn = g ⊗ C • (n ). The second coordinate system is associated to the realization through 1-chains and 1-cochains: Fn ∼ = g ⊗ C1 (n ) ⊕ g ⊗ C 1 (n ) (or instead of 1-chains of n one can talk of 0-cochains of 0 ∨ 1 ∼ the dual cell decomposition ∨ n : Fn = g ⊗ C (n ) ⊕ g ⊗ C (n )). The convenience of the first coordinate system is that the abelian part of the simplicial action (42) is local in variables (ψk , Ak ). The convenience of the second coordinate system {ψ˜ k , Ak } is that the BV Laplacian becomes diagonal (45).

3. Approach Through Operator Formalism

In this section, our strategy is to give a new definition of the one-dimensional Chern-Simons theory. It will be inspired by the definition of Sect. 2.1, but we will be able to consider our theory on an interval and to define a concatenation (gluing) procedure. We will check that the results obtained by the new approach are consistent with those of Sect. 2.

For the rest of the paper, we will assume that dim g = 2m is even. This is important for Theorem 2 (the correspondence between the operator and path integral formalisms): its proof relies on recovering the path integral by using the fundamental representation of the Clifford algebra Cl(g) (Sect. 4.1), which is simpler for dim g = 2m.


3.1. One-dimensional Chern-Simons theory in operator formalism.

3.1.1. First approximation. We would like to define the one-dimensional Chern-Simons theory (8) as quantum mechanics where components of the quantized odd field $\{\hat\psi^a\}$ are subject to the anti-commutation relations

$$\hat\psi^a\hat\psi^b+\hat\psi^b\hat\psi^a=\hbar\,\delta^{ab}, \tag{47}$$

i.e. $\{\hat\psi^a\}$ are generators of the Clifford algebra $Cl(\mathfrak g)$. The even 1-form (connection) field $A=d\tau\,T^aA^a(\tau)$ is non-dynamical, and it is treated as a classical background. The evolution operator for the theory on the interval is defined as a path-ordered exponential of $A$ in the spin representation:

$$U_I(A)=\overleftarrow{P\exp}\Big(-\frac{1}{2\hbar}\int_I d\tau\,f^{abc}\hat\psi^aA^b(\tau)\hat\psi^c\Big)\in Cl(\mathfrak g). \tag{48}$$

Here, the intuition is as follows: the term $\frac12(\psi,d\psi)$ in the action (8) generates the canonical anti-commutation relations (47), while the term $\frac12(\psi,[A,\psi])$ generates the time-dependent quantum Hamiltonian $\hat H(\tau)=-\frac12 f^{abc}\hat\psi^aA^b(\tau)\hat\psi^c$ which appears in (48).

The evolution operator for the concatenation of intervals $I_1=[p_1,p_2]$, $I_2=[p_2,p_3]$ with connections $A_1$, $A_2$ is naturally given by the product of the corresponding evolution operators in the Clifford algebra:

$$U_{[p_1,p_3]}\big(A_1\theta_{[p_1,p_2]}+A_2\theta_{[p_2,p_3]}\big)=U_{[p_2,p_3]}(A_2)\cdot U_{[p_1,p_2]}(A_1).$$

As in Sect. 2.3.2, $\theta_{[p_k,p_{k+1}]}=\theta_{I_k}$ denotes the function taking value 1 on the interval $I_k$ and zero everywhere else. The partition function for a circle is given by

$$Z_{S^1}(A)=\mathrm{Str}_{Cl(\mathfrak g)}\,\overleftarrow{P\exp}\Big(-\frac{1}{2\hbar}\int_I d\tau\,f^{abc}\hat\psi^aA^b(\tau)\hat\psi^c\Big)\in\mathbb R,$$

where $\mathrm{Str}_{Cl(\mathfrak g)}$ is the super-trace on $Cl(\mathfrak g)$ defined as

$$\mathrm{Str}_{Cl(\mathfrak g)}:\ \hat a\mapsto(i\hbar)^m\cdot\text{coefficient of }\hat\psi^1\cdots\hat\psi^{2m}\text{ in }\hat a. \tag{49}$$

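3.1.2. Imposing the cyclic Whitney gauge. Our next task is to model in operator formalism the fiber BV integral (36) for the theory on the circle. We will look for the analogue of the cyclic Whitney gauge introduced in Sect. 2.3.1. For the connection $A$, imposing the gauge just amounts to saying that $A$ is now infrared, i.e. a piecewise-constant connection $A'=\sum_{k=1}^n d\tau\,A_k\theta_{I_k}$ with $A_k\in\mathfrak g$. For $\psi$, we would like to restrict the integration to $\psi$'s with given integrals (average values) $\tilde\psi_k=\frac{\psi_k+\psi_{k+1}}{2}$ over intervals $I_k$. So, we are interested in the integral

$$
\begin{aligned}
e^{\frac1\hbar S_n}&=\int D\psi\ e^{\frac{1}{2\hbar}\int_{S^1}\left((\psi,d\psi)+(\psi,\mathrm{ad}_{A'}\psi)\right)}\ \prod_{k=1}^n\delta\Big(\int_{I_k}d\tau\,\psi-\tilde\psi_k\Big)\\
&=\int\prod_{k=1}^nD\lambda_k\ e^{-\sum_{k=1}^n(\lambda_k,\tilde\psi_k)}\ \underbrace{\int D\psi\ e^{\frac{1}{2\hbar}\int_{S^1}\left((\psi,d\psi)+(\psi,\mathrm{ad}_{A'}\psi)\right)+\sum_{k=1}^n\int_{I_k}d\tau\,(\lambda_k,\psi)}}_{Z_n(\lambda',A')}.
\end{aligned}\tag{50}
$$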


Here we got rid of δ-functions at the cost of introducing odd auxiliary variables $\lambda_k=T^a\lambda_k^a\in\mathfrak g$. One can organize them into an odd piecewise-constant function on the circle $\lambda'=\sum_{k=1}^n\lambda_k\theta_{I_k}$ which plays the role of a source for the field $\psi$. The entity $Z_n(\lambda',A')$ that appeared in the integrand can be written in the operator formalism as

$$Z_n(\lambda',A')=\mathrm{Str}_{Cl(\mathfrak g)}\ \overleftarrow{\prod_{k=1}^n}\exp\Big(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA_k^b\hat\psi^c+\lambda_k^a\hat\psi^a\Big). \tag{51}$$

Then, the partition function of the one-dimensional Chern-Simons theory on the circle (in the Whitney gauge) is given by the odd Fourier transform of (51):

$$Z_n(\psi',A')=(i\hbar)^{-nm}\int\prod_{k=1}^nD\lambda_k\ e^{-\sum_{k=1}^n\lambda_k^a\tilde\psi_k^a}\ Z_n(\lambda',A'). \tag{52}$$

Remark 14. To be precise with signs, we should introduce an ordering convention for the Berezin measure in (52). We set

$$\prod_{k=1}^nD\lambda_k=\overrightarrow{\prod_{k=1}^n}\ \overrightarrow{\prod_{a=1}^{2m}}D\lambda_k^a.$$

Theorem 2. For $n$ odd and $\dim\mathfrak g=2m$ even, one has

$$Z_n(\psi',A')=e^{\frac1\hbar S_n(\psi',A')}, \tag{53}$$

where the right-hand side is given by (42) and the left-hand side is defined by (51), (52).

We will prove this theorem in Sect. 4.4.2 by constructing a path integral representation for $Z_n$. But first (see Sects. 3.1.3, 3.1.4) we will perform some direct tests of formula (53).

Remark 15. Note that we can define the right-hand side of (53) only for $n$ odd, while the definition of the left-hand side makes sense for both even and odd $n$. A simple computation shows that for $n$ even the partition function $Z_n(\psi',A')$ vanishes at $\psi'=0$, $A'=0$. This agrees with the observation that for $n$ even the Whitney gauge does not apply (see Sect. 2).

Remark 16. We chose the normalization for the super-trace in the Clifford algebra (49) and for $Z_n$ (52) in a way consistent with the path integral formalism (see (76), (77)).

3.1.3. Consistency check: the effective action on cohomology.⁹ The first test of the correspondence (53) is the case of $n=1$. Let us first compute the following expression in the Clifford algebra with two generators $Cl_2$:

$$\varphi(\tilde\psi^1,\tilde\psi^2,a)=(i\hbar)^{-1}\int D\lambda^1D\lambda^2\ e^{-\lambda^1\tilde\psi^1-\lambda^2\tilde\psi^2}\ \mathrm{Str}_{Cl_2}\exp\Big(-\frac1\hbar\hat\psi^1a\hat\psi^2+\lambda^1\hat\psi^1+\lambda^2\hat\psi^2\Big), \tag{54}$$

⁹ For the reader’s convenience, we present explicit calculations in the Clifford algebra here and in subsequent sections; the general reference for the Clifford calculus is [4].


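where $a\in\mathbb R$ is a number. For the exponential under the super-trace we have

$$
\begin{aligned}
\exp\Big(-\frac1\hbar\hat\psi^1a\hat\psi^2+\lambda^1\hat\psi^1+\lambda^2\hat\psi^2\Big)
={}&\Big(-\frac2\hbar\sin(a/2)-\frac{\sin(a/2)}{a/2}\,\lambda^1\lambda^2\Big)\hat\psi^1\hat\psi^2
+\frac{\sin(a/2)}{a/2}\big(\lambda^1\hat\psi^1+\lambda^2\hat\psi^2\big)\\
&+\Big(\cos(a/2)+\frac\hbar a\Big(\cos(a/2)-\frac{\sin(a/2)}{a/2}\Big)\lambda^1\lambda^2\Big).
\end{aligned}\tag{55}
$$

In this expression, only the first term contributes to the super-trace in (54), and we obtain

$$
\begin{aligned}
\varphi(\tilde\psi^1,\tilde\psi^2,a)&=\int D\lambda^1D\lambda^2\ e^{-\lambda^1\tilde\psi^1-\lambda^2\tilde\psi^2}\Big(-\frac2\hbar\sin(a/2)-\frac{\sin(a/2)}{a/2}\,\lambda^1\lambda^2\Big)\\
&=\frac{\sin(a/2)}{a/2}-\frac2\hbar\sin(a/2)\,\tilde\psi^1\tilde\psi^2=\frac{\sin(a/2)}{a/2}\,e^{-\frac1\hbar\tilde\psi^1a\tilde\psi^2}.
\end{aligned}
$$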
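Formula (55) lends itself to a quick numerical sanity check. The sketch below is an illustration and not part of the original derivation. It assumes the normalization $\hat\psi^a\hat\psi^b+\hat\psi^b\hat\psi^a=\hbar\,\delta^{ab}$ of (47), realizes $\hat\psi^1,\hat\psi^2$ and the odd parameters $\lambda^1,\lambda^2$ by Jordan-Wigner matrices, and compares a truncated Taylor series of the exponential with the closed form. The helper names (`jordan_wigner_gammas`, `expm_series`, `check_formula_55`) are hypothetical.

```python
import numpy as np

def jordan_wigner_gammas(n_pairs):
    """Return 2*n_pairs mutually anticommuting matrices, each squaring to 1."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    id2 = np.eye(2, dtype=complex)
    gammas = []
    for j in range(n_pairs):
        for s in (sx, sy):
            g = np.eye(1, dtype=complex)
            for f in [sz] * j + [s] + [id2] * (n_pairs - j - 1):
                g = np.kron(g, f)
            gammas.append(g)
    return gammas

def expm_series(x, terms=60):
    """Matrix exponential by a plain Taylor series (adequate for small norms)."""
    result = np.eye(x.shape[0], dtype=complex)
    term = np.eye(x.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ x / k
        result = result + term
    return result

def check_formula_55(a=0.7, hbar=1.3):
    """Largest entrywise deviation between both sides of (55)."""
    g = jordan_wigner_gammas(3)
    # Clifford generators: psi^i psi^j + psi^j psi^i = hbar * delta^{ij}
    psi1 = np.sqrt(hbar / 2) * g[0]
    psi2 = np.sqrt(hbar / 2) * g[1]
    # Nilpotent, mutually anticommuting stand-ins for the odd parameters
    lam1 = (g[2] + 1j * g[3]) / 2
    lam2 = (g[4] + 1j * g[5]) / 2
    one = np.eye(8, dtype=complex)
    lhs = expm_series(-(a / hbar) * psi1 @ psi2 + lam1 @ psi1 + lam2 @ psi2)
    s, c = np.sin(a / 2), np.cos(a / 2)
    s0 = s / (a / 2)
    rhs = ((-(2 / hbar) * s * one - s0 * lam1 @ lam2) @ psi1 @ psi2
           + s0 * (lam1 @ psi1 + lam2 @ psi2)
           + c * one + (hbar / a) * (c - s0) * lam1 @ lam2)
    return np.max(np.abs(lhs - rhs))

print(check_formula_55())
```

The odd parameters are modelled as $(\Gamma+i\Gamma')/2$ built from extra gamma matrices, so that they square to zero and anticommute with each other and with $\hat\psi^{1,2}$; this is a convenient matrix model of the Grassmann variables, chosen here only for the check.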

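Next let us consider an anti-symmetric block-diagonal matrix with blocks $2\times2$

$$
A=\begin{pmatrix}
0&a^1&&&&&\\
-a^1&0&&&&&\\
&&0&a^2&&&\\
&&-a^2&0&&&\\
&&&&\ddots&&\\
&&&&&0&a^m\\
&&&&&-a^m&0
\end{pmatrix}\in so(\mathfrak g)\subset\mathrm{End}(\mathfrak g). \tag{56}
$$

Then, we have

$$
\begin{aligned}
(i\hbar)^{-m}\int\prod_aD\lambda^a\ e^{-\lambda^a\tilde\psi^a}\ \mathrm{Str}_{Cl(\mathfrak g)}\exp\Big(-\frac{1}{2\hbar}\hat\psi^aA^{ab}\hat\psi^b+\lambda^a\hat\psi^a\Big)
&=\prod_{p=1}^m\varphi(\tilde\psi^{2p-1},\tilde\psi^{2p},a^p)\\
&={\det}_{\mathfrak g}^{1/2}\Bigg(\frac{\sinh\frac A2}{\frac A2}\Bigg)\cdot e^{-\frac{1}{2\hbar}\tilde\psi^aA^{ab}\tilde\psi^b}.
\end{aligned}\tag{57}
$$

Since both the left-hand side and the right-hand side of (57) are $SO(\mathfrak g)$-invariant, the equality actually holds for all anti-symmetric matrices $A$, and in particular for $A=\mathrm{ad}_{A_0}$, where $A_0\in\mathfrak g$. Thus, we have shown that

$$Z_1(\tilde\psi,A_0)={\det}_{\mathfrak g}^{1/2}\Bigg(\frac{\sinh\frac{\mathrm{ad}_{A_0}}{2}}{\frac{\mathrm{ad}_{A_0}}{2}}\Bigg)\cdot e^{-\frac{1}{2\hbar}(\tilde\psi,\mathrm{ad}_{A_0}\tilde\psi)}.$$

The right-hand side coincides with (13), so we have checked the correspondence (53) in the case of $n=1$.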


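3.1.4. Consistency check: the case of mutually commuting $\{A_k\}$ and $\psi'=0$. Now we would like to perform a direct check of the correspondence (53) at the point $\psi'=0$ (i.e. neglecting the tree part of the simplicial action) and assuming that $[A_k,A_{k'}]=0$ for all $k,k'=1,\dots,n$. That is, we will check that

$$
(i\hbar)^{-nm}\int\prod_{k=1}^nD\lambda_k\ \mathrm{Str}_{Cl(\mathfrak g)}\overleftarrow{\prod_{k=1}^n}\exp\Big(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA_k^b\hat\psi^c+\lambda_k^a\hat\psi^a\Big)
={\det}_{\mathfrak g}^{1/2}\Bigg(\frac{1+\prod_{k=1}^nR(\mathrm{ad}_{A_k})}{\prod_{k=1}^n(1+R(\mathrm{ad}_{A_k}))}\cdot\prod_{k=1}^n\frac{\sinh\frac{\mathrm{ad}_{A_k}}{2}}{\frac{\mathrm{ad}_{A_k}}{2}}\Bigg). \tag{58}
$$

We use the idea of Sect. 3.1.3 to reduce (58) to a computation in $Cl_2$. Since all $A_k$ mutually commute, we can choose an orthonormal basis in $\mathfrak g$ such that the matrices $\mathrm{ad}_{A_k}$ simultaneously assume the standard form (56). Then, both sides of (58) factorize into contributions of $2\times2$ blocks, and it suffices to check the identity

$$
(i\hbar)^{-n}\int\prod_{k=1}^n(D\lambda_k^1D\lambda_k^2)\ \mathrm{Str}_{Cl_2}\overleftarrow{\prod_{k=1}^n}\exp\Big(-\frac1\hbar\hat\psi^1a_k\hat\psi^2+\lambda_k^1\hat\psi^1+\lambda_k^2\hat\psi^2\Big)
={\det}^{1/2}\Bigg(\frac{1+\prod_{k=1}^nR(i\sigma_2a_k)}{\prod_{k=1}^n(1+R(i\sigma_2a_k))}\cdot\prod_{k=1}^n\frac{\sinh\frac{i\sigma_2a_k}{2}}{\frac{i\sigma_2a_k}{2}}\Bigg). \tag{59}
$$

Here $\sigma_2=\begin{pmatrix}0&-i\\ i&0\end{pmatrix}$ is the second Pauli matrix, and $i\sigma_2a_k=\begin{pmatrix}0&a_k\\ -a_k&0\end{pmatrix}$. To evaluate the left-hand side of (59), we use the result (55) for the Clifford exponential:

$$\text{l.h.s. of (59)}=(i\hbar)^{-n}\,\mathrm{Str}_{Cl_2}\overleftarrow{\prod_{k=1}^n}\Bigg(\frac{\sin(a_k/2)}{a_k/2}\,\hat\psi^1\hat\psi^2+\frac{\hbar}{a_k}\Big(\frac{\sin(a_k/2)}{a_k/2}-\cos(a_k/2)\Big)\Bigg).$$

An easy way to evaluate this expression is to use the matrix representation $Cl_2\to\mathrm{End}(\mathbb C^{1|1})$ which maps

$$\hat\psi^1\mapsto\Big(\frac\hbar2\Big)^{1/2}\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad\hat\psi^2\mapsto\Big(\frac\hbar2\Big)^{1/2}\begin{pmatrix}0&-i\\ i&0\end{pmatrix},\qquad\mathrm{Str}_{Cl_2}\mapsto\mathrm{Str}_{\mathrm{End}(\mathbb C^{1|1})},$$

and

$$\mathrm{Str}_{\mathrm{End}(\mathbb C^{1|1})}:\ \begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix}\mapsto\alpha-\delta$$

is the standard super-trace on matrices of the size $(1|1)\times(1|1)$. Using this representation, we obtain

$$
\begin{aligned}
\text{l.h.s. of (59)}&=i^{-n}\,\mathrm{Str}_{\mathrm{End}(\mathbb C^{1|1})}\prod_{k=1}^n
\begin{pmatrix}
\frac{1}{a_k}\Big(\frac{\sin(a_k/2)}{a_k/2}-\cos(a_k/2)\Big)+\frac i2\frac{\sin(a_k/2)}{a_k/2}&0\\
0&\frac{1}{a_k}\Big(\frac{\sin(a_k/2)}{a_k/2}-\cos(a_k/2)\Big)-\frac i2\frac{\sin(a_k/2)}{a_k/2}
\end{pmatrix}\\
&=i^{-n}\prod_{k=1}^n\frac{\sin(a_k/2)}{a_k/2}\cdot2i\,\mathrm{Im}\prod_{k=1}^n\Big(\frac i2+\frac{1}{a_k}-\frac12\cot(a_k/2)\Big)\\
&=2\prod_{k=1}^n\frac{\sin(a_k/2)}{a_k/2}\cdot\mathrm{Re}\prod_{k=1}^n\frac{1}{1+R(ia_k)}.
\end{aligned}
$$

In the last line we used the assumption that $n$ is odd.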
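The middle step of this evaluation (the passage from the super-trace of the product of diagonal matrices to the expression with $2i\,\mathrm{Im}$) can be reproduced numerically. The following sketch assumes the matrix representation $\hat\psi^{1,2}\mapsto(\hbar/2)^{1/2}\sigma_{1,2}$ and the super-trace $\alpha-\delta$ described above. The function names are illustrative, and the function $R$ is deliberately not used, since its definition lies outside this passage.

```python
import numpy as np

def supertrace_side(a, hbar=1.0):
    """(i*hbar)^(-n) * Str of the ordered product of Clifford factors."""
    a = np.asarray(a, dtype=float)
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    psi1 = np.sqrt(hbar / 2) * sx
    psi2 = np.sqrt(hbar / 2) * sy
    prod = np.eye(2, dtype=complex)
    for ak in a:
        s0 = np.sin(ak / 2) / (ak / 2)
        factor = (s0 * psi1 @ psi2
                  + (hbar / ak) * (s0 - np.cos(ak / 2)) * np.eye(2))
        prod = factor @ prod          # later factors stand to the left
    supertrace = prod[0, 0] - prod[1, 1]
    return (1j * hbar) ** (-len(a)) * supertrace

def closed_form_side(a):
    """i^(-n) * prod(sin(a/2)/(a/2)) * 2i * Im prod(i/2 + 1/a - cot(a/2)/2)."""
    a = np.asarray(a, dtype=float)
    s0 = np.sin(a / 2) / (a / 2)
    z = 0.5j + 1.0 / a - 0.5 / np.tan(a / 2)
    return 1j ** (-len(a)) * np.prod(s0) * 2j * np.prod(z).imag

angles = [0.4, 1.1, -0.7]             # n = 3 (odd), arbitrary test values
print(supertrace_side(angles, hbar=1.7), closed_form_side(angles))
```

For odd $n$ both numbers come out real and independent of $\hbar$, which is what the last line of the computation above relies on.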


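To evaluate the right-hand side of (59), first observe that the matrix

$$\frac{1+\prod_{k=1}^nR(i\sigma_2a_k)}{\prod_{k=1}^n(1+R(i\sigma_2a_k))}=\frac{1}{\prod_{k=1}^n(1+R(i\sigma_2a_k))}+\frac{1}{\prod_{k=1}^n(1+R(-i\sigma_2a_k))}$$

is constructed from matrices $i\sigma_2a_k=\begin{pmatrix}0&a_k\\ -a_k&0\end{pmatrix}$ and is therefore of the form $\begin{pmatrix}\alpha&\beta\\ -\beta&\alpha\end{pmatrix}$ (i.e. belongs to $\mathrm{Span}_{\mathbb R}(1,i\sigma_2)$), and that it is symmetric. Hence, it is actually a multiple of the identity matrix $\begin{pmatrix}\alpha&0\\0&\alpha\end{pmatrix}$, and the determinant may be expressed as

$$
\begin{aligned}
{\det}^{1/2}\Bigg(\frac{1+\prod_{k=1}^nR(i\sigma_2a_k)}{\prod_{k=1}^n(1+R(i\sigma_2a_k))}\Bigg)
&=\frac12\,\mathrm{tr}\,\frac{1+\prod_{k=1}^nR(i\sigma_2a_k)}{\prod_{k=1}^n(1+R(i\sigma_2a_k))}\\
&=\prod_{k=1}^n\frac{1}{1+R(ia_k)}+\prod_{k=1}^n\frac{1}{1+R(-ia_k)}
=2\,\mathrm{Re}\prod_{k=1}^n\frac{1}{1+R(ia_k)}.
\end{aligned}
$$

Now it is obvious that (59) is verified, and this implies (58).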

3.2. One-dimensional Chern-Simons with boundary. 3.2.1. One-dimensional simplicial Chern-Simons in the operator formalism. Concatenations. There is a natural construction of the one-dimensional simplicial Chern-Simons theory in the operator formalism: to a one-dimensional oriented simplicial complex (a collection of i() triangulated intervals and c() circles) it associates a partition function Z ∈ Fun(g ⊗ C1 () ⊕ g ⊗ C 1 ()) ⊗ Cl(g)⊗i() ,

(60)

bulk F

where we think of variables Ak ∈ g as coordinates on g-valued simplicial 1-cochains of with parity shifted, and we think of variables ψ˜ k ∈ g as coordinates on g-valued simplicial 1-chains of . The partition function Z is defined by the following properties: • For a single interval with a standard triangulation, we have Z I = (i)

−m

2m

a

Dλ

e

−λa ψ˜ a

a=1

1 abc a b c a ˆa ˆ ˆ f ψ A ψ +λ ψ . exp − 2

• For a disjoint union of 1 and 2 , Z 1 2 = Z 1 ⊗ Z 2 .

(61)


• For a concatenation 1 ∪ 2 with intersection 1 ∩ 2 = p being a single point embedded as a boundary point of positive orientation p1 ∈ 1 belonging to the i th triangulated interval of 1 , and embedded as a boundary point of negative orientation p2 ∈ 2 belonging to the j th triangulated interval of 2 , we have: Z 1 ∪2 = m(Z 2 ⊗ Z 1 ).

(62)

Here m : Cl(g) ⊗ Cl(g) → Cl(g) is the Clifford algebra multiplication, and m in (62) acts on Clifford algebras associated to the j th triangulated interval of 2 and the i th triangulated interval of 1 . • If the simplicial complex is obtained from by closing the i th triangulated interval into a circle, we have: Z = Str Cl(g) Z ,

(63)

where the super-trace is taken over the Clifford algebra associated to the i th interval of . Obviously, this construction gives (52) for a triangulated circle = n , and due to the correspondence (53) it is consistent with the results of Sect. 2. Remark 17. We may regard Z I as a contribution of an open interval. Then, the concatenation (62) is understood as taking a disjoint union of open triangulated intervals and then gluing them together at the point p. Likewise, (63) is understood as gluing together two end-points of the same interval. Thus, a contribution of point is either the (1, 2)tensor10 m on the Clifford algebra or the (0,1)-tensor Str Cl(g) , depending on whether we are gluing together two different or the same connected component. Remark 18. Once we come to a description of the one-dimensional Chern-Simons theory in terms of Atiyah-Segal Axioms and specify the vector space associated to a point (the space of states), properties (62) and (63) become a single sewing axiom (see Sect. 4.2 below). 3.2.2. Simplicial aggregations. Let be the interval [p1 , p3 ] with the standard triangulation, and let be the subdivision of with three 0-simplices p1 , p2 , p3 and two 1-simplices [p1 , p2 ], [p2 , p3 ]. The aggregation morphism r ξ acts on simplicial chains as ξ

rC• : C• () → C• ( )

αp1 ep1 + αp2 ep2 + αp3 ep3 + αp1 p2 ep1 p2 + αp2 p3 ep2 p3

→ (αp1 +(1 − ξ )αp2 )ep1 +(αp3 +ξ αp2 )ep3 + (ξ αp1 p2 +(1 − ξ )αp2 p3 )ep1 p3 ,

and on simplicial cochains as ξ

rC • : C • () → C • ( )

α p1 ep1 + α p2 ep2 + α p3 ep3 + α p1 p2 ep1 p2 + α p2 p3 ep2 p3

→ α p1 ep 1 + α p3 ep 3 + (α p1 p2 + α p2 p3 )ep 1 p3 .

Here 0 < ξ < 1 is a parameter determining the relative weight of intervals [p1 , p2 ] and [p2 , p3 ] inside [p1 , p3 ] (the symmetric choice corresponds to ξ = 1/2). Basis chains 10 We mean the rank of the tensor: one time covariant, two times contravariant.


and cochains corresponding a simplex σ are denoted by eσ and eσ , respectively. We use primes to distinguish the basis of C• ( ), C • ( ) ; α... , α ... are numerical coefficients. One can also introduce the subdivision morphism which is dual to the aggregation morphism: ξ

ξ

ξ

ξ

i C• = (rC • )∗ : C• ( ) → C• (), i C • = (rC• )∗ : C • ( ) → C • (). More explicitly, ξ

i C• : αp 1 ep1 + αp 3 ep3 + αp 1 p3 ep1 p3 → αp 1 ep1 + αp 3 ep3 + αp 1 p3 (ep1 p2 + ep2 p3 ), ξ

i C • : α p1 ep 1 + α p3 ep 3 + α p1 p3 ep 1 p3

→ α p1 ep1 +((1 − ξ )α p1 + ξ α p3 )ep2 +α p3 ep3 + α p1 p3 (ξ ep1 p2 +(1 − ξ )ep2 p3 ).
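The explicit formulas above can be checked mechanically. The sketch below is an illustration with hypothetical names: it writes the aggregation maps on chains and cochains as matrices in the bases $(e_{p_1},e_{p_2},e_{p_3}\,|\,e_{p_1p_2},e_{p_2p_3})$ and $(e'_{p_1},e'_{p_3}\,|\,e'_{p_1p_3})$, takes the subdivision maps to be the transposes (as in the duality stated above), and tests compatibility with the boundary and coboundary operators. The orientation convention $\partial e_{pq}=e_q-e_p$ is an assumption of the sketch.

```python
import numpy as np

def check_aggregation(xi):
    """Check chain-map and duality properties of the aggregation maps."""
    # Aggregation on chains: C_0 (3 -> 2 basis elements) and C_1 (2 -> 1)
    r_chain0 = np.array([[1.0, 1.0 - xi, 0.0],
                         [0.0, xi, 1.0]])
    r_chain1 = np.array([[xi, 1.0 - xi]])
    # Aggregation on cochains: C^0 (3 -> 2) and C^1 (2 -> 1)
    r_cochain0 = np.array([[1.0, 0.0, 0.0],
                           [0.0, 0.0, 1.0]])
    r_cochain1 = np.array([[1.0, 1.0]])
    # Boundary operators C_1 -> C_0 (assumed convention: d e_pq = e_q - e_p)
    bd_fine = np.array([[-1.0, 0.0],
                        [1.0, -1.0],
                        [0.0, 1.0]])
    bd_coarse = np.array([[-1.0],
                          [1.0]])
    # Subdivision maps as duals (transposes) of the aggregation maps
    i_chain0, i_chain1 = r_cochain0.T, r_cochain1.T
    i_cochain0, i_cochain1 = r_chain0.T, r_chain1.T
    return {
        "r is a chain map": np.allclose(r_chain0 @ bd_fine,
                                        bd_coarse @ r_chain1),
        "r is a cochain map": np.allclose(r_cochain1 @ bd_fine.T,
                                          bd_coarse.T @ r_cochain0),
        "i is a chain map": np.allclose(bd_fine @ i_chain1,
                                        i_chain0 @ bd_coarse),
        "i is a cochain map": np.allclose(bd_fine.T @ i_cochain0,
                                          i_cochain1 @ bd_coarse.T),
        "r after i is the identity (chains)":
            np.allclose(r_chain0 @ i_chain0, np.eye(2))
            and np.allclose(r_chain1 @ i_chain1, np.eye(1)),
        "r after i is the identity (cochains)":
            np.allclose(r_cochain0 @ i_cochain0, np.eye(2))
            and np.allclose(r_cochain1 @ i_cochain1, np.eye(1)),
    }

for name, ok in check_aggregation(0.3).items():
    print(name, ok)
```

The transposed matrices reproduce the explicit subdivision formulas: $e'_{p_1p_3}\mapsto e_{p_1p_2}+e_{p_2p_3}$ on chains and $e'^{\,p_1p_3}\mapsto\xi e^{p_1p_2}+(1-\xi)e^{p_2p_3}$ on cochains.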

Aggregation and subdivision morphisms are chain maps. Moreover, they are quasiisomorphisms. ξ bulk → F bulk and projection r ξ These maps induce an embedding i , : F , : bulk → F bulk for the spaces of “bulk fields”, F bulk bulk F = g ⊗ C1 () ⊕ g ⊗ C 1 (), F = g ⊗ C1 ( ) ⊕ g ⊗ C 1 ( ).

That is, we have a splitting ξ

bulk,ξ

bulk bulk F = i , (F . ) ⊕ F,

(64)

In more detail, we split the bulk fields as ψ˜ 1 = ψ˜ + (1 − ξ )ψ˜ , ψ˜ 2 = ψ˜ − ξ ψ˜ , ξ

A1 = ξ A − A ,

A2 = (1 − ξ )A + A .

ξ

bulk The maps i , and r, are dual to each other with respect to the odd pairing on F ξ

bulk,ξ

bulk (thus, (64) is an orthogonal decomposition). We denote by L and F , ⊂ F, the Lagrangian subspace defined by setting to zero the 1-cochain part of the ultraviolet field A . We define the action of the aggregation map → on the partition function of the one-dimensional simplicial Chern-Simons theory by the fiber BV integral associated to the splitting (64): ξ

(r, )∗ (Z ) =

Lξ,

Z

bulk ∈ Fun(F ) ⊗ Cl(g).

(65)

It is easy to give a direct check of the following statement. Lemma 3. ξ

(r, )∗ (Z ) = Z

(66)


Proof. Indeed, by definition (65) we have ξ

(r, )∗ (Z ) ←− 2m m D ψ˜ a Z [p2 ,p3 ] ψ˜ − ξ ψ˜ , (1 − ξ )A = (i) a=1

·Z [p1 ,p2 ] ψ˜ + (1 − ξ )ψ˜ , ξ A −→ −→ ←− 2m 2m 2m a ˜ a a ˜ a ˜ a ˜ a D ψ˜ a Dλa2 Dλa1 e−λ1 (ψ +(1−ξ )ψ )−λ2 (ψ −ξ ψ ) · = (i)−m

a=1

a=1

a=1

1 − ξ abc a b c ξ abc a b c f ψˆ A ψˆ + λa2 ψˆ a · exp − f ψˆ A ψˆ + λa1 ψˆ a . · exp − 2 2 Here we made a change of coordinates (λ1 , λ2 ) → (λ = λ1 + λ2 , ν = (1 − ξ )λ1 − ξ λ2 ). The integral over ψ˜ produces the delta-function δ(ν); by integrating over ν we obtain ξ (r, )∗ (Z )

= (i)−m

−→ 2m

aψ ˜ a

Dλa e−λ

·

a=1

1 − ξ abc a b c a ˆa ˆ ˆ · exp − f ψ A ψ + (1 − ξ )λ ψ 2

ξ abc a b c a ˆa ˆ ˆ f ψ A ψ + ξλ ψ · exp − 2 −→ 2m a ˜ a = (i)−m Dλa e−λ ψ a=1

1 abc a b c a ˆa ˆ ˆ f ψ A ψ +λ ψ · exp − = Z . 2 ξ

Remark 19. We normalize the measure on L, in such a way that relation (66) holds with no additional factors. Up to now, we only discussed the elementary aggregation which takes an interval subdivided into two smaller intervals and into an interval with the standard triangulation (that is, one removes the middle point p2 and merges intervals [p1 , p2 ] and [p2 , p3 ]). A general simplicial aggregation for a one-dimensional simplicial complex is a sequence of elementary aggregations made at each step on an incident pair of intervals. In particular, there are many simplicial aggregations for triangulated circles: n → n with n > n. The following is an immediate consequence of (66): Proposition 2. For a general simplicial aggregation ξ

ξ

ξ

1 : → r = rl l−1 , ◦ · · · ◦ r21 ,2 ◦ r, 1


(where is an arbitrary one-dimensional simplicial complex and is some aggregation of ), one has r∗ (Z ) = Z .

(67)

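The compatibility with aggregations is an important property expected from a simplicial theory. In particular, (53) implies that the simplicial action for a circle $S_n$ given by (42) is compatible with simplicial aggregations of triangulated circles from $n$ to $n'$ vertices, for $n$, $n'$ odd.

3.2.3. Quantum master equation.

Lemma 4.¹¹ The partition function for an interval (61) satisfies the following differential equation:

$$\hbar\,\frac{\partial}{\partial\tilde\psi^a}\frac{\partial}{\partial A^a}Z_I+\frac1\hbar\Big[\frac16f^{abc}\hat\psi^a\hat\psi^b\hat\psi^c,\ Z_I\Big]_{Cl(\mathfrak g)}=0, \tag{68}$$

where $[\,,]_{Cl(\mathfrak g)}$ denotes the super-commutator on $Cl(\mathfrak g)$.

¹¹ In a different setting, Eq. (68) appeared in [2].

Proof. We will check (68) using variables $(\lambda,A)$. We have

$$\Big(\hbar\,\lambda^a\frac{\partial}{\partial A^a}+\frac1\hbar\Big[\frac16f^{abc}\hat\psi^a\hat\psi^b\hat\psi^c,\ \bullet\Big]_{Cl(\mathfrak g)}\Big)\exp\Big(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA^b\hat\psi^c+\lambda^a\hat\psi^a\Big)=0. \tag{69}$$

After the Fourier transform from the variable $\lambda$ to the variable $\tilde\psi$, this expression becomes (68). Observe that

$$
\hbar\,\lambda^a\frac{\partial}{\partial A^a}\exp\Big(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA^b\hat\psi^c+\lambda^a\hat\psi^a\Big)
=\frac12\int_0^1d\tau\ e^{\tau\left(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA^b\hat\psi^c+\lambda^a\hat\psi^a\right)}\cdot f^{abc}\hat\psi^a\lambda^b\hat\psi^c\cdot e^{(1-\tau)\left(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA^b\hat\psi^c+\lambda^a\hat\psi^a\right)}. \tag{70}
$$

Next, compute commutators in the Clifford algebra,

$$\Big[\frac16f^{abc}\hat\psi^a\hat\psi^b\hat\psi^c,\ \lambda^a\hat\psi^a\Big]_{Cl(\mathfrak g)}=-\frac\hbar2f^{abc}\hat\psi^a\lambda^b\hat\psi^c, \tag{71}$$

and

$$
\begin{aligned}
\Big[\frac16f^{abc}\hat\psi^a\hat\psi^b\hat\psi^c,\ \frac12f^{a'b'c'}\hat\psi^{a'}A^{b'}\hat\psi^{c'}\Big]_{Cl(\mathfrak g)}
&=\frac{1}{12}f^{abc}f^{a'b'c'}A^{b'}\big([\hat\psi^a\hat\psi^b\hat\psi^c,\hat\psi^{a'}]\hat\psi^{c'}-\hat\psi^{a'}[\hat\psi^a\hat\psi^b\hat\psi^c,\hat\psi^{c'}]\big)\\
&=\frac\hbar4f^{abc}f^{a'b'c'}A^{b'}\big(\delta^{ca'}\hat\psi^a\hat\psi^b\hat\psi^{c'}-\delta^{cc'}\hat\psi^{a'}\hat\psi^a\hat\psi^b\big)\\
&=\frac\hbar4f^{abc}f^{cb'c'}A^{b'}\big(\hat\psi^a\hat\psi^b\hat\psi^{c'}+\hat\psi^{c'}\hat\psi^a\hat\psi^b\big)\\
&=\underbrace{\frac\hbar6f^{abc}f^{cb'c'}A^{b'}\big(\hat\psi^a\hat\psi^b\hat\psi^{c'}+\hat\psi^{c'}\hat\psi^a\hat\psi^b+\hat\psi^b\hat\psi^{c'}\hat\psi^a\big)}_{=0\ \text{by Jacobi identity}}\\
&\quad+\frac{\hbar}{12}f^{abc}f^{cb'c'}A^{b'}\big(\hat\psi^a\hat\psi^b\hat\psi^{c'}+\hat\psi^{c'}\hat\psi^a\hat\psi^b-2\hat\psi^b\hat\psi^{c'}\hat\psi^a\big)\\
&=\frac{\hbar}{24}f^{abc}f^{cb'c'}A^{b'}\big((\hat\psi^a\hat\psi^b\hat\psi^{c'}+\hat\psi^{c'}\hat\psi^a\hat\psi^b-2\hat\psi^b\hat\psi^{c'}\hat\psi^a)\\
&\qquad\qquad-(\hat\psi^b\hat\psi^a\hat\psi^{c'}+\hat\psi^{c'}\hat\psi^b\hat\psi^a-2\hat\psi^a\hat\psi^{c'}\hat\psi^b)\big)\\
&=\frac{\hbar}{24}f^{abc}f^{cb'c'}A^{b'}\big(\hat\psi^a[\hat\psi^b,\hat\psi^{c'}]+[\hat\psi^{c'},\hat\psi^a]\hat\psi^b-\hat\psi^b[\hat\psi^{c'},\hat\psi^a]-[\hat\psi^b,\hat\psi^{c'}]\hat\psi^a\big)\\
&=0.
\end{aligned}\tag{72}
$$

For brevity, we are omitting the subscript in $[\,,]_{Cl(\mathfrak g)}$ in computations. Identities (71), (72) imply

$$
\frac1\hbar\Big[\frac16f^{abc}\hat\psi^a\hat\psi^b\hat\psi^c,\ \exp\Big(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA^b\hat\psi^c+\lambda^a\hat\psi^a\Big)\Big]_{Cl(\mathfrak g)}
=-\frac12\int_0^1d\tau\ e^{\tau\left(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA^b\hat\psi^c+\lambda^a\hat\psi^a\right)}\cdot f^{abc}\hat\psi^a\lambda^b\hat\psi^c\cdot e^{(1-\tau)\left(-\frac{1}{2\hbar}f^{abc}\hat\psi^aA^b\hat\psi^c+\lambda^a\hat\psi^a\right)}.
$$

Together with (70), this implies (69), which finishes the proof of (68).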
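The two commutator identities used in this proof can be tested on a small example. The sketch below is an illustration under stated assumptions: it takes $\mathfrak g=u(2)=su(2)\oplus u(1)$ (so that $\dim\mathfrak g=4$ is even) with structure constants given by the Levi-Civita symbol on the $su(2)$ directions, assumes the normalization $\hat\psi^a\hat\psi^b+\hat\psi^b\hat\psi^a=\hbar\,\delta^{ab}$, and models the odd parameters $\lambda^a$ by nilpotent combinations of additional gamma matrices. It checks $[\hat\theta,\lambda^a\hat\psi^a]=-\frac\hbar2f^{abc}\hat\psi^a\lambda^b\hat\psi^c$ and $[\hat\theta,\frac12f^{abc}\hat\psi^aA^b\hat\psi^c]=0$ for $\hat\theta=\frac16f^{abc}\hat\psi^a\hat\psi^b\hat\psi^c$, and also the value $\hat\theta^2=-\frac{\hbar^3}{48}f^{abc}f^{abc}$. All function names are hypothetical.

```python
import itertools
import numpy as np

def jordan_wigner_gammas(n_pairs):
    """Return 2*n_pairs mutually anticommuting matrices, each squaring to 1."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    id2 = np.eye(2, dtype=complex)
    gammas = []
    for j in range(n_pairs):
        for s in (sx, sy):
            g = np.eye(1, dtype=complex)
            for f in [sz] * j + [s] + [id2] * (n_pairs - j - 1):
                g = np.kron(g, f)
            gammas.append(g)
    return gammas

def structure_constants_u2():
    """f^{abc} for u(2): Levi-Civita on indices 0..2, zero along u(1)."""
    f = np.zeros((4, 4, 4))
    for a, b, c in itertools.permutations(range(3)):
        f[a, b, c] = (a - b) * (b - c) * (c - a) / 2   # sign of permutation
    return f

def check_lemma4_identities(hbar=0.8, seed=0):
    """Return the maximal deviations for (71), (72) and theta^2."""
    g = jordan_wigner_gammas(6)                 # twelve 64 x 64 matrices
    psi = [np.sqrt(hbar / 2) * g[a] for a in range(4)]
    lam = [(g[4 + 2 * a] + 1j * g[5 + 2 * a]) / 2 for a in range(4)]
    f = structure_constants_u2()
    A = np.random.default_rng(seed).normal(size=4)
    idx = list(itertools.product(range(4), repeat=3))
    theta = sum(f[a, b, c] * psi[a] @ psi[b] @ psi[c] for a, b, c in idx) / 6
    quad_A = sum(f[a, b, c] * A[b] * psi[a] @ psi[c] for a, b, c in idx) / 2
    lam_psi = sum(lam[a] @ psi[a] for a in range(4))
    psi_lam_psi = sum(f[a, b, c] * psi[a] @ lam[b] @ psi[c]
                      for a, b, c in idx)

    def comm(x, y):
        # theta is odd and the second argument is even, so the
        # super-commutator reduces to the ordinary commutator here
        return x @ y - y @ x

    err71 = np.max(np.abs(comm(theta, lam_psi) + (hbar / 2) * psi_lam_psi))
    err72 = np.max(np.abs(comm(theta, quad_A)))
    ff = np.sum(f * f)
    err_sq = np.max(np.abs(theta @ theta
                           + (hbar ** 3 / 48) * ff * np.eye(64)))
    return err71, err72, err_sq

print(check_lemma4_identities())
```

All three deviations should be at the level of rounding errors. The choice of $u(2)$ is only for convenience; any Lie algebra with an invariant pairing and an orthonormal basis could be substituted by changing `structure_constants_u2`.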

Let us denote12 by θˆ :=

1 abc a b c f ψˆ ψˆ ψˆ 6

the Clifford element in (68). Theorem 3. The partition function for any one-dimensional simplicial complex satisfies the differential equation ( bulk +

1 δ )Z = 0,

(73)

where bulk =

∂ ∂ a a ˜ ∂ ∂ ψk Ak k

(the sum goes over all 1-simplices of ), and δ =

i() !

" θˆ ( j) , •

j=1

Cl(g)

.

Here the sum goes over connected components of that are triangulated intervals; θˆ ( j) denotes θˆ as an element of the j th copy of Cl(g). 12 The notation stems from the fact that this is a quantization of the Maurer-Cartan element θ ∈ Fun(g) (4).


Proof. Equation (73) follows from (68). The compatibility with disjoint unions is bulk bulk obvious as bulk 1 2 = 1 + 2 , and δ1 2 = δ1 + δ2 . For concatenations, it suffices to check the case of two triangulated intervals 1 and 2 :

! " bulk −1 bulk bulk −1 ˆ ◦ (Z 2 · Z 1 ) θ, • (1 ∪2 + δ1 ∪2 )Z 1 ∪2 = 1 + 2 + Cl(g)

! " bulk −1 ˆ ◦ Z 1 θ, • 1 + = Z 2 · Cl(g)

! " bulk −1 ˆ ◦ Z 2 · Z 1 θ, • + 2 + Cl(g)

= 0. The compatibility with closure of a triangulated interval into a triangulated circle follows from Str Cl(g) [θˆ , Z ]Cl(g) = 0. Remark 20. We understand (73) as a kind of quantum master equation with the boundary term δ Z . It is tempting to think of the operator bulk + δ appearing in (73) as a new BV Laplacian adjusted for the presence of the boundary. Corollary 1. In the case of a triangulated circle = n , the partition function satisfies the usual (non-modified) quantum master equation n Z n = 0. 4. Back to Path Integral Sections 4.1 and 4.4 mostly go along the lines of the standard derivation of the path integral representation for quantum mechanics, see [9]. 4.1. Representation of Cl(g), complex polarization of g. The Clifford algebra Cl(g) admits a representation ρ on the space of polynomials of m odd variables Fun(C0|m ) ∼ = C[η1 , . . . , ηm ]. This representation is defined on generators of Cl(g) as ⎧ ⎨ √1 η p + ∂ p if a = 2 p − 1, ∂η 2 ρ : ψˆ a → (74) ⎩ √i η p − ∂ p if a = 2 p. ∂η 2

m−1 m−1 ∼ In fact, ρ : Cl(g) → = End(C2 |2 ) is an isomorphism of superalgebras. There is a natural identification

End(Fun(C0|m ))

φ:

∼ End(Fun(C0|m )) − → Fun(C0|m ⊕ C0|m ) ∼ = C[η1 , . . . , ηm , η¯ 1 , . . . , η¯ m ].

This is not an algebra morphism with respect to the standard algebra structure on polynomials. Instead, it takes the product of endomorphisms into the convolution; see [3,9]). We are interested in the composition = φ ◦ ρ : Cl(g) → Fun(C0|m ⊕ C0|m ) which maps ⎧ 1 q q ˆ → e q η η¯ ⎪ 1 ⎪ ⎨ 1 q q ψˆ 2 p−1 → √1 (η p + η¯ p )e q η η¯ : (75) 2 ⎪ ⎪ ⎩ ψˆ 2 p → √i (η p − η¯ p )e 1 q ηq η¯ q 2


and sends the product in Cl(g) into the convolution ˆ 2 , η¯ 1 ) (αˆ · β)(η q q 1 p p ˆ 1 , η¯ 1 ). = m (Dη1 D η¯ 2 ) (α)(η ˆ 2 , η¯ 2 ) · e q η¯ 2 η1 · (β)(η

(76)

p

Formula (76) is a key point in reconstructing the path integral from the operator formalism. Another useful identity is as follows: 1 q q m StrCl(g) (α) ˆ = (Dη p D η¯ p ) e q η¯ η · (α)(η, ˆ η). ¯ (77) p

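More generally, a representation of type (74) is associated to a choice of a linear complex structure $J$ on $\mathfrak g$ compatible with the pairing:

$$J:\mathfrak g\to\mathfrak g,\qquad J^2=-\mathrm{id}_{\mathfrak g},\qquad(Ja,b)=-(a,Jb)\ \text{ for }a,b\in\mathfrak g.$$

It induces the splitting of the complexified Lie algebra $\mathfrak g_{\mathbb C}=\mathbb C\otimes\mathfrak g$ into “holomorphic” and “anti-holomorphic” subspaces:

$$\mathfrak g_{\mathbb C}=\mathfrak h\oplus\bar{\mathfrak h}, \tag{78}$$

where $J$ acts on $\mathfrak h$, $\bar{\mathfrak h}$ by multiplication by $+i$ and $-i$, respectively. (Note that $\mathfrak h$ and $\bar{\mathfrak h}$ are complex subspaces of $\mathfrak g_{\mathbb C}$ with respect to the standard complex structure; the bar in $\bar{\mathfrak h}$ does not mean conjugation.) The subspaces $\mathfrak h$ and $\bar{\mathfrak h}$ are Lagrangian with respect to the pairing $(\,,)$. The complex Lagrangian polarization (78) induces a polarization for the parity-reversed Lie algebra

$$\Pi\mathfrak g=\Pi\mathfrak h\oplus\Pi\bar{\mathfrak h}. \tag{79}$$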

We denote coordinates on h by η1 , . . . , ηm and coordinates on h¯ by η¯ 1 , . . . , η¯ m . The → End(Fun(h)) sends quantized holomorphic representation ρ : Cl(g) = Fun(g) coordinates to multiplication operators and quantized anti-holomorphic coordinates to partial derivatives: ρ:

ηˆ p → η p · , ηˆ¯ p → ∂η∂ p .

¯ is given on Morphism (75) from Cl(g) to the convolution algebra Fun(h ⊕ h) generators by 1 q q ηˆ p → η p e q η η¯ , : 1 q q ηˆ¯ p → η¯ p e q η η¯ , and it extends to the other elements of Cl(g) by the convolution formula (76). ¯ respectively. We We will use notation π, π¯ for projections from g to h and h, denote by ι, ι¯ embeddings of h and h¯ into g.


4.2. One-dimensional Chern-Simons in terms of Atiyah-Segal’s axioms. To a point with positive orientation we associate the vector super-space (the space of states) H pt + = Fun(h) ∼ = C[η1 , . . . , ηm ],

(80)

and to a point with negative orientation – the dual space ¯ ∼ H pt − = (H pt + )∗ = Fun(h) = C[η¯ 1 , . . . , η¯ m ]. To an interval I = [p1 , p2 ] we associate the partition function ρ

Z I := ρ(Z I ) ∈ Fun(g ⊕ g) ⊗ Hp+2 ⊗ Hp− 1 ∼ =End(H pt + )

given by formula (61) in representation ρ (see Eq. (74)). In general, to a one-dimensional simplicial complex we associate the partition function (60) of Sect. 3.2.1 taken in representation ρ: ρ

bulk Z := ρ ⊗i() ◦ Z ∈ Fun(F ) ⊗ (H pt + ⊗ H pt − )⊗i() .

The one-dimensional Chern-Simons theory features three types of operations: • To a disjoint union 1 2 corresponds the tensor product for partition functions. + • To a sewing of boundary points p− 1 and p2 in a simplicial complex corresponds the convolution of spaces of states Hp+2 and Hp− . 1 • To a simplicial aggregation r : → corresponds a fiber BV integral r∗ which bulk to F bulk . reduces the space of bulk fields from F In addition, H pt + is equipped with an odd third-order differential operator ˆ : H pt + → H pt + , δ ρ = ρ(θ)

(81)

and H pt − is equipped with minus its dual −(δ ρ )∗ : H pt − → H pt − . The partition ρ function Z satisfies the quantum master equation ρ

ρ

−1 (bulk + δ )Z = 0,

(82)

ρ where the “boundary BV operator” δ is the sum over boundary points of of operators ρ ρ ∗ δ or −(δ ) acting on the corresponding H pt (depending on whether the orientation

of pt is positive or negative). Remark 21. δ ρ is “almost” a coboundary operator: its square is proportional to identity: 3 abc abc f f · idH pt + 48 (see [12]). This implies that the boundary BV operator for an interval (δ ρ )2 = −

ρ

δI = δ ρ ⊗ idH pt − − idH pt + ⊗ (δ ρ )∗ :

(83)

H pt + ⊗ H pt − → H pt + ⊗ H pt − ∼ =End(H pt + )

∼ =End(H pt + )

squares to zero ρ

(δI )2 = 0 Cases when δ ρ squares to zero (i.e. when f abc f abc = 0) are quite interesting as then the reduced space of states for a point Hred pt + emerges (see Remarks 25, 34, 35 below).


Remark 22. The space of states H pt + can be viewed as a geometric quantization of the classical phase super-space g (viewed as an odd Kähler manifold). The operator ρ δ ρ is the quantization of the Maurer-Cartan element θ (4); the operator −1 δI is the quantization of the Hamiltonian vector field {θ, •} on g. Remark 23. Topological quantum mechanics (TQM) in the sense of A. Losev [13] assigns to an interval a manifold Geom (the “space of geometric data”) and to a point — a vector superspace H endowed with an odd coboundary operator Q. The evolution operator U for an interval is a differential form on Geom with values in End(H) and has to satisfy the “homotopy topologicity” equation (d + ad Q ) U = 0,

(84)

where d is the de Rham operator on Geom. A standard class of examples of TQMs comes from choosing Geom = R>0 (with coordinate t > 0) and setting U (t, dt) = e[Q,G] t+dt G = e(d+ad Q )◦(t G) ,

(85)

where G is an odd operator on H. For instance, for the Hodge TQM [13] on a Riemannian manifold M, one sets $H = \Omega^\bullet(M)$, $Q = d_M$ and $G = d_M^*$, the Hodge operator on forms on M. For the Morse TQM [10,18], one takes the same H and Q, but now $G = \iota_v$ is the substitution of the gradient vector field. The one-dimensional Chern-Simons theory on an interval can be viewed as a TQM: here Geom = g (with coordinates $A^a$), H = Fun(h), $Q = \hbar^{-1}\delta^\rho$. The odd Fourier transform in the variable $\tilde\psi$ of the partition function for an interval (61) is
\[ U(A, \lambda) = \rho\left( e^{-\frac{1}{2\hbar} f^{abc}\hat\psi^a A^b \hat\psi^c + \lambda^a\hat\psi^a} \right) = e^{(d + \hbar^{-1}\mathrm{ad}_{\delta^\rho})\circ\,\hbar^{-1}\rho(A^a\hat\psi^a)}, \tag{86} \]
where $d = \hbar\,\lambda^a \frac{\partial}{\partial A^a}$ is the de Rham operator on Geom. Note that the expression (86) is similar to (85) where we make the substitution $t\,G \to \hbar^{-1}\rho(A^a\hat\psi^a)$. The quantum master equation (68) is equivalent to
\[ (d + \hbar^{-1}\mathrm{ad}_{\delta^\rho})\, U(A, \lambda) = 0, \tag{87} \]
which is exactly the “homotopy topologicity” equation (84). The peculiarity of the one-dimensional Chern-Simons theory viewed as a TQM is that $\delta^\rho$ is not necessarily a coboundary operator on H.

4.3. Integrating out the bulk fields. In Sect. 3.2.2, we discussed simplicial aggregations, which reduce the space of bulk fields $F^{bulk}$ of the 1-dimensional Chern-Simons theory according to combinatorial moves applied to the triangulation. It is interesting to consider the “ultimate aggregation”: integrating out the bulk fields completely. This procedure should yield the partition function in the sense of Atiyah-Segal (i.e. without bulk fields). We will denote it by $Z^\circ$. For an interval, we have
\[ (i\hbar)^m \int D\tilde\psi\; Z_I(\tilde\psi, A) = e^{-\frac{1}{2\hbar} f^{abc}\hat\psi^a A^b \hat\psi^c}. \tag{88} \]


A. Alekseev, P. Mnëv

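We view (88) as a BV integral over the Lagrangian subspace
\[ L_A = \{\tilde\psi + A \mid \tilde\psi \text{ is free},\ A \text{ fixed}\} \subset F_I^{bulk}. \tag{89} \]
This subspace depends on the value of A, and the integral (88) also depends on A. However, this dependence is $\mathrm{ad}_{\hat\theta}$-exact:
\[ e^{-\frac{1}{2\hbar} f^{abc}\hat\psi^a (A+\delta A)^b \hat\psi^c} - e^{-\frac{1}{2\hbar} f^{abc}\hat\psi^a A^b \hat\psi^c} = \frac{1}{\hbar}\left[\hat\theta,\; e^{-\frac{1}{2\hbar} f^{abc}\hat\psi^a A^b \hat\psi^c + \frac{1}{\hbar}\delta A^a\hat\psi^a}\right]_{Cl(g)} + O((\delta A)^2) \tag{90} \]
(this can be checked analogously to the proof of Lemma 4). Therefore, we should understand the partition function $Z_I^\circ$ as an element of the cohomology of the operator $\mathrm{ad}_{\hat\theta}$ (the fact that (88) is $\mathrm{ad}_{\hat\theta}$-closed is an immediate consequence of the quantum master equation (68)). More exactly, $Z_I^\circ$ is the class of the Clifford unit $\hat 1$ in $\mathrm{ad}_{\hat\theta}$-cohomology:
\[ Z_I^\circ = [\hat 1] \in H_{\mathrm{ad}_{\hat\theta}}(Cl(g)). \tag{91} \]
Equivalently, in terms of the representation ρ, we have
\[ \rho(Z_I^\circ) = [\mathrm{id}_{\mathcal{H}_{pt^+}}] \in H_{\delta_I^\rho}(\mathrm{End}(\mathcal{H}_{pt^+})). \tag{92} \]

Remark 24. If the contraction of structure constants $f^{abc}f^{abc}$ for g is nonzero, the cohomology class (91), (92) vanishes since
\[ \hat 1 = -\frac{24\,\hbar^{-3}}{f^{abc}f^{abc}}\; \mathrm{ad}_{\hat\theta}\,\hat\theta. \]
In fact, the whole cohomology group $H_{\mathrm{ad}_{\hat\theta}} \cong H_{\delta_I^\rho}$ vanishes, since every $\mathrm{ad}_{\hat\theta}$-cocycle $\hat\alpha \in Cl(g)$ is automatically exact:
\[ \hat\alpha = -\frac{24\,\hbar^{-3}}{f^{abc}f^{abc}}\; \mathrm{ad}_{\hat\theta}(\hat\theta\cdot\hat\alpha). \]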

Remark 25. If $f^{abc}f^{abc} = 0$, we can define the reduced space of states for a point as the $\delta^\rho$-cohomology:
\[ \mathcal{H}^{red}_{pt^+} := H_{\delta^\rho}(\mathcal{H}_{pt^+}), \qquad \mathcal{H}^{red}_{pt^-} := H_{-(\delta^\rho)^*}(\mathcal{H}_{pt^-}) = (\mathcal{H}^{red}_{pt^+})^*. \]
By the Künneth formula, we have
\[ H_{\delta_I^\rho}(\mathrm{End}(\mathcal{H}_{pt^+})) \cong \mathcal{H}^{red}_{pt^+} \otimes \mathcal{H}^{red}_{pt^-}. \tag{93} \]
The partition function (92) is then represented by the identity operator
\[ \rho(Z_I^\circ):\ \mathcal{H}^{red}_{pt^+} \xrightarrow{\ \mathrm{id}\ } \mathcal{H}^{red}_{pt^+}. \]
For a circle, we can obtain the partition function $Z^\circ_{S^1}$ (which is just a number) either as a Clifford super-trace of (88) or as a BV integral of the effective action (13) over the Lagrangian subspace (89). Either way, we have
\[ Z^\circ_{S^1} = 0, \tag{94} \]
due to non-saturation of fermionic modes either in the Clifford super-trace or in the Berezin integral over $\tilde\psi$.


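4.4. From operator formalism to path integral.

4.4.1. Abelian one-dimensional Chern-Simons theory. The one-dimensional abelian Chern-Simons theory associates to an interval I the unit $\hat 1$ of Cl(g) (here g can be viewed as a Euclidean vector space; the Lie algebra structure is irrelevant). The path integral arises upon applying the map (75) to this trivial partition function. Writing $(X)(\eta_{out}, \bar\eta_{in})$ for the image of $X \in Cl(g)$ under the map (75), we have
\[
\begin{aligned}
(\hat 1)(\eta_{out}, \bar\eta_{in}) &= (\underbrace{\hat 1\cdot\hat 1\cdots\hat 1}_{N})(\eta_{out}, \bar\eta_{in}) \\
&= \int \prod_{k=1}^{N-1} \hbar^m D\eta_k\, D\bar\eta_{k+1} \cdot \exp\frac{1}{\hbar}\Big( \langle\eta_{out}, \bar\eta_N\rangle + \langle\bar\eta_N, \eta_{N-1}\rangle + \langle\eta_{N-1}, \bar\eta_{N-1}\rangle + \cdots + \langle\eta_2, \bar\eta_2\rangle + \langle\bar\eta_2, \eta_1\rangle + \langle\eta_1, \bar\eta_{in}\rangle \Big) \\
&= \int \prod_{k=1}^{N-1} \hbar^m D\eta_k\, D\bar\eta_{k+1} \cdot \exp\frac{1}{\hbar}\Big( \langle\eta_1, \bar\eta_{in}\rangle + \sum_{k=2}^{N} \langle\eta_k - \eta_{k-1}, \bar\eta_k\rangle \Big).
\end{aligned}
\tag{95}
\]
For convenience, we set $\bar\eta_1 := \bar\eta_{in}$, $\eta_N := \eta_{out}$ and introduced the notation $\langle\eta, \bar\eta\rangle := \sum_q \eta^q\bar\eta^q$.

In the exponential of (95), N terms of type $\langle\eta_k, \bar\eta_k\rangle$ correspond to Clifford units, and N − 1 terms of type $\langle\bar\eta_{k+1}, \eta_k\rangle$ correspond to convolution kernels as in (76); the symbol $D\eta_k\, D\bar\eta_{k+1}$ is defined as $\prod_p (D\eta_k^p\, D\bar\eta_{k+1}^p)$. Expression (95) corresponds to triangulating an interval by N smaller intervals; the terms in the exponential correspond to 0- and 1-simplices of this triangulation. In the limit N → ∞, one formally writes (95) as a path integral over paths $\eta(\tau)$, $\bar\eta(\tau)$ with η at the right end-point and $\bar\eta$ at the left end-point of I fixed by the boundary conditions.

Lemma 5.
\[ (\hat 1)(\eta_{out}, \bar\eta_{in}) = \int_{\bar\eta(0)=\bar\eta_{in},\ \eta(1)=\eta_{out}} D\eta\, D\bar\eta \cdot \exp\frac{1}{\hbar}\Big( \langle\eta(0), \bar\eta(0)\rangle + \int_I \langle d\eta, \bar\eta\rangle \Big). \tag{96} \]

A perturbative computation of the path integral (96) is trivial: the integral is given by the contribution of the critical point $\eta(\tau) = \eta_{out}$, $\bar\eta(\tau) = \bar\eta_{in}$ for all τ ∈ [0, 1] and yields
\[ (\hat 1)(\eta_{out}, \bar\eta_{in}) = e^{\frac{1}{\hbar}\langle\eta_{out}, \bar\eta_{in}\rangle}. \]

It is instructive to write the integral (96) in terms of the field ψ instead of the fields $\eta, \bar\eta$. For simplicity, we first choose a complex polarization (as in (74))
\[
\begin{cases} \psi^{2p-1} = \frac{1}{\sqrt 2}(\eta^p + \bar\eta^p) \\ \psi^{2p} = \frac{i}{\sqrt 2}(\eta^p - \bar\eta^p) \end{cases}
\quad\Leftrightarrow\quad
\begin{cases} \eta^p = \frac{1}{\sqrt 2}(\psi^{2p-1} - i\psi^{2p}) \\ \bar\eta^p = \frac{1}{\sqrt 2}(\psi^{2p-1} + i\psi^{2p}) \end{cases}
\tag{97}
\]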


(this corresponds to the complex structure J on g which assigns $\psi^{2p-1}$ as “real” coordinates and $\psi^{2p}$ as “imaginary” coordinates on g). We have
\[ \sum_p \eta_k^p\,\bar\eta_k^p = \sum_p i\,\psi_k^{2p-1}\psi_k^{2p}, \]
\[ \sum_p \bar\eta_{k+1}^p\,\eta_k^p = \sum_p \frac{\psi_{k+1}^{2p-1}\psi_k^{2p-1} + \psi_{k+1}^{2p}\psi_k^{2p}}{2} - i\sum_p \frac{\psi_{k+1}^{2p-1}\psi_k^{2p} - \psi_{k+1}^{2p}\psi_k^{2p-1}}{2}. \]
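The two bilinear identities above use only the change of variables (97) and the fact that the $\psi$'s anticommute and square to zero. A minimal numerical sketch, for a single value of p, is given below. It is an illustration rather than part of the paper: the anticommuting variables are modelled by a Jordan-Wigner-type matrix construction, and the helper name `grassmann_generators` is hypothetical.

```python
import numpy as np

def grassmann_generators(n):
    """Return n matrices that pairwise anticommute and square to zero."""
    a = np.array([[0, 1], [0, 0]], dtype=complex)   # nilpotent 2x2 block
    z = np.diag([1, -1]).astype(complex)            # sign-twisting block
    one = np.eye(2, dtype=complex)
    gens = []
    for j in range(n):
        m = np.array([[1]], dtype=complex)
        for fac in [z] * j + [a] + [one] * (n - j - 1):
            m = np.kron(m, fac)
        gens.append(m)
    return gens

# psi_k^{2p-1}, psi_k^{2p}, psi_{k+1}^{2p-1}, psi_{k+1}^{2p} for one fixed p
p1, p2, q1, q2 = grassmann_generators(4)

# Change of variables (97)
eta_k = (p1 - 1j * p2) / np.sqrt(2)
etabar_k = (p1 + 1j * p2) / np.sqrt(2)
etabar_k1 = (q1 + 1j * q2) / np.sqrt(2)

# First identity: eta_k etabar_k = i psi_k^{2p-1} psi_k^{2p}
first_ok = np.allclose(eta_k @ etabar_k, 1j * p1 @ p2)

# Second identity: etabar_{k+1} eta_k in terms of psi_{k+1}, psi_k
rhs = (q1 @ p1 + q2 @ p2) / 2 - 1j * (q1 @ p2 - q2 @ p1) / 2
second_ok = np.allclose(etabar_k1 @ eta_k, rhs)

print(first_ok, second_ok)
```

The first identity relies on $\psi^a\psi^a = 0$ and $\psi^1\psi^2 = -\psi^2\psi^1$; the second is a plain expansion and holds for any associative product.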

Substituting these expressions into the integral representation (95) for $(\hat 1)$, we obtain
\[
\begin{aligned}
(\hat 1)(\eta_{out}, \bar\eta_{in}) = \int \hbar^{m/2} D\eta_1 \prod_{k=2}^{N-1} (i\hbar)^m D\psi_k\ \hbar^{m/2} D\bar\eta_N \cdot \exp\frac{1}{\hbar}\Big( & \frac{1}{2}\sum_{k=1}^{N-1} (\psi_{k+1}, \psi_k) + \frac{i}{2}\sum_{k=1}^{N-1}\sum_p (\psi_{k+1}^{2p-1} - \psi_k^{2p-1})(\psi_{k+1}^{2p} - \psi_k^{2p}) \\
& + \frac{i}{2}\sum_p \psi_1^{2p-1}\psi_1^{2p} + \frac{i}{2}\sum_p \psi_N^{2p-1}\psi_N^{2p} \Big).
\end{aligned}
\tag{98}
\]
Here $D\psi_k := \overleftarrow{\prod_a} D\psi_k^a$ is the Berezin measure on g; the variable $\psi_1$ is constructed by formulae (97) from the integration variable $\eta_1$ and the boundary value $\bar\eta_1 := \bar\eta_{in}$, and $\psi_N$ is constructed from the integration variable $\bar\eta_N$ and the boundary value $\eta_N := \eta_{out}$.

Remark 26. We think of the integral (95) as corresponding to cutting the interval $I = [p_{in}, p_{out}]$ into N intervals $[p_{in}, p_2] \cup [p_2, p_3] \cup \cdots \cup [p_N, p_{out}]$. Integration variables $\eta_k$, $\bar\eta_{k+1}$ are associated to the point $p_{k+1}$ (more specifically, to the right end of the interval $[p_k, p_{k+1}]$ and to the left end of the interval $[p_{k+1}, p_{k+2}]$, respectively); the boundary value $\bar\eta_{in}$ corresponds to the point $p_{in}$, and the boundary value $\eta_{out}$ to the point $p_{out}$. However, the variables $\psi_k$ are linear combinations of $\eta_k$ and $\bar\eta_k$. Thus, they are not associated to any single point, but rather to a pair of neighboring points $(p_k, p_{k+1})$.

Again, we formally write the limit N → ∞ of the integral (98) as a path integral over paths ψ : I → g with a fixed anti-holomorphic projection of ψ at the right end-point of the interval and a fixed holomorphic projection at the left end-point:

Lemma 6. The path integral expression for the partition function of the abelian Chern-Simons theory on an interval is given by
\[ (\hat 1)(\eta_{out}, \bar\eta_{in}) = \int_{\bar\pi(\psi(0))=\bar\eta_{in},\ \pi(\psi(1))=\eta_{out}} D\psi \cdot \exp\frac{1}{\hbar}\Big( \frac{i}{2}\sum_p \psi^{2p-1}(0)\psi^{2p}(0) + \frac{1}{2}\int_I (\psi, d\psi) + \frac{i}{2}\sum_p \psi^{2p-1}(1)\psi^{2p}(1) \Big). \tag{99} \]


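The exact meaning of the conditional measure on paths in (99) is the formal N → ∞ limit of the measure in (98). The second term in the exponential in (98) does not contribute to the limit N → ∞: once we assume that $\psi_k$ are values of a differentiable path ψ(τ) at times τ = k/N, the contribution of this term becomes of order O(1/N). For a general complex structure J on g, the path integral (99) becomes
\[ (\hat 1)(\eta_{out}, \bar\eta_{in}) = \int_{\bar\pi(\psi(0))=\bar\eta_{in},\ \pi(\psi(1))=\eta_{out}} D\psi \cdot \exp\frac{1}{\hbar}\Big( \frac{i}{4}(\psi(0), J\psi(0)) + \frac{1}{2}\int_I (\psi, d\psi) + \frac{i}{4}(\psi(1), J\psi(1)) \Big). \]

4.4.2. Path integral for the non-abelian one-dimensional Chern-Simons theory in the cyclic Whitney gauge. End of proof of Theorem 2. To obtain a path integral representation for the partition function of the one-dimensional Chern-Simons theory on an interval (61), we use the same strategy as in Sect. 4.4.1: we cut the interval into N smaller intervals and then apply the map (75). The new point here is that for small intervals we have to use the “heat kernel” approximation, which gives an exact result only in the limit N → ∞. Applying the map (75) to $Z_I$ (61), we have
\[
\begin{aligned}
(Z_I)(\eta_{out}, \bar\eta_{in}) &= (i\hbar)^{-m}\int D\lambda \cdot e^{-(\lambda, \tilde\psi)} \cdot \Big( \exp\Big( -\frac{1}{2\hbar}(\hat\psi, [A, \hat\psi]) + (\lambda, \hat\psi) \Big) \Big)(\eta_{out}, \bar\eta_{in}) \\
&= (i\hbar)^{-m}\int D\lambda \cdot e^{-(\lambda, \tilde\psi)} \cdot \Big( \Big[ \exp\Big( -\frac{1}{2\hbar N}(\hat\psi, [A, \hat\psi]) + \frac{1}{N}(\lambda, \hat\psi) \Big) \Big]^N \Big)(\eta_{out}, \bar\eta_{in}) \\
&= (i\hbar)^{-m}\int D\lambda \cdot e^{-(\lambda, \tilde\psi)} \int \prod_{k=1}^{N-1} \hbar^m D\eta_k\, D\bar\eta_{k+1}\ \Big( \exp\Big( -\frac{1}{2\hbar N}(\hat\psi, [A, \hat\psi]) + \frac{1}{N}(\lambda, \hat\psi) \Big) \Big)(\eta_{out}, \bar\eta_N) \cdot e^{\frac{1}{\hbar}\langle\bar\eta_N, \eta_{N-1}\rangle} \\
&\qquad \cdots\ e^{\frac{1}{\hbar}\langle\bar\eta_2, \eta_1\rangle} \cdot \Big( \exp\Big( -\frac{1}{2\hbar N}(\hat\psi, [A, \hat\psi]) + \frac{1}{N}(\lambda, \hat\psi) \Big) \Big)(\eta_1, \bar\eta_{in}).
\end{aligned}
\tag{100}
\]
Next, we need to evaluate the partition function for a small interval in the limit N → ∞:
\[
\begin{aligned}
\Big( \exp\Big( -\frac{1}{2\hbar N}(\hat\psi, [A, \hat\psi]) + \frac{1}{N}(\lambda, \hat\psi) \Big) \Big)(\eta, \bar\eta)
&= \Big( \hat 1 - \frac{1}{2\hbar N}(\hat\psi, [A, \hat\psi]) + \frac{1}{N}(\lambda, \hat\psi) + O\Big(\frac{1}{N^2}\Big) \Big)(\eta, \bar\eta) \\
&= e^{\frac{1}{\hbar}\langle\eta, \bar\eta\rangle}\Big( 1 - \frac{1}{2\hbar N}(\psi, [A, \psi]) + \frac{1}{N}(\lambda, \psi) + \frac{i}{4N}\,\mathrm{tr}(J\cdot\mathrm{ad}_A) + O\Big(\frac{1}{N^2}\Big) \Big).
\end{aligned}
\tag{101}
\]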


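Here ψ is a linear combination of $\eta, \bar\eta$, prescribed by the choice of a complex structure J (e.g. (97)); the term with a trace appeared due to the following identity:
\[ (\hat\psi^a\hat\psi^b)(\eta, \bar\eta) = e^{\frac{1}{\hbar}\langle\eta, \bar\eta\rangle}\Big( \psi^a\psi^b + \frac{\hbar}{2}\delta^{ab} + \frac{i\hbar}{2}J^{ab} \Big). \]
Here the third term generates the trace term in (101). Substituting the “heat kernel” asymptotics (101) into (100), we get
\[
\begin{aligned}
(Z_I)(\eta_{out}, \bar\eta_{in}) = \int \prod_{k=1}^{N-1} \hbar^m D\eta_k\, D\bar\eta_{k+1}\ (i\hbar)^{-m}\int D\lambda \cdot & e^{(\lambda,\ \frac{1}{N}\sum_{k=1}^{N}\psi_k - \tilde\psi)} \cdot e^{\frac{i}{4}\mathrm{tr}(J\cdot\mathrm{ad}_A)} \cdot e^{\frac{1}{\hbar}\left( \langle\eta_{out}, \bar\eta_N\rangle + \langle\bar\eta_N, \eta_{N-1}\rangle + \cdots + \langle\bar\eta_2, \eta_1\rangle + \langle\eta_1, \bar\eta_{in}\rangle \right)} \\
& \cdot e^{-\frac{1}{2\hbar N}\sum_{k=1}^{N}(\psi_k, [A, \psi_k])} + O\Big(\frac{1}{N}\Big).
\end{aligned}
\tag{102}
\]
Taking the limit N → ∞, we obtain the following.

Proposition 3. The Chern-Simons partition function for an interval is given by the path integral:
\[ (Z_I)(\eta_{out}, \bar\eta_{in}) = e^{\frac{i}{4}\mathrm{tr}(J\cdot\mathrm{ad}_A)} \int_{\bar\pi(\psi(0))=\bar\eta_{in},\ \pi(\psi(1))=\eta_{out},\ \int_I d\tau\,\psi(\tau)=\tilde\psi} D\psi \cdot \exp\frac{1}{\hbar}\Big( \frac{1}{2}\int_I (\psi, (d + d\tau\cdot\mathrm{ad}_A)\psi) + \frac{i}{4}(\psi(0), J\psi(0)) + \frac{i}{4}(\psi(1), J\psi(1)) \Big). \tag{103} \]
The conditional measure on paths ψ(τ) with a fixed holomorphic projection at τ = 1, a fixed anti-holomorphic projection at τ = 0 and a fixed integral over τ in (103) is the N → ∞ limit of the measure in (102).

Applying the concatenation formulae (76), (77) to (103), we obtain a path integral representation of the Chern-Simons partition function for one-dimensional simplicial complexes:

Corollary 2. For a triangulated interval $[p_{in}, p_2] \cup [p_2, p_3] \cup \cdots \cup [p_n, p_{out}]$, the partition function Z satisfies
\[ (Z)(\eta_{out}, \bar\eta_{in}) = e^{\sum_{k=1}^{n}\frac{i}{4}\mathrm{tr}(J\cdot\mathrm{ad}_{A_k})} \int_{\bar\pi(\psi(p_{in}))=\bar\eta_{in},\ \pi(\psi(p_{out}))=\eta_{out},\ \int_{p_k}^{p_{k+1}} d\tau\,\psi(\tau)=\tilde\psi_k} D\psi \cdot \exp\frac{1}{\hbar}\Big( \frac{1}{2}\int (\psi, (d + d\tau\cdot\mathrm{ad}_A)\psi) + \frac{i}{4}(\psi(p_{in}), J\psi(p_{in})) + \frac{i}{4}(\psi(p_{out}), J\psi(p_{out})) \Big). \tag{104} \]
For a triangulated circle $[p_1, p_2] \cup \cdots \cup [p_{n-1}, p_n] \cup [p_n, p_1]$, we obtain:
\[ Z = e^{\sum_{k=1}^{n}\frac{i}{4}\mathrm{tr}(J\cdot\mathrm{ad}_{A_k})} \int_{\int_{p_k}^{p_{k+1}} d\tau\,\psi(\tau)=\tilde\psi_k} D\psi \cdot e^{\frac{1}{2\hbar}\int (\psi, (d + d\tau\cdot\mathrm{ad}_A)\psi)}. \tag{105} \]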


Remark 27. Expression (105) returns us to the “naïve” path integral (37) for the simplicial Chern-Simons theory on a circle, up to a somewhat puzzling factor $e^{\sum_k \frac{i}{4}\mathrm{tr}(J\cdot\mathrm{ad}_{A_k})}$. The explanation is as follows: the path integral in (105) was obtained from the path integral with boundaries (104) by the concatenation formula (77). Hence, it is secretly using the normal ordering prescribed by the choice of a complex structure J on g (which dictates the regularization for the one-loop determinant in (105)). This implicit dependence on J is exactly cancelled by the factor $e^{\sum_k \frac{i}{4}\mathrm{tr}(J\cdot\mathrm{ad}_{A_k})}$ (indeed, we know that the left hand side of (105) is defined in terms of the Clifford algebra Cl(g) and therefore cannot possibly depend on J). For the naïve path integral (37), we implicitly assumed the symmetric normal ordering by making a regularization (41) in our computation of the one-loop determinant (the important point is that θ(0) is a number, and not a matrix). The path integral representation (105) returns us to the perturbative computation of Sect. 2.3 and thus finishes the proof of Theorem 2.

Remark 28. It is easy to compute (103) in the case of A = 0:
\[ (Z_I|_{A=0})(\eta_{out}, \bar\eta_{in}) = 2^{-m}\, e^{\frac{1}{\hbar}\left( \langle\eta_{out}, \bar\eta_{in}\rangle - 2\langle\eta_{out} - \tilde\eta,\ \bar\eta_{in} - \bar{\tilde\eta}\rangle \right)}, \tag{106} \]
where $\tilde\eta$, $\bar{\tilde\eta}$ are the holomorphic and anti-holomorphic components of the bulk field $\tilde\psi$. We can also write (106) as
\[ (Z_I|_{A=0})(\eta_{out}, \bar\eta_{in}) = 2^{-m}\, e^{\frac{1}{\hbar}\left( \frac{i}{2}(\psi_{bd}, J\psi_{bd}) - i(\psi_{bd} - \tilde\psi,\ J(\psi_{bd} - \tilde\psi)) \right)}, \]
where $\psi_{bd} = \iota(\eta_{out}) + \bar\iota(\bar\eta_{in})$ is the linear combination of the boundary fields $\eta_{out}$, $\bar\eta_{in}$.

4.5. Simplicial action on an interval.

Proposition 4. The path integral for the Chern-Simons partition function on an interval (103) is given by
\[
(Z_I(\tilde\psi, A))(\eta_{out}, \bar\eta_{in}) = {\det}_g^{1/2}\left( \frac{\sinh\frac{\mathrm{ad}_A}{2}}{\frac{\mathrm{ad}_A}{2}} \right) \cdot {\det}_g^{-1/2} M(\mathrm{ad}_A) \cdot \exp\frac{1}{\hbar}\left( \langle\eta_{out}, \bar\eta_{in}\rangle - \frac{1}{2}(\tilde\psi, \mathrm{ad}_A\tilde\psi) + \frac{1}{2}\begin{pmatrix} \tilde\eta - \eta_{out} & \bar{\tilde\eta} - \bar\eta_{in} \end{pmatrix} \cdot M(\mathrm{ad}_A) \cdot \begin{pmatrix} \tilde\eta - \eta_{out} \\ \bar{\tilde\eta} - \bar\eta_{in} \end{pmatrix} \right),
\tag{107}
\]
where the bilinear form $M(\mathrm{ad}_A)$ in the basis $(\eta, \bar\eta)$ is represented by the block matrix
\[ M(\mathrm{ad}_A) = \begin{pmatrix} R_{-+}R_{++}^{-1} & -1 - R_{--} + R_{-+}R_{++}^{-1}R_{+-} \\ 1 + R_{++}^{-1} & R_{++}^{-1}R_{+-} \end{pmatrix}. \tag{108} \]
Here the symbols $R_{\pm\pm}$ stand for the blocks of $R(\mathrm{ad}_A)$ (defined by formula (31)) in the basis $(\eta, \bar\eta)$:
\[ R(\mathrm{ad}_A) = \begin{pmatrix} R_{++} & R_{+-} \\ R_{-+} & R_{--} \end{pmatrix}. \tag{109} \]


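Proof. The path integral (103) is Gaussian with a critical point $\psi^{cr}(\tau)$ being the solution of
\[ (d + d\tau\,\mathrm{ad}_A)\psi^{cr} = \mathrm{const} \tag{110} \]
subject to the conditions
\[ \int_I d\tau\,\psi^{cr}(\tau) = \tilde\psi, \tag{111} \]
\[ \pi(\psi^{cr}(1)) = \eta_{out}, \tag{112} \]
\[ \bar\pi(\psi^{cr}(0)) = \bar\eta_{in}. \tag{113} \]
Equation (110) together with (111) gives
\[ \psi^{cr}(\tau) = \tilde\psi + F(\mathrm{ad}_A, \tau)(\psi^{cr}(0) - \tilde\psi), \tag{114} \]
where $F(\mathrm{ad}_A, \tau)$ is defined by (30). The value of the action (together with boundary terms) for this path is
\[ S(\psi^{cr}) + \text{boundary terms} = -\frac{1}{2}(\tilde\psi,\ \psi^{cr}(1) - \psi^{cr}(0)) - \frac{1}{2}(\tilde\psi, \mathrm{ad}_A\tilde\psi) + \frac{1}{2}\langle\eta^{cr}(0), \bar\eta_{in}\rangle + \frac{1}{2}\langle\eta_{out}, \bar\eta^{cr}(1)\rangle, \tag{115} \]
where the first two terms on the right hand side are the bulk contribution $\frac{1}{2}\int_I (\psi^{cr}, d_A\psi^{cr})$. Boundary values of $\psi^{cr}$ can be found as follows: (114) implies that $\psi^{cr}(1) = (1 + R(\mathrm{ad}_A))\tilde\psi - R(\mathrm{ad}_A)\psi^{cr}(0)$. Solving this equation together with (112) and (113) in the coordinates $(\eta, \bar\eta)$, we obtain
\[ \begin{pmatrix} \eta_{out} \\ \bar\eta^{cr}(1) \end{pmatrix} = \begin{pmatrix} 1 + R_{++} & R_{+-} \\ R_{-+} & 1 + R_{--} \end{pmatrix} \begin{pmatrix} \tilde\eta \\ \bar{\tilde\eta} \end{pmatrix} - \begin{pmatrix} R_{++} & R_{+-} \\ R_{-+} & R_{--} \end{pmatrix} \begin{pmatrix} \eta^{cr}(0) \\ \bar\eta_{in} \end{pmatrix}. \tag{116} \]
Here $\eta^{cr}(0)$ and $\bar\eta^{cr}(1)$ are the unknowns. Solving (116), we get
\[ \eta^{cr}(0) = \eta_{out} + (1 + R_{++}^{-1})(\tilde\eta - \eta_{out}) + R_{++}^{-1}R_{+-}(\bar{\tilde\eta} - \bar\eta_{in}), \]
\[ \bar\eta^{cr}(1) = \bar\eta_{in} - R_{-+}R_{++}^{-1}(\tilde\eta - \eta_{out}) - (-1 - R_{--} + R_{-+}R_{++}^{-1}R_{+-})(\bar{\tilde\eta} - \bar\eta_{in}). \]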
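The solution of the linear system (116) can be checked numerically. The sketch below is an illustration only: it replaces the blocks $R_{\pm\pm}$ by random real matrices (with $R_{++}$ assumed invertible) and the odd variables by ordinary vectors, which is enough to test the linear algebra. It also confirms that the coefficients in the solution are the blocks of $M(\mathrm{ad}_A)$ from (108). All variable names are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3  # size of each block (illustrative choice)

# Random stand-ins for the blocks of R(ad_A); R_pp is assumed invertible
R_pp, R_pm, R_mp, R_mm = (rng.normal(size=(n, n)) for _ in range(4))
eta_t, etabar_t, eta_out, etabar_in = (rng.normal(size=n) for _ in range(4))

I = np.eye(n)
R_pp_inv = np.linalg.inv(R_pp)

# Claimed solution of (116)
eta_cr0 = (eta_out + (I + R_pp_inv) @ (eta_t - eta_out)
           + R_pp_inv @ R_pm @ (etabar_t - etabar_in))
etabar_cr1 = (etabar_in - R_mp @ R_pp_inv @ (eta_t - eta_out)
              - (-I - R_mm + R_mp @ R_pp_inv @ R_pm) @ (etabar_t - etabar_in))

# Both sides of (116)
lhs = np.concatenate([eta_out, etabar_cr1])
rhs = (np.block([[I + R_pp, R_pm], [R_mp, I + R_mm]])
       @ np.concatenate([eta_t, etabar_t])
       - np.block([[R_pp, R_pm], [R_mp, R_mm]])
       @ np.concatenate([eta_cr0, etabar_in]))
solves_116 = np.allclose(lhs, rhs)

# Blocks of M(ad_A) as in (108)
M11 = R_mp @ R_pp_inv
M12 = -I - R_mm + R_mp @ R_pp_inv @ R_pm
M21 = I + R_pp_inv
M22 = R_pp_inv @ R_pm
x, xbar = eta_t - eta_out, etabar_t - etabar_in
matches_M = (np.allclose(eta_cr0 - eta_out, M21 @ x + M22 @ xbar)
             and np.allclose(etabar_cr1 - etabar_in, -(M11 @ x + M12 @ xbar)))

print(solves_116, matches_M)
```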

By substituting into (115), we obtain the exponential in (107). The pre-exponential in (107) can be derived as follows. We know that it is a function of A only, since it is a square root of a functional determinant of the operator $d_A$ (see footnote 13) acting on functions with vanishing integral, vanishing holomorphic part at τ = 1 and vanishing anti-holomorphic part at τ = 0; let us denote it by G(A). Closing the interval into a circle (by using (77)), we find
\[ \int \hbar^m D\eta_{out}\, D\bar\eta_{in}\ e^{\frac{1}{\hbar}\langle\bar\eta_{in}, \eta_{out}\rangle} \cdot (Z_I(\tilde\psi, A))(\eta_{out}, \bar\eta_{in}) = {\det}_g^{1/2} M(\mathrm{ad}_A) \cdot G(A) \cdot e^{-\frac{1}{2\hbar}(\tilde\psi, \mathrm{ad}_A\tilde\psi)}. \]
Comparing with the known result for a circle (13), we obtain the pre-exponential in (107).

13 More exactly, the determinant of a matrix of the bilinear form $(\bullet, d_A \bullet)$.

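Remark 29. In this computation, we neglected the factor of $e^{\frac{i}{4}\mathrm{tr}(J\cdot\mathrm{ad}_A)}$. If we did not, it would anyway be cancelled by the pre-exponential obtained by comparison with (13) (and this result is obtained by an explicit computation in the operator formalism in Sect. 3.1.3).

We can define the simplicial Chern-Simons action for the interval $S_I(\tilde\psi, A; \eta_{out}, \bar\eta_{in})$ by
\[ e^{\frac{1}{\hbar}S_I(\tilde\psi, A; \eta_{out}, \bar\eta_{in})} = (Z_I(\tilde\psi, A))(\eta_{out}, \bar\eta_{in}), \]
or explicitly
\[
\begin{aligned}
S_I(\tilde\psi, A; \eta_{out}, \bar\eta_{in}) = {} & \langle\eta_{out}, \bar\eta_{in}\rangle - \frac{1}{2}(\tilde\psi, \mathrm{ad}_A\tilde\psi) + \frac{1}{2}\begin{pmatrix} \tilde\eta - \eta_{out} & \bar{\tilde\eta} - \bar\eta_{in} \end{pmatrix} \cdot M(\mathrm{ad}_A) \cdot \begin{pmatrix} \tilde\eta - \eta_{out} \\ \bar{\tilde\eta} - \bar\eta_{in} \end{pmatrix} \\
& + \frac{\hbar}{2}\,\mathrm{tr}_g \log\left( \frac{\sinh\frac{\mathrm{ad}_A}{2}}{\frac{\mathrm{ad}_A}{2}} \right) - \frac{\hbar}{2}\,\mathrm{tr}_g \log M(\mathrm{ad}_A).
\end{aligned}
\tag{117}
\]

Remark 30. The expansion of (117) as a power series in A starts as
\[
\begin{aligned}
S_I(\tilde\psi, A; \eta_{out}, \bar\eta_{in}) = {} & \frac{i}{2}(\psi_{bd}, J\psi_{bd}) - i(\psi_{bd} - \tilde\psi,\ J(\psi_{bd} - \tilde\psi)) - \frac{1}{2}(\tilde\psi, \mathrm{ad}_A\tilde\psi) \\
& - \frac{1}{6}(\psi_{bd} - \tilde\psi,\ J\,\mathrm{ad}_A\,J(\psi_{bd} - \tilde\psi)) + O(A^2) \\
& - \hbar\, m\log 2 + \frac{i\hbar}{12}\mathrm{tr}(J\,\mathrm{ad}_A) + O(A^2),
\end{aligned}
\tag{118}
\]
where $\psi_{bd} = \iota(\eta_{out}) + \bar\iota(\bar\eta_{in})$.

We can obtain the action for a simplicial complex by gluing the actions (117) for individual intervals using the concatenation formulae (76), (77). E.g. for a triangulated interval $[p_{in}, p_2] \cup [p_2, p_3] \cup \cdots \cup [p_n, p_{out}]$ we have
\[ e^{\frac{1}{\hbar}S(\tilde\psi_1, A_1, \ldots, \tilde\psi_n, A_n; \eta_{out}, \bar\eta_{in})} = \int \prod_{k=1}^{n-1} \hbar^m D\eta_k\, D\bar\eta_{k+1} \cdot \exp\frac{1}{\hbar}\Big( S_I(\tilde\psi_n, A_n; \eta_{out}, \bar\eta_n) + \langle\bar\eta_n, \eta_{n-1}\rangle + \cdots + \langle\bar\eta_2, \eta_1\rangle + S_I(\tilde\psi_1, A_1; \eta_1, \bar\eta_{in}) \Big). \tag{119} \]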

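For a triangulated circle $[p_1, p_2] \cup \cdots \cup [p_{n-1}, p_n] \cup [p_n, p_1]$, we have
\[ e^{\frac{1}{\hbar}S(\tilde\psi_1, A_1, \ldots, \tilde\psi_n, A_n)} = \int \prod_{k=1}^{n} \hbar^m D\eta_k\, D\bar\eta_{k+1} \cdot \exp\frac{1}{\hbar}\sum_{k=1}^{n}\Big( S_I(\tilde\psi_k, A_k; \eta_k, \bar\eta_k) + \langle\bar\eta_{k+1}, \eta_k\rangle \Big). \tag{120} \]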


Remark 31. Looking at formulae (119), (120), it is tempting to identify $\langle\bar\eta_{k+1}, \eta_k\rangle$ as a simplicial action for the point $p_{k+1}$.

Remark 32. Formula (120) explains how the simplicially non-local expression (42) is produced from a simplicially local expression (the sum of contributions of individual intervals, i.e. the integrand in (120)). The key is the integration over the boundary fields $\{\eta_k, \bar\eta_k\}$.

Let us introduce the notation $f_{\pm\pm\pm}^{pqr}$ for the structure constants (see footnote 14) of g in the basis $(\eta, \bar\eta)$:
\[ \theta = \frac{1}{6} f^{abc}\psi^a\psi^b\psi^c = \frac{1}{6} f_{+++}^{pqr}\eta^p\eta^q\eta^r + \frac{1}{2} f_{++-}^{pqr}\eta^p\eta^q\bar\eta^r + \frac{1}{2} f_{+--}^{pqr}\eta^p\bar\eta^q\bar\eta^r + \frac{1}{6} f_{---}^{pqr}\bar\eta^p\bar\eta^q\bar\eta^r \tag{121} \]

(this is the same θ as in (4) rewritten in holomorphic-antiholomorphic coordinates on g). Then, the quantum master equation (68) for the action (117) has the following form:

∂ 1 SI (ψ,A;η, ∂ 1 1 pqr p q r pqr p q ∂ pqq ˜ η) ¯ e + − f ++− η p f +++ η η η + f ++− η η a r a 6 2 ∂η 2 ∂ ψ˜ ∂ A 2 2 3 1 ∂ ∂ pqr ppr ∂ pqr ∂ ∂ ∂ ˜ ¯ e SI (ψ,A;η,η) f f + f +−− η p q r − + 2 ∂η ∂η 2 +−− ∂ηr 6 −−− ∂η p ∂ηq ∂ηr ← − ← − ← − ← − ← − ← − 1 3 pqr ∂ ∂ ∂ 1 2 pqr ∂ ∂ r 2 pqq ∂ ˜ S ( ψ,A;η, η) ¯ f +++ f f −e I + η ¯ − 6 ∂ η¯ p ∂ η¯ q ∂ η¯ r 2 ++− ∂ η¯ p ∂ η¯ q 2 ++− ∂ η¯ p ← − pqr ∂ q r ppr r 3 pqr p q r f + f +−− p η¯ η¯ − f +−− η¯ + η¯ η¯ η¯ = 0. (122) 2 ∂ η¯ 2 6 −−− Remark 33. The one-dimensional B F theory is a special case of the one-dimensional Chern-Simons theory where the complex polarization (78) is compatible with the Lie algebra structure on g. In more detail, let h in (78) be a Lie subalgebra, and let the Lie ¯ algebra structure on g be given by a semidirect product of h with its coadjoint module h: ¯ g = h h. In this case, formula (117) for SI simplifies: the block R+− in (109) vanishes, and the matrix M(ad A ) (108) becomes

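\[ M(\mathrm{ad}_A) = \begin{pmatrix} R_{-+}R_{++}^{-1} & -1 - R_{--} \\ 1 + R_{++}^{-1} & 0 \end{pmatrix}, \qquad {\det}_g^{1/2} M(\mathrm{ad}_A) = {\det}_h (1 + R_{++}^{-1}). \]
In (121), only the second term on the right hand side survives:
\[ \theta = \frac{1}{2} F^r_{pq}\,\eta^p\eta^q\bar\eta_r. \]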

14 Here we mean the structure constants of the cyclic operation (•, [•, •]) : ∧3 g → R.


(Here $F^r_{pq}$ are the structure constants of h; we distinguish between upper and lower indices to emphasize that we do not assume that h comes with a pairing.) So, the quantum master equation (122) is simplified:

1 ∂ 1 SI (ψ,A;η, ∂ 1 r p q ∂ q p ˜ ˜ η) ¯ ¯ F pq η η e + − F pq η e SI (ψ,A;η,η) a a ˜ 2 ∂ηr 2 ∂ψ ∂ A ← − ← − ← − 1 1 2 r ∂ ∂ 2 q ∂ ˜ S ( ψ,A;η, η) ¯ I F F pq p = 0. −e η¯r − (123) 2 pq ∂ η¯ p ∂ η¯ q 2 ∂ η¯

(If in addition h is unimodular, the last terms in brackets vanish.) Note that the result for BF theory that we obtain from (117) cannot be directly compared to the result in [15], as the choice of gauge fixing is very different (see footnote 15).

Remark 34. Another interesting point about the BF case is that $f^{abc}f^{abc} = 0$. Hence, the operator $\delta^\rho : \mathcal{H}_{pt^+} \to \mathcal{H}_{pt^+}$ becomes a coboundary operator. If we assume in addition that h is unimodular, then $(\mathcal{H}_{pt^+}, \delta^\rho)$ can be identified with the Chevalley-Eilenberg complex of the Lie algebra h. Thus, the reduced space of states associated to a point (see Remark 25) is the Chevalley-Eilenberg cohomology of h: $\mathcal{H}^{red}_{pt^+} \cong H_{CE}(h)$. Therefore, the cohomology space
\[ H_{\mathrm{ad}_{\hat\theta}}(Cl(g)) \cong H_{\delta_I^\rho}(\mathrm{End}(\mathcal{H}_{pt^+})) \cong H_{CE}(h) \otimes (H_{CE}(h))^* \tag{124} \]

becomes non-trivial. In this case, the partition function Z I◦ can be understood as an identity operator acting on the Chevalley-Eilenberg cohomology HC E (h). Remark 35. One can also view the one-dimensional version of the B F theory with a cosmological term [6] as a special case of the one-dimensional Chern-Simons theory for g = h ⊕ h∗ , where h is itself a quadratic Lie algebra, and the Lie algebra structure on g is given by θ=

1 1 pqr p q r F η η η¯ + κ F pqr η¯ p η¯ q η¯ r . 2 6

(125)

Here F pqr are the structure constants of h (in an orthonormal basis) and the parameter κ is the “cosmological constant”. For Lie algebra g, we automatically have f abc f abc = 0, and Remark 25 applies in this case. Let us denote g with Lie algebra structure defined by (125) by g B F,κ . Then, onedimensional Chern-Simons theories with Lie algebras g B F,κ and h are related, similarly to the 3-dimensional case [6]. In particular, for continuum action on the circle we have ¯ ) SgB F,κ (ι(η) + ι¯(η) ¯ , ι(A) + ι¯(A) ψ

A

1

¯ − Sh η − κ η, ¯ , Sh η + κ η, ¯ A+κ A ¯ A−κ A = 2κ

(126)

15 Indeed, here we fix the field A to be constant on the interval, and we fix the integral ψ ˜ of field ψ over the interval, and the holomorphic and anti-holomorphic projections of ψ at the right and left end-points of the interval. The gauge used in [15] fixes π(A) to be constant, π¯ (A) to be a sum of delta-functions at the ends of the interval; and it fixes the values π(ψ) at the ends of the interval and the integral for π¯ (ψ). The latter gauge choice features better simplicial locality properties, but is only h-equivariant.


where ι and ι¯ denote the embeddings of h, h∗ into g B F,κ and on the right hand side we implicitly use the isomorphism h ∼ = h∗ given by the pairing on h. Relation (126) implies the following relation for partition functions for the triangulated circle n for Lie algebras g B F,κ and h: ¯ k )}; ) Z gB F,κ , n ({ι(ηk ) + ι¯(η¯ k )}, {ι(Ak ) + ι¯(A ¯ k }; 2κ ) · Z h, ({ηk −κ η¯ k }, {Ak −κ A ¯ k }; −2κ ). = Z h, n ({ηk +κ η¯ k }, {Ak +κ A n (127) Remark 36. Another special case of a one-dimensional Chern-Simons theory can be constructed from a Lie bialgebra h. Here we set g = h ⊕ h∗ with the canonical pairing and with Lie algebra structure on g defined by θ=

\[ \theta = \frac{1}{2} F^r_{pq}\,\eta^p\eta^q\bar\eta_r + \frac{1}{2} G_p^{qr}\,\eta^p\bar\eta_q\bar\eta_r. \]
Here $F^r_{pq}$ and $G_p^{qr}$ are the structure constants of the Lie bracket and co-bracket on h. This is a one-dimensional version of the Lie bialgebra BF theory, cf. [14] (the underlying unimodular Lie bialgebra for the continuum theory on the circle is $h \otimes \Omega^\bullet(S^1)$). It does not seem to enjoy any particular simplifications with respect to the general case other than having a canonical complex polarization on g.

Remark 37. The odd third-order differential operator in the variables $\tilde\psi$, $A$, $\eta_{out}$, $\bar\eta_{in}$ that appears in (122) endows the algebra of functions $\mathrm{Fun}(g \oplus g \oplus h \oplus \bar h)$ with a structure of homotopy BV algebra in the sense of Tamarkin-Tsygan [16]. In general, the same applies to the algebra of functions on the bulk fields $F^{bulk}$ together with copies of $h$ and $\bar h$ for the boundary points, for any 1-dimensional simplicial complex. If the complex has no boundary, this homotopy BV structure is strict.

Acknowledgements. We wish to thank Alberto Cattaneo and Andrei Losev for enlightening discussions on the subject. Research of A. A. was supported in part by the grants of the Swiss National Science Foundation number 200020-129609 and 200020-126817; P. M. acknowledges partial support by SNF Grant 200020-121640/1 and by RFBR Grants 08-01-00638, 09-01-12150.

References

1. Aleksandrov, M., Kontsevich, M., Schwarz, A., Zaboronsky, O.: The geometry of the master equation and topological quantum field theory. Int. J. Mod. Phys. A 12, 1405–1430 (1997)
2. Alekseev, A., Meinrenken, E.: Clifford algebras and the classical dynamical Yang-Baxter equation. Math. Res. Lett. 10(2–3), 253–268 (2003)
3. Berezin, F.: Covariant and contravariant symbols of operators. (Russian) Izv. Akad. Nauk SSSR Ser. Mat. 66, 1134–1167 (1972)
4. Berline, N., Getzler, E., Vergne, M.: Heat kernels and Dirac operators, Grundlehren der Mathematischen Wissenschaften, Vol. 298. New York: Springer-Verlag, 1992
5. Atiyah, M.: Topological quantum field theories. Publ. Math. Inst. Hautes Etudes Sci. 68, 175–186 (1989)
6. Cattaneo, A.S., Cotta-Ramusino, P., Froehlich, J., Martellini, M.: Topological BF theories in 3 and 4 dimensions. J. Math. Phys. 36, 6137–6160 (1995)
7. Cattaneo, A.S., Mnev, P.: Remarks on Chern-Simons invariants. Commun. Math. Phys. 293(3), 803–836 (2010)
8. Cattaneo, A.S., Mnev, P., Reshetikhin, N.: Perturbative topological quantum field theory with boundary. In preparation
9. Faddeev, L.D., Slavnov, A.A.: Gauge fields: an introduction to quantum theory. Reading, MA: Addison-Wesley, 1988
10. Frenkel, E., Losev, A., Nekrasov, N.: Instantons beyond topological theory I. http://arxiv.org/abs/hep-th/0610149v1, 2006
11. Granåker, J.: Unimodular L-infinity algebras. http://arxiv.org/abs/0803.1763v1 [math.QA], 2008
12. Kostant, B., Sternberg, S.: Symplectic reduction, BRS cohomology, and infinite-dimensional Clifford algebras. Ann. Phys. 176(1), 49–113 (1987)
13. Losev, A.: Lectures on topological quantum field theory. 2008
14. Merkulov, S.A.: Wheeled pro(p)file of Batalin-Vilkovisky formalism. Commun. Math. Phys. 295, 585–638 (2010)
15. Mnev, P.: Notes on simplicial BF theory. Moscow Math. J. 9(2), 371–410 (2009); Discrete BF theory. http://arxiv.org/abs/0809.1160v2 [hep-th], 2008
16. Tamarkin, D., Tsygan, B.: Noncommutative differential calculus, homotopy BV algebras and formality conjectures. Methods Funct. Anal. Topology 6(2), 85–100 (2000)
17. Whitney, H.: Geometric integration theory. Princeton, NJ: Princeton University Press, 1957
18. Witten, E.: Supersymmetry and Morse theory. J. Diff. Geom. 17(4), 661–692 (1982)

Communicated by A. Kapustin

Commun. Math. Phys. 307, 229–259 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1307-9

Communications in

Mathematical Physics

Parabolic Presentations of the Super Yangian $Y(\mathfrak{gl}_{M|N})$

Yung-Ning Peng
Department of Mathematics, University of Virginia, Charlottesville, VA 22904-4137, USA. E-mail: [email protected]
Received: 6 December 2010 / Accepted: 14 March 2011
Published online: 2 August 2011 – © Springer-Verlag 2011

Abstract: Associated to a composition of M and a composition of N, a new presentation of the super Yangian of the general linear Lie superalgebra $Y(\mathfrak{gl}_{M|N})$ is obtained.

Contents
1. Introduction
2. Properties of the Super Yangian $Y(\mathfrak{gl}_{M|N})$
3. Gauss Decomposition and Quasideterminants
4. Maps Between Super Yangians
5. Special Cases: Non-super Case and m = n = 1
6. Special Case: m = 2, n = 1
7. The General Case
8. Injectivity of

1. Introduction

For each simple finite-dimensional Lie algebra $\mathfrak{g}$ over $\mathbb{C}$, the associated Yangian $Y(\mathfrak{g})$ was defined by Drinfeld in [D1] as a deformation of the universal enveloping algebra $U(\mathfrak{g}[x])$ for the polynomial current Lie algebra $\mathfrak{g}[x]$. The Yangians form a family of quantum groups which give rise to rational solutions of the Yang-Baxter equation originating from statistical mechanics; see [CP]. A Yangian admits a PBW basis, a triangular decomposition and a Hopf algebra structure. The Yangian $Y(\mathfrak{gl}_N)$ of the reductive Lie algebra $\mathfrak{gl}_N$ was considered earlier in [TF]. It is an associative algebra whose defining relations can be written in a specific matrix form, called the RTT relation; see e.g. [FRT] and [MNO]. The structure and representation theory of $Y(\mathfrak{gl}_N)$ have been studied by many authors; see e.g. [KRS, Ta, MNO, Mo]. In [D2], Drinfeld gave a new presentation for Yangians, which in particular can be used to define the analogs of the Cartan subalgebra and the Borel subalgebra in $Y(\mathfrak{gl}_N)$.


In [BK1], Brundan and Kleshchev found a parabolic presentation for $Y(\mathfrak{gl}_N)$ associated to each composition λ of N. Roughly speaking, the new presentation corresponds to a block matrix decomposition of $\mathfrak{gl}_N$ of shape λ. In the special case when λ = (1, 1, . . . , 1), the corresponding parabolic presentation is just a variation of Drinfeld’s; see [BK1, Remark 5.12]. In the other extreme case, when λ = (N), the corresponding parabolic presentation is exactly the original RTT presentation. The parabolic presentation allows Brundan and Kleshchev to further define the standard Levi and parabolic subalgebras of $Y(\mathfrak{gl}_N)$, and thus to obtain a Levi decomposition of $Y(\mathfrak{gl}_N)$. The parabolic presentations have played a crucial role in their subsequent work [BK2], in which they derived generators and relations for the finite W-algebras.

The main goal of this article is to obtain the superalgebra generalization of the parabolic presentations of [BK1] for the super Yangian $Y(\mathfrak{gl}_{M|N})$. The super Yangian of the general linear Lie superalgebra $Y(\mathfrak{gl}_{M|N})$ was introduced by Nazarov in [Na], and it shares many properties with the usual Yangian, such as the PBW theorem, the RTT relation and the Hopf algebra structure. The results of this article will be used in a sequel on the connection between $Y(\mathfrak{gl}_{M|N})$ and the super W-algebras.

Let λ be a composition of M and ν be a composition of N. We first define some distinguished elements in $Y(\mathfrak{gl}_{M|N})$, denoted by D’s, E’s and F’s, by Gauss decomposition and quasideterminants. We show that these new elements form a set of generators for $Y(\mathfrak{gl}_{M|N})$. The next step is to find the relations among the new generators, where the signs arising from the $\mathbb{Z}_2$-grading are involved. However, since the (λ|ν)-block decomposition of $\mathfrak{gl}_{M|N}$ respects the $\mathbb{Z}_2$-grading of the superalgebra, the signs in the relations are determined by the block positions only. It is known (cf. [BK1]) that if the elements are from two different blocks and the blocks are not “close”, then they commute. This phenomenon remains true in our super Yangian setting and it dramatically reduces the number of non-trivial relations. Hence we only have to focus on the commutation relations of the elements in the same block or when their block positions are “close”.

Let m be the number of parts of λ and n be the number of parts of ν. Then the first new non-trivial case will be m = n = 1, and the next ones will be m = 2, n = 1 and m = 1, n = 2 (see Sect. 4). In these special cases, we determine various relations among D’s, E’s and F’s by direct computation. Next, we make use of the shift map $\psi_k$ and the swap map $\zeta_{M|N}$ between super Yangians (see Sect. 4). These maps allow us to transfer the relations in the special cases with m + n ≤ 3 to relations in $Y(\mathfrak{gl}_{M|N})$ in the setting of general compositions λ and ν. Finally, we show that we have found enough relations for our new presentation. As a consequence, we obtain the PBW bases for several distinguished subalgebras of $Y(\mathfrak{gl}_{M|N})$.

The parabolic presentation in the extreme case when all parts of λ and ν are 1 was found by Gow in [Go], who used the presentation to define the super Yangian of the special linear superalgebra $Y(\mathfrak{sl}_{N|N})$, which was missing in the literature, and to determine the generators of the center of $Y(\mathfrak{gl}_{M|N})$. However, there are non-trivial relations that cannot be observed in this special case and nevertheless play an important role in our paper (see Remark 7.1 below).

We organize this article in the following manner. In Sect. 2, we recall the definition and some properties of $Y(\mathfrak{gl}_{M|N})$. In Sect. 3, we introduce the generating elements in our parabolic presentation by means of Gauss decomposition. In Sect. 4, we define some maps between super Yangians in order to reduce the general case to the special cases when m + n ≤ 3, and Sects. 5 and 6 are devoted to these special cases. Our main theorem in the general case is formulated in Sect. 7 and its proof is completed in Sect. 8.

Parabolic Presentations of the Super Yangian Y (glM|N )


2. Properties of the Super Yangian $Y(\mathfrak{gl}_{M|N})$

Most of the theorems and lemmas in Sects. 2 to 4 are generalizations of their counterparts for $Y(\mathfrak{gl}_N)$ in [MNO] or [BK1]. The super Yangian $Y(\mathfrak{gl}_{M|N})$, which was introduced in [Na], is the associative $\mathbb{Z}_2$-graded algebra (i.e., superalgebra) over $\mathbb{C}$ with generators
\[ \{\, t_{ij}^{(r)} \mid 1 \le i, j \le M+N;\ r \ge 0 \,\}, \]
where $t_{ij}^{(0)} := \delta_{ij}$, and defining relations
\[ [t_{ij}^{(r)}, t_{hk}^{(s)}] = (-1)^{|i||j| + |i||h| + |j||h|} \sum_{t=0}^{\min(r,s)-1} \Big( t_{hj}^{(t)} t_{ik}^{(r+s-1-t)} - t_{hj}^{(r+s-1-t)} t_{ik}^{(t)} \Big), \tag{2.1} \]
where $|i| = 0$ if $i \le M$, $|i| = 1$ if $i \ge M+1$, and the bracket is understood as a supercommutator. For $r > 0$, the element $t_{ij}^{(r)}$ is defined to be an odd element if $|i| + |j| = 1$ and an even element if $|i| + |j| = 0$ (sums taken in $\mathbb{Z}_2$).

Remark 2.1. When $N = 0$, the super Yangian $Y(\mathfrak{gl}_{M|0})$ is naturally isomorphic to the usual Yangian $Y(\mathfrak{gl}_M)$; when $M = 0$, the super Yangian $Y(\mathfrak{gl}_{0|N})$ is also isomorphic to the usual Yangian $Y(\mathfrak{gl}_N)$ by the map $\zeta_{0|N}$, see Sect. 4.

We define the formal power series $t_{ij}(u)$ to be the generating series (in non-positive powers of a variable $u$) of the generators:
\[ t_{ij}(u) = \delta_{ij} + t_{ij}^{(1)} u^{-1} + t_{ij}^{(2)} u^{-2} + t_{ij}^{(3)} u^{-3} + \cdots. \]
Also define
\[ T(u) := \sum_{i,j=1}^{M+N} t_{ij}(u) \otimes E_{ij}\, (-1)^{|j|(|i|+1)} \in Y(\mathfrak{gl}_{M|N})[[u^{-1}]] \otimes \mathrm{End}\, \mathbb{C}^{M|N}, \]
where $E_{ij}$ is the standard elementary matrix. The extra sign ensures that the product of matrices can be calculated in the usual manner. We may also think of $T(u)$ as an element in $\mathrm{Mat}_{M+N}\big( Y(\mathfrak{gl}_{M|N})[[u^{-1}]] \big)$, the set of $(M+N) \times (M+N)$ matrices with entries in $Y(\mathfrak{gl}_{M|N})[[u^{-1}]]$.

We may also define the super Yangian $Y(\mathfrak{gl}_{M|N})$ by the RTT relation:
\[ R(u-v)\, T_1(u)\, T_2(v) = T_2(v)\, T_1(u)\, R(u-v), \tag{2.2} \]
where
\[ T_1(u) = T(u) \otimes \mathrm{Id}_{M+N}, \qquad T_2(v) = \mathrm{Id}_{M+N} \otimes T(v), \qquad R(u-v) = 1 - \frac{P_{12}}{u-v}, \]
and
\[ P_{12} = \sum_{i,j=1}^{M+N} (-1)^{|j|} E_{ij} \otimes E_{ji} \]
is the permutation matrix.

The equality is in $\mathrm{Mat}_{M+N} \otimes \mathrm{Mat}_{M+N} \otimes Y(\mathfrak{gl}_{M|N})((u^{-1}, v^{-1}))$, which means the localization of $\mathrm{Mat}_{M+N} \otimes \mathrm{Mat}_{M+N} \otimes Y(\mathfrak{gl}_{M|N})[[u^{-1}, v^{-1}]]$ at the multiplicative set consisting of the non-zero elements of $\mathbb{C}[[u^{-1}, v^{-1}]]$.
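The sign in (2.1) depends only on the parities of the indices $i$, $j$, $h$. As a minimal sketch of this bookkeeping (the function names `parity`, `rtt_sign` and `is_odd_generator` are hypothetical helpers introduced here for illustration, not notation from the paper), one might write:

```python
def parity(i, M):
    """Parity |i| of a 1-based index i: 0 if i <= M, 1 if i >= M + 1."""
    return 0 if i <= M else 1


def rtt_sign(i, j, h, M):
    """Sign (-1)^(|i||j| + |i||h| + |j||h|) appearing in relation (2.1)."""
    pi, pj, ph = parity(i, M), parity(j, M), parity(h, M)
    return (-1) ** (pi * pj + pi * ph + pj * ph)


def is_odd_generator(i, j, M):
    """t_ij^(r) with r > 0 is odd exactly when |i| + |j| = 1 in Z_2."""
    return (parity(i, M) + parity(j, M)) % 2 == 1
```

For instance, with $M = 2$ the sign is $-1$ as soon as at least two of the three indices are odd, which is the only place where (2.1) differs from the non-super relation.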


Y.-N. Peng

Remark 2.2. Note that we have $(u-v)^{-1}$ in the matrix $R(u-v)$. Hence we have to replace $Y(\mathfrak{gl}_{M|N})[[u^{-1}, v^{-1}]]$ by a certain extension containing $(u-v)^{-1}$.

Equating the coefficients of $E_{ij} \otimes E_{hk}$ on both sides of (2.2), we have the following equivalent defining relations in terms of the generating series:
\[ [t_{ij}(u), t_{hk}(v)] = \frac{(-1)^{|i||j| + |i||h| + |j||h|}}{u-v} \Big( t_{hj}(u)\, t_{ik}(v) - t_{hj}(v)\, t_{ik}(u) \Big). \tag{2.3} \]
Note that the matrix $T(u)$ is invertible, hence one may define the entries of its inverse by
\[ (T(u))^{-1} := \big( t'_{ij}(u) \big)_{i,j=1}^{M+N}. \]
Multiplying both sides of (2.2) by $T_2(v)^{-1}$ from the left and from the right, and arguing as in the derivation of (2.3), we have yet another relation:
\[ [t_{ij}(u), t'_{hk}(v)] = \frac{(-1)^{|i||j| + |i||h| + |j||h|}}{u-v} \Big( \delta_{h,j} \sum_{l=1}^{M+N} t_{il}(u)\, t'_{lk}(v) - \delta_{i,k} \sum_{l=1}^{M+N} t'_{hl}(v)\, t_{lj}(u) \Big). \tag{2.4} \]
As an easy consequence of (2.4), we know that for all $r$ and $s$, if $i \ne k$ and $j \ne h$, then $t_{ij}^{(r)}$ and $t'^{(s)}_{hk}$ supercommute.

The following is the PBW basis theorem for $Y(\mathfrak{gl}_{M|N})$.

Proposition 2.1 [Go, Theorem 1]. The set of all monomials in the elements
\[ \{\, t_{ij}^{(r)} \mid 1 \le i, j \le M+N,\ r \ge 1 \,\} \]
taken in some fixed order (containing no second or higher order powers of the odd generators) forms a basis for $Y(\mathfrak{gl}_{M|N})$.

We have the loop filtration on $Y(\mathfrak{gl}_{M|N})$,
\[ L_0 Y(\mathfrak{gl}_{M|N}) \subseteq L_1 Y(\mathfrak{gl}_{M|N}) \subseteq L_2 Y(\mathfrak{gl}_{M|N}) \subseteq \cdots, \]
defined by setting $\deg t_{ij}^{(r)} = r - 1$ for each $r \ge 1$, where $L_k Y(\mathfrak{gl}_{M|N})$ is the span of all monomials of the form $t_{i_1 j_1}^{(r_1)} t_{i_2 j_2}^{(r_2)} \cdots t_{i_s j_s}^{(r_s)}$ with total degree $\sum_{i=1}^{s} (r_i - 1) \le k$. We denote the associated graded algebra by $\mathrm{gr}^L Y(\mathfrak{gl}_{M|N})$. Let $\mathfrak{gl}_{M|N}[t]$ denote the loop superalgebra $\mathfrak{gl}_{M|N} \otimes \mathbb{C}[t]$ with the standard basis $\{ E_{ij} t^r \mid 1 \le i, j \le M+N,\ r \ge 0 \}$ and let $U(\mathfrak{gl}_{M|N}[t])$ denote its universal enveloping algebra. By the PBW theorem for $Y(\mathfrak{gl}_{M|N})$, we have the following corollary.

Corollary 2.2 [Go, Corollary 1]. The graded algebra $\mathrm{gr}^L Y(\mathfrak{gl}_{M|N})$ is isomorphic to the universal enveloping algebra $U(\mathfrak{gl}_{M|N}[t])$ by the map
\[ \mathrm{gr}^L Y(\mathfrak{gl}_{M|N}) \to U(\mathfrak{gl}_{M|N}[t]), \qquad \mathrm{gr}^L_{r-1}\, t_{ij}^{(r)} \mapsto (-1)^{|i|} E_{ij} t^{r-1}. \]
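The degree function behind the loop filtration and the restriction on odd generators in Proposition 2.1 are easy to state concretely. The following is a small sketch under the assumption that a monomial is encoded as a list of triples `(i, j, r)` standing for $t_{ij}^{(r)}$; this encoding and the function names are illustrative choices made here, not part of the paper.

```python
def parity(i, M):
    """Parity |i| of a 1-based index i: 0 if i <= M, 1 otherwise."""
    return 0 if i <= M else 1


def loop_degree(monomial):
    """Total loop degree sum(r - 1) of a monomial given as triples (i, j, r)."""
    return sum(r - 1 for (_, _, r) in monomial)


def has_repeated_odd_generator(monomial, M):
    """True if some odd generator t_ij^(r) occurs more than once in the monomial."""
    seen = set()
    for (i, j, r) in monomial:
        odd = (parity(i, M) + parity(j, M)) % 2 == 1
        if odd and (i, j, r) in seen:
            return True
        seen.add((i, j, r))
    return False
```

A monomial lies in $L_k Y(\mathfrak{gl}_{M|N})$ when `loop_degree` is at most $k$, and it can belong to the PBW basis only if `has_repeated_odd_generator` is false.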


3. Gauss Decomposition and Quasideterminants

Let $\lambda$ be a composition of $M$ and $\nu$ be a composition of $N$. In the remaining part of this article, for notational reasons, we set
\[ \mu_i = \lambda_i \quad \text{and} \quad \mu_{m+j} = \nu_j \qquad \text{for all } 1 \le i \le m,\ 1 \le j \le n, \]
and $\mu = (\mu_1, \mu_2, \ldots, \mu_m \mid \mu_{m+1}, \mu_{m+2}, \ldots, \mu_{m+n})$ denotes the composition of $(M|N)$. By definition, the leading minors of the matrix $T(u)$ are invertible. Then it possesses a Gauss decomposition (cf. [GR])
\[ T(u) = F(u) D(u) E(u) \]
for unique block matrices $D(u)$, $E(u)$ and $F(u)$ of the form
\[ D(u) = \begin{pmatrix} D_1(u) & 0 & \cdots & 0 \\ 0 & D_2(u) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & D_{m+n}(u) \end{pmatrix}, \]
\[ E(u) = \begin{pmatrix} I_{\mu_1} & E_{1,2}(u) & \cdots & E_{1,m+n}(u) \\ 0 & I_{\mu_2} & \cdots & E_{2,m+n}(u) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & I_{\mu_{m+n}} \end{pmatrix}, \]
\[ F(u) = \begin{pmatrix} I_{\mu_1} & 0 & \cdots & 0 \\ F_{2,1}(u) & I_{\mu_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ F_{m+n,1}(u) & F_{m+n,2}(u) & \cdots & I_{\mu_{m+n}} \end{pmatrix}, \]
where
\[ D_a(u) = \big( D_{a;i,j}(u) \big)_{1 \le i, j \le \mu_a}, \tag{3.1} \]
\[ E_{a,b}(u) = \big( E_{a,b;i,j}(u) \big)_{1 \le i \le \mu_a,\ 1 \le j \le \mu_b}, \tag{3.2} \]
\[ F_{b,a}(u) = \big( F_{b,a;i,j}(u) \big)_{1 \le i \le \mu_b,\ 1 \le j \le \mu_a}, \tag{3.3} \]
are $\mu_a \times \mu_a$, $\mu_a \times \mu_b$ and $\mu_b \times \mu_a$ matrices, respectively, for all $1 \le a \le m+n$ in (3.1) and all $1 \le a < b \le m+n$ in (3.2) and (3.3).

Definition 3.1. We call the indices $a, b$ the block positions, and the indices $i, j$ the entry positions.

Also define the $\mu_a \times \mu_a$ matrix $D'_a(u) = \big( D'_{a;i,j}(u) \big)_{1 \le i, j \le \mu_a}$ by
\[ D'_a(u) := (D_a(u))^{-1}. \]

by


The entries of these matrices are expanded into power series
\[ D_{a;i,j}(u) = \sum_{r \ge 0} D_{a;i,j}^{(r)} u^{-r}, \qquad D'_{a;i,j}(u) = \sum_{r \ge 0} D'^{(r)}_{a;i,j} u^{-r}, \]
\[ E_{a,b;i,j}(u) = \sum_{r \ge 1} E_{a,b;i,j}^{(r)} u^{-r}, \qquad F_{b,a;i,j}(u) = \sum_{r \ge 1} F_{b,a;i,j}^{(r)} u^{-r}. \]
Moreover, for $1 \le a \le m+n-1$, we set
\[ E_{a;i,j}(u) := E_{a,a+1;i,j}(u) = \sum_{r \ge 1} E_{a;i,j}^{(r)} u^{-r}, \qquad F_{a;i,j}(u) := F_{a+1,a;i,j}(u) = \sum_{r \ge 1} F_{a;i,j}^{(r)} u^{-r}. \]

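There are explicit descriptions of all these series in terms of quasideterminants (cf. [GKLLRT, GR]). To write them down, we introduce the following notation. Suppose that $A$, $B$, $C$ and $D$ are $a \times a$, $a \times b$, $b \times a$ and $b \times b$ matrices respectively with entries in some ring. Assuming that the matrix $A$ is invertible, we define
\[ \left| \begin{array}{cc} A & B \\ C & \boxed{D} \end{array} \right| := D - C A^{-1} B. \]
We write the matrix $T(u)$ in block form as
\[ T(u) = \begin{pmatrix} {}^{\mu}T_{1,1}(u) & \cdots & {}^{\mu}T_{1,m+n}(u) \\ \vdots & \ddots & \vdots \\ {}^{\mu}T_{m+n,1}(u) & \cdots & {}^{\mu}T_{m+n,m+n}(u) \end{pmatrix}, \]
where each ${}^{\mu}T_{a,b}(u)$ is a $\mu_a \times \mu_b$ matrix.

Proposition 3.1 [GR]. We have
\[ D_a(u) = \left| \begin{array}{cccc} {}^{\mu}T_{1,1}(u) & \cdots & {}^{\mu}T_{1,a-1}(u) & {}^{\mu}T_{1,a}(u) \\ \vdots & \ddots & \vdots & \vdots \\ {}^{\mu}T_{a-1,1}(u) & \cdots & {}^{\mu}T_{a-1,a-1}(u) & {}^{\mu}T_{a-1,a}(u) \\ {}^{\mu}T_{a,1}(u) & \cdots & {}^{\mu}T_{a,a-1}(u) & \boxed{{}^{\mu}T_{a,a}(u)} \end{array} \right|, \tag{3.4} \]
\[ E_{a,b}(u) = D'_a(u) \left| \begin{array}{cccc} {}^{\mu}T_{1,1}(u) & \cdots & {}^{\mu}T_{1,a-1}(u) & {}^{\mu}T_{1,b}(u) \\ \vdots & \ddots & \vdots & \vdots \\ {}^{\mu}T_{a-1,1}(u) & \cdots & {}^{\mu}T_{a-1,a-1}(u) & {}^{\mu}T_{a-1,b}(u) \\ {}^{\mu}T_{a,1}(u) & \cdots & {}^{\mu}T_{a,a-1}(u) & \boxed{{}^{\mu}T_{a,b}(u)} \end{array} \right|, \tag{3.5} \]
\[ F_{b,a}(u) = \left| \begin{array}{cccc} {}^{\mu}T_{1,1}(u) & \cdots & {}^{\mu}T_{1,a-1}(u) & {}^{\mu}T_{1,a}(u) \\ \vdots & \ddots & \vdots & \vdots \\ {}^{\mu}T_{a-1,1}(u) & \cdots & {}^{\mu}T_{a-1,a-1}(u) & {}^{\mu}T_{a-1,a}(u) \\ {}^{\mu}T_{b,1}(u) & \cdots & {}^{\mu}T_{b,a-1}(u) & \boxed{{}^{\mu}T_{b,a}(u)} \end{array} \right| D'_a(u), \tag{3.6} \]
for all $1 \le a \le m+n$ in (3.4) and $1 \le a < b \le m+n$ in (3.5), (3.6).

We denote the $(i,j)$th entry of the $\mu_a \times \mu_b$ matrix ${}^{\mu}T_{a,b}(u)$ by $T_{a,b;i,j}(u)$ and denote the coefficient of $u^{-r}$ in $T_{a,b;i,j}(u)$ by $T_{a,b;i,j}^{(r)}$. By Proposition 3.1, we immediately have
\[ E_{b-1;i,j}^{(1)} = T_{b-1,b;i,j}^{(1)}, \qquad F_{b-1;i,j}^{(1)} = T_{b,b-1;i,j}^{(1)}, \qquad \text{for all admissible } b, i, j, \tag{3.7} \]
and
\[ D_{1;i,j}^{(r)} = T_{1,1;i,j}^{(r)} = t_{i,j}^{(r)}, \qquad \text{for all } 1 \le i, j \le \mu_1,\ r \ge 0. \tag{3.8} \]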
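To see the shape of the Gauss decomposition in the simplest possible setting, the following sketch works in a commutative toy model: all $\mu_a = 1$ and the entries are rational numbers rather than power series over $Y(\mathfrak{gl}_{M|N})$. In that situation each quasideterminant is an ordinary Schur complement. The function names `gauss_decomposition` and `matmul` are hypothetical and chosen here for illustration only; nothing in this block reflects the non-commutative or super structure of the paper.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def gauss_decomposition(T):
    """Return (F, D, E) with T = F D E, where F is lower unitriangular,
    D is diagonal and E is upper unitriangular (all leading minors of T
    are assumed to be non-zero)."""
    n = len(T)
    A = [[Fraction(x) for x in row] for row in T]
    F = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    E = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    D = [[Fraction(0)] * n for _ in range(n)]
    for a in range(n):
        d = A[a][a]  # Schur complement entry, i.e. the a-th diagonal of D
        if d == 0:
            raise ZeroDivisionError("a leading minor vanishes")
        D[a][a] = d
        for b in range(a + 1, n):
            E[a][b] = A[a][b] / d  # commutative analogue of (3.5)
            F[b][a] = A[b][a] / d  # commutative analogue of (3.6)
        for r in range(a + 1, n):
            for c in range(a + 1, n):
                A[r][c] -= A[r][a] * A[a][c] / d
    return F, D, E
```

Calling `gauss_decomposition` on a small integer matrix and multiplying the three factors back together with `matmul` recovers the input, which mirrors the identity $T(u) = F(u)D(u)E(u)$ in this simplified setting.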

By induction, one may show that for each pair $a, b$ such that $1 < a + 1 < b \le m+n-1$ and $1 \le i \le \mu_a$, $1 \le j \le \mu_b$, we have
\[ E_{a,b;i,j}^{(r)} = (-1)^{|b-1|} [E_{a,b-1;i,k}^{(r)}, E_{b-1;k,j}^{(1)}], \qquad F_{b,a;i,j}^{(r)} = (-1)^{|b-1|} [F_{b-1;i,k}^{(1)}, F_{b-1,a;k,j}^{(r)}], \tag{3.9} \]
for any $1 \le k \le \mu_{b-1}$. Here, $|a| := 0$ if $1 \le a \le m$ and $|a| := 1$ if $m+1 \le a \le m+n$.

By multiplying out the matrix product $T(u) = F(u)D(u)E(u)$, we see that each $t_{ij}^{(r)}$ can be expressed as a sum of monomials in $D_{a;i,j}^{(r)}$, $E_{a,b;i,j}^{(r)}$ and $F_{b,a;i,j}^{(r)}$, ordered so that all $F$'s appear before $D$'s and all $D$'s before $E$'s. By (3.9), it is enough to use $D_{a;i,j}^{(r)}$, $E_{a;i,j}^{(r)}$ and $F_{a;i,j}^{(r)}$ only, rather than all $E$'s and $F$'s. We have proved the following theorem.

Theorem 1. The super Yangian $Y(\mathfrak{gl}_{M|N})$ is generated as an algebra by the following elements:
\[ \{\, D_{a;i,j}^{(r)},\ D'^{(r)}_{a;i,j} \mid 1 \le a \le m+n,\ 1 \le i, j \le \mu_a,\ r \ge 0 \,\}, \]
\[ \{\, E_{a;i,j}^{(r)} \mid 1 \le a < m+n,\ 1 \le i \le \mu_a,\ 1 \le j \le \mu_{a+1},\ r \ge 1 \,\}, \]
\[ \{\, F_{a;i,j}^{(r)} \mid 1 \le a < m+n,\ 1 \le i \le \mu_{a+1},\ 1 \le j \le \mu_a,\ r \ge 1 \,\}. \]

4. Maps Between Super Yangians

Our ultimate goal in this article is to find the defining relations among the generating elements $D_{a;i,j}^{(r)}$, $D'^{(r)}_{a;i,j}$, $E_{a;i,j}^{(r)}$, $F_{a;i,j}^{(r)}$ in $Y(\mathfrak{gl}_{M|N})$. The strategy is to work out the special cases when $m$ and $n$ are either 1 or 2, which are relatively less complicated, and then to apply the maps in this section to obtain the relations in the general case.

Proposition 4.1. (1) The map $\rho_{M|N} : Y(\mathfrak{gl}_{M|N}) \to Y(\mathfrak{gl}_{N|M})$ defined by
\[ \rho_{M|N}\big( t_{ij}(u) \big) = t_{M+N+1-i,\, M+N+1-j}(-u) \]
is an algebra isomorphism.


(2) The map $\omega_{M|N} : Y(\mathfrak{gl}_{M|N}) \to Y(\mathfrak{gl}_{M|N})$ defined by
\[ \omega_{M|N}\big( T(u) \big) = \big( T(-u) \big)^{-1} \]
is an algebra automorphism.

(3) For any $k \in \mathbb{Z}_{\ge 0}$, the map $\psi_k : Y(\mathfrak{gl}_{M|N}) \to Y(\mathfrak{gl}_{k+M|N})$ defined by
\[ \psi_k = \omega_{k+M|N} \circ \varphi_{M|N} \circ \omega_{M|N}, \]
where $\varphi_{M|N} : Y(\mathfrak{gl}_{M|N}) \to Y(\mathfrak{gl}_{k+M|N})$ is the inclusion which sends each $t_{ij}^{(r)}$ in $Y(\mathfrak{gl}_{M|N})$ to $t_{k+i,k+j}^{(r)}$ in $Y(\mathfrak{gl}_{k+M|N})$, is an injective algebra homomorphism.

(4) The map $\zeta_{M|N} : Y(\mathfrak{gl}_{M|N}) \to Y(\mathfrak{gl}_{N|M})$ defined by
\[ \zeta_{M|N} = \rho_{M|N} \circ \omega_{M|N} \]
is an algebra isomorphism.

Proof. Follows by checking that these maps preserve the RTT relation (2.2).

Remark 4.1. The composition $Y(\mathfrak{gl}_N) \cong Y(\mathfrak{gl}_{N|0}) \xrightarrow{\ \zeta_{N|0}\ } Y(\mathfrak{gl}_{0|N})$ is an algebra isomorphism.

We call $\psi_k$ the shift map and $\zeta_{M|N}$ the swap map. It is clear that $\psi_0$ is the identity map and $\zeta_{M|N}$ has order 2. Since they are important for us, we write down their images explicitly.

Lemma 4.2. Let $1 \le i, j \le M+N$.
(1) For any $k \in \mathbb{N}$, we have
\[ \psi_k\big( t_{ij}(u) \big) = \left| \begin{array}{cccc} t_{11}(u) & \cdots & t_{1k}(u) & t_{1,k+j}(u) \\ \vdots & \ddots & \vdots & \vdots \\ t_{k1}(u) & \cdots & t_{kk}(u) & t_{k,k+j}(u) \\ t_{k+i,1}(u) & \cdots & t_{k+i,k}(u) & \boxed{t_{k+i,k+j}(u)} \end{array} \right|. \tag{4.1} \]
(2) We have
\[ \zeta_{M|N}\big( t_{ij}(u) \big) = t'_{M+N+1-i,\, M+N+1-j}(u). \tag{4.2} \]

First note that the description of $\psi_k(t_{ij})$ in (4.1) is independent of $M$ and $N$, hence our notation is unambiguous. Also, (4.1) along with the quasideterminants in Sect. 3 implies that
\[ D_{a;i,j}(u) = \psi_{\mu_1 + \mu_2 + \cdots + \mu_{a-1}}\big( D_{1;i,j}(u) \big), \tag{4.3} \]
\[ E_{a;i,j}(u) = \psi_{\mu_1 + \mu_2 + \cdots + \mu_{a-1}}\big( E_{1;i,j}(u) \big), \tag{4.4} \]
\[ F_{a;i,j}(u) = \psi_{\mu_1 + \mu_2 + \cdots + \mu_{a-1}}\big( F_{1;i,j}(u) \big). \tag{4.5} \]
Secondly, observe that $\psi_k$ maps $t'_{ij}(u) \in Y(\mathfrak{gl}_{M|N})$ to $t'_{k+i,k+j}(u) \in Y(\mathfrak{gl}_{k+M|N})$. So $\psi_k\big( Y(\mathfrak{gl}_{M|N}) \big)$ is generated by the set $\{ t'^{(r)}_{k+i,k+j} \mid 1 \le i, j \le M+N,\ r \ge 0 \}$, as a subalgebra of $Y(\mathfrak{gl}_{k+M|N})$. If we pick any element $t_{ij}^{(r)}$ in the northwestern $k \times k$ corner of $T(u)$ (viewed as a $(k+M+N) \times (k+M+N)$ matrix with entries in $Y(\mathfrak{gl}_{k+M|N})[[u^{-1}]]$), the indices will never overlap with those of $\psi_k\big( Y(\mathfrak{gl}_{M|N}) \big)$, which lie in the southeastern $(M+N) \times (M+N)$ corner. By Eq. (2.4), they supercommute. Obviously, the elements in the northwestern $k \times k$ corner in $Y(\mathfrak{gl}_{k+M|N})$ generate a subalgebra isomorphic to $Y(\mathfrak{gl}_k)$ by the defining relations (2.1). We have proved the following lemma.

Lemma 4.3. The subalgebras $Y(\mathfrak{gl}_k)$ and $\psi_k\big( Y(\mathfrak{gl}_{M|N}) \big)$ in $Y(\mathfrak{gl}_{k+M|N})$ supercommute with each other.

Now we study the map $\zeta_{M|N}$. Associated with the composition $\mu$, we may define the elements $\{ D_{a;i,j}^{(r)}; D'^{(r)}_{a;i,j} \}$, $\{ E_{a;i,j}^{(r)} \}$, $\{ F_{a;i,j}^{(r)} \}$ in $Y(\mathfrak{gl}_{M|N})$ by Gauss decomposition. Consider
\[ \mu^r := (\mu_{m+n}, \ldots, \mu_{m+1} \mid \mu_m, \ldots, \mu_2, \mu_1), \]
the reverse of $\mu$, which is a composition of $(N|M)$. With $\mu^r$, we may similarly define the elements $\{ D_{a;i,j}^{(r)}; D'^{(r)}_{a;i,j} \}$, $\{ E_{a;i,j}^{(r)} \}$, $\{ F_{a;i,j}^{(r)} \}$ in $Y(\mathfrak{gl}_{N|M})$, by abuse of notation. Their relations are given in the following proposition, which is a generalization of [Go, Prop. 1].

Proposition 4.4. For all admissible $a, i, j$, we have
\[ \zeta_{M|N}\big( D_{a;i,j}(u) \big) = D'_{m+n+1-a;\, \mu_a+1-i,\, \mu_a+1-j}(u), \tag{4.6} \]
\[ \zeta_{M|N}\big( E_{a;i,j}(u) \big) = -F_{m+n-a;\, \mu_a+1-i,\, \mu_{a+1}+1-j}(u), \tag{4.7} \]
\[ \zeta_{M|N}\big( F_{a;i,j}(u) \big) = -E_{m+n-a;\, \mu_{a+1}+1-i,\, \mu_a+1-j}(u). \tag{4.8} \]
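The index pattern in (4.6) can be checked at the level of matrix positions: flipping a global index $g \mapsto M+N+1-g$, as in (4.2), sends entry $i$ of block $a$ of $\mu$ to entry $\mu_a + 1 - i$ of block $m+n+1-a$ of the reversed composition. The sketch below verifies only this bookkeeping; it assumes the composition is passed as a flat list of its parts (ignoring the bar between even and odd parts), and the function names are hypothetical.

```python
def global_index(mu, a, i):
    """1-based global row/column index of entry i inside block a of mu."""
    return sum(mu[:a - 1]) + i


def flipped_position(mu, a, i):
    """Block and entry position, with respect to the reversed composition,
    of the flipped global index (M + N + 1) - g."""
    g = sum(mu) + 1 - global_index(mu, a, i)
    block = 1
    for size in reversed(mu):
        if g <= size:
            return block, g
        g -= size
        block += 1
    raise ValueError("index out of range")
```

For example, with parts `[2, 3, 1, 2]` the first entry of block 1 is sent to the second entry of block 4, in agreement with the pattern $(a, i) \mapsto (m+n+1-a,\ \mu_a+1-i)$.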

Note that the $D$'s, $E$'s and $F$'s on the left hand side are in $Y(\mathfrak{gl}_{M|N})[[u^{-1}]]$, while those on the right hand side are in $Y(\mathfrak{gl}_{N|M})[[u^{-1}]]$.

Proof. The proof is essentially the same as [Go, Prop. 1], except that we decompose the matrix $T(u)$ into blocks and the entry positions are flipped around by $\zeta$. For a given composition $\mu$, multiply out the matrix products
\[ T(u) = F(u) D(u) E(u) \quad \text{and} \quad T(u)^{-1} = E(u)^{-1} D'(u) F(u)^{-1}. \]
Then the following matrix identities hold:
\[ T_{a,a}(u) = D_a(u) + \sum_{c<a} F_{a,c}(u) D_c(u) E_{c,a}(u), \tag{4.9} \]
\[ T'_{a,a}(u) = D'_a(u) + \sum_{c>a} \widetilde{E}_{a,c}(u) D'_c(u) \widetilde{F}_{c,a}(u), \tag{4.10} \]
\[ T_{a,b}(u) = D_a(u) E_{a,b}(u) + \sum_{c<a} F_{a,c}(u) D_c(u) E_{c,b}(u), \tag{4.11} \]
\[ T_{b,a}(u) = F_{b,a}(u) D_a(u) + \sum_{c<a} F_{b,c}(u) D_c(u) E_{c,a}(u), \tag{4.12} \]
\[ T'_{a,b}(u) = \widetilde{E}_{a,b}(u) D'_b(u) + \sum_{c>b} \widetilde{E}_{a,c}(u) D'_c(u) \widetilde{F}_{c,b}(u), \tag{4.13} \]
\[ T'_{b,a}(u) = D'_b(u) \widetilde{F}_{b,a}(u) + \sum_{c>b} \widetilde{E}_{b,c}(u) D'_c(u) \widetilde{F}_{c,a}(u), \tag{4.14} \]
for all $1 \le a \le m+n$ in (4.9), (4.10) and $1 \le a < b \le m+n$ in (4.11)–(4.14). Here $T'_{a,b}(u)$ denotes the $\mu_a \times \mu_b$ matrix in the $(a,b)$th block position of $T(u)^{-1}$, $T'_{a,b;i,j}(u)$ denotes the $(i,j)$th entry of $T'_{a,b}(u)$, $T'^{(r)}_{a,b;i,j}$ denotes the coefficient of $u^{-r}$ in $T'_{a,b;i,j}(u)$, and
\[ \widetilde{E}_{a,b}(u) := \sum_{a = i_0 < i_1 < \cdots < i_s = b} (-1)^s E_{a,i_1}(u) E_{i_1,i_2}(u) \cdots E_{i_{s-1},b}(u), \]
\[ \widetilde{F}_{b,a}(u) := \sum_{a = i_0 < i_1 < \cdots < i_s = b} (-1)^s F_{b,i_{s-1}}(u) F_{i_{s-1},i_{s-2}}(u) \cdots F_{i_1,a}(u). \]
In fact, (4.7) and (4.8) are the special cases when $b = a+1$ of the following more general relations:
\[ \zeta_{M|N}\big( E_{a,b;i,j}(u) \big) = \widetilde{F}_{m+n+1-a,\, m+n+1-b;\, \mu_a+1-i,\, \mu_b+1-j}(u), \tag{4.15} \]
\[ \zeta_{M|N}\big( F_{b,a;i,j}(u) \big) = \widetilde{E}_{m+n+1-b,\, m+n+1-a;\, \mu_b+1-i,\, \mu_a+1-j}(u). \tag{4.16} \]
One can easily derive (4.6), (4.15) and (4.16) simultaneously by induction on $a$.

Now we describe the relations among the $D$'s. We first claim that
\[ [D_{a;i,j}(u), D_{b;h,k}(v)] = 0, \quad \text{unless } a = b. \]
Assume $a < b$. For $1 \le a \le m$, there exists a suitable number $1 \le k \le M$ such that $D_{a;i,j}(u)$ is contained in the northwestern $k \times k$ corner of $Y(\mathfrak{gl}_{M|N})[[u^{-1}]]$, i.e., $D_{a;i,j}(u) \in Y(\mathfrak{gl}_k)[[u^{-1}]] \subset Y(\mathfrak{gl}_{M|N})[[u^{-1}]]$, and $D_{b;h,k}(v) \in \psi_k\big( Y(\mathfrak{gl}_{M-k|N}) \big)[[v^{-1}]] \subset Y(\mathfrak{gl}_{M|N})[[v^{-1}]]$. Hence they supercommute by Lemma 4.3. For $m+1 \le a \le m+n$, we may first apply the swap map $\zeta_{M|N}$; the situation is then transformed to the above case in the super Yangian $Y(\mathfrak{gl}_{N|M})$ and our claim follows.

We next compute the bracket explicitly when $a = b$. For $1 \le a \le m$, by (4.3) and (3.8), we have
\[ [D_{a;i,j}(u), D_{a;h,k}(v)] = \psi_{\mu_1 + \mu_2 + \cdots + \mu_{a-1}}\big( [D_{1;i,j}(u), D_{1;h,k}(v)] \big) = \psi_{\mu_1 + \mu_2 + \cdots + \mu_{a-1}}\big( [t_{ij}(u), t_{hk}(v)] \big). \]
For $m+1 \le a \le m+n$, we set $\tilde{a} := m+n+1-a$. Then we have $1 \le \tilde{a} \le n$ and hence
\[ D_{a;i,j}(u) = \zeta_{N|M}\big( D'_{\tilde{a};\, \mu_{\tilde{a}}+1-i,\, \mu_{\tilde{a}}+1-j}(u) \big) = \zeta_{N|M} \circ \psi_{\mu_1 + \mu_2 + \cdots + \mu_{\tilde{a}-1}}\big( D'_{1;\, \mu_{\tilde{a}}+1-i,\, \mu_{\tilde{a}}+1-j}(u) \big) = \zeta_{N|M} \circ \psi_{\mu_1 + \mu_2 + \cdots + \mu_{\tilde{a}-1}}\big( t'_{\mu_{\tilde{a}}+1-i,\, \mu_{\tilde{a}}+1-j}(u) \big). \]
Therefore, for $m+1 \le a \le m+n$, we have
\[ [D_{a;i,j}(u), D_{a;h,k}(v)] = \zeta_{N|M} \circ \psi_{\mu_1 + \mu_2 + \cdots + \mu_{\tilde{a}-1}}\big( [t'_{\mu_{\tilde{a}}+1-i,\, \mu_{\tilde{a}}+1-j}(u),\ t'_{\mu_{\tilde{a}}+1-h,\, \mu_{\tilde{a}}+1-k}(v)] \big). \]


Referring to the definition (2.3), for any $1 \le a \le m+n$, we have
\[ [D_{a;i,j}(u), D_{a;h,k}(v)] = \frac{1}{u-v} \Big( D_{a;h,j}(u) D_{a;i,k}(v) - D_{a;h,j}(v) D_{a;i,k}(u) \Big). \]
Collecting the coefficients of $u^{-r} v^{-s}$, we have proved the following proposition, which is parallel to the results in [BK1, Sect. 4].

Proposition 4.5. The relations among the elements $\{ D_{a;i,j}^{(r)}, D'^{(r)}_{a;i,j} \}$ for all $r \ge 0$, $1 \le i, j \le \mu_a$, $1 \le a \le m+n$ are given by
\[ D_{a;i,j}^{(0)} = \delta_{ij}, \]
\[ \sum_{t=0}^{r} D_{a;i,p}^{(t)} D'^{(r-t)}_{a;p,j} = \delta_{r0}\, \delta_{ij}, \]
\[ [D_{a;i,j}^{(r)}, D_{b;h,k}^{(s)}] = \delta_{ab} \sum_{t=0}^{\min(r,s)-1} \Big( D_{a;h,j}^{(t)} D_{a;i,k}^{(r+s-1-t)} - D_{a;h,j}^{(r+s-1-t)} D_{a;i,k}^{(t)} \Big), \]

and these elements generate a subalgebra of $Y(\mathfrak{gl}_{M|N})$.

We call the subalgebra in Proposition 4.5 the standard Levi subalgebra of $Y(\mathfrak{gl}_{M|N})$ associated to $\mu$ and denote it by $Y_\mu^0$. Note that in the special case when all $\mu_i = 1$, the subalgebra $Y_{(1,\ldots,1)}^0$ is commutative.

5. Special Cases: Non-super Case and m = n = 1

The following theorem of Brundan and Kleshchev describes the relations among the generators in the non-super case.

Theorem 2 [BK1, Theorem A]. Let $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m)$ be a composition of $M$. The following identities hold in $Y(\mathfrak{gl}_M)((u^{-1}, v^{-1}))$ for all admissible $a, b, f, g, h, i, j, k$:
\[ (u-v)[D_{a;i,j}(u), E_{b;h,k}(v)] = \delta_{a,b}\, \delta_{h,j}\, D_{a;i,p}(u) \big( E_{a;p,k}(v) - E_{a;p,k}(u) \big) - \delta_{a,b+1}\, D_{a;i,k}(u) \big( E_{b;h,j}(v) - E_{b;h,j}(u) \big), \]
\[ (u-v)[D_{a;i,j}(u), F_{b;h,k}(v)] = -\delta_{a,b}\, \delta_{k,i} \big( F_{b;h,p}(v) - F_{b;h,p}(u) \big) D_{a;p,j}(u) + \delta_{a,b+1} \big( F_{b;i,k}(v) - F_{b;i,k}(u) \big) D_{a;h,j}(u), \]
\[ (u-v)[E_{a;i,j}(u), F_{b;h,k}(v)] = \delta_{a,b} \big( D'_{a;i,k}(u) D_{a+1;h,j}(u) - D_{a+1;h,j}(v) D'_{a;i,k}(v) \big), \]
\[ (u-v)[E_{a;i,j}(u), E_{a;h,k}(v)] = \big( E_{a;i,k}(u) - E_{a;i,k}(v) \big) \big( E_{a;h,j}(u) - E_{a;h,j}(v) \big), \]
\[ (u-v)[F_{a;i,j}(u), F_{a;h,k}(v)] = \big( F_{a;i,k}(u) - F_{a;i,k}(v) \big) \big( F_{a;h,j}(u) - F_{a;h,j}(v) \big), \]
\[ (u-v)[E_{a;i,j}(u), E_{a+1;h,k}(v)] = \delta_{h,j} \big( E_{a;i,q}(u) E_{a+1;q,k}(v) - E_{a;i,q}(v) E_{a+1;q,k}(v) + E_{a,a+2;i,k}(v) - E_{a,a+2;i,k}(u) \big), \]
\[ (u-v)[F_{a;i,j}(u), F_{a+1;h,k}(v)] = \delta_{i,k} \big( -F_{a+1;h,q}(v) F_{a;q,j}(u) + F_{a+1;h,q}(v) F_{a;q,j}(v) - F_{a+2,a;h,j}(v) + F_{a+2,a;h,j}(u) \big), \]
\[ (u-v)[E_{a;i,j}(u), E_{b;h,k}(v)] = 0 \quad \text{if } b > a+1 \text{ or if } b = a+1 \text{ and } h \ne j, \]
\[ (u-v)[F_{a;i,j}(u), F_{b;h,k}(v)] = 0 \quad \text{if } b > a+1 \text{ or if } b = a+1 \text{ and } i \ne k, \]
\[ (u-v)\big[ [E_{a;i,j}(u), E_{b;h,k}(v)], E_{b;f,g}(v) \big] = 0 \quad \text{if } |a-b| \ge 1, \]
\[ (u-v)\big[ E_{a;i,j}(u), [E_{a;h,k}(u), E_{b;f,g}(v)] \big] = 0 \quad \text{if } |a-b| \ge 1, \]
\[ \big[ [E_{a;i,j}(u), E_{b;h,k}(v)], E_{b;f,g}(w) \big] + \big[ [E_{a;i,j}(u), E_{b;h,k}(w)], E_{b;f,g}(v) \big] = 0 \quad \text{if } |a-b| \ge 1, \]
\[ \big[ E_{a;i,j}(u), [E_{a;h,k}(v), E_{b;f,g}(w)] \big] + \big[ E_{a;i,j}(v), [E_{a;h,k}(u), E_{b;f,g}(w)] \big] = 0 \quad \text{if } |a-b| \ge 1, \]
where the index $p$ (resp. $q$) is summed over $1, \ldots, \lambda_a$ (resp. $1, \ldots, \lambda_{a+1}$).

Proof. See [BK1, Sect. 6]. Here, we present the theorem in the series form and we define the indices of the $F$'s in a slightly different manner.

Back to the super case. Consider $m = n = 1$; that is, $\mu = (\mu_1 \mid \mu_2) = (M \mid N)$. Since we have only one block of $E$'s and $F$'s, we may omit the block positions without confusion. That is, we set
\[ E_{i,j}(u) := E_{1;i,j}(u) = E_{1,2;i,j}(u), \quad \text{for all } 1 \le i \le \mu_1 = M,\ 1 \le j \le \mu_2 = N, \]
and
\[ F_{i,j}(u) := F_{1;i,j}(u) = F_{2,1;i,j}(u), \quad \text{for all } 1 \le i \le \mu_2 = N,\ 1 \le j \le \mu_1 = M. \]
The relations among them are given in the following proposition, which is a generalization of [BK1, Lemma 6.3].

Proposition 5.1. The following identities hold in $Y(\mathfrak{gl}_{M|N})((u^{-1}, v^{-1}))$:
\[ (u-v)[D_{a;i,j}(u), E_{h,k}(v)] = \begin{cases} \delta_{hj}\, D_{1;i,p}(u) \big( E_{p,k}(v) - E_{p,k}(u) \big), & \text{if } a = 1, \\ D_{2;i,k}(u) \big( E_{h,j}(v) - E_{h,j}(u) \big), & \text{if } a = 2, \end{cases} \tag{5.1} \]
\[ (u-v)[D_{a;i,j}(u), F_{h,k}(v)] = \begin{cases} \delta_{ki} \big( F_{h,p}(u) - F_{h,p}(v) \big) D_{1;p,j}(u), & \text{if } a = 1, \\ \big( F_{i,k}(u) - F_{i,k}(v) \big) D_{2;h,j}(u), & \text{if } a = 2, \end{cases} \tag{5.2} \]
\[ (u-v)[E_{i,j}(u), F_{h,k}(v)] = D'_{1;i,k}(v) D_{2;h,j}(v) - D_{2;h,j}(u) D'_{1;i,k}(u), \tag{5.3} \]
\[ (u-v)[E_{i,j}(u), E_{h,k}(v)] = \big( E_{i,k}(u) - E_{i,k}(v) \big) \big( E_{h,j}(v) - E_{h,j}(u) \big), \tag{5.4} \]
\[ (u-v)[F_{i,j}(u), F_{h,k}(v)] = \big( F_{i,k}(u) - F_{i,k}(v) \big) \big( F_{h,j}(v) - F_{h,j}(u) \big), \tag{5.5} \]
for all admissible $i, j, h, k$, and the index $p$ is summed over $1, \ldots, M$.

Proof. As in the proof of Proposition 4.4, we compute the matrix products
\[ T(u) = F(u) D(u) E(u) \quad \text{and} \quad T^{-1}(u) = E^{-1}(u) D'(u) F^{-1}(u) \]
with respect to the composition $\mu = (M \mid N)$ and get the following identities:
\[ t_{i,j}(u) = D_{1;i,j}(u), \quad \text{for all } 1 \le i, j \le M, \tag{5.6} \]
\[ t_{i,M+j}(u) = D_{1;i,p}(u) E_{p,j}(u), \quad \text{for all } 1 \le i \le M,\ 1 \le j \le N, \tag{5.7} \]
\[ t_{M+i,j}(u) = F_{i,p}(u) D_{1;p,j}(u), \quad \text{for all } 1 \le i \le N,\ 1 \le j \le M, \tag{5.8} \]
\[ t_{M+i,M+j}(u) = F_{i,p}(u) D_{1;p,q}(u) E_{q,j}(u) + D_{2;i,j}(u), \quad \text{for all } 1 \le i, j \le N, \tag{5.9} \]
\[ t'_{i,j}(u) = D'_{1;i,j}(u) + E_{i,p'}(u) D'_{2;p',q'}(u) F_{q',j}(u), \quad \text{for all } 1 \le i, j \le M, \tag{5.10} \]
\[ t'_{i,M+j}(u) = -E_{i,p'}(u) D'_{2;p',j}(u), \quad \text{for all } 1 \le i \le M,\ 1 \le j \le N, \tag{5.11} \]
\[ t'_{M+i,j}(u) = -D'_{2;i,p'}(u) F_{p',j}(u), \quad \text{for all } 1 \le i \le N,\ 1 \le j \le M, \tag{5.12} \]
\[ t'_{M+i,M+j}(u) = D'_{2;i,j}(u), \quad \text{for all } 1 \le i, j \le N, \tag{5.13} \]

where the indices $p, q$ (resp. $p', q'$) are summed over $1, \ldots, M$ (resp. $1, \ldots, N$). Equations (5.1) and (5.2) can be proved using exactly the same method as in [BK1, Lemma 6.3] and hence we skip the details. To establish (5.3), we need other identities. Computing the brackets in (5.1) in the case $a = 2$ and (5.2) in the case $a = 1$ and changing the indices, we have
\[ (u-v) E_{\alpha,j}(u) D'_{2;h,\beta}(v) - \delta_{hj} \big( E_{\alpha,q}(v) - E_{\alpha,q}(u) \big) D'_{2;q,\beta}(v) = (u-v) D'_{2;h,\beta}(v) E_{\alpha,j}(u), \tag{5.14} \]
\[ -(u-v) F_{\beta,k}(v) D_{1;i,\alpha}(u) + \delta_{ki} \big( F_{\beta,p}(v) - F_{\beta,p}(u) \big) D_{1;p,\alpha}(u) = -(u-v) D_{1;i,\alpha}(u) F_{\beta,k}(v), \tag{5.15} \]
where $\alpha, p$ (resp. $\beta, q$) are summed over $1, \ldots, M$ (resp. $1, \ldots, N$). By (2.4), we have
\[ (u-v)[t_{i,M+j}(u), t'_{M+h,k}(v)] = -\Big( \delta_{hj} \sum_{l=1}^{M+N} t_{il}(u)\, t'_{lk}(v) - \delta_{ki} \sum_{s=1}^{M+N} t'_{M+h,s}(v)\, t_{s,M+j}(u) \Big). \]
Substituting by (5.6)–(5.13) and changing the indices, we may rewrite the above identity as the following:
\[ D_{1;i,\alpha}(u) \Big\{ (u-v) E_{\alpha,j}(u) D'_{2;h,\beta}(v) - \delta_{hj} \big( E_{\alpha,q}(v) - E_{\alpha,q}(u) \big) D'_{2;q,\beta}(v) \Big\} F_{\beta,k}(v) - \delta_{hj} D_{1;i,\alpha}(u) D'_{1;\alpha,k}(v) \]
\[ = D'_{2;h,\beta}(v) \Big\{ -(u-v) F_{\beta,k}(v) D_{1;i,\alpha}(u) + \delta_{ki} \big( F_{\beta,p}(v) - F_{\beta,p}(u) \big) D_{1;p,\alpha}(u) \Big\} E_{\alpha,j}(u) - \delta_{ki} D'_{2;h,\beta}(v) D_{2;\beta,j}(u), \tag{5.16} \]
where $\alpha, p$ (resp. $\beta, q$) are summed over $1, \ldots, M$ (resp. $1, \ldots, N$). Substituting (5.14) and (5.15) into (5.16), we obtain
\[ D_{1;i,\alpha}(u) (u-v) D'_{2;h,\beta}(v) E_{\alpha,j}(u) F_{\beta,k}(v) - \delta_{hj} D_{1;i,\alpha}(u) D'_{1;\alpha,k}(v) = D'_{2;h,\beta}(v) \big( -(u-v) \big) D_{1;i,\alpha}(u) F_{\beta,k}(v) E_{\alpha,j}(u) - \delta_{ki} D'_{2;h,\beta}(v) D_{2;\beta,j}(u). \tag{5.17} \]
Multiplying $D_2(v) D'_1(u)$ from the left on both sides of (5.17), we obtain (5.3).

For (5.4), we start with $[t_{i,M+j}(u), t'_{h,M+k}(v)] = 0$. Note that they are both odd elements. Multiplying by $(u-v)^2$ and computing the bracket after substitution by (5.7) and (5.11), we have
\[ (u-v)^2 D_{1;i,p}(u) E_{p,j}(u) E_{h,q}(v) D'_{2;q,k}(v) + (u-v) E_{h,q}(v) D_{1;i,p}(u)\, (u-v) D'_{2;q,k}(v) E_{p,j}(u) = 0. \tag{5.18} \]

Rewriting (5.1) again, we have the following identities:
\[ (u-v) E_{h,q}(v) D_{1;i,p}(u) = (u-v) D_{1;i,p}(u) E_{h,q}(v) + \delta_{hp} D_{1;i,p'}(u) \big( E_{p',q}(u) - E_{p',q}(v) \big), \]
\[ (u-v) D'_{2;q,k}(v) E_{p,j}(u) = (u-v) E_{p,j}(u) D'_{2;q,k}(v) + \delta_{jq} \big( E_{p,q'}(u) - E_{p,q'}(v) \big) D'_{2;q',k}(v). \]
Substituting these two into the second term in (5.18) and multiplying $D'_1(u)$ from the left and $D_2(v)$ from the right simultaneously, we obtain
\[ (u-v)^2 [E_{i,j}(u), E_{h,k}(v)] = (u-v) E_{h,j}(v) \big( E_{i,k}(v) - E_{i,k}(u) \big) + (u-v) \big( E_{i,k}(v) - E_{i,k}(u) \big) E_{h,j}(u) + \big( E_{i,j}(u) - E_{i,j}(v) \big) \big( E_{h,k}(v) - E_{h,k}(u) \big). \tag{5.19} \]
For a power series $P$ in $Y(\mathfrak{gl}_{M|N})[[u^{-1}, v^{-1}]]$, we write $\{P\}_d$ for the homogeneous component of $P$ of total degree $d$ in the variables $u^{-1}$ and $v^{-1}$. Equation (5.4) follows from the following claim.

Claim. For $d \ge 1$, we have
\[ (u-v) \big\{ [E_{i,j}(u), E_{h,k}(v)] \big\}_{d+1} = \big\{ \big( E_{i,k}(u) - E_{i,k}(v) \big) \big( E_{h,j}(v) - E_{h,j}(u) \big) \big\}_d. \]
We prove the claim by induction on $d$. For $d = 1$, we take $\{\ \}_0$ on (5.19), and it implies
\[ (u-v)^2 \big\{ [E_{i,j}(u), E_{h,k}(v)] \big\}_2 = 0. \]
Note that the right-hand side of (5.19) is zero when $u = v$, hence we may divide both sides by $(u-v)$ and therefore $(u-v) \big\{ [E_{i,j}(u), E_{h,k}(v)] \big\}_2 = 0$, as desired.

Assume the claim is true for some $d \ge 1$. By the hypothesis, we have
\[ (u-v) \big\{ [E_{h,j}(u), E_{i,k}(v)] \big\}_{d+1} = \big\{ \big( E_{h,k}(u) - E_{h,k}(v) \big) \big( E_{i,j}(v) - E_{i,j}(u) \big) \big\}_d \]
\[ \Rightarrow \quad \big\{ [E_{h,j}(u), E_{i,k}(v)] \big\}_{d+1} = \frac{\big\{ \big( E_{h,k}(u) - E_{h,k}(v) \big) \big( E_{i,j}(v) - E_{i,j}(u) \big) \big\}_d}{u-v}. \tag{5.20} \]
Note that the right-hand side is zero when $u = v$. Hence $\big\{ [E_{h,j}(v), E_{i,k}(v)] \big\}_{d+1} = 0$, which implies
\[ E_{h,j}(v) E_{i,k}(v) = -E_{i,k}(v) E_{h,j}(v). \tag{5.21} \]
Take $\{\ \}_d$ on (5.19):
\[ (u-v)^2 \big\{ [E_{i,j}(u), E_{h,k}(v)] \big\}_{d+2} = (u-v) \big\{ E_{h,j}(v) \big( E_{i,k}(v) - E_{i,k}(u) \big) \big\}_{d+1} + (u-v) \big\{ \big( E_{i,k}(v) - E_{i,k}(u) \big) E_{h,j}(u) \big\}_{d+1} + \big\{ \big( E_{i,j}(u) - E_{i,j}(v) \big) \big( E_{h,k}(v) - E_{h,k}(u) \big) \big\}_d. \]
Substituting the last term by (5.20) and simplifying the result, we have
\[ (u-v)^2 \big\{ [E_{i,j}(u), E_{h,k}(v)] \big\}_{d+2} = (u-v) \big\{ E_{h,j}(v) E_{i,k}(v) + E_{i,k}(u) E_{h,j}(v) + \big( E_{i,k}(v) - E_{i,k}(u) \big) E_{h,j}(u) \big\}_{d+1}. \]
Substituting (5.21) into the above identity, we have
\[ (u-v)^2 \big\{ [E_{i,j}(u), E_{h,k}(v)] \big\}_{d+2} = (u-v) \big\{ \big( E_{i,k}(u) - E_{i,k}(v) \big) E_{h,j}(v) - \big( E_{i,k}(u) - E_{i,k}(v) \big) E_{h,j}(u) \big\}_{d+1} = (u-v) \big\{ \big( E_{i,k}(u) - E_{i,k}(v) \big) \big( E_{h,j}(v) - E_{h,j}(u) \big) \big\}_{d+1}. \]
Dividing both sides by $u-v$ establishes the claim.

Equation (5.5) follows from applying the map $\zeta_{N|M}$ to (5.4) in $Y(\mathfrak{gl}_{N|M})[[u^{-1}, v^{-1}]]$ with suitable indices.


6. Special Case: m = 2, n = 1

Recall that $m$ is the number of parts of the composition of $M$ and $n$ is the number of parts of the composition of $N$. In the case when $m = 2$, $n = 1$, we have $\mu = (\mu_1, \mu_2 \mid \mu_3)$, where $\mu_1 + \mu_2 = M$ and $\mu_3 = N$. The relations among $E_{a;i,j}(u)$ and $F_{b;h,k}(u)$ in different blocks are obtained by the following lemma, which is a generalization of [BK1, Lemma 6.4] and [Go, Lemma 3].

Before stating and proving the lemma, we first fix some notation for the remaining part of this article. We denote the super Yangian by $Y_\mu := Y(\mathfrak{gl}_{M|N})$ to emphasize how we decompose the matrix $T(u)$ into block matrices according to the composition $\mu$ of $(M|N)$ and how the $D$'s, $E$'s and $F$'s are defined. Moreover, by abuse of notation, we will consider the $D$'s, $E$'s and $F$'s in different super Yangians at the same time. It should be clear from the context which super Yangian we are dealing with.

Lemma 6.1. The following identities hold in $Y_{(\mu_1, \mu_2 | \mu_3)}((u^{-1}, v^{-1}))$ for all admissible $g, h, i, j, k$:
(a) $[E_{1;i,j}(u), F_{2;h,k}(v)] = 0$,
(b) $[E_{1;i,j}(u), E_{2;h,k}(v)] = \dfrac{\delta_{hj}}{u-v} \Big\{ \big( E_{1;i,q}(u) - E_{1;i,q}(v) \big) E_{2;q,k}(v) + E_{1,3;i,k}(v) - E_{1,3;i,k}(u) \Big\}$,
(c) $[E_{1,3;i,j}(u), E_{2;h,k}(v)] = E_{2;h,j}(v) [E_{1;i,g}(u), E_{2;g,k}(v)]$,
(d) $[E_{1;i,j}(u),\ E_{1,3;h,k}(v) - E_{1;h,q}(v) E_{2;q,k}(v)] = -[E_{1;i,g}(u), E_{2;g,k}(v)] E_{1;h,j}(u)$.
Here, $q$ is summed over $1, \ldots, \mu_2$ and $g$ could be any number in $\{1, 2, \ldots, \mu_2\}$.

Proof. (a) By (2.4), we have $[t_{i,\mu_1+j}(u), t'_{\mu_1+\mu_2+h,\, \mu_1+k}(v)] = 0$. Substituting by (4.9)–(4.14) with respect to the composition $\mu$ and according to the indices, we have
\[ [D_{1;i,p}(u) E_{1;p,j}(u),\ -D'_{3;h,q}(v) F_{2;q,k}(v)] = 0. \]
Computing the bracket, we obtain
\[ D_{1;i,p}(u) E_{1;p,j}(u) D'_{3;h,q}(v) F_{2;q,k}(v) - D'_{3;h,q}(v) F_{2;q,k}(v) D_{1;i,p}(u) E_{1;p,j}(u) = 0, \tag{6.1} \]
where $p$ and $q$ are summed over $1, \ldots, \mu_1$ and $1, \ldots, \mu_3$, respectively. Similarly, by (2.4), we have
\[ [t_{ij}(u), t'_{\mu_1+\mu_2+h,\, \mu_1+k}(v)] = [t_{i,\mu_1+j}(u), t'_{\mu_1+\mu_2+h,\, \mu_1+\mu_2+k}(v)] = 0, \]
which implies that
\[ [D_{1;i,j}(u), F_{2;h,k}(v)] = [E_{1;i,j}(u), D'_{3;h,k}(v)] = 0. \]
Substituting these into (6.1) and noting that $[D_{1;i,j}(u), D'_{3;h,k}(v)] = 0$, we have
\[ D_{1;i,p}(u) D'_{3;h,q}(v) E_{1;p,j}(u) F_{2;q,k}(v) - D_{1;i,p}(u) D'_{3;h,q}(v) F_{2;q,k}(v) E_{1;p,j}(u) = 0. \]
Multiplying $D_3(v) D'_1(u)$ from the left, we obtain (a).


(b) By (2.4), we have
\[ (u-v)[t_{i,\mu_1+j}(u), t'_{\mu_1+h,\, \mu_1+\mu_2+k}(v)] = \delta_{jh} \sum_{s=1}^{M+N} t_{is}(u)\, t'_{s,\, \mu_1+\mu_2+k}(v). \]
Substituting by (4.9)–(4.14) according to the indices in the above identity, we have
\[ (u-v)[D_{1;i,p}(u) E_{1;p,j}(u),\ -E_{2;h,q}(v) D'_{3;q,k}(v)] = \delta_{jh} D_{1;i,p}(u) \Big( E_{1;p,r}(v) E_{2;r,q}(v) - E_{1,3;p,q}(v) - E_{1;p,r}(u) E_{2;r,q}(v) + E_{1,3;p,q}(u) \Big) D'_{3;q,k}(v), \tag{6.2} \]
where the indices $p, q, r$ are summed over $\mu_1, \mu_3, \mu_2$, respectively. Using the facts that
\[ [E_{1;i,j}(v), D'_{3;h,k}(u)] = 0 \quad \text{(explained in the proof of (a))}, \]
\[ [E_{2;i,j}(v), D_{1;h,k}(u)] = 0, \quad \text{obtained from } [t_{ij}(u), t'_{\mu_1+h,\, \mu_1+\mu_2+k}(v)] = 0, \]
we may cancel $D_1(u)$ from the left and $D'_3(v)$ from the right on both sides of (6.2). Dividing both sides by $u-v$, we have proved (b).

(c) By (5.1) in $Y_{(\mu_2|\mu_3)}[[u^{-1}, v^{-1}]]$, we have
\[ (u-v)[E_{1;h,k}(u), D'_{2;i,j}(v)] = \delta_{ki} \big( E_{1;h,p}(v) - E_{1;h,p}(u) \big) D'_{2;p,j}(v). \]
Applying the map $\psi_{\mu_1}$ to this identity and using (4.3)–(4.5), we have the following identity in $Y_{(\mu_1,\mu_2|\mu_3)}[[u^{-1}, v^{-1}]]$:
\[ (u-v)[E_{2;h,k}(u), D'_{3;i,j}(v)] = \delta_{ki} \big( E_{2;h,p}(v) - E_{2;h,p}(u) \big) D'_{3;p,j}(v). \]
Taking the coefficient of $u^0$, we obtain
\[ [E_{2;h,k}^{(1)}, D'_{3;i,j}(v)] = \delta_{ki} E_{2;h,p}(v) D'_{3;p,j}(v). \tag{6.3} \]
Also by (3.9), we have
\[ E_{1,3;i,j}(u) = [E_{1;i,g}(u), E_{2;g,j}^{(1)}], \quad \text{for any } 1 \le g \le \mu_2. \tag{6.4} \]
By (6.3), (6.4) and the fact that $[E_{1;i,g}(u), D'_{3;h,k}(v)] = 0$, we have
\[ [E_{1,3;i,j}(u), D'_{3;h,k}(v)] = \big[ [E_{1;i,g}(u), E_{2;g,j}^{(1)}], D'_{3;h,k}(v) \big] = \big[ E_{1;i,g}(u), [E_{2;g,j}^{(1)}, D'_{3;h,k}(v)] \big] = \big[ E_{1;i,g}(u), \delta_{hj} E_{2;g,p}(v) D'_{3;p,k}(v) \big] = \delta_{hj} [E_{1;i,g}(u), E_{2;g,p}(v)] D'_{3;p,k}(v). \tag{6.5} \]
By (2.4) and (4.9)–(4.14), we have
\[ [t_{i,\mu_1+\mu_2+j}(u), t'_{\mu_1+h,\, \mu_1+\mu_2+k}(v)] = [D_{1;i,p}(u) E_{1,3;p,j}(u),\ -E_{2;h,q}(v) D'_{3;q,k}(v)] = 0, \]
where $p$ and $q$ are summed over $1, 2, \ldots, \mu_1$ and $1, 2, \ldots, \mu_3$, respectively. Multiplying $D'_1(u)$ from the left, we have $[E_{1,3;i,j}(u), E_{2;h,q}(v) D'_{3;q,k}(v)] = 0$, which may be written as
\[ [E_{1,3;i,j}(u), E_{2;h,q}(v)] D'_{3;q,k}(v) - E_{2;h,q}(v) [E_{1,3;i,j}(u), D'_{3;q,k}(v)] = 0. \]
Substituting the last bracket by (6.5), we have
\[ [E_{1,3;i,j}(u), E_{2;h,q}(v)] D'_{3;q,k}(v) - \delta_{qj} E_{2;h,q}(v) [E_{1;i,g}(u), E_{2;g,p}(v)] D'_{3;p,k}(v) = 0 \]
\[ \Rightarrow \quad [E_{1,3;i,j}(u), E_{2;h,q}(v)] D'_{3;q,k}(v) = E_{2;h,j}(v) [E_{1;i,g}(u), E_{2;g,p}(v)] D'_{3;p,k}(v). \]
Multiplying $D_3(v)$ from the right to both sides of the above equality, we obtain (c).

(d) Taking the coefficient of $u^0$ in (b), we have
\[ [E_{1;i,j}^{(1)}, E_{2;h,k}(v)] = \delta_{hj} \big( E_{1,3;i,k}(v) - E_{1;i,q}(v) E_{2;q,k}(v) \big). \]
Taking the coefficient of $v^0$ in (5.1) in the case $a = 1$, we have
\[ [D_{1;i,j}(u), E_{1;h,k}^{(1)}] = \delta_{hj} D_{1;i,p}(u) E_{1;p,k}(u). \]

By the above two equalities and the fact that $[D_{1;i,j}(u), E_{2;g,k}(v)] = 0$, we have
\[ [D_{1;i,j}(u),\ E_{1,3;h,k}(v) - E_{1;h,q}(v) E_{2;q,k}(v)] = \big[ D_{1;i,j}(u), [E_{1;h,g}^{(1)}, E_{2;g,k}(v)] \big] = \big[ [D_{1;i,j}(u), E_{1;h,g}^{(1)}], E_{2;g,k}(v) \big] = [\delta_{hj} D_{1;i,p}(u) E_{1;p,g}(u), E_{2;g,k}(v)] = \delta_{hj} D_{1;i,p}(u) [E_{1;p,g}(u), E_{2;g,k}(v)]. \tag{6.6} \]
Taking the sum over all $j$ in (6.6), we have
\[ \delta_{hr} D_{1;i,p}(u) [E_{1;p,g}(u), E_{2;g,k}(v)] = D_{1;i,r}(u) \big( E_{1,3;h,k}(v) - E_{1;h,s}(v) E_{2;s,k}(v) \big) - \big( E_{1,3;h,k}(v) - E_{1;h,s}(v) E_{2;s,k}(v) \big) D_{1;i,r}(u), \]
where $p, r, s$ are summed over $\mu_1, \mu_1, \mu_2$, respectively. Changing the indices, we may rewrite the above equality as
\[ \big( E_{1;h,r}(v) E_{2;r,k}(v) - E_{1,3;h,k}(v) \big) D_{1;i,p}(u) = \delta_{hp} D_{1;i,p'}(u) [E_{1;p',g}(u), E_{2;g,k}(v)] + D_{1;i,p}(u) \big( E_{1;h,r}(v) E_{2;r,k}(v) - E_{1,3;h,k}(v) \big), \tag{6.7} \]
where $r, p, p'$ are summed over $\mu_2, \mu_1, \mu_1$, respectively. On the other hand, by (2.4) and (4.9)–(4.14), we have
\[ [t_{i,\mu_1+j}(u), t'_{h,\, \mu_1+\mu_2+k}(v)] = \big[ D_{1;i,p}(u) E_{1;p,j}(u),\ \big( E_{1;h,r}(v) E_{2;r,q}(v) - E_{1,3;h,q}(v) \big) D'_{3;q,k}(v) \big] = 0, \tag{6.8} \]
where $p$ and $q$ are summed over $\mu_1$ and $\mu_3$, respectively. Multiplying $D_3(v)$ from the right and computing the bracket, (6.8) becomes
\[ D_{1;i,p}(u) E_{1;p,j}(u) \big( E_{1;h,r}(v) E_{2;r,k}(v) - E_{1,3;h,k}(v) \big) - \big( E_{1;h,r}(v) E_{2;r,k}(v) - E_{1,3;h,k}(v) \big) D_{1;i,p}(u) E_{1;p,j}(u) = 0, \tag{6.9} \]
where $p$ and $r$ are summed over $\mu_1$ and $\mu_2$, respectively. Substituting (6.7) into the second term of (6.9), we have
\[ D_{1;i,p}(u) E_{1;p,j}(u) \big( E_{1;h,q}(v) E_{2;q,k}(v) - E_{1,3;h,k}(v) \big) - \delta_{h,p_1} D_{1;i,p_2}(u) [E_{1;p_2,g}(u), E_{2;g,k}(v)] E_{1;p_1,j}(u) - D_{1;i,p_3}(u) \big( E_{1;h,q_1}(v) E_{2;q_1,k}(v) - E_{1,3;h,k}(v) \big) E_{1;p_3,j}(u) = 0. \]
Multiplying $D'_1(u)$ from the left, we obtain
\[ E_{1;i,j}(u) \big( E_{1;h,q}(v) E_{2;q,k}(v) - E_{1,3;h,k}(v) \big) - [E_{1;i,g}(u), E_{2;g,k}(v)] E_{1;h,j}(u) - \big( E_{1;h,q_1}(v) E_{2;q_1,k}(v) - E_{1,3;h,k}(v) \big) E_{1;i,j}(u) = 0. \]
Simplifying the above, we obtain (d).

We have the F-counterpart of Lemma 6.1. Lemma 6.2. The following identities hold in Y(μ1 ,μ2 |μ3 ) ((u −1 , v −1 )) for all admissible g, h, i, j, k: (a) [F1;i, j (u), E 2;h,k (v)] = 0,

δik F2;h,q (v) F1;q, j (v) − F1;q, j (u) − F3,1;h, j (v) (b) [F1;i, j (u), F2;h,k (v)] = u−v +F3,1;h, j (u) , (c) [F3,1;i, j (u), F2;h,k (v)] = [F2;h,g (v), F1;g, j (u)]F2;i,k (v), (d) [F1;i, j (u), F2;h,q (v)F1;q,k (v) − F3,1;h,k (v)] = F1;i,k (u)[F1;g, j (u), F2;h,g (v)]. Here, q is summed over 1, . . . , μ2 and g could be any number in {1, 2, . . . , μ2 }. Proof. They can be proved by similar methods as in the proof of Lemma 6.1 and we skip the details.

The following lemma is a generalization of [BK1, Lemma 6.5, Lemma 6.6] and of part of [Go, Lemma 3].

Lemma 6.3. The following identities hold in $Y_{(\mu_1,\mu_2|\mu_3)}[[u^{-1}, v^{-1}, w^{-1}]]$ for all admissible $f, g, h, i, j, k$:

(a) $\big[[E_{1;i,j}(u), E_{2;h,k}(v)], E_{2;f,g}(v)\big] = 0$,

(b) $\big[E_{1;i,j}(u), [E_{1;h,k}(u), E_{2;f,g}(v)]\big] = 0$,

(c) $\big[[E_{1;i,j}(u), E_{2;h,k}(v)], E_{2;f,g}(w)\big] + \big[[E_{1;i,j}(u), E_{2;h,k}(w)], E_{2;f,g}(v)\big] = 0$,

(d) $\big[E_{1;i,j}(u), [E_{1;h,k}(v), E_{2;f,g}(w)]\big] + \big[E_{1;i,j}(v), [E_{1;h,k}(u), E_{2;f,g}(w)]\big] = 0$,

(e) $\big[[F_{1;i,j}(u), F_{2;h,k}(v)], F_{2;f,g}(v)\big] = 0$,

(f) $\big[F_{1;i,j}(u), [F_{1;h,k}(u), F_{2;f,g}(v)]\big] = 0$,

(g) $\big[[F_{1;i,j}(u), F_{2;h,k}(v)], F_{2;f,g}(w)\big] + \big[[F_{1;i,j}(u), F_{2;h,k}(w)], F_{2;f,g}(v)\big] = 0$,

(h) $\big[F_{1;i,j}(u), [F_{1;h,k}(v), F_{2;f,g}(w)]\big] + \big[F_{1;i,j}(v), [F_{1;h,k}(u), F_{2;f,g}(w)]\big] = 0$.

Proof. We prove (a) and (c) in detail here, while the others can be proved in a similar fashion. (a) We first claim that [E a;i, j (v), E a;h,k (v)] = 0 for a = 1, 2 in Y(μ1 ,μ2 |μ3 ) [[u −1 , v −1 ]]. The case a = 1 follows from Theorem 2 and a = 2 follows from applying the map ψμ1 to (5.4). By the super-Jacobi identity, together with the claim and Lemma 6.1(b), it suffices to prove the case when j = h = f . In this case, we compute the bracket by Lemma 6.1 as follows: (u − v) [E 1;i, j (u), E 2; j,k (v)], E 2; j,g (v) = −(u − v) E 2; j,k (v), [E 1;i, j (u), E 2; j,g (v)] = [E 1;i,q (u)E 2;q,g (v) − E 1;i,q (v)E 2;q,g (v) + E 1,3;i,g (v) − E 1,3;i,g (u), E 2; j,k (v)] = [E 1;i,q (u)E 2;q,g (v), E 2; j,k (v)] + [E 1,3;i,g (v), E 2; j,k (v)] −[E 1;i,q (v)E 2;q,g (v), E 2; j,k (v)] − [E 1,3;i,g (u), E 2; j,k (v)] = −[E 1;i,q (u), E 2; j,k (v)]E 2;q,g (v) − E 2; j,g (v)[E 1;i, j (u), E 2; j,k (v)] +[E 1;i,q (v), E 2; j,k (v)]E 2;q,g (v) + E 2; j,g (v)[E 1;i, j (u), E 2; j,k (v)] = − [E 1;i, j (u), E 2; j,k (v)], E 2; j,g (v) + [E 1;i, j (v), E 2; j,k (v)], E 2; j,g (v) . Thus we have (u − v − 1) [E 1;i, j (u), E 2; j,k (v)], E 2; j,g (v) = − [E 1;i, j (v), E 2; j,k (v)], E 2; j,g (v) .

(6.10)

Note that the right-hand side of (6.10) is independent of $u$. Setting $u = v + 1$ makes the left-hand side of (6.10) vanish, so the right-hand side of (6.10) is zero. Using (6.10) again, we obtain (a).

(c) It is enough to show that
\[
(u-w)(v-w)(u-v)\big[[E_{1;i,j}(u), E_{2;h,k}(v)], E_{2;f,g}(w)\big] \tag{6.11}
\]
is symmetric in $v$ and $w$. We may further assume $j = h$, as in the proof of (a). By Lemma 6.1(b), we have
\[
(u-v)\big[[E_{1;i,j}(u), E_{2;j,k}(v)], E_{2;f,g}(w)\big]
= \big[E_{1;i,q}(u)E_{2;q,k}(v) - E_{1;i,q}(v)E_{2;q,k}(v) + E_{1,3;i,k}(v) - E_{1,3;i,k}(u),\; E_{2;f,g}(w)\big].
\]
Multiplying both sides by $(u-w)(v-w)$ and computing the brackets by Lemma 6.1, we have
\begin{align*}
&(u-w)(v-w)(u-v)\big[[E_{1;i,j}(u), E_{2;h,k}(v)], E_{2;f,g}(w)\big] \\
&= (u-w)(v-w)\Big\{ E_{1;i,q}(u)E_{2;q,k}(v)E_{2;f,g}(w) + E_{2;f,g}(w)E_{1;i,q}(u)E_{2;q,k}(v) \\
&\qquad - E_{1;i,q}(v)E_{2;q,k}(v)E_{2;f,g}(w) - E_{2;f,g}(w)E_{1;i,q}(v)E_{2;q,k}(v) \\
&\qquad + E_{2;f,k}(w)[E_{1;i,x}(v), E_{2;x,g}(w)] - E_{2;f,k}(w)[E_{1;i,x}(u), E_{2;x,g}(w)] \Big\} \\
&= (u-w)(v-w)\Big\{ E_{1;i,q}(u)[E_{2;q,k}(v), E_{2;f,g}(w)] - E_{1;i,q}(u)E_{2;f,g}(w)E_{2;q,k}(v) \\
&\qquad + [E_{2;f,g}(w), E_{1;i,q}(u)]E_{2;q,k}(v) + E_{1;i,q}(u)E_{2;f,g}(w)E_{2;q,k}(v) \\
&\qquad - E_{1;i,q}(v)[E_{2;q,k}(v), E_{2;f,g}(w)] + E_{1;i,q}(v)E_{2;f,g}(w)E_{2;q,k}(v) \\
&\qquad - [E_{2;f,g}(w), E_{1;i,q}(v)]E_{2;q,k}(v) - E_{1;i,q}(v)E_{2;f,g}(w)E_{2;q,k}(v) \\
&\qquad - [E_{1;i,x}(v), E_{2;x,g}(w)]E_{2;f,k}(w) + [E_{1;i,x}(u), E_{2;x,g}(w)]E_{2;f,k}(w) \Big\} \\
&= (u-w)(v-w)\,E_{1;i,q}(u)\big[E_{2;q,k}(v), E_{2;f,g}(w)\big]
 + (u-w)(v-w)\big[E_{2;f,g}(w), E_{1;i,q}(u)\big]E_{2;q,k}(v) \\
&\qquad - (u-w)(v-w)\,E_{1;i,q}(v)\big[E_{2;q,k}(v), E_{2;f,g}(w)\big]
 - (u-w)(v-w)\big[E_{2;f,g}(w), E_{1;i,q}(v)\big]E_{2;q,k}(v) \\
&\qquad - (u-w)(v-w)\big[E_{1;i,x}(v), E_{2;x,g}(w)\big]E_{2;f,k}(w)
 + (u-w)(v-w)\big[E_{1;i,x}(u), E_{2;x,g}(w)\big]E_{2;f,k}(w). \tag{6.12}
\end{align*}

Now we use (5.4) and Lemma 6.1 to compute these brackets, then (6.12) equals (u − w)E 1;i,q (u) E 2;q,g (v) − E 2;q,g (w) E 2; f,k (w) − E 2; f,k (v) −(v − w)δq, j E 1;i,q0 (u) − E 1;i,q0 (w) E 2;q0 ,g (w) + E 1,3;i,g (w) −E 1,3;i,g (u) E 2;q,k (v) − (u − w)E 1;i,q (v) E 2;q,g (v) − E 2;q,g (w) E 2; f,k (w) −E 2; f,k (v) + (u − w)δq, f E 1;i,q0 (v) − E 1;i,q0 (w) E 2;q0 ,g (w) +E 1,3;i,g (w) − E 1,3;i,g (v) E 2;q,k (v) −(u − w) E 1;i,q (v)E 2;q,g (w) − E 1;i,q (w)E 2;q,g (w) +E 1,3;i,g (w) − E 1,3;i,g (v) E 2; f,k (w) +(v − w) E 1;i,q (u)E 2;q,g (w) − E 1;i,q (w)E 2;q,g (w) +E 1,3;i,g (w) − E 1,3;i,g (u) E 2; f,k (w), where the indices q and q0 are summed over 1, 2, . . . μ2 . Opening the parentheses of the above equality, we obtain that the resulting expression is indeed symmetric in v and w. Therefore, (6.11) is symmetric in v and w and hence (c) is proved.

7. The General Case

Recall that our goal is to obtain the relations among the generators $\{D^{(r)}_{a;i,j}, D'^{(r)}_{a;i,j}\}$, $\{E^{(r)}_{a;i,j}\}$, and $\{F^{(r)}_{a;i,j}\}$ associated to a composition $\mu$ of $(M|N)$. To that end, we divide them into 3 disjoint parts as follows:

A: $\{D^{(r)}_{a;i,j}, D'^{(r)}_{a;i,j}\}_{1\le a\le m} \cup \{E^{(r)}_{a;i,j}\}_{1\le a<m} \cup \{F^{(r)}_{a;i,j}\}_{1\le a<m}$,

B: $\{D^{(r)}_{a;i,j}, D'^{(r)}_{a;i,j}\}_{m+1\le a\le m+n} \cup \{E^{(r)}_{a;i,j}\}_{m+1\le a<m+n} \cup \{F^{(r)}_{a;i,j}\}_{m+1\le a<m+n}$,

C: $\{E^{(r)}_{m;i,j}\} \cup \{F^{(r)}_{m;i,j}\}$,

for all admissible indices $i, j, r$. If we choose two elements from Part A, then their bracket is obtained by Theorem 2. If we choose two elements from Part B, then they are the images of some elements from

Part A in $Y(\mathfrak{gl}_{N|M})$ under the swap map $\zeta_{N|M}$, and the bracket is obtained by Theorem 2 as well. Now suppose one of them is from Part A and the other is from Part B. Note that every element in Part A is in the northwestern $M \times M$ corner of $T(u)$ and hence is in the subalgebra $Y(\mathfrak{gl}_M)$ of $Y(\mathfrak{gl}_{M|N})$ (see Sect. 4). On the other hand, every element in Part B is in the southeastern $N \times N$ corner of $T(u)$, and hence is in the subalgebra $\psi_M\big(Y(\mathfrak{gl}_{0|N})\big)$ of $Y(\mathfrak{gl}_{M|N})$. Thus, their bracket is zero by Lemma 4.3. Therefore, we only have to focus on the cross section where the odd blocks and even blocks are "close", and this is done in Proposition 5.1, Lemma 6.1 and Lemma 6.2. Moreover, there are some non-trivial ternary bracket relations in the non-super case, and the corresponding ternary relations in the super case are found in Lemma 6.3. The following proposition summarizes the results we have obtained up to now.

Proposition 7.1. For all admissible $a, b, f, g, h, i, j, k$, we have the following equalities in the super Yangian $Y_\mu((u^{-1}, v^{-1}, w^{-1}))$:
\begin{align*}
(u-v)[D_{a;i,j}(u), E_{b;h,k}(v)] &=
\begin{cases}
(-1)^{\bar b}\Big(\delta_{a,b}\delta_{h,j}\,D_{a;i,p}(u)\big(E_{a;p,k}(v) - E_{a;p,k}(u)\big) \\
\qquad - \delta_{a,b+1}\,D_{a;i,k}(u)\big(E_{b;h,j}(v) - E_{b;h,j}(u)\big)\Big), & \text{if } b \ne m, \\[4pt]
\delta_{a,b}\delta_{h,j}\,D_{a;i,p}(u)\big(E_{a;p,k}(v) - E_{a;p,k}(u)\big) \\
\qquad + \delta_{a,b+1}\,D_{a;i,k}(u)\big(E_{b;h,j}(v) - E_{b;h,j}(u)\big), & \text{if } b = m,
\end{cases} \\[6pt]
(u-v)[D_{a;i,j}(u), F_{b;h,k}(v)] &=
\begin{cases}
(-1)^{\bar b}\Big(-\delta_{a,b}\delta_{k,i}\big(F_{b;h,p}(v) - F_{b;h,p}(u)\big)D_{a;p,j}(u) \\
\qquad + \delta_{a,b+1}\big(F_{b;i,k}(v) - F_{b;i,k}(u)\big)D_{a;h,j}(u)\Big), & \text{if } b \ne m, \\[4pt]
-\delta_{a,b}\delta_{k,i}\big(F_{b;h,p}(v) - F_{b;h,p}(u)\big)D_{a;p,j}(u) \\
\qquad - \delta_{a,b+1}\big(F_{b;i,k}(v) - F_{b;i,k}(u)\big)D_{a;h,j}(u), & \text{if } b = m,
\end{cases} \\[6pt]
(u-v)[E_{a;i,j}(u), E_{a;h,k}(v)] &=
\begin{cases}
(-1)^{\bar a}\big(E_{a;i,k}(u) - E_{a;i,k}(v)\big)\big(E_{a;h,j}(u) - E_{a;h,j}(v)\big), & \text{if } a \ne m, \\
\big(E_{a;i,k}(u) - E_{a;i,k}(v)\big)\big(E_{a;h,j}(v) - E_{a;h,j}(u)\big), & \text{if } a = m,
\end{cases} \\[6pt]
(u-v)[F_{a;i,j}(u), F_{a;h,k}(v)] &=
\begin{cases}
-(-1)^{\bar a}\big(F_{a;i,k}(u) - F_{a;i,k}(v)\big)\big(F_{a;h,j}(u) - F_{a;h,j}(v)\big), & \text{if } a \ne m, \\
\big(F_{a;i,k}(u) - F_{a;i,k}(v)\big)\big(F_{a;h,j}(v) - F_{a;h,j}(u)\big), & \text{if } a = m,
\end{cases} \\[6pt]
(u-v)[E_{a;i,j}(u), F_{b;h,k}(v)] &= \delta_{a,b}(-1)^{\overline{b+1}}\big(D'_{a;i,k}(u)D_{a+1;h,j}(u) - D_{a+1;h,j}(v)D'_{a;i,k}(v)\big), \\[6pt]
(u-v)[E_{a;i,j}(u), E_{a+1;h,k}(v)] &= \delta_{h,j}(-1)^{\overline{a+1}}\big(E_{a;i,q}(u)E_{a+1;q,k}(v) - E_{a;i,q}(v)E_{a+1;q,k}(v) \\
&\qquad + E_{a,a+2;i,k}(v) - E_{a,a+2;i,k}(u)\big), \\[6pt]
(u-v)[F_{a;i,j}(u), F_{a+1;h,k}(v)] &= \delta_{i,k}(-1)^{\overline{a+1}}\big(-F_{a+1;h,q}(v)F_{a;q,j}(u) + F_{a+1;h,q}(v)F_{a;q,j}(v) \\
&\qquad - F_{a+2,a;h,j}(v) + F_{a+2,a;h,j}(u)\big),
\end{align*}
\begin{align*}
[E_{a;i,j}(u), E_{b;h,k}(v)] &= 0 \quad \text{if } b > a+1 \text{ or if } b = a+1 \text{ and } h \ne j, \\
[F_{a;i,j}(u), F_{b;h,k}(v)] &= 0 \quad \text{if } b > a+1 \text{ or if } b = a+1 \text{ and } i \ne k,
\end{align*}
\begin{align*}
\big[E_{a;i,j}(u), [E_{a;h,k}(v), E_{b;f,g}(w)]\big] + \big[E_{a;i,j}(v), [E_{a;h,k}(u), E_{b;f,g}(w)]\big] &= 0, \quad |a-b| \ge 1, \\
\big[F_{a;i,j}(u), [F_{a;h,k}(v), F_{b;f,g}(w)]\big] + \big[F_{a;i,j}(v), [F_{a;h,k}(u), F_{b;f,g}(w)]\big] &= 0, \quad |a-b| \ge 1,
\end{align*}
where $\bar a := 0$ if $1 \le a \le m$ and $\bar a := 1$ if $m+1 \le a \le m+n$.

Proof. This is a consequence of Theorem 2, Proposition 5.1, Lemmas 6.1−6.3,

together with the maps $\psi_k$ and $\zeta_{M|N}$.

The next lemma is a block generalization of [Go, Lemma 5] and the proof is essentially the same, except that we are using block decompositions. The relations are purely super phenomena.

Lemma 7.2. Associated to $\mu = (\mu_1, \mu_2, \dots, \mu_m \,|\, \mu_{m+1}, \dots, \mu_{m+n})$ with $m > 1$ and $n > 1$, we have the following identities in $Y_\mu$:
\begin{align*}
\big[[E^{(r)}_{m-1;i,j}, E^{(1)}_{m;h,k}],\, [E^{(1)}_{m;h_0,k_0}, E^{(s)}_{m+1;f,g}]\big] &= 0, \tag{7.1} \\
\big[[F^{(r)}_{m-1;i,j}, F^{(1)}_{m;h,k}],\, [F^{(1)}_{m;h_0,k_0}, F^{(s)}_{m+1;f,g}]\big] &= 0, \tag{7.2}
\end{align*}
for all admissible $f, g, h, i, j, k, h_0, k_0, r, s$.

Proof. By using the maps $\zeta_{M|N}$ and $\psi$, it is enough to show (7.1) in the case $m = n = 2$ only. Therefore, we want to show (7.1) in $Y_{(\mu_1,\mu_2|\mu_3,\mu_4)}$, i.e.,
\[
\big[[E^{(r)}_{1;i,j}, E^{(1)}_{2;h,k}],\, [E^{(1)}_{2;h_0,k_0}, E^{(s)}_{3;f,g}]\big] = 0. \tag{7.3}
\]
We first claim that for all admissible $i, j, h, k$,
\[
\big[E_{1,3;i,j}(u),\; E_{2;h,q}(v)E_{3;q,k}(v) - E_{2,4;h,k}(v)\big] = 0, \tag{7.4}
\]

where the index q is summed over 1, 2, . . . , μ3 . To prove the claim, we use (4.11) and (4.13) associated to the composition (μ1 , μ2 | μ3 , μ4 ) to derive the following identities: E 1,3;i, j (u) = D1;i, p (u)t p,μ1 +μ2 + j (u),

E 2;h,q (v)E 3;q,k (v) − E 2,4;h,k (v) = tμ 1 +h,μ1 +μ2 +μ3 +r (v)D4;r,k (v), for all 1 ≤ i ≤ μ1 , 1 ≤ j ≤ μ3 , 1 ≤ h ≤ μ2 , 1 ≤ k ≤ μ4 , and the indices p, q, r are summed over μ1 , μ3 , μ4 , respectively. Substituting these identities into the bracket in (7.4) and setting a notation n a := μ1 + μ2 + · · · + μa for short, we have [E 1,3;i, j (u), E 2;h,q (v)E 3;q,k (v) − E 2,4;h,k (v)] = [D1;i, p (u)t p,n 2 + j (u), tμ1 +h,n 3 +r (v)D4;r,k (v)]

= D1;i, p (u)t p,n 2 + j (u)tμ1 +h,n 3 +r (v)D4;r,k (v) + tμ1 +h,n 3 +r (v)D4;r,k (v)D1;i, p (u)t p,n 2 + j (u) = D1;i, p (u)t p,n 2 + j (u)tμ1 +h,n 3 +r (v)D4;r,k (v) + tμ1 +h,n 3 +r (v)D1;i, p (u)D4;r,k (v)t p,n 2 + j (u) = D1;i, p (u)t p,n 2 + j (u)tμ1 +h,n 3 +r (v)D4;r,k (v) + D1;i, p (u)tμ1 +h,n 3 +r (v)t p,n 2 + j (u)D4;r,k (v) = D1;i, p (u)[t p,n 2 + j (u), tμ1 +h,n 3 +r (v)]D4;r,k (v) = 0, and the claim follows.

Note that in the above computation we have used the facts that D1;i, j (u) = ti j (u)

and

D4;i, j (u) = tn 3 +i,n 3 + j (u),

therefore [D1;i, j (u), tμ 1 +h,n 3 +k (v)] = 0 and [D4;i, j (u), th,n 2 +k (v)] = 0 by (2.4).


It suffices to prove (7.3) when h = j and k0 = f , by Lemma 6.1(b). Computing the following bracket by Lemma 6.1(b), we have (u − v)(w − z) [E 1;i, j (u), E 2; j,k (v)] , [E 2;h 0 , f (w), E 3; f,g (z)] = E 1;i,q (u)E 2;q,k (v) − E 1;i,q (v)E 2;q,k (v) + E 1,3;i,k (v) − E 1,3;i,k (u), −E 2;h 0 , p (w)E 3; p,g (z) + E 2;h 0 , p (z)E 3; p,g (z) − E 2,4;h 0 ,g (z) + E 2,4;h 0 ,g (w) . Taking its coefficient of u −r z −s v 0 w 0 , we have s−1 (r ) (s−t) (t) (r ) (s) −E 1,3;i,k , E 2;h 0 , p E 3; p,g + −E 1,3;i,k , −E 2,4;h 0 ,g , t=1

and it equals the coefficient of $u^{-r} z^{-s}$ in $\big[E_{1,3;i,k}(u),\, -E_{2;h_0,p}(z)E_{3;p,g}(z) + E_{2,4;h_0,g}(z)\big]$, which is zero by (7.4). Finally, the coefficient of $u^{-r} z^{-s} v^0 w^0$ in
\[
(u-v)(w-z)\big[[E_{1;i,j}(u), E_{2;j,k}(v)],\, [E_{2;h_0,f}(w), E_{3;f,g}(z)]\big]
\]
is exactly $-\big[[E^{(r)}_{1;i,j}, E^{(1)}_{2;j,k}],\, [E^{(1)}_{2;h_0,f}, E^{(s)}_{3;f,g}]\big]$ and (7.3) follows.

Recall the fact stated in Theorem 1 that $Y(\mathfrak{gl}_{M|N})$ is generated as an algebra by the set $\{D^{(r)}_{a;i,j}, D'^{(r)}_{a;i,j}, E^{(r)}_{a;i,j}, F^{(r)}_{a;i,j}\}$. The following theorem describes the relations among these generators.

Theorem 3. The following relations hold in $Y(\mathfrak{gl}_{M|N})$ for all admissible indices $a, b, f, g, h, i, j, k, l, r, s, h_0, k_0$:
\begin{align*}
D^{(0)}_{a;i,j} &= \delta_{ij}, \tag{7.5} \\
\sum_{t=0}^{r} D^{(t)}_{a;i,p}\, D'^{(r-t)}_{a;p,j} &= \delta_{r0}\,\delta_{ij}, \tag{7.6} \\
[D^{(r)}_{a;i,j}, D^{(s)}_{b;h,k}] &= \delta_{ab} \sum_{t=0}^{\min(r,s)-1} \Big( D^{(t)}_{a;h,j} D^{(r+s-1-t)}_{a;i,k} - D^{(r+s-1-t)}_{a;h,j} D^{(t)}_{a;i,k} \Big), \tag{7.7}
\end{align*}
\[
[D^{(r)}_{a;i,j}, E^{(s)}_{b;h,k}] =
\begin{cases}
(-1)^{\bar b}\Big( \delta_{a,b}\delta_{h,j} \displaystyle\sum_{t=0}^{r-1} D^{(t)}_{a;i,p} E^{(r+s-1-t)}_{a;p,k} - \delta_{a,b+1} \sum_{t=0}^{r-1} D^{(t)}_{a;i,k} E^{(r+s-1-t)}_{b;h,j} \Big), & b \ne m, \\[8pt]
\delta_{a,b}\delta_{h,j} \displaystyle\sum_{t=0}^{r-1} D^{(t)}_{a;i,p} E^{(r+s-1-t)}_{a;p,k} + \delta_{a,b+1} \sum_{t=0}^{r-1} D^{(t)}_{a;i,k} E^{(r+s-1-t)}_{b;h,j}, & b = m,
\end{cases} \tag{7.8}
\]
\[
[D^{(r)}_{a;i,j}, F^{(s)}_{b;h,k}] =
\begin{cases}
(-1)^{\bar b}\Big( -\delta_{a,b}\delta_{k,i} \displaystyle\sum_{t=0}^{r-1} F^{(r+s-1-t)}_{b;h,p} D^{(t)}_{a;p,j} + \delta_{a,b+1} \sum_{t=0}^{r-1} F^{(r+s-1-t)}_{b;i,k} D^{(t)}_{a;h,j} \Big), & b \ne m, \\[8pt]
-\delta_{a,b}\delta_{k,i} \displaystyle\sum_{t=0}^{r-1} F^{(r+s-1-t)}_{b;h,p} D^{(t)}_{a;p,j} - \delta_{a,b+1} \sum_{t=0}^{r-1} F^{(r+s-1-t)}_{b;i,k} D^{(t)}_{a;h,j}, & b = m,
\end{cases} \tag{7.9}
\]
\[
[E^{(r)}_{a;i,j}, E^{(s)}_{a;h,k}] =
\begin{cases}
(-1)^{\bar a}\Big( \displaystyle\sum_{t=1}^{s-1} E^{(t)}_{a;i,k} E^{(r+s-1-t)}_{a;h,j} - \sum_{t=1}^{r-1} E^{(t)}_{a;i,k} E^{(r+s-1-t)}_{a;h,j} \Big), & a \ne m, \\[8pt]
\displaystyle\sum_{t=1}^{r-1} E^{(t)}_{a;i,k} E^{(r+s-1-t)}_{a;h,j} - \sum_{t=1}^{s-1} E^{(t)}_{a;i,k} E^{(r+s-1-t)}_{a;h,j}, & a = m,
\end{cases} \tag{7.10}
\]
\[
[F^{(r)}_{a;i,j}, F^{(s)}_{a;h,k}] =
\begin{cases}
(-1)^{\bar a}\Big( \displaystyle\sum_{t=1}^{r-1} F^{(r+s-1-t)}_{a;i,k} F^{(t)}_{a;h,j} - \sum_{t=1}^{s-1} F^{(r+s-1-t)}_{a;i,k} F^{(t)}_{a;h,j} \Big), & a \ne m, \\[8pt]
\displaystyle\sum_{t=1}^{r-1} F^{(r+s-1-t)}_{a;i,k} F^{(t)}_{a;h,j} - \sum_{t=1}^{s-1} F^{(r+s-1-t)}_{a;i,k} F^{(t)}_{a;h,j}, & a = m,
\end{cases} \tag{7.11}
\]
\begin{align*}
[E^{(r)}_{a;i,j}, F^{(s)}_{b;h,k}] &= -(-1)^{\overline{b+1}}\,\delta_{a,b} \sum_{t=0}^{r+s-1} D^{(r+s-1-t)}_{a+1;h,j}\, D'^{(t)}_{a;i,k}, \tag{7.12} \\
[E^{(r+1)}_{a;i,j}, E^{(s)}_{a+1;h,k}] - [E^{(r)}_{a;i,j}, E^{(s+1)}_{a+1;h,k}] &= (-1)^{\overline{a+1}}\,\delta_{h,j}\, E^{(r)}_{a;i,q} E^{(s)}_{a+1;q,k}, \tag{7.13} \\
[F^{(r+1)}_{a;i,j}, F^{(s)}_{a+1;h,k}] - [F^{(r)}_{a;i,j}, F^{(s+1)}_{a+1;h,k}] &= -(-1)^{\overline{a+1}}\,\delta_{i,k}\, F^{(s)}_{a+1;h,q} F^{(r)}_{a;q,j}, \tag{7.14} \\
[E^{(r)}_{a;i,j}, E^{(s)}_{b;h,k}] &= 0 \quad \text{if } b > a+1 \text{ or if } b = a+1 \text{ and } h \ne j, \tag{7.15} \\
[F^{(r)}_{a;i,j}, F^{(s)}_{b;h,k}] &= 0 \quad \text{if } b > a+1 \text{ or if } b = a+1 \text{ and } i \ne k, \tag{7.16} \\
\big[E^{(r)}_{a;i,j}, [E^{(s)}_{a;h,k}, E^{(l)}_{b;f,g}]\big] + \big[E^{(s)}_{a;i,j}, [E^{(r)}_{a;h,k}, E^{(l)}_{b;f,g}]\big] &= 0, \quad |a-b| \ge 1, \tag{7.17} \\
\big[F^{(r)}_{a;i,j}, [F^{(s)}_{a;h,k}, F^{(l)}_{b;f,g}]\big] + \big[F^{(s)}_{a;i,j}, [F^{(r)}_{a;h,k}, F^{(l)}_{b;f,g}]\big] &= 0, \quad |a-b| \ge 1, \tag{7.18} \\
\big[[E^{(r)}_{m-1;i,j}, E^{(1)}_{m;h,k}],\, [E^{(1)}_{m;h_0,k_0}, E^{(s)}_{m+1;f,g}]\big] &= 0, \quad \text{when } m > 1,\ n > 1, \tag{7.19} \\
\big[[F^{(r)}_{m-1;i,j}, F^{(1)}_{m;h,k}],\, [F^{(1)}_{m;h_0,k_0}, F^{(s)}_{m+1;f,g}]\big] &= 0, \quad \text{when } m > 1,\ n > 1, \tag{7.20}
\end{align*}
where $\bar a := 0$ if $1 \le a \le m$, $\bar a := 1$ if $m+1 \le a \le m+n$, and the index $p$ (resp. $q$) is summed over $1, \dots, \mu_a$ (resp. $1, \dots, \mu_{a+1}$).

Proof. Equations (7.5)–(7.7) follow from Proposition 4.5, while the others come from Proposition 7.1, Lemma 7.2 and the identity
\[
\frac{S(v) - S(u)}{u - v} = \sum_{r,s \ge 1} S^{(r+s-1)}\, u^{-r} v^{-s},
\]
valid for any formal series $S(u) = \sum_{r \ge 0} S^{(r)} u^{-r}$.
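The formal-series identity used in this proof can be checked one monomial at a time. The following short computation is a sketch added for the reader's convenience; it uses only the standard factorisation of $u^k - v^k$, and the generic symbols $D(u)$, $E(u)$ in the last step are placeholders for any series of the indicated shape rather than specific generators. For $k \ge 1$,
\[
v^{-k} - u^{-k} = \frac{u^k - v^k}{u^k v^k}
= (u - v) \sum_{j=0}^{k-1} u^{-(j+1)} v^{-(k-j)},
\qquad\text{hence}\qquad
\frac{v^{-k} - u^{-k}}{u - v} = \sum_{\substack{r, s \ge 1 \\ r + s - 1 = k}} u^{-r} v^{-s}.
\]
Multiplying by $S^{(k)}$ and summing over $k \ge 1$ (the $k = 0$ term cancels in $S(v) - S(u)$) gives the stated identity. As an illustration of how it is applied, if $D(u) = \sum_{t \ge 0} D^{(t)} u^{-t}$ and $E(u) = \sum_{r \ge 1} E^{(r)} u^{-r}$, then
\[
D(u)\,\frac{E(v) - E(u)}{u - v}
= \sum_{t \ge 0} \sum_{p, s \ge 1} D^{(t)} E^{(p+s-1)}\, u^{-(t+p)} v^{-s},
\]
so the coefficient of $u^{-r} v^{-s}$ is $\sum_{t=0}^{r-1} D^{(t)} E^{(r+s-1-t)}$, which is the shape of the sums appearing in (7.8).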

Remark 7.1. In the special case where all $\mu_i = 1$, the right-hand sides of (7.10) and (7.11) degenerate to zero when $a = m$. See [Go, Theorem 3]. In fact, the relations in Theorem 3 suffice as defining relations of the super Yangian $Y(\mathfrak{gl}_{M|N})$.


Theorem 4. The super Yangian Y (gl M|N ) is generated by the elements (r )

(r )

{Da;i, j , Da;i, j | 1 ≤ a ≤ m + n, 1 ≤ i, j ≤ μa , r ≥ 0}, (r )

{E a;i, j | 1 ≤ a < m + n, 1 ≤ i ≤ μa , 1 ≤ j ≤ μa+1 , r ≥ 1}, (r )

{Fa;i, j | 1 ≤ a < m + n, 1 ≤ i ≤ μa+1 , 1 ≤ j ≤ μa , r ≥ 1}, subject to the relations (7.5)−(7.20). μ denote the abstract Proof. Recall the notation Yμ := Y (gl M|N ) defined in Sect. 6. Let Y algebra generated by the elements and relations as in the statement of Theorem 4. We (r ) (r ) μ by the relations (3.18), and it may further define all the other E a,b;i, j and Fb,a;i, j in Y is not hard to show that this definition is independent of the choices of k [BK1, p.22]. Let be the map μ −→ Yμ :Y μ into the element in Yμ with the same name. By Theorem 1 sending every element in Y and Theorem 3, the map is a surjective algebra homomorphism. Therefore, it remains to prove that is also injective. The injectivity will be proved in Sect. 8.

8. Injectivity of μ Our strategy of proving the injectivity of is as follows: we find a spanning set for Y (see Proposition 8.1) and show that the images of the spanning set for Yμ under is linearly independent in Yμ (see Proposition 8.4). μ is spanned as a vector space by the monomials in the elements Proposition 8.1. Y (r ) (r ) (r ) {Da;i, j , E a,b;i, j , Fb,a;i, j } taken in certain fixed order. μ+ , Y μ− ) denote the subalgebras of Y μ generated by the elements μ0 (resp. Y Proof. Let Y (r ) (r ) (r ) μ is spanned by {Da;i, j } (resp. {E a,b;i, j }, {Fb,a;i, j }). By the relations in Theorem 3, Y the monomials where all F’s come before all D’s and all D’s come before all E’s. μ by setting Define a filtration on Y (r )

(r )

(r )

deg(Da;i, j ) = deg(E a,b;i, j ) = deg(Fb,a;i, j ) = r − 1,

for all r ≥ 1,

μ . The above argument implies that and denote the associated graded algebra by gr L Y the multiplication map is surjective, μ− ⊗ gr L Y μ0 ⊗ gr L Y μ+ gr L Y μ . gr L Y μ0 is spanned by μ0 is commutative by Proposition 4.5. It follows that Y Moreover, gr L Y (r ) μ+ the monomials in {Da;i, j } in certain fixed order. Hence it is enough to show that gr L Y is spanned by the monomials in E’s in certain order, and the swap map ζ N |M will show μ− is spanned by the monomials in F’s in certain order. that gr L Y (r ) (r ) μ+ by E a,b;i, We denote the image of E a,b;i, j in the graded algebra grrL−1 Y j . We have the following.


Claim*. For all admissible a, b, c, d, i, j, h, k, r, s, we have (r )

(s)

(r +s−1)

(r +s−1)

[E a,b;i, j , E c,d;h,k ] = (−1)b δb,c δh, j E a,d;i,k − (−1)a b+a c+b c δa,d δi,k E c,b;h, j .

(8.1)

μ+ is spanned by the monoAssuming the claim, we have that the graded algebra gr L Y (r ) μ+ is spanned by the monomials in {E (r ) } mials in {E a,b;i, j } in certain order and hence Y a,b;i, j in certain order as well and therefore Proposition 8.1 is established.
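The way the claim yields a spanning set can be made explicit by a degree count and a standard reordering step. The following is only a sketch of that bookkeeping, written for generic homogeneous elements $x$, $y$ of the graded algebra (these symbols are not part of the original notation). Since $\deg E^{(r)}_{a,b;i,j} = r - 1$, both sides of (8.1) lie in degree
\[
(r - 1) + (s - 1) = r + s - 2 = \deg E^{(r+s-1)}_{a,d;i,k} = \deg E^{(r+s-1)}_{c,b;h,j},
\]
so (8.1) is an identity between homogeneous elements. If $x$ and $y$ are two of the generators $E^{(r)}_{a,b;i,j}$ occurring next to each other in the wrong order inside a monomial, then
\[
x\,y = \pm\, y\,x + [x, y],
\]
where the sign is determined by the parities of $x$ and $y$, and by (8.1) the bracket $[x, y]$ is a linear combination of generators, that is, of monomials that are one factor shorter. Induction on the length of the monomial and on the number of out-of-order pairs then rewrites every monomial as a combination of ordered ones.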

To establish the claim*, we first prove some special cases. μ+ : Lemma 8.2. The following identities hold in gr L Y (a) (r )

(s)

[E a,a+1;i, j , E b,b+1;h,k ] = 0, if |a − b| ≠ 1,

(8.2)

(b) (r )

(s)

(r −1)

(s+1)

[E a,a+1;i, j , E b,b+1;h,k ] = [E a,a+1;i, j , E b,b+1;h,k ], if |a − b| = 1,

(8.3)

(c)

(s) (r ) (s) (t) (r ) (t) E a,a+1;i, j , [E a,a+1;h,k , E b,b+1; f,g ] = − E a,a+1;i, j , [E a,a+1;h,k , E b,b+1; f,g ] , (8.4)

if |a − b| = 1, (d) (r )

(r )

(1)

(1)

(r )

E a,b;i, j = (−1)b−1 [E a,b−1;i,h , E b−1,b;h, j ] = (−1)a+1 [E a,a+1;i,k , E a+1,b;k, j ], (8.5) for all b > a + 1 and any 1 ≤ h ≤ μb−1 , 1 ≤ k ≤ μa+1 . Proof. Equations (8.2) and (8.3) follow from (7.15) and (7.13). Equation (8.4) follows from (7.17) and (8.5) follows from (3.9).

μ+ : Lemma 8.3. The following identities hold in gr L Y (a) (r )

(s)

[E a,a+2;i, j , E a+1,a+2;h,k ] = 0, for all 1 ≤ a ≤ m + n − 2,

(8.6)

(b) (r )

(s)

[E a,a+1;i, j , E a,a+2;h,k ] = 0, for all 1 ≤ a ≤ m + n − 2,

(8.7)

(c) (r )

(s)

[E a,a+2;i, j , E a+1,a+3;h,k ] = 0, for all 1 ≤ a ≤ m + n − 3,

(8.8)

(d) (r )

(s)

[E a,b;i, j , E c,c+1;h,k ] = 0, for all 1 ≤ a < c < b ≤ m + n.

(8.9)


Proof. (a) By (8.5) and (8.4), we have
\begin{align*}
(-1)^{\overline{a+1}}\,[E^{(r)}_{a,a+2;i,j}, E^{(s)}_{a+1,a+2;h,k}]
&= \big[[E^{(r)}_{a,a+1;i,f}, E^{(1)}_{a+1,a+2;f,j}],\, E^{(s)}_{a+1,a+2;h,k}\big] \\
&= -\big[[E^{(r)}_{a,a+1;i,f}, E^{(s)}_{a+1,a+2;f,j}],\, E^{(1)}_{a+1,a+2;h,k}\big] \\
&= -\big[[E^{(r+s-1)}_{a,a+1;i,f}, E^{(1)}_{a+1,a+2;f,j}],\, E^{(1)}_{a+1,a+2;h,k}\big],
\end{align*}
and the last term is zero by (8.4).

(b) The same method as in (a) works, except that we apply (8.5) to the term $E^{(s)}_{a,a+2;h,k}$.

(c) This case takes some effort due to the $\mathbb{Z}_2$-grading. First assume that $a \ne m - 1$. We apply (8.5) to the left-hand side of (8.8) and use the super-Jacobi identity:
\begin{align*}
[E^{(r)}_{a,a+2;i,j}, E^{(s)}_{a+1,a+3;h,k}]
&= (-1)^{\overline{a+1}+\overline{a+2}}\,\big[[E^{(r)}_{a,a+1;i,h}, E^{(1)}_{a+1,a+2;h,j}],\, [E^{(1)}_{a+1,a+2;h,j}, E^{(s)}_{a+2,a+3;j,k}]\big] \\
&= (-1)^{\overline{a+1}+\overline{a+2}}\,\big[\big[[E^{(r)}_{a,a+1;i,h}, E^{(1)}_{a+1,a+2;h,j}],\, E^{(1)}_{a+1,a+2;h,j}\big],\, E^{(s)}_{a+2,a+3;j,k}\big] \\
&\quad + \varepsilon\,(-1)^{\overline{a+1}+\overline{a+2}}\,\big[E^{(1)}_{a+1,a+2;h,j},\, \big[[E^{(r)}_{a,a+1;i,h}, E^{(1)}_{a+1,a+2;h,j}],\, E^{(s)}_{a+2,a+3;j,k}\big]\big],
\end{align*}
where $\varepsilon$ is $(-1)^{\alpha\beta}$, $\alpha$ is the degree of $[E^{(r)}_{a,a+1;i,h}, E^{(1)}_{a+1,a+2;h,j}]$ and $\beta$ is the degree of $E^{(1)}_{a+1,a+2;h,j}$. By (8.4), the first term is zero. Moreover, by our assumption that $a \ne m - 1$, the element $E^{(1)}_{a+1,a+2;h,j}$ is even and hence $\varepsilon$ is 1. Using the super-Jacobi identity and Lemma 8.2 repeatedly, we deduce that the above equals
\begin{align*}
&(-1)^{\overline{a+1}+\overline{a+2}}\,\big[E^{(1)}_{a+1,a+2;h,j},\, \big[[E^{(r)}_{a,a+1;i,h}, E^{(1)}_{a+1,a+2;h,j}],\, E^{(s)}_{a+2,a+3;j,k}\big]\big] \\
&= (-1)^{\overline{a+1}+\overline{a+2}}\,\big[E^{(1)}_{a+1,a+2;h,j},\, \big[E^{(r)}_{a,a+1;i,h},\, [E^{(1)}_{a+1,a+2;h,j}, E^{(s)}_{a+2,a+3;j,k}]\big]\big] + 0 \\
&= (-1)^{\overline{a+1}+\overline{a+2}}\,\big[[E^{(1)}_{a+1,a+2;h,j}, E^{(r)}_{a,a+1;i,h}],\, [E^{(1)}_{a+1,a+2;h,j}, E^{(s)}_{a+2,a+3;j,k}]\big] + 0 \\
&= -(-1)^{\overline{a+1}+\overline{a+2}}\,\big[[E^{(r)}_{a,a+1;i,h}, E^{(1)}_{a+1,a+2;h,j}],\, [E^{(1)}_{a+1,a+2;h,j}, E^{(s)}_{a+2,a+3;j,k}]\big] \\
&= -[E^{(r)}_{a,a+2;i,j}, E^{(s)}_{a+1,a+3;h,k}].
\end{align*}
Therefore, (8.8) is true for all $a \ne m - 1$. Now let $a = m - 1$; the same method shows that
\begin{align*}
[E^{(r)}_{m-1,m+1;i,j}, E^{(s)}_{m,m+2;h,k}]
&= \big[(-1)^{\bar m}[E^{(r)}_{m-1,m;i,f}, E^{(1)}_{m,m+1;f,j}],\; (-1)^{\overline{m+1}}[E^{(1)}_{m,m+1;h,g}, E^{(s)}_{m+1,m+2;g,k}]\big] \\
&= \pm\big[[E^{(r)}_{m-1,m;i,f}, E^{(1)}_{m,m+1;f,j}],\, [E^{(1)}_{m,m+1;h,g}, E^{(s)}_{m+1,m+2;g,k}]\big],
\end{align*}
which is zero by (7.1), and hence (8.8) is true when $a = m - 1$ as well.

(d) By the super-Jacobi identity and (8.5), it is enough to show the following 2 cases:
\[
[E^{(r)}_{a,c+1;i,j}, E^{(s)}_{c,c+1;h,k}] = 0, \quad \text{for all } a < c, \tag{8.10}
\]
and
\[
[E^{(r)}_{a,c+1;i,j}, E^{(s)}_{c,c+2;h,k}] = 0, \quad \text{for all } a < c. \tag{8.11}
\]


They can be proved by using (8.2)−(8.8) and induction on c − a. We show (8.10) in detail here. When c = a + 1, it follows directly from (8.6). Now assume c > a + 1. By (8.5) and super-Jacobi identity, we have (r ) (s) (1) (r ) (s) [E a,c+1;i, j , E c,c+1;h,k ] = (−1)a+1 [E a,a+1;i, f , E a+1,c+1; f, j ] , E c,c+1;h,k (1) (r ) (s) = (−1)a+1 E a,a+1;i, f , [E a+1,c+1; f, j , E c,c+1;h,k ] (r ) (1) (s) ± E a+1,c+1; f, j , [E a,a+1;i, f , E c,c+1;h,k ] . The first term is zero by induction hypothesis and the second term is also zero by (8.2). Proof (Proof of claim*). Without loss of generality, we may assume that a ≤ c. The proof is split into 7 cases and we prove them one by one. Case 1. a < b < c < d. It follows directly from (8.2) and (8.5) that the bracket in (8.1) is zero. Case 2. a < b = c < d. By (8.3) and (8.5), we have (r +1)

(s+1)

(r +s+1)

(1)

[E b−1,b;i1 , j , E b,b+1;h,k1 ] = [E b−1,b;i1 , j , E b,b+1;h,k1 ] (r +s+1)

= δh, j (−1)b E b−1,b+1;i1 ,k1 .

(8.12)

Note that when h ≠ j, the bracket is zero by (7.13) and hence the δh, j comes out. Taking the bracket on both sides of Eq. (8.12) with the elements (1)

(1)

(1)

E b+1,b+2;k1 ,k2 , E b+2,b+3;k2 ,k3 , . . . , E d−1,d;kd−1 ,k from the right and using the super-Jacobi identity, Eqs. (8.2) and (8.5), we have (r +1)

(s+1)

(r +s+1)

[E b−1,b;i1 , j , E b,d;h,k ] = δh, j (−1)b E b−1,d;i1 ,k .

(8.13)

Taking brackets on both sides of (8.13) with the elements (1)

(1)

(1)

E b−2,b−1;i2 ,i1 , E b−3,b−2;i3 ,i2 , . . . , E a,a+1;i,ib−a−1 from the left and using exactly the same method as above, we have (r )

(s)

(r +s−1)

[E a,b;i, j , E b,d;h,k ] = δh, j (−1)b E a,d;i,k , as desired. Case 3. a < c < b = d. Using the super-Jacobi identity, (8.5) and (8.9), we have (r ) (r ) (s) (1) (s) [E a,b;i, j , E c,b;h,k ] = E a,b;i, j , (−1)c+1 [E c,c+1;h, f 1 , E c+1,b; f1 ,k ] (r ) (1) (s) = (−1)c+1 [E a,b;i, j , E c,c+1;h, f1 ], E c+1,b; f 1 ,k (1) (r ) (s) ±(−1)c+1 E c,c+1;h, f 1 , [E a,b;i, j , E c+1,b; f1 ,k ] (1) (r ) (s) = 0 ± (−1)c+1 E c,c+1;h, f 1 , [E a,b;i, j , E c+1,b; f 1 ,k ] (1) (1) = · · · = ± E c,c+1;h, f 1 , [E c+1,c+2; f1 , f2 , (r ) (s) . . . , [E a,b;i, j , E b−1,b; f b−1−c ,k ] · · · . (r )

(s)

By (8.9) again, the bracket [E a,b;i, j , E b−1,b; f b−1−c ,k ] = 0.


Case 4. a < c < d < b. Using the same method as in Case 3, we have (r ) (r ) (s) (1) (s) [E a,b;i, j , E c,d;h,k ] = E a,b;i, j , (−1)c+1 [E c,c+1;h, f 1 , E c+1,d; f1 ,k ] (r ) (1) (s) = (−1)c+1 [E a,b;i, j , E c,c+1;h, f 1 ], E c+1,d; f 1 ,k ] (1) (r ) (s) ±(−1)c+1 E c,c+1;h, f1 , [E a,b;i, j , E c+1,d; f1 ,k ] (1) (r ) (s) = 0 ± (−1)c+1 E c,c+1;h, f 1 , [E a,b;i, j , E c+1,d; f1 ,k ] (1) (1) = · · · = ± E c,c+1;h, f 1 , E c+1,c+2; f 1 , f2 , (r ) (s) . . . , [E a,b;i, j , E d−1,d; f d−1−c ,k ] · · · . (r )

(s)

By (8.9) again, the bracket [E a,b;i, j , E d−1,d; f d−1−c ,k ] = 0. Case 5. a < c < b < d. We prove this case by induction on d −b ≥ 1. When d −b = 1, we have (r ) (r ) (s) (s) (1) [E a,b;i, j , E c,b+1;h,k ] = E a,b;i, j , (−1)b [E c,b;h, j , E b,b+1; j,k ] (r ) (s) (1) = (−1)b [E a,b;i, j , E c,b;h, j ] , E b,b+1; j,k (s) (r ) (1) ±(−1)b E c,b;h, j , [E a,b;i, j , E b,b+1; j,k ] . Now the bracket in the first term is zero by Case 3, and we may rewrite the (r ) (s) whole second term as ±[E a,b+1;i,k , E c,b;h, j ], which is zero by Case 4. Assume that d − b > 1, then d − 1 > b. By (8.5), the bracket becomes (r ) (r ) (s) (s) (1) [E a,b;i, j , E c,d;h,k ] = E a,b;i, j , (−1)d−1 [E c,d−1;h, f , E d−1,d; f,k ] (r ) (s) (1) = (−1)d−1 [E a,b;i, j , E c,d−1;h, f ] , E d−1,d; f,k (s) (r ) (1) ± E c,d−1;h, f , [E a,b;i, j , E d−1,d; f,k ] . The bracket in the first term is zero by induction hypothesis, while the bracket in the second term is zero as well by Case 1. Case 6. a = c < b < d. (r ) (r ) (s) (1) (s) [E a,b;i, j , E a,d;h,k ] = E a,b;i, j , (−1)a+1 [E a,a+1;h, f , E a+1,d; f,k ] (r ) (1) (s) = (−1)a+1 [E a,b;i, j , E a,a+1;h, f ] , E a+1,d;h,k (1) (r ) (s) ± E a,a+1;h, f , [E a,b;i, j , E a+1,d; f,k ] . (r )

(s)

Note that [E a,b;i, j , E a+1,d; f,k ] = 0 by Case 5. Hence it is enough to show that (r )

(1)

[E a,b;i, j , E a,a+1;h, f ] = 0,

for all b > a.

(8.14)

We prove (8.14) by induction on b − a ≥ 1. When b − a = 1, it follows from (8.2). Now assume b − a > 1. By (8.5), we have (r ) (1) (r ) (1) (1) [ E a,b;i, j , E a,a+1;h, f ] = (−1)b−1 [ E a,b−1;i,g , E b−1,b;g, j ] , E a,a+1;h, f (r ) (1) (1) = (−1)b−1 E a,b−1;i,g , [E b−1,b;g, j , E a,a+1;h, f ] (1) (r ) (1) ±(−1)b−1 E b−1,b;g, j , [E a,b−1;i,g , E a,a+1;h, f ] .

(r )

(1)

Note that [E a,b−1;i,g , E a,a+1;h, f ] = 0 by induction hypothesis. Also by (8.2), (1)

(1)

[E b−1,b;g, j , E a,a+1;h, f ] = 0 unless b − 1 = a + 1. When b − 1 = a + 1, (8.14) (r )

(1)

becomes [E a,a+2;i, j , E a,a+1;h, f ], which is zero by (8.7). Case 7. a = c < b = d. We claim that (r )

(s)

[E a,b;i, j , E a,b;h,k ] = 0.

(8.15)

If b = a + 1, it follows directly from (8.2). If b > a + 1, we may expand one term in the bracket of (8.15) by (8.5) as follows: (r ) (s) (r ) (1) (s) [ E a,b;i, j , E a,b;h,k ] = (−1)b−1 [E a,b−1;i, f , E b−1,b; f, j ] , E a,b;h,k (r ) (1) (s) = (−1)b−1 E a,b−1;i, f , [E b−1,b; f, j , E a,b;h,k ] (1) (r ) (s) ±(−1)b−1 E b−1,b; f, j , [E a,b−1;i, f , E a,b;h,k ] . (1)

(s)

(r )

(s)

Note that [E b−1,b; f, j , E a,b;h,k ] = 0 by Case 3 and [E a,b−1;i, f , E a,b;h,k ] = 0 by Case 6. Therefore, we have proved (8.15). This completes the proof of claim*.

Proposition 8.4. The images of the monomials in Proposition 8.1 under are linearly independent. Proof. By Corollary 2.2, we may identify gr L Y (gl M|N ) = gr L Yμ with the loop superalgebra U (gl M|N [t]) via grrL−1 ti(rj ) −→ (−1)i E i j t r −1 . We consider the following composition: μ− ⊗ gr L Y μ0 ⊗ gr L Y μ+ gr L Y μ − gr L Y → gr L Yμ ∼ = U (gl M|N [t]). (r )

Let n a := μ1 + μ2 + · · · + μa for short. By Proposition 3.1, the image of E a,b;i, j (r )

(r )

(resp. D a;i, j , F b,a;i, j ) under the above composition map is (−1)n a +i E n a +i,n b + j t r −1 (resp. (−1)n a +i E n a +i,n a + j t r −1 , (−1)n b +i E n b +i,n a + j t r −1 ). By the PBW theorem for U (gl M|N [t]), the set of all monomials in (r ) grrL−1 Da;i, j | 1 ≤ a ≤ m + n, 1 ≤ i, j ≤ μa , r ≥ 1 (r ) ∪ grrL−1 E a,b;i, j | 1 ≤ a < b ≤ m + n, 1 ≤ i ≤ μa , 1 ≤ j ≤ μb , r ≥ 1 (r ) ∪ grrL−1 Fb,a;i, | 1 ≤ a < b ≤ m + n, 1 ≤ i ≤ μ , 1 ≤ j ≤ μ , r ≥ 1 b a j taken in certain fixed order forms a basis for gr L Yμ , and hence Proposition 8.4 follows.

Let Yμ0 , Yμ+ and Yμ− denote the subalgebras of Yμ generated by all the D’s, E’s and F’s, respectively. Along the proofs of Proposition 8.1 and Proposition 8.4, we have found the PBW bases for each of these algebras.

Corollary 8.5. (1) The set of monomials in $\{D^{(r)}_{a;i,j}\}_{1 \le a \le m+n,\ 1 \le i,j \le \mu_a,\ r \ge 1}$ taken in certain fixed order forms a basis for $Y^0_\mu$.

(2) The set of monomials in {E a,b;i, j }1≤a
(3) The set of monomials in {Fb,a;i, j }1≤a
References

[BK1] Brundan, J., Kleshchev, A.: Parabolic presentations of the Yangian Y(gl_n). Commun. Math. Phys. 254, 191–220 (2005)
[BK2] Brundan, J., Kleshchev, A.: Shifted Yangians and finite W-algebras. Adv. Math. 200, 136–195 (2006)
[CP] Chari, V., Pressley, A.: A guide to quantum groups. Cambridge: Cambridge University Press, 1994
[D1] Drinfeld, V.: Hopf algebras and the quantum Yang-Baxter equation. Soviet Math. Dokl. 32, 254–258 (1985)
[D2] Drinfeld, V.: A new realization of Yangians and quantized affine algebras. Soviet Math. Dokl. 36, 212–216 (1988)
[FRT] Faddeev, L., Reshetikhin, N., Takhtadzhyan, L.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193–225 (1990)
[GKLLRT] Gelfand, I., Krob, D., Lascoux, A., Leclerc, B., Retakh, V., Thibon, J.-Y.: Non-commutative symmetric functions. Adv. Math. 112, 218–348 (1995)
[Go] Gow, L.: Gauss decomposition of the Yangian Y(gl_{m|n}). Commun. Math. Phys. 276, 799–825 (2007)
[GR] Gelfand, I., Retakh, V.: Quasideterminants, I. Selecta Math. 3, 517–546 (1997)
[KRS] Kulish, P., Reshetikhin, N., Sklyanin, E.: Yang-Baxter equation and representation theory. Lett. Math. Phys. 5, 393–403 (1981)
[Na] Nazarov, M.: Quantum Berezinian and the classical Capelli identity. Lett. Math. Phys. 21, 123–131 (1991)
[MNO] Molev, A., Nazarov, M., Olshanskii, G.: Yangians and classical Lie algebras. Russ. Math. Surv. 51, 205–282 (1996)
[Mo] Molev, A.: Yangians and classical Lie algebras. Mathematical Surveys and Monographs, 143, Providence, RI: American Mathematical Society, 2007
[Ta] Tarasov, V.: Irreducible monodromy matrices for an R-matrix of the XYZ-model and lattice local quantum Hamiltonians. Theoret. Math. Phys. 63, 440–454 (1985)
[TF] Takhtadzhyan, L., Faddeev, L.: The quantum method of the inverse problem and the Heisenberg XYZ-model. Russ. Math. Surv. 34, 11–68 (1979)

Communicated by Y. Kawahigashi

Commun. Math. Phys. 307, 261–273 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1303-0

Communications in

Mathematical Physics

The Holst Action by the Spectral Action Principle Frank Pfäffle, Christoph A. Stephan Institut für Mathematik, Universität Potsdam, Am Neuen Palais 10, 14469 Potsdam, Germany. E-mail: [email protected]; [email protected] Received: 10 February 2011 / Accepted: 21 February 2011 Published online: 29 July 2011 – © Springer-Verlag 2011

Abstract: We investigate the Holst action for closed Riemannian 4-manifolds with orthogonal connections. For connections whose torsion has zero Cartan type component we show that the Holst action can be recovered from the heat asymptotics for the natural Dirac operator acting on left-handed spinor fields. 1. Introduction Connes’ spectral action principle ([Co96]) states that any reasonable physical action should be deducible from the spectrum of some suitable Dirac operator. One of the impressive achievements of the spectral action principle is the Chamseddine-Connes spectral action ([CC97]) which comprises the Einstein-Hilbert action of general relativity and the bosonic part of the action of the standard model of particle physics. It gives a conceptual explanation for the Higgs potential in order to have the electro-weak symmetry breaking, and it allows to put constraints on the mass of the Higgs boson. The present article is intended to show how the spectral action principle can be used to derive the Holst action. Loop Quantum Gravity (LQG) is a very promising and successful candidate for a theory of quantum gravity (see [Rov04] and [Th07] for an introduction). Important ingredients of the quantisation procedure are the canonical variables of Ashtekar type ([As86,As87]). In order to have such variables one considers the Holst action ([Ho96]) which is a modification of the Einstein-Cartan-Hilbert action with the same critical points. Recently, a large class of modified actions with the same critical points as the Einstein-Cartan-Hilbert action has been proposed ([DVL10]). This raises the question if the Holst action is distinguished in this large class of actions. We will see how the spectral action principle gives a conceptual explanation for the Holst action for closed Riemannian 4-manifolds, not only within the class of actions proposed in [DVL10]. 
The underlying geometric objects we will consider are orthogonal connections with general torsion in the sense of É. Cartan (see [Ca23,Ca24,Ca25]). (For an overview of the physical consequences of Einstein-Cartan theory in the Lorentzian setting we refer to [HHKN76] and [Sh02].) The torsion of any orthogonal connection decomposes into a vectorial component, a totally anti-symmetric one and one of Cartan type. In Sect. 2 we will recapitulate this decomposition and insert it into the Holst action in order to discuss the appearance of critical values of the Barbero-Immirzi parameter. In Sect. 3 we will consider the classical Dirac operator $D$ associated to such an orthogonal connection. The Cartan type component of the torsion does not affect $D$, and $D$ is not symmetric if the vectorial component of the torsion is non-zero. We will derive a Lichnerowicz formula for $D^*D$ and deduce the heat trace asymptotics for the restriction of $D^*D$ to the left-handed spinor fields. It turns out (in Cor. 3.5) that the second term in these asymptotics gives exactly the Holst action if one considers connections with zero Cartan type torsion. Furthermore the constraints from the spectral action principle allow us to fix the value of the Barbero-Immirzi parameter ([Ba95,Im97]), which is a free parameter in LQG.

2. The Einstein-Cartan-Hilbert Action and the Holst Action

For the convenience of the reader let us briefly recall the classical Cartan classification of orthogonal connections (see [Ca25, Chap. VIII]); we will adopt the notations of [TV83, Chap. 3]. We consider an $n$-dimensional manifold $M$ equipped with some Euclidean metric $g$ and some orientation (in order to have a volume form and a Hodge $*$-operator). Let $\nabla^g$ denote the Levi-Civita connection on the tangent bundle. For any affine connection $\nabla$ on the tangent bundle there exists a (2,1)-tensor field $A$ such that

$$\nabla_X Y = \nabla^g_X Y + A(X, Y) \tag{1}$$

for all vector fields $X, Y$. We will require all connections $\nabla$ to be orthogonal, i.e. compatible with the scalar product given by the Euclidean metric $g$. Therefore, one has $g(A(X,Y),Z) = -g(Y,A(X,Z))$ for any tangent vectors $X, Y, Z \in T_pM$. The induced (3,0)-tensor is given by $A_{XYZ} = g(A(X,Y),Z)$. Hence, the space of all possible torsion tensors on $T_pM$ is

$$\Lambda^1\otimes\Lambda^2 = \Big\{ A \in \textstyle\bigotimes^3 T_p^*M \;\Big|\; A_{XYZ} = -A_{XZY} \ \ \forall X,Y,Z \in T_pM \Big\}.$$

It carries a natural Euclidean scalar product, which reads for any orthonormal basis $e_1,\dots,e_n$ of $T_pM$ as

$$\langle A, B\rangle = \sum_{i,j,k=1}^{n} A_{e_ie_je_k}\,B_{e_ie_je_k}, \tag{2}$$

and the orthogonal group $O(n)$ acts on it by $(\alpha A)_{XYZ} = A_{\alpha^{-1}(X)\,\alpha^{-1}(Y)\,\alpha^{-1}(Z)}$. The corresponding norm is $\|A\|^2 = \langle A, A\rangle$. Then, one has the following decomposition of $\Lambda^1\otimes\Lambda^2$ into irreducible $O(n)$-subrepresentations:

$$\Lambda^1\otimes\Lambda^2 = \mathcal{V}(T_pM)\oplus\mathcal{T}(T_pM)\oplus\mathcal{S}(T_pM).$$

This decomposition is orthogonal with respect to $\langle\cdot,\cdot\rangle$, and it is given by

$$\begin{aligned}
\mathcal{V}(T_pM) &= \big\{ A\in\Lambda^1\otimes\Lambda^2 \;\big|\; \exists V \text{ s.t. } \forall X,Y,Z:\ A_{XYZ} = g(X,Y)\,g(V,Z) - g(X,Z)\,g(V,Y) \big\},\\
\mathcal{T}(T_pM) &= \big\{ A\in\Lambda^1\otimes\Lambda^2 \;\big|\; \forall X,Y,Z:\ A_{XYZ} = -A_{YXZ} \big\},\\
\mathcal{S}(T_pM) &= \Big\{ A\in\Lambda^1\otimes\Lambda^2 \;\Big|\; \forall X,Y,Z:\ A_{XYZ}+A_{YZX}+A_{ZXY} = 0 \ \text{ and } \ \sum_{a=1}^{n}A(e_a,e_a,Z) = 0 \Big\}.
\end{aligned}$$

The connections whose torsion tensor is contained in $\mathcal{V}$ are called vectorial. Those whose torsion tensor is in $\mathcal{T}$ are called totally anti-symmetric, and those with torsion tensor in $\mathcal{S}$ are called of Cartan type. From this decomposition we get that for any orthogonal connection $\nabla$ as in (1) there exist a vector field $V$, a 3-form $T$ and a (3,0)-tensor field $S$ with $S_p \in \mathcal{S}(T_pM)$ for any $p\in M$ such that

$$A(X,Y) = g(X,Y)\,V - g(V,Y)\,X + T(X,Y,\cdot)^\sharp + S(X,Y,\cdot)^\sharp, \tag{3}$$

these $V, T, S$ are unique. As usual $\sharp : T_p^*M \to T_pM$ denotes the canonical isomorphism induced by $g$. The scalar curvature of this orthogonal connection is

$$R = R^g + 2(n-1)\,\mathrm{div}^g(V) - (n-1)(n-2)\,|V|^2 - \|T\|^2 + \tfrac12\|S\|^2, \tag{4}$$

where $R^g$ is the scalar curvature of $\nabla^g$ and $\mathrm{div}^g$ denotes the divergence taken with respect to $\nabla^g$ (see e.g. [PS11, Lemma 2.5]).

Let $(\theta^a)_{a=1,\dots,n}$ denote the dual frame of $(e_a)_{a=1,\dots,n}$, i.e. $\theta^a(\cdot) = g(e_a,\cdot)$. Then the volume form is $\mathrm{dvol} = \theta^1\wedge\dots\wedge\theta^n$. For $k$-forms there is a natural scalar product $\langle\cdot,\cdot\rangle_k$ such that the elements $\theta^{i_1}\wedge\dots\wedge\theta^{i_k}$ with $i_1<\dots<i_k$ form an orthonormal basis of $\Lambda^k(T_p^*M)$ (compare [Bl81, Def. 0.1.4]). Furthermore we have the Hodge operator $* : \Lambda^k(T_p^*M)\to\Lambda^{n-k}(T_p^*M)$, and for $\omega,\eta\in\Lambda^k(T_p^*M)$ one has $\omega\wedge *\eta = \langle\omega,\eta\rangle_k\,\mathrm{dvol}$. For $n=4$ and $k=2$ we have $** = \mathrm{id}$; this decomposes the space of 2-forms into the selfdual and the anti-selfdual ones$^1$: $\Lambda^2 = \Lambda^2_+\oplus\Lambda^2_-$, where $\Lambda^2_\pm$ is the $\pm1$-eigenspace of $*$. The (anti-)selfdual component of $\mathcal{S}$ is denoted by $\mathcal{S}^\pm = \mathcal{S}\cap(\Lambda^1\otimes\Lambda^2_\pm)$. Thereby we obtain the decomposition

$$\Lambda^1\otimes\Lambda^2 = \mathcal{V}\oplus\mathcal{T}\oplus\mathcal{S}^+\oplus\mathcal{S}^- \tag{5}$$

which is orthogonal w.r.t. the scalar product given in (2) and decomposes the component $S$ from (3) into $S = S_+ + S_-$.

In LQG (see [Rov04] or [Th07]) one considers the case $n=4$, and most of the local computations are done in Cartan’s moving frame formalism. Let $(e_a)_{a=1,\dots,4}$ be a local positively oriented orthonormal frame of $TM$ and $(\theta^a)_{a=1,\dots,4}$ its dual frame.

$^1$ Note that in Lorentzian signature $** = -\mathrm{id}$ and hence $*$ has eigenvalues $\pm i$.
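The decomposition (3) can also be checked numerically. The following is a minimal sketch and not part of the original argument: it assumes NumPy, stores a torsion tensor by its components $A_{abc}$ in an orthonormal frame, and uses two formulas that follow from (3) and the definitions of $\mathcal{V}$, $\mathcal{T}$, $\mathcal{S}$, namely $V_d = \frac{1}{n-1}\sum_a A_{aad}$ and $3T_{abc} = A_{abc}+A_{bca}+A_{cab}$. The function names `random_torsion`, `decompose` and `norm_sq` are illustrative only.

```python
import numpy as np


def random_torsion(n, rng):
    """Random components A[a, b, c] with A[a, b, c] = -A[a, c, b]."""
    B = rng.standard_normal((n, n, n))
    return B - B.transpose(0, 2, 1)


def decompose(A):
    """Split A into its vectorial, totally anti-symmetric and Cartan-type parts."""
    n = A.shape[0]
    delta = np.eye(n)
    # Tracing (3) over the first two slots leaves (n - 1) V.
    V = np.einsum("aad->d", A) / (n - 1)
    # Vectorial part: g(X, Y) g(V, Z) - g(X, Z) g(V, Y).
    A_vec = np.einsum("ab,c->abc", delta, V) - np.einsum("ac,b->abc", delta, V)
    # The cyclic sum kills the vectorial and Cartan-type parts and gives 3 T.
    T = (A + A.transpose(1, 2, 0) + A.transpose(2, 0, 1)) / 3.0
    # What remains is the Cartan-type part.
    S = A - A_vec - T
    return V, A_vec, T, S


def norm_sq(X):
    """Squared norm induced by the scalar product (2)."""
    return float(np.sum(X * X))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = random_torsion(4, rng)
    V, A_vec, T, S = decompose(A)
    print("||A||^2               :", norm_sq(A))
    print("sum of the three parts:", norm_sq(A_vec) + norm_sq(T) + norm_sq(S))
```

The two printed numbers agree up to rounding, which reflects the orthogonality of the three components with respect to (2).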


To each orthogonal connection $\nabla$ as above one associates the connection 1-forms $\omega^a_b$, the curvature 2-forms $\Omega^a_b$ and the torsion 2-forms $\Theta^a$, which are given by

$$\begin{aligned}
\omega^a_b(X) &= g(\nabla_X e_b, e_a),\\
\Omega^a_b(X,Y) &= g(\mathrm{Riem}(X,Y)e_b, e_a) = g(\nabla_X\nabla_Y e_b - \nabla_Y\nabla_X e_b - \nabla_{[X,Y]}e_b,\ e_a),\\
\Theta^a(X,Y) &= g(A(X,Y)-A(Y,X),\ e_a) = g(\nabla_XY - \nabla_YX - [X,Y],\ e_a).
\end{aligned}$$

For these forms one has the following structure equations:

$$\Omega^a_b = d\omega^a_b + \sum_c \omega^a_c\wedge\omega^c_b, \qquad \Theta^a = d\theta^a + \sum_c \omega^a_c\wedge\theta^c.$$

With respect to the given frame one defines the translational Chern-Simons form by

$$C_{TT} = \sum_a \Theta^a\wedge\theta^a.$$

With the structure equations one obtains the Nieh-Yan equation (see [NY82]):

$$dC_{TT} = \sum_a \Theta^a\wedge\Theta^a + \sum_{a,b}\Omega^a_b\wedge\theta^b\wedge\theta^a. \tag{6}$$
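For orientation, here is a sketch of the standard computation behind (6). It is not spelled out in the text; it uses only the two structure equations and the anti-symmetry $\omega^a_b = -\omega^b_a$, which holds because $\nabla$ is orthogonal and the frame is orthonormal.

```latex
% Step 1: differentiate the second structure equation and insert both
% structure equations; the terms cubic in (omega, omega, theta) cancel.
\begin{align*}
d\Theta^a
  &= \sum_c d\omega^a_c\wedge\theta^c - \sum_c \omega^a_c\wedge d\theta^c
   = \sum_c \Omega^a_c\wedge\theta^c - \sum_c \omega^a_c\wedge\Theta^c .
\end{align*}
% Step 2: apply the Leibniz rule to C_TT (Theta^a is a 2-form, so no sign appears)
% and use d theta^a = Theta^a - sum_c omega^a_c ^ theta^c.
\begin{align*}
dC_{TT}
  &= \sum_a \big( d\Theta^a\wedge\theta^a + \Theta^a\wedge d\theta^a \big) \\
  &= \sum_{a,c} \Omega^a_c\wedge\theta^c\wedge\theta^a
     + \sum_a \Theta^a\wedge\Theta^a
     - \sum_{a,c} \big(\omega^a_c + \omega^c_a\big)\wedge\Theta^c\wedge\theta^a \\
  &= \sum_a \Theta^a\wedge\Theta^a
     + \sum_{a,b} \Omega^a_b\wedge\theta^b\wedge\theta^a .
\end{align*}
```

The last step drops the mixed term because $\omega^a_c + \omega^c_a = 0$.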

Proposition 2.1. One has $C_{TT} = 6T$, where $T$ is the totally anti-symmetric component of the torsion as in (3).

Proof. We compute

$$\begin{aligned}
\Big(\sum_a \Theta^a\wedge\theta^a\Big)(e_k,e_\ell,e_m)
 &= \sum_a \big(\Theta^a(e_k,e_\ell)\,\delta_{am} + \Theta^a(e_\ell,e_m)\,\delta_{ak} + \Theta^a(e_m,e_k)\,\delta_{a\ell}\big)\\
 &= \Theta^m(e_k,e_\ell) + \Theta^k(e_\ell,e_m) + \Theta^\ell(e_m,e_k)\\
 &= \sum_\sigma \mathrm{sign}(\sigma)\,A(e_{\sigma(k)},e_{\sigma(\ell)},e_{\sigma(m)}),
\end{aligned}$$

where the last summation is taken over all permutations $\sigma$ of $\{k,\ell,m\}$. This is the anti-symmetrisation of $A\in\Lambda^1\otimes\Lambda^2$ and therefore equals $6T$.

This shows in particular that $C_{TT}$ is globally defined, i.e. independent of the choice of the moving frame. Sometimes it is stated that $\int_M dC_{TT}$ is a topological invariant, and it is called the Nieh-Yan invariant.

Corollary 2.2. Assume that the totally anti-symmetric component $T$ of the torsion has compact support which avoids the boundary of $M$. Then $\int_M dC_{TT} = 0$.

For the case when $M$ is closed the corollary was already shown in [GWZ99] by means of Chern-Weil theory. It also holds in the Lorentzian setting and was already implicitly used e.g. in [DVL10].

If the support of $T$ meets the boundary, Stokes’ theorem gives simple formulas for $\int_M dC_{TT}$. Such terms have been considered e.g. in [Ba10]. The Nieh-Yan equation is remarkable since the second summand equals the density of the Holst term (see [Ho96])

$$C_H = \sum_{a,b}\Omega^a_b\wedge\theta^b\wedge\theta^a. \tag{7}$$

Proposition 2.3. For any orthogonal connection $\nabla$ one finds

$$C_H = 6\,dT + 12\,\langle T, *V^\flat\rangle_3\,\mathrm{dvol} - \tfrac12\big(\|S_+\|^2 - \|S_-\|^2\big)\,\mathrm{dvol}$$

with respect to the decomposition (5), where $V^\flat(\cdot) = g(V,\cdot)$ is the dual form of the vector field $V$.

Proof. To simplify notation we abbreviate $\theta^{ab} = \theta^a\wedge\theta^b$, $\theta^{abc} = \theta^a\wedge\theta^b\wedge\theta^c$ and set $A_{abc} = A(e_a,e_b,e_c)$, and likewise for all components of the decomposition (5), e.g. $T_{abc} = T(e_a,e_b,e_c)$, and we set $V_d = g(V,e_d)$. We define the 2-forms $A^a = \frac12\sum_{b,c}A_{abc}\,\theta^{bc}$ and likewise $V^a$, $T^a$ etc. From the definition of $\Theta^a$ we get

$$\Theta^a(e_b,e_c) = T_{bca}+V_{bca}+S_{bca}-T_{cba}-V_{cba}-S_{cba} = 2T_{abc}-V_{abc}-S_{abc}.$$

Therefore $\Theta^a = 2T^a - V^a - S^a$. In the following we will consider the terms occurring in

$$\sum_a\Theta^a\wedge\Theta^a = 4\sum_a T^a\wedge T^a - 4\sum_a T^a\wedge S^a - 4\sum_a T^a\wedge V^a + \sum_a V^a\wedge V^a + 2\sum_a V^a\wedge S^a + \sum_a S^a\wedge S^a.$$

For the first three terms we calculate

$$\begin{aligned}
4\sum_a T^a\wedge A^a
 &= \sum_{a,b,c,b',c'} T_{abc}\,A_{ab'c'}\,\theta^{bcb'c'}\\
 &= \sum_{a,b,c,d} T_{abc}\big(A_{aad}\,\theta^{bcad} + A_{ada}\,\theta^{bcda}\big)\\
 &= 2\sum_{a,b,c,d} T_{abc}\,A_{aad}\,\theta^{abcd}.
\end{aligned} \tag{8}$$

For the second equality we observe that $\theta^{bcb'c'}\neq 0$ only if $b,c,b',c'$ are pairwise distinct and $T_{abc}\neq 0$ only if $a,b,c$ are pairwise distinct. As $1\le a,b,c,b',c'\le 4$, only summands with $a=b'$ or $a=c'$ can contribute.

For $A=T$ we get from (8) that $\sum_a T^a\wedge T^a = 0$. For $A=S$ we convince ourselves that $\sum_a T^a\wedge S^a = 0$ by considering the sum in (8) with fixed $d$, for example $d=4$:

$$\begin{aligned}
\sum_{a,b,c} T_{abc}\,S_{aa4}\,\theta^{abc4}
 &= S_{114}\big(T_{123}\theta^{1234}+T_{132}\theta^{1324}\big) + S_{224}\big(T_{213}\theta^{2134}+T_{231}\theta^{2314}\big) + S_{334}\big(T_{312}\theta^{3124}+T_{321}\theta^{3214}\big)\\
 &= 2\,T_{123}\,\big(S_{114}+S_{224}+S_{334}\big)\,\theta^{1234} = 0,
\end{aligned}$$

since $S_{444}=0$ and the trace of $S$ over the first two entries vanishes. For $A=V$ we have $A_{aad} = V_d - \delta_{ad}V_a$, which we insert into (8):

$$-4\sum_a T^a\wedge V^a = -2\sum_{a,b,c,d} T_{abc}\,V_d\,\theta^{abcd} = -2\Big(\sum_{a,b,c}T_{abc}\,\theta^{abc}\Big)\wedge\Big(\sum_d V_d\,\theta^d\Big) = -12\,T\wedge V^\flat = -12\,\langle T,*V^\flat\rangle_3\,\mathrm{dvol}.$$

Similarly, we obtain

$$\sum_a V^a\wedge A^a = -\frac12\sum_{a,b,c,d} V_a\,A_{bcd}\,\theta^{abcd}, \tag{9}$$

which is zero for $A=V$. In the case of $A=S$ in (9) we notice

$$-\sum_{a,b,c,d} V_a\,S_{bcd}\,\theta^{abcd} = \sum_{a,b,c,d} V_a\,(S_{cdb}+S_{dbc})\,\theta^{abcd} = 2\sum_{a,b,c,d} V_a\,S_{bcd}\,\theta^{abcd} = 0.$$

Finally, with $*S^a = S^a_+ - S^a_-$ and $\sum_a\langle S^a_{(\pm)},S^a_{(\pm)}\rangle_2 = \frac12\|S_{(\pm)}\|^2$ we get

$$\sum_a S^a\wedge S^a = \sum_a\langle S^a,*S^a\rangle_2\,\mathrm{dvol} = \tfrac12\big(\|S_+\|^2-\|S_-\|^2\big)\,\mathrm{dvol}.$$

We conclude that $\sum_a\Theta^a\wedge\Theta^a = -12\,\langle T,*V^\flat\rangle_3\,\mathrm{dvol} + \frac12\big(\|S_+\|^2-\|S_-\|^2\big)\,\mathrm{dvol}$. With the Nieh-Yan equation (6) and Prop. 2.1 the claim follows.

This shows that $C_H$ depends only on the torsion of the connection but not on the Riemannian curvature of the underlying manifold. Observations of that kind have been made before in [HM86], [Me06] and [Ba10]. The Holst action$^2$ used in LQG is given by

$$I_H = \frac{1}{16\pi G}\int_M \rho^\nabla_\gamma\,\mathrm{dvol} = \frac{1}{16\pi G}\int_M\Big(R\,\mathrm{dvol} - \frac1\gamma\,C_H\Big),$$

where $G$ is Newton’s constant and $\gamma$ is the Barbero-Immirzi parameter ([Ba95,Im97]). The density of the Holst action reads:

$$\rho^\nabla_\gamma\,\mathrm{dvol} = \Big(R^g + 6\,\mathrm{div}^g(V) - 6\,|V|^2 - \|T\|^2 - \frac{12}{\gamma}\langle T,*V^\flat\rangle_3 + \frac12\Big(1+\frac1\gamma\Big)\|S_+\|^2 + \frac12\Big(1-\frac1\gamma\Big)\|S_-\|^2\Big)\,\mathrm{dvol} - \frac{6}{\gamma}\,dT. \tag{10}$$

$^2$ Before [Ho96] this action already appeared in [HMS80]; a sketch of its history can be found in [BHN11, Sect. III.D].


Assuming that both $T$ and $V$ have compact support and avoid the boundary of $M$ we obtain

$$I_H = \frac{1}{16\pi G}\int_M\Big(R^g - 6\,|V|^2 - \|T\|^2 - \frac{12}{\gamma}\langle T,*V^\flat\rangle_3 + \frac12\Big(1+\frac1\gamma\Big)\|S_+\|^2 + \frac12\Big(1-\frac1\gamma\Big)\|S_-\|^2\Big)\,\mathrm{dvol}.$$

If $\gamma=\pm1$ one can vary $S_+$ or $S_-$ without changing the value of $I_H$, thus obtaining more critical points than for the Einstein-Hilbert functional.$^3$ These critical values of the Barbero-Immirzi parameter are well known in LQG; we think our representation of the Holst action offers a clear geometric understanding of this fact.
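As a worked check, the density (10) and the special role of $\gamma=\pm1$ can be read off by inserting (4) with $n=4$ and Proposition 2.3 into $\rho^\nabla_\gamma\,\mathrm{dvol} = R\,\mathrm{dvol} - \frac1\gamma C_H$. The sketch below only rearranges formulas already stated; it uses $\|S\|^2 = \|S_+\|^2+\|S_-\|^2$, which follows from the orthogonality of (5).

```latex
\begin{align*}
R\,\mathrm{dvol}
  &= \Big(R^g + 6\,\mathrm{div}^g(V) - 6|V|^2 - \|T\|^2
     + \tfrac12\|S_+\|^2 + \tfrac12\|S_-\|^2\Big)\,\mathrm{dvol}, \\
-\tfrac{1}{\gamma}\,C_H
  &= -\tfrac{6}{\gamma}\,dT
     - \tfrac{12}{\gamma}\,\langle T, *V^\flat\rangle_3\,\mathrm{dvol}
     + \tfrac{1}{2\gamma}\big(\|S_+\|^2 - \|S_-\|^2\big)\,\mathrm{dvol}.
\end{align*}
% Adding the two lines, the Cartan-type terms combine to
%   (1/2)(1 + 1/gamma) ||S_+||^2 + (1/2)(1 - 1/gamma) ||S_-||^2 .
% gamma = +1 : the coefficient of ||S_-||^2 vanishes,
% gamma = -1 : the coefficient of ||S_+||^2 vanishes.
```

Under the compact support assumption the terms $6\,\mathrm{div}^g(V)\,\mathrm{dvol}$ and $\frac{6}{\gamma}dT$ integrate to zero, which is why they are absent from the integrated expression for $I_H$.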

3. Dirac Operators and the Spectral Action Principle

The spectral action principle ([Co96]) of noncommutative geometry ([Co94]) states that the whole information of physical reality is encoded in some suitable Dirac operator, and one should be able to extract any measurable quantity from its spectrum. In the following we want to discuss the relation between the classical Dirac operator and the Holst action. We consider an $n$-dimensional Riemannian manifold $M$ and we assume that $M$ carries a spin structure so that spinor fields are defined. Any orthogonal connection $\nabla$ as in (1) induces a unique connection acting on spinor fields (see [LM89, Chap. II.4] or [PS11, Sect. 4]) which we will also denote by $\nabla$. The Dirac operator associated to $\nabla$ is defined as

$$D\psi = \sum_{a=1}^{n} e_a\cdot\nabla_{e_a}\psi = D^g\psi + \frac14\sum_{a,b,c=1}^{n} A_{abc}\,e_a\cdot e_b\cdot e_c\cdot\psi = D^g\psi + \frac32\,T\cdot\psi - \frac{n-1}{2}\,V\cdot\psi, \tag{11}$$

where $D^g$ is the Dirac operator associated to the Levi-Civita connection and “$\cdot$” is the Clifford multiplication.$^4$ Using the fact that the Clifford multiplication by the vector field $V$ is skew-adjoint w.r.t. the hermitian product on the spinor bundle one observes that $D$ is symmetric with respect to the natural $L^2$-scalar product on spinors if and only if the vectorial component of the torsion vanishes, $V\equiv0$ (see [FS79] and [PS11], and [GS87] for the Lorentzian setting). We would like to stress that the Dirac operator $D$ stays pointwise the same if one changes the Cartan type component $S$ of the torsion (see e.g. [PS11, Lemma 4.7]).$^5$ Therefore the Dirac operator $D$ does not contain any information on the Cartan-type component $S$ of the torsion. We summarise:

Corollary 3.1. In general, neither $C_H$ nor $\int_M C_H$ nor $I_H$ can be recovered from the spectrum of $D$.

$^3$ In Lorentzian signature the critical values of the Barbero-Immirzi parameter are $\pm i$.
$^4$ For the Clifford relations we use the convention $X\cdot Y + Y\cdot X = -2\,g(X,Y)$ for any tangent vectors $X, Y$, and any $k$-form $\theta^{i_1}\wedge\dots\wedge\theta^{i_k}$ acts on some spinor $\psi$ by $\theta^{i_1}\wedge\dots\wedge\theta^{i_k}\cdot\psi = e_{i_1}\cdot\ldots\cdot e_{i_k}\cdot\psi$.
$^5$ In the Lorentzian case it is known that torsion of Cartan type does not contribute to the Dirac action under the integral [Sh02, Chap. 2.3].


Remark 3.2. For any compact spin manifold the Atiyah-Singer Index Theorem relates the index of the left-handed Dirac operator (mapping left-handed to right-handed spinor fields) to a topological invariant of the manifold (the $\hat{A}$-genus). The index of an elliptic operator depends only on its principal symbol (see e.g. [LM89, Cor. III.7.9]). Therefore, the index of the left-handed part of the Dirac operator defined in (11) does not depend on the torsion. The index density in the case of “$H$-torsions” (i.e. totally anti-symmetric torsion with $dT=0$) has been calculated in [Ki07].

Nevertheless, we will recover the Holst action from the heat trace asymptotics for $D^*D$ if we restrict to the case $S\equiv0$, which is the natural case when dealing with spinors. First, we derive the following Lichnerowicz formula:

Theorem 3.3. For the Dirac operator $D$ associated to the orthogonal connection $\nabla$ as given in (11) we have

$$\begin{aligned}
D^*D\psi = {} & \Delta\psi + \frac14 R^g\psi + \frac32\,dT\cdot\psi - \frac34\|T\|^2\psi\\
 & + \frac{n-1}{2}\,\mathrm{div}^g(V)\,\psi + \Big(\frac{n-1}{2}\Big)^2(2-n)\,|V|^2\psi\\
 & + 3(n-1)\big(T\cdot V\cdot\psi + (V\lrcorner T)\cdot\psi\big)
\end{aligned} \tag{12}$$

for any spinor field $\psi$, where $\Delta$ is the Laplacian associated to the connection

$$\widetilde\nabla_X\psi = \nabla^g_X\psi + \frac32\,(X\lrcorner T)\cdot\psi - \frac{n-1}{2}\,V\cdot X\cdot\psi - \frac{n-1}{2}\,g(V,X)\,\psi.$$

Proof. As Clifford multiplication by any 3-form is self-adjoint we have

$$D^*\psi = D^g\psi + \frac32\,T\cdot\psi + \frac{n-1}{2}\,V\cdot\psi.$$

We calculate

$$\begin{aligned}
D^*D\psi = {} & \Big(D^g+\tfrac32 T\Big)^2\psi - \Big(\frac{n-1}{2}\Big)^2 V\cdot V\cdot\psi + \frac{3(n-1)}{4}\,(V\cdot T - T\cdot V)\cdot\psi + \frac{n-1}{2}\,\big(V\cdot D^g - D^g V\big)\cdot\psi\\
= {} & \Big(D^g+\tfrac32 T\Big)^2\psi + \Big(\frac{n-1}{2}\Big)^2|V|^2\psi - \frac{3(n-1)}{2}\,\big(T\cdot V + (V\lrcorner T)\big)\cdot\psi\\
 & + \frac{n-1}{2}\Big(2V\cdot D^g\psi + 2\nabla^g_V\psi + \mathrm{div}^g(V)\,\psi - d(V^\flat)\cdot\psi\Big),
\end{aligned} \tag{13}$$

where we have used the relation $V\cdot T + T\cdot V = -2\,(V\lrcorner T)$ for the vector $V$ and the 3-form $T$ and the identity $D^gV + VD^g = -2\nabla^g_V - \mathrm{div}^g(V) + d(V^\flat)$. In order to calculate the Laplacian $\Delta$ associated to the connection $\widetilde\nabla$ we fix some $p\in M$ and choose the frame $(e_a)$ to be synchronous about $p$, i.e. $\nabla^g e_a|_p = 0$ for any $a=1,\dots,n$:

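$$\begin{aligned}
\Delta\psi = {} & -\sum_a \widetilde\nabla_{e_a}\widetilde\nabla_{e_a}\psi\\
= {} & -\sum_a\Big(\nabla^g_{e_a}+\tfrac32(e_a\lrcorner T)\Big)\Big(\nabla^g_{e_a}+\tfrac32(e_a\lrcorner T)\Big)\psi\\
 & + \frac{n-1}{2}\sum_a\Big(\nabla^g_{e_a}+\tfrac32(e_a\lrcorner T)\Big)\big(V\cdot e_a + g(V,e_a)\big)\psi\\
 & + \frac{n-1}{2}\sum_a\big(V\cdot e_a + g(V,e_a)\big)\Big(\nabla^g_{e_a}+\tfrac32(e_a\lrcorner T)\Big)\psi\\
 & - \Big(\frac{n-1}{2}\Big)^2\sum_a\big(V\cdot e_a + g(V,e_a)\big)\big(V\cdot e_a + g(V,e_a)\big)\psi\\
= {} & \Big(D^g+\tfrac32 T\Big)^2\psi - \frac14 R^g\psi - \frac32\,dT\cdot\psi + \frac34\|T\|^2\psi\\
 & + \frac{n-1}{2}\Big(V\cdot D^g\psi + \nabla^g_V\psi + \tfrac32(V\lrcorner T)\cdot\psi - d(V^\flat)\cdot\psi + \tfrac32\sum_a(e_a\lrcorner T)\cdot V\cdot e_a\cdot\psi\Big)\\
 & + \frac{n-1}{2}\Big(V\cdot D^g\psi + \nabla^g_V\psi + \tfrac32(V\lrcorner T)\cdot\psi + \tfrac32\sum_a V\cdot e_a\cdot(e_a\lrcorner T)\cdot\psi\Big)\\
 & - \Big(\frac{n-1}{2}\Big)^2(1-n)\,|V|^2\psi.
\end{aligned}$$

Here we have used $\big(D^g+\frac32T\big)^2 = -\sum_a\big(\nabla^g_{e_a}+\frac32(e_a\lrcorner T)\big)\big(\nabla^g_{e_a}+\frac32(e_a\lrcorner T)\big) + \frac14R^g + \frac32\,dT - \frac34\|T\|^2$, which is Thm. 6.2 of [AF04] adapted to our notation. Next we can deduce from $e_a\lrcorner T = \frac12\sum_{b,c}T_{abc}\,\theta^{bc}$ that $3T = \sum_a(e_a\lrcorner T)\cdot e_a = \sum_a e_a\cdot(e_a\lrcorner T)$ and we further simplify:

$$\begin{aligned}
\Delta\psi = {} & \Big(D^g+\tfrac32 T\Big)^2\psi - \frac14R^g\psi - \frac32\,dT\cdot\psi + \frac34\|T\|^2\psi\\
 & + \frac{n-1}{2}\Big(2V\cdot D^g\psi + 2\nabla^g_V\psi - d(V^\flat)\cdot\psi - 9\,(V\lrcorner T)\cdot\psi - 9\,T\cdot V\cdot\psi\Big) + \frac{(n-1)^3}{4}\,|V|^2\psi.
\end{aligned}$$

Together with (13) this yields the claim.

Now, let $k_t(x,y)$ denote the (smooth) kernel of the heat operator $\exp(-tD^*D)$. Then one has the well-known asymptotic expansion

$$k_t(x,x) \sim \frac{1}{(4\pi t)^2}\big(\alpha_0(x) + t\,\alpha_2(x) + t^2\alpha_4(x) + \cdots\big) \quad\text{for } t\to0$$

with $\alpha_0(x) = \mathrm{id}$. We have $\gamma_5 = e_1\cdot\ldots\cdot e_4 = \mathrm{dvol}$ and the projection on left-handed spinors is given by $P_L = \frac12(\mathrm{id}-\gamma_5)$. Now let us consider the restriction $P_LD^*DP_L$ to the left-handed spinors. We note that its kernel is given by $p_t(x,y) = P_L\circ k_t(x,y)\circ P_L$. Taking the trace over the (4-dimensional) spinor spaces we obtain the following asymptotics:

$$\mathrm{Tr}\big(p_t(x,x)\big) \sim \frac{1}{(4\pi t)^2}\big(\beta_0(x) + t\,\beta_2(x) + t^2\beta_4(x) + \cdots\big) \quad\text{for } t\to0. \tag{14}$$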

As $\alpha_0(x) = \mathrm{id}$ we have $\beta_0(x) = 2$ for any $x\in M$, and the function $\beta_2(x)$ is related to the density of the Holst action from (10) as follows. Restricting to orthogonal connections with $S\equiv0$ is natural since fermions are not able to perceive any torsion of Cartan type.

Theorem 3.4. Let $M$ be a compact Riemannian 4-manifold with spin structure. For a 3-form $T$ and a vector field $V$ consider the orthogonal connection $\nabla_XY = \nabla^g_XY + T(X,Y,\cdot)^\sharp + g(X,Y)V - g(V,Y)X$. Let $D$ denote the Dirac operator induced by $\nabla$, and consider its restriction $P_LD^*DP_L$ to the left-handed spinors. Then, for the term $\beta_2$ from the expansion (14) we have

$$\beta_2\,\mathrm{dvol} = -\frac16\,\rho^{\widehat\nabla}_\gamma\,\mathrm{dvol} + 6\,dT \tag{15}$$

for the orthogonal connection $\widehat\nabla$ given by $\widehat\nabla_XY = \nabla^g_XY - 3\,T(X,Y,\cdot)^\sharp + 3\,g(X,Y)V - 3\,g(V,Y)X$ and for the value $\gamma=1$ of the Barbero-Immirzi parameter.

Proof. We use the Lichnerowicz formula (12) and the explicit formula for $\alpha_2(x)$ from e.g. [Roe98, Prop. 7.19] to get

$$\alpha_2 = \frac16R^g - \frac14R^g - \frac32\,dT + \frac34\|T\|^2 - \frac32\,\mathrm{div}^g(V) + \frac92\,|V|^2 - 9\,\big(T\cdot V + V\lrcorner T\big).$$

By construction one has $\beta_2 = \frac12\mathrm{Tr}\big((1-\gamma_5)\alpha_2\big)$. We observe that the traces of $dT$, $T\cdot V$ and $V\lrcorner T$, taken over the 4-dimensional spinor space, all vanish since they act as 2-forms or 4-forms. So we get

$$\mathrm{Tr}(\alpha_2) = -\frac13R^g + 3\,\|T\|^2 - 6\,\mathrm{div}^g(V) + 18\,|V|^2.$$

For $a<b<c$ and $a'<b'<c'$ we get $\mathrm{Tr}(e_ae_be_ce_{a'}e_{b'}e_{c'}) = 4$ if $a=a'$, $b=b'$ and $c=c'$, and otherwise $\mathrm{Tr}(e_ae_be_ce_{a'}e_{b'}e_{c'}) = 0$. Hence, for 3-forms $T, T'$ we have $\mathrm{Tr}(T\,T') = 4\,\langle T,T'\rangle_3$. We remark that $V^\flat\gamma_5 = -{*V^\flat}$ and so $\mathrm{Tr}(T\cdot V\gamma_5) = -4\,\langle T,*V^\flat\rangle_3$. Furthermore, we have $\mathrm{Tr}(dT\,\gamma_5)\,\mathrm{dvol} = 4\,dT$ and $\mathrm{Tr}(V\lrcorner T\,\gamma_5)\,\mathrm{dvol} = 0$. This leads to $\mathrm{Tr}(\gamma_5\alpha_2)\,\mathrm{dvol} = -6\,dT + 36\,\langle T,*V^\flat\rangle_3\,\mathrm{dvol}$. We obtain

$$\begin{aligned}
\beta_2\,\mathrm{dvol} &= \frac12\big(\mathrm{Tr}(\alpha_2) - \mathrm{Tr}(\gamma_5\alpha_2)\big)\,\mathrm{dvol}\\
 &= -\frac16\Big(\big(R^g - 9\,\|T\|^2 + 18\,\mathrm{div}^g(V) - 54\,|V|^2 + 108\,\langle T,*V^\flat\rangle_3\big)\,\mathrm{dvol} - 18\,dT\Big).
\end{aligned}$$

Finally we compare this with the density of the Holst action $\rho^{\widehat\nabla}_\gamma\,\mathrm{dvol}$ for the orthogonal connection $\widehat\nabla$ and the Barbero-Immirzi parameter $\gamma=1$, which is given by

$$\rho^{\widehat\nabla}_\gamma\,\mathrm{dvol} = \big(R^g - 9\,\|T\|^2 + 18\,\mathrm{div}^g(V) - 54\,|V|^2 + 108\,\langle T,*V^\flat\rangle_3\big)\,\mathrm{dvol} + 18\,dT,$$

and establish (15).

Corollary 3.5. Let $M$ be a 4-dimensional compact manifold and $\nabla$ be an orthogonal connection without Cartan type component as in Theorem 3.4. Then we get for the second coefficient of the heat trace asymptotics for $P_LD^*DP_L$,

$$\int_M\beta_2\,\mathrm{dvol} = -\frac16\int_M\rho^{\widehat\nabla}_\gamma\,\mathrm{dvol} = -\frac{8\pi G}{3}\,I_H,$$

where $I_H$ denotes the Holst action for the connection $\widehat\nabla$ with Barbero-Immirzi parameter $\gamma=1$.

In other words Corollary 3.5 states that the spectral action principle naturally predicts the Holst action in the case of orthogonal connections without Cartan type torsion (which is invisible to fermions). The Barbero-Immirzi parameter then takes the critical value $\gamma=1$ and one has set $S_-$ and $S_+$ to zero in (10).

In [DVL10] it has been proposed to modify the Holst action by adding terms depending on the norm of the torsion $(\Theta,\Theta) = \sum_a\langle\Theta^a,\Theta^a\rangle_2$, and it is shown that such actions in general still have the same critical points as the Einstein-Cartan-Hilbert functional. Apart from considerations of quantisation (the need of canonical variables of Ashtekar type), Corollary 3.5 shows that the Holst action is special within this proposed larger class of actions.

There has been a controversy whether the term $dC_{TT} = 6\,dT$ could be obtained via anomaly calculations and what its significance in quantum field theory and its relevance for the Barbero-Immirzi parameter might be ([CZ97,OMBH97,KM01,CZ01]). In [BS10] the induced gravity approach delivers the value $\pm\frac19$ for the prefactor of the term $dC_{TT}$ if one additionally takes the specific particle content of the Standard Model into account. Within the approach of Connes’ spectral action principle a comparison of parameters would give the value 1 for the prefactor of the term $dC_{TT}$ (or the value $-1$ if we had projected on the right-handed spinors). This value would be independent of any specific particle model.
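The Clifford trace identities used in the proof of Theorem 3.4 can be verified with explicit matrices. The sketch below is an illustration rather than part of the proof: it assumes NumPy and one particular choice of $4\times4$ matrices $e_a$ satisfying the convention $e_ae_b + e_be_a = -2\delta_{ab}$ of footnote 4 (any other faithful choice gives the same traces). All function names are hypothetical.

```python
import itertools

import numpy as np

# Pauli matrices and the 2x2 identity.
S1 = np.array([[0, 1], [1, 0]], dtype=complex)
S2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
S3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def clifford_generators():
    """Four 4x4 matrices e_a with e_a e_b + e_b e_a = -2 delta_ab."""
    # Hermitian matrices squaring to +1 and anticommuting pairwise ...
    hermitian = [np.kron(S1, S1), np.kron(S1, S2), np.kron(S1, S3), np.kron(S2, I2)]
    # ... multiplied by i so that each generator squares to -1.
    return [1j * G for G in hermitian]


def gamma5(e):
    """Volume element gamma_5 = e_1 e_2 e_3 e_4."""
    return e[0] @ e[1] @ e[2] @ e[3]


def left_projector(e):
    """P_L = (id - gamma_5) / 2."""
    return 0.5 * (np.eye(4) - gamma5(e))


def triple_trace(e, s, t):
    """Tr(e_a e_b e_c e_a' e_b' e_c') for index triples s = (a, b, c), t = (a', b', c')."""
    left = e[s[0]] @ e[s[1]] @ e[s[2]]
    right = e[t[0]] @ e[t[1]] @ e[t[2]]
    return np.trace(left @ right)


if __name__ == "__main__":
    e = clifford_generators()
    print("Tr(P_L) =", np.trace(left_projector(e)).real)
    for s in itertools.combinations(range(4), 3):
        print(s, triple_trace(e, s, s).real)
```

The trace of $P_L$ reproduces $\beta_0 = 2$, and the triple traces reproduce the orthonormality statement behind $\mathrm{Tr}(T\,T') = 4\langle T,T'\rangle_3$.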
However, one should be aware that in these actions the Cartan type component of the torsion does not appear, unlike in the action $I_H$ which is considered to be the relevant one in LQG.

Acknowledgements. The authors appreciate funding by the Deutsche Forschungsgemeinschaft, in particular by the SFB Raum-Zeit-Materie. We would like to thank Christian Bär and Thomas Schücker for their support and helpful discussions. Furthermore, we are thankful to Friedrich Hehl for drawing our attention to literature that was previously unknown to us.

References

[AF04] Agricola, I., Friedrich, T.: On the holonomy of connections with skew-symmetric torsion. Math. Ann. 328(4), 711–748 (2004)
[As86] Ashtekar, A.: New variables for classical and quantum gravity. Phys. Rev. Lett. 57, 2244–2247 (1986)
[As87] Ashtekar, A.: New Hamiltonian formulation of general relativity. Phys. Rev. D (3) 36(6), 1587–1602 (1987)
[BHN11] Baekler, P., Hehl, F.W., Nester, J.M.: Poincaré gauge theory of gravity: Friedman cosmology with even and odd parity modes: analytic part. Phys. Rev. D (3) 83, 024001 (2011)
[Ba10] Banerjee, K.: Some aspects of Holst and Nieh-Yan terms in general relativity with torsion. Class. Quantum Grav. 27, 135012 (2010)
[Ba95] Barbero, J.F.: Real Ashtekar variables for Lorentzian signature space-times. Phys. Rev. D (3) 51(10), 5507–5510 (1995)
[Bl81] Bleecker, D.: Gauge theory and variational principles. Global Analysis Pure and Applied Series, Reading, MA: Addison-Wesley, 1981. Unabridged republication, Mineola, NY: Dover, 2005
[BS10] Broda, B., Szanecki, M.: A relation between the Barbero-Immirzi parameter and the standard model. Phys. Lett. B 690(1), 87–89 (2010)
[Ca23] Cartan, É.: Sur les variétés à connexion affine et la théorie de la relativité généralisée (première partie). Ann. Éc. Norm. Sup. 40, 325–412 (1923)
[Ca24] Cartan, É.: Sur les variétés à connexion affine et la théorie de la relativité généralisée (première partie, suite). Ann. Éc. Norm. Sup. 41, 1–25 (1924)
[Ca25] Cartan, É.: Sur les variétés à connexion affine et la théorie de la relativité généralisée (deuxième partie). Ann. Éc. Norm. Sup. 42, 17–88 (1925)
[CC97] Chamseddine, A., Connes, A.: The spectral action principle. Commun. Math. Phys. 186(3), 731–750 (1997)
[CZ97] Chandia, O., Zanelli, J.: Topological invariants, instantons, and the chiral anomaly on spaces with torsion. Phys. Rev. D (3) 55(12), 7580–7585 (1997)
[CZ01] Chandia, O., Zanelli, J.: Reply to: “Comment on: ‘Topological invariants, instantons, and the chiral anomaly on spaces with torsion’” by D. Kreimer and E. Mielke. Phys. Rev. D (3) 63(4), 048502 (2001)
[Co94] Connes, A.: Noncommutative geometry. San Diego, CA: Academic Press, 1994
[Co96] Connes, A.: Gravity coupled with matter and the foundation of noncommutative geometry. Commun. Math. Phys. 183(1), 155–176 (1996)
[DVL10] Dubois-Violette, M., Lagraa, M.: Abundance of local actions for the vacuum Einstein equations. Lett. Math. Phys. 91(1), 83–91 (2010)
[FS79] Friedrich, T., Sulanke, S.: Ein Kriterium für die formale Selbstadjungiertheit des Dirac-Operators. Colloq. Math. 40(2), 239–247 (1978/79)
[GS87] Göckeler, M., Schücker, T.: Differential Geometry, Gauge Theories, and Gravity. Cambridge Monographs on Mathematical Physics, Cambridge: Cambridge University Press, 1987
[GWZ99] Guo, H.-Y., Wu, K., Zhang, W.: On torsion and Nieh-Yan form. Commun. Theor. Phys. 32, 381–386 (1999)
[Im97] Immirzi, G.: Real and complex connections for canonical gravity. Class. Quantum Grav. 14(10), L177–L181 (1997)
[HHKN76] Hehl, F.W., von der Heyde, P., Kerlick, G.D., Nester, J.M.: General relativity with spin and torsion: foundations and prospects. Rev. Mod. Phys. 48, 393–416 (1976)
[HM86] Hehl, F.W., McCrea, J.D.: Bianchi identities and the automatic conservation of energy momentum and angular momentum in general relativistic field theories. Found. Phys. 16(3), 267–293 (1986)
[HMS80] Hojman, R., Mukku, C., Sayed, W.A.: Parity violation in metric-torsion theories of gravitation. Phys. Rev. D (3) 22(8), 1915–1921 (1980)
[Ho96] Holst, S.: Barbero’s Hamiltonian derived from a generalized Hilbert-Palatini action. Phys. Rev. D (3) 53(10), 5966–5969 (1996)
[Ki07] Kimura, T.: Index theorems on torsional geometries. J. High Energy Phys. 2007(8), 048, 44 pp. (2007) (electronic)
[KM01] Kreimer, D., Mielke, E.W.: Comment on: “Topological invariants, instantons, and the chiral anomaly on spaces with torsion” by O. Chandia and J. Zanelli. Phys. Rev. D (3) 63(4), 048501 (2001)
[LM89] Lawson, Jr., H.B., Michelsohn, M.-L.: Spin geometry. Princeton Mathematical Series, Princeton, NJ: Princeton University Press, 1989
[Me06] Mercuri, S.: Fermions in the Ashtekar-Barbero connection formalism for arbitrary values of the Immirzi parameter. Phys. Rev. D (3) 73(8), 084016 (2006)
[NY82] Nieh, H.T., Yan, M.L.: An identity in Riemann-Cartan geometry. J. Math. Phys. 23(3), 373–374 (1982)
[OMBH97] Obukhov, Y.N., Mielke, E.W., Budczies, J., Hehl, F.W.: On the chiral anomaly in non-Riemannian spacetimes. Found. Phys. 27(9), 1221–1236 (1997)
[PS11] Pfäffle, F., Stephan, C.A.: On Gravity, Torsion and the Spectral Action Principle. http://arxiv.org/abs/1101.1424v3 [math-ph], 2011
[Roe98] Roe, J.: Elliptic operators, topology and asymptotic methods. Second edition. Pitman Research Notes in Mathematics Series, 395, Harlow: Longman, 1998
[Rov04] Rovelli, C.: Quantum gravity, with a foreword by James Bjorken. Cambridge Monographs on Mathematical Physics, Cambridge: Cambridge University Press, 2004
[Sh02] Shapiro, I.L.: Physical aspects of the space-time torsion. Phys. Rept. 357, 113–213 (2002)
[Th07] Thiemann, T.: Modern canonical quantum general relativity, with a foreword by Chris Isham. Cambridge Monographs on Mathematical Physics, Cambridge: Cambridge University Press, 2007
[TV83] Tricerri, F., Vanhecke, L.: Homogeneous Structures on Riemannian manifolds. London Math. Soc. Lecture Notes Series, Vol. 83, Cambridge: Cambridge University Press, 1983

Communicated by A. Connes

Commun. Math. Phys. 307, 275–313 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1314-x


Strongly Focused Gravitational Waves Michael Reiterer, Eugene Trubowitz Department of Mathematics, ETH Zurich, 8092 Zurich, Switzerland. E-mail: [email protected] Received: 21 December 2009 / Accepted: 5 July 2011 Published online: 9 September 2011 – © Springer-Verlag 2011

Abstract: This paper contains a new proof of the formation of trapped spheres, in vacuum spacetimes, by the focusing of gravitational waves, from generic data. The first such result was obtained by Christodoulou (Zurich: Eur Math Soc, 2009). We exploit the same physical mechanism, but give a logically independent construction of these spacetimes.

1. Introduction

The vacuum spacetimes $(M, g)$ constructed in this paper have the following properties:

1. $M = \big\{(x^\mu)_{\mu=1,2,3,4} = (\xi^1,\xi^2,\underline{u},u) \in \mathbb{R}^2\times(0,1)\times(-\infty,u_0)\big\}$ with $u_0<0$. The plane $\mathbb{R}^2$ is identified with a punctured $S^2$ by stereographic projection. The metric $g$ extends smoothly to $S^2\times(0,1)\times(-\infty,u_0)$.

2. $g^{-1} = g^{ab}F_a\otimes F_b$, where $F_a = F_a{}^\mu\frac{\partial}{\partial x^\mu}$ are complex vector fields and

$$(F_a{}^\mu) = \begin{pmatrix} * & * & 0 & 0\\ * & * & 0 & 0\\ * & * & 0 & 1\\ 0 & 0 & * & 0 \end{pmatrix}, \qquad (g^{ab}) = \begin{pmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 0 & 0 & -1\\ 0 & 0 & -1 & 0 \end{pmatrix}.$$

The four tuple $F = (F_a)$ is a complex frame. The indices $a, b = 1,2,3,4$. Asterisk entries $*$ can be nonzero. Some are nonzero, because $F$ is a frame. We require: $F_3+F_4$ is future-directed, $F_4{}^3>0$ and $(\overline{F_1},\overline{F_2},\overline{F_3},\overline{F_4}) = (F_2,F_1,F_3,F_4)$. Thus, $g^{-1} = g^{ab}F_a\otimes F_b$ is real and has signature $(-,+,+,+)$.

3. For all $0<\underline{u}<1/2$,

$$(F_a{}^\mu) = \begin{pmatrix} \rho^{-1}e & +i\rho^{-1}e & 0 & 0\\ \rho^{-1}e & -i\rho^{-1}e & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad \rho = \underline{u}-u, \qquad e = \tfrac12\big(1+(\xi^1)^2+(\xi^2)^2\big).$$

The subset $0<\underline{u}<1/2$ of $M$ is isometric to a subset of Minkowski spacetime. The isometry is given by $\underline{u} = 2^{-1/2}(t+r)$, $u = 2^{-1/2}(t-r)$, with standard Minkowski time $t$, radius $r$, and standard stereographic coordinates $(\xi^1,\xi^2)$. Thus, the past light cone $t+r\le0$ in Minkowski spacetime can be smoothly attached to $M$ along $\underline{u}=0$.

4. There is a map $\mathrm{DATA} : \mathbb{R}^2\times(0,1)\to\mathbb{C}$ with

$$\lim_{u\to-\infty} u^2\big(F_1{}^1-\rho^{-1}e\big) = \lim_{u\to-\infty} i\,u^2\big(F_1{}^2-i\rho^{-1}e\big) = e\int_0^{\underline{u}} d\underline{u}'\,\mathrm{DATA}(\xi^1,\xi^2,\underline{u}').$$

The limits are taken at constant $\xi^1, \xi^2, \underline{u}$. Here, $u\to-\infty$ is interpreted as past null infinity, and DATA as initial data at past null infinity. It vanishes when $\underline{u}<\frac12$. Informally, DATA describes the incoming radiation. DATA uniquely determines the metric on $M$, if $\lim_{u\to-\infty}F_4{}^3 = 1$, and we make technical assumptions about the decay as $u\to-\infty$. These assumptions are not discussed in this Introduction.

5. The uniqueness statement in the last item is possible because the gauge has already been completely fixed,$^1$ by the asterisk pattern of $(F_a{}^\mu)$. The three zeros in the third (resp. fourth) column imply that $\underline{u}$ (resp. $u$) is a solution to the eikonal equation $g^{ab}F_a(\underline{u})F_b(\underline{u}) = 0$ (resp. $g^{ab}F_a(u)F_b(u) = 0$) and that its gradient$^2$ is $-F_4{}^3F_3$ (resp. $-F_4$). The two zeros in the lower-left corner imply that $\xi^1, \xi^2$ are transported along $F_4$. Thus, two eikonal equations, two transport equations, together with $\lim_{u\to-\infty}F_4{}^3 = 1$ and the Minkowskian data for $0<\underline{u}<\frac12$, fix the coordinates.

6. There is an explicit expansion in powers of $\frac1u$ for the frame, for $u$ large negative, with rigorous error estimates.

7. The intersections of level sets of $\underline{u}$ and $u$ are spacelike 2-dimensional spheres. For a generic class of initial data, some of them are trapped, in the sense that “the two systems of null geodesics which meet [the surface] orthogonally converge locally in future directions at [the surface]” [Pen]. The existence of trapped spheres is read off from the lowest order term of the $\frac1u$ expansion.

$^1$ Actually, there is still the local $U(1)$ gauge degree of freedom $(F_1,F_2)\to(e^{+i\theta}F_1, e^{-i\theta}F_2)$. It is fixed by a transport equation, but this is not discussed in this Introduction.
$^2$ The gradient of $f : M\to\mathbb{R}$ is the vector field $g^{ab}F_a(f)F_b$.

Strongly Focused Gravitational Waves

277

8. Let DATA be a C 10 norm3 of DATA. The analysis is tailored to large DATA . However, the bigger DATA , the smaller the interval u ∈ (−∞, u 0 ), with u 0 < 0, that we control. More precisely, |u 0 | ∼ DATA as DATA → ∞. The discussion above is framed in what we call the High Amplitude Picture. Another picture, the Regularized Picture, is better suited for giving proofs, and is used in almost the entire paper, including the statement of the main Theorem 8.6. The two pictures are related by a symmetry transformation4 , see Sect. 10. Modulo this transformation, Theorem 8.6 is precise about the technical issues that were glossed over in the above discussion. A third picture is introduced to compare the present paper with [Chr], here referred to as the Finite Mass Picture. Christodoulou [Chr] implements the focusing of gravitational waves by a dedicated geometric optics argument, that he calls the ‘short pulse method’. It consists of an appropriate geometric setup, not unlike Vaidya’s [Vai], and a self-consistent scheme of bounds for all the unknown quantities in terms of a small parameter δ > 0. Many bounds are of the singular form O(δ −α ), with α > 0. Our results are stronger than [Chr] in the sense that we construct semi-global solutions (all the way to past null infinity)5 , obtain an explicit asymptotic expansion, and control the solution for a longer time. Our results are slightly weaker from the point of view of regularity6 . However, the main point of this paper is its methodology. Here, it yields a new and very short construction of semi-global solutions to the vacuum Einstein equations with trapped spheres that form in evolution. We use the orthonormal frame formalism of Newman and Penrose [NP], and the key idea of hyperbolic reduction to symmetric hyperbolic systems due to Friedrich [Fr]. Our hyperbolic reduction is new, directly in a double null gauge (Sect. 2). 
The equations are brought into a relevant / irrelevant form that exhibits the essential constituents that have to be treated carefully, and sweeps everything else into ‘generic terms’ that one doesn’t need to know much about (Sect. 5 and the program in Appendix A). Formal power series solutions in $\tfrac1u$ (Sect. 6) and energy estimates (Sect. 7) are combined to construct classical solutions (Sect. 8). Under certain generic conditions, they contain trapped spheres (Sect. 9).

The initial value problem with data along the asymptotic characteristic surfaces $u = -\infty$, see above, is solved by taking the limit of a sequence of solutions to initial value problems with data on the spacelike level sets of $u + \underline{u}$. The elements of the sequence have no direct physical meaning, because they do not necessarily satisfy the constraint equations. However, the semi-global limit does.

A longer version of this paper, in which even standard proofs are written out in all detail, can be found on the arXiv preprint server⁷. The arXiv version in addition includes a proof of a ‘Minkowski to Schwarzschild transition’, and introduces an expansion, more powerful than the $\tfrac1u$ expansion, that can be used to continue the solutions further. For a summary of this expansion, see Sect. 11 of the present paper.

³ This norm takes the two patches on $S^2$ into account.
⁴ A symmetry of the vacuum Einstein equations in the frame formalism of Newman, Penrose, Friedrich.
⁵ Christodoulou underscores the importance of the semi-global limit [Chr, p. 4]: “[. . .] the physically interesting problem is the problem where the initial conditions are of arbitrarily low compactness, that is, arbitrarily far from already containing closed trapped surfaces [. . .]”.
⁶ The amplitude DATA can be taken to be $C^7$ in [Chr], as opposed to $C^{10}$ in this paper.
⁷ http://arxiv.org/abs/0906.3812 (version v1, June 2009).


M. Reiterer, E. Trubowitz

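2. Gauge Fixing and Hyperbolic Reduction

This section uses the Newman-Penrose-Friedrich formalism, see [NP] and [Fr].

Convention 2.1. Small Latin frame indices and small Greek coordinate indices run from 1 to 4. Pairs such as (ab) run over the ordered sequence (12), (31), (32), (41), (42), (34).

Definition 2.2 (Parametrization). Introduce the vector space of real dimension 31:

$$\mathcal{R} = \bigl\{ (e,\gamma,w) \in \mathbb{C}^5 \oplus \mathbb{C}^8 \oplus \mathbb{C}^5 \;\big|\; e_3, e_4, e_5, \gamma_2, \gamma_6 \in \mathbb{R} \bigr\}.$$

Let Param : $\mathcal{R} \ni (e,\gamma,w) \mapsto (F,\Gamma,W)$ be given by

$$F_a{}^\mu = \begin{pmatrix} e_1 & e_2 & 0 & 0 \\ \bar e_1 & \bar e_2 & 0 & 0 \\ e_4 & e_5 & 0 & 1 \\ 0 & 0 & e_3 & 0 \end{pmatrix},$$

$$\Gamma_{a(bc)} = \begin{pmatrix}
\gamma_3 + \bar\gamma_4 & \bar\gamma_7 & \gamma_6 & \gamma_1 & \gamma_2 & \gamma_3 - \bar\gamma_4 \\
-\gamma_4 - \bar\gamma_3 & \gamma_6 & \gamma_7 & \gamma_2 & \bar\gamma_1 & -\gamma_4 + \bar\gamma_3 \\
\gamma_8 - \bar\gamma_8 & 0 & 0 & -\gamma_3 + \bar\gamma_4 & \gamma_4 - \bar\gamma_3 & \gamma_8 + \bar\gamma_8 \\
0 & \bar\gamma_5 & \gamma_5 & 0 & 0 & 0
\end{pmatrix},$$

$$W_{(ab)(cd)} = \begin{pmatrix}
w_3 + \bar w_3 & \bar w_4 & -w_4 & w_2 & -\bar w_2 & w_3 - \bar w_3 \\
\bar w_4 & \bar w_5 & 0 & 0 & -\bar w_3 & -\bar w_4 \\
-w_4 & 0 & w_5 & -w_3 & 0 & -w_4 \\
w_2 & 0 & -w_3 & w_1 & 0 & w_2 \\
-\bar w_2 & -\bar w_3 & 0 & 0 & \bar w_1 & \bar w_2 \\
w_3 - \bar w_3 & -\bar w_4 & -w_4 & w_2 & \bar w_2 & w_3 + \bar w_3
\end{pmatrix},$$

and $\Gamma_{abc} = -\Gamma_{acb}$ and $W_{abcd} = -W_{bacd} = -W_{abdc}$.

Definition 2.3.

$$(g_{ab}) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \qquad
(g^{ab}) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix}. \tag{2.1}$$

Convention 2.4. Indices are raised and lowered with $(g_{ab})$ and its matrix inverse $(g^{ab})$.

Remark 2.5. $W_{abcd} = W_{cdab}$ and $W_{ijka} + W_{jkia} + W_{kija} = 0$ and $W^a{}_{iaj} = 0$. Equivalently, W has the algebraic symmetries of a Weyl tensor with respect to $(g_{ab})$. And $\overline{F_a{}^\mu} = F_{a'}{}^\mu$, $\overline{\Gamma_{abc}} = \Gamma_{a'b'c'}$, $\overline{W_{abcd}} = W_{a'b'c'd'}$ with $(1', 2', 3', 4') = (2, 1, 3, 4)$.

Definition 2.6 (Vacuum field). Let $\mathcal{U} \subset \mathbb{R}^4$ be open and $(x^\mu)$ standard Cartesian coordinates on $\mathbb{R}^4$. A sufficiently differentiable field $\Phi = (e,\gamma,w) : \mathcal{U} \to \mathcal{R}$ is a vacuum field iff $(F,\Gamma,W) = \mathrm{Param} \circ (e,\gamma,w)$ satisfies (a), (b), (c):

(a) $(F_a) = (F_a{}^\mu \tfrac{\partial}{\partial x^\mu})$ is a complex frame on $\mathcal{U}$, that is $\det(F_a{}^\mu) \neq 0$. And $F_4{}^3 > 0$. Let g be the real, Lorentzian metric with real⁸ inverse $g^{-1} = g^{ab} F_a \otimes F_b$.
(b) The connection given by $\nabla_{F_a} F_b = \Gamma_{ab}{}^c F_c$ is the Levi-Civita connection of g.

⁸ Reality follows from $(\overline{F_1}, \overline{F_2}, \overline{F_3}, \overline{F_4}) = (F_2, F_1, F_3, F_4)$.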


(c) $g\bigl([\nabla_{F_j}, \nabla_{F_k}]F_b - \nabla_{[F_j,F_k]}F_b,\, F_a\bigr) = W_{abjk}$.

Observe that (a), (b), (c) and Remark 2.5 imply that g is Ricci-flat.

Proposition 2.7 (Geometric interpretation of the gauge fixed by a vacuum field). Suppose $\Phi = (e,\gamma,w)$ is a vacuum field. Throughout this paper, we use the notation

$$(x^\mu) = (\xi^1, \xi^2, \underline{u}, u), \qquad (F_a) = (D, \overline{D}, N, L).$$

Then, with respect to the metric g and its Levi-Civita connection ∇:

Part 1. $u$ and $\underline{u}$ solve the eikonal equation $g^{-1}(du, du) = g^{-1}(d\underline{u}, d\underline{u}) = 0$ with $g^{-1}(du, d\underline{u}) < 0$. The real vector field L is minus the gradient of $u$, that is $L = -g^{-1}(du, \cdot\,)$, and $e_3 = L(\underline{u}) > 0$. The real vector field $e_3 N$ is minus the gradient of $\underline{u}$, that is $e_3 N = -g^{-1}(d\underline{u}, \cdot\,)$. The coordinates $\xi^1$, $\xi^2$ satisfy the transport equations $L(\xi^1) = L(\xi^2) = 0$. The complex vector field D is such that $2^{1/2}(\mathrm{Re}\, D, \mathrm{Im}\, D)$ is a real orthonormal frame for the spacelike $\ker du \cap \ker d\underline{u}$, and D satisfies the transport equation $g(\nabla_L D, \overline{D}) = 0$.

Part 2. Declare $N + L$ to be future-directed and let $S_{\underline{u},u}$ be the intersection of the level sets of $\underline{u}$ and $u$. The traces of the future-directed second fundamental forms of $S_{\underline{u},u}$ relative to L and N are $g(\nabla_D L, \overline{D}) + g(\nabla_{\overline{D}} L, D) = 2\gamma_2$ and $g(\nabla_D N, \overline{D}) + g(\nabla_{\overline{D}} N, D) = 2\gamma_6$. Recall that $\gamma_2$ and $\gamma_6$ are real. The traceless parts of the second fundamental forms are determined by the complex $\gamma_1$ and $\gamma_7$.

Proof. These are transcriptions of different properties of the matrices in Definition 2.2, in the context of Definition 2.6. For instance, $u$ solves the eikonal equation, because $g^{-1}(du, du) = g^{ab}F_a(u)F_b(u) = 2F_1{}^4 F_2{}^4 - 2F_3{}^4 F_4{}^4 = 0$.

Remark 2.8. $S_{\underline{u},u}$ is locally trapped, in the sense of [Pen], if and only if $\gamma_2, \gamma_6 < 0$.

Proposition 2.9 (Local realizability of the gauge fixed by a vacuum field). Every Ricci-flat Lorentzian manifold is locally isometric to a pair $(\mathcal{U}, g)$ as in Definition 2.6.

Proof. Given g and its Levi-Civita connection ∇ on a manifold, Proposition 2.7, Part 1 is a step by step outline for the local construction of $\underline{u}$, $u$, $L$, $e_3$, $N$, $\xi^1$, $\xi^2$, $D$. The construction is not unique, because one has to specify initial data for the eikonal and transport equations. The construction yields a field Φ with the desired properties.

Definition 2.10. To any sufficiently differentiable $\Phi : \mathcal{U} \to \mathcal{R}$ associate (T, U, V) by:

$$\begin{aligned}
T_{ab}{}^\mu &= F_b(F_a{}^\mu) - F_a(F_b{}^\mu) + \Gamma_{ab}{}^c F_c{}^\mu - \Gamma_{ba}{}^c F_c{}^\mu, \\
U_{k\ell ab} &= F_a(\Gamma_{b\ell k}) - F_b(\Gamma_{a\ell k}) + \Gamma_{b\ell}{}^m \Gamma_{amk} - \Gamma_{a\ell}{}^m \Gamma_{bmk} - (\Gamma_{ab}{}^m - \Gamma_{ba}{}^m)\Gamma_{m\ell k} - W_{k\ell ab}, \\
V_{abijk} &= F_i(W_{abjk}) - \Gamma_{ia}{}^m W_{mbjk} - \Gamma_{ib}{}^m W_{amjk} - \Gamma_{ij}{}^m W_{abmk} - \Gamma_{ik}{}^m W_{abjm} \\
&\quad + F_k(W_{abij}) - \Gamma_{ka}{}^m W_{mbij} - \Gamma_{kb}{}^m W_{amij} - \Gamma_{ki}{}^m W_{abmj} - \Gamma_{kj}{}^m W_{abim} \\
&\quad + F_j(W_{abki}) - \Gamma_{ja}{}^m W_{mbki} - \Gamma_{jb}{}^m W_{amki} - \Gamma_{jk}{}^m W_{abmi} - \Gamma_{ji}{}^m W_{abkm}.
\end{aligned}$$

Here $\Phi = (e,\gamma,w)$ and $(F,\Gamma,W) = \mathrm{Param} \circ (e,\gamma,w)$.

Proposition 2.11 (Equivalent characterization of a vacuum field). $\Phi : \mathcal{U} \to \mathcal{R}$ is a vacuum field if and only if $e_3 > 0$ and $\mathrm{Im}(e_1 \bar e_2) \neq 0$ and $(T, U, V) = 0$.
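The algebraic facts behind conditions (a) of Definition 2.6 and Proposition 2.11 can be checked numerically. The following is a minimal sketch, not part of the paper's argument: it builds the frame matrix of Definition 2.2 from randomly sampled components, then tests that the matrix of Definition 2.3 is its own inverse, that $\det(F_a{}^\mu) = -2i\,e_3\,\mathrm{Im}(e_1\bar e_2)$, and that $g^{ab}F_a{}^\mu F_b{}^\nu$ is real. The helper names (`sample_e`, `frame_matrix`, `inverse_metric`) are hypothetical and chosen only for this illustration.

```python
import random

# Frame metric of Definition 2.3; the same matrix serves as g_ab and g^ab.
G = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, -1],
     [0, 0, -1, 0]]


def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def sample_e(seed):
    """Hypothetical sample (e1, ..., e5): e1, e2 complex; e3 > 0, e4, e5 real."""
    rng = random.Random(seed)
    e1 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    e2 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    return [e1, e2, rng.uniform(0.5, 2.0), rng.uniform(-1, 1), rng.uniform(-1, 1)]


def frame_matrix(e):
    """The matrix (F_a^mu) of Definition 2.2; rows a, columns mu."""
    e1, e2, e3, e4, e5 = e
    return [[e1, e2, 0, 0],
            [e1.conjugate(), e2.conjugate(), 0, 0],
            [e4, e5, 0, 1],
            [0, 0, e3, 0]]


def inverse_metric(F):
    """Components g^{mu nu} = g^{ab} F_a^mu F_b^nu of the inverse metric."""
    return [[sum(G[a][b] * F[a][mu] * F[b][nu]
                 for a in range(4) for b in range(4))
             for nu in range(4)]
            for mu in range(4)]
```

The reality of the inverse metric here is exactly the statement of footnote 8: the first two frame vectors are complex conjugates of each other and the last two are real.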


Proof. First, $\det(F_a{}^\mu) = -2i\,e_3\,\mathrm{Im}(e_1 \bar e_2)$ and $F_4{}^3 = e_3$. Then T = 0 iff Definition 2.6 (b) holds, T is torsion; U = 0 iff Definition 2.6 (c) holds, given (b); V = 0 iff the Weyl field associated to W satisfies the differential Bianchi identities. V = 0 is not necessary in Proposition 2.11. However, it is used in the hyperbolic reduction below, cf. [Fr].

Remark 2.12. $T_{ab}{}^\mu = -T_{ba}{}^\mu$ and $U_{abk\ell} = -U_{bak\ell} = -U_{ab\ell k}$ and $V_{kab} = -V_{kba}$ and $V^k{}_{kb} = 0$ and $V_{abc} + V_{bca} + V_{cab} = 0$, where $V_{bjk} = g^{ai} V_{abijk}$. The components of $V_{abijk}$ all vanish if and only if the components of $V_{bjk}$ all vanish, pointwise on $\mathcal{U}$.

Definition 2.13. Introduce the vector space of real dimension 32:

= (t, u, v) ∈ C5 ⊕ C9 ⊕ C3 t1 , t2 ∈ R . R Proposition 2.14. Let Φ : U → R and (T, U, V ) be as in Definition 2.10. Let λ = (λ1 , λ2 , λ3 , λ4 ) be strictly positive weight functions on U. Then, there are unique fields

(t, u, v) = t(Φ, λ), u(Φ, λ), v(Φ, λ) : U → R ⊂ C5 ⊕ C8 ⊕ C5 ,

⊂ C5 ⊕ C9 ⊕ C3 , (t, u, v) = t (Φ, λ), u(Φ, λ), v(Φ, λ) : U → R whose components are quadratic polynomials in Φ, ⎛

⎞

∂ ∂ x μ Φ,

i t1 i t2 0 0 ⎜ t4 t5 0 0⎟ ⎟

⎜ t5 0 0⎟ ⎜ t4 μ T(ab) = ⎜ , t3 0⎟ ⎜ − t1 − t2 ⎟ ⎝− t ⎠ − t2 t3 0 1 t4 t5 − t3 0 ⎛ u 2 + u 2 u 7 − u 8 u 8 − u 7 −u3 − u4 ⎜ u9 −u7 −u6 u5

⎜ −u6 −u7 −u 6 ⎜ −u 9 U(ab)( jk) = ⎜ u4 −u 3 −u1 ⎜ −u 1 ⎝ u −u u −u 1 3 4 2 u 2 − u 2 u 7 + u 8 u 8 + u 7 −u3 + u4 ⎛ 1 1 − λ14 v5 λ2 (−v2 + v1 ) − λ3 v 3 ⎜ 1 1 1

⎜ (v2 − v 1 ) + λ3 v3 λ3 (v3 − v 2 ) Va( jk) = ⎜ 1λ2 1 ⎝ λ3 (v3 − v3 − v2 + v 2 ) λ4 (−v4 + v 3 ) 1 1 λ2 (−v2 + v 2 ) λ3 v 3 − λ11 v1 1 λ2 v2 1 λ2 (v2 − v1 ) − λ11 v1

1 λ2 v 2 − λ11 v1

1 λ2 (v2 − v 1 ) − λ11 v 1

Φ,

∂ ∂ x μ Φ,

u4 + u3 −u 6 u5 −u2 −u1 u4 − u3

such that

⎞ u8 − u8 −u5 ⎟ ⎟ −u5 ⎟ , −u3 + u4 ⎟ ⎟ ⎠ u4 − u3 u8 + u8

1 λ3 (v3 − v2 ) − λ14 v5 1 λ4 (−v4 + v3 ) 1 λ3 v3

⎞

1 1 λ2 (−v2 + v1 ) + λ3 v 3 ⎟ 1 1 λ2 (−v2 + v 1 ) + λ3 v3 ⎟ . ⎟ 1 λ3 (v3 + v3 − v2 − v 2 )⎠ − λ12 (v2 + v 2 )

In particular, (T, U, V ) = 0 if and only if (t, u, v) = 0 and (t, u, v) = 0. Proof. T, U, V lie pointwise in spaces of real dimension 24, 36, 16, by Remark 2.12. For all Φ: T12 3 = 0 (1 r.e.9 ); T31 3 = 0 (2 r.e.); Tab 4 = 0 (6 r.e.); U3441 = U4134 9 n r.e. is short for n real equations.


(2 r.e.); U3132 = 0 (1 r.e.); U4142 = 0 (1 r.e.). Thus, T, U, V lie in subspaces of dimension 24 − 9 = 15, 36 − 4 = 32, 16 − 0 = 16. The matrices on the right-hand sides lie in these subspaces. The linear map from (t, u, v) ⊕ (t, u, v) to these matrices = 31 + 32 = 63. Since this is equal to 15 + 32 + 16, the has maximal rank dimR R ⊕ R fields (t, u, v) ⊕ (t, u, v) exist and are unique. Remark 2.15. A hyperbolic reduction of the vacuum Einstein equations in the Newman Penrose [NP] formalism was given by Friedrich [Fr], using quasilinear symmetric hyperbolic systems. We introduce a new hyperbolic reduction, using the ‘vacuum field’ gauge of Definition 2.6. As in [Fr], there is one quasilinear symmetric hyperbolic system (Proposition 2.16) and one linear homogeneous symmetric hyperbolic system (Proposition 2.18). The latter is used to show that the constraints propagate. Proposition 2.16 (Hyperbolic reduction, Part 1). (t, u, v) = 0 in Proposition 2.14 is component by component equivalent to a system of the form A(Φ)Φ = f(Φ), where

A(Φ) = A(Φ)μ ∂ x∂ μ = diag L , L , N , L , L ⊕ diag L , L , L , L , N , N , N , L ⎛ ⎞ λ1 N λ1 D 0 0 0 ⎜ λ1 D λ1 L + λ2 N λ2 D 0 0 ⎟ ⎜ ⎟ ⎜ λ2 D λ2 L + λ3 N λ3 D 0 ⎟ ⊕⎜ 0 ⎟ ⎝ 0 0 λ3 D λ3 L + λ4 N λ4 D ⎠ 0 0 0 λ4 D λ4 L and A(Φ)μ is a matrix (resp. f(Φ) is a vector) that is a linear (resp. quadratic) polynomial in Φ, Φ. The coefficients depend only on λ = (λ1 , λ2 , λ3 , λ4 ). If e3 > 0, then A(Φ)μ are Hermitian and A(Φ)3 + A(Φ)4 is positive definite. Proof. The proof is by direct hand or machine calculation. For example, (1)

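$$\begin{aligned}
-t_2 &\overset{(1)}{=} T_{41}{}^2 \\
&\overset{(2)}{=} F_1(F_4{}^2) - F_4(F_1{}^2) + g^{cd}\Gamma_{41c}F_d{}^2 - g^{cd}\Gamma_{14c}F_d{}^2 \\
&\overset{(3)}{=} F_1(F_4{}^2) - F_4(F_1{}^2) + \Gamma_{412}F_1{}^2 + \Gamma_{411}F_2{}^2 - \Gamma_{414}F_3{}^2 - \Gamma_{413}F_4{}^2 \\
&\qquad - \Gamma_{142}F_1{}^2 - \Gamma_{141}F_2{}^2 + \Gamma_{144}F_3{}^2 + \Gamma_{143}F_4{}^2 \\
&\overset{(4)}{=} D(0) - L(e_2) + 0 \cdot e_2 + 0 \cdot \bar e_2 + 0 \cdot e_5 + \bar\gamma_5 \cdot 0 \\
&\qquad - \gamma_2 e_2 - \gamma_1 \bar e_2 + 0 \cdot e_5 - (\gamma_3 - \bar\gamma_4) \cdot 0,
\end{aligned}$$

(1) Proposition 2.14, (2) Definition 2.10, (3) sum, (4) $(F,\Gamma,W) = \mathrm{Param} \circ (e,\gamma,w)$. Hence, $t_2 = 0$ is equivalent to $L(e_2) = -\gamma_2 e_2 - \gamma_1 \bar e_2$, line two of A(Φ)Φ = f(Φ).

Proposition 2.17. Let Φ and the associated (T, U, V) be as in Definition 2.10. Set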

Ta μ = a i jk Fi (T jk μ ) − i j m Tmk μ − ik m T jm μ − U m ki j Fm μ − T jk ν ∂ ∂x ν Fi μ ,

Ucab = c i jk Fi (Uabjk ) − ia m Umbjk − ib m Uam jk − i j m Uabmk − ik m Uabjm −U m i jk mab + 13 Vabi jk − Ti j μ ∂ x∂ μ kab ,

282

M. Reiterer, E. Trubowitz

V jk = Fb (V b jk ) + bm b V m jk − bj m V b mk − bk m V b jm + U a mab W mb jk − 21 Um jab W abm k + 21 Umkab W abm j − 21 Tab μ

ab ∂ ∂ x μ W jk ,

where abcd is totally antisymmetric with 1234 = −1. Then (T, U, V) = 0. Proof. For example, since T ∼ F(∂ F) + F (where ∼ indicates ‘sum of terms of the form’) and U ∼ F(∂ ) + 2 + W and V ∼ F(∂ W ) + W , one has V ∼ F(∂ V ) + V + U W + T (∂ W ) ∼ F(∂ F)(∂ W ) + F 2 (∂ 2 W ) + F(∂ )W + F (∂ W ) + 2 W + W 2 . The six terms vanish separately. The 1st Fb μ ( ∂ x∂ μ Fa ν )( ∂ ∂x ν W ab jk ) − 1 ∂ ∂ ν ∂ μ ν ∂ μ ab nd F μ F ν ∂ ab b a ∂ x μ ∂ x ν W jk = 0, 2 (Fb ∂ x ν Fa − Fa ∂ x ν Fb )( ∂ x μ W jk ) = 0, the 2 1 1 th a mb abm abm the 6 −W mab W jk + 2 Wm jab W k − 2 Wmkab W j = 0, etc. The symmetries ∂ ∂ of W and [ ∂ x μ , ∂ x ν ] = 0 were used. Proposition 2.18 (Hyperbolic reduction, Part 2). Let Φ and the associated (T, U, V ) be as in Definition 2.10, and let (t, u, v) ⊕ (t, u, v) be as in Proposition 2.14. Suppose (t, u, v)(Φ, λ) = 0. Then, the vanishing of T4 1 , T4 2 , T1 3 , T2 1 , T2 2 , U414 , (U412 +U434 )/2, U214 , U114 , U223 , U123 , (U112 + U134 )/2, (U212 + U234 )/2, U332 , V41 , (V12 + V34 )/2, V23 , asserted by Proposition 2.17, is a linear homogeneous system A(Φ)Φ = f(Φ, ∂x Φ)Φ for Φ , where, here and in the rest of this paper, Φ = (t, u, v)(Φ, λ)

(2.2)

is called the constraint field associated to Φ, and

A(Φ) = A(Φ)μ ∂ x∂ μ = diag L , L , N , L , L ⊕ diag L , L , L , L , N , N , L , L , N ⎛1 ⎞ 1 1 0 λ1 N + λ2 L λ2 D ⎜ ⎟ 1 1 1 ⊕ ⎝ λ12 D ⎠ λ2 N + λ3 L λ3 D 1 1 1 D N + L 0 λ3 λ3 λ4 and f(Φ, ∂x Φ) is pointwise an R-linear transformation. If e3 > 0, then the matrices A(Φ)μ are Hermitian and A(Φ)3 + A(Φ)4 is positive definite. Proof. By direct hand or machine calculation. For example T4 2 = 0 and (t, u, v) = 0 ∂ imply L(t2 ) = −2γ2 t2 + 2(t3 ∂u e2 ) + 2(e2 u 1 ), line two of A(Φ)Φ = f(Φ, ∂x Φ)Φ . There are no derivatives of (t, u, v) on the right hand-side. 3. Symmetries Let x, x be Cartesian coordinates on the open subsets U, U ⊂ R4 . Proposition 3.1. Let S be any one of the transformations C, Z, J, A defined below. Then, S is a symmetry transformation acting on triples (x, Φ, λ), in the sense that: • x → x = S · x is a diffeomorphism U → U . • Φ → Φ = S · Φ is a map from fields Φ : U → R to fields Φ : U → R such that e3 > 0 and (e1 e2 ) = 0 on U if and only if e3 > 0 and (e1 e2 ) = 0 on U. • λ → λ = S · λ is a map from weights on U to weights on U .


• A(Φ )Φ = f(Φ ) on U if and only if A(Φ)Φ = f(Φ) on U. • (Φ ) = 0 on U if and only if Φ = 0 on U. Definition 3.2. Let C A (x 1 , x 2 ) ∈ R, ζ (x 1 , x 2 ) ∈ U (1), J > 0 and A = 0. Then, • C acts by10 : x = (C1 , C2 , x 3 , x 4 ), (e3 , γ , w , λ )(x ) = (e3 , γ , w, λ)(x) 2 ∂ CA eA+3B (x ) = C=1 (x)eC+3B (x) A = 1, 2 B = 0, 1 ∂xC 11 • Z acts by : x = x,

(Φ , λ )(x ) = (ζ A Φ + 05 ⊕ 21 02 , D(ζ ), D(ζ ), 03 , ζ −1 N (ζ ) ⊕ 05 , λ)(x)

A = diag 1, 1, 0, 0, 0, 2, 0, 1, −1, −1, 0, −2, 0, 2, 1, 0, −1, −2 • J acts by: x = diag(1, 1, J, J)x, (Φ , λ )(x ) = (J A Φ, λ)(x) A = (−1) diag 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2 • A acts by: x = diag(A−1 , A−1 , 1, A2 )x, (Φ , λ )(x ) = A A Φ, diag(1, A2 , A4 , A6 ) λ (x) A = (−1) diag 2, 2, 0, 3, 3, 0, 0, 1, 1, 1, 2, 2, 2, 0, 1, 2, 3, 4 Proof (of Proposition 3.1). Set (F, , W ) = Param ◦ Φ, (F , , W ) = Param ◦ Φ . Set F = (Fa μ ∂ x∂ μ ) and F = (Fa μ ∂(x∂ )μ ) with a = 1, 2, 3, 4. By the above definitions, C : F = F, J : F = J−1 F,

Z : F = diag(ζ, ζ −1 , 1, 1)F, A : F = diag(A−1 , A−1 , A−2 , 1)F.

In these equalities, we regard (x, U) and (x , U ) as global coordinates on the same manifold. In each case, gab Fa ⊗ Fb = C 2 gab Fa ⊗ Fb for a constant C > 0. Given the transformation law for coordinates and frame, the proof is by direct verification. Definition 3.3. Let α = 0. Set Flipα = Z ◦ C, where ζ (ξ ) = −ξ/ξ and C(ξ ) = α 2 /ξ . Here ξ = ξ 1 + iξ 2 and C(ξ ) = C1 (ξ ) + iC2 (ξ ) and x = (ξ 1 , ξ 2 , u, u). Remark 3.4. Flipα ◦ Flipα is the identity. The symmetry transformation Flipα will be used to match constructions between stereographic coordinate patches. 4. The Minkowski Field Set Strip∞ = R2 × (0, ∞) × (−∞, 0) and use the coordinates x = (ξ = ξ 1 + iξ 2 , u, u). Definition 4.1. For all a, A = 0, let Ma,A : Strip∞ → R be the field

Ma,A = ρ −1 e, iρ −1 e, 1, 02 ⊕ ρ −1 0, A2 , λ, λ, 0, −1, 02 ⊕ 05 ,

2 2 ρa,A(u, u) = A2 u − u ea,A(ξ ) = a2 1 + A λa,A(ξ ) = − A |ξ |2 2a ξ. a2

(4.1) (4.2)

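We often omit the subscripts on ρ, e, λ. Let $S = S(\underline{u}, u)$ be given by

$$S = \rho^{-1}u^2 + u \qquad\text{or, equivalently,}\qquad \rho^{-1} = -u^{-1} + u^{-2}S. \tag{4.3}$$

¹⁰ C is not defined unless $x \mapsto x'$ is a diffeomorphism.
¹¹ $D(\zeta) = (e_1 \tfrac{\partial}{\partial x^1} + e_2 \tfrac{\partial}{\partial x^2})(\zeta)$ and $N(\zeta) = (e_4 \tfrac{\partial}{\partial x^1} + e_5 \tfrac{\partial}{\partial x^2})(\zeta)$.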


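Proposition 4.2. Recall C, A, J, Flip$_\alpha$ from Sect. 3. For all $a, A \neq 0$:

(a) $M_{a,A} = (C \circ A) \cdot M_{1,1}$ on Strip$_\infty$, where $C(\xi) = a\,\xi$.
(b) $M_{a,A} = \mathrm{Flip}_{a/A} \cdot M_{a,A}$ on Strip$_\infty \cap \{\xi \neq 0\}$.
(c) $M_{a,A} = J \cdot M_{a,A}$ on Strip$_\infty$, for all $J > 0$.
(d) $M_{a,A}$ is a vacuum field on Strip$_\infty$ (see Definition 2.6).
(e) $M_{a,A}$ is isometric, as a Lorentz manifold, to the open subset of Minkowski space,

$$\bigl\{ (X^0, \mathbf{X}) \in \mathbb{R} \times \mathbb{R}^3 \;\big|\; |X^0| < |\mathbf{X}|,\ \mathbf{X} \notin \{0\} \times \{0\} \times [0, \infty) \bigr\},$$

where $(X^0, \mathbf{X})$ are the standard Minkowski coordinates, and

$$X^0 = \frac{1}{\sqrt{2}\,|A|}\,(A^2 \underline{u} + u), \qquad
\begin{pmatrix} X^1 \\ X^2 \\ X^3 \end{pmatrix}
= \frac{1}{\sqrt{2}\,|A|}\, \frac{A^2 \underline{u} - u}{1 + \frac{A^2}{a^2}|\xi|^2}
\begin{pmatrix} \frac{2A}{a}\,\xi^1 \\ \frac{2A}{a}\,\xi^2 \\ -1 + \frac{A^2}{a^2}|\xi|^2 \end{pmatrix}.$$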

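Proof. (a) Set $C_a = \mathrm{diag}(a, a, 1, a, a) \oplus 1_{13}$. Let $\mathbf{A}$ be the exponent matrix of Definition 3.2, item A. Then,

$$\begin{aligned}
\bigl((C \circ A) \cdot M_{1,1}\bigr)(\xi, \underline{u}, u) &= \bigl(C \cdot (A \cdot M_{1,1})\bigr)(\xi, \underline{u}, u) \\
&= C_a\, (A \cdot M_{1,1})(a^{-1}\xi, \underline{u}, u) = C_a\, A^{\mathbf{A}}\, M_{1,1}\bigl(\tfrac{A}{a}\,\xi, \underline{u}, A^{-2}u\bigr) = M_{a,A}(\xi, \underline{u}, u).
\end{aligned}$$

Parts (b), (c), (d), (e) are by direct calculation. We check (d) for $M_{1,1}$, then use (a).

Remark 4.3. The level sets of $u = 2^{-1/2}|A|\,(X^0 - |\mathbf{X}|)$ and $\underline{u} = 2^{-1/2}\,|\tfrac{1}{A}|\,(X^0 + |\mathbf{X}|)$ are null hypersurfaces. They intersect in a standard sphere of radius $|\mathbf{X}| = 2^{-1/2}\,|\tfrac{1}{A}|\,\rho$, with the north pole removed, on which $\tfrac{A}{a}\,\xi$ is the standard stereographic coordinate system. The southern hemisphere corresponds to $|\xi| < |\tfrac{a}{A}|$.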
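The coordinate change of Proposition 4.2 (e) and the statements of Remark 4.3 can be sanity-checked numerically. The sketch below is an illustration only, with hypothetical function names; it assumes real $a, A \neq 0$ and a point of Strip$_\infty$ (so $\underline{u} > 0$, $u < 0$), maps $(\xi^1, \xi^2, \underline{u}, u)$ to Minkowski coordinates, and recovers $(\underline{u}, u)$ from $X^0 \pm |\mathbf{X}|$.

```python
import math


def minkowski_point(xi1, xi2, u_bar, u, a, A):
    """Map (xi, u_bar, u) to Minkowski coordinates (X0, (X1, X2, X3)).

    Transcribes the formulas of Proposition 4.2 (e); rho = A^2 u_bar - u.
    """
    rho = A ** 2 * u_bar - u
    s = (A / a) ** 2 * (xi1 ** 2 + xi2 ** 2)      # |(A/a) xi|^2
    scale = rho / (math.sqrt(2) * abs(A) * (1 + s))
    X0 = (A ** 2 * u_bar + u) / (math.sqrt(2) * abs(A))
    X = (scale * 2 * (A / a) * xi1,
         scale * 2 * (A / a) * xi2,
         scale * (s - 1))
    return X0, X


def null_coordinates(X0, X, A):
    """Recover (u_bar, u) from Minkowski coordinates, as in Remark 4.3."""
    r = math.sqrt(sum(c * c for c in X))
    u = abs(A) * (X0 - r) / math.sqrt(2)
    u_bar = (X0 + r) / (abs(A) * math.sqrt(2))
    return u_bar, u
```

For example, `null_coordinates(*minkowski_point(0.3, -0.2, 1.5, -4.0, 0.7, 0.4), 0.4)` should return the pair `(1.5, -4.0)` up to rounding, and the radius $|\mathbf{X}|$ should equal $\rho / (\sqrt{2}\,|A|)$.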

5. The Relevant / Irrelevant Form Recall (x μ ) = (ξ 1 , ξ 2 , u, u). By definition the change of fields, for u < 0, Φ = Ma,A + u −M

(5.1)

is called the far field ansatz, see (4.1). Here, M = diag(2, 2, 2, 3, 3) ⊕ diag(1, 2, 2, 2, 2, 2, 2, 3) ⊕ diag(1, 2, 3, 4, 4), = (1 , 2 , 3 ) = ( f, ω, z) is an R-valued field. We will construct vacuum fields for which has a nontrivial limit as u → −∞. Proposition 5.1. In this proposition, ignore (2.2), regard Φ and Φ as independent, respectively. Set sufficiently differentiable fields with values in R and R, M = diag(2, 2, 2, 3, 3) ⊕ diag(1, 2, 2, 2, 2, 2, 2, 3) ⊕ diag(1, 2, 3, 4, 4), E = diag(4, 4, 4, 6, 6) ⊕ diag(2, 4, 4, 4, 4, 4, 4, 6) ⊕ diag(0, 0, 0, 0, 0), M = diag(2, 2, 2, 3, 3) ⊕ diag(2, 2, 2, 2, 2, 2, 3, 3, 3) ⊕ diag(0, −1, −2), E = diag(4, 4, 4, 6, 6) ⊕ diag(4, 4, 4, 4, 4, 4, 6, 6, 6) ⊕ diag(2, 2, 2),


and Φ(x) = Ma,A(x) + u −M (x), −M

Φ (x) = u (x), 2j λ j (x) = u ,

(x) ∈ R,

(5.2a)

(x) ∈ R, (5.2b) j = 1, 2, 3, 4 (see, Proposition 2.14). (5.2c)

The systems (see, Sect. 2) A(Φ)Φ = f(Φ) and A(Φ) Φ = f(Φ, ∂x Φ) Φ for Φ and Φ are equivalent to the following systems for and : Aa,A(x, ) = fa,A(x, ), Aa,A(x, ) = fa,A(x, , ∂x ) ,

(5.3a) (5.3b)

μ μ where Aa,A(x, ) = Aa,A(x, ) ∂ x∂ μ and Aa,A(x, ) = Aa,A(x, ) ∂ x∂ μ and

μ Aa,A(x, ) = u E u −M Aμ (Φ)u −M ,

fa,A(x, ) = u E−M − Aμ (Φ) ∂ x∂ μ u −M + f(Φ) − Aμ (Φ) ∂ x∂ μ Ma,A ,

μ μ Aa,A(x, ) = u E u −M A (Φ) u −M ,

fa,A(x, , ∂x ) = u E −M − Aμ (Φ) ∂ x∂ μ u −M + f(Φ, ∂x Φ) u −M . Here, Φ, ∂x Φ have to be expressed in terms of , ∂x using (5.2a). We will sometimes drop the a, A and write A(x, ), f(x, ), A(x, ), f(x, , ∂x ). They are notationally distinguished from A(Φ), f(Φ), A(Φ), f(Φ, ∂x Φ) by the number of arguments. Remark 5.2. Aμ (x, ), Aμ (x, ) are Hermitian, so that (5.3a), (5.3b) are symmetric hyperbolic. They are affine R-linear functions of . The R-linear transformation f(x, , ∂x ) depends affine R-linearly on ⊕ ∂x . On the other hand, f(x, ) is a quadratic polynomial in the components of , without constant term. There is no constant term, because Ma,A is a vacuum field. Neither derivatives of ea,A nor derivatives of λa,A appear in the term Aμ (Φ) ∂ x∂ μ Ma,A. Definition 5.3 (Generic Symbols). Let S be as in (4.3). • P is a generic symbol for a quadratic polynomial in the components of the fields and without constant term, whose coefficients are (complex) polynomials in 1 u , A, S, e, λ, λ. • P is a generic symbol for a polynomial in the components of the fields and and all their first order coordinate derivatives, whose coefficients are (complex) polynomials in u1 , A, S, e, λ, λ, and all their first order coordinate derivatives. We use the same symbols P and P for a vector or matrix all of whose entries are polynomials of this kind. Here and below, e = ea,A and λ = λa,A. Remark 5.4. The vector fields D, N and L corresponding to Φ = Ma,A + u −M are:

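$$\begin{aligned}
D &= -\frac{2}{u}\, e\, \frac{\partial}{\partial \bar\xi} + \frac{2}{u^2}\, e\, S\, \frac{\partial}{\partial \bar\xi} + \frac{1}{u^2}\Bigl( f_1 \frac{\partial}{\partial \xi^1} + f_2 \frac{\partial}{\partial \xi^2} \Bigr), \qquad
\frac{\partial}{\partial \bar\xi} = \frac12 \Bigl( \frac{\partial}{\partial \xi^1} + i \frac{\partial}{\partial \xi^2} \Bigr), \\
N &= \frac{\partial}{\partial u} + \frac{1}{u^3}\Bigl( f_4 \frac{\partial}{\partial \xi^1} + f_5 \frac{\partial}{\partial \xi^2} \Bigr), \qquad
L = \frac{\partial}{\partial \underline{u}} + \frac{1}{u^2}\, f_3\, \frac{\partial}{\partial \underline{u}}.
\end{aligned} \tag{5.4}$$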

286

M. Reiterer, E. Trubowitz

Proposition 5.5 (Relevant/Irrelevant Form). The system (5.3a) takes the form ⎛ ⎞ ⎛ ⎞ e ω1 f1 ⎜ f2 ⎟ ⎜ ⎟ −i e ω1 ⎜ ⎟ ⎜ ⎟ 2 e (ω4 − ω3 − ω5 ) ⎟ ⎜ f4 ⎟ ⎜ ⎜ ⎟ ⎜ −2 e (ω − ω − ω ) ⎟ ⎜ f5 ⎟ ⎜ ⎟ 1 4 3 5 ⎜ ⎟ ⎜ ⎟ −z 1 L ⎜ω1 ⎟ = ⎜ ⎟ + P, ⎜ω ⎟ ⎜ ⎟ u 2 −|ω1 | ⎜ 2⎟ ⎜ ⎟ ⎜ω ⎟ ⎜ ⎟ −z 2 − λ ω1 ⎜ 3⎟ ⎜ ⎟ ⎝ω4 ⎠ ⎝ ⎠ −λ ω 1

ω8 z 3 + 2i λ (ω4 − ω5 − ω3 ) ⎛ ⎛ ⎞ ⎞ 2 ( f 3 + ω8 ) f3 ⎜ω ⎟ 1 ⎜ω + ω5 − ω3 ⎟ 1 N ⎝ 5⎠ = ⎝ 4 ⎠ + 2 P, ω6 0 u u 0 ω7 ⎞ ⎛ 1 N 0 0 0 ⎛z ⎞ uD 1 1 ⎜1 D N + 1 L 0 0⎟ ⎟ ⎜z 2 ⎟ ⎜u 2 uD u ⎟⎜ ⎟ ⎜ 1 1 N + u12 L 0 ⎟ ⎜z 3 ⎟ ⎜ 0 uD uD ⎟⎝ ⎠ ⎜ 1 ⎝ 0 N + u12 L D ⎠ z 4 0 uD z5 0 0 0 D L ⎞ ⎛ 0 0 ⎟ 1 1⎜ ⎟ ⎜ 0 = ⎜ ⎟ + 2 P. ⎠ u u⎝ 4λz 5 2 A z 5 − 2λz 4 − 3ω7 z 3 Proof. By direct machine or hand calculation. See Appendix A.

(5.5a)

(5.5b)

(5.5c)

Remark 5.6. We comment on the relevant/irrelevant equations (5.5): • Terms containing P are referred to as irrelevant, the others as relevant. Relevant terms are principal in the number of derivatives or leading order in powers of u1 . • The only relevant, nonlinear term on the right hand side of (5.5a) is −|ω1 |2 . This term generates trapped spheres. • The linear terms in the relevant part of the right hand sides of (5.5) are of two kinds. Either there is an explicit factor of e, λ, λ or there is a numerical factor (besides powers of u1 ). In the first case, we can make the factor small by requiring |A| ≤ |a| and making |a| small. In the second case, we arrange the terms into a linear over R matrix applied to . We exploit the structure of this matrix, it motivates the assumptions (RE11), (RE13a), (RE13b) of Proposition 7.9. Proposition 5.7. Recall (2.2) and (5.2b). Let = (s, p, y). Then, ⎛ ⎞ s1 f1 s4 ⎝s2 ⎠ = P , = −u N + P , s5 f 2 s3 ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ p4 ω1 p1 ⎝ p7 ⎠ = −u N ⎝ ω3 ⎠ + P , ⎝ p2 ⎠ = P , p3 p8 −ω4

(5.6a)

(5.6b)


⎛ ⎞ ⎞ −L(ω7 ) − ω1 p5 ⎝ p6 ⎠ = P = ⎝ ⎠ + 1 P , L(ω6 ) u p9 u D(ω7 ) − u D(ω6 ) + ω4 − ω3 − 4λ ω7 ⎛ ⎞ ⎛ ⎞ ⎛z 1 ⎞ L 0 0 u D − 4λ y1 ⎜z ⎟ 1 ⎝ y2 ⎠ = P = ⎝ ω7 u D − 2λ L 0 ⎠ ⎝ 2⎠ + P , z3 u y3 0 2 ω7 uD L z ⎛

(5.6c)

(5.6d)

4

Proof. By direct machine or hand calculation.

Remark 5.8. Every P in (5.6) has no constant term as a polynomial in the components of , and their first coordinate derivatives, because Ma,A is a vacuum field. Proposition 5.9 (Relevant/Irrelevant Form). Let = (s, p, y). The system (5.3b) takes the form ⎞ ⎞ ⎛ 0 s1 ⎟ 0 ⎜ s2 ⎟ ⎜ ⎟ ⎜s ⎟ ⎜ e ( p − p ) + e ( p − p ) ⎜ ⎟ 3 5 4 6 ⎜ 4⎟ ⎜ ⎜ s5 ⎟ ⎜i e ( p 4 − p5 ) − i e ( p 6 − p3 )⎟ ⎟ ⎜ ⎟ ⎜ ⎟ 1 ⎜ p1 ⎟ ⎜ y ⎟ 1 ⎟ L⎜ ⎟ + P , ⎜ p2 ⎟ = ⎜ 0 ⎜ ⎟ u ⎜ ⎟ ⎜ ⎟ ⎜ p3 ⎟ ⎜ 0 ⎟ ⎜ ⎟ ⎜ ⎟ 0 ⎜ p4 ⎟ ⎜ ⎟ ⎝ ⎠ ⎝ λ( p6 − p 3 ) − λ( p4 − p 5 ) ⎠ p7 p8 −λ( p 6 − p3 ) + λ( p 4 − p5 ) ⎛ ⎞ ⎛ ⎞ s3 + p 7 + p 8 s3 p4 ⎟ 1 ⎜p ⎟ 1 ⎜ N ⎝ 5⎠ = ⎝ ⎠+ 2P , p6 p3 u u p9 p 7 + p8 ⎛ ⎞ 1 ⎛ ⎞ 0 N + u12 L uD y1 ⎜ 1 ⎟ 1 1 1 ⎜ ⎟ ⎠ ⎝ y D N + L D u ⎝ u ⎠ 2 = u2 P . u2 y3 1 N + u12 L 0 uD ⎛

(5.7a)

(5.7b)

(5.7c)

Above, the symbols P are linear over R generic transformations, see Definition 5.3. Proof. By direct machine or hand calculation.

Remark 5.10. The overall factors u E and u E appear in (5.3a) and (5.3b), so that these systems are line by line (up to a permutation of the lines) equivalent to (5.5) and (5.7). 6. Formal Solutions We consider formal power series on Strip∞ ⊂ R4 (see, Sect. 4), ∞ 1 k ∞ 2 [ ](x) = k=0 ( u ) (k)(ξ, u) where (k) ∈ C (R × (0, ∞), R). (6.1)

288

M. Reiterer, E. Trubowitz

Remark 6.1. By Proposition 5.7, the associated formal constraint field [ ] is itself a 1 k formal power series [ ](x) = ∞ k=0 ( u ) (k)(ξ, u), with (k) depending only on (), 0 ≤ ≤ k. The characteristic initial problem in Proposition 6.2 is motivated by [Chr]. Proposition 6.2. For all a, A = 0, u 0 > 0, all smooth DATA(ξ, u) : R2 × (0, ∞) → C that vanish when u < u 0 , there is a unique formal power series [ ] on Strip∞ , which satisfies (5.3a) and [ ] = 0 and (the formal characteristic initial data) [ ] = 0 when u < u 0 ,

ω1 (0) = DATA.

(6.2)

Moreover, for all k ≥ 0, the value of (k) at (ξ, u) ∈ R2 × (0, ∞) depends only on the restriction of DATA(ξ, u) and its derivatives of all orders to the half-open line segment {ξ } × (0, u ] (formal finite speed of propagation). Explicitly, (0) is given by: ω1 (0) = DATA, ω7 (0) = z 1 (0) = z 2 (0) = z 3 (0) =

z 5 (0) = 0,

−∂u−1 ω1 (0), ∂ − ∂u ω1 (0),

∂ 2 e ∂ξ + 2λ ∂u−1 z 1 (0),

∂

2 e ∂ξ + λ ∂u−1 z 2 (0) − ∂u−1 ω7 (0)z 1 (0) ,

∂ z 4 (0) = 2 e ∂ξ

∂u−1 z 3 (0) − 2∂u−1 ω7 (0)z 2 (0) ,

ω2 (0) = −∂u−1 |ω1 (0)|2 , ω4 (0) = −λ ∂u−1 ω1 (0), ω6 (0) = 0, f 1 (0) =

e ∂u−1 ω1 (0),

f 2 (0) = −i e ∂u−1 ω1 (0),

ω3 (0) = −∂u−1 z 2 (0) − λ ∂u−1 ω1 (0),

f 3 (0) = − ω8 (0),

ω5 (0) = −∂u−1 z 2 (0),

f 4 (0) = −4 e ∂u−1 ω5 (0),

ω8 (0) = ∂u−1 z 3 (0) − 4i ∂u−1 λ ω5 (0) , with e = ea,A and λ = λa,A and

∂ ∂ξ

=

f 5 (0) =

1 ∂ 2 ∂ξ 1

(6.3)

4 e ∂u−1 ω5 (0),

u

− i ∂ξ∂ 2 and ∂u−1 g (u) = 0 du g u .
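To make the operator $\partial_{\underline{u}}^{-1}$ concrete, here is a small numerical sketch of the leading-order formula $\omega_2(0) = -\partial_{\underline{u}}^{-1}|\omega_1(0)|^2$ with $\omega_1(0) = \mathrm{DATA}$, at one fixed $\xi$. Everything beyond that formula is an assumption made for illustration: the bump profile, the grid, and the function names are hypothetical, and the integral is approximated by the trapezoid rule.

```python
import cmath
import math


def antiderivative(values, step):
    """Cumulative trapezoid rule for (d^{-1} g)(u_bar) = integral of g from 0 to u_bar."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * step * (left + right))
    return out


def bump_data(u_bar, u0=0.5, u1=1.5):
    """Hypothetical smooth complex DATA profile, supported in (u0, u1)."""
    if not u0 < u_bar < u1:
        return 0.0
    t = (u_bar - u0) / (u1 - u0)
    return math.exp(-1.0 / (t * (1.0 - t))) * cmath.exp(2j * u_bar)


def omega2_leading(data_values, step):
    """Leading coefficient omega_2(0) = -d^{-1} |omega_1(0)|^2 on the grid."""
    squared = [abs(d) ** 2 for d in data_values]
    return [-v for v in antiderivative(squared, step)]


step = 0.01
grid = [i * step for i in range(201)]          # u_bar in [0, 2]
data = [bump_data(ub) for ub in grid]
omega2 = omega2_leading(data, step)
```

On this grid `omega2` vanishes where DATA has not yet arrived and is non-increasing afterwards, which is the mechanism that Remark 5.6 singles out: the term $-|\omega_1|^2$ is the only relevant nonlinear term in (5.5a). Since $\underline{u}$ ranges over an interval of length 2 here, the sup of $|\partial_{\underline{u}}^{-1} g|$ is at most twice the sup of $|g|$.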

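We now prepare for the proof of Proposition 6.2.

Definition 6.3. Recall (4.1). Let $[\,M_{a,A}\,](x) = \sum_{k=0}^{\infty} (\tfrac1u)^k M_{a,A}(k)(\xi, \underline{u})$ be the formal expansion in $\tfrac1u$ for the Minkowski vacuum field $M_{a,A}$ with

$$\bigl[\,\tfrac{1}{\rho}\,\bigr] = -\frac{1}{u} + \frac{1}{u^2}\,[\,S\,], \qquad
[\,S\,] = -\sum_{k=0}^{\infty} \Bigl(\frac{1}{u}\Bigr)^k A^{2(k+1)}\, \underline{u}^{\,k+1}. \tag{6.4}$$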
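The expansion (6.4) is the geometric series for $\rho^{-1} = (A^2\underline{u} - u)^{-1}$ in powers of $\tfrac1u$, rewritten through (4.3). A brief numerical sketch, with hypothetical function names and sample values chosen so that $|A^2 \underline{u} / u| < 1$, compares partial sums of $[\,S\,]$ with the closed form $S = \rho^{-1}u^2 + u$:

```python
def S_exact(u_bar, u, A):
    """Closed form S = u^2 / rho + u with rho = A^2 u_bar - u, see (4.3)."""
    rho = A ** 2 * u_bar - u
    return u ** 2 / rho + u


def S_partial_sum(u_bar, u, A, order):
    """Partial sum of the formal series [S] in (6.4), up to (1/u)^order."""
    return -sum((1.0 / u) ** k * A ** (2 * (k + 1)) * u_bar ** (k + 1)
                for k in range(order + 1))


def truncation_errors(u_bar, u, A, max_order):
    """Absolute error of each partial sum of [S] against the closed form."""
    exact = S_exact(u_bar, u, A)
    return [abs(S_partial_sum(u_bar, u, A, n) - exact)
            for n in range(max_order + 1)]
```

With, say, `u_bar = 1.5`, `u = -10.0`, `A = 0.8`, the ratio $A^2\underline{u}/|u|$ is about 0.1, so each additional order should shrink the error by roughly one decimal digit. The series is only formal in the paper; the convergence seen here is specific to the Minkowski field.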

Definition 6.4. Regard the components of (k) and (k), k ≥ 0, and their formal first coordinate derivatives, as an infinite family of independent abstract variables. Set P0 = 0. The generic symbol Pk , k ≥ 1, is an arbitrary polynomial in the components of () and (), 0 ≤ ≤ k − 1, and all their first coordinate derivatives ( ∂ x∂ μ () and ∂ x∂ μ (), μ = 1, 2, 3), whose coefficients are (complex) polynomials in A, u, ea,A, λa,A, λa,A, and all their first coordinate derivatives. It is further required that the polynomial Pk have no constant term, that is, Pk vanishes when () and ∂ ∂ x μ () vanish for all 0 ≤ ≤ k − 1 and μ = 1, 2, 3. We use the same symbol Pk for a vector or matrix whose entries are all polynomials of this kind.


Proposition 6.5. [] is a formal power series solution to (5.3a), with [ Ma,A ] in the role of Ma,A, if and only if its coefficients (k), k ≥ 0, satisfy a system of the form z 1 (k) = Pk , z 2 (k) = Pk , z 3 (k) = Pk , ∂ ∂u

k > 0, k > 0, k > 0,

(6.5a) (6.5b) (6.5c)

k ≥ 0,

(6.5d)

k ≥ 0,

(6.5e)

k ≥ 0,

(6.5f)

k ≥ 0,

(6.5g)

ω3 (k) = −z 2 (k) − λ ω1 (k) + Pk ,

k ≥ 0,

(6.5h)

ω4 (k) = −λ ω1 (k) + Pk ,

1 ω5 (k) = − k+1 ω4 (k) − ω3 (k) + Pk ,

k ≥ 0,

(6.5i)

k ≥ 0,

(6.5j)

ω6 (k) = Pk , ω7 (k) = Pk ,

k > 0, k > 0,

(6.5k) (6.5l)

k ≥ 0,

(6.5m)

e ω1 (k) + Pk ,

k ≥ 0,

(6.5n)

= −i e ω1 (k) + Pk ,

k ≥ 0,

(6.5o)

k ≥ 0,

(6.5p)

k ≥ 0,

(6.5q)

k ≥ 0,

(6.5r)

z 5 (k) = Pk ,

2 ∂ (1 − δk0 )z 4 (k) = − k−δ e ∂ξ + 2λ z 5 (k) + Pk , k0 ∂ ∂u ∂ ∂u ∂ ∂u ∂ ∂u

ω1 (k) = −z 1 (k) + Pk ,

ω2 (k) = −(2 − δk0 ) ω1 (0) ω1 (k) + Pk ,

∂ ∂u ω8 (k) ∂ ∂u f 1 (k) ∂ ∂u f 2 (k) ∂ ∂u ∂ ∂u

= z 3 (k) + 2i λ ω4 (k) − ω3 (k) − ω5 (k) + Pk , =

2 f 3 (k) = − k+2 ω8 (k) + Pk ,

f 4 (k) = 2 e ω4 (k) − ω3 (k) − ω5 (k) + Pk ,

f 5 (k) = −2 e ω4 (k) − ω3 (k) − ω5 (k) + Pk ,

with e = ea,A and λ = λa,A. Proof. Substitute the formal series (6.1) into the relevant / irrelevant form of system (5.3a) given in Proposition 5.5. Collect all coefficients of common powers of u1 . Lemma 6.6. For all a, A = 0, u 0 > 0, all smooth DATA(ξ, u) : R2 × (0, ∞) → C that vanish when u < u 0 , there is a unique formal power series [ ] on Strip∞ , which satisfies (5.3a) and [ ] = 0 when u < u 0 ,

(0) is given by (6.3).

(6.6)

Proof. (0), as given by (6.3), satisfies the k = 0 equations in (6.5). The coefficient functions (k), k ≥ 1, are constructed by induction. For each step k, Eqs. (6.5a) to (6.5r) are solved exactly in this order to obtain (k). The right hand side is explicitly ∂ known by induction and the “upper triangular” structure of (6.5a) to (6.5r). Whenever ∂u appears on the left-hand side, it is inverted using ∂u−1 , because the constant of integration is zero by the first condition in (6.6). By induction, one also verifies that (k), k ≥ 0, vanishes when u < u 0 , so that the first condition in (6.6) is satisfied at all orders. It is essential at precisely this point that the generic polynomial Pk in Definition 6.4 has no

290

M. Reiterer, E. Trubowitz

constant term. Finally, by Proposition 6.5, there exists a formal power series solution satisfying the hypothesis of the lemma. The construction given here is forced at every step, and therefore generates a unique formal power series. Remark 6.7. Lemma 6.6 is simpler than Proposition 6.2, because it assumes Eqs. (6.3) and it makes no statement about the formal constraint field [ ]. Proof (of Proposition 6.2). We first prove existence. It suffices to show that the formal power series [ ] produced by Lemma 6.6 satisfies [ ] = 0. (The formal finite speed of the propagation statement in Proposition 6.2 follows from an examination of the construction of [ ] in the proof of Lemma 6.6.) Note that • [ ] is a formal power series solution to the linear homogeneous system (5.3b). • [ ] = 0 when u < u 0 . • (0) = 0 on R2 × (0, ∞). The first bullet follows from Proposition 2.18, because [ ] is a formal power series solution to (5.3a). The second bullet follows from the first condition in (6.6), which implies [ Φ ] = [ Ma,A ] when u < u 0 , and [ Φ ] = [ Ma,A ] = 0. For the third bullet, note that p5 (0), y1 (0), y2 (0), y3 (0), p6 (0) all vanish on R2 × (0, ∞) by the second condition in (6.6). By the first two bullets and by Eq. (5.7a), we conclude, step by step, that s1 (0), s2 (0), p1 (0), p2 (0), p3 (0), p4 (0), s4 (0), s5 (0), p7 (0), p8 (0) also vanish. The first equation in (5.7b) gives s3 (0) = 0. It remains to show that p9 (0) = 0 on R2 × (0, ∞). By (5.6c),

∂ ∂ p9 (0) = −2 e ∂ξ + 2 λ ω7 (0) + 2 e ∂ξ ω6 (0) + ω4 (0) − ω3 (0). ∂ 2 The second condition in (6.6) implies ( ∂u ) p9 (0) = 0. By the second bullet, p9 (0) ≡ 0.

The three bullets imply, by induction on k ≥ 1, that (k) = 0 on R2 × (0, ∞). In fact, at each step k, one verifies, in the given order, that y1 (k), y2 (k), y3 (k) all vanish by (5.7c), p1 (k), p2 (k), p3 (k), p4 (k) all vanish by (5.7a), p5 (k), p6 (k) both vanish by (5.7b), p7 (k), p8 (k) both vanish by (5.7a), p9 (k), s3 (k) both vanish by (5.7b), and s1 (k), s2 (k), s4 (k), s5 (k) all vanish by (5.7a). This concludes the existence proof. Uniqueness in Lemma 6.6 implies uniqueness in Proposition 6.2, because we now show that (5.3a) and [ ] = 0 and (6.2) together imply (6.3), which is the second condition in (6.6). Condition (6.2) and the k = 0 equations in (6.5) imply (6.3), apart from the formulas for ω7 (0), z 2 (0), z 3 (0), z 4 (0), ω6 (0). The remaining five formulas follow from the vanishing of p5 (0), y1 (0), y2 (0), y3 (0) and p6 (0), see (5.6c) and (5.6d). Here, (0) = (s(0), p(0), y(0)). Proposition 6.8. For all k, R ≥ 0, all 0 < |A| ≤ |a| ≤ 1, and all DATA,

(k) C R (Q) ≤ pk,R DATA C R+2k+3 (Q) , Q = D4| Aa | (0) × (0, 2), where [ ] is the corresponding formal solution in Proposition 6.2, and pk,R : R → R, is an infinite family, indexed by k, R ≥ 0, of universal polynomials without constant term. Here, Dr (0) is the open disk of radius r > 0 in the (ξ 1 , ξ 2 )-plane. Here f C R (Q) = sup|α|≤R ∂ α f C 0 (Q) , where α ∈ N30 . Proof. Observe that: • ea,A C R (Q) ≤

17 2

and λa,A C R (Q) ≤

17 2

for all R ≥ 0.


• ∂u−1 g C R (Q) ≤ 2 g C R (Q) for all R ≥ 0 and all functions g = g(ξ, u) on Q. The existence of polynomials p0,R , R ≥ 0, follow by direct inspection of (6.3). The existence of polynomials pk,R , R ≥ 0, is shown by induction over k ≥ 0. At each step k ≥ 1, we use (6.5). By the inductive hypothesis and Definition 6.4 there is a polynomial pk,R (depending only on k and R) so that each generic term Pk on the right-hand sides of

DATA C R+2k+2 (Q) . We can assume that pk,R has no (6.5) satisfies Pk C R (Q) ≤ pk,R constant term, because Pk does not have one (see, Definition 6.4). Now, the existence of pk,R , R ≥ 0 follows directly from estimating the non generic terms on the right-hand sides of (6.5a) to (6.5r), exploiting the upper triangular structure. Only in one equation, (6.5e), a coordinate derivative appears. Remark 6.9. In Proposition 6.8, the uniformity of the estimate in a, A, when 0 < |A| ≤ |a| ≤ 1, will be exploited later. It is compatible with taking the limit a = A ↓ 0. Remark 6.10. Fix DATA and let [ a,A ] be the formal power series solution in Proposition 6.2. The indices have been added to make the dependence on a, A = 0 explicit. One can show, by induction, that A,A(k)(ξ, u), k ≥ 0, are polynomials in A. Proposition 6.11 (Matching Stereographic Charts). Choose a, A = 0. Pick DATAσ as in Proposition 6.2, for σ = −, +, and let [ σ ] be the associated solution in Proposition 6.2. The following statements are equivalent:

a

a 1 2 2 • |ξξ 2| DATAσ A ξ, u = |ξξ |2 DATA−σ A ξ , u when ξ = 0. • Flip Aa · [ Φ σ ] = [ Φ −σ ] when ξ = 0. Here, [ Φ σ ] = [ Ma,A ] + u −M [ σ ]. • Flip Aa · [ σ ] = [ −σ ] when ξ = 0.

Here, −σ = + when σ = −, and conversely, −σ = − when σ = +. Proof. The equivalence of the last two bullets follows from Proposition 4.2, (b), and the fact that Flip Aa commutes with multiplication by u −M . Each of the last two bullets implies the first. Just look at how Flip Aa acts on the component ω1 . The first bullet implies the last two, because Flip Aa is a field symmetry, and by uniqueness in Proposition 6.2 (more precisely, by formal finite speed of propagation). Definition 6.12. For all (ξ, u) ∈ R2 × (0, ∞) with ξ = 0, set

a2 1

ξ2 Flip a · DATA (ξ, u) = 2 DATA 2 ξ , u . A A ξ

Remark 6.13. Proposition 6.2, the main result of this section, is the formal analog of Theorem 8.6, the main result of this paper. They give solutions to an asymptotic characteristic initial value problem that is motivated by [Chr]. Informally: lim (ξ, u, u) = (0)(ξ, u),

u→−∞

(ξ, u, u) = 0

when u < u 0 ,

(6.7a) (6.7b)

with the understanding that (0) is given in terms of DATA(ξ, u) by Eqs. (6.3). Equation (6.7b) stipulates that Φ coincides with the Minkowski vacuum field Ma,A when u < u 0 . On the other hand, (6.7a) is an asymptotic initial condition at u → −∞, past null infinity. All the notation, definitions and concepts required for Theorem 8.6 have now been introduced. It can be read on its own.


7. Quasilinear Symmetric Hyperbolic Systems

We discuss an abstract local existence theorem and an abstract energy estimate.

Convention 7.1. In this section, we use the coordinates q instead of x:

$$x = (x^1, x^2, x^3, x^4) = (\xi^1, \xi^2, \underline{u}, u), \qquad q = (q^0, q^1, q^2, q^3) = (t, \xi^1, \xi^2, \underline{u}), \qquad t = \underline{u} + u,$$

and the spatial part of q is $(q^1, q^2, q^3)$.
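The change of coordinates in Convention 7.1 can be sketched in a few lines of Python. This is only an illustration: it assumes that the third entry of x is the underlined coordinate (called `u_bar` below) and the fourth is u, so that t = u_bar + u and u = q^0 − q^3. The function names and the sample point are hypothetical.

```python
def x_to_q(x):
    """Map x = (xi1, xi2, u_bar, u) to q = (t, xi1, xi2, u_bar), with t = u_bar + u."""
    xi1, xi2, u_bar, u = x
    return (u_bar + u, xi1, xi2, u_bar)


def q_to_x(q):
    """Inverse map: recover u = q^0 - q^3 from q = (t, xi1, xi2, u_bar)."""
    t, xi1, xi2, u_bar = q
    return (xi1, xi2, u_bar, t - u_bar)


# Sample point (hypothetical values): u_bar in (0, 1), u far in the past.
x = (0.5, -1.5, 0.75, -2000.0)
q = x_to_q(x)
assert q_to_x(q) == x
```

The round trip recovers x because u is read off as q^0 − q^3, the relation recalled in Convention 8.1.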

$B_r(p) \subset \mathbb{R}^n$ is the open ball of radius r > 0 around p, and $D_r(p) = B_r(p)$ if n = 2. $\mathrm{Mat}_n$ ($\mathrm{Sym}_n$) is the vector space of all n × n real (symmetric) matrices. $X \lesssim_a Y$ means X ≤ CY for a constant C > 0 that depends only on $a \in \mathbb{R}^k$. $X \lesssim Y$ means X ≤ CY for a universal constant C > 0.

Proposition 7.2 (Existence / Breakdown Theorem). Suppose a tuple

T, P, Mμ , h, Mμ , H, K, Q

(μ = 0, 1, 2, 3) and the derived tuple (U, A) satisfy: T ∈ R and U = (−∞, T ) × R3 and P ∈ N and A = U × B2 (0) ⊂ R4 × R P . Mμ ∈ C ∞ ( A, Sym P ) and M0 ≥ 21 and h ∈ C ∞ ( A, R P ). μ 0 ∞ 1 (EB2) M ∈ Sym P and M ≥ 2 and H ∈ C ( (−∞, T ], Mat P ). μ 3 (EB3) K ⊂ Q ⊂ R and K compact and Q open. Moreover Mμ (q, ) = M and 3 h(q, ) = H (t) for all (q, ) ∈ ((−∞, T ) × (R \ K)) × B2 (0). Here t = q 0 . (EB0) (EB1)

Extend Mμ and h, by Mμ and H (t), to (−∞, T ) × (R3 \ K) × R P . Then: Part 1. For each t0 < T , there is a t1 ∈ (t0 , T ] and a C ∞ -solution : [t0 , t1 ) × R3 → R P

of

Mμ (q, ) ∂q∂ μ = h(q, )

(7.1)

with (t0 , · ) ≡ 0, such that supp ⊂ [t0 , t1 ) × Br (0) for some finite r > 0, and

[t0 , t1 ) × Q ⊂ B1 (0) ⊂ R P ,

(7.2)

and such that t1 = T implies either one or both of: (Break)1 : ([t0 , t1 ) × Q) ⊂ B1 (0) ⊂ R P . (Break)2 : The vector field ∂q is unbounded on [t0 , t1 ) × Q. Part 2. If M3 ≥ 0 on A and if h(q, 0) = 0 when q 3 < 21 , then |q 3 <1/2 ≡ 0. Remark 7.3. Proposition 7.2 is stated without proof. The main ingredient is a standard existence theorem for quasilinear symmetric hyperbolic systems as in [Tay], pp. 360– 370. The specific geometry (of space, time and target space) in Proposition 7.2 can be reduced to the standard geometry in [Tay] using partitions of unity and finite speed of propagation. (EB3) implies that (7.1) reduces to the linear homogeneous Mμ ∂q∂ μ = H (t) for all q not in the compact K, with Mμ constant.


Lemma 7.4. Recall the matrix differential operators A(Φ) and A(Φ) associated to Φ = (e, γ , w), see Sect. 2. Suppose θ is a one-form and suppose θ (N ) θ (D) ≥ 0. (7.3) θ (L) ≥ 0, θ (N ) ≥ 0, θ (D) θ (L) (≥ 0 as Hermitian matrices). Then

θ A(Φ) ≥ 0,

θ A(Φ) ≥ 0.

Definition 7.5. For all ξ0 ∈ R2 and b ∈ [1, 2] set

1 rb (t, u) = 41 + 0.018 |b + 0.001 − u| · 0.002 −

(7.4)

1 u+|t| ,

θb,ξ0 = the differential of the function q → |ξ − ξ0 | − rb (t, u). Proposition 7.6. Let ξ0 ∈ R2 and b ∈ [1, 2]. Suppose q = (t, ξ 1 , ξ 2 , u) and the parameters a, A ∈ R and the field Φ = Ma,A + u −M (see Sect. 5) satisfy • u ∈ (0, b) and t ∈ (−∞, −1000). a • |ξ − ξ0 | = rb (t, u) and |ξ | < 4| A |. −3 • 0 < |A| ≤ |a| ≤ 10 and |(q)| ≤ 5. Then (7.4) holds at q, for the one-form θb,ξ0 in Definition 7.5. That is, the hypersurface |ξ − ξ0 | = rb (t, u) is non-timelike at q with respect to A(Φ) and A(Φ). Remark 7.7. To prove Proposition 7.6, calculate θb,ξ0 = d(|ξ − ξ0 | − rb (t, u)) and use (5.4) to check (7.3), keeping in mind the coordinate transformation between x and q in Convention 7.1. Assumption |(q)| ≤ 5 implies | f 1 (q)|, . . . , | f 5 (q)| ≤ 5, and only these five inequalities are used. Here = ( f, ω, z). Definition 7.8 (Energy and Supremum). For every integer k ≥ 0, every open X ⊂ R3 , every t ∈ R and every scalar / vector / matrix valued C k -function f = f (t, q), set12 (k) k α EX { f }(t) = d3 q |∂ α f (t, q)|2 , SupX { f }(t) = sup sup |∂ f (t, q)|. |α|≤k X α∈N40

|α|≤k q∈X α∈N40

Proposition 7.9 (Energy estimate). Suppose a tuple (t0 , t ∗ ), ξ0 , b, Pm , R, Mμ , H, Src, , Mμ , H, c1 , c2 , J μ

(μ = 0, 1, 2, 3 and m = 1, 2, 3) and the derived tuple (I, U, P, Mm , Hm ) satisfy (RE0) through (RE12) and one of (RE13a), (RE13b): (RE0)

(RE1) (RE2)

−∞ < t0 < t ∗ < −1000 and I = (t0 , t ∗ ) and ξ0 ∈ R2 and b ∈ [1, 2] and

U = t∈I {t} × O(ξ0 , b, t) O(ξ0 , b, t) = u∈(0,b) Drb (t,u) (ξ0 ) × {u} . P1 , P2 , P3 ∈ N and P = P1 + P2 + P3 ≤ 109 and R ∈ N0 and c1 , c2 , J ∈ R. Mμ ∈ C R+1 (U, Sym P ) and H ∈ C R (U, Mat P ) and Src ∈ C R (U, R P ).

12 $\partial^\alpha = \prod_{\mu=0}^{3} \big(\tfrac{\partial}{\partial q^\mu}\big)^{\alpha_\mu}$. The pointwise norm | · | is the Euclidean norm. For a matrix, $|A|^2 = \operatorname{tr}(A^T A)$.


1 (RE3) 2 ≤ M0 ≤ 2 and M3 ≥ 0. (RE4) θμ Mμ ≥ 0 on (∂U) ∩ (I × R2 × (0, b)) with θ = θb,ξ0 as in Definition 7.5. (RE5) ∈ C R+1 (U, R P ) is a solution to the linear symmetric hyperbolic system

Mμ (q) ∂q∂ μ = H (q) + Src(q)

on U.

(7.5)

and Src vanish identically when q 3 < 21 . μ ∞ (RE7) M ∈ Sym P and H ∈ C (I, Mat P ). 0 1 2 3 1 (RE8) 2 ≤ M ≤ 2 and M = M = 0 and M ≥ 0. (RE6)

Make the R P = R P1 ⊕ R P2 ⊕ R P3 block decompositions = (m )

Src

μ Mμ = (Mmn )

= (Srcm )

H = (Hmn )

H = ( Hmn )

with m, n = 1, 2, 3. (RE9) (RE10) (RE11)

μ

μ

μ

Mmn = 0 if m = n and Mmm = Mm for m = 1, 2, 3. μ M22 ∂q∂ μ = ν(q) 1 P2 ( ∂q∂ 0 + ∂q∂ 3 ) for some function ν on U.

Hmn

⎛

0 = ⎝ H1 0

0 0

|t|−1 H2

0 0

⎞

|t|−1 H3

⎠

for real, constant matrices H1 , H2 , H3 , with H3 ∈ Sym P3 and H3 ≤ 0. ≥ 0 and J > 0 and for all t ∈ I: ⎫ R |t|2J +2 E O (ξ0 ,b,t) {Src1 }(t)⎪ ⎪ ⎬ 2J R |t| E O(ξ0 ,b,t) {Src2 }(t) ≤ (c1 )2 . ⎪ ⎪ ⎭ R { Src }(t) |t|2J +2 E O 3 (ξ0 ,b,t)

(RE12) c1

(RE13a)

R ≥ 4 and c2 > 0 and for all t ∈ I: ⎫ R μ μ |t|2 E O (ξ0 ,b,t) {M − M }(t)⎪ ⎪ ⎪ ⎪ ⎪ 2 R |t| E O(ξ0 ,b,t) {H1n − H1n }(t)⎬ R ⎪ EO (ξ0 ,b,t) {H2n − H2n }(t)⎪ ⎪ ⎪ ⎪ ⎭ 2 R |t| E O(ξ0 ,b,t) {H3n − H3n }(t)

(RE13b)

≤ (c2 )2 .

R ≥ 0 and c2 > 0 and for all t ∈ I: ⎫ (max{1,R}) |t| SupO(ξ0 ,b,t) {Mμ − Mμ }(t)⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (R) |t| SupO(ξ0 ,b,t) {H1n − H1n }(t)⎬ (R) SupO(ξ ,b,t) {H2n 0 (R) |t| SupO(ξ0 ,b,t) {H3n

⎪ − H2n }(t)⎪ ⎪ ⎪ ⎪ ⎪ − H }(t)⎭ 3n

≤ c2 .


Then, for all J0 > 0, there are constants c3 (X ) ∈ (0, 1), c4 (X ) > 0 depending only on X = R, J0 , | Hm | , such that J ≥ J0 and c2 ≤ c3 (X ) and |t ∗ |−1 ≤ c3 (X ) imply J R R (7.6) EO sup |τ | J E O (ξ0 ,b,τ ) {}(τ ) ≤ c4 (X ) |t0 | (ξ0 ,b,t0 ) {}(t0 ) + c1 . τ ∈I

Proof. In this proof $E^R = E^R_{O(\xi_0,b,t)}$ and $\mathrm{Sup}^{(R)} = \mathrm{Sup}^{(R)}_{O(\xi_0,b,t)}$, and $\alpha, \beta \in \mathbb{N}_0^4$.

Step 1. For any function f with values in $\mathbb{R}^{P_i}$, i = 1, 2, 3, set

$$E_i^0\{f\}(t) = \int_{O(\xi_0,b,t)} d^3q \; \big(f^T M_i^0 f\big)(t, q), \qquad E_i^R\{f\}(t) = \sum_{|\alpha| \le R} E_i^0\{\partial^\alpha f\}(t),$$

the energy naturally associated to the symmetric hyperbolic system (7.5). By (RE3):

$$E^R\{f\}(t) \le 2\, E_i^R\{f\}(t), \qquad E_i^R\{f\}(t) \le 2\, E^R\{f\}(t). \tag{7.7}$$
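The two inequalities in (7.7) come from a pointwise bound: if a symmetric matrix M satisfies 1/2 ≤ M ≤ 2, as (RE3) requires of M^0, then |f|^2 ≤ 2 f^T M f and f^T M f ≤ 2 |f|^2. Integrating and summing over derivatives gives (7.7). The following Python sketch checks the pointwise bound on random 2 × 2 examples. The matrix size, the sampling and all names are illustrative assumptions, not part of the paper.

```python
import math
import random


def sym2(l1, l2, theta):
    """Symmetric 2x2 matrix with eigenvalues l1, l2 and eigenvectors rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    off = (l1 - l2) * c * s
    return ((l1 * c * c + l2 * s * s, off),
            (off, l1 * s * s + l2 * c * c))


def quad(M, f):
    """Quadratic form f^T M f for a 2x2 matrix M and a 2-vector f."""
    return sum(f[i] * M[i][j] * f[j] for i in range(2) for j in range(2))


def check_equivalence(trials=1000, seed=0, tol=1e-12):
    """Check |f|^2 <= 2 f^T M f and f^T M f <= 2 |f|^2 when 1/2 <= M <= 2."""
    rng = random.Random(seed)
    for _ in range(trials):
        M = sym2(rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0), rng.uniform(0.0, math.pi))
        f = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        norm2 = f[0] ** 2 + f[1] ** 2
        energy = quad(M, f)
        if not (norm2 <= 2.0 * energy + tol and energy <= 2.0 * norm2 + tol):
            return False
    return True


assert check_equivalence()
```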

If R ≥ 2 and f is a vector or matrix valued $C^R$-function, then (Sobolev inequality):

$$\mathrm{Sup}^{(R-2)}\{f\}(t) \lesssim_R \sqrt{E^R\{f\}(t)}. \tag{7.8}$$

Here, it is legitimate to use $\lesssim_R$ for $(\xi_0, b, t) \in \mathbb{R}^2 \times [1, 2] \times (-\infty, -1000)$. In fact, $D_1(0) \times (0, 1)$ is diffeomorphic to $O(\xi_0, b, t)$ by $(\xi, \underline{u}) \mapsto (\xi_0 + r_b(t, b\underline{u})\,\xi,\ b\underline{u})$. Since all derivatives of order up to R − 1 of the Jacobians of both the diffeomorphism and its inverse have finite sup-norms not depending on $\xi_0, b, t$, inequality (7.8) follows from the analogous Sobolev inequality on $D_1(0) \times (0, 1)$, by a change of variables. By the Sobolev inequality and the product rule (suppress the argument t ∈ I):

$$\begin{aligned}
E^R\{f_1 f_2\} &\lesssim_R E^R\{f_1\}\, E^R\{f_2\} && \text{if } R \ge 4,\\
E^0\{[\partial^\alpha, f_1] f_2\} &\lesssim_R \sum_{|\beta|=1} E^{R-1}\{\partial^\beta f_1\}\, E^{R-1}\{f_2\} && \text{if } R \ge 4,\ |\alpha| \le R,\\
E^R\{f_1 f_2\} &\lesssim_R \big(\mathrm{Sup}^{(R)}\{f_1\}\big)^2\, E^R\{f_2\} && \text{if } R \ge 0,\\
E^0\{[\partial^\alpha, f_1] f_2\} &\lesssim_R \sum_{|\beta|=1} \big(\mathrm{Sup}^{(R-1)}\{\partial^\beta f_1\}\big)^2\, E^{R-1}\{f_2\} && \text{if } R \ge 0,\ |\alpha| \le R.
\end{aligned} \tag{7.9}$$

In the fourth inequality, the left-hand side vanishes when R = 0, because then α = 0. Step 2. Here t ∈ I and |α| ≤ R. Apply ∂ α to (7.5): Mμ ∂q∂ μ (∂ α ) = H ∂ α + (S1α , S2α , S3α ) + ∂ α Src, (7.10) def

(S1α , S2α , S3α ) = ∂ α (H − H ) + [∂ α , H ] + [Mμ − Mμ , ∂ α ]∂μ .


Assumption (RE11) and inequalities (7.9) imply ⎧ 3 ⎨

E 0 {Siα } R

⎩

E 0 {Siα } R

⎧ 3 ⎨

⎩ +

j=1

3

+

μ

μ

E {M − M } R

μ=0

Sup

j=1 3

E R {Hi j − Hi j } + |t|−4 | H2 |2 + |t|−4 | H3 |2

(R)

{Hi j − Hi j }

2

⎫ ⎬ ⎭

if R ≥ 4 E R {}

+ |t|−4 | H2 |2 + |t|−4 | H3 |2

⎫ ⎬ (R) μ μ 2 Sup {M − M } E R {} ⎭

if R ≥ 0

μ=0

Set c∗ = max c2 , |t ∗ |−1 (| H2 |2 + | H3 |2 )1/2 . If (RE13a) or (RE13b), then max |t|2 E 0 {S1α }, E 0 {S2α }, |t|2 E 0 {S3α } R c∗2 E R {}

(7.11)

Step 3. We now derive the inequalities (7.15). Recall (RE2), (RE9). The “energy currents” associated to = (i ), i = 1, 2, 3, and their Euclidean divergences are

μ μ μ ∂μ ji [i ] = iT (∂μ Mi )i + 2iT Mi ∂μ i )

def μ μ ji [i ] = iT Mi i μ

(7.12)

by $(M_i^\mu)^T = M_i^\mu$. We do not sum over repeated lower indices. For τ ∈ I set

$$D_1(\tau) = D_3(\tau) = \{\, (t, q) \in U \mid t \in (\tau_-, \tau) \,\}, \qquad \tau_- = \tau_-(\tau) = \max\{t_0, \tau - 2\},$$

$$D_2(\tau) = D_1(\tau) \cap \{\, q \mid q^3 - q^0 < b - \tau \,\}.$$

For D2 , see the figure, where t0 < τ2 < t0 + b < τ1 < t ∗ . Energy estimates are obtained by integrating the divergence identity in (7.12) over Di (τ ) ⊂ U and applying the Euclidean divergence theorem. The divergence theorem generates integrals over the components of ∂Di (τ ). Their contributions are: Boundary q0 = τ q3 = 0 q 0 = τ− q3 = b q3 − q0 = b − τ q 0 = t0 ξ ∈ ∂ Drb (t,u) (ξ0 )

Contribution Ei0 {i }(τ ) 0 −Ei0 {i }(τ− ) ≥0 0 ≥ −Ei0 {i }(t0 ) ≥0

i= all all 1, 3 1, 3 2 2 all

Remark (RE6) (RE3) (RE10)

τ < t0 + b (RE4)


μ

The discussion literally transposes to ∂ α i and ji [∂ α i ], for |α| ≤ R. Hence μ Ei0 {∂ α i }(τ ) − ki (τ )Ei0 {∂ α i }(τ− ) ≤ d4 q ∂μ ji [∂ α i ], (7.13) Di (τ ) μ EiR {i }(τ ) − ki (τ )EiR {i }(τ− ) ≤ d4 q ∂μ ji [∂ α i ]. (7.14) Di (τ )

|α|≤R

The first inequality implies the second, by summing over |α| ≤ R. By definition, k1 (τ ) = k3 (τ ) ≡ 1, whereas k2 (τ ) = 0 if τ− > t0 and k2 (τ ) = 1 if τ− = t0 . The divergence identity in (7.12), with ∂ α i in the role of i , and (7.10) imply ⎧ ⎫ 3 ⎨ ⎬ μ α μ Hi j ∂ α j + 21 (∂μ Mi )(∂ α i ) + ∂ α Srci + Siα . ∂μ ji [∂ i ] = 2(∂ α i )T ⎩ ⎭ j=1

For i = 1, 2, we directly estimate the right-hand side of (7.14), by using Schwarz’s inequality for the spatial part of the integral, and (RE12), (7.11) and (7.7). For i = 3, we first exploit H3 ≤ 0 from (RE11) to drop the term 2(∂ α 3 )T H33 (∂ α 3 ), and then go on as before. We also use |∂μ Mμ | = |∂μ (Mμ − Mμ )| R c2 |t|−1 ≤ c∗ |t|−1 , which holds when either (RE13a) or (RE13b) is assumed; in the first case use (7.8). Abbreviating Ei = EiR {i } and E = E1 + E2 + E3 , we have for all τ ∈ I: τ

E1 (τ ) − E1 (τ− ) X dt E1 (t) c∗ E(t) + c1 |t|−J |t|−1 , τ− τ

E2 (τ ) − E2 (t0 ) X E3 (τ ) − E3 (τ− ) X

τ −τ τ−

dt

E2 (t) E1 (t) + c∗ E(t) + c1 |t|−J ,

dt

E3 (t) E2 (t) + c∗ E(t) + c1 |t|−J |t|−1 ,

where X is defined as in the proposition. Step 4. For each A = (A1 , A2 , A3 ) ∈ (0, ∞)3 , define

J (A) = t ∈ I supτ ∈[t0 ,t] |τ |2J Ei (τ ) ≤ Ai2 , i = 1, 2, 3 .

(7.15)


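Recall J ≥ J0 > 0. Assume A satisfies

$$\begin{aligned}
A_1 &> |t_0|^J \sqrt{E_1(t_0)}, & A_1 &> C J_0^{-1} c_1, & A_1 &> C J_0^{-1} c_* |A|,\\
A_2 &> 2 |t_0|^J \sqrt{E_2(t_0)}, & A_2 &> 8 C c_1, & A_2 &> 8 C (A_1 + c_* |A|),\\
A_3 &> |t_0|^J \sqrt{E_3(t_0)}, & A_3 &> C J_0^{-1} c_1, & A_3 &> C J_0^{-1} (A_2 + c_* |A|),
\end{aligned} \tag{7.16}$$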

where $|A|^2 = A_1^2 + A_2^2 + A_3^2$, and where C = C(X) > 0 is the maximum of the three constants of proportionality in the inequalities (7.15). By (7.15), (7.16) and the continuity of $I \ni \tau \mapsto E_i(\tau)$, the set J(A) is an open and closed sub-interval of I and contains $t_0$. Therefore, J(A) = I. To see that J(A) is open in I, first observe that for every τ ∈ J(A), the inequalities (7.15), (7.16) imply the strict inequalities $E_i(\tau) < (A_i |\tau|^{-J})^2$, and then use continuity.

For each λ ≥ 0, set $A(\lambda) = \lambda \big(1,\ 1 + 8C,\ 1 + C J_0^{-1}(1 + 8C)\big)$. The three rightmost inequalities in (7.16) are homogeneous in A, and hold for A(λ), λ > 0, iff they hold for A(1), which they do if $c_* > 0$ is sufficiently small depending on X, because it is true for $c_* = 0$. The definition of $c_*$ before (7.11) implies $c_* \le (1 + |H_2|^2 + |H_3|^2)^{1/2}\, c_3(X)$. Consequently, the condition on $c_*$ holds if $c_3(X)$ is suitably small. Set $\lambda_0 = 2 |t_0|^J \sqrt{E(t_0)} + \max\{8, J_0^{-1}\}\, C c_1 \ge 0$. If λ > λ0, then the remaining six inequalities in (7.16) hold for A(λ), and J(A(λ)) = I. By the definition of J(A), we have J(A(λ0)) = I. By (7.7), inequality (7.6) follows if $c_4(X)$ is sufficiently large.

8. Classical Vacuum Fields

We show in Proposition 8.5 that Sect. 7 can be applied to (5.3a), (5.3b). Then we prove the main Theorem 8.6. See Remark 6.13.

Convention 8.1. Theorem 8.6 is formulated in terms of the coordinates x. Otherwise q is used, see Convention 7.1. Recall $u = q^0 - q^3$. For any function f = f(x), the notation f(q) stands for f(x(q)). For any vector field $v = v^\mu(x)\,(\partial/\partial x^\mu)$, the notation $v^\mu(q)$ stands for $v^\nu(x(q))\,(\partial q^\mu/\partial x^\nu)(x(q))$.

Convention 8.2. In this section, $\mathbb{C}^m$ is regarded as a real vector space over $\mathbb{R}$ with dimension 2m. A linear map $\mathbb{C}^m \to \mathbb{C}^n$ is, by convention, $\mathbb{R}$-linear. It is either a real 2n × 2m matrix, or a complex n × m matrix which may have complex conjugation C as matrix elements. Adopt similar conventions for the real subspaces in Definitions 2.2 and 2.13: $R \subset \mathbb{C}^5 \oplus \mathbb{C}^8 \oplus \mathbb{C}^5$,

⊂ C5 ⊕ C9 ⊕ C3 . R

To put (5.3a) and (5.3b) in the form required by Sect. 7, we use: (S1) (S2)

a, A ∈ R satisfy 0 < |A| ≤ |a| ≤ 10−3 . Let [ ] be as in Proposition 6.2 with DATA|q 3 <1/2 ≡ 0. Fix an integer K ≥ 0. Set K =

K +1 k=0

( u1 )k (k).


(S3)


Recall = ( f, ω, z). Change fields to (h, σ, ) = − K and then rearrange to = (1 , 2 , 3 ) = ⊕ (h 1 , h 2 , h 4 , h 5 , σ1 , σ2 , σ3 , σ4 , σ8 ) ⊕ (h 3 , σ5 , σ6 , σ7 ).

Let π be the permutation matrix with (h, σ, ) = π(1 , 2 , 3 ). The field = (1 , 2 , 3 ) takes values in π −1 R ⊂ C5 ⊕ C9 ⊕ C4 . (S4) System (5.3a), A(q, ) = f(q, ), is equivalent to B(q, ) = Q(q, ) + Src,

(8.1)

where B(q, ) = π −1 A(q, K + π ) π, 1

−1 d Q(q, ) = π ds s=0 ds − A(q, sπ ) K + f(q, K + s π + sπ ) , 0

−1 Src(q) = π f(q, K ) − A(q, K ) K . 1 d Here, π −1 R → π −1 R, → Q(q, ) is R-linear. The operator ds |s=0 0 ds acts on a quadratic polynomial in s, s . (S5) → Bμ (q, ) and → Q(q, ) are affine R-linear. Set •

Bμ (q) =

•

μ d ds s=0 B (q, s),

Q(q) =

d ds s=0

Q(q, s).

The C5 ⊕ C9 ⊕ C4 block decomposition of B is B = diag(B1 , B2 , B3 ), B2 = 19 L, B3 = 14 N , and B1 is the 5 × 5 Hermitian matrix operator on the left-hand side of (5.5c). The block decomposition of Q is denoted Q = (Q mn )m,n=1,2,3 . (S7) Let Q(t) be the C5 ⊕ C9 ⊕ C4 block matrix ⎛ ⎞ 0 0 0 0 0 ⎠. Q(t) = ( Q mn )m,n=1,2,3 = ⎝ Q 1 0 |t|−1 Q 2 |t|−1 Q 3 (S6)

Q 1 , Q 2 , Q 3 are the matrices whose only nonzero entries are (C is complex conjugation): ( Q 1 )51 = ( Q 1 )72 = ( Q 2 )28 = ( Q 3 )22 = −1, ( Q 1 )93 = 1, ( Q 2 )19 = −1 − C, ( Q 2 )27 = C, ( Q 3 )11 = −2. Observe that Q 3 is symmetric and Q 3 ≤ 0. μ ∂ ∂ (S8) B = B ∂q μ = diag(U, U, U, U, V )⊕19 V ⊕14 U with U = , V = ∂q∂ 0 + ∂q∂ 3 . ∂q 0 (S9)

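Let ψ = ψ(q) : R³ → [0, 1] be the smooth cutoff function

$$\psi(q) = \frac{s\big(3 - |\tfrac{A}{a}\xi|_{\mathbb{R}^2}\big)\; s\big(\tfrac{3}{4} - |q^3 - 1|\big)}{\Big[s\big(\tfrac{3}{4} - |q^3 - 1|\big) + s\big(|q^3 - 1| - \tfrac{2}{3}\big)\Big]\Big[s\big(3 - |\tfrac{A}{a}\xi|_{\mathbb{R}^2}\big) + s\big(|\tfrac{A}{a}\xi|_{\mathbb{R}^2} - \tfrac{5}{2}\big)\Big]},$$

where $\xi = (q^1, q^2)$, and s(x) = 0 if x ≤ 0 and $s(x) = e^{-1/x}$ when x > 0. Set

$$K = D_{3|a/A|}(0) \times (\tfrac{1}{4}, \tfrac{7}{4}) \;\subset\; Q = D_{4|a/A|}(0) \times (0, 2).$$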
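The cutoff ψ of (S9) can be evaluated numerically. The Python sketch below assumes the reading ψ = [s(3 − ρ) / (s(3 − ρ) + s(ρ − 5/2))] · [s(3/4 − d) / (s(3/4 − d) + s(d − 2/3))] with ρ = |(A/a) ξ| and d = |q^3 − 1|, which agrees with the stated support in K and with the plateau where ψ equals 1. The function names and the sample values of a and A are illustrative assumptions.

```python
import math


def s(x):
    """Smooth step ingredient: 0 for x <= 0, exp(-1/x) for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def psi(q_spatial, a, A):
    """Cutoff at the spatial point (q1, q2, q3), under the reading stated above."""
    q1, q2, q3 = q_spatial
    rho = abs(A / a) * math.hypot(q1, q2)   # |(A/a) xi| in R^2
    d = abs(q3 - 1.0)                        # distance of q3 from 1
    # Each denominator is positive: its two arguments cannot both be <= 0.
    radial = s(3.0 - rho) / (s(3.0 - rho) + s(rho - 2.5))
    axial = s(0.75 - d) / (s(0.75 - d) + s(d - 2.0 / 3.0))
    return radial * axial


a, A = 1.0e-3, 5.0e-4                        # hypothetical values with 0 < |A| <= |a| <= 1e-3
print(psi((0.0, 0.0, 1.0), a, A))            # centre of the plateau
print(psi((0.0, 0.0, 0.2), a, A))            # q3 outside (1/4, 7/4)
```

With these sample values |a/A| = 2, so the plateau is |ξ| < 5 and the support lies in |ξ| < 6.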

By construction, suppR3 ψ ⊂ K, and ψ is equal to 1 on D 5 | a | (0) × ( 13 , 53 ). 2 A By (S1) we have ψ C R (R3 ) R 1 for each integer R ≥ 0. μ (S10) Set Mμ (q, ) = ψBμ (q, )+(1−ψ) B and H (q, ) = ψ Q(q, )+(1−ψ) Q(t) and h(q, ) = H (q, ) + ψ Src(q).


Recall (S4). If (1) , (2) are both smooth solutions to B = Q + Src, then ϒ = (2) − (1) is a solution to B(q, (1) )ϒ = Gϒ. By definition,

d Q(q, (1) )(s) − B(q, s)(2) + Q(q, s)(2) . G = ds s=0

The map π −1 R → π −1 R, → G = G q, (1) , (2) , ∂q (2) is R-linear. ! Recall the constraint field = (s, p, y). Rearrange to (S1)

(S11)

= ( 1 , 2 , 3 ) = y ⊕ (s1 , s2 , s4 , s5 , p1 , p2 , p3 , p4 , p7 , p8 ) ⊕ (s3 , p5 , p6 , p9 ), ⊂ C3 ⊕C10 ⊕C4 . The permutation with values in π −1 R π is such that = π . ! System (5.3b), (S2) A(q, ) = f(q, , ∂q ) , is equivalent to the linear, homo , ∂q ) , with geneous symmetric hyperbolic system B(q, ) = Q(q, Bμ (q, ) = π −1 π Aμ (q, )

, ∂q ) = Q(q, π −1 π. f(q, , ∂q )

→ → Q(q, , ∂q ) is R-linear. Moreover, π −1 R, Bμ The map π −1 R depends affine R-linearly on , and Q depends affine R-linearly on ⊕ ∂q .

! The C3 ⊕ C10 ⊕ C4 block decomposition of B2 = (S3) B is B = diag B1 , B2 , B3 , 110 L , B3 = 14 N , and B1 is the 3 × 3 Hermitian matrix operator on the left-hand is denoted Q = (Q mn )m,n=1,2,3 . side of (5.7c). The block decomposition of Q be the C3 ⊕ C10 ⊕ C4 block matrix ! Let Q(t) (S4) ⎛ ⎞ 0 0 0 1 = ( Q mn )m,n=1,2,3 = ⎝ Q 0 0 ⎠. Q(t) 2 |t|−1 Q 3 0 |t|−1 Q 1 )5,1 = 1, Q1, Q2, Q 3 are the matrices whose only nonzero entries are ( Q 2 )3,7 = ( Q 2 )1,9 = ( Q 2 )4,10 = ( Q 3 )1,1 = −1, ( Q 2 )1,10 = ( Q 2 )2,8 = ( Q ( Q 2 )4,9 = −C. Observe that Q 3 is symmetric and Q 3 ≤ 0. ! (S5) B = Bμ ∂q∂ μ = 13 U ⊕ 110 V ⊕ 14 U with U = ∂q∂ 0 and V = ∂q∂ 0 + ∂q∂ 3 . Definition 8.3. Each entry to the left of the vertical bar is a generic symbol for a polynomial (with complex coefficients) in the (components of the) quantities to the right and their complex conjugates.

J u −1

J 1 (That is, a generic symbol for a complex number)

H linear over R in (0)

H linear over R in A, e, λ

G K u −1 , A, S, e, λ, (k)k=0...K +1 , and their first derivatives

G K

u −1 , A, u, S K , e, λ, (k)k=0...K +1 , and their first derivatives.

It has no constant term as a polynomial in (k) and its derivatives.

−1 G u , A, S, e, λ, (0), − (0), and first derivatives G0 u −1 , A, S, e, λ, (0), − (0)

G1 like G , but it has no constant term as a polynomial in −(0), ∂q −(0)

G u −1 , A, u, S0 , e, λ, (0), and first derivatives.


The symbols G K , G K represent polynomials whose coefficients and degrees may depend on K . The remaining eight symbols represent polynomials whose coefficients and degrees are independent of K . The functions S K and S0 are defined through 1 S = − k=0 ( u1 )k A2(k+1) u k+1 + u +1 S for all ≥ 0, see (4.3) and (6.4). Proposition 8.4. Part 1. Let Src = (Srcm )m=1,2,3 be the C5 ⊕ C9 ⊕ C4 decomposition. Bμ (q, 0) = Bμ

u −1 H

+ u −2 G K ,

Q 1n (q, 0) = Q 1n (q) + u −1 H + u −1 H

+ u −2 G K ,

Q 2n (q, 0) = Q 2n (q) +

H

+ u −1 G K ,

Q 3n (q, 0) = Q 3n (q) + (|t|−1 + u −1 )J

+ u −2 G K ,

Src1 (q) • μ

+

= u −(K +2) G K

B (q) = u −2 J

•

H+ Src2 (q)

= u −(K +2) G K

Q 1n (q) = u −1 J

•

Q 2n (q) =

−(K +3) Src3 (q) = u GK , • J Q 3n (q) = u −2 J .

Part 2a. Bμ (q, ) = Bμ 1n (q) 1n (q, , ∂q ) = Q Q 2n (q) + 2n (q, , ∂q ) = Q H Q −1 −1 Q 3n (q, , ∂q ) = Q 3n (q) + (|t| + u )J

+ u −2 G0 , + u −2 G , + u −1 G , + u −2 G .

Part 2b.

1 , 3 , s1 , s2 , p1 , p2 , p3 = u −1 G + G1 ,

s4 , s5 , p4 , p7 , p8 = u −1 G + u G1 . Part 2c. It is a consequence of (5.3b) that ⎛ ⎞ ⎛ ⎞ e( p 4 − p5 ) + e( p 6 − p3 ) s4 ⎜ s5 ⎟ ⎜i e( p 4 − p5 ) − i e( p 6 − p3 )⎟ ⎜ ⎟ ⎜ ⎟ 0 L ⎜ p4 ⎟ = ⎜ ⎟ + u −1 G . ⎝ p ⎠ ⎝ λ( p − p ) − λ( p − p ) ⎠ 7 6 4 3 5 p8 −λ( p 6 − p3 ) + λ( p 4 − p5 ) Proof. Part 1. By direct verification, using (S4), (S6), (S7), Proposition 5.5, Remark 5.4, Definition 5.3. Three representative cases are (C is complex conjugation): entry (4, 5) of B1 (q, 0) − B1 = − u1 e + u12 G K , entry (6, 5) of Q 22 (q, 0) − Q 22 (q) = −ω1 (0) C − ω1 (0) + u1 G K , 2 ). entry (1, 1) of Q 33 (q, 0) − Q 33 (q) = ( u2 + u12 G K ) − (− |t| For Src1 , use f(q, 0) = 0 and the construction of [ ]. The truncation at K + 1 in (S2) is used, in particular, to show that the last component of Src1 is u −(K +2) G K . Parts 2a, 2c. By direct verification, using Proposition 5.9, Remark 5.4, Definition 5.3. Part 2b. Recall Proposition 5.7. Write = (0) + ( − (0)). Consider each field in Part 2b as a polynomial in − (0) and ∂q ( − (0)), with coefficients possibly depending on (0), ∂q (0). The constant term of this polynomial is u −1 G , by [ ] = 0 and Remark 6.1. Everything else is of the form G1 or u G1 , respectively.


Proposition 8.5 (Main Technical Proposition). Fix K ≥ 0 and DATA as in (S2). Recall ! through (S5) ! . Let R ≥ 4 be an integer. Set (S1) through (S11) and (S1)

Y = R, K , DATA C R+2K +6 (Q) .

(8.2)

Fix c2 ∈ (0, 1) and T < −1000. There are constants c6 (R), c7 (Y ) ∈ (0, 1), nonincreasing in all their arguments, such that Parts 1, 2, 3 below hold whenever |a| ≤ c6 (R)c2

DATA C R+4 (Q) ≤ c6 (R)c2

|T |−1 ≤ c7 (Y )c2 .

(8.3)

Part 1. All the assumptions of Proposition 7.2 (also those of Part 2) hold for the tuple (T, 31, Mμ , h, Bμ , Q, K, Q) whose last six entries are defined in (S7), (S8), (S9), (S10). Part 2. If t0 < T and : [t0 , t0 + ) × R3 → π −1 R ( > 0) is a C ∞ -solution to M(q, ) = h(q, ), which vanishes identically at t0 , then |t0 | K +1 supξ0 ∈R2

R −1 EO (ξ0 ,2,t0 ) {}(t0 ) ≤ (c7 (Y )) .

(8.4)

Part 3. Assumptions. We distinguish three systems (Sys1), (Sys2), (Sys3). Part 3 of the proposition applies to each of these systems individually. First, (Sys1): (Sys2), (Sys3):

b = 2, b = 1,

ξ0 ∈ R2 , ξ0 ∈ D2| Aa | (0).

Second, fix t0 < T and let V = t∈(t0 ,T ) {t} × O(ξ0 , b, t) ⊂ R4 . For (Sys1), (Sys3) there is a field , for (Sys2) there are two fields (1) and (2) , with: (i) , (1) , (2) ∈ C p ( V, π −1 R) with p = ∞ for (Sys1), (Sys3) and p = 1 for (Sys2). (ii) ⎧ ⎪ ⎨ M(q, ) = h(q, ) B(q, (i) )(i) = Q(q, (i) ) + Src(q) ⎪ ⎩ B(q, ) = Q(q, ) + Src(q)

for (Sys1), for (Sys2), for (Sys3).

(iii) , (1) , (2) vanish when q 3 < 21 . (iv) For all t ∈ (t0 , T ): ⎧ ⎪ ⎪ ⎨

R 2 EO (ξ0 ,2,t) {}(t) ≤ (c6 (R)c2 )

(1) (i) SupO(ξ ,1,t) { }(t) 0 ⎪ ⎪ ⎩ (1) SupO(ξ ,1,t) {}(t) 0

for (Sys1),

≤ c6 (R)c2

for (Sys2),

≤ c6 (R)c2

for (Sys3).

Part 3, Conclusion 1. (V), (1) (V), (2) (V) ⊂ B1/2 (0) ⊂ π −1 R ∼ = R31 .


Part 3, Conclusion 2. The assumptions (RE0) through (RE12) and (RE13a) or (RE13b) in Proposition 7.9 hold if the tuple in Proposition 7.9 is given by (Sys1),

uses (RE13a) (Sys2), uses (RE13b) (Sys3), uses (RE13b) see (t0 , T ) (t0 , T ) (t0 , T ) ξ0 ξ0 ξ0 2 1 1 (10, 15, 6) (10, 15, 6) (6, 18, 8) R 0 0 μ μ (1) μ B (q, ) M (q, ) ) (S10), (S4), (S2)

B(1)(q, (2) (2) H (q, ) G q, , , ∂q (S2) Q(q, , ∂q ) (S10), (S11), ψ Src 0 0 (S9), (S4) ! ϒ = (2) − (1) (S1) μ μ μ B B (S8), (S5) B Q Q (S7), (S4) Q (c7 (Y ))−1 0 0 c2 c2 c2 K +1 >0 >0 (S2)

Part 3, Conclusion 3. For (Sys3), if t0 + 1 < T , then

(1) supt∈(t0 +1,T ) |t|Sup(0) O(ξ0 ,1,t) { }(t) Y 1 + supt∈(t0 ,T ) |t|SupO(ξ0 ,1,t) {}(t) . Proof. We give a detailed sketch of the proof. The proof makes a finite number of smallness assumptions on c6 (R) and c7 (Y ). Overall Preliminaries. For all n ≥ 0 and 0 ≤ k ≤ K + 1 and β ∈ N40 with |β| ≤ 1 + R, the following estimates hold on (−∞, T ) × Q: |∂ β u −n | (R,n) |t|−n |∂ β (0)| R c6 (R) c2 β

|∂ (k)| Y 1

|∂ β A| = |A| δβ0 ≤ |a|

|∂ β u| ≤ 2

|∂ β e| ≤

|∂ β λ| ≤

17 2

|a|

β

|∂ S| R 1

17 2

|a|

β

|∂ S K | Y 1

To estimate (0), (k) use (8.2), (8.3), Proposition 6.8. See Definition 8.3 for S K . Hence, by the product rule and (8.3), for all α ∈ N40 with |α| ≤ R: n = 0, 1, n = 1, 2,

|t|n |∂ α (u −(n+1) G K )| Y |T |−1 Y c7 (Y ) c2 , |t| K +n |∂ α (u −(K +n) G K )| Y 1,

n = 0, 1,

|t|n |∂ α (u −n H)| R c6 (R) c2 ,

n = 0, 1,

|t|n |∂ α (u −n H)| R |a| R c6 (R) c2 ,

n = 0, 1, 2,

|t|n |∂ α (u −n J )| R 1, |t| |∂ α (|t|−1 + u −1 )J | R |T |−1 R c7 (Y ) c2 ,

at every point of (−∞, T ) × Q. In this instance, the constants also depend on the particular polynomial represented by the generic symbols. Preliminaries for Part 3. Let (V1 , V2 ) be the open cover of V given by V1 = V ∩((t0 , T )× Q) and V2 = V ∩ ((t0 , T ) × (R3 \ K)). The sets Q, K are defined in (S9).


• (Sys1): The Overall Preliminaries apply to V1 . On V2 the equations simplify because ψ = 0. The estimate ψ C R (R3 ) R 1 in (S9) is used on the transition region. • (Sys2): V = V1 . The Overall Preliminaries suffice. • (Sys3): V = V1 . Supplement Overall Preliminaries by:

| − (0)|, ∂q − (0) Y 1,

|t| |u −2 G0 |, |t| |∂q (u −2 G0 )| Y |T |−1 Y c7 (Y ) c2 , n = 0, 1 :

|t|n |u −(n+1) G | Y |T |−1 Y c7 (Y ) c2 , |H| |a| c6 (R) c2 ,

|t| |(|t|−1 + u −1 )J | Y |T |−1 Y c7 (Y ) c2 , K +1 1 k ( u ) (k)+π and (iv) in the proposition. on V. For the first, use −(0) = k=1 The rest are consequences of the Overall Preliminaries. Proof of Part 1. If c6 (R), c7 (Y ) are small enough, then 21 ≤ M0 (q, ) ≤ 2 and M3 (q, ) ≥ 0 for all (q, ) ∈ ((−∞, T ) × R3 ) × B2 (0), with B2 (0) ⊂ R31 . In fact, Mμ is a convex combination of Bμ and Bμ , and to estimate B0 and B3 , use || ≤ 2 and the estimates for (0) and (k) in the Overall Preliminaries. To check the assumptions of Proposition 7.2, Part 2, use K |q 3 <1/2 ≡ 0, see (S2). Proof of Part 2. One must estimate ∂ α (t0 , · ) with α ∈ N40 , |α| ≤ R. They are determined by M(q, ) = h(q, ) and (t0 , · ) ≡ 0. They vanish on R3 \K, by the support of ψ. To obtain decay, use the results about Src in Proposition 8.4 Part 1. Proof of Part 3, Conclusion 1. By (iv) in the proposition and by making c6 (R) small enough. For (Sys1), we also use R ≥ 2 and the Sobolev inequality (7.8). Proof of Part 3, Conclusion 2. Most of (RE0) to (RE12) and (RE13a) or (RE13b) in Proposition 7.9 are verified directly. To check (RE4), use Proposition 7.6. To check (RE5) for (Sys3), recall that solves (5.3b) because solves (5.3a). We now discuss two estimates that are quite representative for the proof: the first inequality in (RE12) for (Sys1), and the second inequality in (RE13a) for (Sys1). Recall Proposition 8.4 Part 1 and observe that

(R) 2 R EO (ξ0 ,b,t) { f }(t) R SupO (ξ0 ,b,t) { f }(t) . Let t ∈ (t0 , T ). The (RE12) estimate: R 2K +4 R |t|2K +4 E O E O(ξ0 ,2,t) {ψ u −(K +2) G K }(t) Y 1. (ξ0 ,2,t) {ψ Src1 }(t) = |t|

The left-hand side is ≤ (c7 (Y ))−2 if c7 (Y ) > 0 is small enough. The (RE13a) estimate: R |t|2 E O (ξ0 ,2,t) H1n (q, ) − Q 1n (t) •

R = |t|2 E O (ξ0 ,2,t) ψ Q 1n (q, 0) − Q 1n + ψ Q 1n (q) (t) 1 1 R 2 R 1 1 R |t|2 E O (ξ0 ,2,t) ψ u H + u H + u J (t) + |t| E O (ξ0 ,2,t) ψ u 2 G K (t). The first term is R (c6 (R)c2 )2 , the second is Y (c7 (Y )c2 )2 . Here, we use condition (iv) in the proposition. The result is ≤ (c2 )2 if c6 (R) and c7 (Y ) are small enough. Proof of Part 3, Conclusion 3. Let κ be the right-hand side of the inequality in Conclusion 3. Using Proposition 8.4 Part 2b, one shows that on V = V1 the estimates


| 1 |, | 3 |, |s1 |, |s2 |, | p1 |, | p2 |, | p3 | Y κ|t|−1 and |s4 |, |s5 |, | p4 |, | p7 |, | p8 | Y κ hold. The last inequality can be improved to Y κ|t|−1 for t ∈ (t0 + 1, T ) by integrating the equations in Proposition 8.4, Part 2c along the vector field L, using |q 3 <1/2 ≡ 0. Theorem 8.6. Let (ξ, u, u) be Cartesian coordinates on the truncated unit strip = R2 × (0, 1) × (−∞ , −λ−1 ),

Strip(1, λ)

λ > 0.

Suppose 0 < |A| ≤ |a|. Assume the functions DATAσ (ξ, u) : R2 × (0, ∞) → C, σ ∈ {−, +}, are C ∞ , vanish when u < 21 , and are Flip-compatible DATA

σ

= Flip Aa · DATA−σ ,

ξ = 0,

(8.5)

see Definitions 3.3, 6.12. Let [ σ ] be the formal power series solution corresponding to DATAσ as in Proposition 6.2. Fix an integer R ≥ 4 and ∈ (0, 21 ). Let the constant b = b(B) ∈ (0, 1) be sufficiently small depending only on B = (R, ). Suppose 0 < |A| ≤ |a| ≤ b,

maxσ ∈{−,+} DATAσ C R+4 (C (a,A,2)) ≤ b.

(8.6)

Here, C(a, A, b) = D4| Aa | (0) × (0, b) for each b > 0. Fix an integer K ≥ 0 and set

C = R, , K , maxσ ∈{−,+} DATAσ C R+2K +6 (C (a, A, 2)) . Let the constant c = c(C) ∈ (0, 1) be sufficiently small depending only on C. Then: Existence: Part 1. There exists a pair of fields − , + ∈ C 1 ( Strip(1, c), R) with σ = Flip Aa · −σ (ξ = 0) which are both solutions to (5.3a), vanish when u < 21 , and satisfy " " α σ " ∂ − σ (0) ( · , u) " 0 sup = 0. (8.7) lim |u| C (C (a,A,1)) u→−∞

α∈N40 : |α|≤1

Part 2. The constraint fields ( − ) , ( + ) associated to the fields in Part 1 vanish, and

(Φ − , Φ + ) = Ma,A + u −M − , Ma,A + u −M + are a pair of vacuum fields with Φ σ = Flip Aa · Φ −σ (ξ = 0) , see Definitions 2.6, 3.3. Part 3. The fields in Part 1 are actually in C R−3 ( Strip(1, c), R). Moreover sup |u| K +1

u<−c−1

K " 1 σ (k)( · ) " " " ≤ . sup "∂ α σ ( · , u) − " 0 k C ( C (a, A ,1) ) u c α∈N4 0

|α|≤R−3

k=0

(8.8) #−, # + ) have all the properties listed in Part 1. Uniqueness: Assume ( − , + ) and ( c Then they coincide on Strip(1, 1+c ) ⊂ Strip(1, c).


Proof. Theorem 8.6 is formulated in terms of x = (ξ, u, u). The proof uses q = (t, ξ, u). ! to (S5) ! , where DATA in (S2) is identified Recall t = u + u. We use (S1) to (S11) and (S1) with either one of DATAσ , σ = −, +. The smallness condition b < 10−3 ensures that a, A satisfy (S1). In this entire proof, c3 ( · ) and c4 ( · ) are as in Proposition 7.9, and c6 ( · ) and c7 ( · ) are as in Proposition 8.5. Furthermore, c8 (R) > 1 is the constant of proportionality in the Sobolev inequality (7.8), for all (ξ0 , b, t) ∈ R2 × [1, 2] × (−∞, −1000). The smallness condition on b. Set J0 = , with as in Theorem 8.6, and X = (R, J0 , | Q m |)

X ∗ = (0, J0 , | Q m |)

X = (0, J0 , | Q m |)

! . Set c2 (B) = 21 min{c3 (X ), c3 (X ∗ ), c3 ( X )} ∈ (0, 1). with m = 1, 2, 3. See (S7), (S4) −3 The single smallness condition for b in this proof is b < min{10 , c6 (R)c2 (B)}. Every application of Proposition 8.5 in this proof uses c2 = c2 (B). The first smallness condition on c. Set

Y = R, K , max DATAσ C R+2K +6 (Q) , σ ∈{−,+}

$

1 , T(C) = −1 − max 1000, c7 (Y )c2 (B)

4 c8 (R) (c4 (X ) + 1) c2 (B)c6 (R)c7 (Y )

1 K +1

% .

(8.9)

Observe that C(a, A, 2) = Q, where Q is defined in (S9). Thus, Y depends only on C, and so does the right-hand side of (8.9). We impose c < 1/(|T(C)| + 2). There will be one more smallness condition on c, later in the proof. Notation. System (8.1) corresponding to DATAσ will be denoted (8.1)σ . We sometimes suppress the superscript and write DATA. We abbreviate Flip Aa by Flip. We have Flip · (Ma,A + u

−M

) = Ma,A + u −M Flip · and Flip · Kσ = K−σ .

Therefore, if solves (8.1)σ , then π −1 Flip · (π ) solves (8.1)−σ . The permutation π is defined in (S3). We abbreviate π −1 Flip · (π ) by Flip · . Step 1 through Step 10 below build on each other: Step 1. The assumptions of Proposition 8.5 up to and including (8.3) are satisfied for all T ≤ T(C), if DATA = DATAσ for σ = −, +. Step 2. For all t0 < T(C) there is t0 ∈ C ∞ ([t0 , t1 (t0 )) × R3 , π −1 R) with t1 (t0 ) ∈ (t0 , T(C)] and M(q, t0 )t0 = h(q, t0 ) and t0 (t0 , · ) ≡ 0 and t0 |q 3 <1/2 ≡ 0, so that t1 (t0 ) = T(C) implies (Break)1 or (Break)2 in Proposition 7.2. Step 3. t1 (t0 ) = T(C), for all t0 < T(C), and for all (τ, ξ0 ) ∈ [t0 , T(C)) × R2 : 2c4 (X ) R EO (ξ0 ,2,τ ) {t0 }(τ ) ≤ c (Y )|τ | K +1 7 c6 (R)c2 (B) 2c4 (X ) . (8.10) ≤ ≤ K +1 c7 (Y )|T(C)| 2c8 (R) For Steps 4 and 5, introduce a new field, the restriction of t0 to [t0 , T(C)) × W, with W = D 5 | a | (0) × (0, 1) ⊂ R3 . 2 A

Henceforth, t0 denotes this new field. We have t0 ∈ C ∞ ([t0 , T(C)) × W, π −1 R).


Step 4. t0 is a solution to (8.1), for all t0 < T(C). Step 5. For each t0 < T(C) and σ ∈ {−, +}, let σt0 be the solution to (8.1)σ . Let a a Y(t0 ) = [t0 , T(C)) × 25 | A | < |ξ | < 25 | A | × (0, 1), Z(t0 ) = τ ∈[t0 ,T(C)) | a |≤|ξ0 |<2| a | {τ } × O(ξ0 , 1, τ ) ⊂ Y(t0 ). A

A

−σ Flip · t0

σt0

and are both defined on Y(t0 ), and coincide on Z(t0 ) ⊂ Y(t0 ). Then For the remaining steps, introduce a new field in C ∞ ([t0 , T(C)) × R2 × [0, 1], π −1 R): & a σt0 (q) if |ξ | < 2| A | (8.11) q → −σ 1 a Flip · t0 (q) if |ξ | > 2 | A | It is well defined by Step 5. Henceforth, σt0 denotes this new field. Step 6. For each t0 < T(C) and σ ∈ {−, +}, we have σt0 = Flip · −σ t0 (when ξ = 0) and σt0 |q 3 <1/2 ≡ 0. Moreover, σt0 solves (8.1)σ on its entire domain of definition, and sup

1 2 c6 (R)c2 (B),

|τ | K +1 SupO(ξ0 ,1,τ ) {σt0 }(τ ) ≤

2c4 (X )c8 (R) . c7 (Y )

(R−2)

sup

sup a |ξ0 |<2| A |

(R−2) σ SupO(ξ ,1,τ ) {t0 }(τ ) 0

≤

sup

a |ξ0 |<2| A | τ ∈(t0 ,T(C))

τ ∈(t0 ,T(C))

Step 7. For each t0 < T(C) − 1 and σ ∈ {−, +}, σ 0 sup (τ ) (Y,J0 ) |t0 |−1 . sup EO (ξ0 ,1,τ ) t0 a |ξ0 |<2| A | τ ∈(t0 +1,T(C))

Step 8. For all t1 ≤ t2 < T(C) and σ ∈ {−, +}, sup

sup a |ξ0 |<2| A |

τ ∈(t2 ,T(C))

0 σ σ −1 EO (ξ0 ,1,τ ) {t2 − t1 }(τ ) (Y,J0 ) |t2 | .

Step 9. There is a pair of solutions σ ∈ C R−3 ((−∞, T(C)) × R2 × [0, 1], π −1 R) to (8.1)σ and (σ ) ≡ 0, with σ = Flip · −σ (when ξ = 0) and σ |q 3 <1/2 ≡ 0 and sup

sup

τ ∈(−∞,T(C))

a |ξ0 |<2| A |

sup

τ ∈(−∞,T(C))

sup

τ ∈(−∞,T(C)) 1 2

≤

1 2 c6 (R)c2 (B),

(R−3)

sup

a |ξ0 |<2| A |

(R−3) SupO(ξ ,1,τ ) {σ }(τ ) 0

|τ | K +1 SupO(ξ0 ,1,τ ) {σ }(τ ) (Y,J0 ) 1, (R−3) {σ }(τ ) 4| a | (0)×(0,1)

|τ | K +1 Sup D

A

(Y,J0 ) 1,

≤ B0 (q, σ (q)) ≤ 2 for all q ∈ (−∞, T(C)) × R2 × (0, 1).

#σ ∈ C 1 ((−∞, t1 ] × R2 × [0, 1], π −1 R) with t1 < T(C) is a pair of Step 10. Suppose #σ = Flip · #−σ (when ξ = 0) and #σ |q 3 <1/2 ≡ 0 and solutions to (8.1)σ , with lim

sup

τ →−∞ |ξ |<2| a | 0 A

#σ |τ | J0 Sup(1) O(ξ0 ,1,τ ) { }(τ ) = 0.

#σ coincides (on its domain) with σ in Step 9. Then

(8.14)


Proof of Step 1 through Step 10. To prove Step 2 use Proposition 8.5, Part 1 with T = T(C). To prove Step 3, introduce for each ξ0 ∈ R2 and t0 < T(C) the set

2 R . J (ξ0 , t0 ) = t ∈ [t0 , t1 (t0 )) supτ ∈[t0 ,t] E O (ξ0 ,2,τ ) {t0 }(τ ) ≤ c6 (R)c2 (B) It is an interval and closed as a subset of [t0 , t1 (t0 )). Proposition 8.5, Part 2 and (8.9) imply that [t0 , t ∗ ] ∈ J (ξ0 , t0 ) for some t ∗ > t0 . For every such t ∗ , the assumptions of Proposition 8.5, Part 3 for (Sys1) are satisfied with T = t ∗ . By Conclusion 2 and K + 1 ≥ J0

c2 (B) ≤ c3 (X )

t ∗ < T(C) < −1/c2 (B) ≤ −1/c3 (X ),

we can apply Proposition 7.9. Together with Proposition 8.5 Part 2, we obtain (8.10) for all τ ∈ (t0 , t ∗ ). Recall c8 (R) > 1. Conclude that J (ξ0 , t0 ) is also open as a subset of [t0 , t1 (t0 )), and J (ξ0 , t0 ) = [t0 , t1 (t0 )). See Proposition 8.5, Part 3, Conclusion 1. Now (Break)1 and (Break)2 in Step 2 are excluded, and t1 (t0 ) = T(C). To prove Step 5, use a finite speed of propagation argument. Namely, use Proposition 8.5, Part 3 for (Sys2) to a a show that if | A | ≤ |ξ0 | < 2| A |, then σt0 = Flip ·−σ t0 on τ ∈[t0 ,T(C)) {τ }×O(ξ0 , 1, τ ) ⊂ Z(t0 ). To prove Step 7, use Proposition 8.5, Part 3 for (Sys3), Conclusions 3 and 2. To prove Step 8, use Proposition 8.5, Part 3 for (Sys2), Conclusion 2. To prove Step 9, introduce for every β ∈ (0, 1) the compact set Xβ = [T(C) − β −1 , T(C) − β] × D2| Aa | (0) × [0, 1]. Set σ |Xβ = L 2 − limt→−∞ σt , see Step 8. By Step 6 and the Arzela Ascoli Theorem, for each β ∈ (0, 1), we have σ |Xβ ∈ C R−3 (Xβ , π −1 R), and σtn → σ in C R−3 (Xβ , π −1 R) for a sequence tn → −∞. Recall R − 3 ≥ 1. By construction, σ = Flip · −σ on Xβ ∩ (Flip · Xβ ) for all β ∈ (0, 1). Hence, there is a unique pair of Flip-compatible extensions σ ∈ C R−3 ((−∞, T(C)) × R2 × [0, 1], π −1 R). Step 7 implies (σ ) ≡ 0. Step 6 implies the first two estimates in Step 9. The third follows from the second, by using Flip-compatibility. The fourth estimate in Step 9 follows from Flip-compatibility and (RE3) in Proposition 7.9, see Conclusion 2 of Proposition 8.5, Part 3 for (Sys2). For Step 10, first use Proposition 8.5, Part 3 for (Sys2), Conclusion 2, to show a 0 σ #σ that E O (ξ0 ,1,τ ) { − }(τ ) = 0 for all |ξ0 | < 2| A | and all sufficiently negative τ < t1 . Show that this implies Step 10. To complete the proof of Theorem 8.6, observe that the x-set Strip(1, c) is contained in the q-set (−∞, T(C))×R2 ×(0, 1). Set σ = Kσ +π σ with σ as in Step 9. 
Equations (8.7), (8.8) follow from the definition of Kσ , and σ (k + 1) C R+1 (C (a,A,2)) Y 1 for 0 ≤ k ≤ K , if c is sufficiently small depending only on (Y, J0 ). We have 21 ≤ e3 ≤ 2 on Strip(1, c) by the last estimate in Step 9, where e3 is a component of Φ σ = (e, γ , w). Equation L(e1 e2 −e1 e2 ) = −2γ2 (e1 e2 −e1 e2 ) and e1 e2 |u<1/2 < 0 imply e1 e2 < 0 on Strip(1, c). Existence in Theorem 8.6 now follows from Proposition 2.11. Uniqueness follows from Step 10. 9. Conclusions Proposition 9.1 (Asymptotic expansion). Let Φ σ = Ma,A + u −M σ be the pair of vacuum fields of Theorem 8.6 for K = 0. For each L ≥ 0, sup

|α|≤R−3

L " α σ

1 k σ " "∂ ( · , u) − (k)( · ) " u

k=0

C 0 (C (a,A,1))

= O |u|−L−1


as u → −∞, with α ∈ N40 . In other words, the formal power series [ σ ] is an asymptotic expansion for σ . Proof. The conditions on the data a, A, DATAσ , R, in Theorem 8.6 are independent of K . Therefore, Theorem 8.6 can be applied with the same data for all K ≥ 0. Remark 9.2. The smallness assumptions of Theorem 8.6, with a = A and = 41 and R = 4 and K = 0, are maxσ ∈{−,+} DATAσ C 8 (D4 (0)×(0,2)) ≤ b and 0 < |A| ≤ b for a universal constant b ∈ (0, 1). The domain Strip(1, c) of the vacuum field in Theorem 8.6 depends only on maxσ ∈{−,+} DATAσ C 10 (D4 (0)×(0,2)) . Proposition 9.3 (Trapped sphere). Let Φ σ be the pair of vacuum fields of Theorem 8.6, in the context of Remark 9.2. Suppose there is a u 1 ∈ (0, 1) such that > 0, where = min

inf

σ ∈{−,+} ξ ∈D4 (0)

u1

du |DATAσ (ξ, u)|2 .

0

√ If 0 < |A| < 41 c min{1, }, then u 1 = − 21 A−2 ∈ (−∞, −c−1 ), and the intersection of u = u 1 and u = u 1 is a trapped sphere. Proof. Suppress σ ∈ {−, +}. By Remark 2.8, one must check γ2 , γ6 < 0 on (u, u) = (u 1 , u 1 ). We only discuss the more subtle γ2 < 0. Recall γ2 = A2 (A2 u − u)−1 + u −2 ω2 . By (8.8), |ω2 − ω2 (0)| ≤ c−1 |u|−1 , when (ξ, u, u) ∈ D4 (0) × (0, 1) × (−∞, −c−1 ). By u Proposition 6.2, ω2 (0)(ξ, u) = − 0 du |DATA(ξ, u )|2 . If (ξ, u, u) ∈ D4 (0) × {u 1 } × −1 |u|−1 and γ ≤ |u|−1 A2 −|u|−2 + (−∞, −c−1 ), then ω2 (0) ≤ − and ω2 ≤ −+c 2 √ |u|−3 c−1 . If |u| = 21 A−2 and 0 < |A| < 41 c, then γ2 < 0. 10. Three Points of View Vacuum fields in Theorem 8.6 are, by definition, in the Regularized Picture R. We use symmetry transformations (see Sect. 3) to introduce two more pictures:

The transformations depend on $a$, $A$ from Theorem 8.6. They are global conformal transformations of the Lorentzian metric. The Minkowski boundary $M_X$, the ξ-disk corresponding to one hemisphere of $S^2$ in $M_X$, the initial data $\mathrm{DATA}^\sigma_X$, and the domain of the vacuum field $\Phi^\sigma_X = M_X + \underline{u}^{-M}\Psi^\sigma_X$ are, for X = R, H, F:

X    M_X        Hemisphere (in M_X)    DATA^σ_X(ξ, u)                    Domain
R    M_{a,A}    |ξ| < |a/A|            η^σ(ξ, u)                         Strip(1, c)
H    M_{1,1}    |ξ| < 1                A^{-2} η^σ((a/A) ξ, u)            Strip(1, c A^2)
F    M_{1,1}    |ξ| < 1                A^{-2} η^σ((a/A) ξ, A^{-4} u)     Strip(A^4, c A^{-2})

Column R corresponds to Theorem 8.6. The entry $\mathrm{DATA}^\sigma_R = \eta^\sigma$ is by definition of $\eta^\sigma$. The columns H, F are obtained from R, as indicated above, by scaling. Here,

\[ \mathrm{Strip}(\mu,\lambda) = \mathbb{R}^2 \times (0,\mu) \times (-\infty,-\lambda^{-1}), \qquad \mu,\lambda > 0. \]
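The domains in the table can be compared numerically. The following is a minimal sketch, not part of the original construction: the function names are hypothetical, and the map from picture H to picture F is assumed here to be the dilation $(u, \underline{u}) \mapsto (A^4 u, A^4 \underline{u})$, which is what the two Domain entries suggest. The coordinate ξ is unconstrained in Strip(μ, λ) and is therefore omitted.

```python
def in_strip(u, u_bar, mu, lam):
    """Membership in Strip(mu, lam) = R^2 x (0, mu) x (-inf, -1/lam).

    Only (u, u_bar) are tested; xi ranges over all of R^2.
    """
    return 0.0 < u < mu and u_bar < -1.0 / lam


def picture_domains(A, c):
    """(mu, lam) of the Domain column for the pictures R, H, F."""
    return {
        "R": (1.0, c),
        "H": (1.0, c * A ** 2),
        "F": (A ** 4, c * A ** -2),
    }


def h_to_f(u, u_bar, A):
    """Assumed dilation by A^4 taking picture H to picture F (illustrative)."""
    return A ** 4 * u, A ** 4 * u_bar


if __name__ == "__main__":
    A, c = 0.5, 0.1
    dom = picture_domains(A, c)
    for point in [(0.3, -50.0), (0.3, -39.0)]:
        in_h = in_strip(*point, *dom["H"])
        in_f = in_strip(*h_to_f(*point, A), *dom["F"])
        print(point, in_h, in_f)
```

Under the stated assumption, a point lies in the H-domain exactly when its image lies in the F-domain, so the two columns describe the same region in different coordinates.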

Remark 10.1. From this perspective, Christodoulou [Chr] investigates the case $a = A$ in the picture F. His small $\delta > 0$ is to be identified with our $A^4$. His “short pulse ansatz” corresponds to $\mathrm{DATA}^\sigma_F(\xi,u) = A^{-2}\eta^\sigma(\xi, A^{-4}u)$.

Remark 10.2. In the picture H, the Minkowski boundary is always $M_{1,1}$, and the domain of the vacuum field always has $u \in (0,1)$. Therefore, different triples $(a, A, \eta^\sigma)$ that correspond to the same $\mathrm{DATA}^\sigma_H(\xi,u)$ also correspond to the same vacuum field $\Phi^\sigma_H$ (by uniqueness in Theorem 8.6). This must be taken into account when interpreting the smallness assumptions in Theorem 8.6.

11. A More Powerful Expansion for a = A

Let DATA be fixed and independent of $A$. Theorem 8.6 produces vacuum fields on a domain Strip(1, c) that is independent of $a = A$, with $0 < |A| < b$. Its proof estimates the deviation of the true solution from a truncation of a $1/\underline{u}$ formal power series solution. If an $A^2$ formal power series solution is used instead, we expect that one can go beyond Strip(1, c), for small $|A|$. We give a non-rigorous discussion. The $i$-th component of $\Psi(k)(\xi,u)$ is an even or odd polynomial in $A$, with degree

\[ \le (1, 1, 2, 2, 2, 0, 2, 1, 1, 1, 0, 0, 2, 0, 1, 2, 3, 2)_i + 2k. \tag{11.1} \]

This suggests that a consistent ansatz is given by

\[ P = \mathrm{diag}(1,1,0,2,2) \oplus \mathrm{diag}(0,0,1,1,1,0,0,0) \oplus \mathrm{diag}(0,1,0,1,0), \]
\[ \Phi = A^{P}\,(A^{-P}\Phi), \]
\[ \{A^{-P}\Phi\} = \{A^{-P} M_{A,A}\} + \underline{u}^{-M}\,\{A^{-P}\Psi\}, \]

where the second line means that $\Phi$ is $A^P$ times a new field denoted $A^{-P}\Phi$, and $\{f\} = \sum_{\ell=0}^{\infty} (A^2)^{\ell} f\{\ell\}(\xi,u,\underline{u})$ denotes a formal power series in $A^2$ with smooth coefficients $f\{\ell\}$ on an open subset of $\mathrm{Strip}_\infty$. The formal characteristic initial value problem for $\{A^{-P}\Phi\}$ is an infinite family of partial differential equations, all but a finite number of which are linear. A key observation is that one only has to solve partial differential equations in (1 + 1)-dimensions, because in the sense of formal power series

\[ D = O(A), \qquad N = \frac{\partial}{\partial \underline{u}} + O(A^2), \qquad L = \Big(1 + \frac{1}{\underline{u}^2} f_3\Big)\frac{\partial}{\partial u}. \]

A subset of the equations can be solved explicitly:

\[ \frac{\gamma_2\{0\}}{e_3\{0\}} = \frac{1}{2h}\frac{\partial h}{\partial u}, \qquad \gamma_6\{0\} = \frac{1}{2h}\frac{\partial h}{\partial \underline{u}}, \qquad h(\xi,u,\underline{u}) = \underline{u}^2 - (\varphi(\xi,u))^2, \]

where $\varphi \ge 0$ and $\varphi^2 = 2\,\partial_u^{-1}\partial_u^{-1}|\mathrm{DATA}|^2$. Thus, the formal solution exists at most on $\{(\xi,u,\underline{u}) \in \mathrm{Strip}_\infty \mid \underline{u} < -\varphi(\xi,u)\}$. Actually, one can show that this is exactly the domain of existence of the formal solution $\{A^{-P}\Phi\}$; no earlier breakdown occurs. Since $(A^{-P}\Psi)\{\ell\} = O(|\underline{u}|^{-\ell+1})$ as $\underline{u} \to -\infty$ when $\ell \ge 1$, see (11.1), we expect that the arguments in Sect. 8 can be applied with minor modifications when the truncation $\sum_{k=0}^{K+1} (\frac{1}{\underline{u}})^k \Psi(k)$ is replaced by $A^P \sum_{\ell=0}^{K+2} (A^2)^{\ell} (A^{-P}\Psi)\{\ell\}$. It will give smaller errors when $|A|$ is small.

Acknowledgements. We appreciate the great effort that Demetrios Christodoulou invested over many years to nurture the mathematical study of general relativity at ETH Zurich. We thank Lydia Bieri, Joel Feldman, Horst Knörrer and Martin Lohmann for encouragement and helpful conversations.

A. Program for Proposition 5.5

The Mathematica¹³ code below generates the equations in Proposition 5.5 from scratch, by implementing definitions in Sects. 2, 4 and 5. The following notation is used: F[1,2] is $F_1{}^2$, \[CapitalGamma][1,1,2] is $\Gamma_{112}$, W[3,2,3,4] is $W_{3234}$, T[3,4,1] is $T_{34}{}^1$, Fr[2,W[1,2,3,4]] is $F_2(W_{1234})$, ctr[W,{2,3},{4,3,4},1] is $\Gamma_{23}{}^m W_{m434}$, ctr[W,{2,3},{4,3,4},3] is $\Gamma_{23}{}^m W_{43m4}$, C is conjugation, \[GothicCapitalA] is $A$.

¹³ http://www.wolfram.com/mathematica/

Module[{C,E},
(* Convention 2.1 *)
I1={1,2,3,4};
I1T[n_]:=Tuples[I1,n];
I2={{1,2},{3,1},{3,2},{4,1},{4,2},{3,4}};
(* Definition 2.2 *)
With[{P=\[CapitalPhi]},FMatrix={{P[1],P[2],0,0},{C[P[1]],C[P[2]],0,0},
{P[4],P[5],0,1},{0,0,P[3],0}};\[CapitalGamma]Matrix={{P[8]+C[P[9]],
C[P[12]],P[11],P[6],P[7],P[8]-C[P[9]]},{-P[9]-C[P[8]],P[11],P[12],P[7],
C[P[6]],-P[9]+C[P[8]]},{P[13]-C[P[13]],0,0,-P[8]+C[P[9]],P[9]-C[P[8]],
P[13]+C[P[13]]},{0,C[P[10]],P[10],0,0,0}};WMatrix={{P[16]+C[P[16]],
C[P[17]],-P[17],P[15],-C[P[15]],P[16]-C[P[16]]},{C[P[17]],C[P[18]],0,0,
-C[P[16]],-C[P[17]]},{-P[17],0,P[18],-P[16],0,-P[17]},{P[15],0,-P[16],
P[14],0,P[15]},{-C[P[15]],-C[P[16]],0,0,C[P[14]],C[P[15]]},
{P[16]-C[P[16]],-C[P[17]],-P[17],P[15],C[P[15]],P[16]+C[P[16]]}};];
Map[Set@@#&,Solve[Flatten[{Table[F[i,j],{i,I1},{j,I1}]==FMatrix,
Table[\[CapitalGamma]@@Join[{i},j],{i,I1},{j,I2}]==
\[CapitalGamma]Matrix,
Table[W@@Join[i,j],{i,I2},{j,I2}]==WMatrix,
Table[\[CapitalGamma]@@a==-\[CapitalGamma]@@a[[{1,3,2}]],{a,I1T[3]}],
Table[W@@a==-W@@a[[{2,1,3,4}]]==-W@@a[[{1,2,4,3}]],{a,I1T[4]}]}],
Flatten[{Table[F@@a,{a,I1T[2]}],Table[\[CapitalGamma]@@a,{a,I1T[3]}],
Table[W@@a,{a,I1T[4]}]}]][[1]]];
(* Definitions 2.10, 2.3 and Remark 2.12 *)
T[a_,b_,m_]:=
Fr[b,F[a,m]]-Fr[a,F[b,m]]+ctr[F,{a,b},{m},1]-ctr[F,{b,a},{m},1];
U[k_,l_,a_,b_]:=
-W[k,l,a,b]+Fr[a,\[CapitalGamma][b,l,k]]-Fr[b,\[CapitalGamma][a,l,k]]
+ctr[\[CapitalGamma],{b,l},{a,k},2]-ctr[\[CapitalGamma],{a,l},{b,k},2]
-ctr[\[CapitalGamma],{a,b},{l,k},1]+ctr[\[CapitalGamma],{b,a},{l,k},1];
V[b_,j_,k_]:=Sum[s[[1]]*With[{a=s[[2]],i=s[[3]]},Fr[i,W[a,b,j,k]]
-ctr[W,{i,a},{b,j,k},1]-ctr[W,{i,b},{a,j,k},2]-ctr[W,{i,j},{a,b,k},3]
-ctr[W,{i,k},{a,b,j},4]],{s,{{1,1,2},{1,2,1},{-1,3,4},{-1,4,3}}}];
ctr[b_,m_,n_,i_]:=Sum[s[[1]]*\[CapitalGamma]@@Insert[m,s[[2]],3]
*b@@Insert[n,s[[3]],i],{s,{{1,1,2},{1,2,1},{-1,3,4},{-1,4,3}}}];
(* Definition 4.1 and equation (5.1) *)
M={2,2,2,3,3,1,2,2,2,2,2,2,3,1,2,3,4,4};
MapThread[(\[CapitalPhi][#1]=#2+Power[u,-#3]*\[CapitalPsi][#1])&,
{Range[1,18],{ebold,I*ebold,\[Rho],0,0,0,Power[\[GothicCapitalA],2],
\[Lambda]bold,C[\[Lambda]bold],0,-1,0,0,0,0,0,0,0}/\[Rho],M}];
(* Properties of conjugation C *)
C[a_+b_]:=C[a]+C[b]; C[a_*b_]:=C[a]*C[b];
C[Power[a_,n_Integer]]:=Power[C[a],n];
C[C[x_]]:=x; C[x_?NumericQ]:=Conjugate[x]; C[x:\[Rho]|u|ebold]:=x;
(* Properties of the derivation Fr *)
Fr[a_,b_*c_]:=b*Fr[a,c]+c*Fr[a,b]; Fr[_,_?NumericQ|\[GothicCapitalA]]=0;
Fr[a_,Power[b_,n_Integer]]:=n*Power[b,n-1]*Fr[a,b];
Fr[a_,b_+c_]:=Fr[a,b]+Fr[a,c]; Fr[1|2|4,u]=0; Fr[3,u]=1;
Fr[n:1|2|3|4,C[b_]]:=C[Fr[{2,1,3,4}[[n]],b]];
Fr[4,ebold|\[Lambda]bold]=0;
Fr[1|2,\[Rho]]=0; Fr[3,\[Rho]]=-1;
Fr[4,\[Rho]]=\[CapitalPhi][3]*Power[\[GothicCapitalA],2];
(* Prettyprinting *)
Format[Fr[n:1|2|3|4,x_]]:=({"D",OverBar["D"],"N","L"}[[n]])[x];
Format[\[GothicCapitalA]]="\[GothicCapitalA]"; Format[u]="u";
Format[ebold]=Style["e",Bold]; Format[C[x_]]:=OverBar[x];
Format[\[Lambda]bold]=Style["\[Lambda]",Bold]; Format[S]="S";
Format[\[CapitalPsi][i_]]:=Piecewise[{{Subscript["f",i],1<=i<=5},
{Subscript["\[Omega]",i-5],6<=i<=13},{Subscript["z",i-13],14<=i<=18}}];
(* Propositions 2.14, 2.16, 5.1 *)
\[Lambda][i_]:=Power[u,2*i];
E={4,4,4,6,6,2,4,4,4,4,4,4,6,0,0,0,0,0};
Collect[Expand[Expand[Power[u,E-M]*{T[1,4,1],T[1,4,2],T[4,3,3],T[3,4,1],
T[3,4,2],U[1,4,4,1],U[2,4,4,1],(U[2,1,4,1]+U[4,3,4,1])/2,(U[1,2,4,2]
+U[3,4,4,2])/2,U[2,3,3,4],U[1,3,3,2],U[2,3,3,2],(U[1,2,3,4]
+U[3,4,3,4])/2,\[Lambda][1]*V[1,1,4],\[Lambda][1]*V[4,1,4]+
\[Lambda][2]*V[3,4,1],\[Lambda][2]*V[2,4,1]+\[Lambda][3]*V[1,3,2],
\[Lambda][3]*V[4,3,2]+\[Lambda][4]*V[3,2,3],\[Lambda][4]*V[2,2,3]}]
/.{\[Rho]->1/(-1/u+S/Power[u,2])}
][[{1,2,4,5,6,7,8,9,13,3,10,11,12,14,15,16,17,18}]],u]//MatrixForm
] (* end of Module *)

References

[AnRe] Anderson, L., Rendall, A.D.: Commun. Math. Phys. 218, 479–511 (2001)
[BBM] Bondi, H., van der Burg, M.G.J., Metzner, A.W.K.: Proc. Roy. Soc. Lond. A 269, 21–52 (1962)
[Cha] Chandrasekhar, S.: The Mathematical Theory of Black Holes. Oxford: Oxford U., 1983
[Chr] Christodoulou, D.: The Formation of Black Holes in General Relativity. Zurich: Eur. Math. Soc., 2009
[ChrKl] Christodoulou, D., Klainerman, S.: The Global Nonlinear Stability of the Minkowski Space. Princeton, NJ: Princeton U., 1993
[Fr] Friedrich, H.: Proc. R. Soc. Lond. A 375, 169–184 (1981)
[NP] Newman, E., Penrose, R.: J. Math. Phys. 3, 566–578 (1962)
[Pen] Penrose, R.: Phys. Rev. Lett. 14, 57–59 (1965)
[Tay] Taylor, M.E.: Partial Differential Equations III. Berlin-Heidelberg-New York: Springer, 1997, pp. 360–370
[Vai] Vaidya, P.C.: Proc. Indian Acad. Sci. A 33, 264 (1951); reprinted in Gen. Rel. and Grav. 31, 121 (1999)

Communicated by M. Aizenman and P.T. Chruściel

Commun. Math. Phys. 307, 315–350 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1311-0


Protecting the Conformal Symmetry via Bulk Renormalization on Anti deSitter Space

Michael Dütsch (1,2), Karl-Henning Rehren (1,2)

1 Institut für Theoretische Physik, Universität Göttingen, Friedrich-Hund-Platz 1, 37077 Göttingen, Germany.
2 Courant Research Centre “Higher Order Structures in Mathematics”, Universität Göttingen, Bunsenstr. 3–5, 37073 Göttingen, Germany

Received: 26 March 2010 / Accepted: 20 February 2011
Published online: 24 August 2011 – © The Author(s) 2011. This article is published with open access at Springerlink.com

Dedicated to Raymond Stora on the occasion of his 80th birthday

Abstract: The problem of perturbative breakdown of conformal symmetry can be avoided if a conformally covariant quantum field $\varphi$ on $d$-dimensional Minkowski spacetime is viewed as the boundary limit of a quantum field $\phi$ on $(d+1)$-dimensional Anti-deSitter spacetime (AdS). We study the boundary limit in renormalized perturbation theory with polynomial interactions in AdS, and point out the differences as compared to renormalization directly on the boundary. In particular, provided the limit exists, there is no conformal anomaly. We compute explicitly the one-loop “fish diagram” on AdS$_4$ by differential renormalization, and calculate the anomalous dimension of the composite boundary field $\varphi^2$ with bulk interaction $\kappa\phi^4$.

Contents
1. Introduction
2. The General Strategy
3. Case Studies I: The Interacting Boundary Field $\varphi_{\kappa\phi^k}$
4. Case Studies II: The Interacting Composite Field $(\varphi^2)_{\kappa\phi^k}$
5. Conclusion
A. Källen-Lehmann Representation of $\Delta^+_{m_1}(y)\Delta^+_{m_2}(y)$
B. The Origin of the Logarithmic Boundary Terms
C. Details of the Renormalization of the Massless Fish Diagram on AdS
D. Integrals for the Boundary Limit

Supported in part by the German Research Foundation (Deutsche Forschungsgemeinschaft (DFG)) through the Institutional Strategy of the University of Göttingen, and DFG Grant RE 1208/2-1.

1. Introduction

When a scale invariant free field is perturbed by an interaction, the scaling symmetry is in general broken. In the case of the free massless scalar field in 4-dimensional Minkowski space, this “conformal anomaly” is well known: the renormalization of loop diagrams requires the introduction of a scale parameter which breaks scale invariance. Using the non-uniqueness of renormalization, the best one can reach is “almost homogeneous scaling”, i.e. the breaking terms for the scaling $x \to \lambda x$ are proportional to some power of $\log\lambda$. (For a systematic treatment in the framework of causal perturbation theory see [9,17,18].)

In this paper, we want to address the analogous issue for scale invariant generalized free fields (free fields with non-canonical scaling dimension, see (2.9) below). Such fields naturally arise as boundary limits of Klein-Gordon fields on AdS [3,4,29]. The basic question is:

– Is it possible to construct scale invariant interacting fields (allowing for anomalous dimensions)

\[ (\varphi^l)_{\kappa L}(x) = {:}\varphi^l(x){:} + O(\kappa) \tag{1.1} \]

as perturbative expansions around Wick powers ${:}\varphi^l(x){:}$ of scale invariant generalized free fields $\varphi$ [11]? ($L$ denotes the interaction density and $\kappa$ the coupling constant.)

Perturbation theory around a generalized free field (in Minkowski space) suffers from a huge arbitrariness which is due to renormalization, as we point out in Sect. 2. On the other hand, the requirement of scale invariance is very restrictive. In important cases (which we do not want to exclude) it cannot be fulfilled even for tree diagrams (Sect. 3.4). Namely, the propagator needs a nontrivial renormalization if the scaling dimension $\Delta$ is $\geq 2$ in four dimensions ($\frac{d}{2}$ in $d$ dimensions), and for integer $\Delta$ a breaking of scale invariance cannot be avoided.

We propose here a method to circumvent these difficulties and construct perturbatively interacting fields with unbroken conformal symmetry, by taking advantage of the AdS-CFT correspondence. Viewing a conformally covariant field on Minkowski space-time as a boundary limit of an AdS covariant field on Anti-deSitter space-time [3,4,10,29], an AdS invariant renormalization in the bulk guarantees an anomaly free conformal symmetry of the boundary field, provided the boundary limit exists. In this way, the AdS-CFT correspondence turns out to be a useful tool also when one is only interested in CFT in Minkowski spacetime.

In [3,4] and [11] it was shown that the boundary limit $z \searrow 0$ ¹ of the scalar Klein-Gordon field $\phi(z,x)$ of mass $M$ on $(d+1)$-dimensional AdS is a generalized free field $\varphi(x)$ with scaling dimension

\[ \Delta = \Delta_+ = \frac{d}{2} + \nu, \qquad \nu = \sqrt{\frac{d^2}{4} + M^2}, \tag{1.2} \]

see Sect. 2. The corresponding boundary limit of the free Wick powers $W(z,x) = {:}\phi^l(z,x){:}$ yields fields $w(x) = {:}\varphi^l(x){:}$ which have scaling dimensions $l\Delta$. Notice that in the Witten model [29] of Maldacena’s conjectured AdS-CFT correspondence [22], one studies instead the “dual” field with boundary conditions corresponding to $\Delta_- = \frac{d}{2} - \nu$, which is coupled to the sources in a “dual” way. However, it was shown in [10] that the dual coupling modifies the relevant bulk propagator by a correction term in such a way that the full propagator becomes that of the above Klein-Gordon field, and the unrenormalized perturbative expansion of the dually coupled boundary field is formally equivalent to the boundary limit of the bulk field $\phi(z,x)$ with the same interaction. (The same nontrivial features, that are of representation theoretic nature, were established for the propagators of symmetric tensor fields of any rank [26].)

¹ We use Poincaré coordinates $X \equiv (z, x^\mu) \in \mathbb{R}_+ \times \mathbb{R}^d$ of AdS$_{d+1}$ such that $\xi = z^{-1}\big(x^\mu, \tfrac12(z^2 - x^2 - 1), \tfrac12(z^2 - x^2 + 1)\big)$ lies on the hyperboloid $\xi\cdot\xi = 1$ w.r.t. the metric of signature $(+,-\ldots-,+)$ in the ambient space $\mathbb{R}^{d+2}$. The AdS metric is the induced one: $ds^2 = z^{-2}(dx_\mu dx^\mu - dz^2)$, see e.g. [3,4].

Regarding the generalized free field as a limit of a canonical free field on AdS, the task is to extend this relation to the renormalized interacting fields. Hence, we first construct the interacting AdS fields

\[ W_{\kappa L}(z,x) = {:}\phi^l(z,x){:} + O(\kappa) \tag{1.3} \]

for polynomial interactions $L = \phi^k$ in Sect. 3 and Sect. 4, using standard renormalization methods of causal perturbation theory (reviewed in Sect. 2.2 and 2.3). At this stage, the non-uniqueness of the renormalization can be classified by the usual short distance power counting [6,17,18], and the propagator is unique and AdS-invariant, hence the AdS symmetry is fully preserved. Then, the essential step is to investigate the existence of a boundary limit

\[ w_{\kappa L}(x) = \lim_{z \searrow 0} z^{-\Delta^W_{\kappa L}} \cdot W_{\kappa L}(X) \tag{1.4} \]

in the renormalized theory. Here, we allow for anomalous dimensions, i.e., $\Delta^W_{\kappa L} = l\Delta + O(\kappa)$. If this limit exists, we prove that it inherits the AdS symmetry of the bulk as an exact (unbroken) conformal symmetry (Sect. 2.4).

Our main result is that the boundary limit does exist, for typical polynomial interactions, for the interacting field (Sect. 3) and for composite fields (Sect. 4), due to nontrivial cancellations within the renormalized one-loop distributions taking place in the limit. Although the actual computations are “hidden” in Apps. C and D, these cancellations constitute the essential mechanism to allow the passage to the boundary. In order to establish this result, along the way we develop a “universal” formula (Lemma B.1 in App. B) that controls the asymptotic behaviour near the boundary of a large class of typical interactions and diagrams.

Thus, the above posed question gets an affirmative answer for those interactions $L[\varphi(x)]$ of the conformal field which are “induced” by the corresponding polynomial AdS interaction $L[\phi(X)]$ (as indicated by retaining the subscript $\kappa L$ in (1.4) also for the boundary field). This means [11] that

\[ \kappa \int d^dx\, L[\varphi(x)] = \kappa \int dz\, d^dx\, \sqrt{-g}\; L[\phi(z,x)], \tag{1.5} \]

hence the CFT interaction density

\[ L[\varphi(x)] = \int dz\, \sqrt{-g}\; L[\phi(z,x)] = \int \frac{dz}{z^{d+1}}\, L[\varphi_{h_z}(x)] \tag{1.6} \]

arises as the $z$-integral over $L[\varphi_{h_z}(x)]$, where $\varphi_{h_z}(x)$ is the AdS field $\phi(z,x)$ re-expressed as a family of boundary generalized free fields belonging to the Borchers class of $\varphi$ ([11], see Sect. 2.1). We point out that, due to the integration in (1.6), the interaction vertices “remain in the bulk”. In this sense, the situation is converse to Rühl’s reconstruction [27] of an AdS field from an interacting conformal field where the AdS interaction is restricted to the boundary (namely, the AdS field in [27] satisfies the free field equation in the bulk).

It is an essential aspect of our approach that, while the general principles of renormalization are the same, the detailed implementation of the rules differs in the bulk and on the boundary. In order to exhibit the methodic difference which allows the renormalization in the bulk to preserve the symmetry that is necessarily broken by renormalization on the boundary, we compare both approaches in Sect. 2.5 with a flat space toy model, where this difference is much more transparent.

2. The General Strategy

2.1. Free fields. Let us recall [11] how the Klein-Gordon field on $(d+1)$-dimensional Anti-deSitter space and generalized free fields on $d$-dimensional Minkowski space can be represented in terms of the same creation and annihilation operators, and hence as field operators on the same Hilbert space. The free Klein-Gordon field $\phi$ of mass $M$ on AdS can be expressed as

\[ \phi(z,x) = \frac{1}{\sqrt{2}}\, z^{\frac{d}{2}} \int_0^\infty dm^2\, J_\nu(mz)\,\varphi_m(x), \tag{2.1} \]

where $\varphi_m$ is a massive free boundary field given by

\[ \varphi_m(x) \equiv \int_{k_0 \ge 0} d^dk\; \delta(k^2 - m^2)\,\big[a(k)e^{-ikx} + a^+(k)e^{ikx}\big]. \tag{2.2} \]

The parameter $\nu > -1$ is related to the mass by $M^2 = \nu^2 - \frac{d^2}{4}$. The functions $z^{d/2} J_\nu(\sqrt{k^2}\,z)\exp(\pm ikx)$ are the plane-wave solutions to the Klein-Gordon equation on AdS, where the Laplacian is

\[ \Box_X = -z^{1+d}\,\partial_z\, z^{1-d}\,\partial_z + z^2\,\Box_x, \tag{2.3} \]

and $a(k)$, $a^+(k)$ ($k \in \mathbb{R}^d$) are annihilation and creation operators normalized as

\[ [a(k), a^+(k')] = (2\pi)^{-(d-1)}\,\delta^d(k - k'), \qquad [a,a] = 0 = [a^+,a^+], \tag{2.4} \]

in the Fock space $\mathcal{H}$ over the continuous mass 1-particle space $\mathcal{H}_1 = L^2(V_+, d^dk)$. In this Hilbert space, the fields

\[ \varphi_h(x) \equiv \int_{V_+} d^dk\; h(k^2)\,\big[a(k)e^{-ikx} + a^+(k)e^{ikx}\big] \tag{2.5} \]

(with $h$ any sufficiently smooth polynomially bounded real function on $\mathbb{R}_+$) are local and Poincaré covariant generalized free scalar fields in $d$-dimensional Minkowski space with Källen-Lehmann measure $d\mu(m^2) = h(m^2)^2\,dm^2$. Thus, $\phi$ may be written as

\[ \phi(z,x) = \varphi_{h_z}(x) \quad\text{with}\quad h_z(m^2) \equiv \frac{1}{\sqrt{2}}\, z^{\frac{d}{2}} J_\nu(zm). \tag{2.6} \]
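The relation between $M$, $\nu$ and the weight $h_z$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are ours, $J_\nu$ is evaluated by its power series, and the Klein-Gordon equation is taken in the convention $(\Box_X + M^2)\phi = 0$ with $\Box_x e^{\pm ikx} = -m^2 e^{\pm ikx}$, a sign convention that is not spelled out in the text. With these choices the radial part of (2.3) applied to $h_z$ should vanish up to discretization error.

```python
import math


def bessel_j(nu, x, terms=40):
    """Power series for J_nu(x), valid for x > 0 and nu > -1."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1))
        * (x / 2) ** (2 * k + nu)
        for k in range(terms)
    )


def nu_from_mass(mass_sq, d):
    """Non-negative solution of M^2 = nu^2 - d^2/4."""
    return math.sqrt(mass_sq + d * d / 4)


def h_z(z, m, nu, d):
    """Weight function of (2.6): z^(d/2) J_nu(z m) / sqrt(2)."""
    return z ** (d / 2) * bessel_j(nu, m * z) / math.sqrt(2)


def kg_residual(z, m, mass_sq, d, step=1e-3):
    """Finite-difference value of
    -z^(1+d) d/dz ( z^(1-d) d/dz h_z ) - (m z)^2 h_z + M^2 h_z,
    which is zero for an exact solution (assumed sign convention)."""
    nu = nu_from_mass(mass_sq, d)

    def f(s):
        return h_z(s, m, nu, d)

    def flux(s):
        # z^(1-d) times the central-difference derivative of h_z
        return s ** (1 - d) * (f(s + step) - f(s - step)) / (2 * step)

    radial = -z ** (1 + d) * (flux(z + step) - flux(z - step)) / (2 * step)
    return radial - (m * z) ** 2 * f(z) + mass_sq * f(z)


if __name__ == "__main__":
    d, mass_sq = 3, -2.0  # example values, giving nu = 1/2
    print(nu_from_mass(mass_sq, d), kg_residual(0.7, 1.3, mass_sq, d))
```

The example values $d = 3$, $M^2 = -2$ are chosen only because they give the half-integer order $\nu = \tfrac12$, for which $J_\nu$ is elementary and the result is easy to cross-check by hand.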

Taking the boundary limit, we get [3,4,11]:

\[ \lim_{z \searrow 0} z^{-\Delta}\,\phi(z,x) = \varphi(x) \tag{2.7} \]

with²

\[ \varphi(x) = C_\nu \int_{V_+} d^dk\; (k^2)^{\frac{\nu}{2}}\,\big[a(k)e^{-ikx} + a^+(k)e^{ikx}\big], \qquad \Delta \equiv \nu + \frac{d}{2}, \tag{2.8} \]

i.e., $\varphi = \varphi_h$ with $h(m^2) = C_\nu m^\nu$, $C_\nu \equiv \dfrac{2^{-\nu-\frac12}}{\Gamma(\nu+1)}$. Its Källen-Lehmann measure being a homogeneous function of the mass:

\[ d\mu(m^2) = C_\nu^2\, m^{2\nu}\, dm^2, \tag{2.9} \]

the boundary field $\varphi$ is scale invariant:

\[ U(\lambda)\,\varphi(x)\,U(\lambda)^* = \lambda^{\Delta}\,\varphi(\lambda x), \tag{2.10} \]

and in fact transforms like a conformal scalar field under the representation of the AdS symmetry group on the Fock space of the AdS Klein-Gordon field $\phi$. The boundary limit (2.7) can obviously be generalized to arbitrary Wick polynomials

\[ W = {:}\prod_{j=1}^{l} \partial_x^{a_j}\phi{:}\,, \qquad w(x) = \lim_{z \searrow 0} z^{-l\Delta}\, W(z,x) = {:}\prod_{j=1}^{l} \partial^{a_j}\varphi(x){:}\,, \tag{2.11} \]

which have scaling dimension $D_W = l\Delta + \sum_j |a_j|$ (where $a_j \in (\mathbb{N}_0)^d$ is a multi-index).
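The constant $C_\nu$ in (2.8) follows from the small-argument behaviour $J_\nu(x) \approx (x/2)^\nu/\Gamma(\nu+1)$ applied to the weight $h_z$ of (2.6). As a hedged numerical illustration (function names are ours, and $J_\nu$ is again evaluated by its power series rather than by a library routine), one can compare $z^{-\Delta} h_z(m^2)$ at small $z$ with $C_\nu m^\nu$:

```python
import math


def bessel_j(nu, x, terms=40):
    """Power series for J_nu(x), valid for x > 0 and nu > -1."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1))
        * (x / 2) ** (2 * k + nu)
        for k in range(terms)
    )


def c_nu(nu):
    """C_nu = 2^(-nu - 1/2) / Gamma(nu + 1), as in (2.8)."""
    return 2 ** (-nu - 0.5) / math.gamma(nu + 1)


def rescaled_weight(z, m, nu, d):
    """z^(-Delta) * h_z(m^2) with Delta = nu + d/2 and h_z from (2.6)."""
    delta = nu + d / 2
    return z ** (-delta) * z ** (d / 2) * bessel_j(nu, m * z) / math.sqrt(2)


if __name__ == "__main__":
    nu, d, m = 0.5, 3, 2.0  # example values only
    target = c_nu(nu) * m ** nu
    for z in (1e-1, 1e-2, 1e-3):
        print(z, rescaled_weight(z, m, nu, d), target)
```

The printed values approach $C_\nu m^\nu$ with a relative error of order $(mz)^2$, which is the first correction term of the Bessel series.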

² It should not lead to confusion that the present field $\varphi$ was denoted $\varphi^{(\Delta)}$ in [11], whereas $\varphi_h$ with $h(m^2) = 1$ was denoted $\varphi$.

2.2. Causal perturbation theory. The aim of this paper is to investigate causal perturbation theory [13] around the generalized free field (2.8) (and its Wick polynomials (2.11)). Causal perturbation theory proceeds [6,9,13] by defining, for each Wick polynomial $W$ of free fields $\phi$, the interacting field $W_{gL}$ as a formal expansion in Wick products of the free field $\phi$ with distributional coefficients. This expansion is obtained as the exponential series of retarded products of $W$ with the interaction $gL$, where the retarded products are operator-valued distributions. They are determined recursively (by the postulated causal properties of the interacting fields) at non-coinciding points only; the renormalization of the perturbative expansion consists in the extension of these distributions to coinciding points. “Renormalization conditions” (covariance, Ward identities, …) serve to reduce the arbitrariness in the extension, and the main problem is to decide whether all desirable renormalization conditions can be fulfilled at the same time, with a finite number of free parameters remaining.

This program is performed with the interaction being cut off in space and time by means of a space-time dependent coupling constant $g(x)$. It then remains to control the adiabatic limit of removing the cutoff, $g(x) \to \kappa$. This limit is in general plagued by infrared problems; it is, however, possible to define the algebraic adiabatic limit [6], i.e., the local field algebras $F_{\kappa L}(K)$ in arbitrary bounded space-time regions $K$, without infrared problems as long as the construction of the interacting vacuum state is postponed.

Causal perturbation theory around a generalized free field is, however, problematic for the following reason. To construct the general solution for the perturbative S-matrix one has to use the Wick expansion formula for time-ordered or retarded products (also called the “causal Wick expansion”) [6,9,13]. For simplicity, let us discuss here the ordinary Wick expansion formula, which for mass shell free fields is

\[ {:}\varphi_m^{k_1}(x_1){:} \cdots {:}\varphi_m^{k_n}(x_n){:} = \sum_{r_1,\ldots,r_n} \prod_{i=1}^{n} \binom{k_i}{r_i}\, \big(\Omega,\; {:}\varphi_m^{k_1-r_1}(x_1){:} \cdots {:}\varphi_m^{k_n-r_n}(x_n){:}\;\Omega\big) \cdot {:}\varphi_m^{r_1}(x_1) \cdots \varphi_m^{r_n}(x_n){:}\,. \]

(2.12) For generalized free fields, the Lagrangian can be any field relatively local w.r.t. the generalized free field, i.e., any element of its Borchers class. The Borchers class contains at least the “generalized Wick polynomials” [11] ( : ϕ l : )h (x) = d d k1 . . . V+

V+

d d kl h(k12 , . . . , kl2 ) · : [a(k1 )e−ik1 x + h.c.] . . . [a(kl )e−ikl x +h.c.] : , (2.13)

where h : (R+ )l → C is any symmetric and sufficiently regular function. Let us choose a Lagrangian L(y) = ( : ϕ 4 : ) H (y) with an arbitrary function H (k12 , . . . , k42 ). It is then easy to see, that the Wick expansion of, say, ϕh (x) with L(y) does not factorize as in (2.12), but rather contains terms of the form d d k1 . . . d d k3 h(x − y; k12 , k22 , k32 ) V+

× : [a(k1 )e−ik1 y + h.c.] . . . [a(k3 )e−ik3 y + h.c.] : , where

h(x − y; k12 , k22 , k32 ) =

V+

dq e−iq(x−y) h(q 2 ) H (q 2 , k12 , k22 , k32 ).

(2.14)

(2.15)
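The factorized structure of the ordinary Wick expansion (2.12) can be made concrete in a zero-dimensional toy model. The following Python sketch assumes (this is not part of the construction above) that the field is replaced by a single Gaussian variable of unit covariance, so that the Wick power $:\phi^k:$ becomes the probabilists' Hermite polynomial $He_k$ and the vacuum expectation value of $:\phi^a::\phi^b:$ becomes $\delta_{ab}\,a!$. For $n = 2$, (2.12) then reduces to $He_{k_1} He_{k_2} = \sum_j \binom{k_1}{j}\binom{k_2}{j}\, j!\, He_{k_1+k_2-2j}$. All function names are hypothetical.

```python
from math import comb, factorial


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def wick_power(k):
    """Coefficients of He_k, the toy analogue of the Wick power :phi^k:."""
    prev, cur = [1], [0, 1]  # He_0 and He_1
    if k == 0:
        return prev
    for n in range(1, k):
        # Recurrence He_{n+1}(x) = x * He_n(x) - n * He_{n-1}(x)
        nxt = [0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= n * c
        prev, cur = cur, nxt
    return cur


def wick_expansion(k1, k2):
    """Toy version of the right-hand side of (2.12) for n = 2.

    The term with j contractions carries the numerical coefficient
    binom(k1, j) * binom(k2, j) * j! and the Wick power of degree k1 + k2 - 2j.
    """
    out = [0] * (k1 + k2 + 1)
    for j in range(min(k1, k2) + 1):
        coeff = comb(k1, j) * comb(k2, j) * factorial(j)
        for i, c in enumerate(wick_power(k1 + k2 - 2 * j)):
            out[i] += coeff * c
    return out
```

In this toy setting the numerical coefficients separate completely from the Wick powers; it is this separation that fails in (2.14), where the numerical function (2.15) remains entangled with the operator part.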

Because the dependence of this function on $x - y$ and on $k_i^2$ is entangled in a nontrivial manner, the numerical distribution cannot be separated from the operator-valued distribution as in (2.12) (unless $H$ happens to be a factorizing function). Interpreting (2.14) as an operator product expansion reveals a characteristic feature of the theory of generalized free fields: performing first the $k$-integrations, the subsequent $q$-integration may be interpreted as a “continuous sum” over generalized Wick products. More importantly, however, the failure of separation as in (2.12) would require more refined methods to establish the existence of a renormalization than the standard methods of causal perturbation theory, which proceeds by renormalizing only the numerical distributions (see below).

Let us contrast the general case to the case when the interaction is induced by a local interaction on AdS [11] as described in the introduction, i.e., when the conformal field $\phi$ arises as the boundary limit of a canonical AdS field $\varphi$ with interaction $\kappa L$. The Lagrangian $L$ given by (1.6) with, say, $L = \varphi^4$ on AdS is $L = (:\phi^4:)_H$ with

$$ H(k_1^2,\dots,k_4^2) = \int dz\, z^{-d-1} \prod_{i=1}^{4} h_z(k_i^2), \tag{2.16} $$

AdS Renormalization of Perturbative Conformal QFT

i.e., H is a z-integral over factorizing functions; one can therefore reorganize the continuous OPE as a z-integral over Wick products of the distinguished fields ϕh z (x) as in (2.6), rather than generalized Wick products as in (2.13). This fact seems to reduce the renormalization ambiguity drastically, since the freedom is only in the choice of suitable weight functions in z. Whether a conformally covariant renormalization of the OPE of perturbed boundary fields is possible, would require a nontrivial analysis. This is the reason why we propose to work instead with the “bulk approach” mentioned before, using the correspondence (2.7) and (2.11); i.e., we first construct the perturbative interacting fields on (d + 1)-dimensional Anti-deSitter space [6,17,18], and then study their boundary limit. We shall see that conformal covariance can be maintained on the boundary because AdS covariance can be maintained in the bulk. The issue therefore has been shifted to the existence of the limit. It will be illustrated in Sect. 2.5, why this indirect approach gives different results than the direct approach perturbing generalized free fields on the boundary. In [6] and [17,18] perturbative interacting fields have been constructed on an arbitrary globally hyperbolic curved spacetime M for localized interactions G(x)L(x), i.e., the interaction L is switched on by G ∈ D(M). The Anti-deSitter spacetime is not itself globally hyperbolic, but its covering is conformally equivalent to a Z2 quotient of a globally hyperbolic space-time [2]. In this way, the lack of global hyperbolicity can be circumvented in terms of boundary conditions “at infinity” (z = 0). If one wants to take the boundary limit, one obviously must not cut off the interaction on the boundary of AdS, hence we must perform a “partial adiabatic limit” which puts the switching function G(z, x) to be 1 for x ∈ K (a compact region ⊂ Md ) and z = 0. 
It can be easily seen that the conclusion of [6], i.e., the independence of the algebraic adiabatic limit on the details of the switching function outside the compact region of interest, holds also true for the partial adiabatic limit. We may therefore assume that the switching function factorizes as

$$ G(z,x) = \kappa\,\gamma(z)\,g(x), \quad\text{where}\quad G|_{[0,a]\times K} \equiv \kappa = \text{constant} \tag{2.17} $$

with $g|_K \equiv 1$ and $\gamma|_{[0,a]} \equiv 1$ for some $a > 0$. In addition $g$ and $\gamma$ are smooth, supp $g$ is compact and the support of $\gamma(z)$ is bounded for $z \to \infty$. Since the support of such functions $G$ is not compact in AdS, there may in principle be IR problems associated with the partial adiabatic limit; but our explicit calculations in Sect. 3 show that these do not appear in the relevant examples. The (partial) algebraic adiabatic limit does not depend on the details of the functions $g$ and $\gamma$, provided $a$ is sufficiently large.

In practice, we proceed as follows: Given a Wick monomial $w$ in the generalized free field $\phi$ and its derivatives, we first replace $\phi(x)$ by the AdS field $\varphi(z,x)$ (whose boundary limit is $\phi(x)$), and construct the interacting AdS field $W_{\kappa L}(z,x)$ associated with the corresponding Wick monomial $W$ in $\varphi$ and its derivatives. Then we define the interacting field $w_{\kappa L}(x)$ in $M_d$ as boundary limit of the interacting field $W_{\kappa L}(z,x)$ on AdS, provided this limit exists:

$$ w_{\kappa L}(x) = \lim_{z \searrow 0} z^{-\Delta^W_{\kappa L}}\, W_{\kappa L}(z,x), \tag{2.18} $$

where

$$ \Delta^W_{\kappa L} = l\Delta + \sum_{n=1}^{\infty} \kappa^n\, (\Delta^W_L)^{(n)}. \tag{2.19} $$

The deformation $l\Delta \to \Delta^W_{\kappa L}$ (i.e., the sequence of coefficients $(\Delta^W_L)^{(n)} \in \mathbb{C}$, $n \ge 1$) is determined by the requirement that the limit (2.18) exists.

Remark. In ordinary perturbative QFT the anomalous dimension is the deviation of the scaling dimension of an (interacting) quantum field $A_{\kappa L}$ from the scaling dimension of the corresponding (interacting) classical field. For generalized free fields there is no obvious classical counterpart. Instead, we call “anomalous dimension of $w_{\kappa L}$” the deformation of the scaling dimension due to the interaction. In contrast to ordinary perturbative QFT, it does not come from the breaking of scale invariance in the renormalization of loop diagrams (we maintain the AdS-symmetry in the renormalization). Instead its appearance is enforced by the existence of the boundary limit.

In causal perturbation theory on AdS, $W_{\kappa L}$ is given by [6,17,18]

$$ W_{\kappa L}(X) = \sum_{n=0}^{\infty} \frac{\kappa^n}{n!} \prod_{r=1}^{n} \left( \int_0^\infty \frac{dz_r\, \gamma(z_r)}{z_r^{d+1}} \int d^dx_r\, g(x_r) \right) \cdot R_{n,1}\bigl(L(X_1),\dots,L(X_n); W(X)\bigr), \tag{2.20} $$

where $X \equiv (z,x)$ and $X_j \equiv (z_j,x_j)$. The unrenormalized retarded products $R_{n,1}$ are determined as distributions at non-coinciding points $X_i \neq X_j \neq X$. The result is [8,20]

$$ R_{n,1}\bigl(L(X_1),\dots,L(X_n); W(X)\bigr) = (-i)^n\, n!\; S\, \theta\bigl(x^0 > x_n^0 > x_{n-1}^0 > \dots > x_1^0\bigr) \cdot \bigl[L(X_1), [L(X_2), \dots [L(X_n), W(X)] \dots ]\bigr], \tag{2.21} $$

where $S$ means symmetrization in $X_1,\dots,X_n$.

Now let $W = :\prod_{j=1}^{l} \partial_x^{a_j}\varphi:$ and $L = :\varphi^k:$. Then, using (2.6) and (2.1), the retarded product (2.21) may be rewritten as

$$ R_{n,1}\Bigl(:\phi^k_{h_{z_1}}(x_1):, \dots, :\phi^k_{h_{z_n}}(x_n):\,;\; :\prod_{j=1}^{l} \partial_x^{a_j}\phi_{h_z}(x):\Bigr) = (-i)^n\, n!\; S\, \theta\bigl(x^0 > x_n^0 > x_{n-1}^0 > \dots > x_1^0\bigr) \cdot \frac{(z_1^k \cdots z_n^k\, z^l)^{d/2}}{2^{(l+nk)/2}} \int \prod_{r=1}^{n} \prod_{s=1}^{k} dm_{rs}^2\, J_\nu(m_{rs} z_r) \prod_{j=1}^{l} dm_j^2\, J_\nu(m_j z) \cdot \Bigl[:\prod_s \phi_{m_{1s}}(x_1):, \dots \Bigl[:\prod_s \phi_{m_{ns}}(x_n):,\; :\prod_j \partial^{a_j}\phi_{m_j}(x):\Bigr] \dots \Bigr]. \tag{2.22} $$

We emphasize that writing (2.21) as in the left-hand side of (2.22) is misleading: It is not a retarded product in Minkowski space, but in AdS, defined with respect to the causal structure in AdS. In particular, the problem with the causal Wick expansion for generalized free fields mentioned before is absent, and its correct definition is the right-hand side of (2.22). Moreover, renormalization is needed for coinciding AdS points $X_i = X$ only, and not on the whole submanifold $x_i = x$, as will be discussed in the next subsection.

Fig. 1. The “fish” diagram arising in first order perturbation theory for the interacting field $(\varphi^2)_{\kappa L}(X)$ with interaction $L = \varphi^k$. The diagram symbolizes the distribution $r_{\rm fish}(X_1; X) \equiv (\Omega, R_{1,1}(:\varphi^2(X_1):, :\varphi^2(X):)\,\Omega)$ or the corresponding unrenormalized expression $r^\circ_{\rm fish}$ (given by (2.44) or (4.2) resp., appearing in the second line of (2.24))

In the sequel, we shall be mainly concerned with special cases of the type

$$ R_{1,1}\bigl(:\varphi^k(X_1):\,; \varphi(X)\bigr) = k\, i\Delta(X; X_1)\, \theta(x^0 - x_1^0) \cdot :\varphi(X_1)^{k-1}: \tag{2.23} $$

and

$$ R_{1,1}\bigl(:\varphi^k(X_1):\,; :\varphi^2(X):\bigr) = 2k\, i\Delta(X; X_1)\, \theta(x^0 - x_1^0) \cdot :\varphi(X_1)^{k-1}\varphi(X): \;+\; k(k-1)\, i\bigl(\Delta^+(X; X_1)^2 - \Delta^+(X_1; X)^2\bigr)\, \theta(x^0 - x_1^0) \cdot :\varphi(X_1)^{k-2}: \,, \tag{2.24} $$

where $\Delta^+(X; X_1) = (\Omega, \varphi(X)\varphi(X_1)\Omega)$ is the scalar 2-point function, and $\Delta(X; X_1) = (\Omega, [\varphi(X), \varphi(X_1)]\Omega)$ the commutator function.

2.3. The problem of renormalization. The expressions (2.21)–(2.24) are not defined as distributions at coinciding points, due to the time-ordering $\theta$ functions. The problem of renormalization is thus the extension of the retarded products to distributions $R_{n,1}(\dots)$ on $(\mathbb{R}_+ \times \mathbb{R}^d)^{n+1}$. By the recursive construction principle underlying causal perturbation theory, once this has been achieved for $R_{l,1}$ ($l < n$), then $R_{n,1}$ is already determined everywhere outside the total diagonal

$$ \Delta_{n+1} \equiv \{(X_1,\dots,X_n; X) \mid X_j = X \;\; \forall j = 1,\dots,n\}. \tag{2.25} $$

Renormalization at $n$th order is thus reduced to the extension of the distributions $R_{n,1}$ from $(\mathbb{R}_+ \times \mathbb{R}^d)^{n+1} \setminus \Delta_{n+1}$ to $(\mathbb{R}_+ \times \mathbb{R}^d)^{n+1}$. Applying the recursion as indicated gives rise to a diagrammatic expansion of $R_{n,1}$ in terms of Wick products with propagators and numerical distributions $r^\circ_{m,1}(\dots)$ ($m \le n$) as coefficients, as in (2.24). The latter are the vacuum expectation values of operator-valued distributions (with field arguments of possibly lower order). E.g., for $W = :\varphi^2:$ and $L = :\varphi^k:$, there arises the “fish diagram” (Fig. 1) as the coefficient of $:\varphi^{k-2}(X_1):$ to first order in $\kappa$.

Renormalization is done in terms of the numerical distributions $r^\circ_{m,1}$, by extending them to distributions $r_{m,1}$ on $(\mathbb{R}_+ \times \mathbb{R}^d)^{m+1}$. (For an example in flat space, see Sect. 2.5.) We shall see, however, that the $z \searrow 0$ behaviour of the renormalized operator-valued distributions $R_{n,1}$ on AdS is in general not the same as that of the numerical distributions $r_{m,1}$; thus the existence of the limit has to be studied for the operator-valued distribution $R_{n,1}$.

For a rigorous and complete definition of the retarded products $R_{n,1}$ we refer to the renormalization axioms given in [9],³ with appropriate modifications due to the curvature of AdS [6,17,18]. In particular, the renormalization should not increase the scaling degree of a distribution [6], which controls the “strength of the UV singularity”: The scaling degree is defined in flat space by

$$ \mathrm{sd}_Y\bigl(f(\,\cdot\,; X)\bigr) = \inf\bigl\{\delta \in \mathbb{R} \;\big|\; \lim_{\lambda \searrow 0} \lambda^\delta f(X + \lambda Y; X) = 0\bigr\}, \tag{2.26} $$

where the limit is meant as a distribution in $Y \in \mathbb{R}^{d+1}$; in curved spacetime, $Y$ is taken in the tangent space and the argument $X + \lambda Y$ has to be replaced by the geodesic exponential $\exp_X(\lambda Y)$. Moreover, the renormalization conditions of translation invariance and $L_+^\uparrow$-covariance are replaced by AdS-invariance (group $SO(2,d)$). The expression (2.21) is obviously AdS-invariant, so the problem consists in the preservation of this symmetry upon renormalization.

Since we construct the interacting field on AdS, $R_{n,1}(L(X_1),\dots,L(X_n); W(X))$ needs to be renormalized at $X_k = X \;\forall k$ only, while at $x_k = x$ for all $k$, $z_k \neq z$ for some $k$, it is already defined by the recursion. This fact is responsible for a drastic reduction of renormalization ambiguities in the AdS approach, as compared to renormalization of generalized free fields on Minkowski space. The renormalization freedom is further reduced by requiring the existence of a boundary limit as a renormalization condition. We shall see in some typical examples (Sect. 3 and Sect. 4) that this condition may require a “field mixing”, i.e., perturbative corrections of an interacting Wick monomial by $O(\kappa)$ times other Wick monomials, in order to cancel perturbative contributions of different scaling dimensions.

We shall show in the next subsection that for $W = :\varphi^l:$ (no derivatives), the AdS covariant renormalization of $W_{\kappa L}$ ensures conformal covariance of its boundary limit (2.18)

$$ w_{\kappa L}(x) = \lim_{z \searrow 0} z^{-\Delta^W_{\kappa L}} \cdot W_{\kappa L}(z,x) \tag{2.27} $$

provided this limit exists, with a suitable (coupling dependent) scale dimension $\Delta^W_{\kappa L} = l\Delta + O(\kappa)$. Then we shall illustrate the difference between renormalization on AdS and renormalization on the Minkowski boundary by a flat space model which avoids the technical complications of the curvature. In Sect. 3, we shall address the renormalizability on AdS and the existence of the boundary limit (2.27) with some case studies.

2.4. Conformal symmetry. In this subsection we assume that for a polynomial interaction $L(\varphi)$, and for $W = :\prod_{j=1}^{l} \partial_x^{a_j}\varphi:$ a Wick polynomial of the free field, an AdS-invariant renormalization of the interacting field $W_{\kappa L}$ has been achieved, and that the boundary limit (2.18) of $W_{\kappa L}$ exists with a suitable deformation $l\Delta \to \Delta^W_{\kappa L}$ of the power of $z$ as in (2.19). Under these assumptions we shall prove:

³ The “off-shell” formalism in [9] is advantageous only when derivatives of fields appear as arguments of retarded products. In the present study, the field operators may be regarded as “on-shell”, i.e., the unperturbed field satisfies the free equation of motion. To simplify the notation, we will, however, write $W_{\kappa L} = (\varphi^l)_{\kappa\varphi^k}$ when $W = :\varphi^l:$ and $L = :\varphi^k:$.

Proposition 2.1. If the boundary limit (2.18) $w_{\kappa L}$ of $W_{\kappa L}$ exists, then it is a scale covariant field with scaling dimension

$$ D^w_{\kappa L} = \sum_{j=1}^{l} |a_j| + \Delta^W_{\kappa L}. \tag{2.28} $$

If $W = :\varphi^l:$ contains no derivatives, then $w_{\kappa L}$ is a conformally covariant scalar field.

This is, of course, a variant of the central result in [3,4], that the boundary limit of a scalar AdS field, if it exists, automatically inherits unbroken conformal symmetry. The proof given there describes the CFT as a “theory à la Lüscher–Mack” [3,4, Sect. 3] on the cone $C_{2,d} = \{\xi \in \mathbb{R}^{d+2} : \xi \cdot \xi = 0\}$, or a covering thereof. We want to include here a proof that refers directly to the CFT on $d$-dimensional Minkowski spacetime $M_d$, which is (a chart of) the projective cone $PC_{2,d} = C_{2,d}/\{\xi \sim \lambda\xi\}$, or a covering thereof. ($PC_{2,d}$ is also known as the Dirac manifold $CM_d$.)

Proof of Prop. 2.1. Let $U$ be the unitary representation of $SO(2,d)$ on the Fock space of the free Klein-Gordon field $\varphi$ on AdS, which implements also the conformal transformation of the boundary generalized free field [11]. For the subgroup corresponding to conformal scale transformations on the boundary, we have

$$ \mathrm{Ad}\,U(\lambda)\, \varphi(z,x) \equiv U(\lambda)\, \varphi(z,x)\, U(\lambda)^* = \varphi(\lambda z, \lambda x) \tag{2.29} $$

and hence

$$ \mathrm{Ad}\,U(\lambda)\, W(X) = \lambda^{\sum_j |a_j|}\, W(\lambda X) \tag{2.30} $$

for $W = :\prod_{j=1}^{l} \partial_x^{a_j}\varphi:$. By means of (2.21) and (2.29) we conclude

$$ \mathrm{Ad}\,U(\lambda)\, R_{n,1}\bigl(L(X_1),\dots; W(X)\bigr) = \lambda^{\sum_j |a_j|}\, R_{n,1}\bigl(L(\lambda X_1),\dots; W(\lambda X)\bigr) \tag{2.31} $$

at non-coinciding points (using here that the interaction $L$ contains no derivatives of $\varphi$). Since we assume that an AdS-invariant renormalization has been achieved,⁴ this identity is maintained in the extension to coinciding points. In terms of the interacting fields (2.20), this gives

$$ \mathrm{Ad}\,U(\lambda)\, W_{\kappa L}(X) = \lambda^{\sum_j |a_j|}\, W_{\kappa L}(\lambda X) \tag{2.32} $$

in the algebraic adiabatic limit. With that and (2.18) we obtain

$$ \mathrm{Ad}\,U(\lambda)\, w_{\kappa L}(x) = \lim_{z \searrow 0} z^{-\Delta^W_{\kappa L}}\, \mathrm{Ad}\,U(\lambda)\, W_{\kappa L}(z,x) = \lambda^{\sum_j |a_j| + \Delta^W_{\kappa L}} \lim_{z \searrow 0} (\lambda z)^{-\Delta^W_{\kappa L}}\, W_{\kappa L}(\lambda z, \lambda x) = \lambda^{\sum_j |a_j| + \Delta^W_{\kappa L}}\, w_{\kappa L}(\lambda x). \tag{2.33} $$

This proves the first assertion of the proposition.

We are now going to investigate whether the conclusion (2.33) applies to arbitrary AdS-transformations. Let $t \in SO(2,d) : (z,x) \mapsto (z',x')$ be an AdS-transformation, $\bar t$ the conformal transformation induced by $t$ on the boundary, i.e., $\lim_{z \searrow 0} x'(z,x) = \bar t x$.

⁴ Concrete AdS-invariant renormalization schemes will be presented below.

For free Wick powers $W = :\varphi^l:$ (without derivatives) and consequently $w = :\phi^l:$ we obtain:

$$ \mathrm{Ad}\,U(\bar t)\, w(x) = \lim_{z \searrow 0} z^{-l\Delta}\, \mathrm{Ad}\,U(t)\, W(z,x) = \lim_{z \searrow 0} \Bigl(\frac{z'}{z}\Bigr)^{l\Delta} (z')^{-l\Delta}\, W(z',x') = \lim_{z \searrow 0} \Bigl(\frac{z'}{z}\Bigr)^{l\Delta} w(\bar t x). \tag{2.34} $$

(This argument would fail if $W$ involved derivatives.) Now, AdS-invariance of the volume element $z^{-d-1}\, dz\, d^dx$ implies

$$ z'^{\,-d-1} \left| \frac{\partial(z',x')}{\partial(z,x)} \right| = z^{-d-1}, \tag{2.35} $$

from which it is an easy exercise to conclude that in the limit $z \searrow 0$ (where $\lim_{z \searrow 0} \frac{\partial z'}{\partial x} = 0$ and $\lim_{z \searrow 0} \frac{\partial z'}{\partial z} = \lim_{z \searrow 0} \frac{z'}{z}$) one obtains

$$ \lim_{z \searrow 0} \frac{z'}{z} = \left| \frac{\partial(\bar t x)}{\partial x} \right|^{1/d}. \tag{2.36} $$

Thus, the factor in (2.34) equals the conformal prefactor for a covariant field of scaling dimension $l\Delta$.

Turning to interacting fields $W_{\kappa L}$ for $W = :\varphi^l:$, the AdS-invariance of the retarded products,

$$ \mathrm{Ad}\,U(t)\, R_{n,1}\bigl(L(X_1),\dots; W(X)\bigr) = R_{n,1}\bigl(L(tX_1),\dots; W(tX)\bigr) \tag{2.37} $$

for $t \in SO(2,d)$, implies AdS-invariance of the interacting bulk fields in the algebraic adiabatic limit:⁵

$$ \mathrm{Ad}\,U(t)\, W_{\kappa L}(X) = W_{\kappa L}(tX). \tag{2.38} $$

With that we find as before that $w_{\kappa L}(x)$ is conformally covariant with scaling dimension $\Delta^W_{\kappa L}$ (provided it exists). Namely,

$$ \mathrm{Ad}\,U(\bar t)\, w_{\kappa L}(x) = \lim_{z \searrow 0} z^{-\Delta^W_{\kappa L}} \cdot \mathrm{Ad}\,U(t)\, W_{\kappa L}(z,x) = \lim_{z \searrow 0} \Bigl(\frac{z'}{z}\Bigr)^{\Delta^W_{\kappa L}} (z')^{-\Delta^W_{\kappa L}} \cdot W_{\kappa L}(z',x') = \left| \frac{\partial(\bar t x)}{\partial x} \right|^{\Delta^W_{\kappa L}/d} \cdot w_{\kappa L}(\bar t x). \tag{2.39} $$

This completes the proof of Prop. 2.1.

⁵ For a special conformal transformation $\bar t$ the function $G(t^{-1}(z,x))$ does not factorize as (2.17) if $G$ does; but this does not obstruct our procedure thanks to Prop. 8.1 in [6]: in the algebraic adiabatic limit only the constancy of $G$ in the region of interest matters, and this is preserved by the transformation $t$.
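The relation (2.36) between the bulk ratio $z'/z$ and the boundary Jacobian can be checked numerically in a simple case. The sketch below makes two assumptions that are not in the text: it works in Euclidean signature (upper half-space with metric $(dz^2 + dx^2)/z^2$) instead of the Lorentzian AdS used here, and it takes $d = 2$ with the inversion $(z,x) \mapsto (z,x)/(z^2 + |x|^2)$ as the transformation $t$, whose boundary map is $x \mapsto x/|x|^2$. Function names and step sizes are hypothetical.

```python
def inversion(z, x):
    """Inversion X -> X / |X|^2 on the Euclidean upper half-space (assumed toy t)."""
    s = z * z + sum(c * c for c in x)
    return z / s, [c / s for c in x]


def boundary_map(x):
    """Boundary transformation induced by the inversion: x -> x / |x|^2."""
    s = sum(c * c for c in x)
    return [c / s for c in x]


def jacobian_det_2d(f, x, h=1e-6):
    """Absolute value of the 2x2 Jacobian determinant of f at x (central differences)."""
    cols = []
    for i in range(2):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = f(xp), f(xm)
        cols.append([(fp[j] - fm[j]) / (2 * h) for j in range(2)])
    return abs(cols[0][0] * cols[1][1] - cols[0][1] * cols[1][0])


def ratio_z(x, z=1e-6):
    """Ratio z'/z close to the boundary, i.e. at small z."""
    z_new, _ = inversion(z, x)
    return z_new / z


if __name__ == "__main__":
    d = 2
    x = [0.8, -0.5]
    print(ratio_z(x), jacobian_det_2d(boundary_map, x) ** (1.0 / d))
```

For this transformation both sides equal $1/|x|^2$, in agreement with (2.36).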

2.5. Renormalization on a submanifold: A pedagogical example. We want to illustrate by a simple model that renormalization of a field in $d+1$ dimensions and subsequent restriction to a $d$-dimensional submanifold is not equivalent to renormalization of the restricted fields. Instead of $CM_d$ as boundary of $AdS_{d+1}$, we study the 4-dimensional Minkowski space $M_4$ (with coordinates $x = (x^\mu)_{\mu=0,\dots,3} \in \mathbb{R}^4$ and relative coordinates $y$) as a submanifold of the 5-dimensional Minkowski space $M_5$ (with coordinates $X = (z \equiv x^4, x) \in \mathbb{R} \times M_4$ and relative coordinates $Y = (u,y)$). The boundary limit (2.7) corresponds to the restriction to $M_4$ of the fields in $M_5$.

The two-point function of a Klein-Gordon field of mass $M \ge 0$ in $M_d$ is given by

$$ \Delta^{+(d)}_M(y) \equiv (\Omega, \varphi(x+y)\varphi(x)\Omega) = \frac{1}{(2\pi)^{d-1}} \int d^dp\; \theta(p^0)\, \delta(p^2 - M^2)\, e^{-ipy}. \tag{2.40} $$

Putting $d = 5$ and replacing $y$ by $Y = (u,y)$, this can be viewed as the 2-point function of a generalized free field in $M_4$ with $u$-dependent Källen-Lehmann weight [3,4]:

$$ \Delta^{+(5)}_M(Y) = \frac{1}{2\pi} \int_{M^2}^{\infty} dm^2\; \frac{\cos\bigl(\sqrt{m^2 - M^2}\, u\bigr)}{\sqrt{m^2 - M^2}}\; \Delta^{+(4)}_m(y). \tag{2.41} $$

For later reference, we also introduce the corresponding commutator functions

$$ \Delta^{(d)}_M(y) \equiv \Delta^{+(d)}_M(y) - \Delta^{+(d)}_M(-y) \tag{2.42} $$

and the retarded propagators

$$ \Delta^{\mathrm{ret}(d)}_M(y) \equiv \Delta^{(d)}_M(y)\, \theta(y^0) = \frac{i}{(2\pi)^d} \int d^dp\; \frac{e^{-ipy}}{p^2 + i p^0 0 - M^2}, \tag{2.43} $$

such that $\Delta^{(d)}_M(y)\, \theta(-y^0) = -\Delta^{\mathrm{ret}(d)}_M(-y)$.

We first investigate the renormalization of the fish diagram (Fig. 1) in $M_5$. This means that we have to extend the distribution

$$ r^\circ_{\rm fish}(Y) \equiv -i\,(\Omega, [:\varphi^2(X+Y):, :\varphi^2(X):]\,\Omega)\, \theta(-y^0) = -2i \bigl( \Delta^{+(5)}_M(Y)^2 - \Delta^{+(5)}_M(-Y)^2 \bigr)\, \theta(-y^0), \tag{2.44} $$

which is well defined for $Y \equiv (u,y) \neq 0$ (because $[:\varphi^2(\cdot):, :\varphi^2(\cdot):]$ vanishes for $y^2 < u^2$), to a distribution $r_{\rm fish} \in \mathcal{D}'(M_5)$ (i.e., to $Y = 0$). The extension has to be such that it does not increase the scaling degree with respect to $Y \to 0$.

To obtain a solution of the extension problem in $M_5$, we work with the Källen-Lehmann representation in $M_5$. The square of the 2-point function is given in App. A. Choosing for simplicity the field to be massless, this gives (using (A.1) with $d = 5$ and $m_1 = m_2 = 0$)

$$ r^\circ_{\rm fish}(Y) = \frac{|S^3|}{8(2\pi)^4} \int_0^\infty dm^2\; m\; i\Delta^{\mathrm{ret}(5)}_m(-Y). \tag{2.45} $$

The UV divergence of the unrenormalized distribution $r^\circ_{\rm fish}$ shows up in the divergence of the mass integral. The most general $SO(1,4)$ Lorentz invariant extension with the required scaling degree is given by [9]

$$ r^{(\mu)}_{\rm fish}(Y) \propto (-\Box_Y + \mu^2) \int dm^2\; \frac{m}{m^2 + \mu^2}\; i\Delta^{\mathrm{ret}(5)}_m(-Y) \qquad (\mu^2 \ge 0) \tag{2.46} $$

depending on a renormalization parameter $\mu$. (The symbol $\propto$ stands for suppressed numerical factors.)

We have obtained $r^{(\mu)}_{\rm fish}$ by renormalizing in $M_5$. We now consider how this distribution would appear when regarded as a distribution on the hypersurface $M_4$ with the transverse difference coordinate $u$ as a parameter. Writing $Y = (u,y)$ and the five-momentum as $(v,p)$, we arrive at

$$ r^{(\mu)}_{\rm fish}(u,y) \propto (-\Box_Y + \mu^2) \int dm^2\; \frac{m}{m^2 + \mu^2} \int d^4p\; e^{ipy} \int dv\; \frac{e^{-ivu}}{m^2 + v^2 - p^2 - i p^0 0} \;\propto\; (-\Box_y + \partial_u^2 + \mu^2) \int d^4p\; e^{ipy} \int dm^2\; \frac{m}{m^2 + \mu^2}\; \frac{e^{-|u| \sqrt{m^2 - p^2 - i p^0 0}}}{\sqrt{m^2 - p^2 - i p^0 0}}. \tag{2.47} $$

The appearance of the derivative $\partial_u^2$ (outside of the integrals) is characteristic for the 5-dimensional renormalization. One cannot get rid of this operator, because it cannot be shifted under the integral. (The integrand is not differentiable with respect to $u$ at $u = 0$.) It is the reason why 5-dimensional renormalization “as seen from the hypersurface” goes beyond standard 4-dimensional renormalization. One way to understand this fact is that on the hypersurface, the fields $\partial_z^n \varphi(z,x)|_{z=0}$ are independent fields which “mix” with $\varphi|_{z=0}$ upon 5-dimensional renormalization.

In order to exhibit this more clearly, we compare the result of renormalization in the bulk with the alternative procedure of renormalization on the hypersurface, where we have a $z$-dependent family of fields in four dimensions, similar as in (2.6). The label $z$ just distinguishes different generalized free fields $\phi_z(x) \equiv \varphi(z,x)$ on the same hypersurface, see [11]. That is, we write the 5-dimensional 2-point functions in the unrenormalized distribution $r^\circ_{\rm fish}$ (2.44) as a $u = z_1 - z_2$-dependent integral over 4-dimensional 2-point functions as in (2.41) (with $M = 0$), and apply the Källen-Lehmann representation for the resulting products of 2-point functions as in (A.1) with $d = 4$.
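The passage from the first to the second expression in (2.47) rests on the elementary Fourier integral $\int dv\, e^{-ivu}/(v^2 + a^2) = \pi e^{-|u|a}/a$, whose dependence on $|u|$ is the origin of the non-differentiability at $u = 0$. A minimal numerical sketch follows, assuming real $a > 0$ (in (2.47) the quantity $a = \sqrt{m^2 - p^2 - ip^0 0}$ is in general complex) and a hypothetical cutoff `v_max` for the truncated integral.

```python
import math


def fourier_lorentzian(u, a, v_max=200.0, n=200_000):
    """Composite Simpson estimate of the integral of exp(-i v u) / (v^2 + a^2) over v.

    By symmetry the integral equals 2 * int_0^inf cos(v u) / (v^2 + a^2) dv;
    it is truncated at v_max (assumed cutoff), n must be even.
    """
    h = v_max / n
    total = 1.0 / (a * a) + math.cos(v_max * u) / (v_max * v_max + a * a)
    for k in range(1, n):
        v = k * h
        weight = 4.0 if k % 2 else 2.0
        total += weight * math.cos(v * u) / (v * v + a * a)
    return 2.0 * total * h / 3.0


def closed_form(u, a):
    """pi * exp(-|u| a) / a, which has a kink at u = 0."""
    return math.pi * math.exp(-abs(u) * a) / a


if __name__ == "__main__":
    for u in (0.7, -0.7):
        print(u, fourier_lorentzian(u, 1.3), closed_form(u, 1.3))
```

The one-sided derivatives of the closed form at $u = 0$ are $\mp\pi$, so a second $u$-derivative cannot be moved under the integral without producing a $\delta$-function in $u$.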
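This gives

$$ r^\circ_{\rm fish}(Y) = -\frac{2i}{(2\pi)^2} \int_0^\infty dm_1^2\, \frac{\cos m_1 u}{m_1} \int_0^\infty dm_2^2\, \frac{\cos m_2 u}{m_2} \cdot \bigl( \Delta^{+(4)}_{m_1}(y)\, \Delta^{+(4)}_{m_2}(y) - \Delta^{+(4)}_{m_1}(-y)\, \Delta^{+(4)}_{m_2}(-y) \bigr)\, \theta(-y^0) = \int_0^\infty dm^2\; F(m^2,u)\; i\Delta^{\mathrm{ret}(4)}_m(-y), \tag{2.48} $$

with

$$ F(m^2,u) \equiv \frac{2 m^{-2}}{(2\pi)^4} \int_0^\infty dm_1 \int_0^\infty dm_2\; \theta(m - m_1 - m_2) \cdot \cos(m_1 u) \cos(m_2 u)\, \sqrt{(m^2 - m_1^2 - m_2^2)^2 - 4 m_1^2 m_2^2}. \tag{2.49} $$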
The unrenormalized distribution $r^\circ_{\rm fish}$ exists in $\mathcal{D}'(M_5 \setminus \{0\})$, but for $Y = 0$, the mass integral on the right hand side of (2.48) diverges in the region $m^2 \to \infty$. Renormalization on $M_4$ means regarding (2.48) as a $u$-dependent Källen-Lehmann representation in $M_4$ and extending it to the diagonal of $M_4$ in an $SO(1,3)$ Lorentz invariant way. At $u \neq 0$, an extension to $y = 0$ is in fact trivial because (2.48) is already defined there, but the extension is non-unique ($\delta$-functions in $y$). In order to extend also to $u = 0$

($u = 0$ corresponds to two fields on the same hypersurface), one has to consider the most general $SO(1,3)$ Lorentz invariant 4-dimensional renormalization

$$ \tilde r^{(\mu)}_{\rm fish}(Y) := (-\Box_y + \mu^2) \int dm^2\; \frac{F(m^2,u)}{m^2 + \mu^2} \cdot i\Delta^{\mathrm{ret}(4)}_m(-y) \qquad (\mu^2 \ge 0). \tag{2.50} $$

These distributions exist even in $\mathcal{D}'(M_5)$, have scaling degree $\mathrm{sd}(\tilde r^{(\mu)}_{\rm fish}) = 6 = \mathrm{sd}(r^\circ_{\rm fish})$ and agree with $r^\circ_{\rm fish}$ for $y \neq 0$. So, $\tilde r^{(\mu)}_{\rm fish}(u,y)$ solves the renormalization (i.e. extension) problem in $M_4$. But it is not a renormalization in $M_5$ because it does not agree with $r^\circ_{\rm fish}$ at $y = 0 \wedge u \neq 0$. To see this, we evaluate both (2.48) and (2.50) on a test function $G(Y) = \gamma(u) g(y)$ with $0 \notin \mathrm{supp}\,\gamma$. Suppressing irrelevant constants, the difference is

$$ \tilde r^{(\mu)}_{\rm fish}(G) - r^\circ_{\rm fish}(G) \propto \int dm^2 \int du\; \gamma(u) F(m^2,u) \int d^4k\; \hat g(k) \left( \frac{k^2 + \mu^2}{(m^2 + \mu^2)(k^2 - m^2)} - \frac{1}{k^2 - m^2} \right) \propto \int dm^2 \int du\; \gamma(u)\, \frac{F(m^2,u)}{m^2 + \mu^2} \int d^4k\; \hat g(k) \propto g(0) \int \frac{dm^2}{m^2 + \mu^2} \int du\; \gamma(u) F(m^2,u). \tag{2.51} $$
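The second step in (2.51) uses that the bracket under the $k$-integral is independent of $k$: $\frac{k^2 + \mu^2}{(m^2 + \mu^2)(k^2 - m^2)} - \frac{1}{k^2 - m^2} = \frac{1}{m^2 + \mu^2}$. A short exact check with rational arithmetic is sketched below; the function name and sample values are hypothetical, and $k^2$, $m^2$, $\mu^2$ are treated as plain numbers with $k^2 \neq m^2$.

```python
from fractions import Fraction


def bracket(k2, m2, mu2):
    """Bracket in (2.51) as an exact rational number (arguments are k^2, m^2, mu^2)."""
    k2, m2, mu2 = Fraction(k2), Fraction(m2), Fraction(mu2)
    return (k2 + mu2) / ((m2 + mu2) * (k2 - m2)) - 1 / (k2 - m2)


if __name__ == "__main__":
    # The value should not depend on k^2 and should equal 1 / (m^2 + mu^2).
    for k2 in (7, -2, Fraction(11, 3)):
        print(k2, bracket(k2, 3, 2))
```

Because the bracket is constant in $k$, the remaining $k$-integral of $\hat g$ yields $g(0)$ up to a numerical factor, which is the last step of (2.51).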

One can actually compute $F(m^2,u) = m^2 f(mu)$ by using variables $m_1 u + m_2 u = mx$ and $m_1 u - m_2 u = my$ in (2.49), giving $f(t) \propto J_0(t) + J_2(t) = 2t^{-1} J_1(t)$. Thus, since $0 \notin \mathrm{supp}\,\gamma$, the $u$-integral in (2.51) decays $\sim m^{-1/2}$ due to the oscillatory behaviour of $J_1$, so that the $m^2$-integral is finite as required for a 4-dimensional renormalization. But it obviously does not vanish for generic $\gamma$, as would be required by a 5-dimensional renormalization. This proves the claim. Note that the scale-invariant choice $\mu^2 = 0$ does not alter the conclusion (the mass integral in (2.51) in this case is $\propto \int_0^\infty J_1(t)\,dt = 1$). An analogous but more refined argument shows that also when one admits a function $\mu(u)$, the resulting distribution cannot coincide with $r^\circ_{\rm fish}$ for all ($y = 0$, $u \neq 0$).

The fact that renormalization performed on a submanifold (Eq. (2.50)) does not coincide with proper renormalization in the bulk (Eqs. (2.46), (2.47)) is the main message of this subsection. The breakdown of the bulk symmetry in the hypersurface renormalization is the counterpart of conformal symmetry breaking in AdS-CFT. It can be avoided by bulk renormalization, and subsequent restriction (boundary limit).

3. Case Studies I: The Interacting Boundary Field $\phi_{\kappa\varphi^k}$

We proceed with some case studies concerning the compatibility of an AdS-invariant renormalization with the existence of the boundary limit. We shall not strive for the greatest possible generality; e.g., we shall always assume the AdS mass parameter $M^2$ to be sufficiently large to avoid the Breitenlohner-Freedman critical behaviour in the range $\nu^2 \equiv \frac{d^2}{4} + M^2 < 1$ (see, e.g., [3,4]). We start with the perturbative construction of the interacting field $\phi_{\kappa L}$ with interaction $L = :\varphi^k:$ as a deformation of $\phi$. The renormalization of $R_{1,1}(L(X_1), \varphi(X))$ in this case is unproblematic, but it serves to illustrate the difference between various approaches.
In order to work out the boundary limit of the renormalized bulk field φκ L ,

Fig. 2. Factorization of $\varphi^{(n)}_{\kappa L}$

we introduce a general technique of computation (Sect. 3.2) to be used in more general cases as well. In the subsequent section, we shall choose to study the renormalization and boundary limit of the field $(\varphi^2)_{\kappa L}$ because in this case, the perturbative expansion involves a loop diagram (the fish diagram, Fig. 1) already at first order.

Our strategy is to construct the interacting AdS field $\varphi_{\kappa\varphi^k}(X)$, and then take its boundary limit. In the diagrammatic expansion of $\varphi_{\kappa\varphi^k}(X)$, each diagram has a single propagator line extending from $X$ to the first interaction vertex $X_1$ (Fig. 2). Therefore, the $z \searrow 0$ behaviour of each diagram is dictated by the same function (apart from potential IR problems), so that the analysis of the limit can be essentially done in the first order. Nontrivial renormalization, in contrast, becomes relevant only at higher order.

To first order perturbation theory $n = 1$ we obtain

$$ \varphi^{(1)}_{\kappa\varphi^k}(X) = k \int_0^\infty \frac{dz_1}{z_1^{d+1}}\, \gamma(z_1) \int d^dx_1\; g(x_1) \cdot i\Delta^{\mathrm{ret}}_{AdS}(X, X_1)\; :\varphi^{k-1}(X_1): \,, \tag{3.1} $$

where $\Delta^{\mathrm{ret}}_{AdS}(X, X_1) = \bigl(\Delta^+_{AdS}(X, X_1) - \Delta^+_{AdS}(X_1, X)\bigr)\, \theta(x^0 - x_1^0)$ is the retarded propagator on $d+1$-dimensional AdS, according to (2.1) given by

$$ \Delta^{\mathrm{ret}}_{AdS}(X, X_1) = \frac{(z z_1)^{d/2}}{2} \int dm^2\; J_\nu(mz)\, J_\nu(m z_1)\; \Delta^{\mathrm{ret}(d)}_m(x - x_1). \tag{3.2} $$

At this point, one might be tempted to read off the $z \searrow 0$ behaviour directly from (3.2) and the well-known behaviour of the Bessel functions near zero. We shall see, however, that this attempt is too naive, and that the subsequent $z_1$-integration in (3.1) changes the limit behaviour substantially.

3.1. Interaction $L = \kappa\varphi$ (field shift). For the trivial case $k = 1$ (i.e., the “interaction” amounts just to a shift of the field by a constant), the adiabatic limit $\gamma(z_1) = 1$, $g(x_1) = 1$ can be taken directly in (3.1) and yields the expected result

$$ \varphi^{(1)}_{\kappa\varphi}(X) = \int \frac{dz_1}{z_1^{d+1}} \int d^dx_1 \cdot i\Delta^{\mathrm{ret}}_{AdS}(X, X_1) = \frac{1}{M^2}, \tag{3.3} $$

which follows from $(\Box_X + M^2)\, i\Delta^{\mathrm{ret}}_{AdS}(X, X_1) = z^{d+1}\, \delta(z - z_1)\, \delta^d(x - x_1)$ upon integration over $X_1$, using AdS-invariance so that the integral does not depend on $X$. One may also perform the integrations explicitly in the representation (3.2) where the $x_1$ integration is obvious from (2.43), and the subsequent $z_1$- and $m$-integrations are carried out using formula (13.24(1)) in [28],

$$ \int_0^\infty du\; u^\mu J_\nu(u) = 2^\mu\, \frac{\Gamma\bigl(\frac12(1 + \nu + \mu)\bigr)}{\Gamma\bigl(\frac12(1 + \nu - \mu)\bigr)} \qquad \bigl(-\nu - 1 < \mu < \tfrac12\bigr). \tag{3.4} $$
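Formula (3.4) can be spot-checked numerically in a case where the integral converges absolutely. The sketch below assumes integer order $\nu = n \ge 1$ and $\mu = -1$, for which the right-hand side of (3.4) reduces to $1/n$; the Bessel function is evaluated from its integral representation $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(n\tau - x\sin\tau)\,d\tau$, and the upper cutoff `u_max` as well as all function names are hypothetical choices.

```python
import math


def bessel_j(n, x, m=160):
    """J_n(x) for integer n via the trapezoid rule on (1/pi) int_0^pi cos(n t - x sin t) dt.

    m subintervals are sufficient for |x| up to about 200 (assumed working range).
    """
    h = math.pi / m
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # endpoint values at t = 0 and t = pi
    for k in range(1, m):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def integrand(n, u):
    """u^(-1) J_n(u) for n >= 1, with the finite limit at u = 0."""
    if u == 0.0:
        return 0.5 if n == 1 else 0.0
    return bessel_j(n, u) / u


def integral_jn_over_u(n, u_max=200.0, steps=2000):
    """Composite Simpson estimate of int_0^{u_max} J_n(u) / u du (steps must be even)."""
    h = u_max / steps
    total = integrand(n, 0.0) + integrand(n, u_max)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(n, k * h)
    return total * h / 3.0


def watson_rhs(mu, nu):
    """Right-hand side of (3.4)."""
    return 2.0 ** mu * math.gamma(0.5 * (1 + nu + mu)) / math.gamma(0.5 * (1 + nu - mu))


if __name__ == "__main__":
    for n in (1, 2):
        print(n, integral_jn_over_u(n), watson_rhs(-1, n))
```

The truncation at `u_max` leaves an oscillating tail of relative size about $u_{\max}^{-3/2}$, so agreement is expected only to a few decimal places.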

Clearly, the shift by a multiple of the “constant field” 1 destroys the existence of the boundary limit with $z^{-\Delta}$. After the subtraction of the vacuum expectation value (i.e.,

undoing the shift), the boundary limit can be taken and reproduces the original boundary field. This trivial example shows that in general, interacting fields of different scaling dimensions may “mix”, and the appropriate boundary limits have to be taken after their separation.

3.2. Interaction $L = \kappa\varphi^2$ (mass shift). In the case $k = 2$, the interaction just amounts to a change of the AdS mass by $\delta M^2 = -2\kappa$, so that the perturbed field is just a free field with a different mass. This is an instance of the “Principle of Perturbative Agreement” [19]. Consequently, we expect an anomalous dimension according to $\Delta^\phi_{\kappa\varphi^2} = d/2 + \sqrt{(d/2)^2 + M^2 - 2\kappa} = \Delta - \kappa/\nu + O(\kappa^2)$ to arise. Thus, we are led to study the boundary limit of

$$ \frac{\varphi_{\kappa\varphi^2}(z,x)}{z^{\Delta^\phi_{\kappa\varphi^2}}} = \frac{\varphi(z,x)}{z^\Delta} + \kappa \left( \frac{\varphi^{(1)}_{\kappa\varphi^2}(z,x)}{z^\Delta} + \frac{1}{\nu} \log z \cdot \frac{\varphi(z,x)}{z^\Delta} \right) + O(\kappa^2), \tag{3.5} $$

where the first order term (3.1) is

$$ \varphi^{(1)}_{\kappa\varphi^2}(X) = 2 \int \frac{dz_1}{z_1^{d+1}}\, \gamma(z_1) \int d^dx_1\; g(x_1) \cdot i\Delta^{\mathrm{ret}}_{AdS}(X, X_1)\, \varphi(X_1). \tag{3.6} $$

Indeed, in the partial adiabatic limit $\varphi^{(1)}_{\kappa\varphi^2}$ exhibits a logarithmic $z$-dependence which is precisely cancelled by the combination occurring in (3.5). Namely, (3.6) implies

$$ (\Box_X + M^2)\, \varphi^{(1)}_{\kappa\varphi^2}(X) = 2\gamma(z) g(x) \cdot \varphi(X), \tag{3.7} $$

and consequently, using (2.3),

$$ (\Box_X + M^2) \left( \varphi^{(1)}_{\kappa\varphi^2}(X) + \frac{\log z}{\nu} \cdot \varphi(X) \right) = \frac{2}{\nu}\, (\Delta - z\partial_z)\, \varphi(X) \tag{3.8} $$

in the region where $\gamma(z) = 1$, $g(x) = 1$. The right-hand side vanishes in the limit $z \searrow 0$ faster than $z^\Delta$ because the leading $z^\Delta$ behaviour of the unperturbed field is annihilated by the differential operator $\Delta - z\partial_z$. Since the Klein-Gordon operator preserves homogeneity in $z$ (except for the $z^2 \Box_x$ term which is suppressed at small $z$), the combination of fields on the left-hand side also vanishes faster than $z^\Delta$, up to a solution of the homogeneous equation. The homogeneous solution can behave $\sim z^\Delta$ or $\sim z^{d-\Delta}$. If we can exclude the latter (dominant) contribution, then it follows that the limit (3.5) at first order in $\kappa$ exists.

Unfortunately, the previous argument based on the Klein-Gordon operator cannot discriminate between $\sim z^\Delta$ and $\sim z^{d-\Delta}$. We shall therefore develop a more refined analytical method of computation which is “universal” (see Lemma B.1 in App. B) in the sense that it can also be applied when dealing with interactions of higher polynomial degree (Sect. 3.3) and with diagrams with loops (Sect. 4). This method at the same time shows the emergence of the $z^\Delta \log z$ terms. The argument is lengthy, with essential parts contained in App. B, but it is crucial for the understanding of the boundary limit. For the sake of transparency and computational simplicity, we present only the case

$$ d = 3 \quad\text{and}\quad M = 0. \tag{3.9} $$
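The cancellation behind (3.5) and (3.8) can be followed explicitly on the leading power of $z$. The sketch below assumes that the $z$-dependent part of the AdS Klein-Gordon operator is $-z^2\partial_z^2 + (d-1)z\partial_z + M^2$ (the $z^2\Box_x$ term is dropped); this form is an assumption chosen to be consistent with the homogeneous solutions $z^\Delta$ and $z^{d-\Delta}$ quoted above, since (2.3) is not reproduced here. Function names are hypothetical.

```python
import math


def delta(d, m2, kappa=0.0):
    """Scaling dimension d/2 + sqrt((d/2)^2 + M^2 - 2 kappa) for the shifted mass."""
    return 0.5 * d + math.sqrt(0.25 * d * d + m2 - 2.0 * kappa)


def radial_kg(a, c0, c1, d, m2):
    """Apply -z^2 d_z^2 + (d-1) z d_z + M^2 (assumed radial operator) to z^a (c0 + c1 log z).

    Returns (b0, b1) such that the result is z^a (b0 + b1 log z).
    """
    char = -a * a + d * a + m2  # vanishes for a = Delta and a = d - Delta
    return char * c0 + (d - 2.0 * a) * c1, char * c1


if __name__ == "__main__":
    d, m2 = 3, 0.0                # the case (3.9)
    nu = math.sqrt(0.25 * d * d + m2)
    big_delta = delta(d, m2)
    # Slope of the scaling dimension in kappa, expected to be -1/nu.
    eps = 1e-6
    print((delta(d, m2, eps) - big_delta) / eps, -1.0 / nu)
    # The operator applied to (log z / nu) z^Delta gives -2 z^Delta, which cancels
    # the leading part 2 z^Delta of the source term in (3.7).
    print(radial_kg(big_delta, 0.0, 1.0 / nu, d, m2))
```

For $d = 3$, $M = 0$ this gives $\nu = 3/2$, $\Delta = 3$, a slope $-2/3$ and the coefficient pair $(-2, 0)$, in line with the right-hand side of (3.8) vanishing on $z^\Delta$.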

The AdS 2-point functions are explicitly known in terms of hypergeometric functions or associated Legendre functions of the second kind [3,4,15]: Let X = (z, x), X 1 = (z 1 , x + y) (z, z 1 ∈ R+ ; x, y ∈ M3 ), and v=

z 2 + z 12 − y 2 . 2zz 1

(3.10)

v is AdS-invariant. Namely, viewing AdSd+1 as the hypersurface ξ · ξ = 1 in a d + 2-dimensional ambient space Rd+2 of signature (+, − . . . −, +), we have v = ξ · ξ1 ,

(3.11)

hence v is related to the “chordal distance” by d(ξ, ξ1 ) = (ξ − ξ1 )2 = 2(1 − v). We expect singularities at d(ξ, ξ1 ) = 0 (⇔ v = 1) and, due to the identification of −ξ1 with ξ1 , also at d(ξ, −ξ1 ) = 0 (⇔ v = −1). Note also that timelike separation between X and X 1 corresponds to v ∈ [−1, 1]. Then for d = 3, +AdS (X 1 , X ) = −

1 Q 1 (v + i y 0 0). 4π 2 ν− 2

(3.12)

Here $Q_\ell(u)$ is a solution of Legendre’s differential equation

$$ (1 - u^2) f'' - 2u f' + \ell(\ell + 1) f = 0, \tag{3.13} $$

which is analytic outside a cut along the real interval $[-1, 1]$. For $M = 0$, hence $\nu = \frac32$, $\Delta = 3$, it is the elementary function

$$ Q_1(u) = \frac{u}{2}\log\frac{u+1}{u-1} - 1 \quad\Rightarrow\quad Q_1'(u) = \frac12\,(1 + u\partial_u)\log\frac{u+1}{u-1}. \tag{3.14} $$

The retarded propagator $\Delta^{ret}_{AdS}(X, X_1) = \big(\Delta^+_{AdS}(X, X_1) - \Delta^+_{AdS}(X_1, X)\big)\,\theta(-y^0)$ is given by the discontinuity across the cut:

$$ 4\pi\, i\Delta^{ret}_{AdS}(X, X_1) = \frac{1}{2\pi}\,(1 + u\partial_u)\log\frac{u+1}{u-1}\,\Big|_{u=v-i0}^{u=v+i0}\cdot\theta(-y^0) = -(1 + v\partial_v)\,\theta(1 - |v|)\cdot\theta(-y^0). \tag{3.15} $$

This discontinuity is to be understood as a distribution by partial integration w.r.t. $v$:

$$ H[f] := -\int dv\, f(v)\,(1 + v\partial_v)\,\theta(1 - |v|) = \int_{-1}^{+1} dv\, v\,\partial_v f(v). \tag{3.16} $$
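The action of the functional $H$ of (3.16) on monomials can be tabulated exactly, since $H[v^k] = k\int_{-1}^{+1} v^k\,dv$. The following is a minimal sketch with exact rational arithmetic; the helper name `H_monomial` is ours and purely illustrative.

```python
from fractions import Fraction


def H_monomial(k: int) -> Fraction:
    """Return H[v^k] = int_{-1}^{+1} dv v * d/dv(v^k) = k * int_{-1}^{+1} v^k dv."""
    if k < 0:
        raise ValueError("k must be a non-negative integer")
    # int_{-1}^{+1} v^k dv equals 2/(k+1) for even k and 0 for odd k
    integral = Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)
    return k * integral


if __name__ == "__main__":
    for k in range(5):
        print(k, H_monomial(k))
```

In particular the constant and linear monomials are annihilated, while the quadratic one is not, which is the property used for the polynomial terms of $I_0(v, z)(f)$.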

Because we have represented the retarded propagator as a distribution w.r.t. the variable $v$, we have to perform all other integrations (at fixed value of $v$) first. We therefore change the integration variables: in spatial polar coordinates, let $y = (-t, r\,\vec e_\varphi)$, and $w := y^2 \equiv t^2 - r^2$. Then the new variables are

$$ v \equiv \frac{z^2 + z_1^2 - w}{2 z z_1}, \quad z_1, \quad t \equiv -y^0, \quad \varphi. \tag{3.17} $$

The measure becomes

$$ d^3y\,\theta(-y^0)\,\frac{dz_1}{z_1^4}\,\theta(z_1) = z\cdot dv\cdot\frac{dz_1}{z_1^3}\,\theta(z_1)\cdot dt\,\theta(t)\,\theta(t^2 - w)\cdot d\varphi, \tag{3.18} $$


where

$$ w = w_{v,z}(z_1) = z^2 + z_1^2 - 2v\cdot z z_1 \equiv (z_1 v - z)^2 + (1 - v^2)\, z_1^2. \tag{3.19} $$
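The two forms of $w$ in (3.19), and the lower bounds on $w$ that follow for $v^2 < 1$ by completing the square either in $z_1$ or in $z$, can be spot-checked numerically. This is only an illustrative sketch; the function names and the sampling ranges are arbitrary choices of ours.

```python
import random


def w(v, z, z1):
    """w_{v,z}(z1) in the first form of (3.19)."""
    return z * z + z1 * z1 - 2.0 * v * z * z1


def w_completed_square(v, z, z1):
    """w_{v,z}(z1) in the second form of (3.19)."""
    return (z1 * v - z) ** 2 + (1.0 - v * v) * z1 * z1


def check_identity(samples=1000, seed=0):
    """Return the largest deviation between the two forms over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        v = rng.uniform(-0.999, 0.999)   # the range v^2 < 1
        z = rng.uniform(0.01, 5.0)       # z > 0
        z1 = rng.uniform(0.0, 5.0)       # z1 >= 0
        a, b = w(v, z, z1), w_completed_square(v, z, z1)
        worst = max(worst, abs(a - b))
        # lower bounds obtained by completing the square in z1 or in z
        assert a >= (1.0 - v * v) * z1 * z1 - 1e-9
        assert a >= (1.0 - v * v) * z * z - 1e-9
    return worst


if __name__ == "__main__":
    print("largest deviation:", check_identity())
```

Since $z > 0$, the second bound shows that $w$ stays strictly positive throughout the range $v^2 < 1$.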

There is a dense domain of vectors for which matrix elements (1 , φ(X 1 )2 ) of the distributional field become a smooth function. We then extract the leading z 1 behaviour and write

(z 1 , x 0 − t, x + r eϕ ) := γ (z 1 )g(x1 ) · z 1−3 (1 , φ(z 1 , x1 )2 ).

(3.20)

This is a smooth function with compact support, because of the cutoff functions g and γ . At z 1 = t = r = 0, it equals the corresponding matrix element of ϕ(x), because g(x1 ) = 1 and γ (z 1 ) = 1 in the region of interest (partial adiabatic limit). Finally we average over the spatial directions and put 1

x (z 1 , t, r 2 ) := (3.21) dϕ (z 1 , x 0 − t, x + r eϕ ). 2π Then x is smooth6 in all three arguments ≥ 0, and

x (0, 0, 0) = (1 , ϕ(x)2 ).

(3.22)

With these preparations, (the matrix element of) the first-order correction (3.6) to the renormalized field becomes (1)

(1 , φκφ 2 (X )2 ) ∞ dz 1 =z·H 0

∞ 0

dt θ (t 2 − w) · x (z 1 , t, t 2 − w)w=w

v,z (z 1 )

with the functional H [·] as defined in (3.16). We claim that this equals 2 (1) (1 , φκφ 2 (X ), 2 ) = − z 3 log z · x (0, 0, 0) + (regular) , 3

,

(3.23)

(3.24)

where (regular) stands for a contribution that is regular in $z$ at $z = 0$. The argument goes as follows. For a smooth function $f$ on $\mathbb{R}^3$ with compact support, we denote by $I_0(v, z)(f)$ the integral

$$ I_0(v, z)(f) := \int_0^\infty dz_1 \int_0^\infty dt\;\theta(t^2 - w)\, f(z_1, t, t^2 - w)\Big|_{w = w_{v,z}(z_1)}. \tag{3.25} $$

Thus, to compute (3.23), we have to apply the functional H to I0 (v, z)( f ) when f equals

x on R3+ . In App. B, we prove that I0 (v, z)( f ) is continuous w.r.t. v and differentiable in the range v 2 < 1. Thus, the definition (3.16) of H by partial integration is unambiguous, and it is sufficient to know this function at v 2 < 1, where w ≥ (1 − v 2 )z 2 > 0. In physical terms, this remark means that there are no singular contributions from lightlike y (w = 0): the integration (3.6) can be properly computed by exhausting the backward lightcone “from the inside”. 6 It will be important later (App. B) that is regular in the quadratic variable r 2 . This is obvious at r > 0 x because the square root is smooth. At r = 0, the smoothness can be seen by a Taylor expansion with remainder of (z 1 , x 0 − t, x + r eϕ ), because the angular averaging annihilates all odd terms.


In App. B, we also prove that in the range $v^2 < 1$, $I_0(v, z)(f)$ is of the form

$$ I_0(v, z)(f) = \sum_{0\le k\le\ell\le 2} A_{k\ell}(f)\, v^k z^\ell + z^2\cdot\frac{1 - v^2}{2}\,\log\big((1 - v)z\big)\cdot f(0, 0, 0) + R_{v,z}(f), \tag{3.26} $$

where $A_{k\ell}$ are certain distributions that do not depend on $v$ and $z$, while the remainder $R_{v,z}$ is a family of distributions that is differentiable w.r.t. $v$ in the range $v^2 < 1$, and vanishes $\sim z^3$ at $z = 0$. Noting that $H[v^0] = H[v^1] = 0$, the leading terms are annihilated:

$$ H\big[I_0(v, z)(f)\big] = -\frac{H[v^2]}{2}\, z^2 \log z\cdot f(0, 0, 0) + z^2\Big(H[v^2]\,A_{22}(f) + H\Big[\frac{1 - v^2}{2}\log(1 - v)\Big]\cdot f(0, 0, 0)\Big) + H[R_{v,z}(f)]. \tag{3.27} $$

Thus, with $H[v^2] = \frac43$, we have

Proposition 3.1. For any test function $f$ on $\mathbb{R}^3$, the limit

$$ \lim_{z\to 0}\, z^{-2}\Big(H\big[I_0(v, z)(f)\big] + \frac23\, z^2 \log z\cdot f(0, 0, 0)\Big) \tag{3.28} $$

is finite.

For $f$ equal to the angular average (3.21) on $\mathbb{R}^3_+$, this is our claim (3.24). This ensures that $\phi^{(1)}_{\kappa\phi^2}(X)$ decays at least like $z^3 \log z$, and because of (3.22), it also ensures that $\phi^{(1)}_{\kappa\phi^2}(X) + \frac{\log z}{\nu}\cdot\phi(X)$ (recall $\nu = \frac32$ in (3.5)) decays at least like $z^\Delta = z^3$. In other words, the boundary limit exists (in first order perturbation theory, and in the obvious weak sense), and is exactly given by the expected correction of the scaling dimension of the boundary field. Apart from establishing the existence of the (expected) boundary limit, the main message to be drawn from the nontrivial computations in App. B, however, is that

– the origin of the logarithmic term (corresponding to the anomalous dimension) is the range $z_1 = 0$ of the integral (3.6), and not the power law behaviour of the retarded propagator at $z = 0$.

3.3. Interactions $L = \phi^k$ ($k > 2$). We now turn to the non-trivial interactions $k > 2$. In these cases (3.1) yields

$$ (\Box_X + M^2)\,\phi^{(1)}_{\kappa\phi^k}(X) = \gamma(z)\,g(x)\cdot k\,:\phi^{k-1}(X):\,, \tag{3.29} $$

where the right-hand side $\sim z^{(k-1)\Delta}$ vanishes faster than $z^\Delta$. By the same argument as used after (3.8), $\phi^{(1)}_{\kappa\phi^k}(X)$ behaves either like $z^\Delta$ or like $z^{d-\Delta}$. In the special case $d = 3$, $M = 0$, we can explicitly see the absence of the “wrong” contribution $\sim z^{d-\Delta}$, by repeating the explicit computation as in the previous section. Replacing $\phi(X_1)$ by $:\phi(X_1)^{k-1}:$, one gets an additional factor $z_1^{3(k-2)}$ in (3.25). Because the logarithmic term in this case appears at order $O(z^{3+3(k-2)})$ (Lemma B.1), it is manifest that the first-order term is of order $O(z^3)$, as desired, and the logarithmic term is suppressed in the limit. Thus, the boundary limit exists without an anomalous dimension.

Although a complete analysis of renormalization at higher order is beyond the scope of this paper, let us anticipate what happens in the case at hand. First, we observe (see Fig. 2 above) that $\phi^{(n)}_{\kappa\phi^k}$ can be written as

$$ \phi^{(n)}_{\kappa\phi^k} = k\int_0^\infty \frac{dz_1}{z_1^{d+1}}\,\gamma(z_1)\int d^dx_1\, g(x_1)\cdot i\Delta^{ret}_{AdS}(X, X_1)\,\big(\phi^{k-1}\big)^{(n-1)}_{\kappa\phi^k}(X_1). \tag{3.30} $$

Thus, in order to renormalize $\phi_{\kappa\phi^k}$ at order $n$, one previously has to renormalize $(\phi^{k-1})_{\kappa\phi^k}$ at order $n - 1$. In principle, one has to renormalize “all fields simultaneously”, but in practice, for any finite order of any given field it is sufficient to renormalize only a finite number of fields to lower orders.

Thus, assuming recursively that $(\phi^{k-1})_{\kappa\phi^k}$ has been defined up to $(n-1)$st order, and anticipating that its boundary limit exists with an anomalous dimension of order $O(\kappa)$, then $(\phi^{k-1})^{(n-1)}_{\kappa\phi^k}$ behaves like $z^{(k-1)\Delta}$ times a polynomial in $\log z$, as $z \to 0$. Because the canonical dimension $(k - 1)\Delta$ is larger than $\Delta$, the same argument as before applies to ensure that the partial adiabatic limit for $\phi_{\kappa\phi^k}$ is unproblematic, and for $z$ sufficiently small (such that $\gamma(z) = 1$), the equation

$$ (\Box_X + M^2)\,\phi^{(n)}_{\kappa\phi^k}(X) = k\cdot g(x)\,\big(\phi^{k-1}\big)^{(n-1)}_{\kappa\phi^k}(X) \tag{3.31} $$

implies the $z^\Delta$ behaviour of $\phi^{(n)}_{\kappa\phi^k}(X)$ as $z \to 0$. Again, this equation does not yet exclude a term $\sim z^{d-\Delta}$, but an explicit computation as in Lemma B.1 in the special case $d = 3$, $M = 0$ again shows its absence. We conclude that anomalous dimensions do not arise in higher orders of perturbation theory either.

Actually, one can go beyond this statement: even if the logarithms could be summed (borrowing suitable higher order terms, i.e., violating the proper perturbative systematics) to give rise to an anomalous dimension $\Delta^{\phi^{k-1}}_{\kappa\phi^k}$ up to order $n - 1$ (see Sect. 4), then the argument would still hold true as long as $\Delta^{\phi^{k-1}}_{\kappa\phi^k} > \Delta$ (cf. Lemma B.1 with $n = \Delta^{\phi^{k-1}}_{\kappa\phi^k} - \Delta$).

In the next section, we shall discuss the behaviour of “composite fields” $(\phi^2)_{\kappa L}$. Depending on the interaction, these fields will exhibit finite anomalous dimensions.

3.4. Comparison of bulk vs boundary renormalization schemes. We conclude this section with a comparison of the competing renormalization prescriptions in the case at hand. Concerning the renormalization, we find here significant differences between (a) our procedure, as just outlined, and (b) perturbation theory around the generalized free field $\varphi$ in Minkowski space $\mathbb{M}_d$, requiring Poincaré invariance (b1), or in conformal Minkowski space $C\mathbb{M}_d$, requiring conformal invariance (b2):

(a) (Renormalization in the bulk) The numerical distribution $r^\circ(X_1; X) = (\Omega, R_{1,1}(\phi(X_1); \phi(X))\Omega)$ coincides with the retarded propagator $i\Delta^{ret}_{AdS}(X, X_1)$ in AdS. Its extension to the diagonal is uniquely given by (3.2), and there is no freedom of renormalization, because its scaling degree in the relative coordinates equals


$d - 1$ (for $z > 0$), which is smaller than the dimension of the relative coordinates ($= d + 1$) [6]. The boundary behaviour of the resulting fields is dominated by the $z_1$-integration near $z_1 = 0$, which depends sensitively on the operator valued distribution with which $r$ is multiplied. It is important to keep in mind that we have renormalized (extended $r^\circ$ to the diagonal) first, and then taken the limit $z \to 0$ (in the partial adiabatic limit at the boundary).

(b) (Renormalization on the boundary) Doing perturbation theory on the boundary, instead, we have to take the limit $z \to 0$ first. This yields the unrenormalized distribution $r^\circ_\varphi(x - x_1) = (\Omega, R_{1,1}(\varphi(x_1); \varphi(x))\Omega)$:

$$ r^\circ_\varphi(x - x_1) = i[\varphi(x), \varphi(x_1)]\,\theta(x^0 - x_1^0) = \int dm^2\, m^{2\nu}\, i\Delta^{ret}_m(x - x_1). \tag{3.32} $$

This product of distributions exists on $\mathcal{D}(\mathbb{M}_d)$ only in the range $-1 < \nu < 0$.$^7$ For $\nu \ge 0$ the integral $\int dm^2\, \frac{m^{2\nu}}{m^2 - p^2 - i p^0 0}$ diverges, nevertheless $[\varphi(x), \varphi(x_1)]\cdot\theta(x^0 - x_1^0)$ is well defined for $x \neq x_1$, and one is faced with the problem to extend $r^\circ_\varphi$ from $\mathcal{D}(\mathbb{M}_d \setminus \{0\})$ to $\mathcal{D}(\mathbb{M}_d)$. One has two options:

– Case (b1). One only requires that the Lorentz invariant extension does not increase the scaling degree (with respect to 0) of $r^\circ_\varphi$ [6,13], which has the value $\mathrm{sd}(r^\circ_\varphi) = 2\Delta = d + 2\nu$. In this case, the retarded propagator is non-unique for $\nu \ge 0$: the general solution reads

$$ r_\varphi(y) = (\mu^2 - \Box_y)^{[\nu]+1} \int dm^2\, \frac{m^{2\nu}}{(\mu^2 + m^2)^{[\nu]+1}}\, i\Delta^{ret}_m(y) + \sum_{n\le\nu} C_n\, \Box^n \delta(y), \tag{3.33} $$

where $\mu > 0$ and the $C_n$’s are arbitrary constants (cf. [9, App. C]). Clearly, the renormalization mass $\mu$ and the local terms break the scale invariance (unless $n = \nu$).

– Case (b2). Requiring conformal covariance of the extension, a necessary condition is that the homogeneous scaling behaviour of $r^\circ_\varphi$ is maintained: this is an intensification of the requirement in (b1). From (3.33) we see that there is a unique solution for $-1 < \nu \notin \mathbb{N}_0$ which is obtained by choosing $\mu = 0$ and $C_n = 0$ $\forall n$. But if $\nu \in \mathbb{N}_0$, the mass integral is IR-divergent for $\mu = 0$, and a scaling covariant retarded propagator does not exist.

4. Case Studies II: The Interacting Composite Field $(\varphi^2)_{\kappa\phi^k}$

4.1. General considerations. We turn to the field $(\phi^2)_{\kappa L}$ with interaction $L = :\phi^k:$ ($k \ge 2$). In this case, there exist three types of diagrams which a priori behave differently as $z \to 0$: those diagrams in which the two interaction vertices connected to the field vertex are distinct and do not belong to a common loop, those in which they are distinct and belong to a common loop, and those in which they coincide (Fig. 3). Diagrams of the first type factorize into two diagrams as for the field $\phi_{\kappa L}$ and consequently can be treated as in Sect. 3.3. The second type does not arise in first order.

7 The expression on the right side results from the definition of $\varphi$ (2.8). Alternatively, it can be obtained by taking the boundary limit $\lim_{z, z_1 \to 0} (z z_1)^{-\Delta} \dots$ (2.7) of (3.2). This limit may be done before the mass integration in (3.2) iff $-1 < \nu < 0$.


Fig. 3. Three types of diagrams arising in perturbation theory for the interacting field $(\phi^2)_{\kappa L}(X)$ with interaction $L = :\phi^k:$

Diagrams of the last type contain the fish diagram (Fig. 1) as a subdiagram, which determines their $z$-dependence. This diagram gives the contribution to $(\phi^2)_{\kappa\phi^k}$,

$$ \frac{k(k - 1)}{2} \int_0^\infty \frac{dz_1}{z_1^{d+1}}\,\gamma(z_1) \int d^dx_1\, g(x_1)\; r_{fish}(X_1; X)\, :\phi^{k-2}(X_1):\,. \tag{4.1} $$

In order to define this contribution, the unrenormalized distribution $r^\circ_{fish}(X_1; X) \equiv (\Omega, R_{1,1}(\phi^2(X_1); \phi^2(X))\Omega)$, given by

$$ r^\circ_{fish}(X_1; X) \equiv -2i\,\Big(\Delta^+_{AdS}(X_1, X)^2 - \Delta^+_{AdS}(X, X_1)^2\Big)\,\theta(x^0 - x_1^0) \tag{4.2} $$

at $X_1 \neq X$ (cf. (2.24)), has to be extended to the diagonal $X_1 = X$. Then we have to study the boundary behaviour $z \to 0$ of the renormalized integral (4.1) in the partial adiabatic limit. Our task is to understand the influence of the UV renormalization on the boundary limit. The unrenormalized distribution (4.2) is real-valued and AdS-invariant. We require that the extension $r_{fish}(X_1; X)$ has the same properties:

(I) $r_{fish}$ is real-valued (i.e., $r_{fish}(f)^* = r_{fish}(f^*)$) and the scaling degree in the relative coordinates $Y = (y, u)$ is not increased by the extension:

$$ \mathrm{sd}_Y\big(r_{fish}(\,\cdot\,; X)\big) = \mathrm{sd}_Y\big(r^\circ_{fish}(\,\cdot\,; X)\big) = 2d - 2 \quad \forall X. \tag{4.3} $$

(II) $r_{fish}$ is AdS-invariant

$$ r_{fish}(t X_1; t X) = r_{fish}(X_1; X) \quad \forall t \in SO(2, d). \tag{4.4} $$

In addition, we want to impose the existence of the boundary limit of the interacting field $(\phi^2)_{\kappa\phi^k} = :\phi^2: + \kappa\,(\phi^2)^{(1)}_{\kappa\phi^k} + O(\kappa^2)$ as a condition on the renormalization, admitting for an anomalous dimension $2\Delta + \kappa\delta + O(\kappa^2)$. Thus, up to first order of perturbation theory,

$$ \frac{:\phi^2(z, x):}{z^{2\Delta}} + \kappa\,\Big(\frac{(\phi^2)^{(1)}_{\kappa\phi^k}(z, x)}{z^{2\Delta}} - \delta\,\log z\,\frac{:\phi^2(z, x):}{z^{2\Delta}}\Big) \tag{4.5} $$

should converge with $z \to 0$. We have already seen that the contributions from the first type of diagrams (Fig. 3) to $(\phi^2)^{(1)}_{\kappa\phi^k}$ behave $\sim z^{2\Delta}$ if $k > 2$, and with a logarithmic correction if $k = 2$, so that their limit exists separately because $\Delta > 0$. Because the only possibly divergent contribution comes from the fish diagram integrated with $:\phi^{k-2}:$, a cancellation against the contribution from an anomalous dimension can occur in (4.5) only if $k = 4$ and only if the divergence of $z^{-2\Delta}\, r_{fish}(X_1, X)$ integrated with $:\phi^2:$ is logarithmic. Thus, we are led to require


(III) The renormalized expression (4.1) taken in the partial adiabatic limit and multiplied by $z^{-2\Delta}$ converges at $z \to 0$ if $k \neq 4$, while for $k = 4$ it may diverge $\sim \log z\cdot\frac{:\phi^2(X):}{z^{2\Delta}}$.

Due to general theorems [6,17,18] there exist extensions which fulfill (I) and (II). For $d \le 4$, these two requirements reduce the freedom of normalization to

$$ r_{fish}(X_1; X) + C\, z^{d+1}\,\delta(x_1 - x)\,\delta(z_1 - z). \tag{4.6} $$

So there is only one normalization constant $C$ at disposal to fulfill (III). For this reason, we concentrate on $d = 3$ and $d = 4$ from now on.

Changing the value of $C$ just adds a multiple of $:\phi^{k-2}:$ to $(\phi^2)^{(1)}_{\kappa\phi^k}$. If $k = 2$ or $k = 3$, this term $\sim z^0$ or $\sim z^\Delta$ must not be present in the boundary limit taken with $z^{-2\Delta}$, so Condition (III) – if it can be fulfilled – fixes the value of $C$, and thus determines a “field mixing”. If $k = 4$, the addition just amounts to a multiplicative renormalization of the zero order term. If $k > 4$, the addition is ineffective in the boundary limit. In both cases $k \ge 4$, the renormalization parameter $C$ is unconstrained by Condition (III). These a priori conclusions are in perfect agreement with the corresponding conclusions drawn from the analysis of Witten diagrams for correlation functions in the dual approach to the AdS-CFT correspondence [29].

4.2. $d = 3$, $M = 0$. Renormalization of the fish diagram on AdS$_4$. The standard strategy in curved spacetime [17–19] to renormalize (extend) a distribution like the fish diagram $r^\circ_{fish}$ is to pass to the scaling limit $\bar r^\circ_{fish}$ which gives a distribution in the tangent space at the point $X$. The latter carries the leading UV singularity and can be renormalized as in flat space (with the constant metric $g_X$), while the less singular “reduced” distribution $r^{\circ\,red}_{fish} = r^\circ_{fish} - \bar r^\circ_{fish}$ is (in $d = 3$ or $d = 4$) uniquely extended “by continuity”. The problem with this strategy in our situation is that $r^{red}_{fish}$ and $\bar r_{fish}$ (the latter being independent of $\Delta$ because the scaling limit loses the information about the AdS mass $M^2$) behave differently at the boundary, and do not allow us to deduce the boundary behaviour of the integral (4.1).

Let us look more closely at the distribution (4.2). Unfortunately, the AdS Källén-Lehmann expansion of $(\Delta^+_{AdS})^2$ is not known explicitly [5], with which one could perform the renormalization in the spirit of (2.46). Instead, we shall use again the explicit form (3.12) of $\Delta^+_{AdS}(X_1, X) \propto Q'_{\nu-\frac12}$ in $d + 1 = 4$ bulk dimensions, and its elementary expression (3.14) if $M = 0$, hence $\nu = \frac32$ and $\Delta = 3$.

In order to renormalize (4.2) (i.e., to define the retarded product as a distribution on AdS$_4^{\times 2}$), we adopt the method of differential renormalization [14]: As a distribution on AdS$_4^{\times 2} \setminus \{(X, X)\,|\,X \in \mathrm{AdS}_4\}$, (4.2) is of the form

$$ r^\circ_{fish}(X_1, X) = j(X_1, X)\,\theta(x^0 - x_1^0) \tag{4.7} $$

with $j(X_1, X) \propto Q'_{\nu-\frac12}(v - i y^0 0)^2 - Q'_{\nu-\frac12}(v + i y^0 0)^2$. One writes

$$ j(X_1, X) = \Box_{X_1} J(X_1, X), \tag{4.8} $$


where $J$ is an AdS-invariant distribution which vanishes if $X_1$ is spacelike separated from $X$, and $\mathrm{sd}(J) < \mathrm{sd}(j)$, so that $J(X_1, X)\,\theta(x^0 - x_1^0)$ is well-defined as a distribution on AdS$_4^{\times 2}$. One then defines

$$ r_{fish}(X_1, X) := \Box_{X_1}\big(J(X_1, X)\,\theta(x^0 - x_1^0)\big). \tag{4.9} $$

At $X \neq X_1$, this differs from the unrenormalized distribution $\theta(x^0 - x_1^0)\cdot\Box_{X_1} J(X_1, X)$ by a term

$$ \propto\; \partial_0\big(J(X_1, X)\,\delta(x^0 - x_1^0)\big) + \delta(x^0 - x_1^0)\,\partial_0 J(X_1, X). \tag{4.10} $$

The support property of $J$ ensures that this vanishes at $X \neq X_1$, hence $r_{fish}(X_1, X)$ is indeed an extension of $r^\circ_{fish}$. Obviously, $r_{fish}$ satisfies the requirements (I) and (II) above.

We follow this strategy in the case $M = 0$, where by (3.12), $\Delta^+_{AdS}(X_1, X) \sim Q_1'$ is given explicitly in terms of the elementary function (3.14). We thus obtain

$$ \big(\Delta^+_{AdS}(X_1, X)\big)^2 = \frac{1}{64\pi^4}\Big[\log^2\frac{u+1}{u-1} + u\partial_u \log^2\frac{u+1}{u-1} + \Big(\frac{u}{u+1} - \frac{u}{u-1}\Big)^2\Big]_{u = v + i y^0 0}. \tag{4.11} $$

Here, the first term is a logarithmically bounded function, hence well-defined as a distribution, and consequently also the second. The last term is defined as a distribution by

$$ \Big(\frac{1}{v \pm 1 + i y^0 0}\Big)^2 = -\partial_v\, \frac{1}{v \pm 1 + i y^0 0}. \tag{4.12} $$

We now look for a function $F$ such that

$$ \Box_{X_1} F(u) \equiv (1 - u^2)\,F''(u) - 4u\,F'(u) = Q_1'(u)^2 \tag{4.13} $$

and

$$ J(X_1, X) = \frac{i}{8\pi^4}\,\Big(F(v - i y^0 0) - F(v + i y^0 0)\Big) = 0 \quad \text{if } |v| > 1. \tag{4.14} $$

Next, we determine the discontinuity along the cut

$$ \delta F(v) = F(v + i0) - F(v - i0) \tag{4.15} $$

as a distribution. Then, we can define the renormalized fish diagram as

$$ r_{fish}(X_1, X) = \frac{i}{8\pi^4}\cdot\Box_{X_1}\big(\delta F(v)\,\theta(-y^0)\big). \tag{4.16} $$

Proposition 4.1. Equation (4.13) is solved by 1 2 2 1 d 2 2 F(u) = Li3 + Li3 + Li3 − Li3 2 1−u 1+u 6 du 1−u 1+u u+1 1 u + 1 2 2 2 1 · Li2 − Li2 − log + log 6 u−1 1+u 1−u 16 u−1 3 u d 2 u + 1 u + 1 1 d 6 − u2 (u 2 + 3) log log + + + 144 du u−1 16 du u−1 12(u 2 − 1) (4.17) plus the general solution C1 Q 1 (u) + C2 of the homogeneous equation.
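A quick numerical sanity check of the two differential equations in play: Legendre's equation (3.13) with $\ell = 1$ for $Q_1$, and the homogeneous version of (4.13), $(1 - u^2)F'' - 4uF' = 0$, which is solved by the derivative $Q_1'(u)$ (and trivially by constants). The sketch below evaluates closed-form derivatives at a few real points $u > 1$; the function names and sample points are our own choices.

```python
import math


def L(u):
    """log((u+1)/(u-1)) for real u > 1."""
    return math.log((u + 1.0) / (u - 1.0))


def Q1(u):
    """Q_1(u) = (u/2) log((u+1)/(u-1)) - 1, cf. (3.14)."""
    return 0.5 * u * L(u) - 1.0


def dQ1(u):
    """Q_1'(u) = (1/2)(1 + u d/du) log((u+1)/(u-1))."""
    return 0.5 * L(u) - u / (u * u - 1.0)


def d2Q1(u):
    """Q_1''(u) = 2 / (u^2 - 1)^2."""
    return 2.0 / (u * u - 1.0) ** 2


def d3Q1(u):
    """Q_1'''(u) = -8u / (u^2 - 1)^3."""
    return -8.0 * u / (u * u - 1.0) ** 3


def legendre_residual(u):
    """(1 - u^2) f'' - 2u f' + l(l+1) f for f = Q_1, l = 1."""
    return (1.0 - u * u) * d2Q1(u) - 2.0 * u * dQ1(u) + 2.0 * Q1(u)


def homogeneous_residual(u):
    """(1 - u^2) F'' - 4u F' for F = Q_1'."""
    return (1.0 - u * u) * d3Q1(u) - 4.0 * u * d2Q1(u)


def finite_difference_dQ1(u, h=1e-5):
    """Central difference of Q_1, used to cross-check the closed form dQ1."""
    return (Q1(u + h) - Q1(u - h)) / (2.0 * h)


if __name__ == "__main__":
    for u in (1.5, 2.0, 5.0):
        print(u, legendre_residual(u), homogeneous_residual(u))
```

Note that $Q_1$ itself is even in $u$ while $Q_1'$ is odd, so it is the odd function $Q_1'$ that can be added to an even particular solution.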


We point out that $F(u)$ is analytic for $u \in \mathbb{C} \setminus [-1, 1]$ (see App. C), and that the particular solution given by the proposition is symmetric ($F(-u) = F(u)$), but $Q_1'(-u) = -Q_1'(u)$. By writing some terms as derivatives, the boundary values $F(v \pm i\varepsilon)$ are defined as distributions.

Proof. By insertion into (4.13). In App. C we sketch the derivation of (4.17).

Proposition 4.2. The discontinuity $\delta F(v) = F(v + i0) - F(v - i0)$ is given by

$$ \delta F(v) = i\pi\,\Big(\theta(1 - |v|)\,h_0(v) + \partial_v\big(\theta(1 - |v|)\,h_1(v)\big)\Big), \tag{4.18} $$

where

1+v 1 − v 2 1 1 + v Li2 − Li2 + log , 3 2 2 3 1−v (4.19) 1+v 1−v v 1 + v π2 5 1 h 1 (v) = log log − log + + . 3 2 2 4 1−v 18 12 Notice that the derivative of θ (1 − |v|) cannot be taken separately, because h 1 is logarithmically divergent at v = ±1. Instead, δ F is understood as a distribution in v, where the derivative is defined by partial integration, see below. h 0 (v) =

Proof. See App. C.

Adding the homogeneous solutions, the second of the integration constants, $C_2$, does not contribute to the discontinuity. Thus, the (expected) renormalization freedom consists in adding to (4.16) the term

$$ \frac{i\,C_1}{8\pi^4}\,\Box_{X_1}\Big[\big(Q_1'(v - i y^0 0) - Q_1'(v + i y^0 0)\big)\,\theta(-y^0)\Big] = \frac{-i\,C_1}{2\pi^2}\,\Box_{X_1}\Delta^{ret}_{AdS}(X, X_1) = -\frac{C_1}{2\pi^2}\, z^4\,\delta(z_1 - z)\,\delta^{(3)}(x_1 - x). \tag{4.20} $$

Remark. (I) In contrast to the renormalization of the massless fish diagram in 4-dimensional Minkowski space, the present renormalization on AdS does not require the introduction of a mass scale. This is because there is already a mass scale in the formalism, namely $1/R^2$, where $R$ is the radius of AdS. (In our conventions: $R^2 \equiv (\xi^0)^2 - \sum_{k=1,2,3}(\xi^k)^2 + (\xi^4)^2 = 1$.) (II) $\delta F$ in Prop. 4.1 is antisymmetric in $v$; however the renormalization freedom (4.20) is symmetric in $v$. Hence, there is a distinguished renormalization: $C_1 = 0$.

The term (4.20) contributes a multiple of $:\phi^{k-2}(X):$ to the first order term of $(\phi^2)_{\kappa\phi^4}$. As discussed in Sect. 4.1, for $k > 4$ this term does not contribute to the boundary limit, while for $k = 4$ its boundary limit $:\varphi^2(x):$ exists trivially and amounts to a multiplicative renormalization of $(\varphi^2)_{\kappa\phi^4}$. For $k = 3$, it produces a “mixing” of the field $:\phi^2:$ with $\phi(X)$, and the boundary limit has to be taken of the appropriate mixed field (cf. the end of Sect. 4.3). We shall therefore disregard this term in the sequel.

Thus, (4.16) with $\delta F$ specified by Prop. 4.2 is the starting point for the subsequent analysis of the boundary limit. In that analysis, $\delta F$ is understood as a distribution on the differentiable functions on the interval $(-1, 1)$, i.e.,

$$ \frac{1}{i\pi}\,\delta F[f] := H_{fish}[f] \equiv \int_{-1}^{+1} dv\,\big(h_0(v) - h_1(v)\,\partial_v\big) f(v). \tag{4.21} $$

The crucial property will be


Proposition 4.3. The linear functional Hfish vanishes on even powers f (v) = v 2m , and Hfish [v 2m+1 ] =

2m+1 2 2m + 1 6m + 7 5 Jν + J2m+1 − , 3 2m + 2 6 6

(4.22)

ν=0

where $J_n$ are given in (D.6). In particular, $H_{fish}[v^p] = 0$ for $p = 0, 1, 2, 3, 4$, and $H_{fish}[v^5] = \frac{4}{81}$.

Proof. The even powers of $v$ are automatically annihilated by $H_{fish}$ by symmetry under $v \leftrightarrow -v$. For the nontrivial case of the odd powers, see App. D.

4.3. $d = 3$, $M = 0$. The boundary limit. Let us first consider the most interesting case of the interaction $:\phi^4:$, i.e., $k = 4$. The fish diagram contribution to the first order correction to $(\phi^2)_{\kappa\phi^4}$ is given by

$$ 6\int_0^\infty \frac{dz_1}{z_1^{d+1}} \int d^3x_1\, g(x_1)\,\gamma(z_1)\; r_{fish}(X_1, X)\, :\phi^2(X_1): \;=\; \frac{6i}{8\pi^4} \int_0^\infty \frac{dz_1}{z_1^{d+1}} \int d^3y\; \delta F(v)\,\theta(-y^0)\cdot\Box_{X_1}\Big(\gamma(z_1)\, g(x + y)\, :\phi^2(z_1, x + y):\Big), \tag{4.23} $$

where $X_1 = (z_1, x_1)$, $y = x_1 - x$, and $v = \frac{z^2 + z_1^2 - y^2}{2 z z_1}$ as before. To study the boundary limit, we proceed exactly as in Sect. 3.2, when evaluating (3.6). We choose again $d = 3$ and $M = 0$. Making the same change of variables, we put

(z 1 , x 0 − t, x + r eϕ ) := z 1−6 X 1 γ (z 1 ) g(x1 ) (1 , : φ 2 (X 1 ) : 2 ) (4.24) and

x (z 1 , t, r 2 ) :=

1 2π

dϕ (z 1 , x 0 − t, x + r eϕ ).

(4.25)

Again, x is regular at 0, and

x (0, 0, 0) = −18 (1 , : ϕ 2 (x) : 2 ).

(4.26)

The factor −18 is produced by the Laplace operator (2.3) when acting on : φ(z 1 , x1 )2 : ∼ z 16 at small z 1 . Then we arrive at the matrix element of (4.23), ∞ ∞ 6z 3 z 1 dz 1 dt θ (t 2 − w) · x (z 1 , t, t 2 − w)w=w (z ) , = − 2 · Hfish v,z 1 4π 0 0 (4.27) which is of the same form as (3.23), except for the additional power z 13 (due to the factor : φ 2 : in (4.23) as compared to φ in (3.5)), and with the functional H replaced by Hfish given in (4.21). The argument in square brackets is of the form I3 (v, z)( f ) with f = x on R3+ , as computed in Lemma B.1 of App. B. By the same arguments as before, it is sufficient


to know it in the range $v^2 < 1$, where it is given by (B.3): there are polynomial terms $\sum_{0\le k\le\ell\le 5} A_{k\ell}(f)\, v^k z^\ell$, a logarithmic contribution

$$ z^5\cdot B_3(v)\cdot\log\big((1 - v)z\big)\cdot f(0, 0, 0) \quad\text{with}\quad B_3(v) = \frac18\cdot v\,(1 - v^2)(7v^2 - 3), \tag{4.28} $$

and a remainder $R_{v,z}(f) = O(z^6)$ that vanishes in the boundary limit $z \to 0$. By Prop. 4.3, the leading polynomial terms with $k \le 4$ are annihilated by $H_{fish}[\cdot]$, so that only the term $H_{fish}[v^5]\, A_{55}(f)\, z^5$ survives. The $\log(1 - v)$ term in (4.28) produces another constant$^8$ times $z^5 f(0, 0, 0)$, and the $\log z$-term produces the contribution $-\frac78\, H_{fish}[v^5]\cdot z^5 \log z\cdot f(0, 0, 0)$. Since $H_{fish}[v^5] = \frac{4}{81}$ (Prop. 4.3), we have thus found the following analog of Prop. 3.1:

Proposition 4.4. For any test function $f$ on $\mathbb{R}^3$, the limit

$$ \lim_{z\to 0}\, z^{-5}\Big(H_{fish}\big[I_3(v, z)(f)\big] + \frac{7}{162}\, z^5 \log z\cdot f(0, 0, 0)\Big) \tag{4.29} $$

is finite.

Inserting this result, with $f$ the angular average defined in (4.25), and (4.26) into (4.27), we find the first order contribution

$$ (4.23) = -\frac{7 z^6}{6\pi^2}\cdot\Big(\log z\cdot :\varphi(x)^2: + O(z)\Big). \tag{4.30} $$
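The rational coefficients in this chain can be re-derived mechanically. The sketch below builds $B_n(v)$ from the terminating hypergeometric series $\frac12(1 - v^2)\,{}_2F_1(-n, n+3; 2; \frac{1-v}{2})$ of Lemma B.1, compares $B_3$ with the closed form (4.28), and then multiplies up the factors quoted in the text. It takes $H_{fish}[v] = H_{fish}[v^3] = 0$ and $H_{fish}[v^5] = \frac{4}{81}$ from Prop. 4.3, the prefactor $-\frac{6}{4\pi^2}$ from (4.27) and the factor $-18$ from (4.26) as inputs rather than recomputing them; all helper names are ours.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_add(a, b):
    """Add two polynomials given as ascending coefficient lists."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def B(n):
    """Ascending coefficients in v of B_n(v) = (1 - v^2)/2 * 2F1(-n, n+3; 2; (1-v)/2)."""
    x = [Fraction(1, 2), Fraction(-1, 2)]       # x = (1 - v)/2
    series = [Fraction(0)]
    coeff = Fraction(1)                          # (-n)_k (n+3)_k / ((2)_k k!)
    x_pow = [Fraction(1)]                        # x^k
    for k in range(n + 1):                       # the series terminates at k = n
        series = poly_add(series, [coeff * c for c in x_pow])
        coeff *= Fraction((k - n) * (n + 3 + k), (2 + k) * (k + 1))
        x_pow = poly_mul(x_pow, x)
    return poly_mul([Fraction(1, 2), Fraction(0), Fraction(-1, 2)], series)


# Closed form (4.28): B_3(v) = (1/8) v (1 - v^2)(7 v^2 - 3) = (-3v + 10v^3 - 7v^5)/8
B3 = B(3)
B3_closed = [Fraction(c, 8) for c in (0, -3, 0, 10, 0, -7)]

# Inputs quoted from Prop. 4.3: H_fish kills v and v^3, and H_fish[v^5] = 4/81
H_FISH_V5 = Fraction(4, 81)
log_coeff = B3[5] * H_FISH_V5                    # coefficient of z^5 log z f(0,0,0)

# Prefactor -6/(4 pi^2) from (4.27) and factor -18 from (4.26); result in units of 1/pi^2
first_order = Fraction(-6, 4) * log_coeff * (-18)

if __name__ == "__main__":
    print("B_3 coefficients:", B3)
    print("log coefficient:", log_coeff)
    print("first-order coefficient (units of 1/pi^2):", first_order)
```

With these inputs the sketch returns $-\frac{7}{162}$ for the $z^5\log z$ coefficient and $-\frac{7}{6}$ (in units of $1/\pi^2$) for the coefficient of $z^6 \log z$, in agreement with (4.29) and (4.30).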

The absence of all lower order terms establishes the existence of the boundary limit, and the presence of the logarithmic term signals the anomalous dimension of the composite boundary field

$$ \Delta^{\phi^2}_{\kappa\phi^4} = 6 - \frac{7}{6\pi^2}\cdot\kappa + O(\kappa^2) \tag{4.31} $$

at first order of perturbation theory. This establishes the existence of the boundary limit of $(\varphi^2)_{\kappa\phi^4}$ in first order perturbation theory, when $M = 0$. The result requires the nontrivial cancellations $H_{fish}[v] = H_{fish}[v^3] = 0$ of Prop. 4.3, involving the precise functions $h_0$ and $h_1$ of Prop. 4.2 appearing in the renormalized fish diagram. It remains to investigate whether similar cancellations persist for $M \neq 0$, $d \neq 3$, and at higher orders.

It is now easy to repeat the analysis for the interaction $:\phi^3:$, i.e., $:\phi^2(X_1): \sim z_1^6$ on the r.h.s. of (4.23) has to be replaced by $\phi(X_1) \sim z_1^3$. In this case, the power $z_1^3$ in the $z_1$-integral is absent ($n = 0$ in Lemma B.1), hence the logarithmic term $\log z$ arises at order $z^2$ with a coefficient $\sim (1 - v^2)$. Because $H_{fish}$ annihilates the quadratic polynomial $B_0(v) = \frac12(1 - v^2)$, but not $B_0(v)\log(1 - v)$, the first-order diagram will not contain $\log z$ terms, but finite terms $\sim z^3 \varphi(x)$. This reflects the expected perturbative mixing of the fields $\phi^2$ and $\phi$ under the cubic interaction. Accordingly, the boundary limit $z \to 0$ should be taken of a suitable combination like $z^{-6}\big(\phi^2 + O(\kappa)\,\phi\big)_{\kappa\phi^3}$.

8 The factor $(1 - v^2)$ in $B_n$ in Lemma B.1 ensures the finiteness of $H_{fish}[B_3(v)\log(1 - v)]$.


5. Conclusion

We have pursued the strategy of perturbative construction of interacting conformal fields in $d$ dimensions, which proceeds by the perturbative construction of interacting AdS fields in $d + 1$ dimensions and subsequently performing a boundary limit. The unperturbed conformal field is a generalized free field (or a Wick product thereof). This procedure resolves the problematic issues associated with the perturbation theory around generalized free fields, and at the same time drastically reduces the expected infinite arbitrariness involved in its renormalization. The most important benefit is the fact that the boundary fields, if renormalized by this method, do not suffer from the conformal anomaly, i.e., the conformal symmetry is perturbatively preserved. We find, however, that the existence of the boundary limit is not automatically guaranteed. Requiring its existence may be viewed as another renormalization condition for the AdS field which cannot always be fulfilled.

We have pursued a number of case studies involving polynomial interactions of scalar fields. In relevant cases, the boundary limit exists, and the renormalized boundary fields have anomalous dimensions that can be computed. (An anomalous dimension does not mean a conformal anomaly!) Because the exact analytical expressions are quite involved, we have considered only very special cases; but in view of the highly systematic emergence of the cancellations, we believe that the promising results found in these cases pertain also to more general cases.

The method is applicable only when the Lagrangian interaction density of the conformal boundary field is induced by a polynomial interaction on AdS. Such densities are rather special elements of the Borchers class of the generalized free field, which carry a reminiscence of its AdS origin. But in view of the fact that a general perturbation theory for generalized free fields has not yet been formulated, it is encouraging that a successful renormalization can be achieved at least for a limited class of interactions.

There arises an interesting question, concerning the “continuous operator product expansion” for generalized free fields, as discussed in Sect. 2.3. The OPE in the bulk is certainly a discrete sum. Taking the boundary limit, when it exists, should not alter this feature. Recalling that the continuous OPE is caused by the failure of factorization of the weight functions $h(k_1^2, \dots, k_l^2)$ in (2.13), we are tempted to conjecture the perturbative stability of a discrete OPE for “factorizing” Wick products whenever only the Lagrangian is a non-factorizing generalized Wick product. To establish such a result, one would have to reorganize the OPE of the perturbed limit fields, whose subleading terms are continuous in terms of the unperturbed fields, into a discrete OPE in terms of the perturbed fields.

Acknowledgements. MD profited from discussions with Günter Scharf and Raymond Stora during an early stage of this work. Extensive discussions with Klaus Fredenhagen clarified many conceptual issues. We thank the anonymous referee for insisting, by his very detailed and qualified inquiries, on more detailed explanations in Sect. 2.3, and for raising the interesting issue of the structure of the OPE.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

A. Källén-Lehmann Representation of $\Delta^+_{m_1}(y)\,\Delta^+_{m_2}(y)$

Let $\Delta^+_m(y)$ denote the 2-point function of a massive scalar free field in $d$-dimensional Minkowski space. We are going to prove

$$ \Delta^+_{m_1}(y)\,\Delta^+_{m_2}(y) = \int_0^\infty dm^2\, \rho_{m_1,m_2}(m^2)\, \Delta^+_m(y) \tag{A.1} $$

with

$$ \rho_{m_1,m_2}(m^2) = \frac{|S^{d-2}|}{4\cdot 2^{d-3}\cdot(2\pi)^{d-1}} \times \theta(m - m_1 - m_2)\cdot m^{2-d}\,\Big(\big(m^2 - m_1^2 - m_2^2\big)^2 - 4 m_1^2 m_2^2\Big)^{\frac{d-3}{2}} \tag{A.2} $$

with $|S^d|$ the surface of the unit sphere $S^d$ in $d + 1$ dimensions,

$$ |S^d| = \frac{2\pi^{\frac{d+1}{2}}}{\Gamma\big(\frac{d+1}{2}\big)}. \tag{A.3} $$
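The ingredients of (A.2) and (A.3) are easy to probe numerically. The sketch below evaluates the sphere surface (A.3), checks that the expression raised to the power $\frac{d-3}{2}$ in (A.2) factorizes as $\big(m^2 - (m_1 + m_2)^2\big)\big(m^2 - (m_1 - m_2)^2\big)$, and confirms that $\frac{1}{2m}$ times its square root is the momentum at which two particles of masses $m_1$, $m_2$ have total energy $m$ in their common rest frame. Function names and sample masses are illustrative assumptions of ours.

```python
import math


def sphere_area(d):
    """|S^d| = 2 pi^((d+1)/2) / Gamma((d+1)/2), cf. (A.3)."""
    return 2.0 * math.pi ** ((d + 1) / 2.0) / math.gamma((d + 1) / 2.0)


def kallen(m, m1, m2):
    """(m^2 - m1^2 - m2^2)^2 - 4 m1^2 m2^2, the expression under the power in (A.2)."""
    return (m * m - m1 * m1 - m2 * m2) ** 2 - 4.0 * m1 * m1 * m2 * m2


def relative_momentum(m, m1, m2):
    """Momentum p0 = sqrt(kallen) / (2m); requires m > m1 + m2 (threshold in (A.2))."""
    if m <= m1 + m2:
        raise ValueError("below threshold: need m > m1 + m2")
    return math.sqrt(kallen(m, m1, m2)) / (2.0 * m)


def total_energy(p, m1, m2):
    """sqrt(p^2 + m1^2) + sqrt(p^2 + m2^2)."""
    return math.sqrt(p * p + m1 * m1) + math.sqrt(p * p + m2 * m2)


if __name__ == "__main__":
    m, m1, m2 = 5.0, 1.0, 2.0
    p0 = relative_momentum(m, m1, m2)
    print("p0 =", p0, " total energy =", total_energy(p0, m1, m2))
    print("|S^1| =", sphere_area(1), " |S^2| =", sphere_area(2))
```

The factorized form makes explicit that above the threshold $m > m_1 + m_2$ the expression is automatically positive.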

From the definitions, and using Lorentz invariance, it is easily seen that the Källén-Lehmann weight is given by

$$ \rho_{m_1,m_2}(m^2) = \frac{1}{(2\pi)^{d-1}} \int_{V_+} d^dp_1 \int_{V_+} d^dp_2\; \delta(p_1^2 - m_1^2)\,\delta(p_2^2 - m_2^2)\,\delta^d(p_1 + p_2 - p), \tag{A.4} $$

where $p \in V_+$ is any four-momentum such that $p^2 = m^2$. It is convenient to choose $p = (m, \vec 0)$ and perform the integrations over the energies $p_i^0$ first, and evaluate the momentum conservation $\vec p_2 = -\vec p_1$. The resulting integral over $\vec p \equiv \vec p_1$ reads in polar coordinates $p = |\vec p|$,

$$ \rho_{m_1,m_2}(m^2) = \frac{|S^{d-2}|}{4(2\pi)^{d-1}} \int_0^\infty \frac{dp\; p^{d-2}}{\sqrt{p^2 + m_1^2}\,\sqrt{p^2 + m_2^2}}\; \delta\Big(\sqrt{p^2 + m_1^2} + \sqrt{p^2 + m_2^2} - m\Big). \tag{A.5} $$

The argument of the $\delta$-function vanishes at

$$ p_0 = \frac{1}{2m}\sqrt{\big(m^2 - m_1^2 - m_2^2\big)^2 - 4 m_1^2 m_2^2}, \tag{A.6} $$

provided $(m^2 - m_1^2 - m_2^2)^2 - 4 m_1^2 m_2^2 > 0$ and $m - (m_1 + m_2) > 0$, where the first bound is redundant. From this, we obtain (A.2).

B. The Origin of the Logarithmic Boundary Terms

We use notations as introduced in Sect. 3.2, with $u \equiv z_1$. For a test function $f$ on $\mathbb{R}^3$, we denote by $I(u, v, z)$ the integral

$$ I(u, v, z) := I(w)\big|_{w = w_{v,z}(u)}, \quad\text{where}\quad I(w) := \int_0^\infty dt\; \theta(t^2 - w)\cdot f(u, t, t^2 - w), \tag{B.1} $$

and by $I_n(v, z)(f)$ the integral

$$ I_n(v, z)(f) := \int_0^\infty u^n\, du\; I(u, v, z) \qquad (n \ge 0). \tag{B.2} $$

We want to prove:


Lemma B.1. Let $z > 0$. Then $I_n(v,z)(f)$ is continuous w.r.t. $v$. In the range $v^2 < 1$, it is of the form

$$I_n(v,z)(f) = \sum_{0\le k\le \ell\le n+2} A_{k\ell}(f)\, v^k z^\ell + B_n(v)\cdot z^{n+2}\log\big((1-v)z\big)\cdot f(0,0,0) + R_{v,z}(f), \tag{B.3}$$

if $n \ge 0$ is an integer. Here, $A_{k\ell}$ are distributions, $B_n(v) = \frac12 (1-v^2)\,{}_2F_1\big(-n, n+3; 2; \frac{1-v}{2}\big)$ is a polynomial of degree $n+2$, and the remainder $R_{v,z}$ is a family of distributions that is differentiable in $v$ in the range $v^2 < 1$, and that vanishes at least $\sim z^{n+3}\log z$ as $z \to 0$. If $n = [n] + \varepsilon$ is not an integer, then the first (polynomial) sum extends until $[n]+2$, the logarithmic term is replaced by $C_n(v)\cdot z^{[n]+2+\varepsilon}\cdot f(0,0,0)$ with a possibly non-polynomial function $C_n$, and the remainder is $O(z^{[n]+3})$.

Remark. The emphasis here is on the various subleading terms after the polynomial terms, because they become the leading ones in different instances of our case studies of the boundary limit, and we expect that this happens also in more general cases. The $\log z$-term is essential for Prop. 3.1 and Prop. 4.4. The $\log(1-v)$-term is used in the last paragraph of Sect. 4.3, and the $z^{[n]+2+\varepsilon}$-term in the non-integer case is relevant in Sect. 3.3.

Proof. The integrals $I(u,v,z)$ and $I_n(v,z)(f)$ are continuous w.r.t. $v$ by definition, because the integrand and the range of integration vary continuously. For the differentiability w.r.t. $v$ when $v^2 < 1$, we note that the dependence on $v$ is only through $w$, and $w = w_{v,z}(u) \ge (1-v^2)u^2 > 0$. Thus $\partial_v I(u,v,z) = -2uz\,\partial_w I(w)$, and

$$-\partial_w I(w) = \frac{1}{2\sqrt{w}}\, f(u,\sqrt{w},0) + \int_{\sqrt{w}}^\infty dt\cdot \partial_3 f(u,t,t^2-w). \tag{B.4}$$

We now compute the leading derivatives w.r.t. $z$ in the range $v^2 < 1$. Again, the dependence is only through $w = w_{v,z}(u) > 0$, and $\partial_z I(u,v,z) = -2(uv-z)\,\partial_w I(w)$, hence

$$\partial_z^\ell I(u,v,z) = \sum_{k=[\frac{\ell+1}{2}]}^{\ell} C_k^\ell\cdot (uv-z)^{2k-\ell}\,(-\partial_w)^k I(w)\big|_{w=w_{v,z}(u)} \tag{B.5}$$

with certain combinatorial coefficients $C_k^\ell$. Computing $(-\partial_w)^k I(w)$, the derivatives can either all go on the integrand, giving

$$\int_{\sqrt{w}}^\infty dt\; \partial_3^k f(u,t,t^2-w), \tag{B.6}$$

or after $q < k$ derivatives on the integrand, the next derivative goes on the lower boundary, producing $(2\sqrt{w})^{-1}\,\partial_3^q f(u,\sqrt{w},0)$, and the remaining $k-q-1$ derivatives produce a sum of terms (neglecting numerical coefficients for the moment)

$$w^{-k+q+\frac{p}{2}+\frac12}\cdot \partial_2^p\partial_3^q f(u,\sqrt{w},0) \quad\text{with } p+q \le k-1. \tag{B.7}$$


Now, at $z = 0$, we have $w = u^2$, hence the terms (B.6), (B.7) inserted into (B.5) become, respectively,

$$(uv)^{2k-\ell}\int_u^\infty dt\; \partial_3^k f(u,t,t^2-u^2), \qquad (uv)^{2k-\ell}\, u^{-2k+2q+p+1}\, \partial_2^p\partial_3^q f(u,u,0). \tag{B.8}$$

To obtain $\partial_z^\ell I_n(v,z)(f)$, these remain to be integrated with $\int_0^\infty u^n\,du \dots$. The $u$-integrals are unproblematic at large $u$ by the falloff of the test function, but they may become singular at $u = 0$. The most singular terms are the latter ones in (B.8) when $p = q = 0$, i.e., $v^{2k-\ell}\, u^{-\ell+1} f(u,u,0)$. It is then obvious that the $u$-integrals over (B.8) are finite multiples of $v^{2k-\ell}$, as long as $\ell < n+2$. Thus, for $\ell < n+2$, $\partial_z^\ell I_n(v,z)(f)|_{z=0}$ is finite, and is in fact a polynomial in $v$ of degree $\ell$, because $[\frac{\ell+1}{2}] \le k \le \ell$.

If $n$ is an integer and $\ell = n+2$, the most singular terms $p = q = 0$ are

$$\int_0^\infty du\; u^n (uv-z)^{2k-n-2}\, w^{-k+\frac12}\cdot f(u,\sqrt{w},0)\big|_{w=w_{v,z}(u)}, \tag{B.9}$$

with $\frac{n+3}{2} \le k \le n+2$. While all other terms are finite multiples of $v^{2k-n-2}$ at $z = 0$, these terms are logarithmically divergent at $z = 0$. To isolate the divergence, we split the integration range into the intervals $(0,U)$ and $(U,\infty)$, for any fixed $U > 0$. The latter integral is a finite multiple of $v^{2k-n-2}$ at $z = 0$. In the former, we write $f(u,\sqrt{w},0) = f(0,0,0) + \big[f(u,\sqrt{w},0) - f(0,0,0)\big]$, so that the second contribution is also a finite multiple of $v^{2k-n-2}$ at $z = 0$. The remaining terms, that diverge at $u = 0$ when $z = 0$, are

$$f(0,0,0)\cdot\int_0^U du\; u^n (uv-z)^{2k-n-2}\, w_{v,z}(u)^{-k+\frac12} =: I^{\rm div}_{n,k}(v,z). \tag{B.10}$$

Restoring the suppressed numerical coefficients, these terms sum up to

$$\sum_{k=[\frac{n+3}{2}]}^{n+2} C_k^{n+2}\,\tfrac12\big(\tfrac12\big)_{k-1}\cdot I^{\rm div}_{n,k}(v,z) = -f(0,0,0)\cdot \partial_z^{n+2}\int_0^U du\; u^n \sqrt{w_{v,z}(u)}. \tag{B.11}$$

Equation (B.11) comprises all contributions to $\partial_z^{n+2} I_n(v,z)(f)$ that are divergent at $z = 0$, while all other contributions are polynomials in $v$ of degree $n+2$.

The $u$-integral in (B.11) can be performed explicitly: Introducing the integration variable $s = u - vz$ and the constant $a^2 := (1-v^2)z^2$, the integrand is a linear combination of terms $s^m\sqrt{s^2+a^2}$. If $m$ is odd, the primitive function is a polynomial in $s$, $a^2$, and $\sqrt{s^2+a^2}$. Evaluated at the upper and lower values $s = U - vz$ and $s = -vz$, these are regular functions in $v$ and $z$, that possess convergent power series expansions in $vz$ and $z$ in the range $v^2 < 1$, $0 \le z < U$. In particular, they contribute further finite values at $z = 0$ to (B.11), that are polynomials in $v$ of degree $n+2$. If $m = 2\mu$ is even, in addition to terms of the previous algebraic type, the primitive functions contain terms of the form

$$a^{2\mu+2}\log\big(s+\sqrt{s^2+a^2}\big)\Big|_{s=-vz}^{U-vz} = (z^2 - v^2z^2)^{\mu+1}\,\log\frac{U - vz + \sqrt{w_{v,z}(U)}}{-vz + \sqrt{w_{v,z}(0)}}. \tag{B.12}$$


The logarithm of the numerator is again a convergent power series as above, and contributes further finite values at $z = 0$ to (B.11), that are polynomials in $v$ of degree $n+2$. But the denominator yields the logarithmic term $\log\big((1-v)z\big)$. Collecting all prefactors, we find the total logarithmic contribution to (B.11) to be given by

$$(n+2)!\cdot B_n(v)\cdot \log\big((1-v)z\big)\cdot f(0,0,0) \tag{B.13}$$

with $B_n(v) = \frac12(1-v^2)\cdot v^n\, {}_2F_1\big(-\frac{n}{2}, -\frac{n-1}{2}; 2; -\frac{1-v^2}{v^2}\big)$. With [1, Eqs. 15.3.19, 15.3.5], this can be brought into the manifestly polynomial form of $B_n$ as given in the lemma. Knowing (the form of) the first $n+2$ derivatives of $I_n(v,z)(f)$ at $z = 0$, we obtain the claim of the lemma, for $n$ integer.

If $n = [n] + \varepsilon$ is not an integer, then all terms (B.8) give rise to finite integrals $\int u^n \dots$ as long as $\ell \le [n]+2$, i.e., $\partial_z^\ell I_n(v,z)(f)|_{z=0}$ are polynomials in $v$ of degree $\ell$ up to $\ell \le [n]+2$. However, a scaling argument shows that $\partial_z^{[n]+2} I_n(v,z)(f)|_{z=0}$ has a subleading term of order $O(z^\varepsilon)$: Namely, the integrands

$$g_k(u,v,z) = u^n (uv-z)^{2k-[n]-2}\, w_{v,z}(u)^{-k+\frac12} \tag{B.14}$$

of the leading terms are homogeneous of order $\varepsilon - 1$ in $u$ and $z$. Using Euler's equation in the form $(z\partial_z - \varepsilon)\,g_k(u,v,z) = (-1 - u\partial_u)\,g_k(u,v,z) = -\partial_u\big(u\,g_k(u,v,z)\big)$, this implies

$$(z\partial_z - \varepsilon)\int_0^U du\; g_k(u,v,z) = -U\,g_k(U,v,z), \tag{B.15}$$

where $U g_k(U,v,0) = v^{2k-[n]-2}\,U^\varepsilon$. This differential equation for $\int_0^U du\, g_k(u,v,z)$ admits contributions $c_k(v)\cdot z^\varepsilon$ with undetermined integration constants $c_k(v)$, that sum up to $C_n(v)$ in the statement of the lemma. This proves the lemma for non-integer $n$.

The proof of the lemma clearly exhibits the origin of the logarithmic divergence to be the range $z_1 \approx 0$ of the integration over $z_1 \equiv u$. Notice also in (B.3) the logarithmic singularity at $v = 1$, where $\sqrt{w} = |z_1 - z|$. It arises upon integration over $z_1$ in the vicinity of $z$, corresponding to the point $X_1 = X$. This singularity does not lead to divergences, because it is always tamed by the factor $1 - v^2$ in $B_n(v)$.

C. Details of the Renormalization of the Massless Fish Diagram on AdS

We work with the convention that the cut of $\log z$ ($z \in \mathbb{C}$) is along $(-\infty, 0]$. As usual we define for $z \in \mathbb{C}\setminus[1,\infty)$,

$$\mathrm{Li}_2(z) := -\int_{C_z} dz'\,\frac{\log(1-z')}{z'}, \qquad \mathrm{Li}_3(z) := \int_{C_z} dz'\,\frac{\mathrm{Li}_2(z')}{z'}, \tag{C.1}$$

where $C_z$ is any smooth curve from $0$ to $z$ which does not intersect $[1,\infty)$. With that, $\mathrm{Li}_2(z)$ and $\mathrm{Li}_3(z)$ are analytic on $\mathbb{C}\setminus[1,\infty)$. Since

$$\frac{u+1}{u-1} \in (-\infty, 0] \;\Leftrightarrow\; u \in [-1,1], \qquad \frac{2}{1\pm u} \in [1,\infty) \;\Leftrightarrow\; u \in [-1,1], \tag{C.2}$$

the expression (4.17) for $F(u)$ is manifestly analytic for $u \notin [-1,1]$.


The formula (4.17) for F(u) can be derived by first computing the integral x 1 dt (1 − t 2 )(Q 1 (t))2 for x ∈ R , |x| > 1 , F (x) = (1 − x 2 )2

(C.3)

which gives (after analytic continuation to z ∈ C \ [−1, 1]) F (z) =

2 + 3z − z 3 1 (1 − z 2 )2 12 1 z2 z+1 log + + + 6 3 z−1

z + 1 2 z log − z−1 3 2 2 Li2 + 2 C1 , 3 1−z

(C.4)

where $C_1$ is an undetermined constant. A second integration yields $F(u)$ for $u \notin [-1,1]$. Here we use well-known identities for $\mathrm{Li}_2$ and $\mathrm{Li}_3$ (see, e.g., [21]) and

$$\frac{1}{z^2-1}\Big(\log\frac{z+1}{z-1}\Big)^{n-1} = \frac{-1}{2n}\,\frac{d}{dz}\Big(\log\frac{z+1}{z-1}\Big)^{n} \qquad (n = 2, 3), \tag{C.5}$$

$$\frac{1}{1\pm z}\,\mathrm{Li}_2\Big(\frac{2}{1\pm z}\Big) = \mp\frac{d}{dz}\,\mathrm{Li}_3\Big(\frac{2}{1\pm z}\Big). \tag{C.6}$$

The expressions on the l.h.s. are problematic, since they have poles at $u = \pm 1$, which overlap with the cut along $[-1,1]$ of the pertinent function in the numerator. But the boundary values at $u = v \pm i0$ along both sides of the cut of the expressions on the r.h.s. are well defined distributions. To compute $\delta F(v) = F(v+i0) - F(v-i0)$ we use that the complex derivative is given by the infinitesimal differential quotient in any direction; in particular we may choose the direction of the real axis:

$$\frac{d}{dz} f(z)\Big|_{z=v+iw} = \frac{d}{dv} f(v+iw) \quad\text{if } f \text{ is holomorphic at } z = v+iw, \tag{C.7}$$

and hence

$$\frac{d}{dz} f(z)\Big|_{z=v-i0}^{z=v+i0} = \frac{d}{dv}\big(f(v+i0) - f(v-i0)\big). \tag{C.8}$$

In addition we give the following formulas:

$$\frac{1}{u^2-1}\Big|_{u=v-i0}^{u=v+i0} = i\pi\big(\delta(v+1) - \delta(v-1)\big) = i\pi\,\frac{d}{dv}\,\theta(1-|v|), \quad v \in \mathbb{R}, \tag{C.9}$$

$$\log\frac{v+i0+1}{v+i0-1} = \log\Big|\frac{v+1}{v-1}\Big| - i\pi\,\theta(1-|v|), \quad v \in \mathbb{R}, \tag{C.10}$$

$$\mathrm{Im}\,\mathrm{Li}_2(x\pm i0) = -\int_0^x dt\,\frac{\mathrm{Im}\log\big(1-(t\pm i0)\big)}{t} = \pm\theta(x-1)\, i\pi\log x, \quad x \in \mathbb{R}, \tag{C.11}$$

$$\mathrm{Re}\,\mathrm{Li}_2(x\pm i0) = \mathrm{Li}_2\Big(\frac{x-1}{x}\Big) + \frac12(\log x)^2 + \frac{\pi^2}{6} - \log x\cdot\log(x-1), \quad x > 1, \tag{C.12}$$

$$\mathrm{Im}\,\mathrm{Li}_3(x\pm i0) = \int_0^x dt\,\frac{\mathrm{Im}\,\mathrm{Li}_2(t\pm i0)}{t} = \pm\theta(x-1)\,\frac{i\pi}{2}(\log x)^2, \quad x \in \mathbb{R}. \tag{C.13}$$

With that, the result (4.18) is obtained by a straightforward calculation (dropping terms involving $(1-v^2)\cdot\partial_v\theta(1-|v|) \equiv 0$).


D. Integrals for the Boundary Limit

Applying the functional $H_{\rm fish}[f] = \int_{-1}^{+1} dv\,\big(h_0(v) - h_1(v)\partial_v\big) f(v)$ (with $h_0$ and $h_1$ as in Prop. 4.2) to odd power functions $f(v) = v^{2m+1}$, all integrals are of the types

$$J_n = \int_{-1}^{+1} dv\; v^n \log\frac{1+v}{2} = (-1)^n\int_{-1}^{+1} dv\; v^n\log\frac{1-v}{2}, \tag{D.1}$$

$$K_n = \int_{-1}^{+1} dv\; v^n \log\frac{1-v}{2}\,\log\frac{1+v}{2}, \tag{D.2}$$

$$L_n = \int_{-1}^{+1} dv\; v^n\,\mathrm{Li}_2\Big(\frac{1+v}{2}\Big) = (-1)^n\int_{-1}^{+1} dv\; v^n\,\mathrm{Li}_2\Big(\frac{1-v}{2}\Big), \tag{D.3}$$

so that

$H_{\rm fish}[v^{2m+1}] =$ π2 5 1 2 4 1 L 2m+1 + J2m+1 −(2m + 1) K 2m − J2m+1 − − . 3 3 3 2 9 6 (D.4)

Since we could not find these integrals in the literature, we sketch their computation here.

In $J_n$, we partially integrate $\log\frac{1+v}{2}$ with primitive $(1+v)\big(\log\frac{1+v}{2} - 1\big)$. This gives $J_0 = -2$ and the recursion

$$J_n = -\frac{1+(-1)^n}{(n+1)^2} - \frac{n}{n+1}\,J_{n-1}, \tag{D.5}$$

which is solved by

$$J_n = \frac{2\,(-1)^{n+1}}{n+1}\sum_{\nu=0}^{[\frac{n}{2}]}\frac{1}{2\nu+1}. \tag{D.6}$$
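The recursion (D.5) and its solution (D.6) can be cross-checked in exact rational arithmetic. The following is a minimal sketch, not part of the original computation: the function names `j_direct`, `j_recursive` and `j_closed` are hypothetical, and the direct evaluation assumes the substitution $v = 2t-1$ together with the elementary integral $\int_0^1 t^k\log t\,dt = -1/(k+1)^2$.

```python
from fractions import Fraction
from math import comb


def j_direct(n):
    """J_n = int_{-1}^{1} v^n log((1+v)/2) dv, evaluated exactly.

    With v = 2t - 1 this is 2 * int_0^1 (2t-1)^n log(t) dt, and each
    monomial contributes int_0^1 t^k log(t) dt = -1/(k+1)^2.
    """
    total = Fraction(0)
    for k in range(n + 1):
        coeff = comb(n, k) * 2**k * (-1) ** (n - k)  # coefficient of t^k in (2t-1)^n
        total -= Fraction(coeff, (k + 1) ** 2)
    return 2 * total


def j_recursive(n):
    """J_n from J_0 = -2 and the recursion (D.5)."""
    j = Fraction(-2)
    for m in range(1, n + 1):
        j = -Fraction(1 + (-1) ** m, (m + 1) ** 2) - Fraction(m, m + 1) * j
    return j


def j_closed(n):
    """J_n from the closed form (D.6)."""
    partial = sum(Fraction(1, 2 * nu + 1) for nu in range(n // 2 + 1))
    return Fraction(2 * (-1) ** (n + 1), n + 1) * partial
```

All three evaluations agree; for instance $J_0, \dots, J_3$ come out as $-2$, $1$, $-8/9$, $2/3$.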

Summing the geometric series in the integrand of $J_n$, we also get

$$\sum_{n=0}^\infty J_n = \int_{-1}^{+1}\frac{dv}{1-v}\,\log\frac{1+v}{2} = \mathrm{Li}_2\Big(\frac{1-v}{2}\Big)\Big|_{v=-1}^{v=+1} = -\frac{\pi^2}{6}. \tag{D.7}$$

$K_n$ vanish if $n$ is odd. Partially integrating $v^{2m}$ in $K_{2m}$, expanding $(1-v)^{-1}$ as a geometric series, and using (D.7), we get

$$K_{2m} = \frac{2}{2m+1}\sum_{n=2m+1}^{\infty} J_n = \frac{-2}{2m+1}\Big(\frac{\pi^2}{6} + \sum_{n=0}^{2m} J_n\Big). \tag{D.8}$$

The integrals $L_n$ can be obtained by partial integration of the factor $\mathrm{Li}_2\big(\frac{1+v}{2}\big)$ with primitive $(1-v)\big(1 - \log\frac{1-v}{2}\big) + (1+v)\,\mathrm{Li}_2\big(\frac{1+v}{2}\big)$, which yields $L_0 = \frac{\pi^2}{3} - 2$ and the recursion

$$(n+1)L_n = \frac{\pi^2}{3} - (-1)^n\, n\,(J_n + J_{n-1}) - \frac{1+(-1)^n}{n+1} - n\,L_{n-1} \tag{D.9}$$

with solution

$$(n+1)L_n = \frac{\pi^2}{6} - (-1)^n\sum_{\nu=n+1}^{\infty} J_\nu = \big(1+(-1)^n\big)\frac{\pi^2}{6} + (-1)^n\sum_{\nu=0}^{n} J_\nu. \tag{D.10}$$

Inserting (D.6), (D.8), (D.10) into (D.4) proves Prop. 4.3.
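The formulas (D.8) to (D.10) lend themselves to a quick consistency check. The sketch below is illustrative only and rests on stated assumptions: the function names are hypothetical, $(n+1)L_n$ is stored as a pair $(a, b)$ meaning $a + b\,\pi^2$ with rational $a$, $b$, and $K_{2m}$ is compared against a plain midpoint-rule quadrature (the integrand of (D.2) tends to zero at both endpoints, so no special treatment of $v = \pm 1$ is attempted).

```python
from fractions import Fraction
from math import log, pi


def j_closed(n):
    """J_n from the closed form (D.6)."""
    partial = sum(Fraction(1, 2 * nu + 1) for nu in range(n // 2 + 1))
    return Fraction(2 * (-1) ** (n + 1), n + 1) * partial


def k_closed(m):
    """K_{2m} from the second expression in (D.8), as a float."""
    s = sum(j_closed(n) for n in range(2 * m + 1))
    return -2.0 / (2 * m + 1) * (pi**2 / 6 + float(s))


def k_numeric(m, steps=20000):
    """K_{2m} from definition (D.2), by the midpoint rule on [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        v = -1.0 + (i + 0.5) * h
        total += v ** (2 * m) * log((1 - v) / 2) * log((1 + v) / 2)
    return total * h


def l_recursive(n):
    """(a, b) with (n+1) L_n = a + b*pi^2, from L_0 and recursion (D.9)."""
    a, b = Fraction(-2), Fraction(1, 3)  # L_0 = pi^2/3 - 2
    for m in range(1, n + 1):
        sign = (-1) ** m
        # The stored pair equals m * L_{m-1}, the last term of (D.9).
        a = -sign * m * (j_closed(m) + j_closed(m - 1)) - Fraction(1 + sign, m + 1) - a
        b = Fraction(1, 3) - b
    return a, b


def l_closed(n):
    """(a, b) with (n+1) L_n = a + b*pi^2, from the second form of (D.10)."""
    sign = (-1) ** n
    a = sign * sum(j_closed(nu) for nu in range(n + 1))
    b = Fraction(1 + sign, 6)
    return a, b
```

For $m = 0$ the closed form gives $K_0 = 4 - \pi^2/3$, which the quadrature reproduces to well within its discretization error.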


References

1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. New York: Dover Publications, 1972
2. Avis, S.J., Isham, C.J., Storey, D.: Quantum field theory in anti-De Sitter space-time. Phys. Rev. D 18, 3565–3576 (1978)
3. Bertola, M., Bros, J., Moschella, U., Schaeffer, R.: A general construction of conformal field theories from scalar anti-de Sitter quantum field theories. Nucl. Phys. B 587, 619–644 (2000)
4. Bertola, M., Bros, J., Gorini, V., Moschella, U., Schaeffer, R.: Decomposing quantum fields on branes. Nucl. Phys. B 581, 575–603 (2000)
5. Bros, J., Epstein, H., Moschella, U.: Towards a general theory of quantized fields on the anti-de Sitter space-time. Commun. Math. Phys. 231, 481–528 (2002)
6. Brunetti, R., Fredenhagen, K.: Microlocal analysis and interacting quantum field theories: renormalization on physical backgrounds. Commun. Math. Phys. 208, 623–661 (2000)
7. Dütsch, M., Fredenhagen, K.: A local (perturbative) construction of observables in gauge theories: the example of QED. Commun. Math. Phys. 203, 71–105 (1999)
8. Dütsch, M., Fredenhagen, K.: Algebraic quantum field theory, perturbation theory, and the loop expansion. Commun. Math. Phys. 219, 5–30 (2001)
9. Dütsch, M., Fredenhagen, K.: Causal perturbation theory in terms of retarded products, and a proof of the Action Ward Identity. Rev. Math. Phys. 16, 1291–1348 (2004)
10. Dütsch, M., Rehren, K.-H.: A comment on the dual field in the AdS-CFT correspondence. Lett. Math. Phys. 62, 171–184 (2002)
11. Dütsch, M., Rehren, K.-H.: Generalized free fields and the AdS-CFT correspondence. Ann. Henri Poincaré 4, 613–635 (2003)
12. Epstein, H.: On the Borchers class of a free field. Nuovo Cim. 27, 886–893 (1963)
13. Epstein, H., Glaser, V.: The role of locality in perturbation theory. Ann. Inst. H. Poincaré A 19, 211–295 (1973)
14. Freedman, D.Z., Johnson, K., Latorre, J.I.: Differential regularization and renormalization: a new method of calculation in quantum field theory. Nucl. Phys. B 371, 353–414 (1992)
15. Fronsdal, C.: Elementary particles in a curved space. II. Phys. Rev. D 10, 589–598 (1974)
16. Haag, R., Kastler, D.: An algebraic approach to quantum field theory. J. Math. Phys. 5, 848–861 (1964)
17. Hollands, S., Wald, R.M.: Local Wick polynomials and time-ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 223, 289–326 (2001)
18. Hollands, S., Wald, R.M.: Existence of local covariant time-ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 231, 309–345 (2002)
19. Hollands, S., Wald, R.M.: Conservation of the stress tensor in perturbative interacting quantum field theory in curved spacetimes. Rev. Math. Phys. 17, 227–312 (2005)
20. Källén, G.: Formal integration of the equations of quantum theory in the Heisenberg representation. Ark. Fysik 2, 371–410 (1950)
21. Lewin, L.: Polylogarithms and Associated Functions. Amsterdam: Elsevier North Holland, 1981
22. Maldacena, J.M.: The large N limit of superconformal field theories and supergravity. Adv. Theor. Math. Phys. 2, 231–252 (1998)
23. Moretti, V.: Comments on the stress-energy tensor operator in curved spacetime. Commun. Math. Phys. 232, 189–221 (2003)
24. Rehren, K.-H.: Algebraic holography. Ann. Henri Poincaré 1, 607–623 (2000)
25. Rehren, K.-H.: Local quantum observables in the AdS-CFT correspondence. Phys. Lett. B 493, 383–388 (2000)
26. Rehren, K.-H.: QFT lectures on AdS-CFT. In: Proceedings of the 3rd Summer School in Modern Mathematical Physics, Zlatibor, Serbia (2004), B. Dragovich (ed.), Belgrade, 2005, pp. 95–118
27. Rühl, W.: Lifting a conformal field theory from D-dimensional flat space to (D + 1)-dimensional AdS space. Nucl. Phys. B 705, 437–456 (2005)
28. Watson, G.N.: A Treatise on the Theory of Bessel Functions. Cambridge: Cambridge Univ. Press, 1958 (2nd edition)
29. Witten, E.: Anti-de Sitter space and holography. Adv. Theor. Math. Phys. 2, 253–291 (1998)

Communicated by M. Salmhofer

Commun. Math. Phys. 307, 351–382 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1312-z

Communications in

Mathematical Physics

Curvature Diffusions in General Relativity

Jacques Franchi¹, Yves Le Jan²

¹ IRMA, Université de Strasbourg et CNRS, 7 rue René Descartes, 67084 Strasbourg Cedex, France. E-mail: [email protected]

² Université Paris Sud 11, Mathématiques, Bâtiment 425, 91405 Orsay, France. E-mail: [email protected]

Received: 13 April 2010 / Accepted: 24 March 2011
Published online: 6 September 2011 – © Springer-Verlag 2011

Abstract: We define and study on Lorentz manifolds a family of covariant diffusions in which the quadratic variation is locally determined by the curvature. This allows the interpretation of the diffusion effect on a particle by its interaction with the ambient space-time. We will focus on the case of warped products, especially Robertson-Walker manifolds, and analyse their asymptotic behaviour in the case of Einstein-de Sitter-like manifolds.

Contents

1. Introduction
2. Canonical Vector Fields and Curvature
   2.1 Isomorphism between $\Lambda^2\mathbb{R}^{1,d}$ and $so(1,d)$
   2.2 Frame bundle $G(M)$ over $(M,g)$
   2.3 Expressions in local coordinates
   2.4 Case of a perfect fluid
3. Covariant $\Xi$-Relativistic Diffusions
   3.1 The basic relativistic diffusion
   3.2 Construction of the $\Xi$-diffusion
   3.3 The R-diffusion
   3.4 The E-diffusion
4. Warped (or Skew) Products
5. Example of Robertson-Walker (R-W) Manifolds
   5.1 $\Xi$-relativistic diffusions in an Einstein-de Sitter-like manifold
   5.2 Asymptotic behavior of the R-diffusion in an Einstein-de Sitter-like manifold
   5.3 Asymptotic energy of the E-diffusion in an Einstein-de Sitter-like manifold
6. Sectional Relativistic Diffusion
   6.1 Intrinsic relativistic generators on $G(M)$
   6.2 Sign condition on timelike sectional curvatures
   6.3 Sectional diffusion in an Einstein-de Sitter-like manifold
References

Supported by a grant from the Agence Nationale de la Recherche, no. ANR-09-BLAN-0364-01.

1. Introduction

It has been known since Dudley's pioneering work [Du] that a relativistic diffusion, i.e. a Lorentz-covariant Markov diffusion process, cannot exist on the base space, even in the Minkowski framework of special relativity, but possibly makes sense at the level of the tangent bundle. In this spirit, the general case of a Lorentz manifold was first investigated in [F-LJ], where a general relativistic diffusion was introduced. The quadratic variation of this diffusion is constant and does not vanish in the vacuum. In this article, we investigate diffusions whose quadratic variation is locally determined by the curvature of the space, and hence vanishes in empty (or at least flat) regions.

The relativistic diffusion considered in [F-LJ] lives on the pseudo-unit tangent bundle $T^1M$ of the given generic Lorentz manifold $(M,g)$. As is recalled in Sect. 3.1 below, it can be obtained by superposing, to the geodesic flow of $T^1M$, random fluctuations of the velocity that are given by hyperbolic Brownian motion, if one identifies the tangent space $T^1_\xi M$ with the hyperbolic space $\mathbb{H}^d$ (at point $\xi \in M$, by means of the pseudo-metric $g$). These Brownian fluctuations of the velocity can be defined by the vertical Dirichlet form $\int_{T^1M}\big|\nabla^v_{\dot\xi}F(\xi,\dot\xi)\big|^2\,\mu(d\xi,d\dot\xi)$, considered with respect to the Liouville measure $\mu$.

The Dirichlet forms we investigate in this article depend only on the local geometry of $(M,g)$, e.g. on the curvature tensor at the current point $\xi$, and on the velocity $\dot\xi$. We consider several examples:

– If the scalar curvature $R(\xi)$ is everywhere non-positive (which is physically relevant, see [L-L]), then the Dirichlet form can be: $-\int_{T^1M}\big|\nabla^v_{\dot\xi}F(\xi,\dot\xi)\big|^2\,R(\xi)\,\mu(d\xi,d\dot\xi)$, leading to the covariant relativistic diffusion we call R-diffusion.

– If the energy $E(\xi,\dot\xi)$ is everywhere non-negative (which is physically relevant, see [L-L], and [H-E], where this is called the "weak energy condition"), then we can choose the Dirichlet form to be: $\int_{T^1M}\big|\nabla^v_{\dot\xi}F(\xi,\dot\xi)\big|^2\,E(\xi,\dot\xi)\,\mu(d\xi,d\dot\xi)$, leading to another covariant relativistic diffusion, which we call energy E-diffusion. Contrary to the basic relativistic diffusion, these new relativistic diffusions reduce to the geodesic flow in every empty (vacuum) region. Note that $-R(\xi)$ and $E(\xi,\dot\xi)$ could be replaced by $\varphi(R(\xi))$ and $\psi(E(\xi,\dot\xi))$, for more or less arbitrary non-negative increasing functions $\varphi$, $\psi$. We shall present this class of covariant $\Xi$-relativistic diffusions, or $\Xi$-diffusions, in Sect. 3 below.

– If the sectional curvatures of timelike planes are everywhere non-negative (sectional curvature has proved to be a natural tool in Lorentzian geometry, see for example [H,H-R]), as is often the case (at least in the usual symmetrical examples), then it is possible to construct a covariant sectional relativistic diffusion which undergoes velocity fluctuations that are no longer isotropically Brownian, using the whole curvature tensor (not the Ricci tensor alone). See Sect. 6 below. This sectional relativistic diffusion depends on the curvature tensor in a canonical way. Its diffusion symbol


vanishes in flat regions (i.e. regions where the whole curvature tensor vanishes), but does not vanish, in general, in empty regions (i.e. regions where the Ricci tensor vanishes).

Note that all these covariant diffusions are the projections on $T^1M$ of diffusions on the frame bundle $G(M)$. Actually, they are constructed directly on $G(M)$, as in [F-LJ] and in the classical construction of Brownian motion on Riemannian manifolds, see [El,M,I-W,Em,Hs,A-C-T]. These constructions are performed in Sects. 3 and 6 below. Note also that, while in the flat case the Dudley diffusion [Du] is the unique covariant diffusion, the above examples show that uniqueness fails for curved spaces.

In Sects. 4 and 5, we study in more detail the case of warped products, and specify further some particularly symmetrical examples, namely Robertson-Walker manifolds, which are warped products with energy-momentum tensor of perfect fluid type. We investigate more closely Einstein-de Sitter-like manifolds (Robertson-Walker manifolds for which the expansion rate is $\alpha(t) = t^c$ for some positive $c$), reviewing in this simple class of examples the relativistic diffusions we introduced, which appear to be distinct. We perform in this setting an asymptotic study of the R-diffusion, and of the minimal sub-diffusion $(t_s, \dot t_s)$ relative to the energy E-diffusion.

2. Canonical Vector Fields and Curvature

We present in this section the main notations and recall a few known facts (see [K-N]).

2.1. Isomorphism between $\Lambda^2\mathbb{R}^{1,d}$ and $so(1,d)$. On the Minkowski space-time $\mathbb{R}^{1,d}$, we denote by $\eta = ((\eta_{ij}))_{0\le i,j\le d}$ the Minkowski tensor

$$\eta = \begin{pmatrix} 1 & 0 & \dots & 0\\ 0 & -1 & \dots & 0\\ \vdots & & \ddots & \vdots\\ 0 & \dots & 0 & -1 \end{pmatrix}.$$

We also denote by $((\eta^{ij}))_{0\le i,j\le d}$ the inverse tensor, so that $\eta_{ij}\eta^{jk} = \delta_i^k$ (or equivalently: $\eta^{ij} = \eta_{ij} := 1_{\{i=j=0\}} - 1_{\{1\le i=j\le d\}}$), and by $\langle\cdot,\cdot\rangle_\eta$ the corresponding Minkowski pseudo-metric. For $u, v, w \in \mathbb{R}^{1,d}$, we set

$$u\wedge v\,(w) := \langle u,w\rangle_\eta\, v - \langle v,w\rangle_\eta\, u. \tag{1}$$

In other terms, this is the interior product of $u\wedge v$ by the dual of $w$ with respect to $\eta$. This defines an endomorphism of $\mathbb{R}^{1,d}$ which belongs to $so(1,d)$, since for any $w, w' \in \mathbb{R}^{1,d}$ we have clearly $\langle u\wedge v(w), w'\rangle_\eta + \langle w, u\wedge v(w')\rangle_\eta = 0$. It vanishes only if $u$ and $v$ are collinear, hence if and only if $u\wedge v = 0$. We have thus an isomorphism between $\Lambda^2\mathbb{R}^{1,d}$ and $so(1,d)$.

Remark 2.1.1. The Lie bracket of $so(1,d)$ can be expressed, for any $a, b, u, v \in \mathbb{R}^{1,d}$, by

$$[a\wedge b, u\wedge v] = \langle a,u\rangle_\eta\, b\wedge v + \langle b,v\rangle_\eta\, a\wedge u - \langle a,v\rangle_\eta\, b\wedge u - \langle b,u\rangle_\eta\, a\wedge v.$$

The Minkowski pseudo-metric $\langle\cdot,\cdot\rangle_\eta$ extends to $\Lambda^2\mathbb{R}^{1,d}$, by setting:

$$\langle u\wedge v, a\wedge b\rangle_\eta := \langle u,a\rangle_\eta\langle v,b\rangle_\eta - \langle u,b\rangle_\eta\langle v,a\rangle_\eta = \frac12\Big[\langle u\wedge v(a), b\rangle_\eta - \langle u\wedge v(b), a\rangle_\eta\Big], \tag{2}$$

so that, if $(e_0,\dots,e_d)$ is a Lorentz (i.e. pseudo-orthonormal) basis of $(\mathbb{R}^{1,d},\eta)$, then $(e_i\wedge e_j \mid 0\le i<j\le d)$ is an orthogonal basis of $(\Lambda^2\mathbb{R}^{1,d},\eta)$, such that $\langle e_i\wedge e_j, e_i\wedge e_j\rangle_\eta = \eta_{ii}\eta_{jj}$.

2.2. Frame bundle $G(M)$ over $(M,g)$. Let $M$ be a time-oriented $C^\infty$ $(1+d)$-dimensional Lorentz manifold, with pseudo-metric $g$ having signature $(+,-,\dots,-)$, and let $T^1M$ denote the positive half of the pseudo-unit tangent bundle. Let $G(M)$ be the bundle of direct pseudo-orthonormal frames, with first element in $T^1M$, which has its fibers modelled on the special Lorentz group. Let $\pi_1 : u \mapsto \big(\pi(u), e_0(u)\big)$ denote the canonical projection from $G(M)$ onto the unit tangent bundle $T^1M$, which to each frame $\big(e_0(u),\dots,e_d(u)\big)$ associates its first vector $e_0$.
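The algebra of Sect. 2.1 can be verified on explicit vectors. The following sketch is illustrative only: the dimension `D = 3`, the helper names and the sample vectors are assumptions made for the example, and the bracket is taken to be the commutator $AB - BA$ of endomorphisms. It builds the matrix of $u\wedge v$ from definition (1), tests membership in $so(1,d)$ through $\eta A + A^{T}\eta = 0$, and compares the commutator with the formula of Remark 2.1.1.

```python
D = 3                 # assumed spatial dimension d for the example
N = D + 1             # dimension of R^{1,d}
ETA = [1] + [-1] * D  # diagonal of the Minkowski tensor


def inner(u, v):
    """Minkowski pseudo-metric <u, v>_eta."""
    return sum(ETA[i] * u[i] * v[i] for i in range(N))


def wedge(u, v):
    """Matrix of the endomorphism w -> <u,w> v - <v,w> u of definition (1)."""
    return [[ETA[c] * (u[c] * v[r] - v[c] * u[r]) for c in range(N)]
            for r in range(N)]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(N)) for c in range(N)]
            for r in range(N)]


def combine(terms):
    """Linear combination sum(coef * matrix) over (coef, matrix) pairs."""
    return [[sum(coef * m[r][c] for coef, m in terms) for c in range(N)]
            for r in range(N)]


def bracket(a, b):
    """Commutator AB - BA."""
    return combine([(1, matmul(a, b)), (-1, matmul(b, a))])


def in_so(a):
    """True if eta*A + A^T*eta = 0, i.e. A is skew for <.,.>_eta."""
    return all(ETA[r] * a[r][c] + ETA[c] * a[c][r] == 0
               for r in range(N) for c in range(N))


def bracket_formula(a, b, u, v):
    """Right-hand side of the bracket formula in Remark 2.1.1."""
    return combine([(inner(a, u), wedge(b, v)), (inner(b, v), wedge(a, u)),
                    (-inner(a, v), wedge(b, u)), (-inner(b, u), wedge(a, v))])


def basis(i):
    """i-th vector of the canonical basis of R^{1,d}."""
    return [1 if k == i else 0 for k in range(N)]
```

With integer vectors all comparisons are exact. Applied to the canonical basis, the same check covers every bracket $[e_i\wedge e_j, e_k\wedge e_\ell]$.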

We denote by $TM \xrightarrow{\pi_2} M$ the tangent bundle, by $\Gamma(TM)$ the set of $C^2$ vector fields on $M$ (sections of $\pi_2$), by $G(M) \xrightarrow{\pi} M$ the frame bundle, and by $u = \big(\pi(u); e_0(u),\dots,e_d(u)\big)$ the generic element of $G(M)$. We extend (1) to a linear action of $so(1,d) \equiv \Lambda^2\mathbb{R}^{1,d}$ on $G(M)$, by setting:

$$e_k\wedge e_\ell\,\big(e_j(u)\big) := \eta_{jk}\, e_\ell(u) - \eta_{j\ell}\, e_k(u), \quad\text{for any } 0\le j,k,\ell\le d,$$

where $(e_0,\dots,e_d)$ denotes the canonical basis of $\mathbb{R}^{1,d}$. The action of $SO(d)$ on $(e_1,\dots,e_d)$ induces the identification $T^1M \equiv G(M)/SO(d)$. The right action of $so(1,d)$ on $G(M)$ defines a linear map $V$ from $so(1,d)$ into vector fields on $G(M)$ (i.e. sections of the canonical projection of $TG(M)$ on $G(M)$), such that

$$[V_{a\wedge b}, V_{\alpha\wedge\beta}] = V_{[a\wedge b,\alpha\wedge\beta]}, \quad\text{for any } a\wedge b,\ \alpha\wedge\beta \in \Lambda^2\mathbb{R}^{1,d}. \tag{3}$$

Vector fields $V_{a\wedge b}$ are called vertical.

Notation. To abbreviate the notations, we shall mostly consider the canonical vector fields:

$$V_{ij} := V_{e_i\wedge e_j}, \quad\text{for } 0\le i,j\le d.$$

By (3) and (1), for $0\le i,j,k,\ell\le d$ we have:

$$[V_{ij}, V_{k\ell}] = \eta_{ik}V_{j\ell} + \eta_{j\ell}V_{ik} - \eta_{i\ell}V_{jk} - \eta_{jk}V_{i\ell}. \tag{4}$$

We shall often write $V_j$ for $V_{0j}$.

Denote by p the canonical projection T T M −→ T M , by p˜ the canonical prop˜

( p,T π2 )

jection T G(M) −→ G(M), and consider also the projection T T M −−−−−→ T M ⊕ T M , where T M ⊕ T M := (ξ ; v1 , v2 ) ξ ∈ M , vi ∈ Tξ M ≡ (w1 , w2 ) wi ∈ T M , π2 (w1 ) = π2 (w2 ) is the so-called Whitney sum. σ A connection σ can be defined as a bilinear section T M ⊕ T M −→ T T M of ( p, T π2 ), the bilinearity being that of (v1 , v2 ) → σ (ξ ; v1 , v2 ), above any given base γ point ξ ∈ M. Given such a connection σ and a C 1 curve γ , the parallel transport t


γ along γ of any v0 ∈ Tγ0 M is vt = t v0 ∈ Tγt M defined by the ordinary differential d (γt ; vt ) = σ (γt ; vt , γ˙t ) . Then the covariant derivative ∇γ˙0 X (γ0 ) ∈ equation : dt γ −1 Tγ0 M of a C 1 vector field X is defined as the derivative at 0 of t → X (γt ) . t A connection σ is said to be metric if the associated parallel transport preserves the pseudo-metric, and then acts on G(M) as well as on T M . A metric connection σ defines the horizontal vector fields Hk on G(M), for 0 ≤ k ≤ d , given for any F ∈ C 1 (G(M)) and u ∈ G(M), by : γ (5) Hk F(u) is the derivative at 0 of t → F t u , the C 1 curve γ being such that γ0 = π(u), γ˙0 = ek (u) . Note that T π(Hk ) = ek . The canonical vectors Vi j , Hk span T G(M) the horizontal (resp. vertical) sub bundle of T G(M) being spanned by Hk ’s (resp. Vi j ’s) . Note that H0 generates the geodesic flow, that V1 , . . . , Vd generate the boosts, and that the Vi j (1 ≤ i, j ≤ d) generate rotations. This allows to define the intrinsic torsion tensor ((Ti kj )) and curvature tensor ((Ri j k )) (with 0 ≤ i, j, k, ≤ d) of the metric connection σ , by the assignment : [Hi , H j ] =

d

Ti kj Hk +

k=0

Ri j k Vk ;

(6)

0≤k< ≤d

we can denote this more simply by Ti kj Hk + For any metric connection we have :

1 2

Ri j k Vk .

[Vi j , Hk ] = ηik H j − η jk Hi ,

for 0 ≤ i, j, k ≤ d.

(7)

There exists a unique metric connection with vanishing torsion, called the Levi-Civita connection. We shall henceforth consider this one. The curvature operator Rξ is defined on 2 Tξ M by : Rξ ei (u) ∧ e j (u) := Ri j k ek (u) ∧ e (u), 0≤k< ≤d

for any u ∈ π

−1

(ξ ) and 0 ≤ i, j ≤ d.

(8)

The curvature operator is alternatively given by : for any C 1 vector fields X, Y, Z , A , R (X ∧ Y ) , A ∧ Z η = [∇X , ∇Y ] − ∇[X,Y ] Z , A g . (9) The Ricci tensor and Ricci operator are defined, for 0 ≤ i, k ≤ d , by : Rik :=

d

Ri j k j ,

and

d Ricciξ ei (u) := Rik ek (u) , for any u ∈ π −1 (ξ ).

j=0

k=0

(10) d

The scalar curvature is : R := k=0 Rkk . The indexes of the curvature tensor ((Ri j k )) and of the Ricci tensor ((Rik )) are lowered or raised by means of the Minkowski tensor ((ηab )) and its inverse ((ηab )). For example, we have : R p jqr = Ri j k ηi p ηkq η r , and Ri j = Rik ηk j .


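Remark 2.2.1. The curvature and Ricci operators and tensors are symmetrical: $\langle R(a\wedge b), v\wedge w\rangle_\eta = \langle a\wedge b, R(v\wedge w)\rangle_\eta$, and $\langle \mathrm{Ricci}(v), w\rangle_\eta = \langle v, \mathrm{Ricci}(w)\rangle_\eta$, for any $a, b, v, w \in \mathbb{R}^{1,d}$. Equivalently, for $0\le i,j,k,\ell\le d$: $R_{ij}{}^{k\ell} = R^{k\ell}{}_{ij}$, and $R_{ij} = R_{ji}$.

The energy-momentum tensor $((T_j^k))$ and operator $T_\xi$ are defined as:

$$T_j^k := R_j^k - \frac12\,R\,\delta_j^k \quad\text{and}\quad T_\xi := \mathrm{Ricci}_\xi - \frac12\,R. \tag{11}$$

Note that $\sum_{j=0}^d T_j^j = -\frac{d-1}{2}\,R$. The energy at any line-element $(\xi,\dot\xi) \in T^1M$ is

$$E(\xi,\dot\xi) := \big\langle T_\xi(\dot\xi), \dot\xi\big\rangle_{g(\xi)} = T_{00}(\xi,\dot\xi). \tag{12}$$

The last equality is easily derived from (10) and (11), since writing $(\xi,\dot\xi) = \big(\pi(u), e_0(u)\big)$ for any $u \in \pi_1^{-1}(\xi,\dot\xi)$ and $T_{ij} = T_i^k\eta_{kj}$, we have:

$$\big\langle T_\xi(\dot\xi), \dot\xi\big\rangle_{g(\xi)} = g\big(T_\xi(e_0), e_0\big) = g\big(T_0^k e_k, e_0\big) = T_0^k\,\eta_{k0} = T_{00} = T_{00}(\xi,\dot\xi).$$

The weak energy condition (see [H-E]) stipulates that $E(\xi,\dot\xi) \ge 0$ on the whole $T^1M$.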

We shall need the following general computation rule. Lemma 2.2.2. For 0 ≤ i, j, k, , p, q ≤ d , we have: Vq p Ri j k = ηqi R pj k − ηi p Rq j k + ηq j Ri p k − η j p Riq k + δqk Ri j p −δ kp Ri jq − δq Ri j p k + δ p Ri jq k . Proof. Using (8) and (1), we have indeed: Vq p Ri j k = Vq p R(ei ∧ e j ), en ∧ em η ηkn η m = R (ηqi e p − ηi p eq ) ∧ e j , en ∧ em

η

+ R ei ∧ (ηq j e p − η j p eq ) , en ∧ em ηkn η m +

η

R(ei ∧ e j ), (ηqn e p − ηnp eq ) ∧ em

η

+ R(ei ∧ e j ), en ∧ (ηqm e p − ηmp eq ) ηkn η m η

= ηqi R pj −ηi p Rq j +ηq j Ri p −η j p Riq k +δqk Ri j p k

k

k

− δ kp Ri jq − δq Ri j p k + δ p Ri jq k .


2.3. Expressions in local coordinates. Consider local coordinates (ξ i , ekj ) for u = (ξ, e0 , . . . , ed ) ∈ G(M), with e j = ekj ∂ξ∂ k . Then the horizontal and vertical vector fields Vi j , V j , Hk , which satify the commutation relations (4),(6),(7) of the preceding Sect. 2.2, read as follows. First, denoting by k j = jk the Christoffel coefficients of the Levi-Civita connexion ∇, we have for 0 ≤ i, j ≤ d: ∇

∂ ∂ξ i

∂ ∂ = ikj (ξ ) k ∂ξ j ∂ξ

and

H j = ekj

∂ ∂ − ekj eim km (ξ ) . ∂ξ k ∂ei

This is consistent with (5). Indeed, we have a priori an expression H j = a kj ), dξ k

+b ji

∂ ∂ei

,

with on one hand = T π(H j = e j = . On the other hand, for a −1 curve γ satisfying γ0 = ξ , γ˙0 = e j , denoting by e the matrix inverse to e ≡ ((eik )) we have : do γ −1 −1 i γ ∂ ∂ ekj km (e )m ei = ∇γ˙0 m = ∇γ˙0 (e−1 )im ei = u t t ∂ξ ∂ξ dt ∂ ∂ do −1 i γ (e )m t u ei = H j (e−1 )im ei = −(e−1 )im H j (ei ) = dt ∂ξ ∂ξ a kj

, dξ k

∂ ∂ξ k

(13)

ekj

C1

as wanted. by (5), so that b ji = H j (ei ) = −ekj eim km Recall that the Christoffel coefficients of ∇ are computed by : ∂g ∂gi j 1 ∂gi j , + − ikj = g k 2 ∂ξ i ∂ξ j ∂ξ

or equivalently, by the fact that geodesics solve ξ¨ k + ikj ξ˙ i ξ˙ j = 0 . Then for 0 ≤ i, j, k ≤ d : Vi j ek = ηik e j − η jk ei = (ηik e j − η jk ei )

∂ ∂ ∂ = (ηiq emj − η jq eim ) m ek , ∂eq ∂ξ ∂ξ

whence for 0 ≤ i, j ≤ d : Vi j = (ηiq emj − η jq eim ) that is : Vi j = eik

∂ ∂ekj

− ekj

∂ ∂eik

and V j = e0k

∂ ∂ekj

+ ekj

∂ , ∂eqm

∂ ∂e0k

(14)

, for 1 ≤ i, j ≤ d .

The curvature operator is expressed in a local chart as : for 0 ≤ m, n, p, q ≤ d , ∂ ∂ ∂ ∂ mnpq := R , R ∧ ∧ q m n p ∂ξ ∂ξ ∂ξ ∂ξ g r r ∂ ∂np nq r s r s . (15) = gmr ps nq − qs np + − ∂ξ p ∂ξ q Then, the Ricci operator can be computed similarly, as : for 0 ≤ m, p ≤ d , n n ∂mp ∂ ∂ ∂mn q q n mnpq g nq = nq . R˜ mp := Ricci( m ) , p = R mp − npq mn + − n ∂ξ ∂ξ g ∂ξ ∂ξ p (16)


The scalar curvature and the energy-momentum operator can be computed by : R = R˜ i j g i j

1 T˜ m = R˜ m − R g m . 2

and

(17)

To summarize, the Riemann curvature tensor ((Ri j k )) is made of the coordinates of the curvature operator R in an orthonormal moving frame, and its indexes are lowered or raised by means of the Minkowski tensor ((ηab )), while the curvature tensor (( R˜ mnpq )) is made of the coordinates of the curvature operator in a local chart, and its indexes are lowered or raised by means of the metric tensor ((gab )). To go from one tensor to the other, note that by (15) and (8) we have R ∂ξ∂m ∧ ∂ξ∂ n = 1 ab ∂ ∂ , k pq = Ri j mn emp enq , or equivalently: Rmn whence : ek e R a ∧ b 2

∂ξ

∂ξ

i

k r s ek e ear es , Ri jab = R i j b

j

or as well :

r spq = Rabmn ear es emp enq . R b

(18)

2.4. Case of a perfect fluid. The energy-momentum tensor $T$ (of (11), or equivalently $\tilde T$, recall (17)) is associated to a perfect fluid (see [H-E]) if it has the form:
\[ \tilde T_{k\ell} = q\, U_k U_\ell - p\, g_{k\ell}, \tag{19} \]
for some $C^1$ field $U$ in $T^1M$ (which represents the velocity of the fluid), and some $C^1$ functions $p, q$ on $M$. By Einstein Equations (17), (19) is equivalent to:
\[ \tilde R_{k\ell} = q\, U_k U_\ell + \tilde p\, g_{k\ell}, \qquad\text{with}\qquad \tilde p = (2p - q)/(d-1), \tag{20} \]
or as well, by (16), to:
\[ \langle \mathrm{Ricci}(W), W\rangle_\eta = q \times g(U, W)^2 + \tilde p \times g(W, W), \qquad\text{for any } W \in TM. \tag{21} \]
The quantity $\langle U(\xi_s), \dot\xi_s\rangle$ (which is the hyperbolic cosine of the distance, on the unit hyperboloid at $\xi_s$ identified with the hyperbolic space, between the space-time velocities of the fluid and the path) will be denoted by $A_s$ or $A(\xi_s, \dot\xi_s)$. Note that necessarily $A_s \ge 1$. By Formulas (12) and (19), the energy equals:
\[ E(\xi, \dot\xi) = q(\xi)\, A(\xi, \dot\xi)^2 - p(\xi). \tag{22} \]
Remark 2.4.1. (i) The energy of the fluid is simply: $\tilde T_{k\ell}\, U^k U^\ell = q - p$, and the scalar curvature equals $R = 2\,[(d+1)\,p - q]/(d-1)$. (ii) By (22), the weak energy condition reads here: $q \ge p^+$.
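The algebra linking (19), (20) and Remark 2.4.1(i) can be checked with exact arithmetic. The following is only a minimal sketch under stated assumptions, not part of the original argument: it works in an orthonormal frame where the metric is the Minkowski tensor diag(1, -1, ..., -1) and the fluid velocity is the first frame vector, and the helper name `perfect_fluid_check` is hypothetical.

```python
from fractions import Fraction


def perfect_fluid_check(d, p, q):
    """Check (20) and Remark 2.4.1(i) from (17) and (19).

    Assumptions (for illustration only): orthonormal frame with
    eta = diag(1, -1, ..., -1) and fluid velocity U = e_0.
    Returns (R, p_tilde, ok), where ok says whether
    Ricci == q U U + p_tilde eta holds entrywise.
    """
    n = d + 1
    eta = [[0] * n for _ in range(n)]
    for i in range(n):
        eta[i][i] = 1 if i == 0 else -1
    U = [1] + [0] * d  # covariant components U_k of e_0

    # (19): T_kl = q U_k U_l - p eta_kl
    T = [[q * U[k] * U[l] - p * eta[k][l] for l in range(n)] for k in range(n)]

    # Trace with the inverse metric (eta is its own inverse, and diagonal).
    trace_T = sum(eta[k][k] * T[k][k] for k in range(n))

    # Taking the trace of (17): Tr(T) = R - (d+1) R / 2 = -(d-1) R / 2.
    R = Fraction(-2) * trace_T / (d - 1)

    # (17) solved for the Ricci tensor: Ricci = T + (R/2) eta.
    ricci = [[T[k][l] + R / 2 * eta[k][l] for l in range(n)] for k in range(n)]

    p_tilde = Fraction(2 * p - q) / (d - 1)
    expected = [[q * U[k] * U[l] + p_tilde * eta[k][l] for l in range(n)]
                for k in range(n)]
    return R, p_tilde, ricci == expected
```

For instance, `perfect_fluid_check(3, Fraction(1, 3), Fraction(5, 2))` compares the Ricci tensor obtained from (17) with the right-hand side of (20), and returns the scalar curvature alongside, which can be compared with the closed form of Remark 2.4.1(i).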

3. Covariant $\Xi$-Relativistic Diffusions

Let $\Xi$ denote a non-negative smooth function on $G(M)$, invariant under the right action of $SO(d)$ (so that it identifies with a function on $T^1M$). Our examples will be $\Xi = -\varrho^2 R$ and $\Xi = \varrho^2 E$ (for a positive constant $\varrho$). We call $\Xi$-relativistic diffusion or $\Xi$-diffusion the $G(M)$-valued diffusion process $(\Phi_s)$ associated to $\Xi$ that we will construct in Sect. 3.2 below, as well as its $T^1M$-valued projection $\pi_1(\Phi_s)$. Let us first recall our previous construction, which corresponds to a constant $\Xi$.


3.1. The basic relativistic diffusion. The relativistic diffusion process $(\xi_s, \dot\xi_s)$ was defined in [F-LJ], as the projection under $\pi_1$ of the $G(M)$-valued diffusion $(\Phi_s)$ solving the following Stratonovitch stochastic differential equation (for a given $\mathbb R^d$-valued Brownian motion $(w^j_s)$ and some fixed $\varrho > 0$):
\[ d\Phi_s = H_0(\Phi_s)\, ds + \varrho \sum_{j=1}^d V_j(\Phi_s) \circ dw^j_s . \tag{23} \]
The infinitesimal generator of the $G(M)$-valued relativistic diffusion $(\Phi_s)$ is
\[ H := H_0 + \frac{\varrho^2}{2} \sum_{j=1}^d V_j^2 , \tag{24} \]
and the infinitesimal generator of the relativistic diffusion $(\xi_s, \dot\xi_s) := \pi_1(\Phi_s)$ is the relativistic operator:
\[ H^1 := L_0 + \frac{\varrho^2}{2}\, \Delta_v = \dot\xi^k \frac{\partial}{\partial\xi^k} + \Big(\frac{d\,\varrho^2}{2}\, \dot\xi^k - \dot\xi^i \dot\xi^j\, \Gamma^k_{ij}(\xi)\Big) \frac{\partial}{\partial\dot\xi^k} + \frac{\varrho^2}{2}\, \big(\dot\xi^k \dot\xi^\ell - g^{k\ell}(\xi)\big) \frac{\partial^2}{\partial\dot\xi^k\, \partial\dot\xi^\ell} , \tag{25} \]
where $L_0$ denotes the vector field on $T^1M$ generating the geodesic flow, and $\Delta_v$ denotes the vertical Laplacian, i.e. the Laplacian on $T^1_\xi M$ equipped with the hyperbolic metric induced by $g(\xi)$. The relativistic diffusion $(\xi_s, \dot\xi_s)$ is parametrized by proper time $s \ge 0$, possibly till some positive explosion time. In local coordinates $(\xi^i, e^k_j)$, setting $\Phi_s = (\xi^i_s, e^k_j(s))$, Eq. (23) becomes locally equivalent to the following system of Itô equations:
\[ d\xi^k_s = \dot\xi^k_s\, ds = e^k_0(s)\, ds ; \qquad d\dot\xi^k_s = -\Gamma^k_{i\ell}(\xi_s)\, \dot\xi^i_s\, \dot\xi^\ell_s\, ds + \varrho \sum_{i=1}^d e^k_i(s)\, dw^i_s + \frac{d\,\varrho^2}{2}\, \dot\xi^k_s\, ds , \]
and
\[ de^k_j(s) = -\Gamma^k_{i\ell}(\xi_s)\, e^\ell_j(s)\, \dot\xi^i_s\, ds + \varrho\, \dot\xi^k_s\, dw^j_s + \frac{\varrho^2}{2}\, e^k_j(s)\, ds , \qquad\text{for } 1 \le j \le d ,\ 0 \le k \le d. \]

Remark 3.1.1. We have on $T^1M$:
\[ \sum_{j=1}^d V_j^2\, E = 2(d+1)\, E - 2\, \mathrm{Tr}(T) = 2(d+1)\, E + (d-1)\, R. \tag{26} \]
Indeed, since for each $1 \le j \le d$ $V_j$ exchanges the basis vectors $e_0 = \dot\xi$ and $e_j$ (recall (14)) we get: $V_j^2 E = 2\, V_j(\tilde T_{\ell m}\, \dot\xi^\ell e^m_j) = 2\, \tilde T_{\ell m}\, (\dot\xi^\ell \dot\xi^m + e^\ell_j e^m_j)$, whence
\[ \sum_{j=1}^d V_j^2\, E = 2d\, E + 2\, \tilde T_{\ell m}\, (\dot\xi^\ell \dot\xi^m - g^{\ell m}) = 2(d+1)\, E - 2\, \mathrm{Tr}(\tilde T) , \]
and Formulas (26) follow at once by (11). As an application, a direct computation yields the following evolution of the energy.


Remark 3.1.2. The random energy process $E_s = E(\xi_s, \dot\xi_s)$ associated to the basic relativistic diffusion $\pi_1(\Phi_s) = (\xi_s, \dot\xi_s)$ satisfies the following equation (where $\nabla_v := v^j \nabla_j$):
\[ dE_s = \nabla_{\dot\xi_s} E_s\, ds + \varrho^2 \Big[(d+1)\, E_s + \frac{d-1}{2}\, R(\xi_s)\Big] ds + dM^E_s , \]
with the quadratic variation of its martingale part $dM^E_s$ given by:
\[ [dE_s, dE_s] = [dM^E_s, dM^E_s] = 4\varrho^2\, \big[E_s^2 - \langle \tilde T\dot\xi_s, \tilde T\dot\xi_s\rangle\big]\, ds. \]
(We have here in particular $\nabla_{\dot\xi_s} E_s = \big(\partial_{\xi^k} \tilde T_{ij}(\xi_s) - 2\, \tilde T_{i\ell}(\xi_s)\, \Gamma^\ell_{jk}(\xi_s)\big)\, \dot\xi^i_s\, \dot\xi^j_s\, \dot\xi^k_s$.) Note that the energy $E_s$ is not, in general, a Markov process.

3.2. Construction of the $\Xi$-diffusion. Let us start with the following Stratonovitch stochastic differential equation on $G(M)$ (for a given $\mathbb R^d$-valued Brownian motion $(w^j_s)$):
\[ d\Phi_s = H_0(\Phi_s)\, ds + \frac{1}{4} \sum_{j=1}^d V_j\Xi(\Phi_s)\, V_j(\Phi_s)\, ds + \sum_{j=1}^d \sqrt{\Xi(\Phi_s)}\; V_j(\Phi_s) \circ dw^j_s . \tag{27} \]

Note that all coefficients in this equation are clearly smooth, except $\sqrt{\Xi}$ on its vanishing set $\Xi^{-1}(0)$. However, $\sqrt{\Xi}$ is a locally Lipschitz function; see ([I-W], Prop. IV.6.2). Hence, Eq. (27) does define a unique $G(M)$-valued diffusion $(\Phi_s)$. We have the following proposition, defining the $\Xi$-relativistic diffusion (or $\Xi$-diffusion) $(\Phi_s)$ on $G(M)$, and $(\xi_s, \dot\xi_s)$ on $T^1M$, possibly till some positive explosion time.

Proposition 3.2.1. The Stratonovitch stochastic differential equation (27) has a unique solution $(\Phi_s) = (\xi_s; \dot\xi_s, e_1(s), \ldots, e_d(s))$, possibly defined till some positive explosion time. This is a $G(M)$-valued covariant diffusion process, with generator
\[ H_\Xi := H_0 + \frac{1}{2} \sum_{j=1}^d V_j\, \Xi\, V_j . \tag{28} \]
Its projection $\pi_1(\Phi_s) = (\xi_s, \dot\xi_s)$ defines a covariant diffusion on $T^1M$, with $SO(d)$-invariant generator
\[ H^1_\Xi := L_0 + \frac{1}{2}\, \nabla^v\, \Xi\, \nabla^v , \tag{29} \]
$\nabla^v$ denoting the gradient on $T^1_\xi M$ equipped with the hyperbolic metric induced by $g(\xi)$. Moreover, the adjoint of $H_\Xi$ with respect to the Liouville measure of $G(M)$ is $H_\Xi^* := -H_0 + \frac{1}{2} \sum_{j=1}^d V_j\, \Xi\, V_j$. In particular, if there is no explosion, then the Liouville measure is invariant. Furthermore, if $\Xi$ does not depend on $\dot\xi$, i.e. is a function on $M$, then the Liouville measure is preserved by the stochastic flow defined by Eq. (27). We specify at once how this looks in a local chart, before giving a proof for both statements.


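Corollary 3.2.2. The $T^1M$-valued $\Xi$-diffusion $(\xi_s, \dot\xi_s)$ satisfies $d\xi_s = \dot\xi_s\, ds$, and in any local chart, the following Itô stochastic differential equations: for $0 \le k \le d$ (denoting $\Xi_s = \Xi(\xi_s, \dot\xi_s)$),
\[ d\dot\xi^k_s = dM^k_s - \Gamma^k_{ij}(\xi_s)\, \dot\xi^i_s\, \dot\xi^j_s\, ds + \frac{d}{2}\, \Xi_s\, \dot\xi^k_s\, ds + \frac{1}{2}\, \big[\dot\xi^k_s \dot\xi^\ell_s - g^{k\ell}(\xi_s)\big]\, \frac{\partial\Xi}{\partial\dot\xi^\ell}(\xi_s, \dot\xi_s)\, ds , \tag{30} \]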

with the quadratic covariation matrix of the martingale term (d Ms ) given by: [d ξ˙sk , d ξ˙s ] = [ξ˙sk ξ˙s − g k (ξs )] s ds , for 0 ≤ k, ≤ d. Proof. In local coordinates (ξ i , ekj ), = (ξ ; e0 , e1 , . . . , ed ), using Sect. 2.3, Eq. (27) reads : for any 0 ≤ k ≤ d , dξsk = ξ˙sk ds = e0k (s) ds , k d ξ˙sk = −i (ξs ) ξ˙si ξ˙s ds +

d d ! 1 j V j (ξs , ξ˙s ) ekj (s) ds + (ξs , ξ˙s ) ekj (s) ◦ dws ; 4 j=1

j=1

and for 1 ≤ j ≤ d, 0 ≤ k ≤ d, k (ξs ) e j (s) ξ˙si ds + dekj (s) = −i

1 V j (ξs , ξ˙s ) ξ˙sk ds + 4

! j (ξs , ξ˙s ) ξ˙sk ◦ dws .

We compute now the Itô corrections, which involve the partial derivatives of with respect to ξ˙ . We get successively, for 1 ≤ j ≤ d , 0 ≤ k ≤ d : ! ! j k ˙ ˙ (ξs , ξs ) d (ξs , ξs ) e j (s) , dws

(ξ, ξ˙ )

1 j j = (ξs , ξ˙s ) dekj (s), dws + ekj (s) d (ξs , ξ˙s ), dws 2 ! 1 ∂ 3/2 ˙ k k ˙ ˙ = (ξs , ξs ) ξs ds + e j (s) (ξs , ξs ) (ξs , ξ˙s ) e j (s) ds, 2 ∂ ξ˙ hence, ! 1 j k ˙ d (ξs , ξs ) e j (s) , dws = (ξs , ξ˙s ) ξ˙sk ds + V j (ξs , ξ˙s ) ekj (s) ds; 2 ! ! j (ξs , ξ˙s ) d (ξs , ξ˙s ) ξ˙sk , dws 1 j j = (ξs , ξ˙s ) d ξ˙sk , dws + ξ˙sk d (ξs , ξ˙s ), dws 2 ! ∂ 1 = (ξs , ξ˙s )3/2 ekj (s) ds + ξ˙sk (ξs , ξ˙s ) (ξs , ξ˙s ) e j (s) ds , 2 ∂ ξ˙ hence, ! 1 j (ξs , ξ˙s ) ξ˙sk , dws = (ξs , ξ˙s ) ekj (s) ds + V j (ξs , ξ˙s ) ξ˙sk ds. d 2


Note that the simplification by (ξs , ξ˙s ) is allowed, since on −1 (0) both sides of the simplified formula vanish identically. Hence, in local coordinates and in Itô form, Eq. (27) reads : for any 0 ≤ k ≤ d , j dξsk = ξ˙sk ds = e0k (s) ds , and setting d Msk := dj=1 (ξs , ξ˙s ) ekj (s) dws , k (ξs ) ξ˙si ξ˙s ds + d ξ˙sk = d Msk − i

d (ξs , ξ˙s ) ξ˙sk ds 2

∂ 1 + [ξ˙sk ξ˙s − g k (ξs )] (ξs , ξ˙s ) ds; 2 ∂ ξ˙ ! 1 j k dekj (s) = (ξs , ξ˙s ) ξ˙sk dws − i (ξs ) e j (s) ξ˙si ds + (ξs , ξ˙s ) ekj (s) ds 2 1 k + V j (ξs , ξ˙s ) ξ˙s ds. 2 d k k ˙k ˙ Note that we used the formula j=1 e j (s)e j (s) = ξs ξs − g (ξs ) , which expresses that ∈ G(M). Using again this formula, we get the quadratic covariation matrix of the martingale term (d Ms ), displayed in the above proposition, which shows that (π1 (s )) is indeed a diffusion, and proves Corollary 3.2.2. On the other hand, comparing the above equations with Eqs. (23) and (24), which correspond to ≡ 2 , we get precisely the wanted form (28) for the generator of (s ). Then, since ∇ v ∇ v = v + (∇ v ) ∇ v , comparing the above Eq. (30) for ξ˙sk with Eq. (25) (for which ≡ 2 ), we see that establishing Formula (29) giving the projected 1 reduces now to proving that (∇ v ) ∇ v ≡ [ξ˙ k ξ˙ − g k ] ∂ ∂ . Now generator H s s ∂ ξ˙ ∂ ξ˙ k

this becomes clear, noting that ∇ vj = ekj ∂ ∂ξ˙ k , for each 1 ≤ j ≤ d . Finally, the assertions relative to the Liouville measure are direct consequences of the fact that the vectors H0 and V j are antisymmetric with respect to the Liouville measure of G(M). (The invariance with respect to H0 , i.e. to the geodesic flow, is proved in the same way as in the Riemannian case, and the invariance with respect to V j is straightforward.)

Remark 3.2.3. (i) The vertical terms could be seen as an effect of the matter or the radiation present in the space-time $M$. The $\Xi$-diffusion $(\Phi_s)$ reduces to the geodesic flow in the regions of the space where $\Xi$ vanishes, which happens in particular for empty space-times $M$ in the cases $\Xi = -\varrho^2 R(\xi)$, or $\Xi = \varrho^2 E(\xi, \dot\xi)$, or also $\Xi = -\varrho^2 R(\xi)\, e^{\kappa E(\xi, \dot\xi)/R(\xi)}$ (for positive constant $\kappa$) for example. (ii) As for the basic relativistic diffusion, the law of the $\Xi$-relativistic diffusion is covariant with any isometry of $(M, g)$. The basic relativistic diffusion corresponds to $\Xi \equiv \varrho^2 > 0$, and the geodesic flow to $\Xi \equiv 0$. (iii) In [B] a general model for relativistic diffusions is considered, which may be covariant or not. Up to enlarging it slightly, by allowing the "rest frame" (denoted by $z$ in [B]) to have space vectors of non-unit norm, this model includes the generic $\Xi$-diffusion: compare the above Eq. (27) to (2.5), (3.3) in [B].

3.3. The R-diffusion. We assume here that the scalar curvature $R = R(\xi)$ is everywhere non-positive on $M$, which is physically relevant (see [L-L]), and consider the particular case of Sect. 3.2 corresponding to $\Xi = -\varrho^2 R(\xi)$, with a constant positive parameter $\varrho$.


In this case, as its central term clearly vanishes, Eq. (27) takes on the simple form:
\[ d\Phi_s = H_0(\Phi_s)\, ds + \varrho \sum_{j=1}^d \sqrt{-R(\Phi_s)}\; V_j(\Phi_s) \circ dw^j_s . \]

3.4. The E-diffusion. We assume that the Weak Energy Condition (recall Sect. 2.2) holds (everywhere on T 1 M), which is physically relevant (see [L-L,H-E]), and consider the particular case of Sect. 3.2 corresponding to = 2 E = 2 E(ξ, ξ˙ ) = 2 T00 . We call energy relativistic diffusion or E-diffusion the G(M)-valued diffusion process (s ) we get in this way, as well as its T 1 M-valued projection π1 (s ). The following is easily derived from Lemma 2.2.2 and Formula (10). As a consequence, the central drift term in Eq. (27) is a function of the Ricci tensor alone when is. Lemma 3.4.1. We have V j Rik = δ0i R kj − ηi j R0k + δ0k Ri j − δ kj R0i , for 0 ≤ i, k ≤ d and 1 ≤ j ≤ d . In particular, V j R = 0 , and V j E = V j T00 = V j R00 = 2R0 j . By Lemma 3.4.1, the drift term of Eq. (30) which involves the derivatives of equals here : 2 dj=1 R0 j (ξs ) ekj (s) ds . As we have R0 j = R˜ mn e0m enj by (16), we get the alternative expression : 2 dj=1 R0 j (ξs ) ek (s) ds = 2 R˜ mn (ξs ) ξ˙ m [ξ˙sk ξ˙sn − j

g kn (ξs )] ds . Another expression is got by using the Einstein equation (17), or equivalently, by computing directly from (30) and (12) : [ξ˙ k ξ˙ − g k ] ∂∂ξ˙E = 2[ξ˙ k ξ˙ − g k ] T˜ m ξ˙ m = 2[E ξ˙ − T˜ ξ˙ ]k , where the notation (T˜ ξ˙ )k ≡ T˜mk ξ˙ m has the meaning of a matrix product. Hence, Formula (30) of Proposition 3.2.2 expressing d ξ˙s reads here : j d ξ˙sk = d Msk − ikj (ξs ) ξ˙si ξ˙s ds +

2 d Es ξ˙sk ds + 2 R˜ mn (ξs ) ξ˙ m [ξ˙sk ξ˙sn −g kn (ξs )] ds, 2 (31)

or equivalently: d + 1) Es ξ˙s ds − 2 T˜ ξ˙s ds. (32) 2 We can then compute the equation satisfied by the random energy Es . In particular, the drift term discussed above for d ξ˙sk contributes now for: d ξ˙s = d Ms − i· j (ξs ) ξ˙si ξ˙s ds + 2 ( j

22 T˜km ξ˙ m [E ξ˙ − T˜ ξ˙ ]k = 22 (E T˜km ξ˙ m ξ˙ k − T˜km ξ˙ m [T˜ ξ˙ ]k ) = 22 (E 2 −[T˜ ξ˙ ]k [T˜ ξ˙ ]k ). This leads to the following, to be compared with Corollary 3.1.2. Remark 3.4.2. The random energy Es := E(ξs , ξ˙s ) associated to the E-diffusion (s ) satisfies the following equation (where ∇v := v j ∇ j ) : dEs = ∇ξ˙s E(ξs , ξ˙s ) ds + (d + 2) 2 Es2 ds − 22 g(T˜ ξ˙s , T˜ ξ˙s ) ds + 2 d MsE , with the quadratic variation of its martingale part d MsE given by: [dEs , dEs ] = 42 [d MsE , d MsE ] = 42 [Es2 − g(T˜ ξ˙s , T˜ ξ˙s )] Es ds. Note that the diffusion coefficient [Es2 −g(T˜ ξ˙s , T˜ ξ˙s )] appearing in the quadratic variation [dEs , dEs ] is necessarily non-negative, which can of course be checked directly. Remark 3.4.3. Case of Einstein Lorentz manifolds. The Lorentz manifold M is said to be Einstein if its Ricci tensor is proportional to its metric tensor. Bianchi’s contracted


identities (see [H-E] or [K-N]), which entail the conservation equations $\nabla_k \tilde T^k_j = 0$, force the proportionality coefficient $\tilde p$ to be constant on $M$. Hence:
\[ \tilde R_{\ell m}(\xi) = \tilde p\; g_{\ell m}(\xi) , \qquad\text{for any } \xi \text{ in } M \text{ and } 0 \le \ell, m \le d. \]
Then the scalar curvature is $R(\xi) = (d+1)\, \tilde p$, and by Einstein Equations (17) we have:
\[ \tilde T_{\ell m}(\xi) = -\Big(\frac{d-1}{2}\, \tilde p\Big)\, g_{\ell m}(\xi) =: -p\; g_{\ell m}(\xi). \]
Hence Eq. (19) holds, with $q = 0$: we are in a limiting case of the perfect fluid. Moreover, $R$ and $E$ are constant, so that on an Einstein Lorentz manifold, the R-diffusion and the E-diffusion coincide with the basic relativistic diffusion (of Sect. 3.1).

4. Warped (or Skew) Products

Let us consider here a Lorentz manifold $(M, g)$ having the warped product form: $M = I \times M$, where $I$ is an open interval of $\mathbb R_+$ and $(M, h)$ is a Riemannian manifold, and is endowed with the Lorentzian pseudo-norm $g$ given by:
\[ ds^2 := dt^2 - \alpha(t)^2\, |dx|^2_h , \tag{33} \]
or equivalently
\[ g_{0k} := \delta_{0k} \qquad\text{and}\qquad g_{ij} := -\alpha(t)^2\, h_{ij}(x) , \qquad\text{for } 0 \le k \le d ,\ 1 \le i, j \le d. \tag{34} \]
Here $\xi \equiv (t, x) \in I \times M$ denotes the generic point of $M$, and the expansion factor $\alpha$ is a positive $C^2$ function on $I$. The so-called Hubble function is:
\[ H(t) := \alpha'(t)/\alpha(t). \tag{35} \]
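The formulas that follow repeatedly trade second derivatives of the expansion factor for the derivative of the Hubble function. The step is elementary and not displayed in the text; it is spelled out here as a short worked computation from the definition (35), the power-law case being the one considered later in Sect. 5.1:

```latex
H'(t) \;=\; \frac{\alpha''(t)}{\alpha(t)} - \frac{\alpha'(t)^2}{\alpha(t)^2}
      \;=\; \frac{\alpha''(t)}{\alpha(t)} - H(t)^2 ,
\qquad\text{so that}\qquad
\alpha^2 H' \;=\; \alpha\,\alpha'' - \alpha'^2 .
% Example: \alpha(t) = t^c gives H(t) = c/t and H'(t) = -c/t^2 .
```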

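This structure is considered in [B-E], which contains most of the following proposition.

Proposition 4.1. Consider a Lorentz manifold $(M, g)$ having the warped product form. (i) Its curvature operator $\mathcal R$ is given by: for $u, v, w, a \in C^2(I)$ and $X, Y, Z, A \in \Gamma(TM)$,
\[ \big\langle \mathcal R\big((u\partial_t + X) \wedge (v\partial_t + Y)\big), (a\partial_t + A) \wedge (w\partial_t + Z)\big\rangle_g = -\alpha^2\, \langle K(X \wedge Y), A \wedge Z\rangle + \alpha\alpha''\, h(uY - vX, aZ - wA) + (\alpha\alpha')^2\, [h(X, Z)\, h(Y, A) - h(X, A)\, h(Y, Z)], \tag{36} \]
$K$ denoting the curvature operator of $(M, h)$. (ii) Denoting by $\mathrm{Ric}$ the Ricci operator of $(M, h)$ and by $\langle\cdot,\cdot\rangle$ the standard canonical inner product of $\mathbb R^d$, we have:
\[ \langle \mathrm{Ricci}(v\partial_t + Y), w\partial_t + Z\rangle_\eta = \langle \mathrm{Ric}(Y), Z\rangle + [(d-1)\, |\alpha'|^2 + \alpha\alpha'']\, h(Y, Z) - d\, \frac{\alpha''}{\alpha}\, vw. \tag{37} \]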


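(iii) If $\nabla^g$ and $\nabla^h$ denote the Levi-Civita connections of $(M, g)$ and $(M, h)$ respectively:
\[ \nabla^g_{(u\partial_t + X)} (v\partial_t + Y) = \big(uv' + \alpha(t)\alpha'(t)\, h(X, Y)\big)\, \partial_t + \nabla^h_X Y + H(t)\, (uY + vX). \tag{38} \]
Remark 4.2. In any warped local chart $(t, x^j)$: for $1 \le m, n, p, q \le d$ ($K_{mnpq}$ denoting the curvature tensor of $(M, h)$, and $\tilde K_{mp}$ its Ricci tensor) and $0 \le k \le d$, we have
\[ \Gamma^{\cdot}_{00}(g) = 0 ; \quad \Gamma^p_{0n}(g) = H\, \delta^p_n ; \quad \Gamma^0_{mn}(g) = \alpha\alpha'\, h_{mn} ; \quad \Gamma^p_{mn}(g) = \Gamma^p_{mn}(h) , \tag{39} \]
\[ \tilde R_{0nkq} = \delta_{0k}\, \alpha(t)\alpha''(t)\, h_{nq} ; \quad \tilde R_{mnpq} = (\alpha\alpha')^2(t)\, [h_{mq} h_{np} - h_{mp} h_{nq}] - \alpha^2(t)\, K_{mnpq} , \tag{40} \]
\[ \tilde R_{0k} = -\delta_{0k}\, d \times \alpha''(t)/\alpha(t) ; \quad \tilde R_{mp} = \tilde K_{mp} + [(d-1)\, |\alpha'|^2 + \alpha\alpha'']\, h_{mp}. \tag{41} \]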

Let us outline the proof for the convenience of the reader. Proof. We get (38), and then (39), by using the Koszul formula : 2 h(∇ Xh Y, Z ) = X h(Y, Z ) + Y h(X, Z ) − Z h(X, Y ) + h([X, Y ], Z ) −h([X, Z ], Y ) + h([Z , Y ], X ), for both ∇ h and ∇ g . Hence, g g g ∇(u∂t +X ) ∇(v∂t +Y ) (w∂t + Z ) = ∇(u∂t +X ) [vw +αα h(Y, Z )]∂t + ∇Yh Z + H [v Z + wY ] = u[v w + vw + (αα ) h(Y, Z )] + αα X h(Y, Z ) + h(X, ∇Yh Z ) + H [vh(X, Z ) + wh(X, Y )] ∂t + ∇ Xh ∇Yh Z + H [v ∇ Xh Z + w ∇ Xh Y ] + u H [v Z + w Y ] + u H [v Z + w Y ] , + H u ∇Yh Z + u H [v Z + wY ] + [vw + αα h(Y, Z )] X . Therefore g g ∇(u∂ +X ) , ∇(v∂ +Y ) (w∂t + Z ) t t = [uv − u v]w + (αα ) [u h(Y, Z ) − v h(X, Z )] + αα [X h(Y, Z ) − Y h(X, Z )] ∂t h Z )] + α 2 H 2 [v h(X, Z ) − u h(Y, Z )] ∂ + αα [h(X, ∇Yh Z ) − h(Y, ∇ X t h , ∇ h ]Z + H (v ∇ h Z − u ∇ h Z + w [X, Y ]) + (H w) [u Y − v X ] + H [uv − u v] Z +[∇ X X Y Y h Z + (H w − w )(u Y − v X ) + αα [h(Y, Z ) X − h(X, Z ) Y ] . +H u ∇Yh Z − v ∇ X

And since g

g

∇[(u∂t +X ),(v∂t +Y )] (w∂t + Z ) = ∇([uv −u v] ∂t +[X,Y ]) (w∂t + Z ) h = [uv −u v]w +αα h([X, Y ], Z ) ∂t + ∇[X,Y ] Z + H ([uv − u v] Z + w [X, Y ]) ,


we get

g g g ∇(u∂t +X ) , ∇(v∂t +Y ) − ∇[(u∂t +X ),(v∂t +Y )] (w∂t + Z ) = (αα ) [u h(Y, Z ) − v h(X, Z )] + αα [X h(Y, Z ) − Y h(X, Z )] ∂t + αα [h(X, ∇Yh Z ) − h(Y, ∇ Xh Z ) − h([X, Y ], Z )] + α 2 H 2 [v h(X, Z ) − u h(Y, Z )] ∂t h h h + [∇ Xh , ∇Yh ] − ∇[X,Y ] Z + H (v ∇ X Z − u ∇Y Z ) + (H w) [u Y − v X ] +H ∇uh Y −v X Z + H (H w − w )(u Y − v X ) + α 2 H 2 [h(Y, Z ) X − h(X, Z ) Y ] h = αα h(uY −v X, Z )∂t +([∇ Xh , ∇Yh ]−∇[X,Y ] )Z +

α w(uY −v X ) + α 2 [h(Y, Z )X − h(X, Z )Y ]. α

By Formula (9), this entails Formula (36), which is equivalent to (40), by (15).Then, denoting by (e1 , . . . , ed ) an orthonormal basis of (M, h) : Ricci(v∂t + Y ), w∂t + Z η = R (∂t ∧ (v∂t + Y )) , ∂t ∧ (w∂t + Z ) η

− α(t)−2

d

R e j ∧ (v∂t + Y ) , e j ∧ (w∂t + Z ) , η

j=1

which yields (37) ; which is in turn equivalent to (41), by (16).

Corollary 4.3. A Lorentz manifold (M, g) having the warped product form is of perfect fluid type (recall Sect. 2.4) if and only if the Ricci operator of its Riemannian factor M is conformal to the identical map : Ric = × Id , for some ∈ C 0 (M). If this holds, we must have : U = ∂t , p˜ + q = −d (α /α), and q = α −2−(d−1)H . Proof. By (21) and (37), this happens if and only if for any v ∈ C 0 (I ), Y ∈ (T M) : Ric(Y ), Y + [(d − 1)α 2 + αα ]h(Y, Y ) − d(α /α)v 2 = q g(U, v∂t + Y )2 + p˜ [v 2 − α 2 h(Y, Y )] , ˜ = −d (α /α) which (seen as a polynomial in v) forces U = ∂t , and then splits into p+q 2 2 Ric(Y and ), Y = −[(d − 1)α + αα + p˜ α ] h(Y, Y ). This latter equation is equivalent to p˜ = −[(d − 1)H 2 + (α /α) + α −2 ] , and then, using p˜ + q = −d (α /α), to q = α −2 − (d − 1)H . Corollary 4.4. Consider a Lorentz manifold (M, g) having the warped product form. 1 (I × M) equals: The energy (12) at ξ˙ ≡ (t˙, x) ˙ ∈ Tξ1 M ≡ T(t,x) E(ξ, ξ˙ ) = Ric(x), ˙ x ˙ − (d − 1) H (t) (t˙2 − 1) +

1 RM . d (d − 1) H 2 (t) + 2 2 α 2 (t) (42)

Then the weak energy condition is equivalent to the following lower bounds for the Ricci operator and the scalar curvature R M of the Riemannian factor (M, h): inf

w∈T M

Ric(w), w ≥ (d − 1) sup {α α − α 2 } ; R M ≥ − d (d − 1) inf α 2 . I h(w, w) I (43)


And the scalar curvature $R$ of $(M, g)$ equals:
\[ R = -\alpha^{-2}\, R^M - d\, \Big[(d-1)\, \frac{\alpha'^2}{\alpha^2} + 2\, \frac{\alpha''}{\alpha}\Big] , \tag{44} \]
so that its non-positivity is equivalent to the lower bound on the scalar curvature $R^M$:
\[ R^M \ge -d \times \inf_I\, \{(d-1)\, \alpha'^2 + 2\, \alpha\alpha''\}. \tag{45} \]

Proof. From (37), we compute the scalar curvature of M: R = Ricci(∂t ), ∂t η −α −2

d j=1

$

% 2 α α Ricci(e j ), e j η = −α −2 R M −d (d −1) +2 . α α

1 (I × M): On the other hand, by (37) we have at ξ˙ ≡ (t˙, x) ˙ ∈ Tξ1 M ≡ T(t,x)

t˙2 − 1 α 2 − d t˙ α2 α α = Ric(x), ˙ x ˙ − (d − 1) H (t) (t˙2 − 1) − d (t). α

˙ x ˙ + [(d − 1) |α |2 + αα ] Ricci(ξ˙ ), ξ˙ η = Ric(x),

Thence Formula (42). Then, the weak energy condition holds if and only if for any t ∈ I and w ∈ T M : Ric(w), w ≥ (d − 1) α 2 (t)H (t) h(w, w) −

1 RM . d (d − 1) H 2 (t) − 2 2 α 2 (t)

By homogeneity with respect to w , this can be split into the following lower bound for the Ricci operator of the Riemannian factor M: inf

w∈T M

Ric(w), w ≥ (d − 1) sup {α α − α 2 } , h(w, w) I

together with the condition particularised to w = 0 , which yields the following lower bound for the scalar curvature of the Riemannian factor M: d(d − 1) H 2 (t) + α(t)−2 R M ≥ 0 , or

R M ≥ − d (d − 1) inf α 2 . I

5. Example of Robertson-Walker (R-W) Manifolds

These important manifolds are particular cases of a warped product: they can be written $M = I \times M$, where $I$ is an open interval of $\mathbb R_+$ and $M \in \{S^3, \mathbb R^3, H^3\}$, with spherical coordinates $\xi \equiv (t, r, \varphi, \psi)$ (which are global in the case of $\mathbb R^3$, $H^3$, and are defined separately on two hemispheres in the case of $S^3$), and are endowed with the pseudo-norm:
\[ g(\dot\xi, \dot\xi) := \dot t^2 - \alpha(t)^2 \Big(\frac{\dot r^2}{1 - kr^2} + r^2 \dot\varphi^2 + r^2 \sin^2\!\varphi\; \dot\psi^2\Big) , \tag{46} \]


where the constant scalar spatial curvature $k$ belongs to $\{-1, 0, 1\}$ (note that $r \in [0, 1]$ for $k = 1$ and $r \in \mathbb R_+$ for $k = 0, -1$), and the expansion factor $\alpha$ is as in the previous Sect. 4. Note that we have necessarily $\dot t \ge 1$ everywhere on $T^1M$. By (36), we have the curvature operator given by:
\[ \big\langle \mathcal R\big((u\partial_t + X) \wedge (v\partial_t + Y)\big), (a\partial_t + A) \wedge (w\partial_t + Z)\big\rangle_\eta = \alpha\alpha''\, h(uY - vX, aZ - wA) - \alpha^2 (\alpha'^2 + k) \times [h(X, A)\, h(Y, Z) - h(X, Z)\, h(Y, A)]. \]
By (37), the Ricci tensor $((\tilde R_{\ell m}))$ is diagonal, with diagonal entries:
\[ \Big(-3\, \frac{\alpha''(t)}{\alpha(t)} ,\ \frac{A(t)}{1 - kr^2} ,\ A(t)\, r^2 ,\ A(t)\, r^2 \sin^2\!\varphi\Big) , \qquad\text{where } A(t) := \alpha(t)\, \alpha''(t) + 2\, \alpha'(t)^2 + 2k , \]
and the scalar curvature is $R = -6\, [\alpha(t)\, \alpha''(t) + \alpha'(t)^2 + k]\, \alpha(t)^{-2}$. The Einstein energy-momentum tensor $\tilde R_{\ell m} - \frac{1}{2} R\, g_{\ell m} = \tilde T_{\ell m}$ is diagonal as well, with diagonal entries:
\[ \Big(3\, \frac{\alpha'(t)^2 + k}{\alpha(t)^2} ,\ -\frac{\tilde A(t)}{1 - kr^2} ,\ -\tilde A(t)\, r^2 ,\ -\tilde A(t)\, r^2 \sin^2\!\varphi\Big) , \qquad\text{with } \tilde A(t) := 2\, \alpha(t)\, \alpha''(t) + \alpha'(t)^2 + k. \]
Hence, we have
\[ \tilde T_{\ell m} - \alpha(t)^{-2}\, \tilde A(t)\, g_{\ell m} = 2\, [k\, \alpha(t)^{-2} - H'(t)]\, 1_{\{\ell = m = 0\}}. \]
Thus, in accordance with Corollary 4.3, we have here an example of perfect fluid: Eq. (19) holds, with $U_j \equiv \delta^0_j$,
\[ -p(\xi) = k\, \alpha(t)^{-2} + 2 H'(t) + 3 H^2(t) , \quad q(\xi) = 2\, [k\, \alpha(t)^{-2} - H'(t)] , \quad \tilde p(\xi) = -2\, [2k\, \alpha(t)^{-2} + H'(t) + 3 H^2(t)]/(d-1). \tag{47} \]
Note that
\[ A_s = U_i(\xi_s)\, \dot\xi^i_s = \dot t_s \qquad\text{and}\qquad E_s = 2\, [k\, \alpha(t_s)^{-2} - H'(t_s)]\, \dot t_s^2 - p(\xi_s). \tag{48} \]
By Corollary 4.4 (or by Remark 2.4.1(ii) as well), the weak energy condition is equivalent to: $\alpha'^2 + k \ge (\alpha\alpha'')^+$. We shall consider only eternal Robertson-Walker space-times, which have their future-directed half-geodesics complete. This amounts to $I = \mathbb R^*_+$, together with $\int^\infty \frac{\alpha}{\sqrt{1 + \alpha^2}} = \infty$. In the case of the basic relativistic diffusion (solving Eq. (23) in such Robertson-Walker model), we have in particular:
\[ d\dot t_s = \varrho\, \sqrt{\dot t_s^2 - 1}\; dw_s + \frac{3\varrho^2}{2}\, \dot t_s\, ds - H(t_s)\, [\dot t_s^2 - 1]\, ds. \tag{49} \]
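The perfect-fluid coefficients of (47) depend on the expansion factor only through the Hubble function, its derivative and the ratio of the curvature constant to the squared expansion factor, so their mutual consistency can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names `rw_fluid` and `power_law` are hypothetical, the spatial dimension is taken equal to 3 by default, and the scalar curvature is rewritten using the identity alpha''/alpha = H' + H^2.

```python
from fractions import Fraction


def rw_fluid(H, H_prime, k_over_alpha2, d=3):
    """Return (p, q, p_tilde, R) for a Robertson-Walker model, following (47).

    H, H_prime: values of the Hubble function and of its derivative;
    k_over_alpha2: the value of k / alpha(t)^2.
    """
    q = 2 * (k_over_alpha2 - H_prime)
    p = -(k_over_alpha2 + 2 * H_prime + 3 * H * H)
    p_tilde = Fraction(-2) * (2 * k_over_alpha2 + H_prime + 3 * H * H) / (d - 1)
    # R = -6 [alpha alpha'' + alpha'^2 + k] / alpha^2,
    # rewritten with alpha''/alpha = H' + H^2.
    R = -6 * (H_prime + 2 * H * H + k_over_alpha2)
    return p, q, p_tilde, R


def power_law(c, t):
    """Hubble function and its derivative for alpha(t) = t**c."""
    return c / t, -c / (t * t)
```

With `H, Hp = power_law(Fraction(2, 3), Fraction(5, 2))` and `rw_fluid(H, Hp, Fraction(0))`, the returned pressure vanishes, and the returned values can be compared with the relations (20) and Remark 2.4.1(i) for d = 3.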


5.1. $\Xi$-relativistic diffusions in an Einstein-de Sitter-like manifold. We consider henceforth the particular case $I = ]0, \infty[$, $k = 0$, and $\alpha(t) = t^c$, with exponent $c > 0$. Note that such expansion functions $\alpha$ can be obtained by solving a proportionality relation between $p$ and $q$ (see [H-E] or [L-L]). Thus
\[ q = 2c\, t^{-2} , \quad p = (2 - 3c)\, c\, t^{-2} , \quad R = -6c\, (2c - 1)\, t^{-2} , \quad E = c\, t^{-2}\, (2 \dot t^2 + 3c - 2). \]
Note that the weak energy condition holds. The scalar curvature is non-positive if and only if $c \ge 1/2$, and the pressure $p$ is non-negative if and only if $c \le 2/3$. Note that the particular case $c = \frac{2}{3}$ corresponds to a vanishing pressure $p$, and is precisely known as that of Einstein-de Sitter universe (see for example [H-E]). And the analysis of [L-L] shows up precisely both limiting cases $c = \frac{2}{3}$ and $c = \frac{1}{2}$.

5.1.1. Basic relativistic diffusion in an Einstein-de Sitter-like manifold. In order to compare with the other relativistic diffusions, we mention first for the basic relativistic diffusion (of Sect. 3.1), the stochastic differential equations satisfied by the main coordinates $\dot t_s$ and $\dot r_s$, appearing in the 4-dimensional sub-diffusion $(t_s, \dot t_s, r_s, \dot r_s)$. By (49), we have, for independent standard real Brownian motions $w, \tilde w$:
\[ d\dot t_s = \varrho\, \sqrt{\dot t_s^2 - 1}\; dw_s + \frac{3\varrho^2}{2}\, \dot t_s\, ds - \frac{c}{t_s}\, (\dot t_s^2 - 1)\, ds ; \tag{50} \]
\[ d\dot r_s = \varrho\, \frac{\dot t_s\, \dot r_s}{\sqrt{\dot t_s^2 - 1}}\; dw_s + \varrho\, \sqrt{\frac{1}{t_s^{2c}} - \frac{\dot r_s^2}{\dot t_s^2 - 1}}\; d\tilde w_s + \frac{3\varrho^2}{2}\, \dot r_s\, ds + \Big(\frac{\dot t_s^2 - 1}{t_s^{2c}} - \dot r_s^2\Big) \frac{ds}{r_s} - \frac{2c}{t_s}\, \dot t_s\, \dot r_s\, ds. \tag{51} \]
Almost surely (see [A]), $\lim_{s\to\infty} \dot t_s = \infty$, and $x_s/r_s \sim \dot x_s/|\dot x_s|$ converges in $S^2$.

5.1.2. R-diffusion in an Einstein-de Sitter-like manifold. With the above, Sect. 3.3 reads here, for the R-relativistic diffusion, when $c \ge 1/2$:
\[ d\dot\xi_s = dM_s + 9c\, (2c - 1)\, \varrho^2\, t_s^{-2}\, \dot\xi_s\, ds - \Gamma^{\cdot}_{ij}(\xi_s)\, \dot\xi^i_s\, \dot\xi^j_s\, ds , \tag{52} \]
with the quadratic covariation matrix of the martingale part $dM_s$ given by: $\varrho^{-2}\, [d\dot\xi^k_s, d\dot\xi^\ell_s] = 6c\, (2c - 1)\, [\dot\xi^k_s \dot\xi^\ell_s - g^{k\ell}(\xi_s)]\, t_s^{-2}\, ds$, for $0 \le k, \ell \le d$. In particular, we have for independent standard real Brownian motions $w, \tilde w$:
\[ d\dot t_s = \frac{\varrho}{t_s}\, \sqrt{6c\, (2c - 1)\, (\dot t_s^2 - 1)}\; dw_s + \frac{9\varrho^2\, c\, (2c - 1)}{t_s^2}\, \dot t_s\, ds - \frac{c}{t_s}\, (\dot t_s^2 - 1)\, ds ; \tag{53} \]
\[ d\dot r_s = \frac{\varrho\, \sqrt{6c\, (2c - 1)}}{t_s} \Big[\frac{\dot t_s\, \dot r_s}{\sqrt{\dot t_s^2 - 1}}\; dw_s + \sqrt{\frac{1}{t_s^{2c}} - \frac{\dot r_s^2}{\dot t_s^2 - 1}}\; d\tilde w_s\Big] + \frac{9\varrho^2\, c\, (2c - 1)}{t_s^2}\, \dot r_s\, ds + \Big(\frac{\dot t_s^2 - 1}{t_s^{2c}} - \dot r_s^2\Big) \frac{ds}{r_s} - \frac{2c}{t_s}\, \dot t_s\, \dot r_s\, ds. \tag{54} \]
As the scalar curvature $R_s = 6c\, (1 - 2c)/t_s^2$ vanishes asymptotically, we expect that almost surely the R-diffusion behaves eventually as a timelike geodesic, and in particular that $\lim_{s\to\infty} \dot t_s = 1$.
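The time sub-diffusions (50) and (53) share one structure: a noise intensity depending on the time coordinate only (constant for the basic diffusion, proportional to minus the scalar curvature for the R-diffusion), a drift equal to 3/2 times that intensity times the time velocity, and the geodesic term. A rough Euler-Maruyama sketch of this common form follows. It is an illustration under assumptions, not the authors' method: the names `simulate_time_subdiffusion`, `geodesic_invariant` and the argument `theta` are hypothetical, and the clamp keeping the time velocity at or above 1 is a purely numerical safeguard that is not part of the equations.

```python
import math
import random


def simulate_time_subdiffusion(theta, c, t0, tdot0, ds, n_steps, seed=0):
    """Euler-Maruyama sketch for (t_s, tdot_s) with alpha(t) = t**c, d = 3.

    Assumed scheme, for a noise intensity theta(t) depending on t only:
        dt    = tdot ds
        dtdot = sqrt(theta(t) (tdot^2 - 1)) dw
                + (3/2) theta(t) tdot ds - (c / t) (tdot^2 - 1) ds
    theta = rho**2 mimics (50); theta = 6 rho**2 c (2c - 1) / t**2 mimics (53).
    """
    rng = random.Random(seed)
    t, tdot = t0, tdot0
    path = [(t, tdot)]
    for _ in range(n_steps):
        th = theta(t)
        dw = rng.gauss(0.0, math.sqrt(ds))  # Brownian increment over ds
        excess = max(tdot * tdot - 1.0, 0.0)
        new_tdot = (tdot
                    + math.sqrt(th * excess) * dw
                    + 1.5 * th * tdot * ds
                    - (c / t) * excess * ds)
        t = t + tdot * ds
        tdot = max(new_tdot, 1.0)  # numerical safeguard only
        path.append((t, tdot))
    return path


def geodesic_invariant(c, t, tdot):
    """The quantity t**c * sqrt(tdot**2 - 1), constant along geodesics."""
    return t ** c * math.sqrt(max(tdot * tdot - 1.0, 0.0))
```

Calling the simulator with `theta = lambda t: 0.0` reduces it to an Euler scheme for the geodesic flow, along which `geodesic_invariant` should stay nearly constant; this gives a simple sanity check of the drift term before any noise is switched on.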


5.1.3. E-diffusion in an Einstein-de Sitter-like manifold. Similarly, using (32), (47), (48), we have here $E \dot\xi - \tilde T \dot\xi = 2\, (0 - H')\, (\dot t^2 \dot\xi - \dot t\, U)$, so that Sect. 3.4 reads here, for the E-diffusion:
\[ d\dot\xi_s = dM_s + \frac{3\varrho^2 c}{2}\, t_s^{-2}\, (2 \dot t_s^2 + 3c - 2)\, \dot\xi_s\, ds + 2\varrho^2 c\, t_s^{-2}\, (\dot t_s\, \dot\xi_s - U_s)\, \dot t_s\, ds - \Gamma^{\cdot}_{ij}(\xi_s)\, \dot\xi^i_s\, \dot\xi^j_s\, ds , \tag{55} \]
with the quadratic covariation matrix of the martingale part $dM_s$ given by: $\varrho^{-2}\, [d\dot\xi^k_s, d\dot\xi^\ell_s] = c\, [\dot\xi^k_s \dot\xi^\ell_s - g^{k\ell}(\xi_s)]\, (2 \dot t_s^2 + 3c - 2)\, t_s^{-2}\, ds$, for $0 \le k, \ell \le d$. In particular, we have for some standard real Brownian motion $w$:
\[ d\dot t_s = \frac{\varrho \sqrt{c}}{t_s}\, \sqrt{(2 \dot t_s^2 - 2 + 3c)\, (\dot t_s^2 - 1)}\; dw_s + c\, \Big[5\varrho^2 \Big(\dot t_s^2 - 1 + \frac{9c}{10}\Big) \frac{\dot t_s}{t_s^2} - \frac{\dot t_s^2 - 1}{t_s}\Big]\, ds ; \tag{56} \]
\[ d\dot r_s = \frac{\varrho \sqrt{c}}{t_s}\, \sqrt{2 \dot t_s^2 - 2 + 3c}\; \Big[\frac{\dot t_s\, \dot r_s}{\sqrt{\dot t_s^2 - 1}}\; dw_s + \sqrt{\frac{1}{t_s^{2c}} - \frac{\dot r_s^2}{\dot t_s^2 - 1}}\; d\tilde w_s\Big] + \varrho^2 c\, \Big(5 \dot t_s^2 - 3 + \frac{9c}{2}\Big) \frac{\dot r_s}{t_s^2}\, ds - \frac{2c}{t_s}\, \dot t_s\, \dot r_s\, ds + \Big(\frac{\dot t_s^2 - 1}{t_s^{2c}} - \dot r_s^2\Big) \frac{ds}{r_s}. \tag{57} \]
Remark 5.1.4. Comparison of $\Xi$-diffusions in an E.-d.S.-like manifold. Along the preceding Sects. 5.1.1, 5.1.2, 5.1.3, we specified to an Einstein-de Sitter-like manifold the various $\Xi$-diffusions we considered successively in Sects. 3.1, 3.3, 3.4. Restricting to the only equation relating to the hyperbolic angle $A_s = \dot t_s$, or in other words, to the simplest sub-diffusion $(t_s, \dot t_s)$, this yields Eqs. (50), (53), (56) respectively. We observe that even in this simple case, all these covariant relativistic diffusions differ notably, having pairwise distinct minimal sub-diffusions (with 3 non-proportional diffusion factors).

5.2. Asymptotic behavior of the R-diffusion in an Einstein-de Sitter-like manifold. We present here the asymptotic study of the R-diffusion of an Einstein-de Sitter-like manifold (recall Sects. 5.1, 5.1.2). We will focus our attention on the simplest sub-diffusion $(t_s, \dot t_s)$, and on the space component $x_s \in \mathbb R^3$. Recall from (48) that $\dot t_s = A_s$ equals the hyperbolic angle, measuring the gap between the ambient fluid and the velocity of the diffusing particle. Recall also that, by the unit pseudo-norm relation, $\dot t_s$ controls the behavior of the whole velocity $\dot\xi_s$. We get as a consequence the asymptotic behavior of the energy $E_s$. As quoted in Sect. 5.1.2, we must have here $c \ge \frac{1}{2}$. Note that for $c = \frac{1}{2}$, the scalar curvature vanishes, and the R-diffusion reduces to the geodesic flow, whose equations are easily solved and whose time coordinate satisfies (for constants $a$ and $s_0$):
\[ s - s_0 = \sqrt{t_s\, (t_s + a^2)} - a^2 \log\big[\sqrt{t_s} + \sqrt{t_s + a^2}\,\big] , \qquad\text{whence } t_s \sim s. \]
The proofs of this section (and of the following one) will use several times the elementary fact that almost surely a continuous local martingale cannot go to infinity. The following confirms a conjecture stated at the end of Sect. 5.1.2.

Proposition 5.2.1. The process $\dot t_s$ goes almost surely to 1, and $E_s \to 0$, as $s \to \infty$.


Proof. By Eq. (53), t˙s log − 32 c(2c − 1) t˙1

'

s 1

(2 + t˙τ−2 )

dτ +c tτ2

'

s 1

(1 − t˙τ−2 )

t˙τ dτ tτ

s is a continuous martingale with quadratic variation 62 c (2c − 1) 1 (1 − t˙τ−2 ) dτ . Hence, tτ2 ˙ since tτ ≥ 1 and therefore tτ ≥ τ , the non-negative process log t˙s + c

' 1

s

(1 − t˙τ−2 )

t˙τ dτ tτ

converges almost surely as s → ∞ . This forces the almost sure convergence of the ∞ ˙ integral : 1 (1 − t˙τ−2 ) ttττ dτ < ∞ , and of t˙s , towards some t˙∞ ∈ [1, ∞[ . This implies ∞ ˙ in turn tτ = O(τ ), hence 1 (1 − t˙τ−2 ) dτ τ < ∞ , whence finally t∞ = 1 . √ Consider now the functional a := t c t˙2 − 1 , which is constant along any geodesic. Lemma 5.2.2. For c > 21 , the process as := tsc t˙s2 − 1 goes almost surely to and cannot vanish. Moreover, for any ε > 0 we have almost surely : infinity, ∞ 2c−2 ds < ∞. 1 ts a 2+ε s

Proof. We get from Eq. (53) : das =

ts

!

6c (2c − 1)(as2 + ts2c ) dws + 32 c (2c − 1)

3 as2 + 2 ts2c ds , ts2 as

and then for any ε ∈ ]0, 1] and for some continuous local martingale M : ' 0 ≤ as−ε = a1−ε − Ms − 3ε2 c (2c − 1)

s 1

[2 − ε] aτ2 + [1 − ε] tτ2c dτ. tτ2 aτ2+ε

The signs in this last formula, and the fact that almost surely a continuous local martingale cannot go to infinity, imply the convergence of the last integral and of the martingale −1 , hence of a ∈ ]0, ∞], term Ms , entailing the almost sure existence of a finite a∞ ∞ ∞ limit dτ and the almost sure convergence of the integral 1 2+ε 2−2c < ∞ . Now, by Proposia τ tτ ∞ dτ ∞ dτ tion 5.2.1, this implies 1 a 2+ε ≤ < ∞ , hence a∞ = ∞ . Finally, the 2+ε 2−2c 1 a τ τ τ

τ

equation for as−ε forbids also the existence of a finite zero s0 for as . Indeed, s s0 would force the martingale term of this equation to go to −∞, which is impossible.

The following reveals the asymptotic behavior of the space component (xs ) for c > 21 . Proposition 5.2.3. For c > 21 , the space component converges almost surely (as s → ∞): xs → x∞ ∈ R3 .


Proof. (i) Let us consider the non-negative process u s := ts (t˙s2 − 1), which is constant for c = 21 , and cannot vanish for c > 21 , by Lemma 5.2.2. By Eq. (53), ' s ' s u τ t˙τ dτ 2 dτ − 6 c (2c − 1) [4 t˙τ2 − 1] u s + (2c − 1) t tτ τ 1 1 is a continuous local martingale. Then, for any ε > 0 , ' s 2 ' s us [4t˙τ − 1] dτ u τ dtτ 2 − 6 c [2c − 1] + [2c − 1 + ε] 1+ε 1+ε tsε t 1 1 tτ τ is a continuous local martingale. By Proposition 5.2.1, the central term converges almost ∞ dt τ surely. This implies that 1 utτ1+ε < ∞ and that ut εs converges, almost surely. τ

s

(ii) By the unit pseudo-norm relation, we have ts2c |x˙s |2 = t˙s2 − 1 = u s /ts . Let us apply (i) above with ε := c − 21 , to get : ' ∞ 2 ' ∞ ' ∞ ' ∞ us 2 |x˙s | ds ≤ tsε−2c ds × ts2c−ε |x˙s |2 ds ≤ ds < ∞. 2c − 1 1 ts1+ε 1 1 1 s ∞ This proves that xs = x1 + 1 x˙τ dτ → x1 + 1 x˙s ds ∈ R3 , almost surely as s → ∞. In the case c = 21 of the R-diffusion being the geodesic flow, we have ! rs = b2 /a 2 + (a + o(1)) log s ∼ a log s as s → ∞ ,

which shows that Proposition 5.2.3 does not hold for the limiting case c = 21 . To compare the R-diffusion with geodesics, note that (as iseasily seen ; see for exam s dτ ple [A]) along any timelike geodesic, we have xs = x1 + |xx˙˙11 | 1 at 2c and |xx˙˙ss | = |xx˙˙11 | , τ

which converges precisely for c > 21 ; and along any lightlike geodesic, we have t 1−c 1+c xs = x1 + |xx˙˙11 | t1s dτ and |xx˙˙ss | = |xx˙˙11 | , which converges only for c > 1 . τc ∼ V × s On the other hand, for c ≤ 1, the behavior of the basic relativistic diffusion proves to s satisfy (see [A]) : rs ∼ 1 aτt 2cdτ −→ ∞ (exponentially fast, at least for c < 1). s→∞ τ Hence, the R-diffusion behaves asymptotically more like a (timelike) geodesic than like the basic relativistic diffusion. However, owing to Lemma 5.2.2, the asymptotic behavior of the R-diffusion seems to be somehow intermediate between those of the geodesic flow and of the basic relativistic diffusion. 5.3. Asymptotic energy of the E-diffusion in an Einstein-de Sitter-like manifold. We consider here the case of Sect. 5.1.3, dealing with the energy diffusion in an Einstein-de Sitter-like manifold, and more precisely, with its absolute-time minimal sub-diffusion (ts , t˙s ) satisfying Eq. (56), and with the resulting random energy: Es = c ts−2 (2 t˙s2 + 3c − 2) = 2c (t˙s /ts )2 + O(s −2 ). Let us denote by ζ the explosion time :

ζ := sup{s > 0 | t˙s < ∞} ∈ ]0, ∞] .

Lemma 5.3.1. We have almost surely : either lims→ζ t˙s = 1 and ζ = ∞ , or lims→ζ t˙s = ∞ .


Proof. By Eq. (56), ' s∧ζ ' s∧ζ " " 1 1 # dτ 3c + 2 3c − 2 # dtτ −c + 2 c + 1− 2 3+ 2 t˙τ2 tτ2 t˙s∧ζ t˙τ tτ t˙τ4 s0 s0 s∧ζ " #" # . 1− t˙12 dτ is a continuous martingale with quadratic variation 22 c s0 1+ 3c−2 2 t˙τ2 tτ2 τ Hence, since t˙τ ≥ 1 and therefore tτ ≥ τ , the process ' s∧ζ dτ −1 −c (1 − t˙τ−2 ) t˙s∧ζ tτ s0 −1 converges almost surely as s → ∞ . As 0 ≤ t˙s∧ζ ≤ 1 , this forces the almost sure conζ " # < ∞ , and of t˙s∧ζ , towards some t˙ζ ∈ [1, ∞]. vergence of the integral : s0 1 − t˙12 dτ tτ τ Moreover, the convergence of the integral forces either ζ < ∞ and then t˙ζ = ∞ , or ζ = ∞ and then t˙ζ ∈ {1, ∞}.

The asymptotic behavior can, with positive probability, be partly opposite to that of the preceding R-diffusion : Proposition 5.3.2. From any starting point (ts0 , t˙s0 ), there is a positive probability that both As = t˙s and the energy Es explode. This happens with arbitrary large probability, starting with t˙s0 /ts0 sufficiently large and t0 bounded away from zero. On the other hand, there is also a positive probability that the hyperbolic angle As = t˙s does not explode and goes to 1, and then that the random energy Es goes to 0. This happens actually with arbitrary large probability, starting with sufficiently large ts0 /t˙s0 . Proof. Let us set λs := ts /t˙s ≥ 0 . From the above proof of Lemma 5.3.1, we get directly: ' s∧ζ c 1− dτ λs∧ζ − λs0 = Ms∧ζ + [1 + c] [1 + c]t˙τ2 s0 ' s∧ζ 3c − 2 3c − 2 dτ 2 − c + (58) 3+ 2 t˙τ2 λτ t˙τ4 s0 ' s∧ζ 3 1 + 2 t˙τ−2 dτ , (59) − ≤ Ms∧ζ + [1 + c](s ∧ ζ − s0 ) − 2 c λτ tτ t˙τ s0 s" #" # (Ms ) denoting a martingale having quadratic variation 22 c s0 1 + 3c−2 1 − t˙12 dτ . 2 t˙τ2 τ (i) Let us first start the time sub-diffusion from (ts0 , t˙s0 ) such that ts0 ≥ s0 ≥ 1 and ˙ts0 ≥ n m ts0 , with fixed n ≥ 2 and m ≥ 2 + 1+c , and consider T := ζ ∧ inf{s > 32 c s0 | m λs > 1}. Thus, we have on [s0 , T ] : λ−1 τ ≥ m , and then 3+

1+c 2 c

3 λτ

−

1+2 t˙τ−2 tτ t˙τ

≥ 3m − 3 ≥

. Therefore, by Inequality (59), we have almost surely for any s ≥ s0 : 0 ≤ λs∧T ≤ λs0 − 32 c (s ∧ T − s0 ) + Ms∧T .

Integrating this inequality and letting s ∞ yields : 1 1 , P ˙ [T < ζ ] ≤ lim inf E[λs∧T ] ≤ λs0 ≤ s→∞ m (ts0 , ts0 ) nm 1 P(ts , t˙s ) [T = ζ ] ≥ 1 − . 0 0 n

whence


Moreover, almost surely on the event {T = ζ }, the above inequality 0 ≤ λs∧ζ ≤ λs0 − 32 c (s ∧ ζ − s0 ) + Ms∧ζ implies clearly (using that a continuous martingale almost surely cannot go to infinity) ζ < ∞ , and by the previous lemma that t˙ζ = ∞ . Then (58) implies the convergence of λs∧ζ to some λζ ∈ R+ . Furthermore, t˙ζ = ∞ and λζ > 0 for finite ζ would imply trivially tζ = ∞ , whence its logarithmic derivative " should explode, # which leads to a contradiction. This proves that we have P(ts , t˙s ) ζ < ∞ , λζ = 0 ≥ 1 − n −1 . 0 0 Since (by the support theorem of Stroock and Varadhan, see for example Theorem 8.1 in [I-W]) from any starting point the sub-diffusion (ts , t˙s ) hits with a positive probability some (ts0 , t˙s0 ) as above, we find there is always a positive probability that t˙s and Es explode (together). (ii) Let us now start the time sub-diffusion from (ts1 , t˙s1 ) such that n m t˙s1 ≤ ts1 , with fixed m ≥ 92 c (1 + c) and n ≥ 2 + c , and consider T := ζ ∧ inf{s > s1 | λs < m }. By Eq. (56) we have at once : almost surely, for any s ≥ s1 , 9c − 10 −3 λτ dτ 10 t˙τ2 s1 ' s∧T ' s∧T −[1 + c] λ−2 dτ + c tτ−2 dτ + Ms∧T , τ

−1 2 λ−1 s∧T = λs1 + 5 c

'

s∧T

1+

s1

s1

s" #" # −4 1 (Ms ) denoting a martingale having quadratic variation 22 c s1 1 + 3c−2 1− λτ dτ . 2 2 ˙ ˙ 2 tτ tτ # " 2 5 c (1+c) 9c−10 −1 2 Since on [s1 , T ] we have λ−1 ≤ 5/9 , τ ≤ 1/m , and then 5 c 1+ 10 t˙τ2 λτ ≤ m we get : 0≤

λ−1 s∧T

≤

λ−1 s1

' −c

s∧T s1

λ−2 τ dτ +

c + Ms∧T . ts1

T −1 −1 This entails s1 λ−2 τ dτ < ∞ and λs∧T → λT ∈ R+ (which implies moreover −1 λT = 0 almost surely on {T = ∞} ), and " # c 1+c 1 ≤ λ−1 P(ts ,t˙s ) [T < ζ ] ≤ E(ts ,t˙s ) λ−1 ≤ . s1 + T 1 1 1 1 m ts1 n m Hence, we get P(ts ,t˙s ) [T = ζ ] ≥ 1 − 1+c > 0 . Furthermore, as in (i) above, n 1 1 t˙T = ∞ and λT > 0 for finite T is impossible, which excludes T = ζ < ∞ . Therefore P(ts ,t˙s ) [T = ζ = ∞] ≥ 1 − 1+c n > 0. 1 1 # 2 −1 " 2c ≤ (6+9c) < 21 on Then from the equation for λ , using that 3 + 9c 2 c λτ 2 m [s1 , T ], we get almost surely : λs∧T

' s∧T dτ 9c 2 c − λs1 ≥ (s ∧ T − s1 ) − 3 + + Ms∧T 2 λτ s1 1 ≥ (s ∧ T − s1 ) + Ms∧T , 2


which shows (since [Ms , Ms ] = O(s)) that almost surely {T = ζ = ∞} ⊂ {λs → ∞}. On this same event, by Eq. (56) we have almost surely for any s ≥ s1 : ⎡ ⎤ ' s& ' s −2 1 + 9c−10 2 ˙ 3c − 2 1 t˙τ 1 − t ˙ 10 t τ ⎦ τ ⎣52 t˙s − t˙s1 = 1− 2 dwτ + c − 2c 1 + dtτ , t˙τ λτ 2 t˙τ2 λ2τ λτ s1 s1

which shows that t˙s cannot go to infinity, since this would forbid the last integral, and then the right hand side, to go to +∞ . Hence, by Lemma 5.3.1, we obtain that almost surely {T = ζ = ∞} ⊂ {t˙s → 1}. The proof is ended as in (i) above, by applying the support theorem of Stroock and Varadhan, and by taking n arbitrary large. 6. Sectional Relativistic Diffusion We turn now our attention towards a different class of intrinsic relativistic generators on G(M), whose expressions derive directly from the commutation relations of Sect. 2.2, on canonical vector fields of T G(M). They all project on the unit tangent bundle T 1 M 1 onto a unique relativistic generator Hcur v , whose expression involves the curvature 1 tensor. Semi-ellipticity of Hcur v requires the assumption of non-negativity of timelike 1 sectional curvatures. Note that in general Hcur v does not induce the geodesic flow in an empty space. 6.1. Intrinsic relativistic generators on G(M). We shall actually consider among these generators, those which are invariant under the action of S O(d) on G(M). To this aim, we introduce the following dual vertical vector fields, by lifting indexes : V i j := ηim η jn Vmn . Note that V j ≡ V 0 j = −V0 j = −V j , and that V i j = Vi j for 1 ≤ i, j ≤ d . We consider again a positive parameter . Proposition 6.1.1. The following four S O(d)-invariant differential operators define the 1 1 same operator Hcur v on T M: H0 −

d d 2 [H0 , H j ]V j + V j [H0 , H j ] ; H0 + 2 [H j , H0 ]V j ; 2 j=1

j=1

H0 + 2

d

j

R0 V j − 2

j=1

H0 − 1 (Hcur v

2 4

R0 j0k V j Vk ;

1≤ j,k≤d

[Hi , H j ] V i j + V i j [Hi , H j ] ; 1≤i, j≤d

− L0 )

Note that is self-adjoint with respect to the Liouville measure of T 1 M. The proof will be broken in several lemmas. We begin with the following general and useful computation rules, derived from Sects. 2.1 and 2.2. Lemma 6.1.2. For 0 ≤ j, k, ≤ d , we have: V i Ri j k = δ0k R j − δ0 R kj + (1 − d)R0 j k + R j0 k − Rk j0 ; V i R0i k = δ0 R0k − δ0k R0 ;

[ [Hi , H j ], V i ] = (d − 1)[H0 , H j ].

Proof. We get the first formula by multiplying by ηi p the formula of Lemma 2.2.2, and particularising to q = 0 . As to the second one, by particularising the latter to j = 0 and changing sign, we get:


V i R0i k = δ0 R0k − δ0k R0 + Rk 00 − R 00 k . Then, we note that R 00 k = R0 k 0 = Rk 00 . Finally, the last formula derives from the second one and from the commutation relations (4) and (6), as follows: 1 1 1 [Ri j k Vk , V i ] = Ri j k [Vk , V i ] − (V i Ri j k )Vk 2 2 2 1 i ip k i = Ri j δk V − δ Vk + η (η0 V pk − η0k V p ) 2 1−d R0 j k + R j0 k Vk − δ0k R j + 2 d −1 p k R0 j k Vk − R j0 k Vk = R j V + R j 0 V pk − R j V + 2 d −1 R0 j k Vk = (d − 1)[H0 , H j ]. = 2

[ [Hi , H j ], V i ] =

We get then first the following. Lemma 6.1.3. On C 2 (T 1 M), we have [ [H0 , H j ], V j ] = 0 , and o Hcur v

1 [H0 , H j ]V j + V j [H0 , H j ] = := − [H j , H0 ]V j 2 =

d

d

j=1

j=1

d

j

R0 V j −

j=1

R0 j0k V j Vk .

1≤ j,k≤d

Proof. Using the commutation relations (4) and (6), that Vk = 0 on C 2 (T 1 M) for 1 ≤ k, ≤ d , and (10), we have on one hand : 1 1 R0 j k Vk V j = R0 j k ([Vk , V j ] + V j Vk ) 2 2 1 k i j = R0 j η (ηik V − ηi Vk + η0 Vik − η0k Vi ) + R0 j 0 V j V 2 = −R0 j k j Vk + R0 ik 0 Vik + R0 j 0k V j Vk = R0 j 0k V j Vk − R0k Vk .

[H0 , H j ]V j =

On the other hand, using this first part of proof and Lemma 6.1.2, we get : 1 1 1 [R0 j k Vk , V j ] = R0 j k [Vk , V j ] − (V j R0 j k )Vk 2 2 2 1 = −R0k Vk − (δ0 R0k − δ0k R0 )Vk = −R0k Vk + R0k Vk = 0. 2

[[H0 , H j ], V j ] =

Using [H0 , H j ]V j + V j [H0 , H j ] = 2 [H0 , H j ]V j − [ [H0 , H j ], V j ] ends the proof. We get then the following.


Lemma 6.1.4. On C 2 (T 1 M), we have


[[Hi , H j ], V i j ] = 0 , and

1≤i, j≤d

o [Hi , H j ] V i j + V i j [Hi , H j ] = − 4 Hcur v. 1≤i, j≤d

Proof. As for the proof of Lemma 6.1.3, we use the commutation relations (4) and (6), that Vk = 0 on C 2 (T 1 M) for 1 ≤ k, ≤ d , (10), Lemmas 2.2.2 and 6.1.2, and the symmetries of the Riemann tensor. We have thus on one hand and on C 2 (T 1 M): 1 [R pj k Vk , Vmn ] ηim η jn 2 1 1 = R pj k [Vk , Vmn ]ηim η jn − (Vmn R pj k )ηim η jn Vk 2 2 1 k = R pj ηkm V n + η n Vkm − η m Vkn − ηkn V m ηim η jn 2 1 k − × ηmp Rn jk − ηnp Rm jk + ηm j Rnpk − η jn Rmpk + δm R pjn 2 −δnk R pjm − δm R pjnk + δn R pjmk ηim η jn Vk 1 = R p nk δki V n + η n Vk i − δ i Vkn − ηkn V i 2 1 − Ri p k + Ri p k − (d + 1)Ri p k + ηik R pj j − 2 −η jk R pji − ηi R pj jk + η j R pjik Vk

[[H p , H j ], V i j ] =

1 1 (R pni V n + R kp Vk i − R pnki Vkn + R p V i ) + [(d + 1)Ri p k 2 2 +ηik R p + R pki − ηi R kp − R p ik ]Vk 1 − R p ki + ηi R kp − R p ki − ηik R p + (d + 1)Ri p k = 2 =

+ηik R p + R pki − ηi R kp − R p ik Vk d +1 Ri p k Vk . = 2

i k In particular, we get [ [Hi , H j ], V i j ] = ( d+1 2 ) R i Vk = 0 . And on the other hand, 2 1 on C (T M) again:

1 im jn η η Ri j k Vk Vmn = ηi0 η jn Ri j k Vk Vn 2 = R0 jk Vk V j = R0 jk ([Vk , V j ] + V j Vk )

[Hi , H j ]V i j =

= R0 jk (η jk V − η j Vk + η0 V jk − η0k V j ) + R0 jk V j Vk = −2 R0k Vk + 2 R0 jk 0 V jk + 2 R0 j0 V j V o = 2 (R0 j0k V j Vk − R0k Vk ) = −2 Hcur v.

The final assertion relating to the Liouville measure is proved as in Theorem 3.2.1.


1 Proposition 6.1.5. In local coordinates, the second order operator Hcur v defined on 1 T M by Proposition 6.1.1 reads:

∂ ∂ 2 n ˜ k ∂ 2 p q k ∂ 2 − ξ˙ i ξ˙ j ikj k + ξ˙ Rn k − ξ˙ ξ˙ R p q j ∂ξ 2 2 ∂ ξ˙ ∂ ξ˙ ∂ ξ˙ k ∂ ξ˙ ∂ ∂ ∂ ∂2 2 m = ξ˙ j j − ξ˙ i ξ˙ j ikj k + ξ˙ Rmnpq g nq g pk k − ξ˙ p g nk g q k . ∂ξ 2 ∂ ξ˙ ∂ ξ˙ ∂ ξ˙ ∂ ξ˙

1 ˙j Hcur v = ξ

Proof. By Sect. 2.3, we have on C 2 (T 1 M) : V j Vk = enj ek

∂2 ∂e0n ∂e0

+ δ jk e0n

∂ ∂2 ∂ ∂2 ∂ + e0n e0 n + enj n = enj ek n + δ jk e0n n , n ∂e0 ∂ek ∂e0 ∂e j ∂ek ∂e0 ∂e0

whence by Lemma 6.1.3: o j0k n Hcur e j ek v = −R0 q

= −R0 j0k enj ek

∂2 ∂e0n ∂e0 ∂2 q ∂e0n ∂e0

j ∂ ∂ + R0 enj n n ∂e0 ∂e0 d

− R0 j0k δ jk e0n

j=1

j

+ R0 enj

∂ (including now j = 0). ∂e0n

On the other hand, by Formula (18) we have : mnp e p ec ηa j ηb0 ηck R0 j0k = R0abc ηa j ηb0 ηck = e0m ean R b p mr p ec ηa j ηck , = em e ear R 0

0

whence q p mr p ec ηa j ηck en eq = em e p R mr p gr n g q = em e p R m n p q . R0 j0k enj ek = e0m e0 ear R 0 0 0 0 j k

And in a similar way, by (16) : j q n m pq g q g pn . = e0m R R0 enj = e0m ei R˜ mq ηi j enj = e0m R˜ m

This and (13),(14) yield the wanted formula, whose coefficients depend only on (ξ, ξ˙ ) ∈ T 1 M, as it must be by the S O(d)-invariance underlined in Proposition 6.1.1. 1 6.2. Sign condition on timelike sectional curvatures. The generator Hcur v defined on 1 T M by Proposition 6.1.1 is covariant with any Lorentz isometry of (M, g). Hence, it is a candidate to generate a covariant “sectional” relativistic diffusion on T 1 M, provided it be semi-elliptic. As a consequence of Sect. 6.1, the intrinsic sectional generator we are led to consider 2 j j0k V V . 1 on T 1 M is Hcur j k v , a restriction of H0 + 2 R0 V j − R0 Now, a necessary and sufficient condition, in order that such an operator be the generator of a well-defined diffusion, is that it be subelliptic. We are thus led to consider the following negativity condition on the curvature: R(u ∧ v), u ∧ v η ≤ 0 , for any timelike u and any spacelike v . (60)


This condition is equivalent to the lower bound on sectional curvatures of timelike planes $\mathbb{R}u + \mathbb{R}v$:
$$\frac{\langle R(u \wedge v),\, u \wedge v \rangle_\eta}{g(u \wedge v, u \wedge v)} \ge 0,$$
since $g(u \wedge v, u \wedge v) := g(u,u)\,g(v,v) - g(u,v)^2 < 0$ for such planes. Note that Sectional Curvature has proved to be a natural tool in Lorentzian geometry, see for example [H,H-R]. We test this negativity condition on warped products, in Corollary 6.2.1 below. When this negativity condition is fulfilled, we call the resulting covariant diffusion on $T^1M$, which has generator $H^1_{\mathrm{curv}}$ given by Propositions 6.1.1, 6.1.5, the sectional relativistic diffusion.

Corollary 6.2.1. Consider a Lorentz manifold (M, g) having the warped product form. Then the sign condition (60) is equivalent to: $\alpha'' \le 0$ on I, together with the following lower bound on sectional curvatures of the Riemannian factor (M, h):
$$\inf_{X,Y \in TM} \frac{\langle K(X \wedge Y),\, X \wedge Y \rangle}{h(X,X)\,h(Y,Y) - h(X,Y)^2} \;\ge\; \sup_I \{\alpha\,\alpha'' - \alpha'^2\}. \tag{61}$$

Proof. Let us denote by S R(U ∧ V ) the sectional curvature of the timelike plane associated with U ∧ V . Recall from Sect. 6.2 that the negativity condition (60) reads simply S R(U ∧ V ) ≥ 0 , for any timelike U and spacelike V . By choosing a pseudo-orthonormal basis of such given timelike plane, we can moreover restrict to g(U, U ) = 1 = −g(V, V ) and g(U, V ) = 0 . Setting U = u∂t + X and V = v∂t +Y , with u, v ∈ C 0 (I ) and X, Y ∈ T M, we can thus suppose that: u= which implies

α 2 h(X, X ) + 1 ; v =

α 2 h(Y, Y ) − 1 ; uv = α 2 h(X, Y ) ,

α 2 h(X ∧ Y, X ∧ Y ) + h(Y, Y ) − h(X, X ) = α −2 ,

and

h(u X − vY, u X − vY ) = α 2 h(X ∧ Y, X ∧ Y ) + α −2 . Recall that h(X ∧ Y, X ∧ Y ) := h(X, X )h(Y, Y ) − h(X, Y )2 . Now, by (36), this entails : S R(U ∧ V ) = α 2 K (X ∧ Y ) , X ∧ Y − αα [α 2 h(X ∧ Y, X ∧ Y ) + α −2 ] +|αα |2 h(X ∧ Y, X ∧ Y ) = α 2 K (X ∧ Y ) , X ∧ Y − (α α − α 2 ) h(X ∧ Y, X ∧ Y ) − α /α. Reciprocally, given any X, Y ∈ T M such that α 2 h(X ∧Y, X ∧Y )+h(Y, Y )−h(X, X ) = α −2 and h(Y, Y ) ≥ α −2 , setting u := α 2 h(X, X ) + 1 and v := ± α 2 h(Y, Y ) − 1 , we get (uv)2 = α 4 h(X, Y )2 , and thence a basis (U, V ) as above. Hence, the negativity condition (60) is equivalent to: 0 ≤ α 2 K (X ∧ Y ) , X ∧ Y − (α α − α 2 ) h(X ∧ Y, X ∧ Y ) − α /α , for any X, Y ∈ T M such that α 2 h(X ∧ Y, X ∧ Y ) + h(Y, Y ) − h(X, X ) = α −2 and h(Y, Y ) ≥ α −2 .


Distinguishing between collinear and non-collinear pairs X, Y , and denoting in the latter case by S K (X ∧ Y ) the sectional curvature of the plane associated with X ∧ Y , this condition splits into both: (α /α) ≤ 0 , together with: (α /α) ≤ S K (X ∧ Y ) − (α α − α 2 ) , α 2 h(X ∧ Y, X ∧ Y ) for any non-collinear X, Y ∈ T M such that α 2 h(X ∧ Y, X ∧ Y ) + h(Y, Y ) − h(X, X ) = α −2 and h(Y, Y ) ≥ α −2 . We shall have proved that this condition is equivalent to the wanted inequality (61), if we show now that for any given α > 0 , any given plane P in T M is generated by pairs X, Y such that α 2 h(X ∧Y, X ∧Y )+h(Y, Y )−h(X, X ) = α −2 , h(Y, Y ) ≥ α −2 , and such that h(X ∧ Y, X ∧ Y ) is arbitrary large. Y0 Now, starting from an arbitrary {X 0 , Y0 } generating P, take Y := α √h(Y and 0 ,Y0 ) h(X 0 ,Y0 ) X := q X 0 − h(Y0 ,Y0 ) Y0 . Then α 2 h(X ∧ Y, X ∧ Y ) + h(Y, Y ) − h(X, X ) = α −2 = h(Y, Y ), and h(X ∧ Y, X ∧ Y ) = q 2 arbitrary large q .

h(X 0 ∧Y0 ,X 0 ∧Y0 ) α 2 h(Y0 ,Y0 )

is indeed arbitrary large, for

In particular, in an Einstein-de Sitter-like manifold, the sign condition (60) holds if and only if $\alpha'' \le 0$, i.e. if and only if $c \le 1$.
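The last equivalence can be checked by a short computation. The sketch below assumes the expansion factor $\alpha(t) = t^c$ with $c > 0$ on $I = \,]0,\infty[$ and a flat Riemannian factor (the Euclidean $\mathbb{R}^3$ used in Sect. 6.3); neither is restated in this passage. It also reads the bound (61) as "infimum of the sectional curvatures of the Riemannian factor $\ge \sup_I(\alpha\alpha'' - \alpha'^2)$".

```latex
\begin{align*}
\alpha(t) &= t^{c}, \qquad \alpha'(t) = c\,t^{c-1}, \qquad \alpha''(t) = c(c-1)\,t^{c-2},\\
\alpha''(t) \le 0 \ \text{ for all } t > 0 &\iff c(c-1) \le 0 \iff c \le 1 \quad (c > 0),\\
\alpha\,\alpha'' - \alpha'^{2} &= c(c-1)\,t^{2c-2} - c^{2}\,t^{2c-2} = -c\,t^{2c-2} < 0 .
\end{align*}
```

Under these assumptions the right-hand side of (61) is never positive while the sectional curvatures of the flat factor vanish, so (61) holds automatically and only the condition $\alpha'' \le 0$, that is $c \le 1$, remains.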

6.3. Sectional diffusion in an Einstein-de Sitter-like manifold. We must have here c ≤ 1 . By (40), we have : for 0 ≤ k ≤ d and 1 ≤ m, n, p, q ≤ d, 0nkq = δ0k c (c − 1) t 2c−2 h nq and R mnpq = c2 t 4c−2 [h mq h np − h mp h nq ] − t 2c K˜ mnpq . R Using Cartesian coordinates (x j ) instead of spherical coordinates (r, ϕ, ψ) for the Euclidean factor M = R3 , we have merely: 0nkq = c (c − 1) t 2c−2 δ0k δnq R

and

mnpq = c2 t 4c−2 [δmq δnp − δmp δnq ]. R

Thence, for 0 ≤ k ≤ d and 1 ≤ m, n, p, q ≤ d: 0 n k q = c(c − 1) t −2c−2 δ0k δ nq and R q n m n p q = c(c − 1) t 2c−2 δmp δ tn δ tq + c2 t −2 [δm R δ p − δmp δ nq ]. And

R˜ i j = δi j 3c (1 − c) t −2 δ0 j + c (3c − 1) t 2c−2 [1 − δ0 j ] ,

whence

R˜ nk = −δnk c t −2 3(c − 1)δ0n + (3c − 1)[1 − δ0n ] . And by (39), the non-vanishing Christoffel coefficients are : 0k j = kj0 = c t −1 δ kj ,

and

i0j = c t 2c−1 δi j , for 1 ≤ i, j, k ≤ d.


Therefore, by Proposition 6.1.5, we have : ∂ ∂ 2 n ˜ k ∂ 2 p q k ∂ 2 ˙ i ξ˙ j ikj ˙ Rn − ξ + − ξ ξ˙ ξ˙ R p q ∂ξ j 2 2 ∂ ξ˙ k ∂ ξ˙ k ∂ ξ˙ k ∂ ξ˙ 2 2c ∂ ∂ c ∂ 3 c ∂ = ξ˙ j j − (t˙2 − 1) − (c − 1) t˙ t˙ x˙ j j − 2 ˙ ∂ξ t ∂t t ∂ x˙ 2t ∂ t˙ 2 ∂ c − 2 (3c − 1) x˙ j j 2t ∂ x˙ 2 c (c − 1) 2 ∂2 2 c (c − 1) 2 ˙ − 1) − ( t − t˙ x 2 t2 ∂ t˙2 2 t 2c+2 ∂2 2 c2 + 2 t −2c (t˙2 − 1)x − x˙ i x˙ j 2t ∂ x˙ i ∂ x˙ j

1 ˙j Hcur v =ξ

32 c 2c ∂ ∂ c 2 ∂ ∂ ˙ − 1) − ( t − (c − 1) t˙ − t˙ x˙ j j ∂ξ j t ∂ t˙ 2 t2 ∂ t˙ t ∂ x˙ 2 c (3c − 1) j ∂ 2 c (1 − c) 2 ∂2 2 c ˙ − 1) − x ˙ + ( t + (t˙2 − c) x 2 t2 ∂ x˙ j 2 t2 ∂ t˙2 2 t 2c+2 ∂2 2 c2 . − 2 x˙ i x˙ j 2t ∂ x˙ i ∂ x˙ j

= ξ˙ j

We see that even in this simple case, the sectional and curvature diffusions differ tangibly, apart from the fact that the ranges of values of c for which they are defined are different.

References

[A] Angst, J.: Étude de diffusions à valeurs dans des variétés lorentziennes. Thesis of Strasbourg University, 2009
[A-C-T] Arnaudon, M., Coulibaly, K.A., Thalmaier, A.: Brownian motion with respect to a metric depending on time; definition, existence and applications to Ricci flow. C. R. Acad. Sci. Paris 346, 773–778 (2008)
[B] Bailleul, I.: A stochastic approach to relativistic diffusions. Ann. I.H.P. 46(3), 760–795 (2010)
[B-E] Beem, J.K., Ehrlich, P.E.: Global Lorentzian Geometry. New York-Basel: Marcel Dekker, 1981
[De] Debbasch, F.: A diffusion process in curved space-time. J. Math. Phys. 45(7), 2744–2760 (2004)
[Du] Dudley, R.M.: Lorentz-invariant Markov processes in relativistic phase space. Arkiv för Mat. 6(14), 241–268 (1965)
[El] Elworthy, D.: Geometric aspects of diffusions on manifolds. École d’Été de Probabilités de Saint-Flour (1985-87), Vol. XV-XVII. Lecture Notes in Math., Vol. 1362. Berlin: Springer, 1988, pp. 277–425
[Em] Émery, M.: On two transfer principles in stochastic differential geometry. Séminaire de Probabilités (1990), Vol. XXIV. Berlin: Springer, 1990, pp. 407–441
[F1] Franchi, J.: Asymptotic windings over the trefoil knot. Revista Matemática Iberoamericana 21(3), 729–770 (2005)
[F] Franchi, J.: Relativistic diffusion in Gödel’s universe. Commun. Math. Phys. 290(2), 523–555 (2009)
[F-LJ] Franchi, J., Le Jan, Y.: Relativistic diffusions and Schwarzschild geometry. Comm. Pure Appl. Math. LX(2), 187–251 (2007)
[H-R] Hall, G.S., Rendall, A.D.: Sectional curvature in general relativity. Gen. Rel. Grav. 19(8), 771–789 (1987)
[H] Harris, S.G.: A characterization of Robertson-Walker spaces by null sectional curvature. Gen. Rel. Grav. 17(5), 493–498 (1985)
[H-E] Hawking, S.W., Ellis, G.F.R.: The large-scale structure of space-time. Cambridge: Cambridge University Press, 1973
[Hs] Hsu, E.P.: Stochastic analysis on manifolds. Graduate Studies in Mathematics Vol. 38, Providence, RI: Amer. Math. Soc., 2002
[I-W] Ikeda, N., Watanabe, S.: Stochastic differential equations and diffusion processes. Amsterdam-Tokyo: North-Holland Kodansha, 1981
[K-N] Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry. New York-London-Sydney: Interscience Publishers/John Wiley & Sons, 1969
[L-L] Landau, L., Lifchitz, E.: Physique théorique, tome II: Théorie des champs. Moscou: Éditions MIR de Moscou, 1970
[M] Malliavin, P.: Géométrie différentielle stochastique. Montréal: Les Presses de l’Université de Montréal, 1978

Communicated by S. Smirnov

Commun. Math. Phys. 307, 383–427 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1327-5


KAM for the Quantum Harmonic Oscillator

Benoît Grébert, Laurent Thomann

Laboratoire de Mathématiques J. Leray, Université de Nantes, UMR CNRS 6629, 2, rue de la Houssinière, 44322 Nantes Cedex 03, France. E-mail: [email protected]; [email protected]

Received: 28 April 2010 / Accepted: 21 March 2011
Published online: 28 August 2011 – © Springer-Verlag 2011

Abstract: In this paper we prove an abstract KAM theorem for infinite dimensional Hamiltonian systems. This result extends previous works of S.B. Kuksin and J. Pöschel and uses recent techniques of H. Eliasson and S.B. Kuksin. As an application we show that some 1D nonlinear Schrödinger equations with harmonic potential admit many quasi-periodic solutions. In a second application we prove the reducibility of the 1D Schrödinger equations with the harmonic potential and a quasi-periodic in time potential.

Contents
1. Introduction
2. Statement of the Abstract Result
3. The Linear Step
4. The KAM Step
5. Iteration and Convergence
6. Application to the Nonlinear Schrödinger Equation
7. Application to the Linear Schrödinger Equation
A. Appendix
References

1. Introduction

Let $\varpi : \mathbb{N} \to [0, +\infty[$ be such that $\varpi(j) \ge j$ for all $j \ge 1$. We consider the (complex) Hilbert space $\ell^2_\varpi$ defined by the norm
$$\|w\|_\varpi^2 = \sum_{j \ge 1} |w_j|^2\, \varpi^2(j).$$

The first author was supported in part by the grant ANR-06-BLAN-0063. The second author was supported in part by the grant ANR-07-BLAN-0250.

We define the symplectic phase space $\mathcal{P} = \mathcal{P}_\varpi$ as
$$\mathcal{P} = \mathbb{T}^n \times \mathbb{R}^n \times \ell^2_\varpi \times \ell^2_\varpi, \tag{1.1}$$
equipped with the canonical symplectic structure
$$\sum_{j=1}^{n} d\theta_j \wedge dy_j + \sum_{j \ge 1} du_j \wedge dv_j.$$
For $(\theta, y, u, v) \in \mathcal{P}$ we introduce the following Hamiltonian in normal form
$$N = \sum_{j=1}^{n} \omega_j(\xi)\, y_j + \frac12 \sum_{j \ge 1} \Omega_j(\xi)\,(u_j^2 + v_j^2), \tag{1.2}$$
where $\xi \in \mathbb{R}^n$ is an external parameter.

In [10] (see also [11] and a slightly generalised version in [16]) S.B. Kuksin has shown the persistence of n-dimensional tori for the perturbed Hamiltonians $H = N + P$ under general conditions on the frequencies $\omega_j$, $\Omega_j$ and on the perturbation P, which essentially are the following. Firstly, the frequencies satisfy some Melnikov conditions and the external frequencies $\Omega_j$ have to be well separated, in the sense that there exists $d \ge 1$ so that, roughly speaking (see Assumption 2 below),
$$\Omega_j(\xi) \approx j^d. \tag{1.3}$$
Denote by $\mathcal{P}^{a,p}$ the phase space given by the weight $\varpi(j) = j^{p/2} e^{aj}$, where $p \ge 0$ and $a \ge 0$. Secondly, the perturbation is real analytic and the corresponding Hamiltonian vector field is such that
$$X_P : \mathcal{P}^{a,p} \longrightarrow \mathcal{P}^{a,\bar p} \quad\text{with}\quad \begin{cases} \bar p \ge p & \text{for } d > 1,\\ \bar p > p & \text{for } d = 1, \end{cases} \tag{1.4}$$
where d is the constant which appears in (1.3).

For instance, the Schrödinger and the wave equation on $[0, \pi]$ with Dirichlet boundary conditions satisfy the previous conditions, see respectively the KAM results of Kuksin-Pöschel [13] and Pöschel [18]. In fact the result in [13] is stronger because there is no external parameter ξ in the equation.

Now, if we consider the nonlinear harmonic oscillator
$$i\partial_t u = -\partial_x^2 u + x^2 u + V(x)u + |u|^{2m} u, \quad (t,x) \in \mathbb{R} \times \mathbb{R}, \tag{1.5}$$
with real and bounded potential V, we have $\Omega_j \sim 2j + 1$, hence $d = 1$, but the Hamiltonian perturbation, which is here
$$P = \int_{\mathbb{R}} (u \bar u)^{m+1}\, dx, \tag{1.6}$$
does not satisfy the strict smoothing condition (1.4) (see Sect. 6 for more details).

The aim of this paper is to prove a KAM theorem (Theorem 2.3) in the case $d = 1$ and $\bar p = p$ in (1.3) and (1.4). To compensate for the lack of smoothing effect of $X_P$ we need some additional conditions (see Assumption 4) on the decay of the derivatives of P (in the spirit of the so-called Töplitz-Lipschitz condition used by Eliasson & Kuksin in [6]) which will be satisfied by the perturbation (1.6). The general strategy is explained in more detail in Sect. 2.3.


Notice that S.B. Kuksin has already considered in [11] the harmonic oscillator with a smoothing nonlinearity of type $P = \int_{\mathbb{R}} \varphi(|u \star \xi|)\, dx$, where ξ is a fixed smooth function.

We present two applications of our abstract result concerning the harmonic oscillator $T = -\partial_x^2 + x^2$. Let $p \ge 2$ and denote by $\ell^2_p$ the space $\ell^2_\varpi$ with $\varpi(j) = j^{p/2}$. The operator T has eigenfunctions $(h_j)_{j \ge 1}$ (the Hermite functions) which satisfy $T h_j = (2j - 1) h_j$, $j \ge 1$, and form a Hilbertian basis of $L^2(\mathbb{R})$. Let $u = \sum_{j \ge 1} u_j h_j$ be a typical element of $L^2(\mathbb{R})$. Then $(u_j)_{j \ge 1} \in \ell^2_p$ if and only if $u \in \mathcal{H}^p := D(T^{p/2}) = \{u \in L^2(\mathbb{R}) \mid T^{p/2} u \in L^2(\mathbb{R})\}$. Indeed $\mathcal{H}^p$ is a Sobolev space based on T and we can check that $\mathcal{H}^p = D(T^{p/2}) = \{u \in L^2(\mathbb{R}) \mid x^\alpha \partial^\beta u \in L^2(\mathbb{R}) \text{ for } \alpha + \beta \le p\}$. In this context, we are able to apply our KAM result to (1.5) and we obtain (see Theorem 6.6 for a more precise statement)

Theorem 1.1. Let $m \ge 1$ be an integer. For typical potential V and for $\varepsilon > 0$ small enough, the nonlinear Schrödinger equation
$$i\partial_t u = -\partial_x^2 u + x^2 u + \varepsilon V(x) u \pm |u|^{2m} u \tag{1.7}$$
has many quasi-periodic solutions in $\mathcal{H}^\infty$.

Here the notion of “typical potential” is vague. This means that there exists a rather large class of perturbations of the harmonic oscillator so that the result of Theorem 1.1 holds true (unfortunately our result does not cover the case V = 0). Since the definition of this class is technical, we postpone it to Sect. 6.

The physical motivation for considering Eq. (1.7) (for V = 0) comes from the Gross-Pitaevski equation used in the study of Bose-Einstein condensation (see [15]). The harmonic potential $x^2$ arises from a Taylor expansion near the bottom of a general smooth well. In our work, we have to add a small linear perturbation V to the harmonic potential in order to avoid resonances (see the non-resonance condition (2.3) below).

The generalisation of such a result to a multidimensional setting is not straightforward for a spectral reason: the spectrum of the linear part is no longer well separated. We could expect to adapt the tools introduced in [6] but the arithmetic properties of the corresponding spectra are not the same: in [6] the free frequencies are $j_1^2 + j_2^2 + \cdots + j_D^2$ for all $j_1, \dots, j_D \in \mathbb{Z}$, while in our case they are $2(j_1 + j_2 + \cdots + j_D) + D$ for all $j_1, \dots, j_D \in \mathbb{N}$. Nevertheless we mention that it is still possible to obtain a Birkhoff normal form for (1.5) as recently proved in [9].

A consequence of Theorem 1.1 is the existence of periodic solutions to (1.7). There are other approaches to construct periodic solutions of this equation. For instance, the gain of compactness yielded by the confining potential $x^2$ allows the use of variational methods. We develop this point of view in the Appendix.

The second application concerns the reducibility of a linear harmonic oscillator, $T = -\partial_x^2 + x^2$, on $L^2(\mathbb{R})$ perturbed by a quasi-periodic in time potential.
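The remark above on the multidimensional spectrum can be illustrated with a small enumeration. The sketch below is not taken from the paper: it assumes that the indices $j_i$ range over $\{0, 1, 2, \dots\}$ (so that D = 1 gives the values 1, 3, 5, ...), truncates each index at a hypothetical `max_index`, and counts how often each frequency $2(j_1 + \cdots + j_D) + D$ occurs. The function names are illustrative only.

```python
from collections import Counter
from itertools import product
from math import comb


def oscillator_spectrum(dim, max_index):
    """Count the frequencies 2(j_1+...+j_D)+D with each j_i in {0,...,max_index}.

    Multiplicities are complete only for frequencies 2n+D with n <= max_index,
    because larger sums are cut off by the truncated index box.
    """
    counts = Counter()
    for js in product(range(max_index + 1), repeat=dim):
        counts[2 * sum(js) + dim] += 1
    return counts


def multiplicity(dim, n):
    """Number of (j_1,...,j_D) in N^D with j_1+...+j_D = n (stars and bars)."""
    return comb(n + dim - 1, dim - 1)


if __name__ == "__main__":
    for dim in (1, 2, 3):
        spectrum = oscillator_spectrum(dim, max_index=6)
        row = [(2 * n + dim, spectrum[2 * n + dim]) for n in range(5)]
        print(f"D={dim}: (frequency, multiplicity) = {row}")
```

For D = 1 every frequency is simple and consecutive frequencies differ by 2, whereas for D ≥ 2 the multiplicity of the frequency 2n + D grows with n, which is one way to see why the separation property used in the one-dimensional case is lost.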
This kind of reducibility result for PDEs using KAM machinery was first obtained by Bambusi & Graffi [1] for the Schrödinger equation with an $x^\beta$ potential, β being strictly larger than 2 (notice that in that case the exponent d > 1 in the asymptotics of the frequencies (1.3)). This result was recently extended by Liu and Yuan [14] to include the Duffing oscillator.

Here we follow the more recent approach developed by Eliasson & Kuksin (see [7]) for the Schrödinger equation on the multidimensional torus. Namely we consider the linear equation
$$i\partial_t u = -\partial_x^2 u + x^2 u + \varepsilon V(t\omega, x) u, \quad u = u(t,x), \; x \in \mathbb{R},$$
where $\varepsilon > 0$ is a small parameter and the frequency vector ω of forced oscillations is regarded as a parameter in $U \subset \mathbb{R}^n$. We assume that the potential $V : \mathbb{T}^n \times \mathbb{R} \ni (\theta, x) \mapsto \mathbb{R}$ is analytic in θ on $|\mathrm{Im}\, \theta| < s$ for some $s > 0$, and $C^2$ in x, and we suppose that there exist $\delta > 0$ and $C > 0$ so that for all $\theta \in [0, 2\pi)^n$ and $x \in \mathbb{R}$,
$$|V(\theta, x)| \le C (1 + x^2)^{-\delta}, \qquad |\partial_x V(\theta, x)| \le C, \qquad |\partial_{xx} V(\theta, x)| \le C. \tag{1.8}$$
In Sect. 7 we consider the previous equation as a linear non-autonomous equation in the complex Hilbert space $L^2(\mathbb{R})$ and we prove (see Theorem 7.1 for a more precise statement)

Theorem 1.2. Assume that V satisfies (1.8). Then there exists $\varepsilon_0$ such that for all $0 \le \varepsilon < \varepsilon_0$ there exists $\Lambda_\varepsilon \subset [0, 2\pi)^n$ of positive measure and asymptotically full measure: $\mathrm{Meas}(\Lambda_\varepsilon) \to (2\pi)^n$ as $\varepsilon \to 0$, such that for all $\omega \in \Lambda_\varepsilon$, the linear Schrödinger equation
$$i\partial_t u = -\partial_x^2 u + x^2 u + \varepsilon V(t\omega, x) u \tag{1.9}$$
reduces, in $L^2(\mathbb{R})$, to a linear equation with constant coefficients (with respect to the time variable).

In particular, we prove the following result concerning the solutions of (1.9).

Corollary 1.3. Assume that V is $C^\infty$ in x with all its derivatives bounded and satisfying (1.8). Let $p \ge 0$ and $u_0 \in \mathcal{H}^p$. Then there exists $\varepsilon_0 > 0$ so that for all $0 < \varepsilon < \varepsilon_0$ and $\omega \in \Lambda_\varepsilon$, there exists a unique solution $u \in C(\mathbb{R}; \mathcal{H}^p)$ of (1.9) so that $u(0) = u_0$. Moreover, u is almost-periodic in time and we have the bounds
$$(1 - \varepsilon C)\, \|u_0\|_{\mathcal{H}^p} \le \|u(t)\|_{\mathcal{H}^p} \le (1 + \varepsilon C)\, \|u_0\|_{\mathcal{H}^p}, \quad \forall\, t \in \mathbb{R},$$
for some $C = C(p, \omega)$.

Remark 1.4. In the very particular case where V satisfies (1.8) and is independent of θ, the result of Corollary 1.3 is easy to prove. In that case, the solution of (1.9) reads
$$u(t, x) = \sum_{n \ge 0} c_n e^{-i\lambda_n t} \varphi_n(x),$$
where $(\varphi_n)_{n \ge 0}$ and $(\lambda_n)_{n \ge 0}$ are the eigenfunctions and the eigenvalues of $-\partial_x^2 + x^2 + \varepsilon V(x)$, and some $(c_n)_{n \ge 0} \in \mathbb{C}$. The result follows thanks to the asymptotics of $\varphi_j$ when $\varepsilon \to 0$ (see Sect. 6 for similar considerations).

The previous results show that all solutions to (1.9) remain bounded in time, for a large set of parameters $\omega \in [0, 2\pi)^n$. A natural question is whether we can find a real valued potential V, quasi-periodic in time, and a solution $u \in \mathcal{H}^p$ so that $\|u(t)\|_{\mathcal{H}^p}$ does not remain bounded when $t \to +\infty$. J.-M. Delort [4] has recently shown that this is the case if V is replaced by a pseudo-differential operator: he proves that there exist smooth solutions so that for all $p \ge 0$ and $t \ge 0$, $\|u(t)\|_{\mathcal{H}^p} \ge c\, t^{p/2}$, which is the optimal growth. We also refer to the introduction of [4] for a survey on the problem of Sobolev growth for the linear Schrödinger equation.
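Returning to Remark 1.4, the mechanism behind the bounds of Corollary 1.3 is easiest to see in the unperturbed case. The sketch below is an illustration under stated assumptions, not the authors' argument: it takes ε = 0, so that the eigenfunctions are the Hermite functions with eigenvalues 2j − 1 (j ≥ 1), represents a solution by finitely many Hermite coefficients, and measures it in the weighted norm with weight $j^{p/2}$. The names `evolve` and `weighted_norm` are hypothetical.

```python
import cmath
import math


def evolve(coeffs, t):
    """Free flow for epsilon = 0: the j-th coefficient (j >= 1) rotates at frequency 2j-1."""
    return [c * cmath.exp(-1j * (2 * j - 1) * t)
            for j, c in enumerate(coeffs, start=1)]


def weighted_norm(coeffs, p):
    """Weighted l^2 norm sqrt(sum_j j^p |u_j|^2), i.e. the weight j^(p/2) from the text."""
    return math.sqrt(sum(j ** p * abs(c) ** 2
                         for j, c in enumerate(coeffs, start=1)))


if __name__ == "__main__":
    u0 = [1 + 0j, 0.5j, -0.25 + 0j, 0.1 + 0j]
    for t in (0.0, 0.3, 1.7, 2 * math.pi):
        print(t, weighted_norm(evolve(u0, t), p=2))
```

Each coefficient only changes by a phase, so the weighted norm is constant in time and the solution is 2π-periodic; in this sketch the two-sided bound of Corollary 1.3 therefore holds with equality. For ε > 0 the eigenfunctions are no longer the Hermite functions, which is where the factors 1 ± εC come from.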


Another way to understand the result of Theorem 1.2 is in terms of the Floquet operator (see [5] and [23] for mathematical considerations, and [8,21] for the physical meaning). Consider on $L^2(\mathbb{R}) \otimes L^2(\mathbb{T}^n)$ the Floquet Hamiltonian
$$K := i \sum_{k=1}^{n} \omega_k \frac{\partial}{\partial \theta_k} - \partial_x^2 + x^2 + \varepsilon V(\theta, x), \tag{1.10}$$
then we have

Corollary 1.5. Assume that V satisfies (1.8). There exists $\varepsilon_0 > 0$ so that for all $0 < \varepsilon < \varepsilon_0$ and $\omega \in \Lambda_\varepsilon$, the spectrum of the Floquet operator K is pure point.

A similar result, using a different KAM strategy, was obtained by W.M. Wang in [23] in the case where
$$V(t\omega, x) = |h_1(x)|^2 \sum_{k=1}^{n} \cos(\omega_k t + \varphi_k),$$
where $h_1$ is the first Hermite function.

At the end of Sect. 7 we make explicit computations in the case of a potential which is independent of the space variable. This example shows that one cannot avoid restricting the choice of parameters ω to a Cantor type set in Theorem 1.2.

2. Statement of the Abstract Result

We give in this section our abstract KAM result.

2.1. The assumptions on the Hamiltonian and its perturbation. Let $\Pi \subset \mathbb{R}^n$ be a bounded closed set so that $\mathrm{Meas}(\Pi) > 0$, where Meas denotes the Lebesgue measure in $\mathbb{R}^n$. The set Π is the space of the external parameters ξ. Denote by $\Delta_{\xi\eta}$ the difference operator in the variable ξ:
$$\Delta_{\xi\eta} f = f(\cdot, \xi) - f(\cdot, \eta).$$
For $l = (l_1, \dots, l_k, \dots) \in \mathbb{Z}^\infty$ so that only a finite number of coordinates are nonzero, we denote by $|l| = \sum_{j=1}^{\infty} |l_j|$ its length, and $\langle l \rangle = 1 + |\sum_{j=1}^{\infty} j\, l_j|$. We set
$$\mathcal{Z} = \{(k, l) \ne 0, \; |l| \le 2\} \subset \mathbb{Z}^n \times \mathbb{Z}^\infty.$$
The first two assumptions we make concern the frequencies of the Hamiltonian in normal form (1.2).

Assumption 1 (Nondegeneracy). Denote by $\omega = (\omega_1, \dots, \omega_n)$ the internal frequencies. We assume that the map $\xi \mapsto \omega(\xi)$ is a homeomorphism from Π to its image which is Lipschitz continuous and its inverse also. Moreover we assume that for all $(k, l) \in \mathcal{Z}$,
$$\mathrm{Meas}\big(\{\xi : k \cdot \omega(\xi) + l \cdot \Omega(\xi) = 0\}\big) = 0, \tag{2.1}$$
and for all $\xi \in \Pi$,
$$l \cdot \Omega(\xi) \ne 0, \quad \forall\, 1 \le |l| \le 2.$$


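Assumption 2 (Spectral asymptotics). Set $\Omega_0 = 0$. We assume that there exists $m > 0$ so that for all $i, j \ge 0$ and uniformly on Π,
$$|\Omega_i - \Omega_j| \ge m\, |i - j|.$$
Moreover we assume that there exists $\beta > 0$ such that the functions
$$\xi \longmapsto j^{2\beta}\, \Omega_j(\xi)$$
are uniformly Lipschitz on Π for $j \ge 1$.

If the previous assumptions are satisfied (and actually without assuming (2.1)), J. Pöschel [16] proves that there exist a finite set $X \subset \mathcal{Z}$ and $\tilde\Pi_\alpha \subset \Pi$ with $\mathrm{Meas}(\Pi \setminus \tilde\Pi_\alpha) \to 0$ when $\alpha \to 0$, such that for all $\xi \in \tilde\Pi_\alpha$,
$$|k \cdot \omega(\xi) + l \cdot \Omega(\xi)| \ge \alpha\, \frac{\langle l \rangle}{1 + |k|^\tau}, \quad (k, l) \in \mathcal{Z} \setminus X, \tag{2.2}$$
for some large τ depending on n and β. Then assuming (2.1), J. Pöschel proves [16, Corollary C and its proof] that the nonresonance condition (2.2) remains valid on all $\mathcal{Z}$, i.e.
$$|k \cdot \omega(\xi) + l \cdot \Omega(\xi)| \ge \alpha\, \frac{\langle l \rangle}{1 + |k|^\tau}, \quad (k, l) \in \mathcal{Z}, \; \xi \in \tilde\Pi_\alpha. \tag{2.3}$$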

In the sequel, we will use the distance | − |2β, = sup sup j 2β | j (ξ ) − j (ξ )| ξ ∈ j≥1

and the semi-norm 2β ||L 2β, = sup sup j ξ,η∈ j≥1 ξ =η

|ξ η j | . |ξ − η|

Finally, we set L |ω|L + ||2β, = M,

where |ω|L = supξ,η∈ max1≤k≤n ξ =η

|ξ η ωk | |ξ −η| .
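The nonresonance condition (2.3) can be tested numerically for a given pair $(k,l)$. The following is only a minimal sketch under stated assumptions: the helper names, the choice of the $\ell^1$ norm for $|k|$, and the toy external frequencies `toy_Omega` (odd integers) are hypothetical illustrations and are not taken from the paper.

```python
from typing import Callable, Dict, Sequence


def bracket(l: Dict[int, int]) -> int:
    """<l> = 1 + |sum_j j * l_j| for a finitely supported integer vector l (indices j >= 1)."""
    return 1 + abs(sum(j * lj for j, lj in l.items()))


def small_divisor(k: Sequence[int], l: Dict[int, int],
                  omega: Sequence[float], Omega: Callable[[int], float]) -> float:
    """The quantity k . omega + l . Omega appearing in (2.3)."""
    return (sum(ki * wi for ki, wi in zip(k, omega))
            + sum(lj * Omega(j) for j, lj in l.items()))


def nonresonant(k, l, omega, Omega, alpha: float, tau: float) -> bool:
    """Check |k.omega + l.Omega| >= alpha * <l> / (1 + |k|^tau) for one pair (k, l)."""
    k_norm = sum(abs(ki) for ki in k)  # assumption: |k| is taken as the l^1 norm
    return abs(small_divisor(k, l, omega, Omega)) >= alpha * bracket(l) / (1 + k_norm ** tau)


def toy_Omega(j: int) -> float:
    """Hypothetical toy external frequencies, chosen only for illustration."""
    return 2.0 * j - 1.0
```

For instance, `nonresonant((0,), {1: 1, 2: -1}, (1.5,), toy_Omega, 0.5, 2.0)` compares $|\Omega_1 - \Omega_2| = 2$ with $\alpha\langle l\rangle = 1$, while an exact resonance such as $k\cdot\omega = 2$, $l\cdot\Omega = -2$ fails the test for every $\alpha > 0$.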

Remark 2.1. The proof of (2.3) crucially uses the control of the Lipschitz semi-norm ||L 2β, (see [16, Lemma 5]). For this reason in Assumptions 3 and 4 below we have to control the Lipschitz version of each semi-norm introduced on P or X P . Recall that the phase space P is defined by (1.1), with a weight so that ( j) ≥ j, as in the beginning of the Introduction. As in [16], for s, r > 0 we define the (complex) neighbourhood of Tn × 0, 0, 0 in P,

$$D(s,r) = \big\{(\theta,y,u,v) \in \mathcal{P} \ \text{s.t.}\ |\mathrm{Im}\,\theta| < s,\ |y| < r^2,\ \|u\| + \|v\| < r\big\}. \tag{2.4}$$
Let $r > 0$. Then for $W = (X,Y,U,V)$ we define
$$|W|_r = |X| + \frac{1}{r^2}|Y| + \frac{1}{r}\big(\|U\| + \|V\|\big).$$


The next assumption concerns the regularity of the vector field associated to $P$. Denote by
$$X_P = (\partial_y P,\ -\partial_\theta P,\ \partial_v P,\ -\partial_u P).$$
Then

Assumption 3 (Regularity). We assume that there exist $s, r > 0$ so that
$$X_P : D(s,r)\times\Pi \longrightarrow \mathcal{P}.$$
Moreover we assume that for all $\xi \in \Pi$, $X_P(\cdot,\xi)$ is analytic in $D(s,r)$ and that for all $w \in D(s,r)$, $P(w,\cdot)$ and $X_P(w,\cdot)$ are Lipschitz continuous on $\Pi$.

We then define the norms
$$\|P\|_{D(s,r)} := \sup_{D(s,r)\times\Pi} |P| < +\infty,$$
and
$$\|P\|^{L}_{D(s,r)} = \sup_{\substack{\xi,\eta\in\Pi\\ \xi\neq\eta}}\,\sup_{D(s,r)} \frac{|\Delta_{\xi\eta}P|}{|\xi-\eta|},$$
where $\Delta_{\xi\eta}P = P(\cdot,\xi) - P(\cdot,\eta)$, and we define the semi-norms
$$\|X_P\|_{r,D(s,r)} := \sup_{D(s,r)\times\Pi} |X_P|_r < +\infty,$$
and
$$\|X_P\|^{L}_{r,D(s,r)} := \sup_{\substack{\xi,\eta\in\Pi\\ \xi\neq\eta}}\,\sup_{D(s,r)} \frac{|\Delta_{\xi\eta}X_P|_r}{|\xi-\eta|} < +\infty,$$
where $\Delta_{\xi\eta}X_P = X_P(\cdot,\xi) - X_P(\cdot,\eta)$.

In the sequel, we will often work in the complex coordinates
$$z = \frac{1}{\sqrt{2}}(u - iv), \qquad \bar z = \frac{1}{\sqrt{2}}(u + iv).$$
Notice that this is not a canonical change of variables and in the variables $(\theta,y,z,\bar z) \in \mathcal{P}$ the symplectic structure reads
$$\sum_{j=1}^{n} d\theta_j\wedge dy_j + i\sum_{j\ge1} dz_j\wedge d\bar z_j,$$
and the Hamiltonian in normal form is
$$N = \sum_{j=1}^{n}\omega_j(\xi)\,y_j + \sum_{j\ge1}\Omega_j(\xi)\,z_j\bar z_j. \tag{2.5}$$

As we mentioned previously we need some decay on the derivatives of $P$. We first introduce the space $\Gamma^{\beta}_{r,D(s,r)}$: let $\beta > 0$, we say that $P \in \Gamma^{\beta}_{r,D(s,r)}$ if $\langle P\rangle_{r,D(s,r)} + \langle P\rangle^{L}_{r,D(s,r)} < \infty$ where:


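• The norm $\langle\cdot\rangle_{r,D(s,r)}$ is defined by the conditions¹
$$\|P\|_{D(s,r)} \le r^2\,\langle P\rangle_{r,D(s,r)}, \qquad \max_{1\le j\le n}\Big\|\frac{\partial P}{\partial y_j}\Big\|_{D(s,r)} \le \langle P\rangle_{r,D(s,r)},$$
$$\Big\|\frac{\partial P}{\partial w_j}\Big\|_{D(s,r)} \le \frac{r}{j^{\beta}}\,\langle P\rangle_{r,D(s,r)}, \quad \forall\, j \ge 1 \ \text{and}\ w_j = z_j, \bar z_j,$$
$$\Big\|\frac{\partial^2 P}{\partial w_j\,\partial w_l}\Big\|_{D(s,r)} \le \frac{1}{(jl)^{\beta}}\,\langle P\rangle_{r,D(s,r)}, \quad \forall\, j,l \ge 1 \ \text{and}\ w_j = z_j, \bar z_j.$$

• The semi-norm $\langle\cdot\rangle^{L}_{r,D(s,r)}$ is defined by the conditions
$$\|P\|^{L}_{D(s,r)} \le r^2\,\langle P\rangle^{L}_{r,D(s,r)}, \qquad \max_{1\le j\le n}\Big\|\frac{\partial P}{\partial y_j}\Big\|^{L}_{D(s,r)} \le \langle P\rangle^{L}_{r,D(s,r)},$$
$$\Big\|\frac{\partial P}{\partial w_j}\Big\|^{L}_{D(s,r)} \le \frac{r}{j^{\beta}}\,\langle P\rangle^{L}_{r,D(s,r)}, \quad \forall\, j \ge 1 \ \text{and}\ w_j = z_j, \bar z_j,$$
$$\Big\|\frac{\partial^2 P}{\partial w_j\,\partial w_l}\Big\|^{L}_{D(s,r)} \le \frac{1}{(jl)^{\beta}}\,\langle P\rangle^{L}_{r,D(s,r)}, \quad \forall\, j,l \ge 1 \ \text{and}\ w_j = z_j, \bar z_j.$$

The last assumption is then the following.

Assumption 4 (Decay). $P \in \Gamma^{\beta}_{r,D(s,r)}$ for some $\beta > 0$.

Remark 2.2. The control of the second derivative is the most important condition. The other ones are imposed so that we are able to recover the last one after the KAM iteration (see Lemma 3.4). Furthermore the assumptions on the first derivatives are already contained in Assumption 3 as soon as $p > 0$.

2.2. Statement of the abstract KAM Theorem. Recall that $M = |\omega|^{L}_{\Pi} + |\Omega|^{L}_{2\beta,\Pi}$.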

¹ This means that $\langle P\rangle_{r,D(s,r)}$ is the smallest real number which satisfies the mentioned conditions; this defines a norm.

Theorem 2.3. Suppose that $N$ is a family of Hamiltonians of the form (2.5) on the phase space $\mathcal{P}$ depending on parameters $\xi \in \Pi$ so that Assumptions 1 and 2 are satisfied. Then there exist $\varepsilon_0 > 0$ and $s > 0$ so that for every perturbation $H = N + P$ of $N$ which satisfies Assumptions 3 and 4 and the smallness condition
$$\varepsilon = \big(\|X_P\|_{r,D(s,r)} + \langle P\rangle_{r,D(s,r)}\big) + \frac{\alpha}{M}\big(\|X_P\|^{L}_{r,D(s,r)} + \langle P\rangle^{L}_{r,D(s,r)}\big) \le \varepsilon_0\,\alpha,$$
for some $r > 0$ and $0 < \alpha \le 1$, the following holds. There exist

(i) a Cantor set $\Pi_\alpha \subset \Pi$ with $\mathrm{Meas}(\Pi\setminus\Pi_\alpha) \to 0$ as $\alpha \to 0$;

(ii) a Lipschitz family of real analytic, symplectic coordinate transformations $\Phi : D(s/2,r/2)\times\Pi_\alpha \to D(s,r)$;

(iii) a Lipschitz family of new normal forms
$$N^* = \sum_{j=1}^{n}\omega^*_j(\xi)\,y_j + \sum_{j\ge1}\Omega^*_j(\xi)\,z_j\bar z_j$$
defined on $D(s/2,r/2)\times\Pi_\alpha$;

such that
$$H\circ\Phi = N^* + R^*,$$
where $R^*$ is analytic on $D(s/2,r/2)$ and globally of order 3 at $\mathbb{T}^n\times\{0,0,0\}$. That is, the Taylor expansion of $R^*$ only contains monomials $y^m z^q \bar z^{\bar q}$ with $2|m| + |q+\bar q| \ge 3$. Moreover each symplectic coordinate transformation is close to the identity
$$\|\Phi - \mathrm{Id}\|_{r,D(s/2,r/2)} \le c\,\varepsilon, \tag{2.6}$$
the new frequencies are close to the original ones
$$|\omega^* - \omega|_{\Pi_\alpha} + |\Omega^* - \Omega|_{2\beta,\Pi_\alpha} \le c\,\varepsilon, \tag{2.7}$$

and the new frequencies satisfy a nonresonance condition
$$|k\cdot\omega^*(\xi) + l\cdot\Omega^*(\xi)| \ge \frac{\alpha}{2}\,\frac{\langle l\rangle}{1+|k|^\tau}, \quad (k,l)\in\mathcal{Z},\ \xi\in\Pi_\alpha. \tag{2.8}$$
As a consequence, for each $\xi \in \Pi_\alpha$ the torus $\mathbb{T}^n\times\{0,0,0\}$ is still invariant under the flow of the perturbed Hamiltonian $H = N + P$, the flow is linear (in the new variables) on these tori and furthermore all these tori are linearly stable.

2.3. General strategy. The general strategy is the classical one used for instance in [10,11,16]. For the convenience of the reader we recall it. Let $H = N + P$ be a Hamiltonian, where $N$ is given by (2.5) and $P$ is a perturbation which satisfies the assumptions of the previous section. We then consider the second order Taylor approximation of $P$, which is
$$R = \sum_{2|m|+|q+\bar q|\le 2}\ \sum_{k\in\mathbb{Z}^n} R_{kmq\bar q}\, e^{ik\cdot\theta}\, y^m z^q \bar z^{\bar q}, \tag{2.9}$$
with $R_{kmq\bar q} = P_{kmq\bar q}$, and we define its mean value by
$$[R] = \sum_{|m|+|q|=1} R_{0mqq}\, y^m z^q \bar z^{q}.$$
Recall that in this setting $z, \bar z$ have homogeneity 1, whereas $y$ has homogeneity 2.

Let $F$ be a function of the form (2.9) and denote by $X^t_F$ the flow at time $t$ associated to the vector field of $F$. We can then define a new Hamiltonian by $H\circ X^1_F := N_+ + P_+$, and the Hamiltonian structure is preserved, because $X^1_F$ is a symplectic transformation. The idea of the KAM step is to find, iteratively, an adequate function $F$ so that the new error term has a small quadratic part. Namely, thanks to the Taylor formula we can write
$$\begin{aligned}
H\circ X^1_F &= N\circ X^1_F + (P-R)\circ X^1_F + R\circ X^1_F\\
&= N + \{N,F\} + \int_0^1 (1-t)\,\{\{N,F\},F\}\circ X^t_F\,dt\\
&\quad + (P-R)\circ X^1_F + R + \int_0^1 \{R,F\}\circ X^t_F\,dt.
\end{aligned}$$


In view of the previous equation, we define the new normal form by $N_+ = N + \widehat{N}$, where $\widehat{N}$ satisfies the so-called homological equation (the unknowns are $F$ and $\widehat{N}$)
$$\{F,N\} + \widehat{N} = R. \tag{2.10}$$
The new normal form $N_+$ has the form (2.5) with new frequencies given by
$$\omega_+(\xi) = \omega(\xi) + \widehat{\omega}(\xi) \quad\text{and}\quad \Omega_+(\xi) = \Omega(\xi) + \widehat{\Omega}(\xi),$$
where
$$\widehat{\omega}_j(\xi) = \frac{\partial\widehat{N}}{\partial y_j}(0,0,0,0,\xi) \quad\text{and}\quad \widehat{\Omega}_j(\xi) = \frac{\partial^2\widehat{N}}{\partial z_j\,\partial\bar z_j}(0,0,0,0,\xi). \tag{2.11}$$
Once the homological equation is solved, we define the new perturbation term $P_+$ by
$$P_+ = (P-R)\circ X^1_F + \int_0^1 \{R(t),F\}\circ X^t_F\,dt, \tag{2.12}$$
where $R(t) = (1-t)\widehat{N} + tR$, in such a way that
$$H\circ X^1_F = N_+ + P_+.$$
Notice that if $P$ was initially of size $\varepsilon$, then $R$ and $F$ are of size $\varepsilon$, and the quadratic part of $P_+$ is formally of size $\varepsilon^2$. That is, the formal iterative scheme is exponentially convergent.

Without any smoothing effect on the regularity, there is no decreasing property in the correction term added to the external frequencies (2.11). In that case it would be impossible to control the small divisors (see (2.3)) at the next step. In this work the smoothing condition (1.4) on $X_P$ is replaced by Assumption 4 (see also Remark 4.3). The difficulty is to verify the conservation of this assumption at each step.
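The formal gain from $\varepsilon$ to $\varepsilon^2$ at each step can be illustrated by a toy recursion. This is only a sketch of the formal scheme: the function name and the constant `c` are hypothetical, and the model deliberately ignores the losses in $\sigma$, $\eta$ and $\alpha$ that the actual estimates have to absorb.

```python
def kam_error_sizes(eps0: float, steps: int, c: float = 1.0) -> list:
    """Iterate the formal rule eps_{n+1} = c * eps_n**2 and return [eps_0, ..., eps_steps]."""
    sizes = [eps0]
    for _ in range(steps):
        sizes.append(c * sizes[-1] ** 2)
    return sizes


# With c = 1 the n-th error is eps0 ** (2 ** n): the number of correct
# digits doubles at every step instead of growing linearly.
for n, eps in enumerate(kam_error_sizes(0.1, 4)):
    print(n, eps)
```

With `c = 1` and `eps0 = 0.1`, four steps already bring the formal error down to about $10^{-16}$, which is the sense in which the scheme converges much faster than a geometric iteration.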

Plan of the proof of Theorem 2.3. In Sect. 3 we solve the homological equation and give estimates on the solutions. Then we study precisely the flow map X tF and the composition H ◦ X 1F . In Sect. 4 we estimate the new error term and the new frequencies after the KAM step, and Sect. 5 is devoted to the convergence of the KAM method and the proof of Theorem 2.3. Notations. In this paper c, C denote constants the value of which may change from line to line. These constants will always be universal, or depend on the fixed quantities ∗ n, β, , p. We denote by N the set of the non negative ∞ integers, and N = N\{0}. For ∞ l = (l1 , . . . , lk , . . . ) ∈ Z , we denote by |l| = j=1 |l j | its length (if it is finite), and

$\langle l\rangle = 1 + \big|\sum_{j=1}^{\infty} j\,l_j\big|$. We define the space $\mathcal{Z} = \{(k,l) \neq 0,\ k\in\mathbb{Z}^n,\ l\in\mathbb{Z}^\infty,\ |l|\le 2\}$. The notation Meas stands for the Lebesgue measure in $\mathbb{R}^n$. In the sequel, we will state without proof some intermediate results of [16] which still hold under our conditions; hence the reader should refer to [16] for the details. For the convenience of the reader we decided to remain as close as possible to the notations of J. Pöschel.


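3. The Linear Step

In this section, we solve Eq. (2.10) and study the Lie transform $X^t_F$. Following [16], $\|\cdot\|^*$ (respectively $\langle\cdot\rangle^*$) stands either for $\|\cdot\|$ or $\|\cdot\|^L$ (respectively $\langle\cdot\rangle$ or $\langle\cdot\rangle^L$) and $\|\cdot\|^\lambda$ stands for $\|\cdot\| + \lambda\|\cdot\|^L$.

3.1. The homological equation. The following result shows that it is possible to solve Eq. (2.10) under the Diophantine condition (2.3).

Lemma 3.1 ([16]). Assume that the frequencies satisfy, uniformly on $\widetilde{\Pi}_\alpha$, for some $\alpha > 0$ the condition (2.3). Then the homological equation (2.10) has a solution $F, \widehat{N}$ which is normalised by $[F] = 0$, $[\widehat{N}] = \widehat{N}$, and satisfies for all $0 < \sigma < s$, and $0 \le \lambda \le \alpha/M$,
$$\|X_{\widehat{N}}\|^{*}_{r,D(s,r)} \le \|X_R\|^{*}_{r,D(s,r)}, \qquad \|X_F\|^{\lambda}_{r,D(s-\sigma,r)} \le \frac{C}{\alpha\sigma^{t}}\,\|X_R\|^{\lambda}_{r,D(s,r)},$$
where $t$ only depends on $n$ and $\tau$.

The space $\Gamma^{\beta}_{r,D(s,r)}$ is not stable under the Poisson bracket. Therefore we need to introduce the space $\Gamma^{\beta,+}_{r,D(s,r)} \subset \Gamma^{\beta}_{r,D(s,r)}$ endowed with the norm $\langle\cdot\rangle^{+}_{r,D(s,r)} + \langle\cdot\rangle^{+,L}_{r,D(s,r)}$ defined by the following conditions:
$$\|F\|^{*}_{D(s,r)} \le r^2\,\langle F\rangle^{+,*}_{r,D(s,r)}, \qquad \max_{1\le j\le n}\Big\|\frac{\partial F}{\partial y_j}\Big\|^{*}_{D(s,r)} \le \langle F\rangle^{+,*}_{r,D(s,r)},$$
$$\Big\|\frac{\partial F}{\partial w_j}\Big\|^{*}_{D(s,r)} \le \frac{r}{j^{\beta+1}}\,\langle F\rangle^{+,*}_{r,D(s,r)}, \quad \forall\, j\ge1 \ \text{and}\ w_j = z_j, \bar z_j,$$
$$\Big\|\frac{\partial^2 F}{\partial w_j\,\partial w_l}\Big\|^{*}_{D(s,r)} \le \frac{1}{(jl)^{\beta}(1+|j-l|)}\,\langle F\rangle^{+,*}_{r,D(s,r)}, \quad \forall\, j,l\ge1 \ \text{and}\ w_j = z_j, \bar z_j.$$
This definition is motivated by the following result, which can be understood as a smoothing property of the homological equation.

Lemma 3.2. Assume that the frequencies satisfy (2.3), uniformly on $\widetilde{\Pi}_\alpha$. Let $F, \widehat{N}$ be given by Lemma 3.1. Assume moreover that $R \in \Gamma^{\beta}_{r,D(s,r)}$, then there exists $C > 0$ so that for any $0 < \sigma < s$, we have $F \in \Gamma^{\beta,+}_{r,D(s-\sigma,r)}$, $\widehat{N} \in \Gamma^{\beta}_{r,D(s-\sigma,r)}$ and
$$\langle F\rangle^{+}_{r,D(s-\sigma,r)} \le \frac{C}{\alpha\sigma^{t}}\,\langle R\rangle_{r,D(s,r)},$$
$$\langle F\rangle^{+,L}_{r,D(s-\sigma,r)} \le \frac{C}{\alpha\sigma^{t}}\Big(\langle R\rangle^{L}_{r,D(s,r)} + \frac{M}{\alpha}\,\langle R\rangle_{r,D(s,r)}\Big), \tag{3.1}$$
and
$$\langle\widehat{N}\rangle_{r,D(s-\sigma,r)} \le \langle R\rangle_{r,D(s,r)}, \qquad \langle\widehat{N}\rangle^{L}_{r,D(s-\sigma,r)} \le \langle R\rangle^{L}_{r,D(s,r)},$$
where $t$ only depends on $n$ and $\tau$.

For the proof of this result, we need the classical lemma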


Lemma 3.3. Let $f : \mathbb{R} \to \mathbb{C}$ be a periodic function and assume that $f$ is holomorphic in the domain $|\mathrm{Im}\,\theta| < s$, and continuous on $|\mathrm{Im}\,\theta| \le s$. Then there exists $C > 0$ so that its Fourier coefficients satisfy
$$|\widehat{f}(k)| \le C\,e^{-|k|s}\sup_{|\mathrm{Im}\,\theta|<s}|f(\theta)|.$$
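The decay stated in Lemma 3.3 can be observed on an explicit example. The sketch below is an illustration under stated assumptions, not part of the paper: the test function $f(\theta) = 1/(1 - a e^{i\theta})$ with $a = 1/2$, the strip width `S`, and the helper names are hypothetical choices. This $f$ is holomorphic for $|\mathrm{Im}\,\theta| < \log(1/a)$ and has Fourier coefficients $a^k$ for $k \ge 0$.

```python
import cmath
import math

A = 0.5                      # f is holomorphic in the strip |Im theta| < log(1/A)
S = 0.5                      # a strip width strictly below log(1/A) = log 2
SUP_ON_STRIP = 1.0 / (1.0 - A * math.exp(S))   # sup of |f| over |Im theta| < S


def f(theta: float) -> complex:
    """Test function 1 / (1 - A e^{i theta}) = sum_{k >= 0} A^k e^{i k theta}."""
    return 1.0 / (1.0 - A * cmath.exp(1j * theta))


def fourier_coefficient(func, k: int, n_points: int = 256) -> complex:
    """Approximate (1/2pi) * integral of func(theta) e^{-ik theta} by an equispaced sum."""
    total = 0j
    for m in range(n_points):
        theta = 2.0 * math.pi * m / n_points
        total += func(theta) * cmath.exp(-1j * k * theta)
    return total / n_points


for k in range(6):
    coeff = abs(fourier_coefficient(f, k))
    bound = math.exp(-k * S) * SUP_ON_STRIP
    print(k, coeff, bound)
```

In this example the coefficients decay like $e^{-k\log 2}$, which is faster than the bound $e^{-kS}\sup|f|$ for the narrower strip of width `S`, as the lemma predicts with $C = 1$.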

Proof of Lemma 3.2. In [16], the author looks for a solution $F$ of (2.10) of the form of (2.9), i.e.
$$F = \sum_{2|m|+|q+\bar q|\le 2}\ \sum_{k\in\mathbb{Z}^n} F_{kmq\bar q}\, e^{ik\cdot\theta}\, y^m z^q \bar z^{\bar q}. \tag{3.2}$$
A direct computation then shows that the coefficients in (3.2) are given by
$$i\,F_{kmq\bar q} = \begin{cases} \dfrac{R_{kmq\bar q}}{k\cdot\omega + (q-\bar q)\cdot\Omega}, & \text{if } |k| + |q-\bar q| \neq 0,\\[2mm] 0, & \text{otherwise,}\end{cases} \tag{3.3}$$
and that we can set $\widehat{N} = [R]$.

In the following we will use the notation $q_j = (0,\dots,0,1,0,\dots)$, where the 1 is at the $j$-th position, and $q_{jl} = q_j + q_l$. The variables $z$ and $\bar z$ play exactly the same role, therefore it is enough to study the derivatives in the variable $z$. In the sequel we write $A_k = 1 + |k|^\tau$. Then it is easy to check that for any $j \ge 1$ and $\sigma > 0$,
$$\sum_{k\in\mathbb{Z}^n} A_k^{\,j}\, e^{-|k|\sigma} \le \frac{C}{\sigma^{t}},$$
for some $C > 0$ and $t = 2j\tau + n + 1$. In the sequel, $t$ may vary from line to line, but will remain independent of $\sigma$.

♠ We first prove that $\langle F\rangle^{+}_{r,D(s-\sigma,r)} \le \frac{C}{\alpha\sigma^{t}}\langle R\rangle_{r,D(s,r)}$.

• Observe that $\frac{\partial^2 R}{\partial z_j\partial z_l} = \sum_{k\in\mathbb{Z}^n} R_{k\,0\,q_{jl}\,0}\,e^{ik\cdot\theta}$, then according to Lemma 3.3, there exists

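$C > 0$ so that
$$|R_{k\,0\,q_{jl}\,0}| \le C\,\frac{\langle R\rangle_{r,D(s,r)}}{(jl)^{\beta}}\,e^{-|k|s},$$
and thus by (3.3) and (2.3),
$$|F_{k\,0\,q_{jl}\,0}| \le C\,\frac{A_k\,\langle R\rangle_{r,D(s,r)}}{\alpha\,(jl)^{\beta}(1+|j-l|)}\,e^{-|k|s}. \tag{3.4}$$
Therefore, as we also have
$$\frac{\partial^2 F}{\partial z_j\,\partial z_l} = \sum_{k\in\mathbb{Z}^n} F_{k\,0\,q_{jl}\,0}\,e^{ik\cdot\theta}, \tag{3.5}$$
we deduce that
$$\begin{aligned}
\Big\|\frac{\partial^2 F}{\partial z_j\,\partial z_l}\Big\|_{D(s-\sigma,r)} &\le \sum_{k\in\mathbb{Z}^n} |F_{k\,0\,q_{jl}\,0}|\,e^{|k|(s-\sigma)}\\
&\le \frac{C\,\langle R\rangle_{r,D(s,r)}}{\alpha\,(jl)^{\beta}(1+|j-l|)}\sum_{k\in\mathbb{Z}^n} A_k\,e^{-|k|\sigma}\\
&\le \frac{C\,\langle R\rangle_{r,D(s,r)}}{\alpha\sigma^{t}\,(jl)^{\beta}(1+|j-l|)}.
\end{aligned} \tag{3.6}$$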


• We compute ∂F = Fk 0 q j 0 eik·θ + ∂z j n k∈Z

Now observe that

∂R

∂z j |z=z=0


k∈Zn , l≥1

=

Fk 0 q j q l eik·θ z l + 2

Fk 0 2q j 0 eik·θ z j . (3.7)

k∈Zn

Rk 0 q j 0 eik·θ , then by Lemma 3.3,

k∈Zn

∂ R sup |z=z=0 ∂z j |Im θ|<s ∂R e−|k|s ≤ C e−|k|s ≤ Cr β Rr,D(s,r ) . ∂z j D(s,r ) j

|Rk 0 q j 0 | ≤ Ce−|k|s

From the previous estimate, (3.3) and (2.3) we get |Fk 0 q j 0 | ≤ and thus

Ak Cr Ak e−|k|s |Rk 0 q j 0 | ≤ Rr,D(s,r ) , α(1 + j) α j β (1 + j)

Fk 0 q j 0 eik·θ k∈Zn

D(s−σ,r )

≤

|Fk 0 q j 0 |e|k|(s−σ )

k∈Zn

≤ Cr

Rr,D(s,r ) Ak e−|k|σ α j β (1 + j) n

Cr Rr,D(s,r ) . ≤ ασ t j β (1 + j) Similarly, we have |Fk 0 2q j 0 | ≤

Cr Ak e−|k|s Rr,D(s,r ) , α j β (1+ j)

Fk 0 2q j 0 eik·θ k∈Zn

D(s−σ,r )

≤

k∈Z

(3.8)

which leads to

Cr Rr,D(s,r ) . ασ t j β (1 + j)

(3.9)

By Cauchy-Schwarz in the variable l and (3.5), (3.6), Fk 0 q j q l eik·θ z l D(s−σ,r )

k∈Zn , l≥1

≤

−2 (l)|

k∈Zn

l≥1

Fk 0 q j q l eik·θ |2

1 2

|zl |2 2 (l)

1 2

l≥1

1 1 Cr 2 ≤ Rr,D(s,r ) t β 2β 2 2 ασ j l (l)(1 + | j − l|) l≥1

Cr Rr,D(s,r ) ≤ , ασ t j β (1 + j) since (l) ≥ l. Finally, inserting (3.8), (3.9) and (3.10) in (3.7) we obtain ∂F Cr Rr,D(s,r ) . ≤ ∂z j D(s−σ,r ) ασ t j β (1 + j)

(3.10)

(3.11)


• We can write

∂F ∂yj

=

Fkm j 0 0 eik·θ . Hence by (3.3) and (2.3), |Fkm j 0 0 | ≤ Ak ∂R ik·θ , k∈Zn Rkm j 0 0 e α |Rkm j 0 0 |, and thanks to Lemma 3.3 applied to the series ∂ y j = k∈Zn

|Fkm j 0 0 | ≤ C

Ak −|k|s e Rr,D(s,r ) , α

(3.12)

and we obtain ∂F C ≤ |Fkm j 0 0 |e|k|(s−σ ) ≤ Rr,D(s,r ) . t ∂ y j D(s−σ,r ) ασ n

(3.13)

k∈Z

• To obtain the bound for F D(s−σ,r ) write F= Fk 0 0 0 eik·θ + Fk m j 0 0 eik·θ y j + k∈Zn

+

k∈Zn ,1≤ j≤n

k∈Zn , j,l≥1

Since R|y=z=z=0 =

Fk 0 q jl 0 eik·θ z j zl

k∈Zn , j,l≥1

Fk 00 q jl eik·θ z j z l +

Fk 0q j ql eik·θ z j zl .

(3.14)

k∈Zn , j,l≥1

k∈Zn

Rk 0 0 0 eik·θ , by Lemmas 3.3 and 3.1 we deduce that

Fk0 0 0 ≤ Cr 2 Ak e−|k|s Rr,D(s,r ) , α

(3.15)

hence, thanks to (3.12) and (3.15) we can bound the sums of the first line in (3.14) as in the previous point. Now thanks to (3.4) and to the Cauchy-Schwarz inequality we have Fk 0 q jl 0 eik·θ z j zl D(s−σ,r )

k∈Zn , j,l≥1

≤ ≤

|z j zl | CRr,D(s,r ) t β ασ ( jl) (1 + | j − l|) j,l≥1 |z | 2 CR r,D(s,r )

ασ t

j

j≥1

jβ

1 CRr,D(s,r ) 2 2 ≤ ( j)|z | j ασ t j 2β 2 ( j) j≥1

≤

j≥1

Cr 2 Rr,D(s,r ) . ασ t Cr 2 R

r,D(s,r ) Therefore we proved that F D(s−σ,r ) ≤ . ασ t This latter estimate together with the estimates (3.6), (3.11) and (3.13) shows that

+ Fr,D(s−σ,r ) ≤

C Rr,D(s,r ) . ασ t

♠ We now show that r,D(s−σ,r ) ≤ Rr,D(s,r ) . N

(3.16)


= [R] we have Since N = N

n

R0m j 00 y j +

j=1

R00q j q j z j z j ,

and we can observe that

∂R (θ, 0, 0, 0)dθ, θ∈Tn ∂ y j ∂2 R 1 = (θ, 0, 0, 0)dθ, (2π )n θ∈Tn ∂z j ∂z j

R0m j 00 = R00q j q j

(3.17)

j≥1

1 (2π )n

(3.18)

which imply the bounds |R0m j 00 | ≤ Rr,D(s,r ) and |R00q j q j | ≤ Rr,D(s,r ) /j 2β and thus (3.16). ♠ It remains to check the estimates with the Lipschitz semi-norms. As in [16], for |k| + |q j − ql | = 0 define δk, jl = k · ω + j − l . Then by (3.3), −1 −1 iξ η Fkmq j q l = δk, jl (η)ξ η Rkmq j q l + Rkmq j q l (ξ )ξ η δk, jl . −1 By (2.3), |δk, jl | ≤ Ak /α and thus −1 |ξ η δk, jl | ≤

A2k |k||ξ η ω| + |ξ η j | + |ξ η l | , 2 α

hence −1 |ξ η δk, jl |

|ξ − η|

≤C

k A2k L k A2k |ω| + ||L 2β, ≤ C M 2 , 2 α α

and we have |ξ η Fkmq j q l | |ξ − η|

≤C

k Ak |ξ η Rkmq j q l | M + Rkmq j q l (ξ ) . α |ξ − η| α

(3.19)

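Thanks to the estimate (3.19) it is easy to obtain (3.1). Finally, the estimate $\langle\widehat{N}\rangle^{L}_{r,D(s-\sigma,r)} \le \langle R\rangle^{L}_{r,D(s,r)}$ is a straightforward consequence of (3.17) and (3.18).

3.2. Estimates on the Poisson bracket.

Lemma 3.4. Let $R \in \Gamma^{\beta}_{r,D(s,r)}$ and $F \in \Gamma^{\beta,+}_{r,D(s,r)}$ be both of degree 2, i.e. of the form (2.9). Then there exists $C > 0$ so that for any $0 < \sigma < s$,
$$\langle\{R,F\}\rangle_{r,D(s-\sigma,r)} \le \frac{C}{\sigma}\,\langle R\rangle_{r,D(s,r)}\,\langle F\rangle^{+}_{r,D(s,r)}, \tag{3.20}$$
and
$$\langle\{R,F\}\rangle^{L}_{r,D(s-\sigma,r)} \le \frac{C}{\sigma}\Big(\langle R\rangle_{r,D(s,r)}\,\langle F\rangle^{+,L}_{r,D(s,r)} + \langle F\rangle^{+}_{r,D(s,r)}\,\langle R\rangle^{L}_{r,D(s,r)}\Big).$$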


Proof. The expansion of R, F reads

R, F

=

n ∂R ∂F ∂R ∂F ∂R ∂F ∂R ∂F +i . − − ∂θk ∂ yk ∂ yk ∂θk ∂z k ∂z k ∂z k ∂z k k=1

k≥1

It remains to estimate each term of this expansion and its derivatives. We will control the derivative with respect to θk thanks to the Cauchy formula: ∂P C ≤ P D(s,r ) , (3.21) ∂θk D(s−σ,r ) σ which explains the loss of σ . Notice that if P is of degree 2 (and that is the case for F and R) we have ∂2 P ∂2 P ∂3 P = = = 0, ∂z∂ y ∂ y2 ∂z 3

(3.22)

a fact which will be crucially used in the sequel. Finally observe that z and z exactly ∂ play the same role, hence we will only take ∂z into consideration. ♠ We first prove (3.20). • Since P Q D(s,r ) ≤ P D(s,r ) Q D(s,r ) we have by the Cauchy formula,

R, F

D(s−σ,r )

≤

1 Cr 2 + )Rr,D(s,r ) Fr,D(s,r (2n + ) σ k 2β+1 k≥1

Cr 2 + Rr,D(s,r ) Fr,D(s,r ≤ ). σ

(3.23)

• With (3.21) we have ∂ ∂ R ∂ F ∂ ∂ R ∂F ≤ ∂ y j ∂θk ∂ yk D(s−σ,r ) ∂θk ∂ y j D(s−σ,r ) ∂ yk D(s,r ) ∂F C ∂R ≤ σ ∂ y j D(s,r ) ∂ yk D(s,r ) C + ≤ Rr,D(s,r ) Fr,D(s,r ), σ and the same estimate holds interchanging R and F. In view of (3.22) we deduce ∂

C + R, F max ≤ Rr,D(s,r ) Fr,D(s,r (3.24) ). 1≤y≤n ∂ y j D(s,r ) σ 2F ∂F • By (3.22), ∂z∂ j ∂∂yRk ∂θ = ∂∂yRk ∂z∂ j ∂θ , and by (3.21), k k ∂ R ∂2 F ∂F C ∂R ≤ ∂ yk ∂z j ∂θk D(s−σ,r ) σ ∂ yk D(s,r ) ∂z j D(s,r ) Cr + ≤ β Rr,D(s,r ) Fr,D(s,r ). j σ


∂R Similarly ∂z∂ j ∂θ k

∂F ∂ yk

D(s−σ,r )

≤

399 Cr + Rr,D(s,r ) Fr,D(s,r ). jβσ

By the Leibniz rule,

∂ ∂ R ∂ F ∂F ∂R ∂2 R ∂2 F ≤ + ∂z j ∂z k ∂z k D(s,r ) ∂z k ∂z j D(s,r ) ∂z k D(s,r ) ∂z j ∂z k D(s,r ) ∂z k D(s,r ) Cr 1 1 + Rr,D(s,r ) Fr,D(s,r ≤ β 2β+1 + 2β ), j k k (1 + | j − k|) and taking the sum in k yields Cr ∂ ∂R ∂F + ≤ β Rr,D(s,r ) Fr,D(s,r ). ∂z j ∂z k ∂z k D(s,r ) j σ k≥1

The previous estimates imply that ∂

Cr + R, F ≤ β Rr,D(s,r ) Fr,D(s,r ). D(s−σ,r ) ∂z j j σ 2 3 ∂F = ∂∂yRk ∂z j∂∂zFl ∂θk , and by (3.21) we obtain • Thanks to (3.22), ∂z∂j ∂zl ∂∂yRk ∂θ k ∂3 F ∂R ∂ 2 ∂ R ∂ F ≤ ∂z j ∂zl ∂ yk ∂θk D(s−σ,r ) ∂ yk D(s,r ) ∂z j ∂zl ∂θk D(s−σ,r ) C + Rr,D(s,r ) Fr,D(s,r ≤ ), ( jl)β σ

(3.25)

(3.26)

and the same estimate holds interchanging R and F. On the other hand, ∂2 R ∂2 F ∂2 R ∂2 F ∂2 ∂ R ∂ F = + , ∂z j ∂zl ∂z k ∂z k ∂z j ∂z k ∂zl ∂z k ∂zl ∂z k ∂z j ∂z k and ∂2 R ∂2 F ∂2 F ∂2 R ≤ D(s−σ,r ) D(s,r ) ∂z j ∂z k ∂zl ∂z k ∂z j ∂z k ∂zl ∂z k D(s,r ) C + Rr,D(s,r ) Fr,D(s,r ≤ ). ( jlk 2 )β (1 + |l − k|) Hence, with (3.26) we conclude that ∂2

C + Rr,D(s,r ) Fr,D(s,r R, F ≤ (3.27) ), D(s−σ,r ) ∂z j ∂zl ( jl)β σ 1 converges. Finally, the estimates (3.23), (3.24), (3.25) as the series k≥1 k 2β (1+|l−k|) and (3.27) yield the estimate (3.20). ♠ To prove the estimate with the Lipschitz norms, we can use the previous analysis and the two following facts. Firstly, since ξ η ( f g) = f (ξ )ξ η g + g(η)ξ η f , hence L L f gL D(s,r ) ≤ f D(s,r ) g D(s,r ) + g D(s,r ) f D(s,r ) .

Secondly, the operator ξ η commutes with the derivative in any variable.


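3.3. The canonical transform. In this section we study the Hamiltonian flow generated by a function $F \in \Gamma^{\beta,+}_{r,D(s-\sigma,r)}$ globally of degree 2, i.e. of degree 2 in the variables $z, \bar z$ and of degree 1 in the variable $y$. Namely, we consider the system
$$\begin{cases}
\big(\dot\theta(t),\dot y(t),\dot z(t),\dot{\bar z}(t)\big) = X_F\big(\theta(t),y(t),z(t),\bar z(t)\big),\\[1mm]
\big(\theta(0),y(0),z(0),\bar z(0)\big) = \big(\theta^0,y^0,z^0,\bar z^0\big).
\end{cases} \tag{3.28}$$

Lemma 3.5. Let $0 < \sigma < s/3$ and $F \in \Gamma^{\beta,+}_{r,D(s-\sigma,r)}$ with $F$ of degree 2. Assume that $\langle F\rangle^{+}_{r,D(s-\sigma,r)} < C\sigma$. Then the solution of Eq. (3.28) with initial condition $(\theta^0,y^0,z^0,\bar z^0) \in D(s-3\sigma,\tfrac{r}{4})$ satisfies $(\theta(t),y(t),z(t),\bar z(t)) \in D(s-2\sigma,\tfrac{r}{2})$ for all $0\le t\le 1$, and we have the estimates
$$\sup_{0\le t\le1}\Big|\frac{\partial y_k(t)}{\partial w^0_j}\Big| \le \frac{Cr\,\langle F\rangle^{+}_{r,D(s-\sigma,r)}}{\sigma\,j^{\beta}} \quad\text{with } w^0_j = z^0_j \text{ or } \bar z^0_j, \tag{3.29}$$
$$\sup_{0\le t\le1}\Big|\frac{\partial w_k(t)}{\partial w^0_j}\Big| \le \frac{C\,\langle F\rangle^{+}_{r,D(s-\sigma,r)}}{(jk)^{\beta}(1+|j-k|)} + \delta_{jk} \quad\text{with } w_k = z_k \text{ or } \bar z_k,\ w^0_j = z^0_j \text{ or } \bar z^0_j, \tag{3.30}$$
$$\sup_{0\le t\le1}\Big|\frac{\partial y_k(t)}{\partial y^0_j}\Big| \le \frac{C\,\langle F\rangle^{+}_{r,D(s-\sigma,r)}}{\sigma} + \delta_{jk}, \tag{3.31}$$
$$\sup_{0\le t\le1}\Big|\frac{\partial^2 y_k(t)}{\partial w^0_j\,\partial w^0_i}\Big| \le \frac{C\,\langle F\rangle^{+}_{r,D(s-\sigma,r)}}{\sigma\,(ij)^{\beta}(1+|i-j|)} \quad\text{with } w^0_i = z^0_i \text{ or } \bar z^0_i,\ w^0_j = z^0_j \text{ or } \bar z^0_j. \tag{3.32}$$

Before we turn to the proof of Lemma 3.5, we introduce a space of infinite dimensional matrices, with decaying coefficients. Let $\|\cdot\|$ be any submultiplicative norm on $\mathcal{M}_{2,2}(\mathbb{C})$, the space of the $2\times2$ complex matrices. For $\beta > 0$, we say that $B \in \mathcal{M}^{\beta,+}_{s}$ if $\|B\|^{+}_{\beta,s} < \infty$, where the norm² $\|\cdot\|^{+}_{\beta,s}$ is given by the condition
$$\sup_{\xi\in\Pi}\ \sup_{|\mathrm{Im}\,\theta|<s} \|B_{jl}\| \le \frac{\|B\|^{+}_{\beta,s}}{(jl)^{\beta}(1+|j-l|)}, \quad \forall\, j,l \ge 1.$$

² This means that $\|B\|^{+}_{\beta,s}$ is the smallest real number which satisfies the mentioned conditions; this defines a norm.

Then we have the following result.

Lemma 3.6. Let $A, B \in \mathcal{M}^{\beta,+}_{s}$. Then $AB \in \mathcal{M}^{\beta,+}_{s}$ and
$$\|AB\|^{+}_{\beta,s} \le C\,\|A\|^{+}_{\beta,s}\,\|B\|^{+}_{\beta,s}.$$

Proof. For all $j,l \ge 1$, $(AB)_{jl} = \sum_{k\ge1} A_{jk}B_{kl}$. Since $\|\cdot\|$ is submultiplicative,
$$\|(AB)_{jl}\| \le \sum_{k\ge1}\|A_{jk}\|\,\|B_{kl}\| \le \frac{\|A\|^{+}_{\beta,s}\,\|B\|^{+}_{\beta,s}}{(jl)^{\beta}}\sum_{k\ge1}\frac{1}{k^{2\beta}(1+|j-k|)(1+|l-k|)}. \tag{3.33}$$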
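The sum in (3.33) is what produces the off-diagonal decay factor $(1+|j-l|)^{-1}$. As a rough numerical sanity check, and only as a sketch under assumptions (the truncation `k_max`, the index range, the value $\beta = 1/2$ and the function names are illustrative choices, not taken from the paper), one can verify that $(1+|j-l|)$ times the truncated sum stays bounded:

```python
def weighted_sum(j: int, l: int, beta: float, k_max: int = 1000) -> float:
    """Truncation of sum_{k>=1} 1 / (k^{2 beta} (1+|j-k|) (1+|l-k|))."""
    return sum(
        1.0 / (k ** (2.0 * beta) * (1 + abs(j - k)) * (1 + abs(l - k)))
        for k in range(1, k_max + 1)
    )


def worst_ratio(beta: float, index_max: int = 30, k_max: int = 1000) -> float:
    """Largest value of (1+|j-l|) * weighted_sum(j, l) over 1 <= j, l <= index_max."""
    return max(
        (1 + abs(j - l)) * weighted_sum(j, l, beta, k_max)
        for j in range(1, index_max + 1)
        for l in range(1, index_max + 1)
    )


print(worst_ratio(0.5, index_max=20, k_max=500))
```

For $\beta = 1/2$, splitting the indices $k$ according to whether $|j-k|$ or $|l-k|$ is at least $|j-l|/3$ bounds this ratio by $6\zeta(2) < 10$, uniformly in $j$ and $l$; the computed values are expected to be well below that.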


Thanks to the triangle inequality, for all $j,l \ge 1$,
$$\{k\ge1\} \subset \big\{k\ge1 : |j-k| \ge \tfrac13|j-l|\big\} \cup \big\{k\ge1 : |l-k| \ge \tfrac13|j-l|\big\},$$
thus, by splitting the sum in (3.33) we obtain the desired result.

Proof of Lemma 3.5. Here we introduce the notations $Z_j = (z_j,\bar z_j)$ and $Z = (Z_j)_{j\ge1}$. Then $F$ reads
$$F(\theta,y,Z) = b_0(\theta) + b_1(\theta)\cdot y + a(\theta)\cdot Z + \frac12\,A(\theta)Z\cdot Z, \tag{3.34}$$
with
$$b_0(\theta) = F(\theta,0,0), \qquad b_1(\theta) = \nabla_y F(\theta,0,0), \qquad a(\theta) = \nabla_Z F(\theta,0,0),$$
and $A = (A_{i,j})$ is the infinite matrix so that
$$A_{i,j}(\theta) = \begin{pmatrix} \dfrac{\partial^2 F}{\partial z_i\,\partial z_j}(\theta,0,0) & \dfrac{\partial^2 F}{\partial z_i\,\partial\bar z_j}(\theta,0,0)\\[3mm] \dfrac{\partial^2 F}{\partial\bar z_i\,\partial z_j}(\theta,0,0) & \dfrac{\partial^2 F}{\partial\bar z_i\,\partial\bar z_j}(\theta,0,0)\end{pmatrix}. \tag{3.35}$$
Observe that $A$ is symmetric. By [16, Est. (9)], the flow $X^t_F$ exists for $0\le t\le1$ and maps $D(s-3\sigma,\tfrac r4)$ into $D(s-2\sigma,\tfrac r2)$. Here we have to give a precise description of $X^t_F$ for $0\le t\le1$. This is possible thanks to the particular structure (3.34) of $F$. In the sequel we write $(\theta(t),y(t),Z(t)) = X^t_F(\theta^0,y^0,Z^0)$.

♠ To begin with, the equation for $\theta$ reads
$$\dot\theta(t) = \nabla_y F(\theta,0,0) = b_1(\theta), \qquad \theta(0) = \theta^0. \tag{3.36}$$

Since b1 is a smooth function (see (3.2)), the n-dimensional system (3.36) admits a unique (smooth) local solution θ (t). By the work of J. Pöschel, this solution exists until time t = 1, and we have the bound sup |Im θ (t)| < s − 2σ,

(3.37)

0≤t≤1

(this can here be recovered by the usual bootstrap argument, using the smallness assumption on F). ♠ We now turn to the equation in Z . We have to solve Z˙ (t) = J ∇ Z F(θ, y, Z )(t), where J = diag

0 −1

1 0

Z (0) = Z 0 ,

j≥1

(3.38)

.

Notice that by [16, Est. (9)] we already know that sup Z (t)2 <

0≤t≤1

r , 2

(3.39)


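but we need to describe the behavior of $Z(t)$ more precisely. Since $\theta = \theta(t)$ is known by the previous step, in view of (3.34), Eq. (3.38) reads
$$\dot Z(t) = b(t) + B(t)\cdot Z(t), \qquad Z(0) = Z^0, \tag{3.40}$$
where $b(t) = J\,a(\theta(t))$ and $B(t) = J\,A(\theta(t))$. We now iterate the integral formulation of the problem
$$Z(t) = Z^0 + \int_0^t \big(b(t_1) + B(t_1)\cdot Z(t_1)\big)\,dt_1,$$
and formally obtain
$$Z(t) = b^\infty(t) + \big(1 + B^\infty(t)\big)Z^0, \tag{3.41}$$
where
$$b^\infty(t) = \sum_{k\ge1}\int_0^t\!\!\int_0^{t_1}\!\!\cdots\!\int_0^{t_{k-1}} \prod_{j=1}^{k-1}B(t_j)\,b(t_k)\,dt_k\cdots dt_2\,dt_1, \tag{3.42}$$
and
$$B^\infty(t) = \sum_{k\ge1}\int_0^t\!\!\int_0^{t_1}\!\!\cdots\!\int_0^{t_{k-1}} \prod_{j=1}^{k}B(t_j)\,dt_k\cdots dt_2\,dt_1. \tag{3.43}$$
By (3.35) and (3.37), there exists $C > 0$ so that
$$\sup_{0\le t\le1}\|B(t)\|_{\ell^2\to\ell^2} \le C,$$
and thus, for all $0\le t\le1$ the series (3.42) converges and
$$\begin{aligned}
\|b^\infty(t)\|_{\ell^2} &\le \sup_{0\le t\le1}\|b(t)\|_{\ell^2}\sum_{k\ge1}C^{k-1}\int_0^1\!\!\int_0^{t_1}\!\!\cdots\!\int_0^{t_{k-1}} dt_k\cdots dt_2\,dt_1\\
&\le \sup_{0\le t\le1}\|b(t)\|_{\ell^2}\sum_{k\ge1}\frac{C^{k-1}}{k!}\\
&\le \frac{e^{C}-1}{C}\,\sup_{0\le t\le1}\|b(t)\|_{\ell^2} \le C\sup_{0\le t\le1}\|b(t)\|_{\ell^2}.
\end{aligned} \tag{3.44}$$
Similarly we have uniformly in $0\le t\le1$,
$$\|B^\infty(t)\|_{\ell^2\to\ell^2} \le C.$$
As a conclusion, the formula (3.41) makes sense.

In fact, we need more precise estimates on $B^\infty$. Recall that $B(t) = J\,A(\theta(t))$, where $A$ is defined by (3.35). Then by (3.35) and (3.37), for all $0\le t\le1$, $B(t) \in \mathcal{M}^{\beta,+}_{s-\sigma}$ and $\sup_{0\le t\le1}\|B(t)\|^{+}_{\beta,s-\sigma} \le C\,\langle F\rangle^{+}_{r,D(s-\sigma,r)}$. Hence by Lemma 3.6 and (3.43),
$$\|B^\infty\|^{+}_{\beta,s-\sigma} \le e^{C\langle F\rangle^{+}_{r,D(s-\sigma,r)}} - 1 \le C\,\langle F\rangle^{+}_{r,D(s-\sigma,r)}. \tag{3.45}$$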
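The iterated-integral series (3.41)–(3.43) can be made concrete in the simplest case of constant scalar coefficients, where the $k$-fold integral over the ordered simplex equals $t^k/k!$. The sketch below covers only this scalar toy case; the function name and the sample values are hypothetical, and it does not model the operator-valued, time-dependent $B(t)$ of the proof.

```python
import math


def series_solution(b: float, B: float, z0: float, t: float, terms: int = 30):
    """Sum the series (3.42)-(3.43) for constant scalar b and B.

    Returns (Z(t), b_inf, B_inf) with Z(t) = b_inf + (1 + B_inf) * z0.
    """
    b_inf = 0.0
    B_inf = 0.0
    for k in range(1, terms + 1):
        # volume of the ordered simplex {t > t_1 > ... > t_k > 0}
        simplex_volume = t ** k / math.factorial(k)
        b_inf += B ** (k - 1) * b * simplex_volume
        B_inf += B ** k * simplex_volume
    return b_inf + (1.0 + B_inf) * z0, b_inf, B_inf


b, B, z0 = 0.3, 0.7, 1.2
z1, b_inf, B_inf = series_solution(b, B, z0, 1.0)
exact = math.exp(B) * z0 + b * (math.exp(B) - 1.0) / B   # closed form of Z' = b + B Z
print(z1, exact, B_inf, math.exp(abs(B)) - 1.0)
```

In this scalar model $B^\infty(1) = e^{B} - 1$ and $b^\infty(1) = b\,(e^{B}-1)/B$, which mirrors the bounds $e^{C}-1$ and $(e^{C}-1)/C$ appearing in (3.44) and (3.45).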


♠ Finally we turn to the equation in y, y˙ (t) = −∇θ F(θ, y, Z )(t),

y(0) = y 0 .

We already know the functions θ (t) and Z (t). Moreover as the function F (3.34) is linear in y, the previous n−dimensional system reads y˙ (t) = f (t) + g(t)y(t),

y(0) = y 0 ,

(3.46)

with f (t) = −∇θ b0 (θ (t)) + ∇θ a(θ (t)) · Z (t) +

1 ∇θ A(θ (t))Z (t) · Z (t), 2

and g(t) = −∇θ b1 (θ (t)) = −∇θ ∇ y F(θ, 0, 0). We can solve Eq. (3.46) with the same techniques as Eq. (3.38). In fact we have formally y(t) = f ∞ (t) + 1 + g ∞ (t) y 0 , (3.47) where f ∞ (t) =

t

t1

0

k≥1 0

tk−1 k−1

··· 0

g(t j ) f (tk )dtk · · · dt2 dt1 ,

(3.48)

j=1

and ∞

g (t) =

t k≥1 0

t1

···

0

0

tk−1

k

g(t j )dtk · · · dt2 dt1 . j=1

By (3.37) and the Cauchy formula + ∂F C F r,D(s−σ,r C ) max , sup g(t) ≤ ≤ 1≤ j≤n D(s−σ,r ) σ ∂ y σ j 0≤t≤1

and similarly to (3.44) we have for all 0 ≤ t ≤ 1, | f ∞ (t)| ≤ C sup | f (t)|, 0≤t≤1

and g ∞ (t) ≤

+ C F r,D(s−σ,r )

σ

,

which shows the convergence of the series defining (3.47). ♠ It remains to show the estimates on the solutions of (3.28). • First we prove (3.30). By (3.40), 1 0 ∇ Z 0 Z k (t) = δ + Bk∞j (t), 0 1 kj j

(3.49)


therefore by (3.45), for k = j we have ∇ Z 0 Z k (t) ≤ j

+ CFr,D(s−σ,r )

( jk)β (1 + | j − k|)

which was the claim. • We prove (3.31). By (3.47) we have yk (t) = f k∞ (t) + yk0 +

, and ∇ Z 0 Z j (t) ≤ 1, j

(3.50)

0 g∞ jk (t)y j ,

1≤ j≤n

hence

∂ yk ∂ y 0j

∞ does not depend = δ jk + g ∞ jk (t) and the claim follows from (3.49) ( f

on y 0 ). • We prove (3.29). Since g and g ∞ do not depend on Z , from (3.47) we deduce that ∂y ∂f ∞ = . Now by definition (3.48) of f ∞ , we get that for all 0 ≤ t ≤ 1, 0 ∂z ∂z 0 j

j

∂ y(t) ∂ f ∞ (t) 0 = ≤ ∇ Z 0 f ∞ (t) ≤ C sup |∇ Z 0 f (t)|. 0 j j ∂z j ∂z j 0≤t≤1

For all 1 ≤ l ≤ n, we compute

∇ Z k fl (t) = ∂θl ak (θ (t)) +

∂θl Aki (θ (t))Z i (t).

(3.51)

(3.52)

i≥1

As ak (θ ) = ∇ Z k F(θ, 0, 0), with the Cauchy formula we deduce + Cr Fr,D(s−σ,r C ) sup ∂θl ak (θ (t)) ≤ ∇ Z k F D(s−σ,r ) ≤ . 1+β σ σ k 0≤t≤1

Similarly with (3.35), sup ∂θl Aki (θ (t)) ≤

0≤t≤1

+ CFr,D(s−σ,r )

σ (ik)β (1 + |i − k|)

.

Inserting the two previous estimates in (3.52), we obtain using (3.39) and the CauchySchwarz inequality, + |Z i | C Fr,D(s−σ,r ) r+ |∇ Z k fl (t)| ≤ β β σ k i (1 + |k − i|) i≥1

≤

Since ∇ Z 0 j

+ Cr Fr,D(s−σ,r )

. (3.53) σ kβ fl (t) = k≥1 ∇ Z 0 Z k (t) ∇ Z k fl (t), from (3.50) and (3.53) we deduce j ∇ Z 0 Z k (t)∇ Z k fl (t) |∇ Z 0 fl (t)| ≤ j

j

k≥1

≤ ≤

+ Cr Fr,D(s−σ,r )

σ jβ

+ Cr Fr,D(s−σ,r )

σ jβ

k≥1

,

1 + 1 k 2β (1 + | j − k|)


and together with (3.51), we get that for all j ≥ 1, ∂ y(t) Cr F+ r,D(s−σ,r ) . sup 0 ≤ σ jβ 0≤t≤1 ∂z j • It remains to show (3.32). First we have ∂ y(t) 0 0 ≤ ∇ Z 0 ∇ Z 0 f ∞ (t) ≤ C sup ∇ Z 0 ∇ Z 0 f (t). i j i j ∂z i ∂z j 0≤t≤1 Then from the very definition of f, ∇ Z 0 ∇ Z 0 f (t) = ∇θ Ai j (θ (t)), and using the Cauchy i j estimate in θ we get, + ∂ y(t) CFr,D(s−σ,r ) , 0 0 ≤ β σ (i j) (1 + |i − j|) ∂z i ∂z j

which was the claim. In the next result, we denote by | · |L the Lipschitz norm | f (ξ ) − f (η)| . |ξ − η| ξ,η∈

| f |L = sup

ξ =η

We have an analogous result to Lemma 3.5 with Lipschitz norms. +,L Lemma 3.7. Under the assumptions of Lemma 3.5 and the condition Fr,D(s−σ,r ) ≤ Cσ the solution of (3.28) satisfies moreover +,L ∂ y (t) L Cr Fr,D(s−σ,r k ) sup ≤ with w 0j = z 0j or z 0j , 0 σ jβ 0≤t≤1 ∂w j +,L ∂w (t) L CFr,D(s−σ,r k ) with wk = z k or z k , w 0j = z 0j or z 0j , sup ≤ 0 β ( jk) (1 + | j − k|) ∂w j 0≤t≤1 +,L ∂ y (t) L CFr,D(s−σ,r k ) , sup ≤ 0 σ 0≤t≤1 ∂ y j +,L ∂ 2 y (t) L CFr,D(s−σ,r k ) sup 0 0 ≤ with wi0 = z i0 or z i0 , w 0j = z 0j or z 0j . β (1 + |i − j|) σ (i j) ∂w ∂w 0≤t≤1 j i

Proof. We won’t detail the proof, since it is tedious and similar to the proof of Lemma 3.5. β,+,L β,+ L with norm · +, First we define the space Ms β,s similarly to Ms , but with a Lipschitz norm in ξ . Then we have AB+,L ≤ C A+,L B+ + B+,L A+ . Then one can follow the proof of Lemma 3.5 and use thatthe different norms (say · ) which appear satisfy f gL ≤ C f L g + f gL . To conclude this section, we state a result which shows that the Lie transform associated to a quadratic function, is also quadratic. This will be crucial in the proof of Theorem 2.3 (see Sect. 5.2).


Corollary 3.8. The symplectic application $X^1_F$ reads
$$\begin{pmatrix}\theta\\ y\\ Z\end{pmatrix} \longmapsto \begin{pmatrix} K(\theta)\\ L(\theta,Z) + M(\theta)Z + S(\theta)y\\ T(\theta) + U(\theta)Z\end{pmatrix},$$
where $L(\theta,Z)$ is quadratic in $Z$, $M(\theta)$ and $U(\theta)$ are bounded linear operators from $\ell^2\times\ell^2$ into itself and $S(\theta)$ is a bounded linear map from $\mathbb{R}^n$ to $\mathbb{R}^n$.

Proof. The claim follows from the proof of Lemma 3.5. The structure of $Z(1)$ follows from (3.40), while the structure of $y(1)$ comes from (3.47) and (3.48).

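3.4. Composition estimates. In this section we study the new Hamiltonian obtained after composition with the canonical transformation $X^1_F$.

Proposition 3.9. Let $0 < \eta < 1/8$ and $0 < \sigma < s$, $R \in \Gamma^{\beta}_{\eta r,D(s-2\sigma,4\eta r)}$ and $F \in \Gamma^{\beta,+}_{r,D(s-\sigma,r)}$ with $F$ of degree 2. Assume that $\langle F\rangle^{+}_{r,D(s,r)} + \langle F\rangle^{+,L}_{r,D(s,r)} < C\sigma$. Then $R\circ X^1_F \in \Gamma^{\beta}_{\eta r,D(s-5\sigma,\eta r)}$ and we have the estimates
$$\begin{aligned}
\langle R\circ X^1_F\rangle_{\eta r,D(s-5\sigma,\eta r)} &\le C\,\langle R\rangle_{\eta r,D(s-2\sigma,4\eta r)},\\
\langle R\circ X^1_F\rangle^{L}_{\eta r,D(s-5\sigma,\eta r)} &\le C\big(\langle R\rangle_{\eta r,D(s-2\sigma,4\eta r)} + \langle R\rangle^{L}_{\eta r,D(s-2\sigma,4\eta r)}\big).
\end{aligned} \tag{3.54}$$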

Proof. The proof of the first estimate relies on Lemma 3.5. We omit the proof of the second, which is similar using the estimates of Lemma 3.7 instead. In the sequel, we use the notation (θ, y, z, z¯ ) = X 1F (θ 0 , y 0 , z 0 , z¯ 0 ). ♠ Since X 1F maps D(s − 3σ, r4 ) into D(s − 2σ, r2 ), it is clear that R ◦ X 1F D(s−5σ,ηr ) ≤ CRηr,D(s−2σ,4ηr ) .

(3.55)

♠ By the Leibniz rule, for all 1 ≤ j ≤ n, ∂(R ◦ X 1F ) ∂ y 0j

=

n ∂ R(X 1 ) ∂ yk F

k=1

∂ yk

∂ y 0j

,

and by (3.31) we deduce ∂(R ◦ X 1 ) F ≤ CRηr,D(s−2σ,4ηr ) . 0 D(s−5σ,ηr ) ∂yj ♠ For j ≥ 1, the derivative in z 0j reads ∂(R ◦ X 1F ) ∂z 0j

=

n ∂ R(X 1 ) ∂ yk F

k=1

∂ yk

∂z 0j

+

∂ R(X 1 ) ∂z k F

k≥1

∂z k

∂z 0j

+

∂ R(X 1F ) ∂z k . ∂z k ∂z 0j

(3.56)


Therefore, thanks to (3.29) and (3.32) we get ∂(R ◦ X 1 ) F D(s−5σ,ηr ) ∂z 0j n ∂y ∂Z ∂ R(X 1 ) k k F ≤ 0+ ∇ Z k R(X 1F ) D(s−5σ,ηr ) ∂z D(s−5σ,r ) ∂z 0 ∂ yk j j k=1 k≥1 C 1 ≤ β Rηr,D(s−2σ,4ηr ) 1 + 2β j k (1 + | j − k|) k≥1

≤

C Rηr,D(s−2σ,4ηr ) . jβ

2 ∂ (R◦X 1 ) ♠ We now estimate ∂z 0 ∂z 0F i

j

D(s−5σ,ηr )

(3.57)

for i, j ≥ 1. By the Leibniz rule, the result

will follow from the next estimations.

• Using the Cauchy estimate in yl and (3.29), ∂ 2 R(X 1 ) ∂ y ∂ y CRηr,D(s−2σ,4ηr ) k l F ≤ . 0 0 ∂ yk ∂ yl ∂z i ∂z j D(s−5σ,ηr ) (i j)β 1≤k,l≤n

• By (3.32) ∂ R(X 1 ) ∂ 2 y CRηr,D(s−2σ,4ηr ) k F ≤ . ∂ yk ∂z i0 ∂z 0j D(s−5σ,ηr ) (i j)β 1≤k≤n

• By (3.30) ∂ 2 R(X 1 ) ∂z ∂z CRηr,D(s−2σ,4ηr ) k l F ≤ . 0 0 ∂z k (t)∂zl ∂z i ∂z j D(s−5σ,ηr ) (i j)β k,l≥1

• Using the Cauchy estimate in z k , (3.29) and (3.30) we get, ∂ 2 R(X 1 ) ∂ y ∂z CRηr,D(s−2σ,4ηr ) l k F ≤ . ∂z k ∂ yl ∂z i0 ∂z 0j D(s−5σ,ηr ) (i j)β k≥1 1≤l≤n

All these estimates yield ∂ 2 (R ◦ X 1 ) CRηr,D(s−2σ,4ηr ) F ≤ . D(s−5σ,ηr ) (i j)β ∂z i0 ∂z 0j Finally, (3.54) follows from (3.55), (3.56), (3.57) and (3.58).

(3.58)


3.5. Approximation estimates. Recall that the notation · ∗ (respectively · ∗ ) stands either for · or · L (respectively · or · L ). First we recall some approximation results [16, Est. (7)], which show that the second order approximation of P can be controlled by P, and that P − R is small when we contract the domain (this contraction is governed by the new parameter η): Lemma 3.10 ([16]). Let P satisfy Assumption 3 and consider its Taylor approximation R of the form (2.9). Then there exists C > 0 so that for all η > 0, ∗ ∗ ∗ ∗ X R r,D(s,r ) ≤ CX P r,D(s,r ) , and X P − X R ηr,D(s,4ηr ) ≤ CηX P r,D(s,r ) .

We have an analogous result for the norm · r,D(s,r ) . β

Lemma 3.11. Let P ∈ r,D(s,r ) and consider its Taylor approximation R of the form (2.9). Then there exists C > 0 so that for all η > 0, ∗ ∗ Rr,D(s,r ) ≤ CPr,D(s,r ) ,

and ∗ P − R∗ηr,D(s,4ηr ) ≤ CηPr,D(s,r ).

Proof. • We first prove the second estimate. Define the one-variable function $f(t) = P(\theta, t^2 y, t z, t \bar z)$. Then by the Taylor formula, there exists $0 < t_0 < 1$ so that
$$f(1) = f(0) + f'(0) + \frac{1}{2} f''(0) + \frac{1}{6} f^{(3)}(t_0),$$
which reads
$$P(\theta, y, z, \bar z) - R(\theta, y, z, \bar z) = \frac{1}{6} f^{(3)}(t_0) = O\Big( z^3 \frac{\partial^3 P}{\partial z^3},\ yz \frac{\partial^2 P}{\partial y \partial z},\ y^2 \frac{\partial^2 P}{\partial y^2} \Big).$$

Using the Cauchy estimates in z or in y, we obtain P − R D(s,4ηr ) ≤ Cη (ηr )2 Pr,D(s,r ) . The estimates of the derivatives are obtained by the same method, with the adequate choice of the function f . A derivative in z costs η and a derivative in y costs η2 . L It is then also clear that we have P − RL ηr,D(s,4ηr ) ≤ CηPr,D(s,r ) . ∗ ∗ • The inequality Rr,D(s,r ) ≤ CPr,D(s,r ) is a consequence of the previous point with η = 1.
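To make the role of the weights $t^2 y$, $t z$ concrete, here is a small worked instance. The polynomial $P$ below (one action $y$, one mode $z$, constants $a, b, c, d$) is a hypothetical example and does not come from the paper. It also assumes that on $D(s, 4\eta r)$ one has $|y| < (4\eta r)^2$ and $|z| < 4\eta r$, as in the usual definition of these domains.

```latex
\begin{align*}
P(y,z,\bar z) &= a\,y + b\,z\bar z + c\,yz + d\,z^3,\\
f(t) &= P(t^2 y, tz, t\bar z) = t^2\,(a\,y + b\,z\bar z) + t^3\,(c\,yz + d\,z^3),\\
R &= f(0) + f'(0) + \tfrac12 f''(0) = a\,y + b\,z\bar z,\\
P - R &= \tfrac16 f^{(3)}(t_0) = c\,yz + d\,z^3,\\
\sup_{D(s,4\eta r)} |P - R| &\le (|c| + |d|)\,(4\eta r)^3 = 64\,\eta\,(\eta r)^2\, r\,(|c| + |d|).
\end{align*}
```

In this instance the terms kept in $R$ are exactly those with $2|m| + |q + \bar q| \le 2$, and the remainder carries one extra factor $\eta$ relative to $(\eta r)^2$, which is the gain stated in the lemma.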


4. The KAM Step

Let $N$ be a Hamiltonian in normal form as in (1.2), which reads in the variables $(\theta, y, z, \bar z)$,
$$N = \sum_{1 \le j \le n} \omega_j(\xi)\, y_j + \sum_{j \ge 1} \Omega_j(\xi)\, z_j \bar z_j,$$

and suppose that Assumptions 1 and 2 are satisfied. Consider a perturbation P which satisfies Assumptions 3 and 4 for some r, s > 0. Then choose 0 < η < 1/8, 0 < σ < s, and assume that ασ t+1 η2 α L L Pr,D(s,r Pr,D(s,r ) + X P r,D(s,r ) + + X , (4.1) P r,D(s,r ) ≤ ) M c0 where t is given by Lemmas 3.1 and 3.2, c0 is a large constant depending only on n and τ (see [16, Est. (6)].) Thus, by Lemmas 3.1 and 3.10, the solution F of the homological equation (2.10) satisfies C ∗ L 2 X F r,D(s−σ,r X P r,D(s−σ,r ) ≤ ) ≤ ση . ασ t Similarly, by Lemmas 3.2 and 3.11, +,∗ Fr,D(s−σ,r ) ≤

C L 2 Pr,D(s,r ) ≤ ση , ασ t

so that the hypothesis Lemma 3.5 are fulfilled. We use the notations of Sect. 2.3. 4.1. Estimates on the new error term. We estimate the new error term P+ given by (2.12). Lemma 4.1. Assume (4.1). Then there exists C > 0 (independent of η and σ ) so that α for all 0 ≤ λ ≤ M , P+ ληr,D(s−5σ,ηr ) + X P+ ληr,D(s−5σ,ηr ) 2 C λ λ λ λ P . ≤ + X + Cη P + X P P r,D(s,r ) r,D(s,r ) r,D(s,r ) r,D(s,r ) ασ t η2 Proof. By [16, Est. (13)], we already have X P+ ληr,D(s−5σ,ηr ) ≤

2 C λ λ X P r,D(s,r ) + CηX P r,D(s,r ) . ασ t η2

(4.2)

It remains to prove a similar estimate for the , norm. By Lemmas 3.5 and 3.11, λ (P − R) ◦ X 1F ληr,D(s−5σ,ηr ) ≤ CP − Rληr,D(s−2σ,4ηr ) ≤ CηPr,D(s,r ).

Then by Lemma 3.5 again, 1 1

t λ R(t), F ◦ X F dtηr,D(s−5σ,ηr ) ≤ C R(t), F ◦ X tF ληr,D(s−5σ,ηr ) dt 0 0

≤ C R(t), F ληr,D(s−2σ,4ηr ) .

β

β,+

Since R ∈ r,D(s,r ) and F ∈ r,D(s−σ,r ) are both of degree 2 we can apply Lemma 3.4 and write 1

C R(t), F ◦ X tF dtληr,D(s−5σ,ηr ) ≤ Rληr,D(s,ηr ) F+,λ ηr,D(s−σ,ηr ) . σ 0 Finally by Lemmas 3.2 and 3.11, 2 2 C λ C λ Rληr,D(s,ηr ) F+,λ R P ≤ ≤ ηr,D(s,ηr ) r,D(s,r ) , ηr,D(s−σ,ηr ) ασ t ασ t η2 where we used that · ηr,D(s,ηr ) ≤ η−2 · r,D(s,r ) . Putting the previous estimates together, we complete the proof. 4.2. Estimates on the frequencies. We turn to the new frequencies given by (2.11). Lemma 4.2. There exists K > 10 and α+ > 0 so that k · ω+ (ξ ) + l · + (ξ ) ≥ α+ l , |k| ≤ K , |l| ≤ 2. Ak In fact K can be made explicit, it depends on n, τ, c0 and on all the constants C.

Proof. On the one hand, since ω j (ξ ) = ∂∂ yNj (0, 0, 0, 0, ξ ), by Lemma 3.10 we deduce that ∂N | ≤ X N r,D(s,r ) ≤ CX R r,D(s,r ) ≤ CX P r,D(s,r ) . | ω| ≤ sup | D(s,r )× ∂ y j (ξ ) = On the other hand, |2β, ≤ |

sup

D(s,r )×

|

∂2 N ∂z j ∂z j

(0, 0, 0, 0, ξ ), thus

2β ∂2 N r,D(s−σ,r ) ≤ CRr,D(s,r ) ≤ CPr,D(s,r ) , | j ≤ N ∂z j ∂z j (4.3)

hence by the two previous estimates |2β, ≤ C X P r,D(s,r ) + Pr,D(s,r ) . | ω | + |

(4.4)

Similarly, for the Lipschitz norms we obtain L L L | ω |L + ||2β, ≤ C X P r,D(s,r ) + Pr,D(s,r ) . We follow the analysis done in [16] to bound the small divisors and thanks to (4.4), | ≤ |k|l | |2β, |k · ω+l · ω | + | ≤ C|k|l X P r,D(s,r ) + Pr,D(s,r ) . We now choose α ≥ C0 K max|k|≤K Ak (X P r,D(s,r ) + Pr,D(s,r ) ), where C0 is a large universal constant, and thanks to the estimate given by the frequencies before the iteration we get |k · ω+ (ξ ) + l · + (ξ )| ≥ α +

l , |k| ≤ K , Ak

α . It remains to show that α + > 0. This is done in [16, Sect. 4], and the with α+ = α − proof still holds with the new norms.


Remark 4.3. The key point in the previous proof is the estimate (4.3), which shows that the perturbations of the external frequencies can be controlled by the norm of $P$ in $\Gamma^{\beta}_{r,D(s,r)}$. In the case of a smoothing perturbation $P$ (case $\bar p > p$ in (1.4)), this norm is not needed (more precisely, the decay of the derivatives of $P$ is not needed), because we then have $|\hat\Omega|_{2\beta,\Pi} \le |X_P|_{r,D(s,r)}$ with $\beta = (\bar p - p)/2$.

5. Iteration and Convergence

In this section we are exactly in the setting of [16], and we can make the same choice of the parameters in the iteration. We reproduce here the argument of J. Pöschel.

5.1. The iterative lemma. Denote $P_0 = P$ and $N_0 = N$. Then at the $\nu$th step of the Newton scheme, we have a Hamiltonian $H_\nu = N_\nu + P_\nu$, so that the new error term $P_{\nu+1}$ is given by the formula (2.12) and the new normal form $N_{\nu+1}$ is associated with the new frequencies given by (2.11). Let $c_1$ be twice the maximum of all constants obtained during the KAM step. Set $r_0 = r$, $s_0 = s$, $\alpha_0 = \alpha$ and $M_0 = M$. For $\nu \ge 0$ and $\kappa = 4/3$ set
$$\alpha_\nu = \frac{\alpha_0}{2}\,(1 + 2^{-\nu}), \qquad M_\nu = M_0\,(2 - 2^{-\nu}), \qquad \lambda_\nu = \frac{\alpha_\nu}{M_\nu},$$
$$\varepsilon_{\nu+1} = \frac{c_1\, \varepsilon_\nu^{\kappa}}{(\alpha_\nu \sigma_\nu^{t})^{\kappa - 1}}, \qquad \sigma_{\nu+1} = \frac{\sigma_\nu}{2}, \qquad \eta_\nu^3 = \frac{\varepsilon_\nu}{\alpha_\nu \sigma_\nu^{t}},$$
and
$$s_{\nu+1} = s_\nu - 5\sigma_\nu, \qquad r_{\nu+1} = \eta_\nu r_\nu.$$
The initial conditions are chosen in the following way: $\sigma_0 = s_0/40 \le 1/4$ so that $s_0 > s_1 > \cdots \ge s_0/2$, $\varepsilon_0 = \gamma_0 \alpha_0 \sigma_0^{t}$ and $\gamma_0 = (c_0 + 2^{t+3} c_1)^{-3}$, where $c_0$ is the constant which appears in (4.1). We also define $K_\nu = K_0 2^{\nu}$ with $K_0^{\tau+1} = 1/(c_1 \gamma_0)$. With the notation $D_\nu = D(s_\nu, r_\nu)$ we have

Lemma 5.1 (Iterative lemma, [16]). Suppose that $H_\nu = N_\nu + P_\nu$ is given on $D_\nu \times \Pi_\nu$, where $N_\nu = \omega_\nu(\xi) \cdot y + \Omega_\nu(\xi) \cdot z \bar z$ is a normal form satisfying
$$|\omega_\nu|^{\mathcal L}_{\Pi_\nu} + |\Omega_\nu|^{\mathcal L}_{2\beta,\Pi_\nu} \le M_\nu, \qquad |k \cdot \omega_\nu(\xi) + l \cdot \Omega_\nu(\xi)| \ge \alpha_\nu \frac{\langle l \rangle}{A_k}, \quad (k,l) \in \mathcal Z,$$

on ν and Prλνν,Dν + X P rλνν,Dν ≤ εν . Then there exists a Lipschitz family of real analytic symplectic coordinate transformations ν+1 : Dν+1 × ν −→ Dν and a closed subset Rν+1 ν+1 = ν \ kl (αν+1 ), |k|>K ν


of ν , where l Rν+1 , kl (αν+1 ) = ξ ∈ ν : |k · ων+1 + l · ν+1 | < αν+1 Ak such that for Hν+1 = Hν ◦ ν+1 = Nν+1 + Pν+1 , the same assumptions are satisfied with ν + 1 in place of ν. We don’t give the details of the proof of this result, since it is entirely done in [16] : it is of course an induction on ν ∈ N which essentially relies on the results of Sect. 4. 5.2. Proof of Theorem 2.3. The result of Theorem 2.3 is the convergence of the sequence Hν to a Hamiltonian in normal form, for parameters ξ in a set α , which is the limit of the sets ν . We again follow the proof of Pöschel and we recall the following lemma Lemma 5.2 (Estimates, [16]). For ν ≥ 0, 1 Cεν ν+1 − idrλνν,Dν+1 , Dν+1 − I rλνν,rν ,Dν+1 ≤ , σν αν σνt ν ≤ Cεν . |ων+1 − ων |λνν , |ν+1 − ν |λ2β, ν
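The parameter choices of Sect. 5.1 can be tabulated numerically. The sketch below assumes the recurrences as read from the text: $\alpha_\nu = \tfrac{\alpha_0}{2}(1+2^{-\nu})$, $M_\nu = M_0(2-2^{-\nu})$, $\sigma_{\nu+1} = \sigma_\nu/2$, $s_{\nu+1} = s_\nu - 5\sigma_\nu$, $\varepsilon_{\nu+1} = c_1\varepsilon_\nu^{\kappa}/(\alpha_\nu\sigma_\nu^t)^{\kappa-1}$ and $\gamma_0 = (c_0 + 2^{t+3}c_1)^{-3}$. The function name `kam_schedule` and the numerical values of `c0`, `c1` and `t` are hypothetical placeholders; the true constants come from the KAM step and are not specified here.

```python
def kam_schedule(s0=1.0, alpha0=1.0, M0=1.0, c0=1.0, c1=1.0, t=3, steps=10):
    """Tabulate the iteration parameters of Sect. 5.1 for `steps` KAM steps.

    c0, c1 and t are placeholder values (assumptions), not the paper's constants.
    """
    kappa = 4.0 / 3.0
    sigma, s = s0 / 40.0, s0                      # sigma_0 = s_0 / 40
    gamma0 = (c0 + 2.0 ** (t + 3) * c1) ** -3     # so that eta_0 = gamma0 ** (1/3)
    eps = gamma0 * alpha0 * sigma ** t            # eps_0 = gamma_0 alpha_0 sigma_0^t
    rows = []
    for nu in range(steps):
        alpha = 0.5 * alpha0 * (1.0 + 2.0 ** -nu)  # decreases to alpha_0 / 2
        M = M0 * (2.0 - 2.0 ** -nu)                # increases to 2 M_0
        eta_cubed = eps / (alpha * sigma ** t)     # eta_nu^3
        rows.append({
            "nu": nu, "s": s, "sigma": sigma, "alpha": alpha, "M": M,
            "lam": alpha / M, "eps": eps, "eta": eta_cubed ** (1.0 / 3.0),
        })
        # Newton-type update of the error size, then shrink the domain.
        eps = c1 * eps ** kappa / (alpha * sigma ** t) ** (kappa - 1.0)
        s, sigma = s - 5.0 * sigma, sigma / 2.0
    return rows


if __name__ == "__main__":
    for row in kam_schedule():
        print("nu={nu:2d}  s={s:.6f}  alpha={alpha:.6f}  eps={eps:.3e}  eta={eta:.3e}".format(**row))
```

With these placeholder constants the widths $s_\nu$ stay above $s_0/2$ because $\sum_\nu 5\sigma_\nu = s_0/4$, while $\varepsilon_\nu$ decreases faster than geometrically.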

! Set 0 = \ k,l Rαkl0 and α = ∩ν≥1 ν . The proof that Meas(\α ) −→ 0 when α −→ 0 is done in [16, Sect. 5] and we do not repeat it here. For ν ≥ 1 we define the map ν = 1 ◦ · · · ◦ ν : Dν × ν−1 −→ Dν−1 , and thus we have Hν = H ◦ν . With Lemma 5.2 and since3 ∩ν≥1 Dν ×ν = D(s/2)× α , we are then able to show, as in [16], that ν is a Cauchy sequence for the supremum norm on D(s/2) × α . Thus it converges uniformly on D(s/2) × α and its limit is real analytic on D(s/2). Further, the estimate (2.6) holds on D(s/2) × α . It remains to prove that is indeed defined on D(s/2, r/2) × α with the same estimate. By Corollary 3.8 all the transforms ν are linear in y and quadratic in z, z¯ and thus the same is true for the transform (this fact was also used in [17] or [6]). This specific form is stable by composition and thus all the ν have this form and in particular they are linear in y and quadratic in z, z¯ . Therefore it suffices to verify that the first derivatives with respect to y, z, z¯ and the second derivatives with respect to z, z¯ of ν are uniformly convergent on D(s/2) × α to conclude that ν convergences to (actually an extension of the previously defined ) uniformly on D(s/2, ρ) × α for any ρ. In particular, for r small enough, : D(s/2, r/2) × α → D(s, r ), and still satisfies estimate (2.6). So it remains to analyse the convergence of the derivatives. Using Lemma 5.2 we obtain successively Dν rν ,rν ,Dν ≤ 2, and then uniformly on D(s/2) × α , Dν+1 − Dν rν ,rν ,Dν ≤ Dν rν ,rν ,Dν Dν − I |rν ,rν ,Dν , 3 Here we use the notation D(s/2) = D(s/2, 0).


and we deduce that uniformly on D(s/2) × α , Dν+1 − Dν rν ,rν ,Dν ≤ cν1/2 . So again Dν converges uniformly on D(s/2) × α . Similarly we obtain the convergence of the second derivatives using the formula D 2 ν+1 = D 2 ν · (Dν )2 + ν · D 2 ν . On the other hand, again using Lemma 5.2, the frequencies functions ων and ν converge uniformly on α to Lipschitz functions ω and satisfying (2.7) and thus (2.8) in view of Lemma 5.1. We then deduce that, uniformly on D(s/2, r/2) × α , Rν := H ◦ ν − Nν

−→

H ◦ − N =: R ,

and since for all $\nu$ the Taylor expansion of $R_\nu$ contains only monomials $y^m z^q \bar z^{\bar q}$ with $2|m| + |q + \bar q| \ge 3$, the same property holds true for the limit $R$.

6. Application to the Nonlinear Schrödinger Equation

Let $n \ge 1$ be an integer and $\nu, \varepsilon > 0$ be two small parameters so that $\nu \ge C_0 \varepsilon$, where $C_0 > 0$ is a constant which will be defined later. Set $\Pi = [-1, 1]^n$. We consider a perturbation of the one-dimensional Schrödinger equation with harmonic potential
$$i\partial_t u + \partial_x^2 u - x^2 u - \nu V(\xi, x) u = \varepsilon |u|^{2m} u, \quad (t, x) \in \mathbb R \times \mathbb R, \tag{6.1}$$
where $m \ge 1$ is an integer and $(V(\xi, \cdot))_{\xi \in \Pi}$ is a family of real analytic bounded potentials with $V(0, \cdot) = 0$ which will be made explicit below. Recall that $T = -\partial_x^2 + x^2$ denotes the harmonic oscillator. Its eigenfunctions are the Hermite functions $(h_j)_{j \ge 1}$, associated to the eigenvalues $(2j - 1)_{j \ge 1}$. Now consider the linear operator $A = A(\nu, \xi) = -\partial_x^2 + x^2 + \nu V(\xi, x)$. Under the previous assumptions, $A$ is self-adjoint and has pure point spectrum with simple eigenvalues $(\lambda_j(\nu, \xi))_{j \ge 1}$ satisfying $\lambda_j(\nu, \xi) \sim 2j - 1$. Its eigenfunctions $(\varphi_j(\xi, \cdot))_{j \ge 1}$ form an orthonormal basis of $L^2(\mathbb R)$, and $\varphi_j(\xi, \cdot) \sim h_j$ as $\nu \to 0$ in $L^2$ norm. As a consequence $A$ and $T$ have the same domain and $D(A^{p/2}) = \mathcal H^p$. We will prove these facts for the particular class of potentials we will consider (see Lemmas 6.2 and 6.3 below).

The parameter $\varepsilon > 0$ will be small so that we can apply Theorem 2.3 and $\nu > 0$ will be small too, so that we have a suitable perturbation theory for the operator $A$. We now explain the restriction $\nu \ge C_0 \varepsilon$. The aim of this section is to construct a potential $V$ so that Theorem 2.3 applies, and in particular, (2.3) has to be satisfied. Small values of $k, l$ in (2.3) and the asymptotics of Lemma 6.3 give $C\nu \ge \alpha$. This together with the condition $\varepsilon \le \varepsilon_0 \alpha$ in Theorem 2.3 yields the result.

We fix a finite subset $J$ of $\mathbb N$ of cardinality $n$. Without loss of generality and in order to simplify the presentation, we assume $J = \{1, \dots, n\}$.
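The statement that the Hermite functions $h_j$ are eigenfunctions of $T = -\partial_x^2 + x^2$ with eigenvalues $2j - 1$ can be checked numerically. The sketch below is only an illustration: it builds the normalised Hermite functions from the standard three-term recurrence, approximates $h_j''$ by a central difference, and approximates $L^2$ inner products with the trapezoidal rule. The helper names `hermite_function`, `residual` and `l2_inner`, and the grid parameters, are assumptions of this sketch.

```python
import math


def hermite_function(j, x):
    """Normalised Hermite function h_j(x), j >= 1 (h_1 is the Gaussian)."""
    prev, cur = 0.0, math.pi ** -0.25 * math.exp(-0.5 * x * x)
    for n in range(1, j):
        # psi_n = sqrt(2/n) x psi_{n-1} - sqrt((n-1)/n) psi_{n-2}, with h_j = psi_{j-1}
        prev, cur = cur, math.sqrt(2.0 / n) * x * cur - math.sqrt((n - 1.0) / n) * prev
    return cur


def residual(j, x, step=1e-3):
    """|(-h_j'' + x^2 h_j - (2j - 1) h_j)(x)|, with h_j'' from a central difference."""
    h0 = hermite_function(j, x)
    second = (hermite_function(j, x + step) - 2.0 * h0
              + hermite_function(j, x - step)) / step ** 2
    return abs(-second + x * x * h0 - (2 * j - 1) * h0)


def l2_inner(i, j, half_width=12.0, points=4801):
    """Trapezoidal approximation of the L^2(R) inner product of h_i and h_j."""
    dx = 2.0 * half_width / (points - 1)
    total = 0.0
    for m in range(points):
        x = -half_width + m * dx
        weight = 0.5 if m in (0, points - 1) else 1.0
        total += weight * hermite_function(i, x) * hermite_function(j, x)
    return total * dx


# Largest eigen-residual over j = 1..6 and x in [-3, 3]; it is limited by the
# finite-difference error, not by the recurrence.
worst = max(residual(j, x / 4.0) for j in range(1, 7) for x in range(-12, 13))

if __name__ == "__main__":
    print("largest eigen-residual:", worst)
    print("<h_3, h_3> =", l2_inner(3, 3), "  <h_3, h_5> =", l2_inner(3, 5))
```

The residual is small but not zero because the second derivative is only approximated; the orthonormality check is accurate to close to machine precision since the integrands decay like Gaussians.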
We then expand $u$ and $\bar u$ in the basis of eigenfunctions using the phase space structure of the Introduction, namely we write
$$u(x) = \sum_{j=1}^{n} (y_j + I_j)^{1/2} e^{i\theta_j} \varphi_j(\xi, x) + \sum_{j \ge 1} z_j \varphi_{j+n}(\xi, x),$$
$$\bar u(x) = \sum_{j=1}^{n} (y_j + I_j)^{1/2} e^{-i\theta_j} \varphi_j(\xi, x) + \sum_{j \ge 1} \bar z_j \varphi_{j+n}(\xi, x),$$

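where $(\theta, y, z, \bar z) \in \mathcal P^p = \mathbb T^n \times \mathbb R^n \times \ell^2_p \times \ell^2_p$ (recall that $\ell^2_p$ is the weighted $\ell^2$ space with weight $j^{p/2}$) are regarded as variables and $I \in \mathbb R_+^n$ are regarded as parameters (here $\mathbb R_+$ denotes the set of non negative real numbers). In this setting Eq. (6.1) reads as the Hamilton equations associated to the Hamiltonian function $H = N + P$, where
$$N = \sum_{j=1}^{n} \lambda_j(\nu, \xi)\, y_j + \sum_{j \ge 1} \Omega_j(\nu, \xi)\, z_j \bar z_j,$$
$\Omega_j(\nu, \xi) = \lambda_{j+n}(\nu, \xi)$, $G(u, \bar u) = (u \bar u)^{m+1}$ and
$$P(\theta, y, z, \bar z) = \varepsilon \int_{\mathbb R} G\Big( \sum_{j=1}^{n} (y_j + I_j)^{1/2} e^{i\theta_j} \varphi_j(\xi, x) + \sum_{j \ge 1} z_j \varphi_{j+n}(\xi, x),\ \sum_{j=1}^{n} (y_j + I_j)^{1/2} e^{-i\theta_j} \varphi_j(\xi, x) + \sum_{j \ge 1} \bar z_j \varphi_{j+n}(\xi, x) \Big)\, dx. \tag{6.2}$$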

For the sequel we fix $(I_j)_{1 \le j \le n}$. We assume that $(\theta, y, z, \bar z) \in D(s, r)$ for some fixed $s, r > 0$ (recall Definition (2.4) of $D(s, r)$). There is no particular smallness assumption on $s, r$; we only have to take $r > 0$ with $r < \min_{1 \le j \le n} I_j$ so that $(y_j + I_j)^{1/2}$ is well-defined. We now show that we can construct a class of potentials $V$ so that Theorem 2.3 applies.

6.1. Definition of the family of potentials $V$. Let $(f_j)_{1 \le j \le n}$ be the dual basis of $(h_j^2)_{1 \le j \le n}$, i.e. $(f_j) \in \mathrm{Span}_{\mathbb R}(h_1^2, \dots, h_n^2)$ and $\int_{\mathbb R} f_j h_k^2 = \delta_{jk}$ for all $1 \le j, k \le n$. We say that $\alpha = (\alpha_k)_{k \ge n+1} \in \mathcal Z_n$ if $-\frac12 \le \alpha_k \le \frac12$ for all $k \ge n + 1$. We endow the set of such sequences with the probability measure defined as the infinite product ($k \ge n + 1$) of the Lebesgue measure on $[-1/2, 1/2]$. Then define
$$g(x) = \sum_{k \ge n+1} \alpha_k e^{-k} h_{2k-1}(\sqrt 2 x),$$
and for $\xi = (\xi_1, \dots, \xi_n) \in \Pi = [-1, 1]^n$,
$$V(\xi, x) = \sum_{k=1}^{n} \xi_k f_k(x) + \xi_1 g(x). \tag{6.3}$$
The spectral data $\varphi_j$ and $\lambda_j$ are defined by the spectral equation
$$\big( -\partial_x^2 + x^2 + \nu V(\xi, x) \big) \varphi_j(\xi, x) = \lambda_j(\nu \xi)\, \varphi_j(\xi, x), \tag{6.4}$$
and we assume that the $(\varphi_j)$ are $L^2$-normalised ($\|\varphi_j(\xi, \cdot)\|_{L^2} = 1$ for all $\xi \in \Pi$ and $j \ge 1$). Moreover, in order to define $\varphi_j$ uniquely, we impose $\langle \varphi_j, h_j \rangle > 0$. In the sequel we need a particular case of estimates proved by K. Yajima & G. Zhang [22].


Lemma 6.1 ([22]). For all $2 < p < \infty$ there exist $\alpha > 0$ and $C > 0$ so that for all $\xi \in \Pi$ and $j \ge 1$,
$$\|\varphi_j(\xi, \cdot)\|_{L^p(\mathbb R)} \le C j^{-\alpha}. \tag{6.5}$$
The next result is the key estimate in our perturbation theory.

Lemma 6.2. There exist $\alpha > 0$ and $C > 0$ so that for all $\xi, \eta \in \Pi$, $\nu > 0$ and $j \ge 1$,
$$\|\varphi_j(\xi, \cdot) - \varphi_j(\eta, \cdot)\|_{L^2} \le C \nu |\xi - \eta| j^{-\alpha}. \tag{6.6}$$

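In particular $\|\varphi_j(\xi, \cdot) - h_j\|_{L^2} \le C \nu |\xi| j^{-\alpha}$, which shows that the $\varphi_j$ are close to the Hermite functions in $L^2$ norm.

Proof. In the sequel, we write $\varphi_j(\xi)$ instead of $\varphi_j(\xi, \cdot)$. For $\xi, \eta \in \Pi$, we compute
$$A(\nu\xi) \varphi_j(\eta) = \big( -\partial_x^2 + x^2 + \nu V(\xi, x) \big) \varphi_j(\eta) = \lambda_j(\nu\eta) \varphi_j(\eta) + \nu \big( V(\xi, x) - V(\eta, x) \big) \varphi_j(\eta).$$
Thus by (6.3) and (6.5) there exists $\alpha > 0$ such that
$$\big\| \big( A(\nu\xi) - \lambda_j(\nu\eta) \big) \varphi_j(\eta) \big\|_{L^2} = \nu \| (V(\xi) - V(\eta)) \varphi_j(\eta) \|_{L^2} \le \nu \|V(\xi) - V(\eta)\|_{L^4} \|\varphi_j(\eta)\|_{L^4} \le C \nu |\xi - \eta| j^{-\alpha}. \tag{6.7}$$
Choosing $\eta = 0$ in (6.7), and as $\varphi_j(0) = h_j$ and $\lambda_j(0) = 2j - 1$, we get
$$1 = \|h_j\|_{L^2} \le \big\| \big( A(\nu\xi) - (2j - 1) \big)^{-1} \big\|_{L^2 \to L^2} \big\| \big( A(\nu\xi) - (2j - 1) \big) h_j \big\|_{L^2} \le C \nu j^{-\alpha} \big\| \big( A(\nu\xi) - (2j - 1) \big)^{-1} \big\|_{L^2 \to L^2}.$$
The previous estimate together with the general formula
$$\big\| \big( A(\nu\xi) - (2j - 1) \big)^{-1} \big\|_{L^2 \to L^2} = \mathrm{dist}\big( 2j - 1, \sigma(A(\nu\xi)) \big)^{-1},$$
which holds for any self-adjoint operator, gives $\mathrm{dist}\big( 2j - 1, \sigma(A(\nu\xi)) \big) \le C \nu j^{-\alpha}$, where $\sigma(A(\nu\xi))$ denotes the spectrum of $A(\nu\xi)$. A similar argument, taking $\xi = 0$ in (6.7), leads to $\mathrm{dist}\big( \lambda_j(\nu\eta), \sigma(T) \big) \le C \nu j^{-\alpha}$. Thus for all $j \ge 1$,
$$\lambda_j(\nu\xi) = 2j - 1 + \nu\, O(j^{-\alpha}). \tag{6.8}$$
Using that $(\varphi_k(\xi))_{k \ge 1}$ is a Hilbertian basis of $L^2(\mathbb R)$, we deduce
$$\big\| \varphi_j(\eta) - \langle \varphi_j(\xi), \varphi_j(\eta) \rangle \varphi_j(\xi) \big\|_{L^2}^2 = \sum_{k \ge 1} \big| \big\langle \varphi_j(\eta) - \langle \varphi_j(\xi), \varphi_j(\eta) \rangle \varphi_j(\xi),\ \varphi_k(\xi) \big\rangle \big|^2 = \sum_{k \ge 1,\, k \ne j} |\langle \varphi_j(\eta), \varphi_k(\xi) \rangle|^2. \tag{6.9}$$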


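With the same decomposition, we can also write
$$\big\| \big( A(\nu\xi) - \lambda_j(\nu\eta) \big) \varphi_j(\eta) \big\|_{L^2}^2 = \sum_{k \ge 1} \big| \big\langle \big( A(\nu\xi) - \lambda_j(\nu\eta) \big) \varphi_j(\eta), \varphi_k(\xi) \big\rangle \big|^2$$
$$= \sum_{k \ge 1} \big| \big( \lambda_k(\nu\xi) - \lambda_j(\nu\eta) \big) \langle \varphi_k(\xi), \varphi_j(\eta) \rangle \big|^2 = \sum_{k \ge 1} |\lambda_k(\nu\xi) - \lambda_j(\nu\eta)|^2\, |\langle \varphi_k(\xi), \varphi_j(\eta) \rangle|^2$$
$$\ge \sum_{k \ge 1,\, k \ne j} |\langle \varphi_k(\xi), \varphi_j(\eta) \rangle|^2, \tag{6.10}$$
because by (6.8) $|\lambda_k(\nu\xi) - \lambda_j(\nu\eta)| \ge 1$ for $k \ne j$ uniformly in $\xi, \eta$ and uniformly in $\nu$ small enough. Now by (6.7), (6.9) and (6.10) we deduce that
$$\big\| \varphi_j(\eta) - \langle \varphi_j(\xi), \varphi_j(\eta) \rangle \varphi_j(\xi) \big\|_{L^2}^2 \le C \nu |\xi - \eta| j^{-\alpha}.$$
In particular, taking the scalar product of $\varphi_j(\eta)$ with $\varphi_j(\eta) - \langle \varphi_j(\xi), \varphi_j(\eta) \rangle \varphi_j(\xi)$, we obtain
$$1 - \langle \varphi_j(\xi), \varphi_j(\eta) \rangle^2 \le C \nu |\xi - \eta| j^{-\alpha}.$$
The last two estimates imply $\|\varphi_j(\xi) - \varphi_j(\eta)\|_{L^2} \le C \nu |\xi - \eta| j^{-\alpha}$, which was the claim.

Lemma 6.3. We have the following asymptotics when $\nu \to 0$:
$$\lambda_j(\nu\xi) = 2j - 1 + \nu \xi_j + o(\nu), \quad \forall\, 1 \le j \le n, \tag{6.11}$$
$$\Omega_j(\nu\xi) = \lambda_{j+n}(\nu\xi) = 2(j + n) - 1 + \nu \sum_{k=1}^{n} \xi_k \int_{\mathbb R} (f_k + \delta_{1k} g)\, h_{n+j}^2 + o(\nu), \quad \forall\, j \ge 1. \tag{6.12}$$
Proof. We first prove (6.11). We differentiate Eq. (6.4) in $\xi_k$,
$$A(\nu\xi) \frac{\partial \varphi_j(\xi)}{\partial \xi_k} + \nu (f_k + \delta_{1k} g) \varphi_j(\xi) = \lambda_j(\nu\xi) \frac{\partial \varphi_j(\xi)}{\partial \xi_k} + \frac{\partial \lambda_j(\nu\xi)}{\partial \xi_k} \varphi_j(\xi),$$
take the scalar product with $\varphi_j(\xi)$, and the selfadjointness of $A(\nu\xi)$ gives
$$\frac{\partial \lambda_j(\nu\xi)}{\partial \xi_k} = \nu \int_{\mathbb R} (f_k + \delta_{1k} g)\, \varphi_j^2(\xi). \tag{6.13}$$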

Now by (6.6),
$$\Big| \int_{\mathbb R} (f_k + \delta_{1k} g) \big( \varphi_j^2(\xi) - h_j^2 \big) \Big| \le \|f_k + \delta_{1k} g\|_{L^\infty} \|\varphi_j(\xi) + h_j\|_{L^2} \|\varphi_j(\xi) - h_j\|_{L^2} \le C \|\varphi_j(\xi) - h_j\|_{L^2} \longrightarrow 0$$
when $\nu \to 0$. Thus by definition of the $f_k$ and $g$ and by estimate (6.13), we obtain that for all $1 \le j \le n$,
$$\lambda_j(\nu\xi) = 2j - 1 + \nu \sum_{k=1}^{n} \xi_k \int_{\mathbb R} (f_k + \delta_{1k} g)\, h_j^2 + o(\nu) = 2j - 1 + \nu \xi_j + o(\nu),$$
which is (6.11). The asymptotics (6.12) are proved in the same way. Observe that we can prove a better estimate on the error term using (6.8), but we do not need it here.

6.2. Verification of Assumptions 1 and 2.

Lemma 6.4. There exists a null measure set $\mathcal N \subset \mathcal Z_n$ such that for all $\alpha \in \mathcal Z_n \setminus \mathcal N$ we have for all $p, q \ge 1$ with $p \ne q$,
$$\int_{\mathbb R} (f_1 + g)\, h_{n+p}^2 \notin \mathbb Z, \tag{6.14}$$
and
$$\int_{\mathbb R} (f_1 + g) \big( h_{n+p}^2 \pm h_{n+q}^2 \big) \notin \mathbb Z. \tag{6.15}$$

Proof. For $j \ge 1$, the Hermite function $h_j$ reads $h_j(x) = P_j(x) e^{-x^2/2}$, where $P_j$ is a polynomial of degree exactly $(j - 1)$, and $P_j$ is even (resp. odd) when $(j - 1)$ is even (resp. odd). We have $\mathrm{Span}_{\mathbb R}(h_1, \dots, h_n) = e^{-x^2/2}\, \mathbb R_{n-1}[X]$. Thus we deduce that there exist $(\mu_{kj})$ so that
$$h_j^2(x) = \sum_{k=1}^{j} \mu_{kj}\, h_{2k-1}(\sqrt 2 x), \tag{6.16}$$
with $\mu_{jj} \ne 0$. We assume that $q < p$. The application
$$(\alpha_n, \alpha_{n+1}, \dots) \longmapsto \int_{\mathbb R} (f_1 + g) \big( h_{n+p}^2 \pm h_{n+q}^2 \big)$$
is a linear form. In order to prove (6.15), it suffices to check that this linear form is nontrivial. According to (6.16) and to the definition of $f_1$ and $g$, the coefficient of $\alpha_{n+p}$ is
$$e^{-(n+p)} \mu_{n+p,n+p} \int_{\mathbb R} h_{2(n+p)-1}^2(\sqrt 2 x)\, dx = e^{-(n+p)} \mu_{n+p,n+p} / \sqrt 2 \ne 0.$$
Therefore for fixed $p, q$, (6.15) is satisfied on the complement of a null measure set $\mathcal N_{p,q}$. Finally, (6.15) is satisfied on $\mathcal Z_n \setminus \mathcal N$, where $\mathcal N = \cup_{p,q \ge 1} \mathcal N_{p,q}$. The proof of (6.14) is similar.
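The expansion (6.16) and the normalisation $\int_{\mathbb R} h_{2k-1}^2(\sqrt 2 x)\, dx = 1/\sqrt 2$ used in this proof can be checked numerically for small $j$. The sketch below is an illustration under stated assumptions: the coefficients $\mu_{kj}$ are recovered by projecting $h_j^2$ onto $h_{2k-1}(\sqrt 2\, \cdot)$ with a trapezoidal rule, and the helper names `hermite_function`, `integrate` and `mu` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def hermite_function(j, x):
    """Normalised Hermite function h_j(x), j >= 1, from the three-term recurrence."""
    prev, cur = 0.0, math.pi ** -0.25 * math.exp(-0.5 * x * x)
    for n in range(1, j):
        prev, cur = cur, math.sqrt(2.0 / n) * x * cur - math.sqrt((n - 1.0) / n) * prev
    return cur


def integrate(f, half_width=10.0, points=4001):
    """Trapezoidal rule on [-half_width, half_width]; the integrands decay like Gaussians."""
    dx = 2.0 * half_width / (points - 1)
    total = 0.0
    for m in range(points):
        x = -half_width + m * dx
        total += (0.5 if m in (0, points - 1) else 1.0) * f(x)
    return total * dx


def mu(k, j):
    """Coefficient of h_{2k-1}(sqrt(2) x) in h_j(x)^2.

    Uses that the functions h_a(sqrt(2) x) are orthogonal with squared norm 1/sqrt(2).
    """
    return SQRT2 * integrate(
        lambda x: hermite_function(j, x) ** 2 * hermite_function(2 * k - 1, SQRT2 * x)
    )


# Pointwise error of the reconstructed expansion for j = 1, 2, 3.
errors = {}
for j in (1, 2, 3):
    coeffs = [mu(k, j) for k in range(1, j + 1)]
    errors[j] = max(
        abs(hermite_function(j, x) ** 2
            - sum(c * hermite_function(2 * k - 1, SQRT2 * x)
                  for k, c in enumerate(coeffs, start=1)))
        for x in (-1.5, -0.4, 0.0, 0.7, 2.0)
    )

if __name__ == "__main__":
    print("mu_11 =", mu(1, 1), " mu_12 =", mu(1, 2), " mu_22 =", mu(2, 2))
    print("reconstruction errors:", errors)
```

A hand computation for $j = 2$ gives $\mu_{12} = \pi^{-1/4}/2$ and $\mu_{22} = \pi^{-1/4}/\sqrt 2$, so the leading coefficient is indeed nonzero in this case.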


In the sequel we fix $\alpha \in \mathcal Z_n \setminus \mathcal N$ so that Lemma 6.4 holds true. We are now able to show that Assumption 1 is satisfied. Recall that in our setting, the internal frequencies are $\lambda(\nu\xi) = (\lambda_j(\nu\xi))_{1 \le j \le n}$ and the external frequencies are $\Omega(\nu\xi) = (\Omega_j(\nu\xi))_{j \ge 1}$ with $\Omega_j(\nu\xi) = \lambda_{n+j}(\nu\xi)$.

Lemma 6.5. There exists $\nu_0 > 0$ so that for all $0 < \nu < \nu_0$ we have
$$\mathrm{Meas}\big( \{ \xi \in \Pi : k \cdot \lambda(\nu\xi) + l \cdot \Omega(\nu\xi) = 0 \} \big) = 0, \quad \forall\, (k, l) \in \mathcal Z, \tag{6.17}$$
and for all $\xi \in \Pi$,
$$l \cdot \Omega(\nu\xi) \ne 0, \quad \forall\, 1 \le |l| \le 2. \tag{6.18}$$
Proof. We prove (6.17) by contradiction. Let $(k, l) \in \mathcal Z$. In the case $|l| = 2$ in (6.17) we can write
$$k \cdot \lambda(\nu\xi) + l \cdot \Omega(\nu\xi) = \sum_{j=1}^{n} k_j \lambda_j(\nu\xi) + \lambda_{n+p}(\nu\xi) - \lambda_{n+q}(\nu\xi) =: F(\nu\xi),$$
for some $p, q \ge 1$. Now if (6.17) does not hold, $F : \mathbb R^n \to \mathbb R$ is a $C^1$ function which vanishes on a set of positive measure in any neighbourhood of $0$, thus $F(0) = 0$ and for all $1 \le k \le n$, $\frac{\partial F}{\partial \xi_k}(0) = 0$. By Lemma 6.3 these conditions read
$$\sum_{j=1}^{n} (2j - 1) k_j + 2(p - q) = 0 \quad \text{and} \quad k_j + \int_{\mathbb R} (f_j + \delta_{1j} g) \big( h_{n+p}^2 - h_{n+q}^2 \big) = 0, \quad \forall\, 1 \le j \le n. \tag{6.19}$$
In particular for $j = 1$, (6.19) is in contradiction with (6.15). The case $|l| = 1$ is similar, using (6.14). It remains to prove (6.18). By (6.12), for all $j \ge 1$, $\Omega_j(\nu\xi) \to 2(j + n) - 1$ when $\nu \to 0$. Hence (6.18) holds true if $\nu$ is small enough.

We now check Assumption 2. Firstly, thanks to (6.8) we have that for $j, k \ge 1$, $|\Omega_j(\nu\xi) - \Omega_k(\nu\xi)| \ge |j - k|$ and $|\Omega_j(\nu\xi)| \ge j$. Then by (6.13) and (6.5),
$$|\Omega_j(\nu\xi) - \Omega_j(\nu\eta)| \le \nu |\xi - \eta| \sup_{\xi \in \Pi} \Big| \int_{\mathbb R} (f_k + \delta_{1k} g)\, \varphi_{j+n}^2(\xi, \cdot) \Big| \le \nu |\xi - \eta|\, \|f_k + \delta_{1k} g\|_{L^2} \sup_{\xi \in \Pi} \|\varphi_{j+n}(\xi, \cdot)\|_{L^4}^2 \le C \nu |\xi - \eta| j^{-\alpha},$$
and Assumption 2 is fulfilled.


6.3. Verification of Assumptions 3 and 4. Recall that for $p \ge 0$, $\mathcal H^p = D(T^{p/2})$ is the Sobolev space based on the harmonic oscillator. Thanks to (6.6) and (6.8), we also have $\mathcal H^p = D(A^{p/2}(\nu\xi))$ for all $\nu > 0$ small enough and $\xi \in \Pi$. Observe that $\mathcal H^p$ is an algebra and the Sobolev embeddings which hold for the usual Sobolev space $H^p$ are also true here, since $\mathcal H^p \subset H^p$. Let $u = \sum_{j \ge 1} \alpha_j \varphi_j$. Then $u \in \mathcal H^p$ if and only if $(\alpha_j) \in \ell^2_p$.

We now check the smoothness of $P$ and the decay of the vector field $X_P$. Let $p \ge 2$ so that we are in the framework of Theorem 2.3. Since $G(u, \bar u) = (u \bar u)^{m+1}$ in (6.2), we have
$$P = \varepsilon \int_{\mathbb R} |u|^{2(m+1)}. \tag{6.20}$$
We first show that $\big( \frac{\partial P}{\partial z_j} \big)_{j \ge 1} \in \ell^2_p$. We have
$$\frac{\partial P}{\partial z_j} = \varepsilon (m + 1) \int_{\mathbb R} \varphi_{j+n}\, u^m \bar u^{m+1}, \tag{6.21}$$
thus $\frac{\partial P}{\partial z_j}$ is (up to a constant factor) the $(j + n)$th coefficient of the decomposition of $u^m \bar u^{m+1}$, and this latter term is in $\mathcal H^p$ (because $\mathcal H^p$ is an algebra), hence the result. The other components of $X_P$ can be handled in the same way, and we get $X_P \in \mathcal P^p$. By (6.20) and Sobolev embeddings,
$$\sup_{D(s,r) \times \Pi} |P| \le \varepsilon \|u\|_{L^{2(m+1)}}^{2(m+1)} \le \varepsilon \|u\|_{\mathcal H^p}^{2(m+1)}. \tag{6.22}$$

Similarly, using (6.21) and " # 1 ∂P = εi(m + 1)(y j + I j ) 2 eiθ j ϕ j u m u m+1 + e−iθ j ϕ j u m+1 u m , ∂θ j R R " # ∂P m+1 − 21 iθ j m m+1 −iθ j (y j + I j ) e =ε ϕju u +e ϕ j u m+1 u m , ∂yj 2 R R it is easy to see that sup D(s,r )× |X P |r ≤ Cε. We now turn to the Lipschitz norms. Let ξ, η ∈ , |P(ξ ) − P(η)| ≤ Cεu(ξ ) − u(η) L 2 (u(ξ )2m+1 + u(η)2m+1 ) L 4(2m+1) L 4(2m+1) ≤ Cεu(ξ ) − u(η) L 2 u2m+1 Hp .

(6.23)

Now by (6.6) u(ξ ) − u(η) L 2 ≤ C

n

ϕ j (ξ ) − ϕ j (η) L 2 +

j=1

≤ C|ξ − η|,

j p |z j |ϕ j+n (ξ ) − ϕ j+n (η) L 2

j≥1

(6.24)

where in the last line we used Cauchy-Schwarz and the fact that (z j ) j≥1 ∈ l 2p with p ≥ 2. Then (6.23) and (6.24) show the Lipschitz regularity of P. We can proceed similarly for X P .


It remains to prove the decay estimates of Assumption 4. Using (6.21), (6.5) and the Sobolev embeddings, we obtain
$$\Big| \frac{\partial P}{\partial z_j} \Big| \le \varepsilon (m + 1) \|\varphi_{j+n}\|_{L^\infty(\mathbb R)} \|u\|_{L^{2m+1}}^{2m+1} \le C \varepsilon j^{-\alpha} \|u\|_{\mathcal H^p}^{2m+1},$$
and similarly, from
$$\frac{\partial^2 P}{\partial z_j \partial z_l} = \varepsilon m (m + 1) \int_{\mathbb R} \varphi_{j+n} \varphi_{l+n}\, u^{m-1} \bar u^{m+1},$$
we deduce
$$\Big| \frac{\partial^2 P}{\partial z_j \partial z_l} \Big| \le C \varepsilon \|\varphi_{j+n}\|_{L^\infty(\mathbb R)} \|\varphi_{l+n}\|_{L^\infty(\mathbb R)} \|u\|_{L^{2m}}^{2m} \le \varepsilon C (jl)^{-\alpha} \|u\|_{\mathcal H^p}^{2m}.$$
The estimates of the Lipschitz norms are obtained as in (6.23), (6.24) and using (6.6). As a conclusion Assumptions 1–4 are satisfied and we can apply Theorem 2.3 with some $\beta > 0$ if $\varepsilon > 0$ is small enough. Recall that $\Pi = [-1, 1]^n$.

Theorem 6.6. Let $m \ge 1$ and $n \ge 1$ be two integers. Let $V(\xi, \cdot)$ be the $n$-parameter family of potentials defined by (6.3). There exist $\varepsilon_0 > 0$, $\nu_0 > 0$, $C_0 > 0$ and, for each $\varepsilon < \varepsilon_0$, a Cantor set $\Pi_\varepsilon \subset \Pi$ of asymptotic full measure when $\varepsilon \to 0$, such that for each $\xi \in \Pi_\varepsilon$ and for each $C_0 \varepsilon \le \nu < \nu_0$, the solution of
$$i\partial_t u + \partial_x^2 u - x^2 u - \nu V(\xi, x) u = \varepsilon |u|^{2m} u, \quad (t, x) \in \mathbb R \times \mathbb R \tag{6.25}$$
with initial datum
$$u_0(x) = \sum_{j=1}^{n} I_j^{1/2} e^{i\theta_j} \varphi_j(\xi, x), \tag{6.26}$$
with $(I_1, \dots, I_n) \in (0, 1]^n$ and $\theta \in \mathbb T^n$, is quasi-periodic with a quasi-period $\omega^*$ close to $\omega_0 = (2j - 1)_{j=1}^{n}$: $|\omega^* - \omega_0| < C\nu$. More precisely, when $\theta$ covers $\mathbb T^n$, the set of solutions of (6.25) with initial datum (6.26) covers an $n$-dimensional torus which is invariant by (6.25). Furthermore this torus is linearly stable.

Remark 6.7. From the proof it is clear that our result also applies to any nonlinearity which is a linear combination of terms $|u|^{2m} u$. Moreover, under ad hoc conditions on the derivatives of $G$, we can admit some nonlinearities of the form $\frac{\partial G}{\partial \bar u}(x, u, \bar u)$ (i.e. depending on $x$) in (6.1). Also we can replace the set $\{1, \dots, n\}$ by any finite subset of $\mathbb N$ of cardinality $n$.

7. Application to the Linear Schrödinger Equation

In this section we prove Theorem 1.2 following the scheme developed by H. Eliasson and S. Kuksin in [7] for the linear Schrödinger equation on the torus with quasi-periodic in time potentials. The setting differs slightly from Sect. 6 since now we are not considering a perturbation around a finite dimensional torus but we want to construct a linear change of variable defined on all the phase space. Consider the equation
$$i\partial_t u = -\partial_x^2 u + x^2 u + V(t\omega, x) u, \tag{7.1}$$


where V satisfies the condition (1.8). Recall the definition of the phase space P p = Tn × Rn × 2p × 2p . Recall also that h j , j ≥ 1 denote the eigenfunctions of the quantum harmonic oscillator T = −∂x2 + x 2 and that we have T h j =(2 j − 1)h j , j ≥ 1. Expanding u and u¯ on the Hermite basis, u = j≥1 z j h j , u¯ = j≥1 z¯ j h j , Eq. (7.1) reads as a non autonomous Hamiltonian system ⎧

z, z¯ ), j ≥ 1, ⎨ z˙ j = −i(2 j − 1)z j − iε ∂ Q(t, ∂ z¯ j (7.2)

z, z¯ ), ⎩ z˙¯ j = i(2 j − 1)¯z j + iε ∂ Q(t, j ≥ 1, ∂z j where

z, z¯ ) = Q(t,

R

V (ωt, x)

z j h j (x)

j≥1

z¯ j h j (x) dx

j≥1

and4 (z, z¯ ) ∈ 22 × 22 . We then re-interpret (7.2) as an autonomous Hamiltonian system in an extended phase space P 2 , ⎧ z˙ j = −i(2 j − 1)z j − iε ∂∂z¯ j Q(θ, z, z¯ ) j ≥ 1, ⎪ ⎪ ⎪ ⎨ z˙¯ = i(2 j − 1)¯z + iε ∂ Q(θ, z, z¯ ) j ≥ 1, j j ∂z j (7.3) ˙ ⎪ θj = ωj j = 1, · · · , n, ⎪ ⎪ ⎩ y˙ = −ε ∂ Q(θ, z, z¯ ) j = 1, · · · , n, j ∂θ j where

Q(θ, z, z¯ ) =

R

V (θ, x)

z j h j (x)

j≥1

z¯ j h j (x) dx

j≥1

is quadratic in (z, z¯ ). We notice that the first three equations of (7.3) are independent of y and are equivalent to (7.2). Furthermore (7.3) reads as the Hamiltonian equations associated with the Hamiltonian function H = N + Q where, N (ω) =

n

ωj yj +

j=1

(2 j − 1)z j z¯ j . j≥1

Here the external parameters are directly the frequencies ω = (ω j )1≤ j≤n ∈ [0, 2π )n =: and the normal frequencies j = 2 j − 1 are constant. 7.1. Statement of the results and proof. Theorem 7.1. There exists ε0 > 0 such that if 0 < ε < ε0 , there exist (i) a Cantor set ε ⊂ with Meas(\ε ) → 0 as ε → 0; (ii) a Lipschitz family of real analytic, symplectic and linear coordinate transformation : ε × P 0 → P 0 of the form ω (y, θ, Z ) = (y +

1 Z · Mω (θ )Z , θ, L ω (θ )Z ), 2

(7.4)

where Z = (z, z¯ ), L ω (θ ) and Mω (θ ) are linear bounded operators from 2p × 2p into itself for all p ≥ 0 and L ω (θ ) is invertible; 4 For the moment we work in 2 × 2 , the largest phase space in which our abstract result applies. 2 2


(iii) a Lipschitz family of new normal forms

N (ω) =

n

ωj yj +

j=1

j (ω)z j z¯ j ;

j≥1

such that on ε × P 0 , H ◦ = N . Moreover the new external frequencies are close to the original ones | − |2β,ε ≤ cε, and the new frequencies satisfy a non resonant condition, there exists α > 0 such that for all ω ∈ ε , k · ω + l · (ω) ≥ α

l , (k, l) ∈ Z. 1 + |k|τ

Notice that in the new coordinates, (y , θ , z , z¯ ) = −1 ω (y, θ, z, z¯ ), the dynamic is linear with y invariant: ⎧ j ≥ 1, z˙ j = ij z j ⎪ ⎪ ⎨ z˙¯ = −i z¯ j ≥ 1, j j j (7.5) θ˙ j = ω j j = 1, · · · , n, ⎪ ⎪ ⎩ y˙ j = 0 j = 1, · · · , n. As (7.1) is equivalent to (7.3), this theorem impliesTheorem 1.2. In particular the solutions u(t, x) of (7.1) with initial datum u 0 (x) = j≥1 z j (0)h j (x) read u(t, x) = z (t)h (x) with j j j≥1

(z, z¯ )(t) = L ω (ωt)(z (0)ei t , z¯ (0)e−i t ) and (z (0), z¯ (0)) = L −1 ω (0)(z(0), z¯ (0)). Thus u(t, x) = ψ j (ωt, x)ei j t , j≥1

where ψ j (θ, x) = ≥1 [L ω (θ )L −1 ω (0)(z(0), z¯ (0))] h (x). In particular the solutions are all almost periodic in time with a non resonant frequencies vector (ω, ). Fur thermore we observe that ψ j (ωt, x)ei j t solves (7.1) if and only if j + k · ω is an eigenvalue of (1.10) (with eigenfunction ψ j (θ, x)eiθ·k ). This shows that the spectrum of the Floquet operator (1.10) equals {j + k · ω | k ∈ Zn , j ≥ 1} and thus Corollary 1.4 is proved. Remark 7.2. Although is defined on P 0 , the normal forms N and N are well defined on P p only when p ≥ 1/2. Nevertheless their flows are well defined and continuous from P 0 into itself (cf. (7.5)).


˜ ⊂ be the subset of Diophantine vector of frequencies ω, i.e. having the Proof. Let property that there exists 0 < α ≤ 1 such that k · ω − b ≥ 2π α , k ∈ Zn \ {0}, b ∈ Z |k|τ −1

(7.6)

˜ = 0. Further this Diophantine for some τ > n + 2. It is well known that Meas(\) condition implies that k · ω + l · ≥ α

l , (k, l) ∈ Z, 1 + |k|τ

since l · ∈ Z and if l ≤ 2π |k|, then k · ω + l · ≥ 2l − 2π |k| ≥ l ≥ α

l ≥ α 1+|k| τ , while if l ≥ 2π |k|, then

2π α |k|τ −1 l 1+|k|τ .

Thus Assumption 1 holds true. Further as the normal frequencies j = 2 j − 1 are constant, Assumption 2 is satisfied. We now show that Assumption 3 holds. Because of the assumptions on the smooth∂Q )k≥1 ∈ 2p . We have ness of V , the only condition which needs some care is that ( ∂z k ∂Q = V (θ, x)h k u dx, ∂z k R which is the k th coefficient of the decomposition of V (θ, x)u in the Hermite basis. Thus ∂Q )k≥1 ∈ 22 if and only if V (θ, x)u ∈ H2 which is true since u¯ ∈ H2 and V and ∂x V ( ∂z k are bounded. We turn to Assumption 4. Recall that by (6.5), for all 2 < r ≤ +∞, there exists β > 0 so that h j L r (R) ≤ C j −β . On the other hand, by assumption V is real analytic in θ and L q in x for some 1 ≤ q < +∞. Consider 1 < q ≤ +∞ so that q1 + q1 = 1, then with Hölder, we compute ∂Q V (θ, x)h k u dx ≤ sup V (θ, ·) L q (R) h k u L q (R) = ∂z k R θ∈[0,2π ]n ≤

sup

θ∈[0,2π ]n −β

≤ Ck

V (θ, ·) L q (R) h k L 2q (R) u L 2q (R)

.

Similarly, ∂2 Q V (θ, x)h k h l dx ≤ = ∂z k ∂z l R

sup

θ∈[0,2π ]n −β

≤ C( jl)

V (θ, ·) L q (R) h k L 2q (R) h l L 2q (R) .

Therefore, Theorem 2.3 applies (with p = 2) and we almost obtain the conclusions of Theorem 7.1. Indeed, comparing with Theorem 2.3, we have to prove: (i) (ii) (iii) (iv)

the symplectic coordinate transformation is quadratic (and thus it is defined on the whole phase space) and have the specific form (7.4); the new normal form still has the same frequencies vector ω; the new Hamiltonian reduces to the new normal form, i.e. R = 0; the symplectic coordinate transformation , which is defined by Theorem 7.1 on each P 2 , extends to P 0 = Tn × Rn × 22 × 22 .


Actually, at the principle Q is homogeneous of degree 2 in Z and independent of y and the same is true for F, the solution of the first homological equation, = ε Q. {F, N } + N does not contain linear terms in y and thus ω remains As a first consequence, N unchanged by the first iterative step (cf. (2.11)). Now going to Lemma 3.5 we notice that following notations (3.34), b0 = b1 = a = 0. Therefore θ remains unchanged (θ˙ = 0) and the equation for Z reads Z˙ = J A(θ )Z which leads to Z (τ ) = eτ J A(θ) Z (0), (see (1) (1) (3.40)). Thus Z (1) = L ω (θ )Z (0), where L ω (θ ) = e J A(θ) is invertible from P 2 onto itself. In the same way, y˙ (τ ) = − 21 ∇θ A(θ )Z (τ ) · Z (τ ) (see (3.46)) which leads to y(1) = y(0) + 21 Z (0) · Mω (θ )Z (0) for some linear operator Mω (θ ). Finally the new error term 1 (cf. (2.12)) Q + = 0 {Q(t), F} ◦ X tF dt is still homogeneous of degree 2 in Z and independent of y. Thus properties (i), (ii) are satisfied after the first step and the new error term conserves the same form. Therefore we can iterate the process and the limiting transformation = 1 ◦ 2 ◦ · · · also satisfies (i) and (ii). Furthermore the transformed Hamiltonian as well as the original one is linear in y and quadratic in Z and thus (iii) holds true. It remains to check (iv). This follows from the fact that is a linear symplectomorphism and thus, as remarked in [12, Prop. 1.3’], extends by duality on 2p × 2p for all p ∈ [−2, 2] and in particular for p = 0. Proof of Corollary 1.3. The point is that, when V is smooth with bounded derivatives, the perturbation Q satisfies Assumption 3 for all p ≥ 0. That is X Q maps smoothly P p into itself. Therefore Theorem 2.3 applies for all p ≥ 2 and by (2.6), the canonical transformation is close to the identity in the P p -norm. 
Since in the new variables, $(y', \theta', z', \bar z') = \Phi^{-1}(y, \theta, z, \bar z)$, the modulus of $z'_j$ is invariant, we deduce that there exists a constant $C$ such that
\[ (1 - C\varepsilon)\|z(0)\|_p \le \|z(t)\|_p \le (1 + C\varepsilon)\|z(0)\|_p, \]
which in turn implies
\[ (1 - \varepsilon C)\|u_0\|_{\mathcal H^p} \le \|u(t)\|_{\mathcal H^p} \le (1 + \varepsilon C)\|u_0\|_{\mathcal H^p}, \quad \forall\, t \in \mathbb R. \]

7.2. An explicit example. Consider the linear equation
\[ i\partial_t u = -\partial_x^2 u + x^2 u + V(t\omega) u, \tag{7.7} \]
where $V : \mathbb T^n \to \mathbb R$ is real analytic and independent of $x \in \mathbb R$. Up to a translation of the spectrum, we can assume that $V(0) = 0$. Notice that this case is not in the scope of Theorem 7.1, since $V$ does not satisfy (1.8). We suppose moreover that $\int_{\mathbb T^n} V = 0$ and that $\omega \in [0, 2\pi)^n$ is Diophantine (see (7.6)). Define $v(t, x) = e^{-i\varepsilon \int_0^t V(\omega s)\,ds}\, u(t, x)$. The function $u$ satisfies (7.7) iff $v$ satisfies $i\partial_t v = -\partial_x^2 v + x^2 v$. This latter equation is explicitly solvable using the Hermite basis, and the solution of (7.7) with initial condition $u_0(x) = \sum_{j=1}^{\infty} \alpha_j h_j(x)$ then reads
\[ u(t, x) = e^{i\varepsilon \int_0^t V(\omega s)\,ds} \sum_{j=1}^{\infty} \alpha_j h_j(x) e^{i(2j-1)t}. \]

KAM for the Quantum Harmonic Oscillator


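Write $V(\theta) = \sum_{k \in \mathbb Z^n,\, k \ne 0} a_k e^{ik\cdot\theta}$. Then, as $\omega$ is Diophantine, we can compute
\[ \int_0^t V(\omega s)\,ds = -i \sum_{k \in \mathbb Z^n,\, k \ne 0} \frac{a_k}{k\cdot\omega}\,\big(e^{ik\cdot\omega t} - 1\big), \]
and $W$ defined by
\[ W(\theta) = \exp\Big( \varepsilon \sum_{k \in \mathbb Z^n,\, k \ne 0} \frac{a_k}{k\cdot\omega}\,\big(e^{ik\cdot\theta} - 1\big) \Big) \]
is a periodic and analytic function in $\theta$. Finally,
\[ u(t, x) = \sum_{j=1}^{\infty} \alpha_j W(\omega t) h_j(x) e^{i(2j-1)t} \]
is an almost periodic function in time (as an infinite sum of quasi-periodic functions).

We can explicitly compute the transformation $\Phi$ in (7.4). Here the Hamiltonian reads $H = N + Q$ with $Q = V(\theta) \sum_{k \ge 1} |z_k|^2$. Set $(y', \theta', z', \bar z') = \Phi^{-1}(y, \theta, z, \bar z)$, where
\[ \begin{cases} z'_j = W(\theta)\, z_j, \quad \bar z'_j = \overline{W(\theta)}\, \bar z_j, & j \ge 1, \\ \theta'_j = \theta_j, \quad y'_j = y_j - \varepsilon \displaystyle\sum_{k \in \mathbb Z^n,\, k \ne 0} \frac{k_j a_k}{k\cdot\omega}\, e^{ik\cdot\theta} \sum_{l \ge 1} |z_l|^2, & 1 \le j \le n. \end{cases} \]
Then a straightforward computation gives
\[ H \circ \Phi(y', \theta', z', \bar z') = \sum_{j=1}^{n} \omega_j y'_j + \sum_{j \ge 1} (2j - 1) z'_j \bar z'_j. \]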
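The closed form for $\int_0^t V(\omega s)\,ds$ and the unimodularity of $W$ can be checked numerically. The following is a minimal sketch, not part of the original argument: the frequency vector `OMEGA` and the Fourier coefficients `COEFFS` are hypothetical sample data chosen so that $V$ is real with zero mean, and the quadrature is a plain midpoint rule.

```python
import cmath
import math

# Hypothetical sample data (not from the paper): n = 2 frequencies and a real,
# zero-mean trigonometric polynomial V(theta) = sum_k a_k e^{i k.theta}.
OMEGA = (1.0, math.sqrt(2.0))
COEFFS = {  # a_{-k} = conj(a_k) makes V real-valued
    (1, 0): 0.5 + 0.2j, (-1, 0): 0.5 - 0.2j,
    (1, -1): 0.3j, (-1, 1): -0.3j,
}

def dot(k, v):
    return sum(ki * vi for ki, vi in zip(k, v))

def V(theta):
    return sum(a * cmath.exp(1j * dot(k, theta)) for k, a in COEFFS.items()).real

def integral_closed_form(t):
    """-i * sum_k a_k/(k.omega) * (e^{i k.omega t} - 1)."""
    total = sum(a / dot(k, OMEGA) * (cmath.exp(1j * dot(k, OMEGA) * t) - 1)
                for k, a in COEFFS.items())
    return -1j * total

def integral_numeric(t, steps=20000):
    """Composite midpoint rule for int_0^t V(omega s) ds."""
    h = t / steps
    return h * sum(V(tuple(w * (i + 0.5) * h for w in OMEGA)) for i in range(steps))

def W(theta, eps):
    """exp(eps * sum_k a_k/(k.omega) * (e^{i k.theta} - 1))."""
    expo = sum(a / dot(k, OMEGA) * (cmath.exp(1j * dot(k, theta)) - 1)
               for k, a in COEFFS.items())
    return cmath.exp(eps * expo)
```

For real $V$ the closed form is real, so $W(\omega t) = e^{i\varepsilon \int_0^t V(\omega s)\,ds}$ has modulus one, which is what makes the moduli $|z_j|$ invariant in this example.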

Therefore in this case j (ω) = 2 j − 1. Finally we study the spectrum of the Floquet operator associated to Eq. (7.7). Observe that W (ωt)h j (x)ei(2 j−1)t solves (7.7) if and only if any 2 j − 1 + k · ω (with j ≥ 1 and k ∈ Zn ) is an eigenvalue of (1.10) (with eigenfunction W (θ )h j (x)eiθ·k ). This shows that the Floquet spectrum is pure point, since linear combinations of W (θ )h j (x)eiθ·k are dense in L 2 (R) ⊗ L 2 (Tn ). Acknowledgements. The first author thanks Hakan Eliasson and Serguei Kuksin for helpful suggestions at the principle of this work. Both authors thank Didier Robert for many clarifications in spectral theory.

A. Appendix

We show here how we can construct periodic solutions to the equation
\[ \begin{cases} i\partial_t u + \partial_x^2 u - x^2 u = |u|^{p-1} u, \quad p \ge 1, \quad (t, x) \in \mathbb R \times \mathbb R, \\ u(0, x) = f(x), \end{cases} \tag{A.1} \]
thanks to variational methods. This is classical, see e.g. [20] and [2] for more details. See also [3]. Recall that for $s \ge 0$ we have defined the Sobolev space $\mathcal H^s(\mathbb R) = D(T^{s/2})$, where $T = -\partial_x^2 + x^2$ is the harmonic oscillator. We also define $\mathcal H^\infty(\mathbb R) = \cap_{s>0} \mathcal H^s(\mathbb R)$. We then have the following result.

Proposition A.1. Let $\mu > 0$. Then there exists an $L^2(\mathbb R)$-orthogonal family $(\varphi^j)_{j \ge 1} \in \mathcal H^\infty(\mathbb R)$ with $\|\varphi^j\|_{L^2(\mathbb R)} = \mu$ and a sequence of positive numbers $(\lambda_j)_{j \ge 1}$ so that for all $j \ge 1$, $u(t, x) = e^{-i\lambda_j t} \varphi^j(x)$ is a solution of (A.1).

Proof. We look for a solution of (A.1) of the form $u(t, x) = e^{-i\lambda t} \varphi(x)$; hence $\varphi$ has to satisfy
\[ \big(-\partial_x^2 + x^2\big) \varphi = \lambda \varphi - |\varphi|^{p-1} \varphi. \tag{A.2} \]


B. Grébert, L. Thomann

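Let $\mu > 0$, denote by $E_\mu$ the set
\[ E_\mu = \big\{ \varphi \in \mathcal H^1(\mathbb R) \ \text{s.t.}\ \|\varphi\|_{L^2(\mathbb R)} = \mu \big\}, \]
and define the functional
\[ J(\varphi) = \int \Big( \frac12 \big( (\partial_x \varphi)^2 + x^2 \varphi^2 \big) + \frac{1}{p+1} |\varphi|^{p+1} \Big)\, dx, \]
which is $C^1$ on $E_\mu$. Then the problem $\min_{\varphi \in E_\mu} J(\varphi)$ admits a solution $\varphi^1$, and $\varphi^1$ solves (A.2) for some $\lambda = \lambda_1 > 0$. Indeed, by Rellich’s theorem (see e.g. [19, p. 247]), for all $C > 0$, the set
\[ \Big\{ \varphi \in \mathcal H^1(\mathbb R) \ \text{s.t.}\ \|\varphi\|_{L^2(\mathbb R)} = \mu, \ \int \Big( \frac12 \big( (\partial_x \varphi)^2 + x^2 \varphi^2 \big) + \frac{1}{p+1} |\varphi|^{p+1} \Big) \le C \Big\} \]
is compact in $L^2(\mathbb R)$ (observe that we have used the Sobolev embedding $\mathcal H^1 \subset L^{p+1}$, which holds for any $p \ge 1$). Then, if $\varphi_n$ is a minimising sequence of $J$, up to a subsequence, we can assume that $\varphi_n \to \varphi^1 \in E_\mu$ in $L^2(\mathbb R)$. Finally, the lower semicontinuity of $J$ ensures that $\varphi^1$ is a minimum of $J$ in $E_\mu$, and the claim follows. Moreover, $\lambda_1$ is given by
\[ \lambda_1 = \frac{1}{\mu} \int \Big( (\partial_x \varphi^1)^2 + x^2 (\varphi^1)^2 + |\varphi^1|^{p+1} \Big). \]

Now we define the set $E_\mu^1 = E_\mu \cap \{ \langle \varphi, \varphi^1 \rangle_{L^2(\mathbb R)} = 0 \}$. Similarly, we may construct $\varphi^2 \in E_\mu^1$ so that $J(\varphi^2) = \min_{\varphi \in E_\mu^1} J(\varphi)$. The orthogonality condition implies in particular that $\varphi^2 \ne \varphi^1$. Let $k \ge 1$, and assume that we have constructed $(\varphi^j)_{1 \le j \le k}$ so that $\langle \varphi^i, \varphi^j \rangle_{L^2} = \mu^2 \delta_{ij}$ for all $1 \le i, j \le k$. Define the set
\[ E_\mu^k = E_\mu \cap \big\{ \langle \varphi, \varphi^j \rangle_{L^2} = 0, \ 1 \le j \le k \big\}. \]
By Rellich’s theorem, the set
\[ \Big\{ \varphi \in \mathcal H^1(\mathbb R) \ \text{s.t.}\ \|\varphi\|_{L^2(\mathbb R)} = \mu, \ \int \Big( \frac12 \big( (\partial_x \varphi)^2 + x^2 \varphi^2 \big) + \frac{1}{p+1} |\varphi|^{p+1} \Big) \le C, \ \langle \varphi, \varphi^j \rangle_{L^2} = 0, \ 1 \le j \le k \Big\} \]
is compact in $L^2(\mathbb R)$ and we can construct $\varphi^{k+1} \in E_\mu^k$ so that $J(\varphi^{k+1}) = \min_{\varphi \in E_\mu^k} J(\varphi)$. Then $\varphi^{k+1}$ is a nontrivial solution of (A.2) with
\[ \lambda_{k+1} = \frac{1}{\mu} \int \Big( (\partial_x \varphi^{k+1})^2 + x^2 (\varphi^{k+1})^2 + |\varphi^{k+1}|^{p+1} \Big). \]

The regularity $\varphi^j \in \mathcal H^\infty$ is a direct consequence of the ellipticity of the operator $-\partial_x^2 + x^2$.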


Remark A.2. Of course, the proof can be generalised to a larger class of nonlinearities in (A.1). In particular, we can deal with the nonlinearity $-\varepsilon |u|^{p-1} u$ with $\varepsilon > 0$ provided that $p < 5$ and that $\varepsilon \mu^{\frac{p+3}{2}} > 0$ is small enough. Indeed in that case, thanks to the Gagliardo-Nirenberg inequality we have
\[ \frac{\varepsilon}{p+1} \int |\varphi|^{p+1} \le C \varepsilon \mu^{\frac{p+3}{2}} \Big( \int (\partial_x \varphi)^2 + x^2 \varphi^2 \Big)^{(p-1)/4}, \]
and the nonlinear part of the energy can be controlled by the linear part, which enables us to perform the same arguments as previously.

References

1. Bambusi, D., Graffi, S.: Time quasi-periodic unbounded perturbations of Schrödinger operators and KAM method. Commun. Math. Phys. 219(2), 465–480 (2001)
2. Berestycki, H., Lions, P.-L.: Nonlinear scalar field equations. Arch. Rat. Mech. Anal. 82(4), 313–345 and 347–375 (1983)
3. Carles, R.: Rotating points for the conformal NLS scattering operator. Dyn. Part. Diff. Eq. 6(1), 35–51 (2009)
4. Delort, J.-M.: Growth of Sobolev norms for solutions of time dependent Schrödinger operators with harmonic oscillator potential. Preprint, available at http://hal.archivesouverts.fr/docs/00/46/75/72/PDF/artide.pdf, 2010
5. Eliasson, L.H.: Almost reducibility of linear quasi-periodic systems. In: Smooth ergodic theory and its applications (Seattle, WA, 1999), Proc. Sympos. Pure Math., 69, Providence, RI: Amer. Math. Soc., 2001, pp. 679–705
6. Eliasson, L.H., Kuksin, S.B.: KAM for the nonlinear Schrödinger equation. Ann. of Math. (2) 172(1), 371–435 (2010)
7. Eliasson, L.H., Kuksin, S.B.: On reducibility of Schrödinger equations with quasiperiodic in time potentials. Commun. Math. Phys. 286(1), 125–135 (2009)
8. Enss, V., Veselic, K.: Bound states and propagating states for time-dependent hamiltonians. Ann. IHP 39(2), 159–191 (1983)
9. Grébert, B., Imekraz, R., Paturel, É.: Normal forms for semilinear quantum harmonic oscillators. Commun. Math. Phys. 291, 763–798 (2009)
10. Kuksin, S.B.: Hamiltonian perturbations of infinite-dimensional linear systems with an imaginary spectrum. Funct. Anal. Appl. 21, 192–205 (1987)
11. Kuksin, S.B.: Nearly integrable infinite-dimensional Hamiltonian systems. Lecture Notes in Mathematics, 1556. Berlin: Springer-Verlag, 1993
12. Kuksin, S.B.: Analysis of Hamiltonian PDEs. Oxford Lecture Series in Mathematics and its Applications, 19. Oxford: Oxford University Press, 2000
13. Kuksin, S.B., Pöschel, J.: Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Ann. of Math. 143, 149–179 (1996)
14. Liu, J., Yuan, X.: Spectrum for quantum Duffing oscillator and small-divisor equation with large-variable coefficient. Comm. Pure Appl. Math. 63(9), 1145–1172 (2010)
15. Pitaevski, L.P., Stringari, S.: Bose-Einstein Condensation. Oxford: Oxford University Press, 2003
16. Pöschel, J.: A KAM-theorem for some nonlinear partial differential equations. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 23(1), 119–148 (1996)
17. Pöschel, J.: On elliptic lower-dimensional tori in Hamiltonian systems. Math. Z. 202(4), 559–608 (1989)
18. Pöschel, J.: Quasi-periodic solutions for a nonlinear wave equation. Comment. Math. Helv. 71, 269–296 (1996)
19. Reed, M., Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. New York: Academic Press, 1978
20. Struwe, M.: Variational methods. Applications to nonlinear partial differential equations and Hamiltonian systems. Fourth edition. Berlin: Springer-Verlag, 2008
21. Yajima, K., Kitada, H.: Bound states and scattering states for time periodic Hamiltonians. Ann. Inst. H. Poincaré Sect. A (N.S.) 39(2), 145–157 (1983)
22. Yajima, K., Zhang, G.: Smoothing property for Schrödinger equations with potential superquadratic at infinity. Commun. Math. Phys. 221(3), 573–590 (2001)
23. Wang, W.M.: Pure point spectrum of the Floquet Hamiltonian for the quantum harmonic oscillator under time quasi-periodic perturbations. Commun. Math. Phys. 277, 459–496 (2008)

Communicated by G. Gallavotti

Commun. Math. Phys. 307, 429–462 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1330-x

Communications in

Mathematical Physics

Wall Crossing as Seen by Matrix Models

Hirosi Ooguri1,2,3, Piotr Sułkowski1, Masahito Yamazaki2

1 California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]
2 Institute for the Physics and Mathematics of the Universe, University of Tokyo, Kashiwa 277-852, Japan
3 Max-Planck-Institut für Gravitationsphysik, Potsdam 14476, Germany

Received: 4 June 2010 / Accepted: 31 March 2011 Published online: 4 September 2011 – © The Author(s) 2011. This article is published with open access at Springerlink.com

Abstract: The number of BPS bound states of D-branes on a Calabi-Yau manifold depends on two sets of data, the BPS charges and the stability conditions. For D0 and D2-branes bound to a single D6-brane wrapping a Calabi-Yau 3-fold $X$, both are naturally related to the Kähler moduli space $\mathcal M(X)$. We construct unitary one-matrix models which count such BPS states for a class of toric Calabi-Yau manifolds at infinite ’t Hooft coupling. The matrix model for the BPS counting on $X$ turns out to give the topological string partition function for another Calabi-Yau manifold $Y$, whose Kähler moduli space $\mathcal M(Y)$ contains two copies of $\mathcal M(X)$, one related to the BPS charges and another to the stability conditions. The two sets of data are unified in $\mathcal M(Y)$. The matrix models have a number of other interesting features. They compute spectral curves and mirror maps relevant to the remodeling conjecture. For finite ’t Hooft coupling they give rise to yet more general geometry $\widetilde Y$ containing $Y$.

1. Introduction

The topological string theory has deep connections to a variety of BPS counting problems in string theory [1,2]. In this paper, we focus on the generalized Donaldson-Thomas (DT) invariants, namely the numbers of D0 and D2 bound states on a single D6 brane wrapping a Calabi-Yau 3-fold $X$. The DT invariants are background dependent. As we vary the Kähler moduli of $X$ and cross a wall of marginal stability, the numbers can jump. To count BPS bound states, we have to specify the stability condition, i.e. the chamber in the moduli space where we perform the counting. Thus, the DT invariant depends on two sets of data, the BPS charges and the stability conditions. In particular, the commutative DT invariants are defined in the chamber corresponding to the infinity in the Kähler moduli space, while the non-commutative DT invariants are defined in the chamber containing the origin.

On leave from University of Amsterdam and Sołtan Institute for Nuclear Studies, Poland.


It is convenient to introduce the generating function $Z_{\rm BPS}$ of the DT invariants $\Omega_{\alpha,\beta}(n)$,
\[ Z_{\rm BPS}(q, Q; n) = \sum_{\alpha,\beta} \Omega_{\alpha,\beta}(n)\, q^{\alpha} Q^{\beta}, \tag{1.1} \]
where $\alpha \in \mathbb Z$ is the D0 brane charge, $\beta \in H_2(X, \mathbb Z)$ are the D2 brane charges, and $n$ is a set of parameters which specify the chamber in the Kähler moduli space. In this paper we consider toric Calabi-Yau manifolds without compact 4-cycles, see Fig. 6. For a manifold $X$ in this class, it was shown in [3] that $Z_{\rm BPS}$ is given by a certain reduction of the square of the topological string partition function $Z_{\rm top}(q, Q)$,
\[ Z_{\rm BPS}(q, Q; n) = Z_{\rm top}(q, Q) \cdot Z_{\rm top}(q, Q^{-1}) \Big|_{\text{reduction at } n}. \tag{1.2} \]

In this case, Z top (q, Q) is expressed as a product in the harmonic oscillator form. The reduction means dropping an appropriate set of harmonic oscillator factors from |Z top |2 corresponding to D0/D2 states that do not bind with the single D6 brane in the chamber n. Both Q and n are related to the Kähler moduli space M(X ) of X . The relation of n to the moduli space is clear since it specifies a chamber in M(X ). It is also natural to identify Q = e−t in (1.1) with t being flat coordinates of M(X ) since the BPS charges couple to the areas of the corresponding homology cycles, i.e. the Kähler moduli. However, these two data appear asymmetrically in (1.2). In this paper, we will present another connection of Z BPS to the topological string theory, in which they are treated more symmetrically. We will show that there is another Calabi-Yau manifold Y , whose Kähler moduli space M(Y ) contains two copies of M(X ), and the topological string partition function for Y is related to Z BPS for X . For example, when X is the resolved conifold with dimC M(X ) = 1, the corresponding Y is the suspended pinch point (SPP) geometry with dimC M(Y ) = 2. Similarly, when X is C3 /Z2 , the corresponding Y is C3 /Z3 . We will find this relation by constructing the unitary one-matrix model whose partition function Z matrix (q, Q; n) is related to Z BPS (q, Q; n). In particular, Z matrix is equal to Z BPS in the non-commutative chamber (n = 0) and is equal to Z top (X ) in the commutative chamber (n = ∞). To derive the matrix model, we start with the crystal melting model [4,5] to count the generalized DT invariants, and use the vertex operator formalism [6,7], in which the partition function is expressed as correlators of exponentials of fermion bilinears. The correlators are defined for all chambers in the Kähler moduli space, and we can transform the computation into unitary matrix integrals. 
This construction is closely connected to the free fermion picture for the topological string and Seiberg-Witten theory developed in [8–10]. Equivalently, we can also express the partition function as a sum over non-intersecting paths following and generalizing [11], which gives yet another derivation of such matrix models. One interesting feature of our matrix model for the conifold, in the commutative chamber, is its close relation to the so-called Chern-Simons matrix model of [12,13]. In the commutative chamber in our model, Q is the only parameter, and it appears only in the potential. The Chern-Simons matrix model also depends on a single parameter, which is the ’t Hooft coupling. It turns out that these two parameters play the same role in the partition function in both models. Moreover one can consider a model with non-zero values of both these parameters. From this viewpoint, departure from the commutative to arbitrary chamber can be interpreted as turning on yet another parameter. In general one can consider simultaneously non-zero values of all three parameters: Q, chamber dependence, and ’t Hooft coupling. This gives rise to the spectral curve encoding yet

more general Calabi-Yau manifold $\widetilde Y$, which contains the manifold $Y$ described above. When $X$ is the conifold, $\widetilde Y$ is a symmetric resolution of $\mathbb C^3/\mathbb Z_2 \times \mathbb Z_2$, while $Y$ is the SPP geometry, as mentioned above. Such $\widetilde Y$ can in principle be constructed for any initial toric manifold $X$.

As a bonus of our matrix model construction, it sheds new light on the remodeling conjecture. It has been conjectured in [14] that the topological string partition function for this class of Calabi-Yau manifolds is completely characterized by the recursion relations of [15], applied to the curve which should be identified with the mirror curve of a given manifold. Such recursion relations would arise if we had a matrix model formulation of the topological strings. In this paper we provide a construction of such matrix models in several instructive cases, and verify that to the leading order their spectral curves agree with relevant mirror curves, which is an important step towards a proof of the remodeling conjecture. We expect that application of our methods should lead to analogous results in the general case of toric manifold without compact 4-cycles. We also note that, for the case of $\mathbb C^3$, a similar approach was presented in [11]. For an earlier related work, see [16]. Matrix models for other Calabi-Yau manifolds in the commutative chamber were derived from the topological vertex formalism or Nekrasov partition functions in [17–20]. In the course of this work we received the paper [21], in which matrix models are derived in the commutative chamber also from the topological vertex perspective. Related ideas have been considered in [22,23].

This paper is organized as follows. In Sect. 2 we introduce matrix models for BPS counting and explain how they are related to the DT invariants. In Sect. 3 we examine spectral curves of the matrix models and identify the corresponding Calabi-Yau geometries.
In particular, when $X$ is the resolved conifold, we also identify the total geometry $\widetilde Y$ for finite ’t Hooft coupling, and discuss its relations to the Chern-Simons matrix model. The derivation of the matrix model is given in Sect. 4. We end with summary and discussion on future research directions in Sect. 5.

2. Matrix Models

In this section, we will present matrix models which count the DT invariants, namely the number of BPS states of D0 and D2-branes bound to a single D6 wrapping a Calabi-Yau manifold $X$. In general these are matrix models for unitary matrices of infinite size, and arise from crystal melting interpretation of BPS generating functions. The derivation of these matrix models will be given in Sect. 4.

2.1. $\mathbb C^3$. When $X = \mathbb C^3$, the generating function of BPS invariants is given by the MacMahon function which counts plane partitions. We find that this BPS generating function is equal to the partition function of the matrix model given by
\[ Z_{\rm matrix}(q) = \int dU \, \det \Theta(U|q), \tag{2.1} \]
where the integral is over the unitary group $U(N)$, and we are interested in the limit of $N \to \infty$. The integrand is given by the theta-product,
\[ \Theta(u|q) = \prod_{k=0}^{\infty} (1 + u q^k)(1 + u^{-1} q^{k+1}). \tag{2.2} \]
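The low-order structure of the theta-product (2.2) can be inspected by multiplying out the factors and truncating in $q$. The following is a minimal sketch, not taken from the paper: the dictionary representation `{(u_power, q_power): coeff}` and the helper names are illustrative choices.

```python
from collections import defaultdict

def multiply(a, b, max_q):
    """Multiply two truncated series stored as {(u_power, q_power): coeff}."""
    out = defaultdict(int)
    for (ua, qa), ca in a.items():
        for (ub, qb), cb in b.items():
            if qa + qb <= max_q:
                out[(ua + ub, qa + qb)] += ca * cb
    return {key: c for key, c in out.items() if c != 0}

def theta_series(max_q):
    """Expand prod_{k>=0} (1 + u q^k)(1 + u^-1 q^(k+1)) up to order q^max_q."""
    series = {(0, 0): 1}
    for k in range(max_q + 1):
        series = multiply(series, {(0, 0): 1, (1, k): 1}, max_q)       # 1 + u q^k
        series = multiply(series, {(0, 0): 1, (-1, k + 1): 1}, max_q)  # 1 + u^-1 q^(k+1)
    return series

def q_coefficient(series, order):
    """Coefficient of q^order as a Laurent polynomial {u_power: coeff}."""
    return {u: c for (u, q), c in series.items() if q == order}
```

With `series = theta_series(2)`, the call `q_coefficient(series, 1)` returns the Laurent polynomial $1 + u^{-1} + u + u^2$, stored as a dictionary from powers of $u$ to integer coefficients.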

To perform the integral (2.1), it is convenient to diagonalize $U = \mathrm{diag}(u_1, \ldots, u_n)$ and to consider the integral over eigenvalues $u_i = e^{i\phi_i}$ as
\[ Z_{\rm matrix}(q) = \int \prod_i d\phi_i \, \Theta(e^{i\phi_i}|q) \prod_{i<j} (e^{i\phi_i} - e^{i\phi_j})(e^{-i\phi_i} - e^{-i\phi_j}). \tag{2.3} \]
As usual, the two factors of the Vandermonde determinant come from the integral over off-diagonal elements of $U$. To perform the integral (2.3) over eigenvalues, we expand the integrand in powers of $q$,
\[ \Theta(e^{i\phi}|q) = 1 + e^{i\phi} + (1 + e^{-i\phi} + e^{i\phi} + e^{2i\phi})\, q + (2 + e^{-i\phi} + 2e^{i\phi} + e^{2i\phi})\, q^2 + \cdots, \]
and pick up appropriate combinations of $e^{\pm i\phi_i}$’s from the measure factor in (2.3) to cancel the $\phi$-dependence in $\Theta(e^{i\phi}|q)$. In this way, we can directly verify that the integral gives the MacMahon function,
\[ Z_{\rm matrix}(q) = 1 + q + 3q^2 + 6q^3 + 13q^4 + \cdots = \prod_{k=1}^{\infty} \frac{1}{(1 - q^k)^k}. \tag{2.4} \]
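The series coefficients quoted in (2.4) can be reproduced directly from the product formula. The following is a minimal sketch (the function name is an illustrative choice, not from the paper) that multiplies the truncated series by $1/(1-q^k)$ a total of $k$ times for each $k$.

```python
def macmahon_coefficients(max_order):
    """Coefficients of prod_{k>=1} (1 - q^k)^(-k) up to q^max_order."""
    coeffs = [1] + [0] * max_order
    for k in range(1, max_order + 1):
        for _ in range(k):  # multiply k times by 1/(1 - q^k)
            # In-place forward pass: new[n] = old[n] + new[n - k].
            for n in range(k, max_order + 1):
                coeffs[n] += coeffs[n - k]
    return coeffs
```

The entry at index $m$ is the number of plane partitions of $m$, so `macmahon_coefficients(4)` gives `[1, 1, 3, 6, 13]`, matching the expansion above.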

This is indeed the generating function of plane partitions and reproduces the counting of the DT invariants on $\mathbb C^3$ if we identify the power of $q$ as the D0 brane charge. In this case, there is no distinction between commutative and non-commutative chambers.

To relate this to the Chern-Simons matrix model, we make the identification of $q = e^{-g_s}$, where $g_s$ is the string coupling constant. For small $g_s$, the modular transformation of $\Theta$ with respect to $g_s$ gives
\[ \Theta(e^{i\phi}|e^{-g_s}) = e^{-\frac{\phi^2}{2g_s}} \cdot \big( 1 + O(e^{-1/g_s}) \big). \tag{2.5} \]
If we ignore non-perturbative terms in $g_s$, this is equal to the integrand for the unitary Gaussian matrix model derived from the Chern-Simons theory on the conifold [13]. In fact, (2.1) itself has also been proposed for the topological string theory on the conifold in [16], whose approach is a special case of our fermionic derivation applied to $\mathbb C^3$ as we will see below. The Kähler moduli $T$ of the resolved conifold is given by the ’t Hooft coupling,
\[ T = g_s N. \tag{2.6} \]

We are interested in the N → ∞ limit for fixed gs , namely T → ∞. It is shown in [16] that the model (2.3) with finite N has an interpretation of counting plane partitions in a container with a wall at position N . As we will discuss in the next section, a finite ’t Hooft parameter has similar wall interpretation in our more general models. From this perspective, N → ∞ limit in the C3 model corresponds to computing all plane partitions. This limit suppresses instanton corrections on the conifold, leaving only contributions from constant maps. For a general Calabi-Yau manifold, the sum over constant maps gives the MacMahon function to the power of χ /2, where χ is the Euler characteristics of the Calabi-Yau manifold. Since χ = 2 for the resolved conifold, we find that the N → ∞ limit gives one power of the MacMahon function, reproducing (2.4).


2.2. Conifold. The Kähler moduli space of the resolved conifold is complex 1-dimensional, and it is divided into chambers parametrized by an integer $n$, which is the integer part of the B-field flux through the $\mathbb P^1$ [3]. The non-commutative chamber corresponds to $n = 0$ and the commutative chamber is at $n = \infty$. We find the following matrix model in the non-commutative chamber:
\[ Z_{\rm matrix}(q, Q; n = 0) = \int dU \, \det\left( \frac{\Theta(U|q)}{\Theta(QU|q)} \right), \tag{2.7} \]
where
\[ Q = e^{-t} \tag{2.8} \]
keeps track of the D2 brane charge. By expanding the integrand in powers of $q$ and by performing the integral over $U(N)$ in the $N \to \infty$ limit as in the previous example, we can verify that
\[ Z_{\rm matrix}(q, Q; n = 0) = 1 + (2 - Q^{-1} - Q)\, q + (8 - 4Q^{-1} - 4Q)\, q^2 + \cdots = \prod_{k=1}^{\infty} \frac{(1 - Q q^k)^k (1 - Q^{-1} q^k)^k}{(1 - q^k)^{2k}}. \tag{2.9} \]
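The first terms of the product in (2.9) can be checked by expanding it as a power series in $q$ with Laurent-polynomial coefficients in $Q$. The following is a minimal sketch, not from the paper: the representation `{Q_power: [coefficients in q]}` and the function names are illustrative choices, and only the product side of (2.9) is expanded, not the matrix integral.

```python
def _times_one_minus(series, sign, k, max_q):
    """Multiply by (1 - Q^sign q^k), truncating above q^max_q."""
    out = {a: list(c) for a, c in series.items()}
    for a, coeffs in series.items():
        target = out.setdefault(a + sign, [0] * (max_q + 1))
        for n in range(max_q + 1 - k):
            target[n + k] -= coeffs[n]
    return out

def conifold_bps_series(max_q):
    """Expand prod_k (1-Q q^k)^k (1-Q^-1 q^k)^k / (1-q^k)^(2k) up to q^max_q."""
    series = {0: [1] + [0] * max_q}
    for k in range(1, max_q + 1):
        for sign in (+1, -1):
            for _ in range(k):
                series = _times_one_minus(series, sign, k, max_q)
        for _ in range(2 * k):  # multiply 2k times by 1/(1 - q^k)
            for coeffs in series.values():
                for n in range(k, max_q + 1):
                    coeffs[n] += coeffs[n - k]
    return series

def q_coefficient(series, order):
    """Coefficient of q^order as a Laurent polynomial {Q_power: coeff}."""
    return {a: c[order] for a, c in series.items() if c[order] != 0}
```

For example, `q_coefficient(conifold_bps_series(2), 2)` gives the Laurent polynomial $8 - 4Q^{-1} - 4Q$, in agreement with the $q^2$ term written above.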

This reproduces $Z_{\rm BPS}(q, Q; n = 0)$ in the non-commutative chamber. For a general chamber, the BPS partition function is given by
\[ Z_{\rm BPS}(q, Q; n) = \prod_{k=1}^{\infty} \frac{(1 - Q q^k)^k (1 - Q^{-1} q^{n+k})^{n+k}}{(1 - q^k)^{2k}}. \tag{2.10} \]
The free fermion expression for $Z_{\rm BPS}(q, Q; n)$, discussed in Sect. 4, gives rise to the following matrix integral:
\[ Z_{\rm matrix}(q, Q; n) = \int dU \, \det\left( \frac{\Theta(U|q)}{\Theta(QU|q)} \prod_{k=1}^{n} (1 + Q^{-1} U^{-1} q^k) \right). \tag{2.11} \]
The BPS partition function and the matrix model partition function are related as
\[ Z_{\rm BPS}(q, Q; n) = C_n \cdot Z_{\rm matrix}(q, Q; n), \tag{2.12} \]
where the prefactor $C_n$ is given by
\[ C_n = \prod_{k=1}^{n} \frac{1}{(1 - q^k)^k} \prod_{k=n+1}^{\infty} \left( \frac{1 - Q^{-1} q^k}{1 - q^k} \right)^{n}. \tag{2.13} \]
We also verified (2.12) by expanding the matrix model integrand and integrating it term by term. The origin of the prefactor $C_n$ will be explained in Sect. 4. Note that this prefactor is trivial in the non-commutative chamber, $C_{n=0} = 1$.

It is known that the BPS partition function in the commutative chamber and the topological string partition function are identical, up to one power of the MacMahon function,
\[ Z_{\rm BPS}(q, Q; n = \infty) = Z_{\rm top}(q = e^{-g_s}, Q = e^{-t}) \cdot \prod_{k=1}^{\infty} \frac{1}{(1 - q^k)^k}. \tag{2.14} \]
Since the prefactor $C_n$ reduces to the MacMahon function in the commutative limit,
\[ C_{n=\infty} = \prod_{k=1}^{\infty} \frac{1}{(1 - q^k)^k}, \tag{2.15} \]
the matrix model partition function gives precisely the topological string partition function in the commutative chamber,
\[ Z_{\rm matrix}^{(n=\infty)} = \int dU \, \det\left( \prod_{k=0}^{\infty} \frac{(1 + U q^k)(1 + U^{-1} q^{k+1})}{(1 + Q U q^k)} \right) = Z_{\rm top}(q, Q). \tag{2.16} \]
In this way, the matrix model partition function $Z_{\rm matrix}(q, Q; n)$ interpolates between $Z_{\rm BPS}$ in the non-commutative chamber and $Z_{\rm top}$ in the commutative chamber.

2.3. $\mathbb C^3/\mathbb Z_2$. Another toric Calabi-Yau manifold with $\dim_{\mathbb C} \mathcal M(X) = 1$ is $\mathbb C^3/\mathbb Z_2$. The matrix model for the non-commutative chamber is given by
\[ Z_{\rm matrix}(q, Q; n = 0) = \int dU \, \det\big( \Theta(U|q)\, \Theta(QU|q) \big). \tag{2.17} \]
For a general chamber, we can write the explicit product form of the BPS generating function as a matrix integral
\[ Z_{\rm BPS}(q, Q; n) = \prod_{k=1}^{\infty} (1 - q^k)^{-2k} (1 - Q q^k)^{-k} (1 - Q^{-1} q^{n+k})^{-n-k} = C_n \cdot \int dU \, \det\left( \frac{\Theta(U|q)\, \Theta(QU|q)}{\prod_{k=1}^{n} (1 + Q^{-1} U^{-1} q^k)} \right), \tag{2.18} \]
with the prefactor
\[ C_n = \prod_{k=1}^{n} \frac{1}{(1 - q^k)^k} \prod_{k=n+1}^{\infty} \left( \frac{1}{(1 - q^k)(1 - Q^{-1} q^k)} \right)^{n}. \tag{2.19} \]
This can be verified explicitly by expanding both sides of (2.18) in powers of $q$. Again, in this case, we have $C_{n=0} = 1$ and $C_{n=\infty} = \prod_k (1 - q^k)^{-k}$. Thus, the matrix model partition function interpolates between $Z_{\rm BPS}$ in the non-commutative chamber and $Z_{\rm top}$ in the commutative chamber,
\[ \begin{aligned} Z_{\rm matrix}(q, Q; n = 0) &= Z_{\rm top}(q, Q) \cdot Z_{\rm top}(q, Q^{-1}) = Z_{\rm BPS}(q, Q; n = 0), \\ Z_{\rm matrix}(q, Q; n = \infty) &= Z_{\rm top}(q, Q). \end{aligned} \tag{2.20} \]


2.4. General toric Calabi-Yau manifold. A toric Calabi-Yau 3-fold $X$ without compact 4-cycle consists of a chain of $\mathbb P^1$’s, which is resolved either by $O(-1, -1)$ or $O(-2, 0)$. The topological string partition function for such a Calabi-Yau manifold is given by [24,25]
\[ Z_{\rm top}(q, Q) = \left( \prod_{k=1}^{\infty} \frac{1}{(1 - q^k)^k} \right)^{\chi/2} \prod_{1 \le i < j \le \chi - 1} \ \prod_{k=1}^{\infty} (1 - Q_i \cdots Q_j\, q^k)^{s_i \cdots s_j k}, \tag{2.21} \]
where $\chi$ is the Euler characteristic of $X$, the number of $\mathbb P^1$’s is $(\chi - 1)$, and $Q_1, \ldots, Q_{\chi-1}$ are the Kähler moduli that measure their sizes. Depending on whether the $i$th $\mathbb P^1$ is resolved by $O(-1, -1)$ or $O(-2, 0)$, we set $s_i = -1$ or $+1$. The BPS partition function in the non-commutative chamber is given by
\[ Z_{\rm BPS}(q, Q; n = 0) = Z_{\rm top}(q, Q) \cdot Z_{\rm top}(q, Q^{-1}). \tag{2.22} \]
This is reproduced by the matrix model partition function,
\[ Z_{\rm matrix}(q, Q; n = 0) = \int dU \, \det\left( \prod_{i=1}^{\chi - 1} \Theta(s_1 Q_1 \cdots s_i Q_i\, U|q)^{s_1 \cdots s_i} \right). \tag{2.23} \]

Following the procedure described in Sect. 4, it is possible to write down matrix models for other chambers. However, we have not attempted to derive a closed-form expression of the matrix model potential for a general chamber.

3. Spectral Curves and Geometric Unification

The eigenvalue distribution of the large $N$ matrix model is controlled by a spectral curve. In particular, the resolvent is a one-form on the curve and the large $N$ effective action is evaluated by its period integral on the curve. It has been argued from several viewpoints that spectral curves of matrix models arising from the topological string theory should be related to the geometry of the corresponding Calabi-Yau manifold [14,18,26]. In this section, we will identify the spectral curves and the corresponding Calabi-Yau geometries $Y$ for the matrix models defined in the previous section. These geometries arise in the limit of infinite ’t Hooft coupling. In a nontrivial case of $X \ne \mathbb C^3$, they contain two copies of the initial Calabi-Yau manifold $X$ for a generic chamber. For the conifold case, we will analyze in detail yet more general geometry $\widetilde Y$ which arises for finite ’t Hooft coupling, as well as reveal the close relation between the conifold matrix model in the commutative chamber and the so-called Chern-Simons matrix model [12,13].

3.1. $\mathbb C^3$. As a warm-up exercise, let us describe the unitary Gaussian model, discussed in [13,16,27],
\[ Z_{\rm matrix}(q) = \int \prod_i d\phi_i \, e^{-\frac{1}{2g_s} \phi_i^2} \prod_{i<j} (e^{i\phi_i} - e^{i\phi_j})(e^{-i\phi_i} - e^{-i\phi_j}). \tag{3.1} \]
Since $\phi_i$’s are periodic variables, it may appear unnatural to have the non-periodic potential, $e^{-\frac{1}{2g_s} \phi_i^2}$. In our construction, it is the $g_s \to 0$ limit of the periodic integrand given in (2.3). The integrand $\Theta(e^{i\phi}|q)$ has a series of zeros at $\phi = i k g_s$ with $k \in \mathbb Z$, which becomes a branch cut along the imaginary axis in the limit $g_s \to 0$. The spectral curve for the unitary Gaussian matrix model is given by the equation [13,27]
\[ e^x + e^y + e^{x - y - T} + 1 = 0, \tag{3.2} \]
where $T = N g_s$ is the ’t Hooft coupling. The corresponding Calabi-Yau manifold is the mirror of the resolved conifold. In the limit of $T \to \infty$, the curve reduces to
\[ e^x + e^y + 1 = 0, \tag{3.3} \]
which is the mirror of $\mathbb C^3$. This result also arises as a special case $Q, e^{-T}, \mu \to 0$ of a derivation of the conifold curve presented in the next section.

3.2. Conifold. In this section we analyze the conifold matrix model in all chambers. From the form of the spectral curve and for finite ’t Hooft coupling we identify the total manifold $\widetilde Y$ to be a resolution of the $\mathbb C^3/\mathbb Z_2 \times \mathbb Z_2$ orbifold. In the limit of the infinite ’t Hooft coupling, $\widetilde Y$ reduces to the suspended pinch point (SPP) geometry $Y$, which contains two copies of the initial conifold geometry.

To examine the $g_s \to 0$ limit of the matrix model for the conifold, let us look at the integrand of (2.11). We find it convenient to choose the freedom of renaming $U \to U^{-1}$ described in Sect. 4.1.2 and consider an equivalent integrand
\[ \frac{\Theta(U^{-1}|q)}{\Theta(QU^{-1}|q)} \prod_{k=1}^{n} (1 + Q^{-1} U q^k) = \prod_{k=0}^{\infty} \frac{(1 + U^{-1} q^k)(1 + U q^{k+1})}{(1 + Q U^{-1} q^k)(1 + Q^{-1} U e^{-\tau} q^{k+1})}. \tag{3.4} \]
Here and in what follows we set $\tau = n g_s$. In order to retain interesting dependence on the chamber parameter $n$, we should take the limit $g_s \to 0$ in such a way that $\tau$ is held finite. By using the identity
\[ \log \prod_{k=1}^{\infty} (1 + U q^k) \sim -\frac{1}{g_s} \mathrm{Li}_2(-U), \quad (|g_s| \ll 1), \tag{3.5} \]
where $\mathrm{Li}_2$ is the dilogarithm function, the integrand (3.4) can be approximated by $e^{-\frac{1}{g_s} V(U)}$ with
\[ V(U) = T \log U + \mathrm{Li}_2(-U) + \mathrm{Li}_2(-U^{-1}) - \mathrm{Li}_2(-Q U^{-1}) - \mathrm{Li}_2(-Q^{-1} e^{-\tau} U), \tag{3.6} \]
where we also took into account the shift (A.1) of the potential which arises from the transformation of the measure to the form which includes the Vandermonde determinant. Therefore
\[ \partial_U V = \frac{T - \log(U + Q) + \log\big( 1 + \frac{U}{Q e^{\tau}} \big)}{U}. \tag{3.7} \]
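The leading asymptotics (3.5) can be checked numerically for a sample value of $U$. The following is a minimal sketch, not from the paper: it assumes $|U| < 1$ so that the dilogarithm can be summed from its power series, and the value of $U$, the truncation orders and the function names are illustrative choices.

```python
import math

def dilog(x, terms=2000):
    """Li_2(x) from its power series sum_{m>=1} x^m / m^2, valid for |x| <= 1."""
    return sum(x ** m / m ** 2 for m in range(1, terms + 1))

def log_product(U, g_s, k_max=None):
    """log prod_{k>=1} (1 + U q^k) with q = e^{-g_s}.

    The infinite product is truncated at k_max, beyond which q^k < e^{-50}.
    """
    if k_max is None:
        k_max = int(50 / g_s)
    return sum(math.log1p(U * math.exp(-g_s * k)) for k in range(1, k_max + 1))

def rescaled_error(U, g_s):
    """| g_s * log prod(1 + U q^k) + Li_2(-U) |, which should vanish as g_s -> 0."""
    return abs(g_s * log_product(U, g_s) + dilog(-U))
```

For $U = 0.5$ the rescaled error shrinks roughly linearly with $g_s$, as expected from the subleading term $-\tfrac12 \log(1 + U)$ of the Euler-Maclaurin expansion.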

Let us define the resolvent $\omega(u)$ by
\[ \omega(u) = \frac{1}{N} \left\langle \mathrm{tr}\, \frac{1}{u - U} \right\rangle, \tag{3.8} \]

where the expectation value is taken over the large N eigenvalue distribution of U . As we expect to find a genus 0 curve, we postulate the existence of a one-cut solution. In this case, in the weakly coupled phase of a unitary matrix model, the resolvent can be computed using the standard Migdal integral 1 dv ∂v V (v) (u − a+ )(u − a− ) ω(u) = , (3.9) 2T 2πi u − v (v − a+ )(v − a− ) where the integration contour encircles counter-clockwise the endpoints of the cut a± . We perform this computation in Appendix A and find (a+ + Q)(a− − u)− (a− + Q)(a+ − u) u + Qeτ e T /2 1 log ω(u) = . uT (a+ + Qeτ )(a− − u)− (a− + Qeτ )(a+ − u) u + Q Q 1/2 eτ/2 (3.10) This form of the resolvent already takes into account the boundary condition ω(u → ∞) ∼

1 . u

This condition also gives rise to two equations on the location of a± , √ √ 1 a+ + Q − a− + Q = Q 2 e(τ +T )/2 , √ √ a+ + Qeτ − a− + Qeτ (a+ + Q)a− − (a− + Q)a+ 1 = Q 2 e−(τ +T )/2 . τ τ (a+ + Qe )a− − (a− + Qe )a+

(3.11)

(3.12) (3.13)

With some effort these equations can be solved in the exact form a± = −1 + 2 ±2i

(1 − μ)(1 − μ 2 ) + (1 − Q)(1 + μ 2 − 2μ) (1 − μ 2 )2 (1 − Q)(1 − 2 )(1 − μ)(1 − Qμ 2 ) , (1 − μ 2 )2

(3.14)

see Fig. 1. The parameters μ and are related to the chamber number τ and the ’t Hooft parameter T as μ = Q −1 e−τ , = e−T /2 .

(3.15)

From the form of (3.14) it is clear that in the saddle point approximation the cuts are deformed and do not lay on a unit circle, but (as often happens in similar situations) on arcs which are deformations thereof. One can also verify that ω(u) given by (3.10) is indeed a solution to the Riemann-Hilbert problem, ω+ (u) + ω− (u) =

1 ∂V , T ∂u

(3.16)

438

H. Ooguri, P. Sułkowski, M. Yamazaki

0.4

0.2

1.4

1.2

1.0

0.8

0.2

0.4

Fig. 1. Behavior of cut end-points a+ (solid line) and a− (dashed line) given in (3.14), for fixed , Q and varying μ. For μ < 1, end-points a± are complex conjugate to each other. For μ = 1 we have a+ = a− = 2 −1 − (1−Q) and the cut shrinks to zero size. For μ > 1 both a± are real and spread in opposite directions 2

1−

where ω± are the values of ω(u) right above and below the branch cut. From the resolvent one can also find the eigenvalue density (for μ < 1) ρ(u) = ω+ (u) − ω− (u) (1 + μ 2 )u + 1 + Q 2 − (1 − μ 2 ) (u − a+ )(u − a− ) 1 log = . uT (1 + μ 2 )u + 1 + Q 2 + (1 − μ 2 ) (u − a+ )(u − a− ) To identify the spectral curve we note first that the non-trivial part of the resolvent takes the form (for μ < 1) ω(u) ∼

1 + Q 2 1 − μ 2 1 log − u − + (u − a+ )(u − a− ) . 2 2 uT 1 + μ 1 + μ

(3.17)

After the identification $x = uT\omega(u)$, and setting $u = e^y$, we find that $e^x$ and $e^y$ satisfy a polynomial equation. Appropriate constant shifts of $x$ and $y$ transform this equation into the following form:

$$e^{x+y} + e^x + e^y + Q_1 e^{2x} + Q_2 e^{2y} + Q_3 = 0, \tag{3.18}$$

where

$$Q_1 = \epsilon^2\cdot\frac{1+\mu Q}{(1+\mu\epsilon^2)(1+Q\epsilon^2)}, \qquad Q_2 = \mu\cdot\frac{1+Q\epsilon^2}{(1+\mu Q)(1+\mu\epsilon^2)}, \qquad Q_3 = Q\cdot\frac{1+\mu\epsilon^2}{(1+\epsilon^2 Q)(1+\mu Q)}. \tag{3.19}$$

The above equation represents the spectral curve we have been after. It is interesting that the curve (3.18) is symmetric under exchanges of $Q$, $\mu = Q^{-1}q^n$ and $\epsilon^2 = e^{-T}$.

Wall Crossing as Seen by Matrix Models


Namely, the original Kähler modulus $Q$ of the resolved conifold, the chamber parameter $n$ and the ’t Hooft parameter $T$ appear symmetrically in the spectral curve. We also note that the above form of the curve, as well as the density and the resolvent given in (3.17), are valid for $|\mu| < 1$. For $|\mu| > 1$, an appropriate analytic continuation is required.

The corresponding Calabi-Yau manifold $\widetilde{Y}$ is a resolution of the orbifold $\mathbb{C}^3/\mathbb{Z}_2\times\mathbb{Z}_2$. There are two such resolutions, the symmetric one (also known as the closed topological vertex) and the asymmetric one. Both of these resolutions contain three $\mathbb{P}^1$'s and are related to each other by a flop of one of the $\mathbb{P}^1$'s, see Fig. 2. The appropriate geometry underlying our solution is the symmetric resolution. Indeed, when $|Q|, |\mu|, |\epsilon| < 1$, Eq. (3.18) describes the mirror of the symmetrically resolved orbifold, with $Q$, $\mu$, $\epsilon^2$ being exponentials of flat coordinates of the Kähler moduli space, as we discuss in more detail below.

We also note that on general grounds it is known that for Calabi-Yau manifolds of the form $uv + H(x,y) = 0$, with $H(x,y) = 0$ encoding a Riemann surface as in (3.18), the special geometry relations reduce to

$$T = \oint_a \lambda, \qquad \frac{\partial F_0^{\rm top}}{\partial T} = \oint_b \lambda,$$

where $\lambda$ is a reduction of the holomorphic three-form along the $u, v$ directions, $a$ and $b$ are dual one-cycles on the Riemann surface $H(x,y) = 0$, and $F_0^{\rm top}$ is the topological string free energy. The same relations hold for the free energy $F_0$ of matrix models, if $T$ is identified with the ’t Hooft coupling [27]. Therefore the fact that the spectral curve in the case we consider agrees with the mirror curve of $\widetilde{Y}$ ensures the agreement of derivatives of the matrix model and topological string free energies with respect to $T$, up to an integration constant which is a function of $Q$ and $\mu$ (which are just parameters of the matrix potential). As the exact topological string partition function is a symmetric function of $Q$, $\mu$ and $\epsilon^2$, this implies that this integration constant must restore this symmetry, and the resulting matrix model free energy

$$F_0 = \mathrm{Li}_3(Q) + \mathrm{Li}_3(\mu) + \mathrm{Li}_3(\epsilon^2) + \mathrm{Li}_3(Q\mu\epsilon^2) - \mathrm{Li}_3(Q\epsilon^2) - \mathrm{Li}_3(\mu\epsilon^2) - \mathrm{Li}_3(Q\mu)$$

has to agree with the topological string result $F_0^{\rm top}$.
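The symmetry of the curve (3.18) under exchanges of $Q$, $\mu$ and $\epsilon^2$ can be made concrete at the level of the coefficients (3.19). The sketch below is only an illustration under stated assumptions: the helper names `curve_coefficients` and `coefficients_from_triple` are hypothetical, and the sample values are arbitrary rationals. It checks, in exact arithmetic, that permuting the three parameters merely permutes $(Q_1, Q_2, Q_3)$ in the same way, so that the set of coefficients is unchanged.

```python
from fractions import Fraction
from itertools import permutations


def curve_coefficients(Q, mu, e2):
    """Coefficients (Q1, Q2, Q3) of Eq. (3.19); e2 stands for epsilon^2."""
    Q1 = e2 * (1 + mu * Q) / ((1 + mu * e2) * (1 + Q * e2))
    Q2 = mu * (1 + Q * e2) / ((1 + mu * Q) * (1 + mu * e2))
    Q3 = Q * (1 + mu * e2) / ((1 + e2 * Q) * (1 + mu * Q))
    return Q1, Q2, Q3


def coefficients_from_triple(triple):
    """Same coefficients, with the parameters ordered as (epsilon^2, mu, Q)."""
    e2, mu, Q = triple
    return curve_coefficients(Q, mu, e2)


if __name__ == "__main__":
    params = (Fraction(1, 5), Fraction(1, 3), Fraction(1, 2))  # (eps^2, mu, Q)
    base = coefficients_from_triple(params)
    for perm in permutations(range(3)):
        permuted_params = tuple(params[i] for i in perm)
        permuted_coeffs = tuple(base[i] for i in perm)
        assert coefficients_from_triple(permuted_params) == permuted_coeffs
    print("coefficients are permuted consistently:", base)
```

Using `Fraction` avoids any floating-point tolerance in the comparison.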

Fig. 2. Two resolutions of the C3 /Z2 × Z2 geometry, symmetric one (a.k.a. closed topological vertex, left) and asymmetric one (right), related by a flop of one P1 . The Kähler parameters of both geometries are related to each other [28] as P1 = Q 1 Q 2 , P2 = 1/Q 2 , P3 = Q 2 Q 3


For the BPS counting problem, we are interested in the limit $T \to \infty$, or equivalently $\epsilon \to 0$. With appropriate shifts of $x$ and $y$, Eq. (3.18) in this limit becomes

$$\mu\, e^{2y} + e^{x+y} + e^x + (1 + Q\mu)\, e^y + Q = 0. \tag{3.20}$$
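As a quick consistency check on (3.20), one can confirm that its left-hand side factorizes when $\mu = 1$ or $Q = 1$. The sketch below is an illustration, not part of the original argument: the function name `curve_320` and the explicit factors are assumptions introduced here, with $X = e^x$ and $Y = e^y$ treated as independent variables. The identities are tested on random integer points, which is enough to detect a wrong factor for polynomials of such low degree.

```python
import random


def curve_320(X, Y, Q, mu):
    """Left-hand side of Eq. (3.20) written in X = e^x and Y = e^y."""
    return mu * Y**2 + X * Y + X + (1 + Q * mu) * Y + Q


def factorization_holds(trials=200, seed=0):
    """Compare (3.20) with candidate factorized forms at random integer points."""
    rng = random.Random(seed)
    for _ in range(trials):
        X, Y, t = (rng.randint(-50, 50) for _ in range(3))
        # mu = 1 (Q = t arbitrary): (Y + 1)(X + Y + Q)
        if curve_320(X, Y, t, 1) != (Y + 1) * (X + Y + t):
            return False
        # Q = 1 (mu = t arbitrary): (Y + 1)(X + mu*Y + 1)
        if curve_320(X, Y, 1, t) != (Y + 1) * (X + t * Y + 1):
            return False
    return True


if __name__ == "__main__":
    print("factorizes at mu = 1 and at Q = 1:", factorization_holds())
```

In both special cases the common factor is $e^y + 1$.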

The manifold $Y$ corresponding to this curve is the SPP geometry, with $Q$ and $\mu$ being exponentials of flat coordinates representing sizes of its two $\mathbb{P}^1$'s, which encode two copies of the initial $\mathcal{O}(-1,-1)\to\mathbb{P}^1$ geometry, see Fig. 3. Not only does the spectral curve agree with the mirror curve of the SPP geometry in the limit $g_s \to 0$, but in fact the matrix integral reproduces the full topological string partition function at finite $g_s$. Indeed, it is known that the SPP topological string partition function, with Kähler parameters $Q$ and $\mu$, is equal to

$$Z^{\rm SPP}_{\rm top}(q, Q, \mu) = \prod_{k=1}^{\infty}\frac{(1-Qq^k)^k\,(1-\mu q^k)^k}{(1-q^k)^{3k/2}\,(1-\mu Qq^k)^k}. \tag{3.21}$$

On the other hand, from the explicit structure of the BPS generating function and formulas (2.10), (2.12) and (2.13), we find that the value of the matrix integral, in the $N \to \infty$ limit, is related to the above topological string partition function as

$$Z_{\rm matrix}(q, Q; n) = Z^{\rm SPP}_{\rm top}(q, Q, \mu = Q^{-1}q^n)\cdot\prod_{k=1}^{\infty}(1-q^k)^{k/2}. \tag{3.22}$$

In this way, the Kähler moduli Q and the chamber number n for the BPS counting on the conifold are unified into the two Kähler moduli of the SPP geometry. We note that there is an extra factor of the MacMahon function in this relation. The appearance of the MacMahon factor, which is independent of Q, is a common and subtle issue in relations between the topological string and other systems. We note that the spectral curves (3.18) and (3.20) arising from the matrix model automatically encode the relevant mirror map. For example, in the parametrization of (3.20), Q and μ are directly identified with the exponentials of the flat coordinates. We can verify this by explicit evaluation of period integrals, see [29], Sect. 3.3. The form (3.20) of the curve factorizes for μ = 1 and Q = 1, which is consistent with degeneration of the topological string partition function (3.21) for these values. Also in the limit x → ±∞, the solutions of the curve equation for e y reproduce appropriate locations of the asymptotic legs of the SPP toric diagram in Fig. 3. The same parametrization

Fig. 3. Toric diagram for the Suspended Pinch Point (SPP) geometry, and the corresponding dual diagram. This manifold contains two copies of O(−1, −1) → P1 geometry


naturally arises also in [30] as a characteristic polynomial of the dimer model; see the example in Sect. 4.2 and in particular (4.2.8) of [31]. All these arguments can be extended to the $\widetilde{Y}$ mirror curve (3.18). Note that the standard parametrization of this mirror curve, such as the one in [32], would suggest the equation $e^{x+y} + e^x + e^y + \epsilon^2 e^{2x} + \mu e^{2y} + Q = 0$. This is however valid for large values of Kähler parameters, and consistent with (3.18), as in this regime the quadratic terms in $Q$, $\mu$ and $\epsilon^2$ are negligible.

Because of the form of the spectral curve at finite ’t Hooft coupling (3.18), it is natural to conjecture that the partition function of our conifold matrix model for finite ’t Hooft coupling is equal to the topological string partition function of the resolution of $\mathbb{C}^3/\mathbb{Z}_2\times\mathbb{Z}_2$ [28], modulo a MacMahon factor:

$$Z^{\rm total}_{\rm matrix}(q,Q,\mu,\epsilon^2) = \prod_{k=1}^{\infty}\frac{(1-Qq^k)^k\,(1-\mu q^k)^k\,(1-\epsilon^2q^k)^k\,(1-Q\mu\epsilon^2q^k)^k}{(1-Q\mu q^k)^k\,(1-\mu\epsilon^2q^k)^k\,(1-Q\epsilon^2q^k)^k}\cdot\prod_{k=1}^{\infty}(1-q^k)^k. \tag{3.23}$$

We chose the MacMahon factor in such a way that it reduces to our result (3.22) in the infinite ’t Hooft coupling limit $\epsilon \to 0$. As further evidence for the conjecture, we point out that, in the limit $Q, \mu \to 0$, our model reduces to the Chern-Simons matrix model (discussed in the next section) and the above partition function correctly reduces to the appropriate Chern-Simons partition function. It would be interesting to test this conjecture, for example by applying the matrix model recursion relations of [15].

As discussed in [28], the right-hand side of (3.23) is precisely (including the correct power of the MacMahon function) the generating function of plane partitions in a finite $K\times L\times M$ cube, and up to one power of MacMahon it reproduces the closed topological vertex partition function with Kähler parameters identified as $Q_1 = g_sK$, $Q_2 = g_sL$, $Q_3 = g_sM$ (this generalizes the $\mathbb{C}^3$ model of plane partitions with one wall discussed in Sect. 2.1). In the present case we have one analogous identification of the ’t Hooft parameter, $T = g_sN$. Our ’t Hooft parameter also has a nice combinatorial interpretation: finite $N$ corresponds to matrices with $N$ eigenvalues, which in the construction of our matrix models arise from truncating the products in (4.18) to $N$ operators $\Gamma'_\pm$. This translates to a truncation of the Young diagrams, which arise from slicing the crystal model pyramid, to at most $N$ rows, which is equivalent to considering a wall at location $N$. Therefore our present model involves one wall associated to finite ’t Hooft coupling, a second parameter $\mu$ which involves finite $n$ (which also measures a size of the crystal), and a third parameter $Q$ which appears in the matrix model potential in the same way as $\mu$ but does not have a clear crystal interpretation. The cube model of [28] involves three symmetric walls and has the same generating function (3.23). It would be interesting to understand the relations between these two models in more detail.
As a final remark, we note that there are three limits in which our full matrix model reproduces both the mirror curve and the topological string partition function of the conifold. The first such limit, $\mu, Q \to 0$, brings us to the Chern-Simons matrix model and will be discussed in the next section. The second limit, $\mu, \epsilon \to 0$, is just the commutative limit of the model with matrices of infinite size. In both these limits it is not surprising that the size of the conifold is identified respectively with the ’t Hooft coupling $e^{-T}$ or the original Kähler parameter $Q$. However, in the third limit, $Q, \epsilon \to 0$, we obtain the conifold of size $\mu = Q^{-1}e^{-\tau}$, which in fact means that $Q$ vanishes while the chamber parameter $\tau \to \infty$. It also corresponds to the commutative limit, and shows that for vanishing $Q$ the role of the conifold Kähler parameter is taken over by $\mu$. This is


in agreement with the picturesque identification of the conifold size with the length of the top row of the pyramid in the crystal melting model, and puts this identification on firmer footing.

3.3. Relation to the Chern-Simons matrix model. We now discuss the commutative chamber $n \to \infty$ of the conifold model presented above. We show that it leads to a matrix model which is equivalent to the Chern-Simons matrix model, and that these two models can be unified in a geometric way. By the Chern-Simons matrix model [12,13,27] we understand the unitary matrix model with the Gaussian potential, as in (3.1), and finite ’t Hooft coupling $T = Ng_s$. Including the shift (A.1) arising from the measure, we write its potential as

$$V_{CS} = T\log U - \frac{1}{2}(\log U)^2, \qquad \partial_U V_{CS} = \frac{T - \log U}{U}. \tag{3.24}$$

Our present model is also unitary, and in the commutative chamber the derivative of its potential (3.6) reduces to

$$\partial_U V_{n\to\infty} = \frac{T - \log(U+Q)}{U}. \tag{3.25}$$

We recall that our matrix model arises from rewriting the BPS generating function, which in the $n$-th chamber takes the form (2.10). In the commutative chamber $n \to \infty$ the term $M(Q^{-1}) = \prod_k (1 - Q^{-1}q^k)^{-k}$ is removed from that expression. On the other hand, in this limit the prefactor (2.13) reduces to a single MacMahon function $M(1) = \prod_k (1-q^k)^{-k}$. Therefore in the commutative chamber we find

$$\frac{M(1)}{M(Q)} = \int \prod_i du_i \prod_{j<k}(u_j - u_k)^2\, e^{-\frac{1}{g_s}\sum_i V_{n\to\infty}(u_i)}.$$

The ratio on the left hand side is precisely the partition function of the Chern-Simons theory on $S^3$, which is also reproduced by the Chern-Simons matrix model (3.24). The spectral curve of that model has genus zero and is identified with the $\mathbb{P}^1$ which arises from the geometric transition of the $S^3$. The size of this $\mathbb{P}^1$ is given by the (finite) ’t Hooft coupling $T$. We have now found a model whose partition function is given by the same Chern-Simons partition function and whose spectral curve also has genus zero; however, our association of parameters is different. Instead of a finite ’t Hooft coupling parameterizing the size of $\mathbb{P}^1$, in our model the ’t Hooft coupling is infinite, while the size of $\mathbb{P}^1$ is encoded in a fixed parameter $Q$ deforming the unitary Gaussian potential as in (3.25). As an immediate check we notice that for $Q = 0$ our potential (3.25) indeed reduces to (3.24), and for infinite $T$ it reproduces the Gaussian result for plane partitions (3.1).

The dependence of the potential $V_{n\to\infty}$ on the parameter $Q$ is shown in Fig. 4. As we approach the conifold singularity at $Q = 1$, it is interesting to observe the flattening of the matrix potential. It is known that the conifold singularity has to do with the flattening of the Coulomb branch moduli space [33,34]. This indicates a connection between the matrix variable and the Coulomb branch variables.

With both finite ’t Hooft coupling $T$ and finite $Q$, we find a unifying geometric viewpoint, again in terms of the SPP geometry, however now with Kähler parameters $Q$ and $e^{-T}$. In this topological string limit Eqs. (3.12) and (3.13) take the form

$$\sqrt{a_+ + Q} + \sqrt{a_- + Q} = 2e^{-T/2}, \tag{3.26}$$

$$\sqrt{a_-(a_+ + Q)} + \sqrt{a_+(a_- + Q)} = \left(\sqrt{a_+} + \sqrt{a_-}\right)e^{T/2}, \tag{3.27}$$


Fig. 4. Matrix potential (without the (A.1) shift), $-V_{n\to\infty}(\varphi) = \frac{\pi^2}{6} + \frac{\varphi^2}{2} + \mathrm{Li}_2(-Qe^{-\varphi})$, in terms of the variable $\varphi$, where $u = e^{\varphi}$. The solid plot represents the Gaussian potential with $Q = 0$. Increasing $Q$ flattens the potential (dashed and medium-dashed). At the conifold singularity, corresponding to $Q = 1$, the potential becomes flat (tiny-dashed, horizontal plot)

and their solution is given by

$$a_\pm = -1 + (2-Q)\epsilon^2 \pm 2i\epsilon\sqrt{(1-Q)(1-\epsilon^2)}, \tag{3.28}$$

which leads to the following form of the resolvent:

$$\omega(u)\big|_{\mu=0} = \frac{1}{uT}\log\frac{u + 1 + Qe^{-T} - \sqrt{(u + 1 + Qe^{-T})^2 - 4(u+Q)\epsilon^2}}{2e^{-T}(u+Q)}. \tag{3.29}$$

The spectral curve which arises from this resolvent is again the mirror curve of the SPP geometry and reads

$$x + u + xu + x^2\,\frac{\epsilon^2}{1+Q\epsilon^2} + \frac{Q}{1+Q\epsilon^2} = 0. \tag{3.30}$$

It is clear that the $\epsilon \to 0$ limit leads to the conifold geometry with the conifold of size $Q$. Finally, for $Q = 0$ the resolvent (3.29),

$$\omega(u)\big|_{\mu=Q=0} = \frac{1}{uT}\log\frac{u + 1 - \sqrt{(u+1)^2 - 4ue^{-T}}}{2ue^{-T}},$$

agrees¹ with the resolvent of the Chern-Simons matrix model found in [13,27], and the spectral curve reproduces the conifold mirror curve of the size given by the ’t Hooft coupling,

$$x + u + xu + x^2e^{-T} = 0.$$

¹ Instead of introducing the $T\log U$ term to the potential (3.24) to get the standard Vandermonde determinant, the solution in [27] involves completing the square, which leads to a redefinition $u_{\rm here} = p_{[27]}\,e^{T}$. Due to a different sign of $g_s$ we also need to identify the ’t Hooft couplings as $T_{\rm here} = -t_{[27]}$. Taking this into account, our cut endpoints (3.28) with $Q = 0$ also agree with those in [27].
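That the end-points (3.28) solve the pair of conditions (3.26) and (3.27) can be checked numerically. The following is a minimal sketch under explicit assumptions: $0 < Q < 1$ and $T > 0$, all square roots taken on the principal branch (as returned by `cmath.sqrt`), and the right-hand side of (3.27) read as $(\sqrt{a_+} + \sqrt{a_-})\,e^{T/2}$. The function names are illustrative and do not come from the paper.

```python
import cmath
import math


def endpoints_mu0(Q, T):
    """Cut end-points of Eq. (3.28), with epsilon = exp(-T/2)."""
    eps = math.exp(-T / 2)
    real = -1 + (2 - Q) * eps**2
    imag = 2 * eps * math.sqrt((1 - Q) * (1 - eps**2))
    return complex(real, imag), complex(real, -imag)


def residuals(Q, T):
    """Left-hand side minus right-hand side of (3.26) and of (3.27)."""
    a_plus, a_minus = endpoints_mu0(Q, T)
    s = cmath.sqrt  # principal branch
    r26 = s(a_plus + Q) + s(a_minus + Q) - 2 * math.exp(-T / 2)
    r27 = (s(a_minus * (a_plus + Q)) + s(a_plus * (a_minus + Q))
           - (s(a_plus) + s(a_minus)) * math.exp(T / 2))
    return r26, r27


if __name__ == "__main__":
    for Q, T in [(0.3, 1.2), (0.7, 0.4)]:   # illustrative values only
        r26, r27 = residuals(Q, T)
        print(f"Q = {Q}, T = {T}: |r26| = {abs(r26):.2e}, |r27| = {abs(r27):.2e}")
```

Because $a_+$ and $a_-$ are complex conjugates in this regime, each sum of two principal square roots is twice a non-negative real part, which is what fixes the signs in the check.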


Fig. 5. Toric diagram for the resolution of C3 /Z3 singularity, and the corresponding dual diagram. This geometry contains two copies of C3 /Z2 resolution

3.4. $\mathbb{C}^3/\mathbb{Z}_2$. An analysis similar to that for the conifold can be performed for the $\mathbb{C}^3/\mathbb{Z}_2$ geometry, for an arbitrary chamber. Even though we do not repeat a matrix model derivation of the spectral curve for this case, we note that the relation to topological string theory is also immediate, and the relevant geometry $Y$ for this case is the resolution of the $\mathbb{C}^3/\mathbb{Z}_3$ singularity shown in Fig. 5. This geometry contains two $\mathbb{P}^1$'s of $\mathcal{O}(0,-2)$ type, and denoting its Kähler parameters by $Q$ and $\mu$, its topological string partition function reads

$$Z^{\mathbb{C}^3/\mathbb{Z}_3}_{\rm top}(q, Q, \mu) = \prod_{k=1}^{\infty}(1-q^k)^{-3k/2}\,(1-Qq^k)^{-k}\,(1-\mu q^k)^{-k}\,(1-\mu Qq^k)^{-k}. \tag{3.31}$$

Therefore, from (2.18) and (2.19) we find in this case

$$Z_{\rm matrix}(q, Q; n) = Z^{\mathbb{C}^3/\mathbb{Z}_3}_{\rm top}(q, Q, \mu = Q^{-1}q^n)\cdot\prod_{k=1}^{\infty}(1-q^k)^{k/2}. \tag{3.32}$$

This shows that the matrix model partition function (2.18) in the $n$-th chamber is equal to the topological string partition function for $\mathbb{C}^3/\mathbb{Z}_3$ with its two Kähler moduli given by $Q$ and $\mu = Q^{-1}q^n$, up to the MacMahon function as in (3.22). This unified geometry $Y$ contains two copies of the initial $\mathbb{C}^3/\mathbb{Z}_2$ resolution. It would be interesting to check what geometry $\widetilde{Y}$ would arise for finite ’t Hooft coupling, and whether it is consistent with the total matrix model partition function.

3.5. General toric Calabi-Yau manifold. A matrix model for a general toric manifold and in a general chamber can be constructed in a similar manner, following the fermionic approach of [6,7], and then analyzed along the lines above. We do not present a construction in a general chamber, which is technically much more involved; however, we found explicit expressions for matrix models for a general manifold $X$ in the non-commutative chamber. These models are presented in Sect. 2.4. Nonetheless, we postulate that for an arbitrary chamber, in the infinite ’t Hooft limit, we should also find a toric manifold $Y$ which contains two copies of $X$. For the matrix model at finite ’t Hooft coupling, the spectral curve would encode a yet more general manifold $\widetilde{Y}$.


Fig. 6. The toric diagram for a Calabi-Yau manifold without compact 4-cycles arises from a triangulation of a strip. There are $N$ independent $\mathbb{P}^1$'s with Kähler parameters $Q_i = e^{-t_i}$, and $\chi$ vertices to which we associate $\oplus$ and $\ominus$ signs $S_i$. Intervals which connect vertices with opposite signs represent $\mathcal{O}(-1,-1)\to\mathbb{P}^1$ local neighborhoods. Intervals which connect vertices with the same signs represent $\mathcal{O}(-2,0)\to\mathbb{P}^1$ local neighborhoods

4. Derivations of the Matrix Models

In this section we give two derivations of our matrix models. One derivation (Sect. 4.1) uses the free fermion formalism, while the other (Sect. 4.2) uses a set of non-intersecting paths.² Both derivations are based on the following observation. Let us begin with the crystal melting model of [5]. Given a configuration of a crystal, we can slice the crystal by a sequence of parallel planes. On each slice, we have a Young diagram. The Young diagrams evolve according to the interlacing conditions, which are equivalent to the melting rules of [5]. For $\mathbb{C}^3$, we have [4]

$$\cdots \prec \lambda(-2) \prec \lambda(-1) \prec \lambda(0) \succ \lambda(1) \succ \lambda(2) \succ \cdots, \tag{4.1}$$

where we write $\lambda \succ \mu$ (equivalently $\mu \prec \lambda$) for two partitions $\lambda = (\lambda_i)$ and $\mu = (\mu_i)$ if

$$\lambda_i = \mu_i + 1 \ \text{ or } \ \lambda_i = \mu_i \quad \text{for each } i. \tag{4.2}$$

For the conifold in the non-commutative chamber [35],

$$\cdots \prec \lambda(-2) \overset{+}{\prec} \lambda(-1) \prec \lambda(0) \overset{+}{\succ} \lambda(1) \succ \lambda(2) \overset{+}{\succ} \cdots, \tag{4.3}$$

where we write $\lambda \overset{+}{\succ} \mu$ for $\lambda = (\lambda_i)$, $\mu = (\mu_i)$ if $\lambda^t \succ \mu^t$, i.e.

$$\cdots \ge \lambda_i \ge \mu_i \ge \lambda_{i-1} \ge \mu_{i-1} \ge \cdots. \tag{4.4}$$

We can also discuss more general toric Calabi-Yau 3-folds $X$ without compact 4-cycles (see Fig. 6). The $(p,q)$-web for $X$ has $\chi$ vertices, where $\chi$ is the Euler characteristic of $X$. To each vertex we associate a sign $S_i = \pm 1$ so that

$$S_iS_{i+1} = s_i, \tag{4.5}$$

² We have been informed by Mina Aganagic that there is yet another derivation of the matrix models [23].


where the sign factor $s_i = \pm 1$ is defined in Sect. 2.4. This means that if the local neighborhood of the $i$-th $\mathbb{P}^1$, represented by an interval between vertices $i$ and $i+1$, is $\mathcal{O}(-2,0)$, then $S_{i+1} = S_i$; if this neighborhood is of $\mathcal{O}(-1,-1)$ type, then $S_{i+1} = -S_i$. There is a binary choice of overall signs $S_i$: the type of the first vertex can be chosen as either $S_1 = +1$ or $S_1 = -1$. This choice corresponds to the exchange of rows and columns of Young diagrams. Each choice gives rise to a matrix model potential, and they are related to each other by analytic continuation. Given such $S_i$, the interlacing conditions in the non-commutative chamber are given by

$$\cdots \overset{S_{-2}}{\prec} \lambda(-2) \overset{S_{-1}}{\prec} \lambda(-1) \overset{S_0}{\prec} \lambda(0) \overset{S_1}{\succ} \lambda(1) \overset{S_2}{\succ} \lambda(2) \overset{S_3}{\succ} \cdots, \tag{4.6}$$

where $\overset{-}{\succ} = \succ$ and we extended the definition of $S_i$ ($i = 1, \dots, \chi$) to $S_i$ ($i \in \mathbb{Z}$) by the periodic identification $S_{i+\chi} = S_i$. A more general expression, applicable to any chamber, is given in [7,36].
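The rule for propagating the vertex signs along the strip is simple enough to spell out in code. The sketch below is an illustration only: the function names and the string labels for the two neighborhood types are assumptions made here, not notation from the paper. It builds $S_1, \dots, S_\chi$ from the list of $\mathbb{P}^1$ neighborhoods and applies the periodic identification $S_{i+\chi} = S_i$.

```python
def vertex_signs(neighborhoods, first_sign=+1):
    """Return [S_1, ..., S_chi] for a strip geometry.

    neighborhoods[j] labels the P^1 between vertices j+1 and j+2:
    "O(-2,0)" keeps the sign, "O(-1,-1)" flips it.
    first_sign is the free overall choice S_1 = +1 or -1.
    """
    if first_sign not in (+1, -1):
        raise ValueError("first_sign must be +1 or -1")
    signs = [first_sign]
    for kind in neighborhoods:
        if kind == "O(-2,0)":
            signs.append(signs[-1])
        elif kind == "O(-1,-1)":
            signs.append(-signs[-1])
        else:
            raise ValueError(f"unknown neighborhood type: {kind!r}")
    return signs


def periodic_sign(signs, i):
    """S_i for any integer i, using S_{i+chi} = S_i (vertices numbered from 1)."""
    return signs[(i - 1) % len(signs)]


if __name__ == "__main__":
    conifold = vertex_signs(["O(-1,-1)"])      # a single P^1, two vertices
    orbifold = vertex_signs(["O(-2,0)"])       # C^3/Z_2
    print("conifold signs:", conifold)
    print("C^3/Z_2 signs:", orbifold)
    print("S_{-2}..S_3 for the conifold:",
          [periodic_sign(conifold, i) for i in range(-2, 4)])
```

Flipping `first_sign` reverses every $S_i$ at once, which is the binary choice described above.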

4.1. Derivation (I): Free fermions.

4.1.1. Wall crossing and free fermions. The first derivation is based on the free fermion formalism developed in [6,7] (see also [36]), which we now review briefly. The basic idea is as follows. We have seen that states in the crystal melting model are represented by Young diagrams. Since Young diagrams are represented by states of free fermion systems and evolutions of Young diagrams by vertex operators, the partition function is written as a correlator of fermion bilinears [4,35]. We give the resulting expression in the notation of [6]. For any toric geometry without compact 4-cycles, the generating function of DT invariants in the non-commutative chamber can be written as

$$Z_{\rm BPS}(q, Q; n = 0) = \langle\Omega_+|\Omega_-\rangle, \tag{4.7}$$

where $|\Omega_\pm\rangle$ are fermionic states which will be defined below. Moreover, we can introduce wall-crossing operators $W_p$ to write the expression in other chambers,³ where $n_p = m$ for the $p$-th $\mathbb{P}^1$ and all other $n$'s are set equal to zero:

$$Z_{\rm BPS}(q, Q; n_p = m, \ \text{all other } n = 0) = \langle\Omega_+|(W_p(1))^m|\Omega_-\rangle. \tag{4.8}$$

In the remainder of this subsection we give explicit expressions for the states $|\Omega_\pm\rangle$ and the wall-crossing operators $W_p$. We first define a vertex operator $\Gamma_\pm^{S_i}(x)$ at each vertex as

$$\Gamma_\pm^{S_i=+1}(x) = \Gamma_\pm(x), \qquad \Gamma_\pm^{S_i=-1}(x) = \Gamma'_\pm(x),$$

where $\Gamma_\pm(x)$ and $\Gamma'_\pm(x)$ are defined in Appendix B. These operators represent the evolution of Young diagrams $\lambda(t)$, and $\Gamma_+$, $\Gamma_-$, $\Gamma'_+$ and $\Gamma'_-$ are nothing but the evolution rules $\prec$, $\succ$, $\overset{+}{\prec}$ and $\overset{+}{\succ}$, respectively [35].

³ In this example, inserting wall crossing operators is equivalent to commuting vertex operators, which is proposed in [7,36].


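Next, we consider a product of $\chi$ such operators $\Gamma_\pm^{S_i}(x)$ interlaced with $\chi$ operators $\widehat{Q}_i$ representing colors $q_i$, for $i = 0, 1, \dots, \chi-1$. The operators $\widehat{Q}_1, \dots, \widehat{Q}_{\chi-1}$ are associated to the $\mathbb{P}^1$'s in the toric diagram (they are also defined in Appendix B), and there is an additional $\widehat{Q}_0$. We then introduce

$$\overline{A}_\pm(x) = \Gamma_\pm^{S_1}(x)\,\widehat{Q}_1\,\Gamma_\pm^{S_2}(x)\,\widehat{Q}_2\cdots\Gamma_\pm^{S_{\chi-1}}(x)\,\widehat{Q}_{\chi-1}\,\Gamma_\pm^{S_\chi}(x)\,\widehat{Q}_0. \tag{4.9}$$

Commuting all $\widehat{Q}_i$'s to the left or right we also introduce

$$A_+(x) = (\widehat{Q}_0\widehat{Q}_1\cdots\widehat{Q}_{\chi-1})^{-1}\,\overline{A}_+(x) = \Gamma_+^{S_1}(xq)\,\Gamma_+^{S_2}\!\left(\frac{xq}{q_1}\right)\cdots\Gamma_+^{S_\chi}\!\left(\frac{xq}{q_1q_2\cdots q_{\chi-1}}\right), \tag{4.10}$$

$$A_-(x) = \overline{A}_-(x)\,(\widehat{Q}_0\widehat{Q}_1\cdots\widehat{Q}_{\chi-1})^{-1} = \Gamma_-^{S_1}(x)\,\Gamma_-^{S_2}(xq_1)\cdots\Gamma_-^{S_\chi}(xq_1q_2\cdots q_{\chi-1}). \tag{4.11}$$

The states $|\Omega_\pm\rangle$ are now defined as

$$\langle\Omega_+| = \langle 0|\cdots\overline{A}_+(1)\,\overline{A}_+(1)\,\overline{A}_+(1) = \langle 0|\cdots A_+(q^2)\,A_+(q)\,A_+(1), \tag{4.12}$$

$$|\Omega_-\rangle = \overline{A}_-(1)\,\overline{A}_-(1)\,\overline{A}_-(1)\cdots|0\rangle = A_-(1)\,A_-(q)\,A_-(q^2)\cdots|0\rangle. \tag{4.13}$$

It was shown in [6] that we have the relation (4.7) under the following identification between the $q_i$ parameters which enter the definition of $|\Omega_\pm\rangle$ and the topological string parameters $Q_i = e^{-T_i}$ and $q = e^{-g_s}$:

$$q_i = (S_iS_{i+1})\,Q_i, \qquad q = q_0q_1\cdots q_{\chi-1}. \tag{4.14}$$

In addition the wall-crossing operators are defined by

$$W_p(x) = \left(\Gamma_-^{t_1}(x)\,\widehat{Q}_1\,\Gamma_-^{t_2}(x)\,\widehat{Q}_2\cdots\Gamma_-^{t_p}(x)\,\widehat{Q}_p\right)\times\left(\Gamma_+^{t_{p+1}}(x)\,\widehat{Q}_{p+1}\cdots\Gamma_+^{t_{\chi-1}}(x)\,\widehat{Q}_{\chi-1}\,\Gamma_+^{t_\chi}(x)\,\widehat{Q}_0\right) \tag{4.15}$$

and the relation (4.8) holds under the change of variables

$$Q_p = (S_pS_{p+1})\,q_pq^m, \qquad Q_i = (S_iS_{i+1})\,q_i \ \text{ for } i \neq p, \qquad q = q_0q_1\cdots q_{\chi-1}. \tag{4.16}$$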

4.1.2. Matrix models from free fermions. Once the BPS partition function is written in the fermionic formalism, it can be turned into a matrix model upon inserting an appropriately chosen identity operator in the correlator (4.8):

$$Z_{\rm BPS}(q, Q; n_p = m, \ \text{all other } n = 0) = \langle\Omega_+|\,\mathbb{I}\,(W_p(1))^m|\Omega_-\rangle. \tag{4.17}$$

The identity operator $\mathbb{I}$ is represented by the complete set of states $|R\rangle\langle R|$ (representing two-dimensional partitions). Using orthogonality relations of $U(\infty)$ characters $\chi_R$, and the fact that these characters are given in terms of Schur functions $\chi_R = s_R(\vec{u})$ for $\vec{u} = (u_1, u_2, u_3, \dots)$, we can write

$$\mathbb{I} = \sum_R |R\rangle\langle R| = \sum_{P,R}\delta_{P^tR^t}\,|P\rangle\langle R| = \int dU \sum_{P,R} s_{P^t}(\vec{u})\,\overline{s_{R^t}(\vec{u})}\,|P\rangle\langle R| = \int dU \Big(\prod_i \Gamma'_-(u_i)|0\rangle\Big)\Big(\langle 0|\prod_i \Gamma'_+(u_i^{-1})\Big), \tag{4.18}$$

where $dU$ denotes the unitary measure for $U(\infty)$, which can be written in terms of the eigenvalues $u_i = e^{i\phi_i}$ of $U$ as

$$dU = \prod_k d\phi_k \prod_{i<j}\left(e^{i\phi_i} - e^{i\phi_j}\right)\left(e^{-i\phi_i} - e^{-i\phi_j}\right).$$

Having inserted the identity operator in this form into (4.17) we can commute away the $\Gamma'_\pm$ operators and get rid of operator expressions. This leads to a matrix model with the unitary measure $dU$. In the case of the non-commutative chamber all factors arising from commuting these $\Gamma'_\pm$ operators depend on $u_i$ and contribute just to the matrix model potentials. In other chambers additional factors arise which do not depend on $u_i$ and therefore contribute to some overall factor $C_n$ (in a chamber labeled by $n$). Thus in general we write the DT generating function as a matrix model, up to the factor $C_n$. In the non-commutative chamber, the integrand can be expressed in terms of the theta-product,

$$\Theta(U|q) = \prod_{k=0}^{\infty}(1 + Uq^k)(1 + U^{-1}q^{k+1}),$$

and in other chambers in terms of a certain modification thereof. We emphasize here that this fermionic method of constructing matrix models applies to any chamber for any toric Calabi-Yau 3-fold without compact 4-cycles. This includes, for example, chambers where the BPS partition function becomes a finite product [6].

One may ask if our construction of the matrix model is unique. One potential source of ambiguity is the location of the operator $\mathbb{I}$. In (4.17) we inserted the operator $\mathbb{I}$ on the left side of $(W_1(1))^m$. When inserted on the right, we find a seemingly different matrix model potential, for example,

$$\int dU \det\left(\prod_{k=0}^{\infty}\frac{(1 + Uq^k)(1 + U^{-1}q^{k+n+1})}{(1 + QUq^{k-n})(1 + Q^{-1}U^{-1}q^{k+n+1})}\right), \tag{4.19}$$

in the conifold with the same prefactor (2.13). This integral can be turned into

$$Z_{\rm matrix}(q, Q; n) = \int dU \det\left(\frac{\Theta(QU|q)}{\Theta(U|q)}\prod_{k=1}^{n}\frac{1}{1 + Q^{-1}U^{-1}q^k}\right), \tag{4.20}$$

by the simple change of integration variable, $U \to q^{n+1}Q^{-1}U^{-1}$. The resulting integral is similar to our original integral (2.11), but the numerator and the denominator are


exchanged. Then the matrix model (4.20) can be derived by taking advantage of the freedom of changing overall signs Si → −Si ’s, which was mentioned in Sect. 4.1.1. The two matrix models can then be related by analytic continuation in q, as described in footnote 5 of [37]. Another possible ambiguity involves renaming U → U −1 in (4.18), which does not affect the form of the measure, however may affect the form of the potential.
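The change of variable relating (4.19) to (4.20) can be checked numerically for a single eigenvalue, where the determinant reduces to an ordinary product. This is a minimal sketch under stated assumptions: the infinite products are truncated at a hypothetical cutoff `K`, the function names are illustrative, and only the integrands are compared (the measure is not included). Substituting $U \to q^{n+1}Q^{-1}U^{-1}$ into the integrand of (4.19) should then reproduce the integrand of (4.20) up to the truncation error.

```python
import cmath


def theta(U, q, K=80):
    """Truncated theta-product: prod_{k=0}^{K-1} (1 + U q^k)(1 + q^(k+1) / U)."""
    out = 1.0 + 0j
    for k in range(K):
        out *= (1 + U * q**k) * (1 + q**(k + 1) / U)
    return out


def integrand_419(U, Q, q, n, K=80):
    """Scalar version of the integrand in Eq. (4.19), truncated at K factors."""
    out = 1.0 + 0j
    for k in range(K):
        num = (1 + U * q**k) * (1 + q**(k + n + 1) / U)
        den = (1 + Q * U * q**(k - n)) * (1 + q**(k + n + 1) / (Q * U))
        out *= num / den
    return out


def integrand_420(U, Q, q, n, K=80):
    """Scalar version of the integrand in Eq. (4.20), truncated at K factors."""
    finite = 1.0 + 0j
    for k in range(1, n + 1):
        finite /= 1 + q**k / (Q * U)
    return theta(Q * U, q, K) / theta(U, q, K) * finite


if __name__ == "__main__":
    Q, q, n = 0.3, 0.5, 2                  # illustrative values only
    U = cmath.exp(0.7j)                    # a point on the unit circle
    lhs = integrand_419(q**(n + 1) / (Q * U), Q, q, n)
    rhs = integrand_420(U, Q, q, n)
    print("relative difference:", abs(lhs - rhs) / abs(rhs))
```

With $|q| < 1$ the neglected factors differ from one by terms of order $q^K$, so the two sides should agree to machine precision for the cutoff used here.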

4.2. Derivation (II): Non-intersecting paths. In this section we give yet another derivation of our unitary matrix models, based on non-intersecting paths. The fundamental observation here is that states in the crystal melting model, i.e., sequences of Young diagrams satisfying interlacing conditions, can be equivalently expressed as a set of non-intersecting paths on an oriented graph.⁴ Using the Lindström-Gessel-Viennot (LGV) formula described in Appendix C, we can express the result as a determinant, which is an integral over a matrix whose eigenvalues are the heights of the non-intersecting paths. This gives a multi-matrix model. We finally simplify the matrix model into a 1-matrix model.

4.2.1. $\mathbb{C}^3$. Let us begin with the simplest example, $\mathbb{C}^3$. Let us define

$$h_k(t) = \lambda_{N-k+1}(t) + k - 1, \tag{4.21}$$

for $k = 1, \dots, N$. Since $\lambda(t)$ is a partition, we have

$$h_k(t) < h_{k+1}(t), \tag{4.22}$$

for all $t$. We also have the boundary condition

$$h_k(t) = k - 1 \quad \text{when } |t| \text{ is large}. \tag{4.23}$$

Moreover, (4.1) means we have, for each step $t$, $h_k(t+1) - h_k(t) = 0$ or $-1$ for $t \ge 0$, and $h_k(t+1) - h_k(t) = 0$ or $1$ for $t < 0$. For later purposes, it is convenient to introduce minus the sign function, $\sigma(t')$ ($t' \in \mathbb{Z} + \frac{1}{2}$),

$$\sigma(t') = \begin{cases} -1 & (t' > 0), \\ +1 & (t' < 0), \end{cases} \tag{4.24}$$

so that

$$h_k(t+1) - h_k(t) = 0 \ \text{ or } \ \sigma\!\left(t + \tfrac{1}{2}\right). \tag{4.25}$$

⁴ The matrix model for $\mathbb{C}^3$ was constructed recently in [11]. We will generalize those arguments to the conifold later. Also, even for $\mathbb{C}^3$ the explicit expression of the potential for the 1-matrix model (4.35) in our paper seems to be new.
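The map (4.21) from a Young diagram to path heights, and the step rule (4.25), can be illustrated with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, partitions are given as weakly decreasing tuples, and they are padded with zeros to $N$ parts.

```python
def heights(partition, N):
    """Heights h_k = lambda_{N-k+1} + k - 1 for k = 1..N, as in Eq. (4.21)."""
    lam = list(partition) + [0] * (N - len(partition))
    if len(lam) > N:
        raise ValueError("partition has more than N parts")
    if any(a < b for a, b in zip(lam, lam[1:])) or any(a < 0 for a in lam):
        raise ValueError("parts must be non-negative and weakly decreasing")
    # lam[N - k] is lambda_{N-k+1} in the 1-based notation of the text.
    return [lam[N - k] + k - 1 for k in range(1, N + 1)]


def is_allowed_step(lam_now, lam_next, N, t):
    """Check Eq. (4.25) for the step t -> t+1.

    Each height may change by 0 or by sigma(t + 1/2), where sigma = -1
    for t >= 0 and sigma = +1 for t < 0, as in Eq. (4.24).
    """
    sigma = -1 if t >= 0 else +1
    h_now, h_next = heights(lam_now, N), heights(lam_next, N)
    return all(b - a in (0, sigma) for a, b in zip(h_now, h_next))


if __name__ == "__main__":
    print(heights((3, 1), 3))                       # heights of lambda = (3, 1, 0)
    print(heights((), 3))                           # empty diagram: h_k = k - 1
    print(is_allowed_step((3, 1), (2, 1), 3, t=0))  # a box is removed at t >= 0
```

Since the parts are weakly decreasing and the shift $k - 1$ is strictly increasing, the resulting heights are automatically strictly increasing, which is the non-intersecting condition (4.22).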


Summarizing, we see that Young diagrams $\{\lambda(t)\}$ are equivalently expressed by a set of coordinates $h_k(t)$ which satisfy the conditions above. We can represent these coordinates as a set of non-intersecting paths on the oriented graph shown in Fig. 7.⁵ The coordinate of the $k$-th path at time $t$ (specified by the $t$-th dotted line) is given by an integer $h_k(t)$. It is easy to see that there is a one-to-one correspondence between the coordinates $h_k(t)$ satisfying the conditions above and the non-intersecting paths on the oriented graph. The inequality (4.22) is translated into the non-intersecting condition. The step condition (4.25) corresponds to the fact that we have two arrows for each vertex on the oriented graph, one with the same coordinate and another with the coordinate increasing/decreasing by one unit. Thus, the BPS partition function can be expressed as a sum over non-intersecting paths,

$$Z_{\rm BPS}(q) = \sum_{\{h_i(t)\}:\ \text{non-intersecting}}\ \prod_t q^{\sum_i h_i(t)}, \tag{4.26}$$

where paths are assumed to satisfy the boundary condition (4.23). By the LGV formula (Appendix C), (4.26) is equivalent to

$$Z_{\rm matrix}(q) = \int \prod_t dh(t)\,\det_{i,j}\big(G(i,j;t)\big), \tag{4.27}$$

where the Green function $G(i,j;t)$ is given by

$$G(i,j;t) = q^{h_i(t)} \times \Big[\delta\big(h_i(t+1) - h_j(t)\big) + \delta\big(h_i(t+1) - h_j(t) + \sigma(t+1/2)\big)\Big]. \tag{4.28}$$

The discrete coordinates $h_i(t)$ are turned into continuous variables. The delta functions enforce the condition (4.25). The contributions including off-diagonal components of the Green functions correspond to intersecting paths, which cancel out by the sign of the determinant. We also need to impose the boundary condition (4.23),

$$h_i(t) \in \{1, \dots, N\}, \qquad |t| \gg 1, \tag{4.29}$$

where $N$ is an integer which we take to infinity at the end of the computation. Keeping $N$ finite corresponds to taking only the first $N$ paths. The delta functions can be generated by introducing Lagrange multipliers $\phi(t')$ ($t' \in \mathbb{Z} + \frac{1}{2}$),

$$\frac{1}{N!}\int d\phi(t')\, e^{-{\rm Tr}\,V_{t'}(\phi(t'))}\det\!\Big(e^{ih_i(t'+\frac{1}{2})\phi_j(t')}\Big)\det\!\Big(e^{-ih_i(t'-\frac{1}{2})\phi_j(t')}\Big) = \det\!\Big[\delta\big(h_i(t'+\tfrac{1}{2}) - h_j(t'-\tfrac{1}{2})\big) + \delta\big(h_i(t'+\tfrac{1}{2}) - h_j(t'-\tfrac{1}{2}) + \sigma(t')\big)\Big],$$

where the potentials $V_{t'}(\phi(t'))$ depend on the signs of $t'$ and are given by

$$e^{-V_{t'}(\phi)} = 1 + e^{i\phi\sigma(t')}. \tag{4.30}$$

⁵ This oriented graph arises from the lozenge tiling of the plane, which is another way of representing the crystal for $\mathbb{C}^3$. The paper [11] uses this tiling to construct an oriented graph. For the derivation of the matrix model in this paper, however, we do not need to invoke the notion of lozenge tilings.
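For a single path ($N = 1$) the Lagrange-multiplier identity reduces to an elementary Fourier integral, which can be checked directly. The sketch below rests on assumptions that are not fixed by the text: the integral is taken over $\phi \in [0, 2\pi)$ with normalization $1/2\pi$, the height difference $d = h(t'+\frac{1}{2}) - h(t'-\frac{1}{2})$ is an integer, and the function name is illustrative. With the weight $e^{-V_{t'}(\phi)} = 1 + e^{i\phi\sigma(t')}$ of (4.30), the result should equal one exactly when $d = 0$ or $d + \sigma(t') = 0$, and zero otherwise, matching the two delta functions on the right-hand side.

```python
import cmath
import math


def one_step_kernel(d, sigma, M=64):
    """(1/2pi) * integral of (1 + e^{i sigma phi}) e^{i d phi} over [0, 2pi).

    Evaluated with an equal-weight M-point rule, which is exact for these
    trigonometric polynomials as long as |d| + 1 < M.
    """
    total = 0j
    for m in range(M):
        phi = 2 * math.pi * m / M
        total += (1 + cmath.exp(1j * sigma * phi)) * cmath.exp(1j * d * phi)
    return total / M


if __name__ == "__main__":
    for sigma in (-1, +1):
        values = {d: round(abs(one_step_kernel(d, sigma)), 12) for d in range(-2, 3)}
        print(f"sigma = {sigma:+d}:", values)
```

For $N > 1$ the same mechanism acts entry by entry inside the two determinants, with the $1/N!$ accounting for the permutation symmetry of the $\phi_j$.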


Fig. 7. (Three panels, each drawn over the time slices t = −3, …, 3.) Top: An oriented graph for $\mathbb{C}^3$. Middle: An example of 3 non-intersecting paths shown by thick arrows. The location of the $k$-th path at time $t$ gives $h_k(t)$. Bottom: The corresponding evolution of Young diagrams


We can enforce the boundary condition (4.29) by limiting $-M \le t \le M$ for the height function $h(t)$ and $-M - \frac{1}{2} \le t' \le M + \frac{1}{2}$ for the Lagrange multiplier $\phi(t')$ for sufficiently large $M$, and by introducing the factors

$$\Delta\big(e^{-i\phi(-M-\frac{1}{2})}\big)\,\Delta\big(e^{i\phi(M+\frac{1}{2})}\big) \tag{4.31}$$

at the initial and final points, where $\Delta(e^{i\phi})$ is the Vandermonde determinant, $\Delta(e^{i\phi}) = \det_{kl} e^{ik\phi_l}$. In the free fermion formalism of the previous subsection, the two determinants in (4.31) correspond to the bra and ket states for the Fock vacuum. We also set the potentials at the two end points to vanish, $V_{t'=-M-\frac{1}{2}} = V_{t'=M+\frac{1}{2}} = 0$. The partition function then takes the form⁶

$$Z_{\rm matrix}(q) = \int \prod_t dh(t) \prod_{t'} d\phi(t')\ \Delta\big(e^{-i\phi(-M-\frac{1}{2})}\big)\,\Delta\big(e^{i\phi(M+\frac{1}{2})}\big) \times \prod_t q^{{\rm Tr}\,h(t)} \prod_{t'} e^{-{\rm Tr}\,V_{t'}(\phi(t'))}\det\!\big(e^{ih(t'+\frac{1}{2})\phi(t')}\big)\det\!\big(e^{-ih(t'-\frac{1}{2})\phi(t')}\big). \tag{4.32}$$

We can turn this into a matrix integral. We use the Itzykson-Zuber integral over a unitary matrix $U$ [38,39],

$$\det\big(e^{iX_iY_j}\big) = \Delta(X)\,\Delta(Y)\int dU\, e^{i\,{\rm Tr}\,XUYU^{\dagger}}, \tag{4.33}$$

to generate squares of the Vandermonde determinants $\Delta(h(t))^2$ and $\Delta(\phi(t'))^2$ for all $t$ and $t'$, except for $\phi(-M-\frac{1}{2})$ and $\phi(M+\frac{1}{2})$, for which $\Delta(\phi(-M-\frac{1}{2}))$ and $\Delta(\phi(M+\frac{1}{2}))$ are generated. The resulting expression can now be written as a matrix integral

$$Z_{\rm matrix}(q) = \int \prod_t dH(t) \prod_{t'} d\Phi(t')\ \frac{\Delta\big(e^{-i\phi(-M-\frac{1}{2})}\big)}{\Delta\big(\phi(-M-\frac{1}{2})\big)}\,\frac{\Delta\big(e^{i\phi(M+\frac{1}{2})}\big)}{\Delta\big(\phi(M+\frac{1}{2})\big)} \times \prod_t q^{{\rm Tr}\,H(t)} \prod_{t'} e^{-{\rm Tr}\,V_{t'}(\Phi(t'))}\, e^{{\rm Tr}\, i\Phi(t')\left(H(t'+\frac{1}{2}) - H(t'-\frac{1}{2})\right)}, \tag{4.34}$$

where $H(t)$ ($t \in \mathbb{Z}$) and $\Phi(t')$ ($t' \in \mathbb{Z} + \frac{1}{2}$) are $N\times N$ Hermitian matrices whose eigenvalues are $(h_i(t))_{i=1,\dots,N}$ and $(\phi_i(t'))_{i=1,\dots,N}$, respectively. The matrix model (4.34) is a multi-matrix model. However, the $H(t)$ appear only linearly in the integrand, and can be trivially integrated out, yielding the constraints

$$\Phi(t') = \Phi(t'-1) + \log q\cdot 1_{N\times N} = \Phi\!\left(-\tfrac{1}{2}\right) + \left(t' + \tfrac{1}{2}\right)\log q\cdot 1_{N\times N}.$$

⁶ We drop an overall constant here for simplicity. In the final expression, we will present the correct formula including the overall factor.


The matrix model then simplifies to a one-matrix model. The measure factor for $\Phi = \Phi(-\frac{1}{2})$ is

$$d\Phi\cdot\left(\frac{\Delta(e^{i\phi})}{\Delta(\phi)}\right)^2 = dU,$$

where $U = e^{i\Phi}$. The factor (4.31) we inserted to impose the boundary condition turns the Hermitian measure for $\Phi$ into the unitary measure for $U$. Thus, the matrix integral can be written as

$$Z_{\rm matrix} = \int dU\,\Theta(U|q). \tag{4.35}$$

This gives another derivation of (2.1).

4.2.2. Conifold and more general geometries. Let us next describe the conifold in the non-commutative chamber. Again, we define $h_k(t)$ by (4.21). We then have the non-intersecting condition (4.22) and the boundary condition (4.23). We also have, from (4.3),

1. When $t$ is odd,

$$h_k(t+1) - h_k(t) = 0 \ \text{ or } \ \sigma\!\left(t + \tfrac{1}{2}\right). \tag{4.36}$$

2. When $t$ is even,

$$\cdots \le h_{k-1}(t+1) < h_k(t) \le h_k(t+1) < h_{k+1}(t) \le \cdots \tag{4.37}$$

for $t \ge 0$ and

$$\cdots \le h_{k-1}(t) < h_k(t+1) \le h_k(t) < h_{k+1}(t+1) \le \cdots \tag{4.38}$$

for $t < 0$.

The set of coordinates $h_k(t)$ satisfying the conditions above can be expressed as coordinates of non-intersecting paths on the oriented graph shown in Fig. 8. Let us show that this is indeed the case. When $t$ is odd, the story is similar to the $\mathbb{C}^3$ example; we have two possibilities in (4.36), which correspond to the two arrows starting from a vertex of the oriented graph, one with the same height and another with the height increasing/decreasing by one unit. The situation changes when $t$ is even; during one time unit, after going one unit horizontally we can choose to go vertically as much as we want, as long as we respect the non-intersecting condition. In other words, the intersection of the $k$-th path and the time slice $t$ is an oriented interval, and when we take $h_k(t)$ to be the value at the endpoint of the oriented interval, the interval is expressed as $[h_k(t), h_k(t+1)]$ (this is for $t < 0$; for $t \ge 0$, we have $[h_k(t+1), h_k(t)]$ instead). This means that the conditions (4.37), (4.38) are translated into the non-intersecting conditions for paths. In the vertex operator formalism explained in Sect. 4.1.2, $t$ odd ($t$ even) corresponds to $\Gamma_\pm$ ($\Gamma'_\pm$).

The procedure to obtain the matrix model is similar to the $\mathbb{C}^3$ example. The multi-matrix model is given as follows:

$$Z_{\rm matrix}(q) = \int \prod_t dH(t) \prod_{t'} d\Phi(t')\ \frac{\Delta\big(e^{-i\phi(-M-\frac{1}{2})}\big)}{\Delta\big(\phi(-M-\frac{1}{2})\big)}\,\frac{\Delta\big(e^{i\phi(M+\frac{1}{2})}\big)}{\Delta\big(\phi(M+\frac{1}{2})\big)} \times \prod_t q_t^{\,{\rm Tr}\,H(t)} \prod_{t'} e^{-{\rm Tr}\,V_{t'}(\Phi(t'))}\, e^{{\rm Tr}\, i\Phi(t')\left(H(t'+\frac{1}{2}) - H(t'-\frac{1}{2})\right)}. \tag{4.39}$$


H. Ooguri, P. Sułkowski, M. Yamazaki

Fig. 8. Top: an oriented graph for the conifold. Middle: an example of 3 non-intersecting paths on the graph shown by thick arrows. Bottom: the corresponding evolution of Young diagrams


The main differences from the $\mathbb{C}^3$ example are that we have two parameters which depend on time $t$ as

$$q_t = \begin{cases} q_0 & (t:\ \text{odd}), \\ q_1 & (t:\ \text{even}), \end{cases} \qquad (4.40)$$

and that the potential takes the form

$$e^{-V_{t'}(\phi)} = \begin{cases} 1 + e^{i\phi\sigma(t')} & (t':\ \text{odd}), \\ 1/\big(1 - e^{i\phi\sigma(t')}\big) & (t':\ \text{even}). \end{cases} \qquad (4.41)$$

When $t$ is odd, there are 2 possibilities for $h_k(t+1) - h_k(t)$, which is the reason for the 2 terms in the potential. When $t$ is even, $e^{-V_{t'}(\phi)} = 1 + e^{-i\phi\sigma(t')} + e^{-2i\phi\sigma(t')} + \cdots$; this reflects the condition (which is part of (4.38)) $h_k(t+1) = h_k(t) + m\,\sigma(t+1/2)$, $m = 1, 2, 3, \dots$, i.e., $h_k(t+1) \ge h_k(t)$. The remaining conditions of (4.38) are taken care of by the non-intersecting condition. Again, we can integrate out the $H(t)$'s, and the matrix model simplifies. When we diagonalize it, we finally have

$$Z_{\rm matrix} = \int dU \det\left(\frac{\Theta(U|q)}{\Theta(QU|q)}\right), \qquad (4.42)$$

where $U \equiv e^{i\Phi(-\frac12)}$, $q \equiv q_0 q_1$ and $Q = -q_1$. The sign in $Q$ comes from the localization [40], which should properly be taken into account in the definition of the matrix model.

We can also repeat the same analysis for more general geometries. In particular, in the non-commutative chamber we obtain the results presented in Sect. 2.4. The oriented graph can be constructed from the data of $S_i$, by combining the 4 basic patterns (corresponding to $\Gamma_+$, $\Gamma_-$, $\Gamma'_+$, $\Gamma'_-$) for each $i \in \mathbb{Z}$; see Fig. 9. As an example, the oriented graph for SPP in the noncommutative chamber with $S_1 = +1$, $S_2 = -1$, $S_3 = -1$ is given in Fig. 9.

We stress that this approach is equivalent to the fermionic picture described earlier. For example, the sum over all possible paths in the region $t < 0$ (or $t > 0$) is encoded in the state $\langle\Omega_+|$ (respectively $|\Omega_-\rangle$). These states live in the Fock space associated to $t = 0$, and can be expressed in terms of a sum over two-dimensional partitions from both fermionic and non-intersecting paths viewpoints. The correlator (4.7) represents gluing paths extending in the $t < 0$ region with paths in the $t > 0$ region in a consistent way.

Since the evolution rules of Young diagrams in more general chambers are already given in [6,7,36], it is in principle straightforward to generalize the analysis to more general chambers.

5. Discussion

In this paper, we derived unitary matrix models of infinite-size matrices, which give the counting of BPS bound states of D0 and D2-branes bound to a single D6-brane wrapping a toric Calabi-Yau manifold $X$ without compact 4-cycles. These matrix models depend on a set of parameters $Q$, which keep track of the BPS charges, and the chamber parameters $n$. Both $Q$ and $n$ are associated to the Kähler moduli space $M(X)$ of $X$. It turned out that these matrix models define the topological string on another Calabi-Yau


Fig. 9. Top: An oriented graph in the general case is constructed by combining the 4 basic types of graphs shown here. Bottom: An oriented graph for SPP, with $S_1 = +1$, $S_2 = -1$, $S_3 = -1$

manifold $Y$, whose moduli space contains two copies of $M(X)$. The parameters $Q$ and $n$ are unified as the Kähler moduli of $Y$. In addition, when the 't Hooft coupling $g_s N$ is finite, we found a yet more general manifold $\widetilde{Y}$. In the crystal model this finite 't Hooft coupling has an interpretation of restricting a crystal configuration by a wall located at position $N$, and then the limit $N \to \infty$ provides a mathematically rigorous definition of our models. The relation between the BPS counting on $X$ and the topological string on $Y$ is clearest in the commutative and the non-commutative chambers. In other chambers, there is a non-trivial prefactor in the relation between the BPS partition function and the matrix


model partition function. We hope to understand the origin and the nature of the prefactor better. Our methods provide a rigorous derivation of matrix models and spectral curves, which encode the mirror map expected from the remodeling conjecture [14]. In this context it is interesting to note the subtlety related to the counting of MacMahon factors. For example, in the conifold example in the commutative chamber with either Q = 0 or e−T = 0, we have one power of MacMahon function M(q), which agrees with topological string result and Chern-Simons partition function. However there is a mismatch by M(q)1/2 between our matrix model integral formula and the topological string partition function for SPP. Similar mismatches arise in matrix models derived in [17,18,21]. The notion of the spectral curve also exists in the dimer model. In [30], which discusses the thermodynamic limit of the crystal melting model, it was proven using the results of [41], that genus 0 contribution of the DT partition function in the noncommutative chamber agrees with the genus 0 part of the topological string on the spectral curve of the dimer model, which is the mirror of X . An interesting problem is to understand how the spectral curve of the matrix model is related to that of the dimer model. The holomorphic anomaly equations of topological string amplitudes can be interpreted as the manifestation of their background independence [42,43]. The relation between the BPS partition function on X and the topological string on Y suggests that the wall crossing phenomenon on X may be related to the background independence on Y . In this context it would also be interesting to relate our analysis directly to the continuous limit of Kontsevich-Soibelman equations [44]. In this paper we considered bound states of D6-D2-D0 branes. This analysis can be extended, both from M-theory and matrix model, to include an additional D4-brane and associated open BPS invariants [49,50]. 
Refined versions of our results can also be found using similar techniques [51].

Acknowledgements. We thank Mina Aganagic, Vincent Bouchard, Kentaro Hori, and Yan Soibelman for discussions. H. O. and P. S. thank Hermann Nicolai and the Max-Planck-Institut für Gravitationsphysik for hospitality. Our work is supported in part by the DOE grant DE-FG03-92-ER40701. H. O. and M. Y. are also supported in part by the World Premier International Research Center Initiative of MEXT. H. O. is supported in part by JSPS Grant-in-Aid for Scientific Research (C) 20540256 and by the Humboldt Research Award. P. S. acknowledges the support of the European Commission under the Marie-Curie International Outgoing Fellowship Programme and the Foundation for Polish Science. M. Y. is supported in part by the JSPS Research Fellowship for Young Scientists and the Global COE Program for Physical Science Frontier at the University of Tokyo.

Open Access. This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

A. Unitary Measure and the Migdal Integral

Matrix models derived in this paper, either from the fermionic or non-intersecting paths viewpoint, are of the form

$$Z_{\rm matrix} = \int dU\, e^{-\frac{1}{g_s}\mathrm{Tr}\,V_{\rm unitary}(U)},$$

where the unitary measure, after diagonalization $U = \mathrm{diag}(u_1, \dots, u_n)$ with eigenvalues $u_i = e^{i\phi_i}$, takes the form

$$dU = \prod_k d\phi_k \prod_{i<j} \big(e^{i\phi_i} - e^{i\phi_j}\big)\big(e^{-i\phi_i} - e^{-i\phi_j}\big).$$


This measure can be turned into the form involving the standard Vandermonde determinant, $dU \to \prod_k du_k \prod_{i<j} (u_i - u_j)^2$, at the expense of introducing an additional term $T \log U$ to the matrix potential:

$$V_{\rm unitary}(U) \to V(U) = V_{\rm unitary}(U) + T \log U, \qquad T = g_s N. \qquad (A.1)$$
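The origin of the extra $T \log U$ term can be checked numerically. The following is a minimal sketch, not part of the original derivation; the function names are illustrative. For $u_k = e^{i\phi_k}$ one has $(u_i - u_j)(\bar u_i - \bar u_j) = -(u_i - u_j)^2/(u_i u_j)$, so the unitary factor equals $(-1)^{N(N-1)/2} \prod_{i<j}(u_i - u_j)^2 \prod_k u_k^{-(N-1)}$. Together with $d\phi_k = du_k/(i u_k)$ this gives an overall factor $\prod_k u_k^{-N} = e^{-\frac{1}{g_s} T\, \mathrm{Tr} \log U}$ with $T = g_s N$.

```python
import cmath
import math
from itertools import combinations

def unitary_measure_factor(phis):
    """prod_{i<j} (e^{i phi_i} - e^{i phi_j}) (e^{-i phi_i} - e^{-i phi_j})."""
    value = 1.0 + 0.0j
    for a, b in combinations(phis, 2):
        value *= (cmath.exp(1j * a) - cmath.exp(1j * b)) * (
            cmath.exp(-1j * a) - cmath.exp(-1j * b)
        )
    return value

def vandermonde_form(phis):
    """(-1)^{N(N-1)/2} prod_{i<j} (u_i - u_j)^2 prod_k u_k^{-(N-1)}, with u_k = e^{i phi_k}."""
    n = len(phis)
    us = [cmath.exp(1j * p) for p in phis]
    vandermonde_sq = math.prod((a - b) ** 2 for a, b in combinations(us, 2))
    sign = (-1) ** (n * (n - 1) // 2)
    return sign * vandermonde_sq * math.prod(u ** (-(n - 1)) for u in us)

def sine_form(phis):
    """Equivalent real form: prod_{i<j} 4 sin^2((phi_i - phi_j) / 2)."""
    return math.prod(
        4.0 * math.sin((a - b) / 2.0) ** 2 for a, b in combinations(phis, 2)
    )

if __name__ == "__main__":
    sample = [0.3, 1.1, 2.5, 4.0]  # arbitrary test angles
    print(unitary_measure_factor(sample), vandermonde_form(sample), sine_form(sample))
```

All three expressions should agree up to floating-point error for any choice of angles.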

To find the resolvent for a compact domain of the eigenvalue distribution, arising from the initial unitary matrix ensemble, one can use the results of [52]. Namely, the resolvent $\omega(u)$ of the resulting matrix model can be found using the Migdal integral, as also explained in [27] and confirmed in explicit computations e.g. in [18,45]. In the case of the one-cut matrix model this integral takes the form

$$\omega(u) = \frac{1}{2T} \oint \frac{dz}{2\pi i}\, \frac{\partial_z V(z)}{u - z}\, \frac{\sqrt{(u-a_+)(u-a_-)}}{\sqrt{(z-a_+)(z-a_-)}}, \qquad (A.2)$$

where the integration contour encircles counter-clockwise the two endpoints of the cut $a_\pm$. In computing such Migdal integrals we often come across the situation where the derivative of the potential $\partial_z V(z)$ contains terms of the form $\frac{\log(z+c)}{z}$. In this case we find

$$\omega_c(u) = \frac{1}{2T} \oint \frac{dz}{2\pi i}\, \frac{\log(z+c)}{z(u-z)}\, \frac{\sqrt{(u-a_+)(u-a_-)}}{\sqrt{(z-a_+)(z-a_-)}}$$
$$= -\frac{1}{2uT} \log\left( \frac{\big(\sqrt{(a_+ + c)(a_- - u)} - \sqrt{(a_- + c)(a_+ - u)}\big)^2}{(u+c)\big(\sqrt{a_- - u} - \sqrt{a_+ - u}\big)^2} \right) - \frac{\sqrt{(u-a_+)(u-a_-)}}{2uT\sqrt{a_+ a_-}} \log\left( \frac{\big(\sqrt{(a_+ + c)\,a_-} - \sqrt{(a_- + c)\,a_+}\big)^2}{c\,\big(\sqrt{a_+} - \sqrt{a_-}\big)^2} \right). \qquad (A.3)$$

This result arises from contour integrals around poles at $z = 0$ and $z = u$, as well as along the branch cut of the logarithm $(-\infty, -c)$. To find the latter contributions the following integral is useful:

$$\int \frac{dx}{(x-u)\sqrt{(x-a)(x-b)}} = -\frac{1}{\sqrt{(u-a)(u-b)}} \times \log\left( \frac{\big(\sqrt{(x-a)(b-u)} - \sqrt{(x-b)(a-u)}\big)^2}{(u-x)\sqrt{(u-a)(u-b)}} \right).$$

In particular, for the conifold matrix model with the potential given in (3.6), the resolvent can be expressed as

$$\omega(u) = \omega_{Qe^\tau}(u) - \omega_Q(u) + \frac{T - \log(Qe^\tau)}{2T}\, \frac{\sqrt{(u-a_+)(u-a_-)}}{u\sqrt{a_+ a_-}} + \frac{1}{u}. \qquad (A.4)$$

In consequence we find that the resolvent is given by a sum of two terms, which in the limit $u \to \infty$ are respectively constant and of order $1/u$. Imposing the asymptotic condition on the resolvent $\omega(u) \sim 1/u$ given in (3.11) implies that the constant term must vanish, while the $\sim 1/u$ term must have a proper coefficient. This leads to the result (3.10), and moreover gives rise to the two equations (3.12) and (3.13) for the endpoints of


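the cut $a_\pm$. The solution to these equations is given in (3.14). For various computations concerning this conifold example it is advantageous to use the identities

$$a_+ a_- = \left(\frac{1 - Q\epsilon^2}{1 - \mu\epsilon^2}\right)^2, \qquad (a_+ + Q)(a_- + Q) = \left(\frac{1 - Q(1 - \epsilon^2 + \mu\epsilon^2)}{1 - \mu\epsilon^2}\right)^2, \qquad (1 + a_+\mu)(1 + a_-\mu) = \left(\frac{1 - \mu(1 - \epsilon^2 + Q\epsilon^2)}{1 - \mu\epsilon^2}\right)^2.$$

B. Free Fermion Formalism

For completeness we review the free fermion formalism [46], following the conventions of [6,35]. We start with the Heisenberg algebra $[\alpha_m, \alpha_{-n}] = n\,\delta_{m,n}$ and define

$$\Gamma_\pm(x) = \exp\Big(\sum_{n>0} \frac{x^n}{n}\,\alpha_{\pm n}\Big), \qquad \Gamma'_\pm(x) = \exp\Big(\sum_{n>0} \frac{(-1)^{n-1} x^n}{n}\,\alpha_{\pm n}\Big).$$

They act on fermionic states $|\mu\rangle$ corresponding to partitions $\mu$ as

$$\Gamma_-(x)|\mu\rangle = \sum_{\lambda \succ \mu} x^{|\lambda|-|\mu|}\,|\lambda\rangle, \qquad \Gamma_+(x)|\mu\rangle = \sum_{\mu \succ \lambda} x^{|\mu|-|\lambda|}\,|\lambda\rangle, \qquad (B.1)$$

$$\Gamma'_-(x)|\mu\rangle = \sum_{\lambda \overset{+}{\succ} \mu} x^{|\lambda|-|\mu|}\,|\lambda\rangle, \qquad \Gamma'_+(x)|\mu\rangle = \sum_{\mu \overset{+}{\succ} \lambda} x^{|\mu|-|\lambda|}\,|\lambda\rangle, \qquad (B.2)$$

where $\succ$ and $\overset{+}{\succ}$ are interlacing relations defined in (4.2) and (4.4). These operators satisfy the commutation relations

$$\Gamma_+(x)\,\Gamma_-(y) = \frac{1}{1 - xy}\,\Gamma_-(y)\,\Gamma_+(x), \qquad (B.3)$$
$$\Gamma'_+(x)\,\Gamma'_-(y) = \frac{1}{1 - xy}\,\Gamma'_-(y)\,\Gamma'_+(x), \qquad (B.4)$$
$$\Gamma'_+(x)\,\Gamma_-(y) = (1 + xy)\,\Gamma_-(y)\,\Gamma'_+(x), \qquad (B.5)$$
$$\Gamma_+(x)\,\Gamma'_-(y) = (1 + xy)\,\Gamma'_-(y)\,\Gamma_+(x). \qquad (B.6)$$

We also introduce various colors $q_g$ and the corresponding operators $\widehat{Q}_g$,

$$\widehat{Q}_g|\lambda\rangle = q_g^{|\lambda|}\,|\lambda\rangle.$$

They commute with the $\Gamma$ operators as

$$\Gamma_+(x)\,\widehat{Q}_g = \widehat{Q}_g\,\Gamma_+(x q_g), \qquad \widehat{Q}_g\,\Gamma_-(x) = \Gamma_-(x q_g)\,\widehat{Q}_g, \qquad (B.7)$$
$$\Gamma'_+(x)\,\widehat{Q}_g = \widehat{Q}_g\,\Gamma'_+(x q_g), \qquad \widehat{Q}_g\,\Gamma'_-(x) = \Gamma'_-(x q_g)\,\widehat{Q}_g. \qquad (B.8)$$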
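The commutation relations (B.3) and (B.6) can be spot-checked on the vacuum using only the actions (B.1) and (B.2). The sketch below is illustrative and rests on two assumptions, since (4.2) and (4.4) are not reproduced here: that $\lambda \succ \mu$ means $\lambda_1 \ge \mu_1 \ge \lambda_2 \ge \mu_2 \ge \cdots$, and that the second interlacing relation is the same condition on the transposed diagrams. Because $\Gamma_+(x)|\emptyset\rangle = |\emptyset\rangle$ and $\langle\emptyset|\Gamma_-(y) = \langle\emptyset|$, the coefficient of $(xy)^n$ in $\langle\emptyset|\Gamma_+(x)\Gamma_-(y)|\emptyset\rangle$ should be 1 for every $n$ (the expansion of $1/(1-xy)$), while in $\langle\emptyset|\Gamma_+(x)\Gamma'_-(y)|\emptyset\rangle$ it should be 1 for $n \le 1$ and 0 otherwise (the expansion of $1 + xy$).

```python
def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples of positive integers."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def transpose(lam):
    """Conjugate (transposed) partition."""
    if not lam:
        return ()
    return tuple(sum(1 for part in lam if part > i) for i in range(lam[0]))

def interlaces(lam, mu):
    """Assumed convention: lam_1 >= mu_1 >= lam_2 >= mu_2 >= ..."""
    size = max(len(lam), len(mu)) + 1
    l = list(lam) + [0] * (size - len(lam))
    m = list(mu) + [0] * (size - len(mu))
    return all(l[i] >= m[i] for i in range(size)) and all(
        m[i] >= l[i + 1] for i in range(size - 1)
    )

def vev_coefficients(max_n, primed_creation):
    """Coefficients of (xy)^n, n = 0..max_n, in <0| Gamma_+(x) C(y) |0>,
    where C is Gamma'_- if primed_creation else Gamma_-."""
    coeffs = []
    for n in range(max_n + 1):
        count = 0
        for lam in partitions(n):
            if primed_creation:
                created = interlaces(transpose(lam), ())
            else:
                created = interlaces(lam, ())
            annihilated = interlaces(lam, ())  # Gamma_+ must bring lam back to the vacuum
            if created and annihilated:
                count += 1
        coeffs.append(count)
    return coeffs

if __name__ == "__main__":
    print(vev_coefficients(5, primed_creation=False))
    print(vev_coefficients(5, primed_creation=True))
```

Only single-row diagrams interlace with the empty partition, and only the empty diagram and the single box are simultaneously a single row and a single column, which is why the two series truncate differently.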


C. LGV Formula

In this appendix we explain the Lindström-Gessel-Viennot (LGV) formula [47,48], which is crucial for the derivation of the matrix model in Sect. 4.2. Consider an oriented graph without closed loops. We assume that a weight $w(e)$ is assigned to each edge $e$ of the graph. We consider $N$ particles which follow paths $p_i$, each starting at a vertex $a_i$ and ending at $b_i$ ($i = 1, \dots, N$). For such paths $P = \{p_i : a_i \to b_i\}$, we assign a weight

$$w(p_i) = \prod_{e \in p_i} w(e). \qquad (C.1)$$

What we want to compute is the quantity

$$F(\{a_i\}, \{b_i\}) = \sum_{P:\ \text{non-intersecting}}\ \prod_i w(p_i), \qquad (C.2)$$

where the summation is over non-intersecting paths. The LGV formula states that this can be computed by summing over general (meaning, including intersecting) paths. More precisely, when we define the "Green function"

$$G(a_i, b_j) = \sum_{p:\ \text{a path from } a_i \text{ to } b_j} w(p), \qquad (C.3)$$

then the LGV formula states that

$$F(\{a_i\}, \{b_i\}) = \det_{i,j}\big(G(a_i, b_j)\big). \qquad (C.4)$$
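A small brute-force check of (C.4) can make the statement concrete. This is a minimal sketch under assumptions not taken from the text: the graph is the square lattice with unit-weight edges pointing east and north, and the start and end points lie on anti-diagonals, so that non-intersecting families can only connect $a_i$ to $b_i$. In this setting $G(a, b)$ is a binomial coefficient. All function names are illustrative.

```python
from itertools import permutations, product
from math import comb

def green(a, b):
    """Number of east/north lattice paths from a to b (all edge weights equal to 1)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return comb(dx + dy, dx) if dx >= 0 and dy >= 0 else 0

def determinant(matrix):
    """Exact integer determinant via the Leibniz expansion (fine for small N)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = (-1) ** inversions
        for i in range(n):
            term *= matrix[i][perm[i]]
        total += term
    return total

def lattice_paths(a, b):
    """All east/north paths from a to b, each given as a list of vertices."""
    if a == b:
        return [[a]]
    found = []
    if a[0] < b[0]:
        found += [[a] + rest for rest in lattice_paths((a[0] + 1, a[1]), b)]
    if a[1] < b[1]:
        found += [[a] + rest for rest in lattice_paths((a[0], a[1] + 1), b)]
    return found

def count_nonintersecting(starts, ends):
    """Brute-force count of vertex-disjoint path families with p_i : a_i -> b_i."""
    families = product(*(lattice_paths(a, b) for a, b in zip(starts, ends)))
    count = 0
    for family in families:
        vertices = [v for path in family for v in path]
        if len(vertices) == len(set(vertices)):
            count += 1
    return count

def lgv_count(starts, ends):
    """Right-hand side of (C.4): determinant of the Green-function matrix."""
    return determinant([[green(a, b) for b in ends] for a in starts])

if __name__ == "__main__":
    starts, ends = [(0, 1), (1, 0)], [(1, 2), (2, 1)]
    print(lgv_count(starts, ends), count_nonintersecting(starts, ends))
```

For the two-particle example in the `__main__` block the Green-function matrix is [[2, 1], [1, 2]], so the determinant is 3; the direct enumeration finds the same three non-intersecting pairs.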

The proof is elementary, and proceeds by checking that contributions from intersecting paths cancel out due to the sign in the definition of the determinant. The determinant in the formula can be thought of as a discretized version of a Vandermonde determinant for free fermions, representing the Coulomb repulsions among particles.

Now consider a more general situation. Suppose that we are given a set of vertices $\{a_i(k)\}$, where $k = 1, \dots, L$. We consider $N$ particles, with the following condition: the $i$-th particle starts from $a_i(0)$, goes through $a_i(1)$, then $a_i(2)$, ..., and finally arrives at $a_i(L)$. Then the multiplicative property of the determinant says that

$$\det_{i,j}\big(G(a_i(1), a_j(L))\big) = \prod_{k=1}^{L-1} \det_{i,j}\big(G(a_i(k), a_j(k+1))\big). \qquad (C.5)$$

This is the expression we need in the main text.

References

1. Gopakumar, R., Vafa, C.: M-theory and topological strings. I, http://arXiv.org/abs/hep-th/9809187v1, 1998; M-theory and topological strings. II, http://arXiv.org/abs/hep-th/9812127v1, 1998
2. Ooguri, H., Strominger, A., Vafa, C.: Black hole attractors and the topological string. Phys. Rev. D 70, 106007 (2004)
3. Aganagic, M., Ooguri, H., Vafa, C., Yamazaki, M.: Wall crossing and M-theory. Pub. RIMS 47, 569 (2011)
4. Okounkov, A., Reshetikhin, N., Vafa, C.: Quantum Calabi-Yau and classical crystals. http://arXiv.org/abs/hep-th/0309208v2, 2003


5. Ooguri, H., Yamazaki, M.: Crystal Melting and Toric Calabi-Yau Manifolds. Commun. Math. Phys. 292, 179 (2009)
6. Sułkowski, P.: Wall-crossing, free fermions and crystal melting. Commun. Math. Phys. 301, 517 (2011)
7. Nagao, K.: Non-commutative Donaldson-Thomas theory and vertex operators. http://arXiv.org/abs/0910.5477v4 [math.AG], 2010
8. Aganagic, M., Dijkgraaf, R., Klemm, A., Marino, M., Vafa, C.: Topological strings and integrable hierarchies. Commun. Math. Phys. 261, 451 (2006)
9. Dijkgraaf, R., Hollands, L., Sułkowski, P., Vafa, C.: Supersymmetric Gauge Theories, Intersecting Branes and Free Fermions. JHEP 0802, 106 (2008)
10. Dijkgraaf, R., Hollands, L., Sułkowski, P.: Quantum Curves and D-Modules. JHEP 0911, 047 (2009)
11. Eynard, B.: A Matrix model for plane partitions and TASEP. J. Stat. Mech. 0910, P10011 (2009)
12. Marino, M.: Chern-Simons theory, matrix integrals, and perturbative three-manifold invariants. Commun. Math. Phys. 253, 25 (2004)
13. Aganagic, M., Klemm, A., Marino, M., Vafa, C.: Matrix model as a mirror of Chern-Simons theory. JHEP 0402, 010 (2004)
14. Bouchard, V., Klemm, A., Marino, M., Pasquetti, S.: Remodeling the B-model. Commun. Math. Phys. 287, 117 (2009)
15. Eynard, B., Orantin, N.: Invariants of algebraic curves and topological expansion. http://arXiv.org/abs/math-ph/0702045v4, 2007
16. Okuda, T.: Derivation of Calabi-Yau crystals from Chern-Simons gauge theory. JHEP 0503, 047 (2005)
17. Eynard, B.: All orders asymptotic expansion of large partitions. J. Stat. Mech. 0807, P07023 (2008)
18. Klemm, A., Sułkowski, P.: Seiberg-Witten theory and matrix models. Nucl. Phys. B 819, 400 (2009)
19. Sułkowski, P.: Matrix models for 2* theories. Phys. Rev. D 80, 086006 (2009)
20. Sułkowski, P.: Matrix models for β-ensembles from Nekrasov partition functions. JHEP 1004, 063 (2010)
21. Eynard, B., Kashani-Poor, A. K., Marchal, O.: A matrix model for the topological string I: Deriving the matrix model. http://arXiv.org/abs/1003.1737v2 [hep-th], 2010
22. Dijkgraaf, R., Sułkowski, P., Vafa, C.: In progress
23. Aganagic, M.: In progress
24. Aganagic, M., Klemm, A., Marino, M., Vafa, C.: The topological vertex. Commun. Math. Phys. 254, 425 (2005)
25. Iqbal, A., Kashani-Poor, A. K.: The vertex on a strip. Adv. Theor. Math. Phys. 10, 317 (2006)
26. Dijkgraaf, R., Vafa, C.: Matrix models, topological strings, and supersymmetric gauge theories. Nucl. Phys. B 644, 3 (2002); On geometry and matrix models. Nucl. Phys. B 644, 21 (2002)
27. Marino, M.: Chern-Simons Theory, Matrix Models, And Topological Strings. Oxford: Oxford University Press, 2005
28. Sułkowski, P.: Crystal model for the closed topological vertex geometry. JHEP 0612, 030 (2006)
29. Imamura, Y., Isono, H., Kimura, K., Yamazaki, M.: Exactly marginal deformations of quiver gauge theories as seen from brane tilings. Prog. Theor. Phys. 117, 923 (2007)
30. Ooguri, H., Yamazaki, M.: Emergent Calabi-Yau Geometry. Phys. Rev. Lett. 102, 161601 (2009)
31. Yamazaki, M.: Crystal Melting and Wall Crossing Phenomena. Int. J. Mod. Phys. A 26, 1097–1228 (2011)
32. Hori, K., Vafa, C.: Mirror symmetry. http://arXiv.org/abs/hep-th/0002222v3, 2000
33. Witten, E.: Phases of N = 2 theories in two dimensions. Nucl. Phys. B 403, 159 (1993)
34. Ooguri, H., Vafa, C.: Worldsheet Derivation of a Large N Duality. Nucl. Phys. B 641, 3 (2002)
35. Bryan, J., Young, B.: Generating functions for colored 3D Young diagrams and the Donaldson-Thomas invariants of orbifolds. http://arXiv.org/abs/0802.3948v2 [math.CO], 2008
36. Nagao, K., Yamazaki, M.: The Non-commutative Topological Vertex and Wall Crossing Phenomena. http://arXiv.org/abs/0910.5479vL [hep-th], 2009
37. Ooguri, H., Vafa, C.: Knot invariants and topological strings. Nucl. Phys. B 577, 419 (2000)
38. Harish-Chandra: Differential operators on a semisimple Lie algebra. Amer. J. Math. 79, 87 (1957)
39. Itzykson, C., Zuber, J. B.: The Planar Approximation. 2. J. Math. Phys. 21, 411 (1980)
40. Mozgovoy, S., Reineke, M.: On the noncommutative Donaldson-Thomas invariants arising from brane tilings. http://arXiv.org/abs/0809.0117v2 [math.AG], 2008
41. Kenyon, R., Okounkov, A., Sheffield, S.: Dimers and Amoebae. http://arXiv.org/abs/math-ph/0311005v1, 2003
42. Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes. Commun. Math. Phys. 165, 311 (1994)
43. Witten, E.: Quantum background independence in string theory. http://arXiv.org/abs/hep-th/9306122v1, 1993
44. Kontsevich, M., Soibelman, Y.: Stability structures, motivic Donaldson-Thomas invariants and cluster transformations. http://arXiv.org/abs/0811.2435v1 [math.AG], 2008
45. Caporaso, N., Griguolo, L., Marino, M., Pasquetti, S., Seminara, D.: Phase transitions, double-scaling limit, and topological strings. Phys. Rev. D 75, 046004 (2007)


46. Jimbo, M., Miwa, T.: Solitons and Infinite Dimensional Lie Algebras. Kyoto University, RIMS 19, 943 (1983)
47. Lindström, B.: On the vector representations of induced matroids. Bull. London Math. Soc. 5, 85 (1973)
48. Gessel, I., Viennot, G.: Binomial determinants, paths, and hook length formulae. Adv. in Math. 58, 300 (1985)
49. Aganagic, M., Yamazaki, M.: Open BPS Wall Crossing and M-theory. Nucl. Phys. B 834, 258 (2010)
50. Sułkowski, P.: Wall-crossing, open BPS counting and matrix models. JHEP 1103, 089 (2011)
51. Sułkowski, P.: Refined matrix models from BPS counting. Phys. Rev. D 83, 085021 (2011)
52. Mandal, G.: Phase Structure Of Unitary Matrix Models. Mod. Phys. Lett. A 5, 1147–1158 (1990)

Communicated by A. Kapustin

Commun. Math. Phys. 307, 463–512 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1329-3


The Exoticness and Realisability of Twisted Haagerup–Izumi Modular Data

David E. Evans (1), Terry Gannon (2)

1 School of Mathematics, Cardiff University, Senghennydd Road, Cardiff CF24 4AG, Wales, UK. E-mail: [email protected]
2 Department of Mathematics, University of Alberta, Edmonton, AB T6G 2G1, Canada. E-mail: [email protected]

Received: 7 June 2010 / Accepted: 23 March 2011
Published online: 9 September 2011 – © Springer-Verlag 2011

Abstract: The quantum double of the Haagerup subfactor, the first irreducible finite depth subfactor with index above 4, is the most obvious candidate for exotic modular data. We show that its modular data DHg fits into a family $D^\omega Hg_{2n+1}$, where $n \ge 0$ and $\omega \in \mathbb{Z}_{2n+1}$. We show $D^0 Hg_{2n+1}$ is related to the subfactors Izumi hypothetically associates to the cyclic groups $\mathbb{Z}_{2n+1}$. Their modular data comes equipped with canonical and dual canonical modular invariants; we compute the corresponding alpha-inductions, etc. In addition, we show there are (respectively) 1, 2, 0 subfactors of Izumi type $\mathbb{Z}_7$, $\mathbb{Z}_9$ and $\mathbb{Z}_3^2$, and find numerical evidence for 2, 1, 1, 1, 2 subfactors of Izumi type $\mathbb{Z}_{11}$, $\mathbb{Z}_{13}$, $\mathbb{Z}_{15}$, $\mathbb{Z}_{17}$, $\mathbb{Z}_{19}$ (previously, Izumi had shown uniqueness for $\mathbb{Z}_3$ and $\mathbb{Z}_5$), and we identify their modular data. We explain how DHg (more generally $D^\omega Hg_{2n+1}$) is a graft of the quantum double DSym(3) (resp. the twisted double $D^\omega D_{2n+1}$) by affine so(13) (resp. $so(4n^2 + 4n + 5)$) at level 2. We discuss the vertex operator algebra (or conformal field theory) realisation of the modular data $D^\omega Hg_{2n+1}$. For example we show there are exactly 2 possible character vectors (giving graded dimensions of all modules) for the Haagerup VOA at central charge $c = 8$. It seems unlikely that any of this twisted Haagerup-Izumi modular data can be regarded as exotic, in any reasonable sense.

Contents

1. Introduction
   1.1 Modular data and modular invariants
   1.2 Subfactors, quantum doubles and tube algebras
   1.3 The canonical and dual canonical modular invariants
   1.4 VOAs and vector-valued modular functions
2. Comparing the Haagerup, Sym(3) and SO(13)
   2.1 The Tube Algebra of Sym(3)
   2.2 The Haagerup tube algebra
   2.3 Haagerup modular data DHg
   2.4 Clarifying the Haagerup-Sym(3) relation
   2.5 The Haagerup and SO(13)
3. Generalising the Modular Data of the Haagerup
   3.1 Dihedral groups and orthogonal algebras
   3.2 Generalising the Haagerup modular data
   3.3 Further generalisations
   3.4 Miscellanea involving modular invariants
4. Subfactors for Haagerup–Izumi Modular Data
   4.1 Izumi's subfactors and their modular data
   4.2 Principal graphs, α-induction, etc. for Izumi's subfactors
5. VOAs for Haagerup-Izumi Modular Data
   5.1 The Haagerup-dihedral diamond
   5.2 Character vectors for the untwisted Haagerup double
       5.2.1 Untwisted Haagerup at central charge $c \equiv_{24} 16$
       5.2.2 Untwisted Haagerup at central charge $c \equiv_{24} 0$
   5.3 Character vectors of the twisted Haagerup double
       5.3.1 The 1-twisted Haagerup at central charge $c \equiv_{24} 8$
       5.3.2 The 1-twisted Haagerup at central charge $c \equiv_{24} 16$
       5.3.3 The 1-twisted Haagerup at central charge $c \equiv_{24} 0$
       5.3.4 The 2-twisted Haagerup at central charge $c \equiv_{24} 8$
       5.3.5 The 2-twisted Haagerup at central charge $c \equiv_{24} 16$
       5.3.6 The 2-twisted Haagerup at central charge $c \equiv_{24} 0$
References

1. Introduction From the early years of conformal field theory (CFT) (see e.g. [51,60]), we find: Speculation. The standard constructions (orbifolds, cosets, simple-current extensions, . . .) applied to the basic theories (lattice compactifications, affine Kac-Moody algebras, . . .) exhaust all rational theories. Perhaps this should be applied to the modular tensor category rather than the full CFT, in which case this would constitute a sort of generalised Tannaka-Krein duality holding for modular tensor categories, where the dual to the category (i.e. the analogue of the compact group) is (say) a vertex operator algebra constructed in a standard way from the basic examples. Possible counterexamples have been around just as long. For instance Walton [59] explained that rank-level duality applied to conformal embeddings of affine algebra CFTs (the simplest being A1,10 ⊂ C2,1 , yielding an extension of A9,2 ) yield what seem to be new CFTs. The possibility that rank-level duality is somehow inherently sick was eliminated by Xu [61], who realised many of these examples with completely rational nets of subfactors (these should correspond to rational CFT). Perhaps all that this accomplishes is to insist that rank-level duality (or in subfactor language the mirror) should be included as one of those standard constructions. More recently, [19] proposed four modular data (each with 6 or 7 primaries) as possible counterexamples to this speculation, but didn’t show they could be realised by a rational CFT (or vertex operator algebra or net of subfactors). Perhaps the most obvious place to look for a truly exotic example is the Haagerup subfactor [1]. It is the first irreducible finite depth subfactor with index greater than 4 —

Exoticness and Realisability of Twisted Haagerup–Izumi Modular Data


it has index $(5 + \sqrt{13})/2 \approx 4.30278$. It is generally regarded as exotic, since (so far) it can only be constructed by hand without any natural algebraic symmetries. However it is nonbraided and so to get a braided system (which has a chance to correspond to the fusion ring and modular data of a rational CFT) we should take its quantum double, or equivalently asymptotic inclusion (see Ocneanu as in [25, Chap. 12]) or Longo-Rehren inclusion [49,50]. Its modular data DHg was computed in [38], and subfactor realisations of its modular invariants were studied in [26].

Question 1. Is this Haagerup quantum double realized by a rational CFT (or rational vertex operator algebra, see Definition 3 below, or completely rational net of subfactors [45]), in the sense that they share the same modular data DHg?

We would want a deeper relation between the Haagerup double and the corresponding vertex operator algebra (VOA) than merely that their modular data coincide, but for now that can suffice. A more direct realisation of the Haagerup subfactor $N \subset M$ in a rational CFT could be that its $N$-$N$ sectors (say) form the algebra of defect lines (or full system) of a rational CFT. After all this will in general fail to be commutative. However, even this cannot happen: the modular data of the quantum double of a full system must be in factorised form $X \otimes X^{\rm opp}$. This is because the quantum double of the full system is equivalent to the double of the fusion algebra $X$ by [52, 53, Cor. 2.2], and this double factors as $X \otimes X^{\rm opp}$ by [24, Prop. 2.2] as long as the braiding on $X$ is non-degenerate. However, the modular data of the double of the Haagerup is easily seen to not be in factorised form [35]. There remains the possibility, which we won't explore in this paper, that the Haagerup systems are the full system of a degenerately braided system. Note that much of the theory of modular invariants, including alpha-induction and the full system, holds for degenerately braided systems [8–12].
Question 2. If DHg is realised by a rational VOA, say, then is that VOA exotic, in the sense that it cannot be constructed from standard methods and examples? The three known constructions of the Haagerup subfactor (namely Haagerup’s connection computation [1,34], Izumi’s Cuntz algebra construction [38] and Peters’ planar algebra construction [54]) are largely combinatorial tours de force (although Izumi’s suggests an underlying cyclic group), and for this reason the Haagerup subfactor is generally regarded as exotic. Even given this, it is conceivable that its quantum double DHg could be constructed directly in more standard ways. [35] confirms that DHg does not fall into the simplest possibilities, namely the modular data of an affine algebra, lattice, or finite group. We argue that the answer to Question 1 is yes, and that for Question 2 is no. In particular, we explain why such a VOA should be a conformal subalgebra of the central charge c = 8 VOA V(E 6 A2 ) corresponding to (root) lattice E 6 ⊕ A2 (by conformal subalgebra we mean a subVOA of identical central charge and identical conformal vector). A more familiar conformal subVOA of V(E 6 A2 ) is an order-2 orbifold, realising the modular data DS3 of the quantum double of the symmetric group Sym(3) = S3 . An early indication for us that the Haagerup is related to S3 was the similarity of their tube algebras. Tube algebras were introduced by Ocneanu (see e.g. [25]) as a way of computing the irreducible objects and fusion rules of the quantum doubles of subfactors, and developed further by Izumi [37,38] in particular in analysing and determining their modular data. The list of modular invariants and nimreps for the doubles of the Haagerup and S3 [15,26,27] are also strikingly similar, and in fact the modular invariants lead us to E 6 ⊕ A2 . The connection between modular data and VOAs is made much more explicit


using character vectors. A second, equally promising possibility for a VOA realisation of DHg, by GKO cosets using the affine algebra VOA $\mathcal{V}(B_{6,2})$, is discussed but not analysed in detail. We generalize the modular data DHg in two directions, by fitting it into an infinite sequence, and showing the $n$-th term in this sequence can be twisted by $\mathbb{Z}_{2n+1}$. The role of $S_3$ is now played by the twisted quantum double of $D_{2n+1}$, and $\mathcal{V}(E_6 A_2)$ (which realises the $D\mathbb{Z}_3$ modular data) by a holomorphic orbifold by $\mathbb{Z}_{2n+1}$. Subfactor realizations of $D^0 Hg_{2n+1}$ should be provided by Izumi's hypothetical generalisation of the Haagerup: we construct a unique subfactor of Izumi type which realises $D^0 Hg_\nu$ for all $1 \le \nu \le 19$ and conjecture that indeed this continues for every $n$. For this reason we suggest calling $D^0 Hg_{2n+1}$ Haagerup-Izumi modular data. von Neumann algebra realisations of twisted Haagerup-Izumi modular data $D^\omega Hg_{2n+1}$ are still unclear.

Question 3. Can we orbifold a VOA by something more general than a group?

If so, this should provide a simple construction of VOAs realising $D^\omega Hg_{2n+1}$, starting from a holomorphic orbifold by $\mathbb{Z}_{2n+1}$. Perhaps this is not that unreasonable: after all, a subfactor itself is really a generalised orbifold, in the sense that the quantum double of a subfactor $M^G \subset M$ of fixed points recovers DG. Indeed the whole motivation of subfactor theory (see e.g. [25]) is to understand non-grouplike quantum symmetries through subfactors or inclusions of factors $N \subset M$, and associated notions of $N$-$N$, $N$-$M$, $M$-$M$ irreducible bimodules or sectors, their fusion rules, paragroups, λ-lattices or planar algebras. This notion of generalised orbifold construction is also featured in the framework of [30].

Our work suggests a further generalisation. Consider a quadruple $(K, H, \alpha, \omega)$ where $K$ and $H$ are finite groups, and $H$ acts freely on $K$, i.e. the group homomorphism $\alpha : H \to \mathrm{Aut}(K)$ obeys $\alpha(h)k = k$ iff $h = e$ or $k = e$.
Freeness implies the projection h → Out(K ) is injective; such a semi-direct product D = K × α H is called a Frobenius group. The twist ω ∈ H 3 (B K ; T) should be compatible with α in the sense that it lies in the image of the natural map H 3 (B D ; T) → H 3 (B K ; T). The Haagerup subfactor is associated to (Z3 , Z2 , α(1) = −, [0]) and D = S3 . Wildly Optimistic Guess. Let (K , H, α, ω) be any quadruple defined above, corresponding to Frobenius group D = K × α H . There is an irreducible finite-depth subfactor associated to the pair (K , H ); the triple (K , H, α) is realised by a Q-system of endomorphisms. To any such quadruple there is a rational VOA V(K , H, α, ω), and a completely rational net of subfactors realising the modular data of this twisted quantum double. This VOA is a generalised orbifold (controlled in some sense by H ) of a holomorphic orbifold V K by K . This VOA V K also contains a holomorphic orbifold V D ; both V(K , H, α, ω) and V D contain a common rational VOA. For K an odd cyclic group and H = Z2 acting by n → −n, this recovers the modular data given in Sect. 3.2 and also a subfactor of Izumi type K , i.e. a solution to Eqs. (7.1)–(7.5) of [38]; the K = Z3 special case recovers the Haagerup subfactor. This guess suggests that the Haagerup subfactor itself cannot be regarded as exotic. But this guess is in some control only for the odd cyclic case discussed above, where it seems quite plausible. The reason for requiring freeness is that this condition plays an important role in the derivation of Dω Hgν in Sect. 3.2, as well as in [38]; probably it can be dropped but at a cost of complexity. The possibility that the Haagerup subfactor is related in some way to S3 appears to have been first made in [21]. The present paper clarifies, deepens, and generalises that relation. The remainder of this first section recalls modular data and modular invariants

(Sect. 1.1), reviews the basic theory of subfactors and the tube algebra (Sect. 1.2), and discusses the modular data and character vectors of VOAs (Sect. 1.4). We also explain in Sect. 1.3 how to recover the original subfactor from its quantum double; unlike the rest of Sect. 1, this subsection contains original material. Section 2 develops the relation between the Haagerup subfactor and S3. Section 3 puts the Haagerup modular data into a sequence and twists it. We also introduce the notion of grafting and discuss the role of affine so(ν² + 4) at level 2. Section 4 relates our untwisted sequence to Izumi’s hypothetical family of 3-star, 5-star, 7-star, . . . subfactors. Section 5 discusses VOA interpretations for our twisted sequence, and explains the Haagerup-dihedral diamond which generalises Sect. 2. The Haagerup subfactor arose in Haagerup’s classification [34] of irreducible finite depth subfactors of index between 4 and 3 + √3 ≈ 4.73205. Two other subfactors N ⊂ M (both also regarded as exotic) appear there: the Asaeda-Haagerup subfactor with Jones index (5 + √17)/2 ≈ 4.56155 [1], and the extended Haagerup subfactor [4] with index ≈ 4.37720. Both seem unrelated to our family, and neither lies in a known sequence. As with the Haagerup subfactor, neither their N-N nor M-M systems are braided. Of course Questions 1 and 2 should also be asked of them, but no serious work can begin until the modular data of their doubles has been computed.

1.1. Modular data and modular invariants. Modular data arises naturally in several contexts (for instance CFT, subfactors, and VOAs); see [31] for a review. In CFT, modular invariants correspond to the 1-loop closed string partition function.

Definition 1. Modular data (Φ, 0, S, T) consists of a finite set Φ, an element 0 ∈ Φ, and matrices S = (S_{i,j})_{i,j∈Φ} and T = (T_{i,j})_{i,j∈Φ}, such that:
a) S, T are unitary, S is symmetric, T is diagonal and of finite order;
b) S^2 = (ST)^3, S^4 = I, the identity matrix;
c) C := S^2 is a permutation matrix, and row 0 of S consists of nonzero real numbers;
d) for all i, j, k ∈ Φ, the following quantities are nonnegative integers:

N_{i,j}^{k} := \sum_{l \in \Phi} \frac{S_{i,l}\, S_{j,l}\, \overline{S_{k,l}}}{S_{0,l}}.   (1.1)

We call modular data (Φ, 0, S, T) and (Φ', 0', S', T') equivalent if there is a bijection φ : Φ → Φ' such that φ(0) = 0', S'_{φ(i),φ(j)} = S_{i,j} and T'_{φ(i),φ(i)} = T_{i,i} for all i, j ∈ Φ. The i ∈ Φ are called primaries and 0 ∈ Φ is called the vacuum. The order 1 or 2 permutation C is called charge-conjugation. The real number c defined by T_{0,0} = e^{-πic/12} (determined only mod 24 by the modular data) is called the central charge. Equation (1.1) is called the Verlinde formula; the quantities N_{i,j}^{k} are called fusion coefficients and comprise the structure constants of a commutative associative algebra called the fusion ring. The coefficients N_{i,j,k} := N_{i,j}^{Ck} are invariant under all 6 reorderings of indices. Also, N_{0,i,j} = δ_{i,Cj}. Condition b) says that modular data generates a representation ρ of the modular group SL2(Z) through the assignment

S = \rho\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad T = \rho\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.   (1.2)
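As a minimal numerical sketch of Definition 1, the following Python snippet builds a three-primary example with Φ = Z3 and checks condition b) together with the Verlinde formula (1.1). The example data (S_{k,l} = 3^{-1/2} ξ_3^{-2kl}, T_{k,k} = exp[2πi(k²/3 − 1/12)], so c = 2) is an assumption chosen for illustration and is not taken from the text; the function names are hypothetical.

```python
import numpy as np

def z3_modular_data():
    """Illustrative modular data on Phi = Z_3 (an assumed example with c = 2)."""
    k = np.arange(3)
    S = np.exp(-2j * np.pi * 2 * np.outer(k, k) / 3) / np.sqrt(3)
    T = np.diag(np.exp(2j * np.pi * (k**2 / 3 - 2 / 24)))
    return S, T

def verlinde(S):
    """Fusion coefficients N[i, j, k] = N_{i,j}^k from the Verlinde formula (1.1)."""
    # (S / S[0])[i, l] = S[i, l] / S[0, l]; einsum performs the sum over l.
    return np.einsum('il,jl,kl->ijk', S / S[0], S, S.conj())

S, T = z3_modular_data()
N = verlinde(S)

# Condition b): S^2 = (ST)^3.
relation_ok = np.allclose(S @ S, np.linalg.matrix_power(S @ T, 3))

# Round to integers; for this example the fusion ring is addition mod 3.
N_int = np.rint(N.real).astype(int)
```

In this example the only nonzero fusion coefficients are N_{i,j}^{i+j mod 3} = 1, and the complex conjugate in (1.1) is essential because C is not the identity.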

Most examples considered in this paper have the additional property that C = I, i.e. S is real and the representation factors through PSL2(Z). We assume throughout that the first row of S is strictly positive; such modular data is called unitary. From Definition 1 we obtain

S_{i,j} = \overline{T_{i,i}}\; \overline{T_{j,j}}\; T_{0,0} \sum_{k \in \Phi} T_{k,k}\, S_{k,0}\, N_{i,j}^{k}.   (1.3)

Thus if T is known exactly but S only approximately, (1.3) together with the integrality of the fusion coefficients (1.1) can be used to determine S exactly (the quantum-dimension S_{i,0}/S_{0,0} is the Perron-Frobenius eigenvalue of the matrix N_i = (N_{i,j}^{k}), and S_{0,0} > 0 is determined by the square \sum_i (S_{i,0}/S_{0,0})^2 of the global dimension 1/S_{0,0}). Equivalently, if S, T and S', T are both modular data sharing the same T, and S' is sufficiently close to S, then S' = S. This is how we’ll identify the modular data of the double of subfactors in Sect. 4.1, given numerical estimates of S. One consequence of this definition is that the entries of S and T lie in a cyclotomic extension Q[ξ_L] of the rationals (throughout this paper we write ξ_k for exp[2πi/k]). For any Galois automorphism σ ∈ Gal(Q[ξ_L]/Q) ≅ Z_L^×, (Φ, 0, σS, σT) will be (generally inequivalent) modular data with identical fusion coefficients, where σ acts entry-wise on S and T. It can be shown [14] that for any σ ∈ Gal(Q[ξ_L]/Q), there is a permutation j ↦ σj of Φ and choice s_σ(j) of signs such that

\sigma(S_{i,j}) = s_\sigma(i)\, S_{\sigma i, j} = s_\sigma(j)\, S_{i, \sigma j}.   (1.4)
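The parenthetical remark above, that the quantum dimensions S_{i,0}/S_{0,0} and then S_{0,0} itself can be read off from the fusion coefficients alone, can be sketched numerically as follows. The Ising fusion rules used as input are an assumed example, not data from this paper, and the function name is hypothetical.

```python
import numpy as np

def quantum_dimensions(fusion):
    """fusion[i][j][k] = N_{i,j}^k; returns (d, S00) with d[i] = S_{i,0}/S_{0,0}."""
    fusion = np.asarray(fusion, dtype=float)
    # The Perron-Frobenius eigenvalue of a nonnegative matrix is its spectral radius.
    d = np.array([max(abs(np.linalg.eigvals(Ni))) for Ni in fusion])
    # sum_i d_i^2 is the square of the global dimension 1/S00.
    S00 = 1.0 / np.sqrt(np.sum(d**2))
    return d, S00

# Assumed example: Ising fusion rules, primaries ordered (1, sigma, epsilon).
one = np.eye(3)
sigma = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
eps = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])
d, S00 = quantum_dimensions([one, sigma, eps])
```

For this input the dimensions come out as (1, √2, 1) and S_{0,0} = 1/2, which is the kind of exact value that numerical estimates of S are then matched against.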

An important class of examples of modular data comes from finite groups G. Write e for the identity and g h for h −1 gh. Select a cocycle ω ∈ Z 3 (BG ; T), where we write T for the unit circle in C; the corresponding modular data depends up to equivalence only on its class in H 3 (BG ; T). Define θg (h, l) = ω(h, g h , l) ω(g, h, l) ω(h, l, g hl ).

(1.5)

Then the restriction of each θa to the centraliser Za := {h ∈ G : ha = ah} in G is a normalised 2-cocycle. In this paper we are only interested in the cohomologically trivial case where all θa are coboundaries (this will happen for instance whenever the Schur multiplier H 2 (BZa ; T) of each centraliser Za is trivial). This means that there exist 1-cochains a : Za → T for which a (e) = 1 and both θa (h, g) = (δa )(h, g) = a (h) a (g) a (hg), x −1 ax (x −1 hx) = θa (x, x −1 hx) θa (h, x) a (h),

(1.6) (1.7)

for all g, h ∈ Za and x ∈ G. The primaries are all pairs (a, π ) where a runs through a = Irr(Za ) are irreducible reprerepresentatives of conjugacy classes in G and π ∈ Z sentations (irreps). The vacuum is (e, 1). The modular matrices are [16] 1 ω S(a,χ χ (ga g −1 ) χ (a g ) a (ga g −1 ) ga g−1 (a), (1.8a) ),(a ,χ ) = |Za ||Za |

g∈G(a,a )

ω T(a,χ ),(a ,χ ) = δa,a δχ ,χ

χ (a) a (a), χ (e)

(1.8b)

where G(a, a ) = {g ∈ G : a g b = ba g }. In the following we let Dω G denote this modular data. See e.g. [15–17] for more details; we recover this formula in the next subsection using tube algebras.
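The labelling of primaries by pairs (a, π) described above can be enumerated directly for a small group. The sketch below counts these pairs in the untwisted case, using the standard fact that the number of irreps of a finite group equals its number of conjugacy classes; representing G by permutation tuples and all function names are assumptions made for this illustration.

```python
from itertools import permutations

def compose(p, q):
    """(p*q)(i) = p(q(i)) for permutations stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition `group` (closed under composition) into classes {h^-1 g h}."""
    classes, seen = [], set()
    for g in group:
        if g not in seen:
            cls = {compose(compose(inverse(h), g), h) for h in group}
            classes.append(cls)
            seen |= cls
    return classes

def count_primaries(group):
    """Number of pairs (a, pi): a a class representative, pi an irrep of Z_a."""
    total = 0
    for cls in conjugacy_classes(group):
        a = next(iter(cls))
        centraliser = [h for h in group if compose(h, a) == compose(a, h)]
        # Number of irreps of Z_a = number of conjugacy classes of Z_a.
        total += len(conjugacy_classes(centraliser))
    return total

S3 = list(permutations(range(3)))
n_primaries = count_primaries(S3)
```

For Sym(3) the centralisers are S3, Z3 and Z2, so the count is 3 + 3 + 2 = 8 primaries.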

Another class of modular data is associated to even positive-definite lattices. Let L be such a lattice, and n its dimension. Then Φ = L*/L, where L* is the dual of L. The vacuum 0 corresponds to the coset [0]. Then

S_{[u],[v]} = |L^*/L|^{-1/2} \exp[2\pi i\, u \cdot v],   (1.9a)
T_{[u],[u]} = \exp[\pi i\, u \cdot u - \pi i\, n/12].   (1.9b)
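As a small sketch of (1.9), the snippet below evaluates S and T from a Gram matrix and a list of coset representatives of L*/L. The choice of the A1 root lattice as test case, the coordinate convention and the function name are assumptions of this illustration; the relation S² = (ST)³ is checked here only for that lattice, for which C = I.

```python
import numpy as np

def lattice_modular_data(gram, coset_reps):
    """S and T of (1.9) for a lattice with Gram matrix `gram`.

    coset_reps: representatives of L*/L, given as coordinate vectors
    with respect to the basis of L whose Gram matrix is `gram`.
    """
    G = np.asarray(gram, dtype=float)
    U = np.asarray(coset_reps, dtype=float)
    n = G.shape[0]                      # dimension of the lattice
    dots = U @ G @ U.T                  # dots[a, b] = u_a . u_b
    S = np.exp(2j * np.pi * dots) / np.sqrt(len(U))
    T = np.diag(np.exp(1j * np.pi * np.diag(dots) - 1j * np.pi * n / 12))
    return S, T

# Assumed example: the A1 root lattice L = Z*alpha with alpha.alpha = 2,
# so that L*/L = {[0], [alpha/2]}.
S, T = lattice_modular_data([[2]], [[0.0], [0.5]])
relation_ok = np.allclose(S @ S, np.linalg.matrix_power(S @ T, 3))
```

Here S is the 2 × 2 matrix with entries ±1/√2 and T_{[0],[0]} = exp[−πi/12], consistent with central charge c = n = 1.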

The final class we need is associated to affine nontwisted Lie algebras g(1) and positive integers k (see e.g. [42]). Let g,k denote the corresponding modular data: is the set P+k (g(1) ) of integrable highest-weights of level k, the vacuum is k0 , and explicit formulas for the matrices S and T are given for instance in [42, Ch.13] or [31]. Finite group, lattice and affine algebra modular data (and as we’ll see next subsection, the double of a subfactor) is always unitary. Definition 2. A matrix Z = (Z i, j )i, j∈ is called a modular invariant provided i) all entries Z i, j are nonnegative integers and Z 0,0 = 1; ii) Z S = S Z and Z T = T Z . Examples are Z = I and Z = C. A modular invariant is often written in equivalent form as the formal expression Z= Z i, j ch i ch j . (1.10) i, j∈

Unitary modular data has only finitely many modular invariants. The Galois symmetry (1.4) implies, for any modular invariant M and any σ ∈ Gal(Q[ξ_L]/Q),

M_{i,j} = s_\sigma(i)\, s_\sigma(j)\, M_{\sigma i, \sigma j},   (1.11)
M_{i,j} \neq 0 \ \text{implies}\ s_\sigma(i) = s_\sigma(j).   (1.12)
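Definition 2 can be tested by brute force when Φ is small. The sketch below searches all matrices with entries in {0, 1} for an assumed three-primary example with Φ = Z3 (S_{k,l} = 3^{-1/2} ξ_3^{-2kl}, T_{k,k} = exp[2πi(k²/3 − 1/12)]), which is not data from this paper; the function names are hypothetical.

```python
import itertools
import numpy as np

def is_modular_invariant(Z, S, T):
    """Conditions i) and ii) of Definition 2 for an integer matrix Z."""
    Z = np.asarray(Z)
    return bool(Z[0, 0] == 1 and (Z >= 0).all()
                and np.allclose(Z @ S, S @ Z)
                and np.allclose(Z @ T, T @ Z))

def small_modular_invariants(S, T, max_entry=1):
    """Brute-force search over matrices with entries in {0, ..., max_entry}."""
    n = S.shape[0]
    found = []
    for entries in itertools.product(range(max_entry + 1), repeat=n * n):
        Z = np.array(entries).reshape(n, n)
        if is_modular_invariant(Z, S, T):
            found.append(Z)
    return found

# Assumed example data on Phi = Z_3.
k = np.arange(3)
S = np.exp(-2j * np.pi * 2 * np.outer(k, k) / 3) / np.sqrt(3)
T = np.diag(np.exp(2j * np.pi * (k**2 / 3 - 2 / 24)))
invariants = small_modular_invariants(S, T)
```

For this example the search returns exactly the two invariants named after Definition 2, namely Z = I and Z = C. The cost grows exponentially with the number of primaries, so this is only a sanity check for very small Φ.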

1.2. Subfactors, quantum doubles and tube algebras. We refer to [25,41] for the basic theory of subfactors, principal graphs etc, [48] for the theory of sectors, and [9–11] for the theory of alpha-induction. Given a type III factor N , let N X N denote a finite system of endomorphisms on N [11, Defn. 2.1]. Write ( N X N ) for the endomorphisms which decompose into a finite number of irreducibles from N X N . For λ, ρ ∈ ( N X N ), the intertwiner space Hom(λ, ρ) is a finite-dimensional Hilbert space. Write λ, ρ for its dimension. The sector [λ] identifies all endomorphisms Ad(u)λ for unitaries u in the target algebra. Suppose N X N is nondegenerately braided [11, Sect. 2.2]. Among other things this means λ, μ ∈ N X N commute up to a unitary = (λ, μ), i.e. λμ = Ad()μλ, and these unitaries {(λ, μ)} can be chosen to satisfy the braiding–fusion relations. Unitary modular data S, T is obtained from this set-up by the intertwiners associated to the Hopf link and twist [55,58]; here = N X N and 1 is the identity endomorphism, and the [ν] corresponding fusion ring is realised by composition in N X N : [λ][μ] = N[λ],[μ] [ν],

[ν] where N[λ],[μ] = [λ][μ], [ν]. Modular invariants are recovered as follows. Suppose we have a subfactor N ⊂ M. Let ι : N → M be the inclusion and ι : M → N its conjugate. Then θ = ιι is called the canonical endomorphism and γ = ιι its dual canonical endomorphism. Suppose θ is in

( N X N ). Using the braiding + := or its opposite − := −1 , we can lift an endomorphism λ ∈ N X N of N to one of M: αλ± := γ −1 Ad( ± (λ, θ ))λγ . Then Z λ,μ := αλ+ , αμ− ± generate the full system is a modular invariant [11,20]. The induced α ± ( N X N ) = M X M X . The N -M system X (resp. the M-N system X ) consists of all irreducibles M M N M M N in ιλ (resp. λι) for all λ ∈ N X N . By the nimrep we mean the N X N action on the N -M system N X M . This is one of 8 (6 independent) natural products P X Q × Q X R → P X R among the sectors, one for each triple P, Q, R ∈ {M, N }. Many examples (e.g. the Haagerup and finite groups) arise naturally as nonbraided systems of endomorphisms. To get a braided system, one takes the quantum double (asymptotic inclusion) of N X N . This can be realised by the Longo-Rehren inclusion A ⊂ B where B = N ⊗ N opp (see e.g. [37,49]): we are interested in the double D( N X N ) system on A (which contains the A-A system of this subfactor, cf. Remark (i) p. 146 of [37]) whereas the full system B X B is simply N X N ⊗ N X N opp (see Theorem 1 below) with dual canonical endomorphism γ L R = ξ ∈ N X N ξ ⊗ ξ opp . If M X M is an M-M system of a subfactor N ⊂ M, then the dual Longo-Rehren inclusion is ιˆ : A1 ⊂ M ⊗ M opp =: B1 , where γ = ιˆιˆ = η∈ M X M η ⊗ ηopp . Here opp B1 X B1 is M X M ⊗ M X M but we can and will identify the double D( N X N ) on A with the double D( M X M ) on A1 , and A1 with A. The induction-restriction graph of the quantum double system is constructed in the following way [37]. First, given ρ ∈ ( N X N ), a system of unitaries {Eρ (ξ )} is called a half-braiding of ρ if Eρ (ξ ) ∈ Hom(ρξ, ξρ) for some ξ ∈ N X N , and xEρ (ζ ) = ξ(Eρ (η))Eρ (ξ )ρ(x) for every x ∈ Hom(ζ, ξ η), as in [37, Def. 4.2]. This can be described by a matrix representation for an orthonormal basis {wρ (ξ )i } of Hom(ξ, ρ), if we set Eρp (ξ )(η,i),(ζ, j) := ρξ (wρ (ζ )∗j )Eρp (ξ )wρ (η)i ∈ Hom(ηξ, ξ ζ ). 
Then the even vertices of the quantum double system are labelled by inequivalent halfp braidings Eρ , and the odd vertices are labelled by (ξ ⊗ idopp )ι with ξ running in N X N . Finally vertices Eρ and (ξ ⊗ idopp )ι are joined by ρ, ξ edges, see [37, p. 154]. The principal graph of A ⊂ B is the connected component containing (id ⊗ idopp )ι. p Incidentally, the forgetful functor sending the half-braiding Eρ to the representation ρ is an algebra homomorphism, and indeed has a name: it is alpha-induction. See also the discussion at the end of the next subsection. The tube algebra is an effective way to compute the modular data S, T of the quantum double. In Sect. 2 we compare the tube algebras of the Haagerup subfactor and the group S3 . Their similarity is the first indication of a possible relation between them. The tube algebra of N X N is the finite dimensional C∗ -algebra, Hom(ξ · ζ, ζ · η). (1.13) Tube( N X N ) = ξ,η,ζ ∈ N X N p

The simple summands of Tube( N X N ) are labelled by inequivalent half-braidings Eρ . The modular data S, T is obtained from the half-braidings by (2.8), (2.9) of [38]. Let’s consider in more detail the special case of tube algebras of finite groups. Let G be a finite group with identity e. Given a type III factor N , write Int(N ) for the group of inner automorphisms and Out(N ) := Aut(N )/Int(N ). Let α : G → Out(N ) be a homomorphism. We take a lift α : G → Aut(N ), as in [57]. Since [αg ][αh ] = [αgh ] as sectors, there exists a unitary u g,h ∈ N satisfying αg · αh = Ad(u g,h )αgh . In particular u g,h ∈ Hom(αgh , αg · αh ). Associativity (αg αh )αk =

αg (αh αk ) implies Ad(u g,h )u gh,k ·αghk = Ad(αg (u h,k )u g,hk )·αghk . Then since N is a factor, there exists a scalar ω(g, h, k) ∈ T satisfying u g,h u gh,k = ω(g, h, k)αg (u h,k )u g,hk , i.e. ω is an element in Z 3 (BG ; T). Conversely, every element in Z 3 (BG ; T) arises as such an ω [40,57]. Set c(g, h) := u h,g h u ∗g,h . The product and *-structure of the tube algebra Tube( N X N ) are: c(g, h) c(k, l) = δk,g h θg (h, l) c(g, hl),

(1.14)

c(g, h)∗ = ω(gh, h −1 , h) ω(h −1 , gh, h −1 ) ω(h −1 , h, g h ) c(g h , h −1 )

(1.15)

(recall (1.5)). Consequently, c(e, g) c(e, h) = c(e, gh) and c(e, g)∗ = c(e, g −1 ), ∗ and thus the group algebra C[G] of G is a C -subalgebra of Tube( N X N ). The identity of the tube algebra is g∈G c(g, e). class of an element g ∈ G. Then Let K g = {g h : h ∈ G} denote the conjugacy we have the decomposition Tube( N X N ) = K u Tube(K u ), where the sum ranges over the conjugacy classes of G, and Tube(K u ) = g∈K u ,l∈G Hom(αg αl , αl αgl ). Hence C[G] = Tube(K e ). This decomposition can be further refined. Consider first a trivial twist ω [28]. The outer action α gives a subfactor N ⊂ N G = M. The N -N system is {αg } ≡ G, The tube algebras Tube(G) Tube(G) are whereas the M-M system is the irreps G. Morita equivalent as finite-dimensional C∗ -algebras [37], so their centres, the quantum are identified. The simple components of Tube(G), equivalently the doubles of G and G, u even vertices of the quantum double of G, are labelled by pairs (K u , π ), where π ∈ Z are irreps of the centralizers Zu , see e.g. [46] or [26, Sect. 4]. To every such pair (K u , π ) is attached the endomorphism ρ(K u ,π ) ∈ (G) defined as ρ(K u ,π ) = dim(π ) αh . (1.16) h∈K u

attached to such a pair the endomorphism ρˆ(K u ,π ) in (G) In the dual system G, G π from Z to G, decomposed into irreps of G. (K u , π ) is the Mackey induction IndZ u u The number of inequivalent half-braidings associated to ρ(K u ,π ) equals the number u such that dim(π ) = dim(π ). The half-braiding for of inequivalent irreps π ∈ Z ρ = ρ(K u ,π ) in its matrix decomposition is Eρπ (ξ )(u p ,i),(u q , j) = π ji ( pξ −1 q −1 )id N ,

(1.17)

for all ξ ∈ p −1 Zu q, u p , u q ∈ K u , where i, j label the basis vectors in the representation space of π . All this extends to the twisted case. The θa ’s of (1.5) are normalised twisted cocycles on G, namely from (1.14) they satisfy θa (x, y) θa (x y, z) = θa (x, yz) θx −1 ax (y, z),

∀x, y, z ∈ G.

They thus describe projective representations of Za . As we know [17], the primary fields in the model twisted by a given 3-cocycle ω consist of all pairs (K a , π˜ ), where now π˜ ∈ θa -Irr(Za ). In the cohomologically trivial case discussed in Sect. 1.1, this is immediate: we can twist the formula (1.17) for half-braidings by ; inserting into (2.8),(2.9) of [38] recovers (1.8).

1.3. The canonical and dual canonical modular invariants. This subsection contains original material. Suppose we have a subfactor N ⊂ M, with the N -N system denoted by and the . The double D = L R() of is realised as sectors on a factor A M-M system by in N ⊗ N opp =: B. Alpha-induction reverses this. That is, if you take the appropriate modular invariant Z in the double, and take alpha-induction from A to B, then the full system is ⊗ opp . In some sense then, the factor A and the modular invariant Z remembers the original system on A. In fact it is a consequence of Theorem 1 below that the A-B system is just regarded as a nimrep. To get B, only the canonical endomorphism θ on A is needed (because of Theorem 1 below, we can read θ off from the modular invariant), plus the associated Q-system on θ . That is, all the information about B and the full system ⊗opp is carried in the primary fields and modular data of the double, the modular invariant Z and its vacuum block θ . We suggest this modular invariant Z be called the canonical modular invariant. acts on M, Dually, there is an inclusion A ⊂ M ⊗ M opp =: B, where the dual but the same double acts on A. This comes with another modular invariant Z , with a ⊗ opp as the full system. Here (again from Theorem 1 corresponding θ which regains . Again the modular invariant Z below) the A-B system is just — the dual canonical . modular invariant — and θ together encode the original system as equipped with two canonical This means we should regard the double D ∼ = D modular invariants Z and Z , from which using say alpha-induction one can recover . In particular the the full system B-B and the nimrep A-B entirely in terms of or opp opp full system is ⊗ or ⊗ , depending on the choice of modular invariant Z . or Z , and the nimrep is or Note that in both cases the A-B system here is an algebra. 
For most modular invariants the A-B system is only a module, but when the inclusion is type I, which implies for instance that the modular invariant is a sum of squares, the A-B system will necessarily be an algebra ( A ⊂ B is type I iff the equivalent conditions of Proposition 3.2 in [8] hold; in the nets of subfactors setting A(I ) ⊂ B(I ), this means the extended net is local). This suggests that both inclusions A ⊂ N ⊗ N opp and A ⊂ M ⊗ M opp are type I. In fact more is true: Theorem 1. The inclusions A ⊂ N ⊗ N opp and A ⊂ M ⊗ M opp are both type I.

2 Moreover, we have (Z )i, j = (Z )i,0 (Z ) j,0 , i.e. Z = i Z i,0 ch i (similarly for Z ). The vectors u and v with entries u i = (Z )i,0 and vi = (Z )i,0 are eigenvectors with eigenvalue 1 of both S and T . Proof. Corollaries 6.3 and 6.4 of [37] show that, at the level of subfactors, B X B+ := α + ( B X B ) ⊆ ⊗ 1 and B X B− := α − ( A X A ) ⊆ 1 ⊗ opp . The neutral system B X B0 = − + opp and hence equals 1. But dim X ± B X B ∩ B X B is thus contained in ⊗ 1 ∩ 1 ⊗ B B is computed in [8, Prop.3.1], for such a nondegenerate system like ours, to be 1 S0,0

=

Si,0 S j,0 Z i,0 = Z 0, j , S0,0 S0,0

(1.18)

which matches the dimensions of ⊗ 1 and 1 ⊗ opp . Therefore we get the equalities − + opp . B X B = ⊗ 1 and B X B = 1 ⊗ sector: θ ≥ Any canonical endomorphism θ is bounded below by the± vacuum ± λ Z λ,0 λ. This inequality is a special case of θ λ, μ ≥ αλ , αμ ([12, Eq. (37)]; see also [6, Thm.3.9]). But dim θ is given by

\dim\theta = \dim(\bar{\iota}\iota) = \dim(\iota\bar{\iota}) = \sum_{\xi} \dim(\xi \otimes \xi^{opp}) = \sum_{\xi} (\dim \xi)^2 = \frac{1}{S_{0,0}},

agreeing with (1.18), and thus θ = λ Z λ,0 λ. By [8, Prop.3.2], this implies the inclusion A ⊂ B is type I. of alpha-induction, we know a modular invariant takes the form Z = Because ± ± + b b τ,λ,μ τ,λ τ,μ , where τ runs over all sectors of the neutral system and bτ,λ = τ, αλ + − are the branching coefficients. Because this system is type I, we have b = b . Because the neutral system is trivial, we have only one τ . Thus the modular invariant Z corresponding to θ , and Z θ , take the desired forms. The statements about corresponding to the vectors u, v follow from modular invariance. ) is noncomFrom the theory of alpha-induction, we know that (respectively mutative iff some (Z )i, j > 1 (resp. some (Z )i, j > 1) — see [11, Cor.6.9],[12, Thm.4.11]. For type I inclusions, we know from Sect. 4.1 of [7] that the A-B system ± A X B is isomorphic to B X B . We will call any modular invariant of the type Z = | i Z i,0 ch i |2 a monomial modular invariant. Of course all this applies for a system of endomorphisms not necessarily coming from a subfactor — we expect this is relevant for our twisted Haagerup data. Again is recovered from the double D and the canonical modular invariant Z , which again . is monomial. The only difference is that there is no Suppose a finite group G acts by outer automorphism on a type III factor N . Then as mentioned last subsection, by taking a crossed product we have a subfactor N ⊂ M, respectively. Taking where the N -N and M-M systems are identified with G and G opp the Longo-Rehren inclusion A ⊂ N ⊗ N , the doubled A-A system can be identified with the untwisted quantum double DG. We will sometimes find it useful to use the K-theory language developed in [22,23], which identifies DG with the equivariant 0 K-group K G0 (G) ∼ (G × G), where G acts on G by the conjugate action and = K − ∼ = (G) = G acts diagonally on the left and right of G × G. 
The neutral system is trivial (K 0 (e, e) = Z), with sigma-restriction σid = θD G ∼ = χ ∈G dim χ (e, χ ) . The full system is × opp ∼ = K 0 (G × G) with canonical modular invariant

2

χ (e) ch (e,χ ) , (1.19) Z =

The dual LR-inclusion is A ⊂ M ⊗ M opp , where the douthe sum over χ ∈ G. ×G opp ∼ bled A-A system is again the quantum double of G, the full system is G = 0 0 K G×G (G/G × G/G) and the neutral system K (G/G × G/G) is trivial. Here sigma restriction is given by σid = θD G ∼ = (g, id), where g runs over representatives of all conjugacy classes, and the dual canonical modular invariant is

2

ch (a,1) , (1.20) Z =

where the sum runs over representatives a of all conjugacy classes of G. We recover from is always commutative, whereas = CG is commutative iff =G (1.19), (1.20) that all dimensions χ (e) = 1, i.e. iff G is abelian. By the Galois theory of [37], a subgroup K < G induces an intermediate subfactor A ⊂ C ⊂ N ⊗ N opp , where the doubled C-C system is the untwisted quantum double 0 DK ∼ = K K0 (K ) ∼ = K K −K (K × K ). Then A ⊂ C is a braided subfactor of index

± ∼ 0 ∼ |G/K |. The full system C XC ∼ = K K −K (G × G) and chiral systems C XC = C X A = 0 0 0 K K −G (G × G). The branching coefficients and sigma-restriction K K (K ) → K G (G) Z

are given by (g, π ) → (g, Ind Kg∩Z g π ). The trivial example K = 1 recovers the canonical modular invariant (1.19). In the case of the dihedral group G = Dν and its cyclic subgroup K = Zν , in which we are interested, the canonical modular invariant arising from the inclusion is given in (3.11) below. 1.4. VOAs and vector-valued modular functions. For the basic theory of VOAs, see [47]. Let V be a VOA. Write Irr(V) for the set of irreducible V-modules M. Among other things, V and its modules carry a representation of the Virasoro algebra Vir = Span{L n , C}n∈Z , where the central term C acts as a scalar c = c(V) called the central charge. c also arose in Sect. 2.1, where it was defined only mod 24. The action of L 0 ∈ V ir on each V-module M defines a grading (by eigenvalues) on M into finitedimensional spaces. Definition 3. A VOA V is called rational if i) V ∈ Irr(V) and V is isomorphic to its contragredient as V-modules; ii) writing V = ⊕n∈Z Vn for the grading by L 0 of V, we have Vn = 0 for n < 0 and V0 is 1-dimensional; iii) every weak V-module is completely reducible. See [47] for the details, which play no role in the following. There is no standard definition of rationality — we chose Definition 3 to guarantee the existence of modular data. All VOAs considered in this paper are rational in this sense. Rational VOAs V (or if you prefer, rational CFT) realise modular data as follows. The primaries consist of the finitely many modules M ∈ Irr(V) = , and the vacuum 0 is V itself. Define their characters to be the graded dimensions ch M (τ ) = Tr M q L 0 −c/24 ,

(1.21)

using the above grading by L_0, where as always we write q = e^{2πiτ}. These ch_M will be holomorphic throughout the upper half-plane H = {τ ∈ C : Im(τ) > 0} [62]. Collect these finitely many characters into a column vector ch(τ). Then [62] showed there is a representation ρ of SL2(Z) such that

\mathrm{ch}\!\left(\frac{a\tau + b}{c\tau + d}\right) = \rho\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mathrm{ch}(\tau), \qquad \forall \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}).   (1.22)

The matrices S, T now defined by (1.2) constitute modular data [36]. It may or may not be unitary. Unlike braided subfactors, which are naturally associated to modular invariants, VOAs see only one chiral half of the rational CFT, and so capture only the notion of modular data. There is a rational VOA V(L) corresponding to any even positive-definite lattice L (the central charge equals the dimension n of L), recovering the modular data of (1.9). The character corresponding to coset [v] ∈ L*/L is ch_{[v]}(τ) = η(τ)^{-n} \sum_{x∈[v]} q^{x·x/2}. There is a rational VOA V(g_k) corresponding to affine algebra g^{(1)} at positive integral level k, recovering the modular data of [42]. The character ch_λ corresponding to highest-weight λ coincides with the affine algebra character χ_λ, specialised to τ ∈ H. Affine algebra and lattice VOAs overlap for the simply-laced g at level 1, corresponding to the root lattices L. Finite group modular data D^ω G is recovered by taking the orbifold V^G

(i.e. subVOA of fixed-points of G) of a subgroup G of automorphisms of a holomorphic VOA (i.e. a rational VOA V with Irr(V) = {V}) — it has been conjectured that any such V G will itself be rational, but a general proof of this still seems far away. Examples of holomorphic VOAs are the V(L) for L self-dual — e.g. for c = 8 there is the E 8 root lattice. In this case the orbifold theory is under better control, see e.g. [43], and indeed this is the main one we’ll consider. A sexier example of holomorphic VOA though is the c = 24 Moonshine module V . Incidentally, Theorem 1 suggests that: a VOA corresponding to a quantum double should be a conformal subalgebra of a holomorphic VOA. Definition 4. Let ρ be a d-dimensional representation of SL2 (Z), with diagonal T := ) with multiplier ρ ρ( 01 11 ). A weakly holomorphic vector-valued modular function ch(τ d is a holomorphic function H → C satisfying (1.22), with q-expansion ) = qλ ch(τ

∞

nqn ch

(1.23)

n=0

n∈ for some diagonal matrix λ satisfying T = e2π iλ , where each Fourier coefficient ch d for ρ. C is independent of τ . Let M(ρ) denote the space of all such ch ) is meromorphic at the cusp. [62] showed that the charEquation (1.23) means ch(τ acter vector of a rational VOA is a vector-valued modular function in this sense. These character vectors will help us study and identify the VOA. The characters of rational VOAs also satisfy other conditions. For instance, by defi n in (1.23) are nonnegative integers. The vacuum nition (1.21) the Fourier coefficients ch −c/24 character ch V = ch 0 begins with 1q . In a unitary VOA, λ in (1.23) can be chosen so that λ M > −c/24 for all M = V in Irr(V). Recall charge-conjugation C = S 2 from Definition 1. A consequence of (1.22) is that = ch. For convenience we assume in the remainder of this subsection that C = I , C ch i.e. we have a representation of PSL2 (Z). This holds for most modular data considered in this paper (other than Dω Z2n+1 ), but at the end of Sect. 2.4 we explain how to reduce C = I to the C = I case. Given any d-dimensional PSL2 (Z)-representation ρ, [2,3] describe how to find all ) ∈ M(ρ). In particular, there is a d × d matrix (τ ) with the property that ch(τ ) ∈ M(ρ) iff ch(τ ) is of the form X (τ ) = (τ ) P(J (τ )), where P(x) ch(τ is any column vector whose entries are polynomials in x, and J (τ ) is the Hauptmodul J (τ ) = q −1 + 196884q + · · ·. Thus it suffices to find (τ ) for ρ. (τ ) is determined through a differential equation it satisfies. This differential equation and the relevant initial condition is determined from two d × d matrices of numbers. One is a diagonal matrix satisfying among other things the relation e2π i = T ; the more elusive one is called χ . In terms of and χ , the matrix (τ ) has q-expansion (τ ) = q

∞

q n n = q (I q −1 + χ +

n=−1

∞

q n n ),

n=1

where −1 is the d × d identity matrix, 0 is the matrix χ , and for n = 1, 2, 3, . . . the matrix n is recursively defined by the commutator [, n ] + (n + 1)n =

n−1 l=−1

l ( f n−l ( − I ) + gn−l (χ + [, χ ])),

(1.24)

where $f_n$, $g_n$ are defined by $(J-240)\Delta/E_{10} = \sum_n f_n q^n$ and $\Delta/E_{10} = \sum_n g_n q^n$ for the discriminant form $\Delta = \eta^{24}$ and Eisenstein series $E_{10} = E_4E_6$. That is, the $(i,j)$-entry of $\Xi_n$ is the $(i,j)$-entry of (1.24), divided by $\Lambda_{ii} - \Lambda_{jj} + n + 1$.

The following method suffices to find $\Lambda$, $\chi$ for the $\rho$ considered in this paper (though invariably there are more elegant methods). First, decompose $\rho$ into irreps. What is special about irreps $\rho$ is that their $\mathcal{M}(\rho)$ is cyclic (Theorem 4.1 of [3]): the space $\mathcal{M}(\rho)$ is a module over the ring $\mathbb{C}[J, \nabla_1, \nabla_2, \nabla_3]$ of differential operators
\[ \nabla_1 = \frac{E_{10}}{\Delta} D_0, \quad \nabla_2 = \frac{E_8}{\Delta} D_1 D_0, \quad \nabla_3 = \frac{E_6}{\Delta} D_2 D_1 D_0, \quad \text{for } D_k = q\frac{d}{dq} - \frac{k}{6} E_2, \]
and if $\rho$ is irreducible, then $\mathcal{M}(\rho)$ (and hence $\Lambda$ and $\chi$) is generated over that ring by any nonzero $\mathrm{ch} \in \mathcal{M}(\rho)$. All irreps occurring in this paper are subrepresentations of the modular data coming from even lattices, and so the desired nonzero modular function can be built from lattice theta functions.

Knowing $\Lambda$ and $\chi$ for a given $S, T$ is equivalent to knowing $\Lambda$ and $\chi$ for $S, \omega T$ for any third root of unity $\omega$, but the explicit equivalence [3] is not easy. Write $\Xi^{(k)}(\tau)$ for the matrix $\Xi(\tau)$ for $S, \xi_3^k T$ and assume $\Xi^{(0)}(\tau)$ (hence $\Lambda^{(0)}$ and $\chi^{(0)}$) is known. Then the columns of $\Xi^{(1)}$ are linear combinations over $\mathbb{C}$ of the columns of $\eta^{-16} E_4^2\, \Xi^{(0)}$ and $\eta^{-16}\bigl(E_6 D_0 \Xi^{(0)} - E_4^2\, \Xi^{(0)} (\Lambda^{(0)} - 1)\bigr)$, while the columns of $\Xi^{(2)}$ are linear combinations over $\mathbb{C}$ of $\eta^{-8} E_4\, \Xi^{(0)}$ and $\eta^{-8}\bigl(D_1 D_0 \Xi^{(0)} - E_4\, \Xi^{(0)} (\Lambda^{(0)} - I)(\Lambda^{(0)} - \tfrac{7}{6} I)\bigr)$. The contragredient $(\rho^T)^{-1}$ is handled similarly [3]: the columns of $E_{10} E_{14} \Delta^{-2} (\Xi^T)^{-1}$ are linearly independent vectors in $\mathcal{M}((\rho^T)^{-1})$.

2. Comparing the Haagerup, Sym(3) and SO(13)

2.1. The Tube Algebra of Sym(3). Recall the discussion in Sect. 1.2. Write $S_3 = \mathrm{Sym}(3) = \{e, u, u^2, \tau, \tau u, \tau u^2\}$. The three conjugacy classes are $K_e = \{e\}$, $K_u = \{u, u^2\}$ and $K_\tau = \{\tau, \tau u, \tau u^2\}$. Write $\hat{S}_3 = \{1, \epsilon, \sigma\}$, where $\epsilon$ is sgn and $\sigma$ is 2-dimensional. Then each pair $(K_g, \pi)$, with $\pi \in \hat{Z}_g$, gives rise to a simple component of the tube algebra.

There are two half-braidings for the endomorphism $\rho = e$, as $\rho_{(K_e,1)} = \rho_{(K_e,\epsilon)} = e$, and a third attached to $\rho_{(K_e,\sigma)} = e + e$. We denote these half-braidings by $e^{(1)}$, $e^{(2)}$ and $2e$, respectively. The conjugacy class $K_u$ provides an endomorphism $\rho_{(K_u,\pi)} = u + u^2$ for every irrep $\pi$ of the centraliser $Z_u \cong \mathbb{Z}_3$, and hence three half-braidings $(u+u^2)^{(1)}$, $(u+u^2)^{(2)}$, $(u+u^2)^{(3)}$. The conjugacy class $K_\tau$ has centraliser $Z_\tau \cong \mathbb{Z}_2$, so the endomorphism $\rho = \tau + \tau u + \tau u^2$ has two half-braidings $(\tau+\tau u+\tau u^2)^{(1)}$, $(\tau+\tau u+\tau u^2)^{(2)}$.

This enables us to match up Fig. 1, where the bottom graph has been drawn in [38, Fig. 1] or [26, Fig. 31]: the upper graph arises from the Longo-Rehren inclusion of $S_3$, and the lower describes the Longo-Rehren inclusion of the dual system $\hat{S}_3$; see [37, Remark (i) on p. 154] as well as Sect. 1.2 above for the general description of the induction-restriction graphs between A-B and B-B sectors of the Longo-Rehren inclusion $A \subset B$ using the structure of the tube algebra. The middle vertices in Fig. 1 also describe how the half-braidings from $S_3$ and $\hat{S}_3$ match up. The Longo-Rehren dual sectors associated to $S_3$ and $\hat{S}_3$ are respectively
\[ [\theta_{DS_3}] = [K_e, \mathrm{Ind}^{G}_{\mathrm{id}}] = \sum_{\pi \in \hat{G}} \dim(\pi)\,[K_e, \pi] = [e^{(1)}] + [e^{(2)}] + 2[2e], \tag{2.1} \]
\[ [\theta_{D\hat{S}_3}] = \sum_{g} [K_g, \mathrm{Ind}^{G}_{K_g}\, \mathrm{id}] = [1] + [1 + \epsilon] + [1 + \sigma], \tag{2.2} \]

Exoticness and Realisability of Twisted Haagerup–Izumi Modular Data


Fig. 1. Dual principal graphs for doubles of $\mathrm{Sym}(3)$ and $\widehat{\mathrm{Sym}(3)}$

specialised from the discussion of Sect. 1.3. Observe that $[\theta_{DS_3}]$ has also been computed in [38,26]. These braided subfactors yield the canonical modular invariants (recall (1.19), (1.20)) $Z_{22} = |\mathrm{ch}_0 + \mathrm{ch}_b + 2\mathrm{ch}_a|^2$ and $Z_{55} = |\mathrm{ch}_0 + \mathrm{ch}_{c_0} + \mathrm{ch}_{d_1}|^2$ with full systems $S_3 \times S_3^{\mathrm{opp}}$, $\hat{S}_3 \times \hat{S}_3^{\mathrm{opp}}$ respectively by [26, p. 357], where the names of the primaries of $DS_3$ are taken from the middle row of Fig. 1 and the names $Z_{22}$, $Z_{55}$ come from the list in [26].

As explained at the end of Sect. 1.3, the subsystem $\mathbb{Z}_3$ of $S_3$ corresponds to an intermediate subfactor $A \subset C \subset N \otimes N^{\mathrm{opp}}$, where the dual canonical endomorphism of $C \subset N \otimes N^{\mathrm{opp}}$ is $\gamma = \sum_{\alpha \in \mathbb{Z}_3} \alpha \otimes \alpha^{\mathrm{opp}}$ and the double C-C system is the quantum double $D\mathbb{Z}_3 \cong K^0_{\mathbb{Z}_3}(\mathbb{Z}_3)$. Then $A \subset C$ is a braided subfactor of index 2, with canonical endomorphism $[\theta] = [0] + [b]$ and associated modular invariant
\[ Z = |\mathrm{ch}_0 + \mathrm{ch}_b|^2 + 2|\mathrm{ch}_a|^2 + 2|\mathrm{ch}_{c_0}|^2 + 2|\mathrm{ch}_{c_1}|^2 + 2|\mathrm{ch}_{c_2}|^2. \tag{2.3} \]

2.2. The Haagerup tube algebra. Let $\Delta = \{\mathrm{id}, u, u^2, \rho, \rho u, \rho u^2\}$ be the noncommutative N-N system (the even vertices of the top graph of Fig. 2) of the Haagerup subfactor $N \subset M$ of index $\delta + 1$, where $\delta = (3 + \sqrt{13})/2$. The fusions (products of sectors) are given by
\[ [u]^3 = [\mathrm{id}], \quad [u][\rho] = [\rho][u]^2, \quad [\rho]^2 = [\mathrm{id}] + [\rho] + [\rho u] + [\rho u^2]. \]
These have statistical dimensions $\dim(u) = 1$ and $\dim(\rho) = \delta$. Let $\hat{\Delta} = \{\mathrm{id}, a, b, c\}$ be the commutative M-M system (the even vertices of the bottom graph of Fig. 2), whose fusion rules are
\[ [a]^2 = [\mathrm{id}] + [a] + [b] + [c], \quad [b]^2 = [\mathrm{id}] + [c], \quad [c]^2 = [\mathrm{id}] + 2[a] + [b] + 2[c], \]
\[ [a][b] = [a] + [c], \quad [a][c] = [a] + [b] + 2[c], \quad [b][c] = [a] + [b] + [c]. \]
Hence we obtain the statistical dimensions $\dim(a) = \delta$, $\dim(b) = \delta - 1$, $\dim(c) = \delta + 1$.
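The statistical dimensions quoted above can be checked against the fusion rules: since $\delta^2 = 3\delta + 1$, the product of dimensions on each left-hand side should equal the total dimension of the right-hand side. The following is a minimal numerical sketch of that check; the dictionary layout and the name `dimension_defect` are illustrative choices, not notation from the paper.

```python
import math

delta = (3 + math.sqrt(13)) / 2  # the Haagerup index is delta + 1

# Statistical dimensions quoted in the text.
dim = {"id": 1.0, "u": 1.0, "rho": delta,           # N-N system
       "a": delta, "b": delta - 1, "c": delta + 1}  # M-M system

# Fusion rules as (x, y, {z: multiplicity}).
# rho u and rho u^2 have the same dimension as rho, so [rho]^2 is
# recorded here as id + 3 copies of something of dimension delta.
fusion_rules = [
    ("rho", "rho", {"id": 1, "rho": 3}),
    ("a", "a", {"id": 1, "a": 1, "b": 1, "c": 1}),
    ("b", "b", {"id": 1, "c": 1}),
    ("c", "c", {"id": 1, "a": 2, "b": 1, "c": 2}),
    ("a", "b", {"a": 1, "c": 1}),
    ("a", "c", {"a": 1, "b": 1, "c": 2}),
    ("b", "c", {"a": 1, "b": 1, "c": 1}),
]

def dimension_defect(x, y, summands):
    """dim(x)*dim(y) minus the total dimension of the right-hand side."""
    return dim[x] * dim[y] - sum(m * dim[z] for z, m in summands.items())

defects = [dimension_defect(*rule) for rule in fusion_rules]
for (x, y, _), defect in zip(fusion_rules, defects):
    print(f"[{x}][{y}]: defect = {defect:+.2e}")
```

Every defect should vanish up to floating-point rounding, which is just the identity $\delta^2 = 3\delta + 1$ applied rule by rule.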


Fig. 2. Principal graphs of the Haagerup $(5 + \sqrt{13})/2$ subfactor

The M-N sectors, $[\kappa u^i]$ and [κ ] where $\kappa$ is the inclusion $N \subset M$, are the odd vertices of both graphs in Fig. 2. From this figure we see we can choose the endomorphism $c = \kappa u \bar{\kappa} \cong \kappa u^2 \bar{\kappa}$. These principal graphs encode multiplication by $\kappa$, e.g. $[\kappa][\rho u^i]$ = [κ ] + $[\kappa u^i]$. From this we obtain the statistical dimensions $\dim(\kappa u^i) = \lambda$ and dim(κ ) = $\lambda(\delta - 1)$. The remaining products M-N × N-N → M-N and M-M × M-N → M-N, as well as M-N × N-M → M-M and N-M × M-N → N-N, were computed by Bisch [5] and are generalised in Sect. 4.2 below.

The fusions of $\Delta$ and $\hat{\Delta}$ should remind one of $S_3$ and $\hat{S}_3$. In particular, the system $\Delta$ has $\mathbb{Z}_3 = \{\mathrm{id}, u, u^2\}$ as a subsystem and is some sort of perturbation of the usual $S_3$ multiplication table $[u]^3 = [\mathrm{id}]$, $[u][\rho] = [\rho][u]^2$, $[\rho]^2 = [\mathrm{id}]$. Similarly, the system $\hat{\Delta}$ reduces to the character ring $\hat{S}_3$ when ignoring $[c]$, where $b$ should be regarded as the $S_3$-representation $\epsilon$ and $a$ as $\sigma$. These similarities were our first indication of a relation between the Haagerup subfactor and $S_3$. We find in Sect. 4 that this $\Delta \leftrightarrow S_3$ relation generalises naturally, whereas $\hat{\Delta} \leftrightarrow \hat{S}_3$ does not.

The structure of the tube algebra of the system $\Delta$ has been studied in [38, Sect. 8]. Izumi introduced the following endomorphisms in $\Sigma(\Delta)$:
\[ \mu = \rho + \rho u + \rho u^2, \quad \pi_1 = \mathrm{id} + \mu, \quad \pi_2 = 2(\mathrm{id}) + \mu, \quad \sigma = u + u^2 + \mu, \]
and proved that $\pi_1$ and $\pi_2$ have only one half-braiding each, $\sigma$ has three ($\sigma^{(1)}, \sigma^{(2)}, \sigma^{(3)}$), and finally $\mu$ has six ($\mu^{(1)}, \ldots, \mu^{(6)}$), and that these half-braidings exhaust all of them, so that the quantum double of $\Delta$ has 12 primaries as in the middle row of Fig. 3. The induction-restriction graph of the Longo-Rehren inclusion of the system $\Delta$ is given in [38, Fig. 5]; see also the top of Fig. 3. From this we read off the Longo-Rehren dual sector associated to $\Delta$, namely $[\theta_{D\Delta}] = [0] + [b] + 2[a]$, using the labelling of Fig. 3. This corresponds to the canonical modular invariant (recall Sect. 1.3) $Z_{22} = |\mathrm{ch}_0 + \mathrm{ch}_b + 2\mathrm{ch}_a|^2$, with corresponding full system $\Delta \times \Delta^{\mathrm{opp}}$, where the name $Z_{22}$ comes from the list of [27] (see also Sect. 3.4 below). From Fig. 3 we obtain the quantum dimensions $1 + 3\delta$, $2 + 3\delta$, $2 + 3\delta$, $3\delta$ for $b$, $a$, $c_j$, $d_l$, respectively.

More subtle is the dual principal graph corresponding to $\hat{\Delta}$, on the bottom of Fig. 3, with dual canonical endomorphism $[\theta_{D\hat{\Delta}}] = [0] + [b] + [a] + [c_0]$ and modular invariant


Fig. 3. Dual principal graphs for doubles of $\Delta$ and $\hat{\Delta}$ for Haagerup

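$Z_{11} = |\mathrm{ch}_0 + \mathrm{ch}_b + \mathrm{ch}_a + \mathrm{ch}_{c_0}|^2$ corresponding to $\hat{\Delta} \times \hat{\Delta}^{\mathrm{opp}}$. To our knowledge this appears here for the first time. To derive it, first note that the only monomial modular invariants for $D\mathrm{Hg}$ are $Z_{22} = |\mathrm{ch}_0 + \mathrm{ch}_b + 2\mathrm{ch}_a|^2$, $Z_{11} = |\mathrm{ch}_0 + \mathrm{ch}_b + \mathrm{ch}_a + \mathrm{ch}_{c_0}|^2$, and $Z_{33} = |\mathrm{ch}_0 + \mathrm{ch}_b + 2\mathrm{ch}_{c_0}|^2$ (Proposition 4 below with $\nu = 3$). The number of sectors in $\hat{\Delta}$ is 4, and the canonical endomorphism is $\gamma = 0 \otimes 0^{\mathrm{opp}} + a \otimes a^{\mathrm{opp}} + b \otimes b^{\mathrm{opp}} + c \otimes c^{\mathrm{opp}}$. Since $\langle \gamma, \gamma \rangle = 4$, we have $\langle \theta, \theta \rangle = 4$ by Frobenius reciprocity. Thus the canonical modular invariant $Z$ associated to $\theta$ satisfies $\sum_i Z_{0,i}^2 = 4$, forcing it to be $Z_{11}$ and fixing $\theta$. Hence the induced system has $1^2 + 1^2 + 1^2 + 1^2 = 4$ irreducible sectors, call them $e_0, e_1, e_2, e_3$, one of which (say $e_0$) is the identity $\alpha_0$.

To obtain the dual principal graph, in which $e_i$ is connected to the primary $x$ with $\langle \alpha_x, e_i \rangle$ edges, use $\langle \alpha_x, \alpha_y \rangle = \langle \theta x, y \rangle$, which holds for any type I $\theta$ (see Theorem 1). The fusions for the double Haagerup are explicitly given in [27] (or Sect. 3.2 below). The edges from $a$ are determined from the calculations $\langle \alpha_a, \alpha_a \rangle = 6 = 1^2 + 1^2 + 2^2$ and $\langle \alpha_a, \alpha_0 \rangle = 1$: call $e_1$ the sector not adjacent to $a$, and $e_2$ the one connected to $a$ with 2 edges. The additivity of the statistical dimension $\dim \alpha_a = \dim a$ then identifies $e_1 = b$, $e_2 = a$, $e_3 = c$. Likewise, the edges from $b$ come from $\langle \alpha_b, \alpha_b \rangle = 4$ and $\langle \alpha_b, \alpha_0 \rangle = 1$, and those from $d_l$ come from $\langle \alpha_{d_l}, \alpha_{d_l} \rangle = 3$ and $\langle \alpha_{d_l}, \alpha_0 \rangle = 0$. From $\langle \alpha_{c_j}, \alpha_{c_j} \rangle = 5 + \delta_{j,0}$, $\langle \alpha_{c_j}, \alpha_0 \rangle = \delta_{j,0}$, and statistical dimensions, we obtain the final edges.

We have $A \subset N \otimes N^{\mathrm{opp}}$, the double of $\Delta$, with subsystem $\mathbb{Z}_3 \subset \Delta$. Hence by the Galois theory of Izumi [37], we have as in Sect. 1.3 and the last subsection an intermediate subfactor $A \subset C \subset N \otimes N^{\mathrm{opp}}$, where the dual canonical endomorphism of $C \subset N \otimes N^{\mathrm{opp}}$ is $\gamma = \sum_{\alpha \in \mathbb{Z}_3} \alpha \otimes \alpha^{\mathrm{opp}}$ and the double C-C system is the quantum double $D\mathbb{Z}_3 \cong K^0_{\mathbb{Z}_3}(\mathbb{Z}_3)$. Then $A \subset C$ is a braided subfactor of index $2 + 3\delta$, with canonical endomorphism $[\theta] = [0] + [b]$ and associated modular invariant (recall (2.3))
\[ Z = |\mathrm{ch}_0 + \mathrm{ch}_b|^2 + 2|\mathrm{ch}_a|^2 + 2|\mathrm{ch}_{c_0}|^2 + 2|\mathrm{ch}_{c_1}|^2 + 2|\mathrm{ch}_{c_2}|^2. \tag{2.4} \]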


2.3. Haagerup modular data DHg. The modular data $D\mathrm{Hg}$ for the quantum double of the even part of the Haagerup subfactor was first computed in [38] and simplified somewhat in [27]. It is necessary for all that follows though to significantly simplify it further. The result is
\[ T = \mathrm{diag}(1, 1, 1, 1, \xi_3, \xi_3^{-1}, \xi_{13}^{6}, \xi_{13}^{-2}, \xi_{13}^{2}, \xi_{13}^{5}, \xi_{13}^{-6}, \xi_{13}^{-5}), \tag{2.5a} \]
\[ S = \frac{1}{3} \begin{pmatrix}
x & 1-x & 1 & 1 & 1 & 1 & y & y & y & y & y & y \\
1-x & x & 1 & 1 & 1 & 1 & -y & -y & -y & -y & -y & -y \\
1 & 1 & 2 & -1 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & -1 & 2 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & -1 & -1 & -1 & 2 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & -1 & -1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
y & -y & 0 & 0 & 0 & 0 & c(1) & c(2) & c(3) & c(4) & c(5) & c(6) \\
y & -y & 0 & 0 & 0 & 0 & c(2) & c(4) & c(6) & c(5) & c(3) & c(1) \\
y & -y & 0 & 0 & 0 & 0 & c(3) & c(6) & c(4) & c(1) & c(2) & c(5) \\
y & -y & 0 & 0 & 0 & 0 & c(4) & c(5) & c(1) & c(3) & c(6) & c(2) \\
y & -y & 0 & 0 & 0 & 0 & c(5) & c(3) & c(2) & c(6) & c(1) & c(4) \\
y & -y & 0 & 0 & 0 & 0 & c(6) & c(1) & c(5) & c(2) & c(4) & c(3)
\end{pmatrix}, \tag{2.5b} \]

for $x = (13 - 3\sqrt{13})/26$, $y = 3/\sqrt{13}$ and $c(j) = -2y\cos(2\pi j/13)$. It is important to note that the last 6 diagonal entries of $T$ are $T_{d_l,d_l} = \xi_{13}^{6l^2}$ for $1 \le l \le 6$, and the bottom-right $6 \times 6$ submatrix has entries $S_{d_l,d_{l'}} = c(ll')/3$ for $1 \le l, l' \le 6$. Our expression for that $6 \times 6$ submatrix is considerably simpler than the corresponding expressions in [27,35,38]. The proof of the equivalence is easy, e.g. use (1.3). A direct derivation is given in Theorem 5 below.

$D\mathrm{Hg}$ should be compared to the modular data $DS_3$ for the (untwisted) double of $S_3$, which can be obtained from (1.8) (or see (3.1) below). As would be anticipated from the previous subsection, they are very similar, except that $x$ becomes 1/2 and the last 6 rows/columns collapse into 2. The given matrix $T$ forces the central charge $c$ to be a multiple of 24 (since $T_{0,0} = 1$), but multiplying $T$ by a third root of unity (and leaving $S$ unchanged) allows us to consider the Haagerup at any multiple $c$ of 8.

In the next section we generalise $D\mathrm{Hg}$ in two ways: it can be twisted by a $\mathbb{Z}_3$ (this twist is analogous to the $H^3(BG; \mathbb{T})$-twist of finite group modular data, or the level $k \in \mathbb{Z}_{>0}$ of affine algebra modular data); and it lies in an infinite sequence corresponding to the odd dihedrals.
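As a sanity check on (2.5a) and (2.5b), one can assemble $S$ and $T$ numerically and test the modular relations $S^2 = (ST)^3 = I$ together with the quantum dimensions $S_{*,0}/S_{0,0}$. The sketch below uses NumPy; the function name `haagerup_modular_data` and the primary ordering $0, b, a, c_0, c_1, c_2, d_1, \ldots, d_6$ are assumptions made for this illustration.

```python
import numpy as np

y = 3 / np.sqrt(13)
x = (13 - 3 * np.sqrt(13)) / 26

def c(j):
    """c(j) = -2 y cos(2 pi j / 13), as defined after (2.5b)."""
    return -2 * y * np.cos(2 * np.pi * j / 13)

def haagerup_modular_data():
    """Return (S, T) of (2.5), primaries ordered 0, b, a, c0, c1, c2, d1..d6."""
    S = np.zeros((12, 12))
    S[0, 0] = S[1, 1] = x
    S[0, 1] = S[1, 0] = 1 - x
    S[0:2, 2:6] = 1
    S[2:6, 0:2] = 1
    S[2:6, 2:6] = [[2, -1, -1, -1],
                   [-1, 2, -1, -1],
                   [-1, -1, -1, 2],
                   [-1, -1, 2, -1]]
    S[0, 6:] = y
    S[1, 6:] = -y
    S[6:, 0] = y
    S[6:, 1] = -y
    for l in range(1, 7):
        for lp in range(1, 7):
            S[5 + l, 5 + lp] = c(l * lp)
    S /= 3
    xi3 = np.exp(2j * np.pi / 3)
    xi13 = np.exp(2j * np.pi / 13)
    T = np.diag([1, 1, 1, 1, xi3, xi3 ** -1]
                + [xi13 ** (6 * l * l) for l in range(1, 7)])
    return S, T

S, T = haagerup_modular_data()
identity = np.eye(12)
ST_cubed = np.linalg.matrix_power(S @ T, 3)
quantum_dims = S[:, 0] / S[0, 0]
delta = (3 + np.sqrt(13)) / 2

print("S^2 = I:     ", np.allclose(S @ S, identity))
print("(ST)^3 = I:  ", np.allclose(ST_cubed, identity))
print("dims b, a, d:", quantum_dims[1], quantum_dims[2], quantum_dims[6])
```

The printed dimensions can be compared with $1 + 3\delta$, $2 + 3\delta$ and $3\delta$ from Sect. 2.2.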

2.4. Clarifying the Haagerup-Sym(3) relation. To emphasise that the relations between $S_3$ and the Haagerup aren’t spurious, let us consider possible character vectors realising these modular data (recall Sect. 1.4). This will also help identify a VOA realisation. In this subsection we focus on the simplest possibility, central charge $c = 8$ (the smallest possible), although similar results hold for any other multiple of 8.

Consider first the modular data $DS_3$ for the quantum double of $S_3$, with primaries $0, b, a, c_i, d_l$ as in Fig. 1. This 8-dimensional $\mathrm{PSL}_2(\mathbb{Z})$-representation decomposes into 3 copies of the 1-dimensional irrep with $T = e^{-2\pi i/3}$, a 2-dimensional irrep with kernel the principal congruence subgroup $\Gamma(2)$, and a 3-dimensional irrep with kernel containing $\Gamma(3)$. These 1-, 2-, and 3-dimensional irreps are also subrepresentations of the modular data for the lattices $E_8$, $D_8$ and $A_2 \oplus E_6$ respectively, and as explained at the end of Sect. 1.4 the corresponding theta functions provide all the information needed to extract $\Lambda$ and $\chi$ for these irreps. From this we obtain for $DS_3$ at $c \equiv_{24} 8$ the matrices $\Lambda^{\equiv 8}_{S_3} = \mathrm{diag}(2/3, 2/3, 2/3, 2/3, 0, 1/3, 2/3, 1/6)$ and
\[ \chi^{\equiv 8}_{S_3} = \begin{pmatrix}
39 & 47 & 81 & 81 & 8748 & 1215 & 128 & 5120 \\
47 & 39 & 81 & 81 & 8748 & 1215 & -128 & -5120 \\
81 & 81 & 167 & -81 & -8748 & -1215 & 0 & 0 \\
81 & 81 & -81 & 167 & -8748 & -1215 & 0 & 0 \\
3 & 3 & -3 & -3 & -12 & 18 & 0 & 0 \\
27 & 27 & -27 & -27 & 1458 & -152 & 0 & 0 \\
128 & -128 & 0 & 0 & 0 & 0 & 120 & -5120 \\
16 & -16 & 0 & 0 & 0 & 0 & -16 & 140
\end{pmatrix} \]

(throughout the paper we write $a \equiv_n b$ for $a \equiv b \pmod{n}$). Once we know $\Lambda$, $\chi$, the simple recursion (1.24) yields the full $q$-expansion of $\Xi(\tau)$. From $\Xi$ we know completely explicitly all possible weakly holomorphic vector-valued modular functions for this $\mathrm{SL}_2(\mathbb{Z})$-representation. Specialising to $c = 8$, we find there is a unique possible character vector, namely the first column of $\Xi^{\equiv 8}_{S_3}$:
\[ \begin{pmatrix} \mathrm{ch}_0(\tau) \\ \mathrm{ch}_b(\tau) \\ \mathrm{ch}_a(\tau) = \mathrm{ch}_{c_0}(\tau) \\ \mathrm{ch}_{c_1}(\tau) \\ \mathrm{ch}_{c_2}(\tau) \\ \mathrm{ch}_{d_1}(\tau) \\ \mathrm{ch}_{d_2}(\tau) \end{pmatrix} = \begin{pmatrix}
q^{-1/3}(1 + 39q + 699q^2 + 5761q^3 + 35593q^4 + \cdots) \\
q^{2/3}(47 + 671q + 5825q^2 + 35459q^3 + \cdots) \\
81q^{2/3}(1 + 17q + 143q^2 + 877q^3 + \cdots) \\
3 + 243q + 2916q^2 + 21870q^3 + \cdots \\
27q^{1/3}(1 + 22q + 221q^2 + 1476q^3 + \cdots) \\
128q^{2/3}(1 + 16q + 136q^2 + 832q^3 + \cdots) \\
16q^{1/6}(1 + 36q + 394q^2 + 2776q^3 + \cdots)
\end{pmatrix}. \tag{2.6} \]
It is easy to realise $DS_3$ (and hence the character vector (2.6)) at $c = 8$. Start with the lattice VOA $\mathcal{V}(E_8)$ corresponding to the $E_8$ root lattice. Its automorphism group is the compact Lie group $E_8(\mathbb{R})$. Then the VOA realising (2.6) is the orbifold of $\mathcal{V}(E_8)$ by some subgroup $G$ of $E_8(\mathbb{R})$ isomorphic to $S_3$ (i.e. is the subVOA of $\mathcal{V}(E_8)$ fixed by $G$). This is indeed possible (see Theorem 4.3 of [43]).

Now turn to the double of the Haagerup, also at $c = 8$, with primaries labelled as in Fig. 3. Its 12-dimensional $\mathrm{PSL}_2(\mathbb{Z})$-representation decomposes into $1 + 1 + 3 + 7$, where the 1- and 3-dimensional irreps are as before, and the 7-dimensional irrep occurs as a subrepresentation of the modular data of the 4-dimensional lattice $A_3\,52_1[1, 1/4]$, in the gluing notation of Conway-Sloane [13]. From this we quickly obtain for $c \equiv_{24} 8$ its matrices $\Lambda$ and $\chi$:


$\Lambda^{\equiv 8}_{\mathrm{Hg}} = \mathrm{diag}(2/3, 2/3, 2/3, 2/3, 0, 1/3, 5/39, 20/39, 32/39, 2/39, 8/39, 11/39)$,
\[ \chi^{\equiv 8}_{\mathrm{Hg}} = \begin{pmatrix}
6 & 80 & 81 & 81 & 8748 & 1215 & 3549 & 273 & 13 & 5538 & 2275 & 1378 \\
80 & 6 & 81 & 81 & 8748 & 1215 & -3549 & -273 & -13 & -5538 & -2275 & -1378 \\
81 & 81 & 167 & -81 & -8748 & -1215 & 0 & 0 & 0 & 0 & 0 & 0 \\
81 & 81 & -81 & 167 & -8748 & -1215 & 0 & 0 & 0 & 0 & 0 & 0 \\
3 & 3 & -3 & -3 & -12 & 18 & 0 & 0 & 0 & 0 & 0 & 0 \\
27 & 27 & -27 & -27 & 1458 & -152 & 0 & 0 & 0 & 0 & 0 & 0 \\
7 & -7 & 0 & 0 & 0 & 0 & -88 & -14 & -1 & 50 & 63 & 64 \\
42 & -42 & 0 & 0 & 0 & 0 & -1484 & 92 & 16 & 2940 & -192 & -1041 \\
119 & -119 & 0 & 0 & 0 & 0 & -2142 & 987 & 11 & -24990 & -6035 & 4641 \\
5 & -5 & 0 & 0 & 0 & 0 & 17 & 13 & -3 & -2 & 35 & -14 \\
13 & -13 & 0 & 0 & 0 & 0 & 174 & -1 & -5 & 294 & -147 & 51 \\
14 & -14 & 0 & 0 & 0 & 0 & 448 & -77 & 7 & -343 & 125 & -24
\end{pmatrix}. \tag{2.7} \]
As before, this gives us the full $q$-expansion of $\Xi^{\equiv 8}_{\mathrm{Hg}}$. There are only two possible character vectors for the Haagerup modular data at $c = 8$, namely $\gamma = 0$ or $\gamma = 1$ in
\[ \begin{pmatrix} \mathrm{ch}_0(\tau) \\ \mathrm{ch}_b(\tau) \\ \mathrm{ch}_a(\tau) = \mathrm{ch}_{c_0}(\tau) \\ \mathrm{ch}_{c_1}(\tau) \\ \mathrm{ch}_{c_2}(\tau) \\ \mathrm{ch}_{d_1}(\tau) \\ \mathrm{ch}_{d_2}(\tau) \\ \mathrm{ch}_{d_3}(\tau) \\ \mathrm{ch}_{d_4}(\tau) \\ \mathrm{ch}_{d_5}(\tau) \\ \mathrm{ch}_{d_6}(\tau) \end{pmatrix} = \begin{pmatrix}
q^{2/3}\bigl(q^{-1} + (6 + 13\gamma) + (120 + 78\gamma)q + (956 + 351\gamma)q^2 + (6010 + 1235\gamma)q^3 + \cdots\bigr) \\
q^{2/3}\bigl((80 - 13\gamma) + (1250 - 78\gamma)q + (10630 - 351\gamma)q^2 + (65042 - 1235\gamma)q^3 + \cdots\bigr) \\
q^{2/3}\bigl(81 + 1377q + 11583q^2 + 71037q^3 + \cdots\bigr) \\
3 + 243q + 2916q^2 + 21870q^3 + \cdots \\
q^{1/3}\bigl(27 + 594q + 5967q^2 + 39852q^3 + \cdots\bigr) \\
q^{5/39}\bigl((7 - \gamma) + (292 - 6\gamma)q + (3204 - 43\gamma)q^2 + (23010 - 146\gamma)q^3 + \cdots\bigr) \\
q^{20/39}\bigl((42 + 16\gamma) + (777 + 121\gamma)q + (7147 + 547\gamma)q^2 + (45367 + 2000\gamma)q^3 + \cdots\bigr) \\
q^{32/39}\bigl(\gamma q^{-1} + (11\gamma + 119) + (73\gamma + 1623)q + (300\gamma + 12996)q^2 + (76429 + 1063\gamma)q^3 + \cdots\bigr) \\
q^{2/39}\bigl((5 - 3\gamma) + (229 - 50\gamma)q + (2738 - 252\gamma)q^2 + (19942 - 1032\gamma)q^3 + \cdots\bigr) \\
q^{8/39}\bigl((13 - 5\gamma) + (347 - 37\gamma)q + (3804 - 212\gamma)q^2 + (26390 - 794\gamma)q^3 + \cdots\bigr) \\
q^{11/39}\bigl((14 + 7\gamma) + (441 + 61\gamma)q + (4445 + 303\gamma)q^2 + (30329 + 1167\gamma)q^3 + \cdots\bigr)
\end{pmatrix}. \]

The proof for this (and for (2.6)) is similar to that sketched in Sect. 5.2. That there are sensible character vectors here is strong evidence for the existence of the corresponding VOA. We see remarkable similarities between DHg and DS3 when c = 8. In particular, the corresponding characters ch a, ch ci are identical, as are the sums ch 0 + ch b. The remaining characters ch d1 , . . . , ch d6 of DHg can have no relation with characters ch d1 , ch d2 of DS3 , since the exponents of their q-expansions are unrelated.
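The agreement of the sums $\mathrm{ch}_0 + \mathrm{ch}_b$ can be checked directly from the printed coefficients. As an additional consistency check that is not stated in the text, the combination $\mathrm{ch}_0 + \mathrm{ch}_b + 2\mathrm{ch}_a$ underlying $Z_{22}$ appears to reproduce the first coefficients of the $E_8$ lattice character $E_4/\eta^8$, for $DS_3$ and for $D\mathrm{Hg}$ with either value of $\gamma$. The sketch below recomputes $E_4/\eta^8$ from scratch; all function names are illustrative.

```python
def e8_character(n_terms):
    """First n_terms coefficients of q^(1/3) * E_4(q) / eta(q)^8."""
    def sigma3(n):
        return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)
    e4 = [1] + [240 * sigma3(n) for n in range(1, n_terms)]
    # Expand prod_n (1 - q^n)^(-8): multiply by 1/(1 - q^n) eight times per n.
    inv_eta8 = [1] + [0] * (n_terms - 1)
    for n in range(1, n_terms):
        for _ in range(8):
            for k in range(n, n_terms):
                inv_eta8[k] += inv_eta8[k - n]
    return [sum(e4[i] * inv_eta8[k - i] for i in range(k + 1))
            for k in range(n_terms)]

# Coefficients relative to q^(-1/3), read off from (2.6).
def s3_sums():
    ch0 = [1, 39, 699, 5761, 35593]
    chb = [0, 47, 671, 5825, 35459]
    cha = [0] + [81 * k for k in (1, 17, 143, 877)]
    vac = [p + r for p, r in zip(ch0, chb)]
    return vac, [v + 2 * a for v, a in zip(vac, cha)]

# Same, read off from the Haagerup character vector (gamma = 0 or 1).
def haagerup_sums(gamma):
    ch0 = [1, 6 + 13 * gamma, 120 + 78 * gamma,
           956 + 351 * gamma, 6010 + 1235 * gamma]
    chb = [0, 80 - 13 * gamma, 1250 - 78 * gamma,
           10630 - 351 * gamma, 65042 - 1235 * gamma]
    cha = [0, 81, 1377, 11583, 71037]
    vac = [p + r for p, r in zip(ch0, chb)]
    return vac, [v + 2 * a for v, a in zip(vac, cha)]

e8 = e8_character(5)
print("E8 character:      ", e8)
print("DS3  (vac, total): ", s3_sums())
print("DHg  gamma=0:      ", haagerup_sums(0))
print("DHg  gamma=1:      ", haagerup_sums(1))
```

The vacuum sums agree between the two theories and are independent of $\gamma$, which is the numerical content of the statement that $\mathrm{ch}_0 + \mathrm{ch}_b$ is common to $D\mathrm{Hg}$ and $DS_3$.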


This suggests that there is a rational VOA at $c = 8$, call it $\mathcal{V}_8$, which has conformal subalgebras realising the $D\mathrm{Hg}$ and $DS_3$ modular data (conformal subalgebra was defined at the beginning of Sect. 1). The $\mathcal{V}_8$ characters will be $\mathrm{ch}_0 + \mathrm{ch}_b$ (the vacuum character) and some multiples of $\mathrm{ch}_a$, $\mathrm{ch}_{c_i}$, for the ch’s given in (2.6). Indeed, the knowledge of modular invariants says how this should go. We should look for modular invariants of extension type (i.e. sum of squares) for both $DS_3$ and $D\mathrm{Hg}$, which start with $|\mathrm{ch}_0 + \mathrm{ch}_b|^2$. Both $DS_3$ and $D\mathrm{Hg}$ have one, namely (2.3) and (2.4).

In the case of $S_3$ we readily identify the corresponding chiral extension: $\mathcal{V}_8$ corresponds to the quantum double $D\mathbb{Z}_3$ of the cyclic group $\mathbb{Z}_3$. The modular invariant says two inequivalent $D\mathbb{Z}_3$-primaries correspond to each $\mathrm{ch}_a$, $\mathrm{ch}_{c_i}$. So $\mathcal{V}_8$ is the $\mathbb{Z}_3$-orbifold (i.e. the $\mathbb{Z}_3$-invariant part) of the rational VOA $\mathcal{V}(E_8)$, for some choice of order-3 element $g$ of $E_8(\mathbb{R})$. The group $E_8(\mathbb{R})$ contains four inequivalent order-3 elements: orbifolding by any of these would give a VOA with 9 primaries realising modular data associated to a quantum double of $\mathbb{Z}_3$. Half the time this double is twisted and half the time it isn’t. Take one of the order-3 orbifolds corresponding to the untwisted double $D\mathbb{Z}_3$. To recover the $DS_3$ VOA, choose an order-2 element $h$ of $E_8(\mathbb{R})$ for which $gh = hg^{-1}$: the $h$-orbifold of $\mathcal{V}_8$ realises $DS_3$. Unfortunately the Haagerup VOA won’t itself be an orbifold of $\mathcal{V}_8$ (see Sect. 5.1 below), which begs Question 3 of Sect. 1.

To better identify $\mathcal{V}_8$, we should repeat the analysis for $D\mathbb{Z}_3$. Label its 9 primaries by $(i, j) \in \mathbb{Z}_3^2$, where $(0, 0)$ is the vacuum. Note that charge-conjugation $C = S^2$ is no longer the identity: it sends $(i, j)$ to $(-i, -j)$. Hence the characters $\mathrm{ch}_{(i,j)}$ and $\mathrm{ch}_{(-i,-j)}$ are equal and we should project to the subrepresentation
\[ \mathrm{Span}\{\mathrm{ch}_{(0,0)},\ (\mathrm{ch}_{(0,1)} + \mathrm{ch}_{(0,2)})/2,\ (\mathrm{ch}_{(1,0)} + \mathrm{ch}_{(2,0)})/2,\ (\mathrm{ch}_{(1,1)} + \mathrm{ch}_{(2,2)})/2,\ (\mathrm{ch}_{(1,2)} + \mathrm{ch}_{(2,1)})/2\} \]
on which $\mathbb{Z}_2 \cong \langle C \rangle$ acts trivially. This decomposes into $1 + 1 + 3$, where the 1- and 3-dimensional irreps are as before. Choosing the above basis, we obtain $\Lambda^{\equiv 8}_{\mathbb{Z}_3} = \mathrm{diag}(2/3, 2/3, 2/3, 0, 1/3)$ and
\[ \chi^{0,\equiv 8}_{\mathbb{Z}_3} = \begin{pmatrix}
86 & 162 & 162 & 17496 & 2430 \\
81 & 167 & -81 & -8748 & -1215 \\
81 & -81 & 167 & -8748 & -1215 \\
3 & -3 & -3 & -12 & 18 \\
27 & -27 & -27 & 1458 & -152
\end{pmatrix}. \]
It is immediate that the only character vector possible for $\mathcal{V}_8$ is
\[ \begin{pmatrix} \mathrm{ch}_{(0,0)} \\ \mathrm{ch}_{(0,1)} = \mathrm{ch}_{(0,2)} = \mathrm{ch}_{(1,0)} = \mathrm{ch}_{(2,0)} \\ \mathrm{ch}_{(1,1)} = \mathrm{ch}_{(2,2)} \\ \mathrm{ch}_{(1,2)} = \mathrm{ch}_{(2,1)} \end{pmatrix} = \begin{pmatrix}
q^{-1/3}(1 + 86q + 1370q^2 + 11586q^3 + \cdots) \\
q^{2/3}(81 + 1377q + 11583q^2 + 71037q^3 + \cdots) \\
3 + 243q + 2916q^2 + 21870q^3 + \cdots \\
q^{1/3}(27 + 594q + 5967q^2 + 39852q^3 + \cdots)
\end{pmatrix}. \tag{2.8} \]
We recognise this as the character vector for the lattice VOA $\mathcal{V}(A_2E_6)$, so this is the VOA $\mathcal{V}_8$ containing both the $S_3$ and Haagerup VOAs. This $\mathcal{V}_8$ also has, like $\mathcal{V}(E_8)$, an interpretation as an affine algebra VOA, and the containment $\mathcal{V}_8 \subset \mathcal{V}(E_8)$ corresponds to the conformal embedding $A_{2,1}E_{6,1} \subset E_{8,1}$. It can also be realised explicitly as an orbifold; see [43]. The only task remaining is to identify the Haagerup VOA as a subalgebra of $\mathcal{V}(A_2E_6)$.


2.5. The Haagerup and SO(13). The relationship with $S_3$ concerns the primaries $a$, $c_i$ and to a lesser extent $0$, $b$. There also is a striking relationship with the affine algebra modular data $B_{6,2}$ concerning the primaries $d_l$ (and to a lesser extent $0$, $b$). $\mathcal{V}(B_{6,2})$ has central charge $c = 12$, and 10 primaries we’ll denote $0$, $b = 2\Lambda_1$, $a_1 = \Lambda_6$, $a_2 = \Lambda_1 + \Lambda_6$, $d_1 = \Lambda_1, \ldots, d_5 = \Lambda_5$, $d_6 = 2\Lambda_6$. The $T$-matrix is $\mathrm{diag}(-1, -1; -i, i; -\xi_{13}^{6l^2})$, while the $S$ matrix is [44]
\[ S = \frac{1}{3} \begin{pmatrix}
y/2 & y/2 & 3/2 & 3/2 & y & y & y & y & y & y \\
y/2 & y/2 & -3/2 & -3/2 & y & y & y & y & y & y \\
3/2 & -3/2 & 3/2 & -3/2 & 0 & 0 & 0 & 0 & 0 & 0 \\
3/2 & -3/2 & -3/2 & 3/2 & 0 & 0 & 0 & 0 & 0 & 0 \\
y & y & 0 & 0 & -c(1) & -c(2) & -c(3) & -c(4) & -c(5) & -c(6) \\
y & y & 0 & 0 & -c(2) & -c(4) & -c(6) & -c(5) & -c(3) & -c(1) \\
y & y & 0 & 0 & -c(3) & -c(6) & -c(4) & -c(1) & -c(2) & -c(5) \\
y & y & 0 & 0 & -c(4) & -c(5) & -c(1) & -c(3) & -c(6) & -c(2) \\
y & y & 0 & 0 & -c(5) & -c(3) & -c(2) & -c(6) & -c(1) & -c(4) \\
y & y & 0 & 0 & -c(6) & -c(1) & -c(5) & -c(2) & -c(4) & -c(3)
\end{pmatrix}, \tag{2.9} \]
where $y = 3/\sqrt{13}$ and $c(j) = -2y\cos(2\pi j/13)$ as before. Ignoring the first 4 primaries, the only difference with $D\mathrm{Hg}$ are some signs. This strongly suggests relations between $\mathcal{V}(B_{6,2})$ and the (still hypothetical) Haagerup VOA involving the Goddard–Kent–Olive coset construction [33]. In the VOA language, the coset construction was developed in [29]; see also the lucid treatment in Sect. 3.11 of [47]. There are several ways this could go: e.g. the Haagerup VOA at $c = 8$ could be a coset of $\mathcal{V}(B_{6,2})$ by a $c = 4$ subVOA $\mathcal{V}_4$. In this case the characters of $\mathcal{V}(B_{6,2})$ (which can be determined by the Weyl-Kac character formula [42, Ch. 13]) would be built from the characters of $\mathcal{V}_4$ and the Haagerup.

Although the possible relations between the Haagerup and $\mathcal{V}(B_{6,2})$, in particular those involving the coset construction, seem very intriguing, for reasons of space we limit the discussion in this paper to relations between the Haagerup and $S_3$.
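The $B_{6,2}$ data in (2.9) can be checked in the same numerical way as (2.5). The sketch below builds $S$ and $T$ as printed, tests $S^2 = (ST)^3 = I$, and extracts the $d_l$ block for comparison with the Haagerup one. The function name `b62_modular_data` and the primary ordering $0, b, a_1, a_2, d_1, \ldots, d_6$ are assumptions of this illustration.

```python
import numpy as np

y = 3 / np.sqrt(13)
l = np.arange(1, 7)
# c(l l') = -2 y cos(2 pi l l' / 13) for 1 <= l, l' <= 6
c_block = -2 * y * np.cos(2 * np.pi * np.outer(l, l) / 13)

def b62_modular_data():
    """Return (S, T) of B_{6,2} as in (2.9), ordered 0, b, a1, a2, d1..d6."""
    S = np.zeros((10, 10))
    S[:2, :2] = y / 2
    S[0, 2:4] = 1.5
    S[1, 2:4] = -1.5
    S[2:4, 0] = 1.5
    S[2:4, 1] = -1.5
    S[2:4, 2:4] = [[1.5, -1.5], [-1.5, 1.5]]
    S[:2, 4:] = y
    S[4:, :2] = y
    S[4:, 4:] = -c_block
    S /= 3
    xi13 = np.exp(2j * np.pi / 13)
    T = np.diag(np.concatenate([[-1, -1, -1j, 1j], -xi13 ** (6 * l ** 2)]))
    return S, T

S_B, T_B = b62_modular_data()
identity = np.eye(10)
ST_cubed = np.linalg.matrix_power(S_B @ T_B, 3)

print("S^2 = I:   ", np.allclose(S_B @ S_B, identity))
print("(ST)^3 = I:", np.allclose(ST_cubed, identity))
# The d-block of DHg in (2.5b) is +c(l l')/3; here it is -c(l l')/3.
print("d-block is minus the DHg d-block:",
      np.allclose(S_B[4:, 4:], -c_block / 3))
```

The sign flip in the $d_l$ block, together with the factor $-1$ in the corresponding $T$ entries, is the "some signs" difference mentioned above.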

3. Generalising the Modular Data of the Haagerup

3.1. Dihedral groups and orthogonal algebras. In the last section we observed that the Haagerup modular data is closely related to that of the symmetric group $S_3 = \mathrm{Sym}(3)$ and affine so(13) at level 2. More generally, we propose in the next subsection a two-parameter generalisation of the Haagerup, related to the odd dihedral groups and affine so($2m + 1$) at level 2.

First, we compute the modular data for quantum doubles of the dihedral group $D_\nu = \langle \tau, u \mid \tau^2, u^\nu, \tau u = u^{-1}\tau \rangle$, where $\nu = 2n + 1$. Of course $S_3 \cong D_3$. The twist group is $H^3(BD_\nu; \mathbb{T}) = \mathbb{Z}_{2\nu} \cong \mathbb{Z}_2 \times \mathbb{Z}_\nu$; because the Schur multipliers of $D_\nu$ and cyclic groups all vanish, the modular data is cohomologically trivial and is given by (1.8). The conjugacy classes of $D_\nu$ have representatives $e$, $u^h$, $\tau$ for $1 \le h \le n$, with centralisers $D_\nu$, $\langle u \rangle$, $\langle \tau \rangle$ respectively. There are two 1-dimensional irreps of $D_\nu$ (call them $\psi_0 = 1$, $\psi_1$) and $n$ 2-dimensional ones (call them $\sigma_i$ for $1 \le i \le n$); denote the $\nu$ 1-dimensional irreps of $\langle u \rangle \cong \mathbb{Z}_\nu$ by $\pi_j$ for $0 \le j < \nu$, and the two 1-dimensional irreps of $\langle \tau \rangle \cong \mathbb{Z}_2$ by $\psi_0$, $\psi_1$ again. The primaries fall into four classes:

1. two primaries: the vacuum $0 := (e, 1)$ and $b := (e, \psi_1)$;
2. $n$ primaries, labeled $a_i := (e, \sigma_i)$ for $1 \le i \le n$;
3. $n\nu$ primaries, labeled $c_{h,j} := (u^h, \pi_j)$ for $1 \le h \le n$, $0 \le j < \nu$;
4. two primaries, labeled $d_l := (\tau, \psi_l)$, $l = 1, 2$.

Fix a sign $s = \pm$ and integer $\omega \in \mathbb{Z}$. Corresponding to these 4 classes we have the modular data
\[ T = \mathrm{diag}\bigl(1, 1;\ 1, \ldots, 1;\ \exp[2\pi i(\omega h^2 + \nu h j)/\nu^2];\ t, -t\bigr), \]
\[ S = \frac{1}{\nu} \begin{pmatrix}
\tfrac{1}{2}_{2\times 2} & 1_{2\times n} & 1_{2\times n\nu} & \nu B \\
1_{n\times 2} & 2_{n\times n} & D & 0_{n\times 2} \\
1_{n\nu\times 2} & D^t & E & 0_{n\nu\times 2} \\
\nu B^t & 0_{2\times n} & 0_{2\times n\nu} & s\nu F
\end{pmatrix}, \tag{3.1} \]
where $t = 1, i$ for $s = 1, -1$ resp., $k_{a\times b}$ for any number $k$ is the $a \times b$ matrix with constant entry $k$,
\[ B = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix} \quad \text{and} \quad F = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \]
$D_{i,(h,j)} = 2\cos(2\pi i h/\nu)$, and $E_{(h,j),(h',j')} = 2\cos(2\pi(2\omega h h' + \nu h j' + \nu h' j)/\nu^2)$. We’ll denote this modular data by $D^{s,\omega}D_\nu$. The untwisted double is $D^{+,0}D_\nu$. Note that $D^{s,\omega+k\nu}D_\nu$ is equivalent for any integer $k$, so the twist does indeed live in $\mathbb{Z}_{2\nu}$. We’ve separated the order 2 from the order $\nu$ twists, for later convenience. The quantum dimensions (= statistical dimensions) $S_{*,0}/S_{0,0}$ of $b$, $a_i$, $c_{h,j}$, $d_l$ are $1, 2, 2, \nu$, respectively, and the global dimension $1/S_{0,0}$ is $2\nu$.

The $S$, $T$ entries live in $\mathbb{Q}[\xi_{4\nu^2}]$. Choose any Galois automorphism $\sigma_\ell \in \mathrm{Gal}(\mathbb{Q}[\xi_{4\nu^2}]/\mathbb{Q})$ fixing $i$ and define $\ell \in \mathbb{Z}^\times_{\nu^2}$ by $\sigma_\ell\, \xi_{\nu^2} = \xi_{\nu^2}^{\ell}$. Then applying $\sigma_\ell$ to $D^{s,\omega}D_\nu$, as explained in Sect. 1.1, yields modular data equivalent to $D^{s,\ell\omega}D_\nu$. In particular the fusion ring of $D^{s,\omega}D_\nu$ depends only on $\nu$, $s$ and $\gcd(\omega, \nu)$. In fact, if $\ell \in \mathbb{Z}^\times_\nu$, then $D^{s,\ell^2\omega}D_\nu$ and $D^{s,\omega}D_\nu$ are equivalent.

The second ingredient going into our generalisation of the Haagerup modular data is the affine algebra modular data $B_{m,2}$, which has central charge $c = 2m$. In particular [44], $B_{m,2}$ has $m + 4$ primaries, labelled $0$, $b = 2\Lambda_1$, $a_1 = \Lambda_m$, $a_2 = \Lambda_1 + \Lambda_m$, $d_l = \Lambda_l$ for $1 \le l < m$ and $d_m = 2\Lambda_m$. Write $\mu = 2m + 1$. The $T$-matrix is $\xi_{12}^{-m}\,\mathrm{diag}(1, 1;\ \xi_8^m, -\xi_8^m;\ \xi_\mu^{ml^2})$, while the $S$-matrix is
\[ S = \begin{pmatrix}
\tfrac{x}{2}_{2\times 2} & B & x_{2\times m} \\
B^T & F & 0_{2\times m} \\
x_{m\times 2} & 0_{m\times 2} & H
\end{pmatrix}, \tag{3.2} \]
where $x = 1/\sqrt{\mu}$, $B$ and $F$ are as above, and $H_{l,l'} = 2x\cos(2\pi l l'/\mu)$. The quantum dimensions of $b$, $a_i$, $d_l$ are $1, \sqrt{\mu}, 2$, respectively, with global dimension $2\sqrt{\mu}$. There is no obvious twist of the $B_{m,2}$. Curiously [15], affine so($\nu^2$) at level 2 coincides with $D^{\omega}D_\nu$ for a specifically chosen twist $\omega$ (this is one of the few overlaps of affine algebra and finite group modular data [15]), but that fact seems to have no role in our story (in particular we are not interested in the values $m = (\nu^2 - 1)/2$).
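The block description (3.1) is straightforward to assemble numerically, which gives a quick check that the twisted and untwisted dihedral data satisfy $S^2 = (ST)^3 = I$ and have global dimension $2\nu$. The following sketch assumes the primary ordering $0, b, a_1, \ldots, a_n, c_{1,0}, \ldots, c_{n,\nu-1}, d_1, d_2$; the function name `dihedral_double` is hypothetical.

```python
import numpy as np

def dihedral_double(n, s=1, omega=0):
    """Return (S, T) of D^{s,omega} D_nu from (3.1), with nu = 2n + 1."""
    nu = 2 * n + 1
    t = 1 if s == 1 else 1j
    # class 3 labels (h, j): h = 1..n, j = 0..nu-1, in lexicographic order
    h = np.repeat(np.arange(1, n + 1), nu)
    j = np.tile(np.arange(nu), n)
    i = np.arange(1, n + 1)
    a0, c0, d0 = 2, 2 + n, 2 + n + n * nu      # block offsets
    B = 0.5 * np.array([[1, 1], [-1, -1]])
    F = 0.5 * np.array([[1, -1], [-1, 1]])
    D = 2 * np.cos(2 * np.pi * np.outer(i, h) / nu)
    E = 2 * np.cos(2 * np.pi * (2 * omega * np.outer(h, h)
                                + nu * np.outer(h, j)
                                + nu * np.outer(j, h)) / nu ** 2)
    S = np.zeros((d0 + 2, d0 + 2))
    S[:2, :2] = 0.5
    S[:2, a0:d0] = 1
    S[a0:d0, :2] = 1
    S[a0:c0, a0:c0] = 2
    S[a0:c0, c0:d0] = D
    S[c0:d0, a0:c0] = D.T
    S[c0:d0, c0:d0] = E
    S[:2, d0:] = nu * B
    S[d0:, :2] = nu * B.T
    S[d0:, d0:] = s * nu * F
    S /= nu
    T = np.diag(np.concatenate([
        np.ones(2 + n),
        np.exp(2j * np.pi * (omega * h * h + nu * h * j) / nu ** 2),
        [t, -t]]))
    return S, T

def is_modular(S, T):
    """Check S^2 = I and (ST)^3 = I numerically."""
    identity = np.eye(len(S))
    return (np.allclose(S @ S, identity)
            and np.allclose(np.linalg.matrix_power(S @ T, 3), identity))

for n, s, omega in [(1, 1, 0), (1, -1, 0), (1, 1, 1), (2, 1, 0), (2, -1, 3)]:
    S, T = dihedral_double(n, s, omega)
    print(n, s, omega, is_modular(S, T), 1 / S[0, 0])
```

For $n = 1$, $s = +$, $\omega = 0$ this is the untwisted $DS_3$ data with its 8 primaries.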


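3.2. Generalising the Haagerup modular data. Sects. 2.3, 2.5 and 3.1 suggest a generalisation of $D\mathrm{Hg}$. Write $\nu = 2n + 1$ for $n \ge 0$, and choose $\omega \in \mathbb{Z}$ as before. Write $m = 2(n^2 + n + 1)$, $\mu = 2m + 1 = \nu^2 + 4$, and $\delta = (\nu + \sqrt{\mu})/2$. Note that $\delta$ satisfies $\delta^2 = \nu\delta + 1$ and lies in the range $\nu < \delta < \nu + \tfrac{1}{2}$. Again the primaries fall into four classes:

1. two primaries, denoted $0$ and $b$;
2. $n$ primaries, denoted $a_i$ for $1 \le i \le n$;
3. $n\nu$ primaries, denoted $c_{h,j}$ for $1 \le h \le n$, $0 \le j < \nu$;
4. $m$ primaries, denoted $d_l$ for $1 \le l \le m$.

Breaking $S$ and $T$ into 16 blocks, as in the previous subsection, we get
\[ T = \mathrm{diag}\bigl(1, 1;\ 1, \ldots, 1;\ \exp[2\pi i(\omega h^2 + \nu h j)/\nu^2];\ \exp[2\pi i\, l^2 m/\mu]\bigr), \]
\[ S = \frac{1}{\nu} \begin{pmatrix}
A & 1_{2\times n} & 1_{2\times n\nu} & B' \\
1_{n\times 2} & 2_{n\times n} & D & 0_{n\times m} \\
1_{n\nu\times 2} & D^t & E & 0_{n\nu\times m} \\
B'^t & 0_{m\times n} & 0_{m\times n\nu} & -\nu H
\end{pmatrix}, \tag{3.3} \]
where $k_{a\times b}$, $D$, $E$ and $H$ are as in Sect. 3.1 (so $-\nu H_{l,l'} = -2y\cos(2\pi l l'/\mu)$), and
\[ A = \frac{1}{2}\begin{pmatrix} 1-y & 1+y \\ 1+y & 1-y \end{pmatrix} \quad \text{and} \quad B' = y\begin{pmatrix} 1 & 1 & \cdots & 1 \\ -1 & -1 & \cdots & -1 \end{pmatrix} \]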

for $y = \nu/\sqrt{\mu}$. Denote this modular data by $D^\omega\mathrm{Hg}_{2n+1}$; we call $D^0\mathrm{Hg}_\nu$ Haagerup-Izumi modular data. To our knowledge this modular data is new except when $n \le 1$ and $\nu$ divides $\omega$; in particular twisting the Haagerup data was unanticipated in the literature. We discuss further generalisations in the next subsection.

The quantum dimensions (or statistical dimensions) $S_{*,0}/S_{0,0}$ for $b$, $a_i$, $c_{h,j}$, $d_l$ are respectively $1 + \nu\delta$, $2 + \nu\delta$, $2 + \nu\delta$, and $\nu\delta$. The global dimension is $1/S_{0,0} = \nu(\nu\delta + 2)$. Note that the submatrix $-\nu H$ has exactly half of the $\mu - 1$ rows and columns of $S$.

The fusions can be computed directly using Verlinde’s formula (1.1) (or from Proposition 2 below). As mentioned in Sect. 1.1, it suffices to consider all $i, j, k$ different from 0. For $D^\omega\mathrm{Hg}_\nu$, all of those fusion coefficients are 1 except for
\[ N_{b,a_i,a_i} = 2, \quad N_{b,c_{h,i},c_{h,i}} = 2, \quad N_{b,d_l,d_l} = 0, \]
\[ N_{a_i,a_j,a_k} = 2 \ \text{ if } k \equiv_\nu si + s'j, \]
\[ N_{c_{h,i},c_{h',j},c_{h'',k}} = 2 \ \text{ if } h'' \equiv_\nu sh + s'h', \ k \equiv_\nu si + s'j + 2\omega(sh + s'h' - h'')/\nu, \]
\[ N_{a_i,c_{h,j},c_{h,k}} = 2 \ \text{ if } si \equiv_\nu j - k, \]
\[ N_{d_l,d_{l'},d_{l''}} = 0 \ \text{ if } l'' \equiv_\mu sl + s'l', \]
where $s, s' \in \{\pm 1\}$ are arbitrary signs. As in Sect. 3.1, $D^\omega\mathrm{Hg}_{2n+1}$ depends up to equivalence on the value of the twist $\omega$ mod $\nu$; the fusion ring depends up to equivalence only on $n$ and $\gcd(\nu, \omega)$; and $D^{\omega}\mathrm{Hg}_{2n+1} \cong D^{\ell^2\omega}\mathrm{Hg}_{2n+1}$ for any $\ell$ coprime to $\nu$.

The computation of fusions reduces to the identity
\[ 8\sum_{d=1}^{k} \cos\frac{2\pi a d}{2k+1}\, \cos\frac{2\pi b d}{2k+1}\, \cos\frac{2\pi c d}{2k+1} = -4 + (2k+1)s, \]
proved using $2\cos(x) = e^{ix} + e^{-ix}$, where $0 \le s \le 4$ is the number of pairs $(s_a, s_b)$ of signs $\pm 1$ such that $c \equiv_{2k+1} s_a a + s_b b$. Modularity reduces to the Gauss sum
\[ \sum_{a=0}^{2k} \exp[2\pi i\, c a^2/(2k+1)] = \sqrt{2k+1}\, \left(\frac{c}{2k+1}\right) \begin{cases} 1 & \text{for } k \text{ even} \\ i & \text{for } k \text{ odd} \end{cases}. \]
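The block form (3.3) and the claims above about dimensions and fusions can be tested numerically for small $n$ and $\omega$. The sketch below assembles $S$ and $T$, checks $S^2 = (ST)^3 = I$, and evaluates Verlinde's formula $N_{ijk} = \sum_l S_{il}S_{jl}S_{kl}/S_{0l}$ (valid here because $S$ is real and charge conjugation is trivial). The function names and the primary ordering $0, b, a_i, c_{h,j}, d_l$ are assumptions of this illustration.

```python
import numpy as np

def haagerup_izumi(n, omega=0):
    """Return (S, T) of D^omega Hg_nu from (3.3), with nu = 2n + 1."""
    nu = 2 * n + 1
    mu = nu * nu + 4
    m = (mu - 1) // 2
    y = nu / np.sqrt(mu)
    h = np.repeat(np.arange(1, n + 1), nu)     # class 3 labels (h, j)
    j = np.tile(np.arange(nu), n)
    i = np.arange(1, n + 1)
    l = np.arange(1, m + 1)
    a0, c0, d0 = 2, 2 + n, 2 + n + n * nu      # block offsets
    S = np.zeros((d0 + m, d0 + m))
    S[:2, :2] = 0.5 * np.array([[1 - y, 1 + y], [1 + y, 1 - y]])
    S[:2, a0:d0] = 1
    S[a0:d0, :2] = 1
    S[a0:c0, a0:c0] = 2
    S[a0:c0, c0:d0] = 2 * np.cos(2 * np.pi * np.outer(i, h) / nu)
    S[c0:d0, a0:c0] = S[a0:c0, c0:d0].T
    S[c0:d0, c0:d0] = 2 * np.cos(2 * np.pi * (2 * omega * np.outer(h, h)
                                              + nu * np.outer(h, j)
                                              + nu * np.outer(j, h)) / nu ** 2)
    S[0, d0:] = y
    S[1, d0:] = -y
    S[d0:, 0] = y
    S[d0:, 1] = -y
    S[d0:, d0:] = -2 * y * np.cos(2 * np.pi * np.outer(l, l) / mu)  # -nu*H
    S /= nu
    T = np.diag(np.concatenate([
        np.ones(2 + n),
        np.exp(2j * np.pi * (omega * h * h + nu * h * j) / nu ** 2),
        np.exp(2j * np.pi * m * l * l / mu)]))
    return S, T

def verlinde(S):
    """Fusion coefficients N[i, j, k] = sum_l S_il S_jl S_kl / S_0l."""
    return np.einsum("il,jl,kl->ijk", S, S, S / S[0])

for n, omega in [(1, 0), (1, 1), (2, 0), (2, 3)]:
    S, T = haagerup_izumi(n, omega)
    N = verlinde(S)
    modular = (np.allclose(S @ S, np.eye(len(S)))
               and np.allclose(np.linalg.matrix_power(S @ T, 3), np.eye(len(S))))
    integral = np.allclose(N, np.round(N)) and N.min() > -0.5
    print(n, omega, modular, integral, round(1 / S[0, 0], 6))
```

With $n = 1$ and $\omega = 0$ this reproduces the 12 primaries of $D\mathrm{Hg}$ from Sect. 2.3. The entries `N[1, 2, 2]` and `N[1, -1, -1]` correspond to $N_{b,a_1,a_1}$ and $N_{b,d_m,d_m}$ in the list above.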

Since $\mu \equiv_8 5$, $\mu$ can never be a perfect square, and the Jacobi symbol $\left(\frac{m}{\mu}\right)$ will always equal $-1$. See Proposition 2 below for a more elegant argument.

Of course $D^0\mathrm{Hg}_3$ recovers the original Haagerup double $D\mathrm{Hg}$, given in Sect. 2.3. The similarity with the modular data $D^{+,\omega}D_\nu$ and $B_{m,2}$ is evident (ignoring the first 4 primaries of $B_{m,2}$, the only difference with $D^\omega\mathrm{Hg}_\nu$ is that the bottom-right corners of both $T$ and $S$ are off by some 6th root of 1). In particular, class 2 runs through the 2-dimensional irreps of $D_\nu$, while $h$ and $j$ in class 3 parametrise the size-2 conjugacy classes in $D_\nu$ and the irreps of $\mathbb{Z}_\nu$, respectively; class 4 runs through the fundamental nonspinorial weights of so($\mu$).

The mysterious ‘13’ in $D\mathrm{Hg}$ is thus $4 + 3^2$, where the 3 here references the normal subgroup $\mathbb{Z}_3$ of $S_3$. It is tempting to guess that the 4 is $2^2$, where the ‘2’ is the involution of $S_3$. This suggests the further generalisation of $D^\omega\mathrm{Hg}_\nu$ given in the Wildly Optimistic Guess of Sect. 1.

3.3. Further generalisations. The way in which $D^\omega\mathrm{Hg}_\nu$ is built from $D^{+,\omega}D_\nu$ and $B_{m,2}$ leads to the notion of grafting modular data. A simple instance is provided by the following proposition, but it can be massively generalised. For instance it would be interesting to extend it to vacuum blocks larger than $2 \times 2$. For simplicity restrict here to unitary modular data (recall Sect. 1.1) with real matrix $S$.

Call modular data $(\Phi, 0, S, T)$ $\mathbb{Z}_2$-laminated if it can be written in block form $\Phi = (0, b;\ a_1, \ldots, a_m;\ d_1, \ldots, d_n)$ such that $S_{0,b} = S_{0,0}$, $N_{b,a_i}^{a_i} \neq 0$, $T_{b,b} = T_{0,0}$ and $S_{b,d_j} < 0$. It is elementary to verify that modular data is $\mathbb{Z}_2$-laminated iff it can be written in the form
\[ S = \begin{pmatrix}
x & x & \vec{a} & \vec{d} \\
x & x & \vec{a} & -\vec{d} \\
\vec{a}^{\,T} & \vec{a}^{\,T} & A & 0_{m\times n} \\
\vec{d}^{\,T} & -\vec{d}^{\,T} & 0_{n\times m} & D
\end{pmatrix}, \qquad T = \mathrm{diag}(r, r;\ \vec{s};\ \vec{t}), \tag{3.4} \]
for some numbers $x, r \in \mathbb{R}$, row vectors $\vec{a}, \vec{s} \in \mathbb{R}^m$, $\vec{d}, \vec{t} \in \mathbb{R}^n$ and matrices $A$, $D$. The condition $S_{b,0} = S_{0,0}$ implies the matrix $N_b$ with entries $N_{b,i}^{j}$ for all $i, j \in \Phi$ is a permutation matrix (see e.g. [31]); in conformal field theory, such a primary $b$ is called a simple-current. The modular data $D^{\pm,\omega}D_\nu$ and $B_{r,2}$ are both $\mathbb{Z}_2$-laminated, for any odd $\nu$ and any twist $(\pm, \omega)$ and any rank $r$, as is $A_{1,4}$.

Proposition 2. Consider $\mathbb{Z}_2$-laminated modular data $(\Phi, 0, S, T)$ and $(\Phi', 0', S', T')$.

(a) There are integers $M > 0$, $L \ge 0$, $N > 0$ such that $2x = 1/\sqrt{M}$, $N$ and $4^L$ divide $4M$, $2^{1-L}\sqrt{M}\,\vec{a} \in \mathbb{Z}^m$, $2\sqrt{M/N}\,\vec{d} \in \mathbb{Z}^n$, and the gcd over all components $2^{1-L}\sqrt{M}\,a_i$ is 1 (likewise for the components $2\sqrt{M/N}\,d_k$).


(b) Suppose x = x (that is, the global dimensions agree) and r = r (i.e. the central charges c, c agree mod 24). Then the following defines modular data: ⎞ ⎛ x x a d ⎜ x x a −d ⎟ ⎟, T = diag(r, r ; s; t ). S=⎜ T T ⎝ a a A 0 ⎠ d T −d T 0 D (c) Suppose x > x and r = −r ω for some 3rd root ω of 1 (that is, c − c ≡8 4). Define x± = x ± x . Then ⎞ ⎛ x+ a a x−

⎜ x+ x− a − a⎟ ⎟, T = diag(r, r ; s; −ωt ) S=⎜ ⎝ a T a T A 0 ⎠ a T − a T 0 −A satisfies all conditions of modular data except possibly that the fusion coefficients are nonnegative integers. Its fusion coefficients are all integers iff 4|(M − M), define modular data if in addition N ≤ and both L > 0, L > 0. S, T a ,a ,a

8 L /(M − M).

k

k

k

Proof. Recall (1.4). Clearly, any σ permutes {0, b}, so the number x −2 will be fixed by any σ . But x −2 is an algebraic integer, being the sum of squares of certain eigenvalues Si,0 /S0,0 of integer matrices. Therefore x −2 is indeed an integer. Computing Nai ,dk ,dk /Na j ,dk ,dk = ai /a j and Nai ,dk ,dk /Nai ,dk ,dk = dk /dk , we see that there are β, γ > 0 for which both vectors β a and γ d are integral. The gcd condition follows by choosing these β, γ as small as possible. Because d · d = 1/2, we know γ 2 ∈ Q. Because Nai ,dk ,dk = 2ai dk2 /x ∈ Z and each quantum-dimension dk /x is an algebraic integer, we know ai /x ∈ Z, and hence (βx)−1 ∈ Z. From 2x 2 + a · a = 1/2 we now see that 4 divides x −2 , and (βx)−1 is a power 2 L of 2. This gives us (a). The proof of (b) and (c) is now straightforward. Most interesting are the fusion calcu lations in (c). Note that 4x 2 x 2 /(x 2 −x 2 ) = 1/(M −M), ai /x ∈ 2 L Z and ak /x ∈ 2 L Z. We obtain 4 2ai ai ai b,b,a = b,a ,a = , N , N + δi,i , i i i

2 −M x (M − M) x (M − M) ak ak 2ak ai ak b,a ,a = b,a ,a = b,b,a = , N , N − δk,k , N i k k k k x (M − M) x x (M − M) x 2 (M − M) ai ai ak ai ai ai

a ,a ,a = ai ,a ,a

= + Nai ,ai ,ai

, N , N i i i i 3

2 k 2x (M − M) 2x x (M − M) ai ak ak ak ak ak

a ,a ,a = a ,a ,a = N , N − Na ,a ,a . i k k

2

3 k k k k k k

2x x (M − M) 2x (M − M) b,b,b = N

M

the graft of S , T onto S, T . Of course Dω Hgν is the graft of Bm,2 onto We call S, T 2 D±,ω Dν ; here M = μ, M = ν 2 (so M − M = 4), ω = ξ3n +n−2 , and L = L = 2. An easy example of (b) is grafting D−,0 Dν onto D+,ω Dν to form D−,ω Dν .

Exoticness and Realisability of Twisted Haagerup–Izumi Modular Data


An easy generalisation of $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ is to replace $\mathbb{Z}_\nu$ with any abelian group $K$ of odd order $\nu$. Take a trivial twist $\omega = 0$, for simplicity. The entries of $S$ and $T$ involving classes 1 and 4 are identical to those in (3.3). The role of $D_\nu$ is played by the semi-direct product $D_K = K \rtimes \mathbb{Z}_2$, so the class 2 primaries are labelled by the 2-dimensional irreps $i$ of $D_K$, and the class 3 ones by pairs $(h, j)$, where $h$ runs through representatives of the cardinality-2 conjugacy classes of $D_K$ and $j$ runs through the irreps of $K$. The $T$-entries for class 2 are again 1's, while those for class 3 are the evaluation $j(h)$. The submatrices $D$ and $E$ are again cosines, namely $D_{i,(h,j)} = i(h)$ and $E_{(h,j),(h',j')} = 2\,\mathrm{Re}\bigl(j(h')\,j'(h)\bigr)$. Although the modular data thus generalises naturally to arbitrary odd order abelian groups, we see in Sect. 4.1 that the subfactor realisation for noncyclic $K$ is more subtle.

It is tempting to look for additional twists of $\mathcal{D}\mathrm{Hg}_\nu$. After all, we've suggested that the generalised Haagerup is a twin of the dihedral $D_\nu$, and the latter can be twisted by $\mathbb{Z}_{2\nu}$ and not merely $\mathbb{Z}_\nu$. The $\mathbb{Z}_\nu$ twist of both $\mathcal{D}D_\nu$ and $\mathcal{D}\mathrm{Hg}_\nu$ affects the class 3 primaries. The independent $\mathbb{Z}_2$ twist of $\mathcal{D}D_\nu$ affects the class 4 primaries, so this suggests one should look for additional twists (by $\mathbb{Z}_2$ or perhaps $\mathbb{Z}_\mu$) of $\mathcal{D}\mathrm{Hg}_\nu$ which see the $m$ primaries of class 4 and leave the other classes untouched. We haven't gotten this to work.

For instance, one approach to probing some of the missing twists would be a Galois automorphism acting on $S$ and $T$ entry-by-entry. The entries of $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ lie in $\mathbb{Q}[\xi_{\nu^2\mu}]$, with Galois group $\mathbb{Z}^\times_{\nu^2} \times \mathbb{Z}^\times_\mu$. The class 3 twists $\omega \in \mathbb{Z}_\nu$ are stable under $\mathbb{Z}^\times_{\nu^2}$, as explained last subsection, so we should consider the effect of $\mathbb{Z}^\times_\mu$ on $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$. For $\ell \in \mathbb{Z}^\times_\mu$, $\sigma_\ell y = \pm y$ for some sign. If $\sigma_\ell y = -y$, then $\sigma_\ell$ sends $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ to nonunitary modular data with exactly the same fusion coefficients: the vacuum primary is still the first one, but the positive column of $S$ is the second one. This means this modular data won't have a subfactor realisation (it could perhaps have a planar algebra interpretation, if the requirement of positive-definiteness there is dropped), though it can still have a VOA one. The notion of twist should presumably preserve unitarity, so we won't regard this $\sigma_\ell \mathcal{D}^{\omega}\mathrm{Hg}_\nu$ as a twist. Nevertheless it may be interesting to search for its VOA realisations.

For this reason we should focus on those $\ell \in \mathbb{Z}^\times_\mu$ for which $\sigma_\ell y = +y$. Now, it is typical for small $n$ that $\mu$ is a prime power (this is true for all $\nu < 19$ except $\nu = 9$). In this case, any such $\sigma_\ell$ will send $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ to equivalent modular data. The search for new twists using Galois seems to fail.

However when $\mu$ is not a prime power, this idea bears fruit. Given the prime decomposition $\mu = \prod_{i=1}^{k} p_i^{m_i}$, where each $m_i > 0$, then $\sigma_\ell \in \mathrm{Gal}(\mathbb{Q}[\xi_{\nu^2\mu}]/\mathbb{Q}) \cong \mathbb{Z}^\times_{\nu^2\mu}$ fixes $\delta$ iff $\prod_{i=1}^{k} \left(\frac{\ell}{p_i}\right) = +1$, where $\left(\frac{\ell}{p}\right)$ is the Legendre symbol. We can also require here that $\ell \equiv_{\nu^2} 1$ (the $\nu^2$-part of $\ell$ merely shuffles the twists $\omega$). The automorphism $\sigma_\ell$ maps $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ to equivalent modular data iff $\left(\frac{\ell}{p_i}\right) = +1$ for all $i$. Given any set $P$ (possibly empty) containing an even number of distinct prime divisors of $\mu$, pick any $\ell_P \in \mathbb{Z}^\times_{\nu^2\mu}$ with $\ell_P \equiv_{\nu^2} 1$ and $\left(\frac{\ell_P}{p_i}\right) = -1$ iff $p_i \in P$. By $\mathcal{D}^{\omega,P}\mathrm{Hg}_\nu$ we mean the modular data $\sigma_{\ell_P} S$, $\sigma_{\ell_P} T$; up to equivalence it is well-defined, and is inequivalent to $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ even though it has the same fusions. Thus as long as $k > 1$, i.e. $\mu$ is not a prime power, this construction yields new modular data. In Sect. 4.1 we find subfactor realisations for this new Galois twist, at least when $\nu = 9$ and $\nu = 19$.

We learned in Sect. 2.4 that $\mathcal{D}\mathrm{Hg}$ also sees $\mathcal{D}\mathbb{Z}_3$, in some ways more directly than it does $\mathcal{D}S_3$. Since $\mathcal{D}\mathbb{Z}_\nu$ has only a $\mathbb{Z}_\nu$ worth of twists, it is certainly not inconceivable that the correct generic twist group for $\mathcal{D}\mathrm{Hg}_\nu$ is indeed $\mathbb{Z}_\nu$.
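The role played by the factorisation of $\mu = \nu^2 + 4$ can be checked directly. The following is a minimal sketch (the function names `prime_factorisation` and `mu_table` are illustrative, not taken from the paper) that factorises $\mu$ for odd $\nu \le 19$ and lists those $\nu$ for which $\mu$ has more than one distinct prime divisor, i.e. the cases where the Galois construction above can produce new modular data.

```python
def prime_factorisation(n):
    """Return {prime: exponent} for n >= 2, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def mu_table(max_nu):
    """Factorise mu = nu^2 + 4 for every odd nu up to max_nu."""
    return {nu: prime_factorisation(nu * nu + 4) for nu in range(1, max_nu + 1, 2)}


table = mu_table(19)
# nu for which mu is NOT a prime power (k > 1 distinct prime divisors)
not_prime_power = [nu for nu, f in table.items() if len(f) > 1]

if __name__ == "__main__":
    for nu, f in table.items():
        print(nu, nu * nu + 4, f)
    print("mu not a prime power for nu =", not_prime_power)
```

Under these assumptions the only exceptions up to $\nu = 19$ are $\nu = 9$ ($\mu = 5\cdot 17$) and $\nu = 19$ ($\mu = 5 \cdot 73$), in agreement with the statement that $\mu$ is a prime power for all $\nu < 19$ except $\nu = 9$.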


D. E. Evans, T. Gannon

Table 1. Modular invariants for the doubles of Sym(3) and Haagerup

                Nimless   Nimble, insufferable   Sufferable     Total
D^{+,0} S_3       14               6                 28           48
D^{−,1} S_3        1               0                  4            5
D^{+,1} S_3        1               0                  8            9
D^{−,0} S_3        9               7                 12           28
D^{0} Hg_3         ?               ?             8 ≤ ? ≤ 13       28

3.4. Miscellanea involving modular invariants. [26, Sect. 6] studied the quantum double $S_3$ modular data in detail, and similar techniques [28] can be used for their twists, as noted in [21]. Using [27, Theorem 4.3, 28] (with the correction to [27] where $Z_{11}$ was erroneously recorded as being sufferable, when we have now seen in Sect. 2.2 of this paper that it is indeed a dual canonical modular invariant), we can summarize these results in Table 1 (nimless means there is no compatible nimrep, while nimble means there is; sufferable means there is a subfactor realisation).

The commutant of $S, T$ of course forms an algebra under matrix multiplication, but the sufferable modular invariants themselves have a fusion structure, in the sense that the product of two of them is always a linear combination over nonnegative integers of other sufferable modular invariants. The modular invariants for $\mathcal{D}^{s,\omega}D_\nu$ and $\mathcal{D}^{s,\ell\omega}D_\nu$, including this fusion structure, are naturally isomorphic for any $\ell \in \mathbb{Z}^\times_\nu$, using $\sigma_\ell \mathcal{D}^{s,\omega}D_\nu \cong \mathcal{D}^{s,\ell\omega}D_\nu$; the same applies to $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ and $\mathcal{D}^{\ell\omega}\mathrm{Hg}_\nu$.

[21,28] remarked that there is an (injective) homomorphism from the 28 $\mathcal{D}\mathrm{Hg}$ modular invariants to the 28 sufferable $\mathcal{D}^{+,0}S_3$ ones. We can define a natural injective homomorphism $\phi^{(S_3,+0)}_{(S_3,s\omega)}$ from the sufferable twist $s, \omega$ quantum double $S_3$ modular invariants into the untwisted $S_3$ ones. We can actually embed the sufferable $-1$ into $+1$ modular invariants, the sufferable $-1$ into the $-0$ modular invariants and the sufferable $-0$ into $+0$ modular invariants. This can be generalised and understood as follows. As is clear from Sects. 3.1 and 3.2, primaries for both $\mathcal{D}^{s,\omega}D_\nu$ and $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ fall naturally into 4 classes, and so their modular invariants decompose naturally into $4 \times 4$ blocks as did their $S, T$ matrices. Write $S_{[i],[j]}$, $T_{[i],[i]}$, $M_{[i],[j]}$ for these blocks.

Theorem 3. Write $k$ for the number of distinct prime divisors of $\mu$.
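Define a map

$$ {}^{\omega}\phi^{D_\nu}_{\mathrm{Hg}_\nu} M = \begin{pmatrix} M_{[1],[1]} & M_{[1],[2]} & M_{[1],[3]} & 0_{2\times 2} \\ M_{[2],[1]} & M_{[2],[2]} & M_{[2],[3]} & 0_{n\times 2} \\ M_{[3],[1]} & M_{[3],[2]} & M_{[3],[3]} & 0_{n\nu\times 2} \\ 0_{2\times 2} & 0_{2\times n} & 0_{2\times n\nu} & (a-b)\,I_{2\times 2} \end{pmatrix}, $$

where $M_{[1],[1]} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then ${}^{\omega}\phi^{D_\nu}_{\mathrm{Hg}_\nu}$ is a bijection between all modular invariants of $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ with $b \ne 0$, and all modular invariants of $\mathcal{D}^{-,\omega}D_\nu$ with $b \ne 0$. ${}^{\omega}\phi^{D_\nu}_{\mathrm{Hg}_\nu}$ is a $2^{k-1}$-to-1 surjection from all modular invariants of $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ with $b = 0$, onto all modular invariants of $\mathcal{D}^{-,\omega}D_\nu$ with $b = 0$. Moreover, any modular invariant of $\mathcal{D}^{-,\omega}D_\nu$ is a modular invariant of $\mathcal{D}^{+,\omega}D_\nu$ (but not conversely).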

Proof. Let $M$ and $M'$ be modular invariants for $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ and $\mathcal{D}^{-,\omega}D_\nu$, respectively. Let $S, T$ (resp. $S', T'$) denote the modular data for $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ (resp. $\mathcal{D}^{-,\omega}D_\nu$). From $M'T' = T'M'$ we see that $M'_{[i],[4]} = 0 = M'_{[4],[i]}$ for $i = 1, 2, 3$. We also see that $M'_{[4],[4]}$ is diagonal. Recall (1.12): we see from the 0-row of $S$ that $s_\ell$ in classes [1,2,3] is identically $+1$ for all $\ell \in \mathbb{Z}^\times_{\nu^2\mu}$, while $s_\ell(d_l) = \left(\frac{\ell}{\mu}\right)$ equals $-1$ when e.g. $\ell \equiv_\mu m$. Thus also $M_{[i],[4]} = 0 = M_{[4],[i]}$ for $i = 1, 2, 3$. Evaluating $MS = SM$ at $([1,4])$ and $([4,1])$, we get two possibilities (identical conclusions hold for $M'$):

(i) $M_{[1],[1]} = I$ and $M_{[4],[4]}$ is a permutation matrix: $M_{d_l,d_{l'}} = \delta_{\pi l,l'}$ for some permutation $\pi$ of $1 \le l \le m$;
(ii) $M_{[1],[1]} = 1_{2\times 2}$ and $M_{[4],[4]} = 0$.

Looking at the remaining equations $MS = SM$, $MT = TM$, $M'S' = S'M'$ and $M'T' = T'M'$, we see that (in case (i)) the equations involving $M_{[4],[4]}$ decouple from the others and reduce to

$$ \pi(l)\,\pi(l') \equiv_\mu \pm l l', \qquad \pi(l)^2 \equiv_\mu l^2. \qquad (3.5) $$

On the other hand, the equations for the unknown entries of $M_{[i],[j]}$, $i, j \ne 4$, are identical to the corresponding equations for $M'_{[i],[j]}$. For example, $(MS)_{[2],[4]} = (SM)_{[2],[4]}$ and $(M'S')_{[2],[4]} = (S'M')_{[2],[4]}$ both reduce to $M_{a_i,0} = M_{a_i,b}$ and $M'_{a_i,0} = M'_{a_i,b}$. Therefore in case (ii), where $M_{[4],[4]} = 0 = M'_{[4],[4]}$, this means ${}^{\omega}\phi^{D_\nu}_{\mathrm{Hg}_\nu}$ is indeed a bijection.

In case (i), we know $M'_{[4],[4]} = I$, so we need to identify the permutation $\pi$. Write $\pi 1 = \ell$. Then (3.5) requires $\ell^2 \equiv_\mu 1$, hence $\pi l \equiv_\mu \pm \ell l$. So to any $1 \le \ell \le m$ satisfying $\ell^2 \equiv_\mu 1$, define $\pi l$ to be the unique number $1 \le \pi l \le m$ obeying $\pi l \equiv_\mu \pm \ell l$. Then $\pi$ satisfies (3.5) and thus defines the $M_{[4],[4]}$-block of a case (i) $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ modular invariant. These $2^{k-1}$ values of $\ell$ parametrise the kernel of ${}^{\omega}\phi^{D_\nu}_{\mathrm{Hg}_\nu}$ in case (i). $\mathcal{D}^{+,\omega}D_\nu$ is handled identically.

Note that ${}^{\omega}\phi^{D_\nu}_{\mathrm{Hg}_\nu}$ is linear and preserves matrix multiplication. In particular, when $\mu$ is a prime-power (as it is for the original Haagerup $\mathcal{D}\mathrm{Hg}$, and all $\nu < 19$ except $\nu = 9$), there will be an algebra isomorphism between the span of modular invariants for $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ and for $\mathcal{D}^{-,\omega}D_\nu$, and an (injective but nonsurjective) algebra homomorphism of these into $\mathcal{D}^{+,\omega}D_\nu$. The relation between the 28 sufferable $\mathcal{D}^{+,0}S_3$ modular invariants and the 28 $\mathcal{D}^{0}\mathrm{Hg}_3$ modular invariants would seem to be a coincidence: e.g. at $\nu = 1$, $\mathcal{D}^{+,0}D_1 = \mathcal{D}^{+,0}\mathbb{Z}_2$ has exactly 6 modular invariants, all sufferable, while $\mathcal{D}^{0}\mathrm{Hg}_1$ has exactly 2.

The dihedral group is only half the story; the other half is the affine algebra data $B_{m,2}$. Its modular invariants were classified in [32]: when $\mu$ is not a perfect square (the situation here), the complete list is $B(d_0, \ell_0)$ and $B(d_1, \ell_1; d_2, \ell_2) = B(d_2, \ell_2; d_1, \ell_1)$, where $d_i \mid \mu$, $\mu \mid d_i^2$, and $1 \le \ell_i \le m$ obeys $\ell_i^2 \equiv_{d_i^2/\mu} 1$, for matrices $B(d, \ell)$ and $B(d_1, \ell_1; d_2, \ell_2)$ defined in [32]. Those modular invariants which are permutation matrices are the $B(\mu, \ell)$ for $1 \le \ell \le m$ satisfying $\ell^2 \equiv_\mu 1$: their nonzero entries are $B(\mu, \ell)_{z,z} = 1$ for $z \in \{0, b, a_i\}$, and $B(\mu, \ell)_{d_l,d_{l'}} = 1$ for $l' \equiv_\mu \pm \ell l$. We see from this and Theorem 3 that the modular invariants of $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ come from those of $\mathcal{D}^{-,\omega}D_\nu$ and $B_{m,2}$ in a very direct sense.

Recall [22,53] that the sufferable modular invariants of the double of a finite group $G$ are parametrised by pairs $(H, \psi)$, where $H$ is a subgroup of $G \times G$ and $\psi \in H^2(BH; \mathbb{C}^\times)$ is called discrete torsion. This suggests:
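The count $2^{k-1}$ appearing in the proof can be confirmed by brute force. The sketch below (the helper names are illustrative, and the range of $\nu$ is an arbitrary choice) lists the $1 \le \ell \le m$ with $\ell^2 \equiv 1 \pmod{\mu}$ and compares their number with $2^{k-1}$, where $k$ is the number of distinct prime divisors of $\mu = \nu^2 + 4$ and $m = (\mu - 1)/2$.

```python
def distinct_prime_divisors(n):
    """Return the sorted list of distinct primes dividing n >= 2."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def involutions_up_to_sign(mu):
    """All 1 <= l <= m = (mu - 1)/2 with l^2 = 1 mod mu (mu odd)."""
    m = (mu - 1) // 2
    return [l for l in range(1, m + 1) if (l * l) % mu == 1]


def check_count(nu):
    """True if the number of such l equals 2^(k-1) for mu = nu^2 + 4."""
    mu = nu * nu + 4
    k = len(distinct_prime_divisors(mu))
    return len(involutions_up_to_sign(mu)) == 2 ** (k - 1)


if __name__ == "__main__":
    for nu in range(1, 26, 2):
        mu = nu * nu + 4
        print(nu, mu, involutions_up_to_sign(mu), check_count(nu))
```

For $\nu = 9$ ($\mu = 85$, $m = 42$) this gives $\ell \in \{1, 16\}$, so the map of Theorem 3 is 2-to-1 on the $b = 0$ invariants there, while for prime-power $\mu$ only $\ell = 1$ survives.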


Question 4. Find some analogue for $\mathcal{D}^{0}\mathrm{Hg}_\nu$ of this $(H, \psi)$ parametrisation of sufferable modular invariants.

This $(H, \psi)$ parametrisation belongs most naturally to one of the two formulations of the double of the group $G$ recalled in Sect. 1.2. Is there also a parametrisation of the sufferable modular invariants of the double of $G$ which is more natural in the language of the other formulation?

Of special importance (recall Sect. 1.3) are the monomial modular invariants of $\mathcal{D}^{0}\mathrm{Hg}_\nu$. Three of these are obvious:

$$ \Bigl|\mathrm{ch}_0 + \mathrm{ch}_b + 2\sum_i \mathrm{ch}_{a_i}\Bigr|^2, \qquad \Bigl|\mathrm{ch}_0 + \mathrm{ch}_b + \sum_i \mathrm{ch}_{a_i} + \sum_h \mathrm{ch}_{c_{h,0}}\Bigr|^2, \qquad \Bigl|\mathrm{ch}_0 + \mathrm{ch}_b + 2\sum_h \mathrm{ch}_{c_{h,0}}\Bigr|^2. \qquad (3.6) $$

Proposition 4. When $\nu = p$ or $pq$ for (not necessarily distinct) primes $p, q$, the only monomial modular invariants are the three in (3.6).

Proof. We need to find all eigenvectors $u$ of $S, T$ with eigenvalue 1, with $u_0 = 1$ and all other $u_x \in \{0, 1, 2, \ldots\}$, so $Z = |\sum_x u_x \mathrm{ch}_x|^2$. We know $u_b = 1$ since $M_{b,b} = 1$ for any modular invariant (clear from the proof of Theorem 3), hence $u_{d_l} = 0$ for all $l$. We need to determine $u_i := u_{a_i}$ and $u_{h,j} := u_{c_{h,j}}$. $Tu = u$ implies $u_{h,j} = 0$ unless $\nu$ divides $hj$. Therefore if $u_{h,j} \ne 0$ we have:

(i) for $\nu$ prime, $j$ must be 0;
(ii) for $\nu = p^2$, only for $j = 0$ or $p \mid h$ and $p \mid j$;
(iii) for $\nu = pq$ ($p \ne q$), only for $j = 0$, or $p \mid h$ and $q \mid j$, or $q \mid h$ and $p \mid j$.

$Su = u$ implies, for all $1 \le h, h'' \le n$ and $0 \le i < \nu$,

$$ 2 + 2\sum_{j} u_j + 2\sum_{h',j'} u_{h',j'}\cos(2\pi i h'/\nu) = \nu u_i, \qquad (3.7) $$
$$ 2 + 2\sum_{j} \cos(2\pi j h/\nu)\,u_j + 2\sum_{h',j'} \cos\bigl(2\pi(h j' + h' i)/\nu\bigr)\,u_{h',j'} = \nu u_{h,i}, \qquad (3.8) $$
$$ \sum_{j'} \cos(2\pi h j'/\nu)\,u_{h'',j'} = \sum_{j'} \cos(2\pi h'' j'/\nu)\,u_{h,j'}, \qquad (3.9) $$

where we sum over $1 \le j, h' \le n$ and $0 \le j' < \nu$ ((3.9) was obtained by hitting (3.8) with $\cos(2\pi h'' i/\nu)$ and summing over $i$). In analysing these equations, it is useful to recall (for $k$ odd)

$$ \sum_{\ell} \xi_k^{\ell} = 2\sum_{l} \cos(2\pi l/k) = \mu(k), \qquad (3.10) $$

the sums taken over the $1 \le \ell \le k$ and $1 \le l < k/2$ which are coprime to $k$. The Möbius function $\mu(k)$ equals $(-1)^n$ if $k$ is a product of $n \ge 0$ distinct primes, and 0 otherwise. The Galois symmetry $\sigma_\ell$ for $\ell \in \mathbb{Z}^\times_\nu$ (recall (1.11)) maps $a_i$ to $a_{\pm\ell i}$ and $c_{h,j}$ to $c_{\pm\ell h,\ell j}$. Therefore $u_i = u_{\gcd(i,\nu)}$ and $u_{h,j} = u_{\gcd(h,\nu),\, jh/\gcd(h,\nu)}$, so $u_{h,-j} = u_{h,j}$.

For $\nu$ prime, this means $u_i = u_1$ and $u_{h,j} = u_{1,0}\,\delta_{j,0}$. Plugging this into (3.7) gives $u_1 + u_{1,0} = 2$, which corresponds to the three solutions (3.6).

For $\nu = p^2$, we read off from (3.9) with $h = p$, $h'' = 1$ that $u_{p,jp} = u_{p,0} - u_{1,0}$ for all $j \ne 0$. Then comparing (3.7) for $i = 1, p$ gives $2 = u_p + u_{1,0} = u_1 + u_{p,0}$. We can now force $u_p = u_1$ by (3.8) at $h = p$, $i = 1$, and again we recover (3.6).

Finally, turn to $\nu = pq$. (3.9) with $h'' = 1$ and $h = p, q$ gives respectively $u_{p,qj} = u_{p,0} - u_{1,0}$, $u_{q,pj} = u_{q,0} - u_{1,0}$ for $j \ne 0$. Then (3.7) at $i = 1, p, q$ gives $u_q + u_{q,0} = u_p + u_{p,0} = 2$ and $u_{1,0} = 2 + u_1 - u_p - u_q$. Now (3.8) at $h = i = p$ and $h = i = q$ give $u_1 = u_p = u_q$ and $u_{1,0} = u_{p,0} = u_{q,0}$ and we're done.
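That the three invariants (3.6) solve $Su = u$ can be checked numerically. The following sketch assumes the form of (3.7)-(3.9) with sums over $1 \le j, h' \le n$ and $0 \le j' < \nu$, and restricts to the ansatz $u_{a_i} = u_a$, $u_{c_{h,j}} = u_{c0}\,\delta_{j,0}$; the function name `max_residual` and the choice of test values of $\nu$ are illustrative.

```python
import math


def max_residual(nu, u_a, u_c0):
    """Largest violation of (3.7)-(3.9) for u_{a_i} = u_a, u_{c_{h,j}} = u_c0 * [j == 0]."""
    n = (nu - 1) // 2
    hs = range(1, n + 1)   # labels of a_j and first label of c_{h,j}
    js = range(nu)         # second label of c_{h,j}

    def cos(x):
        return math.cos(2 * math.pi * x / nu)

    def u_c(h, j):
        return u_c0 if j % nu == 0 else 0

    worst = 0.0
    for i in hs:  # (3.7): rows labelled by a_i
        lhs = 2 + 2 * n * u_a + 2 * sum(
            u_c(h2, j2) * cos(i * h2) for h2 in hs for j2 in js)
        worst = max(worst, abs(lhs - nu * u_a))
    for h in hs:  # (3.8): rows labelled by c_{h,i}
        for i in js:
            lhs = (2 + 2 * sum(cos(j * h) * u_a for j in hs)
                   + 2 * sum(cos(h * j2 + h2 * i) * u_c(h2, j2)
                             for h2 in hs for j2 in js))
            worst = max(worst, abs(lhs - nu * u_c(h, i)))
    for h in hs:  # (3.9)
        for h3 in hs:
            lhs = sum(cos(h * j2) * u_c(h3, j2) for j2 in js)
            rhs = sum(cos(h3 * j2) * u_c(h, j2) for j2 in js)
            worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    for nu in (3, 5, 7, 9, 15):
        for u_a, u_c0 in ((2, 0), (1, 1), (0, 2)):
            print(nu, (u_a, u_c0), max_residual(nu, u_a, u_c0))
```

The pairs $(u_a, u_{c0}) = (2,0), (1,1), (0,2)$ are exactly the solutions of $u_1 + u_{1,0} = 2$ in nonnegative integers; any other pair, such as $(1, 0)$, leaves a nonzero residual in (3.7).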


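The $\nu = 3$ modular invariants (2.3), (2.4) generalise, for any $\mathcal{D}^{\pm,\omega}D_\nu$, $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$, to

$$ |\mathrm{ch}_0 + \mathrm{ch}_b|^2 + 2\sum_i |\mathrm{ch}_{a_i}|^2 + 2\sum_{h,j} |\mathrm{ch}_{c_{h,j}}|^2. \qquad (3.11) $$

This corresponds to the $\mathbb{Z}_\nu$ subsystem in both $\mathcal{D}D_\nu$ and $\mathcal{D}^{0}\mathrm{Hg}_\nu$, and to the VOA realising $\mathcal{D}^{\omega}\mathbb{Z}_\nu$ and containing both $\mathcal{D}^{+,\omega}D_\nu$ and $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ VOAs.

4. Subfactors for Haagerup–Izumi Modular Data

4.1. Izumi's subfactors and their modular data. In Sect. 5 we address the question of realising this generalised Haagerup modular data $\mathcal{D}^{\omega}\mathrm{Hg}_\nu$ by VOAs. In this section we address their subfactor realisation. In Sect. 7 of [38], Izumi suggests associating subfactors to any odd abelian group $K$ of order $\nu$, using endomorphisms in the Cuntz algebra $\mathcal{O}_{\nu+1}$. (Warning: Izumi's $n$ is our $\nu = 2n + 1$.) Let $\mathcal{A}(K)$ be the set of all $\nu \times \nu$ complex matrices $A = (A_{g,h})$, $g, h \in K$, satisfying

$$ A_{g,h} = \overline{A_{h,g}}, \qquad A_{g,h} = A_{-h,g-h} = A_{h-g,-g}, \qquad A_{g,0} = \delta_{g,0} - 1/(\delta - 1), \qquad (4.1) $$
$$ \sum_{m\in K} A_{g+m,h}\,A_{h,m} = \delta_{g,0} - \delta_{h,0}/\delta, \qquad (4.2) $$
$$ \sum_{m\in K} A_{m,g+h}\,A_{g,m+k}\,A_{h,m+l} = A_{g+l,k}\,A_{h+k,l} - \delta_{g,0}\,\delta_{h,0}/\delta, \qquad (4.3) $$

for all $g, h, k, l \in K$ (recall $\delta = (\nu + \sqrt{\nu^2 + 4})/2$). These equations imply [38]

$$ |A_{g,h}| = \sqrt{\delta}/(\delta - 1), \qquad (4.4) $$
$$ A_{g+k,h}\,A_{h,k} = A_{-k,g}\,A_{g,h-k}, \qquad (4.5) $$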

for any g, h, k ∈ K with g = 0, h = 0, g = h. Izumi shows that to any matrix A ∈ A(K ), there corresponds a (nonbraided) subfactor of index δ + 1 with principal graph the ν-star (see the top graphs of Figs. 2 and 4). We call these subfactors of Izumi type K . Next subsection we determine much of the data of these subfactors. His Theorem 8.4 identifies the modular data S, T of the even part of the quantum double of this l , g, h ∈ K , 1 ≤ l ≤ m, satisfying the subfactor in terms of m(ν 2 + 1) variables ωl , C g,h m(ν 2 + 1) equations: l l l C0,g = ωl − ωl /δ, ωl C g,h − A g+k,2h C h,k = δh,0 ωl /δ (4.6) g∈K

k∈K

(recall m = 2(n 2 +n+1)). Some solutions to (4.6), occurring when ωlν = 1, are redundant (i.e. correspond to b, ai or ch, j ) and should be ignored. In particular, precisely σ1 (ν) − 1 (i.e. the sum of divisors d > 1 of ν) solutions to (4.6) with ωl = 1 are redundant, and for any other root of unity ωl of order d dividing ν, the number of redundant solutions is d φ(ν/d )/2, where φ is Euler’s totient and the sum is over all divisors d < d of d. The entries of the S and T matrices equal those of D0 Hgν except possibly for those in the bottom-right m × m block, which are given by Tdl ,dl = e−2π ic/24 ωl and ⎞ ⎛ 1 ⎝

l l ⎠. Sdl ,dl = C g,g+h C−g,h (4.7) ωl ωl + δ νδ + 2 g,h∈K
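The defining relations of $\mathcal{A}(K)$ are easy to test numerically in the smallest nontrivial case. The sketch below assumes the reading of (4.1)-(4.3) as $A_{g,h} = \overline{A_{h,g}}$, $A_{g,h} = A_{-h,g-h} = A_{h-g,-g}$, $\sum_m A_{g+m,h}A_{h,m} = \delta_{g,0} - \delta_{h,0}/\delta$ and $\sum_m A_{m,g+h}A_{g,m+k}A_{h,m+l} = A_{g+l,k}A_{h+k,l} - \delta_{g,0}\delta_{h,0}/\delta$, and takes $K = \mathbb{Z}_3$ with the single free entry $a$ a root of $a^2 - a + \delta = 0$, as found in the proof of Theorem 5. The names `build_A` and `max_residual` are illustrative.

```python
import math

NU = 3
DELTA = (NU + math.sqrt(NU * NU + 4)) / 2
# One of the two complex-conjugate roots of a^2 - a + delta = 0.
A_ROOT = (1 + 1j * math.sqrt(4 * DELTA - 1)) / 2


def build_A(a):
    """The 3x3 matrix with (delta - 1) A = [[delta-2,-1,-1],[-1,-1,a],[-1,conj(a),-1]]."""
    s = 1 / (DELTA - 1)
    rows = [[DELTA - 2, -1, -1],
            [-1, -1, a],
            [-1, a.conjugate(), -1]]
    return [[complex(x) * s for x in row] for row in rows]


def max_residual(A):
    """Largest violation of (4.1)-(4.3) over all g, h, k, l in Z_3."""
    K = range(NU)
    worst = 0.0
    for g in K:
        for h in K:
            # (4.1): hermiticity, the two index symmetries, and the 0-column
            worst = max(worst, abs(A[g][h] - A[h][g].conjugate()))
            worst = max(worst, abs(A[g][h] - A[(-h) % NU][(g - h) % NU]))
            worst = max(worst, abs(A[g][h] - A[(h - g) % NU][(-g) % NU]))
            worst = max(worst, abs(A[g][0] - ((g == 0) - 1 / (DELTA - 1))))
            # (4.2)
            lhs = sum(A[(g + m) % NU][h] * A[h][m] for m in K)
            worst = max(worst, abs(lhs - ((g == 0) - (h == 0) / DELTA)))
            # (4.3)
            for k in K:
                for l in K:
                    lhs = sum(A[m][(g + h) % NU] * A[g][(m + k) % NU]
                              * A[h][(m + l) % NU] for m in K)
                    rhs = (A[(g + l) % NU][k] * A[(h + k) % NU][l]
                           - (g == 0) * (h == 0) / DELTA)
                    worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    print("residual at the root of a^2 - a + delta:", max_residual(build_A(A_ROOT)))
    print("residual at a = 1 + i (not a root):     ", max_residual(build_A(1 + 1j)))
```

The residual vanishes to machine precision at the root and is of order $10^{-1}$ for a generic value of $a$, which illustrates how (4.3) at $g = h = 1$, $k = l = 0$ forces $a^2 - a + \delta = 0$.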


Fig. 4. Principal graphs for Izumi’s Z5 subfactor

When some $\omega_l$ has more than one solution in (4.6), (4.7) is ambiguous as it isn't obvious which solutions to (4.6) to use. Izumi shows in his Example 7.1 that $\mathcal{A}(\mathbb{Z}_3)$ contains exactly two matrices, both corresponding to the Haagerup subfactor. In Appendix C he exactly solves (4.6) for $A \in \mathcal{A}(\mathbb{Z}_3)$, and in this way obtains (a complicated expression for) $\mathcal{D}\mathrm{Hg}$. In his Example 7.2, Izumi shows $\mathcal{A}(\mathbb{Z}_5)$ contains exactly 4 matrices, again corresponding to a single (new) subfactor. Izumi does not solve (4.6) for them. This analysis of (4.1)-(4.3) for $\nu = 3$ and $\nu = 5$ is as far as Izumi went (this is what we mean when we call it a hypothetical family of subfactors). Nevertheless we can push this analysis a little further.

The group $\mathrm{Aut}(K)$ acts naturally on the matrix $A$ by shuffling the entries: $\alpha \in \mathrm{Aut}(K)$ sends $A$ to $A^\alpha$ defined by $A^\alpha_{g,h} = A_{\alpha g,\alpha h}$. For example, when $K = \mathbb{Z}_\nu$, $\mathrm{Aut}(K) = \mathbb{Z}^\times_\nu$ acts by multiplication. In this way, given any $A \in \mathcal{A}(K)$, $\mathrm{Aut}(K)$ embeds into $\mathrm{Aut}(\mathbb{Q}[A])$ and fixes $\delta$, where $\mathbb{Q}[A]$ is the field generated over $\mathbb{Q}$ by all entries $A_{g,h}$. Call $A, A' \in \mathcal{A}(K)$ equivalent if $A' = A^\alpha$ for some $\alpha \in \mathrm{Aut}(K)$. Equivalent matrices give rise to equivalent subfactors and equivalent modular data. For later convenience, for $x \in \mathbb{Q}[A]$, write $\mathrm{Tr}\,x$ for the orbit sum $\sum_y y$ over the elements $y$ of the $\mathrm{Aut}(K)$-orbit of $x$.

It is very difficult to solve (4.6) directly (see Appendix C of [38] for the $\nu = 3$ argument, which is already quite involved). Our strategy for finding the modular data is to reduce it to computer calculations. A key observation is that if $S, T$ are the modular data of $\mathcal{D}^{0}\mathrm{Hg}_\nu$, and if $S', T$ is a second modular data, with

$$ \bigl|S'_{d_l,d_{l'}} - S_{d_l,d_{l'}}\bigr| < \frac{1}{18\sqrt{\mu}} \qquad (4.8) $$

for all $1 \le l, l' \le m$, and all other entries of $S$ and $S'$ are equal, then (1.3) implies $S' = S$ everywhere. Maple 11 (x86_64 Linux) was used in these calculations. It is difficult to rigorously analyse the resulting error, but various consistency checks (e.g. unitarity of the numerically obtained $S$ matrix, or the values of $|S'_{d_l,d_{l'}} - S_{d_l,d_{l'}}|$) all indicate it is on the order of $10^{-7}$ or better, far more precision than is actually needed. More details are given below. Computer calculations were not used in the determination of $\mathcal{A}(K)$ in Theorem 5.

Theorem 5.
(a) There is a unique $A \in \mathcal{A}(1)$. Its modular data is $\mathcal{D}^{0}\mathrm{Hg}_1$.
(b) There are precisely 2 matrices in $\mathcal{A}(\mathbb{Z}_3)$. They are equivalent and realise $\mathcal{D}^{0}\mathrm{Hg}_3$.
(c) There are precisely 4 matrices in $\mathcal{A}(\mathbb{Z}_5)$. They are equivalent and realise $\mathcal{D}^{0}\mathrm{Hg}_5$.


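(d) There are precisely 6 matrices in $\mathcal{A}(\mathbb{Z}_7)$. They are equivalent and realise $\mathcal{D}^{0}\mathrm{Hg}_7$.
(e) There are precisely 12 matrices in $\mathcal{A}(\mathbb{Z}_9)$. They form 2 equivalence classes and define inequivalent subfactors; one realises the $T$ matrix of $\mathcal{D}^{0}\mathrm{Hg}_9$, and the other realises the $T$ of $\mathcal{D}^{0,\{5,17\}}\mathrm{Hg}_9$ (defined in Sect. 3.3).
(f) $\mathcal{A}(\mathbb{Z}_3 \times \mathbb{Z}_3)$ is empty.

Proof. Let's begin by identifying $\mathcal{A}(K)$ for these groups (done in [38] for $K = \mathbb{Z}_3, \mathbb{Z}_5$). It is trivial that $\mathcal{A}(1) = \frac{\delta-2}{\delta-1}$, and for $\mathbb{Z}_3$ and $\mathbb{Z}_5$ (4.1) fixes everything in terms of complex numbers $a$ and $a, b$, respectively:

$$ \mathcal{A}(\mathbb{Z}_3) = \frac{1}{\delta-1}\begin{pmatrix} \delta-2 & -1 & -1 \\ -1 & -1 & a \\ -1 & \bar a & -1 \end{pmatrix}, \qquad \mathcal{A}(\mathbb{Z}_5) = \frac{1}{\delta-1}\begin{pmatrix} \delta-2 & -1 & -1 & -1 & -1 \\ -1 & -1 & a & b & a \\ -1 & \bar a & -1 & b & b \\ -1 & \bar b & \bar b & -1 & a \\ -1 & \bar a & \bar b & \bar a & -1 \end{pmatrix}. $$

For $\nu = 3$, (4.3) forces $a^2 - a + \delta = 0$. The two roots of this polynomial are complex conjugates, and indeed $-1 \in \mathbb{Z}^\times_3$ here acts as $a \leftrightarrow \bar a$. Similarly, the generator $2 \in \mathbb{Z}^\times_5$ acts on $\mathcal{A}(\mathbb{Z}_5)$ by sending $a \mapsto b \mapsto \bar a \mapsto \bar b \mapsto a$. Then $a$ has minimal polynomial $x^4 - \delta x^3 + (3\delta - 2)x^2 - \delta^2 x + \delta^2 = 0$ and the other 3 roots of this polynomial are $\bar a, b, \bar b$; indeed, the Galois group over $\mathbb{Q}[\delta]$ of this polynomial is $\mathbb{Z}^\times_5$ with this action. Given $a$, this and the equation $1 - a - \bar a + a\bar b + \bar a b = 0$ then determine $b$.

Computation of $\mathcal{A}(\mathbb{Z}_7)$ is handled similarly. Equations (4.1) and (4.5) give us

$$ \mathcal{A}(\mathbb{Z}_7) = \frac{1}{\delta-1}\begin{pmatrix} \delta-2 & -1 & -1 & -1 & -1 & -1 & -1 \\ -1 & -1 & a & b & c & b & a \\ -1 & \bar a & -1 & b & d & d & b \\ -1 & \bar b & \bar b & -1 & c & d & c \\ -1 & \bar c & \bar d & \bar c & -1 & b & b \\ -1 & \bar b & \bar d & \bar d & \bar b & -1 & a \\ -1 & \bar a & \bar b & \bar c & \bar b & \bar a & -1 \end{pmatrix}. \qquad (4.9) $$

The generator $3 \in \mathbb{Z}^\times_7$ sends $a \mapsto c \mapsto d \mapsto \bar a \mapsto \bar c \mapsto \bar d \mapsto a$ and $b \mapsto \bar b \mapsto b$. (4.5) gives $ad = bc$, while (4.2) with $(g, h) = (1, 1), (3, 1), (2, 3)$ give

$$ (b + \bar b + 2)(c + \bar c) = -1 + \mathrm{Tr}\,a = (b + \bar b + 2)(a + \bar a) = (b + \bar b + 2)(d + \bar d), $$

where the last two equalities come from the $\mathbb{Z}^\times_7$ symmetry. If $b + \bar b \ne -2$, then $a = d = \bar c$ and $\delta b = a^3$ and we get two incompatible equations for $s := a + \bar a$, namely $s^4 + s^2(\delta - 4) - \delta s = \delta - 2$ and $s^3 + (\delta - 3)s - \delta s^2 = 5\delta + 1$. Thus $b + \bar b = -2$ (hence $b = -1 \pm i\sqrt{\delta - 1}$) and $\mathrm{Tr}\,a = 1$. Let $t = a + d + \bar c$. Then $2\,\mathrm{Re}(t) = 1$ and (from (4.2)) $2\,\mathrm{Re}(t\bar b) = -3\delta - 5$ gives $2t = 1 + (b + 1)(3 - \delta)$ and we obtain $a^3 - ta^2 + b\bar t a - \delta b = 0$. The other two roots of this polynomial are $d$ and $\bar c$ (which to call $d$ follows from $2\,\mathrm{Re}(b + c - a\bar b) = \delta$), and its Galois group over $\mathbb{Q}[\delta, b]$ is $\langle 3^2 \rangle < \mathbb{Z}^\times_7$. For example, one choice is $a \approx -2.1 + 1.7i$, $b \approx -1 - 2.5i$, $c \approx 2.6 - 0.8i$, $d \approx 2.7i$.

We can rigorously establish that such a quadruple $(a, b, c, d)$ satisfies (4.1)-(4.3), using floating point calculations as follows. The basic idea is that if all Galois associates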


of an algebraic integer have modulus $< 1$, then that algebraic integer must be 0. First, (4.1) follows from (4.9). Next, it is clear that $b\bar b = \delta$; since

$$ (\delta/a)^3 - (1 - t)(\delta/a)^2 + \bar b\,t\,(\delta/a) - \delta\bar b = \frac{-\delta^2}{b a^3}\left(-\delta b + \bar t\,ab - ta^2 + a^3\right) = 0 $$

(and similarly for $\delta/\bar c$, $\delta/d$), we know the sets $\{\bar a, c, \bar d\}$ and $\{\delta/a, \delta/\bar c, \delta/d\}$ are equal, so a floating point calculation gives the pairings $a\bar a = c\bar c = d\bar d = \delta$. Using this, we can eliminate all occurrences of $\bar a, \bar b, \bar c, \bar d$ from (4.9), so that the Eqs. (4.2)-(4.3) we need to verify are all algebraic in $a, b, c, d$. Finally, observe that $\delta$ is a unit and $a, b, c, d$ are all algebraic integers (after all, $t$ satisfies $t^2 - t + (4\delta - 2) = 0$ so itself is an algebraic integer).

The action of $\mathrm{Gal}(\mathbb{Q}[A]/\mathbb{Q}[\delta]) \cong \mathbb{Z}^\times_7$ on $a, b, c, d$ is clear. For any other $\sigma \in \mathrm{Gal}(\mathbb{Q}[A]/\mathbb{Q})$ (there are 6 of these), $\sigma\delta = (\nu - \sqrt{\nu^2 + 4})/2 =: \delta'$ and $\sigma b \in \{-1 \pm \sqrt{1 - \delta'}\}$. Without loss of generality fix $\sigma b = -1 + \sqrt{1 - \delta'} =: b'$. Then $t' := \sigma t = (1 + (b' + 1)(3 - \delta'))/2$, and $\sigma\{a, \bar c, d\}$ are the roots of $x^3 - t'x^2 + b'(1 - t')x - \delta' b'$. Again, without loss of generality we can let $a' := \sigma a$ be any of these roots; then $c' := \sigma c$ and $d' := \sigma d$ are fixed by $b' + \delta'/b' + c' + \delta'/c' - a'\delta'/b' - \delta' b'/a' = \delta'$. For example we may take $(a', b', c', d') \approx (-0.08, 0.07, -0.06, 0.05)$. We thus know the action on $A$ of all 12 elements $\sigma^i 3^j$ of $\mathrm{Gal}(\mathbb{Q}[A]/\mathbb{Q}) = \langle \sigma, \mathbb{Z}^\times_7 \rangle$.

Multiplying (4.2) by $(\delta - 1)^2$ and (4.3) by $(\delta - 1)^3$, the right and left sides of those equations manifestly become algebraic integers. Calculate (using floating point) the 12 Galois associates of the difference left $-$ right for each of these equations. Note that the $\mathbb{Z}^\times_7$-orbit permutes $g, h, k, l$ and can be ignored, so only $\sigma$ need be applied. Maple does this in a fraction of a second and finds $|\text{left} - \text{right}|$ is always $< 10^{-6}$. Again, all we needed was $< 1$. Assuming we can trust Maple here, this establishes (4.2)-(4.3) and the existence of our solution.

Computation of $\mathcal{A}(\mathbb{Z}_9)$. We get as before

$$ \mathcal{A}(\mathbb{Z}_9) = \frac{1}{\delta-1}\begin{pmatrix} \delta-2 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 \\ -1 & -1 & a & b & c & d & c & b & a \\ -1 & \bar a & -1 & b & e & f & f & e & b \\ -1 & \bar b & \bar b & -1 & c & f & g & f & c \\ -1 & \bar c & \bar e & \bar c & -1 & d & f & f & d \\ -1 & \bar d & \bar f & \bar f & \bar d & -1 & c & e & c \\ -1 & \bar c & \bar f & \bar g & \bar f & \bar c & -1 & b & b \\ -1 & \bar b & \bar e & \bar f & \bar f & \bar e & \bar b & -1 & a \\ -1 & \bar a & \bar b & \bar c & \bar d & \bar c & \bar b & \bar a & -1 \end{pmatrix}. \qquad (4.10) $$

Moreover, (4.5) yields $gb = cf$ and $a\bar c = b\bar e = d\bar f$. The generator $2 \in \mathbb{Z}^\times_9$ sends $a \mapsto e \mapsto d \mapsto \bar a \mapsto \bar e \mapsto \bar d \mapsto a$, $b \mapsto f \mapsto \bar c \mapsto \bar b \mapsto \bar f \mapsto c \mapsto b$, and $g \mapsto \bar g \mapsto g$. Up to the relations $gb = cf$, $a\bar c = b\bar e = d\bar f$, there are precisely 17 $\mathbb{Z}^\times_9$ orbits in $b, \bar b, c, \bar c, f, \bar f$ of degree $\le 3$. Using (4.2) with $g = 3$ and (4.3) when 3 divides both $g$ and $h$, we can compute these 17 orbit sums $\mathrm{Tr}$ as linear expressions in $s := \mathrm{Tr}\,b = 2\,\mathrm{Re}(b + c + f)$. For example, $\mathrm{Tr}\,g = 1 - s$, $\mathrm{Tr}(bg) = 3(s - \delta)$, $\mathrm{Tr}(b\bar g) = 2\delta + 1 + s(2 - \delta)$. But $\mathrm{Tr}\,b\;\mathrm{Tr}\,g = \mathrm{Tr}(bg) + \mathrm{Tr}(b\bar g)$, so $s^2 + (4 - \delta)s + 1 - \delta = 0$. Choose one of these 2 solutions for $s$ (they both yield $A \in \mathcal{A}(\mathbb{Z}_9)$).


√ From g + g = 1 − s we get g = (1 − s ± (δ − 6)s − 3δ)/2. Again select one of these g (they are complex conjugates). Let t := b + c + √f . Then t + t = s and tg + tg = 3(s − δ), which fixes t = s/2 ± (3δ − 6 − δs − s) (δ − 6)s − 3δ/36 (use the same sign as in g). Then bc + b f + c f = tg and bc f = δg, so b, c, f are the 3 roots of x 3 − t x 2 + gt x − δg = 0. Let b be any of these roots; which to call c and f is fixed by the relation Tr (b2 c) = 6δ + 3 + (8 − δ)s. Now (4.2) at (g, h) = (1, 1) and (4,2) gives a 2 + va/u + δu/u = 0 for (u, v) = (b + c2 / f − 1, 1 + bc + bc) and ( f + c − bc/δ, δ − f − f ) respectively; subtracting these expresses a rationally in terms of b, c, f . Then d = a f /c and e = bc/a. These values of a, . . . , g will solve (4.1)-(4.3). There are 2 choices for s, 2 for g, and 3 for b. The Z× 9 symmetry accounts for this g and b ambiguity, leaving 2 inequivalent solutions. That these are indeed solutions can be established in the same way as for Z7 given earlier. Computation of A(Z23 ). Write the (1, 0)-row of the matrix (δ−1)A(Z23 ) as (−1, −1, t, u, v, w, x, y, z). Equation (4.5) with g = h = (1, 0) and k = (0, 1), (1, 1) gives yu = xv = zw =: δa for some complex number a with |a| = 1. From (4.5) with g = (1, 1), h = (1, 0), k = (1, 2) yields δt = x yz, and applying the Z23 -automorphism (i, j) → (i, − j) to this we get δt = uvw. Comparing these gives a 3 = 1. Putting g = (0, 1), h = (1, 0) into (4.2) yields 1 − δ = (au + au − 1)(av + av − 1).

(4.11)

Hitting this with the Z23 -automorphism (i, j) → (i + j, j) gives the same equation with w replacing u. Thus au + au − 1 = av + av − 1 and (4.11) says there is a real number whose square is negative. This impossibility establishes part (f). l Determining the modular data. Consider first K = Z3 and fix A ∈ A(Z3 ). Let ωl , C g,h be some solution to (4.6). Then we learn from the above proof that any automorphism σ ∈ Gal(Q/Q[δ]) corresponds to an ∈ Z× 3 in such a way that σ A g,h = Ag,h . Certainly σ sends ωl to some other root of unity wl σ of the same order d, and moreover σ ωl = ωl σ (since σ commutes with complex conjugate in cyclotomic fields). l l σ given by Hence σ sends the solution ωl , C g,h of (4.6) to another solution ωl σ , C g,h l ) = Clσ σ (C g,h g,h . Incidentally, (4.7) then implies the Galois actions

σ (Sdl ,dl ) = Sdl σ ,dl σ , σ (Tdl ,dl ) = Tdl σ ,dl σ (the second equation is simply the statement σ ωl = ωl σ ; to see the first, compute instead σ S). This is stronger than the usual Galois action (1.4) in modular data, though of course it is compatible with D0 Hgν . Since δ ∈ Q[ξμ ], when μ is coprime to the order d of ωl any automorphism in Gal(Q[ξd ]/Q) will lift to a σ fixing δ (when gcd(d, μ) > 1, only half will). So if (4.6) can be solved for a d th root of unity wl , for d coprime to μ, then it can be solved for all d th roots of 1. That is, we would have at least φ(d) solutions. As long as d doesn’t divide ν, none of these are redundant. But Izumi tells us there are precisely m nonredundant (independent) solutions to (4.6) (corresponding to the m half-braidings, giving rise to the m primaries of type dl ). Hence any solution to (4.6) involves ωl of order d, where φ(d) ≤ m (for d coprime to μ) or φ(d) ≤ 2m (otherwise). There is thus a finite number of possibilities for ωl . A little thought reduces the number further — e.g. when gcd(d, μ) = 1 it suffices to consider only ωl = ξd . We find that (for ν = 3) only 12 possibilities for ωl need be


considered, namely $\pm 1, \pm\xi_3, i, \pm\xi_7, \pm\xi_9, \pm\xi_{13}, -\xi_{13}^6$. To rule out these possibilities, we compute determinants. In particular, for $\omega_l \ne 1, \xi_3$, we show that the linear system (4.6) is inconsistent by showing the $(\nu^2 + 1) \times (\nu^2 + 1)$ augmented matrix (formed from the coefficients and constant vector) has nonzero determinant. To show $\omega_l = 1$ (respectively $\omega_l = \xi_3$) has at most 3 (resp. 1) independent solutions, we deleted 3 (resp. 1) equations and 2 (resp. 0) variables from (4.6) so that the resulting reduced coefficient matrix has nonzero determinant. We evaluated these determinants numerically using Maple. If some determinant was fairly close to 0, we found some Galois associate of $\omega_l$ with large determinant (i.e. of order 1 or higher). In this way we show that $\omega_l$ must be one of the values $\xi_\mu^{ml^2}$.

We can now determine the modular data, rigorously identifying it with $\mathcal{D}^{0}\mathrm{Hg}_\nu$, as follows. Numerically compute the solution to (4.6) and plug it into (4.7) to obtain an estimate $S'$. We find that $S'$ matches $S$ to within $2 \times 10^{-9}$, well within the 0.015 required by (4.8).

This argument for $K = \mathbb{Z}_5$ and $\mathbb{Z}_7$ is identical. For them exactly 14 and 22, respectively, values of $\omega_l$ need to be considered. The error $|S' - S|$ is less than $2 \times 10^{-9}$ and $5 \times 10^{-8}$ respectively, again well within (4.8).

The modular data calculation for $K = \mathbb{Z}_9$ is more complicated for two reasons, both related to the compositeness of $\mu = 5 \cdot 17$. Fix $A \in \mathcal{A}(\mathbb{Z}_9)$ (the result for the other $A$ is automatic by Galois considerations). First, an automorphism $\sigma$ will permute the entries of $A$ iff $\sigma$ fixes both $\delta$ and $s = \mathrm{Tr}\,b$ (incidentally, $\mathrm{Gal}(\mathbb{Q}[s]/\mathbb{Q}) \cong \mathbb{Z}_2^2$, so $s$ lies in a cyclotomic extension of $\mathbb{Q}$). Hence potentially only 1/4 of $\mathrm{Gal}(\mathbb{Q}[\xi_d]/\mathbb{Q})$ lifts to such $\sigma$. This means more possibilities for $\omega_l$ to be eliminated (218 possibilities is overkill), and we find that the $T$ matrix for $A$ exactly matches that of either $\mathcal{D}^{0}\mathrm{Hg}_9$ or $\mathcal{D}^{0,\{5,17\}}\mathrm{Hg}_9$. More significantly, because of equalities like $\omega_1 = \omega_{16}$, there won't always be unique solutions to (4.6), and (4.7) is ambiguous. This means we cannot identify $S$ without more work.
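The margin between the reported numerical errors and the tolerance in (4.8) is easy to quantify. The short sketch below assumes the bound in (4.8) is $1/(18\sqrt{\mu})$ with $\mu = \nu^2 + 4$ and uses the error figures quoted in the text for $\nu = 3, 5, 7$; the names `s_tolerance` and `reported_error` are illustrative.

```python
import math


def s_tolerance(nu):
    """Right-hand side of (4.8), 1 / (18 sqrt(mu)) with mu = nu^2 + 4."""
    return 1 / (18 * math.sqrt(nu * nu + 4))


# Upper bounds on |S' - S| quoted in the text for K = Z_3, Z_5, Z_7.
reported_error = {3: 2e-9, 5: 2e-9, 7: 5e-8}

# How many times smaller the reported error is than the allowed tolerance.
safety_factor = {nu: s_tolerance(nu) / err for nu, err in reported_error.items()}

if __name__ == "__main__":
    for nu in reported_error:
        print(nu, s_tolerance(nu), reported_error[nu], safety_factor[nu])
```

For $\nu = 3$ the tolerance evaluates to about 0.0154, matching the figure 0.015 quoted above, so the numerical errors sit more than five orders of magnitude inside the bound in all three cases.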

The key difference between $K = \mathbb{Z}_9$ and $\mathbb{Z}_3^2$ is the size of their automorphism groups (6 vs 48). Nonuniqueness for $K = \mathbb{Z}_9$ arises because $\mu = 85$ is composite: more generally, where $k > 1$ is the number of distinct prime divisors of $\mu$, there will be at least $2^{k-1}$ inequivalent $A$ and indeed inequivalent subfactors, realising the inequivalent modular data of type $\mathcal{D}^{0,P}\mathrm{Hg}_\nu$ defined in Sect. 3.3. This nonuniqueness of subfactors with identical principal graphs is not a surprise: for a simple example, the principal graph fails to uniquely identify the subfactor even in index $< 4$ (two subfactors realise each of the graphs $E_6$, $E_8$ although one is the opposite of the other) whilst in index 4 the affine graphs $A^{(1)}_{2n-5}$ and its orbifold $D^{(1)}_n$ both have $n - 2$ inequivalent subfactors [39].

Exact expressions for the matrices $A \in \mathcal{A}(K)$ appear in the proof of Theorem 5. Numerical estimates for these $A$ may also be of interest. A convenient way to express this, for $K$ cyclic, is in terms of $j_2, j_3, \ldots, j_{n+1} \in \mathbb{R}$: for $0 < g < h < \nu$ we have

$$ A_{g,h} = \frac{\sqrt{\delta}}{\delta - 1}\,\exp\bigl[i(j_h - j_g - j_{h-g})\bigr], $$

where $j_1 = 0$ and $j_{n+1+i} = j_{n+1} + j_n - j_{n-i}$ for $1 \le i < n$ (see Lemma 7.3 of [38]).

Exoticness and Realisability of Twisted Haagerup–Izumi Modular Data (3)

j2 (5)

(5)

(7)

(7)

499

≈ 1.292076 ;

( j2 , j3 ) ≈ (0.1846862, 1.5984702) ; (7)

( j2 , j3 , j4 ) ≈ (2.471228, 0.51685555, 0.2137724) ; (9)

(9)

(9)

(9)

( j2 , . . . , j5 ) ≈ (2.396976693, 2.079251103, −0.2079168419, −2.508673987) ; ( j2 , . . . , j5 ) ≈ (−2.364737070, 1.031057162, 1.569692175, 0.3383837765). j (9) corresponds to D0 Hg9 and j (9) to D0,{5,17} Hg9 . It should be possible to identify A(Zν ) for the next couple ν’s, using the method of ν = 9 described in the proof. To probe the answer, we used Maple to obtain numerical solutions to (4.1)-(4.3). We don’t have any proof that these correspond to actual solutions (although the numerics are convincing), and we certainly have no proof that there are no other solutions for these ν. The inequivalent numerical solutions we obtained for K = Zν , 11 ≤ ν ≤ 19, are: j (11) ≈ (0.9996507, 2.7258434, −0.5714203, −1.7797340, 1.2675985), j (11) ≈ (−2.6444397, −1.7629598, −2.6444440, 2.7572657, 0.1128260) ; j (13) ≈ (−3.1050384, 0.5993399, −0.111708, −0.969766, 1.336848, 1.00483129) ; j (15) ≈ (−1.0777623, −.7748018, −2.171863, −1.6068402, −.257508, 2.092502, .72289565) ; j (17) ≈ (−1.466074, .291489, 3.130735, −2.693185, 1.398153, −.611938, −1.667078, −1.754821) ; j (19) ≈ (−2.677465, 1.088972, −.899442, .015448, −1.240928, −.493394, 1.839879, −1.525884, −2.084374) ; j (19) ≈ (.896858, −.882585, −2.369855, −1.873294, −1.711620, −.119360, 2.972018, −2.460652, .041334). For ν = 13, 15, 17 respectively, μ = 173, 229, 293 is prime. In these cases, choosing 2 ωl = ξμml yields a unique solution to (4.6) (according to Maple) and plugging into (4.7) yields a very close approximation (of order 10−6 or so) S to the S matrix of D0 Hgν . With high confidence we can expect j (ν) for ν = 13, 15, 17 to describe a subfactor of Izumi type Zν realising the modular data D0 Hgν . For ν = 11, 19 respectively, μ = 53 , 5 · 73 are composite and the situation is more 2 subtle. We find (up to the numerical accuracy of Maple) that for these j (ν) , ωl = ξμml in (4.6) yields the correct multiplicity of solutions, and thus the corresponding T matrix agrees with that of D0 Hgν . 
Likewise, the solution j (19) corresponds to D0,{5,73} Hg19 . As with ν = 9, the existence of higher multiplicities means we cannot determine S unambiguously from (4.7) for these 3 As. We expect j (11) , j (19) , j (19) to correspond to actual solutions of (4.1)-(4.3), hence subfactors of Izumi type Z11 , Z19 , Z19 respectively, with modular data D0 Hg11 , D0 Hg19 , D0,{5,73} Hg19 resp.
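The distinction drawn above between prime and composite μ = ν² + 4 is easy to tabulate. The following is a minimal sketch, not part of the original computation (which used Maple); the helper names `prime_factors` and `mu` are introduced here purely for illustration.

```python
def prime_factors(n):
    """Return the prime factorisation of n as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def mu(nu):
    """mu = nu^2 + 4, as defined in the text."""
    return nu * nu + 4


# Factorise mu for the odd values of nu discussed in the text.
table = {nu: prime_factors(mu(nu)) for nu in range(9, 20, 2)}

for nu, factors in table.items():
    kind = "prime" if factors == {mu(nu): 1} else "composite"
    print(nu, mu(nu), factors, kind)
```

Under these definitions μ is prime exactly for ν = 13, 15, 17 in this range, has two distinct prime divisors for ν = 9 and ν = 19, and is a nontrivial prime power for ν = 11.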

D. E. Evans, T. Gannon

The only unexpected solution here is $j^{(11)}$. Following the method described in the proof of Theorem 5, we obtain the corresponding T matrix: for each 1 ≤ l ≤ m,

$$T_{d_l,d_l} = \begin{cases} \xi_\mu^{m l^2} & \text{if } 5 \mid l, \\ \xi_\mu^{5 m l^2} & \text{otherwise.} \end{cases} \qquad (4.12)$$

As before, S is not uniquely determined by (4.7). We do not have a guess for what this S matrix is, though its entries $S_{d_l,d_l}$ should lie in Q[cos(2π/25)]. Just as extra solutions like $j^{(9)}$, $j^{(19)}$ occur whenever distinct primes divide μ, we would expect that solutions like $j^{(11)}$ occur whenever a nontrivial prime power divides μ. On the basis of these observations, it is tempting to make the following conjecture.

Conjecture 1. The modular data $D^{0,P} Hg_\nu$ for any ν = 2n + 1 and any set P (possibly empty) consisting of an even number of distinct prime divisors of μ, is realised by the even part of the quantum double of a subfactor of Izumi type $Z_\nu$. All A ∈ A(K) should correspond to modular data obeying $T^{2\mu\nu} = I$.

We expect $T_{d_l,d_l}^{2\mu} = 1$ for reasons clear from the proof of Theorem 5. Evidence is accumulating suggesting that (up to equivalence) there is a unique matrix $A \in A(Z_\nu)$, and a unique subfactor of Izumi type $Z_\nu$, iff $\mu = \nu^2 + 4$ is prime. It would be interesting to see if A(K) can be nonempty when K is not cyclic.

Question 5. Interpret the $Z_\nu$ twist in $D^\omega Hg_\nu$ in a von Neumann algebra formalism. We would expect this to be in terms of twisted systems of endomorphisms, or twisted fusion categories, as happens for twisted doubles of finite groups.

4.2. Principal graphs, α-induction, etc. for Izumi's subfactors. Consider a subfactor N ⊂ M of Izumi type K, an abelian group of odd order ν, defined by some matrix A ∈ A(K), together with its N-N and M-M systems. We argued last subsection that the modular data of the doubles of these two systems is often (always?) the untwisted Haagerup–Izumi modular data $D^0 Hg_\nu$ or some generalisation thereof. This subsection describes the data associated to this subfactor, generalising Sect. 2.2. For the special case of the Haagerup subfactor, i.e. $K = Z_3$, much of this was computed in [5], though without appreciating the underlying $Z_3$ structure which organises everything. Write κ for the inclusion N ⊂ M. Write K for a set of representatives of the equivalence classes (K \0)/±; then g ∈ K labels the conjugacy classes |g| := {g, −g} in the dihedral group $K \rtimes Z_2$ (where $Z_2$ acts by inverse). Recall $\delta = (\nu + \sqrt{\mu})/2$, $n = (\nu - 1)/2$, $m = (\nu^2 + 3)/2$, $\mu = \nu^2 + 4 = 2m + 1$ and write $\lambda = \sqrt{1 + \delta}$.

Theorem 6. Let N ⊂ M be a subfactor of Izumi type K as above. It has index δ + 1. The N-N system is $\{id, u, \ldots, u^{\nu-1}, \rho, \rho u, \ldots, \rho u^{\nu-1}\}$, and its sectors have fusions generated by

$$[u_g][u_h] = [u_{g+h}], \qquad [u_g][\rho] = [\rho][u_{-g}], \qquad [\rho]^2 = [id] + \sum_{g \in K} [\rho u_g]. \qquad (4.13)$$

The M-N system consists of endomorphisms $\kappa u_g$ (g ∈ K) and $\kappa'$. The principal graph consists of ν segments, each of length 3, sharing a common central vertex. The product M-N × N-N → M-N is

$$[\kappa u_g][u_h] = [\kappa u_{g+h}], \qquad [\kappa'][u_g] = [\kappa'], \qquad [\kappa u_g][\rho u_h] = [\kappa'] + [\kappa u_{-g+h}], \qquad [\kappa'][\rho u_g] = \sum_{h \in K} [\kappa u_h] + (\nu - 1)[\kappa']. \qquad (4.14)$$

Exoticness and Realisability of Twisted Haagerup–Izumi Modular Data


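The product N-M × M-N → N-N is:

$$[\overline{\kappa u_g}][\kappa u_h] = [u_{h-g}] + [\rho u_{h+g}], \qquad [\overline{\kappa'}][\kappa u_g] = \sum_{h \in K} [\rho u_h], \qquad [\overline{\kappa'}][\kappa'] = \sum_{g \in K} [u_g] + (\nu - 1) \sum_{g \in K} [\rho u_g]. \qquad (4.15)$$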

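The primaries of the double D are 0, b, $a_i$, $c_{i,j}$, $d_l$ for i ∈ K, j ∈ K, 1 ≤ l ≤ m. The canonical endomorphism is $\theta_D = 1 + b + 2\sum_i a_i$ with modular invariant $Z = |ch_0 + ch_b + 2\sum_i ch_{a_i}|^2$. The alpha-inductions are $\alpha_x^+ = \alpha_x \otimes 1$, $\alpha_x^- = 1 \otimes (\alpha_x)^{opp}$ where, for all i ∈ K, j ∈ K, 1 ≤ l ≤ m,

$$[\alpha_b] = [id] + \sum_{g \in K} [\rho u_g], \qquad [\alpha_{a_i}] = 2[id] + \sum_{g \in K} [\rho u_g], \qquad [\alpha_{c_{i,j}}] = [u_i] + [u_{-i}] + \sum_{g \in K} [\rho u_g], \qquad [\alpha_{d_l}] = \sum_{g \in K} [\rho u_g]. \qquad (4.16)$$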

Proof. In Lemma 7.1 and Theorem 7.2 of [38], we learn that the N-N system contains $u_g$ and $\rho u_g$ for all g ∈ K, and the fusions $[\bar\kappa\kappa] = [id] + [\rho]$, $[u_g][u_h] = [u_{g+h}]$ and $[\rho]^2 = [id] + \sum_g [\rho u_g]$. By definition the N-N system consists of all irreducible sectors in any $(\bar\kappa\kappa)^k$, and from this we quickly obtain that it equals $\{u_g, \rho u_g\}$. Because $\langle \kappa\rho, \kappa\rho \rangle = \langle \bar\kappa\kappa\rho, \rho \rangle = \langle (\rho + 1)\rho, \rho \rangle = 2$ and $\langle \kappa\rho, \kappa u_g \rangle = \langle (\rho + 1)\rho, u_g \rangle = \delta_{g,0}$, we see that $[\kappa'] := [\kappa\rho] - [\kappa]$ is an irreducible M-N sector distinct from any $[\kappa u_g]$. We obtain the fusion $[\kappa' u_g] = [\kappa']$ from the calculation $\langle \kappa' u_g, \kappa' \rangle = \langle \kappa(\rho - 1)u_g, \kappa(\rho - 1) \rangle = \langle (\rho^2 - 1)u_g, \rho - 1 \rangle = 1$. $[\bar\kappa][\kappa']$ has dimension $\lambda^2(\delta - 1) = \nu\delta$ so will equal the sum of ν sectors $[\rho u_g]$, perhaps with multiplicities, but K-invariance now forces $[\bar\kappa][\kappa'] = \sum_g [\rho u_g]$. As above we see that the only sectors in $\kappa(\bar\kappa\kappa)^k$ are $[\kappa u_g]$, $[\kappa']$, so we have exhausted all M-N sectors, and obtain the given principal graph. All remaining products are now easy to compute. For example, $\bar\kappa'\kappa'$ has dimension $\lambda^2(\delta - 1)^2 = \nu + (\nu - 1)\nu\delta$, so by K-invariance this must be $\sum_g [u_g] + (\nu - 1)\sum_g [\rho u_g]$.

Let $A \subset B = N \otimes N^{opp}$ be the Longo-Rehren inclusion as in Sect. 1.2. At the end of Sect. 4 of [37] it is explained how to recover the induction-restriction (i.e. dual principal) graph between A-B and A-A sectors from the structure of the tube algebra of the N-N system. In particular, we obtain the dual principal graph directly from the half-braidings as described in Proposition 8.2 of [38]; for instance θ is read off from the half-braidings containing the identity. From the graph we read off the given alpha-inductions. From this the dual principal graph for the double D is the obvious analogue of the top graph of Fig. 3.

Note the resemblance between the fusions for the N-N system and the group algebra $\mathbb{C}D_K$ of $D_K = K \rtimes Z_2$, where $Z_2$ acts on K by inverse. We also see that the sectors $[u_g]$, $[\rho u_g]$, $[\kappa u_g]$, and $[\kappa']$ have statistical dimensions 1, δ, λ and (δ − 1)λ respectively.
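The dimension counts used in the proof of Theorem 6 can be checked numerically. The sketch below is only a floating-point sanity check under the definitions δ = (ν + √μ)/2 and λ = √(1 + δ) recalled above; the function names are hypothetical and introduced for illustration.

```python
import math


def izumi_dimensions(nu):
    """Return (delta, lam) for an Izumi-type subfactor with |K| = nu."""
    mu = nu * nu + 4
    delta = (nu + math.sqrt(mu)) / 2
    lam = math.sqrt(1 + delta)
    return delta, lam


def dimension_checks(nu):
    """Pairs (lhs, rhs) that the fusion rules of Theorem 6 require to agree."""
    delta, lam = izumi_dimensions(nu)
    return {
        # [rho]^2 = [id] + sum_g [rho u_g]
        "rho squared": (delta ** 2, 1 + nu * delta),
        # conj(kappa) kappa' is a sum of nu sectors of dimension delta
        "kappa-bar kappa'": (lam ** 2 * (delta - 1), nu * delta),
        # conj(kappa') kappa' = sum_g [u_g] + (nu - 1) sum_g [rho u_g]
        "kappa'-bar kappa'": (lam ** 2 * (delta - 1) ** 2,
                              nu + (nu - 1) * nu * delta),
        # M-N and N-N systems have the same global dimension
        "global dimension": (nu * lam ** 2 + ((delta - 1) * lam) ** 2,
                             nu * (1 + delta ** 2)),
    }


for nu in (3, 5, 7, 9, 11):
    for name, (lhs, rhs) in dimension_checks(nu).items():
        print(nu, name, math.isclose(lhs, rhs, rel_tol=1e-9))
```

For ν = 3 this gives δ = (3 + √13)/2, the Haagerup value, and index δ + 1 = (5 + √13)/2.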
It is far from obvious (though of course an immediate consequence of Theorem 6) that the map α defined there is a ring homomorphism. Theorem 7. Assume the hypotheses and notation of Theorem 6. In addition, assume 0 the fusions of the double system D = {id, b, ai , ci, j , dl } agree with those of D Hgν , and that we have the canonical endomorphism θD ai + ch,0 . Then we =1+b+ with {id, a, bi , ci }i∈K where ci = κu i κ ∼ can identify the M-M system = κu −i κ. Its


sectors obey the fusions [bi ] + [ci ], [a][bi ] = [a] + [bk ] − [bi ] + [ck ], [a]2 = [id] + [a] + [bk ] + [ci ] + [ck ], [bi ][c j ] = [a] + [bk ] + [ck ], [a][ci ] = [a] + [bi ][b j ] = δi, j [id] + (1 − δi, j )[a] + [bk ] − [bi+ j ] − [bi− j ] + [ck ], [bk ] + [ci+ j ] + [ci− j ] + [ck ], [ci ][c j ] = δi, j [id] + (1 + δi, j )[a] + where we set [c0 ] = 0 = [b0 ] and write [c−i ] = [ci ], [b−i ] = [bi ]. The dual principal graph is as in Fig. 4: e.g. there is a (odd) central vertex, to which is attached [a], the [bi ]’s and the [ci ]’s. The product M-M × M-N → M-N is [κu g ] + (ν − 1)[κ ], [bi ][κu g ] = [κ ], [a][κu g ] = [κu g ] + [κ ], [a][κ ] = [bi ][κ ] = (ν − 2)[κ ] + [κu g ], [ci ][κu g ] = [u g+i ] + [κu g−i ] + [κ ], (4.17) [κu g ] + ν[κ ], [ci ][κ ] = while the product M-N × N -M → M-M is [κu g ][κu h ] = ([id] + [a])δg,h + [cg−h ](1 − δg,h ), [κu g ][κ ] = [a] + [κ ][κ ] = [id] + (ν − 1)[a] + (ν − 2) [bi ] + ν [ci ].

[bi ] + [ci ], (4.18)

The dual canonical modular invariant is Z = |ch 0 + ch b + ch ai + ch ch,0 |2 . Writing αx+ = α x ⊗ 1 and αx− = 1 ⊗ ( α x)opp , the alpha-inductions are [bh ] − [bi ] + [ch ], [ α b] = [id] + [a] + [bh ] + [ch ], [ α ai ] = [id] + 2[a] + [bh ] + [ci ] + [ch ], [ α ci, j ] = [id]δ j,0 + ([a] − [bi ])(1 − δ j,0 ) + [bi ] + [ci ]. (4.19) [ α dl ] = [a] + is determined from Proof. From θ we obtain immediately Z . The cardinality of 2 = ν + 1. Fix any l and define [id] = [ γ, γ = θ, θ = Z i,0 α0 ] (the identity sector ), A = [ of α (dl − ch,0 + ai )], Bi = A+ν [ α (b−ai )], C h = A+ν [ α ( ch,0 − b)]. We compute (using α x, α y = θ , x y and the fusions of the double computed in Sect. 3.2) α (ai − b) = 1 + δi,i , ch − b, ch − b = 1 + δh,h , ai − b, ch − b = that α (ai − b), −1, dl , ai − b = dl , ch − b = 0, dl , dl = ν. Hence the A, Bi , C h are mutually orthogonal, each with norm ν, so they form a basis (over Q) for the vector space spanned . by the sectors of ; choose [e0 ] = [id]. Then [A], [Bi ], [Ci ] Let [ek ]0≤k≤ν be the irreducible sectors of L (1) , . . . , L (2n) ∈ Zν+1 by are linear combinations (over Z) of these [ek ]: define A, (i) [A] = Ak [ek ], [Bi ] = [A] + ν L k [ek ], [Ci ] = [A] + ν L k (i+n) [ek ]. Then A · A = ν 2 , A · L (k) = −ν, L (k) · L (l) = 1 + δk,l . From [id], αai − αb = [id], αch,0 − αb = [id], αdl = 0 we have A0 = L (k) = 0. Thus (reordering the [e ] if necessary) k 0 (k) A = (0, s1 (ν + t), . . . , sν (ν + t), st) and L (k) has components L i = sδi,ν + sk δi,k , where s, sk ∈ {±1} and t ∈ Z. This integer t satisfies ν 2 = t 2 + ν (ν + t)2 , so t = −ν.


This means that [a] := ν1 [A], and hence [bi ] := ν1 [Bi ] and [ci ] := ν1 [Ci ], all being , possibly up to signs. That the signs are indeed all +1’s follows by norm 1, are in writing say [ α d1 ] in terms of [a], [bi ], [ci ]: the coefficients should be nonnegative. now reduce to the fusions in the double. The dual principal graph The fusions in and the two remaining products follow by arguments as in Theorem 6. This guess for θ is the natural one, matching what happens for the Haagerup subfactor (recall Sect. 2.2) and what we know about monomial modular invariants in Proposition 4, which appear to be severely constrained. The only example we know violating the hypothesis on the fusions of D is the solution j (11) of Sect. 4.1. The noncom arise because of the presence respectively mutativity of and commutativity of absence of higher multiplicities in θ resp. θ . The generalisation of Fig. 3 can be read off from the alpha-inductions listed in the theorem. These imply the statistical dimensions dim(a) = δ, dim(bi ) = δ − 1 and dim(c j ) = δ + 1. The dihedral group K ×Z2 has one nontrivial 1-dimensional irrep and n 2-dimensional irreps. But for the Haagerup considered in Sect. 2.2, a corresponded to the 2 here are unrelated for dimensional and b the 1-dimensional irreps, so K ×Z2 and ν > 3, even after projecting away the ci ’s. 5. VOAs for Haagerup-Izumi Modular Data 5.1. The Haagerup-dihedral diamond. In Sect. 2.4 we saw DHg as a mutation of DS3 and contained in a conformal subalgebra of DZ3 ; now this becomes Dω Hgν is a mutation of D+,ω Dν and is contained in a conformal subalgebra of Dω Zν . The branching rules for this embedding can be read off from the modular invariant (3.11). In some ways (e.g. modular data) the Haagerup-Izumi closely resembles its sibling the dihedral group, and in other ways (e.g. VOA realisations) it more closely relates to its parent the cyclic group. The trivial case D0 Hg1 is already interesting. 
In c = 8, its VOA can be realised as the tensor product of the affine algebra VOA $V(G_{2,1})$ (which has c = 14/5) with the affine algebra VOA $V(F_{4,1})$ (which has c = 26/5). (Hence there is a VOA realisation for c any positive multiple of 8, by tensoring with copies of $V(E_8)$.) Call this $V(Hg_1^0)$. The VOA corresponding to $DZ_1$ in c = 8 of course must be $V(E_8) = V(E_{8,1})$, and the containment corresponds to the conformal embedding $G_{2,1} \oplus F_{4,1} \subset E_{8,1}$. The VOA corresponding to V D1 is the lattice VOA $V(A_1 \oplus E_7)$. Although the latter is an orbifold of $V(E_8)$, $V(G_{2,1} \oplus F_{4,1})$ cannot be, since its modular data is not that of the (possibly twisted) quantum double of a finite group. This is an important lesson: we cannot expect the $D^\omega Hg_\nu$ VOAs to be orbifolds, by some subgroup of automorphisms, of $D^\omega Z_\nu$ VOAs. Hence Question 3. Another important observation from this trivial case ν = 1 is that the intersection of the holomorphic orbifold V D1 with the ν = 1 Haagerup-Izumi VOA $V(Hg_1^0)$ is itself a rational VOA. In particular, there are conformal embeddings $V(C_{3,1} \oplus G_{2,1}) \subset V(E_{7,1})$ and $V(A_{1,1} \oplus C_{3,1}) \subset V(F_{4,1})$ [56], and so both V D1 = $V(A_1 \oplus E_7)$ and $V(Hg_1^0) = V(G_{2,1} \oplus F_{4,1})$ contain copies of the rational VOA $V(A_{1,1} \oplus G_{2,1} \oplus C_{3,1})$. Thus in this baby example at least, the dihedral and Haagerup-Izumi VOAs are commensurable in this sense. It is tempting to guess that something similar happens for higher ν. This is what we mean by the Haagerup-dihedral diamond: at the top is a VOA realising $D^\omega Z_\nu$, which contains as conformal subalgebras $D^{+,\omega} D_\nu$ and $D^\omega Hg_\nu$, and these contain a common conformal subalgebra (the analogue of $V(A_{1,1} \oplus G_{2,1} \oplus C_{3,1})$).
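A conformal embedding requires the central charges of the two sides to agree, which gives a quick arithmetic check on the ν = 1 example. The sketch below assumes the standard formula c = k dim g/(k + h∨) for an affine algebra at level k, with the usual dimensions and dual Coxeter numbers; the table `LIE_DATA` and the function names are introduced here for illustration only. The rank-3 factor is taken to be C_3 (h∨ = 4), for which the charges match; B_3 (h∨ = 5) is included only to show that it would not.

```python
from fractions import Fraction

# (dimension, dual Coxeter number) for the simple Lie algebras used here.
LIE_DATA = {
    "A1": (3, 2),
    "B3": (21, 5),
    "C3": (21, 4),
    "G2": (14, 4),
    "F4": (52, 9),
    "E7": (133, 18),
    "E8": (248, 30),
}


def central_charge(algebra, level=1):
    """Central charge k*dim(g)/(k + h_dual) of the affine algebra at level k."""
    dim, h_dual = LIE_DATA[algebra]
    return Fraction(level * dim, level + h_dual)


def total_charge(*algebras):
    """Central charge of a direct sum of level-1 affine algebras."""
    return sum((central_charge(a) for a in algebras), Fraction(0))


print(total_charge("G2", "F4"), central_charge("E8"))   # both sides of G2+F4 in E8
print(total_charge("C3", "G2"), central_charge("E7"))   # both sides of C3+G2 in E7
print(total_charge("A1", "C3"), central_charge("F4"))   # both sides of A1+C3 in F4
print(total_charge("A1", "G2", "C3"))                   # common subalgebra
print(total_charge("B3", "G2"))                         # differs from c(E7)
```

All charges are exact rationals, so equality can be tested directly rather than up to rounding.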


Conjecture 2. There is a rational VOA realisation of each Dω Hgν for any sufficiently large central charge c (c a multiple of 8). As ν increases, we will not be able to realise all Dω Hgν — or even the Dω Zν — as conformal subalgebras of V(E 8 ). In other words, c > 8 will sometimes be necessary. This conjecture is open even for ν = 3. To make it more plausible, and also to aid in the construction of the VOAs, in the following sections we supply all the information needed to identify the possible character vectors. We saw in Sect. 2.4 that the Haagerup VOA would see the holomorphic orbifold by Z3 more directly than that by S3 , so for this reason we also supply the relevant data for Dω Z3 . We should repeat here the suggestion of Sect. 2.5 that a second approach to constructing VOAs realising Dω Hgν is to use the affine algebra V(Bm,2 ) in the coset construction. For reasons of space, we haven’t explored this in this paper.

5.2. Character vectors for the untwisted Haagerup double. In this section we determine the matrices Λ and χ for the untwisted Haagerup modular data $D^0 Hg_3$, at all allowed values of central charge c (i.e. when 8|c). As explained in Sect. 1.4, from this all possible character vectors can be obtained. We illustrate this with c = 16 and c = 24 (for c = 8 see Sect. 2.4).

5.2.1. Untwisted Haagerup at central charge c ≡ 16 (mod 24). We found Λ, χ for $D^0 Hg_3$ at c ≡ 8 (mod 24) in Sect. 2.4. Changing the central charge by a multiple of 8 leaves S unchanged but multiplies T by some 3rd root ω of 1. At the end of Sect. 1.4 we explain how this affects Λ, χ. Taking the primaries in the usual order 0, b; a; $c_0, c_1, c_2$; $d_1, \ldots, d_6$, we obtain for c ≡ 16 (mod 24),

Λ = diag(1/3, 1/3; 1/3; 1/3, 2/3, 0; 31/39, 7/39, 19/39, 28/39, −5/39, −2/39),

$$\chi = \left(\begin{array}{rrrrrrrrrrrr}
17 & 155 & 162 & 162 & 27 & 729 & 13 & 286 & 65 & 13 & 1001 & 728 \\
155 & 17 & 162 & 162 & 27 & 729 & -13 & -286 & -65 & -13 & -1001 & -728 \\
162 & 162 & 334 & -162 & -27 & -729 & 0 & 0 & 0 & 0 & 0 & 0 \\
162 & 162 & -162 & 334 & -27 & -729 & 0 & 0 & 0 & 0 & 0 & 0 \\
1215 & 1215 & -1215 & -1215 & -76 & 17496 & 0 & 0 & 0 & 0 & 0 & 0 \\
9 & 9 & -9 & -9 & 6 & -12 & 0 & 0 & 0 & 0 & 0 & 0 \\
1925 & -1925 & 0 & 0 & 0 & 0 & -21 & -6250 & -125 & 50 & 51331 & 43175 \\
45 & -45 & 0 & 0 & 0 & 0 & -9 & 58 & 43 & 12 & -45 & -297 \\
374 & -374 & 0 & 0 & 0 & 0 & 0 & 1650 & 106 & -51 & -4250 & 3927 \\
1288 & -1288 & 0 & 0 & 0 & 0 & 23 & 5313 & -759 & 4 & 39721 & -15432 \\
1 & -1 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 1 & 3 & 1 \\
3 & -3 & 0 & 0 & 0 & 0 & 3 & -9 & 4 & -2 & 6 & 4
\end{array}\right).$$

From (1.24) the matrix Ξ(τ) is recursively computed. Any weakly holomorphic vector-valued modular function equals Ξ(τ)P(J) for some polynomial vector P. Specialise now to a VOA character vector at c = 16. The corresponding polynomials $P_i(x)$ are heavily constrained. For one thing, the vacuum character $ch_0(\tau)$ starts like $1 \cdot q^{-2/3} + \cdots$ and all other characters $ch_i(\tau)$ start like $a_i q^{h_i - c/24} + \cdots$, where $h_i > 0$. A component of P can therefore be nonzero only where the corresponding entry of Λ exceeds 1/3, which forces $P = (1, 0, 0, 0, e, 0, g, 0, h, j, 0, 0)^T$ for constants e, g, h, j. So we've uniquely determined the hypothetical c = 16 Haagerup character vector, once 4 numbers (namely e, g, h, j) are specified. Its first few terms are


$$\begin{aligned}
ch_0(\tau) &= q^{-2/3}\big(1 + (17 + 13g + 27e + 65h + 13j)q + (2013 + 594e + 4173h + 260j + 91g)q^2 + \cdots\big) \\
ch_b(\tau) &= q^{1/3}\big((155 - 13g + 27e - 65h - 13j) + (21245 + 594e - 4173h - 260j - 91g)q + \cdots\big) \\
ch_a(\tau) = ch_{c_0}(\tau) &= q^{1/3}\big((-27e + 162) + (23247 - 594e)q + (705024 - 5967e)q^2 + \cdots\big) \\
ch_{c_1}(\tau) &= q^{-1/3}\big(e + (-76e + 1215)q + (-1384e + 79704)q^2 + (-11580e + 1886166)q^3 + \cdots\big) \\
ch_{c_2}(\tau) &= (6e + 9) + (5832 + 486e)q + (247131 + 5832e)q^2 + \cdots \\
ch_{d_1}(\tau) &= q^{-8/39}\big(g + (-21g + 1925 - 125h + 50j)q + (-239g + 103631 - 4376h + 506j)q^2 + \cdots\big) \\
ch_{d_2}(\tau) &= q^{7/39}\big((43h + 45 + 12j - 9g) + (4792h + 258j + 10400 - 71g)q + \cdots\big) \\
ch_{d_3}(\tau) &= q^{-20/39}\big(h + (106h + 374 - 51j)q + (4980h - 634j + 34606 - 43g)q^2 + \cdots\big) \\
ch_{d_4}(\tau) &= q^{-11/39}\big(j + (4j + 1288 + 23g - 759h)q + (-84j + 141g + 79583 - 25135h)q^2 + \cdots\big) \\
ch_{d_5}(\tau) &= q^{-5/39}\big((-h + 1 + j + g) + (-856h + 130j + 2704 + 31g)q + \cdots\big) \\
ch_{d_6}(\tau) &= q^{-2/39}\big((4h + 3 - 2j + 3g) + (1582h - 95j + 3898 + 70g)q + \cdots\big)
\end{aligned} \qquad (5.1)$$

Since the Fourier coefficients of a VOA character vector are nonnegative integers, we know each $e, g, h, j \in \mathbb{Z}_{\ge 0}$. In addition the $q^{1/3}$-coefficient of $ch_a$ forces e ≤ 6 and the $q^{1/3}$-coefficient of $ch_b$ then forces g + j + 5h ≤ 24, so there are only finitely many possibilities. Other coefficients yield further bounds, e.g. h ≤ 2, j ≤ 11, g ≤ 16. There are several other constraints. One is that VOA characters are linear combinations over $\mathbb{Z}_{\ge 0}$ of the Virasoro characters at that c for various values of conformal weight h (because the VOA carries a Virasoro representation). All these characters are well-known: $ch_{(16,0)} = q^{-2/3}(1 + q^2 + q^3 + 2q^4 + \cdots)$ and $ch_{(16,h)} = q^{h-2/3}(1 + q + 2q^2 + 3q^3 + 5q^4 + \cdots)$ for h > 0. This isn't so useful here, but it is effective for the c = 24 case considered next. More important, Dong-Mason [18] prove that the conformal weight 1 part of a rational VOA V is a reductive Lie algebra $\oplus_i g_i$ of central charge $\sum_i k_i \dim g_i/(k_i + h_i^\vee) \le c$ and dimension equal to the coefficient of $q^{1-c/24}$ in $ch_0$. This gives a finite set of possible values for that coefficient, and knowing this reductive Lie algebra helps in constructing V. Each V-module is also a module for the corresponding affine Lie algebra, so the V-characters are linear combinations of the affine characters. This is as far as we'll carry the analysis here.

The matrices Λ, χ for $D^0 Z_3$, with primaries in the order (00), (01) = (02), (10) = (20), (11) = (22), (12) = (21), are Λ = diag(1/3, 1/3, 1/3, 2/3, 0) and

$$\chi = \left(\begin{array}{rrrrr}
172 & 324 & 324 & 54 & 1458 \\
162 & 334 & -162 & -27 & -729 \\
162 & -162 & 334 & -27 & -729 \\
1215 & -1215 & -1215 & -76 & 17496 \\
9 & -9 & -9 & 6 & -12
\end{array}\right).$$
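The first subleading coefficients of the c = 16 Haagerup character vector (5.1) are obtained by applying χ to the constant vector P. The sketch below reproduces that step; the 12 × 12 matrix is transcribed from Sect. 5.2.1, the placement of e, g, h, j in P is the one forced by Λ (an entry can be nonzero only where the diagonal entry of Λ exceeds 1/3), and the function name `first_order_coefficients` is an assumption made for illustration.

```python
# chi for D^0 Hg_3 at c = 16, primaries ordered 0, b, a, c0, c1, c2, d1..d6.
CHI = [
    [17, 155, 162, 162, 27, 729, 13, 286, 65, 13, 1001, 728],
    [155, 17, 162, 162, 27, 729, -13, -286, -65, -13, -1001, -728],
    [162, 162, 334, -162, -27, -729, 0, 0, 0, 0, 0, 0],
    [162, 162, -162, 334, -27, -729, 0, 0, 0, 0, 0, 0],
    [1215, 1215, -1215, -1215, -76, 17496, 0, 0, 0, 0, 0, 0],
    [9, 9, -9, -9, 6, -12, 0, 0, 0, 0, 0, 0],
    [1925, -1925, 0, 0, 0, 0, -21, -6250, -125, 50, 51331, 43175],
    [45, -45, 0, 0, 0, 0, -9, 58, 43, 12, -45, -297],
    [374, -374, 0, 0, 0, 0, 0, 1650, 106, -51, -4250, 3927],
    [1288, -1288, 0, 0, 0, 0, 23, 5313, -759, 4, 39721, -15432],
    [1, -1, 0, 0, 0, 0, 1, 0, -1, 1, 3, 1],
    [3, -3, 0, 0, 0, 0, 3, -9, 4, -2, 6, 4],
]


def first_order_coefficients(e, g, h, j):
    """Return chi * P for P = (1,0,0,0,e,0,g,0,h,j,0,0)^T."""
    P = [1, 0, 0, 0, e, 0, g, 0, h, j, 0, 0]
    return [sum(c * p for c, p in zip(row, P)) for row in CHI]


def admissible(e, g, h, j):
    """True if the parameters and these coefficients are all nonnegative."""
    params_ok = min(e, g, h, j) >= 0
    return params_ok and all(c >= 0 for c in first_order_coefficients(e, g, h, j))


# Largest e allowed by these coefficients alone (with g = h = j = 0).
max_e = max(e for e in range(50) if admissible(e, 0, 0, 0))
print(first_order_coefficients(0, 0, 0, 0))
print(max_e)
```

This checks only the coefficients linear in χ; the higher-order terms in (5.1), which come from the recursion for Ξ(τ), impose the further bounds quoted in the text and are not reproduced here.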


At c = 16 there is only one free parameter, d: the possible character vectors are

$$\begin{aligned}
ch_{(0,0)} &= q^{-2/3}\big(1 + (172 + 54d)q + (23258 + 1188d)q^2 + \cdots\big) \\
ch_{(0,1)} = ch_{(0,2)} = ch_{(1,0)} = ch_{(2,0)} &= q^{1/3}\big((162 - 27d) + (23247 - 594d)q + (705024 - 5967d)q^2 + \cdots\big) \\
ch_{(1,1)} = ch_{(2,2)} &= q^{-1/3}\big(d + (1215 - 76d)q + (79704 - 1384d)q^2 + \cdots\big) \\
ch_{(1,2)} = ch_{(2,1)} &= (9 + 6d) + (5832 + 486d)q + (247131 + 5832d)q^2 + \cdots
\end{aligned} \qquad (5.2)$$

This implies d ∈ Z≥0 and d ≤ 6. This d equals the parameter e of D0 Hg. 5.2.2. Untwisted Haagerup at central charge c ≡24 0. Next turn to the Haagerup at 24|c. Finding its , χ is now routine: = diag(0, 0, 1, 1, 1/3, 2/3, 6/13, 11/13, 2/13, 5/13, 7/13, 8/13), ⎛ −12 0 1/2 1/2 18 6 6 1 10 5 −12 1/2 1/2 18 6 −6 −1 −10 −5 ⎜ 0 ⎜ 65610 65610 0 0 −5832 −243 0 0 0 0 ⎜ ⎜ 65610 65610 0 0 −5832 −243 0 0 0 0 ⎜ ⎜ 729 729 0 0 −152 54 0 0 0 0 ⎜ ⎜ 8748 8748 0 0 2430 −76 0 0 0 0 χ =⎜ ⎜ 1716 −1716 0 0 0 0 −252 −5 −176 176 ⎜ ⎜ 22451 −22451 0 0 0 0 −980 8 16556 2464 ⎜ ⎜ 104 −104 0 0 0 0 0 5 44 −56 ⎜ 0 0 77 8 −847 −32 ⎜ 910 −910 0 0 ⎝ 3003 −3003 0 0 0 0 330 0 −1540 704 5200 −5200

0

0

0

0

4 −4 0 0 0 0 134 −96 −15 120 −216 616 −14 3388 −595 132

⎞ 4 −4 ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟. 100 ⎟ ⎟ −385 ⎟ ⎟ 20 ⎟ ⎟ −34 ⎟ 55 ⎠ −20

Its possible character vectors are found as before to be ⎞ ⎞ ⎛ q −1 + (−12 + 10i + 18e + c+d ch 0 (τ ) 2 + 6 f + 4k + 5 j + h + 4l + 6k) + · · · ⎟ ⎜ ch b (τ ) ⎟ ⎜ (−10i + 18e + c+d 2 + 6 f − 4k − 5 j − h − 4l − 6g) + · · · ⎟ ⎟ ⎜ ⎜ 2 + ··· ⎟ ⎜ ch a (τ ) ⎟ ⎜ c + (65610 − 5832e − 243 f )q + (7164612 − 247131e − 2916 f )q ⎟ ⎟ ⎜ ⎜ ⎜ ch c0 (τ ) ⎟ ⎜ d + (65610 − 5832e − 243 f )q + (7164612 − 247131e − 2916 f )q 2 + · · · ⎟ ⎟ ⎟ ⎜ ⎜ ⎜ ch c1 (τ ) ⎟ ⎜ eq −2/3 + (729 − 152e + 54 f )q 1/3 + (370332 − 23236e + 1188 f )q 4/3 + · · · ⎟ ⎟ ⎟ ⎜ ⎜ −1/3 2/3 5/3 ⎜ ch c2 (τ ) ⎟ ⎜ f q + (8748 − 76 f + 2430e)q + (1743039 − 1384 f + 159408e)q + · · ·⎟ ⎟ ⎟=⎜ ⎜ ⎟ ⎜ch d (τ )⎟ ⎜ gq −7/13 + (1716 − 252g − 176i + 134k + 100l − 5h + 176 j)q 6/13 + · · · 1 ⎟ ⎟ ⎜ ⎜ ⎜ch d (τ )⎟ ⎜ hq −2/13 + (22451 + 8h + 2464 j + 16556i − 385l − 96k − 980g)q 11/13 + · · · ⎟ 2 ⎟ ⎟ ⎜ ⎜ ⎟ ⎜ch d (τ )⎟ ⎜ iq −11/13 + (104 + 44i − 15k + 5h − 56 j + 20l)q 2/13 + · · · 3 ⎟ ⎟ ⎜ ⎜ ⎟ ⎜ch (τ )⎟ ⎜ −8/13 5/13 jq + (910 − 32 j − 34l − 847i + 77g + 8h + 120k)q + ··· ⎟ ⎜ d4 ⎟ ⎜ ⎠ ⎝ch d (τ )⎠ ⎝ kq −6/13 + (3003 + 704 j − 1540i + 55l − 216k + 330g)q 7/13 + · · · ⎛

5

ch d6 (τ )

lq −5/13 + (5200 − 20l + 3388i + 616g − 14h + 132k − 595 j)q 8/13 + · · ·

for certain nonnegative integers c, d, e, f, g, h, i, j, k, l. By the usual arguments we find a finite number of possibilities. Likewise the matrices for D0 Z3 are = diag(0, 1, 1, 1/3, 2/3) and


⎛

−12 ⎜ 65610 ⎜ χ = ⎜ 65610 ⎝ 729 8748

1 0 0 0 0


⎞ 1 36 12 0 −5832 −243 ⎟ ⎟ 0 −5832 −243 ⎟, 0 −152 54 ⎠ 0 2430 −76

and we find a character vector depending on 4 bounded nonnegative integers. The most intriguing possibility is if these Haagerup and Z3 VOAs are both subVOAs of the Moonshine module. It is tempting to guess that this is the most natural home of the Haagerup. Then the character J of the Moonshine module will equal one of ch 0 + ch b + 2ch a, ch 0 + ch b + 2ch c, or ch 0 + ch b + ch a + ch c0 , and we find that (c, d, f ) is either (12,0,0), (0,12,0) or (0,0,1), and that (g, h, j, k, l) is either (1,2,0,0,0), (0,1,1,0,0), (0,2,0,1,0) or (0,1,0,0,1), and all other parameters vanish.

5.3. Character vectors of the twisted Haagerup double. The modular group representation for the ω = 0 twist of the Haagerup decomposes into a sum of 3 irreps: the trivial one, a 4-dimensional one with kernel (9), and a 7-dimensional one with kernel (13). Only the 4-dimensional irrep is new: it is handled by e.g. the root lattice A8 . Actually, 6 different 4-dimensional irreps arise here, varying with the twist and central charge mod 24, but they are all obtained from the one in A8 by some combination of 3rd roots of unity and taking the contragredient. The effects on , χ of both of these is discussed at the end of Sect. 1.4. 5.3.1. The 1-twisted Haagerup at central charge c ≡24 8. The matrices , χ for D1 Z3 and D1 Hg3 are respectively diag(2/3, 2/3, 7/9, 1/9, 4/9), ⎛ ⎞ 80 168 54 9504 1078 ⎜ 84 164 −27 −4752 −539 ⎟ ⎜ ⎟ ⎜ 126 −126 8 −17248 1375 ⎟, ⎜ ⎟ ⎝ 9 −9 −8 96 10 ⎠ 36 −36 20 392 −340 diag(−1/3, 2/3, 2/3, 7/9, 1/9, 4/9, 5/39, 20/39, 32/39, 41/39, 8/39, 11/39), ⎛ −42/5 1 0 0 0 0 −17/5 −13/5 3/5 1/5 ⎜ 209456/5 80 84 81 14256 1620 −19054/5 −2366/5 166/5 77/5 ⎜ ⎜ 228912/5 168 164 −81 −14256 −1620 −1428/5 −1092/5 252/5 84/5 ⎜ ⎜ 176556/5 84 −42 8 −17248 1375 −714/5 −546/5 126/5 42/5 ⎜ ⎜ 114/5 6 −3 −8 96 10 −51/5 −39/5 9/5 3/5 ⎜ ⎜ 11391/5 24 −12 20 392 −340 −204/5 −156/5 36/5 12/5 ⎜ ⎜ 546/5 0 0 0 0 0 −559/5 −161/5 16/5 7/5 ⎜ ⎜ 52611/5 0 0 0 0 0 −8134/5 −86/5 206/5 42/5 ⎜ ⎜ 600457/5 0 0 0 0 0 −12733/5 3388/5 412/5 119/5 ⎜ ⎜ 2851862/5 0 0 0 0 0 100947/5 6118/5 437/5 229/5 ⎜ ⎝ 1729/5 0 0 0 0 0 649/5 −174/5 14/5 13/5 5252/5 0 0 0 0 0 2002/5 −567/5 77/5 14/5

−7 −2814 −588 −294 −21 −84 14 −486 −6868 31027 −238 27

⎞ 14/5 −5812/5 ⎟ ⎟ 1176/5 ⎟ ⎟ 588/5 ⎟ ⎟ 42/5 ⎟ ⎟ 168/5 ⎟ ⎟. 418/5 ⎟ ⎟ −4617/5 ⎟ ⎟ 24871/5 ⎟ ⎟ −49514/5 ⎟ ⎟ 437/5 ⎠ 76/5

At c = 8, D1 Z3 is realised by the lattice VOA V(A8 ). D+,1 S3 is realised by the affine algebra VOA V(B4,2 ). Both of these are conformal embeddings.


5.3.2. The 1-twisted Haagerup at central charge c ≡24 16. The matrices , χ for D1 Z3 and D1 Hg3 are respectively diag(1/3, 1/3, 4/9, 7/9, 1/9), ⎞ ⎛ 166 330 198 18 900 ⎜ 165 331 −99 −9 −450 ⎟ ⎟ ⎜ 351 −351 56 −25 1625 ⎟ χ =⎜ ⎟, ⎜ ⎝ 2079 −2079 −1694 53 3146 ⎠ 27 −27 28 1 −102 diag(−2/3, 1/3, 1/3, 4/9, 7/9, 1/9, 31/39, 7/39, 19/39, 28/39, 34/39, −2/39), ⎞ ⎛ 3 1 0 0 0 0 −1 0 1 −1 1 −1 ⎜ 6732 166 165 297 27 1350 −165 −286 87 −165 152 −880 ⎟ ⎟ ⎜ ⎜ 7359 330 331 −297 −27 −1350 −165 0 165 −165 165 −165 ⎟ ⎟ ⎜ ⎜ 8199 234 −117 56 −25 1625 −117 0 117 −117 117 −117 ⎟ ⎟ ⎜ ⎜ 173727 1386 −693 −1694 53 3146 −693 0 693 −693 693 −693 ⎟ ⎟ ⎜ ⎜ 126 18 −9 28 1 −102 −9 0 9 −9 9 −9 ⎟ ⎟. ⎜ ⎜ 498225 0 0 0 0 0 −1946 −6250 1800 −1875 1925 41250 ⎟ ⎟ ⎜ ⎜ 858 0 0 0 0 0 −54 58 88 −33 45 −342 ⎟ ⎟ ⎜ ⎜ 31603 0 0 0 0 0 −374 1650 480 −425 374 3553 ⎟ ⎟ ⎜ ⎜ 263120 0 0 0 0 0 −1265 5313 529 −1284 1288 −16720 ⎟ ⎟ ⎜ ⎝ 920556 0 0 0 0 0 −2673 −2025 1848 −2574 2704 22572 ⎠ 13 0 0 0 0 0 0 −9 7 −5 3 1

5.3.3. The 1-twisted Haagerup at central charge c ≡24 0. The matrices , χ for D1 Z3 and D1 Hg3 are respectively diag(1, 0, 1/9, 4/9, 7/9), ⎞ ⎛ 0 131274 185328 13770 324 6 −135 −54 −27 ⎟ ⎜1 ⎟ ⎜ 24 −35 10 ⎟, ⎜ 0 −27 ⎝ 0 −594 −2002 290 11 ⎠ 0 −5967 14144 340 −64 diag(0, 1, 0, 1/9, 4/9, 7/9, 6/13, 11/13, 2/13, 5/13, 7/13, 8/13), ⎛ −12 1 0 0 0 0 20 2 10 8 12 ⎜ 60033 0 65637 92664 6885 162 −18997 −16 −3200 −350 −1617 ⎜ −18 2 6 −135 −54 −27 20 2 10 8 12 ⎜ ⎜ 27 0 −27 24 −35 10 0 0 0 0 0 ⎜ ⎜ 594 0 −594 −2002 290 11 0 0 0 0 0 ⎜ ⎜ 5967 0 −5967 14144 340 −64 0 0 0 0 0 ⎜ ⎜ 1716 0 0 0 0 0 −252 −5 −176 176 134 ⎜ ⎜ 22451 0 0 0 0 0 −980 8 16556 2464 −96 ⎜ ⎜ 104 0 0 0 0 0 0 5 44 −56 −15 ⎜ 0 0 0 77 8 −847 −32 120 ⎜ 910 0 0 ⎝ 3003 0 0 0 0 0 330 0 −1540 704 −216 5200 0

0

0

0

0

616

−14 3388 −595 132

⎞ 8 −792 ⎟ 8 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟. 100 ⎟ ⎟ −385 ⎟ ⎟ 20 ⎟ ⎟ −34 ⎟ 55 ⎠ −20


5.3.4. The 2-twisted Haagerup at central charge c ≡24 8. The matrices , χ for D2 Z3 and D2 Hg3 are respectively

diag(−1/3, 2/3, 8/9, 2/9, 5/9), ⎞ ⎛ 2 1 1/3 −1/3 −2/3 ⎜ 46683 248 26 −836 −133 ⎟ ⎟ ⎜ ⎜ 706401 0 156 2584 475 ⎟, ⎟ ⎜ ⎝ 2187 0 14 172 −77 ⎠ 56862 0 65 −2431 −74 diag(−1/3, −1/3, 2/3, 8/9, 2/9, 5/9, 5/39, 20/39, 32/39, 41/39, 8/39, 11/39), ⎛ −16/5 26/5 1/2 1/6 −1/6 −1/3 −17/10 −13/10 3/10 1/10 ⎜ 26/5 −16/5 1/2 1/6 −1/6 −1/3 17/10 13/10 −3/10 −1/10 ⎜ ⎜ 46683 46683 248 26 −836 −133 0 0 0 0 ⎜ ⎜ 706401 706401 0 156 2584 475 0 0 0 0 ⎜ ⎜ 2187 2187 0 14 172 −77 0 0 0 0 ⎜ ⎜ 56862 0 65 −2431 −74 0 0 0 0 ⎜ 56862 ⎜ −546/5 0 0 0 0 −559/5 −161/5 16/5 7/5 ⎜ 546/5 ⎜ ⎜ 52611/5 −52611/5 0 0 0 0 −8134/5 −86/5 206/5 42/5 ⎜ ⎜ 600457/5 −600457/5 0 0 0 0 −12733/5 3388/5 412/5 119/5 ⎜ ⎜ 2851862/5 −2851862/5 0 0 0 0 100947/5 6118/5 437/5 229/5 ⎜ ⎝ 1729/5 −1729/5 0 0 0 0 649/5 −174/5 14/5 13/5 5252/5 −5252/5 0 0 0 0 2002/5 −567/5 77/5 14/5

⎞ −7/2 7/5 7/2 −7/5 ⎟ ⎟ ⎟ 0 0 ⎟ ⎟ 0 0 ⎟ ⎟ 0 0 ⎟ ⎟ 0 0 ⎟ ⎟. 14 418/5 ⎟ ⎟ −486 −4617/5 ⎟ ⎟ −6868 24871/5 ⎟ ⎟ 31027 −49514/5 ⎟ ⎟ −238 437/5 ⎠ 27 76/5

5.3.5. The 2-twisted Haagerup at central charge c ≡24 16. The matrices , χ for D2 Z3 and D2 Hg3 are respectively

diag(1/3, 1/3, 5/9, 8/9, 2/9), ⎛ 160 336 32/3 2/3 ⎜ 168 328 −16/3 −1/3 ⎜ ⎜ 5832 −5832 −272 2 ⎜ ⎝ 32076 −32076 196 12 729 −729 40 −4

⎞ 196/3 −98/3 ⎟ ⎟ 1925 ⎟ ⎟, −15092 ⎠ 28

diag(−2/3, 1/3, 1/3, 5/9, 8/9, 2/9, 31/39, 7/39, 19/39, 28/39, 34/39, −2/39), ⎛ 3 1 0 0 0 0 −1 0 1 −1 1 ⎜ 6717 160 168 16 1 98 −162 −286 84 −162 149 ⎜ ⎜ 7374 336 328 −16 −1 −98 −168 0 168 −168 168 ⎜ ⎜ 221859 3888 −1944 −272 2 1925 −1944 0 1944 −1944 1944 ⎜ ⎜ 3788856 21384 −10692 196 12 −15092 −10692 0 10692 −10692 10692 ⎜ ⎜ 5589 486 −243 40 −4 28 −243 0 243 −243 243 ⎜ ⎜ 498225 0 0 0 0 0 −1946 −6250 1800 −1875 1925 ⎜ ⎜ 858 −54 58 88 −33 45 0 0 0 0 0 ⎜ ⎜ 31603 0 0 0 0 0 −374 1650 480 −425 374 ⎜ ⎜ 263120 0 0 0 0 0 −1265 5313 529 −1284 1288 ⎜ ⎝ 920556 0 0 0 0 0 −2673 −2025 1848 −2574 2704 13 0 0 0 0 0 0 −9 7 −5 3

⎞ −1 −877 ⎟ ⎟ −168 ⎟ ⎟ −1944 ⎟ ⎟ −10692 ⎟ ⎟ −243 ⎟ ⎟. 41250 ⎟ ⎟ −342 ⎟ ⎟ 3553 ⎟ ⎟ −16720 ⎟ ⎟ 22572 ⎠ 1


5.3.6. The 2-twisted Haagerup at central charge c ≡24 0. The matrices , χ for D2 Z3 and D2 Hg3 are respectively diag(1, 0, 8/9, 5/9, 2/9), ⎞ ⎛ 0 131274 6 528 9282 −3 −1 −7 −8 ⎟ ⎜1 ⎟ ⎜ ⎜0 −104247 3 −1001 12376⎟, ⎝0 −12393 −7 232 476 ⎠ 0 −729 5 22 −224 diag(0, 0, 1, 2/9, 5/9, 8/9, 6/13, 11/13, 2/13, 5/13, 7/13, 8/13), ⎛ −15/2 9/2 1/2 4 7/2 1/2 6 1 10 5 7/2 1/2 −6 −1 −10 −5 ⎜ 9/2 −15/2 1/2 4 ⎜ 65637 65637 0 −4641 −264 −3 0 0 0 0 ⎜ ⎜ 729 729 0 −224 22 5 0 0 0 0 ⎜ ⎜ 12393 12393 0 476 232 −7 0 0 0 0 ⎜ ⎜ 104247 104247 0 12376 −1001 3 0 0 0 0 ⎜ ⎜ 1716 −1716 0 0 0 0 −252 −5 −176 176 ⎜ ⎜ 22451 −22451 0 0 0 0 −980 8 16556 2464 ⎜ ⎜ 104 −104 0 0 0 0 0 5 44 −56 ⎜ −910 0 0 0 0 77 8 −847 −32 ⎜ 910 ⎝ 3003 −3003 0 0 0 0 330 0 −1540 704 5200

−5200

0

0

0

0

4 −4 0 0 0 0 134 −96 −15 120 −216 616 −14 3388 −595 132

⎞ 4 −4 ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟. 100 ⎟ ⎟ −385 ⎟ ⎟ 20 ⎟ ⎟ −34 ⎟ 55 ⎠ −20

Acknowledgements. The authors thank the University of Alberta Mathematics Dept, Cardiff School of Mathematics, Swansea University Dept of Computer Science, and Universität Würzburg Institut für Mathematik for generous hospitality while researching this paper. Their research was supported in part by EU-NCG Research Training Network: MRTN-CT-2006 031962, DAAD (Prodi Chair), and NSERC. We thank Matthias Gaberdiel, Pinhas Grossman, Paulo Pinto, Ingo Runkel, and Feng Xu for discussions.

References 1. Asaeda, M., Haagerup, U.: Exotic subfactors of finite depth with Jones indices (5 + √13)/2 and (5 + √17)/2. Commun. Math. Phys. 202, 1–63 (1999) 2. Bantay, P., Gannon, T.: Vector-valued modular functions for the modular group and the hypergeometric equation. Commun. Number Theory Phys. 1, 637–666 (2008) 3. Bantay, P., Gannon, T.: Vector-valued modular forms and the Riemann-Hilbert problem. (In preparation) 4. Bigelow, S., Morrison, S., Peters, E., Snyder, N.: Constructing the extended Haagerup planar algebra. http://arXiv.org/abs/0909.4099v2 [math.OA], 2009 5. Bisch, D.: On the structure of finite depth subfactors. In: Algebraic methods in operator theory. Boston: Birkhäuser, 1994, pp. 175–194 6. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors, I. Commun. Math. Phys. 197, 361–386 (1998) 7. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors, III. Commun. Math. Phys. 205, 183–228 (1999) 8. Böckenhauer, J., Evans, D.E.: Modular invariants from subfactors: Type I coupling matrices and intermediate subfactors. Commun. Math. Phys. 213, 267–289 (2000) 9. Böckenhauer, J., Evans, D.E.: Modular invariants from subfactors. In: Quantum Symmetries in Theoretical Physics and Mathematics (Bariloche, 2000), Contemp. Math. 294. Providence, RI: Amer. Math. Soc., 2002, pp. 95–131 10. Böckenhauer, J., Evans, D.E.: Modular invariants and subfactors. In: Mathematical Physics in Mathematics and Physics (Siena, 2000). Fields Inst. Commun. 30. Providence, RI: Amer. Math. Soc., 2001, pp. 11–37


11. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral generators and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999) 12. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: Chiral structure of modular invariants for subfactors. Commun. Math. Phys. 210, 733–784 (2000) 13. Conway, J.H., Sloane, N.J.A.: Low-dimensional lattices. I: Quadratic forms of small determinant. Proc. R. Soc. Lond. A 418, 17–41 (1988) 14. Coste, A., Gannon, T.: Remarks on Galois symmetry in rational conformal field theories. Phys. Lett. B 323, 316–321 (1994) 15. Coste, A., Gannon, T., Ruelle, P.: Finite group modular data. Nucl. Phys. B 581, 679–717 (2000) 16. Dijkgraaf, R., Vafa, C., Verlinde, E., Verlinde, H.: The operator algebra of orbifold models. Commun. Math. Phys. 123, 485–526 (1989) 17. Dijkgraaf, R., Witten, E.: Topological gauge theories and group cohomology. Commun. Math. Phys. 129, 393–429 (1990) 18. Dong, C., Mason, G.: Integrability of C2 -cofinite vertex operator algebras. Int. Math. Res. Not. 2006, Art. ID 80468, (2006) 19. Dovgard, R., Gepner, D.: Conformal field theories with a low number of primary fields. J. Phys. A, Math. Theor. 42, 304009 (2009) 20. Evans, D.E.: Critical phenomena, modular invariants and operator algebras. In: Operator algebras and mathematical physics (Constan¸ta 2001). Cuntz, J., Elliott, G. A., Stratila, S. et al., (eds.) Bucharest: The Theta Foundation, 2003, pp. 89–113 21. Evans, D.E.: From Ising to Haagerup. Markov Processes Relat. Fields 13, 267–287 (2007) 22. Evans, D.E.: Twisted K-theory and modular invariants: I. Quantum doubles of finite groups. In: Bratteli, O., Neshveyev, S., Skau, C. (eds.) Operator Algebras: The Abel Symposium 2004. Berlin-Heidelberg: Springer, 2006, pp. 117–144 23. Evans, D.E., Gannon, T.: Modular invariants and twisted equivariant K-theory. Commun. Number Theory Phys. 3, 209–296 (2009) 24. 
Evans, D.E., Kawahigashi, Y.: Orbifold subfactors from Hecke algebras II: quantum double and braiding. Commun. Math. Phys. 196, 331–361 (1998) 25. Evans, D.E., Kawahigashi, Y.: Quantum Symmetries on Operator Algebras. Oxford: Oxford University Press, 1998 26. Evans, D.E., Pinto, P.R.: Subfactor realisation of modular invariants. Commun. Math. Phys. 237, 309– 363 (2003) 27. Evans, D.E. Pinto, P.R.: Modular invariants and the double of the Haagerup subfactor. In: Advances in Operator Algebras and Mathematical Physics (Sinaia 2003). Boca, F.-P., Bratteli, O., Longo, R., Siedentop, H. (eds.) Bucharest: The Theta Foundation, 2006, pp. 67–88 28. Evans, D.E., Pinto, P.R.: Subfactor realisation of modular invariants: II. Intern. J Math. (to appear) 29. Frenkel, I.B., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992) 30. Fröhlich, J., Fuchs, J., Runkel, I., Schweigert, C.: Defect lines, dualities, and generalised orbifolds. http:// arXiv.org/abs/0909.5013vi [math-ph], 2009 31. Gannon, T.: Modular data: the algebraic combinatorics of conformal field theory. J. Alg. Combin. 22, 211– 250 (2005) 32. Gannon T.: The level 2 and 3 modular invariants for the orthogonal algebras. Canad. Math. J. 2, 503–521 (2000) 33. Goddard, P., Kent, A., Olive, D.: Unitary representations of Virasoro and super-Virasoro algebras. Commun. Math. Phys. 103, 105–119 (1986) √ 34. Haagerup, U.: Principal graphs of subfactors in the index range 4 < [M : N ] < 3 + 3. In: Subfactors. H. Araki et al (eds.) Singapore: World Scientific, 1994, pp.1–38 35. Hong, S.-M. Rowell E., Wang, Z.: On exotic modular tensor categories. Commun. Contemp. Math. 10(Suppl. 1), 1049–1074 (2008) 36. Huang, Y.-Z.: Vertex operator algebras, the Verlinde conjecture, and modular tensor categories. Proc. Natl. Acad. Sci. USA 102, 5352–5356 (2005) 37. Izumi, M.: The structure of sectors associated with Longo-Rehren inclusions, I. General Theory. Commun. 
Math. Phys. 213, 127–179 (2000) 38. Izumi, M.: The structure of sectors associated with Longo-Rehren inclusions, II. Examples. Rev. Math. Phys. 13, 603–674 (2001) (1) 39. Izumi, M., Kawahigashi, Y.: Classification of subfactors with the principal graph Dn . J. Funct. Anal. 112, 257–286 (1993) 40. Jones, V.F.R.: An invariant for group actions. In: de la Harpe, P. (ed.) Algèbres d’opèrateurs (Les Planssur-Bex 1978). Lecture Notes in Math. 725. Berlin: Springer, 1979, pp. 237–253. 41. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983)


42. Kac, V.G.: Infinite-dimensional Lie algebras, 3rd edn. Cambridge: Cambridge University Press, 1990 ∗ . 43. Kac, V.G., Todorov, I.T.: Affine orbifolds and rational conformal field theory extensions of W1+∞ Commun. Math. Phys. 190, 57–111 (1997) 44. Kac, V.G., Wakimoto, M.: Modular and conformal invariance constraints in representation theory of affine algebras. Adv. Math. 70, 156–236 (1988) 45. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001) 46. Kosaki, H., Munemasa, A., Yamagami, S.: Irreducible bimodules associated with crossed product algebras Internat. J Math 3, 661–676 (1992) 47. Lepowski J., Li H.: Introduction to Vertex Operator Algebras and their Representations. Boston: Birkhäuser: Boston, 2004 48. Longo, R.: Index of subfactors and statistics of quantum fields, I. Commun. Math. Phys. 126, 217–247 (1989) 49. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995) 50. Masuda, T.: An analogue of Longo’s canonical endomorphism for bimodule theory and its application to asymptotic inclusions. Internat. J. Math. 8, 249–265 (1997) 51. Moore, G., Seiberg, N.: Lectures on RCFT. In: Physics, Geometry and Topology, H. C. Lee (ed.), Nato ASI Series, Vol. 238, Newyork: Plenium Press, 1990, pp.263–361 52. Müger, M.: From subfactors to categories and topology II. The quantum double of tensor categories and subfactors. J. Pure Appl. Alg. 180, 159–219 (2003) 53. Ostrik, V.: Module categories over the Drinfeld double of a finite group Inter. Math. Research Notices 27, 1507–1520 (2003) 54. Peters, E.: A planar algebra construction of the Haagerup subfactor. http://arXiv.org/abs/0902.1294v2 [math.OA], 2009 55. Rehren, K.-H.: Braid group statistics and their superselection rules. In: The algebraic theory of superselection sectors (Palermo 1989). Singapore: World Scientific, 1990, pp. 333–355 56. 
Schellekens, A.N., Warner, N.P.: Conformal subalgebras of Kac-Moody algebras. Phys. Rev. D 34, 3092–3096 (1986) 57. Sutherland, C.: Cohomology and extensions of von Neumann algebras I, II. Publ. RIMS. Kyoto Univ. 16, 135–174 (1980) 58. Turaev, V.G.: Quantum Invariants of Knots and 3-manifolds. de Gruyter Studies in Mathematics, Vol. 18. Berlin: Walter de Gruyter, 1994 59. Walton, M.A.: Conformal branching rules and modular invariants. Nucl. Phys. B 322, 775–790 (1989) 60. Witten, E.: The search for higher symmetry in string theory. Physics and mathematics of strings. Philos. Trans. Roy. Soc. London Ser. A 329(1605), 349–357 (1989) 61. Xu, F.: Mirror extensions of local nets. Commun. Math. Phys. 270, 835–847 (2007) 62. Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9, 237–302 (1996) Communicated by Y. Kawahigashi

Commun. Math. Phys. 307, 513–560 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1331-9

Communications in

Mathematical Physics

Spectrum of Non-Hermitian Heavy Tailed Random Matrices

Charles Bordenave¹, Pietro Caputo², Djalil Chafaï³

1 IMT UMR 5219 CNRS and Université Paul-Sabatier Toulouse III, Toulouse, France.

E-mail: [email protected]

2 Dipartimento di Matematica, Università Roma Tre, Rome, Italy.

E-mail: [email protected]

3 LAMA UMR 8050 CNRS and Université Paris-Est Marne-la-Vallée, Paris, France.

E-mail: [email protected]

Received: 8 June 2010 / Accepted: 7 April 2011
Published online: 4 September 2011 – © Springer-Verlag 2011

Abstract: Let $(X_{jk})_{j,k \ge 1}$ be i.i.d. complex random variables such that $|X_{jk}|$ is in the domain of attraction of an α-stable law, with $0 < \alpha < 2$. Our main result is a heavy tailed counterpart of Girko's circular law. Namely, under some additional smoothness assumptions on the law of $X_{jk}$, we prove that there exist a deterministic sequence $a_n \sim n^{1/\alpha}$ and a probability measure $\mu_\alpha$ on $\mathbb{C}$ depending only on α such that with probability one, the empirical distribution of the eigenvalues of the rescaled matrix $(a_n^{-1} X_{jk})_{1 \le j,k \le n}$ converges weakly to $\mu_\alpha$ as $n \to \infty$. Our approach combines Aldous & Steele's objective method with Girko's Hermitization using logarithmic potentials. The underlying limiting object is defined on a bipartized version of Aldous' Poisson Weighted Infinite Tree. Recursive relations on the tree provide some properties of $\mu_\alpha$. In contrast with the Hermitian case, we find that $\mu_\alpha$ is not heavy tailed.

Contents

1. Introduction
   1.1 Main results
   1.2 Notation
2. Bipartized Resolvent Matrix
   2.1 Bipartization of a matrix
   2.2 Bipartization of an operator
   2.3 Operator on a tree
   2.4 Local operator convergence
   2.5 Poisson Weighted Infinite Tree (PWIT)
   2.6 Local convergence to PWIT
   2.7 Convergence of the resolvent matrix
   2.8 Proof of Theorem 1.1
3. Convergence of the Spectral Measure
   3.1 Tightness
   3.2 Invertibility
   3.3 Distance from a row to a vector space
   3.4 Uniform integrability
   3.5 Proof of Theorem 1.2
4. Limiting Spectral Measure
   4.1 Resolvent operator on the Poisson Weighted Infinite Tree
   4.2 Density of the limiting measure
   4.3 Proof of Theorem 1.3
Appendix A. Logarithmic Potentials and Hermitization
Appendix B. General Spectral Estimates
Appendix C. Additional Lemmas
References

1. Introduction

The eigenvalues of an $n \times n$ complex matrix $M$ are the roots in $\mathbb{C}$ of its characteristic polynomial. We label them $\lambda_1(M), \ldots, \lambda_n(M)$ so that $|\lambda_1(M)| \ge \cdots \ge |\lambda_n(M)| \ge 0$. We also denote by $s_1(M) \ge \cdots \ge s_n(M)$ the singular values of $M$, defined for every $1 \le k \le n$ by $s_k(M) := \lambda_k(\sqrt{M M^*})$, where $M^* = \bar{M}^\top$ is the conjugate transpose of $M$. We define the empirical spectral measure and the empirical singular values measure as

$$\mu_M = \frac{1}{n} \sum_{k=1}^n \delta_{\lambda_k(M)} \quad \text{and} \quad \nu_M = \frac{1}{n} \sum_{k=1}^n \delta_{s_k(M)}.$$

Let $(X_{ij})_{i,j \ge 1}$ be i.i.d. complex random variables with cumulative distribution function $F$. Consider the matrix $X = (X_{ij})_{1 \le i,j \le n}$. Following Dozier and Silverstein [19,20], if $F$ has finite positive variance $\sigma^2$, then for every $z \in \mathbb{C}$, there exists a probability measure $\mathcal{Q}_{\sigma,z}$ on $[0, \infty)$ depending only on σ and $z$, with explicit Cauchy-Stieltjes transform, such that a.s. (almost surely)

$$\nu_{\frac{1}{\sqrt{n}} X - zI} \underset{n \to \infty}{\rightsquigarrow} \mathcal{Q}_{\sigma,z}, \tag{1.1}$$

where $\rightsquigarrow$ denotes the weak convergence of probability measures. The proof of (1.1) is based on a classical approach for Hermitian random matrices with bounded second moment: truncation, centralization, recursion on the resolvent, and cubic equation for the limiting Cauchy-Stieltjes transform. In the special case $z = 0$, the statement (1.1) reduces to the quarter-circular law theorem (square version of the Marchenko-Pastur theorem, see [37,52,54]) and the probability measure $\mathcal{Q}_{\sigma,0}$ is the quarter-circular law with Lebesgue density

$$x \mapsto \frac{1}{\pi \sigma^2} \sqrt{4\sigma^2 - x^2}\, \mathbf{1}_{[0,2\sigma]}(x). \tag{1.2}$$

Girko's famous circular law theorem [25] states under the same assumptions that a.s.

$$\mu_{\frac{1}{\sqrt{n}} X} \underset{n \to \infty}{\rightsquigarrow} \mathcal{U}_\sigma, \tag{1.3}$$

where $\mathcal{U}_\sigma$ is the uniform law on the disc $\{z \in \mathbb{C};\ |z| \le \sigma\}$. This statement was established through a long sequence of partial results [4,5,21,24–28,33,39,40,48,50], the general case (1.3) being finally obtained by Tao and Vu [50] by using Girko's Hermitization with logarithmic potentials and uniform integrability, the convergence (1.1), and polynomial bounds on the extremal singular values.
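The finite second moment statements (1.1)–(1.3) can be illustrated numerically. The following is a minimal sketch, not part of the original paper: it assumes i.i.d. standard complex Gaussian entries (one convenient choice of $F$ with $\sigma = 1$), a moderate size $n = 200$ and a fixed seed, and it only compares the empirical supports with the radius σ of the disc and the edge 2σ of the quarter-circular law.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, illustrative only
n = 200
sigma = 1.0

# Assumption: complex Gaussian entries with E|X_ij|^2 = sigma^2.
X = sigma * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
M = X / np.sqrt(n)

eigenvalues = np.linalg.eigvals(M)                     # atoms of mu_M
singular_values = np.linalg.svd(M, compute_uv=False)   # atoms of nu_M

spectral_radius = np.abs(eigenvalues).max()            # should be near sigma
largest_singular_value = singular_values.max()         # should be near 2 * sigma
mean_square_singular = np.mean(singular_values ** 2)   # second moment of nu_M, near sigma^2

print("spectral radius:", round(float(spectral_radius), 3))
print("largest singular value:", round(float(largest_singular_value), 3))
print("mean of s_k^2:", round(float(mean_square_singular), 3))
```

At this size the agreement is only approximate; the point of the sketch is the factor of about 2 between the two supports.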


1.1. Main results. The aim of this paper is to investigate what happens when $F$ does not have a finite second moment. We shall consider the following hypothesis:

(H1) There exists a slowly varying function $L$ (i.e. $\lim_{t \to \infty} L(xt)/L(t) = 1$ for any $x > 0$) and a real number $\alpha \in (0, 2)$ such that for every $t \ge 1$,

$$\mathbb{P}(|X_{11}| \ge t) = \int_{\{z \in \mathbb{C};\ |z| \ge t\}} dF(z) = L(t)\, t^{-\alpha},$$

and there exists a probability measure θ on the unit circle $S^1 := \{z \in \mathbb{C};\ |z| = 1\}$ of the complex plane such that for every Borel set $D \subset S^1$,

$$\lim_{t \to \infty} \mathbb{P}\left( \frac{X_{11}}{|X_{11}|} \in D \;\Big|\; |X_{11}| \ge t \right) = \theta(D).$$

Assumption (H1) states a complex version of the classical criterion for the domain of attraction of a real α-stable law, see e.g. Feller [23, Thm. IX.8.1a]. For instance, if $X_{11} = V_1 + i V_2$ with $i = \sqrt{-1}$, and where $V_1$ and $V_2$ are independent real random variables both belonging to the domain of attraction of an α-stable law, then (H1) holds. When (H1) holds, we define the sequence

$$a_n := \inf\{a > 0 \ \text{s.t.}\ n\, \mathbb{P}(|X_{11}| \ge a) \le 1\}$$

and (H1) implies that $\lim_{n \to \infty} n\, \mathbb{P}(|X_{11}| \ge a_n) = \lim_{n \to \infty} n\, a_n^{-\alpha} L(a_n) = 1$. It follows then classically that $a_n = n^{1/\alpha} \ell(n)$ for every $n \ge 1$, for some slowly varying function $\ell$. The additional possible assumptions on $F$ to be considered in the sequel are the following:

(H2) $\mathbb{P}(|X_{11}| \ge t) \sim_{t \to \infty} c\, t^{-\alpha}$ for some $c > 0$ (this implies $a_n \sim_{n \to \infty} c^{1/\alpha} n^{1/\alpha}$).

(H3) $X_{11}$ has a bounded probability Lebesgue density on $\mathbb{R}$ or on $\mathbb{C}$.

One can check that (H1-H2-H3) hold e.g. when $|X_{11}|$ and $X_{11}/|X_{11}|$ are independent with $|X_{11}| = |S|$, where $S$ is real symmetric α-stable. Another basic example is given by $X_{11} = \varepsilon W^{-1/\alpha}$ with ε and $W$ independent such that ε takes values in $S^1$ and $W$ is uniform on $[0, 1]$. For every $n \ge 1$, let us define the i.i.d. $n \times n$ complex matrix $A = A_n$ by

$$A_{ij} := a_n^{-1} X_{ij} \tag{1.4}$$

for every $1 \le i, j \le n$. Our first result concerns the singular values of $A - zI$, $z \in \mathbb{C}$.

Theorem 1.1 (Singular values). If (H1) holds then for all $z \in \mathbb{C}$, there exists a probability measure $\nu_{\alpha,z}$ on $[0, \infty)$ depending only on α and $z$ such that a.s.

$$\nu_{A - zI} \underset{n \to \infty}{\rightsquigarrow} \nu_{\alpha,z}.$$

The case $z = 0$ was already obtained by Belinschi, Dembo and Guionnet [6]. Theorem 1.1 is a heavy tailed version of the Dozier and Silverstein theorem (1.1). Our main results below give a non-Hermitian version of Wigner's theorem for Lévy matrices [7,6,11,14], as well as a heavy tailed version of Girko's circular law theorem (1.3).

Theorem 1.2 (Eigenvalues). If (H1–H2–H3) hold then there exists a probability measure $\mu_\alpha$ on $\mathbb{C}$ depending only on α such that a.s.

$$\mu_A \underset{n \to \infty}{\rightsquigarrow} \mu_\alpha.$$
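The basic example $X_{11} = \varepsilon W^{-1/\alpha}$ and the rescaling (1.4) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the helper name `sample_heavy_tailed_matrix` is hypothetical, ε is taken uniform on $S^1$ (the text only requires values in $S^1$), and $a_n = n^{1/\alpha}$ is used because for this example $\mathbb{P}(|X_{11}| \ge t) = t^{-\alpha}$ for $t \ge 1$.

```python
import numpy as np

def sample_heavy_tailed_matrix(n, alpha, rng):
    """Hypothetical helper: returns (A, X) with X_ij = eps * W**(-1/alpha) and A = X / a_n."""
    W = 1.0 - rng.random((n, n))                     # uniform on (0, 1], avoids division by zero
    eps = np.exp(2j * np.pi * rng.random((n, n)))    # assumption: eps uniform on the unit circle
    X = eps * W ** (-1.0 / alpha)                    # P(|X_ij| >= t) = t**(-alpha) for t >= 1
    a_n = n ** (1.0 / alpha)                         # solves n * P(|X_11| >= a) = 1 for this tail
    return X / a_n, X

rng = np.random.default_rng(0)   # fixed seed, illustrative only
n, alpha = 200, 1.5
A, X = sample_heavy_tailed_matrix(n, alpha, rng)

# Empirical check of the tail at t = 2, to be compared with 2**(-alpha).
empirical_tail = np.mean(np.abs(X) >= 2.0)

eigenvalues = np.linalg.eigvals(A)                     # atoms of mu_A
singular_values = np.linalg.svd(A, compute_uv=False)   # atoms of nu_A

print("empirical tail:", float(empirical_tail), "vs", 2.0 ** (-alpha))
print("spectral radius:", float(np.abs(eigenvalues).max()))
print("largest singular value:", float(singular_values.max()))
```

The largest singular value is always at least the largest entry modulus and at least the spectral radius, so a few huge entries inflate the singular values much more than the eigenvalues.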


Theorem 1.3 (Limiting law). The probability distribution $\mu_\alpha$ from Theorem 1.2 is isotropic and has a continuous density. Its density at $z = 0$ equals

$$\frac{\Gamma(1 + 2/\alpha)^2\, \Gamma(1 + \alpha/2)^{2/\alpha}}{\pi\, \Gamma(1 - \alpha/2)^{2/\alpha}}.$$

Furthermore, up to a multiplicative constant, the density of $\mu_\alpha$ is equivalent to

$$|z|^{2(\alpha - 1)}\, e^{-\frac{\alpha}{2} |z|^\alpha} \quad \text{as } |z| \to \infty.$$

Recall that for a normal matrix (i.e. which commutes with its adjoint), the absolute values of the eigenvalues are equal to the singular values. Theorem 1.3 reveals a striking contrast between $\mu_\alpha$ and $\nu_{\alpha,0}$. The limiting law of the eigenvalues $\mu_\alpha$ has a stretched exponential tail while the limiting law $\nu_{\alpha,0}$ of the singular values is heavy tailed with power exponent α, see e.g. [6]. This does not contradict the identity $\prod_{k=1}^n |\lambda_k(A)| = \prod_{k=1}^n s_k(A)$, but it does indicate that $A$ is typically far from being a normal matrix. A similar shrinking phenomenon appears already in the finite second moment case (1.1)–(1.3): the law of the absolute value under the circular law $\mathcal{U}_\sigma$ has density $r \mapsto 2\sigma^{-2} r\, \mathbf{1}_{[0,\sigma]}(r)$ in contrast with the density (1.2) of the quarter-circular law $\mathcal{Q}_{\sigma,0}$; even the supports differ by a factor 2.

The proof of Theorem 1.1 is given in Sect. 2.8. It relies on an extension to non-Hermitian matrices of the "objective method" approach developed in [11]. More precisely, we build an explicit operator on Aldous' Poisson Weighted Infinite Tree (PWIT) and prove that it is the local limit of the matrices $A_n$ in an appropriate sense. While Poisson statistics arises naturally as in all heavy tailed phenomena, the fact that a tree structure appears in the limit is roughly explained by the observation that non-vanishing entries of the rescaled matrix $A_n = a_n^{-1} X$ can be viewed as the adjacency matrix of a sparse random graph which locally looks like a tree. In particular, the convergence to PWIT is a weighted-graph version of familiar results on the local structure of Erdős-Rényi random graphs.

The proof of Theorem 1.2 is given in Sect. 3. It relies on Girko's Hermitization method with logarithmic potentials, on Theorem 1.1, and on polynomial bounds on the extremal singular values needed to establish a uniform integrability property. This extends the Hermitization method to more general settings, by successfully mixing various arguments already developed in [11,12,50]. Following Tao and Vu, one of the key steps will be a lower bound on the distance of a row of the matrix $A$ to a subspace of dimension at most $n - n^{1-\gamma}$, for some small $\gamma > 0$.

Girko's Hermitization method gives a characterization of $\mu_\alpha$ in terms of its logarithmic potential (see Appendix A). In our settings, however, this is not convenient to derive properties of the measure $\mu_\alpha$, and our proof of Theorem 1.3 is based on an analysis of a self-adjoint operator on the PWIT and a recursive characterization of the spectral measure from the resolvent of this operator. This method is explained in Sect. 2 while the actual computations on the PWIT are performed in Sect. 4.

Let us conclude with some final remarks. Following [16], the derivation of a Markovian version of Theorems 1.1 and 1.2 is an interesting problem, see [10,11] for the symmetric case and [12,18] for the light tailed non-symmetric case. In another direction, it is also tempting to seek an interpretation of $\nu_{\alpha,z}$ and $\mu_\alpha$ in terms of a sort of graphical free probability theory. Indeed, our random operators are defined on trees


and tree structures are closely related to freeness. Also, with a proper notion of trace, it is possible to define the spectral measure of an operator, see e.g. [15,31,36]. However these notions are usually defined on algebras of bounded operators and we will not pursue this goal here. Note finally that Theorems 1.1 and 1.2 remain available for additive perturbations of finite rank, by following the methodology used in [17,47,50].

1.2. Notation. Throughout the paper, the notation $n \gg 1$ means large enough $n$. For any $c \in [0, \infty]$ and any couple $f, g$ of positive functions defined in a neighborhood of $c$, we say that $f(t) \sim g(t)$ as $t$ goes to $c$, if $\lim_{t \to c} f(t)/g(t) = 1$. We denote by $\mathcal{D}'(\mathbb{C})$ the set of Schwartz-Sobolev distributions endowed with its usual convergence with respect to all infinitely differentiable functions with bounded support $C_0^\infty(\mathbb{C})$. We will consider the differential operators on $\mathbb{C} \simeq \mathbb{R}^2$, for $z = x + iy$ (here $i = \sqrt{-1}$),

$$\partial = \frac{1}{2} (\partial_x - i \partial_y) \quad \text{and} \quad \bar\partial = \frac{1}{2} (\partial_x + i \partial_y).$$

We have $\partial \bar z = \bar\partial z = 0$, $\partial z = \bar\partial \bar z = 1$ and the Laplace differential operator on $\mathbb{C}$ is given by $\Delta = 4 \partial \bar\partial = \partial_x^2 + \partial_y^2$. We sometimes use the shortened notation $A - z$ instead of $A - zI$.

2. Bipartized Resolvent Matrix

The aim of this section is to develop an efficient machinery to analyze the spectral measures of a non-Hermitian matrix which avoids a direct use of the logarithmic potential and the singular values. Our approach builds upon similar methods in the physics literature [22,29,43,44].

2.1. Bipartization of a matrix. Let $n$ be an integer, and $A$ be an $n \times n$ complex matrix. We introduce the symmetrized version of $\nu_{A-z}$,

$$\check\nu_{A-z} = \frac{1}{2n} \sum_{k=1}^n \left( \delta_{\sigma_k(A-z)} + \delta_{-\sigma_k(A-z)} \right).$$

Let $\mathbb{C}_+ = \{z \in \mathbb{C} : \mathrm{Im}(z) > 0\}$ and consider the quaternionic-type set

$$\mathbb{H}_+ = \left\{ U = \begin{pmatrix} \eta & z \\ \bar z & \eta \end{pmatrix},\ \eta \in \mathbb{C}_+,\ z \in \mathbb{C} \right\} \subset M_2(\mathbb{C}).$$

For $z \in \mathbb{C}$, $\eta \in \mathbb{C}_+$ and $1 \le i, j \le n$ integers, we define the elements of $\mathbb{H}_+$ and $M_2(\mathbb{C})$ respectively,

$$U(z, \eta) = \begin{pmatrix} \eta & z \\ \bar z & \eta \end{pmatrix} \quad \text{and} \quad B_{ij} = \begin{pmatrix} 0 & A_{ij} \\ \bar A_{ji} & 0 \end{pmatrix}.$$

We define the matrix in $M_n(M_2(\mathbb{C})) \simeq M_{2n}(\mathbb{C})$, $B = (B_{ij})_{1 \le i,j \le n}$. Since $B_{ji}^* = B_{ij}$, as an element of $M_{2n}(\mathbb{C})$, $B$ is a Hermitian matrix. Graphically, the matrix $A$ can be


identified with an oriented graph on the vertex set $\{1, \ldots, n\}$ with weight on the oriented edge $(i, j)$ equal to $A_{ij}$. Then, the matrix $B$ can be thought of as the bipartization of the matrix $A$, that is a non-oriented graph on the vertex set $\{1, -1, \ldots, n, -n\}$, where for every integer $1 \le i, j \le n$ the weight on the non-oriented edge $\{i, -j\}$ is $A_{ij}$, and there is no edge between $i$ and $j$ or $-i$ and $-j$. For $U \in \mathbb{H}_+$, let $U \otimes I_n \in M_n(M_2(\mathbb{C}))$ be the matrix given by $(U \otimes I_n)_{ij} = \delta_{ij} U$, $1 \le i, j \le n$. The resolvent matrix is defined in $M_n(M_2(\mathbb{C}))$ by

$$R(U) = (B - U \otimes I_n)^{-1},$$

so that for all $1 \le i, j \le n$, $R(U)_{ij} \in M_2(\mathbb{C})$. For $1 \le k \le n$, we write, with $U = U(z, \eta)$,

$$R(U)_{kk} = \begin{pmatrix} a_k(z, \eta) & b_k(z, \eta) \\ b'_k(z, \eta) & c_k(z, \eta) \end{pmatrix}. \tag{2.1}$$

The moduli of the entries of the matrix $R(U)_{kk}$ are bounded by $(\mathrm{Im}(\eta))^{-1}$ (see the forthcoming Lemma 2.2). As an element of $M_{2n}(\mathbb{C})$, $R$ is the usual resolvent of the matrix $B(z) = B - U(z, 0) \otimes I_n$. Indeed, with $U = U(z, \eta)$,

$$R(U) = (B(z) - \eta I_{2n})^{-1}. \tag{2.2}$$

In Theorem 2.1 below, we shall check that the eigenvalues of $B(z)$ are $\pm\sigma_k(A - z)$, $1 \le k \le n$, and consequently

$$\mu_{B(z)} = \check\nu_{A-z}. \tag{2.3}$$

It will follow that the spectral measures $\mu_A$ and $\check\nu_{A-z}$ can be easily recovered from the resolvent matrix. Recall that the Cauchy-Stieltjes transform of a measure ν on $\mathbb{R}$ is defined, for $\eta \in \mathbb{C}_+$, as

$$m_\nu(\eta) = \int_{\mathbb{R}} \frac{1}{x - \eta}\, \nu(dx).$$

The Cauchy-Stieltjes transform characterizes the measure. For a probability measure on $\mathbb{C}$, it is possible to define a Cauchy-Stieltjes-like transform on quaternions, by setting for $U \in \mathbb{H}_+$,

$$M_\mu(U) = \int_{\mathbb{C}} \left( \begin{pmatrix} 0 & \lambda \\ \bar\lambda & 0 \end{pmatrix} - U \right)^{-1} \mu(d\lambda) \in \mathbb{H}_+.$$

This transform characterizes the measure: in $\mathcal{D}'(\mathbb{C})$, $\lim_{t \downarrow 0} \partial\, (M_\mu(U(z, it)))_{12} = -\pi \mu$. If $A$ is normal, i.e. if $A^* A = A A^*$, then it can be checked that $R(U)_{kk} \in \mathbb{H}_+$ and

$$\frac{1}{n} \sum_{k=1}^n R(U)_{kk} = M_{\mu_A}(U).$$

However, if $A$ is not normal, the above formula fails to hold and the next theorem explains how to recover $\mu_A$ from the resolvent anyway.


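Theorem 2.1 (From resolvent to spectral measure). Let $U = U(z, \eta) \in \mathbb{H}_+$, and $a_k, b_k, b'_k, c_k$ be as in (2.1). Then (2.3) holds,

$$m_{\check\nu_{A-z}}(\eta) = \frac{1}{2n} \sum_{k=1}^n \left( a_k(z, \eta) + c_k(z, \eta) \right),$$

and, in $\mathcal{D}'(\mathbb{C})$,

$$\mu_A = -\frac{1}{\pi n} \sum_{k=1}^n \partial b_k(\cdot, 0) = \lim_{t \downarrow 0} -\frac{1}{\pi n} \sum_{k=1}^n \partial b_k(\cdot, it).$$

In particular, if $A$ is a random matrix with exchangeable entries, then by linearity we get $m_{\mathbb{E}\check\nu_{A-z}}(\eta) = \mathbb{E} a_1(z, \eta)$, and, in $\mathcal{D}'(\mathbb{C})$,

$$\mathbb{E}\mu_A = -\frac{1}{\pi} \partial\, \mathbb{E} b_1(\cdot, 0) = \lim_{t \downarrow 0} -\frac{1}{\pi} \partial\, \mathbb{E} b_1(\cdot, it).$$

Proof of Theorem 2.1. Through a permutation of the entries, the matrix $B(z)$ is similar to

$$\begin{pmatrix} 0 & A - z \\ (A - z)^* & 0 \end{pmatrix},$$

whose eigenvalues are easily seen to be $\pm\sigma_k(A - z)$, $1 \le k \le n$. We get

$$\mathrm{tr}\, R = \sum_{k=1}^n (a_k + c_k) = \sum_{k=1}^n (\sigma_k(A - z) - \eta)^{-1} + \sum_{k=1}^n (-\sigma_k(A - z) - \eta)^{-1}.$$

And the first statement and (2.3) follow. Also, from (A.3), in the appendix, for $z \notin \mathrm{supp}(\mu_A)$,

$$U_{\mu_A}(z) = \int \ln|x|\, \mu_{B(z)}(dx) = \frac{1}{2n} \ln|\det B(z)|, \tag{2.4}$$

where $U_\mu$ is the logarithmic potential of a measure μ on $\mathbb{C}$, see (A.1). Recall that the differential of $X \mapsto \det(X)$ at point $X$ (invertible) in the direction $Y$ is $\mathrm{tr}(X^{-1} Y) \det(X)$ (this is sometimes referred to as the Jacobi formula). The sign of $\det B(z)$ is $(-1)^n$. We deduce that in $\mathcal{D}'(\mathbb{C})$,

$$\bar\partial \ln|\det B(z)| = \frac{\bar\partial \det B(z)}{\det B(z)} = \mathrm{tr}\left( B(z)^{-1}\, \bar\partial \left[ \begin{pmatrix} 0 & -z \\ -\bar z & 0 \end{pmatrix} \otimes I_n \right] \right).$$

With $R_{kk} = R(U(z, 0))_{kk} = (B(z)^{-1})_{kk}$, we get from $\bar\partial z = 0$, $\bar\partial \bar z = 1$,

$$\bar\partial \ln|\det B(z)| = \sum_{k=1}^n \mathrm{tr}\left( R_{kk} \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix} \right) = -\sum_{k=1}^n b_k(z, 0).$$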


Now from Eq. (A.2), in $\mathcal{D}'(\mathbb{C})$, using $\Delta = 4 \partial \bar\partial$,

$$2\pi \mu_A = \Delta U_{\mu_A} = \frac{1}{2n} \Delta \ln|\det B(z)| = -\frac{2}{n} \sum_{k=1}^n \partial b_k.$$

To get the limit as $t \downarrow 0$, we note that for real $t > 0$,

$$\int \ln|x - it|\, \mu_{B(z)}(dx) = \frac{1}{2n} \ln|\det(B(z) - it)|.$$

Note that $\det(B(z) - it)$ is real and its sign is $(-1)^n$. As $t \downarrow 0$, the left hand side of the above identity converges in $\mathcal{D}'(\mathbb{C})$ to $U_{\mu_A}$. Taking the Laplacian, and arguing as above, we get

$$\Delta \int \ln|x - it|\, \mu_{B(z)}(dx) = -\frac{2}{n} \sum_{k=1}^n \partial b_k(z, it). \tag{2.5}$$

The conclusion follows.
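The algebraic part of Theorem 2.1 is easy to check numerically on a small matrix. The sketch below is an illustration only, with hypothetical helper names (`bipartize`, `U_matrix`): it stores the $2 \times 2$ block of vertex $k$ in rows and columns $2k, 2k+1$, builds $B(z) = B - U(z,0) \otimes I_n$, and compares its spectrum with $\pm\sigma_k(A - z)$ and the normalized trace of the resolvent with $m_{\check\nu_{A-z}}(\eta)$.

```python
import numpy as np

def bipartize(A):
    """2n x 2n Hermitian matrix with 2x2 blocks B_ij = [[0, A_ij], [conj(A_ji), 0]]."""
    e12 = np.array([[0, 1], [0, 0]], dtype=complex)
    e21 = np.array([[0, 0], [1, 0]], dtype=complex)
    return np.kron(A, e12) + np.kron(A.conj().T, e21)

def U_matrix(z, eta):
    """The 2x2 matrix U(z, eta) = [[eta, z], [conj(z), eta]]."""
    return np.array([[eta, z], [np.conj(z), eta]], dtype=complex)

rng = np.random.default_rng(1)   # fixed seed, illustrative only
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
z, eta = 0.3 - 0.2j, 0.1 + 0.5j

# B(z) = B - U(z, 0) (x) I_n, and its resolvent R(U) = (B(z) - eta)^{-1}.
Bz = bipartize(A) - np.kron(np.eye(n), U_matrix(z, 0.0))
R = np.linalg.inv(Bz - eta * np.eye(2 * n))

# Spectrum of B(z) versus +/- singular values of A - z, cf. (2.3).
sigma = np.linalg.svd(A - z * np.eye(n), compute_uv=False)
eig_Bz = np.sort(np.linalg.eigvalsh(Bz))
expected = np.sort(np.concatenate([sigma, -sigma]))

# (1/2n) * sum_k (a_k + c_k) versus the Cauchy-Stieltjes transform of the symmetrized measure.
trace_side = np.trace(R) / (2 * n)
stieltjes_side = np.mean(np.concatenate([1 / (sigma - eta), 1 / (-sigma - eta)]))

print(np.allclose(eig_Bz, expected), np.isclose(trace_side, stieltjes_side))
```

The same resolvent also satisfies the entrywise bound $(\mathrm{Im}\,\eta)^{-1}$, since $B(z)$ is Hermitian.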

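Note that even if $-\sum_k \partial b_k$ is a measure on $\mathbb{C}$, for each $1 \le k \le n$, $-\partial b_k$ is not in general a measure on $\mathbb{C}$ (lack of positivity; this can be checked on $2 \times 2$ matrices).

2.2. Bipartization of an operator. We shall generalize the above finite dimensional construction. Let $V$ be a countable set and let $\ell^2(V)$ denote the Hilbert space defined by the scalar product

$$\langle \phi, \psi \rangle := \sum_{u \in V} \bar\phi_u \psi_u, \qquad \phi_u = \langle \delta_u, \phi \rangle,$$

where $\delta_u$ is the unit vector supported on $u \in V$. Let $\mathcal{D}(V)$ denote the dense subset of $\ell^2(V)$ of vectors with finite support. Let $(w_{uv})_{u,v \in V}$ be a collection of complex numbers such that for all $u \in V$,

$$\sum_{v \in V} |w_{uv}|^2 + |w_{vu}|^2 < \infty.$$

We may then define a linear operator $A$ on $\mathcal{D}(V)$, by the formula

$$\langle \delta_u, A \delta_v \rangle = w_{uv}. \tag{2.6}$$

Let $\hat V$ be a set in bijection with $V$, the image of $v \in V$ being denoted by $\hat v \in \hat V$. We set $V^b = V \cup \hat V$ and define the symmetric operator $B$ on $\mathcal{D}(V^b)$, by the formulas

$$\langle \delta_u, B \delta_{\hat v} \rangle = \overline{\langle \delta_{\hat v}, B \delta_u \rangle} = w_{uv}, \qquad \langle \delta_u, B \delta_v \rangle = \langle \delta_{\hat u}, B \delta_{\hat v} \rangle = 0.$$

In other words, if $\Pi_u : \ell^2(V^b) \to \mathbb{C}^2$ denotes the orthogonal projection on $(u, \hat u)$,

$$\Pi_u B \Pi_v^* = \begin{pmatrix} 0 & w_{uv} \\ \bar w_{vu} & 0 \end{pmatrix}. \tag{2.7}$$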


For $z \in \mathbb{C}$, we also define on $\mathcal{D}(V^b)$ the symmetric operator $B(z)$: for all $u, v$ in $V$,

$$\langle \delta_u, B(z) \delta_{\hat v} \rangle = \overline{\langle \delta_{\hat v}, B(z) \delta_u \rangle} = w_{uv} - z \mathbf{1}(u = v), \qquad \langle \delta_u, B(z) \delta_v \rangle = \langle \delta_{\hat u}, B(z) \delta_{\hat v} \rangle = 0.$$

Hence, if we identify $V^b$ with $\{1, 2\} \times V$, we have

$$B(z) = B - U(z, 0) \otimes I_V. \tag{2.8}$$

The operator $B(z)$ is symmetric and it has a closure on a domain $D(B) \subset \ell^2(V^b)$. We also denote by $B(z)$ the closure of $B(z)$. If $B$ is self-adjoint then $B(z)$ is also self-adjoint (recall that the sum of a bounded self-adjoint operator and a self-adjoint operator is also a self-adjoint operator). Recall also that the spectrum of a self-adjoint operator is real. For all $U = U(z, \eta) \in \mathbb{H}_+$, the operator $B(z) - \eta I_{V^b} = B - U(z, \eta) \otimes I_V$ is invertible with bounded inverse and the resolvent operator is then well defined by

$$R(U) = (B(z) - \eta I_{V^b})^{-1}.$$

We may then define

$$R(U)_{vv} = \Pi_v R(U) \Pi_v^* = \begin{pmatrix} a_v(z, \eta) & b_v(z, \eta) \\ b'_v(z, \eta) & c_v(z, \eta) \end{pmatrix}.$$

In the sequel, we shall use some properties of resolvent operators.

Lemma 2.2 (Properties of resolvent). Let $B$ be the above bipartized operator. Assume that $B$ is self-adjoint and let $U = U(z, \eta) \in \mathbb{H}_+$, $v \in V$. Then, $a_v, c_v \in \mathbb{C}_+$, for each $z \in \mathbb{C}$ the functions $a_v(z, \cdot), b_v(z, \cdot), b'_v(z, \cdot), c_v(z, \cdot)$ are analytic on $\mathbb{C}_+$, and

$$|a_v| \le (\mathrm{Im}(\eta))^{-1}, \quad |c_v| \le (\mathrm{Im}(\eta))^{-1}, \quad |b_v| \le (2\, \mathrm{Im}(\eta))^{-1} \quad \text{and} \quad |b'_v| \le (2\, \mathrm{Im}(\eta))^{-1}.$$

Moreover, if $\eta \in i\mathbb{R}_+$, then $a_v$ and $c_v$ are pure imaginary and $b'_v = \bar b_v$.

Proof. For a proof of the first statements refer e.g. to Reed and Simon [42]. For the last statement concerning $\eta \in i\mathbb{R}_+$, we define the skeleton of $B(z)$ as the graph on $V^b$ obtained by putting an edge between two vertices $u, v$ in $V^b$ if $\langle \delta_u, B(z) \delta_v \rangle \ne 0$. Then since there is no edge between two vertices of $V$ or $\hat V$, the skeleton of $B(z)$ is a bipartite graph. Assume first that $B(z)$ is bounded: for all $u \in V^b$, $\|B(z) \delta_u\| \le C$. Then for $|\eta| > C$, the series expansion of the resolvent gives

$$R(U) = -\sum_{n=0}^\infty \frac{B(z)^n}{\eta^{n+1}}.$$

However since the skeleton is a bipartite graph, all cycles have an even length. It implies that for $n$ odd, $\langle \delta_u, B(z)^n \delta_u \rangle = 0$. Applied first to $v \in V$, we deduce that for $|\eta| > C$, $a(z, -\bar\eta) = -\bar a(z, \eta)$ and then applied to $\hat v$, we get $c(z, -\bar\eta) = -\bar c(z, \eta)$. We may then extend to $\mathbb{C}_+$ this last identity by analyticity. For $\eta = it \in i\mathbb{R}_+$, we deduce that


$a_v$ and $c_v$ are pure imaginary. Similarly, since the skeleton is a bipartite graph, a path from a vertex $v \in V$ to a vertex $\hat u \in \hat V$ must be of odd length. We get for $|\eta| > C$,

$$\bar b'_v(z, -\bar\eta) = \overline{\langle \delta_{\hat v}, R(U(z, -\bar\eta)) \delta_v \rangle} = -\sum_{n=0}^\infty \frac{\overline{\langle \delta_{\hat v}, B(z)^{2n+1} \delta_v \rangle}}{\eta^{2n+2}} = \langle \delta_v, R(U) \delta_{\hat v} \rangle = b_v(z, \eta),$$

where we have used the symmetry of $B(z)$. It follows that $b'_v(z, -\bar\eta) = \bar b_v(z, \eta)$. If $B(z)$ is not bounded, then $B(z)$ is the limit of a sequence of bounded operators and we conclude by invoking Theorem VIII.25(a) in [42].

2.3. Operator on a tree. We keep the setting of the above paragraph and consider a (non-oriented) tree $T = (V, E)$ on the vertices $V$ with edge set $E$ (recall that a tree is a connected graph without cycles). For ease of notation, we note $u \sim v$ if $\{u, v\} \in E$. We assume that if $\{u, v\} \notin E$, then $w_{uv} = w_{vu} = 0$. In particular $w_{vv} = 0$ for all $v \in V$. We continue to consider the operator $A$ defined by (2.6). In the special case when $w_{uv} = \bar w_{vu}$ for all $u, v$ in $V$, the operator $A$ is symmetric and we first look for sufficient conditions for $A$ to be essentially self-adjoint.

Lemma 2.3 (Criterion of self-adjointness). Let $\kappa > 0$ and $T = (V, E)$ be a tree. Assume that for all $u, v \in V$, $w_{uv} = \bar w_{vu}$ and that if $\{u, v\} \notin E$, then $w_{uv} = w_{vu} = 0$. Assume also that there exists a sequence of connected finite subsets $(S_n)_{n \ge 1}$ in $V$, such that $S_n \subset S_{n+1}$, $\cup_n S_n = V$, and for every $n$ and $v \in S_n$,

$$\sum_{u \notin S_n : u \sim v} |w_{uv}|^2 \le \kappa.$$

Then $A$ is essentially self-adjoint.

For a proof, see [11, Lemma A.3]. The above lemma has an interesting corollary for the bipartized operator $B$ of $A$ defined by (2.7)–(2.8).

Corollary 2.4 (Criterion of self-adjointness of bipartized operator). Let $\kappa > 0$ and $T = (V, E)$ be a tree. Assume that if $\{u, v\} \notin E$ then $w_{uv} = w_{vu} = 0$. Assume also that there exists a sequence of connected finite subsets $(S_n)_{n \ge 1}$ in $V$, such that $S_n \subset S_{n+1}$, $\cup_n S_n = V$, and for every $n$ and $v \in S_n$,

$$\sum_{u \notin S_n : u \sim v} \left( |w_{uv}|^2 + |w_{vu}|^2 \right) \le \kappa.$$

Then for all $z \in \mathbb{C}$, $B(z)$ is self-adjoint.

Proof. From (2.8), it is sufficient to check that $B$ is self-adjoint. Let $\emptyset \in V$ be a distinguished vertex; we define two disjoint trees $G_\emptyset = (V_\emptyset, E_\emptyset)$ and $\hat G_\emptyset = (\hat V_\emptyset, \hat E_\emptyset)$ on a partition $(V_\emptyset, \hat V_\emptyset)$ of $V^b$ as follows. The trees $G_\emptyset$ and $\hat G_\emptyset$ are the unique trees such that $\emptyset \in V_\emptyset$, $\hat\emptyset \in \hat V_\emptyset$ and that satisfy the following properties:

(i) if $\{u, v\} \in E$ and $u$ in $V_\emptyset$ (or $\hat V_\emptyset$) then $\hat v \in V_\emptyset$ (or $\hat V_\emptyset$) and $\{u, \hat v\} \in E_\emptyset$ (or $\hat E_\emptyset$),

(ii) if $\{u, v\} \in E$ and $\hat u$ in $V_\emptyset$ (or $\hat V_\emptyset$) then $v \in V_\emptyset$ (or $\hat V_\emptyset$) and $\{\hat u, v\} \in E_\emptyset$ (or $\hat E_\emptyset$).


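We note that by construction if $u \in V_\emptyset$ and $v \in \hat V_\emptyset$ then $\langle \delta_u, B \delta_v \rangle = 0$. It follows that the operator $B$ decomposes orthogonally into two operators $B_\emptyset$ and $\hat B_\emptyset$ on domains in $\ell^2(V_\emptyset)$ and $\ell^2(\hat V_\emptyset)$ respectively: $B = B_\emptyset \oplus \hat B_\emptyset$. We may then safely apply Lemma 2.3 to $B_\emptyset$ and $\hat B_\emptyset$.

When the operator $B$ is self-adjoint, the resolvent operator has a nice recursive expression due to the tree structure. Let $\emptyset \in V$ be a distinguished vertex of $V$ (in graph language, we root the tree $T$ at $\emptyset$). For each $v \in V \setminus \{\emptyset\}$, we define $V_v \subset V$ as the set of vertices whose unique path to the root $\emptyset$ contains $v$. We define $T_v = (V_v, E_v)$ as the subtree of $T$ spanned by $V_v$. We finally consider $A_v$, the projection of $A$ on $V_v$, and $B_v$ the bipartized operator of $A_v$. The skeleton of $A_v$ is contained in $T_v$. Finally, we note that if $B$ is self-adjoint then so is $B_v(z)$ for every $z \in \mathbb{C}$. The next lemma can be interpreted as a Schur complement formula on trees.

Lemma 2.5 (Resolvent on a tree). Assume that $B$ is self-adjoint and let $U = U(z, \eta) \in \mathbb{H}_+$. Then

$$R(U)_{\emptyset\emptyset} = -\left( U + \sum_{v \sim \emptyset} \begin{pmatrix} 0 & w_{\emptyset v} \\ \bar w_{v\emptyset} & 0 \end{pmatrix} \tilde R(U)_{vv} \begin{pmatrix} 0 & w_{v\emptyset} \\ \bar w_{\emptyset v} & 0 \end{pmatrix} \right)^{-1},$$

where $\tilde R(U)_{vv} = \Pi_v R_{B_v}(U) \Pi_v^*$ and $R_{B_v}(U) = (B_v(z) - \eta)^{-1}$ is the resolvent operator of $B_v$.

Proof. Define the operator $C$ on $\mathcal{D}(V^b)$ by its matrix elements

$$C_\emptyset := \Pi_\emptyset C \Pi_\emptyset^* = -U(z, 0), \qquad C_v := \Pi_\emptyset C \Pi_v^* = (\Pi_v C \Pi_\emptyset^*)^* = \begin{pmatrix} 0 & w_{\emptyset v} \\ \bar w_{v\emptyset} & 0 \end{pmatrix}$$

for all $v \in V$ such that $v \sim \emptyset$, and $\Pi_u C \Pi_v^* = 0$ otherwise. The operator $C$ is symmetric and bounded. Its extension to $\ell^2(V^b)$ is thus self-adjoint (also denoted by $C$). In this way, we have from $V = \{\emptyset\} \sqcup \bigsqcup_{v \sim \emptyset} V_v$,

$$B(z) = C + \tilde B \quad \text{with} \quad \tilde B = \bigoplus_{v \sim \emptyset} B_v(z).$$

We shall write $\tilde R(U) = (\tilde B - \eta I)^{-1}$ for the associated resolvent of $\tilde B$. From the resolvent identity, these operators satisfy

$$\tilde R(U)\, C\, R(U) = \tilde R(U) - R(U). \tag{2.9}$$

Set $\tilde R_{uv} = \Pi_u \tilde R(U) \Pi_v^*$ and $R_{uv} = \Pi_u R(U) \Pi_v^*$. Observe that $\tilde R_{\emptyset\emptyset} = -\eta^{-1} I_2$. Also the direct sum decomposition $V = \{\emptyset\} \sqcup \bigsqcup_{v \sim \emptyset} V_v$ implies $\tilde R_{vv} = \Pi_v R_{B_v}(U) \Pi_v^*$ and $\tilde R_{uv} = 0$ for every $u \ne v$ with $u \sim \emptyset$, $v \sim \emptyset$. Similarly we have that $\tilde R_{\emptyset v} = 0 = \tilde R_{v\emptyset}$ for every $v \in V \setminus \{\emptyset\}$. Using the identity $\sum_{u \in V} \Pi_u^* \Pi_u = I$, we get

$$\Pi_\emptyset \tilde R(U)\, C\, R(U) \Pi_\emptyset^* = \tilde R_{\emptyset\emptyset} C_\emptyset R_{\emptyset\emptyset} + \sum_{v \sim \emptyset} \tilde R_{\emptyset\emptyset} C_v R_{v\emptyset} = \eta^{-1} U(z, 0) R_{\emptyset\emptyset} - \eta^{-1} \sum_{v \sim \emptyset} C_v R_{v\emptyset}.$$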


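We compose the identity (2.9) on the left by $\Pi_v$ and on the right by $\Pi_\emptyset^*$; we obtain, for $v \sim \emptyset$,

$$\tilde R_{vv} C_v^* R_{\emptyset\emptyset} = -R_{v\emptyset}.$$

We finally compose (2.9) on the left by $\Pi_\emptyset$ and on the right by $\Pi_\emptyset^*$,

$$\eta^{-1} U(z, 0) R_{\emptyset\emptyset} + \eta^{-1} \sum_{v \sim \emptyset} C_v \tilde R_{vv} C_v^* R_{\emptyset\emptyset} = -\eta^{-1} I_2 - R_{\emptyset\emptyset},$$

or equivalently

$$\Big( U(z, \eta) + \sum_{v \sim \emptyset} C_v \tilde R_{vv} C_v^* \Big) R_{\emptyset\emptyset} = -I_2.$$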
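On a finite tree the recursion of Lemma 2.5 is an ordinary Schur complement and can be verified directly. The sketch below is an illustration under stated assumptions, not the authors' construction: the four-vertex tree (root 0 with children 1 and 2, vertex 3 a child of 1), the random weights and the helper names are all hypothetical choices.

```python
import numpy as np

def bipartize(W):
    """2m x 2m Hermitian matrix with 2x2 blocks [[0, w_uv], [conj(w_vu), 0]], cf. (2.7)."""
    e12 = np.array([[0, 1], [0, 0]], dtype=complex)
    e21 = np.array([[0, 0], [1, 0]], dtype=complex)
    return np.kron(W, e12) + np.kron(W.conj().T, e21)

def U_matrix(z, eta):
    return np.array([[eta, z], [np.conj(z), eta]], dtype=complex)

def diagonal_block(W, z, eta, v):
    """2x2 block at vertex index v of the resolvent (B - U(z, eta) (x) I)^{-1}."""
    m = W.shape[0]
    R = np.linalg.inv(bipartize(W) - np.kron(np.eye(m), U_matrix(z, eta)))
    return R[2 * v:2 * v + 2, 2 * v:2 * v + 2]

# Hypothetical small tree rooted at 0; weights need not satisfy w_uv = conj(w_vu).
edges = [(0, 1), (0, 2), (1, 3)]
rng = np.random.default_rng(2)   # fixed seed, illustrative only
W = np.zeros((4, 4), dtype=complex)
for u, v in edges:
    W[u, v] = rng.standard_normal() + 1j * rng.standard_normal()
    W[v, u] = rng.standard_normal() + 1j * rng.standard_normal()

z, eta = 0.4 + 0.1j, -0.2 + 0.7j
subtrees = {1: [1, 3], 2: [2]}   # V_v for each child v of the root, with v listed first

# Right-hand side of Lemma 2.5: -(U + sum_v C_v * R~_vv * C_v^*)^{-1}.
S = U_matrix(z, eta).copy()
for v, nodes in subtrees.items():
    W_v = W[np.ix_(nodes, nodes)]                        # projection of A on V_v
    R_vv = diagonal_block(W_v, z, eta, 0)                # block of the subtree resolvent at v
    C_v = np.array([[0, W[0, v]], [np.conj(W[v, 0]), 0]])
    S += C_v @ R_vv @ C_v.conj().T
rhs = -np.linalg.inv(S)

lhs = diagonal_block(W, z, eta, 0)                       # R(U) at the root, computed directly
print(np.allclose(lhs, rhs))
```

On the PWIT the same identity is used in the other direction: the root block is expressed through independent copies of the subtree blocks, which gives a recursive distributional equation.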

2.4. Local operator convergence. In the next paragraphs, we are going to prove that the sequence of random matrices $(A_n)$ converges to a limit random operator on an infinite tree. Let us recall a notion of convergence that we have already used in [11].

Definition 2.6 (Local convergence). Suppose $(A_n)$ is a sequence of bounded operators on $\ell^2(V)$ and $A$ is a linear operator on $\ell^2(V)$ with domain $D(A) \supset \mathcal{D}(V)$. For any $u, v \in V$ we say that $(A_n, u)$ converges locally to $(A, v)$, and write

$$(A_n, u) \to (A, v),$$

if there exists a sequence of bijections $\sigma_n : V \to V$ such that $\sigma_n(v) = u$ and, for all $\phi \in \mathcal{D}(V)$,

$$\sigma_n^{-1} A_n \sigma_n \phi \to A\phi,$$

in $\ell^2(V)$, as $n \to \infty$.

Assume in addition that $A$ is closed and $\mathcal{D}(V)$ is a core for $A$ (i.e. the closure of $A$ restricted to $\mathcal{D}(V)$ equals $A$). Then, the local convergence is the standard strong convergence of operators up to a re-indexing of $V$ which preserves a distinguished element. With a slight abuse of notation we have used the same symbol $\sigma_n$ for the linear isometry $\sigma_n : \ell^2(V) \to \ell^2(V)$ induced in the obvious way. As pointed out in [11], the point for using Definition 2.6 lies in the following theorem on strong resolvent convergence.

Theorem 2.7 (From local convergence to resolvents). Assume that $(A_n)$ and $A$ satisfy the conditions of Definition 2.6 and $(A_n, u) \to (A, v)$ for some $u, v \in V$. Let $B_n$ be the self-adjoint bipartized operator of $A_n$. If the bipartized operator $B$ of $A$ is self-adjoint and $\mathcal{D}(V^b)$ is a core for $B$, then, for all $U \in \mathbb{H}_+$,

$$R_{B_n}(U)_{uu} \to R_B(U)_{vv}, \tag{2.10}$$

where $R_B(U)_{vv} = \Pi_v R_B(U) \Pi_v^*$ and $R_B(U) = (B(z) - \eta)^{-1}$ is the resolvent of $B(z)$.

Proof of Theorem 2.7. It is a special case of Reed and Simon [42, Thm. VIII.25(a)]. Indeed, we first fix $z \in \mathbb{C}$ and extend the bijection $\sigma_n$ to $V^b$ by the formula, for all $w \in V$, $\sigma_n(\hat w) = \widehat{\sigma_n(w)}$. Then we define $\tilde B_n(z) = \sigma_n^{-1} B_n(z) \sigma_n$, so that $\tilde B_n(z) \phi \to B(z) \phi$ for all φ in a common core of the self-adjoint operators $\tilde B_n(z)$, $B(z)$. This implies the strong resolvent convergence, i.e. $(\tilde B_n(z) - \eta I)^{-1} \psi \to (B(z) - \eta I)^{-1} \psi$ for any $\eta \in \mathbb{C}_+$, $\psi \in \ell^2(V)$. We conclude by using the identities $\Pi_v (\tilde B_n(z) - \eta I)^{-1} \delta_v = \Pi_u (B_n(z) - \eta I)^{-1} \delta_u$ and $\Pi_v (\tilde B_n(z) - \eta I)^{-1} \delta_{\hat v} = \Pi_u (B_n(z) - \eta I)^{-1} \delta_{\hat u}$.

Non-Hermitian Heavy Tailed Random Matrices

We shall apply the above theorem in cases where the operators $A_n$ and $A$ are random operators on $\ell^2(V)$, which satisfy with probability one the conditions of Theorem 2.7. In this case we say that $(A_n, u) \to (A, v)$ in distribution if there exists a random bijection $\sigma_n$ as in Definition 2.6 such that $\sigma_n^{-1} A_n \sigma_n \phi$ converges in distribution to $A\phi$, for all $\phi \in D(V)$ (where a random vector $\psi_n \in \ell^2(V)$ converges in distribution to $\psi$ if $\lim_{n\to\infty} \mathbb E f(\psi_n) = \mathbb E f(\psi)$ for all bounded continuous functions $f : \ell^2(V) \to \mathbb R$). Under these assumptions, (2.10) becomes convergence in distribution of (bounded) complex random variables. Note that in order to prove Theorems 1.1, 1.2, we will also need almost-sure convergence statements.

2.5. Poisson Weighted Infinite Tree (PWIT). We now define an operator on an infinite rooted tree with random edge-weights, the Poisson weighted infinite tree (PWIT) introduced by Aldous [1], see also [3]. Let $\rho$ be a positive Radon measure on $\mathbb R$ such that $\rho(\mathbb R) = \infty$. PWIT($\rho$) is the random weighted rooted tree defined as follows. The vertex set of the tree is identified with $\mathbb N^f := \cup_{k \in \mathbb N} \mathbb N^k$ by indexing the root as $\mathbb N^0 = \emptyset$, the offsprings of the root as $\mathbb N$ and, more generally, the offsprings of some $v \in \mathbb N^k$ as $(v1), (v2), \dots \in \mathbb N^{k+1}$ (for short notation, we write $(v1)$ in place of $(v, 1)$). In this way the set of $v \in \mathbb N^n$ identifies the $n$-th generation. We then define $T$ as the tree on $\mathbb N^f$ with (non-oriented) edges between the offsprings and their parents.

We denote by Be(1/2) the Bernoulli probability distribution $\frac12 \delta_0 + \frac12 \delta_1$. Now assign marks to the edges of the tree $T$ according to a collection $\{\Xi_v\}_{v \in \mathbb N^f}$ of independent realizations of the Poisson point process with intensity measure $\rho \otimes \mathrm{Be}(1/2)$ on $\mathbb R \times \{0, 1\}$. Namely, starting from the root $\emptyset$, let $\Xi_\emptyset = \{(y_1, \varepsilon_1), (y_2, \varepsilon_2), \dots\}$ be ordered in such a way that $|y_1| \le |y_2| \le \cdots$, and assign the mark $(y_i, \varepsilon_i)$ to the offspring of the root labeled $i$. Now, recursively, at each vertex $v$ of generation $k$, assign the mark $(y_{vi}, \varepsilon_{vi})$ to the offspring labeled $vi$, where $\Xi_v = \{(y_{v1}, \varepsilon_{v1}), (y_{v2}, \varepsilon_{v2}), \dots\}$ satisfy $|y_{v1}| \le |y_{v2}| \le \cdots$. The Bernoulli mark $\varepsilon_{vi}$ should be understood as an orientation of the edge $\{v, vi\}$: if $\varepsilon_{vi} = 1$, the edge is oriented from $vi$ to $v$, and from $v$ to $vi$ otherwise.

For a probability measure $\theta$ on $S^1$, we introduce the measure on $\mathbb C$, for all Borel $D$:
$$\ell_\theta(D) = \int_0^\infty \int_{S^1} \mathbf 1_{\{\omega^{-\alpha} r \in D\}}\, \theta(d\omega)\, dr. \tag{2.11}$$
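The marking scheme above can be illustrated with a minimal sketch. It assumes a simplified setting that is not the paper's: the marks are positive reals with intensity `rate` times Lebesgue measure on $(0, \infty)$ (so the angular measure $\theta$ is ignored), and the tree is truncated to the first `branching` offsprings of each vertex down to a fixed `depth`. The function names `sample_pwit` and `edge_weights` and all parameters are hypothetical.

```python
import random


def sample_pwit(branching, depth, rate=1.0, seed=None):
    """Sample marks (y, eps) of a truncated PWIT.

    Vertices are tuples of positive integers; the root is the empty tuple.
    For each vertex, the marks y_1 < y_2 < ... are the first points of a
    Poisson process of intensity `rate` on (0, inf), obtained as cumulative
    sums of exponential gaps, and eps is an independent Bernoulli(1/2) mark.
    """
    rng = random.Random(seed)
    marks = {}
    frontier = [()]
    for _ in range(depth):
        next_frontier = []
        for v in frontier:
            y = 0.0
            for k in range(1, branching + 1):
                y += rng.expovariate(rate)  # next Poisson point
                eps = rng.randint(0, 1)     # orientation mark
                child = v + (k,)
                marks[child] = (y, eps)
                next_frontier.append(child)
        frontier = next_frontier
    return marks


def edge_weights(marks, alpha):
    """Turn marks into oriented weights y^(-1/alpha), one direction per edge.

    weights[(v, vk)] = eps * y^(-1/alpha) and
    weights[(vk, v)] = (1 - eps) * y^(-1/alpha).
    """
    weights = {}
    for child, (y, eps) in marks.items():
        parent = child[:-1]
        w = y ** (-1.0 / alpha)
        weights[(parent, child)] = eps * w
        weights[(child, parent)] = (1 - eps) * w
    return weights
```

For instance, `edge_weights(sample_pwit(3, 2, seed=0), alpha=1.5)` returns the nonzero pattern of a finite corner of an oriented weighted tree; exactly one of the two directions of each edge carries a nonzero weight.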

Consider a realization of PWIT($2\ell_\theta$). We now define a random operator $A$ on $D(\mathbb N^f)$ by the formula, for all $v \in \mathbb N^f$ and $k \in \mathbb N$,
$$\langle \delta_v, A \delta_{vk} \rangle = \varepsilon_{vk}\, y_{vk}^{-1/\alpha} \quad\text{and}\quad \langle \delta_{vk}, A \delta_v \rangle = (1 - \varepsilon_{vk})\, y_{vk}^{-1/\alpha}, \tag{2.12}$$
and $\langle \delta_v, A \delta_u \rangle = 0$ otherwise. It is an operator as in §2.3. Indeed, if $u = vk$ is an offspring of $v$, we set
$$w_{vu} = \varepsilon_{vk}\, y_{vk}^{-1/\alpha} \quad\text{and}\quad w_{uv} = (1 - \varepsilon_{vk})\, y_{vk}^{-1/\alpha}, \tag{2.13}$$
otherwise, we set $w_{uv} = 0$. We may thus consider the bipartized operator $B$ of $A$.

Proposition 2.8 (Self-adjointness of bipartized operator on PWIT). Let $A$ be the random operator defined by (2.12). With probability one, for all $z \in \mathbb C$, $B(z)$ is self-adjoint.

We shall use Corollary 2.4. We start with a technical lemma proved in [11, Lem. A.4].

Lemma 2.9. Let $\kappa > 0$, $0 < \alpha < 2$ and let $0 < x_1 < x_2 < \cdots$ be a Poisson process of intensity 1 on $\mathbb R_+$. Define
$$\tau = \inf\Big\{ t \in \mathbb N : \sum_{k=t+1}^\infty x_k^{-2/\alpha} \le \kappa \Big\}.$$
Then $\mathbb E\tau$ is finite and goes to 0 as $\kappa$ goes to infinity.

Proof of Proposition 2.8. For $\kappa > 0$ and $v \in \mathbb N^f$, we define
$$\tau_v = \inf\Big\{ t \ge 0 : \sum_{k=t+1}^\infty |y_{vk}|^{-2/\alpha} \le \kappa \Big\}.$$
The variables $(\tau_v)$ are i.i.d. and by Lemma 2.9, there exists $\kappa > 0$ such that $\mathbb E\tau_v < 1$. We fix such $\kappa$. Now, we put a green color to all vertices $v$ such that $\tau_v \ge 1$ and a red color otherwise. We consider an exploration procedure starting from the root which stops at red vertices and goes on at green vertices. More formally, define the sub-forest $T^g$ of $T$, where we put an edge between $v$ and $vk$ if $v$ is a green vertex and $1 \le k \le \tau_v$. Then, if the root $\emptyset$ is red, we set $S_1 = C^g(T) = \{\emptyset\}$. Otherwise, the root is green, and we consider $T^g_\emptyset = (V^g_\emptyset, E^g_\emptyset)$, the subtree of $T^g$ that contains the root. It is a Galton-Watson tree with offspring distribution $\tau_\emptyset$. Thanks to our choice of $\kappa$, $T^g_\emptyset$ is almost surely finite. Consider $L^g_\emptyset$, the leaves of this tree (i.e. the set of vertices $v$ in $V^g_\emptyset$ such that for all $1 \le k \le \tau_v$, $vk$ is red). We set
$$S_1 = V^g_\emptyset \cup \bigcup_{v \in L^g_\emptyset} \{ vk : 1 \le k \le \tau_v \}.$$
Clearly, the set $S_1$ satisfies the condition of Lemma 2.3. Now, we define the outer boundary of $\{\emptyset\}$ as $\partial_\tau\{\emptyset\} = \{1, \dots, \tau_\emptyset\}$ and for $v = (i_1, \dots, i_k) \in \mathbb N^f \setminus \{\emptyset\}$ we set
$$\partial_\tau\{v\} = \{(i_1, \dots, i_{k-1}, i_k + 1)\} \cup \{(i_1, \dots, i_k, 1), \dots, (i_1, \dots, i_k, \tau_v)\}.$$
For a connected set $S$, its outer boundary is
$$\partial_\tau S = \Big( \bigcup_{v \in S} \partial_\tau\{v\} \Big) \setminus S.$$

Now, for each vertex $u_1, \dots, u_k \in \partial_\tau S_1$, we repeat the above procedure to the rooted subtrees $T_{u_1}, \dots, T_{u_k}$. We set $S_2 = S_1 \cup \bigcup_{1 \le i \le k} C^b(T_{u_i})$. Iteratively, we may thus almost surely define an increasing connected sequence $(S_n)$ of vertices with the properties required for Corollary 2.4.

2.6. Local convergence to PWIT. We may now come back to the random matrix $A_n$ defined by (1.4). We extend it as an operator on $D(\mathbb N^f)$ by setting, for $1 \le i, j \le n$, $\langle \delta_i, A\delta_j \rangle = A_{i,j}$ and otherwise, if either $i$ or $j$ is in $\mathbb N^f \setminus \{1, \dots, n\}$, $\langle \delta_i, A\delta_j \rangle = 0$. The aim of this paragraph is to prove the following theorem.

Theorem 2.10 (Local convergence to PWIT). Assume (H1). Let $A_n$ be as above and $A$ be the operator associated to PWIT($2\ell_\theta$) defined by (2.12). Then in distribution $(A_n, 1) \to (A, \emptyset)$.

Up to small differences, this theorem has already been proved in [11, Sect. 2]. We review here the method of proof and stress the differences. The method relies on the local weak convergence, a notion introduced by Benjamini and Schramm [8], Aldous and Steele [3], see also Aldous and Lyons [2].

We define a network as a graph with weights on its edges taking values in some metric space. Let $G_n$ be the complete network on $\{1, \dots, n\}$ whose weight on edge $\{i, j\}$ equals $\xi^n_{i,j}$, for some collection $(\xi^n_{i,j})_{1 \le i \le j \le n}$ of i.i.d. complex random variables. We set $\xi^n_{j,i} = \xi^n_{i,j}$. We consider the rooted network $(G_n, 1)$ obtained by distinguishing the vertex labeled 1.

We follow Aldous [1, Sect. 3]. For every fixed realization of the marks $(\xi^n_{ij})$, and for any $B, H \in \mathbb N$ such that $(B^{H+1} - 1)/(B - 1) \le n$, we define a finite rooted subnetwork $(G_n, 1)^{B,H}$ of $(G_n, 1)$, whose vertex set coincides with a $B$-ary tree of depth $H$ with root at 1. To this end we partially index the vertices of $(G_n, 1)$ as elements in
$$J_{B,H} = \cup_{\ell=0}^{H} \{1, \dots, B\}^\ell \subset \mathbb N^f,$$
the indexing being given by an injective map $\sigma_n$ from $J_{B,H}$ to $V_n := \{1, \dots, n\}$. We set $I_\emptyset = \{1\}$ and the index of the root 1 is $\sigma_n^{-1}(1) = \emptyset$. The vertex $v \in V_n \setminus I_\emptyset$ is given the index $(k) = \sigma_n^{-1}(v)$, $1 \le k \le B$, if $\xi^n_{1,v}$ has the $k$-th smallest absolute value among $\{\xi^n_{1,j},\ j \ne 1\}$, the marks of edges emanating from the root 1. We break ties by using the lexicographic order. This defines the first generation. Now let $I_1$ be the union of $I_\emptyset$ and the $B$ vertices that have been selected. If $H \ge 2$, we repeat the indexing procedure for the vertex indexed by $(1)$ (the first child) on the set $V_n \setminus I_1$. We obtain a new set $\{11, \dots, 1B\}$ of vertices sorted by their weights as before (for short notation, we concatenate the vector $(1, 1)$ into $11$). Then we define $I_2$ as the union of $I_1$ and this new collection. We repeat the procedure for $(2)$ on $V_n \setminus I_2$ and obtain a new set $\{21, \dots, 2B\}$, and so on. When we have constructed $\{B1, \dots, BB\}$, we have finished the second generation (depth 2) and we have indexed $(B^3 - 1)/(B - 1)$ vertices. The indexing procedure is then repeated until depth $H$ so that $(B^{H+1} - 1)/(B - 1)$ vertices are sorted. Call this set of vertices $V_n^{B,H} = \sigma_n J_{B,H}$. The subnetwork of $G_n$ generated by $V_n^{B,H}$ is denoted $(G_n, 1)^{B,H}$ (it can be identified with the original network $G_n$ where any edge $e$ touching the complement of $V_n^{B,H}$ is given a mark $x_e = \infty$). In $(G_n, 1)^{B,H}$, the set $\{u1, \dots, uB\}$ is called the set of offsprings of the vertex $u$. Note that while the vertex set has been given a tree structure, $(G_n, 1)^{B,H}$ is still a complete network on $V_n^{B,H}$. The next proposition shows that it nevertheless converges to a tree (i.e. extra marks diverge to $\infty$) if the $\xi^n_{i,j}$ satisfy a suitable scaling assumption.

Let $\rho$ be a Radon measure on $\mathbb C$ and let $T$ be a realization of PWIT($\rho$) defined in §2.5. For the moment, we remove the Bernoulli marks $(\varepsilon_v)_{v \in \mathbb N^f}$ and, for $v \in \mathbb N^f$ and $k \in \mathbb N$, we define the weight on edge $\{v, vk\}$ to simply be $y_{vk}$. Then $(T, \emptyset)$ is a rooted network.
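The indexing procedure for $(G_n, 1)^{B,H}$ described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: vertices are labeled $0, \dots, n-1$ (so the root "1" of the text is vertex 0), `weights` is a symmetric matrix given as a list of lists, and ties are broken by the smaller vertex label. The function name `index_vertices` is hypothetical.

```python
def index_vertices(weights, B, H):
    """Build the injective map sigma from the B-ary tree of depth H to vertices.

    Keys of the returned dict are tuples in {1..B}^l, l = 0..H (the root is
    the empty tuple); values are vertex labels in 0..n-1. Each parent, taken
    generation by generation and in index order within a generation, selects
    among the not-yet-indexed vertices the B whose edge weight to the parent
    has smallest absolute value.
    """
    n = len(weights)
    needed = sum(B ** level for level in range(H + 1))
    if needed > n:
        raise ValueError("tree of this size does not fit in the network")
    sigma = {(): 0}
    used = {0}
    frontier = [()]
    for _ in range(H):
        next_frontier = []
        for parent in frontier:
            p = sigma[parent]
            candidates = sorted(
                (v for v in range(n) if v not in used),
                key=lambda v: (abs(weights[p][v]), v),  # ties: smaller label
            )
            for k, v in enumerate(candidates[:B], start=1):
                sigma[parent + (k,)] = v
                used.add(v)
                next_frontier.append(parent + (k,))
        frontier = next_frontier
    return sigma
```

With `weights[i][j] = abs(i - j)` on 7 vertices, `B = 2` and `H = 2`, the root selects vertices 1 and 2, the child `(1,)` then selects 3 and 4 among the remaining vertices, and `(2,)` selects 5 and 6.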
We call $(T, \emptyset)^{B,H}$ the finite random network obtained by the same sorting procedure. Namely, $(T, \emptyset)^{B,H}$ consists of the subtree with vertices in $J_{B,H}$, with the marks inherited from the infinite tree. If an edge is not present in $(T, \emptyset)^{B,H}$, we assign to it the mark $+\infty$.

We say that the sequence of random finite networks $(G_n, 1)^{B,H}$ converges in distribution (as $n \to \infty$) to the random finite network $(T, \emptyset)^{B,H}$ if the joint distributions of the marks converge weakly. To make this precise we have to add the points $\{\pm\infty\}$ as possible values for each mark, and continuous functions on the space of marks have to be understood as functions such that the limit as any one of the marks diverges to $+\infty$ exists and coincides with the limit as the same mark diverges to $-\infty$. We may define $\bar{\mathbb C} = \mathbb C \cup \{\pm\infty\}$. The next proposition generalizes [1, Sect. 3]; for a proof see [11, Prop. 2.6] (the proof there is stated for a measure $\rho$ on $\mathbb R$, the complex case extends verbatim).

Proposition 2.11 (Local weak convergence to a tree). Let $(\xi^n_{i,j})_{1 \le i \le j \le n}$ be a collection of i.i.d. random variables in $\bar{\mathbb C}$ and set $\xi^n_{j,i} = \xi^n_{i,j}$. Let $\rho$ be a Radon measure on $\mathbb C$ with no mass at 0 and assume that
$$n\, \mathbb P(\xi^n_{12} \in \cdot) \underset{n \to \infty}{\rightsquigarrow} \rho. \tag{2.14}$$
Let $G_n$ be the complete network on $\{1, \dots, n\}$ whose mark on edge $\{i, j\}$ equals $\xi^n_{ij}$, and $T$ a realization of PWIT($\rho$). Then, for all integers $B, H$,
$$(G_n, 1)^{B,H} \underset{n \to \infty}{\rightsquigarrow} (T, \emptyset)^{B,H}.$$

Now, we shall extend the above statement to directed networks. More precisely, let $(\xi^n_{i,j})_{1 \le i, j \le n}$ be i.i.d. real random variables. We consider the complete graph $\bar G_n$ on $V_n$ whose weight on edge $\{i, j\}$ equals, if $i \le j$, $(\xi^n_{i,j}, \xi^n_{j,i}) \in \mathbb R^2$. As above, we partially index the vertices of $(\bar G_n, 1)$ as elements in
$$J_{B,H} = \cup_{\ell=0}^{H} \{1, \dots, B\}^\ell \subset \mathbb N^f,$$
the indexing being given by an injective map $\sigma_n$ from $J_{B,H}$ to $V_n$ such that $\sigma_n^{-1}(1) = \emptyset$. The difference with the above construction is that the vertex $v \in V_n \setminus \{1\}$ is given the index $(k) = \sigma_n^{-1}(v)$, $1 \le k \le B$, if $\min(|\xi^n_{1,v}|, |\xi^n_{v,1}|)$ has the $k$-th smallest value among $\{\min(|\xi^n_{1,j}|, |\xi^n_{j,1}|),\ j \ne 1\}$.

Similarly, let $(T, \emptyset)$ be the infinite random rooted network with distribution PWIT($\rho$). This time we do not remove the Bernoulli marks $(\varepsilon_v)_{v \in \mathbb N^f}$ and define the weight on edge $\{v, vk\}$ as $(y_{vk}, \infty)$ if $\varepsilon_{vk} = 1$ and $(\infty, y_{vk})$ if $\varepsilon_{vk} = 0$. Again, we call $(T, \emptyset)^{B,H}$ the finite random network obtained by the sorting procedure: $(T, \emptyset)^{B,H}$ consists of the subtree with vertices in $J_{B,H}$, with the marks inherited from the infinite tree.

We apply Proposition 2.11 to the complete network $G_n$ with mark on edge $\{i, j\}$ equal, if $i \le j$, to $\min(|\xi^n_{i,j}|, |\xi^n_{j,i}|)$. This network satisfies (2.14) with $2\rho$. We remark that if $u, v \in J_{B,H}$ then from (2.14), $\max(|\xi^n_{\sigma_n(u),\sigma_n(v)}|, |\xi^n_{\sigma_n(v),\sigma_n(u)}|)$ diverges weakly to infinity. We also notice that, given $(G_n, 1)^{B,H}$, with equal probability $|\xi^n_{\sigma_n(u),\sigma_n(v)}|$ is larger or less than $|\xi^n_{\sigma_n(v),\sigma_n(u)}|$. We deduce the following.

Corollary 2.12 (Local weak convergence to a tree). Let $\rho$ be a Radon measure on $\mathbb C$ with no mass at 0. Let $(\xi^n_{i,j})_{1 \le i, j \le n}$ be a collection of i.i.d. random variables in $\bar{\mathbb C}$ such that (2.14) holds. Let $\bar G_n$ be the complete network on $\{1, \dots, n\}$ whose mark on edge $\{i, j\}$ equals, if $i \le j$, $(\xi^n_{i,j}, \xi^n_{j,i})$, and $T$ a realization of PWIT($2\rho$). Then, for all integers $B, H$,
$$(\bar G_n, 1)^{B,H} \underset{n \to \infty}{\rightsquigarrow} (T, \emptyset)^{B,H}.$$

We may now prove Theorem 2.10.

Proof of Theorem 2.10. We argue as in the proof of Theorem 2.3(i) in [11, Sect. 2]. We first define the weights $(\xi^n_{i,j})_{i,j \in \mathbb N^f}$ as follows. For integers $1 \le i, j \le n$, we set
$$\xi^n_{i,j} = A_{i,j}^{-\alpha} = a_n^\alpha X_{i,j}^{-\alpha},$$
with the convention that $\xi^n_{i,j} = \infty$ if $X_{i,j} = 0$. For this choice, by assumption (H1), (2.14) holds with $\rho = \ell_\theta$, where $\ell_\theta$ is defined in (2.11). If $i$ or $j$ is in $\mathbb N^f \setminus \{1, \dots, n\}$, we set $\xi^n_{i,j} = \infty$.

Let $\bar G_n$ denote the complete network on $\{1, \dots, n\}$ with marks $(\xi^n_{i,j}, \xi^n_{j,i})$ on edge $\{i, j\}$, if $i \le j$. From Corollary 2.12, for all $B, H$, $(\bar G_n, 1)^{B,H}$ converges weakly to $(T, \emptyset)^{B,H}$, where $T$ has distribution PWIT($2\ell_\theta$). Let $A$ be the random operator associated to $T$. Let $\sigma_n^{B,H}$ be the map $\sigma_n$ associated to the network $(\bar G_n, 1)^{B,H}$. The maps $\sigma_n$ are arbitrarily extended to a bijection $\mathbb N^f \to \mathbb N^f$. From the Skorokhod Representation Theorem we may assume that $(\bar G_n, 1)^{B,H}$ converges a.s. to $(T, \emptyset)^{B,H}$ for all $B, H$. Thus we may find sequences $B_n, H_n$ tending to infinity and a sequence of bijections $\sigma_n := \sigma_n^{B_n, H_n}$ such that $(B_n^{H_n+1} - 1)/(B_n - 1) \le n$ and such that for any pair $u, v \in \mathbb N^f$, $\xi^n_{\sigma_n(u),\sigma_n(v)}$ converges a.s. to
$$\begin{cases} y_{uk} & \text{if for some integer } k,\ v = uk \text{ and } \varepsilon_{uk} = 1, \\ y_{vk} & \text{if for some integer } k,\ u = vk \text{ and } \varepsilon_{vk} = 0, \\ \infty & \text{otherwise.} \end{cases}$$
It follows that a.s.
$$\langle \delta_u, \sigma_n^{-1} A_n \sigma_n \delta_v \rangle = \big( \xi^n_{\sigma_n(u),\sigma_n(v)} \big)^{-1/\alpha} \to \langle \delta_u, A\delta_v \rangle.$$

For any $v$, set $\psi_n^v := \sigma_n^{-1} A_n \sigma_n \delta_v$. To prove Theorem 2.10, it is sufficient to show that for any $v \in \mathbb N^f$, $\psi_n^v \to A\delta_v$ in $\ell^2(\mathbb N^f)$ almost surely as $n$ goes to infinity, i.e.,
$$\sum_u \big( \langle \delta_u, \psi_n^v \rangle - \langle \delta_u, A\delta_v \rangle \big)^2 \to 0.$$
From what precedes, we know that $\langle \delta_u, \psi_n^v \rangle \to \langle \delta_u, A\delta_v \rangle$ for every $u$. The claim follows if we have (almost surely) uniform (in $n$) square-integrability of $(\langle \delta_u, \psi_n^v \rangle)_u$. This in turn follows from Lemma 2.4(i) and Lemma 2.7 in [11].

2.7. Convergence of the resolvent matrix. Let $A_n$ and $A$ be as in Theorem 2.10. From Proposition 2.8, we may almost surely define the resolvent $R$ of the bipartized random operator of $A$. For $U = U(z, \eta) \in \mathbb H_+$, we set
$$R(U)_{\emptyset\emptyset} = \Pi_\emptyset R(U) \Pi_\emptyset^* = \begin{pmatrix} a(z, \eta) & b(z, \eta) \\ b'(z, \eta) & c(z, \eta) \end{pmatrix}. \tag{2.15}$$
We define similarly $R_n(U) = (B_n(z) - \eta)^{-1}$, the resolvent of $B_n$, the bipartized operator of $A_n$. We set $R_n(U)_{11} = \Pi_1 R_n(U) \Pi_1^*$.

Theorem 2.13 (Convergence of the resolvent matrix). Let $A_n$ and $A$ be as in Theorem 2.10. For all $U = U(z, \eta) \in \mathbb H_+$,
$$R_n(U)_{11} \underset{n \to \infty}{\rightsquigarrow} R(U)_{\emptyset\emptyset}.$$

Proof. We apply Proposition 2.8, Theorem 2.10 and the “in distribution” version of Theorem 2.7.

2.8. Proof of Theorem 1.1. Again, we consider the sequence of random $n \times n$ matrices $(A_n)$ defined in the Introduction by (1.4).

Theorem 2.14. For all $z \in \mathbb C_+$, almost surely the measure $\check\nu_{A_n - z}(dx)$ converges weakly to a measure $\check\nu_{\alpha,z}(dx)$ whose Cauchy-Stieltjes transform is given, for $\eta \in \mathbb C_+$, by
$$m_{\check\nu_{\alpha,z}}(\eta) = \mathbb E a(z, \eta),$$
where $a(z, \eta)$ was defined in (2.15).

Proof. For every $z \in \mathbb C$, by Proposition 2.8, the operator $B(z)$ is a.s. self-adjoint. It implies that there exists a.s. a measure on $\mathbb R$, $\nu_{\emptyset,z}$, called the spectral measure with vector $\delta_\emptyset$, such that for all $\eta \in \mathbb C_+$,
$$a(z, \eta) = \langle \delta_\emptyset, R(U)\delta_\emptyset \rangle = \int \frac{\nu_{\emptyset,z}(dx)}{x - \eta} = m_{\nu_{\emptyset,z}}(\eta).$$
We define $R_n$ as the resolvent matrix of $B_n$, the bipartized matrix of $A_n$. For $U = U(z, \eta) \in \mathbb H_+$, we write
$$R_n(U)_{kk} = \begin{pmatrix} a_k & b_k \\ b'_k & c_k \end{pmatrix}.$$
By Theorem 2.1,
$$m_{\mathbb E \check\nu_{A_n - z}}(\eta) = \mathbb E a_1(z, \eta).$$
By Lemma 2.2, for $U \in \mathbb H_+$, the entries of the matrix $R_n(U)_{11}$ are bounded. It follows from Theorem 2.13 that for all $U \in \mathbb H_+$,
$$\lim_{n \to \infty} \mathbb E R_n(U)_{11} = \mathbb E \begin{pmatrix} a & b \\ b' & c \end{pmatrix},$$
where the limit matrix was defined in (2.15). Hence, for all $z \in \mathbb C_+$,
$$\lim_{n \to \infty} m_{\mathbb E \check\nu_{A_n - z}}(\eta) = \mathbb E a(z, \eta).$$
We deduce that $\mathbb E \check\nu_{A_n - z}$ converges to the measure $\check\nu_{\alpha,z} = \mathbb E \nu_{\emptyset,z}$. This convergence can be improved to almost sure by showing that the random measure $\check\nu_{A_n - z}$ concentrates around its mean. This is done by applying the Borel-Cantelli Lemma and Lemma C.2 to the matrix $B_n(z)$ whose spectral measure equals $\check\nu_{A_n - z}$, see (2.3).

Theorem 1.1 is a corollary of the above theorem up to the fact that $\mathbb E a(z, \eta)$ does not depend on the measure $\theta$ which appears in (H1). The latter will be a consequence of the forthcoming Theorem 4.1.

3. Convergence of the Spectral Measure

3.1. Tightness. In this paragraph, we prove that the counting probability measures of the eigenvalues and singular values of the random matrices $(A_n)$ defined by (1.4) are a.s. tight. For ease of notation, we will often write $A$ in place of $A_n$.

Lemma 3.1 (Tightness). If (H1) holds, there exists $r > 0$ such that for all $z \in \mathbb C$, a.s.
$$\varlimsup_{n \to \infty} \int_0^\infty t^r\, \nu_{A - zI}(dt) < \infty, \quad\text{and thus } (\nu_{A - zI})_{n \ge 1} \text{ is tight.}$$
Moreover, a.s.
$$\varlimsup_{n \to \infty} \int_{\mathbb C} |z|^r\, \mu_A(dz) < \infty, \quad\text{and thus } (\mu_A)_{n \ge 1} \text{ is tight.}$$

Proof. In both cases, the a.s. tightness follows from the moment bound and the Markov inequality. The moment bound on $\mu_A$ follows from the statement on $\nu_A$ (take $z = 0$) by using the Weyl inequality (B.6). It is therefore enough to establish the moment bound on $\nu_{A - zI}$ for every $z \in \mathbb C$. Let us fix $z \in \mathbb C$ and $r > 0$. By definition of $\nu_{A - zI}$ we have
$$\int_0^\infty t^r\, \nu_{A - zI}(dt) = \frac1n \sum_{k=1}^n s_k(A - zI)^r.$$
From (B.2) we have $s_k(A - zI) \le s_k(A) + |z|$ for every $1 \le k \le n$, and one can then safely assume that $z = 0$ for the proof. By using (B.7) we get, for any $0 \le r \le 2$,
$$\int_0^\infty t^r\, \nu_A(dt) \le Z_n := \frac1n \sum_{i=1}^n Y_{n,i} \quad\text{where}\quad Y_{n,i} := \Big( \sum_{j=1}^n a_n^{-2} |X_{ij}|^2 \Big)^{r/2}.$$
We need to show that $(Z_n)_{n \ge 1}$ is a.s. bounded. Assume for the moment that
$$\sup_{n \ge 1} \mathbb E(Y_{n,1}^4) < \infty \tag{3.1}$$
for some choice of $r$. Since $Y_{n,1}, \dots, Y_{n,n}$ are i.i.d. for every $n \ge 1$, we get from (3.1) that
$$\mathbb E\big( (Z_n - \mathbb E Z_n)^4 \big) = n^{-4}\, \mathbb E\Big( \sum_{1 \le i, j \le n} (Y_{n,i} - \mathbb E Y_{n,i})^2 (Y_{n,j} - \mathbb E Y_{n,j})^2 \Big) = O(n^{-2}).$$
Therefore, by the monotone convergence theorem, we get $\mathbb E\big( \sum_{n \ge 1} (Z_n - \mathbb E Z_n)^4 \big) < \infty$, which gives $\sum_{n \ge 1} (Z_n - \mathbb E Z_n)^4 < \infty$ a.s. and thus $Z_n - \mathbb E Z_n \to 0$ a.s. Now the sequence $(\mathbb E Z_n)_{n \ge 1} = (\mathbb E Y_{n,1})_{n \ge 1}$ is bounded by (3.1) and it follows that $(Z_n)_{n \ge 1}$ is a.s. bounded.

It remains to show that (3.1) holds, say if $0 < 4r < \alpha$. To this end, let us define
$$S_{n,a,b} := \sum_{j=1}^n a_n^{-2} |X_{1j}|^2\, \mathbf 1_{\{a_n^{-2} |X_{1j}|^2 \in [a, b)\}} \quad\text{for every } a < b.$$
Now $Y_{n,1}^4 = (S_{n,0,\infty})^{2r} = (S_{n,0,1} + S_{n,1,\infty})^{2r}$ and thus,
$$\mathbb E(Y_{n,1}^4) \le 2^{2r-1} \big( \mathbb E(S_{n,0,1}^{2r}) + \mathbb E(S_{n,1,\infty}^{2r}) \big). \tag{3.2}$$

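We have $\sup_{n \ge 1} \mathbb E(S_{n,0,1}^{2r}) < \infty$. Indeed, since $2r < 1$, by the Jensen inequality,
$$\mathbb E(S_{n,0,1}^{2r}) \le (\mathbb E S_{n,0,1})^{2r},$$
and by Lemma C.1, $\mathbb E S_{n,0,1} \sim_n \alpha/(2 - \alpha)$. For the second term of the right hand side of (3.2), we set
$$M_n := \max_{1 \le j \le n} a_n^{-1} |X_{1j}|\, \mathbf 1_{\{a_n^{-1} |X_{1j}| > 1\}} \quad\text{and}\quad N_n := \#\{1 \le j \le n \text{ s.t. } a_n^{-1} |X_{1j}| > 1\}.$$
From the Hölder inequality, if $1/p + 1/q = 1$, we have
$$\mathbb E(S_{n,1,\infty}^{2r}) \le \mathbb E\big[ N_n^{2r} M_n^{4r} \big] \le \big( \mathbb E N_n^{2rp} \big)^{1/p} \big( \mathbb E M_n^{4rq} \big)^{1/q}. \tag{3.3}$$
Recall that $\mathbb P(|X_{12}| > a_n) = (1 + o(1))/n \le 2/n$ for $n \gg 1$. By the union bound, for $n \gg 1$,
$$\mathbb P(N_n \ge k) \le \binom{n}{k} \mathbb P(|X_{12}| > a_n)^k \le \frac{n^k}{k!} \frac{2^k}{n^k} = \frac{2^k}{k!}.$$
In particular, we have $\sup_{n \ge 1} \mathbb E N_n^\eta < \infty$ for any $\eta > 0$. Similarly, since the function $L$ is slowly varying, for $n \gg 1$ and all $t \ge 1$, we have
$$\mathbb P(M_n \ge t) \le n\, \mathbb P(|X_{12}| > t a_n) = n a_n^{-\alpha} t^{-\alpha} L(a_n t) \le 2 t^{-\alpha}.$$
It follows that if $\gamma < \alpha$, $\sup_{n \ge 1} \mathbb E M_n^\gamma < \infty$. Taking $p$ and $q$ so that $4rq < \alpha$, we thus conclude from (3.3) that $\sup_{n \ge 1} \mathbb E(S_{n,1,\infty}^{2r}) < \infty$.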
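The chain of inequalities bounding $\mathbb P(N_n \ge k)$ by $2^k/k!$ can be checked numerically. The sketch below is only an illustration: it assumes the worst case $\mathbb P(|X_{12}| > a_n) = 2/n$ exactly, and the function names `union_bound_terms` and `moment_bound` are hypothetical. The second function uses the elementary inequality $\mathbb E N^\eta \le \sum_{k \ge 1} k^\eta\, \mathbb P(N \ge k)$ to produce a finite bound that does not depend on $n$.

```python
from math import comb, factorial


def union_bound_terms(n, k):
    """Return (binom(n, k) * (2/n)^k, 2^k / k!) for the union-bound chain."""
    p = 2.0 / n                      # assumed worst-case tail probability
    first = comb(n, k) * p ** k      # union bound over k-subsets of columns
    last = 2 ** k / factorial(k)     # n-free upper bound
    return first, last


def moment_bound(eta, kmax=100):
    """Upper bound on E[N^eta] from P(N >= k) <= 2^k / k!, uniform in n."""
    return sum(k ** eta * 2 ** k / factorial(k) for k in range(1, kmax + 1))
```

Since $\binom{n}{k} \le n^k/k!$, the first returned value never exceeds the second, and the series in `moment_bound` converges for every `eta` because of the factorial decay.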

3.2. Invertibility. In this paragraph, we find a lower bound for the smallest singular value of the random matrix $A - zI$, where $A$ is defined by (1.4), in other words an upper bound on the operator norm of the resolvent of $A$. Such lower bounds on the smallest singular value of random matrices were developed in recent years by using Littlewood-Offord type problems, as in [48,49] and [45]. The available results require moment assumptions which are not satisfied when the entries have heavy tails. Here we circumvent the problem by requiring the bounded density hypothesis (H3). The removal of this hypothesis can be done by adapting the Hermitization lemma together with the Rudelson and Vershynin approach already used by Götze and Tikhomirov [28].

Lemma 3.2 (Invertibility). If (H3) holds then for some $r > 0$, every $z \in \mathbb C$, a.s.
$$\lim_{n \to \infty} n^r s_n(A - zI) = +\infty.$$

Proof. For every $x, y \in \mathbb C^n$ and $S \subset \mathbb C^n$, we set $x \cdot y := x_1 y_1 + \cdots + x_n y_n$ and $\|x\|_2 := \sqrt{x \cdot x}$ and $\mathrm{dist}(x, S) := \min_{y \in S} \|x - y\|_2$. Let $R_1, \dots, R_n$ be the rows of $A - zI$ and set $R_{-i} := \mathrm{span}\{R_j;\ j \ne i\}$ for every $1 \le i \le n$. From Lemma B.2 we have
$$\min_{1 \le i \le n} \mathrm{dist}(R_i, R_{-i}) \le \sqrt n\, s_n(A - zI),$$
and consequently, by the union bound, for any $u \ge 0$,
$$\mathbb P\big( \sqrt n\, s_n(A - zI) \le u \big) \le n \max_{1 \le i \le n} \mathbb P\big( \mathrm{dist}(R_i, R_{-i}) \le u \big).$$
Let us fix $1 \le i \le n$. Let $Y_i$ be a unit vector orthogonal to $R_{-i}$. Such a vector is not unique. We just pick one. This defines a random variable on the unit sphere $S^{n-1} = \{x \in \mathbb C^n : \|x\|_2 = 1\}$. By the Cauchy-Schwarz inequality,
$$|R_i \cdot Y_i| \le \|\pi_i(R_i)\|_2 \|Y_i\|_2 = \mathrm{dist}(R_i, R_{-i}),$$
where $\pi_i(\cdot)$ is the orthogonal projection on the orthogonal complement of $R_{-i}$. Let $\nu_i$ be the distribution of $Y_i$ on $S^{n-1}$. Since $Y_i$ and $R_i$ are independent, for any $u \ge 0$,
$$\mathbb P\big( \mathrm{dist}(R_i, R_{-i}) \le u \big) \le \mathbb P(|R_i \cdot Y_i| \le u) = \int_{S^{n-1}} \mathbb P(|R_i \cdot y| \le u)\, d\nu_i(y).$$
Let us first consider the case where $X_{11}$ has a bounded density $\varphi$ on $\mathbb C$. Since $\|y\|_2 = 1$ there exists an index $j_0 \in \{1, \dots, n\}$ such that $y_{j_0} \ne 0$ with $|y_{j_0}|^{-1} \le \sqrt n$. The complex random variable $R_i \cdot y$ is a sum of independent complex random variables and one of them is $a_n^{-1} X_{i j_0} y_{j_0}$, which is absolutely continuous with a density bounded above by $a_n \sqrt n\, \|\varphi\|_\infty$. Consequently, by a basic property of convolutions of probability measures, the complex random variable $R_i \cdot y$ is also absolutely continuous with a density $\varphi_i$ bounded above by $a_n \sqrt n\, \|\varphi\|_\infty$, and thus
$$\mathbb P(|R_i \cdot y| \le u) = \int_{\mathbb C} \mathbf 1_{\{|s| \le u\}} \varphi_i(s)\, ds \le \pi u^2 a_n \sqrt n\, \|\varphi\|_\infty.$$
Therefore, for every $b > 0$,
$$\mathbb P\big( s_n(A - zI) \le n^{-b-1/2} \big) = O(n^{3/2 - 2b} a_n),$$
where the $O$ does not depend on $z$. By taking $b$ large enough, the first Borel-Cantelli lemma implies that there exists $r > 0$ such that a.s. for every $z \in \mathbb C$ and $n \gg 1$, $s_n(A - zI) \ge n^{-r}$.

It remains to consider the case where $X_{11}$ has a bounded density $\varphi$ on $\mathbb R$. As for the complex case, let us fix $y \in S^{n-1}$. Since $\|y\|_2 = 1$ there exists an index $j_0 \in \{1, \dots, n\}$ such that $|y_{j_0}|^{-1} \le \sqrt n$. Also, either $|\mathrm{Re}(y_{j_0})|^{-1} \le \sqrt{2n}$ or $|\mathrm{Im}(y_{j_0})|^{-1} \le \sqrt{2n}$. Assume for instance that $|\mathrm{Re}(y_{j_0})|^{-1} \le \sqrt{2n}$. We observe that for every $u \ge 0$,
$$\mathbb P(|R_i \cdot y| \le u) \le \mathbb P(|\mathrm{Re}(R_i \cdot y)| \le u).$$

The real random variable $\mathrm{Re}(R_i \cdot y)$ is a sum of independent real random variables and one of them is $a_n^{-1} X_{i j_0} \mathrm{Re}(y_{j_0})$, which is absolutely continuous with a density bounded above by $a_n \sqrt{2n}\, \|\varphi\|_\infty$. Consequently, by a basic property of convolutions of probability measures, the real random variable $\mathrm{Re}(R_i \cdot y)$ is also absolutely continuous with a density $\varphi_i$ bounded above by $a_n \sqrt{2n}\, \|\varphi\|_\infty$. Therefore, we have for every $u \ge 0$,
$$\mathbb P(|\mathrm{Re}(R_i \cdot y)| \le u) = \int_{[-u, u]} \varphi_i(s)\, ds \le 2^{3/2} a_n \sqrt n\, u\, \|\varphi\|_\infty.$$
We skip the rest of the proof, which is identical to the complex case.

3.3. Distance from a row to a vector space. In this paragraph, we give two lower bounds on the distance of a row of the random matrix $A - z$ defined by (1.4) to a vector space of not too large dimension. The first ingredient is an adaptation of Proposition 5.1 in Tao and Vu [50].

Proposition 3.3 (Distance of a row to a subspace). Assume that (H1) holds. Let $0 < \gamma < 1/2$, and let $R$ be a row of $a_n(A - z)$. There exists $\delta > 0$ depending on $\alpha, \gamma$ such that for all $d$-dimensional subspaces $W$ of $\mathbb C^n$ with $n - d \ge n^{1-\gamma}$, one has
$$\mathbb P\big( \mathrm{dist}(R, W) \le n^{(1-2\gamma)/\alpha} \big) \le e^{-n^\delta}.$$

The proof of Proposition 3.3 is based on a concentration estimate for the truncated variables $X_{1i} \mathbf 1_{\{|X_{1i}| \le b_n\}}$ for suitable sequences $b_n$. We first recall a concentration inequality of Talagrand.

Theorem 3.4 (Talagrand concentration inequality [46] and [34, Cor. 4.10]). Let us denote by $D := \{z \in \mathbb C;\ |z| \le 1\}$ the complex unit disc and let $P$ be a product probability measure on the product space $D^n$. Let $F : D^n \to \mathbb R$ be a Lipschitz convex function on $D^n$ with $\|F\|_{\mathrm{Lip}} \le 1$. If $M(F)$ is a median of $F$ under $P$ then for every $r \ge 0$,
$$P\big( |F - M(F)| \ge r \big) \le 4 e^{-r^2/4}.$$

Proof of Proposition 3.3. We first perform some pre-processing of the vector $R$ as in Tao-Vu [50]. To fix ideas, we may assume that $R$ is the first row of $a_n(A - z)$. Then $R = X_1 - z a_n e_1$, where $X_1$ is the first row of $X = a_n A$. We then have
$$\mathrm{dist}(R, W) \ge \mathrm{dist}(X_1 - z a_n e_1, \mathrm{span}(W, e_1)) = \mathrm{dist}(X_1, W_1),$$
where we have set $W_1 = \mathrm{span}(W, e_1)$. Note that $d \le \dim W_1 \le d + 1$. For any sequence $b_n$, from the Markov inequality,
$$\mathbb P\Big( \sum_{i=1}^n \mathbf 1_{\{|X_{1i}| > b_n\}} \ge \sqrt n \Big) \le e^{-\sqrt n} \Big( \mathbb E e^{\mathbf 1_{\{|X_{11}| > b_n\}}} \Big)^n \le e^{-\sqrt n} \big( 1 + e L(b_n) b_n^{-\alpha} \big)^n \le e^{-\sqrt n + e n L(b_n) b_n^{-\alpha}}. \tag{3.4}$$
Choose $b_n = a_n n^{-2\gamma/\alpha}$. Clearly, $b_n / n^{(1-2\gamma)/\alpha} \in [n^{-\varepsilon}, n^\varepsilon]$ eventually for all $\varepsilon > 0$.
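The exponential Markov bound (3.4) can be compared with an exact binomial tail. The sketch below is an illustration under a simplifying assumption that is not in the text: the tail probability $p = \mathbb P(|X_{11}| > b_n)$ is treated as a free parameter instead of being computed from $L$ and $b_n$. The function names are hypothetical.

```python
from math import ceil, comb, e, exp, sqrt


def exceedance_tail(n, p):
    """Exact P(Bin(n, p) >= sqrt(n)): at least sqrt(n) entries exceed b_n."""
    m = ceil(sqrt(n))
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(m, n + 1))


def markov_bounds(n, p):
    """The three successive upper bounds of the chain in (3.4).

    The first uses E exp(1{|X| > b}) = 1 + (e - 1) p, the second relaxes
    (e - 1) p to e p, the third uses 1 + x <= exp(x).
    """
    prefactor = exp(-sqrt(n))
    b1 = prefactor * (1 + (e - 1) * p) ** n
    b2 = prefactor * (1 + e * p) ** n
    b3 = exp(-sqrt(n) + e * n * p)
    return b1, b2, b3
```

For example, with `n = 400` and `p = 0.01` the exact tail is far below all three bounds, and the bounds are increasing from left to right.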

Let $J$ denote the set of indexes $i$ such that $|X_{1i}| \le b_n$. From (3.4) we see that, for some $\delta > 0$:
$$\mathbb P(|J| < n - \sqrt n) \le e^{-n^\delta}.$$
It follows that it is sufficient to prove the statement conditioned on the event $\{|J| \ge n - \sqrt n\}$. In particular, we shall prove that for any fixed $I \subset \{1, \dots, n\}$ such that $|I| \ge n - \sqrt n$,
$$\mathbb P\big( \mathrm{dist}(X_1, W_1) \le n^{(1-2\gamma)/\alpha} \mid J = I \big) \le e^{-n^\delta}. \tag{3.5}$$
Without loss of generality, we assume that $I = \{1, \dots, n'\}$ with $n' \ge n - \sqrt n$. Let $\pi_I$ be the orthogonal projection on $\mathrm{span}(e_i : i \in I)$. If $W_2 = \pi_I(W_1)$, we find $d - \sqrt n \le \dim(W_2) \le \dim(W_1) \le d + 1$ and
$$\mathrm{dist}(X_1, W_1) \ge \mathrm{dist}(\pi_I(X_1), W_2).$$
Note that $\pi_I(X_1)$ is simply the vector $X_{1i}$, $i = 1, \dots, n'$. We set
$$W' = \mathrm{span}\big( W_2, \mathbb E[\pi_I(X_1) \mid J = I] \big), \qquad Y = \pi_I(X_1) - \mathbb E[\pi_I(X_1) \mid J = I],$$
so that $d - \sqrt n \le \dim(W') \le d + 2$ and
$$\mathrm{dist}(\pi_I(X_1), W_2) \ge \mathrm{dist}(Y, W').$$
Let $P$ denote the orthogonal projection matrix to the orthogonal complement of $W'$ in $\mathbb C^{n'}$. We have $\mathrm{dist}^2(Y, W') = \sum_{i,j} Y_i P_{ij} \bar Y_j$, and, since $Y = (Y_i)_{1 \le i \le n'}$ is a mean zero vector under $\mathbb P(\cdot \mid J = I)$,
$$\mathbb E[\mathrm{dist}^2(Y, W') \mid J = I] = \mathbb E\Big[ \sum_{i,j} Y_i P_{ij} \bar Y_j \mid J = I \Big] = \sum_{i=1}^{n'} P_{ii}\, \mathbb E[|Y_i|^2 \mid J = I] = \mathbb E[|Y_1|^2 \mid J = I]\, \mathrm{tr} P.$$
We have for any $\varepsilon > 0$ and for $n \gg 1$:
$$\mathbb E[|Y_1|^2 \mid J = I] = \mathbb E[|X_{11}|^2 \mid J = I] - \big| \mathbb E[X_{11} \mid J = I] \big|^2 \ge b_n^{2-\alpha} n^{-\varepsilon},$$
where the last bound follows from Lemma C.1, since by independence one has $\mathbb E[|X_{11}|^2 \mid J = I] = \mathbb E[|X_{11}|^2 \mid |X_{11}| \le b_n]$, and $|\mathbb E[X_{11} \mid J = I]|^2 = |\mathbb E[X_{11} \mid |X_{11}| \le b_n]|^2$ is $O(1)$ if $\alpha > 1$, while (by Lemma C.1) it is $O(b_n^{2-2\alpha+\varepsilon})$ for any $\varepsilon > 0$, if $\alpha \in (0, 1]$. Using $\mathrm{tr} P = n' - \dim(W') \ge \frac12 (n - d)$, it follows that, for any $\varepsilon > 0$, for $n \gg 1$:
$$\mathbb E[\mathrm{dist}^2(Y, W') \mid J = I] \ge c L(b_n) b_n^{2-\alpha} (n - d) \ge n^{q(\varepsilon)}, \tag{3.6}$$
where $q := (1 - 2\gamma) \frac{2}{\alpha} + \gamma - \varepsilon$.

Under $\mathbb P(\cdot \mid J = I)$, the vector $(Y_1/b_n, \dots, Y_{n'}/b_n)$ is a vector of independent variables on $D^{n'}$, where $D$ is the unit complex ball. We consider the function $F : x \mapsto \mathrm{dist}(x, W')$. The mapping $F$ is 1-Lipschitz and convex. From Theorem 3.4, we deduce that
$$\mathbb P\big( |\mathrm{dist}(Y, W') - M(\mathrm{dist}(Y, W'))| \ge r \mid J = I \big) \le 4 e^{-\frac{r^2}{8 b_n^2}}, \tag{3.7}$$
where $M(\mathrm{dist}(Y, W'))$ is a median of $\mathrm{dist}(Y, W')$ under $\mathbb P(\cdot \mid J = I)$. It follows that, for e.g. $\delta = \gamma/2$, taking $\varepsilon = \gamma/4$ in (3.6), we obtain $q(\varepsilon) = (1 - 2\gamma) \frac{2}{\alpha} + \delta + \varepsilon$, and therefore there exists $c > 0$ such that for $n \gg 1$,
$$b_n^{-2}\, \mathbb E[\mathrm{dist}^2(Y, W') \mid J = I] \ge c\, \frac{n^{q(\varepsilon)}}{b_n^2} \ge c\, n^\delta. \tag{3.8}$$

From (3.7) it follows that
$$\mathbb E\Big[ \big( M(\mathrm{dist}(Y, W')) - \mathrm{dist}(Y, W') \big)^2 \mid J = I \Big] = O(b_n^2).$$
From the Cauchy-Schwarz inequality we then have
$$\Big( M(\mathrm{dist}(Y, W')) - \sqrt{\mathbb E[\mathrm{dist}^2(Y, W') \mid J = I]} \Big)^2 \le \mathbb E\Big[ \big( M(\mathrm{dist}(Y, W')) - \mathrm{dist}(Y, W') \big)^2 \mid J = I \Big] = O(b_n^2).$$
The above estimates, with (3.6) and (3.8), imply that $M(\mathrm{dist}(Y, W')) \ge \frac12 n^{q(\varepsilon)/2}$ for $n \gg 1$. Therefore, for $n \gg 1$,
$$\mathbb P\big( \mathrm{dist}(Y, W') \le n^{(1-2\gamma)/\alpha} \mid J = I \big) \le \mathbb P\Big( |M(\mathrm{dist}(Y, W')) - \mathrm{dist}(Y, W')| \ge \tfrac14 n^{q(\varepsilon)/2} \mid J = I \Big).$$
The desired conclusion (3.5) now follows from (3.7) and (3.8).

So far we have shown that under Assumption (H1), the distance of a row to a space with codimension $n - d \ge n^{1-\gamma}$ is at least $n^{(1-2\gamma)/\alpha}$ with large probability. We want a sharper estimate, namely at the order $n^{1/\alpha}$. We will obtain such a bound in a weak sense in the forthcoming Proposition 3.7. Furthermore, we shall require Assumption (H2) to do so.

We start with some preliminary facts. Below we write $Z = Z^{(\beta)}$, $\beta \in (0, 1)$, for the one-sided $\beta$-stable distribution such that for all $s \ge 0$,
$$\mathbb E \exp(-sZ) = \exp(-s^\beta).$$
From the standard inversion formula, for $m > 0$,
$$y^{-m} = \Gamma(m)^{-1} \int_0^\infty x^{m-1} e^{-xy}\, dx,$$
we see that all moments
$$\mathbb E[Z^{-m}] = \Gamma(m)^{-1} \int_0^\infty x^{m-1} e^{-x^\beta}\, dx \tag{3.9}$$
are finite for $m > 0$. Also, recall that if $(Z_i)_{1 \le i \le n}$ is an i.i.d. vector with distribution $Z$ then, for every $(w_i)_{1 \le i \le n} \in \mathbb R_+^n$, in distribution
$$\sum_{i=1}^n w_i Z_i \overset{d}{=} \Big( \sum_{i=1}^n w_i^\beta \Big)^{1/\beta} Z_1. \tag{3.10}$$
Indeed, (3.10) follows from $\mathbb E \exp(-s \sum w_i Z_i) = \exp(-s^\beta \sum w_i^\beta)$ and a change of variables.
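The integral in (3.9) has a closed form: the substitution $u = x^\beta$ gives $\int_0^\infty x^{m-1} e^{-x^\beta} dx = \Gamma(m/\beta)/\beta$, hence $\mathbb E[Z^{-m}] = \Gamma(m/\beta) / (\beta\, \Gamma(m))$. This closed form is not stated in the text; the sketch below checks it against a direct numerical quadrature, with hypothetical function names and with the restriction $m \ge 1$ assumed so that the integrand is bounded at 0.

```python
from math import exp, gamma


def negative_moment_closed_form(m, beta):
    """E[Z^{-m}] = Gamma(m / beta) / (beta * Gamma(m)), via u = x^beta in (3.9)."""
    return gamma(m / beta) / (beta * gamma(m))


def negative_moment_numeric(m, beta, upper=4000.0, steps=400000):
    """Composite Simpson approximation of (3.9) on [0, upper]; assumes m >= 1."""
    def integrand(x):
        if x == 0.0:
            return 1.0 if m == 1 else 0.0
        return x ** (m - 1) * exp(-(x ** beta))

    h = upper / steps                      # steps must be even for Simpson
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0 / gamma(m)
```

For $\beta = 1/2$ and $m = 2$ the closed form gives $\Gamma(4)/(\tfrac12 \Gamma(2)) = 12$, and the quadrature agrees up to discretization and truncation error.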

Lemma 3.5. Assume (H2). There exists $\varepsilon > 0$ and $p \in (0, 1)$ such that the random variable $|X_{11}|^2$ dominates stochastically the random variable $\varepsilon D Z$, where $\mathbb P(D = 1) = 1 - \mathbb P(D = 0) = p$ is a random variable with law Be($p$), $Z = Z^{(\beta)}$ with $\beta = \frac{\alpha}{2}$, and $D$ and $Z$ are independent.

Proof. From our assumptions, there exist $\delta > 0$ and $x_0 > 0$ such that
$$\mathbb P(|X_{11}|^2 > x) \ge \delta x^{-\beta} \ge \mathbb P(\delta^2 Z > x), \quad x > x_0.$$
Let $p$ be the probability that $|X_{11}|^2 > x_0$. If $x > x_0$ then
$$\mathbb P(|X_{11}|^2 > x) \ge p\, \mathbb P(\delta^2 Z > x) = \mathbb P(\delta^2 D Z > x).$$
On the other hand, if $x \le x_0$ then
$$\mathbb P(|X_{11}|^2 > x) \ge p \ge \mathbb P(\delta^2 D Z > x).$$
In any case, setting $\varepsilon = \delta^2$ we have
$$\mathbb P(|X_{11}|^2 > x) \ge \mathbb P(\varepsilon D Z > x), \quad x > 0.$$
This implies the lemma.

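Lemma 3.6. Assume (H2). Let $\omega_i \in [0, 1]$ be numbers such that $\omega(n) := \sum_{i=1}^n \omega_i \ge n^{\frac12 + \varepsilon}$ for some $\varepsilon > 0$. Let $X_1 = (X_{1i})_{1 \le i \le n}$ be i.i.d. random variables distributed as $X_{11}$, and let $Z = Z^{(\beta)}$ with $\beta = \frac{\alpha}{2}$. There exist $\delta > 0$ and a coupling of $X_1$ and $Z$ such that
$$\mathbb P\Big( \sum_{i=1}^n \omega_i |X_{1i}|^2 \le \delta\, \omega(n)^{\frac1\beta} Z \Big) \le e^{-n^\delta}. \tag{3.11}$$

Proof. Let $D = (D_i)_{1 \le i \le n}$ denote an i.i.d. vector of Bernoulli variables with parameter $p$ given by Lemma 3.5. From this latter lemma and (3.10) we know that there exist $\varepsilon > 0$ and a coupling of $X_1$, $D$ and $Z$ such that
$$\mathbb P\Big( \sum_{i=1}^n \omega_i |X_{1i}|^2 \ge \varepsilon \Big( \sum_{i=1}^n \omega_i^\beta D_i \Big)^{\frac1\beta} Z \Big) = 1.$$
It remains to show that for some $\varepsilon' > 0$:
$$\mathbb P\Big( \sum_{i=1}^n \omega_i^\beta D_i \le \varepsilon' \omega(n) \Big) \le e^{-n^{\varepsilon'}}.$$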

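Observe that $\omega_i^\beta \ge \omega_i$, so that $\mathbb E \sum_{i=1}^n \omega_i^\beta D_i \ge p\, \omega(n)$. Therefore, for $0 < \varepsilon' < p$,
$$\mathbb P\Big( \sum_{i=1}^n \omega_i^\beta D_i \le \varepsilon' \omega(n) \Big) \le \mathbb P\Big( \Big| \sum_{i=1}^n \big( \omega_i^\beta D_i - \mathbb E \omega_i^\beta D_i \big) \Big| \ge (p - \varepsilon')\, \omega(n) \Big) \le 2 e^{-2 (p - \varepsilon')^2 \omega(n)^2 / n},$$
where we have used the Hoeffding inequality in the last bound. Since $\omega(n) \ge n^{\frac12 + \varepsilon}$, this implies the lemma.

Proposition 3.7. Assume (H2) and take $0 < \gamma \le \alpha/4$. Let $R$ be the first row of the matrix $a_n(A - z)$. There exists a constant $c > 0$ and an event $\mathcal E$ such that for any $d$-dimensional subspace $W$ of $\mathbb C^n$ with codimension $n - d \ge n^{1-\gamma}$, we have
$$\mathbb E[\mathrm{dist}^{-2}(R, W);\ \mathcal E] \le c\, (n - d)^{-\frac{2}{\alpha}} \quad\text{and}\quad \mathbb P(\mathcal E^c) \le c\, n^{-(1-2\gamma)/\alpha}.$$

Proof. As in the proof of Proposition 3.3, we have
$$\mathrm{dist}(R, W) \ge \mathrm{dist}(X_1, W_1),$$
where $W_1 = \mathrm{span}(W, e_1)$, $d \le \dim W_1 \le d + 1$, and $X_1 = (X_{1i})_{1 \le i \le n}$ is the first row of $X = a_n A$. Let $\mathcal I$ denote the set of indexes $i$ such that $|X_{1i}| \le a_n$. From (3.4) we know that
$$\mathbb P(|\mathcal I| < n - \sqrt n) < e^{-n^\delta},$$
for some $\delta > 0$. It is thus sufficient to prove that for any set $I \subset \{1, \dots, n\}$ such that $|I| \ge n - \sqrt n$,
$$\mathbb E[\mathrm{dist}^{-2}(R, W);\ \mathcal E_I \mid \mathcal I = I] \le c\, (n - d)^{-\frac{2}{\alpha}},$$
for some event $\mathcal E_I$ satisfying $\mathbb P((\mathcal E_I)^c \mid \mathcal I = I) \le c\, n^{-(1-2\gamma)/\alpha}$. We will then simply set $\mathcal E = \mathcal E_{\mathcal I} \cap \{|\mathcal I| \ge n - \sqrt n\}$.

Without loss of generality, we assume that $I = \{1, \dots, n'\}$ with $n' \ge n - n^{1/2}$. Let $\pi_I$ be the orthogonal projection on $\mathrm{span}(e_i : i \in I)$. If $W_2 = \pi_I(W_1)$, set
$$W' = \mathrm{span}\big( W_2, \mathbb E(\pi_I(X_1) \mid \mathcal I = I) \big).$$
Note that $d - \sqrt n \le \dim(W') \le \dim(W_1) + 1 \le d + 2$. Defining
$$Y = \pi_I(X_1) - \mathbb E(\pi_I(X_1) \mid \mathcal I = I),$$
we have
$$\mathrm{dist}(R, W) \ge \mathrm{dist}(X_1, W_1) \ge \mathrm{dist}(Y, W').$$
Thus, $Y = (Y_i)_{1 \le i \le n'}$ is an i.i.d. mean zero vector under $\mathbb P(\cdot \mid \mathcal I = I)$. Let $P$ denote the orthogonal projection matrix to the orthogonal of $W'$ in $\mathbb C^{n'}$. By construction, we have
$$\mathbb E\big[ \mathrm{dist}^2(Y, W') \mid \mathcal I = I \big] = \mathbb E\Big[ \sum_{i,j=1}^{n'} Y_i P_{ij} \bar Y_j \mid \mathcal I = I \Big] = \mathbb E\big[ |Y_1|^2 \mid \mathcal I = I \big]\, \mathrm{tr} P.$$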

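Here $\mathrm{tr}\,P = \sum_{i=1}^{n'} P_{ii}$, where $P_{ii} = (e_i, Pe_i) \in [0, 1]$ and $\mathrm{tr}\,P = n' - \dim(W')$ satisfies

$$2(n-d) \ge \mathrm{tr}\,P \ge \tfrac12 (n-d). \tag{3.12}$$

Let $S = \sum_{i=1}^{n'} P_{ii}|Y_i|^2$. We have

$$\mathbb{E}\big[(\mathrm{dist}^2(Y, W') - S)^2 \mid \mathcal{I} = I\big] = \mathbb{E}\Big[\Big(\sum_{i\ne j} Y_i P_{ij}\bar{Y}_j\Big)^2 \,\Big|\, \mathcal{I} = I\Big]$$
$$= \sum_{(i_1\ne j_1),(i_2\ne j_2)} P_{i_1 j_1}P_{i_2 j_2}\,\mathbb{E}\big[Y_{i_1}\bar{Y}_{j_1}Y_{i_2}\bar{Y}_{j_2} \mid \mathcal{I} = I\big]$$
$$= 2\sum_{i_1\ne j_1} P_{i_1 j_1}^2\,\mathbb{E}[|Y_1|^2 \mid \mathcal{I} = I] \le 2\,\mathbb{E}[|Y_1|^2 \mid \mathcal{I} = I]\,\mathrm{tr}\,P^2.$$

Note that

$$\mathbb{E}[|Y_1|^2 \mid \mathcal{I} = I] \le \mathbb{E}[|X_{11}|^2 \mid \mathcal{I} = I] = \mathbb{E}\big[|X_{11}|^2 \,\big|\, |X_{11}| \le a_n\big] = \frac{\mathbb{E}[|X_{11}|^2\,;\,|X_{11}| \le a_n]}{\mathbb{P}(|X_{11}| \le a_n)} = O(a_n^2/n),$$

where the last bound follows from Lemma C.1. Since $P^2 = P$, we deduce that

$$\mathbb{E}\big[(\mathrm{dist}^2(Y, W') - S)^2 \mid \mathcal{I} = I\big] = O\Big(a_n^2\,\frac{n-d}{n}\Big). \tag{3.13}$$

Next, let $Z = Z^{(\beta)}$ with $\beta = \frac{\alpha}{2}$, as in Lemma 3.6. Set $\omega_i = P_{ii}$, $i = 1, \ldots, n'$, and for $\varepsilon > 0$, consider the event

$$\Omega_I = \Big\{\sum_{i=1}^{n'} \omega_i |X_{1i}|^2 \ge \varepsilon\,(n-d)^{\frac{1}{\beta}} Z\Big\}.$$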

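From Lemma 3.6 (with $n$ replaced by $n' \ge n - n^{1/2}$) and using (3.12) there exists a coupling of the vector $(X_{1i})$, $i = 1, \ldots, n'$, and $Z$ such that

$$\mathbb{P}(\Omega_I^c) \le e^{-n^{\delta}}, \tag{3.14}$$

for some $\delta > 0$ and some choice of $\varepsilon > 0$. Also, since $(a - b)^2 \ge a^2/2 - b^2$ for all $a, b \in \mathbb{R}$, we have $S \ge \frac12 S_a - S_b$, where

$$S_a = \sum_{i=1}^{n'} \omega_i |X_{1i}|^2, \qquad S_b = \sum_{i=1}^{n'} \omega_i\,\mathbb{E}\big[|X_{1i}| \,\big|\, |X_{1i}| \le a_n\big]^2.$$

From Lemma C.1 and (3.12) we have

$$S_b = \mathbb{E}\big[|X_{11}| \,\big|\, |X_{11}| \le a_n\big]^2\,\mathrm{tr}\,P = h^{(\alpha)}(n, d), \tag{3.15}$$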

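where $h^{(\alpha)}(n, d) \sim (n-d)\,a_n^2/n^2$ if $\alpha \in (0, 1]$ and $h^{(\alpha)}(n, d) \sim (n-d)$ if $\alpha \in (1, 2)$. Let $G^1_I$ be the event that $S_a \ge 3 S_b$. From (3.15) and the definition of $\Omega_I$ we have, for some $c_0 > 0$,

$$\mathbb{P}\big((G^1_I)^c \cap \Omega_I \mid \mathcal{I} = I\big) \le \mathbb{P}\big(Z \le c_0\,(n-d)^{-1/\beta}\, h^{(\alpha)}(n, d) \mid \mathcal{I} = I\big).$$

Note that, thanks to the assumptions $n - d \ge n^{1-\gamma}$, $\gamma \le \alpha/4$, we have $(n-d)^{-1/\beta} h^{(\alpha)}(n, d) \le n^{-\varepsilon_0}$ for some $\varepsilon_0 = \varepsilon_0(\alpha) > 0$ for all $\alpha \in (0, 2)$, for $n \gg 1$. Therefore, for $n \gg 1$,

$$\mathbb{P}\big((G^1_I)^c \cap \Omega_I \mid \mathcal{I} = I\big) \le \mathbb{P}\big(Z \le c_0 n^{-\varepsilon_0} \mid \mathcal{I} = I\big) = \frac{\mathbb{P}\big(Z \le c_0 n^{-\varepsilon_0}\,;\,|X_{1i}| \le a_n,\ \forall i = 1, \ldots, n'\big)}{\mathbb{P}\big(|X_{1i}| \le a_n,\ \forall i = 1, \ldots, n'\big)},$$

where the last identity follows from the independence of the $X_{1i}$. Observing that the probability of the event $\{|X_{1i}| \le a_n,\ \forall i = 1, \ldots, n'\}$ is lower bounded by $1/c > 0$ uniformly in $n$, we obtain

$$\mathbb{P}\big((G^1_I)^c \cap \Omega_I \mid \mathcal{I} = I\big) \le c\,\mathbb{P}\big(Z \le c_0 n^{-\varepsilon_0}\big).$$

The latter probability can be estimated using Markov's inequality and the fact that $\mathbb{E}[Z^{-m}] = u_m$ is finite (cf. (3.9)). Indeed, for every $m > 0$, $\mathbb{P}(Z \le t) = \mathbb{P}(Z^{-m} \ge t^{-m}) \le u_m t^{m}$. Thus, we have shown that for every $p > 0$ there exists a constant $\kappa_p$ such that

$$\mathbb{P}\big((G^1_I)^c \cap \Omega_I \mid \mathcal{I} = I\big) \le \kappa_p\, n^{-p}. \tag{3.16}$$

Next, we set $\Omega'_I = G^1_I \cap \Omega_I$ and we claim that

$$\mathbb{E}\big[S^{-2}\,;\,\Omega'_I \mid \mathcal{I} = I\big] = O\big((n-d)^{-4/\alpha}\big). \tag{3.17}$$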

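Indeed, on $\Omega'_I$ we have $S \ge \frac16 S_a \ge \frac{\varepsilon}{6}(n-d)^{2/\alpha} Z$ and therefore, for some constant $c_1$,

$$\mathbb{E}\big[S^{-2}\,;\,\Omega'_I \mid \mathcal{I} = I\big] \le c_1 (n-d)^{-4/\alpha}\,\mathbb{E}\big[Z^{-2} \mid \mathcal{I} = I\big].$$

Using independence as before, and recalling that the event $\{|X_{1i}| \le a_n,\ \forall i = 1, \ldots, n'\}$ has uniformly positive probability, we have $\mathbb{E}[Z^{-2} \mid \mathcal{I} = I] \le c\,\mathbb{E}[Z^{-2}] = c\,u_2$. This proves (3.17). Now, on the event $\Omega'_I$, Markov's and Cauchy-Schwarz' inequalities lead to

$$\mathbb{P}\big(\mathrm{dist}^2(Y, W') \le S/2\,;\,\Omega'_I \mid \mathcal{I} = I\big) \le \mathbb{P}\Big(\frac{|\mathrm{dist}^2(Y, W') - S|}{S} \ge \frac12\,;\,\Omega'_I \,\Big|\, \mathcal{I} = I\Big)$$
$$\le 2\,\mathbb{E}\Big[\frac{|\mathrm{dist}^2(Y, W') - S|}{S}\,;\,\Omega'_I \,\Big|\, \mathcal{I} = I\Big] \le 2\sqrt{\mathbb{E}\big[|\mathrm{dist}^2(Y, W') - S|^2 \mid \mathcal{I} = I\big]\;\mathbb{E}\big[S^{-2}\,;\,\Omega'_I \mid \mathcal{I} = I\big]}.$$

Hence, if $G^2_I$ denotes the event $\{\mathrm{dist}^2(Y, W') \ge S/2\}$, we deduce from (3.13) and (3.17),

$$\mathbb{P}\big((G^2_I)^c \cap \Omega'_I \mid \mathcal{I} = I\big) = O\Big(a_n\, n^{-\frac12} (n-d)^{\frac12 - \frac{2}{\alpha}}\Big). \tag{3.18}$$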

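Note that, using $n - d \ge n^{1-\gamma}$, the last expression is certainly $O(n^{-\frac{1}{\alpha}(1-2\gamma)})$. On the other hand, by (3.17) and Cauchy-Schwarz' inequality,

$$\mathbb{E}\big[\mathrm{dist}^{-2}(Y, W')\,;\,G^2_I \cap \Omega'_I \mid \mathcal{I} = I\big] \le 2\,\mathbb{E}\big[S^{-1}\,;\,\Omega'_I \mid \mathcal{I} = I\big] = O\big((n-d)^{-2/\alpha}\big). \tag{3.19}$$

To conclude the proof we take $E_I = G^2_I \cap \Omega'_I = G^1_I \cap G^2_I \cap \Omega_I$. We have

$$\mathbb{P}\big((E_I)^c \mid \mathcal{I} = I\big) \le \mathbb{P}\big((\Omega_I)^c \mid \mathcal{I} = I\big) + \mathbb{P}\big((G^1_I)^c \cap \Omega_I \mid \mathcal{I} = I\big) + \mathbb{P}\big((G^2_I)^c \cap G^1_I \cap \Omega_I \mid \mathcal{I} = I\big).$$

From (3.16) and (3.18) we see that

$$\mathbb{P}\big((G^1_I)^c \cap \Omega_I \mid \mathcal{I} = I\big) + \mathbb{P}\big((G^2_I)^c \cap G^1_I \cap \Omega_I \mid \mathcal{I} = I\big) = O\big(n^{-\frac{1}{\alpha}(1-2\gamma)}\big),$$

and all that remains is to prove an upper bound on $\mathbb{P}((\Omega_I)^c \mid \mathcal{I} = I)$. By independence, as before,

$$\mathbb{P}\big((\Omega_I)^c \mid \mathcal{I} = I\big) \le c\,\mathbb{P}\big((\Omega_I)^c\,;\,|X_{1i}| \le a_n,\ \forall i = 1, \ldots, n'\big).$$

From (3.14) we obtain $\mathbb{P}((\Omega_I)^c \mid \mathcal{I} = I) \le c\,e^{-n^{\delta}}$. This ends the proof.

3.4. Uniform integrability. Let $z \in \mathbb{C}$ and $\sigma_n \le \cdots \le \sigma_1$ be the singular values of $A_n - z$ with $A_n$ defined by (1.4). For $0 < \delta < 1$, we define $K_\delta = [\delta, \delta^{-1}]$. In this paragraph, we prove the uniform integrability in probability, meaning that for all $\varepsilon > 0$, there exists $\delta > 0$ such that

$$\mathbb{P}\Big(\int_{K_\delta^c} |\ln(x)|\,\nu_{A_n - z}(dx) > \varepsilon\Big) \to 0. \tag{3.20}$$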

From Lemma 3.1, with probability 1 there exists $c_0 > 0$, such that for all $n$,

$$\int_1^\infty \ln^2(x)\,\nu_{A_n - z}(dx) < c_0.$$

It follows from the Markov inequality that for all $t \ge 1$, $\int_t^{+\infty} \ln(x)\,\nu_{A_n - z}(dx) < c_0/\ln t$. The upper part $(\delta^{-1}, \infty)$ of (3.20) is thus not an issue. For the lower part $(0, \delta)$, it is sufficient to prove that

$$\frac1n \sum_{i=0}^{n-1} \mathbf{1}_{\{\sigma_{n-i} \le \delta_n\}} \ln \sigma_{n-i}^{-2}$$

converges in probability to 0 for any sequence $(\delta_n)_n$ converging to 0. From Lemma 3.2, we may a.s. lower bound $\sigma_{n-i}$ by $c\,n^{-r}$ for some constant $c$ and all integers $n \ge 1$. Take $0 < \gamma < \alpha/4$ to be fixed later. Using this latter bound for every $1 \le i \le n^{1-\gamma}$, it follows that it is sufficient to prove that

$$\frac1n \sum_{i = n^{1-\gamma}}^{n-1} \mathbf{1}_{\{\sigma_{n-i} \le \delta_n\}} \ln \sigma_{n-i}^{-2}$$

converges in probability to 0. We are going to prove that there exists an event $F_n$ such that, for some $\delta > 0$ and $c > 0$,

$$\mathbb{P}((F_n)^c) \le c \exp(-n^{\delta}), \tag{3.21}$$

and

$$\mathbb{E}\big[\sigma_{n-i}^{-2} \mid F_n\big] \le c\,\Big(\frac{n}{i}\Big)^{\frac{2}{\alpha}+1}. \tag{3.22}$$

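We first conclude the proof before proving (3.21)–(3.22). From Markov inequality and (3.22), we deduce that

$$\mathbb{P}(\sigma_{n-i} \le \delta_n) \le \mathbb{P}((F_n)^c) + c\,\delta_n^2\,\Big(\frac{n}{i}\Big)^{\frac{2}{\alpha}+1}.$$

It follows that there exists a sequence $\varepsilon_n = \delta_n^{1/(\frac{2}{\alpha}+1)}$ tending to 0 such that the probability $\mathbb{P}(\sigma_{n - \lfloor n\varepsilon_n\rfloor} \le \delta_n)$ converges to 0. We obtain that it is sufficient to prove that

$$\frac1n \sum_{i = n^{1-\gamma}}^{\lfloor \varepsilon_n n\rfloor} \ln \sigma_{n-i}^{-2},$$

given $F_n$, converges in probability to 0. However, using the concavity of the logarithm and (3.22) we have

$$\mathbb{E}\Big[\frac1n \sum_{i = n^{1-\gamma}}^{\lfloor \varepsilon_n n\rfloor} \ln \sigma_{n-i}^{-2} \,\Big|\, F_n\Big] \le \frac1n \sum_{i = n^{1-\gamma}}^{\lfloor \varepsilon_n n\rfloor} \ln \mathbb{E}\big[\sigma_{n-i}^{-2} \mid F_n\big] \le c_1\,\frac1n \sum_{i=1}^{\lfloor \varepsilon_n n\rfloor} \ln \frac{n}{i} = c_1\big(-\varepsilon_n \ln \varepsilon_n + \varepsilon_n + O(n^{-1})\big).$$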
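The last step above replaces a Riemann sum by its limit, $\int_0^{\varepsilon} \ln(1/x)\,dx = -\varepsilon\ln\varepsilon + \varepsilon$, which tends to 0 with $\varepsilon$. The following sketch checks this numerically; it uses only the Python standard library, and the function names `log_average` and `limit_value` are ours, not the paper's.

```python
import math


def log_average(n: int, eps: float) -> float:
    """Return (1/n) * sum_{i=1}^{m} ln(n/i), with m = eps*n rounded to an integer."""
    m = int(round(eps * n))
    return sum(math.log(n / i) for i in range(1, m + 1)) / n


def limit_value(eps: float) -> float:
    """Return the integral of ln(1/x) over (0, eps), i.e. -eps*ln(eps) + eps."""
    return -eps * math.log(eps) + eps


if __name__ == "__main__":
    for eps in (0.1, 0.01):
        for n in (10**3, 10**5):
            print(eps, n, log_average(n, eps), limit_value(eps))
```

For fixed $\varepsilon$ the two quantities agree up to a correction that vanishes as $n$ grows, and `limit_value(eps)` decreases to 0 as `eps` does, which is all the argument uses.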

It thus remains to prove (3.21)–(3.22). Let $B_n$ be the matrix formed by the first $n - \lfloor i/2\rfloor$ rows of $a_n(A_n - zI)$. If $\sigma'_1 \ge \cdots \ge \sigma'_{n - \lfloor i/2\rfloor}$ are the singular values of $B_n$, then by the Cauchy interlacing Lemma B.4,

$$\sigma_{n-i} \ge \frac{\sigma'_{n-i}}{a_n}.$$

By the Tao-Vu negative second moment Lemma B.3, we have

$$\sigma'^{-2}_1 + \cdots + \sigma'^{-2}_{n - \lfloor i/2\rfloor} = \mathrm{dist}_1^{-2} + \cdots + \mathrm{dist}_{n - \lfloor i/2\rfloor}^{-2},$$

where $\mathrm{dist}_j$ is the distance from the $j$th row of $B_n$ to the subspace spanned by the other rows of $B_n$. In particular,

$$\frac{i}{2}\,\sigma_{n-i}^{-2} \le a_n^2 \sum_{j=1}^{n - \lfloor i/2\rfloor} \mathrm{dist}_j^{-2}.$$

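Let $F_n$ be the event that for all $1 \le j \le n - \lfloor i/2\rfloor$, $\mathrm{dist}_j \ge n^{(1-2\gamma)/\alpha}$. Since the dimension of the span of all but one row of $B_n$ is at most $d \le n - i/2$, we can use Proposition 3.3 to obtain

$$\mathbb{P}((F_n)^c) \le \exp(-n^{\delta}),$$

for some $\delta > 0$. Then we write

$$\frac{i}{2}\,\sigma_{n-i}^{-2}\,\mathbf{1}_{F_n} \le a_n^2 \sum_{j=1}^{n - \lfloor i/2\rfloor} \mathrm{dist}_j^{-2}\,\mathbf{1}_{F_n}.$$

Taking expectation, we get

$$\mathbb{E}\big[i\,\sigma_{n-i}^{-2}\,;\,F_n\big] \le 2\,a_n^2\,n\,\mathbb{E}\big[\mathrm{dist}_1^{-2}\,;\,F_n\big]. \tag{3.23}$$

Since we are on $F_n$ we can always estimate $\mathrm{dist}_1 \ge n^{(1-2\gamma)/\alpha}$. By introducing a further decomposition we can strengthen this as follows. Recall that from Proposition 3.7, there exists an event $E$ independent from the rows $j \ne 1$ such that $\mathbb{P}(E^c) \le n^{-(1-2\gamma)/\alpha}$ and for any $W \subset \mathbb{C}^n$ with dimension $d < n - n^{1-\gamma}$ one has

$$\mathbb{E}[\mathrm{dist}(R, W)^{-2}\,;\,E] \le c\,(n-d)^{-2/\alpha}.$$

Here $R$ is the first row of the matrix $B_n$. By first conditioning on the value of the other rows of $B_n$ and recalling that the dimension $d$ of the span of these is at most $n - i/2 \le n - \frac12 n^{1-\gamma}$, we see that

$$\mathbb{E}[\mathrm{dist}_1^{-2}\,;\,E] = O\big(i^{-2/\alpha}\big).$$

Therefore

$$\mathbb{E}\big[\mathrm{dist}_1^{-2}\,;\,F_n\big] \le \mathbb{E}\big[\mathrm{dist}_1^{-2}\,;\,E\big] + \mathbb{P}(E^c)\,n^{-2(1-2\gamma)/\alpha} \le c_2\big(i^{-2/\alpha} + n^{-3(1-2\gamma)/\alpha}\big). \tag{3.24}$$

Now, if $\gamma < 1/6$ we have $3(1-2\gamma)/\alpha > 2/\alpha$ and therefore $n^{-3(1-2\gamma)/\alpha} \le i^{-2/\alpha}$. Thus, (3.24) implies

$$\mathbb{E}\big[\mathrm{dist}_1^{-2}\,;\,F_n\big] \le 2\,c_2\,i^{-2/\alpha}. \tag{3.25}$$

From (3.23) we obtain

$$\mathbb{E}\big[i\,\sigma_{n-i}^{-2}\,;\,F_n\big] \le 2\,c_2\,a_n^2\,n\,i^{-2/\alpha}.$$

From (H2) it follows that (3.22) holds. This concludes the proof of (3.21)–(3.22).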

3.5. Proof of Theorem 1.2. We may now invoke Theorem 1.1 and (3.20). From Lemma A.2, $\mu_{A_n}$ converges in probability to $\mu_\alpha$, where for almost all $z \in \mathbb{C}$, $U_{\mu_\alpha}(z) = \int \ln(x)\,\nu_{\alpha,z}(dx)$. Now, from Lemma 3.1, the sequence of measures $(\mu_{A_n})$, $n \in \mathbb{N}$, is a.s. tight. From the uniqueness of the limit, it follows that $\mu_{A_n}$ converges a.s. to $\mu_\alpha$.

4. Limiting Spectral Measure

In this section, we take a close look at the resolvent of the random operator on the PWIT and we deduce some properties of the limiting spectral measure $\mu_\alpha$. For ease of notation we set

$$\beta = \frac{\alpha}{2}$$

and define the measure on $\mathbb{R}_+$,

$$\Lambda_\alpha = \frac{\alpha}{2}\,x^{-\frac{\alpha}{2}-1}\,dx.$$

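4.1. Resolvent operator on the Poisson Weighted Infinite Tree. In this paragraph, we analyze the random variable

$$R(U)_{\emptyset\emptyset} = \begin{pmatrix} a(z, \eta) & b(z, \eta) \\ b'(z, \eta) & c(z, \eta) \end{pmatrix}.$$

By Lemma 2.2, for $t \in \mathbb{R}_+$, $a(z, it)$ is pure imaginary and we set

$$h(z, t) = \mathrm{Im}(a(z, it)) = -i\,a(z, it) \in [0, t^{-1}].$$

The random variables $a(z, \eta)$ and $h(z, t)$ solve a nice recursive distribution equation.

Theorem 4.1 (Recursive Distributional Equation). Let $U = U(z, \eta) \in \mathbb{H}_+$, $t \in \mathbb{R}_+$. Let $L_U$ be the distribution on $\mathbb{C}_+$ of $a(z, \eta)$ and $L_{z,t}$ the distribution of $h(z, t)$.

(i) $L_U$ solves the equation in distribution

$$a \overset{d}{=} \frac{\eta + \sum_{k\in\mathbb{N}} \xi_k a_k}{|z|^2 - \big(\eta + \sum_{k\in\mathbb{N}} \xi_k a_k\big)\big(\eta + \sum_{k\in\mathbb{N}} \xi'_k a'_k\big)}, \tag{4.1}$$

where $a$, $(a_k)_{k\in\mathbb{N}}$ and $(a'_k)_{k\in\mathbb{N}}$ are i.i.d. with law $L_U$, independent of $\{\xi_k\}_{k\in\mathbb{N}}$, $\{\xi'_k\}_{k\in\mathbb{N}}$, two independent Poisson point processes on $\mathbb{R}_+$ with intensity $\Lambda_\alpha$.

(ii) $L_{z,t}$ is the unique probability distribution on $[0, \infty)$ such that

$$h \overset{d}{=} \frac{t + \sum_{k\in\mathbb{N}} \xi_k h_k}{|z|^2 + \big(t + \sum_{k\in\mathbb{N}} \xi_k h_k\big)\big(t + \sum_{k\in\mathbb{N}} \xi'_k h'_k\big)}, \tag{4.2}$$

where $h$, $(h_k)_{k\in\mathbb{N}}$ and $(h'_k)_{k\in\mathbb{N}}$ are i.i.d. with law $L_{z,t}$, independent of $\{\xi_k\}_{k\in\mathbb{N}}$, $\{\xi'_k\}_{k\in\mathbb{N}}$, two independent Poisson point processes on $\mathbb{R}_+$ with intensity $\Lambda_\alpha$.

(iii) For $t = 0$ there are two probability distributions on $[0, \infty)$ solving (4.2) such that $\mathbb{E}h^{\alpha/2} < \infty$: $\delta_0$ and another denoted by $L_{z,0}$. Moreover, for the topology of weak convergence, $L_{z,t}$ converges to $L_{z,0}$ as $t$ goes to 0.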

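We start with an important lemma.

Lemma 4.2. For every $U = U(z, \eta) \in \mathbb{H}_+$, $\begin{pmatrix} a & b \\ b' & c \end{pmatrix}$ is equal in distribution to

$$\frac{1}{|z|^2 - \big(\eta + \sum_{k\in\mathbb{N}} \xi_k a_k\big)\big(\eta + \sum_{k\in\mathbb{N}} \xi'_k a'_k\big)} \begin{pmatrix} \eta + \sum_{k\in\mathbb{N}} \xi_k a_k & -z \\ -\bar{z} & \eta + \sum_{k\in\mathbb{N}} \xi'_k a'_k \end{pmatrix}, \tag{4.3}$$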

where a, (ak )k∈N and (ak )k∈N are i.i.d. with law L U independent of {ξk }k∈N , {ξk }k∈N two independent Poisson point processes on R+ with intensity α . Proof of Lemma 4.2. Consider a realization of PWIT(2 θ ) on the tree T . For k ∈ N, we define Tk as the subtree of T spanned by kN f . With the notation of Lemma 2.5, for k ∈ N, R Bk (U ) = (Bk (z) − η)−1 is the resolvent operator of Bk and set )kk = k R Bk (U )∗k = ak bk . R(U bk ck Then, by Lemma 2.5 and (2.13), we get −1/α ak bk 0 εk yk R(U )∅∅ = − U + −1/α bk ck (1 − εk )yk 0 k∈N −1 −1/α 0 (1 − εk )yk × −1/α εk yk 0 −1 −2/α c 0 k k∈N (1 − εk )|yk |

=− U+ −2/α a 0 k k∈N εk |yk |

−2/α η + k∈N εk |yk | ak −z

, = D −1 −¯z η + k∈N (1 − εk )|yk |−2/α ck

with $D = |z|^2 - \big(\eta + \sum_{k\in\mathbb{N}} \varepsilon_k |y_k|^{-2/\alpha} a_k\big)\big(\eta + \sum_{k\in\mathbb{N}} (1-\varepsilon_k)|y_k|^{-2/\alpha} c_k\big)$. Now the structure of the PWIT implies that (i) $a_k$ and $c_k$ have common distribution $L_U$; and (ii) the variables $(a_k, c_k)_{k\in\mathbb{N}}$ are i.i.d. Also the thinning property of Poisson processes implies that (iii) $\{\varepsilon_k |y_k|^{-2/\alpha}\}_{k\in\mathbb{N}}$ and $\{(1-\varepsilon_k)|y_k|^{-2/\alpha}\}_{k\in\mathbb{N}}$ are independent Poisson point processes with common intensity $\Lambda_\alpha$.

The next well-known and beautiful lemma will be crucial in the computations that will follow. It is a consequence of the LePage-Woodroofe-Zinn representation of stable laws [35], see also Panchenko and Talagrand [41, Lemma 2.1].

Lemma 4.3. Let $\{\xi_k\}_{k\in\mathbb{N}}$ be a Poisson process with intensity $\Lambda_\alpha$. If $(Y_k)$ is an i.i.d. sequence of non-negative random variables, independent of $\{\xi_k\}_{k\in\mathbb{N}}$, such that $\mathbb{E}[Y_1^\beta] < \infty$, then

$$\sum_{k\in\mathbb{N}} \xi_k Y_k \overset{d}{=} \mathbb{E}[Y_1^\beta]^{\frac{1}{\beta}} \sum_{k\in\mathbb{N}} \xi_k \overset{d}{=} \mathbb{E}[Y_1^\beta]^{\frac{1}{\beta}}\, S,$$

where $S$ is the positive $\beta$-stable random variable with Laplace transform, for all $x \ge 0$,

$$\mathbb{E}\exp(-xS) = \exp\big(-\Gamma(1-\beta)\,x^\beta\big). \tag{4.4}$$

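Proof of Lemma 4.3. Recall the formulas, for $y \ge 0$, $\eta > 0$ and $0 < \eta < 1$ respectively,

$$y^{-\eta} = \Gamma(\eta)^{-1} \int_0^\infty x^{\eta-1} e^{-xy}\,dx \quad\text{and}\quad y^{\eta} = \Gamma(1-\eta)^{-1}\,\eta \int_0^\infty x^{-\eta-1}\big(1 - e^{-xy}\big)\,dx. \tag{4.5}$$

From the Lévy-Khinchin formula we deduce that, with $s \ge 0$,

$$\mathbb{E}\exp\Big(-s\sum_k \xi_k Y_k\Big) = \exp\Big(\mathbb{E}\int_0^\infty \big(e^{-xsY_1} - 1\big)\,\beta x^{-\beta-1}\,dx\Big) = \exp\big(-\Gamma(1-\beta)\,s^\beta\,\mathbb{E}[Y_1^\beta]\big).$$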

Proof of Theorem 4.1. Statement (i) is contained in Lemma 4.2. For (ii), let $t > 0$ and $h$ be a solution of (4.2). Then $h$ is positive and is upper bounded by $1/t$. By Lemma 4.3, we may rewrite (4.2) as

$$h \overset{d}{=} \frac{t + \mathbb{E}[h^\beta]^{1/\beta} S}{|z|^2 + \big(t + \mathbb{E}[h^\beta]^{1/\beta} S\big)\big(t + \mathbb{E}[h^\beta]^{1/\beta} S'\big)}, \tag{4.6}$$

where $S$ and $S'$ are i.i.d. variables with common Laplace transform (4.4). In particular, $\mathbb{E}[h^\beta]^{1/\beta}$ is solution of the equation in $y$:

$$y^\beta = \mathbb{E}\Big[\Big(\frac{t + yS}{|z|^2 + (t + yS)(t + yS')}\Big)^\beta\Big].$$

Since $t > 0$, $\mathbb{E}[h^\beta] > 0$, it follows that $\mathbb{E}[h^\beta]^{1/\beta}$ is solution of the equation in $y$:

$$1 = \mathbb{E}\Big[\Big(\frac{t y^{-1} + S}{|z|^2 + (t + yS)(t + yS')}\Big)^\beta\Big]. \tag{4.7}$$

For every $S, S' > 0$, the function $y \mapsto \frac{t y^{-1} + S}{|z|^2 + (t + yS)(t + yS')}$ is decreasing in $y$. It follows that

$$y \mapsto \mathbb{E}\Big[\Big(\frac{t y^{-1} + S}{|z|^2 + (t + yS)(t + yS')}\Big)^\beta\Big]$$

is decreasing in $y$. As $y$ goes to 0 it converges to $\infty$ and as $y$ goes to infinity, it converges to 0. In particular, there is a unique point $y_*(|z|^2, t)$ such that (4.7) holds. This proves (ii) since from (4.6), the law of $h$ is determined by $\mathbb{E}[h^\beta]^{1/\beta} = y_*(|z|^2, t)$.

For Statement (iii) and $t = 0$, $h = 0$ is a particular solution of (4.2). If $h$ is not a.s. equal to 0, then $\mathbb{E}[h^\beta]^{1/\beta} > 0$ and the argument above still works since, for every $s, s' > 0$, the function $y \mapsto \frac{s}{|z|^2 + y^2 ss'}$ is decreasing in $y$. We deduce the existence of a unique positive solution $y_*(|z|^2, 0)$ of (4.7). We also have the continuity of the function $t \mapsto y_*(|z|^2, t)$ on $[0, \infty)$. Finally

$$h \overset{d}{=} \frac{y_*(|z|^2, 0)\,S}{|z|^2 + y_*^2(|z|^2, 0)\,SS'},$$

and from (4.6), it implies the weak convergence of $L_{z,t}$ to $L_{z,0}$.

4.2. Density of the limiting measure. In this paragraph, we analyze the RDE (4.3). For all $t > 0$, let $L_{z,t}$ be as in Theorem 4.1. From Eq. (4.6), $h$ may be expressed as

$$h \overset{d}{=} \frac{t + y_* S}{|z|^2 + (t + y_* S)(t + y_* S')},$$

where $S$ and $S'$ are i.i.d. variables with common Laplace transform (4.4) and $y_* := y_*(|z|^2, t)$ is the unique solution in $(0, \infty)$ of (4.7) (uniqueness is proved in Theorem 4.1). We extend continuously the function $y_*(r, t)$ for $t = 0$ by defining $y_*(|z|^2, 0)$ as the unique solution in $(0, \infty)$ of

$$1 = \mathbb{E}\Big[\Big(\frac{S}{|z|^2 + y^2 SS'}\Big)^\beta\Big]. \tag{4.8}$$

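Lemma 4.4. The function $y_* : [0, \infty)^2 \to (0, \infty)$ is $C^1$. For every $t \ge 0$, the mapping $r \mapsto y_*(r, t)$ is decreasing to 0.

Proof. For every $t \ge 0$, the derivative in $y > 0$ of the function $\mathbb{E}\big[\big(\frac{t y^{-1} + S}{|z|^2 + (t + yS)(t + yS')}\big)^\beta\big]$ is

$$-\beta\,t\,y^{-2}\,\mathbb{E}\Big[\frac{(t y^{-1} + S)^{\beta-1}}{\big(|z|^2 + (t + yS)(t + yS')\big)^{\beta}}\Big] - \beta\,\mathbb{E}\Big[\frac{(t y^{-1} + S)^{\beta}\,\big(S(t + yS') + S'(t + yS)\big)}{\big(|z|^2 + (t + yS)(t + yS')\big)^{\beta+1}}\Big]. \tag{4.9}$$

The last computation is justified since all terms are integrable. Indeed we have

$$\frac{(t y^{-1} + S)^{\beta-1}}{\big(|z|^2 + (t + yS)(t + yS')\big)^{\beta}} \le \frac{y^{-\beta+1}}{(t + yS)(t + yS')^{\beta}} \le \frac{y^{-2\beta}}{S\,S'^{\beta}},$$

and from (4.5), for all $\eta > 0$,

$$\mathbb{E}S^{-\eta} = \Gamma(\eta)^{-1} \int x^{\eta-1} e^{-\Gamma(1-\beta)x^{\beta}}\,dx < \infty. \tag{4.10}$$

Similarly, for the second term of (4.9), we write

$$\frac{(t y^{-1} + S)^{\beta}\,\big(S(t + yS') + S'(t + yS)\big)}{\big(|z|^2 + (t + yS)(t + yS')\big)^{\beta+1}} \le y^{-1}\,\frac{S(t + yS') + S'(t + yS)}{(t + yS)(t + yS')^{\beta+1}}$$
$$\le y^{-1}\,\frac{S}{(t + yS)(t + yS')^{\beta}} + y^{-1}\,\frac{S'}{(t + yS')^{\beta+1}} \le y^{-\beta-2} S'^{-\beta} + y^{-\beta-2} S'^{-\beta}.$$

The expression (4.9) is finite and strictly negative for all $y > 0$. The statement follows from the implicit function theorem.

From (4.3), for all $t > 0$,

$$b(z, it) \overset{d}{=} -\frac{z}{|z|^2 + \big(t + y_*(|z|^2, t)S\big)\big(t + y_*(|z|^2, t)S'\big)}.$$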

By Lemma 4.4, we may also define

$$b(z, 0) = \lim_{t\downarrow 0} b(z, it) \overset{d}{=} -\frac{z}{|z|^2 + y_*^2(|z|^2, 0)\,SS'}.$$

For ease of notation, we set $y_*(r) = y_*(r, 0)$. Since $\partial z = 1$, $\partial |z|^2 = \bar{z}$, we deduce that

$$-\mathbb{E}\,\partial b(z, 0) = \mathbb{E}\,\partial\,\frac{z}{|z|^2 + y_*^2(|z|^2)SS'}$$
$$= \mathbb{E}\big(|z|^2 + y_*^2(|z|^2)SS'\big)^{-1} - |z|^2\,\mathbb{E}\big(|z|^2 + y_*^2(|z|^2)SS'\big)^{-2} - 2|z|^2 y_*(|z|^2)\,y_*'(|z|^2)\,\mathbb{E}\,SS'\big(|z|^2 + y_*^2(|z|^2)SS'\big)^{-2}$$
$$= \big(y_*^2(|z|^2) - 2|z|^2 y_*(|z|^2)\,y_*'(|z|^2)\big)\,\mathbb{E}\,\frac{SS'}{\big(|z|^2 + y_*^2(|z|^2)SS'\big)^2}. \tag{4.11}$$

The latter is justified since $SS'\big(|z|^2 + y^2 SS'\big)^{-2} \le y^{-4}(SS')^{-1}$ is integrable from (4.10). The next lemma is an important consequence of Theorems 2.13 and 1.2.

Lemma 4.5. The following identity holds in $\mathcal{D}'(\mathbb{C})$:

$$\mu_\alpha = -\frac{1}{\pi}\,\partial\,\mathbb{E}b(\cdot, 0).$$

Therefore the measure $\mu_\alpha$ is isotropic and has a continuous density given by $1/\pi$ times the right hand side of (4.11).

Proof. Let $R_n$ be the resolvent matrix of $B_n$, the bipartized matrix of $A_n$ defined by (1.4). By Theorem 2.13 and Lemma 2.2, for all $t > 0$ and $z \in \mathbb{C}$,

$$\lim_{n\to\infty} \mathbb{E}R_n(U(z, it))_{11} = \begin{pmatrix} i\,\mathbb{E}h(z, t) & \mathbb{E}b(z, it) \\ \mathbb{E}\bar{b}(z, it) & i\,\mathbb{E}h(z, t) \end{pmatrix}.$$

From Theorem 2.14, $\mathbb{E}\nu_{A_n - z}$ converges weakly to $\nu_{\alpha,z}$ and, by Lemma 3.1, for all $t > 0$,

$$\lim_{n\to\infty} \frac12 \int \ln(x^2 + t^2)\,\mathbb{E}\nu_{A_n - z}(dx) = \frac12 \int \ln(x^2 + t^2)\,\nu_{\alpha,z}(dx).$$

From Eq. (3.20), $\int \ln(x)\,\nu_{\alpha,z}(dx)$ is integrable. We deduce that for all $z_0 \in \mathbb{C}$, there exists an open neighborhood of $z_0$ and a sequence $(t_n)_{n\ge 1}$ converging to 0 such that for all $z$ in the neighborhood,

$$\lim_{n\to\infty} \mathbb{E}R_n(U(z, it_n))_{11} = \begin{pmatrix} i\,\mathbb{E}h(z, 0) & \mathbb{E}b(z, 0) \\ \mathbb{E}\bar{b}(z, 0) & i\,\mathbb{E}h(z, 0) \end{pmatrix}, \tag{4.12}$$

and

$$\lim_{n\to\infty} \frac12 \int \ln(x^2 + t_n^2)\,\mathbb{E}\nu_{A_n - z}(dx) = \int \ln(x)\,\nu_{\alpha,z}(dx). \tag{4.13}$$

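Moreover from Theorem 1.2, Eq. (3.20), Lemma A.2, in $\mathcal{D}'(\mathbb{C})$:

$$\Delta \int \ln(x)\,\nu_{\alpha,z}(dx) = 2\pi\mu_\alpha. \tag{4.14}$$

On the other hand, $\frac12 \int \ln(x^2 + t^2)\,\nu_{A_n - z}(dx) = \frac{1}{2n} \ln|\det(B(z) - itI_{2n})|$, and from (2.5),

$$\Delta\,\frac12 \int \ln(x^2 + t^2)\,\mathbb{E}\nu_{A_n - z}(dx) = -2\,\partial\,\mathbb{E}b_1(z, it).$$

The conclusion follows from (4.12), (4.13) and (4.14).

It is possible to compute explicitly the expression (4.11) at $z = 0$.

Lemma 4.6. The density of $\mu_\alpha$ at $z = 0$ is

$$\frac{1}{\pi}\,\frac{\Gamma(1 + 1/\beta)^2\,\Gamma(1+\beta)^{1/\beta}}{\Gamma(1-\beta)^{1/\beta}}.$$

Proof. By definition, the real $y_*(0)$ solves the equation

$$1 = \mathbb{E}\Big[\Big(\frac{S}{y^2 SS'}\Big)^\beta\Big] = y^{-2\beta}\,\mathbb{E}S'^{-\beta} = \frac{y^{-2\beta}}{\Gamma(\beta)} \int x^{\beta-1} e^{-\Gamma(1-\beta)x^{\beta}}\,dx.$$

With the change of variable $x \mapsto x^\beta$ and the identity $z\Gamma(z) = \Gamma(1+z)$, we find easily,

$$\mathbb{E}S^{-\beta} = \big(\Gamma(1-\beta)\Gamma(1+\beta)\big)^{-1} \quad\text{and}\quad y_*(0) = \big(\Gamma(1-\beta)\Gamma(1+\beta)\big)^{-\frac{1}{2\beta}}.$$

We also have

$$\mathbb{E}S^{-1} = \int e^{-\Gamma(1-\beta)x^{\beta}}\,dx = \frac{1}{\beta\,\Gamma(1-\beta)^{1/\beta}} \int x^{1/\beta - 1} e^{-x}\,dx = \frac{\Gamma(1 + 1/\beta)}{\Gamma(1-\beta)^{1/\beta}},$$

where we have used again the identity $z\Gamma(z) = \Gamma(1+z)$. Then the right-hand side of (4.11) at $z = 0$ is equal to

$$y_*^2(0)\,y_*^{-4}(0)\,\mathbb{E}(SS')^{-1} = y_*^{-2}(0)\,\big(\mathbb{E}S^{-1}\big)^2.$$

4.3. Proof of Theorem 1.3. In this subsection, we prove the last statement of Theorem 1.3 (the first part of the theorem being contained in Lemmas 4.5, 4.6). We start with a first technical lemma.

Lemma 4.7. Let $0 < \beta < 1$, $\delta > 0$, and $f$ be a bounded measurable $\mathbb{R}_+ \to \mathbb{R}$ function such that $f(y) = O(y^{\beta+\delta})$ as $y \downarrow 0$. Let $Y$ be a random variable such that $\mathbb{P}(Y \ge t) = L(t)\,t^{-\beta}$ for some slowly varying function $L$. Then as $t$ goes to infinity,

$$\mathbb{E}f\Big(\frac{Y}{t}\Big) \sim \beta\,L(t)\,t^{-\beta} \int_0^\infty f(y)\,y^{-\beta-1}\,dy.$$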

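Proof. Define $Y_t = Y/t$. We fix $\varepsilon > 0$ and consider the distribution $\mathbb{P}(Y_t \in \cdot \mid Y_t \ge \varepsilon)$. By assumption, for $s > \varepsilon$, $\mathbb{P}(Y_t \ge s \mid Y_t \ge \varepsilon) \sim (s/\varepsilon)^{-\beta}$. In particular, the distribution of $Y_t$ given $\{Y_t \ge \varepsilon\}$ converges weakly as $t$ goes to infinity to the distribution with density $\beta x^{-\beta-1}\varepsilon^{\beta}\,dx$. Since $f$ is bounded and $L$ slowly varying, we get

$$\mathbb{E}\Big[f\Big(\frac{Y}{t}\Big)\mathbf{1}_{\{Y \ge \varepsilon t\}}\Big] = \mathbb{P}(Y_t \ge \varepsilon)\,\mathbb{E}\big[f(Y_t) \mid Y_t \ge \varepsilon\big] \sim L(\varepsilon t)\,\varepsilon^{-\beta} t^{-\beta} \int_\varepsilon^\infty f(y)\,\beta y^{-\beta-1}\varepsilon^{\beta}\,dy \sim \beta\,L(t)\,t^{-\beta} \int_\varepsilon^\infty f(y)\,y^{-\beta-1}\,dy.$$

Finally, by assumption, for some constant $c > 0$,

$$\mathbb{E}\Big[f\Big(\frac{Y}{t}\Big)\mathbf{1}_{\{Y \le \varepsilon t\}}\Big] \le c\,t^{-\beta-\delta}\,\mathbb{E}\big[Y^{\beta+\delta}\mathbf{1}_{\{Y \le \varepsilon t\}}\big].$$

Thus by Lemma C.1, for some new constant $c > 0$ and all $t \ge 1/\varepsilon$,

$$\mathbb{E}\Big[f\Big(\frac{Y}{t}\Big)\mathbf{1}_{\{Y \le \varepsilon t\}}\Big] \le c\,t^{-\beta-\delta} L(\varepsilon t)(\varepsilon t)^{\delta} = c\,t^{-\beta} L(t)\,\varepsilon^{\delta}\,\frac{L(\varepsilon t)}{L(t)}.$$

We may thus conclude by letting $t$ tend to infinity and then $\varepsilon$ to 0.

Lemma 4.8. Let $S$ be a random variable with Laplace transform (4.4). There exists a constant $c_0 > 0$ such that as $t$ goes to infinity,

$$\mathbb{E}S^{\beta}\mathbf{1}_{\{S \le t\}} = \ln t + c_0 + o(1).$$

Proof. Let $g_\beta$ be the density function of $S$. From Eq. (2.4.8) in Zolotarev [56], $g_\beta$ has a convergent power series representation

$$g_\beta(x) = \frac{1}{\pi} \sum_{n=1}^{\infty} (-1)^{n-1}\,\frac{\Gamma(n\beta + 1)\,\Gamma(1-\beta)^{n}}{\Gamma(n+1)}\,\sin(\pi n\beta)\,x^{-n\beta-1}.$$

The Stirling formula $\Gamma(x) \sim_{x\to\infty} \sqrt{2\pi/x}\,(x/e)^{x}$ implies that the convergence radius of the series is $+\infty$. Recall that $\Gamma(\beta+1) = \beta\Gamma(\beta)$, and the Euler reflection formula, $\Gamma(1-\beta)\Gamma(\beta)\sin(\pi\beta)/\pi = 1$. Thus, as $x$ goes to infinity,

$$g_\beta(x) = \beta x^{-\beta-1} + O(x^{-2\beta-1}).$$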

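The next lemma is a consequence of the Karamata Tauberian theorem.

Lemma 4.9. As $t$ goes to infinity,

$$\mathbb{P}(SS' \ge t) \sim \beta\,t^{-\beta}\ln t,$$

and, with $c_1 = \beta^2 \int_0^{+\infty} (x+1)^{-2} x^{-\beta}\,dx$,

$$\mathbb{E}\,\frac{SS'}{(t + SS')^2} \sim c_1\,t^{-1-\beta}\ln t.$$

Proof. Let $x > 0$. Since $S$ and $S'$ are independent we have

$$\mathbb{E}\exp(-xSS') = \mathbb{E}\exp\big(-\Gamma(1-\beta)\,x^{\beta} S^{\beta}\big).$$

From Corollary 8.1.7 in [9], we have as $t$ goes to infinity, $\mathbb{P}(S > t) \sim t^{-\beta}$. In particular, we have $\mathbb{P}(S^{\beta} > t) \sim t^{-1}$ and a new application of Corollary 8.1.7 in [9] gives as $x \downarrow 0$,

$$1 - \mathbb{E}\exp(-xS^{\beta}) \sim x\ln x^{-1}.$$

We obtain

$$1 - \mathbb{E}\exp(-xSS') \sim \Gamma(1-\beta)\,x^{\beta}\ln\big((\Gamma(1-\beta)\,x^{\beta})^{-1}\big) \sim \beta\,\Gamma(1-\beta)\,x^{\beta}\ln(x^{-1}).$$

We then conclude by a third application of Corollary 8.1.7 in [9]. The second statement is a consequence of Lemma 4.7.

The next lemma gives the asymptotic behavior of $y_*(r)$ as $r$ goes to infinity.

Lemma 4.10. There exists a constant $c_2 > 0$ such that as $r$ goes to infinity,

$$y_*(r) \sim c_2\,\sqrt{r}\,e^{-r^{\beta}/2}.$$

Proof. From Eqs. (4.5), (4.8), we have with $y_* = y_*(r)$,

$$1 = \frac{1}{\Gamma(\beta)} \int x^{\beta-1}\,\mathbb{E}\exp\Big(-\frac{xr}{S} - x\,y_*^2 S'\Big)\,dx = \frac{1}{\Gamma(\beta)} \int x^{\beta-1} e^{-x^{\beta} y_*^{2\beta}\Gamma(1-\beta)}\,\mathbb{E}e^{-\frac{xr}{S}}\,dx$$
$$= \frac{1}{\Gamma(1+\beta)\Gamma(1-\beta)\,y_*^{2\beta}} \int e^{-x}\,\mathbb{E}\exp\Big(-\frac{x^{1/\beta}\,r\,y_*^{-2}}{S\,\Gamma(1-\beta)^{1/\beta}}\Big)\,dx. \tag{4.15}$$

By Lemma 4.4, $\lim_{r\to\infty} y_*(r) = 0$. Hence, from the above expression, we deduce that the term $r\,y_*^{-2}$ goes to infinity as $r$ goes to infinity. Define

$$I(y) = \frac{1}{\Gamma(1+\beta)\Gamma(1-\beta)} \int e^{-x} \exp\Big(-\frac{x^{1/\beta}}{y\,\Gamma(1-\beta)^{1/\beta}}\Big)\,dx = I_0(y) + I_1(y) + I_2(y),$$

with

$$I_0(y) = I(y)\,\mathbf{1}_{\{y \ge 1\}},$$
$$I_1(y) = \frac{\mathbf{1}_{\{y \le 1\}}}{\Gamma(1+\beta)\Gamma(1-\beta)} \int \exp\Big(-\frac{x^{1/\beta}}{y\,\Gamma(1-\beta)^{1/\beta}}\Big)\,dx = y^{\beta}\,\mathbf{1}_{\{y \le 1\}},$$
$$I_2(y) = \frac{\mathbf{1}_{\{y \le 1\}}}{\Gamma(1+\beta)\Gamma(1-\beta)} \int (e^{-x} - 1) \exp\Big(-\frac{x^{1/\beta}}{y\,\Gamma(1-\beta)^{1/\beta}}\Big)\,dx.$$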

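The function $I$ is increasing and $\lim_{y\to\infty} I(y) < \infty$. Also, the function $I_0$ is equal to 0 in a neighborhood of 0. By Lemma 4.7, we get as $t$ goes to infinity,

$$\mathbb{E}I_0(S/t) \sim a_0\,t^{-\beta},$$

for some positive constant $a_0 = \frac{1}{\Gamma(1+\beta)\Gamma(1-\beta)} \int_1^{+\infty}\!\!\int e^{-x} \exp\big(-\frac{x^{1/\beta}}{y\,\Gamma(1-\beta)^{1/\beta}}\big)\,\beta y^{-\beta-1}\,dx\,dy$. By Lemma 4.8,

$$\mathbb{E}[I_1(S/t)] = t^{-\beta}\ln t + c_0\,t^{-\beta} + o(t^{-\beta}).$$

Also, from the Laplace method, $I_2(y) \sim -\Gamma(2\beta)\Gamma(1-\beta)^2\,y^{2\beta}$ as $y$ goes to 0. By Lemma 4.7,

$$\mathbb{E}I_2(S/t) \sim a_2\,t^{-\beta},$$

with $a_2 = \frac{1}{\Gamma(1+\beta)\Gamma(1-\beta)} \int_0^{1}\!\!\int (e^{-x} - 1) \exp\big(-\frac{x^{1/\beta}}{y\,\Gamma(1-\beta)^{1/\beta}}\big)\,\beta y^{-\beta-1}\,dx\,dy$. Hence, for $t = r\,y_*^{-2}$, we get from (4.15),

$$y_*^{2\beta} = (r y_*^{-2})^{-\beta}\ln(r y_*^{-2}) + (c_0 + a_0 + a_2)(r y_*^{-2})^{-\beta} + o\big((r y_*^{-2})^{-\beta}\big).$$

In other words,

$$r^{\beta} = \ln(r y_*^{-2}) + (c_0 + a_0 + a_2) + o(1).$$

We conclude by setting $c_2 = \exp((c_0 + a_0 + a_2)/2)$.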

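Lemma 4.11. As $r$ goes to infinity,

$$y_*'(r) \sim -c_3^{-1}\,y_*(r)\,r^{\beta-1},$$

where $c_3 = 2\int_0^{+\infty}\!\!\int_0^{+\infty} x e^{-x} \exp\big(-\frac{x^{1/\beta}}{s\,\Gamma(1-\beta)^{1/\beta}}\big)\,\beta s^{-\beta-1}\,dx\,ds \,/\, \big(\Gamma(1+\beta)\Gamma(1-\beta)\big)$.

Proof. We define

$$G(y, r) = \mathbb{E}\Big[\Big(\frac{S}{r + y^2 SS'}\Big)^{\beta}\Big] = \frac{1}{\Gamma(\beta)} \int x^{\beta-1} e^{-x^{\beta} y^{2\beta}\Gamma(1-\beta)}\,\mathbb{E}e^{-\frac{xr}{S}}\,dx.$$

From the implicit function theorem

$$y_*'(r) = -\frac{\partial_r G(y_*, r)}{\partial_y G(y_*, r)}.$$

We have

$$\partial_y G(y, r) = -\frac{2\beta\,\Gamma(1-\beta)\,y^{2\beta-1}}{\Gamma(\beta)} \int x^{2\beta-1} e^{-x^{\beta} y^{2\beta}\Gamma(1-\beta)}\,\mathbb{E}e^{-\frac{xr}{S}}\,dx$$
$$= -\frac{2}{y^{2\beta+1}\,\Gamma(1+\beta)\Gamma(1-\beta)} \int x e^{-x}\,\mathbb{E}\exp\Big(-\frac{x^{1/\beta}\,r\,y^{-2}}{S\,\Gamma(1-\beta)^{1/\beta}}\Big)\,dx.$$

The Laplace method implies that, as $t$ goes to infinity,

$$\int x e^{-x} \exp\Big(-\frac{x^{1/\beta}\,t}{\Gamma(1-\beta)^{1/\beta}}\Big)\,dx \sim \Gamma(2\beta)\Gamma(1-\beta)^2\,t^{-2\beta}.$$

Thus by Lemma 4.7, we deduce that

$$\int x e^{-x}\,\mathbb{E}\exp\Big(-\frac{x^{1/\beta}\,t}{S\,\Gamma(1-\beta)^{1/\beta}}\Big)\,dx \sim t^{-\beta} \iint x e^{-x} \exp\Big(-\frac{x^{1/\beta}}{s\,\Gamma(1-\beta)^{1/\beta}}\Big)\,\beta s^{-\beta-1}\,dx\,ds \sim c\,t^{-\beta}.$$

Applying the above to $t = r\,y_*^{-2}(r)$ we deduce, with $c_3 = 2c/(\Gamma(1+\beta)\Gamma(1-\beta))$,

$$\partial_y G(y_*, r) \sim -c_3\,r^{-\beta}\,y_*^{-1}(r).$$

Similarly, the derivative of $G$ with respect to $r$ is

$$\partial_r G(y, r) = -\frac{1}{y^{2\beta+2}\,\Gamma(1-\beta)^{1/\beta+1}\,\Gamma(1+\beta)} \int x^{1/\beta} e^{-x}\,\mathbb{E}\Big[\exp\Big(-\frac{x^{1/\beta}\,r\,y^{-2}}{S\,\Gamma(1-\beta)^{1/\beta}}\Big)\,S^{-1}\Big]\,dx.$$

Once again, the Laplace method implies that, as $t$ goes to infinity,

$$\int x^{1/\beta} e^{-x} \exp\Big(-\frac{x^{1/\beta}\,t}{\Gamma(1-\beta)^{1/\beta}}\Big)\,dx \sim \Gamma(\beta+1)\Gamma(1-\beta)^{1/\beta+1}\,t^{-\beta-1}.$$

In particular, for all $\varepsilon > 0$ there exists $t_0$ such that

$$(1-\varepsilon)\,t^{-\beta-1}\,\mathbb{E}S^{\beta}\mathbf{1}_{\{S \le t/t_0\}} \le \frac{1}{\Gamma(1-\beta)^{1/\beta+1}\Gamma(1+\beta)} \int x^{1/\beta} e^{-x}\,\mathbb{E}\Big[\exp\Big(-\frac{x^{1/\beta}\,t}{S\,\Gamma(1-\beta)^{1/\beta}}\Big)\,S^{-1}\mathbf{1}_{\{S \le t/t_0\}}\Big]\,dx \le (1+\varepsilon)\,t^{-\beta-1}\,\mathbb{E}S^{\beta}\mathbf{1}_{\{S \le t/t_0\}}.$$

By Lemma 4.8, $\mathbb{E}S^{\beta}\mathbf{1}_{\{S \le t/t_0\}} \sim \ln t$. It follows that for some $t_1 > t_0$ and all $t \ge t_1$,

$$(1-2\varepsilon)\,t^{-\beta-1}\ln t \le \frac{1}{\Gamma(1-\beta)^{1/\beta+1}\Gamma(1+\beta)} \int x^{1/\beta} e^{-x}\,\mathbb{E}\Big[\exp\Big(-\frac{x^{1/\beta}\,t}{S\,\Gamma(1-\beta)^{1/\beta}}\Big)\,S^{-1}\mathbf{1}_{\{S \le t/t_0\}}\Big]\,dx \le (1+2\varepsilon)\,t^{-\beta-1}\ln t.$$

On the other hand, since $S^{-1} \le t_0/t$ on $\{S \ge t/t_0\}$, for some constant $c > 0$ and all $t \ge 1$,

$$\int x^{1/\beta} e^{-x}\,\mathbb{E}\Big[\exp\Big(-\frac{x^{1/\beta}\,t}{S\,\Gamma(1-\beta)^{1/\beta}}\Big)\,S^{-1}\mathbf{1}_{\{S \ge t/t_0\}}\Big]\,dx \le \frac{t_0}{t} \int x^{1/\beta} e^{-x}\,dx\;\mathbb{P}(S \ge t/t_0) \le c\,t^{-\beta-1}\,t_0^{\beta+1}.$$

We thus have proved that

$$\frac{1}{\Gamma(1-\beta)^{1/\beta+1}\Gamma(1+\beta)} \int x^{1/\beta} e^{-x}\,\mathbb{E}\Big[\exp\Big(-\frac{x^{1/\beta}\,t}{S\,\Gamma(1-\beta)^{1/\beta}}\Big)\,S^{-1}\Big]\,dx \sim t^{-\beta-1}\ln t,$$

and

$$\partial_r G(y_*(r), r) \sim -r^{-\beta-1}\ln(r\,y_*^{-2}) \sim -r^{-1}.$$

The statement follows.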

Proof of Theorem 1.3. From Eq. (4.11) and Lemma 4.9, the density at $r = |z|^2$ is equivalent to $1/\pi$ times

$$\Big(1 - 2r\,\frac{y_*'(r)}{y_*(r)}\Big)\,y_*^{-2}(r)\,c_1\,(r\,y_*^{-2})^{-1-\beta}\ln(r\,y_*^{-2}).$$

It remains to apply Lemmas 4.10 and 4.11, and set the multiplicative constant to be $c = 2\pi^{-1}c_3^{-1}c_1c_2^{2\beta}$.

Appendix A. Logarithmic Potentials and Hermitization

Let $\mathcal{P}(\mathbb{C})$ be the set of probability measures on $\mathbb{C}$ which integrate $\ln|\cdot|$ in a neighborhood of infinity. For every $\mu \in \mathcal{P}(\mathbb{C})$, the logarithmic potential $U_\mu$ of $\mu$ on $\mathbb{C}$ is the function $U_\mu : \mathbb{C} \to [-\infty, +\infty)$ defined for every $z \in \mathbb{C}$ by

$$U_\mu(z) = \int_{\mathbb{C}} \ln|z - z'|\,\mu(dz') = (\ln|\cdot| * \mu)(z). \tag{A.1}$$

Note that in classical potential theory, the definition is opposite in sign, but ours turns out to be more convenient (lightweight) for our purposes. Since $\ln|\cdot|$ is Lebesgue locally integrable on $\mathbb{C}$, one can check by using the Fubini theorem that $U_\mu$ is Lebesgue locally integrable on $\mathbb{C}$. In particular, $U_\mu < \infty$ a.e. (Lebesgue almost everywhere) and $U_\mu \in \mathcal{D}'(\mathbb{C})$. Since $\ln|\cdot|$ is the fundamental solution of the Laplace equation in $\mathbb{C}$, we have, in $\mathcal{D}'(\mathbb{C})$,

$$\Delta U_\mu = 2\pi\mu. \tag{A.2}$$

Lemma A.1 (Unicity). For every $\mu, \nu \in \mathcal{P}(\mathbb{C})$, if $U_\mu = U_\nu$ a.e. then $\mu = \nu$.

Proof. Since $U_\mu = U_\nu$ in $\mathcal{D}'(\mathbb{C})$, we get $\Delta U_\mu = \Delta U_\nu$ in $\mathcal{D}'(\mathbb{C})$. Now (A.2) gives $\mu = \nu$ in $\mathcal{D}'(\mathbb{C})$, and thus $\mu = \nu$ as measures since $\mu$ and $\nu$ are Radon measures.

If $A$ is an $n \times n$ complex matrix and $P_A(z) := \det(A - zI)$ is its characteristic polynomial,

$$U_{\mu_A}(z) = \int_{\mathbb{C}} \ln|z - z'|\,\mu_A(dz') = \frac1n \ln|\det(A - zI)| = \frac1n \ln|P_A(z)|$$

for every $z \in \mathbb{C}\setminus\{\lambda_1(A), \ldots, \lambda_n(A)\}$. We have also the alternative expression

$$U_{\mu_A}(z) = \frac1n \ln\det\big(\sqrt{(A - zI)(A - zI)^*}\big) = \int_0^\infty \ln(t)\,\nu_{A - zI}(dt). \tag{A.3}$$

The identity above bridges the eigenvalues with the singular values, and is at the heart of the following lemma, which allows one to deduce the convergence of $\mu_A$ from that of $\nu_{A - zI}$. The strength of this Hermitization lies in the fact that, contrary to the eigenvalues, one can control the singular values with the entries of the matrix. The price paid here is the introduction of the auxiliary variable $z$ and the uniform integrability. We recall that on a Borel measurable space $(E, \mathcal{E})$, we say that a Borel function $f : E \to \mathbb{R}$ is uniformly integrable for a sequence of probability measures $(\eta_n)_{n\ge 1}$ on $E$ when

$$\lim_{t\to\infty} \limsup_{n\to\infty} \int_{\{|f| > t\}} |f|\,d\eta_n = 0.$$

We will use this property as follows: if $\eta_n \to \eta$ weakly and $f$ is continuous and uniformly integrable for $(\eta_n)_{n\ge 1}$ then $f$ is $\eta$-integrable and $\lim_{n\to\infty} \int f\,d\eta_n = \int f\,d\eta$. Similarly for a sequence of random probability measures $(\eta_n)_{n\ge 1}$ we will say that $f$ is uniformly integrable for $(\eta_n)_{n\ge 1}$ in probability, if for all $\varepsilon > 0$,

$$\lim_{t\to\infty} \limsup_{n\to\infty} \mathbb{P}\Big(\int_{\{|f| > t\}} |f|\,d\eta_n > \varepsilon\Big) = 0.$$

A proof of Lemma A.2 below can be found in [12] which covers the “a.s.” case, the “in probability” case being similar. It relies only on the unicity Lemma A.1, the classical Prohorov theorem, and the Weyl inequalities of Lemma B.5 linking eigenvalues and singular values.

Lemma A.2 (Girko’s Hermitization method). Let $(A_n)_{n\ge 1}$ be a sequence of complex random matrices where $A_n$ is $n \times n$ for every $n \ge 1$. Suppose that for Lebesgue almost all $z \in \mathbb{C}$, there exists a probability measure $\nu_z$ on $[0, \infty)$ such that

(i) a.s. $(\nu_{A_n - zI})_{n\ge 1}$ tends weakly to $\nu_z$,
(ii) a.s. (resp. in probability) $\ln(\cdot)$ is uniformly integrable for $(\nu_{A_n - zI})_{n\ge 1}$.

Then there exists a probability measure $\mu \in \mathcal{P}(\mathbb{C})$ such that

(j) a.s. (resp. in probability) $(\mu_{A_n})_{n\ge 1}$ converges weakly to $\mu$,
(jj) for a.a. $z \in \mathbb{C}$,

$$U_\mu(z) = \int_0^\infty \ln(t)\,\nu_z(dt).$$
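The identity (A.3) behind this lemma can be checked by hand on a small example. The sketch below is an illustration only: it is restricted to $2 \times 2$ complex matrices, uses only the Python standard library, and all function names are ours. It evaluates the logarithmic potential of $\mu_A$ at a point $z$ in three ways: from the eigenvalues, from the determinant, and from the singular values of $A - zI$.

```python
import cmath
import math


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


def singular_values_2x2(a, b, c, d):
    """Singular values of M = [[a, b], [c, d]]: square roots of the eigenvalues of M M^*."""
    frob2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2  # trace of M M^*
    det2 = abs(a * d - b * c) ** 2                                   # determinant of M M^*
    disc = math.sqrt(max(frob2 * frob2 - 4 * det2, 0.0))
    return math.sqrt((frob2 + disc) / 2), math.sqrt(max((frob2 - disc) / 2, 0.0))


def log_potential_three_ways(A, z):
    """Return U_{mu_A}(z) computed from eigenvalues, determinant and singular values."""
    (a, b), (c, d) = A
    n = 2
    via_eigs = sum(math.log(abs(lam - z)) for lam in eigenvalues_2x2(a, b, c, d)) / n
    via_det = math.log(abs((a - z) * (d - z) - b * c)) / n
    via_sing = sum(math.log(s) for s in singular_values_2x2(a - z, b, c, d - z)) / n
    return via_eigs, via_det, via_sing


if __name__ == "__main__":
    A = ((1 + 2j, 3 + 0j), (-1j, 0.5 + 0j))
    print(log_potential_three_ways(A, 0.3 - 0.7j))
```

The three values coincide up to rounding whenever $z$ is not an eigenvalue of $A$. Only the third expression involves a Hermitian object, which is the point of the Hermitization.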

Appendix B. General Spectral Estimates

Lemma B.1 (Basic inequalities [32]). If $A$ and $B$ are $n \times n$ complex matrices then

$$s_1(AB) \le s_1(A)s_1(B) \quad\text{and}\quad s_1(A + B) \le s_1(A) + s_1(B) \tag{B.1}$$

and

$$\max_{1\le i\le n} |s_i(A) - s_i(B)| \le s_1(A - B). \tag{B.2}$$

Lemma B.2 (Rudelson-Vershynin row bound [12,45]). Let $A$ be a complex $n \times n$ matrix with rows $R_1, \ldots, R_n$. Define the vector space $R_{-i} := \mathrm{span}\{R_j\,;\,j \ne i\}$. We have then

$$n^{-1/2} \min_{1\le i\le n} \mathrm{dist}(R_i, R_{-i}) \le s_n(A) \le \min_{1\le i\le n} \mathrm{dist}(R_i, R_{-i}).$$

Recall that the singular values $s_1(A), \ldots, s_{n'}(A)$ of a rectangular $n' \times n$ complex matrix $A$ with $n' \le n$ are defined by $s_i(A) := \lambda_i(\sqrt{AA^*})$ for every $1 \le i \le n'$.

Lemma B.3 (Tao-Vu negative second moment [50, Lemma A4]). If $A$ is a full rank $n' \times n$ complex matrix ($n' \le n$) with rows $R_1, \ldots, R_{n'}$, and $R_{-i} := \mathrm{span}\{R_j\,;\,j \ne i\}$, then

$$\sum_{i=1}^{n'} s_i(A)^{-2} = \sum_{i=1}^{n'} \mathrm{dist}(R_i, R_{-i})^{-2}.$$
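As an illustration of Lemma B.3 in the smallest rectangular case, the following sketch takes a real matrix with $n' = 2$ rows and compares both sides of the identity. It is a toy check under our own assumptions, not part of the proof: the singular values come from the $2 \times 2$ Gram matrix, the distances from an explicit orthogonal projection, and the helper names are ours.

```python
import math


def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))


def dist_to_line(u, v):
    """Distance from u to span{v} (v nonzero), obtained by removing the projection on v."""
    coef = dot(u, v) / dot(v, v)
    residual = [x - coef * y for x, y in zip(u, v)]
    return math.sqrt(dot(residual, residual))


def singular_values_two_rows(r1, r2):
    """Singular values of the 2 x n matrix with rows r1, r2, via its Gram matrix."""
    g11, g12, g22 = dot(r1, r1), dot(r1, r2), dot(r2, r2)
    tr, det = g11 + g22, g11 * g22 - g12 * g12
    disc = math.sqrt(tr * tr - 4 * det)
    return math.sqrt((tr + disc) / 2), math.sqrt((tr - disc) / 2)


def negative_second_moments(r1, r2):
    """Return (sum of s_i^-2, sum of dist(R_i, R_-i)^-2) for the matrix with rows r1, r2."""
    s1, s2 = singular_values_two_rows(r1, r2)
    lhs = s1 ** -2 + s2 ** -2
    rhs = dist_to_line(r1, r2) ** -2 + dist_to_line(r2, r1) ** -2
    return lhs, rhs


if __name__ == "__main__":
    print(negative_second_moments([1.0, 2.0, -1.0], [0.5, -1.0, 3.0]))
```

For linearly independent rows the two numbers agree up to rounding. In Sect. 3.4 the same identity is what converts control of row-to-subspace distances into control of small singular values.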

Lemma B.4 (Cauchy interlacing by rows deletion [32]). Let $A$ be an $n \times n$ complex matrix. If $B$ is $n' \times n$, obtained from $A$ by deleting $n - n'$ rows, then for every $1 \le i \le n'$,

$$s_i(A) \ge s_i(B) \ge s_{i+n-n'}(A).$$

Lemma B.5 (Weyl inequalities [53]). For every $n \times n$ complex matrix $A$, we have

$$\prod_{i=1}^{k} |\lambda_i(A)| \le \prod_{i=1}^{k} s_i(A) \quad\text{and}\quad \prod_{i=k}^{n} s_i(A) \le \prod_{i=k}^{n} |\lambda_i(A)| \tag{B.3}$$

for all $1 \le k \le n$. In particular, by viewing $|\det(A)|$ as a volume,

$$|\det(A)| = \prod_{k=1}^{n} |\lambda_k(A)| = \prod_{k=1}^{n} s_k(A) = \prod_{k=1}^{n} \mathrm{dist}(R_k, \mathrm{span}\{R_1, \ldots, R_{k-1}\}), \tag{B.4}$$

where $R_1, \ldots, R_n$ are the rows of $A$. Moreover, for every increasing function $\varphi$ from $(0, \infty)$ to $(0, \infty)$ such that $t \mapsto \varphi(e^t)$ is convex on $(0, \infty)$ and $\varphi(0) := \lim_{t\to 0^+} \varphi(t) = 0$, we have

$$\sum_{i=1}^{k} \varphi(|\lambda_i(A)|^2) \le \sum_{i=1}^{k} \varphi(s_i(A)^2) \tag{B.5}$$

for every $1 \le k \le n$. In particular, with $\varphi(t) = t^{r/2}$, $r > 0$, and $k = n$, we obtain

$$\sum_{k=1}^{n} |\lambda_k(A)|^{r} \le \sum_{k=1}^{n} s_k(A)^{r}. \tag{B.6}$$
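A non-normal example shows how far apart the two sides of (B.3) and (B.6) can be while the products in (B.4) still agree. The sketch below is a toy check with names of our choosing: it takes the real upper triangular matrix $[[a, b], [0, d]]$, whose eigenvalues are the diagonal entries, and compares them with its singular values.

```python
import math


def singular_values_upper_triangular(a, b, d):
    """Singular values (largest first) of the real matrix [[a, b], [0, d]]."""
    frob2 = a * a + b * b + d * d       # trace of M M^T
    det2 = (a * d) ** 2                 # determinant of M M^T
    disc = math.sqrt(frob2 * frob2 - 4.0 * det2)
    return math.sqrt((frob2 + disc) / 2.0), math.sqrt((frob2 - disc) / 2.0)


def weyl_report(a, b, d, r):
    """Return (prod |lambda|, prod s, sum |lambda|^r, sum s^r, max |lambda|, s_1)."""
    lam = sorted((abs(a), abs(d)), reverse=True)  # eigenvalues of a triangular matrix
    s1, s2 = singular_values_upper_triangular(a, b, d)
    return (lam[0] * lam[1], s1 * s2,
            lam[0] ** r + lam[1] ** r, s1 ** r + s2 ** r,
            lam[0], s1)


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        print(r, weyl_report(1.0, 5.0, 1.0, r))
```

With $a = d = 1$ and $b = 5$ both eigenvalues equal 1, while the largest singular value exceeds 5 and the smallest is below 0.2. The products coincide, as (B.4) requires, and the power sums obey (B.6) with a large gap.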

Lemma B.6 (Schatten bound [55, proof of Theorem 3.32]). Let $A$ be an $n \times n$ complex matrix with rows $R_1, \ldots, R_n$. Then for every $0 < r \le 2$,

$$\sum_{k=1}^{n} s_k(A)^{r} \le \sum_{k=1}^{n} \|R_k\|_2^{r}. \tag{B.7}$$

Appendix C. Additional Lemmas

We begin with a lemma on truncated moments. We skip the proof since it follows from an adaptation of the proof in the real case given by e.g. Feller [23, Theorem VIII.9.2].

Lemma C.1 (Truncated moments). If (H1) holds then for every $p > \alpha$,

$$\mathbb{E}\big[|X_{11}|^{p}\,\mathbf{1}_{\{|X_{11}| \le t\}}\big] \sim c(p)\,L(t)\,t^{p-\alpha},$$

where $c(p) := \alpha/(p - \alpha)$. In particular, we have

$$\mathbb{E}\big[|X_{11}|^{p}\,\mathbf{1}_{\{|X_{11}| \le a_n\}}\big] \sim c(p)\,\frac{a_n^{p}}{n}.$$
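To see the constant $c(p) = \alpha/(p - \alpha)$ at work, one can take the simplest heavy-tailed example, a Pareto variable with $\mathbb{P}(X > x) = x^{-\alpha}$ for $x \ge 1$, so that $L \equiv 1$. This choice of law is our assumption for illustration and is not made in the paper. The sketch below integrates the truncated moment numerically and compares it with the prediction of Lemma C.1.

```python
import math


def truncated_moment_pareto(alpha, p, t, steps=100_000):
    """E[X^p 1{X <= t}] for P(X > x) = x^(-alpha), x >= 1, by the midpoint rule in u = ln x."""
    h = math.log(t) / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        # density alpha * x^(-alpha-1) dx with x = e^u and dx = e^u du
        total += alpha * math.exp((p - alpha) * u) * h
    return total


def karamata_prediction(alpha, p, t):
    """Leading-order value c(p) * t^(p - alpha) with c(p) = alpha / (p - alpha)."""
    return alpha / (p - alpha) * t ** (p - alpha)


if __name__ == "__main__":
    alpha, p = 1.5, 2.0
    for t in (1e2, 1e4, 1e6):
        ratio = truncated_moment_pareto(alpha, p, t) / karamata_prediction(alpha, p, t)
        print(t, ratio)
```

For this law the ratio equals $1 - t^{-(p-\alpha)}$ exactly, so it tends to 1 as $t$ grows, in agreement with the lemma.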

We end this section with a result on the concentration of the spectral measure of Hermitian or Hermitized random matrices, mentioned in [13]. The total variation norm of $f : \mathbb{R} \to \mathbb{R}$ is

$$\|f\|_{TV} := \sup \sum_{k\in\mathbb{Z}} |f(x_{k+1}) - f(x_k)|,$$

where the supremum runs over all sequences $(x_k)_{k\in\mathbb{Z}}$ such that $x_{k+1} \ge x_k$ for any $k \in \mathbb{Z}$. If $f = \mathbf{1}_{(-\infty,s]}$ for some real $s$ then $\|f\|_{TV} = 1$, while if $f$ has a derivative in $L^1(\mathbb{R})$, we get

$$\|f\|_{TV} = \int_{\mathbb{R}} |f'(t)|\,dt.$$

The following lemma comes with remarkably weak assumptions, and allows one to deduce the almost sure weak convergence of empirical spectral measures of random matrices without any moment assumptions on the entries. We discovered that this lemma was obtained independently by Guntuboyina and Leeb in [30], where they discuss the relationships with more classical results.

Lemma C.2 (Concentration for spectral measures). Let $H$ be an $n \times n$ random Hermitian matrix. Let us assume that the vectors $(H_i)_{1\le i\le n}$, where $H_i := (H_{ij})_{1\le j\le i} \in \mathbb{C}^i$, are independent. Then for any $f : \mathbb{R} \to \mathbb{R}$ such that $\|f\|_{TV} \le 1$ and $\mathbb{E}\big|\int f\,d\mu_H\big| < \infty$, and every $t \ge 0$,

$$\mathbb{P}\Big(\Big|\int f\,d\mu_H - \mathbb{E}\int f\,d\mu_H\Big| \ge t\Big) \le 2\exp\Big(-\frac{nt^2}{2}\Big).$$

Similarly, if $M$ is an $n \times n$ complex random matrix with independent rows (or with independent columns) then for any $f : \mathbb{R} \to \mathbb{R}$ with $\|f\|_{TV} \le 1$ and every $t \ge 0$,

$$\mathbb{P}\Big(\Big|\int f\,d\nu_M - \mathbb{E}\int f\,d\nu_M\Big| \ge t\Big) \le 2\exp\big(-2nt^2\big).$$

Proof. We prove only the Hermitian version, the non-Hermitian version being entirely similar. Let us start by showing that for every $n \times n$ deterministic Hermitian matrices $A$ and $B$ and any measurable function $f$ with $\|f\|_{TV} = 1$,

$$\Big|\int f\,d\mu_A - \int f\,d\mu_B\Big| \le \frac{\mathrm{rank}(A - B)}{n}. \tag{C.1}$$

Indeed, it is well known (follows from interlacing, see e.g. [51] or [5, Theorem 11.42]) that

$$\|F_A - F_B\|_\infty \le \frac{\mathrm{rank}(A - B)}{n},$$

where $F_A$ and $F_B$ are the cumulative distribution functions of $\mu_A$ and $\mu_B$ respectively. Now if $f$ is smooth, we get, by integrating by parts,

$$\Big|\int f\,d\mu_A - \int f\,d\mu_B\Big| = \Big|\int_{\mathbb{R}} f'(t)F_A(t)\,dt - \int_{\mathbb{R}} f'(t)F_B(t)\,dt\Big| \le \frac{\mathrm{rank}(A - B)}{n} \int_{\mathbb{R}} |f'(t)|\,dt,$$


and since the left-hand side depends on at most $2n$ points, we get (C.1) by approximating $f$ by smooth functions. Next, for any $x = (x_1, \ldots, x_n) \in \mathcal{X} := \{(x_i)_{1 \le i \le n} : x_i \in \mathbb{C}^{i-1} \times \mathbb{R}\}$, let $H(x)$ be the $n \times n$ Hermitian matrix given by $H(x)_{ij} := x_{i,j}$ for $1 \le j \le i \le n$. We have $\mu_H = \mu_{H(H_1, \ldots, H_n)}$. For all $x \in \mathcal{X}$ and $x_i' \in \mathbb{C}^{i-1} \times \mathbb{R}$, the matrix
$$H(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n) - H(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_n)$$
has only the $i$-th row and column possibly different from 0, and thus
$$\operatorname{rank}\big( H(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n) - H(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_n) \big) \le 2.$$
Therefore from (C.1), we obtain, for every $f : \mathbb{R} \to \mathbb{R}$ with $\|f\|_{TV} \le 1$,
$$\Big| \int f\, d\mu_{H(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n)} - \int f\, d\mu_{H(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_n)} \Big| \le \frac{2}{n}.$$
The desired result follows now from the Azuma–Hoeffding inequality, see e.g. [38, Lemma 1.2].
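The rank inequality behind (C.1) can be illustrated in the simplest special case. The sketch below assumes diagonal matrices, where the eigenvalues are the diagonal entries and the rank of the difference is the number of changed entries. The helper name `cdf_sup_distance` and the sample spectra are hypothetical; this is not the general interlacing argument of [51].

```python
def cdf_sup_distance(eigs_a, eigs_b):
    """sup_t |F_A(t) - F_B(t)| for two spectra of the same size n."""
    n = len(eigs_a)
    # Both CDFs are right-continuous step functions, so it suffices to
    # evaluate the difference at the jump points.
    points = sorted(set(eigs_a) | set(eigs_b))
    best = 0.0
    for t in points:
        fa = sum(1 for e in eigs_a if e <= t) / n
        fb = sum(1 for e in eigs_b if e <= t) / n
        best = max(best, abs(fa - fb))
    return best


# A = diag(0, 1, ..., 9); B differs from A in two diagonal entries,
# so rank(A - B) = 2 and n = 10.
diag_a = [float(k) for k in range(10)]
diag_b = [100.0, 200.0] + diag_a[2:]
rank_diff = sum(1 for a, b in zip(diag_a, diag_b) if a != b)
distance = cdf_sup_distance(diag_a, diag_b)
```

Here the bound is attained: the distance between the two empirical distribution functions is 2/10, matching rank(A − B)/n.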

References
1. Aldous, D.: Asymptotics in the random assignment problem. Probab. Th. Rel. Fields 93(4), 507–534 (1982)
2. Aldous, D., Lyons, R.: Processes on unimodular random networks. Electron. J. Probab. 12(54), 1454–1508 (2007) (electronic)
3. Aldous, D., Steele, J.M.: The objective method: probabilistic combinatorial optimization and local weak convergence. Probability on discrete structures, Encyclopaedia Math. Sci., Vol. 110, Berlin: Springer, 2004, pp. 1–72
4. Bai, Z.D.: Circular law. Ann. Probab. 25(1), 494–529 (1997)
5. Bai, Z.D., Silverstein, J.W.: Spectral Analysis of Large Dimensional Random Matrices. Mathematics Monograph Series 2, Beijing: Science Press, 2006
6. Belinschi, S., Dembo, A., Guionnet, A.: Spectral measure of heavy tailed band and covariance random matrices. Commun. Math. Phys. 289(3), 1023–1055 (2009)
7. Ben Arous, G., Guionnet, A.: The spectrum of heavy tailed random matrices. Commun. Math. Phys. 278(3), 715–751 (2008)
8. Benjamini, I., Schramm, O.: Recurrence of distributional limits of finite planar graphs. Electron. J. Probab. 6(23), 13 (2001) (electronic)
9. Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular variation. Encyclopedia of Mathematics and its Applications, Vol. 27, Cambridge: Cambridge University Press, 1989
10. Bordenave, Ch., Caputo, P., Chafaï, D.: Spectrum of large random reversible Markov chains: two examples. ALEA Lat. Am. J. Probab. Math. Stat. 7, 41–64 (2010)
11. Bordenave, Ch., Caputo, P., Chafaï, D.: Spectrum of large random reversible Markov chains: heavy tailed weights on the complete graph. http://arXiv.org/abs/0903.3528v4 Ann. Prob. 39(4), 1544–1590 (2011)
12. Bordenave, Ch., Caputo, P., Chafaï, D.: Circular Law Theorem for Random Markov Matrices. Prob. Th. Rel. Fields, doi:10.1007/s00440-010-0336-1, 2011
13. Bordenave, Ch., Lelarge, M., Salez, J.: The rank of diluted random graphs. Ann. Prob. 39(3), 1097–1121 (2011)
14. Bouchaud, J., Cizeau, P.: Theory of Lévy matrices. Phys. Rev. E 3, 1810–1822 (1994)
15. Brown, L.G.: Lidskiĭ’s theorem in the type II case. In: Geometric methods in operator algebras (Kyoto, 1983), Pitman Res. Notes Math. Ser., Vol. 123, Harlow: Longman Sci. Tech., 1986, pp. 1–35
16. Chafaï, D.: Aspects of large random Markov kernels. Stochastics 81(3-4), 415–429 (2009)
17. Chafaï, D.: Circular law for noncentral random matrices. J. Theoret. Probab. 23(4), 945–950 (2010)
18. Chafaï, D.: The Dirichlet Markov ensemble. J. Multivariate Anal. 101(3), 555–567 (2010)
19. Dozier, R.B., Silverstein, J.W.: Analysis of the limiting spectral distribution of large dimensional information-plus-noise type matrices. J. Multivariate Anal. 98(6), 1099–1122 (2007)


20. Dozier, R.B., Silverstein, J.W.: On the empirical distribution of eigenvalues of large dimensional information-plus-noise-type matrices. J. Multivariate Anal. 98(4), 678–694 (2007)
21. Edelman, A.: The probability that a random real Gaussian matrix has k real eigenvalues, related distributions, and the circular law. J. Multivariate Anal. 60(2), 203–232 (1997)
22. Feinberg, J., Zee, A.: Non-Hermitian random matrix theory: Method of Hermitian reduction. Nucl. Phys. B 504(3), 579–608 (1997)
23. Feller, W.: An introduction to probability theory and its applications. Vol. II. Second edition, New York: John Wiley & Sons Inc., 1971
24. Girko, V.L.: The circular law. Teor. Veroyatnost. i Primenen. 29(4), 669–679 (1984)
25. Girko, V.L.: Strong circular law. Random Oper. Stochastic Eqs. 5(2), 173–196 (1997)
26. Girko, V.L.: The circular law. Twenty years later. III. Random Oper. Stochastic Eqs. 13(1), 53–109 (2005)
27. Goldsheid, I.Y., Khoruzhenko, B.A.: The Thouless formula for random non-Hermitian Jacobi matrices. Israel J. Math. 148, 331–346 (2005)
28. Götze, F., Tikhomirov, A.: The Circular Law for Random Matrices. Ann. Probab. 38(4), 1444–1491 (2010)
29. Gudowska-Nowak, E., Jarosz, A., Nowak, M., Pappe, G.: Towards non-Hermitian random Lévy matrices. Acta Physica Polonica B 38(13), 4089–4104 (2007)
30. Guntuboyina, A., Leeb, H.: Concentration of the spectral measure of large Wishart matrices with dependent entries. Electron. Commun. Probab. 14, 334–342 (2009)
31. Haagerup, U., Schultz, H.: Brown measures of unbounded operators affiliated with a finite von Neumann algebra. Math. Scand. 100(2), 209–263 (2007)
32. Horn, R.A., Johnson, Ch.R.: Topics in matrix analysis. Cambridge: Cambridge University Press, 1994 (corrected reprint of the 1991 original)
33. Hwang, C.-R.: A brief survey on the spectral radius and the spectral distribution of large random matrices with i.i.d. entries. In: Random matrices and their applications (Brunswick, Maine, 1984), Contemp. Math., Vol. 50, Providence, RI: Amer. Math. Soc., 1986, pp. 145–152
34. Ledoux, M.: The concentration of measure phenomenon. Mathematical Surveys and Monographs, Vol. 89, Providence, RI: Amer. Math. Soc., 2001
35. LePage, R., Woodroofe, M., Zinn, J.: Convergence to a stable distribution via order statistics. Ann. Probab. 9(4), 624–632 (1981)
36. Lyons, R.: Identities and Inequalities for Tree Entropy. Combin. Probab. Comput. 19(2), 303–313 (2010)
37. Marchenko, V.A., Pastur, L.A.: The distribution of eigenvalues in certain sets of random matrices. Mat. Sb. 72, 507–536 (1967)
38. McDiarmid, C.: On the method of bounded differences. Surveys in combinatorics (Norwich, 1989), London Math. Soc. Lecture Note Ser., Vol. 141, Cambridge: Cambridge Univ. Press, 1989, pp. 148–188
39. Mehta, M.L.: Random matrices and the statistical theory of energy levels. New York: Academic Press, 1967
40. Pan, G.M., Zhou, W.: Circular law, extreme singular values and potential theory. J. Multivar. Anal. 101(3), 645–656 (2010)
41. Panchenko, D., Talagrand, M.: On one property of Derrida-Ruelle cascades. C. R. Math. Acad. Sci. Paris 345(11), 653–656 (2007)
42. Reed, M., Simon, B.: Methods of modern mathematical physics. I. Second ed., New York: Academic Press Inc. [Harcourt Brace Jovanovich Publishers], 1980
43. Rogers, T.: Universal sum and product rules for random matrices. J. Math. Phys. 51, 093304 (2010)
44. Rogers, T., Castillo, I.P.: Cavity approach to the spectral density of non-Hermitian sparse matrices. Phys. Rev. E 79, 012101 (2009)
45. Rudelson, M., Vershynin, R.: The Littlewood-Offord problem and invertibility of random matrices. Adv. Math. 218(2), 600–633 (2008)
46. Talagrand, M.: Concentration of measure and isoperimetric inequalities in product spaces. Inst. Hautes Études Sci. Publ. Math. 81(1), 73–205 (1995)
47. Tao, T.: Outliers in the spectrum of iid matrices with bounded rank perturbations. http://arXiv.org/abs/1012.4818v3 [math.PR], 2011
48. Tao, T., Vu, V.: Random matrices: the circular law. Commun. Contemp. Math. 10(2), 261–307 (2008)
49. Tao, T., Vu, V.: Smooth analysis of the condition number and the least singular value. Math. Comp. 79(272), 2333–2352 (2010)
50. Tao, T., Vu, V.: Random matrices: universality of ESDs and the circular law, with an appendix by Manjunath Krishnapur. Ann. Probab. 38(5), 2023–2065 (2010)
51. Thompson, R.C.: The behavior of eigenvalues and singular values under perturbations of restricted rank. Linear Algebra and Appl. 13(1/2), 69–78 (1976) (collection of articles dedicated to Olga Taussky Todd)
52. Wachter, K.W.: The strong limits of random matrix spectra for sample matrices of independent elements. Ann. Prob. 6(1), 1–18 (1978)
53. Weyl, H.: Inequalities between the two kinds of eigenvalues of a linear transformation. Proc. Nat. Acad. Sci. U. S. A. 35, 408–411 (1949)


54. Yin, Y.Q.: Limiting spectral distribution for a class of random matrices. J. Multivariate Anal. 20(1), 50–68 (1986)
55. Zhan, X.: Matrix inequalities. Lecture Notes in Mathematics, Vol. 1790, Berlin: Springer-Verlag, 2002
56. Zolotarev, V.M.: One-dimensional stable distributions. In: Translations of Mathematical Monographs, Vol. 65, Providence, RI: Amer. Math. Soc., 1986, Translated from the Russian by H. H. McFaden, Translation edited by Ben Silver

Communicated by P. Forrester

Commun. Math. Phys. 307, 561–563 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1333-7


Erratum

Erratum to: Diffusion at the Random Matrix Hard Edge

José A. Ramírez (1), Brian Rider (2)

(1) Department of Mathematics, Universidad de Costa Rica, San Jose 2060, Costa Rica. E-mail: [email protected]
(2) Department of Mathematics, University of Colorado at Boulder, Boulder, CO 80309, USA. E-mail: [email protected]

Received: 23 May 2011 / Accepted: 9 June 2011
Published online: 27 August 2011 – © Springer-Verlag 2011

Commun. Math. Phys. 288, 887–906 (2009)

As stated, Theorem 2 of [2] is valid only for $a \ge 0$. While this restriction does not affect the main result of [2], we would still like to communicate the correct statement of Theorem 2 and its proof.

The starting point remains Theorem 1 of [2]. There it was proved that the scaled limiting points of the $(\beta, a)$-Laguerre ensemble¹ correspond to the eigenvalues of $G_{\beta,a}$, where $-G_{\beta,a}$ generates the diffusion process with random speed and scale measures
$$m(dx) = e^{-(a+1)x - \frac{2}{\sqrt{\beta}} b(x)}\, dx, \qquad s(dx) = e^{ax + \frac{2}{\sqrt{\beta}} b(x)}\, dx.$$
Here $x \mapsto b(x)$ is a standard Brownian motion. More precisely, the reciprocals of the $(\beta, a)$-Laguerre points converge to the ordered eigenvalues of the almost surely trace class integral operator
$$(G_{\beta,a})^{-1} \psi(x) = \int_0^\infty \Big( \int_0^{x \wedge y} s(dz) \Big) \psi(y)\, m(dy), \qquad \text{for } \psi \in L^2[\mathbb{R}_+, m].$$

The corresponding boundary conditions for an eigenfunction $\psi$ may be read off from the above. That one must have $\psi(0) = 0$ is clear, but it was a misunderstanding of the boundary “at infinity” which led to the error in question. For $G_{\beta,a}$, $+\infty$ is entrance, not exit, if $a \ge 0$, while it is entrance and exit when $a \in (-1, 0)$ (see [1] for definitions). In fact, the theorem in question remains correct for $a \ge 0$ (though for reasons slightly different than stated). For $a < 0$, the process generated by $G_{\beta,a}$ can reach $+\infty$, and this has consequences not addressed in [2].

The online version of the original article can be found under doi:10.1007/s00220-008-0712-1.

¹ This is the measure with density proportional to $\prod_{j<k} |\lambda_j - \lambda_k|^{\beta} \times \prod_{k=0}^{n-1} \lambda_k^{\frac{\beta}{2}(a+1)-1} e^{-\frac{\beta}{2} \lambda_k}$ on $(\mathbb{R}_+)^n$, for any $\beta > 0$ and $a > -1$.


Consider now the “Riccati diffusion” for $G_{\beta,a}$, given by the Itô equation
$$dp(x) = \frac{2}{\sqrt{\beta}}\, p(x)\, db(x) + \Big( \big(a + \tfrac{2}{\beta}\big) p(x) - p^2(x) - \lambda e^{-x} \Big) dx \qquad (1)$$
($x$ is the time parameter). The process $p$ may be begun at $+\infty$, which it leaves instantaneously, and there is a positive probability of explosion to $-\infty$. Note also, if $p(x) = 0$ at some $x < \infty$, then $p(x') < 0$ for all $x < x'$. With $\Lambda_0(\beta, a) < \Lambda_1(\beta, a) < \cdots$ the eigenvalues of $G_{\beta,a}$, the corrected Theorem 2 ([2]) reads as follows.

Theorem. Let $P_{\infty,x}$ denote the law induced by $p(\cdot\,; \beta, a, \lambda)$ started at $+\infty$ at time $x$, and restarted at $+\infty$ at time $\mathfrak{m}$ upon any $\mathfrak{m} < \infty$ with $p(\mathfrak{m}) = -\infty$. Then,
$$P(\Lambda_0(\beta, a) > \lambda) = P_{\infty,0}(p \text{ never hits } 0), \qquad P(\Lambda_k(\beta, a) < \lambda) = P_{\infty,0}(p \text{ hits } 0 \text{ at least } k+1 \text{ times}).$$

The proof still comes down to Sturm oscillation. The difference between $a \ge 0$ and $-1 < a < 0$ is connected to the next simple fact pointed out to us by O. Zeitouni.

Claim. With $\mathfrak{m}_c$ the passage time to position $c$, $P(\mathfrak{m}_{-\infty} < \infty \mid \mathfrak{m}_0 < \infty) = 1$ whenever $a \ge 0$.

In words, when $a \ge 0$ the process $x \mapsto p(x)$ will hit $-\infty$ with probability one once it hits 0. Hence we also have the following.

Corollary. Let $a \ge 0$, and set $\nu_x(dc) = P_{\infty,x}(\mathfrak{m}_{-\infty} \in dc)$. Then, $P(\Lambda_0(\beta, a) > \lambda) = \nu_0(\{\infty\})$ and
$$P(\Lambda_k(\beta, a) < \lambda) = \int_{\mathbb{R}^{k+1}} \nu_0(dx_1)\, \nu_{x_1}(dx_2) \cdots \nu_{x_k}(dx_{k+1}).$$

This corollary is just how Theorem 2 ([2]) was stated, erroneously, for all $a > -1$. Effectively, when $a \ge 0$ we can think of there being a Dirichlet condition at $+\infty$. Said better, for $a \ge 0$ the half-line eigenvalue problem is the $L \uparrow \infty$ limit of the problem on $[0, L]$ with either Dirichlet or Neumann conditions at $L$ (and Dirichlet at 0). For $-1 < a < 0$ however the boundary point at $+\infty$ must be viewed as Neumann: it can only be approximated by a sequence of problems on $[0, L]$ with a Neumann condition at $L$. In terms of $x \mapsto p(x)$, this results in counting passages to 0 (Neumann) rather than to $-\infty$ (Dirichlet). Finally note that Theorem 3 of [2] (which is a direct corollary of Theorem 2) is unaffected, this being a statement about $a \uparrow +\infty$.

Proof of the theorem. Bring in the approximating operator $G^L_{\beta,a}$ ($0 < L < \infty$) defined via
$$(G^L_{\beta,a})^{-1} \psi(x) = \int_0^\infty s_L(x, y)\, \psi(y)\, m(dy), \qquad s_L(x, y) = \Big( \int_0^{x \wedge y} s(dz) \Big) \mathbf{1}_{x, y \in [0, L]}, \qquad (2)$$
acting on $\psi \in L^2([0, L], m)$. One may readily check that any solution to $\psi(x) = \lambda (G^L_{\beta,a})^{-1} \psi(x)$ must satisfy $\psi(0) = 0$ and $\psi'(L) = 0$. That is, this defines the eigenvalue problem for $G_{\beta,a}$, cut down to $[0, L]$ with Dirichlet/Neumann conditions at 0 and $L$. In the manner of Lemma 12 of [2] one may check that $(G^L_{\beta,a})^{-1} \to (G_{\beta,a})^{-1}$ in trace norm, which provides convergence of the ordered eigenvalues.

As a bit of amplification, in Lemma 12 of [2] we had applied the argument to an approximating operator on $[0, L]$ prescribing Dirichlet conditions at both 0 and $L$. In that case the integral kernel $s_L(x, y)$ (with respect to $m$) from (2) is changed to
$$s_L(x, y) = \int_0^{x \wedge y} s(dz) \times \left[ \frac{\int_{x \vee y}^{L} s(dz)}{\int_0^{L} s(dz)} \right] \mathbf{1}_{x, y \in [0, L]}.$$
The crux of the problem is that the inverse of this Dirichlet approximation converges to $(G_{\beta,a})^{-1}$ only for $a \ge 0$.

Differentiating both sides of $\psi(x) = \lambda \int_0^\infty s_L(x, y)\, \psi(y)\, m(dy)$ leads to the same system for $x \mapsto (\psi(x), \psi'(x))$ found in [2] (see Eq. (3.1) in that reference). Further, $x \mapsto p(x) = \psi'(x)/\psi(x)$, sensible away from the zeros of $\psi$, again solves the stochastic differential equation (1).

With this setup, $\lambda$ is an eigenvalue of $G^L_{\beta,a}$ only if $(\psi(x, \lambda), \psi'(x, \lambda))$ satisfying the differential system with initial condition $(0, 1)$ at $x = 0$ comes to $(\cdot, 0)$ at $x = L$. In terms of $p$, $\lambda$ is an eigenvalue if $p(L, \lambda) = 0$ (after possible passages to $-\infty$ and subsequent “re-starts”). As $\lambda$ increases, one sees that the zeros of $p$ on $[0, L]$ move from right to left, additional zeros appearing at $L$. At this point, the proof proceeds exactly as in [2], with the understanding that the passages of $p$ to 0, not to $-\infty$, comprise the eigenvalue counting function.

References
1. Itô, K., McKean, H.P.: Diffusion processes and their sample paths. Berlin-Heidelberg-New York: Springer-Verlag, 1974
2. Ramírez, J., Rider, B.: Diffusion at the random matrix hard edge. Commun. Math. Phys. 288(3), 887–906 (2009)

Communicated by H. Spohn

Commun. Math. Phys. 307, 565–566 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1332-8


Erratum

Erratum to: Unitary Representations of Super Lie Groups and Applications to the Classification and Multiplet Structure of Super Particles

C. Carmeli (1,2), G. Cassinelli (1,2), A. Toigo (3,4), V. S. Varadarajan (5)

(1) Dipartimento di Fisica, Università di Genova, Via Dodecaneso 33, 16146 Genova, Italy
(2) Istituto Nazionale di Fisica Nucleare, Sezione di Genova, Via Dodecaneso 33, 16146 Genova, Italy. E-mail: [email protected]; [email protected]
(3) Dipartimento di Matematica “Francesco Brioschi”, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy
(4) Istituto Nazionale di Fisica Nucleare, Sezione di Milano, Via Celoria 16, 20133 Milano, Italy. E-mail: [email protected]
(5) Department of Mathematics, University of California at Los Angeles, Box 951555, Los Angeles, CA 90095-1555, USA. E-mail: [email protected]

Received: 28 June 2011 / Accepted: 29 June 2011
Published online: 30 August 2011 – © Springer-Verlag 2011

Commun. Math. Phys. 263, 217–258 (2006)

Professor Hadi Salmasian has drawn our attention to a misstatement in Lemma 1, where the correct statement should be $X\mathcal{B} \subset \mathcal{B}$. In the corrigendum below we insert this correction and a small set of consequent corrections in Lemma 1 as well as Propositions 2 and 3. We thank Professor Salmasian for this.

1. p. 222: In item (ii) of Lemma 1, replace “such that $X\mathcal{B} \subset D(X)$” with “such that $X\mathcal{B} \subset \mathcal{B}$”.
2. p. 222: In the last statement of Lemma 1, replace “if we only assume that $\mathcal{B}$ is invariant under $H$ and contains a dense set of analytic vectors” with “if we only assume that $\mathcal{B} \subset D(H)$ and contains a dense set of analytic vectors for $H$”.
3. p. 222: In the last paragraph of the proof of Lemma 1, replace “Finally, let us assume that $H\mathcal{B} \subset \mathcal{B}$ and that $\mathcal{B}$ contains a dense set of analytic vectors for $H$” with “Finally, let us assume that $\mathcal{B} \subset D(H)$ and that $\mathcal{B}$ contains a dense set of analytic vectors for $H$”.
4. p. 222: In the last paragraph of the proof of Lemma 1, replace “we have $X^{2n}\psi = H^n\psi \in \mathcal{B}$ and $X^{2n+1}\psi \in D(X)$ by assumption, and” with “we have $\psi \in D(X^n)$ for all $n$ by $X$-invariance of $\mathcal{B}$, and”.
5. p. 226, last paragraph before Proposition 2: In item (b)-(vi), replace “$\rho(X)\mathcal{B} \subset D(\rho(Y))$ for all $X, Y \in \mathfrak{g}_1$” with “$\rho(X)\mathcal{B} \subset \mathcal{B}$ for all $X \in \mathfrak{g}_1$”.
6. p. 227, in the proof of Proposition 2: Before the paragraph beginning with “It remains only to show...”, add the following paragraph: “We now prove that, for all $X \in \mathfrak{g}_1$, the operator $\rho(X)$ is odd on $C^\infty(\pi_0)$. If $P_i : \mathcal{H} \to \mathcal{H}$ is the orthogonal projection onto $\mathcal{H}_i$, then $P_i \mathcal{B} \subset \mathcal{B}$, and $\rho(X) P_i \psi = P_{i+1 \ (\mathrm{mod}\ 2)}\, \rho(X) \psi$ for all $\psi \in \mathcal{B}$ by item (iii). If $\psi \in C^\infty(\pi_0)$ and $(\psi_n)$ is a sequence in $\mathcal{B}$ such that $\psi_n \to \psi$ and $\rho(X)\psi_n \to \rho(X)\psi$, then $P_i \psi_n \to P_i \psi$ and $\rho(X) P_i \psi_n = P_{i+1 \ (\mathrm{mod}\ 2)}\, \rho(X) \psi_n \to P_{i+1 \ (\mathrm{mod}\ 2)}\, \rho(X) \psi$. Thus we have $\rho(X) P_i \psi = P_{i+1 \ (\mathrm{mod}\ 2)}\, \rho(X) \psi$, and the claim follows.”
7. p. 227, item (i) in the statement of Proposition 2: Replace “so that $\pi$, as in Proposition (1), is a representation of $\mathfrak{g}$ in $C^\infty(\pi_0)$” with “so that $\pi$, as in Proposition 1, restricts to a representation of $\mathfrak{g}$ in $C^\omega(\pi_0)$”.

The online version of the original article can be found under doi:10.1007/s00220-005-1452-0.

Communicated by Y. Kawahigashi

Commun. Math. Phys. 307, 567–607 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1344-4


Quantum Transport in Crystals: Effective Mass Theorem and K·P Hamiltonians

Luigi Barletti (1), Naoufel Ben Abdallah (2,†)

(1) Dipartimento di Matematica, Università di Firenze, Viale Morgagni 67/A, 50134 Firenze, Italy. E-mail: [email protected]
(2) Institut de Mathématiques de Toulouse, Université de Toulouse Univ. Paul Sabatier, 118 route de Narbonne, 31062 Toulouse, France

Received: 10 November 2009 / Accepted: 8 June 2011
Published online: 22 September 2011 – © Springer-Verlag 2011

Abstract: In this paper the effective mass approximation and the k·p multi-band models, describing quantum evolution of electrons in a crystal lattice, are discussed. Electrons are assumed to move in both a periodic potential and a macroscopic one. The typical period of the periodic potential is assumed to be very small, while the macroscopic potential acts on a much bigger length scale. This homogenization asymptotics is investigated by using the envelope-function decomposition of the electron wave function. If the external potential is smooth enough, the k·p and effective mass models, well known in solid-state physics, are proved to be close (in the strong sense) to the exact dynamics. Moreover, the position density of the electrons is proved to converge weakly to its effective mass approximation.

1. Introduction

The effective mass approximation is a common approximation in solid state physics [7,8,24] and states, roughly speaking, that the motion of electrons in a periodic potential can be replaced, to a good approximation, by the motion of a fictitious particle in vacuum but with a modified mass called the effective mass of the electron. This approximation is valid when the lattice period is small compared to the observation length scale, and relies on the Bloch decomposition theorem for the Schrödinger equation with a periodic potential. The effective mass is actually a tensor and depends on the energy band in which the electron “lives”. One of the most important references in the physics literature on the subject is the paper of Kohn and Luttinger [16], which dates back to 1955. As for rigorous mathematical treatment of this problem, we are aware of the work of Poupaud and Ringhofer [18] and that of Allaire and Piatnitski [4]. The aim of the present work is to provide an alternative mathematical treatment which is based on the original work of Kohn and Luttinger.

† After the first submission of the manuscript Naoufel Ben Abdallah unexpectedly left us. I would like to express here my feelings of gratitude for his precious friendship and for the extraordinary scientific enrichment he offered me at every occasion (L.B.).

As in [4] (see also [3] and [5] for related problems), we consider the scaled Schrödinger equation
$$i \partial_t \psi^\epsilon(t, x) = \Big( -\frac{1}{2} \Delta + \frac{1}{\epsilon^2} W_{\mathcal{L}}\Big( \frac{x}{\epsilon} \Big) + V\Big( x, \frac{x}{\epsilon} \Big) \Big) \psi^\epsilon(t, x),$$
where $W_{\mathcal{L}}(z)$ is a periodic potential with the periodicity of a lattice $\mathcal{L}$, representing the crystal ions, while $V(x, z)$ represents an external potential. The latter is assumed to act both on the macroscopic scale $x$ and on the microscopic scale $z = x/\epsilon$, and to be $\mathcal{L}$-periodic with respect to $z$. The small parameter $\epsilon$ is interpreted as the so-called “lattice constant”, that is, the typical separation between lattice sites. Note that this scaling of the Schrödinger equation is a homogenization scaling [4,18].

As mentioned above, the analysis of the limit $\epsilon \to 0$ has been done in Refs. [4] and [18] by different techniques. In [18], the analysis is done indirectly by means of Wigner function techniques. Using Bloch functions which diagonalize the periodic Hamiltonian, a Wigner function is constructed. The limit $\epsilon \to 0$ is taken in the Wigner equation and is reinterpreted as the Wigner transform of an effective mass Schrödinger equation. In [4], the problem is tackled differently thanks to homogenization techniques, mainly double-scale limits. The wave function is expanded on the Bloch basis and the limiting equation is obtained by expanding the Bloch functions and the energy bands around a reference wave vector.

The approach we adopt in this paper is completely different from [18] and somehow related to [4], although the techniques are different. The main idea, borrowed from the celebrated work of Kohn and Luttinger [16] (see also the more recent paper of Burt [10]), consists of expanding the wave function on a modified Bloch basis.
This choice of basis does not allow one to completely diagonalize the periodic part of the Hamiltonian, but completely separates the “oscillating” part of the wave function from its slowly varying one. By doing so, we introduce a so-called envelope function decomposition of the wave function and rewrite the Schrödinger equation as an infinite system of coupled Schrödinger equations for the envelope functions. Each of the envelope functions has a fast oscillating scale in time with a frequency related to the energy band for vanishing wavevector. Therefore adiabatic decoupling occurs, as is commonly the case for fast oscillating systems [13,17,21,23]. Moreover, the action of the macroscopic potential becomes, in the envelope function formulation, a convolution operator in both the position variable and band index. The limit of this operator is a multiplication operator in position by a matrix potential (in the band index). The analysis of these limiting processes is obtained through simple Fourier-like analysis and perturbation of point spectra of self-adjoint operators.

The kind of result that we obtain is strong convergence of envelope functions and weak convergence of the probability density $|\psi^\epsilon|^2$ associated to the wave function. It has to be remarked that our method allows one to handle an infinite number of Bloch waves (although explicit rates of convergence can be given only for a finite number of bands), while, by contrast, Ref. [4] deals with initial data carried by a finite number of bands. On the other hand, what we are able to prove is the convergence of the position probability density $|\psi^\epsilon|^2$, while Allaire and Piatnitski in Ref. [4] prove the double-scale convergence of the wavefunction $\psi^\epsilon$. Another point of interest of our method is that we derive the well known “k·p model” [7,24] as an intermediate model between the original Schrödinger equation and its limiting effective mass approximation.

The outline of the paper is as follows. In Sect. 2 we introduce notations and the functional setting, and we also give a concise presentation of the main result. As mentioned above, the Schrödinger equation is reformulated as an infinite system of coupled Schrödinger equations, where the coupling comes both from the differential part and from the potential part. In Sect. 3, we concentrate on the potential part and analyze its limit. Section 4 is devoted to the diagonalization of the differential part and to the expansion of the corresponding eigenvalues in the Fourier space. In Sect. 5, we analyze the convergence of the solution of the Schrödinger equation towards its effective mass approximation. The method relies on the definition of intermediate models and the comparison of their respective dynamics. In Subsect. 6.1 we briefly illustrate an important generalization of the main theorem, obtained by expanding the energy bands around nonzero wavevectors, and in Subsect. 6.2 we discuss some aspects related to the smoothness of the envelope functions. Finally, Appendix A contains some postponed proofs.

2. Notations and Main Results

2.1. Bloch decomposition. Let us consider the operator
$$H^\epsilon_{\mathcal{L}} = -\frac{1}{2} \Delta + \frac{1}{\epsilon^2} W_{\mathcal{L}}\Big( \frac{x}{\epsilon} \Big), \qquad (1)$$
where $W_{\mathcal{L}}$ is a bounded $\mathcal{L}$-periodic potential and the lattice $\mathcal{L}$ is defined by
$$\mathcal{L} = \big\{ L z \;\big|\; z \in \mathbb{Z}^d \big\} \subset \mathbb{R}^d, \qquad (2)$$

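$L$ being a $d \times d$ matrix with $\det L \neq 0$. The centered fundamental domain $\mathcal{C}$ of $\mathcal{L}$ is, by definition,
$$\mathcal{C} = \Big\{ L t \;\Big|\; t \in \Big[ -\frac{1}{2}, \frac{1}{2} \Big]^d \Big\}. \qquad (3)$$
Note that the volume measure $|\mathcal{C}|$ of $\mathcal{C}$ is given by $|\mathcal{C}| = |\det L|$. The reciprocal lattice $\mathcal{L}^*$ is, by definition, the lattice generated by the matrix $L^*$ such that
$$L^T L^* = 2\pi I. \qquad (4)$$
The Brillouin zone $\mathcal{B}$ is the centered fundamental domain of $\mathcal{L}^*$, i.e.¹
$$\mathcal{B} = \Big\{ L^* t \;\Big|\; t \in \Big[ -\frac{1}{2}, \frac{1}{2} \Big]^d \Big\}. \qquad (5)$$
Thus, we clearly have
$$|\mathcal{C}|\, |\mathcal{B}| = (2\pi)^d. \qquad (6)$$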
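Relation (6) follows from (4) by taking determinants. The short sketch below checks it numerically in dimension $d = 2$; the hexagonal lattice matrix and the helper names `det2` and `reciprocal_matrix` are illustrative assumptions and do not come from the paper.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def reciprocal_matrix(m):
    """Return L* with L^T L* = 2*pi*I, i.e. L* = 2*pi*(L^T)^{-1}, for a 2x2 matrix."""
    d = det2(m)
    inv = [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]
    # (L^T)^{-1} is the transpose of L^{-1}.
    inv_t = [[inv[0][0], inv[1][0]], [inv[0][1], inv[1][1]]]
    return [[2 * math.pi * x for x in row] for row in inv_t]


# Columns of L generate a hexagonal lattice (an arbitrary example with det L != 0).
L = [[1.0, 0.5], [0.0, math.sqrt(3) / 2]]
L_star = reciprocal_matrix(L)

vol_C = abs(det2(L))       # |C| = |det L|
vol_B = abs(det2(L_star))  # |B| = |det L*|

# Entry (i, j) of L^T L*, which should equal 2*pi times the identity, as in (4).
product = [[sum(L[k][i] * L_star[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
```

The product `vol_C * vol_B` agrees with $(2\pi)^2$ up to rounding, as expected from $\det(L^T L^*) = (2\pi)^d$.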

We assume without loss of generality that the periodic potential is non-negative ($W_{\mathcal{L}} \ge 0$). In solid state physics, $W_{\mathcal{L}}$ is interpreted as the electrostatic potential generated by the ions of the crystal lattice [7]. With the change of variables $z = x/\epsilon$, the operator $H^\epsilon_{\mathcal{L}}$ turns into $\frac{1}{\epsilon^2} H^1_{\mathcal{L}}$, where $H^1_{\mathcal{L}}$ is given by (1) with $\epsilon = 1$. This operator has a band structure which is given by the celebrated Bloch theorem [20].

¹ In solid state physics the Brillouin zone used has a slightly different definition. However, the two definitions are equivalent for our purposes.


Definition 2.1. Let $W_{\mathcal{L}} \in L^\infty(\mathbb{R}^d)$ be $\mathcal{L}$-periodic. For any $k \in \mathcal{B}$, the fiber Hamiltonian
$$H_{\mathcal{L}}(k) = \frac{1}{2} |k|^2 - i k \cdot \nabla - \frac{1}{2} \Delta + W_{\mathcal{L}}, \qquad (7)$$
defined on $L^2(\mathcal{C})$ with periodic boundary conditions, has a compact resolvent. Its eigenfunctions form an orthonormal sequence of periodic solutions $(u_{n,k})_{n \in \mathbb{N}}$ solving the eigenvalue problem
$$H_{\mathcal{L}}(k)\, u_{n,k} = E_n(k)\, u_{n,k}. \qquad (8)$$
The functions $u_{n,k}$ are the so-called Bloch functions and the eigenvalues $E_n(k)$ are the energy bands of the crystal. For each fixed value of $k \in \mathcal{B}$, the set $\{u_{n,k} \mid n \in \mathbb{N}\}$ is a Hilbert basis of $L^2(\mathcal{C})$ [9,20].

The Bloch waves defined for $k \in \mathcal{B}$ and $n \in \mathbb{N}$ by
$$b_{n,k}(x) = |\mathcal{B}|^{-1/2} e^{i k \cdot x} u_{n,k}(x)$$
form a complete basis of $L^2(\mathbb{R}^d)$ and satisfy the equation $H^1_{\mathcal{L}} b_{n,k} = E_n(k)\, b_{n,k}$. In order to analyze the homogenization limit $\epsilon \to 0$, the usual starting point is to decompose the wave function on the scaled Bloch wave functions
$$b^\epsilon_{n,k}(x) = |\mathcal{B}|^{-1/2} e^{i \frac{k \cdot x}{\epsilon}} u_{n,k}\Big( \frac{x}{\epsilon} \Big).$$
This has the big advantage of completely diagonalizing the periodic Hamiltonian, but since the wave vector appears both in the plane wave $e^{i k \cdot x}$ and in the standing periodic function $u_{n,k}$, the separation between the fast oscillating scale and the slow motion carried by the plane wave is not immediate. We follow in this work the idea of Kohn and Luttinger [16], who decompose the wave function on the basis
$$\mathcal{X}_{n,k}(x) = |\mathcal{B}|^{-1/2}\, \mathbf{1}_{\mathcal{B}}(k)\, e^{i k \cdot x} u_{n,0}(x), \qquad (9)$$
where $k \in \mathbb{R}^d$ and $\mathbf{1}_{\mathcal{B}}$ denotes the indicator function of the set $\mathcal{B}$. The family $\mathcal{X}_{n,k}$ is also a complete orthonormal basis of $L^2(\mathbb{R}^d)$ but only partially diagonalizes $H^1_{\mathcal{L}}$, since
$$H^1_{\mathcal{L}} \mathcal{X}_{n,k} = |\mathcal{B}|^{-1/2}\, \mathbf{1}_{\mathcal{B}}(k)\, e^{i k \cdot x} \Big( \frac{1}{2} |k|^2 - i k \cdot \nabla + E_n(0) \Big) u_{n,0}$$
$$= |\mathcal{B}|^{-1/2}\, \mathbf{1}_{\mathcal{B}}(k)\, e^{i k \cdot x} \sum_{n'} \Big( \frac{1}{2} |k|^2 \delta_{nn'} - i k \cdot P_{n'n} + E_n \delta_{nn'} \Big) u_{n',0}$$
$$= \sum_{n'} \Big( \frac{1}{2} |k|^2 \delta_{nn'} - i k \cdot P_{n'n} + E_n \delta_{nn'} \Big) \mathcal{X}_{n',k}. \qquad (10)$$
Here, $E_n = E_n(0)$ and
$$P_{nn'} = \int_{\mathcal{C}} \overline{u_{n,0}(x)}\, \nabla u_{n',0}(x)\, dx \qquad (11)$$
are the matrix elements of the gradient operator between Bloch functions. The Luttinger-Kohn basis is strictly connected with the envelope function decomposition of the wave function, a connection that will be detailed in the following subsection.
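The matrix appearing in (10) can be made concrete in a toy case. The sketch below assumes a one-dimensional lattice of period `a` with vanishing periodic potential ($W_{\mathcal{L}} = 0$), so that the periodic eigenfunctions at $k = 0$ are normalized plane waves labelled by integers $n$ with $E_n = \frac{1}{2} G_n^2$ and $G_n = 2\pi n / a$. These choices, the integer labelling, and the function names are assumptions made for illustration only; this free case has degenerate eigenvalues and is therefore outside the simple-eigenvalue setting of the main theorem. Since the matrix turns out to be diagonal here, the order of the band indices in $P$ does not matter.

```python
import cmath
import math

a = 1.0  # lattice period (assumed); the cell is C = [-a/2, a/2]


def G(n):
    """Reciprocal lattice point labelling band n in the free case."""
    return 2 * math.pi * n / a


def u(n, x):
    """Periodic eigenfunction at k = 0 for W = 0: a normalized plane wave."""
    return cmath.exp(1j * G(n) * x) / math.sqrt(a)


def du(n, x):
    """Derivative of u(n, .) with respect to x."""
    return 1j * G(n) * u(n, x)


def P(n, m, steps=2000):
    """Midpoint-rule value of the integral over C of conj(u_n) * d/dx u_m."""
    h = a / steps
    total = 0j
    for j in range(steps):
        x = -a / 2 + (j + 0.5) * h
        total += u(n, x).conjugate() * du(m, x)
    return total * h


def kp_entry(n, m, k):
    """Entry (|k|^2 / 2) delta - i k P + E_n delta of the matrix in (10)."""
    delta = 1.0 if n == m else 0.0
    energy = 0.5 * G(n) ** 2
    return 0.5 * k * k * delta - 1j * k * P(n, m) + energy * delta


diagonal = kp_entry(1, 1, 0.3)
off_diagonal = kp_entry(1, 2, 0.3)
```

In this free case the diagonal entry reduces to $\frac{1}{2}(k + G_n)^2$, the kinetic energy of a plane wave with wave vector $k + G_n$, and the off-diagonal entries vanish, so no band coupling occurs; a non-trivial $W_{\mathcal{L}}$ would produce off-diagonal terms through $P$.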


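2.2. Envelope functions. In the following, we shall use the symbol $\mathcal{F}$ to denote the Fourier transformation
$$\mathcal{F}\psi(k) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{-i x \cdot k}\, \psi(x)\, dx, \qquad (12)$$
extended to $L^2(\mathbb{R}^d)$, and $\mathcal{F}^* = \mathcal{F}^{-1}$ for the inverse transformation. We shall use a hat, $\hat{\psi} = \mathcal{F}\psi$, for the Fourier transform of $\psi$.

Definition 2.2. We define $L^2_{\mathcal{B}}(\mathbb{R}^d) \subset L^2(\mathbb{R}^d)$ to be the subspace of $L^2$-functions supported in $\mathcal{B}$:
$$L^2_{\mathcal{B}}(\mathbb{R}^d) = \big\{ f \in L^2(\mathbb{R}^d) \;\big|\; \operatorname{supp} f \subset \mathcal{B} \big\}. \qquad (13)$$
Thus, $\mathcal{F}^* L^2_{\mathcal{B}}(\mathbb{R}^d)$ is the space of $L^2$-functions whose Fourier transform is supported in $\mathcal{B}$.

The envelope function decomposition is defined by the following theorem.

Theorem 2.1. Let $v_n : \mathbb{R}^d \to \mathbb{C}$ be $\mathcal{L}$-periodic functions such that $\{v_n \mid n \in \mathbb{N}\}$ is an orthonormal basis of $L^2(\mathcal{C})$. For every $\psi \in L^2(\mathbb{R}^d)$ there exists a unique sequence of “envelope functions” $f_n \in \mathcal{F}^* L^2_{\mathcal{B}}(\mathbb{R}^d)$, $n \in \mathbb{N}$, such that
$$\psi = |\mathcal{C}|^{1/2} \sum_n f_n v_n. \qquad (14)$$
If $g_n$ are the envelope functions of another wave function $\varphi$, then we have the Parseval identity
$$\langle \psi, \varphi \rangle_{L^2(\mathbb{R}^d)} = \sum_n \langle f_n, g_n \rangle_{L^2(\mathbb{R}^d)}. \qquad (15)$$
For any $\epsilon > 0$ we shall consider the scaled version $f^\epsilon_n$ of the envelope function decomposition as follows:
$$\psi = |\mathcal{C}|^{1/2} \sum_n f^\epsilon_n v^\epsilon_n, \qquad (16)$$
with $\hat{f}^\epsilon_n \in L^2_{\mathcal{B}/\epsilon}(\mathbb{R}^d)$, where
$$v^\epsilon_n(x) = v_n\Big( \frac{x}{\epsilon} \Big). \qquad (17)$$
We still have the Parseval identity
$$\langle \psi, \varphi \rangle_{L^2(\mathbb{R}^d)} = \sum_n \langle f^\epsilon_n, g^\epsilon_n \rangle_{L^2(\mathbb{R}^d)}. \qquad (18)$$
Finally, the Fourier transforms of the $\epsilon$-scaled envelope functions are given by
$$\hat{f}^\epsilon_n(k) = \int_{\mathbb{R}^d} \overline{\mathcal{X}^\epsilon_{n,k}(x)}\, \psi(x)\, dx, \qquad (19)$$
where, for $x \in \mathbb{R}^d$, $k \in \mathbb{R}^d$, $n \in \mathbb{N}$,
$$\mathcal{X}^\epsilon_{n,k}(x) = |\mathcal{B}|^{-1/2}\, \mathbf{1}_{\mathcal{B}/\epsilon}(k)\, e^{i k \cdot x}\, v^\epsilon_n(x). \qquad (20)$$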


The proof of this theorem is postponed to Appendix A. The functions $f_n$ will be called the envelope functions of $\psi$ relative to the basis $\{v_n \mid n \in \mathbb{N}\}$, while $f^\epsilon_n$ will be called the $\epsilon$-scaled envelope functions relative to the basis $\{v_n \mid n \in \mathbb{N}\}$. We remark that $\operatorname{supp}(\hat{f}_n) \subset \mathcal{B}$ and $\operatorname{supp}(\hat{f}^\epsilon_n) \subset \mathcal{B}/\epsilon$.

Remark 2.1. Note that the above result is a variant of the so-called Bloch transform. In [6], the function
$$\hat{\psi}(x, k) = |\mathcal{C}|^{1/2} \sum_n \hat{f}_n(k)\, v_n(x)$$
is referred to as the Bloch transform of $\psi$. We also refer to [2,15] and [20] for Bloch wave methods in periodic media.

Theorem 2.2. Let us consider the $\epsilon$-scaled envelope function decomposition (16) of $\psi \in L^2(\mathbb{R}^d)$. Then, for every test function $\theta \in L^1(\mathbb{R}^d)$ such that $\hat{\theta} \in L^1(\mathbb{R}^d)$, we have
$$\lim_{\epsilon \to 0} \int_{\mathbb{R}^d} \theta(x) \Big[ |\psi(x)|^2 - \sum_n |f^\epsilon_n(x)|^2 \Big] dx = 0. \qquad (21)$$
The proof of this theorem is also postponed to Appendix A.

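2.3. Functional spaces. In this section, we define some functional spaces which will be used throughout the paper.

Definition 2.3. We define the space $\mathcal{L}^2 = \ell^2\big(\mathbb{N}, L^2(\mathbb{R}^d)\big)$ as the Hilbert space of sequences $g = (g_0, g_1, \ldots)$, $g_n = g_n(k)$, with $g_n \in L^2(\mathbb{R}^d)$, such that
\[ \|g\|^2_{\mathcal{L}^2} = \sum_n \|g_n\|^2_{L^2(\mathbb{R}^d)} < \infty. \tag{22} \]
Moreover, for $\mu \ge 0$ let $\mathcal{L}^2_\mu$ be the subspace of all sequences $g \in \mathcal{L}^2$ such that
\[ \|g\|^2_{\mathcal{L}^2_\mu} = \big\|(1 + |k|^2)^{\mu/2} g\big\|^2_{\mathcal{L}^2} = \sum_n \big\|(1 + |k|^2)^{\mu/2} g_n\big\|^2_{L^2} < \infty \tag{23} \]
and let $\mathcal{H}^\mu = \ell^2\big(\mathbb{N}, H^\mu(\mathbb{R}^d)\big)$, with
\[ \|f\|^2_{\mathcal{H}^\mu} = \sum_n \|f_n\|^2_{H^\mu} = \sum_n \big\|(1 + |k|^2)^{\mu/2} \hat f_n\big\|^2_{L^2} < \infty. \tag{24} \]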

It is readily seen that $f \in \mathcal{H}^\mu$ if and only if $\hat f \in \mathcal{L}^2_\mu$. Let us now define the functional spaces for the external potential, which is assumed to be periodic with respect to the second variable.

Definition 2.4. For $\mu \ge 0$ we define the spaces
\[ \mathcal{W}_\mu = \left\{ V \in L^\infty(\mathbb{R}^{2d}) \;\middle|\; V(\cdot, z + \lambda) = V(\cdot, z),\ \lambda \in \mathcal{L},\ \|V\|_{\mathcal{W}_\mu} < \infty \right\}, \tag{25} \]
where
\[ \|V\|_{\mathcal{W}_\mu} = \frac{1}{(2\pi)^{d/2}} \operatorname*{ess\,sup}_{z \in \mathcal{C}} \int_{\mathbb{R}^d} (1 + |k|)^\mu\, |\hat V(k, z)|\, dk \tag{26} \]
and
\[ \hat V(k, z) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} e^{-ik\cdot x}\, V(x, z)\, dx. \]

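Note that $V \in \mathcal{W}_\mu$, with $\mu > 0$, is a Carathéodory function and, therefore, the diagonal evaluation $V(x, x/\varepsilon)$ is a measurable function of $x$ (see e.g. Ref. [12]).

2.4. Main theorem. We announce in this section the main theorem of our paper. Let us redefine the eigenpairs $(E_n, v_n)$ of the operator $H^1_{\mathcal{L}} = -\frac12 \Delta + W_{\mathcal{L}}$ with periodic boundary conditions by
\[ \begin{cases} -\dfrac12 \Delta v_n + W_{\mathcal{L}} v_n = E_n v_n, & \text{on } \mathcal{C}, \\[4pt] \displaystyle\int_{\mathcal{C}} |v_n|^2\, dx = 1, & v_n \text{ periodic} \end{cases} \tag{27} \]
(note that $v_n = u_{n,0}$ and $E_n = E_n(0)$, according to Definition 2.1). The sequence $E_n$ is increasing and tends to $+\infty$.

Theorem 2.3. Assume that $W_{\mathcal{L}} \in L^\infty$ and that all the eigenvalues $E_n = E_n(0)$ are simple. Let $\psi^{in,\varepsilon}$ be an initial datum in $L^2(\mathbb{R}^d)$ and let $f_n^{in,\varepsilon}$ be its scaled envelope functions relative to the basis $v_n$. Assume that $\mu > 0$ exists such that the sequence $f^{in,\varepsilon} = (f_n^{in,\varepsilon})$ belongs to $\mathcal{H}^\mu$, with a uniform bound for the norm as $\varepsilon$ vanishes, and that it converges in $\mathcal{L}^2$ as $\varepsilon$ tends to zero to an initial datum $f^{in}$. Let $\psi^\varepsilon$ be the unique solution of
\[ \begin{cases} i\partial_t \psi^\varepsilon(t, x) = \left[ -\dfrac12 \Delta + \dfrac{1}{\varepsilon^2} W_{\mathcal{L}}\!\left(\dfrac{x}{\varepsilon}\right) + V\!\left(x, \dfrac{x}{\varepsilon}\right) \right] \psi^\varepsilon(t, x), \\[6pt] \psi^\varepsilon(t = 0) = \psi^{in,\varepsilon}, \end{cases} \tag{28} \]
and assume that $V \in \mathcal{W}_\mu$. Then for any given test function $\theta \in L^1(\mathbb{R}^d)$ such that $\hat\theta \in L^1(\mathbb{R}^d)$, we have
\[ \lim_{\varepsilon \to 0} \int \theta(x) \Big[ |\psi^\varepsilon(t, x)|^2 - \sum_n |h_{em,n}(t, x)|^2 \Big] dx = 0, \]
uniformly in bounded time intervals, where the envelope function $h_{em,n}$ is the unique solution of the homogenized (effective mass) Schrödinger equation
\[ i\partial_t h_{em,n} = -\frac12 \operatorname{div}\big( M_n^{-1} \nabla h_{em,n} \big) + V_n(x)\, h_{em,n}, \qquad h_{em,n}(t = 0) = f_n^{in}, \]
with
\[ V_n(x) = \int_{\mathcal{C}} V(x, z)\, |v_n(z)|^2\, dz \]
and
\[ M_n^{-1} = \nabla \otimes \nabla E_n(k)\big|_{k=0} = I - 2 \sum_{n' \ne n} \frac{P_{nn'} \otimes P_{n'n}}{E_n - E_{n'}} \]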

(effective mass tensor of the $n$th band).

Remark 2.2. The assumption that the eigenvalues $E_n = E_n(0)$ are simple is quite restrictive, even in the one-dimensional case ($d = 1$) [20]. However, the simplicity of the eigenvalues $E_n(k)$ is a generic property [1,22] and, therefore, it is always satisfied by replacing 0 with a generic wave vector $k_0$. What changes, in this case, is that $k_0$ is not necessarily a critical point for all bands (instead, $k = 0$ is always a critical point for all bands if the eigenvalues $E_n(0)$ are simple). In Subsect. 6.1 we shall discuss the generalization of Theorem 2.3 for a generic reference wave vector $k_0$. As we shall see (see also Ref. [4]), drift terms of order $1/\varepsilon$ appear in those bands for which $k_0$ is not critical. For such bands the effective mass equation always holds but the homogenized potential term $V_n(x)$ must be replaced by a suitable "far-away" asymptotic value $V_n^\infty$. Moreover, the weak convergence of $|\psi^\varepsilon|^2$ to $\sum_n |h_{em,n}|^2$ has to be mediated by drifting test functions (see Eq. (102)).

3. From the Schrödinger Equation to the k·p Model

Let $\psi^\varepsilon(t, x)$ be the solution of the Schrödinger equation (28) and let $f_n^\varepsilon(t, x)$ be its $\varepsilon$-scaled envelope function relative to the basis $v_n$, as defined in (27) and (17):
\[ \psi^\varepsilon(t, x) = |\mathcal{C}|^{1/2} \sum_n f_n^\varepsilon(t, x)\, v_n^\varepsilon(x). \]

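Let us define $g_n^\varepsilon(t, k) = \hat f_n^\varepsilon(t, k)$. From now on, we will reserve the notation $f$ for functions of the position variable $x$, while $g$ will be used for functions of the wavevector $k$. Multiplying the Schrödinger equation by $\overline{X^\varepsilon_{n,k}(x)}$ (see Eq. (20)) and integrating over $x$ leads to the following equation:
\[ i\partial_t g_n^\varepsilon(t, k) = \frac12 |k|^2 g_n^\varepsilon(t, k) - \frac{i}{\varepsilon} \sum_{n'} k \cdot P_{nn'}\, g_{n'}^\varepsilon(t, k) + \frac{1}{\varepsilon^2} E_n\, g_n^\varepsilon(t, k) + \sum_{n'} \int_{\mathbb{R}^d} U^\varepsilon_{nn'}(k, k')\, g_{n'}^\varepsilon(t, k')\, dk', \tag{29} \]
where the kernel $U^\varepsilon_{nn'}(k, k')$ is given by
\[ U^\varepsilon_{nn'}(k, k') = \int_{\mathbb{R}^d} \overline{X^\varepsilon_{n,k}(x)}\, V\!\left(x, \frac{x}{\varepsilon}\right) X^\varepsilon_{n',k'}(x)\, dx = |\mathcal{B}|^{-1}\, \mathbb{1}_{\mathcal{B}/\varepsilon}(k)\, \mathbb{1}_{\mathcal{B}/\varepsilon}(k') \int_{\mathbb{R}^d} e^{-i(k-k')\cdot x}\, \overline{v_n^\varepsilon(x)}\, V\!\left(x, \frac{x}{\varepsilon}\right) v_{n'}^\varepsilon(x)\, dx. \]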

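By writing
\[ V(x, z)\, v_n(z) = \sum_{n'} V_{n'n}(x)\, v_{n'}(z), \]
where
\[ V_{n'n}(x) = \int_{\mathcal{C}} \overline{v_{n'}(z)}\, v_n(z)\, V(x, z)\, dz = \overline{V_{nn'}(x)}, \tag{30} \]
we can express $U^\varepsilon_{nn'}(k, k')$ in the form
\[ U^\varepsilon_{nn'}(k, k') = \frac{\mathbb{1}_{\mathcal{B}/\varepsilon}(k)\, \mathbb{1}_{\mathcal{B}/\varepsilon}(k')}{|\mathcal{B}|} \int_{\mathbb{R}^d} e^{-i(k-k')\cdot x} \sum_m \overline{v_n^\varepsilon(x)}\, V_{mn'}(x)\, v_m^\varepsilon(x)\, dx. \tag{31} \]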

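In position variables, the envelope functions satisfy the system
\[ i\partial_t f_n^\varepsilon(t, x) = \frac{E_n}{\varepsilon^2} f_n^\varepsilon(t, x) - \frac12 \Delta f_n^\varepsilon(t, x) - \frac{1}{\varepsilon} \sum_{n' \in \mathbb{N}} P_{nn'} \cdot \nabla f_{n'}^\varepsilon(t, x) + \sum_{n' \in \mathbb{N}} \int_{\mathbb{R}^d} V^\varepsilon_{nn'}(x, x')\, f_{n'}^\varepsilon(t, x')\, dx', \tag{32} \]
where
\[ V^\varepsilon_{nn'}(x, x') = \frac{1}{(2\pi)^d |\mathcal{B}|} \int_{\mathcal{B}/\varepsilon} dk \int_{\mathbb{R}^d} dy \int_{\mathcal{B}/\varepsilon} dk'\; e^{ik\cdot x}\, e^{-i(k-k')\cdot y}\, \overline{v_n^\varepsilon(y)}\, V\!\left(y, \frac{y}{\varepsilon}\right) v_{n'}^\varepsilon(y)\, e^{-ik'\cdot x'}. \tag{33} \]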

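From Eq. (32) we see that the fast oscillation scales are different for different envelope functions. This will naturally lead to adiabatic decoupling (see [13,17,21,23]).

Definition 3.1. Let us define the operator $\mathcal{U}^\varepsilon$ on $\mathcal{L}^2$ as follows: for any element $g = (g_0, g_1, \ldots)$ of $\mathcal{L}^2$ we put
\[ \big(\mathcal{U}^\varepsilon g\big)_n(k) = \sum_{n'} \int_{\mathbb{R}^d} U^\varepsilon_{nn'}(k, k')\, g_{n'}(k')\, dk', \tag{34} \]
where $U^\varepsilon_{nn'}(k, k')$ is given by (31). Let us also define the operator $\mathcal{V}^\varepsilon$ on the position space $\mathcal{L}^2$ by
\[ \big(\mathcal{V}^\varepsilon f\big)_n(x) = \sum_{n'} \int_{\mathbb{R}^d} V^\varepsilon_{nn'}(x, x')\, f_{n'}(x')\, dx'. \tag{35} \]
We obviously have $\widehat{\mathcal{V}^\varepsilon f} = \mathcal{U}^\varepsilon \hat f$.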

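Since $v_n$ and $v_m$ are $\mathcal{L}$-periodic, the formal limit of $U^\varepsilon_{nn'}(k, k')$ is given by
\[ U^0_{nn'}(k, k') = \sum_m \frac{\langle v_n, v_m \rangle}{|\mathcal{B}||\mathcal{C}|} \int_{\mathbb{R}^d} e^{-i(k-k')\cdot x}\, V_{mn'}(x)\, dx = \frac{1}{(2\pi)^{d/2}}\, \hat V_{nn'}(k - k'). \]
Therefore the formal limit of $\mathcal{U}^\varepsilon$ is the operator $\mathcal{U}^0$ defined by
\[ \big(\mathcal{U}^0 g\big)_n(k) = \frac{1}{(2\pi)^{d/2}} \sum_{n'} \int_{\mathbb{R}^d} \hat V_{nn'}(k - k')\, g_{n'}(k')\, dk', \tag{36} \]
which means that in position space the limit of $\mathcal{V}^\varepsilon$ is the non-diagonal multiplication operator $\mathcal{V}^0$ defined by
\[ \big(\mathcal{V}^0 f\big)_n(x) = \sum_{n'} V_{nn'}(x)\, f_{n'}(x). \tag{37} \]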
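The coefficients $V_{nn'}(x)$ of (30), which make up the multiplication operator in (37), can be evaluated numerically once a basis is fixed. The sketch below assumes $d = 1$, the unit cell $\mathcal{C} = [0, 1)$ and the plain Fourier basis $v_n(z) = e^{2\pi i n z}$ (an illustrative choice of orthonormal basis, not the Bloch basis of the paper); `example_potential` is a hypothetical potential.

```python
import cmath
import math


def v(n, z):
    """Assumed basis on the unit cell C = [0, 1): v_n(z) = exp(2*pi*i*n*z)."""
    return cmath.exp(2j * math.pi * n * z)


def potential_matrix_element(V, x, n, n_prime, samples=64):
    """Approximate V_{nn'}(x) = int_C conj(v_n(z)) v_{n'}(z) V(x, z) dz.

    The equispaced rule is exact for trigonometric polynomials in z
    whose frequencies stay below `samples`.
    """
    total = 0j
    for j in range(samples):
        z = j / samples
        total += v(n, z).conjugate() * v(n_prime, z) * V(x, z)
    return total / samples


def example_potential(x, z):
    """Hypothetical potential: smooth in x, a single cosine mode in z."""
    return math.exp(-x * x) + 0.4 * math.cos(2.0 * math.pi * z)
```

For this potential the $z$-independent part contributes only to the diagonal entries, while the cosine mode couples neighbouring indices with weight 0.2; a potential that does not depend on $z$ gives vanishing off-diagonal entries.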

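The operators become diagonal in $n$ if $V(x, z)$ does not depend on $z$. Indeed, in this case $V_{nn'}(x) = V(x)\,\delta_{nn'}$. The k·p approximation found in semiconductor theory [24] consists in replacing the operator $\mathcal{U}^\varepsilon$ by $\mathcal{U}^0$. Let us now analyze the departure of $\mathcal{U}^\varepsilon$ from $\mathcal{U}^0$. We recall that the functional spaces $\mathcal{W}_\mu$ for the potential are introduced in Definition 2.4.

Lemma 3.1. Let the external potential $V(x, z)$ be in $L^\infty$. Then, for any $\varepsilon \ge 0$, $\mathcal{U}^\varepsilon$ is a bounded operator on $\mathcal{L}^2$ and we have the uniform bound
\[ \|\mathcal{U}^\varepsilon\| \le \|V\|_{L^\infty}, \qquad \forall\, \varepsilon \ge 0. \tag{38} \]
Proof. Let us begin with the case $\varepsilon = 0$. We remark that $\mathcal{U}^0 g = \widehat{\mathcal{V}^0 f}$, where $f = \mathcal{F}^* g$. Let $G$ be another element of $\mathcal{L}^2$, and let $F$ be its back Fourier transform. We have
\begin{align*}
\langle \mathcal{U}^0 g, G \rangle = \langle \mathcal{V}^0 f, F \rangle
&= \sum_{n n'} \int V_{nn'}(x)\, f_{n'}(x)\, \overline{F_n(x)}\, dx \\
&= \sum_{n n'} \iint V(x, z)\, \overline{v_n(z)}\, v_{n'}(z)\, f_{n'}(x)\, \overline{F_n(x)}\, dx\, dz \\
&= \iint V(x, z) \Big( \sum_{n'} f_{n'}(x) v_{n'}(z) \Big) \overline{\Big( \sum_n F_n(x) v_n(z) \Big)}\, dx\, dz \\
&\le \|V\|_{L^\infty} \Big( \iint \Big| \sum_{n'} f_{n'}(x) v_{n'}(z) \Big|^2 dx\, dz \Big)^{1/2} \Big( \iint \Big| \sum_n F_n(x) v_n(z) \Big|^2 dx\, dz \Big)^{1/2} \\
&\le \|V\|_{L^\infty} \|f\|_{\mathcal{L}^2} \|F\|_{\mathcal{L}^2} = \|V\|_{L^\infty} \|g\|_{\mathcal{L}^2} \|G\|_{\mathcal{L}^2}.
\end{align*}
Since the result holds for any $g$ and $G$ in $\mathcal{L}^2$, this implies that $\|\mathcal{U}^0 g\|_{\mathcal{L}^2} \le \|V\|_{L^\infty} \|g\|_{\mathcal{L}^2}$. For $\varepsilon > 0$ it is enough to observe that $\mathcal{U}^\varepsilon$ is unitarily equivalent to the multiplication operator by $V(x, \frac{x}{\varepsilon})$ in position space. More precisely, defining $f^\varepsilon = \mathcal{F}^*(\mathbb{1}_{\mathcal{B}/\varepsilon}\, g)$ and defining $\psi^\varepsilon(x) = \sum_n f_n^\varepsilon(x)\, v_n^\varepsilon(x)$, so that $f_n^\varepsilon$ are the scaled envelope functions of $\psi^\varepsilon$, then it follows from the definition of $\mathcal{U}^\varepsilon$ that $(\mathcal{U}^\varepsilon g)_n$ are the scaled envelope functions of $V(x, \frac{x}{\varepsilon})\, \psi^\varepsilon(x)$, expressed in Fourier variables. It is now readily seen that
\[ \|\mathcal{U}^\varepsilon g\|^2_{\mathcal{L}^2} = \Big\| V\!\left(x, \frac{x}{\varepsilon}\right) \psi^\varepsilon \Big\|^2_{L^2} \le \|V\|^2_{L^\infty} \|\psi^\varepsilon\|^2_{L^2} \le \|V\|^2_{L^\infty} \|g\|^2_{\mathcal{L}^2}. \]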

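Lemma 3.2. For any $\gamma > 0$ let $\gamma\mathcal{B}$ be the set of $\gamma k$ where $k$ is in $\mathcal{B}$. Then $\gamma\mathcal{B} + \beta\mathcal{B} = (\gamma + \beta)\mathcal{B}$. Moreover, let $k \in \mathcal{B}$ and $k' \in \frac13\mathcal{B}$, and let $\lambda$ be a non-vanishing element of the reciprocal lattice $\mathcal{L}^*$. Then $k - k' + \lambda \notin \frac13\mathcal{B}$.

The proof of this lemma is immediate (using the fact that $\mathcal{B}$ is the linear deformation of a hypercube, see definition (5)) and is left to the reader.

Lemma 3.3. Let $V \in \mathcal{W}_0$ and $g \in \mathcal{L}^2$ be such that $\operatorname{supp} \hat V_{nm} \subset \frac{1}{3\varepsilon}\mathcal{B}$ and $\operatorname{supp}(g_n) \subset \frac{1}{3\varepsilon}\mathcal{B}$, for all $n, m \in \mathbb{N}$. Then, in this case, $\mathcal{U}^\varepsilon g = \mathcal{U}^0 g$.

Proof. Let us first notice that $\{|\mathcal{C}|^{-1/2} e^{i\eta\cdot x} \mid \eta \in \mathcal{L}^*\}$ is an orthonormal basis of $L^2(\mathcal{C})$ (the Fourier basis). We first deduce from (31) and from the identity
\[ v_n(x) = \frac{1}{|\mathcal{C}|^{1/2}} \sum_{\lambda \in \mathcal{L}^*} v_{n,\lambda}\, e^{i\lambda\cdot x}, \qquad \text{where } v_{n,\lambda} = \Big\langle v_n, \frac{e^{i\lambda\cdot x}}{|\mathcal{C}|^{1/2}} \Big\rangle, \]
that
\begin{align*}
\big(\mathcal{U}^\varepsilon g\big)_n(k)
&= \frac{1}{|\mathcal{B}||\mathcal{C}|} \sum_{\lambda, \lambda' \in \mathcal{L}^*} \sum_{m, n'} \int_{\mathbb{R}^d \times \mathbb{R}^d} e^{-i\left(k - k' + \frac{\lambda - \lambda'}{\varepsilon}\right)\cdot x}\, \mathbb{1}_{\mathcal{B}/\varepsilon}(k')\, \mathbb{1}_{\mathcal{B}/\varepsilon}(k)\, V_{mn'}(x)\, v_{m,\lambda'}\, \overline{v_{n,\lambda}}\, g_{n'}(k')\, dx\, dk' \\
&= \mathbb{1}_{\mathcal{B}/\varepsilon}(k)\, (2\pi)^{-d/2} \sum_{\lambda, \lambda' \in \mathcal{L}^*} \sum_{m, n'} \int_{\mathcal{B}/\varepsilon} \hat V_{mn'}\!\Big(k - k' + \frac{\lambda - \lambda'}{\varepsilon}\Big)\, v_{m,\lambda'}\, \overline{v_{n,\lambda}}\, g_{n'}(k')\, dk'.
\end{align*}
Since the support of $g_{n'}$ is included in $\mathcal{B}/3\varepsilon$ and $k \in \mathcal{B}/\varepsilon$, Lemma 3.2 implies that the only contributing terms to the above sum are those for which $\lambda = \lambda'$. Therefore, we are led to evaluate $\sum_\lambda v_{m,\lambda}\, \overline{v_{n,\lambda}}$, which is equal to $\langle v_m, v_n \rangle = \delta_{mn}$ because of the orthonormality of the family $(v_n)$. Therefore
\[ \big(\mathcal{U}^\varepsilon g\big)_n(k) = (2\pi)^{-d/2}\, \mathbb{1}_{\mathcal{B}/\varepsilon}(k) \sum_{n'} \int_{\mathcal{B}/\varepsilon} \hat V_{nn'}(k - k')\, g_{n'}(k')\, dk'. \]
Now, we can remove $\mathbb{1}_{\mathcal{B}/\varepsilon}(k)$ from the right-hand side of the above identity, since both the support of $g_{n'}$ and that of $\hat V_{nn'}$ are in $\frac{1}{3\varepsilon}\mathcal{B}$. Hence
\[ \big(\mathcal{U}^\varepsilon g\big)_n(k) = (2\pi)^{-d/2} \sum_{n'} \int_{\mathbb{R}^d} \hat V_{nn'}(k - k')\, g_{n'}(k')\, dk' = \big(\mathcal{U}^0 g\big)_n(k). \]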
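The second claim of Lemma 3.2, which drives the proof of Lemma 3.3, can be checked by hand in one dimension: if $|k| \le \pi$, $|k'| \le \pi/3$ and $|\lambda| \ge 2\pi$, then $|k - k' + \lambda| \ge 2\pi/3 > \pi/3$. The following sketch runs the same check on a grid, assuming the direct lattice $\mathcal{L} = \mathbb{Z}$, so that $\mathcal{L}^* = 2\pi\mathbb{Z}$ and $\mathcal{B} = [-\pi, \pi]$ (taken closed for simplicity); all names are illustrative.

```python
import math


def in_scaled_zone(q, gamma):
    """True if q lies in gamma * B, with the assumed zone B = [-pi, pi]."""
    return abs(q) <= gamma * math.pi


def lemma_3_2_holds_on_grid(points=61, max_shift=3):
    """Check that k - k' + lambda stays outside B/3 on a sample grid.

    k ranges over B, k' over B/3 and lambda over the nonzero elements
    2*pi*m of the reciprocal lattice with |m| <= max_shift.
    """
    for a in range(points):
        k = -math.pi + 2.0 * math.pi * a / (points - 1)
        for b in range(points):
            k_prime = (-math.pi + 2.0 * math.pi * b / (points - 1)) / 3.0
            for m in range(-max_shift, max_shift + 1):
                if m == 0:
                    continue
                lam = 2.0 * math.pi * m
                if in_scaled_zone(k - k_prime + lam, 1.0 / 3.0):
                    return False
    return True
```

A grid check is of course not a proof; it only illustrates why the terms with $\lambda \ne \lambda'$ drop out when both $g_{n'}$ and $\hat V_{nn'}$ are supported in $\frac{1}{3\varepsilon}\mathcal{B}$.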

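Theorem 3.1. Assume that $V \in \mathcal{W}_\mu$ for some $\mu \ge 0$. Then, a constant $c_\mu > 0$, independent of $\varepsilon$, exists such that
\[ \|\mathcal{U}^\varepsilon g - \mathcal{U}^0 g\|_{\mathcal{L}^2} \le \varepsilon^\mu\, c_\mu\, \|V\|_{\mathcal{W}_\mu} \|g\|_{\mathcal{L}^2_\mu} \tag{39} \]
for all $g \in \mathcal{L}^2_\mu$ and for all $\varepsilon > 0$.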

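Proof. Let the smoothed potential $V_s^\varepsilon$ be defined by
\[ \hat V_s^\varepsilon(k, z) = \mathbb{1}_{\mathcal{B}/3\varepsilon}(k)\, \hat V(k, z). \tag{40} \]
Moreover, let $\mathcal{U}_s^\varepsilon$ denote the operator $\mathcal{U}^\varepsilon$ with the potential $V_s^\varepsilon$. Let us assume firstly that $\operatorname{supp}(g_n) \subset \mathcal{B}/3\varepsilon$ for all $n \in \mathbb{N}$. Then, from Lemma 3.3 we have $\mathcal{U}_s^\varepsilon g = \mathcal{U}_s^0 g$ and we can write
\[ \|\mathcal{U}^\varepsilon g - \mathcal{U}^0 g\|_{\mathcal{L}^2} \le \|\mathcal{U}^\varepsilon g - \mathcal{U}_s^\varepsilon g\|_{\mathcal{L}^2} + \|\mathcal{U}_s^0 g - \mathcal{U}^0 g\|_{\mathcal{L}^2}. \tag{41} \]
Using (38) and the linearity of $\mathcal{U}^\varepsilon$ and $\mathcal{U}^0$ with respect to the potential, we have
\[ \|\mathcal{U}^\varepsilon g - \mathcal{U}_s^\varepsilon g\|_{\mathcal{L}^2} \le \|V - V_s^\varepsilon\|_{\mathcal{W}_0} \|g\|_{\mathcal{L}^2}, \qquad \varepsilon \ge 0. \]
Recalling the definition (26), we also have
\begin{align*}
\|V - V_s^\varepsilon\|_{\mathcal{W}_0}
&= \frac{1}{(2\pi)^{d/2}} \operatorname*{ess\,sup}_{z \in \mathcal{C}} \int_{k \notin \mathcal{B}/3\varepsilon} |\hat V(k, z)|\, dk \\
&\le \frac{1}{(2\pi)^{d/2}} \operatorname*{ess\,sup}_{z \in \mathcal{C}} \int_{k \notin \mathcal{B}/3\varepsilon} \Big( \frac{3\varepsilon |k|}{R} \Big)^{\mu} |\hat V(k, z)|\, dk \le \Big( \frac{3\varepsilon}{R} \Big)^{\mu} \|V\|_{\mathcal{W}_\mu},
\end{align*}
where $R > 0$ is the radius of a sphere contained in $\mathcal{B}$. Then (still in the case $\operatorname{supp}(g_n) \subset \mathcal{B}/3\varepsilon$), from (41) we get
\[ \|\mathcal{U}^\varepsilon g - \mathcal{U}^0 g\|_{\mathcal{L}^2} \le 2 \Big( \frac{3\varepsilon}{R} \Big)^{\mu} \|V\|_{\mathcal{W}_\mu} \|g\|_{\mathcal{L}^2}. \tag{42} \]
Now, if $g \in \mathcal{L}^2_\mu$ (Definition 2.3), we can write (using $\mathbb{1}^c = 1 - \mathbb{1}$)
\[ \|\mathcal{U}^\varepsilon g - \mathcal{U}^0 g\|_{\mathcal{L}^2} \le \|\mathcal{U}^\varepsilon\, \mathbb{1}^c_{\mathcal{B}/3\varepsilon}\, g\|_{\mathcal{L}^2} + \|(\mathcal{U}^\varepsilon - \mathcal{U}^0)\, \mathbb{1}_{\mathcal{B}/3\varepsilon}\, g\|_{\mathcal{L}^2} + \|\mathcal{U}^0\, \mathbb{1}^c_{\mathcal{B}/3\varepsilon}\, g\|_{\mathcal{L}^2}. \tag{43} \]
From (38) we have $\|\mathcal{U}^\varepsilon\, \mathbb{1}^c_{\mathcal{B}/3\varepsilon}\, g\|_{\mathcal{L}^2} \le \|V\|_{\mathcal{W}_0} \|\mathbb{1}^c_{\mathcal{B}/3\varepsilon}\, g\|_{\mathcal{L}^2}$, for all $\varepsilon \ge 0$. But
\[ \|\mathbb{1}^c_{\mathcal{B}/3\varepsilon}\, g\|^2_{\mathcal{L}^2} = \sum_n \int_{k \notin \mathcal{B}/3\varepsilon} |g_n(k)|^2\, dk \le \sum_n \int_{k \notin \mathcal{B}/3\varepsilon} \Big( \frac{3\varepsilon |k|}{R} \Big)^{2\mu} |g_n(k)|^2\, dk \le \Big( \frac{3\varepsilon}{R} \Big)^{2\mu} \|g\|^2_{\mathcal{L}^2_\mu}, \]
and so we can estimate the first and third term in the right-hand side of (43) as follows:
\[ \|\mathcal{U}^\varepsilon\, \mathbb{1}^c_{\mathcal{B}/3\varepsilon}\, g\|_{\mathcal{L}^2} + \|\mathcal{U}^0\, \mathbb{1}^c_{\mathcal{B}/3\varepsilon}\, g\|_{\mathcal{L}^2} \le 2 \Big( \frac{3\varepsilon}{R} \Big)^{\mu} \|V\|_{\mathcal{W}_0} \|g\|_{\mathcal{L}^2_\mu}. \]
Moreover, since Eq. (42) holds for $\mathbb{1}_{\mathcal{B}/3\varepsilon}\, g$, we can also estimate the second term:
\[ \|(\mathcal{U}^\varepsilon - \mathcal{U}^0)\, \mathbb{1}_{\mathcal{B}/3\varepsilon}\, g\|_{\mathcal{L}^2} \le 2 \Big( \frac{3\varepsilon}{R} \Big)^{\mu} \|V\|_{\mathcal{W}_\mu} \|g\|_{\mathcal{L}^2}. \]
Since $\|V\|_{\mathcal{W}_0} \le \|V\|_{\mathcal{W}_\mu}$ and $\|g\|_{\mathcal{L}^2} \le \|g\|_{\mathcal{L}^2_\mu}$, from (43) we conclude that (39) holds, with $c_\mu = 4(3/R)^\mu$ (note that $R$ does not depend on $\varepsilon$).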

4. Diagonalization of the k·p Hamiltonian

In this section, we consider the case $V(x, z) = 0$ and concentrate on the diagonalization of the free k·p Hamiltonian. The envelope function dynamics are then given in Fourier variables by Eq. (29), which we rewrite in the form
\[ i\varepsilon^2 \partial_t g_n^\varepsilon(t, k) = \frac{\varepsilon^2}{2} |k|^2 g_n^\varepsilon(t, k) - i\varepsilon \sum_{n'} k \cdot P_{nn'}\, g_{n'}^\varepsilon(t, k) + E_n\, g_n^\varepsilon(t, k). \tag{44} \]
Putting $\xi = \varepsilon k$, we are therefore led to consider, for any fixed $\xi \in \mathbb{R}^d$, the following operators, acting in $\ell^2 \equiv \ell^2(\mathbb{N}, \mathbb{C})$ and defined on their maximal domains:
\[ (A_0)_{nn'} = E_n \delta_{nn'}, \qquad (A_1(\xi))_{nn'} = -i\xi \cdot P_{nn'}, \qquad (A_2(\xi))_{nn'} = \frac12 |\xi|^2 \delta_{nn'}. \tag{45} \]
Moreover, we put $A(\xi) = A_0 + A_1(\xi) + A_2(\xi)$, so that
\[ (A(\xi))_{nn'} = E_n \delta_{nn'} - i\xi \cdot P_{nn'} + \frac12 |\xi|^2 \delta_{nn'} \tag{46} \]

is the operator at the right-hand side of Eq. (44) (with $\xi = \varepsilon k$).

Lemma 4.1. The following properties hold:
(a) for any given $\xi \in \mathbb{R}^d$, $A_1(\xi)$ is $A_0$-bounded with $A_0$-bound less than 1, which implies that $A(\xi) = A_0 + A_1(\xi) + A_2(\xi)$ is self-adjoint on the (fixed) domain of $A_0$, that is
\[ D(A_0) = \Big\{ g \in \ell^2 \;\Big|\; \sum_n |E_n g_n|^2 < \infty \Big\}; \tag{47} \]
(b) $\{A(\xi) \mid \xi \in \mathbb{R}^d\}$ is a holomorphic family of type (A) of self-adjoint operators [14];
(c) for any given $\xi \in \mathbb{R}^d$, $A(\xi)$ has compact resolvent, which implies that $A(\xi)$ has a sequence of eigenvalues $\lambda_1(\xi) \le \lambda_2(\xi) \le \lambda_3(\xi) \le \cdots$, with $\lambda_n(\xi) \to \infty$, and a corresponding sequence $\varphi^{(1)}(\xi), \varphi^{(2)}(\xi), \varphi^{(3)}(\xi), \ldots$ of orthonormal eigenvectors.

Proof. (a) We first recall (see (27)) that $(v_n, E_n)$ is an eigencouple of $H^1_{\mathcal{L}} = -\frac12 \Delta + W_{\mathcal{L}}$ on the domain $H^2_{per}(\mathcal{C})$ (the subscript "per" denoting periodic boundary conditions). The operator $A_0$ is the representation in the basis $\{v_n\}$ of the operator $H^1_{\mathcal{L}}$, while $A_1(\xi)$ is the representation in the same basis of $-i\xi \cdot \nabla$ with domain $H^1(\mathcal{C})$:
\[ D(A_0) \equiv H^2_{per}(\mathcal{C}) \subset H^1(\mathcal{C}) \equiv D(A_1(\xi)). \]
Then, for any given sequence $(g_n)$, denoting $g(x) = \sum_n g_n v_n(x)$, we have
\[ \frac12 \int_{\mathcal{C}} |\nabla g(x)|^2\, dx + \int_{\mathcal{C}} W_{\mathcal{L}}(x) |g(x)|^2\, dx = \langle H^1_{\mathcal{L}}\, g, g \rangle_{L^2(\mathcal{C})} = \sum_n E_n |g_n|^2. \]
Since $W_{\mathcal{L}}$ is bounded and (without loss of generality) $W_{\mathcal{L}} \ge 0$, then for $g \in D(A_0)$ we obtain
\[ \|A_1(\xi) g\|^2_{\ell^2} \le |\xi|^2 \|\nabla g\|^2_{L^2(\mathcal{C})} \le 2|\xi|^2 \sum_n E_n |g_n|^2, \tag{48} \]

where we used the notation $g$ for both $g(x) = \sum_n g_n v_n(x)$ and for the sequence $g = (g_n) \in \ell^2$. Since $E_n \to \infty$, then, for any given $0 < b < 1$, a positive integer $n(\xi)$ exists such that $2|\xi|^2 E_n < b E_n^2$ for $n \ge n(\xi)$ and we can write
\[ 2|\xi|^2 \sum_n E_n |g_n|^2 \le 2|\xi|^2 E_{n(\xi)} \sum_{n=1}^{n(\xi)} |g_n|^2 + \sum_{n=n(\xi)}^{\infty} b\, |E_n g_n|^2. \]
Thus, $\|A_1(\xi) g\|^2_{\ell^2} \le 2|\xi|^2 E_{n(\xi)} \|g\|^2_{\ell^2} + b\, \|A_0 g\|^2_{\ell^2}$, with $b < 1$, which proves point (a). The proof of the remaining points is standard (see Refs. [9,14,20]).

Remark 4.1. Recalling Definition 2.1 and Eq. (10) we see that $A(\xi)$ is nothing else but the expression of the fiber Hamiltonian $H_{\mathcal{L}}(\xi)$ in the Bloch basis $v_n = u_{n,0}$. Then, the diagonalization of $A(\xi)$ corresponds to the diagonalization of $H_{\mathcal{L}}(\xi)$ and, therefore, the eigenvalues $\lambda_n(\xi)$ coincide with the energy bands $E_n(\xi)$ inside the Brillouin zone. Moreover, $\varphi^{(n)}(\xi)$ is clearly the component expression of $u_{n,\xi}$ in the basis $u_{n',0}$, i.e. $\varphi^{(n)}_{n'}(\xi) = \langle u_{n,\xi}, u_{n',0} \rangle_{L^2(\mathcal{C})}$. The difference between the bands $\lambda_n$ and the bands $E_n$ is more evident in the case of a "shifted" representation that will be treated in Subsect. 6.1.

The eigenvalues $\lambda_n(\xi)$ have been numbered in increasing order for each $\xi$; this means that, when an eigenvalue crossing occurs, the smoothness of $\lambda_n(\xi)$ (and of $\varphi^{(n)}(\xi)$) is lost. Let us make, therefore, our central assumption that the eigenvalues $\lambda_n(0) = E_n$ are simple (see also Remark 2.2), i.e.
\[ E_1 < E_2 < E_3 < \cdots, \]
with $\lim_{n \to \infty} E_n = +\infty$ by (c) of Lemma 4.1. In this case, $\lambda_n(\xi)$ and $\varphi^{(n)}(\xi)$ are analytic in a neighborhood of the origin. Of course, such a neighborhood depends on $n$: the next lemma allows us to estimate the growth of the eigenvalues and, consequently, the size of the analyticity domain.

Lemma 4.2. For any given $\xi \in \mathbb{R}^d$, an integer $n_0(\xi) \ge 0$ exists such that
\[ |\lambda_n(\xi) - E_n| \le |\xi| \sqrt{2 E_n} + \frac12 |\xi|^2, \qquad \text{for all } n \ge n_0(\xi). \tag{49} \]

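Proof. The behavior of the eigenvalues $\lambda_n(\xi)$ for large $n$ will be investigated by means of the max-min principle, which holds for increasingly-ordered eigenvalues [19]. Since the operators $A(\xi)$ have compact resolvent, the max-min principle reads as follows:
\[ \lambda_n(\xi) = \max_{S \in \mathcal{M}_{n-1}}\ \min_{g \in S^\perp \cap D(A_0),\ \|g\| = 1} \langle A(\xi) g, g \rangle_{\ell^2}, \]
where $\mathcal{M}_n$ denotes the set of all subspaces of dimension $n$. In particular,
\[ \lambda_n(0) = E_n = \max_{S \in \mathcal{M}_{n-1}}\ \min_{g \in S^\perp \cap D(A_0),\ \|g\| = 1} \langle A_0 g, g \rangle_{\ell^2}. \]
Let $g \in D(A_0)$ with $\|g\|_{\ell^2} = 1$. From (48) we have
\[ \|A_1(\xi) g\|^2_{\ell^2} \le 2|\xi|^2 \sum_n E_n |g_n|^2 = 2|\xi|^2 \langle A_0 g, g \rangle_{\ell^2} \]
and, therefore, $|\langle A_1(\xi) g, g \rangle_{\ell^2}| \le \|A_1(\xi) g\|_{\ell^2} \le \sqrt{2}\, |\xi|\, \langle A_0 g, g \rangle^{1/2}_{\ell^2}$, which, using $A(\xi) = A_0 + A_1(\xi) + A_2(\xi)$, yields
\[ \big| \langle A(\xi) g, g \rangle_{\ell^2} - \langle A_0 g, g \rangle_{\ell^2} \big| \le \sqrt{2}\, |\xi|\, \langle A_0 g, g \rangle^{1/2}_{\ell^2} + \frac12 |\xi|^2. \tag{50} \]
From (50) we get, in particular,
\[ \langle A(\xi) g, g \rangle_{\ell^2} \le \langle A_0 g, g \rangle_{\ell^2} + \sqrt{2}\, |\xi|\, \langle A_0 g, g \rangle^{1/2}_{\ell^2} + \frac12 |\xi|^2, \]
which allows us to estimate $\lambda_n(\xi)$ from above. In fact, since $x + \sqrt{2}\, |\xi|\, x^{1/2} + \frac12 |\xi|^2$ is an increasing function of $x$, we can write
\begin{align*}
\max \min \langle A(\xi) g, g \rangle_{\ell^2}
&\le \max \min \Big( \langle A_0 g, g \rangle_{\ell^2} + \sqrt{2}\, |\xi|\, \langle A_0 g, g \rangle^{1/2}_{\ell^2} + \frac12 |\xi|^2 \Big) \\
&\le \max \min \langle A_0 g, g \rangle_{\ell^2} + \sqrt{2}\, |\xi|\, \big( \max \min \langle A_0 g, g \rangle_{\ell^2} \big)^{1/2} + \frac12 |\xi|^2
\end{align*}
(where "max min" is a shorthand for $\max_{S \in \mathcal{M}_{n-1}} \min_{g \in S^\perp \cap D(A_0),\ \|g\| = 1}$). Then,
\[ \lambda_n(\xi) \le E_n + \sqrt{2}\, |\xi|\, E_n^{1/2} + \frac{|\xi|^2}{2}, \tag{51} \]
which holds for all $n \in \mathbb{N}$. We now estimate $\lambda_n(\xi)$ from below, at least for large $n$. From (50) we get
\[ \langle A(\xi) g, g \rangle_{\ell^2} \ge \langle A_0 g, g \rangle_{\ell^2} - \sqrt{2}\, |\xi|\, \langle A_0 g, g \rangle^{1/2}_{\ell^2} - \frac12 |\xi|^2, \]
and we remark that $x - \sqrt{2}\, |\xi|\, x^{1/2} - |\xi|^2/2$ is an increasing function of $x$ for $x \ge |\xi|^2/2$. Thus, let $n_0(\xi)$ be such that $E_{n_0(\xi)} \ge |\xi|^2/2$ and fix $n \ge n_0(\xi)$. Let us define
\[ S^0_{n-1} = \operatorname{span}\{e^{(1)}, e^{(2)}, \ldots, e^{(n-1)}\}, \]
where $\{e^{(n)} \mid n \in \mathbb{N}\}$ is the canonical basis of $\ell^2$ (eigenbasis of $A_0$). We therefore have
\[ \min_{g \in S^{0\perp}_{n-1} \cap D(A_0),\ \|g\| = 1} \langle A_0 g, g \rangle_{\ell^2} = E_n, \]
because $S^{0\perp}_{n-1} = \operatorname{span}\{e^{(n)}, e^{(n+1)}, \ldots\}$. Thus, for every $g \in S^{0\perp}_{n-1} \cap D(A_0)$ with $\|g\|_{\ell^2} = 1$, we can write
\[ \langle A(\xi) g, g \rangle_{\ell^2} \ge \langle A_0 g, g \rangle_{\ell^2} - \sqrt{2}\, |\xi|\, \langle A_0 g, g \rangle^{1/2}_{\ell^2} - \frac12 |\xi|^2 \ge E_n - \sqrt{2}\, |\xi|\, E_n^{1/2} - \frac12 |\xi|^2 \]
(because $E_n \ge E_{n_0(\xi)} \ge |\xi|^2/2$), and so
\[ \min_{g \in S^{0\perp}_{n-1} \cap D(A_0),\ \|g\| = 1} \langle A(\xi) g, g \rangle_{\ell^2} \ge E_n - \sqrt{2}\, |\xi|\, E_n^{1/2} - \frac12 |\xi|^2. \]
Since $S^0_{n-1} \in \mathcal{M}_{n-1}$, we conclude that
\[ \lambda_n(\xi) \ge E_n - \sqrt{2}\, |\xi|\, E_n^{1/2} - \frac12 |\xi|^2, \qquad n \ge n_0(\xi), \tag{52} \]
which, together with (51), yields (49).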

From (49) we see that, for fixed $\xi$, the sequences $E_n$ and $\lambda_n(\xi)$ are asymptotically equivalent. Moreover it is not difficult to prove the following.

Corollary 4.1. A constant $C_0$, independent of $n$, exists such that $\lambda_n(\xi) > \lambda_{n-1}(\xi)$ for all $|\xi| \le C_0 (E_n - E_{n-1})/\sqrt{E_n}$. Then, the first $N$ bands do not cross each other in a ball of radius
\[ R_N = \frac{C_0}{\sqrt{E_N}} \min\{E_n - E_{n-1} \mid 2 \le n \le N\} > 0. \]
Let us now consider the family of diagonalization operators $\{T(\xi) : \ell^2 \to \ell^2 \mid \xi \in \mathbb{R}^d\}$, i.e. the unitary operators that map 1-1 the basis $\{e^{(n)} \mid n \in \mathbb{N}\}$ onto the basis $\{\varphi^{(n)}(\xi) \mid n \in \mathbb{N}\}$, so that
\[ \Lambda(\xi) = T^*(\xi) A(\xi) T(\xi) = \begin{pmatrix} \lambda_1(\xi) & 0 & 0 & \cdots \\ 0 & \lambda_2(\xi) & 0 & \cdots \\ 0 & 0 & \lambda_3(\xi) & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}. \tag{53} \]

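Definition 4.1. We denote by $\Pi_N$ the projection on the $N$-dimensional subspace spanned by $\{e^{(1)}, e^{(2)}, \ldots, e^{(N)}\}$. In other words, $\Pi_N$ is the cut-off operator after the $N$th component, which can be thought of indifferently as acting either on $\ell^2$ or on $\mathcal{L}^2$ (see Definition 2.3).

Theorem 4.1. For any given $\varepsilon \ge 0$ let us define $T_\varepsilon$ on the space $\mathcal{L}^2$ by
\[ \big(T_\varepsilon g\big)(k) = T(\varepsilon k)\, g(k). \tag{54} \]
Then, for every $\varepsilon \ge 0$, the operator $T_\varepsilon : \mathcal{L}^2 \to \mathcal{L}^2$ is unitary, with $T_0 = I$. Moreover, if $g \in \mathcal{L}^2_\mu$ for some $\mu > 0$, then a constant $C(\mu, N)$, independent of $\varepsilon$, exists such that
\[ \|(T_\varepsilon - I)\, \Pi_N g\|_{\mathcal{L}^2} \le \varepsilon^{\min\{\mu, 1\}}\, C(\mu, N)\, \|g\|_{\mathcal{L}^2_\mu}, \tag{55} \]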

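where $\Pi_N$ is the above defined cut-off operator.

Proof. The first part of the statement is clear, because
\[ \int_{\mathbb{R}^d} \|T(\varepsilon k) g(k)\|^2_{\ell^2}\, dk = \int_{\mathbb{R}^d} \|g(k)\|^2_{\ell^2}\, dk = \|g\|^2_{\mathcal{L}^2} \]
and $\lambda_n(0) = E_n$. Since the first $N$ bands do not cross in a ball of radius $R_N$ (see Corollary 4.1), then $\xi \mapsto T(\xi)\Pi_N$ is unitary analytic from the span of $\{e^{(1)}, e^{(2)}, \ldots, e^{(N)}\}$ to the span of $\{\varphi^{(1)}(\xi), \varphi^{(2)}(\xi), \ldots, \varphi^{(N)}(\xi)\}$, in $|\xi| \le R_N$. Let $g \in \mathcal{L}^2_\mu$ and put $g^{(N)} = \Pi_N g$. Let $\varepsilon > 0$ and $r > 0$ be such that $\varepsilon r \le R_N$. Then, using the analyticity of $T(\varepsilon k)\Pi_N$ in $|k| \le r \le R_N/\varepsilon$, we can write
\begin{align*}
\|(T_\varepsilon - I) g^{(N)}\|^2_{\mathcal{L}^2}
&= \int_{\mathbb{R}^d} \|(T(\varepsilon k) - I)\, g^{(N)}(k)\|^2_{\ell^2}\, dk \\
&= \int_{|k| \le r} \|(T(\varepsilon k) - I)\, g^{(N)}(k)\|^2_{\ell^2}\, dk + \int_{|k| > r} \|(T(\varepsilon k) - I)\, g^{(N)}(k)\|^2_{\ell^2}\, dk \\
&\le L_N^2\, \varepsilon^2 \int_{|k| \le r} |k|^2 \|g^{(N)}(k)\|^2_{\ell^2}\, dk + \frac{4}{r^{2\mu}} \int_{|k| > r} |k|^{2\mu} \|g^{(N)}(k)\|^2_{\ell^2}\, dk,
\end{align*}
for some Lipschitz constant $L_N > 0$. Now, it can be easily verified that the inequality
\[ |k|^n \le (1 + |k|^\mu)\, r^{\max\{n - \mu,\, 0\}} \tag{56} \]
holds for any $r > 0$, $n \ge 0$, $\mu \ge 0$, and $|k| \le r$. From this (with $n = 1$) we get
\[ \int_{|k| \le r} |k|^2 \|g^{(N)}(k)\|^2_{\ell^2}\, dk \le r^{2\max\{1 - \mu,\, 0\}} \int_{|k| \le r} (1 + |k|^\mu)^2 \|g^{(N)}(k)\|^2_{\ell^2}\, dk \]
and, therefore,
\[ \|(T_\varepsilon - I) g^{(N)}\|^2_{\mathcal{L}^2} \le \Big[ L_N^2\, \varepsilon^2\, r^{2\max\{1 - \mu,\, 0\}} + 4 r^{-2\mu} \Big] \|g\|^2_{\mathcal{L}^2_\mu}. \]
Choosing $r = R_N/\varepsilon$ we obtain inequality (55), with
\[ C(\mu, N) = \Big[ L_N^2\, R_N^{2\max\{1 - \mu,\, 0\}} + 4 R_N^{-2\mu} \Big]^{1/2}. \]
Let us now consider the second-order approximation of $\Lambda(\xi)$,
\[ \Lambda^{(2)}(\xi) = \begin{pmatrix} \lambda^{(2)}_1(\xi) & 0 & 0 & \cdots \\ 0 & \lambda^{(2)}_2(\xi) & 0 & \cdots \\ 0 & 0 & \lambda^{(2)}_3(\xi) & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \tag{57} \]
where $\lambda^{(2)}_n(\xi)$ is the second-order Taylor approximation of $\lambda_n(\xi)$:
\[ \lambda_n(\xi) = \lambda^{(2)}_n(\xi) + O\big(|\xi|^3\big). \]
The approximated eigenvalues $\lambda^{(2)}_n(\xi)$ can be computed by means of standard nondegenerate perturbation techniques, which yield
\[ \lambda^{(2)}_n(\xi) = E_n + \frac12\, \xi \cdot M_n^{-1} \xi, \tag{58} \]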

where
\[ M_n^{-1} = \nabla \otimes \nabla \lambda_n(\xi)\big|_{\xi = 0} = I - 2 \sum_{n' \ne n} \frac{P_{nn'} \otimes P_{n'n}}{E_n - E_{n'}} \tag{59} \]
is the $n$th band effective mass tensor [24] (we recall that $P_{nn'} = 0$ if $n = n'$). Note that the 1st order term in (58) is zero. The operators $\Lambda(\xi)$ and $\Lambda^{(2)}(\xi)$, which are self-adjoint on their maximal domains, generate, respectively, the exact dynamics and the effective mass dynamics (in Fourier variables and in the absence of external fields).

Theorem 4.2. Let $g^{in} \in \mathcal{L}^2_\mu$, for some $\mu > 0$, and assume $g^{in} = \Pi_N g^{in}$ (i.e. the initial datum is confined in the first $N$ bands). Then, a constant $C(\mu, N, t) \ge 0$, independent of $\varepsilon$, exists such that
\[ \Big\| \Big( e^{-\frac{it}{\varepsilon^2} \Lambda(\varepsilon k)} - e^{-\frac{it}{\varepsilon^2} \Lambda^{(2)}(\varepsilon k)} \Big) g^{in} \Big\|_{\mathcal{L}^2} \le \varepsilon^{\min\{\mu/3,\, 1\}}\, C(\mu, N, t)\, \|g^{in}\|_{\mathcal{L}^2_\mu}. \tag{60} \]
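Formulas (58) and (59) can be illustrated on a toy model. The sketch below assumes $d = 1$ and only two bands, with momentum matrix elements $P_{12} = p$, $P_{21} = -p$ ($p$ real, chosen so that $A(\xi)$ in (46) is Hermitian); the band energies, the value of $p$ and the function names are hypothetical. For a $2 \times 2$ Hermitian matrix the eigenvalues are available in closed form, so the exact bands can be compared with the second-order approximation.

```python
import math


def exact_bands(xi, e1, e2, p):
    """Eigenvalues of A(xi) = [[e1 + xi^2/2, -i*xi*p], [i*xi*p, e2 + xi^2/2]]."""
    mean = 0.5 * (e1 + e2) + 0.5 * xi * xi
    half_gap = math.sqrt((0.5 * (e2 - e1)) ** 2 + (xi * p) ** 2)
    return mean - half_gap, mean + half_gap


def inverse_effective_mass(n, e1, e2, p):
    """Scalar form of (59) for band n in {0, 1} of the two-band toy model."""
    energies = (e1, e2)
    other = 1 - n
    # In this model P_{nn'} * P_{n'n} = p * (-p) = -p^2.
    return 1.0 - 2.0 * (-p * p) / (energies[n] - energies[other])


def second_order_band(n, xi, e1, e2, p):
    """Second-order approximation (58): E_n + (1/2) * M_n^{-1} * xi^2."""
    energies = (e1, e2)
    return energies[n] + 0.5 * inverse_effective_mass(n, e1, e2, p) * xi * xi
```

With `e1 = 0.0`, `e2 = 1.0`, `p = 0.5` the lower band has inverse effective mass 0.5 and the upper band 1.5; the gap between `exact_bands` and `second_order_band` shrinks rapidly as `xi` decreases, in line with the $O(|\xi|^3)$ remainder stated above (in this symmetric toy model the remainder is in fact of fourth order).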

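Proof. Note that, since $\Lambda(\varepsilon k)$ and $\Lambda^{(2)}(\varepsilon k)$ are diagonal, then both $e^{-\frac{it}{\varepsilon^2} \Lambda(\varepsilon k)} g^{in}$ and $e^{-\frac{it}{\varepsilon^2} \Lambda^{(2)}(\varepsilon k)} g^{in}$ remain confined in the first $N$ bands at all times. Denoting $g^\varepsilon(t, k) = e^{-\frac{it}{\varepsilon^2} \Lambda^{(2)}(\varepsilon k)} g^{in}$, the function $h^\varepsilon(t, k) = \big( e^{-\frac{it}{\varepsilon^2} \Lambda(\varepsilon k)} - e^{-\frac{it}{\varepsilon^2} \Lambda^{(2)}(\varepsilon k)} \big) g^{in}$ satisfies the Duhamel formula
\[ h^\varepsilon(t, k) = -i \int_0^t e^{-\frac{i(t-s)}{\varepsilon^2} \Lambda(\varepsilon k)}\, \frac{\Lambda(\varepsilon k) - \Lambda^{(2)}(\varepsilon k)}{\varepsilon^2}\, g^\varepsilon(s, k)\, ds, \]
so that
\[ \|h^\varepsilon(t, k)\|_{\ell^2} \le \int_0^t \Big\| \frac{\Lambda(\varepsilon k) - \Lambda^{(2)}(\varepsilon k)}{\varepsilon^2}\, g^\varepsilon(s, k) \Big\|_{\ell^2} ds. \]
Since $\lambda_1(\xi), \ldots, \lambda_N(\xi)$ are analytic for $|\xi| \le R_N$ (see Corollary 4.1), then a Lipschitz constant $L_N$ exists such that
\[ \Big\| \frac{\Lambda(\varepsilon k) - \Lambda^{(2)}(\varepsilon k)}{\varepsilon^2}\, g^\varepsilon(s, k) \Big\|_{\ell^2} \le L_N\, \varepsilon |k|^3\, \|g^\varepsilon(s, k)\|_{\ell^2} = L_N\, \varepsilon |k|^3\, \|g^{in}(k)\|_{\ell^2} \]
for all $k$ with $\varepsilon |k| \le R_N$ (where we also used the fact that the $\ell^2$ norm of $g^\varepsilon$ is conserved during the unitary evolution). Now we can proceed as in the proof of Theorem 4.1: if $r > 0$ is such that $\varepsilon r \le R_N$, then we can write
\[ \int_{|k| \le r} \|h^\varepsilon(t, k)\|^2_{\ell^2}\, dk \le (\varepsilon L_N t)^2 \int_{|k| \le r} |k|^6 \|g^{in}(k)\|^2_{\ell^2}\, dk \]
and, using inequality (56) with $n = 3$,
\[ \int_{|k| \le r} \|h^\varepsilon(t, k)\|^2_{\ell^2}\, dk \le \big( \varepsilon L_N t\, r^{\max\{3 - \mu,\, 0\}} \big)^2 \|g^{in}\|^2_{\mathcal{L}^2_\mu}. \]
Moreover,
\[ \int_{|k| > r} \|h^\varepsilon(t, k)\|^2_{\ell^2}\, dk \le \frac{1}{r^{2\mu}} \int_{|k| > r} |k|^{2\mu} \|h^\varepsilon(t, k)\|^2_{\ell^2}\, dk \le \frac{4}{r^{2\mu}} \int_{|k| > r} |k|^{2\mu} \|g^\varepsilon(t, k)\|^2_{\ell^2}\, dk \le \frac{4}{r^{2\mu}} \|g^{in}\|^2_{\mathcal{L}^2_\mu}, \]
where we used the fact that $\|g^\varepsilon(t, k)\|_{\ell^2} = \|g^{in}(k)\|_{\ell^2}$ for all $t$. Hence,
\[ \|h^\varepsilon(t)\|^2_{\mathcal{L}^2} \le \Big[ \big( \varepsilon L_N t\, r^{\max\{3 - \mu,\, 0\}} \big)^2 + 4 r^{-2\mu} \Big] \|g^{in}\|^2_{\mathcal{L}^2_\mu} \]
and, choosing $r = R_N/\varepsilon^{1/3}$, we obtain $\|h^\varepsilon(t)\|_{\mathcal{L}^2} \le C(\mu, N, t)\, \varepsilon^{\min\{\mu/3,\, 1\}}\, \|g^{in}\|_{\mathcal{L}^2_\mu}$, that is inequality (60), with
\[ C(\mu, N, t) = \Big[ \big( L_N t\, R_N^{\max\{3 - \mu,\, 0\}} \big)^2 + 4 R_N^{-2\mu} \Big]^{1/2}. \]

5. Comparison of the Models

We are now in a position to exhibit the ensemble of models encountered and to compare their respective dynamics.

Exact dynamics. We first started with the exact dynamics. Let the wave function $\psi^\varepsilon(t, x)$ be the solution of the initial value problem (28). If we denote by $f_n^{in,\varepsilon}(x)$ the $\varepsilon$-scaled envelope functions of the initial wave function $\psi^{in,\varepsilon}$, relative to the basis $v_n$, and by $g_n^{in,\varepsilon}(k)$ their Fourier transforms, then the Fourier transformed envelope functions $g_n^\varepsilon$ of $\psi^\varepsilon$ are the solutions of
\[ i\partial_t g^\varepsilon = A^\varepsilon_{kp}\, g^\varepsilon + \mathcal{U}^\varepsilon g^\varepsilon, \qquad g^\varepsilon(t = 0) = g^{in,\varepsilon}, \tag{61} \]
where
\[ \big( A^\varepsilon_{kp}\, g \big)_n(k) = \frac{1}{\varepsilon^2} \big( A(\varepsilon k)\, g(k) \big)_n = \Big( \frac{|k|^2}{2} + \frac{E_n}{\varepsilon^2} \Big) g_n(k) - \frac{i}{\varepsilon} \sum_{n'} k \cdot P_{nn'}\, g_{n'}(k) \tag{62} \]
and $\mathcal{U}^\varepsilon$ has been defined in (34).

The k·p model. The k·p approximation consists in passing to the limit in $\mathcal{U}^\varepsilon$. Therefore, we define $g^\varepsilon_{kp}(t)$ as the solution of
\[ i\partial_t g^\varepsilon_{kp} = A^\varepsilon_{kp}\, g^\varepsilon_{kp} + \mathcal{U}^0 g^\varepsilon_{kp}, \qquad g^\varepsilon_{kp}(t = 0) = g^{in,\varepsilon}, \tag{63} \]
where $\mathcal{U}^0$ is given by (36). It is worth noting that the back Fourier transform of $g^\varepsilon_{kp}(t)$, which we will denote by $f^\varepsilon_{kp}(t, x)$, is the solution of
\[ \begin{cases} i\partial_t f^\varepsilon_{kp,n} = \dfrac{E_n}{\varepsilon^2} f^\varepsilon_{kp,n} - \dfrac12 \Delta f^\varepsilon_{kp,n} - \dfrac{1}{\varepsilon} \displaystyle\sum_{n'} P_{nn'} \cdot \nabla f^\varepsilon_{kp,n'} + \sum_{n'} V_{nn'}\, f^\varepsilon_{kp,n'}, \\[6pt] f^\varepsilon_{kp,n}(t = 0) = f_n^{in,\varepsilon}(x), \end{cases} \tag{64} \]
where $V_{nn'}$ is given by (30).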

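Effective mass model. The diagonalization of the operator $A^\varepsilon_{kp}$ performed in the previous section leads to the effective mass dynamics
\[ i\partial_t g^\varepsilon_{em} = A^\varepsilon_{em}\, g^\varepsilon_{em} + \mathcal{U}^0 g^\varepsilon_{em}, \qquad g^\varepsilon_{em}(t = 0) = g^{in,\varepsilon}, \tag{65} \]
where
\[ \big( A^\varepsilon_{em}\, g \big)_n(k) = \frac{1}{\varepsilon^2} \big( \Lambda^{(2)}(\varepsilon k)\, g(k) \big)_n = \Big( \frac{E_n}{\varepsilon^2} + \frac12\, k \cdot M_n^{-1} k \Big) g_n(k). \tag{66} \]
The back Fourier transform of $g^\varepsilon_{em}(t, k)$, solution of (65), will be denoted by $f^\varepsilon_{em}(t, x)$ and it is easily shown to be the solution of
\[ \begin{cases} i\partial_t f^\varepsilon_{em,n} = \dfrac{1}{\varepsilon^2} E_n f^\varepsilon_{em,n} - \dfrac12 \operatorname{div}\big( M_n^{-1} \nabla f^\varepsilon_{em,n} \big) + \displaystyle\sum_{n'} V_{nn'}\, f^\varepsilon_{em,n'}, \\[6pt] f^\varepsilon_{em,n}(t = 0) = f_n^{in,\varepsilon}(x). \end{cases} \tag{67} \]
This equation still involves oscillations in time. These oscillations can be filtered out by setting
\[ f^\varepsilon_{em,n}(t, x) = h^\varepsilon_{em,n}(t, x)\, e^{-i E_n t/\varepsilon^2}, \tag{68} \]
where $h^\varepsilon_{em,n}$ is the solution of
\[ \begin{cases} i\partial_t h^\varepsilon_{em,n} = -\dfrac12 \operatorname{div}\big( M_n^{-1} \nabla h^\varepsilon_{em,n} \big) + \displaystyle\sum_{n'} e^{i\omega_{nn'} t/\varepsilon^2}\, V_{nn'}\, h^\varepsilon_{em,n'}, \\[6pt] h^\varepsilon_{em,n}(t = 0) = f_n^{in,\varepsilon}(x), \end{cases} \tag{69} \]
where
\[ \omega_{nn'} = E_n - E_{n'}. \tag{70} \]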
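The passage from (67) to (69) is a short computation. The following worked step, a sketch that uses only the substitution (68) and the definition (70), spells it out.

```latex
% Substitute f_{em,n} = h_{em,n} e^{-i E_n t / \varepsilon^2} into (67).
\begin{align*}
i\partial_t f^\varepsilon_{em,n}
  &= \Big( i\partial_t h^\varepsilon_{em,n}
     + \frac{E_n}{\varepsilon^2}\, h^\varepsilon_{em,n} \Big)
     e^{-i E_n t/\varepsilon^2}, \\
\text{r.h.s. of (67)}
  &= \Big( \frac{E_n}{\varepsilon^2}\, h^\varepsilon_{em,n}
     - \frac12 \operatorname{div}\big( M_n^{-1} \nabla h^\varepsilon_{em,n} \big) \Big)
     e^{-i E_n t/\varepsilon^2}
     + \sum_{n'} V_{nn'}\, h^\varepsilon_{em,n'}\, e^{-i E_{n'} t/\varepsilon^2}.
\end{align*}
% The E_n / \varepsilon^2 terms cancel; multiplying by e^{+i E_n t / \varepsilon^2} gives (69):
\[
i\partial_t h^\varepsilon_{em,n}
  = -\frac12 \operatorname{div}\big( M_n^{-1} \nabla h^\varepsilon_{em,n} \big)
    + \sum_{n'} e^{i (E_n - E_{n'}) t/\varepsilon^2}\, V_{nn'}\, h^\varepsilon_{em,n'} .
\]
```

The diagonal term $n' = n$ carries no phase, since $\omega_{nn} = 0$, while the off-diagonal terms oscillate with frequency $\omega_{nn'}/\varepsilon^2$.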

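Limit effective mass dynamics. The limit $h_{em,n}$ as $\varepsilon \to 0$ of these functions is the solution of
\[ i\partial_t h_{em,n} = -\frac12 \operatorname{div}\big( M_n^{-1} \nabla h_{em,n} \big) + V_n\, h_{em,n}, \qquad h_{em,n}(t = 0) = f_n^{in}(x), \tag{71} \]
where
\[ V_n(x) \equiv V_{nn}(x) = \int_{\mathcal{C}} V(x, z)\, |v_n(z)|^2\, dz \tag{72} \]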

and $f_n^{in}(x)$ is the limit as $\varepsilon$ tends to zero of $f_n^{in,\varepsilon}(x)$, which will be made precise later on.

Remark 5.1. The free k·p operator $A(\xi)$ and the effective mass operator $\Lambda^{(2)}(\xi)$ (see definitions (46) and (57)) are now re-introduced as operators acting in $\mathcal{L}^2$. Recalling definition (53), we shall also consider the diagonal k·p operator
\[ \big( \Lambda_\varepsilon\, g \big)_n(k) = \frac{1}{\varepsilon^2} \big( \Lambda(\varepsilon k)\, g(k) \big)_n = \frac{1}{\varepsilon^2}\, \lambda_n(\varepsilon k)\, g_n(k). \tag{73} \]
The operators $A^\varepsilon_{kp}$, $A^\varepsilon_{em}$ and $\Lambda_\varepsilon$ are "fibered" self-adjoint operators in $\mathcal{L}^2$, with fiber space $\ell^2$. It is well known (see Ref. [20]) that a fibered self-adjoint operator $L$ in $\mathcal{L}^2$ has self-adjointness domain
\[ D(L) = \Big\{ g \in \mathcal{L}^2 \;\Big|\; g(\xi) \in D(L(\xi)) \text{ a.e. } \xi \in \mathbb{R}^d \text{ and } \int_{\mathbb{R}^d} \|L(\xi)\, g(\xi)\|^2_{\ell^2}\, d\xi < \infty \Big\}, \]

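where $D(L(\xi))$ is the self-adjointness domain of $L(\xi)$ in $\ell^2$.

5.1. Comparison of different envelope function dynamics. Assuming $V \in \mathcal{W}_0$ (Definition 2.4), we know from Lemma 3.1 that $\mathcal{U}^\varepsilon$ and $\mathcal{U}^0$ are bounded (and, clearly, symmetric). Therefore, $A^\varepsilon_{kp} + \mathcal{U}^\varepsilon$, $A^\varepsilon_{kp} + \mathcal{U}^0$ and $A^\varepsilon_{em} + \mathcal{U}^0$ are generators of the unitary evolution groups
\[ G^\varepsilon(t) = e^{-it (A^\varepsilon_{kp} + \mathcal{U}^\varepsilon)}, \qquad G^\varepsilon_{kp}(t) = e^{-it (A^\varepsilon_{kp} + \mathcal{U}^0)}, \qquad G^\varepsilon_{em}(t) = e^{-it (A^\varepsilon_{em} + \mathcal{U}^0)}. \]
Our goal is to compare, in the limit of small $\varepsilon$, the three mild solutions² of Eqs. (61), (63) and (65), i.e.
\[ g^\varepsilon(t) = G^\varepsilon(t)\, g^{in,\varepsilon}, \qquad g^\varepsilon_{kp}(t) = G^\varepsilon_{kp}(t)\, g^{in,\varepsilon}, \qquad g^\varepsilon_{em}(t) = G^\varepsilon_{em}(t)\, g^{in,\varepsilon}. \tag{74} \]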

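Lemma 5.1. Let $g^{in,\varepsilon} \in \mathcal{L}^2_\mu$ and $V \in \mathcal{W}_\mu$ for some $\mu \ge 0$ (see Definitions 2.3 and 2.4). Then, suitable constants $c_1(\mu, V) \ge 0$ and $c_2(\mu, V) \ge 0$, independent of $\varepsilon$, exist such that
\[ \|g^\varepsilon_{kp}(t)\|_{\mathcal{L}^2_\mu} \le e^{c_1(\mu, V) t}\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu}, \qquad \|g^\varepsilon_{em}(t)\|_{\mathcal{L}^2_\mu} \le e^{c_2(\mu, V) t}\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu}, \tag{75} \]
for all $t \ge 0$.

Proof. We prove the lemma only for $g^\varepsilon_{kp}$, the proof for $g^\varepsilon_{em}$ being identical. We also drop the superscript $\varepsilon$ of $g^{in,\varepsilon}$. Let $\alpha$ be a fixed multi-index with $|\alpha| \le \mu$. For $R > 0$, consider the bounded multiplication operators on $\mathcal{L}^2$,
\[ \big( m_R\, g \big)_n(k) = \begin{cases} k^\alpha g_n(k), & \text{if } |k| \le R, \\ 0, & \text{otherwise.} \end{cases} \]
Moreover, let us denote by $m_\infty$ the (unbounded) limit operator $\big( m_\infty g \big)_n(k) = k^\alpha g_n(k)$. Since $m_R$ (with $R < \infty$) commutes with $A^\varepsilon_{kp}$ on $D(A^\varepsilon_{kp})$, then, by applying standard semigroup techniques, we obtain
\[ m_R\, g^\varepsilon_{kp}(t) = G^\varepsilon_{kp}(t)\, m_R\, g^{in} - i \int_0^t G^\varepsilon_{kp}(t - s)\, \big[ m_R, \mathcal{U}^0 \big]\, g^\varepsilon_{kp}(s)\, ds \]
and, therefore,
\[ \|m_R\, g^\varepsilon_{kp}(t)\|_{\mathcal{L}^2} \le \|m_R\, g^{in}\|_{\mathcal{L}^2} + \int_0^t \big\| \big[ m_R, \mathcal{U}^0 \big]\, g^\varepsilon_{kp}(s) \big\|_{\mathcal{L}^2}\, ds. \tag{76} \]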

2 We recall that a mild solution is, by definition, the function obtained by acting with the group e−it H , generated by a Hamiltonian H , on a generic initial state, not necessarily in the domain of H . Thus, the mild solution does not in general satisfy the evolution equation in a strict sense.

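Using (36) and the identity $k^\alpha - \eta^\alpha = \sum_{\beta < \alpha} \binom{\alpha}{\beta} (k - \eta)^{\alpha - \beta}\, \eta^\beta$, we have
\begin{align*}
\big( \big[ m_\infty, \mathcal{U}^0 \big]\, g^\varepsilon_{kp} \big)_n(k)
&= \frac{1}{(2\pi)^{d/2}} \sum_{n'} \int_{\mathbb{R}^d} (k^\alpha - \eta^\alpha)\, \hat V_{nn'}(k - \eta)\, g^\varepsilon_{kp,n'}(\eta)\, d\eta \\
&= \frac{1}{(2\pi)^{d/2}} \sum_{n'} \sum_{\beta < \alpha} \binom{\alpha}{\beta} \int_{\mathbb{R}^d} (k - \eta)^{\alpha - \beta}\, \hat V_{nn'}(k - \eta)\, \eta^\beta\, g^\varepsilon_{kp,n'}(\eta)\, d\eta.
\end{align*}
Recalling Definition 2.4, since $V \in \mathcal{W}_\mu$, the potential $U_{\alpha\beta}(x, z)$, defined by $\hat U_{\alpha\beta}(k, z) = k^{\alpha - \beta}\, \hat V(k, z)$, belongs to $\mathcal{W}_0$ with $\|U_{\alpha\beta}\|_{\mathcal{W}_0} \le \|V\|_{\mathcal{W}_\mu}$. Then, using (38), we obtain
\[ \big\| \big[ m_\infty, \mathcal{U}^0 \big]\, g^\varepsilon_{kp} \big\|_{\mathcal{L}^2} \le \sum_{\beta < \alpha} \binom{\alpha}{\beta} \|U_{\alpha\beta}\|_{\mathcal{W}_0}\, \|\eta^\beta g^\varepsilon_{kp}\|_{\mathcal{L}^2} \le c_1(\mu, V)\, \|g^\varepsilon_{kp}\|_{\mathcal{L}^2_\mu}, \tag{77} \]
with $c_1(\mu, V) = (2^d - 1)\, \|V\|_{\mathcal{W}_\mu}$. Letting $R \to +\infty$, it is not difficult to show that the dominated convergence theorem applies and yields
\[ \lim_{R \to +\infty} \big\| \big[ m_R, \mathcal{U}^0 \big]\, g^\varepsilon_{kp} \big\|_{\mathcal{L}^2} = \big\| \big[ m_\infty, \mathcal{U}^0 \big]\, g^\varepsilon_{kp} \big\|_{\mathcal{L}^2} \le c_1(\mu, V)\, \|g^\varepsilon_{kp}\|_{\mathcal{L}^2_\mu}. \]
Then, passing to the limit for $R \to +\infty$ in (76), we get
\[ \|g^\varepsilon_{kp}(t)\|_{\mathcal{L}^2_\mu} \le \|g^{in}\|_{\mathcal{L}^2_\mu} + c_1(\mu, V) \int_0^t \|g^\varepsilon_{kp}(s)\|_{\mathcal{L}^2_\mu}\, ds, \]
and, therefore, Gronwall's Lemma yields inequality (75).

Let us begin by comparing the exact dynamics $g^\varepsilon(t)$ with the k·p dynamics $g^\varepsilon_{kp}(t)$.

Theorem 5.1. Let $g^\varepsilon(t)$ and $g^\varepsilon_{kp}(t)$ be, respectively, the solutions of (61) and (63). If $g^{in,\varepsilon} \in \mathcal{L}^2_\mu$ and $V \in \mathcal{W}_\mu$, for some $\mu \ge 0$, then, for any given $\tau \ge 0$, a constant $C(\mu, V, \tau) \ge 0$, independent of $\varepsilon$, exists such that
\[ \|g^\varepsilon(t) - g^\varepsilon_{kp}(t)\|_{\mathcal{L}^2} \le \varepsilon^\mu\, C(\mu, V, \tau)\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu}, \tag{78} \]
for all $0 \le t \le \tau$.

Proof. The function $h^\varepsilon(t) = g^\varepsilon(t) - g^\varepsilon_{kp}(t)$ satisfies the integral equation
\[ h^\varepsilon(t) = i \int_0^t G^\varepsilon(t - s)\, \big( \mathcal{U}^0 - \mathcal{U}^\varepsilon \big)\, g^\varepsilon_{kp}(s)\, ds \]
and, therefore,
\[ \|h^\varepsilon(t)\|_{\mathcal{L}^2} \le \int_0^t \big\| \big( \mathcal{U}^0 - \mathcal{U}^\varepsilon \big)\, g^\varepsilon_{kp}(s) \big\|_{\mathcal{L}^2}\, ds. \]
From Lemma 5.1 we have that $g^\varepsilon_{kp}(t)$ belongs to $\mathcal{L}^2_\mu$ for all $t$ and, therefore, we can apply Theorem 3.1, which gives
\[ \big\| \big( \mathcal{U}^0 - \mathcal{U}^\varepsilon \big)\, g^\varepsilon_{kp}(s) \big\|_{\mathcal{L}^2} \le \varepsilon^\mu\, c_\mu\, \|V\|_{\mathcal{W}_\mu}\, \|g^\varepsilon_{kp}(s)\|_{\mathcal{L}^2_\mu}, \]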

for a suitable constant $c_\mu$. Then we have
\[ \|h^\varepsilon(t)\|_{\mathcal{L}^2} \le \varepsilon^\mu\, c_\mu\, \|V\|_{\mathcal{W}_\mu} \int_0^t \|g^\varepsilon_{kp}(s)\|_{\mathcal{L}^2_\mu}\, ds \]
and, by (75), we have that (78) holds with
\[ C(\mu, V, \tau) = c_\mu\, \frac{e^{c(\mu, V)\tau} - 1}{c(\mu, V)}\, \|V\|_{\mathcal{W}_\mu}. \]
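The constant $C(\mu, V, \tau)$ comes from integrating the exponential bound (75). The step below is a sketch that writes $c$ for the constant $c(\mu, V) = c_1(\mu, V)$ of Lemma 5.1 and assumes $c > 0$.

```latex
% Insert the bound (75) for g_kp into the integral estimate for h.
\begin{align*}
\int_0^t \|g^\varepsilon_{kp}(s)\|_{\mathcal{L}^2_\mu}\, ds
  &\le \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu} \int_0^t e^{c s}\, ds
   = \frac{e^{c t} - 1}{c}\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu}
   \le \frac{e^{c \tau} - 1}{c}\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu},
   \qquad 0 \le t \le \tau, \\
\|h^\varepsilon(t)\|_{\mathcal{L}^2}
  &\le \varepsilon^\mu\, c_\mu\, \|V\|_{\mathcal{W}_\mu}\,
       \frac{e^{c \tau} - 1}{c}\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu}.
\end{align*}
```

In the degenerate case $c = 0$ the quotient $(e^{c\tau} - 1)/c$ is to be read as its limit $\tau$.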

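We now compare the k·p dynamics $g^\varepsilon_{kp}(t)$ with the effective mass dynamics $g^\varepsilon_{em}(t)$ (see definitions (74)). Recalling the discussion in Sec. 4, we need, as an intermediate step between $g^\varepsilon_{kp}(t)$ and $g^\varepsilon_{em}(t)$, the function $g^\varepsilon_*(t) = T_\varepsilon^*\, g^\varepsilon_{kp}(t)$, that is
\[ g^\varepsilon_*(t) = T_\varepsilon^*\, G^\varepsilon_{kp}(t)\, g^{in,\varepsilon} = \exp\big[ -it \big( \Lambda_\varepsilon + T_\varepsilon^*\, \mathcal{U}^0\, T_\varepsilon \big) \big]\, T_\varepsilon^*\, g^{in,\varepsilon}, \tag{79} \]
representing the diagonalized k·p dynamics (definitions (53), (54) and (73)). We also recall that $\Pi_N$ is the projector on the first $N$ components (Definition 4.1).

Lemma 5.2. Let $g^\varepsilon_{em}(t)$ and $g^\varepsilon_*(t)$ be respectively defined by (65) and (79). Let $g^{in,\varepsilon} \in \mathcal{L}^2_\mu$ and $V \in \mathcal{W}_\mu$, for some $\mu > 0$, and assume that $N > 0$ exists such that
\[ g^{in,\varepsilon} = \Pi_N\, g^{in,\varepsilon}, \qquad \mathcal{U}^0 = \Pi_N\, \mathcal{U}^0 \]
(which implies that $g^\varepsilon_{em}(t)$ remains confined in the first $N$ bands for all times). Then, for any given $\tau \ge 0$, a suitable constant $C'(\mu, N, V, \tau)$, independent of $\varepsilon$, exists such that
\[ \|g^\varepsilon_*(t) - g^\varepsilon_{em}(t)\|_{\mathcal{L}^2} \le \varepsilon^{\min\{\mu/3,\, 1\}}\, C'(\mu, N, V, \tau)\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu}, \tag{80} \]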

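for all $0 \le t \le \tau$.

Proof. Let $S_\Lambda(t) = \exp(-it \Lambda_\varepsilon)$, $S_{em}(t) = \exp(-it A^\varepsilon_{em})$ and $\mathcal{U}_T := T_\varepsilon^*\, \mathcal{U}^0\, T_\varepsilon$. Then,
\begin{align*}
g^\varepsilon_{em}(t) &= S_{em}(t)\, g^{in,\varepsilon} - i \int_0^t S_{em}(t - s)\, \mathcal{U}^0 g^\varepsilon_{em}(s)\, ds, \\
g^\varepsilon_*(t) &= S_\Lambda(t)\, T_\varepsilon^*\, g^{in,\varepsilon} - i \int_0^t S_\Lambda(t - s)\, \mathcal{U}_T\, g^\varepsilon_*(s)\, ds.
\end{align*}
Putting $h^\varepsilon = g^\varepsilon_* - g^\varepsilon_{em}$, we can write
\begin{align*}
h^\varepsilon(t) ={}& S_\Lambda(t) \big( T_\varepsilon^* - I \big) g^{in,\varepsilon} + \big( S_\Lambda - S_{em} \big)(t)\, g^{in,\varepsilon} - i \int_0^t S_\Lambda(t - s)\, \mathcal{U}_T\, h^\varepsilon(s)\, ds \\
& - i \int_0^t \big( S_\Lambda - S_{em} \big)(t - s)\, \mathcal{U}^0 g^\varepsilon_{em}(s)\, ds - i \int_0^t S_\Lambda(t - s) \big( \mathcal{U}_T - \mathcal{U}^0 \big) g^\varepsilon_{em}(s)\, ds. \tag{81}
\end{align*}
From the effective mass theorem, Theorem 4.2, a constant $C(\mu, N, t)$ exists such that
\[ \big\| \big( S_\Lambda - S_{em} \big)(t)\, g^{in,\varepsilon} \big\|_{\mathcal{L}^2} \le \varepsilon^{\min\{\mu/3,\, 1\}}\, C(\mu, N, t)\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu}. \tag{82} \]
Moreover, from Lemma 5.1 we have that both $g^\varepsilon_{em}(t)$ and $\mathcal{U}^0 g^\varepsilon_{em}(t)$ belong to $\mathcal{L}^2_\mu$ for all $t$, and that a constant $C_1(\mu, V, t) \ge 0$ exists such that
\[ \|\mathcal{U}^0 g^\varepsilon_{em}(t)\|_{\mathcal{L}^2_\mu} \le C_1(\mu, V, t)\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu} \tag{83} \]
(this stems, in particular, from the commutator inequality (77), which still holds for $g^\varepsilon_{em}$). Recalling that our assumptions imply that $g^\varepsilon_{em}(t)$ remains confined in the first $N$ bands for all $t$, the last inequality, together with Theorem 4.2, yields
\[ \big\| \big( S_\Lambda - S_{em} \big)(t - s)\, \mathcal{U}^0 g^\varepsilon_{em}(s) \big\|_{\mathcal{L}^2} \le \varepsilon^{\min\{\mu/3,\, 1\}}\, C_2(\mu, N, t - s)\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu} \tag{84} \]
for a suitable constant $C_2(\mu, N, t) \ge 0$. In order to estimate the last integral in (81), let us write
\[ \big( \mathcal{U}_T - \mathcal{U}^0 \big) g^\varepsilon_{em}(s) = \big( T_\varepsilon^* - I \big)\, \mathcal{U}^0 g^\varepsilon_{em}(s) + T_\varepsilon^*\, \mathcal{U}^0 \big( T_\varepsilon - I \big) g^\varepsilon_{em}(s). \]
Using inequalities (55) and (83) we see that another constant $C_3(\mu, N, V, t) \ge 0$ exists such that
\[ \big\| \big( \mathcal{U}_T - \mathcal{U}^0 \big) g^\varepsilon_{em}(s) \big\|_{\mathcal{L}^2} \le \varepsilon^{\min\{\mu,\, 1\}}\, C_3(\mu, N, V, t)\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu}. \tag{85} \]
In conclusion, from inequalities (82), (84) and (85), and from Eq. (81), we get
\[ \|h^\varepsilon(t)\|_{\mathcal{L}^2} \le \varepsilon^{\min\{\mu/3,\, 1\}}\, C_4(\mu, N, V, \tau)\, \|g^{in,\varepsilon}\|_{\mathcal{L}^2_\mu} + \|V\|_{\mathcal{W}_0} \int_0^t \|h^\varepsilon(s)\|_{\mathcal{L}^2}\, ds, \]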

for all $0 \le t \le \tau$ (here we also used the fact that all the estimation constants introduced so far are non-decreasing with respect to time). Hence, inequality (80), with $C(\mu, N, V, \tau) = e^{\tau \|V\|_{\mathcal{W}_0}}\, C_4(\mu, N, \tau, V)$, follows from Gronwall's Lemma.

Lemma 5.3. Let $g : \mathbb{R} \to \mathcal{L}^2$ be uniformly continuous in bounded time intervals and assume that $M \ge 0$ exists such that $\|g(t)\|_{\mathcal{L}^2} \le M\, \|g(0)\|_{\mathcal{L}^2}$. Then
$$\lim_{N \to \infty} \|\Pi^c_N\, g(t)\|_{\mathcal{L}^2} = 0$$
uniformly in bounded time intervals (where, as usual, $\Pi^c_N = I - \Pi_N$).

Proof. For $f, h \in \mathcal{L}^2$ we can write
$$\bigl|\, \|f\|^2 - \|h\|^2 \,\bigr| \le |\langle f - h, f\rangle| + |\langle h, f - h\rangle| \le \bigl(\|f\| + \|h\|\bigr)\, \|f - h\|.$$
Choosing $f = \Pi^c_N g(t)$ and $h = \Pi^c_N g(s)$ we obtain
$$\bigl|\, \|\Pi^c_N g(t)\|^2 - \|\Pi^c_N g(s)\|^2 \,\bigr| \le 2M\, \|g(0)\|\, \|g(t) - g(s)\|,$$
showing that the sequence of functions
$$F_N(t) := \|\Pi^c_N\, g(t)\|^2_{\mathcal{L}^2} = \sum_{n \ge N} \|g_n(t)\|^2_{L^2}$$
is uniformly equicontinuous in bounded time intervals. By standard results (the Ascoli-Arzelà theorem), this implies the existence of a subsequence $F_{N_k}$ that converges uniformly in bounded intervals as $k \to \infty$. But $F_N(t)$ is point-wise monotonically decreasing to 0 and, therefore, the whole sequence $F_N$ goes to 0 uniformly in bounded intervals.
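The statement of Lemma 5.3 can be illustrated numerically on a toy path. The sketch below is only an illustration under stated assumptions: the path $g(t) = \cos(t)\,u + \sin(t)\,w$, the sequences `u_n`, `w_n`, the truncation `n_max` and the time grid are all hypothetical choices, not objects from the paper. It evaluates the tail energy $F_N(t)$ on a grid of times and reports its supremum for increasing $N$.

```python
import math

def tail_energy(N, t, n_max=2000):
    """F_N(t) = sum_{n >= N} |g_n(t)|^2 for the toy path g(t) = cos(t) u + sin(t) w.

    The sequences u_n = 1/(n+1) and w_n = 1/(n+1)^1.5 are hypothetical
    square-summable choices; the sum is truncated at n_max (an assumption).
    """
    total = 0.0
    for n in range(N, n_max):
        u_n = 1.0 / (n + 1)
        w_n = 1.0 / (n + 1) ** 1.5
        g_n = math.cos(t) * u_n + math.sin(t) * w_n
        total += g_n * g_n
    return total

def sup_tail(N, tau=5.0, steps=50):
    """Maximum of F_N over a uniform grid of [0, tau]."""
    return max(tail_energy(N, tau * j / steps) for j in range(steps + 1))

# Suprema over the time grid for increasing cut-off N.
sups = [sup_tail(N) for N in (1, 10, 100, 1000)]
for N, value in zip((1, 10, 100, 1000), sups):
    print(N, value)
```

Since every term of the sum is non-negative, the printed suprema cannot increase with $N$; the point of the lemma is that they also tend to zero, uniformly on the bounded time interval.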


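Theorem 5.2. Let $g^\epsilon_{kp}(t)$, $g^\epsilon_{em}(t)$ be as in (74), and assume $g^{in,\epsilon} \in \mathcal{L}^2_\mu$ and $V \in \mathcal{W}_\mu$ for some $\mu > 0$, with $g^{in,\epsilon}$ uniformly bounded in $\mathcal{L}^2_\mu$ as $\epsilon$ tends to zero. Then
$$\lim_{\epsilon \to 0} \|g^\epsilon_{kp}(t) - g^\epsilon_{em}(t)\|_{\mathcal{L}^2} = 0,$$
uniformly in bounded time-intervals.

Proof. Let $g^\epsilon_*(t)$ be given by (79), and let $g^{\epsilon,N}_*(t)$, $g^{\epsilon,N}_{em}(t)$ be the cut-off approximations of $g^\epsilon_*(t)$ and $g^\epsilon_{em}(t)$, obtained by the substitutions
$$g^{in,\epsilon} \to \Pi_N\, g^{in,\epsilon}, \qquad \mathcal{U}_0 \to \Pi_N\, \mathcal{U}_0$$
(so that $g^{\epsilon,N}_*(t)$ and $g^{\epsilon,N}_{em}(t)$ remain confined in the first $N$ bands for all times). Then we can write
$$\begin{aligned}
\|g^\epsilon_{kp}(t) - g^\epsilon_{em}(t)\|_{\mathcal{L}^2} &\le \|(T^* - I)\, g^\epsilon_{kp}(t)\|_{\mathcal{L}^2} + \|g^\epsilon_*(t) - g^\epsilon_{em}(t)\|_{\mathcal{L}^2} \\
&\le \|(T^* - I)\, \Pi_N\, g^\epsilon_{kp}(t)\|_{\mathcal{L}^2} + \|(T^* - I)\, \Pi^c_N\, g^\epsilon_{kp}(t)\|_{\mathcal{L}^2} \\
&\quad + \|g^\epsilon_*(t) - g^{\epsilon,N}_*(t)\|_{\mathcal{L}^2} + \|g^{\epsilon,N}_*(t) - g^{\epsilon,N}_{em}(t)\|_{\mathcal{L}^2} + \|g^{\epsilon,N}_{em}(t) - g^\epsilon_{em}(t)\|_{\mathcal{L}^2} \\
&=: I_1 + I_2 + I_3 + I_4 + I_5 .
\end{aligned}$$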

We now estimate one by one the five terms at the right-hand side of the above inequality. For the first one we can use inequalities (55) and (75) to obtain that a constant $C(\mu, N, V, \tau)$ exists such that
$$I_1 = \|(T^* - I)\, \Pi_N\, g^\epsilon_{kp}(t)\|_{\mathcal{L}^2} \le \epsilon^{\min\{\mu,\,1\}}\, C(\mu, N, V, \tau)\, \|g^{in,\epsilon}\|_{\mathcal{L}^2_\mu},$$
for all $0 \le t \le \tau$. By using the unitarity of $T$, for the second term we easily get
$$I_2 = \|(T^* - I)\, \Pi^c_N\, g^\epsilon_{kp}(t)\|_{\mathcal{L}^2} \le 2\, \|\Pi^c_N\, g^\epsilon_{kp}(t)\|_{\mathcal{L}^2}.$$
The third term can be estimated by means of the usual Duhamel formula, which yields
$$I_3 = \|g^\epsilon_*(t) - g^{\epsilon,N}_*(t)\|_{\mathcal{L}^2} \le \|\Pi^c_N\, T^* g^{in,\epsilon}\|_{\mathcal{L}^2} + \int_0^t \|T^*\, \Pi^c_N\, \mathcal{U}_0\, T\, g^{\epsilon,N}_*(s)\|_{\mathcal{L}^2}\, ds.$$
The fifth term $I_5$ has an analogous estimate. Finally, Lemma 5.2 applies to the fourth term and, therefore,
$$I_4 = \|g^{\epsilon,N}_*(t) - g^{\epsilon,N}_{em}(t)\|_{\mathcal{L}^2} \le \epsilon^{\min\{\mu/3,\,1\}}\, C(\mu, N, V, \tau)\, \|g^{in,\epsilon}\|_{\mathcal{L}^2_\mu},$$
for all $0 \le t \le \tau$. By the assumptions on $g^{in,\epsilon}$, Lemma 5.3 can be applied to the terms appearing in the estimates for $I_2$ and $I_3$ (and $I_5$ as well), with constants $M$ independent of $\epsilon$. Then, we can fix $N$ large enough (independently of $\epsilon$ and uniformly in $0 \le t \le \tau$) to make $I_2$, $I_3$, $I_5$ arbitrarily small. Once $N$ has been fixed in this way, $\epsilon$ can be chosen small enough to make also $I_1$ and $I_4$ arbitrarily small (uniformly in $0 \le t \le \tau$).

Recalling Definition 2.3, the following corollary follows directly from Theorems 5.1, 5.2 and Lemma 5.2.


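Corollary 5.1. Assume that $f^{in,\epsilon} \in \mathcal{H}_\mu$ and $V \in \mathcal{W}_\mu$, for some $\mu > 0$, with $f^{in,\epsilon}$ uniformly bounded in $\mathcal{H}_\mu$ as $\epsilon \to 0$. Then we have the local uniform in time convergence
$$\lim_{\epsilon \to 0} \|f^\epsilon(t) - f^\epsilon_{em}(t)\|_{\mathcal{L}^2} = 0,$$
where $f^\epsilon$ and $f^\epsilon_{em}$ are, respectively, the mild solutions of the exact equations (32) and of the effective mass approximation (67). If, moreover, $f^{in,\epsilon} = \Pi_N f^{in,\epsilon}$ and $\mathcal{U}_0 = \Pi_N\, \mathcal{U}_0$ for some $N$, then
$$\|f^\epsilon(t) - f^\epsilon_{em}(t)\|_{\mathcal{L}^2} \le C(\tau)\, \epsilon^{\min\{\mu/3,\,1\}},$$
for all $0 \le t \le \tau$ and some constant $C(\tau)$ independent of $\epsilon$.

We can finally prove the convergence towards the limit effective mass dynamics.

Theorem 5.3. Let $h^\epsilon_{em}(t, x)$ and $h_{em}(t, x)$ be the mild solutions of, respectively, Eq. (69) and Eq. (71). Assume $\lim_{\epsilon \to 0} \|f^{in,\epsilon} - f^{in}\|_{\mathcal{L}^2} = 0$ and assume that $\mu > 0$ exists such that $V \in \mathcal{W}_\mu$ and $f^{in,\epsilon}$ is bounded uniformly in $\mathcal{H}_\mu$ as $\epsilon \to 0$. Then
$$\lim_{\epsilon \to 0} \|h^\epsilon_{em}(t) - h_{em}(t)\|_{\mathcal{L}^2} = 0,$$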

uniformly in bounded time intervals.

Proof. First of all we remark that, since the dynamics generated by (69) and (71) both preserve the $\mathcal{L}^2$ norm, we can replace without loss of generality the initial condition $f^{in,\epsilon}$ with $f^{in}$. Let us then consider the diagonal operator $H_0$ in $\mathcal{L}^2$,
$$(H_0 h)_n(x) = -\frac{1}{2}\, \operatorname{div}\bigl(\mathbb{M}_n^{-1} \nabla h_n\bigr)(x) + V_n(x)\, h_n(x),$$
where $V_n(x) \equiv V_{nn}(x)$. We recall that the matrix $V_{nn'}$ defines a bounded operator on $\mathcal{L}^2$ (that is, the operator $\mathcal{U}_0$ in position variables, see definition (36)). Such operator, as well as its diagonal and off-diagonal parts, are bounded operators with bound $\|V\|_{\mathcal{W}_0}$ (see Lemma 3.1). Then, $H_0$ is self-adjoint on the domain
$$D(H_0) = \Bigl\{ h \in \mathcal{L}^2 \;\Bigm|\; h_n \in H^2(\mathbb{R}^d),\ \sum_n \bigl\|\operatorname{div}(\mathbb{M}_n^{-1} \nabla h_n)\bigr\|^2_{L^2(\mathbb{R}^d)} < \infty \Bigr\}.$$
Let $S(t) = \exp(-itH_0)$ denote the (diagonal) unitary group generated by $H_0$. Moreover we consider the operator $R^\epsilon(t)$ given by
$$\bigl(R^\epsilon(t) h\bigr)_n(x) = \sum_{n' \ne n} e^{i\omega_{nn'} t/\epsilon^2}\, V_{nn'}(x)\, h_{n'}(x),$$
which, being unitarily equivalent to the off-diagonal part of $\mathcal{U}_0$, is again bounded by $\|V\|_{\mathcal{W}_0}$ (for all $t$). The two mild solutions satisfy
$$h_{em}(t) = S(t) f^{in}, \qquad h^\epsilon_{em}(t) = S(t) f^{in} + \int_0^t S(t-s)\, R^\epsilon(s)\, h^\epsilon_{em}(s)\, ds,$$

and, therefore, what we need to do is prove that
$$h^\epsilon_{em}(t) - h_{em}(t) = \int_0^t S(t-s)\, R^\epsilon(s)\, h^\epsilon_{em}(s)\, ds$$
goes to zero as $\epsilon \to 0$. To this aim we resort to the usual cutoff. For any fixed $N \in \mathbb{N}$ we decompose the right-hand side of the previous equation as
$$h^\epsilon_{em}(t) - h_{em}(t) = I_N(t) + I^c_N(t),$$
where, using the projection operators $\Pi_N$ and $\Pi^c_N = I - \Pi_N$ (Definition 4.1), we have put
$$I_N(t) = \int_0^t S(t-s)\, \Pi_N R^\epsilon(s) \Pi_N\, h^\epsilon_{em}(s)\, ds,$$
$$I^c_N(t) = \int_0^t S(t-s)\, \bigl( \Pi^c_N R^\epsilon(s) \Pi_N + \Pi_N R^\epsilon(s) \Pi^c_N + \Pi^c_N R^\epsilon(s) \Pi^c_N \bigr)\, h^\epsilon_{em}(s)\, ds.$$

Case of regular data. We assume in this part that $V \in \mathcal{W}_2$ and that $f^{in} \in \mathcal{H}_2$. We fix a $\delta > 0$ arbitrarily small and a maximum time $\tau$. Because $R^\epsilon(t)$ is uniformly bounded and $\|h^\epsilon_{em}(t)\|_{\mathcal{L}^2} = \|f^{in}\|_{\mathcal{L}^2}$ then, clearly, a number $N(\delta, \tau)$ (independent of $\epsilon$) exists such that $\|I^c_N(t)\|_{\mathcal{L}^2} \le \delta$, for all $N \ge N(\delta, \tau)$ and $0 \le t \le \tau$. We now turn our attention to $I_N(t)$. Using the assumption $V \in \mathcal{W}_2$, it is not difficult to prove the following facts:

(i) for every $N$, if $h \in \mathcal{H}_2$, then $\Pi_N h \in D(H_0)$, and a constant $C_N$ exists such that $\|H_0 \Pi_N h\|_{\mathcal{L}^2} \le C_N\, \|h\|_{\mathcal{H}_2}$;

(ii) for every $N$, a constant $C_N$, independent of $t$ and $\epsilon$, exists such that, if $h \in \mathcal{H}_2$, then $\|\Pi_N R^\epsilon(t) \Pi_N h\|_{\mathcal{H}_2} \le C_N\, \|h\|_{\mathcal{H}_2}$.

Moreover, in a similar way to Lemma 5.1, we can prove the following:

(iii) if $f^{in} \in \mathcal{H}_2$, then $h^\epsilon_{em}(t) \in \mathcal{H}_2$ for all $t$ and a function $C(t)$, bounded on bounded time intervals and independent of $\epsilon$, exists such that $\|h^\epsilon_{em}(t)\|_{\mathcal{H}_2} \le C(t)\, \|f^{in}\|_{\mathcal{H}_2}$.

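Using (i), (ii) and (iii) we have that $\Pi_N R^\epsilon(s) \Pi_N h^\epsilon_{em}(s) \in D(H_0)$ and, therefore, $S(t-s)\, \Pi_N R^\epsilon(s) \Pi_N h^\epsilon_{em}(s)$ is continuously differentiable in $s$. This makes it possible to perform an integration by parts in the integral defining $I_N(t)$. Since
$$R^\epsilon(t) = \epsilon^2\, \frac{d}{dt} \tilde R^\epsilon(t),$$
where
$$\bigl(\tilde R^\epsilon(t) h\bigr)_n(x) = \sum_{n' \ne n} \frac{1}{i\omega_{nn'}}\, e^{i\omega_{nn'} t/\epsilon^2}\, V_{nn'}(x)\, h_{n'}(x),$$
then the integration by parts yields
$$\begin{aligned}
I_N(t) = {} & \epsilon^2\, \Bigl[ S(t-s)\, \Pi_N \tilde R^\epsilon(s) \Pi_N\, h^\epsilon_{em}(s) \Bigr]_{s=0}^{s=t} \\
& - \epsilon^2 \int_0^t S(t-s)\, \Pi_N \Bigl( i H_0\, \tilde R^\epsilon(s) \Pi_N\, h^\epsilon_{em}(s) + \tilde R^\epsilon(s) \Pi_N\, \frac{d}{ds} h^\epsilon_{em}(s) \Bigr)\, ds,
\end{aligned}$$
where, of course, $\frac{d}{ds} h^\epsilon_{em}(s) = H_0 h^\epsilon_{em}(s) + R^\epsilon(s) h^\epsilon_{em}(s)$. Since $\Pi_N \tilde R^\epsilon(t)$ is uniformly bounded in time by some constant dependent on $N$ (in particular, such constant depends on $1/\min\{ |\omega_{nn'}| : n' \ne n,\ n \le N \}$), then, from (i), (ii) and (iii), and using $\Pi_N H_0 = H_0 \Pi_N$, we obtain that a constant $C_N(\tau)$, independent of $\epsilon$, exists such that
$$\|I_N(t)\|_{\mathcal{L}^2} \le \epsilon^2\, C_N(\tau)\, \|f^{in}\|_{\mathcal{H}_2}, \qquad 0 \le t \le \tau.$$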
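The gain of a factor $\epsilon^2$ from the fast phases $e^{i\omega t/\epsilon^2}$ can be checked on a scalar toy model. The sketch below is not the operator argument of the proof: it assumes a single frequency `omega`, the amplitude $\varphi(s) = \cos s$ and the final time $t = 1$, all of which are hypothetical choices. It compares the exact value of $\int_0^1 e^{i\omega s/\epsilon^2} \cos s\, ds$ with the bound $(|\varphi(1)| + |\varphi(0)| + \int_0^1 |\varphi'|)\, \epsilon^2/\omega = 2\epsilon^2/\omega$ that one integration by parts gives.

```python
import cmath

def oscillatory_integral(eps, omega=1.0, t=1.0):
    """Closed form of int_0^t exp(i*omega*s/eps**2) * cos(s) ds.

    Uses cos(s) = (exp(i s) + exp(-i s)) / 2; assumes omega/eps**2 != 1.
    """
    a = omega / eps ** 2  # fast frequency

    def primitive(b):
        # int_0^t exp(i*b*s) ds for b != 0
        return (cmath.exp(1j * b * t) - 1.0) / (1j * b)

    return 0.5 * (primitive(a + 1.0) + primitive(a - 1.0))

def ibp_bound(eps, omega=1.0):
    """Integration-by-parts bound for phi = cos on [0, 1]: 2 * eps^2 / omega."""
    return 2.0 * eps ** 2 / omega

# The integral shrinks at least like eps^2 as eps decreases.
for eps in (0.2, 0.1, 0.05):
    print(eps, abs(oscillatory_integral(eps)), ibp_bound(eps))
```

In the proof the role of $1/\omega$ is played by $1/\min |\omega_{nn'}|$ over the first $N$ bands, which is why the constant $C_N(\tau)$ depends on $N$.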


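Thus, fixing $N \ge N(\delta, \tau)$, a small enough $\epsilon$ exists such that $\|I_N(t)\|_{\mathcal{L}^2} \le \delta$, for all $0 \le t \le \tau$. For such $N$ and $\epsilon$ we have, therefore,
$$\|h^\epsilon_{em}(t) - h_{em}(t)\|_{\mathcal{L}^2} \le \|I_N(t)\|_{\mathcal{L}^2} + \|I^c_N(t)\|_{\mathcal{L}^2} \le 2\delta,$$
which proves the theorem in the regular case.

Case of general data. If $\mu \ge 2$, then there is nothing to do. Let us assume $0 < \mu < 2$, let $\delta$ be a regularizing parameter and let $f^{in}_\delta$ and $V_\delta$ be two regularizations of $f^{in}$ and of $V$ such that
$$f^{in}_\delta \in \mathcal{H}_2, \qquad \lim_{\delta \to 0} \|f^{in}_\delta - f^{in}\|_{\mathcal{L}^2} = 0$$
and
$$V_\delta \in \mathcal{W}_2, \qquad \lim_{\delta \to 0} \|V_\delta - V\|_{\mathcal{W}_0} = 0.$$
Let $h^\epsilon_{em,\delta}$ and $h_{em,\delta}$ be the corresponding solutions of (69) and (71) with the modified initial data and potential. Then we have
$$\|(h^\epsilon_{em} - h_{em})(t)\|_{\mathcal{L}^2} \le \|(h^\epsilon_{em} - h^\epsilon_{em,\delta})(t)\|_{\mathcal{L}^2} + \|(h^\epsilon_{em,\delta} - h_{em,\delta})(t)\|_{\mathcal{L}^2} + \|(h_{em,\delta} - h_{em})(t)\|_{\mathcal{L}^2}.$$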

The above analysis of the regular case shows that for any fixed $\delta > 0$, the second term of the right-hand side tends to zero as $\epsilon$ tends to zero. Thanks to Lemma 3.1 it is easy to show that the third term of the right-hand side tends to zero as $\delta$ tends to zero and that the first term of the right-hand side also tends to zero as $\delta$ tends to zero, uniformly in $\epsilon$ and in bounded time intervals.

We remark that, as it emerges from the above proof, if the dynamics is restricted to the first $N$ bands, then the convergence rate of $h^\epsilon_{em}(t)$ to $h_{em}(t)$ is of order $\epsilon^2$.

5.2. Convergence of the probability density. In this subsection we prove the main theorem of the paper, i.e. the weak convergence of the probability density $|\psi^\epsilon|^2$ towards the superposition of the probability densities $|h_{em,n}|^2$, where $h_{em,n}$ follow the effective mass dynamics.

Theorem 5.4. Let $\psi^{in,\epsilon}$ be an initial datum in $L^2(\mathbb{R}^d)$ and let $f^{in,\epsilon}_n$ be its $\epsilon$-scaled envelope functions relative to the basis $v_n(x) = u_{n,0}(x)$. Assume that $\mu > 0$ exists such that the sequence $f^{in,\epsilon} = (f^{in,\epsilon}_n)$ belongs to $\mathcal{H}_\mu$, with a uniform bound for the norm as $\epsilon$ vanishes, and that it converges in $\mathcal{L}^2$ as $\epsilon$ tends to 0 to an initial datum $f^{in} = (f^{in}_n)$. Moreover, assume that $V \in \mathcal{W}_\mu$. Then for any given test function $\theta \in L^1(\mathbb{R}^d)$ such that $\hat\theta \in L^1(\mathbb{R}^d)$, we have
$$\lim_{\epsilon \to 0} \int \theta(x) \Bigl[ |\psi^\epsilon(t, x)|^2 - \sum_n |h_{em,n}(t, x)|^2 \Bigr]\, dx = 0, \tag{86}$$
uniformly in bounded time intervals, where $\psi^\epsilon(t, x)$ and $h_{em}(t, x)$ are, respectively, the mild solutions of (28) and (71).


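Proof. Throughout this proof, $\langle \cdot, \cdot \rangle$ and $\|\cdot\|$, where not otherwise indicated, will denote the hermitian product and the norm of $L^2(\mathbb{R}^d)$. Thus we rewrite Eq. (86) (to be proved) as
$$\lim_{\epsilon \to 0} \Bigl( \langle \theta \psi^\epsilon(t), \psi^\epsilon(t) \rangle - \sum_n \langle \theta h_{em,n}(t), h_{em,n}(t) \rangle \Bigr) = 0. \tag{87}$$
Let
$$h^\epsilon_n(t, x) = f^\epsilon_n(t, x)\, e^{iE_n t/\epsilon^2},$$
where $f^\epsilon_n$ are the envelope functions of $\psi^\epsilon$. By using the unitarity of the transformation $f^\epsilon_n \mapsto h^\epsilon_n$ we deduce from the results of the above subsection, in particular from Corollary 5.1 and Theorem 5.3, that
$$\lim_{\epsilon \to 0} \|h^\epsilon(t) - h_{em}(t)\|^2_{\mathcal{L}^2} = \lim_{\epsilon \to 0} \sum_n \|h^\epsilon_n(t) - h_{em,n}(t)\|^2 = 0 \tag{88}$$
(locally uniformly in time). Let us introduce, for any positive constant $\rho$, the truncation operator
$$T_\rho(f) = \mathcal{F}^*\bigl(\mathbf{1}_{\rho\mathcal{B}}\, \hat f\bigr), \tag{89}$$
and let
$$\tilde h^\epsilon_n = T_{\frac{1}{3\epsilon}}(h^\epsilon_n), \qquad \tilde\theta^\epsilon = T_{\frac{1}{3\epsilon}}(\theta), \qquad \tilde h^\epsilon_{em,n} = T_{\frac{1}{3\epsilon}}(h_{em,n}).$$
Recalling that
$$\psi^\epsilon(t, x) = |\mathcal{C}|^{1/2} \sum_n h^\epsilon_n(t, x)\, e^{-iE_n t/\epsilon^2}\, v^\epsilon_n(x),$$
let us define
$$\tilde\psi^\epsilon(t, x) = |\mathcal{C}|^{1/2} \sum_n \tilde h^\epsilon_n(t, x)\, e^{-iE_n t/\epsilon^2}\, v^\epsilon_n(x).$$
Note that, since the Fourier transform of $\tilde h^\epsilon_n(t, x)\, e^{-iE_n t/\epsilon^2}$ is supported in $\mathcal{B}/3\epsilon$, these functions are the envelope functions of $\tilde\psi^\epsilon$ and, therefore,
$$\|\psi^\epsilon(t) - \tilde\psi^\epsilon(t)\|^2 = \sum_n \|h^\epsilon_n(t) - \tilde h^\epsilon_n(t)\|^2 = \sum_n \bigl\|(1 - \mathbf{1}_{\mathcal{B}/3\epsilon})\, \hat h^\epsilon_n(t)\bigr\|^2.$$
Thus, by writing
$$\bigl\|(1 - \mathbf{1}_{\mathcal{B}/3\epsilon})\, \hat h^\epsilon(t)\bigr\|_{\mathcal{L}^2} \le \|\hat h^\epsilon(t) - \hat h_{em}(t)\|_{\mathcal{L}^2} + \bigl\|(1 - \mathbf{1}_{\mathcal{B}/3\epsilon})\, \hat h_{em}(t)\bigr\|_{\mathcal{L}^2},$$
from (88) we have
$$\lim_{\epsilon \to 0} \|\psi^\epsilon(t) - \tilde\psi^\epsilon(t)\|^2 = 0,$$
uniformly in bounded time intervals (BTIs). From the continuity of the hermitian product it is now clear that
$$\lim_{\epsilon \to 0} \Bigl( \langle \theta \psi^\epsilon(t), \psi^\epsilon(t) \rangle - \langle \theta \tilde\psi^\epsilon(t), \tilde\psi^\epsilon(t) \rangle \Bigr) = 0$$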

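(uniformly in BTIs) and, therefore, in (87) we can replace $\psi^\epsilon$ by $\tilde\psi^\epsilon$. Moreover,
$$\lim_{\epsilon \to 0} \bigl| \langle \theta \tilde\psi^\epsilon(t), \tilde\psi^\epsilon(t) \rangle - \langle \tilde\theta^\epsilon \tilde\psi^\epsilon(t), \tilde\psi^\epsilon(t) \rangle \bigr| \le \lim_{\epsilon \to 0} \|\tilde\theta^\epsilon - \theta\|_{L^\infty}\, \|\tilde\psi^\epsilon(t)\|^2 = 0$$
and the limit is uniform in time since $\|\tilde\psi^\epsilon(t)\| \le \|h^\epsilon(t)\|_{\mathcal{L}^2} = \|f^{in,\epsilon}\|_{\mathcal{L}^2}$, and $f^{in,\epsilon} \to f^{in}$ in $\mathcal{L}^2$, by assumption. Therefore, we can also replace $\theta$ by $\tilde\theta^\epsilon$. But
$$\langle \tilde\theta^\epsilon \tilde\psi^\epsilon(t), \tilde\psi^\epsilon(t) \rangle = |\mathcal{C}|\, \Bigl\langle \sum_n \tilde h^\epsilon_n(t, x)\, v^\epsilon_n(x),\ \sum_n \tilde\theta^\epsilon(x)\, \tilde h^\epsilon_n(t, x)\, v^\epsilon_n(x) \Bigr\rangle,$$
and $\operatorname{supp}\bigl(\widehat{\tilde\theta^\epsilon \tilde h^\epsilon_n}\bigr) \subset \mathcal{B}/3\epsilon + \mathcal{B}/3\epsilon \subset \mathcal{B}/\epsilon$; therefore the Parseval formula (18) yields
$$\langle \tilde\theta^\epsilon \tilde\psi^\epsilon(t), \tilde\psi^\epsilon(t) \rangle = \sum_n \langle \tilde\theta^\epsilon \tilde h^\epsilon_n(t), \tilde h^\epsilon_n(t) \rangle.$$
Thus, we have shown that proving (87) is equivalent to proving
$$\lim_{\epsilon \to 0} \Bigl( \sum_n \langle \tilde\theta^\epsilon \tilde h^\epsilon_n(t), \tilde h^\epsilon_n(t) \rangle - \sum_n \langle \theta h_{em,n}(t), h_{em,n}(t) \rangle \Bigr) = 0 \tag{90}$$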

uniformly in BTIs. But (90) easily follows from $\lim_{\epsilon \to 0} \|h^\epsilon(t) - \tilde h^\epsilon(t)\|_{\mathcal{L}^2} = 0$ (which has been proved above), from (88) and from the continuity of the hermitian product of $\mathcal{L}^2$.

6. Further Developments and Comments

6.1. Generalization to noncritical points. As initially remarked (see Remark 2.2), one of the most restrictive hypotheses that we made in the previous sections is the simplicity of all the eigenvalues $E_n = E_n(0)$. Nevertheless, since the simplicity of the eigenvalues $E_n(k)$ is a generic property [1,22], such restriction is only apparent and can be removed by changing the reference wave vector from $k = 0$ to a generic $k_0 \in \mathcal{B}$. However, in this case, $k_0$ need not be a critical point of all bands, and in the Taylor expansion around $k_0$ the first-order terms may not vanish. Let us now briefly see how our results can be extended to such general case. The Luttinger-Kohn wave functions centered in $k = k_0 \in \mathcal{B}$ read as follows [16]:
$$\mathcal{X}_{n,k}(x) = |\mathcal{B}|^{-1/2}\, \mathbf{1}_{\mathcal{B}}(k)\, e^{ik \cdot x}\, e^{ik_0 \cdot x}\, u_{n,k_0}(x),$$
or, in the scaled version,
$$\mathcal{X}^\epsilon_{n,k}(x) = |\mathcal{B}|^{-1/2}\, \mathbf{1}_{\mathcal{B}/\epsilon}(k)\, e^{ik \cdot x}\, e^{i k_0 \cdot x/\epsilon}\, u_{n,k_0}\!\left(\frac{x}{\epsilon}\right).$$
Recalling Definition 2.1, we note that $|\mathcal{B}|^{-1/2}\, e^{ik_0 \cdot x}\, u_{n,k_0}(x) = b_{n,k_0}(x)$ is the $n$-th Bloch wave with wave vector $k_0$. The envelope function decomposition corresponding to such Luttinger-Kohn basis is given by (14) with
$$v_n(x) = e^{ik_0 \cdot x}\, u_{n,k_0}(x). \tag{91}$$


Indeed, the envelope function theorem, Theorem 2.1, has been proved for $v_n$ periodic but it is not difficult to see that it remains true for $v_n$ quasi-periodic, i.e. of the form (91). The scaled envelope decomposition, therefore, reads as follows:
$$\psi(x) = |\mathcal{C}|^{1/2} \sum_n f^\epsilon_n(x)\, v^\epsilon_n(x) = |\mathcal{C}|^{1/2} \sum_n f^\epsilon_n(x)\, e^{i k_0 \cdot x/\epsilon}\, u_{n,k_0}\!\left(\frac{x}{\epsilon}\right). \tag{92}$$
We remark that an analogous decomposition appears (for a single band) in the work of Allaire and Piatnitski [4]. By using the fundamental property (8) it is easy to see that the evolution equation for the envelope functions $f^\epsilon_n(t, x)$ is still given by Eq. (32), where now $E_n = E_n(k_0)$ and
$$P_{nn'} = \int_{\mathcal{C}} e^{-ik_0 \cdot x}\, \overline{u_{n,k_0}(x)}\, \nabla\bigl( e^{ik_0 \cdot x}\, u_{n',k_0}(x) \bigr)\, dx.$$

The analysis of Sects. 3 and 4 still holds with the only difference that the Taylor expansion (58) now contains a first-order term:
$$\lambda_n(\xi) = E_n + \xi \cdot V_n + \frac{1}{2}\, \xi \cdot \mathbb{M}_n^{-1} \xi + O(|\xi|^3), \tag{93}$$
where${}^3$
$$E_n = E_n(k_0), \qquad V_n = \nabla E_n(k_0), \qquad \mathbb{M}_n^{-1} = \nabla \otimes \nabla E_n(k_0). \tag{94}$$
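The coefficients in (93) and (94) can be computed numerically once a band function is available. The sketch below is an illustration only: the one-dimensional band `band(k) = -2 cos k`, the expansion point `k0 = 0.7` and the finite-difference step `h` are hypothetical choices and are not taken from the paper. It approximates $E_n(k_0)$, the drift velocity $V_n$ and the inverse effective mass $\mathbb{M}_n^{-1}$ by central differences and evaluates the quadratic model of (93).

```python
import math

def band(k):
    """Hypothetical 1D energy band E(k) = -2 cos k (an assumption, not from the paper)."""
    return -2.0 * math.cos(k)

def taylor_coefficients(E, k0, h=1e-4):
    """Central-difference approximations of E(k0), E'(k0) and E''(k0)."""
    E0 = E(k0)
    V = (E(k0 + h) - E(k0 - h)) / (2.0 * h)             # drift velocity V_n
    Minv = (E(k0 + h) - 2.0 * E0 + E(k0 - h)) / h ** 2  # inverse effective mass
    return E0, V, Minv

def quadratic_model(E, k0, xi):
    """Second-order Taylor model E0 + xi*V + 0.5*Minv*xi^2 around k0."""
    E0, V, Minv = taylor_coefficients(E, k0)
    return E0 + xi * V + 0.5 * Minv * xi ** 2

k0 = 0.7  # a non-critical point of this band: the drift velocity does not vanish
E0, V, Minv = taylor_coefficients(band, k0)
print(E0, V, Minv)
print(abs(band(k0 + 0.1) - quadratic_model(band, k0, 0.1)))  # cubic remainder
```

For this toy band the point $k_0 = 0$ is critical (the first-order coefficient vanishes there), which corresponds to the situation treated in the previous sections, while any $k_0 \ne 0$ inside the zone produces a non-zero drift.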

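The resulting effective mass dynamics,
$$i\partial_t f^\epsilon_{em,n} = \frac{E_n}{\epsilon^2}\, f^\epsilon_{em,n} - \frac{i}{\epsilon}\, V_n \cdot \nabla f^\epsilon_{em,n} - \frac{1}{2}\, \operatorname{div}\bigl(\mathbb{M}_n^{-1} \nabla f^\epsilon_{em,n}\bigr) + \sum_{n'} V_{nn'}\, f^\epsilon_{em,n'}, \tag{95}$$
now contains drift terms of order $1/\epsilon$. It turns out, however, that Theorems 4.1, 4.2, 5.1 and 5.2 are still valid, because they are only based on the general properties of envelope functions and/or on the analyticity of the eigencouples $\bigl(\varphi^{(n)}(\xi), \lambda_n(\xi)\bigr)$ in a neighborhood of $\xi = 0$. In order to investigate the limit dynamics, we have to re-define the functions $h^\epsilon_{n,em}$ (compare with Eq. (68)) in order to filter out both oscillations and drift:
$$f^\epsilon_{n,em}(t, x) = e^{-itE_n/\epsilon^2}\, h^\epsilon_{n,em}\!\left(t,\ x - \frac{t}{\epsilon}\, V_n\right).$$
Then, the functions $h^\epsilon_{n,em}$ will be solutions of
$$i\partial_t h^\epsilon_{em,n} = -\frac{1}{2}\, \operatorname{div}\bigl(\mathbb{M}_n^{-1} \nabla h^\epsilon_{em,n}\bigr) + \sum_{n'} e^{i\omega_{nn'} t/\epsilon^2}\, V_{nn'}\!\left(x + \frac{t}{\epsilon}\, V_n\right) h^\epsilon_{em,n'}, \tag{96}$$
with $\omega_{nn'} = E_n - E_{n'}$. Now, assuming that the external potential is such that
$$\lim_{|x| \to \infty} V(x, z) = V^\infty(z),$$
uniformly in $z \in \mathcal{C}$, we have to compare (96) with the limit dynamics
$$i\partial_t h_{em,n} = -\frac{1}{2}\, \operatorname{div}\bigl(\mathbb{M}_n^{-1} \nabla h_{em,n}\bigr) + \widetilde V_n\, h_{em,n}, \tag{97}$$
where
$$\widetilde V_n(x) = \begin{cases} \displaystyle\int_{\mathcal{C}} V(x, z)\, |v_n(z)|^2\, dz, & \text{if } V_n = 0, \\[2ex] \displaystyle\int_{\mathcal{C}} V^\infty(z)\, |v_n(z)|^2\, dz, & \text{if } V_n \ne 0. \end{cases} \tag{98}$$

${}^3$ Note that the expansion around 0 of the bands $\lambda_n$ corresponds to the expansion around $k_0$ of the bands $E_n$: this is due to the phase factor $e^{ik_0 \cdot x}$, contained in $v_n$, which shifts the wave vector space of envelope functions (see also Remark 4.1).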

Of course, if $k_0$ is a critical point for all bands, then all the drift velocities $V_n$ vanish and Eq. (97) reduces to Eq. (71). Note that the potentials $V(x, z)$ considered so far are of class $\mathcal{W}_\mu$ for some $\mu > 0$ (see Definition 2.4), which prevents $V^\infty(z)$ from being different from 0. However, it is readily seen that all the results we have proved hold more generally for a potential of the form
$$V(x, z) = V^\mu(x, z) + V^\infty(z), \tag{99}$$
where $V^\mu \in \mathcal{W}_\mu$, $V^\infty \in L^\infty(\mathbb{R}^d)$ and $\lim_{|x| \to \infty} V^\mu(x, z) = 0$ uniformly in $z \in \mathcal{C}$. All this considered, the proof of Theorem 5.3 can be straightforwardly adapted to the more general situation in which the effective mass dynamics Eq. (69) is substituted by Eq. (96), the limit dynamics Eq. (71) is substituted by Eq. (97), and the potential $V$ is assumed to be of the form (99). However, the statement of Theorem 5.4 has to be reformulated because the different drift velocities $\frac{1}{\epsilon} V_n$ of the different envelope functions need to be accompanied by corresponding drifts of the test functions (see Eq. (102)). The main theorem, Theorem 2.3, can then be generalized as follows.

Theorem 6.1. Let $k_0 \in \mathcal{B}$ be such that all the eigenvalues $E_n = E_n(k_0)$ are simple, and let
$$V_n = \nabla E_n(k_0), \qquad \mathbb{M}_n^{-1} = \nabla \otimes \nabla E_n(k_0).$$
Let $\psi^{in,\epsilon}$ be an initial datum in $L^2(\mathbb{R}^d)$ and let $f^{in,\epsilon}_n$ be its scaled envelope functions relative to the basis $v_n(x) = e^{ik_0 \cdot x}\, u_{n,k_0}(x)$. Assume that $\mu > 0$ exists such that the sequence $f^{in,\epsilon} = (f^{in,\epsilon}_n)$ belongs to $\mathcal{H}_\mu$, with a uniform bound for the norm as $\epsilon$ vanishes, and that it converges in $\mathcal{L}^2$ as $\epsilon$ tends to zero to an initial datum $f^{in} = (f^{in}_n)$. Let $\psi^\epsilon(t, x)$ be the unique mild solution of
$$i\partial_t \psi^\epsilon(t, x) = \left( -\frac{1}{2}\, \Delta + \frac{1}{\epsilon^2}\, W_{\mathcal{L}}\!\left(\frac{x}{\epsilon}\right) + V\!\left(x, \frac{x}{\epsilon}\right) \right) \psi^\epsilon(t, x), \qquad \psi^\epsilon(t = 0) = \psi^{in,\epsilon},$$
let $f^\epsilon_n(t, x)$ be its scaled envelope functions relative to the basis $v_n$ and let
$$\psi^\epsilon_n(t, x) = |\mathcal{C}|^{1/2}\, f^\epsilon_n(t, x)\, v^\epsilon_n(x) \tag{100}$$
be the projection of $\psi^\epsilon$ in the $n$-th envelope-function subspace. Moreover, assume that $V(x, z) = V^\mu(x, z) + V^\infty(z)$, with $V^\mu \in \mathcal{W}_\mu$, $V^\infty \in L^\infty(\mathbb{R}^d)$ and $\lim_{|x| \to \infty} V^\mu(x, z) = 0$ uniformly in $z \in \mathcal{C}$. Finally, let $h_{em,n}$ be the mild solution of the homogenized Schrödinger equation
$$i\partial_t h_{em,n} = -\frac{1}{2}\, \operatorname{div}\bigl(\mathbb{M}_n^{-1} \nabla h_{em,n}\bigr) + \widetilde V_n\, h_{em,n}, \qquad h_{em,n}(t = 0) = f^{in}_n, \tag{101}$$
where $\widetilde V_n$ is given by (98). Then, for every sequence of test functions $\theta_n \in L^1(\mathbb{R}^d)$ for which $\phi \in L^1(\mathbb{R}^d)$ exists such that $|\hat\theta_n(k)| \le \phi(k)$ for a.e. $k \in \mathbb{R}^d$ and for all $n$, we have
$$\lim_{\epsilon \to 0} \sum_n \int \Bigl[ \theta_n\!\left(x - \frac{t}{\epsilon}\, V_n\right) |\psi^\epsilon_n(t, x)|^2 - \theta_n(x)\, |h_{em,n}(t, x)|^2 \Bigr]\, dx = 0, \tag{102}$$

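uniformly in bounded time intervals.

Proof. According to the previous discussion, we shall assume that the analogues of Theorems 5.1, 5.2 and 5.3 hold. Then, we proceed as in the proof of Theorem 5.4. Let us put $\Theta^\epsilon_n(t, x) = \theta_n\bigl(x - \frac{t}{\epsilon} V_n\bigr)$ and rewrite Eq. (102) (to be proved) as
$$\lim_{\epsilon \to 0} \sum_n \Bigl( \langle \Theta^\epsilon_n(t)\, \psi^\epsilon_n(t), \psi^\epsilon_n(t) \rangle - \langle \theta_n h_{em,n}(t), h_{em,n}(t) \rangle \Bigr) = 0 \tag{103}$$
(where $\langle \cdot, \cdot \rangle$ and $\|\cdot\|$ denote the hermitian product and norm of $L^2(\mathbb{R}^d)$). Let
$$h^\epsilon_n(t, x) = f^\epsilon_n\!\left(t,\ x + \frac{t}{\epsilon}\, V_n\right) e^{iE_n t/\epsilon^2},$$
where $f^\epsilon_n$ are the envelope functions of $\psi^\epsilon$. Using the fact that $f^\epsilon_n \mapsto h^\epsilon_n$ is a unitary transformation in $\mathcal{L}^2$, from (the analogues of) Theorems 5.1, 5.2 and 5.3 we have
$$\lim_{\epsilon \to 0} \|h^\epsilon(t) - h_{em}(t)\|^2_{\mathcal{L}^2} = 0, \tag{104}$$
uniformly in bounded time intervals (BTIs). Then, we put as usual
$$\tilde\Theta^\epsilon_n = T_{\frac{1}{3\epsilon}}(\Theta^\epsilon_n), \qquad \tilde h^\epsilon_n = T_{\frac{1}{3\epsilon}}(h^\epsilon_n), \qquad \tilde h^\epsilon_{em,n} = T_{\frac{1}{3\epsilon}}(h_{em,n}), \qquad \tilde\theta^\epsilon_n = T_{\frac{1}{3\epsilon}}(\theta_n)$$
(see definition (89)) and, recalling that
$$\psi^\epsilon(t, x) = |\mathcal{C}|^{1/2} \sum_n h^\epsilon_n\!\left(t,\ x - \frac{t}{\epsilon}\, V_n\right) e^{-iE_n t/\epsilon^2}\, v^\epsilon_n(x),$$
we define
$$\tilde\psi^\epsilon(t, x) = |\mathcal{C}|^{1/2} \sum_n \tilde h^\epsilon_n\!\left(t,\ x - \frac{t}{\epsilon}\, V_n\right) e^{-iE_n t/\epsilon^2}\, v^\epsilon_n(x).$$
Thus, as in the proof of Theorem 5.4 we can deduce that
$$\lim_{\epsilon \to 0} \|\psi^\epsilon(t) - \tilde\psi^\epsilon(t)\| = 0,$$
uniformly in BTIs, and, by using
$$\|\Theta^\epsilon_n(t)\|_\infty = \|\theta_n\|_\infty \le \|\hat\theta_n\|_{L^1} \le \|\phi\|_{L^1},$$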

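the orthogonality properties
$$\sum_n \|\psi^\epsilon_n(t)\|^2 = \|\psi^\epsilon(t)\|^2, \qquad \sum_n \|\tilde\psi^\epsilon_n(t)\|^2 = \|\tilde\psi^\epsilon(t)\|^2,$$
and the norm conservation
$$\|\tilde\psi^\epsilon(t)\| = \|\tilde h^\epsilon(t)\|_{\mathcal{L}^2} \le \|h^\epsilon(t)\|_{\mathcal{L}^2} = \|f^{in,\epsilon}\|_{\mathcal{L}^2},$$
we obtain
$$\lim_{\epsilon \to 0} \sum_n \Bigl( \langle \Theta^\epsilon_n(t)\, \psi^\epsilon_n(t), \psi^\epsilon_n(t) \rangle - \langle \Theta^\epsilon_n(t)\, \tilde\psi^\epsilon_n(t), \tilde\psi^\epsilon_n(t) \rangle \Bigr) = 0$$
and, therefore, in (103) we can replace $\psi^\epsilon$ by $\tilde\psi^\epsilon$. Moreover,
$$\Bigl| \sum_n \bigl\langle \bigl(\Theta^\epsilon_n(t) - \tilde\Theta^\epsilon_n(t)\bigr)\, \tilde\psi^\epsilon_n(t), \tilde\psi^\epsilon_n(t) \bigr\rangle \Bigr| \le \sum_n \|\theta_n - \tilde\theta^\epsilon_n\|_\infty\, \|\tilde\psi^\epsilon_n\|^2 \le \sum_n \bigl\|\mathbf{1}_{(\mathcal{B}/3\epsilon)^c}\, \hat\theta_n\bigr\|_{L^1}\, \|\tilde\psi^\epsilon_n\|^2 \le \bigl\|\mathbf{1}_{(\mathcal{B}/3\epsilon)^c}\, \phi\bigr\|_{L^1}\, \|\tilde\psi^\epsilon\|^2$$
goes to 0 uniformly in BTIs as $\epsilon \to 0$, showing that in (103) we can also replace $\Theta^\epsilon_n$ by $\tilde\Theta^\epsilon_n$. But, resorting to the usual Parseval identity, it is easily seen that
$$\sum_n \langle \tilde\Theta^\epsilon_n(t)\, \tilde\psi^\epsilon_n(t), \tilde\psi^\epsilon_n(t) \rangle = \sum_n \langle \tilde\theta^\epsilon_n \tilde h^\epsilon_n(t), \tilde h^\epsilon_n(t) \rangle.$$
Thus, proving (103) is equivalent to proving
$$\lim_{\epsilon \to 0} \sum_n \Bigl( \langle \tilde\theta^\epsilon_n \tilde h^\epsilon_n(t), \tilde h^\epsilon_n(t) \rangle - \langle \theta_n h_{em,n}(t), h_{em,n}(t) \rangle \Bigr) = 0$$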

(uniformly in BTIs), which can be done in exactly the same way as in the final part of the proof of Theorem 5.4 (i.e. where (90) is proved).

The result stated in Theorem 6.1 can be directly compared (apart from an opposite sign convention for the energy bands) with that of Allaire and Piatnitski in Ref. [4], where the homogenized Schrödinger equation (101) is also deduced. However, as already remarked in the Introduction, the two approaches differ in two respects. Indeed, our approach can deal with initial data that can be carried by all the energy bands, while in Ref. [4] the initial datum is concentrated on a finite number of bands. On the other hand, our convergence result holds for the position probability density $|\psi^\epsilon(t, x)|^2$, while Allaire and Piatnitski prove the convergence of the wavefunction $\psi^\epsilon$ (in the sense of double-scale convergence with drift). We believe, although we have not proved it, that our technique is able to prove the convergence of other observables, such as the probability current.

In both approaches, the simplicity of eigenvalues has to be assumed. Nevertheless, this assumption can be removed and replaced by the requirement that the initial-datum envelope functions corresponding to multiple eigenvalues vanish. The proof, however, has to be reshuffled, and we have chosen to stick to the restrictive hypothesis of simple eigenvalues. Let us nevertheless briefly explain how this problem can be dealt with. One important step is the diagonalization of the k·p Hamiltonian, which gives rise to Eq. (79). In this formula the operator $\Lambda^\epsilon$ is diagonal in the $n$ index while $T^*\, \mathcal{U}_0\, T$ is not (the existence of the unitary transformation is still valid even in the case of multiple eigenvalues; it is continuous, but not regular for eigenvalues with multiplicity larger than one). Because of the separation of the eigenvalues, it is easy to show that the eigenspaces with different energies are decoupled from each other (adiabatic decoupling) and we can replace $T^*\, \mathcal{U}_0\, T$ by $U^0_{nn'}\, \delta_{nn'}$. If the initial data are only concentrated on modes with multiplicity one, then the solution itself is almost concentrated on these modes and, for these modes, we can make the expansion of eigenvalues and obtain the effective mass equation (71). Let us also mention a recent work by Fendt-Delebecque and Méhats [11] where the effective mass approximation is performed for the Schrödinger equation with large magnetic field and which relies on large time averaging of almost periodic functions. This approach might be of help for analyzing the limit for multiple eigenvalues.

6.2. Asymptotic behavior of scaled envelope functions. One final question which has not been addressed so far is the relationship between the regularity of the wave function $\psi$ and that of its corresponding sequence $f^\epsilon$ of envelope functions. In particular, one may look for sufficient conditions on $\psi$ so that $f^\epsilon \in \mathcal{H}_\mu$ (Definition 2.3). Since the envelope functions are the coefficients of a Fourier-like expansion of $\psi$ on the basis $v_n$, their decay as $n$ becomes bigger depends not only on the regularity of $\psi$ but also on that of the basis $(v_n)$, which itself depends on the regularity of the potential $W_{\mathcal{L}}$. We show in the present subsection some results in this direction, namely the asymptotic behavior as $\epsilon$ tends to zero of the $\epsilon$-scaled envelope functions relative to the basis $(v_n)$ defined in (27) (i.e. $v_n = u_{n,0}$). From (19) it is readily seen that the limit as $\epsilon$ tends to zero of the envelope functions $f^\epsilon_n$ of $\psi$ is given by
$$\begin{aligned}
\lim_{\epsilon \to 0} \hat f^\epsilon_n(k) &= \lim_{\epsilon \to 0} |\mathcal{B}|^{-1/2}\, \mathbf{1}_{\mathcal{B}/\epsilon}(k) \int_{\mathbb{R}^d} e^{-ik \cdot x}\, v_n\!\left(\frac{x}{\epsilon}\right) \psi(x)\, dx \\
&= |\mathcal{B}|^{-1/2}\, |\mathcal{C}|^{-1}\, \langle v_n, 1 \rangle \int_{\mathbb{R}^d} e^{-ik \cdot x}\, \psi(x)\, dx = |\mathcal{C}|^{-1/2}\, \langle v_n, 1 \rangle\, \hat\psi(k).
\end{aligned}$$
Therefore
$$\lim_{\epsilon \to 0} f^\epsilon_n = |\mathcal{C}|^{-1/2}\, \langle v_n, 1 \rangle\, \psi. \tag{105}$$
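As a quick consistency check of (105), which is not part of the original argument, one can sum the limiting norms over $n$. The sketch below assumes only that $(v_n)$ is an orthonormal basis of $L^2(\mathcal{C})$, as used in the proof of Theorem 2.1, and that $\langle v_n, 1 \rangle = \int_{\mathcal{C}} v_n(x)\, dx$; the interchange of limit and sum is taken for granted here and is not justified.

```latex
\begin{align*}
\sum_n \Bigl\| \lim_{\epsilon \to 0} f^\epsilon_n \Bigr\|^2_{L^2(\mathbb{R}^d)}
  &= |\mathcal{C}|^{-1} \sum_n |\langle v_n, 1 \rangle|^2 \; \|\psi\|^2_{L^2(\mathbb{R}^d)} \\
  &= |\mathcal{C}|^{-1} \, \|1\|^2_{L^2(\mathcal{C})} \, \|\psi\|^2_{L^2(\mathbb{R}^d)}
   = \|\psi\|^2_{L^2(\mathbb{R}^d)} .
\end{align*}
```

The total norm is therefore the same as in the Parseval-type identity (109) for the envelope functions at fixed $\epsilon$. In particular, if one of the basis functions is the constant $|\mathcal{C}|^{-1/2}$, then $\langle v_n, 1 \rangle = |\mathcal{C}|^{1/2}$ for that index and zero for all the others, so that the corresponding envelope function converges to $\psi$ itself.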

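The following proposition shows that the regularity of the crystal potential leads to decay properties of the coefficients $\langle v_n, 1 \rangle = \int_{\mathcal{C}} v_n(x)\, dx$.

Proposition 6.1. Let $W_{\mathcal{L}}$ be in $C^\infty$. Then for any integer $p$, the coefficients $\langle v_n, 1 \rangle$ satisfy the inequality
$$|\langle v_n, 1 \rangle| \le \frac{C_p}{E_n^p},$$
where $C_p$ is a constant only depending on $\|W_{\mathcal{L}}\|_{W^{2p,\infty}}$.

Proof. We first remark that
$$E_n^p\, \langle v_n, 1 \rangle = \langle H_{\mathcal{L}}^p v_n, 1 \rangle = \langle v_n, H_{\mathcal{L}}^p 1 \rangle$$
(where $H_{\mathcal{L}}^p$ denotes the $p$-th power of $H_{\mathcal{L}}$, not to be confused with the notation introduced in Sec. 2). Now it is readily seen that if $W_{\mathcal{L}} \in W^{2p,\infty}$, then $H_{\mathcal{L}}^p 1 \in L^\infty$ with
$$\|H_{\mathcal{L}}^p 1\|_{L^\infty} \le C\, \|W_{\mathcal{L}}\|_{W^{2p,\infty}},$$
for a suitable constant $C \ge 0$. Then
$$E_n^p\, |\langle v_n, 1 \rangle| \le \|v_n\|_{L^2}\, \|H_{\mathcal{L}}^p 1\|_{L^2} \le C_p,$$
with $C_p$ only depending on $\|W_{\mathcal{L}}\|_{W^{2p,\infty}}$, which ends the proof.

We also have the following property.

Lemma 6.1. Let $\lambda$ and $\lambda'$ be two elements of the reciprocal lattice $\mathcal{L}^*$. Assume that $W_{\mathcal{L}} \in C^\infty$. Then, for any integers $k$, $p$, we have the estimate
$$\bigl| \langle H_{\mathcal{L}}^k e^{i\lambda \cdot x}, H_{\mathcal{L}}^k e^{i\lambda' \cdot x} \rangle \bigr| \le C_{k,p}\, \frac{1 + |\lambda|^{2k}\, |\lambda'|^{2k}}{1 + |\lambda - \lambda'|^{2p}},$$
for a suitable constant $C_{k,p} \ge 0$.

Proof. It is clear that $H_{\mathcal{L}}^k e^{i\lambda \cdot x} = \sum_{|\alpha| = 0}^{2k} \lambda^\alpha\, V_\alpha(x)\, e^{i\lambda \cdot x}$, where $V_\alpha$ contains products of $W_{\mathcal{L}}$ and its derivatives up to order $2k - |\alpha|$. Therefore
$$\langle H_{\mathcal{L}}^k e^{i\lambda \cdot x}, H_{\mathcal{L}}^k e^{i\lambda' \cdot x} \rangle = \sum_{|\alpha|, |\beta| = 0}^{2k} \lambda^\alpha\, (\lambda')^\beta \int_{\mathcal{C}} V_\alpha(x)\, \overline{V_\beta(x)}\, e^{i(\lambda - \lambda') \cdot x}\, dx.$$
Now the result can be obtained by simply integrating by parts $2p$ times.

The estimate of Lemma 6.1 is not optimal and can certainly be refined, but this is beyond the scope of our paper. The next proposition follows from the previous result.

Proposition 6.2. Assume $W_{\mathcal{L}} \in L^\infty$ and let $f^\epsilon_n$ be the envelope functions of $\psi$. Then the following estimate holds for any $\mu \ge 0$: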

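$$\|f^\epsilon\|^2_{\mathcal{H}_\mu} = \sum_n \bigl\|(1 + |k|^2)^{\mu/2}\, \hat f^\epsilon_n(k)\bigr\|^2_{L^2} \le C_\mu\, \|\psi\|^2_{H^\mu}. \tag{106}$$
Let now $W_{\mathcal{L}}$ be in $C^\infty$; then the following estimate holds for any integer $s$:
$$\sum_n E_n^s\, \|f^\epsilon_n\|^2_{L^2} \le C_s\, \bigl( \|\psi\|^2_{L^2} + \epsilon^{2s}\, \|\psi\|^2_{H^s} \bigr). \tag{107}$$

Proof. Let us first prove (106). Using the identity
$$\hat f^\epsilon_n(k) = |\mathcal{B}|^{-1/2}\, \mathbf{1}_{\mathcal{B}/\epsilon}(k) \int_{\mathbb{R}^d} e^{-ik \cdot x}\, v_n\!\left(\frac{x}{\epsilon}\right) \psi(x)\, dx,$$
as well as the decomposition
$$v_n(x) = \frac{1}{|\mathcal{C}|^{1/2}} \sum_{\lambda \in \mathcal{L}^*} v_{n,\lambda}\, e^{i\lambda \cdot x},$$
where $v_{n,\lambda} = \bigl\langle v_n, \frac{e^{i\lambda \cdot x}}{|\mathcal{C}|^{1/2}} \bigr\rangle$, we obtain
$$\sum_n \bigl\|(1 + |k|^2)^{\mu/2}\, \hat f^\epsilon_n(k)\bigr\|^2_{L^2} = \sum_n \sum_{\lambda, \lambda'} \int_{\mathcal{B}/\epsilon} (1 + |k|^2)^\mu\, v_{n,\lambda}\, \overline{v_{n,\lambda'}}\; \hat\psi\!\left(k - \frac{\lambda}{\epsilon}\right) \overline{\hat\psi\!\left(k - \frac{\lambda'}{\epsilon}\right)}\, dk.$$
Summing first with respect to $n$ and using the identity
$$\sum_n v_{n,\lambda}\, \overline{v_{n,\lambda'}} = \frac{1}{|\mathcal{C}|}\, \langle e^{i\lambda \cdot x}, e^{i\lambda' \cdot x} \rangle = \delta_{\lambda\lambda'},$$
the right hand side of the above identity takes the simple form
$$\sum_n \bigl\|(1 + |k|^2)^{\mu/2}\, \hat f^\epsilon_n(k)\bigr\|^2_{L^2} = \sum_{\lambda \in \mathcal{L}^*} \int_{\mathcal{B}/\epsilon} (1 + |k|^2)^\mu\, \Bigl| \hat\psi\!\left(k - \frac{\lambda}{\epsilon}\right) \Bigr|^2\, dk.$$
It is now readily seen that there exists a constant $c \ge 1$, only depending on the fundamental cell $\mathcal{C}$, such that for all $k \in \mathcal{B}$ and for all $\lambda \in \mathcal{L}^*$, we have the estimate $|k| \le c\, |k - \lambda|$, so that
$$\begin{aligned}
\sum_n \bigl\|(1 + |k|^2)^{\mu/2}\, \hat f^\epsilon_n(k)\bigr\|^2_{L^2} &\le c^{2\mu} \sum_{\lambda \in \mathcal{L}^*} \int_{\mathcal{B}/\epsilon} \Bigl( 1 + \Bigl| k - \frac{\lambda}{\epsilon} \Bigr|^2 \Bigr)^{\mu}\, \Bigl| \hat\psi\!\left(k - \frac{\lambda}{\epsilon}\right) \Bigr|^2\, dk \\
&= c^{2\mu} \int_{\mathbb{R}^d} (1 + |k|^2)^\mu\, |\hat\psi(k)|^2\, dk.
\end{aligned}$$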

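This implies that a suitable constant $C_\mu$ exists such that (106) holds. Let us now prove (107). We proceed analogously and find
$$\sum_n E_n^s\, \|f^\epsilon_n\|^2_{L^2} = \sum_n \sum_{\lambda, \lambda'} \int_{\mathcal{B}/\epsilon} E_n^s\, v_{n,\lambda}\, \overline{v_{n,\lambda'}}\; \hat\psi\!\left(k - \frac{\lambda}{\epsilon}\right) \overline{\hat\psi\!\left(k - \frac{\lambda'}{\epsilon}\right)}\, dk.$$
As above, we first make the sum over the index $n$ and, therefore, we need to evaluate $\sum_n E_n^s\, v_{n,\lambda}\, \overline{v_{n,\lambda'}}$. We first remark that
$$E_n^s\, v_{n,\lambda} = \Bigl\langle H_{\mathcal{L}}^s v_n, \frac{e^{i\lambda \cdot x}}{|\mathcal{C}|^{1/2}} \Bigr\rangle = \Bigl\langle v_n, H_{\mathcal{L}}^s \frac{e^{i\lambda \cdot x}}{|\mathcal{C}|^{1/2}} \Bigr\rangle$$
and, therefore,
$$\sum_n E_n^s\, v_{n,\lambda}\, \overline{v_{n,\lambda'}} = \frac{1}{|\mathcal{C}|}\, \bigl\langle H_{\mathcal{L}}^s e^{i\lambda \cdot x}, e^{i\lambda' \cdot x} \bigr\rangle.$$
Contrary to the proof of (106), the obtained formula is not diagonal in $(\lambda, \lambda')$ but Lemma 6.1 leads to the following estimate, which holds for large enough integers $p$: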

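$$\begin{aligned}
\sum_n E_n^s\, \|f^\epsilon_n\|^2_{L^2} &\le C_{s,p} \sum_{\lambda, \lambda'} \int_{\mathcal{B}/\epsilon} \frac{1 + |\lambda|^{2s}}{1 + |\lambda - \lambda'|^{2p}}\, \Bigl| \hat\psi\!\left(k - \frac{\lambda}{\epsilon}\right) \Bigr|\, \Bigl| \hat\psi\!\left(k - \frac{\lambda'}{\epsilon}\right) \Bigr|\, dk \\
&\le \frac{C_{s,p}}{2} \sum_{\lambda, \lambda'} \int_{\mathcal{B}/\epsilon} \frac{1 + |\lambda|^{2s}}{1 + |\lambda - \lambda'|^{2p}} \left( \Bigl| \hat\psi\!\left(k - \frac{\lambda}{\epsilon}\right) \Bigr|^2 + \Bigl| \hat\psi\!\left(k - \frac{\lambda'}{\epsilon}\right) \Bigr|^2 \right) dk \\
&\le C_{s,p} \sum_{\lambda \in \mathcal{L}^*} \int_{\mathcal{B}/\epsilon} (1 + |\lambda|^{2s})\, \Bigl| \hat\psi\!\left(k - \frac{\lambda}{\epsilon}\right) \Bigr|^2\, dk.
\end{aligned}$$
Note that we used the fact that, for large enough $p$, the following estimates hold with constants $C_1$ and $C_2$ only depending on $s$ and $p$:
$$\sum_{\lambda \in \mathcal{L}^*} \frac{1 + |\lambda|^{2s}}{1 + |\lambda - \lambda'|^{2p}} \le C_1\, (1 + |\lambda'|^{2s}), \qquad \sum_{\lambda' \in \mathcal{L}^*} \frac{1 + |\lambda|^{2s}}{1 + |\lambda - \lambda'|^{2p}} \le C_2\, (1 + |\lambda|^{2s}).$$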
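The first of these lattice-sum estimates can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions: it takes the one-dimensional lattice $\mathcal{L}^* = \mathbb{Z}$, the exponents $s = 1$ and $p = 3$, and truncates the sum at a finite `cutoff`; none of these choices comes from the paper. It evaluates the ratio of the weighted sum over $\lambda$ to $1 + |\lambda'|^{2s}$ for a range of $\lambda'$.

```python
def weighted_lattice_sum(lam_prime, s=1, p=3, cutoff=400):
    """Truncated sum over lam in Z of (1 + |lam|^(2s)) / (1 + |lam - lam'|^(2p)).

    The lattice Z, the exponents and the cutoff are hypothetical choices.
    """
    total = 0.0
    for lam in range(lam_prime - cutoff, lam_prime + cutoff + 1):
        total += (1 + abs(lam) ** (2 * s)) / (1 + abs(lam - lam_prime) ** (2 * p))
    return total

def ratio(lam_prime, s=1, p=3):
    """Sum divided by 1 + |lam'|^(2s); boundedness in lam' is the claimed estimate."""
    return weighted_lattice_sum(lam_prime, s, p) / (1 + abs(lam_prime) ** (2 * s))

ratios = [ratio(lp) for lp in range(-50, 51)]
print(min(ratios), max(ratios))
```

For these exponents the ratio stays between two fixed positive numbers over the whole range, consistent with a constant $C_1$ that depends only on $s$ and $p$.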

Now, for $\lambda \ne 0$ and $k \in \mathcal{B}$ it is readily seen that $|\lambda| \le c_0\, |\lambda - k|$, where $c_0$ is a positive constant independent of $\lambda$ and $k$. Therefore,
$$\sum_{\lambda \in \mathcal{L}^*} \int_{\mathcal{B}/\epsilon} (1 + |\lambda|^{2s})\, \Bigl| \hat\psi\!\left(k - \frac{\lambda}{\epsilon}\right) \Bigr|^2\, dk \le \|\psi\|^2_{L^2} + c_0^{2s}\, \epsilon^{2s}\, \|\psi\|^2_{H^s},$$
which implies that a suitable constant $C_s$ exists such that (107) holds.

Acknowledgements. N. Ben Abdallah acknowledges support from the project QUATRAIN (BLAN07-2 212988), funded by the French Agence Nationale de la Recherche, and from the Marie Curie Project DEASE: MEST-CT-2005-021122, funded by the European Union. L. Barletti acknowledges support from the Italian national research project PRIN 2006 “Mathematical modelling of semiconductor devices, mathematical methods in kinetic theories and applications” (2006012132_004).

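A. Postponed Proofs

A.1. Proof of Theorem 2.1. For any Schwartz function $\psi$ we can write
$$\begin{aligned}
\psi(x) &= (2\pi)^{-d/2} \int_{\mathbb{R}^d} \hat\psi(k)\, e^{ik \cdot x}\, dk = (2\pi)^{-d/2} \sum_{\eta \in \mathcal{L}^*} \int_{\mathcal{B} + \eta} \hat\psi(k)\, e^{ik \cdot x}\, dk \\
&= \sum_{\eta \in \mathcal{L}^*} (2\pi)^{-d/2}\, e^{i\eta \cdot x} \int_{\mathcal{B}} \hat\psi(\xi + \eta)\, e^{i\xi \cdot x}\, d\xi = \sum_{\eta \in \mathcal{L}^*} e^{i\eta \cdot x}\, G_\eta(x),
\end{aligned}$$
where
$$G_\eta(x) = (2\pi)^{-d/2} \int_{\mathcal{B}} \hat\psi(\xi + \eta)\, e^{i\xi \cdot x}\, d\xi$$
clearly belongs to $\mathcal{F}^* L^2_{\mathcal{B}}(\mathbb{R}^d)$. Moreover, we have
$$\sum_{\eta \in \mathcal{L}^*} \|G_\eta\|^2_{L^2} = \sum_{\eta \in \mathcal{L}^*} \|\hat G_\eta\|^2_{L^2} = \sum_{\eta \in \mathcal{L}^*} \int_{\mathbb{R}^d} \bigl| \hat\psi(\xi + \eta)\, \mathbf{1}_{\mathcal{B}}(\xi) \bigr|^2\, d\xi = \sum_{\eta \in \mathcal{L}^*} \int_{\mathcal{B} + \eta} |\hat\psi(\xi)|^2\, d\xi = \|\psi\|^2_{L^2}.$$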

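Thus, defining
$$F(x, y) = \sum_{\eta \in \mathcal{L}^*} e^{i\eta \cdot x}\, G_\eta(y), \qquad (x, y) \in \mathcal{C} \times \mathbb{R}^d,$$
we have that $F \in L^2(\mathcal{C} \times \mathbb{R}^d)$ and
$$|\mathcal{C}|^{-1}\, \|F\|^2_{L^2(\mathcal{C} \times \mathbb{R}^d)} = \sum_{\eta \in \mathcal{L}^*} \|G_\eta\|^2_{L^2(\mathbb{R}^d)} = \|\psi\|^2_{L^2(\mathbb{R}^d)}$$
(where we used the fact that $\{ |\mathcal{C}|^{-1/2} e^{i\eta \cdot x} \mid \eta \in \mathcal{L}^* \}$ is an orthonormal basis of $L^2(\mathcal{C})$). Since $\{ v_n \mid n \in \mathbb{N} \}$ is another orthonormal basis of $L^2(\mathcal{C})$, then we can also write
$$F(x, y) = |\mathcal{C}|^{1/2} \sum_n f_n(y)\, v_n(x), \tag{108}$$
where $f_n(y) = |\mathcal{C}|^{-1/2}\, \langle F(\cdot, y), v_n \rangle_{L^2(\mathcal{C})}$. Note that $\hat f_n \in L^2_{\mathcal{B}}(\mathbb{R}^d)$ for every $n$ and that
$$\|\psi\|^2_{L^2(\mathbb{R}^d)} = |\mathcal{C}|^{-1}\, \|F\|^2_{L^2(\mathcal{C} \times \mathbb{R}^d)} = \sum_n \|f_n\|^2_{L^2(\mathbb{R}^d)}. \tag{109}$$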

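For $y = x$, (108) yields (14), at least for Schwartz functions. However, it can be easily proved that the mapping $\psi \mapsto (f_0, f_1, \ldots)$ can be uniquely extended to an isometry between $L^2(\mathbb{R}^d)$ and $\ell^2(\mathbb{N}, \mathcal{F}^* L^2_{\mathcal{B}}(\mathbb{R}^d))$, with the properties (14) and (15). The identity (20) follows from a straightforward computation.

A.2. Proof of Theorem 2.2. We have to prove Eq. (21), that we rewrite as
$$\lim_{\epsilon \to 0} \sum_n \langle \theta f^\epsilon_n, f^\epsilon_n \rangle = \langle \theta \psi, \psi \rangle, \tag{110}$$
where $\langle \cdot, \cdot \rangle$ denotes, throughout this proof, the hermitian product of $L^2(\mathbb{R}^d)$. Recalling definition (89), let
$$\tilde f^\epsilon_n = T_{\frac{1}{3\epsilon}}(f^\epsilon_n), \qquad \tilde\theta^\epsilon = T_{\frac{1}{3\epsilon}}(\theta),$$
and define
$$\tilde\psi^\epsilon(x) = |\mathcal{C}|^{1/2} \sum_n \tilde f^\epsilon_n(x)\, v^\epsilon_n(x). \tag{111}$$
Then, we can write
$$\begin{aligned}
\Bigl| \langle \theta \psi, \psi \rangle - \sum_n \langle \theta f^\epsilon_n, f^\epsilon_n \rangle \Bigr| \le {} & \bigl| \langle \theta \psi, \psi \rangle - \langle \theta \tilde\psi^\epsilon, \tilde\psi^\epsilon \rangle \bigr| + \bigl| \langle (\theta - \tilde\theta^\epsilon)\, \tilde\psi^\epsilon, \tilde\psi^\epsilon \rangle \bigr| \\
& + \Bigl| \langle \tilde\theta^\epsilon \tilde\psi^\epsilon, \tilde\psi^\epsilon \rangle - \sum_n \langle \tilde\theta^\epsilon \tilde f^\epsilon_n, \tilde f^\epsilon_n \rangle \Bigr| + \Bigl| \sum_n \langle \tilde\theta^\epsilon \tilde f^\epsilon_n, \tilde f^\epsilon_n \rangle - \sum_n \langle \tilde\theta^\epsilon f^\epsilon_n, f^\epsilon_n \rangle \Bigr| \\
& + \Bigl| \sum_n \langle (\tilde\theta^\epsilon - \theta)\, f^\epsilon_n, f^\epsilon_n \rangle \Bigr| =: I_1 + I_2 + I_3 + I_4 + I_5 .
\end{aligned}$$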


Since $\operatorname{supp}\bigl(\widehat{\tilde\theta^\epsilon \tilde f^\epsilon_n}\bigr) \subset \mathcal{B}/3\epsilon + \mathcal{B}/3\epsilon \subset \mathcal{B}/\epsilon$, then $\tilde\theta^\epsilon \tilde f^\epsilon_n$ are the envelope functions of $\tilde\theta^\epsilon \tilde\psi^\epsilon$ and the Parseval identity (18) can be applied to the functions $\tilde\psi^\epsilon$ and $\tilde\theta^\epsilon \tilde\psi^\epsilon$, which yields $I_3 = 0$. As far as the terms $I_2$ and $I_5$ are concerned, we have
$$I_2 \le \|\theta - \tilde\theta^\epsilon\|_{L^\infty}\, \|\tilde\psi^\epsilon\|^2_{L^2} = \|\theta - \tilde\theta^\epsilon\|_{L^\infty} \sum_n \|\tilde f^\epsilon_n\|^2_{L^2} \le \bigl\|\hat\theta - \hat{\tilde\theta}{}^\epsilon\bigr\|_{L^1}\, \|\psi\|^2_{L^2}$$
and, therefore, $I_2 \to 0$ as $\epsilon \to 0$. Similarly we can prove that $I_5 \to 0$. In order to prove that $I_1$ and $I_4$ also vanish in the limit, let us first remark that, using the envelope function decomposition (16), the Fourier transform of $\psi$ can be expressed as follows:
$$\hat\psi(k) = |\mathcal{C}|^{1/2} \sum_n \hat f^\epsilon_n(k - \lambda_k)\, v^\epsilon_{n,\lambda_k}, \tag{112}$$
where $v^\epsilon_{n,\lambda}$, for $\lambda \in \mathcal{L}^*/\epsilon$, are the Fourier coefficients of the periodic functions $v^\epsilon_n(x)$ and $\lambda_k$ is defined as the unique $\lambda_k \in \mathcal{L}^*/\epsilon$ such that $k - \lambda_k \in \mathcal{B}/\epsilon$. From (111) and (112) we obtain
$$\widehat{(\psi - \tilde\psi^\epsilon)}(k) = \bigl( 1 - \mathbf{1}_{\mathcal{B}/3\epsilon}(k - \lambda_k) \bigr)\, \hat\psi(k)$$
and, if $R$ is the radius of a ball contained in $\mathcal{B}$,
$$\|\psi - \tilde\psi^\epsilon\|^2_{L^2} \le \int_{|k - \lambda_k| \ge R/3\epsilon} |\hat\psi(k)|^2\, dk \le \int_{|k| \ge R/3\epsilon} |\hat\psi(k)|^2\, dk,$$
showing that $\tilde\psi^\epsilon \to \psi$ in $L^2$, which also implies that $\theta \tilde\psi^\epsilon \to \theta \psi$ in $L^2$. Hence, $I_1 \to 0$, because of the continuity of the hermitian product. Now, since $\|f^\epsilon - \tilde f^\epsilon\|_{\mathcal{L}^2} = \|\psi - \tilde\psi^\epsilon\|_{L^2}$, the same arguments show that $I_4 \to 0$. In conclusion (110) holds and Theorem 2.2 is therefore proved.

References

1. Albert, J.H.: Genericity of simple eigenvalues for elliptic PDE’s. Proc. Amer. Math. Soc. 48(2), 413–418 (1975)
2. Allaire, G., Conca, C.: Bloch wave homogenization and spectral asymptotic analysis. J. Math. Pures Appl. (9) 77(2), 153–208 (1998)
3. Allaire, G., Capdeboscq, Y., Piatnitski, A., Siess, V., Vanninathan, M.: Homogenization of periodic systems with large potentials. Arch. Rat. Mech. Anal. 174, 179–220 (2004)
4. Allaire, G., Piatnitski, A.: Homogenization of the Schrödinger equation and effective mass theorems. Commun. Math. Phys. 258, 1–22 (2005)
5. Allaire, G., Vanninathan, M.: Homogenization of the Schrödinger equation with a time oscillating potential. Dis. Contin. Dyn. Syst. Ser. B 6, 1–16 (2006)
6. Allaire, G.: Periodic homogenization and effective mass theorems for the Schrödinger equation. In: Ben Abdallah, N., Frosali, G. (eds.) Quantum transport. Modelling, analysis and asymptotics. Lecture Notes in Math. 1946. Berlin: Springer, 2008
7. Ashcroft, N.W., Mermin, N.D.: Solid State Physics. Philadelphia, PA: Saunders College Publishing, 1976
8. Bastard, G.: Wave mechanics applied to semiconductor heterostructures. New York: Wiley Interscience, 1990
9. Berezin, F.A., Shubin, M.A.: The Schrödinger Equation. Dordrecht: Kluwer, 1991
10. Burt, M.G.: The justification for applying the effective mass approximation to microstructures. J. Phys. Condens. Matter 4, 6651–6690 (1992)
11. Fendt-Delebecque, F., Méhats, F.: An effective mass theorem for the bidimensional electron gas in a strong magnetic field. Commun. Math. Phys. 292, 829–870 (2009)
12. Giusti, E.: Direct methods in the calculus of variations. Singapore: World Scientific, 2003
13. Hagedorn, G.A., Joye, A.: A time-dependent Born-Oppenheimer approximation with exponentially small error estimates. Commun. Math. Phys. 223, 583–626 (2001)
14. Kato, T.: Perturbation Theory for Linear Operators (Second edition). Berlin: Springer-Verlag, 1980
15. Kuchment, P.: Floquet theory for partial differential equations. In: Operator Theory: Advances and Applications, 60. Basel: Birkhäuser Verlag, 1993
16. Luttinger, J.M., Kohn, W.: Motion of electrons and holes in perturbed periodic fields. Phys. Rev. 97, 869–882 (1955)
17. Panati, G., Spohn, H., Teufel, S.: The time-dependent Born-Oppenheimer approximation. M2AN Math. Model. Numer. Anal. 41, 297–314 (2007)
18. Poupaud, F., Ringhofer, C.: Semi-classical limits in a crystal with exterior potentials and effective mass theorems. Commun. Part. Diff. Eq. 21, 1897–1918 (1996)
19. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, I - Functional Analysis. New York: Academic Press, 1972
20. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, IV - Analysis of Operators. New York: Academic Press, 1978
21. Spohn, H., Teufel, S.: Adiabatic decoupling and time-dependent Born-Oppenheimer theory. Commun. Math. Phys. 224, 113–132 (2001)
22. Uhlenbeck, K.: Generic properties of eigenfunctions. Amer. J. Math. 98(4), 1059–1078 (1976)
23. Teufel, S.: Adiabatic Perturbation Theory in Quantum Dynamics. Berlin: Springer-Verlag, 2003
24. Wenckebach, T.: Essentials of Semiconductor Physics. Chichester: Wiley, 1999

Communicated by P. Constantin

Commun. Math. Phys. 307, 609–627 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1346-2

Communications in

Mathematical Physics

A Short Proof of Stability of Topological Order under Local Perturbations Sergey Bravyi1 , Matthew B. Hastings2 1 IBM Watson Research Center, Yorktown Heights, NY 10594, USA. E-mail: [email protected] 2 Microsoft Research Station Q, CNSI Building, University of California, Santa Barbara, CA 93106, USA.

E-mail: [email protected] Received: 27 January 2010 / Accepted: 6 June 2011 Published online: 17 September 2011 – © Springer-Verlag 2011

Abstract: Recently, the stability of certain topological phases of matter under weak perturbations was proven. Here, we present a short, alternate proof of the same result. We consider models of topological quantum order for which the unperturbed Hamiltonian H0 can be written as a sum of local pairwise commuting projectors on a D-dimensional lattice. We consider a perturbed Hamiltonian H = H0 + V involving a generic perturbation V that can be written as a sum of short-range bounded-norm interactions. We prove that if the strength of V is below a constant threshold value then H has well-defined spectral bands originating from the low-lying eigenvalues of H0 . These bands are separated from the rest of the spectrum and from each other by a constant gap. The width of the band originating from the smallest eigenvalue of H0 decays faster than any power of the lattice size. 1. Introduction Quantum spin Hamiltonians exhibiting topological quantum order (TQO) have a remarkable property that their ground state degeneracy cannot be lifted by generic local perturbations [1,2]. In addition, the spectral gap above the ground state does not close in the presence of such perturbations. This property is in sharp contrast with the behavior of classical spin Hamiltonians such as the 2D Ising model for which the ground state degeneracy is unstable in the presence of the external magnetic field. Recently the authors of [3] presented a rigorous proof of the gap stability for a large class of Hamiltonians describing TQO. The proof was direct, but very long. This paper presents a short technical note giving an alternate proof of the same result. We feel that both proofs are worth having, since they are complementary. In particular, the present proof is much shorter, but less direct. Since we consider the same models and prove the same result, many of the definitions in this paper are identical. 
The interest in proving stability of a spectral gap under perturbations has increased due to recent progress in mathematical physics, where a combination of Lieb-Robinson


bounds [4–6] and the method of quasi-adiabatic continuation [7,8] with appropriately chosen filter functions now provides powerful techniques for studying the properties of gapped local quantum Hamiltonians. Perhaps surprisingly, we will use similar techniques to prove the existence of a gap. In contrast, to the best of our knowledge, all previous works addressing the gap stability problem relied on perturbative methods such as the cluster expansion for the thermal Gibbs state [9–11], Kirkwood-Thomas expansion [12,13], or the coupled cluster method [14,15]. Our techniques are particularly well suited to handle topologically ordered ground states, although they can also be applied to topological trivial states such as ground states of classical Hamiltonians (in the classical case stability is possible only for non-degenerate ground states). Further, as shown in [3,7], the gap stability implies that many of the topological properties of the unperturbed system carry over to the perturbed system. We consider a system composed of finite-dimensional quantum particles (qudits) occupying sites of a D-dimensional lattice of linear size L. Suppose the unperturbed Hamiltonian H0 can be written as a sum of geometrically local pairwise commuting projectors, H0 =

$\sum_{A \subseteq \Lambda} Q_A$,

such that its ground subspace P is annihilated by every projector, $Q_A P = 0$. We impose two extra conditions on $H_0$ and the ground subspace P which are responsible for the topological order. In [3], it was shown that neither of these conditions by itself is sufficient for the gap stability. Let us first state these conditions informally (see Sect. 2 for formal definitions):

TQO-1: The ground subspace P is a quantum code with a macroscopic distance.
TQO-2: Local ground subspaces are consistent with the global one.

We consider a perturbation V that can be written as a sum of local interactions

$$V = \sum_{r \ge 1} \sum_{A \in S(r)} V_{r,A},$$

where S(r) is a set of cubes of linear size r and $V_{r,A}$ is an operator acting on sites of A. We assume that the magnitude of interactions decays exponentially for large r,

$$\max_{A \in S(r)} \|V_{r,A}\| \le J e^{-\mu r},$$

where J, μ > 0 are some constants independent of L. Our main result is the following theorem.

Theorem 1. There exist constants $J_0, c_1 > 0$ depending only on μ and the spatial dimension D such that for all $J \le J_0$ the spectrum of $H_0 + V$ is contained (up to an overall energy shift) in the union of intervals $\bigcup_{k \ge 0} I_k$, where k runs over the spectrum of $H_0$ and $I_k$ is the closed interval $I_k = [k(1 - c_1 J) - \delta,\ k(1 + c_1 J) + \delta]$, for some δ bounded by J times a quantity decaying faster than any power of L.


Using the quasi-adiabatic continuation operators in [23], this superpolynomial bound on δ can be improved to a bound by an exponential of a polynomial of L, where the polynomial may be taken to be arbitrarily close to linear. The stability proof presented in [3] involved two main steps: (i) proving stability for a special class of block-diagonal perturbations V composed of local interactions Vr,A preserving the ground subspace of H0 , and (ii) reducing generic local perturbations to block-diagonal perturbations. The most technical part of the proof was step (ii) which required complicated convergence analysis for Hamiltonian flow equations. In the present paper we show how to simplify step (ii) significantly using an exact quasi-adiabatic continuation technique [7,8]. In Sect. 3 we define two types of block-diagonal perturbations: the ones that preserve the ground subspace of H0 locally ([Vr,A , P] = 0) and globally ([V, P] = 0). We prove stability under locally block-diagonal perturbations in Sect. 5 which mostly follows [3] and uses the technique of relatively bounded operators. The reduction from globally block-diagonal to locally block-diagonal perturbations is in two steps; in Lemma 7 we exploit the idea of [16,17] to write a Hamiltonian of a gapped system as a sum of terms such that the ground states are (approximate) eigenvectors of each term separately, and then we show that such terms can be written as locally block-diagonal perturbations in Sect. 4. Finally, we use an exact quasi-adiabatic continuation technique presented in Sect. 6 to reduce generic local perturbations to globally block-diagonal perturbations in Sect. 7. In contrast to [3] we use certain self-consistent assumptions on the spectral gap of the perturbed Hamiltonian Hs = H0 + sV . In Sect. 7 we prove that if the minimal gap min along the path 0 ≤ s ≤ 1 is bounded by some constant (say 1/2) then, in fact, min is bounded by a larger constant (say 3/4). 
We use continuity arguments to translate this result into an unconditional constant bound on $\Delta_{\min}$, see Sect. 7. The continuity arguments we use rely on the fact that the spectrum of $H_0 + sV$ is a continuous function of s for any finite sized system; however, our theorem gives bounds that are independent of system size.

2. Hamiltonians Describing TQO

2.1. Notation. To simplify notation we shall restrict ourselves to the spatial dimension D = 2. A generalization to an arbitrary D is straightforward. Let $\Lambda = \mathbb{Z}_L \times \mathbb{Z}_L$ be a two-dimensional square lattice of linear size L with periodic boundary conditions. We assume that every site $u \in \Lambda$ is occupied by a finite-dimensional quantum particle (qudit) such that the Hilbert space describing Λ is a tensor product

$$\mathcal{H} = \bigotimes_{u \in \Lambda} \mathcal{H}_u. \qquad (1)$$

None of the bounds in this paper will depend on the dimensions of the Hilbert spaces $\mathcal{H}_u$ so long as they are finite. Let S(r) be a set of all square blocks $A \subseteq \Lambda$ of size r × r, where r is a positive integer. Note that S(r) contains $L^2$ translations of some elementary square of size r × r for all r < L, $S(L) = \Lambda$, and $S(r) = \emptyset$ for r > L. We shall assume that the unperturbed Hamiltonian $H_0$ involves only 2 × 2 interactions (otherwise consider a coarse-grained lattice, see below):

$$H_0 = \sum_{A \in S(2)} Q_A. \qquad (2)$$


Here $Q_A$ is an interaction that has support on a square A. We also assume that the interactions $Q_A$ are pairwise commuting projectors,

$$Q_A^2 = Q_A, \qquad Q_A Q_B = Q_B Q_A \quad \text{for all } A, B \in S(2). \qquad (3)$$

2.2. Examples. Well known examples of Hamiltonians composed of commuting projectors are Levin-Wen models [18] and quantum double models [2]. We give one example, the toric code [2], in slightly greater detail. In keeping with the notation in this paper, we place the degrees of freedom on the sites, rather than on the bonds as originally defined. Thus, each site has a two-dimensional Hilbert space, corresponding to a spin-1/2 degree of freedom. The projectors $Q_A$ take one of two forms. Imagine coloring the squares of the lattice in a checkerboard fashion, either red or black. If square A is red, then $Q_A = (1 - \prod_{i \in A} \sigma_i^z)/2$, where the product ranges over the four sites i in the square, and $\sigma_i^z$ is a Pauli spin operator with eigenvalues ±1 acting on spin i. Note that this projector vanishes on any $\sigma^z$-basis state with an even number of spins up in the square. If square A is black, then $Q_A = (1 - \prod_{i \in A} \sigma_i^x)/2$. One may verify that the different projectors commute, even though we use $\sigma^z$ operators for one type of projector and $\sigma^x$ operators for the other. On a torus, this Hamiltonian has four exactly degenerate zero energy ground states, and it obeys the conditions TQO-1,2 for topological order that we define later. A typical perturbation of interest would be to add a term $J \sum_i \sigma_i^z$. One can see that if J is very large, then the system has a unique ground state, with this state converging to a spin polarized state as J → ∞. However, our result shows that for sufficiently small J, the system still has four approximately degenerate low energy states, with a gap to the rest of the spectrum. The Levin-Wen models [18] are more complicated, since the projectors in the corresponding Hamiltonian act on 12 particles. However, since the support of any projector has a constant size, one can always tile the lattice by squares of size O(1) such that the support of any projector is covered by some 2 × 2 cluster of such squares. We can now regard each square as a ‘super-site’ of a coarse-grained lattice.
The Hilbert space associated with every super-site is the tensor product of O(1) single-particle Hilbert spaces. The Hamiltonian is a sum of terms which are commuting projectors acting only on 2 × 2 squares on the coarse-grained lattice. However, there may be more than one projector acting on a given square; this does not affect our proof.
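The commutation and degeneracy claims made for the toric code in Sect. 2.2 can be checked on a small torus. The following is only an illustrative sketch, not part of the original argument: it assumes a 4 × 4 periodic lattice, uses the standard fact that a product of $\sigma^z$ on a set A commutes with a product of $\sigma^x$ on a set B exactly when |A ∩ B| is even, and counts ground states as 2 to the power (number of sites minus number of independent constraints). The names `square_sites` and `gf2_rank` are hypothetical helpers introduced here.

```python
import numpy as np

L = 4  # linear size of the periodic lattice (assumed even so the checkerboard closes)


def square_sites(x, y):
    """Indices of the four sites of the 2x2 square with corner (x, y), periodic in both directions."""
    return {((x + dx) % L) * L + (y + dy) % L for dx in (0, 1) for dy in (0, 1)}


corners = [(x, y) for x in range(L) for y in range(L)]
red = [square_sites(x, y) for x, y in corners if (x + y) % 2 == 0]    # sigma^z-type squares
black = [square_sites(x, y) for x, y in corners if (x + y) % 2 == 1]  # sigma^x-type squares

# A Z-string on A and an X-string on B commute iff they overlap on an even number of sites.
all_commute = all(len(a & b) % 2 == 0 for a in red for b in black)


def gf2_rank(rows):
    """Rank over GF(2) of the incidence matrix of a list of site sets (Gaussian elimination)."""
    m = np.zeros((len(rows), L * L), dtype=np.uint8)
    for i, sites in enumerate(rows):
        m[i, list(sites)] = 1
    rank = 0
    for col in range(m.shape[1]):
        if rank == m.shape[0]:
            break
        pivots = np.nonzero(m[rank:, col])[0]
        if pivots.size == 0:
            continue
        p = rank + int(pivots[0])
        m[[rank, p]] = m[[p, rank]]          # move the pivot row into place
        for i in range(m.shape[0]):
            if i != rank and m[i, col]:
                m[i] ^= m[rank]              # clear the column in all other rows
        rank += 1
    return rank


n_independent = gf2_rank(red) + gf2_rank(black)
degeneracy = 2 ** (L * L - n_independent)
print(all_commute, degeneracy)
```

The printed degeneracy can be compared with the four exactly degenerate ground states stated in the text. Working with site sets and binary ranks avoids building $2^{16}$-dimensional matrices, at the price of relying on the stabilizer counting assumed above.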

2.3. Topological quantum order. We assume that $H_0$ has zero ground state energy, i.e., the ground states of $H_0$ are zero-eigenvectors of every projector $Q_A$. Let P and Q be the projectors onto the ground subspace and the excited subspace of $H_0$, that is,

$$P = \prod_{A \in S(2)} (I - Q_A) \quad \text{and} \quad Q = I - P. \qquad (4)$$

For any square $B \in S(r)$, $r \ge 2$, define local versions of P and Q as

$$P_B = \prod_{\substack{A \in S(2) \\ A \subseteq B}} (I - Q_A) \quad \text{and} \quad Q_B = I - P_B. \qquad (5)$$

Note that $P_B$ and $Q_B$ have support on B.


We make the following definition, which can be thought of as describing a “ball” of distance l around a given square A: Definition 1. For any l ∈ Z+ , we use bl (A) to denote the square containing square A as well as all first, second,…,l th neighbors of square A. We shall need two extra properties of H0 related to TQO defined in [3]. We shall assume that there exists an integer L ∗ ≥ L a for some constant a > 0 such that one has the following properties: 1. TQO-1: Let A ∈ S(r ) be any square of size r ≤ L ∗ . Let O A be any operator acting on A. Then P OA P = cP for some complex number c. 2. TQO-2: Let A ∈ S(r ) be any square of size r ≤ L ∗ and let B = b1 (A). Let O A be any operator acting on A such that O A P = 0. Then O A PB = 0. The property TQO-1 is often taken as a definition of TQO, see Ref. [1,2,19,20]. One can also show that L ∗ ≥ d/4, where d is the distance of P considered as a codespace of a quantum error correcting code, i.e., the smallest integer m such that erasure of any subset of m particles can be corrected, see Ref. [21] for details. Physically, one can motivate these two definitions of TQO as follows. TQO-1 always holds when H0 has a unique ground state. If H0 has degenerate ground states, then TQO1 implies that local operators (operator supported on sets with diameter less than L ∗ ) are equal to the identity operator when projected into the ground state subspace. Colloquially, one cannot locally distinguish between different ground states. As an example of a system which does not obey TQO-1, consider a system of spin-1/2 degrees of freedom on a lattice. Let all the projectors be diagonal in the S z basis, and let them penalize any state in which the spins in the given square are not either all pointing up or all pointing down. Then, this model, a ferromagnetic Ising model, has two degenerate ground states, one in which all spins are up and one in which all spins are down. 
However, this model does not have TQO-1 since the operator $S^z_u$ on any site u is not proportional to the identity when projected into the ground state subspace. This model is also not stable against weak perturbations: adding a perturbation $\sum_u b S^z_u$, corresponding to a magnetic field in the z direction, will close the gap once the field strength b is of order $1/|\Lambda|$. Thus, TQO-1 is a necessary condition for our result (see, however, the discussion of symmetry in the last section). TQO-2 is a property that expresses a consistency of local and global ground states. Consider the previously mentioned ferromagnetic Ising model and add an additional term to the Hamiltonian acting on some given site v which penalizes a state in which v is not pointing up along the z direction (one can construct projectors $Q_A$ which enforce the appropriate penalty). Now, $H_0$ has a unique ground state so TQO-1 is automatically satisfied. However, the ground state is still not stable against a weak magnetic field of strength of order $1/|\Lambda|$. In this case, the Hamiltonian $H_0$ does not obey TQO-2. To see this, let $O_A$ project onto the state on which a spin in square A points down. Then, $O_A P = 0$, but if square B does not contain the site v then $O_A P_B$ is not equal to zero. We will need later a corollary of TQO-1 and TQO-2.

Corollary 1. Let $O_A$ be any operator supported on a square A. Let $C = b_2(A)$ and suppose that C has size bounded by $L^*$. Then

$$P_C O_A P_C = c P_C, \qquad (6)$$

for some constant c. Thus, all states ψ with $P_C \psi = \psi$ have the same reduced density matrix on A, and in particular the reduced density matrix of ψ on square A is equal to the reduced density matrix of any ground state on A. Finally, we have the useful equality:

$$\|O_A P\| = \|O_A P_C\|. \qquad (7)$$

Proof. Let $B = b_1(A)$. We claim that $P_B O_A P_B$ commutes with the Hamiltonian. To show this, we show that $[Q_{C'}, P_B O_A P_B] = 0$ for any 2 × 2 square $C'$. Consider first the case that $C'$ and A are disjoint. Then $Q_{C'}$ commutes with $O_A$, and $Q_{C'}$ commutes with $P_B$ since all terms in $H_0$ commute, so $Q_{C'}$ commutes with $P_B O_A P_B$. Now consider the case that $C'$ and A are not disjoint. In this case, $C'$ is a subset of B. Therefore $P_B Q_{C'} = Q_{C'} P_B = 0$, so $[Q_{C'}, P_B O_A P_B] = 0$. Since $P_B O_A P_B$ commutes with $H_0$, it commutes with P. So $P (P_B O_A P_B) P = (P_B O_A P_B) P$. So by condition TQO-1, $(P_B O_A P_B) P = c P$ for some c. Define $O'_A = O_A - c$. Then $P_B O'_A P_B P = 0$. So, by condition TQO-2, $P_B O'_A P_B P_C = 0$. Note that $P_B O'_A P_B$ commutes with $P_C$. So $P_C P_B O'_A P_B P_C = 0$. Hence $P_C O'_A P_C = 0$, that is, $P_C O_A P_C = c P_C$.

Now consider the second claim. Let ψ be any state such that $P_C \psi = \psi$. Let $\rho_A(\psi)$ be the reduced density matrix of ψ on A. Then $\mathrm{tr}(\rho_A O_A) = \mathrm{tr}(P_C |\psi\rangle\langle\psi| P_C O_A) = c(O_A)\, \mathrm{tr}(|\psi\rangle\langle\psi|) = c(O_A)$, where the constant $c(O_A)$ depends only on $O_A$ and not on ψ. Thus, the reduced density matrix is the same for all vectors such that $P_C \psi = \psi$.

Finally, let $\psi, \psi_0$ be normalized states satisfying $\|O_A P_C\| = \|O_A \psi\|$ and $\|O_A P\| = \|O_A \psi_0\|$. Then we have that $P_C \psi = \psi$ and $P \psi_0 = \psi_0$, which implies $\mathrm{tr}(|O_A|^2 |\psi\rangle\langle\psi|) = \mathrm{tr}(|O_A|^2 |\psi_0\rangle\langle\psi_0|)$, since their reduced density matrices agree on A. Hence $\|O_A P\| = \|O_A P_C\|$, as claimed.

3. Local Hamiltonians and the Lieb-Robinson Bounds

Throughout this paper we restrict ourselves to Hamiltonians with a sufficiently fast spatial decay of interactions. Any such Hamiltonian V will be specified using a decomposition into local interactions,

$$V = \sum_{r \ge 1} \sum_{A \in S(r)} V_{r,A}, \qquad (8)$$

where $V_{r,A}^\dagger = V_{r,A}$ is an operator acting non-trivially only on a square A. Similarly, given a time-dependent Hamiltonian V(t), we define a decomposition $V(t) = \sum_{r \ge 1} \sum_{A \in S(r)} V_{r,A}(t)$. We will make frequent use of O(. . .) notation to describe bounds on quantities, which we now briefly explain.
Stating that a quantity is O(J ), for example, implies that it is bounded by a constant times J for all J (note that we assume that the bound holds for all J , not just for J small or for J large), while stating that a quantity is O(1) implies that it is bounded by some given constant. If O(. . .) notation appears in both the conditions and conclusions of some claim, then the constants in the conclusion may depend upon the constants in the conditions. For example, in Lemma 1 below, we assume that a given quantity is O(1), implying that it is bounded by some constant c1 , and then claim


that this implies that some other quantity is O(J), implying that it is bounded by some constant $c_2 J$ for all J; in this case, the constant $c_2$ may depend upon $c_1$. Let us define several important classes of Hamiltonians and their associated decompositions. Note that we always consider a Hamiltonian V as specified by a given decomposition; thus, the definitions below are properties of both the operator V and the given decomposition. It is possible that, for example, an operator V has one decomposition which is not locally block-diagonal, but can be re-written in a different decomposition which is locally block-diagonal, and in fact part of the later proof consists of constructing such re-writings.

Definition 1. A Hamiltonian V has support near a site $u \in \Lambda$ if all squares in the decomposition Eq. (8) contain u.

Definition 2. A Hamiltonian V is globally block-diagonal iff it preserves the ground subspace P, that is, [V, P] = 0. A Hamiltonian V is locally block-diagonal iff all terms $V_{r,A}$ in the decomposition Eq. (8) preserve the ground subspace, that is, $[V_{r,A}, P] = 0$ for all r, A.

Definition 3. A Hamiltonian V has strength J if there exists a function $f : \mathbb{Z}_+ \to [0, 1]$ decaying faster than any power such that $\|V_{r,A}\| \le J f(r)$ for all $r \ge 1$ and for all $A \in S(r)$.

We define analogous definitions for time-dependent Hamiltonians. A time-dependent Hamiltonian has support near site u if V(t) has support near u for any t. A time-dependent Hamiltonian is globally block diagonal if V(t) is globally block diagonal for all t. Similarly, a time-dependent Hamiltonian has strength J if there exists a function $f : \mathbb{Z}_+ \to [0, 1]$ decaying faster than any power such that $\|V_{r,A}(t)\| \le J f(r)$ for all $r \ge 1$, for all $A \in S(r)$ and for all t. Note that the bound here is uniform: the same function f(r) appears in the bound for all t. Our main technical tool will be the following corollary of the Lieb-Robinson bound for systems with interactions decaying slower than exponential [20,24].

Lemma 1. Let H(t) be a (time-dependent) Hamiltonian with strength O(1). Let V be a Hamiltonian with strength J. Let U(t) be the unitary evolution for time t, |t| ≤ 1, generated by H (given a time-dependent H(t), define U(t) by $\partial_t U(t) = -i H(t) U(t)$ and U(0) = I). Define $\tilde V = U(t) V U(t)^\dagger$. Then $\tilde V$ has strength O(J). If V has support near a site u then $\tilde V$ also has support near u.

We will also need an analogue of Lemma 1 for an infinite evolution time. It will be applicable only to evolution under Hamiltonians H having a finite Lieb-Robinson velocity [4–6,24], for example, Hamiltonians with exponentially decaying interactions.

Definition 4. A Hamiltonian V has strength J and decay rate μ if $\|V_{r,A}\| \le J \exp(-\mu r)$ for all $r \ge 1$ and for all $A \in S(r)$.


Lemma 2. Let H (t) be a (time-dependent) Hamiltonian with strength O(1) and decay rate μ > 0. Let V be a Hamiltonian with strength J . Let U (t) be the unitary evolution for time t generated by H . Finally, let g(t) be any function decaying faster than any power for large |t|. Define

$$\tilde V = \int_{-\infty}^{\infty} dt\, g(t)\, U(t) V U(t)^\dagger.$$

Then V˜ has strength O(J ). If V has support near a site u then V˜ also has support near u. The superpolynomially decaying function of r bounding the norm of V˜r,A depends on the superpolynomially decaying function of r bounding the norm of Vr,A as well as on μ and the function g(t). (Remark. In fact, one can show that there exist g(t) that have the properties needed later and that decay as an exponential of a power of t, making it possible also to consider Hamiltonians H that do not have a decay constant μ > 0 but that instead have sufficiently fast stretched exponential decay. We omit this for simplicity.) 4. Reduction from Global to Local Block-Diagonality In this section we prove that a certain class of globally block-diagonal perturbations can be reduced to locally block-diagonal perturbations with a small error by simply rewriting the Hamiltonian in a different form. Lemma 3. Consider a Hamiltonian H = H0 +

$\sum_{u \in \Lambda} X_u$, (9)

where $H_0$ obeys TQO-1,2 and $X_u$ is a perturbation with strength J that has support near u. Suppose that $[X_u, P] = 0$ for all u. Then we can rewrite

$$H = H_0 + V + \Delta, \qquad (10)$$

where V is a locally block-diagonal perturbation with strength O(J), and $\|\Delta\|$ decays faster than any power of $L^*$.

Proof. Consider some fixed site u and let $X \equiv X_u$. By assumption, X has a decomposition $X = \sum_{r \ge 1} X(r)$, where $X(r) = \sum_{A \in S(r),\, A \ni u} X_{r,A}$ and the norm $\|X(r)\|$ decays faster than any power of r. By adding constants¹ to the different terms $X_{r,A}$ and using TQO-1 we can achieve $P X_{r,A} P = 0$ for all $r \le L^*$. Define $\Delta = P X = P X P = \sum_{r > L^*} P X(r) P$. Note that $\|\Delta\|$ decays faster than any power of $L^*$. Define also $X' = X - \Delta$ such that $X' P = 0$. We can assume that $X'$ has strength O(J) and its support is near u if we treat Δ as a single interaction on a square of size L. To simplify notation we set $X = X'$ in the rest of the proof. Choosing any $l \le L^*$ and applying Eq. (7) to $O = \sum_{r=1}^{l} X(r)$ we arrive at

$$\Big\| \sum_{r=1}^{l} X(r)\, P_{b_{l+2}(u)} \Big\| = \Big\| \sum_{r=1}^{l} X(r)\, P \Big\| \le \|X P\| + \Big\| \sum_{r > l} X(r)\, P \Big\| \le J f(l) \qquad (11)$$

for some function f decaying faster than any power of l. Consider a given site u. Let $B_r$ be the union of all squares of size r that contain u, such that X(r) has support on $B_r$. We have $B_1 \subset B_2 \subset \dots \subset B_M = \Lambda$ for some integer M. Define an orthogonal unity decomposition $I = \sum_{m=1}^{M+1} E_m$ by

$$E_1 = Q_{B_1}, \qquad E_m = Q_{B_m} P_{B_{m-1}} \ \text{ for } 2 \le m \le M, \qquad E_{M+1} = P_{B_M} = P. \qquad (12)$$

Taking into account that $E_{M+1} X = X E_{M+1} = 0$ we arrive at

$$X = \sum_{1 \le p,q,r \le M} E_p X(q) E_r = \sum_{j \ge 1} Y(j) + \sum_{q \ge 1} Z(q), \qquad (13)$$

where

$$Y(j) = \sum_{\substack{1 \le p,r \le M \\ p + r = j}} E_p \Big( \sum_{q=1}^{\max(p,r)-2} X(q) \Big) E_r \qquad (14)$$

and

$$Z(q) = \sum_{1 \le p,r \le q+1} E_p X(q) E_r. \qquad (15)$$

All operators Y(j) and Z(j) are hermitian, annihilate P, and act only on sites in the square $B_{j+1}$. The norm of Z(q) decays faster than any power of q because of the decay of the norm of X(q). We claim that the norm of Y(j) decays faster than any power of j. Indeed, Y(j) is a sum over j − 1 different terms corresponding to different choices of p, r. We show that the norm of each such term p, r decays fast enough. Assume without loss of generality that p ≥ r. Then

$$\Big\| E_p \Big( \sum_{q=1}^{\max(p,r)-2} X(q) \Big) E_r \Big\| \le \Big\| E_p \sum_{q=1}^{\max(p,r)-2} X(q) \Big\|,$$

which decays faster than any power of p by Eq. (11). Since p ≥ j/2, it decays faster than any power of j. Thus, we have decomposed X as a sum of terms which annihilate P, supported on squares of increasing radius about site u, with norm decreasing faster than any power of the size of the square. This completes the proof.

¹ Since we shift each $X_{r,A}$ by a constant, in fact in what follows we will be rewriting not H but H plus a constant; however, since a constant is by definition locally block diagonal and is also strength O(J) we may ignore this constant term in what follows.


5. Stability under Locally Block-Diagonal Perturbation

In this section, we define the concept of relatively bounded perturbations (see Chap. IV of [22]) and show that the spectral gaps separating low-lying eigenvalues of $H_0$ are stable against such perturbations. We then demonstrate that locally block-diagonal perturbations satisfy the relative boundedness condition, which proves the gap stability for locally block-diagonal perturbations. Note that our condition of relative boundedness is different from the one used in [10]. In particular, our condition provides an elementary proof of the gap stability without the need to use cluster expansions as was done in [10]. Let $H_0$ and W be any Hamiltonians acting on some Hilbert space $\mathcal{H}$. We shall say that W is relatively bounded by $H_0$ iff there exists $0 \le b < 1$ such that

$$\|W \psi\| \le b\, \|H_0 \psi\| \quad \text{for all } |\psi\rangle \in \mathcal{H}. \qquad (16)$$

The following lemma asserts that a relatively bounded perturbation can change eigenvalues of $H_0$ at most by a factor 1 ± b.

Lemma 4. Suppose W is relatively bounded by $H_0$. Then the spectrum of $H_0 + W$ is contained in the union of intervals $[\lambda_0(1 - b), \lambda_0(1 + b)]$, where $\lambda_0$ runs over the spectrum of $H_0$.

Proof. Indeed, suppose $(H_0 + W)|\psi\rangle = \lambda|\psi\rangle$ for a normalized state $|\psi\rangle$, that is,

$$(H_0 - \lambda I)|\psi\rangle = -W|\psi\rangle. \qquad (17)$$

The relative boundedness then implies $\|(H_0 - \lambda I)\psi\| \le b\|H_0 \psi\|$, that is,

$$\langle\psi|(H_0 - \lambda I)^2|\psi\rangle \le b^2 \langle\psi|H_0^2|\psi\rangle. \qquad (18)$$

Let $H_0 = \sum_{\lambda_0} \lambda_0 P_{\lambda_0}$ be the spectral decomposition of $H_0$. Here the sum runs over the spectrum of $H_0$ and $P_{\lambda_0}$ is a projector onto the eigenspace with an eigenvalue $\lambda_0$. Define a probability distribution $p(\lambda_0) = \langle\psi|P_{\lambda_0}|\psi\rangle$. Substituting it into Eq. (18) one gets

$$\sum_{\lambda_0} (\lambda_0 - \lambda)^2\, p(\lambda_0) \le b^2 \sum_{\lambda_0} \lambda_0^2\, p(\lambda_0). \qquad (19)$$

Therefore there exists at least one eigenvalue $\lambda_0$ such that $(\lambda_0 - \lambda)^2 \le b^2 \lambda_0^2$, that is, $|\lambda_0 - \lambda| \le b|\lambda_0|$. For $\lambda_0 \ge 0$ this is equivalent to

$$\lambda_0(1 - b) \le \lambda \le \lambda_0(1 + b). \qquad (20)$$
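Lemma 4 lends itself to a quick numerical sanity check. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: the toy spectrum `levels`, the value b = 0.3, and the construction $W = (b/\|H_0\|)\, H_0 K H_0$ with a random Hermitian K of unit spectral norm are all choices made here. That construction guarantees $\|W\psi\| \le b\|H_0\psi\|$, because $\|H_0 K H_0 \psi\| \le \|H_0\|\,\|K\|\,\|H_0\psi\|$.

```python
import numpy as np

rng = np.random.default_rng(0)
b = 0.3                                                     # relative bound, 0 <= b < 1 (assumed value)
levels = np.array([0, 0, 1, 1, 1, 2, 2, 3], dtype=float)    # toy spectrum of H0 (assumption)
H0 = np.diag(levels)

# Random real symmetric K, normalized to spectral norm 1.
K = rng.standard_normal((levels.size, levels.size))
K = (K + K.T) / 2
K /= np.linalg.norm(K, 2)

# W is Hermitian and satisfies ||W psi|| <= b ||H0 psi|| for every psi.
W = (b / np.linalg.norm(H0, 2)) * H0 @ K @ H0

eigs = np.linalg.eigvalsh(H0 + W)
tol = 1e-9  # numerical slack for round-off


def in_some_interval(lam):
    """True if lam lies in [l0(1-b), l0(1+b)] for some eigenvalue l0 of H0."""
    return any(l0 * (1 - b) - tol <= lam <= l0 * (1 + b) + tol for l0 in np.unique(levels))


all_inside = all(in_some_interval(lam) for lam in eigs)
print(all_inside)
```

Because W annihilates the kernel of $H_0$ in this construction, the zero eigenvalues stay exactly at zero, which mirrors the role of the ground subspace in the later sections.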

In the rest of this section we prove stability under locally block-diagonal perturbations. Let $H_0$ be a Hamiltonian satisfying TQO-1,2.

Lemma 5. Let V be a locally block-diagonal perturbation with strength J. Then the spectrum of $H_0 + V$ is contained, up to an overall shift by a constant, in the union of intervals $\bigcup_{k \ge 0} I_k$, where k runs over the spectrum of $H_0$ and $I_k = \{\lambda \in \mathbb{R} : k(1 - b) - \delta \le \lambda \le k(1 + b) + \delta\}$ for some b = O(J) and for some δ decaying faster than any power of $L^*$.

Proof. For any integer $r \ge 1$ define

$$W(r) = \sum_{A \in S(r)} V_{r,A}.$$

By assumptions of the lemma we have

$$\|V_{r,A}\| \le J f(r) \qquad (21)$$

for some function f decaying faster than any power and $[V_{r,A}, P] = 0$. Performing an overall energy shift (that is, subtracting a constant from each term $V_{r,A}$ and shifting the spectrum of H by the sum of those constants) and using TQO-1 we can assume that

$$V_{r,A} P = P V_{r,A} = 0 \quad \text{for all } 1 \le r \le L^* \text{ and for all } A \in S(r). \qquad (22)$$

Proposition 1. Suppose $1 \le r \le L^*$. Then W(r) is relatively bounded by $H_0$ with a constant $b(r) = O(J r^2 f(r))$.

This proposition immediately implies that $\sum_{1 \le r \le L^*} W(r)$ is relatively bounded by $H_0$ with a constant $b = \sum_{r=1}^{L^*} b(r) = O(J)$. Lemma 4 then implies that the spectrum of $H_0 + \sum_{1 \le r \le L^*} W(r)$ is contained in the union of intervals [k(1 − b), k(1 + b)]. Treating the residual terms $\sum_{r > L^*} W(r)$ using the standard perturbation theory then leads to the desired result.

Proof of the proposition. Let $\Lambda = B_1 \cup B_2 \cup \dots \cup B_M$ be a partition of the lattice into contiguous squares of size r such that any 2 × 2 square is contained in exactly one square $B_a$. If a 2 × 2 square is on the boundary of two, or more, of the $B_a$, we include it in the bottom left one in the partition. Moreover, if r does not divide L, then squares at the boundary may be truncated to rectangles, to fill in the partition. We refer to $B_a$ as boxes to distinguish them from the squares involved in the decomposition of V. For any binary string $Y \in \{0, 1\}^M$ define a projector

$$R_Y = \prod_{a=1}^{M} \big( Y_a Q_{B_a} + (1 - Y_a) P_{B_a} \big).$$

Clearly the family of projectors $R_Y$ defines an orthogonal decomposition of the Hilbert space, that is, $\sum_Y R_Y = I$. Given a string $Y \in \{0, 1\}^M$, we shall say that a box $B_a$ is occupied iff $Y_a = 1$. We claim that any operator $V_{r,A}$ acting on a square $A \in S(r)$ and satisfying Eq. (22) has only a few off-diagonal blocks with respect to this decomposition. Specifically, TQO-2 implies that

$$R_Y V_{r,A} R_Z \ne 0 \qquad (23)$$

only if A has distance O(1) from some occupied box in Y and A has distance O(1) from some occupied box in Z, and the configurations Y, Z differ only at those boxes that overlap with A. Clearly, for any fixed Y such that Y has k occupied boxes the number of pairs $(A \in S(r), Z)$ that could satisfy Eq. (23) is at most $O(k r^2)$. Denote for simplicity $W \equiv W(r)$, $w \equiv J f(r)$.

For any state $|\psi\rangle$ we get

$$\begin{aligned}
\langle\psi|W^2|\psi\rangle &= \sum_{Y,Z,V \subseteq [M]} \langle\psi|R_Y W R_Z W R_V|\psi\rangle \\
&\le \sum_{Y,Z,V} \|R_Y W R_Z\| \cdot \|R_Z W R_V\| \cdot \|R_Y \psi\| \cdot \|R_V \psi\| \\
&\le \sum_{Y,Z,V} \tfrac{1}{2}\, \|R_Y W R_Z\| \cdot \|R_Z W R_V\| \cdot \big( \langle\psi|R_Y|\psi\rangle + \langle\psi|R_V|\psi\rangle \big) \\
&= \sum_{Y,Z,V} \|R_Y W R_Z\| \cdot \|R_Z W R_V\| \cdot \langle\psi|R_Y|\psi\rangle \\
&\le \sum_{k \ge 0} \sum_{Y :\, |Y| = k} O(k^2 r^4 w^2)\, \langle\psi|R_Y|\psi\rangle = O(w^2 r^4)\, \langle\psi|G|\psi\rangle,
\end{aligned} \qquad (24)$$

where

$$G = \sum_{k \ge 0} \sum_{Y :\, |Y| = k} k^2 R_Y. \qquad (25)$$

The inequality Eq. (24) follows from the fact that Y and Z differ at at most O(1) boxes and an obvious bound $k(k + O(1)) = O(k^2)$. Finally, note that $G \le H_0^2$. To see this, consider a simultaneous eigenbasis of $Q_A$ for all squares A. In this basis, G and $H_0$ are diagonal. Let φ be a vector in this basis. This vector φ is a simultaneous eigenvector of all $Q_A$, and so there is a unique bit string Y such that $R_Y \phi = \phi$; a given bit $Y_a$ is equal to 1 if and only if there is a square A contained in box $B_a$ such that $Q_A \phi = \phi$. If |Y| = k, then there are at least k squares $A \in S(2)$ such that $Q_A \phi = \phi$ since, by construction, each square A is contained in exactly one box. Thus, $\langle\phi|H_0^2|\phi\rangle \ge k^2 = \langle\phi|G|\phi\rangle$. Using this inequality in Eq. (24), we arrive at

$$\langle\psi|W^2|\psi\rangle \le b^2 \langle\psi|H_0^2|\psi\rangle, \qquad b = O(w r^2). \qquad (26)$$

This completes the proof.

6. Exact Quasi-Adiabatic Continuation

We define a continuous family of Hamiltonians,

$$H_s = H_0 + sV, \tag{27}$$

so that as $s$ varies from 0 to 1, $H_s$ continuously interpolates between $H_0$ and the perturbed Hamiltonian. We define a quasi-adiabatic continuation operator $D_s$ by

$$iD_s \equiv \int dt\, F(t) \exp(i H_s t)\, \partial_s H_s \exp(-i H_s t), \tag{28}$$

where the function $F(t)$ is defined to have the following properties. First, the Fourier transform of $F(t)$, which we denote $\tilde F(\omega)$, obeys

$$|\omega| \ge 1/2 \quad \Rightarrow \quad \tilde F(\omega) = -1/\omega. \tag{29}$$

A Short Proof of Stability of Topological Order under Local Perturbations


Second, $\tilde F(\omega)$ is infinitely differentiable and $F(t)$ decays faster than any power of $|t|$ (we describe how to construct such $F(t)$ in the Appendix). Third, $F(t) = -F(-t)$ and $\tilde F(\omega)$ is real. This implies that $F(t)$ is pure imaginary so that $D_s$ is Hermitian. We define a unitary operator $U_s$ by

$$U_s \equiv \mathcal{S}' \exp\left( i \int_0^s ds'\, D_{s'} \right), \tag{30}$$

where the notation $\mathcal{S}'$ denotes that the above Eq. (30) is an $s'$-ordered exponential. The motivation for defining the above unitary operator is contained in the following lemma.

Lemma 6. Let $H_s$ be a differentiable family of Hamiltonians. Let $|\Psi_i(s)\rangle$ denote eigenstates of $H_s$ with energies $E_i$. Let $E_{\min}(s) < E_{\max}(s)$ be continuous functions of $s$. Define a projector $P(s)$ onto an eigenspace of $H_s$ by

$$P(s) = \sum_{i \,:\, E_i \in [E_{\min}(s), E_{\max}(s)]} |\Psi_i(s)\rangle\langle\Psi_i(s)|. \tag{31}$$

Assume that the space that $P(s)$ projects onto is separated from the rest of the spectrum by a gap of at least $1/2$ for all $s$ with $0 \le s \le 1$. That is, all eigenvalues of $H_s$ are either in the interval $[E_{\min}(s), E_{\max}(s)]$, or are separated by at least $1/2$ from this interval. Then, for all $s$ with $0 \le s \le 1$, we have

$$P(s) = U_s P(0) U_s^\dagger. \tag{32}$$
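The identity (32) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the three-level matrices `H0` and `V`, the particular bump function `h_tilde`, and the midpoint discretisation of the ordered exponential are all hypothetical choices, not taken from the paper. The filter uses the closed form $\tilde F(\omega) = -(1 - \tilde h(\omega))/\omega$ from the Appendix, and $D_s$ is built from its matrix elements $\langle \Psi_j | iD_s | \Psi_i \rangle = \tilde F(E_j - E_i) \langle \Psi_j | \partial_s H_s | \Psi_i \rangle$.

```python
import numpy as np

def h_tilde(w):
    """Smooth even bump with h_tilde(0) = 1 and h_tilde(w) = 0 for |w| >= 1/2.
    This specific bump is an assumption; any function with these properties works."""
    w = np.asarray(w, dtype=float)
    out = np.zeros_like(w)
    inside = np.abs(w) < 0.5
    out[inside] = np.exp(1.0 - 1.0 / (1.0 - 4.0 * w[inside] ** 2))
    return out

def F_tilde(w):
    """Filter -(1 - h_tilde(w)) / w, continued by its limit 0 at w = 0."""
    w = np.asarray(w, dtype=float)
    safe = np.where(w == 0.0, 1.0, w)
    return np.where(w == 0.0, 0.0, -(1.0 - h_tilde(w)) / safe)

def generator(H, dH):
    """Hermitian D with <j| iD |i> = F_tilde(E_j - E_i) <j| dH |i> in the eigenbasis of H."""
    E, Psi = np.linalg.eigh(H)
    dH_eig = Psi.conj().T @ dH @ Psi
    iD_eig = F_tilde(E[:, None] - E[None, :]) * dH_eig
    return Psi @ (-1j * iD_eig) @ Psi.conj().T

def ground_projector(H, g=1):
    """Projector onto the g lowest eigenvectors of H."""
    _, Psi = np.linalg.eigh(H)
    low = Psi[:, :g]
    return low @ low.conj().T

def continuation_unitary(H0, V, steps=2000):
    """Approximate the s-ordered exponential on [0, 1] by a product of midpoint steps."""
    U = np.eye(H0.shape[0], dtype=complex)
    ds = 1.0 / steps
    for n in range(steps):
        s_mid = (n + 0.5) * ds
        D = generator(H0 + s_mid * V, V)      # d/ds (H0 + s V) = V
        D = 0.5 * (D + D.conj().T)            # remove rounding-level anti-Hermitian part
        lam, Q = np.linalg.eigh(D)
        U = (Q * np.exp(1j * lam * ds)) @ Q.conj().T @ U  # later s acts on the left
    return U

# Hypothetical toy model: the ground state stays separated by a gap > 1/2,
# while the two excited levels are closer than 1/2 to each other.
H0 = np.diag([0.0, 1.0, 1.3]).astype(complex)
V = 0.1 * np.array([[0.5, 1.0, 0.3j],
                    [1.0, -0.2, 0.7],
                    [-0.3j, 0.7, 0.1]])
U1 = continuation_unitary(H0, V)
error = np.linalg.norm(ground_projector(H0 + V) - U1 @ ground_projector(H0) @ U1.conj().T)
print(f"||P(1) - U_1 P(0) U_1^dagger|| = {error:.2e}")
```

The two excited levels are deliberately placed closer than $1/2$ to each other: the lemma only needs the gap between the projected eigenspace and the rest of the spectrum, and the smooth part of $\tilde F$ handles the remaining matrix elements.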

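Proof. This is Lemma 7.1 in [3] and the basic idea goes back to [5]. We repeat the proof. By linear perturbation theory,

$$
\begin{aligned}
\partial_s P(s)
&= \sum_{i \in I(s)} \sum_{j \notin I(s)} \frac{1}{E_i - E_j}\, |\Psi_j(s)\rangle\langle\Psi_j(s)|\, \partial_s H(s)\, |\Psi_i(s)\rangle\langle\Psi_i(s)| + \mathrm{h.c.} \\
&= \sum_{i \in I(s)} \sum_{j \notin I(s)} |\Psi_j(s)\rangle\langle\Psi_j(s)| \int dt\, F(t) \exp(i H_s t)\, \partial_s H(s) \exp(-i H_s t)\, |\Psi_i(s)\rangle\langle\Psi_i(s)| + \mathrm{h.c.} \\
&= i[D_s, P(s)].
\end{aligned}
\tag{33}
$$

The second equality above holds because $\langle\Psi_j(s)| \int dt\, F(t) \exp(i H_s t)\, \partial_s H(s) \exp(-i H_s t) |\Psi_i(s)\rangle = \tilde F(E_j - E_i) \langle\Psi_j(s)|\partial_s H(s)|\Psi_i(s)\rangle$, and Eq. (29) implies $\tilde F(E_j - E_i) = -1/(E_j - E_i) = 1/(E_i - E_j)$ given the assumption that $|E_j - E_i| \ge 1/2$. Since $\partial_s (U_s P(0) U_s^\dagger) = i[D_s, U_s P(0) U_s^\dagger]$ and $U_0 = I$, Eq. (32) follows from Eq. (33).

7. Stability Proof

In this section we prove Theorem 1. For any $s \in [0,1]$ define $H_s = H_0 + sV$.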


Let $g$ be the ground state degeneracy of $H_0$. Choose $E_{\min}(s)$ as the smallest eigenvalue of $H_s$ and let $E_{\max}(s)$ be the $g$-th smallest eigenvalue of $H_s$ (taking into account multiplicities). Finally, let $\Delta(s)$ be the spectral gap separating eigenvalues of $H_s$ in the interval $[E_{\min}(s), E_{\max}(s)]$ from the rest of the spectrum. We shall choose the constants $J_0$, $c_1$ sufficiently small such that the interval $I_0$ is separated from $I_k$, $k > 0$, by a gap of at least $3/4$. Then the theorem implies that the interval $[E_{\min}(s), E_{\max}(s)]$ is contained in $I_0$ for all $s \in [0,1]$, and hence

$$\Delta(s) \ge \frac{3}{4} \quad \text{for all } 0 \le s \le 1. \tag{34}$$

Suppose we have already proved the theorem for the special case when

$$\Delta(s) \ge \frac{1}{2} \quad \text{for all } s \in [0,1]. \tag{35}$$

In the remaining case there must exist $s^* \in [0,1)$ such that $\Delta(s) \ge 1/2$ for $s \in [0, s^*]$ and $\Delta(s^*) = 1/2$ (use the fact that $\Delta(0) \ge 1$ and continuity of $\Delta(s)$). Applying the theorem to the perturbation $s^* V$, which satisfies Eq. (35), we conclude that $\Delta(s^*) \ge 3/4$, obtaining a contradiction. Thus it suffices to prove the theorem for the case Eq. (35). Define

$$H'_s = U_s^\dagger (H_0 + sV) U_s = U_s^\dagger H_s U_s, \tag{36}$$

where

$$U_s \equiv \mathcal{S}' \exp\left( i \int_0^s ds'\, D_{s'} \right) \tag{37}$$

is the exact adiabatic continuation operator constructed in Sect. 6. Since we assumed $\Delta(s) \ge 1/2$ for all $s$, Lemma 6 implies that

$$[H'_s, P] = 0. \tag{38}$$

We can represent $H'_s$ as

$$H'_s = H_0 + V', \tag{39}$$

where

$$V' = U_s^\dagger H_0 U_s - H_0 + s\, U_s^\dagger V U_s. \tag{40}$$

Lemma 2 implies that $D_s$ has strength $O(J) = O(1)$. Applying Lemma 1 we conclude that $s\, U_s^\dagger V U_s$ has strength $O(J)$. Let us now focus on the term $U_s^\dagger H_0 U_s - H_0$. We use

$$U_s^\dagger H_0 U_s - H_0 = -i \int_0^s ds'\, U_{s'}^\dagger [D_{s'}, H_0] U_{s'}. \tag{41}$$

Since $D_s$ has strength $O(J)$, the commutator $[D_s, H_0]$ also has strength $O(J)$. Applying Lemma 1 to the unitary evolution $U_s^\dagger$, we infer that $U_s^\dagger H_0 U_s - H_0$ has strength $O(J)$. To conclude, we have shown that $V'$ has strength $O(J)$, and Eq. (38) implies

$$[V', P] = 0, \tag{42}$$


that is, $V'$ is a globally block-diagonal perturbation with strength $O(J)$. In Lemma 7 below, we will show that we can rewrite

$$H'_s = H_0 + V' = H_0 + \sum_{u \in \Lambda} X_u, \tag{43}$$

where $X_u$ obeys $[X_u, P] = 0$, has strength $O(J)$, and has support near $u$. Applying Lemma 3 to Eq. (43) implies that $H'_s$ can be written in the form

$$H'_s = H_0 + V' = H_0 + V'' + \Delta, \tag{44}$$

where $V''$ is locally block-diagonal with strength $O(J)$ and $\|\Delta\|$ decays faster than any power of $L^*$. The statement of the theorem then follows straightforwardly from Lemma 5, which asserts that $V''$ is relatively bounded by $H_0$ with a small error decaying faster than any power of $L^*$.

Lemma 7, which we need, follows the idea in [17] to write a Hamiltonian of a gapped system as a sum of terms such that the ground states are eigenvectors of each term separately (in [16] a related idea of writing it so that the ground state was an approximate eigenvector of each term separately was considered). The properties of $H'_s$ that we use are that it is globally block diagonal, it has a spectral gap $\ge 1/2$, the perturbation $V'$ has strength $J$, and that it is unitarily related by $U_s$ to a Hamiltonian with a decay rate $\mu > 0$.

Lemma 7. Let $H'_s$ be defined as above. Then, we can rewrite

$$H'_s = H_0 + \sum_{u \in \Lambda} X_u, \tag{45}$$

where $X_u$ obeys $[X_u, P] = 0$, has strength $O(J)$, and has support near $u$.

Proof. We start from representing $V'$ as $V' = \sum_{u \in \Lambda} V'_u$, where $V'_u$ includes only interactions affecting a site $u$. Then $V'_u$ has strength $J$ and its support is near $u$. We set

$$\tilde V_u = \int_{-\infty}^{\infty} dt\, g(t) \exp(i H'_s t)\, V'_u \exp(-i H'_s t), \tag{46}$$

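where $g(t)$ is a function satisfying $g(-t) = g(t)^*$ such that its Fourier transform $\tilde g(\omega)$ is infinitely differentiable, has $\tilde g(0) = 1$, and $\tilde g(\omega) = 0$ for $|\omega| \ge 1/2$. Define

$$\tilde Q_A = \int_{-\infty}^{\infty} dt\, g(t) \exp(i H'_s t)\, Q_A \exp(-i H'_s t). \tag{47}$$

Then,

$$H'_s = \int_{-\infty}^{\infty} dt\, g(t) \exp(i H'_s t)\, H'_s \exp(-i H'_s t) = \sum_{A \in S(2)} \tilde Q_A + \sum_{u \in \Lambda} \tilde V_u. \tag{48}$$

By construction of $\tilde g(\omega)$ we have $(1 - P) \tilde Q_A P = (1 - P) \tilde V_u P = 0$. Hence both $\tilde Q_A$ and $\tilde V_u$ preserve $P$. Using the definition of $\tilde V_u$ we get

$$U_s \tilde V_u U_s^\dagger = \int_{-\infty}^{\infty} dt\, g(t) \exp(i H_s t)\, U_s V'_u U_s^\dagger \exp(-i H_s t). \tag{49}$$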


Recall that $D_s$ has strength $O(1)$. Thus we can apply Lemma 1 to the unitary evolution $U_s$ to infer that $U_s V'_u U_s^\dagger$ has strength $O(J)$. Because $\tilde g(\omega)$ is infinitely differentiable, $g(t)$ decays faster than any power. Also, by assumptions of the theorem, $H_s$ has strength $O(1)$ and decay rate $\mu > 0$. Hence we can apply Lemma 2 to the unitary evolution $\exp(i H_s t)$ to infer that $U_s \tilde V_u U_s^\dagger$ has strength $O(J)$. Finally, applying Lemma 1 to the unitary evolution $U_s^\dagger$ we infer that $\tilde V_u$ has strength $O(J)$. In addition, $\tilde V_u$ has support near $u$ since all Hamiltonians obtained at the intermediate steps have support near $u$, see Lemmas 1, 2. We now consider the terms $\tilde Q_A$. We have

$$
\begin{aligned}
\tilde Q_A
&= \int_{-\infty}^{\infty} dt\, g(t) \exp(i H'_s t)\, Q_A \exp(-i H'_s t) \\
&= Q_A + i \int_{-\infty}^{\infty} dt\, g(t) \int_0^t dt_1 \exp(i H'_s t_1)\, [V', Q_A] \exp(-i H'_s t_1) \\
&= Q_A + \int_{-\infty}^{\infty} dt_1\, f(t_1) \exp(i H'_s t_1)\, [V', Q_A] \exp(-i H'_s t_1)
\end{aligned}
\tag{50}
$$

for some function $f(t_1)$ decaying faster than any power. It follows that

$$U_s (\tilde Q_A - Q_A) U_s^\dagger = \int_{-\infty}^{\infty} dt_1\, f(t_1) \exp(i H_s t_1)\, U_s [V', Q_A] U_s^\dagger \exp(-i H_s t_1). \tag{51}$$

Applying the same arguments as in the analysis of Eq. (49) we conclude that $\tilde Q_A - Q_A$ has strength $O(J)$ and has support near $u(A)$, the center of the square $A$. In addition, $\tilde Q_A - Q_A$ commutes with $P$ since both $\tilde Q_A$ and $Q_A$ do. Let us define $X_u = \tilde V_u + (\tilde Q_A - Q_A)$, where $A$ is the square centered at $u$.

8. Discussion

We have proven lower bounds on the stability radius of models whose Hamiltonian is a sum of commuting projectors obeying conditions TQO-1,2. One can extend the result to also consider the case of systems with symmetries. Suppose there is some symmetry obeyed by both $H_0$ and $V$. That is, suppose that there is an operator $Q$ which commutes with every term in the decomposition of $H_0$, $V$. In this case, we can prove stability under a weaker set of assumptions, in that definitions TQO-1,2 are required only to hold for operators $O_A$ which also obey the symmetry.

As an example of such a system, consider the ferromagnetic Ising model mentioned previously. The Hamiltonian does not have property TQO-1 and is not stable against a weak magnetic field in the $z$-direction. However, if we restrict to perturbations which respect the Ising symmetry (so that $V$ commutes with the operator $\prod_i S_i^x$ which flips all the spins) then we can prove stability. The proof of stability in this symmetric case is completely analogous to our proof above: all manipulations with the Hamiltonians that we use involve multiplication on the left and on the right by the projectors $P_M$, $Q_M$, quasi-adiabatic continuation, and evolving local terms in time under the full Hamiltonian with various filter functions. All these manipulations preserve the subset of Hamiltonians invariant under the symmetry group, so we never use TQO-1,2 for operators $O_A$ violating the symmetry.

Another example would be to consider a global $SU(2)$ symmetry. Suppose we have a chain of $N$ particles with spin $s$. The Hamiltonian $H_0$ is a sum of Heisenberg antiferromagnetic interactions on every second pair (so that different terms in $H_0$ act on disjoint pairs of spins, forcing each pair into a singlet state in the ground state). Suppose also that we have an unpaired spin on each boundary of the chain, so that the ground state is $(2s+1)^2$-fold degenerate. This Hamiltonian obeys TQO-1,2 for $SU(2)$-invariant local perturbations (though not for arbitrary perturbations) which implies stability under such perturbations.

Similarly, we can consider superselection rules instead of symmetries. For example, consider a fermionic quantum wire with unpaired Majorana modes on the boundary. This obeys TQO-1,2 if we restrict to operators $O_A$ which are even in the fermionic operators, and we can then prove stability against arbitrary local perturbations since any term in the Hamiltonian must be even in the fermionic operators.

One final remark: while one advantage of our results is that they apply to a general class of models, this is also a drawback if the goal is to prove tight bounds on stability. For example, consider the Levin-Wen model [18] on a hexagonal lattice describing the quantum double of some anyon theory $X$. Any such model has 12-qubit commuting projectors and the code distance $L^*$ equals the lattice size. Hence our proof provides a constant lower bound on the stability radius which does not depend on $X$ at all (the dimension $D$ of the local Hilbert spaces certainly depends on $X$, but because we always use the operator norm, our bound does not depend on $D$). However, of course we expect that distinct models (and distinct perturbations of those models) will have different stability radii.

Acknowledgements. We thank S. Michalakis for useful discussions and for collaboration on [3]. SB was partially supported by the DARPA QUEST program under contract number HR0011-09-C-0047.

A. Construction of F

We explain the construction of $F(t)$. This follows [23] and [8]. Let $\tilde h(\omega)$ be an even function and have the property that $\tilde h(\omega) = 1$ for $\omega = 0$ and $\tilde h(\omega) = 0$ for $|\omega| \ge 1/2$. Let $h(t)$ be the Fourier transform of $\tilde h(\omega)$. Let $\tilde h(\omega)$ be infinitely differentiable so that $h(t)$ decays superpolynomially in $t$ (see [25] for the optimal construction of such $h(t)$; the paper [25] allows one to construct $h(t)$ decaying as fast as, for example, $\exp(-t/\log(t)^2)$, and using this construction throughout the present paper allows the superpolynomial bounds to be replaced by an exponential of a power for any power less than one). Then, define $F(t)$ by

$$F(t) = \frac{i}{2} \left( \mathrm{sign}(t) - \int du\, h(u)\, \mathrm{sign}(t - u) \right), \tag{52}$$

where $\mathrm{sign}(t - u)$ is the sign function: $\mathrm{sign}(t - u) = 1$ for $t > u$, $\mathrm{sign}(t - u) = -1$ for $t < u$, and $\mathrm{sign}(0) = 0$, and $\delta(t)$ is the Dirac $\delta$-function. Note that since $\tilde h(0) = 1$, we have, for $t > 0$,

$$F(t) = i \int_t^{\infty} du\, h(u), \tag{53}$$

and similarly for $t < 0$,

$$F(t) = -i \int_{-\infty}^{t} du\, h(u). \tag{54}$$

We now show that $F(t)$ decays superpolynomially and we show that the Fourier transform $\tilde F(\omega)$ is equal to $-1/\omega$ for $|\omega| \ge 1/2$, as desired. $F(t)$ is odd and pure imaginary by construction.


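Lemma 8. Let $F(t)$ be as defined in Eq. (52). Let $\tilde F(\omega)$ be the Fourier transform of $F(t)$. Then,

$$|F(t)| \le \left| \int_{|t|}^{\infty} h(u)\, du \right|, \tag{55}$$

and

$$\tilde F(\omega) = \frac{-1}{\omega} \left( 1 - \tilde h(\omega) \right). \tag{56}$$

Finally, if $h(t)$ decays superpolynomially in $t$ then $F(t)$ decays superpolynomially in $t$.

Proof. The first claim follows immediately by a triangle inequality. The fact that superpolynomial decay in $h(t)$ implies superpolynomial decay in $F$ follows immediately. To compute the Fourier transform, we have

$$\tilde F(\omega) = i \left( - \int_{-\infty}^{0} dt \exp(i\omega t) \int_{-\infty}^{t} du\, h(u) + \int_{0}^{\infty} dt \exp(i\omega t) \int_{t}^{\infty} du\, h(u) \right). \tag{57}$$

Integrating by parts in $t$, we have

$$\int_{0}^{\infty} dt \exp(i\omega t) \int_{t}^{\infty} du\, h(u) = -\frac{i}{\omega} \int_{0}^{\infty} dt \exp(i\omega t)\, h(t) + \frac{i}{\omega} \int_{0}^{\infty} du\, h(u). \tag{58}$$

Similarly,

$$\int_{-\infty}^{0} dt \exp(i\omega t) \int_{-\infty}^{t} du\, h(u) = \frac{i}{\omega} \int_{-\infty}^{0} dt \exp(i\omega t)\, h(t) - \frac{i}{\omega} \int_{-\infty}^{0} du\, h(u). \tag{59}$$

Adding these terms, we obtain $\tilde F(\omega) = \frac{-1}{\omega} (1 - \tilde h(\omega))$ as claimed.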
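One step that the formula $\tilde F(\omega) = -(1 - \tilde h(\omega))/\omega$ leaves implicit is why $\tilde F$ is infinitely differentiable at $\omega = 0$, as required in Sect. 6. The following short derivation is a sketch that uses only the assumptions already made on $\tilde h$ (smooth, even, $\tilde h(0) = 1$); the integration variable $\tau$ is introduced here for illustration.

```latex
% The fundamental theorem of calculus applied to \tilde h along the segment [0, \omega]:
\tilde h(\omega) - \tilde h(0) = \omega \int_0^1 \tilde h'(\tau\omega)\, d\tau
\quad\Longrightarrow\quad
\tilde F(\omega) = -\frac{1 - \tilde h(\omega)}{\omega} = \int_0^1 \tilde h'(\tau\omega)\, d\tau .
% The right-hand side is a smooth function of \omega because \tilde h' is smooth,
% so the apparent singularity at \omega = 0 is removable.
% Since \tilde h is even, \tilde h'(0) = 0, hence \tilde F(0) = 0 and
\tilde F(\omega) = \tfrac{1}{2}\, \tilde h''(0)\, \omega + O(\omega^3) \quad (\omega \to 0),
% while for |\omega| \ge 1/2 one has \tilde h(\omega) = 0 and \tilde F(\omega) = -1/\omega.
```

In particular $\tilde F$ is real and odd, consistent with $F(t)$ being odd and pure imaginary.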

References

1. Wen, X.G., Niu, Q.: Ground-state degeneracy of the fractional quantum Hall states in the presence of a random potential and on high-genus Riemann surfaces. Phys. Rev. B41, 9377 (1990)
2. Kitaev, A.: Fault-tolerant quantum computation by anyons. Ann. Phys. 303, 2 (2003)
3. Bravyi, S., Hastings, M.B., Michalakis, S.: Topological quantum order: stability under local perturbations. J. Math. Phys. 51, 093512 (2010)
4. Lieb, E.H., Robinson, D.W.: The finite group velocity of quantum spin systems. Commun. Math. Phys. 28, 251 (1972)
5. Hastings, M.B.: Lieb-Schultz-Mattis in higher dimensions. Phys. Rev. B69, 104431 (2004)
6. Nachtergaele, B., Sims, R.: Lieb-Robinson bounds and the exponential clustering theorem. Commun. Math. Phys. 265, 119 (2006)
7. Hastings, M.B., Wen, X.-G.: Quasi-adiabatic continuation of quantum states: the stability of topological ground state degeneracy and emergent gauge invariance. Phys. Rev. B72, 045141 (2005)
8. Osborne, T.J.: Simulating adiabatic evolution of gapped spin systems. Phys. Rev. A75, 032321 (2007)
9. Kennedy, T., Tasaki, H.: Hidden symmetry breaking and the Haldane phase in S = 1 quantum spin chains. Commun. Math. Phys. 147, 431–484 (1992)
10. Yarotsky, D.A.: Ground states in relatively bounded quantum perturbations of classical lattice systems. Commun. Math. Phys. 261, 799–819 (2006)
11. Klich, I.: On the stability of topological phases on a lattice. Ann. Phys. 325(10), 2120–2131 (2010)
12. Kirkwood, J., Thomas, L.: Expansions and phase transitions for the ground state of quantum Ising lattice systems. Commun. Math. Phys. 88, 569–580 (1983)
13. Datta, N., Kennedy, T.: Expansions for one quasiparticle states in spin 1/2 systems. J. Stat. Phys. 108, 373 (2002)
14. Yarotsky, D.: Perturbations of ground states in weakly interacting quantum spin systems. J. Math. Phys. 45(6), 2134 (2004)
15. Bravyi, S., DiVincenzo, D., Loss, D.: Polynomial-time algorithm for simulation of weakly interacting quantum spin systems. Commun. Math. Phys. 284, 481–507 (2008)
16. Hastings, M.B.: Solving gapped Hamiltonians locally. Phys. Rev. B73, 085115 (2006)
17. Kitaev, A.: Anyons in an exactly solved model and beyond. Ann. Phys. 321, 2–111 (2006), see Proposition D.1
18. Levin, M.A., Wen, X.-G.: String-net condensation: a physical mechanism for topological phases. Phys. Rev. B71, 045110 (2005)
19. Freedman, M.H., Kitaev, A., Larsen, M.J., Wang, Z.: Topological quantum computation. http://arXiv.org/abs/quant-ph/0101025v2, 2002
20. Bravyi, S., Hastings, M.B., Verstraete, F.: Lieb-Robinson bounds and the generation of correlations and topological quantum order. Phys. Rev. Lett. 97, 050401 (2006)
21. Bravyi, S., Poulin, D., Terhal, B.: Tradeoffs for reliable quantum information storage in 2D systems. Phys. Rev. Lett. 104, 050503 (2010)
22. Kato, T.: Perturbation theory for linear operators. Springer-Verlag, New York (1966)
23. Hastings, M.B.: http://arXiv.org/abs/1001.5280v2 [math-phy], 2010
24. Hastings, M.B., Koma, T.: Spectral gap and exponential decay of correlations. Commun. Math. Phys. 265, 781 (2006)
25. Ingham, A.E.: A note on Fourier transforms. J. London Math. Soc. 9, 29 (1934)

Communicated by I.M. Sigal

Commun. Math. Phys. 307, 629–673 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1353-3

Communications in

Mathematical Physics

A KAM Theorem for Hamiltonian Partial Differential Equations with Unbounded Perturbations Jianjun Liu, Xiaoping Yuan School of Mathematical Sciences and Key Lab of Math. for Nonlinear Science, Fudan University, Shanghai 200433, P.R. China. E-mail: [email protected]; [email protected]; [email protected] Received: 8 March 2010 / Accepted: 27 January 2011 Published online: 24 September 2011 – © Springer-Verlag 2011

Abstract: We establish an abstract infinite dimensional KAM theorem dealing with unbounded perturbation vector-fields, which can be applied to a large class of Hamiltonian PDEs containing the derivative $\partial_x$ in the perturbation. In particular, this range of application includes a class of derivative nonlinear Schrödinger equations with Dirichlet boundary conditions and the perturbed Benjamin-Ono equation with periodic boundary conditions, so KAM tori and thus quasi-periodic solutions are obtained for them.

1. Introduction and Main Results

Consider a Hamiltonian partial differential equation (HPDE)

$$\dot w = Aw + F(w),$$

where $Aw$ is a linear Hamiltonian vector-field with $d := \mathrm{ord}\, A > 0$ and $F(w)$ is a nonlinear Hamiltonian vector-field with $\tilde d := \mathrm{ord}\, F$ which is analytic in a neighborhood of the origin $w = 0$. When $\tilde d = \mathrm{ord}\, F \le 0$, the vector-field $F$ is called a bounded perturbation. For example, this case includes a class of nonlinear Schrödinger equations (with $\tilde d = 0$)

$$i u_t + u_{xx} + V(x) u + |u|^2 u = 0,$$

and a class of nonlinear wave equations (with $\tilde d = -1$)

$$u_{tt} - u_{xx} + V(x) u + u^3 = 0.$$

Supported by NNSFC, 973 Program (No. 2010CB327900), the Research Foundation for Doctor Programme, China Postdoctoral Science Foundation (No. 20100480553), China Postdoctoral Special Science Foundation (No. 201104236), Shanghai Postdoctoral Science Foundation (No. 11R21412000).


The existence of KAM tori for PDEs with bounded perturbations has been deeply and widely investigated by many authors. In this field of study there are too many references to list here. We give just two survey papers by Kuksin [9] and Bourgain [5]. When $\tilde d = \mathrm{ord}\, F > 0$, the vector-field $F$ is called an unbounded perturbation. According to a well-known example, due to Lax [12] and Klainerman [6] (see also [8]), it is reasonable to assume $\tilde d \le d - 1$ in order to guarantee the existence of KAM tori for the PDE. The quantity $d - \tilde d$ measures the strength of the nonlinearity of the PDE. The smaller $d - \tilde d$ is, the stronger the nonlinearity is. When $d - \tilde d = 1$, the nonlinearity of the PDE is the strongest. For PDEs with unbounded Hamiltonian perturbations, the only previous KAM theorem is due to Kuksin [8], where it is assumed that $d - \tilde d > 1$. Kuksin's theorem is used in [8] to prove the persistence of the finite-gap solutions of the KdV equation, as well as its hierarchy, subject to periodic boundary conditions. See also Kappeler-Pöschel [11]. Another KAM theorem with unbounded linear Hamiltonian perturbation is due to Bambusi-Graffi [2], where the spectrum property is investigated for the time dependent linear Schrödinger equation

$$i \dot y(t) = (A + \varepsilon F(t))\, y(t),$$

also with $d - \tilde d > 1$. The assumption $d - \tilde d > 1$ excludes a large class of interesting partial differential equations such as a class of derivative nonlinear Schrödinger (DNLS) equations

$$i u_t + u_{xx} - M_\sigma u + i f(u, \bar u)\, u_x = 0$$

subject to Dirichlet boundary conditions, and the perturbed Benjamin-Ono (BO) equation

$$u_t + H u_{xx} - u u_x + \text{perturbation} = 0$$

subject to periodic boundary conditions, where $H$ is the Hilbert transform. In both cases, $d - \tilde d = 1$. In the present paper, we will construct a KAM theorem including $d - \tilde d > 1$ and the limiting case $d - \tilde d = 1$ and show the existence of KAM tori for DNLS and perturbed BO equations.
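To make the role of the orders $d$ and $\tilde d$ concrete, the following summary tabulates them for the equations named above. It is a reader's sketch rather than part of the original text: the values of $\tilde d$ for the nonlinear Schrödinger and wave equations and the statements $d - \tilde d > 1$ (KdV) and $d - \tilde d = 1$ (DNLS, BO) come from the text, while the individual values of $d$ are filled in under the usual conventions (the wave equation written as a first-order system, the KdV nonlinearity $u u_x$ counted as order 1) and should be read as assumptions.

```latex
\begin{array}{l|c|c|c|l}
\text{equation} & d = \mathrm{ord}\,A & \tilde d = \mathrm{ord}\,F & d - \tilde d & \text{perturbation type} \\
\hline
\text{nonlinear Schr\"odinger: } i u_t + u_{xx} + V u + |u|^2 u = 0 & 2 & 0 & 2 & \text{bounded} \\
\text{nonlinear wave: } u_{tt} - u_{xx} + V u + u^3 = 0 & 1 & -1 & 2 & \text{bounded} \\
\text{KdV (Kuksin [8])} & 3 & 1 & 2 & \text{unbounded, } d - \tilde d > 1 \\
\text{DNLS: } i u_t + u_{xx} - M_\sigma u + i f(u,\bar u) u_x = 0 & 2 & 1 & 1 & \text{unbounded, limiting case} \\
\text{perturbed BO: } u_t + H u_{xx} - u u_x + \dots = 0 & 2 & 1 & 1 & \text{unbounded, limiting case}
\end{array}
```

Under these conventions the last two rows are exactly the cases excluded by the assumption $d - \tilde d > 1$.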
If the Hamiltonian operator $A$ has continuous spectrum, there are usually no KAM tori for the HPDE. Therefore, we should assume that the Hamiltonian operator $A$ has pure point spectrum before stating our theorems. Taking the eigenfunctions of $A$ as a basis and changing part of the coordinates into action-angle variables, from the HPDE one can usually obtain a small perturbation $H = N + P$ of an infinite dimensional Hamiltonian in the parameter dependent normal form

$$N = \sum_{1 \le j \le n} \omega_j(\xi)\, y_j + \frac{1}{2} \sum_{j \ge 1} \Omega_j(\xi)\, (u_j^2 + v_j^2)$$

on a phase space

$$\mathcal{P}^{a,p} = \mathbb{T}^n \times \mathbb{R}^n \times \ell^{a,p} \times \ell^{a,p} \ni (x, y, u, v) \tag{1.1}$$


with symplectic structure $\sum_{1 \le j \le n} dy_j \wedge dx_j + \sum_{j \ge 1} du_j \wedge dv_j$, where $\ell^{a,p}$ is the Hilbert space of all real sequences $w = (w_1, w_2, \ldots)$ with

$$\|w\|_{a,p}^2 = \sum_{j \ge 1} e^{2aj} j^{2p} |w_j|^2 < \infty,$$

where $a \ge 0$ and $p \ge 0$. The tangential frequencies $\omega = (\omega_1, \ldots, \omega_n)$ and normal frequencies $\Omega = (\Omega_1, \Omega_2, \ldots)$ are real vectors depending on parameters $\xi \in \Pi \subset \mathbb{R}^n$, with $\Pi$ a closed bounded set of positive Lebesgue measure, and roughly $\Omega_j(\xi) = j^d + \cdots$. The perturbation term $P$ is real analytic in the space coordinates and Lipschitz in the parameters, and for each $\xi \in \Pi$ its Hamiltonian vector field $X_P = (P_y, -P_x, -P_v, P_u)^T$ defines near $\mathcal{T}_0 := \mathbb{T}^n \times \{y = 0\} \times \{u = 0\} \times \{v = 0\}$ a real analytic map

$$X_P : \mathcal{P}^{a,p} \to \mathcal{P}^{a,q}, \quad \text{where } p - q = \tilde d.$$

Throughout this paper the parameter $a$ is fixed. Moreover, for convenience, we will adopt many notations and definitions from [11]. We denote by $\mathcal{P}_{\mathbb{C}}^{a,p}$ the complexification of $\mathcal{P}^{a,p}$. For $s, r > 0$, we introduce the complex $\mathcal{T}_0$-neighborhoods in $\mathcal{P}_{\mathbb{C}}^{a,p}$,

$$D(s, r) : \ |\mathrm{Im}\, x| < s, \quad |y| < r^2, \quad \|u\|_{a,p} + \|v\|_{a,p} < r, \tag{1.2}$$

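and the weighted norm for $W = (X, Y, U, V) \in \mathcal{P}_{\mathbb{C}}^{a,q}$,

$$\|W\|_{r,a,q} = |X| + \frac{|Y|}{r^2} + \frac{\|U\|_{a,q}}{r} + \frac{\|V\|_{a,q}}{r},$$

where $|\cdot|$ denotes the sup-norm for complex vectors. Furthermore, for a map $W : D(s,r) \times \Pi \to \mathcal{P}_{\mathbb{C}}^{a,q}$, for example, the Hamiltonian vector field $X_P$, we define the norms

$$\|W\|_{r,a,q,D(s,r) \times \Pi} = \sup_{D(s,r) \times \Pi} \|W\|_{r,a,q},$$

$$\|W\|^{\mathrm{lip}}_{r,a,q,D(s,r) \times \Pi} = \sup_{\xi, \zeta \in \Pi,\, \xi \ne \zeta}\ \sup_{D(s,r)} \frac{\|\Delta_{\xi\zeta} W\|_{r,a,q}}{|\xi - \zeta|},$$

where $\Delta_{\xi\zeta} W = W(\cdot\,; \xi) - W(\cdot\,; \zeta)$. In a completely analogous manner, the Lipschitz semi-norms of the frequencies $\omega$ and $\Omega$ are defined as

$$|\omega|^{\mathrm{lip}}_{\Pi} = \sup_{\xi, \zeta \in \Pi,\, \xi \ne \zeta} \frac{|\Delta_{\xi\zeta}\, \omega|}{|\xi - \zeta|}, \qquad |\Omega|^{\mathrm{lip}}_{-\delta, \Pi} = \sup_{\xi, \zeta \in \Pi,\, \xi \ne \zeta}\ \sup_{j \ge 1} \frac{j^{-\delta}\, |\Delta_{\xi\zeta}\, \Omega_j|}{|\xi - \zeta|} \tag{1.3}$$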

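for any real number $\delta$.

Theorem 1.1. Suppose the normal form $N$ described above satisfies the following assumptions:

(A) The map $\xi \mapsto \omega(\xi)$ between $\Pi$ and its image is a homeomorphism which is Lipschitz continuous in both directions, i.e. there exist positive constants $M_1$ and $L$ such that $|\omega|^{\mathrm{lip}}_{\Pi} \le M_1$ and $|\omega^{-1}|^{\mathrm{lip}}_{\omega(\Pi)} \le L$;

(B) There exists $d > 1$ such that

$$|\Omega_i - \Omega_j| \ge m\, |i^d - j^d| \tag{1.4}$$

for all $i \ne j \ge 0$ uniformly on $\Pi$ with some constant $m > 0$. Here $\Omega_0 = 0$;

(C) There exists $\delta \le d - 1$ such that the functions $\xi \mapsto \Omega_j(\xi)/j^\delta$ are uniformly Lipschitz on $\Pi$ for $j \ge 1$, i.e. there exists a positive constant $M_2$ such that $|\Omega|^{\mathrm{lip}}_{-\delta, \Pi} \le M_2$;

(D) We additionally assume

$$4 E L M_2 \le m, \tag{1.5}$$

where $E = |\omega|_{\Pi} := \sup_{\xi \in \Pi} |\omega(\xi)|$.

Set $M = M_1 + M_2$. Then for every $\beta > 0$, there exists a positive constant $\gamma$, depending only on $n$, $d$, $\delta$, $m$, the frequencies $\omega$ and $\Omega$, $s > 0$ and $\beta$, such that for every perturbation term $P$ described above with

$$\tilde d = p - q \le d - 1 \tag{1.6}$$

and

$$\varepsilon := \|X_P\|_{r,a,q,D(s,r) \times \Pi} + \frac{\alpha}{M} \|X_P\|^{\mathrm{lip}}_{r,a,q,D(s,r) \times \Pi} \le (\alpha\gamma)^{1+\beta} \tag{1.7}$$

for some $r > 0$ and $0 < \alpha < 1$, there exist

(1) a Cantor set $\Pi_\alpha \subset \Pi$ with

$$|\Pi \setminus \Pi_\alpha| \le c_1 \alpha, \tag{1.8}$$

where $|\cdot|$ denotes Lebesgue measure and $c_1 > 0$ is a constant depending on $n$, $\omega$ and $\Omega$;

(2) a Lipschitz family of smooth torus embeddings $\Phi : \mathbb{T}^n \times \Pi_\alpha \to \mathcal{P}^{a,p}$ satisfying: for every non-negative integer multi-index $k = (k_1, \ldots, k_n)$,

$$\|\partial_x^k (\Phi - \Phi_0)\|_{r,a,p,\mathbb{T}^n \times \Pi_\alpha} + \frac{\alpha}{M} \|\partial_x^k (\Phi - \Phi_0)\|^{\mathrm{lip}}_{r,a,p,\mathbb{T}^n \times \Pi_\alpha} \le c_2\, \varepsilon^{\frac{1}{1+\beta}} / \alpha, \tag{1.9}$$

where $\partial_x^k := \dfrac{\partial^{|k|}}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}}$ with $|k| := |k_1| + \cdots + |k_n|$,

$$\Phi_0 : \mathbb{T}^n \times \Pi \to \mathcal{T}_0, \quad (x, \xi) \mapsto (x, 0, 0, 0)$$

is the trivial embedding for each $\xi$, and $c_2$ is a positive constant which depends on $k$ and the same parameters as $\gamma$;

(3) a Lipschitz map $\phi : \Pi_\alpha \to \mathbb{R}^n$ with

$$|\phi - \omega|_{\Pi_\alpha} + \frac{\alpha}{M} |\phi - \omega|^{\mathrm{lip}}_{\Pi_\alpha} \le c_3\, \varepsilon, \tag{1.10}$$

where $c_3$ is a positive constant which depends on the same parameters as $\gamma$,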


such that for each $\xi \in \Pi_\alpha$ the map $\Phi$ restricted to $\mathbb{T}^n \times \{\xi\}$ is a smooth embedding of a rotational torus with frequencies $\phi(\xi)$ for the perturbed Hamiltonian $H$ at $\xi$. In other words,

$$t \mapsto \Phi(\theta + t\phi(\xi), \xi), \quad t \in \mathbb{R}$$

is a smooth quasi-periodic solution for the Hamiltonian $H$ evaluated at $\xi$ for every $\theta \in \mathbb{T}^n$ and $\xi \in \Pi_\alpha$.

This theorem can be applied to the class of derivative nonlinear Schrödinger equations mentioned above. However, the assumption (1.5) excludes the perturbed Benjamin-Ono equation mentioned above. Thus, in the following, we give a modified version of the above theorem:

Theorem 1.2. The above theorem also holds true when assumption (D) and conclusion (1) are replaced, respectively, by assumption (D*) and conclusion (1*) below:

(D*) For every $k \in \mathbb{Z}^n$ and $l \in \mathbb{Z}^\infty$ with $1 \le |l| \le 2$ (here $|l| = \sum_{j \ge 1} |l_j|$), the resonance set $\{\xi \in \Pi : \langle k, \omega(\xi) \rangle + \langle l, \Omega(\xi) \rangle = 0\}$ has Lebesgue measure zero. Moreover, if $\delta = d - 1$, we additionally assume that there exist $\delta_0 < d - 1$, a partition $\Omega = \Omega^1 + \Omega^2$, and positive constants $M_3$ and $M_4$ with

$$8 E L M_4 \le m, \tag{1.11}$$

such that $|\Omega^1|^{\mathrm{lip}}_{-\delta_0, \Pi} \le M_3$, $|\Omega^2|^{\mathrm{lip}}_{-\delta, \Pi} \le M_4$;

(1*) a Cantor set $\Pi_\alpha \subset \Pi$ with $|\Pi \setminus \Pi_\alpha| \to 0$ as $\alpha \to 0$, where $|\cdot|$ denotes Lebesgue measure.

This paper is organized as follows: In Sect. 2 we give an outline of the proof of the above theorems, and some new ideas compared with [8] and [11] are exhibited. In Sects. 3-6 the above theorems are proved in detail. The proof of Theorem 1.2 is the same as that of Theorem 1.1 except for the measure estimate. Thus, Sects. 3-5 and Subsect. 6.1 are devoted to the proof of Theorem 1.1, while the measure estimate for Theorem 1.2 is given in Subsect. 6.2. In Sects. 7-8 Theorem 1.1 and Theorem 1.2 are applied to derivative nonlinear Schrödinger equations and the perturbed Benjamin-Ono equation, respectively. Finally, a technical lemma is listed in Sect. 9.

2. Outline of The Proof and More Remarks

The above theorems generalize Kuksin's theorem from $\tilde d < d - 1$ to $\tilde d \le d - 1$, so that the range of application is extended to a class of derivative nonlinear Schrödinger equations and the perturbed Benjamin-Ono equation. Here we would like to compare the proof of our theorems with that of Kuksin's theorem. By and large, as in any KAM theorem, in both cases Newton iteration is used to overcome the notorious small divisor difficulty. Therefore, our proof is mainly based on Kuksin's approach in [8] (see also [11]). There are, however, some essential differences between the proof of our theorems and that of Kuksin's theorem. In order to see the differences clearly, let us give the basic procedure of the proof of the KAM theorem from [8] and [11], which consists of the following steps.


2.1. Derivation of the homological equations. For convenience, introduce complex variables $z = (u - iv)/\sqrt{2}$ and $\bar z = (u + iv)/\sqrt{2}$. Assume we are now in the $\nu$-th KAM iterative step. Write the integrable part $N_\nu$ of the Hamiltonian $H_\nu$,

$$N_\nu = \langle \omega, y \rangle + \sum_{j=1}^{\infty} \Omega_j z_j \bar z_j,$$

and develop the perturbation $P_\nu$ into the Taylor series in $(y, z, \bar z)$:

$$P_\nu = \varepsilon_\nu R_\nu + O(|y|^2 + |y|\, \|z\|_{a,p} + \|z\|_{a,p}^3),$$

where $\varepsilon_\nu$ goes to zero very fast, for example, taking $\varepsilon_\nu \approx \varepsilon^{(5/4)^\nu}$, and

$$R_\nu = R^x(x) + \langle R^y(x), y \rangle + \langle R^z(x), z \rangle + \langle R^{\bar z}(x), \bar z \rangle + \langle R^{zz}(x) z, z \rangle + \langle R^{\bar z \bar z}(x) \bar z, \bar z \rangle + \langle R^{z \bar z}(x) z, \bar z \rangle = O(1).$$

A key point is, very roughly speaking, to search for a Hamiltonian function of the same form as $R_\nu$:

$$F_\nu = F^x(x) + \langle F^y(x), y \rangle + \langle F^z(x), z \rangle + \langle F^{\bar z}(x), \bar z \rangle + \langle F^{zz}(x) z, z \rangle + \langle F^{\bar z \bar z}(x) \bar z, \bar z \rangle + \langle F^{z \bar z}(x) z, \bar z \rangle$$

which satisfies

$$\{N_\nu, F_\nu\} + R_\nu = 0, \tag{2.1}$$

where $\{\cdot, \cdot\}$ is the Poisson bracket with respect to the symplectic structure

$$\sum_{1 \le j \le n} dy_j \wedge dx_j - i \sum_{j \ge 1} dz_j \wedge d\bar z_j.$$

By $\Phi_\nu := X^1_{\varepsilon_\nu F_\nu}$ denote the time-1 map of the Hamiltonian vector field $X_{\varepsilon_\nu F_\nu}$. It is a symplectic transformation. A simple calculation shows that $\Phi_\nu$ changes $H_\nu = N_\nu + P_\nu$ into

$$H_{\nu+1} = H_\nu \circ \Phi_\nu = N_\nu + \tilde R_{\nu+1} + O(|y|^2 + |y|\, \|z\|_{a,p} + \|z\|_{a,p}^3) \tag{2.2}$$

with

$$\tilde R_{\nu+1} = \frac{1}{2} \varepsilon_\nu^2 \{\{N_\nu, F_\nu\}, F_\nu\} + \varepsilon_\nu^2 \{R_\nu, F_\nu\} + \cdots.$$

Our task is now to search for $F_\nu$ satisfying $\{N_\nu, F_\nu\} + R_\nu = 0$, which is a set of first order partial differential equations:

$$
\begin{aligned}
\omega \cdot \partial_x F^x(x) &= R^x(x), \\
\omega \cdot \partial_x F^y(x) &= R^y(x), \\
&\ \,\vdots \\
\omega \cdot \partial_x F^{z \bar z}(x) + i \Omega F^{z \bar z}(x) - i F^{z \bar z}(x) \Omega &= R^{z \bar z}(x),
\end{aligned}
$$

where $\Omega = \mathrm{diag}(\Omega_j : j = 1, 2, \ldots)$. Let us consider the last equation, which is the most difficult one. By $F_{ij}(x)$ and $R_{ij}(x)$ denote the matrix elements of the operators $F^{z \bar z}$ and $R^{z \bar z}$, respectively. Then the last equation becomes

$$-i \omega \cdot \partial_x F_{ij}(x) + (\Omega_i - \Omega_j) F_{ij}(x) = -i R_{ij}(x), \quad i, j = 1, 2, \ldots \tag{2.3}$$

or

$$(\langle k, \omega \rangle + \Omega_i - \Omega_j)\, \hat F_{ij}(k) = -i \hat R_{ij}(k),$$

where $\hat F_{ij}(k)$ and $\hat R_{ij}(k)$ are the $k$-th Fourier coefficients of $F_{ij}(x)$ and $R_{ij}(x)$, respectively. One can assume $\hat R_{jj}(0) = 0$; otherwise $\hat R_{jj}(0)$ is put into $\Omega_j$ as a modification of the normal form $N_\nu$. Thus, (2.3) can be solved by

$$\hat F_{ij}(k) = \frac{-i \hat R_{ij}(k)}{\langle k, \omega \rangle + \Omega_i - \Omega_j}$$

under the non-resonant conditions

$$\langle k, \omega \rangle + \Omega_i - \Omega_j \ne 0 \quad \text{unless } k = 0,\ i = j.$$

This is actually the KAM iterative procedure for a bounded perturbation $P$. However, things are not so simple when the perturbation $P$ is unbounded, i.e., $\tilde d = p - q > 0$. When $X_P : \mathcal{P}^{a,p} \to \mathcal{P}^{a,q}$, $\tilde d = p - q > 0$, one has, very roughly,

$$|\hat R_{ij}(k)| \approx |i|^{\tilde d} + |j|^{\tilde d} \to \infty, \quad \text{as } |i| + |j| \to \infty.$$

This usually leads to $|\hat F_{ij}(k)| \to \infty$. In other words, the solution $F_\nu$ or the transformation $\Phi_\nu = X^1_{\varepsilon_\nu F_\nu}$ would be unbounded. One should note that the coordinate transformations $\Phi_\nu = X^1_{\varepsilon_\nu F_\nu}$ must be bounded even if the perturbation $P$ is unbounded, in order that the domains of the KAM iteration are always in the same phase space $\mathcal{P}^{a,p}$ and that the KAM iterative procedure can work. In order to guarantee the boundedness of $F_\nu$, it is required in [8] and [11] that

$$|\Omega_i - \Omega_j| \approx |i|^{d-1} + |j|^{d-1}, \quad i \ne j,$$

or rather roughly,

$$|\langle k, \omega \rangle + \Omega_i - \Omega_j| \approx |i|^{d-1} + |j|^{d-1}.$$

Together with $\tilde d \le d - 1$, one has that for $i \ne j$,

$$|\hat F_{ij}(k)| \approx \frac{|i|^{\tilde d} + |j|^{\tilde d}}{|i|^{d-1} + |j|^{d-1}} = O(1).$$
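The Fourier-space solution of the off-diagonal equation (2.3) can be sketched numerically for a single angle ($n = 1$). Everything specific in the example is an assumption chosen for illustration: the frequency `omega`, the model normal frequencies $\Omega_j = j^2$ (so $d = 2$), the pair $(i, j) = (2, 1)$, and the trigonometric polynomial standing in for $R_{ij}(x)$. The factor $-i$ in the numerator comes from the right-hand side $-iR_{ij}(x)$ of (2.3).

```python
import numpy as np

def solve_homological(R_values, omega, lam):
    """Solve -i*omega*u'(x) + lam*u(x) = -i*R(x) on the circle by dividing
    Fourier coefficients by the divisors k*omega + lam (n = 1 version of (2.3))."""
    N = R_values.size
    k = np.fft.fftfreq(N, d=1.0 / N)      # integer wave numbers 0, 1, ..., -1
    divisors = k * omega + lam            # <k, omega> + Omega_i - Omega_j
    if np.min(np.abs(divisors)) < 1e-12:
        raise ValueError("resonant divisor: the non-resonance condition fails")
    u_hat = -1j * np.fft.fft(R_values) / divisors
    return np.fft.ifft(u_hat), k, divisors

# Hypothetical data for the illustration.
N = 64
x = 2.0 * np.pi * np.arange(N) / N
omega = np.sqrt(2.0)                      # tangential frequency (assumed)
lam = 2.0 ** 2 - 1.0 ** 2                 # Omega_2 - Omega_1 with Omega_j = j^2
R = np.cos(x) + 0.5 * np.sin(3.0 * x) + 0.25j * np.cos(2.0 * x)

u, k, divisors = solve_homological(R, omega, lam)

# Check the equation with a spectral derivative of the computed solution.
du = np.fft.ifft(1j * k * np.fft.fft(u))
residual = np.max(np.abs(-1j * omega * du + lam * u + 1j * R))
smallest_divisor = np.min(np.abs(divisors))
print(f"residual = {residual:.2e}, smallest divisor = {smallest_divisor:.3f}")
```

The size of the solution is governed by the smallest divisor, which is why the growth of $|\Omega_i - \Omega_j|$ relative to $|\hat R_{ij}(k)|$ decides whether $F_\nu$ stays bounded.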

It is clear that this estimate fails when $i = j$. To avoid this plight, Kuksin [8] smartly put the whole $R_{jj}(x)$ rather than $\hat R_{jj}(0)$ into $\Omega_j$ as a modification of the normal form $N_\nu$, so that it is not necessary to solve the equation for $F_{jj}(x)$. In doing so, the term $N_\nu$ becomes the generalized normal form (i.e. the normal frequencies depend on the angle variable $x$)

$$\tilde N_\nu := N_\nu + \sum_{j=1}^{\infty} R_{jj}(x)\, z_j \bar z_j = \langle \omega, y \rangle + \sum_{j=1}^{\infty} (\Omega_j + R_{jj}(x))\, z_j \bar z_j,$$

the homological equation (2.1) is modified into

$$\{\tilde N_\nu, F_\nu\} + R_\nu = 0, \tag{2.4}$$

and the remaining term $\tilde R_{\nu+1}$ is changed into

$$\tilde R_{\nu+1} = \frac{1}{2} \varepsilon_\nu^2 \{\{\tilde N_\nu, F_\nu\}, F_\nu\} + \varepsilon_\nu^2 \{R_\nu, F_\nu\} + \cdots. \tag{2.5}$$

Accordingly, the homological equation (2.3) becomes

$$-i \omega \cdot \partial_x F_{ij}(x) + (\Omega_i - \Omega_j + R_{ii}(x) - R_{jj}(x))\, F_{ij}(x) = -i R_{ij}(x), \quad i \ne j. \tag{2.6}$$

Let $u = F_{ij}(x)$, $\lambda = \Omega_i - \Omega_j$, $\mu(x) = R_{ii}(x) - R_{jj}(x)$ and $r(x) = -i R_{ij}(x)$. Then this equation can be abbreviated as an abstract equation

$$-i \omega \cdot \partial_x u + (\lambda + \mu(x))\, u = r(x). \tag{2.7}$$
Since $R_{ii}$ and $R_{jj}$ are large, $\mu(x)$ is usually large, and the coefficient $\mu(x)$ involves the angle variable. Equations of this type are called "small-denominators equations with large variable coefficients" by Kuksin [7]. We remark that, for simplicity, the modification of $\omega$ is omitted here.

2.2. Solving the homological equations. In order to make the KAM iterative procedure work, the existence domain of the solution $u$ should be the strip-type neighborhood of $\mathbb{T}^n$ with some width $s > 0$:

$$D(s) := \{x \in \mathbb{C}^n / 2\pi\mathbb{Z}^n : |\mathrm{Im}\, x| \le s\}.$$

Assume $\mu(x) \approx C \tilde\gamma$, where $C$ is some small constant and where $\tilde\gamma$ is usually of large magnitude. Since (2.7) is scalar, it can be solved directly and estimated by

$$\sup_{x \in D(s - \sigma)} |u(x)| \lesssim e^{2 C \tilde\gamma}\, \|r\|_s, \quad 0 \le \sigma < s, \tag{2.8}$$

where $\|r\|_s := \sup_{x \in D(s)} |r(x)|$. This estimate is, however, not good enough to support the KAM iteration procedure, since the solution $u$ becomes too large when $\mu$ is large. In fact, in the $\nu$-th KAM iteration step, $\tilde\gamma \approx 2^\nu$. Thus, $u \approx \exp(2^\nu)$ goes to infinity very rapidly, which makes the coordinate transformations essentially unbounded. When $\tilde d < d - 1$, Kuksin's Lemma [7] can solve this problem. Following Kuksin [7], we assume there are constants $C > 0$ and $0 < \theta < 1$ such that

$$|\lambda|^\theta \ge C \|\mu\|_s. \tag{2.9}$$

Kuksin's Lemma states that under suitable non-resonant conditions on $\omega$, the solution $u$ satisfies the estimate

$$\|u\|_{s - \sigma} \le C_1 \exp\left( C_2\, C_3^{\frac{1}{1 - \theta}} \right) \|r\|_s, \tag{2.10}$$

where $C_1$ and $C_2$ are positive constants depending only on $n$ and $\sigma$, and $C_3 > 0$ is a constant depending on the non-resonant conditions. When this estimate is applied to (2.6), one needs to take

$$\lambda = \Omega_i - \Omega_j \approx i^d - j^d \approx i^{d-1} + j^{d-1}, \qquad \tilde\gamma \approx i^{\tilde d} + j^{\tilde d}, \qquad i, j = 1, 2, 3, \ldots.$$

KAM Theorem for Hamiltonian PDEs with Unbounded Perturbations


From this, we see that there is indeed a constant $0 < \theta < 1$ such that (2.9) holds true if $\tilde d < d - 1$. Therefore, the solution $u$ of the homological equation (2.7) has a uniform bound independent of the size of $\mu$. This makes the coordinate transformation bounded. We see also that $\tilde d = d - 1$ leads to $\theta = 1$ in (2.9). In this case, the estimate (2.10) is invalid. (The right-hand side of (2.10) is equal to $\infty$.) It is now clear that we need a new estimate for the solution $u$ covering not only the case $\tilde d < d - 1$ but also the limit case $\tilde d = d - 1$. The new estimate has been obtained in our recent paper [13]:

\[ \|u\|_{s-\sigma} \le C_4\, e^{C\tilde\gamma s}\,\|r\|_s, \tag{2.11} \]

where $C_4$ is a constant depending on the non-resonant conditions of $\omega$ and $C$ is a small enough positive constant. Since the parameter $\tilde\gamma$ (which measures the magnitude of the perturbation $\mu(x)$) enters the exponential on the right-hand side of (2.11), the upper bound of this new estimate looks weaker than that of the original Kuksin's lemma. However, compared with (2.8), there is an essential improvement in (2.11): there is an $s$ in the exponential. This number $s$ will be used crucially in the following manner. In the $\nu$th KAM iteration, we let $s = 2^{-\nu}$. One will find that there is no small divisor problem in the homological equation (2.7) when $\tilde\gamma$ exceeds some large constant $K \approx 2^\nu (5/4)^\nu |\ln \varepsilon|$. In this case, the homological equation (2.7) can be solved by the implicit function theorem. So the non-trivial case is $\tilde\gamma \le K$. In this case we find that

\[ e^{C\tilde\gamma s} \lessdot \varepsilon_\nu^{-C} \]

with $\varepsilon_\nu := \varepsilon^{(5/4)^\nu}$ and the constant $C \ll 1$. Thus, by the new estimate (2.11) and noting $\|r\|_s \le \varepsilon_\nu$, we have

\[ \|u\|_{s-\sigma} \lessdot \varepsilon_\nu^{-C}\|r\|_s \le \varepsilon_\nu^{-C}\varepsilon_\nu. \tag{2.12} \]

In the usual KAM iteration, one would have obtained $\|u\|_{s-\sigma} \le 2^\nu\varepsilon_\nu$. Although here $\varepsilon_\nu^{-C}\varepsilon_\nu \gg 2^\nu\varepsilon_\nu$, inequality (2.12) still allows the KAM procedure to be iterated. Therefore, although the new estimate (2.11) is weaker than that of the original Kuksin's Lemma, it covers both $\tilde d < d - 1$ and the limit case $\tilde d = d - 1$, and it is sufficient for the proof of the KAM theorems of the present paper.

2.3. Estimate of the remaining terms $\tilde R_{\nu+1}$, etc. Recall $R_\nu = O(1)$. By (2.11), $F_\nu = O(\varepsilon_\nu^{-C})$ with $0 < C \ll 1$. Thus, $\{F_\nu, R_\nu\} = O(\varepsilon_\nu^{-C})$. By (2.4), $\{\tilde N_\nu, F_\nu\} = -R_\nu = O(1)$. Thus, $\{\{\tilde N_\nu, F_\nu\}, F_\nu\} = \{O(1), O(\varepsilon_\nu^{-C})\} = O(\varepsilon_\nu^{-C})$. Consequently, by (2.5), very roughly,

\[ \tilde R_{\nu+1} = \varepsilon_\nu^2\,\tfrac{1}{2}\{\{\tilde N_\nu, F_\nu\}, F_\nu\} + \varepsilon_\nu^2\{R_\nu, F_\nu\} + \cdots = O(\varepsilon_\nu^{2-C}) = O(\varepsilon_{\nu+1}) := \varepsilon_{\nu+1} R_{\nu+1}, \]

where $R_{\nu+1} = O(1)$. Therefore, we can rewrite $H_{\nu+1}$ as

\[ H_{\nu+1} = \tilde N_{\nu+1} + \varepsilon_{\nu+1} R_{\nu+1} + O(|y|^2 + |y|\,\|z\|_{a,p} + \|z\|_{a,p}^3). \tag{2.13} \]
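The role of the strip width $s$ in (2.11), discussed in Sect. 2.2, can be illustrated numerically. The following Python sketch compares the exponent $2C\tilde\gamma$ of (2.8) under the rough scaling $\tilde\gamma \approx 2^\nu$ with the exponent $C\tilde\gamma s$ of (2.11) in the worst case $\tilde\gamma = K \approx 2^\nu(5/4)^\nu|\ln\varepsilon|$, $s = 2^{-\nu}$. The values of `C` and `eps` and all function names are illustrative assumptions, not quantities taken from the paper.

```python
import math

def exponent_old(nu, C=0.1):
    """Exponent 2*C*gamma in (2.8), with the rough scaling gamma ~ 2**nu."""
    return 2.0 * C * 2.0 ** nu

def exponent_new(nu, C=0.1, eps=1e-3):
    """Exponent C*gamma*s in (2.11) at the worst case gamma = K,
    with s = 2**(-nu) and K ~ 2**nu * (5/4)**nu * |ln eps|."""
    s = 2.0 ** (-nu)
    K = 2.0 ** nu * 1.25 ** nu * abs(math.log(eps))
    return C * K * s

def log_eps_nu(nu, eps=1e-3):
    """Logarithm of eps_nu = eps**((5/4)**nu)."""
    return 1.25 ** nu * math.log(eps)

for nu in range(0, 25, 6):
    # exp(exponent_new) coincides with eps_nu**(-C); compare the logarithms.
    print(nu, exponent_old(nu), exponent_new(nu), -0.1 * log_eps_nu(nu))
```

Under these assumed scalings the exponent of (2.8) doubles at every step, whereas the exponent of (2.11) only grows like $(5/4)^\nu$, which is exactly the size of $-C\ln\varepsilon_\nu$.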


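2.4. Convergence of the iterative procedure. Repeating the above procedure, letting $\nu \to \infty$ and noting that $\varepsilon_\nu \to 0$ very fast, one finally gets

\[ H_\infty := \lim_{\nu\to\infty} H \circ \Phi_1 \circ \cdots \circ \Phi_\nu = \tilde N_\infty + O(|y|^2 + |y|\,\|z\|_{a,p} + \|z\|_{a,p}^3), \]

where $\tilde N_\infty = \lim_{\nu\to\infty} \tilde N_\nu = \langle\omega_\infty, y\rangle + \sum_{j\ge 1} \Omega_{\infty,j}(x) z_j \bar z_j$. The system corresponding to $H_\infty$ is

\[ \begin{cases} \dot x = \dfrac{\partial H_\infty}{\partial y} = \omega_\infty + O(|y| + \|z\|_{a,p}), \\[4pt] \dot y = -\dfrac{\partial H_\infty}{\partial x} = O(|y|^2 + |y|\,\|z\|_{a,p} + \|z\|_{a,p}^2), \\[4pt] \dot z_j = -i\dfrac{\partial H_\infty}{\partial \bar z_j} = -i\,\Omega_{\infty,j} z_j + O(|y| + \|z\|_{a,p}^2), \quad j \ge 1. \end{cases} \]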

It is clearly seen that T∞ := {(x, y, z, z¯ ) ∈ P a, p : x = ω∞ t, y = 0, z = z¯ = 0, t ∈ R} forms an invariant torus of the hamiltonian vector field X H∞ . Going back to the original vector field X H , then (limν→∞ 1 ◦· · ·◦ν )T∞ is an invariant torus of the Hamiltonian system defined by H . 2.5. Estimate of the measure of the parameters. When d˜ = d − 1, another new difficulty also arises in search of F. It is under suitable non-resonant conditions that either (2.11) or (2.10) can hold true. In other words, one has to remove some resonant sets consisting of “bad” parameters ξ , equivalently, to remove the “bad” parameters ω when ω = ω(ξ ) depends on ξ in some non-degenerate way. For example, we need to eliminate the resonant set {ξ ∈ : k, ω(ξ ) + i (ξ ) − j (ξ ) is small}, where i = j ∈ N and k ∈ Zn . Clearly we hope that the Lebesgue measure of the set is small. To that end we need to verify that k, ω(ξ ) + i (ξ ) − j (ξ ) is twisted with respect to ξ ∈ , equivalently, twisted with respect to ω ∈ ω( ), that is, we need to show lip

() := | k, ω + i (ξ(ω)) − j (ξ(ω))|ω( ) > 0, where ω(ξ(ω)) = ω. Recall |i − j | ≥ m|i d − j d |. Thus, k, ω+i − j is not small if |k| ≤ C˜ 0 |i d − j d | with some constant C˜ 0 depending on m and |ω| . Now we assume |k| ≥ C˜ 0 |i d − j d |. At the ν th KAM step, because of the modification of frequencies from unbounded perturbation, we have ˜

| j |ω( ) = O( j δ ) + O( j d ). lip

˜ Therefore, there exists a constant By a small trick (see §3.2 below), we can let δ = d. ˜ C1 such that ˜ ˜ () ≥ |k| − C˜ 1 (i d + j d ) 1 ˜ ˜ ≥ C˜ 0 (i d−1 + j d−1 ) − C˜ 1 (i d + j d ) 2 1

˜ ¯ ˜ ˜ ≥ C˜ 0 (i d−1−d + j d−1−d ) − C˜ 1 (i d + j d ), 4

(2.14) (2.15) (2.16)


which is similar to (10.50) in [8] (see also p. 174 in [11]). When d˜ < d − 1, it follows from (2.16) that () > 0 if max(i, j) >

4C˜

1

C˜ 0

1 d−1−d˜

:= K .

(2.17)

Therefore, if d˜ < d − 1 it remains to verify the twist condition () > 0 just for only a finite number of cases (k, i, j) :

¯ ˜ C˜ 0 |i d − j d | ≤ |k| ≤ C˜ 1 (i d + j d ) and i, j ≤ K .

(2.18)

Note that the constant K is independent of i, j and k, so is the number of the cases. For these cases, the measure estimate of the resonant sets can be dealt with by some initial assumptions (for example, see Proposition 22.2 in [11]). When d˜ = d − 1, K = +∞. However, it follows directly from (2.15) that () > ( 21 C˜ 0 − C˜ 1 )(i d−1 + j d−1 ) > 0 if C˜ 0 > 2C˜ 1 . This completes the measure estimate of the resonant sets for Theorem 1.1. One can verify that the condition C˜ 0 > 2C˜ 1 is indeed satisfied by DNLS equations. However, the inequality C˜ 0 > 2C˜ 1 is not satisfied by the BO equation. The procedure of the measure estimate of the resonant set is modified as follows: lip Assume that can be split into two parts: j = 1j +2j such that |1j |ω( ) ≤ C˜ 2 j δ0 lip with δ0 < d − 1, and |2 | ≤ C˜ 3 j d−1 with C˜ 3 suitably small. Thus, j ω( )

() ≥ |k| − C˜ 2 (i δ0 + j δ0 ) − C˜ 3 (i d−1 + j d−1 ) 1 ≥ ( C˜ 0 − C˜ 3 )(i d−1 + j d−1 ) − C˜ 2 (i δ0 + j δ0 ). 2 It follows from δ0 < d − 1 that there is a constant K˜ = K˜ (d, δ0 , C˜ 0 , C˜ 2 , C˜ 3 ) > 0 such that () > 0 if max{i, j} > K˜ and 21 C˜ 0 > C˜ 3 . The latter inequality is satisfied by the BO equation with d = 2 and δ0 = 0. We also mention that the partition of into 1 + 2 is rather natural. In fact, 1 is usually regarded as the initial frequency vector, while 2 corresponds to the modification in KAM iteration steps. Remarks. 1. In [13] we mentioned Theorem 1.1 above and its application to DNLS equation without proof. In this paper we give the proof and add Theorem 1.2 and a new application to the Benjamin-Ono equation. 2. The proofs of both (2.10) and (2.11) depend heavily on the fact that the homological equation (2.7) is scalar so that the solution can be expressed explicitly. This restric tion requires that the normal frequency j ’s must be simple, i.e., j = 1. Therefore, the range of application of Theorems 1.1 and 1.2 and Kuksin’s KAM theorem ([8]) lies in those PDEs with simple frequencies such as the DNLS equation subject to Dirichlet boundary conditions, the KdV and BO equations with periodic boundary conditions. For the DNLS equation, iu t + u x x + i(|u|2 u)x = 0,

(2.19)

subject to periodic boundary conditions, the multiplicity j = 2. Note that the nonlinearity (|u|2 u)x does not involve x explicitly. Passing to Fourier coefficients, the corresponding Hamiltonian consists of monomials qi q¯ j qk q¯l , where the subscript


(i, j, k, l) satisfies i − j + k − l = 0. Further, after some symplectic transformations, the Hamiltonian essentially consists of monomials qn 1 q¯n 2 qn 3 q¯n 4 · · ·qn 2r −1 q¯n 2r , where the subscript (n 1 , n 2 , . . . , n 2r ) satisfies n 1 − n 2 + n 3 − n 4 + · · · + n 2r −1 − n 2r = 0.

(2.20)

By Bourgain’s observations in p. 2 of [4] , the multiplicity j = 2 is essentially

reduced to j = 1. Therefore, the estimate (2.11) still works. Moreover, the quasiperiodic solutions of (2.19) can be obtained, too. The details will appear in our forthcoming paper [14]. However, if the nonlinearity (for example a(x)|u|2 u x ) contains x explicitly such that (2.20) is violated, then the existence of KAM tori is hard to get. In addition, for the Kadomtsev-Petviashvili (KP) equation, (u t + u x x x + uu x )x ± u yy = 0, u = u(t, x, y), (x, y) ∈ T2 ,

(2.21)

with the frequency multiplicity j → ∞ as | j| → ∞, nothing is known about the existence of KAM tori under perturbation. Actually, this is a well-known open problem posed by Kuksin. See [5,9].

3. The Homological Equations

3.1. Derivation of homological equations. The proof of Theorem 1.1 employs the rapidly converging iteration scheme of Newton type, introduced by Kolmogorov to deal with small divisor problems, which involves an infinite sequence of coordinate transformations. At the $\nu$th step of the scheme, a Hamiltonian $H_\nu = N_\nu + P_\nu$ is considered as a small perturbation of some normal form $N_\nu$. A transformation $\Phi_\nu$ is set up so that $H_\nu \circ \Phi_\nu = N_{\nu+1} + P_{\nu+1}$ with another normal form $N_{\nu+1}$ and a much smaller perturbation $P_{\nu+1}$. We drop the index $\nu$ of $H_\nu$, $N_\nu$, $P_\nu$, $\Phi_\nu$ and shorten the index $\nu + 1$ as $+$. Using the convenient complex notation $z = (u - iv)/\sqrt{2}$ and $\bar z = (u + iv)/\sqrt{2}$, the generalized normal form reads

\[ N = \langle\omega(\xi), y\rangle + \sum_{j\ge 1} \Omega_j(x;\xi) z_j \bar z_j. \tag{3.1} \]

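Let $R$ be the 2-order Taylor polynomial truncation of $P$, that is,

\[ R = R^x + \langle R^y, y\rangle + \langle R^z, z\rangle + \langle R^{\bar z}, \bar z\rangle + \langle R^{zz} z, z\rangle + \langle R^{\bar z\bar z}\bar z, \bar z\rangle + \langle R^{z\bar z} z, \bar z\rangle, \tag{3.2} \]

where $\langle\cdot,\cdot\rangle$ is the formal product for two column vectors and $R^x, R^y, R^z, R^{\bar z}, R^{zz}, R^{\bar z\bar z}, R^{z\bar z}$ depend on $x$ and $\xi$. For a function $u$ on $\mathbb{T}^n$, let

\[ [u] = \frac{1}{(2\pi)^n}\int_{\mathbb{T}^n} u(x)\,dx. \]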


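By $[[R]]$ denote the part of $R$ in generalized normal form as follows:

\[ [[R]] = [R^x] + \langle [R^y], y\rangle + \langle \mathrm{diag}(R^{z\bar z}) z, \bar z\rangle, \]

where $\mathrm{diag}(R^{z\bar z})$ is the diagonal of $R^{z\bar z}$. Note that $[R^x]$ and $[R^y]$ are independent of $x$. In the following, the term $[R^x]$ will be omitted since it does not affect the dynamics. The coordinate transformation $\Phi$ is obtained as the time-1-map $X_F^t|_{t=1}$ of a Hamiltonian vector field $X_F$, where $F$ is of the same form as $R$:

\[ F = F^x + \langle F^y, y\rangle + \langle F^z, z\rangle + \langle F^{\bar z}, \bar z\rangle + \langle F^{zz} z, z\rangle + \langle F^{\bar z\bar z}\bar z, \bar z\rangle + \langle F^{z\bar z} z, \bar z\rangle, \tag{3.3} \]

and $[[F]] = 0$. Denote $\partial_\omega = \sum_{1\le b\le n} \omega_b \frac{\partial}{\partial x_b}$, $\Lambda = \mathrm{diag}(\Omega_j : j \ge 1)$. Then we have

\begin{align}
H\circ\Phi &= (N + R)\circ X_F^1 + (P - R)\circ X_F^1 \notag\\
&= N + \{N, F\} + R + \int_0^1 \{(1-t)\{N, F\} + R, F\}\circ X_F^t\,dt + (P - R)\circ X_F^1 \notag\\
&= N + \sum_{j\ge 1}\langle\partial_x\Omega_j, F^y\rangle z_j\bar z_j + \langle [R^y], y\rangle + \langle\mathrm{diag}(R^{z\bar z}) z, \bar z\rangle \tag{3.4}\\
&\quad + (-\partial_\omega F^x + R^x) \tag{3.5}\\
&\quad + \langle -\partial_\omega F^y + R^y - [R^y], y\rangle \tag{3.6}\\
&\quad + \langle -\partial_\omega F^z + i\Lambda F^z + R^z, z\rangle \tag{3.7}\\
&\quad + \langle -\partial_\omega F^{\bar z} - i\Lambda F^{\bar z} + R^{\bar z}, \bar z\rangle \tag{3.8}\\
&\quad + \langle(-\partial_\omega F^{zz} + i\Lambda F^{zz} + iF^{zz}\Lambda + R^{zz}) z, z\rangle \tag{3.9}\\
&\quad + \langle(-\partial_\omega F^{\bar z\bar z} - i\Lambda F^{\bar z\bar z} - iF^{\bar z\bar z}\Lambda + R^{\bar z\bar z})\bar z, \bar z\rangle \tag{3.10}\\
&\quad + \langle(-\partial_\omega F^{z\bar z} - i\Lambda F^{z\bar z} + iF^{z\bar z}\Lambda + R^{z\bar z} - \mathrm{diag}(R^{z\bar z})) z, \bar z\rangle \tag{3.11}\\
&\quad + \int_0^1 \{(1-t)\{N, F\} + R, F\}\circ X_F^t\,dt + (P - R)\circ X_F^1. \tag{3.12}
\end{align}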

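We wish to find the function $F$ such that (3.5)–(3.11) vanish. To this end, $F^x$, $F^y$, $F^z$, $F^{\bar z}$, $F^{zz}$, $F^{\bar z\bar z}$ and $F^{z\bar z}$ should satisfy the homological equations:

\begin{align}
\partial_\omega F^x &= R^x, \tag{3.13}\\
\partial_\omega F^y &= R^y - [R^y], \tag{3.14}\\
\partial_\omega F_j^z - i\Omega_j F_j^z &= R_j^z, & j \ge 1, \tag{3.15}\\
\partial_\omega F_j^{\bar z} + i\Omega_j F_j^{\bar z} &= R_j^{\bar z}, & j \ge 1, \tag{3.16}\\
\partial_\omega F_{ij}^{zz} - i(\Omega_i + \Omega_j) F_{ij}^{zz} &= R_{ij}^{zz}, & i, j \ge 1, \tag{3.17}\\
\partial_\omega F_{ij}^{\bar z\bar z} + i(\Omega_i + \Omega_j) F_{ij}^{\bar z\bar z} &= R_{ij}^{\bar z\bar z}, & i, j \ge 1, \tag{3.18}\\
\partial_\omega F_{ij}^{z\bar z} + i(\Omega_i - \Omega_j) F_{ij}^{z\bar z} &= R_{ij}^{z\bar z}, & i, j \ge 1,\ i \ne j. \tag{3.19}
\end{align}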
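For the constant-coefficient equations (3.13) and (3.14), the standard finite-dimensional approach divides each Fourier coefficient by the small divisor $i\langle k, \omega\rangle$. The Python sketch below illustrates this for a trigonometric polynomial on $\mathbb{T}^2$; the function names, the sample frequency vector and the sample coefficients are assumptions made for illustration only, and the mean value is removed from the right-hand side as in (3.14).

```python
import cmath

def solve_transport(R_hat, omega):
    """Given Fourier coefficients R_hat[k] (k an integer tuple), return the
    coefficients of the zero-mean solution F of  d_omega F = R - [R],
    i.e. F_hat[k] = R_hat[k] / (i <k, omega>) for k != 0."""
    F_hat = {}
    for k, c in R_hat.items():
        if all(kb == 0 for kb in k):
            continue  # the mean [R] is subtracted on the right-hand side
        divisor = 1j * sum(kb * wb for kb, wb in zip(k, omega))
        if divisor == 0:
            raise ZeroDivisionError(f"resonance at k={k}")
        F_hat[k] = c / divisor
    return F_hat

def d_omega(coeffs, omega):
    """Fourier coefficients of sum_b omega_b * dF/dx_b."""
    return {k: 1j * sum(kb * wb for kb, wb in zip(k, omega)) * c
            for k, c in coeffs.items()}

def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial with the given coefficients at x."""
    return sum(c * cmath.exp(1j * sum(kb * xb for kb, xb in zip(k, x)))
               for k, c in coeffs.items())

# Illustrative data: a non-resonant frequency vector and a few modes.
omega = (1.0, (1 + 5 ** 0.5) / 2)
R_hat = {(0, 0): 2.0, (1, 0): 0.5, (1, -1): 0.25j, (-2, 3): 0.1}
F_hat = solve_transport(R_hat, omega)
```

The variable-coefficient equations (3.15)–(3.19) cannot be solved mode by mode in this way, because $\Omega_j$ depends on $x$; this is why the lemmas of [13] are needed.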

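3.2. Solving the homological equations. Let $\Omega = (\Omega_j : j \ge 1)$, $\bar\Omega = [\Omega]$ and $\tilde\Omega = \Omega - \bar\Omega$. Define

\[ \langle k\rangle = \max\{1, |k|\}, \qquad \langle l\rangle_d = \max\Bigl\{1, \Bigl|\sum_{j\ge 1} j^d l_j\Bigr|\Bigr\}. \]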


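Moreover, for an analytic function $u$ on $D(s)$, we define

\[ |u|_{s,\tau} := \sum_{k\in\mathbb{Z}^n} |\hat u_k|\,|k|^\tau e^{|k|s}, \]

where $\hat u_k := (2\pi)^{-n}\int_{\mathbb{T}^n} u(x) e^{-ik\cdot x}\,dx$ is the $k$-Fourier coefficient of $u$.

Consider the conditions $\delta \le d - 1$ and $\tilde d = p - q \le d - 1$. If $\delta > \tilde d$, we decrease $q$ such that $\delta = \tilde d$; then for the new $q$, the inequality (1.7) still holds true. If $\delta < \tilde d$, we increase $\delta$ such that $\delta = \tilde d$; then for the new $\delta$, the assumption (C) still holds true, and if $\tilde d = d - 1$, the assumption (D*) is satisfied with $\Omega^2 = 0$. Thus, without loss of generality we assume $\delta = \tilde d \le d - 1$ in the following. Equations (3.13)–(3.19) will be solved under the following conditions: uniformly on $\Pi$,

\begin{align}
|\langle k, \omega(\xi)\rangle + \langle l, \bar\Omega(\xi)\rangle| &\ge \alpha\frac{\langle l\rangle_d}{\langle k\rangle^\tau}, & k \ne 0,\ |l| \le 2, \tag{3.20}\\
|\langle l, \bar\Omega(\xi)\rangle| &\ge m\langle l\rangle_d, & 0 < |l| \le 2, \tag{3.21}\\
|\tilde\Omega_j|_{s,\tau+1} &\le \alpha\gamma_0 j^\delta, & j \ge 1, \tag{3.22}
\end{align}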

with constants $\tau \ge n$, $d > 1$, $0 < \gamma_0 \le 1/8$, $m > 0$, and a parameter $0 < \alpha \le m$. We mention that $d$ is the same as in Theorem 1.1 and that $\alpha$, $m$ will be the iteration parameters $\alpha_\nu$, $m_\nu$ in the $\nu$th KAM step. Equations (3.13), (3.14) can be easily solved by a standard approach in classical, finite dimensional KAM theory, so we only give the related results at the end of this subsection. Equations (3.15)–(3.18) are easier than (3.19) and can be solved in the same way as (3.19), so we only give the details of solving (3.19) in the following. For any positive number $K$, we introduce a truncation operator $\Gamma_K$ as follows:

\[ (\Gamma_K f)(x) := \sum_{|k|\le K} \hat f_k e^{ik\cdot x}, \qquad \forall f : \mathbb{T}^n \to \mathbb{C}, \]

where $\hat f_k$ is the $k$-Fourier coefficient of $f$. Set $C_0 = 2|\omega|_\Pi/m$ and let $K$ be a positive number which will be the iteration parameter $K_\nu$ in the $\nu$th KAM step.

(1) For $(i, j)$ with $0 < |i^d - j^d| < C_0 K$, we solve (3.19) exactly:

\[ \partial_\omega F_{ij}^{z\bar z} + i(\Omega_i - \Omega_j) F_{ij}^{z\bar z} = R_{ij}^{z\bar z}; \tag{3.23} \]

(2) for $(i, j)$ with $|i^d - j^d| \ge C_0 K$, we solve the truncated equation of (3.19):

\[ \partial_\omega F_{ij}^{z\bar z} + i\Gamma_K\bigl((\Omega_i - \Omega_j) F_{ij}^{z\bar z}\bigr) = \Gamma_K R_{ij}^{z\bar z}, \qquad \Gamma_K F_{ij}^{z\bar z} = F_{ij}^{z\bar z}. \tag{3.24} \]

Comparing (3.24) with (3.19), we find that (3.11) does not vanish. Actually, in this case, (3.11) is equal to $\langle\hat R^{z\bar z} z, \bar z\rangle$ with the matrix elements of $\hat R^{z\bar z}$ being defined by

\[ \hat R_{ij}^{z\bar z} = \begin{cases} 0, & |i^d - j^d| < C_0 K, \\ (1 - \Gamma_K)\bigl(-i(\Omega_i - \Omega_j) F_{ij}^{z\bar z} + R_{ij}^{z\bar z}\bigr), & |i^d - j^d| \ge C_0 K. \end{cases} \tag{3.25} \]
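The effect of the truncation operator can be made concrete with a small Python sketch acting directly on Fourier coefficients. It assumes that $|k|$ denotes the sum of the absolute values of the components of $k$ and uses the weighted sum $\sum_k |\hat f_k| e^{|k|s}$ as an upper bound for the supremum over $D(s)$; the function names and sample data are hypothetical.

```python
import math

def truncate(f_hat, K):
    """Gamma_K: keep the Fourier modes with |k| <= K (|k| = sum of |k_b|)."""
    return {k: c for k, c in f_hat.items() if sum(abs(kb) for kb in k) <= K}

def tail(f_hat, K):
    """(1 - Gamma_K) f: the Fourier modes with |k| > K."""
    return {k: c for k, c in f_hat.items() if sum(abs(kb) for kb in k) > K}

def weighted_norm(f_hat, s):
    """sum_k |f_hat_k| * exp(|k| * s), an upper bound for sup over D(s)."""
    return sum(abs(c) * math.exp(sum(abs(kb) for kb in k) * s)
               for k, c in f_hat.items())

# Illustrative data on the circle (n = 1): coefficients decaying like e^{-|k|}.
f_hat = {(k,): math.exp(-abs(k)) for k in range(-30, 31)}
s, sigma, K = 0.5, 0.2, 10
tail_bound = weighted_norm(tail(f_hat, K), s - sigma)
full_bound = math.exp(-K * sigma) * weighted_norm(f_hat, s)
```

Since every discarded mode satisfies $|k| > K$, the tail measured on the narrower strip $D(s - \sigma)$ is at most $e^{-K\sigma}$ times the weighted sum on $D(s)$. An exponential gain of this kind in $K\sigma$ is what makes the truncated equation (3.24) usable.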


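Letting $\Omega_{ij} = \Omega_i - \Omega_j = \bar\Omega_{ij} + \tilde\Omega_{ij}$ and dropping the superscript ‘$z\bar z$’ for brevity, (3.23), (3.24), (3.25) become

\begin{align}
-i\partial_\omega F_{ij} + \bar\Omega_{ij} F_{ij} + \tilde\Omega_{ij} F_{ij} &= -iR_{ij}, & 0 < |i^d - j^d| < C_0 K, \tag{3.26}\\
-i\partial_\omega F_{ij} + \bar\Omega_{ij} F_{ij} + \Gamma_K(\tilde\Omega_{ij} F_{ij}) &= -i\Gamma_K R_{ij}, & |i^d - j^d| \ge C_0 K, \tag{3.27}
\end{align}

\[ \hat R_{ij} = \begin{cases} 0, & |i^d - j^d| < C_0 K, \\ (1 - \Gamma_K)(-i\tilde\Omega_{ij} F_{ij} + R_{ij}), & |i^d - j^d| \ge C_0 K. \end{cases} \tag{3.28} \]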

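We are now in a position to solve the homological equations (3.26), (3.27) by using the following two lemmas, which have been proved in [13] as Theorem 1.4 and Lemma 2.6 respectively:

Lemma 3.1 ([13]). Consider the first order partial differential equation

\[ -i\partial_\omega u + \lambda u + \mu(x) u = p(x), \qquad x \in \mathbb{T}^n, \tag{3.29} \]

for the unknown function $u$ defined on the torus $\mathbb{T}^n$, where $\omega = (\omega_1, \cdots, \omega_n) \in \mathbb{R}^n$ and $\lambda \in \mathbb{C}$. Assume

(1) There are constants $\alpha, \tilde\gamma > 0$ and $\tau > n$ such that

\begin{align}
|k\cdot\omega| &\ge \frac{\alpha}{|k|^\tau}, & k \in \mathbb{Z}^n\setminus\{0\}, \tag{3.30}\\
|k\cdot\omega + \lambda| &\ge \frac{\alpha\tilde\gamma}{1 + |k|^\tau}, & k \in \mathbb{Z}^n. \tag{3.31}
\end{align}

(2) $\mu : D(s) \to \mathbb{C}$ is real analytic (here ‘real’ means $\mu(\mathbb{T}^n) \subset \mathbb{R}$) and is of zero average: $[\mu] = 0$. Moreover, assume there is a constant $C > 0$ such that

\[ |\mu|_{s,\tau+1} \le C\tilde\gamma. \tag{3.32} \]

(3) $p(x)$ is analytic in $x \in D(s)$.

Then (3.29) has a unique solution $u(x)$ which is defined in a narrower domain $D(s - \sigma)$ with $0 < \sigma < s$, and which satisfies

\[ \sup_{x\in D(s-\sigma)} |u(x)| \le \frac{c(n,\tau)}{\alpha\tilde\gamma\sigma^{n+\tau}}\, e^{2C\tilde\gamma s/\alpha} \sup_{x\in D(s)} |p(x)| \tag{3.33} \]

for $0 < \sigma < \min\{1, s\}$, where the constant $c(n,\tau) = (6e + 6)^n\bigl[1 + (\tfrac{3\tau}{e})^\tau\bigr]$.

Lemma 3.2 ([13]). Consider the first order partial differential equation with the truncation operator $\Gamma_K$,

\[ -i\partial_\omega u + \lambda u + \Gamma_K(\mu u) = \Gamma_K p, \qquad x \in \mathbb{T}^n, \tag{3.34} \]

for the unknown function $u$ defined on the torus $\mathbb{T}^n$, where $\omega \in \mathbb{R}^n$, $0 \ne \lambda \in \mathbb{C}$, and $0 < 2K|\omega| \le |\lambda|$. Assume that $\mu$ is real analytic in $x \in D(s)$ with

\[ \sum_{k\in\mathbb{Z}^n} |\hat\mu_k| e^{|k|s} \le \frac{|\lambda|}{4\iota} \tag{3.35} \]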


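for some constant $\iota \ge 1$, and assume $p(x)$ is analytic in $x \in D(s)$. Then (3.34) has a unique solution $u(x)$ with $u = \Gamma_K u$ and

\begin{align}
\sup_{x\in D(s-\sigma)} |u(x)| &\le \frac{c(n)}{|\lambda|\sigma^n} \sup_{x\in D(s)} |p(x)|, \tag{3.36}\\
\sup_{x\in D(s-\sigma)} |(1 - \Gamma_K)(\mu u)(x)| &\le \frac{c(n)}{\iota\sigma^n}\, e^{-9K\sigma/10} \sup_{x\in D(s)} |p(x)| \tag{3.37}
\end{align}

for $0 < \sigma < s$, where the constant $c(n) = 4(20e + 20)^n$.

Set $0 < \sigma < \min\{1, s/5\}$. In what follows the notation $a \lessdot b$ stands for “there exists a positive constant $c$ such that $a \le cb$, where $c$ can only depend on $n$, $\tau$.” First, let us consider (3.26) for $(i, j)$ with $0 < |i^d - j^d| < C_0 K$. From (3.20), (3.21) we get

\begin{align}
|\langle k, \omega(\xi)\rangle| &\ge \frac{\alpha}{|k|^\tau}, & k \in \mathbb{Z}^n\setminus\{0\}, \tag{3.38}\\
|\langle k, \omega(\xi)\rangle + \bar\Omega_{ij}| &\ge \frac{\alpha|i^d - j^d|}{1 + |k|^\tau}, & k \in \mathbb{Z}^n. \tag{3.39}
\end{align}

From (3.22) we get

\[ |\tilde\Omega_{ij}|_{s,\tau+1} \le \alpha\gamma_0(i^\delta + j^\delta) \le 2\alpha\gamma_0|i^d - j^d|. \tag{3.40} \]

Applying Lemma 3.1 to (3.26), we have

\[ |F_{ij}|_{D(s-2\sigma)} \lessdot \frac{1}{\alpha|i^d - j^d|\sigma^{n+\tau}}\, e^{4\gamma_0|i^d - j^d|s}\, |R_{ij}|_{D(s-\sigma)}. \tag{3.41} \]

In view of $|i^d - j^d| < C_0 K$, we get

\[ |F_{ij}|_{D(s-2\sigma)} \lessdot \frac{1}{\alpha|i^d - j^d|\sigma^{n+\tau}}\, e^{4C_0\gamma_0 K s}\, |R_{ij}|_{D(s-\sigma)}. \tag{3.42} \]

Then, let us consider (3.27) for $(i, j)$ with $|i^d - j^d| \ge C_0 K$. From (3.22), (3.21) we get

\[ |\tilde\Omega_{ij}|_{s,0} \le |\tilde\Omega_{ij}|_{s,\tau+1} \le \alpha\gamma_0(i^\delta + j^\delta) \le \frac{2\alpha\gamma_0|\bar\Omega_{ij}|}{m|i - j|} \le \frac{|\bar\Omega_{ij}|}{4|i - j|}. \tag{3.43} \]

Now applying Lemma 3.2 to (3.27), we have

\begin{align}
|F_{ij}|_{D(s-2\sigma)} &\lessdot \frac{1}{m|i^d - j^d|\sigma^n}\, |R_{ij}|_{D(s-\sigma)}, \tag{3.44}\\
|(1 - \Gamma_K)(\tilde\Omega_{ij} F_{ij})|_{D(s-2\sigma)} &\lessdot \frac{1}{|i - j|\sigma^n}\, e^{-9K\sigma/10}\, |R_{ij}|_{D(s-\sigma)}. \tag{3.45}
\end{align}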


For a bounded linear operator from a, p to a,q , define its operator norm by · a,q, p . As in Lemma 19.1 of [11], in view of (3.42) and (3.44), using Lemma 9.1 below, we get the estimates of F z z¯ : F z z¯ a, p, p,D(s−2σ ) , F z z¯ a,q,q,D(s−2σ )

1

e4C0 γ0 K s R z z¯ a,q, p,D(s) ασ 2n+τ 1 e4C0 γ0 K s X R r,a,q,D(s,r ) . ασ 2n+τ (3.46)

Multiplying by z, z¯ we then get 1 | F z z¯ z, z¯ | D(s−2σ,r ) ≤ F z z¯ a, p, p,D(s−2σ ) , r2

(3.47)

and finally by Cauchy’s estimate we have X F z z¯ z,¯z r,a, p,D(s−3σ,r )

1 e4C0 γ0 K s X R r,a,q,D(s,r ) . ασ 2n+τ +1

(3.48)

To obtain the estimate of the Lipschitz semi-norm, we proceed as follows. Shortening ξ ζ as and applying it to (3.26) and (3.27), one gets that, for (i, j) with 0 < |i d − j d | < C0 K , ¯ i j Fi j + ˜ i j Fi j = −i∂ ω Fi j − ( i j )Fi j + i Ri j := Q i j , i∂ω ( Fi j ) +

(3.49)

and that, for (i, j) with |i d − j d | ≥ C0 K , ¯ i j Fi j + K ( ˜ i j Fi j ) i∂ω ( Fi j ) + = −i∂ ω Fi j − k (( i j )Fi j − i Ri j ) := Q i j .

(3.50)

For 0 < |i d − j d | < C0 K , we have | ω| |Fi j | D(s−2σ ) +(i δ + j δ )| |−δ,D(s) |Fi j | D(s−3σ ) +| Ri j | D(s−3σ ) σ e4C0 γ0 K s (| ω| + | |−δ,D(s) )|Ri j | D(s−σ ) + | Ri j | D(s−σ ) ασ n+τ +1 e4C0 γ0 K s | ω| + | |−δ,D(s) |Ri j | D(s−σ ) + | Ri j | D(s−σ ) . (3.51) n+τ +1 σ α

|Q i j | D(s−3σ ) ≤

Again applying Lemma 3.1 to (3.49), we have | ω|+| |−δ,D(s) e8C0 γ0 K s | Fi j | D(s−4σ ) d |Ri j | D(s−σ ) +| Ri j | D(s−σ ) . α|i − j d |σ 2n+2τ +1 α (3.52) For |i d − j d | ≥ C0 K , we have | ω| + | |−δ,D(s) 1 |Ri j | D(s−σ ) + | Ri j | D(s−σ ) . |Q i j | D(s−3σ ) 3n σ m

(3.53)


Again applying Lemma 3.2 to (3.50), we have, | Fi j | D(s−4σ ) | ω| + | |−δ,D(s) 1 |Ri j | D(s−σ ) + | Ri j | D(s−σ ) , m|i d − j d |σ 4n m ˜ i j Fi j )| D(s−4σ ) |(1 − K )( −9K σ/10 | ω| + | |−δ,D(s) e |Ri j | D(s−σ ) + | Ri j | D(s−σ ) . |i − j|σ 4n m

(3.54)

(3.55)

In view of (3.52) and (3.54), applying Lemma 9.1 below again, we get the estimates of F z z¯ : F z z¯ a, p, p,D(s−4σ ) , F z z¯ a,q,q,D(s−4σ ) | ω| + | |−δ,D(s) z z¯ e8C0 γ0 K s 3n+2τ +1 R a,q, p,D(s) + R z z¯ a,q, p,D(s) . (3.56) ασ α Dividing by |ξ − ζ | = 0 and taking the supremum over , we get F z z¯ a, p, p,D(s−4σ )× , F z z¯ a,q,q,D(s−4σ )×

M z z¯ e8C0 γ0 K s z z¯ lip R a,q, p,D(s)× + R a,q, p,D(s)× , 3n+2τ +1 ασ α lip

lip

lip

(3.57)

lip

where M := |ω| + ||−δ,D(s)× . Thus, in the same way as (3.48), we get lip

X F z z¯ z,¯z r,a, p,D(s−5σ,r )×

M e8C0 γ0 K s lip X R r,a,q,D(s,r )× + X R r,a,q,D(s,r )× . 3n+2τ +2 ασ α

(3.58)

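For $\lambda \ge 0$, define $\|\cdot\|^\lambda = \|\cdot\| + \lambda\|\cdot\|^{\mathrm{lip}}$. The symbol ‘$\lambda$’ in $\|\cdot\|^\lambda$ will always be used in this role and never has the meaning of exponentiation. Set $0 \le \lambda \le \alpha/M$. From (3.48), (3.58) we get

\[ \|X_{\langle F^{z\bar z} z, \bar z\rangle}\|^\lambda_{r,a,p,D(s-5\sigma,r)\times\Pi} \lessdot \frac{e^{8C_0\gamma_0 K s}}{\alpha\sigma^{3n+2\tau+2}}\, \|X_R\|^\lambda_{r,a,q,D(s,r)\times\Pi}. \tag{3.59} \]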

Now considering the homological equations (3.13) (3.14), by a standard approach in finite dimensional KAM theory, we can easily get X F x r,a, p,D(s−σ,r ) , X F y ,y r,a, p,D(s−σ,r ) lip

1 X R r,a,q,D(s,r ) , (3.60) ασ τ +n

lip

X F x r,a, p,D(s−2σ,r )× , X F y ,y r,a, p,D(s−2σ,r )×

1 M lip ( X R r,a,q,D(s,r )× + X R r,a,q,D(s,r )× ). ασ 2τ +2n+1 α

(3.61)


From (3.60) (3.61) we get λ λ X F x r,a, p,D(s−2σ,r )× , X F y ,y r,a, p,D(s−2σ,r )×

1 λ X R r,a,q,D(s,r )× . ασ 2τ +2n+1 (3.62)

For the other terms of F, i.e. F z , z, F z¯ , z¯ , F zz z, z, F z¯ z¯ z¯ , z¯ , the same results - even better - than (3.59) can be obtained. Thus, we finally get the estimate for F: λ X F r,a, p,D(s−5σ,r )×

e8C0 γ0 K s λ X R r,a,q,D(s,r )× . ασ 3n+2τ +2

(3.63)

4. The New Hamiltonian From (3.4)–(3.12) we get the new Hamiltonian H ◦ = N+ + P+ ,

(4.1)

where N+ = (3.4) and

1 ˆ ˆ + t R, F} ◦ X t dt + (P − R) ◦ X 1F , P+ = R + {(1 − t)( Nˆ + R) F

(4.2)

0

where Rˆ = (3.7)+· · ·+(3.11) := Rˆ z , z+ Rˆ z¯ , z¯ + Rˆ zz z, z+ Rˆ z¯ z¯ z¯ , z¯ + Rˆ z z¯ z, z¯ . The aim of this section is to estimate the new normal form N+ and the new perturbation P+ . 4.1. The new normal form. In view of (3.4), denote N+ = N + Nˆ with ˆ j z j z¯ j , Nˆ = ω, ˆ y + j≥1

where ωˆ := [R y ],

(4.3)

˜ j , F y . ˆ j := R j j + ∂x j , F y = R j j + ∂x

(4.4)

From (4.3) we easily get λ |ω| ˆ λ X R r,a,q,D(s,r )× .

(4.5)

ˆ = ( ˆ j : j ≥ 1). In view of the second estimate of In the following, we estimate (3.60), ˜ j , F y | D(s−σ ) ≤ | ˜ j |s,τ +1 X F y ,y r,a, p,D(s−σ,r ) | ∂x Thus, together with |R j j | D(s−σ ) ≤ j δ X R r,a,q,D(s,r ) ,

γ0 j δ X R r,a,q,D(s,r ) . σ τ +n


we get ˆ −δ,D(s−σ ) ||

1 σ τ +n

X R r,a,q,D(s,r ) .

(4.6)

ˆ j , we have Applying to ˜ j , F y . ˜ j , F y − ∂x ˆ j = R j j − ∂x

(4.7)

Since | R j j | D(s−2σ ) ≤ j δ X R r,a,q,D(s,r ) , 1 ˜ j , F y | D(s−2σ ) ≤ | j | D(s−σ ) X F y ,y r,a, p,D(s−2σ,r ) | ∂x σ | |−δ,D(s) j δ X R r,a,q,D(s,r ) , ασ τ +n+1 ˜ j , F y | D(s−2σ ) ≤ | ˜ j |s,τ +1 X F y ,y r,a, p,D(s−2σ,r ) | ∂x ≤ αγ0 j δ X F y ,y r,a, p,D(s−2σ,r ) , we get ˆ || −δ,D(s−2σ )× lip

1 M lip ( X R r,a,q,D(s,r )× + X R r,a,q,D(s,r )× ). (4.8) σ 2n+2τ +1 α

Therefore, from (4.6) (4.8) we get ˆ λ−δ,D(s−2σ )× ||

1 λ X R r,a,q,D(s,r )× . σ 2n+2τ +1

(4.9)

4.2. The new perturbation. We firstly estimate the error term Rˆ z z¯ with its matrix elements Rˆ i j in (3.28). Split Rˆ z z¯ into three parts: Rˆ z z¯ = S 1 + S 2 + S 3 , such that S 1 , S 2 have their matrix elements as follows: 0, i f |i d − j d | < C0 K , 1 Si j = (4.10) ˜ i j Fi j ), i f |i d − j d | ≥ C0 K , (1 − K )(−i −(1 − K )(Ri j ), i f 0 < |i d − j d | < C0 K , Si2j = (4.11) 0, i f i = j or |i d − j d | ≥ C0 K , and S 3 is the cut-off of the perturbation R z z¯ , that is, S 3 = (1 − K )(R z z¯ − diag(R z z¯ )).

(4.12)

In view of (3.45), and using Lemma 9.1 below, we get S 1 a,q, p,D(s−2σ )

e−9K σ/10 z z¯ R a,q, p,D(s) . σ 2n

(4.13)


From (3.55) and e−9K σ/10 δ (i + j δ )| |−δ,D(s−2σ ) |Fi j | D(s−2σ ) σ 2n e−9K σ/10 | |−δ,D(s−2σ ) |Ri j | D(s−σ ) , (4.14) m|i − j|σ 3n

˜ i j )Fi j )| D(s−3σ ) |(1 − K )((

by using Lemma 9.1 below, we get lip

S 1 a,q, p,D(s−4σ )×

e−9K σ/10 σ 5n

M z z¯ lip R a,q, p,D(s)× + R z z¯ a,q, p,D(s)× . m (4.15)

Since Si2j D(s−2σ )

e−9K σ/10 Ri j D(s−σ ) , σn

(4.16)

by Lemma 9.1 below, we get e−9K σ/10 max{|i − j| : |i 2 − j 2 | < C0 K }R z z¯ a,q, p,D(s) σ 2n C0 K e−9K σ/10 z z¯ R a,q, p,D(s) σ 2n C0 e−4K σ/5 z z¯ R a,q, p,D(s) . (4.17) σ 2n+1

S 2 a,q, p,D(s−2σ )

Again applying Lemma 9.1 to S 2 , we can get lip

S 2 a,q, p,D(s−4σ )×

C0 e−4K σ/5 z z¯ lip R a,q, p,D(s)× . σ 2n+1

(4.18)

It is obvious that e−9K σ/10 z z¯ R a,q, p,D(s) , σn e−9K σ/10 z z¯ lip R a,q, p,D(s)× . σn

S 3 a,q, p,D(s−σ ) lip

S 3 a,q, p,D(s−σ )×

(4.19) (4.20)

Thus (1 + C0 )e−4K σ/5 X R r,a,q,D(s,r ) , (4.21) σ 2n+2 (1 + C0 )e−4K σ/5 5n+1 σ M lip X R r,a,q,D(s,r )× + X R r,a,q,D(s,r )× . × α (4.22)

X Rˆ z z¯ z,¯z r,a,q,D(s−3σ,r ) lip

X Rˆ z z¯ z,¯z r,a,q,D(s−5σ,r )×


Therefore, from (4.21) (4.22) we get λ X Rˆ z z¯ z,¯z r,a,q,D(s−5σ,r )×

(1 + C0 )e−4K σ/5 λ X R r,a,q,D(s,r )× . σ 5n+1

(4.23)

ˆ i.e. Rˆ z , z, Rˆ z¯ , z¯ , Rˆ zz z, z, Rˆ z¯ z¯ z¯ , z¯ , the same results For the other terms of R, even better - than (4.23) can be obtained. Thus, we finally get the estimate for the error ˆ term R: λ X Rˆ r,a,q,D(s−5σ,r )×

(1 + C0 )e−4K σ/5 λ X R r,a,q,D(s,r )× . σ 5n+1

(4.24)

ˆ + t R, Now consider the new perturbation (4.2). By setting R(t) = (1 − t)( Nˆ + R) we have

1 X P+ = X Rˆ + (X tF )∗ [X R(t) , X F ]dt + (X 1F )∗ (X P − X R ). (4.25) 0

We assume that λ X P r,a,q,D(s,r )× ≤

αη2 −8C0 γ0 K s e , Bσ

(4.26)

for 0 ≤ λ ≤ α/M with some 0 < η < 1/16 and 0 < σ < min{s, 1}, where Bσ = cσ −9(n+τ +1) with c being a sufficiently large constant depending only on n, τ and |ω| . Since R is 2-order Taylor polynomial truncation in y, z, z¯ of P, we can obtain λ λ X R r,a,q,D(s,r )× X P r,a,q,D(s,r )× ,

X P −

X R ληr,a,q,D(s,4ηr )×

λ ηX P r,a,q,D(s,r )× .

(4.27) (4.28)

As in Lemma 19.3 of [11], we obtain λ λ D X F r,a, p, p,D(s−6σ,r )× , D X F r,a,q,q,D(s−6σ,r )×

e8C0 γ0 K s λ X R r,a,q,D(s,r )× . ασ 3n+2τ +3

(4.29)

From (3.63), (4.5), (4.9), (4.24), (4.29) and (4.27) we get e8C0 γ0 K s λ X P r,a,q,D(s,r )× , ασ 3n+2τ +2 1 λ λ X Nˆ r,a,q,D(s−2σ,r )× 2n+2τ +1 X P r,a,q,D(s,r )× , σ (1 + C0 )e−4K σ/5 λ λ X P r,a,q,D(s,r X Rˆ r,a,q,D(s−5σ,r )× )× , σ 5n+1 λ λ D X F r,a, p, p,D(s−6σ,r )× , D X F r,a,q,q,D(s−6σ,r )×

λ X F r,a, p,D(s−5σ,r )×

e8C0 γ0 K s λ X P r,a,q,D(s,r )× . ασ 3n+2τ +3

(4.30) (4.31) (4.32)

(4.33)


Moreover, together with the smallness assumptions (4.26), by properly choosing c, we get λ λ X F r,a, p,D(s−5σ,r )× , D X F r,a, p, p,D(s−6σ,r )× , λ D X F r,a,q,q,D(s−6σ,r )× ≤

η2 σ c0

(4.34)

with some suitable constant c0 ≥ 1. Then the flow X tF of the vector field X F exists on D(s − 7σ, r/2) for −1 ≤ t ≤ 1 and takes this domain into D(s − 6σ, r ). Similarly, it takes D(s − 8σ, r/4) into D(s − 7σ, r/2). In the same way as (20.6) in [11], we obtain λ λ X tF − idr,a, p,D(s−7σ,r/2)× X F r,a, p,D(s−6σ,r )× , λ D X tF − I r,a, p, p,D(s−8σ,r/4)×

t λ D X F − I r,a,q,q,D(s−8σ,r/4)×

λ D X F r,a, p, p,D(s−6σ,r )× , λ D X F r,a,q,q,D(s−6σ,r )× .

(4.35) (4.36) (4.37)

Also in the same way as (20.7) in [11], we obtain that for any vector field Y , (D X tF )∗ Y ληr,a,q,D(s−9σ,ηr )× Y ληr,a,q,D(s−7σ,4ηr )× .

(4.38)

From (4.27), (4.31), (4.32) and the assumption (1 + C0 )e−4K σ/5 1, we get λ X R(t) r,a,q,D(s−5σ,r )×

1 λ X P r,a,q,D(s,r )× . σ 3n+2τ +1

(4.39)

Moreover, we have λ [X R(t) , X F ]r,a,q,D(s−6σ,r/2)×

λ λ D X R(t) r,a,q, p,D(s−6σ,r/2)× X F r,a, p,D(s−6σ,r/2)×

λ λ +D X F r,a,q,q,D(s−6σ,r/2)×

X R(t) r,a,q,D(s−6σ,r/2)×

2 e8C0 γ0 K s λ X P r,a,q,D(s,r )× . 6n+4τ +4 ασ

(4.40)

Hence, also [X R(t) , X F ]ληr,a,q,D(s−6σ,r/2)×

2 e8C0 γ0 K s λ X P r,a,q,D(s,r )× . 2 6n+4τ +4 αη σ

(4.41)

Together with the estimates of Rˆ in (4.32) and X P − X R in (4.28), we finally arrive at the estimate X P+ ληr,a,q,D(s−9σ,ηr )×

1 Bσ e8C0 γ0 K s Bσ e−4K σ/5 λ λ ≤ X + + η X P r,a,q,D(s,r P r,a,q,D(s,r )×

)× . 3 αη2 αη2 (4.42) This is the bound for the new perturbation.


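5. Iteration and Convergence

Set $\beta' = \min\{\frac{\beta}{1+\beta}, \frac{1}{4}\}$ and $\kappa = \frac{4}{3} - \frac{\beta'}{3}$. Now we give the precise set-up of iteration parameters. Let $\nu \ge 0$ be the $\nu$th KAM step.

$\alpha_\nu = \frac{\alpha_0}{10}(9 + 2^{-\nu})$, which is used to dominate the measure of removed parameters,
$m_\nu = \frac{m_0}{10}(9 + 2^{-\nu})$, which is used for describing the growth of external frequencies,
$E_\nu = \frac{E_0}{9}(10 - 2^{-\nu})$, which is used to dominate the norm of internal frequencies,
$M_{1,\nu} = \frac{M_{1,0}}{9}(10 - 2^{-\nu})$, $M_{2,\nu} = \frac{M_{2,0}}{9}(10 - 2^{-\nu})$, $M_\nu = M_{1,\nu} + M_{2,\nu}$, which are used to dominate the Lipschitz semi-norm of frequencies,
$L_\nu = \frac{L_0}{9}(10 - 2^{-\nu})$, which is used to dominate the inverse Lipschitz semi-norm of internal frequencies,
$J_0 = \gamma_0^{-\frac{1}{\tau+1}}$, $J_\nu = J_0^{\kappa^\nu}$, which is used for the estimate of measure,
$s_\nu = s_0/2^\nu$, which dominates the width of the angle variable $x$,
$\sigma_\nu = s_\nu/20$, which serves as a bridge from $s_\nu$ to $s_{\nu+1}$,
$B_\nu = B_{\sigma_\nu} := c\sigma_\nu^{-9(n+\tau+1)}$; here $c$ is a large constant only depending on $n$, $\tau$ and $E_0$,
$\varepsilon_\nu = \bigl(\varepsilon_0 \prod_{\mu=0}^{\nu-1} (\frac{B_\mu}{\alpha_\mu})^{\frac{1}{3\kappa^{\mu+1}}}\bigr)^{\kappa^\nu}$, which dominates the size of the perturbation $P_\nu$ in the $\nu$th KAM iteration,
$K_\nu = 5|\ln\varepsilon_\nu|/(4\sigma_\nu)$, which is the length of the truncation of Fourier series,
$\eta_\nu^3 = \varepsilon_\nu^{1-\beta'}\alpha_\nu^{-1} B_\nu$, $r_{\nu+1} = \eta_\nu r_\nu$, $D_\nu = D(s_\nu, r_\nu)$, $\lambda_\nu = \frac{\alpha_\nu}{M_\nu}$.

5.1. Iterative Lemma.

Lemma 5.1. Suppose that

\[ \varepsilon_0 \le \Bigl(\frac{\alpha_0\gamma_0}{80}\Bigr)^{\frac{1}{1-\beta'}} \prod_{\mu=0}^{\infty} B_\mu^{-\frac{1}{3\kappa^{\mu+1}}}, \qquad \alpha_0 \le \frac{m_0}{10}. \tag{5.1} \]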

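Suppose $H_\nu = N_\nu + P_\nu$ is regular on $D_\nu \times \Pi_\nu$, where $N_\nu$ is a generalized normal form with coefficients satisfying

\begin{align}
|\langle k, \omega_\nu(\xi)\rangle + \langle l, \bar\Omega_\nu(\xi)\rangle| &\ge \alpha_\nu\frac{\langle l\rangle_d}{\langle k\rangle^\tau}, & k \ne 0,\ |l| \le 2, \tag{5.2}\\
|\langle l, \bar\Omega_\nu(\xi)\rangle| &\ge m_\nu\langle l\rangle_d, & 0 < |l| \le 2, \tag{5.3}\\
|\tilde\Omega_{\nu,j}|_{s_\nu,\tau+1} &\le (\alpha_0 - \alpha_\nu)\gamma_0 j^\delta, & j \ge 1, \tag{5.4}
\end{align}

\begin{align}
|\omega_\nu|_{\Pi_\nu} \le E_\nu, \qquad |\omega_\nu|^{\mathrm{lip}}_{\Pi_\nu} \le M_{1,\nu}, \qquad |\omega_\nu^{-1}|^{\mathrm{lip}}_{\omega_\nu(\Pi_\nu)} &\le L_\nu, \tag{5.5}\\
|\Omega_\nu|^{\mathrm{lip}}_{-\delta,D_\nu\times\Pi_\nu} &\le M_{2,\nu}, \tag{5.6}
\end{align}

on $\Pi_\nu$, and $P_\nu$ satisfies

\[ \|X_{P_\nu}\|^{\lambda_\nu}_{r_\nu,a,q,D_\nu\times\Pi_\nu} \le \varepsilon_\nu. \tag{5.7} \]

Then there exists a Lipschitz family of real analytic symplectic coordinate transformations $\Phi_{\nu+1} : D_{\nu+1}\times\Pi_\nu \to D_\nu$ satisfying

\[ \|\Phi_{\nu+1} - \mathrm{id}\|^{\lambda_\nu}_{r_\nu,a,p,D_{\nu+1}\times\Pi_\nu},\ \|D\Phi_{\nu+1} - I\|^{\lambda_\nu}_{r_\nu,a,p,p,D_{\nu+1}\times\Pi_\nu},\ \|D\Phi_{\nu+1} - I\|^{\lambda_\nu}_{r_\nu,a,q,q,D_{\nu+1}\times\Pi_\nu} \le \frac{B_\nu}{\alpha_\nu}\varepsilon_\nu^{1-\beta'}, \tag{5.8} \]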

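and a closed subset

\[ \Pi_{\nu+1} = \Pi_\nu \setminus \bigcup_{|k|>J_\nu,\,|l|\le 2} \mathcal{R}^{\nu+1}_{kl}(\alpha_{\nu+1}), \tag{5.9} \]

where

\[ \mathcal{R}^{\nu+1}_{kl}(\alpha) = \Bigl\{\xi \in \Pi_\nu : |\langle k, \omega_{\nu+1}(\xi)\rangle + \langle l, \bar\Omega_{\nu+1}(\xi)\rangle| < \alpha\frac{\langle l\rangle_d}{\langle k\rangle^\tau}\Bigr\}, \tag{5.10} \]

such that for $H_{\nu+1} = H_\nu\circ\Phi_{\nu+1} = N_{\nu+1} + P_{\nu+1}$, the estimate

\[ |\omega_{\nu+1} - \omega_\nu|^{\lambda_\nu}_{\Pi_\nu},\ |\Omega_{\nu+1} - \Omega_\nu|^{\lambda_\nu}_{-\delta,D_{\nu+1}\times\Pi_\nu} \le B_\nu\varepsilon_\nu \tag{5.11} \]

holds and the same assumptions as above are satisfied with ‘$\nu + 1$’ in place of ‘$\nu$’.

Proof. Setting $C_{0,\nu} = 2E_\nu/m_\nu$, it is obvious that $C_{0,\nu} \le 4C_{0,0}$. Thus we have

\[ e^{8C_{0,\nu}\gamma_0 K_\nu s_\nu} \le \varepsilon_\nu^{-\beta'} \tag{5.12} \]

by $K_\nu s_\nu = 20K_\nu\sigma_\nu = 25|\ln\varepsilon_\nu|$ and choosing $\gamma_0$ small enough such that $800C_{0,0}\gamma_0 \le \beta'$. In view of the definition of $\eta_\nu$, namely $\eta_\nu^3 = \varepsilon_\nu^{1-\beta'}\alpha_\nu^{-1}B_\nu$, the smallness condition (4.26), namely

\[ \varepsilon_\nu \le \frac{\alpha_\nu\eta_\nu^2}{B_\nu}\, e^{-8C_{0,\nu}\gamma_0 K_\nu s_\nu}, \]

is satisfied if

\[ \varepsilon_\nu^{1-\beta'} \le \frac{\alpha_\nu}{B_\nu}. \tag{5.13} \]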

To verify the last inequality we argue as follows. As Bν and αν−1 are increasing with ν, Bν αν

1 1−β

=

Bν αν

1 3(κ−1)

=

∞ ∞ 1 ν Bν 1 κ ν Bμ μ+1 κ ( ) 3κ μ+1 ≤ ( ) 3κ . α α ν μ μ=ν μ=ν

(5.14)

By the definition of εν above, the bound αν ≥ 9α0 /10 and the smallness condition on ε0 in (5.1), εν1−β

∞ Bμ 1 κ ν (1−β ) γ0 κ ν Bν ≤ ε0 ( ) 3κ μ+1 ≤ ≤ 1. αν αμ 72

(5.15)

μ=0

So the smallness condition (4.26) is satisfied for each ν ≥ 0. In particular, noticing κ ≥ 5/4, we have εν1−β

Bν γ0 ≤ ν+6 . αν 2

(5.16)

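Now there exists a coordinate transformation $\Phi_{\nu+1}: D_{\nu+1}\times\Pi_\nu \to D_\nu$ taking $H_\nu$ into $H_{\nu+1}$. Moreover, (5.8) is obtained by (4.30), (4.33), (4.35)–(4.37), and (5.11) is obtained by (4.31). More explicitly, (5.11) is written as
\[ |\omega_{\nu+1}-\omega_\nu|_{\Pi_\nu},\ |\Omega_{\nu+1}-\Omega_\nu|_{-\delta,D_{\nu+1}\times\Pi_\nu} \le B_\nu\varepsilon_\nu, \tag{5.17} \]
\[ |\omega_{\nu+1}-\omega_\nu|^{\mathrm{lip}}_{\Pi_\nu},\ |\Omega_{\nu+1}-\Omega_\nu|^{\mathrm{lip}}_{-\delta,D_{\nu+1}\times\Pi_\nu} \le \frac{M_\nu}{\alpha_\nu}B_\nu\varepsilon_\nu. \tag{5.18} \]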


J. Liu, X. Yua

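In view of (5.16)–(5.18), by choosing $\gamma_0$ properly small, (5.3)–(5.6) are satisfied with ‘ν + 1’ in place of ‘ν’. As to the diophantine conditions, in view of the definition of $\Pi_{\nu+1}$ in (5.9), only the case $|k| \le J_\nu$ remains to be verified. In this case, by the definition of $J_\nu$, we have
\[ B_\nu\varepsilon_\nu \le \frac{\alpha_\nu\gamma_0^{\kappa^\nu}}{2^{\nu+6}} \le \frac{\alpha_\nu-\alpha_{\nu+1}}{3J_\nu^{\tau+1}}. \tag{5.19} \]
Hence, for $k \ne 0$ and $|k| \le J_\nu$,
\[ \begin{aligned} |\langle k,\omega_{\nu+1}-\omega_\nu\rangle + \langle l,\bar\Omega_{\nu+1}-\bar\Omega_\nu\rangle| &\le |k|\,|\omega_{\nu+1}-\omega_\nu| + 2\langle l\rangle_d\,|\bar\Omega_{\nu+1}-\bar\Omega_\nu|_{-\delta} \\ &\le 3|k|\langle l\rangle_d B_\nu\varepsilon_\nu \\ &\le (\alpha_\nu-\alpha_{\nu+1})\frac{|k|\langle l\rangle_d}{J_\nu^{\tau+1}} \\ &\le (\alpha_\nu-\alpha_{\nu+1})\frac{\langle l\rangle_d}{\langle k\rangle^{\tau}} \end{aligned} \tag{5.20} \]
on $D_{\nu+1}\times\Pi_\nu$. This completes the proof of (5.2) with ‘ν + 1’ in place of ‘ν’. On the other hand, from (4.42) we get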

1 Bν e8C0,ν γ0 K ν sν Bν e−4K ν σν /5 εν + + ην εν 2 2 3 αν ην αν ην

1 Bν 1−β Bν 1−β ≤ ε + ε + η ν εν 3 αν ην2 ν αν ην2 ν Bν 1 κ 3ε = ν αν = εν+1 . (5.21)

ν+1 X Pν+1 rλν+1 ,a,q,Dν+1 × ν ≤

This completes the proof of the iterative lemma.

5.2. Convergence. We are now in a position to prove the KAM theorem. To apply the iterative lemma with ν = 0, set $N_0 = N$, $P_0 = P$, $s_0 = s$, $r_0 = r$,

and similarly E 0 = E, L 0 = L, M1,0 = M1 , M2,0 = M2 , m 0 = m, α0 = α and λ0 = λ = α/M. Define γ in the KAM theorem by setting 1 1−β 1 − 3κ μ+1 Bμ , 80

∞

γ = γ0 γs ,

γs =

(5.22)

μ=0

where γ0 is the same parameter as before and γs only depends on n, τ , E, s, β. The smallness condition (5.1) of the iterative lemma is then satisfied by the assumption of the KAM theorem: 1

ε0 := X P0 rλ00,a,q,D0 × 0 ≤ (αγ )1+β ≤ (α0 γ0 γs ) 1−β .

(5.23)

The small divisor conditions (5.2) are satisfied by setting
\[ \Pi_0 = \Pi \setminus \bigcup_{(k,l)\ne(0,0),\,|l|\le 2} R^{0}_{kl}(\alpha_0), \tag{5.24} \]

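and the other conditions (5.3)–(5.6) about the unperturbed frequencies are obviously true. Hence, the iterative lemma applies, and we obtain a decreasing sequence of domains $D_\nu\times\Pi_\nu$ and a sequence of transformations $\Phi^\nu = \Phi_1\circ\cdots\circ\Phi_\nu : D_\nu\times\Pi_{\nu-1}\to D_0$, such that $H\circ\Phi^\nu = N_\nu+P_\nu$ for ν ≥ 1. Moreover, the estimates (5.8) and (5.11) hold. Shorten $\|\cdot\|_{r,a,p}$ as $\|\cdot\|_r$ and consider the operator norm
\[ \|L\|_{r,\tilde r} = \sup_{W\ne 0}\frac{\|LW\|_r}{\|W\|_{\tilde r}}. \]
For $r \ge \tilde r$, these norms satisfy $\|AB\|_{r,\tilde r} \le \|A\|_{r,r}\|B\|_{\tilde r,\tilde r}$, since $\|W\|_r \le \|W\|_{\tilde r}$. For ν ≥ 1, by the chain rule, using (5.8) and (5.16), we get
\[ \|D\Phi^\nu\|_{r_0,r_\nu,D_\nu\times\Pi_{\nu-1}} \le \prod_{\mu=1}^{\nu}\|D\Phi_\mu\|_{r_\mu,r_\mu,D_\mu\times\Pi_{\mu-1}} \le \prod_{\mu=1}^{\infty}\Big(1+\frac{1}{2^{\mu+6}}\Big) \le 2, \tag{5.25} \]
\[ \begin{aligned} \|D\Phi^\nu\|^{\mathrm{lip}}_{r_0,r_\nu,D_\nu\times\Pi_{\nu-1}} &\le \sum_{\mu=1}^{\nu}\|D\Phi_\mu\|^{\mathrm{lip}}_{r_\mu,r_\mu,D_\mu\times\Pi_{\mu-1}}\prod_{1\le\rho\le\nu,\,\rho\ne\mu}\|D\Phi_\rho\|_{r_\rho,r_\rho,D_\rho\times\Pi_{\rho-1}} \\ &\le 2\sum_{\mu=1}^{\nu}\|D\Phi_\mu-I\|^{\mathrm{lip}}_{r_\mu,r_\mu,D_\mu\times\Pi_{\mu-1}} \le 2\sum_{\mu=1}^{\infty}\frac{M_\mu}{\alpha_\mu}\frac{1}{2^{\mu+6}} \le \frac{M_0}{\alpha_0}. \end{aligned} \tag{5.26} \]
Thus, with the mean value theorem we obtain
\[ \|\Phi^{\nu+1}-\Phi^\nu\|_{r_0,D_{\nu+1}\times\Pi_\nu} \le \|D\Phi^\nu\|_{r_0,r_\nu,D_\nu\times\Pi_{\nu-1}}\|\Phi_{\nu+1}-\mathrm{id}\|_{r_\nu,D_{\nu+1}\times\Pi_\nu} \le 2\|\Phi_{\nu+1}-\mathrm{id}\|_{r_\nu,D_{\nu+1}\times\Pi_\nu}, \tag{5.27} \]
\[ \begin{aligned} \|\Phi^{\nu+1}-\Phi^\nu\|^{\mathrm{lip}}_{r_0,D_{\nu+1}\times\Pi_\nu} &\le \|D\Phi^\nu\|^{\mathrm{lip}}_{r_0,r_\nu,D_\nu\times\Pi_{\nu-1}}\|\Phi_{\nu+1}-\mathrm{id}\|_{r_\nu,D_{\nu+1}\times\Pi_\nu} + \|D\Phi^\nu\|_{r_0,r_\nu,D_\nu\times\Pi_{\nu-1}}\|\Phi_{\nu+1}-\mathrm{id}\|^{\mathrm{lip}}_{r_\nu,D_{\nu+1}\times\Pi_\nu} \\ &\le \frac{M_0}{\alpha_0}\|\Phi_{\nu+1}-\mathrm{id}\|_{r_\nu,D_{\nu+1}\times\Pi_\nu} + 2\|\Phi_{\nu+1}-\mathrm{id}\|^{\mathrm{lip}}_{r_\nu,D_{\nu+1}\times\Pi_\nu}. \end{aligned} \tag{5.28} \]
It follows that
\[ \|\Phi^{\nu+1}-\Phi^\nu\|^{\lambda_0}_{r_0,D_{\nu+1}\times\Pi_\nu} \le 3\|\Phi_{\nu+1}-\mathrm{id}\|^{\lambda_\nu}_{r_\nu,D_{\nu+1}\times\Pi_\nu}. \tag{5.29} \]
From (5.29) and (5.8), we get
\[ \|\Phi^{\nu+1}-\Phi^\nu\|^{\lambda_0}_{r_0,D_{\nu+1}\times\Pi_\nu} \le 3\frac{B_\nu}{\alpha_\nu}\varepsilon_\nu^{1-\beta}. \tag{5.30} \]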
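The elementary bound behind the last step of (5.25), namely that the infinite product of the factors $1 + 2^{-(\mu+6)}$ stays below 2, can be checked numerically. The following is only a sanity-check sketch; the function name `product_bound` and the truncation at 200 factors are illustrative choices, not part of the paper.

```python
import math

def product_bound(terms: int = 200) -> float:
    """Partial product of (1 + 2^-(mu+6)) for mu = 1..terms."""
    prod = 1.0
    for mu in range(1, terms + 1):
        prod *= 1.0 + 2.0 ** (-(mu + 6))
    return prod

partial = product_bound()
# Since 1 + x <= exp(x) and sum_{mu>=1} 2^-(mu+6) = 1/64,
# the full product is at most exp(1/64), which is far below 2.
upper = math.exp(1.0 / 64.0)
print(partial, upper)
```

The comparison with $\exp(1/64)$ shows that the constant 2 in (5.25) is far from sharp; it is chosen for convenience.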


For every non-negative integer multi-index k = (k1 , . . . , kn ), by Cauchy’s estimate we have ∂xk (ν+1 − ν )rλ00,Dν+2 × ν ≤ 3

Bν 1−β k1 !· · ·kn ! ε , 0 αν ν ( 2sν+2 )|k|

(5.31)

the right side of which decays super-exponentially with ν. This shows that $\Phi^\nu$ converge uniformly on $D_*\times\Pi_\alpha$, where $D_* = \mathbb{T}^n\times\{0\}\times\{0\}\times\{0\}$ and $\Pi_\alpha = \bigcap_{\nu\ge0}\Pi_\nu$, to a Lipschitz continuous family of smooth torus embeddings $\Phi: \mathbb{T}^n\times\Pi_\alpha \to P^{a,p}$, for which the estimate (1.9) holds. Similarly, the frequencies $\omega_\nu$ converge uniformly on $\Pi_\alpha$ to a Lipschitz continuous limit $\omega_*$, and the frequencies $\Omega_\nu$ converge uniformly on $D_*\times\Pi_\alpha$ to a regular limit $\Omega_*$, with estimate (1.10) holding. Moreover, $X_H\circ\Phi = D\Phi\cdot X_{N_*}$ on $D_*$ for each $\xi\in\Pi_\alpha$, where $N_*$ is the generalized normal form with frequencies $\omega_*$ and $\Omega_*$. Thus, the embedded tori are invariant under the perturbed Hamiltonian flow, and the flow on them is linear. Now it only remains to prove the claim about the set $\Pi\setminus\Pi_\alpha$, which is the subject of the next section.

6. Measure Estimate

6.1. Proof of (1.8). We know
\[ \Pi\setminus\Pi_\alpha = \bigcup_{\nu\ge0}\ \bigcup_{|k|>J_{\nu-1},\,|l|\le2} R^\nu_{kl}, \tag{6.1} \]
where $J_{-1} = 0$, $J_\nu = \gamma_0^{-\kappa^\nu/(\tau+1)}$ and
\[ R^\nu_{kl} = \Big\{\xi\in\Pi_{\nu-1} : |\langle k,\omega_\nu(\xi)\rangle + \langle l,\bar\Omega_\nu(\xi)\rangle| < \alpha_\nu\frac{\langle l\rangle_d}{\langle k\rangle^\tau}\Big\}. \tag{6.2} \]
Here, $\omega_\nu$ and $\bar\Omega_\nu$ are defined and Lipschitz continuous on $\Pi_{\nu-1}$, with $\Pi_{-1} = \Pi$, and $\omega_0 = \omega$, $\Omega_0 = \Omega$ are the frequencies of the unperturbed system.

Lemma 6.1. If $\gamma_0$ is sufficiently small and $\tau \ge n+1+2/(d-1)$, then
\[ |\Pi\setminus\Pi_\alpha| \le c\alpha, \tag{6.3} \]
where c > 0 is a constant depending on n, d, E, L and m.

Proof. We only need to give the proof of the most difficult case, in which l has two non-zero components of opposite sign. In this case, rewriting $R^\nu_{kl}$ as
\[ R^\nu_{kij} = \Big\{\xi\in\Pi_{\nu-1} : |\langle k,\omega_\nu(\xi)\rangle + \bar\Omega_{\nu,i}(\xi)-\bar\Omega_{\nu,j}(\xi)| < \alpha_\nu\frac{|i^d-j^d|}{\langle k\rangle^\tau}\Big\},\quad i\ne j, \tag{6.4} \]
we only need to estimate the measure of
\[ \bigcup_{\nu\ge0}\ \bigcup_{|k|>J_{\nu-1},\,i\ne j} R^\nu_{kij}. \tag{6.5} \]

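To estimate the measure of $\bigcup_{|k|>J_{\nu-1},\,i\ne j} R^\nu_{kij}$, we introduce the perturbed frequencies $\zeta = \omega_\nu(\xi)$ as parameters over the domain $Z := \omega_\nu(\Pi_\nu)$ and consider the resonance zones $\dot R^\nu_{kij} = \omega_\nu(R^\nu_{kij})$ in Z. Regarding $\bar\Omega_\nu$ as a function of ζ, then from the iterative lemma above, we know
\[ |\zeta| \le E_\nu \le \tfrac{10}{9}E_0,\qquad |\xi|^{\mathrm{lip}}_Z \le L_\nu \le \tfrac{10}{9}L_0, \tag{6.6} \]
\[ |\bar\Omega_{\nu,i}-\bar\Omega_{\nu,j}|_Z \ge m_\nu|i^d-j^d| \ge \tfrac{9}{10}m_0|i^d-j^d|, \tag{6.7} \]
\[ |\bar\Omega_{\nu,j}|^{\mathrm{lip}}_Z \le L_\nu M_{2,\nu}j^\delta \le \big(\tfrac{10}{9}\big)^2 L_0 M_{2,0}j^\delta, \tag{6.8} \]
where $E_0, L_0, m_0, M_{2,0}$ are just $E, L, m, M_2$ in Theorem 1.1.

Now we consider a fixed $\dot R^\nu_{kij}$. If $|k| < \frac{9m_\nu}{10E_\nu}|i^d-j^d|$, we get $|\langle k,\zeta\rangle| < \frac{9m_\nu}{10}|i^d-j^d|$. In view of (6.7) and $\alpha_\nu \le \frac{m_\nu}{10}$, we know $\dot R^\nu_{kij}$ is empty.

If $|k| \ge \frac{9m_\nu}{10E_\nu}|i^d-j^d|$, we have $|k| \ge \frac12\big(\frac{9}{10}\big)^3\frac{m}{E}(i^\delta+j^\delta)$. Fix $w_1\in\{-1,1\}^n$ such that $|k| = k\cdot w_1$ and write $\zeta = aw_1+w_2$ with $w_1\perp w_2$. As a function of a, for t > s,
\[ \langle k,\zeta\rangle\big|_s^t = |k|(t-s), \tag{6.9} \]
\[ \big(|\bar\Omega_{\nu,i}-\bar\Omega_{\nu,j}|\big)\big|_s^t \le \big(\tfrac{10}{9}\big)^2 LM_2(i^\delta+j^\delta)(t-s). \tag{6.10} \]
Thus
\[ \begin{aligned} \big(\langle k,\zeta\rangle + \bar\Omega_{\nu,i}-\bar\Omega_{\nu,j}\big)\big|_s^t &\ge |k|(t-s)\Big(1-\big(\tfrac{10}{9}\big)^2 LM_2(i^\delta+j^\delta)/|k|\Big) \\ &\ge |k|(t-s)\Big(1-2\big(\tfrac{10}{9}\big)^5\frac{ELM_2}{m}\Big) \\ &\ge \tfrac{1}{10}|k|(t-s), \end{aligned} \tag{6.11} \]
by using the assumption $4ELM_2 \le m$ in Theorem 1.1 and the fact $(9/10)^6 > 1/2$. Therefore, we get
\[ |\dot R^\nu_{kij}| \le 10(\operatorname{diam} Z)^{n-1}\alpha_\nu\frac{|i^d-j^d|}{\langle k\rangle^{\tau+1}} \le 10\big(\tfrac{20}{9}E\big)^{n-1}\big(\tfrac{10}{9}\big)^3\frac{E}{m}\frac{\alpha}{\langle k\rangle^\tau}. \tag{6.12} \]
Going back to the original parameter domain $\Pi_\nu$ by the inverse frequency map $\omega_\nu^{-1}$, we get
\[ |R^\nu_{kij}| \le c_1\frac{\alpha}{\langle k\rangle^\tau}, \tag{6.13} \]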

where $c_1 = 5(10/9)^{2n+2}(2EL)^n/m$. Consequently, we have that for any $|k| > J_{\nu-1}$,
\[ \Big|\bigcup_{i\ne j}R^\nu_{kij}\Big| \le \sum_{|i^d-j^d|\le(10/9)^3(E/m)|k|}|R^\nu_{kij}| \le c_2\frac{\alpha}{\langle k\rangle^{\tau-\frac{2}{d-1}}}, \tag{6.14} \]
where $c_2 = c_1\big(2(10/9)^3(E/m)\big)^{2/(d-1)}$. Moreover, since $\tau \ge n+1+2/(d-1)$, we have
\[ \Big|\bigcup_{|k|>J_{\nu-1},\,i\ne j}R^\nu_{kij}\Big| \le c_2c_3\frac{\alpha}{1+J_{\nu-1}}, \tag{6.15} \]
where $c_3 > 0$ depends only on n. The sum of the latter inequality over all ν converges, and we finally obtain the estimate of Lemma 6.1.

6.2. Proof of Theorem 1.2. Since the proof of the case $\delta = \tilde d < d-1$ can be found in [8] or [11], we assume $\delta = \tilde d = d-1$ in the following. From Assumption (D*) we know $\bar\Omega_0 = \Omega^1+\Omega^2$ with $|\Omega^1|^{\mathrm{lip}}_{-\delta_0,\Pi} \le M_3$ and $|\Omega^2|^{\mathrm{lip}}_{-\delta,\Pi} \le M_4$. Thus in the νth KAM step, setting $\bar\Omega_\nu = \Omega^1+\Omega^2_\nu$, we know $\Omega^2_0 = \Omega^2$ and
\[ |\Omega^2_\nu|^{\mathrm{lip}}_{-\delta,\Pi} \le M_{4,\nu}, \tag{6.16} \]

where the iteration parameter M4,ν = M94 (10 − 2−ν ). In the following we consider the excluded set of parameters under Assumption (D*) instead of (D). Our aim is to verify the conclusion (1*) in Theorem 1.2. In the same way as [11], we write \ α = 1α + 2α , where 1 R0kl , (6.17) α = 0<|k|≤J0 ,|l|≤2

2 α

=

Rνkl .

(6.18)

ν≥0 |k|>max(J0 ,Jν−1 ),|l|≤2

Since |k| ≤ J0 and for each k there are only finitely many l for which R0kl is not empty, the set 1α is a finite union of resonance zones. For each of its members we know that |R0kl | → 0 as α → 0 for l = 0 by the first part of Assumption (D*), and for l = 0 by elementary volume estimates. Thus | 1α | → 0 as α → 0. In the remainder of this section we estimate the measure of 2α . Lemma 6.2. If γ0 is sufficiently small and τ ≥ n + 1 + 2/(d − 1), then |

2 α|

≤ cα,

(6.19)

where c > 0 is a constant depends on n, d, E, L and m. Proof. As in Lemma 6.1, we only need to give the proof of the most difficult case that l has two non-zero components of opposite sign. Seeing (6.4) for the definition of Rνki j , we only need to estimate the measure of Rνki j . (6.20) ν≥0 |k|>max(J0 ,Jν−1 ),i= j

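For the measure of $\bigcup_{|k|>\max(J_0,J_{\nu-1}),\,i\ne j}R^\nu_{kij}$, we introduce the perturbed frequencies $\zeta = \omega_\nu(\xi)$ as parameters over the domain $Z = \omega_\nu(\Pi_\nu)$ and consider the resonance zones $\dot R^\nu_{kij} = \omega_\nu(R^\nu_{kij})$ in Z. The estimates (6.6), (6.7) still hold true. Moreover, for $\bar\Omega_{\nu,j} = \Omega^1_j+\Omega^2_{\nu,j}$, we have the estimates
\[ |\Omega^1_j|^{\mathrm{lip}}_Z \le L_\nu M_3 j^{\delta_0} \le \tfrac{10}{9}LM_3 j^{\delta_0}, \tag{6.21} \]
\[ |\Omega^2_{\nu,j}|^{\mathrm{lip}}_Z \le L_\nu M_{4,\nu}j^\delta \le \big(\tfrac{10}{9}\big)^2 LM_4 j^\delta. \tag{6.22} \]
Let $\delta_1 = \min(\delta-\delta_0,\delta)$ and $\delta_* = \max(1,\delta_0/\delta_1)$,
\[ L_* = \big(\tfrac{10}{9}\big)^4\frac{2ELM_3}{m},\qquad J_* = \tfrac{20}{9}LM_3L_*^{\delta_*}. \]
Choose $\gamma_0$ sufficiently small so that
\[ J_0 = \gamma_0^{-\frac{1}{\tau+1}} \ge J_*, \tag{6.23} \]
and thus $J_\nu \ge J_0 \ge J_*$ for all ν ≥ 0. Now we consider a fixed $\dot R^\nu_{kij}$ with $|k| > J_*$. If $|k| < \frac{9m_\nu}{10E_\nu}|i^d-j^d|$, then $\dot R^\nu_{kij}$ is empty. If $|k| \ge \frac{9m_\nu}{10E_\nu}|i^d-j^d|$, we have
\[ |k| \ge \tfrac12\big(\tfrac{9}{10}\big)^3\frac{m}{E}(i^\delta+j^\delta). \tag{6.24} \]
Fix $w_1\in\{-1,1\}^n$ such that $|k| = k\cdot w_1$ and write $\zeta = aw_1+w_2$ with $w_1\perp w_2$. As a function of a, for t > s,
\[ \langle k,\zeta\rangle\big|_s^t = |k|(t-s), \tag{6.25} \]
\[ \big(|\Omega^1_i-\Omega^1_j|\big)\big|_s^t \le \tfrac{10}{9}LM_3(i^{\delta_0}+j^{\delta_0})(t-s), \tag{6.26} \]
\[ \big(|\Omega^2_{\nu,i}-\Omega^2_{\nu,j}|\big)\big|_s^t \le \big(\tfrac{10}{9}\big)^2LM_4(i^\delta+j^\delta)(t-s). \tag{6.27} \]
We claim
\[ \tfrac{10}{9}LM_3(i^{\delta_0}+j^{\delta_0}) \le \frac{|k|}{2}. \tag{6.28} \]
In fact,
(1) If $i^{\delta_1}+j^{\delta_1} \ge L_*$, in view of (6.24) and
\[ 2(i^{\delta_0}+j^{\delta_0})(i^{\delta_1}+j^{\delta_1}) \le i^\delta+j^\delta, \tag{6.29} \]
then (6.28) follows from
\[ |k| \ge \big(\tfrac{9}{10}\big)^3\frac{m}{E}(i^{\delta_0}+j^{\delta_0})(i^{\delta_1}+j^{\delta_1}) \ge \big(\tfrac{9}{10}\big)^3\frac{m}{E}(i^{\delta_0}+j^{\delta_0})L_* = \tfrac{20}{9}LM_3(i^{\delta_0}+j^{\delta_0}); \tag{6.30} \]
(2) If $i^{\delta_1}+j^{\delta_1} < L_*$, then
\[ \tfrac{10}{9}LM_3(i^{\delta_0}+j^{\delta_0}) \le \tfrac{10}{9}LM_3(i^{\delta_1}+j^{\delta_1})^{\delta_*} \le \tfrac{10}{9}LM_3L_*^{\delta_*} = \frac{J_*}{2} \le \frac{|k|}{2}. \tag{6.31} \]
This completes the proof of (6.28).

In view of (1.11) in Assumption (D*), (6.24) and the fact $(9/10)^6 > 1/2$, we get
\[ \big(\tfrac{10}{9}\big)^2LM_4(i^\delta+j^\delta) \le \tfrac{9}{20}|k|. \tag{6.32} \]
Thus, from (6.25)–(6.28) and (6.32),
\[ \big(\langle k,\zeta\rangle+\bar\Omega_{\nu,i}-\bar\Omega_{\nu,j}\big)\big|_s^t \ge \tfrac{1}{20}|k|(t-s). \tag{6.33} \]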
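The numerical constants in the lower bounds (6.11) and (6.33), and the identity used in the last step of (6.30), rest on short arithmetic with powers of 10/9. The sketch below checks them in exact rational arithmetic; the variable names are illustrative, and the worst case $ELM_2/m = 1/4$ is taken from the assumption $4ELM_2 \le m$.

```python
from fractions import Fraction

ratio = Fraction(10, 9)

# The fact quoted in the text: (9/10)^6 > 1/2.
assert Fraction(9, 10) ** 6 > Fraction(1, 2)

# (6.11): with E*L*M2/m <= 1/4, the factor 1 - 2*(10/9)^5 * E*L*M2/m
# is smallest when E*L*M2/m = 1/4.
worst_611 = 1 - 2 * ratio ** 5 * Fraction(1, 4)
assert worst_611 >= Fraction(1, 10)

# (6.30): (9/10)^3 * (m/E) * L_* with L_* = (10/9)^4 * 2*E*L*M3/m
# equals (20/9) * L*M3, so only the numerical factor is checked here.
factor_630 = Fraction(9, 10) ** 3 * ratio ** 4 * 2
assert factor_630 == Fraction(20, 9)

# (6.33): subtracting the bounds (6.28) and (6.32) from the main term.
slack_633 = 1 - Fraction(1, 2) - Fraction(9, 20)
print(worst_611, factor_630, slack_633)
```

The check shows that (6.11) has some slack beyond the stated 1/10, whereas the constant 1/20 in (6.33) is exactly what the two subtracted bounds leave over.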

In the same process as (6.12)–(6.15) in Lemma 6.1, we finally obtain the estimate of Lemma 6.2.

7. Application to Derivative Nonlinear Schrödinger Equation

In this section, using Theorem 1.1, we show the existence of quasi-periodic solutions for a class of derivative nonlinear Schrödinger equations subject to Dirichlet boundary conditions
\[ \mathrm{i}u_t + u_{xx} - M_\sigma u + \mathrm{i}f(u,\bar u)u_x = 0,\quad (t,x)\in\mathbb{R}\times[0,\pi],\qquad u(t,0) = 0 = u(t,\pi), \tag{7.1} \]
where $M_\sigma$ is a real Fourier multiplier,
\[ M_\sigma\sin jx = \sigma_j\sin jx,\quad \sigma_j\in\mathbb{R},\ j\ge1, \tag{7.2} \]
and f is analytic in some neighborhood of the origin in $\mathbb{C}^2$ with
\[ \overline{f(u,\bar u)} = f(u,\bar u),\qquad f(-u,-\bar u) = -f(u,\bar u). \]
We study this equation as a Hamiltonian system on some suitable phase space P. As the phase space one may take, for example, the usual Sobolev space $H^2_0([0,\pi])$. As in [10], introducing the inner product
\[ \langle u,v\rangle = \operatorname{Re}\int_0^\pi u\bar v\,dx, \tag{7.3} \]
then (7.1) can be written in Hamiltonian form
\[ \frac{\partial u}{\partial t} = -\mathrm{i}\nabla H(u), \tag{7.4} \]
\[ H(u) = \frac12\int_0^\pi|u_x|^2dx + \frac12\int_0^\pi(M_\sigma u)\bar u\,dx + \frac12\int_0^\pi g(u,\bar u)u_x\,dx, \tag{7.5} \]
where the gradient ∇ is defined with respect to $\langle\cdot,\cdot\rangle$, and $g(z_1,z_2) = -\mathrm{i}\int_0^{z_2}f(z_1,\zeta)\,d\zeta$. To write it in infinitely many coordinates, we make the ansatz
\[ u = Sq = \sum_{j\ge1}q_j\phi_j,\qquad \phi_j = \sqrt{\tfrac{2}{\pi}}\sin jx,\quad j\ge1. \tag{7.6} \]

The coordinates are taken from the Hilbert space $\ell^{a,p}_{\mathbb{C}}$ of all complex-valued sequences $q = (q_1,q_2,\cdots)$ with
\[ \|q\|^2_{a,p} = \sum_{j\ge1}|q_j|^2j^{2p}e^{2aj} < \infty. \tag{7.7} \]
We fix a ≥ 0 and p > 3/2 in the following. Then (7.4) can be rewritten as
\[ \dot q_j = -2\mathrm{i}\frac{\partial H}{\partial\bar q_j},\quad j\ge1 \tag{7.8} \]
with the Hamiltonian
\[ H(q) = \Lambda+G = \frac12\sum_{j\ge1}(j^2+\sigma_j)|q_j|^2 + \frac12\int_0^\pi g(Sq,\overline{Sq})(Sq)_x\,dx. \tag{7.9} \]
The perturbation term G has the following properties:

Lemma 7.1. For a ≥ 0 and p > 3/2, the function G is analytic in some neighborhood of the origin in $\ell^{a,p}_{\mathbb{C}}$ with real value, and the Hamiltonian vector field $X_G$ is an analytic map from some neighborhood of the origin in $\ell^{a,p}_{\mathbb{C}}$ into $\ell^{a,p-1}_{\mathbb{C}}$ with
\[ \|X_G\|_{a,p-1} = O(\|q\|^2_{a,p}). \tag{7.10} \]
Proof. Set $h(z_1,z_2) = \int_0^{z_2}\int_0^{z_1}f(\eta,\zeta)\,d\eta\,d\zeta$. From $\overline{f(u,\bar u)} = f(u,\bar u)$, we know $h(u,\bar u)$ is real. In view of the definition of $g(z_1,z_2)$ above,
\[ g(u,\bar u) = -\mathrm{i}\frac{\partial h(u,\bar u)}{\partial u},\qquad \overline{g(u,\bar u)} = \mathrm{i}\frac{\partial h(u,\bar u)}{\partial\bar u}. \tag{7.11} \]
Therefore,
\[ 0 = -\mathrm{i}\int_0^\pi\frac{dh}{dx}\,dx = -\mathrm{i}\int_0^\pi\Big(\frac{\partial h}{\partial u}u_x+\frac{\partial h}{\partial\bar u}\bar u_x\Big)dx = \int_0^\pi\big(g(u,\bar u)u_x-\overline{g(u,\bar u)}\,\bar u_x\big)dx = 2(G-\bar G). \tag{7.12} \]
This shows that G is real valued. In view of G in (7.9), we have
\[ \frac{\partial G}{\partial\bar q_j} = -\frac{\mathrm{i}}{2}\int_0^\pi f(u,\bar u)u_x\phi_j\,dx,\quad u = Sq. \tag{7.13} \]
From $f(-u,-\bar u) = -f(u,\bar u)$, we know that f(0, 0) = 0 and $f(u,\bar u)u_x$ can be expanded as a Fourier sine series. Thus the components of the gradient $G_{\bar q}$ are the Fourier sine coefficients of $f(u,\bar u)u_x$. Now the estimate (7.10) can be obtained in the same way as that of Lemma 3 in [10].


Pick a set $J = \{j_1 < j_2 < \cdots < j_n\}\subset\mathbb{N}$ as n basic modes. We assume
\[ \sigma_{j_b} = \xi_b,\ b = 1,\ldots,n,\qquad \sigma_j = 0,\ j\notin J, \tag{7.14} \]
and take $\xi := (\xi_1,\ldots,\xi_n)\in\Pi\subset\mathbb{R}^n$ as parameters, where Π is a closed bounded set of positive Lebesgue measure. We introduce symplectic polar and real coordinates (x, y, u, v) by setting
\[ q_{j_b} = \sqrt{2(I_b+y_b)}\,e^{-\mathrm{i}x_b},\ b = 1,\ldots,n,\qquad q_j = u_j-\mathrm{i}v_j,\ j\notin J, \tag{7.15} \]
where $I = (I_1,\ldots,I_n)$ is fixed. Then we have
\[ -\frac{\mathrm{i}}{2}\sum_{j\ge1}dq_j\wedge d\bar q_j = \sum_{1\le b\le n}dy_b\wedge dx_b + \sum_{j\notin J}du_j\wedge dv_j. \tag{7.16} \]
Therefore, up to a constant term, the Hamiltonian (7.9) can be rewritten as
\[ H = N+P = \sum_{1\le b\le n}\omega_by_b + \frac12\sum_{j\notin J}\Omega_j(u_j^2+v_j^2) + G(q(x,y,u,v)) \tag{7.17} \]
with symplectic structure $\sum_{1\le b\le n}dy_b\wedge dx_b + \sum_{j\notin J}du_j\wedge dv_j$, where
\[ \omega_b = j_b^2+\xi_b,\quad b = 1,\ldots,n, \tag{7.18} \]
\[ \Omega_j = j^2,\quad j\notin J. \tag{7.19} \]
It’s obvious that the tangential frequencies $\omega = (\omega_1,\ldots,\omega_n)$ and normal frequencies $\Omega = (\Omega_j : j\notin J)$ satisfy Assumptions (A) (B) (C) (D) in Theorem 1.1. From Lemma 7.1 above, we know there exists r > 0, such that, for every fixed I satisfying $|I| = O(r^2)$, the Hamiltonian vector field $X_P$ is real analytic from D(1, r) to $P^{a,p-1}$ with
\[ \|X_P\|_{r,a,p-1,D(1,r)} = O(r). \tag{7.20} \]
Moreover, since $X_P$ is independent of ξ, we know
\[ \|X_P\|^{\mathrm{lip}}_{r,a,p-1,D(1,r)\times\Pi} = 0. \tag{7.21} \]
Therefore, defining ε as (1.7), we have
\[ \varepsilon = O(r). \tag{7.22} \]
We simply assume |Π| = O(1) and fix β = 1/5. Then by letting $\alpha = O(r^{2/3})$, the inequality in (1.7) is satisfied when r is small enough. Now Theorem 1.1 yields the following

Theorem 7.2. Consider a family of the derivative nonlinear Schrödinger equation (7.1) parameterized by the Fourier multiplier $M_\sigma$ with σ = σ(ξ) defined by (7.14). Then for any 0 < ε ≪ 1, there is a subset $\Pi_\varepsilon\subset\Pi$ with
\[ |\Pi\setminus\Pi_\varepsilon| = O(\varepsilon^{2/3}), \tag{7.23} \]
such that for every $\xi\in\Pi_\varepsilon$, the equation has a smooth quasi-periodic solution of the form
\[ u(t,x) = \sum_{j\ge1}\tilde q_j(t)\phi_j(x), \tag{7.24} \]
where $\{\tilde q_j\}_{j\ge1}$ are quasi-periodic functions with frequencies $\tilde\omega := (\tilde\omega_1,\cdots,\tilde\omega_n)$. Moreover,
\[ |\tilde\omega-\omega| = O(\varepsilon), \tag{7.25} \]
and for every non-negative integer ν, there exists a positive constant c depending on ν such that
\[ \Big|\frac{d^\nu\tilde q_j}{dt^\nu}\Big| \le c\varepsilon,\quad j\in J, \tag{7.26} \]
\[ \sum_{j\notin J}e^{2aj}j^{2p}\Big|\frac{d^\nu\tilde q_j}{dt^\nu}\Big|^2 \le c\varepsilon^{7/3}. \tag{7.27} \]
Remarks. 1. The estimate (7.25) follows from (1.10) in Theorem 1.1. From (1.9) in Theorem 1.1, we know
\[ \max_{1\le b\le n}\Big|\frac{d^\nu(\tilde x_b-\theta_b-\tilde\omega_bt)}{dt^\nu}\Big| + \max_{1\le b\le n}\Big|\frac{d^\nu\tilde y_b}{dt^\nu}\Big|r^{-2} + \Big(\sum_{j\notin J}e^{2aj}j^{2p}\Big|\frac{d^\nu\tilde q_j}{dt^\nu}\Big|^2\Big)^{1/2}r^{-1} = O(\varepsilon^{\frac{1}{1+\beta}}/\alpha) = O(\varepsilon^{1/6}), \]
where $\tilde q_{j_b} = \sqrt{2(I_b+\tilde y_b)}\,e^{\mathrm{i}\tilde x_b}$, 1 ≤ b ≤ n and $\theta = (\theta_1,\ldots,\theta_n)\in\mathbb{T}^n$. Furthermore,
\[ \frac{d^\nu(\tilde x_b-\theta_b-\tilde\omega_bt)}{dt^\nu} = O(\varepsilon^{1/6}),\qquad \frac{d^\nu\tilde y_b}{dt^\nu} = O(\varepsilon^{13/6}), \tag{7.28} \]
\[ \Big(\sum_{j\notin J}e^{2aj}j^{2p}\Big|\frac{d^\nu\tilde q_j}{dt^\nu}\Big|^2\Big)^{1/2} = O(\varepsilon^{7/6}). \tag{7.29} \]
The estimate (7.26) follows from (7.28) and $I = O(\varepsilon^2)$. The estimate (7.27) follows from (7.29).
2. Under Dirichlet boundary conditions, in order to get the Hamiltonian (7.9) of discrete form, one needs to develop u = u(t, x) into Fourier sine series (7.6). Once it is done, the nonlinearity must be developed into Fourier sine series in proving the regularity of the nonlinear Hamiltonian vector field (see Lemma 7.1). This excludes the usual nonlinearity $(|u|^2u)_x$.


8. Application to Perturbed Benjamin-Ono Equation

The Benjamin-Ono equation describes the evolution of the interface between two inviscid fluids under some physical conditions (see [3]). Under periodic boundary conditions it reads
\[ u_t + Hu_{xx} - uu_x = 0,\quad (t,x)\in\mathbb{R}\times\mathbb{T}, \tag{8.1} \]
where u is real-valued and H is the Hilbert transform defined for 2π-periodic functions with mean value zero by
\[ \widehat{H(f)}(0) = 0,\qquad \widehat{H(f)}(j) = -\mathrm{i}\,\mathrm{sgn}(j)\hat f(j),\quad j\in\mathbb{Z}\setminus\{0\}. \tag{8.2} \]
The Benjamin-Ono equation is an integrable system (see [1]). For the global well-posedness of the Cauchy problem of the above equation, see [15,16]. In the first subsection, we transform the Benjamin-Ono Hamiltonian into its Birkhoff normal form up to order four. In the second subsection, by using Theorem 1.2, we prove there are many KAM tori and thus quasi-periodic solutions for the above equation with small Hamiltonian perturbations.

8.1. Birkhoff normal form. We introduce for any N > 3/2 the phase space
\[ H^N_0 = \Big\{u\in L^2(\mathbb{T};\mathbb{R}) : \hat u(0) = 0,\ \|u\|^2_N = \sum_{j\in\mathbb{Z}\setminus\{0\}}|j|^{2N}|\hat u(j)|^2 < \infty\Big\} \]
of real valued functions on $\mathbb{T}$, where
\[ \hat u(j) = \int_0^{2\pi}u(x)e_{-j}(x)\,dx,\qquad e_j(x) = \frac{1}{\sqrt{2\pi}}e^{\mathrm{i}jx}. \]
Under the standard inner product on $L^2(\mathbb{T};\mathbb{R})$, Eq. (8.1) can be written in Hamiltonian form
\[ \frac{\partial u}{\partial t} = -\frac{d}{dx}\frac{\partial H}{\partial u} \tag{8.3} \]
with Hamiltonian
\[ H(u) = \frac12\int_{\mathbb{T}}u(Hu_x)\,dx - \frac16\int_{\mathbb{T}}u^3\,dx. \tag{8.4} \]
To write it in infinitely many coordinates, we make the ansatz
\[ u(t,x) = \sum_{j\ne0}\gamma_jq_j(t)e_j(x), \tag{8.5} \]
where $\gamma_j = \sqrt{|j|}$. The coordinates are taken from the Hilbert space $\ell_{N+1/2}$ of all complex-valued sequences $q = (q_j)_{j\ne0}$ with
\[ \|q\|^2_{N+1/2} = \sum_{j\ne0}|q_j|^2|j|^{2N+1} < \infty,\qquad q_{-j} = \bar q_j. \tag{8.6} \]

Then (8.3) can be rewritten as
\[ \dot q_j = -\mathrm{i}\sigma_j\frac{\partial H}{\partial q_{-j}},\qquad \sigma_j = \begin{cases}1, & j\ge1\\ -1, & j\le-1\end{cases} \tag{8.7} \]
with the Hamiltonian
\[ H(q) = \Lambda+G = \sum_{j\ge1}j^2|q_j|^2 - \frac{1}{6\sqrt{2\pi}}\sum_{j+k+l=0}\gamma_j\gamma_k\gamma_lq_jq_kq_l. \tag{8.8} \]
The function G is analytic in $\ell_{N+1/2}$ with real value, and the Hamiltonian vector field $X_G$ is an analytic map from $\ell_{N+1/2}$ into $\ell_{N-1/2}$ with
\[ \|X_G\|_{N-1/2} = O(\|q\|^2_{N+1/2}). \tag{8.9} \]
In the following theorem, we transform the above Hamiltonian into its Birkhoff normal form up to order four.

Theorem 8.1. There exists a real analytic symplectic coordinate transformation Φ defined in a neighborhood of the origin of $\ell_{N+1/2}$ which transforms the above Hamiltonian H into its Birkhoff normal form up to order four. More precisely,
\[ H\circ\Phi = \Lambda+B+R \tag{8.10} \]
with
\[ B = -\frac{1}{8\pi}\sum_{j,k\ge1}\min(j,k)|q_j|^2|q_k|^2, \tag{8.11} \]
\[ \|X_R\|_{N-1/2} = O(\|q\|^4_{N+1/2}). \tag{8.12} \]

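Proof. (1) The first step is to eliminate the third order term G. Define $F^3 = \sum_{j,k,l\ne0}F^3_{jkl}q_jq_kq_l$ by
\[ \mathrm{i}F^3_{jkl} = \begin{cases}\dfrac{1}{6\sqrt{2\pi}}\dfrac{\gamma_j\gamma_k\gamma_l}{\lambda_j+\lambda_k+\lambda_l}, & \text{for } j+k+l = 0,\\[2mm] 0, & \text{otherwise},\end{cases} \tag{8.13} \]
where $\lambda_j = \sigma_jj^2$. Then we have
\[ \{\Lambda,F^3\}+G = 0, \tag{8.14} \]
where $\{\cdot,\cdot\}$ is the Poisson bracket with respect to the symplectic structure $-\mathrm{i}\sum_{j\ge1}dq_j\wedge dq_{-j}$. Letting $\Phi_1 = X^1_{F^3}$, then
\[ \begin{aligned} H\circ\Phi_1 &= H\circ X^t_{F^3}\big|_{t=1} \\ &= \Lambda+\{\Lambda,F^3\}+\int_0^1(1-t)\{\{\Lambda,F^3\},F^3\}\circ X^t_{F^3}\,dt + G + \int_0^1\{G,F^3\}\circ X^t_{F^3}\,dt \\ &= \Lambda+\int_0^1t\{G,F^3\}\circ X^t_{F^3}\,dt \\ &= \Lambda+\frac12\{G,F^3\}+\frac12\int_0^1(1-t^2)\{\{G,F^3\},F^3\}\circ X^t_{F^3}\,dt. \end{aligned} \tag{8.15} \]
The j-th element of the vector field $X_{F^3}$ reads explicitly
\[ -\mathrm{i}\sigma_j\frac{\partial F^3}{\partial q_{-j}} = -\mathrm{i}\sigma_j\sum_{k+l=j}3F^3_{(-j)kl}q_kq_l = \frac{-\sigma_j}{2\sqrt{2\pi}}\sum_{k+l=j}\frac{\gamma_j\gamma_k\gamma_l}{\lambda_{-j}+\lambda_k+\lambda_l}q_kq_l. \tag{8.16} \]
For any $j,k,l\ne0$ with j + k + l = 0,
\[ \lambda_j+\lambda_k+\lambda_l = 2jkl/\max\{|j|,|k|,|l|\}. \tag{8.17} \]
Thus,
\[ \Big|\frac{\gamma_{k+l}\gamma_k\gamma_l}{\lambda_{-k-l}+\lambda_k+\lambda_l}\Big| = \frac{\max\{|k+l|,|k|,|l|\}}{2\sqrt{|(k+l)kl|}} \le \frac{\sqrt2}{2}. \tag{8.18} \]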
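The small-divisor identity (8.17) and the bound (8.18) are elementary and can be confirmed by brute force over a finite range of indices. The sketch below does this with integer arithmetic for (8.17) and floating point for (8.18); the function names and the search range of 30 are illustrative choices, and a finite search is of course only a sanity check, not a proof.

```python
import math

def lam(j: int) -> int:
    """lambda_j = sigma_j * j^2, i.e. sign(j) * j^2."""
    return j * abs(j)

def check_cubic_divisors(bound: int = 30) -> float:
    """Verify (8.17) exactly and return the largest ratio appearing in (8.18)."""
    worst = 0.0
    for k in range(-bound, bound + 1):
        for l in range(-bound, bound + 1):
            j = -(k + l)
            if j == 0 or k == 0 or l == 0:
                continue
            big = max(abs(j), abs(k), abs(l))
            # (8.17), cross-multiplied to stay in integers.
            assert (lam(j) + lam(k) + lam(l)) * big == 2 * j * k * l
            # (8.18): gamma_j*gamma_k*gamma_l / |lam_j + lam_k + lam_l|.
            ratio = big / (2.0 * math.sqrt(abs(j * k * l)))
            worst = max(worst, ratio)
    return worst

worst_ratio = check_cubic_divisors()
print(worst_ratio, math.sqrt(2) / 2)
```

On this range the largest ratio is attained at k = l = ±1, which matches the constant $\sqrt2/2$ in (8.18).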

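Hence we have the estimate
\[ \Big|-\mathrm{i}\sigma_j\frac{\partial F^3}{\partial q_{-j}}\Big| \le \frac{1}{4\sqrt\pi}\sum_{k+l=j}|q_k||q_l| = \frac{1}{4\sqrt\pi}g_{-j}, \tag{8.19} \]
where $g_{-j}$ stands for the sum $\sum_{k+l=j}|q_k||q_l|$. Obviously, $g = (g_j)_{j\ne0}$ is the two-fold convolution of $w = (|q_j|)_{j\ne0}$, that is, $g = w*w$. Thus, for any r > 1/2,
\[ \|X_{F^3}\|_r \le \frac{1}{4\sqrt\pi}\|g\|_r \le c\|w\|_r^2 = c\|q\|_r^2, \tag{8.20} \]
where c > 0 depends only on r. This establishes the regularity of the vector field $X_{F^3}$.

(2) The second step is to normalize the fourth order term $\frac12\{G,F^3\}$ in (8.15). By a simple calculation, we have
\[ \begin{aligned} \frac12\{G,F^3\} &= -\frac{\mathrm{i}}{2}\sum_{j\ne0}\sigma_j\frac{\partial G}{\partial q_j}\frac{\partial F^3}{\partial q_{-j}} \\ &= \frac{1}{16\pi}\sum_{j\ne0}\sigma_j\sum_{k+l=-j}\gamma_j\gamma_k\gamma_lq_kq_l\sum_{m+n=j}\frac{\gamma_{-j}\gamma_m\gamma_n}{\lambda_{-j}+\lambda_m+\lambda_n}q_mq_n \\ &= \frac{1}{16\pi}\sum_{\substack{k,l,m,n\ne0\\ k+l+m+n=0\\ k+l\ne0}}\frac{(m+n)\gamma_k\gamma_l\gamma_m\gamma_n}{-\lambda_{m+n}+\lambda_m+\lambda_n}q_kq_lq_mq_n. \end{aligned} \tag{8.21} \]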

Let B consist of all terms with k + m = 0 or k + n = 0 in (8.21). Then write B explicitly,
\[ \begin{aligned} B &= \frac{1}{16\pi}\sum_{k\ne0}\frac{-2k^3}{\lambda_{2k}-2\lambda_k}|q_k|^4 + \frac{1}{8\pi}\sum_{k,l,k\pm l\ne0}\frac{-(k+l)|kl|}{\lambda_{k+l}-\lambda_k-\lambda_l}|q_k|^2|q_l|^2 \\ &= -\frac{1}{16\pi}\sum_{k\ne0}|k||q_k|^4 - \frac{1}{16\pi}\sum_{k,l,k\pm l\ne0}\sigma_{kl}\max\{|k|,|l|,|k+l|\}|q_k|^2|q_l|^2 \\ &= -\frac{1}{8\pi}\sum_{k\ge1}k|q_k|^4 - \frac{1}{4\pi}\sum_{k>l\ge1}l|q_k|^2|q_l|^2, \end{aligned} \tag{8.22} \]
which is (8.11). Let $Q = \frac12\{G,F^3\}-B$. We will find a coordinate transformation $\Phi_2 = X^1_{F^4}$ to eliminate Q. In complete analogy to the first step we let
\[ F^4 = \sum_{k,l,m,n\ne0}F^4_{klmn}q_kq_lq_mq_n, \tag{8.23} \]
\[ \mathrm{i}F^4_{klmn} = \begin{cases}-\dfrac{1}{16\pi}\dfrac{(m+n)\gamma_k\gamma_l\gamma_m\gamma_n}{(-\lambda_{m+n}+\lambda_m+\lambda_n)(\lambda_k+\lambda_l+\lambda_m+\lambda_n)}, & \text{for } k+l+m+n = 0,\ k+l,\,k+m,\,k+n\ne0,\\[2mm] 0, & \text{otherwise}.\end{cases} \tag{8.24} \]
Then we have
\[ \{\Lambda,F^4\}+Q = 0. \tag{8.25} \]

In order to complete the second step, we only need to establish the regularity of the vector field $X_{F^4}$. We claim that for $f_{klmn} := \frac{-\lambda_{m+n}+\lambda_m+\lambda_n}{m+n}(\lambda_k+\lambda_l+\lambda_m+\lambda_n)$ with k + l + m + n = 0 and $k,l,m,n,k+l,k+m,k+n\ne0$,
\[ |f_{klmn}| \ge \frac12\max\{|k|,|l|,|m|,|n|\}. \tag{8.26} \]
We prove this claim in four cases. Ahead of the proof, we give a simple inequality for two positive integers a, b: 2ab ≥ a + b, which will be frequently used. Without loss of generality, we assume |k| ≥ |l| and |m| ≥ |n| because of their symmetry in $f_{klmn}$.

(1) l, m, n have the opposite sign of k. Without loss of generality, we assume k < 0 and l, m, n > 0. In this case, we have
\[ |-\lambda_{m+n}+\lambda_m+\lambda_n| = 2|mn| \ge |m+n|, \]
\[ |\lambda_k+\lambda_l+\lambda_m+\lambda_n| = 2(lm+ln+mn) \ge 2(lm+ln) = 2l(|k|-l) \ge |k|, \]
which leads to (8.26).

(2) k, l, n have the opposite sign of m. Without loss of generality, we assume m < 0 and k, l, n > 0. In this case, we have
\[ |-\lambda_{m+n}+\lambda_m+\lambda_n| = 2|(m+n)n| \ge |m+n|, \]
\[ |\lambda_k+\lambda_l+\lambda_m+\lambda_n| = 2(kl+kn+ln) \ge 2(kn+ln) = 2(|m|-n)n \ge |m|, \]
which leads to (8.26).

(3) k, l have the same sign while m, n have the same sign. Without loss of generality, we assume k, l > 0 and m, n < 0. Here (8.26) follows from the following inequalities:
\[ \Big|\frac{-\lambda_{m+n}+\lambda_m+\lambda_n}{m+n}\Big| = \Big|\frac{2mn}{m+n}\Big| \ge |n|, \]
\[ |n||\lambda_k+\lambda_l+\lambda_m+\lambda_n| = 2|n|(k-|n|)|k-|m|| \ge k|k-|m|| \ge \frac12\max\{k,|m|\}. \]

(4) k, n have the same sign while l, m have the same sign. Without loss of generality, we assume k, n > 0 and l, m < 0. Here (8.26) follows from the following inequalities:
\[ \Big|\frac{-\lambda_{m+n}+\lambda_m+\lambda_n}{m+n}\Big| = 2n, \]
\[ 2n|\lambda_k+\lambda_l+\lambda_m+\lambda_n| = 4n(|m|-n)||m|-k| \ge 2|m||k-|m|| \ge \max\{k,|m|\}. \]
Finally, the proof of (8.26) is finished. From (8.26) we get
\[ |F^4_{klmn}| \le \frac{\gamma_k\gamma_l\gamma_m\gamma_n}{8\pi\max\{|k|,|l|,|m|,|n|\}}. \tag{8.27} \]
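The four-case argument for the lower bound (8.26) can be cross-checked by enumerating all admissible index quadruples in a finite box. The sketch below uses exact rational arithmetic; the function names and the box size of 10 are illustrative assumptions, and the search only supplements the proof above.

```python
from fractions import Fraction
from itertools import product

def lam(j: int) -> int:
    """lambda_j = sigma_j * j^2, i.e. sign(j) * j^2."""
    return j * abs(j)

def min_quartic_ratio(bound: int = 10) -> Fraction:
    """Smallest value of |f_klmn| / max{|k|,|l|,|m|,|n|} over admissible indices."""
    nonzero = [i for i in range(-bound, bound + 1) if i != 0]
    best = None
    for k, l, m in product(nonzero, repeat=3):
        n = -(k + l + m)
        # Admissibility: all indices and the sums k+l, k+m, k+n are nonzero.
        if n == 0 or k + l == 0 or k + m == 0 or k + n == 0:
            continue
        numerator = (-lam(m + n) + lam(m) + lam(n)) * (
            lam(k) + lam(l) + lam(m) + lam(n)
        )
        f = Fraction(numerator, m + n)  # m + n = -(k + l) is nonzero here
        ratio = abs(f) / max(abs(k), abs(l), abs(m), abs(n))
        if best is None or ratio < best:
            best = ratio
    return best

smallest = min_quartic_ratio()
print(smallest)
```

A value of `smallest` at or above 1/2 is what (8.26) predicts on the searched box.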

The j-th element of the vector field $X_{F^4}$ explicitly reads
\[ -\mathrm{i}\sigma_j\frac{\partial F^4}{\partial q_{-j}} = -\mathrm{i}\sigma_j\sum_{l+m+n=j}\big(F^4_{(-j)lmn}+F^4_{l(-j)mn}+F^4_{lm(-j)n}+F^4_{lmn(-j)}\big)q_lq_mq_n. \tag{8.28} \]
Hence we have the estimate
\[ \Big|-\mathrm{i}\sigma_j\frac{\partial F^4}{\partial q_{-j}}\Big| \le \frac{1}{2\pi\gamma_j}\sum_{l+m+n=j}\gamma_l\gamma_m\gamma_n|q_lq_mq_n| = \frac{1}{2\pi\gamma_j}g_{-j}, \tag{8.29} \]
where $g_{-j}$ stands for the sum $\sum_{l+m+n=j}\gamma_l\gamma_m\gamma_n|q_l||q_m||q_n|$. Obviously, $g = (g_j)_{j\ne0}$ is the three-fold convolution of $w = (\gamma_j|q_j|)_{j\ne0}$, that is, $g = w*w*w$. Thus, for any r > 1,
\[ \|X_{F^4}\|_r \le \frac{1}{2\pi}\|g\|_{r-1/2} \le c\|w\|^3_{r-1/2} = c\|q\|^3_r, \tag{8.30} \]
where c > 0 depends only on r. This establishes the regularity of the vector field $X_{F^4}$, and finishes the proof of Theorem 8.1.

8.2. Quasi-periodic solutions for perturbed Benjamin-Ono equation. Here is our main result in this section:

Theorem 8.2. Consider the Benjamin-Ono equation (8.1) with a small Hamiltonian perturbation, written in the Hamiltonian form
\[ \frac{\partial u}{\partial t} = -\frac{d}{dx}\Big(\frac{\partial H}{\partial u}+\varepsilon\frac{\partial K}{\partial u}\Big), \tag{8.31} \]
where the Hamiltonian H is defined by (8.4), and the Hamiltonian K is real analytic in a complex neighborhood V of the origin in $H^N_{0,\mathbb{C}}$, which is the complexification of $H^N_0$, N > 3/2. Moreover, K satisfies the regularity condition
\[ \frac{\partial K}{\partial u}: V\to H^N_{0,\mathbb{C}},\qquad \Big\|\frac{\partial K}{\partial u}\Big\|_{N,V} \le 1. \tag{8.32} \]
Then, for each index set $J = \{j_1 < j_2 < \cdots < j_n\}\subset\mathbb{N}$, there exists an $\varepsilon_0 > 0$ depending only on J, N and the size of V, such that for $\varepsilon < \varepsilon_0$, the equation has uncountably many quasi-periodic solutions with frequency vector close to $(j_1^2,\ldots,j_n^2)$.

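Proof. Set the perturbed Hamiltonian $\tilde H = H+\varepsilon K$. Then, by the transformation Φ in Theorem 8.1, we get the new Hamiltonian, still denoted by $\tilde H$,
\[ \tilde H = \Lambda+B+R+\varepsilon K\circ\Phi, \tag{8.33} \]
which is analytic in some neighborhood U of the origin of $\ell_{N+1/2}$ with Λ in (8.8), B in (8.11), R satisfying (8.12) and the last term satisfying
\[ \|X_{K\circ\Phi}\|_{N-1/2,U} \le 2. \tag{8.34} \]
We introduce symplectic polar and real coordinates (x, y, u, v) by setting
\[ \begin{aligned} &q_{j_b} = \sqrt{\xi_b+y_b}\,e^{-\mathrm{i}x_b},\quad q_{-j_b} = \sqrt{\xi_b+y_b}\,e^{\mathrm{i}x_b},\quad b = 1,\ldots,n,\\ &q_j = \tfrac{1}{\sqrt2}(u_j-\mathrm{i}v_j),\quad q_{-j} = \tfrac{1}{\sqrt2}(u_j+\mathrm{i}v_j),\quad j\in\mathbb{N}_* := \mathbb{N}\setminus J, \end{aligned} \tag{8.35} \]
where $\xi = (\xi_1,\cdots,\xi_n)\in\mathbb{R}^n_+$. Then
\[ \Lambda = \sum_{1\le b\le n}j_b^2(\xi_b+y_b)+\frac12\sum_{j\in\mathbb{N}_*}j^2(u_j^2+v_j^2), \tag{8.36} \]
\[ \begin{aligned} -8\pi B = {}&\sum_{1\le b,b'\le n}\min(j_b,j_{b'})(\xi_b+y_b)(\xi_{b'}+y_{b'}) + \sum_{1\le b\le n,\,j\in\mathbb{N}_*}\min(j_b,j)(\xi_b+y_b)(u_j^2+v_j^2) \\ &+\frac14\sum_{j,j'\in\mathbb{N}_*}\min(j,j')(u_j^2+v_j^2)(u_{j'}^2+v_{j'}^2). \end{aligned} \tag{8.37} \]
Thus the new Hamiltonian, still denoted by $\tilde H$, up to a constant depending only on ξ, is given by
\[ \tilde H = N+P = \sum_{1\le b\le n}\omega_by_b+\frac12\sum_{j\in\mathbb{N}_*}\Omega_j(u_j^2+v_j^2)+Q+R+\varepsilon K, \tag{8.38} \]
with symplectic structure $\sum_{1\le b\le n}dy_b\wedge dx_b+\sum_{j\in\mathbb{N}_*}du_j\wedge dv_j$, where
\[ \omega_b = j_b^2-\frac{1}{4\pi}\sum_{1\le b'\le n}\min(j_b,j_{b'})\xi_{b'}, \tag{8.39} \]
\[ \Omega_j = j^2-\frac{1}{4\pi}\sum_{1\le b\le n}\min(j_b,j)\xi_b, \tag{8.40} \]
\[ \begin{aligned} Q = {}&-\frac{1}{8\pi}\sum_{1\le b,b'\le n}\min(j_b,j_{b'})y_by_{b'}-\frac{1}{8\pi}\sum_{1\le b\le n,\,j\in\mathbb{N}_*}\min(j_b,j)y_b(u_j^2+v_j^2) \\ &-\frac{1}{32\pi}\sum_{j,j'\in\mathbb{N}_*}\min(j,j')(u_j^2+v_j^2)(u_{j'}^2+v_{j'}^2). \end{aligned} \tag{8.41} \]
Set
\[ \Pi = \{\xi\in\mathbb{R}^n_+ : |\xi| \le \varepsilon^{2/5}\}. \tag{8.42} \]
In the following we check Assumptions (A) (B) (C) and (D*).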


In view of (8.39), we know that ξ → ω is an affine transformation from Π to $\mathbb{R}^n$. Regarding ω as an n-dimensional column vector,
\[ \omega = \check\omega-\frac{1}{4\pi}A\xi = \begin{pmatrix}j_1^2\\ j_2^2\\ j_3^2\\ \vdots\\ j_n^2\end{pmatrix}-\frac{1}{4\pi}\begin{pmatrix}j_1&j_1&j_1&\cdots&j_1\\ j_1&j_2&j_2&\cdots&j_2\\ j_1&j_2&j_3&\cdots&j_3\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ j_1&j_2&j_3&\cdots&j_n\end{pmatrix}\begin{pmatrix}\xi_1\\ \xi_2\\ \xi_3\\ \vdots\\ \xi_n\end{pmatrix}, \tag{8.43} \]
where $\check\omega$ is an n-dimensional column vector with its bth element $\check\omega_b = j_b^2$ and A is an n × n matrix with its (b, m)th element $A_{bm} = \min(j_b,j_m)$. Let
\[ T = \begin{pmatrix}1&-1&&&\\ &1&-1&&\\ &&\ddots&\ddots&\\ &&&1&-1\\ &&&&1\end{pmatrix}\begin{pmatrix}1&&&&\\ -\frac{j_1}{j_2-j_1}&1&&&\\ &-\frac{j_2-j_1}{j_3-j_2}&1&&\\ &&\ddots&\ddots&\\ &&&-\frac{j_{n-1}-j_{n-2}}{j_n-j_{n-1}}&1\end{pmatrix}. \tag{8.44} \]
Then by a simple calculation, we have $AT = \tilde A := \mathrm{diag}(j_b-j_{b-1} : 1\le b\le n)$ (here $j_0 = 0$). Thus,
\[ \det\Big(\frac{\partial\omega}{\partial\xi}\Big) = \Big(-\frac{1}{4\pi}\Big)^nj_1(j_2-j_1)(j_3-j_2)\cdots(j_n-j_{n-1}) \ne 0. \tag{8.45} \]
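The determinant formula behind (8.45), namely $\det A = j_1(j_2-j_1)\cdots(j_n-j_{n-1})$ for $A_{bm} = \min(j_b,j_m)$, can be confirmed on a concrete index set. The following sketch uses exact fractions and plain Gaussian elimination; the index set `[2, 5, 6, 11]` and all function names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

def min_matrix(js):
    """A_{bm} = min(j_b, j_m) for an increasing list of positive integers."""
    return [[Fraction(min(a, b)) for b in js] for a in js]

def det(matrix):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for cc in range(c, n):
                a[r][cc] -= factor * a[c][cc]
    return result

def gap_product(js):
    """j_1 (j_2 - j_1) ... (j_n - j_{n-1}), with j_0 = 0."""
    prod, prev = 1, 0
    for j in js:
        prod *= j - prev
        prev = j
    return prod

js = [2, 5, 6, 11]
print(det(min_matrix(js)), gap_product(js))
```

Since the gaps $j_b - j_{b-1}$ are positive for a strictly increasing index set, the product never vanishes, which is the non-degeneracy used for Assumption (A).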

This implies that Assumption (A) is fulfilled with positive constants $M_1$ and L only depending on the set J. In view of (8.40), we know
\[ \Omega_j = j^2-\frac{1}{4\pi}B_j\xi, \tag{8.46} \]
where $B_j$ is an n-dimensional row vector with its mth element $B_{jm} = \min(j,j_m)$. It’s obvious that $\Omega_j\approx j^2$, thus, Assumption (B) is fulfilled with d = 2 and m = 1/2. From (8.46), we get
\[ |\Omega_j|^{\mathrm{lip}}_\Pi = \frac{1}{4\pi}\sum_{1\le m\le n}B_{jm} = \frac{1}{4\pi}\sum_{1\le m\le n}\min(j,j_m). \tag{8.47} \]
Thus,
\[ |\Omega|^{\mathrm{lip}}_{-1,\Pi} = \sup_{j\in\mathbb{N}_*}j^{-1}|\Omega_j|^{\mathrm{lip}}_\Pi \le \frac{n}{4\pi}, \tag{8.48} \]
that is, Assumption (C) is fulfilled with δ = 1 and $M_2 = \frac{n}{4\pi}$.

Moreover, taking $\delta_0 = 0$ and letting $\Omega^1_j = \Omega_j$, $\Omega^2_j = 0$ in Assumption (D*), then from (8.47) we have
\[ |\Omega^1|^{\mathrm{lip}}_{-\delta_0,\Pi} = \sup_{j\in\mathbb{N}_*}|\Omega_j|^{\mathrm{lip}}_\Pi = \frac{1}{4\pi}\sum_{1\le m\le n}j_m := M_3, \tag{8.49} \]
which only depends on J. On the other hand, since $\Omega^2 = 0$, the positive constant $M_4$ can be chosen arbitrarily small. Therefore, we can choose it small enough such that (1.11) is fulfilled. Thus the second part of Assumption (D*) holds true. In view of (8.46), regarding Ω as an infinite dimensional column vector with its index $j\in\mathbb{N}_*$, we have
\[ \Omega = \check\Omega-\frac{1}{4\pi}B\xi, \tag{8.50} \]
where $\check\Omega$ is an infinite dimensional column vector with its jth element $\check\Omega_j = j^2$ and B is an ∞ × n matrix with its jth row $B_j$. For the first part of Assumption (D*), regarding k and l as n-dimensional and infinite dimensional row vectors respectively, we have to check for every $k\in\mathbb{Z}^n$ and $l\in\mathbb{Z}^\infty$ with 1 ≤ |l| ≤ 2,
\[ k\check\omega+l\check\Omega \ne 0\quad\text{or}\quad kA+lB \ne 0. \tag{8.51} \]
By a simple calculation, we have $BA^{-1} = BT\tilde A^{-1} = \tilde B$, which is an ∞ × n matrix with its jth row $\tilde B_j$:
(1) $\tilde B_j = (\frac{j}{j_1},0,\ldots,0)$ for $j < j_1$, only the first element nonzero;
(2) $\tilde B_j = (0,\ldots,0,\frac{j_{m+1}-j}{j_{m+1}-j_m},\frac{j-j_m}{j_{m+1}-j_m},0,\ldots,0)$ for $j_m < j < j_{m+1}$ with 1 ≤ m < n, only the mth and (m + 1)th elements nonzero;
(3) $\tilde B_j = (0,\cdots,0,1)$ for $j > j_n$, only the last element nonzero.
Then the second inequality of (8.51) is equivalent to
\[ k+l\tilde B \ne 0. \tag{8.52} \]
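The three row formulas for $\tilde B = BA^{-1}$ can be verified by multiplying back: each row $\tilde B_j$ should satisfy $\tilde B_jA = B_j$ with $B_{jm} = \min(j,j_m)$. The sketch below checks this in exact arithmetic; the index set `[2, 5, 6, 11]` and the function names are illustrative assumptions.

```python
from fractions import Fraction

def b_tilde_row(j, js):
    """Row of B~ = B A^{-1} for a normal index j not in js, per cases (1)-(3)."""
    n = len(js)
    row = [Fraction(0)] * n
    if j < js[0]:
        row[0] = Fraction(j, js[0])          # case (1)
    elif j > js[-1]:
        row[-1] = Fraction(1)                # case (3)
    else:
        for m in range(n - 1):               # case (2)
            if js[m] < j < js[m + 1]:
                gap = js[m + 1] - js[m]
                row[m] = Fraction(js[m + 1] - j, gap)
                row[m + 1] = Fraction(j - js[m], gap)
    return row

def times_a(row, js):
    """Row vector times A, where A_{bm} = min(j_b, j_m)."""
    n = len(js)
    return [sum(row[b] * min(js[b], js[m]) for b in range(n)) for m in range(n)]

js = [2, 5, 6, 11]
normal_modes = [j for j in range(1, 15) if j not in js]
mismatches = [
    j for j in normal_modes
    if times_a(b_tilde_row(j, js), js) != [min(j, jm) for jm in js]
]
print(mismatches)
```

An empty list of mismatches confirms the rows on this example. It also makes visible that in case (2) the two nonzero entries are positive and sum to 1, which is the property used in the case analysis that follows.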

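Set $\mathbb{N}_* = \mathbb{N}^1_*\cup\mathbb{N}^2_*\cup\mathbb{N}^3_*$, where $\mathbb{N}^1_* = \{1\le j < j_1\}$, $\mathbb{N}^2_* = \{j_1\le j\le j_n\}\setminus J$ and $\mathbb{N}^3_* = \{j > j_n\}$. Notice that for $j\in\mathbb{N}^1_*\cup\mathbb{N}^2_*$ the nonzero elements of $\tilde B_j$ are positive and less than 1. Thus the inequality (8.52) holds true for $k\in\mathbb{Z}^n$ and 1 ≤ |l| ≤ 2 except the following three cases:
(1)
\[ l_j = \begin{cases}\pm2, & j = j_1/2,\\ 0, & \text{otherwise},\end{cases}\qquad k_m = \begin{cases}\mp1, & m = 1,\\ 0, & \text{otherwise};\end{cases} \]
(2) For a fixed m′ with 1 ≤ m′ < n,
\[ l_j = \begin{cases}\pm2, & j = (j_{m'}+j_{m'+1})/2,\\ 0, & \text{otherwise},\end{cases}\qquad k_m = \begin{cases}\mp1, & m = m',\\ \mp1, & m = m'+1,\\ 0, & \text{otherwise};\end{cases} \]
(3) $l_j = 0$ for $j\in\mathbb{N}^1_*\cup\mathbb{N}^2_*$, and $k_m = 0$, 1≤m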
It’s easy to check that the first inequality of (8.51) holds true for all of the above three cases. Therefore, the first part of Assumption (D*) holds true. Now we consider the sup norm and Lipschitz semi-norm of the perturbation
\[ P = Q+R+\varepsilon K \tag{8.53} \]
on D(s, r) × Π, where D(s, r) is defined in (1.2) by letting p = N + 1/2 and Π is defined in (8.42). We choose s > 0 a constant, and
\[ r = \varepsilon^{1/4}. \tag{8.54} \]
In view of (8.41), we have
\[ \|X_Q\|_{r,a,p-1,D(s,r)\times\Pi} = O(r^2) = O(\varepsilon^{1/2}). \tag{8.55} \]
In view of (8.12), we know R is at least of order five in q. Thus,
\[ \|X_R\|_{r,a,p-1,D(s,r)\times\Pi} = O\big((\varepsilon^{1/5})^5r^{-2}\big) = O(\varepsilon^{1/2}). \tag{8.56} \]
In view of (8.34), we have
\[ \|X_K\|_{r,a,p-1,D(s,r)\times\Pi} = O(r^{-2}) = O(\varepsilon^{-1/2}). \tag{8.57} \]
From (8.55), (8.56) and (8.57) we get
\[ \|X_P\|_{r,a,p-1,D(s,r)\times\Pi} = O(\varepsilon^{1/2}). \tag{8.58} \]
Since $X_P$ is real analytic in ξ, we have
\[ \|X_P\|^{\mathrm{lip}}_{r,a,p-1,D(s,r)\times\Pi} = O(\varepsilon^{1/2}\varepsilon^{-2/5}). \tag{8.59} \]
We choose
\[ \alpha = \varepsilon^{9/20}\gamma^{-1},\qquad \beta = 1/18, \tag{8.60} \]
where γ is taken from the KAM theorem. Set $M := M_1+M_2$, which only depends on the set J. It’s obvious that when ε is small enough,
\[ \tilde\varepsilon := \|X_P\|_{r,a,p-1,D(s,r)\times\Pi}+\frac{\alpha}{M}\|X_P\|^{\mathrm{lip}}_{r,a,p-1,D(s,r)\times\Pi} = O(\varepsilon^{1/2}) \le (\alpha\gamma)^{1+\beta}, \tag{8.61} \]
which is just the smallness condition (1.7). Now the conclusion of Theorem 8.2 follows from Theorem 1.2.
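The exponent bookkeeping in (8.54)–(8.61) can be traced by tracking only the powers of ε, ignoring all constants (including γ and M). The sketch below does this with exact fractions; the variable names are illustrative, and the scaling $|q| \sim \varepsilon^{1/5}$ is read off from $|\xi| \le \varepsilon^{2/5}$ in (8.42).

```python
from fractions import Fraction as F

r_exp = F(1, 4)          # r = eps^(1/4), from (8.54)
xi_exp = F(2, 5)         # |xi| <= eps^(2/5), from (8.42)
q_exp = xi_exp / 2       # |q| ~ sqrt(xi) ~ eps^(1/5)

x_q = 2 * r_exp                  # (8.55): O(r^2)
x_r = 5 * q_exp - 2 * r_exp      # (8.56): O((eps^(1/5))^5 r^-2)
x_k = 1 - 2 * r_exp              # eps times (8.57): eps * O(r^-2)
x_p = min(x_q, x_r, x_k)         # (8.58): the dominant (smallest) exponent

lip_exp = x_p - xi_exp           # (8.59): eps^(1/2) * eps^(-2/5)
alpha_exp = F(9, 20)             # (8.60), constant gamma ignored
beta = F(1, 18)

# (8.61): left side is the dominant of the two terms, right side (alpha*gamma)^(1+beta).
eps_tilde_exp = min(x_p, alpha_exp + lip_exp)
rhs_exp = alpha_exp * (1 + beta)
print(eps_tilde_exp, rhs_exp)
```

For small ε a larger exponent means a smaller quantity, so the smallness condition holds for small enough ε because the left exponent 1/2 exceeds the right exponent 19/40.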


9. A Technical Lemma Lemma 9.1. Let F = (Fi j )i, j≥1 be a bounded operator on 2 which depends on x ∈ Tn such that all elements (Fi j ) are analytic on D(s). Suppose R = (Ri j )i, j≥1 is another operator on 2 depending on x whose elements satisfy sup |Ri j (x)| ≤

x∈D(s )

1 |i − j|

sup

x∈D(s−σ )

|Fi j (x)|,

i = j,

(9.1)

and Rii = 0. Then R is a bounded operator on 2 for every x ∈ D(s ), and sup R(x) ≤

x∈D(s )

4n+1 sup F(x). σ n x∈D(s)

(9.2)

Proof. This is lemma M.3 in [11]. Acknowledgement. The authors are very grateful to the referees for their invaluable suggestions.

References
1. Ablowitz, M.J., Fokas, A.S.: The inverse scattering transform for the Benjamin-Ono equation, a pivot for multidimensional problems. Stud. Appl. Math. 68, 1–10 (1983)
2. Bambusi, D., Graffi, S.: Time Quasi-Periodic Unbounded Perturbations of Schrödinger Operators and KAM Methods. Commun. Math. Phys. 219, 465–480 (2001)
3. Benjamin, T.B.: Internal waves of permanent form in fluids of great depth. J. Fluid Mech. 29, 559–592 (1967)
4. Bourgain, J.: On invariant tori of full dimension for 1D periodic NLS. J. Funct. Anal. 229, 62–94 (2005)
5. Bourgain, J.: Recent progress on quasi-periodic lattice Schrödinger operators and Hamiltonian PDEs. Russ. Math. Surv. 59(2), 231–246 (2004)
6. Klainerman, S.: Long-time behaviour of solutions to nonlinear wave equations. In Proceedings of International Congress of Mathematics (Warsaw, 1983), Amsterdam: North Holland, 1984, pp. 1209–1215
7. Kuksin, S.B.: On small-denominators equations with large variable coefficients. J. Appl. Math. Phys. (ZAMP) 48, 262–271 (1997)
8. Kuksin, S.B.: Analysis of Hamiltonian PDEs. Oxford: Oxford Univ. Press, 2000
9. Kuksin, S.B.: Fifteen years of KAM in PDE, “Geometry, topology, and mathematical physics”: S.P. Novikov’s seminar 2002–2003 edited by V.M. Buchstaber, I.M. Krichever. Amer. Math. Soc. Transl. Series 2, Vol. 212, Providence, RI: Amer. Math. Soc., 2004, pp. 237–257
10. Kuksin, S.B., Pöschel, J.: Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Ann. Math. 143, 147–179 (1996)
11. Kappeler, T., Pöschel, J.: KdV&KAM. Berlin, Heidelberg: Springer-Verlag, 2003
12. Lax, P.D.: Development of singularities of solutions of nonlinear hyperbolic partial differential equations. J. Math. Phys. 5, 611–613 (1964)
13. Liu, J., Yuan, X.: Spectrum for quantum Duffing oscillator and small-divisor equation with large variable coefficient. Commun. Pure. Appl. Math. 63, 1145–1172 (2010)
14. Liu, J., Yuan, X.: KAM for the Derivative Nonlinear Schrödinger Equation with periodic Boundary conditions. In preparation
15. Molinet, L.: Global well-posedness in the energy space for the Benjamin-Ono equation on the circle. Math. Ann. 337, 353–383 (2007)
16. Molinet, L.: Global well-posedness in L² for the periodic Benjamin-Ono equation. Amer. J. Math. 130, 635–683 (2008)

Communicated by G. Gallavotti

Commun. Math. Phys. 307, 675–712 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1348-0


String Structures and Trivialisations of a Pfaffian Line Bundle

Ulrich Bunke
NWF I - Mathematik, Universität Regensburg, 93040 Regensburg, Germany. E-mail: [email protected]

Received: 12 March 2010 / Accepted: 24 June 2011
Published online: 25 September 2011 – © Springer-Verlag 2011

Abstract: The present paper is a contribution to categorial index theory. Its main result is the calculation of the Pfaffian line bundle of a certain family of real Dirac operators as an object in the category of line bundles. Furthermore, it is shown how string structures give rise to trivialisations of that Pfaffian.

Contents
1. Introduction
   1.1 Topological, geometric and categorial aspects of index theory
   1.2 Description of the results
2. The Bundle L
   2.1 Trivializations of Spin-structures and $\eta^3$-forms
   2.2 The Cheeger-Simons character $\frac{\hat p_1}{2}(\mathbf{V})$
   2.3 The construction of L
3. The Pfaffian
   3.1 Pfaffian and determinant line bundles for families of Dirac operators
   3.2 Theory for generalized Dirac operators
   3.3 Construction of local sections of the Pfaffian
   3.4 Proof of Theorem 1.1
   3.5 Asymptotic expansion of η-forms in the adiabatic limit
4. String Structures
   4.1 Geometric and multiplicative gerbes
   4.2 Geometric string structures
   4.3 The string structure associated to a trivialisation
   4.4 Proof of Theorem 1.3


1. Introduction

1.1. Topological, geometric and categorial aspects of index theory. The present paper is a contribution to categorial index theory. Its main results are the following:

1. We consider a family of Dirac operators associated to a surface bundle and twisted by a real spin vector bundle. We then calculate its Pfaffian line bundle as an object in the category of hermitian line bundles with connection.
2. We furthermore show that a geometric string structure, which refines the spin structure of the twisting vector bundle, naturally provides a trivialization of this Pfaffian line bundle.

Before we describe the results of the paper in greater detail in Subsect. 1.2, let us review the different levels of index theory appearing in the title of the present subsection.

The index of a family of elliptic differential operators $D$ parametrised by a space $B$ is a $K$-theory class $\mathrm{index}(D) \in K^0(B)$. It is the homotopy class of an associated family of Fredholm operators defined by functional analytic techniques (see e.g. [Bos77]). If $D$ has a kernel bundle, then alternatively we can view the index as the formal difference $\mathrm{index}(D) = [\ker(D)] - [\mathrm{coker}(D)] \in K^0(B)$ of vector bundles. Index theory provides the tools to calculate the $K$-theory class $\mathrm{index}(D) \in K^0(B)$ or its cohomological invariants like its Chern character $\mathrm{ch}(\mathrm{index}(D)) \in H^{ev}(B;\mathbb{Q})$ in terms of the symbol of $D$ [AS68].

If $D := D(\mathcal{E})$ is the family of Dirac operators associated to a geometric family $\mathcal{E}$ (see [Bun09]) on a smooth manifold $B$, then its index can be refined to a geometric object. If the kernel bundle of $D$ exists, then it has an induced hermitian metric and a metric connection [BGV92, Ch. 9]. This geometric information is encoded in the differential $K$-theory index class $\mathrm{index}(D) \in \hat K^0(B)$, which can be defined in general without any assumption on the existence of a kernel bundle [BS07]. The differential index theorem calculates the class $\mathrm{index}(D) \in \hat K^0(B)$ or its Chern character $\hat{\mathrm{ch}}(\mathrm{index}(D)) \in \widehat{H\mathbb{Q}}^{ev}(B)$ in differential rational cohomology in differential geometric terms, see [FL09] and [BS07].

Let $E$ be the total space of the underlying proper submersion $\pi : E \to B$ of the geometric family $\mathcal{E}$. If $W \to E$ is a complex vector bundle equipped with a hermitian metric $h^W$ and a metric connection $\nabla^W$, then we can form the twisted Dirac operator $D(\mathcal{E} \otimes \mathbf{W})$, where by the boldface letter $\mathbf{W} := (W, h^W, \nabla^W)$ we denote the geometric bundle. In a categorial refinement of index theory one would consider the index of $D(\mathcal{E} \otimes \mathbf{W})$ as an object $\mathrm{index}(D(\mathcal{E} \otimes \mathbf{W}))$ in a certain category $\tilde K^0(B)$ and study the functor $\mathbf{W} \mapsto \mathrm{index}(D(\mathcal{E} \otimes \mathbf{W}))$ from the category of geometric vector bundles on $E$. At the moment such a theory is only partially understood, and the present paper discusses a particular aspect of that idea. For related developments in the context of algebraic geometry see [Del87] and [Fra91].

The first Chern class $c_1(\mathrm{index}(D)) \in H^2(B;\mathbb{Z})$ classifies the topological type of the determinant line bundle of $D$. If $D = D(\mathcal{E})$ is the Dirac operator associated to a geometric family $\mathcal{E}$ on $B$, then $\det(D)$ comes with a Quillen metric $h^{\det(D)}$ and the Bismut-Freed connection $\nabla^{\det(D)}$, see [BGV92, Ch. 9 and 10] for details. The isomorphism class of the geometric line bundle $\det(D)$ over $B$ is classified by the first differential Chern class $\hat c_1(\det(D)) \in \widehat{H\mathbb{Z}}^{2}(B)$ [CS85]. It can be derived from the differential $K$-theory class $\mathrm{index}(D)$ by the identity $\hat c_1(\det(D)) = \hat c_1(\mathrm{index}(D))$, see [Bun,Bun09]. The calculation of integral Chern classes like $c_1(\mathrm{index}(D))$ and their differential refinements $\hat c_1(\mathrm{index}(D))$ is the content of integral index theory, see [Mad09] for first steps in the topological case. The characterization of the determinant line bundle $\det(D)$ as an object in the category $\mathrm{Line}(B)$ of geometric line bundles


over $B$ is an aspect of categorial index theory. In the presence of a real structure $J$ commuting with $D$, as observed in [Fre03], one can define a natural square root of the inverse of the determinant line bundle$^{1}$, the Pfaffian bundle $\mathrm{Pfaff}(D, J)$. The main goal of the present paper is the calculation of the object $\mathrm{Pfaff}(D, J) \in \mathrm{Line}(B)$ in a special situation which is motivated by applications in mathematical physics, notably string theory.

1.2. Description of the results. We consider a bundle $\pi : E \to B$ of compact two-dimensional manifolds, or alternatively, a proper submersion $\pi$ such that $\dim(E) - \dim(B) = 2$. Its vertical bundle is defined by $T^v\pi := \ker(d\pi)$. We further assume that we are given a fibrewise Riemannian metric $g^{T^v\pi}$ and a complement $T^h\pi \subset TE$ of the vertical bundle, i.e. a horizontal distribution. In [FL09] the pair $(g^{T^v\pi}, T^h\pi)$ is called a Riemannian structure on $\pi$ since it gives rise to a Levi-Civita connection $\nabla^{T^v\pi}$ on the vertical bundle $T^v\pi$, see [BGV92, Ch. 9]. We finally assume a spin structure on $T^v\pi$, which allows us to define the spinor bundle $S(T^v\pi)$. In [Bun09] we subsumed this collection of data into the notion of a geometric family

$$\mathcal{E} := (\pi : E \to B,\ g^{T^v\pi},\ T^h\pi,\ S(T^v\pi)).$$

The spinor bundle associated to a two-dimensional vector bundle (like $T^v\pi$) with spin structure is $\mathbb{Z}/2\mathbb{Z}$-graded and has a quaternionic structure $J$ (see [BS08, 2.2.6]), i.e. a parallel, anti-linear, anti-selfadjoint, and odd bundle endomorphism which commutes with the Clifford multiplication. Let $\mathbf{V} := (V, h^V, \nabla^V)$ be a real geometric vector bundle over $E$. Then we can form the Dirac bundle $S(\mathcal{E}) \otimes_{\mathbb{R}} \mathbf{V}$. Since we tensor over the real numbers, the quaternionic structure extends to the tensor product. We let

$$\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V} := (\pi : E \to B,\ g^{T^v\pi},\ T^h\pi,\ S(T^v\pi) \otimes_{\mathbb{R}} \mathbf{V})$$

denote the induced geometric family and use the symbol $J$ also to denote the extended quaternionic structure on $S(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}) = S(T^v\pi) \otimes_{\mathbb{R}} \mathbf{V}$. In general, given a geometric family $\mathcal{E}$ we let $D(\mathcal{E})$ denote the associated Dirac operator, which acts on the sections of the Dirac bundle $S(\mathcal{E})$ of $\mathcal{E}$. The composition $J D(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V})$ is an anti-linear, anti-selfadjoint, and even operator. By $(J D(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}))^+$ we denote the component acting on sections of $S(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V})^+$. The relative Pfaffian line bundle

$$\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel}) := \mathrm{Pfaff}((J D(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}))^+) \otimes \mathrm{Pfaff}((J D(\mathcal{E}))^+)^{-n}, \qquad n := \dim(V),$$

is a complex line bundle with connection functorially associated to the data described above. Its construction (see e.g. [Fre03,Bor92,FM06,Fre02]) will be recalled in Subsect. 3.1 below. The square of the relative Pfaffian line bundle is isomorphic to the inverse of the relative determinant line bundle $\det(D(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V})) \otimes \det(D(\mathcal{E}))^{-n}$ as a geometric line bundle.

Footnote 1. Note that our determinant line bundle, following the conventions in [BGV92], is the inverse of the determinant line bundle in [Fre03]. The Pfaffians coincide.

The Pfaffian line bundle plays an important role in two-dimensional quantum field theory [FM06], where the functional integral of the action over the fermions can be


interpreted as a section of the Pfaffian line bundle. In order to interpret the action as a complex-valued function it is important to construct trivialisations of the Pfaffian line bundle. The construction of the line bundle $\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel})$ is based on the analytic properties of the family of Dirac operators $D(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V})$. So it is not obvious how additional topological and differential geometric structures can lead to a trivialization.

In Subsect. 2.3 of the present paper we give a functorial, differential geometric construction of a geometric line bundle $\mathbf{L} = (L, h^L, \nabla^L)$ under the assumption that $V$ has a spin structure which is fibrewise trivial. The isomorphism class of $\mathbf{L}$ is classified by the first differential Chern class $\hat c_1(\mathbf{L}) \in \widehat{H\mathbb{Z}}^{2}(B)$. The spin-characteristic class $\frac{p_1}{2}(V) \in H^4(E;\mathbb{Z})$ also has a differential refinement $\frac{\hat p_1}{2}(\mathbf{V}) \in \widehat{H\mathbb{Z}}^{4}(E)$. We refer to [CS85] for a first construction of differential integral cohomology groups (there called groups of differential characters) and of the differential refinements of characteristic classes. We will give an alternative description of these classes, well adapted to the present purpose, in Subsect. 2.2. Essentially by construction we have

$$\hat c_1(\mathbf{L}) = -\int_{E/B} \frac{\hat p_1}{2}(\mathbf{V}) \in \widehat{H\mathbb{Z}}^{2}(B),$$

see Lemma 2.7. We have

Theorem 1.1 (Theorem 3.4). If $V$ has a spin structure which is fibrewise trivial, then there is a functorial isomorphism of geometric line bundles

$$\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel}) \cong \mathbf{L}.$$

The adjective "functorial" here and above refers to base change along smooth maps $B' \to B$. As a consequence we get

Corollary 1.2. If $V$ has a spin structure which is fibrewise trivial, then

$$\hat c_1(\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel})) = -\int_{E/B} \frac{\hat p_1}{2}(\mathbf{V}). \qquad (1)$$

The underlying equality

$$c_1(\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel})) = -\int_{E/B} \frac{p_1}{2}(V),$$

which was first derived in [Fre03, Prop. 5.4]$^{2}$, can be considered as an example of an integral index theorem, as alluded to in [Bun] (see [Mad09] for a more general example), and Eq. (1) can be considered as its differential refinement.

Footnote 2. The result is stated for a torus bundle, but the proof applies in general. Moreover, it is not necessary to assume that the spin structure of $V$ is fibrewise trivial.

Our second result concerns trivialisations of the relative Pfaffian line bundle $\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel})$. We assume that $\dim(V) = n \ge 3$. We define the homotopy type $B\mathrm{String}(n)$ as the homotopy fibre of $\frac{p_1}{2} : B\mathrm{Spin}(n) \to K(\mathbb{Z}, 4)$, where $B\mathrm{Spin}(n)$ is a classifying space of the Lie group $\mathrm{Spin}(n)$, $K(\mathbb{Z}, 4)$ is an Eilenberg-Mac Lane space, and the map $\frac{p_1}{2}$ classifies the universal characteristic class of $B\mathrm{Spin}(n)$


denoted by the same symbol. A topological string structure for a spin bundle $V$ is a homotopy class of lifts

$$\begin{array}{ccc} & & B\mathrm{String}(n) \\ & \nearrow & \downarrow \\ B & \xrightarrow{\;V\;} & B\mathrm{Spin}(n) \end{array}$$

where we use the symbol $V$ also to denote a map classifying the spin bundle $V$. In the present paper we use an equivalent, more geometric, notion of a string structure as a trivialization of the Chern-Simons 2-gerbe. This and the notion of a geometric string structure $\mathrm{str}$ of a geometric spin bundle $\mathbf{V}$ have been discussed in [Walb]. We will explain details in Subsect. 4.2. In particular, to a geometric string structure $\mathrm{str}$ we have an associated 3-form $H_{\mathrm{str}} \in \Omega^3(E)$, see (39).

Theorem 1.3 (Theorem 4.16). A geometric string structure $\mathrm{str}$ on $\mathbf{V}$ gives rise to a functorial unit-norm section $s_{\mathrm{str}} \in C^\infty(B, L)$ with

$$\nabla^{L} \log s_{\mathrm{str}} = -2\pi i \int_{E/B} H_{\mathrm{str}}.$$

If we combine Theorem 1.1 with Theorem 1.3 we get the following consequence.

Corollary 1.4. A geometric string structure $\mathrm{str}$ on $\mathbf{V}$ gives rise to a functorial unit-norm section $s_{\mathrm{str}} \in C^\infty(B, \mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel}))$ such that

$$\nabla^{\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel})} \log s_{\mathrm{str}} = -2\pi i \int_{E/B} H_{\mathrm{str}}.$$

This corollary was the original motivation of the present paper. It answers a question by Stephan Stolz.

Let us finally explain a typical example, which is the application sought by physicists. We consider a compact surface $\Sigma$ with a Riemannian metric $g^{T\Sigma}$ and a spin structure. Furthermore, we consider a Riemannian manifold $X$ and a smooth map $B \to \mathrm{Map}(\Sigma, X)$. Technically, this is a smooth map

$$E := \Sigma \times B \xrightarrow{\;f\;} X.$$

We let $\pi : E \to B$ be the projection, $T^h\pi := TB \subset T(\Sigma \times B)$ be the canonical horizontal subspace, and $g^{T^v\pi}$ and the spin structure on $T^v\pi$ be induced by $g^{T\Sigma}$ and the spin structure on $\Sigma$, respectively. In this way we get a geometric family $\mathcal{E}$. The real vector bundle is obtained by $\mathbf{V} := f^*\mathbf{TX}$, where the geometric bundle $\mathbf{TX} = (TX, g^{TX}, \nabla^{TX})$ is given by the Riemannian geometry of $X$; in particular, $\nabla^{TX}$ is the Levi-Civita connection. We assume that $X$ has a string structure. In this situation, by Corollary 1.2 the isomorphism class of the relative Pfaffian line bundle can be calculated by transgressing the differential Pontrjagin class of $\mathbf{TX}$:

$$\hat c_1(\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel})) = -\int_{\Sigma \times B/B} f^*\Big(\frac{\hat p_1}{2}(\mathbf{TX})\Big).$$

Note that the string structure on $X$ ensures that the spin structure of $f^*TX$ is fibrewise trivial. Furthermore, a refinement of the string structure of $X$ to a geometric string structure $\mathrm{str}$ gives rise to a trivialization $s_{\mathrm{str}}$ of $\mathrm{Pfaff}(\mathcal{E} \otimes_{\mathbb{R}} \mathbf{V}, J, \mathrm{rel})$ by Corollary 1.4.


2. The Bundle L

2.1. Trivializations of Spin-structures and $\eta^3$-forms. A $\mathbb{Z}/2\mathbb{Z}$-graded complex geometric vector bundle $\mathbf{W} = (W, h^W, \nabla^W)$ over a manifold $M$ gives rise to a geometric family $\mathcal{W}$ with zero-dimensional fibres, see [Bun09, 2.2.2.1]. An invertible odd bundle endomorphism $\bar Q \in \mathrm{End}(W)^{odd}$ can be considered as a taming $\mathcal{W}_t$ in the sense of [Bun09, Def 2.2.4]. To the tamed geometric family $\mathcal{W}_t$ by [BC89, (2.26)] we can associate an eta form $\eta(\mathcal{W}_t) \in \Omega^{odd}(M)$ such that $d\eta(\mathcal{W}_t) = \mathrm{ch}(\nabla^W)$. We refer to (3) for our normalizations. In the present subsection we study special properties of the eta form in the case that $\mathbf{W}$ comes from a real spin vector bundle $V$ and $\bar Q$ is induced by a spin trivialization $Q$ of $V$. The main result is Lemma 2.4.

Let $\mathbf{V} = (V, h^V, \nabla^V)$ be a real $n$-dimensional geometric vector bundle, $n \ge 3$, over some manifold $M$. Then we form the $\mathbb{Z}/2\mathbb{Z}$-graded complex geometric bundle $\mathbf{W} = (W, h^W, \nabla^W)$ such that $W^+ := V \otimes_{\mathbb{R}} \mathbb{C}$ and $W^-$ is the $n$-dimensional trivial geometric bundle over $M$. A spin structure on $V$ is given by a $\mathrm{Spin}(n)$-reduction $\mathrm{Spin}(V) \to O(V)$ of the orthonormal frame bundle $O(V)$ of $V$. A trivialization of the spin bundle $V$ is a trivialization $Q : \mathrm{Spin}(V) \xrightarrow{\sim} M \times \mathrm{Spin}(n)$. The trivialisation of the spin bundle $V$ naturally gives rise to a unitary vector bundle isomorphism $Q^+ : W^+ \to W^-$. We define the unitary and odd involution

$$\bar Q := \begin{pmatrix} 0 & (Q^+)^* \\ Q^+ & 0 \end{pmatrix} \in \mathrm{End}(W)^{odd}.$$

Let $\mathcal{W}$ be the geometric family given by the $\mathbb{Z}/2\mathbb{Z}$-graded complex geometric vector bundle $\mathbf{W}$. The local index form $\Omega(\mathcal{W})$ [Bun09, Def. 2.2.8] in this case is the Chern-Weil representative $\mathrm{ch}(\nabla^W) \in \Omega^{ev}(M)$ of the Chern character of $W$ for the connection $\nabla^W$:

$$\Omega(\mathcal{W}) = \mathrm{ch}(\nabla^W) = \mathrm{ch}(\nabla^{V \otimes_{\mathbb{R}} \mathbb{C}}) - n.$$

We are in particular interested in the degree-4 component. We have $c_1(\nabla^{V \otimes_{\mathbb{R}} \mathbb{C}}) = 0$ and therefore

$$\mathrm{ch}_2(\nabla^{V \otimes_{\mathbb{R}} \mathbb{C}}) = -c_2(\nabla^{V \otimes_{\mathbb{R}} \mathbb{C}}) = p_1(\nabla^V),$$

where $c_i(\nabla^{V \otimes_{\mathbb{R}} \mathbb{C}}) \in \Omega^{2i}(M)$ and $p_i(\nabla^V) \in \Omega^{4i}(M)$ again denote the Chern-Weil representatives of the corresponding characteristic classes.

The endomorphism $\bar Q$ gives rise to a taming $\mathcal{W}_{t_Q}$ and an associated η-form. Its definition employs the rescaled super connection

$$A_t := t^{1/2} \bar Q + \nabla^W. \qquad (2)$$

Definition 2.1. The degree-$(2k-1)$ component of the eta-form of $\mathcal{W}_{t_Q}$ is defined by

$$\eta^{2k-1}(\mathcal{W}_{t_Q}) = \frac{1}{(2\pi i)^k} \int_0^\infty \mathrm{tr}\,[\partial_t A_t \exp(-A_t^2)]_{2k-1}\, dt. \qquad (3)$$

Here $[\omega]_{2k-1}$ is the degree-$(2k-1)$ component of the inhomogeneous form $\omega$. Furthermore, if $W$ is a $\mathbb{Z}/2\mathbb{Z}$-graded vector bundle, then $\mathrm{tr} : \mathrm{End}(W) \to \mathbb{C}$ denotes the super trace. The definition [Bun09, Def 2.2.16] of the eta form of a tamed geometric family is based on a different family of super connections. It is related to (2) by a transformation in the scaling parameter $t$. Hence, it gives the same eta form as in the present paper.
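The degree-4 identity used earlier in this subsection can be unpacked in a few lines. The following is only a sketch: it assumes the relation $\mathrm{ch}_2 = \frac{1}{2}c_1^2 - c_2$ between Chern-Weil forms (the one quoted in the footnote on the standard Chern-Simons normalization) and the standard fact that the curvature of a metric connection on a real bundle is skew-symmetric, hence traceless.

```latex
% Skew-symmetric curvature has vanishing trace, so the first Chern form vanishes:
c_1(\nabla^{V\otimes_{\mathbb{R}}\mathbb{C}}) = 0 .
% The constant n only affects degree 0, so the degree-4 part of
% \Omega(\mathcal{W}) = ch(\nabla^{V\otimes_R C}) - n is ch_2:
\Omega^4(\mathcal{W})
  = \mathrm{ch}_2(\nabla^{V\otimes_{\mathbb{R}}\mathbb{C}})
  = \tfrac{1}{2}\, c_1(\nabla^{V\otimes_{\mathbb{R}}\mathbb{C}})^2
    - c_2(\nabla^{V\otimes_{\mathbb{R}}\mathbb{C}})
  = - c_2(\nabla^{V\otimes_{\mathbb{R}}\mathbb{C}})
  = p_1(\nabla^V) .
% Combined with d\eta(\mathcal{W}_t) = ch(\nabla^W), read off in degree 4:
d\eta^3(\mathcal{W}_{t_Q}) = \Omega^4(\mathcal{W}) = p_1(\nabla^V) .
```

The last line is the relation recorded as (4) in the text.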


In the present paper we are in particular interested in $\eta^3(\mathcal{W}_{t_Q}) \in \Omega^3(M)$. It satisfies

$$d\eta^3(\mathcal{W}_{t_Q}) = \Omega^4(\mathcal{W}) = p_1(\nabla^V). \qquad (4)$$

Let us calculate the eta form explicitly in order to see that it is nothing else than a classical Chern-Simons form. Note that $W^-$ is trivialized so that we can identify sections of $\mathrm{End}(W^-)$ with matrix-valued functions. We define the matrix-valued one-form $B^+$ such that $Q^+ \nabla^{W^+} (Q^+)^* = d + B^+$, and the matrix-valued curvature two-form $R^+ := dB^+ + (B^+)^2$.

Definition 2.2. The Chern-Simons form$^{3}$ of the connection $\nabla^{W^+}$ (in the trivialization given by $Q^+$)$^{4}$ is defined by

$$CS(\nabla^{W^+}) := \frac{1}{4}\,\frac{1}{(2\pi i)^2}\Big[\mathrm{tr}\, R^+ B^+ - \frac{1}{3}\,\mathrm{tr}\,(B^+)^3\Big].$$

Its differential is related to the second Chern form by

$$dCS(\nabla^{W^+}) = -\frac{1}{2}\, c_2(\nabla^{W^+}).$$

Footnote 3. The standard normalization of the Chern-Simons form is

$$CS_{st}(\nabla^{W^+}) = \mathrm{tr}\, R^+ B^+ - \frac{1}{3}\,\mathrm{tr}\,(B^+)^3$$

such that

$$dCS_{st}(\nabla^{W^+}) = \mathrm{tr}\big((R^{\nabla^{W^+}})^2\big) = 2(2\pi i)^2\, \mathrm{ch}_2(\nabla^{W^+}) = 2(2\pi i)^2 \Big(\frac{1}{2}\, c_1(\nabla^{W^+})^2 - c_2(\nabla^{W^+})\Big).$$

The factor $\frac{1}{4}\frac{1}{(2\pi i)^2}$ is introduced for convenience.

Footnote 4. One should better write $CS(\nabla^{W^+}, Q^+)$, but we refrain from doing so in order to shorten the notation.

Lemma 2.3. We have

$$\eta^3(\mathcal{W}_{t_Q}) = 2\, CS(\nabla^{W^+}).$$

Proof. Define $U := \mathrm{diag}(Q^+, 1)$. Then

$$B_t := U A_t U^* = d + B + t^{1/2} K, \qquad K := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad B := \begin{pmatrix} B^+ & 0 \\ 0 & 0 \end{pmatrix}$$

is a rescaled super connection on the trivial $\mathbb{Z}/2\mathbb{Z}$-graded bundle $W^- \oplus W^-$ which is isomorphic to $A_t$. Using the notation

$$R := \begin{pmatrix} R^+ & 0 \\ 0 & 0 \end{pmatrix}, \qquad C := \begin{pmatrix} 0 & B^+ \\ -B^+ & 0 \end{pmatrix},$$

we get

$$B_t^2 = dB + B^2 + t^{1/2}(BK + KB) + tK^2 = R + t^{1/2} C + t.$$


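Furthermore

$$\partial_t B_t = \frac{1}{2 t^{1/2}}\, K.$$

We now calculate the 3-form component:

$$\begin{aligned}
[\partial_t B_t\, e^{-B_t^2}]_3 &= \frac{e^{-t}}{2 t^{1/2}}\, K \Big(\frac{t^{1/2}}{2}(RC + CR) - \frac{t^{3/2}}{6}\, C^3\Big) \\
&= \frac{e^{-t}}{4}\, K \begin{pmatrix} 0 & R^+ B^+ \\ -B^+ R^+ & 0 \end{pmatrix} - \frac{t e^{-t}}{12}\, K \begin{pmatrix} 0 & -(B^+)^3 \\ (B^+)^3 & 0 \end{pmatrix} \\
&= \frac{e^{-t}}{4} \begin{pmatrix} R^+ B^+ & 0 \\ 0 & -B^+ R^+ \end{pmatrix} - \frac{t e^{-t}}{12} \begin{pmatrix} (B^+)^3 & 0 \\ 0 & -(B^+)^3 \end{pmatrix}.
\end{aligned}$$

We get

$$\mathrm{tr}\,[\partial_t B_t\, e^{-B_t^2}]_3 = \frac{e^{-t}}{2}\, \mathrm{tr}\, R^+ B^+ - \frac{t e^{-t}}{6}\, \mathrm{tr}\,(B^+)^3.$$

Using

$$\int_0^\infty e^{-t}\, dt = 1, \qquad \int_0^\infty t e^{-t}\, dt = 1$$

we get

$$\eta^3(\mathcal{W}_{t_Q}) = \frac{1}{(2\pi i)^2}\Big[-\frac{1}{2}\, \mathrm{tr}\, R^+ B^+ + \frac{1}{6}\, \mathrm{tr}\,(B^+)^3\Big] = 2\, CS(\nabla^{W^+}).$$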
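Two routine steps in the proof of Lemma 2.3 are worth spelling out: the extraction of the 3-form part of the exponential and the two elementary integrals. The sketch below uses only $B_t^2 = R + t^{1/2} C + t$ from the proof, with $R$ of form degree 2 and $C$ of form degree 1; the abbreviation $X$ is introduced here for illustration and is not the paper's notation.

```latex
% The scalar t commutes with everything, so with X := R + t^{1/2} C:
e^{-B_t^2} = e^{-t}\, e^{-X}, \qquad
e^{-X} = 1 - X + \tfrac{1}{2} X^2 - \tfrac{1}{6} X^3 + \dots
% Collect terms of total form degree 3 (R has degree 2, C has degree 1):
%   from  X^2/2 :   t^{1/2} (RC + CR) / 2
%   from -X^3/6 :  -t^{3/2} C^3 / 6
% (X itself has no degree-3 part, and X^4 starts in degree 4.)
[e^{-X}]_3 = \frac{t^{1/2}}{2}\,(RC + CR) - \frac{t^{3/2}}{6}\, C^3 .
% Multiplying by \partial_t B_t = K / (2 t^{1/2}) and by e^{-t}:
[\partial_t B_t\, e^{-B_t^2}]_3
  = \frac{e^{-t}}{4}\, K (RC + CR) - \frac{t\, e^{-t}}{12}\, K C^3 .
% The t-integrals are values of the Gamma function; the second follows
% from the first by one integration by parts:
\int_0^\infty e^{-t}\,dt = \Gamma(1) = 1, \qquad
\int_0^\infty t\, e^{-t}\,dt
  = \big[-t e^{-t}\big]_0^\infty + \int_0^\infty e^{-t}\,dt = \Gamma(2) = 1 .
```

Since both integrals equal 1, the coefficients of the two trace terms pass through the $t$-integration unchanged.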

We now consider a second spin trivialization $Q'$ of the spin bundle $V$. An element $x \in H^3(M;\mathbb{R})$ is called even if it belongs to $2\,\mathrm{im}(H^3(M;\mathbb{Z}) \to H^3(M;\mathbb{R}))$.

Lemma 2.4. The difference $\eta^3(\mathcal{W}_{t_{Q'}}) - \eta^3(\mathcal{W}_{t_Q})$ is closed and represents an even class in $H^3(M;\mathbb{R})$.

Proof. The difference is closed by (4). That it represents an even class relies on the fact that we consider pairs of spin trivialisations as opposed to just trivialisations. In order to see this we invoke some index theory. We form a geometric family $\mathcal{I}$ over $M$. The underlying fibre bundle of $\mathcal{I}$ is $p : [0,1] \times M \to M$ with the standard metric $g^{T^v p}$ and horizontal distribution $T^h p := TM \subset T([0,1] \times M)$. The trivial bundle $T^v p$ has an induced spin structure. Note that $\partial(\mathcal{I} \otimes p^*\mathbf{W}) \cong \mathcal{W} \oplus \mathcal{W}^{op}$. The tamings $\mathcal{W}_{t_Q}$ and $\mathcal{W}^{op}_{t_{Q'}}$ induce a boundary taming $(\mathcal{I} \otimes \mathbf{W})_{bt}$. Since $\mathcal{I}$ is a one-dimensional boundary tamed geometric family, its index $\mathrm{index}((\mathcal{I} \otimes \mathbf{W})_{bt})$ is an element of $K^{-1}(M)$. We let $\mathrm{ch}^{odd} : K^{-1}(M) \to H^{odd}(M;\mathbb{Q})$ be the odd Chern character. By the index theorem for boundary tamed families [Bun09, Thm. 2.2.18] we have the following equality in de Rham cohomology:

$$\mathrm{ch}^{odd}_3(\mathrm{index}((\mathcal{I} \otimes \mathbf{W})_{bt})) = [\Omega^3(\mathcal{I} \otimes \mathbf{W}) + \eta^3(\mathcal{W}_{t_Q}) - \eta^3(\mathcal{W}_{t_{Q'}})].$$

We now observe that

$$\Omega^3(\mathcal{I} \otimes \mathbf{W}) = \int_{[0,1] \times M/M} p^*\, \mathrm{ch}_2(\nabla^W) = 0.$$

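It remains to show that $\mathrm{ch}^{odd}_3(\mathrm{index}((\mathcal{I} \otimes \mathbf{W})_{bt}))$ is even. We first consider the special case where $V_{\mathrm{Spin}(n)} = \mathrm{Spin}(n) \times \mathbb{R}^n$ is the trivial bundle over $\mathrm{Spin}(n)$ with the trivial metric, connection and spin structure $\mathrm{Spin}(V_{\mathrm{Spin}(n)}) = \mathrm{Spin}(n) \times \mathrm{Spin}(n)$. For $Q_{\mathrm{Spin}(n)}$ we take the given trivialization, and for $Q'_{\mathrm{Spin}(n)}$ we take the twisted trivialisation $Q'_{\mathrm{Spin}(n)}(h, g) = (h, hg)$. This defines by the construction above a boundary tamed geometric family $(\mathcal{I}_{\mathrm{Spin}(n)} \otimes \mathbf{W}_{\mathrm{Spin}(n)})_{bt}$ over $\mathrm{Spin}(n)$.

We first claim that $\mathrm{ch}^{odd}_3(\mathrm{index}((\mathcal{I}_{\mathrm{Spin}(n)} \otimes \mathbf{W}_{\mathrm{Spin}(n)})_{bt})) \in H^3(\mathrm{Spin}(n);\mathbb{R})$ is even. In order to see this we show that $(\mathcal{I}_{\mathrm{Spin}(n)} \otimes \mathbf{W}_{\mathrm{Spin}(n)})_{bt}$ is a pull-back of a boundary tamed family via the two-fold covering $s : \mathrm{Spin}(n) \to SO(n)$. Indeed, we can consider the trivial bundle $V_{SO(n)} \to SO(n)$ and $W_{SO(n)} := V_{SO(n)} \oplus V_{SO(n)}^{op}$. We set

$$\bar Q_{SO(n)} := \begin{pmatrix} 0 & (Q^+_{SO(n)})^* \\ Q^+_{SO(n)} & 0 \end{pmatrix} \in \mathrm{End}(W_{SO(n)}),$$

where $Q^+_{SO(n)}(g, x) = (g, gx)$, $(g, x) \in SO(n) \times \mathbb{R}^n = V_{SO(n)}$. As above this gives a boundary tamed geometric family $(\mathcal{I}_{SO(n)} \otimes \mathbf{W}_{SO(n)})_{bt}$, and we have

$$(\mathcal{I}_{\mathrm{Spin}(n)} \otimes \mathbf{W}_{\mathrm{Spin}(n)})_{bt} \cong s^*(\mathcal{I}_{SO(n)} \otimes \mathbf{W}_{SO(n)})_{bt}.$$

Note that in general $\mathrm{ch}^{odd}_3(x) \in H^3(X;\mathbb{R})$ is integral for every $x \in K^{-1}(X)$. Therefore $\mathrm{ch}^{odd}_3(\mathrm{index}((\mathcal{I}_{SO(n)} \otimes \mathbf{W}_{SO(n)})_{bt}))$ is integral. Since $s^* : H^3(SO(n);\mathbb{R}) \to H^3(\mathrm{Spin}(n);\mathbb{R})$ maps integral to even elements, we conclude that

$$\mathrm{ch}^{odd}_3(\mathrm{index}((\mathcal{I}_{\mathrm{Spin}(n)} \otimes \mathbf{W}_{\mathrm{Spin}(n)})_{bt})) = s^*\, \mathrm{ch}^{odd}_3(\mathrm{index}((\mathcal{I}_{SO(n)} \otimes \mathbf{W}_{SO(n)})_{bt}))$$

is even.

We now come back to our original case. The transition $Q' \circ Q^{-1} : M \times \mathrm{Spin}(n) \to M \times \mathrm{Spin}(n)$ determines a map $T : M \to \mathrm{Spin}(n)$ such that $Q' \circ Q^{-1}(m, g) = (m, T(m)g)$. We observe that $T^*(\mathcal{I}_{\mathrm{Spin}(n)} \otimes \mathbf{W}_{\mathrm{Spin}(n)})_{bt}$ differs, up to isomorphism induced by $Q$, from $(\mathcal{I} \otimes \mathbf{W})_{bt}$ only by the choice of the connection on the twisting bundle. But the index as a homotopy invariant is independent of the connection so that

$$\mathrm{ch}^{odd}_3(\mathrm{index}((\mathcal{I} \otimes \mathbf{W})_{bt})) = T^*\, \mathrm{ch}^{odd}_3(\mathrm{index}((\mathcal{I}_{\mathrm{Spin}(n)} \otimes \mathbf{W}_{\mathrm{Spin}(n)})_{bt}))$$

is an even class.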

2.2. The Cheeger-Simons character $\frac{\hat p_1}{2}(\mathbf{V})$. In the present paper we use the multiplicative differential extension $(\widehat{H\mathbb{Z}}^*, R, I, a)$ of integral cohomology $H\mathbb{Z}$. Its first model has been constructed in [CS85]. It is uniquely characterized by some natural axioms explained in [SS08] or [BS09]. Here $\widehat{H\mathbb{Z}}^*$ is a contravariant functor from manifolds to graded commutative rings, and

$$R : \widehat{H\mathbb{Z}}^*(M) \to \Omega^*_{cl}(M), \qquad I : \widehat{H\mathbb{Z}}^*(M) \to H\mathbb{Z}^*(M), \qquad a : \Omega^{*-1}(M)/\mathrm{im}(d) \to \widehat{H\mathbb{Z}}^*(M) \qquad (5)$$


are natural transformations. They are restricted by various relations listed e.g. in [BS09]. In the original model of [CS85] the $k$-th differential integral cohomology group $\widehat{H\mathbb{Z}}^{k}(M)$ is the group of differential characters of degree $k$. Let $Z_{k-1}(M)$ denote the group of smooth $(k-1)$-dimensional cycles in $M$. A differential character of degree $k$ is a homomorphism $\phi : Z_{k-1}(M) \to U(1)$ such that there exists a smooth form $R(\phi) \in \Omega^k(M)$ so that

$$\phi(\partial c) = \exp\Big(2\pi i \int_c R(\phi)\Big)$$

for all smooth chains $c \in C_k(M)$. It suffices to verify this condition for small$^{5}$ smooth simplices.

Footnote 5. A smooth simplex in $M$ is called small if it admits a contractible open neighborhood in $M$.

A real spin vector bundle $V \to M$ has a characteristic class $\frac{p_1}{2}(V) \in H^4(M;\mathbb{Z})$. For a geometric spin bundle $\mathbf{V}$ this characteristic class has a differential refinement $\frac{\hat p_1}{2}(\mathbf{V}) \in \widehat{H\mathbb{Z}}^{4}(M)$ first defined using Chern-Weil theory in [CS85]. We now describe this class as a differential character $\frac{\hat p_1}{2}(\mathbf{V}) : Z_3(M) \to U(1)$. If $z \in Z_3(M)$, then we can choose a neighborhood $U \subseteq M$ of the trace $|z|$ of $z$ which is homotopy equivalent to a three-dimensional CW-complex. Furthermore, since $B\mathrm{Spin}(n)$ is 3-connected, we can choose a trivialization $Q$ of the spin bundle $V_{|U}$.

Definition 2.5. We define

$$\frac{\hat p_1}{2}(\mathbf{V})(z) := \exp\Big(\pi i \int_z \eta^3(\mathcal{W}_{t_Q})\Big). \qquad (6)$$

We must show that this does not depend on the choice of the trivialization. Indeed, if $Q'$ is a second trivialisation, then by Lemma 2.4,

$$\int_z \eta^3(\mathcal{W}_{t_{Q'}}) - \int_z \eta^3(\mathcal{W}_{t_Q}) \in 2\mathbb{Z}.$$

If $c \in C_4(M)$ is a small smooth 4-simplex, then we can still choose a trivialisation $Q$ of the spin bundle $V_{|U}$ on some neighborhood $U$ of $|c|$. In this case we get by Stokes' theorem and (4),

$$\frac{\hat p_1}{2}(\mathbf{V})(\partial c) = \exp\Big(\pi i \int_c d\eta^3(\mathcal{W}_{t_Q})\Big) = \exp\Big(\pi i \int_c p_1(\nabla^V)\Big).$$

Therefore

$$R\Big(\frac{\hat p_1}{2}(\mathbf{V})\Big) = \frac{1}{2}\, p_1(\nabla^V)$$

as it should be.

Lemma 2.6. The differential cohomology class defined by the differential character above coincides with the class $\frac{\hat p_1}{2}(\mathbf{V})$ defined in [CS85].

Proof. For the moment, let us denote the differential cohomology class corresponding to the differential character described above by $\hat\phi(\mathbf{V})$. It is easy to see that $\hat\phi(\mathbf{V})$ is natural with respect to base change along smooth maps $M \to M'$. Since $B\mathrm{Spin}(n)$ is 3-connected and $H^4(B\mathrm{Spin}(n);\mathbb{Z})$ is torsion-free, there exists a smooth manifold $M'$ with a real $n$-dimensional geometric spin bundle $\mathbf{V}'$ and a map $f : M \to M'$ such that $\mathbf{V} \cong f^*\mathbf{V}'$, this isomorphism is covered by an isomorphism $\mathrm{Spin}(V) \cong f^*\mathrm{Spin}(V')$, and such that $H^3(M';\mathbb{R}) = 0$ and $H^4(M';\mathbb{Z})$ is torsion free. Under these conditions a class $\hat x \in \widehat{H\mathbb{Z}}^{4}(M')$ is completely determined by its curvature $R(\hat x) \in \Omega^4(M')$. Since

$$R(\hat\phi(\mathbf{V}')) = \frac{1}{2}\, p_1(\nabla^{V'}) = R\Big(\frac{\hat p_1}{2}(\mathbf{V}')\Big)$$

we have $\hat\phi(\mathbf{V}') = \frac{\hat p_1}{2}(\mathbf{V}')$. By naturality this implies $\hat\phi(\mathbf{V}) = \frac{\hat p_1}{2}(\mathbf{V})$.
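The independence of Definition 2.5 from the chosen trivialization comes down to a single line, and it shows why the evenness statement of Lemma 2.4 (rather than mere integrality) is needed when the exponent carries the factor $\pi i$ instead of $2\pi i$. The sketch below writes $Q'$ for the second trivialization; the integer $k$ is introduced here for illustration only.

```latex
% Lemma 2.4 on U: the class of the difference is even, so pairing with the
% cycle z gives an even integer:
\int_z \eta^3(\mathcal{W}_{t_{Q'}}) - \int_z \eta^3(\mathcal{W}_{t_Q}) = 2k,
\qquad k \in \mathbb{Z}.
% Hence the two candidate values of the character agree:
\exp\Big(\pi i \int_z \eta^3(\mathcal{W}_{t_{Q'}})\Big)
  = \exp\Big(\pi i \int_z \eta^3(\mathcal{W}_{t_Q})\Big)\cdot \exp(2\pi i\, k)
  = \exp\Big(\pi i \int_z \eta^3(\mathcal{W}_{t_Q})\Big).
% If the difference were only known to be an odd integer 2k+1, the extra
% factor would be \exp(\pi i (2k+1)) = -1 and the value would be ambiguous.
```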

2.3. The construction of L. Let $\pi : E \to B$ be a bundle of compact oriented two-dimensional manifolds. In the present subsection we construct a geometric line bundle $\mathbf{L} = (L, h^L, \nabla^L)$ over $B$ functorially associated to a geometric spin bundle $\mathbf{V}$ over $E$ which is trivial as a spin bundle over the fibres of $\pi$. More precisely, by functoriality we mean that for a smooth map $f : B' \to B$ and induced cartesian diagram

$$\begin{array}{ccc} E' & \xrightarrow{\;F\;} & E \\ {\scriptstyle \pi'}\downarrow & & \downarrow{\scriptstyle \pi} \\ B' & \xrightarrow{\;f\;} & B \end{array} \qquad (7)$$

we have an associated isomorphism $\phi_f : f^*\mathbf{L} \xrightarrow{\sim} \mathbf{L}'$, where $\mathbf{L}'$ is the line bundle associated to $F^*\mathbf{V}$. Moreover, for a second smooth map $g : B'' \to B'$ we have the associativity relation $\phi_g \circ g^*\phi_f = \phi_{f \circ g}$ between these isomorphisms.

We are going to describe the geometric bundle $\mathbf{L}$ by describing its local sections. We first describe a set-valued presheaf $I_{\dots} : B \supseteq U \mapsto I_U$ on $B$ such that $I_U$ is non-empty if $U$ is contractible. The elements $Q \in I_U$ will index local sections $s_Q \in C^\infty(U, L)$. Altogether, the maps $I_U \ni Q \mapsto s_Q \in C^\infty(U, L)$ for all $U \subseteq B$ combine to a map $I_{\dots} \to C^\infty(\dots, L)$ of presheaves from $I_{\dots}$ to the sheaf of local sections of $L$. In order to complete the definition of $(L, h^L)$ we must provide the transition functions

$$\frac{s_{Q'}}{s_Q} = c(Q', Q) \in C^\infty(U, U(1))$$

for pairs $Q, Q' \in I_U$, see (9). These functions must fit into a map of presheaves $I_{\dots} \times I_{\dots} \to C^\infty(\dots, U(1))$ and satisfy a cocycle relation. In order to construct the connection $\nabla^L$ of $\mathbf{L}$ we will explicitly (see (8)) describe the connection one-forms $\omega_Q := \nabla^L \log s_Q \in \Omega^1(U)$ and verify their compatibility with the transition functions.

We now explain the details. We let $I_U$ be the set of trivialisations $Q : \mathrm{Spin}(V_{|E_U}) \xrightarrow{\sim} E_U \times \mathrm{Spin}(n)$, where $E_U = \pi^{-1}(U)$. Recall that we assume that the spin bundle $\mathrm{Spin}(V) \to E$ is trivial on the fibres of $\pi : E \to B$. Hence, if $U \subseteq B$ is contractible, then $I_U \neq \emptyset$. The spin trivialization $Q \in I_U$ of $V_{|E_U} \to E_U$ induces a tamed geometric family $\mathcal{W}_{t_Q}$ over $E_U$ as described in Subsect. 2.1. We define the connection one-form

$$\omega_Q := -\pi i \int_{E_U/U} \eta^3(\mathcal{W}_{t_Q}) \in \Omega^1(U). \qquad (8)$$

Since it is imaginary it defines a metric connection. It suffices to define the transition functions for contractible open subsets $U \subseteq B$ so that they are compatible with the restriction to contractible $U' \subseteq U$, see (11).


Assume that $U$ is contractible and consider $Q, Q' \in I_U$. Since the fibres of $\pi : E \to B$ are two-dimensional, the manifold $E_U$ is homotopy equivalent to an at most two-dimensional CW-complex. Since $\mathrm{Spin}(n)$ is two-connected, there exists a trivialization $H$ of the spin bundle $\mathrm{pr}^* V_{|E_U}$ connecting $Q$ with $Q'$, where $\mathrm{pr} : [0,1] \times E_U \to E_U$ is the projection. We define

$$c(Q', Q) := \exp\Big(-\pi i \int_{[0,1] \times E_U/U} \eta^3(\mathrm{pr}^*\mathcal{W}_{t_H})\Big) \in C^\infty(U, U(1)). \qquad (9)$$

This is independent of the choice of $H$. Indeed, if $H'$ is another choice, then we can concatenate these two choices in order to get a Spin-trivialization $G := H \ast H'^{op}$ of $\mathrm{pr}^* V_{|E_U}$, where $\mathrm{pr}$ is now the projection $\mathrm{pr} : S^1 \times E_U \to E_U$. We must show that

$$\int_{S^1 \times E_U/U} \eta^3(\mathrm{pr}^*\mathcal{W}_{t_G}) \in C^\infty(U, 2\mathbb{Z}). \qquad (10)$$

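Let $u \in U$, let $E_u := \pi^{-1}(u)$ be the fibre over $u$, and let $q : S^1 \times E_u \to E$ be the restriction of $\mathrm{pr}$. Since $q$ factors over the two-dimensional manifold $E_u$ we conclude that $0 = q^* : \widehat{H\mathbb{Z}}^{4}(E) \to \widehat{H\mathbb{Z}}^{4}(S^1 \times E_u)$. By the construction of the differential character $\frac{\hat p_1}{2}(\mathbf{V})$ in Subsect. 2.2 we have

$$\frac{\hat p_1}{2}(q^*\mathbf{V}) = a\Big(\frac{1}{2}\, \eta^3(q^*\mathcal{W}_{t_G})\Big) = q^* a\Big(\frac{1}{2}\, \eta^3(\mathcal{W}_{t_G})\Big) = 0,$$

where $a : \Omega^3(\dots)/\mathrm{im}(d) \to \widehat{H\mathbb{Z}}^{4}(\dots)$ denotes one of the structure maps of the differential extension $\widehat{H\mathbb{Z}}$ listed in (5). This implies that

$$\int_{S^1 \times E_u} \frac{\hat p_1}{2}(q^*\mathbf{V}) = a\Big(\frac{1}{2} \Big(\int_{S^1 \times E_U/U} \eta^3(\mathrm{pr}^*\mathcal{W}_{t_G})\Big)_{|\{u\}}\Big) = 0,$$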

and hence the relation (10). It immediately follows from the construction that

$$c(Q', Q)_{|U'} = c(Q'_{|U'}, Q_{|U'}) \qquad (11)$$

for a contractible open subset $U' \subseteq U$.

We now check the cocycle condition. Let $Q'' \in I_U$ be a third trivialization. Then we can concatenate the path $H$ with a path $H'$ from $Q'$ to $Q''$ to a path $G := H \ast H'$ from $Q$ to $Q''$. The cocycle condition now follows from

$$\int_{[0,1] \times E_U/U} \eta^3(\mathrm{pr}^*\mathcal{W}_{t_H}) + \int_{[0,1] \times E_U/U} \eta^3(\mathrm{pr}^*\mathcal{W}_{t_{H'}}) = \int_{[0,1] \times E_U/U} \eta^3(\mathrm{pr}^*\mathcal{W}_{t_G}).$$
Finally we check the compatibility with the connection one-forms. We must show that $\omega_{Q'} - \omega_Q = c(Q', Q)^{-1}\, dc(Q', Q)$. Indeed, by Stokes' theorem it follows immediately from the definition (9) and

$$\int_{[0,1] \times E_U/U} d\eta^3(\mathrm{pr}^*\mathcal{W}_{t_H}) = \int_{[0,1] \times E_U/U} \mathrm{pr}^* p_1(\nabla^V) = 0$$

that

$$c(Q', Q)^{-1}\, dc(Q', Q) = -\pi i \Big[\int_{E_U/U} \eta^3(\mathcal{W}_{t_{Q'}}) - \int_{E_U/U} \eta^3(\mathcal{W}_{t_Q})\Big].$$
This finishes the construction of the bundle $\mathbf{L}$. Given a cartesian diagram (7) and an open subset $U \subseteq B$, by pulling back trivialisations we get a map $F^* : I_U \to I'_{f^{-1}(U)}$. The bundle map $\phi_f : f^*\mathbf{L} \to \mathbf{L}'$ is characterized by the property that for $Q \in I_U$ it maps the section $f^* s_Q \in C^\infty(f^{-1}(U), f^*L)$ to the section $s_{F^*Q} \in C^\infty(f^{-1}(U), L')$. One easily checks, using the functoriality of η-forms, that this defines an isomorphism of geometric line bundles which behaves as required under compositions.

If $\pi : E \to B$ is a proper submersion with an orientation of the vertical bundle $T^v\pi$, then there is an integration map

$$\int_{E/B} : \widehat{H\mathbb{Z}}^{*}(E) \to \widehat{H\mathbb{Z}}^{*-\dim(E)+\dim(B)}(B).$$

It has a very convenient description in the model introduced in [BKS09], see also the literature cited therein. In the present paper we will only need the existence of the integration map, its compatibility with cartesian diagrams of the form (7) in the sense that

$$\int_{E'/B'} F^*\hat x = f^* \int_{E/B} \hat x$$

for $\hat x \in \widehat{H\mathbb{Z}}^{*}(E)$, and the property that for $\alpha \in \Omega^{*-1}(E)$ we have

$$\int_{E/B} a(\alpha) = a\Big(\int_{E/B} \alpha\Big).$$
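We now come back to our surface bundle $\pi : E \to B$. We calculate the first differential Chern class $\hat c_1(\mathbf{L}) \in \widehat{H\mathbb{Z}}^{2}(B)$ of the geometric line bundle $\mathbf{L}$. This task is equivalent to the calculation of the holonomy of $\mathbf{L}$ since the corresponding differential character $\hat c_1(\mathbf{L}) : Z_1(B) \to U(1)$ associates to the smooth cycle $z \in Z_1(B)$ the holonomy $\mathrm{hol}(\mathbf{L})(z) \in U(1)$ of $\mathbf{L}$ along $z$.

Lemma 2.7. We have

$$\hat c_1(\mathbf{L}) = -\int_{E/B} \frac{\hat p_1}{2}(\mathbf{V}).$$

Proof. Let $z \in Z_1(B)$ be a smooth one-cycle. Then we can find a neighborhood $U$ of its trace $|z|$ which is homotopy equivalent to a one-dimensional CW-complex. Since $B\mathrm{Spin}(n)$ is 3-connected and $E_U := \pi^{-1}(U)$ is homotopy equivalent to a three-dimensional CW-complex, there exists an element $Q \in I_U$. But then

$$\frac{\hat p_1}{2}(\mathbf{V})_{|E_U} = a\Big(\frac{1}{2}\, \eta^3(\mathcal{W}_{t_Q})\Big).$$

Hence

$$\int_{E_U/U} \frac{\hat p_1}{2}(\mathbf{V})_{|E_U} = a\Big(\frac{1}{2} \int_{E_U/U} \eta^3(\mathcal{W}_{t_Q})\Big)$$

so that

$$\Big(\int_{E/B} \frac{\hat p_1}{2}(\mathbf{V})\Big)(z) = \Big(\int_{E_U/U} \frac{\hat p_1}{2}(\mathbf{V})_{|E_U}\Big)(z) = \exp\Big(\pi i \int_z \int_{E_U/U} \eta^3(\mathcal{W}_{t_Q})\Big).$$

The spin trivialization $Q$ of $V_{|E_U}$ gives rise to the section $s_Q$ and the corresponding connection one-form $\omega_Q$ defined by (8). We have

$$\mathrm{hol}(\mathbf{L})(z) = \exp\Big(\int_z \omega_Q\Big) = \exp\Big(-\pi i \int_z \int_{E_U/U} \eta^3(\mathcal{W}_{t_Q})\Big).$$

Hence

$$\hat c_1(\mathbf{L})(z) = \mathrm{hol}(\mathbf{L})(z) = \Big(-\int_{E_U/U} \frac{\hat p_1}{2}(\mathbf{V})\Big)(z).$$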

3. The Pfaffian

3.1. Pfaffian and determinant line bundles for families of Dirac operators. In this subsection we recall the construction of the Pfaffian line bundle of a family of Dirac operators with a real structure. Let $E$ be an even geometric family over a base $B$ such that the underlying Clifford bundle has an odd anti-linear, anti-selfadjoint automorphism $J$. In particular, $J$ is parallel and commutes with the Clifford multiplication. In this situation we define a geometric line bundle $\mathrm{Pfaff}(E,J)$ over $B$. It is functorial and comes with a canonical isomorphism
\[
\kappa : \mathrm{Pfaff}(E,J)^2 \xrightarrow{\ \sim\ } \det(E)^{-1}, \tag{12}
\]
where $\det(E)$ is the determinant line bundle of $E$, see e.g. [BGV92, Ch 9 and 10] and the recapitulation below. In order to construct $\mathrm{Pfaff}(E,J)$ we will first construct the underlying complex line bundle together with the isomorphism (12). We then define the geometry on $\mathrm{Pfaff}(E,J)$ in the unique way such that (12) becomes an isomorphism of geometric bundles.

First we recall the construction [Fre03,Bor92,FM06]. The geometric family $E$ gives rise to a family of Dirac operators $D(E)$ which acts on the bundle of $\mathbb{Z}/2\mathbb{Z}$-graded Hilbert spaces $H(E)$. We consider the families of anti-linear and anti-selfadjoint operators $D^\pm := (J D(E))^\pm$ which act on the subbundles $H(E)^\pm$. The compositions $\Delta^\pm := D^\mp(E)\,D(E)^\pm$ are non-negative and selfadjoint. For $\lambda \in [0,\infty)$ we consider the open subset
\[
U_\lambda := \{u \in B \mid \lambda^2 \notin \sigma(\Delta^+(u)) \cup \sigma(\Delta^-(u))\} \subseteq B,
\]
where $\sigma(\Delta^\pm(u))$ denotes the spectrum of the operator $\Delta^\pm(u)$ over the point $u \in B$. By $E^\pm[0,\lambda]$ we denote the spectral projection of $\Delta^\pm$ onto the interval $[0,\lambda]$. This family of projections is smooth over $U_\lambda$. Its image $H^\pm_\lambda := E^\pm[0,\lambda]H(E)^\pm$ is a finite-dimensional smooth complex vector bundle. For an $n$-dimensional complex vector bundle $W \to X$ we let $\det(W) \to X$ denote the maximal non-trivial alternating power $\det(W) := \Lambda^n W$. On $U_\lambda$ we define the line bundle
\[
\mathrm{Pfaff}(E,J)_\lambda := \det(H^+_\lambda).
\]

If $\mu > \lambda$, then over $U_\lambda \cap U_\mu$ we have an orthogonal decomposition $H^+_\mu \cong H^+_\lambda \oplus H^+_{\lambda,\mu}$ of smooth bundles of Hilbert spaces, where $H^+_{\lambda,\mu} = E^+(\lambda,\mu]H(E)^+$. On $U_\lambda \cap U_\mu$ we get an isomorphism
\[
\mathrm{Pfaff}(E,J)_\mu \cong \mathrm{Pfaff}(E,J)_\lambda \otimes \det(H^+_{\lambda,\mu}).
\]
Note that $D^+|_{H^+_{\lambda,\mu}}$ is an anti-linear anti-symmetric isomorphism of $H^+_{\lambda,\mu}$. Therefore the form
\[
d_{\lambda,\mu}(\dots,\dots) := \langle D^+|_{H^+_{\lambda,\mu}}\dots,\dots\rangle \in (\Lambda^2 H^+_{\lambda,\mu})^*
\]
is nowhere vanishing. Hence we get a nowhere vanishing section
\[
\mathrm{Pfaff}(D^+|_{H^+_{\lambda,\mu}}) := d_{\lambda,\mu}^{-\dim(H^+_{\lambda,\mu})/2} \in C^\infty(U_\lambda \cap U_\mu, \det(H^+_{\lambda,\mu})).
\]
We define an isomorphism
\[
c_{\lambda,\mu} : \mathrm{Pfaff}(E,J)_\lambda \xrightarrow{\ \sim\ } \mathrm{Pfaff}(E,J)_\mu, \qquad c_{\lambda,\mu}(s) := s \otimes \mathrm{Pfaff}(D^+|_{H^+_{\lambda,\mu}}).
\]
For $\nu \ge \mu \ge \lambda$ these isomorphisms satisfy the cocycle condition $c_{\mu,\nu} \circ c_{\lambda,\mu} = c_{\lambda,\nu}$. We glue the collection of line bundles $(\mathrm{Pfaff}(E,J)_\lambda \to U_\lambda)_{\lambda\ge0}$ using the cocycle $(c_{\lambda,\mu})_{\lambda,\mu\ge0}$ in order to get the underlying complex line bundle of $\mathrm{Pfaff}(E,J)$.

For the construction of the metric and the connection we invoke a canonical isomorphism
\[
\kappa : \mathrm{Pfaff}(E,J)^2 \xrightarrow{\ \sim\ } \det(E)^{-1}.
\]
To this end we recall the very similar construction of the determinant line bundle $\det(E)$. Over $U_\lambda$ we define the line bundle $\det(E)_\lambda := \det(H^+_\lambda)^{-1} \otimes \det(H^-_\lambda)$. For $\mu > \lambda$ on $U_\lambda \cap U_\mu$ we have isomorphisms
\[
\det(E)_\mu \cong \det(E)_\lambda \otimes \det(H^+_{\lambda,\mu})^{-1} \otimes \det(H^-_{\lambda,\mu}).
\]
The operator
\[
D^+(E)|_{H^+_{\lambda,\mu}} : H^+_{\lambda,\mu} \to H^-_{\lambda,\mu}
\]
is an isomorphism and therefore gives a nowhere vanishing section
\[
\det(D^+(E)|_{H^+_{\lambda,\mu}}) \in C^\infty(U_\lambda \cap U_\mu, \det(H^+_{\lambda,\mu})^{-1} \otimes \det(H^-_{\lambda,\mu})).
\]
We define the cocycle
\[
f_{\lambda,\mu} : \det(E)_\lambda \to \det(E)_\mu, \qquad f_{\lambda,\mu}(s) := s \otimes \det(D^+(E)|_{H^+_{\lambda,\mu}}).
\]
The underlying complex vector bundle of $\det(E)$ is obtained by glueing the family of bundles $(\det(E)_\lambda \to U_\lambda)_{\lambda\ge0}$ using the cocycle $(f_{\lambda,\mu})_{\lambda,\mu\ge0}$.

We now define the isomorphism $\kappa : \mathrm{Pfaff}(E,J)^2 \xrightarrow{\sim} \det(E)^{-1}$ by defining a collection $(\kappa_\lambda)_{\lambda\ge0}$ of isomorphisms
\[
\kappa_\lambda : \mathrm{Pfaff}(E,J)^2_\lambda \cong \det(H^+_\lambda) \otimes \det(H^+_\lambda) \longrightarrow \det(H^+_\lambda) \otimes \det(H^-_\lambda)^{-1} \cong \det(E)^{-1}_\lambda.
\]
Indeed, the part $J^+$ of the anti-linear isomorphism $J$ induces an isomorphism $J^+_\lambda : H^+_\lambda \xrightarrow{\sim} \bar H^-_\lambda$, and therefore an isomorphism $\det(J^+_\lambda) : \det(H^+_\lambda) \to \det(H^-_\lambda)^{-1}$. We set $\kappa_\lambda := 1 \otimes \det(J^+_\lambda)$. It is easy to check that the collection $(\kappa_\lambda)_{\lambda\ge0}$ is compatible with the cocycles and therefore defines an isomorphism $\kappa$ as required. As explained above the geometry of $\mathrm{Pfaff}(E,J)$, i.e. the metric and the connection, is now defined in the unique way such that $\kappa$ becomes an isomorphism of geometric line bundles. This finishes the description of the Pfaffian bundle $\mathrm{Pfaff}(E,J)$ as a geometric line bundle.$^6$

The family of Dirac operators $D(E)$ is invertible on $U_0$. Moreover, since $H^\pm_0 = 0$ we have a canonical isomorphism $\det(H^\pm_0) \cong U_0 \times \mathbb{C}$ and therefore isomorphisms $\mathrm{Pfaff}(E,J)_0 \cong U_0 \times \mathbb{C}$, $\det(E)_0 \cong U_0 \times \mathbb{C}$. We let
\[
s^{1/2}_{can} \in C^\infty(U_0, \mathrm{Pfaff}(E,J)), \qquad s_{can} \in C^\infty(U_0, \det(E))
\]
denote the corresponding sections. Then $\kappa(s^{1/2}_{can} \otimes s^{1/2}_{can}) = s^{-1}_{can}$. By [BGV92, Ch 9],
\[
\|s_{can}\|^2 = \det(\Delta^+), \qquad \|s^{1/2}_{can}\|^2 = \sqrt{\det(\Delta^+)}^{\,-1}. \tag{13}
\]
We let $s^0_{can}$ and $s^{1/2,0}_{can}$ denote the normalized sections
\[
s^0_{can} := \frac{s_{can}}{\|s_{can}\|}, \qquad s^{1/2,0}_{can} := \frac{s^{1/2}_{can}}{\|s^{1/2}_{can}\|}.
\]

6 We again remind the reader that our definition is the inverse of the Pfaffian in [Fre03].

The connection one-form of $\det(E)$ is determined by
\[
\nabla^{\det(E)} \log s^0_{can} = 2\pi i\,\eta^1(E_{U_0,t}),
\]
where $E_{U_0,t}$ is the canonical taming of $E_{U_0}$ (by the zero perturbation). Therefore the connection one-form for the Pfaffian bundle is given by
\[
\nabla^{\mathrm{Pfaff}(E,J)} \log s^{1/2,0}_{can} = -\pi i\,\eta^1(E_{U_0,t}). \tag{14}
\]
Finally we calculate the curvature (see [BGV92, Thm 10.35], [Fre03, Thm 3.1]):
\[
R(\det(E)) = 2\pi i\,\Omega^2(E), \qquad R(\mathrm{Pfaff}(E,J)) = -\pi i\,\Omega^2(E). \tag{15}
\]

3.2. Theory for generalized Dirac operators. The Dirac operator $D(E)$ of a geometric family is compatible, i.e. associated to a bundle of Clifford modules. This fact is important if one wants to use the standard local index theory calculations [BGV92] which e.g. give (15). On the other hand, the construction of the Pfaffian and determinant line bundles works equally well for generalized Dirac operators, i.e. zero-order perturbations of compatible Dirac operators. In the present paper we will consider generalized Dirac operators which arise as follows.

Recall that we consider a two-dimensional geometric family $E$ with underlying surface bundle $\pi : E \to B$ whose Dirac bundle is the spinor bundle $S(T^v\pi)$, and a real geometric vector bundle $V$ over $E$. Our final goal is to study the Pfaffian of the family of Dirac operators associated to the geometric family $E \otimes W$, where $W := W_+ \oplus W_-$ with $W_+ := V \otimes_{\mathbb{R}} \mathbb{C}$, and $W_-$ is a trivial bundle of dimension $\dim_{\mathbb{R}} V$. The odd anti-involution $J$ is induced by the quaternionic structure of $S(T^v\pi)$ and the real structure of $W$. If $\bar Q \in \mathrm{End}(W)$ is an odd, selfadjoint endomorphism which commutes with the real structure of $W$, then we can form the family of generalized Dirac operators
\[
D(E \otimes W, \bar Q) := D(E \otimes W) + 1 \otimes \bar Q.
\]
Its Pfaffian and determinant line bundles will be denoted by
\[
\mathrm{Pfaff}(E \otimes W, \bar Q, J), \qquad \det(E \otimes W, \bar Q).
\]
Let us now discuss the contribution of the additional term $1 \otimes \bar Q$ to the local index calculation. For simplicity we just write $\bar Q$ instead of $1 \otimes \bar Q$. Let $A_t := A_t(E \otimes W)$ be the rescaled super connection of the geometric family $E \otimes W$. With the additional term we must consider the super connection $A_{\bar Q,t} := A_t + t^{1/2}\bar Q$. Note that (see [BGV92, Prop. 10.15])
\[
A_{\bar Q,t} = t^{1/2}\big(D(E \otimes W) + \bar Q\big) + \nabla^{H(E\otimes W)} - \frac{1}{4t^{1/2}}\,c(T), \tag{16}
\]
where $T$ is the curvature tensor associated to the horizontal distribution $T^h\pi$, and $\nabla^{H(E\otimes W)}$ is a unitary connection on the Hilbert bundle $H(E \otimes W)$. We now calculate the square
\[
A^2_{\bar Q,t} = A_t^2 + t\big([D(E \otimes W), \bar Q] + \bar Q^2\big) + t^{1/2}\,[\nabla^{H(E\otimes W)}, \bar Q]. \tag{17}
\]
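As a consistency check, the square (17) can be expanded term by term. The sketch below assumes only the decomposition (16), the graded commutator $[X,Y] = XY + YX$ for odd $X, Y$, and the vanishing of $[c(T), \bar Q]$ that the text invokes; the abbreviations $D := D(E \otimes W)$ and $\nabla := \nabla^{H(E\otimes W)}$ are introduced here only for brevity.

```latex
\begin{align*}
A_{\bar Q,t}^2 &= \bigl(A_t + t^{1/2}\bar Q\bigr)^2
  = A_t^2 + t^{1/2}\,[A_t,\bar Q] + t\,\bar Q^2,\\
t^{1/2}\,[A_t,\bar Q]
  &= t^{1/2}\Bigl[\,t^{1/2}D + \nabla - \tfrac{1}{4t^{1/2}}\,c(T),\ \bar Q\Bigr]
  = t\,[D,\bar Q] + t^{1/2}\,[\nabla,\bar Q] - \tfrac14\,[c(T),\bar Q],\\
A_{\bar Q,t}^2 &= A_t^2 + t\bigl([D,\bar Q] + \bar Q^2\bigr) + t^{1/2}\,[\nabla,\bar Q]
  \qquad\text{since } [c(T),\bar Q] = 0 .
\end{align*}
```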


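Note that for two odd endomorphisms $X, Y$ we have by definition $[X,Y] := XY + YX$. The mixed term with $\bar Q$ and the curvature $T$ disappears because $[c(T), \bar Q] = 0$, since the Clifford multiplication anti-commutes with $\bar Q$. We now calculate the commutator terms using a vertical orthonormal frame $(e_i)$ as well as a horizontal frame $(f_\alpha)$ and its dual $(f^\alpha)$. Note that
\[
\nabla^{H(E\otimes W)} = \sum_\alpha f^\alpha\Big(\nabla^{S(T^v\pi)\otimes W}_{f_\alpha} + \frac12\,k(f_\alpha)\Big), \tag{18}
\]
where $k$ is the mean curvature of the fibre in the direction $f_\alpha$. Then we have
\[
[D(E \otimes W), \bar Q] = \sum_i \big[c(e_i)\nabla^{S(T^v\pi)\otimes W}_{e_i}, \bar Q\big] = \sum_i c(e_i)\,\nabla^W_{e_i}\bar Q
\]
and
\[
[\nabla^{H(E\otimes W)}, \bar Q] = \sum_\alpha \Big[f^\alpha\Big(\nabla^{S(T^v\pi)\otimes W}_{f_\alpha} + \frac12\,k(f_\alpha)\Big), \bar Q\Big] = \sum_\alpha f^\alpha\,\nabla^W_{f_\alpha}\bar Q.
\]
The mean curvature drops out since $[k(f_\alpha), \bar Q] = 0$ and $[f^\alpha, \bar Q] = 0$. We now perform the Getzler rescaling as in [BGV92, Ch 10]. We only study the terms involving $\bar Q$. We have
\[
\lim_{u\to0}\delta_u(t\bar Q^2) = 0, \qquad \lim_{u\to0}\delta_u\big(t\,[D(E \otimes W), \bar Q]\big) = 0,
\]
\[
\lim_{u\to0}\delta_u\big(t^{1/2}[\nabla^{H(E\otimes W)}, \bar Q]\big) = t^{1/2}\sum_\alpha f^\alpha\,\nabla^W_{f_\alpha}\bar Q =: t^{1/2}\,\nabla^h\bar Q. \tag{19}
\]
In particular the limit $\lim_{u\to0}\delta_u(A^2_{\bar Q,t})$ exists. Therefore the $t \to 0$-asymptotic of $\mathrm{Tr}\exp(-A^2_{\bar Q,t})$ is still regular. But we may get a contribution to the local index form
\[
\Omega(E \otimes W, \bar Q) := \varphi \lim_{t\to0}\mathrm{Tr}\exp(-A^2_{\bar Q,t}),
\]
where $\varphi$ scales $2k$-forms by $\frac{1}{(2\pi i)^k}$. In formula [BGV92, 10.28] one has to replace the twisting curvature $F = R^{\nabla^W}$ by $R^{\nabla^W} + \nabla^h\bar Q$. Note that $\nabla^W\bar Q \in \Omega^1(\mathrm{End}(W)^{odd})$ commutes with multiplication by two-forms, but not necessarily with $R^{\nabla^W}$. We get
\[
\Omega(E \otimes W, \bar Q) = \varphi\int_{E/B} {\det}^{1/2}\!\left(\frac{R^{\nabla^{T^v\pi}}/2}{\sinh(R^{\nabla^{T^v\pi}}/2)}\right)\mathrm{tr}\exp\big(-R^{\nabla^W} - \nabla^h\bar Q\big).
\]
We calculate the 4-form component of the integrand. Note that $\dim W = \dim W^+ - \dim W^- = 0$, so that
\[
\begin{aligned}
\left[{\det}^{1/2}\!\left(\frac{R^{\nabla^{T^v\pi}}/2}{\sinh(R^{\nabla^{T^v\pi}}/2)}\right)\mathrm{tr}\exp\big(-R^{\nabla^W} - \nabla^h\bar Q\big)\right]_4
&= \Big[\mathrm{tr}\exp\big(-R^{\nabla^W} - \nabla^h\bar Q\big)\Big]_4\\
&= \frac12\,\mathrm{tr}(R^{\nabla^W})^2 - \frac12\,\mathrm{tr}\big(R^{\nabla^W}(\nabla^h\bar Q)^2\big) + \frac{1}{24}\,\mathrm{tr}(\nabla^h\bar Q)^4.
\end{aligned}
\]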
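The 4-form component can be verified by expanding the exponential by form degree. The following sketch uses the shorthand $R := R^{\nabla^W}$ (degree 2) and $H := \nabla^h\bar Q$ (degree 1), introduced here only for brevity, and it assumes the trace identity $\mathrm{tr}(HRH) = \mathrm{tr}(RH^2) = \mathrm{tr}(H^2R)$.

```latex
\begin{align*}
\bigl[\exp(-R-H)\bigr]_4
 &= \tfrac12\,R^2 \;-\; \tfrac16\bigl(RH^2 + HRH + H^2R\bigr) \;+\; \tfrac1{24}\,H^4,\\
\mathrm{tr}\bigl[\exp(-R-H)\bigr]_4
 &= \tfrac12\,\mathrm{tr}(R^2) - \tfrac36\,\mathrm{tr}(RH^2) + \tfrac1{24}\,\mathrm{tr}(H^4)\\
 &= \tfrac12\,\mathrm{tr}(R^2) - \tfrac12\,\mathrm{tr}(RH^2) + \tfrac1{24}\,\mathrm{tr}(H^4).
\end{align*}
```

The three contributions come from the quadratic, cubic and quartic terms of the exponential series respectively; no other term has total degree 4.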

Here we use that $\mathrm{tr}(\nabla^h\bar Q\,R^{\nabla^W}\nabla^h\bar Q) = \mathrm{tr}(R^{\nabla^W}(\nabla^h\bar Q)^2) = \mathrm{tr}((\nabla^h\bar Q)^2R^{\nabla^W})$. Note that
\[
\Omega^2(E \otimes W) = \frac{1}{2\pi i}\int_{E/B}\frac12\,\mathrm{tr}(R^{\nabla^W})^2
\]
and the form $\mathrm{tr}(\nabla^h\bar Q)^4$ has no vertical component, so that
\[
\Omega^2(E \otimes W, \bar Q) = \Omega^2(E \otimes W) - \frac{1}{4\pi i}\int_{E/B}\mathrm{tr}\big(R^{\nabla^W}(\nabla^h\bar Q)^2\big).
\]
This gives the curvature of the Pfaffian and determinant line bundles of the family of generalized Dirac operators:
\[
\begin{aligned}
\frac{1}{2\pi i}R(\mathrm{Pfaff}(E \otimes W, \bar Q, J)) &= -\frac12\,\Omega^2(E \otimes W) + \frac{1}{8\pi i}\int_{E/B}\mathrm{tr}\big(R^{\nabla^W}(\nabla^h\bar Q)^2\big),\\
\frac{1}{2\pi i}R(\det(E \otimes W, \bar Q)) &= \Omega^2(E \otimes W) - \frac{1}{4\pi i}\int_{E/B}\mathrm{tr}\big(R^{\nabla^W}(\nabla^h\bar Q)^2\big).
\end{aligned} \tag{20}
\]

3.3. Construction of local sections of the Pfaffian. If $U \subseteq B$ is open such that the restriction of $V$ to $E_U$ is trivial as a spin bundle, then as in Sect. 2 we let $I_U$ denote the set of trivialisations. For every $Q \in I_U$ we are going to construct a section $d_Q \in C^\infty(U, \mathrm{Pfaff}(E \otimes W, J))$. Let $\mathrm{pr} : \mathbb{R} \times U \to U$ be the projection and consider the family $\mathrm{pr}^*(E \otimes W)$ over $\mathbb{R} \times U$. Its underlying bundle is $\mathbb{R} \times E_U \to \mathbb{R} \times U$, and we consider the projection $\mathrm{Pr} : \mathbb{R} \times E_U \to E_U$ of total spaces. We define the odd selfadjoint endomorphism $\tilde Q \in \mathrm{End}(\mathrm{Pr}^*W|_{E_U})$ so that it equals $a\bar Q$ on the slice $\{a\} \times E_U \subset \mathbb{R} \times E_U$. We calculate the Laplacian $\Delta((a,u))$ by extracting the zero-form part of (17) over the base point $(a,u) \in \mathbb{R} \times U$ at time $t = 1$. We get
\[
\Delta((a,u)) = D(E \otimes W)^2 + a\,c(\nabla^W\bar Q) + a^2\bar Q^2.
\]
Note that $\bar Q^2 = 1$ is positive. For large $a$ the term $a^2\bar Q^2$ dominates $a\,c(\nabla^W\bar Q)$. More precisely, if we assume that $U$ has a compact closure in $B$, then there exists $a_0 \ge 0$ such that for $a \ge a_0$ the operator $\Delta((a,u))$ is positive, and hence invertible. Therefore we have the section
\[
s^{1/2}_{can} \in C^\infty([a_0,\infty) \times U, \mathrm{Pfaff}(\mathrm{pr}^*(E \otimes W), \tilde Q, \mathrm{Pr}^*J)).
\]

The norm of $s^{1/2}_{can}$ is given by (13), and we consider the unit-norm section $s^{1/2,0}_{can} := \|s^{1/2}_{can}\|^{-1}s^{1/2}_{can}$. We have a canonical identification
\[
\mathrm{Pfaff}(\mathrm{pr}^*(E \otimes W), \tilde Q, \mathrm{Pr}^*J)|_{\{0\}\times U} \cong \mathrm{Pfaff}(E \otimes W, J)|_U.
\]
For $b \ge a_0$ we define the unit-norm section
\[
d(b) \in C^\infty(\mathbb{R} \times U, \mathrm{Pfaff}(\mathrm{pr}^*(E \otimes W), \tilde Q, \mathrm{Pr}^*J))
\]
such that $d(b)(a,u)$ is the parallel transport of $s^{1/2,0}_{can}(b,u)$ along the path $[0,1] \ni t \mapsto ((1-t)b + ta, u)$. We define $d_Q(b) \in C^\infty(U, \mathrm{Pfaff}(E \otimes W, J))$ by evaluation of $d(b)$ at $a = 0$, i.e. $d_Q(b)(u) := d(b)(0,u)$.

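We now consider the $\eta^1$-form
\[
\eta^1 := \eta^1(\mathrm{pr}^*(E \otimes W)_{t\tilde Q}) \in \Omega^1([a_0,\infty) \times U).
\]
Recall from (14) that
\[
-\pi i\,\eta^1 = \nabla^{\mathrm{Pfaff}(\mathrm{pr}^*(E\otimes W),\tilde Q,\mathrm{Pr}^*J)}\log s^{1/2,0}_{can}. \tag{21}
\]
We write $\eta^1 = a^2(da\,\theta + \lambda)$, where $\theta \in C^\infty([a_0,\infty) \times U)$ and $\lambda \in C^\infty([a_0,\infty) \times U, \mathrm{pr}^*T^*B)$.

Proposition 3.1. For $a \to \infty$ there are asymptotic expansions
\[
\theta \sim \sum_{n\ge0}a^{-n}\theta_{-n}, \qquad \lambda \sim \sum_{n\ge0}a^{-n}\lambda_{-n},
\]
where $\theta_i \in C^\infty(U)$, $\lambda_i \in \Omega^1(U)$. Moreover, $\theta_0$ and $\theta_{-3}$ are constant.

Proof. The second assertion will be shown as a consequence of the first in the proof of Proposition 3.2. The existence of the asymptotic expansion will be shown later in Subsect. 3.5.

We define
\[
\tilde d_Q(b) := d_Q(b)\exp\Big(i\pi\Big(\frac13 b^3\theta_0 + \frac12 b^2\theta_{-1} + b\,\theta_{-2} + \log(b)\,\theta_{-3}\Big)\Big).
\]

Proposition 3.2. The limit
\[
d_Q := \lim_{b\to\infty}\tilde d_Q(b)
\]
exists in the $C^1_{loc}$-sense. The connection one-form of the limit is given by
\[
\nabla^{\mathrm{Pfaff}(E\otimes W,J)}\log d_Q = -\pi i\int_{E_U/U}\eta^3(W_{tQ}).
\]

Proof. Note that by (21) for $b' \ge b$ we have
\[
\frac{d_Q(b')}{d_Q(b)} = \frac{d(b')|_{\{b\}\times U}}{d(b)|_{\{b\}\times U}} = \exp\Big(-i\pi\int_{[b,b']\times U/U}\eta^1\Big).
\]
If we insert the asymptotic expansion of $\eta^1$, then we get
\[
\int_{[b,b']\times U/U}\eta^1 = \frac{b'^3 - b^3}{3}\theta_0 + \frac{b'^2 - b^2}{2}\theta_{-1} + (b' - b)\theta_{-2} + (\log(b') - \log(b))\theta_{-3} + O(b'^{-1}, b^{-1}).
\]
It follows that
\[
\frac{\tilde d_Q(b')}{\tilde d_Q(b)} = \exp\big(i\,O(b'^{-1}, b^{-1})\big).
\]
This implies the existence of $\lim_{b\to\infty}\tilde d_Q(b)$.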
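The cancellation behind the existence of the limit can be made explicit. The sketch below assumes the expansion of Proposition 3.1 and integrates the $da$-component $a^2\theta$ of $\eta^1$ term by term over $[b,b']$; the remainder is tracked only to the order stated in the proof.

```latex
\begin{align*}
\int_{[b,b']\times U/U}\eta^1
 &= \int_b^{b'} a^2\,\theta\,da
  \;\sim\; \sum_{n\ge 0}\theta_{-n}\int_b^{b'}a^{2-n}\,da,\\
\int_b^{b'}a^{2}\,da &= \frac{b'^3-b^3}{3},\qquad
\int_b^{b'}a\,da = \frac{b'^2-b^2}{2},\qquad
\int_b^{b'}da = b'-b,\qquad
\int_b^{b'}a^{-1}\,da = \log b'-\log b,\\
\int_b^{b'}a^{2-n}\,da &= \frac{b^{3-n}-b'^{3-n}}{n-3} = O(b^{-1},b'^{-1})
 \qquad (n\ge 4).
\end{align*}
```

The first four integrals are exactly the terms compensated by the exponential factor in the definition of $\tilde d_Q(b)$, so only the $O(b^{-1}, b'^{-1})$ contribution survives in the quotient $\tilde d_Q(b')/\tilde d_Q(b)$.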

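Now we consider the connection one-forms. Let $R := \frac{1}{2\pi i}R^{\mathrm{Pfaff}(\mathrm{pr}^*(E\otimes W),\tilde Q,\mathrm{Pr}^*J)}$. Note that by (20) we have
\[
R = -\frac12\,\mathrm{pr}^*\Omega^2(E \otimes W) + \frac{1}{8\pi i}\int_{[0,1]\times E_U/[0,1]\times U}\mathrm{tr}\big(\mathrm{pr}^*R^{\nabla^W}(\nabla^h\tilde Q)^2\big).
\]
Since $\tilde Q$ contains the variable $a$ linearly we see that
\[
-2R = a\,da \wedge \mathrm{pr}^*S + a^2\,\mathrm{pr}^*T + \mathrm{pr}^*\Omega^2(E \otimes W)
\]
for some $S \in \Omega^1(U)$ and $T \in \Omega^2(U)$. On $[a_0,\infty) \times U$ we have the identity $d\eta^1 = -2R$. If we assume that $\eta^1$ has an asymptotic expansion as stated in Proposition 3.1, then we get
\[
-\sum_{n\ge0}a^{2-n}\,da \wedge d\theta_{-n} + \sum_{n\ge0}a^{2-n}\,d\lambda_{-n} + \sum_{n\ge0}(2-n)\,a^{1-n}\,da \wedge \lambda_{-n} = a\,da \wedge \mathrm{pr}^*S + a^2\,\mathrm{pr}^*T + \mathrm{pr}^*\Omega^2(E \otimes W).
\]
This gives the following identities for the terms in the asymptotic expansion of $\eta^1$:
\[
\begin{array}{lll}
a^2: & d\lambda_0 = \mathrm{pr}^*T, & d\theta_0 = 0,\\
a^1: & d\lambda_{-1} = 0, & -d\theta_{-1} + 2\lambda_0 = \mathrm{pr}^*S,\\
a^0: & d\lambda_{-2} = \mathrm{pr}^*\Omega^2(E \otimes W), & -d\theta_{-2} + \lambda_{-1} = 0,\\
a^{-1}: & d\lambda_{-3} = 0, & -d\theta_{-3} = 0.
\end{array}
\]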
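The identities above arise by comparing coefficients. As a sketch, write $\eta^1 = \sum_{n\ge0}a^{2-n}(da\,\theta_{-n} + \lambda_{-n})$, use $d(da\,\theta_{-n}) = -da \wedge d\theta_{-n}$, and adopt the convention (an assumption made here for bookkeeping) that $\lambda_{-n} = 0$ for $n < 0$:

```latex
\begin{align*}
d\eta^1 &= \sum_{n\ge0}\Bigl(-a^{2-n}\,da\wedge d\theta_{-n}
          + (2-n)\,a^{1-n}\,da\wedge\lambda_{-n} + a^{2-n}\,d\lambda_{-n}\Bigr),\\
\text{coefficient of } a^{k}\,da\wedge(\cdot):\quad
        & -d\theta_{-(2-k)} + (k+1)\,\lambda_{-(1-k)},\\
\text{coefficient of } a^{k} \text{ without } da:\quad
        & d\lambda_{-(2-k)} .
\end{align*}
```

For $k = 2, 1, 0, -1$ the first coefficient reads $-d\theta_0$, $-d\theta_{-1} + 2\lambda_0$, $-d\theta_{-2} + \lambda_{-1}$ and $-d\theta_{-3}$, and matching with $a\,da \wedge \mathrm{pr}^*S + a^2\,\mathrm{pr}^*T + \mathrm{pr}^*\Omega^2(E \otimes W)$ reproduces the table row by row.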

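In particular we obtain the second assertion of Proposition 3.1. Let
\[
\omega(b) := \nabla^{\mathrm{Pfaff}(\mathrm{pr}^*(E\otimes W),\tilde Q,\mathrm{Pr}^*J)}\log d(b)
\]
be the connection one-form of the section $d(b)$. Since the section $d(b)$ is parallel in the $a$-direction we get for a vector field $X \in \mathfrak{X}(U)$ that
\[
\begin{aligned}
\partial_a\omega(b)(X)\,d(b) &= \nabla_{\partial_a}^{\mathrm{Pfaff}(\mathrm{pr}^*(E\otimes W),\tilde Q,\mathrm{Pr}^*J)}\nabla_X^{\mathrm{Pfaff}(\mathrm{pr}^*(E\otimes W),\tilde Q,\mathrm{Pr}^*J)}d(b)\\
&= R^{\mathrm{Pfaff}(\mathrm{pr}^*(E\otimes W),\tilde Q,\mathrm{Pr}^*J)}(\partial_a, X)\,d(b)\\
&= -\pi i\,a\,S(X)\,d(b),
\end{aligned}
\]
and therefore $\partial_a\omega(b)(X) = -\pi i\,a\,S(X)$. Let $\omega_Q(b) := \nabla^{\mathrm{Pfaff}(E\otimes W,J)}\log d_Q(b)$. Then we get by integration from $0$ to $b$,
\[
\omega(b)_{\{b\}\times U}(X) = -\frac{\pi i\,b^2}{2}S(X) + \omega_Q(b)(X).
\]
Note that
\[
\omega(b)_{\{b\}\times U}(X) = -\pi i\,\eta^1_{\{b\}\times U}(X) = -\pi i\,b^2\lambda(b)(X).
\]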

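The connection one-form $\tilde\omega_Q(b)$ of $\tilde d_Q(b)$ is given by
\[
\begin{aligned}
\tilde\omega_Q(b) &= \omega_Q(b) + i\pi\Big(\frac13 b^3\,d\theta_0 + \frac12 b^2\,d\theta_{-1} + b\,d\theta_{-2} + \log b\,d\theta_{-3}\Big)\\
&= \omega_Q(b) + i\pi\Big(\frac12 b^2\,d\theta_{-1} + b\,d\theta_{-2}\Big)\\
&= -\pi i\,b^2\lambda(b) + \frac{\pi i}{2}b^2S + i\pi\Big(\frac12 b^2\,d\theta_{-1} + b\,d\theta_{-2}\Big)\\
&= \pi i\Big(b^2\Big(\frac12 S - \lambda_0 + \frac12 d\theta_{-1}\Big) + b\,(-\lambda_{-1} + d\theta_{-2}) - \lambda_{-2}\Big) + o(b^{-1})\\
&= -\pi i\,\lambda_{-2} + o(b^{-1}).
\end{aligned}
\]
Therefore
\[
\lim_{b\to\infty}\tilde\omega_Q(b) = -\pi i\,\lambda_{-2}.
\]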
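The vanishing of the divergent terms in $\tilde\omega_Q(b)$ follows from the identities for the coefficients of the asymptotic expansion. A short check, using only $-d\theta_{-1} + 2\lambda_0 = S$ and $-d\theta_{-2} + \lambda_{-1} = 0$ (pull-backs along $\mathrm{pr}$ suppressed):

```latex
\begin{align*}
\tfrac12 S - \lambda_0 + \tfrac12\,d\theta_{-1}
 &= \tfrac12\bigl(-d\theta_{-1} + 2\lambda_0\bigr) - \lambda_0 + \tfrac12\,d\theta_{-1} = 0,\\
-\lambda_{-1} + d\theta_{-2}
 &= -\bigl(-d\theta_{-2} + \lambda_{-1}\bigr) = 0 .
\end{align*}
```

Hence the coefficients of $b^2$ and $b$ cancel and only the term $-\pi i\,\lambda_{-2}$ remains in the limit.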

It remains to calculate $\lambda_{-2}$. To this end we consider a function $\chi \in C^\infty(0,\infty)$ such that $\chi(t) = 0$ for $t \le 1$ and $\chi(t) = 1$ for $t \ge 2$. Let $(x,a,u) \in [0,1] \times \mathbb{R} \times U$ and let $\widetilde{\mathrm{pr}} : [0,1] \times \mathbb{R} \times U \to U$ be the projection. Then $\widetilde{\mathrm{pr}}^*E_U \cong [0,1] \times \mathbb{R} \times E_U$. We let $\widetilde{\mathrm{Pr}} : [0,1] \times \mathbb{R} \times E_U \to E_U$ be the projection. On $H(\widetilde{\mathrm{pr}}^*(E \otimes W))$ we consider the family of rescaled super connections $\tilde A_t$ which is given by
\[
\tilde A_t := \widetilde{\mathrm{pr}}^*A_t(E \otimes W) + t^{1/2}a\big(x + (1-x)\chi(t^{1/2}a)\big)\widetilde{\mathrm{Pr}}^*\bar Q. \tag{22}
\]
The local index theory and the proof of Proposition 3.1 given in Subsect. 3.5 still apply to this more general super connection. The additional term involving the cut-off function $\chi$ vanishes in the Getzler rescaling (19) and does not contribute to the curvature. We want to show that $\lambda_{-2}$ is independent of $x$. Note that as $a \to \infty$ we have an asymptotic expansion
\[
\tilde\eta^1 := \eta^1(\tilde A_t) \sim a^2\sum_{n\ge0}a^{-n}\big(da \wedge \tilde\theta_{-n} + dx \wedge \tilde\kappa_{-n} + \tilde\lambda_{-n}\big),
\]
where the terms may depend on $x$. We have
\[
d\tilde\eta^1 = (ax^2\,da + a^2x\,dx) \wedge \widetilde{\mathrm{pr}}^*S + a^2x^2\,\widetilde{\mathrm{pr}}^*T + \widetilde{\mathrm{pr}}^*\Omega^2(E \otimes W).
\]
From this we deduce (note that the term $\tilde\kappa_{-2}$ does not contribute) that $\partial_x\tilde\lambda_{-2} = 0$. The restriction of $\tilde A_t$ to the slice $\{x = 0\}$ is the super connection of a tamed geometric family, and the parameter enters as appropriate for adiabatic limits, see [BS07, 2.2.5]. We get
\[
\lambda_{-2} = \tilde\lambda_{-2}|_{\{1\}\times U} = \tilde\lambda_{-2}|_{\{0\}\times U} = \lim_{a\to\infty}\tilde\eta^1|_{\{(0,a)\}\times U} = \int_{E_U/U}\eta^3(W_{tQ}).
\]

We now consider a second trivialization $Q' \in I_U$ and define the section $d_{Q'}$ by Proposition 3.2. We assume that $U$ is contractible. Then we can choose a homotopy $H$ from $Q$ to $Q'$. It induces a taming of the family $\mathrm{pr}^*(E \otimes W)$, where $\mathrm{pr} : [0,1] \times U \to U$ is the projection. It furthermore induces a taming $\mathrm{Pr}^*W_{tH}$, where $\mathrm{Pr} : [0,1] \times E_U \to E_U$ is the induced projection.

Lemma 3.3. We have
\[
\frac{d_{Q'}}{d_Q} = \exp\Big(-\pi i\int_{[0,1]\times E_U/U}\eta^3(\mathrm{Pr}^*W_{tH})\Big).
\]

Proof. Indeed we can define the section $d_H \in C^\infty([0,1] \times U, \mathrm{Pfaff}(\mathrm{pr}^*(E \otimes W), \mathrm{Pr}^*J))$ by Proposition 3.2. It restricts to $d_Q$ and $d_{Q'}$ at $\{0\} \times U$ and $\{1\} \times U$. Of course, $\mathrm{Pfaff}(\mathrm{pr}^*(E \otimes W), \mathrm{Pr}^*J) \cong \mathrm{pr}^*\mathrm{Pfaff}(E \otimes W, J)$. Therefore
\[
\frac{d_{Q'}}{d_Q} = \exp\int_{[0,1]\times U/U}\nabla^{\mathrm{Pfaff}(\mathrm{pr}^*(E\otimes W),\mathrm{Pr}^*J)}\log d_H.
\]
But again by Proposition 3.2 we have
\[
\nabla^{\mathrm{Pfaff}(\mathrm{pr}^*(E\otimes W),\mathrm{Pr}^*J)}\log d_H = -\pi i\int_{[0,1]\times E_U/[0,1]\times U}\eta^3(\mathrm{Pr}^*W_{tH}).
\]
This gives the result.

3.4. Proof of Theorem 1.1. Recall the construction of the geometric line bundle $L$ in Sect. 2.

Theorem 3.4. There exists a canonical functorial isomorphism of geometric line bundles
\[
L \cong \mathrm{Pfaff}(E \otimes W, J).
\]
It is characterized by the property that for every open $U \subseteq B$ and $Q \in I_U$ it maps the section $s_Q \in C^\infty(U, L)$ to the section $d_Q \in C^\infty(U, \mathrm{Pfaff}(E \otimes W, J))$.

Proof. By inspection one checks that the cocycles and the connection one-forms for the collections of local sections $(s_Q)_{U,Q\in I_U}$ and $(d_Q)_{U,Q\in I_U}$ coincide. Note that all constructions are natural with respect to pull-back along smooth maps $B' \to B$.

3.5. Asymptotic expansion of η-forms in the adiabatic limit. In this technical subsection we prove the asymptotic expansion of the $\eta^1$-form stated in Proposition 3.1 and used in the more general case in the course of the proof of Proposition 3.2. In general, the $\eta^1$-form for a rescaled super connection $\tilde A_t$ is given by
\[
\eta^1 := \eta^1(\tilde A_t) = \frac{1}{2\pi i}\int_0^\infty\big[\mathrm{Tr}\,\partial_t\tilde A_t\,e^{-\tilde A_t^2}\big]_1\,dt. \tag{23}
\]

In our situation we consider a super connection of the form (22), i.e.
\[
\tilde A_t := \widetilde{\mathrm{pr}}^*A_t(E \otimes W) + t^{1/2}a\big(x + (1-x)\chi(t^{1/2}a)\big)\widetilde{\mathrm{Pr}}^*\bar Q.
\]
We simplify the notation. We write
\[
\tilde A_t := A_t + f(t^{1/2}a)\bar Q,
\]
where $A_t$ is a Bismut super connection associated to a family of Dirac operators $D$ over a base $\mathbb{R} \times B$ with coordinates $(a,b)$. We let $H := \nabla^h\bar Q$ be the horizontal derivative in the $B$-direction of $\bar Q$, where $\bar Q$ is an odd involution, and $f \in C^\infty([0,\infty) \times B)$ is such that $\partial_tf(t,b) \ge 0$, $f(t,b) = t$ for $t \ge 1$, and $f(t,b) = 0$ for $t \le \frac12$. Furthermore, we abbreviate $G := [\nabla, D]$, where $\nabla$ is the connection part of the super connection $A$ as in (18). In order to save notation, we write the Duhamel formula in the form
\[
\mathrm{DUH}(X,Y) := \int_0^1e^{sX}\,Y\,e^{(1-s)X}\,ds. \tag{24}
\]

Then we have for the one-form component
\[
\begin{aligned}
\big[\mathrm{Tr}\,\partial_t\tilde A_t\,e^{-\tilde A_t^2}\big]_1
&= \Big[\mathrm{Tr}\,\frac{1}{2t^{1/2}}\big(D + af'(t^{1/2}a)\bar Q\big)\,e^{-(t^{1/2}D + f(t^{1/2}a)\bar Q)^2 - t^{1/2}G - t^{1/2}f(t^{1/2}a)H - t^{1/2}f'(t^{1/2}a)\bar Q\,da}\Big]_1\\
&= -\frac12\,\mathrm{Tr}\,\big(D + af'(t^{1/2}a)\bar Q\big)\,\mathrm{DUH}\Big(-\big(t^{1/2}D + f(t^{1/2}a)\bar Q\big)^2,\ G + f(t^{1/2}a)H + f'(t^{1/2}a)\bar Q\,da\Big).
\end{aligned}
\]
We further calculate
\[
\big(t^{1/2}D + f(t^{1/2}a)\bar Q\big)^2 = tD^2 + t^{1/2}f(t^{1/2}a)E + f(t^{1/2}a)^2,
\]
where $E := c(\nabla^v\bar Q)$ is the Clifford multiplication by the vertical derivative of $\bar Q$. We now introduce the variable $s = a^{3/2}t$. Then we have $dt = a^{-3/2}\,ds$, $t^{1/2}a = a^{1/4}s^{1/2}$, and
\[
\begin{aligned}
\eta^1 = -\frac{1}{4\pi i}\int_0^\infty\mathrm{Tr}\,\big(D + af'(a^{1/4}s^{1/2})\bar Q\big)\,\mathrm{DUH}\Big(&-\big(a^{-3/2}sD^2 + a^{-3/4}s^{1/2}f(a^{1/4}s^{1/2})E + f(a^{1/4}s^{1/2})^2\big),\\
&G + f(a^{1/4}s^{1/2})H + f'(a^{1/4}s^{1/2})\bar Q\,da\Big)\,a^{-3/2}\,ds.
\end{aligned} \tag{25}
\]
Note that for $s \ge 1$ and $a \ge 1$ we have $f(a^{1/4}s^{1/2}) = a^{1/4}s^{1/2}$ and $f'(a^{1/4}s^{1/2}) = 1$. In this region the integrand simplifies to
\[
e^{-a^{1/2}s}\,\mathrm{Tr}\,(D + a\bar Q)\,\mathrm{DUH}\Big(-sa^{-3/2}(D^2 + aE),\ G + a^{1/4}s^{1/2}H + \bar Q\,da\Big)\,a^{-3/2}. \tag{26}
\]
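For the region $s \ge 1$, $a \ge 1$ the passage from (25) to (26) is a direct substitution. As a sketch, inserting $f(a^{1/4}s^{1/2}) = a^{1/4}s^{1/2}$ into the first argument of DUH in (25) gives:

```latex
\begin{align*}
a^{-3/2}s\,D^2 + a^{-3/4}s^{1/2}\cdot a^{1/4}s^{1/2}\,E + \bigl(a^{1/4}s^{1/2}\bigr)^2
 &= a^{-3/2}s\,D^2 + a^{-1/2}s\,E + a^{1/2}s\\
 &= s\,a^{-3/2}\bigl(D^2 + aE\bigr) + a^{1/2}s .
\end{align*}
```

The scalar term $a^{1/2}s$ commutes with all operators, so the two exponentials in the Duhamel formula (24) contribute the combined factor $e^{-a^{1/2}s}$, which is the prefactor in (26).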

Lemma 3.5. Locally uniformly on $B$ there exist constants $C < \infty$ and $c > 0$ independent of $a \ge 1$ such that
\[
\Big|\int_1^\infty(26)\,ds\Big| \le Ce^{-ca^{1/2}}.
\]

Proof. We will show that there exist $C < \infty$ and $M \in \mathbb{N}$ such that for all $s \ge 1$ and $a \ge 1$,
\[
\Big|\mathrm{Tr}\,(D + a\bar Q)\,\mathrm{DUH}\Big(-sa^{-3/2}(D^2 + aE),\ G + a^{1/4}s^{1/2}H + \bar Q\,da\Big)\Big| \le Ca^Ms^M.
\]
The growth of the right-hand side in this estimate can be absorbed in the prefactor $e^{-a^{1/2}s}$. The assertion of the lemma then follows by elementary calculus.

This estimate of the trace is obtained by a combination of spectral and trace class estimates. The trace class property is here supplied by an inclusion of Sobolev spaces $H^{k+p} \to H^p$ which is of trace class if $k$ is larger than the dimension of the underlying space, which is two in our case. We write $A := D^2 + aE$. The operator under the trace and the integral (24) in DUH is a composition of differential operators growing at most polynomially in $a$ and $s$, an operator of the form $e^{-\tau A}$ for $\tau \ge 0$ which is bounded by 1 on $H^0$, and an operator of the form $e^{-\epsilon sa^{-3/2}A}$ with $\epsilon \ge \frac12$. The last factor provides the regularizing effect. We claim that for all $p, q \in \mathbb{R}$ there exist constants $M \in \mathbb{N}$ and $C \in \mathbb{R}$ independent of $\epsilon \in [\frac12, 1]$ and $a, s \ge 1$ such that
\[
e^{-\epsilon sa^{-3/2}A} : H^p \to H^q
\]
is bounded by $Cs^Ma^M$. We get a similar bound for the trace norm by factoring
\[
H^{p+3} \xrightarrow{\ e^{-\epsilon sa^{-3/2}A}\ } H^{q+3} \xrightarrow{\ \text{trace class}\ } H^q.
\]
In order to estimate the whole composition mentioned above we choose the difference $q - p$ sufficiently large in order to capture the differential operator factors as bounded. We normalize the position of $p \in \mathbb{R}$ such that we can consider the factor $e^{-\tau A}$ as a bounded operator on $H^0$.

We now show the claim. Ellipticity of $D$ implies that the graph norm associated to $D^k$ is equivalent to the $k$th Sobolev norm. It suffices to show that for every $N \in \mathbb{N}$ there exist constants $c > 0$ and $C < \infty$ and integers $M \in \mathbb{N}$ such that
\[
\|D^{2N}e^{-sa^{-3/2}A}\| \le Cs^Ma^M \tag{27}
\]
(we can absorb the factor $\epsilon$ into $s$). We use $u := sa^{-3/2}$ and write
\[
D^{2N}e^{-uA^2} = A^{2N}e^{-uA^2} + (D^{2N} - A^{2N})e^{-uA^2}. \tag{28}
\]
The first summand can be written as
\[
u^{-N}(u^{1/2}A)^{2N}e^{-(u^{1/2}A)^2}.
\]
We now use that
\[
\|(u^{1/2}A)^{2N}e^{-(u^{1/2}A)^2}\| \le C.
\]
This follows by the spectral mapping principle from the bound
\[
\sup_{x\in[0,\infty]}x^{2N}e^{-x^2} < \infty.
\]
If we combine these estimates we get
\[
\|A^{2N}e^{-uA^2}\| \le u^{-N}C.
\]
Note that the difference $(D^{2N} - A^{2N})$ can be written as $\sum_{i=0}^{2N-2}a^{k_i}E_i$, where $E_i$ are differential operators of order $i$. We can therefore use induction by $N$ in order to deal with the second term in (28) and to get an estimate of the form
\[
\|D^{2N}e^{-uA^2}\| \le Cu^Ma^M
\]
for some $M \in \mathbb{N}$. We now insert $u = sa^{-3/2}$ and get (27).

Lemma 3.6. Locally uniformly in $B$ we have an asymptotic expansion
\[
\int_0^1\dots\,ds \sim a^2\sum_{n\ge0}a^{-n}\eta_{-n},
\]
where $\dots$ stands for the integrand of (25).

Proof. We must control the integral kernel of the smoothing operator
\[
e^{-t(D^2 + t^{-1/2}f(t^{1/2}a)E)}
\]
in the region $0 < t \le a^{-3/2}$. For fixed $T$ we construct a formal solution of the heat equation
\[
(\partial_t - (D^2 + TE))H_t = 0
\]
by the iterative procedure of the proof of [BGV92, Thm 2.26] and keep control of the dependence on $T$. We get
\[
H_t(x,y) = q_t(x,y)\sum_{i\ge0}t^i\Phi_i(x,y),
\]
where $q_t$ is a Gauss kernel and the coefficients $\Phi_i(x,y)$ implicitly also depend on $T$. The coefficients $\Phi_i(x,y)$ are given by an iterative formula stated in [BGV92, Thm 2.26]. By inspection we see that $\Phi_i$ is a polynomial in $T$ of degree at most $i$. If we write
\[
(\partial_t - (D^2 + TE))H_t = q_t(x,y)\sum_{i=0}^Nt^i\Phi_i(x,y) + t^{N-1}r^N_t(x,y),
\]
then the remainder term is bounded in $C^l$-norm by $t^{-l}$. Moreover, it is a polynomial in $T$ of degree at most $N + 1$. Note that the $t$-power is explained by $N - 1 = N - \frac{\dim(E/B)}{2}$.

We now construct the heat kernel $e^{-t(D^2 + TE)}$ using the Volterra series method as in [BGV92, Sect. 2.4]. Then we set $T(t,a) := t^{-1/2}f(t^{1/2}a)$. We observe that for $N > \frac{M}{2}$,
\[
\sup_{t\in(0,a^{-3/2})}t^NT^M(t,a) \le Ca^{-\frac{3N}{2} + M}. \tag{29}
\]
We split
\[
t^{N-1}r^N_t = t^{\frac{N-1}{4}}\,t^{\frac{3N-3}{4}}r^N_t.
\]
Using that $r^N_t$ is a polynomial in $T$ of degree $N + 1$ we apply (29) to the second factor and see that for $N > 3$ the remainder term estimate in [BGV92, Thm. 2.20 (3)] is for $t \in (0, a^{-3/2})$
\[
\|r^N_t\|_l \le C(l)\,a^{3 - \frac{N}{8}}\,t^{\frac{N-1}{4} - l},
\]
where $C(l)$ does not depend on $a$. This leads to an estimate
\[
\|r^{k+1}_t\|_l \le C^{k+1}a^{(k+1)(3 - \frac{N}{8})}\,t^{(k+1)\frac{N-1}{4} - l}\,\frac{t^k}{k!}
\]
in [BGV92, Lem. 2.21]. Finally, in [BGV92, Thm. 2.23 (2)] we get for all integers $l, n$ that for sufficiently large $N$ depending on $n, l$ the approximate kernel
\[
k^N_t(x,y) := q_t(x,y)\sum_{i=0}^Nt^i\Phi_i(x,y)
\]
differs from the true kernel in $C^l$-norm by $Ca^{-n}$ uniformly in $t$ and $a$. In order to get the asymptotic expansion of the $\eta^1$-form we can therefore replace the true heat kernel by its approximation. We must derive an asymptotic expansion of
\[
\begin{aligned}
\eta^1 \sim{}& -\frac{1}{4\pi i}\int_0^1\mathrm{Tr}\,\big(D + af'(a^{1/4}s^{1/2})\bar Q\big)\,e^{-f(a^{1/4}s^{1/2})^2}\\
&\quad\times\int_0^1k^N_{\tau a^{-3/2}s}\big(G + f(a^{1/4}s^{1/2})H + f'(a^{1/4}s^{1/2})\bar Q\,da\big)\,k^N_{(1-\tau)a^{-3/2}s}\,d\tau\ a^{-3/2}\,ds\\
={}& -\frac{1}{4\pi i}\int_0^{a^{1/4}}\mathrm{Tr}\,\big(D + af'(u)\bar Q\big)\,e^{-f(u)^2}\\
&\quad\times\int_0^1k^N_{\tau a^{-2}u^2}\big(G + f(u)H + f'(u)\bar Q\,da\big)\,k^N_{(1-\tau)a^{-2}u^2}\,d\tau\ a^{-2}\,2u\,du\\
={}& -\frac{1}{4\pi i}\int_0^{a}\mathrm{Tr}\,\big(D + af'(u)\bar Q\big)\,e^{-f(u)^2}\\
&\quad\times\int_0^1k^N_{\tau a^{-2}u^2}\big(G + f(u)H + f'(u)\bar Q\,da\big)\,k^N_{(1-\tau)a^{-2}u^2}\,d\tau\ a^{-2}\,2u\,du,
\end{aligned}
\]
where in the last step we use that because of the factor $e^{-f(u)^2}$ the integral $\int_{a^{1/4}}^a\dots\,du$ does not contribute to the asymptotic expansion. Also note that
\[
T(t,a) = a\,u^{-1}f(u).
\]
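The change of variables in the middle step can be checked directly. A sketch, with $u := a^{1/4}s^{1/2}$ and $t = a^{-3/2}s$ as in the text:

```latex
\begin{align*}
s &= a^{-1/2}u^2, & ds &= 2a^{-1/2}u\,du, & a^{-3/2}\,ds &= a^{-2}\,2u\,du,\\
a^{-3/2}s &= a^{-2}u^2, & s\in[0,1] &\iff u\in[0,a^{1/4}], & &\\
t &= a^{-2}u^2, & t^{1/2}a &= u, & T(t,a) &= t^{-1/2}f(t^{1/2}a) = a\,u^{-1}f(u).
\end{align*}
```

In particular the heat kernel times $\tau a^{-3/2}s$ and $(1-\tau)a^{-3/2}s$ become $\tau a^{-2}u^2$ and $(1-\tau)a^{-2}u^2$, and the parameter $a$ enters the integrand only through explicit powers and through $T(t,a) = a\,u^{-1}f(u)$.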


Therefore the integrand obviously has an expansion in terms of powers of $a$ with integrable functions of $u$.$^7$ So the integral gives an expansion in powers of $a$. It remains to determine the leading order. It is a multiple of $a^2$.

4. String Structures

4.1. Geometric and multiplicative gerbes. In this subsection we recall some aspects of the theory of $U(1)$-banded gerbes in manifolds with an emphasis on geometric structures. Furthermore, we review multiplicative gerbes on Lie groups in some detail.

We use the language of stacks and refer to [Hei05] for a very readable introduction to stacks in manifolds which suffices for the purpose of the present paper. This in particular applies to the notion of a gerbe with band in an abelian Lie group. In the following a gerbe is a $U(1)$-banded gerbe in manifolds. Gerbes over a given manifold $M$ form a monoidal 2-category. Details of the construction of the tensor product of gerbes can be found in [BSST08, 6.1.9]. The tensor unit is the trivial gerbe $M \times BU(1) \to M$, where the second factor is the quotient stack $BU(1) := [*/U(1)]$. For every pair $t_1, t_2 : \mathcal{H} \to \mathcal{H}'$ of 1-morphisms between gerbes there is an associated $U(1)$-principal bundle which we denote by $\frac{t_2}{t_1}$. A 2-morphism $t_1 \Rightarrow t_2$ can be viewed as a trivialization $1_M \to \frac{t_2}{t_1}$ of this $U(1)$-bundle, where $1_M := M \times U(1) \to M$ is the trivial $U(1)$-bundle.

The isomorphism class of a gerbe $\mathcal{H}$ in the 2-category of gerbes over $M$ is classified by the Dixmier-Douady class $DD(\mathcal{H}) \in H^3(M;\mathbb{Z})$. We have $DD(\mathcal{H} \otimes \mathcal{H}') = DD(\mathcal{H}) + DD(\mathcal{H}')$. Furthermore, if $\mathcal{H}$ and $\mathcal{H}'$ are isomorphic, then the set of isomorphism classes in the category $\mathrm{Hom}(\mathcal{H}, \mathcal{H}')$ is a torsor over $H^2(M;\mathbb{Z})$.

Next we discuss multiplicative gerbes on a Lie group $G$. The case of interest in the present paper is $G = \mathrm{Spin}(n)$. The notion of a multiplicative gerbe in a simplicial context has been introduced in [CJM+05]. Here we prefer to work directly in the 2-category of $U(1)$-banded gerbes on $G$. We have the maps $\mathrm{pr}_1, \mathrm{pr}_2, m : G \times G \to G$ given by $\mathrm{pr}_1(g,h) := g$, $\mathrm{pr}_2(g,h) := h$, and $m(g,h) := gh$. We use the notation $\mathcal{G}_i := \mathrm{pr}_i^*\mathcal{G}$.

Definition 4.1. A multiplicative structure on a gerbe $\mathcal{G}$ on $G$ is given by a pair $(\mu, a)$ of a 1-morphism
\[
\mu : \mathcal{G}_1 \otimes \mathcal{G}_2 \to m^*\mathcal{G} \tag{30}
\]
and an associativity 2-morphism $a$ satisfying a higher coherence condition.

In the simplicial context the following lemma has been shown in [CJM+05, Sect. 5].

Lemma 4.2. If $G$ is compact, connected and simply connected, then a gerbe $\mathcal{G}$ on $G$ admits a multiplicative structure which is unique up to isomorphism.

Below we will apply Lemma 4.2 to the group $G = \mathrm{Spin}(n)$ for $n \ge 3$ and the basic gerbe $\mathcal{G}$ whose Dixmier-Douady class is a generator of $H^3(\mathrm{Spin}(n);\mathbb{Z}) \cong \mathbb{Z}$.

Geometric structures on $U(1)$-banded gerbes have been popularised in [Hit01,Bry93] and are usually described in particular models, e.g. bundle gerbes. In the following we give a model-independent account. By $0_M := M \times BU(1)$ we denote the trivial gerbe on $M$.

7 We a priori know that all terms are integrable at u = 0.


Definition 4.3. A connection $\omega$ on a gerbe $\mathcal{H}$ over a manifold $M$ associates to every local trivialisation $t : 0_U \xrightarrow{\sim} \mathcal{H}|_U$, $U \subseteq M$, a 2-form $\omega_t$. This association must be compatible with restriction to subsets. Furthermore, to a pair $(t_0, t_1)$ of local trivialisations of the gerbe $\mathcal{H}$ the connection $\omega$ associates a connection $\nabla^{\frac{t_1}{t_0},\omega}$ on the associated $U(1)$-bundle $\frac{t_1}{t_0}$ such that
\[
R^{\nabla^{\frac{t_1}{t_0},\omega}} = \omega_{t_1} - \omega_{t_0}.
\]
This association is again compatible with restriction to open subsets.

The form $d\omega_t \in \Omega^3(U)$ is independent of $t$ and therefore the restriction of a closed global three-form $R^\omega \in \Omega^3(M)$, the curvature of the connection $\omega$. The cohomology class of $R^\omega$ is the image of the Dixmier-Douady class $DD(\mathcal{H})$ in de Rham cohomology. If $\alpha \in \Omega^2(M)$, then we can define a new connection $\omega + \alpha$ such that $(\omega + \alpha)_t = \omega_t + \alpha$ and $\nabla^{\frac{t_1}{t_0},\omega+\alpha} = \nabla^{\frac{t_1}{t_0},\omega}$. We have $R^{\omega+\alpha} = R^\omega + d\alpha$.

Next we discuss the notion of a connection on a morphism between geometric gerbes. Such a notion has first been introduced in a slightly different way in [Wal07]. Let $f : \mathcal{H} \to \mathcal{H}'$ be a morphism between gerbes with connections $\omega$ and $\omega'$.

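Definition 4.4. A connection $\omega_f$ on the morphism $f$ associates a connection $\nabla^{\frac{t'}{f\circ t},\omega_f}$ on the $U(1)$-bundle $\frac{t'}{f\circ t}$ for each pair of local trivialisations $t : 0_U \to \mathcal{H}|_U$ and $t' : 0_U \to \mathcal{H}'|_U$ such that for every other pair $\tilde t, \tilde t'$ of such local trivializations we have
\[
\Big(\frac{\tilde t'}{f\circ\tilde t}, \nabla^{\frac{\tilde t'}{f\circ\tilde t},\omega_f}\Big) \cong \Big(\frac{\tilde t'}{t'}, \nabla^{\frac{\tilde t'}{t'},\omega'}\Big) \otimes \Big(\frac{t'}{f\circ t}, \nabla^{\frac{t'}{f\circ t},\omega_f}\Big) \otimes \Big(\frac{t}{\tilde t}, \nabla^{\frac{t}{\tilde t},\omega}\Big) \tag{31}
\]
as $U(1)$-bundles with connection. This association must be compatible with restriction to open subsets.

By the relation (31) there is a unique form $R^{\omega_f} \in \Omega^2(M)$ such that
\[
R^{\omega_f}|_U = \omega'_{t'} - \omega_t - R^{\nabla^{\frac{t'}{f\circ t},\omega_f}}
\]
for all pairs of objects $t \in \mathcal{H}(U)$ and $t' \in \mathcal{H}'(U)$. This form is called the curvature of the connection. Note that connections always exist. The curvature of a connection satisfies
\[
dR^{\omega_f} = R^{\omega'} - R^\omega. \tag{32}
\]

Definition 4.5. Two gerbes $(\mathcal{H}, \omega)$, $(\mathcal{H}', \omega')$ with connections are isomorphic if there exists a morphism $f : \mathcal{H} \to \mathcal{H}'$ which admits a connection $\omega_f$ with the following two properties:
1. $\omega_f$ is flat, and
2. for each $t \in \mathcal{H}(U)$ it associates to the pair $(t, f\circ t)$ a trivial bundle with connection $\big(\frac{f\circ t}{f\circ t}, \nabla^{\frac{f\circ t}{f\circ t},\omega_f}\big)$.

Remark 4.6. By (31) these conditions fix the connection $\omega_f$ uniquely.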

In [Wal07, Def. 4.2.3] such a connection is called compatible. There is an obvious definition of the composition of morphisms with connection.

Isomorphism classes of gerbes with connection $[\mathcal{H}, \omega]$ are classified by the differential cohomology classes $\widehat{DD}[\mathcal{H}, \omega] \in \widehat{H\mathbb{Z}}^3(M)$, see [Hit01,Bry93]. In terms of the structure maps $R, I, a$ of differential cohomology (5) we have
\[
R(\widehat{DD}[\mathcal{H}, \omega]) = R^\omega, \qquad I(\widehat{DD}[\mathcal{H}, \omega]) = DD(\mathcal{H}), \qquad \widehat{DD}[\mathcal{H}, \omega] + a(\alpha) = \widehat{DD}[\mathcal{H}, \omega + \alpha].
\]
Following [Wala, Def. 1.3] we make the following definition:

Definition 4.7. A geometric multiplicative gerbe on a Lie group $G$ is a multiplicative gerbe $(\mathcal{G}, \mu, a)$ together with a connection $\omega_{\mathcal{G}}$ on $\mathcal{G}$ and a connection $\omega_\mu$ on $\mu$ such that
1. the curvature $\rho$ of $\omega_\mu$ satisfies
\[
\mathrm{pr}_{23}^*\rho + m_{23}^*\rho = \mathrm{pr}_{12}^*\rho + m_{12}^*\rho, \tag{33}
\]
2. $(\mu, \omega_\mu)$ induces an isomorphism of gerbes with connection
\[
(\mathcal{G}, \omega_{\mathcal{G}})_1 \otimes (\mathcal{G}, \omega_{\mathcal{G}})_2 \to (m^*\mathcal{G}, m^*\omega_{\mathcal{G}} + \rho), \tag{34}
\]
and
3. the associativity 2-morphism $a$ preserves the connections.

Let $G$ be a compact, connected, and simply connected Lie group. For $x \in H^3(G;\mathbb{Z})$ there exists a unique bi-invariant form $\omega_x \in \Omega^3(G)$ which represents the image of $x$ in de Rham cohomology. Since $H^2(G;\mathbb{R}/\mathbb{Z}) = 0$ and $H^3(G;\mathbb{Z})$ is torsion-free, a class $\hat x \in \widehat{H\mathbb{Z}}^3(G)$ is completely determined by the invariant $R(\hat x) \in \Omega^3(G)$. Therefore a gerbe $\mathcal{G}$ on $G$ with $x = DD(\mathcal{G})$ has a unique connection $\omega_{\mathcal{G}}$ with curvature $R^{\omega_{\mathcal{G}}} = \omega_x$. The following lemma has been shown in [Wala, Sect. 1]. It extends Lemma 4.2 to the geometric context.

Lemma 4.8. Let $\mathcal{G}$ be a gerbe on a compact, connected, and simply connected Lie group with connection $\omega_{\mathcal{G}}$ with bi-invariant curvature. This structure can be extended in a unique (up to isomorphism) way to a structure of a geometric multiplicative gerbe.

Proof. We will need some of the details of the proof later in the proofs of Lemmas 4.12 and 4.15. We must add a connection on the multiplication map (30) which turns $\mathcal{G}$ into a geometric multiplicative gerbe. For simplicity, we assume that $G$ is simple. Then we have an explicit formula for $\omega_x$ in terms of the Maurer-Cartan form $\theta := g^{-1}dg \in \Omega^1(G, \mathrm{Lie}(G))$. We have
\[
\omega_x = \frac{k_x}{6c_G}\langle\theta, [\theta,\theta]\rangle,
\]
where $k_x \in \mathbb{Z}$ depends on $x = DD(\mathcal{G})$, $\langle\cdot,\cdot\rangle$ is the Killing form, and $c_G \in \mathbb{R}$ is defined such that $\frac{1}{6c_G}\langle\theta, [\theta,\theta]\rangle$ represents the image of the generator of $H^3(G;\mathbb{Z}) \cong \mathbb{Z}$ in de Rham cohomology. In this case one can choose [Walb, (1.7)]
\[
\rho := \frac{k_x}{c_G}\langle\mathrm{pr}_1^*\theta, \mathrm{pr}_2^*\bar\theta\rangle, \tag{35}
\]
where $\bar\theta := dg\,g^{-1}$.


In order to construct the connection on the morphism $\mu$ we use the fact that again a class $\hat y \in \widehat{H\mathbb{Z}}^3(G^2)$ is completely determined by its curvature $R(\hat y) \in \Omega^3(G^2)$. By a calculation we check that

\[
\mathrm{pr}_1^*\omega_x + \mathrm{pr}_2^*\omega_x - m^*\omega_x = d\rho.
\]

Therefore the two sides of the arrow (34) have the same curvature and hence are isomorphic. According to Remark 4.6 we can find a unique connection $\omega_\mu$ such that (34) is an isomorphism of gerbes with connection. The associativity morphism $a$ can now be adapted so that it preserves the connections.

Let us apply this to the basic gerbe $\mathcal{G}$ on $\mathrm{Spin}(n)$ which by Lemma 4.8 becomes a geometric multiplicative gerbe. Its bi-invariant curvature will be denoted by $CS \in \Omega^3(\mathrm{Spin}(n))$.

4.2. Geometric string structures. First we recall the notion of a string structure according to [Walb]. We consider an $n$-dimensional spin vector bundle $V \to M$. Let $p : P \to M$ be the corresponding $\mathrm{Spin}(n)$-principal bundle (earlier we used the longer notation $P = \mathrm{Spin}(V)$). Then we have a canonical isomorphism

\[
(\mathrm{id}, g) : P \times_M P \xrightarrow{\ \sim\ } P \times \mathrm{Spin}(n) \tag{36}
\]

of $\mathrm{Spin}(n)$-principal bundles whose inverse is given by $(p, h) \mapsto (p, ph)$. Let $\mathcal{G}$ be the basic multiplicative gerbe on $\mathrm{Spin}(n)$. We define the gerbe $\mathcal{P} := g^*\mathcal{G}$ on $P \times_M P$. The multiplicative structure of $\mathcal{G}$ induces a 1-morphism

\[
\nu : \mathcal{P}_{12} \otimes \mathcal{P}_{23} \to \mathcal{P}_{13} \tag{37}
\]

together with an associativity 2-morphism, where $\mathrm{pr}_{ij} : P \times_M P \times_M P \to P \times_M P$ are the projections and $\mathcal{P}_{ij} := \mathrm{pr}_{ij}^*\mathcal{P}$. In detail, if we define $g_{ij} := g \circ \mathrm{pr}_{ij}$, then we have $m \circ (g_{12} \times g_{23}) = g_{13}$ and $\mathcal{P}_{ij} \cong g_{ij}^*\mathcal{G}$ so that $\mathcal{P}_{13} \cong (g_{12} \times g_{23})^* m^*\mathcal{G}$ and $\mathcal{P}_{12} \otimes \mathcal{P}_{23} \cong (g_{12} \times g_{23})^*(\mathcal{G}_1 \otimes \mathcal{G}_2)$. Therefore $\nu$ is defined as the pull-back of $\mu$ in (30) via $g_{12} \times g_{23}$. According to [Walb, Sect. 2.1] we make the following definition:

Definition 4.9. The gerbe $\mathcal{P}$ over $P$ together with the multiplication 1-morphism (37) and the corresponding associativity 2-morphism is called the Chern-Simons bundle 2-gerbe $\mathcal{CS}$ on $M$ associated to the spin bundle $V$.

By [Walb, Thm 1.1.4] we can define a string structure as a trivialisation of the Chern-Simons bundle 2-gerbe. In detail, for $i = 1, 2$ let $\mathrm{pr}_i : P \times_M P \to P$ denote the projections and set $\mathcal{S}_i := \mathrm{pr}_i^*\mathcal{S}$ for a gerbe $\mathcal{S}$ on $P$.

Definition 4.10. A string structure on the spin bundle $V$ is a gerbe $\mathcal{S}$ over $P$ together with a 1-morphism $f : \mathcal{P} \otimes \mathcal{S}_2 \to \mathcal{S}_1$ and an associativity 2-morphism (which essentially turns $\mathcal{S}$ into a module over $\mathcal{P}$) satisfying a higher coherence condition [Walb, Def. 2.2.1].


We will usually denote a string structure by the same symbol as its underlying gerbe. By [Walb, Lemma 2.2.2] a string structure exists if and only if $\frac{p_1}{2}(V) = 0$. In this case the set of isomorphism classes of string structures on $V$ forms a torsor over $H^3(M; \mathbb{Z})$, see [Walb, Thm. 1.1.2], where this result is attributed to [ST04]. We continue with recalling notions and results from [Walb].

Definition 4.11. A connection $h$ on the bundle 2-gerbe $\mathcal{CS}$ consists of
1. a 3-form $\kappa_h \in \Omega^3(P)$,
2. a connection $\omega_h$ on the gerbe $\mathcal{P}$,
3. a connection $\sigma_h$ on the multiplication (37)
such that
1. $\mathrm{pr}_2^*\kappa_h - \mathrm{pr}_1^*\kappa_h = R^{\omega_h}$,
2. $(\nu, \sigma_h)$ realizes an isomorphism of gerbes with connection, and
3. the associativity morphism preserves connections.

It follows that $\mathrm{pr}_1^* d\kappa_h = \mathrm{pr}_2^* d\kappa_h$. Therefore there exists a unique closed 4-form $R^h \in \Omega^4(M)$ such that $p^*R^h = d\kappa_h$. The cohomology class of $R^h$ is the image of $\frac{p_1}{2}(V)$ in de Rham cohomology.

Lemma 4.12 ([Wala]). Let $\mathbf{V} = (V, \nabla^V, h^V)$ be a geometric spin bundle. Then we have an associated connection $h_{\mathbf{V}}$ on the Chern-Simons 2-gerbe $\mathcal{CS}$ of $V$.

Proof. In the following we describe the construction of $h_{\mathbf{V}}$ which is due to [Wala, Sect. 3]. We will need the details later in the proof of Lemma 4.15. Recall that the basic gerbe $\mathcal{G}$ on $\mathrm{Spin}(n)$ has a unique connection $\omega_{\mathcal{G}}$ with curvature

\[
R^{\omega_{\mathcal{G}}} = CS = \frac{1}{6c_{\mathrm{Spin}(n)}}\langle\theta, [\theta, \theta]\rangle \in \Omega^3(\mathrm{Spin}(n)).
\]

The gerbe P = g ∗ G has an induced connection ωP = g ∗ ωG with curvature g ∗ C S. The bundle p ∗ P ∼ = P × M P has a canonical trivialisation (36) which we denote by Q for the moment. It induces a taming ( p ∗ W)t Q as explained in Sect. 2. We have seen ∗ in Lemma 2.3 that C S(∇ p V ) = 21 η3 (( p ∗ W)t Q ) ∈ 3 (P) is the usual Chern-Simons ∗ form of ∇ p V (in the trivialisation given by Q). Let A ∈ 1 (P, spin(n)) denote the connection one-form of ∇ V . Then in the new notation 2 1 1 ∗ d A, A + A, [A, A] , C S(∇ p V ) = c Spin(n) 3 2 and this fixes the sign of c Spin(n) . Then we define ω :=

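$\frac{1}{c_{\mathrm{Spin}(n)}}\langle\mathrm{pr}_1^*A, g^*\bar\theta\rangle \in \Omega^2(P \times_M P)$.

We have by a direct calculation (see [Walb, Sect. 3.1])

\[
\mathrm{pr}_2^* CS(\nabla^{p^*V}) - \mathrm{pr}_1^* CS(\nabla^{p^*V}) = g^*CS + d\omega. \tag{38}
\]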

We set $\kappa_{h_{\mathbf{V}}} := CS(\nabla^{p^*V})$ and $\omega_{h_{\mathbf{V}}} := \omega_{\mathcal{P}} + \omega$. Then condition 4.11.1 holds true. Since the multiplication $\nu$ in (37) is defined as the pull-back of the multiplicative structure $\mu$ of $\mathcal{G}$ we can define the connection $\sigma_{h_{\mathbf{V}}}$ on $\nu$ by pulling back the connection $\omega_\mu$. Since the associativity morphism for the Chern-Simons gerbe is also obtained by pulling back the associativity morphism of the multiplicative structure on $\mathcal{G}$, our definition of $\sigma_{h_{\mathbf{V}}}$ ensures that the associativity morphism of the Chern-Simons gerbe preserves connections, hence condition 4.11.3 holds true. Moreover it satisfies part 4.5.2 of condition 4.11.2. In order to verify condition 4.11.2 completely we must check 4.5.1, i.e. that $(\nu, \sigma_{h_{\mathbf{V}}})$ is flat. With the curvature $\rho$ of $\mu$ given by (35) this is the equality

\[
(g_{12} \times g_{23})^*\rho = \mathrm{pr}_{12}^*\omega + \mathrm{pr}_{23}^*\omega - \mathrm{pr}_{13}^*\omega,
\]

which again can be checked by a direct calculation (compare [Wala, (3.21)]).
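The claim in the proof above that condition 4.11.1 holds can be spelled out in one line. The sketch below only combines facts stated in the text: the rule $R^{\omega+\alpha} = R^{\omega} + d\alpha$, the curvature $g^*CS$ of $\omega_{\mathcal{P}} = g^*\omega_{\mathcal{G}}$, and the identity (38). The calligraphic $\mathcal{P}$ for the gerbe is a notational assumption used to distinguish it from the principal bundle $P$.

```latex
\begin{align*}
R^{\omega_{h_{\mathbf V}}}
  &= R^{\omega_{\mathcal P} + \omega}
   = R^{\omega_{\mathcal P}} + d\omega
   = g^* CS + d\omega \\
  &\overset{(38)}{=} \mathrm{pr}_2^* CS(\nabla^{p^*V}) - \mathrm{pr}_1^* CS(\nabla^{p^*V})
   = \mathrm{pr}_2^* \kappa_{h_{\mathbf V}} - \mathrm{pr}_1^* \kappa_{h_{\mathbf V}} .
\end{align*}
```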

We now consider a Chern-Simons gerbe $\mathcal{CS}$ with a connection $h$.

Definition 4.13. A geometric string structure is a triple $str := (\mathcal{S}, \omega_{\mathcal{S}}, \omega_f)$ of a string structure $\mathcal{S}$ with action $f : \mathcal{P} \otimes \mathcal{S}_2 \to \mathcal{S}_1$ together with a connection $\omega_{\mathcal{S}}$ on $\mathcal{S}$ and a connection $\omega_f$ on the morphism $f$ such that
1. $(f, \omega_f)$ realizes an isomorphism of gerbes with connection, and
2. the associativity 2-morphism preserves connections.

It is shown in [Walb, Thm 1.3.4] that a string structure $\mathcal{S}$ can always be refined to a geometric string structure. Assume that we have chosen a geometric string structure $str$. It was shown in [Walb, Thm 1.3.3] that there is a unique form $H_{str} \in \Omega^3(M)$ such that

\[
p^*H_{str} = R^{\omega_{\mathcal{S}}} + \kappa_h. \tag{39}
\]

This form is closely related to the Cheeger-Simons cocycle $\widehat{\tfrac{p_1}{2}}(\mathbf{V}) : Z_3(E) \to U(1)$. Indeed, if $z \in Z_3(E)$, then there exists a neighborhood $U \subseteq E$ of the trace $|z|$ of $z$ such that $\frac{p_1}{2}(V)|_U = 0$. Therefore there exists a geometric string structure $str$ on $P_U$. Then

\[
\widehat{\tfrac{p_1}{2}}(\mathbf{V})(z) = \exp\Bigl(2\pi i \int_z H_{str}\Bigr).
\]

This follows by combining (6) with Eq. (41) below (compare [Walb, Sect. 3.4]).

4.3. The string structure associated to a trivialisation. In this subsection we show that a trivialisation $Q$ of the geometric spin bundle $\mathbf{V}$ on $M$ induces a geometric string structure $str_Q$. In Lemma 4.15 we calculate the associated form $H_{str_Q} \in \Omega^3(M)$ (see (39)).

Lemma 4.14. A trivialisation $Q : P \to M \times \mathrm{Spin}(n)$ of $\mathrm{Spin}(n)$-principal bundles gives rise to a string structure $\mathcal{S}_Q$.

Proof. The trivialization $Q$ of $P$ gives a pull-back diagram

\[
\begin{array}{ccc}
P & \xrightarrow{\ q\ } & \mathrm{Spin}(n) \\
{\scriptstyle p}\downarrow & & \downarrow \\
M & \longrightarrow & *
\end{array}
\]

of $\mathrm{Spin}(n)$-principal bundles. The Chern-Simons bundle 2-gerbe of $V$ as well as the string structure are now obtained by induced pull-backs. More explicitly, the Chern-Simons gerbe is given by $\mathcal{P} \cong (q_1 \times q_2)^*\tilde m^*\mathcal{G}$, where $q_i = q \circ \mathrm{pr}_i$. We define the string structure by $\mathcal{S}_Q := q^{-1*}\mathcal{G}$, where $q^{-1}$ denotes the composition of $q$ with inversion in $\mathrm{Spin}(n)$. The action map is then given by $f := (\tilde m \times q_2^{-1})^*\mu$, where we use the identity

\[
q_1^{-1} = m \circ (\tilde m \circ (q_1 \times q_2) \times q_2^{-1}). \tag{40}
\]
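The identity (40) can be checked pointwise. The sketch below uses the formula $\tilde m(h, l) = h^{-1}l$ from the proof of Lemma 4.15 and assumes that $m$ denotes the group multiplication $m(a, b) = ab$ of $\mathrm{Spin}(n)$; $x$ is an arbitrary point of $P \times_M P$.

```latex
% Assumptions: m(a,b) = ab and \tilde m(h,l) = h^{-1} l in Spin(n).
\begin{align*}
\bigl(m \circ (\tilde m \circ (q_1 \times q_2) \times q_2^{-1})\bigr)(x)
  &= \tilde m\bigl(q_1(x), q_2(x)\bigr)\, q_2(x)^{-1} \\
  &= q_1(x)^{-1}\, q_2(x)\, q_2(x)^{-1}
   = q_1(x)^{-1} .
\end{align*}
```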

Let $\mathbf{V}$ be a geometric spin bundle and $h_{\mathbf{V}}$ be the associated connection on $\mathcal{CS}$ given by Lemma 4.12. Let $Q$ be a trivialization of $V$ as above.

Lemma 4.15. The geometry of $\mathbf{V}$ induces a natural refinement of the string structure $\mathcal{S}_Q$ given by Lemma 4.14 to a geometric string structure $str_Q$. Moreover, we have

\[
H_{str_Q} = \frac{1}{2}\eta^3(W_{t_Q}). \tag{41}
\]

Proof. Using the trivialization $Q$ we identify (with $G := \mathrm{Spin}(n)$)

\[
P \cong M \times G, \qquad P \times_M P \cong M \times G \times G
\]

so that $q(b, h) = h$, $\mathrm{pr}_1(b, h, l) = (b, h)$, $\mathrm{pr}_2(b, h, l) = (b, l)$, $g(b, h, l) = \tilde m(h, l) = h^{-1}l$. We define the form

\[
\alpha := \frac{1}{c_{\mathrm{Spin}(n)}}\langle A, q^*\theta\rangle \in \Omega^2(P).
\]

We equip the gerbe $\mathcal{S}_Q$ with the connection $\omega_{\mathcal{S}_Q} := q^{-1*}\omega_{\mathcal{G}} + \alpha$. Furthermore, the action (40) will be equipped with the connection $\omega_f := (\tilde m \times q_2^{-1})^*\omega_\mu$. This already ensures that the associativity morphism obtained by pull-back from the multiplicative structure of $\mathcal{G}$ respects the connection. Furthermore part 4.5.2 of the compatibility of $\omega_f$ is satisfied. It remains to verify the condition 4.5.1 that $(f, \omega_f)$ is flat, i.e. we must check the identity of 2-forms on $P \times_M P$:

\[
(\tilde m \times q_2^{-1})^*\rho - \omega - \alpha_2 + \alpha_1 = 0,
\]

where we use the notation $\alpha_i = \mathrm{pr}_i^*\alpha$. This is a straightforward calculation.

We now show (41). Note that we have $R^{\omega_{\mathcal{S}_Q}} = q^*CS + d\alpha$. Let $\sigma_Q : M \to P$ denote the section given by $Q$. Then the composition $q \circ \sigma_Q : M \to \mathrm{Spin}(n)$ is constant and hence $\sigma_Q^* q^* CS = 0$ and $\sigma_Q^*\alpha = \frac{1}{c_{\mathrm{Spin}(n)}}\langle\sigma_Q^*A, \sigma_Q^* q^*\theta\rangle = 0$. Moreover, the pull-back via $\sigma_Q$ of the canonical trivialisation of $p^*V$ is exactly the trivialisation of $V$ given by $Q$. Therefore $\sigma_Q^*(CS(\nabla^{p^*V})) = CS(\nabla^V) = \frac{1}{2}\eta^3(W_{t_Q})$. Applying $\sigma_Q^*$ to $p^*H_{str_Q} = q^*CS + CS(\nabla^{p^*V})$ we get $H_{str_Q} = \frac{1}{2}\eta^3(W_{t_Q})$.
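The last step of the proof can be written as a single chain of equalities. The sketch below starts from (39) with $\kappa_{h_{\mathbf V}} = CS(\nabla^{p^*V})$ and keeps the exact term $d\alpha$ of the curvature explicitly. It uses only that $\sigma_Q$ is a section, so that $p \circ \sigma_Q = \mathrm{id}_M$, together with the vanishing statements established in the proof.

```latex
\begin{align*}
H_{str_Q}
  &= (p \circ \sigma_Q)^* H_{str_Q}
   = \sigma_Q^*\, p^* H_{str_Q} \\
  &= \sigma_Q^* q^* CS + d\,\sigma_Q^*\alpha + \sigma_Q^*\, CS(\nabla^{p^*V}) \\
  &= 0 + 0 + CS(\nabla^{V})
   = \tfrac{1}{2}\,\eta^3(W_{t_Q}) .
\end{align*}
```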


4.4. Proof of Theorem 1.3. Let $\pi : E \to B$ be our surface bundle with the geometric spin bundle $\mathbf{V}$ on $E$. Let $L$ be the geometric line bundle on $B$ constructed in Subsect. 2.3.

Theorem 4.16. A geometric string structure $str = (\mathcal{S}, \omega_{\mathcal{S}}, \omega_f)$ on $\mathbf{V}$ gives a functorial unit-norm section $s_{str} \in C^\infty(B, L)$. It satisfies

\[
\nabla^L \log s_{str} = -2\pi i \int_{E/B} H_{str}.
\]

The meaning of the adjective functorial is here again the obvious compatibility of the construction with cartesian diagrams of the form (7).

Proof. If $U \subseteq B$ is a contractible open subset and $Q \in I_U$ is a trivialisation of $P|_{E_U}$, then by the construction of $L$ in Subsect. 2.3 we have a section $s_Q \in C^\infty(U, L)$. Using the string structure we will define a function $a_Q \in C^\infty(U, U(1))$ such that $\tilde s_Q := a_Q s_Q$ is independent of the choice of $Q$. The collection $(\tilde s_Q)_{U \subseteq B,\, Q \in I_U}$ therefore defines a global section $s_{str} \in C^\infty(B, L)$. In order to show the second part we will calculate that

\[
\nabla^L \log \tilde s_Q = -2\pi i \int_{E_U/U} H_{str}.
\]

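In order to define $a_Q$ we consider the projection $\mathrm{Pr} : [0, 1] \times E_U \to E_U$. There exists a unique string structure $\tilde{\mathcal{S}}$ on $\mathrm{Pr}^*V$ which restricts to $\mathcal{S}|_{E_U}$ on $\{1\} \times E_U$, and to $\mathcal{S}_Q$ on $\{0\} \times E_U$. In fact, since $H^3(E_U; \mathbb{Z}) = 0 = H^3([0, 1] \times E_U; \mathbb{Z})$ there is, up to isomorphism, only one string structure on $V|_{E_U}$ and $\mathrm{Pr}^*V|_{E_U}$, respectively. We first fix the isomorphisms above. Then we can choose a geometric string structure $\widetilde{str} = (\tilde{\mathcal{S}}, \tilde\omega_{\tilde{\mathcal{S}}}, \tilde\omega_{\tilde f})$ which restricts to the given ones, namely to $str_Q$ (defined in Lemma 4.15) and $str|_{E_U}$ at the boundaries $\{0\} \times E_U$ and $\{1\} \times E_U$. The existence of such an interpolation follows from the fact that different geometric refinements of a string structure can be glued using a partition of unity (use [Walb, Prop. 3.3.4]). We define

\[
a_Q := \exp\Bigl(-2\pi i \int_{[0,1] \times E_U/U} H_{\widetilde{str}}\Bigr),
\]

where $H_{\widetilde{str}}$ is the 3-form associated to the string structure $\widetilde{str}$ by (39).

We first show that $a_Q$ does not depend on the choices made in the construction, namely the isomorphisms of gerbes $\tilde{\mathcal{S}}|_{\{0\} \times E_U} \cong \mathcal{S}_Q$, $\tilde{\mathcal{S}}|_{\{1\} \times E_U} \cong \mathcal{S}|_{E_U}$, and of the geometry. From two such choices $\widetilde{str}$, $\widetilde{str}'$ we can produce a geometric string structure $\widehat{str}$ on

\[
\mathrm{pr}_{E_U}^*V \to S^1 \times E_U \cong \bigl(([0, 1] \times E_U) \sqcup ([0, 1] \times E_U)\bigr)/\sim,
\]

where $\sim$ identifies the boundaries, such that

\[
\int_{[0,1] \times E_U/U} H_{\widetilde{str}} - \int_{[0,1] \times E_U/U} H_{\widetilde{str}'} = \int_{S^1 \times E_U/U} H_{\widehat{str}}.
\]

Over each point $u \in U$ the value

\[
\exp\Bigl(2\pi i \int_{S^1 \times E_U/U} H_{\widehat{str}}\Bigr) \in U(1)
\]

is the evaluation $\widehat{\tfrac{p_1}{2}}(\mathrm{pr}_E^*\mathbf{V})(z_u) \in U(1)$ of the Cheeger-Simons character $\widehat{\tfrac{p_1}{2}}(\mathrm{pr}_E^*\mathbf{V})$ on the cycle

\[
z_u = (S^1 \times E_{\{u\}} \to D^2 \times E) \in Z_3(D^2 \times E),
\]

where $\mathrm{pr}_E : D^2 \times E \to E$ is the projection. Indeed, $z_u$ is the boundary of the 4-chain $(\varphi : D^2 \times E_{\{u\}} \to E) \in C_4(D^2 \times E)$, and therefore

\[
\widehat{\tfrac{p_1}{2}}(\mathrm{pr}_E^*\mathbf{V})(z_u) = \exp\Bigl(\pi i \int_{D^2 \times E_{\{u\}}} \varphi^* p_1(\nabla^V)\Bigr) = 1.
\]

This implies

\[
\int_{[0,1] \times E_U/U} H_{\widetilde{str}} - \int_{[0,1] \times E_U/U} H_{\widetilde{str}'} \in C^\infty(U, \mathbb{Z}).
\]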

Next we calculate the quotient aQQ ∈ C ∞ (U, U (1)). We choose a homotopy H from Q to Q . This homotopy gives a geometric string structure str H on [0, 1] × EU which connects str Q and str Q . We consider the projection Pr EU : [0, 1] × [0, 1] × EU → EU . " which restricts to str ! on {0} × On Pr∗EU V we can find a geometric string structure str ! on {1} × [0, 1] × EU , to str H in [0, 1] × {0} × EU , and to pr∗E str EU [0, 1] × EU , to str U on [0, 1] × {1} × EU . By Stokes’ Theorem,

[0,1]×EU /U

Hstr ! −

[0,1]×EU /U

=

[0,1]×[0,1]×EU /U

1 2 = 0.

=

Hstr ! −

[0,1]×EU /U

Hstr H

d Hstr "

[0,1]×[0,1]×EU /U

Pr∗EU p1 (∇ V )

This implies that a Q 1 , = exp 2πi Hstr H = exp πi η3 (Pr∗ Wt H ) = , Q) aQ c(Q [0,1]×EU /U [0,1]×EU /U where c(Q , Q) ∈ C ∞ (U, U (1)) = s˜ Q

s˜ Q

sQ sQ

is as in (9). We thus get

= c(Q , Q)

a Q =1 aQ


as required. We now calculate the covariant derivative of s˜ Q . We use (8), Lemma 4.15, and Stokes’ Theorem, L L ∇ log s˜ Q = ∇ log s Q − 2πid Hstr ! [0,1]×EU /U 3 = −πi η (Wt Q ) − 2πi Hstr + 2πi Hstr Q [0,1]×EU /U EU /U EU /U = −2πi Hstr . EU /U

Acknowledgement. I thank Stephan Stolz for suggesting this problem and Dan Freed for valuable hints.

References

[AS68] Atiyah, M.F., Singer, I.M.: The index of elliptic operators. I. Ann. Math. 87(2), 484–530 (1968)
[BC89] Bismut, J.-M., Cheeger, J.: η-invariants and their adiabatic limits. J. Amer. Math. Soc. 2(1), 33–70 (1989)
[BGV92] Berline, N., Getzler, E., Vergne, M.: Heat kernels and Dirac operators. Volume 298 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Berlin: Springer-Verlag, 1992
[BKS09] Bunke, U., Kreck, M., Schick, Th.: A geometric description of smooth cohomology. Ann. Math. Blaise Pascal 17(1), 1–16 (2010)
[Bor92] Borthwick, D.: The pfaffian line bundle. Commun. Math. Phys. 149(3), 463–493 (1992)
[Bos77] Booss, B.: Topology and analysis. The Atiyah-Singer index formula and gauge-theoretic physics. Universitext, Berlin-New York: Springer-Verlag, 1977
[Bry93] Brylinski, J.L.: Loop spaces, characteristic classes and geometric quantization. Volume 107 of Progress in Mathematics. Boston, MA: Birkhäuser Boston Inc., 1993
[BS07] Bunke, U., Schick, Th.: Smooth k-theory. Astérisque 328, 45–135 (2010)
[BS08] Bunke, U., Schick, Th.: Real secondary index theory. Algebr. Geom. Topol. 8(2), 1093–1139 (2008)
[BS09] Bunke, U., Schick, Th.: Uniqueness of smooth extensions of generalized cohomology theories. J. Topol. 3(1), 110–156 (2010)
[BSST08] Bunke, U., Schick, Th., Spitzweck, M., Thom, A.: Duality for topological abelian group stacks and T-duality. In: Cortinas, Guillermo (ed.) et al., K-theory and noncommutative geometry. Proceedings of the ICM 2006 satellite conference, Valladolid, Spain, 2006, Zürich: European Mathematical Society (EMS), 2008, pp. 227–347
[Bun] Bunke, U.: Chern classes on differential k-theory. Pacific J. Math. 247(2), 313–322 (2010)
[Bun09] Bunke, U.: Index theory, eta forms, and Deligne cohomology. Mem. Amer. Math. Soc. 198(928): vi+120, Providence, RI: Amer. Math. Soc., 2009
[CJM+05] Carey, A.L., Johnson, St., Murray, M.K., Stevenson, D., Wang, B.L.: Bundle gerbes for chern-simons and wess-zumino-witten theories. Commun. Math. Phys. 259(3), 577–613 (2005)
[CS85] Cheeger, J., Simons, J.: Differential characters and geometric invariants. In: Geometry and topology (College Park, Md., 1983/84), Volume 1167 of Lecture Notes in Math., Berlin: Springer, 1985, pp. 50–80
[Del87] Deligne, P.: Le déterminant de la cohomologie. In: Current trends in arithmetical algebraic geometry (Arcata, Calif., 1985), Volume 67 of Contemp. Math., Providence, RI: Amer. Math. Soc., 1987, pp. 93–177
[Fra91] Franke, J.: Chern functors. In: Arithmetic algebraic geometry (Texel, 1989), Volume 89 of Progr. Math., Boston, MA: Birkhäuser Boston, 1991, pp. 75–152
[FL09] Freed, D.S., Lott, J.: An index theorem in differential k-theory. Geom. Topol. 14(2), 903–966 (2010)
[FM06] Freed, D.S., Moore, G.W.: Setting the quantum integrand of m-theory. Commun. Math. Phys. 263(1), 89–132 (2006)
[Fre03] Freed, D.S.: On determinant line bundles. In: Mathematical aspects of string theory (San Diego, Calif., 1986), Singapore: World Sci. Publishing, 1987, pp. 189–238
[Fre02] Freed, D.S.: K-theory in quantum field theory. In: Current developments in mathematics, 2001, Somerville, MA: Int. Press, 2002, pp. 41–87
[Hei05] Heinloth, J.: Notes on differentiable stacks. In: Mathematisches Institut, Georg-August-Universität Göttingen: Seminars Winter Term 2004/2005, Göttingen: Universitätsdrucke Göttingen, 2005, pp. 1–32
[Hit01] Hitchin, N.: Lectures on special Lagrangian submanifolds. In: Winter School on Mirror Symmetry, Vector Bundles and Lagrangian Submanifolds (Cambridge, MA, 1999), Volume 23 of AMS/IP Stud. Adv. Math., Providence, RI: Amer. Math. Soc., 2001, pp. 151–182
[HS05] Hopkins, M.J., Singer, I.M.: Quadratic functions in geometry, topology, and m-theory. J. Diff. Geom. 70(3), 329–452 (2005)
[Mad09] Madsen, I.: An integral riemann-roch theorem for surface bundles. Adv. Math. 225(6), 3229–3257 (2010)
[SS08] Simons, J., Sullivan, D.: Axiomatic characterization of ordinary differential cohomology. J. Topol. 1(1), 45–56 (2008)
[ST04] Stolz, St., Teichner, P.: What is an elliptic object? In: Topology, geometry and quantum field theory, London Math. Soc. Lecture Note Ser. 308, Cambridge: Cambridge Univ. Press, 2004, pp. 247–343
[Ste04] Stevenson, D.: Bundle 2-gerbes. Proc. London Math. Soc. (3) 88(2), 405–435 (2004)
[Wala] Waldorf, K.: Multiplicative bundle gerbes with connection. Diff. Geom. Appl. 28(3), 313–340 (2010)
[Walb] Waldorf, K.: String connections and chern-simons theory. Diff. Geom. Appl. 28(3), 313–340 (2010)
[Wal07] Waldorf, K.: More morphisms between bundle gerbes. Theory Appl. Categ. 18(9), 240–273 (electronic) (2007)

Communicated by N.A. Nekrasov

Commun. Math. Phys. 307, 713–759 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1350-6


Global Solutions to the 3-D Incompressible Anisotropic Navier-Stokes System in the Critical Spaces

Marius Paicu1, Ping Zhang2

1 Institut de Mathématiques de Bordeaux, Université Bordeaux 1, 33405 Talence Cedex, France. E-mail: [email protected]
2 Academy of Mathematics and Systems Science and Hua Loo-Keng Key Laboratory of Mathematics, Chinese Academy of Sciences, Beijing 100190, China. E-mail: [email protected]

Received: 13 April 2010 / Accepted: 29 May 2011
Published online: 24 September 2011 – © Springer-Verlag 2011

Abstract: In this paper, we consider the global wellposedness of the 3-D incompressible anisotropic Navier-Stokes equations with initial data in the critical Besov-Sobolev type spaces $\mathcal{B}$ and $\mathcal{B}_4^{-\frac12,\frac12}$ (see Definitions 1.1 and 1.2 below). In particular, we prove that there exists a positive constant $C$ such that $(ANS_\nu)$ has a unique global solution with initial data $u_0 = (u_0^h, u_0^3)$ which satisfies $\|u_0^h\|_{\mathcal{B}} \exp\bigl(\frac{C}{\nu^4}\|u_0^3\|_{\mathcal{B}}^4\bigr) \le c_0\nu$ or $\|u_0^h\|_{\mathcal{B}_4^{-\frac12,\frac12}} \exp\bigl(\frac{C}{\nu^4}\|u_0^3\|_{\mathcal{B}_4^{-\frac12,\frac12}}^4\bigr) \le c_0\nu$ for some $c_0$ sufficiently small. To overcome the difficulty that Gronwall's inequality cannot be applied in the framework of Chemin-Lerner type spaces $\widetilde{L}^p_t(\mathcal{B})$, we introduce a sort of weighted Chemin-Lerner type spaces $\widetilde{L}^2_{t,f}(\mathcal{B})$ for some appropriate $L^1$ function $f(t)$.

1. Introduction

We first recall the classical (isotropic) Navier-Stokes system for incompressible fluids in the whole space:

\[
(NS_\nu)\quad
\begin{cases}
\partial_t u + u\cdot\nabla u - \nu\Delta u = -\nabla p, & (t, x) \in (0, \infty) \times \mathbb{R}^3, \\
\operatorname{div} u = 0, \\
u|_{t=0} = u_0,
\end{cases}
\]

where $u(t, x)$ denotes the fluid velocity and $p(t, x)$ the pressure. In the seminal paper [21], J. Leray proved the global existence of finite energy weak solutions to $(NS_\nu)$. This result used the structure of the nonlinear terms in $(NS_\nu)$ in order to obtain the energy inequality. An approach due to T. Kato reduces the solving of $(NS_\nu)$ to the search for a fixed point of some quadratic functional. The first result in that direction is the theorem of H. Fujita and T. Kato (see [13]) in which the authors proved that the system $(NS_\nu)$ is globally wellposed for small initial data in the homogeneous Sobolev space $\dot H^{\frac12}(\mathbb{R}^3)$, which is the space of tempered distributions $u$ whose Fourier transform satisfies

\[
\|u\|^2_{\dot H^{\frac12}} \overset{\text{def}}{=} \int_{\mathbb{R}^3} |\xi|\,|\hat u(\xi)|^2\,d\xi < \infty.
\]
Cannone, Meyer and Planchon [3], Cannone [4], and Planchon [25] proved a similar result for $(NS_\nu)$ with initial data in the negative Besov spaces $B^{-1+\frac3p}_{p,\infty}(\mathbb{R}^3)$ for $3 < p < \infty$. The interest of Besov spaces with negative regularity indices concerns the global wellposedness of the Navier-Stokes equations with highly oscillatory initial data. A different important role of Besov spaces is to give a functional framework to construct global self-similar solutions for $(NS_\nu)$ with small data homogeneous of degree $-1$ (see [3]). This approach has reached its end point with the theorem of Koch and Tataru [18]. Their theorem implies in particular that, for a given function $\varphi$ in the Schwartz space $\mathcal{S}(\mathbb{R}^3)$, if we consider the family of initial data $u_0^\varepsilon$ defined by

\[
u_0^\varepsilon(x) \overset{\text{def}}{=} \frac{\lambda}{\varepsilon}\sin\Bigl(\frac{x_1}{\varepsilon}\Bigr)(0, -\partial_3\varphi, \partial_2\varphi), \tag{1.1}
\]

then, if $\lambda$ is small enough, a positive $\varepsilon_0$ exists such that for any $\varepsilon \le \varepsilon_0$ the initial data $u_0^\varepsilon$ generates a unique global solution to $(NS_\nu)$. Those theorems are global existence results for a generalized Navier-Stokes system with small initial data and do not take into account any particular properties of the nonlinear structure in the Navier-Stokes equation. One may check [20] for complete references in this direction.

In this text, we are going to study a version of the system $(NS_\nu)$ where the usual Laplacian is substituted by the Laplacian in the horizontal variables $\Delta_h \overset{\text{def}}{=} \partial^2_{x_1} + \partial^2_{x_2}$, namely

\[
(ANS_\nu)\quad
\begin{cases}
\partial_t u + u\cdot\nabla u - \nu\Delta_h u = -\nabla p, & (t, x) \in (0, \infty) \times \mathbb{R}^3, \\
\operatorname{div} u = 0, \\
u|_{t=0} = u_0.
\end{cases}
\]

Systems of this type appear in geophysical fluids (see for instance [8]). In fact, instead of putting the classical viscosity $-\nu\Delta$ in $(NS_\nu)$, meteorologists often model turbulent diffusion by putting a viscosity of the form $-\nu_h\Delta_h - \nu_3\partial^2_{x_3}$, where $\nu_h$ and $\nu_3$ are empirical constants, and $\nu_3$ is usually much smaller than $\nu_h$. We refer to the book of J. Pedlovsky [22], Chap. 4 for a more complete discussion. We note also that in the particular case of the so-called Ekman layers (see [12,14]) for rotating fluids, $\nu_3 = \varepsilon\nu_h$ and $\varepsilon$ is a very small parameter. The system $(ANS_\nu)$ has been studied first by J. Y. Chemin, B. Desjardins, I. Gallagher and E. Grenier in [7] and D. Iftimie in [17] where it is proved that the anisotropic Navier-Stokes system $(ANS_\nu)$ is locally wellposed for initial data in the anisotropic Sobolev space

\[
H^{0,\frac12+\varepsilon} \overset{\text{def}}{=} \Bigl\{ u \in L^2(\mathbb{R}^3) \;/\; \|u\|^2_{\dot H^{0,\frac12+\varepsilon}} \overset{\text{def}}{=} \int_{\mathbb{R}^3} |\xi_3|^{1+2\varepsilon}\,|\hat u(\xi_h, \xi_3)|^2\,d\xi < +\infty \Bigr\},
\]

for some $\varepsilon > 0$. Moreover, it has also been proved that if the initial data $u_0$ is small enough in the sense that

\[
\|u_0\|^{\varepsilon}_{L^2}\,\|u_0\|^{1-\varepsilon}_{\dot H^{0,\frac12+\varepsilon}} \le c\nu \tag{1.2}
\]

for some sufficiently small constant $c$, then one has a global wellposedness result. Let us notice that the space in which uniqueness is proved in [7,17] is the space of continuous functions with values in $H^{0,\frac12+\varepsilon}(\mathbb{R}^3)$, the horizontal gradient of which belongs to $L^2_{loc}([0, T]; H^{0,\frac12+\varepsilon}(\mathbb{R}^3))$.

On the other hand, we notice that, like the classical Navier-Stokes system, the system $(ANS_\nu)$ has a scaling. Indeed, if $u$ is a solution of $(ANS_\nu)$ on a time interval $[0, T]$ with initial data $u_0$, then the vector field $u_\lambda$ defined by

\[
u_\lambda(t, x) \overset{\text{def}}{=} \lambda u(\lambda^2 t, \lambda x) \tag{1.3}
\]

is also a solution of $(ANS_\nu)$ on the time interval $[0, \lambda^{-2}T]$ with the initial data $\lambda u_0(\lambda x)$. The smallness condition (1.2) is of course scaling invariant. But the norm $\|\cdot\|_{\dot H^{\frac12+\varepsilon}}$ is not, and this norm determines the level of regularity required to have wellposedness.

M. Paicu proved in [23] a theorem of the same type for the system $(ANS_\nu)$ in the case when the initial data $u_0$ belongs to $\mathcal{B}$ (see Definition 1.1 below). This result can be looked upon as the equivalent of Fujita-Kato's theorem in the case of the $(ANS_\nu)$ system. In [11], the authors proved a theorem which in particular implies the following global wellposedness result of $(ANS_\nu)$ for initial data with high oscillation in the horizontal variable: for a given function $\varphi$ in the Schwartz space $\mathcal{S}(\mathbb{R}^3)$, if we consider the family of initial data $u_0^\varepsilon$ defined by

\[
u_0^\varepsilon(x) \overset{\text{def}}{=} \frac{\lambda}{\varepsilon^{\frac12}}\sin\Bigl(\frac{x_1}{\varepsilon}\Bigr)(0, -\partial_3\varphi, \partial_2\varphi), \tag{1.4}
\]

then, if λ is small enough, there exists a small positive constant ε0 such that for any small enough ε ≤ ε0 , the system (AN Sν ) is globally wellposed with the initial data u ε0 . This is analogous (with a smaller power of ε) to the example of initial data given in (1.1) for the case to the (N Sν ) system. The purpose of this paper is to extend the wellposedness results in [11,23], in particular, our theorem here will provide examples of larger initial data such that (AN Sν ) has a unique global solution. In the first part of the paper, we shall prove the global wellposedness of the anisotropic Navier-Stokes system with initial data having a large vertical component provided that the horizontal component is small enough (compared with the vertical one and with the horizontal viscosity coefficient). The functional framework that we shall use is invariant by scaling and by vertical dilation and is given by the anisotropic Besov-Sobolev space B (see Definition 1.1 below). Moreover, we note that we are able to obtain the energy estimate in this anisotropic space without using an additional vertical derivative, and consequently our results are valid also in the case of a vanishing vertical viscosity. The main idea to handle this anisotropic model is that the velocity field verifies a 2-D Navier-Stokes type system in the horizontal variables while in the vertical variable we have to deal with a 1D hyperbolic type equation. We shall use energy estimates in the horizontal variables so that the divergence free condition allows to control the vertical derivatives to the vertical component of the velocity field. We emphasize that our proof uses in a fundamental way the algebraic structure of the Navier-Stokes system. The first step is to obtain energy estimates on the horizontal components on the one hand and on the vertical component on the other hand. One of the difficulties with this strategy is that the pressure term does not disappear but has to be estimated. 
We remark that the equation on the vertical component is a linear equation with coefficients depending on the horizontal components. Therefore, the equation on


the vertical component does not demand any smallness condition, while the equation on the horizontal component contains bilinear terms in the horizontal components and also terms taking into account the interactions between the horizontal components and the vertical one. In order to solve this equation, we need a smallness condition on the horizontal component (amplified by the vertical component) of the initial data. At this point, we need to use the Gronwall Lemma, which cannot be applied directly in the framework of Chemin-Lerner type spaces $\widetilde{L}^p_t(\mathcal{B})$. To overcome this difficulty we shall introduce here a sort of weighted Chemin-Lerner type spaces $\widetilde{L}^2_{t,f}(\mathcal{B})$ for some appropriate $L^1$ function $f(t)$. As we already explained, our result allows us to give some examples of large data which are slowly varying in the vertical direction and which are larger than the "well prepared" case studied in [9] for the classical Navier-Stokes system.

In the second part of this paper, we shall give an analogous result in the case of initial data with very rough regularity, namely belonging to Besov-Sobolev spaces with negative regularity in the horizontal variables. The main ingredient of the proof is a combination of the strategy that we already explained above and the methods introduced in [11]. The idea is to decompose the initial data into a part where the horizontal frequencies are higher than the vertical frequencies and the rest, which belongs to the Besov-Sobolev space $\mathcal{B}$ for which we can use our previous methods. This new result allows us to present another new example of large data where we combine the high frequencies feature in the horizontal variables with a slowly varying vertical variable.

As in [7,23] and [11] (or more recently the book [1]), the definition of the spaces we are going to work with requires an anisotropic dyadic decomposition of the Fourier space. Let us recall

\[
\begin{aligned}
\Delta^h_k a &= \mathcal{F}^{-1}\bigl(\varphi(2^{-k}|\xi_h|)\,\hat a\bigr), &\qquad \Delta^v_\ell a &= \mathcal{F}^{-1}\bigl(\varphi(2^{-\ell}|\xi_3|)\,\hat a\bigr), \quad\text{and} \\
S^h_k a &= \sum_{k' \le k-1} \Delta^h_{k'} a, &\qquad S^v_\ell a &= \sum_{\ell' \le \ell-1} \Delta^v_{\ell'} a,
\end{aligned}
\tag{1.5}
\]

where $\mathcal{F}a$ and $\hat a$ denote the Fourier transform of the distribution $a$, and $\varphi(\tau)$ is a smooth function such that

\[
\operatorname{Supp}\varphi \subset \Bigl\{ \tau \in \mathbb{R} \;/\; \frac34 \le |\tau| \le \frac83 \Bigr\} \quad\text{and}\quad \forall\tau > 0, \ \sum_{j\in\mathbb{Z}} \varphi(2^{-j}\tau) = 1.
\]

Before we present the spaces we are going to work with, let us first recall the Besov-Sobolev type space $\mathcal{B}$ from [16,23] and $\mathcal{B}_4^{-\frac12,\frac12}$ from [11].

Definition 1.1. We call $\mathcal{B}$ the space of tempered distributions, which is the completion of $\mathcal{S}(\mathbb{R}^3)$ by the following norm:

\[
\|a\|_{\mathcal{B}} \overset{\text{def}}{=} \sum_{\ell\in\mathbb{Z}} 2^{\frac{\ell}{2}}\,\|\Delta^v_\ell a\|_{L^2(\mathbb{R}^3)}. \tag{1.6}
\]

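The space $\mathcal{B}(T)$ is the completion of $C^\infty([0, T]; \mathcal{S}(\mathbb{R}^3))$ by the norm:

\[
\|a\|_{\mathcal{B}(T)} \overset{\text{def}}{=} \|a\|_{\widetilde{L}^\infty_T(\mathcal{B})} + \sqrt{\nu}\,\|\nabla_h a\|_{\widetilde{L}^2_T(\mathcal{B})}
\]

with

\[
\|a\|_{\widetilde{L}^\infty_T(\mathcal{B})} \overset{\text{def}}{=} \sum_{\ell\in\mathbb{Z}} 2^{\frac{\ell}{2}}\,\|\Delta^v_\ell a\|_{L^\infty_T(L^2(\mathbb{R}^3))} \quad\text{and}\quad
\|\nabla_h a\|_{\widetilde{L}^2_T(\mathcal{B})} \overset{\text{def}}{=} \sum_{\ell\in\mathbb{Z}} 2^{\frac{\ell}{2}}\,\|\nabla_h\Delta^v_\ell a\|_{L^2_T(L^2(\mathbb{R}^3))}. \tag{1.7}
\]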
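The scaling (1.3) and the scaling invariance of the norm (1.6) claimed in the Introduction can be checked directly. The sketch below makes two assumptions that are not stated in the text: the pressure is rescaled as $p_\lambda(t, x) := \lambda^2 p(\lambda^2 t, \lambda x)$, and the dilation factor is dyadic, $\lambda = 2^k$, so that the dyadic blocks are simply shifted (for general $\lambda$ one expects an equivalence of norms with constants independent of $\lambda$). Here $a_\lambda(x) := \lambda a(\lambda x)$.

```latex
% Assumption: p_\lambda(t,x) := \lambda^2 p(\lambda^2 t, \lambda x).
\begin{align*}
\partial_t u_\lambda + u_\lambda\cdot\nabla u_\lambda - \nu\Delta_h u_\lambda + \nabla p_\lambda
  &= \lambda^3\bigl(\partial_t u + u\cdot\nabla u - \nu\Delta_h u + \nabla p\bigr)(\lambda^2 t, \lambda x) = 0, \\
\operatorname{div} u_\lambda
  &= \lambda^2 (\operatorname{div} u)(\lambda^2 t, \lambda x) = 0 .
\end{align*}
% Assumption: \lambda = 2^k, a_\lambda(x) := \lambda a(\lambda x).
\begin{align*}
\Delta^v_\ell a_\lambda(x) &= \lambda\,(\Delta^v_{\ell-k} a)(\lambda x),
\qquad
\|\Delta^v_\ell a_\lambda\|_{L^2(\mathbb{R}^3)}
   = \lambda\,\lambda^{-\frac32}\,\|\Delta^v_{\ell-k} a\|_{L^2(\mathbb{R}^3)}
   = 2^{-\frac{k}{2}}\,\|\Delta^v_{\ell-k} a\|_{L^2(\mathbb{R}^3)}, \\
\|a_\lambda\|_{\mathcal B}
  &= \sum_{\ell\in\mathbb{Z}} 2^{\frac{\ell-k}{2}}\,\|\Delta^v_{\ell-k} a\|_{L^2(\mathbb{R}^3)}
   = \sum_{\ell'\in\mathbb{Z}} 2^{\frac{\ell'}{2}}\,\|\Delta^v_{\ell'} a\|_{L^2(\mathbb{R}^3)}
   = \|a\|_{\mathcal B} .
\end{align*}
```

The same index shift applies to a purely vertical dilation $a(x_h, 2^k x_3)$: the block $\Delta^v_\ell$ becomes $\Delta^v_{\ell-k}$ and the $L^2$ norm picks up the factor $2^{-k/2}$, which is the invariance by vertical dilation mentioned in the Introduction.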

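In [23], the author proved the local wellposedness of $(ANS_\nu)$ with initial data in $\mathcal{B}$. Moreover, if the initial data is small enough compared to the horizontal viscosity, he also established the global wellposedness result. Note from the above definition that $u_0^\varepsilon$ defined in (1.4) is not small in this space no matter how small $\varepsilon$ is. The main motivation for the authors to introduce the space $\mathcal{B}_4^{-\frac12,\frac12}$ in [11] is to find a scaling invariant Besov-Sobolev type space, which is very close to the classical Besov spaces of negative indices, such that in particular $u_0^\varepsilon$ defined in (1.4) is small in this space for $\varepsilon$ and $\lambda$ sufficiently small. We emphasize that such a space has to be a Besov type space with negative regularity indices in the horizontal variables in order to take into account strong oscillations in the horizontal variables. Meanwhile for the vertical variable, we have to use a space which is invariant by vertical dilation in order to consider "slowly varying" data in the vertical variable. This justifies the following definition:

Definition 1.2. We denote by $\mathcal{B}_4^{-\frac12,\frac12}$ the space of distributions, which is the completion of $\mathcal{S}(\mathbb{R}^3)$ by the following norm:

\[
\|a\|_{\mathcal{B}_4^{-\frac12,\frac12}} \overset{\text{def}}{=} \sum_{\ell\in\mathbb{Z}} 2^{\frac{\ell}{2}} \Bigl( \sum_{k=\ell-1}^{\infty} 2^{-k}\,\|\Delta^h_k\Delta^v_\ell a\|^2_{L^4_h(L^2_v)} \Bigr)^{\frac12} + \sum_{j\in\mathbb{Z}} 2^{\frac{j}{2}}\,\|S^h_{j-1}\Delta^v_j a\|_{L^2(\mathbb{R}^3)}. \tag{1.8}
\]

The space $\mathcal{B}_4^{-\frac12,\frac12}(T)$ is the completion of $C^\infty([0, T]; \mathcal{S}(\mathbb{R}^3))$ by the norm:

\[
\|a\|_{\mathcal{B}_4^{-\frac12,\frac12}(T)} \overset{\text{def}}{=} \|a\|_{\widetilde{L}^\infty_T(\mathcal{B}_4^{-\frac12,\frac12})} + \sqrt{\nu}\,\|\nabla_h a\|_{\widetilde{L}^2_T(\mathcal{B}_4^{-\frac12,\frac12})}
\]

with

\[
\|a\|_{\widetilde{L}^\infty_T(\mathcal{B}_4^{-\frac12,\frac12})} \overset{\text{def}}{=} \sum_{\ell\in\mathbb{Z}} 2^{\frac{\ell}{2}} \Bigl( \sum_{k=\ell-1}^{\infty} 2^{-k}\,\|\Delta^h_k\Delta^v_\ell a\|^2_{L^\infty_T(L^4_h(L^2_v))} \Bigr)^{\frac12} + \sum_{j\in\mathbb{Z}} 2^{\frac{j}{2}}\,\|S^h_{j-1}\Delta^v_j a\|_{L^\infty_T(L^2(\mathbb{R}^3))}
\]

and

\[
\|\nabla_h a\|_{\widetilde{L}^2_T(\mathcal{B}_4^{-\frac12,\frac12})} \overset{\text{def}}{=} \sum_{\ell\in\mathbb{Z}} 2^{\frac{\ell}{2}} \Bigl( \sum_{k=\ell-1}^{\infty} 2^{k}\,\|\Delta^h_k\Delta^v_\ell a\|^2_{L^2_T(L^4_h(L^2_v))} \Bigr)^{\frac12} + \sum_{j\in\mathbb{Z}} 2^{\frac{j}{2}}\,\|\nabla_h S^h_{j-1}\Delta^v_j a\|_{L^2_T(L^2(\mathbb{R}^3))}. \tag{1.9}
\]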


M. Paicu, P. Zhang

More recently, Zhang [26] claimed the following result:

Theorem 1.1. A positive constant $C_1$ exists such that if $u_0\in B$ satisfies $\operatorname{div} u_0 = 0$ and

\[
C_1\nu^{-1}\|u_0^h\|_{B}\exp\Bigl(C_1\bigl(\nu^{-1}\|u_0^3\|_{B}+1\bigr)^4\Bigr) \le 1, \tag{1.10}
\]

then $(ANS_\nu)$ has a unique global solution.

However, we found that there is a serious gap in the proof of Theorem 1.1 in [26]. The main reason is that one cannot use a Gronwall type inequality in the framework of the spaces $\widetilde{L}^p_t(B)$. Indeed, the only gap in the proof of Theorem 1.1 lies in the proof of Proposition 3.2 in [26]. For instance, using the Hölder inequality and Lemma 3.3 in [26], what one can obtain is

\[
G_j^v(T) \lesssim \int_0^T d_j(t)\,2^{-\frac{j}{2}}\,\|u^h(t)\|_{B}^{\frac12}\|\nabla_h u^h(t)\|_{B}^{\frac12}\|u^3(t)\|_{B}^{\frac12}\|\nabla_h u^3(t)\|_{B}^{\frac12}\,\|\Delta_j^v\nabla_h w(t)\|_{L^2}\,dt,
\]

which cannot be dominated by

\[
d_j^2\,2^{-j}\int_0^T \|u^h(t)\|_{B}^{\frac12}\|\nabla_h u^h(t)\|_{B}^{\frac12}\|u^3(t)\|_{B}^{\frac12}\|\nabla_h u^3(t)\|_{B}^{\frac12}\,\|\nabla_h w(t)\|_{B}\,dt,
\]

as is claimed in [26], where both $(d_j)_{j\in\mathbb{Z}}$ and $(d_j(t))_{j\in\mathbb{Z}}$ are generic elements of $\ell^1(\mathbb{Z})$ with norm equal to 1. To overcome the difficulty mentioned above, the authors of [15] basically proved the global wellposedness of $(ANS_\nu)$ provided that

\[
\|u_0^h\|_{H^{0,s_0}}\exp\Bigl(C_0\bigl(\nu^{-1}\|u_0^3\|_{H^{0,s_0}}\bigr)^4\Bigr) \le c\nu
\]

for $s_0>\frac12$ and some $c$ sufficiently small. Moreover, a sort of global stability result was proved for the classical Navier-Stokes system $(NS_\nu)$ with anisotropic type perturbation of the initial data to any given global smooth solution of $(NS_\nu)$. However, the regularity level of the initial data in this result is not scaling invariant, and this is our main motivation to obtain the following Theorem 1.2.

Now we present the main results of this paper:

Theorem 1.2. Let $u_0 = (u_0^h,u_0^3)\in B$ be a divergence free vector field. Then there exists a positive constant $L$ such that if

\[
\eta \stackrel{\mathrm{def}}{=} \|u_0^h\|_{B}\exp\Bigl(\frac{L}{\nu^4}\|u_0^3\|_{B}^4\Bigr) \le c_0\nu \tag{1.11}
\]

for some $c_0$ sufficiently small, the system $(ANS_\nu)$ has a unique global solution $u\in C([0,\infty);B)$ with $\nabla_h u\in L^2(\mathbb{R}^+;B)$. Moreover,

\[
\begin{aligned}
&\|u^h\|_{\widetilde{L}^\infty(\mathbb{R}^+;B)} + \sqrt{\nu}\,\|\nabla_h u^h\|_{\widetilde{L}^2(\mathbb{R}^+;B)} \le 2\exp\Bigl(\frac{L}{16}\Bigr)\eta \quad\text{and}\\
&\|u^3\|_{\widetilde{L}^\infty(\mathbb{R}^+;B)} + \sqrt{\nu}\,\|\nabla_h u^3\|_{\widetilde{L}^2(\mathbb{R}^+;B)} \le 2\|u_0^3\|_{B} + \nu
\end{aligned} \tag{1.12}
\]

holds.


Remark 1.1. (1) This theorem ensures the global wellposedness of $(ANS_\nu)$ with initial data of the form $u_0 = (u_0^h,u_0^3)$ with $\operatorname{div} u_0 = 0$ and $\|u_0^3\|_{B}\le C\nu$ for any positive constant $C$ while $\|u_0^h\|_{B}\le c\nu$ for some sufficiently small $c$, which in particular implies the global wellposedness result in [23].

(2) Theorem 1.2 also ensures the global wellposedness of $(ANS_\nu)$ with initial data of the form $u_0 = \bigl((-\ln\varepsilon)^\delta v_0^h(x_h,x_3),\ (-\ln\varepsilon)^\delta u_0^3(x_h,x_3)\bigr)$ for $0<\delta<1/4$ and $\varepsilon$ sufficiently small, as claimed in [26].

(3) Very recently, Zhang [27] corrected his former result (1.10) in [26] to be $C_1\nu^{-1}\|u_0^h\|_{B}\exp\bigl(C_1(\nu^{-1}\|u_0^3\|_{B}+1)^8\bigr)\le 1$ by using basically the same idea as in [26].

The main tool that we shall use to overcome the difficulty that we cannot use the Gronwall inequality in the framework of $\widetilde{L}^p_t(B)$, as we mentioned before, is to introduce the following weighted Chemin-Lerner [10] type norm:

Definition 1.3. Let $f(t)\in L^1_{loc}(\mathbb{R}^+)$, $f(t)\ge 0$. We define

\[
\|u\|_{\widetilde{L}^2_{T,f}(B)} = \sum_{q\in\mathbb{Z}} 2^{\frac{q}{2}}\Bigl(\int_0^T f(t)\,\|\Delta_q^v u(t)\|^2_{L^2}\,dt\Bigr)^{\frac12}.
\]

Remark 1.2. In fact, Definition 1.3 is very much motivated by the following variant of Gronwall's Lemma. Let $X(t)$, $f(t)$, $h(t)$ be positive functions so that

\[
\frac{d}{dt}X(t) \le C f(t)X(t) + h(t), \qquad X(0) = X_0. \tag{1.13}
\]

Instead of directly applying Gronwall's Lemma to (1.13), we get, by multiplying (1.13) by $g_\lambda(t) \stackrel{\mathrm{def}}{=} \exp\bigl(-\lambda\int_0^t f(t')\,dt'\bigr)$ and integrating the resulting inequality over $[0,t]$, that

\[
g_\lambda(t)X(t) + \lambda\int_0^t g_\lambda(t')f(t')X(t')\,dt' \le X_0 + C\int_0^t g_\lambda(t')f(t')X(t')\,dt' + \int_0^t g_\lambda(t')h(t')\,dt'.
\]

In particular, if we take $\lambda\ge C$, this gives rise to

\[
g_\lambda(t)X(t) \le X_0 + \int_0^t g_\lambda(t')h(t')\,dt',
\]

that is

\[
X(t) \le X_0\exp\Bigl(\lambda\int_0^t f(t')\,dt'\Bigr) + \int_0^t \exp\Bigl(\lambda\int_{t'}^t f(\tau)\,d\tau\Bigr)h(t')\,dt', \tag{1.14}
\]

for λ ≥ C. The main motivation to introduce Definition 1.3 is to adapt the proof of (1.14) to the framework of Chemin-Lerner type spaces, namely, integrating any dyadic block with respect to time first and then making the summation with respect to q.
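As a sanity check of the weighted Gronwall bound (1.14), the following minimal sketch integrates the extremal case $X' = C f X + h$ of (1.13) numerically and compares it with the right-hand side of (1.14). The concrete choices $f(t) = 1+t$, $h(t) = \cos^2 t$, $C = 1.5$, $X_0 = 2$ and all function names are assumptions made purely for illustration.

```python
import math

def f(t):
    return 1.0 + t                 # hypothetical weight, nonnegative

def h(t):
    return math.cos(t) ** 2        # hypothetical forcing, nonnegative

def F(t):
    return t + 0.5 * t * t         # exact integral of f over [0, t]

def solve_equality_case(C, X0, T, steps):
    """RK4 integration of X' = C f X + h, the extremal case of (1.13)."""
    dt = T / steps
    X, t = X0, 0.0
    rhs = lambda s, x: C * f(s) * x + h(s)
    for _ in range(steps):
        k1 = rhs(t, X)
        k2 = rhs(t + dt / 2, X + dt * k1 / 2)
        k3 = rhs(t + dt / 2, X + dt * k2 / 2)
        k4 = rhs(t + dt, X + dt * k3)
        X += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
    return X

def weighted_bound(lam, X0, T, steps):
    """Right-hand side of (1.14), with the integral done by the trapezoid rule."""
    dt = T / steps
    vals = [math.exp(lam * (F(T) - F(i * dt))) * h(i * dt) for i in range(steps + 1)]
    integral = dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return X0 * math.exp(lam * F(T)) + integral

C, X0, T = 1.5, 2.0, 1.0
X_T = solve_equality_case(C, X0, T, 2000)
bound_sharp = weighted_bound(C, X0, T, 2000)        # lambda = C: equality case
bound_loose = weighted_bound(2 * C, X0, T, 2000)    # lambda > C: strict bound
```

With $\lambda = C$ the bound is attained (up to discretization error), while any larger $\lambda$ gives a cruder but still valid estimate; this is the freedom that the weighted norm of Definition 1.3 exploits.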

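On the other hand, it follows from [11] that: given $u_0 = (u_0^h,u_0^3)\in B_4^{-\frac12,\frac12}$, $(ANS_\nu)$ has a unique solution of the form

\[
u = u_F + w, \tag{1.15}
\]

where $w\in B(T)$ for some positive time $T$ and $u_F$ is given by

\[
u_F \stackrel{\mathrm{def}}{=} e^{\nu t\Delta_h}u_{hh} \quad\text{and}\quad u_{hh} \stackrel{\mathrm{def}}{=} \sum_{k\ge \ell-1}\Delta_k^h\Delta_\ell^v u_0. \tag{1.16}
\]

Substituting (1.15) into $(ANS_\nu)$ results in

\[
\begin{cases}
\partial_t w + w\cdot\nabla w - \nu\Delta_h w + w\cdot\nabla u_F + u_F\cdot\nabla w = -u_F\cdot\nabla u_F - \nabla p,\\
\operatorname{div} w = 0,\\
w|_{t=0} = u_{\ell h} \stackrel{\mathrm{def}}{=} u_0 - u_{hh}.
\end{cases} \tag{1.17}
\]

Notice that

\[
\|\Delta_j^v u_{\ell h}\|_{L^2} \lesssim \sum_{|j-j'|\le 1}\|S^h_{j'-1}\Delta_{j'}^v u_0\|_{L^2}, \tag{1.18}
\]

which along with Definition 1.2 implies that if $u_0$ belongs to $B_4^{-\frac12,\frac12}$, then $u_{\ell h}$ belongs to $B$ and

\[
\|u_{\ell h}\|_{B} \lesssim \|u_0\|_{B_4^{-\frac12,\frac12}}. \tag{1.19}
\]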

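Combining the techniques used in the proof of Theorem 1.2 and the above observation, we can prove the following wellposedness result for $(ANS_\nu)$:

Theorem 1.3. Let $u_0 = (u_0^h,u_0^3)\in B_4^{-\frac12,\frac12}$ be a divergence free vector field. Then there exists a positive constant $M$ such that if

\[
\eta_1 \stackrel{\mathrm{def}}{=} \|u_0^h\|_{B_4^{-\frac12,\frac12}}\exp\Bigl(\frac{M}{\nu^4}\|u_0^3\|^4_{B_4^{-\frac12,\frac12}}\Bigr) \le c_1\nu \tag{1.20}
\]

for some $c_1$ sufficiently small, the system $(ANS_\nu)$ has a unique global solution $u = u_F + w$ such that $u_F$ is given by (1.16) and $w\in C([0,\infty);B)$ with $\nabla_h w\in L^2(\mathbb{R}^+;B)$. Moreover, there exists a positive constant $K$ such that

\[
\begin{aligned}
&\|w^h\|_{\widetilde{L}^\infty(\mathbb{R}^+;B)} + \sqrt{\nu}\,\|\nabla_h w^h\|_{\widetilde{L}^2(\mathbb{R}^+;B)} \le K\eta_1 \quad\text{and}\\
&\|w^3\|_{\widetilde{L}^\infty(\mathbb{R}^+;B)} + \sqrt{\nu}\,\|\nabla_h w^3\|_{\widetilde{L}^2(\mathbb{R}^+;B)} \le K\|u_0^3\|_{B_4^{-\frac12,\frac12}} + \nu.
\end{aligned} \tag{1.21}
\]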

Remark 1.3. (1) This theorem ensures the global wellposedness of $(ANS_\nu)$ with initial data of the form $u_0 = (u_0^h,u_0^3)$ with $\operatorname{div} u_0 = 0$ and $\|u_0^3\|_{B_4^{-\frac12,\frac12}}\le C\nu$ for any positive constant $C$ while $\|u_0^h\|_{B_4^{-\frac12,\frac12}}\le c\nu$ for some sufficiently small $c$. In particular, this theorem implies the global wellposedness result in [11].


(2) For a given function $\phi$ in the Schwartz space $\mathcal{S}(\mathbb{R}^3)$, Theorem 1.3 along with Proposition 1.1 (which claims that $\|e^{ix_1/\varepsilon}\phi\|_{B_4^{-\frac12,\frac12}}\le c_\phi\,\varepsilon^{\frac12}$) also ensures the global wellposedness of $(ANS_\nu)$ with initial data of the form

\[
u_0 = (-\ln\varepsilon)^\delta\sin\Bigl(\frac{x_1}{\varepsilon}\Bigr)\Bigl(0,\ -\varepsilon^{\frac12}\partial_3\phi(x_h,\varepsilon x_3),\ \varepsilon^{-\frac12}\partial_2\phi(x_h,\varepsilon x_3)\Bigr)
\]

for $0<\delta<1/4$ and $\varepsilon$ sufficiently small.

(3) Both Theorem 1.2 and Theorem 1.3 hold for the classical Navier-Stokes equations $(NS_\nu)$. We emphasize that our results strongly use the algebraic structure of the nonlinear terms and the divergence-free condition of the velocity field. In our subsequent paper [24], we obtained similar global wellposedness results for the 3-D inhomogeneous Navier-Stokes equations with initial data in the critical Besov spaces.

In the rest of the paper, we shall constantly use the anisotropic version of the isotropic para-differential decomposition due to J. M. Bony [2]: for $a,b\in\mathcal{S}'(\mathbb{R}^3)$,

\[
ab = T_a^v b + R^v(a,b), \quad\text{or}\quad ab = T_a^v b + T_b^v a + \mathcal{R}^v(a,b),
\]

where

\[
T_a^v b = \sum_{q\in\mathbb{Z}} S^v_{q-1}a\,\Delta_q^v b, \qquad R^v(a,b) = \sum_{q\in\mathbb{Z}}\Delta_q^v a\,S^v_{q+2}b, \qquad\text{and}\qquad \mathcal{R}^v(a,b) = \sum_{|q-q'|\le 1}\Delta_q^v a\,\Delta_{q'}^v b. \tag{1.22}
\]
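The two forms of Bony's decomposition in (1.22) are, at the algebraic level, just two ways of partitioning the index pairs $(q,q')$ in the double sum $\sum_{q,q'}\Delta_q^v a\,\Delta_{q'}^v b$. The sketch below checks this bookkeeping on a finite family of blocks. It is an illustration only: the blocks are random arrays standing in for $\Delta_q^v a$ and $\Delta_q^v b$ (no actual frequency localization is performed), and the names `S`, `T`, `R_asym`, `R_sym` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
Q = 12                                   # number of stand-in dyadic blocks
a_blocks = rng.normal(size=(Q, 64))      # row q plays the role of Delta_q a
b_blocks = rng.normal(size=(Q, 64))      # row q plays the role of Delta_q b

def S(blocks, q):
    """Low-frequency cut-off S_q: sum of the blocks with index <= q - 1."""
    return blocks[:max(q, 0)].sum(axis=0)

def T(x_blocks, y_blocks):
    """Paraproduct T_x y = sum_q S_{q-1} x * Delta_q y."""
    return sum(S(x_blocks, q - 1) * y_blocks[q] for q in range(Q))

def R_asym(x_blocks, y_blocks):
    """R(x, y) = sum_q Delta_q x * S_{q+2} y."""
    return sum(x_blocks[q] * S(y_blocks, q + 2) for q in range(Q))

def R_sym(x_blocks, y_blocks):
    """Symmetric remainder: sum over |q - q'| <= 1 of Delta_q x * Delta_q' y."""
    return sum(x_blocks[q] * y_blocks[p]
               for q in range(Q) for p in range(Q) if abs(q - p) <= 1)

a = a_blocks.sum(axis=0)
b = b_blocks.sum(axis=0)
two_term = T(a_blocks, b_blocks) + R_asym(a_blocks, b_blocks)
three_term = T(a_blocks, b_blocks) + T(b_blocks, a_blocks) + R_sym(a_blocks, b_blocks)
```

In the first form the pairs with $q'\le q-2$ go to $T_a^v b$ and all others to $R^v(a,b)$; in the second form the pairs split into $q'\le q-2$, $q\le q'-2$ and $|q-q'|\le 1$.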

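A similar decomposition for the horizontal variables will also be used frequently.

Scheme of the proof and organization of the paper. In Sect. 2, we shall present the proof of Theorem 1.2. The main idea can be outlined as follows: we first work out the energy estimate for the horizontal components of the velocity field so that

\[
\|u_\lambda^h\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\lambda}\,\|u_\lambda^h\|_{\widetilde{L}^2_{t,f}(B)} + \sqrt{\nu}\,\|\nabla_h u_\lambda^h\|_{\widetilde{L}^2_t(B)} \le \|u_0^h\|_{B} + C_0\Bigl(\|u^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_\lambda^h\|_{\widetilde{L}^2_t(B)} + \nu^{-\frac32}\|u_\lambda^h\|_{\widetilde{L}^2_{t,f}(B)}\Bigr), \tag{1.23}
\]

where

\[
u_\lambda(t,x) \stackrel{\mathrm{def}}{=} e^{-\lambda\int_0^t f(t')\,dt'}u(t,x), \quad\text{with}\quad f(t) \stackrel{\mathrm{def}}{=} \|u^3(t)\|^2_{B}\,\|\nabla_h u^3(t)\|^2_{B}. \tag{1.24}
\]

Along the same lines as Remark 1.2, if we take $\lambda = \frac{4C_0^2}{\nu^3}$ in (1.23) and define $T^*$ by

\[
T^* \stackrel{\mathrm{def}}{=} \max\Bigl\{t:\ \|u^h\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u^h\|_{\widetilde{L}^2_t(B)} \le \min\Bigl(\frac{1}{4C_0^2},\varepsilon_0\Bigr)\nu\Bigr\}, \tag{1.25}
\]

we infer from (1.23) that

\[
\|u_\lambda^h\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u_\lambda^h\|_{\widetilde{L}^2_t(B)} \le 2\|u_0^h\|_{B} \quad\text{for } t<T^*.
\]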


Then thanks to (1.24), we arrive at

\[
\|u^h\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u^h\|_{\widetilde{L}^2_t(B)} \le 2\|u_0^h\|_{B}\exp\Bigl(\frac{4C_0^2}{\nu^3}\int_0^t\|u^3(t')\|^2_{B}\|\nabla_h u^3(t')\|^2_{B}\,dt'\Bigr) \tag{1.26}
\]

for $t<T^*$.

The second step of the proof is to obtain the energy estimate for the vertical component of the velocity field. We use at this point the important fact that $u^3$ verifies a linear equation with coefficients depending on $u^h$, so that

\[
\|u^3\|_{\widetilde{L}^\infty_t(B)} + \sqrt{2\nu}\,\|\nabla_h u^3\|_{\widetilde{L}^2_t(B)} \le \|u_0^3\|_{B} + C\Bigl(\|u^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h\|_{\widetilde{L}^2_t(B)} + \|u^3\|^{\frac14}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^3\|^{\frac14}_{\widetilde{L}^2_t(B)}\|u^h\|^{\frac14}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h\|^{\frac34}_{\widetilde{L}^2_t(B)}\Bigr).
\]

Then thanks to (1.25), we obtain

\[
\|u^3\|_{\widetilde{L}^\infty_t(B)} + \sqrt{2\nu}\,\|\nabla_h u^3\|_{\widetilde{L}^2_t(B)} \le \|u_0^3\|_{B} + C\Bigl(\varepsilon_0^{\frac32}\nu + \varepsilon_0\,\nu^{\frac58}\,\|u^3\|^{\frac14}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^3\|^{\frac14}_{\widetilde{L}^2_t(B)}\Bigr)
\]

for $t<T^*$. Taking $\varepsilon_0 = \varepsilon_0(C)$ small enough, we obtain

\[
\|u^3\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u^3\|_{\widetilde{L}^2_t(B)} \le 2\|u_0^3\|_{B} + \nu \quad\text{for } t<T^*. \tag{1.27}
\]

The last step is to prove that $T^* = \infty$, provided that $c_0$ in (1.11) is sufficiently small. Indeed if $T^*<\infty$, it follows from (1.27) that

\[
\int_0^t\|u^3(t')\|^2_{B}\|\nabla_h u^3(t')\|^2_{B}\,dt' \le \|u^3\|^2_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^3\|^2_{\widetilde{L}^2_t(B)} \le \frac{16}{\nu}\bigl(16\|u_0^3\|^4_{B} + \nu^4\bigr).
\]

Substituting the above inequality into (1.26) ensures that

\[
\|u^h\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u^h\|_{\widetilde{L}^2_t(B)} \le 2\exp(64C_0^2)\,\|u_0^h\|_{B}\exp\Bigl(\frac{1024C_0^2}{\nu^4}\|u_0^3\|^4_{B}\Bigr) \le \frac12\min\Bigl(\frac{1}{4C_0^2},\varepsilon_0\Bigr)\nu \quad\text{for } t<T^*,
\]

provided that we take $L = 1024C_0^2$ and $c_0\le\frac14\exp(-64C_0^2)\min\bigl(\frac{1}{4C_0^2},\varepsilon_0\bigr)$ in (1.11). This contradicts the definition of $T^*$ in (1.25), and therefore $T^* = +\infty$.

In Sect. 2, we shall rigorously work out the above estimates for appropriate approximate solutions of $(ANS_\nu)$, and then prove the existence part via a compactness argument. Following exactly the same lines as the proof of Theorem 1.2 in Sect. 2, we shall present the proof of Theorem 1.3 in Sect. 3. However, comparing (1.17) with $(ANS_\nu)$, three additional terms appear in (1.17). Therefore a more complicated argument is needed in the proof of estimates like (1.26) and (1.27) for $w$. One may check (3.50) and (3.54) for more details.

Let us complete the introduction with the notations we are going to use in this context.
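The constants in the last step of the scheme fit together by elementary arithmetic, which the following sketch verifies numerically: the exponent produced by $\lambda = 4C_0^2/\nu^3$ and the bound $\frac{16}{\nu}(16\|u_0^3\|_B^4+\nu^4)$ equals $64C_0^2 + \frac{1024C_0^2}{\nu^4}\|u_0^3\|_B^4$, and the stated threshold for $c_0$ closes the bootstrap. The numerical values of `C0`, `nu`, `x` (standing for $\|u_0^3\|_B$) and `eps0` are arbitrary assumptions, as are the function names.

```python
import math

def exponent_from_1_26(C0, nu, x):
    """lambda times the bound on the time integral, with lambda = 4 C0^2 / nu^3."""
    lam = 4.0 * C0 ** 2 / nu ** 3
    integral_bound = (16.0 / nu) * (16.0 * x ** 4 + nu ** 4)
    return lam * integral_bound

def exponent_claimed(C0, nu, x):
    """64 C0^2 + (1024 C0^2 / nu^4) x^4, i.e. L = 1024 C0^2 in (1.11)."""
    return 64.0 * C0 ** 2 + 1024.0 * C0 ** 2 * x ** 4 / nu ** 4

def c0_threshold(C0, eps0):
    """Largest c0 allowed by the closing step of the bootstrap."""
    return 0.25 * math.exp(-64.0 * C0 ** 2) * min(1.0 / (4.0 * C0 ** 2), eps0)

C0, nu, x, eps0 = 0.5, 0.3, 0.7, 0.1     # arbitrary illustrative values

# (1.27) squared twice: ||u^3||^2 ||grad_h u^3||^2 <= (2x + nu)^4 / nu.
product_bound = (2.0 * x + nu) ** 4 / nu
integral_bound = (16.0 / nu) * (16.0 * x ** 4 + nu ** 4)

# If eta = c0 * nu with the threshold value of c0, the final bound is
# exactly half of the quantity defining T^* in (1.25).
eta = c0_threshold(C0, eps0) * nu
final_bound = 2.0 * math.exp(64.0 * C0 ** 2) * eta
target = 0.5 * min(1.0 / (4.0 * C0 ** 2), eps0) * nu
```

The factor $\exp(L/16)$ in (1.12) is the same $\exp(64C_0^2)$ once $L = 1024C_0^2$.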


Notations. Let $A$, $B$ be two operators; we denote by $[A;B] = AB - BA$ the commutator between $A$ and $B$. By $a\lesssim b$ we mean that there is a uniform constant $C$, which may be different on different lines, such that $a\le Cb$. We shall denote by $(a|b)$ (or $(a|b)_{L^2}$) the $L^2(\mathbb{R}^3)$ inner product of $a$ and $b$. We denote by $L^r_T(L^p_h(L^q_v))$ the space $L^r([0,T];L^p(\mathbb{R}_{x_1}\times\mathbb{R}_{x_2};L^q(\mathbb{R}_{x_3})))$. Finally, we denote by $(c_k)_{k\in\mathbb{Z}}$ (resp. $(d_j)_{j\in\mathbb{Z}}$) a generic element of the unit sphere of $\ell^2(\mathbb{Z})$ (resp. $\ell^1(\mathbb{Z})$), and by $(d_{k,j})_{(k,j)\in\mathbb{Z}^2}$ a generic sequence such that

\[
\sum_{j\in\mathbb{Z}}\Bigl(\sum_{k\in\mathbb{Z}} d_{k,j}^2\Bigr)^{\frac12} = 1.
\]

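2. The Proof of Theorem 1.2

The goal of this section is to present the proof of Theorem 1.2. For the convenience of the readers, we first recall the following Bernstein type lemma from [6,11]:

Lemma 2.1. Let $\mathcal{B}_h$ (resp. $\mathcal{B}_v$) be a ball of $\mathbb{R}^2_h$ (resp. $\mathbb{R}_v$), and $\mathcal{C}_h$ (resp. $\mathcal{C}_v$) a ring of $\mathbb{R}^2_h$ (resp. $\mathbb{R}_v$); let $1\le p_2\le p_1\le\infty$ and $1\le q_2\le q_1\le\infty$. Then the following hold:

If the support of $\hat a$ is included in $2^k\mathcal{B}_h$, then

\[
\|\partial^\alpha_{x_h}a\|_{L^{p_1}_h(L^{q_1}_v)} \lesssim 2^{k\bigl(|\alpha| + 2\bigl(\frac{1}{p_2}-\frac{1}{p_1}\bigr)\bigr)}\,\|a\|_{L^{p_2}_h(L^{q_1}_v)}.
\]

If the support of $\hat a$ is included in $2^\ell\mathcal{B}_v$, then

\[
\|\partial_3^\beta a\|_{L^{p_1}_h(L^{q_1}_v)} \lesssim 2^{\ell\bigl(\beta + \bigl(\frac{1}{q_2}-\frac{1}{q_1}\bigr)\bigr)}\,\|a\|_{L^{p_1}_h(L^{q_2}_v)}.
\]

If the support of $\hat a$ is included in $2^k\mathcal{C}_h$, then

\[
\|a\|_{L^{p_1}_h(L^{q_1}_v)} \lesssim 2^{-kN}\sup_{|\alpha|=N}\|\partial^\alpha_{x_h}a\|_{L^{p_1}_h(L^{q_1}_v)}.
\]

If the support of $\hat a$ is included in $2^\ell\mathcal{C}_v$, then

\[
\|a\|_{L^{p_1}_h(L^{q_1}_v)} \lesssim 2^{-\ell N}\,\|\partial_3^N a\|_{L^{p_1}_h(L^{q_1}_v)}.
\]

Proof. Those inequalities are classical (see for instance [23] or [11]). For the reader's convenience, we shall prove the last one in the particular case when $N = 1$. Let us consider $\tilde\varphi$ in $\mathcal{D}(\mathbb{R}\setminus\{0\})$ such that $\tilde\varphi$ has value 1 near $\mathcal{C}_v$. Then for any tempered distribution $a$ such that the support of $\hat a$ is included in $2^\ell\mathcal{C}_v$, we have

\[
\hat a = 2^{-\ell}\,i\xi_3\,\varphi_1(2^{-\ell}\xi_3)\,\hat a \quad\text{with}\quad \varphi_1(\xi_3) \stackrel{\mathrm{def}}{=} -\tilde\varphi(\xi_3)\,\frac{i\xi_3}{|\xi_3|^2}.
\]

Then, we have

\[
a = 2^{-\ell}\,\partial_3\widetilde{\Delta}^{v,3}_\ell a \quad\text{with}\quad \widetilde{\Delta}^{v,3}_\ell a \stackrel{\mathrm{def}}{=} \mathcal{F}^{-1}\bigl(\varphi_1(2^{-\ell}\xi_3)\,\hat a\bigr). \tag{2.1}
\]

This formula will be useful later on and it proves the fourth inequality of the lemma in the particular case when $N = 1$.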
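The fourth Bernstein inequality with $N=1$ can be observed numerically in the simplest setting. The sketch below is only an illustration under strong simplifying assumptions: it works on a one-dimensional periodic grid instead of $\mathbb{R}_v$, uses the $L^2$ norm only (where Plancherel gives explicit constants for the ring $3/4\le|\tau|\le 8/3$), and builds the band-limited function directly from random Fourier coefficients. All names are hypothetical.

```python
import numpy as np

def band_limited_coefficients(n, lo, hi, seed=0):
    """Random Fourier coefficients supported in the ring lo <= |xi| <= hi."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.fftfreq(n, d=1.0 / n)      # integer frequencies
    coeffs = rng.normal(size=n) + 1j * rng.normal(size=n)
    mask = (np.abs(freqs) >= lo) & (np.abs(freqs) <= hi)
    return freqs, coeffs * mask

def l2_norm(coeffs):
    """L^2 norm computed on the Fourier side (Plancherel, up to a fixed factor)."""
    return np.sqrt(np.sum(np.abs(coeffs) ** 2))

ell = 5
freqs, a_hat = band_limited_coefficients(1024, 0.75 * 2 ** ell, (8.0 / 3.0) * 2 ** ell)
da_hat = 1j * freqs * a_hat                   # Fourier symbol of the derivative

# Bernstein for a ring: ||a|| <= C 2^{-ell} ||a'||, here with C = 4/3.
ratio = l2_norm(a_hat) / (2.0 ** (-ell) * l2_norm(da_hat))
```

Since every active frequency satisfies $\frac34 2^\ell\le|\xi|\le\frac83 2^\ell$, the ratio must lie between $3/8$ and $4/3$; the general $L^p$ case of the lemma requires the kernel argument behind (2.1) instead.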


The proof of Theorem 1.2 will mainly be based on the following two propositions. The first proposition deals with the $L^2$ inner product of $a\cdot\nabla b$ and $b$ for any dyadic block on vertical frequencies, without using any vertical derivative of $a$ and $b$. Here, the divergence free condition of $a$ plays an essential role. The idea is to use integration by parts for the term $\int_{\mathbb{R}^3}S^v_{q-1}a^3\,\partial_3\Delta_q^v a\cdot\Delta_q^v a\,dx$ and to use the divergence free condition of $a = (a^h,a^3)$ in the form $\partial_3 a^3 = -\operatorname{div}_h a^h$. We remark that in the horizontal variables, we shall use classical energy estimates for the 2D Navier-Stokes equation. The second proposition presents the estimate of the anisotropic $L^2$ inner product of the horizontal derivative of the pressure with the horizontal components of the velocity field, which does not disappear in our energy estimates. There again the divergence-free condition of the velocity field as well as the algebraic expression of the equation satisfied by the pressure plays a key role.

Proposition 2.1. Let $a = (a^h,a^3)$, $b\in B(t)$ with $\operatorname{div} a = 0$. Let $g\in L^\infty(0,t)$, and denote $a_g(t,x) \stackrel{\mathrm{def}}{=} g(t)a(t,x)$ and $b_g(t,x) \stackrel{\mathrm{def}}{=} g(t)b(t,x)$. Then there holds for all $q\in\mathbb{Z}$,

\[
\Bigl|\int_0^t\bigl(\Delta_q^v(a\cdot\nabla b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'\Bigr| \lesssim d_q^2\,2^{-q}\Bigl(\|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h a_g^h\|^{\frac12}_{\widetilde{L}^2_t(B)}\|b\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h b_g\|^{\frac32}_{\widetilde{L}^2_t(B)} + \|\nabla_h a_g^h\|_{\widetilde{L}^2_t(B)}\|b\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h b_g\|_{\widetilde{L}^2_t(B)}\Bigr). \tag{2.2}
\]

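Proof. The main idea of the proof of this proposition essentially follows that of Lemma 3 of [7] and Proposition 3.3 of [11]. Noticing that the right-hand side of (2.2) does not contain the term with $\partial_3 b_g$, we distinguish the terms with the horizontal derivatives from the term with the vertical one so that

\[
I_{q,g}(t) \stackrel{\mathrm{def}}{=} \int_0^t\bigl(\Delta_q^v(a\cdot\nabla b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt' = I^h_{q,g}(t) + I^v_{q,g}(t),
\]

with

\[
I^h_{q,g}(t) \stackrel{\mathrm{def}}{=} \int_0^t\bigl(\Delta_q^v(a^h\cdot\nabla_h b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt' \quad\text{and}\quad I^v_{q,g}(t) \stackrel{\mathrm{def}}{=} \int_0^t\bigl(\Delta_q^v(a^3\partial_3 b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'.
\]

Thanks to Bony's decomposition (1.22), we have

\[
a^h\cdot\nabla_h b_g = T^v_{a^h}\nabla_h b_g + R^v(a^h,\nabla_h b_g).
\]

Whereas a simple interpolation in 2-D gives

\[
\|\sqrt{g}\,\Delta_q^v a^h\|_{L^4_t(L^4_h(L^2_v))} \lesssim \|\Delta_q^v a^h\|^{\frac12}_{L^\infty_t(L^2)}\|\nabla_h\Delta_q^v a_g^h\|^{\frac12}_{L^2_t(L^2)} \lesssim d_q\,2^{-\frac{q}{2}}\,\|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h a_g^h\|^{\frac12}_{\widetilde{L}^2_t(B)}, \tag{2.3}
\]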

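which along with Lemma 2.1 implies

\[
\|\sqrt{g}\,\Delta_q^v(R^v(a^h,\nabla_h b_g))\|_{L^{\frac43}_t(L^{\frac43}_h(L^2_v))} \lesssim \sum_{q'\ge q-4}\|\sqrt{g}\,\Delta_{q'}^v a^h\|_{L^4_t(L^4_h(L^2_v))}\|S^v_{q'+2}\nabla_h b_g\|_{L^2_t(L^2_h(L^\infty_v))} \lesssim d_q\,2^{-\frac{q}{2}}\,\|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h a_g^h\|^{\frac12}_{\widetilde{L}^2_t(B)}\|\nabla_h b_g\|_{\widetilde{L}^2_t(B)},
\]

and similarly

\[
\|\sqrt{g}\,\Delta_q^v(T^v_{a^h}\nabla_h b_g)\|_{L^{\frac43}_t(L^{\frac43}_h(L^2_v))} \lesssim \sum_{|q'-q|\le 5}\|\sqrt{g}\,S^v_{q'-1}a^h\|_{L^4_t(L^4_h(L^\infty_v))}\|\Delta_{q'}^v\nabla_h b_g\|_{L^2_t(L^2)} \lesssim d_q\,2^{-\frac{q}{2}}\,\|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h a_g^h\|^{\frac12}_{\widetilde{L}^2_t(B)}\|\nabla_h b_g\|_{\widetilde{L}^2_t(B)}.
\]

As a consequence, we obtain

\[
|I^h_{q,g}(t)| \lesssim \|\sqrt{g}\,\Delta_q^v(a^h\cdot\nabla_h b_g)\|_{L^{\frac43}_t(L^{\frac43}_h(L^2_v))}\|\sqrt{g}\,\Delta_q^v b\|_{L^4_t(L^4_h(L^2_v))} \lesssim d_q^2\,2^{-q}\,\|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h a_g^h\|^{\frac12}_{\widetilde{L}^2_t(B)}\|b\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h b_g\|^{\frac32}_{\widetilde{L}^2_t(B)}. \tag{2.4}
\]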

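On the other hand, to deal with $I^v_{q,g}(t)$, we need to use the assumption that $\operatorname{div} a = 0$ and the trick from [7,11]. Toward this, we first use Bony's decomposition for $a^3\partial_3 b_g$ in the vertical variables and then a commutator process for $\Delta_q^v(T^v_{a^3}\partial_3 b_g)$ so that

\[
\begin{aligned}
I^v_{q,g}(t) ={}& \int_0^t\bigl(S^v_{q-1}a^3\,\partial_3\Delta_q^v b_g\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt' + \sum_{|q'-q|\le 5}\int_0^t\bigl([\Delta_q^v;S^v_{q'-1}a^3]\,\partial_3\Delta_{q'}^v b_g\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'\\
&+ \sum_{|q'-q|\le 5}\int_0^t\bigl((S^v_{q'-1}a^3 - S^v_{q-1}a^3)\,\partial_3\Delta_q^v\Delta_{q'}^v b_g\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt' + \sum_{q'\ge q-4}\int_0^t\bigl(\Delta_q^v(\Delta_{q'}^v a^3\,S^v_{q'+2}\partial_3 b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'\\
\stackrel{\mathrm{def}}{=}{}& I^{1,v}_{q,g}(t) + I^{2,v}_{q,g}(t) + I^{3,v}_{q,g}(t) + I^{4,v}_{q,g}(t).
\end{aligned} \tag{2.5}
\]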

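In what follows, we shall successively estimate all the terms above. Firstly, as $\operatorname{div} a = 0$, we get by using integration by parts that

\[
I^{1,v}_{q,g}(t) = \frac12\int_0^t\int_{\mathbb{R}^3} g^2\,S^v_{q-1}\operatorname{div}_h a^h\,|\Delta_q^v b|^2\,dx\,dt',
\]

from which and (2.3), we deduce that

\[
|I^{1,v}_{q,g}(t)| \le \|S^v_{q-1}(\operatorname{div}_h a_g^h)\|_{L^2_t(L^2_h(L^\infty_v))}\|\sqrt{g}\,\Delta_q^v b\|^2_{L^4_t(L^4_h(L^2_v))} \lesssim d_q^2\,2^{-q}\,\|\operatorname{div}_h a_g^h\|_{\widetilde{L}^2_t(B)}\|b\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h b_g\|_{\widetilde{L}^2_t(B)}.
\]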


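To handle the commutator in $I^{2,v}_{q,g}(t)$, we first use Taylor's formula to get

\[
I^{2,v}_{q,g}(t) = \sum_{|q-q'|\le 5} 2^q\int_0^t\int_{\mathbb{R}^3}\int_{\mathbb{R}} h(2^q(x_3-y_3))\int_0^1 S^v_{q'-1}\partial_3 a^3(x_h,\tau y_3 + (1-\tau)x_3)\,d\tau\,\times\,(y_3-x_3)\,\partial_3\Delta_{q'}^v b_g(t',x_h,y_3)\,dy_3\ \Delta_q^v b_g(t',x)\,dx\,dt', \tag{2.6}
\]

where $h(x_3) \stackrel{\mathrm{def}}{=} \mathcal{F}^{-1}(\varphi(|\xi_3|))(x_3)$. Applying Lemma 2.1, (2.3) and Young's inequality yields

\[
|I^{2,v}_{q,g}(t)| \lesssim \sum_{|q-q'|\le 5}\|S^v_{q'-1}\partial_3 a_g^3\|_{L^2_t(L^2_h(L^\infty_v))}\|\sqrt{g}\,\Delta_{q'}^v b\|_{L^4_t(L^4_h(L^2_v))}\|\sqrt{g}\,\Delta_q^v b\|_{L^4_t(L^4_h(L^2_v))} \lesssim d_q^2\,2^{-q}\,\|\operatorname{div}_h a_g^h\|_{\widetilde{L}^2_t(B)}\|b\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h b_g\|_{\widetilde{L}^2_t(B)}.
\]

Whereas thanks to Lemma 2.1 and $\operatorname{div} a = 0$, we have

\[
\|\Delta_q^v a_g^3\|_{L^2_t(L^2_h(L^\infty_v))} \lesssim 2^{-\frac{q}{2}}\,\|\Delta_q^v\partial_3 a_g^3\|_{L^2_t(L^2)} \lesssim d_q\,2^{-q}\,\|\operatorname{div}_h a_g^h\|_{\widetilde{L}^2_t(B)}, \tag{2.7}
\]

which together with (2.3) ensures that

\[
|I^{3,v}_{q,g}(t)| \lesssim \sum_{|q-q'|\le 5} 2^q\,\|\Delta_{q'}^v a_g^3\|_{L^2_t(L^2_h(L^\infty_v))}\|\sqrt{g}\,\Delta_q^v b\|^2_{L^4_t(L^4_h(L^2_v))} \lesssim d_q^2\,2^{-q}\,\|\operatorname{div}_h a_g^h\|_{\widetilde{L}^2_t(B)}\|b\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h b_g\|_{\widetilde{L}^2_t(B)}.
\]

Finally, again thanks to Lemma 2.1 and (2.7), we obtain

\[
|I^{4,v}_{q,g}(t)| \lesssim \sum_{q'\ge q-4} 2^{q'}\,\|\Delta_{q'}^v a_g^3\|_{L^2_t(L^2)}\|\sqrt{g}\,S^v_{q'+2}b\|_{L^4_t(L^4_h(L^\infty_v))}\|\sqrt{g}\,\Delta_q^v b\|_{L^4_t(L^4_h(L^2_v))} \lesssim d_q^2\,2^{-q}\,\|\operatorname{div}_h a_g^h\|_{\widetilde{L}^2_t(B)}\|b\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h b_g\|_{\widetilde{L}^2_t(B)}.
\]

Therefore, we obtain

\[
|I^v_{q,g}(t)| \lesssim d_q^2\,2^{-q}\,\|\nabla_h a_g^h\|_{\widetilde{L}^2_t(B)}\|b\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h b_g\|_{\widetilde{L}^2_t(B)}.
\]

This along with (2.4) completes the proof of Proposition 2.1.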

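Taking $g(t) = 1$ in Proposition 2.1 immediately gives

Corollary 2.1. Under the assumptions of Proposition 2.1, we have

\[
\Bigl|\int_0^t\bigl(\Delta_q^v(a\cdot\nabla b)\,\big|\,\Delta_q^v b\bigr)_{L^2}\,dt'\Bigr| \lesssim d_q^2\,2^{-q}\Bigl(\|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h a^h\|^{\frac12}_{\widetilde{L}^2_t(B)}\|b\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h b\|^{\frac32}_{\widetilde{L}^2_t(B)} + \|\nabla_h a^h\|_{\widetilde{L}^2_t(B)}\|b\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h b\|_{\widetilde{L}^2_t(B)}\Bigr).
\]

To handle the pressure term, we need the following proposition: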


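Proposition 2.2. Let $\lambda\ge 0$, $u = (u^h,u^3)\in B(t)$ with $\operatorname{div} u = 0$. We define

\[
u_\lambda(t,x) \stackrel{\mathrm{def}}{=} g_\lambda(t)u(t,x) \quad\text{and}\quad p_\lambda \stackrel{\mathrm{def}}{=} \sum_{\ell,k=1}^{3}(-\Delta)^{-1}\partial_\ell\partial_k(u^\ell u_\lambda^k),
\]

for $f(t) = \|u^3(t)\|^2_{B}\|\nabla_h u^3(t)\|^2_{B}$ and $g_\lambda(t) \stackrel{\mathrm{def}}{=} \exp\bigl(-\lambda\int_0^t f(t')\,dt'\bigr)$. Then, one has for all $q\in\mathbb{Z}$,

\[
\Bigl|\int_0^t\bigl(\Delta_q^v\nabla_h p_\lambda\,\big|\,\Delta_q^v u_\lambda^h\bigr)_{L^2}\,dt'\Bigr| \lesssim d_q^2\,2^{-q}\Bigl(\|u^h\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_\lambda^h\|^2_{\widetilde{L}^2_t(B)} + \|u_\lambda^h\|^{\frac12}_{\widetilde{L}^2_{t,f}(B)}\|\nabla_h u_\lambda^h\|^{\frac32}_{\widetilde{L}^2_t(B)}\Bigr). \tag{2.8}
\]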

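Proof. Motivated by [15,19,26], here we again distinguish the terms with horizontal derivatives from the terms with a vertical one so that

\[
P_{q,\lambda}(t) \stackrel{\mathrm{def}}{=} \sum_{\ell,k=1}^{3}\int_0^t\bigl(\Delta_q^v(-\Delta)^{-1}\partial_\ell\partial_k(u^\ell u_\lambda^k)\,\big|\,\Delta_q^v\operatorname{div}_h u_\lambda^h\bigr)_{L^2}\,dt' = P^h_{q,\lambda}(t) + P^v_{q,\lambda}(t), \tag{2.9}
\]

where

\[
P^h_{q,\lambda}(t) \stackrel{\mathrm{def}}{=} \sum_{\ell,k=1}^{2}\int_0^t\bigl(\Delta_q^v(-\Delta)^{-1}\partial_\ell\partial_k(u^\ell u_\lambda^k)\,\big|\,\Delta_q^v\operatorname{div}_h u_\lambda^h\bigr)_{L^2}\,dt',
\]

and

\[
P^v_{q,\lambda}(t) \stackrel{\mathrm{def}}{=} \int_0^t\Bigl(\Delta_q^v(-\Delta)^{-1}\Bigl[\partial_3^2(u^3u_\lambda^3) + 2\sum_{k=1}^{2}\partial_3\partial_k(u^3u_\lambda^k)\Bigr]\,\Big|\,\Delta_q^v\operatorname{div}_h u_\lambda^h\Bigr)_{L^2}\,dt'.
\]

We start with the estimate of $P^h_{q,\lambda}(t)$. Indeed thanks to Bony's decomposition (1.22), we have

\[
\sum_{\ell,k=1}^{2}\|\Delta_q^v(u^\ell u_\lambda^k)\|_{L^2_t(L^2)} \lesssim \sum_{\ell,k=1}^{2}\Bigl(\sum_{|q'-q|\le 5}\|\sqrt{g_\lambda}\,S^v_{q'-1}u^\ell\|_{L^4_t(L^4_h(L^\infty_v))}\|\sqrt{g_\lambda}\,\Delta_{q'}^v u^k\|_{L^4_t(L^4_h(L^2_v))} + \sum_{q'\ge q-4}\|\sqrt{g_\lambda}\,\Delta_{q'}^v u^\ell\|_{L^4_t(L^4_h(L^2_v))}\|\sqrt{g_\lambda}\,S^v_{q'+2}u^k\|_{L^4_t(L^4_h(L^\infty_v))}\Bigr),
\]

from which and (2.3), we deduce that

\[
\sum_{\ell,k=1}^{2}\|\Delta_q^v(u^\ell u_\lambda^k)\|_{L^2_t(L^2)} \lesssim d_q\,2^{-\frac{q}{2}}\,\|u^h\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_\lambda^h\|_{\widetilde{L}^2_t(B)}.
\]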


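Whence we obtain

\[
|P^h_{q,\lambda}(t)| \lesssim \sum_{\ell,k=1}^{2}\|\Delta_q^v(u^\ell u_\lambda^k)\|_{L^2_t(L^2)}\|\Delta_q^v\operatorname{div}_h u_\lambda^h\|_{L^2_t(L^2)} \lesssim d_q^2\,2^{-q}\,\|u^h\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_\lambda^h\|^2_{\widetilde{L}^2_t(B)}. \tag{2.10}
\]

While as $\operatorname{div} u = 0$, we get by using Bony's decomposition that

\[
P^v_{q,\lambda}(t) = 2\sum_{k=1}^{2}\int_0^t\bigl(\Delta_q^v(-\Delta)^{-1}\partial_3(u_\lambda^k\,\partial_k u^3)\,\big|\,\Delta_q^v\operatorname{div}_h u_\lambda^h\bigr)_{L^2}\,dt' = P^{v,1}_{q,\lambda}(t) + P^{v,2}_{q,\lambda}(t), \tag{2.11}
\]

with

\[
P^{v,1}_{q,\lambda}(t) \stackrel{\mathrm{def}}{=} 2\sum_{k=1}^{2}\sum_{q'\ge q-5}\int_0^t\bigl(\Delta_q^v(-\Delta)^{-1}\partial_3(\Delta_{q'}^v u_\lambda^k\,S^v_{q'+2}\partial_k u^3)\,\big|\,\Delta_q^v\operatorname{div}_h u_\lambda^h\bigr)_{L^2}\,dt',
\]

and

\[
P^{v,2}_{q,\lambda}(t) \stackrel{\mathrm{def}}{=} 2\sum_{k=1}^{2}\sum_{|q'-q|\le 5}\int_0^t\bigl(\Delta_q^v(-\Delta)^{-1}\partial_3(S^v_{q'-1}u_\lambda^k\,\Delta_{q'}^v\partial_k u^3)\,\big|\,\Delta_q^v\operatorname{div}_h u_\lambda^h\bigr)_{L^2}\,dt'.
\]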

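Let us begin with the estimate of $P^{v,1}_{q,\lambda}(t)$. Indeed using integration by parts, one has

\[
P^{v,1}_{q,\lambda} = 2\sum_{k=1}^{2}\sum_{q'\ge q-5}\Bigl[\int_0^t\bigl(\Delta_q^v(\Delta_{q'}^v\partial_k u_\lambda^k\,S^v_{q'+2}u^3)\,\big|\,\Delta_q^v(-\Delta)^{-1}\partial_3\operatorname{div}_h u_\lambda^h\bigr)_{L^2}\,dt' + \int_0^t\bigl(\Delta_q^v(\Delta_{q'}^v u_\lambda^k\,S^v_{q'+2}u^3)\,\big|\,\Delta_q^v(-\Delta)^{-1}\partial_3\operatorname{div}_h\partial_k u_\lambda^h\bigr)_{L^2}\,dt'\Bigr] \stackrel{\mathrm{def}}{=} A_{1,\lambda}(t) + A_{2,\lambda}(t).
\]

Then it follows from

\[
\|a(D)g\|_{L^4_h(L^2_v)} \le C\|a(D)g\|^{\frac12}_{L^2}\|a(D)\nabla_h g\|^{\frac12}_{L^2} \le C\|g\|^{\frac12}_{L^2}\|\nabla_h g\|^{\frac12}_{L^2}
\]

for any homogeneous function $a(\xi)$ of degree zero, that

\[
\begin{aligned}
|A_{1,\lambda}(t)| &\lesssim \sum_{q'\ge q-5}\int_0^t\|S^v_{q'+2}u^3\|_{L^4_h(L^\infty_v)}\|\Delta_{q'}^v\nabla_h u_\lambda^h\|_{L^2}\|\Delta_q^v u_\lambda^h\|_{L^4_h(L^2_v)}\,dt'\\
&\lesssim \sum_{q'\ge q-5}\int_0^t\|S^v_{q'+2}u^3\|^{\frac12}_{L^2_h(L^\infty_v)}\|S^v_{q'+2}\nabla_h u^3\|^{\frac12}_{L^2_h(L^\infty_v)}\|\Delta_{q'}^v\nabla_h u_\lambda^h\|_{L^2}\,\|\Delta_q^v u_\lambda^h\|^{\frac12}_{L^2}\|\Delta_q^v\nabla_h u_\lambda^h\|^{\frac12}_{L^2}\,dt',
\end{aligned} \tag{2.12}
\]

applying Hölder's inequality gives

\[
|A_{1,\lambda}(t)| \lesssim \sum_{q'\ge q-5}\Bigl(\int_0^t\|u^3\|^2_{B}\|\nabla_h u^3\|^2_{B}\|\Delta_q^v u_\lambda^h\|^2_{L^2}\,dt'\Bigr)^{\frac14}\|\Delta_{q'}^v\nabla_h u_\lambda^h\|_{L^2_t(L^2)}\|\Delta_q^v\nabla_h u_\lambda^h\|^{\frac12}_{L^2_t(L^2)},
\]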


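from which and Definition 1.3, we deduce that

\[
|A_{1,\lambda}(t)| \lesssim d_q^2\,2^{-q}\,\|u_\lambda^h\|^{\frac12}_{\widetilde{L}^2_{t,f}(B)}\|\nabla_h u_\lambda^h\|^{\frac32}_{\widetilde{L}^2_t(B)}. \tag{2.13}
\]

The same estimate holds for $A_{2,\lambda}(t)$ and for $P^{v,1}_{q,\lambda}(t)$ as well. Whereas again we get by using integration by parts and $\operatorname{div} u = 0$ that

\[
P^{v,2}_{q,\lambda}(t) = -2\sum_{k=1}^{2}\sum_{|q'-q|\le 5}\int_0^t\bigl(\Delta_q^v\partial_3(S^v_{q'-1}u_\lambda^k\,\Delta_{q'}^v u^3)\,\big|\,\Delta_q^v(-\Delta)^{-1}\operatorname{div}_h\partial_k u_\lambda^h\bigr)_{L^2}\,dt' - 2\sum_{|q'-q|\le 5}\int_0^t\bigl(\Delta_q^v(S^v_{q'-1}\partial_3 u^3\,\Delta_{q'}^v u_\lambda^3)\,\big|\,\Delta_q^v(-\Delta)^{-1}\operatorname{div}_h\partial_3 u_\lambda^h\bigr)_{L^2}\,dt' \stackrel{\mathrm{def}}{=} D_{1,\lambda}(t) + D_{2,\lambda}(t).
\]

Thanks to Lemma 2.1, one has

\[
|D_{1,\lambda}(t)| \lesssim \sum_{|q'-q|\le 5}\|\sqrt{g_\lambda}\,S^v_{q'-1}u^h\|_{L^4_t(L^4_h(L^\infty_v))}\|\Delta_{q'}^v\partial_3 u_\lambda^3\|_{L^2_t(L^2)}\|\sqrt{g_\lambda}\,\Delta_q^v u^h\|_{L^4_t(L^4_h(L^2_v))},
\]

which along with (2.3) and $\operatorname{div} u = 0$ ensures that

\[
|D_{1,\lambda}(t)| \lesssim \sum_{|q'-q|\le 5} d_{q'}\,2^{-\frac{q'}{2}}\,d_q\,2^{-\frac{q}{2}}\,\|u^h\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_\lambda^h\|^2_{\widetilde{L}^2_t(B)} \lesssim d_q^2\,2^{-q}\,\|u^h\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_\lambda^h\|^2_{\widetilde{L}^2_t(B)}. \tag{2.14}
\]

And a similar argument gives

\[
\begin{aligned}
|D_{2,\lambda}(t)| &\lesssim \sum_{|q'-q|\le 5}\int_0^t\|S^v_{q'-1}u^3\|_{L^4_h(L^\infty_v)}\|\Delta_{q'}^v\partial_3 u_\lambda^3\|_{L^2}\|\Delta_q^v u_\lambda^h\|_{L^4_h(L^2_v)}\,dt'\\
&\lesssim \sum_{|q'-q|\le 5}\int_0^t\|u^3\|^{\frac12}_{B}\|\nabla_h u^3\|^{\frac12}_{B}\|\Delta_{q'}^v\nabla_h u_\lambda^h\|_{L^2}\|\Delta_q^v u_\lambda^h\|^{\frac12}_{L^2}\|\Delta_q^v\nabla_h u_\lambda^h\|^{\frac12}_{L^2}\,dt',
\end{aligned}
\]

from which and Definition 1.3, we deduce that

\[
|D_{2,\lambda}(t)| \lesssim \sum_{|q'-q|\le 5}\Bigl(\int_0^t\|u^3\|^2_{B}\|\nabla_h u^3\|^2_{B}\|\Delta_q^v u_\lambda^h\|^2_{L^2}\,dt'\Bigr)^{\frac14}\|\Delta_{q'}^v\nabla_h u_\lambda^h\|_{L^2_t(L^2)}\|\Delta_q^v\nabla_h u_\lambda^h\|^{\frac12}_{L^2_t(L^2)} \lesssim d_q^2\,2^{-q}\,\|u_\lambda^h\|^{\frac12}_{\widetilde{L}^2_{t,f}(B)}\|\nabla_h u_\lambda^h\|^{\frac32}_{\widetilde{L}^2_t(B)}.
\]

As a consequence, we obtain

\[
|P^{v,2}_{q,\lambda}(t)| \lesssim d_q^2\,2^{-q}\Bigl(\|u^h\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_\lambda^h\|^2_{\widetilde{L}^2_t(B)} + \|u_\lambda^h\|^{\frac12}_{\widetilde{L}^2_{t,f}(B)}\|\nabla_h u_\lambda^h\|^{\frac32}_{\widetilde{L}^2_t(B)}\Bigr). \tag{2.15}
\]

Summing up (2.9), (2.10), (2.13) and (2.15), we complete the proof of (2.8).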


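Taking $\lambda = 0$ in the above proposition implies that

Corollary 2.2. Under the assumptions of Proposition 2.2, we have for all $q\in\mathbb{Z}$,

\[
\Bigl|\int_0^t\bigl(\Delta_q^v\nabla_h p_0\,\big|\,\Delta_q^v u^h\bigr)_{L^2}\,dt'\Bigr| \lesssim d_q^2\,2^{-q}\Bigl(\|u^h\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h\|^2_{\widetilde{L}^2_t(B)} + \|u^3\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^3\|^{\frac12}_{\widetilde{L}^2_t(B)}\|u^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h\|^{\frac32}_{\widetilde{L}^2_t(B)}\Bigr).
\]

Proof. Indeed taking $\lambda = 0$ in (2.12) gives

\[
\begin{aligned}
|A_{1,0}(t)| &\lesssim \sum_{q'\ge q-5}\|S^v_{q'+2}u^3\|^{\frac12}_{L^\infty_t(L^2_h(L^\infty_v))}\|S^v_{q'+2}\nabla_h u^3\|^{\frac12}_{L^2_t(L^2_h(L^\infty_v))}\|\Delta_{q'}^v\nabla_h u^h\|_{L^2_t(L^2)}\,\|\Delta_q^v u^h\|^{\frac12}_{L^\infty_t(L^2)}\|\Delta_q^v\nabla_h u^h\|^{\frac12}_{L^2_t(L^2)}\\
&\lesssim d_q^2\,2^{-q}\,\|u^3\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^3\|^{\frac12}_{\widetilde{L}^2_t(B)}\|u^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h\|^{\frac32}_{\widetilde{L}^2_t(B)}.
\end{aligned} \tag{2.16}
\]

The same estimate also holds for $A_{2,0}(t)$. On the other hand, notice that

\[
|D_{2,0}(t)| \lesssim \sum_{|q'-q|\le 5}\|S^v_{q'-1}u^3\|_{L^4_t(L^4_h(L^\infty_v))}\|\Delta_{q'}^v\partial_3 u^3\|_{L^2_t(L^2)}\|\Delta_q^v u^h\|_{L^4_t(L^4_h(L^2_v))},
\]

which along with the proof of (2.16) ensures that

\[
|D_{2,0}(t)| \lesssim d_q^2\,2^{-q}\,\|u^3\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^3\|^{\frac12}_{\widetilde{L}^2_t(B)}\|u^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h\|^{\frac32}_{\widetilde{L}^2_t(B)}.
\]

This together with (2.10), (2.14) for $\lambda = 0$ and (2.16) completes the proof of the corollary.

Now we are in a position to complete the proof of Theorem 1.2.

Proof of Theorem 1.2. We shall use the classical Friedrichs' regularization method to construct the approximate solutions to $(ANS_\nu)$. For simplicity, we just outline it here (for the details in this context, see [23] or [6]). In order to do so, let us first define the sequence of projection operators $(P_n)_{n\in\mathbb{N}}$ by

\[
P_n a \stackrel{\mathrm{def}}{=} \mathcal{F}^{-1}\bigl(\mathbf{1}_{B(0,n)}\,\hat a\bigr) \tag{2.17}
\]

and we define $(u_n^h,u_n^3)$ via

\[
\begin{cases}
\partial_t u_n^h - \nu\Delta_h u_n^h + P_n(u_n\cdot\nabla u_n^h) + \sum_{\ell,k=1}^{3}P_n\nabla_h(-\Delta)^{-1}\partial_\ell\partial_k(u_n^\ell u_n^k) = 0,\\
\partial_t u_n^3 - \nu\Delta_h u_n^3 + P_n(u_n\cdot\nabla u_n^3) + \sum_{\ell,k=1}^{3}P_n\partial_3(-\Delta)^{-1}\partial_\ell\partial_k(u_n^\ell u_n^k) = 0,\\
\operatorname{div}_h u_n^h + \partial_3 u_n^3 = 0,\\
(u_n^h,u_n^3)|_{t=0} = (P_n u_0^h,P_n u_0^3),
\end{cases} \tag{2.18}
\]

where $(-\Delta)^{-1}\partial_j\partial_k$ is defined precisely by

\[
(-\Delta)^{-1}\partial_j\partial_k a \stackrel{\mathrm{def}}{=} -\mathcal{F}^{-1}\bigl(|\xi|^{-2}\xi_j\xi_k\,\hat a\bigr).
\]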
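The spectral cut-off $P_n$ of (2.17) is straightforward to mimic on a discrete grid. The sketch below is an illustration only, under assumptions that differ from the paper's setting: it uses a two-dimensional periodic grid and the discrete Fourier transform instead of $\mathbb{R}^3$, and the function name `friedrichs_projection` is hypothetical. It checks the two properties that make (2.18) an ordinary differential equation on a fixed space: $P_n$ is a projection and it does not increase the $L^2$ norm.

```python
import numpy as np

def friedrichs_projection(a, n):
    """Keep only the Fourier modes with |xi| <= n (discrete analogue of P_n)."""
    N = a.shape[0]
    k = np.fft.fftfreq(N, d=1.0 / N)            # integer frequencies
    kx, ky = np.meshgrid(k, k, indexing="ij")
    mask = kx ** 2 + ky ** 2 <= n ** 2           # indicator of the ball B(0, n)
    return np.real(np.fft.ifft2(mask * np.fft.fft2(a)))

rng = np.random.default_rng(1)
a = rng.normal(size=(64, 64))                    # arbitrary real test field
Pa = friedrichs_projection(a, 10)
PPa = friedrichs_projection(Pa, 10)
norm_a = np.linalg.norm(a)
norm_Pa = np.linalg.norm(Pa)
```

Because the mask takes only the values 0 and 1 and is symmetric under $\xi\mapsto-\xi$ (for $n$ below the Nyquist frequency), the output is real, applying the cut-off twice changes nothing, and Parseval's identity gives the norm bound.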


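Because of properties of $L^2$ and $L^1$ functions the Fourier transform of which are supported in the ball $B(0,n)$, the system (2.18) appears to be an ordinary differential equation in the space

\[
L^2_n \stackrel{\mathrm{def}}{=} \bigl\{a\in L^2(\mathbb{R}^3):\ \mathrm{Supp}\,\hat a\subset B(0,n)\bigr\}. \tag{2.19}
\]

This ordinary differential equation is globally wellposed because

\[
\|u_n(t)\|^2_{L^2} + 2\nu\int_0^t\|\nabla_h u_n(t')\|^2_{L^2}\,dt' = \|P_n u_0\|^2_{L^2} \le \|u_0\|^2_{L^2}. \tag{2.20}
\]

We refer to [6] and [23] for the details.

Next let us turn to the uniform estimates for the approximate solution sequence $(u_n)_{n\in\mathbb{N}}$ thus obtained. To overcome the difficulty that we cannot use the Gronwall inequality in the framework of Chemin-Lerner type spaces $\widetilde{L}^p_t(B)$, for any $\lambda>0$, we define:

\[
u_{n,\lambda}(t,x) \stackrel{\mathrm{def}}{=} e^{-\lambda\int_0^t f_n(t')\,dt'}u_n(t,x), \quad\text{with}\quad f_n(t) \stackrel{\mathrm{def}}{=} \|u_n^3(t)\|^2_{B}\|\nabla_h u_n^3(t)\|^2_{B}. \tag{2.21}
\]

Then $u^h_{n,\lambda}$ solves

\[
\begin{cases}
\partial_t u^h_{n,\lambda} + \lambda f_n(t)u^h_{n,\lambda} - \nu\Delta_h u^h_{n,\lambda} + P_n(u_n\cdot\nabla u^h_{n,\lambda}) = -\sum_{i,j=1}^{3}P_n\nabla_h(-\Delta)^{-1}\partial_i\partial_j(u_n^i u^j_{n,\lambda}),\\
\operatorname{div} u_{n,\lambda} = 0,\\
u^h_{n,\lambda}|_{t=0} = P_n u_0^h.
\end{cases} \tag{2.22}
\]

The main idea is now to use the weighted Chemin-Lerner spaces $\widetilde{L}^2_{t,f}(B)$ introduced in Definition 1.3, which allow us to avoid a Gronwall-type lemma in the energy estimates. This kind of space will also be useful in the proof of Theorem 1.3 in Sect. 3. We first apply $\Delta_q^v$ to (2.22) and then take the $L^2$ inner product of the resulting equation with $\Delta_q^v u^h_{n,\lambda}$ to get

\[
\frac12\frac{d}{dt}\|\Delta_q^v u^h_{n,\lambda}(t)\|^2_{L^2} + \lambda f_n(t)\|\Delta_q^v u^h_{n,\lambda}(t)\|^2_{L^2} + \nu\|\Delta_q^v\nabla_h u^h_{n,\lambda}(t)\|^2_{L^2} = -\bigl(\Delta_q^v(u_n\cdot\nabla u^h_{n,\lambda})\,\big|\,\Delta_q^v u^h_{n,\lambda}\bigr) - \sum_{i,j=1}^{3}\bigl(\nabla_h(-\Delta)^{-1}\partial_i\partial_j\Delta_q^v(u_n^i u^j_{n,\lambda})\,\big|\,\Delta_q^v u^h_{n,\lambda}\bigr).
\]

Applying Proposition 2.1 and Proposition 2.2, we obtain

\[
\|\Delta_q^v u^h_{n,\lambda}(t)\|^2_{L^2} + 2\lambda\int_0^t f_n(t')\|\Delta_q^v u^h_{n,\lambda}(t')\|^2_{L^2}\,dt' + 2\nu\int_0^t\|\Delta_q^v\nabla_h u^h_{n,\lambda}(t')\|^2_{L^2}\,dt' \le \|\Delta_q^v u_0^h\|^2_{L^2} + C d_q^2\,2^{-q}\Bigl(\|u_n^h\|_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h_{n,\lambda}\|^2_{\widetilde{L}^2_t(B)} + \|u^h_{n,\lambda}\|^{\frac12}_{\widetilde{L}^2_{t,f_n}(B)}\|\nabla_h u^h_{n,\lambda}\|^{\frac32}_{\widetilde{L}^2_t(B)}\Bigr),
\]

which along with (1.7) and Definition 1.3 ensures

\[
\|u^h_{n,\lambda}\|_{\widetilde{L}^\infty_t(B)} + \sqrt{2\lambda}\,\|u^h_{n,\lambda}\|_{\widetilde{L}^2_{t,f_n}(B)} + \sqrt{2\nu}\,\|\nabla_h u^h_{n,\lambda}\|_{\widetilde{L}^2_t(B)} \le \|u_0^h\|_{B} + C\Bigl(\|u_n^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h_{n,\lambda}\|_{\widetilde{L}^2_t(B)} + \|u^h_{n,\lambda}\|^{\frac14}_{\widetilde{L}^2_{t,f_n}(B)}\|\nabla_h u^h_{n,\lambda}\|^{\frac34}_{\widetilde{L}^2_t(B)}\Bigr).
\]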


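Note that Young's inequality ensures that

\[
C\|u^h_{n,\lambda}\|^{\frac14}_{\widetilde{L}^2_{t,f_n}(B)}\|\nabla_h u^h_{n,\lambda}\|^{\frac34}_{\widetilde{L}^2_t(B)} \le C_0\,\nu^{-\frac32}\|u^h_{n,\lambda}\|_{\widetilde{L}^2_{t,f_n}(B)} + (\sqrt2-1)\sqrt{\nu}\,\|\nabla_h u^h_{n,\lambda}\|_{\widetilde{L}^2_t(B)},
\]

whence we obtain

\[
\|u^h_{n,\lambda}\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\lambda}\,\|u^h_{n,\lambda}\|_{\widetilde{L}^2_{t,f_n}(B)} + \sqrt{\nu}\,\|\nabla_h u^h_{n,\lambda}\|_{\widetilde{L}^2_t(B)} \le \|u_0^h\|_{B} + C_0\Bigl(\|u_n^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u^h_{n,\lambda}\|_{\widetilde{L}^2_t(B)} + \nu^{-\frac32}\|u^h_{n,\lambda}\|_{\widetilde{L}^2_{t,f_n}(B)}\Bigr). \tag{2.23}
\]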
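The Young-type step above, $C x^{1/4}y^{3/4}\le C_0\nu^{-3/2}x + (\sqrt2-1)\sqrt\nu\,y$, can be made explicit through the weighted arithmetic-geometric mean inequality $s^{1/4}r^{3/4}\le\frac14 s+\frac34 r$. The sketch below computes one admissible value of $C_0$ and tests the inequality on random inputs. The formula $C_0 = 27C^4/(256(\sqrt2-1)^3)$ is an assumption derived here for illustration; the paper does not specify its constant, and the function names are hypothetical.

```python
import math
import random

DELTA0 = math.sqrt(2.0) - 1.0

def young_constant(C):
    """One admissible C0: apply s^(1/4) r^(3/4) <= s/4 + 3r/4 with
    r = (4/3) * DELTA0 * sqrt(nu) * y and s chosen to match the left side."""
    return 27.0 * C ** 4 / (256.0 * DELTA0 ** 3)

def young_holds(C, nu, x, y):
    """Check C x^(1/4) y^(3/4) <= C0 nu^(-3/2) x + (sqrt(2)-1) sqrt(nu) y."""
    lhs = C * x ** 0.25 * y ** 0.75
    rhs = young_constant(C) * nu ** (-1.5) * x + DELTA0 * math.sqrt(nu) * y
    return lhs <= rhs * (1.0 + 1e-12)   # small slack for rounding at equality

random.seed(0)
samples = [(random.uniform(0.1, 5.0),    # C
            random.uniform(0.01, 2.0),   # nu
            random.uniform(0.0, 10.0),   # x, stands for the weighted norm
            random.uniform(0.0, 10.0))   # y, stands for the gradient norm
           for _ in range(1000)]
all_hold = all(young_holds(*s) for s in samples)
```

The coefficient $(\sqrt2-1)\sqrt\nu$ is chosen so that, after subtraction from $\sqrt{2\nu}$ on the left-hand side, exactly $\sqrt\nu$ remains, which is how (2.23) is reached.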

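Let $\varepsilon_0$ be a small positive constant, which will be determined later on. We denote

\[
T_n^* \stackrel{\mathrm{def}}{=} \max\Bigl\{t:\ \|u_n^h\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u_n^h\|_{\widetilde{L}^2_t(B)} \le \min\Bigl(\frac{1}{4C_0^2},\varepsilon_0\Bigr)\nu\Bigr\}. \tag{2.24}
\]

Then taking $\lambda = \frac{4C_0^2}{\nu^3}$ in (2.23), we obtain

\[
\|u^h_{n,\lambda}\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u^h_{n,\lambda}\|_{\widetilde{L}^2_t(B)} \le 2\|u_0^h\|_{B} \quad\text{for } t<T_n^*. \tag{2.25}
\]

While thanks to (2.21), we have

\[
e^{-\lambda\int_0^t f_n(t')\,dt'}\Bigl(\|u_n^h\|_{\widetilde{L}^\infty_t(B)} + \|\nabla_h u_n^h\|_{\widetilde{L}^2_t(B)}\Bigr) \le \|u^h_{n,\lambda}\|_{\widetilde{L}^\infty_t(B)} + \|\nabla_h u^h_{n,\lambda}\|_{\widetilde{L}^2_t(B)},
\]

which together with (2.25) implies that

\[
\|u_n^h\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u_n^h\|_{\widetilde{L}^2_t(B)} \le 2\|u_0^h\|_{B}\exp\Bigl(\frac{4C_0^2}{\nu^3}\int_0^t\|u_n^3(t')\|^2_{B}\|\nabla_h u_n^3(t')\|^2_{B}\,dt'\Bigr). \tag{2.26}
\]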

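On the other hand, we get by applying $\Delta_q^v$ to the vertical equation in (2.18) and then taking the $L^2$ inner product of the resulting equation with $\Delta_q^v u_n^3$ that

\[
\frac12\|\Delta_q^v u_n^3(t)\|^2_{L^2} + \nu\int_0^t\|\Delta_q^v\nabla_h u_n^3(t')\|^2_{L^2}\,dt' - \frac12\|\Delta_q^v P_n u_0^3\|^2_{L^2} = -\int_0^t\bigl(\Delta_q^v(u_n\cdot\nabla u_n^3)\,\big|\,\Delta_q^v u_n^3\bigr)_{L^2}\,dt' - \sum_{\ell,k=1}^{3}\int_0^t\bigl(\partial_3(-\Delta)^{-1}\partial_\ell\partial_k\Delta_q^v(u_n^\ell u_n^k)\,\big|\,\Delta_q^v u_n^3\bigr)_{L^2}\,dt'.
\]

Notice that $\operatorname{div} u_n = 0$, and one gets by using integration by parts

\[
\sum_{\ell,k=1}^{3}\int_0^t\bigl(\partial_3(-\Delta)^{-1}\partial_\ell\partial_k\Delta_q^v(u_n^\ell u_n^k)\,\big|\,\Delta_q^v u_n^3\bigr)_{L^2}\,dt' = -\sum_{\ell,k=1}^{3}\int_0^t\bigl(\nabla_h(-\Delta)^{-1}\partial_\ell\partial_k\Delta_q^v(u_n^\ell u_n^k)\,\big|\,\Delta_q^v u_n^h\bigr)_{L^2}\,dt',
\]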

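which along with Corollary 2.1 and Corollary 2.2 gives

\[
\|u_n^3\|_{\widetilde{L}^\infty_t(B)} + \sqrt{2\nu}\,\|\nabla_h u_n^3\|_{\widetilde{L}^2_t(B)} \le \|u_0^3\|_{B} + C\Bigl(\|u_n^h\|^{\frac12}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_n^h\|_{\widetilde{L}^2_t(B)} + \|u_n^3\|^{\frac14}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_n^3\|^{\frac14}_{\widetilde{L}^2_t(B)}\|u_n^h\|^{\frac14}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_n^h\|^{\frac34}_{\widetilde{L}^2_t(B)}\Bigr).
\]

Then thanks to (2.24), we obtain

\[
\|u_n^3\|_{\widetilde{L}^\infty_t(B)} + \sqrt{2\nu}\,\|\nabla_h u_n^3\|_{\widetilde{L}^2_t(B)} \le \|u_0^3\|_{B} + C\Bigl(\varepsilon_0^{\frac32}\nu + \varepsilon_0\,\nu^{\frac58}\,\|u_n^3\|^{\frac14}_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_n^3\|^{\frac14}_{\widetilde{L}^2_t(B)}\Bigr)
\]

for $t<T_n^*$. Taking $\varepsilon_0 = \varepsilon_0(C)$ small enough, we obtain

\[
\|u_n^3\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u_n^3\|_{\widetilde{L}^2_t(B)} \le 2\|u_0^3\|_{B} + \nu \quad\text{for } t<T_n^*. \tag{2.27}
\]

Now we claim that $T_n^* = \infty$ provided that $c_0$ in (1.11) is sufficiently small. Indeed if $T_n^*<\infty$, it follows from (2.27) that

\[
\int_0^t\|u_n^3(t')\|^2_{B}\|\nabla_h u_n^3(t')\|^2_{B}\,dt' \le \|u_n^3\|^2_{\widetilde{L}^\infty_t(B)}\|\nabla_h u_n^3\|^2_{\widetilde{L}^2_t(B)} \le \frac{16}{\nu}\bigl(16\|u_0^3\|^4_{B} + \nu^4\bigr).
\]

Substituting the above inequality into (2.26) shows that

\[
\|u_n^h\|_{\widetilde{L}^\infty_t(B)} + \sqrt{\nu}\,\|\nabla_h u_n^h\|_{\widetilde{L}^2_t(B)} \le 2\exp(64C_0^2)\,\|u_0^h\|_{B}\exp\Bigl(\frac{1024C_0^2}{\nu^4}\|u_0^3\|^4_{B}\Bigr) \le \frac12\min\Bigl(\frac{1}{4C_0^2},\varepsilon_0\Bigr)\nu, \tag{2.28}
\]

provided that we take $L = 1024C_0^2$ and $c_0\le\frac14\exp(-64C_0^2)\min\bigl(\frac{1}{4C_0^2},\varepsilon_0\bigr)$ in (1.11).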

This contradicts the definition of $T_n^*$ in (2.24), and therefore $T_n^* = +\infty$. With (2.27) and (2.28) being obtained for $T_n^* = \infty$, one can prove the existence part of Theorem 1.2 via a standard compactness argument. And the uniqueness of the solution to $(ANS_\nu)$ in $B(T)$ has already been proved in [23]. One may check [23] or [6] for the details. This completes the proof of the theorem.

3. The Proof of Theorem 1.3

The goal of this section is to prove Theorem 1.3. As a convention in what follows, we shall always assume that $u_0\in B_4^{-\frac12,\frac12}$; then according to Definition 1.2, we split $u_0$ as $u_{hh} + u_{\ell h}$ with

\[
u_{hh} \stackrel{\mathrm{def}}{=} \sum_{k\ge\ell-1}\Delta_k^h\Delta_\ell^v u_0, \qquad u_{\ell h} \stackrel{\mathrm{def}}{=} \sum_{j\in\mathbb{Z}}S^h_{j-1}\Delta_j^v u_0. \tag{3.1}
\]

Correspondingly, we shall seek the solution of $(ANS_\nu)$ of the form $u = u_F + w$ given by (1.15). We notice that the low horizontal frequencies part $u_{\ell h}$ belongs to $B$, so that we can use part of the techniques used in the previous section to build the solution of (1.17) for $w$. For $u_F$, we shall use the fact that any vertical derivative of $u_{hh}$ can be


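controlled by its horizontal derivative. Finally, we have to use once again the weighted Chemin-Lerner spaces given by Definition 1.3 in order to avoid using the Gronwall type lemma in the energy estimates. The following lemma from [11] will be very useful in this section.

Lemma 3.1. Let $u_F$ be given by (1.16). Then there hold

(1)

\[
\|\Delta_k^h\Delta_\ell^v u_F\|_{L^p_T(L^4_h(L^2_v))} \lesssim
\begin{cases}
\dfrac{d_{k,\ell}}{\nu^{\frac1p}}\,2^{k\bigl(\frac12-\frac2p\bigr)}\,2^{-\frac{\ell}{2}}\,\|u_0\|_{B_4^{-\frac12,\frac12}}, & \text{for } k\ge\ell-1,\\[1ex]
0, & \text{otherwise},
\end{cases} \tag{3.2}
\]

for any $1\le p\le\infty$;

(2) For any $(p,q)$ in $[1,\infty]\times[4,\infty]$, we have

\[
\|\Delta_k^h u_F\|_{L^p(\mathbb{R}^+;L^q_h(L^\infty_v))} \lesssim \frac{c_k}{\nu^{\frac1p}}\,2^{-k\bigl(\frac2p+\frac2q-1\bigr)}\,\|u_0\|_{B_4^{-\frac12,\frac12}}. \tag{3.3}
\]

If in addition $\frac1p+\frac1q>\frac12$, we have

\[
\|\Delta_j^v u_F\|_{L^p(\mathbb{R}^+;L^q_h(L^2_v))} \lesssim \frac{d_j}{\nu^{\frac1p}}\,2^{-j\bigl(\frac2p+\frac2q-\frac12\bigr)}\,\|u_0\|_{B_4^{-\frac12,\frac12}}; \tag{3.4}
\]

(3)

\[
\|\Delta_j^v u_F\|_{L^2(\mathbb{R}^+;L^\infty_h(L^2_v))} \lesssim \frac{d_j}{\sqrt{\nu}}\,2^{-\frac{j}{2}}\,\|u_0\|_{B_4^{-\frac12,\frac12}} \quad\text{and}\quad \|u_F\|_{L^2(\mathbb{R}^+;L^\infty(\mathbb{R}^3))} \lesssim \frac{1}{\sqrt{\nu}}\,\|u_0\|_{B_4^{-\frac12,\frac12}}. \tag{3.5}
\]

Outline of the proof. Indeed thanks to Lemma 2.1 of [5], there exists a positive constant $c$ such that

\[
\|\Delta_k^h\Delta_\ell^v u_F(t)\|_{L^4_h(L^2_v)} \lesssim e^{-c\nu t2^{2k}}\,\|\Delta_k^h\Delta_\ell^v u_{hh}\|_{L^4_h(L^2_v)} \lesssim e^{-c\nu t2^{2k}}\,d_{k,\ell}\,2^{\frac{k}{2}}\,2^{-\frac{\ell}{2}}\,\|u_0\|_{B_4^{-\frac12,\frac12}}, \tag{3.6}
\]

from which and Lemma 2.1, we deduce (3.2–3.4).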


To prove (3.5), we write
$$
\|\Delta_j^v u_F\|^2_{L^2(\mathbb{R}^+;L^\infty_h(L^2_v))} = \|(\Delta_j^v u_F)^2\|_{L^1(\mathbb{R}^+;L^\infty_h(L^1_v))},
$$
and using Bony's paradifferential decomposition (1.22) in the horizontal variables, one has
$$
(\Delta_j^v u_F)^2 = \sum_{k\in\mathbb{Z}} S_{k-1}^h\Delta_j^v u_F\,\Delta_k^h\Delta_j^v u_F + \sum_{k\in\mathbb{Z}} \Delta_k^h\Delta_j^v u_F\, S_{k+2}^h\Delta_j^v u_F,
$$
which along with (3.2) gives (3.5). One may check Lemma 2.5, Corollary 2.3 and Lemma 2.6 of [11] for the detailed proof.

In what follows, the treatment will differ for terms involving the horizontal derivatives and for terms involving the vertical derivative. For terms involving horizontal derivatives, the following Lemma 3.2 and Corollary 3.1 will be very useful.

Lemma 3.2. Let $a \in \mathcal{B}_4^{-\frac12,\frac12}(t)$ and $b \in \mathcal{B}(t)$. Let $0 \le g \in L^\infty(0,t)$; we denote $a_g(t,x) \stackrel{\rm def}{=} g(t)a(t,x)$. Then there holds for all $q \in \mathbb{Z}$,
$$
\|\sqrt{g}\,\Delta_q^v(a\cdot\nabla_h b_g)\|_{L_t^{\frac43}(L_h^{\frac43}(L_v^2))} \lesssim d_q\, 2^{-\frac q2}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}.
$$

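Proof. We first get by using Bony's decomposition (1.22) in the vertical variable that
$$
\Delta_q^v(a\cdot\nabla_h b_g) = \sum_{|q'-q|\le5} \Delta_q^v\bigl(S_{q'-1}^v a\,\Delta_{q'}^v\nabla_h b_g\bigr) + \sum_{q'\ge q-4} \Delta_q^v\bigl(\Delta_{q'}^v a\, S_{q'+2}^v\nabla_h b_g\bigr). \tag{3.7}
$$
According to Definition 1.2, we can split $a$ into a part where the horizontal frequencies are greater than the vertical ones and a part where the horizontal frequencies are smaller than the vertical ones; more precisely,
$$
a = a_h + a_l \quad\text{with}\quad a_h \stackrel{\rm def}{=} \sum_{k\ge\ell-1} \Delta_k^h\Delta_\ell^v a \quad\text{and}\quad a_l \stackrel{\rm def}{=} \sum_{j\in\mathbb{Z}} S_{j-1}^h\Delta_j^v a. \tag{3.8}
$$
Notice that
$$
\|\sqrt{g}\,\Delta_q^v a_h\|^2_{L^4_t(L^4_h(L^2_v))} = \|g(\Delta_q^v a_h)^2\|_{L^2_t(L^2_h(L^1_v))},
$$
and using Bony's decomposition in the horizontal variables gives
$$
(\Delta_q^v a_h)^2 = \sum_{k\in\mathbb{Z}} S_{k-1}^h\Delta_q^v a_h\,\Delta_k^h\Delta_q^v a_h + \sum_{k\in\mathbb{Z}} S_{k+2}^h\Delta_q^v a_h\,\Delta_k^h\Delta_q^v a_h.
$$
However it follows from Lemma 2.1 and Definition 1.2 that
$$
\begin{aligned}
\Bigl\|g\sum_{k\in\mathbb{Z}} S_{k+2}^h\Delta_q^v a_h\,\Delta_k^h\Delta_q^v a_h\Bigr\|_{L^2_t(L^2_h(L^1_v))}
&\lesssim \sum_{k\in\mathbb{Z}} \|S_{k+2}^h\Delta_q^v a_h\|_{L^\infty_t(L^4_h(L^2_v))}\, \|\Delta_k^h\Delta_q^v(g a_h)\|_{L^2_t(L^4_h(L^2_v))}\\
&\lesssim \sum_{k\in\mathbb{Z}} d_{k,q}^2\, 2^{-q}\, \|a\|_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\\
&\lesssim d_q^2\, 2^{-q}\, \|a\|_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})},
\end{aligned}
$$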


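which gives
$$
\|\sqrt{g}\,\Delta_q^v a_h\|_{L^4_t(L^4_h(L^2_v))} \lesssim d_q\, 2^{-\frac q2}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}. \tag{3.9}
$$
On the other hand, notice that
$$
\Delta_q^v a_l = \sum_{|j-q|\le1} \Delta_q^v\bigl(S_{j-1}^h\Delta_j^v a\bigr),
$$
which along with the simple interpolation in 2-D gives
$$
\begin{aligned}
\|\sqrt{g}\,\Delta_q^v a_l\|_{L^4_t(L^4_h(L^2_v))}
&\lesssim \|\Delta_q^v a_l\|^{\frac12}_{L^\infty_t(L^2)}\, \|g\,\Delta_q^v\nabla_h a_l\|^{\frac12}_{L^2_t(L^2)}\\
&\lesssim d_q\, 2^{-\frac q2}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}.
\end{aligned} \tag{3.10}
$$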
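The "simple interpolation in 2-D" invoked for (3.10) can be unpacked as follows. This is a sketch under the assumption that it refers to the classical Ladyzhenskaya inequality in the horizontal variables; $f$ is a generic function standing for $\Delta_q^v a_l$, introduced here only for illustration.

```latex
% Step 1 (Minkowski, since 4 \ge 2):
\[
\|f\|_{L^4_h(L^2_v)} \le \|f\|_{L^2_v(L^4_h)} .
\]
% Step 2 (Ladyzhenskaya in x_h for each fixed x_3, then Cauchy-Schwarz in x_3):
\[
\|f\|^2_{L^2_v(L^4_h)}
 \lesssim \int_{\mathbb{R}} \|f(\cdot,x_3)\|_{L^2_h}\,\|\nabla_h f(\cdot,x_3)\|_{L^2_h}\,dx_3
 \le \|f\|_{L^2}\,\|\nabla_h f\|_{L^2} .
\]
% Step 3 (insert the weight and integrate in time):
\[
\int_0^t g^2\,\|f\|^4_{L^4_h(L^2_v)}\,dt'
 \lesssim \int_0^t \|f\|^2_{L^2}\,\|g\nabla_h f\|^2_{L^2}\,dt'
 \le \|f\|^2_{L^\infty_t(L^2)}\,\|g\nabla_h f\|^2_{L^2_t(L^2)} ,
\]
% and taking the fourth root gives
\[
\|\sqrt{g}\,f\|_{L^4_t(L^4_h(L^2_v))}
 \lesssim \|f\|^{\frac12}_{L^\infty_t(L^2)}\,\|g\nabla_h f\|^{\frac12}_{L^2_t(L^2)} .
\]
```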

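Thanks to Lemma 2.1 and (3.9), (3.10), we obtain
$$
\|\sqrt{g}\,S_{q'-1}^v a\|_{L^4_t(L^4_h(L^\infty_v))} \lesssim \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}, \tag{3.11}
$$
from which we deduce that
$$
\begin{aligned}
\Bigl\|\sqrt{g}\sum_{|q'-q|\le5} \Delta_q^v\bigl(S_{q'-1}^v a\,\Delta_{q'}^v\nabla_h b_g\bigr)\Bigr\|_{L_t^{\frac43}(L_h^{\frac43}(L_v^2))}
&\lesssim \sum_{|q'-q|\le5} \|\sqrt{g}\,S_{q'-1}^v a\|_{L^4_t(L^4_h(L^\infty_v))}\, \|\Delta_{q'}^v\nabla_h b_g\|_{L^2_t(L^2)}\\
&\lesssim \sum_{|q'-q|\le5} d_{q'}\, 2^{-\frac{q'}{2}}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}\\
&\lesssim d_q\, 2^{-\frac q2}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}.
\end{aligned} \tag{3.12}
$$
A similar argument shows that
$$
\begin{aligned}
\Bigl\|\sqrt{g}\sum_{q'\ge q-4} \Delta_q^v\bigl(\Delta_{q'}^v a\, S_{q'+2}^v\nabla_h b_g\bigr)\Bigr\|_{L_t^{\frac43}(L_h^{\frac43}(L_v^2))}
&\lesssim \sum_{q'\ge q-4} \|\sqrt{g}\,\Delta_{q'}^v a\|_{L^4_t(L^4_h(L^2_v))}\, \|S_{q'+2}^v\nabla_h b_g\|_{L^2_t(L^2_h(L^\infty_v))}\\
&\lesssim \sum_{q'\ge q-4} d_{q'}\, 2^{-\frac{q'}{2}}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}\\
&\lesssim d_q\, 2^{-\frac q2}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}.
\end{aligned}
$$
This along with (3.7) and (3.12) concludes the proof of the lemma.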


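An immediate corollary implied by the proof of the above lemma is:

Corollary 3.1. (1) Under the assumptions of Lemma 3.1, we have
$$
\|\Delta_q^v(a b_g)\|_{L^2_t(L^2)} \lesssim d_q\, 2^{-\frac q2}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}.
$$
(2) Let $a, b$ be in $\mathcal{B}_4^{-\frac12,\frac12}(t)$. We have
$$
\|\Delta_q^v(a b_g)\|_{L^2_t(L^2)} \lesssim d_q\, 2^{-\frac q2}\, \|a\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h b_g\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}.
$$

Proof. Indeed thanks to (3.7), one has
$$
\begin{aligned}
\|\Delta_q^v(a b_g)\|_{L^2_t(L^2)} \lesssim{}& \sum_{|q'-q|\le5} \|\sqrt{g}\,S_{q'-1}^v a\|_{L^4_t(L^4_h(L^\infty_v))}\, \|\sqrt{g}\,\Delta_{q'}^v b\|_{L^4_t(L^4_h(L^2_v))}\\
&+ \sum_{q'\ge q-4} \|\sqrt{g}\,\Delta_{q'}^v a\|_{L^4_t(L^4_h(L^2_v))}\, \|\sqrt{g}\,S_{q'+2}^v b\|_{L^4_t(L^4_h(L^\infty_v))},
\end{aligned}
$$
which along with (2.3) and (3.9–3.11) gives the corollary.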

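The following propositions are the key ingredients in the proof of Theorem 1.3.

Proposition 3.1. Let $u_F$ be given by (1.16). Let $a \in \mathcal{B}(t)$ and $0 \le m \in L^\infty(0,t)$. We denote
$$
g_\lambda(t) \stackrel{\rm def}{=} \exp\Bigl(-\lambda\|u^3_{hh}\|^4_{\mathcal{B}_4^{-\frac12,\frac12}} - m(t)\Bigr), \quad
u_{F,\lambda}(t,x) \stackrel{\rm def}{=} g_\lambda(t)u_F(t,x), \quad
a_\lambda(t,x) \stackrel{\rm def}{=} g_\lambda(t)a(t,x), \tag{3.13}
$$
and
$$
I^h_{\lambda,q}(t) \stackrel{\rm def}{=} \int_0^t \bigl(\Delta_q^v(u_F\cdot\nabla u^h_{F,\lambda})\,\big|\,\Delta_q^v a_\lambda\bigr)_{L^2}\,dt'
\quad\text{and}\quad
I^v_q(t) \stackrel{\rm def}{=} \int_0^t \bigl(\Delta_q^v(u_F\cdot\nabla u^3_F)\,\big|\,\Delta_q^v a\bigr)_{L^2}\,dt'.
$$
Then we have for all $q \in \mathbb{Z}$,
$$
|I^h_{\lambda,q}(t)| \lesssim d_q^2\, 2^{-q} \Bigl( \frac{1}{\sqrt{\nu}}\, \|u^h_{hh}\|^2_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})} + \frac{1}{\lambda^{\frac14}\nu}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a_\lambda\|_{\widetilde{L}^\infty_t(\mathcal{B})} \Bigr), \tag{3.14}
$$
$$
|I^v_q(t)| \lesssim d_q^2\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}} \Bigl( \frac{1}{\sqrt{\nu}}\, \|\nabla_h a\|_{\widetilde{L}^2_t(\mathcal{B})} + \frac{1}{\nu}\, \|a\|_{\widetilde{L}^\infty_t(\mathcal{B})} \Bigr). \tag{3.15}
$$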


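Proof. As $\operatorname{div} u_F = 0$, we first get by using integration by parts that
$$
\begin{aligned}
I^h_{\lambda,q}(t) &= -\int_0^t \bigl(\Delta_q^v(u^h_F\otimes u^h_{F,\lambda})\,\big|\,\nabla_h\Delta_q^v a_\lambda\bigr)_{L^2}\,dt' + \int_0^t \bigl(\partial_3\Delta_q^v(u^3_F u^h_{F,\lambda})\,\big|\,\Delta_q^v a_\lambda\bigr)_{L^2}\,dt'\\
&\stackrel{\rm def}{=} I^{h,1}_{\lambda,q}(t) + I^{h,2}_{\lambda,q}(t).
\end{aligned} \tag{3.16}
$$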
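The integration by parts behind (3.16) uses only the divergence-free condition. The following minimal sketch is written for smooth, decaying fields, with $A$ standing for $\Delta_q^v a_\lambda$ (a shorthand introduced here only for illustration).

```latex
% Since div u_F = 0, the transport term can be put in divergence form:
\[
u_F\cdot\nabla u^h_{F,\lambda}
 = \operatorname{div}_h\bigl(u^h_F\otimes u^h_{F,\lambda}\bigr)
 + \partial_3\bigl(u^3_F\,u^h_{F,\lambda}\bigr).
\]
% The block \Delta_q^v is a Fourier multiplier, so it commutes with derivatives;
% integrating by parts in the horizontal variables then gives
\[
\bigl(\Delta_q^v\operatorname{div}_h(u^h_F\otimes u^h_{F,\lambda})\,\big|\,A\bigr)_{L^2}
 = -\bigl(\Delta_q^v(u^h_F\otimes u^h_{F,\lambda})\,\big|\,\nabla_h A\bigr)_{L^2},
\]
% while the vertical part is kept as
\[
\bigl(\partial_3\Delta_q^v(u^3_F\,u^h_{F,\lambda})\,\big|\,A\bigr)_{L^2}.
\]
```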

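Applying Corollary 3.1 gives
$$
\begin{aligned}
|I^{h,1}_{\lambda,q}(t)| &\le \|\Delta_q^v(u^h_F\otimes u^h_{F,\lambda})\|_{L^2_t(L^2)}\, \|\nabla_h\Delta_q^v a_\lambda\|_{L^2_t(L^2)}\\
&\lesssim d_q^2\, 2^{-q}\, \|u^h_F\|_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h u^h_F\|_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}.
\end{aligned}
$$
However, thanks to Definition 1.2 and (3.2), we have
$$
\|u^h_F\|_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})} \lesssim \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}
\quad\text{and}\quad
\|\nabla_h u^h_F\|_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})} \lesssim \frac{1}{\sqrt{\nu}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}, \tag{3.17}
$$
from which we deduce that
$$
|I^{h,1}_{\lambda,q}(t)| \lesssim \frac{d_q^2}{\sqrt{\nu}}\, 2^{-q}\, \|u^h_{hh}\|^2_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}. \tag{3.18}
$$
For the term with the vertical derivative, let us write using Lemma 2.1 that
$$
|I^{h,2}_{\lambda,q}(t)| \lesssim 2^q\, \|\Delta_q^v(u^3_F u^h_{F,\lambda})\|_{L^1_t(L^2)}\, \|\Delta_q^v a_\lambda\|_{L^\infty_t(L^2)}.
$$
Using Bony's decomposition in the vertical variable, we infer
$$
\Delta_q^v(u^3_F u^h_{F,\lambda}) = \sum_{|q'-q|\le5} \Delta_q^v\bigl(S_{q'-1}^v u^3_F\,\Delta_{q'}^v u^h_{F,\lambda}\bigr) + \sum_{q'\ge q-4} \Delta_q^v\bigl(\Delta_{q'}^v u^3_F\, S_{q'+2}^v u^h_{F,\lambda}\bigr), \tag{3.19}
$$
whereas using Bony's decomposition (1.22) in the horizontal variables and the definition of $u_{hh}$ gives
$$
S_{q'-1}^v u^3_F\,\Delta_{q'}^v u^h_{F,\lambda} = \sum_{k\ge q'-4} \Bigl( S_{k-1}^h S_{q'-1}^v u^3_F\,\Delta_k^h\Delta_{q'}^v u^h_{F,\lambda} + \Delta_k^h S_{q'-1}^v u^3_F\, S_{k+2}^h\Delta_{q'}^v u^h_{F,\lambda} \Bigr). \tag{3.20}
$$
The two terms of the above sum can be estimated along the same lines. Thanks to (3.2) and (3.6), we have
$$
\begin{aligned}
\sum_{k\ge q'-4} &\|S_{k-1}^h S_{q'-1}^v u^3_F\,\Delta_k^h\Delta_{q'}^v u^h_{F,\lambda}\|_{L^1_t(L^2)}\\
&\lesssim \sum_{k\ge q'-4} \|S_{k-1}^h S_{q'-1}^v u^3_F\|_{L^\infty_t(L^4_h(L^\infty_v))}\, \|\Delta_k^h\Delta_{q'}^v u^h_{F,\lambda}\|_{L^1_t(L^4_h(L^2_v))}\\
&\lesssim \sum_{k\ge q'-4} c_k\, d_{k,q'}\, 2^{k} \int_0^t e^{-c\nu t' 2^{2k}}\,dt'\; 2^{-\frac{q'}{2}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, e^{-\lambda\|u^3_{hh}\|^4_{\mathcal{B}_4^{-\frac12,\frac12}}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}},
\end{aligned}
$$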


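which gives
$$
\sum_{k\ge q'-4} \|S_{k-1}^h S_{q'-1}^v u^3_F\,\Delta_k^h\Delta_{q'}^v u^h_{F,\lambda}\|_{L^1_t(L^2)} \lesssim \frac{d_{q'}}{\lambda^{\frac14}\nu}\, 2^{-\frac{3q'}{2}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}},
$$
and consequently
$$
|I^{h,2}_{\lambda,q}(t)| \lesssim \frac{d_q^2}{\lambda^{\frac14}\nu}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a_\lambda\|_{\widetilde{L}^\infty_t(\mathcal{B})}.
$$
This along with (3.18) proves (3.14).

On the other hand, again as $\operatorname{div} u_F = 0$, we write by using integration by parts
$$
I^v_q(t) = -\int_0^t \bigl(\Delta_q^v(u^h_F u^3_F)\,\big|\,\Delta_q^v\nabla_h a\bigr)_{L^2}\,dt' - \int_0^t \bigl(\Delta_q^v((u^3_F)^2)\,\big|\,\partial_3\Delta_q^v a\bigr)_{L^2}\,dt',
$$
which along with Lemma 2.1 ensures
$$
|I^v_q(t)| \lesssim \|\Delta_q^v(u^h_F u^3_F)\|_{L^2_t(L^2)}\, \|\Delta_q^v\nabla_h a\|_{L^2_t(L^2)} + 2^q\, \|\Delta_q^v((u^3_F)^2)\|_{L^1_t(L^2)}\, \|\Delta_q^v a\|_{L^\infty_t(L^2)}. \tag{3.21}
$$
Applying Corollary 3.1 and (3.17) gives
$$
\|\Delta_q^v(u^h_F u^3_F)\|_{L^2_t(L^2)} \lesssim \frac{d_q}{\sqrt{\nu}}\, 2^{-\frac q2}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}. \tag{3.22}
$$
On the other hand, applying Lemma 2.1 and $\operatorname{div} u_F = 0$ yields
$$
\begin{aligned}
\sum_{k\ge q'-4} &\|S_{k-1}^h S_{q'-1}^v u^3_F\,\Delta_k^h\Delta_{q'}^v u^3_F\|_{L^1_t(L^2)}\\
&\lesssim 2^{-q'} \sum_{k\ge q'-4} \|S_{k-1}^h S_{q'-1}^v u^3_F\|_{L^\infty_t(L^4_h(L^\infty_v))}\, \|\Delta_k^h\Delta_{q'}^v\partial_3 u^3_F\|_{L^1_t(L^4_h(L^2_v))}\\
&\lesssim \frac{1}{\nu} \sum_{k\ge q'-4} c_k\, d_{k,q'}\, 2^{-\frac{3q'}{2}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\\
&\lesssim \frac{d_{q'}}{\nu}\, 2^{-\frac{3q'}{2}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}},
\end{aligned}
$$
from which we deduce, by a similar version of (3.19) and (3.20) for $\Delta_q^v((u^3_F)^2)$, that
$$
\|\Delta_q^v((u^3_F)^2)\|_{L^1_t(L^2)} \lesssim \frac{d_q}{\nu}\, 2^{-\frac{3q}{2}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}},
$$
which along with (3.21) and (3.22) ensures (3.15). This completes the proof of the proposition.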
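The factor $\lambda^{-\frac14}$ in (3.14) can be traced to an elementary one-variable maximisation. In the sketch below, $x \ge 0$ is a placeholder for $\|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}$, and the weight is bounded by $g_\lambda \le e^{-\lambda x^4}$ since $m \ge 0$; the symbols $x$ and $\varphi$ are introduced here only for illustration.

```latex
% Maximise \varphi(x) = x e^{-\lambda x^4} over x \ge 0:
\[
\varphi'(x) = \bigl(1 - 4\lambda x^4\bigr)\,e^{-\lambda x^4} = 0
\iff x = (4\lambda)^{-\frac14},
\]
% hence
\[
\sup_{x\ge0}\, x\,e^{-\lambda x^4}
 = (4\lambda)^{-\frac14}\,e^{-\frac14}
 = (4e\lambda)^{-\frac14}
 \lesssim \lambda^{-\frac14}.
\]
% Combined with \int_0^t e^{-c\nu t' 2^{2k}}\,dt' \le (c\nu 2^{2k})^{-1},
% this accounts for the prefactor 1/(\lambda^{1/4}\nu) in the bound on I^{h,2}_{\lambda,q}.
```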


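Proposition 3.2. Let $u_F$ be given by (1.16). Let $a = (a^h, a^3)$ and $b$ be in $\mathcal{B}(t)$ with $\operatorname{div} a = 0$. For a given nonnegative function $m \in L^\infty(0,t)$ and $\lambda > 0$, we denote
$$
f(t) \stackrel{\rm def}{=} \|a^3(t)\|^2_{\mathcal{B}}\, \|\nabla_h a^3(t)\|^2_{\mathcal{B}}
\quad\text{and}\quad
g_\lambda(t) \stackrel{\rm def}{=} \exp\Bigl(-\lambda\int_0^t f(t')\,dt' - m(t)\Bigr), \tag{3.23}
$$
and
$$
J^h_{\lambda,q}(t) \stackrel{\rm def}{=} \int_0^t \bigl(\Delta_q^v(a\cdot\nabla u^h_{F,\lambda})\,\big|\,\Delta_q^v b_\lambda\bigr)_{L^2}\,dt'
\quad\text{and}\quad
J^v_q(t) \stackrel{\rm def}{=} \int_0^t \bigl(\Delta_q^v(a\cdot\nabla u^3_F)\,\big|\,\Delta_q^v b\bigr)_{L^2}\,dt',
$$
with $u_{F,\lambda}$, $a_\lambda$ and $b_\lambda$ being given by (3.13). Then there holds for $q \in \mathbb{Z}$,
$$
\begin{aligned}
|J^h_{\lambda,q}(t)| \lesssim d_q^2\, 2^{-q} \Bigl(& \frac{1}{\nu^{\frac14}}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_\lambda\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}
+ \frac{1}{\nu^{\frac34}}\, \|b_\lambda\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|b_\lambda\|^{\frac12}_{\widetilde{L}^2_{t,f}(\mathcal{B})}\\
&+ \frac{1}{\nu^{\frac14}}\, \|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^h_\lambda\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}\, \|\nabla_h b_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}
+ \frac{1}{\nu^{\frac12}}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})} \Bigr)\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}},
\end{aligned} \tag{3.24}
$$
and
$$
\begin{aligned}
|J^v_q(t)| \lesssim \frac{d_q^2}{\nu^{\frac14}}\, 2^{-q} \Bigl[& \Bigl( \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^3\|_{\widetilde{L}^2_t(\mathcal{B})} + \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^h\|_{\widetilde{L}^2_t(\mathcal{B})} \Bigr)\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}\\
&+ \Bigl( \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^3\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^3\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})} + \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^h\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})} \Bigr)\, \|\nabla_h b\|_{\widetilde{L}^2_t(\mathcal{B})} \Bigr].
\end{aligned} \tag{3.25}
$$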

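Proof. Again we first distinguish the terms with horizontal derivatives from the terms with the vertical one to write
$$
\begin{aligned}
J^h_{\lambda,q}(t) &= \int_0^t \bigl(\Delta_q^v(a^h\cdot\nabla_h u^h_{F,\lambda})\,\big|\,\Delta_q^v b_\lambda\bigr)_{L^2}\,dt' + \int_0^t \bigl(\Delta_q^v(a^3\partial_3 u^h_{F,\lambda})\,\big|\,\Delta_q^v b_\lambda\bigr)_{L^2}\,dt'\\
&\stackrel{\rm def}{=} J^{h,1}_{\lambda,q}(t) + J^{h,2}_{\lambda,q}(t).
\end{aligned} \tag{3.26}
$$
Note that one gets by using integration by parts
$$
J^{h,1}_{\lambda,q}(t) = -\int_0^t \bigl(\Delta_q^v(\operatorname{div}_h a^h\, u^h_{F,\lambda})\,\big|\,\Delta_q^v b_\lambda\bigr)_{L^2}\,dt' - \int_0^t \bigl(\Delta_q^v(a^h u^h_{F,\lambda})\,\big|\,\nabla_h\Delta_q^v b_\lambda\bigr)_{L^2}\,dt',
$$
so that applying (2.3), Lemma 3.2 and Corollary 3.1 gives
$$
\begin{aligned}
|J^{h,1}_{\lambda,q}(t)| \lesssim{}& \|\sqrt{g_\lambda}\,\Delta_q^v(\operatorname{div}_h a^h\, u^h_F)\|_{L_t^{\frac43}(L_h^{\frac43}(L_v^2))}\, \|\sqrt{g_\lambda}\,\Delta_q^v b\|_{L^4_t(L^4_h(L^2_v))}
+ \|\Delta_q^v(a^h_\lambda u^h_F)\|_{L^2_t(L^2)}\, \|\Delta_q^v\nabla_h b_\lambda\|_{L^2_t(L^2)}\\
\lesssim{}& d_q^2\, 2^{-q}\, \|u^h_F\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h u^h_{F,\lambda}\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}
\Bigl( \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_\lambda\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}\\
&\qquad + \|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^h_\lambda\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}\, \|\nabla_h b_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})} \Bigr),
\end{aligned}
$$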


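from which and (3.17), we deduce that
$$
|J^{h,1}_{\lambda,q}(t)| \lesssim \frac{d_q^2}{\nu^{\frac14}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}} \Bigl( \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_\lambda\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})} + \|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^h_\lambda\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}\, \|\nabla_h b_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})} \Bigr). \tag{3.27}
$$
Using again Bony's decomposition (1.22) in the vertical variable, we write
$$
\begin{aligned}
J^{h,2}_{\lambda,q}(t) ={}& \sum_{|q'-q|\le5} \int_0^t \bigl(\Delta_q^v(S_{q'-1}^v a^3\,\partial_3\Delta_{q'}^v u^h_{F,\lambda})\,\big|\,\Delta_q^v b_\lambda\bigr)_{L^2}\,dt'\\
&+ \sum_{q'\ge q-4} \int_0^t \bigl(\Delta_q^v(\Delta_{q'}^v a^3\,\partial_3 S_{q'+2}^v u^h_{F,\lambda})\,\big|\,\Delta_q^v b_\lambda\bigr)_{L^2}\,dt'
\stackrel{\rm def}{=} H_{1,q}(t) + H_{2,q}(t).
\end{aligned}
$$
Then it follows from Lemma 2.1 that
$$
\begin{aligned}
|H_{1,q}(t)| &\lesssim \sum_{|q'-q|\le5} 2^{q'} \int_0^t \|S_{q'-1}^v a^3\|_{L^4_h(L^\infty_v)}\, \|\Delta_{q'}^v u^h_F\|_{L^4_h(L^2_v)}\, \|\Delta_q^v b_\lambda\|_{L^2}\,dt'\\
&\lesssim \sum_{|q'-q|\le5} 2^{q'} \Bigl( \int_0^t \|a^3(t')\|^2_{\mathcal{B}}\, \|\nabla_h a^3(t')\|^2_{\mathcal{B}}\, \|\Delta_q^v b_\lambda(t')\|^2_{L^2}\,dt' \Bigr)^{\frac14}\, \|\Delta_{q'}^v u^h_F\|_{L_t^{\frac43}(L^4_h(L^2_v))}\, \|\Delta_q^v b_\lambda\|^{\frac12}_{L^\infty_t(L^2)},
\end{aligned}
$$
which along with Definition 1.2 and (3.4) gives
$$
|H_{1,q}(t)| \lesssim \frac{d_q^2}{\nu^{\frac34}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|b_\lambda\|^{\frac12}_{\widetilde{L}^2_{t,f}(\mathcal{B})}\, \|b_\lambda\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}.
$$
And again thanks to Lemma 2.1, we write
$$
|H_{2,q}(t)| \lesssim \sum_{q'\ge q-4} \|\Delta_{q'}^v\partial_3 a^3_\lambda\|_{L^2_t(L^2)}\, \|S_{q'+2}^v u^h_F\|_{L^2_t(L^\infty)}\, \|\Delta_q^v b\|_{L^\infty_t(L^2)},
$$
which along with $\partial_3 a^3 = -\operatorname{div}_h a^h$ and (3.5) gives
$$
\begin{aligned}
|H_{2,q}(t)| &\lesssim \frac{1}{\nu^{\frac12}} \sum_{q'\ge q-4} d_{q'}\, 2^{\frac{q-q'}{2}}\, d_q\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})}\\
&\lesssim \frac{d_q^2}{\nu^{\frac12}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})}.
\end{aligned}
$$
As a consequence, we obtain
$$
|J^{h,2}_{\lambda,q}(t)| \lesssim d_q^2\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}} \Bigl( \frac{1}{\nu^{\frac34}}\, \|b_\lambda\|^{\frac12}_{\widetilde{L}^2_{t,f}(\mathcal{B})}\, \|b_\lambda\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})} + \frac{1}{\nu^{\frac12}}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})} \Bigr).
$$
This together with (3.26) and (3.27) proves (3.24).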


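To deal with $J^v_q(t)$, we first use $\operatorname{div} u_F = 0$ and integration by parts to get
$$
\begin{aligned}
J^v_q(t) ={}& -\int_0^t \bigl(\Delta_q^v(\operatorname{div}_h a^h\, u^3_F - \nabla_h a^3\cdot u^h_F)\,\big|\,\Delta_q^v b\bigr)_{L^2}\,dt'
- \int_0^t \bigl(\Delta_q^v(a^h u^3_F - a^3 u^h_F)\,\big|\,\nabla_h\Delta_q^v b\bigr)_{L^2}\,dt'\\
\stackrel{\rm def}{=}{}& J^{v,1}_q(t) + J^{v,2}_q(t).
\end{aligned} \tag{3.28}
$$
Thanks to (2.3) and Corollary 3.1, we write
$$
\begin{aligned}
|J^{v,1}_q(t)| \le{}& \|\Delta_q^v(\operatorname{div}_h a^h\, u^3_F - \nabla_h a^3\cdot u^h_F)\|_{L_t^{\frac43}(L_h^{\frac43}(L_v^2))}\, \|\Delta_q^v b\|_{L^4_t(L^4_h(L^2_v))}\\
\lesssim{}& d_q^2\, 2^{-q} \Bigl( \|u^3_F\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h u^3_F\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a^h\|_{\widetilde{L}^2_t(\mathcal{B})}\\
&\qquad + \|u^h_F\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h u^h_F\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B}_4^{-\frac12,\frac12})}\, \|\nabla_h a^3\|_{\widetilde{L}^2_t(\mathcal{B})} \Bigr)\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})},
\end{aligned}
$$
which along with (3.17) gives
$$
|J^{v,1}_q(t)| \lesssim \frac{d_q^2}{\nu^{\frac14}}\, 2^{-q} \Bigl( \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^h\|_{\widetilde{L}^2_t(\mathcal{B})} + \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^3\|_{\widetilde{L}^2_t(\mathcal{B})} \Bigr)\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}. \tag{3.29}
$$
Following exactly the same line, we obtain
$$
\begin{aligned}
|J^{v,2}_q(t)| &\le \|\Delta_q^v(a^h u^3_F - a^3 u^h_F)\|_{L^2_t(L^2)}\, \|\Delta_q^v\nabla_h b\|_{L^2_t(L^2)}\\
&\lesssim \frac{d_q^2}{\nu^{\frac14}}\, 2^{-q} \Bigl( \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^h\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})} + \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^3\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^3\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})} \Bigr)\, \|\nabla_h b\|_{\widetilde{L}^2_t(\mathcal{B})},
\end{aligned}
$$
which together with (3.28) and (3.29) shows (3.25). This completes the proof of the proposition.

Proposition 3.3. Let $0 \le g(t) \le 1$ and $b \in \mathcal{B}(t)$. We denote
$$
b_g(t,x) \stackrel{\rm def}{=} g(t)b(t,x) \quad\text{and}\quad F_{q,g}(t) \stackrel{\rm def}{=} \int_0^t \bigl(\Delta_q^v(u_F\cdot\nabla b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'.
$$
Then there holds for all $q \in \mathbb{Z}$,
$$
|F_{q,g}(t)| \lesssim d_q^2\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}} \Bigl( \frac{1}{\nu^{\frac14}}\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|^{\frac32}_{\widetilde{L}^2_t(\mathcal{B})} + \frac{1}{\nu^{\frac12}}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})} \Bigr).
$$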


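Proof. The proof of this proposition basically follows from the proof of Proposition 2.1. Indeed, we first write $F_{q,g}(t)$ as
$$
\begin{aligned}
F_{q,g}(t) &= \int_0^t \bigl(\Delta_q^v(u^h_F\cdot\nabla_h b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt' + \int_0^t \bigl(\Delta_q^v(u^3_F\,\partial_3 b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'\\
&\stackrel{\rm def}{=} F^h_{q,g}(t) + F^v_{q,g}(t).
\end{aligned} \tag{3.30}
$$
Thanks to (2.3), (3.17) and Lemma 3.2, we have
$$
\begin{aligned}
|F^h_{q,g}(t)| &\lesssim \|\sqrt{g}\,\Delta_q^v(u^h_F\cdot\nabla_h b_g)\|_{L_t^{\frac43}(L_h^{\frac43}(L_v^2))}\, \|\sqrt{g}\,\Delta_q^v b\|_{L^4_t(L^4_h(L^2_v))}\\
&\lesssim \frac{d_q^2}{\nu^{\frac14}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|b\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|^{\frac32}_{\widetilde{L}^2_t(\mathcal{B})}.
\end{aligned} \tag{3.31}
$$
Whereas similar to (2.5), we write
$$
\begin{aligned}
F^v_{q,g}(t) ={}& \int_0^t \bigl(S_{q-1}^v u^3_F\,\partial_3\Delta_q^v b_g\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'
+ \sum_{|q'-q|\le5} \int_0^t \bigl([\Delta_q^v; S_{q'-1}^v u^3_F]\,\partial_3\Delta_{q'}^v b_g\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'\\
&+ \sum_{|q'-q|\le5} \int_0^t \bigl((S_{q'-1}^v u^3_F - S_{q-1}^v u^3_F)\,\partial_3\Delta_q^v\Delta_{q'}^v b_g\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'\\
&+ \sum_{q'\ge q-4} \int_0^t \bigl(\Delta_q^v(\Delta_{q'}^v u^3_F\, S_{q'+2}^v\partial_3 b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'\\
\stackrel{\rm def}{=}{}& F^{1,v}_{q,g}(t) + F^{2,v}_{q,g}(t) + F^{3,v}_{q,g}(t) + F^{4,v}_{q,g}(t).
\end{aligned}
$$
Then one gets by using $\operatorname{div} u_F = 0$ and integration by parts that
$$
\begin{aligned}
|F^{1,v}_{q,g}(t)| &= \Bigl| \int_0^t\!\!\int_{\mathbb{R}^3} S_{q-1}^v u^h_F\cdot \Delta_q^v\nabla_h b_g\; \Delta_q^v b_g\,dx\,dt' \Bigr|\\
&\lesssim \|S_{q-1}^v u^h_F\|_{L^2_t(L^\infty)}\, \|\Delta_q^v b\|_{L^\infty_t(L^2)}\, \|\Delta_q^v\nabla_h b_g\|_{L^2_t(L^2)}\\
&\lesssim \frac{d_q^2}{\sqrt{\nu}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})},
\end{aligned}
$$
where we used (3.5) in the last step. While thanks to (2.6) and using integration by parts, we write
$$
\begin{aligned}
F^{2,v}_{q,g}(t) = \sum_{|q-q'|\le5} 2^q \int_0^t\!\!\int_{\mathbb{R}^3} \Bigl\{ \int_{\mathbb{R}} & h\bigl(2^q(x_3-y_3)\bigr) \int_0^1 S_{q'-1}^v u^h_F\bigl(x_h, \tau y_3 + (1-\tau)x_3\bigr)\,d\tau\\
&\times (y_3-x_3)\,\partial_3\Delta_{q'}^v\nabla_h b_g(t',x_h,y_3)\,dy_3\; \Delta_q^v b_g(t',x)\\
+ \int_{\mathbb{R}} & h\bigl(2^q(x_3-y_3)\bigr) \int_0^1 S_{q'-1}^v u^h_F\bigl(x_h, \tau y_3 + (1-\tau)x_3\bigr)\,d\tau\\
&\times (y_3-x_3)\,\partial_3\Delta_{q'}^v b_g(t',x_h,y_3)\,dy_3\; \Delta_q^v\nabla_h b_g(t',x) \Bigr\}\,dx\,dt',
\end{aligned}
$$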


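from which and (3.5), we deduce that
$$
\begin{aligned}
|F^{2,v}_{q,g}(t)| &\lesssim \sum_{|q'-q|\le5} \|S_{q'-1}^v u^h_F\|_{L^2_t(L^\infty)} \Bigl( \|\Delta_{q'}^v\nabla_h b_g\|_{L^2_t(L^2)}\, \|\Delta_q^v b\|_{L^\infty_t(L^2)} + \|\Delta_{q'}^v b_g\|_{L^\infty_t(L^2)}\, \|\Delta_q^v\nabla_h b_g\|_{L^2_t(L^2)} \Bigr)\\
&\lesssim \frac{d_q^2}{\nu^{\frac12}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}.
\end{aligned}
$$
On the other hand, let $\Delta_q^{v,3}$ be defined by (2.1). Then it is easy to observe that
$$
\begin{aligned}
F^{4,v}_{q,g}(t) &= \sum_{q'\ge q-4} 2^{-q'} \int_0^t \bigl(\Delta_q^v(\Delta_{q'}^{v,3}\partial_3 u^3_F\, S_{q'+2}^v\partial_3 b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'\\
&= \sum_{q'\ge q-4} 2^{-q'} \Bigl\{ \int_0^t \bigl(\Delta_q^v(\Delta_{q'}^{v,3} u^h_F\, S_{q'+2}^v\partial_3\nabla_h b_g)\,\big|\,\Delta_q^v b_g\bigr)_{L^2}\,dt'
+ \int_0^t \bigl(\Delta_q^v(\Delta_{q'}^{v,3} u^h_F\, S_{q'+2}^v\partial_3 b_g)\,\big|\,\Delta_q^v\nabla_h b_g\bigr)_{L^2}\,dt' \Bigr\},
\end{aligned}
$$
which along with Lemma 2.1 and (3.5) ensures that
$$
\begin{aligned}
|F^{4,v}_{q,g}(t)| &\lesssim \sum_{q'\ge q-4} \|\Delta_{q'}^v u^h_F\|_{L^2_t(L^\infty_h(L^2_v))} \Bigl( \|S_{q'+2}^v\nabla_h b_g\|_{L^2_t(L^2_h(L^\infty_v))}\, \|\Delta_q^v b_g\|_{L^\infty_t(L^2)} + \|S_{q'+2}^v b_g\|_{L^\infty_t(L^2_h(L^\infty_v))}\, \|\Delta_q^v\nabla_h b_g\|_{L^2_t(L^2)} \Bigr)\\
&\lesssim \frac{1}{\nu^{\frac12}} \sum_{q'\ge q-4} d_{q'}\, 2^{\frac{q-q'}{2}}\, d_q\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}\\
&\lesssim \frac{d_q^2}{\nu^{\frac12}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}.
\end{aligned}
$$
A similar argument shows the same estimate for $F^{3,v}_{q,g}(t)$. As a consequence, we obtain
$$
|F^v_{q,g}(t)| \lesssim \frac{d_q^2}{\nu^{\frac12}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|b\|_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h b_g\|_{\widetilde{L}^2_t(\mathcal{B})}.
$$
This along with (3.30) and (3.31) concludes the proof of the proposition.

To deal with the pressure term, we need the following two propositions:

Proposition 3.4. Let $a = (a^h, a^3)$ be in $\mathcal{B}(t)$ and $g_\lambda$, $a_\lambda$, $u_{F,\lambda}$ be given by (3.13). We denote
$$
\bar{P}_{q,\lambda}(t) \stackrel{\rm def}{=} \sum_{\ell,k=1}^3 \int_0^t \bigl(\nabla_h\Delta_q^v(-\Delta)^{-1}\partial_\ell\partial_k(u^\ell_F u^k_{F,\lambda})\,\big|\,\Delta_q^v a^h_\lambda\bigr)_{L^2}\,dt'.
$$
Then there holds for all $q \in \mathbb{Z}$,
$$
|\bar{P}_{q,\lambda}(t)| \lesssim d_q^2\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}} \Bigl( \frac{1}{\sqrt{\nu}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})} + \frac{1}{\nu\lambda^{\frac14}}\, \|a^h_\lambda\|_{\widetilde{L}^\infty_t(\mathcal{B})} \Bigr).
$$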


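Proof. Similar to the proof of Proposition 2.2, we again distinguish the terms with horizontal derivatives from the terms with the vertical one so that
$$
\bar{P}_{q,\lambda}(t) = \bar{P}^h_{q,\lambda}(t) + \bar{P}^v_{q,\lambda}(t), \tag{3.32}
$$
with
$$
\bar{P}^h_{q,\lambda}(t) \stackrel{\rm def}{=} \sum_{\ell,k=1}^2 \int_0^t \bigl(\nabla_h\Delta_q^v(-\Delta)^{-1}\partial_\ell\partial_k(u^\ell_F u^k_{F,\lambda})\,\big|\,\Delta_q^v a^h_\lambda\bigr)_{L^2}\,dt'
$$
and
$$
\bar{P}^v_{q,\lambda}(t) \stackrel{\rm def}{=} \int_0^t \Bigl(\nabla_h\Delta_q^v(-\Delta)^{-1}\Bigl[\partial_3^2(u^3_F u^3_{F,\lambda}) + 2\sum_{k=1}^2 \partial_3\partial_k(u^3_F u^k_{F,\lambda})\Bigr]\,\Big|\,\Delta_q^v a^h_\lambda\Bigr)_{L^2}\,dt'.
$$
Then the proof of (3.18) ensures
$$
|\bar{P}^h_{q,\lambda}(t)| \lesssim \frac{d_q^2}{\sqrt{\nu}}\, 2^{-q}\, \|u^h_{hh}\|^2_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}. \tag{3.33}
$$
Whereas a similar proof of (2.11) gives
$$
\bar{P}^v_{q,\lambda}(t) = 2\sum_{k=1}^2 \int_0^t \bigl(\Delta_q^v(u^k_F\,\partial_k u^3_{F,\lambda})\,\big|\,\Delta_q^v(-\Delta)^{-1}\partial_3\operatorname{div}_h a^h_\lambda\bigr)_{L^2}\,dt' = \bar{P}^{v,1}_{q,\lambda}(t) + \bar{P}^{v,2}_{q,\lambda}(t),
$$
where
$$
\bar{P}^{v,1}_{q,\lambda}(t) \stackrel{\rm def}{=} 2\sum_{k=1}^2 \sum_{q'\ge q-4} \int_0^t \bigl(\Delta_q^v(\Delta_{q'}^v u^k_F\, S_{q'+2}^v\partial_k u^3_{F,\lambda})\,\big|\,\Delta_q^v(-\Delta)^{-1}\partial_3\operatorname{div}_h a^h_\lambda\bigr)_{L^2}\,dt',
$$
$$
\bar{P}^{v,2}_{q,\lambda}(t) \stackrel{\rm def}{=} 2\sum_{k=1}^2 \sum_{|q'-q|\le5} \int_0^t \bigl(\Delta_q^v(S_{q'-1}^v u^k_F\,\Delta_{q'}^v\partial_k u^3_{F,\lambda})\,\big|\,\Delta_q^v(-\Delta)^{-1}\partial_3\operatorname{div}_h a^h_\lambda\bigr)_{L^2}\,dt'.
$$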

Using Bony’s decomposition (1.22) in the horizontal variables, we write

hj qv u hF L 2 (L 4 (L 2 )) S hj−1 Sqv+2 ∇h u 3F,λ L 2 (L 4 (L ∞ )) qv u kF Sqv+2 ∂k u 3F,λ L 1 (L 2 ) t

t

j≥q −4

v

h

t

h

v

+S hj+2 qv u hF L ∞ (L 4 (L 2 )) hj Sqv +2 ∇h u 3F,λ L 1 (L 4 (L ∞ )) . t

h

v

t

v

h

Notice that (3.6) implies that there holds for any 1 ≤ p < ∞, hj v u 3F,λ L p (L 4 (L 2 )) t

h

v

t

e

−cνt 22 j

dt

1

p

j 2

d j, 2 2

− 2

0

d j, 1 p

1 4

2

( 21 − 2p ) j − 2

2

ν λ Whence applying Lemma 2.1 gives

h

v

hj Sqv +2 ∇h u 3F,λ L 1 (L 4 (L ∞ )) t

h

v

u 3hh

−1,1 B4 2 2

.

S hj Sqv +2 ∇h u 3F,λ L 2 (L 4 (L ∞ )) t

−λu 3hh 4

e

−1,1 B4 2 2

(3.34)

cj 1 2

ν λ cj νλ

1 4

j

1 4

22 j

2− 2 ,

and


as a consequence, we obtain qv u kF Sqv +2 ∇h u 3F,λ L 1 (L 2 ) t

1

q c j d j,q 2− 2 u hhh

1

νλ 4 dq νλ

1 4

− 21 , 21

B4

j≥q −4 q

2− 2 u hhh

− 21 , 21

B4

,

from which, we deduce that v,1 P¯ (t) qv u kF Sqv +2 ∂k u 3F,λ L 1 (L 2 ) qv aλh L ∞ 2 q,λ t (L ) t

q ≥q−4

dq2 1

νλ 4

2−q u hhh

− 21 , 21

B4

aλh L ∞ (B ) .

(3.35)

t

Whereas again using Bony’s decomposition (1.22) in the horizontal variables, we obtain 3 Sqv −1 uFh qv ∇h uF,λ L 1 (L 2 ) t

h 3 jh Sqv −1 uFh L 2 (L 4 (L ∞ )) Sj−1 qv ∇h uF,λ L 2 (L 4 (L 2 )) t

j≥q −4

v

h

t

h

v

+S hj+2 Sqv −1 u hF L ∞ (L 4 (L ∞ )) hj qv ∇h u 3F,λ L 1 (L 4 (L 2 )) , t

v

h

t

h

v

which along with (3.34) yields Sqv −1 u hF qv ∇h u 3F,λ L 1 (L 2 ) t

so we obtain v,2 P¯ (t) q,λ

|q −q|≤5

dq2 1

νλ 4

1

q c j d j,q 2− 2 u hhh

1

νλ 4 dq νλ

1 4

− 21 , 21

B4

j≥q −4 q

2− 2 u hhh

− 21 , 21

B4

,

Sqv −1 u hF qv ∇h u 3F,λ L 1 (L 2 ) qv aλh L ∞ 2 t (L ) t

2−q u hhh

− 21 , 21

B4

aλh L ∞ (B ) . t

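This along with (3.32), (3.33) and (3.35) concludes the proof of the proposition.

Corollary 3.2. Under the assumptions of Proposition 3.4, we have for all $q \in \mathbb{Z}$,
$$
\Bigl| \sum_{\ell,k=1}^3 \int_0^t \bigl(\nabla_h\Delta_q^v(-\Delta)^{-1}\partial_\ell\partial_k(u^\ell_F u^k_F)\,\big|\,\Delta_q^v a^h\bigr)_{L^2}\,dt' \Bigr|
\lesssim d_q^2\, 2^{-q} \Bigl( \frac{1}{\sqrt{\nu}}\, \|u^h_{hh}\|^2_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^h\|_{\widetilde{L}^2_t(\mathcal{B})} + \frac{1}{\nu}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h\|_{\widetilde{L}^\infty_t(\mathcal{B})} \Bigr).
$$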


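Proof. Indeed notice that
$$
\begin{aligned}
\|\Delta_{q'}^v u^h_F\, S_{q'+2}^v\nabla_h u^3_F\|_{L^1_t(L^2)} \lesssim \sum_{j\ge q'-4} \Bigl(& \|\Delta_j^h\Delta_{q'}^v u^h_F\|_{L^2_t(L^4_h(L^2_v))}\, \|S_{j-1}^h S_{q'+2}^v\nabla_h u^3_F\|_{L^2_t(L^4_h(L^\infty_v))}\\
&+ \|S_{j+2}^h\Delta_{q'}^v u^h_F\|_{L^\infty_t(L^4_h(L^2_v))}\, \|\Delta_j^h S_{q'+2}^v\nabla_h u^3_F\|_{L^1_t(L^4_h(L^\infty_v))} \Bigr),
\end{aligned}
$$
which along with (3.2) ensures that
$$
\begin{aligned}
\|\Delta_{q'}^v u^k_F\, S_{q'+2}^v\nabla_h u^3_F\|_{L^1_t(L^2)} &\lesssim \frac{1}{\nu} \sum_{j\ge q'-4} c_j\, d_{j,q'}\, 2^{-\frac{q'}{2}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\\
&\lesssim \frac{d_{q'}}{\nu}\, 2^{-\frac{q'}{2}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}},
\end{aligned}
$$
from which we deduce that
$$
\begin{aligned}
|\bar{P}^{v,1}_{q,0}(t)| &\lesssim \sum_{q'\ge q-4} \|\Delta_{q'}^v u^k_F\, S_{q'+2}^v\nabla_h u^3_F\|_{L^1_t(L^2)}\, \|\Delta_q^v a^h\|_{L^\infty_t(L^2)}\\
&\lesssim \frac{d_q^2}{\nu}\, 2^{-q}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h\|_{\widetilde{L}^\infty_t(\mathcal{B})}.
\end{aligned}
$$
Following the same line, we obtain the same estimate for $\bar{P}^{v,2}_{q,0}(t)$. This along with (3.33) completes the proof of the corollary.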

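Proposition 3.5. Let $a = (a^h, a^3) \in \mathcal{B}(t)$ with $\operatorname{div} a = 0$. Let $u_F$ be given by (1.16). For any positive number $\lambda$ and $0 \le m \in L^\infty(0,t)$, we denote
$$
\begin{aligned}
&f(t) \stackrel{\rm def}{=} \|u^3_F(t)\|^2_{L^\infty} + \|\nabla_h a^3(t)\|^2_{\mathcal{B}}, \qquad
g_\lambda(t) \stackrel{\rm def}{=} \exp\Bigl(-\lambda\int_0^t f(t')\,dt' - m(t)\Bigr),\\
&a_\lambda(t,x) \stackrel{\rm def}{=} g_\lambda(t)a(t,x), \qquad u_{F,\lambda}(t,x) \stackrel{\rm def}{=} g_\lambda(t)u_F(t,x),
\end{aligned} \tag{3.36}
$$
and
$$
\widetilde{P}_{q,\lambda}(t) \stackrel{\rm def}{=} \sum_{\ell,k=1}^3 \int_0^t \bigl(\nabla_h\Delta_q^v(-\Delta)^{-1}\partial_\ell\partial_k(u^\ell_F a^k_\lambda)\,\big|\,\Delta_q^v a^h_\lambda\bigr)_{L^2}\,dt'.
$$
Then we have for all $q \in \mathbb{Z}$,
$$
\begin{aligned}
|\widetilde{P}_{q,\lambda}(t)| \lesssim d_q^2\, 2^{-q} \Bigl(& \frac{1}{\sqrt{\nu}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h\|_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})} + \|a^h_\lambda\|_{\widetilde{L}^2_{t,f}(\mathcal{B})}\, \|\nabla_h a^h_\lambda\|_{\widetilde{L}^2_t(\mathcal{B})}\\
&+ \frac{1}{(\nu\lambda)^{\frac14}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h_\lambda\|^{\frac12}_{\widetilde{L}^2_{t,f}(\mathcal{B})}\, \|\nabla_h a^h_\lambda\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})} + \frac{1}{\nu^{\frac12}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h_\lambda\|_{\widetilde{L}^2_{t,f}(\mathcal{B})} \Bigr).
\end{aligned}
$$

Proof. Again as in the proofs of Proposition 2.2 and Proposition 3.4, we distinguish the terms with horizontal derivatives from the terms with a vertical one so that
$$
\widetilde{P}_{q,\lambda}(t) = \widetilde{P}^h_{q,\lambda}(t) + \widetilde{P}^v_{q,\lambda}(t), \tag{3.37}
$$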


with def h q,λ (t) = P

def v q,λ (t) = P

2 t ∇h qv (−)−1 ∂ ∂k (u F aλk ) | qv aλh L 2 dt ,

and

,k=1 0

t

∇h qv (−)−1 [∂32 (u 3F aλ3 ) +

0

2

∂3 ∂k (u 3F,λ a k + aλ3 u kF )] | qv aλh

L2

dt .

k=1

Applying Bony’s decomposition (1.22) and (3.5) gives qv (u hF aλh ) L 2 (L 2 ) Sqv −1 u hF L 2 (L ∞ ) qv aλh L ∞ 2 t (L ) t

t

|q −q|≤5

+

q ≥q−4

qv u hF L 2 (L ∞ (L 2 )) Sqv +2 aλh L ∞ (L 2 (L ∞ )) t

h

v

t

h

v

q dq √ 2− 2 u hhh − 1 , 1 a h L ∞ (B ) , t ν B4 2 2

from which, we deduce that h P q,λ (t) qv (u hF aλh ) 2 2 qv ∇h aλh 2 2 L (L ) L (L ) t

t

dq2 h √ 2−q u hhh − 1 , 1 a h L . ∞ (B ) ∇h aλ L 2t (B) t ν B4 2 2

(3.38)

On the other hand, using the fact that div a = 0 and div u F = 0, we write v q,λ (t) = P

2 t ∇h qv (−)−1 ∂3 (∂k u 3F,λ a k + ∂k aλ3 u kF ) | qv aλh L 2 dt k=1 0

v,1 (t) + P v,2 (t), = P q,λ q,λ

def

(3.39)

Using integration by parts, we obtain t v 3 v,1 q (u F,λ divh a h ) | qv (−)−1 ∂3 divh aλh L 2 dt Pq,λ (t) = − 0

2 t v 3 k q (u F,λ a ) | qv ∂k (−)−1 ∂3 divh aλh L 2 dt − k=1 0

def

a b = Wq,λ (t) + Wq,λ (t).

(3.40)

Applying Bony’s decomposition (1.22) in the vertical variables gives t a Wq,λ (t) = qv (Sqv −1 u 3F qv divh aλh ) | qv (−)−1 ∂3 divh aλh L 2 dt |q −q|≤5 0

+

t qv (qv u 3F,λ Sqv +2 divh a h ) | qv (−)−1 ∂3 divh aλh L 2 dt

q ≥q−4 0

1 2 q,λ q,λ = W (t) + W (t).

def


Thanks to Definition 1.3, one has 1 t 1 2 W q,λ (t) u 3F 2L ∞ qv aλ 2L 2 dt qv ∇h aλh L 2 (L 2 ) t

0

|q −q|≤5

dq2 2−q aλh 2

L t, f (B)

∇h aλh . L 2 (B ) t

Whereas thanks to (2.1), div a = 0 and div u F = 0, we write by using integration by parts

t 2 v h v 3 v −1 h q,λ W (t) = 2−q − qv (qv,3 q u F,λ Sq +2 ∂3 ∇h a )|q (−) ∂3 divh aλ L 2 dt 0

q ≥q−4 t

v h v h v −1 qv (qv,3 ∂3 divh ∇h aλh q u F Sq +2 divh aλ ) | q (−)

+

0

L2

dt ,

from which and Lemma 2.1, we deduce that t 1 2 2 v h W q,λ (t) q u F L 2 (L ∞ ) ∇h a 3 2B qv aλh 2L 2 dt t

q ≥q−4

0

+Sqv +2 ∇h aλh L 2 (L 2 ) qv aλh L ∞ 2 t (L )

t

dq2 h . √ 2−q u hhh − 1 , 1 aλh + a h L ∞ (B ) ∇h aλ 2 2 L t (B ) t L t, f (B) ν B4 2 2 b (t), again we use Bony’s decomposition (1.22) in the vertical variable To deal with Wq,λ to write 2 t v v b Wq,λ (t) = q (Sq −1 u 3F qv aλk ) | qv ∂k (−)−1 ∂3 divh aλh L 2 dt k=1 |q −q|≤5 0 t qv (qv u 3F,λ Sqv +2 a k ) + 0 q ≥q−4

| qv ∂k (−)−1 ∂3 divh aλh

L2

dt

3 4 q,λ q,λ = W (t) + W (t).

def

It is easy to observe that 3 W q,λ (t)

t

|q −q|≤5

0

u 3F 2L ∞ qv aλh 2L 2 dt

dq2 2−q aλh 2

L t, f (B)

1 2

qv ∇h aλh L 2 (L 2 ) t

∇h aλh . L 2 (B ) t

Whereas again thanks to (2.1) and div u F = 0, we get by using integration by parts that 4 (t) = W q,λ

2 k=1

q ≥q−4

+ 0

t

2−q

t 0

v h v k v −1 qv (qv,3 ∂3 divh aλh q u F,λ Sq +2 ∇h a ) | q ∂k (−)

v h v k v −1 ∂3 divh ∇h aλh qv (qv,3 q u F,λ Sq +2 a ) | q ∂k (−)

L2

dt ,

L2

dt


from which and Lemma 2.1, we deduce

4 W q,λ (t) qv u hF L 2 (L ∞ (L 2 )) Sqv +2 ∇h aλh L 2 (L 2 (L ∞ )) qv a h L ∞ 2 t (L ) t

q ≥q−4

h

v

t

+ Sqv +2 a h L ∞ (L 2 (L ∞ )) qv ∇h aλh L 2 (L 2 ) t

v

h

v

h

t

dq2

√ 2−q u hhh − 1 , 1 a h ∇h aλh . L 2t (B) L 2t (B) ν B4 2 2 Therefore, we obtain

v,1 P (t) dq2 2−q aλh q,λ 2

L t, f (B)

∇h aλh L 2 (B ) t

1 h . + √ u hhh − 1 , 1 aλh + a h L ∞ (B ) ∇h aλ 2 (B ) 2 L t L t, f (B) t ν B4 2 2

(3.41)

v,2 (t). Indeed using Bony’s decomposition (1.22) in the It remains to estimate of P q,λ vertical variable, we write 2 t v,2 ∇h qv (−)−1 ∂3 (Sqv +2 ∂k a 3 qv u kF,λ ) | qv aλh L 2 dt Pq,λ (t) = k=1 q ≥q−4 0

+

2 t ∇h qv (−)−1 ∂3 (qv ∂k aλ3 Sqv −1 u kF ) | qv aλh L 2 dt k=1 |q −q|≤5 0

def

1 2 = Gq,λ (t) + Gq,λ (t).

(3.42)

It is easy to observe that t 1 G (t) Sqv +2 ∇h a 3 L 2 (L ∞ ) qv u hF,λ L 4 (L 2 ) qv aλh L 4 (L 2 ) dt q,λ

t

q ≥q−4 0

q ≥q−4

×

t 0

v

h

q ≥q−4 0

0

h

v

v

h

1

1

∇h a 3 B qv u hF,λ L 4 (L 2 ) qv aλh L2 2 qv ∇h aλh L2 2 dt v

h

t

∇h a 3 B qv u hF,λ 2L 4 (L 2 ) dt h

∇h a 3 2B qv aλh 2L 2 dt

1 4

1 2

v

1

qv ∇h aλh L2 2 (L 2 ) . t

However, applying Bony’s decomposition (1.22) in the horizontal variables, we obtain t t 3 v h 2 ∇h a B q u F,λ L 4 (L 2 ) dt = ∇h a 3 B (qv u hF,λ )2 L 2 (L 1 ) dt h v h v 0 0 h h Sk−1 qv u hF L ∞ (L 4 (L 2 )) + Sk+2 qv u hF L ∞ (L 4 (L 2 )) k≥q −4 t

×

0

t

h

v

t

∇h a 3 B kh qv u hF,λ L 4 (L 2 ) dt , h

v

h

v


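whereas thanks to (3.6) and (3.36), we have
$$
\begin{aligned}
\int_0^t \|\nabla_h a^3\|_{\mathcal{B}}\, \|\Delta_k^h\Delta_{q'}^v u^h_{F,\lambda}\|_{L^4_h(L^2_v)}\,dt'
&\lesssim d_{k,q'}\, 2^{\frac k2}\, 2^{-\frac{q'}{2}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}} \int_0^t \|\nabla_h a^3\|_{\mathcal{B}}\, e^{-\lambda\int_0^{t'}\|\nabla_h a^3\|^2_{\mathcal{B}}\,d\tau}\, e^{-c\nu t' 2^{2k}}\,dt'\\
&\lesssim \frac{d_{k,q'}}{\sqrt{\lambda\nu}}\, 2^{-\frac k2}\, 2^{-\frac{q'}{2}}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}},
\end{aligned}
$$
which gives
$$
\int_0^t \|\nabla_h a^3\|_{\mathcal{B}}\, \|\Delta_{q'}^v u^h_{F,\lambda}\|^2_{L^4_h(L^2_v)}\,dt'
\lesssim \frac{1}{\sqrt{\lambda\nu}} \sum_{k\ge q'-4} d_{k,q'}^2\, 2^{-q'}\, \|u^h_{hh}\|^2_{\mathcal{B}_4^{-\frac12,\frac12}}
\lesssim \frac{d_{q'}^2}{\sqrt{\lambda\nu}}\, 2^{-q'}\, \|u^h_{hh}\|^2_{\mathcal{B}_4^{-\frac12,\frac12}}.
$$
Whence thanks to Definition 1.3, we infer
$$
|G^1_{q,\lambda}(t)| \lesssim \frac{d_q^2}{(\lambda\nu)^{\frac14}}\, 2^{-q}\, \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h_\lambda\|^{\frac12}_{\widetilde{L}^2_{t,f}(\mathcal{B})}\, \|\nabla_h a^h_\lambda\|^{\frac12}_{\widetilde{L}^2_t(\mathcal{B})}. \tag{3.43}
$$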
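The gain of $(\lambda\nu)^{-\frac12}$ in the weighted time integral used to reach (3.43) is where the exponential weight replaces a Gronwall argument. A minimal sketch of that step, with $F(t) \ge 0$ standing for $\|\nabla_h a^3(t)\|_{\mathcal{B}}$ (a placeholder introduced here for illustration) and using only $g_\lambda(t) \le \exp\bigl(-\lambda\int_0^t F^2\,d\tau\bigr)$:

```latex
% Cauchy-Schwarz in time separates the weight from the heat decay:
\[
\int_0^t F(t')\,e^{-\lambda\int_0^{t'}F^2\,d\tau}\,e^{-c\nu t' 2^{2k}}\,dt'
 \le \Bigl(\int_0^t F^2(t')\,e^{-2\lambda\int_0^{t'}F^2\,d\tau}\,dt'\Bigr)^{\frac12}
     \Bigl(\int_0^t e^{-2c\nu t' 2^{2k}}\,dt'\Bigr)^{\frac12}.
\]
% The first factor is an exact derivative:
\[
\int_0^t F^2(t')\,e^{-2\lambda\int_0^{t'}F^2\,d\tau}\,dt'
 = \frac{1}{2\lambda}\Bigl(1 - e^{-2\lambda\int_0^t F^2\,d\tau}\Bigr)
 \le \frac{1}{2\lambda},
\]
% and the second is bounded by (2c\nu 2^{2k})^{-1}. Altogether
\[
\int_0^t F(t')\,e^{-\lambda\int_0^{t'}F^2\,d\tau}\,e^{-c\nu t' 2^{2k}}\,dt'
 \le \frac{2^{-k}}{2\sqrt{c\,\lambda\,\nu}},
\]
% which turns the prefactor 2^{k/2} coming from (3.6) into 2^{-k/2}.
```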

2 (t), we write, using integration by parts and div u = 0, To deal with Gq,λ F t 2 qv ∂3 (qv aλ3 Sqv −1 u hF ) | qv (−)−1 ∇h divh aλh L 2 dt (t) = Gq,λ |q −q|≤5 t

qv (qv aλ3 Sqv −1 ∂3 u 3F ) | qv (−)−1 ∇h ∂3 aλh

+

0

def

=

0

L2

dt

a (t) + G b (t). G q,λ q,λ

(3.44)

Applying Lemma 2.1, (3.5) and div a = 0, we obtain a Gq,λ (t) qv ∂3 aλ3 L 2 (L 2 ) Sqv −1 u hF L 2 (L ∞ ) qv a h L ∞ 2 t (L ) t

|q −q|≤5

t

dq2 h √ 2−q u hhh − 1 , 1 a h L , ∞ (B ) ∇h aλ L 2t (B) t ν B4 2 2 and b Gq,λ (t)

|q −q|≤5

t 0

u 3F 2L ∞ qv aλh 2L 2 dt

dq2 2−q aλh 2

L t, f (B)

1 2

qv ∂3 aλ3 L 2 (L 2 ) t

∇h aλh , L 2 (B ) t

from which, we deduce that

2 h G (t) d 2 2−q √1 u h 1 1 a h ∇h aλh + a . (3.45) ∞ q −2,2 q,λ λ hh L t (B ) L 2t (B) L 2t, f (B) ν B4


Thanks to (3.42), (3.43) and (3.45), we arrive at 1 1 1 v,2 −q h h 2 h 2 P (t) dq2 2−q 2 u a ∇ a 1 1 h − , λ λ hh 1 q,λ L 2t (B) L 2t, f (B) B4 2 2 (λν) 4

1 h h ∇ + √ u hhh − 1 , 1 a h L a ∞ (B ) + aλ h λ L 2 (B ) . t L 2t, f (B) t ν B4 2 2

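This along with (3.37), (3.38) and (3.41) completes the proof of the proposition.

Corollary 3.3. Under the assumptions of Proposition 3.5, we have for all $q \in \mathbb{Z}$,
$$
\begin{aligned}
\Bigl| \sum_{\ell,k=1}^3 \int_0^t \bigl(\nabla_h\Delta_q^v(-\Delta)^{-1}\partial_\ell\partial_k(u^\ell_F a^k)\,\big|\,\Delta_q^v a^h\bigr)_{L^2}\,dt' \Bigr|
\lesssim d_q^2\, 2^{-q} \Bigl(& \frac{1}{\sqrt{\nu}} \Bigl( \|u^h_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a\|_{\widetilde{L}^2_t(\mathcal{B})} + \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|\nabla_h a^h\|_{\widetilde{L}^2_t(\mathcal{B})} \Bigr)\, \|a^h\|_{\widetilde{L}^\infty_t(\mathcal{B})}\\
&+ \frac{1}{\nu^{\frac14}}\, \|u^3_{hh}\|_{\mathcal{B}_4^{-\frac12,\frac12}}\, \|a^h\|^{\frac12}_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h a^h\|^{\frac32}_{\widetilde{L}^2_t(\mathcal{B})} \Bigr).
\end{aligned}
$$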

Proof. The proof of this corollary essentially follows from the proof of Proposition 3.5. Firstly thanks to (3.40), we have v,1 P (t) qv (u 3F ∇h a h ) 4 4 qv a h L 4 (L 4 (L 2 )) q,0 t

L t3 (L h3 (L 2v ))

h

v

+qv (u 3F a h ) L 2 (L 2 ) qv ∇h a h L 2 (L 2 ) , t

t

which along with (2.3), Lemma 3.2 and Corollary 3.1 ensures that 1 3 v,1 d2 P (t) q 2−q u 3hh − 1 , 1 a h 2 ∞ ∇h a h 2 . 1 q,0 L t (B ) L 2t (B) B4 2 2 ν4

(3.46)

Whereas thanks to (3.42), we obtain 1 G (t) Sqv +2 ∇h a 3 L 2 (L 2 (L ∞ )) qv u hF L 2 (L ∞ (L 2 )) qv a h L ∞ 2 q,0 t (L ) t

q ≥q−4

h

v

t

h

v

q−q 1 3 √ dq 2 2 dq 2−q u hhh − 1 , 1 a h L ∞ (B ) ∇h a L 2t (B) t ν B4 2 2 q ≥q−4

dq2

3 √ 2−q u hhh − 1 , 1 a h L , ∞ (B ) ∇h a L 2t (B) t ν B4 2 2

(3.47)

and it follows from (3.44) that 2 G (t) qv ∂3 a 3 L 2 (L 2 ) Sqv −1 u hF L 2 (L ∞ ) q,0 t

|q −q|≤5

t

+Sqv −1 u 3F L 2 (L ∞ ) qv a h L ∞ 2 t (L ) t

dq2 h √ 2−q u hh − 1 , 1 a h L . ∞ (B ) ∇h a L 2t (B) t ν B4 2 2 This together (3.38), (3.46) and (3.47) completes the proof of the corollary.


With the above preparations, we are in a position to complete the proof of Theorem 1.3.

Proof of Theorem 1.3. Motivated by [11], we shall look for a solution of $(ANS_\nu)$ of the form (1.15). Then $w$ satisfies (1.17). Again we shall use the classical Friedrichs' regularization method to construct the approximate solutions of (1.17). For simplicity, we just outline it here. In order to do so, let $(P_n)_{n\in\mathbb{N}}$ be the projection operator given by (2.17); we define $w_n$ via
$$
(\widetilde{ANS}_{\nu,n})\quad
\begin{cases}
\partial_t w_n - \nu\Delta_h w_n + P_n(w_n\cdot\nabla w_n) + P_n(w_n\cdot\nabla u_{F,n}) + P_n(u_{F,n}\cdot\nabla w_n)\\
\qquad = -P_n(u_{F,n}\cdot\nabla u_{F,n}) - P_n\nabla(-\Delta)^{-1}\partial_\ell\partial_k\bigl((u^\ell_{F,n}+w^\ell_n)(u^k_{F,n}+w^k_n)\bigr),\\
\operatorname{div} w_n = 0,\\
w_n|_{t=0} = P_n(u_{\ell h}) \stackrel{\rm def}{=} P_n(u_0 - u_{hh}),
\end{cases}
$$
where $u_{F,n} \stackrel{\rm def}{=} (Id - S^v_{j_n})u_F$ with $j_n = -\log_2 n$. Because of properties of the $L^2$ and $L^1$ functions whose Fourier transforms are supported in the ball $B(0,n)$, the system $(\widetilde{ANS}_{\nu,n})$ appears to be an ordinary differential equation in the space $L^2_n$ defined by (2.19). This ordinary differential equation is globally wellposed (one may check [11] for the details).

Now let us turn to the uniform estimate of $w_n$. For a clear presentation, we shall neglect the subscript $n$. Toward this, as in the proof of Theorem 1.2, for arbitrary positive numbers $\lambda_0$, $\lambda_1$ and $\lambda_2$, which we shall choose later on, we denote
$$
\begin{aligned}
&f_1(t) \stackrel{\rm def}{=} \|w^3(t)\|^2_{\mathcal{B}}\, \|\nabla_h w^3(t)\|^2_{\mathcal{B}}, \qquad
f_2(t) \stackrel{\rm def}{=} \|u^3_F(t)\|^2_{L^\infty} + \|\nabla_h w^3(t)\|^2_{\mathcal{B}},\\
&g_\lambda(t) \stackrel{\rm def}{=} \exp\Bigl(-\lambda_0\|u^3_{hh}\|^4_{\mathcal{B}_4^{-\frac12,\frac12}} - \int_0^t \bigl(\lambda_1 f_1(t') + \lambda_2 f_2(t')\bigr)\,dt'\Bigr),\\
&w_\lambda(t,x) \stackrel{\rm def}{=} g_\lambda(t)w(t,x) \quad\text{and}\quad u_{F,\lambda}(t,x) \stackrel{\rm def}{=} g_\lambda(t)u_F(t,x).
\end{aligned} \tag{3.48}
$$

Then wλh solves ⎧ h ∂t wλh + (λ1 f 1 (t) + λ2 f 2 (t))wλh − νh wλh + Pn (w · ∇wλh ) + P ⎪

n (w · ∇u F,λ ) ⎪ ⎪ ⎪ ⎨ +Pn (u F · ∇wλh ) = −Pn (u F · ∇u hF,λ )−Pn ∇h (−)−1 ∂ ∂k (u F + w )(u kF,λ + wλk ) , div wλ = 0, ⎪ ⎪ ⎪ 3 4 ⎪ w | = P (u λ t=0 n h ) exp −λ0 u hh − 1 , 1 , ⎩ B4

2 2

from which, we get by a standard energy estimate that t 1 v h q wλ (t)2L 2 + (λ1 f 1 (t ) + λ2 f 2 (t ))qv wλh 2L 2 dt + ν∇h wλh 2L 2 (L 2 ) t 2 0 1 = Pn (u h )2L 2 exp −2λ0 u 3hh 4 − 1 , 1 2 B4 2 2 t t v v q (w · ∇wλh ) | qv wλh L 2 dt − q (w · ∇u hF,λ ) | qv wλh L 2 dt − 0 0 t t v v h v h q (u F · ∇wλ ) | q wλ L 2 dt − q (u F · ∇u hF,λ ) | qv wλh L 2 dt − 0 0 t

v −1 k q ∇h (−) ∂ ∂k (u F + w )(u F,λ + wλk ) | qv wλh L 2 dt . − (3.49) 0


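Applying Proposition 2.1 for $a = w$ and $b = w^h$ gives
$$
\Bigl| \int_0^t \bigl(\Delta_q^v(w\cdot\nabla w^h_\lambda)\,\big|\,\Delta_q^v w^h_\lambda\bigr)_{L^2}\,dt' \Bigr| \lesssim d_q^2\, 2^{-q}\, \|w^h\|_{\widetilde{L}^\infty_t(\mathcal{B})}\, \|\nabla_h w^h_\lambda\|^2_{\widetilde{L}^2_t(\mathcal{B})}.
$$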

Equation (3.24) applied with a = w and b = w h yields

1 1 3 2 2 dt dq2 2−q 1 w h ∇h wλh ∞ L t (B ) L 2t (B) ν4 1 1 1 1 h 2 2 u hhh − 1 , 1 . + 3 wλh wλh + 1 w h L ∞ (B ) ∇h wλ 2 ∞ L t (B ) L t (B ) t L 2t, f (B) B4 2 2 2 ν4 ν 1

t

qv (w · ∇u hF,λ ) | qv wλh

0

L2

Proposition 3.3 applied with b = w h gives

1 1 3 2 2 dt dq2 2−q 1 w h ∇h wλh ∞ L t (B ) L 2t (B) ν4 1 h h + 1 w h L ∞ (B ) ∇h wλ 2 (B ) u hh − 1 , 1 , L t t B4 2 2 2 ν

t

0

qv (u F · ∇wλh ) | qv wλh

L2

while (3.14) applied with a = w h gives

t

qv (u F · ∇u hF,λ ) | qv wλh

0

dq2 2−q

1 ν

1 2

L2

dt

u hhh 2 − 1 , 1 ∇h wλh + L 2 (B ) B4

t

2 2

1

u hhh

−1,1 B4 2 2

1 4

λ0 ν

wλh L ∞ (B ) . t

Finally Proposition 2.2 applied with u = w, and Proposition 3.4, Proposition 3.5 applied with a = w claims that

t 0

qv ∇h (−)−1 ∂ ∂k (u F + w )(u kF,λ + wλk ) | qv wλh L 2 dt

h 2 dq2 2−q w h L ∞ (B ) ∇h wλ 2

L t (B )

t

+ + +

1 1

ν2 1 ν

1 2

1

3

2 + wλh

L 2t, f (B) 1

2 ∇h wλh 2

L t (B )

u hhh 2 − 1 , 1 ∇h wλh + wλh 2 L 2 (B )

∇h wλh L 2 (B )

u hhh

1

B4

−1,1 B4 2 2

1 (λ2 ν)

1 4

u hhh

L t, f (B)

t

2 2

2

h w h L + ∞ (B ) ∇h wλ L 2 (B )

−1,1 B4 2 2

t

t

1

2 wλh

L 2t, f (B) 2

t

1 4

λ0 ν

1

2 ∇h wλh 2

L t (B )

+

u hhh

− 21 , 21

B4

1 ν

1 2

u hhh

wλh L ∞ (B )

−1,1 B4 2 2

t

wλh 2

L t, f (B) 2

.


Plugging all the above estimates into (3.49), we arrive at " " √ 2λ1 wλh 2λ2 wλh + 2ν∇h wλh wλh L ∞ (B ) + 2 2 L 2 (B ) L t, f (B)

t

1 h 2 ≤ u h B + C w h ∞ L 2t, f (B) 2

+ +

1 1

ν8 1 ν

1 4

+

1

λ0 ν 2

B4

∇h wλh 2

1 2

3

ν8

t

L t (B )

+

1

u hhh

ν

ν

wλh ∞

4 ∇h wλh 2

1

2 wλh

L 2t, f (B) 2 1 4

∇h wλh

L 2t, f (B) 1

1

1 2

1 8

L t (B )

1

L t (B )

1

4 4 wλh ∇h wλh L 2t (B) L 2t, f (B) 2

1 2

−1,1 B4 2 2

1 4

1 4

L 2t, f (B)

1 4

(λ2 ν)

t

wλh ∞

+

1

2 2

1

1

2 ∇h wλh L ∞ (B ) L 2 (B )

h

1

+

L t (B )

1 2

1 8

+ u hhh 2 − 1 , 1

3 4

L t (B )

1

1

L t (B )

w h ∞ w

t

2 ∇h wλh 2

1 4

3

∇h wλh + wλh L 2 (B )

1

2 +wλh

t

2

1 4

L t (B )

1

L t, f (B)

1

∇h wλh 2

L t (B )

.

(3.50)

For some small enough positive constant ε1 , which will be determined later on, we define √ 1 def ν . (3.51) ν∇h w h ≤ min ε , T ∗ = max t : w h L ∞ (B ) + 1 2 L t (B ) t 100C 2 1

1

Applying (1.20) and Young’s inequality, a p b q ≤ 1 p

a, b, p, q satisfying 1

+

1 q

L t (B )

+

b q

for any positive numbers

= 1, gives 1

2 Cw h ∞

a p

∇h wλh ≤ Cε12 L 2 (B )

√

t

ν∇h wλh , L 2 (B ) t

√ ν ∇h wλh ≤ C1 ν + , L 2t (B) 10 t, f 1 1 √ 1 1 ν h 2 h 2 − 12 h Cwλ ∇h wλ 2 ∇h wλh ≤ C1 ν wλ + , 2 (B ) L 2t (B) 2 L L ( B ) 10 L t, f (B) t t, f 2 2 √ 1 C h 21 ν h 2 h ∇h wλh u ∇ w ≤ C c u + , 1 1 h λ 2 0 1 hh −2,2 hh 1 − 21 , 21 L 2t (B) L ( B ) 10 t B ν4 B 4 1 4

3 4

− 32

Cwλh ∇h wλh 2 L t (B ) L 2t, f (B)

wλh L 2 (B )

4

and 1

Cu hhh 2 − 1 , 1 B4

+ +

1 3

ν8

1 ν

2 2

1 8

1

L t (B )

1 4

L t (B )

1

wλh

L 2t, f (B)

L t (B )

+

1

1

4 wλh 1

(λ2 ν) 8

4 ∇h wλh 2

1 4

wλh ∞

L 2t, f (B)

1 1

ν4

+

ν

w h ∞

1

L t (B )

+

1 1

1

1

− 21 , 21

+ν

1 wλh (1 + (λ2 ν)− 2 ) + L 2 (B ) t, f 2

√

L 2t, f (B) 2 1 2

∇h wλh 2

L t (B )

1

λ08 ν 2

B4

1

2 wλh

L t (B )

≤ C1 (1 + ε1 + c1 ε1 + (λ04 ν)−1 )u hhh − 12

1 4

1 2

4 ∇h wλh 2

2

1

3

4 w h ∞

+

2 wλh ∞

L t (B )

3 1 w h + ν − 2 wλh ∞ L 2t, f (B) 10 λ L t (B) 1

ν ∇h wλh , L 2t (B) 10


for t < T ∗ . Without loss of generality, we may assume that c1 , ε1 ≤ 21 . Then it follows from (1.19) and (3.50) that " " √ 2λ1 wλh 2λ2 wλh + 2ν∇h wλh wλh L ∞ (B ) + 2 2 L 2 (B ) L t, f (B)

t

L t, f (B)

1

t

2

3 1 + 2C1 ν − 2 wλh w h ∞ L 2t, f (B) 10 λ L t (B) 1 √ 1 1 ν 3 +C1 ν − 2 (2 + (λ2 ν)− 2 )wλh + , ∇h wλh L 2t (B) L 2t, f (B) 5 2 1 4

≤ C1 (3 + (λ0 ν)−1 )u 0h

−1,1 B4 2 2

+

for t < T ∗ . Taking λ0 = ν −4 , λ1 = 4C12 ν −3 and λ2 = 9C12 ν −1 in the above inequality, we infer that √ wλh L ν∇h wλh ≤ 5C1 u 0h − 1 , 1 for t < T ∗ , ∞ (B ) + L 2 (B ) t

B4

t

2 2

which along with a similar derivation of (2.26) ensures that w h L ∞ (B ) + t

√

ν∇h w h ≤ 5C1 u 0h L 2 (B )

−1,1 B4 2 2

t

+ν −2 u 3hh 2 − 1 , 1 + B4

2 2

where we used (3.5) so that

t 0

t 0

exp 9C12 ν −4 u 3hh 4 − 1 , 1

(ν −3 w 3 (t )2B + ν −1 )∇h w 3 (t )2B dt ,

B4

2 2

(3.52)

u 3F (t )2L ∞ dt ν −1 u 3hh 2 − 1 , 1 . B4

2 2

Now let us turn to the uniform estimate of $w^3$. Indeed, we get by first applying $\Delta_q^v$ to the $w^3$ equation of $(ANS_{\nu,n})$ and then taking the $L^2$ inner product of the resulting equation with $\Delta_q^v w^3$ that

t v 1 3 w (t)2L 2 + ν∇h w 3 2L 2 (L 2 ) + q (w · ∇w 3 ) | qv w 3 L 2 dt t 2 0 t t v v q (w · ∇u 3F ) | qv w 3 L 2 dt + q (u F · ∇w 3 ) | qv w 3 L 2 dt + 0 0 t 1 v qv (u F · ∇u 3F ) | qv w 3 L 2 dt = q Pn (u 3h )2L 2 − 2 0 t ∇h ∂ ∂k (−)−1 qv (u F + w )(u kF + w k )) | qv w h L 2 dt , + (3.53) 0

where in the last step, we used div w = 0 and integration by parts. Applying Proposition 2.1 for a = w, b = w3 and g = 1 gives t qv (w · ∇w 3 ) | qv w 3 L 2 dt 0

1 1 1 3 3 2 3 2 2 2 dq2 2−q w h ∇h w h w ∇ w h ∞ ∞ (B ) L t (B ) L L 2t (B) L 2t (B) t 3 . +∇h w h w 3 L ∞ (B ) ∇h w L 2 (B ) L 2 (B ) t

t

t


Applying (3.25) for a = w and b = w 3 implies t 1 3 dq2 2 2 ∇h w h qv (w · ∇u 3F ) | qv w 3 L 2 dt 1 2−q u hhh − 1 , 1 w 3 ∞ L t (B ) L 2t (B) B4 2 2 0 ν4 1 1 3 2 3 2 +u 3hh − 1 , 1 ∇h w h w ∇ w h 2 ∞ 2 L (B ) B4

2 2

1 2

L t (B )

t

1 2

+w h ∞

L t (B )

∇h w h 2

L t (B )

L t (B )

, ∇h w 3 2 L (B ) t

whereas applying Proposition 3.3 for b = w 3 and g = 1 ensures t

1 1 3 2 2 qv (u F · ∇w 3 ) | qv w 3 L 2 dt dq2 2−q u hhh − 1 , 1 w 3 ∇h w 3 ∞ 1 L t (B ) L 2t (B) B4 2 2 ν 4 0 1 3 . + 1 w 3 L ∞ (B ) ∇h w 2 L t (B ) t ν2 Applying (3.15) for a = w 3 gives t qv (u F · ∇u 3F ) | qv w 3 L 2 dt 0

dq2 2−q u hhh

−1,1 B4 2 2

1

u 3hh

1 3 w ∇h w 3 + ∞ (B ) . 2 L L t (B ) t ν ν

−1,1 B4 2 2

1 2

Finally applying Corollary 2.2 for u = w, Corollary 3.2 and Corollary 3.3 for a = w claims that t ∇h ∂ ∂k (−)−1 qv (u F + w )(u kF + wk )) | qv w h L 2 dt 0

1 3 1 1 2 2 dq2 2−q w h ∇h w h 22 + w h 2 ∞ ∇h w h w3 2 ∞ ∇h w3 2 L∞ ( B ) L (B ) L (B ) L 2 (B ) L (B ) L (B ) t t

+ + +

1 1

ν2 1

1 ν2

1 1

ν2

t

t

t

t

1 u hhh 2 − 1 , 1 ∇h w h 2 + u h 1 1 u 3hh − 1 , 1 w h L t (B) ν hh − 2 , 2 L∞ t (B ) 2 2 B B 2 2 B 4

4

u hhh

−1,1 B4 2 2

u 3hh

− 21 , 21

B4

w h ∇h w 2 ∞

L t (B )

L t (B )

+

1 ν4

w h ∇h w h 2 ∞

1

u 3hh

−1,1 B4 2 2

w h 2 ∞

L t (B )

Substituting all the above estimates into (3.53) and using (1.19), we obtain √ 2ν∇h w3 L 2t (B) 1 4 ≤ C2 u 30 − 1 , 1 + C2 w h ∞

w3 L ∞ (B ) + t

B4

+∇h w h 2

1 2

L t (B)

1 4

+w h ∞

L t (B)

2 2

1 2

L t (B)

1

w3 ∞

L t (B) 3 4

∇h w h 2

L t (B)

4 ∇h w h 2

L t (B)

1 2

∇h w3 2 w3 ∞

L t (B)

L t (B)

1 2

L t (B)

1 4

1

4 w3 ∞

+ w h ∞

L t (B)

1 4

∇h w3 2

L t (B)

3

4 ∇h w3 2

L t (B)

∇h w h L 2 (B) t

3

2 ∇h w h 2

.

L t (B )

L t (B )

4

1

L t (B )

1

+u hhh 2 − 1 , 1 B4

+

1 ν

1 4

1

ν

1

+

1 2

3

L t (B)

(∇h w 2

L t (B)

1

1 8

L t (B)

∇h w 2

L t (B)

1

u 3hh 2 − 1 , 1 B4 2 2

L t (B)

1 ν

1 4

1 2

4 ∇h w3 2

L t (B)

1 2

∇h w 2

+

1 2

1

L t (B)

∇h w 2 3

L t (B)

+

ν

1 2

2 ∇h w3 2

L t (B)

) 1

1

L t (B) 3

1

L t (B)

L t (B)

4 w3 ∞

1 4

h

ν

1

2 w3 ∞

1 4

+ ∇h w 2 1

2 ∇h w h 2

1 4

w ∞

+u hhh 2 − 1 , 1 B4 2 2

4 ∇h w3 2

h

1

h

1 8

L t (B)

L t (B)

1

3

4 w3 ∞

w ∞

1

ν

ν

1

1 8

1 2

h

+u 3hh 2 − 1 , 1 B4 2 2 +

1

2 2

1 ν

1 4

+

ν

1 8

1

L t (B)

1 2

L t (B)

1

2 w h ∞

L t (B)

4 ∇h w h 2

1

w ∞ h

3

4 w h ∞

+

2 ∇h w h 2

L t (B)

L t (B)

1 ν

1 2

1

2 w3 ∞

L t (B)

.

Thanks to (1.20) and (3.51), we infer that for t < T ∗ , √ w3 L 2ν∇h w 3 ∞ (B ) + L 2t (B) t √ √ √ √ ≤ C2 (1 + ε1 + c1 ε1 + c1 )u 30 − 1 , 1 + (ε1 + 3 c1 + 3 ε1 )w 3 L ∞ (B ) +(ε1 +

√

c1 ε1 +

√

B4

t

2 2

√ √ √ √ holds. c1 ε1 )ν +(ε1 + 4 ε1 +2 c1 + c1 ε1 ) ν∇h w 3 2 L (B ) t

(3.54) Choosing c1 in (1.20) and ε1 in (3.51) small enough so that √ √ 1 ε1 + 3 c1 + 3 ε1 ≤ and 2C2 √ √ √ √ √ 1 1 ε1 + c1 ε1 + c1 ε1 ≤ , ε1 + 4 ε1 + 2 c1 + c1 ε1 ≤ , 2C2 2C2 then we deduce from (3.54) that √ w 3 L ν∇h w 3 ≤ 4C2 u 30 − 1 , 1 + ν, (3.55) ∞ (B ) + L 2 (B ) ε1 +

√

c1 ε1 +

√ c1 ≤ 1,

t

B4

t

2 2

for t < T ∗ . Thanks to (3.52), (3.55), and a similar proof of (2.28), there exist positive constants K , M which depend on C1 , C2 such that

√ h −4 3 4 ν∇h w h ≤ K u exp Mν u w h L 1 1 ∞ (B ) + 1 1 2 − , 0 − , 0 L (B ) t

t

B4

2 2

B

2 2

4 1 1 ν for t < T ∗ , (3.56) ≤ min ε1 , 2 100C 2 1 1 provided that we take c1 = 2K in (1.20). Equation (3.56) contradicts min ε1 , 100C 2 ∗ S ν,n ) satisfies (3.52) if T < ∞. This shows that the solution sequence defined by ( AN (3.55) and (3.56) for t = ∞. With (3.55) and (3.56), one can follow the compactness S ν ) in the function argument in [11] to prove the global existence of solutions to ( AN space B(∞). Moreover, the (1.21) holds. And the uniqueness part has been proved in [11]. This completes the proof of Theorem 1.3.

Acknowledgements. The authors would like to thank the anonymous referee for many profitable suggestions. P. Zhang is partially supported by NSF of China under Grant 10421101 and 10931007, and one-hundred talents’ plan from the Chinese Academy of Sciences under Grant GJHZ200829. M. Paicu gratefully acknowledges the hospitality of Morningside Center of Mathematics, CAS, from Beijing.


References

1. Bahouri, H., Chemin, J.Y., Danchin, R.: Fourier Analysis and Nonlinear Partial Differential Equations, Grundlehren der Mathematischen Wissenschaften, Vol. 343, Berlin-Heidelberg-New York: Springer, 2011
2. Bony, J.M.: Calcul symbolique et propagation des singularités pour les équations aux dérivées partielles non linéaires. Ann. Sci. École Norm. Sup. 14(4), 209–246 (1981)
3. Cannone, M., Meyer, Y., Planchon, F.: Solutions autosimilaires des équations de Navier-Stokes, Séminaire “Équations aux Dérivées Partielles de l’École Polytechnique”, Exposé VIII, 1993–1994
4. Cannone, M.: A generalization of a theorem by Kato on Navier-Stokes equations. Rev. Mat. Iberoamericana 13, 515–541 (1997)
5. Chemin, J.Y.: Théorèmes d’unicité pour le système de Navier-Stokes tridimensionnel. J. Anal. Math. 77, 27–50 (1999)
6. Chemin, J.Y.: Localization in Fourier space and Navier-Stokes system. In: Phase Space Analysis of Partial Differential Equations, Vol. 1, Proceedings 2004, CRM series, Pisa: Pubbl. Cent. Ric. Mat. Ennio Giorgi, pp. 53–136
7. Chemin, J.Y., Desjardins, B., Gallagher, I., Grenier, E.: Fluids with anisotropic viscosity. Modélisation Mathématique et Analyse Numérique 34, 315–335 (2000)
8. Chemin, J.Y., Desjardins, B., Gallagher, I., Grenier, E.: Mathematical Geophysics. An Introduction to Rotating Fluids and the Navier-Stokes Equations. Oxford Lecture Series in Mathematics and its Applications, 32, Oxford: Clarendon Press, Oxford University Press, 2006
9. Chemin, J.Y., Gallagher, I.: Large, global solutions to the Navier-Stokes equations, slowly varying in one direction. Trans. Amer. Math. Soc. 362, 2859–2873 (2010)
10. Chemin, J.Y., Lerner, N.: Flot de champs de vecteurs non lipschitziens et équations de Navier-Stokes. J. Diff. Eqs. 121, 314–328 (1995)
11. Chemin, J.Y., Zhang, P.: On the global wellposedness to the 3-D incompressible anisotropic Navier-Stokes equations. Commun. Math. Phys. 272, 529–566 (2007)
12. Ekman, V.W.: On the influence of the earth’s rotation on ocean currents. Arkiv. Matem. Astr. Fysik. Stockholm 2(11), 1–52 (1905)
13. Fujita, H., Kato, T.: On the Navier-Stokes initial value problem I. Arch. Rat. Mech. Anal. 16, 269–315 (1964)
14. Grenier, E., Masmoudi, N.: Ekman layers of rotating fluids, the case of well prepared initial data. Comm. Par. Diff. Eqs. 22, 953–975 (1997)
15. Gui, G., Zhang, P.: Stability to the global large solutions of 3-D Navier-Stokes equations. Adv. Math. 225, 1248–1284 (2010)
16. Iftimie, D.: The resolution of the Navier-Stokes equations in anisotropic spaces. Rev. Mat. Iberoamericana 15, 1–36 (1999)
17. Iftimie, D.: A uniqueness result for the Navier-Stokes equations with vanishing vertical viscosity. SIAM J. Math. Anal. 33, 1483–1493 (2002)
18. Koch, H., Tataru, D.: Well-posedness for the Navier-Stokes equations. Adv. Math. 157, 22–35 (2001)
19. Kukavica, I., Ziane, M.: One component regularity for the Navier-Stokes equations. Nonlinearity 19, 453–469 (2006)
20. Lemarié-Rieusset, P.G.: Recent developments in the Navier-Stokes problem. Chapman & Hall/CRC Research Notes in Mathematics, 431, Boca Raton, FL: Chapman & Hall/CRC, 2002
21. Leray, J.: Sur le mouvement d’un liquide visqueux remplissant l’espace. Acta Math. 63, 193–248 (1934)
22. Pedlosky, J.: Geophysical Fluid Dynamics. Berlin-Heidelberg-New York: Springer, 1979
23. Paicu, M.: Équation anisotrope de Navier-Stokes dans des espaces critiques. Rev. Mat. Iberoamericana 21, 179–235 (2005)
24. Paicu, M., Zhang, P.: Global solutions to the 3-D incompressible inhomogeneous Navier-Stokes system. Preprint 2010
25. Planchon, F.: Asymptotic behavior of global solutions to the Navier-Stokes equations in R3. Rev. Mat. Iberoamericana 14, 71–93 (1998)
26. Zhang, T.: Global wellposedness problem for the 3-D incompressible anisotropic Navier-Stokes equations in an anisotropic space. Commun. Math. Phys. 287, 211–224 (2009)
27. Zhang, T.: Erratum to: Global wellposed problem for the 3-D incompressible anisotropic Navier-Stokes equations in an anisotropic space. Commun. Math. Phys. 295, 877–884 (2010)

Communicated by P. Constantin

Commun. Math. Phys. 307, 761–790 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1351-5


Orthogonal and Symplectic Matrix Models: Universality and Other Properties

M. Shcherbina

Institute for Low Temperature Physics Ukr.Ac.Sci., Kharkov, Ukraine. E-mail: [email protected]

Received: 19 April 2010 / Accepted: 26 May 2011
Published online: 18 September 2011 – © Springer-Verlag 2011

Abstract: We study orthogonal and symplectic matrix models with polynomial potentials and multi-interval supports of the equilibrium measures. For these models we find the bounds (similar to those for the hermitian matrix models) for the rate of convergence of linear eigenvalue statistics and for the variance of linear eigenvalue statistics and find the logarithms of partition functions up to the order O(1). We also prove the universality of local eigenvalue statistics in the bulk.

1. Introduction and Main Results

In this paper we consider ensembles of random matrices, whose joint eigenvalue distribution is

$$p_{n,\beta}(\lambda_1,\dots,\lambda_n) = Q_{n,\beta}^{-1}[V]\prod_{i=1}^{n} e^{-n\beta V(\lambda_i)/2}\prod_{1\le i<j\le n}|\lambda_i-\lambda_j|^{\beta} =: Q_{n,\beta}^{-1}[V]\,e^{\beta H(\lambda_1,\dots,\lambda_n)/2}, \tag{1.1}$$

where the function $H$, which we call the Hamiltonian to stress the analogy with statistical mechanics, and the normalizing constant $Q_{n,\beta}[V]$ have the form

$$H(\lambda_1,\dots,\lambda_n) = -n\sum_{i=1}^{n}V(\lambda_i) + \sum_{i\ne j}\log|\lambda_i-\lambda_j|, \qquad Q_{n,\beta}[V] = \int e^{\beta H(\lambda_1,\dots,\lambda_n)/2}\,d\bar\lambda. \tag{1.2}$$
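The two ways of writing the weight in (1.1) can be compared numerically. The following is a minimal sketch, not part of the paper: the function names, the quadratic test potential and the sample points are illustrative assumptions. It evaluates the unnormalized density once as the explicit product and once as $e^{\beta H/2}$ with $H$ from (1.2); the sum over $i \ne j$ counts every pair twice, which is what turns the factor $\beta/2$ into the exponent $\beta$.

```python
import math

def hamiltonian(lams, V):
    """H from (1.2): -n * sum_i V(l_i) + sum_{i != j} log|l_i - l_j|."""
    n = len(lams)
    confinement = -n * sum(V(x) for x in lams)
    repulsion = sum(
        math.log(abs(lams[i] - lams[j]))
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return confinement + repulsion

def weight_product_form(lams, V, beta):
    """Unnormalized density of (1.1), written as the explicit product."""
    n = len(lams)
    weight = 1.0
    for x in lams:
        weight *= math.exp(-n * beta * V(x) / 2)
    for i in range(n):
        for j in range(i + 1, n):
            weight *= abs(lams[i] - lams[j]) ** beta
    return weight

def weight_gibbs_form(lams, V, beta):
    """The same unnormalized density, written as exp(beta * H / 2)."""
    return math.exp(beta * hamiltonian(lams, V) / 2)

def quadratic_potential(x):
    """Illustrative potential V(x) = x^2 / 2 (an assumption of this sketch)."""
    return x * x / 2

sample_points = [-1.3, -0.2, 0.4, 1.1]  # arbitrary distinct test points
for beta in (1, 2, 4):
    product = weight_product_form(sample_points, quadratic_potential, beta)
    gibbs = weight_gibbs_form(sample_points, quadratic_potential, beta)
    print(beta, product, gibbs)
```

The two printed values agree for each $\beta$ up to rounding.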

The function $V$ (called the potential) is a real valued Hölder function satisfying the condition

$$V(\lambda) \ge 2(1+\epsilon)\log(1+|\lambda|). \tag{1.3}$$


This distribution can be considered for any $\beta > 0$, but the cases $\beta = 1, 2, 4$ are especially important, since they correspond to real symmetric, hermitian, and symplectic matrix models respectively. We will also consider the marginal densities of (1.1) (correlation functions)

$$p^{(n)}_{l,\beta}(\lambda_1,\dots,\lambda_l) = \int_{\mathbb{R}^{n-l}} p_{n,\beta}(\lambda_1,\dots,\lambda_l,\lambda_{l+1},\dots,\lambda_n)\,d\lambda_{l+1}\dots d\lambda_n, \tag{1.4}$$

and denote

$$\mathbf{E}_\beta\{(\dots)\} = \int (\dots)\,p_{n,\beta}(\lambda_1,\dots,\lambda_n)\,d\bar\lambda. \tag{1.5}$$

It is known (see [2,13]) that if $V$ is a Hölder function, then the first marginal density $p^{(n)}_{1,\beta}$ converges weakly to the function $\rho$ (equilibrium density) with a compact support $\sigma$. The density $\rho$ maximizes the functional, defined on the class $\mathcal{M}_1$ of positive unit measures on $\mathbb{R}$,

$$\mathcal{E}_V(\rho) = \max_{m\in\mathcal{M}_1}\Big\{L[dm,dm] - \int V(\lambda)\,m(d\lambda)\Big\} = \mathcal{E}[V], \tag{1.6}$$

where we denote

$$L[dm,dm] = \int \log|\lambda-\mu|\,dm(\lambda)\,dm(\mu), \qquad L[f,g] = \int \log|\lambda-\mu|\,f(\lambda)g(\mu)\,d\lambda\,d\mu. \tag{1.7}$$

The support $\sigma$ and the density $\rho$ are uniquely defined by the conditions:

$$v(\lambda) := 2\int \log|\mu-\lambda|\,\rho(\mu)\,d\mu - V(\lambda) = \sup v(\lambda) := v^*, \quad \lambda\in\sigma,$$
$$v(\lambda) \le \sup v(\lambda), \quad \lambda\notin\sigma, \qquad \sigma = \operatorname{supp}\{\rho\}. \tag{1.8}$$
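Conditions (1.8) can be checked by hand in the simplest case. The sketch below is an illustration under stated assumptions, not the paper's method: it takes $V(\lambda) = \lambda^2/2$, for which the equilibrium density is the semicircle $\rho(\mu) = \sqrt{4-\mu^2}/(2\pi)$ on $\sigma = [-2,2]$, and evaluates $v(\lambda)$ with a plain midpoint rule (the function names and the number of nodes are arbitrary choices). On $\sigma$ the value should be the constant $v^* = -1$, and outside $\sigma$ it should be smaller.

```python
import math

def log_potential(lam, num_nodes=100_000):
    """Midpoint-rule value of  int log|lam - mu| rho(mu) dmu  for the
    semicircle density on [-2, 2], using the substitution mu = 2 cos(theta),
    under which rho(mu) dmu = (2 / pi) sin^2(theta) dtheta."""
    h = math.pi / num_nodes
    total = 0.0
    for k in range(num_nodes):
        theta = (k + 0.5) * h
        distance = abs(lam - 2.0 * math.cos(theta))
        if distance == 0.0:
            continue  # skip a node that lands exactly on the log singularity
        total += math.log(distance) * math.sin(theta) ** 2
    return total * h * 2.0 / math.pi

def v(lam):
    """v(lam) from (1.8) for the illustrative potential V(lam) = lam^2 / 2."""
    return 2.0 * log_potential(lam) - lam * lam / 2.0

for point in (0.0, 1.0, 3.0):
    print(point, v(point))
```

The first two points lie in the support and give values close to $-1$; the third lies outside and gives a strictly smaller value, as the second line of (1.8) requires.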

For $\beta = 2$ it is well known (see [15]) that all correlation functions (1.4) can be represented as

$$p^{(n)}_{l,\beta}(\lambda_1,\dots,\lambda_l) = \frac{(n-l)!}{n!}\det\{K_{n,2}(\lambda_j,\lambda_k)\}_{j,k=1}^{l}, \tag{1.9}$$

where

$$K_{n,2}(\lambda,\mu) = \sum_{l=0}^{n-1}\psi^{(n)}_l(\lambda)\psi^{(n)}_l(\mu). \tag{1.10}$$

This function is known as a reproducing kernel of the orthonormalized system

$$\psi^{(n)}_l(\lambda) = \exp\{-nV(\lambda)/2\}\,p^{(n)}_l(\lambda), \quad l = 0,\dots, \tag{1.11}$$

in which $\{p^{(n)}_l\}_{l=0}^{n}$ are orthogonal polynomials on $\mathbb{R}$ associated with the weight $w_n(\lambda) = e^{-nV(\lambda)}$, i.e.,

$$\int p^{(n)}_l(\lambda)\,p^{(n)}_m(\lambda)\,w_n(\lambda)\,d\lambda = \delta_{l,m}. \tag{1.12}$$
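For a concrete picture of (1.10)–(1.12), the sketch below builds the functions $\psi^{(n)}_l$ in the Gaussian case $V(\lambda) = \lambda^2/2$, which is an assumption made here for illustration. In that case the orthonormal polynomials are rescaled Hermite polynomials and satisfy the three-term recurrence $\lambda p_l = a_{l+1}p_{l+1} + a_l p_{l-1}$ with $a_l = \sqrt{l/n}$. The function names, the integration window and the step count are choices of this sketch. It checks orthonormality by the trapezoid rule and that $\int K_{n,2}(\lambda,\lambda)\,d\lambda = n$.

```python
import math

def psi_values(n, lam, count):
    """Values psi_0(lam), ..., psi_{count-1}(lam) for V(lam) = lam^2 / 2,
    generated by the recurrence lam*psi_l = a_{l+1} psi_{l+1} + a_l psi_{l-1}
    with a_l = sqrt(l / n) and psi_0 = (n / 2 pi)^{1/4} exp(-n lam^2 / 4)."""
    vals = [(n / (2 * math.pi)) ** 0.25 * math.exp(-n * lam * lam / 4)]
    for l in range(count - 1):
        previous = vals[l - 1] if l > 0 else 0.0
        a_l = math.sqrt(l / n)
        a_next = math.sqrt((l + 1) / n)
        vals.append((lam * vals[l] - a_l * previous) / a_next)
    return vals

def kernel(n, lam, mu):
    """Reproducing kernel K_{n,2}(lam, mu) of (1.10)."""
    return sum(p * q for p, q in zip(psi_values(n, lam, n), psi_values(n, mu, n)))

def gram_and_trace(n, half_width=6.0, steps=6000):
    """Trapezoid-rule Gram matrix of psi_0..psi_{n-1} and int K(lam, lam) dlam."""
    h = 2 * half_width / steps
    gram = [[0.0] * n for _ in range(n)]
    trace = 0.0
    for s in range(steps + 1):
        lam = -half_width + s * h
        weight = h if 0 < s < steps else h / 2
        psi = psi_values(n, lam, n)
        for j in range(n):
            for k in range(n):
                gram[j][k] += weight * psi[j] * psi[k]
        trace += weight * sum(p * p for p in psi)
    return gram, trace

gram, trace = gram_and_trace(4)
print(trace)
```

The Gram matrix comes out as the identity up to quadrature error, which is (1.12) rewritten for the $\psi^{(n)}_l$.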


The orthogonal polynomial machinery, in particular, the Christoffel-Darboux formula and the Christoffel function, simplifies considerably the study of the marginal densities (1.4). This allows one to study the local eigenvalue statistics in many different cases: bulk of the spectrum, edges of the spectrum, special points, etc. (see [3–5,12,17,18,20]).

For $\beta = 1, 4$ the situation is more complicated. It was shown in [24] that all correlation functions can be expressed in terms of some matrix kernels (see (1.13)–(1.17) below). But the representation is less convenient than (1.9)–(1.10). It makes difficult problems which for $\beta = 2$ are just simple exercises. For example, the bound for the variance of linear eigenvalue statistics (1.21) for $\beta = 1, 4$ was until now known only for one interval $\sigma$ (see [13]), while for $\beta = 2$ it is a trivial corollary of the Christoffel-Darboux formula for any $\sigma$. The matrix kernels for $\beta = 1, 4$ have the form

$$K_{n,1}(\lambda,\mu) := \begin{pmatrix} S_{n,1}(\lambda,\mu) & -\frac{\partial}{\partial\mu}S_{n,1}(\lambda,\mu) \\ (\epsilon S_{n,1})(\lambda,\mu) - \epsilon(\lambda-\mu) & S_{n,1}(\mu,\lambda) \end{pmatrix}, \quad \beta = 1,\ n = 2k, \tag{1.13}$$

$$K_{n,4}(\lambda,\mu) := \begin{pmatrix} S_{n,4}(\lambda,\mu) & -\frac{\partial}{\partial\mu}S_{n,4}(\lambda,\mu) \\ (\epsilon S_{n,4})(\lambda,\mu) & S_{n,4}(\mu,\lambda) \end{pmatrix}, \quad \beta = 4, \tag{1.14}$$

where

$$S_{n,1}(\lambda,\mu) = -\sum_{j,k=0}^{n-1}\psi^{(n)}_j(\lambda)\,(M^{(n)}_n)^{-1}_{jk}\,(\epsilon\psi^{(n)}_k)(\mu), \tag{1.15}$$

$$S_{n/2,4}(\lambda,\mu) = -\sum_{j,k=0}^{n-1}(\psi^{(n)}_j)'(\lambda)\,(D^{(n)}_n)^{-1}_{jk}\,\psi^{(n)}_k(\mu), \tag{1.16}$$

$\epsilon(\lambda) = \frac{1}{2}\operatorname{sgn}(\lambda)$, sgn denotes the standard signum function, and $(\epsilon f)(\lambda) := \int_{\mathbb{R}}\epsilon(\lambda-\mu)f(\mu)\,d\mu$.

$D^{(n)}_n$ and $M^{(n)}_n$ in (1.15) and (1.16) are the left top corner $n\times n$ blocks of the semi-infinite matrices that correspond to the differentiation operator and to the integration operator respectively:

$$D^{(n)}_\infty := \big((\psi^{(n)}_j)', \psi^{(n)}_k\big)_{j,k\ge 0}, \qquad D^{(n)}_n = \{D^{(n)}_{jk}\}_{j,k=0}^{n-1},$$
$$M^{(n)}_\infty := \big(\epsilon\psi^{(n)}_j, \psi^{(n)}_k\big)_{j,k\ge 0}, \qquad M^{(n)}_n = \{M^{(n)}_{jk}\}_{j,k=0}^{n-1}. \tag{1.17}$$

Both matrices $D^{(n)}_\infty$ and $M^{(n)}_\infty$ are skew-symmetric, and since $\epsilon(\psi^{(n)}_j)' = \psi^{(n)}_j$, we have for any $j, l \ge 0$ that

$$\delta_{jl} = \big(\epsilon(\psi^{(n)}_j)', \psi^{(n)}_l\big) = \sum_{k=0}^{\infty}(D^{(n)}_\infty)_{jk}(M^{(n)}_\infty)_{kl} \iff D^{(n)}_\infty M^{(n)}_\infty = 1 = M^{(n)}_\infty D^{(n)}_\infty.$$
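The structure of the differentiation matrix in (1.17) can be seen in the Gaussian case. The sketch below assumes $V(\lambda) = \lambda^2/2$ (so $2m = 2$), which is an illustrative choice and not the general setting of the paper; the function names, the finite-difference step and the quadrature parameters are likewise assumptions. It computes the top-left block of $D^{(n)}_\infty$ with a central difference for $(\psi^{(n)}_j)'$ and the trapezoid rule, and the result should be skew-symmetric with entries only next to the diagonal, in line with the band width $2m - 1 = 1$.

```python
import math

def psi_values(n, lam, count):
    """psi_0..psi_{count-1} at lam for V(lam) = lam^2 / 2 (three-term recurrence)."""
    vals = [(n / (2 * math.pi)) ** 0.25 * math.exp(-n * lam * lam / 4)]
    for l in range(count - 1):
        previous = vals[l - 1] if l > 0 else 0.0
        vals.append((lam * vals[l] - math.sqrt(l / n) * previous)
                    / math.sqrt((l + 1) / n))
    return vals

def d_matrix(n, size, half_width=6.0, steps=4000, eps=1e-5):
    """Top-left size x size block of D_infty with D_jk = (psi_j', psi_k).
    The derivative is a central difference; the integral is a trapezoid sum."""
    h = 2 * half_width / steps
    d = [[0.0] * size for _ in range(size)]
    for s in range(steps + 1):
        lam = -half_width + s * h
        weight = h if 0 < s < steps else h / 2
        psi = psi_values(n, lam, size)
        upper = psi_values(n, lam + eps, size)
        lower = psi_values(n, lam - eps, size)
        dpsi = [(u - w) / (2 * eps) for u, w in zip(upper, lower)]
        for j in range(size):
            for k in range(size):
                d[j][k] += weight * dpsi[j] * psi[k]
    return d

D = d_matrix(n=4, size=5)
for row in D:
    print(["%+.4f" % entry for entry in row])
```

In this Gaussian case the nonzero entries can also be obtained in closed form from the recurrence, $D_{j,j-1} = \sqrt{nj}/2 = -D_{j-1,j}$, which the numerical block reproduces.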


It was observed in [25] that if $V$ is a rational function, in particular, a polynomial of degree $2m$, then the kernels $S_{n,1}$, $S_{n,4}$ can be written as

$$S_{n,1}(\lambda,\mu) = K_{n,2}(\lambda,\mu) + n\sum_{j,k=-(2m-1)}^{2m-1}F^{(1)}_{jk}\,\psi^{(n)}_{n+j}(\lambda)\,\epsilon\psi^{(n)}_{n+k}(\mu),$$
$$S_{n/2,4}(\lambda,\mu) = K_{n,2}(\lambda,\mu) + n\sum_{j,k=-(2m-1)}^{2m-1}F^{(4)}_{jk}\,\psi^{(n)}_{n+j}(\lambda)\,\epsilon\psi^{(n)}_{n+k}(\mu), \tag{1.18}$$

where $F^{(1)}_{jk}$, $F^{(4)}_{jk}$ can be expressed in terms of the matrix $T_n^{-1}$, where $T_n$ is the $(2m-1)\times(2m-1)$ block in the bottom right corner of $D^{(n)}_n M^{(n)}_n$, i.e.,

$$(T_n)_{jk} := (D^{(n)}_n M^{(n)}_n)_{n-2m+j,\,n-2m+k}, \quad 1 \le j, k \le 2m-1. \tag{1.19}$$

The main technical obstacle in the study of the kernels $S_{n,1}$, $S_{n,4}$ is the problem of proving that $(T_n^{-1})_{jk}$ are bounded uniformly in $n$. Until now this technical problem was solved only in a few cases. In the papers [6,7] the case $V(\lambda) = \lambda^{2m}(1 + o(1))$ (in our notations) was studied and the problem of invertibility of $T_n$ was solved by computing the entries of $T_n$ explicitly. A similar method was used in [8] to prove bulk and edge universalities (including the case of the hard edge) for the Laguerre type ensembles with monomial $V$. In the paper [23] the problem of invertibility of $T_n$ was solved also by computing the entries of $T_n$ for $V$ being an even quartic polynomial. In [21,22] a similar problem was solved without explicit computation of the entries of $T_n$. It was shown that for any real analytic $V$ with one interval support $\sigma$ the matrix $(M^{(n)}_n)^{-1}$ is uniformly bounded in the operator norm. This allowed us to prove the bulk and the edge universality for $\beta = 1$ in the one interval case.

But there is also a possibility to prove that $T_n$ is invertible with another technique. As a by-product of the calculation in [24] one also obtains relations between the partition functions $Q_{n,\beta}$ and the determinants of $M^{(n)}_n$ and $D^{(n)}_n$:

$$\det M^{(n)}_n = \left(\frac{Q_{n,1}\Gamma_n}{n!\,2^{n/2}}\right)^2, \qquad \det D^{(n)}_n = \left(\frac{Q_{n/2,4}\Gamma_n}{(n/2)!\,2^{n/2}}\right)^2,$$

where $\Gamma_n := \prod_{j=0}^{n-1}\gamma^{(n)}_j$, and $\gamma^{(n)}_j$ is the leading coefficient of $p^{(n)}_j(\lambda)$ of (1.12). It is also known (see [15]) that $Q_{n,2} = n!/\Gamma_n^2$. Since $D^{(n)}_\infty M^{(n)}_\infty = 1$ and $(D^{(n)}_\infty)_{jk} = 0$ for $|j-k| > 2m-1$ (see (3.3)), we have $D^{(n)}_n M^{(n)}_n = 1 + \Delta_n$ with $\Delta_n$ being zero except for the bottom $2m-1$ rows, and we arrive at a formula, first observed in [23]:

$$\det(T_n) = \det(D^{(n)}_n M^{(n)}_n) = \left(\frac{Q_{n,1}\,Q_{n/2,4}}{Q_{n,2}\,(n/2)!\,2^{n}}\right)^2. \tag{1.20}$$

Hence to control $\det(T_n)$, it suffices to control $\log Q_{n,\beta}$ for $\beta = 1, 2, 4$ up to the order $O(1)$. In the paper [16] the corresponding expansion of $\log Q_{n,\beta}$ was constructed by using some generalization of the method of [13]. The original method was proposed to study the fluctuations of linear eigenvalue statistics

$$N_n[\varphi] = \sum_{i=1}^{n}\varphi(\lambda_i), \tag{1.21}$$
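The first equality in (1.20) rests on a plain linear-algebra fact: if a matrix equals the identity plus a perturbation that vanishes outside its bottom $r$ rows, its determinant equals the determinant of its bottom right $r \times r$ block. The sketch below illustrates only this step with arbitrary numbers; the matrix entries, the sizes and the helper `det` are assumptions of the sketch and have nothing to do with the actual matrices $D^{(n)}_n$, $M^{(n)}_n$.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    value = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            value = -value
        value *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for k in range(col, size):
                a[r][k] -= factor * a[col][k]
    return value

size, band = 6, 3  # band plays the role of 2m - 1 (here m = 2)
full = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
for i in range(size - band, size):          # perturb only the bottom rows
    for j in range(size):
        full[i][j] += 0.3 * math.sin(1.0 + 2.0 * i + 0.7 * j)

# Bottom right band x band block, the analogue of T_n in (1.19).
block = [row[size - band:] for row in full[size - band:]]

print(det(full), det(block))
```

The two determinants coincide because the matrix is block lower triangular with an identity block in the top left corner.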


in particular, to control the expectation and the variance of $n^{-1}N_n[\varphi]$ up to the terms $O(n^{-2})$ for any $\beta$, but only in the case of the one-interval support of the equilibrium measure and polynomial $V$, satisfying some additional assumption. The method was used also to prove CLT for fluctuations of $N_n[\varphi]$. In the paper [16] the method of [13] was simplified, which allowed us to generalize it to the case of real analytic $V$ with one interval support of $\rho$ without any other assumption. Unfortunately, there is no hope to generalize the method of [13,16] to the case of multi-interval support $\sigma$ directly, because the method is based on solving some integral equation (see Eq. (2.13) below) which is not uniquely solvable in the case of multi-interval support.

In the present paper the problem of controlling $Q_{n,\beta}[V]$ for $\beta = 1, 2, 4$ is solved in a slightly different way. We prove that for any analytic potential $V$ with $q$-interval support $\sigma$ the normalizing constant $Q_{n,\beta}[V]$ can be factorized into a product of $Q_{k^*_\alpha,\beta}[V^{(a)}_\alpha]$, $\alpha = 1,\dots,q$, where $k^*_\alpha \sim \mu^*_\alpha n$ (see (1.34)), and the “effective potentials” $V^{(a)}_\alpha$ (see (1.36)) are defined in terms of $\sigma$, $V$ and $\rho$. To be more precise let us formulate our main conditions.

C1. $V$ is a polynomial of degree $2m$ with a positive leading coefficient, and the support of its equilibrium measure is

$$\sigma = \bigcup_{\alpha=1}^{q}\sigma_\alpha, \qquad \sigma_\alpha = [E_{2\alpha-1}, E_{2\alpha}]. \tag{1.22}$$

C2. The equilibrium density $\rho$ can be represented in the form

$$\rho(\lambda) = \frac{1}{2\pi}P(\lambda)\,\Im X^{1/2}(\lambda + i0), \qquad \inf_{\lambda\in\sigma}|P(\lambda)| > 0, \tag{1.23}$$

where

$$X(z) = \prod_{\alpha=1}^{2q}(z - E_\alpha), \tag{1.24}$$

and we choose a branch of $X^{1/2}(z)$ such that $X^{1/2}(z) \sim z^q$, as $z \to +\infty$. Moreover, the function $v$ defined by (1.8) attains its maximum only if $\lambda$ belongs to $\sigma$.

Remark 1. It is known (see, e.g., [1]) that for analytic $V$ the equilibrium density $\rho$ always has the form (1.23)–(1.24). The function $P$ in (1.23) is analytic and can be represented in the form

$$P(z) = \frac{1}{2\pi i}\oint_L \frac{V'(z) - V'(\zeta)}{(z-\zeta)X^{1/2}(\zeta)}\,d\zeta. \tag{1.25}$$

Hence Condition C2 means that $\rho$ has no zeros in the internal points of $\sigma$ and behaves like a square root near the edge points. This behavior of $V$ is usually called generic.

We will use also the notations

$$\sigma_\varepsilon = \bigcup_{\alpha=1}^{q}\sigma_{\alpha,\varepsilon}, \qquad \sigma_{\alpha,\varepsilon} = [E_{2\alpha-1}-\varepsilon, E_{2\alpha}+\varepsilon], \qquad \operatorname{dist}\{\sigma_{\alpha,\varepsilon}, \sigma_{\alpha',\varepsilon}\} > \delta > 0, \quad \alpha \ne \alpha'. \tag{1.26}$$


The first result of the paper is the theorem which allows us to control $\log Q_{n,\beta}$ in the one interval case up to $O(1)$ terms. Since the paper [2] it is known that

$$\log Q_{n,\beta}[V] = \beta n^2\mathcal{E}[V]/2 + O(n\log n).$$

But, as it was discussed above, for many problems it is important to control the next terms of the asymptotic expansion of $\log Q_{n,\beta}$ (see also the discussion in [9], where the expansion in $n^{-1}$ was constructed for $\beta = 2$ and $V$ being a polynomial, close in a certain sense to $V_0(\lambda) = \lambda^2/2$). We would like to note that almost all assertions of Theorem 1 below were obtained in [16]. The difference is that here we need to control that the remainder bounds are uniform in some parameter $\eta$, which we put in front of $V$ in $H$ (see (1.2)). Note also that it is important for us that here $V$ may be a non-polynomial function analytic only in some open domain $D \subset \mathbb{C}$ containing $\sigma$. We use below the Stieltjes transform, defined for an integrable function $p$ as

$$g(z) = \int\frac{p(\lambda)\,d\lambda}{\lambda - z}. \tag{1.27}$$

Theorem 1. Let $V$ satisfy (1.3) and C2, the equilibrium density $\rho$ (see (1.8)) have the form (1.23) with $q = 1$, and $\sigma = \operatorname{supp}\rho = [a, b]$. Assume also that $V$ is analytic in the domain $D \supset \sigma_\varepsilon$. Consider the distribution (1.1) with $V$ replaced by $\eta V$. Then there exists $\varepsilon_1 > 0$ such that for any $\eta : |\eta - 1| \le \varepsilon_1$ we have:

(i) The Stieltjes transform $g^{(\eta)}_n(z)$ (1.27) of the first marginal $p^{(n,\eta)}_{1,\beta}$ of (1.1) for $z$ such that $d(z) := \operatorname{dist}\{z, \sigma_\varepsilon\} \ge n^{-1/3}\log n$ has the form

$$g^{(\eta)}_n(z) = g_\eta(z) + n^{-1}u_{n,\eta}(z),$$
$$u_{n,\eta}(z) = -\Big(\frac{2}{\beta} - 1\Big)\frac{1}{2\pi i\,X_\eta^{1/2}(z)}\oint_L\frac{g_\eta'(\zeta)\,d\zeta}{P_\eta(\zeta)(z-\zeta)} + n^{-1}O(d^{-11/2}(z)), \tag{1.28}$$

where $g_\eta(z)$ is the Stieltjes transform of the equilibrium density $\rho_\eta$ which maximizes $\mathcal{E}[\eta V]$ of (1.6), $X_\eta$ of (1.24) corresponds to the support $[a_\eta, b_\eta]$ of $\rho_\eta$, and $P_\eta$ is defined by (1.25) for $\eta V$. The contour $L$ here is chosen sufficiently close to $[a_\eta, b_\eta]$ to have $z$ and all zeros of $P_\eta$ outside of it. The remainder bound is uniform in $|\eta - 1| \le \varepsilon_1$. Moreover, for any $\varphi$ with bounded third derivative $\varphi^{(3)}$,

$$\int\varphi(\lambda)\big(p^{(n,\eta)}_{1,\beta}(\lambda) - \rho_\eta(\lambda)\big)\,d\lambda = n^{-1}O\big(\|\varphi\|_\infty + \|\varphi^{(3)}\|_\infty\big), \qquad \|\varphi\|_\infty := \sup_{\lambda\in\sigma_\varepsilon}|\varphi(\lambda)|. \tag{1.29}$$

(ii) There exists a function $u_*$, analytic in $D\setminus\sigma_\varepsilon$, such that

$$u_{n,\eta}(z) - u_{n,1}(z) = (\eta - 1)u_*(z) + O((\eta-1)^2) + O(n^{-1}), \qquad |\Im z| > d. \tag{1.30}$$


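(iii) If $Q^{(\eta)}_{n,\beta}$ is defined by (1.2) for $\eta V$, then

$$\log Q^{(\eta)}_{n,\beta} = \log Q^*_{n,\beta} + \frac{3\beta n^2}{8} + \frac{\beta n^2}{2}\mathcal{E}[\eta V] + n\Big(1 - \frac{\beta}{2}\Big)\log\frac{d_\eta}{2} + \frac{\beta n}{4\pi i}\int_0^1 dt\oint_{L'}\big(\eta V(z) - V^{(0)}_\eta(z)\big)u_{n,\eta}(z,t)\,dz,$$
$$u_{n,\eta}(z,t) = -\frac{(2/\beta - 1)}{2\pi i\,X_\eta^{1/2}(z)}\oint_L\frac{g_\eta'(\zeta,t)\,d\zeta}{P_\eta(\zeta,t)(z-\zeta)} + O(n^{-1}), \tag{1.31}$$

where $\mathcal{E}[\eta V]$, $X_\eta$ and $[a_\eta, b_\eta]$ are the same as in (i), the contour $L$ is chosen as in (i), $L'$ encircles $L$,

$$V^{(0)}_\eta(z) = 2(z - c_\eta)^2/d_\eta^2, \qquad c_\eta = (a_\eta + b_\eta)/2, \qquad d_\eta = (b_\eta - a_\eta)/2,$$
$$P_\eta(\lambda,t) = tP_\eta(\lambda) + \frac{4(1-t)}{d_\eta^2}, \qquad g_\eta(z,t) = tg_\eta(z) + \frac{2(1-t)}{d_\eta^2}\big(X_\eta^{1/2}(z) - z + c_\eta\big),$$

$Q^*_{n,\beta}$ is defined by the Selberg formula for $V(\lambda) = \lambda^2/2$,

$$Q^*_{n,\beta} = n!\Big(\frac{n\beta}{2}\Big)^{-\beta n^2/4 - n(1-\beta/2)/2}(2\pi)^{n/2}\prod_{j=1}^{n}\frac{\Gamma(\beta j/2)}{\Gamma(\beta/2)}, \tag{1.32}$$

and the remainder bound in (1.31) is uniform in $|\eta - 1| \le \varepsilon_1$.

Remark 2. According to [10] for $\beta = 1, 2, 4$,

$$\log\frac{Q^*_{n,\beta}}{n!} = -\frac{3}{8}\beta n^2 - \Big(1 - \frac{\beta}{2}\Big)n\Big(\log\frac{n\beta}{2} - \frac{1}{2}\Big) + n\log\frac{2\pi}{\Gamma(\beta/2)} - a_\beta\log n + O(1), \tag{1.33}$$

where $a_1 = a_4 = 1/24$, $a_2 = 1/12$.

The next theorem establishes some important properties of symplectic and orthogonal matrix models, in particular, it gives the bound for the rate of convergence of linear eigenvalue statistics and the bound for their variances. In order to formulate the theorem, we define for even $n$,

$$\mu^*_\alpha = \int_{\sigma_\alpha}\rho(\lambda)\,d\lambda, \qquad k^*_\alpha := [n\mu^*_\alpha] + d_\alpha, \tag{1.34}$$

where $[x]$ means the integer part of $x$, and $d_\alpha = 0, \pm 1, \pm 2$ are chosen in a way which makes $k^*_\alpha$ even and

$$\sum_{\alpha=1}^{q}k^*_\alpha = n. \tag{1.35}$$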
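The Selberg-type constant of (1.32) is easy to evaluate with log-gamma functions. The sketch below is a minimal check under stated assumptions: it uses the closed form $Q^*_{n,\beta} = n!\,(n\beta/2)^{-\beta n^2/4 - n(1-\beta/2)/2}(2\pi)^{n/2}\prod_{j=1}^{n}\Gamma(\beta j/2)/\Gamma(\beta/2)$, compares it with three small cases that can be integrated by hand, and then compares it with leading terms that were re-derived here from this closed form by Stirling's formula (they are an assumption of the sketch, not a quotation of (1.33)). The function names are illustrative.

```python
import math

def log_q_star(n, beta):
    """log Q*_{n,beta} from the closed form quoted in the lead-in."""
    half = beta / 2
    exponent = -beta * n * n / 4 - n * (1 - half) / 2
    return (
        math.lgamma(n + 1)
        + exponent * math.log(n * half)
        + (n / 2) * math.log(2 * math.pi)
        + sum(math.lgamma(half * j) for j in range(1, n + 1))
        - n * math.lgamma(half)
    )

def leading_terms(n, beta):
    """Stirling-type leading terms of log(Q*/n!), re-derived for this sketch."""
    half = beta / 2
    a_beta = {1: 1 / 24, 2: 1 / 12, 4: 1 / 24}[beta]
    return (
        -3 * beta * n * n / 8
        - (1 - half) * n * (math.log(n * half) - 0.5)
        + n * (math.log(2 * math.pi) - math.lgamma(half))
        - a_beta * math.log(n)
    )

def remainder(n, beta):
    """What is left of log(Q*/n!) after removing the leading terms."""
    return log_q_star(n, beta) - math.lgamma(n + 1) - leading_terms(n, beta)

# Hand-computable cases: n = 1 gives a Gaussian integral sqrt(4 pi / beta);
# n = 2 gives pi for beta = 2 and 4 sqrt(pi) for beta = 1.
print(math.exp(log_q_star(1, 2)), math.sqrt(2 * math.pi))
print(math.exp(log_q_star(2, 2)), math.pi)
print(math.exp(log_q_star(2, 1)), 4 * math.sqrt(math.pi))
for beta in (1, 2, 4):
    print(beta, remainder(200, beta), remainder(400, beta))
```

For each $\beta$ the remainder barely changes between $n = 200$ and $n = 400$, which is the numerical meaning of an $O(1)$ error term.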


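For each $\sigma_{\alpha,\varepsilon}$ we introduce the “effective potential”

$$V^{(a)}_\alpha(\lambda) = \mathbf{1}_{\sigma_{\alpha,\varepsilon}}(\lambda)\Big(V(\lambda) - 2\int_{\sigma\setminus\sigma_\alpha}\log|\lambda-\mu|\,\rho(\mu)\,d\mu\Big), \tag{1.36}$$

and denote by $\Sigma^*$ the “cross energy”

$$\Sigma^* := \sum_{\alpha\ne\alpha'}\int_{\sigma_\alpha}d\lambda\int_{\sigma_{\alpha'}}d\mu\,\log|\lambda-\mu|\,\rho(\lambda)\rho(\mu). \tag{1.37}$$

Theorem 2. If the potential $V$ satisfies Conditions C1–C2 and $n$ is even, then the matrices $F^{(1)}$ and $F^{(4)}$ in (1.18) are bounded in the operator norm uniformly in $n$. Moreover, for any smooth $\varphi$ and $\beta = 1, 2, 4$ we have

$$\Big|\int\varphi(\lambda)\big(p^{(n)}_{1,\beta}(\lambda) - \rho(\lambda)\big)\,d\lambda\Big| \le \frac{C}{n}\|\varphi'\|_\infty,$$
$$\mathbf{E}_\beta\Big\{\big|N_n[\varphi] - \mathbf{E}_\beta\{N_n[\varphi]\}\big|^2\Big\} \le C\|\varphi'\|^2_\infty, \tag{1.38}$$

where $\|\cdot\|_\infty$ is defined in (1.29). The logarithm of the normalization constant $Q_{n,\beta}[V]$ can be obtained up to the $O(1)$ term from the representation

$$\log(Q_{n,\beta}[V]/n!) = \sum_{\alpha=1}^{q}\log\big(Q_{k^*_\alpha,\beta}[V^{(a)}_\alpha]/k^*_\alpha!\big) - \frac{\beta n^2}{2}\Sigma^* + O(1), \tag{1.39}$$

where $V^{(a)}_\alpha$ and $\Sigma^*$ are defined in (1.36) and (1.37).

As it was mentioned above, Theorem 2 together with some asymptotic results for orthogonal polynomials of [5] may be used to prove the universality conjecture for local eigenvalue statistics of the matrix models (1.1). In order to state our theorem on the bulk universality, we need some more notations. Define

$$K_\infty(t) := \frac{\sin\pi t}{\pi t},$$
$$K^{(1)}_\infty(\xi,\eta) := \begin{pmatrix} K_\infty(\xi-\eta) & K'_\infty(\xi-\eta) \\ \int_0^{\xi-\eta}K_\infty(t)\,dt - \epsilon(\xi-\eta) & K_\infty(\eta-\xi) \end{pmatrix},$$
$$K^{(4)}_\infty(\xi,\eta) := \begin{pmatrix} K_\infty(\xi-\eta) & K'_\infty(\xi-\eta) \\ \int_0^{\xi-\eta}K_\infty(t)\,dt & K_\infty(\eta-\xi) \end{pmatrix}.$$

Furthermore, we denote for a $2\times 2$ matrix $A$ and $\lambda > 0$,

$$A^{(\lambda)} := \begin{pmatrix}\sqrt{\lambda}^{\,-1} & 0 \\ 0 & \sqrt{\lambda}\end{pmatrix} A \begin{pmatrix}\sqrt{\lambda} & 0 \\ 0 & \sqrt{\lambda}^{\,-1}\end{pmatrix}.$$

Theorem 3. Let $V$ satisfy Conditions C1–C2. Then we have for (even) $n \to \infty$, $\lambda_0 \in \mathbb{R}$ with $\rho(\lambda_0) > 0$, and for $\beta \in \{1, 4\}$ that

$$q_n^{-1}K^{(q_n)}_{n,1}(\lambda_0 + \xi/q_n, \lambda_0 + \eta/q_n) = K^{(1)}_\infty(\xi,\eta) + O(n^{-1/2}),$$
$$q_n^{-1}K^{(q_n)}_{n/2,4}(\lambda_0 + \xi/q_n, \lambda_0 + \eta/q_n) = K^{(4)}_\infty(\xi,\eta) + O(n^{-1/2}),$$

where $q_n = n\rho(\lambda_0)$. The error bound is uniform for bounded $\xi$, $\eta$ and for $\lambda_0$ contained in some compact subset of $\cup_{\alpha=1}^{q}(E_{2\alpha-1}, E_{2\alpha})$.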


It is an immediate consequence of Theorem 3 that the corresponding rescaled $l$-point correlation functions

$$p^{(n)}_{l,1}(\lambda_0 + \xi_1/q_n,\dots,\lambda_0 + \xi_l/q_n), \qquad p^{(n/2)}_{l,4}(\lambda_0 + \xi_1/q_n,\dots,\lambda_0 + \xi_l/q_n)$$

converge for $n$ (even) $\to \infty$ to some limit that depends on $\beta$ but not on the choice of $V$.

The paper is organized as follows. In Sect. 2 we prove Theorem 1. In Sect. 3 we prove Theorems 2 and 3 modulo some bounds which are obtained in Sect. 4.

2. Proof of Theorem 1

(i) Take $n$-independent $\varepsilon$ which is sufficiently small to provide that $\sigma_\varepsilon \subset D$. Then take $\eta : |\eta - 1| \le \varepsilon_1$ with $\varepsilon_1$ being sufficiently small to provide that $\sigma_\eta$ (the support of $\rho_\eta$) belongs to $\sigma_{\varepsilon/2}$. It is known (see [2, Lemmas 1,3] and [19, Theorems 11.1.4, 11.1.6]) that if we replace in (1.1), (1.4), and (1.5) the integration over $\mathbb{R}$ by the integration over $\sigma_\varepsilon$, then $p^{(n,\eta)}_{k,\beta}$ and the new marginal densities $p^{(n,\eta,\varepsilon)}_{k,\beta}$ for $k = 1, 2$ satisfy the inequalities
sup

(n,η,ε)

λ1 ,...,λk ∈R, y∈σε

pk+1,β (λ1 , . . . , λk , y) ≤ e−nβdε , dε(η) := inf {(vη∗ − vη (λ))/4} > 0, R\σε

sup

λ1 ,...,λk ∈σε

(n,η)

(n,η,ε)

| pk,β (λ1 , . . . , λk ) − pk,β

(η)

(λ1 , . . . , λk )| ≤ e−nβdε ,

(2.1)

(η) Q n,β [ηV ]/Q (ε) [ηV ] − 1 ≤ e−nβdε , n,β

where $v_\eta$ and $v^*_\eta$ are defined by (1.8) for $\eta V$. Since it is more convenient for our purposes to consider the integration with respect to $\sigma_\varepsilon$, we use this truncation. Then at the end of the proof of Theorem 1 we can remove the truncation, using (2.1). Thus, starting from this moment we assume that this truncation is made, so everywhere below the integration without limits means the integration over $\sigma_\varepsilon$, but we will omit the superindex $\varepsilon$.

Remark 3. Let us note that relations (2.1) mean, in particular, that outside of $\sigma_\varepsilon$ we can replace $V$ by any smooth function (e.g. linear) which grows at infinity and satisfies the second line of (1.8). This replacement gives errors $O(e^{-nc})$ with some $c > 0$ to all our computations.

Following the idea of [13], we will study a slightly modified form of the joint eigenvalue distribution (1.1). Consider some function $h : \sigma_\varepsilon \to \mathbb{R}$ such that $\|h'\|_\infty = o(n)$, $n \to \infty$, and denote

$$V_{h,\eta}(\lambda) = \eta V(\lambda) + \frac{1}{n}h(\lambda). \tag{2.2}$$

Let $p_{n,\beta,h}$, $\mathbf{E}_{\beta,h}\{\dots\}$, $p^{(n,\eta)}_{l,\beta,h}$ be the distribution density, the expectation, and the marginal densities defined by (1.1), (1.5), and (1.4) with $V$ replaced by $V_{h,\eta}$. By (1.1) the first marginal density can be represented in the form

$$p^{(n,\eta)}_{1,\beta,h}(\lambda) = Q^{-1}_{n,\beta,h}\int e^{-n\beta V_{h,\eta}(\lambda)/2}\prod_{i=2}^{n}|\lambda - \lambda_i|^{\beta}e^{-n\beta V_{h,\eta}(\lambda_i)/2}\prod_{2\le i<j\le n}|\lambda_i - \lambda_j|^{\beta}\,d\lambda_2\dots d\lambda_n. \tag{2.3}$$


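Using this representation and integrating by parts, we obtain

$$\int\frac{V'_{h,\eta}(\lambda)\,p^{(n,\eta)}_{1,\beta,h}(\lambda)}{z-\lambda}\,d\lambda = \frac{2}{\beta n}\int\frac{p^{(n,\eta)}_{1,\beta,h}(\lambda)}{(z-\lambda)^2}\,d\lambda + \frac{2(n-1)}{n}\int\frac{p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu)\,d\lambda\,d\mu}{(z-\lambda)(\lambda-\mu)} + O(e^{-n\beta d^{(\eta)}_\varepsilon/2}). \tag{2.4}$$

Here $O(e^{-n\beta d^{(\eta)}_\varepsilon/2})$ is the contribution of the integrated terms. This bound is uniform in $z : \operatorname{dist}\{z, \sigma_{\alpha,\varepsilon}\} \le n^{-1/2}$. In fact all equations below should contain $O(e^{-n\beta d^{(\eta)}_\varepsilon/2})$, but in order to simplify the formulas below we omit it.

Since the function $p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu)$ is symmetric with respect to $\lambda$, $\mu$, we have

$$2\int\frac{p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu)\,d\lambda\,d\mu}{(z-\lambda)(\lambda-\mu)} = \int\frac{p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu)\,d\lambda\,d\mu}{(z-\lambda)(\lambda-\mu)} + \int\frac{p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu)\,d\lambda\,d\mu}{(z-\mu)(\mu-\lambda)} = \int\frac{p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu)\,d\lambda\,d\mu}{(z-\lambda)(z-\mu)}.$$

Hence Eq. (2.4) can be written in the form

$$\int\frac{V'_{h,\eta}(\lambda)\,p^{(n,\eta)}_{1,\beta,h}(\lambda)}{z-\lambda}\,d\lambda = \frac{2}{\beta n}\int\frac{p^{(n,\eta)}_{1,\beta,h}(\lambda)}{(z-\lambda)^2}\,d\lambda + \frac{(n-1)}{n}\int\frac{p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu)\,d\lambda\,d\mu}{(z-\lambda)(z-\mu)}. \tag{2.5}$$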
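The symmetrization step that leads from (2.4) to (2.5) uses only the partial-fraction identity $\frac{1}{(z-\lambda)(\lambda-\mu)} + \frac{1}{(z-\mu)(\mu-\lambda)} = \frac{1}{(z-\lambda)(z-\mu)}$. A minimal numerical check follows; the function names and the test points are arbitrary choices of this sketch.

```python
def symmetrized_sum(z, lam, mu):
    """Sum of the two integrands that appear after symmetrizing in (lam, mu)."""
    return 1 / ((z - lam) * (lam - mu)) + 1 / ((z - mu) * (mu - lam))

def product_form(z, lam, mu):
    """The factorized integrand that appears in (2.5)."""
    return 1 / ((z - lam) * (z - mu))

test_points = [
    (1.5 + 0.7j, -0.3, 0.9),
    (-2.0 + 0.1j, 0.25, -1.1),
    (0.4 - 3.0j, 1.7, 1.2),
]
for z, lam, mu in test_points:
    print(abs(symmetrized_sum(z, lam, mu) - product_form(z, lam, mu)))
```

The printed differences are at the level of rounding error, so the kernel $1/((z-\lambda)(\lambda-\mu))$ may be replaced by half of $1/((z-\lambda)(z-\mu))$ under any density symmetric in $\lambda$, $\mu$.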

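Let us introduce the notations:

$$\delta^{(\eta)}_{n,\beta,h}(z) := n(n-1) \int \frac{p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu)\, d\lambda\, d\mu}{(z-\lambda)(z-\mu)} - n^2 \left( \int \frac{p^{(n,\eta)}_{1,\beta,h}(\lambda)\, d\lambda}{z-\lambda} \right)^2 + n \int \frac{p^{(n,\eta)}_{1,\beta,h}(\lambda)}{(z-\lambda)^2}\, d\lambda = \int \frac{k^{(\eta)}_{n,\beta,h}(\lambda,\mu)\, d\lambda\, d\mu}{(z-\lambda)(z-\mu)}, \tag{2.6}$$

where

$$k^{(\eta)}_{n,\beta,h}(\lambda,\mu) := n(n-1)\, p^{(n,\eta)}_{2,\beta,h}(\lambda,\mu) - n^2 p^{(n,\eta)}_{1,\beta,h}(\lambda)\, p^{(n,\eta)}_{1,\beta,h}(\mu) + n\,\delta(\lambda-\mu)\, p^{(n,\eta)}_{1,\beta,h}(\lambda). \tag{2.7}$$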

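From this point on, in formulas (2.8)–(2.26) we assume that $z \in D_1 \subset D$, where $D_1 = \{z : \mathrm{dist}\{z, \sigma_\varepsilon\} \le \delta\}$ with some $n$-independent $\delta > 0$. We choose $\delta$ sufficiently small so that the zeros of $P_\eta$ are outside of $D_1$ and the distances between the zeros and $D_1$ are bigger than $\delta$ for $|\eta - 1| \le \varepsilon_1$. Denote

$$g^{(\eta)}_{n,\beta,h}(z) = \int \frac{p^{(n,\eta)}_{1,\beta,h}(\lambda)\, d\lambda}{\lambda - z}, \qquad V(z,\lambda) = \frac{V'(z) - V'(\lambda)}{z-\lambda}. \tag{2.8}$$

Then Eq. (2.4) takes the form

$$\big(g^{(\eta)}_{n,\beta,h}(z)\big)^2 + \eta V'(z)\, g^{(\eta)}_{n,\beta,h}(z) + \eta \int V(z,\lambda)\, p^{(n,\eta)}_{1,\beta,h}(\lambda)\, d\lambda = \frac{1}{n} \int \frac{h'(\lambda)\, p^{(n,\eta)}_{1,\beta,h}(\lambda)}{z-\lambda}\, d\lambda - \frac{1}{n}\left(\frac{2}{\beta} - 1\right) \int \frac{p^{(n,\eta)}_{1,\beta,h}(\lambda)}{(z-\lambda)^2}\, d\lambda - \frac{1}{n^2}\, \delta^{(\eta)}_{n,\beta,h}(z). \tag{2.9}$$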


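Using that $V(z,\zeta)$ is an analytic function of $\zeta$ in $D$, by the Cauchy theorem we obtain that

$$\int V(z,\lambda)\, p^{(n,\eta)}_{1,\beta,h}(\lambda)\, d\lambda = -\frac{1}{2\pi i} \oint_L V(z,\zeta)\, g^{(\eta)}_{n,\beta,h}(\zeta)\, d\zeta.$$

Thus (2.9) takes the form

$$\big(g^{(\eta)}_{n,\beta,h}(z)\big)^2 + \eta V'(z)\, g^{(\eta)}_{n,\beta,h}(z) - \frac{\eta}{2\pi i} \oint_L V(z,\zeta)\, g^{(\eta)}_{n,\beta,h}(\zeta)\, d\zeta = \frac{1}{n} \int \frac{h'(\lambda)\, p^{(n,\eta)}_{1,\beta,h}(\lambda)}{z-\lambda}\, d\lambda - \frac{1}{n}\left(\frac{2}{\beta} - 1\right) \int \frac{p^{(n,\eta)}_{1,\beta,h}(\lambda)}{(z-\lambda)^2}\, d\lambda - \frac{1}{n^2}\, \delta^{(\eta)}_{n,\beta,h}(z). \tag{2.10}$$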

On the other hand, it is easy to show that $g_\eta$ satisfies the equation

$$g_\eta^2(z) + \eta V'(z)\, g_\eta(z) + Q_\eta(z) = 0, \qquad Q_\eta(z) = -\frac{\eta}{2\pi i} \oint_L V(z,\zeta)\, g_\eta(\zeta)\, d\zeta. \tag{2.11}$$
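As a sanity check of the quadratic equation (2.11), one can look at the simplest special case, which is an assumption made only for this illustration: $V(\lambda) = \lambda^2/2$ and $\eta = 1$. Then $V'(z) = z$, $V(z,\zeta) \equiv 1$, $Q_\eta \equiv 1$, and the equilibrium density is the semicircle $\rho(\lambda) = \sqrt{4-\lambda^2}/(2\pi)$ on $[-2,2]$. The sketch below computes $g(z) = \int \rho(\lambda)\,d\lambda/(\lambda - z)$ numerically and evaluates the residual of $g^2 + zg + 1 = 0$; the function names and the test point are hypothetical.

```python
import math


def stieltjes_semicircle(z, nodes=2000):
    """g(z) = int rho(l) dl / (l - z) for rho(l) = sqrt(4 - l^2) / (2 pi) on [-2, 2].

    Substituting l = 2 cos(theta) gives rho(l) dl = (2 / pi) sin^2(theta) d(theta);
    the midpoint rule is then applied on [0, pi].
    """
    h = math.pi / nodes
    total = 0j
    for k in range(nodes):
        theta = (k + 0.5) * h
        total += math.sin(theta) ** 2 / (2.0 * math.cos(theta) - z)
    return (2.0 / math.pi) * h * total


def residual(z):
    """Left-hand side of g^2 + V'(z) g + Q = 0 with V'(z) = z and Q = 1."""
    g = stieltjes_semicircle(z)
    return g * g + z * g + 1.0


z0 = 3.0 + 1.0j  # hypothetical test point away from the support [-2, 2]
print(abs(residual(z0)))
```

The residual should be at the level of rounding error for points well separated from the support; closer to $[-2,2]$ more quadrature nodes would be needed.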

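Hence

$$g_\eta(z) = -\frac{\eta}{2} V'(z) + \frac{1}{2}\sqrt{\eta^2 V'^2(z) - 4 Q_\eta(z)}.$$

Using the inverse Stieltjes transform and comparing with (1.23), we get that

$$2 g_\eta(z) + \eta V'(z) = P_\eta(z)\, X_\eta^{1/2}(z), \tag{2.12}$$

where $X_\eta(z)$ is defined by (1.24) for $q = 1$. Write $g^{(\eta)}_{n,\beta,h} = g_\eta + n^{-1} u_{n,\eta}$. Then, subtracting (2.11) from (2.10) and multiplying the result by $n$, we get

$$\big(2 g_\eta(z) + \eta V'(z)\big)\, u_{n,\eta}(z) - \frac{\eta}{2\pi i} \oint_L V(z,\zeta)\, u_{n,\eta}(\zeta)\, d\zeta = F(z), \tag{2.13}$$

where

$$F(z) = \int \frac{h'(\lambda)\, p^{(n,\eta)}_{1,\beta,h}(\lambda)}{z-\lambda}\, d\lambda - \left(\frac{2}{\beta} - 1\right)\left( g'_\eta(z) + \frac{1}{n}\, u'_{n,\eta}(z) \right) - \frac{1}{n}\, u^2_{n,\eta}(z) - \frac{1}{n}\, \delta^{(\eta)}_{n,\beta,h}(z). \tag{2.14}$$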

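Using (2.12), we obtain from (2.13),

$$P_\eta(z)\, X_\eta^{1/2}(z)\, u_{n,\eta}(z) + \mathcal{Q}_n(z) = F(z), \qquad \mathcal{Q}_n(z) = -\frac{\eta}{2\pi i} \oint_L V(z,\zeta)\, u_{n,\eta}(\zeta)\, d\zeta. \tag{2.15}$$

Then for any $L \subset D_1$ which does not contain $z$ we get

$$\frac{1}{2\pi i} \oint_L \frac{d\zeta}{P_\eta(\zeta)(z-\zeta)} \Big( P_\eta(\zeta)\, X_\eta^{1/2}(\zeta)\, u_{n,\eta}(\zeta) + \mathcal{Q}_n(\zeta) - F(\zeta) \Big) = 0. \tag{2.16}$$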


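Since by definition (2.15) $\mathcal{Q}_n(\zeta)$ is analytic in $D_1$, and $z$ and all zeros of $P_\eta$ are outside of $L$, the Cauchy theorem yields

$$\frac{1}{2\pi i} \oint_L \frac{\mathcal{Q}_n(\zeta)\, d\zeta}{P_\eta(\zeta)(z-\zeta)} = 0.$$

Moreover, since for $z \to \infty$,

$$u_{n,\eta}(z) = -\frac{n}{z}\left( \int p^{(n,\eta)}_{1,\beta,h}(\lambda)\, d\lambda - \int \rho_\eta(\lambda)\, d\lambda \right) + n\, O(z^{-2}) = n\, O(z^{-2}), \tag{2.17}$$

we obtain $X_\eta^{1/2}(z)\, u_{n,\eta}(z) = n\, O(z^{-1})$. Then the Cauchy theorem yields for $z$ outside of $L$,

$$\frac{1}{2\pi i} \oint_L \frac{X_\eta^{1/2}(\zeta)\, u_{n,\eta}(\zeta)\, d\zeta}{z-\zeta} = X_\eta^{1/2}(z)\, u_{n,\eta}(z).$$

Thus for any $z \in D_1$ (2.16) implies

$$u_{n,\eta}(z) = \frac{F(z)}{X_\eta^{1/2}(z)\, P_\eta(z)} + \frac{1}{2\pi i\, X_\eta^{1/2}(z)} \oint_L \frac{F(\zeta)\, d\zeta}{P_\eta(\zeta)(\zeta - z)}, \tag{2.18}$$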

where the contour $L \subset D_1$ contains $z$. It will be convenient for us to take $L$ as far from $\sigma_\varepsilon$ as possible. We will use the bounds

$$|u_{n,\eta}(z)| \le \frac{C^* n^{1/2} \log^{1/2} n}{d(z)}, \qquad |u'_{n,\eta}(z)| \le \frac{C^* n^{1/2} \log^{1/2} n}{d^2(z)}, \qquad |\delta^{(\eta)}_{n,\beta,h}(z)| \le \frac{C^* n \log n}{d^2(z)}, \qquad d(z) = \mathrm{dist}\{z, \sigma_\varepsilon\}, \tag{2.19}$$

where $C^*$ is an $n,\eta$-independent constant which depends on $\|V' + \frac{1}{n}h'\|_\infty$, $\varepsilon$, and $|b-a|$. Bounds of this type but with different exponents of $d(z)$ were obtained in [2] (see also Theorem 11.1.2 and Remark 11.1.3 of [19]). For the reader's convenience we show at the end of this section how to obtain the first bound of (2.19) from the result of [2]. The second bound follows from the first one and the Cauchy theorem, and the third bound follows from the first one and the following lemma, which is the analog of Lemma 3.11 of [13].

Lemma 1. For some fixed $z \notin \sigma_\varepsilon$ consider $\varphi(\lambda) = \Re(\lambda - z)^{-1}$ or $\varphi(\lambda) = \Im(\lambda - z)^{-1}$ and assume that for any real $h$: $\|h'\|_\infty < n\varepsilon_0$ with some small $\varepsilon_0 > 0$ the following inequality holds:

$$\left| n \int \varphi(\lambda)\big( p^{(n,\eta)}_{1,\beta,h}(\lambda) - \rho_\eta(\lambda) \big)\, d\lambda \right| \le w_n\, \|\varphi^{(s)}\|_\infty\, (1 + \|h'\|_\infty), \tag{2.20}$$

where $\varphi^{(s)}$ is the $s$-th derivative of $\varphi$. Then there exists an absolute constant $C_*$ which does not depend on $n$, $h$, and $z$ such that for any $\|h'\|_\infty < n\varepsilon_0/2$,

$$\int k^{(\eta)}_{n,\beta,h}(\lambda,\mu)\, \varphi(\lambda)\, \varphi(\mu)\, d\lambda\, d\mu \le C_*\, w_n^2\, (1 + \|h'\|_\infty)^2\, \|\varphi^{(s)}\|^2_\infty. \tag{2.21}$$


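Remark 4. From the proof of the lemma it is evident that it is valid for any function $\varphi$ such that $\|\varphi'\|_\infty \le \|\varphi^{(s)}\|_\infty < \infty$. The lemma was proved in [13], but since it is an important ingredient of our proof, we give its proof here.

Proof of Lemma 1. Without loss of generality assume that $w_n > 1$. Using the method of [13], consider the function

$$Z_n(t) = E_{\eta,\beta,h}\left\{ \exp\left[ \frac{t}{2\tilde w_n} \sum_{i=1}^{n} \left( \varphi(\lambda_i) - \int \varphi(\lambda)\rho_\eta(\lambda)\, d\lambda \right) \right] \right\}, \qquad \tilde w_n = w_n (1 + \|h'\|_\infty)\, \|\varphi^{(s)}\|_\infty,$$

where $E_{\eta,\beta,h}\{.\}$ is defined in (1.5) with $V$ replaced by $V_{h,\eta}$ of (2.2). It is easy to see that

$$\frac{d^2}{dt^2} \log Z_n(t) = (2\tilde w_n)^{-2}\, E_{\beta,h-t\varphi/2\beta\tilde w_n}\left\{ \left( \sum_{i=1}^{n} \big( \varphi(\lambda_i) - E_{\beta,h-t\varphi/2\beta\tilde w_n}\{\varphi(\lambda_i)\} \big) \right)^2 \right\} \ge 0. \tag{2.22}$$

Hence in view of (2.20) we have

$$\log Z_n(t) = \log Z_n(t) - \log Z_n(0) = \int_0^t \frac{d}{d\tau} \log Z_n(\tau)\, d\tau \le |t| \left| \frac{d}{dt} \log Z_n(t) \right|$$
$$= |t|\, (2\tilde w_n)^{-1} \left| E_{\beta,h-t\varphi/2\beta\tilde w_n}\left\{ \sum_{i=1}^{n} \left( \varphi(\lambda_i) - \int \varphi(\lambda)\rho_\eta(\lambda)\, d\lambda \right) \right\} \right|$$
$$= \frac{|t|\, n}{2\tilde w_n} \left| \int \varphi(\lambda) \big( p^{(n,\eta)}_{1,\beta,h-t\varphi/2\beta\tilde w_n}(\lambda) - \rho_\eta(\lambda) \big)\, d\lambda \right| \le |t|, \qquad t \in [-1,1].$$

Thus $Z_n(t) \le e^{|t|} \le 3$, $t \in [-1,1]$, and for any $t \in \mathbb{C}$, $|t| \le 1$,

$$|Z_n(t)| \le Z_n(\Re t) < 3. \tag{2.23}$$

Then, by the Cauchy theorem, we have

$$|Z'_n(t)| = \left| \frac{1}{2\pi} \oint_{|t'|=1} \frac{Z_n(t')\, dt'}{(t'-t)^2} \right| \le 12, \qquad |t| \le \frac{1}{2},$$

and therefore for $|t| \le \frac{1}{24}$,

$$|Z_n(t) - 1| = \left| \int_0^t Z'_n(t)\, dt \right| \le \frac{1}{2} \quad \Rightarrow \quad |Z_n(t)| \ge \frac{1}{2}.$$

Hence $\log Z_n(t)$ is analytic for $|t| \le \frac{1}{24}$, and using the above bounds we have

$$\frac{d^2}{dt^2} \log Z_n(0) = \frac{1}{\pi i} \oint_{|t|=1/24} \frac{\log Z_n(t)}{t^3}\, dt \le C.$$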


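Finally, in view of (2.22) we get

$$\int k^{(\eta)}_{n,\beta,h}(\lambda,\mu)\, \varphi(\lambda)\, \varphi(\mu)\, d\lambda\, d\mu = E_{\eta,\beta,h}\left\{ \left( \sum_{i=1}^{n} \big( \varphi(\lambda_i) - E_{\eta,\beta,h}\{\varphi(\lambda_i)\} \big) \right)^2 \right\} \le 4C\tilde w_n^2, \qquad \tilde w_n = w_n (1 + \|h'\|_\infty)\, \|\varphi^{(s)}\|_\infty.$$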

Having the bounds (2.19), let us come back to the proof of Theorem 1. Set

$$M_n(d) = \sup_{z : d(z) \ge d} |u_{n,\eta}(z)|.$$

By (2.17) and the maximum principle, there exists a point $z$: $d(z) = d$ such that $M_n(d) = |u_{n,\eta}(z)|$. Then, using (2.18) with $L$ lying at the distance $\delta$ from $\sigma_\varepsilon$ (see the definition of $D_1$ below (2.7)), the definition of $F$ (2.14), and (2.19), we obtain for $d \le \delta/2$ the inequality

$$M_n(d) \le \frac{M_n^2(d)}{C_1 n d^{1/2}} + \frac{C_2 \log n}{d^{5/2}}, \qquad C_2 \le C_0 (1 + \|h'\|_\infty),$$

with some $C_0$, $C_1$ depending only on $\max_{D_1} |P_\eta^{-1}|$, $\delta$, and $C^*$ of (2.19). Solving the above quadratic inequality, we get that either

$$M_n(d) \ge \tfrac{1}{2}\left( C_1 n d^{1/2} + \sqrt{C_1^2 n^2 d - 4 C_1 C_2 n \log n / d^2} \right)$$

or

$$M_n(d) \le \tfrac{1}{2}\left( C_1 n d^{1/2} - \sqrt{C_1^2 n^2 d - 4 C_1 C_2 n \log n / d^2} \right).$$

Since the first inequality contradicts (2.19), we conclude that for $\delta/2 \ge d > n^{-1/3} \log n$ the second inequality holds. Hence we get

$$|u_{n,\eta}(z)| \le C_0 \log n\; d^{-5/2}(z)\, (1 + \|h'\|_\infty), \qquad \delta/2 \ge d(z) > n^{-1/3} \log n. \tag{2.24}$$

Now we can use (2.24) to apply Lemma 1 with $w_n = C_0 \log n$ to $z$ with $\delta/2 \ge d(z) > n^{-1/3} \log n$ and $\|h'\|_\infty \le 2$. Then we obtain for such $z$ (cf. (2.19))

$$|\delta^{(\eta)}_{n,\beta,h}(z)| \le C \log^2 n\; d^{-5}(z). \tag{2.25}$$

Then, using this bound and (2.24) in (2.18) and taking into account that $n^{-1} |\delta^{(\eta)}_{n,\beta,h}(z)| \le d^{-2}(z)$ for $d(z) > n^{-1/3} \log n$, we get that for $\|h'\|_\infty \le 1$ and $\delta/4 \ge d(z) > n^{-1/3} \log n$,

$$u_{n,\eta}(z) = -\frac{(2/\beta - 1)}{2\pi i\, X_\eta^{1/2}(z)} \oint_L \frac{g'_\eta(\zeta)\, d\zeta}{P_\eta(\zeta)(z-\zeta)} + O(d^{-5/2}(z)). \tag{2.26}$$

Applying Lemma 1 once more, we get $|\delta^{(\eta)}_{n,\beta,h}(z)| \le C d^{-5}(z)$ for $\delta/4 \ge d(z) > n^{-1/3} \log n$. Using this bound in (2.18) we prove (1.28) for $\delta/8 \ge d(z) > n^{-1/3} \log n$.
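The step from the quadratic inequality for $M_n(d)$ to the bound (2.24) can be made explicit: writing $A = C_1 n d^{1/2}$ and $B = C_2 \log n / d^{5/2}$, the smaller root of $M = M^2/A + B$ equals $2AB/(A + \sqrt{A^2 - 4AB})$, which lies between $B$ and $2B$ whenever the discriminant is nonnegative. The sketch below illustrates this numerically; the constants $C_1 = C_2 = 1$ and the value of $n$ are hypothetical and chosen only to show the orders of magnitude.

```python
import math


def quadratic_branches(n, d, c1, c2):
    """Roots of M = M^2 / A + B with A = c1*n*sqrt(d), B = c2*log(n)/d^(5/2).

    Returns (small_root, large_root, B); raises ValueError if the
    discriminant A^2 - 4AB is negative, i.e. if d is too small.
    """
    a = c1 * n * math.sqrt(d)
    b = c2 * math.log(n) / d ** 2.5
    disc = a * a - 4.0 * a * b
    if disc < 0:
        raise ValueError("discriminant is negative: d is too close to 0")
    root = math.sqrt(disc)
    small = 2.0 * a * b / (a + root)  # numerically stable form of (a - root) / 2
    large = 0.5 * (a + root)
    return small, large, b


# Hypothetical constants; only the orders of magnitude matter here.
n, c1, c2 = 10**6, 1.0, 1.0
d = n ** (-1.0 / 3.0) * math.log(n)
small, large, b = quadratic_branches(n, d, c1, c2)
print(small / b, large / (c1 * n * math.sqrt(d)))
```

In this regime the small root stays within a factor 2 of $B$, while the large root is of order $C_1 n d^{1/2}$, which is the branch excluded by (2.19).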


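Let us note that $u_{n,\eta}(z)$ is a function which is analytic everywhere in $\mathbb{C} \setminus \sigma_\varepsilon$ and which behaves like $|u_{n,\eta}(z)| \sim n z^{-2}$ as $z \to \infty$. Then, applying the Cauchy theorem, we have for any $z$ outside of $L$ that

$$u_{n,\eta}(z) = \frac{1}{2\pi i} \oint_L \frac{u_{n,\eta}(\zeta)\, d\zeta}{z - \zeta} \tag{2.27}$$

with the contour $L \subset D_2 := \{z : d(z) \le \delta/8\}$. This allows us to obtain a bound of the type (1.28) for $z \notin D_2$.

To prove (1.29), consider the Poisson kernel

$$P_y(\lambda) = \frac{y}{\pi (y^2 + \lambda^2)}.$$

It is easy to see that for any integrable $\varphi$,

$$(P_y * \varphi)(\lambda) = \frac{1}{\pi}\, \Im \int \frac{\varphi(\mu)\, d\mu}{\mu - (\lambda + i y)}. \tag{2.28}$$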
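The identity (2.28) is pointwise: $\Im\,(\mu - \lambda - iy)^{-1} = y/((\mu-\lambda)^2 + y^2)$, so the convolution with the Poisson kernel is the imaginary part of a Stieltjes-type integral. A minimal numerical sketch follows, assuming a Gaussian test function and a plain Riemann sum on a fixed grid; the function names, the grid, and the evaluation point are hypothetical.

```python
import math


def poisson_kernel(y, lam):
    """P_y(lam) = y / (pi * (y^2 + lam^2))."""
    return y / (math.pi * (y * y + lam * lam))


def phi(mu):
    """Hypothetical integrable test function (a Gaussian)."""
    return math.exp(-mu * mu)


def grid(lo=-12.0, hi=12.0, nodes=4801):
    """Uniform grid on [lo, hi] together with its step."""
    step = (hi - lo) / (nodes - 1)
    return [lo + k * step for k in range(nodes)], step


def convolution(y, lam):
    """(P_y * phi)(lam) approximated by a Riemann sum on the grid."""
    pts, step = grid()
    return step * sum(poisson_kernel(y, lam - mu) * phi(mu) for mu in pts)


def stieltjes_form(y, lam):
    """(1/pi) Im int phi(mu) d(mu) / (mu - (lam + i y)) on the same grid."""
    pts, step = grid()
    z = complex(lam, y)
    total = sum(phi(mu) / (mu - z) for mu in pts)
    return step * total.imag / math.pi


y0, lam0 = 0.3, 0.7  # hypothetical evaluation point
print(abs(convolution(y0, lam0) - stieltjes_form(y0, lam0)))
```

Both sides are evaluated with the same quadrature, so the printed difference reflects only floating-point rounding, which is what one expects from a pointwise identity.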

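Hence we can use (2.26) in order to prove that for $|y| \ge n^{-1/3} \log n$,

$$\|P_y * \nu_{n,\eta}\|_2^2 \le C (y^{-5} + y^{-1}), \qquad \text{for } \nu_{n,\eta}(\lambda) := n\big( p^{(n,\eta)}_{1,\beta}(\lambda) - \rho_\eta(\lambda) \big), \tag{2.29}$$

where $\|.\|_2$ is the standard norm in $L^2(\mathbb{R})$. Then we use the following formula (see [13]) valid for any $\nu \in L^2(\mathbb{R})$,

$$\int_0^\infty e^{-y} y^{2s-1}\, \|P_y * \nu\|_2^2\, dy = \Gamma(2s) \int_{\mathbb{R}} (1 + 2|\xi|)^{-2s}\, |\hat\nu(\xi)|^2\, d\xi. \tag{2.30}$$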
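Formula (2.30) reduces, frequency by frequency, to a Gamma-function integral: since the Fourier transform of $P_y$ is $e^{-y|\xi|}$ (up to the normalization convention, which is an assumption here), one needs $\int_0^\infty e^{-y} y^{2s-1} e^{-2y|\xi|}\, dy = \Gamma(2s)(1 + 2|\xi|)^{-2s}$. The sketch below checks this scalar identity numerically; the function names, the truncation point, and the sample frequency are hypothetical.

```python
import math


def weighted_exponential_integral(s, xi, upper=80.0, nodes=200000):
    """Trapezoid approximation of int_0^upper e^{-y} y^{2s-1} e^{-2 y |xi|} dy."""
    step = upper / nodes
    total = 0.0
    for k in range(1, nodes):  # the integrand vanishes at y = 0 when 2s > 1
        y = k * step
        total += math.exp(-y * (1.0 + 2.0 * abs(xi))) * y ** (2 * s - 1)
    # The endpoint y = upper enters with half weight; y = 0 contributes nothing.
    end = math.exp(-upper * (1.0 + 2.0 * abs(xi))) * upper ** (2 * s - 1)
    return step * (total + 0.5 * end)


def closed_form(s, xi):
    """Gamma(2s) * (1 + 2|xi|)^(-2s)."""
    return math.gamma(2 * s) * (1.0 + 2.0 * abs(xi)) ** (-2 * s)


s0, xi0 = 3, 0.7  # s = 3 is the value used for (2.31); xi0 is a hypothetical frequency
print(weighted_exponential_integral(s0, xi0), closed_form(s0, xi0))
```

The two printed numbers should agree to many digits; the weight $(1+2|\xi|)^{-2s}$ is what makes the Schwarz inequality in the next step cost only finitely many derivatives of $\varphi$.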

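Formula (2.30) for $s = 3$, the Parseval equation for the Fourier integral, and the Schwarz inequality yield

$$\left| \int_{\mathbb{R}} \varphi(\lambda)\, \nu_{n,\eta}(\lambda)\, d\lambda \right| = \frac{1}{2\pi} \left| \int_{\mathbb{R}} \hat\varphi(\xi)\, \overline{\hat\nu_{n,\eta}(\xi)}\, d\xi \right|$$
$$\le \frac{1}{2\pi} \left( \int_{\mathbb{R}} |\hat\varphi(\xi)|^2 (1 + 2|\xi|)^{2s}\, d\xi \right)^{1/2} \left( \int_{\mathbb{R}} |\hat\nu_{n,\eta}(\xi)|^2 (1 + 2|\xi|)^{-2s}\, d\xi \right)^{1/2}$$
$$\le \frac{C (\|\varphi\|_2 + \|\varphi^{(3)}\|_2)}{\Gamma^{1/2}(2s)} \left( \int_0^\infty e^{-y} y^{2s-1}\, \|P_y * \nu_{n,\eta}\|_2^2\, dy \right)^{1/2}. \tag{2.31}$$

To estimate the last integral here, we split it into two parts, $|y| \ge n^{-1/3} \log n$ and $|y| < n^{-1/3} \log n$. For the first integral we use (2.29) and for the second one (2.19).

(ii) To prove (1.30) we start from the relations

$$a_\eta - a = (\eta - 1) a_* + O((\eta-1)^2), \qquad b_\eta - b = (\eta - 1) b_* + O((\eta-1)^2),$$
$$P_\eta(z) - P(z) = (\eta - 1)\, p(z) + O((\eta-1)^2),$$
$$\Delta_{g/\eta} := g_\eta(z)/\eta - g(z) = -X_1^{-1/2}(z)(\eta^{-1} - 1) + X_1^{-3/2}(z)\, O((\eta-1)^2), \tag{2.32}$$
$$|(\rho_\eta/\eta - \rho, \varphi)| \le C |\eta - 1|\, (\|\varphi\|_2 + \|\varphi'\|_2),$$

where $a_*$, $b_*$ are some constants, $p(z)$ is some function analytic in $D$, and the bound in the third line is uniform in $D_1$. The first line here follows from the results of Theorem 1.2(c)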


of [11]. We would like to stress that although the results of [11] were obtained under the assumption that $V$ is defined on the whole $\mathbb{R}$, in fact it is proved there that the system of equations for the edge points (in our case $a_\eta$ and $b_\eta$ as functions of $\eta$) is uniquely solvable, i.e. that the corresponding Jacobian is nonzero. Since this Jacobian is expressed in terms of $\rho_\eta$ and derivatives of $\eta V$ at $a_\eta$ and $b_\eta$, these results are applicable to our case. The second line of (2.32) follows from the first one, if we use the representation (1.25). The third line follows from (2.11) written for the function $g_\eta(z)/\eta$. If we subtract from it (2.11) for $\eta = 1$ and then apply the operation (2.16) with a contour $L$ encircling $z$ and $\sigma_\varepsilon$ and lying at a finite distance $d > \delta$ from $\sigma_\varepsilon$ and the zeros of $P_1$, we get, with $\Delta_{g/\eta}(z) := g_\eta(z)/\eta - g(z)$ as in (2.32),

$$X_1^{1/2}(z)\, \Delta_{g/\eta}(z) = -(\eta^{-1} - 1) + P_1^{-1}(z)\, \Delta^2_{g/\eta}(z) + \frac{1}{2\pi i} \oint_L \frac{\Delta^2_{g/\eta}(\zeta)\, d\zeta}{(z - \zeta)\, P_1(\zeta)}.$$

Taking into account the first two lines of (2.32) and (2.12), we conclude that $\Delta_{g/\eta} = O(\eta - 1)$ on $L$; hence the integral above is $O((\eta-1)^2)$ uniformly in $D_1$. Thus, solving this quadratic equation, we obtain the third line of (2.32). The third line of (2.32) and (2.31) imply the last line of (2.32). Then (1.30) follows from (1.28) combined with (2.32).

(iii) Consider the functions $V_{\eta,t}$ of the form

$$V_{\eta,t}(\lambda) = t\eta V(\lambda) + (1-t)\, V^{(0)}_\eta(\lambda), \tag{2.33}$$

where $V^{(0)}_\eta$ is defined in (1.31). Let $Q^{(\eta)}_{n,\beta}(t) := Q_{n,\beta}[V_{\eta,t}]$ be defined by (1.2) with $V$ replaced by $V_{\eta,t}$. Then, evidently, $Q^{(\eta)}_{n,\beta}(1) = Q_{n,\beta}[\eta V]$, and $Q^{(\eta)}_{n,\beta}(0)$ corresponds to $V^{(0)}_\eta$ (see (1.31)). Hence

$$\frac{1}{n^2} \log Q^{(\eta)}_{n,\beta}(1) - \frac{1}{n^2} \log Q^{(\eta)}_{n,\beta}(0) = \frac{1}{n^2} \int_0^1 dt\, \frac{d}{dt} \log Q^{(\eta)}_{n,\beta}(t) = -\frac{\beta}{2} \int_0^1 dt \int d\lambda\, \big( \eta V(\lambda) - V^{(0)}_\eta(\lambda) \big)\, p^{(n,\eta)}_{1,\beta}(\lambda; t), \tag{2.34}$$

where $p^{(n,\eta)}_{1,\beta}(\lambda; t)$ is the first marginal density corresponding to $V_{\eta,t}$. Using (1.8), one can check that for the distribution (1.1) with $V$ replaced by $V_t$ the equilibrium density $\rho_t$ has the form

$$\rho_t(\lambda) = t\rho_\eta(\lambda) + (1-t)\, \rho^{(0)}_\eta(\lambda), \qquad \rho^{(0)}_\eta(\lambda) = \frac{2 X_\eta^{1/2}(\lambda)}{\pi d_\eta^2},$$

with $X_\eta$, $d_\eta$ of (1.31). Hence using (1.28) for the last integral in (2.34), we get

$$\log Q_{n,\beta}[\eta V] = \log Q_{n,\beta}[V^{(0)}_\eta] - \frac{\beta}{2} n^2 \mathcal{E}[V^{(0)}_\eta] + \frac{\beta}{2} n^2 \mathcal{E}[\eta V] + \frac{\beta n}{2\,(2\pi i)} \int_0^1 dt \oint_L \big( \eta V(z) - V^{(0)}_\eta(z) \big)\, u_{n,\eta}(z,t)\, dz. \tag{2.35}$$


Changing the variables in the corresponding integrals, we have n2β

dη log Q n,β [Vη(0) ] = log Q ∗n,β + + n(1 − β/2) log , 2 2 2β 2β d n 3n n2β η E[Vη(0) ] = − + log . 2 8 2 2 Then (1.31) follows. (n,η)

Derivation of the first bound of (2.19). Set ρη,h = p1,β,h − ρη,h , where ρη,h is the equilibrium density corresponding to the potential ηV + n −1 h. Note that if ρη,h does not exist (it could happen if h is not Hölder), then we should consider d Nη,h instead of ρη,h . We derive the first bound of (2.19) from the following inequality which was obtained in [2] (see also [18, Thm. 3] or [19, Thm. 11.1.2] for the more detailed version of the proof) −L[ ρη,h , ρη,h ] − C0 n −1 log n 2 ≤ E[ηV + n −1 h] − Q n,β [ηV + n −1 h] ≤ 2C0 n −1 log n, βn 2

(2.36)

where the constant C0 depends only on ||V + n −1 h ||∞ and A : (−A, A) ⊃ σε . Without loss of generality we can assume that A = 1. We will use below that in this case the quadratic form L[ f, f ] of (1.7) is negative definite for f : supp f ∈ [−1, 1]. Using the Schwarz inequality and the negative definiteness of the quadratic form L (n,η) on σε , we get for ρη := p1,β,h − ρη , −L[ ρη , ρη ] ≤ −2L[ ρη,h , ρη,h ] − 2L[ρη,h − ρη , ρη,h − ρη ] ≤ C0 n −1 log n + 4 E[ηV + n −1 h] − E[ηV ] + 2n −1 (h, ρη,h + ρη ) ≤ C1 n −1 log n(1 + ||h ||∞ ). Take any z ∈ σε , set ϕz (λ) := (λ − z)−1 and consider the function f z (λ) which is a solution of the Cauchy type equation on the interval σε/2 , f z (μ)dμ . (2.37) ϕz (λ) = σε/2 λ − μ By the standard theory of singular integral equations we can take 1/2 ϕz (μ)X ε/2 (μ)dμ dε2 − (z − c)(λ − c) 1 = − f z (λ) = , 1/2 1/2 1/2 λ−μ π X ε/2 (λ) σε/2 X ε/2 (z)X ε/2 (λ)(z − λ)2 where c = 21 (a + b), dε = 21 (b − a + ε), (z − c)2 − 4dε2 , z ∈ σε/2 , 1/2 X ε/2 (z) = |(z − c)2 − 4dε2 |, z ∈ σε/2 , and we consider the branch of the square root which is analytic outside σε/2 and 1/2 X ε/2 (z) ∼ z, as z → +∞. Then in view of (2.37) we have (L f z )(λ) = ϕz (λ) + C z ; C z is a λ-independent constant.

(2.38)


Then, using (2.1) and the fact that $L$ is negative definite, we obtain

$$\big| (\varphi_z,\; p^{(n,\eta)}_{1,\beta,h} - \rho_\eta) \big| = \big| L[f_z,\; p^{(n,\eta)}_{1,\beta,h} - \rho_\eta] \big| + O(e^{-n d^{(\eta)}_\varepsilon}),$$
$$\big| L[f_z,\; p^{(n,\eta)}_{1,\beta,h} - \rho_\eta] \big| \le \big| L[p^{(n,\eta)}_{1,\beta,h} - \rho_\eta,\; p^{(n,\eta)}_{1,\beta,h} - \rho_\eta] \big|^{1/2}\, \big| L[f_z, \bar f_z] \big|^{1/2} \le \big( C_1 n^{-1} \log n\, (1 + \|h'\|_\infty) \big)^{1/2}\, \big| L[f_z, \bar f_z] \big|^{1/2}.$$

Now, taking into account (2.38) and the fact that $(f_z, 1_{\sigma_{\varepsilon/2}}) = 0$, we can compute $L[f_z, \bar f_z]$ explicitly. This gives the first bound of (2.19).

3. Matrix Kernels for Orthogonal and Symplectic Ensembles

In this section we will prove Theorems 2 and 3, using the results of the previous section. But first we briefly recall the basic known facts from the theory of orthogonal and symplectic matrix models which we are going to use. Throughout this section $V(\lambda)$ is a polynomial potential of degree $2m$ with generic behavior, i.e. we assume that it satisfies Conditions C1–C2.

Matrix kernels (1.13)–(1.16) are constructed from the orthonormal system of functions $\{\psi^{(n)}_k\}_{k=0}^{\infty}$ of (1.11)–(1.12), which according to the standard orthogonal polynomial theory satisfy the recursion relations

$$\lambda \psi^{(n)}_k(\lambda) = a^{(n)}_{k+1} \psi^{(n)}_{k+1}(\lambda) + b^{(n)}_k \psi^{(n)}_k(\lambda) + a^{(n)}_k \psi^{(n)}_{k-1}(\lambda). \tag{3.1}$$

These relations define a semi-infinite Jacobi matrix $J^{(n)}$. It is known (see, e.g., the proof of Lemma 2 in [17]) that

$$|a^{(n)}_k| \le C, \qquad |b^{(n)}_k| \le C, \qquad |n - k| \le \varepsilon n. \tag{3.2}$$

By orthogonality and the spectral theorem we find that

$$(D^{(n)}_\infty)_{j,k} = n\, \mathrm{sign}(j-k)\, V'(J^{(n)})_{jk} \;\Rightarrow\; (D^{(n)}_\infty)_{j,k} = 0, \quad |j-k| \ge 2m, \qquad |(D^{(n)}_\infty)_{j,k}| \le nC, \quad |j-n| \le nc. \tag{3.3}$$
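The three-term recursion (3.1) and the associated Jacobi matrix can be illustrated on a toy example that is not the $n$-dependent system of (1.11)–(1.12): the Chebyshev polynomials of the second kind, rescaled to $[-2,2]$, are orthonormal for the semicircle weight and satisfy (3.1) with the constant coefficients $a_k = 1$, $b_k = 0$. The sketch below checks the recursion numerically and builds a truncated Jacobi matrix; all names and the constant coefficients are assumptions of this illustration.

```python
import math


def chebyshev_u_orthonormal(k, lam):
    """p_k(lam) = sin((k+1) theta) / sin(theta), lam = 2 cos(theta), |lam| < 2.

    These polynomials are orthonormal for the weight sqrt(4 - lam^2) / (2 pi).
    """
    theta = math.acos(lam / 2.0)
    return math.sin((k + 1) * theta) / math.sin(theta)


def recurrence_defect(k, lam, a=1.0, b=0.0):
    """lam * p_k - (a * p_{k+1} + b * p_k + a * p_{k-1}) for k >= 1."""
    return lam * chebyshev_u_orthonormal(k, lam) - (
        a * chebyshev_u_orthonormal(k + 1, lam)
        + b * chebyshev_u_orthonormal(k, lam)
        + a * chebyshev_u_orthonormal(k - 1, lam)
    )


def jacobi_matrix(size, a=1.0, b=0.0):
    """Truncated Jacobi matrix with constant coefficients a_k = a, b_k = b."""
    return [[b if i == j else a if abs(i - j) == 1 else 0.0
             for j in range(size)] for i in range(size)]


worst = max(abs(recurrence_defect(k, lam))
            for k in range(1, 8) for lam in (-1.5, -0.3, 0.4, 1.9))
print(worst, jacobi_matrix(3))
```

For a polynomial $V'$ of degree $2m-1$, the matrix $V'(J)$ built from such a tridiagonal $J$ has at most $2m-1$ nonzero off-diagonals on each side, which is the band structure stated in (3.3).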

We are going to use the formula for Sn,β , obtained in [25, Thm. 2] (see also [8]). In order to present this formula, introduce some more notations: (n)

(n)

(n)

(n)

1 := (ψn−2m+1 , ψn−2m+2 , . . . , ψn−1 )T , (n) (n) (n) T (n) 2 := (ψn , ψn+1 , . . . , ψn+2m−2 ) ,

and T Mr s := (r(n) , ((n) s ) ), (n)

(n)

T Dr s := ((r(n) ) , ((n) s ) ), 1 ≤ r, s ≤ 2. (n)

Observe that M∞ D∞ = 1 together with (D∞ ) jk = 0 for | j − k| ≥ 2m implies Tn = 1 − D12 M21 .


Then we have (see [8, Thm. 2.7]) ˆ 1 (μ), Sn,1 (λ, μ) = K n,2 (λ, μ) + 1 (λ)T D12 2 (μ) − 1 (λ)T G Gˆ := D12 M22 (1 − D21 M12 )−1 D21 , Sn/2,4 (λ, μ) = K n,2 (λ, μ) + 2 (λ)T D12 1 (μ) − 2 (λ)T G2 (μ),

(3.4)

G := −D21 (1 − M12 D21 )−1 M11 D12 , where K n,2 is defined in (1.10). Proof of Theorem 2. Using (3.4), it is straightforward to see that the first assertion of Theorem 2 follows from the following lemma. Lemma 2. Given any smooth functions f, g and any fixed A > 0 there exists a C > 0 such that for all n ≥ 2m and all j, k ∈ {n − A, . . . , n + A} one has

C (n) (n) (i) ( f ψ j )(λ)g(λ)ψk (λ)dλ ≤ || f ||∞ + || f ||∞ ||g||∞ + ||g ||∞ ; n

C (n) (3.5) (ii) ( f ψ j )(λ) ≤ √ || f ||∞ + || f ||∞ ; (iii) log det(Tn ) ≥ −C. n (n) which are used in Indeed, taking in (i) f = g = 1, we obtain that all entries of M∞ −1 −1 (3.4) are bounded by Cn , hence, having that | det T | ≤ C, we obtain that G and Gˆ have entries bounded by nC. This proves representation (1.18).

Proof of Lemma 2. Assertions (i) and (ii) of Lemma 2 will be derived from the asymptotics of the orthogonal polynomials in Sect. 4. We now prove assertion (iii), using Theorem 1. Note that without loss of generality we can assume that σ ⊂ (−1, 1) and v ∗ = 0 in (1.8). Similarly to the proof of Theorem 1 we choose ε in (1.26) small enough to have all zeros of P of (1.25) outside of σε and use the results of [18] that if we replace in all definitions (1.1)–(1.4) the integration in R by integration with respect to σε , then the new (η) (ε) Q n,β [V ] will differ from Q n,β [V ] by the factor (1 + O(e−ndε )) (see (2.1)). Moreover, (η)

the new marginal densities will differ from (1.4) by an additive error O(e−ndε ). Hence, starting from now, we assume that this replacement is made and all integrals below are in σε . Consider the “approximating” function Ha (Hamiltonian) (cf. (1.1)) Ha (λ1 . . . λn ) = −n + V (a) (λ) =

i= j q α=1

V (a) (λi )

log |λi − λ j |

Vα(a) (λ),

q α=1

1σα,ε (λi )1σα,ε (λ j ) − n 2 ∗ ,

(3.6)

(a)

where Vα (λ) is defined in (1.36), and ∗ is defined in (1.37). Then H (λ1 . . . λn ) = Ha (λ1 . . . λn ) + H (λ1 . . . λn ), λ1 , . . . , λn ∈ σε , log |λi − λ j | 1σα,ε (λi )1σα ,ε (λ j ) H (λ1 . . . λn ) = α=α

i= j

−2n

n

(λ j ) + n 2 ∗ , V

(3.7)

j=1

where (λ) = V

q

1σα,ε (λ)

α=1

Set (a) Q n,β

=

σεn

σ \σα

log |λ − μ|ρ(μ)dμ.

¯ eβ Ha (λ1 ...λn )/2 d λ.

(3.8)

By the Jensen Inequality, we have β β H H a ,β ≤ log Q n,β [V ] − log Q (a) H H,β , n,β ≤ 2 2 where < · · · > H,β := Q −1 n,β < · · · > Ha ,β

(3.9)

¯

¯ (. . . )eβ H (λ)/2 d λ, ¯ −1 := (Q (a) (. . . )eβ Ha (λ)/2 d λ¯ . n,β )

Let us estimate the r.h.s. of (3.9) for β = 2, q (n) H H,β = n(n − 1) p2,β (λ, μ) log |λ − μ| 1σα,ε (λ)1σα ,ε (μ) 2

(n)

α=α

2 ∗ , p −2n V 1,β + n .

Here and below (., .) means the inner product in L 2 (σε ). But using the definition of V and ∗ , we can rewrite the r.h.s. above as (n) H H,β = 1σα,ε (λ)1σα ,ε (μ) log |λ − μ|kβ (λ, μ)dλdμ α=α

+ n2

α=α

(n) (n) L 1σα,ε p1,β − ρ , 1σα ,ε p1,β − ρ ≤ C,

(3.10)

where the kernel kβ(n) is defined in (2.7) with η = 1 and h = 0. The term with δ(λ − μ) from (2.7) gives zero contribution here, because |λ − μ| ≥ δ in our integration domain (see (1.26)). The last inequality in (3.10) is obtained as follows. Since dist {σα,ε , σα ,ε } L α,α (λ) with 8 ≥ δ, and σε ⊂ [−1, 1], we can construct 6-periodic even function derivatives such that


L α,α (λ) = 0, |λ| > 5/2. L α,α (λ−μ) = log |λ−μ|, λ ∈ σα,ε , μ ∈ σα ,ε ,

(3.11)

Then, ∞

L α,α (λ − μ) =

ck eikπ(λ−μ)/3 , λ ∈ σα,ε , μ ∈ σα ,ε , |ck | ≤ Ck −8 . (3.12)

k=−∞

Note that we need 8 derivatives to have the above bound for the Fourier coefficients of L α,α , which will be used later. To estimate the first sum in (3.10), we use the well known bound for β = 2 (see e.g. [19, Thm. 4.2.6]) (n) 2 ϕ(λ)ϕ(μ)k2 (λ, μ)dλdμ = |ϕ(λ) − ϕ(μ)|2 K n,2 (λ, μ)dλdμ ≤ C||ϕ (λ)||∞ . (3.13) Hence

(n)

1σα,ε (λ)1σα ,ε (μ) log |λ − μ|kβ (λ, μ)dλdμ =

∞

ck

(n)

1σα,ε (λ)1σα ,ε (μ)eikπ λ/3 e−ikπ μ/3 kβ (λ, μ)dλdμ = O(1).

k=−∞

To estimate each term of the second sum in (3.10), we use the bound which can be obtained by integration of the asymptotics for orthogonal polynomials (4.2) and (4.4) of [5, Thms. 1.1 and 1.2] with any smooth ϕ similarly to that in Sect. 4, but more simple. We have

(n) ϕ(λ) p1,2 (λ) − ρ(λ) dλ = O(n −1 )||ϕ ||∞ . (3.14) (n)

Note that it is also proved in [5, Lem. 6.1] that | p1,2 (λ) − ρ(λ)| ≤ Cn −1 for any domain where ρ(λ) > c > 0, ! (n) (n) L 1σα,ε ( p1,β − ρ), 1σα ,ε ( p1,β − ρ) ∞

(n) (n) = ck 1σα,ε ( p1,β − ρ), eikπ λ/3 1σα ,ε ( p1,β − ρ), e−ikπ λ/3 = O(n −2 ). k=−∞

(3.15) (a)

To estimate the l.h.s. of (3.9), we first study the structure of Q n,β . Since Ha does not

contain terms log |λi − λ j | with λi ∈ σα,ε , λ j ∈ σα ,ε with α = α , Q (a) n,β can be written as n! Q ¯ [V (a) ], Q (a) n,β = k1 ! . . . kq ! k,β k1 +···+kq =n

Q k,β ¯ =

k1

σεn j=1

= e−n

1σ1,ε (λ j ) · · ·

2 β ∗ /2

n j=k1 +···+kq−1 +1

q α=1

Q kα ,β [nVα(a) /kα ],

1σq,ε (λ j )eβ Ha (λ1 ...λn )/2 d λ¯ (3.16)

(a) q

where the “effective potentials” {Vα }α=1 are defined in (1.36). Take {kα∗ }α=1 of (1.34) and set κk¯ :=

k1∗ ! . . . kq∗ ! k1 ! . . . kq !

q

Q k;β ¯ /Q k¯ ∗ ;β .

(3.17)

It is evident that (a)

Q n,β =

n!

Q ¯∗ κk¯ . k1∗ ! . . . kq∗ ! k ;β k1 +···+kq =n (α)

Choose ε sufficiently small to provide that |μ∗α n/kα − 1| ≤ ε1 for α = 1, . . . , q, if (α) (a) |k¯ − k¯ ∗ | ≤ εn, where ε1 is chosen for σα,ε and (n/kα )Vα as in Theorem 1. ¯ Let us prove that there exist n, k-independent C∗ , c∗ , c0 > 0 and b¯ ∈ Rq such that ¯ ¯∗ ¯ ¯∗ ¯ ¯ ¯∗ C∗ e−c∗ (k−k ,k−k )+(b,k−k ) , |k¯ − k¯ ∗ | ≤ εn, (3.18) κk¯ ≤ 2 C∗ e−c0 n , |k¯ − k¯ ∗ | ≥ εn. (a)

The proof of the first bound is based on Theorem 1 applied to the potential (μ∗α )−1 Vα (λ) which is defined on the interval σα,ε by (1.36). The equilibrium density for this potential is (μ∗α )−1 ρ(λ)1σα,ε . Indeed, the corresponding function vα of (1.8) evidently has the form vα (λ) = (μ∗α )−1 v(λ). Hence it satisfies (1.8) on σα,ε automatically. We need not to check (1.8) for other λ in view of Remark 3 after (2.1). Using (1.31) and (1.33), we can write for β = 1, 2, 4,

k2 β log Q kα ,β /kα ! = α En/kα + kα I (kα ) − aβ log kα + O(1), 2 1

β n (a) (0) I (kα ) := dt Vα (ζ ) − Vn/kα (ζ ) u n/kα (ζ, t)dζ (3.19) 2πi 0 kα

2π 1

+ 1 − β/2 log(dn/kα /2) − log kα β + + log , 2

(β/2) (a)

(0)

where En/kα = E[(n/kα )Vα ] is the equilibrium energy (1.6), u n/kα (., t), Vn/kα , and dn/kα are defined in (1.31). Using (1.30) and (2.32), we get for I (kα ) of (3.19), kα I (kα ) − kα∗ I (kα∗ ) = b(α) (kα − kα∗ ) + 1 − β/2 (kα − kα∗ ) log n + O(1) + O((kα − kα∗ )2 /n), (3.20) where the concrete form of b(α) is not important for us. Note that the term containing log n here appears because of the term log kα written as log n +log(kα /n) (see the second line of the definition of Ikα above). This term " is not important " for us because it disappears after summation with respect to α since kα = n = kα∗ (see (1.35)). To estimate the difference of the energies we introduce the notations kα 1σ (λ)ρn/kα (λ), ρ (α) (λ) := ρ(λ)1σα (λ), n α,ε (α) (α) n/kα (λ) := ρ n/kα (λ) − ρ (α) (λ). (α)

ρ n/kα (λ) :=

(3.21)

Then

(α) (α) (α) n/kα , ρ kα2 En/kα = n 2 L ρ n/kα − Va , ρ n/kα

(α) (α) (α) (α) = n2 L ρ n/kα , ρ n/kα + 2L ρ n/kα − V, ρ n/kα , ρ − ρ (α) .

Hence taking

! !

n 2 E (α) := n 2 L ρ (α) , ρ (α) − V, ρ (α) + 2L ρ (α) , ρ − ρ (α) , we obtain

(α) (α) (α) kα2 En/kα − n 2 E (α) = n 2 L n/kα , n/kα + 2L n/kα , ρ (α)

(α) (α) +2L n/kα , ρ − ρ (α) − V, n/kα (3.22) (α) (α) (α) (α) (α) = n 2 L n/kα , n/kα + n 2 v, n/kα ≤ n 2 L n/kα , n/kα =: n 2 L n/kα ,

where v is defined in (1.8). Here we used that in view of (1.8) and our assumption v ∗ = 0, we have v(λ) = 0, λ ∈ σα , (α) n/kα (λ)

=

v(λ) < 0, λ ∈ σα ,

(α) ρ n/k (λ) α

≥ 0, λ ∈ σα ,

(α) which implies v, n/kα ) ≤ 0. Note that for any function ϕ : supp ϕ ⊂ [−1, 1], ϕ = ϕ ∗ , L ϕ, ϕ = L ϕ − ϕ ∗ ψ∗ , ϕ − ϕ ∗ ψ∗ − |ϕ ∗ |2 log 2 ≤ −|ϕ ∗ |2 log 2,

(3.23)

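where $\psi_*(\lambda) = \pi^{-1} (1 - \lambda^2)^{-1/2}\, 1_{[-1,1]}$, and we used the well known properties of $\psi_*$:

$$\int_{-1}^{1} \log|\lambda - \mu|\, \psi_*(\mu)\, d\mu = -\log 2, \quad \lambda \in [-1,1], \qquad \int_{-1}^{1} \psi_*(\mu)\, d\mu = 1.$$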
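The first property of $\psi_*$ says that the logarithmic potential of the arcsine density is constant on $[-1,1]$. It can be checked numerically: with $\mu = \cos\theta$ the measure $\psi_*(\mu)\,d\mu$ becomes $d\theta/\pi$ on $[0,\pi]$ (which also makes the normalization $\int \psi_* = 1$ evident). The sketch below uses a midpoint rule in $\theta$; the function name, the number of nodes, and the sample points are hypothetical.

```python
import math


def arcsine_potential(lam, nodes=200000):
    """int_{-1}^{1} log|lam - mu| psi_*(mu) d(mu), psi_*(mu) = 1 / (pi sqrt(1 - mu^2)).

    With mu = cos(theta) the measure psi_*(mu) d(mu) becomes d(theta) / pi on [0, pi],
    so a midpoint rule in theta avoids the endpoint singularities of psi_*.
    """
    h = math.pi / nodes
    total = 0.0
    for k in range(nodes):
        gap = abs(lam - math.cos((k + 0.5) * h))
        total += math.log(max(gap, 1e-300))  # guard against an exact hit of the log singularity
    return total * h / math.pi


for lam in (-0.8, 0.0, 0.3):  # hypothetical sample points in [-1, 1]
    print(lam, arcsine_potential(lam), -math.log(2.0))
```

The midpoint rule converges slowly near the logarithmic singularity, so agreement with $-\log 2$ is expected only to a few digits at this resolution.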

Since by (3.21)

(α) ρ n/k (λ)dλ = kα /n, α

ρ (α) (λ)dλ = μ∗α ,

we get n 2 L n/kα ≤ − log 2|kα /n − μ∗α |2 n 2 .

(3.24)

In view of (3.22), to obtain the estimate for the difference of energies, it suffices to obtain the bound for (kα∗ )2 En/kα∗ − n 2 E (α) from below. But it follows from (2.32) and (1.23) (α) (α) that for λ ∈ supp ρn/k ∗ \σα ∪ σα \supp ρn/k ∗ , α

α

(α)

n/k ∗ (λ) = O(|dα∗ |1/2 ), v(λ) = O(|dα∗ |3/2 ), α

(3.25)


where dα∗ := kα∗ /n − μ∗α , |dα∗ | ≤ 2n −1 . Thus, ∗ 3 −3 v, (α) n/k ∗ = O(|dα | ) = O(n ), (kα∗ )2 En/kα∗

α

− n 2 E (α) = n 2 L n/kα∗ + O(n −1 ).

In order to estimate L n/kα∗ , observe that if we denote vn/kα∗ (λ) the l.h.s. of (1.8) for (a) (α) (n/kα∗ )Vα and σα∗ = σα ∩ supp ρ n/k ∗ , then relations (2.32) yield α

vn/kα∗ (λ) − v(λ) = dα∗ 1σα∗ (λ) + O(dα∗ )1R\σα∗ (λ), dα∗ = const = O(dα∗ ). (α)

Integrating these relations with n/kα (λ), we obtain that n 2 L n/kα∗ = O(1) and kα2 En/kα − (kα∗ )2 En/kα∗ ≤ − log 2(kα − kα∗ )2 /2 + O(1).

(3.26)

Then (3.17), (3.19), (3.20), and (3.26) imply the first bound " of (3.18), if we take the sum with respect to α = 1, . . . , q and take into account that α (kα − kα∗ ) = 0. To obtain the second bound of (3.18), we use the inequality βkα2 En/kα + Ckα log kα 2 with some universal C (see [2] or [19, Thm. 11.1.2]). Using the inequality for kα and (3.19) for kα∗ , we get in view of (3.26),

log Q kα ,β /Q kα∗ ,β ≤ −β log 2|kα − kα∗ |2 /4 + O(n log n). log Q kα ,β ≤

Hence we obtain the second inequality of (3.18). Now we are ready to find a bound for the l.h.s. of (3.9). We have

−1 H H a ,β = κk¯ H k¯ κk¯ , k1 +···+kq =n

H k¯ :=

α=α

+n 2 =

k1 +···+kq =n (k ) p1,βα ,

(k ) p1,βα

L ρ (α) , ρ (α )

− 2n

α

(λ), p (kα ) kα V 1,β (3.27)

α=α

n n (α ) (k ) (k ) =: kα kα L p1,βα − ρ (α) , p1,βα − ρ k α k α L kα kα . kα kα

α=α

kα kα L

α=α

Then for |k¯ − k¯ ∗ | ≤ εn we write similarly to (3.12) and (3.15), ∞ (kα ) n (α) i jπ λ/3 (kα ) n (α ) −i jπ λ/3 p1,β − c j p1,β − ρ , e ρ ,e L kα kα = kα kα ≤

j=−∞ ∞

2 (k ) 2 (k ) |c j | p1,βα − ρn/kα , e−i jπ λ/3 + p1,βα − ρn/kα , e−i jπ λ/3

j=−∞

2 n 2 (α ) 2

n 2 (α) −i jπ λ/3 −i jπ λ/3 , e + , e n/kα n/kα kα2 kα2 2 2 ≤ O(n −2 ) + C n/kα − (μ∗α )−1 + n/kα − (μ∗α )−1 . +


To get the bound O(n −2 ) for the first two terms in the r.h.s. here, we used (1.29) for ϕk = eikπ λ/3 and the bound for c j from (3.12), and for the last two terms we used (3.12), εn, in view of the second line of combined with the last line of (2.32). For |k¯ − k¯ ∗ | ≥ (3.18), it suffices to use that the r.h.s. of (3.27) is O(n 2 ). Thus we get finally | H H a ,β | ≤ C. The bound, (1.20), and (3.9), yield det(Tn ) =

Q n,1 Q n/2,4 Q n,2 (n/2)!2n

2 ≥C

q

Q kα∗ ,1 Q kα∗ /2,4

2 ∗

Q kα∗ ,2 (kα∗ /2)!2kα

α=1

.

Hence it is enough to prove that each multiplier in the r.h.s. is bounded from below. Consider the functions Vt of (2.33). Then, as it was mentioned in the proof of Theorem 1 (iii), the limiting equilibrium density ρt has the form (2.35), which corresponds to Vt . Hence Vn/kα∗ ,t satisfies conditions of Theorem 1 for any t ∈ [0, 1]. Moreover, if we introduce the matrix Tn (t) for the potential Vt by the same way, as above, then Tn (0) corresponds to GOE or GSE. Consider the function L(t) = log det Tkα∗ (t).

(3.28)

To prove that |L(1)| ≤ C, it is enough to prove that |L(0)| ≤ C, |L (t)| ≤ C, t ∈ [0, 1].

(3.29)

The first inequality here follows from the results of [24]. To prove the second inequality we use (1.20) for V replaced by Vt . Then we get (kα∗ /2) (kα∗ ) L (t) = (kα∗ )2 V (λ) p1,4,t (λ)dλ + V (λ) p1,1,t (λ)dλ

(kα∗ ) − 2 V (λ) p1,2,t (λ)dλ , (0)

V (λ) = (n/kα∗ )Vα(a) (λ) − Vn/k ∗ (λ). α

Using (1.28), we obtain that the first term of (1.28) and the integral in the definition of u n,η (z, t) give zero contributions in L (t), hence L (t) = O(1). Thus we have proved the second inequality in (3.29) and so assertion (iii) of Lemma 2. As it was mentioned (1) (4) above, assertions (i) and (iii) of Lemma 2 and (3.3) imply that F jk and F jk of (1.18) are bounded uniformly in n. To prove the first line of (1.38) for β = 1, we use (1.18), (n) p1,1 (λ) f (λ)dλ = n −1 Sn,1 (λ, λ) f (λ)dλ =

(n) p1,2 (λ) f (λ)dλ +

2m−1

(1) F jk

(n)

(n)

ψn+ j (λ)ψn+k (λ) f (λ)dλ.

j,k=−(2m−1)

Thus we obtain the first line of (1.38) from (3.14) and (3.5)(i). For β = 4 the proof is the same. The second line of (1.38) follows from the first one in view of Lemma 1.


To obtain (1.39), observe that we have proved already that the l.h.s. of (1.39) is more than the r.h.s. Hence we are left to prove the opposite inequality. To this aim we use the inequality (3.9) and the relation (3.10). Then each term in the first sum of (3.10) can be estimated by the same way as for β = 2 by using the second line of (1.38) instead of (3.13). Each term in the second sum in (3.10) can be estimated by the same way, as for β = 2, if we use the first line of (1.38). Proof of Theorem 3. The convergence of the diagonal entries of K n,1 and K n/2,4 to the corresponding limiting expressions follows from the first assertion of Theorem 2, (ii) of (3.5) and convergence of K n,2 to K ∞ . The same is valid for 12-entries of K n,1 and K n/2,4 . Moreover, since Sn,β (λ, μ) = − Sn,β (μ, λ), one has μ μ Sn,1 (t, μ) dt, ( Sn/2,4 )(λ, μ) = − Sn/2,4 (t, μ) dt, ( Sn,1 )(λ, μ) = − λ

λ

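which implies convergence of the 21-entries of $K_{n,1}$ and $K_{n/2,4}$.

4. Uniform Bounds for Integrals with $(f\psi^{(n)}_k)$

In this section we prove the bounds (i) and (ii) of Lemma 2, using the asymptotics of orthogonal polynomials (1.11)–(1.12) obtained in [5]. Throughout this section $V$ satisfies Conditions C1–C2. To simplify formulas below, we set

$$\delta_n = n^{-2/3+\kappa}, \quad 0 < \kappa < 1/3, \qquad \sigma_{\pm\delta_n} = \bigcup_{\alpha=1}^{q} \sigma_{\alpha,\pm\delta_n}, \qquad \sigma_{\alpha,\pm\delta_n} = [E_{2\alpha-1} \mp \delta_n,\; E_{2\alpha} \pm \delta_n], \quad (\sigma_{\alpha,-\delta_n} \subset \sigma_\alpha \subset \sigma_{\alpha,+\delta_n}). \tag{4.1}$$

Then, according to [5, Thm. 1.1], we have

$$\psi^{(n)}_n(\lambda) = R_0(\lambda) \cos n\pi\phi_n(\lambda)\, (1 + O(n^{-1})), \quad \lambda \in \sigma_{-\delta_n},$$
$$\psi^{(n)}_{n-1}(\lambda) = R_1(\lambda) \sin n\pi\phi_{n-1}(\lambda)\, (1 + O(n^{-1})), \quad \lambda \in \sigma_{-\delta_n},$$
$$\phi_n(\lambda) = \phi(\lambda) + n^{-1} m_0(\lambda), \qquad \phi_{n-1}(\lambda) = \phi(\lambda) + n^{-1} m_1(\lambda), \qquad \phi(\lambda) = \int_\lambda^{E_{2q}} \rho(\mu)\, d\mu, \tag{4.2}$$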

(n)

where we represent M1 and M2 from formula (1.31) of [5] as γn−1 M1 (λ) = R0 (λ)eim 0 (λ) , (n)

(n)

(n)

γn−1 M2 (λ) = R1 (λ)eim 1 (λ) , R0 , R1 > 0 (where γ j is the leading coefficient of p j of (1.12)); hence R0 and R1 are smooth functions, which behave like |X −1/4 (λ)| near each E α (see (1.24) for the definition of X ), and m 0 , m 1 are smooth functions such that their first derivatives are bounded by |X −1/2 (λ)| (recall that for σ = [−2, 2] m 0 (λ) and m 1 (λ) are arccos(λ/2) with some coefficients). It will be important for us that πan(n) R0 (λ)R1 (λ) cos(π(m 0 (λ) − m 1 (λ))) = 1, λ ∈ σ−δn , (n)

(4.3)

where an is defined in (3.1). This relation follows from the fact (see [5], Lem. 6.1) that for λ ∈ σ−δn ,

O(n −1 ) (n) (n) . (λ) − ψn(n) (λ)(ψn−1 (λ)) = ρ(λ) + 3/2 an(n) n −1 (ψn(n) (λ)) ψn−1 |X (λ)|


For |λ − E α | ≤ δn we have (see [5, Thm. 1.2])

(α) Ai n 2/3 α (−1)α (λ − E α ) (1 + O(|λ − E α |)) ψn(n) (λ) = n 1/6 B11

(α) +n −1/6 B12 Ai n 2/3 α (−1)α (λ − E α ) ·(1 + O(|λ − E α |)) + O(n −1 ),

(n) (α) ψn−1 (λ) = n 1/6 B21 Ai n 2/3 α (−1)α (λ − E α ) (1 + O(|λ − E α |))

(α) +n −1/6 B22 Ai n 2/3 α (−1)α (λ − E α )

(4.4)

·(1 + O(|λ − E α |)) + O(n −1 ). Functions α in (4.2) are analytic in some neighborhood of 0 and such that α (λ) = K α x + O(x 2 ) with some positive n-independent K α . Moreover, (n)

|ψn(n) (λ)| + |ψn−1 (λ)| ≤ e−nc dist

3/2 {λ,σ }

, λ ∈ R\σ+δn .

The proof of Lemma 2 is based on the following proposition. Proposition 1. Under Conditions C1-C2 for any smooth function f we have uniformly in σ+δn , ( f ψn(n) )(λ) = f (λ)R0 (λ) +

2q

sin nφn (λ) 1σ−δn + χn (λ) nφn (λ)

n −1/2 B11 n 2/3 α (−1)α (λ − E α ) (−1)α α 0 (α)

f (E α )

α=1 −5/6

) 1|λ−E α |≤δn + rn (λ) + O(n −1 ),

(4.5)

+ O(n x (x) := Ai(t)dt, −∞

where |χn (λ)| ≤ n −1/2 C is a piecewise constant function which is a constant in each σα,−δn and each interval (E α − δn , E α + δn ), and the remainder rn (λ) admits the bound |rn (λ)|dλ ≤ Cn −1/2−3κ/4 . (4.6) σ+δn

(n) ) if we replace sin by cos, R0 by R1 , φn by A similar representation is valid for ( f ψn−1 (α)

(α)

φn−1 , and B11 by B21 . Proof. Let λ ∈ σα,−δn . Then, integrating by parts in (4.2), we obtain

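\[
\int_{E_{2\alpha-1}+\delta_n}^{\lambda} f(\mu)\,\psi_n^{(n)}(\mu)\,d\mu
= f(\mu)R_0(\mu)\,\frac{\sin n\phi_n(\mu)}{n\phi_n'(\mu)}\,\bigg|_{E_{2\alpha-1}+\delta_n}^{\lambda}
- \int_{E_{2\alpha-1}+\delta_n}^{\lambda} \sin n\phi_n(\mu)\,\frac{d}{d\mu}\!\left(\frac{f(\mu)R_0(\mu)}{n\phi_n'(\mu)}\right) d\mu. \tag{4.7}
\]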


Moreover, by (4.4), we have for |λ − E 2α | ≤ δn ,

λ

ψn(n) (μ) f (μ)dμ E 2α −δn n 2/3 2α (λ − E 2α ) − n 2/3 2α (−δn ) −1/2 (2α) =n B11 f (λ) + O(n −5/6 ) (λ − E α ) n 2/3 2α (λ − E 2α ) −1/2 (2α) B11 f (E 2α ) =n + o(n −1/2 )const + O(n −5/6 ). (0)

Similar relations are valid for integrals near E 2α−1 . Taking into account that the integrals 3/2 3κ/2 (n) over R\σ+δn are of the order O(e−ncδn ) = O(e−n ) and writing ( f ψn ) as a sum of the above integrals and similar ones (with integration from λ to E 2α ), we obtain (4.5). Then, for λ ∈ σα,−δn , 1 χn (λ) = 2

E 2α−1 +δn E 1 −δn

f (μ)ψn(n) (μ)dμ −

1 2

E 2q +δn E 2α−1 +δn

f (μ)ψn(n) (μ)dμ = O(n −1/2 ),

and rn is the sum of the terms, which are under the integrals in the r.h.s. of (4.7). Hence we are left to prove the bound for rn . Using that φn (λ) = (2π )−1 P(λ)X 1/2 (λ) + n −1 m (λ), |R0 (λ)| ≤ C|X −1/4 (λ)| and taking into account that by definition rn (μ) = 0 for μ ∈ σ−δn , we obtain d f (μ)R0 (μ) dμ + O(n −5/6 ) |rn (μ)|dμ ≤ σ+δn σ−δn dμ nφn (μ) −3/4

≤ C(1 + || f ||∞ )n −1 δn

= C(1 + || f ||∞ )n −1/2−3κ/4 .

Proof of Lemma 2. Using recursion relations (3.1), it is easy to get that for any | j| ≤ 2m, (n) (n) (n) ψn+ j (λ) = f 0 j (λ)ψn (λ) + f 1 j (λ)ψn−1 (λ), (n)

(n)

where f 0 j and f 1 j are polynomials of degree at most | j|. Note that since ak and bk are bounded uniformly in n for k − n = o(n), f 0 j and f 1 j have coefficients, bounded uniformly in n. Hence assertion (ii) follows from Proposition 1. Moreover, it follows from the above argument that to prove assertion (i) it suffices to estimate (n) (n) (n) I1 := (gψn−1 , ( f ψn(n) )), I2 := (gψn−1 , ( f ψn−1 )),

I3 := (gψn(n) , ( f ψn(n) )) with differentiable f, g. It follows from Proposition 1 that I1 = I1,0 +

2q α=1

(n)

I1,α + (gψn−1 , rn ) + O(n −1 ),

(4.8)


where

sin nφn (λ) sin nφn−1 (λ) dλ, Fn (λ) σ−δn E 2α +δn

−1/3 (2α) (2α) =n B11 B21 f (E 2α )g(E 2α ) n 2/3 2α (λ − E 2α ) E 2α −δn Ai n 2/3 2α (λ − E 2α ) dλ, · 2α (0)

I1,0 = n I1,2α


−1

f (λ)g(λ)R0 (λ)R1 (λ)

kn

and I1,2α−1 is the integral similar to I1,2α for the region |λ − E 2α−1 | ≤ δn . It is easy to see that (α)

(α)

I1,α = B11 B21

f (E α )g(E α ) (1 + o(1)) . 2n(α (0))2

Moreover, (4.6) and (ii) of Lemma 2 yield (n) (n) (n) (gψn−1 , rn ) = −((gψn−1 ), rn ) ≤ |(gψn−1 )| |rn | dλ = O(n −1−3κ/4 ). Hence we are left to find the bound for I1,0 , cos π(m 0 (λ) − m 1 (λ)) I1,0 = (2n)−1 f (λ)g(λ)R0 (λ)R1 (λ) dλ φn (λ) σ−δn cos n(φn (λ) + φn−1 (λ)) dλ = I10 f (λ)g(λ)R0 (λ)R1 (λ) + I10 . +(2n)−1 (λ) φ σ−δn n By (4.3) and (4.2), we obtain I10 =

1 2n

σ−δn

f (λ)g(λ) dλ + o(1) . 1/2 P(λ)X (λ) −3/2

= O(n −2 δ Integrating by parts, we obtain that I10 ) = O(n −1−3κ/2 ). n The other two integrals from (4.8) can be estimated similarly.

Acknowledgements. The author thanks Prof. T. Kriecherbauer and Prof. B. Eynard for the fruitful discussion. This work was partially supported by the joint project 17-01-10 of National Academy of Sciences of Ukraine and Russian Foundation for Basic Research.

References

1. Albeverio, S., Pastur, L., Shcherbina, M.: On the 1/n expansion for some unitary invariant ensembles of random matrices. Commun. Math. Phys. 224, 271–305 (2001)
2. Boutet de Monvel, A., Pastur, L., Shcherbina, M.: On the statistical mechanics approach in the random matrix theory. Integrated density of states. J. Stat. Phys. 79, 585–611 (1995)
3. Bleher, P., Its, A.: Double scaling limit in the random matrix model: the Riemann-Hilbert approach. Comm. Pure Appl. Math. 56, 433–516 (2003)
4. Claeys, T., Kuijlaars, A.B.J.: Universality of the double scaling limit in random matrix models. Comm. Pure Appl. Math. 59, 1573–1603 (2006)
5. Deift, P., Kriecherbauer, T., McLaughlin, K., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Commun. Pure Appl. Math. 52, 1335–1425 (1999)
6. Deift, P., Gioev, D.: Universality in random matrix theory for orthogonal and symplectic ensembles. Int. Math. Res. Papers. 2007, no. 2, Art ID rpm 004, 004-116
7. Deift, P., Gioev, D.: Universality at the edge of the spectrum for unitary, orthogonal, and symplectic ensembles of random matrices. Comm. Pure Appl. Math. 60, 867–910 (2007)
8. Deift, P., Gioev, D., Kriecherbauer, T., Vanlessen, M.: Universality for orthogonal and symplectic Laguerre-type ensembles. J. Stat. Phys. 129, 949–1053 (2007)
9. Ercolani, N.M., McLaughlin, K.D.: Asymptotics of the partition function for random matrices via Riemann-Hilbert techniques, and applications to graphical enumerations. Int. Math. Res. Not. 2003:14, 755–820 (2003)
10. Forrester, P.J.: Log-gases and random matrices. Princeton, NJ: Princeton University Press, 2010
11. Kuijlaars, A.B.J., McLaughlin, K.T.-R.: Generic behavior of the density of states in random matrix theory and equilibrium problems in the presence of real analytic external fields. Comm. Pure Appl. Math. 53, 736–785 (2000)
12. McLaughlin, K.T.-R., Miller, P.D.: The steepest descent method for orthogonal polynomials on the real line with varying weights. International Mathematics Research Notices, 2008, Article ID rnn075, 66p (2008)
13. Johansson, K.: On fluctuations of eigenvalues of random Hermitian matrices. Duke Math. J. 91, 151–204 (1998)
14. Levin, L., Lubinsky, D.S.: Universality limits in the bulk for varying measures. Adv. Math. 219, 743–779 (2008)
15. Mehta, M.L.: Random Matrices. New York: Academic Press, 1991
16. Kriecherbauer, T., Shcherbina, M.: Fluctuations of eigenvalues of matrix models and their applications. http://arxiv.org/abs/1003.6121v1 [math-ph], 2010
17. Pastur, L., Shcherbina, M.: Universality of the local eigenvalue statistics for a class of unitary invariant random matrix ensembles. J. Stat. Phys. 86, 109–147 (1997)
18. Pastur, L., Shcherbina, M.: Bulk universality and related properties of Hermitian matrix models. J. Stat. Phys. 130, 205–250 (2007)
19. Pastur, L., Shcherbina, M.: Eigenvalue Distribution of Large Random Matrices. Math. Surv. Monogr. III, Providence, RI: Amer. Math. Soc., 2011, 634pp
20. Shcherbina, M.: Double scaling limit for matrix models with non analytic potentials. J. Math. Phys. 49, 033501–033535 (2008)
21. Shcherbina, M.: On universality for orthogonal ensembles of random matrices. Commun. Math. Phys. 285, 957–974 (2009)
22. Shcherbina, M.: Edge universality for orthogonal ensembles of random matrices. J. Stat. Phys. 136, 35–50 (2009)
23. Stojanovic, A.: Universality in orthogonal and symplectic invariant matrix models with quartic potentials. Math. Phys. Anal. Geom. 3, 339–373 (2002)
24. Tracy, C.A., Widom, H.: Correlation functions, cluster functions, and spacing distributions for random matrices. J. Stat. Phys. 92, 809–835 (1998)
25. Widom, H.: On the relations between orthogonal, symplectic and unitary matrix models. J. Stat. Phys. 94, 347–363 (1999)

Communicated by S. Smirnov

Commun. Math. Phys. 307, 791–815 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1328-4


From a Large-Deviations Principle to the Wasserstein Gradient Flow: A New Micro-Macro Passage

Stefan Adams1, Nicolas Dirr2, Mark A. Peletier3, Johannes Zimmer4

1 Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK. E-mail: [email protected]
2 School of Mathematics, Cardiff University, Senghennydd Road, Cardiff, Wales CF24 7AG, UK. E-mail: [email protected]
3 Department of Mathematics and Computer Science and Institute of Complex Molecular Systems, Technische Universiteit Eindhoven, Den Dolech 2, P. O. Box 513, 5600 MB Eindhoven, The Netherlands
4 Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK

Received: 11 May 2010 / Accepted: 14 April 2011 Published online: 24 September 2011 – © Springer-Verlag 2011

Abstract: We study the connection between a system of many independent Brownian particles on one hand and the deterministic diffusion equation on the other. For a fixed time step $h > 0$, a large-deviations rate functional $J_h$ characterizes the behaviour of the particle system at $t = h$ in terms of the initial distribution at $t = 0$. For the diffusion equation, a single step in the time-discretized entropy-Wasserstein gradient flow is characterized by the minimization of a functional $K_h$. We establish a new connection between these systems by proving that $J_h$ and $K_h$ are equal up to second order in $h$ as $h \to 0$. This result gives a microscopic explanation of the origin of the entropy-Wasserstein gradient flow formulation of the diffusion equation. Simultaneously, the limit passage presented here gives a physically natural description of the underlying particle system by describing it as an entropic gradient flow.

1. Introduction

1.1. Particle-to-continuum limits. In 1905, Einstein showed [Ein05] how the bombardment of a particle by surrounding fluid molecules leads to behaviour that is described by the macroscopic diffusion equation (in one dimension)
\[
\partial_t \rho = \partial_{xx}\rho \qquad \text{for } (x, t) \in \mathbb{R}\times\mathbb{R}^+. \tag{1}
\]

There are now many well-established derivations of continuum equations from stochastic particle models, both formal and rigorous [DMP92,KL99]. In this paper we investigate a new method to connect some stochastic particle systems with their upscaled deterministic evolution equations, in situations where these equations can be formulated as gradient flows. This method is based on a connection between two concepts: large-deviations rate functionals associated with stochastic processes on one hand, and gradient-flow formulations of deterministic differential equations on the other. We explain these below.


The paper is organized around a simple example: the empirical measure of a family of $n$ Brownian particles $X^{(i)}(t) \in \mathbb{R}$, $t \ge 0$, has a limit as $n \to \infty$, which is characterized by Eq. (1). The natural variables to compare are the empirical measure of the positions at time $t$, i.e. $L_t^n = n^{-1}\sum_{i=1}^n \delta_{X^{(i)}(t)}$, which describes the density of particles, and the solution $\rho(\cdot, t)$ of (1). We take a time-discrete point of view and consider time points $t = 0$ and $t = h > 0$.

Large-deviations principles. A large-deviations principle characterizes the fluctuation behaviour of a stochastic process. We consider the behaviour of $L_h^n$ under the condition of a given initial distribution $L_0^n \approx \rho_0 \in \mathcal{M}_1(\mathbb{R})$, where $\mathcal{M}_1(\mathbb{R})$ is the space of probability measures on $\mathbb{R}$. A large-deviations result expresses the probability of finding $L_h^n$ close to some $\rho \in \mathcal{M}_1(\mathbb{R})$ as
\[
P\big(L_h^n \approx \rho \,\big|\, L_0^n \approx \rho_0\big) \approx \exp\big[-n J_h(\rho\,;\rho_0)\big] \qquad \text{as } n \to \infty. \tag{2}
\]
The functional $J_h$ is called the rate function. By (2), $J_h(\rho\,;\rho_0)$ characterizes the probability of observing a given realization $\rho$: large values of $J_h$ imply small probability. Rigorous statements are given below.

Gradient-flow formulations of parabolic PDEs. An equation such as (1) characterizes an evolution in a state space $\mathcal{X}$, which in this case we can take as $\mathcal{X} = \mathcal{M}_1(\mathbb{R})$ or $\mathcal{X} = L^1(\mathbb{R})$. A gradient-flow formulation of the equation is an equivalent formulation with a specific structure. It employs two quantities, a functional $E : \mathcal{X} \to \mathbb{R}$ and a dissipation metric $d : \mathcal{X}\times\mathcal{X} \to \mathbb{R}$. Equation (1) can be written as the gradient flow of the entropy functional $E(\rho) = \int \rho\log\rho\,dx$ with respect to the Wasserstein metric $d$ (again, see below for precise statements). We shall use the following property: the solution $t \mapsto \rho(t, \cdot)$ of (1) can be approximated by the time-discrete sequence $\{\rho^n\}$ defined recursively by
\[
\rho^n \in \operatorname*{argmin}_{\rho\in\mathcal{X}} K_h(\rho\,;\rho^{n-1}),\qquad
K_h(\rho\,;\rho^{n-1}) := \frac{1}{2h}\, d(\rho, \rho^{n-1})^2 + E(\rho) - E(\rho^{n-1}). \tag{3}
\]
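The structure of the scheme (3) is easiest to see in a finite-dimensional analogue. The following sketch is an illustration only and is not taken from the paper: it replaces the state space by the real line with the Euclidean distance and assumes the hypothetical energy E(x) = lam * x**2 / 2, for which the gradient flow is x' = -lam * x. The names `energy`, `K`, `golden_section_min` and `minimizing_movement_step` are introduced here for the example. Under these assumptions the minimizer of K_h coincides with one backward Euler step, x_prev / (1 + lam * h).

```python
import math


def energy(x, lam):
    """Hypothetical energy E(x) = lam * x^2 / 2 (an assumption for this sketch)."""
    return 0.5 * lam * x * x


def K(x, x_prev, h, lam):
    """Finite-dimensional analogue of K_h in (3), with d the Euclidean distance."""
    return (x - x_prev) ** 2 / (2.0 * h) + energy(x, lam) - energy(x_prev, lam)


def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def minimizing_movement_step(x_prev, h, lam):
    """One step x_n = argmin_x K(x; x_prev), computed numerically."""
    radius = abs(x_prev) + 1.0
    return golden_section_min(lambda x: K(x, x_prev, h, lam), -radius, radius)


if __name__ == "__main__":
    x_prev, h, lam = 1.0, 0.1, 3.0
    numeric = minimizing_movement_step(x_prev, h, lam)
    backward_euler = x_prev / (1.0 + lam * h)
    print(numeric, backward_euler)
```

In the paper's setting the Euclidean distance is replaced by the Wasserstein distance and E by the entropy, so the minimization is over probability measures rather than over real numbers.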

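Connecting large deviations with gradient flows. The results of this paper are illustrated in the diagram below.
\[
\begin{array}{ccc}
\text{discrete-time rate functional } J_h
& \xrightarrow[\ \text{Gamma-convergence},\ h\to 0\ ]{\ \text{this paper}\ }
& \text{discrete-time variational formulation } K_h \\[6pt]
\Big\uparrow {\scriptstyle\ \text{large-deviations principle},\ n\to\infty}
& &
\Big\downarrow {\scriptstyle\ h\to 0} \\[6pt]
\text{Brownian particle system}
& \xrightarrow[\ n\to\infty\ ]{\ \text{continuum limit}\ }
& \text{continuum equation (1)}
\end{array}
\tag{4}
\]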

The lower level of this diagram is the classical connection: in the limit n → ∞, the empirical measure t → L tn converges to the solution ρ of Eq. (1). In the left-hand column the large-deviations principle mentioned above connects the particle system with the rate functional Jh . The right-hand column is the formulation of Eq. (1) as a gradient flow, in the sense that the time-discrete approximations constructed by successive minimization of K h converge to (1) as h → 0.


Both functionals $J_h$ and $K_h$ describe a single time step of length $h$: $J_h$ characterizes the fluctuations of the particle system after time $h$, and $K_h$ characterizes a single time step of length $h$ in the time-discrete approximation of (1). In this paper we make a new connection, a Gamma-convergence result relating $J_h$ to $K_h$, indicated by the top arrow. It is this last connection that is the main mathematical result of this paper.

This result is interesting for a number of reasons. First, it places the entropy-Wasserstein gradient-flow formulation of (1) in the context of large deviations for a system of Brownian particles. In this sense it gives a microscopic justification of the coupling between the entropy functional and the Wasserstein metric, as it occurs in (3). Secondly, it shows that $K_h$ not only characterizes the deterministic evolution via its minimizer, but also the fluctuation behaviour via the connection to $J_h$. Finally, it suggests a principle that may be much more widely valid, in which gradient-flow formulations have an intimate connection with large-deviations rate functionals associated with stochastic particle systems.

The structure of this paper is as follows. We first introduce the specific system of this paper and formulate the existing large-deviations result (2). In Sect. 3 we discuss the abstract gradient-flow structure and recall the definition of the Wasserstein metric. Section 4 gives the central result, and Sect. 5 provides a discussion of the background and relevance. Finally the two parts of the proof of the main result, the upper and lower bounds, are given in Sects. 7 and 8.

Throughout this paper, measure-theoretical notions such as absolute continuity are with respect to the Lebesgue measure, unless indicated otherwise. By abuse of notation, we will often identify a measure with its Lebesgue density.

2. Microscopic Model and Large-Deviations Principle

Equation (1) arises as the hydrodynamic limit of a wide variety of particle systems. In this paper we consider the simplest of these, which is a collection of $n$ independently moving Brownian particles. A Brownian particle is a particle whose position in $\mathbb{R}$ is given by a Wiener process, for which the probability of a particle moving from $x \in \mathbb{R}$ to $y \in \mathbb{R}$ in time $h > 0$ is given by the probability density
\[
p_h(x, y) := \frac{1}{(4\pi h)^{1/2}}\, e^{-(y-x)^2/4h}. \tag{5}
\]
Alternatively, this corresponds to the Brownian bridge measure for the $n$ random elements in the space of all continuous functions $[0, h] \to \mathbb{R}$. We work with Brownian motions having generator $\Delta$ instead of $\frac12\Delta$, and we write $P_x$ for the probability measure under which $X = X^{(1)}$ starts from $x \in \mathbb{R}$.

We now specify our system of Brownian particles. Fix a measure $\rho_0 \in \mathcal{M}_1(\mathbb{R})$ which will serve as the initial distribution of the $n$ Brownian motions $X^{(1)}, \dots, X^{(n)}$ in $\mathbb{R}$. For each $n \in \mathbb{N}$, we let $(X^{(i)})_{i=1,\dots,n}$ be a collection of independent Brownian motions, whose distribution is given by the product $P_n = \bigotimes_{i=1}^n P_{\rho_0}$, where $P_{\rho_0} = \int \rho_0(dx)\,P_x$ is the probability measure under which $X = X^{(1)}$ starts with initial distribution $\rho_0$. It follows from the definition of the Wiener process and the law of large numbers that the empirical measure $L_t^n$, the random probability measure in $\mathcal{M}_1(\mathbb{R})$ defined by
\[
L_t^n := \frac{1}{n}\sum_{i=1}^n \delta_{X^{(i)}(t)},
\]


converges in probability to the solution $\rho$ of (1) with initial datum $\rho_0$. In this sense Eq. (1) is the many-particle limit of the Brownian-particle system. Here and in the rest of this paper the convergence is the weak-∗ or weak convergence for probability measures, defined by the duality with the set of continuous and bounded functions $C_b(\mathbb{R})$.

Large-deviations principles are given for many empirical measures of the $n$ Brownian motions under the product measure $P_n$. Of particular interest to us is the empirical measure for the pair of the initial and terminal position for a given time horizon $[0, h]$, that is, the empirical pair measure
\[
Y_n = \frac{1}{n}\sum_{i=1}^n \delta_{(X^{(i)}(0),\,X^{(i)}(h))}.
\]
Note that the empirical measures $L_0^n$ and $L_h^n$ are the first and second marginals of $Y_n$. The relative entropy $H : (\mathcal{M}_1(\mathbb{R}\times\mathbb{R}))^2 \to [0, \infty]$ is the functional
\[
H(q \,|\, p) :=
\begin{cases}
\displaystyle\int_{\mathbb{R}\times\mathbb{R}} f(x, y)\log f(x, y)\; p(d(x, y)) & \text{if } q \ll p,\ f = \dfrac{dq}{dp},\\[8pt]
+\infty & \text{otherwise.}
\end{cases}
\]
For given $\rho_0, \rho \in \mathcal{M}_1(\mathbb{R})$ denote by
\[
\Gamma(\rho_0, \rho) = \{q \in \mathcal{M}_1(\mathbb{R}\times\mathbb{R}) : \pi_0 q = \rho_0,\ \pi_1 q = \rho\} \tag{6}
\]
the set of pair measures whose first marginal $\pi_0 q(d\cdot) := \int_{\mathbb{R}} q(d\cdot, dy)$ equals $\rho_0$ and whose second marginal $\pi_1 q(d\cdot) := \int_{\mathbb{R}} q(dx, d\cdot)$ equals $\rho$. For a given $\delta > 0$ we denote by $B_\delta = B_\delta(\rho_0)$ the open ball with radius $\delta > 0$ around $\rho_0$ with respect to the Lévy metric on $\mathcal{M}_1(\mathbb{R})$ [DS89, Sect. 3.2].

Theorem 1 (Conditional large deviations). Fix $\delta > 0$ and $\rho_0 \in \mathcal{M}_1(\mathbb{R})$. The sequence $(P_n \circ (L_h^n)^{-1})_{n\in\mathbb{N}}$ satisfies under the condition that $L_0^n \in B_\delta(\rho_0)$ a large deviations principle on $\mathcal{M}_1(\mathbb{R})$ with speed $n$ and rate function
\[
J_{h,\delta}(\rho\,;\rho_0) := \inf_{q\,:\,\pi_0 q\in B_\delta(\rho_0),\ \pi_1 q=\rho} H(q \,|\, q_0),\qquad \rho \in \mathcal{M}_1(\mathbb{R}), \tag{7}
\]
where
\[
q_0(dx, dy) := \rho_0(dx)\, p_h(x, y)\,dy. \tag{8}
\]
This means that

(1) For each open $O \subset \mathcal{M}_1(\mathbb{R})$,
\[
\liminf_{n\to\infty} \frac{1}{n}\log P_n\big(L_h^n \in O \,\big|\, L_0^n \in B_\delta(\rho_0)\big) \ge -\inf_{\rho\in O} J_{h,\delta}(\rho\,;\rho_0).
\]
(2) For each closed $K \subset \mathcal{M}_1(\mathbb{R})$,
\[
\limsup_{n\to\infty} \frac{1}{n}\log P_n\big(L_h^n \in K \,\big|\, L_0^n \in B_\delta(\rho_0)\big) \le -\inf_{\rho\in K} J_{h,\delta}(\rho\,;\rho_0).
\]
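The rate function (7) is built from the relative entropy of pair measures with respect to the reference pair measure $q_0$. As a purely illustrative finite analogue (an assumption of this sketch, not the paper's continuous setting), the Brownian kernel $p_h$ is replaced below by a hypothetical two-state transition matrix, and measures are stored as dictionaries mapping atoms to masses. The function names `relative_entropy` and `marginals` are introduced for this example only.

```python
import math


def relative_entropy(q, p):
    """Discrete H(q | p) = sum q log(q / p); returns +inf if q is not
    absolutely continuous with respect to p. Both arguments map atoms to masses."""
    total = 0.0
    for atom, q_mass in q.items():
        if q_mass == 0.0:
            continue  # convention 0 * log 0 = 0
        p_mass = p.get(atom, 0.0)
        if p_mass == 0.0:
            return math.inf
        total += q_mass * math.log(q_mass / p_mass)
    return total


def marginals(q):
    """First and second marginals (pi_0 q, pi_1 q) of a discrete pair measure."""
    first, second = {}, {}
    for (x, y), mass in q.items():
        first[x] = first.get(x, 0.0) + mass
        second[y] = second.get(y, 0.0) + mass
    return first, second


# Hypothetical two-state data standing in for rho_0 and the kernel p_h.
rho0 = {0: 0.5, 1: 0.5}
kernel = {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.8}
q0 = {(x, y): rho0[x] * kernel[(x, y)] for (x, y) in kernel}

# A competitor pair measure with first marginal rho0.
q = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

if __name__ == "__main__":
    print(relative_entropy(q0, q0))  # the reference measure has zero cost
    print(relative_entropy(q, q0))   # any other coupling has positive cost
    print(marginals(q))
```

In this toy setting the analogue of the infimum in (7) would be taken over all `q` whose marginals satisfy the two constraints; the sketch only evaluates the cost of one admissible candidate.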


A proof of this standard result can be given by an argument along the following lines. First, note that
\[
P_{\rho_0} \circ (\sigma_0, \sigma_h)^{-1}(dx, dy) = \rho_0(dx)\,P_x(X(h) \in dy) = \rho_0(dx)\,p_h(x, y)\,dy =: q_0(dx, dy),\qquad x, y \in \mathbb{R},
\]
where $\sigma_s : C([0, h]; \mathbb{R}) \to \mathbb{R}$, $\omega \mapsto \omega(s)$ is the projection of any path $\omega$ to its position at time $s \ge 0$. By Sanov's Theorem, the sequence $(P_n \circ Y_n^{-1})_{n\in\mathbb{N}}$ of the empirical pair measures $Y_n$ satisfies a large-deviations principle on $\mathcal{M}_1(\mathbb{R}\times\mathbb{R})$ with speed $n$ and rate function $q \mapsto H(q \,|\, q_0)$, $q \in \mathcal{M}_1(\mathbb{R}\times\mathbb{R})$ (see e.g. [dH00,Csi84]). Secondly, the contraction principle (e.g., [dH00, Sec. III.5]) shows that the pair of marginals $(L_0^n, L_h^n) = (\pi_0 Y_n, \pi_1 Y_n)$ of $Y_n$ satisfies a large deviations principle on $\mathcal{M}_1(\mathbb{R})\times\mathcal{M}_1(\mathbb{R})$ with speed $n$ and rate function
\[
(\tilde\rho_0, \rho) \mapsto \inf_{q\in\mathcal{M}_1(\mathbb{R}\times\mathbb{R})\,:\,\pi_0 q=\tilde\rho_0,\ \pi_1 q=\rho} H(q \,|\, q_0),
\]
for any $\tilde\rho_0, \rho \in \mathcal{M}_1(\mathbb{R})$. Thirdly, as in the first step, it follows that the empirical measure $L_0^n$ under $P_n$ satisfies a large deviations principle on $\mathcal{M}_1(\mathbb{R})$ with speed $n$ and rate function $\tilde\rho_0 \mapsto H(\tilde\rho_0 \,|\, \rho_0)$, for $\tilde\rho_0 \in \mathcal{M}_1(\mathbb{R})$. Therefore for a subset $A \subset \mathcal{M}_1(\mathbb{R})$,
\[
\begin{aligned}
\frac{1}{n}\log P_n(L_h^n \in A \,|\, L_0^n \in B_\delta)
&= \frac{1}{n}\log P_n(L_h^n \in A,\ L_0^n \in B_\delta) - \frac{1}{n}\log P_n(L_0^n \in B_\delta)\\
&\sim -\inf_{q\,:\,\pi_0 q\in B_\delta,\ \pi_1 q\in A} H(q \,|\, q_0) + \inf_{\tilde\rho_0\in B_\delta} H(\tilde\rho_0 \,|\, \rho_0).
\end{aligned}
\]
Since $\rho_0 \in B_\delta$, the latter infimum equals zero, and the claim of Theorem 1 follows.

We now consider the limit of the rate functional as the radius $\delta \to 0$. Two notions of convergence are appropriate, that of pointwise convergence and Gamma convergence.

Lemma 2. Fix $\rho_0 \in \mathcal{M}_1(\mathbb{R})$. As $\delta \downarrow 0$, $J_{h,\delta}(\,\cdot\,;\rho_0)$ converges in $\mathcal{M}_1(\mathbb{R})$ both in the pointwise and in the Gamma sense to
\[
J_h(\rho\,;\rho_0) := \inf_{q\,:\,\pi_0 q=\rho_0,\ \pi_1 q=\rho} H(q \,|\, q_0).
\]
Gamma convergence means here that

(1) (Lower bound) For each sequence $\rho^\delta \rightharpoonup \rho$ in $\mathcal{M}_1(\mathbb{R})$,
\[
\liminf_{\delta\to 0} J_{h,\delta}(\rho^\delta\,;\rho_0) \ge J_h(\rho\,;\rho_0). \tag{9}
\]
(2) (Recovery sequence) For each $\rho \in \mathcal{M}_1(\mathbb{R})$, there exists a sequence $(\rho^\delta) \subset \mathcal{M}_1(\mathbb{R})$ with $\rho^\delta \rightharpoonup \rho$ such that
\[
\lim_{\delta\to 0} J_{h,\delta}(\rho^\delta\,;\rho_0) = J_h(\rho\,;\rho_0). \tag{10}
\]

Proof. $J_{h,\delta}(\,\cdot\,;\rho_0)$ is an increasing sequence of convex functionals on $\mathcal{M}_1(\mathbb{R})$; therefore it converges at each fixed $\rho \in \mathcal{M}_1(\mathbb{R})$. The Gamma-convergence then follows from, e.g., [DM93, Prop. 5.4] or [Bra02, Rem. 1.40].

Remark. Léonard [Léo07] proves a similar statement, where he replaces the ball $B_\delta(\rho_0)$ in Theorem 1 by an explicit sequence $\rho_{0,n} \rightharpoonup \rho_0$. The rate functional that he obtains is again $J_h$.


Summarizing, the combination of Theorem 1 and Lemma 2 forms a rigorous version of the statement (2). The parameter $\delta$ in Theorem 1 should be thought of as an artificial parameter, introduced to make the large-deviations statement non-singular, and which is eliminated by the Gamma-limit of Lemma 2.

3. Gradient Flows

Let us briefly recall the concept of a gradient flow, starting with flows in $\mathbb{R}^d$. The gradient flow in $\mathbb{R}^d$ of a functional $E : \mathbb{R}^d \to \mathbb{R}$ is the evolution in $\mathbb{R}^d$ given by
\[
\dot x^i(t) = -\partial_i E(x(t)), \tag{11}
\]
which can be written in a geometrically more correct way as
\[
\dot x^i(t) = -g^{ij}\,\partial_j E(x(t)). \tag{12}
\]
The metric tensor $g$ converts the covector field $\nabla E$ into a vector field that can be assigned to $\dot x$. In the case of (11) we have $g^{ij} = \delta^{ij}$, the Euclidean metric, and for a general Riemannian manifold with metric tensor $g$, Eq. (12) defines the gradient flow of $E$ with respect to $g$.

In recent years this concept has been generalized to general metric spaces [AGS05]. This generalization is partly driven by the fact, first observed by Jordan, Kinderlehrer, and Otto [JKO97,JKO98], that many parabolic evolution equations of a diffusive type can be written as gradient flows in a space of measures with respect to the Wasserstein metric. The Wasserstein distance is defined on the set of probability measures with finite second moments,
\[
\mathcal{P}_2(\mathbb{R}) := \Big\{\rho \in \mathcal{M}_1(\mathbb{R}) : \int_{\mathbb{R}} x^2\,\rho(dx) < \infty\Big\},
\]
and is given by
\[
d(\rho_0, \rho_1)^2 := \inf_{\gamma\in\Gamma(\rho_0,\rho_1)} \int_{\mathbb{R}\times\mathbb{R}} (x - y)^2\,\gamma(d(x, y)), \tag{13}
\]
where $\Gamma(\rho_0, \rho_1)$ is defined in (6). Examples of parabolic equations that can be written as a gradient flow of some energy $E$ with respect to the Wasserstein distance are

• The diffusion equation (1); this is the gradient flow of the (negative) entropy
\[
E(\rho) := \int_{\mathbb{R}} \rho\log\rho\,dx; \tag{14}
\]
• nonlocal convection-diffusion equations [JKO98,AGS05,CMV06] of the form
\[
\partial_t\rho = \operatorname{div}\rho\nabla\big[U'(\rho) + V + W * \rho\big], \tag{15}
\]
where $U$, $V$, and $W$ are given functions on $\mathbb{R}$, $\mathbb{R}^d$, and $\mathbb{R}^d$, respectively;
• higher-order parabolic equations [Ott98,GO01,Gla03,MMS09,GST08] of the form
\[
\partial_t\rho = -\operatorname{div}\rho\nabla\big(\rho^{\alpha-1}\Delta\rho^{\alpha}\big), \tag{16}
\]
for $1/2 \le \alpha \le 1$;


• moving-boundary problems, such as a prescribed-angle lubrication-approximation model [Ott98]
\[
\begin{aligned}
\partial_t\rho &= -\partial_x(\rho\,\partial_{xxx}\rho) && \text{in } \{\rho > 0\},\\
\partial_x\rho &= \pm 1 && \text{on } \partial\{\rho > 0\},
\end{aligned}
\tag{17}
\]
and a model of crystal dissolution and precipitation [PP08]
\[
\partial_t\rho = \partial_{xx}\rho \ \text{ in } \{\rho > 0\},\quad\text{with } \partial_n\rho = -\rho v_n \text{ and } v_n = f(\rho) \text{ on } \partial\{\rho > 0\}. \tag{18}
\]

4. The Central Statement

The aim of this paper is to connect $J_h$ to the functional $K_h$ in the limit $h \to 0$, in the sense that
\[
J_h(\,\cdot\,;\rho_0) \sim \frac12 K_h(\,\cdot\,;\rho_0) \qquad \text{as } h \to 0. \tag{19}
\]
For any $\rho \ne \rho_0$ both $J_h(\rho\,;\rho_0)$ and $K_h(\rho\,;\rho_0)$ diverge as $h \to 0$, however, and we therefore reformulate this statement in the form
\[
J_h(\,\cdot\,;\rho_0) - \frac{1}{4h}\,d(\,\cdot\,, \rho_0)^2 \longrightarrow \frac12 E(\,\cdot\,) - \frac12 E(\rho_0).
\]
The precise statement is given in the theorem below. This theorem is probably true in greater generality, possibly even for all $\rho_0, \rho \in \mathcal{P}_2(\mathbb{R}^d)$. For technical reasons we need to impose restrictive conditions on $\rho_0$ and $\rho$, and to work in one space dimension, on a bounded domain $[0, L]$. For any $0 < \delta < 1$ we define the set
\[
A_\delta := \Big\{\rho \in L^\infty(0, L) : \int_0^L \rho = 1 \ \text{ and } \ \|\rho - L^{-1}\|_\infty < \delta\Big\}.
\]

Theorem 3. Let $J_h$ be defined as in (7). Fix $L > 0$; there exists $\delta > 0$ with the following property. Let $\rho_0 \in A_\delta \cap C([0, L])$. Then
\[
J_h(\,\cdot\,;\rho_0) - \frac{1}{4h}\,d(\,\cdot\,, \rho_0)^2 \longrightarrow \frac12 E(\,\cdot\,) - \frac12 E(\rho_0) \qquad \text{as } h \to 0, \tag{20}
\]
in the set $A_\delta$, where the arrow denotes Gamma-convergence with respect to the narrow topology. In this context this means that the two following conditions hold:

(1) (Lower bound) For each sequence $\rho^h \rightharpoonup \rho$ in $A_\delta$,
\[
\liminf_{h\to 0}\ J_h(\rho^h\,;\rho_0) - \frac{1}{4h}\,d(\rho^h, \rho_0)^2 \ge \frac12 E(\rho) - \frac12 E(\rho_0). \tag{21}
\]
(2) (Recovery sequence) For each $\rho \in A_\delta$, there exists a sequence $(\rho^h) \subset A_\delta$ with $\rho^h \rightharpoonup \rho$ such that
\[
\lim_{h\to 0}\ J_h(\rho^h\,;\rho_0) - \frac{1}{4h}\,d(\rho^h, \rho_0)^2 = \frac12 E(\rho) - \frac12 E(\rho_0). \tag{22}
\]
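The term $d(\,\cdot\,,\rho_0)^2/4h$ subtracted in (20) is explicit in one space dimension. For two empirical measures on the real line with the same number of equally weighted atoms, it is a standard fact that the infimum in (13) is attained by the monotone coupling, which pairs the sorted atoms. The sketch below illustrates this; the function name `wasserstein2_squared_1d` and the sample data are assumptions of the example, and the restriction to equal-size samples is a simplification, not part of the theorem.

```python
def wasserstein2_squared_1d(xs, ys):
    """Squared Wasserstein distance d(mu, nu)^2 of (13) between the empirical
    measures mu = (1/n) sum delta_{x_i} and nu = (1/n) sum delta_{y_i} on R,
    computed through the monotone (sorted) coupling."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum((x - y) ** 2 for x, y in pairs) / len(xs)


if __name__ == "__main__":
    atoms = [0.0, 1.0, 3.0]
    shifted = [3.5, 0.5, 1.5]  # the same atoms translated by 0.5, in another order
    # A rigid translation by a costs a^2.
    print(wasserstein2_squared_1d(atoms, shifted))
```

A translation is the simplest case in which the optimal coupling is known in closed form, which is why it is used here as the check.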


5. Discussion

There are various ways to interpret Theorem 3.

An explanation of the functional $K_h$ and the minimization problem (3). The authors of [JKO98] motivate the minimization problem (3) by analogy with the well-known backward Euler approximation scheme. Theorem 3 provides an independent explanation of this minimization problem, as follows. By the combination of (2) and (19), the value $K_h(\rho\,;\rho_0)$ determines the probability of observing $\rho$ at time $h$, given a distribution $\rho_0$ at time zero. Since for large $n$ only near-minimal values of $J_h$, and therefore of $K_h$, have non-vanishing probability, this explains why the minimizers of $K_h$ arise. It also shows that the minimization problem (3), and specifically the combination of the entropy and the Wasserstein terms, is not just a mathematical construct but also carries physical meaning.

A related interpretation stems from the fact that (2) characterizes not only the most probable state, but also the fluctuations around that state. Therefore $J_h$, and by (19) also $K_h$, carry meaning not only in their respective minimizers, but also in the behaviour away from the minimum. Put succinctly: $K_h$ also characterizes the fluctuation behaviour of the particle system, for large but finite $n$.

A microscopic explanation of the entropy-Wasserstein gradient flow. The diffusion equation (1) is a gradient flow in many ways simultaneously: it is the gradient flow of the Dirichlet integral $\frac12\int|\nabla\rho|^2$ with respect to the $L^2$ metric, of $\frac12\int\rho^2$ with respect to the $H^{-1}$ metric; more generally, of the $H^s$ semi-norm with respect to the $H^{s-1}$ metric. In addition there is of course the gradient flow of the entropy $E$ with respect to the Wasserstein metric. Theorem 3 shows that among these the entropy-Wasserstein combination is special, in the sense that it not only captures the deterministic limit, i.e., Eq. (1), but also the fluctuation behaviour at large but finite $n$. Other gradient flows may also produce (1), but they will not capture the fluctuations, for this specific stochastic system. Of course, there may be other stochastic particle systems for which not the entropy-Wasserstein combination but another combination reproduces the fluctuation behaviour.

There is another way to motivate the combination of entropy and the Wasserstein distance. In [KO90] the authors study the hydrodynamic limit for a stochastic particle system consisting of independent Brownian motions. A natural object to study is the time-dependent (in a finite time horizon $[0, T]$) empirical measure for a spatially averaged system of Brownian motions where space and time are scaled by a small parameter, i.e. the spatially averaged sums over $i$ of Dirac measures at the rescaled particle positions $X^{(i)}$. In particular the authors derive a rate functional for the time-continuous problem, which is therefore a functional on a space of space-time functions such as $C(0, T; L^1(\mathbb{R}^d))$. The relevant term for this discussion is
\[
I(\rho) := \inf_v \Big\{\int_0^T\!\!\int_{\mathbb{R}^d} |v(x, t)|^2\,\rho(x, t)\,dx\,dt : \partial_t\rho = \Delta\rho + \operatorname{div}\rho v\Big\},
\]
where the infimum is over functions $v \in C^{2,1}(\mathbb{R}^d\times[0, T])$. If we rewrite this infimum by $v = w - \nabla\log\rho$ instead as
\[
\inf_w \Big\{\int_0^T\!\!\int_{\mathbb{R}^d} |w(x, t) - \nabla(\log\rho + 1)|^2\,\rho(x, t)\,dx\,dt : \partial_t\rho = \operatorname{div}\rho w\Big\},
\]


then we recognize that this expression penalizes deviation of $w$ from the variational derivative (or $L^2$-gradient) $\log\rho + 1$ of $E$. Since the expression $\int_{\mathbb{R}^d}|v|^2\rho\,dx$ can be interpreted as the derivative of the Wasserstein distance (see [Ott01] and [AGS05, Ch. 8]), this provides again a connection between the entropy and the Wasserstein distance.

The origin of the Wasserstein distance. The proof of Theorem 3 also allows us to trace back the origin of the Wasserstein distance in the limiting functional $K_h$. It is useful to compare $J_h$ and $K_h$ in a slightly different form. Namely, using (13) and the expression of the relative entropy $H$ given in (25) below, we write
\[
\begin{aligned}
J_h(\rho\,;\rho_0) &= \inf_{q\in\Gamma(\rho_0,\rho)} \Big\{E(q) - E(\rho_0) + \log 2\sqrt{\pi h} + \frac{1}{4h}\int_{\mathbb{R}\times\mathbb{R}} (x - y)^2\,q(x, y)\,dx\,dy\Big\},\\
\frac12 K_h(\rho\,;\rho_0) &= \frac12 E(\rho) - \frac12 E(\rho_0) + \inf_{q\in\Gamma(\rho_0,\rho)} \frac{1}{4h}\int_{\mathbb{R}\times\mathbb{R}} (x - y)^2\,q(x, y)\,dx\,dy.
\end{aligned}
\tag{23}
\]
One similarity between these expressions is the form of the last term in both lines, combined with the minimization over $q$. Since that last term is prefixed by the large factor $1/4h$, one expects it to dominate the minimization for small $h$, which is consistent with the passage from the first to the second line. In this way the Wasserstein distance in $K_h$ arises from the last term in (23). Tracing back the origin of that term, we find that it originates in the exponent $(x - y)^2/4h$ in $p_h$ (see (5)), which itself arises from the Central Limit Theorem. In this sense the Wasserstein distance arises from the same Central Limit Theorem that provides the properties of Brownian motion in the first place. This also explains, for instance, why we find the Wasserstein distance of order 2 instead of any of the other orders. This observation also raises the question whether stochastic systems with heavy-tail behaviour, such as observed in fracture networks [BS98, BSS00] or near the glass transition [WW02], would be characterized by a different gradient-flow structure.

A macroscopic description of the particle system as an entropic gradient flow. For the simple particle system under consideration, the macroscopic description by means of the diffusion equation is well known; the equivalent description as an entropic gradient flow is physically natural, but much more recent. The method presented in this paper is a way to obtain this entropic gradient flow directly as the macroscopic description, without having to consider solutions of the diffusion equation. This rigorous passage to a physically natural macroscopic limit may lead to a deeper understanding of particle systems, in particular in situations where the gradient flow formulation is mathematically more tractable.

The choice for Gamma-convergence. Gamma-convergence is a natural concept of convergence for functionals in the context of minimization. It has the property that minimizers converge to minimizers, which explains why the concept is asymmetric; inverting the sign of functionals and taking the Gamma-limit do not commute as they do for other notions, such as pointwise convergence of functions. It is a natural question whether an analogue of Theorem 3 holds with pointwise convergence instead of Gamma-convergence, which is equivalent to asking whether (22) can be achieved with $\rho^h = \rho$. In order to adapt the proof of (22), one would have to solve a Schrödinger system [Sch31,Beu60] that 'corrects' the error in the second marginal, and obtain certain bounds on the solution of this system. Since the kernel $p_h$ becomes singular in the limit $h \to 0$, these bounds will be difficult to obtain, or may even fail


to hold. At the moment, therefore, we do not know whether the functionals converge pointwise or not.

Future work. Besides the natural question of generalizing Theorem 3 to a larger class of probability measures, including measures in higher dimensions, there are various other interesting avenues of investigation. A first class of extensions is suggested by the many differential equations that can be written in terms of Wasserstein gradient flows, as explained in Sect. 3: can these also be related to large-deviation principles for well-chosen stochastic particle systems? Note that many of these equations correspond to systems of interacting particles, and therefore the large-deviation result of this paper will need to be generalized. Further extensions follow from relaxing the assumptions on the Brownian motion. Kramers' equation, for instance, describes the motion of particles that perform a Brownian motion in velocity space, with the position variable following deterministically from the velocity. The characterization by Huang and Jordan [Hua00,HJ00] of this equation as a gradient flow with respect to a modified Wasserstein metric suggests a similar connection between gradient-flow and large-deviations structure.

6. Outline of the Arguments

Since most of the appearances of $h$ are combined with a factor 4, it is notationally useful to incorporate the 4 into it. We do this by introducing the new small parameter $\varepsilon^2 := 4h$, and we redefine the functional of Eq. (3),
\[
\tfrac12 K_\varepsilon(\rho;\rho_0) := \frac{1}{\varepsilon^2}\, d(\rho,\rho_0)^2 + \tfrac12 E(\rho) - \tfrac12 E(\rho_0),
\]
and analogously for (7)
\[
J_\varepsilon(\rho;\rho_0) := \inf_{q\in\Gamma(\rho_0,\rho)} H(q\,|\,q_0), \tag{24}
\]
where $q_0(dx\,dy) = \rho_0(dx)\,p_\varepsilon(x,y)\,dy$, with
\[
p_\varepsilon(x,y) := \frac{1}{\varepsilon\sqrt{\pi}}\, e^{-(y-x)^2/\varepsilon^2},
\]
in analogy to (5) and (8). Note that
\[
H(q\,|\,q_0) = E(q) - \iint_{\mathbb{R}\times\mathbb{R}} q(x,y)\log\big[\rho_0(x)\,p_\varepsilon(x,y)\big]\,dx\,dy
= E(q) - E(\rho_0) + \tfrac12\log \varepsilon^2\pi + \frac{1}{\varepsilon^2}\iint_{\mathbb{R}\times\mathbb{R}} (x-y)^2\, q(x,y)\,dx\,dy, \tag{25}
\]
where we abuse notation and write $E(q) = \iint_{\mathbb{R}\times\mathbb{R}} q(x,y)\log q(x,y)\,dx\,dy$.
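The two facts about the kernel $p_\varepsilon$ that enter (25) can be checked numerically. The following is a minimal sketch, not part of the original argument: it verifies that $p_\varepsilon(x,\cdot)$ integrates to one and that $-\log p_\varepsilon(x,y) = \tfrac12\log\varepsilon^2\pi + (x-y)^2/\varepsilon^2$. The helper `integrate` and the sample values of `eps`, `x`, `y` are illustrative assumptions.

```python
import math

def p_eps(x, y, eps):
    """Gaussian kernel p_eps(x, y) = exp(-(y - x)^2 / eps^2) / (eps * sqrt(pi))."""
    return math.exp(-((y - x) ** 2) / eps**2) / (eps * math.sqrt(math.pi))

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals (illustrative helper)."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Illustrative sample values (assumptions, not taken from the paper).
eps, x, y = 0.3, 0.7, 1.1

# Total mass of p_eps(x, .), truncated to x +/- 10*eps where the tails are negligible.
mass = integrate(lambda t: p_eps(x, t, eps), x - 10 * eps, x + 10 * eps)

# Pointwise identity used when expanding H(q | q0) into (25).
lhs = -math.log(p_eps(x, y, eps))
rhs = 0.5 * math.log(eps**2 * math.pi) + (x - y) ** 2 / eps**2
```

Integrating the pointwise identity against $q$ gives the last two terms of (25); the term $-E(\rho_0)$ comes from the factor $\rho_0(x)$ and the marginal constraint on $q$.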

From a Large-Deviations Principle to the Wasserstein Gradient Flow


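6.1. Properties of the Wasserstein distance. We now discuss a few known properties of the Wasserstein distance.

Lemma 4 (Kantorovich dual formulation [Vil03,AGS05,Vil09]). Let $\rho_0, \rho_1 \in P_2(\mathbb{R})$ be absolutely continuous with respect to Lebesgue measure. Then
\[
d(\rho_0,\rho_1)^2 = \sup_{\varphi}\Big\{ \int_{\mathbb{R}} (x^2 - 2\varphi(x))\,\rho_0(x)\,dx + \int_{\mathbb{R}} (y^2 - 2\varphi^*(y))\,\rho_1(y)\,dy \;:\; \varphi:\mathbb{R}\to\mathbb{R} \text{ convex} \Big\}, \tag{26}
\]
where $\varphi^*$ is the convex conjugate (Legendre-Fenchel transform) of $\varphi$, and where the supremum is achieved. In addition, at $\rho_0$-a.e. $x$ the optimal function $\varphi$ is twice differentiable, and
\[
\varphi''(x) = \frac{\rho_0(x)}{\rho_1(\varphi'(x))}. \tag{27}
\]
A similar statement holds for $\varphi^*$,
\[
(\varphi^*)''(y) = \frac{\rho_1(y)}{\rho_0((\varphi^*)'(y))}. \tag{28}
\]

For an absolutely continuous $q \in P_2(\mathbb{R}\times\mathbb{R})$ we will often use the notation
\[
d(q)^2 := \iint_{\mathbb{R}\times\mathbb{R}} (x-y)^2\, q(x,y)\,dx\,dy.
\]
Note that $d(\rho_0,\rho_1) = \inf\{d(q) : \pi_{0,1} q = \rho_{0,1}\}$, and that if $\pi_{0,1} q = \rho_{0,1}$, and if the convex functions $\varphi, \varphi^*$ are associated with $d(\rho_0,\rho_1)$ as above, then the difference can be expressed as
\[
\begin{aligned}
d(q)^2 - d(\rho_0,\rho_1)^2
&= \iint_{\mathbb{R}\times\mathbb{R}} (x-y)^2\, q(x,y)\,dx\,dy - \iint_{\mathbb{R}\times\mathbb{R}} (x^2 - 2\varphi(x))\, q(x,y)\,dx\,dy - \iint_{\mathbb{R}\times\mathbb{R}} (y^2 - 2\varphi^*(y))\, q(x,y)\,dx\,dy \\
&= 2 \iint_{\mathbb{R}\times\mathbb{R}} \big(\varphi(x) + \varphi^*(y) - xy\big)\, q(x,y)\,dx\,dy.
\end{aligned} \tag{29}
\]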

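6.2. Pair measures and $\tilde q^\varepsilon$. A central role is played by the following, explicit measure in $P_2(\mathbb{R}\times\mathbb{R})$. For given $\rho_0 \in M_1(\mathbb{R})$ and a sequence of absolutely continuous measures $\rho^\varepsilon \in M_1(\mathbb{R})$, we define the absolutely continuous measure $\tilde q^\varepsilon \in M_1(\mathbb{R}\times\mathbb{R})$ by
\[
\tilde q^\varepsilon(x,y) := Z_\varepsilon^{-1}\, \frac{1}{\varepsilon\sqrt{\pi}}\, \sqrt{\rho_0(x)}\,\sqrt{\rho^\varepsilon(y)}\, \exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi_\varepsilon(x) - \varphi_\varepsilon^*(y)\big)\Big], \tag{30}
\]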


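where the normalization constant $Z_\varepsilon$ is defined as
\[
Z_\varepsilon = Z_\varepsilon(\rho_0,\rho^\varepsilon) := \frac{1}{\varepsilon\sqrt{\pi}} \iint_{\mathbb{R}\times\mathbb{R}} \sqrt{\rho_0(x)}\,\sqrt{\rho^\varepsilon(y)}\, \exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi_\varepsilon(x) - \varphi_\varepsilon^*(y)\big)\Big]\,dx\,dy. \tag{31}
\]
In these expressions, the functions $\varphi_\varepsilon, \varphi_\varepsilon^*$ are associated with $d(\rho_0,\rho^\varepsilon)$ as by Lemma 4. Note that the marginals of $\tilde q^\varepsilon$ are not equal to $\rho_0$ and $\rho^\varepsilon$, but they do converge (see the proof of part 2 of Theorem 3) to $\rho_0$ and the limit $\rho$ of $\rho^\varepsilon$.

6.3. Properties of $\tilde q^\varepsilon$ and $Z_\varepsilon$. The role of $\tilde q^\varepsilon$ can best be explained by the following observations. We first discuss the lower bound, part 1 of Theorem 3. If $q^\varepsilon$ is optimal in the definition of $J_\varepsilon(\rho^\varepsilon;\rho_0)$ (implying that it has marginals $\rho_0$ and $\rho^\varepsilon$), then
\[
\begin{aligned}
0 \le H(q^\varepsilon\,|\,\tilde q^\varepsilon)
&= E(q^\varepsilon) - \iint q^\varepsilon \log \tilde q^\varepsilon \\
&= E(q^\varepsilon) + \log Z_\varepsilon + \tfrac12 \log \varepsilon^2\pi - \iint q^\varepsilon(x,y)\big[\tfrac12\log\rho_0(x) + \tfrac12\log\rho^\varepsilon(y)\big]\,dx\,dy \\
&\qquad + \frac{2}{\varepsilon^2}\iint q^\varepsilon(x,y)\big(\varphi_\varepsilon(x) + \varphi_\varepsilon^*(y) - xy\big)\,dx\,dy \\
&\overset{(29)}{=} E(q^\varepsilon) - \tfrac12 E(\rho_0) - \tfrac12 E(\rho^\varepsilon) + \frac{1}{\varepsilon^2}\big(d(q^\varepsilon)^2 - d(\rho_0,\rho^\varepsilon)^2\big) + \log Z_\varepsilon + \tfrac12\log\varepsilon^2\pi \\
&= J_\varepsilon(\rho^\varepsilon;\rho_0) - \frac{1}{\varepsilon^2}\, d(\rho_0,\rho^\varepsilon)^2 - \tfrac12 E(\rho^\varepsilon) + \tfrac12 E(\rho_0) + \log Z_\varepsilon.
\end{aligned} \tag{32}
\]
The lower-bound estimate
\[
\liminf_{\varepsilon\to 0}\Big[ J_\varepsilon(\rho^\varepsilon;\rho_0) - \frac{1}{\varepsilon^2}\, d(\rho_0,\rho^\varepsilon)^2 \Big] \ge \tfrac12 E(\rho) - \tfrac12 E(\rho_0)
\]
then follows from the lemma below, which is proved in Sect. 8.

Lemma 5. We have
(1) $\liminf_{\varepsilon\to 0} E(\rho^\varepsilon) \ge E(\rho)$;
(2) $\limsup_{\varepsilon\to 0} Z_\varepsilon \le 1$.

For the recovery sequence, part 2 of Theorem 3, we first define the functional $G_\varepsilon : M_1(\mathbb{R}\times\mathbb{R}) \to \mathbb{R}$ by
\[
G_\varepsilon(q) := H\big(q\,|\,(\pi_0 q)P^\varepsilon\big) - \frac{1}{\varepsilon^2}\, d(\pi_0 q, \pi_1 q)^2.
\]
Note that by (25) and (29), for any $q$ such that $\pi_0 q = \rho_0$ we have
\[
G_\varepsilon(q) = E(q) - E(\rho_0) + \tfrac12 \log\varepsilon^2\pi + \inf_{\varphi}\Big\{ \frac{2}{\varepsilon^2}\iint q(x,y)\big(\varphi(x) + \varphi^*(y) - xy\big)\,dx\,dy \;:\; \varphi \text{ convex} \Big\}. \tag{33}
\]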


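Now choose for $\varphi$ the optimal convex function in the definition of $d(\rho_0,\rho)$, and let the function $\tilde q^\varepsilon$ be given by (30), where $\rho^\varepsilon$, $\varphi_\varepsilon$, and $\varphi_\varepsilon^*$ are replaced by the fixed functions $\rho$, $\varphi$, and $\varphi^*$. Define the correction factor $\chi_\varepsilon \in L^1(\pi_0\tilde q^\varepsilon)$ by the condition
\[
\rho_0(x) = \chi_\varepsilon(x)\,\pi_0\tilde q^\varepsilon(x). \tag{34}
\]
We then set
\[
q^\varepsilon(x,y) = \chi_\varepsilon(x)\,\tilde q^\varepsilon(x,y)
= Z_\varepsilon^{-1}\,\frac{1}{\varepsilon\sqrt{\pi}}\,\chi_\varepsilon(x)\,\sqrt{\rho_0(x)}\,\sqrt{\rho_1(y)}\,\exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi(x) - \varphi^*(y)\big)\Big], \tag{35}
\]
so that the first marginal $\pi_0 q^\varepsilon$ equals $\rho_0$; in Lemma 6 below we show that the second marginal converges to $\rho$. Note that the normalization constant $Z_\varepsilon$ above is the same as for $\tilde q^\varepsilon$, i.e.,
\[
Z_\varepsilon = \frac{1}{\varepsilon\sqrt{\pi}} \int_K\!\!\int_K \sqrt{\rho_0(x)}\,\sqrt{\rho_1(y)}\,\exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi(x) - \varphi^*(y)\big)\Big]\,dx\,dy.
\]
Since the functions $\varphi$ and $\varphi^*$ are admissible for $d(\pi_0 q^\varepsilon, \pi_1 q^\varepsilon)$, we find with (26)
\[
d(\pi_0 q^\varepsilon, \pi_1 q^\varepsilon)^2 \ge \int_{\mathbb{R}} (x^2 - 2\varphi(x))\,\pi_0 q^\varepsilon(x)\,dx + \int_{\mathbb{R}} (y^2 - 2\varphi^*(y))\,\pi_1 q^\varepsilon(y)\,dy
= \iint \big[x^2 - 2\varphi(x) - 2\varphi^*(y) + y^2\big]\, q^\varepsilon(x,y)\,dx\,dy.
\]
Then
\[
\begin{aligned}
G_\varepsilon(q^\varepsilon)
&\le E(q^\varepsilon) - E(\rho_0) + \tfrac12\log\varepsilon^2\pi + \frac{2}{\varepsilon^2}\iint q^\varepsilon(x,y)\big(\varphi(x) + \varphi^*(y) - xy\big)\,dx\,dy \\
&= -\log Z_\varepsilon + \iint q^\varepsilon(x,y)\log\chi_\varepsilon(x)\,dx\,dy + \tfrac12\iint q^\varepsilon(x,y)\log\rho_1(y)\,dx\,dy - \tfrac12\iint q^\varepsilon(x,y)\log\rho_0(x)\,dx\,dy \\
&= -\log Z_\varepsilon + \int \rho_0(x)\log\chi_\varepsilon(x)\,dx + \tfrac12\int \pi_1 q^\varepsilon(y)\log\rho_1(y)\,dy - \tfrac12\int \rho_0(x)\log\rho_0(x)\,dx.
\end{aligned}
\]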

The property (22) then follows from the lower bound and lemma below, which is proved in Sect. 7. Lemma 6. We have (1) limε→0 Z ε = 1; (2) π0,1 q˜ ε and χε are bounded on (0, L) from above and away from zero, uniformly in ε; (3) χε → 1 in L 1 (0, L); (4) π1 q ε → ρ1 in L 1 (0, L).


7. Upper Bound

In this section we prove Lemma 6, and we place ourselves in the context of the recovery property, part 2 of Theorem 3. Therefore we are given $\rho_0, \rho_1 \in A_\delta$ with $\rho_0 \in C([0,L])$, and as described in Sect. 6.3 we have constructed the pair measures $q^\varepsilon$ and $\tilde q^\varepsilon$ as in (35); the convex function $\varphi$ is associated with $d(\rho_0,\rho_1)$. The parameter $\delta$ will be determined in the proof of the lower bound; for the upper bound it is sufficient that $0 < \delta < 1/2$, and therefore that $1/2 \le \rho_0, \rho_1 \le 3/2$. Note that by (27) and (28) this implies that $\varphi''$ and $(\varphi^*)''$ are bounded between 1/3 and 3.

By Aleksandrov's theorem [EG92, Th. 6.4.I] the convex function $\varphi^*$ is twice differentiable at Lebesgue-almost every point $y \in \mathbb{R}$. Let $N_x \subset \mathbb{R}$ be the set where $\varphi$ is not differentiable; this is a Lebesgue null set. Let $N_y \subset \mathbb{R}$ be the set at which $\varphi^*$ is not twice differentiable, or at which $(\varphi^*)''$ does exist but vanishes; the first set of points is a Lebesgue null set, and the second is a $\rho_1$-null set by (28); therefore $\rho_1(N_y) = 0$. Now set $N = N_x \cup \partial\varphi^*(N_y)$; here $\partial\varphi^*$ is the (multi-valued) sub-differential of $\varphi^*$. Then
\[
\rho_0(N) \le \rho_0(N_x) + \rho_0(\partial\varphi^*(N_y)) = 0 + \rho_0(\partial\varphi^*(N_y)) = \rho_1(N_y) = 0,
\]
where the second identity follows from [McC97, Lemma 4.1]. Then, since $(\varphi^*)'(\varphi'(x)) = x$, we have for any $x \in \mathbb{R}\setminus N$,
\[
\varphi^*(y) = \varphi^*(\varphi'(x)) + x\,(y - \varphi'(x)) + \tfrac12 (\varphi^*)''(\varphi'(x))\,(y - \varphi'(x))^2 + o\big((y - \varphi'(x))^2\big),
\]
so that, using $\varphi(x) + \varphi^*(\varphi'(x)) = x\varphi'(x)$,
\[
\varphi(x) + \varphi^*(y) - xy = \tfrac12 (\varphi^*)''(\varphi'(x))\,(y - \varphi'(x))^2 + o\big((y - \varphi'(x))^2\big).
\]
Therefore for each $x \in \mathbb{R}\setminus N$, $y = \varphi'(x)$ is a Lebesgue point of $\rho_1$, and the single integral
\[
\frac{1}{\varepsilon}\int_{\mathbb{R}} \sqrt{\rho_1(y)}\,\exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi(x) - \varphi^*(y)\big)\Big]\,dy
= \frac{1}{\varepsilon}\int_{\mathbb{R}} \sqrt{\rho_1(y)}\,\exp\Big[-\frac{1}{\varepsilon^2}(\varphi^*)''(\varphi'(x))\,(y - \varphi'(x))^2 + o\big(\varepsilon^{-2}(y - \varphi'(x))^2\big)\Big]\,dy
\]
can be shown by Watson's Lemma¹ to converge to
\[
\sqrt{\pi}\,\sqrt{\rho_1(\varphi'(x))}\,\sqrt{\frac{1}{(\varphi^*)''(\varphi'(x))}} = \sqrt{\pi}\,\sqrt{\rho_0(x)}. \tag{36}
\]
By Fatou's Lemma, therefore,
\[
\liminf_{\varepsilon\to 0} Z_\varepsilon \ge 1. \tag{37}
\]

1 This requires a generalization of Watson’s Lemma (see e.g. [Olv97, Th. 3.7.1]) to Lebesgue points. This can be done for the case at hand using the concept of ‘nicely shrinking sequences of sets’ [Yeh06, Th. 25.17]. The pertinent observation is that if one approximates the exponential by step functions, then the convexity of the exponent (2/ε 2 )(x y − ϕ(x) − ϕ ∗ (y)) in y causes the components of this step function to be single intervals, which are a sequence of nicely shrinking sets.


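By the same argument as above, and using the lower bound $\varphi'' \ge 1/3$ (and likewise $(\varphi^*)'' \ge 1/3$), we find that
\[
xy - \varphi(x) - \varphi^*(y) \le \min\Big\{ -\tfrac16\big(x - (\varphi^*)'(y)\big)^2,\; -\tfrac16\big(y - \varphi'(x)\big)^2 \Big\}. \tag{38}
\]
Then we can estimate
\[
\begin{aligned}
\frac{1}{\varepsilon}\int_{\mathbb{R}}\!\int_{\mathbb{R}} \sqrt{\rho_0(x)}\,\sqrt{\rho_1(y)}\,\exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi(x) - \varphi^*(y)\big)\Big]\,dx\,dy
&\le \frac{1}{\varepsilon}\int_0^L\!\!\int_0^L \sqrt{\rho_0((\varphi^*)'(y))}\,\sqrt{\rho_1(y)}\,\exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi(x) - \varphi^*(y)\big)\Big]\,dx\,dy \\
&\quad + \frac{1}{\varepsilon}\int_0^L\!\!\int_0^L \Big|\sqrt{\rho_0(x)} - \sqrt{\rho_0((\varphi^*)'(y))}\Big|\,\sqrt{\rho_1(y)}\,\exp\Big[-\frac{1}{3\varepsilon^2}\big(x - (\varphi^*)'(y)\big)^2\Big]\,dx\,dy.
\end{aligned} \tag{39}
\]
By the same argument as above, in the first term the inner integral converges at $\rho_1$-almost every $y$ to $\rho_1(y)\sqrt{\pi}$ and is bounded by
\[
\frac{1}{\varepsilon}\,\|\rho_0\|_\infty^{1/2}\,\|\rho_1\|_\infty^{1/2} \int_{\mathbb{R}} \exp\Big[-\frac{1}{3\varepsilon^2}\big(x - (\varphi^*)'(y)\big)^2\Big]\,dx = \|\rho_0\|_\infty^{1/2}\,\|\rho_1\|_\infty^{1/2}\,\sqrt{3\pi},
\]
so that
\[
\lim_{\varepsilon\to 0} \frac{1}{\varepsilon}\int_0^L\!\!\int_0^L \sqrt{\rho_0((\varphi^*)'(y))}\,\sqrt{\rho_1(y)}\,\exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi(x) - \varphi^*(y)\big)\Big]\,dx\,dy = \sqrt{\pi}. \tag{40}
\]
To estimate the second term we note that since $(\varphi^*)'$ maps $[0,L]$ to $[0,L]$, we can estimate for all $(x,y) \in [0,L]\times[0,L]$,
\[
\Big|\sqrt{\rho_0(x)} - \sqrt{\rho_0((\varphi^*)'(y))}\Big| \le \omega_{\sqrt{\rho_0}}\big(|x - (\varphi^*)'(y)|\big),
\]
where $\omega_{\sqrt{\rho_0}}$ is the modulus of continuity of $\sqrt{\rho_0} \in C([0,L])$. Then
\[
\begin{aligned}
&\frac{1}{\varepsilon}\int_0^L\!\!\int_0^L \Big|\sqrt{\rho_0(x)} - \sqrt{\rho_0((\varphi^*)'(y))}\Big|\,\sqrt{\rho_1(y)}\,\exp\Big[-\frac{1}{3\varepsilon^2}\big(x - (\varphi^*)'(y)\big)^2\Big]\,dx\,dy \\
&\quad\le \omega_{\sqrt{\rho_0}}(\eta)\,\|\rho_1\|_\infty^{1/2}\,\frac{1}{\varepsilon}\int_0^L\!\!\int_{\{x\in[0,L]:\,|x-(\varphi^*)'(y)|\le\eta\}} \exp\Big[-\frac{1}{3\varepsilon^2}\big(x - (\varphi^*)'(y)\big)^2\Big]\,dx\,dy \\
&\qquad + \|\rho_0\|_\infty^{1/2}\,\|\rho_1\|_\infty^{1/2}\,\frac{1}{\varepsilon}\int_0^L\!\!\int_{\{x\in[0,L]:\,|x-(\varphi^*)'(y)|>\eta\}} \exp\Big[-\frac{1}{3\varepsilon^2}\big(x - (\varphi^*)'(y)\big)^2\Big]\,dx\,dy \\
&\quad\le \omega_{\sqrt{\rho_0}}(\eta)\,\|\rho_1\|_\infty^{1/2}\,L\sqrt{3\pi} + \|\rho_0\|_\infty^{1/2}\,\|\rho_1\|_\infty^{1/2}\,\frac{L^2}{\varepsilon}\,\exp\Big[-\frac{\eta^2}{3\varepsilon^2}\Big].
\end{aligned} \tag{41}
\]
The first term above can be made arbitrarily small by choosing $\eta > 0$ small, and for any fixed $\eta > 0$ the second converges to zero as $\varepsilon \to 0$. Combining (37), (39), (40) and (41), we find the first part of Lemma 6:
\[
\lim_{\varepsilon\to 0} Z_\varepsilon = 1.
\]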


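Continuing with part 2 of Lemma 6, we note that by (38), e.g.,
\[
\pi_0\tilde q^\varepsilon(x) \le Z_\varepsilon^{-1}\,\frac{1}{\varepsilon\sqrt{\pi}}\,\sqrt{\rho_0(x)} \int_0^L \sqrt{\rho_1(y)}\,\exp\Big[-\frac{1}{3\varepsilon^2}\big(y - \varphi'(x)\big)^2\Big]\,dy
\le Z_\varepsilon^{-1}\,\|\rho_0\|_\infty^{1/2}\,\|\rho_1\|_\infty^{1/2}\,\sqrt{3}.
\]
Since $Z_\varepsilon \to 1$, $\pi_0\tilde q^\varepsilon$ is uniformly bounded from above. A similar argument holds for the upper bound on $\pi_1\tilde q^\varepsilon$, and by applying upper bounds on $\varphi''$ and $(\varphi^*)''$ we also obtain uniform lower bounds on $\pi_0\tilde q^\varepsilon$ and $\pi_1\tilde q^\varepsilon$. The boundedness of $\chi_\varepsilon$ then follows from (34) and the bounds on $\rho_0$.

We conclude with the convergence of the $\chi_\varepsilon$ and $\pi_1 q^\varepsilon$. By (36) and (40) we have for almost all $x \in (0,L)$,
\[
\pi_0\tilde q^\varepsilon(x) = Z_\varepsilon^{-1}\,\sqrt{\rho_0(x)}\,\frac{1}{\varepsilon\sqrt{\pi}} \int \sqrt{\rho_1(y)}\,\exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi(x) - \varphi^*(y)\big)\Big]\,dy \longrightarrow \rho_0(x),
\]
and the uniform bounds on $\pi_0\tilde q^\varepsilon$ imply that $\pi_0\tilde q^\varepsilon$ converges to $\rho_0$ in $L^1(0,L)$. Therefore also $\chi_\varepsilon \to 1$ in $L^1(0,L)$. A similar calculation gives $\pi_1 q^\varepsilon \to \rho_1$ in $L^1(0,L)$. This concludes the proof of Lemma 6.

8. Lower Bound

This section gives the proof of the lower-bound estimate, part 1 of Theorem 3. Recall that in the context of part 1 of Theorem 3, we are given a fixed $\rho_0 \in A_\delta \cap C([0,L])$ and a sequence $(\rho^\varepsilon) \subset A_\delta$ with $\rho^\varepsilon \rightharpoonup \rho$. In Sect. 6.3 we described how the lower-bound inequality (21) follows from two inequalities (see Lemma 5). The first of these, $\liminf_{\varepsilon\to 0} E(\rho^\varepsilon) \ge E(\rho)$, follows either from [Geo88, Chap. 14] or by the variational representation of the entropy as given in [DS89, Chap. 3]. The rest of this section is therefore devoted to the proof of the second inequality of Lemma 5,
\[
\limsup_{\varepsilon\to 0} Z_\varepsilon \le 1. \tag{42}
\]
Here $Z_\varepsilon$ is defined in (31) as
\[
Z_\varepsilon := \frac{1}{\varepsilon\sqrt{\pi}} \iint_{\mathbb{R}\times\mathbb{R}} \sqrt{\rho_0(x)}\,\sqrt{\rho^\varepsilon(y)}\,\exp\Big[\frac{2}{\varepsilon^2}\big(xy - \varphi_\varepsilon(x) - \varphi_\varepsilon^*(y)\big)\Big]\,dx\,dy,
\]
where we extend $\rho_0$ and $\rho^\varepsilon$ by zero outside of $[0,L]$, and $\varphi_\varepsilon$ is associated with $d(\rho_0,\rho^\varepsilon)$ as in Lemma 4. This implies among other things that $\varphi_\varepsilon$ is twice differentiable on $[0,L]$, and
\[
\varphi_\varepsilon''(x) = \frac{\rho_0(x)}{\rho^\varepsilon(\varphi_\varepsilon'(x))} \quad\text{for all } x \in [0,L]. \tag{43}
\]
We restrict ourselves to the case $L = 1$, that is, to the interval $K := [0,1]$; by a rescaling argument this entails no loss of generality. We will prove below that there exists a $0 < \delta \le 1/3$ such that whenever
\[
\hat\delta := \max\Big\{ \|\rho_0 - 1\|_{L^\infty(K)},\; \sup_\varepsilon \Big\|\frac{\rho^\varepsilon}{\rho_0} - 1\Big\|_{L^\infty(K)} \Big\} \le \delta,
\]
the inequality (42) holds. This implies the assertion of Lemma 5 and concludes the proof of Theorem 3.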


Fig. 1. The function κεz for negative and positive values of z

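8.1. Main steps. A central step in the proof is a reformulation of the integral defining $Z_\varepsilon$ in terms of a convolution. Upon writing $y = \varphi_\varepsilon'(\xi)$ and $x = \xi + \varepsilon z$, and using $\varphi_\varepsilon(\xi) + \varphi_\varepsilon^*(\varphi_\varepsilon'(\xi)) = \xi\varphi_\varepsilon'(\xi)$, we can rewrite the exponent in $Z_\varepsilon$ as
\[
\begin{aligned}
\varphi_\varepsilon(x) + \varphi_\varepsilon^*(y) - xy
&= \varphi_\varepsilon(\xi + \varepsilon z) + \varphi_\varepsilon^*(\varphi_\varepsilon'(\xi)) - (\xi + \varepsilon z)\,\varphi_\varepsilon'(\xi) && (44)\\
&= \varphi_\varepsilon(\xi + \varepsilon z) - \varphi_\varepsilon(\xi) - \varepsilon z\,\varphi_\varepsilon'(\xi) \\
&= \varepsilon^2 \int_0^z (z - s)\,\varphi_\varepsilon''(\xi + \varepsilon s)\,ds && (45)\\
&= \frac{\varepsilon^2 z^2}{2}\,\big(\kappa_\varepsilon^z * \varphi_\varepsilon''\big)(\xi), && (46)
\end{aligned}
\]
where we define the convolution kernel $\kappa_\varepsilon^z$ by (see Fig. 1)
\[
\kappa_\varepsilon^z(s) = \varepsilon^{-1}\kappa^z(\varepsilon^{-1}s)
\quad\text{and}\quad
\kappa^z(\sigma) =
\begin{cases}
\dfrac{2}{z^2}\,(z + \sigma) & \text{if } -z \le \sigma \le 0, \\[1ex]
-\dfrac{2}{z^2}\,(z + \sigma) & \text{if } 0 \le \sigma \le -z, \\[1ex]
0 & \text{otherwise.}
\end{cases}
\]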
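The step from the Taylor remainder to the convolution form can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: for a smooth periodic test function `u` standing in for $\varphi_\varepsilon''$, it compares $\int_0^z (z-s)\,u(\xi+\varepsilon s)\,ds$ with $\tfrac{z^2}{2}(\kappa_\varepsilon^z * u)(\xi)$, for both signs of $z$. The function names, the midpoint rule and the test values are hypothetical choices.

```python
import math

def kappa(z, sigma):
    """Kernel kappa^z(sigma) as defined in the display above (z != 0)."""
    if z > 0 and -z <= sigma <= 0:
        return 2.0 / z**2 * (z + sigma)
    if z < 0 and 0 <= sigma <= -z:
        return -2.0 / z**2 * (z + sigma)
    return 0.0

def integrate(f, a, b, n=4000):
    """Composite midpoint rule; a > b is allowed and yields the signed integral."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def taylor_remainder(u, xi, eps, z):
    """int_0^z (z - s) u(xi + eps * s) ds."""
    return integrate(lambda s: (z - s) * u(xi + eps * s), 0.0, z)

def kernel_form(u, xi, eps, z):
    """(z^2 / 2) * (kappa_eps^z * u)(xi), written in the variable sigma = s / eps."""
    lo, hi = min(-z, 0.0), max(-z, 0.0)  # support of kappa^z
    conv = integrate(lambda sig: kappa(z, sig) * u(xi - eps * sig), lo, hi)
    return 0.5 * z**2 * conv

# Hypothetical test function playing the role of phi_eps''.
u = lambda t: 1.0 + 0.3 * math.cos(2 * math.pi * t)
pairs = [(taylor_remainder(u, 0.2, 0.1, z), kernel_form(u, 0.2, 0.1, z))
         for z in (1.5, -0.8)]
```

Both entries of each pair should agree up to quadrature error; the kernel $\kappa^z$ has unit mass, so for constant `u` both sides reduce to $u z^2/2$.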

While the domain of definition of (44) is a convenient rectangle $K^2 = [0,1]^2$, after transforming to (45) this domain becomes an inconvenient $\varepsilon$-dependent parallelogram in terms of $z$ and $\xi$. The following lemma therefore allows us to switch to a more convenient setting, in which we work on the flat torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ (for $\xi$) and $\mathbb{R}$ (for $z$).

Lemma 7. Set $u \in L^\infty(\mathbb{T})$ to be the periodic function on the torus $\mathbb{T}$ such that $u(\xi) = \varphi_\varepsilon''(\xi)$ for all $\xi \in K$ (in particular, $u \ge 0$). There exists a function $\omega \in C([0,\infty))$ with $\omega(0) = 0$, depending only on $\rho_0$, such that for all $\hat\delta \le 1/3$,
\[
\sqrt{\pi}\,Z_\varepsilon \le \omega(\varepsilon) + \int_{\mathbb{T}}\!\int_{\mathbb{R}} \rho_0(\xi)\,\sqrt{u(\xi)}\,\exp\big[-(\kappa_\varepsilon^z * u)(\xi)\,z^2\big]\,dz\,d\xi.
\]

Given this lemma it is sufficient to estimate the integral above. To explain the main argument that leads to the inequality (42), we give a heuristic description that is mathematically false but morally correct; this will be remedied below. We approximate in Z ε an expression of the form e−a−b by e−a (1 − b) (let us call this perturbation 1), and we set ρ0 ≡ 1 (perturbation 2). Then √ 2 π Z ε − ω(ε) ≤ u(ξ ) e−u(ξ )z 1 − (κεz ∗ u − u)(ξ )z 2 dzdξ R T 2 −u(ξ )z 2 = u(ξ ) e dzdξ − u(ξ ) e−u(ξ )z (κεz ∗ u) − u (ξ )z 2 dzdξ. T

R

T

R


√ The first term can be calculated by setting ζ = z u(ξ ), √ √ −ζ 2 e dζ dξ = π dξ = π . T R

T

∗ u)(ξ ) − u(ξ ) by cu (ξ )ε2 z 2 , where c = Inthe second term we approximate 1 2 z 4 s κ (s) ds (this is perturbation 3). Then this term becomes, using the same transformation to ζ as above, u (ξ ) 2 2 u(ξ ) e−u(ξ )z u (ξ )z 4 dzdξ = −cε2 e−ζ ζ 4 dζ dξ − cε2 2 u(ξ ) T R T R 2 √ u (ξ ) = −2cε2 π dξ. (47) 3 u(ξ ) T (κεz

Therefore this term is negative and of order ε2 as ε → 0, and the inequality (42) follows. The full argument below is based on this principle, but corrects for the three perturbations made above. Note that the difference e−a−b − e−a (1 − b)

(48)

is positive, so that the ensuing correction competes with (47). In addition, both the beneficial contribution from (47) and the detrimental contribution from (48) are of order ε2 . The argument only works because the corresponding constants happen to be ordered in the right way, and then only when u −1∞ is small. This is the reason for the restriction represented by δ. 8.2. Proof of Lemma 7. Since δˆ ≤ 1/3, then (43) implies that ϕε is Lipschitz on K , and we can transform Z ε following the sequence (44)–(46), and using supp ρ0 , ρ ε = K : 2 √ 1 π Zε = ρ ε (y) ρ0 (x) exp 2 (x y − ϕε (x) − ϕε∗ (y)) d xd y ε K ε K (1−ξ )/ε = ρ ε (ϕε (ξ )) ρ0 (εz + ϕε∗ (y)) exp[−(κεz ∗ ϕε )(ξ )z 2 ] dz ϕε (ξ )dξ K

=

K

−ξ/ε

ρ0 (ξ + εz) exp[−(κεz ∗ ϕε )(ξ )z 2 ] dzdξ, ρ0 (ξ ) ϕε (ξ ) R

where we used (43) in the last line. Note that (κεz ∗ ϕε )(ξ )z 2 = (κεz ∗ u)(ξ )z 2 for all z ∈ R and for all ξ ∈ K εz , where εz K is the interval K from which an interval of length εz has been removed from the left (if z < 0) or from the right (if z > 0). Therefore √ π Zε − ρ0 (ξ ) u(ξ ) exp[−(κεz ∗ u)(ξ )z 2 ] dzdξ T R = ρ0 (ξ ) u(ξ ) ρ0 (ξ + εz) − ρ0 (ξ ) exp[−(κεz ∗ u)(ξ )z 2 ] dξ dz εz R K + ρ0 (ξ ) u(ξ ) ρ0 (ξ + εz) exp[−(κεz ∗ u)(ξ )z 2 ] dξ dz R K \K εz − ρ0 (ξ ) u(ξ ) exp[−(κεz ∗ u)(ξ )z 2 ] dξ dz. R K \K εz


The final term is negative and we discard it. From the assumption δˆ ≤ 1/2 we deduce u − 1∞ ≤ 1/2, so that the first term on the right-hand side can be estimated from above (in terms of the modulus of continuity ωρ0 of ρ0 ) by 1/2

1/2

ρ0 L ∞ (K ) u L ∞ (K )

R K εz

ωρ0 (εz)e−z

2 /2

dξ dz ≤

3 2 ωρ0 (εz)e−z /2 dz, 2 R

which converges to zero as ε → 0, with a rate of convergence that depends only on ρ0 . Similarly, the middle term we estimate by

1/2

ρ0 L ∞ (K ) u L ∞ (K )

R

|K \ K εz |e−z

which converges to zero as ε → 0.

2 /2

dz ≤

3 3/2 2 ε |z|e−z /2 dz, 2 R

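8.3. The semi-norm $\|\cdot\|_\varepsilon$. It is convenient to introduce a specific semi-norm for the estimates that we make below, which takes into account the nature of the convolution expressions. On the torus $\mathbb{T}$ we define
\[
\|u\|_\varepsilon^2 := \sum_{k\in\mathbb{Z}} |u_k|^2 \big(1 - e^{-\pi^2 k^2 \varepsilon^2}\big),
\]
where the $u_k$ are the Fourier coefficients of $u$,
\[
u(x) = \sum_{k\in\mathbb{Z}} u_k\, e^{2\pi i k x}.
\]
The following lemmas give the relevant properties of this seminorm.

Lemma 8. For $\varepsilon > 0$,
\[
\int_{\mathbb{R}}\!\int_{\mathbb{T}} e^{-z^2}\big(u(x + \varepsilon z) - u(x)\big)^2\,dx\,dz = 2\sqrt{\pi}\,\|u\|_\varepsilon^2. \tag{49}
\]

Lemma 9. For $\varepsilon > 0$,
\[
\int_{\mathbb{R}}\!\int_{\mathbb{T}} e^{-z^2}\big(u(x) - \kappa_\varepsilon^z * u(x)\big)^2 z^4\,dx\,dz \le \tfrac56\sqrt{\pi}\,\|u\|_\varepsilon^2. \tag{50}
\]

Lemma 10. For $\alpha > 0$ and $\varepsilon > 0$,
\[
\|u\|_{\varepsilon/\alpha} \le
\begin{cases}
\|u\|_\varepsilon & \text{if } \alpha \ge 1, \\[0.5ex]
\dfrac{1}{\alpha}\,\|u\|_\varepsilon & \text{if } 0 < \alpha \le 1,
\end{cases} \tag{51}
\]
where $\|\cdot\|_{\varepsilon/\alpha}$ should be interpreted as $\|\cdot\|_\varepsilon$ with $\varepsilon$ replaced by $\varepsilon/\alpha$.

The proofs of these results are given in the Appendix.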
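As a quick sanity check of (49), consider the single real mode $u(x) = \cos(2\pi k x)$ with integer $k \ge 1$, whose Fourier coefficients are $u_{\pm k} = 1/2$. The $x$-integral over the torus can be done by hand and equals $1 - \cos(2\pi k \varepsilon z)$, so only a one-dimensional Gaussian integral remains. The sketch below evaluates both sides numerically; the quadrature helper and the sample values of `k` and `eps` are illustrative assumptions.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] (illustrative helper)."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def lhs_lemma8(k, eps):
    """Left side of (49) for u(x) = cos(2 pi k x), after the x-integration by hand."""
    omega = 2 * math.pi * k * eps
    # Truncate the z-integral to [-10, 10]; the Gaussian tail beyond is negligible.
    return integrate(lambda z: math.exp(-z * z) * (1 - math.cos(omega * z)), -10.0, 10.0)

def rhs_lemma8(k, eps):
    """2 sqrt(pi) ||u||_eps^2 with the two nonzero coefficients u_k = u_{-k} = 1/2."""
    norm_sq = 2 * 0.25 * (1 - math.exp(-(math.pi * k * eps) ** 2))
    return 2 * math.sqrt(math.pi) * norm_sq

checks = [(lhs_lemma8(k, eps), rhs_lemma8(k, eps))
          for k in (1, 2, 3) for eps in (0.05, 0.2)]
```

Each pair in `checks` should agree to quadrature accuracy, which mirrors the single-mode computation used in the Appendix.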


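8.4. Conclusion. To alleviate notation we drop the caret from $\hat\delta$ and simply write $\delta$. Following the discussion above we estimate
\[
\int_{\mathbb{T}}\!\int_{\mathbb{R}} \rho_0(\xi)\sqrt{u(\xi)}\,\exp\big[-(\kappa_\varepsilon^z * u)(\xi)z^2\big]\,dz\,d\xi
= \int_{\mathbb{T}}\!\int_{\mathbb{R}} \rho_0(\xi)\sqrt{u(\xi)}\,e^{-u(\xi)z^2}\,dz\,d\xi
+ \int_{\mathbb{T}}\!\int_{\mathbb{R}} \rho_0(\xi)\sqrt{u(\xi)}\,e^{-u(\xi)z^2}\big[u(\xi) - \kappa_\varepsilon^z * u(\xi)\big]z^2\,dz\,d\xi + R, \tag{52}
\]
where
\[
\begin{aligned}
R &= \int_{\mathbb{T}}\!\int_{\mathbb{R}} \rho_0(\xi)\sqrt{u(\xi)}\,e^{-u(\xi)z^2}\Big\{\exp\big[(u(\xi) - \kappa_\varepsilon^z * u(\xi))z^2\big] - 1 - (u(\xi) - \kappa_\varepsilon^z * u(\xi))z^2\Big\}\,dz\,d\xi \\
&\le (1+\delta)^{3/2} \int_{\mathbb{T}}\!\int_{\mathbb{R}} e^{-u(\xi)z^2}\Big\{\exp\big[(u(\xi) - \kappa_\varepsilon^z * u(\xi))z^2\big] - 1 - (u(\xi) - \kappa_\varepsilon^z * u(\xi))z^2\Big\}\,dz\,d\xi.
\end{aligned}
\]
Since $\|u - 1\|_{L^\infty(\mathbb{T})} \le \delta$, we have $\|u - \kappa_\varepsilon^z * u\|_{L^\infty(\mathbb{T})} \le 2\delta$ and therefore
\[
\exp\big[(u(\xi) - \kappa_\varepsilon^z * u(\xi))z^2\big] - 1 - (u(\xi) - \kappa_\varepsilon^z * u(\xi))z^2 \le \tfrac12\, e^{2\delta z^2}\big(u(\xi) - \kappa_\varepsilon^z * u(\xi)\big)^2 z^4,
\]
so that
\[
R \le \frac{(1+\delta)^{3/2}}{2} \int_{\mathbb{T}}\!\int_{\mathbb{R}} e^{(-u(\xi) + 2\delta)z^2}\big(u(\xi) - \kappa_\varepsilon^z * u(\xi)\big)^2 z^4\,dz\,d\xi
\le \frac{(1+\delta)^{3/2}}{2} \int_{\mathbb{T}}\!\int_{\mathbb{R}} e^{(-1 + 3\delta)z^2}\big(u(\xi) - \kappa_\varepsilon^z * u(\xi)\big)^2 z^4\,dz\,d\xi.
\]
Setting $\alpha = \sqrt{1 - 3\delta}$ and $\zeta = \alpha z$, we find
\[
R \le \frac{(1+\delta)^{3/2}}{2(1-3\delta)^{5/2}} \int_{\mathbb{T}}\!\int_{\mathbb{R}} e^{-\zeta^2}\big(u(\xi) - \kappa_\varepsilon^{\zeta/\alpha} * u(\xi)\big)^2 \zeta^4\,d\zeta\,d\xi.
\]
Noting that $\kappa_\varepsilon^{\zeta/\alpha} = \kappa_{\varepsilon/\alpha}^{\zeta}$, we have with $\tilde\varepsilon := \varepsilon/\alpha = \varepsilon(1 - 3\delta)^{-1/2}$
\[
\begin{aligned}
R &\le \frac{(1+\delta)^{3/2}}{2(1-3\delta)^{5/2}} \int_{\mathbb{T}}\!\int_{\mathbb{R}} e^{-\zeta^2}\big(u(\xi) - \kappa_{\tilde\varepsilon}^{\zeta} * u(\xi)\big)^2 \zeta^4\,d\zeta\,d\xi \\
&\overset{(50)}{\le} \frac{(1+\delta)^{3/2}}{2(1-3\delta)^{5/2}}\,\tfrac56\sqrt{\pi}\,\|u\|_{\tilde\varepsilon}^2 \\
&\overset{(51)}{\le} \frac{(1+\delta)^{3/2}}{2(1-3\delta)^{7/2}}\,\tfrac56\sqrt{\pi}\,\|u\|_{\varepsilon}^2.
\end{aligned} \tag{53}
\]
We next calculate, substituting $\zeta = z\sqrt{u(\xi)}$,
\[
\int_{\mathbb{T}}\!\int_{\mathbb{R}} \rho_0(\xi)\sqrt{u(\xi)}\,e^{-u(\xi)z^2}\,dz\,d\xi = \int_{\mathbb{T}} \rho_0(\xi)\int_{\mathbb{R}} e^{-\zeta^2}\,d\zeta\,d\xi = \sqrt{\pi}\int_{\mathbb{T}} \rho_0(\xi)\,d\xi = \sqrt{\pi}. \tag{54}
\]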


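Finally we turn to the term
\[
I := \int_{\mathbb{T}}\!\int_{\mathbb{R}} \rho_0(\xi)\sqrt{u(\xi)}\,e^{-u(\xi)z^2}\big(u(\xi) - \kappa_\varepsilon^z * u(\xi)\big)z^2\,dz\,d\xi.
\]

Lemma 11. Let $\varepsilon > 0$, let $\rho_0 \in L^\infty(\mathbb{T}) \cap C([0,1])$ with $\int_{\mathbb{T}} \rho_0 = 1$, and let $u \in L^\infty(\mathbb{T})$. Recall that $0 < \delta < 1/3$ with
\[
\|\rho_0 - 1\|_{L^\infty(\mathbb{T})} \le \delta \quad\text{and}\quad \|u - 1\|_{L^\infty(\mathbb{T})} \le \delta.
\]
Then
\[
I \le -\frac12\,\frac{1-\delta}{(1+\delta)^2}\,\sqrt{\pi}\,\|u\|_\varepsilon^2 + r_\varepsilon,
\]
where $r_\varepsilon \to 0$ uniformly in $\delta$.

From this lemma and the earlier estimates the result follows. Combining Lemma 7 with (52), (54), Lemma 11 and (53),
\[
\sqrt{\pi}\,Z_\varepsilon \le \sqrt{\pi} - \frac12\,\frac{1-\delta}{(1+\delta)^2}\,\sqrt{\pi}\,\|u\|_\varepsilon^2 + \frac{(1+\delta)^{3/2}}{(1-3\delta)^{7/2}}\,\frac{5}{12}\,\sqrt{\pi}\,\|u\|_\varepsilon^2 + S_\varepsilon,
\]
where $S_\varepsilon = \omega(\varepsilon) + r_\varepsilon$ converges to zero as $\varepsilon \to 0$, uniformly in $\delta$. Since $1/2 > 5/12$, for sufficiently small $\delta > 0$ the two middle terms add up to a negative value. Then it follows that $\limsup_{\varepsilon\to 0} Z_\varepsilon \le 1$.

Proof of Lemma 11. Writing $I$ as
\[
I = 2\int_{\mathbb{T}} \rho_0(\xi)\sqrt{u(\xi)} \int_{\mathbb{R}}\!\int_0^z e^{-u(\xi)z^2}(z - \sigma)\big(u(\xi) - u(\xi + \varepsilon\sigma)\big)\,d\sigma\,dz\,d\xi,
\]
we apply Fubini's Lemma in the $(z,\sigma)$-plane to find
\[
\begin{aligned}
I &= -2\int_{\mathbb{T}} \rho_0(\xi)\sqrt{u(\xi)} \int_0^\infty\!\!\int_\sigma^\infty e^{-u(\xi)z^2}(z - \sigma)\big[u(\xi + \varepsilon\sigma) - 2u(\xi) + u(\xi - \varepsilon\sigma)\big]\,dz\,d\sigma\,d\xi \\
&= -2\int_0^\infty \sigma \int_{\mathbb{T}} \rho_0(\xi)\big[u(\xi + \varepsilon\sigma) - 2u(\xi) + u(\xi - \varepsilon\sigma)\big]\,h(\sigma^2 u(\xi))\,d\xi\,d\sigma,
\end{aligned}
\]
where
\[
h(s) := \frac{1}{\sqrt{s}} \int_{\sqrt{s}}^\infty e^{-\zeta^2}\big(\zeta - \sqrt{s}\big)\,d\zeta \le \frac{1}{2\sqrt{s}}\,e^{-s}. \tag{55}
\]
Since $\|u - 1\|_\infty \le \delta$,
\[
h'(\sigma^2 u) = \frac{-1}{4\sigma^3 u^{3/2}}\,e^{-u\sigma^2} \le \frac{-1}{4\sigma^3 (1+\delta)^{3/2}}\,e^{-(1+\delta)\sigma^2}. \tag{56}
\]
Then, writing $D_{\varepsilon\sigma} f(\xi)$ for $f(\xi + \varepsilon\sigma) - f(\xi)$, we have
\[
\int_{\mathbb{T}} \rho_0(\xi)\big[u(\xi + \varepsilon\sigma) - 2u(\xi) + u(\xi - \varepsilon\sigma)\big]\,h(\sigma^2 u(\xi))\,d\xi
= -\int_{\mathbb{T}} \rho_0(\xi)\,D_{\varepsilon\sigma}u(\xi)\,D_{\varepsilon\sigma}h(\sigma^2 u)(\xi)\,d\xi - \int_{\mathbb{T}} D_{\varepsilon\sigma}\rho_0(\xi)\,D_{\varepsilon\sigma}u(\xi)\,h(\sigma^2 u(\xi + \varepsilon\sigma))\,d\xi,
\]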


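so that
\[
I = 2\int_0^\infty \sigma \int_{\mathbb{T}} \rho_0(\xi)\,D_{\varepsilon\sigma}u(\xi)\,D_{\varepsilon\sigma}h(\sigma^2 u)(\xi)\,d\xi\,d\sigma
+ 2\int_0^\infty \sigma \int_{\mathbb{T}} D_{\varepsilon\sigma}\rho_0(\xi)\,D_{\varepsilon\sigma}u(\xi)\,h(\sigma^2 u(\xi + \varepsilon\sigma))\,d\xi\,d\sigma
= I_a + I_b.
\]
Taking $I_b$ first, we estimate one part of this integral with (55) by
\[
\begin{aligned}
2\int_0^\infty \sigma \int_0^{1-\varepsilon\sigma} D_{\varepsilon\sigma}\rho_0(\xi)\,D_{\varepsilon\sigma}u(\xi)\,h(\sigma^2 u(\xi + \varepsilon\sigma))\,d\xi\,d\sigma
&\le 2\int_0^\infty \sigma\,\omega_{\rho_0}(\varepsilon\sigma)\,2\delta\,\frac{1}{2\sigma\sqrt{1-\delta}}\,e^{-(1-\delta)\sigma^2}\,d\sigma \\
&\le \frac{2\delta}{\sqrt{1-\delta}} \int_0^\infty \omega_{\rho_0}(\varepsilon\sigma)\,e^{-(1-\delta)\sigma^2}\,d\sigma,
\end{aligned}
\]
and this converges to zero as $\varepsilon \to 0$ uniformly in $0 < \delta < 1/3$. The remainder of $I_b$ we estimate
\[
\begin{aligned}
2\int_0^\infty \sigma \int_{1-\varepsilon\sigma}^{1} D_{\varepsilon\sigma}\rho_0(\xi)\,D_{\varepsilon\sigma}u(\xi)\,h(\sigma^2 u(\xi + \varepsilon\sigma))\,d\xi\,d\sigma
&\le 2\int_0^\infty \varepsilon\sigma^2\,2\delta\,\frac{1}{2\sigma\sqrt{1-\delta}}\,e^{-(1-\delta)\sigma^2}\,d\sigma \\
&= \frac{2\varepsilon\delta}{\sqrt{1-\delta}} \int_0^\infty \sigma\,e^{-(1-\delta)\sigma^2}\,d\sigma,
\end{aligned}
\]
which again converges to zero as $\varepsilon \to 0$, uniformly in $\delta$. To estimate $I_a$ we note that by (56) and the chain rule,
\[
D_{\varepsilon\sigma}h(\sigma^2 u)(\xi) \le -\frac{1}{4\sigma^3}\,\frac{1}{(1+\delta)^{3/2}}\,e^{-(1+\delta)\sigma^2}\,D_{\varepsilon\sigma}u(\xi)\,\sigma^2,
\]
and thus
\[
\begin{aligned}
I_a &\overset{(56)}{\le} -\frac{1-\delta}{2(1+\delta)^{3/2}} \int_0^\infty\!\!\int_{\mathbb{T}} e^{-(1+\delta)\sigma^2}\big(D_{\varepsilon\sigma}u(\xi)\big)^2\,d\xi\,d\sigma \\
&= -\frac{1-\delta}{2(1+\delta)^{2}} \int_0^\infty\!\!\int_{\mathbb{T}} e^{-s^2}\big(D_{\varepsilon s/\sqrt{1+\delta}}\,u(\xi)\big)^2\,d\xi\,ds \\
&\overset{(49)}{=} -\frac{1-\delta}{2(1+\delta)^{2}}\,\sqrt{\pi}\,\|u\|_{\varepsilon/\sqrt{1+\delta}}^2 \\
&\overset{(51)}{\le} -\frac{1-\delta}{2(1+\delta)^{2}}\,\sqrt{\pi}\,\|u\|_{\varepsilon}^2.
\end{aligned}
\]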


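Appendix A. Proofs of the Lemmas in Section 8.3

Proof of Lemma 8. Since the left- and right-hand sides are both quadratic in $u$, it is sufficient to prove the lemma for a single Fourier mode $u(x) = \exp 2\pi i k x$, for which
\[
\int_{\mathbb{R}}\!\int_{\mathbb{T}} e^{-z^2}\,|u(x + \varepsilon z) - u(x)|^2\,dx\,dz
= \int_{\mathbb{R}} e^{-z^2}\,|\exp 2\pi i k \varepsilon z - 1|^2\,dz
= 2\int_{\mathbb{R}} e^{-z^2}(1 - \cos 2\pi k \varepsilon z)\,dz
= 2\sqrt{\pi}\,\big(1 - e^{-\pi^2 k^2 \varepsilon^2}\big),
\]
since
\[
\int_{\mathbb{R}} e^{-z^2}\,dz = \sqrt{\pi}
\quad\text{and}\quad
\int_{\mathbb{R}} e^{-z^2}\cos\omega z\,dz = \sqrt{\pi}\,e^{-\omega^2/4}.
\]

Proof of Lemma 9. Again it is sufficient to prove the lemma for a single Fourier mode $u(x) = \exp 2\pi i k x$, for which
\[
\int_{\mathbb{R}}\!\int_{\mathbb{T}} e^{-z^2}\,|u(x) - \kappa_\varepsilon^z * u(x)|^2 z^4\,dx\,dz = \int_{\mathbb{R}} e^{-z^2} z^4\,|1 - \hat\kappa_\varepsilon^z(k)|^2\,dz.
\]
Writing $\omega := 2\pi k \varepsilon$, the Fourier transform of $\kappa_\varepsilon^z$ on $\mathbb{T}$ is calculated to be
\[
\hat\kappa_\varepsilon^z(k) = \int_0^1 \kappa_\varepsilon^z(x)\,e^{-2\pi i k x}\,dx = -\frac{2}{\omega^2 z^2}\big[e^{i\omega z} - 1 - i\omega z\big].
\]
Then
\[
1 - \hat\kappa_\varepsilon^z(k) = \frac{2}{\omega^2 z^2}\Big[e^{i\omega z} - 1 - i\omega z + \frac{\omega^2 z^2}{2}\Big],
\]
so that
\[
\begin{aligned}
z^4\,|1 - \hat\kappa_\varepsilon^z(k)|^2
&= \frac{4}{\omega^4}\Big[\Big(1 - \cos\omega z - \frac{\omega^2 z^2}{2}\Big)^2 + (\sin\omega z - \omega z)^2\Big] \\
&= \frac{4}{\omega^4}\Big[2 - 2\cos\omega z + \frac{\omega^4 z^4}{4} - 2\omega z\sin\omega z + \omega^2 z^2\cos\omega z\Big].
\end{aligned}
\]
We then calculate
\[
\begin{aligned}
\int_{\mathbb{R}} e^{-z^2} z^4\,dz &= \tfrac34\sqrt{\pi}, \\
\int_{\mathbb{R}} e^{-z^2}\cos\omega z\,dz &= \sqrt{\pi}\,e^{-\omega^2/4}, \\
\int_{\mathbb{R}} e^{-z^2} z\sin\omega z\,dz &= \frac{\omega}{2}\sqrt{\pi}\,e^{-\omega^2/4}, \\
\int_{\mathbb{R}} e^{-z^2} z^2\cos\omega z\,dz &= \sqrt{\pi}\,e^{-\omega^2/4}\Big(\frac12 - \frac{\omega^2}{4}\Big),
\end{aligned}
\]
implying that
\[
\begin{aligned}
\int_{\mathbb{R}} e^{-z^2} z^4\,|1 - \hat\kappa_\varepsilon^z(k)|^2\,dz
&= \frac{4\sqrt{\pi}}{\omega^4}\Big[2 - 2e^{-\omega^2/4} + \frac{3}{16}\omega^4 - \omega^2 e^{-\omega^2/4} + \omega^2 e^{-\omega^2/4}\Big(\frac12 - \frac{\omega^2}{4}\Big)\Big] \\
&= \frac{4\sqrt{\pi}}{\omega^4}\Big[2 - 2e^{-\omega^2/4} + \frac{3}{16}\omega^4 - \frac12\omega^2 e^{-\omega^2/4} - \frac14\omega^4 e^{-\omega^2/4}\Big].
\end{aligned}
\]
We conclude the lemma by showing that the right-hand side is bounded from above by
\[
\tfrac56\sqrt{\pi}\,\big(1 - e^{-\omega^2/4}\big).
\]
Indeed, subtracting the two we find
\[
\frac{4\sqrt{\pi}}{\omega^4}\Big[2 - 2e^{-\omega^2/4} + \frac{3}{16}\omega^4 - \frac12\omega^2 e^{-\omega^2/4} - \frac14\omega^4 e^{-\omega^2/4} - \frac{5}{24}\omega^4\big(1 - e^{-\omega^2/4}\big)\Big],
\]
and setting $s := \omega^2/4$, the sign of this expression is determined by
\[
2(1 - e^{-s}) - \tfrac13 s^2 - 2s e^{-s} - \tfrac23 s^2 e^{-s}.
\]
This function is zero at $s = 0$, and its derivative is
\[
-\tfrac23 s + \tfrac23 s e^{-s} + \tfrac23 s^2 e^{-s},
\]
which is nonpositive for all $s \ge 0$ by the inequality $e^{-s}(1 + s) \le 1$.

Proof of Lemma 10. Since the function $\alpha \mapsto 1 - e^{-\pi^2 k^2 \varepsilon^2/\alpha^2}$ is decreasing in $\alpha$, the first inequality follows immediately. To prove the second it is sufficient to show that $1 - e^{-\beta x} \le \beta(1 - e^{-x})$ for $\beta > 1$ and $x > 0$, which can be recognized by differentiating both sides of the inequality.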
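The two elementary inequalities that close the proofs of Lemma 9 and Lemma 10 can be spot-checked numerically. The following sketch is only an illustration on a finite grid, not a proof: `g` is the sign-determining expression from the proof of Lemma 9 (with $s = \omega^2/4$), and `lemma10_gap` is the difference of the two sides of the inequality used for Lemma 10. The grid and the sample values of $\beta$ are arbitrary assumptions.

```python
import math

def g(s):
    """Sign-determining expression from the proof of Lemma 9, with s = omega^2 / 4."""
    e = math.exp(-s)
    return 2 * (1 - e) - s**2 / 3 - 2 * s * e - 2 / 3 * s**2 * e

def lemma10_gap(beta, x):
    """beta * (1 - e^{-x}) - (1 - e^{-beta x}); should be >= 0 for beta >= 1, x > 0."""
    return beta * (1 - math.exp(-x)) - (1 - math.exp(-beta * x))

# Arbitrary grid on (0, 20]; near s = 0 the expression g behaves like -s^4 / 12.
s_grid = [0.1 * j for j in range(1, 201)]
max_g = max(g(s) for s in s_grid)
min_gap = min(lemma10_gap(b, x) for b in (1.0, 1.5, 3.0, 10.0) for x in s_grid)
```

On this grid `max_g` is negative and `min_gap` is nonnegative, consistent with the monotonicity arguments in the text.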

References

[AGS05] Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in mathematics ETH Zürich. Basel: Birkhäuser, 2005
[Beu60] Beurling, A.: An automorphism of product measures. Ann. Math. 72(1), 189–200 (1960)
[Bra02] Braides, A.: Gamma-Convergence for Beginners. Oxford: Oxford University Press, 2002
[BS98] Berkowitz, B., Scher, H.: Theory of anomalous chemical transport in random fracture networks. Phys. Rev. E 57(5), 5858–5869 (1998)
[BSS00] Berkowitz, B., Scher, H., Silliman, S.E.: Anomalous transport in laboratory-scale, heterogeneous porous media. Water Resour. Res. 36(1), 149–158 (2000)
[CMV06] Carrillo, J.A., McCann, R.J., Villani, C.: Contractions in the 2-Wasserstein length space and thermalization of granular media. Arch. Rat. Mech. Anal. 179, 217–263 (2006)
[Csi84] Csiszár, I.: Sanov property, generalized I-projection and a conditional limit theorem. Ann. Probab. 12(3), 768–793 (1984)
[dH00] den Hollander, F.: Large Deviations. Providence, RI: Amer. Math. Soc., 2000
[DM93] Dal Maso, G.: An Introduction to Γ-Convergence, Volume 8 of Progress in Nonlinear Differential Equations and Their Applications. Boston: Birkhäuser, First edition, 1993
[DMP92] De Masi, A., Presutti, E.: Mathematical methods for hydrodynamic limits. Lecture Notes in Mathematics, Berlin-Heidelberg-New York: Springer, 1992
[DS89] Deuschel, J.D., Stroock, D.W.: Large deviations. London, New York: Academic Press, 1989
[EG92] Evans, L.C., Gariepy, R.F.: Measure Theory and Fine Properties of Functions. Studies in Advanced Mathematics. Boca Raton, FL: CRC Press, 1992
[Ein05] Einstein, A.: Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen. Annalen der Physik 17(4), 548–560 (1905)
[Geo88] Georgii, H.-O.: Gibbs Measures and Phase Transitions. Berlin: de Gruyter, 1988
[Gla03] Glasner, K.: A diffuse-interface approach to Hele-Shaw flow. Nonlinearity 16(1), 49–66 (2003)
[GO01] Giacomelli, L., Otto, F.: Variational formulation for the lubrication approximation of the Hele-Shaw flow. Calc. Var. Part. Diff. Eqs. 13(3), 377–403 (2001)
[GST08] Gianazza, U., Savaré, G., Toscani, G.: The Wasserstein gradient flow of the Fisher information and the quantum drift-diffusion equation. Arch. Ration. Mech. Anal. 194(1), 133–220 (2009)
[HJ00] Huang, C., Jordan, R.: Variational formulations for Vlasov-Poisson-Fokker-Planck systems. Math. Meth. Appl. Sci. 23(9), 803–843 (2000)
[Hua00] Huang, C.: A variational principle for the Kramers equation with unbounded external forces. J. Math. Anal. Appl. 250(1), 333–367 (2000)
[JKO97] Jordan, R., Kinderlehrer, D., Otto, F.: Free energy and the Fokker-Planck equation. Physica D: Nonlinear Phenomena 107(2-4), 265–271 (1997)
[JKO98] Jordan, R., Kinderlehrer, D., Otto, F.: The variational formulation of the Fokker-Planck equation. SIAM J. Math. Anal. 29(1), 1–17 (1998)
[KL99] Kipnis, C., Landim, C.: Scaling limits of interacting particle systems. Berlin-Heidelberg-New York: Springer Verlag, 1999
[KO90] Kipnis, C., Olla, S.: Large deviations from the hydrodynamical limit for a system of independent Brownian particles. Stoch. and Stoch. Reps. 33(1-2), 17–25 (1990)
[Léo07] Léonard, C.: A large deviation approach to optimal transport. http://arXiv.org/abs/0710.1461v1 [math.PR], 2007
[McC97] McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128, 153–179 (1997)
[MMS09] Matthes, D., McCann, R.J., Savaré, G.: A family of nonlinear fourth order equations of gradient flow type. http://arXiv.org/abs/0901.0540vL [math.AP], 2009
[Olv97] Olver, F.W.J.: Asymptotics and Special Functions. London-New York: Academic Press, 1997
[Ott98] Otto, F.: Lubrication approximation with prescribed nonzero contact angle. Comm. Part. Diff. Eqs. 23(11), 63–103 (1998)
[Ott01] Otto, F.: The geometry of dissipative evolution equations: the porous medium equation. Comm. Part. Diff. Eqs. 26, 101–174 (2001)
[PP08] Portegies, J.W., Peletier, M.A.: Well-posedness of a parabolic moving-boundary problem in the setting of Wasserstein gradient flows. Interfaces and Free Boundaries 12, 121–150 (2010)
[Sch31] Schrödinger, E.: Über die Umkehrung der Naturgesetze. Sitzungsber. Preuss. Akad. Wiss. Phys. Math. Kl, 144–153 (1931)
[Vil03] Villani, C.: Topics in Optimal Transportation. Providence, RI: Amer. Math. Soc., 2003
[Vil09] Villani, C.: Optimal transport: Old and new. Springer Verlag, 2009
[WW02] Weeks, E.R., Weitz, D.A.: Subdiffusion and the cage effect studied near the colloidal glass transition. Chem. Phys. 284(1-2), 361–367 (2002)
[Yeh06] Yeh, J.: Real Analysis: Theory of Measure and Integration. River Edge, NJ: World Scientific, 2006

Communicated by H.-T. Yau

Commun. Math. Phys. 307, 817–860 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1326-6


LSI for Kawasaki Dynamics with Weak Interaction Georg Menz Max Planck Institute for Mathematics in the Sciences, Inselstr. 22 - 26, Leipzig, Germany. E-mail: [email protected] Received: 16 June 2010 / Accepted: 31 March 2011 Published online: 10 September 2011 – © Springer-Verlag 2011

Abstract: We consider a large lattice system of unbounded continuous spins that are governed by a Ginzburg-Landau type potential and a weak quadratic interaction. We derive the logarithmic Sobolev inequality (LSI) for Kawasaki dynamics uniform in the boundary data. The scaling of the LSI constant is optimal in the system size and our argument is independent of the geometric structure of the system. The proof consists of an application of the two-scale approach of Grunewald, Otto, Westdickenberg & Villani. Several ideas are needed to solve new technical difficulties due to the interaction. Let us mention the application of a new covariance estimate, a conditioning technique, and a generalization of the local Cramér theorem.

Contents
1. Introduction and Main Result
2. Proof of the Main Result
3. The Local Cramér Theorem for Inhomogeneous Single-Site Potentials
A. Proof of the Generalized Local Cramér Theorem
B. Basic Facts about the LSI

1. Introduction and Main Result The logarithmic Sobolev inequality (LSI) – introduced by Gross [8] – is a powerful tool for studying spin systems. It implies exponential convergence to equilibrium of the naturally associated diffusion process and also characterizes the rate of convergence (cf. [26–28,31,33] and Remark 3). Therefore an appropriate scaling of the LSI constant in the system size indicates the absence of phase transitions. The LSI is also useful to deduce the hydrodynamic limit (see [9,15]). In this article we consider a large system of real-valued unbounded spins. The Hamiltonian of the system is given by a GinzburgLandau type potential and a two-body interaction (see (2)). The quadratic interaction

818

G. Menz

is not restricted to finite range. Any two spins of the system are allowed to interact. Because Kawasaki dynamics conserves the mean spin of the system, we work with the canonical ensemble. Even if there is no interaction term in the Hamiltonian, there is a long-range interaction due to the conservation of the mean spin. Therefore it was a challenge to establish the LSI for the canonical ensemble in the case of a non-interacting quadratic Hamiltonian (cf. [18] for discrete spin and [6,9,16] for continuous spin). The main difficulty was to attain the optimal scaling behavior of the LSI constant in the system size. In [30] this result was generalized to weak interaction of finite range for bounded discrete spin values (see also [4]). In this article we show that the LSI also holds for unbounded continuous spin values with weak two-body interaction. The LSI constant is uniform in the boundary data and scales optimal in the system size. Compared to the discrete case we have to deal with new technical difficulties due to the fact that the spin values and the range of interaction are unbounded. Because we apply the original two-scale approach of [9], it is also possible to derive the hydrodynamic limit with the same method as outlined in [9]. However, the hydrodynamic limit is not considered in this article. Note that for existing results on the hydrodynamic limit (cf. [11,29]) there are restrictions to lattices of certain dimensions or nearest neighbor interaction, whereas our approach is independent of the geometrical structure of the system. For the proof of the main result we also establish a generalized version of the local Cramér theorem [9], where the single-site potentials have an additional linear term and depend on the site. 1.1. Basic setting and main result. We consider the following type of spin system. Let be an arbitrary set of sites that are indexed by {1, . . . , N }, N ∈ N. To each site i ∈ {1, . . . , N } we assign a real value xi ∈ R called the spin. 
A vector $x = (x_1, \dots, x_N) \in \mathbb{R}^N$ represents a configuration of the spin system. The energy of a configuration is given by the Hamiltonian $H(x) \in \mathbb{R}$. In our case there are three contributions to the Hamiltonian:
◦ for each site $i \in \{1, \dots, N\}$ a Ginzburg-Landau type single-site potential $\psi_i : \mathbb{R} \to \mathbb{R}$ that satisfies
\[
\psi_i(x) = \frac{1}{2} x^2 + \delta\psi_i(x) \quad \text{and} \quad \|\delta\psi_i\|_{C^2} \le c_1 < \infty,
\tag{1}
\]
uniformly in $i \in \{1, \dots, N\}$;
◦ a two-body interaction given by a real-valued symmetric matrix $M = (m_{ij})_{N \times N}$ with zero diagonal $m_{ii} = 0$;
◦ a linear term given by a vector $s \in \mathbb{R}^N$. This term models the interaction of the sites with the boundary data of the spin system.
Explicitly, the Hamiltonian of the system is given by
\[
H(x) := \sum_{i=1}^{N} \psi_i(x_i) + \frac{1}{2} \sum_{i,j=1}^{N} m_{ij} x_i x_j + \sum_{i=1}^{N} s_i x_i.
\tag{2}
\]
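As a concrete illustration of (2), the following minimal sketch evaluates the Hamiltonian for a small system. The function name `hamiltonian`, the bounded perturbations $a\cos(t)$, and all numerical values are illustrative assumptions and do not come from the article.

```python
import numpy as np

def hamiltonian(x, M, s, delta_psi):
    """Evaluate H(x) from (2) with psi_i(t) = t^2/2 + delta_psi_i(t).

    M is assumed symmetric with zero diagonal; delta_psi is a list of
    callables, one bounded perturbation per site (hypothetical choices).
    """
    x = np.asarray(x, dtype=float)
    perturbation = np.array([dp(xi) for dp, xi in zip(delta_psi, x)])
    single_site = 0.5 * x**2 + perturbation      # sum_i psi_i(x_i)
    interaction = 0.5 * x @ M @ x                # 1/2 sum_ij m_ij x_i x_j
    linear = s @ x                               # sum_i s_i x_i
    return single_site.sum() + interaction + linear

N = 4
M = 0.1 * (np.ones((N, N)) - np.eye(N))          # all pairs interact, m_ii = 0
s = np.array([0.5, -0.5, 0.0, 1.0])              # boundary data
delta_psi = [lambda t, a=a: a * np.cos(t) for a in (0.1, 0.2, 0.3, 0.4)]
x = np.array([1.0, 0.0, -1.0, 2.0])
H_value = hamiltonian(x, M, s, delta_psi)
```

Each $\delta\psi_i$ here depends on the site through the amplitude `a`, which mirrors the inhomogeneous single-site potentials allowed in this setting.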

Note that in contrast to [9] we do not consider homogeneous single-site potentials ψi = ψ, i ∈ {1, . . . , N }. The reason is that the linear term in the definition of H naturally induces a dependence of the single-site potentials on the site i. Note that |m i j | determines the strength of the interaction between the spin xi and x j . The sign of m i j determines if

LSI for Kawasaki Dynamics with Weak Interaction


the interaction is repulsive or attractive. To avoid phase transition, it is natural to assume that the interaction is small in a certain sense. Our substitute for the mixing condition in the discrete case is:

Definition 1 (Condition of smallness). The interaction matrix $M$ satisfies the smallness condition CS($\varepsilon$) with $\varepsilon > 0$, if for all $x \in \mathbb{R}^N$,
\[
\sum_{i,j=1}^{N} x_i \, |m_{ij}| \, x_j \le \varepsilon \sum_{i=1}^{N} x_i^2.
\]
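Since the matrix $(|m_{ij}|)$ is symmetric, CS($\varepsilon$) holds exactly when its largest eigenvalue is at most $\varepsilon$. The sketch below uses this observation to test the condition numerically. The helper names and the example matrix with polynomially decaying entries are assumptions made for illustration only.

```python
import numpy as np

def smallest_cs_epsilon(M):
    """Largest eigenvalue of the entrywise absolute value |M|.

    For symmetric M, x^T |M| x <= eps |x|^2 holds for all x
    if and only if eps is at least this value.
    """
    abs_M = np.abs(np.asarray(M, dtype=float))
    return float(np.linalg.eigvalsh(abs_M).max())

def satisfies_cs(M, eps):
    """Check the smallness condition CS(eps) for a symmetric matrix M."""
    return eps > 0 and smallest_cs_epsilon(M) <= eps

# Hypothetical infinite-range interaction: m_ij = -a / |i - j|^2 for i != j.
N, a = 50, 0.01
idx = np.arange(N)
dist = np.abs(idx[:, None] - idx[None, :]).astype(float)
M_long = np.zeros((N, N))
M_long[dist > 0] = -a / dist[dist > 0] ** 2
eps_long = smallest_cs_epsilon(M_long)

M_pair = np.array([[0.0, -0.3], [-0.3, 0.0]])
eps_pair = smallest_cs_epsilon(M_pair)
```

The first example interacts over every distance, yet `eps_long` stays below the row-sum bound $a\pi^2/3$ for any system size, which illustrates that CS($\varepsilon$) does not require finite range.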

Later, we will use the condition CS($\varepsilon$) to apply the covariance estimate of Theorem 3. This procedure is similar to the discrete case, where the mixing condition was used to deduce a decay of correlations. Note that the condition CS($\varepsilon$) does not impose finite-range interaction, unlike, for example, the condition used by Yoshida [32]. The grand canonical ensemble $\mu_{gc}$ is a probability measure on $\mathbb{R}^N$ given by
\[
\mu_{gc}(dx) := \frac{1}{Z} \exp(-H(x)) \, dx.
\]

Here and later on, $Z$ denotes a generic normalization constant. The Kawasaki dynamics conserves the mean spin $m = \frac{1}{N} \sum_{i=1}^{N} x_i$ of an initial configuration $x \in \mathbb{R}^N$ (cf. Remark 2). Therefore we want to restrict the system to the $(N-1)$-dimensional hyperplane $X_{N,m}$ defined by
\[
X_{N,m} := \left\{ x \in \mathbb{R}^N \;\middle|\; \frac{1}{N} \sum_{i=1}^{N} x_i = m \right\}.
\tag{3}
\]

We equip $X_{N,m}$ with the standard scalar product and norm induced by $\mathbb{R}^N$, i.e.
\[
\langle x, \tilde{x} \rangle := \sum_{i=1}^{N} x_i \tilde{x}_i \quad \text{and} \quad |x|^2 = \sum_{i=1}^{N} x_i^2.
\]

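The restriction of $\mu_{gc}$ to $X_{N,m}$ is called the canonical ensemble $\mu$, i.e.
\[
\mu(dx) := \frac{1}{Z} \, \mathbb{1}_{\left\{ \frac{1}{N} \sum_{i=1}^{N} x_i = m \right\}}(x) \, \exp(-H(x)) \, \mathcal{H}(dx).
\tag{4}
\]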

Here and later on, $\mathcal{H}$ denotes the Hausdorff measure in the appropriate dimension. Note that the canonical ensemble $\mu$ depends on the system size $N \in \mathbb{N}$ and the mean spin $m \in \mathbb{R}$.

Definition 2 (LSI). Let $X$ be a Euclidean space. A Borel probability measure $\mu$ on $X$ satisfies the LSI with constant $\varrho > 0$ (in short: LSI($\varrho$)), if for all functions $f \ge 0$,
\[
\operatorname{Ent}(f\mu \,|\, \mu) := \int f \log f \, d\mu - \int f \, d\mu \, \log\left( \int f \, d\mu \right) \le \frac{1}{2\varrho} \int \frac{|\nabla f|^2}{f} \, d\mu.
\tag{5}
\]
Here $\nabla$ denotes the gradient determined by the Euclidean structure of $X$.


G. Menz

Remark 1 (Gradient on $X_{N,m}$). If we choose $X = X_{N,m}$ in Definition 2, we can calculate $|\nabla f|^2$ in the following way: Extend the function $f : X_{N,m} \to \mathbb{R}$ to be constant in the direction normal to $X_{N,m}$; then
\[
|\nabla f|^2 = \sum_{i=1}^{N} |\partial_{x_i} f|^2.
\]
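The recipe of Remark 1 can be checked numerically: extending a function constantly along the normal direction $(1, \dots, 1)$ makes its ordinary gradient in $\mathbb{R}^N$ coincide with the tangential part of the gradient. The test function `g`, the point `x`, and the helper names below are assumptions chosen only for this sketch.

```python
import numpy as np

def extend_constant_along_normal(g, m):
    """Extend g restricted to X_{N,m} to all of R^N, constant along (1,...,1)."""
    def F(x):
        x = np.asarray(x, dtype=float)
        return g(x - (x.mean() - m))      # project x back onto X_{N,m}
    return F

def numerical_gradient(F, x, h=1e-6):
    """Central finite-difference gradient of F at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (F(x + e) - F(x - e)) / (2 * h)
    return grad

g = lambda x: np.sum(x**3) + x[0] * x[1]   # hypothetical smooth test function
m = 0.5
x = np.array([1.0, -0.5, 2.0, -0.5])       # a point on X_{4, 0.5}
F = extend_constant_along_normal(g, m)
grad_F = numerical_gradient(F, x)

# Analytic gradient of g in R^N and its projection onto the hyperplane.
ambient = 3 * x**2 + np.array([x[1], x[0], 0.0, 0.0])
tangential = ambient - ambient.mean()
```

The squared length of `grad_F` is then the quantity $|\nabla f|^2$ that enters the right-hand side of (5).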

Now, we state the main result of this article.

Theorem 1. Assume that the Hamiltonian $H$ is given by (2) and that the single-site potentials $\psi_i$ satisfy (1) with a constant $c_1 < \infty$ independent of the system size $N \in \mathbb{N}$, the mean spin $m \in \mathbb{R}$, and the boundary data $s \in \mathbb{R}^N$. Then there exist $\varepsilon > 0$ and $\varrho > 0$ depending only on $c_1$ such that: If the interaction matrix $M$ satisfies CS($\varepsilon$), then the canonical ensemble $\mu$ satisfies LSI($\varrho$) independent of $N$, $m$, and $s$.

In the next remark we explain in which sense the scaling behavior of Theorem 1 is optimal in the system size.

Remark 2 (From Glauber to Kawasaki). The bound on the right-hand side of (5) is given in terms of the Glauber dynamics in the sense that we have endowed $X_{N,m}$ with the standard Euclidean structure inherited from $\mathbb{R}^N$. By the discrete Poincaré inequality one can recover the bound for the Kawasaki dynamics (cf. [5] or [9, Remark 15]) in the sense that one endows $X_{N,m}$ with the Euclidean structure coming from the discrete $H^{-1}$-norm. More precisely, let us consider a periodic cube of side length $W$ and total number of sites $N$ on the $d$-dimensional lattice. Let $A$ denote the discrete second-order difference operator. For example, in one dimension $A$ is given by the $N \times N$ matrix
\[
A := \left( -\delta_{i,j-1} + 2\delta_{i,j} - \delta_{i,j+1} \right)_{N \times N},
\]
using the convention $\delta_{i,0} = \delta_{i,N}$ and $\delta_{i,N+1} = \delta_{i,1}$. By the discrete Poincaré inequality (cf. [5,9]) there exists a constant $C$ depending only on the dimension $d$ such that
\[
|\nabla f|^2 \le C W^2 \, |\sqrt{A} \nabla f|^2.
\]
It follows that if a measure $\mu$ satisfies LSI($\varrho$), then for all functions $f \ge 0$,
\[
\operatorname{Ent}(f\mu \,|\, \mu) \le \frac{C W^2}{2\varrho} \int \frac{|\sqrt{A} \nabla f|^2}{f} \, d\mu,
\tag{6}
\]
where the gradient on the right-hand side is given in terms of Kawasaki dynamics (see also the next remark). It was shown in [30] that the diffusive scaling behavior of the last inequality in $W$ is optimal.

In the next remark we explain what is meant by Kawasaki dynamics and how the LSI is connected to exponential convergence to equilibrium.

Remark 3 (Convergence to equilibrium). The Kawasaki dynamics on the cube (see the last remark) is given by a stochastic process $X(t) \in \mathbb{R}^N$ satisfying the stochastic differential equation
\[
dX(t) = -A \nabla H(X(t)) \, dt + \sqrt{2A} \, dB(t),
\]


where $B(t)$ denotes a standard Brownian motion on $\mathbb{R}^N$. Because the dynamics conserves the mean spin of the process,
\[
\frac{1}{N} \sum_{i=1}^{N} X_i(t) = \frac{1}{N} \sum_{i=1}^{N} X_i(0) = m,
\]
we can restrict the state space $\mathbb{R}^N$ to the hyperplane $X_{N,m}$. If the process $X(t)$ is distributed as $f_t \mu$, then $f_t$ satisfies the time-evolution
\[
\frac{d}{dt} (f_t \mu) = \nabla \cdot (A \nabla f_t \, \mu).
\]
Using this equation one sees by direct calculation that
\[
\frac{d}{dt} \operatorname{Ent}(f_t \mu \,|\, \mu) = -\frac{1}{2} \int \frac{|\sqrt{A} \nabla f_t|^2}{f_t} \, d\mu.
\]
Hence it follows from Gronwall's Lemma and (6) that if $\mu$ satisfies LSI($\varrho$), then
\[
\operatorname{Ent}(f_t \mu \,|\, \mu) \le \exp\left( -C \varrho W^{-2} t \right) \operatorname{Ent}(f_0 \mu \,|\, \mu),
\]
with a constant $C$ that depends only on the dimension of the lattice. The last inequality yields the exponential convergence of $X(t)$ to the equilibrium state $\mu$ in the sense of entropies. The rate of convergence is characterized by the LSI constant $\varrho$.

There are several criteria for the LSI (cf. Appendix B), but none of them applies to our situation:
◦ The Tensorization Principle [8] for LSI does not apply because of the interaction $M \neq 0$.
◦ The criterion of Bakry & Émery [1] does not apply because the single-site potentials $\psi_i$ are allowed to be non-convex.
◦ The criterion of Holley & Stroock [13] does not help because we want the LSI constant to be independent of the system size $N$.
◦ The criterion of Otto & Reznikoff [22] does not help because of the restriction to the hyperplane $X_{N,m}$.
Therefore new tools are needed. The most common approach to LSI for Kawasaki dynamics is the Lu & Yau martingale method [6,16,18]. Using this method Landim, Panizo & Yau [16] proved Theorem 1 in the special case $M = 0$ for the Kawasaki bound. An adaptation of this approach by Chafaï [6] led to the stronger bound for Glauber dynamics. Providing a new technique, called the two-scale approach, Grunewald, Otto, Westdickenberg (former Reznikoff) & Villani [9] reproduced Theorem 1 for $M = 0$. We will follow their approach, but our setting differs in two aspects: On the one hand we consider inhomogeneous single-site potentials (i.e. $\psi_i$ depends on the site $i$), and on the other hand, and more fundamentally, we allow for interaction $M \neq 0$.
These differences lead to new technical difficulties compared to [9]: ◦ the interaction between blocks is controlled by the covariance estimate of [20]; ◦ the convexification of the coarse-grained Hamiltonian with interaction is attained by a conditioning technique (that artificially reduces the system size) and a nonstandard perturbation argument;


◦ the local Cramér theorem (cf. [9, Prop. 31]) is generalized to Hamiltonians with inhomogeneous single-site potentials and linear terms. The unboundedness of the spins and of the range of interaction also leads to new difficulties compared to the discrete and bounded case (cf. [4,30]): ◦ In the case of finite-range interaction one could use a covariance estimate due to Helffer to deduce exponential decay of covariances (cf. [2,3,12,20] or [17, Prop. 2.1]). The application of the covariance estimate of [20] makes it possible to consider infinite-range interaction (cf. Theorem 3 and the proof of Lemma 3). ◦ The perturbation argument used in the proof of Lemma 7 is a lot easier in the case of bounded spins and finite-range interaction. The proof becomes a lot more delicate in the case of unbounded spins and infinite-range interaction (cf. comments after (38)). Recently, Felix Otto and the author of this article derived in [21] the LSI for noninteracting Hamiltonians given by

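\[
H(x) := \sum_{i=1}^{N} \psi(x_i),
\]
provided the single-site potential $\psi : \mathbb{R} \to \mathbb{R}$ is a bounded perturbation of a strictly convex potential. More precisely, there is a splitting $\psi = \psi_c + \delta\psi$ such that
\[
\psi_c'' \ge \lambda > 0 \quad \text{and} \quad \|\delta\psi\|_{C^1} < \infty.
\tag{7}
\]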

Again, the scaling behavior of the LSI constant is optimal in the system size. It is a natural question if one could extend their result to weak interaction. Our approach allows only bounded perturbations of quadratic Hamiltonians because the two-scale criterion for LSI of [9, Thm. 3] is used, which is restricted to this class of Hamiltonians (cf. Remark 5).
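The diffusive factor $W^2$ in the discrete Poincaré inequality of Remark 2 can be observed numerically in one dimension, where $W = N$. The sketch below builds the periodic second-order difference matrix and computes the best constant on mean-zero vectors, which is the reciprocal of the second-smallest eigenvalue. Function names are assumptions for this illustration, and the explicit value $1/16$ is a bound derived here from $\sin(\pi/N) \ge 2/N$, not a constant taken from the article.

```python
import numpy as np

def periodic_second_difference(N):
    """One-dimensional matrix A = (-delta_{i,j-1} + 2 delta_{i,j} - delta_{i,j+1}), periodic."""
    A = 2.0 * np.eye(N)
    for i in range(N):
        A[i, (i - 1) % N] -= 1.0
        A[i, (i + 1) % N] -= 1.0
    return A

def poincare_constant(N):
    """Best c with |v|^2 <= c * v^T A v for all v whose entries sum to zero."""
    eigenvalues = np.linalg.eigvalsh(periodic_second_difference(N))  # ascending
    return 1.0 / eigenvalues[1]          # eigenvalues[0] is the zero mode

# The ratio c / N^2 stays bounded, reflecting the W^2 scaling in (6).
ratios = {N: poincare_constant(N) / N**2 for N in (8, 16, 32, 64)}

# A has zero row sums, so the drift -A grad H never changes the mean spin.
A = periodic_second_difference(16)
row_sums = A @ np.ones(16)

rng = np.random.default_rng(0)
v = rng.standard_normal(16)
v -= v.mean()                            # tangential vector, like grad f on X_{N,m}
lhs, rhs = v @ v, poincare_constant(16) * (v @ A @ v)
```

For large $N$ the ratio approaches $1/(4\pi^2)$, so the constant grows like $N^2$ and no faster.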

2. Proof of the Main Result

The proof of the main result (Theorem 1) is structured in the following way. In Sec. 2.1 we outline the two-scale approach. The proof of the main result is given directly after we have formulated the two-scale criterion for LSI (see Theorem 2), which is the main tool of the argument. In the remaining part of Chapter 2 we verify the ingredients of the two-scale criterion. The microscopic LSI and the macroscopic LSI are deduced in Sec. 2.2 and Sec. 2.3 respectively. For the proof of the macroscopic LSI we need a generalized version of the local Cramér theorem, which is stated in Theorem 4 of Chapter 3. Because the proof of Theorem 4 is elementary but a bit lengthy, it is given in Appendix A. There is also a small Appendix B containing some basic facts about the LSI. For the rest of Chapter 2 we make the following assumption.

Assumption. Assume that the Hamiltonian $H$ is given by (2) and that the single-site potentials $\psi_i$ satisfy (1) with a constant $c_1 < \infty$ independent of the system size $N \in \mathbb{N}$, the mean spin $m \in \mathbb{R}$, and the boundary data $s \in \mathbb{R}^N$.


Fig. 1. Block decomposition of the spin system

2.1. The two-scale approach. In this section we will explain the two-scale approach, point out the new difficulties arising from the interaction, and explain how they are solved. Our presentation is based on Subsec. 2.1 and 5.1 of [9], which we recommend for better understanding. We decompose the spin system into $L$ blocks each containing $K$ sites. Therefore $N = KL$. The index set of the $l$-th block, $l \in \{1, \dots, L\}$, is given by (cf. Fig. 1)
\[
B(l) := \{ (l-1)K + 1, \dots, lK \}.
\]
The spin values inside the block $B(l)$ are denoted by $x^l := (x_i)_{i \in B(l)}$. Hence a configuration $x \in X_{N,m}$ of the spin system can be written as
\[
x = (x^1, \dots, x^L).
\tag{8}
\]
Note that the block decomposition is arbitrary and has no geometric significance. The coarse-graining operator $P : X_{N,m} \to X_{L,m} =: Y$ assigns to each block its mean spin, i.e.
\[
P(x) := \left( \frac{1}{K} \sum_{i \in B(1)} x_i, \; \dots, \; \frac{1}{K} \sum_{i \in B(L)} x_i \right).
\tag{9}
\]
As in [9] we endow $Y$ with the scalar product
\[
\langle y, z \rangle_Y := \frac{1}{L} \sum_{i=1}^{L} y_i z_i, \quad \text{for } y, z \in Y.
\tag{10}
\]
Let $P^* : Y \to X_{N,m}$ denote the adjoint operator of $P$. More precisely, $P^*$ is given by
\[
P^*(y_1, \dots, y_L) = \frac{1}{N} \big( \underbrace{y_1, \dots, y_1}_{K \text{ times}}, \; \dots, \; \underbrace{y_L, \dots, y_L}_{K \text{ times}} \big).
\]
The orthogonal projection of $X_{N,m}$ on $\ker P$ is given by $\mathrm{Id} - N P^* P$, which can be seen using the identity
\[
P N P^* = \mathrm{Id}_Y.
\tag{11}
\]
Hence we can decompose $x \in X_{N,m}$ into a macroscopic profile and a microscopic fluctuation, i.e.
\[
x = \underbrace{N P^* P x}_{\in (\ker P)^{\perp}} + \underbrace{(\mathrm{Id} - N P^* P) x}_{\in \ker P}.
\tag{12}
\]
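The operators $P$ and $P^*$ and the identities (11) and (12) can be verified with small matrices. In this sketch $P$ is represented as an $L \times N$ matrix, and the adjoint with respect to the scalar product (10) becomes $P^{T}/L$. The block sizes and variable names are assumptions chosen for illustration.

```python
import numpy as np

def coarse_graining_matrix(K, L):
    """Matrix of P from (9): row l averages the K spins of block B(l)."""
    return np.kron(np.eye(L), np.ones((1, K)) / K)

K, L = 3, 4
N = K * L
P = coarse_graining_matrix(K, L)
P_adj = P.T / L              # adjoint w.r.t. <.,.> on R^N and <.,.>_Y = (1/L) sum

rng = np.random.default_rng(0)
x = rng.standard_normal(N)
y = rng.standard_normal(L)

# Adjoint property: <P x, y>_Y = <x, P* y>.
lhs_adjoint = (P @ x) @ y / L
rhs_adjoint = x @ (P_adj @ y)

# Identity (11): P N P* = Id_Y.
identity_11 = P @ (N * P_adj)

# Decomposition (12) into macroscopic profile and microscopic fluctuation.
macroscopic = N * P_adj @ (P @ x)        # block means repeated K times
fluctuation = x - macroscopic            # lies in ker P
```

The vector `macroscopic` is constant on each block, which matches the explicit formula for $P^*$ given above.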

The coarse graining also induces a natural decomposition of measures. Recall that μ denotes the canonical ensemble given by (4) associated to the Hamiltonian H and the


mean spin $m$. Let $\bar{\mu} := P_{\#} \mu$ be the push forward of $\mu$ under $P$ and let $\mu(dx|y)$ denote the conditional measure of $\mu$ given $Px = y$. Then by disintegration,
\[
\mu(dx) = \mu(dx|y) \, \bar{\mu}(dy).
\tag{13}
\]
This equation has to be understood in a weak sense, i.e. for any test function $\xi$,
\[
\int \xi \, d\mu = \int_Y \left( \int_{\{Px = y\}} \xi \, \mu(dx|y) \right) \bar{\mu}(dy).
\]
By the Coarea Formula one can determine the density of $\bar{\mu}(dy)$ as
\[
\bar{\mu}(dy) = \exp(-N \bar{H}(y)) \, dy,
\]
where the coarse-grained Hamiltonian $\bar{H}$ is given by
\[
\bar{H}(y) := -\frac{1}{N} \log \int_{\{Px = y\}} \exp(-H(x)) \, \mathcal{H}(dx).
\tag{14}
\]

The coarse-grained Hamiltonian $\bar{H}(y)$ represents the energy of a macroscopic profile $y$. Overall, we observe the system at two different scales:
◦ the microscopic scale $\mu(dx|y)$ considers all fluctuations of the system around a macroscopic profile $y \in Y$, and
◦ the macroscopic scale $\bar{\mu}(dy)$ considers the macroscopic profiles and neglects all fluctuations.
We will apply the two-scale criterion for LSI (see [9, Thm. 3]) to derive the LSI for the canonical ensemble $\mu$. In our setting the two-scale criterion becomes:

Theorem 2 (Two-scale criterion). Assume that the canonical ensemble $\mu$ given by (4) is decomposed by (13). Additionally, assume that:
(i) There is $\varrho > 0$ such that for all $N$, $m$, $s$, and $y$ the conditional measures $\mu(dx|y)$ satisfy LSI($\varrho$).
(ii) There is $\lambda > 0$ such that for all $N$, $m$, and $s$ the marginal $\bar{\mu}$ satisfies LSI($\lambda N$).
Then $\mu$ satisfies LSI($\hat{\varrho}$) with $\hat{\varrho}$ independent of $N$, $m$, and $s$.

Remark 4. The two-scale criterion in [9] also contains an explicit representation of the LSI constant $\hat{\varrho}$ in terms of $\varrho$, $\lambda$, and a constant $\kappa$, which represents the strength of the coupling between the microscopic and macroscopic scale. However, for our purpose it is just important that $\hat{\varrho}$ is independent of the system size $N$, the mean spin $m$, and the boundary data $s$.

Remark 5. In the Introduction we mentioned the question of generalizing the main result (Theorem 1) to single-site potentials $\psi$ that are perturbations of a strictly convex single-site potential in the sense of (7). Our approach does not cover this case because Theorem 2 (or [9, Thm. 3]) cannot be applied to this type of single-site potential. The reason is that $\kappa$ may be infinite in this case (see the last remark and [21]). In [21] this problem is avoided by an adaptation of the two-scale approach. Instead of one-time coarse-graining of big blocks, one considers iterated coarse-graining of pairs.
Unfortunately, this approach cannot be directly transferred to the interactive case, because the iteration is based on a product structure which is lost due to non-zero interaction.


Proof (Theorem 1). We carry out the coarse-graining procedure with a large but fixed block size $K \ge K_0$, where $K_0$ is determined by Proposition 2 below. Note that $K_0$ is independent of the system size $N$, the mean spin $m$, and the boundary data $s$. The ingredients of the two-scale criterion of Theorem 2, namely the microscopic LSI and the macroscopic LSI, are verified by Proposition 1 and Corollary 1 respectively. Then, Theorem 1 follows directly from an application of Theorem 2.

Now, we will discuss how the ingredients of Theorem 2 are verified. The microscopic LSI follows directly from an application of the Otto & Reznikoff criterion for LSI (see Subsec. 2.2). Difficulties arise in deducing the macroscopic LSI (see Subsec. 2.3). We follow the strategy of [9] and want to show that $\bar{H}$ is uniformly convex if the block size $K$ is large enough and the interaction $\varepsilon$ is small enough. The uniform convexity of $\bar{H}$ would yield the macroscopic LSI by the criterion of Bakry & Émery (see Theorem 7). Due to the interaction between blocks we lose the product structure of $\bar{\mu}$ (cf. [9, (63)]) that was crucial for the argument of [9]. As a consequence, the off-diagonal entries $h_{ln}$, $l \neq n$, of the Hessian of $\bar{H}$ become nontrivial (see (23)). However, applying a new covariance estimate [20] yields sufficient control of $h_{ln}$ in terms of $\varepsilon$ (see Subsec. 2.3.2). The main difficulty of the proof is encountered when checking the positivity of the diagonal elements $h_{ll}$ of the Hessian of $\bar{H}$. It is not possible to transfer the positivity of $h_{ll}$ from the case of $\varepsilon = 0$ to the case of small $\varepsilon$ by a simple perturbation argument. The reason is that due to the loss of the product structure, $h_{ll}$ will depend on all spins of the system. In the case $\varepsilon = 0$ the diagonal elements $h_{ll}$ depend only on the spins of the $l$-th block, which has size $K$. Hence one could not choose $\varepsilon$ independently of the system size $N$, and the LSI constant would depend on $N$.
We avoid this problem by conditioning on all spins except those of a single block (see Subsec. 2.3.3). This procedure artificially reduces the system size to the number $K$ and introduces new boundary data, which is expressed by an additional linear term in the Hamiltonian (cf. proof of Proposition 1). Independently, we observe that for $\varepsilon = 0$ the positivity of $h_{ll}$ for large $K$ is untouched by a linear term (cf. Proposition 3). Therefore we are able to apply a perturbation argument to transfer the positivity of $h_{ll}$ to small $\varepsilon$ depending only on $K$ and $c_1$ and not on the total system size $N$ (see Lemma 6 and Lemma 7).

2.2. Microscopic LSI. In this subsection we will prove the following statement.

Proposition 1 (Microscopic LSI). There is $\varepsilon > 0$ independent of $N$, $m$, $s$, and $y \in Y$ (depending only on the block size $K$ and $c_1$) such that: If $M$ satisfies CS($\varepsilon$), then the conditional measures $\mu(dx|y)$ given by (13) satisfy LSI($\varrho$) with $\varrho > 0$ independent of $N$, $m$, $s$, and $y$ (depending only on $K$, $c_1$, and $\varepsilon$).

Proof (Proposition 1). The statement follows from an application of the criterion for LSI of Otto & Reznikoff (see Theorem 8 in the Appendix or Theorem 1 in [22]). Let us consider an arbitrary but fixed macroscopic profile $y = (y_1, \dots, y_L) \in Y$. We start with decomposing the Euclidean space $\{Px = y\}$ into a finite product of Euclidean spaces. It follows from the definition (9) of the coarse-graining operator $P$ that
\[
\{ x \in \mathbb{R}^N \mid Px = y \} = X_{K,y_1} \times \cdots \times X_{K,y_L},
\]
where the hyperplane $X_{K,y_l}$, $1 \le l \le L$, given by (3) is identified with
\[
X_{K,y_l} = \left\{ x^l \in \mathbb{R}^{B(l)} \;\middle|\; \frac{1}{K} \sum_{i \in B(l)} x_i = y_l \right\}.
\]


Fig. 2. Conditioning on spins outside of the block B(l)

Hence we can decompose a configuration $x \in \{Px = y\}$ into $x = (x^1, \dots, x^L)$ with $x^l = (x_i)_{i \in B(l)} \in X_{K,y_l}$. For convenience, the spin values outside the block $B(l)$ (or rather $X_{K,y_l}$) are denoted by $\bar{x}^l := (x_i)_{i \notin B(l)}$. Disintegration of the microscopic measure $\mu(dx|y)$ with respect to $x^l$ for a fixed $1 \le l \le L$ yields
\[
\mu(dx|y) = \mu(dx^l \,|\, \bar{x}^l, y) \, \bar{\mu}(d\bar{x}^l \,|\, y),
\]
where $\mu(dx^l \,|\, \bar{x}^l, y)$ and $\bar{\mu}(d\bar{x}^l \,|\, y)$ denote the conditional measure and the corresponding marginal respectively (cf. Fig. 2). More precisely, we have for all test functions $\xi : \{Px = y\} \to \mathbb{R}$,
\[
\int \xi(x) \, \mu(dx|y) = \int \left( \int \xi(x^l, \bar{x}^l) \, \mu(dx^l \,|\, \bar{x}^l, y) \right) \bar{\mu}(d\bar{x}^l \,|\, y).
\tag{15}
\]
For the first requirement of Theorem 8 we have to show that on $X_{K,y_l}$, $1 \le l \le L$, the conditional measures $\mu(dx^l \,|\, \bar{x}^l, y)$ satisfy LSI($\tilde{\varrho}$) with a constant $\tilde{\varrho} > 0$ independent of $N$, $m$, $s$, $y$, $l$, and $\bar{x}^l$. For this purpose let us have a closer look at the Hamiltonian of the conditional measure $\mu(dx^l \,|\, \bar{x}^l, y)$: For an arbitrary vector $s^* \in \mathbb{R}^{B(l)}$ we define the Hamiltonian $H(x^l \,|\, M, s^*)$ by
\[
H(x^l \,|\, M, s^*) = \sum_{i \in B(l)} \psi_i(x_i) + \frac{1}{2} \sum_{i,j \in B(l)} m_{ij} x_i x_j + \sum_{i \in B(l)} s_i^* x_i.
\]
The definition (2) of the Hamiltonian $H$ yields
\begin{align*}
H(x) &= \sum_{i=1}^{N} \psi_i(x_i) + \frac{1}{2} \sum_{i,j=1}^{N} m_{ij} x_i x_j + \sum_{i=1}^{N} s_i x_i \\
&= \sum_{i \in B(l)} \psi_i(x_i) + \frac{1}{2} \sum_{i,j \in B(l)} m_{ij} x_i x_j + \sum_{i \in B(l)} \Big( s_i + \sum_{j \notin B(l)} m_{ij} x_j \Big) x_i \\
&\quad + \sum_{i \notin B(l)} \psi_i(x_i) + \frac{1}{2} \sum_{i,j \notin B(l)} m_{ij} x_i x_j + \sum_{i \notin B(l)} s_i x_i \\
&= H(x^l \,|\, M, s_c) + \sum_{i \notin B(l)} \psi_i(x_i) + \frac{1}{2} \sum_{i,j \notin B(l)} m_{ij} x_i x_j + \sum_{i \notin B(l)} s_i x_i,
\end{align*}


where the vector $s_c = s_c(s, M, \bar{x}^l) \in \mathbb{R}^{B(l)}$ is defined by
\[
s_{c,i} := s_i + \sum_{j \notin B(l)} m_{ij} x_j \quad \text{for } i \in B(l).
\]
Because one can cancel all terms that are independent of $x^l = (x_i)_{i \in B(l)}$ with terms of the normalization constant $Z$, the effective Hamiltonian of the conditional measure $\mu(dx^l \,|\, \bar{x}^l, y)$ is given by $H(x^l \,|\, M, s_c)$. More precisely,
\[
\mu(dx^l \,|\, \bar{x}^l, y) = \frac{1}{Z} \, \mathbb{1}_{X_{K,y_l}}(x^l) \, \exp\left( -H(x^l \,|\, M, s_c) \right) \mathcal{H}(dx^l).
\]
Using the assumption (1) on the single-site potentials $\psi_i$ we can write $H(x^l \,|\, M, s_c)$ as the sum
\[
H(x^l \,|\, M, s_c) = H_1(x^l \,|\, M, s_c) + H_2(x^l \,|\, M, s_c),
\]
where $H_1(x^l \,|\, M, s_c)$ and $H_2(x^l \,|\, M, s_c)$ are given by
\begin{align*}
H_1(x^l \,|\, M, s_c) &= \sum_{i \in B(l)} \Big[ \frac{x_i^2}{2} + \Big( s_i + \sum_{j \notin B(l)} m_{ij} x_j \Big) x_i \Big] + \frac{1}{2} \sum_{i,j \in B(l)} m_{ij} x_i x_j, \\
H_2(x^l \,|\, M, s_c) &= \sum_{i \in B(l)} \delta\psi_i(x_i).
\end{align*}
Using CS($\varepsilon$) it follows that
\[
\Big| \sum_{i,j \in B(l)} m_{ij} x_i x_j \Big| \le \varepsilon \, |x^l|^2.
\]
Hence, if $\varepsilon$ is small enough, then $H_1(x^l \,|\, M, s_c)$ is a uniformly strictly convex function with constant $\lambda \ge \frac{1}{4}$. By the assumption (1) on the functions $\delta\psi_i$ it follows that $H_2(x^l \,|\, M, s_c)$ is a bounded function satisfying
\[
\Big| \sup_{x^l \in X_{K,y_l}} H_2(x^l \,|\, M, s_c) - \inf_{x^l \in X_{K,y_l}} H_2(x^l \,|\, M, s_c) \Big| \le 2 K c_1.
\]
Therefore a combination of the criterion of Bakry & Émery (see Theorem 7) and of the criterion of Holley & Stroock (see Theorem 6) yields that the conditional measures $\mu(dx^l \,|\, \bar{x}^l, y)$ satisfy a uniform LSI with constant
\[
\tilde{\varrho} = \frac{1}{4} \exp(-2 K c_1).
\tag{16}
\]
Note that $\tilde{\varrho}$ is independent of $N$, $m$, $s$, $y$, $l$, and $\bar{x}^l$ (depending only on the block size $K$ and the constant $c_1$ given by (1)). Now, we verify the remaining ingredients of the criterion of Otto & Reznikoff. For $n, m \in \{1, \dots, L\}$, let $M_{nm}$ denote the $K \times K$ matrix given by
\[
M_{nm} = (m_{ij})_{i \in B(n), \, j \in B(m)}.
\tag{17}
\]


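Let $\|M_{nm}\|$ be defined as the operator norm of $M_{nm}$ as a bilinear form, i.e.
\[
\|M_{nm}\| = \max \left\{ \frac{\sum_{i \in B(n), \, j \in B(m)} x_i m_{ij} y_j}{|x| \, |y|} \;\middle|\; x \in \mathbb{R}^{B(n)}, \; y \in \mathbb{R}^{B(m)} \right\}.
\tag{18}
\]
Let the matrix $A = (A_{nm})_{L \times L}$ be defined by the elements
\[
A_{nm} =
\begin{cases}
\tilde{\varrho}, & \text{if } n = m, \\
-\|M_{nm}\|, & \text{if } n \neq m,
\end{cases}
\qquad n, m \in \{1, \dots, L\}.
\tag{19}
\]
We will show that $A$ satisfies, for some $\varrho > 0$ independent of $N$, $m$, $s$, $y$, $l$, and $\bar{x}^l$,
\[
A \ge \varrho \, \mathrm{Id}
\]
in the sense of quadratic forms. For the rest of the proof let $C < \infty$ denote a generic constant that depends only on $K$. Firstly, we will show that
\[
\left( \|M_{nm}\| \right)_{L \times L} \le C \varepsilon \, \mathrm{Id}
\tag{20}
\]
in the sense of quadratic forms. Because of the equivalence of norms in finite dimensional vector spaces we have for $n, m \in \{1, \dots, L\}$,
\[
\|M_{nm}\| \le C \sum_{i \in B(n), \, j \in B(m)} |m_{ij}|.
\]
For any vector $x \in \mathbb{R}^L$ we have
\[
\sum_{n,m=1}^{L} x_n \|M_{nm}\| x_m \le C \sum_{n,m=1}^{L} \; \sum_{i \in B(n), \, j \in B(m)} |x_n| \, |m_{ij}| \, |x_m| \overset{\text{CS}(\varepsilon)}{\le} C \varepsilon \sum_{n=1}^{L} x_n^2.
\]
This inequality already yields (20). Because $\tilde{\varrho}$ depends only on the block size $K$ and $c_1$, we can choose $\varepsilon$ small independent of $N$ and $y$ such that
\begin{align*}
A &= \tilde{\varrho} \, \mathrm{Id} - \left( \|M_{nm}\| \right)_{L \times L} + \operatorname{diag}\left( \|M_{11}\|, \dots, \|M_{LL}\| \right) \\
&\ge \tilde{\varrho} \, \mathrm{Id} - \left( \|M_{nm}\| \right)_{L \times L} \ge (\tilde{\varrho} - C\varepsilon) \, \mathrm{Id} \ge \varrho \, \mathrm{Id},
\tag{21}
\end{align*}
for some $\varrho > 0$ depending only on $K$, $c_1$, and $\varepsilon$. Hence we can apply the criterion of Otto & Reznikoff and the proof is finished.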
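The final step of this proof can be illustrated numerically: compute the block operator norms (18), assemble the matrix $A$ of (19) with $\tilde{\varrho}$ from (16), and check that $A$ is positive definite for a weak interaction. The interaction matrix, the values of `K` and `c1`, and the function names are assumptions made for this sketch.

```python
import numpy as np

def block_norm_matrix(M, K):
    """L x L matrix of operator norms ||M_nm|| of the K x K blocks of M, cf. (18)."""
    L = M.shape[0] // K
    norms = np.zeros((L, L))
    for n in range(L):
        for m in range(L):
            block = M[n * K:(n + 1) * K, m * K:(m + 1) * K]
            # Largest singular value = norm of the block as a bilinear form.
            norms[n, m] = np.linalg.norm(block, 2)
    return norms

def otto_reznikoff_matrix(M, K, rho_tilde):
    """Matrix A from (19): rho_tilde on the diagonal, -||M_nm|| off the diagonal."""
    A = -block_norm_matrix(M, K)
    np.fill_diagonal(A, rho_tilde)
    return A

K, c1 = 2, 0.5                               # hypothetical block size and bound in (1)
rho_tilde = 0.25 * np.exp(-2 * K * c1)       # constant from (16)

N = 8
idx = np.arange(N)
dist = np.abs(idx[:, None] - idx[None, :]).astype(float)
M = np.zeros((N, N))
M[dist > 0] = 0.002 / dist[dist > 0] ** 2    # weak, infinite-range interaction

A = otto_reznikoff_matrix(M, K, rho_tilde)
rho = float(np.linalg.eigvalsh(A).min())     # A >= rho * Id as quadratic forms
```

If the prefactor `0.002` is increased far enough, `rho` becomes negative, which is the regime excluded by choosing $\varepsilon$ small in (21).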


2.3. Macroscopic LSI. In this section we will derive the macroscopic LSI. More precisely, we will prove that $\bar{H}$ becomes uniformly convex for large $K$ and small $\varepsilon$.

Proposition 2. Let $\bar{H}$ denote the coarse-grained Hamiltonian defined by (14) and let $\operatorname{Hess}_Y \bar{H}$ denote the Hessian of $\bar{H}$ w.r.t. the Euclidean structure $\langle \cdot, \cdot \rangle_Y$ on $Y$ given by (10). Then there exists $K_0 \in \mathbb{N}$ depending only on $c_1$ such that: If the block size $K \ge K_0$ and the interaction matrix $M$ satisfies CS($\varepsilon$), then there are constants $\lambda > 0$ and $C < \infty$ independent of $N$, $m$, and $s$ (depending only on $K$ and $c_1$) such that for all $y \in Y$,
\[
\operatorname{Hess}_Y \bar{H}(y) \ge (\lambda - C\varepsilon) \, \mathrm{Id}
\]
in the sense of quadratic forms.

By the definition (14) of $\bar{H}$ we have $\bar{\mu}(dy) = \exp(-N \bar{H}(y)) \, \mathcal{H}(dy)$. Hence the macroscopic LSI is a direct consequence of Proposition 2 and the criterion of Bakry & Émery (see Theorem 7), if we choose $\varepsilon$ small enough. More precisely, we have:

Corollary 1 (Macroscopic LSI). Choose a fixed block size $K \ge K_0$, where $K_0$ is given by Proposition 2. Consider the marginal $\bar{\mu}$ defined by (13). Then there exist $\varepsilon > 0$ and $\lambda > 0$ independent of $N$, $m$, and $s$ (depending only on $K$ and $c_1$) such that: If the interaction matrix $M$ satisfies CS($\varepsilon$), then $\bar{\mu}$ satisfies LSI($\lambda N$).

The proof of Proposition 2 consists of three steps. In the next subsection we will deduce a formula for the elements of $\operatorname{Hess}_Y \bar{H}$. In Subsec. 2.3.2 we will show that the off-diagonal elements of $\operatorname{Hess}_Y \bar{H}$ are small in a certain sense (cf. Lemma 3). In Subsec. 2.3.3 we will show that the diagonal elements of $\operatorname{Hess}_Y \bar{H}$ are uniformly positive for large $K$ and small $\varepsilon$ (cf. Lemma 5).

Proof (Proposition 2). We decompose $\operatorname{Hess}_Y \bar{H}(y)$ into its diagonal matrix and its remainder, i.e.
\begin{align*}
\operatorname{Hess}_Y \bar{H}(y) &= \operatorname{diag}\left( (\operatorname{Hess}_Y \bar{H}(y))_{11}, \dots, (\operatorname{Hess}_Y \bar{H}(y))_{LL} \right) \\
&\quad + \left[ \operatorname{Hess}_Y \bar{H}(y) - \operatorname{diag}\left( (\operatorname{Hess}_Y \bar{H}(y))_{11}, \dots, (\operatorname{Hess}_Y \bar{H}(y))_{LL} \right) \right].
\end{align*}

A combination of Lemma 3 and Lemma 5 from below yields the statement.

2.3.1. Formula for the elements of the Hessian of $\bar{H}$. Before we derive the formula for the elements of the Hessian of $\bar{H}$, we state an alternative representation of the coarse-grained Hamiltonian $\bar{H}$.

Lemma 1. Assume that the Hamiltonian $H$ and the coarse-grained Hamiltonian $\bar{H}$ are given by (2) and (14) respectively. For $x \in \{Px = 0\}$ and $y \in Y$, let $H_M(x, y)$ be defined by
\[
H_M(x, y) := \frac{1}{2} \langle x, (\mathrm{Id} + M) x \rangle + \langle x, M N P^* y \rangle + \langle s, x \rangle + \sum_{i=1}^{N} \delta\psi_i\big( x_i + (N P^* y)_i \big).
\]
Then
\[
\bar{H}(y) = \frac{1}{2} \langle y, (\mathrm{Id} + P M N P^*) y \rangle_Y + \langle P s, y \rangle_Y - \frac{1}{N} \log \int_{\{Px = 0\}} \exp\left( -H_M(x, y) \right) \mathcal{H}(dx),
\tag{22}
\]
where the scalar product $\langle \cdot, \cdot \rangle_Y$ is given by (10).


The last lemma is verified by straightforward calculation. One applies the linear transformation $x \mapsto x - N P^* y$ to the integral in the definition (14) of $\bar{H}(y)$. Additionally, one has to use the fact that by orthogonality $\langle x, N P^* y \rangle = 0$ for any $x \in \ker P$ and $N P^* y \in (\ker P)^{\perp}$ (cf. (12)). From now on we will use the standard notation that for a probability measure $\nu$,
\[
\operatorname{cov}_\nu(f, g) := \int \Big( f - \int f \, d\nu \Big) \Big( g - \int g \, d\nu \Big) d\nu \quad \text{and} \quad \operatorname{var}_\nu(f) := \operatorname{cov}_\nu(f, f).
\]
The following representation of the Hessian of $\bar{H}$ is the basis of our argument for the convexity of the coarse-grained Hamiltonian $\bar{H}$.

Lemma 2. Assume that the Hamiltonian $H$ and the coarse-grained Hamiltonian $\bar{H}$ are given by (2) and (14) respectively. Recall that the conditional measures $\mu(dx|y)$ are defined by (13). For $1 \le l, n \le L$, we have
\begin{align*}
\left( \operatorname{Hess}_Y \bar{H}(y) \right)_{ln} &= \delta_{ln} + \delta_{ln} \frac{1}{K} \int \sum_{i \in B(l)} \delta\psi_i''(x_i) \, \mu(dx|y) + \frac{1}{K} \sum_{i \in B(l), \, j \in B(n)} m_{ij} \\
&\quad - \frac{1}{K} \operatorname{cov}_{\mu(dx|y)} \Bigg( \sum_{j \in B(l)} \Big( \sum_{i=1}^{N} m_{ij} x_i + \delta\psi_j'(x_j) \Big), \; \sum_{j \in B(n)} \Big( \sum_{i=1}^{N} m_{ij} x_i + \delta\psi_j'(x_j) \Big) \Bigg).
\tag{23}
\end{align*}
The last statement is easily deduced by differentiating (22). Additionally, one has to apply the inverse translation $x + N P^* y$ to the occurring integrals, consider the orthogonality of $N P^* y \in (\ker P)^{\perp}$, and apply the fact that covariances are invariant under adding constant functions. Because every step of the proof is very basic, we will omit the details.

2.3.2. Estimation of the off-diagonal elements of the Hessian of $\bar{H}$. In this section we will show that the off-diagonal elements of the Hessian of $\bar{H}$ are controlled by $\varepsilon$. Explicitly, we will prove the following statement.

Lemma 3. If the interaction matrix $M$ satisfies CS($\varepsilon$), then there is a constant $0 \le C < \infty$ independent of $N$, $m$, and $s$ (depending only on the block size $K$ and $c_1$) such that
\[
\operatorname{Hess}_Y \bar{H}(y) - \operatorname{diag}\left( (\operatorname{Hess}_Y \bar{H}(y))_{11}, \dots, (\operatorname{Hess}_Y \bar{H}(y))_{LL} \right) \ge -C\varepsilon \, \mathrm{Id}
\]
in the sense of quadratic forms.

This lemma is not obvious. Considering (23) one has to estimate for example the covariance
\[
\operatorname{cov}_{\mu(dx|y)} \Bigg( \sum_{j \in B(l)} \delta\psi_j'(x_j), \; \sum_{j \in B(n)} \delta\psi_j'(x_j) \Bigg) \quad \text{for } 1 \le l \neq n \le L.
\]

It is not clear how to exploit the control CS(ε) on the last expression. The key observation is that the first function depends only on spins of the block B(l), whereas the second


function depends only on spins of block $B(n)$. One hopes that the covariance decays in the distance of the blocks, if $\varepsilon$ is small enough. In the case of finite-range interaction one could use the covariance estimate due to Helffer (see [12,17]) to deduce exponential decay of covariances (see also [2,3,20]). However, we will use the covariance estimate of [20], which allows us to deal with an infinite range of interaction:

Theorem 3 (Covariance estimate of [20]). Let $d\mu := \frac{1}{Z} \exp(-H(x)) \, dx$ be a probability measure on a direct product of Euclidean spaces $X = X_1 \times \cdots \times X_L$. We assume that
◦ the conditional measures $\mu(dx_l \,|\, x_n \in X_n, \, n \neq l)$ on $X_l$, $l \in \{1, \dots, L\}$, satisfy a uniform SG($\tilde{\varrho}$) (see Definition 3);
◦ the numbers $\kappa_{ln}$, $1 \le l \neq n \le L$, satisfy $|\nabla_l \nabla_n H(x)| \le \kappa_{ln} < \infty$ uniformly in $x \in X$; here $|\cdot|$ denotes the operator norm of a bilinear form;
◦ the symmetric matrix $A = (A_{ln})_{L \times L}$ defined by
\[
A_{ln} =
\begin{cases}
\tilde{\varrho}, & \text{if } l = n, \\
-\kappa_{ln}, & \text{if } l < n,
\end{cases}
\]
is strictly positive definite.
Then for any functions $f$ and $g$,
\[
\operatorname{cov}_\mu(f, g) \le \sum_{l,n=1}^{L} \left( A^{-1} \right)_{ln} \left( \int |\nabla_l f|^2 \, d\mu \right)^{\frac{1}{2}} \left( \int |\nabla_n g|^2 \, d\mu \right)^{\frac{1}{2}}.
\tag{24}
\]

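For the proof of this covariance estimate, we also refer the reader to the dissertation [19] of the author. From the last theorem we deduce the following auxiliary lemma that is the main ingredient of the proof of Lemma 3.

Lemma 4. The following statements hold:
(i) The conditional measures $\mu(dx|y)$ given by (13) satisfy the covariance estimate (24) with the matrix $A$ given by (19).
(ii) Assume that $\tilde{\varrho}$ is given by (16) and that the elements $\|M_{s_1 s_2}\|$ of the $L \times L$ matrix $\left( \|M_{s_1 s_2}\| \right)_{L \times L}$ are given by (18). Then in the sense of quadratic forms:
\begin{align*}
A^{-1} - \operatorname{diag}\left( (A^{-1})_{11}, \dots, (A^{-1})_{LL} \right) &\le \frac{1}{\tilde{\varrho}} \, \frac{\varepsilon}{\tilde{\varrho} - \varepsilon} \, \mathrm{Id}, \tag{25} \\
\left( \|M_{s_1 s_2}\| \right)_{L \times L} A^{-1} \left( \|M_{s_1 s_2}\| \right)_{L \times L} &\le \frac{1}{\tilde{\varrho}} \, \frac{\varepsilon^2}{\tilde{\varrho} - \varepsilon} \, \mathrm{Id}. \tag{26}
\end{align*}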

Proof (Lemma 4). Argument for (i): The LSI($\varrho$) implies the SG($\varrho$) by Lemma 11. Hence, the hypotheses of Theorem 3 are weaker than the hypotheses of the criterion of Otto & Reznikoff (cf. Theorem 8), which were already verified for the conditional measures $\mu(dx|y)$ in the proof of Proposition 1. Thus the statement follows from a direct application of Theorem 3.


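Argument for (ii): Using the Neumann representation of $A^{-1}$ one sees that
\[
\operatorname{diag}\left( (A^{-1})_{11}, \dots, (A^{-1})_{LL} \right) \ge \frac{1}{\tilde{\varrho}} \, \mathrm{Id}
\tag{27}
\]
in the sense of quadratic forms. Because for sufficiently small $\varepsilon$ (cf. (21))
\[
A \ge \tilde{\varrho} \, \mathrm{Id} - \left( \|M_{s_1 s_2}\| \right)_{L \times L} > 0,
\]
it follows that
\[
A^{-1} \le \left( \tilde{\varrho} \, \mathrm{Id} - \left( \|M_{s_1 s_2}\| \right)_{L \times L} \right)^{-1} = \frac{1}{\tilde{\varrho}} \sum_{k=0}^{\infty} \left( \frac{\left( \|M_{s_1 s_2}\| \right)_{L \times L}}{\tilde{\varrho}} \right)^{k}.
\tag{28}
\]
A combination of (27) and (28) yields
\[
A^{-1} - \operatorname{diag}\left( (A^{-1})_{11}, \dots, (A^{-1})_{LL} \right) \le \frac{1}{\tilde{\varrho}} \sum_{k=1}^{\infty} \left( \frac{\left( \|M_{s_1 s_2}\| \right)_{L \times L}}{\tilde{\varrho}} \right)^{k},
\]
which implies the desired estimate (25) by using (20). By (28) we have
\[
\left( \|M_{s_1 s_2}\| \right)_{L \times L} A^{-1} \left( \|M_{s_1 s_2}\| \right)_{L \times L} \le \frac{1}{\tilde{\varrho}} \sum_{k=2}^{\infty} \left( \frac{\left( \|M_{s_1 s_2}\| \right)_{L \times L}}{\tilde{\varrho}} \right)^{k},
\]
which implies the desired estimate (26) by using (20).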
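The Neumann series argument can be checked on a small example. The sketch takes $A = \tilde{\varrho}\,\mathrm{Id} - B$ with a symmetric, entrywise non-negative matrix $B$ with zero diagonal, so that $A$ has the form (19), and compares $A^{-1}$ with the truncated series. The random matrix `B`, the normalization `rho_tilde = 1.0`, and the use of the spectral norm `b` in place of $\varepsilon$ are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 6
rho_tilde = 1.0                                  # illustrative value only

# Off-diagonal block norms ||M_nm||: symmetric, non-negative, zero diagonal.
B = rng.uniform(0.0, 0.05, size=(L, L))
B = 0.5 * (B + B.T)
np.fill_diagonal(B, 0.0)
b = np.linalg.norm(B, 2)                         # plays the role of epsilon

A = rho_tilde * np.eye(L) - B
A_inv = np.linalg.inv(A)

# Neumann representation: A^{-1} = (1/rho) * sum_k (B/rho)^k, truncated at 60 terms.
neumann = sum(np.linalg.matrix_power(B / rho_tilde, k) for k in range(60)) / rho_tilde

# Left-hand side of (25) and the corresponding bound.
off_diag_part = A_inv - np.diag(np.diag(A_inv))
largest_eigenvalue = float(np.linalg.eigvalsh(off_diag_part).max())
bound_25 = b / (rho_tilde * (rho_tilde - b))
```

Every term of the series has non-negative entries, which is why the diagonal of $A^{-1}$ is at least $1/\tilde{\varrho}$, as stated in (27).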

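Now, we can continue to the proof of Lemma 3.

Proof (Lemma 3). Because of (23) we can write
\[
\operatorname{Hess}_Y \bar{H}(y) - \operatorname{diag}\left( (\operatorname{Hess}_Y \bar{H}(y))_{11}, \dots, (\operatorname{Hess}_Y \bar{H}(y))_{LL} \right) = W_1 + W_2,
\]
where the matrix $W_1$ is given by
\[
(W_1)_{ln} =
\begin{cases}
\frac{1}{K} \sum_{i \in B(l), \, j \in B(n)} m_{ij}, & \text{if } 1 \le n \neq l \le L, \\
0, & \text{if } l = n,
\end{cases}
\]
and the elements of the matrix $W_2$ are defined for $1 \le n \neq l \le L$ by
\[
(W_2)_{ln} = -\frac{1}{K} \operatorname{cov}_{\mu(dx|y)} \Bigg( \sum_{j \in B(l)} \Big( \sum_{i=1}^{N} m_{ij} x_i + \delta\psi_j'(x_j) \Big), \; \sum_{j \in B(n)} \Big( \sum_{i=1}^{N} m_{ij} x_i + \delta\psi_j'(x_j) \Big) \Bigg),
\]
and for $l = n$ by $(W_2)_{ll} = 0$. By using CS($\varepsilon$) we can estimate
\[
W_1 \ge -\varepsilon \, \mathrm{Id}
\]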


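in the sense of quadratic forms. The estimation of $W_2$ is a little bit more subtle. By bilinearity of the covariance the matrix $W_2$ can be rewritten as

$$W_2 = W_3 + W_4 + W_5 + W_6,$$

where the elements of the matrices $W_3, \dots, W_6$ are defined for $1 \le l \ne n \le L$ by

$$(W_3)_{ln} = -\frac{1}{K}\operatorname{cov}_{\mu(dx|y)}\Big(\sum_{j\in B(l)}\Big(\sum_{i=1}^{N} m_{ij}x_i\Big),\ \sum_{j\in B(n)}\Big(\sum_{i=1}^{N} m_{ij}x_i\Big)\Big),$$

$$(W_4)_{ln} = -\frac{1}{K}\operatorname{cov}_{\mu(dx|y)}\Big(\sum_{j\in B(l)}\delta\psi_j'(x_j),\ \sum_{j\in B(n)}\delta\psi_j'(x_j)\Big),$$

$$(W_5)_{ln} = -\frac{1}{K}\operatorname{cov}_{\mu(dx|y)}\Big(\sum_{j\in B(l)}\Big(\sum_{i=1}^{N} m_{ij}x_i\Big),\ \sum_{j\in B(n)}\delta\psi_j'(x_j)\Big),$$

$$(W_6)_{ln} = -\frac{1}{K}\operatorname{cov}_{\mu(dx|y)}\Big(\sum_{j\in B(l)}\delta\psi_j'(x_j),\ \sum_{j\in B(n)}\Big(\sum_{i=1}^{N} m_{ij}x_i\Big)\Big),$$

and for $l = n$ by

$$(W_3)_{ll} = 0, \qquad (W_4)_{ll} = 0, \qquad (W_5)_{ll} = 0, \qquad (W_6)_{ll} = 0.$$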

We estimate each matrix separately and start with $W_3$. A simple linear algebra argument outlined in [22, Lemma 9] shows that the elements of the inverse of $A$ are non-negative, i.e. $(A^{-1})_{s_1 s_2} \ge 0$ for all $s_1, s_2 \in \{1, \dots, L\}$. Hence, Lemma 4 (i) and the equivalence of norms in finite dimensional vector spaces yield for $1 \le l \ne n \le L$ the estimate

$$-(W_3)_{ln} \le \sum_{s_1,s_2=1}^{L} (A^{-1})_{s_1 s_2}\Big(\sum_{i\in B(l),\, j\in B(s_1)} m_{ij}^2\Big)^{1/2}\Big(\sum_{i\in B(n),\, j\in B(s_2)} m_{ij}^2\Big)^{1/2} \le C\sum_{s_1,s_2=1}^{L} M_{l s_1}\,(A^{-1})_{s_1 s_2}\,M_{s_2 n},$$

where the matrix $A$ is defined by (19) and $M_{l s_1}$ is defined by (18). Here and later on in this proof, $0 < C < \infty$ denotes a generic constant depending only on $K$ and $c_1$. It follows from the last estimate and (26) that

$$-W_3 \le (M_{s_1 s_2})_{L\times L}\,A^{-1}\,(M_{s_1 s_2})_{L\times L} \le C\varepsilon$$

in the sense of quadratic forms. Let us turn to the estimation of $W_4$. An application of Lemma 4 (i) implies the estimate

$$-(W_4)_{ln} \le (A^{-1})_{ln}\,\max_{i\in\{1,\dots,N\}}\,\max_{x\in\mathbb{R}} |\delta\psi_i''(x)|^2$$

for $1 \le l \ne n \le L$. Hence, (25) yields in the sense of quadratic forms

$$-W_4 \le A^{-1} - \operatorname{diag}\big((A^{-1})_{11}, \dots, (A^{-1})_{LL}\big) \le C\varepsilon.$$


With a similar argument one can estimate the matrices $W_5$ and $W_6$ as $-W_5 - W_6 \le C\varepsilon$, which together with the estimates of $W_3$ and $W_4$ yields $-W_2 \le C\varepsilon$.

2.3.3. Estimation of the diagonal elements of the Hessian of $\bar H$

In this section we will deduce the strict positivity of the diagonal elements of the Hessian of $\bar H$ for sufficiently large block sizes $K$ and sufficiently small interaction $\varepsilon$. More precisely, we will show the following statement.

Lemma 5. There exists $K_0 \in \mathbb{N}$ depending only on $c_1$ such that: If the block size $K \ge K_0$ and the interaction matrix $M$ satisfies CS($\varepsilon$), then there are constants $\lambda > 0$ and $C < \infty$ independent of $N$, $m$, and $s$ (depending only on $K$ and $c_1$) such that for all $1 \le l \le L$ and $y \in Y$,

$$(\operatorname{Hess}_Y \bar H(y))_{ll} \ge \lambda - C\varepsilon.$$

Therefore

$$\operatorname{diag}\big((\operatorname{Hess}_Y \bar H(y))_{11}, \dots, (\operatorname{Hess}_Y \bar H(y))_{LL}\big) \ge (\lambda - C\varepsilon)\,\mathrm{Id}$$

in the sense of quadratic forms.

For the proof of Lemma 5 we use a conditioning technique that allows us to apply a perturbation argument for small $\varepsilon$ independent of $N$, $m$, and $s$. Let us consider an arbitrary but fixed block $B(l)$, $1 \le l \le L$. Recall that the spin values inside the block $B(l)$ are denoted by $x^l := (x_i)_{i\in B(l)}$ and the spin values outside the block $B(l)$ are denoted by $\bar x^l := (x_i)_{i\notin B(l)}$. As in the proof of Proposition 1, disintegration of the measure $\mu(dx|y)$ with respect to $x^l$ yields (cf. Fig. 2)

$$\mu(dx|y) = \mu(dx^l|\bar x^l, y)\,\bar\mu(d\bar x^l|y),$$

where $\mu(dx^l|\bar x^l, y)$ and $\bar\mu(d\bar x^l|y)$ denote the conditional measure and the corresponding marginal respectively (cf. (15)). Recall the definition of $H(x^l|M, s^*)$ for an arbitrary vector $s^* \in \mathbb{R}^{B(l)}$, i.e.

$$H(x^l|M, s^*) := \sum_{i\in B(l)} \psi_i(x_i) + \frac{1}{2}\sum_{i,j\in B(l)} m_{ij}x_i x_j + \sum_{i\in B(l)} s_i^* x_i. \tag{29}$$

In the proof of Proposition 1 we have shown that the conditional measures $\mu(dx^l|\bar x^l, y)$ are given by

$$\mu(dx^l|\bar x^l, y) = \frac{1}{Z}\,\mathbf{1}_{X_{K,y_l}}\exp\big(-H(x^l|M, s_c)\big)\,\mathcal{H}(dx), \tag{30}$$

where the vector $s_c = s_c(M, s) \in \mathbb{R}^{B(l)}$ is defined by

$$s_{c,i} := s_i + \sum_{j\notin B(l)} m_{ij}x_j \quad \text{for } i \in B(l) \tag{31}$$


and the integration space $X_{K,y_l}$ is identified with

$$X_{K,y_l} = \Big\{ x^l \in \mathbb{R}^{B(l)} \ \Big|\ \frac{1}{K}\sum_{i\in B(l)} x_i = y_l \Big\}. \tag{32}$$

We introduce the coarse-grained Hamiltonian of $H(x^l|M, s^*)$ as usual, i.e. for $y_l \in \mathbb{R}$,

$$\bar H(y_l|M, s^*) := -\frac{1}{K}\log\int_{X_{K,y_l}} \exp\big(-H(x^l|M, s^*)\big)\,\mathcal{H}(dx^l). \tag{33}$$

The next lemma shows that uniform positivity of $\frac{d^2}{dy_l^2}\bar H(y_l|M, s^*)$ yields uniform positivity of $(\operatorname{Hess}_Y \bar H(y))_{ll}$ for small $\varepsilon$. This observation is one of the main insights in order to apply a perturbation argument for small $\varepsilon$ independent of the system size $N$. The advantage of $\bar H(y_l|M, s^*)$ over $\bar H(y)$ is that in (33) one integrates only over sites of the block $B(l)$, whereas in definition (14) of the coarse-grained Hamiltonian $\bar H(y)$ one integrates over all sites of the spin system.

Lemma 6. Assume that the vector $s_c$ and the Hamiltonian $H(x^l|M, s_c)$ are given by (31) and (29) respectively. Then: If the interaction matrix $M$ satisfies CS($\varepsilon$), then for all $1 \le l \le L$ and $y \in Y$,

$$(\operatorname{Hess}_Y \bar H(y))_{ll} \ge \int \frac{d^2}{dy_l^2}\bar H(y_l|M, s_c)\,\bar\mu(d\bar x^l|y) - C\varepsilon,$$

where the constant $C < \infty$ is independent of $N$, $m$, and $s$ (depending only on the block size $K$ and $c_1$).

The proof of Lemma 6 consists of two steps. In the first step we show that the disintegration (15) yields the identity

$$(\operatorname{Hess}_Y \bar H(y))_{ll} = \int \frac{d^2}{dy_l^2}\bar H(y_l|M, s_c)\,\bar\mu(d\bar x^l|y) - \frac{1}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\Big(\int \sum_{j\in B(l)}\Big(\sum_{i=1}^{N} m_{ij}x_i + \delta\psi_j'(x_j)\Big)\mu(dx^l|\bar x^l, y)\Big). \tag{34}$$

In the second step we show that the variance term on the right hand side can be estimated by using the covariance estimate of Theorem 3 as

$$\frac{1}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\Big(\int \sum_{j\in B(l)}\Big(\sum_{i=1}^{N} m_{ij}x_i + \delta\psi_j'(x_j)\Big)\mu(dx^l|\bar x^l, y)\Big) \le C\varepsilon. \tag{35}$$

We will state the full proof of Lemma 6 below. The next lemma provides the last remaining ingredient of the proof of Lemma 5, which is the uniform positivity of $\frac{d^2}{dy_l^2}\bar H(y_l|M, s^*)$.


Lemma 7. There is $K_0 \in \mathbb{N}$ such that: If the block size $K \ge K_0$ and the interaction matrix $M$ satisfies CS($\varepsilon$), then there are constants $\lambda > 0$ and $C < \infty$ independent of $N$, $m$, and $s$ (depending only on $K$ and $c_1$) such that for all $1 \le l \le L$, $y_l \in \mathbb{R}$, and $s^* \in \mathbb{R}^{B(l)}$,

$$\frac{d^2}{dy_l^2}\bar H(y_l|M, s^*) \ge \lambda - C\varepsilon. \tag{36}$$

For the proof of Lemma 7 we apply the following strategy. If the block size $K$ is large enough, the generalized local Cramér theorem (cf. Proposition 3 and Theorem 4) yields

$$\frac{d^2}{dy_l^2}\bar H(y_l|0, \tilde s) \ge \lambda > 0 \tag{37}$$

for all $y_l \in \mathbb{R}$ and $\tilde s \in \mathbb{R}^{B(l)}$. We want to derive (36) from (37) by a perturbation argument. More precisely, we will show that for a specific choice of $\tilde s = \tilde s(s^*) \in \mathbb{R}^{B(l)}$ (cf. (50)),

$$\Big|\frac{d^2}{dy_l^2}\bar H(y_l|M, s^*) - \frac{d^2}{dy_l^2}\bar H(y_l|0, \tilde s)\Big| \le C\varepsilon \tag{38}$$

holds. The constant $C < \infty$ just depends on $K$ and $c_1$. For the proof of Lemma 5 it is crucial that the last inequality holds uniformly in $s^* \in \mathbb{R}^{B(l)}$ and $y_l$. Because we consider unbounded spins with quadratic interaction, this is difficult and leads to the specific choice of $\tilde s = \tilde s(s^*)$. It would be a lot easier to derive (38) for bounded spin-values with finite-range interaction. In this case one could also deduce the estimate (38) choosing $\tilde s = 0$. Then, the standard version of the local Cramér theorem [9, Prop. 31] would be sufficient for the perturbation argument at least for homogeneous single-site potentials $\psi_i = \psi$. The reason is that [9, Prop. 31] yields in this case

$$\frac{d^2}{dy_l^2}\bar H(y_l|0, 0) \ge \lambda > 0.$$

We will state the full proof of Lemma 7 below.

Proof (Lemma 5). The desired statement follows directly from a combination of Lemma 6 and Lemma 7.

Proof (Lemma 6). Let us deduce the identity (34). Recall that by Lemma 2 we have

$$(\operatorname{Hess}_Y \bar H(y))_{ll} = 1 + \frac{1}{K}\sum_{i,j\in B(l)} m_{ij} + \frac{1}{K}\int\sum_{j\in B(l)}\delta\psi_j''(x_j)\,\mu(dx|y) - \frac{1}{K}\operatorname{var}_{\mu(dx|y)}\Big(\sum_{j\in B(l)}\Big(\sum_{i=1}^{N} m_{ij}x_i + \delta\psi_j'(x_j)\Big)\Big).$$


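The disintegration rule (15) and the additive property of variances yield the identity

$$\begin{aligned}
(\operatorname{Hess}_Y \bar H(y))_{ll}
&= \int\Big[\int\Big(1 + \frac{1}{K}\sum_{i,j\in B(l)} m_{ij} + \frac{1}{K}\sum_{j\in B(l)}\delta\psi_j''(x_j)\Big)\mu(dx^l|\bar x^l, y)\\
&\qquad\quad - \frac{1}{K}\operatorname{var}_{\mu(dx^l|\bar x^l, y)}\Big(\sum_{j\in B(l)}\Big(\sum_{i=1}^{N} m_{ij}x_i + \delta\psi_j'(x_j)\Big)\Big)\Big]\,\bar\mu(d\bar x^l|y)\\
&\quad - \frac{1}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\Big(\int\Big[\sum_{j\in B(l)}\Big(\sum_{i=1}^{N} m_{ij}x_i + \delta\psi_j'(x_j)\Big)\Big]\mu(dx^l|\bar x^l, y)\Big).
\end{aligned}$$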
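The "additive property of variances" invoked here is the law of total variance: the variance under the full measure equals the averaged conditional variance plus the variance of the conditional mean. A minimal sketch on a discrete two-level example follows; the joint weights, the observable `f` and all names are illustrative assumptions, not objects from the paper.

```python
# Law of total variance on a small discrete example:
# var(f) = E[ var(f | xbar) ] + var( E[f | xbar] ).
# The joint weights and the observable f are illustrative only.

joint = {  # (xbar, xl) -> probability
    (0, -1.0): 0.1, (0, 2.0): 0.3,
    (1, 0.5): 0.4, (1, 3.0): 0.2,
}

def f(xbar, xl):
    """An arbitrary observable of the inner variable xl and the outer variable xbar."""
    return xl + 0.5 * xbar

def mean(weighted):
    """Weighted mean of a list of (weight, value) pairs; weights need not be normalised."""
    total = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total

def variance(weighted):
    """Weighted variance of a list of (weight, value) pairs."""
    mu = mean(weighted)
    return mean([(w, (v - mu) ** 2) for w, v in weighted])

total_var = variance([(p, f(xb, xl)) for (xb, xl), p in joint.items()])

# Marginal of the outer variable (the analogue of the marginal measure).
marginal = {}
for (xb, _), p in joint.items():
    marginal[xb] = marginal.get(xb, 0.0) + p

cond_means, cond_vars = [], []
for xb, pb in marginal.items():
    fibre = [(p, f(b, xl)) for (b, xl), p in joint.items() if b == xb]
    cond_means.append((pb, mean(fibre)))      # conditional mean on the fibre
    cond_vars.append((pb, variance(fibre)))   # conditional variance on the fibre

decomposed = mean(cond_vars) + variance(cond_means)
```

Here `total_var` and `decomposed` agree up to rounding, which mirrors how the variance with respect to $\mu(dx|y)$ splits into an inner and an outer part under the disintegration (15).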

Note that the Hamiltonian $H(x^l|M, s^*)$ defined by (29) has the same structure as the Hamiltonian $H(x)$ given by (2). Therefore an application of Lemma 2 yields that

$$\frac{d^2}{dy_l^2}\bar H(y_l|M, s_c) = 1 + \frac{1}{K}\sum_{i,j\in B(l)} m_{ij} + \frac{1}{K}\int\sum_{j\in B(l)}\delta\psi_j''(x_j)\,\mu(dx^l|\bar x^l, y) - \frac{1}{K}\operatorname{var}_{\mu(dx^l|\bar x^l, y)}\Big(\sum_{j\in B(l)}\Big(\Big(\sum_{i\in B(l)} m_{ij}x_i\Big) + \delta\psi_j'(x_j)\Big)\Big). \tag{39}$$

The desired identity (34) follows from the last two equations and the fact that adding constants does not change variances. It remains to derive the estimate (35) of the variance term of the right hand side of (34). By Young's inequality,

$$\begin{aligned}
\frac{1}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\Big(\int\Big[\sum_{j\in B(l)}\Big(\sum_{i=1}^{N} m_{ij}x_i + \delta\psi_j'(x_j)\Big)\Big]\mu(dx^l|\bar x^l, y)\Big)
&\le \frac{2}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\Big(\int\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i\,\mu(dx^l|\bar x^l, y)\Big)\\
&\quad + \frac{2}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\Big(\int\sum_{j\in B(l)}\delta\psi_j'(x_j)\,\mu(dx^l|\bar x^l, y)\Big).
\end{aligned}\tag{40}$$

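Let us consider the first term of the right hand side of (40). By the disintegration rule (15) we have for any function $\xi(\bar x^l)$,

$$\int \xi(\bar x^l)\,\bar\mu(d\bar x^l|y) = \int \xi(\bar x^l)\underbrace{\int 1\,\mu(dx^l|\bar x^l, y)}_{=1}\,\bar\mu(d\bar x^l|y) = \int \xi(\bar x^l)\,\mu(dx|y).$$

It follows that

$$\frac{2}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\big(\xi(\bar x^l)\big) = \frac{2}{K}\operatorname{var}_{\mu(dx|y)}\big(\xi(\bar x^l)\big).$$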


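Therefore an application of the covariance estimate of Theorem 3 to the measure $\mu(dx|y)$ yields

$$\begin{aligned}
\frac{2}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\Big(\int\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i\,\mu(dx^l|\bar x^l, y)\Big)
\le \frac{2}{K}\sum_{s_1,s_2=1}^{L}(A^{-1})_{s_1 s_2}
&\times\Big(\int\sum_{k\in B(s_1)}\Big|\frac{d}{dx_k}\int\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i\,\mu(dx^l|\bar x^l, y)\Big|^2\mu(dx|y)\Big)^{1/2}\\
&\times\Big(\int\sum_{k\in B(s_2)}\Big|\frac{d}{dx_k}\int\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i\,\mu(dx^l|\bar x^l, y)\Big|^2\mu(dx|y)\Big)^{1/2}.
\end{aligned}\tag{41}$$

It follows from the definition $x^l = (x_k)_{k\in B(l)}$ that for $k \in B(l)$,

$$\frac{d}{dx_k}\Big(\int\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i\,\mu(dx^l|\bar x^l, y)\Big) = 0. \tag{42}$$

Using the definition (29) of $H(x^l|M, s_c)$, direct calculation shows that

$$\frac{d}{dx_k}\int\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i\,\mu(dx^l|\bar x^l, y) = \sum_{j\in B(l)} m_{kj} - \operatorname{cov}_{\mu(dx^l|\bar x^l, y)}\Big(\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i,\ \frac{d}{dx_k}H(x^l|M, s_c)\Big)$$

for $k \notin B(l)$. From now on, let $C < \infty$ denote a generic constant depending only on $K$ and $c_1$. Because $\mu(dx^l|\bar x^l, y)$ satisfies LSI($\tilde\varrho$) with $\tilde\varrho > 0$ depending only on $K$ and $c_1$ (cf. proof of Proposition 1), an application of Lemma 11 and the equivalence of norms in finite-dimensional vector spaces yields

$$\begin{aligned}
\Big|\frac{d}{dx_k}\int\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i\,\mu(dx^l|\bar x^l, y)\Big|
&\le C\Big(\sum_{j\in B(l)} m_{kj}^2\Big)^{1/2} + \frac{1}{\tilde\varrho}\underbrace{\Big(\sum_{i,j\in B(l)} m_{ij}^2\Big)^{1/2}}_{\le C M_{ll}}\Big(\sum_{j\in B(l)} m_{kj}^2\Big)^{1/2}\\
&\overset{(20)}{\le}\Big(C + \frac{C}{\tilde\varrho}\,\varepsilon\Big)\Big(\sum_{j\in B(l)} m_{kj}^2\Big)^{1/2}.
\end{aligned}\tag{43}$$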


A combination of the estimates (41), (42) and (43) yields the estimate of the first term on the right-hand side of (40). More precisely,

$$\begin{aligned}
\frac{2}{K}\operatorname{var}_{\bar\mu(d\bar x^l|y)}\Big(\int\sum_{j\in B(l)}\sum_{i=1}^{N} m_{ij}x_i\,\mu(dx^l|\bar x^l, y)\Big)
&\le C\sum_{s_1,s_2=1}^{L}(A^{-1})_{s_1 s_2}\Big(\sum_{i\in B(s_1),\, j\in B(l)} m_{ij}^2\Big)^{1/2}\Big(\sum_{i\in B(s_2),\, j\in B(l)} m_{ij}^2\Big)^{1/2}\\
&\le C\sum_{s_1,s_2=1}^{L} M_{l s_1}\,(A^{-1})_{s_1 s_2}\,M_{s_2 l} \overset{(26)}{\le} C\varepsilon.
\end{aligned}$$

The second term on the right hand side of (40) can be estimated with the same argument as we used for the first term. The only different ingredient is the estimation of

$$\Big|\frac{d}{dx_k}\int\sum_{j\in B(l)}\delta\psi_j'(x_j)\,\mu(dx^l|\bar x^l, y)\Big| = \Big|\operatorname{cov}_{\mu(dx^l|\bar x^l, y)}\Big(\sum_{j\in B(l)}\delta\psi_j'(x_j),\ \sum_{s\in B(l)} m_{ks}x_s\Big)\Big| \le \frac{C}{\tilde\varrho}\Big(\sum_{j\in B(l)} m_{kj}^2\Big)^{1/2},$$

where we applied the estimate of Lemma 11 and the uniform bound (1) of the functions $\delta\psi_i$.

Proof (Lemma 7). Because the estimate (37) follows directly from the generalized local Cramér theorem (cf. Proposition 3 and Theorem 4), it is only left to deduce (38). Let $\nu(dx^l|M, s^*)$ denote the Gibbs measure on $X_{K,y_l}$ (see (32)) associated to the Hamiltonian $H(x^l|M, s^*)$, i.e.

$$\nu(dx^l|M, s^*) = \frac{1}{Z}\,\mathbf{1}_{X_{K,y_l}}\exp\big(-H(x^l|M, s^*)\big)\,\mathcal{H}(dx^l).$$

The same reasoning as for (39) yields that

$$\frac{d^2}{dy_l^2}\bar H(y_l|M, s^*) = 1 + \frac{1}{K}\sum_{i\in B(l),\, j\in B(l)} m_{ij} + \frac{1}{K}\int\sum_{j\in B(l)}\delta\psi_j''(x_j)\,\nu(dx^l|M, s^*) - \frac{1}{K}\operatorname{var}_{\nu(dx^l|M, s^*)}\Big(\sum_{j\in B(l)}\Big(\Big(\sum_{i\in B(l)} m_{ij}x_i\Big) + \delta\psi_j'(x_j)\Big)\Big).$$

An application of this formula to $\bar H(y_l|0, \tilde s)$ with arbitrary $\tilde s \in \mathbb{R}^{B(l)}$ yields

$$\frac{d^2}{dy_l^2}\bar H(y_l|0, \tilde s) = 1 + \frac{1}{K}\int\sum_{j\in B(l)}\delta\psi_j''(x_j)\,\nu(dx^l|0, \tilde s) - \frac{1}{K}\operatorname{var}_{\nu(dx^l|0, \tilde s)}\Big(\sum_{j\in B(l)}\delta\psi_j'(x_j)\Big).$$


It follows from the last two equations and the bilinearity of the covariance that

$$\Big|\frac{d^2}{dy_l^2}\bar H(y_l|M, s^*) - \frac{d^2}{dy_l^2}\bar H(y_l|0, \tilde s)\Big| \le T_1 + T_2 + T_3 + T_4 + T_5, \tag{44}$$

where the terms $T_1$, $T_2$, and $T_3$ are given by

$$T_1 := \frac{1}{K}\Big|\sum_{i,j\in B(l)} m_{ij}\Big|, \qquad T_2 := \frac{1}{K}\operatorname{var}_{\nu(dx^l|M, s^*)}\Big(\sum_{i,j\in B(l)} m_{ij}x_i\Big),$$

$$T_3 := \frac{2}{K}\Big|\operatorname{cov}_{\nu(dx^l|M, s^*)}\Big(\sum_{i,j\in B(l)} m_{ij}x_i,\ \sum_{j\in B(l)}\delta\psi_j'(x_j)\Big)\Big|,$$

and the terms $T_4$ and $T_5$ are given by

$$T_4 := \frac{1}{K}\Big|\int\sum_{j\in B(l)}\delta\psi_j''(x_j)\,\nu(dx^l|M, s^*) - \int\sum_{j\in B(l)}\delta\psi_j''(x_j)\,\nu(dx^l|0, \tilde s)\Big|,$$

$$T_5 := \frac{1}{K}\Big|\operatorname{var}_{\nu(dx^l|M, s^*)}\Big(\sum_{j\in B(l)}\delta\psi_j'(x_j)\Big) - \operatorname{var}_{\nu(dx^l|0, \tilde s)}\Big(\sum_{j\in B(l)}\delta\psi_j'(x_j)\Big)\Big|.$$

Note that the measure $\nu(dx^l|M, s^*)$ has the same structure as the measure $\mu(dx^l|\bar x^l, y)$. Therefore it follows by the same argument as in the proof of Proposition 1 that the measure $\nu(dx^l|M, s^*)$ satisfies LSI($\tilde\varrho$) with $\tilde\varrho > 0$ depending only on $K$ and $c_1$. It is easy to deduce by using CS($\varepsilon$) and the basic covariance estimate of Lemma 11 that

$$T_1 + T_2 + T_3 \le C\varepsilon$$

for a constant $C < \infty$ depending only on $K$ and $c_1$. The interesting part is the estimation of $T_4$ and $T_5$, for which the right choice of $\tilde s = \tilde s(s^*) \in \mathbb{R}^{B(l)}$ plays an important role. Therefore let us motivate how to choose $\tilde s = \tilde s(s^*)$ for a given vector $s^* \in \mathbb{R}^{B(l)}$. The structure of $T_4$ and $T_5$ is given by

$$\Big|\int\xi(x^l)\,\nu(dx^l|M, s^*) - \int\xi(x^l)\,\nu(dx^l|0, \tilde s)\Big|$$

for a bounded function $\xi : \mathbb{R}^{B(l)} \to \mathbb{R}$. We want to estimate the last expression uniformly in the unbounded parameters $y_l \in \mathbb{R}$ and $s^* \in \mathbb{R}^{B(l)}$. Therefore let us take a closer look at the dependence of

$$\int\xi(x^l)\,\nu(dx^l|M, s^*) = \frac{1}{Z}\int_{X_{K,y_l}}\xi(x^l)\exp\big(-H(x^l|M, s^*)\big)\,\mathcal{H}(dx^l) \tag{45}$$

on the parameters $y_l$ and $s^*$. On the block $B(l)$ we define the coarse-graining operator $P_l : \mathbb{R}^{B(l)} \to \mathbb{R}$ by $P_l x^l = \frac{1}{K}\sum_{i\in B(l)} x_i$. Let $P_l^*$ denote the adjoint operator of $P_l$, i.e. for $y_l \in \mathbb{R}$,

$$P_l^*(y_l) := \frac{1}{K}(y_l, \dots, y_l) \in \mathbb{R}^{B(l)}.$$


By using the identity $P_l K P_l^* = \mathrm{Id}_{\mathbb{R}}$ one sees that the orthogonal projection $\Pi$ of $\mathbb{R}^{B(l)}$ on $\ker P_l = X_{K,0}$ is given by

$$\Pi = \mathrm{Id} - K P_l^* P_l. \tag{46}$$
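The operators in this construction are simple enough to write down explicitly. The following sketch, under the assumption of an illustrative block size `K = 4` and with hypothetical function names, implements the coarse-graining operator, its adjoint and the orthogonal projection from (46), so that the identities used in the text can be checked on a sample vector.

```python
# Coarse-graining operator P, its adjoint P*, and the projection Id - K P* P
# onto mean-zero vectors. Block size and sample vector are illustrative.

K = 4

def P(x):
    """Coarse-graining operator: the mean of the block."""
    return sum(x) / len(x)

def P_star(y, k=K):
    """Adjoint of P for the Euclidean inner products: y -> (y/K, ..., y/K)."""
    return [y / k] * k

def proj(x):
    """Orthogonal projection onto ker P, i.e. x minus its block mean in every coordinate."""
    shift = [K * c for c in P_star(P(x))]   # K P* P x = (mean, ..., mean)
    return [a - b for a, b in zip(x, shift)]

x = [1.0, -2.0, 0.5, 4.5]   # block mean equals 1.0
z = proj(x)                 # mean-zero part of x
```

Applying `proj` twice returns the same vector, `P(z)` vanishes, and `x` is recovered as `z` plus the constant vector `K P* P x`, which is the decomposition behind the translation used for (47).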

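Consider the right hand side of (45). The dependence of the integration space $X_{K,y_l}$ on $y_l$ is removed by the translation $x^l \mapsto \tilde z = \Pi x^l$, with $\Pi$ the orthogonal projection from (46), which maps $X_{K,y_l}$ onto $X_{K,0}$ and yields the identity

$$\begin{aligned}
\int\xi(x^l)\,\nu(dx^l|M, s^*) = \frac{1}{Z}\int_{X_{K,0}}&\xi(\tilde z + K P_l^* y_l)\\
&\times\exp\Big(-\frac{1}{2}\langle\tilde z, (\mathrm{Id} + M_{ll})\tilde z\rangle - \langle s^* + M_{ll}K P_l^* y_l, \tilde z\rangle - \sum_{i\in B(l)}\delta\psi_i(\tilde z_i + y_l)\Big)\,\mathcal{H}(d\tilde z),
\end{aligned}\tag{47}$$

where the matrix $M_{ll}$ is given by (17). Deriving the last identity consists of a straightforward calculation, where one has to consider the definition (29) of $H(x^l|M, s^*)$, cancel all terms that are independent of $\tilde z$ with terms of the normalization constant $Z$, and apply the fact that $\langle K P_l^* y_l, \tilde z\rangle = 0$ for $\tilde z \in X_{K,0}$. Note that in (47) only the linear term $\langle s^* + M_{ll}K P_l^* y_l, \tilde z\rangle$ depends on the parameters $y_l$ and $s^*$. The idea is to get rid of this term by a second translation $\tilde z \mapsto \tilde z + v$, which leaves the integration space $X_{K,0}$ invariant. Because $\tilde z \in X_{K,0} = \ker P_l$, we can rewrite the Gaussian part of the Hamiltonian in (47) as

$$\frac{1}{2}\langle\tilde z, (\mathrm{Id} + M_{ll})\tilde z\rangle + \langle s^* + M_{ll}K P_l^* y_l, \tilde z\rangle = \frac{1}{2}\langle\tilde z, (\mathrm{Id} + \Pi M_{ll}\Pi)\tilde z\rangle + \langle\Pi(s^* + M_{ll}K P_l^* y_l), \tilde z\rangle.$$

Because $M$ satisfies CS($\varepsilon$) with $\varepsilon < 1$, the map $(\mathrm{Id} + \Pi M_{ll}\Pi) : X_{K,0} \to X_{K,0}$ is invertible. We define $v$ by

$$v = (\mathrm{Id} + \Pi M_{ll}\Pi)^{-1}\,\Pi(s^* + M_{ll}K P_l^* y_l). \tag{48}$$

Direct calculation using the definition of $v$ yields

$$\frac{1}{2}\langle\tilde z, (\mathrm{Id} + \Pi M_{ll}\Pi)\tilde z\rangle + \langle\Pi(s^* + M_{ll}K P_l^* y_l), \tilde z\rangle = \frac{1}{2}\langle z, (\mathrm{Id} + \Pi M_{ll}\Pi)z\rangle - \langle\Pi(s^* + M_{ll}K P_l^* y_l), v\rangle + \frac{1}{2}\langle v, (\mathrm{Id} + \Pi M_{ll}\Pi)v\rangle.$$

Because $v \in X_{K,0}$, the transformation $\tilde z \mapsto z = \tilde z + v$ leaves the integration space $X_{K,0}$ on the right-hand side of (47) invariant and yields by using the last identity that

$$\int\xi(x^l)\,\nu(dx^l|M, s^*) = \frac{1}{Z}\int_{X_{K,0}}\xi(z + K P_l^* y_l - v)\exp\Big(-\frac{1}{2}\langle z, (\mathrm{Id} + M_{ll})z\rangle - \sum_{i\in B(l)}\delta\psi_i(z_i + y_l - v_i)\Big)\,\mathcal{H}(dz), \tag{49}$$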


where we have canceled the terms that are independent of $z$ with terms of the normalization constant $Z$. Note that we have gained compactness by this representation: The unbounded parameters $y_l$ and $s^*$ only enter (49) as an argument of the bounded functions $\xi$ and $\delta\psi_i$. This observation is crucial for the estimation of $T_4$ and $T_5$. The derivation of (49) reveals that it is natural to choose

$$\tilde s(s^*) = \Pi(s^* + M_{ll}K P_l^* y_l) = \big(\mathrm{Id} - K P_l^* P_l\big)(s^* + M_{ll}K P_l^* y_l), \tag{50}$$

where $\Pi$ is the orthogonal projection from (46) and the matrix $M_{ll}$ is given by (17). The reason is that carrying out the two translations from above yields

$$\int\xi(x^l)\,\nu(dx^l|0, \tilde s) = \frac{1}{Z}\int_{X_{K,0}}\xi(z + K P_l^* y_l - v)\exp\Big(-\frac{1}{2}\langle z, z\rangle - \sum_{i\in B(l)}\delta\psi_i(z_i + y_l - v_i)\Big)\,\mathcal{H}(dz). \tag{51}$$

Note that the right-hand sides of (49) and (51) coincide except for the interaction term $\langle x^l, M_{ll}x^l\rangle$. The latter is very helpful to apply a perturbation argument for the uniform estimation of $T_4$ and $T_5$.

Now, we will estimate $T_4$ and $T_5$. Let us choose $\tilde s = \tilde s(s^*)$ as in (50). For $0 \le \lambda \le 1$ we define the probability measure $\nu_\lambda$ on $X_{K,0}$ (see (32)) by

$$\nu_\lambda(dz) := \frac{1}{Z}\,\mathbf{1}_{X_{K,0}}\exp\Big(-\frac{1}{2}\langle z, (\mathrm{Id} + \lambda M_{ll})z\rangle - \sum_{j\in B(l)}\delta\psi_j(z_j + y_l - v_j)\Big)\,\mathcal{H}(dz),$$

where the vector $v$ is defined by (48). Applying the translation $x^l \mapsto z = \Pi x^l + v$ on the integrals of $T_4$ yields (cf. (49) and (51))

$$\begin{aligned}
T_4 &= \frac{1}{K}\Big|\int\sum_{j\in B(l)}\delta\psi_j''(z_j + y_l - v_j)\,\nu_1(dz) - \int\sum_{j\in B(l)}\delta\psi_j''(z_j + y_l - v_j)\,\nu_0(dz)\Big|\\
&\le \frac{1}{K}\sup_{0\le\lambda\le 1}\Big|\frac{d}{d\lambda}\int\sum_{j\in B(l)}\delta\psi_j''(z_j + y_l - v_j)\,\nu_\lambda(dz)\Big|.
\end{aligned}\tag{52}$$

Because $M$ satisfies CS($\varepsilon$), we may assume w.l.o.g. that

$$-\frac{1}{2}\,\mathrm{Id} \le M_{ll} \le \frac{1}{2}\,\mathrm{Id}. \tag{53}$$

By direct calculation we get that for any $0 \le \lambda \le 1$,

$$\begin{aligned}
\frac{d}{d\lambda}\int\sum_{j\in B(l)}\delta\psi_j''(z_j + y_l - v_j)\,\nu_\lambda(dz)
&= -\frac{1}{2}\operatorname{cov}_{\nu_\lambda(dz)}\Big(\sum_{j\in B(l)}\delta\psi_j''(z_j + y_l - v_j),\ \langle z, M_{ll}z\rangle\Big)\\
&= -\frac{1}{2}\int\Big(\sum_{j\in B(l)}\delta\psi_j''(z_j + y_l - v_j) - \int\sum_{j\in B(l)}\delta\psi_j''(z_j + y_l - v_j)\,\nu_\lambda(dz)\Big)\langle z, M_{ll}z\rangle\,\nu_\lambda(dz).
\end{aligned}$$


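Let $C < \infty$ denote a generic constant depending only on $K$ and $c_1$. From the last identity we can deduce the estimate

$$\begin{aligned}
\Big|\frac{d}{d\lambda}\int\sum_{j\in B(l)}\delta\psi_j''(z_j + y_l - v_j)\,\nu_\lambda(dz)\Big|
&\le K\max_{j\in B(l)}\sup_{x\in\mathbb{R}}|\delta\psi_j''(x)|\int|\langle z, M_{ll}z\rangle|\,\nu_\lambda(dz)\\
&\overset{(20)}{\le} C\varepsilon\,\frac{\int_{X_{K,0}}|z|^2\exp\big(-\frac{1}{2}\langle z, (\mathrm{Id} + \lambda M_{ll})z\rangle - \sum_{j\in B(l)}\delta\psi_j(z_j + y_l - v_j)\big)\,\mathcal{H}(dz)}{\int_{X_{K,0}}\exp\big(-\frac{1}{2}\langle z, (\mathrm{Id} + \lambda M_{ll})z\rangle - \sum_{j\in B(l)}\delta\psi_j(z_j + y_l - v_j)\big)\,\mathcal{H}(dz)}\\
&\overset{(53)}{\le} C\varepsilon\exp\Big(2K\max_{j\in B(l)}\sup_{x}|\delta\psi_j(x)|\Big)\frac{\int_{X_{K,0}}|z|^2\exp\big(-\frac{1}{4}\langle z, z\rangle\big)\,\mathcal{H}(dz)}{\int_{X_{K,0}}\exp\big(-\frac{3}{4}\langle z, z\rangle\big)\,\mathcal{H}(dz)}\\
&\le C\varepsilon.
\end{aligned}\tag{54}$$

A combination of (52) and (54) yields the estimate $T_4 \le C\varepsilon$. The same argument also yields $T_5 \le C\varepsilon$. Compared to the estimation of $T_4$ one has to take a closer look at the term

$$\frac{d}{d\lambda}\operatorname{var}_{\nu_\lambda(dz)}\Big(\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j)\Big) = \frac{d}{d\lambda}\int\Big(\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j) - \int\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j)\,\nu_\lambda(dz)\Big)^2\nu_\lambda(dz).$$

Because

$$\begin{aligned}
\int\Big[\frac{d}{d\lambda}\Big(\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j) &- \int\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j)\,\nu_\lambda(dz)\Big)^2\Big]\nu_\lambda(dz)\\
&= -2\int\Big(\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j) - \int\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j)\,\nu_\lambda(dz)\Big)\nu_\lambda(dz)\\
&\qquad\times\frac{d}{d\lambda}\int\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j)\,\nu_\lambda(dz)\\
&= 0,
\end{aligned}$$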


it follows by direct calculation that

$$\begin{aligned}
\frac{d}{d\lambda}\operatorname{var}_{\nu_\lambda(dz)}\Big(\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j)\Big)
&= \int\Big(\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j) - \int\sum_{j\in B(l)}\delta\psi_j'(z_j + y_l - v_j)\,\nu_\lambda(dz)\Big)^2\frac{d}{d\lambda}\nu_\lambda(dz)\\
&= -\frac{1}{2}\operatorname{cov}_{\nu_\lambda(dz)}\Big(\Big(\sum_{j\in B(l)}\delta\psi_j'(\cdots) - \int\sum_{j\in B(l)}\delta\psi_j'(\cdots)\,\nu_\lambda(dz)\Big)^2,\ \langle z, M_{ll}z\rangle\Big).
\end{aligned}$$

However, the covariance term on the right hand side can be estimated in the same way as in (54). Therefore we have deduced (38) uniformly in $y_l \in \mathbb{R}$ and $s^* \in \mathbb{R}^{B(l)}$, which completes the proof of Lemma 7.

3. The Local Cramér Theorem for Inhomogeneous Single-Site Potentials

The main goal of this section is to deduce a convexification result that is one of the central ingredients for the macroscopic LSI (cf. Proposition 2 and Lemma 7):

Proposition 3. Assume that the Hamiltonian $H : \mathbb{R}^K \to \mathbb{R}$ is given by

$$H(x) := \sum_{j=1}^{K}\Big(\frac{1}{2}x_j^2 + s_j x_j + \delta\psi_j(x_j)\Big) \tag{55}$$

for some arbitrary vector $s \in \mathbb{R}^K$ and some functions $\delta\psi_j : \mathbb{R} \to \mathbb{R}$ satisfying the uniform bound (1), i.e. for all $j \in \{1, \dots, K\}$, $\|\delta\psi_j\|_{C^2} \le c_1 < \infty$. Let $\bar H_K$ denote the coarse-grained Hamiltonian of $H$ associated to coarse-graining the whole system. More precisely, for $m \in \mathbb{R}$,

$$\bar H_K(m) := -\frac{1}{K}\log\int_{\{\frac{1}{K}\sum_{j=1}^{K}x_j = m\}}\exp(-H(x))\,\mathcal{H}(dx). \tag{56}$$

Then there is $K_0$ and $\lambda > 0$ such that for all $K \ge K_0$, $s$, and $m$,

$$\frac{d^2}{dm^2}\bar H_K(m) \ge \lambda.$$

Like the convexification results of [21] and [9, Lemma 29], the statement of Proposition 3 is a direct consequence of a local Cramér theorem, namely:

Theorem 4 (Local Cramér theorem). Assume that the Hamiltonian $H$ is given by (55). Let $\varphi_K(m)$ be defined as the Cramér transform of $H$, namely

$$\varphi_K(m) := \sup_{\sigma\in\mathbb{R}}\Big(\sigma m - \frac{1}{K}\log\int_{\mathbb{R}^K}\exp\Big(-H(x) + \sigma\sum_{j=1}^{K}x_j\Big)dx\Big). \tag{57}$$


Then $\varphi_K$ is strictly convex independently of $s$, $m$, and $K$. Additionally, it holds

$$\|\bar H_K(m) - \varphi_K(m)\|_{C^2} \to 0 \quad \text{as } K \to \infty.$$

The convergence only depends on the constant $c_1$ given by (1).

The main difference between Theorem 4 and the local Cramér theorem of [9, Prop. 31] and [21] is that the single-site potentials $\psi_j$ of the Hamiltonian

$$H(x) := \sum_{j=1}^{K}\psi_j(x_j)$$

are inhomogeneous in the sense that the single-site potentials

$$\psi_j(x_j) = \frac{1}{2}x_j^2 + s_j x_j + \delta\psi_j(x_j)$$

depend on the site $j \in \{1, \dots, K\}$. As usual, the proof of the local Cramér theorem is based on two ingredients. The first one is Cramér's representation of the difference $(\bar H_K(m) - \varphi_K(m))$ (cf. [9, (125)]):

Lemma 8. For $j \in \{1, \dots, K\}$ we consider the one-dimensional probability measure $\mu_j^\sigma$ given by

$$\mu_j^\sigma(dx_j) := \exp\Big(-\varphi_{K,j}^*(\sigma) + \sigma x_j - \frac{1}{2}x_j^2 - s_j x_j - \delta\psi_j(x_j)\Big)dx_j,$$

where

$$\varphi_{K,j}^*(\sigma) := \log\int\exp\Big(\sigma x_j - \frac{1}{2}x_j^2 - s_j x_j - \delta\psi_j(x_j)\Big)dx_j.$$

We introduce the mean $m_j$ and variance $\varsigma_j^2$ of the measure $\mu_j^\sigma$,

$$m_j := \int x_j\,\mu_j^\sigma(dx_j) \quad\text{and}\quad \varsigma_j^2 := \int(x_j - m_j)^2\,\mu_j^\sigma(dx_j).$$

Assume that $X_j$, $j \in \{1, \dots, K\}$, are independent random variables distributed according to $\mu_j^\sigma$. Let $g_{K,m}(\xi)$ denote the Lebesgue density of the distribution of the random variable

$$\frac{1}{\sqrt{K}}\sum_{j=1}^{K}(X_j - m_j).$$

Then

$$g_{K,m}(0) = \exp\big(K\varphi_K(m) - K\bar H_K(m)\big). \tag{58}$$


The second ingredient is a local central limit type theorem for the density $g_{K,m}$. The generalization of the local Cramér theorem by Theorem 4 is not surprising: For the classical central limit theorem it is not important that the random variables $X_j$ are identically distributed. It suffices that the standard deviation $\varsigma_j$ of $X_j$ is uniformly bounded. The latter is guaranteed by the uniform control $\|\delta\psi_j\| \le c_1$ (cf. Lemma 9 below). As a consequence we can proceed with the same strategy as for the classical local Cramér theorem (cf. [9,21]). We just have to pay attention that every step does not rely on the specific form of $\psi_j$ but on the uniform bound of $\varsigma_j$. Because the complete proof of Theorem 4 is elementary but a bit lengthy, we will state it in Appendix A.

Lemma 9. Assume that $\|\delta\psi_j\|_{C^2} \le c_1 < \infty$ uniformly in $j \in \{1, \dots, K\}$. Then there is a constant $0 < c < \infty$ such that for any $\sigma$ and $j$,

$$\frac{1}{c} \le \varsigma_j \le c, \tag{59}$$

where $\varsigma_j$ is defined as in Lemma 8.

We conclude this section with the proof of Lemma 8 and Lemma 9.

Proof (Lemma 8). Because $\varphi_K$ is the Legendre transform of the strictly convex function

$$\varphi_K^*(\sigma) := \frac{1}{K}\log\int_{\mathbb{R}^K}\exp\Big(-H(x) + \sigma\sum_{j=1}^{K}x_j\Big)dx,$$

there exists for every $m \in \mathbb{R}$ a unique $\sigma = \sigma(m) \in \mathbb{R}$ such that

$$\varphi_K(m) = \sigma m - \varphi_K^*(\sigma). \tag{60}$$

It is well-known that $\sigma$ is determined by the equation

$$m = \frac{d}{d\sigma}\varphi_K^*(\sigma). \tag{61}$$

Now, we will show that $\varphi_K^*$ and $m$ can be decomposed according to

$$\varphi_K^*(\sigma) = \frac{1}{K}\sum_{j=1}^{K}\varphi_{K,j}^*(\sigma) \quad\text{and}\quad m = \frac{1}{K}\sum_{j=1}^{K}m_j. \tag{62}$$

Indeed, the decomposition of $\varphi_K^*$ directly follows from definitions. Observe that

$$m_j = \int x_j\,\mu_j^\sigma(dx_j) = \frac{d}{d\sigma}\varphi_{K,j}^*(\sigma).$$

Then, the decomposition of $m$ follows from (61) and the decomposition of $\varphi_K^*$. More precisely,

$$m = \frac{d}{d\sigma}\varphi_K^*(\sigma) = \frac{1}{K}\sum_{j=1}^{K}\frac{d}{d\sigma}\varphi_{K,j}^*(\sigma) = \frac{1}{K}\sum_{j=1}^{K}m_j.$$
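The decomposition (62) together with (61) can be checked numerically for a small inhomogeneous example. The sketch below is only an illustration: the linear terms `s`, the bounded perturbations `0.2*cos(x + j)`, the value of `sigma` and the quadrature grid are all assumptions chosen for the example. It compares the averaged site means with a central finite difference of the averaged log-partition function.

```python
import math

def tilted_stats(sigma, s_j, delta_psi, lo=-30.0, hi=30.0, n=30001):
    """Return (log-partition function, mean) of the one-site measure with density
    proportional to exp(sigma*x - x^2/2 - s_j*x - delta_psi(x)), via a Riemann sum."""
    h = (hi - lo) / (n - 1)
    z = 0.0
    first = 0.0
    for i in range(n):
        x = lo + i * h
        w = math.exp(sigma * x - 0.5 * x * x - s_j * x - delta_psi(x))
        z += w
        first += w * x
    return math.log(z * h), first / z

s = [0.5, -1.0, 2.0]                       # hypothetical linear terms s_j
perturbations = [lambda x, j=j: 0.2 * math.cos(x + j) for j in range(len(s))]
K = len(s)
sigma, eps = 0.7, 1e-4

def phi_star(sig):
    """Averaged log-partition function, i.e. the first identity in (62)."""
    return sum(tilted_stats(sig, s[j], perturbations[j])[0] for j in range(K)) / K

# Second identity in (62): m as the average of the site means m_j ...
m_from_means = sum(tilted_stats(sigma, s[j], perturbations[j])[1] for j in range(K)) / K
# ... and (61): m as the derivative of phi*_K, here by a central difference.
m_from_derivative = (phi_star(sigma + eps) - phi_star(sigma - eps)) / (2.0 * eps)
```

Both values of $m$ agree up to the finite-difference error, independently of how different the site potentials are from each other.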


Now, we will deduce Cramér's representation (58). The density $g_{K,m}(\xi)$ at $\xi = 0$ can be written as

$$g_{K,m}(0) = \int_{\{K^{-1/2}\sum_{j=1}^{K}(x_j - m_j) = 0\}}\exp\Big(\sum_{j=1}^{K}\big(-\varphi_{K,j}^*(\sigma) + \sigma x_j - \psi_j(x_j)\big)\Big)\,\mathcal{H}(dx).$$

By (62) we get

$$g_{K,m}(0) = \int_{X_{K,m}}\exp\Big(-K\varphi_K^*(\sigma) + K\sigma m - \sum_{j=1}^{K}\psi_j(x_j)\Big)\,\mathcal{H}(dx).$$

Using (60) the right-hand side becomes

$$g_{K,m}(0) = \exp\big(K\varphi_K(m)\big)\int_{X_{K,m}}\exp\Big(-\sum_{j=1}^{K}\psi_j(x_j)\Big)\,\mathcal{H}(dx).$$

Applying the definition (56) of $\bar H_K(m)$ yields the desired formula.

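Proof (Lemma 9). Observe that the variance of a one-dimensional Gaussian measure is invariant under adding a linear term to the Hamiltonian, i.e. for any $\tilde\sigma \in \mathbb{R}$,

$$\begin{aligned}
\varsigma^2 &:= \int\Big(x - \frac{\int x\exp(-\frac{x^2}{2})\,dx}{\int\exp(-\frac{x^2}{2})\,dx}\Big)^2\frac{\exp(-\frac{x^2}{2})}{\int\exp(-\frac{x^2}{2})\,dx}\,dx\\
&= \int\Big(x - \frac{\int x\exp(\tilde\sigma x - \frac{x^2}{2})\,dx}{\int\exp(\tilde\sigma x - \frac{x^2}{2})\,dx}\Big)^2\frac{\exp(\tilde\sigma x - \frac{x^2}{2})}{\int\exp(\tilde\sigma x - \frac{x^2}{2})\,dx}\,dx.
\end{aligned}$$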
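Both the tilt invariance and the two-sided bound (59) are easy to observe numerically. The following sketch assumes an illustrative perturbation `c1*sin(2x)` with sup-norm `c1 = 0.3` (only the sup-norm part of the $C^2$ bound enters the variance estimate) and a few sample tilts; the quadrature routine and all names are hypothetical. It computes the variance of the tilted Gaussian and of its bounded perturbation.

```python
import math

def moments(log_density, lo=-30.0, hi=30.0, n=20001):
    """Mean and variance of the probability measure proportional to exp(log_density(x)) dx,
    approximated by a Riemann sum on a uniform grid."""
    h = (hi - lo) / (n - 1)
    xs = [lo + i * h for i in range(n)]
    ws = [math.exp(log_density(x)) for x in xs]
    z = sum(ws)
    mean = sum(w * x for w, x in zip(ws, xs)) / z
    var = sum(w * (x - mean) ** 2 for w, x in zip(ws, xs)) / z
    return mean, var

c1 = 0.3                                   # assumed sup-norm bound on the perturbation

def delta_psi(x):
    """Illustrative bounded perturbation with |delta_psi| <= c1."""
    return c1 * math.sin(2.0 * x)

tilts = [-3.0, 0.0, 2.5]                   # sample values of the linear tilt
# Variance of the tilted standard Gaussian: independent of the tilt.
gauss_vars = [moments(lambda x, t=t: t * x - 0.5 * x * x)[1] for t in tilts]
# Variance after the bounded perturbation: stays within exp(-2*c1) and exp(2*c1).
pert_vars = [moments(lambda x, t=t: t * x - 0.5 * x * x - delta_psi(x))[1] for t in tilts]
```

All entries of `gauss_vars` equal 1 up to quadrature error, while the entries of `pert_vars` stay between `exp(-2*c1)` and `exp(2*c1)`, uniformly in the tilt, as asserted by Lemma 9.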

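Let us consider the upper bound of (59). Because the mean of a probability measure $\nu$ is optimal in the sense that for all $c \in \mathbb{R}$,

$$\begin{aligned}
\int(x - c)^2\,\nu(dx) &= \int x^2\,\nu(dx) - 2c\int x\,\nu(dx) + c^2\\
&\ge \int x^2\,\nu(dx) - \Big(\int x\,\nu(dx)\Big)^2\\
&= \int\Big(x - \int x\,\nu(dx)\Big)^2\nu(dx),
\end{aligned}$$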


we have by using the uniform bound $\|\delta\psi_j\|_{C^2} \le c_1 < \infty$ and $\tilde\sigma = \sigma - s_j$,

$$\begin{aligned}
\varsigma_j^2 &= \int(x_j - m_j)^2\,\frac{\exp(\tilde\sigma x_j - \frac{x_j^2}{2} - \delta\psi_j(x_j))}{\int\exp(\tilde\sigma x_j - \frac{x_j^2}{2} - \delta\psi_j(x_j))\,dx_j}\,dx_j\\
&\le \int\Big(x_j - \frac{\int x_j\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}{\int\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}\Big)^2\frac{\exp(\tilde\sigma x_j - \frac{x_j^2}{2} - \delta\psi_j(x_j))}{\int\exp(\tilde\sigma x_j - \frac{x_j^2}{2} - \delta\psi_j(x_j))\,dx_j}\,dx_j\\
&\le \exp(2c_1)\int\Big(x_j - \frac{\int x_j\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}{\int\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}\Big)^2\frac{\exp(\tilde\sigma x_j - \frac{x_j^2}{2})}{\int\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}\,dx_j\\
&= \exp(2c_1)\,\varsigma^2.
\end{aligned}$$

The lower bound of (59) is deduced by the same type of argument, namely

$$\begin{aligned}
\varsigma_j^2 &\ge \exp(-2c_1)\int(x_j - m_j)^2\,\frac{\exp(\tilde\sigma x_j - \frac{x_j^2}{2})}{\int\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}\,dx_j\\
&\ge \exp(-2c_1)\int\Big(x_j - \frac{\int x_j\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}{\int\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}\Big)^2\frac{\exp(\tilde\sigma x_j - \frac{x_j^2}{2})}{\int\exp(\tilde\sigma x_j - \frac{x_j^2}{2})\,dx_j}\,dx_j\\
&= \exp(-2c_1)\,\varsigma^2.
\end{aligned}$$

Acknowledgement. The author thanks Professor Felix Otto for bringing this topic into his view and for helpful guidance. He thanks Maria Westdickenberg for helpful comments. Finally, he also thanks the Deutsche Forschungsgemeinschaft for financial support through the Gottfried Wilhelm Leibniz program, the Bonn International Graduate School in Mathematics, and the Max Planck Institute for Mathematics in the Sciences in Leipzig.

A. Proof of the Generalized Local Cramér Theorem

In this part of the appendix, we will state the proof of Theorem 4. As in [9] and [21], Theorem 4 follows from a local central limit type result for the density $g_{K,m}$. Even if we closely follow the approach of [21], we cannot directly take over their argument because it only considers the case of homogeneous single-site potentials $\psi_j = \psi$, $j \in \{1, \dots, K\}$. In another aspect, our setting is not as complex as in [21]: We have a uniform control (59) of the standard deviation $\varsigma_j$. Therefore, we can sometimes apply simpler arguments. Because the procedure is more or less standard, some elements of the proof may also be found in [7, Chap. XVI, 14, App. 2, 11 Sec. 3, 16, p. 752 and Sec. 5, 9, App.: Local Cramér theorem].

Convention. For the rest of Appendix A, we assume that the index $j$ is given by some number $j \in \{1, \dots, K\}$. Additionally, we introduce the notation

$$\langle f\rangle_j := \int f(x_j)\,\mu_j^\sigma(dx_j).$$


The definition of $g_{K,m}$ suggests to introduce the shifted variables $\tilde x_j := x_j - m_j$, which yields that the mean of $\tilde x_j$ is normalized, i.e. $\langle\tilde x_j\rangle_j = 0$. The following auxiliary lemma provides all the tools needed for the proof of Theorem 4.

Lemma 10. There is a constant $0 < C < \infty$ such that the following statements are true:

(i) For any $k \in \{1, \dots, 5\}$ and $j$ it holds: $\langle|\tilde x_j|^k\rangle_j \le C$.

(ii) For any $\xi \in \mathbb{R}$ and $j$ it holds: $|\langle\exp(i\tilde x_j\xi)\rangle_j| \le C|\xi|^{-1}$.

(iii) For any $\delta > 0$ there is $\lambda < 1$ such that for all $\sigma$, $|\xi| \ge \delta$, and $j$ it holds: $|\langle\exp(i\tilde x_j\xi)\rangle_j| \le \lambda$.

(iv) For any $\delta > 0$ there is $0 < C_\delta < \infty$ such that for all $\sigma$, $|\xi| \ge \delta$, and $j$:

$$|\langle\exp(i\tilde x_j\xi)\rangle_j| \le C_\delta\,\frac{1}{1 + |\xi|}.$$

(v) For any $j$ it holds:

$$\Big|\frac{d}{dm}\langle\exp(i\tilde x_j\xi)\rangle_j\Big| \le C(1 + |\xi|)\,|\xi|^3, \qquad \Big|\frac{d^2}{dm^2}\langle\exp(i\tilde x_j\xi)\rangle_j\Big| \le C\big(1 + |\xi|^2\big)\,|\xi|^3.$$

(vi) There exists a complex-valued function $h_j(\xi)$ such that for $|\xi| \ll 1$:

$$\langle\exp(i\tilde x_j\xi)\rangle_j = \exp(-h_j(\xi)) \quad\text{with}\quad \Big|h_j(\xi) - \frac{1}{2}\varsigma_j^2\xi^2\Big| \le C|\xi|^3,$$

where $\varsigma_j^2$ is given by Lemma 8.

We will partly deduce the last lemma from auxiliary results stated in [21]. However, because our setting is simpler than the one in [21], we will also state self-contained arguments if it is reasonable.

Proof (Lemma 10). In this proof, let $0 < C < \infty$ denote a generic constant that is uniform in $s$, $j$, and $\sigma$.

Argument for (i): By Hoelder's inequality, it suffices to show the estimate

$$\langle\tilde x_j^k\rangle_j \le C \tag{63}$$

for $k \in \{2, 4, 6\}$. We start with considering the case $k = 2$. Due to the uniform control $\|\delta\psi_j\| \le c_1$, the Hamiltonian

$$\tilde\psi_j(x_j) := -\sigma x_j + \frac{1}{2}x_j^2 + s_j x_j + \delta\psi_j(x_j) \tag{64}$$


of the measure $d\mu_j^\sigma = Z^{-1}\exp(-\tilde\psi_j(x_j))\,dx_j$ is a bounded perturbation of the strictly convex function

$$\tilde\psi_{j,c}(x_j) := -\sigma x_j + \frac{1}{2}x_j^2 + s_j x_j.$$

Hence, a combination of the criterion of Bakry & Émery (cf. Theorem 7), the criterion of Holley & Stroock (cf. Theorem 6), and Lemma 11 yields that $\mu_j^\sigma$ satisfies a SG($\tilde\varrho$) with constant $\tilde\varrho > 0$ uniformly in $s$, $j$, and $\sigma$. Therefore the SG yields the estimate

$$\langle\tilde x_j^2\rangle_j = \operatorname{var}_{\mu_j^\sigma}(x_j) \le C. \tag{65}$$

Now, we will deduce the estimate (63) for $k = 4$. Indeed, the SG for the measure $\mu_j^\sigma$ yields the estimate

$$\operatorname{var}_{\mu_j^\sigma}\big((x_j - m_j)^2\big) \le C\langle\tilde x_j^2\rangle_j.$$

Then, the desired statement follows from the estimate (65) and the identity

$$\langle\tilde x_j^4\rangle_j = \operatorname{var}_{\mu_j^\sigma}\big((x_j - m_j)^2\big) + \langle\tilde x_j^2\rangle_j^2.$$

Finally, we will deduce the estimate (63) for $k = 6$. We have the identity

$$\langle\tilde x_j^6\rangle_j = \operatorname{var}_{\mu_j^\sigma}\big((x_j - m_j)^3\big) + \langle\tilde x_j^3\rangle_j^2.$$

The second term on the right-hand side can be estimated by Hoelder's inequality using the estimate (63) for the case $k = 4$. On the first term on the right-hand side, an application of the SG($\tilde\varrho$) for the measure $\mu_j^\sigma$ yields

$$\operatorname{var}_{\mu_j^\sigma}\big((x_j - m_j)^3\big) \le C\langle\tilde x_j^4\rangle_j \le C.$$

Argument for (ii): Recalling the definition (64) of $\tilde\psi_j$, partial integration yields

$$\langle\exp(i\tilde x_j\xi)\rangle_j = \frac{1}{i\xi}\int\exp\big(i(x_j - m_j)\xi\big)\,\tilde\psi_j'(x_j)\,\frac{\exp(-\tilde\psi_j(x_j))}{Z}\,dx_j.$$

By Hoelder's inequality and another partial integration we get the inequality

$$\begin{aligned}
|\langle\exp(i\tilde x_j\xi)\rangle_j| &\le \frac{1}{|\xi|}\Big(\int\big(\tilde\psi_j'(x_j)\big)^2\,\frac{\exp(-\tilde\psi_j(x_j))}{Z}\,dx_j\Big)^{1/2}\\
&= \frac{1}{|\xi|}\Big(-\int\tilde\psi_j'(x_j)\,\frac{\big(\exp(-\tilde\psi_j(x_j))\big)'}{Z}\,dx_j\Big)^{1/2}\\
&= \frac{1}{|\xi|}\Big(\int\tilde\psi_j''(x_j)\,\frac{\exp(-\tilde\psi_j(x_j))}{Z}\,dx_j\Big)^{1/2}.
\end{aligned}$$


The latter yields the desired estimate by the definition (64) of $\tilde\psi_j$ and the uniform bound $\|\delta\psi_j\|_{C^2} \le c_1$.

Argument for (iii): The statement directly follows from an application of [21, Lemma 2.2.4] and the observation that the constant $\lambda$ only depends on the upper bounds of the statements (i) and (ii).

The statement (iv) follows directly from a combination of (ii) and (iii).

Argument for (v): Before we start the proof, some preparatory work has to be done. We need the following identities, which are verified by straightforward calculation:

$$\frac{d}{d\sigma}\varphi_{K,j}^* = m_j, \qquad \frac{d^2}{d\sigma^2}\varphi_{K,j}^* = \frac{d}{d\sigma}m_j = \varsigma_j^2, \qquad \frac{d^3}{d\sigma^3}\varphi_{K,j}^* = \frac{d}{d\sigma}\varsigma_j^2 = \langle\tilde x_j^3\rangle_j.$$

Additionally, we need the identities

$$\frac{d}{d\sigma}m = \frac{1}{K}\sum_{j=1}^{K}\frac{d}{d\sigma}m_j = \frac{1}{K}\sum_{j=1}^{K}\varsigma_j^2 \tag{66}$$

and

$$\frac{d^2}{d\sigma^2}m = \frac{1}{K}\sum_{j=1}^{K}\frac{d}{d\sigma}\varsigma_j^2 = \frac{1}{K}\sum_{j=1}^{K}\langle\tilde x_j^3\rangle_j. \tag{67}$$

Note that an application of Lemma 9 yields

$$\frac{1}{c} \le \frac{d}{d\sigma}m \le c. \tag{68}$$

Now, we will argue that

$$\Big|\frac{d^2}{dm^2}\sigma\Big| \le C. \tag{69}$$

Indeed, a direct calculation shows

$$\frac{d^2}{dm^2}\sigma = \frac{d}{dm}\Big(\frac{d}{d\sigma}m\Big)^{-1} = -\Big(\frac{d}{d\sigma}m\Big)^{-3}\frac{d^2}{d\sigma^2}m.$$

Then, the desired estimate (69) follows from the formula (67), the moment estimate of (i), and the estimate (68). The main ingredient for the argument is [21, Lemma 2.2.5], providing the following estimates:

$$\Big|\frac{1}{\varsigma_j}\frac{d}{d\sigma}\langle\exp(i\tilde x_j\xi)\rangle_j\Big| \le C\Big(1 + \frac{|\xi|}{\varsigma_j}\Big)\frac{|\xi|^3}{\varsigma_j^3}, \tag{70}$$

$$\Big|\Big(\frac{1}{\varsigma_j}\frac{d}{d\sigma}\Big)^2\langle\exp(i\tilde x_j\xi)\rangle_j\Big| \le C\Big(1 + \frac{\xi^2}{\varsigma_j^2}\Big)\frac{|\xi|^3}{\varsigma_j^3}. \tag{71}$$

In the last two inequalities, the constant $C$ only depends on the moment bounds of statement (i).


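Now, the preparatory work is done and we start to estimate the first term of (v). Direct calculation yields

$$\frac{d}{dm}\langle\exp(i\tilde x_j\xi)\rangle_j = \frac{d}{d\sigma}\langle\exp(i\tilde x_j\xi)\rangle_j\,\frac{d}{dm}\sigma = \varsigma_j\Big(\frac{1}{\varsigma_j}\frac{d}{d\sigma}\langle\exp(i\tilde x_j\xi)\rangle_j\Big)\Big(\frac{d}{d\sigma}m\Big)^{-1}.$$

Therefore, an application of the estimates (70), (68), and Lemma 9 yields the desired inequality,

$$\Big|\frac{d}{dm}\langle\exp(i\tilde x_j\xi)\rangle_j\Big| \le C(1 + |\xi|)\,|\xi|^3.$$

Let us turn to the second estimate of (v). A direct calculation reveals

$$\begin{aligned}
\frac{d^2}{dm^2}\langle\exp(i\tilde x_j\xi)\rangle_j
&= \frac{d}{dm}\Big(\frac{d}{d\sigma}\langle\exp(i\tilde x_j\xi)\rangle_j\,\frac{d}{dm}\sigma\Big)\\
&= \frac{d}{d\sigma}\Big(\varsigma_j\,\frac{1}{\varsigma_j}\frac{d}{d\sigma}\langle\exp(i\tilde x_j\xi)\rangle_j\,\frac{d}{dm}\sigma\Big)\frac{d}{dm}\sigma\\
&= \varsigma_j^2\Big(\frac{1}{\varsigma_j}\frac{d}{d\sigma}\Big)^2\langle\exp(i\tilde x_j\xi)\rangle_j\Big(\frac{d}{d\sigma}m\Big)^{-2}\\
&\quad + \Big(\frac{d}{d\sigma}\varsigma_j\Big)\Big(\frac{1}{\varsigma_j}\frac{d}{d\sigma}\langle\exp(i\tilde x_j\xi)\rangle_j\Big)\Big(\frac{d}{d\sigma}m\Big)^{-2}\\
&\quad - \varsigma_j\Big(\frac{1}{\varsigma_j}\frac{d}{d\sigma}\langle\exp(i\tilde x_j\xi)\rangle_j\Big)\Big(\frac{d}{d\sigma}m\Big)^{-3}\frac{d^2}{d\sigma^2}m.
\end{aligned}$$

The desired statement follows from a combination of the estimates (71), (67), (68), Lemma 9, and the estimate of statement (i).

The statement (vi) directly follows from Taylor expansion and the estimates of (i).

Proof (Theorem 4). We start with deducing the strict convexity of $\varphi_K$. It follows from Eqs. (60) and (61) that

$$\frac{d}{dm}\varphi_K(m) = \sigma + \Big(m - \frac{d}{d\sigma}\varphi_K^*(\sigma)\Big)\frac{d}{dm}\sigma = \sigma.$$

Hence, a second differentiation and the estimate (68) yield

$$\frac{d^2}{dm^2}\varphi_K(m) = \frac{d}{dm}\sigma \ge \frac{1}{c} > 0.$$

Now, let us consider the convergence of $\|\varphi_K(m) - \bar H_K(m)\|_{C^2}$. By Lemma 8 it suffices to consider the Lebesgue density $g_{K,m}$ of the distribution of the random variable

$$\frac{1}{\sqrt{K}}\sum_{j=1}^{K}(X_j - m_j).$$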


Because the random variables $\frac{1}{\sqrt K}(X_j - m_j)$ are independent, $g_{K,m}$ is a convolution. Because the Fourier transform of a convolution becomes a multiplication, we can express $g_{K,m}$ as (cf. the Inversion Lemma in [25])

$$2\pi\, g_{K,m}(0) = \int \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi . \tag{72}$$
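As a sanity check of the inversion formula (72), consider the special case in which every centered variable $\tilde x_j$ is Gaussian with variance $\varsigma_j^2$ (an assumption made only for this illustration; the paper's single-site measures are not Gaussian). Then each factor equals $\exp(-\varsigma_j^2 \xi^2 / (2K))$, and the right-hand side of (72) divided by $2\pi$ should reproduce the density at $0$ of a centered Gaussian whose variance is the average of the $\varsigma_j^2$. A minimal sketch:

```python
import math

def product_of_characteristic_functions(xi, variances):
    """Product over j of <exp(i x_j xi / sqrt(K))> for centred Gaussians."""
    K = len(variances)
    t = xi / math.sqrt(K)
    # For a centred Gaussian with variance s2: <exp(i x t)> = exp(-s2 * t^2 / 2).
    return math.exp(-0.5 * sum(variances) * t * t)

def density_at_zero(variances, cutoff=40.0, n=40000):
    """Evaluate (1 / 2 pi) * integral of the product via the trapezoidal rule."""
    h = 2.0 * cutoff / n
    total = 0.0
    for i in range(n + 1):
        xi = -cutoff + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * product_of_characteristic_functions(xi, variances)
    return total * h / (2.0 * math.pi)

def gaussian_density_at_zero(variances):
    """Density at 0 of N(0, mean variance), the law of the rescaled sum."""
    mean_var = sum(variances) / len(variances)
    return 1.0 / math.sqrt(2.0 * math.pi * mean_var)

if __name__ == "__main__":
    vs = [0.5, 1.0, 2.0, 1.5]  # hypothetical site variances
    print(density_at_zero(vs), gaussian_density_at_zero(vs))
```

The two printed numbers should agree up to quadrature error; the variances, cutoff, and grid size are arbitrary choices.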

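For convenience, we will use the following notation for the rest of the proof:

$a \lesssim b$ ⇔ there is a uniform constant $C > 0$ such that $a \le C b$,
$a \sim b$ ⇔ it holds that $a \lesssim b$ and $b \lesssim a$.

Assume that the following estimates hold uniformly in $K$ and $m$:

$$\left| \int \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi \right| \sim 1, \tag{73}$$

$$\left| \frac{d}{dm} \int \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi \right| \lesssim 1, \tag{74}$$

$$\left| \frac{d^2}{dm^2} \int \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi \right| \lesssim 1. \tag{75}$$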

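Then a combination of the formula (72) and Cramér's representation (58) yields the desired result

$$\|\bar H_K(m) - \varphi_K(m)\|_{C^2} \to 0 \quad \text{as } K \to \infty .$$

It remains to establish the estimates from above. Note that the intermediate estimate (74) follows from the estimates (73) and (75) by interpolation.

Argument for (73): We start with deducing the upper bound

$$\left| \int \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi \right| \lesssim 1 .$$

For some fixed $0 < \delta \ll 1$ we split the integral according to

$$\int \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi
= \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi
+ \int_{\left| \frac{1}{\sqrt K}\xi \right| \ge \delta} \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi . \tag{76}$$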


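Let us consider the inner integral. We can choose $\delta$ so small that the statement (vi) of Lemma 10 applies. Hence, we may rewrite the inner integral as

$$I := \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi
= \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \exp\Biggl( -\sum_{j=1}^{K} h_j\Bigl( \tfrac{1}{\sqrt K}\xi \Bigr) \Biggr) d\xi .$$

Note that for $\left| \frac{1}{\sqrt K}\xi \right| \le \delta$ the statement (vi) of Lemma 10 yields

$$\left| \sum_{j=1}^{K} \frac{\varsigma_j^2}{2K}\,\xi^2 - \sum_{j=1}^{K} h_j\Bigl( \tfrac{1}{\sqrt K}\xi \Bigr) \right| \lesssim \frac{1}{\sqrt K}\,|\xi|^3 . \tag{77}$$

In particular, for $\delta$ small enough this implies, by using the assumption (59),

$$\operatorname{Re}\Biggl( \sum_{j=1}^{K} h_j\Bigl( \tfrac{1}{\sqrt K}\xi \Bigr) \Biggr) \ge \frac{1}{4}\,\frac{1}{K}\sum_{j=1}^{K} \varsigma_j^2\,\xi^2 \ge \frac{1}{4c^2}\,\xi^2 , \tag{78}$$

where the constant $0 < c < \infty$ is given by (59). The last statement yields the estimate

$$|I| \le \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \Biggl| \exp\Biggl( -\sum_{j=1}^{K} h_j\Bigl( \tfrac{1}{\sqrt K}\xi \Bigr) \Biggr) \Biggr| \, d\xi
\le \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \exp\Bigl( -\frac{1}{4c^2}\,\xi^2 \Bigr) d\xi \lesssim 1 .$$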

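Now, let us consider the outer integral

$$II := \int_{\left| \frac{1}{\sqrt K}\xi \right| \ge \delta} \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi .$$

On the integrand we apply the statement (iii) of Lemma 10 (on $K - 2$ of the $K$ factors) and the statement (iv) of Lemma 10 (on the remaining 2 factors):

$$\left| \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \right|
\lesssim \lambda^{K-2} \left( \frac{1}{1 + \frac{1}{\sqrt K}|\xi|} \right)^{2}
\lesssim K \lambda^{K-2}\, \frac{1}{K + \xi^2}
\lesssim K \lambda^{K-2}\, \frac{1}{1 + \xi^2} .$$

It follows that the second term $II$ is exponentially small:

$$|II| = \left| \int_{\left| \frac{1}{\sqrt K}\xi \right| \ge \delta} \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi \right|
\lesssim K \lambda^{K-2} \int \frac{1}{1 + \xi^2}\, d\xi
\lesssim K \lambda^{K-2} \to 0 \quad \text{as } K \to \infty .$$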


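Together with the estimate of $|I|$ from above, this yields the desired upper bound (76). We turn to the lower bound

$$\left| \int \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi \right| = |I + II| \gtrsim 1 .$$

Applying the triangle inequality yields $|I + II| \gtrsim |I| - |II|$. Because $|II| \to 0$ as $K \to \infty$, it suffices to show $|I| \gtrsim 1$. Recall that for $\left| \frac{1}{\sqrt K}\xi \right| \le \delta$ we have (cf. (78))

$$\operatorname{Re}\Biggl( \sum_{j=1}^{K} h_j\Bigl( \tfrac{1}{\sqrt K}\xi \Bigr) \Biggr) \ge \frac{1}{4c^2}\,\xi^2 .$$

The function $\mathbb{C} \ni y \mapsto \exp(y) \in \mathbb{C}$ is Lipschitz continuous on $\operatorname{Re} y \le -\frac{1}{4c^2}\xi^2$ with constant $\exp\bigl(-\frac{1}{4c^2}\xi^2\bigr)$. Therefore (77) yields the estimate

$$\left| \exp\Biggl( -\sum_{j=1}^{K} h_j\Bigl( \tfrac{1}{\sqrt K}\xi \Bigr) \Biggr) - \exp\Biggl( -\sum_{j=1}^{K} \frac{\varsigma_j^2}{2K}\,\xi^2 \Biggr) \right| \lesssim \frac{1}{\sqrt K}\,|\xi|^3 \exp\Bigl( -\frac{1}{4c^2}\,\xi^2 \Bigr) .$$

The last estimate implies

$$\left| I - \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \exp\Biggl( -\sum_{j=1}^{K} \frac{\varsigma_j^2}{2K}\,\xi^2 \Biggr) d\xi \right| \lesssim \frac{1}{\sqrt K} \int |\xi|^3 \exp\Bigl( -\frac{1}{4c^2}\,\xi^2 \Bigr) d\xi \to 0$$

as $K \to \infty$. Additionally, we observe that by the assumption (59),

$$III := \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \exp\Biggl( -\sum_{j=1}^{K} \frac{\varsigma_j^2}{2K}\,\xi^2 \Biggr) d\xi \gtrsim \int_{\{|\xi| \le \delta\}} \exp\Bigl( -\frac{c}{2}\,\xi^2 \Bigr) d\xi \gtrsim 1 .$$

Hence, we may conclude that

$$|I| = |I - III + III| \ge |III| - |I - III| \gtrsim 1$$

for $K \gg 1$ large enough.

Argument for (75): We split the integral according to

$$\frac{d^2}{dm^2} \int \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi
= \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \frac{d^2}{dm^2} \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi
+ \int_{\left| \frac{1}{\sqrt K}\xi \right| \ge \delta} \frac{d^2}{dm^2} \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \, d\xi
=: IV + V .$$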


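Let us consider the inner integral $IV$. An application of the product rule for differentiation yields

$$\frac{d}{dm} \prod_{j=1}^{K} \bigl\langle \exp(i\tilde x_j \xi) \bigr\rangle_j = \sum_{j=1}^{K} \frac{d}{dm} \bigl\langle \exp(i\tilde x_j \xi) \bigr\rangle_j \prod_{k \in \{1,\dots,K\},\, k \ne j} \bigl\langle \exp(i\tilde x_k \xi) \bigr\rangle_k .$$

A second differentiation yields

$$\frac{d^2}{dm^2} \prod_{j=1}^{K} \bigl\langle \exp(i\tilde x_j \xi) \bigr\rangle_j
= \sum_{j=1}^{K} \Biggl[ \frac{d^2}{dm^2} \bigl\langle \exp(i\tilde x_j \xi) \bigr\rangle_j \prod_{k \in \{1,\dots,K\},\, k \ne j} \bigl\langle \exp(i\tilde x_k \xi) \bigr\rangle_k
+ \frac{d}{dm} \bigl\langle \exp(i\tilde x_j \xi) \bigr\rangle_j \sum_{n \in \{1,\dots,K\},\, n \ne j} \frac{d}{dm} \bigl\langle \exp(i\tilde x_n \xi) \bigr\rangle_n \prod_{l \in \{1,\dots,K\},\, l \ne j,\, l \ne n} \bigl\langle \exp(i\tilde x_l \xi) \bigr\rangle_l \Biggr] . \tag{79}$$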

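The same argument as for (78) yields that for $\left| \frac{1}{\sqrt K}\xi \right| \le \delta$ with $\delta$ small enough,

$$\left| \prod_{l \in \{1,\dots,K\},\, l \ne j,\, l \ne n} \Bigl\langle \exp\Bigl( i\tilde x_l \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_l \right|
= \left| \exp\Biggl( -\sum_{l \in \{1,\dots,K\},\, l \ne j,\, l \ne n} h_l\Bigl( \tfrac{1}{\sqrt K}\xi \Bigr) \Biggr) \right|
\le \exp\Bigl( -\frac{1}{4c^2}\,\xi^2 \Bigr) . \tag{80}$$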

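Hence, a combination of the identity (79), the estimate (80), and the estimates of Lemma 10 (v) yields

$$\left| \frac{d^2}{dm^2} \prod_{j=1}^{K} \Bigl\langle \exp\Bigl( i\tilde x_j \tfrac{1}{\sqrt K}\xi \Bigr) \Bigr\rangle_j \right|
\lesssim \left[ \Bigl( 1 + \frac{|\xi|^2}{K} \Bigr) \frac{1}{\sqrt K}\,|\xi|^3 + \Bigl( 1 + \frac{|\xi|^2}{K} \Bigr) \frac{1}{K}\,|\xi|^6 \right] \exp\Bigl( -\frac{1}{4c^2}\,\xi^2 \Bigr) .$$

The desired estimate follows directly from the last estimate, i.e.

$$|IV| \lesssim \int_{\left| \frac{1}{\sqrt K}\xi \right| \le \delta} \bigl( 1 + |\xi|^2 \bigr) \bigl( |\xi|^3 + |\xi|^6 \bigr) \exp\Bigl( -\frac{1}{4c^2}\,\xi^2 \Bigr) d\xi \lesssim 1 .$$

Now, we turn to the outer integral $V$. By substitution we have

$$V = \sqrt K \int_{\{|\xi| \ge \delta\}} \frac{d^2}{dm^2} \prod_{j=1}^{K} \bigl\langle \exp(i\tilde x_j \xi) \bigr\rangle_j \, d\xi .$$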


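On the identity (79), we apply the estimates of Lemma 10 (v) in a first step and $\bigl| \langle \exp(i\tilde x_j \xi) \rangle_j \bigr| \le 1$ in a second step:

$$\begin{aligned}
\left| \frac{d^2}{dm^2} \prod_{j=1}^{K} \bigl\langle \exp(i\tilde x_j \xi) \bigr\rangle_j \right|
&\lesssim \sum_{j=1}^{K} \Biggl[ (1 + |\xi|^2)\,|\xi|^3 \prod_{k \in \{1,\dots,K\},\, k \ne j} \bigl| \langle \exp(i\tilde x_k \xi) \rangle_k \bigr|
+ (1 + |\xi|^2)\,|\xi|^6 \sum_{n \in \{1,\dots,K\},\, n \ne j} \; \prod_{l \in \{1,\dots,K\},\, l \ne j,\, l \ne n} \bigl| \langle \exp(i\tilde x_l \xi) \rangle_l \bigr| \Biggr] \\
&\lesssim (1 + |\xi|^8) \sum_{j \in \{1,\dots,K\}} \; \sum_{n \in \{1,\dots,K\},\, n \ne j} \; \prod_{l \in \{1,\dots,K\},\, l \ne j,\, l \ne n} \bigl| \langle \exp(i\tilde x_l \xi) \rangle_l \bigr| .
\end{aligned}$$

We use Lemma 10 (iii) (on $K - 12$ of the $K - 2$ factors $|\langle \exp(i\tilde x_l \xi) \rangle_l|$) and Lemma 10 (iv) (on the remaining 10 factors $|\langle \exp(i\tilde x_l \xi) \rangle_l|$):

$$\left| \frac{d^2}{dm^2} \prod_{j=1}^{K} \bigl\langle \exp(i\tilde x_j \xi) \bigr\rangle_j \right|
\lesssim K^2 (1 + |\xi|^8)\, \lambda^{K-12} \left( \frac{1}{1 + |\xi|} \right)^{10}
\lesssim K^2 \lambda^{K-12}\, \frac{1}{1 + |\xi|^2} .$$

Hence, we see that the term $|V|$ is exponentially small, i.e.

$$|V| \lesssim \sqrt K\, K^2 \lambda^{K-12} \int \frac{1}{1 + |\xi|^2}\, d\xi \to 0 \quad \text{as } K \to \infty .$$

Together with the estimate for $|IV|$ from above, the latter yields (75).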

B. Basic Facts about the LSI

In this section we quote some basic facts about the LSI that are needed in our arguments. For a general introduction to the LSI we refer to [10,17,24]. There are several standard criteria for the LSI. The tensorization principle shows that the LSI is compatible with products (cf. [8]).

Theorem 5 (Tensorization principle). Let $\mu_1$ and $\mu_2$ be probability measures on Euclidean spaces $X_1$ and $X_2$, respectively. If $\mu_1$ and $\mu_2$ satisfy LSI($\varrho_1$) and LSI($\varrho_2$), respectively, then the product measure $\mu_1 \otimes \mu_2$ satisfies LSI($\min\{\varrho_1, \varrho_2\}$).

The next criterion [13] shows how the LSI constant behaves under perturbations. Note that it is not well suited for high dimensions.


Theorem 6 (Criterion of Holley & Stroock). Let $\mu$ be a probability measure on the Euclidean space $X$ and let $\delta\psi : X \to \mathbb{R}$ be a bounded function. Let the probability measure $\tilde\mu$ be defined as

$$\tilde\mu(dx) = \frac{1}{Z} \exp\bigl( -\delta\psi(x) \bigr)\, \mu(dx) .$$

If the measure $\mu$ satisfies LSI($\varrho$) for some constant $\varrho > 0$, then $\tilde\mu$ satisfies LSI($\tilde\varrho$) with constant

$$\tilde\varrho = \varrho \exp\bigl( -(\sup \delta\psi - \inf \delta\psi) \bigr) .$$

The criterion of Bakry & Émery connects the convexity of the Hamiltonian to the LSI constant (cf. [1,23]).

Theorem 7 (Criterion of Bakry & Émery). Let $X$ be an $N$-dimensional Euclidean space and let $H \in C^2(X)$. The probability measure $\mu$ on $X$ is defined via

$$\mu(dx) = \frac{1}{Z} \exp\bigl( -H(x) \bigr)\, dx .$$

If there is a constant $\varrho > 0$ such that for all $x, v \in X$,

$$\langle v, \operatorname{Hess} H(x)\, v \rangle \ge \varrho\, |v|^2 ,$$

then $\mu$ satisfies LSI($\varrho$).

More recently, Otto & Reznikoff [22] deduced a criterion that is capable of dealing with certain non-convex Hamiltonians in high dimensions.

Theorem 8 (Criterion of Otto & Reznikoff). Let $d\mu := \frac{1}{Z} \exp(-H(x))\, dx$ be a probability measure on a direct product of Euclidean spaces $X = X_1 \times \cdots \times X_M$. We assume that
◦ the conditional measures $\mu(dx_l \mid x_n \in X_n,\ n \ne l)$, $1 \le l \ne n \le M$, satisfy a uniform LSI($\varrho_l$) with constant $\varrho_l > 0$,
◦ the numbers $\kappa_{ln}$, $1 \le l \ne n \le M$, satisfy $|\nabla_l \nabla_n H(x)| \le \kappa_{ln} < \infty$ uniformly in $x$; here $|\cdot|$ denotes the operator norm of a bilinear form,
◦ the matrix $A = (A_{ij})_{M \times M}$ defined by $A_{ij} = \varrho_i$ if $i = j$ and $A_{ij} = -\kappa_{ij}$ else satisfies, in the sense of quadratic forms, $A \ge \varrho\, \mathrm{Id}$ for a constant $\varrho > 0$.
Then $\mu$ satisfies LSI($\varrho$).

One can understand Theorem 8 as a comparison principle. Via the matrix $A$, a Gaussian measure $\mu_A(dx) = \exp(-\langle x, Ax \rangle)\, dx$ is associated to the original Gibbs measure $\mu(dx) = \exp(-H(x))\, dx$. Because for Gaussian measures the property of positive definiteness of $A$ and the LSI are equivalent (see for example [22]), the criterion of Otto & Reznikoff becomes:


Theorem. If $\mu_A$ satisfies LSI($\varrho$), then also $\mu$ does.

Due to this example one could hope that $\mu$ inherits further features from $\mu_A$. Theorem 3 shows that this is the case for covariances (cf. [20]). In the proof of the main result we also need the linearized version of the LSI, which is known as the spectral gap inequality (SG).

Definition 3. A probability measure $\mu$ satisfies SG($\varrho$), $\varrho > 0$, if for all $f$,

$$\operatorname{var}_\mu(f) := \int \Bigl( f - \int f\, d\mu \Bigr)^{2} d\mu \le \frac{1}{\varrho} \int |\nabla f|^2\, d\mu .$$

We need the following well-known facts about the SG.

Lemma 11.
◦ If $\mu$ satisfies LSI($\varrho$), then $\mu$ also satisfies SG($\varrho$).
◦ If $\mu$ satisfies SG($\varrho$), then for all functions $f$ and $g$,

$$\operatorname{cov}_\mu(f, g) \le \frac{1}{\varrho} \Bigl( \int |\nabla f|^2\, d\mu \Bigr)^{\frac{1}{2}} \Bigl( \int |\nabla g|^2\, d\mu \Bigr)^{\frac{1}{2}} .$$
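The criteria of Holley & Stroock and of Otto & Reznikoff quoted in this section reduce to simple arithmetic once the constants are known. The sketch below is illustrative only. The function names are hypothetical. The positivity check uses Gershgorin's circle theorem, which is a sufficient condition swapped in here for convenience and is not part of the theorem itself. The couplings $\kappa_{ln}$ are assumed to be symmetric and nonnegative.

```python
import math

def holley_stroock_constant(rho, sup_dpsi, inf_dpsi):
    """Perturbed LSI constant rho * exp(-(sup dpsi - inf dpsi)) from Theorem 6."""
    if rho <= 0 or sup_dpsi < inf_dpsi:
        raise ValueError("need rho > 0 and sup >= inf")
    return rho * math.exp(-(sup_dpsi - inf_dpsi))

def gershgorin_lower_bound(rhos, kappa):
    """Lower bound on the smallest eigenvalue of A, where
    A[i][i] = rhos[i] and A[i][j] = -kappa[i][j] for i != j.

    By Gershgorin's theorem every eigenvalue of the symmetric matrix A is
    at least min_i (rhos[i] - sum_{j != i} kappa[i][j]). If this number is
    positive, it is an admissible constant in the condition A >= rho * Id.
    """
    M = len(rhos)
    return min(
        rhos[i] - sum(kappa[i][j] for j in range(M) if j != i)
        for i in range(M)
    )

if __name__ == "__main__":
    # Hypothetical numbers: three sites with nearest-neighbour coupling 0.2.
    rhos = [1.0, 1.0, 1.0]
    kappa = [[0.0, 0.2, 0.0],
             [0.2, 0.0, 0.2],
             [0.0, 0.2, 0.0]]
    print(gershgorin_lower_bound(rhos, kappa))
    print(holley_stroock_constant(2.0, 0.5, -0.5))
```

The Gershgorin bound is generally not sharp: a negative value does not rule out $A \ge \varrho\,\mathrm{Id}$ for some $\varrho > 0$, in which case the smallest eigenvalue of $A$ has to be computed directly.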

References

1. Bakry, D., Émery, M.: Diffusions hypercontractives. Sem. Probab. XIX, Lecture Notes in Math. 1123. Berlin-Heidelberg-New York: Springer-Verlag, 1985, pp. 177–206
2. Bodineau, T., Helffer, B.: The log-Sobolev inequality for unbounded spin systems. J. Funct. Anal. 166(1), 168–178 (1999)
3. Bodineau, T., Helffer, B.: Correlations, spectral gap and log-Sobolev inequalities for unbounded spins systems. In: Differential equations and mathematical physics (Birmingham, AL, 1999), Volume 16 of AMS/IP Stud. Adv. Math., Providence, RI: Amer. Math. Soc., 2000, pp. 51–66
4. Cancrini, N., Martinelli, F., Roberto, C.: The logarithmic Sobolev constant of Kawasaki dynamics under a mixing condition revisited. Ann. Inst. H. Poincaré Probab. Statist. 38(4), 385–436 (2002)
5. Caputo, P.: Uniform Poincaré inequalities for unbounded conservative spin systems: the non-interacting case. Stochastic Process. Appl. 106(2), 223–244 (2003)
6. Chafaï, D.: Glauber versus Kawasaki for spectral gap and logarithmic Sobolev inequalities of some unbounded conservative spin systems. Markov Process. Rel. Fields 9(3), 341–362 (2003)
7. Feller, W.: An introduction to probability theory and its applications. Vol. II. 2nd ed. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley and Sons, Inc., 1971
8. Gross, L.: Logarithmic Sobolev inequalities. Amer. J. Math. 97, 1061–1083 (1975)
9. Grunewald, N., Otto, F., Villani, C., Westdickenberg, M.: A two-scale approach to logarithmic Sobolev inequalities and the hydrodynamic limit. Ann. Inst. H. Poincaré Probab. Statist. 45(2), 302–351 (2009)
10. Guionnet, A., Zegarlinski, B.: Lectures on logarithmic Sobolev inequalities. In: Séminaire de Probabilités, XXXVI, Volume 1801 of Lecture Notes in Math., Berlin: Springer, 2003, pp. 1–134
11. Guo, M., Papanicolaou, G., Varadhan, S.: Nonlinear diffusion limit for a system with nearest neighbor interactions. Commun. Math. Phys. 118, 31–59 (1988)
12. Helffer, B.: Remarks on decay of correlations and Witten Laplacians. III. Application to logarithmic Sobolev inequalities. Ann. Inst. H. Poincaré Probab. Statist. 35(4), 483–508 (1999)
13. Holley, R., Stroock, D.: Logarithmic Sobolev inequalities and stochastic Ising models. J. Stat. Phys. 46, 1159–1194 (1987)
14. Kipnis, C., Landim, C.: Scaling limits of interacting particle systems. Grundlehren der Mathematischen Wissenschaften 320. Berlin: Springer, 1999
15. Kosygina, E.: The behavior of the specific entropy in the hydrodynamic scaling limit. Ann. Probab. 29(3), 1086–1110 (2001)
16. Landim, C., Panizo, G., Yau, H.T.: Spectral gap and logarithmic Sobolev inequality for unbounded conservative spin systems. Ann. Inst. H. Poincaré Probab. Statist. 38(5), 739–777 (2002)
17. Ledoux, M.: Logarithmic Sobolev inequalities for unbounded spin systems revisited. Sem. Probab. XXXV, Lecture Notes in Math. 1755. Berlin-Heidelberg-New York: Springer-Verlag, 2001, pp. 167–194
18. Lu, S.L., Yau, H.-T.: Spectral gap and logarithmic Sobolev inequality for Kawasaki and Glauber dynamics. Commun. Math. Phys. 156(2), 399–433 (1993)
19. Menz, G.: Equilibrium dynamics of continuous unbounded spin systems. Dissertation, University of Bonn (2011). urn:nbn:de:hbz:5N-25331
20. Menz, G., Otto, F.: A new covariance estimate. In preparation, 2011
21. Menz, G., Otto, F.: Uniform logarithmic Sobolev inequalities for conservative spin systems with superquadratic single-site potential. MPI-MIS preprint, Leipzig, 2011
22. Otto, F., Reznikoff, M.: A new criterion for the logarithmic Sobolev inequality and two applications. J. Funct. Anal. 243(1), 121–157 (2007)
23. Otto, F., Villani, C.: Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. J. Funct. Anal. 173, 361–400 (2000)
24. Royer, G.: Une initiation aux inégalités de Sobolev logarithmiques. Cours Spéc., Soc. Math. France, 1999
25. Shiryaev, A.N.: Probability. New York: Springer-Verlag, 2nd edition, 1996
26. Stroock, D., Zegarlinski, B.: The equivalence of the logarithmic Sobolev inequality and the Dobrushin-Shlosman mixing condition. Commun. Math. Phys. 144, 303–323 (1992)
27. Stroock, D., Zegarlinski, B.: The logarithmic Sobolev inequality for discrete spin systems on the lattice. Commun. Math. Phys. 149, 175–193 (1992)
28. Stroock, D., Zegarlinski, B.: On the ergodic properties of Glauber dynamics. J. Stat. Phys. 81, 1007–1019 (1995)
29. Yau, H.-T.: Relative entropy and hydrodynamics of Ginzburg-Landau models. Lett. Math. Phys. 22, 63–80 (1991)
30. Yau, H.-T.: Logarithmic Sobolev inequality for lattice gases with mixing conditions. Commun. Math. Phys. 181(2), 367–408 (1996)
31. Yoshida, N.: Application of log-Sobolev inequality to the stochastic dynamics of unbounded spin systems on the lattice. J. Funct. Anal. 173, 74–102 (2000)
32. Yoshida, N.: The equivalence of the log-Sobolev inequality and a mixing condition for unbounded spin systems on the lattice. Ann. Inst. H. Poincaré Probab. Statist. 37(2), 223–243 (2001)
33. Zegarlinski, B.: The strong decay to equilibrium for the stochastic dynamics of unbounded spin systems on a lattice. Commun. Math. Phys. 175, 401–432 (1996)

Communicated by H.-T. Yau
