Communications in Mathematical Physics - Volume 213

Author: M. Aizenman (Chief Editor)

Commun. Math. Phys. 213 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Editorial

Leafing back through Communications in Mathematical Physics, one would turn many pages before reaching the issue marking the moment when Arthur Jaffe stepped up to the helm of CMP, which was previously held by Rudolf Haag, Klaus Hepp, and James Glimm. As he is now passing on the role of Editor in Chief, Arthur Jaffe is to be thanked and congratulated for the many years of his successful leadership. Under his guidance the journal has experienced spectacular growth, continuously reflecting the depth and the richness of mathematical physics.

The editorial policy of Communications in Mathematical Physics remains committed to offering the forum for works of the highest mathematical standards, motivated by the vision and the challenges of modern physics. Also welcome are submissions presenting valuable insights which may not yet be fully supported by mathematically mature proofs, provided they maintain a clear distinction between proven statements and conjectured relations. Works are expected to be introduced in a manner which serves the purpose of communicating the results within a broad professional community.

As the modalities of scientific publishing are evolving, with new possibilities opened up by the electronic media, the publisher is committed to keeping abreast of the emerging technology. Currently, institutional subscriptions enable all affiliated individuals to access, via the internet, the electronic version of CMP, including back issues and articles in the printer's queue.

The strength of CMP is based on the high quality of the works of contributing authors, as well as the dedicated work of the editors and the referees. Through the symbiotic relation of the journal with the community it serves, CMP will continue to reflect the highest level of mathematical physics, as it evolves through this dynamic period.

Michael Aizenman
Editor in Chief

Commun. Math. Phys. 213, 1 – 17 (2000)

The Geometry of (Super) Conformal Quantum Mechanics

Jeremy Michelson$^{1,2,\star}$, Andrew Strominger$^{2}$

$^{1}$ Department of Physics, Harvard University, Cambridge, MA 02138, USA
$^{2}$ Department of Physics, University of California, Santa Barbara, CA 93106, USA

Received: 3 September 1999 / Accepted: 30 January 2000

Abstract: $N$-particle quantum mechanics described by a sigma model with an $N$-dimensional target space with torsion is considered. It is shown that an $SL(2,\mathbb{R})$ conformal symmetry exists if and only if the geometry admits a homothetic Killing vector $D^a \partial_a$ whose associated one-form $D_a\, dX^a$ is closed. Further, the $SL(2,\mathbb{R})$ can always be extended to $Osp(1|2)$ superconformal symmetry, with a suitable choice of torsion, by the addition of $N$ real fermions. Extension to $SU(1,1|1)$ requires a complex structure $I$ and a holomorphic $U(1)$ isometry $D^a I_a{}^b \partial_b$. Conditions for extension to the superconformal group $D(2,1;\alpha)$, which involve a triplet of complex structures and $SU(2) \times SU(2)$ isometries, are derived. Examples are given.

1. Introduction

Conformal and superconformal field theories in various dimensions have played a central role in our understanding of modern field theory and string theory. Oddly, the subject of this paper, one dimension, is one of the least well understood cases. The simplest example of conformally invariant single-particle quantum mechanics was pioneered in [1], following the general analysis of [2–4]. Supersymmetric generalizations were discussed in [5–11]. The quantum mechanics case has taken on renewed interest because superconformal quantum mechanics may provide a dual description of string theory on $AdS_2$ [12]. Most of the discussions so far have concerned relatively simple systems either with small numbers of particles or exact integrability. In this paper we consider a more general class of models with $N$ particles.

$^{\star}$ Current address: New High Energy Theory Center, Rutgers University, 126 Frelinghuysen Road, Piscataway, NJ 08854, USA.

We begin in Sect. 2 with a bosonic sigma model with an $N$-dimensional target space. It is shown that the model has a nonlinearly-realized conformal symmetry if and only if the target space metric has a vector field $D^a \partial_a$ whose Lie derivative obeys

$$\mathcal{L}_D g_{ab} = 2 g_{ab}, \tag{1.1}$$

and whose associated one-form is closed,

$$d(D_a\, dX^a) = 0. \tag{1.2}$$

Given (1.1) and (1.2) it is shown that, in a Hamiltonian formalism, the dilations are (roughly) generated by $D^a P_a$ while the special conformal transformations are generated by

$$K = \tfrac{1}{2} D_a D^a. \tag{1.3}$$

The conformal symmetry persists in the presence of a potential $V$ obeying $\mathcal{L}_D V = -2V$. A general class of examples is given.

In Sect. 3 we turn to the supersymmetric case. The geometry of Poincaré-supersymmetric quantum mechanics with a variety of supermultiplets was discussed by Coles and Papadopoulos [13]. We restrict our attention to the case for which the multiplet structure with respect to the Poincaré super-subgroup consists of $N$ bosons $X^a$ with $N$ real superpartners $\lambda^a$. Such multiplets arise in the reduction of two-dimensional chiral (0, N) multiplets, where N is the number of supersymmetries, to one dimension, and give rise to what is sometimes referred to as "type B" models (most of the literature concerns "type A" (N/2, N/2) multiplets). In Sect. 3.1 we show that every bosonic conformal model can be extended to an N = 1B theory with $Osp(1|2)$ superconformal symmetry provided the torsion obeys certain constraints. In Sect. 3.2 we consider N = 2B and find that the extension to $SU(1,1|1)$ requires a complex structure $I$ with respect to which $D$ must be holomorphic. $D^a I_a{}^b \partial_b$ is found to generate a $U(1)$ isometry. In Sect. 3.3 we first derive a simplified version of the conditions for N = 4B Poincaré supersymmetry with an $SU(2)$ R-symmetry as first-order differential relations between the triplet of complex structures $I^r$. We further show that an N = 4B model has a $D(2,1;\alpha)$ superconformal symmetry if the vector fields $D^a I^{r}{}_a{}^b \partial_b$ generate an $SU(2)$ isometry group and obey generalizations of the identities required for $SU(1,1|1)$. The parameter $\alpha$ is determined by the constant in the $SU(2)$ Lie bracket algebra. In Sect. 3.4 we construct a large class of N = 4B theories in terms of an unconstrained potential $L$. $D(2,1;\alpha)$ superconformal symmetry then follows if $L$ is a homogeneous and $SU(2)$ rotationally-invariant function of the coordinates. Related results in four dimensions were recently discussed in [14].

Throughout the paper we use a Hamiltonian formalism. In Appendix A we give a Lagrangian derivation of the supercharges used in the text. We use real coordinates throughout the body of the text, but Appendix B gives various useful formulae for the geometry and supercharges in complex coordinates. In Appendix C we discuss the conditions under which an N = 4B geometry can be written in terms of a potential $L$. A primary motivation for this work is the expectation that quantum mechanics on the five-dimensional multi-black hole moduli space is an N = 4B theory with a $D(2,1;\alpha)$ superconformal symmetry at low energies [15].

2. N = 0 Conformally Invariant N-Particle Quantum Mechanics

In this section we find the conditions under which a general $N$-particle quantum mechanics admits an $SL(2,\mathbb{R})$ symmetry. We will adopt a Hamiltonian formalism, and derive the conditions for the existence of appropriate operators generating the symmetries. The general Hamiltonian is$^{1}$

$$H = \tfrac{1}{2} P_a^\dagger g^{ab} P_b + V(X), \tag{2.1}$$

where $a, b = 1, \ldots, N$. We now determine the conditions under which the theory, defined by Eq. (2.1), admits an $SL(2,\mathbb{R})$ symmetry. We first look for a dilational symmetry of the general form

$$\delta_D X^a = \epsilon D^a(X), \qquad \delta_D t = 2\epsilon t. \tag{2.2}$$

This is generated by an operator

$$D = \tfrac{1}{2} D^a P_a + \text{h.c.} \tag{2.3}$$

which should obey

$$[D, H] = 2iH. \tag{2.4}$$

From the definitions (2.3) and (2.1) one finds

$$[D, H] = -\tfrac{i}{2} P_a^\dagger (\mathcal{L}_D g^{ab}) P_b - i \mathcal{L}_D V - \tfrac{i}{4} \nabla^2 \nabla_a D^a, \tag{2.5}$$

where $\mathcal{L}_D$ is the usual Lie derivative obeying

$$\mathcal{L}_D g_{ab} = D^c g_{ab,c} + D^c{}_{,a}\, g_{cb} + D^c{}_{,b}\, g_{ac}. \tag{2.6}$$

Therefore, given a metric $g$ and potential $V$, a dilational symmetry exists if and only if there exists a conformal Killing vector $D$ obeying

$$\mathcal{L}_D g_{ab} = 2 g_{ab} \tag{2.7}$$

and

$$\mathcal{L}_D V = -2V. \tag{2.8}$$
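As a consistency check (a sketch added here, not part of the original derivation), one can substitute (2.7) and (2.8) into (2.5) and recover (2.4). The inputs assumed beyond the text are the standard facts that $\mathcal{L}_D g_{ab} = \nabla_a D_b + \nabla_b D_a$ for the Levi-Civita connection, and that the Lie derivative of the inverse metric is minus the index-raised Lie derivative of the metric:

```latex
\begin{align*}
\mathcal{L}_D g_{ab} = 2 g_{ab}
  &\;\Longrightarrow\; \mathcal{L}_D g^{ab}
   = -g^{ac} g^{bd}\, \mathcal{L}_D g_{cd} = -2 g^{ab}, \\
\nabla_a D_b + \nabla_b D_a = 2 g_{ab}
  &\;\Longrightarrow\; \nabla_a D^a = N
  \;\Longrightarrow\; \nabla^2 \nabla_a D^a = 0, \\
[D, H] &= -\tfrac{i}{2} P_a^\dagger \bigl(-2 g^{ab}\bigr) P_b - i(-2V) - 0
        = 2i \Bigl( \tfrac{1}{2} P_a^\dagger g^{ab} P_b + V \Bigr) = 2iH .
\end{align*}
```

The second line is the reason the last term of (2.5) drops out: the divergence of a homothetic vector field is the constant $N$.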

Note that (2.7) implies the vanishing of the last term of Eq. (2.5). A vector field $D$ obeying (2.7) is known as a homothetic vector field, and the action of $D$ is known as a homothety. Next we look for a special conformal symmetry generated by an operator $K(X)$ obeying

$$[D, K] = -2iK, \tag{2.9}$$

and

$$[H, K] = -iD. \tag{2.10}$$

Equations (2.9) and (2.10) together with (2.4) form an $SL(2,\mathbb{R})$ algebra. Equation (2.9) is equivalent to

$$\mathcal{L}_D K = 2K, \tag{2.11}$$

while (2.10) can be written

$$D_a\, dX^a = dK. \tag{2.12}$$

Hence the one-form $D$ is exact. One can solve for $K$ as

$$K = \tfrac{1}{2} g_{ab} D^a D^b. \tag{2.13}$$

$^{1}$ The canonical momentum $P_a = g_{ab} \dot{X}^b = -i\partial_a$ obeys $[P_a, X^b] = -i\delta_a^b$ and $P_a^\dagger = \frac{1}{\sqrt{g}} P_a \sqrt{g}$ (for the norm $(f_1, f_2) = \int d^N X \sqrt{g}\, f_1^* f_2$). In this and all subsequent expressions, the operator ordering is as indicated.
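The following short derivation (a sketch, assuming $\nabla$ is the Levi-Civita connection of $g_{ab}$) illustrates why the expression (2.13) solves both (2.11) and (2.12) once the homothety is closed. The symmetric part of $\nabla_a D_b$ is fixed by (2.7) and the antisymmetric part vanishes by closure of the one-form:

```latex
\begin{align*}
\nabla_a D_b + \nabla_b D_a = 2 g_{ab},
\quad \nabla_a D_b - \nabla_b D_a = 0
  &\;\Longrightarrow\; \nabla_a D_b = g_{ab}, \\
K = \tfrac{1}{2} g_{bc} D^b D^c
  &\;\Longrightarrow\; \partial_a K = D^b \nabla_a D_b = D^b g_{ab} = D_a, \\
\mathcal{L}_D K = D^a \partial_a K &= D^a D_a = 2K .
\end{align*}
```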

We shall adopt the phrase "closed homothety" to refer to a homothety whose associated one-form is closed and exact. An alternate basis of $SL(2,\mathbb{R})$ generators is

$$L_0 = \tfrac{1}{2}(H + K), \qquad L_{\pm 1} = \tfrac{1}{2}(H - K \mp iD). \tag{2.14}$$

In this basis the generators obey the standard commutation relations

$$[L_1, L_{-1}] = 2L_0, \qquad [L_0, L_{\pm 1}] = \mp L_{\pm 1}. \tag{2.15}$$
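The relations (2.15) can be verified directly from the basis (2.14). The sketch below uses only $[H, D] = -2iH$ (which is (2.4)), $[K, D] = 2iK$ (which is (2.9)) and $[H, K] = -iD$ (which is (2.10)):

```latex
\begin{align*}
[L_1, L_{-1}] &= \tfrac{1}{4}\,[H - K - iD,\; H - K + iD]
   = \tfrac{i}{2}\,[H - K,\, D]
   = \tfrac{i}{2}\,(-2iH - 2iK) = H + K = 2L_0, \\
[L_0, L_{\pm 1}] &= \tfrac{1}{4}\,[H + K,\; H - K \mp iD]
   = \tfrac{1}{4}\bigl( -2[H, K] \mp i[H, D] \mp i[K, D] \bigr) \\
  &= \tfrac{1}{4}\bigl( 2iD \mp 2H \pm 2K \bigr)
   = \mp \tfrac{1}{2}\,(H - K \mp iD) = \mp L_{\pm 1} .
\end{align*}
```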

The nature of these geometries can be illuminated by choosing coordinates such that $(X^0)^2 = 2K$ and $g_{i0} = 0$ for $i = 1, \ldots, N-1$. This is always locally possible away from the zeros of $D$. One then finds

$$ds^2 = (dX^0)^2 + (X^0)^2 g_{ij}(X^k)\, dX^i dX^j, \qquad D^a \partial_a = X^0 \frac{\partial}{\partial X^0}. \tag{2.16}$$
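The simplest illustration of (2.16), offered here as an example rather than taken from the text at this point, is flat $\mathbb{R}^N$ written as a cone over the unit sphere. The radial coordinate $r$ and the round metric $d\Omega_{N-1}^2$ are notation introduced only for this sketch:

```latex
\begin{align*}
ds^2 &= \delta_{ab}\, dX^a dX^b = dr^2 + r^2\, d\Omega_{N-1}^2,
  \qquad r^2 = \delta_{ab} X^a X^b, \\
D^a \partial_a &= X^a \partial_a = r\, \partial_r, \qquad
  \mathcal{L}_D \delta_{ab} = \partial_a D_b + \partial_b D_a = 2\delta_{ab}, \\
D_a\, dX^a &= X_a\, dX^a = d\!\left( \tfrac{1}{2} r^2 \right), \qquad
  K = \tfrac{1}{2} r^2, \qquad X^0 = \sqrt{2K} = r .
\end{align*}
```

Here $g_{ij}$ in (2.16) is the round metric on $S^{N-1}$, and replacing it by any other $(N-1)$-dimensional metric yields another geometry with a closed homothety.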

Hence, given any metric $g_{ij}$ in $N - 1$ dimensions, one can construct a geometry with a closed homothety in $N$ dimensions by dressing it with an extra radial dimension. Similar comments pertain to the potential $V$. An alternate useful choice is dilational coordinates, in which

$$D^a = \frac{2}{h} X^a, \tag{2.17}$$

where $h$ is an arbitrary constant. These are related to the coordinates in (2.16) by $X'^0 = (X^0)^{2/h}$, $X'^i = (X^0)^{2/h} X^i$. In such dilational coordinates one finds

$$X^a \partial_a g_{bc} = \frac{h}{2} \mathcal{L}_D g_{bc} - X^a{}_{,c}\, g_{ba} - X^a{}_{,b}\, g_{ac} = (h - 2) g_{bc}. \tag{2.18}$$
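A one-dimensional sketch makes the degree $h - 2$ in (2.18) concrete. It assumes the trivial cone $ds^2 = (dX^0)^2$ with $D = X^0\, \partial/\partial X^0$ and writes $X'$ for the dilational coordinate (the prime is notation assumed for this example):

```latex
\begin{align*}
X' = (X^0)^{2/h}
  &\;\Longrightarrow\; X^0 = X'^{\,h/2}, \qquad
  dX^0 = \tfrac{h}{2}\, X'^{\,h/2 - 1}\, dX', \\
ds^2 = (dX^0)^2 &= \tfrac{h^2}{4}\, X'^{\,h-2}\, (dX')^2, \qquad
  D = X^0 \frac{\partial}{\partial X^0} = \frac{2}{h}\, X' \frac{\partial}{\partial X'}, \\
X' \frac{\partial}{\partial X'}\, g'(X') &= (h - 2)\, g'(X'), \qquad
  g'(X') = \tfrac{h^2}{4}\, X'^{\,h-2} .
\end{align*}
```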

Hence in dilational gauge the metric components are homogeneous functions of degree h − 2. (It is not, however, the case that every homogeneous metric admits an SL(2, R) symmetry.) At this point h can be changed by transformations which take the coordinates to powers of themselves, and so has no coordinate independent meaning. However, it turns out that for N = 4B supersymmetry, a preferred value of h is obtained in quaternionic coordinates, when such coordinates exist and coincide with dilational gauge, as in the class of examples considered in Sect. 3.4.

In conclusion, the Hamiltonian (2.1) describes an $SL(2,\mathbb{R})$ invariant quantum mechanics if and only if the metric admits a closed homothety

$$\mathcal{L}_D g_{ab} = 2 g_{ab}, \qquad d(D_a\, dX^a) = 0, \tag{2.19}$$

under which the potential transforms according to (2.8).

3. The Supersymmetric Case

In the following we supersymmetrize the bosonic sigma model by extending the boson $X^a$ to the supermultiplet $(X^a, \lambda^a)$ with $\lambda^a = \lambda^{a\dagger}$. A number of other multiplets exist [13] which will not be considered in the following. Furthermore we will set the potential $V = 0$. An operator approach to a similar system can be found in [16].

3.1. N = 1B Poincaré supersymmetry and Osp(1|2) superconformal symmetry. Let us supersymmetrize the bosonic sigma-model (2.1) for $V = 0$ with $N$ fermions $\lambda^\alpha$, where $\alpha = 1, \ldots, N$ is a tangent space index. These obey the standard anticommutation relations

$$\{\lambda^\alpha, \lambda^\beta\} = \delta^{\alpha\beta}, \tag{3.1}$$

and of course commute with $P_a$ and $X^b$. It is convenient to make the field redefinitions

$$\lambda^a \equiv e^a{}_\alpha \lambda^\alpha, \qquad \Pi_a \equiv P_a - \tfrac{i}{2} \omega_{abc} \lambda^b \lambda^c + \tfrac{i}{2} c_{abc} \lambda^b \lambda^c, \tag{3.2}$$

where $\omega$ is related to the usual spin connection by $\omega_{abc} = \omega_a{}^\beta{}_\gamma\, e_{b\beta}\, e_c{}^\gamma$.$^{2}$ A supercharge can then be constructed as$^{3}$

$$Q = \lambda^a \Pi_a - \tfrac{i}{3} c_{abc} \lambda^a \lambda^b \lambda^c, \tag{3.3}$$

where $c$ is a 3-form, which at this point is arbitrary. A derivation of the supercharge from a supersymmetric Lagrangian is given in Appendix A. The supercharge obeys

$$\{Q, Q\} = 2H, \tag{3.4}$$

where the bosonic part of $H$ agrees with (2.1) for $V = 0$. We wish to extend this N = 1B Poincaré-superalgebra to the $Osp(1|2)$ superconformal algebra whose non-vanishing commutation relations are

$$\begin{aligned}
[H, K] &= -iD, & [H, D] &= -2iH, & [K, D] &= 2iK, \\
\{Q, Q\} &= 2H, & [Q, D] &= -iQ, & [Q, K] &= -iS, \\
\{S, S\} &= 2K, & [S, D] &= iS, & [S, H] &= iQ, \\
\{S, Q\} &= D.
\end{aligned} \tag{3.5}$$

$^{2}$ We note that $[\Pi_a, \Pi_b] = -\tfrac{1}{2} R^+_{abcd} \lambda^c \lambda^d$, where $R^+_{abcd}$ is constructed from the connection $\Gamma^b_{ac} + c^b{}_{ac}$; $[P_a, \lambda^b] = -i(\omega_a{}^b{}_c - \Gamma^b_{ac}) \lambda^c$ and $[\Pi_a, \lambda^b] = i(\Gamma^b_{ac} + c^b{}_{ac}) \lambda^c$, where $\Gamma$ is the Christoffel connection. The Hilbert space can be viewed as a spinor (as is seen by identifying Eq. (3.1) with the $\gamma$-matrix algebra) and $\Pi_a$ as the covariant derivative (with torsion $c$) on Hilbert space states.

$^{3}$ Despite the non-hermiticity of $\Pi_a$, this expression is hermitian with the indicated operator ordering.

As before, the bosonic subalgebra requires a closed homothety. The new supercharge can then be constructed as

$$S = i[Q, K] = \lambda^a D_a, \tag{3.6}$$

with $K$ given by (2.13). The $\{S, Q\}$ anticommutator is then used to find

$$\{S, Q\} = D = \tfrac{1}{2}(D^a \Pi_a + \text{h.c.}). \tag{3.7}$$
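As an illustration of (3.6) and (3.7), consider the flat-space case (an example chosen here; it assumes $g_{ab} = \delta_{ab}$, vanishing spin connection and torsion so that $\Pi_a = P_a$, and $D_a = X_a$). Then $Q = \lambda^a P_a$, $S = \lambda^a X_a$, and using $\{\lambda^a, \lambda^b\} = \delta^{ab}$ together with $[P_a, X^b] = -i\delta_a^b$:

```latex
\begin{align*}
\{S, Q\} &= \lambda^a \lambda^b\, X_a P_b + \lambda^b \lambda^a\, P_b X_a
   = \{\lambda^a, \lambda^b\}\, X_a P_b - i\, \delta_{ab}\, \lambda^b \lambda^a \\
  &= X^a P_a - \tfrac{iN}{2}
   = \tfrac{1}{2}\bigl( X^a P_a + P_a X^a \bigr)
   = \tfrac{1}{2}\bigl( D^a \Pi_a + \text{h.c.} \bigr) .
\end{align*}
```

The constant $-iN/2$ comes from $\sum_a (\lambda^a)^2 = N/2$ and is exactly what symmetrizing the operator ordering of $X^a P_a$ produces.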

Then, $[S, D] = iS$ is satisfied, but the $[Q, D]$ commutator is

$$[Q, D] = -iQ - i c_{abc} D^a \lambda^b P^c + O(\lambda^3). \tag{3.8}$$

Agreement with (3.5) then requires $c$ to be orthogonal to $D$:

$$D^a c_{abc} = 0. \tag{3.9}$$

Given (3.9), the full commutator becomes

$$[Q, D] = -iQ - \tfrac{1}{6} \lambda^a \lambda^b \lambda^c (\mathcal{L}_D - 2) c_{abc}. \tag{3.10}$$

We therefore demand that $c$ transform under dilations as

$$\mathcal{L}_D c_{abc} = 2 c_{abc}. \tag{3.11}$$

The remaining commutators in (3.5) then follow from the Jacobi identities, with no further constraints on the geometry. In summary, any N = 0 conformal quantum mechanics can be promoted to $Osp(1|2)$, but the torsion $c$ appearing in the supercharges must obey

$$D^a c_{abc} = 0, \qquad \mathcal{L}_D c_{abc} = 2 c_{abc}. \tag{3.12}$$
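To illustrate how a remaining commutator follows from the Jacobi identity, here is a sketch for $[S, H] = iQ$. It assumes only relations already established: $S = i[Q, K]$, $[H, Q] = 0$ (since $H = Q^2$ by (3.4)), $[H, K] = -iD$, and $[Q, D] = -iQ$:

```latex
\begin{align*}
0 &= [[Q, K], H] + [[K, H], Q] + [[H, Q], K]
   = [[Q, K], H] + [iD, Q] + 0, \\
[[Q, K], H] &= -i\,[D, Q] = -i\,(iQ) = Q, \\
[S, H] &= i\,[[Q, K], H] = iQ .
\end{align*}
```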

3.2. N = 2B Poincaré supersymmetry and SU(1, 1|1) superconformal symmetry. N = 2B supersymmetry requires a complex structure $I$ and a hermitian metric on the target space [13]. The relevant formulae are simplest in complex coordinates. However complex coordinates are less useful in the extension to the 4B case (which has an $SU(2)$ triplet of complex structures) considered in the next subsection. Accordingly we continue with real coordinates, but give the complex version in Appendix B. The second supercharge is given by

$$\tilde{Q} = \lambda^a I_a{}^b \Pi_b - \tfrac{i}{2} \lambda^a I_a{}^b c_{bcd} \lambda^c \lambda^d - \tfrac{i}{6} \lambda^a \lambda^b \lambda^c I_a{}^d I_b{}^e I_c{}^f c_{def} - \tfrac{i}{2} \lambda^a c_{abc} I^{bc}. \tag{3.13}$$

A derivation is given in Appendix B. Whereas $c$ is unconstrained for N = 1B, for N = 2B the vanishing of $\{\tilde{Q}, Q\}$ requires [13]

$$\nabla^+_{(b} I_{c)}{}^a = 0, \tag{3.14}$$

where the torsion connection $\nabla^+$ involves the Christoffel connection plus the torsion $c$ as $\Gamma^b_{ac} + c^b{}_{ac}$. In complex coordinates (3.14) can be solved for the (1, 2) part of $c$ as

$$c|_{(1,2)} = -\tfrac{i}{2} \bar{\partial} J, \tag{3.15}$$

with

$$J = \tfrac{1}{2} I_a{}^c g_{bc}\, dX^a \wedge dX^b. \tag{3.16}$$

The (0, 3) part of $c$ must be closed under $\bar{\partial}$ but is otherwise unconstrained, and the (2, 1) and (3, 0) parts are obtained by complex conjugation.

We wish to promote the N = 2B algebra to $SU(1,1|1)$. This involves an additional bosonic generator $R$ which is the generator of the R-symmetry group of the N = 2B subalgebra. The non-vanishing commutation relations are given by (3.5), an identical set of relations with both $Q$ and $S$ replaced by $\tilde{Q}$ and $\tilde{S}$, together with

$$\begin{aligned}
\{\tilde{Q}, S\} &= R, & \{\tilde{S}, Q\} &= -R, \\
[R, Q] &= -i\tilde{Q}, & [R, \tilde{Q}] &= iQ, \\
[R, S] &= -i\tilde{S}, & [R, \tilde{S}] &= iS.
\end{aligned} \tag{3.17}$$

As before, closure of the algebra requires that the geometry admit a closed homothety, as well as the constraints (3.12) on $c$. Commutation of the supercharges with $K$ leads to the superconformal charges

$$S = \lambda^a D_a, \qquad \tilde{S} = \lambda^a I_a{}^b D_b. \tag{3.18}$$

Obtaining the correct commutator $[D, \tilde{Q}] = i\tilde{Q}$ requires that the action of $D$ preserves the complex structure:

$$\mathcal{L}_D I_a{}^b = 0. \tag{3.19}$$

This is equivalent to the statement that $D$ acts holomorphically. Alternate forms of (3.19) are

$$D^f I_f{}^a c_{ade} I_b{}^d I_c{}^e = D^f I_f{}^a c_{abc}; \qquad \partial_{\bar{i}} D^j = 0. \tag{3.20}$$

It follows from (3.20), together with (3.9) and (3.15), that $\tilde{D}^a = D^b I_b{}^a$ generates a holomorphic isometry

$$\mathcal{L}_{\tilde{D}} I_a{}^b = 0, \qquad \mathcal{L}_{\tilde{D}} g_{ab} = 0, \tag{3.21}$$

as expected from $[R, H] = 0$. Moreover the (2, 1) part of the torsion $c$ is annihilated by $\mathcal{L}_{\tilde{D}}$ while the (3, 0) part has weight $-2i$. $R$ is determined from the commutator of $Q$ and $\tilde{S}$ as

$$R = \tilde{D}^a \Pi_a - i I_{ab} \lambda^a \lambda^b - i \tilde{D}^a c_{abc} \lambda^b \lambda^c, \tag{3.22}$$

where we used Eq. (3.20). One finds

$$[R, \lambda^a] = -i(I^a{}_b + \tilde{D}^a{}_{,b}) \lambda^b, \qquad [R, D^a] = -i \tilde{D}^b D^a{}_{,b}. \tag{3.23}$$

In complex coordinates and dilational gauge $hD^a = 2X^a$, when such coordinates exist, this reduces to

$$[R, \lambda^a] = -i\left(1 + \frac{2}{h}\right) \lambda^b I_b{}^a, \qquad [R, X^a] = -\frac{2i}{h} X^b I_b{}^a. \tag{3.24}$$

Notice that $R$ commutes with $\lambda$ in complex coordinates with $h = -2$. All the remaining commutators (3.17) and (3.5) are satisfied without any additional constraints.

In summary, there is an $SU(1,1|1)$ symmetry if and only if, in addition to the $Osp(1|2)$ constraints (2.19) and (3.12), and the N = 2B constraints, $D$ preserves the complex structure:

$$\mathcal{L}_D I_a{}^b = 0. \tag{3.25}$$

It further follows that $\tilde{D}^a = D^b I_b{}^a$ generates a holomorphic isometry.

3.3. N = 4B Poincaré supersymmetry and D(2, 1; α) superconformal symmetry.

Remarks on N = 4B Poincaré supersymmetry. Extending the algebra to include 4 supersymmetries requires 3 complex structures $I^r$, $r = 1, 2, 3$. With each $I^r$ one can associate a generalized exterior derivative

$$d^r = dX^a\, I^{r}{}_a{}^b\, \nabla^r_b \wedge, \tag{3.26}$$

where the connection $\Upsilon^r$ appearing in $\nabla^r$ is$^{4}$

$$\Upsilon^{r\,a}{}_{bc} = -I^{r}{}_d{}^a\, \partial_c I^{r}{}_b{}^d. \tag{3.27}$$

One of the conditions for N = 4 supersymmetry found in [13, 17] can be expressed as

$$\{d^r, d^s\} = 0. \tag{3.28}$$

These are the vanishings of the Nijenhuis tensors and concomitants.$^{5}$ In complex coordinates adapted to $I^r$, $\Upsilon^r$ vanishes and $d^r = i(\partial - \bar{\partial})$. Equation (3.28) further implies

$$\{d^r, d\} = 0. \tag{3.29}$$

Additional requirements for supersymmetry discussed in [13, 17] are

$$g_{ab} = I^{r}{}_a{}^c I^{r}{}_b{}^d g_{cd} \quad (\forall\, r), \tag{3.30}$$

$$\{I^r, I^s\} = -2\delta^{rs}, \tag{3.31}$$

$$\partial_{[a}\bigl(I^{r}{}_b{}^e c_{|e|cd]}\bigr) - 2 I^{r}{}_{[a}{}^e\, \partial_{[e} c_{bcd]]} = 0, \tag{3.32}$$

$$\nabla^+_{(b} I^{r}{}_{c)}{}^a = 0. \tag{3.33}$$

In this last equation, we used the covariant derivative with torsion $\nabla^+$ defined just below Eq. (3.14).

$^{4}$ $\Upsilon^r$ defined in this way gives a connection acting on forms as described but not on general tensors.

$^{5}$ So, Theorem 3.9 of [18] implies that the vanishing of any two of these equations yields the vanishing of all six.

The commutators of $I^r$ are related to the R-symmetry group. We shall consider the $SU(2)$ case$^{6}$

$$[I^r, I^s] = 2\epsilon^{rst} I^t. \tag{3.34}$$

This case is sometimes referred to as N = 4B supersymmetry, and arises in the reduction of (0, 4) supersymmetry from two dimensions. We now show, defining the two-forms

$$J^r = \tfrac{1}{2} I^{r}{}_a{}^c g_{bc}\, dX^a \wedge dX^b, \tag{3.35}$$

that the necessary and sufficient conditions for N = 4B supersymmetry can be recast in the simpler form

$$\{d^r, d^s\} = 0, \tag{3.36}$$

$$g_{ab} = I^{r}{}_a{}^c I^{r}{}_b{}^d g_{cd} \quad (\forall\, r), \tag{3.37}$$

$$I^r I^s = -\delta^{rs} + \epsilon^{rst} I^t, \tag{3.38}$$

$$d^1 J^1 = d^2 J^2 = d^3 J^3. \tag{3.39}$$

Note that the last two conditions (3.32) and (3.33), which involve the torsion $c$, have been replaced by the condition (3.39), which is independent of $c$. Let us write the torsion appearing in (3.33) as

$$c = \tfrac{1}{2} d^3 J^3 + e \tag{3.40}$$

for some three-form $e$. It can be checked that the torsion connection with $e$ set to zero is the unique such connection annihilating $I^3$, and therefore has holonomy contained in $U(N/2)$. It follows that, in complex coordinates adapted to $I^3$, the condition (3.33) for $r = 3$ reduces to

$$e_{i\bar{j}\bar{k}} = e_{ij\bar{k}} = 0. \tag{3.41}$$

(This is the argument that led to Eq. (3.15).) On the other hand, adding the $r = 1$ plus or minus $i$ times the $r = 2$ component of (3.33) yields

$$e_{ijk} = e_{\bar{i}\bar{j}\bar{k}} = 0. \tag{3.42}$$

We conclude that $e = 0$ and $c = \tfrac{1}{2} d^3 J^3$. By symmetry we must also have $c = \tfrac{1}{2} d^1 J^1$ and $c = \tfrac{1}{2} d^2 J^2$, from which (3.39) follows. Conversely, given (3.39), adding the torsion $c = \tfrac{1}{2} d^3 J^3$ to the Christoffel connection implies (3.33). It can be further checked that this choice of $c$ satisfies (3.32). This single choice of torsion connection annihilates all three complex structures:

$$\nabla^+_b I^{r}{}_c{}^a = 0. \tag{3.43}$$

In fact the condition (3.43) is equivalent to (3.39). It differs from (3.33) by the absence of symmetrization but is nevertheless equivalent for N = 4B. Equation (3.43) is referred to in [17] as the weak HKT (hyperkähler with torsion) condition. We have shown that N = 4B (which includes the condition (3.38)) implies weak HKT.

$^{6}$ We have employed an obvious summation convention in this equation. We hope that it will be clear from the context when repeated indices should or should not be summed.

Extension to D(2, 1; α) superconformal symmetry. We now turn to superconformal symmetry. It turns out that the relevant supergroup is $D(2,1;\alpha)$, where the parameter $\alpha \neq -1$ will be determined by the geometry. In order to write down the commutators, it is convenient to define the four-component supercharges $Q^m = (Q^r, Q)$ and $S^m = (S^r, S)$ for $m = 1, 2, 3, 4$; these transform in the (2, 2) of the $SU(2) \times SU(2)$ R-symmetry group of N = 4B. Operators $Q^m$, $S^m$, $H$, $D$, $K$ and $R^r_\pm$ (to be described) then comprise the $D(2,1;\alpha)$ algebra. The non-vanishing commutators are

$$\begin{aligned}
[H, K] &= -iD, & [H, D] &= -2iH, & [K, D] &= 2iK, \\
\{Q^m, Q^n\} &= 2H\delta^{mn}, & [Q^m, D] &= -iQ^m, & [Q^m, K] &= -iS^m, \\
\{S^m, S^n\} &= 2K\delta^{mn}, & [S^m, D] &= iS^m, & [S^m, H] &= iQ^m, \\
[R^r_\pm, Q^m] &= i t^{\pm r}_{mn} Q^n, & [R^r_\pm, S^m] &= i t^{\pm r}_{mn} S^n, & [R^r_\pm, R^s_\pm] &= i\epsilon^{rst} R^t_\pm, \\
\{S^m, Q^n\} &= D\delta^{mn} - \frac{4\alpha}{1+\alpha}\, t^{+r}_{mn} R^r_+ - \frac{4}{1+\alpha}\, t^{-r}_{mn} R^r_- .
\end{aligned} \tag{3.44}$$

The $t^\pm$ matrices defined by

$$t^{\pm r}_{mn} \equiv \mp \delta^r_{[m} \delta^4_{n]} + \tfrac{1}{2} \epsilon_{rmn} \tag{3.45}$$

obey

$$[t^{+r}, t^{-s}] = 0, \qquad [t^{\pm r}, t^{\pm s}] = -\epsilon^{rst} t^{\pm t}, \qquad \{t^{\pm r}, t^{\pm s}\} = -\tfrac{1}{2} \delta^{rs}. \tag{3.46}$$
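The relations (3.46) can be checked numerically from the definition (3.45). The sketch below is an illustration only; it assumes that the antisymmetrization bracket carries weight 1/2, that $\epsilon_{rmn}$ vanishes whenever an index equals 4, and that the right-hand side of the anticommutator is understood to multiply the $4 \times 4$ identity. All function names are hypothetical helpers introduced for this example.

```python
from fractions import Fraction
from itertools import product


def eps3(r, m, n):
    """Levi-Civita symbol on indices 1..3; zero if any index is 4 or repeated."""
    if 4 in (r, m, n) or len({r, m, n}) < 3:
        return 0
    return 1 if (r, m, n) in ((1, 2, 3), (2, 3, 1), (3, 1, 2)) else -1


def t_matrix(sign, r):
    """4x4 matrix t^{(sign) r}_{mn} of Eq. (3.45); sign is +1 or -1."""
    half = Fraction(1, 2)

    def entry(m, n):
        # delta^r_[m delta^4_n] with weight-1/2 antisymmetrization (assumed)
        antisym = half * ((m == r) * (n == 4) - (n == r) * (m == 4))
        return -sign * antisym + half * eps3(r, m, n)

    return [[entry(m, n) for n in range(1, 5)] for m in range(1, 5)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def lincomb(*terms):
    """Entrywise sum of (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]


ZERO = [[0] * 4 for _ in range(4)]
IDENT = [[int(i == j) for j in range(4)] for i in range(4)]


def check_t_algebra():
    """Return True if the matrices of (3.45) satisfy all relations in (3.46)."""
    t = {(s, r): t_matrix(s, r) for s in (+1, -1) for r in (1, 2, 3)}
    for r, s in product((1, 2, 3), repeat=2):
        a, b = t[(+1, r)], t[(-1, s)]
        if lincomb((1, matmul(a, b)), (-1, matmul(b, a))) != ZERO:
            return False
        for sign in (+1, -1):
            a, b = t[(sign, r)], t[(sign, s)]
            comm = lincomb((1, matmul(a, b)), (-1, matmul(b, a)))
            anti = lincomb((1, matmul(a, b)), (1, matmul(b, a)))
            want_comm = lincomb(*[(-eps3(r, s, u), t[(sign, u)])
                                  for u in (1, 2, 3)])
            want_anti = lincomb((Fraction(-1, 2) * (r == s), IDENT))
            if comm != want_comm or anti != want_anti:
                return False
    return True
```

Calling `check_t_algebra()` exercises all index combinations; exact rational arithmetic via `Fraction` avoids any floating-point tolerance.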

Notice that when $\alpha = 0$ or $\alpha = \infty$, one of the two $SU(2)$s can be decoupled, and there is an $SU(1,1|2)$ subalgebra. Since $D(2,1;\alpha)$ has three $SU(1,1|1)$ and one N = 4B subalgebra, the (previously discussed) conditions on the geometry for the existence of those subgroups can all be assumed. In particular, $D$ must now be holomorphic with respect to all three complex structures:

$$\mathcal{L}_D I^{r}{}_a{}^b = 0. \tag{3.47}$$

Expressions for $Q^r$ and $S^r$ are then of the $SU(1,1|1)$ forms (3.13) and (3.18) with $I$ replaced by $I^r$. Somewhat lengthy expressions for $R^r_\pm$ as a function of $\alpha$ then follow from linear combinations of $\{Q^m, S^n\}$ anticommutators as determined by (3.44).$^{7}$ Obtaining properly normalized $SU(2)$ algebras for the operators $R^r_\pm$ so determined requires

$$[\mathcal{L}_{D^r}, \mathcal{L}_{D^s}] = \frac{4}{h} \epsilon^{rst} \mathcal{L}_{D^t}, \tag{3.48}$$

with $D^r = D^a I^{r}{}_a{}^b \partial_b$ and

$$h = -2\alpha - 2. \tag{3.49}$$

Equation (3.48) can be taken as the definition of the constant $h$.$^{8}$ Since the normalization of $D^r$ is fixed in terms of $D$, $h$ is a coordinate-invariant parameter associated to the geometry.

$^{7}$ In principle, we should treat $\alpha = 0$ or $\infty$ as special cases. In fact, $\alpha = \infty$ cannot be realized with the supermultiplet we are considering. For $\alpha = 0$, the logic is slightly different but the results are the same.

$^{8}$ Note that the two excluded values $\alpha = -1$ and $\alpha = \infty$ correspond respectively to $h = 0$ and $h = \infty$, for which the algebra (3.48) is clearly singular.

Reproducing the proper $[R, Q]$ commutators leads to the stronger requirement

$$\mathcal{L}_{D^r} I^{s}{}_a{}^b = \frac{4}{h} \epsilon^{rst} I^{t}{}_a{}^b. \tag{3.50}$$

In fact (3.50) (including $r = s$) implies both (3.48) and (3.47). Using (3.50) one then finds that $R^r_\pm$ are given by

$$R^r_- = -\frac{h}{4} D^{ra} \Pi_a + i\, \frac{h-2}{8}\, \lambda^a I^{r}{}_a{}^b \lambda_b + i\, \frac{h}{4}\, \lambda^a \lambda^b D^{rc} c_{cab}, \tag{3.51}$$

$$R^r_+ = \frac{i}{4} \lambda^a I^{r}{}_a{}^b \lambda_b. \tag{3.52}$$

The torsion $c$ can be eliminated from (3.51) using the identity $D^{rc} c_{cab} = \tfrac{1}{2} (d^r dK)_{ab} - J^r_{ab}$. Using the Jacobi identity, the remaining commutators follow with no further constraints on the geometry. We note that Eqs. (3.50) and (3.43) imply

$$2(h+2) J^r = h\left(d^r dK - \tfrac{1}{2} \epsilon^{rst} d^s d^t K\right), \qquad 4(h+2)\, c = -h\, d^1 d^2 d^3 K. \tag{3.53}$$

Properties of $d^r$ and $d$ then imply Eq. (3.39), which thus need not be taken as a further condition. We also find, in quaternionic coordinates and dilational gauge, when such coordinates exist, that

$$\begin{aligned}
[R^r_-, \lambda^a] &= 0, & [R^r_-, X^a] &= \tfrac{i}{2} X^b I^{r}{}_b{}^a, \\
[R^r_+, \lambda^a] &= \tfrac{i}{2} \lambda^b I^{r}{}_b{}^a, & [R^r_+, X^a] &= 0.
\end{aligned} \tag{3.54}$$

In summary, a quantum mechanical theory has N = 4B supersymmetry if and only if the complex structure and metric obey Eqs. (3.36)–(3.39). The torsion $c$ is then uniquely determined as

$$c = \tfrac{1}{2} d^3 J^3. \tag{3.55}$$

A $D(2,1;\alpha)$ symmetry arises if and only if in addition there is a vector field $D$ obeying

$$\begin{aligned}
\mathcal{L}_D g_{ab} &= 2 g_{ab}, & d(D_a\, dX^a) &= 0, \\
\mathcal{L}_{D^r} I^{s}{}_a{}^b &= \frac{4}{h} \epsilon^{rst} I^{t}{}_a{}^b, & \mathcal{L}_{D^r} g_{ab} &= 0,
\end{aligned} \tag{3.56}$$

where $D^{rb} = D^a I^{r}{}_a{}^b$ and $h$ is a constant characterizing the geometry. The parameter $\alpha$ in the superconformal algebra is related to the constant $h$ in (3.56) by

$$\alpha = -\frac{h+2}{2}. \tag{3.57}$$
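For orientation, the relation (3.57) is simply the inverse of (3.49). The following short tabulation (a sketch added for illustration) lists the special values of $h$ that are singled out elsewhere in the text:

```latex
\begin{align*}
\alpha = -\frac{h+2}{2} \;\Longleftrightarrow\; h = -2\alpha - 2, \qquad
h = 2 &\;\Rightarrow\; \alpha = -2, \\
h = -2 &\;\Rightarrow\; \alpha = 0, \\
h \to 0 &\;\Rightarrow\; \alpha \to -1 \quad (\text{excluded}).
\end{align*}
```

The value $h = 2$ is the flat-space case with $D(2,1;-2)$, while $h = -2$ gives $\alpha = 0$, for which one of the two $SU(2)$s decouples.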

3.4. Examples of D(2, 1; α) Quantum Mechanics. In this subsection, we show that a large class of examples of quantum mechanical systems with $D(2,1;\alpha)$ symmetry (and an integrable quaternionic structure) can be constructed from a potential $L$. In an N = 2 superspace formalism (not described here, but similar to the ones in [19, 20]) $L$ turns out to be the superspace integrand.

$\mathbb{R}^4$ has an obvious $SU(2)$ triplet of complex structures associated to self-dual two-forms obeying (3.38). Let $I^r$ be the generalizations to $\mathbb{R}^{4N}$. We may then define a triplet of fundamental two-forms by

$$J^r = \tfrac{1}{8}\left(2 d^r dL - \epsilon^{rst} d^s d^t L\right). \tag{3.58}$$

It follows immediately from this definition and $\{d^r, d^s\} = 0$ that the $J^r$ obey (3.39). Moreover the associated metric $g_{ab} = I^{r}{}_b{}^c J^r_{ac}$ ($\forall\, r$) can be written (in a coordinate system in which the $I^r$ are constant)

$$g_{ab} = \tfrac{1}{4}\left(\delta_a^c \delta_b^d + I^{r}{}_a{}^c I^{r}{}_b{}^d\right) \partial_c \partial_d L. \tag{3.59}$$

This expression is manifestly hermitian. In other words, for any $L$ we can construct an N = 4B quantum mechanics.$^{9}$ It is natural to ask whether or not every weak HKT geometry is described by some potential $L$. This is related to the integrability of the quaternionic structure, as discussed in Appendix C.

The full $D(2,1;\alpha)$ symmetry follows by imposing

$$X^a \partial_a L = hL, \tag{3.60}$$

where $h$ is an arbitrary constant, and

$$X^a I^{r}{}_a{}^b \partial_b L = 0. \tag{3.61}$$

The first condition implies that $L$ is a homogeneous function of degree $h$ on $\mathbb{R}^{4N}$, while the second states that it is invariant under $SU(2)$ R-symmetry rotations. These conditions manifestly ensure the existence of the required homothety

$$D^a \partial_a = \frac{2}{h} X^a \partial_a, \tag{3.62}$$

as well as the $SU(2)$ isometries. Remarkably, it follows from (3.60) and (3.61) with a little algebra that $D$ is automatically a closed homothety,

$$D_a\, dX^a = \frac{h+2}{2h} (\partial_a L)\, dX^a. \tag{3.63}$$

As discussed in Sect. 2 this implies the existence of special conformal transformations generated by

$$K = \tfrac{1}{2} g_{ab} D^a D^b = \frac{h+2}{2h} L. \tag{3.64}$$

In fact, all the requirements of (3.56) are automatically satisfied with these conditions, and so indeed the full $D(2,1;\alpha = -\tfrac{h+2}{2})$ algebra is obtained.

$^{9}$ Although one may wish in addition to impose positivity of the metric $g$, which further constrains $L$.
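The "little algebra" behind (3.63) and (3.64) can be sketched as follows. The sketch assumes coordinates in which the $I^r$ are constant, uses the metric (3.59) with the sum over $r$ written explicitly, and uses $(I^r)^2 = -1$ for each $r$, so that $\sum_r I^{r}{}_a{}^c I^{r}{}_c{}^d = -3\delta_a^d$:

```latex
\begin{align*}
X^b \partial_b \partial_a L &= (h-1)\, \partial_a L
  && \text{(derivative of (3.60))}, \\
X^b I^{r}{}_b{}^d\, \partial_c \partial_d L &= -I^{r}{}_c{}^d\, \partial_d L
  && \text{(derivative of (3.61), constant } I^r\text{)}, \\
D_a = g_{ab} D^b
  &= \frac{2}{h} \cdot \frac{1}{4} \Bigl[ (h-1)\, \partial_a L
     - \sum_r I^{r}{}_a{}^c I^{r}{}_c{}^d\, \partial_d L \Bigr]
   = \frac{2}{h} \cdot \frac{1}{4}\, (h - 1 + 3)\, \partial_a L
   = \frac{h+2}{2h}\, \partial_a L, \\
K = \tfrac{1}{2} D_a D^a
  &= \frac{1}{2} \cdot \frac{h+2}{2h} \cdot \frac{2}{h}\, X^a \partial_a L
   = \frac{h+2}{2h}\, L .
\end{align*}
```

Since $D_a$ is proportional to $\partial_a L$ with a constant coefficient, the one-form $D_a\, dX^a$ is exact, which is the closure property required in Sect. 2.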

The conditions (3.60) and (3.61) are sufficient but not necessary to ensure $D(2,1;\alpha)$ invariance. More generally one could add to the right hand side anything which is in the kernel of the second-order differential operator in (3.58). This is especially relevant for the interesting case $h = -2$, for which Eqs. (3.63) and (3.64) show that the metric is otherwise degenerate. An example of this will appear in [15].

The simplest case is

$$L = \tfrac{1}{2} \delta_{ab} X^a X^b, \tag{3.65}$$

where $a, b = 1, \ldots, 4N$. This has $h = 2$. The metric is then simply the flat metric on $\mathbb{R}^{4N}$,

$$ds^2 = \delta_{ab}\, dX^a dX^b, \tag{3.66}$$

while the torsion $c$ vanishes. The generators of $D(2,1;-2) \cong Osp(4|2)$ are then

$$\begin{aligned}
H &= \tfrac{1}{2} P^a P_a, & K &= \tfrac{1}{2} X^a X_a, & D &= X^a P_a, \\
Q &= \lambda^a P_a, & Q^r &= \lambda^a I^{r}{}_a{}^b P_b, \\
S &= \lambda^a X_a, & S^r &= \lambda^a I^{r}{}_a{}^b X_b, \\
R^r_- &= -\tfrac{1}{2} X^a I^{r}{}_a{}^b P_b, & R^r_+ &= \tfrac{i}{4} \lambda^a I^{r}{}_a{}^b \lambda_b.
\end{aligned} \tag{3.67}$$
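The flat example can be checked concretely for a single copy of $\mathbb{R}^4$. The sketch below picks one explicit triplet of constant complex structures (an illustrative choice made here, not taken from the paper), verifies that they obey the quaternion algebra (3.38), and confirms that the metric formula (3.59), with the sum over $r$ made explicit, returns the flat metric (3.66) when the Hessian of $L$ is the identity, as it is for (3.65). The matrix entry `I[a][b]` stands for $I_a{}^b$; all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# One explicit choice of constant complex structures on R^4 (assumed for
# illustration); entry [a][b] plays the role of I_a^b.
I1 = [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]
I2 = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
I3 = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]
STRUCTURES = {1: I1, 2: I2, 3: I3}
IDENT = [[int(i == j) for j in range(4)] for i in range(4)]


def eps3(r, s, t):
    """Levi-Civita symbol on indices 1..3."""
    if len({r, s, t}) < 3:
        return 0
    return 1 if (r, s, t) in ((1, 2, 3), (2, 3, 1), (3, 1, 2)) else -1


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satisfies_quaternion_algebra():
    """Check I^r I^s = -delta^{rs} + eps^{rst} I^t, Eq. (3.38)."""
    for r, s in product((1, 2, 3), repeat=2):
        lhs = matmul(STRUCTURES[r], STRUCTURES[s])
        rhs = [[-(r == s) * IDENT[i][j]
                + sum(eps3(r, s, t) * STRUCTURES[t][i][j] for t in (1, 2, 3))
                for j in range(4)] for i in range(4)]
        if lhs != rhs:
            return False
    return True


def metric_from_hessian(hessian):
    """g_ab = 1/4 (delta delta + sum_r I^r I^r) d_c d_d L, Eq. (3.59)."""
    g = [[Fraction(0)] * 4 for _ in range(4)]
    for a, b in product(range(4), repeat=2):
        total = hessian[a][b]
        for I in STRUCTURES.values():
            total += sum(I[a][c] * I[b][d] * hessian[c][d]
                         for c in range(4) for d in range(4))
        g[a][b] = Fraction(total, 4)
    return g
```

For $L = \tfrac{1}{2} \delta_{ab} X^a X^b$ the Hessian is `IDENT`, so `metric_from_hessian(IDENT)` reproduces the flat metric; each of the three complex structures contributes one copy of $\delta_{ab}$, which together with the first term accounts for the factor $\tfrac{1}{4}$.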

Acknowledgements. We have benefitted from useful conversations with R. Britto-Pacumio, J. Gutowski, J. Maldacena, A. Maloney, M. Spalinski, M. Spradlin, P. Townsend, A. Volovich and especially G. Papadopoulos. This work was supported in part by an NSERC PGS B Scholarship and DOE grant DE-FGO2-91ER40654.

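Appendix A. Lagrangian Derivation of the Supercharges

In this section we derive the supercharges used in the body of the text, from the component action [13, 17]

$$S = \int dt \left[ \tfrac{1}{2} g_{ab} \dot{X}^a \dot{X}^b + \tfrac{i}{2} \lambda^a \left( g_{ab} \frac{D\lambda^b}{dt} - \dot{X}^c c_{abc} \lambda^b \right) - \tfrac{1}{6} \partial_d c_{abc}\, \lambda^d \lambda^a \lambda^b \lambda^c \right], \tag{A.1}$$

where the covariant derivative is

$$\frac{D\lambda^b}{dt} \equiv \dot{\lambda}^b + \dot{X}^c\, \Gamma^b_{cd}\, \lambda^d, \tag{A.2}$$

with $\Gamma$ the Christoffel connection, and we use dots to denote time derivatives. Although we have, for ease of manipulation, written the fermions with spacetime indices, in deriving commutators it is better to use $\lambda^\alpha$, where $\alpha$ is a tangent index, because, unlike $\lambda^a$, it will commute with the momentum conjugate to $X$. In terms of $\lambda^\alpha$, the kinetic term for the fermions is

$$\tfrac{i}{2} g_{ab} \lambda^a \frac{D\lambda^b}{dt} = \tfrac{i}{2} \left( \delta_{\alpha\beta} \lambda^\alpha \dot{\lambda}^\beta + \dot{X}^c \omega_{c\alpha\beta} \lambda^\alpha \lambda^\beta \right), \tag{A.3}$$

and the momentum conjugate to $X$ is

$$P_a = g_{ab} \dot{X}^b + \tfrac{i}{2} (\omega_{abc} - c_{abc}) \lambda^b \lambda^c, \tag{A.4}$$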

or, using the definition in (3.2),

$$\Pi_a = g_{ab} \dot{X}^b. \tag{A.5}$$

The action (A.1) is invariant under the supersymmetry transformation

$$\delta_\epsilon X^a = -i\epsilon \lambda^a, \qquad \delta_\epsilon \lambda^a = \epsilon \dot{X}^a, \tag{A.6}$$

where $\epsilon$ is a real anticommuting parameter. Note that

$$[\delta_\epsilon, \delta_\eta] = -2i\eta\epsilon \frac{d}{dt}, \tag{A.7}$$

as required of a supersymmetry transformation. It is straightforward to compute the Noether charge corresponding to this symmetry; we find

$$Q = \lambda^a \Pi_a - \tfrac{i}{3} c_{abc} \lambda^a \lambda^b \lambda^c, \tag{A.8}$$

which is the origin of Eq. (3.3). Actually, the Noether procedure determines the charge only up to operator ordering. We have fixed this ambiguity by demanding hermiticity and target space covariance.

Appendix B. N = 2B Supersymmetry in Complex Coordinates

In this appendix we revisit the N = 2B supersymmetry of Sect. 3.2 in complex coordinates, which simplifies the formulae and calculations. Equation (3.15) for the (1, 2) part of the torsion is

$$c_{i\bar{j}\bar{k}} = g_{i[\bar{j},\bar{k}]}. \tag{B.1}$$

The (3, 0) part of the torsion is constrained by the relation

$$c_{[ijk,l]} = 0. \tag{B.2}$$

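Identities required of $D^a$ are

$$\begin{aligned}
D^k g_{i\bar{j},k} + D^k{}_{,i}\, g_{k\bar{j}} &= g_{i\bar{j}}, \\
D^i c_{ijk} &= 0, \\
D^{\bar{i}} c_{\bar{i}jk} &= 0, \\
D^{\bar{i}} c_{\bar{i}\bar{j}k} &= -D^i c_{i\bar{j}k}.
\end{aligned} \tag{B.3}$$

It is convenient to define a complex supercharge

$$\mathcal{Q} = \tfrac{1}{2}(Q - i\tilde{Q}), \qquad \bar{\mathcal{Q}} = \tfrac{1}{2}(Q + i\tilde{Q}). \tag{B.4}$$

$\mathcal{Q}$ can be determined by the requirement $Q = \mathcal{Q} + \bar{\mathcal{Q}}$ together with

$$\{\mathcal{Q}, \lambda^k\} = 0. \tag{B.5}$$

Equation (B.5) is a manifestation of the separation of the holomorphic and antiholomorphic parts of the theory. One finds

$$\mathcal{Q} = \lambda^i \Pi_i - i c_{ij\bar{k}} \lambda^i \lambda^j \lambda^{\bar{k}} - \tfrac{i}{3} c_{ijk} \lambda^i \lambda^j \lambda^k - i c_j{}^j{}_k \lambda^k. \tag{B.6}$$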

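Using

$$(\lambda^k \Pi_k)^\dagger = \lambda^{\bar{k}} \Pi_{\bar{k}}, \tag{B.7}$$

one finds the hermitian conjugate is

$$\bar{\mathcal{Q}} = \lambda^{\bar{i}} \Pi_{\bar{i}} - i c_{\bar{i}\bar{j}k} \lambda^{\bar{i}} \lambda^{\bar{j}} \lambda^k - \tfrac{i}{3} c_{\bar{i}\bar{j}\bar{k}} \lambda^{\bar{i}} \lambda^{\bar{j}} \lambda^{\bar{k}} - i c_j{}^j{}_{\bar{k}} \lambda^{\bar{k}}. \tag{B.8}$$

After reordering the operators these expressions agree with that for $\tilde{Q}$ in the text. It is straightforward, though quite tedious, to obtain this expression for $\tilde{Q}$ as the (hermitian) Noether charge for the second supersymmetry

$$\tilde{\delta}_\epsilon X^a = -i\epsilon I_b{}^a \lambda^b, \qquad \tilde{\delta}_\epsilon \lambda^a = -\epsilon \left[ I_b{}^a \dot{X}^b - i\lambda^c (\partial_c I_b{}^a) \lambda^b \right]. \tag{B.9}$$

We note that the extra term in the transformation of $\lambda$ not only appears naturally from the N = 1 superspace formulation of [13], but also is necessary to obtain the algebra

$$[\tilde{\delta}_\epsilon, \delta_\eta] = 0, \qquad [\tilde{\delta}_\epsilon, \tilde{\delta}_\eta] = -2i\eta\epsilon \frac{d}{dt}, \tag{B.10}$$

for $I$ a complex structure with vanishing Nijenhuis tensor. Note that Eq. (B.9) implies Eq. (B.5).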

Appendix C. More on the Geometry of Weak HKT Manifolds

In this appendix we will show that, given a weak HKT manifold with integrable complex structures, we can find a potential L. We will prove this shortly, but first we should elaborate on the assumption that the quaternionic structure is integrable. It is well known that, for almost complex manifolds, the almost complex structure is integrable (that is, there exists a coordinate system in which the components of the almost complex structure are constant) if and only if the Nijenhuis tensor vanishes. The analogous statement is not true for quaternionic manifolds; rather, the vanishing of the six Nijenhuis concomitants on an almost quaternionic manifold only guarantees the integrability of any one complex structure. To see this (see also [21]), suppose that we work in a complex coordinate system adapted to $I^3$. Then $I^1$ and $I^2$ have only mixed indices (i.e., as forms they are (2, 0) ⊕ (0, 2) forms). Now consider the connection [22, 18]$^{10}$

$$C^k_{ij} = I^1{}_{\bar{l}}{}^{k}\,\partial_i I^1{}_j{}^{\bar{l}} \quad \text{and c.c.} \tag{C.1}$$

The vanishing of the Nijenhuis tensor implies that $C^k_{ij}$ is actually a symmetric connection. Furthermore, this connection vanishes in a basis (if one exists) in which $I^1$ and $I^3$ are simultaneously constant, and so its curvature tensor vanishes in such a basis. Thus, a necessary condition for integrability of the quaternionic structure is the vanishing of the curvature associated with the connection (C.1). Obata [22] has shown that this is also a sufficient condition.

10 This is not identical to Eq. (3.27). Equation (3.27) was written in a general coordinate system, whereas Eq. (C.1) is written in coordinates adapted to $I^3$. Thus, C is a connection provided one restricts oneself to holomorphic coordinate transformations, for C depends implicitly on $I^3$, while the connection of Eq. (3.27) did not.


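If we assume integrability of the quaternionic structure, then we can, without loss of generality, work in a basis in which the complex structures are given by

$$\begin{aligned}
I^1 &= i\,d\bar{w}^A \otimes \frac{\partial}{\partial z^A} - i\,d\bar{z}^A \otimes \frac{\partial}{\partial w^A} - i\,dw^A \otimes \frac{\partial}{\partial \bar{z}^A} + i\,dz^A \otimes \frac{\partial}{\partial \bar{w}^A},\\
I^2 &= d\bar{w}^A \otimes \frac{\partial}{\partial z^A} - d\bar{z}^A \otimes \frac{\partial}{\partial w^A} + dw^A \otimes \frac{\partial}{\partial \bar{z}^A} - dz^A \otimes \frac{\partial}{\partial \bar{w}^A},\\
I^3 &= i\,dz^A \otimes \frac{\partial}{\partial z^A} + i\,dw^A \otimes \frac{\partial}{\partial w^A} - i\,d\bar{z}^A \otimes \frac{\partial}{\partial \bar{z}^A} - i\,d\bar{w}^A \otimes \frac{\partial}{\partial \bar{w}^A},
\end{aligned} \tag{C.2}$$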
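As a sanity check on Eq. (C.2), the following minimal sketch verifies the quaternion relations for the smallest case N/4 = 1. It assumes the component convention $I = dx^a \otimes \partial_b\, I_a{}^b$ and multiplies complex structures as matrices of these components; the basis ordering and the helper names (`tensor`, `matmul`, `scale`) are illustrative choices rather than anything fixed by the text.

```python
# Basis order for one quaternionic dimension: (z, w, zbar, wbar).
Z, W, ZB, WB = range(4)


def tensor(entries):
    """Component matrix m[a][b] of the sum of coeff * dx^a (x) d/dx^b."""
    m = [[0j] * 4 for _ in range(4)]
    for a, b, coeff in entries:
        m[a][b] = coeff
    return m


def matmul(p, q):
    """Matrix product of two 4x4 component matrices."""
    return [[sum(p[a][b] * q[b][c] for b in range(4)) for c in range(4)]
            for a in range(4)]


def scale(s, p):
    """Multiply every component by the scalar s."""
    return [[s * x for x in row] for row in p]


# Components read off from Eq. (C.2): (form index, vector index, coefficient).
I1 = tensor([(WB, Z, 1j), (ZB, W, -1j), (W, ZB, -1j), (Z, WB, 1j)])
I2 = tensor([(WB, Z, 1), (ZB, W, -1), (W, ZB, 1), (Z, WB, -1)])
I3 = tensor([(Z, Z, 1j), (W, W, 1j), (ZB, ZB, -1j), (WB, WB, -1j)])

IDENTITY = tensor([(a, a, 1) for a in range(4)])
MINUS_IDENTITY = scale(-1, IDENTITY)
```

With this (assumed) ordering of the product, each structure squares to minus the identity and $I^1 I^2 = I^3 = -I^2 I^1$; the opposite composition convention would flip the sign of the last relation.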

where we have split up the complex coordinates into two sets $(z^A, w^A)$, $A = 1, \dots, N/4$. Hermiticity of the metric with respect to $I^1$ (we do not get any additional information from $I^2$) implies that

$$g_{z^A \bar{z}^B} = g_{w^B \bar{w}^A}; \qquad g_{z^A \bar{w}^B} = -g_{z^B \bar{w}^A} = g_{z^{[A} \bar{w}^{B]}}. \tag{C.3}$$

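The condition that $d^1 J^1 = d^3 J^3$ then becomes$^{11}$

$$\begin{aligned}
g_{z^{[A} \bar{w}^{B}, \bar{w}^{C]}} &= 0, & g_{w^{[A} \bar{z}^{B}, w^{C]}} &= 0,\\
g_{w^{[A} \bar{z}^{B}, \bar{z}^{C]}} &= 0, & g_{z^{[A} \bar{w}^{B}, z^{C]}} &= 0,\\
g_{z^{[A} \bar{z}^{|C|}, \bar{w}^{B]}} - \tfrac{1}{2}\, g_{z^A \bar{w}^B, \bar{z}^C} &= 0, & g_{z^A \bar{z}^{[B}, w^{C]}} + \tfrac{1}{2}\, g_{w^B \bar{z}^C, z^A} &= 0,\\
g_{z^A \bar{z}^{[B}, \bar{z}^{C]}} - \tfrac{1}{2}\, g_{w^B \bar{z}^C, \bar{w}^A} &= 0, & g_{z^{[A} \bar{z}^{|C|}, z^{B]}} + \tfrac{1}{2}\, g_{z^A \bar{w}^B, w^C} &= 0.
\end{aligned} \tag{C.4}$$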

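The first and second lines of Eq. (C.4), when combined with the antisymmetry in A, B of $g_{z^A \bar{w}^B}$, allow us to write

$$g_{z^A \bar{w}^B} = (\partial_{z^A}\partial_{\bar{w}^B} - \partial_{z^B}\partial_{\bar{w}^A})L; \qquad g_{w^A \bar{z}^B} = (\partial_{w^A}\partial_{\bar{z}^B} - \partial_{w^B}\partial_{\bar{z}^A})L, \tag{C.5}$$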

where L is some real function (real by hermiticity of the metric, and therefore identical in the two equations (C.5)). Inserting Eq. (C.5) into the third equation of (C.4) gives

$$\partial_{\bar{w}^B}\left(g_{z^A \bar{z}^C} - L_{,z^A \bar{z}^C}\right) - (B \leftrightarrow A) = 0, \tag{C.6}$$

and therefore,

$$g_{z^A \bar{z}^B} = L_{,z^A \bar{z}^B} + \partial_{\bar{w}^A} G_{\bar{z}^B} \tag{C.7}$$

for some integration one-form $G_{\bar{z}^B}$. Combining this with the fourth equation of (C.4) gives $G_{\bar{z}^B} = L_{,w^B}$. Thus we have obtained Eq. (3.59), which is the desired result.

We have shown that integrability of the quaternionic structure implies the existence of a potential L for the metric. Although Eq. (3.59) holds only in a coordinate system in which the quaternionic structures are constant, Eq. (3.58) is coordinate invariant. Equation (3.53) motivates us to ask whether or not the existence of a potential L obeying Eq. (3.58) is generically implied by the weak HKT conditions, independent of integrability of the quaternionic structure.

11 Again, we do not get any additional information from $J^2$, since $d^3 J^3$ is (1, 2) ⊕ (2, 1) and the (2,1) and (1,2) parts of $d^2 J^2$ are trivially equal to those of $d^1 J^1$ and the (0,3) and (3,0) parts of $d^2 J^2$ are just minus those of $d^1 J^1$.


References

1. de Alfaro, V., Fubini, S. and Furlan, G.: Conformal Invariance in Quantum Mechanics. Nuovo Cimento 34A, 569 (1976)
2. Callan, C.G., Coleman, S. and Jackiw, R.: A New Improved Energy-Momentum Tensor. Ann. Phys. (NY) 59, 42 (1970); Jackiw, R.: Introducing Scale Symmetry. Physics Today 25, 23 (1972)
3. Hagen, C.R.: Scale and Conformal Transformations in Galilean-Covariant Field Theory. Phys. Rev. D 5, 377–388 (1972)
4. Niederer, U.: The Maximal Kinematical Invariance Group of the Free Schrödinger Equation. Helv. Phys. Acta 45, 802–810 (1972)
5. Akulov, V.P. and Pashnev, I.A.: Quantum Superconformal Model in (1,2) Space. Theor. Math. Phys. 56, 862 (1983)
6. Fubini, S. and Rabinovici, E.: Superconformal Quantum Mechanics. Nucl. Phys. B 245, 17 (1984)
7. Ivanov, E., Krivonos, S. and Leviant, V.: Geometry of Conformal Mechanics. J. Phys. A 22, 345 (1989); Ivanov, E., Krivonos, S. and Leviant, V.: Geometric Superfield Approach to Superconformal Mechanics. J. Phys. A 22, 4201 (1989)
8. Freedman, D.Z. and Mende, P.: An Exactly Solvable N Particle System in Supersymmetric Quantum Mechanics. Nucl. Phys. B 344, 317 (1990)
9. Claus, P., Derix, M., Kallosh, R., Kumar, J., Townsend, P. and van Proeyen, A.: Black Holes and Superconformal Mechanics. Phys. Rev. Lett. 81, 4553 (1998); hep-th/9804177
10. de Azcárraga, J.A., Izquierdo, J.M., Pérez Bueno, J.C. and Townsend, P.K.: Superconformal Mechanics, Black Holes, and Non-linear Realizations. Phys. Rev. D 59, 084015 (1999); hep-th/9810230
11. Zhou, J.-G.: Super 0-brane and GS Superstring Actions on $AdS_2 \times S^2$. Nucl. Phys. B 559, 92 (1999); hep-th/9906013
12. Maldacena, J.: The Large N Limit of Superconformal Field Theories and Supergravity. Adv. Theor. Math. Phys. 2, 231–252 (1998); hep-th/9711200
13. Coles, R. and Papadopoulos, G.: The Geometry of the One-Dimensional Supersymmetric Non-Linear Sigma Models. Class. Quant. Grav. 7, 427 (1990)
14. de Wit, B., Kleijn, B. and Vandoren, S.: Rigid N = 2 Superconformal Hypermultiplets. In J. Wess and E.A. Ivanov (eds.) Dubna 1997, Supersymmetries and Quantum Mechanics, Berlin–Heidelberg–New York: Springer, 1999, pp. 37–45; hep-th/9808160
15. Michelson, J. and Strominger, A.: Superconformal Multi-Black Hole Quantum Mechanics. JHEP 009, (1999) 005; hep-th/9908044
16. De Jonghe, F., Peeters, K. and Sfetsos, K.: Killing-Yano supersymmetry in String Theory. Class. Quant. Grav. 14, 35 (1997); hep-th/9607203
17. Gibbons, G.W., Papadopoulos, G. and Stelle, K.S.: HKT and OKT Geometries on Soliton Black Hole Moduli Spaces. Nucl. Phys. B 508, 623 (1997); hep-th/9706207
18. Yano, K. and Ako, M.: Integrability Conditions for Almost Quaternion Structures. Hokkaido Math. J. 1, 63 (1972)
19. Gates, S.J., Jr., Hull, C.M. and Roček, M.: Twisted Multiplets and New Supersymmetric Non-linear σ-Models. Nucl. Phys. B 248, 157–186 (1984)
20. Douglas, M., Polchinski, J. and Strominger, A.: Probing Five-Dimensional Black Holes with D-branes. JHEP 12, 003 (1997); hep-th/9703031
21. Howe, P. and Papadopoulos, G.: Further Remarks on the Geometry of Two-Dimensional Non-Linear σ-Models. Commun. Math. Phys. 151, 467 (1993)
22. Obata, M.: Affine Connections on Manifolds with Almost Complex, Quaternion or Hermitian Structure. Japan. J. Math. 26, 43 (1956)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 213, 19 – 37 (2000)


Geometry of Hyper-Kähler Connections with Torsion

Gueo Grantcharov, Yat Sun Poon

Department of Mathematics, University of California at Riverside, Riverside, CA 92521, USA. E-mail: [email protected]; [email protected]

Received: 1 October 1999 / Accepted: 30 January 2000

Abstract: The internal space of an N = 4 supersymmetric model with Wess–Zumino term has a connection with totally skew-symmetric torsion and holonomy in Sp(n). We study the mathematical background of this type of connection. In particular, we relate it to classical Hermitian geometry, construct homogeneous as well as inhomogeneous examples, characterize it in terms of holomorphic data, and develop its potential theory and reduction theory.

1. Introduction

It has been known that the internal space for the N = 2 supersymmetric one-dimensional sigma model is a Kähler manifold [28], and the internal space for the N = 4 supersymmetric one-dimensional sigma model is a hyper-Kähler manifold [6, 15]. This means that there exists a torsion-free connection with holonomy in U(n) or Sp(n), respectively, on the internal space. It has also been known for a fairly long time that when the Wess–Zumino term is present in the sigma model, the internal space has linear connections with holonomy in U(n) or Sp(n), depending on the number of supersymmetries. However, the connection has torsion and the torsion tensor is totally skew-symmetric [11, 16, 14]. The geometry of a connection with totally skew-symmetric torsion and holonomy in U(n) is referred to as KT-geometry by physicists. When the holonomy is in Sp(n), the geometry is referred to as HKT-geometry.

If one ignores the metric and the connection of a HKT-geometry, the remaining object on the manifold is a hypercomplex structure. The subject of hypercomplex manifolds has been studied by many people since the publication of [24] and [4]. A considerable amount of information is known. It has a twistor correspondence [24, 21]. There are homogeneous examples [19] and there are also inhomogeneous examples [5, 22]. There is a reduction construction modeled on symplectic reduction and hyper-Kähler reduction [18]. However, all these works focus on the hypercomplex structure and the associated Obata connection, which is a torsion-free connection preserving the hypercomplex structure. What is not discussed in these works is hyper-Hermitian geometry.

Author footnotes: On leave from University of Sofia. Partially supported by Contract MM 809/1998 with Ministry of Education of Bulgaria and by Contract 238/1998 with the University of Sofia. Partially supported by the IHES.

On the other hand, Hermitian connections on almost Hermitian manifolds are studied rather thoroughly by Gauduchon [12]. He considered a subset of Hermitian connections determined by the form of their torsion tensor, called canonical connections.

Guided by physicists' work and based on the results on hypercomplex manifolds, we review and further develop the theory of HKT-geometry. Some of our observations are reinterpretations of physicists' results, especially those in [16, 17] and [20, 26], and some of the results in this paper are new.

In Sect. 2, we review the basic definition of HKT-geometry along the line of classical Hermitian geometry developed by Gauduchon [12]. Based on Joyce's construction of homogeneous hypercomplex manifolds [19], we review the construction of homogeneous HKT-geometry with respect to compact semi-simple Lie groups [20]. In Sect. 3, we find that a hyper-Hermitian manifold admits a HKT-connection if and only if for each complex structure, there is a holomorphic (0,2)-form. This characterization easily implies that some hyper-Hermitian structures are not HKT-structures. Furthermore, when this characterization is given a twistorial interpretation, the associated object on the twistor space of the hypercomplex structure is holomorphic with respect to a non-standard almost complex structure $\mathcal{J}_2$. This almost complex structure $\mathcal{J}_2$ was first discussed by Eells and Salamon in a different context [9].
Since this almost complex structure is never integrable, we focus on the holomorphic (0,2)-forms. From this perspective, we verify that there are HKT-structures on nilmanifolds, and that the twist of a HKT-manifold is again a HKT-manifold. In Sect. 4 we study the potential theory for HKT-geometry, which is based on results in Sect. 3. We shall see that local HKT-geometry is very flexible in the sense that the existence of one generates many through a perturbation of potential functions. In particular, we show that hyper-Kähler potentials generate many HKT-potentials. The results in this section and Sect. 3 allow us to construct a large family of inhomogeneous HKT-structures on compact manifolds including $S^1 \times S^{4n+3}$. Finally, a reduction theory based on hyper-Kähler reduction for HKT-geometry is developed in Sect. 5.

2. Hyper-Kähler Geometry with Torsion

2.1. Kähler Geometry with Torsion. Let M be a smooth manifold with Riemannian metric g and an integrable complex structure J. It is a Hermitian manifold if g(JX, JY) = g(X, Y). The Kähler form F is a type (1,1)-form defined by F(X, Y) = g(JX, Y). A linear connection ∇ on M is Hermitian if it preserves the metric g and the complex structure J, i.e., ∇g = 0 and ∇J = 0. Since the connection preserves the metric, it is uniquely determined by its torsion tensor T. We shall also consider the following (3,0)-tensor:

$$c(X, Y, Z) = g(X, T(Y, Z)). \tag{1}$$


Gauduchon found that on any Hermitian manifold, the collection of canonical Hermitian connections is an affine subspace of the space of linear connections [12]. This affine subspace is at most one dimensional. It is a single point if and only if the Hermitian manifold is Kähler: when the Kähler form is closed, the family of canonical Hermitian connections collapses to the Levi-Civita connection of the given metric. It is one-dimensional if and only if the Hermitian manifold is non-Kähler. In the latter case, there are several distinguished Hermitian connections. For example, the Chern connection and Lichnerowicz's first canonical connection are in this family.

We are interested in another connection in this family. Physicists find that the presence of the Wess–Zumino term in N = 2 supersymmetry yields a Hermitian connection whose torsion c is totally skew-symmetric. In other words, c is a 3-form. Such a connection turns out to be another distinguished Hermitian connection [3, 12]. Physicists call such a connection a KT-connection. Among some mathematicians, this connection is called the Bismut connection. According to Gauduchon [12], on any Hermitian manifold, there exists a unique Hermitian connection whose torsion tensor c is a 3-form. Moreover, the torsion form can be expressed in terms of the complex structure and the Kähler form. Recall the following definitions and conventions [2, Eqs. 2.8 and 2.15–2.17]. For any n-form ω, when

$$(J\omega)(X_1, \dots, X_n) := (-1)^n\, \omega(JX_1, \dots, JX_n),$$

then

$$d^c \omega = (-1)^n\, J\, d\, J\omega, \tag{2}$$

and

$$\partial = \tfrac{1}{2}(d + i d^c) = \tfrac{1}{2}(d + (-1)^n\, i\, J d J), \qquad \bar{\partial} = \tfrac{1}{2}(d - i d^c) = \tfrac{1}{2}(d - (-1)^n\, i\, J d J). \tag{3}$$

By [12], the torsion 3-form of the Bismut connection is

$$c(X, Y, Z) = -\tfrac{1}{2}\, d^c F(X, Y, Z). \tag{4}$$

2.2. Hyper-Kähler Connection and HKT-Geometry. Three complex structures $I_1$, $I_2$ and $I_3$ on M form a hypercomplex structure if

$$I_1^2 = I_2^2 = I_3^2 = -1, \quad \text{and} \quad I_1 I_2 = I_3 = -I_2 I_1. \tag{5}$$

A triple of such complex structures is equivalent to the existence of a 2-sphere worth of integrable complex structures:

$$I = \{a_1 I_1 + a_2 I_2 + a_3 I_3 : a_1^2 + a_2^2 + a_3^2 = 1\}. \tag{6}$$
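The algebra behind Eq. (6) can be illustrated on the simplest flat model. The sketch below assumes M = R^4 identified with the quaternions, with $I_1$, $I_2$, $I_3$ acting by left multiplication by i, j, k; this model and the function names are assumptions for illustration only. It checks that a combination $a_1 I_1 + a_2 I_2 + a_3 I_3$ squares to −1 when a is a unit vector, using exact rational arithmetic.

```python
from fractions import Fraction

# Left multiplication by i, j, k on H = R^4, as matrices acting on column
# vectors in the basis (1, i, j, k). Illustrative flat model only.
I1 = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]
I2 = [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]]
I3 = [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]

MINUS_IDENTITY = [[-1 if r == c else 0 for c in range(4)] for r in range(4)]


def matmul(p, q):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(p[r][k] * q[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def combination(a):
    """Return a1*I1 + a2*I2 + a3*I3 for a = (a1, a2, a3)."""
    return [[a[0] * I1[r][c] + a[1] * I2[r][c] + a[2] * I3[r][c]
             for c in range(4)] for r in range(4)]


def is_complex_structure(a):
    """True when the combination squares to minus the identity."""
    j = combination(a)
    return matmul(j, j) == MINUS_IDENTITY


# Two rational points on the unit sphere.
unit_points = [
    (Fraction(1, 3), Fraction(2, 3), Fraction(2, 3)),
    (Fraction(2, 7), Fraction(3, 7), Fraction(6, 7)),
]
```

The cross terms in the square cancel because the three structures anticommute pairwise, so the square reduces to $-(a_1^2 + a_2^2 + a_3^2)$ times the identity; a non-unit vector such as (1, 1, 0) therefore fails the check.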

When g is a Riemannian metric on the manifold M such that it is Hermitian with respect to every complex structure in the hypercomplex structure, (M, I, g) is called a hyper-Hermitian manifold. Note that g is hyper-Hermitian if and only if

$$g(X, Y) = g(I_1 X, I_1 Y) = g(I_2 X, I_2 Y) = g(I_3 X, I_3 Y). \tag{7}$$
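Condition (7) can always be arranged at a point by averaging. The following sketch, again on the flat quaternionic model R^4 (an assumption for illustration), starts from an arbitrary positive definite symmetric matrix `G` and averages it over the identity and the three complex structures. This is a standard averaging argument rather than a construction taken from the text, and the names `pullback` and `average` are hypothetical.

```python
from fractions import Fraction

# Left multiplication by i, j, k on H = R^4 (column-vector convention).
I1 = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]
I2 = [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]]
I3 = [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
IDENTITY = [[1 if r == c else 0 for c in range(4)] for r in range(4)]


def matmul(p, q):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(p[r][k] * q[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transpose(p):
    """Transpose of a 4x4 matrix."""
    return [[p[c][r] for c in range(4)] for r in range(4)]


def pullback(g, a):
    """Matrix of the bilinear form (X, Y) -> g(aX, aY)."""
    return matmul(transpose(a), matmul(g, a))


def average(g):
    """Average g over the identity and the three complex structures."""
    terms = [pullback(g, a) for a in (IDENTITY, I1, I2, I3)]
    return [[Fraction(sum(t[r][c] for t in terms), 4) for c in range(4)]
            for r in range(4)]


# An arbitrary symmetric, diagonally dominant (hence positive definite) form.
G = [[2, 1, 0, 0], [1, 3, 1, 0], [0, 1, 4, 1], [0, 0, 1, 5]]
G_AVG = average(G)
```

The averaged form is invariant because composing with any $I_a$ only permutes the four terms of the sum up to signs, which cancel in a bilinear form.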

On a hyper-Hermitian manifold, there are two natural torsion-free connections, namely the Levi-Civita connection and the Obata connection. However, in general, the Levi-Civita connection does not preserve the hypercomplex structure and the Obata connection does not preserve the metric. We are interested in the following types of connections:


Definition 1. A linear connection ∇ on a hyper-Hermitian manifold (M, I, g) is hyper-Hermitian if

$$\nabla g = 0, \quad \text{and} \quad \nabla I_1 = \nabla I_2 = \nabla I_3 = 0. \tag{8}$$

Definition 2. A linear connection ∇ on a hyper-Hermitian manifold (M, I, g) is hyper-Kähler if it is hyper-Hermitian and its torsion tensor is totally skew-symmetric.

A hyper-Kähler connection is referred to as a HKT-connection in the physics literature. The geometry of this connection is referred to as HKT-geometry. Note that a HKT-connection is also the Bismut connection for each complex structure in the given hypercomplex structure. For the complex structures $\{I_1, I_2, I_3\}$, we consider their corresponding Kähler forms $\{F_1, F_2, F_3\}$ and the complex operators $\{d_1, d_2, d_3\}$, where $d_a$ denotes the operator $d^c$ of the complex structure $I_a$. Due to Gauduchon's characterization of the Bismut connection, we have

Proposition 1. A hyper-Hermitian manifold (M, I, g) admits a hyper-Kähler connection if and only if $d_1 F_1 = d_2 F_2 = d_3 F_3$. If it exists, it is unique.

In view of the uniqueness, we say that (M, I, g) is a HKT-structure if it admits a hyper-Kähler connection. If the hyper-Kähler connection is also torsion-free, then the HKT-structure is a hyper-Kähler structure.

2.3. Homogeneous Examples. Due to Joyce [19], there is a family of homogeneous hypercomplex structures associated to any compact semi-simple Lie group. In this section, we briefly review his construction and demonstrate, as Opfermann and Papadopoulos did [20], the existence of homogeneous HKT-connections.

Let G be a compact semi-simple Lie group. Let U be a maximal torus. Let $\mathfrak{g}$ and $\mathfrak{u}$ be their algebras. Choose a system of ordered roots with respect to $\mathfrak{u}_{\mathbb{C}}$. Let $\alpha_1$ be a maximal positive root, and $\mathfrak{h}_1$ the dual space of $\alpha_1$. Let $\mathfrak{d}_1$ be the sp(1)-subalgebra of $\mathfrak{g}$ such that its complexification is isomorphic to $\mathfrak{h}_1 \oplus \mathfrak{g}_{\alpha_1} \oplus \mathfrak{g}_{-\alpha_1}$, where $\mathfrak{g}_{\alpha_1}$ and $\mathfrak{g}_{-\alpha_1}$ are the root spaces for $\alpha_1$ and $-\alpha_1$ respectively. Let $\mathfrak{b}_1$ be the centralizer of $\mathfrak{d}_1$. Then there is a vector subspace $\mathfrak{f}_1$ composed of root spaces such that $\mathfrak{g} = \mathfrak{b}_1 \oplus \mathfrak{d}_1 \oplus \mathfrak{f}_1$. If $\mathfrak{b}_1$ is not Abelian, Joyce applies this decomposition to it. By inductively searching for sp(1) subalgebras, he finds the following [19, Lemma 4.1].

Lemma 1. The Lie algebra $\mathfrak{g}$ of a compact Lie group G decomposes as

$$\mathfrak{g} = \mathfrak{b} \oplus \bigoplus_{j=1}^{n} \mathfrak{d}_j \oplus \bigoplus_{j=1}^{n} \mathfrak{f}_j, \tag{9}$$

with the following properties:
(1) $\mathfrak{b}$ is Abelian and $\mathfrak{d}_j$ is isomorphic to sp(1).
(2) $\mathfrak{b} \oplus \bigoplus_{j=1}^{n} \mathfrak{d}_j$ contains $\mathfrak{u}$.
(3) Set $\mathfrak{b}_0 = \mathfrak{g}$, $\mathfrak{b}_n = \mathfrak{b}$ and $\mathfrak{b}_k = \mathfrak{b} \oplus \bigoplus_{j=k+1}^{n} \mathfrak{d}_j \oplus \bigoplus_{j=k+1}^{n} \mathfrak{f}_j$. Then $[\mathfrak{b}_k, \mathfrak{d}_j] = 0$ for k ≥ j.
(4) $[\mathfrak{d}_l, \mathfrak{f}_l] \subset \mathfrak{f}_l$.
(5) The adjoint representation of $\mathfrak{d}_l$ on $\mathfrak{f}_l$ is reducible to a direct sum of the irreducible 2-dimensional representations of sp(1).

When the group G is semi-simple, the Killing-Cartan form is a negative definite inner product on the vector space $\mathfrak{g}$.

Lemma 2. The Joyce Decomposition of a compact semi-simple Lie algebra is an orthogonal decomposition with respect to the Killing-Cartan form.


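Proof. Since the Joyce Decomposition given in (9) is inductively defined, it suffices to prove that the decomposition

$$\mathfrak{g}_{\mathbb{C}} = \mathfrak{b}_1 \oplus \mathfrak{d}_1 \oplus \mathfrak{f}_1 \tag{10}$$

is orthogonal. Recall that

$$\mathfrak{d}_1 = \langle h_1, X_{\alpha_1}, X_{-\alpha_1} \rangle, \qquad \mathfrak{f}_1 = \bigoplus_{\alpha_1 \neq \alpha > 0,\ \langle \alpha, \alpha_1 \rangle \neq 0} \mathfrak{g}_\alpha \oplus \mathfrak{g}_{-\alpha},$$

$$\mathfrak{b}_1 = \{h \in \mathfrak{u}_{\mathbb{C}} : \alpha_1(h) = 0\} \oplus \bigoplus_{\alpha_1 \neq \alpha > 0,\ \langle \alpha, \alpha_1 \rangle = 0} \mathfrak{g}_\alpha \oplus \mathfrak{g}_{-\alpha}.$$

Since the Cartan subalgebra $\mathfrak{u}_{\mathbb{C}}$ is orthogonal to any root space, and it is an elementary fact that two root spaces $\mathfrak{g}_\alpha$, $\mathfrak{g}_\beta$ are orthogonal whenever $\alpha \neq \pm\beta$, $\mathfrak{f}_1$ is orthogonal to both $\mathfrak{b}_1$ and $\mathfrak{d}_1$. For the same reasons, $\mathfrak{d}_1$ is orthogonal to the summand $\bigoplus_{\alpha > 0,\ \langle \alpha, \alpha_1 \rangle = 0} \mathfrak{g}_\alpha \oplus \mathfrak{g}_{-\alpha}$ in $\mathfrak{b}_1$, and $\mathfrak{b}_1$ is orthogonal to the summand $\langle X_{\alpha_1}, X_{-\alpha_1} \rangle$ in $\mathfrak{d}_1$. Then $\mathfrak{d}_1$ is orthogonal to $\mathfrak{b}_1$ because for any element h in the Cartan subalgebra in $\mathfrak{b}_1$, $\langle h, h_1 \rangle = \alpha_1(h) = 0$.

Let G be a compact semi-simple Lie group with rank r. Then

$$(2n - r)\,\mathfrak{u}(1) \oplus \mathfrak{g} \cong \mathbb{R}^n \oplus \bigoplus_{j=1}^{n} \mathfrak{d}_j \oplus \bigoplus_{j=1}^{n} \mathfrak{f}_j. \tag{11}$$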

At the tangent space of the identity element of $T^{2n-r} \times G$, i.e. the Lie algebra $(2n - r)\,\mathfrak{u}(1) \oplus \mathfrak{g}$, a hypercomplex structure $\{I_1, I_2, I_3\}$ is defined as follows. Let $\{E_1, \dots, E_n\}$ be a basis for $\mathbb{R}^n$. Choose isomorphisms $\phi_j$ from sp(1), the real vector space of imaginary quaternions, to $\mathfrak{d}_j$. It gives a real linear identification from the quaternions $\mathbb{H}$ to $E_j \oplus \mathfrak{d}_j$. If $H_j$, $X_j$ and $Y_j$ form a basis for $\mathfrak{d}_j$ such that $[H_j, X_j] = 2Y_j$ and $[H_j, Y_j] = -2X_j$, then

$$I_1 E_j = H_j, \quad I_2 E_j = X_j, \quad I_3 E_j = Y_j. \tag{12}$$
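The bracket relations above are exactly those of the imaginary quaternions. The sketch below makes this concrete under the assumption that E, H, X, Y correspond to 1, i, j, k and that each $I_a$ acts on this quaternionic line by left multiplication by $\iota_a$; both identifications are illustrative choices consistent with Eq. (12), not a statement of the construction in [19].

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def bracket(p, q):
    """Commutator pq - qp."""
    return tuple(x - y for x, y in zip(qmul(p, q), qmul(q, p)))


def scale(s, p):
    """Multiply a quaternion by the real scalar s."""
    return tuple(s * x for x in p)


# Assumed identification: E <-> 1, H <-> i, X <-> j, Y <-> k.
E = (1, 0, 0, 0)
H = (0, 1, 0, 0)
X = (0, 0, 1, 0)
Y = (0, 0, 0, 1)
```

Under this identification [H, X] = 2Y and [H, Y] = −2X hold, and left multiplication by i, j, k sends E to H, X, Y respectively, matching Eq. (12).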

Define the action of $I_a$ on $\mathfrak{f}_j$ by $I_a(v) = [v, \phi_j(\iota_a)]$, where $\iota_1 = i$, $\iota_2 = j$, $\iota_3 = k$. The complex structures $\{I_1, I_2, I_3\}$ at the other points of the group $T^{2n-r} \times G$ are obtained by left translations. These complex structures are integrable and form a hypercomplex structure [19].

Lemma 3. When G is a compact semi-simple Lie group with rank r, there exists a negative definite bilinear form $\hat{B}$ on the decomposition $(2n - r)\,\mathfrak{u}(1) \oplus \mathfrak{g} \cong \mathbb{R}^n \oplus \bigoplus_{j=1}^{n} \mathfrak{d}_j \oplus \bigoplus_{j=1}^{n} \mathfrak{f}_j$ such that (1) its restriction to $\mathfrak{g}$ is the Killing-Cartan form, (2) it is hyper-Hermitian with respect to the hypercomplex structure, and (3) the above decomposition is orthogonal.

Proof. In $\mathfrak{d}_j$, we choose an orthogonal basis $\{H_j, X_j, Y_j\}$ such that $H_j$ is in the Cartan subalgebra and

$$B(H_j, H_j) = B(X_j, X_j) = B(Y_j, Y_j) = -\lambda_j^2. \tag{13}$$

On $\mathbb{R}^n = (2n - r)\,\mathfrak{u}(1) \oplus \mathfrak{b}$, choose $E_1, \dots, E_n$ and extend the Killing-Cartan form so that

$$\hat{B}(E_i, E_j) = -\delta_{ij}\lambda_j^2. \tag{14}$$

It is now apparent that the extended Killing-Cartan form is hyper-Hermitian with respect to $I_1$, $I_2$ and $I_3$. To show that the Killing-Cartan form is hyper-Hermitian on $\bigoplus_{j=1}^{n} \mathfrak{f}_j$, it suffices to verify that the Killing-Cartan form is hyper-Hermitian on $\mathfrak{f}_1$. It follows from the fact that B(X, [Y, Z]) is totally skew-symmetric with respect to X, Y, Z and the Jacobi identity.
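Two facts used in this proof, negative definiteness of the Killing-Cartan form and total skew-symmetry of B(X, [Y, Z]), can be checked by hand on sp(1) alone. The sketch below assumes the basis (H, X, Y) with [H, X] = 2Y, [H, Y] = −2X, [X, Y] = 2H, so that the bracket is twice the cross product on coordinate vectors; the function names are illustrative.

```python
def cross(u, v):
    """Cross product in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def bracket(u, v):
    """Bracket of sp(1) in the basis (H, X, Y): twice the cross product."""
    return tuple(2 * c for c in cross(u, v))


BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def ad(u):
    """Matrix of ad(u); column c holds the coordinates of [u, e_c]."""
    columns = [bracket(u, e) for e in BASIS]
    return [[columns[c][r] for c in range(3)] for r in range(3)]


def killing(u, v):
    """Killing-Cartan form B(u, v) = trace(ad(u) ad(v))."""
    a, b = ad(u), ad(v)
    return sum(a[r][c] * b[c][r] for r in range(3) for c in range(3))


def torsion(x, y, z):
    """The trilinear form B(x, [y, z])."""
    return killing(x, bracket(y, z))


KILLING_MATRIX = [[killing(u, v) for v in BASIS] for u in BASIS]
```

For this normalization the form is −8 times the identity, which is Eq. (13) with $\lambda^2 = 8$; inside a larger algebra $\mathfrak{g}$ the restriction of the Killing-Cartan form to $\mathfrak{d}_j$ will in general have a different constant.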


Let g be the left-translation of the extended Killing-Cartan form $-\hat{B}$. It is a bi-invariant metric on the manifold $T^{2n-r} \times G$. The Levi-Civita connection D is the bi-invariant connection. Let ∇ be the left-invariant connection defined by having all left-invariant vector fields being parallel. When X and Y are left-invariant vector fields,

$$D_X Y = \tfrac{1}{2}[X, Y], \quad \text{and} \quad \nabla_X Y = 0.$$

Since the hypercomplex structure and the hyper-Hermitian metric are left-invariant, the left-invariant connection is hyper-Hermitian. The torsion tensor for the left-invariant connection is T(X, Y) = −[X, Y]. The (3,0)-torsion tensor is

$$c(X, Y, Z) = -\hat{B}([X, Y], Z).$$

It is well known that c is a totally skew-symmetric 3-form. Therefore, the left-invariant connection defines a HKT-structure on the group manifold $T^{2n-r} \times G$. It is apparent that if one extends the Killing-Cartan form in an arbitrary way, then the resulting bi-invariant metric and left-invariant hypercomplex structure in general do not form a hyper-Hermitian structure. The above construction can be generalized to homogeneous spaces [20].

3. Characterization of HKT-Structures

In this section, we characterize HKT-structures in terms of the existence of a holomorphic object with respect to any complex structure in the hypercomplex structure. Through this characterization, we shall find other examples of HKT-manifolds. Toward the end of this section, we shall also reinterpret the twistor theory for HKT-geometry developed by Howe and Papadopoulos [17]. The results seem to indicate that the holomorphic characterization developed in the next paragraph will serve all the purposes that one wants the twistor theory of HKT-geometry to serve.

3.1. Holomorphic Characterization.

Proposition 2. Let (M, I, g) be a hyper-Hermitian manifold and $F_a$ be the Kähler form for $(I_a, g)$. Then (M, I, g) is a HKT-structure if and only if $\partial_1(F_2 + iF_3) = 0$; or equivalently $\bar{\partial}_1(F_2 - iF_3) = 0$.

Proof. Since $\partial_1(F_2 + iF_3) = \tfrac{1}{2}(dF_2 - d_1 F_3) + \tfrac{i}{2}(d_1 F_2 + dF_3)$, it is identically zero if and only if $d_1 F_2 = -dF_3$ and $dF_2 = d_1 F_3$. Note that

$$F_2(I_1 X, I_1 Y) = g(I_2 I_1 X, I_1 Y) = -g(I_2 X, Y) = -F_2(X, Y).$$

It follows that $d_1 F_2 = (-1)^2 I_1 d I_1(F_2) = -I_1 dF_2$. As $dF_2$ is a 3-form, for any tangent vectors X, Y, Z,

$$-I_1 dF_2(X, Y, Z) = dF_2(I_1 X, I_1 Y, I_1 Z) = dF_2(I_2 I_3 X, I_2 I_3 Y, I_2 I_3 Z) = -I_2 dF_2(I_3 X, I_3 Y, I_3 Z) = I_3 I_2 dF_2(X, Y, Z).$$

Since $F_2$ is type (1,1) with respect to $I_2$, $I_2 F_2 = F_2$. Then

$$d_1 F_2 = -I_1 dF_2 = I_3 I_2 dF_2 = I_3 I_2 d I_2 F_2 = I_3 d_2 F_2.$$

On the other hand, $-dF_3 = I_3 I_3 dF_3 = I_3 I_3 d I_3 F_3 = I_3 d_3 F_3$. Therefore, $d_2 F_2 = d_3 F_3$ if and only if $d_1 F_2 = -dF_3$. Similarly, one can prove that $d_2 F_2 = d_3 F_3$ if and only if $d_1 F_3 = dF_2$. It follows that $\partial_1(F_2 + iF_3) = 0$ if and only if $d_2 F_2 = d_3 F_3$. It is equivalent to $\nabla^2 = \nabla^3$, where $\nabla^a$ is the Bismut connection of the Hermitian structure $(M, I_a, g)$. Since $I_1 = I_2 I_3$ and $\nabla^2 = \nabla^3$, $I_1$ is parallel with respect to $\nabla^2 = \nabla^3$. By the uniqueness of the Bismut connection, $\nabla^1 = \nabla^2 = \nabla^3$.


On any hypercomplex manifold (M, I), if $F_2 - iF_3$ is a 2-form such that $-F_2(I_2 X, Y) = g(X, Y)$ is positive definite and it is a non-holomorphic (0,2)-form with respect to $I_1$, then (M, g, I) is a hyper-Hermitian manifold but it is not a HKT-structure. For example, a conformal change of a HKT-structure by a generic function gives a hyper-Hermitian structure which is not a HKT-structure so long as the dimension of the underlying manifold is at least eight. On the other hand, Proposition 2 implies that every four-dimensional hyper-Hermitian manifold is a HKT-structure, a fact also proven in [13, Sect. 2.2]. In the proof of Proposition 2, we also derive the following [17].

Corollary 1. Suppose $F_1$, $F_2$ and $F_3$ are the Kähler forms of a hyper-Hermitian structure. Then the hyper-Hermitian structure is a HKT-structure if and only if

$$d_i F_j = -2\delta_{ij}\, c - \epsilon_{ijk}\, dF_k. \tag{15}$$

Theorem 1. Let (M, I) be a hypercomplex manifold and $F_2 - iF_3$ be a (0,2)-form with respect to $I_1$ such that $\bar{\partial}_1(F_2 - iF_3) = 0$, or equivalently $\partial_1(F_2 + iF_3) = 0$, and $-F_2(I_2 X, Y) = g(X, Y)$ is a positive definite symmetric bilinear form. Then (M, I, g) is a HKT-structure.

Proof. In view of the last proposition, it suffices to prove that the metric g along with the given hypercomplex structure I is hyper-Hermitian. Note that $F_2 - iF_3$ is type (0,2) with respect to $I_1$. Since $X - iI_1 X$ is a type (1,0)-vector with respect to $I_1$, $(F_2 - iF_3)(X - iI_1 X, Y) = 0$ for any vectors X and Y. It is equivalent to the identity $F_2(I_1 X, Y) = -F_3(X, Y)$. Then $F_3(I_3 X, Y) = -F_2(I_1 I_3 X, Y) = F_2(I_2 X, Y) = -g(X, Y)$. So $F_3(I_3 X, I_3 Y) = F_3(X, Y)$, and g is Hermitian with respect to $I_3$. Since the metric g is Hermitian with respect to $I_2$ and $I_1 = I_2 I_3$, g is also Hermitian with respect to $I_1$.

3.2. HKT-Structures on Compact Nilmanifolds. In this section, we apply the last theorem to construct a homogeneous HKT-structure on some compact nilmanifolds. Let $\{X_1, \dots, X_{2n}, Y_1, \dots, Y_{2n}, Z\}$ be a basis for $\mathbb{R}^{4n+1}$. Define commutators by $[X_i, Y_i] = Z$, and all others are zero. These commutators define on $\mathbb{R}^{4n+1}$ the structure of the Heisenberg Lie algebra $\mathfrak{h}_{2n}$. Let $\mathbb{R}^3$ be the 3-dimensional Abelian algebra. The direct sum $\mathfrak{n} = \mathfrak{h}_{2n} \oplus \mathbb{R}^3$ is a 2-step nilpotent algebra whose center is four-dimensional. Fix a basis $\{E_1, E_2, E_3\}$ for $\mathbb{R}^3$ and consider the following endomorphisms of $\mathfrak{n}$ [8]:

$I_1$: $X_i \to Y_i$, $Z \to E_1$, $E_2 \to E_3$;
$I_2$: $X_{2i+1} \to X_{2i}$, $Y_{2i-1} \to Y_{2i}$, $Z \to E_2$, $E_1 \to E_3$;
$I_1^2 = I_2^2 = -\text{identity}$, $I_3 = I_1 I_2$.

Clearly $I_1 I_2 = -I_2 I_1$. Moreover, for a = 1, 2, 3 and X, Y ∈ $\mathfrak{n}$, $[I_a X, I_a Y] = [X, Y]$, so the $I_a$ are Abelian complex structures on $\mathfrak{n}$ in the sense of [1] and in particular are integrable. It implies that $\{I_a : a = 1, 2, 3\}$ is a left invariant hypercomplex structure on the simply connected Lie group N whose algebra is $\mathfrak{n}$.
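The step from "Abelian" to "integrable" can be seen in a toy case. The sketch below is a deliberately simplified stand-in, not the algebra $\mathfrak{h}_{2n} \oplus \mathbb{R}^3$ of the text: it takes the 3-dimensional Heisenberg algebra plus one central direction, with basis (X, Y, Z, E), the single bracket [X, Y] = Z, and one complex structure J: X → Y, Z → E. It uses the Nijenhuis tensor in the form $N(u, v) = \tfrac{1}{4}([u, v] + J[Ju, v] + J[u, Jv] - [Ju, Jv])$.

```python
from fractions import Fraction


def bracket(u, v):
    """Lie bracket in the basis (X, Y, Z, E); only [X, Y] = Z is non-zero."""
    return (0, 0, u[0] * v[1] - u[1] * v[0], 0)


def J(u):
    """Complex structure with X -> Y, Y -> -X, Z -> E, E -> -Z."""
    return (-u[1], u[0], -u[3], u[2])


def add(*vectors):
    """Componentwise sum of vectors."""
    return tuple(sum(components) for components in zip(*vectors))


def neg(u):
    """Componentwise negation."""
    return tuple(-x for x in u)


def nijenhuis(u, v):
    """N(u, v) = ([u,v] + J[Ju,v] + J[u,Jv] - [Ju,Jv]) / 4."""
    total = add(bracket(u, v),
                J(bracket(J(u), v)),
                J(bracket(u, J(v))),
                neg(bracket(J(u), J(v))))
    return tuple(Fraction(x, 4) for x in total)


SAMPLES = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1),
           (2, -3, 5, 7), (1, 4, -1, 6)]
```

When [Ju, Jv] = [u, v], the first and last terms cancel, and the two middle brackets are negatives of each other, so the Nijenhuis tensor vanishes identically in this toy model.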
It is known that the complex structures $I_a$ on $\mathfrak{n}$ satisfy

$$d(\Lambda^{1,0}_{I_a} \mathfrak{n}^*) \subset \Lambda^{1,1}_{I_a} \mathfrak{n}^*,$$

where $\mathfrak{n}^*$ is the space of left invariant 1-forms on N and $\Lambda^{i,j}_{I_a} \mathfrak{n}^*$ is the (i, j)-component of $\mathfrak{n}^* \otimes \mathbb{C}$ with respect to $I_a$ [25]. But then we have $d(\Lambda^{2,0}_{I_a} \mathfrak{n}^*) \subset \Lambda^{2,1}_{I_a} \mathfrak{n}^*$ and any left invariant (2,0)-form is $\partial_1$-closed. Now consider the invariant metric on N for which the basis $\{X_i, Y_i, Z, E_a\}$ is orthonormal. Since it is compatible with the structures $I_a$, in view of Theorem 1 we obtain a left-invariant HKT-structure on N. Noting that N is isomorphic to the product $H_{2n} \times \mathbb{R}^3$ of the Heisenberg Lie group $H_{2n}$ and the Abelian group $\mathbb{R}^3$, we have:

Corollary 2. Let Γ be a cocompact lattice in the Heisenberg group $H_{2n}$ and $\mathbb{Z}^3$ a lattice in $\mathbb{R}^3$. The compact nilmanifold $(\Gamma \times \mathbb{Z}^3) \backslash N$ admits a HKT-structure.

3.3. Twist of Hyper-Kähler Manifolds with Torsions. Suppose that (M, I) is a hypercomplex manifold. A U(1)-instanton P is a principal U(1)-bundle with a U(1)-connection 1-form θ such that its curvature 2-form is type-(1,1) with respect to every complex structure in I [7, 10]. Let $\Psi_M : U(1) \to \mathrm{Aut}(M)$ be a group of hypercomplex automorphisms, and let $\Psi_P : U(1) \to \mathrm{Aut}(P)$ be a lifting of $\Psi_M$. Let $\Phi : U(1) \to \mathrm{Aut}(P)$ be the principal U(1)-action on the bundle P, and consider the diagonal product action $g \mapsto \Phi(g)\Psi_P(g)$ on P. A theorem of Joyce [19, Theorem 2.2] states that the quotient space W of the total space of P with respect to the diagonal action is a hypercomplex manifold whenever the vector field generated by the diagonal action is transversal to the horizontal distribution of the connection θ. The quotient space W is called a twist of the hypercomplex manifold M.

Now suppose that (M, I, g) is a HKT-structure and P is a U(1)-instanton with connection form θ. Suppose that $\Psi_M : U(1) \to \mathrm{Aut}(M)$ is a group of hypercomplex isometries. Due to the uniqueness of the HKT-connection, $\Psi_M$ is a group of automorphisms of the HKT-structure.

Corollary 3. The twist manifold W admits a HKT-structure.

Proof. Let $\phi : P \to M$ and $\Delta : P \to W$ be the projections from the instanton bundle P to M and the twist W respectively. The connection θ defines a splitting of the tangent bundle of P into horizontal and vertical components: $TP = \mathcal{H} \oplus \mathcal{V}$, where $\mathcal{H} = \mathrm{Ker}\,\theta$. We define endomorphisms $\tilde{I}_a$ on TP as follows: $\tilde{I}_a = 0$ on vertical directions, and when $\tilde{v}$ is a horizontal lift of a tangent vector v to M, define $\tilde{I}_a \tilde{v} = \widetilde{I_a v}$. Since the fibers of the projection Δ are transversal to the horizontal distribution, for any tangent vector $\hat{v}$ to W, there exists a horizontal vector $\tilde{v}$ such that $d\Delta\, \tilde{v} = \hat{v}$. Define $\hat{I}_a$ and $\hat{g}$ on W by $\hat{I}_a \hat{v} = d\Delta(\tilde{I}_a \tilde{v})$ and $\hat{g}(\hat{v}, \hat{w}) = \tilde{g}(\tilde{v}, \tilde{w})$. As the diagonal action is a group of hyper-holomorphic isometries, the almost complex structures $\hat{I}_a$ and metric $\hat{g}$ are well defined.

To verify that $\hat{I}_a$ are integrable complex structures on W, we first observe that for horizontal vector fields X and Y, $d\Delta[X, Y] = [d\Delta X, d\Delta Y]$, $d\phi[X, Y] = [d\phi X, d\phi Y]$ and $d\Delta\, \tilde{I}_a = \hat{I}_a\, d\Delta$, $d\phi\, \tilde{I}_a = I_a\, d\phi$. Through these relations, we establish the following relations between Nijenhuis tensors of $I_a$, $\hat{I}_a$ and $\tilde{I}_a$:

$$d\Delta\, \tilde{N}_a(X, Y) = \hat{N}_a(d\Delta X, d\Delta Y) \quad \text{and} \quad d\phi\, \tilde{N}_a(X, Y) = N_a(d\phi X, d\phi Y).$$

The second identity implies that the horizontal part of $\tilde{N}_a(X, Y)$ vanishes because the complex structures $I_a$ are integrable. With the first identity, it follows that the Nijenhuis tensor for $\hat{I}_a$ vanishes if the vertical part of $\tilde{N}_a(X, Y)$ also vanishes. To calculate the vertical part, we have

$$\theta(\tilde{N}_a(X, Y)) = \tfrac{1}{4}\,\theta([X, Y] + \tilde{I}_a[\tilde{I}_a X, Y] + \tilde{I}_a[X, \tilde{I}_a Y] - [\tilde{I}_a X, \tilde{I}_a Y]) = \tfrac{1}{4}\,\theta([X, Y] - [\tilde{I}_a X, \tilde{I}_a Y]) = \tfrac{1}{4}(d\theta(X, Y) - d\theta(I_a X, I_a Y)).$$

Since θ is an instanton, $d\theta(X, Y) - d\theta(I_a X, I_a Y) = 0$. It follows that $\hat{I}_a$ are integrable.

To check that $\hat{g}$ is a HKT-metric, we first observe that dΔ and dφ give rise to isomorphisms of $\Lambda^{(p,q)} M$, $\Lambda^{(p,q)} \mathcal{H}$ and $\Lambda^{(p,q)} W$ when we fix the structures $I_1$, $\hat{I}_1$ and $\tilde{I}_1$. Let the Kähler forms of the structures $I_a$ and $\hat{I}_a$ be denoted by $F_a$ and $\hat{F}_a$ respectively. Now if X, Y and Z are sections of $\mathcal{H}^{(1,0)}$ then $X(\Delta^*(\hat{F}_2 + i\hat{F}_3))(Y, Z) = X(\phi^*(F_2 + iF_3))(Y, Z)$. Since dθ is type (1,1), $\theta([X, Y]) = d\theta(X, Y) = 0$. It means that [X, Y] is a section of $\mathcal{H}^{(1,0)}$. Therefore, $\Delta^*(\hat{F}_2 + i\hat{F}_3)([X, Y], Z) = \phi^*(F_2 + iF_3)([X, Y], Z)$. It follows that

$$(\Delta^* d(\hat{F}_2 + i\hat{F}_3))|_{\Lambda^{(3,0)} \mathcal{H}} = (d\Delta^*(\hat{F}_2 + i\hat{F}_3))|_{\Lambda^{(3,0)} \mathcal{H}} = (d\phi^*(F_2 + iF_3))|_{\Lambda^{(3,0)} \mathcal{H}} = 0.$$

Hence $d(\hat{F}_2 + i\hat{F}_3)|_{\Lambda^{(3,0)} W} = 0$ and the corollary follows from Proposition 2.

3.4. Twistor Theory of HKT-Geometry. When $(M, \mathcal{I})$ is a 4n-dimensional hypercomplex manifold, the smooth manifold $Z = M \times S^2$ admits an integrable complex structure. It is defined as follows. For a unit vector $a = (a_1, a_2, a_3) \in \mathbb{R}^3$, let $I_a$ be the complex structure $a_1 I_1 + a_2 I_2 + a_3 I_3$ in the hypercomplex structure $\mathcal{I}$. Let $J_a$ be the complex structure on $S^2$ defined by the cross product in $\mathbb{R}^3$: $J_a w = a \times w$. Then the complex structure on $Z = M \times S^2$ at the point $(x, a)$ is $\mathcal{J}_{(x,a)} = I_a \oplus J_a$. It is well known from twistor theory that this complex structure is integrable [24]. We shall also have to consider the non-integrable almost complex structure $\mathcal{J}_2 = \mathcal{I} \oplus (-J)$. Unless specified otherwise, we discuss holomorphicity on $Z$ in terms of the integrable complex structure $\mathcal{J}$.

With respect to $\mathcal{J}$, the fibers of the projection $\pi$ from $Z = M \times S^2$ onto its first factor are holomorphic curves of genus zero. It can be proved that their holomorphic normal bundles are $\oplus^{2n}\mathcal{O}(1)$. The antipodal map $\tau$ on the second factor is an anti-holomorphic map on the twistor space $Z$ leaving the fibers of the projection $\pi$ invariant. The projection $p$ onto the second smooth factor of $Z = M \times S^2$ is a holomorphic map such that the inverse image of a point $(a_1, a_2, a_3)$ is the manifold $M$ equipped with the complex structure $a_1 I_1 + a_2 I_2 + a_3 I_3$. If $\mathcal{D}$ is the kernel sheaf of the differential $dp$, then we have the exact sequence

$$0 \to \mathcal{D} \to \Theta_Z \xrightarrow{\;dp\;} p^*\Theta_{\mathbb{CP}^1} \to 0. \tag{16}$$

Real sections, i.e. $\tau$-invariant sections, of the holomorphic projection $p$ are fibers of the projection from $Z$ onto $M$. Twistor theory shows that there is a one-to-one correspondence between the hypercomplex manifold $(M, \mathcal{I})$ and its twistor space $Z$ with the complex structure $\mathcal{J}$, the anti-holomorphic map $\tau$, the holomorphic projection $p$ and the sections of the projection $p$ with prescribed normal bundle [21].


It is not surprising that when a hypercomplex manifold has a HKT-structure, there is an additional geometric structure on the twistor space. The following theorem is essentially developed in [17].

Theorem 2. Let $(M, \mathcal{I}, g)$ be a 4n-dimensional HKT-structure. Then the twistor space $Z$ is a complex manifold such that
1. the fibers of the projection $\pi : Z \to M$ are rational curves with holomorphic normal bundle $\oplus^{2n}\mathcal{O}(1)$,
2. there is a holomorphic projection $p : Z \to \mathbb{CP}^1$ such that the fibers are the manifold $M$ equipped with complex structures of the hypercomplex structure $\mathcal{I}$,
3. there is a $\mathcal{J}_2$-holomorphic section of $\wedge^{(0,2)}\mathcal{D} \otimes p^*\Theta_{\mathbb{CP}^1}$ defining a positive definite (0,2)-form on each fiber,
4. there is an anti-holomorphic map $\tau$ compatible with 1, 2 and 3 and inducing the antipodal map on $\mathbb{CP}^1$.
Conversely, if $Z$ is a complex manifold with a non-integrable almost complex structure $\mathcal{J}_2$ with the above four properties, then the parameter space of real sections of the projection $p$ is a 4n-dimensional manifold $M$ with a natural HKT-structure for which $Z$ is the twistor space.

Proof. Given a HKT-structure, only Part 3 in the first half of this theorem is a new observation. It is a generalization of Theorem 1. Through the stereographic projection

$$\zeta \mapsto a = \frac{1}{1 + |\zeta|^2}\big(1 - |\zeta|^2,\; -i(\zeta - \bar\zeta),\; -(\zeta + \bar\zeta)\big), \tag{17}$$

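$\zeta$ is a complex coordinate of the Riemann sphere. Note that

$$\frac{1}{1 + |\zeta|^2}\begin{pmatrix} 1 - |\zeta|^2 & i(\zeta - \bar\zeta) & \zeta + \bar\zeta \\ -i(\zeta - \bar\zeta) & 1 + \tfrac{1}{2}(\zeta^2 + \bar\zeta^2) & -\tfrac{i}{2}(\zeta^2 - \bar\zeta^2) \\ -(\zeta + \bar\zeta) & -\tfrac{i}{2}(\zeta^2 - \bar\zeta^2) & 1 - \tfrac{1}{2}(\zeta^2 + \bar\zeta^2) \end{pmatrix}$$

is a special orthogonal matrix whose first column is the vector $a$ of (17). Let $b$ and $c$ be the second and third column vectors respectively. Consider the complex structure

$$I_a = \frac{1}{1 + |\zeta|^2}\big((1 - |\zeta|^2) I_1 - i(\zeta - \bar\zeta) I_2 - (\zeta + \bar\zeta) I_3\big).$$

According to Theorem 1, the 2-form

$$F_b - iF_c = \frac{1}{1 + |\zeta|^2}\big((F_2 - iF_3) - 2i\bar\zeta F_1 + \bar\zeta^2 (F_2 + iF_3)\big) \tag{18}$$

is holomorphic with respect to $I_a$. Due to the integrability of the complex structure $I_a$, $d_a$ is linear in $a$. Therefore,

$$d_a = \frac{1}{1 + |\zeta|^2}\big((1 - |\zeta|^2) d_1 - i(\zeta - \bar\zeta) d_2 - (\zeta + \bar\zeta) d_3\big). \tag{19}$$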
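The two algebraic claims used here, that the displayed matrix is special orthogonal and that its second and third columns combine into the coefficients of (18), can be spot-checked numerically. The following is a minimal sketch, not part of the original argument; the function names (`frame`, `is_special_orthogonal`, `matches_18`) are hypothetical helpers introduced only for this illustration, and the matrix entries are read off from the display above.

```python
def frame(zeta):
    """Return the 3x3 real matrix attached to the stereographic coordinate zeta."""
    zeta = complex(zeta)
    zb = zeta.conjugate()
    s = 1.0 + abs(zeta) ** 2
    rows = [
        [1 - abs(zeta) ** 2, 1j * (zeta - zb), zeta + zb],
        [-1j * (zeta - zb), 1 + 0.5 * (zeta**2 + zb**2), -0.5j * (zeta**2 - zb**2)],
        [-(zeta + zb), -0.5j * (zeta**2 - zb**2), 1 - 0.5 * (zeta**2 + zb**2)],
    ]
    # Each entry is real up to rounding, so only the real part is kept.
    return [[(complex(x) / s).real for x in row] for row in rows]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def is_special_orthogonal(m, tol=1e-9):
    """Check that the columns are orthonormal and the determinant is +1."""
    for j in range(3):
        for k in range(3):
            dot = sum(m[i][j] * m[i][k] for i in range(3))
            if abs(dot - (1.0 if j == k else 0.0)) > tol:
                return False
    return abs(det3(m) - 1.0) < tol


def matches_18(zeta, m, tol=1e-9):
    """Check that b - i*c equals (-2i*zb, 1 + zb^2, -i(1 - zb^2)) / (1 + |zeta|^2)."""
    zeta = complex(zeta)
    zb = zeta.conjugate()
    s = 1.0 + abs(zeta) ** 2
    expected = [-2j * zb / s, (1 + zb**2) / s, -1j * (1 - zb**2) / s]
    b = [m[i][1] for i in range(3)]  # second column
    c = [m[i][2] for i in range(3)]  # third column
    return all(abs((b[i] - 1j * c[i]) - expected[i]) < tol for i in range(3))
```

Collecting the coefficients of $F_1$, $F_2$, $F_3$ in $F_b - iF_c$ and regrouping them as $(F_2 - iF_3) - 2i\bar\zeta F_1 + \bar\zeta^2(F_2 + iF_3)$ is exactly what `matches_18` tests, so the check is sensitive to the placement of the conjugations in (18).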


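Note that $\bar\zeta$ is holomorphic with respect to the almost complex structure $\mathcal{J}_2$. More precisely, consider the $\bar\partial$-operator with respect to the almost complex structure $\mathcal{J}_2$: on n-forms, it is

$$\bar\delta = \tfrac{1}{2}\big(d - i(-1)^n \mathcal{J}_2\, d\, \mathcal{J}_2\big), \tag{20}$$

then $\mathcal{J}_2\, d\zeta = i\, d\zeta$, and $\bar\delta\bar\zeta = 0$. It follows that at $(x, a)$ on $Z = M \times S^2$,

$$\begin{aligned}
\bar\delta\big({-2i\bar\zeta} F_1 + (1 + \bar\zeta^2) F_2 - i(1 - \bar\zeta^2) F_3\big)
&= -2i\bar\zeta\,\bar\delta F_1 + (1 + \bar\zeta^2)\,\bar\delta F_2 - i(1 - \bar\zeta^2)\,\bar\delta F_3 \\
&= -2i\bar\zeta\,\bar\partial_a F_1 + (1 + \bar\zeta^2)\,\bar\partial_a F_2 - i(1 - \bar\zeta^2)\,\bar\partial_a F_3 \\
&= \tfrac{1}{2}\big({-2i\bar\zeta}\, dF_1 + (1 + \bar\zeta^2)\, dF_2 - i(1 - \bar\zeta^2)\, dF_3\big) \\
&\quad - \tfrac{i}{2}\big({-2i\bar\zeta}\, d_a F_1 + (1 + \bar\zeta^2)\, d_a F_2 - i(1 - \bar\zeta^2)\, d_a F_3\big).
\end{aligned}$$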

Now (19) and (15) together imply that the twisted 2-form $(F_2 - iF_3) - 2i\bar\zeta F_1 + \bar\zeta^2 (F_2 + iF_3)$ is closed with respect to $\bar\delta$. Therefore, it is a $\mathcal{J}_2$-holomorphic section. Since $\zeta$ is a holomorphic coordinate on $S^2$, the homogeneity shows that this section is twisted by $\mathcal{O}(2)$. The inverse construction is a consequence of the inverse construction of hypercomplex manifolds [21] and Theorem 1.

As the almost complex structure $\mathcal{J}_2$ is never integrable [9], twistor theory loses much of the power of holomorphic geometry when we study HKT-structures. Therefore, we focus on the application of Theorem 1.

4. Potential Theory

Theorem 1 shows that the form $F_2 + iF_3$ is a $\partial_1$-closed (2,0)-form on a HKT-manifold. It is natural to consider a differential form $\beta_1$ as a potential 1-form for $F_2 + iF_3$ if $\partial_1\beta_1 = F_2 + iF_3$. A priori, the 1-form $\beta_1$ depends on the choice of the complex structure $I_1$. The potential 1-form for $F_3 + iF_1$, if it exists, depends on $I_2$, and so on. In this section, we seek a function that generates all Kähler forms.

4.1. Potential Functions. A function $\mu$ is a potential function for a hyper-Kähler manifold $(M, \mathcal{I}, g)$ if the Kähler forms $F_a$ are equal to $dd_a\mu$. Since $d_a = (-1)^n I_a d I_a$ on n-forms, $d_a\mu = I_a d\mu$. Therefore,

$$d_1 d_2\mu = d_1 I_2 d\mu = -I_1 d I_1 I_2 d\mu = -I_1 d I_3 d\mu = -I_1 dd_3\mu = -I_1\Omega_3 = \Omega_3 = dd_3\mu.$$

Now we generalize this concept to HKT-manifolds.

Definition 3. Let $(M, \mathcal{I}, g)$ be a HKT-structure with Kähler forms $F_1$, $F_2$ and $F_3$. A possibly locally defined function $\mu$ is a potential function for the HKT-structure if

$$F_1 = \tfrac{1}{2}(dd_1 + d_2 d_3)\mu, \qquad F_2 = \tfrac{1}{2}(dd_2 + d_3 d_1)\mu, \qquad F_3 = \tfrac{1}{2}(dd_3 + d_1 d_2)\mu. \tag{21}$$


Due to the identities $dd_a + d_a d = 0$ and $d_a d_b + d_b d_a = 0$, $\mu$ is a potential function if and only if

$$F_a = \tfrac{1}{2}(dd_a + d_b d_c)\mu$$

when $a = b \times c$ and $F_a$ is the Kähler form for the complex structure $I_a = a_1 I_1 + a_2 I_2 + a_3 I_3$. Moreover, the torsion 3-form is equal to $-\tfrac{1}{4} d_1 d_2 d_3\mu$. Furthermore, since $\partial_a = \tfrac{1}{2}(d + i d_a)$ and $\bar\partial_a = \tfrac{1}{2}(d - i d_a)$,

$$F_2 + iF_3 = \tfrac{1}{2}(dd_2 + i\,dd_3 + i\,d_1 d_2 - d_1 d_3)\mu = 2\partial_1 I_2 \bar\partial_1\mu. \tag{22}$$

Conversely, if a function $\mu$ satisfies the above identity, it satisfies the last two identities in (21). Since the metric is hyper-Hermitian, for any vectors $X$ and $Y$, $F_1(X, Y) = F_2(I_3 X, Y)$. Through the integrability of the complex structures $I_1$, $I_2$, $I_3$, the quaternion identities (5) and the last two identities in (21), one derives the first identity in (21). Therefore, we have the following theorem, which justifies our definition of potential functions.

Theorem 3. Let $(M, \mathcal{I}, g)$ be a HKT-structure with Kähler forms $F_1$, $F_2$ and $F_3$. A possibly locally defined function $\mu$ is a potential function for the HKT-structure if

$$F_2 + iF_3 = 2\partial_1 I_2 \bar\partial_1\mu. \tag{23}$$

In this context, a HKT-structure is hyper-Kähler if and only if the potential function satisfies the following identities:

$$dd_1\mu = d_2 d_3\mu, \qquad dd_2\mu = d_3 d_1\mu, \qquad dd_3\mu = d_1 d_2\mu. \tag{24}$$

Corollary 4. Every hypercomplex manifold locally admits a HKT-metric.

Proof. We fix the complex structure $I_2$ and consider a locally defined Kähler potential function $\mu$ with respect to $I_2$. Then $dd_2\mu(X, I_2 X) > 0$ for every nonzero $X$. A simple calculation shows that if $Y = I_3 X$ then

$$(dd_2 + d_3 d_1)\mu(X, I_2 X) = dd_2\mu(X, I_2 X) + dd_2\mu(Y, I_2 Y) > 0. \tag{25}$$

Then we see that the form $F_2 + iF_3 = 2\partial_1 I_2 \bar\partial_1\mu$ satisfies the conditions of Theorem 1, and hence $\mu$ is a local HKT-potential function, thus defining a HKT-metric.

Remark. As in the Kähler case, compact manifolds do not admit a globally defined HKT-potential. To verify this, let $f$ be a potential function and $g$ be the corresponding induced metric. Define the complex Laplacian of $f$ with respect to $g$:

$$\bar\partial^*\bar\partial f = \Delta_c f = g(dd_1 f, F_1).$$

Then $0 \le 2g(F_1, F_1) = g(dd_1 f + d_2 d_3 f, F_1) = 2\Delta_c f$, because

$$g(d_2 d_3 f, F_1) = g(-I_2\, dd_1 f, F_1) = -g(dd_1 f, I_2 F_1) = g(dd_1 f, F_1) = \Delta_c f.$$

Now the remark follows from the standard arguments involving the maximum principle for second order elliptic differential equations, just as in the Kähler case, since $\Delta_c f$ does not have zero-order terms.


Remark. If we introduce the following quaternionic operators acting on $\mathbb{H}$-valued forms on the left: $\partial_{\mathbb{H}} = d + i d_1 + j d_2 + k d_3$ and $\bar\partial_{\mathbb{H}} = d - i d_1 - j d_2 - k d_3$, then a real-valued function $\mu$ is a HKT-potential if $\partial_{\mathbb{H}}\bar\partial_{\mathbb{H}}\mu = -2iF_1 - 2jF_2 - 2kF_3$. If we identify $\mathbb{H}^n$ with $\mathbb{C}^{2n}$, we deduce as in Corollary 4 that any pluri-subharmonic function in a domain of $\mathbb{C}^{2n}$ is a HKT-potential. The converse however is false. As we shall see in 4.3, the function $\log(|z|^2 + |w|^2)$ is a HKT-potential in $\mathbb{C}^{2n}\setminus\{0\}$ but is not pluri-subharmonic.

Remark. Given a HKT-metric $g$ with Kähler forms $F_1$, $F_2$ and $F_3$, for any real-valued function $\mu$ we consider $\hat F_2 + i\hat F_3 = F_2 + iF_3 + \partial_1 I_2 \bar\partial_1\mu$. According to Theorem 1 and other results in this section, whenever the form $\hat g(X, Y) := -\hat F_2(I_2 X, Y)$ is positive definite, we obtain a new HKT-metric with respect to the old hypercomplex structure.

4.2. Transformations of HKT-Potentials. Let $(M, \mathcal{I}, g)$ be a HKT-manifold with potential function $\mu$. The Kähler forms are given by $\Omega_a = \tfrac{1}{2}(dd_a + d_b d_c)\mu$. We consider HKT-structures generated by potential functions through $\mu$.

Theorem 4. Suppose $(M, \mathcal{I}, g)$ is a HKT-manifold with a potential function $\mu$. For any smooth function $f$ of one variable, let $U$ be the open subset of $M$ on which $\mu$ is defined and

$$f'(\mu) + \tfrac{1}{4} f''(\mu)|\nabla\mu|^2 > 0. \tag{26}$$

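Define a symmetric bilinear form $\hat g$ by

$$\hat g = f'(\mu)\, g + \tfrac{1}{4} f''(\mu)\big(d\mu \otimes d\mu + I_1 d\mu \otimes I_1 d\mu + I_2 d\mu \otimes I_2 d\mu + I_3 d\mu \otimes I_3 d\mu\big). \tag{27}$$

Then $(U, \mathcal{I}, \hat g)$ is a HKT-structure with $f(\mu)$ as its potential.

Proof. Since $\mu$ is a HKT-potential for the HKT-structure $(\mathcal{I}, g)$, $\Omega_2 + i\Omega_3 = 2\partial_1 I_2 \bar\partial_1\mu$. It follows that

$$\begin{aligned}
2\partial_1 I_2 \bar\partial_1 f &= 2\partial_1\big(f'(\mu)\, I_2 \bar\partial_1\mu\big) = 2f'(\mu)\,\partial_1 I_2 \bar\partial_1\mu + 2f''(\mu)\,\partial_1\mu \wedge I_2 \bar\partial_1\mu \\
&= f'(\mu)(\Omega_2 + i\Omega_3) + \tfrac{1}{2} f''(\mu)(d\mu + i d_1\mu) \wedge (I_2 d\mu - i I_2 d_1\mu).
\end{aligned}$$

When $\hat F_2$ and $\hat F_3$ are the real and imaginary parts of $2\partial_1 I_2 \bar\partial_1 f$ respectively, then

$$\hat F_2 = f'(\mu)\,\Omega_2 + \tfrac{1}{2} f''(\mu)\big(d\mu \wedge I_2 d\mu + d_1\mu \wedge I_2 d_1\mu\big). \tag{28}$$

It is now straightforward to verify that $-\hat F_2(I_2 X, Y) = \hat g(X, Y)$. Therefore, $\hat g$ together with the given hypercomplex structure defines a HKT-structure with the function $f$ as its potential so long as $\hat g$ is positive definite.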


Since $g$ is hyper-Hermitian, the vector fields $Y_0 = \nabla\mu$ and $Y_a = I_a\nabla\mu$ are mutually orthogonal with equal length. At any point where $Y_0$ is not the zero vector, we extend $\{Y_0, Y_1, Y_2, Y_3\}$ to an orthonormal frame with respect to the hyper-Kähler metric $g$. Any vector $X$ can be written as $X = a_0 Y_0 + a_1 Y_1 + a_2 Y_2 + a_3 Y_3 + X^\perp$, where $X^\perp$ is in the orthogonal complement of $\{Y_0, Y_1, Y_2, Y_3\}$. Note that $d\mu(X^\perp) = g(\nabla\mu, X^\perp) = 0$, and $I_a d\mu(X^\perp) = -g(\nabla\mu, I_a X^\perp) = g(I_a\nabla\mu, X^\perp) = 0$. Also, for $1 \le a \ne b \le 3$,

$$d\mu(Y_a) = g(\nabla\mu, I_a\nabla\mu) = 0, \qquad d\mu(Y_0) = |\nabla\mu|^2,$$
$$I_b d\mu(Y_a) = -g(\nabla\mu, I_b I_a\nabla\mu) = 0, \qquad I_a d\mu(Y_a) = -g(\nabla\mu, I_a^2\nabla\mu) = |\nabla\mu|^2.$$

Then

$$\hat g(X, X) = f'(\mu)\Big(\sum_{\ell=0}^{3} a_\ell^2\Big)|\nabla\mu|^2 + \frac{f''(\mu)}{4}\Big(\sum_{\ell=0}^{3} a_\ell^2\Big)|\nabla\mu|^4 = \Big(f'(\mu) + \frac{f''(\mu)}{4}|\nabla\mu|^2\Big)\Big(\sum_{\ell=0}^{3} a_\ell^2\Big)|\nabla\mu|^2.$$

Therefore, $\hat g$ is positive definite on the open set defined by the inequality (26).

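Note that for any positive integer $m$, $f(\mu) = \mu^m$ satisfies (26) whenever $\mu$ is positive. So does $f(\mu) = e^\mu$. Therefore, if $g$ is a HKT-metric with a positive potential function $\mu$, the following metrics are HKT-metrics:

$$g_m = m\mu^{m-2}\Big(\mu g + \frac{m-1}{4}\big(d\mu \otimes d\mu + I_1 d\mu \otimes I_1 d\mu + I_2 d\mu \otimes I_2 d\mu + I_3 d\mu \otimes I_3 d\mu\big)\Big),$$
$$g_\infty = e^\mu\Big(g + \frac{1}{4}\big(d\mu \otimes d\mu + I_1 d\mu \otimes I_1 d\mu + I_2 d\mu \otimes I_2 d\mu + I_3 d\mu \otimes I_3 d\mu\big)\Big).$$

4.3. Inhomogeneous HKT-Structures on $S^1 \times S^{4n-1}$. On the complex vector space $(\mathbb{C}^n \oplus \mathbb{C}^n)\setminus\{0\}$, let $(z_\alpha, w_\alpha)$, $1 \le \alpha \le n$, be its coordinates. We define a hypercomplex structure containing this complex structure as follows:

$$I_1 dz_\alpha = -i\,dz_\alpha, \quad I_1 dw_\alpha = -i\,dw_\alpha, \quad I_1 d\bar z_\alpha = i\,d\bar z_\alpha, \quad I_1 d\bar w_\alpha = i\,d\bar w_\alpha,$$
$$I_2 dz_\alpha = d\bar w_\alpha, \quad I_2 dw_\alpha = -d\bar z_\alpha, \quad I_2 d\bar z_\alpha = dw_\alpha, \quad I_2 d\bar w_\alpha = -dz_\alpha,$$
$$I_3 dz_\alpha = i\,d\bar w_\alpha, \quad I_3 dw_\alpha = -i\,d\bar z_\alpha, \quad I_3 d\bar z_\alpha = -i\,dw_\alpha, \quad I_3 d\bar w_\alpha = i\,dz_\alpha.$$

The function $\mu = \tfrac{1}{2}(|z|^2 + |w|^2)$ is the hyper-Kähler potential for the standard Euclidean metric:

$$g = \tfrac{1}{2}\big(dz_\alpha \otimes d\bar z_\alpha + d\bar z_\alpha \otimes dz_\alpha + dw_\alpha \otimes d\bar w_\alpha + d\bar w_\alpha \otimes dw_\alpha\big). \tag{29}$$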
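The action of $I_1$, $I_2$, $I_3$ on 1-forms listed above can be checked against the quaternion relations mechanically. The sketch below is only an illustration: it assumes the conjugations are placed so that each $I_a$ commutes with complex conjugation (so that it is a real operator), and it encodes each operator, for a single index $\alpha$, as a map from a basis form to a coefficient times a basis form. The names `compose`, `same` and `quaternion_relations_hold` are hypothetical.

```python
# Basis of complex 1-forms for one index alpha: dz, dw, dzbar, dwbar.
DZ, DW, DZB, DWB = range(4)

# Each structure sends a basis form to (coefficient, basis form).
I1 = {DZ: (-1j, DZ), DW: (-1j, DW), DZB: (1j, DZB), DWB: (1j, DWB)}
I2 = {DZ: (1, DWB), DW: (-1, DZB), DZB: (1, DW), DWB: (-1, DZ)}
I3 = {DZ: (1j, DWB), DW: (-1j, DZB), DZB: (-1j, DW), DWB: (1j, DZ)}

MINUS_ID = {e: (-1, e) for e in range(4)}


def compose(A, B):
    """Return the operator 'apply B first, then A', extended complex-linearly."""
    out = {}
    for e, (cb, t) in B.items():
        ca, u = A[t]
        out[e] = (ca * cb, u)
    return out


def same(A, B, tol=1e-12):
    """Compare two operators entry by entry."""
    return all(A[e][1] == B[e][1] and abs(A[e][0] - B[e][0]) < tol
               for e in range(4))


def quaternion_relations_hold():
    """Check I_a^2 = -1 for a = 1, 2, 3 and I_1 I_2 = I_3 = -I_2 I_1."""
    squares = all(same(compose(I, I), MINUS_ID) for I in (I1, I2, I3))
    product = same(compose(I1, I2), I3)
    anti = same(compose(I2, I1), compose(MINUS_ID, I3))
    return squares and product and anti
```

A dictionary encoding is enough here because every $I_a$ permutes the four basis forms up to a scalar; a general linear map would need a full matrix.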

Since $|\nabla\mu|^2 = 2\mu$, the function $f(\mu) = \ln\mu$ satisfies the inequality (26) on $\mathbb{C}^{2n}\setminus\{0\}$. By Theorem 4, $\ln\mu$ is the HKT-potential for a HKT-metric $\hat g$ on $\mathbb{C}^{2n}\setminus\{0\}$.
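For $f = \ln$ the left-hand side of (26) becomes $\frac{1}{\mu} - \frac{1}{4\mu^2}\cdot 2\mu = \frac{1}{2\mu}$, which is positive away from the origin. A small sketch evaluating the left-hand side of (26) for the three families of functions mentioned in this section is given below; the helper names (`lhs_26`, `log_case`, `power_case`, `exp_case`) are hypothetical and the derivatives are entered by hand.

```python
import math


def lhs_26(df, d2f, mu, grad_sq):
    """Return f'(mu) + (1/4) f''(mu) |grad mu|^2, the left-hand side of (26)."""
    return df(mu) + 0.25 * d2f(mu) * grad_sq


def log_case(mu):
    """f = ln on flat space, where |grad mu|^2 = 2 * mu."""
    return lhs_26(lambda t: 1.0 / t, lambda t: -1.0 / t**2, mu, 2.0 * mu)


def power_case(m, mu, grad_sq):
    """f(mu) = mu**m for a positive integer m and mu > 0."""
    return lhs_26(lambda t: m * t ** (m - 1),
                  lambda t: m * (m - 1) * t ** (m - 2),
                  mu, grad_sq)


def exp_case(mu, grad_sq):
    """f(mu) = exp(mu), whose first and second derivatives coincide."""
    return lhs_26(math.exp, math.exp, mu, grad_sq)
```

In the power and exponential cases both terms are non-negative for $\mu > 0$, which is why no relation between $\mu$ and $|\nabla\mu|^2$ is needed there; in the logarithmic case the second term is negative and the identity $|\nabla\mu|^2 = 2\mu$ is essential.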


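Next, for any real number $r$ with $0 < r < 1$, and $\theta_1, \ldots, \theta_n$ modulo $2\pi$, we consider the integer group $\langle r\rangle$ generated by the following action on $(\mathbb{C}^n \oplus \mathbb{C}^n)\setminus\{0\}$:

$$(z_\alpha, w_\alpha) \mapsto (r e^{i\theta_\alpha} z_\alpha,\; r e^{-i\theta_\alpha} w_\alpha). \tag{30}$$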

One can check that the group $\langle r\rangle$ is a group of hypercomplex transformations. As observed in [22], the quotient space of $(\mathbb{C}^n \oplus \mathbb{C}^n)\setminus\{0\}$ with respect to $\langle r\rangle$ is the manifold $S^1 \times S^{4n-1} = S^1 \times Sp(n)/Sp(n-1)$. Since the group $\langle r\rangle$ is also a group of isometries with respect to the HKT-metric $\hat g$ determined by $f(\mu) = \ln\mu$, the HKT-structure descends from $(\mathbb{C}^n \oplus \mathbb{C}^n)\setminus\{0\}$ to a HKT-structure on $S^1 \times S^{4n-1}$. Since the hypercomplex structures on $S^1 \times S^{4n-1}$ are parametrized by $(r, \theta_1, \ldots, \theta_n)$ and a generic hypercomplex structure in this family is inhomogeneous [22], we obtain a family of inhomogeneous HKT-structures on the manifold $S^1 \times S^{4n-1}$.

Theorem 5. Every hypercomplex deformation of the homogeneous hypercomplex structure on $S^1 \times S^{4n-1}$ admits a HKT-metric.

Furthermore, $\hat F_2 + i\hat F_3 = 2\partial_1 I_2 \bar\partial_1\mu$ descends to $S^1 \times S^{4n-1}$. However, the function $\mu$ does not descend to $S^1 \times S^{4n-1}$. Therefore, this (2,0)-form has a potential form $I_2\bar\partial_1\mu$ but not a globally defined potential function.

4.4. Associated Bundles of Quaternionic Kähler Manifolds. When $M$ is a quaternionic Kähler manifold, i.e. the holonomy of the Riemannian metric is contained in the group $Sp(n)\cdot Sp(1)$, the representation of $Sp(1)$ on the quaternions $\mathbb{H}$ defines an associated fiber bundle $\mathcal{U}(M)$ over the smooth manifold $M$ with $\mathbb{H}\setminus\{0\}/\mathbb{Z}_2$ as fiber. Swann finds that there is a hyper-Kähler metric $g$ on $\mathcal{U}(M)$ whose potential function $\mu$ is the length of the radius coordinate vector field along each fiber [27]. As in the last example, $\ln\mu$ is the potential function of a HKT-structure with metric $\hat g$. Again, the metric $\hat g$ and the hypercomplex structure are both invariant under fiberwise real scalar multiplication. Therefore, the HKT-structure with metric $\hat g$ descends to the compact quotients defined by integer groups generated by fiberwise real scalar multiplications.

5. Reduction

First of all, we recall the construction of hypercomplex reduction developed by Joyce [18].
Let $G$ be a compact group of hypercomplex automorphisms on $M$. Denote the algebra of hyper-holomorphic vector fields by $\mathfrak{g}$. Suppose that $\nu = (\nu_1, \nu_2, \nu_3) : M \to \mathbb{R}^3 \otimes \mathfrak{g}$ is a $G$-equivariant map satisfying the following two conditions. The Cauchy-Riemann condition: $I_1 d\nu_1 = I_2 d\nu_2 = I_3 d\nu_3$, and the transversality condition: $I_a d\nu_a(X) \ne 0$ for all nonzero $X \in \mathfrak{g}$. Any map satisfying these conditions is called a $G$-moment map. Given a point $\zeta = (\zeta_1, \zeta_2, \zeta_3)$ in $\mathbb{R}^3 \otimes \mathfrak{g}$, denote the level set $\nu^{-1}(\zeta)$ by $P$. Since the map $\nu$ is $G$-equivariant, level sets are invariant if the group $G$ is Abelian or if the point $\zeta$ is invariant. Assuming that the level set $P$ is invariant and that the action of $G$ on $P$ is free, the quotient space $N = P/G$ is a smooth manifold. Joyce proved that the quotient space $N = P/G$ inherits a natural hypercomplex structure [18]. His construction runs as follows. For each point $m$ in the space $P$, its tangent space is

$$T_m P = \{t \in T_m M : d\nu_1(t) = d\nu_2(t) = d\nu_3(t) = 0\}.$$


Consider the vector subspace

$$U_m = \{t \in T_m P : I_1 d\nu_1(t) = I_2 d\nu_2(t) = I_3 d\nu_3(t) = 0\}.$$

Due to the transversality condition, this space is transversal to the vectors generated by elements in $\mathfrak{g}$. Due to the Cauchy-Riemann condition, this space is a vector subspace of $T_m P$ with codimension $\dim\mathfrak{g}$, and hence it is a vector subspace of $T_m M$ with codimension $4\dim\mathfrak{g}$. The same condition implies that, as a subbundle of $TM|_P$, $U$ is closed under $I_a$. We call the distribution $U$ the hypercomplex distribution of the map $\nu$. Let $\pi : P \to N$ be the quotient map. For any tangent vector $v$ at $\pi(m)$, there exists a unique element $\tilde v$ in $U_m$ such that $d\pi(\tilde v) = v$. The hypercomplex structure on $N$ is defined by

$$I_a v = d\pi(I_a \tilde v), \qquad \text{i.e.} \qquad \widetilde{I_a v} = I_a \tilde v. \tag{31}$$

Theorem 6. Let $(M, \mathcal{I}, g)$ be a HKT-manifold. Suppose that $G$ is a compact group of hypercomplex isometries. Suppose that $\nu$ is a $G$-moment map such that along the invariant level set $P = \nu^{-1}(\zeta)$, the hypercomplex distribution $U$ is orthogonal to the Killing vector fields generated by the group $G$. Then the quotient space $N = P/G$ inherits a natural HKT-structure.

Proof. Under the condition of this theorem, the hypercomplex distribution along the level set $P$ is identical to the orthogonal distribution

$$H_m = \{t \in T_m P : g(t, X) = 0,\; X \in \mathfrak{g}\}.$$

Now, we define a metric structure $h$ at $T_{\pi(m)} N$ as follows. For $v, w \in T_{\pi(m)} N$,

$$h_{\pi(m)}(v, w) = g_m(\tilde v, \tilde w). \tag{32}$$

It is obvious that this metric on $N$ is hyper-Hermitian. To find the hyper-Kähler connection $D$ on the quotient space $N$, let $v$ and $w$ be locally defined vector fields on the manifold $N$. They lift uniquely to $G$-invariant sections $\tilde v$ and $\tilde w$ of the bundle $U$. As $U$ is a subbundle of the tangent bundle of $P$, and $P$ is a submanifold of $M$, we consider $\tilde v$ as a section of $TP$ and $\tilde w$ as a section of $TM|_P$. Restricting the hyper-Kähler connection $\nabla$ onto $P$, we consider $\nabla_{\tilde v}\tilde w$ as a section of $TM|_P$. Recall that there is a direct sum decomposition

$$TM|_P = U \oplus \mathfrak{g} \oplus I_1\mathfrak{g} \oplus I_2\mathfrak{g} \oplus I_3\mathfrak{g}. \tag{33}$$

Let $\theta$ be the projection from $TM|_P$ onto its direct summand $U$. Since $\mathfrak{g}$ is orthogonal to the distribution $U$, and $U$ is hypercomplex invariant, $\theta$ is an orthogonal projection. Define

$$D_v w := d\pi\big(\theta(\nabla_{\tilde v}\tilde w)\big), \qquad \text{i.e.} \qquad \widetilde{D_v w} = \theta(\nabla_{\tilde v}\tilde w). \tag{34}$$

Now we have to prove that it is a HKT-connection. We claim that the connection $D$ preserves the hypercomplex structure. This claim is equivalent to $D_v(I_a w) = I_a D_v w$. Lifting to $U$, it is equivalent to

$$\theta(\nabla_{\tilde v} I_a\tilde w) = I_a\,\theta(\nabla_{\tilde v}\tilde w).$$

Since the direct sum decomposition is invariant under the hypercomplex structure, the projection map $\theta$ is hypercomplex. Therefore, it commutes with the complex structures. Then the above identity is equivalent to

$$\theta(\nabla_{\tilde v} I_a\tilde w) = \theta(I_a\nabla_{\tilde v}\tilde w).$$

This identity holds because $\nabla$ is hypercomplex.


To verify that the connection $D$ preserves the Riemannian metric $h$, let $u$, $v$, and $w$ be vector fields on $N$. The identity $u\,h(v, w) - h(D_u v, w) - h(v, D_u w) = 0$ is equivalent to the following identity on $P$:

$$\tilde u\, g(\tilde v, \tilde w) - g(\theta(\nabla_{\tilde u}\tilde v), \tilde w) - g(\tilde v, \theta(\nabla_{\tilde u}\tilde w)) = 0.$$

Since $\theta$ is the orthogonal projection along $\mathfrak{g}$, the above identity is equivalent to

$$\tilde u\, g(\tilde v, \tilde w) - g(\nabla_{\tilde u}\tilde v, \tilde w) - g(\tilde v, \nabla_{\tilde u}\tilde w) = 0.$$

This identity on $P$ is satisfied because $\nabla$ is a HKT-connection.

Finally, we have to verify that the torsion of the connection $D$ is totally skew-symmetric. By definition and the fact that $\theta$ is an orthogonal projection, the torsion of $D$ is

$$T^D(u, v, w) = g(\nabla_{\tilde u}\tilde v, \tilde w) - g(\nabla_{\tilde v}\tilde u, \tilde w) - g(\widetilde{[u, v]}, \tilde w).$$

Note that $[\tilde u, \tilde v]$ is a vector tangent to $P$ such that $d\pi \circ \theta([\tilde u, \tilde v]) = [d\pi(\tilde u), d\pi(\tilde v)] = [u, v]$. Therefore, $[\tilde u, \tilde v]$ and $\widetilde{[u, v]}$ differ by a vector in $\mathfrak{g}$. Since the Killing vector fields are orthogonal to the hypercomplex distribution,

$$g(\widetilde{[u, v]}, \tilde w) = g([\tilde u, \tilde v], \tilde w).$$

Then we have

$$T^D(u, v, w) = T^\nabla(\tilde u, \tilde v, \tilde w).$$

This is totally skew-symmetric because the connection $\nabla$ is the Bismut connection on $M$.

Suppose that the group $G$ is one-dimensional. Let $X$ be the Killing vector field generated by $G$. The hypercomplex distribution $U$ and the horizontal distribution $H$ are identical if and only if the 1-forms $I_1 d\nu_1 = I_2 d\nu_2 = I_3 d\nu_3$ are pointwise proportional to the 1-form $\iota_X g$ along the level set $P$, i.e. for any tangent vector $Y$ to $P$, $I_a d\nu_a(Y) = f\, g(X, Y)$, or equivalently $d\nu_a = f\,\iota_X F_a$. In the next example, we shall make use of this observation.

5.1. Example: HKT-Structure on $V\mathbb{CP}^2 = S^1 \times (SU(3)/U(1))$. We construct a HKT-structure on $V\mathbb{CP}^2$ by a U(1)-reduction from a HKT-structure on $\mathbb{H}^3\setminus\{0\}$. Choose a hypercomplex structure on $\mathbb{R}^{12} \cong \mathbb{C}^3 \oplus \mathbb{C}^3$ by

$$I_1(\chi, E) = (i\chi, -iE), \qquad I_2(\chi, E) = (iE, i\chi), \qquad I_3(\chi, E) = (-E, \chi). \tag{35}$$
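The quaternion relations for the three maps in (35), and a choice of holomorphic coordinates for each of them, can be verified on sample vectors. The sketch below is an illustration under two assumptions that are not spelled out in the text: the maps in (35) are taken to be complex-linear exactly as written, and a coordinate function $f$ is called holomorphic for $I$ when $f(Iv) = i f(v)$. With that convention the coordinates come out as $(\chi, \bar E)$, $(\chi + E, \bar\chi - \bar E)$ and $(E - i\chi, \bar E - i\bar\chi)$. All function names are hypothetical.

```python
def I1(v):
    chi, E = v
    return ([1j * x for x in chi], [-1j * x for x in E])


def I2(v):
    chi, E = v
    return ([1j * x for x in E], [1j * x for x in chi])


def I3(v):
    chi, E = v
    return ([-x for x in E], list(chi))


def neg(v):
    """Multiply a pair (chi, E) by -1."""
    return tuple([-x for x in part] for part in v)


def close(v, w, tol=1e-12):
    """Compare two pairs (chi, E) entry by entry."""
    return all(abs(a - b) < tol for x, y in zip(v, w) for a, b in zip(x, y))


# Candidate holomorphic coordinates for I1, I2, I3 (six functions each).
def coords1(v):
    chi, E = v
    return list(chi) + [e.conjugate() for e in E]


def coords2(v):
    chi, E = v
    return [c + e for c, e in zip(chi, E)] + [(c - e).conjugate() for c, e in zip(chi, E)]


def coords3(v):
    chi, E = v
    return ([e - 1j * c for c, e in zip(chi, E)]
            + [e.conjugate() - 1j * c.conjugate() for c, e in zip(chi, E)])


def is_holomorphic(coords, I, v, tol=1e-12):
    """Check f(I v) = i f(v) for every coordinate function f in coords."""
    return all(abs(a - 1j * b) < tol for a, b in zip(coords(I(v)), coords(v)))
```

Because every map involved is real-linear, checking the identities on one generic vector is a reasonable sanity test, although it is of course not a proof.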

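It is apparent that the holomorphic coordinates with these complex structures are $(\chi, \bar E)$, $(\chi + E, \bar\chi - \bar E)$, and $(E - i\chi, \bar E - i\bar\chi)$ respectively. As in 4.3, the hyper-Kähler potential for the Euclidean metric $g$ on $(\mathbb{C}^3 \oplus \mathbb{C}^3)\setminus\{0\}$ is $\mu = \tfrac{1}{2}(|\chi|^2 + |E|^2)$. We apply Proposition 4 to $f(\mu) = \ln\mu$, using (27) with $f'(\mu) = 1/\mu$ and $f''(\mu) = -1/\mu^2$, to obtain a new HKT-metric

$$\hat g = \frac{1}{\mu}\, g - \frac{1}{4\mu^2}\big(d\mu \otimes d\mu + I_1 d\mu \otimes I_1 d\mu + I_2 d\mu \otimes I_2 d\mu + I_3 d\mu \otimes I_3 d\mu\big). \tag{36}$$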

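Define a hypercomplex moment map $\nu = (\nu_1, \nu_2, \nu_3)$ by

$$\nu_1(\chi, E) = |\chi|^2 - |E|^2, \qquad (\nu_2 + i\nu_3)(\chi, E) = 2\langle\chi, E\rangle, \tag{37}$$

where $\langle\,,\rangle$ is a Hermitian inner product on $\mathbb{C}^3$. Let $\Gamma \cong U(1)$ be the one-parameter group acting on $(\mathbb{C}^3 \oplus \mathbb{C}^3)\setminus\{0\}$ defined by

$$(t; (\chi, E)) \mapsto (e^{it}\chi, e^{it}E). \tag{38}$$

Let $\langle r\rangle$ be the integer group generated by a real number $r$ between 0 and 1. It acts on $(\mathbb{C}^3 \oplus \mathbb{C}^3)\setminus\{0\}$ by

$$(n; (\chi, E)) \mapsto (r^n\chi, r^n E). \tag{39}$$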
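How the moment map (37) behaves under the two actions (38) and (39) can be seen in a short numerical sketch: the U(1)-action leaves $\nu$ unchanged, while the scaling multiplies it by $r^{2n}$, so in both cases the zero level set is preserved. The code below assumes a Hermitian inner product that is conjugate-linear in the second argument (the text does not fix the convention), and the function names are hypothetical.

```python
import cmath


def herm(u, v):
    """Hermitian inner product on C^3, conjugate-linear in the second slot (assumed)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def nu(chi, E):
    """Return (nu_1, nu_2, nu_3) as in (37)."""
    n1 = herm(chi, chi).real - herm(E, E).real
    n23 = 2 * herm(chi, E)
    return (n1, n23.real, n23.imag)


def act_gamma(t, chi, E):
    """The U(1)-action (38): both components are rotated by exp(i t)."""
    phase = cmath.exp(1j * t)
    return [phase * x for x in chi], [phase * x for x in E]


def act_scale(n, r, chi, E):
    """The integer action (39): both components are scaled by r**n."""
    s = r ** n
    return [s * x for x in chi], [s * x for x in E]


def close3(a, b, tol=1e-9):
    """Compare two triples of real numbers."""
    return all(abs(x - y) < tol for x, y in zip(a, b))
```

Invariance under (38) depends only on the inner product being sesquilinear, so the choice of which slot is conjugate-linear does not affect the check.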

Both $\Gamma$ and $\langle r\rangle$ are groups of hypercomplex automorphisms leaving the zero level set of $\nu$ invariant. Then the quotient space $\nu^{-1}(0)/\Gamma$ is a hypercomplex reduction. The discrete quotient space $V = \nu^{-1}(0)/\Gamma \times \langle r\rangle$ is a compact hypercomplex manifold. From the homogeneity of the metric $\hat g$, we see that both $\Gamma$ and the discrete group $\langle r\rangle$ are groups of isometries for the metric $\hat g$. Therefore, the quotient space $V$ inherits a hyper-Hermitian metric. On $(\mathbb{C}^3 \oplus \mathbb{C}^3)\setminus\{0\}$, the real vector field generated by the group $\Gamma$ is

$$X = i\chi\frac{\partial}{\partial\chi} - i\bar\chi\frac{\partial}{\partial\bar\chi} - i\bar E\frac{\partial}{\partial\bar E} + iE\frac{\partial}{\partial E}.$$

Let $\hat F_a$ be the Kähler form for the HKT-metric $\hat g$. We check that $d\nu_a = -2\mu\,\iota_X\hat F_a$. Therefore, Theorem 6 implies that the quotient space $V$ inherits a HKT-structure.

Note that if $(\chi, E)$ is a point in the zero level set, then it represents a pair of orthogonal vectors. Therefore, the triple $\big(\frac{\chi}{|\chi|}, \frac{E}{|E|}, \frac{\chi}{|\chi|} \times \frac{E}{|E|}\big)$ forms an element in the matrix group SU(3). The action of $\Gamma$ induces an action on U(3) by the left multiplication of $\mathrm{Diag}(e^{it}, e^{it}, e^{-2it})$. Denote the $\Gamma$-coset of $\big(\frac{\chi}{|\chi|}, \frac{E}{|E|}, \frac{\chi}{|\chi|} \times \frac{E}{|E|}\big)$ by $\big[\frac{\chi}{|\chi|}, \frac{E}{|E|}, \frac{\chi}{|\chi|} \times \frac{E}{|E|}\big]$. The quotient space $V$ is isomorphic to the product space $S^1 \times SU(3)/U(1)$. The quotient map is

$$(\chi, E) \mapsto \Big(\exp\Big(2\pi i\,\frac{\ln|\chi|}{\ln r}\Big),\; \Big[\frac{\chi}{|\chi|}, \frac{E}{|E|}, \frac{\chi}{|\chi|} \times \frac{E}{|E|}\Big]\Big).$$

Remark. A fundamental question on HKT-structures remains open. Does every hypercomplex manifold admit a metric such that it is a HKT-structure?

Acknowledgements. We thank G. W. Gibbons for introducing the topic in this paper to us and P. Gauduchon and V. Apostolov for useful conversations. The second named author thanks J.-P. Bourguignon for providing an excellent research environment at the Institut des Hautes Études Scientifiques. The first named author thanks the “Abdus Salam International Center for Theoretical Physics” where the final part of this work was done.

References

1. Barberis, M.L., Dotti Miatello, I.G. and Miatello, R.J.: On certain locally homogeneous Clifford manifolds. Ann. Global Anal. Geom. 13, 289–301 (1995)
2. Besse, A.: Einstein Manifolds, Ergebnisse der Mathematik und ihrer Grenzgebiete, 3. Folge 10, New York: Springer-Verlag, 1987
3. Bismut, J.-M.: A local index theorem for non-Kähler manifolds. Math. Ann. 284, 681–699 (1989)
4. Boyer, C.: A note on hyperhermitian four-manifolds. Proc. Amer. Math. Soc. 102, 157–164 (1988)
5. Boyer, C., Galicki, K. and Mann, B.: Hypercomplex structures on Stiefel manifolds. Ann. Global Anal. Geom. 14, 81–105 (1996)
6. Curtright, T.L. and Freedman, D.Z.: Nonlinear σ-models with extended supersymmetry in four dimensions. Phys. Lett. B 90, 71 (1980)
7. Capria, M.M. and Salamon, S.M.: Yang–Mills fields on quaternionic spaces. Nonlinearity 1, 517–530 (1988)
8. Dotti, I.G. and Fino, A.: Abelian hypercomplex 8-dimensional nilmanifolds. Preprint
9. Eells, J. and Salamon, S.M.: Twistorial construction of harmonic maps of surfaces into four manifolds. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 12, 589–640 (1985)
10. Galicki, K. and Poon, Y.S.: Duality and Yang–Mills fields on quaternionic Kähler manifolds. J. Math. Phys. 32, 1529–1543 (1991)
11. Gates, S.J., Hull, C.M. and Roček, M.: Twisted multiplets and new supersymmetric nonlinear sigma models. Nucl. Phys. B 248, 157–186 (1984)
12. Gauduchon, P.: Hermitian connections and Dirac operators. Bollettino U.M.I. B 11, 257–288 (1997)
13. Gauduchon, P. and Tod, K.P.: Hyper-Hermitian metrics with symmetry. J. Geom. Phys. 25, 291–304 (1998)
14. Gibbons, G.W., Papadopoulos, G. and Stelle, K.S.: HKT and OKT geometries on soliton black hole moduli spaces. Nucl. Phys. B 508, 623–658 (1997)


15. Hitchin, N.J., Karlhede, A., Lindström, U. and Roček, M.: Hyper-Kähler metrics and supersymmetry. Commun. Math. Phys. 108, 535–589 (1987)
16. Howe, P.S. and Papadopoulos, G.: Holonomy groups and W-symmetries. Commun. Math. Phys. 151, 467–480 (1993)
17. Howe, P.S. and Papadopoulos, G.: Twistor spaces for hyper-Kähler manifolds with torsion. Phys. Lett. B 379, 80–86 (1996)
18. Joyce, D.: The hypercomplex quotient and quaternionic quotient. Math. Ann. 290, 323–340 (1991)
19. Joyce, D.: Compact hypercomplex and quaternionic manifolds. J. Differential Geom. 35, 743–761 (1992)
20. Opfermann, A. and Papadopoulos, G.: Homogeneous HKT and QKT manifolds. Preprint, math-ph/9807026
21. Pedersen, H. and Poon, Y.S.: Deformations of hypercomplex structures. J. reine angew. Math. 499, 81–99 (1998)
22. Pedersen, H. and Poon, Y.S.: Inhomogeneous hypercomplex structures on homogeneous manifolds. J. reine angew. Math. 516, 159–181 (1999)
23. Pedersen, H., Poon, Y.S. and Swann, A.F.: Hypercomplex structures associated to quaternionic manifolds. Differential Geom. Appl. 9(3), 273–292 (1998)
24. Salamon, S.M.: Differential geometry of quaternionic manifolds. Ann. scient. Éc. Norm. Sup. 4e série, 19, 31–55 (1986)
25. Salamon, S.M.: Complex structures on nilpotent Lie algebras. Preprint, math.DG/9808025
26. Spindel, Ph., Sevrin, A., Troost, W. and Van Proeyen, A.: Extended supersymmetric σ-models on group manifolds. Nucl. Phys. B 308, 662–698 (1988)
27. Swann, A.F.: HyperKähler and quaternionic Kähler geometry. Math. Ann. 289, 421–450 (1991)
28. Zumino, B.: Supersymmetry and Kähler manifolds. Phys. Lett. B 87, 203–206 (1979)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 213, 39 – 125 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Spectral and Scattering Theory of Spatially Cut-Off P(ϕ)₂ Hamiltonians

J. Dereziński¹, C. Gérard²

¹ Department of Mathematical Methods in Physics, Warsaw University, Hoża 74, 00-682 Warszawa, Poland
² Centre de Mathématiques, UMR 7640 CNRS, Ecole Polytechnique, 91128 Palaiseau Cedex, France

Received: 25 June 1999 / Accepted: 9 February 2000

Abstract: We study spatially cut-off P(ϕ)₂ Hamiltonians. We show the local finiteness of the pure point spectrum outside of thresholds, the limiting absorption principle and asymptotic completeness of scattering for such Hamiltonians. Our results imply the absence of singular continuous spectrum.

1. Introduction

1.1. P(ϕ)₂ models in quantum field theory. Models of quantum field theory used by physicists to describe basic interactions, although very successful experimentally, are defined only in a formal and perturbative way. In 1952 Wightman and Gårding formulated a set of axioms, which, at least at that time, seemed to constitute a rather general mathematical framework for a physically acceptable QFT of basic interactions. In particular, these axioms satisfied the requirements of relativistic covariance and causal locality. It was hoped that physically realistic models of QFT can be interpreted in a mathematically consistent way and that they can be shown to satisfy axioms similar to Wightman axioms. At that time no examples of theories satisfying Wightman axioms were known except for free fields, which are in a sense trivial both from the physical and mathematical point of view.

It is not difficult to give a list of non-trivial QFT models which on a formal level satisfy Wightman axioms. These models can be ordered according to their difficulty, and physically realistic models in 3+1 dimensions are quite high on this list. Wightman proposed to construct these models one by one and check whether they satisfy the axioms he formulated, starting with the easiest (but, unfortunately, non-physical) ones. Thus began one of the most famous chapters of mathematical physics: constructive quantum field theory.

The simplest class of models in the Wightman program were the so-called P(ϕ)₂ models, that is, the models of self-interacting bosons in 2 space-time dimensions with the interaction given by a semibounded polynomial P(ϕ) of degree at least 4. The construction of these models was one of the early successes of constructive field theory. A number of different constructions were given. One of the approaches (in fact, the one that was used in the earliest works) started with considering a spatially cut-off P(ϕ)₂ interaction, where the cutoff is defined using a positive coupling function g(x), which decays sufficiently fast at infinity. One can then define the Hamiltonian

$$H := H_0 + \int g(x) :\!P(\varphi(x))\!:\, dx, \tag{1.1}$$

as a semibounded self-adjoint operator on the Fock space $\Gamma(L^2(\mathbb{R}))$, where $H_0 = d\Gamma(\sqrt{k^2 + m^2})$ is the free Hamiltonian. Operator (1.1) is called the spatially cut-off P(ϕ)₂ Hamiltonian. The next step is to show that, as g(x) → 1, one obtains a limiting dynamics which acts in a different, renormalized Hilbert space and satisfies the Wightman axioms.

The Hamiltonian H will be the main subject of our paper. We will always assume that $g \in L^1(\mathbb{R})$; for most results we will also need some additional assumptions on the decay and differentiability of g.

The program of constructive field theory has not attained its original goal of constructing a physically realistic and mathematically rigorous model satisfying the conditions of covariance and locality in 4 space-time dimensions. To our knowledge, the models that have been constructed, including P(ϕ)₂, do not describe any real physical systems. Nevertheless, we believe that the heritage of constructive field theory is a source of models and techniques that are very interesting both physically and mathematically.

One could ask what the reasons are to look at the Hamiltonians (1.1). One of them is historic: as we tried to sketch above, these Hamiltonians played an important role in the development of constructive field theory and there is a considerable literature on this subject. Unfortunately, (1.1) is not relativistic, since it is not even translation invariant.
Nevertheless, it has a certain remarkable property: it satisfies the axiom of the causal locality, more precisely, if one defines a net of local algebras in the sense of Haag–Kastler with the help of H , then this net is causally local. Another reason is that spatially cutoff P (ϕ)2 Hamiltonians can be viewed as examples of Schrödinger operators in infinite dimension. Studying such Hamiltonians is a good occasion to test various advanced tools of functional analysis and sheds light on the mathematical structure of quantum field theory. This point of view was advocated in Simon’s survey [Si2], where a number of mathematical questions concerning the spectral theory of P (ϕ)2 Hamiltonians are formulated. 1.2. Content of this paper. In the present paper we extend methods developed for N body Schrödinger operators to study spatially cut-off P (ϕ)2 Hamiltonians. Our results include 1) local finiteness of the pure point spectrum outside of the thresholds, 2) the limiting absorption principle, 3) asymptotic completeness. Note that 2) and 3) imply the absence of the singular continuous spectrum. (The properties 2) and 3) are proven under different assumptions, neither of which implies the other.) Recently a number of papers appeared that study other models that belong to a broadly understood QFT [AH, BFS, BFSS, DG1, Ge, HuSp1, HuSp2, JP1, JP2, Sk, Sp1, Sp2]. The

Spectral and Scattering Theory of Spatially Cut-Off P (ϕ)2 Hamiltonians


Hamiltonians studied in these papers are sometimes called Pauli–Fierz Hamiltonians. They are non-relativistic, non-local and they have little to do with the Wightman program. Nevertheless, they are physically relevant and of significant mathematical interest. One of these papers, [DG1], can be viewed as a predecessor of this paper. It is devoted to massive Pauli–Fierz Hamiltonians and it contains results similar to those contained in this paper (except for the limiting absorption principle, which, however, could be easily shown in the context of [DG1]). It should be noted that there are many analogies between this paper and [DG1]. Massive Pauli–Fierz Hamiltonians and spatially cut-off $P(\varphi)_2$ models share many characteristics; in particular, the basic framework of scattering theory is essentially the same. Both classes are examples of QFT Hamiltonians with localized interactions. Nevertheless, the technical difficulties of this paper are more serious than those of [DG1]. This is partly due to the fact that it is much more difficult to define a $P(\varphi)_2$ Hamiltonian than a Pauli–Fierz Hamiltonian. In the case of Pauli–Fierz Hamiltonians considered in [DG1], the perturbation is relatively bounded, which is not true in the case of (1.1). These problems become especially apparent when one considers the Mourre estimate, which requires a rather careful treatment and in many respects is more difficult than in the case of Schrödinger operators. In fact, the original theory of Mourre [Mo] does not seem to be applicable in the case of $H$ and we need to apply its more sophisticated version contained in [ABG]. The key idea of the approach of [ABG] is the property of $C^1(A)$ regularity of a Hamiltonian $H$ with respect to a unitary group $e^{isA}$, which fortunately can be verified in the case of $P(\varphi)_2$ Hamiltonians.
The main tools that we use in the study of $H$ are the treatment of the interaction as an operator of multiplication in the Q-representation and the higher order estimates due to Rosen. These tools were developed in the early years of constructive field theory [Ne, Ro1, Ro2, Se, S-H.K]. Note that these tools are not needed in the case of the technically simpler Pauli–Fierz Hamiltonians considered in [DG1]. Another difference between this paper and [DG1] is a significant simplification of the proof of asymptotic completeness and a different proof of the Fock property of asymptotic fields. Our paper can be divided into two parts. The first part, which consists of Sects. 2, 3, 4, 5, describes the general formalism of CCR representations, bosonic Fock spaces and the Q-space representation. Our presentation is quite general and at some points its generality goes beyond what we need in the case of the Hamiltonians (1.1). Actually, when one considers some other models of QFT (such as those with massless particles) one needs the formalism in its more general form (see for instance Theorem 4.3, which allows for a non-Fock component of CCR representations). Most of the material of these sections can be found in the literature; notably Sects. 2 and 4 follow quite closely [BR] and Sect. 5 follows [S-H.K, Si1]. Nevertheless, our presentation has some modifications and improvements as compared, for example, to that in [BR] and we believe that the reader will find it useful, especially since it is compact and essentially self-contained. A considerable effort has been devoted to developing a concise notation for operators in Fock spaces. Some elements of this notation are standard (due in particular to I. Segal), others were introduced in [DG1]. In [DG1] we did not need to consider Wick polynomials, which play an important role in this paper. We devote special attention to the properties of Wick polynomials in Subsect. 3.12, and also in the context of the Q-representation in Subsect. 5.2.
Note that the calculus and notation in the literature on QFT can be quite cumbersome and ad hoc, which we wanted to avoid.


J. Dereziński, C. Gérard

The second part of our paper is devoted to the study of spatially cutoff $P(\varphi)_2$ Hamiltonians. In Sect. 6 we introduce spatially cut-off $P(\varphi)_2$ Hamiltonians and we describe their basic properties, following e.g. [S-H.K]. Among the most difficult results about such Hamiltonians are the so-called higher order estimates due to Rosen. They are described with some of their consequences in Sect. 7. Strictly speaking, their proof contained in [Ro2] does not cover the class of Hamiltonians that we consider. Therefore, we indicate how to modify the arguments of [Ro2] to cover our class of coupling functions $g$. In Sect. 8 we study the commutator of $H$ with the second quantized generator of dilations $A$. The operator $A$ will play the role of a conjugate operator in the Mourre theory. The abstract framework of this section is based on [ABG], where a theory of the $C^1(A)$ property is developed. Such a careful treatment of this question was not needed either in [DG1] or in the case of N-particle Schrödinger operators. The case of the $\varphi^4_2$ model, i.e. the case when $P$ is a polynomial of degree 4, is simpler. For example, the construction of the space cutoff $\varphi^4_2$ model can be done without using the Q-representation (see [GJ2]). Similarly the Mourre theory in the $\varphi^4_2$ case can be treated in a simpler way, under weaker conditions on the cutoff function $g$. Section 9 is devoted to the spectral theory of $P(\varphi)_2$ Hamiltonians. The analog of the HVZ theorem is proven in Subsect. 9.1. This result was first proven in [GJ4, S-H.K]; we give a different proof (analogous to the one given in [DG1]), which is essentially a by-product of the techniques that we develop in our paper for other purposes. In Subsect. 9.2 we prove the Mourre estimate for $H$. The proof is similar to the one contained in [DG1]. The Mourre estimate implies the local finiteness of the pure point spectrum outside of the thresholds. The set of thresholds is defined as $\{\lambda + nm \mid n = 1, 2, \ldots,\ \lambda \in \sigma_{pp}(H)\}$, where $\sigma_{pp}(H)$ denotes the pure point spectrum of $H$. Note that this result implies that the pure point spectrum of $H$ is contained in a closed set of measure zero. Under stronger conditions on the coupling function, we can also show the limiting absorption principle, which implies the absence of singular continuous spectrum. More precisely, we show the existence of the boundary value of the resolvent on the real line
$$\lim_{\epsilon \downarrow 0} \langle A\rangle^{-\mu} (\lambda + i\epsilon - H)^{-1} \langle A\rangle^{-\mu},$$

where $A$ is the conjugate operator and $\mu > \frac{1}{2}$. In Sect. 10 we study the scattering theory for spatially cut-off $P(\varphi)_2$ Hamiltonians. The basic objects of scattering theory in the context of this paper are the asymptotic fields, that is, the limits of the field operators in the interaction picture:
$$a^{\pm,\sharp}(g) := \text{s-}\lim_{t\to\pm\infty} e^{itH} a^{\sharp}(g_t) e^{-itH},$$
where $g_t := e^{-it\sqrt{k^2+m^2}}\, g$ and $a^{\sharp}$ stands for either $a^*$ or $a$. We prove the existence of the asymptotic fields and show that they realize a CCR representation satisfying the Fock property. This result was first proven in [HK]. Up to technical details due to a more singular character of the interaction, the proof of the existence of asymptotic fields follows the proof of the analogous result in [DG1]. The proof of the Fock property is based on the general theory of CCR representations. Its main ingredient is the concept of the number operator associated to the regular CCR representation described in Sect. 4. Note that the proof of the Fock property contained in [DG1] was different; it was closer to the original argument of [HK].


With the CCR representations given by $a^{\pm,\sharp}(g)$ one can associate the spaces of asymptotic vacua $\mathcal{K}^{\pm}$, that is, the states annihilated by asymptotic annihilation operators. The Fock property is equivalent to saying that vectors of the form $a^{\pm *}(g_1) \cdots a^{\pm *}(g_n)\psi$, where $\psi \in \mathcal{K}^{\pm}$, span the whole Hilbert space $\mathcal{H}$. It is easy to see that the bound states are contained in the spaces of the asymptotic vacua $\mathcal{K}^{\pm}$. The property of asymptotic completeness means that the spaces $\mathcal{K}^{\pm}$ are equal to the space of bound states of $H$. This property is formulated at the end of Sect. 10. Among its consequences are the fact that the asymptotic vacua at $t = -\infty$ and at $t = +\infty$ coincide and the justification of the formalism of asymptotic states commonly used by physicists. In Sect. 11 we describe various propagation estimates. Their proofs do not differ substantially from the Pauli–Fierz case and we refer to [DG1] for most of them. In Sect. 12 we prove asymptotic completeness, that is, we show that the space of asymptotic vacua equals the pure point subspace of $H$. In principle, in this section we could repeat almost verbatim the arguments of the analogous section of [DG1]. Nevertheless, we simplify substantially the arguments of [DG1]. The major difference is that in [DG1] we used operators $P_k$, $Q_k$ and their asymptotic counterparts. In this paper we avoid using them, and the main role is played by the operators $\Gamma^{+}(q)$ (describing something similar to the asymptotic velocity) and the inverse wave operators $W^{+}(j)$ (both of these objects were also used in [DG1]). Note in passing that the absence of the operators $P_k$ in the present paper has its price: it seems that one needs them to show a certain interesting intermediate result concerning the inverse wave operators $W^{+}(j)$ (see [DG1, Thm 7.13]), which, fortunately, is not needed for the proof of asymptotic completeness itself.
The methods of this paper can be applied to other models of QFT with a localized interaction and a massive dispersion relation. In particular, one can use ideas of this paper to simplify some of the arguments of [DG1]. As we mentioned earlier, the $P(\varphi)_2$ models are the simplest nontrivial models considered in constructive field theory. Still, their treatment requires a lot of care and involves a number of techniques which go beyond the problems usually encountered in quantum mechanics and PDEs. Even more difficult and more interesting problems arise when one considers other models of constructive field theory such as $Y_2$ or $\lambda\varphi^4_3$. It would be interesting to extend our results to spatially cut-off versions of these models. We believe that it is feasible, since they are also models with a localized interaction and a massive dispersion relation. In particular, the framework of scattering theory for these models is essentially the same as the one considered in this paper. The main new difficulty would be the various renormalization procedures needed to define these Hamiltonians.

2. CCR Representations

We recall some standard facts on CCR representations.

2.1. Weyl operators. Let $\mathcal{H}$ be a Hilbert space. Let $\mathfrak{g}$ be a real vector space equipped with an antisymmetric form $\sigma$. A map
$$\mathfrak{g} \ni h \mapsto W_\pi(h) \in B(\mathcal{H}) \tag{2.1}$$


is a representation of the canonical commutation relations (in short a CCR representation) over $\mathfrak{g}$ in $\mathcal{H}$, if
$$W_\pi(h_1) W_\pi(h_2) = e^{-i\sigma(h_1,h_2)/2}\, W_\pi(h_1+h_2), \qquad W_\pi^*(h) = W_\pi(-h), \qquad W_\pi(0) = 1. \tag{2.2}$$
Note that, as a consequence, the $W_\pi(h)$ are unitary and we have
$$W_\pi(h_1) W_\pi(h_2) = e^{-i\sigma(h_1,h_2)}\, W_\pi(h_2) W_\pi(h_1). \tag{2.3}$$
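The passage from (2.2) to (2.3) can be spelled out in a few lines. The following is only a sketch for the reader's convenience; it uses nothing beyond the product rule in (2.2) and the antisymmetry of $\sigma$ (so that $\sigma(h,h) = 0$ and $\sigma(h_2,h_1) = -\sigma(h_1,h_2)$).

```latex
\begin{align*}
% unitarity: apply the product rule with h_2 = -h_1 = -h and use sigma(h,h) = 0
W_\pi(h)\,W_\pi^*(h) &= W_\pi(h)\,W_\pi(-h)
   = e^{\,i\sigma(h,h)/2}\,W_\pi(0) = 1, \\
% the product rule written in both orders
W_\pi(h_1)W_\pi(h_2) &= e^{-i\sigma(h_1,h_2)/2}\,W_\pi(h_1+h_2), \\
W_\pi(h_2)W_\pi(h_1) &= e^{-i\sigma(h_2,h_1)/2}\,W_\pi(h_1+h_2)
   = e^{+i\sigma(h_1,h_2)/2}\,W_\pi(h_1+h_2), \\
% eliminate W(h_1+h_2) between the last two lines
\Longrightarrow\quad
W_\pi(h_1)W_\pi(h_2) &= e^{-i\sigma(h_1,h_2)}\,W_\pi(h_2)W_\pi(h_1).
\end{align*}
```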

2.2. Field operators. We say that the CCR representation (2.1) is regular, if
$$t \mapsto W_\pi(th) \ \text{ is strongly continuous, for any } h \in \mathfrak{g}.$$
From now on, we assume that we are given a regular representation. By the Stone theorem, for any $h \in \mathfrak{g}$ we can define the corresponding field operator
$$\phi_\pi(h) := -i\, \frac{d}{dt} W_\pi(th)\Big|_{t=0}.$$

The following proposition is well known (see e.g. [BR]).

Proposition 2.1. i) In the sense of a quadratic form on $D(\phi_\pi(h_1)) \cap D(\phi_\pi(h_2))$ the Heisenberg commutation relations are satisfied:
$$[\phi_\pi(h_1), \phi_\pi(h_2)] = i\sigma(h_1, h_2). \tag{2.4}$$
ii) $W_\pi(g)$ leaves invariant $D(\phi_\pi(h))$ and
$$[\phi_\pi(h), W_\pi(g)] = i\sigma(g, h)\, W_\pi(g). \tag{2.5}$$
iii) Let $\mathfrak{f}$ be a finite dimensional subspace of $\mathfrak{g}$. Then $\mathfrak{f} \ni f \mapsto W_\pi(h + f)$ is strongly continuous for any $h \in \mathfrak{g}$.
iv) If $\mathfrak{f}$ is a finite dimensional subspace of $\mathfrak{g}$, then the intersection of $D(\phi_\pi(h_p) \cdots \phi_\pi(h_1))$, $h_1, \ldots, h_p \in \mathfrak{f}$, $p \in \mathbb{N}$, is dense in $\mathcal{H}$.


2.3. Creation and annihilation operators. From now on we assume that $\mathfrak{g}$ is equipped with a complex structure (that is, an $\mathbb{R}$-linear operator $i : \mathfrak{g} \to \mathfrak{g}$ with $i^2 = -1$). We assume that $\sigma$ and $i$ are compatible in the following sense:
$$\sigma(ih_1, h_2) + \sigma(h_1, ih_2) = 0, \qquad \sigma(h, ih) > 0, \quad h \neq 0.$$
(In particular, this forces $\sigma$ to be non-degenerate.) Then
$$(h_1|h_2) := \sigma(h_1, ih_2) + i\sigma(h_1, h_2)$$
defines a positive definite scalar product. From now on we will treat $\mathfrak{g}$ as a complex space equipped with this scalar product. One defines the creation and annihilation operators as
$$a_\pi^*(h) = \tfrac{1}{\sqrt{2}}\bigl(\phi_\pi(h) - i\phi_\pi(ih)\bigr), \qquad a_\pi(h) = \tfrac{1}{\sqrt{2}}\bigl(\phi_\pi(h) + i\phi_\pi(ih)\bigr). \tag{2.6}$$
Clearly,
$$\phi_\pi(h) = \tfrac{1}{\sqrt{2}}\bigl(a_\pi^*(h) + a_\pi(h)\bigr), \quad h \in \mathfrak{g}. \tag{2.7}$$
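As a consistency check of these definitions, one can compute the commutator of an annihilation and a creation operator directly from the Heisenberg relations (2.4) and the compatibility of $\sigma$ with the complex structure. The following is a formal sketch (domain questions are ignored, all commutators are understood as quadratic forms); the outcome is the first relation stated in Proposition 2.2 ii).

```latex
\begin{align*}
% expand with (2.6); note that i * (-i) = 1 in the last term
[a_\pi(h_1), a_\pi^*(h_2)]
 &= \tfrac12\Bigl( [\phi_\pi(h_1),\phi_\pi(h_2)]
      - i\,[\phi_\pi(h_1),\phi_\pi(ih_2)]
      + i\,[\phi_\pi(ih_1),\phi_\pi(h_2)]
      + [\phi_\pi(ih_1),\phi_\pi(ih_2)] \Bigr) \\
% insert (2.4) in each of the four terms
 &= \tfrac12\Bigl( i\sigma(h_1,h_2) + \sigma(h_1,ih_2)
      - \sigma(ih_1,h_2) + i\sigma(ih_1,ih_2) \Bigr) \\
% compatibility: sigma(ih_1,h_2) = -sigma(h_1,ih_2), hence sigma(ih_1,ih_2) = sigma(h_1,h_2)
 &= \sigma(h_1,ih_2) + i\sigma(h_1,h_2) \;=\; (h_1|h_2).
\end{align*}
```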

Proposition 2.2. i) The operators $a_\pi^*(h)$ and $a_\pi(h)$ with domain $D(\phi_\pi(h)) \cap D(\phi_\pi(ih))$ are closed. (By Proposition 2.1 iv), this domain is dense in $\mathcal{H}$.)
ii) The following commutation relations are true in the sense of a quadratic form:
$$[a_\pi(h_1), a_\pi^*(h_2)] = (h_1|h_2)\,1, \qquad [a_\pi(h_2), a_\pi(h_1)] = [a_\pi^*(h_2), a_\pi^*(h_1)] = 0. \tag{2.8}$$
iii) $W_\pi(g)$ leaves invariant $D(a_\pi(h))$ and
$$[a_\pi(h), W_\pi(g)] = \tfrac{i}{\sqrt{2}}\,(g, h)\,W_\pi(g), \qquad [a_\pi^*(h), W_\pi(g)] = -\tfrac{i}{\sqrt{2}}\,(g, h)\,W_\pi(g). \tag{2.9}$$

3. Operators in Bosonic Fock Spaces

We recall various constructions on bosonic Fock spaces.


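3.1. Bosonic Fock spaces. Let $\mathfrak{h}$ be a Hilbert space, which we will call the 1-particle space. Let $\otimes_s^n \mathfrak{h}$ denote the symmetric $n$th tensor power of $\mathfrak{h}$. Let $S_n$ denote the orthogonal projection of $\otimes^n \mathfrak{h}$ onto $\otimes_s^n \mathfrak{h}$. If $u \in \otimes_s^p \mathfrak{h}$ and $v \in \otimes_s^q \mathfrak{h}$, then we will write
$$u \otimes_s v := S_{p+q}\, u \otimes v \in \otimes_s^{p+q} \mathfrak{h}.$$
If $a \in B(\otimes_s^p \mathfrak{h}, \otimes_s^r \mathfrak{h})$ and $b \in B(\otimes_s^q \mathfrak{h}, \otimes_s^s \mathfrak{h})$, then we will write
$$a \otimes_s b := S_{r+s}\, a \otimes b \in B(\otimes_s^{p+q} \mathfrak{h}, \otimes_s^{r+s} \mathfrak{h}).$$
We define the bosonic Fock space over $\mathfrak{h}$ to be the direct sum
$$\Gamma(\mathfrak{h}) := \bigoplus_{n=0}^{\infty} \otimes_s^n \mathfrak{h}.$$
$\Omega$ will denote the vacuum vector, that is, the vector $1 \in \mathbb{C} = \otimes_s^0 \mathfrak{h}$. The number operator $N$ is defined as
$$N\big|_{\otimes_s^n \mathfrak{h}} = n 1.$$
For a selfadjoint operator $A$ on $\Gamma(\mathfrak{h})$, we denote by $\mathcal{H}_{\rm comp}(A)$ the space
$$\mathcal{H}_{\rm comp}(A) = \{u \in \mathcal{H} \mid \chi(A)u = u \ \text{for some } \chi \in C_0^\infty(\mathbb{R})\}.$$
We define the space of finite particle vectors and finite particle operators:
$$\Gamma_{\rm fin}(\mathfrak{h}) = \mathcal{H}_{\rm comp}(N) := \{u \in \Gamma(\mathfrak{h}) \mid \text{for some } n \in \mathbb{N},\ 1_{[0,n]}(N)u = u\},$$
$$B_{\rm fin}(\Gamma(\mathfrak{h})) := \{B \in B(\Gamma(\mathfrak{h})) \mid \text{for some } n \in \mathbb{N},\ 1_{[0,n]}(N) B 1_{[0,n]}(N) = B\}.$$

3.2. Creation and annihilation operators. There exists a natural representation of CCR over $\mathfrak{h}$ in $\Gamma(\mathfrak{h})$ (where $\mathfrak{h}$ is equipped with the symplectic form ${\rm Im}(\cdot|\cdot)$). To construct this representation it is natural to proceed in the reverse order from the one used in Sect. 2: first one constructs creation/annihilation operators, then field operators and then Weyl operators. If $h \in \mathfrak{h}$, we define the creation operator $a^*(h)$ by setting
$$a^*(h) : \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h}), \qquad a^*(h)u := \sqrt{n+1}\, h \otimes_s u, \quad u \in \otimes_s^n \mathfrak{h}.$$
$a(h)$ denotes the adjoint of $a^*(h)$, and is called the annihilation operator. Both $a^*(h)$ and $a(h)$ are defined on $\Gamma_{\rm fin}(\mathfrak{h})$ and can be extended to densely defined closed operators on $\Gamma(\mathfrak{h})$. By writing $a^\sharp(h)$ we will mean both $a^*(h)$ and $a(h)$. Creation and annihilation operators $a^\sharp(h)$ on a Fock space satisfy (2.8). In our paper we will usually have
$$\mathfrak{h} = L^2(\mathbb{R}, dk). \tag{3.1}$$
Then we will often write (as is customary in the literature)
$$a^*(h) = \int a^*(k) h(k)\, dk, \qquad a(h) = \int a(k) \bar{h}(k)\, dk,$$
where $a^*(k)$ and $a(k)$, $k \in \mathbb{R}$, have the meaning of operator valued distributions.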
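A minimal illustration of these definitions, on the vacuum only: the sketch below checks the first relation of (2.8) when applied to $\Omega$. It assumes the convention, used throughout, that the scalar product $(\cdot|\cdot)$ is linear in its second argument.

```latex
\begin{align*}
% creation on the vacuum: n = 0, so the prefactor is sqrt(0+1) = 1
a^*(h_2)\,\Omega &= h_2 \in \otimes_s^1\mathfrak{h} = \mathfrak{h}, \\
% a(h_1) is the adjoint of a^*(h_1):
% (a(h_1)h_2 \,|\, \Omega) = (h_2 \,|\, a^*(h_1)\Omega) = (h_2|h_1)
a(h_1)\,h_2 &= (h_1|h_2)\,\Omega, \qquad a(h_1)\,\Omega = 0, \\
% hence, on the vacuum,
[a(h_1), a^*(h_2)]\,\Omega &= a(h_1)a^*(h_2)\,\Omega - a^*(h_2)a(h_1)\,\Omega
   = (h_1|h_2)\,\Omega .
\end{align*}
```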


3.3. Field operators. We define the field operator
$$\phi(h) := \tfrac{1}{\sqrt{2}}\bigl(a^*(h) + a(h)\bigr), \quad h \in \mathfrak{h}.$$
The operators $\phi(h)$ are essentially selfadjoint on $\Gamma_{\rm fin}(\mathfrak{h})$ and can be extended to selfadjoint operators on $\Gamma(\mathfrak{h})$. Field operators $\phi(h)$ on a Fock space satisfy (2.4). In the case of (3.1), one can also write
$$\phi(h) = \int h(k)\phi(k)\, dk,$$
where $\phi(k)$ is an operator valued distribution
$$\phi(k) := \tfrac{1}{\sqrt{2}}\bigl(a^*(k) + a(k)\bigr), \quad k \in \mathbb{R}.$$

3.4. Weyl operators. We introduce also the Weyl operators:
$$W(h) := e^{i\phi(h)}, \quad h \in \mathfrak{h}.$$
The map $\mathfrak{h} \ni h \mapsto W(h)$ is a regular representation of CCR over $\mathfrak{h}$ in $\Gamma(\mathfrak{h})$. Moreover, Weyl operators in a Fock space have the following properties:

Proposition 3.1. i) The map
$$\mathbb{R} \ni s \mapsto W(sh)(N+1)^{-\frac{1}{2}}$$
is $C^1$ in the strong topology and the map
$$\mathbb{R} \ni s \mapsto W(sh)(N+1)^{-\frac{1}{2}-\epsilon}$$
is $C^1$ in the norm topology. More precisely,
$$\lim_{s \to 0}\ \sup_{\|h\| \le C}\ s^{-1} \bigl\| \bigl(W(sh) - 1 - is\phi(h)\bigr)(N+1)^{-1/2-\epsilon} \bigr\| = 0.$$

ii) (W (h1 ) − W (h2 ))u ≤ C h1 − h2 (h1 2 + h2 2 ) 2 u + (N + 1) 2 u .


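3.5. Operator $d\Gamma$. If $b$ is an operator on $\mathfrak{h}$, we define the operator $d\Gamma(b) : \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h})$,
$$d\Gamma(b)\big|_{\otimes_s^n \mathfrak{h}} := \sum_{j=1}^{n} 1^{\otimes(j-1)} \otimes b \otimes 1^{\otimes(n-j)} = n\, b \otimes_s 1^{\otimes(n-1)}.$$
An important example is the number operator $N := d\Gamma(1)$.

Lemma 3.2. i) Heisenberg derivatives:
$$\tfrac{d}{dt}\, d\Gamma(b) = d\Gamma\bigl(\tfrac{d}{dt} b\bigr), \qquad [d\Gamma(b_1), d\Gamma(b_2)] = d\Gamma([b_1, b_2]).$$
ii) Commutation properties:
$$[d\Gamma(b), a^*(h)] = a^*(bh), \qquad [d\Gamma(b), a(h)] = -a(b^*h),$$
$$[d\Gamma(b), i\phi(h)] = \phi(ibh), \quad \text{if } b = b^*,$$
$$W(h)\, d\Gamma(b)\, W(-h) = d\Gamma(b) - \phi(ibh) - \tfrac{1}{2}{\rm Re}(bh|h), \quad \text{if } b = b^*.$$
iii) Estimates:
$$b_1 \le b_2 \ \text{ implies } \ d\Gamma(b_1) \le d\Gamma(b_2),$$
$$\|N^{-\frac{1}{2}} d\Gamma(b) u\| \le \|d\Gamma(b^*b)^{\frac{1}{2}} u\|,$$
$$d\Gamma(b)^{\alpha} \le N^{1-\alpha}\, d\Gamma(b^{\alpha}), \quad \text{if } b \ge 0,\ 1 \le \alpha,$$
$$d\Gamma(ab) \le d\Gamma(a^p)^{\frac{1}{p}}\, d\Gamma(b^q)^{\frac{1}{q}}, \quad \text{if } a \ge 0,\ b \ge 0,\ ab = ba,\ p^{-1} + q^{-1} = 1.$$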
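The simplest instance of the commutation properties in Lemma 3.2 ii) can be checked by hand on the vacuum. The sketch below uses only the definition of $d\Gamma(b)$ on the $0$- and $1$-particle sectors and the formula for $a^*(h)$ from Subsect. 3.2.

```latex
\begin{align*}
% dGamma(b) vanishes on the 0-particle sector (empty sum) and acts as b on the 1-particle sector
d\Gamma(b)\,\Omega &= 0, \qquad d\Gamma(b)\,h = bh \quad (h \in \mathfrak{h}), \\
% apply both orders of the product to the vacuum
d\Gamma(b)\,a^*(h)\,\Omega &= d\Gamma(b)\,h = bh = a^*(bh)\,\Omega, \\
a^*(h)\,d\Gamma(b)\,\Omega &= 0, \\
\Longrightarrow\quad
[d\Gamma(b), a^*(h)]\,\Omega &= a^*(bh)\,\Omega .
\end{align*}
```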

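3.6. Functor $\Gamma$. Let $\mathfrak{h}_i$, $i = 1, 2$, be Hilbert spaces. Let $q : \mathfrak{h}_1 \to \mathfrak{h}_2$ be a bounded linear operator. We define $\Gamma(q) : \Gamma(\mathfrak{h}_1) \to \Gamma(\mathfrak{h}_2)$,
$$\Gamma(q)\big|_{\otimes_s^n \mathfrak{h}_1} = q \otimes \cdots \otimes q.$$
The functor $\Gamma$ has the following properties:

Lemma 3.3. i) Relationship with $d\Gamma$: assume $\mathfrak{h}_1 = \mathfrak{h}_2$. Then
$$e^{d\Gamma(b)} = \Gamma(e^b).$$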


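ii) Intertwining properties:
$$\Gamma(q) a^*(h_1) = a^*(qh_1)\Gamma(q), \quad h_1 \in \mathfrak{h}_1, \qquad \Gamma(q) a(q^*h_2) = a(h_2)\Gamma(q), \quad h_2 \in \mathfrak{h}_2.$$
If $q$ is isometric, that is $q^*q = 1$, then
$$\Gamma(q) a^\sharp(h_1) = a^\sharp(qh_1)\Gamma(q), \qquad \Gamma(q)\phi(h_1) = \phi(qh_1)\Gamma(q).$$
If $q$ is unitary, then
$$\Gamma(q) a^\sharp(h_1)\Gamma(q^{-1}) = a^\sharp(qh_1), \qquad \Gamma(q)\phi(h_1)\Gamma(q^{-1}) = \phi(qh_1).$$
iii) If $\|q\| \le 1$, then
$$\|\Gamma(q)\| = 1.$$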

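3.7. Operator $d\Gamma(q, r)$. Let $q, r$ be operators from $\mathfrak{h}_1$ to $\mathfrak{h}_2$. We define $d\Gamma(q, r) : \Gamma(\mathfrak{h}_1) \to \Gamma(\mathfrak{h}_2)$,
$$d\Gamma(q, r)\big|_{\otimes_s^n \mathfrak{h}_1} := \sum_{j=1}^{n} q^{\otimes(j-1)} \otimes r \otimes q^{\otimes(n-j)} = n\, r \otimes_s q^{\otimes(n-1)}.$$

Lemma 3.4. i) Relationship with $d\Gamma$ and $\Gamma$:
$$d\Gamma(1, r) = d\Gamma(r), \qquad d\Gamma(r, r) = N\Gamma(r).$$
ii) Heisenberg derivatives of $\Gamma(q)$:
$$d\Gamma(b_2)\Gamma(q) = d\Gamma(q, b_2 q), \qquad \Gamma(q)\, d\Gamma(b_1) = d\Gamma(q, q b_1), \qquad \tfrac{d}{dt}\Gamma(q) = d\Gamma\bigl(q, \tfrac{d}{dt} q\bigr).$$
iii) Intertwining properties:
$$a(h_2)\, d\Gamma(q, r) = d\Gamma(q, r)\, a(q^*h_2) + \Gamma(q)\, a(r^*h_2),$$
$$d\Gamma(q, r)\, a^*(h_1) = a^*(qh_1)\, d\Gamma(q, r) + a^*(rh_1)\Gamma(q).$$
iv) Estimates:
$$0 \le r \ \text{ and } \ 0 \le q \le 1 \ \text{ implies } \ d\Gamma(q, r) \le d\Gamma(r),$$
$$|(u_2 | d\Gamma(q, r_2 r_1) u_1)| \le \|d\Gamma(r_2 r_2^*)^{\frac{1}{2}} u_2\|\, \|d\Gamma(r_1^* r_1)^{\frac{1}{2}} u_1\|, \quad \|q\| \le 1,$$
$$\|N^{-\frac{1}{2}} d\Gamma(q, r) u\| \le \|d\Gamma(r^*r)^{\frac{1}{2}} u\|, \quad \|q\| \le 1.$$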


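3.8. Tensor product of Fock spaces. We will adopt the following convention for tensor products: $E \otimes F$ will denote the algebraic tensor product of $E$ and $F$, except when $E, F$ are both Hilbert spaces, in which case it will denote the hilbertian tensor product. Let $\mathfrak{h}_i$, $i = 1, 2$, be two Hilbert spaces. Let $i_1, i_2$ be the injections of $\mathfrak{h}_1, \mathfrak{h}_2$ into $\mathfrak{h}_1 \oplus \mathfrak{h}_2$. We define $U : \Gamma(\mathfrak{h}_1) \otimes \Gamma(\mathfrak{h}_2) \to \Gamma(\mathfrak{h}_1 \oplus \mathfrak{h}_2)$ as follows:
$$U\, u \otimes v := \sqrt{\frac{(p+q)!}{p!\,q!}}\ \Gamma(i_1)u \otimes_s \Gamma(i_2)v, \quad u \in \otimes_s^p \mathfrak{h}_1,\ v \in \otimes_s^q \mathfrak{h}_2. \tag{3.2}$$

Proposition 3.5. i) $U$ is unitary,
ii) $U\, \Omega \otimes \Omega = \Omega$,
iii)
$$a^\sharp(h_1 \oplus h_2)\, U = U \bigl(a^\sharp(h_1) \otimes 1 + 1 \otimes a^\sharp(h_2)\bigr), \quad h_1 \in \mathfrak{h}_1,\ h_2 \in \mathfrak{h}_2,$$
$$\phi(h_1 \oplus h_2)\, U = U \bigl(\phi(h_1) \otimes 1 + 1 \otimes \phi(h_2)\bigr), \quad h_1 \in \mathfrak{h}_1,\ h_2 \in \mathfrak{h}_2.$$
iv)
$$d\Gamma(b_1 \oplus b_2)\, U = U \bigl(d\Gamma(b_1) \otimes 1 + 1 \otimes d\Gamma(b_2)\bigr), \tag{3.3}$$
$$U\, \Gamma(q_1) \otimes \Gamma(q_2) = \Gamma(q_1 \oplus q_2)\, U.$$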
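To see why a combinatorial normalization factor is needed for $U$ to be unitary, it may help to look at the lowest nontrivial case $p = q = 1$. The sketch below assumes the normalization $\sqrt{(p+q)!/(p!\,q!)}$ in (3.2), which equals $\sqrt{2}$ here, and uses that $i_1 u = (u, 0)$ and $i_2 v = (0, v)$ are orthogonal in $\mathfrak{h}_1 \oplus \mathfrak{h}_2$.

```latex
\begin{align*}
% symmetrized product of the two injected one-particle vectors
i_1u \otimes_s i_2v &= \tfrac12\bigl( i_1u\otimes i_2v + i_2v\otimes i_1u \bigr), \\
% the two terms are orthogonal because (i_1u \,|\, i_2v) = 0
\| i_1u \otimes_s i_2v \|^2 &= \tfrac14\bigl( \|u\|^2\|v\|^2 + \|v\|^2\|u\|^2 \bigr)
   = \tfrac12\,\|u\|^2\|v\|^2, \\
% the prefactor sqrt(2!/(1!1!)) = sqrt(2) restores the norm
\| U\,u\otimes v \|^2 &= 2\cdot\tfrac12\,\|u\|^2\|v\|^2 = \|u\otimes v\|^2 .
\end{align*}
```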

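3.9. Scattering identification operator $I$. Along with the space $\Gamma(\mathfrak{h})$ we will consider the space $\Gamma(\mathfrak{h} \oplus \mathfrak{h}) \simeq \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h})$. We will use the notation
$$N_0 := N \otimes 1, \qquad N_\infty := 1 \otimes N.$$
We will also write
$$a_0^\sharp(h) := a^\sharp(h) \otimes 1, \qquad a_\infty^\sharp(h) := 1 \otimes a^\sharp(h).$$
Following [HuSp1], we define the scattering identification operator $I : \Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h})$,
$$I\, u \otimes v := \sqrt{\frac{(p+q)!}{p!\,q!}}\ u \otimes_s v, \quad u \in \otimes_s^p \mathfrak{h},\ v \in \otimes_s^q \mathfrak{h}. \tag{3.4}$$
Another formula defining $I$ is $I := \Gamma(i)U$, where $U : \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h} \oplus \mathfrak{h})$ is the unitary operator introduced in (3.2) for $\mathfrak{h}_1 = \mathfrak{h}_2 = \mathfrak{h}$ and
$$i : \mathfrak{h} \oplus \mathfrak{h} \to \mathfrak{h}, \quad (h_0, h_\infty) \mapsto h_0 + h_\infty.$$
Note that since $\|i\| = \sqrt{2}$, the operator $\Gamma(i)$ is unbounded. Therefore, $I$ is unbounded too.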


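Yet another formula defining $I$ is:
$$I\ \prod_{i=1}^{n} a^*(h_i)\Omega \otimes \prod_{i=1}^{p} a^*(g_i)\Omega := \prod_{i=1}^{p} a^*(g_i) \prod_{i=1}^{n} a^*(h_i)\Omega, \quad h_i, g_i \in \mathfrak{h}. \tag{3.5}$$
If $\mathfrak{h} = L^2(\mathbb{R}, dk)$, then we can write still another formula for $I$:
$$I\, u \otimes \psi = \frac{1}{(p!)^{\frac{1}{2}}} \int \psi(k_1, \ldots, k_p)\, a^*(k_1) \cdots a^*(k_p) u\ dk_1 \cdots dk_p, \quad u \in \Gamma(\mathfrak{h}),\ \psi \in \otimes_s^p \mathfrak{h}. \tag{3.6}$$

Proposition 3.6. i) Let $b, q$ be operators on $\mathfrak{h}$. Then
$$d\Gamma(b)\, I = I \bigl(d\Gamma(b) \otimes 1 + 1 \otimes d\Gamma(b)\bigr), \qquad \Gamma(q)\, I = I\, \Gamma(q) \otimes \Gamma(q).$$
ii) For $h \in \mathfrak{h}$,
$$a(h)\, I = I \bigl(a_0(h) + a_\infty(h)\bigr), \qquad a^*(h)\, I = I\, a_0^*(h) = I\, a_\infty^*(h).$$
iii)
$$I\, (N_0 + 1)^{-k/2}\, 1_{[0,k]}(N_\infty) \ \text{ is bounded.} \tag{3.7}$$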

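3.10. Operator $I(j)$. Let $j_0, j_\infty$ be two operators on $\mathfrak{h}$. Set $j = (j_0, j_\infty)$. We define $I(j) : \Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h})$,
$$I(j) := I\, \Gamma(j_0) \otimes \Gamma(j_\infty).$$
If we identify $j$ with the operator
$$j : \mathfrak{h} \oplus \mathfrak{h} \to \mathfrak{h}, \qquad j(h_0 \oplus h_\infty) := j_0 h_0 + j_\infty h_\infty, \tag{3.8}$$
then we have
$$I(j) = \Gamma(j)\, U.$$

Remark 3.7. $I(j)$ equals $\check{\Gamma}(j^*)^*$ in the notation of [DG1].

Note that $I = I(1, 1)$. Other formulas defining $I(j)$ are
$$I(j)\ \prod_{i=1}^{n} a^*(h_i)\Omega \otimes \prod_{i=1}^{p} a^*(g_i)\Omega := \prod_{i=1}^{p} a^*(j_0 g_i) \prod_{i=1}^{n} a^*(j_\infty h_i)\Omega, \quad h_i, g_i \in \mathfrak{h}, \tag{3.9}$$
$$I^*(j)\ \prod_{i=1}^{n} a^*(h_i)\Omega := \prod_{i=1}^{n} \bigl(a_0^*(j_0^* h_i) + a_\infty^*(j_\infty^* h_i)\bigr)\, \Omega \otimes \Omega, \quad h_i \in \mathfrak{h}. \tag{3.10}$$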


Lemma 3.8. i)
$$I(\tilde{j})\, I^*(j) = \Gamma(\tilde{j}_0 j_0^* + \tilde{j}_\infty j_\infty^*).$$
In particular, if $j_0^* + j_\infty^* = 1$, then
$$I\, I^*(j) = 1.$$
ii) Intertwining properties: For $h \in \mathfrak{h}$,
$$a(h)\, I(j) = I(j) \bigl(a_0(j_0^* h) + a_\infty(j_\infty^* h)\bigr),$$
$$a^*(j_0 h)\, I(j) = I(j)\, a_0^*(h), \qquad a^*(j_\infty h)\, I(j) = I(j)\, a_\infty^*(h).$$
iii) $I(j)$ is bounded iff $j_0^* j_0 + j_\infty^* j_\infty \le 1$, and then
$$\|I(j)\| = 1.$$
Let us note some additional properties of $I(j)$ in the coisometric case.

Lemma 3.9. Assume
$$j_0 j_0^* + j_\infty j_\infty^* = 1. \tag{3.11}$$
(This assumption implies that $j$ is coisometric, that is $jj^* = 1$.) Then
i) $I(j)\, I^*(j) = 1$.
ii) Intertwining properties:
$$a^\sharp(h)\, I(j) = I(j) \bigl(a_0^\sharp(j_0^* h) + a_\infty^\sharp(j_\infty^* h)\bigr),$$
$$\phi(h)\, I(j) = I(j) \bigl(\phi_0(j_0^* h) + \phi_\infty(j_\infty^* h)\bigr).$$
iii) If in addition $j_0, j_\infty$ are self-adjoint, then
$$d\Gamma(b) = I(j) \bigl(d\Gamma(b) \otimes 1 + 1 \otimes d\Gamma(b)\bigr) I^*(j) + \tfrac{1}{2}\, d\Gamma\bigl({\rm ad}^2_{j_0} b + {\rm ad}^2_{j_\infty} b\bigr).$$


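3.11. Operator $dI(j, k)$. Let $j = (j_0, j_\infty)$, $k = (k_0, k_\infty)$ be pairs of maps from $\mathfrak{h}$ to $\mathfrak{h}$. We define $dI(j, k) : \Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h})$ as follows:
$$dI(j, k) := I \bigl(d\Gamma(j_0, k_0) \otimes \Gamma(j_\infty) + \Gamma(j_0) \otimes d\Gamma(j_\infty, k_\infty)\bigr).$$
Equivalently, treating $j$ and $k$ as maps from $\mathfrak{h} \oplus \mathfrak{h}$ to $\mathfrak{h}$, as in (3.8), we can write
$$dI(j, k) := d\Gamma(j, k)\, U.$$

Remark 3.10. $dI(j, k)$ equals $d\check{\Gamma}(j^*, k^*)^*$ in the notation of [DG1].

Lemma 3.11. i) Heisenberg derivative of $I(j)$:
$$\tfrac{d}{dt} I(j) = dI\bigl(j, \tfrac{d}{dt} j\bigr),$$
$$I(j) \bigl(d\Gamma(b_0) \otimes 1 + 1 \otimes d\Gamma(b_\infty)\bigr) = dI(j, k), \qquad d\Gamma(b)\, I(j) = dI(j, bj).$$
Here $b, b_0, b_\infty$ are operators on $\mathfrak{h}$ and $k = (j_0 b_0, j_\infty b_\infty)$.
ii) Intertwining properties:
$$dI(j, k)\, a_0^*(h) = a^*(j_0 h)\, dI(j, k) + a^*(k_0 h)\, I(j),$$
$$dI(j, k) \bigl(a_0(j_0^* h) + a_\infty(j_\infty^* h)\bigr) + I(j) \bigl(a_0(k_0^* h) + a_\infty(k_\infty^* h)\bigr) = a(h)\, dI(j, k).$$
iii) If $j_0 j_0^* + j_\infty j_\infty^* \le 1$ and $k_0, k_\infty$ are self-adjoint, we have the estimate
$$|(u_2 | dI^*(j, k) u_1)| \le \|d\Gamma(|k_0|)^{\frac{1}{2}} \otimes 1\, u_2\|\, \|d\Gamma(|k_0|)^{\frac{1}{2}} u_1\| + \|1 \otimes d\Gamma(|k_\infty|)^{\frac{1}{2}} u_2\|\, \|d\Gamma(|k_\infty|)^{\frac{1}{2}} u_1\|,$$
$$u_1 \in \Gamma(\mathfrak{h}),\ u_2 \in \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}).$$
iv) If $j_0 j_0^* + j_\infty j_\infty^* \le 1$, then
$$\|(N_0 + N_\infty)^{-\frac{1}{2}}\, dI^*(j, k) u\| \le \|d\Gamma(k_0 k_0^* + k_\infty k_\infty^*)^{\frac{1}{2}} u\|, \quad u \in \Gamma(\mathfrak{h}).$$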


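3.12. Wick polynomials. Let $w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$. We define the operator $\mathrm{Wick}(w) : \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h})$ as follows:
$$\mathrm{Wick}(w)\big|_{\otimes_s^n \mathfrak{h}} := \frac{\sqrt{n!\,(n+q-p)!}}{(n-p)!}\ w \otimes_s 1^{\otimes(n-p)}. \tag{3.12}$$
This definition extends to $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$ by linearity. The operator $\mathrm{Wick}(w)$ is called a Wick polynomial. The operator $w$ is called the symbol of the Wick polynomial $\mathrm{Wick}(w)$. Before we describe properties of Wick polynomials, let us introduce more definitions. If $u$ is an element of a Hilbert space $\mathcal{H}$, we denote by $(u|$ the map $\mathcal{H} \ni v \mapsto (u, v) \in \mathbb{C}$ and by $|u) : \mathbb{C} \to \mathcal{H}$ its adjoint. If $u \in \otimes_s^m \mathfrak{h}$, $v \in \otimes_s^n \mathfrak{h}$, $w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$ with $m \le p$, $n \le q$, then, to simplify the notation, we will introduce the “contracted” symbols
$$(v|w := \bigl((v| \otimes_s 1^{\otimes(q-n)}\bigr) w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^{q-n} \mathfrak{h}),$$
$$w|u) := w \bigl(|u) \otimes_s 1^{\otimes(p-m)}\bigr) \in B(\otimes_s^{p-m} \mathfrak{h}, \otimes_s^q \mathfrak{h}),$$
$$(v|w|u) := \bigl((v| \otimes_s 1^{\otimes(q-n)}\bigr) w \bigl(|u) \otimes_s 1^{\otimes(p-m)}\bigr) \in B(\otimes_s^{p-m} \mathfrak{h}, \otimes_s^{q-n} \mathfrak{h}).$$

Theorem 3.12. i) Case $p = q = 0$. Let $\lambda \in \mathbb{C} = B(\otimes_s^0 \mathfrak{h}, \otimes_s^0 \mathfrak{h})$. Then $\mathrm{Wick}(\lambda) = \lambda 1$.
ii) Case $p = q = 1$. (Note that $\otimes_s^1 \mathfrak{h} = \mathfrak{h}$.) For $b \in B(\mathfrak{h})$, we have $\mathrm{Wick}(b) = d\Gamma(b)$.
iii) Cases $q = 0$, $p = 1$ and $q = 1$, $p = 0$. For $h \in \mathfrak{h}$, we have
$$\mathrm{Wick}\bigl(|h)\bigr) = a^*(h), \qquad \mathrm{Wick}\bigl((h|\bigr) = a(h).$$
iv) Let $u \in \otimes_s^m \mathfrak{h}$, $v \in \otimes_s^n \mathfrak{h}$, $w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$. Then
$$\mathrm{Wick}\bigl(|v) \otimes_s w \otimes_s (u|\bigr) = \mathrm{Wick}(|v))\, \mathrm{Wick}(w)\, \mathrm{Wick}((u|).$$
v) Let $h'_1, \ldots, h'_p, h_1, \ldots, h_q \in \mathfrak{h}$. Then
$$\mathrm{Wick}\bigl(|h_1 \otimes_s \cdots \otimes_s h_q)(h'_p \otimes_s \cdots \otimes_s h'_1|\bigr) = a^*(h_1) \cdots a^*(h_q)\, a(h'_p) \cdots a(h'_1).$$
vi) Let $w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$. Then
$$\mathrm{Wick}(w)^* = \mathrm{Wick}(w^*).$$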


Note that v) of the above theorem (which follows immediately from iv)) justifies the name Wick polynomials. If we fix a basis $\{h_i\}_{i \in I}$ of $\mathfrak{h}$, then any operator in $B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$ can be written as a sum (convergent for the weak topology):
$$w = \sum_{i_1, \ldots, i_q, i'_1, \ldots, i'_p} w_{i_1, \ldots, i_q, i'_p, \ldots, i'_1}\ |h_{i_1} \otimes_s \cdots \otimes_s h_{i_q})(h_{i'_p} \otimes_s \cdots \otimes_s h_{i'_1}|,$$
where we can assume that $w_{i_1, \ldots, i_q, i'_p, \ldots, i'_1}$ is separately symmetric with respect to the first $q$ and the last $p$ indices. Then, writing $a_i^\sharp$ for $a^\sharp(h_i)$, we have
$$\mathrm{Wick}(w) = \sum_{i_1, \ldots, i_q, i'_1, \ldots, i'_p} w_{i_1, \ldots, i_q, i'_p, \ldots, i'_1}\ a^*_{i_1} \cdots a^*_{i_q}\, a_{i'_p} \cdots a_{i'_1}.$$
In later sections, we will consider the case when $\mathfrak{h} = L^2(\mathbb{R}, dk)$. Any operator $w$ from $S(\mathbb{R}^p)$ to $S'(\mathbb{R}^q)$, in particular any $w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$, has a distributional kernel
$$w(k_1, \ldots, k_q, k'_p, \ldots, k'_1) \in S'(\mathbb{R}^{p+q}), \tag{3.13}$$
where we can assume that the kernel $w$ in (3.13) is separately symmetric with respect to the first $q$ and the last $p$ variables. The following formal expression is then commonly used to denote the Wick polynomial $\mathrm{Wick}(w)$:
$$\int w(k_1, \ldots, k_q, k'_p, \ldots, k'_1)\ a^*(k_1) \cdots a^*(k_q)\, a(k'_p) \cdots a(k'_1)\ dk_1 \cdots dk_q\, dk'_p \cdots dk'_1. \tag{3.14}$$
Although the definition (3.12) always makes sense, let us describe a few cases in which a rigorous meaning can be attached to the formal expression (3.14). First of all, if $u \in \Gamma_{\rm fin}(S(\mathbb{R}))$, then $a(k_1) \cdots a(k_p)u$ is well defined as an element of $S(\mathbb{R}^p) \otimes \Gamma(\mathfrak{h})$. This shows that the expression
$$\int w(k_1, \ldots, k_q, k'_p, \ldots, k'_1)\ \bigl(a(k_1) \cdots a(k_q)u \,\big|\, a(k'_1) \cdots a(k'_p)u\bigr)_{\Gamma(\mathfrak{h})}\ dk_1 \cdots dk_q\, dk'_p \cdots dk'_1 \tag{3.15}$$
is well defined for $u \in \Gamma_{\rm fin}(S(\mathbb{R}))$, $w \in S'(\mathbb{R}^{p+q})$. Hence if $w \in S'(\mathbb{R}^{p+q})$, (3.14) always makes sense as a quadratic form on $\Gamma_{\rm fin}(S(\mathbb{R}))$. If $u \in D(N^{p/2})$, then $a(k_1) \cdots a(k_p)u$ is also well defined as an element of $\otimes_s^p L^2(\mathbb{R}) \otimes \Gamma(L^2(\mathbb{R}))$, and the expression (3.15) is well defined for $u \in D(N^{\sup(p,q)/2})$, $w \in L^2(\mathbb{R}^{p+q})$. Hence if $w \in L^2(\mathbb{R}^{p+q})$, (3.14) makes sense as a quadratic form on $D(N^{\sup(p,q)/2})$. The following proposition summarizes basic properties of Wick polynomials.

Proposition 3.13. i) If $w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$ and $k + m \ge \frac{p+q}{2}$, then
$$\|(N+1)^{-k}\, \mathrm{Wick}(w)\, (N+1)^{-m}\| \le \|w\|.$$
If moreover $\text{s-}\lim w_n = w$, then
$$\text{s-}\lim_{n \to \infty} (N+1)^{-k}\, \mathrm{Wick}(w_n)\, (N+1)^{-m} = (N+1)^{-k}\, \mathrm{Wick}(w)\, (N+1)^{-m}.$$


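ii) Identities: Let $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$, $b \in B(\mathfrak{h})$. Then
$$[d\Gamma(b), \mathrm{Wick}(w)] = \mathrm{Wick}([d\Gamma(b), w]). \tag{3.16}$$
Let $w \in B_{\rm fin}(\Gamma(\mathfrak{h}_2), \Gamma(\mathfrak{h}_1))$, $q \in B(\mathfrak{h}_1, \mathfrak{h}_2)$. Then
$$\Gamma(q)\, \mathrm{Wick}(w\Gamma(q)) = \mathrm{Wick}(\Gamma(q)w)\, \Gamma(q). \tag{3.17}$$
Let $w \in B_{\rm fin}(\Gamma(\mathfrak{h}_1), \Gamma(\mathfrak{h}_1))$, $q \in B(\mathfrak{h}_1, \mathfrak{h}_2)$. Then
$$\Gamma(q)\, \mathrm{Wick}(w) = \mathrm{Wick}(\Gamma(q) w \Gamma(q^*))\, \Gamma(q), \quad \text{for isometric } q, \tag{3.18}$$
$$\Gamma(q)\, \mathrm{Wick}(w)\, \Gamma(q^{-1}) = \mathrm{Wick}(\Gamma(q) w \Gamma(q^{-1})), \quad \text{for unitary } q. \tag{3.19}$$
Let $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$, $h \in \mathfrak{h}$. Then
$$[\mathrm{Wick}(w), a^*(h)] = p\, \mathrm{Wick}\bigl(w|h)\bigr), \qquad [\mathrm{Wick}(w), a(h)] = q\, \mathrm{Wick}\bigl((h|w\bigr), \tag{3.20}$$
$$W(h)\, \mathrm{Wick}(w)\, W(-h) = \sum_{s=0}^{p} \sum_{r=0}^{q} \frac{p!}{s!}\, \frac{q!}{r!}\, \Bigl(\frac{i}{\sqrt{2}}\Bigr)^{p+q-r-s}\, \mathrm{Wick}(w_{s,r}), \tag{3.21}$$
where
$$w_{s,r} = (h^{\otimes(q-r)}|\, w\, |h^{\otimes(p-s)}). \tag{3.22}$$

Proof. The first part of i) is a particular case of the well known $N_\tau$ estimates (see e.g. [GJ1]). It follows directly from the definition (3.12) and the fact that
$$\frac{(n!\,(n+q-p)!)^{\frac{1}{2}}}{(n-p)!} \le (n+q-p)^{q/2}\, n^{p/2}.$$
The second part of i) follows similarly from (3.12). All identities of ii) are easy, except for the last one, which follows for example from Thm. 3.12 v) and the identity
$$W(h)\, a^*(f)\, W(-h) = a^*(f) + \frac{i}{\sqrt{2}}(h|f).$$
□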
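The elementary inequality used in the proof above is a product of two standard factorial bounds. As a brief sketch, for $n \ge p$:

```latex
\begin{align*}
% p decreasing factors, each at most n
\frac{n!}{(n-p)!} &= n(n-1)\cdots(n-p+1) \;\le\; n^{p}, \\
% q decreasing factors, each at most n+q-p
\frac{(n+q-p)!}{(n-p)!} &= (n+q-p)(n+q-p-1)\cdots(n-p+1) \;\le\; (n+q-p)^{q}, \\
% multiply the two bounds and take the square root
\frac{\bigl(n!\,(n+q-p)!\bigr)^{1/2}}{(n-p)!}
 &= \Bigl(\frac{n!}{(n-p)!}\Bigr)^{1/2}\Bigl(\frac{(n+q-p)!}{(n-p)!}\Bigr)^{1/2}
 \;\le\; n^{p/2}\,(n+q-p)^{q/2}.
\end{align*}
```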

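Let now $\mathcal{K}$ be an additional Hilbert space. If $w \in B_{\rm fin}(\mathcal{K} \otimes \Gamma(\mathfrak{h}))$, then we can also define $\mathrm{Wick}(w)$ as an operator acting on $\mathcal{K} \otimes \Gamma_{\rm fin}(\mathfrak{h})$, by
$$\mathrm{Wick}(w)\big|_{\mathcal{K} \otimes \otimes_s^n \mathfrak{h}} := \frac{\sqrt{n!\,(n+q-p)!}}{(n-p)!}\ \bigl(1_{\mathcal{K}} \otimes S_{n+q-p}\bigr)\, w \otimes 1^{\otimes(n-p)},$$
for $w \in B(\mathcal{K} \otimes \otimes_s^p \mathfrak{h}, \mathcal{K} \otimes \otimes_s^q \mathfrak{h})$. This construction can be used, for example, if one considers generalizations of Pauli–Fierz Hamiltonians with more general interactions than those considered in [DG1]. In particular, this additional space can also be a Fock space. Then, if
$$w \in B(\otimes_s^{p_1} \mathfrak{h}_1 \otimes \otimes_s^{p_2} \mathfrak{h}_2,\ \otimes_s^{q_1} \mathfrak{h}_1 \otimes \otimes_s^{q_2} \mathfrak{h}_2),$$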


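we define
$$\mathrm{Wick} \otimes \mathrm{Wick}(w)\big|_{\otimes_s^{n_1} \mathfrak{h}_1 \otimes \otimes_s^{n_2} \mathfrak{h}_2} := \frac{\sqrt{n_1!\,(n_1+q_1-p_1)!}}{(n_1-p_1)!}\, \frac{\sqrt{n_2!\,(n_2+q_2-p_2)!}}{(n_2-p_2)!}\ \bigl(S_{n_1+q_1-p_1} \otimes S_{n_2+q_2-p_2}\bigr)\, 1^{\otimes(n_1-p_1)} \otimes w \otimes 1^{\otimes(n_2-p_2)}.$$
We extend this definition to $w \in B(\Gamma_{\rm fin}(\mathfrak{h}_1) \otimes \Gamma_{\rm fin}(\mathfrak{h}_2))$ by linearity. The following proposition is completely similar to Prop. 3.13.

Proposition 3.14. i) For $w_i \in B(\Gamma_{\rm fin}(\mathfrak{h}_i))$, $i = 1, 2$,
$$\mathrm{Wick} \otimes \mathrm{Wick}(w_1 \otimes w_2) = \mathrm{Wick}(w_1) \otimes \mathrm{Wick}(w_2).$$
ii) If
$$w = \bigl|\otimes_{s,\, i=1}^{q_1} h_{i,1}\bigr)\bigl(\otimes_{s,\, i=1}^{p_1} g_{i,1}\bigr| \otimes \bigl|\otimes_{s,\, i=1}^{q_2} h_{i,2}\bigr)\bigl(\otimes_{s,\, i=1}^{p_2} g_{i,2}\bigr|,$$
then
$$\mathrm{Wick} \otimes \mathrm{Wick}(w) = \prod_{i=1}^{q_1} a^*(h_{i,1}) \otimes \prod_{i=1}^{q_2} a^*(h_{i,2})\ \ \prod_{i=1}^{p_1} a(g_{i,1}) \otimes \prod_{i=1}^{p_2} a(g_{i,2}).$$
iii) For $j_i \in B(\mathfrak{h}_i)$, $i = 1, 2$, $w \in B(\Gamma_{\rm fin}(\mathfrak{h}_1) \otimes \Gamma_{\rm fin}(\mathfrak{h}_2))$,
$$\Gamma(j_1) \otimes \Gamma(j_2)\ \mathrm{Wick} \otimes \mathrm{Wick}\bigl(w\, \Gamma(j_1) \otimes \Gamma(j_2)\bigr) = \mathrm{Wick} \otimes \mathrm{Wick}\bigl(\Gamma(j_1) \otimes \Gamma(j_2)\, w\bigr)\ \Gamma(j_1) \otimes \Gamma(j_2).$$
iv) For $w \in B(\otimes_s^{p_1} \mathfrak{h}_1 \otimes \otimes_s^{p_2} \mathfrak{h}_2,\ \otimes_s^{q_1} \mathfrak{h}_1 \otimes \otimes_s^{q_2} \mathfrak{h}_2)$ and $k_i + m_i \ge \frac{p_i+q_i}{2}$, $i = 1, 2$, we have
$$\bigl\|(N+1)^{-k_1} \otimes (N+1)^{-k_2}\ \mathrm{Wick} \otimes \mathrm{Wick}(w)\ (N+1)^{-m_1} \otimes (N+1)^{-m_2}\bigr\| \le \|w\|.$$
In the next proposition we describe additional properties of Wick polynomials, which have a nice formulation if one uses the $\mathrm{Wick} \otimes \mathrm{Wick}$ notation.

Proposition 3.15. i) Let $w \in B_{\rm fin}(\Gamma(\mathfrak{h}_1 \oplus \mathfrak{h}_2))$. Then
$$U^*\, \mathrm{Wick}(w)\, U = \mathrm{Wick} \otimes \mathrm{Wick}(\tilde{U}^* w \tilde{U}).$$
Here the map $\tilde{U} : \Gamma_{\rm fin}(\mathfrak{h}_1) \otimes \Gamma_{\rm fin}(\mathfrak{h}_2) \to \Gamma_{\rm fin}(\mathfrak{h}_1 \oplus \mathfrak{h}_2)$ is defined as follows (recall that $i_1, i_2$ are the injections of $\mathfrak{h}_1, \mathfrak{h}_2$ into $\mathfrak{h}_1 \oplus \mathfrak{h}_2$):
$$\tilde{U}\, u_1 \otimes u_2 := \frac{(p+q)!}{p!\,q!}\ \Gamma(i_1)u_1 \otimes_s \Gamma(i_2)u_2, \quad u_1 \in \otimes_s^p \mathfrak{h}_1,\ u_2 \in \otimes_s^q \mathfrak{h}_2. \tag{3.23}$$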


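ii) Let $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$. Then
$$\mathrm{Wick}(w)\, I = I\ \mathrm{Wick} \otimes \mathrm{Wick}(P w \tilde{I}).$$
Here $\tilde{I} : \Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h})$ is defined as follows:
$$\tilde{I}\, u \otimes v := \frac{(p+q)!}{p!\,q!}\ u \otimes_s v, \quad u \in \otimes_s^p \mathfrak{h},\ v \in \otimes_s^q \mathfrak{h}, \tag{3.24}$$
and $P : \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\mathfrak{h})$ is defined as
$$P u := u \otimes \Omega, \quad u \in \Gamma_{\rm fin}(\mathfrak{h}).$$
iii) Let us keep the notation of ii) and let $j$ be as in Subsect. 3.10. Then
$$\mathrm{Wick}(\Gamma(j_0) w)\, I(j) = I(j)\ \mathrm{Wick} \otimes \mathrm{Wick}\bigl(P w \tilde{I}\, \Gamma(j_0) \otimes \Gamma(j_\infty)\bigr).$$

Proof. By linearity, it suffices to check the identities i) and ii) for $w$ of rank one. One can then use the identities in Prop. 3.5 iii), Thm. 3.12 v) and Prop. 3.14 ii) to verify i) and ii). Finally, iii) follows from ii), the fact that $I(j) = I\, \Gamma(j_0) \otimes \Gamma(j_\infty)$ and Prop. 3.14 iii). □

In the following sections, we will need to estimate some commutators between Wick polynomials and the operators $\Gamma(q)$, $I^*(j)$. To this end we will need the identities described in the next proposition.

Proposition 3.16. Let $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$. Then,
i) for $q \in B(\mathfrak{h})$,
$$[\Gamma(q), \mathrm{Wick}(w)] = \Gamma(q)\, \mathrm{Wick}\bigl(w(1 - \Gamma(q))\bigr) + \mathrm{Wick}\bigl((\Gamma(q) - 1)w\bigr)\, \Gamma(q).$$
ii) for $j = (j_0, j_\infty) \in B(\mathfrak{h} \otimes \mathfrak{h}, \mathfrak{h})$,
$$I^*(j)\, \mathrm{Wick}(w) - (\mathrm{Wick}(w) \otimes 1)\, I^*(j) = I^*(j)\, \mathrm{Wick}\bigl(w(1 - \Gamma(j_0^*))\bigr)$$
$$\qquad + \mathrm{Wick} \otimes \mathrm{Wick}\bigl((\Gamma(j_0^*) - 1) \otimes \Gamma(j_\infty^*)\, \tilde{I}^* w P^*\bigr)\, I^*(j)$$
$$\qquad + \mathrm{Wick} \otimes \mathrm{Wick}\bigl(1 \otimes (\Gamma(j_\infty^*) - |\Omega)(\Omega|)\, \tilde{I}^* w P^*\bigr)\, I^*(j).$$

Proof. i) follows directly from the identity (3.17). To prove ii), we deduce from Prop. 3.15 that
$$I^*(j)\, \mathrm{Wick}(w) = I^*(j)\, \mathrm{Wick}\bigl(w(1 - \Gamma(j_0^*))\bigr) + \mathrm{Wick} \otimes \mathrm{Wick}\bigl(\Gamma(j_0^*) \otimes \Gamma(j_\infty^*)\, \tilde{I}^* w P^*\bigr)\, I^*(j).$$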


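We then use the identities
$$w \otimes |\Omega)(\Omega| = P w P^*, \qquad P = (1 \otimes |\Omega)(\Omega|)\, \tilde{I}^*,$$
to obtain
$$\mathrm{Wick}(w) \otimes 1 = \mathrm{Wick} \otimes \mathrm{Wick}\bigl(w \otimes |\Omega)(\Omega|\bigr) = \mathrm{Wick} \otimes \mathrm{Wick}\bigl((1 \otimes |\Omega)(\Omega|)\, \tilde{I}^* w P^*\bigr).$$
Next we write
$$\Gamma(j_0^*) \otimes \Gamma(j_\infty^*) - 1 \otimes |\Omega)(\Omega| = \bigl(\Gamma(j_0^*) - 1\bigr) \otimes \Gamma(j_\infty^*) + 1 \otimes \bigl(\Gamma(j_\infty^*) - |\Omega)(\Omega|\bigr)$$
to obtain ii). □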

Lemma 3.17. Assume that $\mathfrak{h} = L^2(\mathbb{R}, dk)$ and that $w\in B(\otimes_s^p\mathfrak{h},\otimes_s^r\mathfrak{h})$ is given by a kernel $w$, as in (3.13), with $w\in L^2(\mathbb{R}^{p+r})$.

i) Let $q\in B(\mathfrak{h})$, $\|q\|\le 1$. Then, for $m+k\ge\frac{p+r}{2}$,
$$\big\|(N+1)^{-m}[\Gamma(q),\mathrm{Wick}(w)](N+1)^{-k}\big\| \le C_{p,r}\sup_{1\le i\le p+r}\big\|1^{\otimes(i-1)}\otimes(1-q)\otimes 1^{\otimes(p+r-i)}\,w\big\|_{L^2(\mathbb{R}^{p+r})}. \tag{3.25}$$

ii) Let $j=(j_0,j_\infty)$, with $j_0, j_\infty\in B(\mathfrak{h})$, $j_0^*j_0+j_\infty^*j_\infty\le 1$. Then, for $m+k\ge\frac{p+r}{2}$,
$$\begin{aligned}
\big\|(N_0+N_\infty+1)^{-m}&\big(I^*(j)\mathrm{Wick}(w)-(\mathrm{Wick}(w)\otimes 1)I^*(j)\big)(N+1)^{-k}\big\|\\
\le{}& C_{p,r}\sup_{1\le i\le p+r}\big\|1^{\otimes(i-1)}\otimes(1-j_0)\otimes 1^{\otimes(p+r-i)}\,w\big\|_{L^2(\mathbb{R}^{p+r})}\\
&+ C_{p,r}\sup_{1\le i\le p+r}\big\|1^{\otimes(i-1)}\otimes j_\infty\otimes 1^{\otimes(p+r-i)}\,w\big\|_{L^2(\mathbb{R}^{p+r})}.
\end{aligned}\tag{3.26}$$

Proof. To prove i), it suffices, by Prop. 3.13 i), to estimate the operator norm of the symbols $w(1-\Gamma(q))$ and $(\Gamma(q)-1)w$, which are bounded by the r.h.s. of (3.25). Similarly, to prove ii), it suffices, by Prop. 3.14 iv), to estimate the operator norm of the symbols $w(1-\Gamma(j_0^*))$, $((\Gamma(j_0^*)-1)\otimes\Gamma(j_\infty^*))\tilde I^* wP^*$ and $(1\otimes(\Gamma(j_\infty^*)-|\Omega)(\Omega|))\tilde I^* wP^*$ (note that $\|I^*(j)\| = 1$ by Lemma 3.8 iv) and $I^*(j)N = (N_0+N_\infty)I^*(j)$). The norm of the first symbol is bounded by the r.h.s. of (3.26), by the same argument as in i). Since $\|\Gamma(j_\infty^*)\| = \|P^*\| = 1$, the norm of the second symbol is less than $\|w\,\tilde I\,(\Gamma(j_0)-1)\otimes 1\|$, which is bounded by the r.h.s. of (3.26). Similarly the norm of the third symbol is less than $\|w\,\tilde I\,1\otimes(\Gamma(j_\infty)-|\Omega)(\Omega|)\|$. This is also bounded by the r.h.s. of (3.26). (Note that $\Gamma(j_\infty)-|\Omega)(\Omega|$ vanishes on the vacuum sector.) □

4. Fock Representations of CCR

In this section we describe the construction of the Fock subrepresentation of a regular CCR representation (see [BR, CMR]).


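4.1. Construction of the Fock subrepresentation. Suppose that we are given a regular CCR representation over a pre-Hilbert space $\mathfrak{g}$ in the Hilbert space $\mathcal{H}$. Let $\mathfrak{h}$ be the completion of $\mathfrak{g}$. We define the space of vacua
$$\mathcal{K}_\pi := \{u\in\mathcal{H}\ |\ a_\pi(h)u = 0,\ h\in\mathfrak{g}\}.$$

Proposition 4.1. i) $\mathcal{K}_\pi$ is a closed space. ii) $\mathcal{K}_\pi$ is contained in the set of analytic vectors of $\phi_\pi(h)$, $h\in\mathfrak{g}$.

Proof. $\mathcal{K}_\pi$ is closed as an intersection of null spaces of closed operators. To prove ii) we will show that
$$(u|W_\pi(h)u) = \exp(-\|h\|^2/4),\quad u\in\mathcal{K}_\pi. \tag{4.1}$$
Clearly, $\mathcal{K}_\pi\subset D(\phi_\pi(h))$, hence for $u\in\mathcal{K}_\pi$, $f(t) := (u|W_\pi(th)u)$ is continuously differentiable. We have
$$\begin{aligned}
\frac{d}{dt}f(t) &= i(u|\phi_\pi(h)W_\pi(th)u)\\
&= \frac{i}{\sqrt2}(a_\pi(h)u|W_\pi(th)u) + \frac{i}{\sqrt2}(u|W_\pi(th)a_\pi(h)u) - \frac12 t\|h\|^2(u|W_\pi(th)u)\\
&= -\frac12 t\|h\|^2 f(t).
\end{aligned}$$
Therefore $f(t) = \exp(-t^2\|h\|^2/4)$, which shows (4.1). Now the spectral theorem implies that $u$ is an analytic vector for $\phi_\pi(h)$ and thus $t\to W_\pi(th)u$ extends to an analytic function around the real axis. □

Define the space $\mathcal{H}_\pi := \mathcal{K}_\pi\otimes\Gamma(\mathfrak{h})$. We define $\Omega_\pi:\mathcal{K}_\pi\otimes\Gamma_{\rm fin}(\mathfrak{g})\to\mathcal{H}$ by setting
$$\Omega_\pi\,\psi\otimes\phi(h)^p\Omega := \phi_\pi(h)^p\psi,\quad h\in\mathfrak{g},\ \psi\in\mathcal{K}_\pi. \tag{4.2}$$
(Note that the vectors $\phi(h)^p\Omega = 2^{-p/2}a^*(h)^p\Omega$, for $h\in\mathfrak{g}$, span $\otimes_s^p\mathfrak{h}$.)

Proposition 4.2. The map $\Omega_\pi$ extends to an isometric map $\Omega_\pi:\mathcal{H}_\pi\to\mathcal{H}$, satisfying
$$\Omega_\pi\,1\otimes a^\sharp(h) = a_\pi^\sharp(h)\,\Omega_\pi,\quad h\in\mathfrak{g}. \tag{4.3}$$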


4.2. Number operator. We now discuss the number operator $N_\pi$ associated to a regular CCR representation in the space $\mathcal{H}$. One can give two equivalent definitions of $N_\pi$. Note that a more general notion of a number operator associated to a CCR representation (which in particular does not need to be a positive operator) has been introduced by Chaiken in [Ch1, Ch2].

The first definition uses the intertwiner $\Omega_\pi$. Define $D(N_\pi) := \Omega_\pi\,\mathcal{K}_\pi\otimes D(N)$, which is a subspace of $\mathcal{H}_\pi$ whose closure is $\mathrm{Ran}\,\Omega_\pi$. Let $N_\pi$ be the operator on $\mathcal{H}$ with the domain $D(N_\pi)$ defined by
$$N_\pi := \Omega_\pi\,1\otimes N\,\Omega_\pi^*.$$
(Note that $N_\pi$ need not be densely defined.)

Before we give an alternative definition of $N_\pi$, let us recall some facts about quadratic forms. We will assume that a positive quadratic form is defined on the whole space $\mathcal{H}$ and takes values in $[0,\infty]$. The domain of a positive quadratic form $b$ is defined as
$$D(b) := \{u\in\mathcal{H}\ |\ b(u)<\infty\}.$$
If the form $b$ is closed, then there exists a unique positive self-adjoint operator $B$ such that
$$D(b) = D(B^{1/2}),\quad b(u) = (u|Bu).$$
If $A$ is a closed operator, then $\|Au\|^2$ is a closed form. The sum of closed forms is a closed form, and the supremum of a family of closed forms is a closed form.

The following theorem gives an alternative definition of $N_\pi$.

Theorem 4.3. For each finite dimensional space $\mathfrak{f}\subset\mathfrak{g}$, one defines
$$n_{\pi,\mathfrak{f}}(u) := \sum_{i=1}^{\dim\mathfrak{f}}\|a_\pi(h_i)u\|^2,$$
where $\{h_i\}$ is an orthonormal basis of $\mathfrak{f}$. (If $u\notin D(a_\pi(h_i))$ for some $i$, then $n_{\pi,\mathfrak{f}}(u) = \infty$.) Then the quadratic form $n_{\pi,\mathfrak{f}}$ does not depend on the choice of the basis $\{h_i\}$ of $\mathfrak{f}$. The quadratic form $n_\pi$ is defined by
$$n_\pi(u) := \sup_{\mathfrak{f}} n_{\pi,\mathfrak{f}}(u),\quad u\in\mathcal{H}.$$
Then $D(n_\pi) = D(N_\pi^{1/2})$, and
$$n_\pi(u) = (u|N_\pi u),\quad u\in D(N_\pi).$$
In particular, $\mathrm{Ran}\,\Omega_\pi = \overline{D(n_\pi)}$.

To prepare for the proof of the above theorem, note that $n_\pi$ defines a positive operator, which we denote temporarily $\tilde N_\pi$, such that
$$D(n_\pi) = D(\tilde N_\pi^{1/2})\quad\text{and}\quad n_\pi(u) = (u|\tilde N_\pi u),\ u\in D(\tilde N_\pi). \tag{4.4}$$
Our aim is to show that $\tilde N_\pi = N_\pi$. Note also that
$$D(n_\pi)\subset D(\phi_\pi(h)),\quad h\in\mathfrak{g}. \tag{4.5}$$

Lemma 4.4. If $v\in D(\tilde N_\pi^{1/2})$, and $F$ is a Borel function, then
$$a_\pi(h)F(\tilde N_\pi-1)v = F(\tilde N_\pi)a_\pi(h)v. \tag{4.6}$$

Proof. First we note that $W_\pi(h)$ maps $D(n_\pi)$ into itself and we have
$$n_\pi(W_\pi(h)u) = n_\pi(u) + (u|\phi_\pi(ih)u) + \|h\|^2\|u\|^2/2. \tag{4.7}$$
In fact, using (2.5) we see that (4.7) is true if we replace $n_\pi$ with $n_{\pi,\mathfrak{f}}$, where $\mathfrak{f}$ is a finite dimensional subspace of $\mathfrak{g}$ containing $h$. Then (4.7) follows immediately. By the polarization identity, (4.7) has the following consequence for $u, w\in D(n_\pi)$:
$$(\tilde N_\pi^{1/2}W_\pi(h)w|\tilde N_\pi^{1/2}W_\pi(h)u) = (\tilde N_\pi^{1/2}w|\tilde N_\pi^{1/2}u) + (w|\phi_\pi(ih)u) + \|h\|^2(w|u)/2. \tag{4.8}$$
Replacing $w$ with $W_\pi(h)^*v$ and using the invariance of $D(n_\pi)$ under $W_\pi(h)$ we can rewrite (4.8) as follows, for $u, v\in D(n_\pi)$:
$$(\tilde N_\pi^{1/2}v|\tilde N_\pi^{1/2}W_\pi(h)u) = (\tilde N_\pi^{1/2}W_\pi(h)^*v|\tilde N_\pi^{1/2}u) + (W_\pi(h)^*v|\phi_\pi(ih)u) + \tfrac12\|h\|^2(W_\pi(h)^*v|u). \tag{4.9}$$
Next assume in addition that $u, v\in D(\tilde N_\pi)$. Then we can rewrite (4.9) as
$$(\tilde N_\pi v|W_\pi(h)u) = (W_\pi(h)^*v|\tilde N_\pi u) + (W_\pi(h)^*v|\phi_\pi(ih)u) + \tfrac12\|h\|^2(W_\pi(h)^*v|u). \tag{4.10}$$
Next we set $h = tg$, for $g\in\mathfrak{g}$, and we differentiate (4.10) w.r.t. $t$ at $t = 0$. (Differentiating is allowed by (4.5).) We obtain
$$(\tilde N_\pi v|\phi_\pi(g)u) = (\phi_\pi(g)v|\tilde N_\pi u) - i(v|\phi_\pi(ig)u). \tag{4.11}$$
Substituting $ig$ for $g$ in (4.11) and using $\phi_\pi(i\cdot ig) = -\phi_\pi(g)$, we obtain
$$(\tilde N_\pi v|\phi_\pi(ig)u) = (\phi_\pi(ig)v|\tilde N_\pi u) + i(v|\phi_\pi(g)u). \tag{4.12}$$
Adding (4.11) and $i$ times (4.12), and using $a_\pi(g) = 2^{-1/2}(\phi_\pi(g)+i\phi_\pi(ig))$, we get
$$(\tilde N_\pi v|a_\pi(g)u) = (a_\pi^*(g)v|\tilde N_\pi u) - (v|a_\pi(g)u),\quad u, v\in D(\tilde N_\pi). \tag{4.13}$$
Next let us assume that $u\in D(\tilde N_\pi^{3/2})$. Then $\tilde N_\pi u\in D(\tilde N_\pi^{1/2})\subset D(a_\pi(g))$. Hence, (4.13) implies
$$(\tilde N_\pi v|a_\pi(g)u) = (v|a_\pi(g)(\tilde N_\pi-1)u). \tag{4.14}$$
Therefore, $a_\pi(g)u\in D(\tilde N_\pi)$, and we have
$$\tilde N_\pi a_\pi(g)u = a_\pi(g)(\tilde N_\pi-1)u, \tag{4.15}$$
or equivalently
$$(\tilde N_\pi+\lambda)a_\pi(g)u = a_\pi(g)(\tilde N_\pi+\lambda-1)u. \tag{4.16}$$
Now let $v\in D(\tilde N_\pi^{1/2})$ and $\lambda>1$. Then $(\tilde N_\pi+\lambda-1)^{-1}v\in D(\tilde N_\pi^{3/2})$. Therefore, by (4.16),
$$(\tilde N_\pi+\lambda)a_\pi(g)(\tilde N_\pi+\lambda-1)^{-1}v = a_\pi(g)v. \tag{4.17}$$
Multiplying this with $(\tilde N_\pi+\lambda)^{-1}$, we obtain
$$a_\pi(g)(\tilde N_\pi+\lambda-1)^{-1}v = (\tilde N_\pi+\lambda)^{-1}a_\pi(g)v. \tag{4.18}$$
Since linear combinations of functions $(\tilde N_\pi+\lambda)^{-1}$ with $\lambda>0$ are strongly dense in the von Neumann algebra of functions of $\tilde N_\pi$, and $a_\pi(g)$ is closed, (4.18) implies
$$a_\pi(g)F(\tilde N_\pi-1)v = F(\tilde N_\pi)a_\pi(g)v,\quad v\in D(\tilde N_\pi^{1/2}),$$
for any bounded Borel function $F$. □
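The intertwining relation (4.18) can be illustrated in the most familiar special case. The following is only a sketch under stated assumptions: it replaces the abstract representation by the ordinary one-mode Fock number operator truncated to finitely many levels, and the helper names (`annihilation_matrix`, `intertwining_defect`) are hypothetical, introduced here for illustration. In this model $a\,F(N-1) = F(N)\,a$ holds exactly, since $a$ lowers the occupation number by one.

```python
import math

def annihilation_matrix(dim):
    """Matrix of a on span{|0>, ..., |dim-1>}, with a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def diagonal(values):
    """Diagonal matrix with the given entries."""
    dim = len(values)
    return [[values[i] if i == j else 0.0 for j in range(dim)] for i in range(dim)]

def matmul(x, y):
    """Plain matrix product of two square matrices of equal size."""
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def intertwining_defect(F, dim=12):
    """Largest entry of a F(N-1) - F(N) a in the truncated one-mode model."""
    a = annihilation_matrix(dim)
    f_shifted = diagonal([F(n - 1) for n in range(dim)])  # F(N - 1)
    f_plain = diagonal([F(n) for n in range(dim)])        # F(N)
    left = matmul(a, f_shifted)
    right = matmul(f_plain, a)
    return max(abs(left[i][j] - right[i][j])
               for i in range(dim) for j in range(dim))

# Resolvent case of (4.18): F(x) = (x + lam)^{-1} with lam > 1.
lam = 2.0
print(intertwining_defect(lambda x: 1.0 / (x + lam)))
```

The same check works for any function $F$ defined on $\{-1, 0, 1, \dots\}$; the value $F(-1)$ is never used, because the first column of the matrix of $a$ vanishes.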

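Lemma 4.5. $\mathcal{K}_\pi = \{0\}$ implies $D(n_\pi) = \{0\}$.

Proof. Suppose that $D(n_\pi)\neq\{0\}$. We know that $\tilde N_\pi\ge 0$. Therefore, $\sigma(\tilde N_\pi)$ is nonempty and bounded from below. Hence $\lambda_0 := \inf\sigma(\tilde N_\pi)$ is a finite number, and $\mathrm{Ran}\,1_{[\lambda_0,\lambda_0+1[}(\tilde N_\pi)\neq\{0\}$. By Lemma 4.4, for any $h\in\mathfrak{h}$,
$$a_\pi(h)1_{[\lambda_0,\lambda_0+1[}(\tilde N_\pi) = 1_{[\lambda_0-1,\lambda_0[}(\tilde N_\pi)a_\pi(h). \tag{4.19}$$
But
$$1_{[\lambda_0-1,\lambda_0[}(\tilde N_\pi) = 0.$$
Therefore, (4.19) is zero and $\mathrm{Ran}\,1_{[\lambda_0,\lambda_0+1[}(\tilde N_\pi)\subset\mathcal{K}_\pi$. Hence $\mathcal{K}_\pi\neq\{0\}$. □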

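The following lemma is immediate:

Lemma 4.6. Suppose that $\mathcal{H} = \mathcal{H}_0\oplus\mathcal{H}_1$. Suppose that $\mathfrak{g}\ni h\to W_\pi(h)$ is a CCR representation in $\mathcal{H}$ and that each $W_\pi(h)$ leaves $\mathcal{H}_0$ invariant. Then each $W_\pi(h)$ also leaves $\mathcal{H}_1$ invariant. Thus, we have two CCR representations
$$\mathfrak{g}\ni h\to W_\pi(h)\big|_{\mathcal{H}_0},\qquad \mathfrak{g}\ni h\to W_\pi(h)\big|_{\mathcal{H}_1}.$$
Let $\mathcal{K}_{\pi,i}$, $\tilde N_{\pi,i}$ denote the corresponding spaces of vacua and the operators defined by (4.4) for the representations $i = 0, 1$. Then
$$\mathcal{K}_\pi = \mathcal{K}_{\pi,0}\oplus\mathcal{K}_{\pi,1}, \tag{4.20}$$
$$\tilde N_\pi = \tilde N_{\pi,0}\oplus\tilde N_{\pi,1}. \tag{4.21}$$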

Lemma 4.7. The operators $W_\pi(h)$ preserve $\mathrm{Ran}\,\Omega_\pi$, for $h\in\mathfrak{g}$.

Proof. Since $\Omega_\pi$ is isometric, $\Omega_\pi(\mathcal{K}_\pi\otimes\Gamma_{\rm fin}(\mathfrak{g}))$ is dense in $\mathrm{Ran}\,\Omega_\pi$. It is also preserved by $\phi_\pi(h)$, $h\in\mathfrak{g}$, and consists of vectors analytic for $\phi_\pi(h)$. Hence, it is also preserved by $W_\pi(h) = e^{i\phi_\pi(h)}$. □

Proof of Theorem 4.3. By Lemma 4.7, we are in the situation of Lemma 4.6 and we have two CCR representations in $\mathcal{H}_0 = \mathrm{Ran}\,\Omega_\pi$ and in $\mathcal{H}_1 = \mathcal{H}_0^\perp$. By the definition of $N_\pi$, we have $N_\pi = N_{\pi,0}\oplus N_{\pi,1}$, where $D(N_{\pi,1}) = \{0\}$. We check immediately by (4.3) that $\tilde N_{\pi,0} = N_{\pi,0}$. We know that $\mathcal{K}_\pi\subset\mathcal{H}_0$, hence $\mathcal{K}_{\pi,1} = \{0\}$. By Lemma 4.5, this implies $D(\tilde N_{\pi,1}) = \{0\}$. Therefore, $\tilde N_\pi = N_\pi$. □

5. Gaussian Random Processes and the Q-Space Representation

In this section we describe the Q-space representation of Fock space and discuss the notion of Wick ordering associated to a Q-space representation. We also recall the notion of hypercontractivity, following [S-H.K, Si1].

5.1. Gaussian processes. Let $\mathfrak{f}$ be a real Hilbert space with the scalar product $(h_1,h_2)$, $h_1, h_2\in\mathfrak{f}$. Let $Q$ be a space with a σ-algebra $\mathcal{Q}$ and a probability measure $\mu$. Let $\mathrm{Exp}(F)$ denote $\int_Q F\,d\mu$, for any measurable function $F$ on $Q$. A linear map
$$\mathfrak{f}\ni h\to\phi(h) \tag{5.1}$$
into measurable functions on $Q$ is called a Gaussian random process if
$$\mathrm{Exp}\big(\phi(h)^{2p}\big) = 2^{-p}(h,h)^p,\qquad \mathrm{Exp}\big(\phi(h)^{2p+1}\big) = 0,$$
or equivalently
$$\mathrm{Exp}\big(e^{i\phi(h)}\big) = \exp\Big(-\frac12(h,h)\Big).$$

Proposition 5.1. The following conditions are equivalent:
(1) $\mathcal{Q}$ is the smallest σ-algebra for which $\phi(h)$, $h\in\mathfrak{f}$, are measurable;
(2) $L^2(Q,d\mu)$ is spanned by $\phi(h)^n$, $h\in\mathfrak{f}$;
(3) $L^\infty(Q,d\mu)$ is the smallest $W^*$-algebra containing $e^{i\phi(h)}$, $h\in\mathfrak{f}$.

We refer to [Si1, Lemma I.5] for the proof. If the conditions of the above proposition are satisfied, then one says that the process (5.1) is full. If not, one can always make it full by choosing a smaller σ-algebra of subsets of $Q$. (Obviously, this procedure does not change the Hilbert space spanned by $\phi(h)$, nor the $W^*$-algebra spanned by $e^{i\phi(h)}$.) Let us assume that the random process (5.1) is full.

Let $P_n$ denote the projection onto polynomials of degree at most $n$ in $\phi(h)$, $h\in\mathfrak{f}$, inside $L^2(Q,d\mu)$. For any $h_1,\dots,h_n\in\mathfrak{f}$ we define
$$:\!\phi(h_1)\cdots\phi(h_n)\!: \;:=\; (1-P_{n-1})\,\phi(h_1)\cdots\phi(h_n).$$
We recall the well known Wick identities
$$\begin{aligned}
:\!\phi(h)^n\!: &= \sum_{m=0}^{[n/2]}\frac{n!}{m!\,(n-2m)!}\,\phi(h)^{n-2m}\Big(-\frac12(h,h)\Big)^m,\\
\phi(h)^n &= \sum_{m=0}^{[n/2]}\frac{n!}{m!\,(n-2m)!}\,:\!\phi(h)^{n-2m}\!:\Big(\frac12(h,h)\Big)^m.
\end{aligned}\tag{5.2}$$
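The two Wick identities (5.2) are inverse to each other as polynomial identities, and this can be checked numerically. The sketch below treats $\phi(h)$ as a real number `x` and $(h,h)$ as a positive parameter `c`; the function names `wick_power` and `power_from_wick` are hypothetical, chosen for this illustration only.

```python
import math

def wick_coefficient(n, m):
    """The integer n! / (m! (n - 2m)!) appearing in (5.2)."""
    return math.factorial(n) // (math.factorial(m) * math.factorial(n - 2 * m))

def wick_power(n, x, c):
    """First identity of (5.2): the Wick power :x^n: with (h,h) replaced by c."""
    return sum(wick_coefficient(n, m) * x ** (n - 2 * m) * (-0.5 * c) ** m
               for m in range(n // 2 + 1))

def power_from_wick(n, x, c):
    """Second identity of (5.2): x^n rebuilt from lower Wick powers."""
    return sum(wick_coefficient(n, m) * wick_power(n - 2 * m, x, c) * (0.5 * c) ** m
               for m in range(n // 2 + 1))

x, c = 0.7, 1.3
for n in range(7):
    # The difference should vanish up to rounding errors.
    print(n, abs(power_from_wick(n, x, c) - x ** n))
```

For instance, the first identity gives $:\!x^2\!: = x^2 - c$ and $:\!x^4\!: = x^4 - 6cx^2 + 3c^2$, which are the Hermite polynomials $c^{n/2}\,\mathrm{He}_n(x/\sqrt{c})$ in the probabilists' normalization.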

5.2. Q-space representation of Fock space. Let $\mathfrak{h}$ be a Hilbert space with a complex conjugation $c$, that is an antilinear map $c:\mathfrak{h}\to\mathfrak{h}$ such that $c^2 = 1$ and $(ch|cg) = (g|h)$. We set $\mathfrak{h}_c := \{h\in\mathfrak{h}\ |\ ch = h\}$. Let $\mathcal{M}_c\subset B(\Gamma(\mathfrak{h}))$ be the abelian von Neumann algebra generated by the Weyl operators $W(h)$ for $h\in\mathfrak{h}_c$. The following basic result follows from the fact that $\Omega$ is a cyclic vector for $\mathcal{M}_c$ (see e.g. [S-H.K]).

Theorem 5.2. There exists a compact Hausdorff space $Q$, a probability measure $\mu$ on $Q$ and a unitary map $R$ such that
$$R:\Gamma(\mathfrak{h})\to L^2(Q,d\mu),\qquad R\Omega = 1,\qquad R\mathcal{M}_cR^* = L^\infty(Q,d\mu),$$
where $1\in L^2(Q,d\mu)$ is the constant function equal to 1 on $Q$. Moreover,
$$R\,\Gamma(c)u = \overline{Ru},\quad u\in\Gamma(\mathfrak{h}),$$
and $\mathfrak{h}_c\ni h\to R\phi(h)R^*$ is a full Gaussian random process on $Q$.

The space $L^2(Q,d\mu)$ is called the Q-space representation of the Fock space $\Gamma(\mathfrak{h})$ associated to $\mathcal{M}_c$. The following property of the Q-space representation is often useful (see [Si1, Prop. 1.7]).

Proposition 5.3. Let $\mathfrak{h} = \mathfrak{h}_1\oplus\mathfrak{h}_2$, where $\mathfrak{h}_i$, $i = 1, 2$, are Hilbert spaces with conjugations $c_i$. Equip $\mathfrak{h}$ with the conjugation $c = c_1\oplus c_2$. Then, as a Q-space representation of the Fock space $\Gamma(\mathfrak{h})$, one can take $L^2(Q,d\mu)$ for $Q = Q_1\times Q_2$, $\mu = \mu_1\otimes\mu_2$, where $L^2(Q_i,d\mu_i)$, $i = 1, 2$, is the Q-space representation of $\Gamma(\mathfrak{h}_i)$. We have $RU = R_1\otimes R_2$, where $U:\Gamma(\mathfrak{h}_1)\otimes\Gamma(\mathfrak{h}_2)\to\Gamma(\mathfrak{h})$ is defined in Subsect. 3.8.

To simplify notation, we will often omit the unitary transformation $R$ in the formulas. Similarly, a function $V$ on $Q$ will be identified with the operator of multiplication by $V$ on $\Gamma(\mathfrak{h})\equiv L^2(Q,d\mu)$. In particular, an element of $\Gamma(\mathfrak{h})\equiv L^2(Q,d\mu)$ can be considered as a multiplication operator on $\Gamma(\mathfrak{h})$, i.e. as an unbounded operator affiliated to the von Neumann algebra $\mathcal{M}_c$. For $v\in\Gamma(\mathfrak{h})$, this operator will be denoted by $\mathrm{Wick}_c(v)$. It is the unique operator affiliated to $\mathcal{M}_c$ such that $\mathrm{Wick}_c(v)\Omega = v$. For instance, if $h\in\mathfrak{h} = \otimes_s^1\mathfrak{h}$, then $\mathrm{Wick}_c(h) = a^*(h)+a(ch)$. One can generalize this formula to an arbitrary $v\in\Gamma_{\rm fin}(\mathfrak{h})$, by writing $\mathrm{Wick}_c(v)$ as a Wick polynomial.

Proposition 5.4. Let $v\in\Gamma_{\rm fin}(\mathfrak{h})$. Then $\mathrm{Wick}_c(v) = \mathrm{Wick}(\gamma_c(v))$, where
$$\gamma_c(v):\Gamma_{\rm fin}(\mathfrak{h})\to\Gamma_{\rm fin}(\mathfrak{h})$$
is defined by
$$(u_2|\gamma_c(v)u_1) := \big(\tilde I\,\Gamma(c)u_1\otimes u_2\,\big|\,N!^{-\frac12}v\big),\quad u_1, u_2\in\Gamma_{\rm fin}(\mathfrak{h}). \tag{5.3}$$

Proof. The proposition follows from Prop. 5.6 below. In fact by linearity we may assume that $v = \prod_1^p a^*(h_i)\Omega$. Using then the concrete expression of $\gamma_c(v)$ given in Prop. 5.6 i), we easily check the proposition. □

Proposition 5.5.
i) $\gamma_c(v)\Omega = v$,
ii) $\gamma_c(v)\Gamma(c) = \Gamma(c)\gamma_c(\Gamma(c)v)$,
iii) if $v\in\otimes_s^p\mathfrak{h}$, then $\gamma_c(v)\in\bigoplus_{r=0}^p B(\otimes_s^r\mathfrak{h},\otimes_s^{p-r}\mathfrak{h})$.

Using the identity (3.17) (which is also true for antilinear $q$), we see that ii) in Prop. 5.5 is equivalent to the identity
$$\Gamma(c)\mathrm{Wick}_c(v) = \mathrm{Wick}_c(\Gamma(c)v)\Gamma(c),$$
which in turn follows from the fact that $\Gamma(c)$ is simply the complex conjugation on $L^2(Q,d\mu)$.

Proposition 5.6. i) Let $v = \prod_1^p a^*(h_i)\Omega$. Then
$$\mathrm{Wick}_c(v) = \sum_{I\subset\{1,\dots,p\}}\ \prod_{i\in I}a^*(h_i)\prod_{i\notin I}a(ch_i),$$
$$\gamma_c(v) = \sum_{I\subset\{1,\dots,p\}}\Big|\mathop{\otimes_s}_{i\in I}h_i\Big)\Big(\mathop{\otimes_s}_{i\notin I}ch_i\Big|.$$

ii) (Wick's theorem). For $h\in\mathfrak{h}_c$,
$$:\!\phi(h)^p\!: \;=\; \frac{1}{2^{p/2}}\sum_{r=0}^p\binom{p}{r}a^*(h)^r a(h)^{p-r}.$$

Proof. It is easy to verify that the operator on the right hand side of the first identity of i) commutes with $\phi(h)$, $h\in\mathfrak{h}_c$, and maps $\Omega$ onto $v$. Hence it equals $\mathrm{Wick}_c(v)$. The second identity of i) follows then from Thm. 3.12 v). To prove ii), we first claim that $:\!\phi(h)^p\!:\Omega = \frac{1}{2^{p/2}}a^*(h)^p\Omega$. In fact the r.h.s. is orthogonal to the polynomials in $\phi(h_i)$, $h_i\in\mathfrak{h}_c$, of order less than $p-1$ and differs from $\phi(h)^p\Omega$ by a polynomial of order less than $p-1$. Hence it equals $:\!\phi(h)^p\!:\Omega$. Now ii) follows from i) and the fact that, since $:\!\phi(h)^p\!:$ is affiliated to $\mathcal{M}_c$, $\mathrm{Wick}_c(:\!\phi(h)^p\!:\Omega) = \;:\!\phi(h)^p\!:$. □

If $\mathfrak{h} = L^2(\mathbb{R},dk)$ and the conjugation $c$ is defined by $ch(k) = \bar h(-k)$ (which will be the case in the $P(\varphi)_2$ theory), and $v\in\otimes_s^p L^2(\mathbb{R})$, then using the notation (3.14) we can write
$$\mathrm{Wick}_c(v) = p!^{-\frac12}\sum_{r=0}^p\binom{p}{r}\int v(k_1,\dots,k_r,k_{r+1},\dots,k_p)\,a^*(k_1)\cdots a^*(k_r)a(-k_{r+1})\cdots a(-k_p)\,dk_1\cdots dk_p.$$

5.3. Hypercontractive semigroups. Let $(Q,d\mu)$ be a probability measure space.

Definition 5.7. Let $H_0\ge 0$ be a selfadjoint operator on $\mathcal{H} = L^2(Q,d\mu)$. The semigroup $e^{-tH_0}$ is hypercontractive, if
i) $e^{-tH_0}$ is a contraction on $L^1(Q,d\mu)$ for all $t>0$,
ii) there exist $T$, $C$ such that $\|e^{-TH_0}\psi\|_{L^4(Q,d\mu)}\le C\|\psi\|_{L^2(Q,d\mu)}$.

The abstract result used to construct the $P(\varphi)_2$ Hamiltonian is the following theorem, due to Segal ([Se]).

Theorem 5.8. Let $e^{-tH_0}$ be a hypercontractive semigroup. Let $V$ be a real function on $Q$ such that $V\in L^p(Q,d\mu)$, for some $p>2$, and $e^{-tV}\in L^1(Q,d\mu)$ for all $t>0$. Let $V_n = 1_{\{|V|\le n\}}V$ and $H_n = H_0+V_n$. Then the semigroups $e^{-tH_n}$ converge strongly on $\mathcal{H}$, when $n\to\infty$, to a strongly continuous semigroup on $\mathcal{H}$ denoted by $e^{-tH}$. Its infinitesimal generator $H$ has the following properties:
i) $H$ is the closure of $H_0+V$ defined on $D(H_0)\cap D(V)$,
ii) $H$ is bounded below: $H\ge -c-\ln\|e^{-V}\|_{L^p(Q,d\mu)}$, where $c$ and $p$ depend only on the constants $C$ and $T$ in Def. 5.7.

The following technical result (see [Si1, Lemma V.5] for a proof) will be used later to show that a given function $V$ on $Q$ verifies $e^{-tV}\in L^1(Q,d\mu)$.

Lemma 5.9. Let, for $\kappa\ge 1$, $V_\kappa$, $V$ be functions on $Q$ such that, for some $n\in\mathbb{N}$,
$$\|V-V_\kappa\|_{L^p(Q,d\mu)}\le C(p-1)^n\kappa^{-\epsilon},\qquad V_\kappa\ge -C(\ln\kappa)^n. \tag{5.4}$$
Then,
$$\mu\{q\in Q\ |\ V(q)\le -2C(\ln\kappa)^n\}\le Ce^{-c\kappa^\alpha},$$
for some $\alpha>0$. Consequently, $e^{-tV}\in L^1(Q,d\mu)$, $\forall t>0$.

The following theorem of Nelson (see [Si1, Thm. 1.17]) establishes a connection between contractions on $\mathfrak{h}$ and hypercontractive semigroups on $L^2(Q,d\mu)$.

Theorem 5.10. Let $r\in B(\mathfrak{h})$ be a selfadjoint contraction commuting with $c$. Then
i) $U\Gamma(r)U^*$ is a positivity preserving contraction on $L^p(Q,d\mu)$, $1\le p\le\infty$.
ii) If $\|r\|\le(p-1)^{\frac12}(q-1)^{-\frac12}$ for $1<p, q<\infty$, then $U\Gamma(r)U^*$ is a contraction from $L^p(Q,d\mu)$ to $L^q(Q,d\mu)$.

Combining Thm. 5.10 with Thm. 5.8, we obtain the following result.

Theorem 5.11. Let $\mathfrak{h}$ be a Hilbert space with a conjugation $c$. Let $a$ be a selfadjoint operator on $\mathfrak{h}$ with
$$[a,c] = 0,\qquad a\ge m>0. \tag{5.5}$$
Let $L^2(Q,d\mu)$ be the Q-space representation of $\Gamma(\mathfrak{h})$, and let $V$ be a real function on $Q$ with $V\in L^p(Q,d\mu)$, for some $p>2$, and $e^{-tV}\in L^1(Q,d\mu)$ for all $t>0$. Then,
i) the operator sum $H = d\Gamma(a)+V$ is essentially selfadjoint on $D(d\Gamma(a))\cap D(V)$;
ii) $H\ge -C$, where $C$ depends only on $m$ and $\|e^{-V}\|_{L^p(Q,d\mu)}$, for some $p$ depending only on $m$.

Note that, by applying Thm. 5.10 to $r = (q-1)^{-\frac12}1_{\mathfrak{h}}$ for $q>2$, we obtain the following lemma about the $L^p$ properties of finite vectors in $\Gamma(\mathfrak{h})$ (see [Si1, Thm. 1.22]).

Lemma 5.12. Let $\psi\in\otimes_s^n\mathfrak{h}$ and $q\ge 2$. Then
$$\|R\psi\|_{L^q(Q,d\mu)}\le(q-1)^{n/2}\|\psi\|.$$
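The bound of Lemma 5.12 can be tested exactly in a one-mode toy model. The sketch below rests on an assumption that is not part of the text: it takes $Q = \mathbb{R}$ with the standard Gaussian measure and represents an $n$-particle vector by the probabilists' Hermite polynomial $\mathrm{He}_n$. Since the inequality is homogeneous in $\psi$, normalization conventions do not matter. For $q = 4$ the claim reads $\mathrm{Exp}(\mathrm{He}_n^4)\le 3^{2n}\,\mathrm{Exp}(\mathrm{He}_n^2)^2$, and both sides are integers. All function names are hypothetical.

```python
from math import factorial

def hermite_coeffs(n):
    """Integer coefficients of He_n (index = power of x), via
    He_{k+1}(x) = x He_k(x) - k He_{k-1}(x)."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0] + cur                  # x * He_k
        for i, coef in enumerate(prev):  # minus k * He_{k-1}
            nxt[i] -= k * coef
        prev, cur = cur, nxt
    return cur

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def gaussian_moment(j):
    """Exp(x^j) for a standard Gaussian: 0 for odd j, (j-1)!! for even j."""
    if j % 2 == 1:
        return 0
    return factorial(j) // (2 ** (j // 2) * factorial(j // 2))

def expectation(poly):
    """Exact Gaussian expectation of a polynomial with integer coefficients."""
    return sum(coef * gaussian_moment(i) for i, coef in enumerate(poly))

def nelson_bound_holds(n):
    """Check ||He_n||_4^4 <= 3^(2n) ||He_n||_2^4, i.e. Lemma 5.12 with q = 4."""
    he = hermite_coeffs(n)
    square = poly_mul(he, he)
    l2_squared = expectation(square)
    l4_fourth = expectation(poly_mul(square, square))
    return l4_fourth <= 3 ** (2 * n) * l2_squared ** 2

print(all(nelson_bound_holds(n) for n in range(9)))
```

Working with integer arithmetic avoids any rounding issue in the comparison.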

6. The Spatially Cut-Off $P(\varphi)_2$ Hamiltonian

In this section, we recall some standard facts about the construction of the spatially cut-off $P(\varphi)_2$ Hamiltonian. Some of these facts are presented in a slightly more general form, which will be useful later when we consider the Mourre theory for $P(\varphi)_2$ Hamiltonians.

6.1. The spatially cut-off $P(\varphi)_2$ model. We recall now the definition of the spatially cut-off $P(\varphi)_2$ model that we will study in this paper (see for example [GJ1, S-H.K]). $P(\varphi)_2$ models describe quantum field theories in 2 space-time dimensions, which means that the 1-particle Hilbert space $\mathfrak{h}$ is taken equal to $L^2(\mathbb{R},dk)$. We normalize the Fourier transform by
$$\mathcal{F}: L^2(\mathbb{R},dx)\to L^2(\mathbb{R},dk),\qquad \mathcal{F}\chi(k) = \hat\chi(k) = \int e^{-ik\cdot x}\chi(x)\,dx.$$
The complex conjugation is the map $c$ defined by $ch(k) = \bar h(-k)$, which corresponds to the usual conjugation $f\to\bar f$ on $L^2(\mathbb{R},dx)$ by Fourier transformation. The bosonic Fock space $\Gamma(\mathfrak{h})$ will be denoted by $\mathcal{H}$. We fix the dispersion relation
$$\mathbb{R}\ni k\to\omega(k) = (k^2+m^2)^{\frac12},\quad m>0.$$
The kinetic energy is $H_0 = d\Gamma(\omega)$. Let us now define the interaction term. Let
$$\varphi(x) := \int e^{-ik\cdot x}\big(a^*(k)+a(-k)\big)\frac{dk}{\omega(k)^{\frac12}} \tag{6.1}$$
be the local relativistic field operator defined in distribution sense. Note that the local field $\varphi(x)$ is denoted by a different variety of the letter phi than the Segal field $\phi(h)$, $h\in\mathfrak{h}$ (see Subsect. 2.2). It is useful to note that $\varphi(x)$ can be formally expressed in terms of $\phi$ as follows. Set $f(k) := \sqrt2\,\omega(k)^{-\frac12}$. Let $\tau_x: L^2(\mathbb{R},dk)\to L^2(\mathbb{R},dk)$ be the translation by $x$, that is $\tau_xh = e^{-ik\cdot x}h$. Then formally
$$\varphi(x) = \phi(\tau_xf). \tag{6.2}$$
Unfortunately, $\tau_xf\notin L^2(\mathbb{R})$, so (6.2) has to be understood in the distribution sense. To remedy this one introduces UV-cutoff fields. Let $\chi\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^2(\mathbb{R},dx)$ with $\int\chi(x)\,dx = 1$ and let $\kappa\ge 1$ be a large UV-cutoff parameter. We introduce for later use the cutoff fields
$$\varphi_\kappa(x) := \kappa\int\varphi(y)\chi(\kappa(y-x))\,dy = \int e^{-ik\cdot x}\hat\chi\Big(\frac{k}{\kappa}\Big)\big(a^*(k)+a(-k)\big)\frac{dk}{\omega(k)^{\frac12}}. \tag{6.3}$$
If one sets
$$f_\kappa(k) := \sqrt2\,\omega(k)^{-\frac12}\hat\chi\Big(\frac{k}{\kappa}\Big), \tag{6.4}$$
then one can write
$$\varphi_\kappa(x) = \phi(\tau_xf_\kappa).$$
Note that since the function $\tau_xf_\kappa(k) = \sqrt2\,e^{-ik\cdot x}\hat\chi(\frac{k}{\kappa})\omega(k)^{-\frac12}$ belongs to $\mathfrak{h}_c$, $\varphi_\kappa(x)$ is affiliated to the algebra $\mathcal{M}_c$. We will set $\varphi_\infty(x) := \varphi(x)$.
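The role of the cutoff $\kappa$ can be made concrete by computing the vacuum variance of the cutoff field. From (6.3) and the canonical commutation relations one gets $\|\varphi_\kappa(x)\Omega\|^2 = \int|\hat\chi(k/\kappa)|^2\,\omega(k)^{-1}\,dk$, independently of $x$. The sketch below evaluates this integral under an assumption made only for illustration: $\chi$ is the normalized Gaussian $\chi(x) = (2\pi)^{-1/2}e^{-x^2/2}$, so that $\hat\chi(k) = e^{-k^2/2}$. The function name `vacuum_variance` and the quadrature parameters are hypothetical.

```python
import math

def vacuum_variance(kappa, m=1.0, t_max=20.0, steps=40000):
    """Approximate the integral of exp(-(k/kappa)^2) / sqrt(k^2 + m^2) over R.

    The substitution k = m sinh(t) turns dk / omega(k) into dt, so the
    integrand becomes exp(-(m sinh(t) / kappa)^2), integrated by the
    trapezoidal rule on [0, t_max] and doubled by symmetry.
    """
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        k = m * math.sinh(i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-(k / kappa) ** 2)
    return 2.0 * h * total

for kappa in (10.0, 100.0, 1000.0):
    # The variance grows without bound, roughly like 2 ln(kappa).
    print(kappa, vacuum_variance(kappa), 2.0 * math.log(kappa))
```

In this example each tenfold increase of $\kappa$ adds approximately $2\ln 10$ to the variance, which illustrates the logarithmic ultraviolet divergence of $\|\varphi_\kappa(x)\Omega\|^2$ as $\kappa\to\infty$.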

To define the spatially cut-off $P(\varphi)_2$ interaction, we fix a real polynomial of degree $2n$,
$$P(\lambda) = \sum_{j=0}^{2n}a_j\lambda^j,\quad\text{with } a_{2n}>0, \tag{6.5}$$
and a real function $g\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^2(\mathbb{R},dx)$ with $g\ge 0$. We set, for $\kappa<\infty$,
$$V_\kappa := \int g(x):\!P(\varphi_\kappa(x))\!:\,dx, \tag{6.6}$$
which is an unbounded operator affiliated to $\mathcal{M}_c$. We will see later that, when $\kappa\to\infty$, $V_\kappa$ converges in $L^2(Q,d\mu)$ to a function $V$, which we will denote by
$$V = \int g(x):\!P(\varphi(x))\!:\,dx.$$
Alternatively, one can view the multiplication operators $V_\kappa$ and $V$ as Wick polynomials, using the discussion in Subsect. 5.2. In fact for $p\in\mathbb{N}$, we have
$$\int g(x):\!\varphi_\kappa(x)^p\!:\,dx = \sum_{r=0}^p\binom{p}{r}\int w_{p,\kappa}(k_1,\dots,k_r,k_{r+1},\dots,k_p)\,a^*(k_1)\cdots a^*(k_r)a(-k_{r+1})\cdots a(-k_p)\,dk_1\cdots dk_p, \tag{6.7}$$
for
$$w_{p,\kappa}(k_1,\dots,k_p) = \hat g\Big(\sum_1^p k_i\Big)\prod_1^p\hat\chi\Big(\frac{k_i}{\kappa}\Big)\omega(k_i)^{-\frac12}. \tag{6.8}$$
From Lemma 6.1 and Prop. 6.5 below, we deduce that $V$ and $V_\kappa$ are defined as unbounded operators with domain $D(N^n)$. This implies that the operator sum $H_0+V$ is well defined as a symmetric operator on $D(H_0)\cap D(N^n)$. The construction of a unique selfadjoint extension of $H_0+V$ is outlined in Subsect. 6.4. This Hamiltonian is denoted by
$$H = H_0+\int g(x):\!P(\varphi(x))\!:\,dx,$$
and is called a spatially cut-off $P(\varphi)_2$ Hamiltonian or simply a $P(\varphi)_2$ Hamiltonian.

6.2. Assumptions on g. In this subsection, we discuss the various assumptions on the cutoff function $g$ which will be used in our paper. A spatially cut-off $P(\varphi)_2$ model is completely specified by the polynomial $P$ and the cutoff $g$. Let us introduce the following assumptions:

(A) $g\ge 0$, $g\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^2(\mathbb{R},dx)$.
Assumption (A) is a standard assumption needed to construct $H$ as a selfadjoint operator. (This assumption can be relaxed to $g\ge 0$, $g\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^{1+\epsilon}(\mathbb{R},dx)$, $\epsilon>0$, see [Si1].)

(C) $g\in H^{\frac12}(\mathbb{R})$.
Assumption (C) will be needed if $\deg P = 4$ to ensure that $D(H) = D(H_0)\cap D(V)$. This fact allows us to give a simpler treatment of the Mourre theory for $\varphi^4_2$ Hamiltonians that does not use the Q-representation.

(Mm) $(x\cdot\partial_x)^jg\in L^2(\mathbb{R},dx)$, $j = 1,\dots,m$.
Assumption (Mm) is needed to define the commutators $\mathrm{ad}^m_{d\Gamma(a)}V$, where $a$ is the generator of dilations on $\mathfrak{h}$, as densely defined operators (a priori they are only defined as quadratic forms).

(Is) $\langle x\rangle^sg\in L^2(\mathbb{R},dx)$, $s\ge 0$.
Assumption (Is) is needed for the scattering theory of spatially cut-off $P(\varphi)_2$ Hamiltonians. In particular (Is) for $s>1$ is a short-range condition, under which the asymptotic fields and the wave operators can be constructed.

(Bm) $g(x)\le Cg(y)\langle x-y\rangle^N$, $|(x\cdot\partial_x)^jg(x)|\le Cg(x)$, $0\le j\le m$.
Assumption (Bm) will be needed in Sect. 8 in order to be able to control the commutator $\mathrm{ad}^m_{d\Gamma(a)}V$. Note that (Bm) implies (Mm).

We will always assume (A). All other assumptions will be explicitly stated.

6.3. Some properties of the interaction kernel. We collect here various properties of the interaction kernels $w_{p,\kappa}$ of the Wick polynomials $V_\kappa$. The following lemma is well known.

Lemma 6.1. The kernels $w_{p,\kappa}$ are in $L^2(\mathbb{R}^p)$, for $1\le\kappa\le\infty$, and
$$\|w_{p,\kappa}-w_{p,\infty}\|_{L^2(\mathbb{R}^p)}\le C\|g\|_{L^2(\mathbb{R})}\kappa^{-\epsilon},\quad\epsilon>0.$$

Proof. We use the bound
$$\prod_1^p a_j\le\sum_{i=1}^p\Big(\prod_{j\neq i}a_j\Big)^{p/(p-1)}, \tag{6.9}$$
which follows from the fact that
$$\Big(\prod_1^p\lambda_i\Big)^{1/p}\le\sum_1^p\lambda_i,$$
applied to $\lambda_i = \big(\prod_{j\neq i}a_j\big)^{p/(p-1)}$. Applying (6.9) to $a_i = \omega(k_i)^{-\frac12}$, we obtain that $w_{p,\infty}$, and hence $w_{p,\kappa}$, for $\kappa<\infty$, belong to $L^2(\mathbb{R}^p)$. The bound on $w_{p,\kappa}-w_{p,\infty}$ is a direct computation, using (6.9). □
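The elementary bound (6.9) is easy to test numerically. The following sketch draws random positive numbers $a_1,\dots,a_p$ and compares both sides; the function names and the sampling ranges are hypothetical choices made for this illustration.

```python
import math
import random

def product_bound_holds(a):
    """Check prod_j a_j <= sum_i (prod_{j != i} a_j)^(p/(p-1)) for positive a_j."""
    p = len(a)
    lhs = math.prod(a)
    rhs = sum(
        math.prod(a[j] for j in range(p) if j != i) ** (p / (p - 1))
        for i in range(p)
    )
    # A tiny relative slack absorbs floating point rounding.
    return lhs <= rhs * (1.0 + 1e-12)

def random_check(trials=1000, p_max=6, seed=0):
    """Test the bound on random samples with 2 <= p <= p_max factors."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = rng.randint(2, p_max)
        a = [rng.uniform(0.01, 10.0) for _ in range(p)]
        if not product_bound_holds(a):
            return False
    return True

print(random_check())
```

For $p = 2$ the bound reduces to $a_1a_2\le a_1^2+a_2^2$. In the proof it is used with $a_i = \omega(k_i)^{-1/2}$, so that each term on the right hand side depends on only $p-1$ of the momenta.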

We deduce from Lemma 6.1 the following result:

Lemma 6.2. The operators $V_\kappa(N+1)^{-n}$ are bounded on $\mathcal{H}$, for $1\le\kappa\le\infty$, and
$$\|(V-V_\kappa)(N+1)^{-n}\|\le C\|g\|_{L^2(\mathbb{R})}\kappa^{-\epsilon},\quad\epsilon>0.$$

Lemma 6.3. Let $j\in C^\infty(\mathbb{R})$, with $j\equiv 0$ near 0 and $j\equiv 1$ near infinity. Then, for $1\le i\le p$,
$$\Big\|j\Big(\frac{x_i}{R}\Big)w_p\Big\|_{L^2(\mathbb{R}^p)}\in\begin{cases}O(R^{-s}), & \text{under hypothesis (Is)},\\ o(R^0), & \text{under hypothesis (A)}.\end{cases}$$

Proof. It suffices to prove the lemma for $i = 1$. It follows from (Is) that $\hat g$ belongs to the Sobolev space $H^s(\mathbb{R})$. Let us first check that
$$|D_{k_1}|^sw_p\in L^2(\mathbb{R}^p). \tag{6.10}$$
By interpolation it suffices to check this for $s\in\mathbb{N}$. We see that $\partial^s_{k_1}w_p$ is a sum of terms of the form
$$\hat g^{(s_1)}\Big(\sum_1^p k_i\Big)\big(\omega^{-\frac12}\big)^{(s_2)}(k_1)\prod_2^p\omega(k_i)^{-\frac12},$$
where $s = s_1+s_2$. Since $\hat g^{(s_1)}$ belongs to $L^2(\mathbb{R},dk)$, the bound (6.9) gives that $\partial^s_{k_1}w_p\in L^2(\mathbb{R}^p)$, and hence (6.10) is true. Next we note that since $j$ vanishes near the origin, $j(x) = |x|^sj_s(x)$, where $j_s$ is bounded. Now
$$j\Big(\frac{x_1}{R}\Big) = j\Big(\frac{-D_{k_1}}{R}\Big) = R^{-s}j_s\Big(\frac{-D_{k_1}}{R}\Big)|D_{k_1}|^s,$$
and by the spectral theorem, $j_s(\frac{-D_{k_1}}{R})$ is a uniformly bounded operator on $L^2(\mathbb{R}^p)$. The lemma follows then from (6.10). □

6.4. Existence and basic properties. We summarize now standard results on the existence of the $P(\varphi)_2$ Hamiltonian (see [GJ1, Se, S-H.K, Ro1]).

Theorem 6.4. Let $P$ be a real, bounded below polynomial of degree $2n$ and $g\ge 0$, $g\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^2(\mathbb{R},dx)$. Then,
i) the Hamiltonian $H = H_0+V$ for $V = \int g(x):\!P(\varphi(x))\!:\,dx$ is essentially selfadjoint on $D(H_0)\cap D(V)$;
ii) there exist $b, C\ge 0$, such that the following first order estimates hold:
$$H_0\le C(H+b), \tag{6.11}$$
$$N\le C(H+b). \tag{6.12}$$

Proof. i) follows from Thm. 5.11 using Lemma 6.6 below. ii) follows from Thm. 5.11 with $a$ replaced by $(1-\epsilon)a$. Equation (6.12) follows from (6.11). □

Proposition 6.5. Let $P(\lambda)$ be a real polynomial of degree $2n$ as in (6.5) and $\tilde P(\lambda)$ be a real polynomial of degree $\le 2n$. Assume that the coefficients $\tilde a_{2n}$ of $\tilde P(\lambda)$ and $a_{2n}$ of $P(\lambda)$ satisfy $|\tilde a_{2n}|<a_{2n}$. Let $g, \tilde g\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^2(\mathbb{R},dx)$ be two functions with $g\ge 0$, $|\tilde g|\le g$. Let $\tilde V := \int\tilde g(x):\!\tilde P(\varphi(x))\!:\,dx$. Then $\tilde V$ is a multiplication operator in the Q-space representation, and there exist $C, b$ such that
$$|\tilde V|\le C(H+b).$$

The above theorem and proposition follow from the next lemma, where we collect some well-known properties (see e.g. [Ne, Se, S-H.K]), which show that the P (ϕ)2 interaction is a multiplication operator in the Q-space representation. Lemma 6.6. Under the conditions of Proposition 6.5, set W := C g(x) : P (ϕ(x)) : −g(x) ˜ : P˜ (ϕ(x)) : dx. Then W is a multiplication operator in the Q-space representation. Moreover, W ∈ Lp (Q, dµ), for all p < ∞, and e−tW L1 (Q,dµ) depends only on P , P˜ and gL1 (R)∩L2 (R) . Proof. Note first that since |g| ˜ ≤ g, g ˜ L1 (R)∩L2 (R) is less than gL1 (R)∩L2 (R) . Let, for κ ≥ 1, Wκ be the cutoff operator defined as in (6.6). Both W and Wκ are Wick polynomials. Applying Lemma 6.2 we see that W (N + 1)−n is bounded and (W − Wκ )(N + 1)−n ≤ CgL2 (R) κ − , > 0.

(6.13)

We have seen at the end of Subsect. 6.1 that $W_\kappa$ is a multiplication operator by a function $W_\kappa$ on $Q$. Moreover $W_\kappa\in L^2(Q,d\mu)$, since $\|W_\kappa\|_{L^2(Q,d\mu)} = \|W_\kappa\Omega\|_{\mathcal{H}}<\infty$. Next it follows from (6.13) that $W_\kappa$ is Cauchy in $L^2(Q,d\mu)$, and hence converges to $W$ in $L^2(Q,d\mu)$ when $\kappa\to+\infty$. We note then that $W\Omega\in\otimes_s^{2n}\mathfrak{h}$ since $W$ is a Wick polynomial of degree $2n$, which by Lemma 5.12 implies that $W\in L^p(Q,d\mu)$, for all $p<\infty$. To bound $\|e^{-tW}\|_{L^1(Q,d\mu)}$, we will use Lemma 5.9, checking that the constants $C$, $\epsilon$, $n$ there depend only on $P$, $\tilde P$ and $\|g\|_{L^1(\mathbb{R})\cap L^2(\mathbb{R})}$. The first bound of (5.4) follows from (6.13) and Lemma 5.12. To check the second bound, we use the Wick identities (5.2), which yield
$$\begin{aligned}
:\!P(\varphi_\kappa(x))\!: &= \sum_{p=0}^{2n}a_p\sum_{r=0}^{[p/2]}b_{p,r}\|\varphi_\kappa(x)\Omega\|^{2r}\varphi_\kappa(x)^{p-2r},\\
:\!\tilde P(\varphi_\kappa(x))\!: &= \sum_{p=0}^{2n}\tilde a_p\sum_{r=0}^{[p/2]}b_{p,r}\|\varphi_\kappa(x)\Omega\|^{2r}\varphi_\kappa(x)^{p-2r},
\end{aligned}\tag{6.14}$$
for $b_{p,0} = 1$. Hence, as an inequality between functions on $Q$, we have
$$:\!P(\varphi_\kappa(x))\!:-\,|:\!\tilde P(\varphi_\kappa(x))\!:|\ \ge F_\kappa(\varphi_\kappa(x)),$$
for
$$F_\kappa(\lambda) = \sum_{p=0}^{2n}\sum_{r=0}^{[p/2]}c_{p,r}\|\varphi_\kappa(x)\Omega\|^{2r}\lambda^{p-2r}, \tag{6.15}$$
and $c_{2n,0} = a_{2n}-|\tilde a_{2n}|$. If we apply the bound $a^{2r}b^{p-2r}\le\epsilon b^p+C_\epsilon a^p$ to all terms in (6.15), for $p<2n$, and use that $a_{2n}-|\tilde a_{2n}|>0$, we obtain
$$F_\kappa(\lambda)\ge -C\big(\|\varphi_\kappa(x)\Omega\|^{2n}+1\big). \tag{6.16}$$
By a direct computation, we check that $\|\varphi_\kappa(x)\Omega\| = \|\varphi_\kappa(0)\Omega\|\in O((\ln\kappa)^{\frac12})$. This gives
$$\begin{aligned}
W_\kappa &= \int g(x):\!P(\varphi_\kappa(x))\!:-\,\tilde g(x):\!\tilde P(\varphi_\kappa(x))\!:\,dx\\
&\ge\int g(x):\!P(\varphi_\kappa(x))\!:-\,g(x)\,|:\!\tilde P(\varphi_\kappa(x))\!:|\,dx\\
&\ge -C\|g\|_{L^1(\mathbb{R})}\big((\ln\kappa)^n+1\big),
\end{aligned}$$
which proves the second bound of (5.4), with a constant depending only on $\|g\|_{L^1(\mathbb{R})}$. □

7. Higher Order Estimates

In this section we will state some higher order estimates which will be very important in the sequel. These higher order estimates are due to Glimm and Jaffe [GJ2] for the $\varphi^4_2$ model and (in a more general form than here) to Rosen [Ro2] for the general $P(\varphi)_2$ model. Note, however, that the proof in [Ro2] is only valid under the additional hypothesis that $g\in C_0^\infty(\mathbb{R})$ (see Remark 7.5). In this section we will explain the modifications to Rosen's proof necessary to treat the general case when $g\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^2(\mathbb{R},dx)$. The reader may also consult [Si2] for a review of higher order estimates.

Theorem 7.1. Assume hypothesis (A). Then there exists $b>0$ such that, for all $\alpha\in\mathbb{N}$, the following higher order estimates hold:
$$\begin{aligned}
&\|N^\alpha(H+b)^{-\alpha}\|<\infty,\\
&\|H_0N^\alpha(H+b)^{-n-\alpha}\|<\infty,\\
&\|N^\alpha(H+b)^{-1}(N+1)^{1-\alpha}\|<\infty.
\end{aligned}\tag{7.1}$$

In the case of the $\varphi^4_2$ model a better estimate is known.

Theorem 7.2. Assume $\deg P = 4$ and hypothesis (C). Then there exists $b>0$ such that
$$\|H_0(H+b)^{-1}\|<\infty.$$
Consequently $D(H) = D(H_0)\cap D(V)$.

Proof. The proof given in [GJ2] for $g\in C_0^\infty(\mathbb{R})$ is still valid under hypothesis (C). In fact one first proves that if $W = \int g(x):\!\varphi_\kappa(x)^p\!:\,dx$ for $p\le 4$ then (see [GJ2, Eq. 2.17]):
$$\big\|(H_0+1)^{-1}[H_0^{\frac12},[H_0^{\frac12},W]](H_0+1)^{-1}\big\|\le C\Big\|\omega^{\frac12}\Big(\sum_1^4k_i\Big)w_p\Big\|_{L^2(\mathbb{R}^4)}.$$
(Note that the expressions $W$, $[H_0^{\frac12},W]$ and $[H_0^{\frac12},[H_0^{\frac12},W]]$ are well defined as quadratic forms on $\mathcal{S} = \Gamma_{\rm fin}(C_0^\infty(\mathbb{R}))$ since $H_0^{\frac12}$ preserves $\mathcal{S}$.) One then uses Jaffe's double commutator trick (see e.g. [Si2, Sect. 4]) to obtain that, as quadratic forms on $\mathcal{S}$,
$$H^2\ge c(H_0^2+V^2)-d. \tag{7.2}$$
Now it follows from the higher order estimates that any core for $H_0^n$ is a core for $H$, and in particular, $\mathcal{S}$ is a core for $H$. Hence (7.2) extends to $D(H)$, which proves the theorem. □

Remark 7.3. The importance of the higher order estimates comes from the fact that the domain of $H$ is not known explicitly. In particular, the question whether $D(H) = D(H_0)\cap D(V)$ is still an open problem, except for the $\varphi^4_2$ model, where this result was shown by Glimm and Jaffe in [GJ2]. This means that, for $u\in D(H)$, the identity $Hu = H_0u+Vu$ does not make sense for the general $P(\varphi)_2$ model. However, a consequence of the higher order estimates is that $D(H^n)\subset D(V)\cap D(H_0)$, so that this identity makes sense for $u\in D(H^n)$.

The proof of higher order estimates in [Ro2] is based on the pullthrough formula, which gives an expression for the multi-commutators of annihilation operators $a(k_i)$ with the resolvent $(H-z)^{-1}$. The technical problem is that, in order to make sense of these formal computations, one needs a subspace $\mathcal{D}$ of $\mathcal{H}$ that is contained in the domain of all powers of creation operators and on which $H$ is essentially selfadjoint. To circumvent this difficulty, Rosen introduced cutoff Hamiltonians for which the interaction acts only on a finite number of degrees of freedom. In a Q-space representation these cutoff Hamiltonians become differential operators, for which the construction of a subspace $\mathcal{D}$ with the above properties is easy. The higher order estimates are then shown for the cutoff Hamiltonians, with constants uniform in the cutoff parameters. The proof is then completed by taking the cutoff to infinity.

7.1. Cutoff Hamiltonians. In this subsection we introduce the U.V. cutoff Hamiltonians used in the proof of the higher order estimates. Let h be a Hilbert space equipped with a conjugation c. Let π1 : h → h1 be an orthogonal projection on a closed subspace h1 of h with [π1 , c] = 0. Let h⊥ 1 be the orthogonal complement of h1 . In all formulas below we will consider π1 as an element of B(h, h1 ). With this convention the orthogonal projection on h1 , considered as an element of B(h, h), is equal to π1∗ π1 . Let U : (h1 ) ⊗ (h⊥ 1 ) → (h) be the unitary map defined in Subsect. 3.8. We ⊥ ⊥ 2 denote by L (Q1 , dµ1 ), L2 (Q⊥ 1 , dµ1 ) the Q-space representations of (h1 ), (h1 ). Recall that, by Prop. 5.3, we may take as a Q-space representation of (h) the space

76

J. Derezi´nski, C. Gérard

$L^2(Q,d\mu)$ for $Q = Q_1\times Q_1^\perp$, $\mu = \mu_1\otimes\mu_1^\perp$. Accordingly we denote by $(q_1,q_1^\perp)$ the elements of $Q = Q_1\times Q_1^\perp$. If $W\in B(\Gamma(h))$ we define $\Pi_1W\in B(\Gamma(h))$ by
$$\Pi_1 W := U\big(\Gamma(\pi_1)W\Gamma(\pi_1^*)\otimes 1\big)U^*.$$

Lemma 7.4. i) If $w\in B_{\rm fin}(\Gamma(h))$, then
$$\Pi_1{\rm Wick}(w) = {\rm Wick}\big(\Gamma(\pi_1^*\pi_1)\,w\,\Gamma(\pi_1^*\pi_1)\big). \tag{7.3}$$
ii) If $V$ is a multiplication operator by a function in $L^2(Q,d\mu)$, then $\Pi_1V$ is the operator of multiplication by the function
$$\Pi_1V(q_1) = \int_{Q_1^\perp} V(q_1,q_1^\perp)\,d\mu_1^\perp. \tag{7.4}$$

Proof. To prove i), we may assume by linearity that
$$w = |h_1\otimes_s\cdots\otimes_s h_q)(g_1\otimes_s\cdots\otimes_s g_p|,$$
for which the verification of i) is easy. To prove ii), we first deduce from Prop. 3.5 that $\Gamma(\pi_1)U = 1\otimes|\Omega)(\Omega|$. This implies that, as a multiplication operator on $Q_1$, $\Gamma(\pi_1)V\Gamma(\pi_1^*)$ is given by (7.4). Then one uses Prop. 5.3. □

In particular, if $W = \prod_{i=1}^{q}a^*(h_i)\prod_{i=1}^{p}a(g_i)$, then
$$\Pi_1W = \prod_{i=1}^{q}a^*(\pi_1^*\pi_1h_i)\prod_{i=1}^{p}a(\pi_1^*\pi_1g_i). \tag{7.5}$$

Let now $\{\pi_n\}_{n\in\mathbb{N}}$ be a sequence of orthogonal projections on $h$ such that
$$\pi_n\le\pi_{n+1},\qquad [\pi_n,c]=0,\qquad \operatorname*{s-lim}_{n\to+\infty}\pi_n = 1, \tag{7.6}$$
and let $\Pi_n$ be the associated maps defined by (7.3). Using the representation (7.4), it is shown in [S-H.K, Prop. 4.9] that

i) $\Pi_nV\to V$ in $L^p(Q,d\mu)$, when $n\to\infty$, if $V\in L^p(Q,d\mu)$;

ii) $\|e^{-t\Pi_nV}\|_{L^1(Q,d\mu)}\le\|e^{-tV}\|_{L^1(Q,d\mu)}$. (7.7)
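A way to see why a bound of the form ii) is plausible: by (7.4), $\Pi_nV$ is obtained by integrating out the variables $q^\perp$ against a probability measure, so it is a conditional expectation, and the convexity of $\lambda\mapsto e^{-t\lambda}$ then gives the inequality through Jensen's inequality. The following Python sketch checks this on a small finite probability space. The finite space, the weights and the table `V` are illustrative assumptions and do not come from [S-H.K].

```python
import math
from fractions import Fraction

# Toy product space Q = Q1 x Q1_perp with product probability measure.
# All numerical values are hypothetical and chosen only for illustration.
mu1 = [Fraction(1, 2), Fraction(1, 2)]                        # weights on Q1
mu1_perp = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]   # weights on Q1_perp
V = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.5]]                                        # V[q1][q1_perp]

def project(V):
    """Discrete analogue of (7.4): integrate out the q1_perp variable."""
    return [sum(float(w) * v for w, v in zip(mu1_perp, row)) for row in V]

def l1_norm_exp(t, V):
    """L^1(Q, mu) norm of exp(-t V), V a function of (q1, q1_perp)."""
    return sum(float(a) * float(b) * math.exp(-t * V[i][j])
               for i, a in enumerate(mu1)
               for j, b in enumerate(mu1_perp))

def l1_norm_exp_projected(t, V):
    """L^1 norm of exp(-t Pi_1 V); the projected function depends on q1 only."""
    PV = project(V)
    return sum(float(a) * math.exp(-t * PV[i]) for i, a in enumerate(mu1))

# Jensen's inequality, applied for each fixed q1, gives the discrete version of (7.7) ii).
for t in (0.1, 1.0, 3.0):
    assert l1_norm_exp_projected(t, V) <= l1_norm_exp(t, V)
```

The check only illustrates the convexity mechanism; the statement in infinite dimensions is the one proved in the cited reference.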

Let us now specify the particular sequence of projections corresponding to the lattice regularization and volume cutoff used in [Ro2]. For $v\ge1$ we introduce the lattice $v^{-1}\mathbb{Z}$ and let $\mathbb{R}\ni k\mapsto[k]_v\in v^{-1}\mathbb{Z}$ be the integer part of $k$ defined by $-(2v)^{-1}<k-[k]_v\le(2v)^{-1}$. For $\gamma\in v^{-1}\mathbb{Z}$, we denote by $e_\gamma\in h$ the vector $e_\gamma(k) = v^{1/2}\,1_{]-(2v)^{-1},(2v)^{-1}]}(k-\gamma)$ and set $a^\sharp(\gamma) := a^\sharp(e_\gamma)$. For $\kappa\in[1,+\infty[$ an UV cutoff parameter, we denote by $\Gamma_{\kappa,v}$ the set $v^{-1}\mathbb{Z}\cap\{|\gamma|\le\kappa\}$, by $h_{\kappa,v}$ the finite dimensional space spanned by $\{e_\gamma\}_{\gamma\in\Gamma_{\kappa,v}}$, and by $\pi_{\kappa,v}$ the orthogonal projection from $h$ onto $h_{\kappa,v}$.
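The lattice objects just introduced can be made concrete with a few lines of Python. This is a minimal sketch under the stated definitions; the function names `lattice_part`, `lattice_points` and `e_gamma` are hypothetical and only serve the illustration. Exact rational arithmetic is used so that the half-open condition $-(2v)^{-1} < k - [k]_v \le (2v)^{-1}$ is respected at the cell boundaries.

```python
import math
from fractions import Fraction

def lattice_part(k, v):
    """Return [k]_v in v^{-1} Z with -(2v)^{-1} < k - [k]_v <= (2v)^{-1}."""
    k, v = Fraction(k), Fraction(v)
    # m = v*[k]_v is the integer with v*k - 1/2 <= m < v*k + 1/2.
    return Fraction(math.ceil(v * k - Fraction(1, 2))) / v

def lattice_points(kappa, v):
    """Finite lattice {gamma in v^{-1} Z : |gamma| <= kappa}."""
    kappa, v = Fraction(kappa), Fraction(v)
    m = math.floor(kappa * v)
    return [Fraction(j) / v for j in range(-m, m + 1)]

def e_gamma(gamma, v, k):
    """e_gamma(k) = v^{1/2} on the cell ]gamma - 1/(2v), gamma + 1/(2v)], else 0."""
    return math.sqrt(v) if lattice_part(k, v) == Fraction(gamma) else 0.0
```

Since each cell has length $v^{-1}$ and $e_\gamma$ equals $v^{1/2}$ on it, the vectors $e_\gamma$ have norm one, and distinct $\gamma$ give disjoint supports, so the family is orthonormal.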

Spectral and Scattering Theory of Spatially Cut-Off P (ϕ)2 Hamiltonians


Let us fix a sequence $(\kappa_n,v_n)$ tending to $(\infty,\infty)$ in such a way that
$$\Gamma_{\kappa_n,v_n}\subset\Gamma_{\kappa_{n+1},v_{n+1}}. \tag{7.8}$$
We denote by $\Gamma_n$ the finite lattice $\Gamma_{\kappa_n,v_n}$ and choose as $\pi_n$ the projection on $h_n := h_{\kappa_n,v_n}$, which satisfies (7.6). If $V = \int g(x):\!P(\varphi(x))\!:dx$, we define the cutoff interaction $V_n := \Pi_nV$. Using (7.5) and (6.7), we obtain the following explicit expression for $V_n$:
$$V_n = \sum_{p=0}^{\deg P}a_p\int g(x):\!\varphi_n(x)^p\!:dx, \tag{7.9}$$
where
$$:\!\varphi_n(x)^p\!: \;:=\; \sum_{r=0}^{p}\binom{p}{r}\sum_{\gamma_1,\dots,\gamma_p\in\Gamma_n}a^*(\gamma_1)\cdots a^*(\gamma_r)\,a(-\gamma_{r+1})\cdots a(-\gamma_p)\prod_{i=1}^{p}\mu_n(x,\gamma_i), \tag{7.10}$$
and
$$\mu_n(x,\gamma) := v_n^{1/2}\int_{-(2v_n)^{-1}}^{+(2v_n)^{-1}}e^{-i\langle x,\gamma+k\rangle}\,\omega(\gamma+k)^{-1/2}\,dk. \tag{7.11}$$

Remark 7.5. Our definition of the cutoff interaction is different from the one used in [Ro2]. In fact, the cutoff interaction used there is obtained by replacing the orthogonal projection $\pi_n : h\to h_n$ by the unbounded operator
$$h\mapsto\sum_{\gamma\in\Gamma_{\kappa_n,v_n}}v^{-1/2}h(\gamma)\,e_\gamma.$$
With this convention, it is easy to see that, for example, $V_n\Omega$ will not converge to $V\Omega$ for an arbitrary $g\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^2(\mathbb{R},dx)$.

To define the cutoff kinetic energy, we set, as in [Ro2], $\omega_n : \mathbb{R}\to\mathbb{R}$, $\omega_n(k) := \omega([k]_{v_n})$. Since $[\omega_n,\pi_n]=0$, the operator $\omega_n$ acts on $h_n$ and $h_n^\perp$. By Prop. 3.5, we have
$$H_{0,n} := d\Gamma(\omega_n) = U_n\Big(d\Gamma\big(\omega_n|_{h_n}\big)\otimes1 + 1\otimes d\Gamma\big(\omega_n|_{h_n^\perp}\big)\Big)U_n^*, \tag{7.12}$$
where $U_n$ is the unitary operator between $\Gamma(h_n)\otimes\Gamma(h_n^\perp)$ and $\Gamma(h)$. The cutoff Hamiltonian is then defined as $H_n := H_{0,n}+V_n$.


7.2. Properties of the cutoff Hamiltonians. In this subsection we collect some properties of the cutoff Hamiltonians that are needed to prove the higher order estimates. These properties are: the existence of a suitable domain of essential selfadjointness, the uniform lower bounds and finally resolvent convergence to the Hamiltonian $H$.

Proposition 7.6 ([Ro2]). The Hamiltonians $H_n^j$, $j\in\mathbb{N}$, are essentially selfadjoint on
$$D_n = U_n\big(\Gamma_{\rm fin}(h_n)\otimes\Gamma_{\rm fin}(h_n^\perp\cap S(\mathbb{R}))\big).$$

Proof. As in [Ro2], we have
$$U_n^*H_nU_n = \hat H_n\otimes1 + 1\otimes d\Gamma\big(\omega_n|_{h_n^\perp}\big),$$
for $\hat H_n = d\Gamma(\omega_n|_{h_n}) + V_n$. Since $h_n$ is finite dimensional, in the Q-space representation $\hat H_n$ becomes a differential operator $-\Delta+W(x)$, for $W$ a bounded below polynomial, acting on $L^2(\mathbb{R}^{\dim h_n},dx)$. By [GJ3, Thm. 2.2.6], $\hat H_n^j$ is essentially selfadjoint on $\Gamma_{\rm fin}(h_n)$, for $j\in\mathbb{N}$. The arguments in [Ro2] give then the proposition. □

Proposition 7.7. Let, for $n\in\mathbb{N}$, $\tilde g_n\in L^1_{\mathbb{R}}(\mathbb{R},dx)\cap L^2(\mathbb{R},dx)$ with $|\tilde g_n|\le Cg$ and $\tilde P(\lambda)$ a real polynomial of degree less than $\deg P-1$. Then there exist constants $a,b>0$ such that
$$H_{0,n}\le a(H_n+b),\qquad \int\tilde g_n(x):\!\tilde P(\varphi_n(x))\!:dx\le C(H_n+b).$$

Proof. Let
$$W^n = \int\big(g(x):\!P(\varphi(x))\!: - \tilde g_n(x):\!\tilde P(\varphi(x))\!:\big)dx,$$
and $W_n = \Pi_nW^n$. By Lemma 6.6, $\|e^{-tW^n}\|_{L^1(Q,d\mu)}$ is bounded uniformly in $n$. Hence, by (7.7), $\|e^{-tW_n}\|_{L^1(Q,d\mu)}$ is bounded uniformly in $n$. On the other hand, $H_{0,n} = d\Gamma(\omega_n)$, where $\omega_n$ satisfies (5.5). It follows then from Thm. 5.11 that $H_{0,n}+W_n$ is bounded below uniformly in $n$. This shows the second bound in the proposition. The first one follows from the same argument, replacing $H_{0,n}$ by $(a-1)H_{0,n}$. □

Finally, the following result is shown in [S-H.K, Prop. 4.8].

Proposition 7.8. For $z\le-b$, $(H_n-z)^{-1}\to(H-z)^{-1}$ in norm, when $n\to+\infty$.


7.3. Proof of the higher order estimates. We now explain the modifications to the proof of Rosen [Ro2] needed in our case. The only places where modifications are needed are the ones where the interaction $V_n$ appears, i.e. in [Ro2, Lemma 4.4]. We define, for $I=\{1,\dots,s\}$, $k_i\in\mathbb{R}$, $i\in I$:
$$V_n^I := {\rm ad}_{a(k_1)}\cdots{\rm ad}_{a(k_s)}V_n,$$
which is well defined as an operator on $D_n$. The analog of [Ro2, Lemma 4.4] is now

Lemma 7.9. There exist $b,c>0$ such that for all $\lambda_1,\lambda_2<-b$, $(H_n-\lambda_2)^{-1/2}V_n^I(H_n-\lambda_1)^{-1/2}$ defined on $(H_n-\lambda_1)^{1/2}D_n$ extends to a bounded operator on $\mathcal{H}$ such that
$$\big\|(H_n-\lambda_2)^{-1/2}V_n^I(H_n-\lambda_1)^{-1/2}\big\|\le c\prod_{i=1}^{s}\omega(k_i)^{-1/2}.$$

Proof. Using (7.10) and the commutation relation
$$[a(k),a^*(\gamma)] = e_\gamma(k) = v^{1/2}\delta(\gamma,[k]_v),$$
for
$$\delta(\gamma,\gamma') = \begin{cases}1&\text{if }\gamma=\gamma',\\0&\text{otherwise},\end{cases}$$
we obtain that
$$[a(k),:\!\varphi_n(x)^p\!:] = 0,\quad\text{if }|k|>\kappa_n,$$
and
$$[a(k),:\!\varphi_n(x)^p\!:] = \sum_{r=0}^{p}r\binom{p}{r}\sum_{\gamma_2,\dots,\gamma_p\in\Gamma_n}a^*(\gamma_2)\cdots a^*(\gamma_r)\,a(-\gamma_{r+1})\cdots a(-\gamma_p)\times v_n^{1/2}\mu_n(x,[k]_{v_n})\prod_{i=2}^{p}\mu_n(x,\gamma_i),\quad\text{if }|k|\le\kappa_n.$$
We obtain
$$V_n^I = \prod_{i=1}^{s}1_{\{|k|\le\kappa\}}(k_i)\,v_n^{s/2}\int g(x)\prod_{i=1}^{s}\mu_n(x,[k_i]_{v_n}):\!P^{(s)}(\varphi_n(x))\!:dx.$$
For fixed $(k_1,\dots,k_s)$, the operator $V_n^I$ is of the form
$$\int\tilde g_n(x,k_1,\dots,k_s):\!Q(\varphi_n(x))\!:dx$$
for
$$\tilde g_n(x,k_1,\dots,k_s) = v_n^{s/2}g(x)\prod_{i=1}^{s}1_{\{|k|\le\kappa\}}(k_i)\,\mu_n(x,k_i),$$
and $Q(\lambda) = P^{(s)}(\lambda)$. Since
$$\sup_{\gamma\in\Gamma_v,\,|k|\le1}\frac{\omega(\gamma)}{\omega(\gamma+k)}\le c_0,$$
we have
$$|\mu_n(x,k)|\le c_0\,\omega(k)^{-1/2}v_n^{-1/2},$$
which yields
$$|\tilde g_n(x,k_1,\dots,k_s)|\le c_0\,g(x)\prod_{i=1}^{s}\omega(k_i)^{-1/2}.$$
Applying then Prop. 7.7, we obtain that
$$\big\|(H_n+b)^{-1/2}V_n^I(H_n+b)^{-1/2}\big\|\le C\prod_{i=1}^{s}\omega(k_i)^{-1/2},\quad\text{uniformly in }n,$$
which implies the lemma. □

7.4. Number energy estimates. In this subsection we state a consequence of the higher order estimates, which will be used in what follows. We will denote by $H^{\rm ext}$ the Hamiltonian $H\otimes1+1\otimes H_0$, acting on the extended Hilbert space $\mathcal{H}^{\rm ext} = \Gamma(h)\otimes\Gamma(h)$. We will use the following notation: let an operator $B(t)$, depending on some parameter $t$, map $\cap_nD(N^n)\subset\mathcal{H}$ into itself. We will write $B(t)\in(N+1)^mO_N(t^p)$, for $m\in\mathbb{R}$, if
$$\big\|(N+1)^{-m-k}B(t)(N+1)^k\big\|\le C_kt^p,\quad k\in\mathbb{Z}. \tag{7.13}$$
If (7.13) holds for any $m\in\mathbb{R}$, then we will write $B(t)\in(N+1)^{-\infty}O_N(t^p)$. Likewise, for an operator $C(t)$ that maps $\cap_nD(N^n)\subset\mathcal{H}$ into $\cap_nD((N_0+N_\infty)^n)\subset\mathcal{H}^{\rm ext}$, we will write $C(t)\in(N+1)^m\check O_N(t^p)$ for $m\in\mathbb{R}$ if
$$\big\|(N_0+N_\infty)^{-m-k}C(t)(N+1)^k\big\|\le C_kt^p,\quad k\in\mathbb{Z}. \tag{7.14}$$
If (7.14) holds for any $m\in\mathbb{R}$, then we will write $C(t)\in(N+1)^{-\infty}\check O_N(t^p)$. The notation $(N+1)^mo_N(t^p)$, $(N+1)^m\check o_N(t^p)$ is defined similarly.

Proposition 7.10. The following properties hold:

i) uniformly, for $z$ in a compact set of $\mathbb{C}$, we have $(H-z)^{-1}\in(N+1)^{-1}O_N(|{\rm Im}\,z|^{-1})$, $m\in\mathbb{R}$;

ii) for $\chi\in C_0^\infty(\mathbb{R})$, we have $\|N^m\chi(H)N^p\|<\infty$, $m,p\in\mathbb{N}$.


Proof. ii) follows directly from (7.1). To prove i), it is enough to prove that, for $m\in\mathbb{N}$,
$$(N+1)^m(H-z)^{-1}(N+1)^{1-m}\in O(|{\rm Im}\,z|^{-1}),\quad m\in\mathbb{R}.$$
We use an induction on $m$. For $m=0$, i) follows from (7.1). Assume that i) holds for $m-1$. Then we write
$$\begin{aligned}N^m(H-z)^{-1}(N+1)^{1-m} &= N^m(H+b)^{-1}(N+1)^{1-m}\,(N+1)^{m-1}(H+b)(H-z)^{-1}(N+1)^{1-m}\\ &= N^m(H+b)^{-1}(N+1)^{1-m}\,(N+1)^{m-1}\big(1+(b+z)(H-z)^{-1}\big)(N+1)^{1-m}.\end{aligned}$$
So i) for $m$ follows from (7.1) and the induction hypothesis. □

7.5. Commutator estimates. In this subsection we estimate commutators between operators $\Gamma(q)$, $I(j)$ and the Hamiltonians $H$ and $H^{\rm ext}$. These estimates rely on the identities of Subsect. 3.12 and the higher order estimates. The following lemma is analogous to [DG1, Lemma 3.3].

Lemma 7.11. Let $q\in C_0^\infty(\mathbb{R}^d)$, $0\le q\le1$, $q=1$ near $0$. Set, for $R\ge1$, $q^R(x) = q(x/R)$. Then, for $\chi\in C_0^\infty(\mathbb{R})$,
$$[\Gamma(q^R),\chi(H)]\in\begin{cases}(N+1)^{-\infty}O_N(R^{-\inf(s,1)}),&\text{under hypothesis (Is)},\\(N+1)^{-\infty}o_N(R^0),&\text{under hypothesis (A)}.\end{cases}$$

Proof. Let us prove the lemma under hypothesis (Is), the proof under hypothesis (A) being similar. We have $[\Gamma(q^R),N]=0$, hence $\Gamma(q^R)$ preserves $D(N^n)$. By Lemma 3.4 ii),
$$[H_0,\Gamma(q^R)] = d\Gamma(q^R,[\omega,q^R]), \tag{7.15}$$
$[\omega,q^R]$ is bounded, and hence $[H_0,\Gamma(q^R)](H_0+1)^{-1}$ is bounded. Therefore, $\Gamma(q^R)$ preserves $D(H_0)$. Using that on $D(H_0)\cap D(N^n)$ we have $H=H_0+V$ and $\Gamma(q^R)$ preserves $D(H_0)\cap D(N^n)$, the following identity is valid as an operator identity on $D(H_0)\cap D(N^n)$:
$$[H,\Gamma(q^R)] = [H_0,\Gamma(q^R)]+[V,\Gamma(q^R)] =: T.$$
Using (7.15) and the fact that $[\omega,q^R]\in O(R^{-1})$, we get $[\Gamma(q^R),H_0]\in(N+1)O_N(R^{-1})$, and using Lemma 3.17 and Lemma 6.3, we have $[\Gamma(q^R),V]\in(N+1)^nO_N(R^{-s})$, which gives
$$T\in(N+1)^nO(R^{-\inf(s,1)}). \tag{7.16}$$


Now, let
$$R(z) := [\Gamma(q^R),(z-H)^{-1}] = -(z-H)^{-1}[\Gamma(q^R),H](z-H)^{-1}.$$
We want to show that
$$\big\|N^mR(z)(H+b)^{-n-m}\big\|\in|{\rm Im}\,z|^{-2}O(R^{-\inf(s,1)}),\quad m\ge0, \tag{7.17}$$
$$\big\|(H+b)^{-n-m}R(z)N^m\big\|\in|{\rm Im}\,z|^{-2}O(R^{-\inf(s,1)}),\quad m\ge0. \tag{7.18}$$
By the higher order estimates (7.1), $D(H^n)\subset D(H_0)\cap D(N^n)$, so the following operator identity holds on $D(H^{n-1})$:
$$R(z) = (z-H)^{-1}T(z-H)^{-1}.$$
Now,
$$\begin{aligned}\big\|N^mR(z)(H+b)^{-n-m}\big\| \le{}& \big\|N^m(z-H)^{-1}(N+1)^{-m+1}\big\|\times\big\|(N+1)^{m-1}T(N+1)^{-n-m+1}\big\|\\ &\times\big\|(N+1)^{n+m-1}(H+b)^{-n-m+1}\big\|\times\big\|(H+b)(z-H)^{-1}\big\|,\end{aligned}$$
which proves (7.17), and (7.18) follows then by taking the adjoint. If $\chi\in C_0^\infty(\mathbb{R})$, we denote by $\tilde\chi\in C_0^\infty(\mathbb{C})$ an almost analytic extension of $\chi$, satisfying
$$\tilde\chi|_{\mathbb{R}} = \chi,\qquad|\partial_{\bar z}\tilde\chi(z)|\le C_n|{\rm Im}\,z|^n,\quad n\in\mathbb{N}.$$
We use the following functional calculus formula (see [HS, DG2]) for $\chi\in C_0^\infty(\mathbb{R})$:
$$\chi(A) = \frac{i}{2\pi}\int_{\mathbb{C}}\partial_{\bar z}\tilde\chi(z)(z-A)^{-1}\,dz\wedge d\bar z. \tag{7.19}$$
Let now $\chi_1\in C_0^\infty(\mathbb{R})$ with $\chi_1\chi=\chi$ and $\tilde\chi_1$ an almost analytic extension of $\chi_1$. We write
$$\begin{aligned}N^m[\chi(H),\Gamma(q^R)]N^p ={}& N^m\chi_1(H)[\chi(H),\Gamma(q^R)]N^p + N^m[\chi_1(H),\Gamma(q^R)]\chi(H)N^p\\ ={}& \frac{i}{2\pi}\int_{\mathbb{C}}\partial_{\bar z}\tilde\chi(z)\,N^m\chi_1(H)R(z)N^p\,dz\wedge d\bar z\\ &+\frac{i}{2\pi}\int_{\mathbb{C}}\partial_{\bar z}\tilde\chi_1(z)\,N^mR(z)\chi(H)N^p\,dz\wedge d\bar z.\end{aligned}$$
Using Prop. 7.10 ii), (7.17) and (7.18), we see that $N^m[\chi(H),\Gamma(q^R)]N^p$ is $O(R^{-\inf(s,1)})$, as claimed. □

The following lemma is analogous to [DG1, Lemma 3.4]:


Lemma 7.12. Let $j_0\in C_0^\infty(\mathbb{R}^d)$, $j_\infty\in C^\infty(\mathbb{R}^d)$, $0\le j_0$, $0\le j_\infty$, $j_0^2+j_\infty^2\le1$, $j_0=1$ near $0$ (and hence $j_\infty=0$ near $0$). Set $j := (j_0,j_\infty)$ and for $R\ge1$ $j^R = (j_0^R,j_\infty^R)$. Then for $\chi\in C_0^\infty(\mathbb{R})$:
$$\chi(H^{\rm ext})I^*(j^R)-I^*(j^R)\chi(H)\in\begin{cases}(N+1)^{-\infty}\check O(R^{-\inf(s,1)}),&\text{under hypothesis (Is)},\\(N+1)^{-\infty}\check o(R^0),&\text{under hypothesis (A)}.\end{cases}$$

Proof. Again we will only prove the lemma under hypothesis (Is). We have, by Lemma 3.11 i),
$$H_0^{\rm ext}I^*(j^R)-I^*(j^R)H_0\in(N+1)\check O_N(R^{-1}). \tag{7.20}$$
This implies that $I^*(j^R)$ sends $D(H_0)$ into $D(H_0^{\rm ext})$, and since $I^*(j^R)N = (N_0+N_\infty)I^*(j^R)$, $I^*(j^R)$ sends also $D(N^n)$ into $D((N_0+N_\infty)^n)$. Next, by Lemma 3.17 and Lemma 6.3, we obtain
$$(V\otimes1)I^*(j^R)-I^*(j^R)V\in(N+1)^n\check O_N(R^{-s}). \tag{7.21}$$
This and (7.20) show that, as an operator identity on $D(H_0)\cap D(N^n)$, we have
$$H^{\rm ext}I^*(j^R)-I^*(j^R)H\in(N+1)^n\check O_N(R^{-\min(1,s)}). \tag{7.22}$$
Using then the higher order estimates (7.1) and the fact that $I^*(j^R)$ sends $D(H_0)$ into $D(H_0^{\rm ext})$ and $D(N^n)$ into $D((N_0+N_\infty)^n)$, we obtain the following operator identity on $D(H^n)$:
$$\begin{aligned}R(z) &= (z-H^{\rm ext})^{-1}I^*(j^R)-I^*(j^R)(z-H)^{-1}\\ &= (z-H^{\rm ext})^{-1}\big(I^*(j^R)H-H^{\rm ext}I^*(j^R)\big)(z-H)^{-1}.\end{aligned}$$
Using Prop. 7.10, we see that
$$\big\|(N_0+N_\infty)^mR(z)(H+b)^{-m-n}\big\|\in O(|{\rm Im}\,z|^{-2})R^{-\inf(s,1)}, \tag{7.23}$$
$$\big\|(H^{\rm ext}+b)^{-m-n}R(z)N^m\big\|\in O(|{\rm Im}\,z|^{-2})R^{-\inf(s,1)}. \tag{7.24}$$
Let us again pick $\chi_1\in C_0^\infty(\mathbb{R})$ with $\chi_1\chi=\chi$. We have
$$\begin{aligned}(N_0+N_\infty)^m&\big(\chi(H^{\rm ext})I^*(j^R)-I^*(j^R)\chi(H)\big)N^m\\ ={}&(N_0+N_\infty)^m\chi_1(H^{\rm ext})\big(\chi(H^{\rm ext})I^*(j^R)-I^*(j^R)\chi(H)\big)N^m\\ &+(N_0+N_\infty)^m\big(\chi_1(H^{\rm ext})I^*(j^R)-I^*(j^R)\chi_1(H)\big)\chi(H)N^m\\ ={}&\frac{i}{2\pi}\int_{\mathbb{C}}\partial_{\bar z}\tilde\chi(z)\,(N_0+N_\infty)^m\chi_1(H^{\rm ext})R(z)N^m\,dz\wedge d\bar z\\ &+\frac{i}{2\pi}\int_{\mathbb{C}}\partial_{\bar z}\tilde\chi_1(z)\,(N_0+N_\infty)^mR(z)\chi(H)N^m\,dz\wedge d\bar z.\end{aligned}$$
Using Prop. 7.10 ii), (7.23) and (7.24), the above operator is $O(R^{-\inf(s,1)})$, as claimed. □


8. A Conjugate Operator for $P(\varphi)_2$ Hamiltonians

8.1. Introduction. This section is devoted to the study of a conjugate operator for $P(\varphi)_2$ Hamiltonians. The central point of the construction of a conjugate operator $A$ for a Hamiltonian $H$ is the proof that the quadratic form $[H,iA]$, defined on $D(H)\cap D(A)$, extends to a bounded operator from $D(H)$ to $D(H)^*$ that is locally positive, i.e. such that
$$1_\Delta(H)[H,iA]1_\Delta(H)\ge c_0\,1_\Delta(H)+K,$$
for $c_0>0$, $\Delta\subset\mathbb{R}$ an open interval and $K$ a compact operator. However, the local positivity of the quadratic form $[H,iA]$ is not sufficient to apply the Mourre method. Additional conditions on $H$, $A$ are needed. It seems that the weakest property one can impose is the property that $H$ is of class $C^1(A)$, introduced in the book [ABG]. Let us recall the precise definition.

First let us define this property for a bounded operator $B$ on $\mathcal{H}$. Let $A$ be a self-adjoint operator on $\mathcal{H}$. We say that $B\in C^1(A)$, if the map $s\mapsto e^{isA}Be^{-isA}$ is $C^1$ for the strong topology. The condition that $B\in C^1(A)$ can be characterized in terms of the commutator $[B,A]$. Namely (see [ABG, Lemma 6.2.9]), $B\in C^1(A)$ if and only if the quadratic form on $D(A)$
$$Q(v,v) = (Av,Bv)-(B^*v,Av),\quad v\in D(A),$$
satisfies
$$|Q(v,v)|\le C\|v\|^2,\quad v\in D(A). \tag{8.1}$$

Let us note the following consequence of the $C^1(A)$ property, proven in [ABG]:

Proposition 8.1. Let $B$ be bounded and $B\in C^1(A)$. Then $B$ maps $D(A)$ into itself, so that the expression $[A,iB]$ makes sense as an operator on $D(A)$, and
$$\frac{d}{ds}e^{isA}Be^{-isA}\Big|_{s=0} = [A,iB].$$

For an arbitrary self-adjoint operator $H$, [ABG] propose a different definition (which is equivalent to the one above for bounded $H$): $H\in C^1(A)$, if, for some $z\in\mathbb{C}\setminus\sigma(H)$, the map $s\mapsto e^{isA}(z-H)^{-1}e^{-isA}$ is $C^1$ for the strong topology. Some of the consequences of the fact that $H\in C^1(A)$ (see [ABG, Thm. 6.2.10, Prop. 7.2.10]) are described in the next proposition.

Proposition 8.2. Let $A$, $H$ be self-adjoint and $H\in C^1(A)$. Then the following is true:

i) For $z\in\mathbb{C}\setminus\sigma(H)$, $(z-H)^{-1}$ maps $D(A)$ into itself, the space $(z-H)^{-1}D(A)$ does not depend on $z$ and is a dense subspace of $D(A)\cap D(H)$.

ii) $D(A)\cap D(H)$ is dense both in $D(A)$ and in $D(H)$.

iii) The quadratic form $[H,iA]$ on $D(A)\cap D(H)$ extends uniquely to a bounded operator $[H,iA]_0$ from $D(H)$ to $D(H)^*$.

iv) For any $\lambda\in\mathbb{R}$, the virial relation $1_{\{\lambda\}}(H)[H,iA]_01_{\{\lambda\}}(H) = 0$ holds.
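To see where the virial relation in iv) comes from, it helps to run the purely formal computation. The sketch below assumes, in addition to the hypotheses of the proposition, that the eigenvector $u$ lies in $D(A)$; this extra assumption is made here only for illustration and is exactly what the $C^1(A)$ property allows one to avoid.

```latex
% Formal computation: H u = \lambda u with \lambda real, u \in D(A) \cap D(H),
% A symmetric, inner product linear in the second argument.
\begin{align*}
(u,[H,iA]u) &= i\big((Hu,Au)-(Au,Hu)\big) \\
            &= i\lambda\big((u,Au)-(Au,u)\big) \\
            &= 0 ,
\end{align*}
% since (Au,u) = (u,Au) for symmetric A.
```

In general an eigenvector of $H$ need not belong to $D(A)$, which is why the statement is formulated with the extended operator $[H,iA]_0$.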


In our case, the application of the Mourre method to $P(\varphi)_2$ Hamiltonians runs into two related problems. The first problem is that (except for the $\varphi^4_2$ model) the domain of $H$ is not explicitly known. (This indicates that it is unlikely that the stronger set of conditions introduced in the original paper of Eric Mourre [Mo], which require, in particular, that $e^{isA}$ preserves $D(H)$, can be checked for $P(\varphi)_2$ Hamiltonians.) The second problem is that the actual computation of $[H,iA]$, needed to prove its positivity, cannot be done easily on $D(A)\cap D(H)$, since the identity $H=H_0+V$ needed to do this computation does not make sense on $D(H)$. For general $P(\varphi)_2$ Hamiltonians, these two problems will be addressed in Thms. 8.4 and 8.7 below. If $\deg P=4$, a simpler argument, using the Wick calculus instead of the Q-representation, can be used to show that $H\in C^1(A)$. This is done in Thm. 8.8.

8.2. Analysis of $[H,iA]$ part I. Let $a = \frac12(x\cdot D_x+D_x\cdot x) = -\frac12(k\cdot D_k+D_k\cdot k)$ be the generator of dilations on $L^2(\mathbb{R},dk)$. We denote by $A$ the second quantized generator of dilations $A := d\Gamma(a)$. We put
$$\begin{aligned}H_0^{(1)} &:= [H_0,iA],\ \text{as a quadratic form on }D(A)\cap D(H_0),\\ V^{(1)} &:= [V,iA],\ \text{as a quadratic form on }D(A)\cap D(N^n),\\ H^{(1)} &:= [H,iA],\ \text{as a quadratic form on }D(A)\cap D(H^n).\end{aligned} \tag{8.2}$$

Remark 8.3. It is important here to define the quadratic form $[H,iA]$ on $D(A)\cap D(H^n)$. In fact, from the higher order estimates (7.1), we know that $D(H^n)\subset D(H_0)\cap D(N^n)$, so that on $D(A)\cap D(H^n)$ we have $[H,iA] = [H_0,iA]+[V,iA]$.

By a direct computation we see that $H_0^{(1)}$ extends uniquely as a bounded operator from $D(H_0)$ to $\mathcal{H}$ (still denoted by $H_0^{(1)}$), equal to $d\Gamma(k\cdot\nabla\omega(k))$. Note that the operator $e^{isa}$ commutes with the conjugation $c$. Therefore, since $V$ is a multiplication operator on Q-space, so is $V^{(1)}$. In this subsection we will study the properties of $V^{(1)}$ that follow from the assumption (M1) and its expression as a Wick polynomial. It is convenient to introduce the notation
$$H_s = e^{isA}He^{-isA},\qquad H_{0,s} = e^{isA}H_0e^{-isA},\qquad V_s = e^{isA}Ve^{-isA}.$$
We have $H_{0,s} = d\Gamma(\omega_s)$, for $\omega_s(k) = \omega(e^sk)$, and, using (3.19), we see that $V_s$ is a Wick polynomial with the kernels $w_{p,s} = \Gamma(e^{isa})w_p$.

Theorem 8.4. Assume hypothesis (M1). Then

i) the form $V^{(1)}$ extends to a bounded operator from $D(N^n)$ to $\mathcal{H}$. It is a multiplication operator on Q-space by a function in $\bigcap_{p<\infty}L^p(Q,d\mu)$;

ii) the form $H^{(1)}$ extends uniquely as an operator, still denoted by $H^{(1)}$, bounded from $D(H^n)$ into $\mathcal{H}$;

iii) for all $z\in\mathbb{C}\setminus\sigma(H)$, for $r\ge2n$, $(z-H)^{-r}\in C^1(A)$, and hence the following identity holds as an identity between bounded operators from $D(A)$ to $\mathcal{H}$:
$$A(H-z)^{-r} = (H-z)^{-r}A+i\,\frac{d}{ds}(H_s-z)^{-r}\Big|_{s=0}, \tag{8.3}$$
where
$$\frac{d}{ds}(H_s-z)^{-r}\Big|_{s=0} = \sum_{j=0}^{r-1}(H-z)^{-r+j}\big(H_0^{(1)}+V^{(1)}\big)(H-z)^{-j-1} \tag{8.4}$$
is a bounded operator on $\mathcal{H}$.

iv) Assume in addition that $\deg P=4$ and that hypothesis (C) holds. Then $(z-H)^{-1}\in C^1(A)$ and the following identity holds as an identity between bounded operators from $D(A)$ to $\mathcal{H}$:
$$A(H-z)^{-1} = (H-z)^{-1}A+(H-z)^{-1}\big(H_0^{(1)}+V^{(1)}\big)(H-z)^{-1}.$$

In the next subsection we will need to approximate $V$ by $V_\kappa$. Let us set
$$V_\kappa^{(1)} := [V_\kappa,iA],\ \text{as a quadratic form on }D(A)\cap D(N^n).$$

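Proposition 8.5. Assume hypothesis (M1). Then

i) the form $V_\kappa^{(1)}$ extends to a bounded operator from $D(N^n)$ to $\mathcal{H}$. It is a multiplication operator on Q-space by a function in $\bigcap_{p<\infty}L^p(Q,d\mu)$;

ii) as bounded operators from $D(N^n)$ to $\mathcal{H}$, we have $V^{(1)} = \lim_{\kappa\to+\infty}V_\kappa^{(1)}$;

iii) for some $\epsilon>0$, $\|V^{(1)}-V_\kappa^{(1)}\|_{L^p(Q,d\mu)}\le C(p-1)^n\kappa^{-\epsilon}$.

Lemma 8.6. Assume hypothesis (M1). Then

i) uniformly in $\kappa$, $d\Gamma(a)w_{p,\kappa}\in L^2(\mathbb{R}^p)$, $1\le\kappa\le\infty$;

ii) there exists $\epsilon>0$ such that $\|d\Gamma(a)(w_{p,\kappa}-w_{p,\infty})\|_{L^2(\mathbb{R}^p)}\in O(\kappa^{-\epsilon})$.

Proof. We compute that
$$d\Gamma(a)w_{p,\infty} = (a\hat g)\Big(\sum_{i=1}^{p}k_i\Big)\prod_{i=1}^{p}\omega(k_i)^{-1/2}+\sum_{i=1}^{p}\hat g\Big(\sum_{i=1}^{p}k_i\Big)\big(a\,\omega(k_i)^{-1/2}\big)\prod_{j\ne i}\omega(k_j)^{-1/2},$$
and
$$d\Gamma(a)w_{p,\kappa} = (a\hat g)\Big(\sum_{i=1}^{p}k_i\Big)\prod_{i=1}^{p}\omega(k_i)^{-1/2}\hat\chi\Big(\frac{k_i}{\kappa}\Big)+\sum_{i=1}^{p}\hat g\Big(\sum_{i=1}^{p}k_i\Big)\,a\Big(\omega(k_i)^{-1/2}\hat\chi\Big(\frac{k_i}{\kappa}\Big)\Big)\prod_{j\ne i}\omega(k_j)^{-1/2}\hat\chi\Big(\frac{k_j}{\kappa}\Big).$$
Using (M1) and the bound (6.9), one sees that $d\Gamma(a)w_{p,\kappa}\in L^2(\mathbb{R}^p)$, $1\le\kappa\le\infty$. Then one checks that
$$\Big|a^\alpha\Big(1-\hat\chi\Big(\frac{k}{\kappa}\Big)\Big)\omega(k)^{-1/2}\Big|\le C|k|^{-1/2+\epsilon}\kappa^{-\epsilon},\quad\text{for some }\epsilon>0\text{ and }\alpha=0,1. \tag{8.5}$$
Using again (6.9), we deduce from (8.5) statement ii) in the lemma. □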

Proof of Theorem 8.4. Applying Prop. 3.13, we obtain the following identity between quadratic forms on $D(A)\cap D(N^n)$:
$$V_p^{(1)} := \Big[\int g(x):\!\varphi(x)^p\!:dx,\,iA\Big] = \sum_{r=0}^{p}\binom{p}{r}\int w_p^{(1)}(k_1,\dots,k_r,k_{r+1},\dots,k_p)\,a^*(k_1)\cdots a^*(k_r)\,a(-k_{r+1})\cdots a(-k_p)\,dk_1\cdots dk_p, \tag{8.6}$$
where
$$w_p^{(1)} = d\Gamma(a)w_p.$$
By Lemma 8.6 with $\kappa=\infty$, $w_p^{(1)}\in L^2(\mathbb{R}^p)$. Hence, the rhs of (8.6) defines a bounded operator from $D(N^n)$ to $\mathcal{H}$. Next we note that $V^{(1)}\Omega\in\oplus_{p=0}^{2n}\otimes_s^ph$, since $V^{(1)}$ is a Wick polynomial of degree $2n$. Hence, by Lemma 5.12, $V^{(1)}\Omega\in\bigcap_{p<\infty}L^p(Q,d\mu)$. Therefore, as a multiplication operator $V^{(1)}\in\bigcap_{p<\infty}L^p(Q,d\mu)$. This ends the proof of i).

Let us prove iii). For $r\in\mathbb{N}$, $z\in\mathbb{C}\setminus\sigma(H)$, the following identity makes sense (all terms are bounded operators):
$$(H_s-z)^{-r}-(H-z)^{-r} = \sum_{j=0}^{r-1}(H_s-z)^{-r+j}(H-H_s)(H-z)^{-j-1}. \tag{8.7}$$
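The identity (8.7) is a telescoping sum. As a short check, write $X = H_s - z$ and $Y = H - z$ (shorthand introduced here only for this computation), so that $H - H_s = Y - X$ and all negative powers are bounded operators:

```latex
\begin{align*}
\sum_{j=0}^{r-1} X^{-r+j}\,(Y-X)\,Y^{-j-1}
  &= \sum_{j=0}^{r-1}\Big( X^{-(r-j)}\,Y^{-j} \;-\; X^{-(r-j-1)}\,Y^{-(j+1)} \Big) \\
  &= X^{-r}\,Y^{0} \;-\; X^{0}\,Y^{-r} \\
  &= (H_s-z)^{-r} - (H-z)^{-r} .
\end{align*}
```

Each summand is the difference of two consecutive terms $T_j - T_{j+1}$ with $T_j = X^{-(r-j)}Y^{-j}$, so only $T_0$ and $T_r$ survive. No commutation between $X$ and $Y$ is used.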

We deduce easily from the explicit form of $H_{0,s}$ that
$$\big\|H_{0,s}(H_0+1)^{-1}\big\|\le C,\quad\text{uniformly for }|s|\le1. \tag{8.8}$$
Since $H_{0,s}$ and $H_0$ commute, this implies that $D(H_{0,s}^r) = D(H_0^r)$ for $r\in\mathbb{N}$. Since, on the other hand, $e^{isA}$ preserves $N$, we have
$$(N+1)^\alpha(H_0+1)^2\le C(H_s+b)^{2n+\alpha},\quad\alpha\ge0,\ |s|\le1, \tag{8.9}$$
i.e. Rosen's higher order estimates are uniformly valid for $H_s$, $|s|\le1$.


We will first show that for $r\ge2n$,
$$s^{-1}\big((H_s-z)^{-r}-(H-z)^{-r}\big)\ \text{is uniformly bounded for }|s|\le1. \tag{8.10}$$
To prove (8.10) it suffices to show that
$$\big\|\big((H_s-z)^{-r}-(H-z)^{-r}\big)u\big\|\le C|s|\,\|u\|,\quad u\in D(H^n),\ |s|\le1. \tag{8.11}$$
By the higher order estimates, $D(H^n)\subset D(H_0)\cap D(N^n)$ and hence $D(H^n)\subset D(H_0)\cap D(V)$. Hence we can write
$$\big((H_s-z)^{-r}-(H-z)^{-r}\big)(H+b)^{-n} = \sum_{j=0}^{r-1}(H_s-z)^{-r+j}\big(H_0-H_{0,s}+V-V_s\big)(H-z)^{-j-1}(H+b)^{-n}. \tag{8.12}$$
We note that $(is)^{-1}(H_{0,s}-H_0) = d\Gamma\big((is)^{-1}(\omega_s-\omega)\big)$ and that
$$\operatorname*{s-lim}_{s\to0}\,(is)^{-1}(H_{0,s}-H_0)(H_0+1)^{-1} = H_0^{(1)}(H_0+1)^{-1},\qquad\big\|(H_{0,s}-H_0)(H_0+1)^{-1}\big\|\le C|s|. \tag{8.13}$$
The same result holds for $(is)^{-1}(H_0+1)^{-1}(H_{0,s}-H_0)$. For $j+1\ge n$, we write
$$(is)^{-1}(H_s-z)^{-r+j}(H_{0,s}-H_0)(H-z)^{-j-1} = (is)^{-1}(H_s-z)^{-r+j}(H_{0,s}-H_0)(H_0+1)^{-1}(H_0+1)(H-z)^{-j-1},$$
and, for $r-j\ge n$, we write
$$(is)^{-1}(H_s-z)^{-r+j}(H_{0,s}-H_0)(H-z)^{-j-1} = (is)^{-1}(H_s-z)^{-r+j}(H_0+1)(H_0+1)^{-1}(H_{0,s}-H_0)(H-z)^{-j-1}.$$
Since $r\ge2n$, if $0\le j\le r-1$, we have either $j+1\ge n$ or $r-j\ge n$. Using (8.9), we obtain that
$$\big\|(is)^{-1}(H_s-z)^{-r+j}(H_{0,s}-H_0)(H-z)^{-j-1}\big\|\le C. \tag{8.14}$$
Next, it follows from Lemma 8.6 i), for $\kappa=\infty$, that the map
$$s\mapsto w_{p,s} = \hat g\Big(e^s\sum_{i=1}^{p}k_i\Big)\prod_{i=1}^{p}\omega(e^sk_i)^{-1/2}$$
is $C^1(\mathbb{R},L^2(\mathbb{R}^p))$ with derivative $d\Gamma(a)w_p$. This implies that
$$(is)^{-1}(N+1)^{-r_1}(V_s-V)(N+1)^{-r_2}\to i(N+1)^{-r_1}V^{(1)}(N+1)^{-r_2} \tag{8.15}$$
in operator norm, when $s\to0$, for $r_1+r_2\ge n$.


We write
$$(is)^{-1}(H_s-z)^{-r+j}(V_s-V)(H-z)^{-j-1} = (is)^{-1}(H_s-z)^{-r+j}(N+1)^{r-j}\,(N+1)^{j-r}(V_s-V)(N+1)^{-j-1}\,(N+1)^{j+1}(H-z)^{-j-1}.$$
Using (8.9), we obtain that
$$\big\|(is)^{-1}(H_s-z)^{-r+j}(V_s-V)(H-z)^{-j-1}\big\|\le C. \tag{8.16}$$
Combining (8.16), (8.14) and (8.12), we obtain (8.11), and hence (8.10). By (8.10), to prove that $(H-z)^{-r}\in C^1(A)$, it suffices to show the convergence of $(is)^{-1}\big((H_s-z)^{-r}-(H-z)^{-r}\big)$ on a dense subspace of $\mathcal{H}$. But by (8.13) and (8.15), this convergence holds on $D(H^n)$, and we have
$$\frac{d}{ds}(H_s-z)^{-r}\Big|_{s=0} = \sum_{j=0}^{r-1}(H-z)^{-r+j}\big(H_0^{(1)}+V^{(1)}\big)(H-z)^{-j-1}.$$
This completes the proof of iii).

Let us now prove iv). We assume hence that $\deg P=4$ and hypothesis (C) holds. By Thm. 7.2 and (8.8), the Glimm-Jaffe estimate holds uniformly in $|s|\le1$, that means,
$$H_0^2\le C(H_s+b)^2,\qquad N^2\le C(H_s+b)^2. \tag{8.17}$$
Another consequence of the fact that $\deg P=4$ is that
$$(is)^{-1}(N+1)^{-1}(V_s-V)(N+1)^{-1}\to i(N+1)^{-1}V^{(1)}(N+1)^{-1}, \tag{8.18}$$
since we may take $n=2$ in (8.15). Next if we use (8.13), (8.18) as before, and (8.17) instead of (8.9), we see that the proof of iii) extends to the case $r=1$.

Finally let us prove ii). By Remark 8.3, the quadratic form $[H,iA]$ on $D(A)\cap D(H^n)$ is equal to $[H_0,iA]+[V,iA]$. We have seen that $[H_0,iA]$ extends as an operator $H_0^{(1)}$ such that $H_0^{(1)}(H_0+1)^{-1}$ is bounded, which by the higher order estimates implies that $H_0^{(1)}(H+b)^{-n}$ is bounded. In i) we have seen that $[V,iA]$ extends as an operator $V^{(1)}$ such that $V^{(1)}(N+1)^{-n}$ is bounded, which again by the higher order estimates implies that $V^{(1)}(H+b)^{-n}$ is bounded. It remains to check that the extension of $[H,iA]$ to an operator $H^{(1)}$ with domain $D(H^n)$ is unique, i.e. that $D(A)\cap D(H^n)$ is dense in $D(H^n)$. For $u = (H+b)^{-n}v\in D(H^n)$, we set $u_\epsilon = (H+b)^{-n}(1+i\epsilon A)^{-1}v$. Clearly, $u_\epsilon\in D(H^n)$ and $u_\epsilon$ tends to $u$ in $D(H^n)$ when $\epsilon\to0$. It follows then from iii) that $u_\epsilon\in D(A)$, which completes the proof of ii). □

Proof of Proposition 8.5. i) follows from Lemma 8.6. To prove ii), we note that
$$\big\|(V_p^{(1)}-V_{p,\kappa}^{(1)})(N+1)^{-n}\big\|\le C\|w_p-w_{p,\kappa}\|_{L^2(\mathbb{R}^p)}\le C\kappa^{-\epsilon}.$$
To show iii), we note that
$$\big\|V_p^{(1)}-V_{p,\kappa}^{(1)}\big\|_{L^2(Q,d\mu)} = \|w_p-w_{p,\kappa}\|_{L^2(\mathbb{R}^p)}\le C\kappa^{-\epsilon}.$$
Then we use Lemma 5.12. □


8.3. Analysis of $[H,iA]$ part II. In this subsection we continue our study of $[H,iA]$. The main new ingredient is the use of hypothesis (B1), which will allow us to dominate $|V^{(1)}|$, as a function on Q-space, by $H$, using hypercontractivity arguments.

Theorem 8.7. Assume hypothesis (B1).

i) There exist $c_0,b>0$ such that
$$|(u,H^{(1)}u)|\le c_0(u,(H+b)u),\quad u\in D(H^n),$$
and hence $H^{(1)}$ extends uniquely as an operator bounded from $D(H^{1/2})$ to $D(H^{1/2})^*$.

ii) $H\in C^1(A)$.

iii) The operator $[H,iA]_0$ from $D(H)$ into $D(H)^*$, associated to $[H,iA]$ by Proposition 8.2, coincides with $H^{(1)}$ and hence is bounded from $D(H^{1/2})$ to $D(H^{1/2})^*$.

iv) The virial relation holds:
$$1_{\{\lambda\}}(H)[H,iA]_01_{\{\lambda\}}(H) = 0,\quad\lambda\in\mathbb{R}.$$

Theorem 8.7 is the main result of this section. Property ii) allows in particular to justify the virial relation iv). Property iii) allows one to actually compute $[H,iA]_0$, which will be important to prove its positivity in Subsect. 9.2. In the $\varphi^4_2$ case, a similar result holds under weaker hypotheses on $g$.

Theorem 8.8. Assume $\deg P=4$ and hypotheses (M1), (C).

i) There exist $c_0,b>0$ such that
$$|(Hu,Au)-(Au,Hu)|\le c_0\|(H+b)u\|^2,\quad u\in D(A)\cap D(H),$$
and hence $H^{(1)}$ extends uniquely as an operator bounded from $D(H)$ to $D(H)^*$.

ii) $H\in C^1(A)$.

iii) The operator $[H,iA]_0$ from $D(H)$ into $D(H)^*$ associated to $[H,iA]$ coincides with $H^{(1)}$.

iv) The virial relation holds:
$$1_{\{\lambda\}}(H)[H,iA]_01_{\{\lambda\}}(H) = 0,\quad\lambda\in\mathbb{R}.$$

It is convenient to make a specific choice of the cutoff $\chi$ used to define the cutoff interactions $V_\kappa$. Namely, we will fix for this section
$$\chi(x) = \tfrac12e^{-|x|},\qquad\hat\chi(k) = (1+k^2)^{-1}.$$
We denote by $u\star v$ the convolution
$$u\star v(x) = \int u(y)v(x-y)\,dy,$$
so that $\mathcal{F}(u\star v) = \mathcal{F}(u)\mathcal{F}(v)$. Recall that the function $f_\kappa$ was defined in (6.4).

Lemma 8.9.
$$\tau_x(iaf_\kappa) = 2\tau_xf_\kappa-\alpha_\kappa\star\tau_xf_\kappa,$$
where
$$\alpha_\kappa(x) = \kappa e^{-\kappa|x|}+\frac{m}{4}e^{-m|x|}. \tag{8.19}$$


Proof. After conjugation by the Fourier transformation, we are reduced to check that
$$\big(-k\partial_k-\tfrac12\big)\hat f_\kappa = (2\delta_0-\hat\alpha_\kappa)\hat f_\kappa.$$
This is a direct computation, using (8.19). □
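The direct computation in the proof of Lemma 8.9 can be spelled out as follows. The sketch assumes that $\hat f_\kappa(k) = \omega(k)^{-1/2}\hat\chi(k/\kappa)$ with $\omega(k) = (k^2+m^2)^{1/2}$ and the Fourier convention $\mathcal{F}u(k) = \int e^{-ikx}u(x)\,dx$; the definition (6.4) is not reproduced in this section, so this form of $f_\kappa$ is an assumption, chosen to be consistent with the kernels $w_{p,\kappa}$ of Lemma 8.6.

```latex
\begin{align*}
-k\partial_k\,\omega(k)^{-1/2}
  &= \tfrac12\,\frac{k^2}{k^2+m^2}\,\omega(k)^{-1/2}
   = \tfrac12\Big(1-\frac{m^2}{k^2+m^2}\Big)\omega(k)^{-1/2}, \\
-k\partial_k\,\hat\chi(k/\kappa)
  &= \frac{2k^2/\kappa^2}{(1+k^2/\kappa^2)^2}
   = 2\Big(1-\frac{\kappa^2}{\kappa^2+k^2}\Big)\hat\chi(k/\kappa), \\
\big(-k\partial_k-\tfrac12\big)\hat f_\kappa(k)
  &= \Big(2-\frac{2\kappa^2}{\kappa^2+k^2}-\frac{m^2}{2(k^2+m^2)}\Big)\hat f_\kappa(k).
\end{align*}
% With \int e^{-a|x|} e^{-ikx} dx = 2a/(a^2+k^2):
%   F(\kappa e^{-\kappa|x|})(k) = 2\kappa^2/(\kappa^2+k^2),
%   F((m/4) e^{-m|x|})(k)      = m^2/(2(k^2+m^2)),
% so the bracket equals 2 - \hat\alpha_\kappa(k) with \alpha_\kappa as in (8.19).
```

Under these assumptions the multiplier on the right is $2 - \hat\alpha_\kappa(k)$, which is the Fourier transform of convolution with $2\delta_0 - \alpha_\kappa$.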

Lemma 8.10. Assume hypothesis (B0). Then there exists $C_0$ such that, for $\kappa\ge m$,
$$|\alpha_\kappa\star g(x)|\le C_0\,g(x). \tag{8.20}$$
An analogous estimate is true if we replace $\alpha_\kappa(x)$ with $x\partial_x\alpha_\kappa(x)$ or $\alpha_\kappa\star\alpha_\kappa(x)$.

Proof. It is sufficient to show the estimate replacing $\alpha_\kappa$ with a function $\kappa\psi(\kappa x)$, for $\psi\in S(\mathbb{R})$. Now, by (B0),

" % $ g(x − x " )ψ(κx " )κdx " ≤ Cg(x) ψ( xκ )x " κdx " . Lemma 8.11. We have

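$$\Big[A,\,i\int g(x)\varphi_\kappa(x)^p\,dx\Big] = \int\big((2p+\partial_xx)g\big)(x)\,\varphi_\kappa(x)^p\,dx-p\iint g(x)\alpha_\kappa(x'-x)\varphi_\kappa(x')\big(\varphi_\kappa(x)\big)^{p-1}dx\,dx'. \tag{8.21}$$

Proof. Using
$$\varphi_\kappa(x) = \phi(\tau_xf_\kappa), \tag{8.22}$$
we obtain, as a quadratic form on $D(A)\cap D(N^{1/2})$,
$$[A,i\varphi_\kappa(x)] = \phi(ia\tau_xf_\kappa).$$
Now,
$$\begin{aligned}ia\tau_xf_\kappa(y) &= y\partial_yf_\kappa(y-x)+\tfrac12f_\kappa(y-x)\\ &= (y-x)\partial_yf_\kappa(y-x)+\tfrac12f_\kappa(y-x)-x\partial_xf_\kappa(y-x)\\ &= (2-x\partial_x)\tau_xf_\kappa(y)-\int\alpha_\kappa(x'-x)\tau_{x'}f_\kappa(y)\,dx',\end{aligned}$$
by Lemma 8.9. This gives
$$[A,i\varphi_\kappa(x)] = (2-x\partial_x)\varphi_\kappa(x)-\int\alpha_\kappa(x'-x)\varphi_\kappa(x')\,dx'. \tag{8.23}$$
Since $ia = x\partial_x+\tfrac12$ preserves $L^2_{\mathbb{R}}(\mathbb{R},dx)$, we have $[\phi(ia\tau_xf),\phi(\tau_xf)] = 0$, and hence
$$\begin{aligned}[A,i\varphi_\kappa(x)^p] &= p\varphi_\kappa(x)^{p-1}[A,i\varphi_\kappa(x)]\\ &= 2p\varphi_\kappa(x)^p-x\partial_x\varphi_\kappa(x)^p-p\int\alpha_\kappa(x'-x)\varphi_\kappa(x')\varphi_\kappa^{p-1}(x)\,dx',\end{aligned} \tag{8.24}$$
as a quadratic form on $D(A)\cap D(N^{p/2})$. The lemma follows then using (8.24) and integration by parts. □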


As a consequence of Lemmas 8.9, 8.11, we have the following inequality, which should be understood as an inequality between functions on Q-space:

Lemma 8.12. Assume hypothesis (B1). Then, for $p\in\mathbb{N}$,
$$\Big|\Big[A,\,i\int g(x)\varphi_\kappa(x)^p\,dx\Big]\Big|\le C_p\int g(x)|\varphi_\kappa(x)|^p\,dx,\quad\text{uniformly for }\kappa\ge m. \tag{8.25}$$

Proof. Let us denote by $I_1$, $I_2$ the terms in the r.h.s. of (8.21). We will estimate separately $I_1$ and $I_2$.

Estimate of $I_1$. Since by (B1) $|x\partial_xg(x)|\le Cg(x)$, we see that
$$|I_1|\le C\int g(x)|\varphi_\kappa(x)|^p\,dx. \tag{8.26}$$

Estimate of $I_2$. We have
$$\begin{aligned}|I_2| &= p\,\Big|\iint g(x)\alpha_\kappa(x'-x)\varphi_\kappa(x')\varphi_\kappa(x)^{p-1}\,dx\,dx'\Big|\\ &\le C\iint g(x)|\alpha_\kappa(x'-x)||\varphi_\kappa(x')|^p\,dx\,dx'+C\iint g(x)|\alpha_\kappa(x'-x)||\varphi_\kappa(x)|^p\,dx\,dx',\end{aligned}$$
using the fact that $ab^{p-1}\le C_p(a^p+b^p)$. This yields
$$|I_2|\le C_0c_\kappa\int g(x)|\varphi_\kappa(x)|^p\,dx+C_0\int g_\kappa(x)|\varphi_\kappa(x)|^p\,dx,$$
for
$$c_\kappa = \int\alpha_\kappa(x')\,dx',\qquad g_\kappa = g\star\alpha_\kappa.$$
We have used here the fact that $g$ and $\alpha_\kappa$ are positive functions. Clearly,
$$|c_\kappa|\le C,\quad\text{uniformly in }\kappa, \tag{8.27}$$
and from Lemma 8.10 we get that
$$|g_\kappa(x)|\le C_0\,g(x),\quad\text{uniformly in }\kappa\ge m. \tag{8.28}$$
From (8.27) and (8.28), we obtain
$$|I_2|\le C_0\int g(x)|\varphi_\kappa(x)|^p\,dx,\quad\text{uniformly in }\kappa\ge m. \tag{8.29}$$
□
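The elementary inequality $ab^{p-1}\le C_p(a^p+b^p)$ used in the estimate of $I_2$ is a form of Young's inequality. A short worked version, for $a,b\ge0$ and $p\ge1$, which shows that one may in fact take $C_p=1$:

```latex
% Weighted arithmetic-geometric mean inequality with weights 1/p and (p-1)/p:
\begin{align*}
a\,b^{p-1} = (a^p)^{1/p}\,(b^p)^{(p-1)/p}
  \;\le\; \frac1p\,a^p + \frac{p-1}{p}\,b^p
  \;\le\; a^p + b^p .
\end{align*}
```

In the proof it is applied pointwise with $a = |\varphi_\kappa(x')|$ and $b = |\varphi_\kappa(x)|$.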

Proposition 8.13. Assume hypothesis (B1). Then there exists $c>0$ such that, for $t>0$,
$$e^{-t(cV-|V^{(1)}|)}\in L^1(Q,d\mu).$$

Proof. Set
$$W := cV-|V^{(1)}|.$$
To check that $e^{-tW}\in L^1(Q,d\mu)$, we use Lemma 5.9. The first bound of (5.4) follows from Prop. 8.5 iii). Let us now check the second bound. Using Wick identities (5.2) and Lemma 8.12, we obtain
$$|V_\kappa^{(1)}|\le C_0\sum_{p=0}^{2n}|a_p|\sum_{r=0}^{[p/2]}\frac{p!\,2^{-r}}{(p-2r)!\,r!}\|\varphi_\kappa(x)\Omega\|^{2r}\int g(x)|\varphi_\kappa(x)|^{p-2r}\,dx,$$
so that
$$cV_\kappa-|V_\kappa^{(1)}|\ge\int g(x)F_\kappa(\varphi_\kappa(x))\,dx,$$
where $F_\kappa(\lambda)$ is a function as in (6.15), with $c_{2n,0} = ca_{2n}-c_0|a_{2n}|>0$ for $c>c_0$. Using (6.16), we obtain
$$cV_\kappa-|V_\kappa^{(1)}|\ge-c_2\|g\|_{L^1(\mathbb{R})}\big(\|\varphi_\kappa(x)\Omega\|^{2n}+1\big), \tag{8.30}$$
which since $\|\varphi_\kappa(x)\Omega\| = O((\ln\kappa)^{1/2})$ completes the proof of the second bound in (5.4). Applying now Lemma 5.9, we get that there exists $c$ large enough such that $e^{-tW}\in L^1(Q,d\mu)$ for all $t>0$. This completes the proof of the proposition. □

Proof of Thm. 8.7. For large enough $C$ we have $C\omega-\omega^{(1)}\ge C_0>0$. Therefore, we can apply Thm. 5.11 to $a = C\omega-\omega^{(1)}$ and $W = CV-|V^{(1)}|$ and show that $CH_0-H_0^{(1)}+CV-|V^{(1)}|$ is bounded from below on $D(CH_0-H_0^{(1)})\cap D(N^n)$. But $D(CH_0-H_0^{(1)}) = D(H_0)$ and $D(H_0)\cap D(N^n)$ contains $D(H^n)$ by the higher order estimates, and hence is dense in $D(H)$. Therefore, the inequality
$$H_0^{(1)}+V^{(1)}\le C(H_0+V+b)\quad\text{on }D(H^n)$$
extends as the inequality
$$H^{(1)}\le C(H+b)\quad\text{on }D(H).$$
Likewise, we prove
$$-H^{(1)}\le C(H+b)\quad\text{on }D(H).$$
This proves i).

Next, let us show ii). To prove that $H\in C^1(A)$, we check condition (8.1). Let us first prove that $(H+b)^{-1}$ preserves $D(A)$. By Thm. 8.4, we have the following identity on $D(A)$, for $s\ge0$:
$$A(H+b+s)^{-2n} = (H+b+s)^{-2n}A+i\sum_{j=0}^{2n-1}(H+b+s)^{-2n+j}H^{(1)}(H+b+s)^{-j-1}.$$
By i), $(H+b)^{-1/2}H^{(1)}(H+b)^{-1/2}$ is bounded. Using then the bound
$$\big\|(H+b+s)^{-1}(H+b)^{1/2}\big\|\le c(b+s)^{-1/2},$$


we obtain that $(H+b+s)^{-2n}$ has a norm $O(s^{-2n})$ in $B(D(A))$. We use then the formula
$$(H+b)^{-1} = c_n\int_0^{+\infty}s^{2n-2}(H+b+s)^{-2n}\,ds.$$
The integrand has a norm $O(s^{2n-2}s^{-2n})$ in $B(D(A))$, hence the integral converges in norm. This implies that $(H+b)^{-1}$ is a bounded operator on $D(A)$. Since $(H+b)^{-1}$ preserves $D(A)$, we can write, for $v\in D(A)$,
$$Q(v,v) = (v,A(H+b)^{-1}v)-(A(H+b)^{-1}v,v) = (Hu,Au)-(Au,Hu),$$
for $u = (H+b)^{-1}v$. Since $(H+b)^{-1}D(A)\subset D(A)\cap D(H)$, (8.1) is implied by
$$|(Hu,Au)-(Au,Hu)|\le C\big(\|Hu\|^2+\|u\|^2\big),\quad u\in D(A)\cap D(H). \tag{8.31}$$
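The integral representation of $(H+b)^{-1}$ used above can be checked on scalars, which by the functional calculus is enough when $H+b$ is strictly positive. For $\lambda>0$ and an integer $n\ge1$, substituting $s=\lambda t$ and then $u = t/(1+t)$:

```latex
\begin{align*}
\int_0^{+\infty} s^{2n-2}(\lambda+s)^{-2n}\,ds
  &= \lambda^{-1}\int_0^{+\infty} t^{2n-2}(1+t)^{-2n}\,dt \\
  &= \lambda^{-1}\int_0^{1} u^{2n-2}\,du
   = \frac{\lambda^{-1}}{2n-1}.
\end{align*}
```

So the formula holds with $c_n = 2n-1$; the value of the constant is not stated in the text and is derived here only as a consistency check.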

We know, by i), that (8.31) holds for $u\in D(A)\cap D(H^n)$. So to prove (8.31) it suffices to show that $D(A)\cap D(H^n)$ is dense in $D(A)\cap D(H)$ for the intersection topology. Hence, let $u\in D(A)\cap D(H)$ and $u_\epsilon = (1+i\epsilon H)^{-2n}u$. Clearly, $u_\epsilon\in D(A)\cap D(H^n)$, $u_\epsilon\to u$, $Hu_\epsilon\to Hu$. Now, from (8.3), we get
$$\begin{aligned}A(1+i\epsilon H)^{-2n}u &= (1+i\epsilon H)^{-2n}Au-i\epsilon\sum_{j=0}^{2n-1}(1+i\epsilon H)^{-2n+j}H^{(1)}(1+i\epsilon H)^{-j-1}u\\ &=: (1+i\epsilon H)^{-2n}Au-R_\epsilon u.\end{aligned}$$
We claim that
$$\operatorname*{s-lim}_{\epsilon\to0}R_\epsilon = 0, \tag{8.32}$$
which will imply that $Au_\epsilon\to Au$ when $\epsilon\to0$. In fact, we write
$$(1+i\epsilon H)^{-2n+j}H^{(1)}(1+i\epsilon H)^{-j-1} = (1+i\epsilon H)^{-2n+j}(H+b)^{1/2}\,(H+b)^{-1/2}H^{(1)}(H+b)^{-1/2}\,(H+b)^{1/2}(1+i\epsilon H)^{-j-1}.$$
Using i) and the bound $\|(1+i\epsilon H)^{-j}(H+b)^{1/2}\|\in O(\epsilon^{-1/2})$, for $j\ge1$, we get that $R_\epsilon\in O(1)$. So it suffices to prove (8.32) on a dense subset of $\mathcal{H}$. For $u\in D(H^n)$, we have
$$\epsilon(1+i\epsilon H)^{-2n+j}H^{(1)}(1+i\epsilon H)^{-j-1}u = \epsilon(1+i\epsilon H)^{-2n+j}H^{(1)}(H+b)^{-n}(1+i\epsilon H)^{-j-1}(H+b)^nu,$$
which, by Thm. 8.4 ii), shows that $R_\epsilon u\to0$ when $\epsilon\to0$. This proves that $H$ is of class $C^1(A)$.

To prove iii), we note that both $[H,iA]_0$ and $H^{(1)}$ are extensions of $[H,iA]$ on $D(A)\cap D(H)$ and $D(A)\cap D(H^n)$ respectively. Since $D(A)\cap D(H^n)$ is dense in $D(A)\cap D(H)$, these two extensions coincide.


Finally, the fact that the virial theorem is true follows also from the $C^1(A)$ property (see [ABG, Prop. 7.2.10]). □

Proof of Theorem 8.8. We will prove that

$$(Hu, Au) - (Au, Hu) = i^{-1}\big((u, H_0^{(1)}u) + (u, V^{(1)}u)\big), \quad u \in D(H) \cap D(A). \tag{8.33}$$

We have seen in Subsect. 8.2 that

$$|(u, H_0^{(1)}u)| \le C\|H_0^{1/2}u\|^2, \quad u \in D(H_0),$$

and

$$|(u, V^{(1)}u)| \le \|(N+1)u\|^2, \quad u \in D(N),$$

since $\deg P = 4$. Hence, i) follows from (8.33) and the higher order estimates. Thus we have shown (8.31), which, as we have seen in the proof of Thm. 8.7, implies that $H \in C^1(A)$, i.e. that ii) holds. Finally iv) follows from ii) and iii) follows from (8.33). So it suffices to prove (8.33). Since, by Thm. 7.2, $D(H) = D(H_0) \cap D(V)$, we have

$$(Hu, Au) - (Au, Hu) = (H_0u, Au) - (Au, H_0u) + (Vu, Au) - (Au, Vu), \quad u \in D(H) \cap D(A).$$

By definition,

$$(H_0u, Au) - (Au, H_0u) = i^{-1}(u, H_0^{(1)}u),$$

so it remains to justify the identity

$$(Vu, Au) - (Au, Vu) = i^{-1}(u, V^{(1)}u), \quad u \in D(H) \cap D(A). \tag{8.34}$$

Note that, while (8.34) holds for example on $D(N^2) \cap D(A)$, it is not obvious that it extends to $u \in D(H) \cap D(A)$. In fact, the expression of $V$ as a Wick polynomial of order 4 needed to prove (8.34) is meaningful on $D(N^2)$, but not on $D(V)$. To justify (8.34) we use an approximation argument similar to the one used in the proof of Thm. 8.3. So let $u \in D(H) \cap D(A)$, and $u_\epsilon = (1 + i\epsilon N)^{-1}u$. Since by the higher order estimates $D(H) \subset D(N)$, $u_\epsilon \in D(N^2) \cap D(A)$, and hence (8.34) holds for $u_\epsilon$. So it remains to show that $Au_\epsilon \to Au$, $Vu_\epsilon \to Vu$, when $\epsilon \to 0$. The first convergence is obvious. To prove the second one it suffices to show that

$$V(1+i\epsilon N)^{-1}u - (1+i\epsilon N)^{-1}Vu \to 0, \quad \text{when } \epsilon \to 0. \tag{8.35}$$

Note that, since $u \in D(N)$, we can write $V$ as a Wick polynomial to prove (8.35). If

$$W = \int w(k_1,\dots,k_4)\,a^*(k_1)\cdots a^*(k_r)\,a(-k_{r+1})\cdots a(-k_4)\,dk_1\cdots dk_4,$$

for $w \in L^2(\mathbb{R}^4)$, then

$$W(1+i\epsilon N)^{-1} = (1 + i\epsilon N + i\epsilon(4-2r))^{-1}\int w(k_1,\dots,k_4)\,a^*(k_1)\cdots a^*(k_r)\,a(-k_{r+1})\cdots a(-k_4)\,dk_1\cdots dk_4.$$

Using the first resolvent formula and the bound $\|N(1+i\epsilon N)^{-1}\| \le \epsilon^{-1}$, we obtain that

$$\begin{aligned}
\|(W(1+i\epsilon N)^{-1} - (1+i\epsilon N)^{-1}W)u\| &\le C\|(N+1)u\|,\\
\|(W(1+i\epsilon N)^{-1} - (1+i\epsilon N)^{-1}W)u\| &\le C\epsilon\|(N+1)^2u\|.
\end{aligned} \tag{8.36}$$

By the first inequality in (8.36), it suffices to prove (8.35) for $u$ in a dense subspace of $D(N)$. By the second inequality in (8.36), (8.35) holds for $u \in D(N^2)$. □
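The use of the first resolvent formula in (8.36) can be made explicit as follows. This is a sketch in our own notation: we write $\epsilon$ for the regularization parameter, $c = 4 - 2r$ for the shift in the number operator produced by commuting $W$ through a function of $N$, and we assume the standard bound $\|W(N+1)^{-2}\| < \infty$ for a Wick monomial of order 4 with an $L^2$ kernel.

```latex
% First resolvent formula for the two regularized resolvents of N:
(1+i\epsilon N+i\epsilon c)^{-1}-(1+i\epsilon N)^{-1}
  = -\,i\epsilon c\,(1+i\epsilon N+i\epsilon c)^{-1}(1+i\epsilon N)^{-1}.
% Both resolvents have norm at most 1, so for u in D(N^2):
\big\|\big(W(1+i\epsilon N)^{-1}-(1+i\epsilon N)^{-1}W\big)u\big\|
  = \epsilon\,|c|\,
    \big\|(1+i\epsilon N+i\epsilon c)^{-1}(1+i\epsilon N)^{-1}Wu\big\|
  \le \epsilon\,|c|\,\|Wu\|
  \le C\,\epsilon\,\|(N+1)^2u\| .
```

The explicit factor $\epsilon$ is what gives the convergence (8.35) on $D(N^2)$; the first bound in (8.36) trades this factor against one power of $N$ via $\|\epsilon N(1+i\epsilon N)^{-1}\| \le 1$.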

8.4. Analysis of $[[H, iA], iA]$. The aim of this subsection is to show that, under hypothesis (B2), $H \in C^2(A)$. The structure of the argument is parallel to the arguments used in Subsects. 8.2 and 8.3. We put

$$\begin{aligned}
H_0^{(2)} &:= [H_0^{(1)}, iA], && \text{as a quadratic form on } D(A) \cap D(H_0),\\
V^{(2)} &:= [V^{(1)}, iA], && \text{as a quadratic form on } D(A) \cap D(N^n),\\
H^{(2)} &:= [H^{(1)}, iA], && \text{as a quadratic form on } D(A) \cap D(H^n).
\end{aligned} \tag{8.37}$$

By a direct computation we see that $H_0^{(2)}$ extends uniquely as a bounded operator from $D(H_0)$ to $\mathcal{H}$ (still denoted by $H_0^{(2)}$), equal to $d\Gamma((k\cdot\nabla)^2\omega(k))$.

Proposition 8.14. Assume hypothesis (M2). Then
i) the form $V^{(2)}$ extends to a bounded operator from $D(N^n)$ to $\mathcal{H}$. It is a multiplication operator on Q-space by a function in $\bigcap_{p<\infty} L^p(Q, d\mu)$.
ii) the form $H^{(2)}$ extends uniquely to a bounded operator from $D(H^n)$ to $\mathcal{H}$.

Proof. The proof is analogous to the proof of Thm. 8.4 i) and ii). We write $V^{(2)}$ as a Wick polynomial using the fact that $a^2 g \in L^2(\mathbb{R})$, which follows from hypothesis (M2). □

Then we set $V_\kappa^{(2)} := [V_\kappa^{(1)}, iA]$ as a quadratic form on $D(A) \cap D(N^n)$. The following proposition is analogous to Prop. 8.5.

Proposition 8.15. Assume hypothesis (M2). Then the following is true:
i) The form $V_\kappa^{(2)}$ extends to a bounded operator from $D(N^n)$ to $\mathcal{H}$. It is a multiplication operator on Q-space by a function in $\bigcap_{p<\infty} L^p(Q, d\mu)$.
ii) As bounded operators from $D(N^n)$ to $\mathcal{H}$, we have $V^{(2)} = \lim_{\kappa\to+\infty} V_\kappa^{(2)}$.
iii) For some $\epsilon > 0$, $\|V^{(2)} - V_\kappa^{(2)}\|_{L^p(Q,d\mu)} \le C(p-1)^n\kappa^{-\epsilon}$.


Theorem 8.16. Assume hypothesis (B2).
i) There exists $c_0, b > 0$ such that $|(u, H^{(2)}u)| \le c_0(u, (H+b)u)$, $u \in D(H^n)$. Hence $H^{(2)}$ extends uniquely to a bounded operator from $D((H+b)^{1/2})$ to $D((H+b)^{1/2})^*$.
ii) $H \in C^2(A)$.

The proof of this theorem will be similar to the proof of Theorem 8.7. The main difference is the following lemma, which is used instead of Lemma 8.12.

Lemma 8.17. Assume hypothesis (B2). Then, for $p \in \mathbb{N}$,

$$\Big|\Big[A, i\Big[A, i\int g(x)\varphi_\kappa(x)^p\,dx\Big]\Big]\Big| \le C\int g(x)|\varphi_\kappa(x)|^p\,dx,$$

uniformly for $\kappa \ge m$.

Proof. We recall from Lemma 8.11 that $[A, i\int g(x)\varphi_\kappa(x)^p\,dx] = I_1 + I_2$, for

$$I_1 = \int (2pg(x) + \partial_x xg(x))\varphi_\kappa(x)^p\,dx, \qquad I_2 = -p\int g(x)\alpha_\kappa(x'-x)\varphi_\kappa(x')\varphi_\kappa(x)^{p-1}\,dx\,dx'.$$

The term $[A, iI_1]$ is completely analogous to $[A, i\int g(x)\varphi_\kappa(x)^p\,dx]$, with $g$ replaced by $g_1 = 2pg + \partial_x xg$. It follows from hypothesis (B2) that $|x\partial_x g_1(x)| \le g(x)$. The argument used in the proof of Lemma 8.12 shows then that $|[A, iI_1]| \le C\int g(x)|\varphi_\kappa(x)|^p\,dx$. We consider next $[A, iI_2]$. Using the identity (8.23), we obtain

$$\begin{aligned}
[A, iI_2] ={}& p\int g(x)\alpha_\kappa(x'-x)\,x'\partial_{x'}\varphi_\kappa(x')\,\varphi_\kappa(x)^{p-1}\,dx\,dx'\\
&+ p\int g(x)\alpha_\kappa(x'-x)\varphi_\kappa(x')\,x\partial_x\varphi_\kappa(x)^{p-1}\,dx\,dx'\\
&- 4p\int g(x)\alpha_\kappa(x'-x)\varphi_\kappa(x')\varphi_\kappa(x)^{p-1}\,dx\,dx'\\
&- p\int g(x)\alpha_\kappa(x'-x)\alpha_\kappa(x''-x')\varphi_\kappa(x'')\varphi_\kappa(x)^{p-1}\,dx\,dx'\,dx''\\
&- p(p-1)\int g(x)\alpha_\kappa(x'-x)\alpha_\kappa(x''-x)\varphi_\kappa(x')\varphi_\kappa(x'')\varphi_\kappa(x)^{p-2}\,dx\,dx'\,dx''\\
={}& R_1 + \cdots + R_5.
\end{aligned} \tag{8.38}$$

The term $R_3$ is equal to $4I_2$, and hence bounded by $C\int g(x)|\varphi_\kappa(x)|^p\,dx$. Integrating by parts, we have

$$\begin{aligned}
R_1 + R_2 ={}& -2p\int g(x)\alpha_\kappa(x'-x)\varphi_\kappa(x')\varphi_\kappa(x)^{p-1}\,dx\,dx'\\
&- p\int x\partial_x g(x)\,\alpha_\kappa(x'-x)\varphi_\kappa(x')\varphi_\kappa(x)^{p-1}\,dx\,dx'\\
&- p\int g(x)(x'-x)\partial_{x'}\alpha_\kappa(x'-x)\varphi_\kappa(x')\varphi_\kappa(x)^{p-1}\,dx\,dx'.
\end{aligned} \tag{8.39}$$

The first term in (8.39) equals $2I_2$. The second term is similar to $I_2$, with $g$ replaced by $x\partial_x g$. Note that it follows from (B1) that $|x\partial_x g(x)| \le cg(x)$. The third term is also similar to $I_2$, with $\alpha_\kappa$ replaced by $x\partial_x\alpha_\kappa$. By the argument in Lemma 8.12, we obtain that $R_1 + R_2$ is bounded by $C\int g(x)|\varphi_\kappa(x)|^p\,dx$. The term $R_4$ is equal to

$$-p\int g(x)\rho_\kappa(x''-x)\varphi_\kappa(x'')\varphi_\kappa(x)^{p-1}\,dx\,dx'', \quad \text{for } \rho_\kappa = \alpha_\kappa \star \alpha_\kappa.$$

Again, the argument in Lemma 8.12 shows that $R_4$ is bounded by $C\int g(x)|\varphi_\kappa(x)|^p\,dx$. Finally to estimate $R_5$, we use the fact that $abc^{p-2} \le C(a^p + b^p + c^p)$, and get:

$$\begin{aligned}
R_5 \le{}& C\int g(x)\alpha_\kappa(x'-x)\alpha_\kappa(x''-x)|\varphi_\kappa(x'')|^p\,dx\,dx'\,dx''\\
&+ C\int g(x)\alpha_\kappa(x'-x)\alpha_\kappa(x''-x)|\varphi_\kappa(x')|^p\,dx\,dx'\,dx''\\
&+ C\int g(x)\alpha_\kappa(x'-x)\alpha_\kappa(x''-x)|\varphi_\kappa(x)|^p\,dx\,dx'\,dx''\\
\le{}& 2Cc_\kappa\int g_\kappa(x)|\varphi_\kappa(x)|^p\,dx + Cc_\kappa^2\int g(x)|\varphi_\kappa(x)|^p\,dx,
\end{aligned}$$

for $c_\kappa = \int\alpha_\kappa(x)\,dx$, $g_\kappa = g \star \alpha_\kappa$. Using (8.27), (8.28), we obtain that $R_5$ is bounded by $C\int g(x)|\varphi_\kappa(x)|^p\,dx$. This completes the proof of the lemma. □

Proof of Theorem 8.16. i) is shown exactly as the analogous statement of Theorem 8.7, using Lemma 8.17 instead of Lemma 8.12. Let us prove that $H \in C^2(A)$. It follows first from the fact that $H \in C^1(A)$ that the following identity holds as quadratic forms on $D(A)$:

$$[(H+b)^{-1}, iA] = -(H+b)^{-1}H^{(1)}(H+b)^{-1}$$

(see [ABG, Thm. 6.2.10]). To show that $H \in C^2(A)$, we have to check that

$$|(u\,|\,[(H+b)^{-1}H^{(1)}(H+b)^{-1}, A]u)| \le C\|u\|^2, \quad u \in D(A). \tag{8.40}$$
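The elementary inequality $abc^{p-2} \le C(a^p + b^p + c^p)$ invoked for the term $R_5$ in the proof of Lemma 8.17 is stated there without proof. A minimal sketch, assuming $a, b, c \ge 0$ and $p \ge 2$ (assumptions of ours, matching how the inequality is applied to absolute values of the fields), is the weighted arithmetic-geometric mean inequality:

```latex
% Weights 1/p, 1/p, (p-2)/p sum to 1, so for a, b, c >= 0 and p >= 2:
a\,b\,c^{\,p-2}
  = (a^p)^{1/p}\,(b^p)^{1/p}\,(c^p)^{(p-2)/p}
  \le \frac{1}{p}\,a^p + \frac{1}{p}\,b^p + \frac{p-2}{p}\,c^p
  \le a^p + b^p + c^p .
% For p = 2 this reduces to ab <= (a^2 + b^2)/2.
```

Under these assumptions the constant can be taken to be $C = 1$.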

We have remarked in the proof of Theorem 8.4 (see (8.32)) that $D(H^n) \cap D(A)$ is dense in $D(A)$ for the graph topology. So it is enough to show (8.40) for $u \in D(H^n) \cap D(A)$. For $u \in D(H^n) \cap D(A)$, we have

$$\begin{aligned}
&((H+b)^{-1}H^{(1)}(H+b)^{-1}u, Au) - (Au, (H+b)^{-1}H^{(1)}(H+b)^{-1}u)\\
&\quad= (H^{(1)}(H+b)^{-1}u, (H+b)^{-1}Au) - ((H+b)^{-1}Au, H^{(1)}(H+b)^{-1}u)\\
&\quad= (H^{(1)}(H+b)^{-1}u, A(H+b)^{-1}u) - (A(H+b)^{-1}u, H^{(1)}(H+b)^{-1}u)\\
&\qquad+ i(H^{(1)}(H+b)^{-1}u, (H+b)^{-1}H^{(1)}(H+b)^{-1}u)\\
&\qquad+ i((H+b)^{-1}H^{(1)}(H+b)^{-1}u, H^{(1)}(H+b)^{-1}u).
\end{aligned} \tag{8.41}$$

We use the fact that $H^{(1)}(H+b)^{-n}$ is bounded by Thm. 8.4 ii) to justify the first equality in (8.41). Then we note that, since $(H+b)^{-1}$ preserves $D(A)$, the following identity is valid as bounded operators from $D(A)$ to $\mathcal{H}$:

$$(H+b)^{-1}A = A(H+b)^{-1} + i(H+b)^{-1}H^{(1)}(H+b)^{-1}. \tag{8.42}$$

Next we use the identity (8.42) in the second equality of (8.41). Applying Thm. 8.7 i), we see that the last two terms of (8.41) are less than $C\|u\|^2$. This shows that, as quadratic forms on $D(H^n) \cap D(A)$, we have

$$[(H+b)^{-1}H^{(1)}(H+b)^{-1}, iA] = (H+b)^{-1}[H^{(1)}, iA](H+b)^{-1} + R,$$

where $R$ is bounded for the topology of $\mathcal{H}$. By i), also the first term on the rhs is bounded for the topology of $\mathcal{H}$. □
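As a consistency check, the identity (8.42) is the commutator identity $[(H+b)^{-1}, iA] = -(H+b)^{-1}H^{(1)}(H+b)^{-1}$ quoted from [ABG] at the start of this proof, rearranged. A short sketch, writing $R = (H+b)^{-1}$ as shorthand of our own and ignoring domain questions (which the text handles through the invariance of $D(A)$ under $R$):

```latex
% Start from [R, iA] = -R H^{(1)} R, with R = (H+b)^{-1}:
i\,(RA - AR) = -\,R\,H^{(1)}R
\;\Longrightarrow\;
RA - AR = i\,R\,H^{(1)}R
\;\Longrightarrow\;
RA = AR + i\,R\,H^{(1)}R .
% Inserting this into (x, RAu), with x = H^{(1)} R u, gives
(x, RAu) = (x, ARu) + i\,(x, R\,H^{(1)}R\,u),
% which is how the third and fourth lines of (8.41) arise from the second.
```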

9. Spectral Analysis of $P(\varphi)_2$ Hamiltonians

This section is devoted to the spectral theory of $P(\varphi)_2$ Hamiltonians. We first show an HVZ type result. Note that the $\subset$ part of the HVZ theorem is well known (see [GJ3, S-H.K]), although our proof is different. We then prove the Mourre estimate, which implies the local finiteness of point spectrum and, under additional hypotheses, the limiting absorption principle.

9.1. HVZ theorem and existence of a ground state.

Theorem 9.1. We have $\sigma_{\rm ess}(H) = [\inf\sigma(H) + m, +\infty[$. Consequently $\inf\sigma(H)$ is a discrete eigenvalue of $H$.

Let us pick functions $j_0, j_\infty \in C^\infty(\mathbb{R})$ with $0 \le j_0 \le 1$, $j_0 \in C_0^\infty(\mathbb{R})$, $j_0 = 1$ near $0$ and $j_0^2 + j_\infty^2 = 1$. For $R \ge 1$, $j^R$ is defined as in Subsect. 7.5. We will also set $q^R = (j_0^R)^2$.

Proof. We prove first the $\subset$ part of the theorem. Let $\chi \in C_0^\infty(]-\infty, \inf\sigma(H) + m[)$. Because of $\operatorname{supp}\chi$, we have $\chi(H^{\rm ext}) = \chi(H^{\rm ext})1_{\{0\}}(N_\infty)$. Hence, using twice Lemma 7.12, we have

$$\begin{aligned}
\chi(H) &= \chi(H)I(j^R)I^*(j^R)\\
&= I(j^R)\chi(H^{\rm ext})I^*(j^R) + o(R^0)\\
&= I(j^R)\chi(H^{\rm ext})1_{\{0\}}(N_\infty)I^*(j^R) + o(R^0)\\
&= I(j^R)1_{\{0\}}(N_\infty)I^*(j^R)\chi(H) + o(R^0).
\end{aligned}$$

We claim that the operator $I(j^R)1_{\{0\}}(N_\infty)I^*(j^R)\chi(H) = \Gamma(q^R)\chi(H)$ is compact. In fact, since $(H_0+1)^{1/2}(H+b)^{-1/2}$ is bounded by the first order estimates (6.11), it suffices to verify that $\Gamma(q^R)(H_0+1)^{-1/2}$ is compact, which is easy (see e.g. [DG1, Lemma 4.2]). Hence $\chi(H)$ is compact, as a norm limit of the compact operators $\Gamma(q^R)\chi(H)$.


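Let us now prove the $\supset$ part of the theorem. Note that it follows from the $\subset$ part of the theorem that $H$ admits a ground state. Let $\lambda > \inf\sigma(H) + m$. Let $u$ be a ground state of $H$. Let $h \in C_0^\infty(\mathbb{R})$ with $\int h(k)\,dk = 1$, and let $x_0 \in \mathbb{R}$, $x_0 \ne 0$, $k_0 \in \mathbb{R}$, $k_0 \ne 0$, $\omega(k_0) = \lambda - \inf\sigma(H)$. Choose a sequence $R_j$ such that $\lim_{j\to\infty} j^{-1}R_j = \infty$ and define $h_j \in C_0^\infty(\mathbb{R})$, by setting

$$h_j(k) = j^{d/2}h(j(k-k_0))e^{iR_jk\cdot x_0}.$$

Then $\|h_j\| = 1$, $\text{w-}\lim_{j\to\infty} h_j = 0$ and $\lim_{j\to\infty}\|(\omega(k) - \omega(k_0))h_j\| = 0$. Let $u_j = a^*(h_j)u$. We have $\lim_{j\to\infty}\|u_j\| = 1$ and $\text{w-}\lim_{j\to\infty} u_j = 0$. Note that $u \in D(H^m)$ for any $m$, so it belongs to $D(H_0N^m)$ for any $m \in \mathbb{R}$. Therefore, $u_j \in D(H_0) \cap D(N^n) \subset D(H)$ and

$$\begin{aligned}
(H - \lambda)u_j &= (H_0 + V - \lambda)u_j\\
&= a^*(h_j)(H - \lambda)u + a^*(\omega(k)h_j)u + [V, a^*(h_j)]u\\
&= a^*((\omega(k) - \omega(k_0))h_j)u + [V, a^*(h_j)]u.
\end{aligned}$$

It is easy to see that

$$\lim_{i\to\infty}\|(h_i|w_p\|_{L^2(\mathbb{R}^{p-1})} = 0.$$

Therefore, by Proposition 3.13 we get that $[a^*(h_j), V](N+1)^{-n+1/2} \to 0$, when $j \to \infty$. This implies that $(H - \lambda)u_j \to 0$ when $j \to \infty$, and since $u_j$ tends weakly to 0, $u_j$ is a Weyl sequence for $\lambda$. □

9.2. The Mourre estimate and its consequences. We denote by $\tau$ the set of thresholds

$$\tau := \sigma_{\rm pp}(H) + m\mathbb{N}^*.$$

For $\lambda \in \mathbb{R}$, $\epsilon > 0$, let $I(\lambda, \epsilon)$ denote $[\lambda - \epsilon, \lambda + \epsilon]$. Likewise, for a subset $N \subset \mathbb{R}$, let $I(N, \epsilon)$ denote the set $\{k \in \mathbb{R} : \operatorname{dist}(N, k) \le \epsilon\}$.

Theorem 9.2. Assume hypothesis (B1), or if $\deg P = 4$, hypotheses (C), (M1).
i) Let $\lambda \in \mathbb{R}\setminus\tau$. Then there exists $\epsilon > 0$, $c_0 > 0$ and a compact operator $K$ such that

$$1_{I(\lambda,\epsilon)}(H)[H, iA]1_{I(\lambda,\epsilon)}(H) \ge c_01_{I(\lambda,\epsilon)}(H) + K.$$

ii) For all $[\lambda_1, \lambda_2]$ such that $[\lambda_1, \lambda_2] \cap \tau = \emptyset$, one has

$$\dim 1^{\rm pp}_{[\lambda_1,\lambda_2]}(H) < \infty.$$

Consequently $\sigma_{\rm pp}(H)$ can accumulate only at $\tau$, which is a closed countable set.
iii) Let $\lambda \in \mathbb{R}\setminus(\tau \cup \sigma_{\rm pp}(H))$. Then there exists $\epsilon > 0$, $c_0 > 0$ such that

$$1_{I(\lambda,\epsilon)}(H)[H, iA]1_{I(\lambda,\epsilon)}(H) \ge c_01_{I(\lambda,\epsilon)}(H).$$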

Remark 9.3. There is an example due to Simon [Si3] of a P (ϕ)2 Hamiltonian with eigenvalues embedded in [O + m, O + 2m[.


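Theorem 9.4. Assume hypothesis (B2). Then the strong limiting absorption principle holds:

$$\text{w-}\lim_{\epsilon\to\pm0}(1+|A|)^{-r}(H - \lambda - i\epsilon)^{-1}(1+|A|)^{-r} \text{ exists}$$

locally uniformly on $\sigma(H)\setminus(\tau \cup \sigma_{\rm pp}(H))$, for $r > \frac12$. Consequently $H$ has no singular continuous spectrum.

Thm. 9.4 is a consequence of Thm. 8.16 and the abstract Mourre theory (see [Mo, PSS], [ABG, Thm. 7.4.1]).

Proof of Theorem 9.2. The proof will be very similar to that of [DG1, Thm. 4.3]. Let us set $\tilde\omega(k) := k\cdot\nabla\omega(k) = k^2(k^2 + m^2)^{-1/2}$. Let

$$d(\lambda) = \inf_{p\ge1}\ \inf_{k_1,\dots,k_p\in\mathbb{R}}\Big\{\sum_{i=1}^p\tilde\omega(k_i)\ \Big|\ \lambda - \sum_{i=1}^p\omega(k_i) \in \sigma_{\rm pp}(H)\Big\},$$

$$\tilde d(\lambda) = \inf_{p\ge0}\ \inf_{k_1,\dots,k_p\in\mathbb{R}}\Big\{\sum_{i=1}^p\tilde\omega(k_i)\ \Big|\ \lambda - \sum_{i=1}^p\omega(k_i) \in \sigma_{\rm pp}(H)\Big\}.$$

Let us note that

$$\tilde d(\lambda) = \begin{cases} d(\lambda), & \lambda \notin \sigma_{\rm pp}(H),\\ 0, & \lambda \in \sigma_{\rm pp}(H).\end{cases}$$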
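The symbols entering the Mourre estimate can be written out explicitly. The sketch below assumes the one-dimensional relativistic dispersion relation $\omega(k) = (k^2 + m^2)^{1/2}$, which is consistent with the formula for $\tilde\omega$ given in the proof; the second line is our own computation of the symbol $(k\cdot\nabla)^2\omega$ that appears in $H_0^{(2)}$ in Subsect. 8.4.

```latex
% Assumption: \omega(k) = (k^2 + m^2)^{1/2}, k \in \mathbb{R}, m > 0.
\tilde\omega(k) = k\,\partial_k\omega(k)
  = \frac{k^2}{(k^2+m^2)^{1/2}},
\qquad 0 \le \tilde\omega(k) \le \omega(k),
\quad \tilde\omega(k) = 0 \iff k = 0 .
% Applying k \partial_k once more:
(k\,\partial_k)^2\omega(k) = k\,\partial_k\tilde\omega(k)
  = k\left(\frac{2k}{(k^2+m^2)^{1/2}} - \frac{k^3}{(k^2+m^2)^{3/2}}\right)
  = \frac{k^2\,(k^2+2m^2)}{(k^2+m^2)^{3/2}}
  \le 2\,\omega(k).
```

Since $\tilde\omega$ vanishes only at $k = 0$, the sums $\sum_i\tilde\omega(k_i)$ in the definition of $d(\lambda)$ can be small only when all momenta are close to zero, that is when $\lambda$ is close to a threshold. The bound $(k\partial_k)^2\omega \le 2\omega$ is compatible with $H_0^{(2)}$ being bounded from $D(H_0)$ to $\mathcal{H}$.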

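We introduce also "smeared out" versions of the functions $d(\lambda)$ and $\tilde d(\lambda)$. We set

$$d^\kappa(\lambda) := \inf_{\mu\in I(\lambda,\kappa)} d(\mu) = \inf_{p\ge1}\ \inf_{k_1,\dots,k_p\in\mathbb{R}}\Big\{\sum_{i=1}^p\tilde\omega(k_i)\ \Big|\ \lambda - \sum_{i=1}^p\omega(k_i) \in I(\sigma_{\rm pp}(H), \kappa)\Big\},$$

$$\tilde d^\kappa(\lambda) := \inf_{\mu\in I(\lambda,\kappa)}\tilde d(\mu) = \inf_{p\ge0}\ \inf_{k_1,\dots,k_p\in\mathbb{R}}\Big\{\sum_{i=1}^p\tilde\omega(k_i)\ \Big|\ \lambda - \sum_{i=1}^p\omega(k_i) \in I(\sigma_{\rm pp}(H), \kappa)\Big\}.$$

Note that the following equality holds:

$$\inf_{p\ge1}\ \inf_{k_1,\dots,k_p\in\mathbb{R}}\Big(\tilde d^\kappa\Big(\lambda - \sum_{i=1}^p\omega(k_i)\Big) + \sum_{i=1}^p\tilde\omega(k_i)\Big) = d^\kappa(\lambda). \tag{9.1}$$

We will use an induction with respect to $n \in \mathbb{N}$. Let us first list the statements that we will show. We put $E_0 := \inf\sigma(H)$.

$H_1(n)$: Let $\epsilon > 0$ and $\lambda \in [E_0, E_0 + nm[$. Then there exists a compact operator $K$, an interval $\Delta \ni \lambda$ such that

$$1_\Delta(H)[H, iA]1_\Delta(H) \ge (d(\lambda) - \epsilon)1_\Delta(H) + K.$$

$H_2(n)$: Let $\epsilon > 0$ and $\lambda \in [E_0, E_0 + nm[$. Then there exists an interval $\Delta \ni \lambda$ such that

$$1_\Delta(H)[H, iA]1_\Delta(H) \ge (\tilde d(\lambda) - \epsilon)1_\Delta(H).$$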


$H_3(n)$: Let $\kappa > 0$, $\epsilon_0 > 0$ and $\epsilon > 0$. Then there exists $\delta > 0$ such that for all $\lambda \in [E_0, E_0 + nm - \epsilon_0]$, one has

$$1_{I(\lambda,\delta)}(H)[H, iA]1_{I(\lambda,\delta)}(H) \ge (\tilde d^\kappa(\lambda) - \epsilon)1_{I(\lambda,\delta)}(H).$$

$S_1(n)$: $\tau$ is a closed countable set in $[E_0, E_0 + nm]$.

$S_2(n)$: for all $\lambda_1 \le \lambda_2 \le E_0 + nm$ with $[\lambda_1, \lambda_2] \cap \tau = \emptyset$, we have $\dim 1^{\rm pp}_{[\lambda_1,\lambda_2]}(H) < \infty$.

We will prove, for all $n \in \mathbb{N}$, the following implications:

$$\begin{aligned}
&H_1(n) \Rightarrow H_2(n),\\
&H_2(n) \Rightarrow H_3(n),\\
&H_1(n) \Rightarrow S_2(n),\\
&S_2(n-1) \Rightarrow S_1(n),\\
&S_1(n) \text{ and } H_3(n-1) \Rightarrow H_1(n).
\end{aligned}$$

Note first that the statements $H_1(1)$ and $S_1(1)$ are immediate since the spectrum of $H$ is discrete in $[E_0, E_0 + m[$. Note also that the implication $S_2(n-1) \Rightarrow S_1(n)$ is obvious. The proof of the implications $H_1(n) \Rightarrow H_2(n)$, $H_2(n) \Rightarrow H_3(n)$ is a standard argument which adapts directly to the present setting (see [FH, CFKS]). The proof of the implication $H_1(n) \Rightarrow S_2(n)$ is also standard and based on the virial relation, which holds here by Thm. 8.7. It remains to prove that $S_1(n)$ and $H_3(n-1) \Rightarrow H_1(n)$.

Recall that the Hamiltonian $H^{\rm ext}$ acting on $\mathcal{H} \otimes \mathcal{H}$ was introduced in Subsect. 7.4. We also set $A^{\rm ext} = A \otimes 1 + 1 \otimes A$, acting on $\mathcal{H} \otimes \mathcal{H}$. Let us first show that, for all $\lambda \in [E_0, E_0 + nm - \epsilon_0[$, there exists $\delta > 0$ such that

$$1_{I(\lambda,\delta)}(H^{\rm ext})[H^{\rm ext}, iA^{\rm ext}]1_{I(\lambda,\delta)}(H^{\rm ext})1_{[1,\infty[}(N_\infty) \ge \Big(d(\lambda) - \frac{2\epsilon}{3}\Big)1_{I(\lambda,\delta)}(H^{\rm ext})1_{[1,\infty[}(N_\infty). \tag{9.2}$$

To simplify, let us write $d\Gamma^\infty(\omega)$, $d\Gamma^\infty(\tilde\omega)$, instead of $1 \otimes d\Gamma(\omega)$, $1 \otimes d\Gamma(\tilde\omega)$. We will also write $B$ instead of $B \otimes 1$. Using the closedness of $\tau$ in $[E_0, E_0 + nm]$, i.e. the induction hypothesis $S_1(n)$, we see that

$$d(\lambda) = \sup_{\kappa>0} d^\kappa(\lambda),$$

for $\lambda \in [E_0, E_0 + nm[$. So we may choose $\kappa$ small enough so that $d^\kappa(\lambda) \ge d(\lambda) - \epsilon/3$. Next, using $H_3(n-1)$, we choose $\delta$ such that, for $\lambda_1 \in [E_0, E_0 + (n-1)m - \epsilon_0[$, we have

$$1_{I(\lambda_1,\delta)}(H)[H, iA]1_{I(\lambda_1,\delta)}(H) \ge \Big(\tilde d^\kappa(\lambda_1) - \frac{\epsilon}{3}\Big)1_{I(\lambda_1,\delta)}(H).$$


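Replacing $\lambda_1$ with $\lambda - d\Gamma(\omega(k))$, we obtain for $\lambda \in [E_0, E_0 + nm - \epsilon_0[$ the following estimate:

$$\begin{aligned}
&1_{I(\lambda,\delta)}\big(H + d\Gamma^\infty(\omega)\big)\big([H, iA] + d\Gamma^\infty(\tilde\omega)\big)1_{I(\lambda,\delta)}\big(H + d\Gamma^\infty(\omega)\big)1_{[1,\infty[}(N_\infty)\\
&\quad\ge 1_{I(\lambda,\delta)}\big(H + d\Gamma^\infty(\omega)\big)\Big(\tilde d^\kappa\big(\lambda - d\Gamma^\infty(\omega)\big) + d\Gamma^\infty(\tilde\omega) - \frac{\epsilon}{3}\Big)1_{[1,\infty[}(N_\infty)\\
&\quad\ge \Big(d^\kappa(\lambda) - \frac{\epsilon}{3}\Big)1_{I(\lambda,\delta)}\big(H + d\Gamma^\infty(\omega)\big)1_{[1,\infty[}(N_\infty)\\
&\quad\ge \Big(d(\lambda) - \frac{2\epsilon}{3}\Big)1_{I(\lambda,\delta)}\big(H + d\Gamma^\infty(\omega)\big)1_{[1,\infty[}(N_\infty),
\end{aligned}$$

which yields (9.2). Now let $\chi \in C_0^\infty(\mathbb{R})$. As in the proof of Theorem 9.1,

$$\begin{aligned}
\chi^2(H) &= I(j^R)1_{\{0\}}(N_\infty)I^*(j^R)\chi^2(H) + I(j^R)1_{[1,\infty[}(N_\infty)I^*(j^R)\chi^2(H)\\
&= \Gamma(q^R)\chi^2(H) + I(j^R)1_{[1,\infty[}(N_\infty)\chi^2(H^{\rm ext})I^*(j^R) + o(R^0).
\end{aligned} \tag{9.3}$$

The first term of (9.3) is compact as in the proof of Thm. 9.1. Next we use that $[H, iA]$ equals $H^{(1)}$, and that on $D(H^n)$ the operator $H^{(1)}$ can be written as $H_0^{(1)} + V^{(1)}$. So on $D(H^n)$, $[H, iA]$ is similar to $H$ with $\omega$ replaced by $\tilde\omega$ and $V$ replaced by the Wick polynomial $V^{(1)}$. It is then easy to see that the analog of (7.22) holds, i.e. as an operator identity on $D(H^n)$ we have

$$[H^{\rm ext}, iA^{\rm ext}]\check\Gamma(j^R) - \check\Gamma(j^R)[H, iA] \in (N+1)^n\check o_N(R^0), \tag{9.4}$$

for $[H^{\rm ext}, iA^{\rm ext}] = [H, iA] \otimes 1 + 1 \otimes d\Gamma(\tilde\omega)$. Using also Lemma 7.12, we obtain

$$\begin{aligned}
\chi(H)[H, iA]\chi(H) ={}& \Gamma(q^R)\chi(H)[H, iA]\chi(H)\\
&+ I(j^R)1_{[1,\infty[}(N_\infty)\chi(H^{\rm ext})[H^{\rm ext}, iA^{\rm ext}]\chi(H^{\rm ext})I^*(j^R) + o(R^0),
\end{aligned} \tag{9.5}$$

where the first term on the right is again compact. Now (9.2), (9.3) and (9.5) for $\operatorname{supp}\chi \subset [\lambda - \delta, \lambda + \delta]$, yield

$$\chi(H)[H, iA]\chi(H) \ge (d(\lambda) - 2\epsilon/3)\chi^2(H) + K_1 + o(R^0),$$

where $K_1$ is compact. Picking $R$ large enough, this proves $H_1(n)$. □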

10. Scattering Theory of $P(\varphi)_2$ Hamiltonians

This section is devoted to the scattering theory of $P(\varphi)_2$ Hamiltonians. In quantum field theory, the scattering theory is usually based on the construction of the asymptotic fields, which is done in Subsect. 10.1. The unitarity of the wave operator (a result originally due to Høegh–Krohn) is shown in Subsect. 10.2, using general properties of regular CCR representations proven in Sect. 4. The asymptotic completeness property is formulated in Subsect. 10.3 and will be shown in Sect. 12.


10.1. Asymptotic fields. In all this section, we will assume the conditions (A), (Is) for $s > 1$. For $h \in \mathfrak{h}$, we set $h_t := e^{-it\omega(k)}h$. We denote by $\mathfrak{h}_0 \subset \mathfrak{h}$ the space $C_0^\infty(\mathbb{R}\setminus\{0\})$.

Theorem 10.1. i) For all $h \in \mathfrak{h}$, the strong limits

$$W^+(h) := \text{s-}\lim_{t\to+\infty} e^{itH}W(h_t)e^{-itH} \tag{10.1}$$

exist. They are called the asymptotic Weyl operators. The asymptotic Weyl operators can be also defined using the norm limit:

$$W^+(h)(H+b)^{-n} = \lim_{t\to+\infty} e^{itH}W(h_t)(H+b)^{-n}e^{-itH}. \tag{10.2}$$

ii) The map

$$\mathfrak{h} \ni h \mapsto W^+(h) \tag{10.3}$$

is strongly continuous and, for $\epsilon > 0$, the map

$$\mathfrak{h} \ni h \mapsto W^+(h)(H+b)^{-\epsilon} \tag{10.4}$$

is norm continuous.
iii) The operators $W^+(h)$ satisfy the Weyl commutation relations:

$$W^+(h)W^+(g) = e^{-\frac{i}{2}\operatorname{Im}(h|g)}W^+(h+g).$$

iv) The Hamiltonian preserves the asymptotic Weyl operators:

$$e^{itH}W^+(h)e^{-itH} = W^+(h_{-t}). \tag{10.5}$$

Proof. We have

$$W(h_t) = e^{-itH_0}W(h)e^{itH_0},$$

which implies that, as a quadratic form on $D(H_0)$, one has

$$\partial_tW(h_t) = -[H_0, iW(h_t)]. \tag{10.6}$$

Using (10.6) and the fact that $D(H^n) \subset D(H_0) \cap D(V)$, we have, as quadratic forms on $D(H^n)$,

$$\partial_t\,e^{itH}W(h_t)e^{-itH} = ie^{itH}[V, W(h_t)]e^{-itH}.$$

Integrating this relation we have, as a quadratic form identity on $D(H^n)$,

$$e^{itH}W(h_t)e^{-itH} - W(h) = i\int_0^t e^{it'H}[V, W(h_{t'})]e^{-it'H}\,dt'.$$

Using Prop. 3.13, we obtain that

$$[V, W(h_t)] = W(h_t)\tilde V_t, \tag{10.7}$$

where $\tilde V_t$ is the sum of Wick monomials in (3.21) with $s + r \ge 1$. By stationary phase arguments, we obtain that, for $h \in \mathfrak{h}_0$, there exists $\epsilon > 0$ such that

$$h_t = 1_{\{|x|\ge\epsilon t\}}h_t + O(t^{-\infty}). \tag{10.8}$$

Using then Lemma 6.3 and the form (3.22) of the symbols of V˜t , we obtain that V˜t (N + 1)−n ∈ O(t −s ). This shows that the identity (10.7) makes sense as an identity between bounded operators from D(H n ) to H. It also proves that the norm limit (10.2) exists for h ∈ h0 . For h ∈ h, let hn ∈ h0 such that h = limn→∞ hn . Let 0 < ≤ 21 . Using the first order estimates and Prop. 3.1, we obtain W (hn,t ) − W (ht ) (H + b)− ≤ W (hn ) − W (h) (N + 1)− (N + 1) (H + b)− ≤ C(hn − h (hn 2 + h)2 + 1), which implies

lim sup W (hn,t ) − W (ht ) (H + b)− = 0.

n→∞ t∈R

This implies the existence of the norm limit (10.2) for all h ∈ h. Now (10.2) implies (10.1). This ends the proof of i). We have W + (h) − W + (g) (H + b)− ≤ lim eitH (W (ht ) − W (gt ))(H + b)− e−itH t→+∞

≤ C(g − h (g2 + h)2 + 1), by Prop. 3.1, which implies the norm continuity of (10.4). This implies the strong continuity of (10.3) and completes the proof of ii). Finally iii) and iv) are immediate. $ % It follows from the above theorem that h h → W + (h) is a regular CCR representation. We next follow Sect. 2, introducing field operators, creation/annihilation operators, etc. Theorem 10.2. i) For any h ∈ h, φ + (h) := −i

$\frac{d}{ds}W^+(sh)\big|_{s=0}$ defines a self-adjoint operator, called the asymptotic field, such that

$$W^+(h) = e^{i\phi^+(h)}.$$

ii) The operators $\phi^+(h)$ satisfy, in the sense of quadratic forms on $D(\phi^+(h_1)) \cap D(\phi^+(h_2))$, the canonical commutation relations

$$[\phi^+(h_2), \phi^+(h_1)] = i\operatorname{Im}(h_2|h_1). \tag{10.9}$$


iii) $e^{itH}\phi^+(h)e^{-itH} = \phi^+(h_{-t})$.
iv) For $h_i \in \mathfrak{h}$, $1 \le i \le p$, $D((H+i)^{p/2}) \subset D(\prod_{i=1}^p\phi^+(h_i))$, and

$$\prod_{i=1}^p\phi^+(h_i)(H+i)^{-p/2} = \text{s-}\lim_{t\to+\infty} e^{itH}\prod_{i=1}^p\phi(h_{i,t})e^{-itH}(H+i)^{-p/2}.$$

Proof. Properties i) and ii) are consequences of the fact that the asymptotic Weyl operators define a regular CCR representation (see Sect. 2). Property iii) follows from Thm. 10.1 iv). It remains to prove iv). Let us first establish the existence of the strong limit

$$\text{s-}\lim_{t\to+\infty} e^{itH}\prod_{i=1}^p\phi(h_{i,t})(H+b)^{-p/2}e^{-itH} =: R(h_1,\dots,h_p), \quad \text{for } h_i \in \mathfrak{h}. \tag{10.10}$$

For $u, v \in D(H^n)$, we have

$$\frac{\partial}{\partial t}\Big(v_t, \prod_{i=1}^p\phi(h_{i,t})(H+b)^{-p/2}u_t\Big) = \Big(v_t, \Big[H, i\prod_{i=1}^p\phi(h_{i,t})\Big](H+b)^{-p/2}u_t\Big) + \Big(v_t, \partial_t\prod_{i=1}^p\phi(h_{i,t})(H+b)^{-p/2}u_t\Big).$$

We use again the fact that $H = H_0 + V$ on $D(H^n)$ and the higher order estimates, which show that

$$\begin{aligned}
&\Big[H, i\prod_{i=1}^p\phi(h_{i,t})\Big](H+b)^{-p/2} + \partial_t\prod_{i=1}^p\phi(h_{i,t})(H+b)^{-p/2}\\
&\quad= \Big[V, i\prod_{i=1}^p\phi(h_{i,t})\Big](H+b)^{-p/2} + \Big[H_0, i\prod_{i=1}^p\phi(h_{i,t})\Big](H+b)^{-p/2} + \partial_t\prod_{i=1}^p\phi(h_{i,t})(H+b)^{-p/2}\\
&\quad= \Big[V, i\prod_{i=1}^p\phi(h_{i,t})\Big](H+b)^{-p/2},
\end{aligned}$$

as an identity between quadratic forms on $D(H^n)$. Using then the fact that $\phi(h)$ maps $D(N^k)$ into $D(N^{k-\frac12})$, we obtain the identity

$$\Big[V, i\prod_{i=1}^p\phi(h_{i,t})\Big](H+b)^{-p/2} = \sum_{j=1}^p\ \prod_{i=1}^{j-1}\phi(h_{i,t})\,[V, i\phi(h_{j,t})]\prod_{i=j+1}^p\phi(h_{i,t})(H+b)^{-p/2},$$

as a quadratic form identity on $D(H^n)$. For $h \in \mathfrak{h}$, the term $[V, i\phi(h_t)]$ is by Prop. 3.13 a Wick polynomial with kernels of the form $w_p|h_t)$ or $(h_t|w_p$. By a stationary phase argument, if $h \in \mathfrak{h}_0$, we can find $\epsilon_0 > 0$ such that

$$1_{\{|x|\le\epsilon_0t\}}h_t \in O(t^{-\infty}).$$

Using then hypothesis (Is) for $s > 1$, Lemma 6.3 and Prop. 3.13, we obtain

$$[V, i\phi(h_t)] \in O_N(t^{-s})(N+1)^n. \tag{10.11}$$

Using again the higher order estimates, we obtain that if $h_i \in \mathfrak{h}_0$, $1 \le i \le p$, then

$$\frac{\partial}{\partial t}\Big(v_t, \prod_{i=1}^p\phi(h_{i,t})(H+b)^{-p/2}u_t\Big) = (v_t, R(t)u_t), \quad u, v \in D(H^n),$$

where $\|R(t)(H+b)^{-n}\| \le Ct^{-s}$. This proves the existence of the limit (10.10) for $u \in D(H^n)$, $h_i \in \mathfrak{h}_0$. The estimate

$$\|(N+1)^m(\phi(h_1) - \phi(h_2))(N+1)^{-m-\frac12}\| \le C\|h_1 - h_2\|$$

and a density argument, as in the proof of Thm. 10.1, give the existence of (10.10) for $u \in D(H^n)$, $h_i \in \mathfrak{h}$. Finally it follows again from the higher order estimates that $\prod_{i=1}^p\phi(h_{i,t})(H+b)^{-p/2}$ is bounded uniformly in $t$, which shows the existence of (10.10) for all $u \in \mathcal{H}$.

We prove now the identity iv) by induction on $p$. We have to show that $D(H^{p/2}) \subset D(\prod_{i=1}^p\phi^+(h_i))$ and that $R(h_1,\dots,h_p) = \prod_{i=1}^p\phi^+(h_i)(H+b)^{-p/2}$. This amounts to show that

$$R(h_1,\dots,h_p) = \text{s-}\lim_{s\to0}(is)^{-1}(W^+(sh_1) - 1)\prod_{i=2}^p\phi^+(h_i)(H+b)^{-p/2}.$$

Note that, by the induction assumption, $D(H^{p/2}) \subset D(\prod_{i=2}^p\phi^+(h_i))$ and

$$\prod_{i=2}^p\phi^+(h_i)(H+b)^{-p/2} = \text{s-}\lim_{t\to+\infty} e^{itH}\prod_{i=2}^p\phi(h_{i,t})e^{-itH}(H+b)^{-p/2}. \tag{10.12}$$

Using (10.12) and the fact that $e^{itH}W(h_{1,t})e^{-itH}$ is uniformly bounded in $t$, we have

$$(is)^{-1}(W^+(sh_1) - 1)\prod_{i=2}^p\phi^+(h_i)(H+b)^{-p/2} = \text{s-}\lim_{t\to+\infty} e^{itH}(is)^{-1}(W(sh_{1,t}) - 1)\prod_{i=2}^p\phi(h_{i,t})e^{-itH}(H+b)^{-p/2}.$$

So to prove iv), it suffices to check that

$$\text{s-}\lim_{s\to0}\ \text{s-}\lim_{t\to\infty} e^{itH}R(s,t)e^{-itH} = 0, \tag{10.13}$$

for

$$R(s,t) = \Big(\frac{W(sh_{1,t}) - 1}{s} - i\phi(h_{1,t})\Big)\prod_{i=2}^p\phi(h_{i,t})(H+b)^{-p/2}.$$

We recall that

$$\sup_{|s|\le1,\ \|h\|\le C}\Big\|\frac{W(sh) - 1}{s}(N+1)^{-\frac12}\Big\| < \infty, \tag{10.14}$$

and

$$\lim_{s\to0}\ \sup_{\|h\|\le C}\Big\|\Big(\frac{W(sh) - 1}{s} - i\phi(h)\Big)(N+1)^{-\frac12-\epsilon}\Big\| = 0, \quad \epsilon > 0. \tag{10.15}$$

Using (10.14) and the higher order estimates, we see that $R(s,t)$ is uniformly bounded for $|s| \le 1$, $t \in \mathbb{R}$, and using then (10.15) we see that $\lim_{s\to0}\sup_{t\in\mathbb{R}}\|R(s,t)u\| = 0$, for $u \in D(H)$. This shows (10.13) and completes the proof of the theorem. □

The following theorem follows directly from Thm. 10.1 and from the properties of regular CCR representations.


Theorem 10.3. i) For any $h \in \mathfrak{h}$, the asymptotic creation and annihilation operators defined on $D(a^{+\sharp}(h)) := D(\phi^+(h)) \cap D(\phi^+(ih))$ by

$$a^{+*}(h) := \tfrac{1}{\sqrt2}\big(\phi^+(h) - i\phi^+(ih)\big), \qquad a^+(h) := \tfrac{1}{\sqrt2}\big(\phi^+(h) + i\phi^+(ih)\big),$$

are closed.
ii) The operators $a^{+\sharp}$ satisfy, in the sense of forms on $D(a^{+\sharp}(h_1)) \cap D(a^{+\sharp}(h_2))$, the canonical commutation relations

$$[a^+(h_1), a^{+*}(h_2)] = (h_1|h_2)1, \qquad [a^+(h_2), a^+(h_1)] = [a^{+*}(h_2), a^{+*}(h_1)] = 0.$$

iii)

$$e^{itH}a^{+\sharp}(h)e^{-itH} = a^{+\sharp}(h_{-t}). \tag{10.16}$$

iv) For $h_i \in \mathfrak{h}$, $1 \le i \le p$, $D((H+i)^{p/2}) \subset D(\prod_{i=1}^pa^{+\sharp}(h_i))$ and

$$\prod_{i=1}^pa^{+\sharp}(h_i)(H+b)^{-\frac p2} = \text{s-}\lim_{t\to\infty} e^{itH}\prod_{i=1}^pa^\sharp(h_{i,t})(H+b)^{-\frac p2}e^{-itH}.$$
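The commutation relation $[a^+(h_1), a^{+*}(h_2)] = (h_1|h_2)$ of Thm. 10.3 ii) can be checked formally from the field commutator (10.9). The sketch below is a form computation that ignores domains, and it assumes the scalar product $(\cdot|\cdot)$ is antilinear in its first argument and linear in the second; this convention is our assumption, chosen because it is the one under which the two stated relations are compatible.

```latex
% Notation: z = (h_1|h_2), and [\phi^+(f), \phi^+(g)] = i\,\mathrm{Im}(f|g).
% With (ih_1|ih_2) = z, (h_1|ih_2) = iz, (ih_1|h_2) = -iz we get
% Im(h_1|ih_2) = Re z and Im(ih_1|h_2) = -Re z. Then
[a^+(h_1), a^{+*}(h_2)]
 = \tfrac12\Big( [\phi^+(h_1),\phi^+(h_2)]
      - i\,[\phi^+(h_1),\phi^+(ih_2)]
      + i\,[\phi^+(ih_1),\phi^+(h_2)]
      + [\phi^+(ih_1),\phi^+(ih_2)] \Big)
 = \tfrac12\Big( i\,\mathrm{Im}\,z + \mathrm{Re}\,z
      + \mathrm{Re}\,z + i\,\mathrm{Im}\,z \Big)
 = z .
```

The same bookkeeping with both signs equal in front of $i\phi^+(ih)$ gives $[a^+(h_2), a^+(h_1)] = 0$, since the real and imaginary contributions then cancel pairwise.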

10.2. Asymptotic spaces and wave operators. In this subsection, we recall the construction of the asymptotic vacuum spaces and wave operators, due to Høegh–Krohn [HK]. We give a more direct proof of the unitarity of the wave operators, based on the existence of a number operator for the CCR representation given by the asymptotic Weyl operators. We define the asymptotic vacuum space to be

$$\mathcal{K}^+ := \{u \in \mathcal{H} \mid a^+(h)u = 0,\ h \in \mathfrak{h}\}.$$

The asymptotic space is defined as

$$\mathcal{H}^+ := \mathcal{K}^+ \otimes \mathcal{H}.$$

Proposition 10.4. i) $\mathcal{K}^+$ is a closed $H$-invariant space.
ii) $\mathcal{K}^+$ is included in the domain of $\prod_{i=1}^pa^{+\sharp}(h_i)$, for $h_i \in \mathfrak{h}$.
iii) $\mathcal{H}_{\rm pp}(H) \subset \mathcal{K}^+$.

Proof. i) and ii) follow by the properties of CCR relations described in Proposition 4.1. The fact that $\mathcal{K}^+$ is $H$-invariant follows from (10.16). To prove iii) we verify that for $u \in D(H)$, $Hu = \lambda u$, $h \in \mathfrak{h}_0$,

$$a(h_t)e^{-itH}u = e^{-it\lambda}a(h_t)u \in o(1). \qquad □$$


The asymptotic Hamiltonian is defined by

$$H^+ := K^+ \otimes 1 + 1 \otimes d\Gamma(\omega), \quad \text{for } K^+ := H\big|_{\mathcal{K}^+}.$$

We also define

$$\Omega^+ : \mathcal{H}^+ \to \mathcal{H},$$
$$\Omega^+\,\psi \otimes a^*(h_1)\cdots a^*(h_p)\Omega := a^{+*}(h_1)\cdots a^{+*}(h_p)\psi, \quad h_1,\dots,h_p \in \mathfrak{h},\ \psi \in \mathcal{K}^+. \tag{10.17}$$

The map $\Omega^+$ is called the wave operator. It is a particular case of the map $\Omega^\pi$ defined in Prop. 4.2. The following theorem is due to Høegh–Krohn [HK].

Theorem 10.5. $\Omega^+$ is a unitary map from $\mathcal{H}^+$ to $\mathcal{H}$ such that

$$a^{+\sharp}(h)\Omega^+ = \Omega^+\,1 \otimes a^\sharp(h),\ h \in \mathfrak{h}, \qquad H\Omega^+ = \Omega^+H^+.$$

Proof. By general properties of regular CCR representations (see Proposition 4.2), the operator $\Omega^+$ is well defined and isometric. To prove that it is unitary, we will show that the CCR representation $\mathfrak{h} \ni h \mapsto W^+(h)$ admits a densely defined number operator and use Theorem 4.3. Let $n^+$ be the quadratic form associated to the CCR representation $W^+$, as in Subsect. 4.2. Let us show that $D(n^+)$ is dense in $\mathcal{H}$. For each finite dimensional space $\mathfrak{f} \subset \mathfrak{h}$, if

$$n^+_{\mathfrak{f}}(u) = \sum_{i=1}^{\dim\mathfrak{f}}\|a^+(h_i)u\|^2,$$

for $\{h_i\}$ an orthonormal base of $\mathfrak{f}$, we have

$$n^+_{\mathfrak{f}}(u) = \lim_{t\to+\infty}\sum_{i=1}^{\dim\mathfrak{f}}\|a(h_{i,t})e^{-itH}u\|^2 = \lim_{t\to+\infty}(e^{-itH}u, d\Gamma(P_{\mathfrak{f},t})e^{-itH}u),$$

if $P_{\mathfrak{f},t}$ is the orthogonal projection on $e^{-it\omega}\mathfrak{f}$. But $d\Gamma(P_{\mathfrak{f},t}) \le N$, so

$$n^+_{\mathfrak{f}}(u) \le \sup_t\|N^{\frac12}e^{-itH}u\|^2 \le C\|(H+b)^{\frac12}u\|^2,$$

by the first order estimates (6.4). Therefore

$$D(H^{\frac12}) \subset D(n^+),$$

which implies that $D(n^+)$ is dense in $\mathcal{H}$ and hence, by Theorem 4.3, $\operatorname{Ran}\Omega^+ = \mathcal{H}$. □


10.3. Asymptotic completeness. The definition of the wave operators seems different from the one commonly used in the physics literature, where asymptotic creation operators are only applied to bound states of $H$, generating the so-called asymptotic states. In this respect one can ask what property of the wave operators should be called asymptotic completeness. A physically important property is the fact that incoming and outgoing asymptotic vacua coincide, that is $\mathcal{K}^+ = \mathcal{K}^-$, where $\mathcal{K}^-$ is defined analogously to $\mathcal{K}^+$, with $t \to -\infty$ replacing $t \to +\infty$ in the definition of the asymptotic Weyl operators. Since we have seen that $\mathcal{H}_{\rm pp}(H) \subset \mathcal{K}^\pm$, the natural definition of asymptotic completeness is that $\mathcal{H}_{\rm pp}(H) = \mathcal{K}^\pm$. The following theorem is one of the main results of this paper:

Theorem 10.6. Assume hypotheses (B1), (Is) for $s > 1$, or if $\deg P = 4$, hypotheses (C), (M1), (Is) for $s > 1$. Then the $P(\varphi)_2$ Hamiltonian $H$ has the asymptotic completeness property:

$$\mathcal{H}_{\rm pp}(H) = \mathcal{K}^\pm.$$

Theorem 10.6 will be proved in Subsect. 12.5, as a consequence of Thm. 12.5 and of the Mourre estimate.

10.4. Extended wave operator. Recall that in Subsect. 7.5 we introduced the extended Hilbert space and the extended Hamiltonian

$$\mathcal{H}^{\rm ext} = \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}), \qquad H^{\rm ext} = H \otimes 1 + 1 \otimes d\Gamma(\omega(k)).$$

Clearly, $\mathcal{H}^+$ is a subspace of $\mathcal{H}^{\rm ext}$ and

$$H^+ = H^{\rm ext}\big|_{\mathcal{H}^+}.$$

Sometimes we will also need the "extended wave operator". Its domain can be chosen to be

$$D(\Omega^{{\rm ext},+}) := \bigoplus_{p=0}^\infty D\big((H+b)^{\frac p2}\big) \otimes \otimes_s^p\mathfrak{h},$$

which is a dense subset of $\mathcal{H}^{\rm ext}$. Now we set

$$\Omega^{{\rm ext},+} : D(\Omega^{{\rm ext},+}) \to \mathcal{H},$$
$$\Omega^{{\rm ext},+}\,\psi \otimes a^*(h_1)\cdots a^*(h_p)\Omega := a^{+*}(h_1)\cdots a^{+*}(h_p)\psi, \quad \psi \in D\big((H+b)^{\frac p2}\big). \tag{10.18}$$

Note that $\Omega^{{\rm ext},+}$ is an unbounded operator. Clearly,

$$\Omega^{{\rm ext},+}\big|_{\mathcal{H}^+} = \Omega^+. \tag{10.19}$$

We will sometimes treat $\Omega^+$ as a partial isometry equal to zero on the orthogonal complement of $\mathcal{H}^+$ inside $\mathcal{H}^{\rm ext}$. We can then write the following identity:

$$\Omega^+ = \Omega^{{\rm ext},+}1_{\mathcal{H}^+}, \tag{10.20}$$

where $1_{\mathcal{H}^+}$ denotes the projection onto $\mathcal{H}^+$ inside the space $\mathcal{H}^{\rm ext}$.


10.5. Another construction of the wave operators. Recall that in Subsect. 3.9 we defined the identification operator $I : \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h})$.

Theorem 10.7. i) Let $u \in D(\Omega^{{\rm ext},+})$. Then the limit

$$\lim_{t\to+\infty} e^{itH}Ie^{-itH^{\rm ext}}u$$

exists and equals $\Omega^{{\rm ext},+}u$.
ii) Let $\chi \in C_0^\infty(\mathbb{R})$. Then $\operatorname{Ran}\chi(H^{\rm ext}) \subset D(\Omega^{{\rm ext},+})$, $I\chi(H^{\rm ext})$ and $\Omega^{{\rm ext},+}\chi(H^{\rm ext})$ are bounded operators, and

$$\lim_{t\to+\infty} e^{itH}Ie^{-itH^{\rm ext}}\chi(H^{\rm ext}) = \Omega^{{\rm ext},+}\chi(H^{\rm ext}). \tag{10.21}$$

Proof. Let us first show i). Let $u \in D((H+i)^{k/2}) \otimes \otimes_s^k\mathfrak{h}$. Since, by (3.7), $I(H+i)^{-k/2} \otimes 1_{\{k\}}(N_\infty)$ is a bounded operator, it suffices to prove i) for $u = \psi \otimes \prod_{i=1}^ka^*(h_i)\Omega$, $\psi \in D((H+i)^{k/2})$, $h_i \in \mathfrak{h}$. It follows from property (3.5) of $I$ that

$$e^{itH}Ie^{-itH^{\rm ext}}\,\psi \otimes \prod_{i=1}^ka^*(h_i)\Omega = e^{itH}\prod_{i=1}^ka^*(h_{i,t})e^{-itH}\psi.$$

i) follows then from Thm. 10.3 iv). To prove ii), we observe that, since the boson mass is positive, vectors in $\mathcal{H}_{\rm comp}(H^{\rm ext})$ are also in $\mathcal{H}_{\rm comp}(H)$ and in $\mathcal{H}_{\rm comp}(N_\infty)$. So ii) follows from i). □

11. Propagation Estimates

In this section we collect various propagation estimates about the evolution $e^{-itH}$, which will be used in the next section. It is essentially similar to [DG1, Sect. 6], the only difference being the control of the interaction term $V$, which is here much more singular. In all this section we assume hypothesis (Is) for $s > 1$. We will use the following notation for various Heisenberg derivatives:

$$D_0 = \frac{\partial}{\partial t} + [H_0, i\,\cdot\,], \text{ acting on } B(\Gamma(\mathfrak{h})), \qquad D = \frac{\partial}{\partial t} + [H, i\,\cdot\,], \text{ acting on } B(\mathcal{H}).$$

The following easy observation will be used to compute Heisenberg derivatives. It follows from the fact that $H = H_0 + V$ on $D(H^n)$.

Lemma 11.1. Let $\mathbb{R} \ni t \mapsto M(t) \in B(D(H), \mathcal{H})$ be of class $C^1$. Then, for $\chi \in C_0^\infty(\mathbb{R})$, we have

$$D\,\chi(H)M(t)\chi(H) = \chi(H)D_0M(t)\chi(H) + \chi(H)[V, iM(t)]\chi(H).$$
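The computation behind Lemma 11.1 is short enough to write out. This is a formal sketch in our own words: it uses only that $\chi(H)$ is time independent and commutes with $H$, and that $H = H_0 + V$ in the form sense on $D(H^n)$, which contains the range of $\chi(H)$ for $\chi \in C_0^\infty(\mathbb{R})$.

```latex
% Heisenberg derivative of the sandwiched observable:
D\big(\chi(H)M(t)\chi(H)\big)
  = \partial_t\big(\chi(H)M(t)\chi(H)\big) + \big[H, i\,\chi(H)M(t)\chi(H)\big]
  = \chi(H)\big(\partial_t M(t) + [H, iM(t)]\big)\chi(H)
% Split H = H_0 + V between the two cutoffs:
  = \chi(H)\big(\partial_t M(t) + [H_0, iM(t)]\big)\chi(H)
    + \chi(H)[V, iM(t)]\chi(H)
  = \chi(H)\,D_0M(t)\,\chi(H) + \chi(H)[V, iM(t)]\chi(H).
```

In the propagation estimates that follow, the first term is treated exactly as in the free case, and only the commutator with $V$ requires the decay hypothesis (Is).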

We first derive a standard large velocity estimate. It means that no boson can asymptotically propagate in the region |x| > t.

Proposition 11.2. Let $\chi \in C_0^\infty(\mathbb{R})$. For $R' > R > 1$, one has

\[ \int_1^\infty \Bigl\| d\Gamma\Bigl(1_{[R,R']}\bigl(\tfrac{|x|}{t}\bigr)\Bigr)^{1/2} \chi(H)\, e^{-itH}u \Bigr\|^2 \frac{dt}{t} \le C\|u\|^2. \]

Proof. Let $F \in C^\infty(\mathbb{R})$ be a cutoff function equal to 1 near $\infty$, to 0 near the origin, with $F'(s) \ge 1_{[R,R']}(s)$. The propagation observable is $P(t) = \chi(H)\, d\Gamma(F(\tfrac{|x|}{t}))\, \chi(H)$. The proof is identical to that of [DG1, Prop. 6.1], except for the term

\[ \chi(H)\bigl[V, i\,d\Gamma\bigl(F(\tfrac{|x|}{t})\bigr)\bigr]\chi(H), \]

coming from the application of Lemma 11.1. By Prop. 3.13 and Lemma 6.3, $[V, i\,d\Gamma(F(\tfrac{|x|}{t}))]$ is a sum of Wick monomials with symbols having an $L^2$ norm $O(t^{-s})$, $s > 1$, by condition (Is). Proposition 3.13 and the higher order estimates then imply $\chi(H)[V, i\,d\Gamma(F(\tfrac{|x|}{t}))]\chi(H) \in O(t^{-s})$, $s > 1$. Thus this term is integrable in norm. □

The following proposition contains a more subtle propagation estimate. Its intuitive meaning is that along the evolution of an asymptotically free boson the instantaneous velocity $\nabla\omega(k)$ and the average velocity $\tfrac{x}{t}$ converge to each other as time goes to $\infty$.

Proposition 11.3. Let $\chi \in C_0^\infty(\mathbb{R})$, $0 < c_0 < c_1$. Set

\[ N_{[c_0,c_1]}(t) := d\Gamma\Bigl(\bigl\langle \tfrac{x}{t} - \nabla\omega(k),\; 1_{[c_0,c_1]}\bigl(\tfrac{x}{t}\bigr)\bigl(\tfrac{x}{t} - \nabla\omega(k)\bigr)\bigr\rangle\Bigr). \]

Then

\[ \int_1^\infty \bigl\| N_{[c_0,c_1]}(t)^{1/2}\, \chi(H)\, e^{-itH}u \bigr\|^2 \frac{dt}{t} \le C\|u\|^2. \]

Proof. The propagation observable used to prove the proposition is of the form $P(t) = \chi(H)\, d\Gamma(b(t))\, \chi(H)$, for

\[ b(t) := R\bigl(\tfrac{x}{t}\bigr) - \tfrac{1}{2}\Bigl(\bigl\langle \nabla R\bigl(\tfrac{x}{t}\bigr),\; \tfrac{x}{t} - \nabla\omega(k)\bigr\rangle + \mathrm{hc}\Bigr), \]

with $|\partial_x^\alpha R(x)| \le C_\alpha$, $\operatorname{supp} R \subset \{|x| \ge \epsilon_0 > 0\}$. As above it suffices to estimate the term $\chi(H)[V, i\,d\Gamma(b(t))]\chi(H)$, the other terms in the Heisenberg derivative of $P(t)$ being similar to those in [DG1, Prop. 6.2]. By Prop. 3.13, $[V, i\,d\Gamma(b(t))]$ is a sum of Wick monomials with symbols $d\Gamma(b(t))w_{p,\infty}$, where $w_{p,\infty}$ is the kernel defined in (6.8). We then use pseudodifferential calculus, the fact that $\operatorname{supp} R \subset \{|x| \ge \epsilon_0\}$ and Lemma 6.3 to show that $d\Gamma(b(t))w_{p,\infty} \in O(t^{-s})$, $s > 1$. By Prop. 3.13 i) this implies that $\chi(H)[V, i\,d\Gamma(b(t))]\chi(H) \in O(t^{-s})$ and hence is integrable in norm. □

The following proposition is an improvement on Prop. 11.3.

Proposition 11.4. Let $0 < c_0 < c_1$, $J \in C_0^\infty(\{c_0 < |x| < c_1\})$, $\chi \in C_0^\infty(\mathbb{R})$. Then

\[ \int_1^{+\infty} \Bigl\| d\Gamma\Bigl(\bigl| J\bigl(\tfrac{x}{t}\bigr)\bigl(\tfrac{x}{t} - \partial\omega(k)\bigr) + \mathrm{hc} \bigr|\Bigr)^{1/2} \chi(H)\, e^{-itH}u \Bigr\|^2 \frac{dt}{t} < C\|u\|^2. \]

Proof. The proof is identical to [DG1, Prop. 6.3], using the argument in the proof of Prop. 11.3 to control the commutators with $V$. □

Note that Prop. 11.4 is still true if we replace $H$ by $H^{\mathrm{ext}}$ and $d\Gamma(b)$ by $d\Gamma(b) \otimes 1 + 1 \otimes d\Gamma(b)$, for $b = \bigl| J(\tfrac{x}{t})(\tfrac{x}{t} - \partial\omega(k)) + \mathrm{hc} \bigr|$.

The last propagation estimate of this section is the so-called minimal velocity estimate, based on the Mourre estimate shown in Subsect. 9.2. Since the conjugate operator is different from the one in [DG1], we will give a more detailed proof.

Proposition 11.5. Assume condition (B1). Let $\chi \in C_0^\infty(\mathbb{R})$ be supported in $\mathbb{R}\setminus(\tau \cup \sigma_{pp}(H))$. Then there exists $\epsilon > 0$ such that

\[ \int_1^{+\infty} \Bigl\| \Gamma\Bigl(1_{[0,\epsilon]}\bigl(\tfrac{|x|}{t}\bigr)\Bigr) \chi(H)\, e^{-itH}u \Bigr\|^2 \frac{dt}{t} \le C\|u\|^2. \]

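Proof. Let us first prove the proposition for $\chi$ supported near an energy level $\lambda \in \mathbb{R}\setminus(\tau \cup \sigma_{pp}(H))$. By Thm. 9.2, we can find $\chi \in C_0^\infty(\mathbb{R})$ equal to 1 near $\lambda$ such that for some $C_0 > 0$:

\[ \chi(H)[H, iA]\chi(H) \ge C_0\, \chi^2(H). \tag{11.1} \]

Let $\epsilon > 0$ be a parameter which will be fixed later. Let $q \in C_0^\infty(\{|x| \le 2\epsilon\})$, $0 \le q \le 1$, $q = 1$ near $\{|x| \le \epsilon\}$ and let $q^t = q(\tfrac{x}{t})$. We use the propagation observable

\[ P(t) := \chi(H)\,\Gamma(q^t)\,\frac{A}{t}\,\Gamma(q^t)\,\chi(H). \]

We fix cutoff functions $\tilde q \in C_0^\infty(\mathbb{R})$, $\tilde\chi \in C_0^\infty(\mathbb{R})$ such that $\operatorname{supp}\tilde q \subset \{|x| \le 4\epsilon\}$, $\tilde q q = q$, $\tilde\chi\chi = \chi$. Let us show the following estimate:

\[ \Bigl\| N^k \frac{A^m}{t^m}\, \Gamma(q^t)\chi(H) \Bigr\| \le C\epsilon^m + O(t^{-1}), \quad m = 1, 2. \tag{11.2} \]

First note that, by Lemma 3.2 iii), $A^{2m} \le N^{2m-1} d\Gamma(a^{2m})$. Next,

\[ \Gamma(q^t)\, d\Gamma(a^{2m})\, \Gamma(q^t) = d\Gamma\bigl((q^t)^2,\, q^t a^{2m} q^t\bigr) \le d\Gamma(q^t a^{2m} q^t), \]

\[ q^t a^{2m} q^t \le \epsilon^{2m} t^{2m} \omega^{2m}(k) + O(t^{2m-2})\,\omega(k). \tag{11.3} \]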

Therefore, $\Gamma(q^t)\, d\Gamma(a^{2m})\, \Gamma(q^t)$ is less than

\[ C\epsilon^{2m} t^{2m} d\Gamma(\omega^{2m}) + C t^{2m-2} d\Gamma(\omega) \le C\epsilon^{2m} t^{2m} d\Gamma(\omega)^{2m} + C t^{2m-2} d\Gamma(\omega). \]

Therefore,

\[ \Bigl\| N^k \frac{A^m}{t^m}\, \Gamma(q^t)\chi(H)u \Bigr\|^2 \le C\epsilon^{2m} \bigl\| N^{k+m-\frac12} H_0^m \chi(H)u \bigr\|^2 + C t^{-2} \bigl\| N^{k+m-\frac12} H_0 \chi(H)u \bigr\|^2. \]

Then we apply the higher order estimates. Now (11.2) implies the uniform boundedness of $P(t)$. Let us compute the Heisenberg derivative of $P(t)$. Using Lemma 11.1, we have, for $d_0 q^t = \partial_t q^t + [\omega, iq^t]$,

\[
\begin{aligned}
D P(t) ={}& \chi(H)\, d\Gamma(q^t, d_0 q^t)\, \tfrac{A}{t}\, \Gamma(q^t)\chi(H) + \mathrm{hc} \\
&+ \chi(H)[V, i\Gamma(q^t)]\, \tfrac{A}{t}\, \Gamma(q^t)\chi(H) + \mathrm{hc} \\
&+ t^{-1}\chi(H)\,\Gamma(q^t)[H, iA]\Gamma(q^t)\,\chi(H) \\
&- t^{-1}\chi(H)\,\Gamma(q^t)\, \tfrac{A}{t}\, \Gamma(q^t)\,\chi(H) \\
=:{}& R_1(t) + R_2(t) + R_3(t) + R_4(t).
\end{aligned}
\tag{11.4}
\]

We have used the fact that $\Gamma(q^t)$ preserves $D(H_0)$ and $D(N^n)$ to expand the commutator $[H, iP(t)]$ in (11.4). Let us first estimate $R_2(t)$. By Lemma 3.17 and Lemma 6.3, $[V, i\Gamma(q^t)] \in (N+1)^{-n} O_N(t^{-s})$, $s > 1$. Therefore, by (11.2),

\[ R_2(t) \in O(t^{-s}), \quad s > 1. \tag{11.5} \]

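We consider next $R_1(t)$. We have

\[ d_0 q^t = -\frac{1}{2t}\Bigl(\bigl\langle \tfrac{x}{t} - \nabla\omega(k),\; \nabla q\bigl(\tfrac{x}{t}\bigr)\bigr\rangle + \mathrm{hc}\Bigr) + r^t =: \frac{1}{t}\, g^t + r^t, \]

where $r^t \in O(t^{-2})$. By the higher order estimates (7.1), $\chi(H)\, d\Gamma(q^t, r^t) \in O(t^{-2})$, which, using (11.2), yields

\[ \chi(H)\, d\Gamma(q^t, r^t)\, \frac{A}{t}\, \Gamma(q^t)\chi(H) \in O(t^{-2}). \]

Then we set

\[ B_1 := \chi(H)\, d\Gamma(q^t, g^t)(N+1)^{-\frac12}, \qquad B_2^* := (N+1)^{\frac12}\, \frac{A}{t}\, \Gamma(q^t)\chi(H), \]

and use the inequality

\[ \chi(H)\, d\Gamma(q^t, g^t)\, \frac{A}{t}\, \Gamma(q^t)\chi(H) + \mathrm{hc} = t^{-1}B_1B_2^* + t^{-1}B_2B_1^* \ge -t^{-1}B_1B_1^* - t^{-1}B_2B_2^*. \tag{11.6} \]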
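The operator inequality used in (11.6) is the usual consequence of positivity of a square. A minimal sketch, treating $B_1$, $B_2$ as bounded operators:

```latex
% Sketch for bounded operators B_1, B_2 (the symbols are those of (11.6)).
\begin{align*}
0 &\le (B_1 + B_2)(B_1 + B_2)^*
   = B_1B_1^* + B_1B_2^* + B_2B_1^* + B_2B_2^*, \\
\text{hence}\quad B_1B_2^* + B_2B_1^* &\ge -B_1B_1^* - B_2B_2^* .
\end{align*}
% Multiplying by t^{-1} > 0 gives the right-hand side of (11.6).
```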

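We have

\[
\begin{aligned}
-B_2B_2^* &= -\chi(H)\,\Gamma(q^t)\, \frac{A^2}{t^2}(N+1)\, \Gamma(q^t)\,\chi(H) \\
&= -\chi(H)\,\Gamma(q^t)\,\tilde\chi(H)\,\Gamma(\tilde q^t)\, \frac{A^2}{t^2}(N+1)\, \Gamma(\tilde q^t)\,\tilde\chi(H)\,\Gamma(q^t)\,\chi(H) + O(t^{-1}) \\
&\ge -\epsilon^2 C_1\, \chi(H)\,\Gamma^2(q^t)\,\chi(H) + O(t^{-1}),
\end{aligned}
\tag{11.7}
\]

where we used Lemma 7.11 and the boundedness of $\frac{A^2}{t^2}(N+1)\Gamma(q^t)\chi(H)$ in the first step and the following estimate analogous to (11.2):

\[ \Bigl\| \tilde\chi(H)\,\Gamma(\tilde q^t)\, \frac{A^2}{t^2}(N+1)\, \Gamma(\tilde q^t)\,\tilde\chi(H) \Bigr\| \le C_1\epsilon^2 + O(t^{-2}) \]

in the second step. Next we use Lemma 3.4 v),

\[ \bigl\| (N+1)^{-\frac12}\, d\Gamma(q^t, g^t)u \bigr\| \le \bigl\| d\Gamma(g^{t*}g^t)^{\frac12}u \bigr\|, \quad u \in \mathcal{H}, \]

to obtain

\[ |(u, B_1^*B_1 u)| = \bigl\| (N+1)^{-\frac12}\, d\Gamma(q^t, g^t)\chi(H)u \bigr\|^2 \le \bigl\| d\Gamma(g^{t*}g^t)^{\frac12}\chi(H)u \bigr\|^2, \quad u \in \mathcal{H}. \]

Using Prop. 11.3, we obtain

\[ \int_1^{+\infty} \bigl\| B_1 e^{-itH}u \bigr\|^2 \frac{dt}{t} \le C\|u\|^2. \tag{11.8} \]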

Next we use Lemma 7.11 to write

\[
\begin{aligned}
R_3(t) &= t^{-1}\,\Gamma(q^t)\,\chi(H)[H, iA]\chi(H)\,\Gamma(q^t) + O(t^{-2}) \\
&\ge C_0\, t^{-1}\,\Gamma(q^t)\,\chi^2(H)\,\Gamma(q^t) - Ct^{-2} \\
&\ge C_0\, t^{-1}\,\chi(H)\,\Gamma^2(q^t)\,\chi(H) - Ct^{-2}.
\end{aligned}
\tag{11.9}
\]

It remains to estimate $R_4(t)$. We have

\[
\begin{aligned}
R_4(t) &= -t^{-1}\chi(H)\,\Gamma(q^t)\, \frac{A}{t}\, \Gamma(q^t)\,\chi(H) \\
&= -t^{-1}\chi(H)\,\Gamma(q^t)\,\tilde\chi(H)\,\Gamma(\tilde q^t)\, \frac{A}{t}\, \Gamma(\tilde q^t)\,\tilde\chi(H)\,\Gamma(q^t)\,\chi(H) \\
&\ge -\epsilon C_2\, t^{-1}\chi(H)\,\Gamma(q^t)^2\,\chi(H) + O(t^{-2}).
\end{aligned}
\tag{11.10}
\]

Collecting (11.7), (11.9) and (11.10), we obtain

\[ -t^{-1}B_2B_2^* + R_3(t) + R_4(t) \ge \bigl(-\epsilon^2 C_1 + C_0 - \epsilon C_2\bigr)\, t^{-1}\chi(H)\,\Gamma(q^t)^2\,\chi(H) - Ct^{-2}. \tag{11.11} \]

We now pick $\epsilon$ small enough so that $\tilde C_0 = -\epsilon^2 C_1 + C_0 - \epsilon C_2 > 0$. Using (11.5), (11.8) and (11.11), we conclude that

\[ D P(t) \ge \frac{\tilde C_0}{t}\, \chi(H)\,\Gamma^2(q^t)\,\chi(H) - R(t) - Ct^{-s}, \quad s > 1, \]

where $R(t)$ is integrable along the evolution. By the standard argument, this proves the proposition for $\chi$ with support close enough to an energy level $\lambda \in \mathbb{R}\setminus(\tau \cup \sigma_{pp}(H))$. To prove the proposition for all $\chi$ supported in $\mathbb{R}\setminus(\tau \cup \sigma_{pp}(H))$, we argue as in [DG2, Prop. 4.4.7]. □

12. Asymptotic Completeness

12.1. Existence of asymptotic localizations.

Theorem 12.1. Assume hypothesis (Is), $s > 1$. Let $q \in C_0^\infty(\mathbb{R})$, $0 \le q \le 1$, $q = 1$ on a neighborhood of zero. Set $q^t(x) = q(\tfrac{x}{t})$. Then there exists

\[ \text{s-}\lim_{t\to\infty} e^{itH}\,\Gamma(q^t)\,e^{-itH} =: \Gamma^+(q). \tag{12.1} \]

We have

\[ \Gamma^+(q\tilde q) = \Gamma^+(q)\,\Gamma^+(\tilde q), \tag{12.2} \]

\[ 0 \le \Gamma^+(q) \le \Gamma^+(\tilde q) \le 1, \quad \text{if } 0 \le q \le \tilde q \le 1, \tag{12.3} \]

\[ [H, \Gamma^+(q)] = 0. \tag{12.4} \]

Proof. Let us first prove the existence of (12.1). Using Lemma 7.11 and a density argument, it suffices to prove the existence of

\[ \text{s-}\lim_{t\to+\infty} e^{itH}\,\chi(H)\,\Gamma(q^t)\,\chi(H)\,e^{-itH}, \]

for $\chi \in C_0^\infty(\mathbb{R})$. We compute the Heisenberg derivative

\[ \chi(H)\, D\Gamma(q^t)\, \chi(H) = \chi(H)\, d\Gamma(q^t, d_0 q^t)\, \chi(H) + \chi(H)[V, i\Gamma(q^t)]\chi(H), \]

by Lemma 11.1. From Lemma 3.17, Lemma 6.3 and hypothesis (Is), we obtain

\[ \chi(H)[V, i\Gamma(q^t)]\chi(H) \in O(t^{-s}). \tag{12.5} \]

Next we compute

\[ d_0 q^t = \frac{1}{t}\, g^t + r^t, \]

where

\[ g^t = -\tfrac{1}{2}\Bigl(\bigl(\tfrac{x}{t} - \partial\omega(k)\bigr)\,\partial q\bigl(\tfrac{x}{t}\bigr) + \mathrm{hc}\Bigr) \]

and $r^t \in O(t^{-2})$. Using Lemma 3.4 v) and the higher order estimates, we obtain that

\[ \chi(H)\, d\Gamma(q^t, r^t)\, \chi(H) \in O(t^{-2}). \tag{12.6} \]

On the other hand, by Lemma 3.4 v), we have

\[ |(u\,|\,\chi(H)\, d\Gamma(q^t, g^t)\, \chi(H)u)| \le \bigl\| d\Gamma(|g^t|)^{\frac12}\chi(H)u \bigr\|^2. \tag{12.7} \]

Hence the existence of the limit (12.1) follows from (12.5)–(12.7), Proposition 11.4 and Lemma A.1. Equation (12.4) follows by Lemma 7.11. (12.2) follows from $\Gamma(q^t\tilde q^t) = \Gamma(q^t)\Gamma(\tilde q^t)$. □

An analogous theorem is true for the free Hamiltonian, but it is much easier. It follows within each $n$-particle sector by the stationary phase method. Note that in the free case one does not need to assume that the cutoff function $q$ is one at zero.

Proposition 12.2. Let $q \in C_0^\infty(\mathbb{R})$, $0 \le q \le 1$. Then

\[ \text{s-}\lim_{t\to\infty} e^{it\,d\Gamma(\omega)}\,\Gamma(q^t)\,e^{-it\,d\Gamma(\omega)} = \Gamma(q(\nabla\omega)). \]
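The free statement can be made plausible by a one-particle computation. The sketch below is formal and assumes the momentum representation convention $x = i\nabla_k$, which is not spelled out in this section; it illustrates why the limit is $q(\nabla\omega)$ and is not a substitute for the stationary phase argument.

```latex
% One-particle sketch in the momentum representation, assuming x = i\nabla_k.
\begin{align*}
e^{it\omega(k)}\, x\, e^{-it\omega(k)} &= x + t\,\nabla\omega(k), \\
e^{it\omega(k)}\, \frac{x}{t}\, e^{-it\omega(k)} &= \nabla\omega(k) + \frac{x}{t}
  \;\longrightarrow\; \nabla\omega(k) \quad (t \to \infty,\ \text{on } D(x)), \\
\text{so formally}\quad
e^{it\omega}\, q\!\left(\tfrac{x}{t}\right) e^{-it\omega} &\longrightarrow q(\nabla\omega)
  \quad\text{strongly.}
\end{align*}
% Since e^{it\,d\Gamma(\omega)}\,\Gamma(q^t)\,e^{-it\,d\Gamma(\omega)}
%   = \Gamma\bigl(e^{it\omega} q^t e^{-it\omega}\bigr),
% applying this in each n-particle sector suggests the limit \Gamma(q(\nabla\omega)).
```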

12.2. Projection $P_0^+$.

Theorem 12.3. Let $\{q_n\} \subset C_0^\infty(\mathbb{R})$ be a decreasing sequence of functions such that $0 \le q_n \le 1$, $q_n = 1$ on a neighborhood of 0 and $\cap_{n=1}^\infty \operatorname{supp} q_n = \{0\}$. Then

\[ P_0^+ := \text{s-}\lim_{n\to\infty} \Gamma^+(q_n) \quad \text{exists.} \tag{12.8} \]

$P_0^+$ does not depend on the choice of the sequence $\{q_n\}$. It is an orthogonal projection satisfying $[H, P_0^+] = 0$. Besides,

\[ \operatorname{Ran} P_0^+ \subset \mathcal{K}^+. \tag{12.9} \]

The range of $P_0^+$ can be interpreted as the space of states asymptotically containing no bosons away from the origin.

Proof. The existence of $P_0^+$ and the fact that it is a projection follow from (12.2), (12.3) and Lemma A.3. To show that $P_0^+$ does not depend on the choice of $\{q_n\}$, we pick two sequences $\{q_n\}$, $\{\tilde q_n\}$. There exists, for each $n \in \mathbb{N}$, an index $m_n$ such that $q_n \ge \tilde q_{m_n}$, $\tilde q_n \ge q_{m_n}$. Hence, by (12.3), we see that

\[ \text{s-}\lim_{n\to\infty} \Gamma^+(q_n) = \text{s-}\lim_{n\to\infty} \Gamma^+(\tilde q_n). \]

The fact that $[H, P_0^+] = 0$ follows from (12.4). Let us now show (12.9). We know that $\operatorname{Ran} P_0^+$ is invariant with respect to $H$, hence $D(H) \cap \operatorname{Ran} P_0^+$ is dense in $\operatorname{Ran} P_0^+$. Besides, $\mathcal{K}^+$ is closed. Thus it is enough to show that $D(H) \cap \operatorname{Ran} P_0^+ \subset \mathcal{K}^+$. Let $u \in D(H) \cap \operatorname{Ran} P_0^+$. We are going to show that

\[ (H+b)^{-\frac12}\, a^+(h)u = 0, \quad h \in \mathfrak{h}, \tag{12.10} \]

which will imply $u \in \mathcal{K}^+$. By the continuity of $\mathfrak{h} \ni h \mapsto (H+b)^{-\frac12} a^+(h)$, it is enough to assume that $h \in \mathfrak{h}_0$. By stationary phase arguments, we may choose $q \in C_0^\infty(\mathbb{R})$ with $0 \le q \le 1$, $q(0) = 1$ and $\operatorname{supp} q$ contained in a sufficiently small neighborhood of 0 so that $q^t h_t \in o(1)$. Then,

\[ u = \lim_{t\to\infty} e^{itH}\,\Gamma(q^t)\,e^{-itH}u, \]

\[ (H+b)^{-\frac12}\, a^+(h) = \text{s-}\lim_{t\to\infty} e^{itH}(H+b)^{-\frac12}\, a(h_t)\, e^{-itH}. \]

Hence,

\[
\begin{aligned}
(H+b)^{-\frac12}\, a^+(h)u
&= \lim_{t\to\infty} e^{itH}(H+b)^{-\frac12}\, a(h_t)\,\Gamma(q^t)(H+b)^{-\frac12}\, e^{-itH}(H+b)^{\frac12}u \\
&= \lim_{t\to\infty} e^{itH}(H+b)^{-\frac12}\,\Gamma(q^t)\, a(q^t h_t)(H+b)^{-\frac12}\, e^{-itH}(H+b)^{\frac12}u.
\end{aligned}
\tag{12.11}
\]

But since $q^t h_t \in o(1)$, $a(q^t h_t)(H+b)^{-\frac12} \in o(1)$ and therefore (12.11) vanishes. □

12.3. Geometric inverse wave operators. Let $j_0 \in C_0^\infty(\mathbb{R})$, $j_\infty \in C^\infty(\mathbb{R})$, $0 \le j_0$, $0 \le j_\infty$, $j_0^2 + j_\infty^2 \le 1$, $j_0 = 1$ near 0 (and hence $j_\infty = 0$ near 0). Set $j := (j_0, j_\infty)$. Set also $j^t = (j_0^t, j_\infty^t)$, where $j_0^t(x) = j_0(\tfrac{x}{t})$, $j_\infty^t(x) = j_\infty(\tfrac{x}{t})$. As in Subsect. 3.10, we introduce the operator $I(j^t) : \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h})$.

Theorem 12.4. Assume hypothesis (Is) for $s > 1$.
i) The following limits exist:

\[ \text{s-}\lim_{t\to+\infty} e^{itH^{\mathrm{ext}}}\, I^*(j^t)\, e^{-itH}, \tag{12.12} \]

\[ \text{s-}\lim_{t\to+\infty} e^{itH}\, I(j^t)\, e^{-itH^{\mathrm{ext}}}. \tag{12.13} \]

If we denote (12.12) by $W^+(j)$, then (12.13) equals $W^+(j)^*$.
ii) For a bounded Borel function $F$ one has $W^+(j)F(H) = F(H^{\mathrm{ext}})W^+(j)$.
iii) Let $q_0, q_\infty \in C^\infty(\mathbb{R})$, $\nabla q_0, \nabla q_\infty \in C_0^\infty(\mathbb{R})$, $0 \le q_0, q_\infty \le 1$, $q_0 = 1$ near 0. Set $\tilde j := (\tilde j_0, \tilde j_\infty) := (q_0 j_0, q_\infty j_\infty)$. Then

\[ \Gamma^+(q_0) \otimes \Gamma(q_\infty(\nabla\omega))\, W^+(j) = W^+(\tilde j). \]

iv) Let $q \in C^\infty(\mathbb{R})$, $\nabla q \in C_0^\infty(\mathbb{R})$, $0 \le q \le 1$, $q = 1$ near 0. Then $W^+(j)\Gamma^+(q) = W^+(qj)$, where $qj = (qj_0, qj_\infty)$.
v) Let $\tilde j = (\tilde j_0, \tilde j_\infty)$ be another pair satisfying the conditions stated at the beginning of this subsection. (Note that $\tilde j_0 j_0 + \tilde j_\infty j_\infty \le 1$ and $\tilde j_0 j_0 = 1$ near zero.) Then

\[ W^+(\tilde j)^* W^+(j) = \Gamma^+(\tilde j_0 j_0 + \tilde j_\infty j_\infty). \]

In particular, if $j_0^2 + j_\infty^2 = 1$, then $W^+(j)$ is isometric.
vi) Let $j_0 + j_\infty = 1$. If $\chi \in C_0^\infty(\mathbb{R})$, then

\[ \Omega^{\mathrm{ext},+}\chi(H^{\mathrm{ext}})W^+(j) = \chi(H). \]

Proof. Let us first prove the existence of the limit (12.12), the case of (12.13) being similar. Using Lemma 7.12 and a density argument, it suffices to prove the existence of

\[ \text{s-}\lim_{t\to\infty} e^{itH^{\mathrm{ext}}}\,\chi(H^{\mathrm{ext}})\, I^*(j^t)\,\chi(H)\, e^{-itH}, \]

for some $\chi \in C_0^\infty(\mathbb{R})$. We compute the asymmetric Heisenberg derivative

\[ \chi(H^{\mathrm{ext}})\,\check{D} I^*(j^t)\,\chi(H) = \chi(H^{\mathrm{ext}})\,\check{D}_0 I^*(j^t)\,\chi(H) + i\chi(H^{\mathrm{ext}})\bigl(V \otimes 1\, I^*(j^t) - I^*(j^t)V\bigr)\chi(H). \]

From Lemma 3.17, Lemma 6.3 and hypothesis (Is), we obtain

\[ \chi(H^{\mathrm{ext}})\bigl(V \otimes 1\, I^*(j^t) - I^*(j^t)V\bigr)\chi(H) \in O(t^{-s}). \tag{12.14} \]

On the other hand by Lemma 3.11, we have $\check{D}_0 I^*(j^t) = dI^*(j^t, \check{d}_0 j^t)$, and, by pseudodifferential calculus,

\[ \check{d}_0 j^t = \frac{1}{t}\, k^t + r^t, \]

where

\[ k^t = (k_0^t, k_\infty^t), \qquad k_\epsilon^t = -\tfrac{1}{2}\Bigl(\bigl(\tfrac{x}{t} - \partial\omega(k)\bigr)\,\partial j_\epsilon\bigl(\tfrac{x}{t}\bigr) + \mathrm{hc}\Bigr), \quad \epsilon = 0, \infty, \]

and $r^t \in O(t^{-2})$. Using Lemma 3.11 v) and the higher order estimates we obtain

\[ \chi(H^{\mathrm{ext}})\, dI^*(j^t, r^t)\, \chi(H) \in O(t^{-2}). \tag{12.15} \]

Using then Lemma 3.11 iv), we obtain

\[
\begin{aligned}
|(u_2\,|\,\chi(H^{\mathrm{ext}})\, dI^*(j^t, k^t)\, \chi(H)u_1)|
\le{}& \bigl\| \bigl(d\Gamma(|k_0^t|)^{\frac12} \otimes 1\bigr)\chi(H^{\mathrm{ext}})u_2 \bigr\|\, \bigl\| d\Gamma(|k_0^t|)^{\frac12}\chi(H)u_1 \bigr\| \\
&+ \bigl\| \bigl(1 \otimes d\Gamma(|k_\infty^t|)^{\frac12}\bigr)\chi(H^{\mathrm{ext}})u_2 \bigr\|\, \bigl\| d\Gamma(|k_\infty^t|)^{\frac12}\chi(H)u_1 \bigr\|.
\end{aligned}
\tag{12.16}
\]

Hence, the existence of the limit (12.12) follows from (12.14)–(12.16), Proposition 11.4 and Lemma A.2. ii) follows from Lemma 7.12. iii) follows from Prop. 12.2 and the fact that

\[ \Gamma(q_0^t) \otimes \Gamma(q_\infty^t)\, I^*(j^t) = I^*(\tilde j^t). \]

iv) follows from

\[ I^*(j^t)\,\Gamma(q^t) = I^*((jq)^t). \]

v) follows from

\[ I(\tilde j^t)\, I^*(j^t) = \Gamma(\tilde j_0^t j_0^t + \tilde j_\infty^t j_\infty^t). \]

Up to technical details due to the unboundedness of $I$, vi) can be considered as a special case of v) with $\tilde j = (1, 1)$. To prove vi) we note that $H^{\mathrm{ext}}\, 1_{[k,\infty[}(N_\infty) \ge mk + E_0$, where $E_0 = \inf\sigma(H)$. Hence, for $\chi \in C_0^\infty(\mathbb{R})$, we can find $n \in \mathbb{N}$ such that

\[ \chi(H^{\mathrm{ext}})\, 1_{]n,\infty[}(N_\infty) = 0. \tag{12.17} \]
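The lower bound behind (12.17) can be sketched as follows. The sketch assumes the usual conventions $H^{\mathrm{ext}} = H \otimes 1 + 1 \otimes d\Gamma(\omega)$, $N_\infty = 1 \otimes N$ and $\omega \ge m > 0$; these are defined earlier in the paper and are restated here only as assumptions.

```latex
% Sketch, assuming H^{ext} = H \otimes 1 + 1 \otimes d\Gamma(\omega),
% N_\infty = 1 \otimes N and \omega \ge m > 0.
\begin{align*}
d\Gamma(\omega) &\ge m\,N, \qquad H \ge E_0 := \inf\sigma(H), \\
H^{\mathrm{ext}}\,1_{[k,\infty[}(N_\infty)
  &\ge \bigl(E_0 + m\,N_\infty\bigr)\,1_{[k,\infty[}(N_\infty)
   \ge (E_0 + mk)\,1_{[k,\infty[}(N_\infty).
\end{align*}
% If supp \chi \subset\, ]-\infty, \lambda_0] and n is chosen with m(n+1) + E_0 > \lambda_0,
% then \chi(H^{ext})\,1_{]n,\infty[}(N_\infty) = 0, which is (12.17).
```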

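Therefore,

\[
\begin{aligned}
\Omega^{\mathrm{ext},+}\chi(H^{\mathrm{ext}})W^+(j)
&= \Omega^{\mathrm{ext},+}\, 1_{[0,n]}(N_\infty)\,\chi(H^{\mathrm{ext}})W^+(j) && (1) \\
&= \text{s-}\lim_{t\to\infty} e^{itH}\, I\, 1_{[0,n]}(N_\infty)\,\chi(H^{\mathrm{ext}})\, I^*(j^t)\, e^{-itH} && (2) \\
&= \text{s-}\lim_{t\to\infty} e^{itH}\, I\, 1_{[0,n]}(N_\infty)\, I^*(j^t)\, e^{-itH}\chi(H), && (3)
\end{aligned}
\tag{12.18}
\]

using (12.17) in Step (1), Thm. 10.7 ii) and Thm. 12.4 i) in Step (2), and Lemma 7.12 and the boundedness of $I\, 1_{[0,n]}(N_\infty)(N_0)^{-n}$ in Step (3). Next we claim that

\[ \bigl\| I\, 1_{]n,\infty[}(N_\infty)\, I^*(j^t)(N+1)^{-1} \bigr\| \le C(n+1)^{-1}. \tag{12.19} \]

In fact the operator

\[ I\, 1_{]n,\infty[}(N_\infty)\, I^*(j^t) = \Gamma(i)\, 1_{]n,\infty[}\Bigl(d\Gamma\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\Bigr)\Gamma(j^{t*}) \]

commutes with $N$. On $\otimes_s^m \mathfrak{h}$, it can be written as

\[ \sum_{\#\{i \,|\, \epsilon_i = \infty\} > n} j^t_{\epsilon_1} \otimes \cdots \otimes j^t_{\epsilon_m}, \]

where the indices $\epsilon_i$ take the values $0, \infty$. This explicit expression and the fact that $j_0 + j_\infty = 1$ imply

\[ \bigl\| I\, 1_{]n,\infty[}(N_\infty)\, I^*(j^t) \bigr\| \le 1, \qquad I\, 1_{]n,\infty[}(N_\infty)\, I^*(j^t)\, 1_{[0,n]}(N) = 0, \]

which yields (12.19). Hence,

\[ \lim_{n\to\infty} \limsup_{t\to\infty} \bigl\| e^{itH}\, I\, 1_{]n,\infty[}(N_\infty)\, I^*(j^t)\, e^{-itH}\chi(H) \bigr\| = 0. \]

Since $n$ can be chosen arbitrarily big and $I I^*(j^t) = 1$, (12.18) equals $\chi(H)$. □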

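12.4. Geometric asymptotic completeness. In this subsection we will show that

\[ \operatorname{Ran} P_0^+ = \mathcal{K}^+. \]

We call this property geometric asymptotic completeness. It will be convenient to work in the extended space $\mathcal{H}^{\mathrm{ext}}$ and to treat $\Omega^+$ as a partial isometry $\Omega^+ : \mathcal{H}^{\mathrm{ext}} \to \mathcal{H}$, as explained in Subsect. 10.4. We will give an explicit construction of the inverse wave operator $\Omega^{+*}$ in terms of the geometric inverse wave operators $W^+(j)$.

Theorem 12.5. Assume hypothesis (Is) for $s > 1$. Let $j_n = (j_{0,n}, j_{\infty,n})$ satisfy the conditions of Subsect. 12.3. Additionally, assume that $j_{0,n} + j_{\infty,n} = 1$, and that for any $\epsilon > 0$, there exists $m$ such that, for $n > m$, $\operatorname{supp} j_{0,n} \subset [-\epsilon, \epsilon]$. Then,

\[ \Omega^{+*} = \text{w-}\lim_{n\to\infty} W^+(j_n). \]

Besides,

\[ \mathcal{K}^+ = \operatorname{Ran} P_0^+. \]

Proof. Let $q \in C_0^\infty(\mathbb{R})$, $q = 1$ in a neighborhood of 0, $0 \le q \le 1$. For sufficiently big $n$ we have $qj_{0,n} = j_{0,n}$. Therefore, for sufficiently big $n$, by Thm. 12.4 iii),

\[ (\Gamma^+(q) \otimes 1)W^+(j_n) - W^+(j_n) = 0. \]

Hence,

\[ \text{w-}\lim_{n\to\infty} \bigl( P_0^+ \otimes 1\, W^+(j_n) - W^+(j_n) \bigr) = 0. \tag{12.20} \]

Let now $u \in \mathcal{H}$, $\chi \in C_0^\infty(\mathbb{R})$. We have

\[
\begin{aligned}
\Omega^{+*}\chi(H)
&= \Omega^{+*}\,\Omega^{\mathrm{ext},+}\chi(H^{\mathrm{ext}})W^+(j_n) && (1) \\
&= \text{w-}\lim_{n\to\infty} \Omega^{+*}\,\Omega^{\mathrm{ext},+}\chi(H^{\mathrm{ext}})W^+(j_n) && (2) \\
&= \text{w-}\lim_{n\to\infty} \Omega^{+*}\,\Omega^{\mathrm{ext},+}\chi(H^{\mathrm{ext}})\, P_0^+ \otimes 1\, W^+(j_n) && (3) \\
&= \text{w-}\lim_{n\to\infty} P_0^+ \otimes 1\, \chi(H^{\mathrm{ext}})W^+(j_n) && (4) \\
&= \text{w-}\lim_{n\to\infty} P_0^+ \otimes 1\, W^+(j_n)\chi(H) && (5) \\
&= \text{w-}\lim_{n\to\infty} W^+(j_n)\chi(H). && (6)
\end{aligned}
\]

We used Theorem 12.4 vi) in Step (1); Step (2) is obvious, since we just added $\text{w-}\lim_{n\to\infty}$ to a constant sequence; (12.20) was used in Step (3) (note that $P_0^+ \otimes 1$ commutes with $\chi(H^{\mathrm{ext}})$); in Step (4) we used $\mathcal{K}^+ \supset \operatorname{Ran} P_0^+$, $\Omega^{\mathrm{ext},+}\, 1_{\mathcal{K}^+} \otimes 1 = \Omega^+$, $\Omega^{+*}\Omega^+ = 1_{\mathcal{K}^+} \otimes 1$; in Step (5) we used Theorem 12.4 ii); finally in Step (6) we used again (12.20). The arbitrariness of $\chi \in C_0^\infty(\mathbb{R})$ and a density argument imply

\[ \Omega^{+*} = \text{w-}\lim_{n\to\infty} W^+(j_n). \]

Therefore, by (12.20),

\[ (P_0^+ \otimes 1)\,\Omega^{+*} = \Omega^{+*}, \]

i.e.

\[ \operatorname{Ran}\Omega^{+*} \subset \operatorname{Ran} P_0^+ \otimes \Gamma(\mathfrak{h}) \subset \mathcal{K}^+ \otimes \Gamma(\mathfrak{h}). \]

But, by construction,

\[ \operatorname{Ran}\Omega^{+*} = \mathcal{K}^+ \otimes \Gamma(\mathfrak{h}). \]

Hence, $\mathcal{K}^+ \otimes \Gamma(\mathfrak{h}) = \operatorname{Ran} P_0^+ \otimes \Gamma(\mathfrak{h})$, and therefore $\mathcal{K}^+ = \operatorname{Ran} P_0^+$. □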

12.5. Asymptotic completeness. In this subsection, we will prove Thm. 10.6.

Proof of Thm. 10.6. By Proposition 10.4 and geometric asymptotic completeness, we already know that

\[ \mathcal{H}_{pp}(H) \subset \mathcal{K}^+ = \operatorname{Ran} P_0^+. \]

It remains to prove that $P_0^+ \le 1_{pp}(H)$. Let $\chi \in C_0^\infty(\mathbb{R}\setminus(\tau \cup \sigma_{pp}(H)))$. We deduce from Prop. 11.5 in Sect. 11 that there exists $\epsilon > 0$ such that, for $q \in C_0^\infty([-\epsilon, \epsilon])$ with $q(x) = 1$ for $|x| < \epsilon/2$, we have

\[ \int_1^{+\infty} \bigl\| \Gamma(q^t)\chi(H)\, e^{-itH}u \bigr\|^2 \frac{dt}{t} \le c\|u\|^2. \]

Since $\|\Gamma(q^t)\chi(H)\, e^{-itH}u\| \to \|\Gamma^+(q)\chi(H)u\|$, we have $\Gamma^+(q)\chi(H) = 0$. This implies that

\[ P_0^+ \le 1_{\tau \cup \sigma_{pp}}(H). \]

Since $\tau$ is a closed countable set and $\sigma_{pp}(H)$ can accumulate only at $\tau$, we see that $1_{pp}(H) = 1_{\tau \cup \sigma_{pp}}(H)$. This completes the proof of the theorem. □
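The step from the integrability bound to the vanishing of the asymptotic observable in the proof above relies on the divergence of $\int dt/t$. A short sketch of that step, with $f$ and $L$ introduced here only as shorthand:

```latex
% Shorthand (not from the paper): f(t) := \|\Gamma(q^t)\chi(H)e^{-itH}u\|,
% L := \lim_{t\to\infty} f(t) = \|\Gamma^+(q)\chi(H)u\|.
% Suppose L > 0. Then there is T \ge 1 with f(t) \ge L/2 for all t \ge T, so
\[
\int_1^{\infty} f(t)^2\,\frac{dt}{t}
 \;\ge\; \frac{L^2}{4}\int_T^{\infty}\frac{dt}{t} \;=\; +\infty ,
\]
% contradicting \int_1^\infty f(t)^2\,dt/t \le c\|u\|^2. Hence L = 0 for every u,
% that is, \Gamma^+(q)\chi(H) = 0.
```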

A. Appendix

The following lemma describes an argument commonly used to prove the so-called propagation estimates (see [DG1, Sect. 8.4] and references therein).

Lemma A.1. Let $H$ be a self-adjoint operator and $D$ the corresponding Heisenberg derivative

\[ D := \frac{d}{dt} + i[H, \cdot\,]. \]

Suppose that $P(t)$ is a uniformly bounded family of self-adjoint operators. Suppose that there exist $C_0 > 0$ and operator valued functions $B(t)$ and $B_i(t)$, $i = 1, \dots, n$, such that

\[ D P(t) \ge C_0 B^*(t)B(t) - \sum_{i=1}^{n} B_i^*(t)B_i(t), \]

\[ \int_1^\infty \bigl\| B_i(t)\, e^{-itH}\phi \bigr\|^2 dt \le C\|\phi\|^2, \quad i = 1, \dots, n. \]

Then there exists $C_1$ such that

\[ \int_1^\infty \bigl\| B(t)\, e^{-itH}\phi \bigr\|^2 dt \le C_1\|\phi\|^2. \tag{A.1} \]
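The standard argument behind Lemma A.1 can be sketched in a few lines. The sketch is formal (domains are ignored), and the names $\phi_t$ and $M$ are introduced here for convenience only.

```latex
% Formal sketch with \phi_t := e^{-itH}\phi and M := \sup_t \|P(t)\| (domains ignored).
\begin{align*}
\frac{d}{dt}\,(\phi_t, P(t)\phi_t) &= (\phi_t, D P(t)\,\phi_t)
  \;\ge\; C_0\,\|B(t)\phi_t\|^2 - \sum_{i=1}^{n}\|B_i(t)\phi_t\|^2 , \\
C_0\int_1^{T}\|B(t)\phi_t\|^2\,dt
  &\le (\phi_T, P(T)\phi_T) - (\phi_1, P(1)\phi_1)
     + \sum_{i=1}^{n}\int_1^{T}\|B_i(t)\phi_t\|^2\,dt \\
  &\le (2M + nC)\,\|\phi\|^2 .
\end{align*}
% Letting T \to \infty gives (A.1) with C_1 = (2M + nC)/C_0.
```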

Next we describe how one uses propagation estimates to prove the existence of asymptotic observables.

Lemma A.2. Let $H_1$ and $H_2$ be two self-adjoint operators. Let ${}_2D_1$ be the corresponding asymmetric Heisenberg derivative:

\[ {}_2D_1 P(t) := \frac{d}{dt}P(t) + iH_2P(t) - iP(t)H_1. \]

Suppose that $P(t)$ is a uniformly bounded function with values in self-adjoint operators. Let $\mathcal{D}_1 \subset \mathcal{H}$ be a dense subspace. Assume that

\[ |(\psi_2\,|\,{}_2D_1 P(t)\psi_1)| \le \sum_{i=1}^{n} \|B_{2i}(t)\psi_2\|\, \|B_{1i}(t)\psi_1\|, \]

\[ \int_1^\infty \bigl\| B_{2i}(t)\, e^{-itH_2}\phi \bigr\|^2 dt \le C\|\phi\|^2, \quad \phi \in \mathcal{H}, \ i = 1, \dots, n, \]

\[ \int_1^\infty \bigl\| B_{1i}(t)\, e^{-itH_1}\phi \bigr\|^2 dt \le C\|\phi\|^2, \quad \phi \in \mathcal{D}_1, \ i = 1, \dots, n. \]

Then the limit

\[ \text{s-}\lim_{t\to\infty} e^{itH_2}P(t)\, e^{-itH_1} \]

exists.

The proof of the following lemma is given in [DG1]:

Lemma A.3. Let $Q_n$ be a commuting sequence of self-adjoint operators such that

\[ 0 \le Q_n \le 1, \qquad Q_{n+1} \le Q_n, \qquad Q_{n+1}Q_n = Q_{n+1}. \]

Then the limit

\[ Q = \text{s-}\lim_{n\to\infty} Q_n \]

exists and is a projection.

Acknowledgements. The research of the first author was a part of the project Nr 2 P03A 019 15 financed by a grant of Komitet Badań Naukowych. Part of this work was done during a visit of the first author to Aarhus University supported by MaPhySto, funded by the Danish National Foundation.

References

[ABG] Amrein, W., Boutet de Monvel, A., Georgescu, V.: $C_0$-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians. Basel–Boston–Berlin: Birkhäuser, 1996
[AH] Arai, A., Hirokawa, M.: On the existence and uniqueness of ground states of a generalized spin-boson model. J. Funct. Anal. 151, 455–503 (1997)
[BFS] Bach, V., Fröhlich, J., Sigal, I.: Quantum electrodynamics of confined non-relativistic particles. Adv. in Math. 137, 299–395 (1998)
[BFSS] Bach, V., Fröhlich, J., Sigal, I., Soffer, A.: Positive Commutators and the Spectrum of Pauli-Fierz Hamiltonian of Atoms and Molecules. Commun. Math. Phys. 207, 557–587 (1999)
[BR] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics, Vols. I, II. Berlin: Springer, 1981
[Ch1] Chaiken, J.M.: Finite particles representations and states of the canonical commutation relations. Ann. Phys. 42, 23–80 (1967)
[Ch2] Chaiken, J.M.: Number operators for representations of the canonical commutation relations. Commun. Math. Phys. 8, 164–184 (1968)
[CMR] Courbage, M., Miracle-Sole, S., Robinson, D.W.: Normal states and representations of the canonical commutation relations. Ann. I.H.P. 14, 171–178 (1971)
[CFKS] Cycon, H.L., Froese, R., Kirsch, W., Simon, B.: Schrödinger Operators with applications to Quantum Mechanics and Global Geometry. Berlin–Heidelberg–New York: Springer, 1987
[DG1] Dereziński, J., Gérard, C.: Asymptotic completeness in quantum field theory. Massive Pauli–Fierz Hamiltonians. Rev. Math. Phys. 11, 383–450 (1999)
[DG2] Dereziński, J., Gérard, C.: Scattering Theory of Classical and Quantum N-Particle Systems. Texts and Monographs in Physics, Berlin–Heidelberg–New York: Springer, 1997
[E] Enss, V.: Long-range scattering of two- and three-body quantum systems. Journées "Equations aux dérivées partielles", Saint Jean de Monts, Juin 1989, Publications Ecole Polytechnique, Palaiseau 1989
[FH] Froese, R., Herbst, I.: A new proof of the Mourre estimate. Duke Math. J. 49, 1075–1085 (1982)
[Ge] Gérard, C.: Asymptotic completeness for the spin-boson model with a particle number cutoff. Rev. Math. Phys. 8, 549–589 (1996)
[GJ1] Glimm, J., Jaffe, A.: Boson quantum field theory models. In: Mathematics of Contemporary Physics, R. Streater ed. London–New York: Academic Press, 1972
[GJ2] Glimm, J., Jaffe, A.: A $\lambda\phi^4$ quantum field theory without cutoffs I. Phys. Rev. 176, 1945–1951 (1968)
[GJ3] Glimm, J., Jaffe, A.: Quantum field theory models. In: Statistical Mechanics and Quantum Field Theory, C. de Witt, R. Stora eds. London: Gordon Breach, 1971
[GJ4] Glimm, J., Jaffe, A.: A $\lambda\phi^4$ quantum field theory without cutoffs: II. The field operators and the approximate vacuum. Ann. Math. 91, 204–267 (1970)
[Gr] Graf, G.M.: Asymptotic completeness for N-body short range quantum systems: A new proof. Commun. Math. Phys. 132, 73–101 (1990)
[HS] Helffer, B., Sjöstrand, J.: Equation de Schrödinger avec champ magnétique et équation de Harper. Springer Lecture Notes in Physics 345, 1989, pp. 118–197
[HK] Høegh–Krohn, R.: On the spectrum of the space cutoff $:P(\phi):$ Hamiltonian in 2 space-time dimensions. Commun. Math. Phys. 21, 256–260 (1971)
[HuSp1] Hübner, M., Spohn, H.: Radiative decay: nonperturbative approaches. Rev. Math. Phys. 7, 363–387 (1995)
[HuSp2] Hübner, M., Spohn, H.: Spectral properties of the spin-boson Hamiltonian. Ann. Inst. H. Poincaré 62, 289–323 (1995)
[JP1] Jaksic, V., Pillet, C.A.: On a model for quantum friction I: Fermi's golden rule at zero temperature. Ann. Inst. H. Poincaré 63, 62 -47 (1995)
[JP2] Jaksic, V., Pillet, C.A.: On a model for quantum friction II: Fermi's golden rule and dynamics at zero temperature. Commun. Math. Phys. 176, 176–619 (1995)
[JP3] Jaksic, V., Pillet, C.A.: On a model for quantum friction III: Ergodic properties of the spin-boson system. Commun. Math. Phys. 178, 627–651 (1996)
[Mo] Mourre, E.: Absence of singular continuous spectrum for certain selfadjoint operators. Commun. Math. Phys. 78, 519–567 (1981)
[Ne] Nelson, E.: A quartic interaction in 2 dimensions. In: Mathematical theory of elementary particles, R. Goodman, I. Segal eds. Cambridge: MIT Press, 1966
[PSS] Perry, P., Sigal, I.M., Simon, B.: Spectral analysis of N-body Schrödinger operators. Ann. of Math. 114, 519–567 (1981)
[Ro1] Rosen, L.: A $\lambda\phi^{2n}$ field theory without cutoffs. Commun. Math. Phys. 16, 157–183 (1970)
[Ro2] Rosen, L.: The $(\phi^{2n})_2$ Quantum Field Theory: Higher Order Estimates. Comm. Pure Appl. Math. 24, 417–457 (1971)
[Ro3] Rosen, L.: Renormalization of the Hilbert space in the mass shift model. J. Math. Phys. 13, 918–927 (1972)
[Se] Segal, I.: Construction of non linear local quantum processes I. Ann. Math. 92, 462–481 (1970)
[Si1] Simon, B.: The $P(\phi)_2$ Euclidean (Quantum) Field Theory. Princeton, NJ: Princeton University Press, 1974
[Si2] Simon, B.: Studying spatially cutoff $(\phi^{2n})_2$ Hamiltonians. In: Statistical Mechanics and Field Theory, R.N. Sen and C. Weil eds. New York: Halsted Press, 1972
[Si3] Simon, B.: Continuum embedded eigenvalues in a spatially cutoff $P(\phi)_2$ field theory. Proc. A.M.S. 35, 223–226 (1972)
[S-H.K] Simon, B., Høegh–Krohn, R.: Hypercontractive Semigroups and Two dimensional Self-Coupled Bose Fields. J. Funct. Anal. 9, 121–180 (1972)
[Sk] Skibsted, E.: Spectral analysis of N-body systems coupled to a bosonic system. Rev. Math. Phys. 10, 989–1026 (1997)
[Sp1] Spohn, H.: Asymptotic completeness for Rayleigh scattering. J. Math. Phys. 38, 2281–2296 (1997)
[Sp2] Spohn, H.: Ground state of a quantum particle coupled to a scalar Bose field. Preprint IHES 1997

Communicated by D. Brydges

Commun. Math. Phys. 213, 127 – 179 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

The Structure of Sectors Associated with Longo–Rehren Inclusions I. General Theory

Masaki Izumi

Department of Mathematics, Graduate School of Science, Kyoto University, Sakyo-ku, Kyoto 606-8502, Japan. E-mail: [email protected]

Received: 19 July 1999 / Accepted: 10 February 2000

Dedicated to John E. Roberts on the occasion of his sixtieth birthday

Abstract: We investigate the structure of the Longo–Rehren inclusion for a finite closed system of endomorphisms of factors, whose categorical structure is known to be the same as the asymptotic inclusion of A. Ocneanu. In particular, we obtain a precise description of the sectors associated with the Longo–Rehren inclusions in terms of half braidings, which do not necessarily satisfy the usual condition of braidings. In doing so, we give new proofs to most of the known statements concerning asymptotic inclusions. We construct a complete system of matrix units of the tube algebra using the half braidings, which will be used in the second part to describe concrete examples of the Longo–Rehren inclusions arising from the Cuntz algebra endomorphisms. We also discuss the case where the original system has a braiding, and generalize Ocneanu and Evans–Kawahigashi's method for the analysis of the asymptotic inclusions of the Hecke algebra subfactors.

1. Introduction The notion of the asymptotic inclusions of subfactors was introduced by A. Ocneanu. He announced a description of the asymptotic inclusions in terms of topological quantum field theory (TQFT), and pointed out that it may be considered as a subfactor analogue of the quantum double construction. Detailed accounts of his theory are given by D. E. Evans and Y. Kawahigashi in [7,9]. The asymptotic inclusions also play important roles in S. Popa’s classification of amenable subfactors [29]. Inspired by examples arising from local quantum field theory, R. Longo and K. H. Rehren introduced new construction of a subfactor, called the Longo–Rehren inclusion, for a finite system of irreducible endomorphisms of infinite factors with some conditions [24]. T. Masuda proved that the asymptotic inclusions and the Longo–Rehren inclusions are essentially the same objects [25]. Namely, he generalized the Longo– Rehren construction to a system of bimodules. Then he showed that if the system comes


from an inclusion of the AFD II1 factors, the Longo–Rehren inclusion is isomorphic to the dual inclusion of the asymptotic inclusion. In the same paper [24], Longo and Rehren introduced induction procedure of endomorphisms from a subfactor, whose idea goes back to J. E. Roberts’ earlier work. F. Xu successfully employed essentially the same procedure in his analysis of subfactors associated with the conformal inclusions [35]. Following these two works, J. Böckenhauer and D. E. Evans proposed in [2] systematic use of the induction procedure, which is now called the α-induction. We refer to [3–5, 20] for further development of the material. In this paper, we discuss detailed and direct analysis of the Longo–Rehren inclusions based on (the dual form of) the induction procedure using the notion of half braiding, which is weaker than the usual notion of braidings. We will give new proofs to most of the statements in [7] by using algebraic computation of sectors avoiding subtle and ingenious topological arguments. The category theoretical meaning of our argument, especially the relationship to the center construction in [19], will be discussed by M. Müger in [26]. The original aim of this work is to describe concrete examples of the asymptotic inclusions of subfactors arising from the Cuntz algebra endomorphisms [14], which will be the main subject of the second part. For this purpose, we write down various explicit formulae with precise normalization constants. Indeed, in the second part, we give the dual principal graphs and the S and T matrices of the asymptotic inclusions for several subfactors including the E6 subfactor. We also realize the Haagerup subfactor using an endomorphism of the Cuntz algebra O4 and compute the T matrix of the asymptotic inclusion. These are the first examples of computation of the asymptotic inclusions of subfactors neither coming from Hopf algebras nor with braidings. This paper is organized as follows: In Sect. 
2, we first collect basics of sectors and introduce the Longo–Rehren inclusion A ⊃ B for a finite closed system of endomorphisms of an infinite factor. Then, using a criterion for intermediate subfactors obtained in [18], we establish a Galois type correspondence for the Longo–Rehren inclusions. In Sect. 3, we translate Ocneanu’s notion of tube algebras introduced in [7] (and related notions such as S and T matrices) into the language of sectors. We also introduce a faithful positive functional ϕ of the tube algebra, which turns out to be a useful tool for later analysis. Section 4 is the main body of the present work. First we introduce the notion of half-braidings, which satisfy only half of the defining conditions of the usual notion of braidings. Then, we show that every B − B sector associated with the Longo–Rehren inclusion can be obtained by (the dual form of) the induction procedure with a half braiding, and that the half braiding turns into a genuine braiding for the induced sectors. We also construct a complete system of matrix units of the tube algebra from half braidings, which gives a new proof to the fact that there exists a one-to-one correspondence between the set of minimal central projections of the tube algebra and the set of irreducible M∞ − M∞ bimodules [7]. In Sect. 5, we show that S and T matrices introduced in Sect. 3 coincide with the ones associated with the braiding of B − B sectors introduced in Sect. 4 through Rehren’s argument [10, 30]. Not only this gives a proof to the fact that S and T satisfy the usual modular group relation and Verlinde formula, it also provides a concrete formula of S and T in terms of the matrix units of the tube algebra, which will be used in the second part in order to obtain S and T matrices for concrete examples.


In Sect. 6, we explicitly construct a half braiding corresponding to the restriction of the canonical endomorphism to B, which is necessary to obtain the α-inductions of the sectors with respect to the braiding obtained in Sect. 4. In Sect. 7, we investigate the structure of the Longo–Rehren inclusion when the original system $\Delta$ has a braiding. As has already been obtained in [8, 20, 28], we first show that if the braiding is non-degenerate, the system of sectors associated with the Longo–Rehren inclusion is isomorphic to the direct product of the original system and its opposite. Then, we consider the case where the braiding of $\Delta$ is not necessarily non-degenerate, and $\Delta$ is embedded in a system $\hat{\Delta}$ with a non-degenerate braiding. We obtain several results by applying the Galois correspondence established in Sect. 2 and the α-induction obtained in Sect. 6 to this situation. Namely, we give a proof of Ocneanu’s announcement of the “bipermutant theorem” in [28]. We also show that the method employed in [8, 28] to obtain the M∞ − M∞ bimodules for the Hecke algebra subfactors works in general as far as $\Delta$ is embedded in $\hat{\Delta}$ with a non-degenerate braiding. This allows us to treat the general SU(n) Hecke algebra subfactors, while in [8, 28] only the SU(2) and SU(3) cases were treated for some technical reason.

2. Preliminaries

First, we collect basics of sectors and Longo–Rehren inclusions.

2.1. Sectors. Let M, N be infinite factors and $\mathrm{Mor}(N, M)_0$ the set of unital normal ∗-homomorphisms from N to M whose image has a finite index. For $\rho \in \mathrm{Mor}(N, M)_0$, $E_\rho$ denotes the minimal conditional expectation from M to $\rho(N)$. We set

$$d(\rho) = [M : \rho(N)]_0^{1/2},$$

which is called the statistical dimension of $\rho$, where $[M : \rho(N)]_0$ is the minimum index of $M \supset \rho(N)$ [12]. The set of M − N sectors $\mathrm{Sect}(M, N)$ is defined to be the set of unitary equivalence classes of $\mathrm{Mor}(N, M)_0$. $[\rho]$ denotes the class of $\rho \in \mathrm{Mor}(N, M)_0$ in $\mathrm{Sect}(M, N)$. We use the notation $\mathrm{End}(M)_0$ and $\mathrm{Sect}(M)$ for $\mathrm{Mor}(M, M)_0$ and $\mathrm{Sect}(M, M)$. The sectors have three natural operations: the sum $[\rho] \oplus [\sigma]$, the product $[\rho][\sigma]$, and the conjugation $\overline{[\rho]}$ [13, 22]. For simplicity, $\bar\rho \in \mathrm{Mor}(M, N)_0$ denotes one of the representatives of $\overline{[\rho]}$. $\bar\rho$ is given by $\rho^{-1} \cdot \gamma$, where $\gamma : M \to \rho(N)$ is the canonical endomorphism of Longo. $\rho \in \mathrm{Mor}(N, M)_0$ is called irreducible if the relative commutant $M \cap \rho(N)'$ is trivial. For $\rho, \sigma \in \mathrm{Mor}(N, M)_0$ the intertwiner space $(\rho, \sigma)$ is defined by

$$(\rho, \sigma) := \{V \in M;\ V\rho(x) = \sigma(x)V,\ x \in N\}.$$

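If $\rho$ is irreducible, $(\rho, \sigma)$ is a Hilbert space with the inner product $\langle V, W\rangle 1 = W^* V$. It is known that for every $\rho \in \mathrm{Mor}(N, M)_0$ there exists a pair of isometries $R_\rho \in (\mathrm{id}, \bar\rho\rho)$, $\bar R_\rho \in (\mathrm{id}, \rho\bar\rho)$ satisfying

$$\bar R_\rho^{*}\,\rho(R_\rho) = \frac{1}{d(\rho)}, \qquad R_\rho^{*}\,\bar\rho(\bar R_\rho) = \frac{1}{d(\rho)}. \tag{2.1}$$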

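One can choose $\bar R_\rho$ equal to either $R_{\bar\rho}$ or $-R_{\bar\rho}$. The standard left inverse $\phi_\rho$ of $\rho$ and $E_\rho$ are obtained by

$$\phi_\rho(x) = R_\rho^{*}\,\bar\rho(x)\,R_\rho, \qquad E_\rho = \rho \cdot \phi_\rho. \tag{2.2}$$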

M. Izumi

2.2. Q-systems. The canonical endomorphisms can be characterized as follows [23]:

Theorem 2.1. Let M be an infinite factor and $\gamma$ a unital endomorphism of M. Then,
(i) If there exist isometries $V \in (\mathrm{id}, \gamma)$, $W \in (\gamma, \gamma^2)$ satisfying

$$V^* W = W^* \gamma(V) = \frac{1}{d} \in \mathbb{R}_{>0}, \qquad W^* \gamma(W) = W W^*, \qquad \gamma(W) W = W^2,$$

then there exists a subfactor N such that $\gamma$ is the canonical endomorphism for the inclusion $M \supset N$. N is given by

$$N = \{x \in M;\ W x = \gamma(x) W,\ W x^* = \gamma(x^*) W\}.$$

(ii) The conditional expectation $E : M \to N$ is given by $E(x) = W^* \gamma(x) W$. Every $x \in M$ satisfies $x = d^2\, V^* E(V x)$.
(iii) The conditional expectation $E_1 : N \to \gamma(M)$ is given by $E_1(y) = \gamma(V^* y V)$. Every $y \in N$ satisfies $y = d^2\, W^* E_1(W y)$.

The triplet $(\gamma, V, W)$ satisfying the above condition is called a Q-system [23].

2.3. The Longo–Rehren inclusions. Let $\Delta = \{\rho_\xi\}_{\xi \in \Delta_0}$ be a finite system of irreducible endomorphisms in $\mathrm{End}(M)_0$ that is closed under the sector operations in the following sense:
(i) $[\rho_\xi] = [\rho_\eta]$ if and only if $\xi = \eta$.
(ii) There exists $e \in \Delta_0$ such that $\rho_e = \mathrm{id}$.
(iii) For every $\xi \in \Delta_0$ there exists $\bar\xi \in \Delta_0$ such that $[\overline{\rho_\xi}] = [\rho_{\bar\xi}]$.
(iv) There exist non-negative integers $N^{\zeta}_{\xi,\eta}$ such that

$$[\rho_\xi][\rho_\eta] = \bigoplus_{\zeta} N^{\zeta}_{\xi,\eta}\,[\rho_\zeta].$$

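We denote $R_\xi := R_{\rho_\xi}$, $\bar R_\xi := \bar R_{\rho_\xi}$, $\phi_\xi = \phi_{\rho_\xi}$, and $d(\xi) = d(\rho_\xi)$ for simplicity. Set $\lambda = \sum_{\xi} d(\xi)^2$, which we call the global index of $\Delta$. Let $M^{op}$ be the opposite algebra of M and $j : M \to M^{op}$ the natural anti-linear isomorphism. (When $M^{op}$ is identified with $M'$, $j$ may be identified with $x \mapsto J_M x J_M$, where $J_M$ is the modular conjugation of M.) We set $A := M \otimes M^{op}$, $\rho^{op}_\xi = j \cdot \rho_\xi \cdot j^{-1}$, and $\hat\rho_\xi = \rho_\xi \otimes \rho^{op}_\xi$. Let $\{T({}^{\zeta}_{\xi,\eta})_i\}_{i=1}^{N^{\zeta}_{\xi,\eta}}$ be an orthonormal basis of $(\rho_\zeta, \rho_\xi\rho_\eta)$, and set

$$T^{\zeta}_{\xi,\eta} = \sum_{i=1}^{N^{\zeta}_{\xi,\eta}} T({}^{\zeta}_{\xi,\eta})_i \otimes j\big(T({}^{\zeta}_{\xi,\eta})_i\big),$$

which does not depend on the choice of the basis. Longo and Rehren showed the following [24]: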

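Theorem 2.2. Let $\gamma$ be the direct sum of $\{\hat\rho_\xi\}_{\xi \in \Delta_0}$; namely we choose a system of isometries $\{V_\xi\}_{\xi \in \Delta_0} \subset A$ satisfying $\sum_\xi V_\xi V_\xi^* = 1$, and define

$$\gamma(x) = \sum_{\xi} V_\xi\, \hat\rho_\xi(x)\, V_\xi^*.$$

Let $V \in (\mathrm{id}, \gamma)$, $W \in (\gamma, \gamma^2)$ be isometries defined by $V = V_e$ and

$$W = \sum_{\xi,\eta,\zeta} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; V_\xi\, \hat\rho_\xi(V_\eta)\, T^{\zeta}_{\xi,\eta}\, V_\zeta^* = \sum_{\xi,\eta,\zeta} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; \gamma(V_\eta)\, V_\xi\, T^{\zeta}_{\xi,\eta}\, V_\zeta^*.$$

Then, $(\gamma, V, W)$ is a Q-system. In consequence, there exists a subfactor B such that $\gamma : A \to B$ is the canonical endomorphism.

The above inclusion $A \supset B$ is called the Longo–Rehren inclusion. Let $E_\Delta : A \to B$ be the conditional expectation, which is given by $E_\Delta(x) = W^* \gamma(x) W$. Using the definition of W, we see that $E_\Delta$ is given by

$$E_\Delta(x) = \sum_{\xi \in \Delta_0} Z_\xi^*\, \hat\rho_\xi(x)\, Z_\xi, \qquad Z_\xi = \sum_{\eta,\zeta \in \Delta_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; \hat\rho_\xi(V_\eta)\, T^{\zeta}_{\xi,\eta}\, V_\zeta^*. \tag{2.3}$$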
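As a quick consistency check (a sketch added for illustration, not part of the original argument), one can verify the first Q-system relation of Theorem 2.1 for the pair $(V, W)$ of Theorem 2.2. It uses only $V = V_e$, $\hat\rho_e = \mathrm{id}$, $d(e) = 1$, the orthogonality $V_e^* V_\xi = \delta_{\xi,e}$, and the observation that $T^{\zeta}_{e,\eta}$ vanishes unless $\zeta = \eta$, in which case it equals 1. The normalization with the square root is assumed as in the standard Longo–Rehren construction.

```latex
\begin{align*}
V^* W
 &= \sum_{\eta,\zeta} \sqrt{\frac{d(e)\,d(\eta)}{\lambda\, d(\zeta)}}\;
    \hat\rho_e(V_\eta)\, T^{\zeta}_{e,\eta}\, V_\zeta^* \\
 &= \frac{1}{\sqrt{\lambda}} \sum_{\eta} V_\eta V_\eta^*
  = \frac{1}{\sqrt{\lambda}} .
\end{align*}
```

So the constant $d$ of Theorem 2.1 is $\sqrt{\lambda}$ for this Q-system, which agrees with the relation $V^* W = 1/\sqrt{\lambda}$ used in Sect. 4.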

2.4. The Galois correspondence. In this subsection, we establish the Galois correspondence for the Longo–Rehren inclusions. The results obtained here will be used only in Sect. 7.

Let $P \supset Q$ be an inclusion of factors of finite index and $\gamma : P \to Q$ be the canonical endomorphism. Let $[\gamma|_Q] = \bigoplus_{i \in I} n_i [\sigma_i]$ be the irreducible decomposition of the restriction of $\gamma$ to Q. We set

$$H_i = \{X \in P;\ X x = \sigma_i(x) X,\ \forall x \in Q\}.$$

Then, $H_i$ is a Hilbert space whose dimension is $n_i$ thanks to Sect. 3 of [18]. P is uniquely decomposed as

$$P = \bigoplus_{i \in I} Q H_i,$$

which can be regarded as a generalization of the crossed product decomposition. Corollary 3.10 of [18] implies the following:

Lemma 2.3. Let P, Q, and $\gamma$ be as above. We assume that each $\sigma_i$ appears in $\gamma|_Q$ with multiplicity one, i.e. $n_i = 1$. Then there exists a one-to-one correspondence between the lattice of intermediate subfactors of $P \supset Q$ and the lattice of subsets $J \subset I$ satisfying the following:
(i) If $i \in J$, so does $\bar i$.
(ii) If $i, j \in J$ and $\sigma_k \prec \gamma|_Q$ is contained in $\sigma_i \sigma_j$, then $k \in J$.

The correspondence is given as follows: For $J \subset I$ satisfying the above condition, the corresponding intermediate subfactor $R_J$ is given by

$$R_J = \bigoplus_{j \in J} Q H_j.$$

For an intermediate subfactor $P \supset R \supset Q$, the corresponding subset $J_R \subset I$ is given by $J_R = \{i \in I;\ H_i \subset R\}$. Moreover, if $\gamma_R : R \to Q$ is the canonical endomorphism, then the irreducible decomposition of $\gamma_R$ restricted to Q is given by

$$[\gamma_R|_Q] = \bigoplus_{i \in J_R} [\sigma_i].$$

Let $\Delta = \{\rho_\xi\}_{\xi \in \Delta_0}$ be as in the previous subsection and $A \supset B$ be the corresponding Longo–Rehren inclusion. We denote by $A_1$ the basic extension of $A \supset B$. We say that a subset $\Lambda \subset \Delta$ is a closed subsystem if $\Lambda$ is closed under the conjugation and the irreducible decomposition of the products. Applying Lemma 2.3 to $A_1 \supset A$, we can easily obtain the following:

Proposition 2.4. Let $A \supset B$ be the Longo–Rehren inclusion of $\Delta$. Then, there exists a lattice anti-isomorphism between the set of the intermediate subfactors of $A \supset B$ and the set of closed subsystems of $\Delta$. For a given closed subsystem $\Lambda \subset \Delta$, the corresponding intermediate subfactor $B_\Lambda$ is characterized as follows: If $\gamma_{B_\Lambda} : A \to B_\Lambda$ is the canonical endomorphism, then $\gamma_{B_\Lambda}$ is decomposed as

$$[\gamma_{B_\Lambda}] = \bigoplus_{\xi \in \Lambda_0} [\hat\rho_\xi].$$

In the above proposition, it is not so clear whether $A \supset B_\Lambda$ really gives the Longo–Rehren inclusion of $\Lambda$ because in general, the canonical endomorphism does not determine a subfactor up to inner conjugacy, or even more strongly, up to conjugacy (see, for example, 5.1 of [16]). The rest of this section is devoted to the proof of this fact.

Theorem 2.5. Let $\lambda_\Lambda$ be the global index of $\Lambda$, i.e. $\lambda_\Lambda = \sum_{\xi \in \Lambda_0} d(\xi)^2$. We set $P_\Lambda := \sum_{\xi \in \Lambda_0} V_\xi V_\xi^*$, which is a projection in the relative commutant $A \cap \gamma(A)'$. We define a normal completely positive map $E_\Lambda$ from A to itself by

$$E_\Lambda(x) := \frac{\lambda}{\lambda_\Lambda}\, W^* P_\Lambda\, \gamma(x)\, W.$$

Then,
(i) $B_\Lambda = B P_\Lambda B$ and $E_\Lambda$ is the conditional expectation from A to $B_\Lambda$.
(ii) $A \supset B_\Lambda$ is the Longo–Rehren inclusion for $\Lambda$.

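Proof. (i) Since $P_\Lambda$ commutes with $\gamma(A)$, $E_\Lambda$ defines a completely positive map, which can be shown to be unital by direct computation. First we claim that $E_\Lambda(xy) = E_\Lambda(x) y$ (and $E_\Lambda(yx) = y E_\Lambda(x)$) holds for all $x \in A$ and all $y \in (B \cup \{P_\Lambda\})''$. Indeed, since we have $B = W^* \gamma(A)$, it suffices to show the statement for the following three cases: $y \in \gamma(A)$, $y = W^*$, and $y = P_\Lambda$. In the first two cases, the facts

$$W \in (\gamma, \gamma^2), \qquad \gamma(W^*) W = W W^*$$

imply the statement. Since $\Lambda$ is a closed system, we have $P_\Lambda \gamma(P_\Lambda) W = P_\Lambda W P_\Lambda = \gamma(P_\Lambda) W P_\Lambda$. Thus,

$$E_\Lambda(x P_\Lambda) = \frac{\lambda}{\lambda_\Lambda}\, W^* \gamma(x) P_\Lambda \gamma(P_\Lambda) W = \frac{\lambda}{\lambda_\Lambda}\, W^* \gamma(x) P_\Lambda W P_\Lambda = E_\Lambda(x) P_\Lambda.$$

Next we show that $E_\Lambda$ is a norm-one projection onto $B P_\Lambda B$, which is the von Neumann algebra generated by B and $P_\Lambda$. Indeed, using the fact that $A = B V V^* B$ and

$$E_\Lambda(V V^*) = \frac{\lambda}{\lambda_\Lambda}\, W^* \gamma(V) P_\Lambda \gamma(V^*) W = \frac{1}{\lambda_\Lambda}\, P_\Lambda,$$

we get $E_\Lambda(A) = B P_\Lambda B$. Thanks to the above claim, this shows that $E_\Lambda$ is a norm-one projection onto $B P_\Lambda B$ and the image is closed under product, and so it is a von Neumann algebra. Let $\gamma'$ be the canonical endomorphism from A into the image of $E_\Lambda$. Then, Proposition 2.9, (i) of [18] and the definition of $E_\Lambda$ imply

$$[\gamma'] = \bigoplus_{\xi \in \Lambda_0} [\hat\rho_\xi].$$

Thus, Proposition 2.4 implies that the image is nothing but $B_\Lambda$.

(ii) We take an isometry $R_\Lambda \in B_\Lambda$ satisfying $R_\Lambda R_\Lambda^* = P_\Lambda$. Then, we have

$$E_\Lambda(x) = R_\Lambda^*\, E_\Lambda(R_\Lambda x R_\Lambda^*)\, R_\Lambda = \sum_{\xi \in \Lambda_0} Z_\xi'^{\,*}\, \hat\rho_\xi(x)\, Z'_\xi,$$

where

$$Z'_\xi = \sqrt{\frac{\lambda}{\lambda_\Lambda}}\; \hat\rho_\xi(R_\Lambda^*)\, Z_\xi\, R_\Lambda.$$

We set $V'_\xi := R_\Lambda^* V_\xi$ for $\xi \in \Lambda_0$. Then $\{V'_\xi\}_{\xi \in \Lambda_0}$ are isometries satisfying $\sum_{\xi \in \Lambda_0} V'_\xi V_\xi'^{\,*} = 1$. Moreover, we have

$$Z'_\xi = \sum_{\eta,\zeta \in \Lambda_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda_\Lambda\, d(\zeta)}}\; \hat\rho_\xi(V'_\eta)\, T^{\zeta}_{\xi,\eta}\, V_\zeta'^{\,*}.$$

Comparing this with (2.3), we conclude that $E_\Lambda$ comes from the Longo–Rehren inclusion with the canonical endomorphism

$$\gamma_\Lambda(x) = \sum_{\xi \in \Lambda_0} V'_\xi\, \hat\rho_\xi(x)\, V_\xi'^{\,*}.$$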

3. Tube Algebras

The notion of tube algebras was introduced by Ocneanu [7] for a system of bimodules. Here, we translate his definition into the language of sectors, and give explicit formulae for the tube algebra operations and the S and T matrices for the computation of concrete examples. We will freely use the Frobenius reciprocity of sectors formulated in [15].

Let $\Delta = \{\rho_\xi\}_{\xi \in \Delta_0}$ be a finite system of endomorphisms of M as in the previous section. The tube algebra of this system is, as a linear space, defined by

$$\mathrm{Tube}\,\Delta := \bigoplus_{\xi,\eta,\zeta \in \Delta_0} (\rho_\xi \cdot \rho_\zeta,\ \rho_\zeta \cdot \rho_\eta).$$

Since the same operator may belong to two distinct intertwiner spaces, to avoid possible confusion we use the following notation: when $X \in (\rho_\xi \cdot \rho_\zeta, \rho_\zeta \cdot \rho_\eta)$ is regarded as an element of $\mathrm{Tube}\,\Delta$, it is denoted by $(\xi\zeta|X|\zeta\eta)$. We introduce a ∗-algebra structure into $\mathrm{Tube}\,\Delta$ as follows:

$$(\xi\zeta|X|\zeta\eta)(\xi'\zeta'|Y|\zeta'\eta') := \delta_{\eta,\xi'} \sum_{\nu \prec \zeta\zeta'} \sum_{i=1}^{N^{\nu}_{\zeta,\zeta'}} \big(\xi\nu\big|\, T({}^{\nu}_{\zeta,\zeta'})_i^{*}\, \rho_\zeta(Y)\, X\, \rho_\xi\big(T({}^{\nu}_{\zeta,\zeta'})_i\big) \big|\nu\eta'\big),$$

$$(\xi\zeta|X|\zeta\eta)^* := d(\zeta)\, \big(\eta\bar\zeta\big|\, \rho_{\bar\zeta}\big(\rho_\xi(\bar R_\zeta^{*}) X^*\big) R_\zeta \big|\bar\zeta\xi\big).$$

It is routine to show, by using the Frobenius reciprocity, that $\mathrm{Tube}\,\Delta$ is a ∗-algebra with these operations. Moreover, $\mathrm{Tube}\,\Delta$ is a C∗-algebra. To show it, we need the following:

Lemma 3.1. (i) For $X, Y \in (\rho_\mu, \rho_\xi\rho_\eta)$, we have

$$\phi_\xi(X Y^*) = \langle X, Y\rangle\, \frac{d(\mu)}{d(\xi)d(\eta)}.$$

(ii) For $X, Y \in (\rho_\xi\rho_\zeta, \rho_\zeta\rho_\eta)$, we have

$$d(\xi)\, \phi_\xi(Y^* X) = d(\eta)\, \phi_\zeta(X Y^*).$$

Proof. (i) Thanks to the polarization identity, we may assume that $Y = X$ is an isometry. In this case, $X X^*$ is a minimal projection in the relative commutant $M \cap \rho_\xi\rho_\eta(M)'$ and the statement follows from the local index formula for the minimal expectation. (See Sect. 2 of [15].) (ii) It suffices to show the statement for X and Y of the following form:

$$X = X_1 X_2^*, \quad X_1 \in (\rho_\mu, \rho_\zeta\rho_\eta), \quad X_2 \in (\rho_\mu, \rho_\xi\rho_\zeta),$$
$$Y = Y_1 Y_2^*, \quad Y_1 \in (\rho_\nu, \rho_\zeta\rho_\eta), \quad Y_2 \in (\rho_\nu, \rho_\xi\rho_\zeta).$$

For these X and Y, we have

$$\phi_\xi(Y^* X) = \phi_\xi(Y_2 Y_1^* X_1 X_2^*) = \delta_{\mu,\nu}\, \langle X_1, Y_1\rangle\, \phi_\xi(Y_2 X_2^*),$$
$$\phi_\zeta(X Y^*) = \phi_\zeta(X_1 X_2^* Y_2 Y_1^*) = \delta_{\mu,\nu}\, \langle Y_2, X_2\rangle\, \phi_\zeta(X_1 Y_1^*).$$

Thus, the statement follows from (i).
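A simple special case may help to see how the normalization in Lemma 3.1 (i) works. The following sketch (added for illustration) takes $\mu = e$, $\eta = \bar\xi$ and $X = Y = \bar R_\xi \in (\mathrm{id}, \rho_\xi\rho_{\bar\xi})$, and evaluates the left-hand side directly from (2.2) and (2.1). It assumes $d(\bar\xi) = d(\xi)$ and that $\rho_{\bar\xi}$ is used as the representative of the conjugate of $\rho_\xi$.

```latex
\begin{align*}
\phi_\xi(\bar R_\xi \bar R_\xi^{*})
 &= R_\xi^{*}\,\rho_{\bar\xi}(\bar R_\xi)\;\rho_{\bar\xi}(\bar R_\xi^{*})\,R_\xi
  = \frac{1}{d(\xi)}\cdot\frac{1}{d(\xi)} , \\
\langle \bar R_\xi, \bar R_\xi\rangle\,\frac{d(e)}{d(\xi)\,d(\bar\xi)}
 &= \frac{1}{d(\xi)^{2}} .
\end{align*}
```

Both sides agree, as the lemma predicts.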

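We define two linear functionals on $\mathrm{Tube}\,\Delta$ by

$$\mathrm{Tr}((\xi\zeta|X|\zeta\eta)) := d(\xi)\, \delta_{\xi,\eta}\, \delta_{\zeta,e}\, X, \qquad \varphi_\Delta((\xi\zeta|X|\zeta\eta)) := d(\xi)^2\, \delta_{\xi,\eta}\, \delta_{\zeta,e}\, X.$$

Note that $X \in (\rho_\xi, \rho_\xi) = \mathbb{C}$ in the above unless the right-hand sides vanish.

Proposition 3.2. Tr is a faithful positive trace and $\varphi_\Delta$ is a faithful positive functional. In consequence, since $\mathrm{Tube}\,\Delta$ is a finite dimensional ∗-algebra with a faithful positive functional, it is a C∗-algebra.

Proof. First of all, note that unless $\xi = \xi'$, $\eta = \eta'$, $\zeta = \zeta'$, we have

$$\mathrm{Tr}\big((\xi\zeta|X|\zeta\eta)(\xi'\zeta'|Y|\zeta'\eta')^*\big) = \mathrm{Tr}\big((\xi'\zeta'|Y|\zeta'\eta')^*(\xi\zeta|X|\zeta\eta)\big) = 0.$$

Using (2.1) and the definitions of the product and the ∗-operation, we get

$$\begin{aligned}
\mathrm{Tr}\big((\xi\zeta|X|\zeta\eta)(\xi\zeta|Y|\zeta\eta)^*\big)
 &= d(\xi)d(\zeta)\, \bar R_\zeta^{*}\, \rho_\zeta\big(\rho_{\bar\zeta}(\rho_\xi(\bar R_\zeta^{*}) Y^*) R_\zeta\big)\, X\, \rho_\xi(\bar R_\zeta) \\
 &= d(\xi)\, \rho_\xi(\bar R_\zeta^{*})\, Y^* X\, \rho_\xi(\bar R_\zeta).
\end{aligned}$$

Since the right-hand side is already a scalar, we can apply $\phi_\xi$ and get

$$\mathrm{Tr}\big((\xi\zeta|X|\zeta\eta)(\xi\zeta|Y|\zeta\eta)^*\big) = d(\xi)\, \bar R_\zeta^{*}\, \phi_\xi(Y^* X)\, \bar R_\zeta = d(\xi)\, \phi_\xi(Y^* X).$$

In a similar way, we obtain the following using (2.1) and (2.2):

$$\begin{aligned}
\mathrm{Tr}\big((\xi\zeta|Y|\zeta\eta)^*(\xi\zeta|X|\zeta\eta)\big)
 &= d(\eta)d(\zeta)\, R_\zeta^{*}\, \rho_{\bar\zeta}(X)\, \rho_{\bar\zeta}\big(\rho_\xi(\bar R_\zeta^{*}) Y^*\big)\, R_\zeta\, \rho_\eta(R_\zeta) \\
 &= d(\eta)d(\zeta)\, R_\zeta^{*}\, \rho_{\bar\zeta}\big(X \rho_\xi(\bar R_\zeta^{*}) Y^* \rho_\zeta\rho_\eta(R_\zeta)\big)\, R_\zeta \\
 &= d(\eta)d(\zeta)\, \phi_\zeta\big(X \rho_\xi(\bar R_\zeta^{*} \rho_\zeta(R_\zeta)) Y^*\big) \\
 &= d(\eta)\, \phi_\zeta(X Y^*).
\end{aligned}$$

Thanks to Lemma 3.1, Tr is a trace. Let $x = \sum_{\xi,\eta,\zeta} (\xi\zeta|X_{\xi,\zeta,\eta}|\zeta\eta) \in \mathrm{Tube}\,\Delta$. Then,

$$\mathrm{Tr}(x^* x) = \sum_{\xi,\eta,\zeta} d(\eta)\, \phi_\zeta\big(X_{\xi,\zeta,\eta} X_{\xi,\zeta,\eta}^{*}\big).$$

Since the left inverses $\{\phi_\xi\}_{\xi \in \Delta_0}$ are faithful and positive, so is Tr. The statement for $\varphi_\Delta$ can be proven in a similar way.

We denote by $\mathcal{A}_{\xi,\eta}$ the linear subspace $\bigoplus_{\zeta \in \Delta_0} (\rho_\xi\rho_\zeta, \rho_\zeta\rho_\eta) \subset \mathrm{Tube}\,\Delta$. Then, we have

$$\mathcal{A}_{\xi,\eta}\, \mathcal{A}_{\xi',\eta'} \subset \delta_{\eta,\xi'}\, \mathcal{A}_{\xi,\eta'}, \qquad \mathcal{A}_{\xi,\eta}^{*} = \mathcal{A}_{\eta,\xi}.$$

Thus, $\mathcal{A}_\xi := \mathcal{A}_{\xi,\xi}$ is a ∗-subalgebra of $\mathrm{Tube}\,\Delta$. Let $\mathcal{A}_\Delta$ be the direct sum of the $\mathcal{A}_\xi$, which includes the center of $\mathrm{Tube}\,\Delta$. The above proof shows that $\varphi_\Delta$ restricted to $\mathcal{A}_\Delta$ is a faithful trace. We introduce an inner product into $\mathcal{A}_\Delta$ by $(X, Y) := \varphi_\Delta(Y^* X) = \varphi_\Delta(X Y^*)$, or more explicitly by

$$\big((\xi\zeta|X|\zeta\xi), (\xi'\zeta'|Y|\zeta'\xi')\big) = \delta_{\xi,\xi'}\, \delta_{\zeta,\zeta'}\, d(\xi)^2\, \phi_\zeta(X Y^*) = \delta_{\xi,\xi'}\, \delta_{\zeta,\zeta'}\, d(\xi)^2\, \phi_\xi(Y^* X).$$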

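Now, we introduce S and T matrices. We define a linear operator $S_0$ on $\mathcal{A}_\Delta$ by

$$S_0\big((\xi\eta|X|\eta\xi)\big) = d(\xi)\, \big(\bar\eta\xi\big|\, R_\eta^{*}\, \rho_{\bar\eta}\big(X \rho_\xi(\bar R_\eta)\big) \big|\xi\bar\eta\big).$$

We set

$$t = \sum_{\xi} d(\xi)\, \big(\xi\bar\xi\big|\, R_\xi \bar R_\xi^{*} \big|\bar\xi\xi\big) \in \mathrm{Tube}\,\Delta.$$

Theorem 3.3. Let $S_0$ and $t$ be as above. Then,
(i) $S_0$ is a unitary.
(ii) $t$ is a unitary in $\mathrm{Tube}\,\Delta$ with

$$t^* = \sum_{\xi} (\xi\xi|1|\xi\xi).$$

(iii) $t$ belongs to the center of $\mathrm{Tube}\,\Delta$.

Proof. (i) Using the definitions of $S_0$ and the inner product, we get

$$\begin{aligned}
\big(S_0((\xi\eta|X|\eta\xi)),\ S_0((\xi'\eta'|Y|\eta'\xi'))\big)
 &= d(\xi)^2 d(\eta)^2\, \delta_{\xi,\xi'}\delta_{\eta,\eta'}\, \phi_{\bar\eta}\Big(\rho_{\bar\eta}\big(\rho_\xi(\bar R_\eta^{*}) Y^*\big)\, R_\eta R_\eta^{*}\, \rho_{\bar\eta}\big(X \rho_\xi(\bar R_\eta)\big)\Big) \\
 &= d(\xi)^2 d(\eta)^2\, \delta_{\xi,\xi'}\delta_{\eta,\eta'}\, \rho_\xi(\bar R_\eta^{*})\, Y^*\, \phi_{\bar\eta}(R_\eta R_\eta^{*})\, X\, \rho_\xi(\bar R_\eta) \\
 &= d(\xi)^2\, \delta_{\xi,\xi'}\delta_{\eta,\eta'}\, \rho_\xi(\bar R_\eta^{*})\, Y^* X\, \rho_\xi(\bar R_\eta).
\end{aligned}$$

Since this is already a scalar, we can apply $\phi_\xi$ to it. Thus, we get

$$\big(S_0((\xi\eta|X|\eta\xi)),\ S_0((\xi'\eta'|Y|\eta'\xi'))\big) = d(\xi)^2\, \delta_{\xi,\xi'}\delta_{\eta,\eta'}\, \phi_\xi(Y^* X) = \big((\xi\eta|X|\eta\xi),\ (\xi'\eta'|Y|\eta'\xi')\big).$$

(ii) Using (2.1), we obtain

$$t^* = \sum_{\xi} d(\xi)^2\, \big(\xi\xi\big|\, \rho_\xi\big(\rho_\xi(R_\xi^{*})\, \bar R_\xi R_\xi^{*}\big)\, \bar R_\xi \big|\xi\xi\big) = \sum_{\xi} (\xi\xi|1|\xi\xi).$$

Since $\sum_\xi (\xi e|1|e\xi)$ is the unit of $\mathrm{Tube}\,\Delta$, we get

$$t^* t = \sum_{\xi} d(\xi)\, (\xi\xi|1|\xi\xi)\big(\xi\bar\xi\big|R_\xi \bar R_\xi^{*}\big|\bar\xi\xi\big) = \sum_{\xi} d(\xi)\, \big(\xi e\big|\, \bar R_\xi^{*} \rho_\xi(R_\xi) \big|e\xi\big) = 1.$$

(iii) It suffices to show that $t^*$ commutes with $(\xi\zeta|X|\zeta\eta)$ of the following form:

$$X = X_1 X_2^*, \qquad X_1 \in (\rho_\mu, \rho_\zeta\rho_\eta), \qquad X_2 \in (\rho_\mu, \rho_\xi\rho_\zeta).$$

We may further assume $X_1 = T({}^{\mu}_{\zeta,\eta})_1$, $X_2 = T({}^{\mu}_{\xi,\zeta})_1$. Then we get

$$\begin{aligned}
t^* (\xi\zeta|X|\zeta\eta) = (\xi\xi|1|\xi\xi)(\xi\zeta|X|\zeta\eta)
 &= \sum_{\nu \prec \xi\zeta} \sum_{i=1}^{N^{\nu}_{\xi,\zeta}} \big(\xi\nu\big|\, T({}^{\nu}_{\xi,\zeta})_i^{*}\, \rho_\xi(X)\, \rho_\xi\big(T({}^{\nu}_{\xi,\zeta})_i\big) \big|\nu\eta\big) \\
 &= \big(\xi\mu\big|\, X_2^*\, \rho_\xi(X_1) \big|\mu\eta\big),
\end{aligned}$$

$$\begin{aligned}
(\xi\zeta|X|\zeta\eta)\, t^* = (\xi\zeta|X|\zeta\eta)(\eta\eta|1|\eta\eta)
 &= \sum_{\nu \prec \zeta\eta} \sum_{i=1}^{N^{\nu}_{\zeta,\eta}} \big(\xi\nu\big|\, T({}^{\nu}_{\zeta,\eta})_i^{*}\, X\, \rho_\xi\big(T({}^{\nu}_{\zeta,\eta})_i\big) \big|\nu\eta\big) \\
 &= \big(\xi\mu\big|\, X_2^*\, \rho_\xi(X_1) \big|\mu\eta\big).
\end{aligned}$$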
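The proof above uses that $\sum_\xi (\xi e|1|e\xi)$ is the unit of the tube algebra. As a sketch of why this holds (added for illustration), one can apply the product formula of this section with middle index $\zeta = e$. It assumes that the only $\nu \prec \rho_e\rho_{\zeta'}$ is $\nu = \zeta'$ and that the basis element $T({}^{\zeta'}_{e,\zeta'})_1$ is a scalar of modulus one, whose phase cancels between $T^*$ and $T$.

```latex
\begin{align*}
(\xi e|1|e\xi)\,(\xi'\zeta'|Y|\zeta'\eta')
 &= \delta_{\xi,\xi'}\,
    \big(\xi\zeta'\big|\,T({}^{\zeta'}_{e,\zeta'})_1^{*}\;\rho_e(Y)\cdot 1\cdot
    \rho_\xi\big(T({}^{\zeta'}_{e,\zeta'})_1\big)\big|\zeta'\eta'\big) \\
 &= \delta_{\xi,\xi'}\,(\xi'\zeta'|Y|\zeta'\eta') .
\end{align*}
```

Summing over $\xi$ reproduces $(\xi'\zeta'|Y|\zeta'\eta')$; multiplication from the right is handled in the same way.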

In Sect. 5, we show that $S_0$ leaves the center $Z(\mathrm{Tube}\,\Delta)$ invariant. Therefore, the restriction of $S_0$ to $Z(\mathrm{Tube}\,\Delta)$, denoted by S, is well-defined. We denote by T the multiplication operator of $t$ on $Z(\mathrm{Tube}\,\Delta)$. Note that S and T are unitaries.

Before closing this section, we show a simple but non-trivial example of $\mathrm{Tube}\,\Delta$. Let G be a finite group. A G-kernel in M is a homomorphism from G to $\mathrm{Out}(M)$. We take a lift $\alpha : G \to \mathrm{Aut}(M)$ of an injective G-kernel. Then, since $[\alpha_g][\alpha_h] = [\alpha_{gh}]$, there exists a unitary $u(g, h) \in M$ for each pair $g, h \in G$ satisfying

$$\alpha_g \cdot \alpha_h = \mathrm{Ad}(u(g, h)) \cdot \alpha_{gh}.$$

Therefore, using associativity of $\alpha_g \cdot \alpha_h \cdot \alpha_k$, we get

$$\mathrm{Ad}(u(g, h) u(gh, k)) \cdot \alpha_{ghk} = \mathrm{Ad}(\alpha_g(u(h, k)) u(g, hk)) \cdot \alpha_{ghk}.$$

This implies that there exists a scalar $\omega(g, h, k) \in \mathbb{T}$ satisfying

$$u(g, h) u(gh, k) = \omega(g, h, k)\, \alpha_g(u(h, k))\, u(g, hk).$$

$\omega$ satisfies the 3-cocycle relation and the cohomology class $[\omega] \in H^3(G, \mathbb{T})$ depends only on the G-kernel [31]. Now we consider the case $\Delta = \{\alpha_g\}_{g \in G}$. Let

$$c(g, h) := \big(gh\big|\, u(h, g^h)\, u(g, h)^* \big|h g^h\big),$$

where $g^h = h^{-1} g h$. Then, clearly $\{c(g, h)\}_{g,h \in G}$ forms a basis of $\mathrm{Tube}\,\Delta$. By applying the definition of the product introduced above, we get

$$\begin{aligned}
c(g, h)\, c(k, l)
 &= \delta_{k, g^h}\, \big(g(hl)\big|\, u(h, l)^*\, \alpha_h\big(u(l, g^{hl})\, u(g^h, l)^*\big) \cdot u(h, g^h)\, u(g, h)^*\, \alpha_g(u(h, l)) \big|(hl) g^{hl}\big) \\
 &= \delta_{k, g^h}\, \omega(h, g^h, l)\, \overline{\omega(g, h, l)}\, \overline{\omega(h, l, g^{hl})}\; c(g, hl).
\end{aligned}$$

Thus $\mathrm{Tube}\,\Delta$ in this case is nothing but the quantum double of G with 3-cocycle $\omega$ discussed in [6, 32].

4. Half Braidings and B − B Sectors

Let $\Delta = \{\rho_\xi\}_{\xi \in \Delta_0}$ be as before and $A \supset B$ the associated Longo–Rehren inclusion. We keep using the same notation as in the previous two sections. In [7], it is shown that the center of $\mathrm{Tube}\,\Delta$ can be identified with the Hilbert space $H_{S^1 \times S^1}$ associated with the corresponding TQFT, and that the minimal central projections of $\mathrm{Tube}\,\Delta$ parameterize the M∞ − M∞ bimodules associated with the asymptotic inclusion.
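Returning to the group example at the end of Sect. 3, the product rule for the basis $c(g, h)$ can be tested numerically in the simplest case. The following is a minimal sketch, not the author's computation. It assumes $G = \mathbb{Z}_N$ (so $g^h = g$), the commonly used representative $\omega(a, b, c) = \exp(2\pi i\, a \lfloor (b + c)/N \rfloor / N)$ of a generator of $H^3(\mathbb{Z}_N, \mathbb{T})$, and the scalar $\omega(h, g^h, l)\,\overline{\omega(g, h, l)}\,\overline{\omega(h, l, g^{hl})}$ obtained from the defining relation of $\omega$. All function names are hypothetical, and phases are stored exactly as exponents modulo N.

```python
from itertools import product

N = 3  # order of the cyclic group Z_N (illustrative choice)


def omega_exp(a, b, c):
    """Exponent k of omega(a, b, c) = exp(2*pi*i*k/N); a, b, c in range(N)."""
    return (a * ((b + c) // N)) % N


def structure_exp(g, h, l):
    """Exponent of the scalar in c(g, h) c(g, l) = scalar * c(g, h + l).

    For abelian G we have g^h = g, and the scalar is
    omega(h, g, l) * conj(omega(g, h, l)) * conj(omega(h, l, g)).
    """
    return (omega_exp(h, g, l) - omega_exp(g, h, l) - omega_exp(h, l, g)) % N


def multiply(x, y):
    """Multiply scaled basis elements (phase_exp, g, h); None stands for 0."""
    if x is None or y is None:
        return None
    (p, g, h), (q, k, l) = x, y
    if k != g:  # Kronecker delta in the product rule (g^h = g here)
        return None
    return ((p + q + structure_exp(g, h, l)) % N, g, (h + l) % N)


def is_three_cocycle():
    """Check the 3-cocycle identity for omega on all of Z_N^4."""
    return all(
        (omega_exp(b, c, d) + omega_exp(a, (b + c) % N, d) + omega_exp(a, b, c)
         - omega_exp((a + b) % N, c, d) - omega_exp(a, b, (c + d) % N)) % N == 0
        for a, b, c, d in product(range(N), repeat=4)
    )


def is_associative():
    """Check (xy)z = x(yz) on all triples of basis elements c(g, h)."""
    basis = [(0, g, h) for g in range(N) for h in range(N)]
    return all(
        multiply(multiply(x, y), z) == multiply(x, multiply(y, z))
        for x, y, z in product(basis, repeat=3)
    )
```

Calling `is_three_cocycle()` and `is_associative()` checks the cocycle identity and the associativity of the resulting $N^2$-dimensional algebra; `structure_exp(1, 1, 2)` gives a non-zero exponent, so the twist is visible already for $N = 3$.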
In this section, we directly show the relationship between the half braidings of direct sums of sectors in $\Delta$ and the matrix units of $\mathrm{Tube}\,\Delta$, and consequently show the relationship between the center of $\mathrm{Tube}\,\Delta$ and the sectors generated by $\gamma|_B$ without using TQFT. Let $\iota : B \hookrightarrow A$ be the inclusion map and consider $[\iota]$ to be an A − B sector. Then the conjugate sector of $[\iota]$ is $[\gamma] \in \mathrm{Sect}(B, A)$. $[\iota]$ and $[\bar\iota]$ generate four kinds of sectors, namely A − A, A − B, B − A, and B − B sectors, and we call them sectors associated with $A \supset B$. The induction-reduction graph of the A − A and A − B sectors can be obtained by simple manipulation of the Frobenius reciprocity (cf. Sect. 4 of [7]).

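Theorem 4.1. Let $A \supset B$ be the Longo–Rehren inclusion of $\Delta = \{\rho_\xi\}_{\xi \in \Delta_0}$ and $\iota$ the inclusion map. Then,
(i) Every irreducible A − A sector associated with $A \supset B$ is given by $[\rho_\xi \otimes \rho^{op}_\eta]$ for some $\xi, \eta \in \Delta_0$.
(ii) For every $\xi$, $[\rho_\xi \otimes \mathrm{id}][\iota] = [\mathrm{id} \otimes \rho^{op}_{\bar\xi}][\iota]$ is irreducible. In consequence, we have

$$[\rho_\xi \otimes \rho^{op}_{\bar\eta}][\iota] = \bigoplus_{\zeta} N^{\zeta}_{\xi,\eta}\, [\rho_\zeta \otimes \mathrm{id}][\iota].$$

(iii) Every irreducible A − B sector associated with $A \supset B$ is given by $[\rho_\xi \otimes \mathrm{id}][\iota]$ for some $\xi$. $[\rho_\xi \otimes \mathrm{id}][\iota] = [\rho_\eta \otimes \mathrm{id}][\iota]$ if and only if $\xi = \eta$.

Proof. (i) Since $[\iota\bar\iota] = \bigoplus_\xi [\rho_\xi \otimes \rho^{op}_\xi]$, this is obvious. (ii) Thanks to the Frobenius reciprocity, we have

$$\dim\big((\rho_\xi \otimes \mathrm{id}^{op}) \cdot \iota,\ (\rho_\xi \otimes \mathrm{id}^{op}) \cdot \iota\big) = \dim\big(\bar\rho_\xi\rho_\xi \otimes \mathrm{id}^{op},\ \iota\bar\iota\big) = 1$$

because $\bar\rho_\xi\rho_\xi$ contains the identity only once. Thus $[(\rho_\xi \otimes \mathrm{id}^{op}) \cdot \iota]$ is an irreducible A − B sector. In the same way, we can show that $[(\mathrm{id} \otimes \rho^{op}_{\bar\xi}) \cdot \iota]$ is irreducible too. Again by the Frobenius reciprocity, we get

$$\dim\big((\rho_\xi \otimes \mathrm{id}^{op}) \cdot \iota,\ (\mathrm{id} \otimes \rho^{op}_{\bar\xi}) \cdot \iota\big) = \dim\big(\rho_\xi \otimes \rho^{op}_\xi,\ \iota\bar\iota\big) = 1,$$

which shows $[(\rho_\xi \otimes \mathrm{id}^{op}) \cdot \iota] = [(\mathrm{id} \otimes \rho^{op}_{\bar\xi}) \cdot \iota]$. Using these facts, we obtain

$$[\rho_\xi \otimes \rho^{op}_{\bar\eta}][\iota] = [\rho_\xi \otimes \mathrm{id}^{op}][\mathrm{id} \otimes \rho^{op}_{\bar\eta}][\iota] = [\rho_\xi\rho_\eta \otimes \mathrm{id}][\iota] = \bigoplus_{\zeta} N^{\zeta}_{\xi,\eta}\, [\rho_\zeta \otimes \mathrm{id}][\iota].$$

(iii) The first statement is a direct consequence of (i) and (ii). $[\rho_\xi \otimes \mathrm{id}][\iota] = [\rho_\eta \otimes \mathrm{id}][\iota]$ if and only if

$$1 = \dim\big((\rho_\xi \otimes \mathrm{id}^{op}) \cdot \iota,\ (\rho_\eta \otimes \mathrm{id}^{op}) \cdot \iota\big) = \dim\big(\bar\rho_\eta\rho_\xi \otimes \mathrm{id}^{op},\ \iota\bar\iota\big).$$

Since $\bar\rho_\eta\rho_\xi$ contains the identity if and only if $\xi = \eta$, we get the result.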
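Theorem 4.1 can be cross-checked by counting dimensions. The sketch below (added for illustration) assumes the standard facts that the statistical dimension is multiplicative, that $d(\rho^{op}_\xi) = d(\xi)$, and that $d(\iota)^2 = d(\gamma) = \sum_\xi d(\xi)^2 = \lambda$. Under these assumptions the A − A sectors of (i) and the A − B sectors of (iii) have the same sum of squared dimensions.

```latex
\begin{align*}
\sum_{\xi,\eta} d(\rho_\xi \otimes \rho^{op}_\eta)^2
 &= \Big(\sum_{\xi} d(\xi)^2\Big)\Big(\sum_{\eta} d(\eta)^2\Big) = \lambda^2 , \\
\sum_{\xi} d\big((\rho_\xi \otimes \mathrm{id}) \cdot \iota\big)^2
 &= \sum_{\xi} d(\xi)^2\, d(\iota)^2 = \lambda \cdot \lambda = \lambda^2 .
\end{align*}
```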

Remark. (ii) and (iii) give one of the principal graphs of $A \supset B$.

To determine the B − B sectors and the induction-reduction graph between the A − B sectors and the B − B sectors, we use the following observation due to Xu [35], which is, in fact, the dual form of the induction procedure of Longo–Rehren [24]: if $\rho \in \mathrm{End}(A)_0$ and $U \in A$ is a unitary satisfying

$$U \in (\rho \cdot \gamma,\ \gamma \cdot \rho), \qquad W U = \gamma(U)\, U\, \rho(W), \tag{4.1}$$

where W is as in Subsect. 2.3, then the restriction of $\mathrm{Ad}(U) \cdot \rho$ to B is an endomorphism of B. Indeed, the above two conditions imply

$$\mathrm{Ad}(U) \cdot \rho(\gamma(A)) = \gamma(\rho(A)), \qquad \mathrm{Ad}(U) \cdot \rho(W) = \gamma(U^*) W.$$

Since $B = \gamma(A) W$, we have $\mathrm{Ad}(U) \cdot \rho(B) \subset B$. In [35], Xu used a braiding operator for U. However, we need only the two properties above.

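Definition 4.2. Let $\Delta = \{\rho_\xi\}_{\xi \in \Delta_0}$ be as before.
(i) Let $\sigma$ be a finite direct sum of endomorphisms from sectors in $\Delta$. A system of unitary operators $\{\mathcal{E}_\sigma(\xi)\}_{\xi \in \Delta_0}$ is called a half braiding of $\sigma$ with respect to $\Delta$ if it satisfies the following:
(1) $\mathcal{E}_\sigma(\xi) \in (\sigma\rho_\xi, \rho_\xi\sigma)$.
(2) For every $X \in (\rho_\zeta, \rho_\xi\rho_\eta)$, the following holds:

$$X\, \mathcal{E}_\sigma(\zeta) = \rho_\xi(\mathcal{E}_\sigma(\eta))\, \mathcal{E}_\sigma(\xi)\, \sigma(X).$$

Two half braidings $\{\mathcal{E}_\sigma(\xi)\}_{\xi \in \Delta_0}$ and $\{\mathcal{E}'_\sigma(\xi)\}_{\xi \in \Delta_0}$ are equivalent if there exists a unitary $u \in (\sigma, \sigma)$ such that $\mathcal{E}'_\sigma(\xi) = \rho_\xi(u)\, \mathcal{E}_\sigma(\xi)\, u^*$ holds for all $\xi \in \Delta_0$.
(ii) A system of unitary operators $\{\mathcal{E}(\xi, \eta)\}_{\xi,\eta \in \Delta_0}$ is a braiding of $\Delta$ if $\{\mathcal{E}(\xi, \eta)\}_{\eta \in \Delta_0}$ and $\{\mathcal{E}(\eta, \xi)^*\}_{\eta \in \Delta_0}$ are half braidings of $\rho_\xi$ with respect to $\Delta$ for each fixed $\xi$.

Remark. (i) The second condition in (i) is called the braiding fusion equation. Readers are encouraged to draw diagrams as in [35] when we use this condition. (ii) Let $\mu$ be a finite direct sum of sectors in $\Delta$ and $\{X(\xi)_i\}$ an orthonormal basis of $(\rho_\xi, \mu)$. For a half braiding $\{\mathcal{E}_\sigma(\xi)\}$ of $\sigma$ with respect to $\Delta$, we set

$$\mathcal{E}_\sigma(\mu) = \sum_{\xi, i} X(\xi)_i\, \mathcal{E}_\sigma(\xi)\, \sigma(X(\xi)_i^{*}).$$

Then, the two conditions of the definition of half braidings hold with respect to arbitrary finite direct sums of sectors in $\Delta$ instead of the irreducibles $\rho_\xi$, $\rho_\eta$, and $\rho_\zeta$ above. Moreover, it is routine to show the following for finite direct sums $\mu_1$, $\mu_2$ of sectors in $\Delta$:

$$\mathcal{E}_\sigma(\mu_1\mu_2) = \mu_1(\mathcal{E}_\sigma(\mu_2))\, \mathcal{E}_\sigma(\mu_1).$$

In the same way, starting with the braiding of irreducibles in the above sense, we can construct, in a natural way, a braiding of the category generated by $\Delta$.

For a half braiding $\{\mathcal{E}_\sigma(\xi)\}_{\xi \in \Delta_0}$, we define unitaries $U(\sigma, \mathcal{E}_\sigma) \in A$ and $U(\sigma^{op}, \mathcal{E}_\sigma) \in A$ by

$$U(\sigma, \mathcal{E}_\sigma) = \sum_{\xi} V_\xi\, (\mathcal{E}_\sigma(\xi) \otimes 1)\, (\sigma \otimes \mathrm{id}^{op})(V_\xi^*),$$
$$U(\sigma^{op}, \mathcal{E}_\sigma) = \sum_{\xi} V_\xi\, (1 \otimes j(\mathcal{E}_\sigma(\xi)))\, (\mathrm{id} \otimes \sigma^{op})(V_\xi^*).$$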
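To make Definition 4.2 concrete, consider again the group case with $G = \mathbb{Z}_N$. The following is a minimal sketch under explicit assumptions that are not in the text: the 3-cocycle is trivial and the lift is chosen with $\alpha_g\alpha_h = \alpha_{g+h}$, so every intertwiner space involved is one-dimensional and a candidate half braiding of a fixed $\alpha_g$ is a family of phases $\mathcal{E}(h)$. The braiding fusion equation then reduces to $\mathcal{E}(h + k) = \mathcal{E}(h)\mathcal{E}(k)$, which forces $\mathcal{E}(h)^N = 1$, so it suffices to search over N-th roots of unity, stored as exponents. The function name is hypothetical.

```python
from itertools import product

N = 3  # order of the cyclic group Z_N (illustrative choice)


def half_braidings():
    """Enumerate solutions chi of the reduced braiding fusion equation.

    chi[h] is the exponent of the N-th root of unity E(h); the condition
    E(h + k) = E(h) E(k) becomes chi[h + k] = chi[h] + chi[k] modulo N.
    """
    solutions = []
    for chi in product(range(N), repeat=N):
        if all(chi[(h + k) % N] == (chi[h] + chi[k]) % N
               for h in range(N) for k in range(N)):
            solutions.append(chi)
    return solutions
```

The solutions are exactly the N characters of $\mathbb{Z}_N$. Since $(\alpha_g, \alpha_g)$ consists of scalars, distinct solutions are inequivalent, so under these assumptions one finds $N^2$ pairs (g, character) in total.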

Lemma 4.3. U (σ, Eσ ) and U (σ op , Eσ ) satisfy (4.1) for ρ = σ ⊗ id op and ρ = id ⊗ σ op respectively.

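Proof. It suffices to show the statement only for $U(\sigma, \mathcal{E}_\sigma)$ because that for $U(\sigma^{op}, \mathcal{E}_\sigma)$ can be shown in the same way. It is easy to show that $U(\sigma, \mathcal{E}_\sigma)$ is a unitary belonging to $(\rho \cdot \gamma, \gamma \cdot \rho)$. Using the braiding fusion equation, we get

$$\begin{aligned}
W\, U(\sigma, \mathcal{E}_\sigma)
 &= \sum_{\xi,\eta,\zeta} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; \gamma(V_\eta)\, V_\xi\, T^{\zeta}_{\xi,\eta}\, (\mathcal{E}_\sigma(\zeta) \otimes 1)\, (\sigma \otimes \mathrm{id}^{op})(V_\zeta^*) \\
 &= \sum_{\xi,\eta,\zeta} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; \gamma(V_\eta)\, V_\xi\, \big(\rho_\xi(\mathcal{E}_\sigma(\eta))\, \mathcal{E}_\sigma(\xi) \otimes 1\big)\, (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta} V_\zeta^*).
\end{aligned}$$

On the other hand, we have

$$\begin{aligned}
\gamma(U(\sigma, \mathcal{E}_\sigma))\, U(\sigma, \mathcal{E}_\sigma)\, (\sigma \otimes \mathrm{id}^{op})(W)
 &= \sum_{\eta} \gamma\big(V_\eta (\mathcal{E}_\sigma(\eta) \otimes 1)(\sigma \otimes \mathrm{id}^{op})(V_\eta^*)\big)\, U(\sigma, \mathcal{E}_\sigma)\, (\sigma \otimes \mathrm{id}^{op})(W) \\
 &= \sum_{\eta} \gamma\big(V_\eta (\mathcal{E}_\sigma(\eta) \otimes 1)\big)\, U(\sigma, \mathcal{E}_\sigma)\, (\sigma \otimes \mathrm{id}^{op})(\gamma(V_\eta^*) W) \\
 &= \sum_{\xi,\eta,\zeta} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; \gamma\big(V_\eta (\mathcal{E}_\sigma(\eta) \otimes 1)\big)\, U(\sigma, \mathcal{E}_\sigma)\, (\sigma \otimes \mathrm{id}^{op})(V_\xi T^{\zeta}_{\xi,\eta} V_\zeta^*) \\
 &= \sum_{\xi,\eta,\zeta} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; \gamma(V_\eta)\, V_\xi\, \big(\rho_\xi(\mathcal{E}_\sigma(\eta))\, \mathcal{E}_\sigma(\xi) \otimes 1\big)\, (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta} V_\zeta^*).
\end{aligned}$$

Therefore the statement holds.

Definition 4.4. Let $\Delta$ be as above.
(i) For a finite direct sum $\sigma$ of sectors in $\Delta$ and a half braiding $\mathcal{E}_\sigma$ of $\sigma$, we define $\widetilde{(\sigma, \mathcal{E}_\sigma)} \in \mathrm{End}(B)$ by the restriction of $\mathrm{Ad}(U(\sigma, \mathcal{E}_\sigma)) \cdot (\sigma \otimes \mathrm{id}^{op})$ to B. Note that $\widetilde{(\sigma, \mathcal{E}_\sigma)} \in \mathrm{End}(B)$ is characterized by

$$\widetilde{(\sigma, \mathcal{E}_\sigma)}(\gamma(x)) = \gamma\big((\sigma \otimes \mathrm{id}^{op})(x)\big), \quad x \in A,$$
$$\widetilde{(\sigma, \mathcal{E}_\sigma)}(W) = \gamma\big(U(\sigma, \mathcal{E}_\sigma)^*\big)\, W.$$

We also define $\widetilde{(\sigma^{op}, \mathcal{E}_\sigma)}$ in an obvious way. In general, $\sigma$ might have several half braidings. To distinguish them, we put a parameter $\alpha$ as $\{\mathcal{E}^{\alpha}_\sigma(\xi)\}_{\xi \in \Delta_0}$. For simplicity, we sometimes use the following notation:

$$U^{\alpha}(\sigma) = U(\sigma, \mathcal{E}^{\alpha}_\sigma), \qquad \tilde\sigma^{\alpha} = \widetilde{(\sigma, \mathcal{E}^{\alpha}_\sigma)},$$
$$U^{\alpha}(\sigma^{op}) = U(\sigma^{op}, \mathcal{E}^{\alpha}_\sigma), \qquad \tilde\sigma^{op,\alpha} = \widetilde{(\sigma^{op}, \mathcal{E}^{\alpha}_\sigma)}.$$

(ii) We define $D(\Delta)$ to be the set of endomorphisms $\rho \in \mathrm{End}(B)_0$ such that $[\iota][\rho]$ is a finite direct sum of sectors in $\{[\rho_\xi \otimes \mathrm{id}^{op}][\iota]\}_{\xi \in \Delta_0}$. We call $D(\Delta)$ the quantum double of $\Delta$.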

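By construction, every $\widetilde{(\sigma, \mathcal{E}_\sigma)}$ belongs to $D(\Delta)$. We show the converse.

Lemma 4.5. Let $D(\Delta)$ be as above. Then,
(i) $D(\Delta)$ is closed under the sector operations, i.e. sum, product, decomposition, and conjugation.
(ii) For every $\rho \in D(\Delta)$, there exist $\sigma \in \mathrm{End}(M)_0$ that is a finite direct sum of sectors in $\Delta$ and a half braiding $\{\mathcal{E}_\sigma(\xi)\}_{\xi \in \Delta_0}$ of $\sigma$ with respect to $\Delta$ such that $[\rho] = [\widetilde{(\sigma, \mathcal{E}_\sigma)}]$.

Proof. (i) It is easy to see that $D(\Delta)$ is closed under the first three operations. Let $[\rho] \in D(\Delta)$. By definition, we may assume $[\iota][\rho] = \bigoplus_\xi n_\xi [\rho_\xi \otimes \mathrm{id}^{op}][\iota]$, where $n_\xi$ is a non-negative integer. The Frobenius reciprocity implies

$$n_\xi = \dim\big((\rho_\xi \otimes \mathrm{id}^{op}) \cdot \iota,\ \iota \cdot \rho\big) = \dim\big(\iota \cdot \bar\rho,\ (\rho_{\bar\xi} \otimes \mathrm{id}^{op}) \cdot \iota\big).$$

By comparing dimensions, we obtain $[\iota][\bar\rho] = \bigoplus_\xi n_\xi [(\rho_{\bar\xi} \otimes \mathrm{id}^{op})][\iota]$, which shows $[\bar\rho] \in D(\Delta)$. (ii) Since $\bar\rho \in D(\Delta)$, we have

$$[\rho][\bar\iota] = \bigoplus_{\xi} n_\xi\, [\bar\iota][(\rho_\xi \otimes \mathrm{id}^{op})].$$

We set $\sigma = \bigoplus_\xi n_\xi \rho_\xi$. Then, $[\rho][\bar\iota] = [\bar\iota][(\sigma \otimes \mathrm{id}^{op})]$. Therefore, perturbing $\rho$ by an inner automorphism of B if necessary, we may and do assume $\rho \cdot \gamma = \gamma \cdot (\sigma \otimes \mathrm{id}^{op})$. Note that $B = \gamma(A) W$ and for each $x \in B$ there exists a unique element $y \in A$ such that $x = \gamma(y) W$. Indeed, if $x = \gamma(y) W$,

$$V^* x = V^* \gamma(y) W = y V^* W = \frac{1}{\sqrt{\lambda}}\, y.$$

So there exists a unique element $U \in A$ satisfying $\rho(W) = \gamma(U^*) W$. Since U is given by $\sqrt{\lambda}\, \rho(W^*) V$, we have $U \in ((\sigma \otimes \mathrm{id}^{op}) \cdot \gamma,\ \gamma \cdot (\sigma \otimes \mathrm{id}^{op}))$, and so

$$V_\eta^*\, U\, (\sigma \otimes \mathrm{id}^{op})(V_\xi) \in \big((\sigma \cdot \rho_\xi \otimes \rho^{op}_\xi),\ (\rho_\eta \cdot \sigma \otimes \rho^{op}_\eta)\big).$$

Note that $((\sigma \cdot \rho_\xi \otimes \rho^{op}_\xi), (\rho_\eta \cdot \sigma \otimes \rho^{op}_\eta))$ is non-trivial only if $\xi = \eta$. Thus, there exists $u(\xi) \in (\sigma \cdot \rho_\xi, \rho_\xi \cdot \sigma)$ such that

$$U = \sum_{\xi \in \Delta_0} V_\xi\, (u(\xi) \otimes 1)\, (\sigma \otimes \mathrm{id}^{op})(V_\xi^*).$$

We show that $\{u(\xi)\}_{\xi \in \Delta_0}$ is a half braiding of $\sigma$ with respect to $\Delta$. First we show that $u(\xi)$ is a unitary. $1/\sqrt{\lambda} = W^* \gamma(V)$ implies

$$\frac{1}{\sqrt{\lambda}} = \rho(W^*)\, \rho(\gamma(V)) = W^* \gamma\big(U (\sigma \otimes \mathrm{id}^{op})(V)\big) = W^* \gamma\big(V (u(e) \otimes 1)\big) = \frac{1}{\sqrt{\lambda}}\, \gamma(u(e) \otimes 1),$$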

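and so $u(e) = 1$. Since we have

$$\begin{aligned}
\rho(\gamma(W^*) W) &= \gamma\big((\sigma \otimes \mathrm{id}^{op})(W^*)\, U^*\big) W, \\
\rho(W W^*) &= \gamma(U^*)\, W W^*\, \gamma(U) = \gamma\big(U^* W^* \gamma(U)\big) W, \\
\rho(\gamma(W) W) &= \gamma\big((\sigma \otimes \mathrm{id}^{op})(W)\, U^*\big) W, \\
\rho(W^2) &= \gamma(U^*)\, W\, \gamma(U^*)\, W = \gamma\big(U^* \gamma(U^*) W\big) W,
\end{aligned}$$

$\gamma(W^*) W = W W^*$ and $\gamma(W) W = W^2$ imply

$$(\sigma \otimes \mathrm{id}^{op})(W^*)\, U^* = U^* W^* \gamma(U), \qquad (\sigma \otimes \mathrm{id}^{op})(W)\, U^* = U^* \gamma(U^*) W.$$

Therefore, we get $U^* U\, (\sigma \otimes \mathrm{id}^{op})(W) = (\sigma \otimes \mathrm{id}^{op})(W)\, U^* U$, and so

$$(\sigma \otimes \mathrm{id}^{op})(V_\xi^*)\, U^* U\, (\sigma \otimes \mathrm{id}^{op})(W)\, (\sigma \otimes \mathrm{id}^{op})(V_\zeta) = (\sigma \otimes \mathrm{id}^{op})(V_\xi^*)\, (\sigma \otimes \mathrm{id}^{op})(W)\, U^* U\, (\sigma \otimes \mathrm{id}^{op})(V_\zeta).$$

The left-hand side is

$$\begin{aligned}
(u(\xi)^* u(\xi) \otimes 1)\, (\sigma \otimes \mathrm{id}^{op})(V_\xi^* W V_\zeta)
 &= \sum_{\eta \in \Delta_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; (u(\xi)^* u(\xi) \otimes 1)\, (\sigma \cdot \rho_\xi \otimes \rho^{op}_\xi)(V_\eta)\, (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta}) \\
 &= \sum_{\eta \in \Delta_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; (\sigma \cdot \rho_\xi \otimes \rho^{op}_\xi)(V_\eta)\, (u(\xi)^* u(\xi) \otimes 1)\, (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta}),
\end{aligned}$$

where we use $u(\xi)^* u(\xi) \in (\sigma \cdot \rho_\xi, \sigma \cdot \rho_\xi)$. On the other hand, the right-hand side is

$$(\sigma \otimes \mathrm{id}^{op})(V_\xi^* W V_\zeta)\, (u(\zeta)^* u(\zeta) \otimes 1) = \sum_{\eta \in \Delta_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; (\sigma \cdot \rho_\xi \otimes \rho^{op}_\xi)(V_\eta)\, (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta})\, (u(\zeta)^* u(\zeta) \otimes 1).$$

So we have

$$(u(\xi)^* u(\xi) \otimes 1)\, (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta}) = (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta})\, (u(\zeta)^* u(\zeta) \otimes 1).$$

Setting $\xi = e$, $\eta = \zeta$, we get $u(\eta)^* u(\eta) = 1$, which shows that U is an isometry. $1 = W^* W$ implies $1 = W^* \gamma(U U^*) W = E_\Delta(U U^*)$, where $E_\Delta$ is the conditional expectation from A to B. Therefore, U is unitary and so is $u(\xi)$ for every $\xi \in \Delta_0$.

What remains is to show that $\{u(\xi)\}_{\xi \in \Delta_0}$ satisfies the braiding fusion equation. To do so we compute both sides of

$$(\sigma \otimes \mathrm{id}^{op})(V_\xi)^*\, (\sigma \otimes \mathrm{id}^{op})(W)\, U^* V_\zeta = (\sigma \otimes \mathrm{id}^{op})(V_\xi)^*\, U^* \gamma(U^*)\, W V_\zeta.$$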

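The left-hand side is

$$(\sigma \otimes \mathrm{id}^{op})(V_\xi^* W)\, (\sigma \otimes \mathrm{id}^{op})(V_\zeta)\, (u(\zeta)^* \otimes 1) = \sum_{\eta \in \Delta_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; (\sigma \cdot \rho_\xi \otimes \rho^{op}_\xi)(V_\eta)\, (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta})\, (u(\zeta)^* \otimes 1).$$

The right-hand side is

$$\begin{aligned}
(u(\xi)^* \otimes 1)\, (\rho_\xi \otimes \rho^{op}_\xi)(U^*)\, V_\xi^* W V_\zeta
 &= \sum_{\eta \in \Delta_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; (u(\xi)^* \otimes 1)\, (\rho_\xi \otimes \rho^{op}_\xi)(U^* V_\eta)\, T^{\zeta}_{\xi,\eta} \\
 &= \sum_{\eta \in \Delta_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; (u(\xi)^* \otimes 1)\, (\rho_\xi \cdot \sigma \otimes \rho^{op}_\xi)(V_\eta)\, (\rho_\xi(u(\eta)^*) \otimes 1)\, T^{\zeta}_{\xi,\eta} \\
 &= \sum_{\eta \in \Delta_0} \sqrt{\frac{d(\xi)d(\eta)}{\lambda\, d(\zeta)}}\; (\sigma \cdot \rho_\xi \otimes \rho^{op}_\xi)(V_\eta)\, (u(\xi)^* \rho_\xi(u(\eta)^*) \otimes 1)\, T^{\zeta}_{\xi,\eta},
\end{aligned}$$

where we use $u(\xi) \in (\sigma \cdot \rho_\xi, \rho_\xi \cdot \sigma)$. Thus, we get

$$T^{\zeta}_{\xi,\eta}\, (u(\zeta) \otimes 1) = (\rho_\xi(u(\eta))\, u(\xi) \otimes 1)\, (\sigma \otimes \mathrm{id}^{op})(T^{\zeta}_{\xi,\eta}).$$

Multiplying by $1 \otimes j(T({}^{\zeta}_{\xi,\eta})_i^{*})$ from the left, we obtain

$$T({}^{\zeta}_{\xi,\eta})_i\, u(\zeta) = \rho_\xi(u(\eta))\, u(\xi)\, \sigma\big(T({}^{\zeta}_{\xi,\eta})_i\big),$$

which shows that $\{u(\xi)\}_{\xi \in \Delta_0}$ is a half braiding.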

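The following is one of the main results in this paper:

Theorem 4.6. Let $A \supset B$ be the Longo–Rehren inclusion for $\Delta = \{\rho_\xi\}_{\xi \in \Delta_0}$.
(i) For every B − B sector $[\rho]$ associated with $A \supset B$, there exist $\sigma \in \mathrm{End}(M)_0$ that is a finite direct sum of sectors in $\Delta$ and a half braiding $\{\mathcal{E}_\sigma(\xi)\}_{\xi \in \Delta_0}$ of $\sigma$ with respect to $\Delta$ such that $[\rho] = [\widetilde{(\sigma, \mathcal{E}_\sigma)}]$.
(ii) Let $\sigma$, $\mu$ be finite direct sums of sectors in $\Delta$ and $\{\mathcal{E}^{\alpha}_\sigma(\xi)\}_{\xi \in \Delta_0}$, $\{\mathcal{E}^{\beta}_\mu(\xi)\}_{\xi \in \Delta_0}$ half braidings of $\sigma$ and $\mu$ respectively. Then,

$$(\tilde\sigma^{\alpha}, \tilde\mu^{\beta}) = \{\gamma(X \otimes 1);\ X \in (\sigma, \mu),\ \mathcal{E}^{\beta}_\mu(\xi) X = \rho_\xi(X)\, \mathcal{E}^{\alpha}_\sigma(\xi),\ \forall \xi \in \Delta_0\}.$$

In particular, $[\tilde\sigma^{\alpha}] = [\tilde\mu^{\beta}]$ only if $[\sigma] = [\mu]$. When $\mu = \sigma$, $[\tilde\sigma^{\alpha}] = [\tilde\sigma^{\beta}]$ if and only if $\{\mathcal{E}^{\alpha}_\sigma(\xi)\}_{\xi \in \Delta_0}$ and $\{\mathcal{E}^{\beta}_\sigma(\xi)\}_{\xi \in \Delta_0}$ are equivalent.
(iii) Let $\tilde\sigma^{\alpha}$, $\tilde\mu^{\beta}$ be as above and set $\mathcal{E}(\tilde\sigma^{\alpha}, \tilde\mu^{\beta}) = \gamma(\mathcal{E}^{\alpha}_\sigma(\mu) \otimes 1)$. Then, $\{\mathcal{E}(\tilde\sigma^{\alpha}, \tilde\mu^{\beta})\}$ is a braiding of $D(\Delta)$.
(iv) For $\sigma$ as above, we set

$$\mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi) = d(\sigma)\, R_\sigma^{*}\, \bar\sigma\big(\mathcal{E}^{\alpha}_\sigma(\xi)^*\, \rho_\xi(\bar R_\sigma)\big).$$

Then, $\{\mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi)\}_{\xi \in \Delta_0}$ is a half braiding of $\bar\sigma$ such that $[\overline{\tilde\sigma^{\alpha}}] = [\tilde{\bar\sigma}^{\bar\alpha}]$. When $\sigma$ is not irreducible, the above definition of $\mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi)$ does depend on the choice of $R_\sigma$, $\bar R_\sigma$; however, a different choice of them only amounts to an equivalent half braiding.

Proof. (i) If $[\rho]$ is a B − B sector associated with $A \supset B$, we have $\rho \in D(\Delta)$ thanks to Theorem 4.1. Therefore, this follows from the previous lemma. (ii) Let $X \in B$. Then, there exists $Y \in A$ such that $X = \gamma(Y) W$. Y is given by $Y = \sqrt{\lambda}\, V^* X$. By the definition of $\tilde\sigma^{\alpha}$ and $\tilde\mu^{\beta}$, we can see that X belongs to $(\tilde\sigma^{\alpha}, \tilde\mu^{\beta})$ if and only if X satisfies the following two conditions:

$$X \in \big(\gamma \cdot (\sigma \otimes \mathrm{id}^{op}),\ \gamma \cdot (\mu \otimes \mathrm{id}^{op})\big), \qquad X\, \tilde\sigma^{\alpha}(W) = \tilde\mu^{\beta}(W)\, X.$$

It is easy to show that the first condition is equivalent to $Y \in (\gamma \cdot (\sigma \otimes \mathrm{id}^{op}),\ \mu \otimes \mathrm{id}^{op})$. Therefore, if this is the case, $Y V_\xi \in (\rho_\xi \cdot \sigma \otimes \rho^{op}_\xi,\ \mu \otimes \mathrm{id}^{op})$ is non-zero only if $\xi = e$. Since $(\sigma \otimes \mathrm{id}^{op}, \mu \otimes \mathrm{id}^{op}) = (\sigma, \mu) \otimes \mathbb{C}$, there exists $Z \in (\sigma, \mu)$ such that $Y = (Z \otimes 1) V^*$, and so

$$X = \gamma\big((Z \otimes 1) V^*\big) W = \frac{1}{\sqrt{\lambda}}\, \gamma(Z \otimes 1).$$

For such X, the second condition is equivalent to

$$\gamma\big((Z \otimes 1)\, U^{\alpha}(\sigma)^*\big) W = \gamma\big(U^{\beta}(\mu)^*\big)\, W\, \gamma(Z \otimes 1) = \gamma\big(U^{\beta}(\mu)^*\, \gamma(Z \otimes 1)\big) W,$$

and so equivalent to

$$\gamma(Z \otimes 1)\, U^{\alpha}(\sigma) = U^{\beta}(\mu)\, (Z \otimes 1).$$

It is routine to show that this is equivalent to $\rho_\xi(Z)\, \mathcal{E}^{\alpha}_\sigma(\xi) = \mathcal{E}^{\beta}_\mu(\xi)\, Z$ for all $\xi \in \Delta_0$. (iii) Let $\mathcal{E}^{\alpha\beta}_{\sigma\cdot\mu}(\xi) = \mathcal{E}^{\alpha}_\sigma(\xi)\, \sigma(\mathcal{E}^{\beta}_\mu(\xi))$. Then, by direct computation we can show that $\{\mathcal{E}^{\alpha\beta}_{\sigma\cdot\mu}(\xi)\}_{\xi \in \Delta_0}$ is a half braiding of $\sigma \cdot \mu$ and

$$\tilde\sigma^{\alpha} \cdot \tilde\mu^{\beta} = \widetilde{(\sigma \cdot \mu)}^{\alpha\beta}.$$

Thanks to (ii), we have

$$(\tilde\sigma^{\alpha} \cdot \tilde\mu^{\beta},\ \tilde\mu^{\beta} \cdot \tilde\sigma^{\alpha}) = \{\gamma(X \otimes 1);\ X \in (\sigma \cdot \mu, \mu \cdot \sigma),\ \mathcal{E}^{\beta}_\mu(\xi)\, \mu(\mathcal{E}^{\alpha}_\sigma(\xi))\, X = \rho_\xi(X)\, \mathcal{E}^{\alpha}_\sigma(\xi)\, \sigma(\mathcal{E}^{\beta}_\mu(\xi)),\ \forall \xi \in \Delta_0\}.$$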

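Therefore, to show that $\gamma(\mathcal{E}^{\alpha}_\sigma(\mu) \otimes 1) \in (\tilde\sigma^{\alpha} \cdot \tilde\mu^{\beta},\ \tilde\mu^{\beta} \cdot \tilde\sigma^{\alpha})$, it suffices to show

$$\mathcal{E}^{\beta}_\mu(\xi)\, \mu(\mathcal{E}^{\alpha}_\sigma(\xi))\, \mathcal{E}^{\alpha}_\sigma(\mu) = \rho_\xi(\mathcal{E}^{\alpha}_\sigma(\mu))\, \mathcal{E}^{\alpha}_\sigma(\xi)\, \sigma(\mathcal{E}^{\beta}_\mu(\xi)).$$

This can be shown by the braiding fusion equation of the half braiding of $\sigma$. A similar argument using

$$(\tilde\nu^{\delta},\ \tilde\sigma^{\alpha} \cdot \tilde\mu^{\beta}) = \{\gamma(X \otimes 1);\ X \in (\nu, \sigma \cdot \mu),\ \mathcal{E}^{\alpha}_\sigma(\xi)\, \sigma(\mathcal{E}^{\beta}_\mu(\xi))\, X = \rho_\xi(X)\, \mathcal{E}^{\delta}_\nu(\xi),\ \forall \xi \in \Delta_0\}$$

shows the braiding fusion equation of $\{\mathcal{E}(\tilde\sigma^{\alpha}, \tilde\mu^{\beta})\}$. (iv) The proof of Lemma 4.4, (i) shows that there exists a half braiding $\{\mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi)\}_{\xi \in \Delta_0}$ such that $[\overline{\tilde\sigma^{\alpha}}] = [\tilde{\bar\sigma}^{\bar\alpha}]$. Let

$$\tilde R \in (\mathrm{id}_B,\ \tilde{\bar\sigma}^{\bar\alpha} \cdot \tilde\sigma^{\alpha}), \qquad \bar{\tilde R} \in (\mathrm{id}_B,\ \tilde\sigma^{\alpha} \cdot \tilde{\bar\sigma}^{\bar\alpha})$$

be isometries satisfying

$$\bar{\tilde R}^{*}\, \tilde\sigma^{\alpha}(\tilde R) = \tilde R^{*}\, \tilde{\bar\sigma}^{\bar\alpha}(\bar{\tilde R}) = \frac{1}{d(\sigma)}.$$

Thanks to (ii), we have

$$(\mathrm{id}_B,\ \tilde\sigma^{\alpha} \cdot \tilde{\bar\sigma}^{\bar\alpha}) = \{\gamma(X \otimes 1);\ X \in (\mathrm{id}, \sigma \cdot \bar\sigma),\ \mathcal{E}^{\alpha}_\sigma(\xi)\, \sigma(\mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi))\, X = \rho_\xi(X),\ \forall \xi \in \Delta_0\},$$
$$(\mathrm{id}_B,\ \tilde{\bar\sigma}^{\bar\alpha} \cdot \tilde\sigma^{\alpha}) = \{\gamma(X \otimes 1);\ X \in (\mathrm{id}, \bar\sigma \cdot \sigma),\ \mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi)\, \bar\sigma(\mathcal{E}^{\alpha}_\sigma(\xi))\, X = \rho_\xi(X),\ \forall \xi \in \Delta_0\}.$$

Therefore, there exist isometries $R_\sigma \in (\mathrm{id}, \bar\sigma \cdot \sigma)$, $\bar R_\sigma \in (\mathrm{id}, \sigma \cdot \bar\sigma)$ satisfying

$$R_\sigma^{*}\, \bar\sigma(\bar R_\sigma) = \bar R_\sigma^{*}\, \sigma(R_\sigma) = \frac{1}{d(\sigma)},$$
$$\mathcal{E}^{\alpha}_\sigma(\xi)\, \sigma\big(\mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi)\big)\, \bar R_\sigma = \rho_\xi(\bar R_\sigma), \qquad \mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi)\, \bar\sigma\big(\mathcal{E}^{\alpha}_\sigma(\xi)\big)\, R_\sigma = \rho_\xi(R_\sigma),$$

such that $\tilde R = \gamma(R_\sigma \otimes 1)$, $\bar{\tilde R} = \gamma(\bar R_\sigma \otimes 1)$. Thus we get

$$d(\sigma)\, R_\sigma^{*}\, \bar\sigma\big(\mathcal{E}^{\alpha}_\sigma(\xi)^*\, \rho_\xi(\bar R_\sigma)\big) = d(\sigma)\, R_\sigma^{*}\, \bar\sigma\big(\sigma(\mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi))\, \bar R_\sigma\big) = d(\sigma)\, \mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi)\, R_\sigma^{*}\, \bar\sigma(\bar R_\sigma) = \mathcal{E}^{\bar\alpha}_{\bar\sigma}(\xi).$$

In general, the pair $R_\sigma$ and $\bar R_\sigma$ satisfying (2.1) is not unique. However, if $R'_\sigma$ and $\bar R'_\sigma$ also satisfy the relation, there exists a unitary $u \in (\bar\sigma, \bar\sigma)$ such that $R'_\sigma = u R_\sigma$, $\bar R'_\sigma = \sigma(u) \bar R_\sigma$. This change only amounts to an equivalent half braiding of $\bar\sigma$.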

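Remark. (i) The set of $B$-$B$ sectors associated with $A \supset B$ may be smaller than $D(\Delta)$ in general. A typical example is the one discussed at the end of Sect. 3. In this case, since $[\gamma] = \bigoplus_g [\alpha_g \otimes \alpha_g^{\mathrm{op}}]$, $B$ is the fixed point subalgebra of $A$ under a $G$-action $\{\alpha_g \otimes \alpha_g^{\mathrm{op}}\}_{g\in G}$ with an appropriate unitary perturbation [13]. (Note that even if the class $[\omega]$ is not trivial, the obstruction of $\{\alpha_g \otimes \alpha_g^{\mathrm{op}}\}$ disappears.) Thus the category of $B$-$B$ sectors associated with the inclusion is isomorphic to the dual of $G$, while $D(\Delta)$ is isomorphic to the (twisted) quantum double of $G$. Therefore, strictly speaking, the right object we should deal with is $D(\Delta)$ rather than the $B$-$B$ sectors contained in some powers of the restriction $\gamma|_B$ of the canonical endomorphism to $B$.

(ii) If there exists a braiding $\{\mathcal{E}(\xi,\eta)\}$ such that $\mathcal{E}_\sigma^\alpha(\xi) = \mathcal{E}(\sigma,\xi)$, we have $\mathcal{E}_{\bar\sigma}^{\bar\alpha}(\xi) = \mathcal{E}(\bar\sigma,\xi)$. However, in general, even when $\sigma$ is self-conjugate, $\mathcal{E}_\sigma^\alpha$ and $\mathcal{E}_{\bar\sigma}^{\bar\alpha}$ are not necessarily equivalent and $\tilde\sigma^\alpha$ may not be self-conjugate.

Now, we investigate the relationship between $\mathrm{Tube}\,\Delta$ and $B$-$B$ sectors. Let $[\sigma] = \bigoplus_\xi n_\xi[\rho_\xi]$ and let $\{\mathcal{E}_\sigma^\alpha(\xi)\}_{\xi\in\Delta_0}$ be a half braiding of $\sigma$. We assume that $\tilde\sigma^\alpha$ is irreducible. We fix an orthonormal basis $\{W_\sigma(\xi)_i\}_{i=1}^{n_\xi} \subset (\rho_\xi, \sigma)$ and set

$$\mathcal{E}_\sigma^\alpha(\xi)_{(\eta,i),(\zeta,j)} = \rho_\xi(W_\sigma(\zeta)_j^{*})\,\mathcal{E}_\sigma^\alpha(\xi)\,W_\sigma(\eta)_i \in (\rho_\eta\cdot\rho_\xi, \rho_\xi\cdot\rho_\zeta).$$

Then, we have

$$\mathcal{E}_\sigma^\alpha(\xi) = \sum_{\eta,\zeta}\sum_{i=1}^{n_\eta}\sum_{j=1}^{n_\zeta} \rho_\xi(W_\sigma(\zeta)_j)\,\mathcal{E}_\sigma^\alpha(\xi)_{(\eta,i),(\zeta,j)}\,W_\sigma(\eta)_i^{*}.$$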

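In [8], Evans and Kawahigashi constructed central elements of $\mathrm{Tube}\,\Delta$ from braidings. Generalizing their argument, we define

$$e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)} := \frac{d(\sigma)}{\lambda\sqrt{d(\eta)d(\zeta)}} \sum_{\xi} d(\xi)\,(\eta\xi|\mathcal{E}_\sigma^\alpha(\xi)_{(\eta,i),(\zeta,j)}|\xi\zeta) \in \mathrm{Tube}\,\Delta.$$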

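Lemma 4.7. Let $e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}$ be as above and $X \in (\rho_\mu\cdot\rho_\tau, \rho_\tau\cdot\rho_\nu)$. Then, the following hold:

(i) $e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}{}^{*} = e(\tilde\sigma^\alpha)_{(\zeta,j),(\eta,i)}$.

(ii) $$e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}\,(\mu\tau|X|\tau\nu) = \delta_{\zeta,\mu}\frac{d(\tau)\sqrt{d(\nu)}}{\sqrt{d(\zeta)}} \sum_{k} \phi_\tau\big(X\,\mathcal{E}_\sigma^\alpha(\tau)_{(\zeta,j),(\nu,k)}{}^{*}\big)\, e(\tilde\sigma^\alpha)_{(\eta,i),(\nu,k)}.$$

(iii) $$(\mu\tau|X|\tau\nu)\,e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)} = \delta_{\eta,\nu}\frac{d(\tau)\sqrt{d(\mu)}}{\sqrt{d(\eta)}} \sum_{k} \phi_\mu\big(\mathcal{E}_\sigma^\alpha(\tau)_{(\mu,k),(\eta,i)}{}^{*}\,X\big)\, e(\tilde\sigma^\alpha)_{(\mu,k),(\zeta,j)}.$$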

Structure of Sectors Associated with Longo–Rehren Inclusions I


Proof. (i) Applying the definition of the ∗-operation of Tube , we have d(σ ) e( σ α )(η,i),(ζ,j ) ∗ = √ λ d(η)d(ζ ) ∗ d(ξ )2 (ζ ξ |ρξ (ρη (R ξ )Wσ (η)∗i Eσα (ξ )∗ ρξ (Wσ (ζ )j ))Rξ |ξ η) · ξ

d(σ ) = √ λ d(η)d(ζ ) ∗ · d(ξ )2 (ζ ξ |ρξ (Wσ (η)∗i σ (R ξ )Eσα (ξ )∗ )Rξ Wσ (ζ )j |ξ η). ξ

Using the braiding fusion equation, we get Eσα (ξ )σ (R ξ ) = ρξ (Eσα (ξ )∗ )R ξ , and so ∗

ρξ (Wσ (η)∗i σ (R ξ )Eσα (ξ )∗ )Rξ Wσ (ζ )j ∗

= ρξ (Wσ (η)∗i R ξ ρξ (Eσα (ξ )))Rξ Wσ (ζ )j ∗

= ρξ (Wσ (η)∗i R ξ )Rξ Eσα (ξ )Wσ (ζ )j =

1 ρ (Wσ (η)∗i )Eσα (ξ )Wσ (ζ )j . d(ξ ) ξ

Therefore, we obtain the result. (ii) By direct computation, we get e( σ α )(η,i),(ζ,j ) (µτ |X|τ ν) d(σ ) d(ξ )(ηξ |ρξ (Wσ (ζ )∗j )Eσα (ξ )Wσ (η)i |ξ ζ )(µτ |X|τ ν) = √ λ d(η)d(ζ ) ξ Nθ

ξ,τ d(σ ) = √ d(ξ ) δµ,ζ λ d(η)d(ζ ) ξ θ≺ξ τ k=1

· (ηθ|T (θξ,τ )∗k ρξ (XWσ (ζ )∗j )Eσα (ξ )Wσ (η)i ρη (T (θξ,τ )k )|θν) Nθ

ξ,τ d(σ ) = √ d(ξ ) δµ,ζ λ d(η)d(ζ ) θ ξ ≺θτ k=1

· (ηθ|T (θξ,τ )∗k ρξ (XWσ (ζ )∗j )Eσα (ξ )σ (T (θξ,τ )k )Wσ (η)i |θν), where we use the Frobenius reciprocity. Using the braiding fusion equation, we get Nθ

ξ,τ

ξ ≺θτ k=1

d(ξ )T (θξ,τ )∗k ρξ (XWσ (ζ )∗j )Eσα (ξ )σ (T (θξ,τ )k )Wσ (η)i Nθ

=

ξ,τ

ξ ≺θτ k=1

d(ξ )T (θξ,τ )∗k ρξ (XWσ (ζ )∗j Eσα (τ )∗ )T (θξ,τ )k Eσα (θ )Wσ (η)i .


Thus we concentrate on computation of the quantity Nθ

ξ,τ

ξ ≺θτ k=1

d(ξ )T (θξ,τ )∗k ρξ (XWσ (ζ )∗j Eσα (τ )∗ )T (θξ,τ )k .

Set T (θξ,τ )k

=

d(θ)d(τ ) ξ ∗ T (θ,τ )k ρθ (Rτ ). d(ξ )

Then by the Frobenius reciprocity, {T (θξ,τ )k } is an orthonormal basis of (ρθ , ρξ · ρτ ) too. Therefore, we can replace T (θξ,τ )k with T (θξ,τ )k in the above, and get Nθ

d(θ )d(τ )

ξ,τ

ξ ≺θτ k=1

ξ

ξ

ρθ (Rτ∗ )T (θ,τ )k ρξ (XWσ (ζ )∗j Eσα (τ )∗ )T (θ,τ )∗k ρθ (Rτ )

= d(θ )d(τ )ρθ (Rτ∗ ρτ (XWσ (ζ )∗j Eσα (τ )∗ )Rτ )

= d(θ )d(τ )ρθ (φτ (XWσ (ζ )∗j Eσα (τ )∗ )) = d(θ )d(τ )

nω ω k=1

= d(θ )d(τ )

nω ω k=1

ρθ (φτ (XEσα (τ )(ζ,j ),(ω,k) ∗ ρτ (Wσ (ω)∗k ))) ρθ (φτ (XEσα (τ )(ζ,j ),(ω,k) ∗ )Wσ (ω)∗k ).

Since φτ (XEσα (τ )(ζ,j ),(ω,k) ∗ ) ∈ (ρω , ρν ), this term survives only when ω = ν, and so we get nν φτ (XEσα (τ )(ζ,j ),(ν,k) ∗ )ρθ (Wσ (ν)∗k ). d(θ )d(τ ) k=1

Therefore, we obtain the result. (iii) By direct computation, we get (µτ |X|τ ν)e( σ α )(η,i),(ζ,j ) d(σ ) = √ d(ξ ) δν,η λ d(η)d(ζ ) ξ · (µτ |X|τ ν)(ηξ |ρξ (Wσ (ζ )∗j )Eσα (ξ )Wσ (η)i |ξ ζ ) Nθ

τ,ξ d(σ ) δν,η = √ d(ξ ) λ d(η)d(ζ ) θ≺τ ξ,ξ k=1

· (µθ|T (θτ,ξ )∗k ρτ (ρξ (Wσ (ζ )∗j )Eσα (ξ )Wσ (η)i )Xρµ (T (θτ,ξ )k )|θζ ) Nθ

τ,ξ d(σ ) = √ d(ξ ) δν,η λ d(η)d(ζ ) θ ξ ≺τ θ k=1

· (µθ|ρθ (Wσ (ζ )∗j )T (θτ,ξ )∗k ρτ (Eσα (ξ )Wσ (η)i )Xρµ (T (θτ, xi )k )|θζ ).


Using the braiding fusion equation, we get Nθ

τ,ξ

ξ ≺τ θ k=1

d(ξ )ρθ (Wσ (ζ )∗j )T (θτ,ξ )∗k ρτ (Eσα (ξ )Wσ (η)i )Xρµ (T (θτ,ξ )k ) Nθ

=

τ,ξ

ξ ≺τ θ k=1

d(ξ )ρθ (Wσ (ζ )∗j )Eσα (θ )σ (T (θτ,ξ )∗k )Eσα (τ )∗ ρτ (Wσ (η)i )Xρµ (T (θτ,ξ )k ).

Thus, we concentrate on computation of the quantity Nθ

τ,ξ

ξ ≺τ θ k=1

d(ξ )σ (T (θτ,ξ )∗k )Eσα (τ )∗ ρτ (Wσ (η)i )Xρµ (T (θτ,ξ )k ).

Set T (θτ,ξ )k

=

d(θ )d(τ ) ξ ρτ (T (τ θ )∗k )R τ . d(ξ )

Then, thanks to the Frobenius reciprocity, {T (θτ,ξ )k } is an orthonormal basis of (ρθ , ρτ · ρξ ). Therefore, we can replace T (θτ,ξ )k with T (θτ,ξ )k in the above, and so we get Nθ

τ,ξ

ξ ≺τ θ k=1

d(ξ )σ (T (θτ,ξ )∗k )Eσα (τ )∗ ρτ (Wσ (η)i )Xρµ (T (θτ,ξ )k ) Nθ

= d(θ )d(τ )

τ,ξ

ξ ≺τ θ k=1

∗

Nθ

= d(θ )d(τ )

τ,ξ

ξ ≺τ θ k=1

= d(θ )d(τ ) = = =

ξ

∗

ξ

ξ

σ (R τ )Eσα (τ )∗ ρτ (σ (T (τ θ )k )Wσ (η)i ρη (T (τ θ )∗k ))Xρµ (R τ )

Nθ

τ,ξ

ξ

σ (R τ ρτ (T (τ θ )k ))Eσα (τ )∗ ρτ (Wσ (η)i )Xρµ (ρτ (T (τ θ )∗k )R τ )

∗

ξ

ξ

σ (R τ )Eσα (τ )∗ ρτ (Wσ (η)i ρη (T (τ θ )k )ρη (T (τ θ )∗k ))Xρµ (R τ )

ξ ≺τ θ k=1 ∗ d(θ )d(τ )σ (R τ )Eσα (τ )∗ ρτ (Wσ (η)i )Xρµ (R τ ) nω ∗ d(θ )d(τ ) σ (R τ )Wσ (ω)k Eσα (τ )(ω,k),(η,i) ∗ Xρµ (R τ ) ω k=1 nω ∗ Wσ (ω)k ρω (R τ )Eσα (τ )(ω,k),(η,i) ∗ Xρµ (R τ ). d(θ )d(τ ) ω k=1 ∗

Note that ρω (R τ )Eσα (τ )(ω,k),(η,i) ∗ Xρµ (R τ ) ∈ (ρµ , ρω ) and this survives only if ω = µ. Thus, we get d(θ )d(τ )

nµ k=1

∗

Wσ (µ)k ρµ (R τ )Eσα (τ )(µ,k),(η,i) ∗ Xρµ (R τ ).

∗

Since ρµ (R τ )Eσα (τ )(µ,k),(η,i) ∗ Xρµ (R τ ) is already a scalar, to evaluate it we can apply φµ and get d(θ )d(τ )

nµ k=1

∗

Wσ (µ)k ρµ (R τ )Eσα (τ )(µ,k),(η,i) ∗ Xρµ (R τ )

= d(θ )d(τ )

nµ k=1 nµ

= d(θ )d(τ )

k=1

This proves the statement.

∗

Wσ (µ)k R τ φµ (Eσα (τ )(µ,k),(η,i) ∗ X)R τ φµ (Eσα (τ )(µ,k),(η,i) ∗ X)Wσ (µ)k .

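Remark. To show the above lemma, irreducibility of $\tilde\sigma^\alpha$ is not necessary.

Corollary 4.8. Let $\tilde\sigma^\alpha \in D(\Delta)$ be an irreducible sector as above. Then, $z(\tilde\sigma^\alpha) := \sum_{\eta,i} e(\tilde\sigma^\alpha)_{(\eta,i),(\eta,i)}$ belongs to the center of $\mathrm{Tube}\,\Delta$.

Proof. By Lemma 4.7, we have

$$z(\tilde\sigma^\alpha)\,(\mu\tau|X|\tau\nu) = \frac{d(\tau)\sqrt{d(\nu)}}{\sqrt{d(\mu)}} \sum_{i,j} \phi_\tau\big(X\,\mathcal{E}_\sigma^\alpha(\tau)^{*}_{(\mu,i),(\nu,j)}\big)\, e(\tilde\sigma^\alpha)_{(\mu,i),(\nu,j)},$$

$$(\mu\tau|X|\tau\nu)\,z(\tilde\sigma^\alpha) = \frac{d(\tau)\sqrt{d(\mu)}}{\sqrt{d(\nu)}} \sum_{i,j} \phi_\mu\big(\mathcal{E}_\sigma^\alpha(\tau)^{*}_{(\mu,i),(\nu,j)}\,X\big)\, e(\tilde\sigma^\alpha)_{(\mu,i),(\nu,j)}.$$

Thus, the result follows from Lemma 3.1.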

To obtain another main result in this section, we need one more technical lemma.

Lemma 4.9. With the same notation as above, the following hold:

(i) Let $D(\Delta)_0$ be a complete system of representatives of the equivalence classes of the irreducibles in $D(\Delta)$. Then,

$$\sum_{\tilde\sigma^\alpha \in D(\Delta)_0} d(\tilde\sigma^\alpha)^2 = \lambda^2.$$

(ii) $\varphi_\Delta(z(\tilde\sigma^\alpha)) = \dfrac{d(\sigma)^2}{\lambda}$.

(iii) $\varphi_\Delta(1) = \lambda$.

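Proof. (i) We note that if $\rho$ is an irreducible $B$-$B$ sector such that $\iota\cdot\rho$ contains $(\rho_\xi \otimes \mathrm{id}^{\mathrm{op}})\cdot\iota$ for some $\xi$, then $\rho$ belongs to $D(\Delta)$. Indeed, thanks to the Frobenius reciprocity, $\rho$ is contained in $\bar\iota\cdot(\rho_\xi \otimes \mathrm{id}^{\mathrm{op}})\cdot\iota \in D(\Delta)$, and so $\rho \in D(\Delta)$. For $\xi \in \Delta_0$ and $\theta \in D(\Delta)_0$, we denote by $E_{\xi,\theta}$ the multiplicity of $(\rho_\xi \otimes \mathrm{id}^{\mathrm{op}})\cdot\iota$ in $\iota\cdot\theta$. We set $v_\xi = d(\xi)$, $w_\theta = d(\theta)$, $v = (v_\xi)$, $w = (w_\theta)$. Then, we have

$$Ew = \lambda v, \qquad {}^{t}Ev = w.$$

Thus, we get

$$\sum_{\theta\in D(\Delta)_0} d(\theta)^2 = \langle w, w\rangle = \langle w, {}^{t}Ev\rangle = \langle Ew, v\rangle = \lambda\langle v, v\rangle = \lambda^2.$$

(ii) and (iii) can be easily obtained from the definitions of $z(\tilde\sigma^\alpha)$ and $\varphi_\Delta$, and the fact

$$1 = \sum_{\xi} (\xi e|1|e\xi).$$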
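The counting argument in (i) can be checked by hand in the group case mentioned in the Remark after Theorem 4.6. The following is a minimal sketch, not taken from the text, under two assumptions: for $\Delta = \{\alpha_g\}_{g\in G}$ with $G = S_3$ the irreducibles of $D(\Delta)$ are labelled as for the untwisted quantum double (a conjugacy class together with an irreducible representation of its centralizer, of dimension class size times representation dimension), and the matrix $E$ of the proof is the multiplicity of $\alpha_g$ in the underlying object. The names `S3_CLASSES`, `double_dimensions` and `incidence_matrix` are hypothetical.

```python
# Toy check of Lemma 4.9 (i) and of Ew = lambda*v, tE v = w for G = S_3.
# ASSUMPTION (not from the paper): irreducibles of the quantum double D(G)
# are pairs (conjugacy class C, irrep pi of the centralizer), d = |C| * dim(pi).

# (class size, dimensions of the irreps of the centralizer) for S_3
S3_CLASSES = [
    (1, [1, 1, 2]),  # identity, centralizer S_3
    (3, [1, 1]),     # transpositions, centralizer Z_2
    (2, [1, 1, 1]),  # 3-cycles, centralizer Z_3
]


def double_dimensions(classes):
    """Statistical dimensions w_theta of the irreducibles of D(G)."""
    return [size * dim for size, dims in classes for dim in dims]


def global_index(classes):
    """lambda = sum of d(xi)^2 over Delta_0 = |G|, since d(alpha_g) = 1."""
    return sum(size for size, _ in classes)


def incidence_matrix(classes):
    """E[g][theta] = dim(pi) if g lies in the class of theta, else 0 (assumed)."""
    rows = []
    for row_class, (size, _) in enumerate(classes):
        for _ in range(size):  # one row per group element of this class
            row = []
            for col_class, (_, dims) in enumerate(classes):
                for dim in dims:
                    row.append(dim if col_class == row_class else 0)
            rows.append(row)
    return rows


lam = global_index(S3_CLASSES)
w = double_dimensions(S3_CLASSES)
v = [1] * lam  # v_xi = d(xi) = 1 for every group element
E = incidence_matrix(S3_CLASSES)
Ew = [sum(e * x for e, x in zip(row, w)) for row in E]
Etv = [sum(E[g][t] * v[g] for g in range(lam)) for t in range(len(w))]
```

Here `w` is the list of eight dimensions of the double of $S_3$, and the sum of their squares agrees with $\lambda^2$ for $\lambda = |S_3|$, as the lemma predicts in this special case.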

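Theorem 4.10. Let $e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}$ be as above. Then,

(i) $\{e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}\}_{(\eta,i),(\zeta,j)}$ is a system of matrix units of a simple component of $\mathrm{Tube}\,\Delta$.

(ii) $\{z(\tilde\sigma^\alpha)\}_{\tilde\sigma^\alpha \in D(\Delta)_0}$ are mutually orthogonal minimal central projections satisfying

$$\sum_{\tilde\sigma^\alpha \in D(\Delta)_0} z(\tilde\sigma^\alpha) = 1.$$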
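For readers less familiar with the terminology, the algebraic content of the theorem can be illustrated on a toy finite-dimensional algebra. The sketch below is not the tube algebra: it assumes the algebra $M_2 \oplus M_1$ realised as block-diagonal $3\times 3$ matrices, and all names (`BLOCKS`, `matrix_units`, `central_projection`) are hypothetical. It checks the relations $e_{ab}e_{cd} = \delta_{b,c}e_{ad}$ and that the sums of diagonal units are mutually orthogonal central projections adding up to 1.

```python
# Toy model: M_2 (+) M_1 as block-diagonal 3x3 integer matrices.
N = 3
BLOCKS = [[0, 1], [2]]  # index sets of the two simple components


def unit(r, c):
    """Elementary matrix with a single 1 at position (r, c)."""
    return [[1 if (i, j) == (r, c) else 0 for j in range(N)] for i in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(N)] for i in range(N)]


ZERO = [[0] * N for _ in range(N)]
IDENTITY = [[1 if i == j else 0 for j in range(N)] for i in range(N)]


def matrix_units(block):
    """Matrix units e_{ab} of the simple component supported on `block`."""
    return {(a, b): unit(a, b) for a in block for b in block}


def central_projection(block):
    """z = sum of the diagonal matrix units of the component."""
    z = ZERO
    for a in block:
        z = add(z, unit(a, a))
    return z


def is_system_of_matrix_units(units):
    """Check e_{ab} e_{cd} = delta_{bc} e_{ad} for all index pairs."""
    for (a, b), e_ab in units.items():
        for (c, d), e_cd in units.items():
            expected = units[(a, d)] if b == c else ZERO
            if matmul(e_ab, e_cd) != expected:
                return False
    return True


ALL_UNITS = [e for block in BLOCKS for e in matrix_units(block).values()]
PROJECTIONS = [central_projection(block) for block in BLOCKS]
```

In the theorem the role of `BLOCKS` is played by the classes in $D(\Delta)_0$ and the index pairs $(\eta,i)$ label the rows and columns of each simple component.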

Proof. (i) Using (ii) and (iii) of Lemma 4.7, we can compute e( σ α )(η,i),(ζ,j ) α e( σ )(η ,i ),(ζ ,j ) in two ways. First, using (ii) we get d(σ ) e( σ α )(η,i),(ζ,j ) e( σ α )(η ,i ),(ζ ,j ) = √ λ d(η )d(ζ ) · d(ξ )e( σ α )(η,i),(ζ,j ) (η ξ |ρξ (Wσ (ζ )∗j )Eσα (ξ )Wσ (η )i |ξ ζ ) ξ

d(σ ) = δζ,η λd(ζ ) n

·

ζ

ξ

k=1

d(ξ )2 φξ (ρξ (Wσ (ζ )∗j )Eσα (ξ )Wσ (ζ )i Eσα (ξ )(ζ,j ),(ζ ,k) ∗ )e( σ α )(η,i),(ζ ,k) n

ζ d(σ ) = Wσ (ζ )∗j δζ,η λd(ζ ) k=1   · d(ξ )2 φξ (Eσα (ξ )Wσ (ζ )i Wσ (ζ )∗j Eσα (ξ )∗ ) Wσ (ζ )k e( σ α )(η,i),(ζ ,k)

ξ

Secondly, using (iii) we get σ α )(η ,i ),(ζ ,j ) e( σ α )(η,i),(ζ,j ) e( d(σ ) = √ d(ξ )(ηξ |ρξ (Wσ (ζ )∗j )Eσα (ξ )Wσ (η)i |ξ ζ )e( σ α )(η ,i ),(ζ ,j ) λ d(η)d(ζ ) ξ nη

=

d(σ ) d(ξ )2 φη (Eσα (ξ )(η,k),(ζ,i ) ∗ Eσα (ξ )(η,i),(ζ,j ) )e( σ α )(η,k),(ζ ,j ) . δζ,η λd(ζ ) ξ

k=1


Thanks to Lemma 3.1, this is equal to nη

d(σ ) d(ξ )2 φξ (Eσα (ξ )(η,i),(ζ,j ) Eσα (ξ )(η,k),(ζ,i ) ∗ )e( σ α )(η,k),(ζ ,j ) δζ,η λd(η) ξ

=

k=1 nη

d(σ ) δζ,η Wσ (ζ )∗j λd(η) k=1   · d(ξ )2 φξ (Eσα (ξ )Wσ (η)i Wσ (η)∗k Eσα (ξ )∗ ) Wσ (ζ )i e( σ α )(η,k),(ζ ,j ) . ξ

We set Wσ (ζ )i , Wσ (ζ )j =

ξ

d(ξ )2 φξ (Eσα (ξ )Wσ (η)i Wσ (η)∗j Eσα (ξ )∗ ) ∈ (σ, σ ),

and show that this is a scalar. To do so, since σ α is irreducible, it suffices to show Eσα (η)Wσ (ζ )i , Wσ (ζ )j Eσα (η)∗ = ρη (Wσ (ζ )i , Wσ (ζ )j ) for all η thanks to Theorem 4.6. The braiding fusion equation gives µ

µ

ρξ (Eσα (η))Eσα (ξ )σ (T (ξ,η )p ) = T (ξ,η )p Eσα (µ), and so, we have N

ρξ (Eσα (η))Eσα (ξ ) =

µ

ξ,η

µ≺ξ η p=1

µ

µ

T (ξ,η )p Eσα (µ)σ (T (ξ,η )∗p ).

Thus, Eσα (η)Wσ (ζ )i , Wσ (ζ )j Eσα (η)∗ = d(ξ )2 φξ (ρξ (Eσα (η))Eσα (ξ )Wσ (ζ )i ξ

· Wσ (ζ )∗j Eσα (ξ )∗ ρξ (Eσα (η))∗ ) µ µ = d(ξ )2 φξ (T (ξ,η )p Eσα (µ)σ (T (ξ,η )∗p ) ξ,µ,ν p,q

· Wσ (ζ )i Wσ (ζ )∗j σ (T (νξ,η )q )Eσα (ν)∗ T (νξ,η )∗q ) µ µ = d(ξ )2 φξ (T (ξ,η )p Eσα (µ)Wσ (ζ )i Wσ (ζ )∗j Eσα (µ)∗ T (ξ,η )∗p ) ξ,µ p

=

ξ,µ p

=

µ

d(ξ )d(µ) ξ,µ p

µ

d(ξ )2 Rξ∗ ρξ (T (ξ,η )p Eσα (µ)Wσ (ζ )i Wσ (ζ )∗j Eσα (µ)∗ T (ξ,η )∗p )Rξ d(η)

η

η

T (ξ ,µ )∗p ρξ (Eσα (µ)Wσ (ζ )i Wσ (ζ )∗j Eσα (µ)∗ )T (ξ ,µ )p ,


where we use the Frobenius reciprocity in the last equality. Set d(η)d(µ) ξ ∗ η T (η,µ )p ρη (Rµ ). T (ξ ,µ )p = d(ξ ) η

Then, {T (ξ ,µ )p }p is an orthonormal basis of (ρη , ρξ · ρµ ) too. Thus, we can replace η

η

T (ξ ,µ )p with T (ξ ,µ )p in the above, and so we get ξ,µ p

=

ξ

ξ,µ p

=

ξ

d(µ)2 ρη (Rµ∗ )T (η,µ )p ρξ (Eσα (µ)Wσ (ζ )i Wσ (ζ )∗j Eσα (µ)∗ )T (η,µ )∗p ρη (Rµ )

ξ,µ p

d(µ)2 ρη (Rµ∗ )ρη · ρµ (Eσα (µ)Wσ (ζ )i Wσ (ζ )∗j Eσα (µ)∗ )ρη (Rµ ) d(µ)2 ρη (φµ (Eσα (µ)Wσ (ζ )i Wσ (ζ )∗j Eσα (µ)∗ ))

= ρµ (Wσ (ζ )i , Wσ (ζ )j ). Therefore, Wσ (ζ )i , Wσ (ζ )j is a scalar and we get σ α )(η ,i ),(ζ ,j ) e( σ α )(η,i),(ζ,j ) e( d(σ ) δζ,η Wσ (ζ )i , Wσ (ζ )j e( = σ α )(η,i),(ζ ,j ) λd(ζ ) d(σ ) = Wσ (η)i , Wσ (η)k e( σ α )(η,k),(ζ ,j ) . δζ,η δj,i λd(η) k

This shows that there exists a non-negative number c such that Wσ (ξ )i , Wσ (ξ )j = d(ξ )cWσ (ξ )i , Wσ (ξ )j . To determine c, we compute the following quantity by using the definition of ·, ·: Wσ (η)i , Wσ (η)i = c nη d(η) = cd(σ ). η,i

η

Indeed, it is

$$\sum_{\xi,\eta,i} d(\xi)^2 \phi_\xi\big(\mathcal{E}_\sigma^\alpha(\xi)W_\sigma(\eta)_i W_\sigma(\eta)_i^{*}\mathcal{E}_\sigma^\alpha(\xi)^{*}\big) = \sum_{\xi} d(\xi)^2 = \lambda.$$

This implies $c = \lambda/d(\sigma)$, and that $\{e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}\}_{(\eta,i),(\zeta,j)}$ is a system of matrix units. (ii) and (iii) of Lemma 4.7 show that $e(\tilde\sigma^\alpha)_{(\eta,i),(\eta,i)}$ is a minimal projection. Since $z(\tilde\sigma^\alpha) = \sum_{\eta,i} e(\tilde\sigma^\alpha)_{(\eta,i),(\eta,i)}$ is a central element due to Corollary 4.8, it should be a unit of some simple component of $\mathrm{Tube}\,\Delta$, and $\{e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}\}_{(\eta,i),(\zeta,j)}$ is a full system of matrix units of the simple component.

(ii) Note that a different choice of the basis $\{W_\sigma(\eta)_i\}$ gives matrix units of the same simple component of $\mathrm{Tube}\,\Delta$. The converse is also true; it is routine to see that if two irreducibles $\tilde\sigma^\alpha$, $\tilde\mu^\beta$ define matrix units of the same simple component, then $[\sigma] = [\mu]$ and the half-braidings are equivalent. This observation shows that $\{z(\tilde\sigma^\alpha)\}_{\tilde\sigma^\alpha \in D(\Delta)_0}$ are mutually orthogonal. To show $\sum_{\tilde\sigma^\alpha \in D(\Delta)_0} z(\tilde\sigma^\alpha) = 1$, we use the fact that $\varphi_\Delta$ is a faithful positive functional. Indeed, Lemma 4.9 shows

$$\sum_{\tilde\sigma^\alpha \in D(\Delta)_0} \varphi_\Delta(z(\tilde\sigma^\alpha)) = \varphi_\Delta(1),$$

which proves the theorem.

Remark. (i) To obtain the induction-reduction graph between $A$-$B$ and $B$-$B$ sectors, it suffices to know the number

$$\dim(\iota\tilde\sigma^\alpha, (\rho_\xi \otimes \mathrm{id}^{\mathrm{op}})\iota) = \dim((\sigma \otimes \mathrm{id}^{\mathrm{op}})\iota, (\rho_\xi \otimes \mathrm{id}^{\mathrm{op}})\iota) = \dim(\sigma, \rho_\xi).$$

Note that this number can be deduced from the tube algebra: it is the rank of the subalgebra $z(\tilde\sigma^\alpha)A_\xi \subset \mathrm{Tube}\,\Delta$. Therefore, we can obtain the dual principal graph of the Longo–Rehren inclusion once we know the structure of the tube algebra.

(ii) Thanks to Lemma 4.9, $D(\Delta)$ has only finitely many equivalence classes of irreducible objects. Since $D(\Delta)$ exhausts all the half-braidings, this implies, in particular, that there are only finitely many braidings for $\Delta$.

5. S and T Matrices

In this section, we show that our definition of $S$ and $T$ in Sect. 3 is relevant; more precisely, we prove that our $S$ and $T$ coincide with the ones obtained from the braiding of $D(\Delta)$ through Rehren's construction in [10, 30]. First, we show that $S$ is well-defined, i.e. $S$ preserves the center of $\mathrm{Tube}\,\Delta$.

Lemma 5.1. Let $S_0$ be as in Sect. 3 and $z(\tilde\sigma^\alpha)$ be a minimal central projection of $\mathrm{Tube}\,\Delta$. Then,

$$S_0(z(\tilde\sigma^\alpha)) = \frac{d(\sigma)}{\lambda}\sum_{\xi,\eta,i} (\xi\eta|W_\sigma(\eta)_i^{*}\,\mathcal{E}_\sigma^\alpha(\xi)^{*}\rho_\xi(W_\sigma(\eta)_i)|\eta\xi) = \frac{d(\sigma)}{\lambda}\sum_{\xi,\eta,i} (\xi\eta|\mathcal{E}_\sigma^\alpha(\xi)^{*}_{(\eta,i),(\eta,i)}|\eta\xi).$$

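Proof. Using the definition of $z(\tilde\sigma^\alpha)$ and $S_0$, we get

$$S_0(z(\tilde\sigma^\alpha)) = \frac{d(\sigma)}{\lambda}\sum_{\xi,\eta,i}\frac{d(\xi)}{d(\eta)}\, S_0\big((\eta\xi|\rho_\xi(W_\sigma(\eta)_i^{*})\mathcal{E}_\sigma^\alpha(\xi)W_\sigma(\eta)_i|\xi\eta)\big)$$

$$= \frac{d(\sigma)}{\lambda}\sum_{\xi,\eta,i} d(\xi)\,(\bar\xi\eta|R_\xi^{*}\rho_{\bar\xi}\big(\rho_\xi(W_\sigma(\eta)_i^{*})\mathcal{E}_\sigma^\alpha(\xi)W_\sigma(\eta)_i\rho_\eta(\bar R_\xi)\big)|\eta\bar\xi)$$

$$= \frac{d(\sigma)}{\lambda}\sum_{\xi,\eta,i} d(\xi)\,(\bar\xi\eta|W_\sigma(\eta)_i^{*}R_\xi^{*}\rho_{\bar\xi}\big(\mathcal{E}_\sigma^\alpha(\xi)\sigma(\bar R_\xi)W_\sigma(\eta)_i\big)|\eta\bar\xi).$$

Thanks to the braiding fusion equation, we have $\rho_\xi(\mathcal{E}_\sigma^\alpha(\bar\xi))\mathcal{E}_\sigma^\alpha(\xi)\sigma(\bar R_\xi) = \bar R_\xi$, and so

$$d(\xi)R_\xi^{*}\rho_{\bar\xi}\big(\mathcal{E}_\sigma^\alpha(\xi)\sigma(\bar R_\xi)\big) = d(\xi)R_\xi^{*}\rho_{\bar\xi}\big(\rho_\xi(\mathcal{E}_\sigma^\alpha(\bar\xi)^{*})\bar R_\xi\big) = \mathcal{E}_\sigma^\alpha(\bar\xi)^{*},$$

which shows the lemma.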

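Proposition 5.2. $S_0$ preserves the center of $\mathrm{Tube}\,\Delta$.

Proof. Let $z(\tilde\sigma^\alpha)$ be a minimal central projection of $\mathrm{Tube}\,\Delta$ as in the previous lemma. To prove the proposition, it suffices to show that $S_0(z(\tilde\sigma^\alpha))$ commutes with arbitrary $(\xi\zeta|X|\zeta\eta)$, or equivalently

$$\sum_{\nu,i} (\xi\nu|W_\sigma(\nu)_i^{*}\mathcal{E}_\sigma^\alpha(\xi)^{*}\rho_\xi(W_\sigma(\nu)_i)|\nu\xi)\,(\xi\zeta|X|\zeta\eta) = \sum_{\tau,i} (\xi\zeta|X|\zeta\eta)\,(\eta\tau|W_\sigma(\tau)_i^{*}\mathcal{E}_\sigma^\alpha(\eta)^{*}\rho_\eta(W_\sigma(\tau)_i)|\tau\eta).$$

The left-hand side is

$$\sum_{\nu,i}\sum_{\mu\prec\nu\zeta}\sum_{j=1}^{N_{\nu,\zeta}^{\mu}} (\xi\mu|T({}^{\mu}_{\nu,\zeta})_j^{*}\,\rho_\nu(X)W_\sigma(\nu)_i^{*}\mathcal{E}_\sigma^\alpha(\xi)^{*}\rho_\xi(W_\sigma(\nu)_i)\rho_\xi(T({}^{\mu}_{\nu,\zeta})_j)|\mu\eta)$$

$$= \sum_{\nu,i}\sum_{\mu\prec\nu\zeta}\sum_{j=1}^{N_{\nu,\zeta}^{\mu}} (\xi\mu|T({}^{\mu}_{\nu,\zeta})_j^{*}\,W_\sigma(\nu)_i^{*}\sigma(X)\mathcal{E}_\sigma^\alpha(\xi)^{*}\rho_\xi(W_\sigma(\nu)_i T({}^{\mu}_{\nu,\zeta})_j)|\mu\eta).$$

The right-hand side is

$$\sum_{\tau,i}\sum_{\mu\prec\zeta\tau}\sum_{j=1}^{N_{\zeta,\tau}^{\mu}} (\xi\mu|T({}^{\mu}_{\zeta,\tau})_j^{*}\,\rho_\zeta\big(W_\sigma(\tau)_i^{*}\mathcal{E}_\sigma^\alpha(\eta)^{*}\rho_\eta(W_\sigma(\tau)_i)\big)X\rho_\xi(T({}^{\mu}_{\zeta,\tau})_j)|\mu\eta)$$

$$= \sum_{\tau,i}\sum_{\mu\prec\zeta\tau}\sum_{j=1}^{N_{\zeta,\tau}^{\mu}} (\xi\mu|T({}^{\mu}_{\zeta,\tau})_j^{*}\,\rho_\zeta(W_\sigma(\tau)_i^{*})\rho_\zeta(\mathcal{E}_\sigma^\alpha(\eta)^{*})X\rho_\xi\big(\rho_\zeta(W_\sigma(\tau)_i)T({}^{\mu}_{\zeta,\tau})_j\big)|\mu\eta).$$

Using the braiding fusion equation, we get $\rho_\zeta(\mathcal{E}_\sigma^\alpha(\eta)^{*})X = \mathcal{E}_\sigma^\alpha(\zeta)\sigma(X)\mathcal{E}_\sigma^\alpha(\xi)^{*}\rho_\xi(\mathcal{E}_\sigma^\alpha(\zeta)^{*})$, and so the right-hand side is

$$\sum_{\tau,i}\sum_{\mu\prec\zeta\tau}\sum_{j=1}^{N_{\zeta,\tau}^{\mu}} (\xi\mu|T({}^{\mu}_{\zeta,\tau})_j^{*}\,\rho_\zeta(W_\sigma(\tau)_i^{*})\mathcal{E}_\sigma^\alpha(\zeta)\sigma(X)\mathcal{E}_\sigma^\alpha(\xi)^{*}\rho_\xi\big(\mathcal{E}_\sigma^\alpha(\zeta)^{*}\rho_\zeta(W_\sigma(\tau)_i)T({}^{\mu}_{\zeta,\tau})_j\big)|\mu\eta).$$

Since both $\{W_\sigma(\nu)_i T({}^{\mu}_{\nu,\zeta})_j\}_{\nu,i,j}$ and $\{\mathcal{E}_\sigma^\alpha(\zeta)^{*}\rho_\zeta(W_\sigma(\tau)_i)T({}^{\mu}_{\zeta,\tau})_j\}_{\tau,i,j}$ are orthonormal bases of $(\rho_\mu, \sigma\rho_\zeta)$, we get the equality.

The following lemma is also useful to compute $S$ and $T$ in concrete examples, especially when we can manage to obtain all matrix units of $\mathrm{Tube}\,\Delta$.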

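Lemma 5.3. (i) Let $S_{\tilde\sigma^\alpha,\tilde\mu^\beta}$ be the matrix coefficient of $S$ with respect to the orthonormal basis $\big\{\frac{\sqrt{\lambda}}{d(\sigma)}z(\tilde\sigma^\alpha)\big\}$. Then,

$$\frac{d(\sigma)}{\lambda}\sum_{\xi,i} d(\xi)\,\phi_\xi\big(\mathcal{E}_\mu^\beta(\eta)^{*}_{(\xi,i),(\xi,i)}\,\mathcal{E}_\sigma^\alpha(\xi)^{*}_{(\eta,j),(\eta,k)}\big) = \delta_{j,k}\,S_{\tilde\sigma^\alpha,\tilde\mu^\beta}.$$

In particular,

$$\frac{1}{\lambda}\sum_{\xi,\eta,i,j} d(\xi)d(\eta)\,\phi_\xi\big(\mathcal{E}_\mu^\beta(\eta)^{*}_{(\xi,j),(\xi,j)}\,\mathcal{E}_\sigma^\alpha(\xi)^{*}_{(\eta,i),(\eta,i)}\big) = S_{\tilde\sigma^\alpha,\tilde\mu^\beta}.$$

(ii) Let $t = \sum_{\tilde\sigma^\alpha} \omega_{\tilde\sigma^\alpha}\,z(\tilde\sigma^\alpha)$, where $t$ is as in Sect. 3. If $\rho_\xi$ is contained in $\sigma$,

$$\phi_\xi\big(\mathcal{E}_\sigma^\alpha(\xi)_{(\xi,i),(\xi,j)}\big) = \delta_{i,j}\frac{1}{d(\xi)}\,\omega_{\tilde\sigma^\alpha}.$$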

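Proof. (i) Thanks to Proposition 5.2, we have

$$z(\tilde\sigma^\alpha)\,S_0(z(\tilde\mu^\beta)) = \frac{d(\mu)}{d(\sigma)}\,S_{\tilde\sigma^\alpha,\tilde\mu^\beta}\,z(\tilde\sigma^\alpha),$$

and so

$$S_{\tilde\sigma^\alpha,\tilde\mu^\beta}\,e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)} = \frac{d(\sigma)}{d(\mu)}\,e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}\,S_0(z(\tilde\mu^\beta)).$$

On the other hand, thanks to Lemma 4.7 and Lemma 5.1, the right-hand side equals

$$\frac{d(\sigma)}{\lambda}\sum_{\tau,k,l} d(\tau)\,\phi_\tau\big(\mathcal{E}_\mu^\beta(\zeta)^{*}_{(\tau,k),(\tau,k)}\,\mathcal{E}_\sigma^\alpha(\tau)^{*}_{(\zeta,j),(\zeta,l)}\big)\,e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,l)},$$

which shows the statement.

(ii) Using $t^{*} = \sum_\eta (\eta\eta|1|\eta\eta)$ and Lemma 4.7, we get

$$\overline{\omega_{\tilde\sigma^\alpha}}\,e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)} = e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,j)}\,t^{*} = d(\zeta)\sum_{k}\phi_\zeta\big(\mathcal{E}_\sigma^\alpha(\zeta)^{*}_{(\zeta,j),(\zeta,k)}\big)\,e(\tilde\sigma^\alpha)_{(\eta,i),(\zeta,k)},$$

which proves the statement.

Lemma 5.4. Let $\{\mathcal{E}(\tilde\sigma^\alpha,\tilde\mu^\beta)\}$ be the braiding of $D(\Delta)$ defined in Theorem 4.6. Then, the following hold:

(i) $S_{\tilde\sigma^\alpha,\tilde\mu^\beta} = \dfrac{d(\sigma)d(\mu)}{\lambda}\,\phi_{\tilde\mu^\beta}\big(\mathcal{E}(\tilde\mu^\beta,\tilde\sigma^\alpha)^{*}\,\mathcal{E}(\tilde\sigma^\alpha,\tilde\mu^\beta)^{*}\big)$.

(ii) $\omega_{\tilde\sigma^\alpha}$ is the statistical phase of $\tilde\sigma^\alpha$, i.e. $\omega_{\tilde\sigma^\alpha} = d(\sigma)\,\phi_{\tilde\sigma^\alpha}\big(\mathcal{E}(\tilde\sigma^\alpha,\tilde\sigma^\alpha)\big)$.

(iii) $\displaystyle\sum_{\tilde\sigma^\alpha} d(\tilde\sigma^\alpha)^2\,\omega_{\tilde\sigma^\alpha} = \lambda$.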


Proof. (i) Note that the standard left inverse of a reducible sector σ is given by φσ (x) =

d(ξ ) φξ (Wσ (ξ )∗i xWσ (ξ )i ). d(σ ) ξ,i

First, we claim φ µβ (γ (x ⊗ 1)) = γ (φµ (x) ⊗ 1). Indeed, thanks to Theorem 4.6, we know that two isometries β¯

β¯

µ¯ ) γ (R¯ σ ⊗ 1) ∈ (id, µβ

γ (Rσ ⊗ 1) ∈ (id, µ¯ µβ ), satisfies (2.1). Thus, we get ¯

β ∗ φ ¯ (γ (x ⊗ 1))γ (Rµ ⊗ 1) = γ (Rµ∗ µ(x)R ¯ µ ⊗ 1) µβ (γ (x ⊗ 1)) = γ (Rµ ⊗ 1)µ = γ (φµ (x) ⊗ 1).

Using the claim and the definition of the braiding, we get β ∗ α ∗ µβ , σ α )∗ E( σ α, µβ )∗ ) = φ φ µβ (E( µβ (γ (Eµ (σ ) Eσ (µ) ) ⊗ 1)

= γ (φµ (Eµβ (σ )∗ Eσα (µ)∗ ) ⊗ 1), which is a scalar. Thus, it equals to φµ (Eµβ (σ )∗ Eσα (µ)∗ ) d(ξ ) = φξ (Wµ (ξ )∗i Eµβ (σ )∗ Eσα (µ)∗ Wµ (ξ )i ) d(µ) ξ,i

=

d(ξ ) φξ (Wµ (ξ )∗i µ(Wσ (η)j )Eµβ (η)∗ Wσ (η)∗j σ (Wµ (ξ )i )Eσα (ξ )∗ ) d(µ)

ξ,η,i,j

=

d(ξ ) Wσ (η)j φξ (Wµ (ξ )∗i Eµβ (η)∗ ρη (Wµ (ξ )i )Wσ (η)∗j Eσα (ξ )∗ ) d(µ)

ξ,η,i,j

=

ξ,η,ζ,i,j,k

d(ξ ) Wσ (η)j φξ (Eµβ (η)∗(ξ,i),(ξ,i) Eσα (ξ )∗(η,j ),(ζ,k) )Wσ (ζ )∗k . d(µ)

β

Since φξ (Eµ (η)∗(ξ,i),(ξ,i) Eσα (ξ )∗(η,j ),(ζ,k) ) ∈ (ρζ , ρη ), this term survives only when η = ζ . Thus we get the following using Lemma 5.3: φµ (Eµβ (σ )∗ Eσα (µ)∗ ) =

ξ,η,i,j,k

=

d(ξ ) φξ (Eµβ (η)∗(ξ,i),(ξ,i) Eσα (ξ )∗(η,j ),(η,k) )Wσ (η)j Wσ (η)∗k , d(µ)

λ S Wσ (η)j Wσ (η)∗j σ α , µβ d(σ )d(µ) η,j

=

λ Sσ α , µβ . d(σ )d(µ)


(ii) In a similar way as above, we have φ σ α, σ α )) = γ (φσ (Eσα (σ )) ⊗ 1) and σ α (E( φσ (Eσα (σ )) =

d(ξ ) φξ (Wσ (ξ )∗i Eσα (σ )Wσ (ξ )i ) d(σ ) ξ,i

d(ξ ) = φξ (Eσα (ξ )Wσ (ξ )i )Wσ (ξ )∗i d(σ ) ξ,i

d(ξ ) φξ (Eσα (ξ )(ξ,i),(η,j ) )Wσ (η)j Wσ (ξ )∗i d(σ ) ξ,η,i,j ω σα = Wσ (ξ )i Wσ (ξ )∗i d(σ ) =

ξ,i

=

ω σα . d(σ )

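(iii) We compute $\varphi_\Delta(t^{*})$ in two ways. Using $t^{*} = \sum_\xi (\xi\xi|1|\xi\xi)$, we get $\varphi_\Delta(t^{*}) = 1$. On the other hand,

$$\varphi_\Delta(t^{*}) = \sum_{\tilde\sigma^\alpha} \overline{\omega_{\tilde\sigma^\alpha}}\,\varphi_\Delta(z(\tilde\sigma^\alpha)) = \sum_{\tilde\sigma^\alpha} \overline{\omega_{\tilde\sigma^\alpha}}\,\frac{d(\sigma)^2}{\lambda}.$$

Thanks to Rehren's argument in [10, 30], the above implies the following:

Theorem 5.5. Let $S$ and $T$ be as above. Then, $S$ and $T$ satisfy the following:

(i) $S_{i,j} = S_{j,i} = \overline{S_{i,\bar j}}$.

(ii) Modular group relation:

$$(ST)^3 = S^2 = C, \qquad CT = TC,$$

where $C_{i,j} := \delta_{i,\bar j}$ is the conjugation matrix.

(iii) Verlinde formula:

$$N_{i,j}^{k} = \sum_{l} \frac{S_{i,l}\,S_{j,l}\,\overline{S_{k,l}}}{S_{e,l}}.$$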
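The relations of Theorem 5.5, together with Lemma 5.4 (iii), can be checked on the smallest group example. The sketch below is an illustration under stated assumptions and is not taken from the text: it takes $\Delta$ to be the $\mathbb{Z}_2$ system, so that $\lambda = 2$, and uses the standard $S$ and $T$ matrices of the quantum double of $\mathbb{Z}_2$ (the toric code), with sectors labelled by a charge bit and a flux bit. All four sectors are self-conjugate, so $C$ is the identity and $S$ is real. The helper names `charge`, `flux`, `matmul` and `verlinde` are hypothetical.

```python
from fractions import Fraction

# ASSUMED data (standard for the quantum double of Z_2, not from the paper).
# Sector i in {0, 1, 2, 3} = {1, e, m, f}: charge = bit 0, flux = bit 1.
LAMBDA = 2  # global index of the Z_2 system


def charge(i):
    return i & 1


def flux(i):
    return i >> 1


S = [[Fraction((-1) ** (charge(i) * flux(j) + flux(i) * charge(j)), LAMBDA)
      for j in range(4)] for i in range(4)]
OMEGA = [(-1) ** (charge(i) * flux(i)) for i in range(4)]  # statistical phases
T = [[OMEGA[i] if i == j else 0 for j in range(4)] for i in range(4)]
C = [[1 if i == j else 0 for j in range(4)] for i in range(4)]  # all self-conjugate


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def verlinde(i, j, k):
    """N_{i,j}^k = sum_l S_il S_jl conj(S_kl) / S_el; S is real here."""
    return sum(S[i][l] * S[j][l] * S[k][l] / S[0][l] for l in range(4))


ST = matmul(S, T)
ST_CUBED = matmul(matmul(ST, ST), ST)
S_SQUARED = matmul(S, S)
GAUSS_SUM = sum(OMEGA)  # sum of d^2 * omega with all d = 1
```

With this labelling the fusion rules returned by `verlinde` are those of $\mathbb{Z}_2\times\mathbb{Z}_2$, that is, `verlinde(i, j, k)` is 1 exactly when `k == i ^ j`, and `GAUSS_SUM` equals `LAMBDA`, in agreement with Lemma 5.4 (iii).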

6. Relation to α-Induction

Though Theorem 4.6 assures that there are enough endomorphisms with half braidings for a given system $\Delta$, one might still wonder why it occurs or how one can obtain these endomorphisms and half braidings. Here, we give the one corresponding to the restriction $\gamma|_B$ of $\gamma$ to $B$ explicitly, which is necessary in order to obtain the α-inductions [2] of $B$-$B$ sectors with respect to the braiding we defined in Sect. 4. We start with the following technical lemma:

Lemma 6.1. Let $E$ be the conditional expectation onto $B$. Then,

$$E\big(V_\xi(\bar R_\xi\bar R_\xi^{*} \otimes 1)V_\xi^{*}\big) = E\big(V_\xi(1 \otimes j(\bar R_\xi\bar R_\xi^{*}))V_\xi^{*}\big) = \frac{1}{\lambda}.$$


Proof. Since E is given by W ∗ γ (·)W , we get E (Vξ (R¯ ξ R¯ ξ∗ ⊗ 1)Vξ∗ ) = W ∗ γ (Vξ (R¯ ξ R¯ ξ∗ ⊗ 1)Vξ∗ )W d(ξ )d(η) ζ ∗ ζ = Vζ Tη,ξ (ρη (R¯ ξ R¯ ξ∗ ) ⊗ 1)Tη,ξ Vζ∗ √ λ d(ζ )d(ζ ) η,ζ,ζ

=

d(ξ )d(η) ζ ζ Vζ (T (η,ξ )∗i ρη (R¯ ξ R¯ ξ∗ )T (η,ξ )i ⊗ 1)Vζ∗ λd(ζ )

η,ζ,i

=

1 η η Vζ (T (ζ,ξ¯ )i T (ζ,ξ¯ )∗i ⊗ 1)Vζ∗ λ

η,ζ,i

=

1 , λ

where we use the Frobenius reciprocity. The second equation can be proved in the same way. We set σ0 to be the direct sum of ρξ ρξ¯ , more precisely, σ0 (x) :=

ξ

Wξ ρξ ρξ¯ (x)Wξ∗ ,

where {Wξ }ξ ⊂ M are isometries satisfying Eσ00 (ξ ) =

η,ζ,i

ξ

Wξ Wξ∗ = 1. We set ζ¯

η

∗ ∗ ρξ (Wζ )T (ξ,ζ )i ρη (T (ηξ ¯ )i )Wη .

Direct computation shows that {Eσ00 (ξ )}ξ is a half braiding of σ0 with respect to . We denote by {E(ρ, π )} the braiding of D() given by Theorem 4.6. Proposition 6.2. Let σ0 and Eσ00 be as above. Then, √ (i) Let X0 := λ ξ (Wξ ⊗ j (R¯ ξ∗ ))Vξ∗ γ (Vξ¯∗ ) and X := γ (X0 )W . Then, X is a unitary σ 0 ). In consequence, we have [ σ 0 ] = [γ |B ]. in (γ |B , β β (ii) E(µ˜ , γ |B ) = γ (U (µ)).

Proof. (i) First we show that X is an isometry. Indeed, using Lemma 6.1, we get X ∗ X = E (X0∗ X0 ) = λ γ (Vξ¯ )E (Vξ∗ (1 ⊗ j (R¯ ξ R¯ ξ∗ ))Vξ )γ (Vξ¯∗ ) ξ

=

ξ

γ (Vξ¯ Vξ¯∗ ) = 1.

Let Eγ (A) be the conditional expectation from B to γ (A). Then, Eγ (A) (XX ∗ ) = γ (X0 )Eγ (A) (W W ∗ )γ (X0∗ ) =

1 Wξ Wξ∗ ⊗ 1 = 1, γ (X0 X0∗ ) = λ ξ


which shows that X is a unitary. Since B = γ (A)W , to prove X ∈ (γ |B , σ 0 ), it suffices to show the following two: X ∈ (γ 2 , γ (σ0 ⊗ id op )),

Xγ (W ) = γ (U 0 (σ0 )∗ )W X,

which is equivalent to X0 ∈ (γ 2 , σ0 ⊗ id op ),

X0 γ (W ) = U 0 (σ0 )∗ γ (X0 )W.

The first condition obviously holds. Straightforward computation using the Frobenius reciprocity shows that both sides of the second equation above coincide with η ζ (σ0 ⊗ id)(Vη )(Wξ ρξ (T (ξ¯ η )i ) ⊗ j (T (ξ,ζ )∗i ))ρˆξ (Vζ∗ )Vξ∗ . ξ,η,ζ,i

(ii) Since γ |B = Ad(X ∗ ) σ00 , we have E( µβ , γ |B ) µβ (X) = W ∗ γ (X0∗ (Eµβ (σ0 ) ⊗ 1)(µ ⊗ id)(X0 )U β (µ)∗ )W = X ∗ γ (Eµβ (σ0 ) ⊗ 1) = E (X0∗ (Wξ ⊗ 1)(Eµβ (ρξ ρξ¯ ) ⊗ 1)(µ ⊗ id)((Wξ∗ ⊗ 1)X0 )U β (µ)∗ ) ξ

=λ

ξ

=λ

ξ

=λ

ξ

=λ

ξ

=λ

ξ

=

ξ

E (γ (Vξ¯ )Vξ (Eµβ (ρξ ρξ¯ ) ⊗ j (R¯ ξ R¯ ξ∗ ))(µ ⊗ id)(Vξ∗ γ (Vξ¯∗ ))U β (µ)∗ ) γ (Vξ¯ )E (Vξ (Eµβ (ρξ ρξ¯ ) ⊗ j (R¯ ξ R¯ ξ∗ ))(µ ⊗ id)(Vξ∗ )U β (µ)∗ )γ ((µ ⊗ id)(Vξ¯∗ )) γ (Vξ¯ )E (Vξ (Eµβ (ρξ ρξ¯ )Eµβ (ξ )∗ ⊗ j (R¯ ξ R¯ ξ∗ ))Vξ∗ )γ ((µ ⊗ id)(Vξ¯∗ )) γ (Vξ¯ )E (Vξ (ρξ (Eµβ (ξ¯ )) ⊗ j (R¯ ξ R¯ ξ∗ ))Vξ∗ )γ ((µ ⊗ id)(Vξ¯∗ )) γ (Vξ¯ (Eµβ (ξ¯ ) ⊗ 1))E (Vξ (1 ⊗ j (R¯ ξ R¯ ξ∗ ))Vξ∗ )γ ((µ ⊗ id)(Vξ¯∗ ))

γ (Vξ¯ (Eµβ (ξ¯ ) ⊗ 1)(µ ⊗ id)(Vξ¯∗ ))

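$= \gamma(U^\beta(\mu))$, which proves the statement.

We use the notation $\mathcal{E}^{+}(\rho,\pi) := \mathcal{E}(\rho,\pi)$, $\mathcal{E}^{-}(\rho,\pi) := \mathcal{E}(\pi,\rho)^{*}$. As in [2], for a sector $\rho \in D(\Delta)$ we define the α-induction $\alpha_\rho^{\pm} \in \mathrm{End}(A)$ of $\rho$ by $\gamma^{-1}\cdot\mathrm{Ad}(\mathcal{E}^{\pm}(\rho,\gamma|_B))\cdot\rho\cdot\gamma$. The above result shows

$$\gamma^{-1}\cdot\mathrm{Ad}(\mathcal{E}(\tilde\mu^\beta,\gamma|_B))\cdot\tilde\mu^\beta\cdot\gamma = \gamma^{-1}\cdot\mathrm{Ad}(\gamma(U^\beta(\mu)))\cdot\gamma\cdot(\mu\otimes\mathrm{id}^{\mathrm{op}}) = \mathrm{Ad}(U^\beta(\mu))\cdot(\mu\otimes\mathrm{id}^{\mathrm{op}}).$$

Thus, we obtain

Corollary 6.3. $\alpha^{+}_{\tilde\mu^\beta} = \mathrm{Ad}(U^\beta(\mu))\cdot(\mu\otimes\mathrm{id}^{\mathrm{op}})$.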

We proceed to αρ− . We choose {Wσ (ξ )i }i and {Wσ¯ (ξ¯ )i }i satisfying d(ξ ) Wσ¯ (ξ¯ )i ρξ¯ (Wσ (ξ ))i Rξ , Rσ = d(σ ) d(ξ ) R¯ σ = Wσ (ξ )i ρξ (Wσ¯ (ξ¯ ))i R¯ ξ . d(σ )

(6.1)

(6.2)

Such a choice always exists because we may define Rσ and R¯ σ by these equalities. (Note that the pair (Rσ , R¯ σ ) is not unique, even up to scalar multiple, when σ is reducible.) α¯

Proposition 6.4. Let σ α ∈ D(). Then, σ α and σ ¯ op are equivalent with a unitary α ¯ α op ) given by intertwiner Y (σ ) ∈ ( σ , σ¯ op ξ ζ Y (σ ) = Vξ (T (ζ,η¯ )∗i ρζ (Wσ (η) ¯ ∗k ) ⊗ ρξ (j (Wσ¯ (η)k ))j (T (ξ,η )i ))Vζ∗ , ξ,η,ζ,i,k

ξ

ζ

where T (ζ,η¯ )i and T (ξ,η )i are related through the Frobenius reciprocity. Proof. It is obvious that Y (σ ) is a unitary operator in α¯ ¯ op · γ ) = (γ · (σ ⊗ id op ), γ · (id ⊗ σ¯ op )). ( σα · γ, σ

We set Y (σ )0 =

√ ∗ λ (R¯ ξ ρξ (Wσ (ξ¯ )∗k ) ⊗ j (Wσ¯ (ξ )k ))Vξ∗ ∈ (γ · (σ ⊗ id op ), id ⊗ σ¯ op ). ξ,k

Then, direct computation using the Frobenius reciprocity yields Y (σ ) = γ (Y (σ )0 )W ∈ B. α¯ Since we have B = γ (A)W , what remains is to show Y (σ ) σ α (W ) = σ ¯ op (W )Y (σ ). The left-hand side equals to

γ (Y (σ )0 )W γ (U α (σ )∗ )W = γ (Y (σ )0 γ (U α (σ )∗ )W )W, while the right-hand side is γ (U α¯ (σ¯ op )∗ Y (σ ))W. Thus, to prove the statement, it suffices to show U α¯ (σ¯ op )Y (σ )0 γ (U α (σ )∗ )W = Y (σ ). Using Eσ (η)∗ = d(σ )σ (ρη (Rσ∗ )Eσα¯¯ (η))R¯ σ (see Theorem 4.6) and the fact that Y (σ )0 belongs to (γ · (σ ⊗ id op ), id ⊗ σ¯ op ), we get Y (σ )0 γ (U α (σ )∗ ) = d(σ )

(id ⊗ σ¯ op )(Vη )(ρη (Rσ∗ )Eσα¯¯ (η)∗ ⊗ 1)Y (σ )0 γ ((R¯ σ ⊗ 1)Vη∗ ). η


Thus, U α¯ (σ¯ op )Y (σ )0 γ (U α (σ )∗ )W equals to

d(σ )

ξ,ηζ i,k

d(ξ )d(η) ζ Vη (ρη (Rσ∗ )Eσα¯¯ (η)∗ ⊗ j (Eσα¯¯ (η)∗ ))Y (σ )0 Vξ (ρξ (R¯ σ ) ⊗ 1)Tξ,η Vζ∗ . λd(ζ )

Using the normalization (6.1),(6.2) of {Wσ (ξ¯ )k } and {Wσ¯ (ξ )k } stated above, we have Y (σ )0 Vξ (ρξ (R¯ σ ) ⊗ 1) =

λ Wσ¯ (ξ )k ⊗ j (Wσ¯ (ξ )k ), d(σ )d(ξ ) k

and so U α¯ (σ¯ op )Y (σ )0 γ (U α (σ )∗ )W equals to ξ,η,ζ,i,k

d(η) Vη (ρη (Rσ∗ ) ⊗ 1) d(σ )d(ζ ) ζ

ζ

(Eσα¯¯ (η)∗ Wσ¯ (ξ )k T (ξ,η )i ⊗ j (Eσα¯¯ (η)∗ Wσ¯ (ξ )k T (ξ,η )i ))Vζ∗ . ζ

Since {Eσα¯¯ (η)∗ Wσ¯ (ξ )k T (ξ,η )i }ξ,i,k is an orthonormal basis of (ρζ , ρη σ¯ ), we can replace it ζ

with {ρη (Wσ¯ (ξ )k )T (η,ξ )i }ξ,i,k in the above formula. Thus, U α¯ (σ¯ op )Y (σ )0 γ (U α (σ )∗ )W equals to

d(η) ζ ζ Vη (ρη (Rσ∗ Wσ¯ (ξ )k )T (η,ξ )i ⊗ j (ρη (Wσ¯ (ξ )k )T (η,ξ )i ))Vζ∗ d(σ )d(ζ ) ξ,η,ζ,i,k d(ξ )d(η) ζ ζ = Vη (ρη (R¯ ξ∗ ρξ (Wσ (ξ¯ )∗k ))T (η,ξ )i ⊗ j (ρη (Wσ¯ (ξ )k )T (η,ξ )i ))Vζ∗ d(ζ ) ξ,η,ζ,i,k d(ξ )d(η) ζ ζ = Vη (ρη (R¯ ξ∗ )T (η,ξ )i ρζ (Wσ (ξ¯ )∗k ) ⊗ j (ρη (Wσ¯ (ξ )k )T (η,ξ )i ))Vζ∗ d(ζ ) ξ,η,ζ,i,k η ζ = Vη (T (ζ,ξ¯ )∗i ρζ (Wσ (ξ¯ )∗k ) ⊗ j (ρη (Wσ¯ (ξ )k )T (η,ξ )i ))Vζ∗ ξ,η,ζ,i,k

= Y (σ ).

¯

α¯ β We set E op (σ ¯ op , µ ¯ op ) = γ (1 ⊗ j (Eσα¯¯ (µ))). ¯ Then, clearly E op can extend in a natural way to a braiding of D().

Proposition 6.5. With the above notation, we have E − = E op .

¯

α¯ β Proof. Since σ α = Ad(Y (σ )∗ ) · σ ¯ op and µβ = Ad(Y (µ)∗ ) · µ ¯ op , we have α¯

E op ( σ α, µβ ) = µβ (Y (σ )∗ )Y (µ)∗ γ (1 ⊗ j (Eσα¯¯ (µ))) ¯ σ ¯ op (Y (µ))Y (σ ). α¯ First, we show γ (1 ⊗ j (Eσα¯¯ (µ))) ¯ σ ¯ op (Y (µ)) = Y (µ). Indeed, α¯

α¯

γ (1 ⊗ j (Eσα¯¯ (µ))) ¯ σ ¯ op (Y (µ)) = γ (1 ⊗ j (Eσα¯¯ (µ))) ¯ σ ¯ op (γ (Y (µ)0 )W )

= γ ((1 ⊗ j (Eσα¯¯ (µ))(id ¯ ⊗ σ¯ op )(Y (µ)0 )U α¯ (σ¯ op )∗ )W √ = λγ ((R¯ ξ∗ ρξ (Wµ (ξ¯ )∗i ) ⊗ j (Eσα¯¯ (µ) ¯ σ¯ (Wµ¯ (ξ )i )Eσα¯¯ (ξ¯ )∗ ))Vξ∗ )W ξ,i

=

√ ξ,i

λγ ((R¯ ξ∗ ρξ (Wµ (ξ¯ )∗i ) ⊗ j (Wµ¯ (ξ )i )Vξ∗ )W

= Y (µ). Thus, σ α, µβ ) = µβ (Y (σ )∗ )Y (σ ) = µβ (W ∗ γ (Y (σ )∗0 ))γ (Y (σ )0 )W E op ( = E (U β (µ)(µ ⊗ id)(Y (σ )∗0 ))Y (σ )0 ) =λ E (U β (µ)(µ ⊗ id)(Vξ )(µ(ρξ (Wσ (ξ¯ )i )R¯ ξ )R¯ ξ∗ ρξ (Wσ (σ¯ )∗i ) ⊗ 1)Vξ∗ ) ξ,i

=λ

ξ,i

γ (µ(Wσ (ξ¯ )i ) ⊗ 1)E (Vξ (Eµβ (ξ )µ(R¯ ξ )R¯ ξ∗ ⊗ 1)Vξ∗ )γ (Wσ (ξ¯ )∗i ⊗ 1).

β β Using the braiding fusion equation, we have Eµ (ξ )µ(R¯ ξ ) = ρξ (Eµ (ξ¯ )∗ )R¯ ξ . Thus,

E op ( σ α, µβ ) =λ γ (µ(Wσ (ξ¯ )i ) ⊗ 1)E (Vξ (ρξ (Eµβ (ξ¯ )∗ )R¯ ξ R¯ ξ∗ ⊗ 1)Vξ∗ )γ (Wσ (ξ¯ )∗i ⊗ 1) ξ,i

=λ

ξ,i

=

ξ,i

γ (µ(Wσ (ξ¯ )i )Eµβ (ξ¯ )∗ ⊗ 1)E (Vξ (R¯ ξ R¯ ξ∗ ⊗ 1)Vξ∗ )γ (Wσ (ξ¯ )∗i ⊗ 1)

γ (µ(Wσ (ξ¯ )i )Eµβ (ξ¯ )∗ Wσ (ξ¯ )∗i ⊗ 1)

= γ (Eµβ (σ )∗ ⊗ 1). This proves the statement. With the above result, it is easy to show the following: ¯

− = Ad(Y (σ )∗ U β (σ¯ op )) · (id ⊗ σ¯ op ). Corollary 6.6. α σβ


7. When $\{\rho_\xi\}_{\xi\in\Delta_0}$ Has a Braiding

In this section, we describe the Longo–Rehren inclusions when the system $\Delta$ has a braiding. Part of the contents of this section has already been discussed by several authors [8, 20, 28]. However, the Galois correspondence established in Subsect. 2.4 allows us to perform a more detailed analysis. Throughout this section, we assume that the system $\Delta$ has a braiding $\{\mathcal{E}(\xi,\eta)\}$. We set

$$\mathcal{E}_\xi^{+}(\eta) := \mathcal{E}^{+}(\xi,\eta) = \mathcal{E}(\xi,\eta), \qquad \mathcal{E}_\xi^{-}(\eta) := \mathcal{E}^{-}(\xi,\eta) = \mathcal{E}(\eta,\xi)^{*}.$$

We use the notation $\widetilde{(\sigma\otimes\mu^{\mathrm{op}})}^{+-}$ for $\tilde\sigma^{+}\,\widetilde{\mu^{\mathrm{op}}}^{-}$. We denote by $\omega_\xi$ the statistical phase $d(\xi)\phi_\xi(\mathcal{E}(\xi,\xi))$.

Though we have seen in Sect. 4 that in general half braidings, other than ordinary braidings, are required in order to obtain all the sectors in $D(\Delta)$, one might think that there should be some way to do so just using the given braiding in this situation with extra symmetry. Indeed, it is a natural question whether the system $\{\widetilde{(\rho_\xi\otimes\rho_\eta^{\mathrm{op}})}^{+-}\}_{\xi,\eta\in\Delta_0}$ gives all the irreducibles in $D(\Delta)$. Here we need to consider $\widetilde{\rho_\eta^{\mathrm{op}}}^{-}$ rather than $\widetilde{\rho_\eta^{\mathrm{op}}}^{+}$ because we have $[\tilde\rho_\xi^{+}] = [\widetilde{\rho_{\bar\xi}^{\mathrm{op}}}^{+}]$ thanks to Proposition 6.4. It turns out that this is the case if and only if the braiding of $\Delta$ is non-degenerate [8, 20, 28]. A sector $\rho_\xi$ is said to be degenerate if the monodromy operator $m(\xi,\eta) := \mathcal{E}(\eta,\xi)\mathcal{E}(\xi,\eta)$ equals 1 for all $\eta\in\Delta_0$. $\Delta$ is said to be non-degenerate if $\rho_e$ is the only degenerate sector. We denote by $\Delta_0^{d}$ the set of all $\xi$ with degenerate $\rho_\xi$.

Before starting the proof of the above statement, we consider a more general situation as in [8] to treat the degenerate case such as the systems of even bimodules for Hecke algebra subfactors. We assume that there exists a larger system $\hat\Delta \supset \Delta$ with a braiding $\{\mathcal{E}(\xi,\eta)\}$ whose restriction to $\Delta$ is the given one. Even when neither $\xi\in\hat\Delta_0$ nor $\eta\in\hat\Delta_0$ belongs to $\Delta_0$, $\rho_\xi\rho_{\bar\eta}$ may be decomposed into sectors in $\Delta$. In such a case, we set

$$\mathcal{E}^{+-}_{\xi,\bar\eta}(\zeta) = \mathcal{E}_\xi^{+}(\zeta)\,\rho_\xi(\mathcal{E}_{\bar\eta}^{-}(\zeta)),$$

which is a half braiding of $\rho_{\xi,\bar\eta} := \rho_\xi\rho_{\bar\eta}$. Note that when $\xi,\eta\in\Delta_0$, we have $[\tilde\rho_{\xi,\bar\eta}^{+-}] = [\widetilde{(\rho_\xi\otimes\rho_\eta^{\mathrm{op}})}^{+-}]$ thanks to Proposition 6.4. More generally, when $\sigma$ and $\mu$ are direct sums of sectors in $\hat\Delta$ (we do not necessarily assume that $\sigma\bar\mu$ is decomposed into sectors in $\Delta$), we can define $\widetilde{\sigma\bar\mu}^{+-}$ and $\widetilde{(\sigma\otimes\mu^{\mathrm{op}})}^{+-}$ in the same way as above, though they may not belong to $D(\Delta)$. Following [28], we define the "permutant" of $\Delta$ in $\hat\Delta$ by

ˆ m(ξ, η) = 1, ∀η ∈ 0 }. = {ρξ ∈ ; It is easy to show that is a closed subsystem.
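The definition of degenerate sectors can be made concrete in a toy model. The sketch below is an illustration under assumptions that are not in the text: the system consists of the $n$ sectors of a $\mathbb{Z}_n$-graded pointed system with trivial associator, and the braiding is the bicharacter $\mathcal{E}(a,b) = \exp(2\pi i\,kab/n)$, so that the monodromy is $m(a,b) = \exp(2\pi i\,(2kab)/n)$. The function names are hypothetical, and the condition $m(a,b) = 1$ is tested with exact integer arithmetic.

```python
def monodromy_is_trivial(a, b, n, k):
    """m(a, b) = E(b, a) E(a, b) = exp(2*pi*i * 2*k*a*b / n) equals 1
    exactly when 2*k*a*b is divisible by n (assumed bicharacter braiding)."""
    return (2 * k * a * b) % n == 0


def degenerate_sectors(n, k):
    """Sectors a whose monodromy with every sector b is trivial."""
    return [a for a in range(n)
            if all(monodromy_is_trivial(a, b, n, k) for b in range(n))]


def is_non_degenerate(n, k):
    """Non-degenerate means the identity sector 0 is the only degenerate one."""
    return degenerate_sectors(n, k) == [0]
```

For instance, `degenerate_sectors(3, 1)` contains only the identity sector, while for `n = 4, k = 1` the sector 2 is also degenerate and for `n = 2, k = 1` every sector is degenerate, which is the situation where the more general framework with a larger system is needed.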


Theorem 7.1. In the above situation, the following hold: ˆ (i) If ξ1 , ξ2 , η1 , η2 ∈ , (ρ ξ1 ,η¯1

+−

, ρ ξ2 ,η¯2

+−

) = {γ (X ⊗ 1); X ∈ (ρξ1 ,η¯1 , ρξ2 ,η¯2 ), Xρξ1 (m(η¯1 , ζ )) = ρξ2 (m(η¯2 , ζ ))X, ∀ζ ∈ 0 } N

= {γ (X ⊗ 1); X =

ξ2

ξ1 ,ν

ν∈0 i=1

ξ

T (ξ21 ,ν )∗i ρξ1 (X(ν)i ),

X(ν)i ∈ (ρη¯1 , ρν ρη¯2 )}. In the above, the expression N

ξ2

ξ1 ,ν ξ ∗ X= T ξ2 ,ν i ρξ1 (X(ν)i )

ν∈0 i=1

1

is unique. ˆ Then, (ii) Let σ1 , σ2 , µ1 , µ2 be finite direct sums of sectors in . +−

((σ1 ⊗ µ1 op )

+−

, (σ2 ⊗ µ2 op ) ) op op op γ (Yξ Vξ∗ )W ; Yξ ∈ (ρξ σ1 ⊗ ρξ µ1 , σ2 ⊗ µ2 ) . =

(7.1)

ξ ∈d0

In particular, when is non-degenerate, we have +−

((σ1 ⊗ µ1 op )

+−

, (σ2 ⊗ µ2 op )

op op ) = γ (Y ); Y ∈ ((σ1 ⊗ µ1 ), (σ2 ⊗ µ2 )) .

Proof. (i) In the same way as in the proof of Theorem 4.6, we can show +−

+−

(ρ , ρ ) ξ1 ,η¯1 ξ2 ,η¯2 = γ (X ⊗ 1); X ∈ (ρξ1 ,η¯1 , ρξ2 ,η¯2 ), Eξ+− (ζ )X = ρζ (X)Eξ+− (ζ ), ∀ζ ∈ 0 . 2 ,η¯2 1 ,η¯1 The braiding fusion equation implies that the condition Eξ+− (ζ )X = ρζ (X)Eξ+− is 2 ,η¯2 1 ,η¯1 equivalent to (∗) Xρξ1 (m(η¯1 , ζ )) = ρξ2 (m(η¯2 , ζ ))X, which shows the first equality. We claim that every X ∈ (ρξ1 ,η¯1 , ρξ2 ,η¯2 ) can be uniquely expressed as N

ξ2

ξ1 ,ν

ˆ 0 i=1 ν∈

ξ

T (ξ21 ,ν )∗i ρξ1 (X(ν)i ),

X(ν)i ∈ (ρη¯1 , ρν ρη¯2 ).


Indeed, we set X(ν)i = reciprocity, we get N

ξ2

ξ1 ,ν

ˆ 0 i=1 ν∈

N

N

=

ξ2

ˆ 0 i=1 ν∈

ˆ 0 i=1 ν∈

=

∈ (ρη¯1 , ρν ρη¯2 ). Using the Frobenius

ξ1 ,ν d(ξ1 )d(ν) ξ2 ∗ ξ2 ∗ ξ T (ξ1 ,ν )i ρξ1 (φξ1 (T (ξ21 ,ν )i X)) T (ξ1 ,ν )i ρξ1 (X(ν)i ) = d(ξ2 ) ξ Nξ 2,ν 1

=

ξ2 d(ξ1 )d(ν) d(ξ2 ) φξ1 (T (ξ1 ,ν )i X)

d(ξ1 )2 R¯ ξ∗1 ρξ1 (T (νξ¯ ,ξ )i φξ1 (ρξ1 (T (νξ¯ ,ξ )∗i )R¯ ξ1 X)) 1

2

1

2

ξ2

ξ1 ,ν

d(ξ1 )2 R¯ ξ∗1 ρξ1 (T (νξ¯ ,ξ )i T (νξ¯ ,ξ )∗i φξ1 (R¯ ξ1 X))

ˆ 0 i=1 ν∈ d(ξ1 )2 R¯ ξ∗1 ρξ1 (φξ1 (R¯ ξ1 X)) d(ξ1 )R¯ ξ∗1 ρξ1 (ρξ¯1 (X)Rξ1 )

1

2

1

2

= = X,

which shows the claim. Since the above expression is unique, the condition (∗) is equivalent to ˆ 0. ∀ζ ∈ 0 , ∀ν ∈

ρν (m(η¯2 , ζ ))X(ν)i = X(ν)i m(η¯1 , ζ ),

The braiding fusion equation implies that this is equivalent to ˆ 0. ∀ζ ∈ 0 , ∀ν ∈

m(ζ, ν)ρζ (X(ν)i ) = ρζ (X(ν)i ),

We show that if X(ν)i is not zero, m(ζ, ν) = 1 for all ζ ∈ 0 . Indeed, if it is the case, we have φζ (m(ζ, ν))X(ν)i = X(ν)i , which implies φζ (m(ζ, ν)) = 1 because it is a scalar. Thus, we get φζ ((1 − m(ζ, ν))∗ )(1 − m(ζ, ν))) = 2 − φζ (m(ζ, ν)) − φζ (m(ζ, ν)∗ ) = 0, and so m(ζ, ν) = 1. This proves the statement.

+−

(ii) By simple computation, we can see that γ (Y )W ∈ ((σ1 ⊗ µ1 op ) +− op op ⊗ µ2 op ) ) if and only if Y ∈ (γ (σ1 ⊗ µ1 ), σ2 ⊗ µ2 ) and the following: (σ2 Y γ (U1∗ )W = U2∗ γ (Y )W, where Ui =

ξ

(∗∗)

Vξ (Eσ+i (ξ ) ⊗ j (Eµ−i (ξ )))(σi ⊗ µi )(Vξ∗ ). Let op

op op

,

op

Yξ = Y Vξ ∈ (ρξ σ1 ⊗ ρξ µ1 , σ2 ⊗ µ2 ).


Then, the left-hand side of (∗∗) is d(ξ )d(η) ζ ∗ γ (U1 )W = Yξ ρˆξ (U1∗ Vη )Tξ,η Vζ∗ λd(ζ ) ξ,η,ζ ∈0 d(ξ )d(η) op ζ = Yξ ρˆξ ((σ1 ⊗ µ1 )(Vη )(E + (σ1 , η)∗ ⊗ j (E − (µ1 , η)∗ ))Tξ,η Vζ∗ λd(ζ ) ξ,η,ζ ∈0 d(ξ )d(η) op ζ = (σ2 ⊗ µ2 )(Vη )Yξ ρˆξ (E + (σ1 , η)∗ ⊗ j (E − (µ1 , η)∗ ))Tξ,η Vζ∗ . λd(ζ ) ξ,η,ζ ∈0

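On the other hand, the right-hand side is
$$\begin{aligned}
U_2^*\gamma(Y)W
&= \sum_{\xi,\eta,\zeta\in\Delta_0}\sqrt{\frac{d(\xi)d(\eta)}{\lambda d(\zeta)}}\;(\sigma_2\otimes\mu_2^{\mathrm{op}})(V_\xi)\bigl(E^+(\sigma_2,\xi)^*\otimes j(E^-(\mu_2,\xi)^*)\bigr)\hat\rho_\xi(Y_\eta)\,T^\zeta_{\xi,\eta}V_\zeta^*\\
&= \sum_{\xi,\eta,\zeta\in\Delta_0}\sqrt{\frac{d(\xi)d(\eta)}{\lambda d(\zeta)}}\;(\sigma_2\otimes\mu_2^{\mathrm{op}})(V_\xi)\,Y_\eta\,\hat\rho_\eta\bigl(E^+(\sigma_1,\xi)^*\otimes j(E^-(\mu_1,\xi)^*)\bigr)\\
&\qquad\times\bigl(E^+(\eta,\xi)^*\otimes j(E^-(\eta,\xi)^*)\bigr)T^\zeta_{\xi,\eta}V_\zeta^*\\
&= \sum_{\xi,\eta,\zeta\in\Delta_0}\sqrt{\frac{d(\xi)d(\eta)}{\lambda d(\zeta)}}\;(\sigma_2\otimes\mu_2^{\mathrm{op}})(V_\xi)\,Y_\eta\,\hat\rho_\eta\bigl(E^+(\sigma_1,\xi)^*\otimes j(E^-(\mu_1,\xi)^*)\bigr)\\
&\qquad\times(m(\eta,\xi)^*\otimes 1)\,T^\zeta_{\eta,\xi}V_\zeta^*,
\end{aligned}$$
where we use the braiding fusion equation in the second equality. Thus, ($**$) is equivalent to the following for all $\eta,\zeta\in\Delta_0$:
$$\sum_{\xi\in\Delta_0}\sqrt{d(\xi)}\;Y_\xi\,\hat\rho_\xi\bigl(E^+(\sigma_1,\eta)^*\otimes j(E^-(\mu_1,\eta)^*)\bigr)T^\zeta_{\xi,\eta}
= \sum_{\xi\in\Delta_0}\sqrt{d(\xi)}\;Y_\xi\,\hat\rho_\xi\bigl(E^+(\sigma_1,\eta)^*\otimes j(E^-(\mu_1,\eta)^*)\bigr)(m(\xi,\eta)^*\otimes 1)\,T^\zeta_{\xi,\eta}.$$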

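This shows that if $\xi$ is a degenerate sector, there is no further extra condition for $Y_\xi$. Now, we assume the above equality holds. Multiplying both sides by $\hat\rho_\zeta(\bar R_{\sigma_1}\otimes j(\bar R_{\mu_1}))$ from the right, we get
$$\begin{aligned}
&\sum_{\xi\in\Delta_0}\sqrt{d(\xi)}\;Y_\xi\,\hat\rho_\xi\bigl((E^+(\sigma_1,\eta)^*\otimes j(E^-(\mu_1,\eta)^*))\,\hat\rho_\eta(\bar R_{\sigma_1}\otimes j(\bar R_{\mu_1}))\bigr)T^\zeta_{\xi,\eta}\\
&\quad= \sum_{\xi\in\Delta_0}\sqrt{d(\xi)}\;Y_\xi\,\hat\rho_\xi\bigl((E^+(\sigma_1,\eta)^*\otimes j(E^-(\mu_1,\eta)^*))\,\hat\rho_\eta(\bar R_{\sigma_1}\otimes j(\bar R_{\mu_1}))\bigr)(m(\xi,\eta)^*\otimes 1)\,T^\zeta_{\xi,\eta}.
\end{aligned}$$
Using the braiding fusion equation, we have
$$(E^+(\sigma_1,\eta)^*\otimes j(E^-(\mu_1,\eta)^*))\,\hat\rho_\eta(\bar R_{\sigma_1}\otimes j(\bar R_{\mu_1})) = (\bar R_{\sigma_1}\otimes j(\bar R_{\mu_1}))\,(E^+(\bar\sigma_1,\eta)\otimes j(E^-(\bar\mu_1,\eta))),$$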


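and so
$$\sum_{\xi\in\Delta_0}\sqrt{d(\xi)}\;Y'_\xi\,(E^+(\bar\sigma_1,\eta)\otimes j(E^-(\bar\mu_1,\eta)))\,T^\zeta_{\xi,\eta}
= \sum_{\xi\in\Delta_0}\sqrt{d(\xi)}\;Y'_\xi\,(E^+(\bar\sigma_1,\eta)\otimes j(E^-(\bar\mu_1,\eta)))\,(m(\xi,\eta)^*\otimes 1)\,T^\zeta_{\xi,\eta},$$
where
$$Y'_\xi = Y_\xi\,\hat\rho_\xi(\bar R_{\sigma_1}\otimes j(\bar R_{\mu_1})) \in (\hat\rho_\xi,\ \sigma_2\bar\sigma_1\otimes\mu_2^{\mathrm{op}}\bar\mu_1^{\mathrm{op}}).$$
Note that $Y'_\xi$ is a multiple of an isometry and $\{Y'_\xi\}_\xi$ are mutually orthogonal. Moreover, since we have
$$Y_\xi = d(\sigma_1)d(\mu_1)\,(\sigma_2\otimes\mu_2^{\mathrm{op}})(R_{\sigma_1}^*\otimes R_{\mu_1}^*)\,Y'_\xi,$$
$Y'_\xi$ is not zero if and only if $Y_\xi$ is not zero. Therefore, if $Y_\xi$ is not zero, we have $T^\zeta_{\xi,\eta} = (m(\xi,\eta)^*\otimes 1)\,T^\zeta_{\xi,\eta}$ for all $\eta,\zeta\in\Delta_0$, or equivalently $T(^{\zeta}_{\xi,\eta})_i = m(\xi,\eta)^*\,T(^{\zeta}_{\xi,\eta})_i$ for all $i = 1, 2, \dots, N^\zeta_{\xi,\eta}$ and $\eta,\zeta\in\Delta_0$. Thus, $m(\xi,\eta) = 1$ for all $\eta\in\Delta_0$, which proves the statement.

The above theorem implies the following [8, 20, 28]: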

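Corollary 7.2. If the braiding of $\Delta$ is non-degenerate, $\{\widetilde{(\rho_\xi\otimes\rho_\eta^{\mathrm{op}})}^{+-}\}_{\xi,\eta\in\Delta_0}$ gives the set of irreducibles in $D(\Delta)$. $D(\Delta)$ is isomorphic, as a category, to the direct product of the category generated by $\Delta$ and its opposite. The irreducible decomposition of the restriction $\gamma|_B$ of the canonical endomorphism $\gamma$ to $B$ is given by
$$[\gamma|_B] = \bigoplus_{\xi\in\Delta_0}\bigl[\widetilde{(\rho_\xi\otimes\rho_\xi^{\mathrm{op}})}^{+-}\bigr].$$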

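Proof. (ii) of the above theorem shows that $\{\widetilde{(\rho_\xi\otimes\rho_\eta^{\mathrm{op}})}^{+-}\}_{\xi,\eta\in\Delta_0}$ are mutually inequivalent irreducible sectors in $D(\Delta)$. Since the global index of $A\supset B$ is $\lambda^2$, there is no other irreducible sector in $D(\Delta)$. Thanks to the Frobenius reciprocity, we have
$$\begin{aligned}
\dim\bigl(\widetilde{(\rho_\xi\otimes\rho_\eta^{\mathrm{op}})}^{+-},\ \gamma|_B\bigr)
&= \dim\bigl(\iota,\ \iota\cdot\widetilde{(\rho_\xi\otimes\rho_\eta^{\mathrm{op}})}^{+-}\bigr)\\
&= \dim\bigl(\iota,\ (\rho_\xi\otimes\rho_\eta^{\mathrm{op}})\cdot\iota\bigr)\\
&= \dim\bigl(\gamma,\ (\rho_\xi\otimes\rho_\eta^{\mathrm{op}})\bigr)\\
&= \delta_{\xi,\eta},
\end{aligned}$$
which proves the last statement.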
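The counting step in this proof can be spelled out as follows. This is only a sketch, assuming (as the argument implicitly does) that the extended sector $\widetilde{(\rho_\xi\otimes\rho_\eta^{\mathrm{op}})}^{+-}$ has statistical dimension $d(\xi)d(\eta)$ and that $\lambda=\sum_{\xi\in\Delta_0}d(\xi)^2$:

```latex
\sum_{\xi,\eta\in\Delta_0} \bigl(d(\xi)\,d(\eta)\bigr)^2
  = \Bigl(\sum_{\xi\in\Delta_0} d(\xi)^2\Bigr)\Bigl(\sum_{\eta\in\Delta_0} d(\eta)^2\Bigr)
  = \lambda^2 .
```

Since these sectors already exhaust the global index $\lambda^2$, no further irreducible sector can occur.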

It is easy to show that in the above situation, the principal graph of A ⊃ B and that of the dual inclusion are the same. For a more detailed description of the dual inclusion, we have the following: Proposition 7.3. We assume that the braiding of is non-degenerate. We set Vξ := )W , where γ (Vξ,0 d(ξ )d(ζ ) η ∗ Vξ,0 := Vη Tζ,ξ (m(ζ, ξ )∗ ⊗ 1)Vζ∗ . d(η) ξ,η,ζ


+−

Then, Vξ is an isometry in ((ρξ ⊗ ρη op ) , γ |B ). The Q-system (γ |B , W, γ (V )) for the dual inclusion B ⊃ γ (A) can be expressed as γ |B (x) =

ξ ∈0

γ (V ) =

ξ,η,ζ

+−

Vξ (ρξ ⊗ ρξ op )

(x)Vξ∗ ,

W = Ve , d(ξ )d(η) ζ γ (Vη )Vξ γ ((m(ξ, η) ⊗ 1)Tξ,η )Vζ∗ . λd(ζ ) +−

⊗ ρξ op ) , γ |B ) = 1. As usual Proof. From the previous corollary, we know dim((ρξ +− )W ∈ ((ρ op we can see that γ (Vξ,0 , γ |B ) if and only if the following two ξ ⊗ ρξ ) conditions hold: Vξ,0 ∈ (γ · ρˆξ , γ ),

where U=

η

Vξ,0 γ (U ∗ )W = W Vξ,0 ,

Vη (E + (ξ, η) ⊗ j (E − (ξ, η)))ρˆξ (Vη∗ ).

is proportional to the one given in the We assume that these hold and show that Vξ,0 statement. Direct computation shows that when the first condition holds, the second one γ (U ∗ )W = W V is equivalent to Vξ,0 ξ,0

µ

d(µ) ∗ ζ = Vν Vξ,0 Vµ ρˆµ (E + (ξ, η)∗ ⊗ j (E − (ξ, η)∗ ))Tµ,η d(ζ ) τ

d(ν) τ ∗ T V V Vζ , d(τ ) ν,η τ ξ,0

for all η, ζ, ν ∈ 0 . We set ν = e. Then, the right-hand side is √

1 V ∗ V Vζ . d(η) η ξ,0

V ∈ (ρˆ ρˆ , id) in the left-hand side survives only when µ = ξ¯ , and Note that Ve∗ Vξ,0 µ µ ξ it is proportional to Rξ∗ ⊗ j (Rξ∗ ) if µ = ξ¯ . Thus, there exists a complex number c such that d(ξ )d(η) ∗ ζ ∗ Vη Vξ,0 Vζ = c (Rξ ⊗ j (Rξ∗ ))ρˆξ¯ (E + (ξ, η)∗ ⊗ j (E − (ξ, η)∗ ))Tξ¯ ,η . d(ζ )

Using the braiding fusion equation and the Frobenius reciprocity, this equals to cωξ

2

d(ζ ) η ∗ + (E (ξ, ζ )∗ ⊗ j (E − (ξ, ζ )∗ )) T d(ξ )d(η) ξ,ζ = cωξ 2

d(ζ ) η ∗ T (m(ζ, ξ )∗ ⊗ 1). d(η)d(ξ ) ζ,ξ


Thus, we get Vξ,0

= cωξ

2

η,ζ

d(ζ ) η ∗ Vη Tζ,ξ (m(ζ, ξ )∗ ⊗ 1)Vζ∗ . d(ξ )d(η)

To see that |c| = d(ξ ) is the correct normalization, it suffices to show Eγ (Vξ Vξ∗ ) = d(ξ )2 /λ, where Eγ is the conditional expectation from B to γ (A). Indeed, it is equal to ∗

γ (Vξ,0 )Eγ (W W ∗ )γ (Vξ,0 )=

=

1 |c|2 d(ζ ) η ∗ Vξ,0 )= γ (Vξ,0 N Vη Vη∗ λ λ d(ξ )d(η) ζ,ξ η,ζ

|c|2

λ

η

Vη Vη∗ =

|c|2 λ

.

We set c = d(ξ )ωξ2 . Then, Ve = W is obvious. To show the last statement, we compute the quantity 1 ∗ ∗ Vξ∗ γ (Vη∗ )γ (V )Vζ = W ∗ γ (Vξ,0 W V Vη,0 Vζ,0 )W = √ E (Vξ,0 ∗ Vη,0 Vζ,0 ). λ Note that this belongs to +−

((ρζ ⊗ ρζ op )

+−

, (ρξ ⊗ ρξ op )

+−

(ρη ⊗ ρη op )

) = γ ((ρˆζ , ρˆξ ρˆη )).

∗ V V ) = γ (X). Since Thus, there exists X ∈ (ρˆζ , ρˆξ ρˆη ) such that E (Vξ,0 η,0 ζ,0 ∗ V E (·)V is the left inverse of γ , we get ∗ X = V ∗ E (Vξ,0 Vη,0 Vζ,0 )V =

1 op ∗ d(ν)2 (φν ⊗ φν )(Vν∗ Vξ,0 Vη,0 Vζ,0 Vν ). λ ν

and the braiding fusion equation, we get Using the definition of Vξ,0 ∗ Vν∗ Vξ,0 Vη,0 Vζ,0 Vν

=

√ d(ν) d(ξ )d(η)d(ζ ) d(µ)

µ,τ

µ ∗

=

τ µ · (m(ν, ξ ) ⊗ 1)Tν,ξ (m(τ, η) ⊗ 1)Tτ,η Tν,ζ (m(ν, ζ )∗ ⊗ 1) √ d(ν) d(ξ )d(η)d(ζ )

d(µ)

µ,τ

µ ∗

τ µ · (m(ν, ξ ) ⊗ 1)(m(νξ, η) ⊗ 1)Tν,ξ Tτ,η Tν,ζ (m(ν, ζ )∗ ⊗ 1).

In the above, we can replace ∗ Vη,0 Vζ,0 Vν Vν∗ Vξ,0

=

τ

µ

τ T Tν,ξ τ,η with

τ

µ

τ )T , and we get ρˆν (Tξ,η ν,τ

√ d(ν) d(ξ )d(η)d(ζ ) µ,τ

d(µ) µ ∗

τ µ (m(ν, ξ ) ⊗ 1)(m(νξ, η) ⊗ 1)ρˆν (Tξ,η )Tν,τ Tν,ζ (m(ν, ζ )∗ ⊗ 1).


Using the braiding fusion equation several times (draw a diagram), we get ∗ Vη,0 Vζ,0 Vν ) (φν ⊗ φν )(Vν∗ Vξ,0 √ d(ν) d(ξ )d(η)d(ζ ) op τ µ µ ∗ (m(ξ, η) ⊗ 1)Tξ,η (φν ⊗ φν )(Tν,τ Tν,ζ ). = d(µ) µ,τ op

op

µ ∗

µ

Note that (φν ⊗ φν )(Tν,τ Tν,ζ ) ∈ (ρˆζ , ρˆτ ) is not zero only when τ = ζ , and that it is a scalar if τ = ζ . Since we have µ

µ

φν (T (ν,ζ )i T (ν,ζ )∗j ) = δi,j

d(µ) , d(ν)d(ζ )

the right-hand side equals to d(µ) d(ξ )d(η) µ d(ξ )d(η) ζ ζ Nν,ζ (m(ξ, η) ⊗ 1)Tξ,η = (m(ξ, η) ⊗ 1)Tξ,η . d(ν)d(ζ ) d(ζ ) d(ζ ) µ Therefore, we get

X=

and so γ (V ) =

ξ,η,ζ

d(ξ )d(η) ζ (m(ξ, η) ⊗ 1)Tξ,η , d(ζ )

d(ξ )d(η) ζ γ (Vη )Vξ γ ((m(ξ, η) ⊗ 1)Tξ,η )Vζ∗ . λd(ζ )

Remark. We say that two Q-systems $(\gamma, V_1, W_1)$ and $(\gamma, V_2, W_2)$ are equivalent if there exists a unitary $u\in(\gamma,\gamma)$ satisfying
$$V_2 = uV_1,\qquad W_2 = u\gamma(u)W_1u^*.$$
This equivalence relation precisely corresponds to inner conjugacy of the corresponding subfactors [17]. In particular, if $\gamma$ is decomposed into automorphisms, the obstruction to this equivalence precisely corresponds to the 2-cohomology obstruction $H^2(G,\mathbb{T})$ for the group actions to cocycle conjugacy. When $\{E(\xi,\eta)\}$ is a, not necessarily non-degenerate, braiding of $\Delta$, we define $W_E$ by
$$W_E = \sum_{\xi,\eta,\zeta}\sqrt{\frac{d(\xi)d(\eta)}{\lambda d(\zeta)}}\;\gamma(V_\eta)V_\xi\,(m(\xi,\eta)\otimes 1)\,T^\zeta_{\xi,\eta}V_\zeta^*.$$
Then, direct computation shows that $(\gamma, V, W_E)$ is a Q-system too. We call the corresponding inclusion the E-twisted Longo–Rehren inclusion of $\Delta$. The above proposition shows that if the braiding is non-degenerate, $B\supset\gamma(A)$ is isomorphic to the E-twisted Longo–Rehren inclusion of $\Delta$. It is easy to show that $(\gamma, V, W)$ and $(\gamma, V, W_E)$ are equivalent only if $m(\xi,\eta)$ is a scalar when it acts on $(\rho_\zeta, \rho_\xi\rho_\eta)$ for all $\xi$, $\eta$ and $\zeta$. Therefore, a priori we do not know if the Longo–Rehren inclusion is self-dual even when it possesses a non-degenerate braiding. However, many arguments for the Longo–Rehren inclusions, such as the Galois correspondence discussed in Subsect. 2.4, work for the E-twisted ones as well.


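In what follows, we assume that $\Delta$ is a closed subsystem of $\hat\Delta$ with a non-degenerate braiding $\{E(\xi,\eta)\}$, whose restriction to $\Delta$ is not necessarily non-degenerate. We denote by $\hat B$ the Longo–Rehren subfactor for $\hat\Delta$, and by $\hat\iota$ the inclusion map from $\hat B$ to $A$. Let $\hat\gamma : A\to\hat B$ be the canonical endomorphism given by
$$\hat\gamma(x) = \sum_{\xi\in\hat\Delta_0}\hat V_\xi\,\hat\rho_\xi(x)\,\hat V_\xi^*.$$
We set
$$\hat W = \sum_{\xi,\eta,\zeta\in\hat\Delta_0}\sqrt{\frac{d(\xi)d(\eta)}{\hat\lambda d(\zeta)}}\;\hat V_\xi\,\hat\rho_\xi(\hat V_\eta)\,T^\zeta_{\xi,\eta}\hat V_\zeta^*,$$
where $\hat\lambda$ is the global index of $\hat\Delta$. The global index of $\Delta$ will be denoted by $\lambda_\Delta$. For a finite direct sum $\sigma$ of sectors in $\hat\Delta$ and a half braiding $\{E^\alpha_\sigma(\xi)\}_{\xi\in\hat\Delta_0}$ of $\sigma$, we set
$$\hat U^\alpha(\sigma) := \sum_{\xi\in\hat\Delta_0}\hat V_\xi\,(E^\alpha_\sigma(\xi)\otimes 1)\,(\sigma\otimes\mathrm{id}^{\mathrm{op}})(\hat V_\xi^*).$$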

For $\hat B$-$\hat B$ sectors, we use the notation $\hat\sigma^\alpha$ instead of $\tilde\sigma^\alpha$ (though $\hat\rho_\xi$ still means $\rho_\xi\otimes\rho_\xi^{\mathrm{op}}$). Note that $\{\hat\rho^{+-}_{\xi,\bar\eta}\}_{\xi,\eta\in\hat\Delta_0}$ is a complete system of representatives of the irreducibles in $D(\hat\Delta)$ thanks to Proposition 6.4 and Corollary 7.2. To avoid possible confusion, we denote by $B_\Delta$, instead of $B$, the Longo–Rehren subfactor for $\Delta$. For the Longo–Rehren inclusion $A\supset B_\Delta$ we use the same notation as before such as $\iota$, $\gamma$, $V$, $W$. Using the Galois correspondence, we regard $B_\Delta$ as an intermediate subfactor of $A\supset\hat B$. More precisely, we set $P = \sum_{\xi\in\Delta_0}\hat V_\xi\hat V_\xi^*$. Then, the conditional expectation $E_\Delta$ from $A$ to $B_\Delta$ is given by
$$E_\Delta(x) = \frac{\hat\lambda}{\lambda_\Delta}\,\hat W^*\hat\gamma(x)P\hat W.$$
We fix an isometry $R\in B_\Delta$ satisfying $RR^* = P$. Then, we have $V_\xi = R^*\hat V_\xi$ for $\xi\in\Delta_0$ (see Subsect. 2.4).

In [28] Ocneanu claims, without any hint of the proof, that the “bipermutant” theorem $(\Delta')' = \Delta$ holds. Thanks to the Galois correspondence, we can show the following stronger statement:

Theorem 7.4. Let $\iota_1 : \hat B\hookrightarrow B_\Delta$ be the inclusion map. We denote by $\lambda_{\Delta'}$ the global index of the “permutant” of $\Delta$. Then,
(i) $[\bar\iota_1\iota_1] = \bigoplus_{\xi\in\Delta'_0}[\hat\rho^{+-}_{\xi,\bar\xi}]$.
(ii) $\lambda_\Delta\lambda_{\Delta'} = \hat\lambda$. In consequence, $(\Delta')' = \Delta$ holds.

Proof. (i) Thanks to Corollary 7.2, we know that $[\bar\iota_1\iota_1]$ is contained in
$$[\hat\gamma|_{\hat B}] = \bigoplus_{\xi\in\hat\Delta_0}[\hat\rho^{+-}_{\xi,\bar\xi}].$$

+− , ι¯ ι ) = dim(ι , ι ρ +− ). Since ρ +− The Frobenius reciprocity implies dim(ρ 1 1 1 1 ξ,ξ¯ ξ,ξ¯ ξ,ξ¯ +− op is the restriction of Ad(U (ρξ,ξ¯ )) · (ρξ,ξ¯ ⊗ id ) to Bˆ , we have +− +− (ι1 , ι1 ρ ) ⊂ (ˆι, ιˆρ ) = CU +− (ρξ,ξ¯ )R¯ ξ . ξ,ξ¯ ξ,ξ¯


+− is contained in ι¯ ι if and only if Thus ρ 1 1 ξ,ξ¯

E (U +− (ρξ,ξ¯ )R¯ ξ ) = U +− (ρξ,ξ¯ )R¯ ξ . Using the braiding fusion equation, we get +− U +− (ρξ,ξ¯ )R¯ ξ = Vˆη (Eξ, (η) ⊗ 1)(ρξ ρξ¯ ⊗ id op )(Vˆη∗ )(R¯ ξ ⊗ 1) ξ¯ ˆ0 η∈

=

ˆ0 η∈

=

+− Vˆη (Eξ, (η) ⊗ 1)(R¯ ξ ⊗ 1)Vˆη∗ ξ¯

Vˆη (m(η, ξ )ρη (R¯ ξ ) ⊗ 1)Vˆη∗ .

ˆ0 η∈

Therefore, we obtain E (U +− (ρξ,ξ¯ )R¯ ξ ) = µ∈0 η,ζ,ν∈ ˆ0

d(µ)d(η) ˆ ν ∗ ζ ˆ∗ Vζ Vν Tµ,η ρˆµ (m(η, ξ )ρη (R¯ ξ ) ⊗ 1)Tµ,η √ λ d(ζ )d(ν)

ζ

µ,η N d(µ)d(η) ˆ Vζ (T (ζµ,η )∗i ρµ (m(η, ξ ))T (ζµ,η )i ρζ (R¯ ξ ) ⊗ 1)Vˆζ∗ . = λ d(ζ )

µ∈0 η,ζ ∈ ˆ 0 i=1

+− is contained in ι¯ ι if and only if the following holds for all This means that ρ 1 1 ξ,ξ¯ ˆ 0: ζ ∈ ζ

µ,η N d(µ)d(η) ζ ∗ ¯ m(ζ, ξ )ρζ (Rξ ) = T (µ,η )i ρµ (m(η, ξ ))T (ζµ,η )i ρζ (R¯ ξ ). λ d(ζ )

µ∈0 η∈ ˆ 0 i=1

Multiplying both sides by ρζ (ρξ (Rξ∗ )) from left, we see that this is equivalent to ζ

µ,η N d(µ)d(η) ζ ∗ T (µ,η )i ρµ (m(η, ξ ))T (ζµ,η )i . m(ζ, ξ ) = λ d(ζ )

(∗)

µ∈0 η∈ ˆ 0 i=1

Thanks to the braiding fusion equation, this is also equivalent to ζ

µ,η N d(µ)d(η) ζ ∗ 1= T (µ,η )i ρµ (E(η, ξ )∗ )m(µ, ξ )∗ ρµ (E(η, ξ ))T (ζµ,η )i . (∗ ) λ d(ζ )

µ∈0 η∈ ˆ 0 i=1

+− is contained in ι¯ ι and set ζ = e in (∗). Then, we get Assume that ρ 1 1 ξ,ξ¯

1=

d(µ)2 d(µ)2 R¯ µ∗ ρµ (m(µ, ¯ ξ ))R¯ µ = φµ¯ (m(µ, ¯ ξ )). λ λ

µ∈0

µ∈0


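Since $\sum_{\mu\in\Delta_0}d(\mu)^2 = \lambda_\Delta$ and $|\phi_{\bar\mu}(m(\bar\mu,\xi))|\le 1$, we get $\phi_{\bar\mu}(m(\bar\mu,\xi)) = 1$ for all $\mu\in\Delta_0$. Thus, we get
$$\phi_{\bar\mu}\bigl((1-m(\bar\mu,\xi))^*(1-m(\bar\mu,\xi))\bigr) = 0,$$
and so $m(\bar\mu,\xi) = 1$ for all $\mu\in\Delta_0$. On the other hand, if $\xi\in\Delta'_0$, the right-hand side of (∗′) is
$$\sum_{\mu\in\Delta_0}\sum_{\eta\in\hat\Delta_0}\sum_{i=1}^{N^\zeta_{\mu,\eta}}\frac{d(\mu)d(\eta)}{\lambda_\Delta d(\zeta)}
= \sum_{\mu\in\Delta_0}\sum_{\eta\in\hat\Delta_0}\frac{d(\mu)d(\eta)}{\lambda_\Delta d(\zeta)}\,N^\zeta_{\mu,\eta}
= \sum_{\mu\in\Delta_0}\frac{d(\mu)^2}{\lambda_\Delta} = 1,$$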

which proves the statement. (ii) Since λ , λ , and λˆ are the indices of A ⊃ B , B ⊃ Bˆ , and A ⊃ Bˆ respectively, we have λ λ = λˆ . Applying this equality to , we get λ = λ( ) , which proves the statement because ( ) ⊃ obviously holds. Remark. Applying the Galois correspondence to the E-twisted Longo–Rehren inclusion, we can show that the above theorem implies that B ⊃ Bˆ is isomorphic to the dual inclusion of the E-twisted Longo–Rehren inclusion of . If is contained in , or in ˆ is a minimal non-degenerate extension of in the sense of [28], then other words, is completely degenerate, i.e. m(ξ, η) = 1 for all ξ, η ∈ . Thus, B ⊃ Bˆ is actually isomorphic to the dual inclusion of the Longo–Rehren inclusion of . Therefore, the structure of the inclusion B ⊃ Bˆ itself depends only on in this case, while at least a priori, the relationship between two inclusions A ⊃ B and B ⊃ Bˆ depends on ˆ In fact, Ocneanu claims in [28] that the minimal the choice of the embedding ⊂ . ˆ non-degenerate extension ⊂ is “essentially unique" (whatever it means). It would be plausible if we could characterize the B − B sectors associated with B ⊃ Bˆ only in terms of the inclusion A ⊃ B . We proceed to show how to construct the sectors in D() in the degenerate case. ˆ we consider only the one given by Theorem 4.6, and we use For braidings of D(), the same notation E. We take a canonical endomorphism γ1 : B −→ Bˆ satisfying γ1 · γ = γˆ . Then, since γˆ |Bˆ = ι¯1 ι¯ιι1 and γ1 |Bˆ = ι¯1 ι1 , we have γ1 |Bˆ = γ1 (W ∗ )γˆ |Bˆ (·)γ1 (W ). ˆ we define the α-inductions of ρ to B and A by For ρ ∈ D(), αρ (x) = γ1−1 · Ad(E(ρ, γ1 |Bˆ )) · ρ · γ1 , x ∈ B , αρ (x) = γˆ −1 · Ad(E(ρ, γˆ |Bˆ )) · ρ · γˆ ,

x ∈ A.

We begin with the following well-known lemma: Lemma 7.5. Let P ⊃ Q be a unital inclusion of C∗ -algebras with a faithful conditional expectation E. If X is an isometry in P such that E(X) is also an isometry, then X ∈ Q. Proof. Compute E((X − E(X))∗ (X − E(X))).
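The computation suggested in the proof of Lemma 7.5 can be written out as follows. It is a sketch using only the bimodule property $E(q_1xq_2)=q_1E(x)q_2$ for $q_1,q_2\in Q$, the relation $E(X^*)=E(X)^*$, unitality $E(1)=1$, and the two isometry conditions $X^*X=1$ and $E(X)^*E(X)=1$:

```latex
\begin{aligned}
E\bigl((X-E(X))^*(X-E(X))\bigr)
 &= E(X^*X) - E(X^*)E(X) - E(X)^*E(X) + E(X)^*E(X)\\
 &= 1 - E(X)^*E(X)\\
 &= 0 .
\end{aligned}
```

Since $E$ is faithful, $X - E(X) = 0$, that is, $X = E(X)\in Q$.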


175 β

ˆ with a half braiding {Eσ (ξ )} ˆ . Lemma 7.6. Let σ be a finite direct sum of sectors in ξ ∈0 β β ∗ Then, we have Uˆ (σ )U (σ ) ∈ B and = Ad(Uˆ β (σ )U β (σ )∗ ) · σ˜ β . ασ ˆβ Proof. Since γ1 (W ) ∈ (γ1 |Bˆ , γˆ |Bˆ ), we have γ1 (W )E(σˆ β , γ1 |Bˆ ) = E(σˆ β , γˆ |Bˆ )σˆ β (γ1 (W )), and in particular, E(σˆ β , γ1 |Bˆ ) = γ1 (W ∗ )E(σˆ β , γˆ |Bˆ )σˆ β (γ1 (W )). Thus, for x ∈ B we have, αρ (x) = γ1−1 · Ad(E(σˆ β , γ1 |Bˆ )) · σˆ β · γ1 (x)

= γ1−1 (γ1 (W ∗ )E(σˆ β , γˆ |Bˆ )σˆ β (γ1 (W x))E(σˆ β , γ1 |Bˆ )∗ )

= γ1−1 (γ1 (W ∗ )E(σˆ β , γˆ |Bˆ )σˆ β (γ1 (γ (xW )))E(σˆ β , γ1 |Bˆ )∗ )

= γ1−1 (γ1 (W ∗ )E(σˆ β , γˆ |Bˆ )σˆ β (γˆ (x))E(σˆ β , γˆ1 |Bˆ )∗ )γ1 (W ))

= γ1−1 (γ1 (W ∗ )γˆ (ασˆ β (x))γ1 (W ))) = W ∗ γ (ασˆ β (x))W = E (ασˆ β (x)).

is a ∗-homomorphism, we obtain E (ασˆ β (x)) = ασˆ β (x) for every isometry Since ασ ˆβ x ∈ B thanks to Lemma 7.5, and so it holds for every x ∈ B . Thus, Corollary 6.3 implies = Ad(Uˆ β (σ )) · (σ ⊗ id op )|B . ασ ˆβ Using Uˆ β (σ )(σ ⊗ id op )(R ) = R U β (σ ), we get (R ) = R U β (σ )Uˆ β (σ )∗ ∈ B ασ ˆβ and U β (σ )Uˆ β (σ )∗ ∈ B . Therefore, we obtain = Ad(Uˆ β (σ )U β (σ )∗ ) · Ad(U β (σ )) · (σ ⊗ id op )|B = Ad(Uˆ β (σ )U β (σ )∗ ) · σβ. ασ ˆβ ˆ We assume that ˆ has a non-degenerate Theorem 7.7. Let be a closed subsystem of . braiding {E(ξ, η)}ξ,η∈ˆ 0 , which is not necessarily non-degenerate on . Then, every +− for some ξ, η ∈ ˆ 0. irreducible sector in D() is contained in ρ ξ,η¯ +− for Proof. Let ρ ∈ D() be an irreducible sector. We claim that ι¯1 ρι1 contains ρ ξ,η¯ ˆ Indeed, note that ιˆι¯1 ρι1 = ιι1 ι¯1 ρι1 contains ιρι1 that is decomposed into some ξ, η ∈ . sectors in {(ρξ ⊗ id op )ˆι}ξ ∈ˆ 0 from the definition D(). Thus, there exists an irreducible ˆ 0. Bˆ −Bˆ sector ρ contained in ι¯1 ρι1 such that ιˆρ contains (ρξ ⊗id op )ˆι for some ξ ∈ ˆ which The Frobenius reciprocity implies that ρ is contained in ¯ιˆ(ρξ ⊗ id op )ˆι ∈ D(),


+− is contained in ι¯ ρι . Thanks to the Frobenius shows the claim. We assume that ρ ξ,η¯ 1 1 reciprocity, we get

dim(αρ γ , ρ) = dim(αρ ι , ρι1 ) +− 1 +− 1 ξ,η¯

ξ,η¯

+− +− = dim(ι1 ρ , ρι1 ) = dim(ρ , ι¯1 ρι1 ) % = 0. ξ,η¯ ξ,η¯ +− γ . Let π be an irreducible Thus the above lemma shows that ρ is contained in ρ ξ,η¯ 1 +− +− π = (ρ ρ ⊗ id op )ιπ component of γ1 such that ρ π contains ρ. Then, ιρ ξ,η¯ ξ,η¯ ξ η¯ contains (ρζ ⊗ id op )ι for some ζ ∈ 0 . The Frobenius reciprocity implies that ιπ ˆ 0 . On the other hand, we have contains (ρν ⊗ id op )ι for some ν ∈

dim(ιγ1 , (ρν ⊗ id op )ι) = dim(ˆι, (ρν ⊗ id op )ˆι) = dim(γˆ , (ρν ⊗ id op )) = δν,e . +− Since γ1 contains idB with multiplicity one, we conclude π = idB . Thus ρ ξ,η¯ contains ρ.

Before closing this section, we apply the above results to the Hecke algebra subfactors [34]. In [8, 28], a precise description of the asymptotic inclusions of the Hecke algebra subfactors is obtained in the cases of SU (2) and SU (3), which is indeed our motive example of the above theorem. We assume that is isomorphic to the set of irreducible even bimodules of the SU (n)k Hecke algebra subfactor where k is the level. As in [8] we ˆ associated with SU (n)k WZW model, or embed into the full system of irreducibles equivalently the set of positive energy irreducible projective representations of the loop ˆ has a natural Zn -grading and is the set of grade 0 elements. group LSU (n) [33]. ˆ 0 , gr(ξ ) denotes the grading of ξ . We set For ξ ∈ ˆ 0,i = {ξ ∈ ˆ 0 ; gr(ξ ) = i}. ˆ is always non-degenerate, it is not always the case for . As in [8], we Although assume that k is a multiple of n for simplicity. Then, there is a sector ρσ ∈ with d(σ ) = 1 such that ρσN is an inner automorphism and {id, ρσ , ρσ2 , · · · ρσn−1 } is precisely the set of degenerate sectors (Lemma 3.5 of [8]). We use the notation ρσ i = ρσi , i = 1, 2, · · · n−1. There exists a unique sector ρf ∈ such that ρf is self-conjugate ˆ such that ρf ρ f contains and [ρσ ρf ] = [ρf ]. ρf is the unique irreducible sector in ρσ i for some i = 1, 2, · · · n − 1. +− contains a sector in D() only if ξ and η have the It is easy to see that ρ ξ,η¯ ˆ 0 have the same grading, ρξ,η¯ is same grading. On the other hand, if ξ and η in +− ∈ D(). Thanks to Theorem 7.1, we know decomposed into sectors in and ρ ξ,η¯ +− is reducible if and only if (ξ, η) = (f, f ), and that ρ +− and ρ +− that ρ ξ,η¯ ξ1 ,η¯1 ξ2 ,η¯2 are equivalent if and only (ξ1 , η1 ) = (σ i ξ2 , σ i η2 ) for some i. Thus, we introduce an ˆ 0,i × ˆ 0,i for i % = gr(f ) and ˆ 0,i × ˆ 0,i \ {(f, f )} for equivalence relation in j j i = gr(f ) by setting (ξ1 , η1 ) ∼ (ξ2 , η2 ) if (ξ1 , η1 ) = (σ ξ2 , σ η2 ) for some j . 
We denote by $J_i$, $i = 0, 1, \dots, n-1$, the quotients by this equivalence relation, and take a complete system of representatives $\{(\xi_s,\eta_s)\}_{s\in J_i}$. We set $J = \bigcup_i J_i$. Theorem 7.1 shows that if $(\xi,\eta)\ne(f,f)$ there is no non-zero intertwiner between $\tilde\rho^{+-}_{\xi,\bar\eta}$ and $\tilde\rho^{+-}_{f,\bar f}$. Therefore, Theorem 7.7 implies that the set of irreducibles of $D(\Delta)$ consists of the irreducible subsectors of $\tilde\rho^{+-}_{f,\bar f}$ and $\{\tilde\rho^{+-}_{\xi_s,\bar\eta_s}\}_{s\in J}$.

The following lemma (Assumption 3.14 of [8]) was proved in [8] for the SU(2) and SU(3) cases.

Lemma 7.8. $\tilde\rho^{+-}_{f,\bar f}$ is decomposed into mutually inequivalent sectors with equal dimensions.

Proof. Since [ρσ ρf ] = [ρf ], we may and do assume ρσ ρf = ρf by taking an appropriate representative ρσ . In this case, ρσn = id holds because it is an inner automorphism implemented by a unitary in (ρf , ρf ) = C. Then, we have (ρf , ρf ρσ i ) = CE(σ i , f ) and (ρf , ρσ i ρf ) = C1. Thus, Theorem 7.1 implies +− +− (ρ , ρ )={ ci γ (E(f, σ i ) ⊗ 1); ci ∈ C}. f,f¯ f,f¯ i

The braiding fusion equation and the condition ρσ ρf = ρf imply ρσ (E(f, σ )) = ωσ E(f, σ ), where E(σ, σ ) = ωσ . Therefore, we get − i(i−1) 2

E(f, σ i ) = ρσi−1 (E(f, σ ))ρσi−2 (E(f, σ )) · · · E(f, σ ) = ωσ

E(f, σ )i .

n(n−1)

In particular, we have E(f, σ )n = ωσ 2 that is a scalar. Note that φf (E(f, σ i )) = 0 +− to for i % = 0 because it belongs to ∈ (ρσi , id). Let τf be the restriction of φρ f,f¯

+− , ρ +− ), which is the canonical trace. Then, the above argument shows that (ρ f,f¯ f,f¯ +− , ρ +− ), τ ) is isomorphic to (CZ , τ ), where τ is the natural trace the pair ((ρ f n f,f¯ f,f¯ +− , ρ +− ) ∼ Cn and the trace of the group algebra of Zn . This means that (ρ = f,f¯ f,f¯ evaluation of each minimal projection is equal, which proves the statement.
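The final step of this proof reduces to a fact about the group algebra $\mathbb{C}\mathbb{Z}_n$ with its natural trace: the $n$ minimal projections all have trace $1/n$. The following Python sketch checks this numerically. The function names are illustrative only (they do not come from the paper), and elements of $\mathbb{C}\mathbb{Z}_n$ are modelled as coefficient lists indexed by the powers of a generator $g$.

```python
import cmath


def group_algebra_product(a, b):
    """Multiply two elements of C[Z_n] given as coefficient lists for g^0..g^(n-1)."""
    n = len(a)
    out = [0j] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % n] += ai * bj
    return out


def minimal_projection(n, k):
    """Coefficients of e_k = (1/n) * sum_j w^(-j*k) g^j, where w = exp(2*pi*i/n)."""
    w = cmath.exp(2j * cmath.pi / n)
    return [w ** (-j * k) / n for j in range(n)]


def natural_trace(a):
    """Natural trace on C[Z_n]: the coefficient of the identity element g^0."""
    return a[0]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two coefficient lists up to a small tolerance."""
    return all(abs(x - y) < tol for x, y in zip(a, b))


if __name__ == "__main__":
    n = 5
    for k in range(n):
        e_k = minimal_projection(n, k)
        # Each e_k is idempotent and has the same trace value.
        assert close(group_algebra_product(e_k, e_k), e_k)
        print(k, natural_trace(e_k))
```

Equal traces of the minimal projections correspond, under the isomorphism stated in the proof, to equal dimensions of the irreducible components.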

Let $\gamma(E_0\otimes 1), \gamma(E_1\otimes 1), \dots, \gamma(E_{n-1}\otimes 1)$ be minimal projections of $(\tilde\rho^{+-}_{f,\bar f}, \tilde\rho^{+-}_{f,\bar f})$. We take isometries $X_i\in M$ satisfying $X_iX_i^* = E_i$, and set
$$\pi_i(x) = \gamma(X_i^*\otimes 1)\,\tilde\rho^{+-}_{f,\bar f}(x)\,\gamma(X_i\otimes 1).$$
Then, the above argument shows the following for the general SU(n) case:

Corollary 7.9. $\bigcup_{s\in J}\{\tilde\rho^{+-}_{\xi_s,\bar\eta_s}\}\cup\{\pi_0, \pi_1, \dots, \pi_{n-1}\}$ is the set of irreducibles of $D(\Delta)$.

Remark. (i) To obtain the principal graph, we need to compute the irreducible decomposition of $X_i^*\rho_f^2(\cdot)X_i$. (ii) The braiding of $D(\Delta)$ is given as follows:

+−

E(ρ , ρ ) ξ1 ,η¯1 ξ2 ,η¯2 = γ ((ρξ2 (E(ξ1 , η¯2 ))E(ξ1 , ξ2 )ρξ1 ρξ2 (E(η¯2 , η¯1 )∗ )ρξ1 (E(ξ2 , η¯2 )∗ )) ⊗ 1), +− , πi ) E(ρ ξ,η¯

+− +− = γ (Xi∗ ⊗ 1)E(ρ , ρ )γ (ρξ,η¯ (Xi ) ⊗ 1), ξ,η¯ f,f¯ +− ) E(πi , ρ ξ,η¯ +− +− = γ (ρξ,η¯ (Xi∗ ) ⊗ 1)E(ρ , ρ )γ (Xi ⊗ 1), ξ,η¯ f,f¯

E(πi , πj ) +− +− = γ (Xj∗ ρf,f (Xi∗ ) ⊗ 1)E(ρ , ρ )γ (ρf,f (Xj )Xi ⊗ 1). f,f¯ f,f¯


Thus, using Lemma 5.4 we can obtain explicit formulae for the S and T matrices in terms of the braiding of and {Ei }n−1 i=0 . Actually we can evaluate them except for Sπi ,πj : +− = ωξ ωη , ωρ ξ,η¯

Sξ1 ,ξ2 Sη1 ,η2 = nSξ1 ,ξ2 Sη1 ,η2 , λ|Se,e |2

Sρ +− ,ρ +− = ξ1 ,η¯1

ξ2 ,η¯2

+− ,π = Sρ ξ,η¯ i

Sπi ,πj =

ωπi = 1,

Sξ,f Sη,f = Sξ,f Sη,f , nλ|Se,e |2

d(f )4 3 φ (Eρ(E)φ(E ∗ ρ(E ∗ )Ej ρ 2 (Ei )ρ(E)E ∗ )ρ(E)E ∗ ), λ

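where $\phi = \phi_f$, $\rho = \rho_f$, $E = E(f,f)$. We leave the details.

(iii) Thanks to Theorem 7.4, we know that $B_\Delta$ is the crossed product of $\hat B$ by the $\mathbb{Z}_n$ action arising from $\{\hat\rho^{+-}_{\sigma,\bar\sigma}\}$. Thus, $D(\Delta)$ may also be obtained by applying “treatment of orbifold” to $D(\hat\Delta)$ as in [3, 27]. We give a more precise description of $B_\Delta\supset\hat B$. As before we may and do assume $\rho_\sigma^n = \mathrm{id}$ and $\rho_{\bar\sigma} = \rho_\sigma^{-1}$. Since $\theta := \hat\rho^{+-}_{\sigma,\bar\sigma}$ is the restriction of $\mathrm{Ad}(U^{+-}(\rho_{\sigma,\bar\sigma}))\cdot(\rho_{\sigma,\bar\sigma}\otimes\mathrm{id}^{\mathrm{op}})$ and $\rho_{\sigma,\bar\sigma}$ is trivial now, $\theta$ is implemented by $U := U^{+-}(\rho_{\sigma,\bar\sigma})$ on $\hat B$. Note that since $\rho_\xi\rho_\sigma$ is irreducible, the monodromy operator $m(\xi,\sigma)$ is a scalar, and in fact we have
$$m(\xi,\sigma) = e^{2\pi\sqrt{-1}\,\mathrm{gr}(\xi)/n}$$
(see [21]). The braiding fusion equation implies
$$E^{+-}_{\rho_{\sigma,\bar\sigma}}(\xi) = E(\sigma,\xi)\,\rho_\sigma(E(\xi,\bar\sigma)^*) = E(\sigma,\xi)E(\xi,\sigma) = m(\xi,\sigma).$$
Thus, $B_\Delta$ is generated by $\hat B$ and the implementing unitary
$$U = \sum_{\xi\in\hat\Delta_0} e^{2\pi\sqrt{-1}\,\mathrm{gr}(\xi)/n}\,\hat V_\xi\hat V_\xi^*.$$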
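That this unitary generates a copy of $\mathbb{Z}_n$ can be checked directly. The following is a sketch, assuming that the range projections $\hat V_\xi\hat V_\xi^*$, $\xi\in\hat\Delta_0$, are mutually orthogonal with sum $1$ (as for the isometries defining $\hat\gamma$) and that $\mathrm{gr}(\xi)$ takes integer values modulo $n$:

```latex
U^k = \sum_{\xi\in\hat\Delta_0} e^{2\pi\sqrt{-1}\,k\,\mathrm{gr}(\xi)/n}\,\hat V_\xi\hat V_\xi^*
\quad (k\in\mathbb{Z}),
\qquad
U^n = \sum_{\xi\in\hat\Delta_0} \hat V_\xi\hat V_\xi^* = 1 .
```

The spectral projection of $U$ for the eigenvalue $e^{2\pi\sqrt{-1}\,i/n}$ is then the sum of $\hat V_\xi\hat V_\xi^*$ over the sectors of grade $i$.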

Note added in proof. Using Eq. (2.30) in [30], one can show that Q-systems with “twist” discussed in Sect. 7 are in fact equivalent to ordinary ones. The author would like to thank Y. Kawahigashi and K.-H. Rehren for informing him of this fact.

Acknowledgement. The author would like to thank J. Böckenhauer, D. E. Evans, Y. Kawahigashi, R. Longo, M. Müger, K.-H. Rehren, and J. E. Roberts for useful comments about the subjects. A part of this work was done when the author visited the University of Rome “Tor Vergata” and he acknowledges their hospitality. This work is partially supported by Sumitomo Foundation.

References

1. Asaeda, M., Haagerup, U.: Exotic subfactors of finite depth with Jones index (5+√13)/2 and (5+√17)/2. Commun. Math. Phys. 202, 1–63 (1999)
2. Böckenhauer, J., Evans, D.: Modular invariants, graphs and α-induction for nets of subfactors I. Commun. Math. Phys. 197, 361–386 (1998)
3. Böckenhauer, J., Evans, D.: Modular invariants, graphs and α-induction for nets of subfactors II. Commun. Math. Phys. 200, 57–103 (1999)
4. Böckenhauer, J., Evans, D.: Modular invariants, graphs and α-induction for nets of subfactors III. Commun. Math. Phys. 205, 183–228 (1999)
5. Böckenhauer, J., Evans, D., Kawahigashi, Y.: On α-induction, chiral generators and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999)
6. Dijkgraaf, R., Witten, E.: Topological gauge theories and group cohomology. Commun. Math. Phys. 129, 393–429 (1990)
7. Evans, D. E., Kawahigashi, Y.: On Ocneanu’s theory of asymptotic inclusions for subfactors, topological quantum field theories and quantum doubles. Int. J. Math. 6, 205–228 (1995)
8. Evans, D. E., Kawahigashi, Y.: Orbifold subfactors from Hecke algebras II. Quantum double and braiding. Commun. Math. Phys. 196, 331–361 (1998)
9. Evans, D. E., Kawahigashi, Y.: Quantum Symmetries on Operator Algebras. Oxford: Oxford University Press, 1998
10. Fredenhagen, K., Rehren, K. H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras. II. Geometric aspects and conformal covariance. Rev. Math. Phys. Special Issue, 113–157 (1992)
11. Haagerup, U.: Principal graphs of subfactors in the index range 4 < [M : N] < 3 + √2. In: Subfactors, (ed. H. Araki, et al.) Singapore: World Scientific, 1994, pp. 1–38
12. Hiai, F.: Minimizing indices of conditional expectations onto a subfactor. Publ. RIMS. Kyoto Univ. 24, 673–678 (1988)
13. Izumi, M.: Application of fusion rules to classification of subfactors. Publ. RIMS. Kyoto Univ. 27, 953–994 (1991)
14. Izumi, M.: Subalgebras of infinite C∗-algebras with finite Watatani indices I. Cuntz algebras. Commun. Math. Phys. 155, 157–182 (1993)
15. Izumi, M.: Subalgebras of infinite C∗-algebras with finite Watatani indices II. Cuntz-Krieger algebras. Duke J. Math. 91, 409–461 (1998)
16. Izumi, M., Kosaki, H.: Finite-dimensional Kac algebras arising from certain group actions on a factor. Internat. Math. Res. Notices 8, 357–370 (1996)
17. Izumi, M., Kosaki, H.: In preparation
18. Izumi, M., Longo, R., Popa, S.: A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a generalization to Kac algebras. J. Funct. Anal. 155, 25–63 (1998)
19. Kassel, C.: Quantum Groups. Graduate Texts in Mathematics 155. New York: Springer-Verlag, 1995
20. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Preprint
21. Kohno, T., Tanaka, T.: Symmetry of Witten’s 3-manifold invariants for sl(n, C). J. Knot Theory Ramif. 2, 149–169 (1993)
22. Longo, R.: Index of subfactors and statistics of quantum fields I., II. Commun. Math. Phys. 126, 217–247 (1989); 130, 285–309 (1990)
23. Longo, R.: Duality for Hopf algebras and for subfactors. I. Commun. Math. Phys. 159, 133–150 (1994)
24. Longo, R., Rehren, K. H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
25. Masuda, T.: An analogue of Longo’s canonical endomorphism for bimodule theory and its application to asymptotic inclusions. Int. J. Math. 8, 249–265 (1997)
26. Müger, M.: Categorical approach to paragroups I. The Quantum Double of Tensor ∗-Categories and Subfactors. In preparation
27. Müger, M.: Galois theory for braided tensor categories and the modular closure. Preprint
28. Ocneanu, A.: Chirality for operator algebras. In: Subfactors, ed. H. Araki, et al. Singapore: World Scientific, 1994, pp. 39–63
29. Popa, S.: Classification of amenable subfactors of type II. Acta Math. 172, 163–255 (1994)
30. Rehren, K.-H.: Braid group statistics and their superselection rules. In: The algebraic Theory of Superselection Sectors, D. Kastler, ed., Singapore: World Scientific, 1990
31. Sutherland, C.: Cohomology and extensions of von Neumann algebras. I, II. Publ. Res. Inst. Math. Sci. 16, 105–133; 135–174 (1980)
32. Wakui, M.: On Dijkgraaf–Witten invariant for 3-manifolds. Osaka J. Math. 29, 675–696 (1992)
33. Wassermann, A.: Operator algebras and conformal field theory. III. Fusion of positive energy representations of LSU(N) using bounded operators. Invent. Math. 133, 467–538 (1998)
34. Wenzl, H.: Hecke algebras of type A and subfactors. Invent. Math. 92, 345–383 (1988)
35. Xu, F.: New braided endomorphisms from conformal inclusion. Commun. Math. Phys. 192, 345–403 (1998)

Communicated by H. Araki

Commun. Math. Phys. 213, 181 – 201 (2000)


On Dynamics of Mostly Contracting Diffeomorphisms

Dmitry Dolgopyat

Department of Mathematics, Penn State University, State College, PA 16802, USA. E-mail: [email protected]

Received: 2 February 1999 / Accepted: 11 February 2000

Abstract: Mostly contracting diffeomorphisms are the simplest examples of robustly nonuniformly hyperbolic systems. This paper studies the mixing properties of mostly contracting diffeomorphisms.

1. Introduction

This paper treats a class of partially hyperbolic systems with non-zero Lyapunov exponents. Before stating our result let us recall some recent work motivating our research. In recent years there have been several advances in the understanding of statistical properties of weakly hyperbolic dynamical systems. On the one hand, L.-S. Young developed quantitative Pesin theory in [38, 39]. Among other things she proved that if a diffeomorphism f has a Pesin set such that the distribution of the return time to this set has an exponentially decaying tail and if f has no discrete spectrum then it is exponentially mixing. This theory was applied to a number of examples in the above mentioned papers as well as in [3, 10]. On the other hand, M. Grayson, C. Pugh and M. Shub showed ([13, 27, 28]) that partial hyperbolicity can give rise to good ergodic behavior in a robust way. Further examples of systems satisfying their criteria can be found in [7, 8, 16, 20, 37]. These results lead to the natural question of whether there is an open set of (partially hyperbolic) systems satisfying the conditions of Young’s theory with uniform bounds. This question was addressed in a number of papers [1, 2, 4, 32, 36]. Our paper also fits into this framework.

Let us give a few definitions. Let f be a diffeomorphism of a smooth manifold X and let ν be an ergodic f-invariant measure. We call ν an SRB-measure for f if there is a subset $Y(\nu)\subset X$ of positive Lebesgue measure such that for almost all $y\in Y(\nu)$ and any continuous function A we have
$$\frac{1}{n}\sum_{j=0}^{n-1}A(f^j y)\to\nu(A).$$
$Y(\nu)$ is called a


basin of attraction of ν. Certainly the question of existence of SRB-measures and their dependence on parameters is quite important in smooth ergodic theory. We say that f has a global attractor if there is only one SRB-measure whose basin is all of X. (Our use of the word attractor follows that of [35, 27]. More precisely, ν or supp(ν) should be called a stochastic attractor because it describes the statistical properties of large iterations of f . For a more topological approach see [19].) Let S be a subset of Diff r (X), r > 1 endowed with some topology (think of S as a parameter space) and let f ∈ S. We call f statistically stable in S if any diffeomorphism g in some neighborhood of f in S has a finite number of SRB-measures ν1 (g), ν2 (g) . . . νk (g), the maps g → νj (g) are continuous and the union of basins of νj (g) is all of X. If k = 1 we call f strongly statistically stable. Below we deal with the case when S = Diff 2 (X) with uniform C 2 -topology. In this note we provide some sufficient conditions for statistical stability as well as for other good statistical properties. Our main results are the following. Theorem I. Let f be a partially hyperbolic dynamically coherent u-convergent mostly contracting diffeomorphism of a three-dimensional manifold X. Then (a) f has a global attractor ν; (b) for any γ > 0 there are constants C, ζ < 1 such that if A, B ∈ C γ (X) then for positive n B(x)A(f n x)dx − ν(A) B(x)dx ≤ Cζ n ||A||γ ||B||γ , B(x)A(f n x)dν(x) − ν(A)ν(B) ≤ Cζ n ||A||γ ||B||γ ; (c) f has non-zero Lyapunov exponents. Remark. Parts (a) and (c) of this theorem were established before in [4] for a larger class of systems. It follows from (c) and the results of [11, 21] that the system (X, f, ν) is a Bernoulli shift. Remark. In fact, we prove more than (a). Namely we show that the image under f n of any unstable leaf becomes equidistributed. 
In [12] we proved that diffeomorphisms having this property satisfy many classical limit theorems of probability theory.

Theorem II. Let $f$ be as in Theorem I. If in addition $f$ is stably dynamically coherent then $f$ is strongly statistically stable. More precisely, there exists a neighborhood $O(f) \subset \mathrm{Diff}^2(X)$ such that any $g \in O(f)$ satisfies the conditions of Theorem I and the constants $C$, $\zeta$ in Theorem I(b) can be chosen uniformly in $O(f)$. In particular if $\nu_g$ is the SRB measure for $g$ then for any $\gamma > 0$ the map $g \mapsto \nu_g$, $O(f) \to (C^\gamma(X))^*$, is Hölder continuous.

See Sects. 2 and 3 for the definition of the terms appearing in the formulation of this theorem. In Sect. 4 we show that for mostly contracting diffeomorphisms the second forward Lyapunov exponent of almost every point is negative and prove large deviation estimates for the exceptional set. This is done by certain submartingale estimates more common in the theory of stochastic differential equations. In Sect. 5 we recall the construction of u-Gibbs measures [26] and show that in our situation they are SRB measures. The uniqueness of SRB measures is treated in Sects. 6–9. In Sect. 6 we recall

Dynamics of Mostly Contracting Diffeomorphisms


the coupling method of L.-S. Young. In Sect. 7 we describe a coupling algorithm for our system. The properties of this algorithm are studied in Sects. 8 and 9. The proofs of the main theorems are completed in Sect. 10. In Sects. 11 and 12 we discuss some examples. Final remarks and some open questions are presented in Sect. 13. Remark. Independently and slightly earlier A. Castro [9] proved a result similar to our Theorem I. However, because of some technical assumptions in his paper it is not clear if his result can be applied to the examples of Sect. 12.

2. Partial Hyperbolicity

In this and the next sections we describe the properties of $f$ which appear in the statement of Theorem I. As was mentioned in the introduction, $f$ is a diffeomorphism of $X^3$. We also assume that $f$ is partially hyperbolic and stably dynamically coherent. Thus the tangent bundle of $X$ is the sum of three continuous one-dimensional subbundles $E_u$, $E_c$ and $E_s$ such that
$$e^{\lambda_1} \le (df|E_s) \le e^{\lambda_2}, \tag{1}$$
$$e^{\lambda_3} \le (df|E_c) \le e^{\lambda_4}, \tag{2}$$
$$e^{\lambda_5} \le (df|E_u) \le e^{\lambda_6}, \tag{3}$$

where $\lambda_1 \le \lambda_2 < \lambda_3 \le \lambda_4 < \lambda_5 \le \lambda_6$ and $\lambda_2 < 0$, $\lambda_5 > 0$. $E_u$ and $E_s$ are always integrable so they are tangent to $f$-invariant foliations: unstable ($W^u$) and stable ($W^s$). Dynamical coherence means that $E_c$, $E_c \oplus E_u$ and $E_c \oplus E_s$ are also tangent to $f$-invariant foliations which are called central ($W^c$), center-unstable ($W^{cu}$) and center-stable ($W^{cs}$) respectively, and that $W^c$ subfoliates both $W^{cu}$ and $W^{cs}$. (In fact only unique integrability of $E_c \oplus E_s$ is used in the paper.) Stable dynamical coherence means that any $g$ close to $f$ is also dynamically coherent. The openness of these conditions was studied in [14]. Namely, partial hyperbolicity is open. It is unknown if dynamical coherence is open, but if the center foliation of $f$ is $C^1$ then $f$ is stably dynamically coherent.

Let $\mathcal{V}$ be the set of all unstable curves of lengths between 1 and 2. $\mathcal{V}$ is a Markov family in the sense that $\forall V \in \mathcal{V}$ there is a finite set $\{\tilde V_j\}$ of elements of $\mathcal{V}$ such that $fV = \bigcup_j \tilde V_j$. More generally for any unstable curve $U$ of length greater than one there is a finite set $\{V_j\}$ such that $V_j \in \mathcal{V}$ and
$$U = \bigcup_j V_j. \tag{4}$$

We call (4) a Markov decomposition of $U$. We will use (4) for $U = f^n V$, where $V \in \mathcal{V}$, $n > 0$. We call $f$ u-convergent if $\forall \varepsilon$ $\exists n > 0$ $\forall V_1, V_2 \in \mathcal{V}$ $\exists x_j \in V_j$ such that $d(f^n x_1, f^n x_2) \le \varepsilon$. Later on we show that u-convergence is open among mostly contracting diffeomorphisms.
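As a purely illustrative sketch of the decomposition (4), one can model an unstable curve by an interval of length at least 1 and cut it into pieces whose lengths lie between 1 and 2. The function name `markov_decomposition` and the choice of equal-length pieces are assumptions made here for illustration; the text only requires that every piece belongs to the family of curves of length between 1 and 2.

```python
import math


def markov_decomposition(length):
    """Split the interval [0, length] into pieces with lengths in [1, 2].

    Illustrative model only: the unstable curve is replaced by an interval
    and the pieces are taken of equal length, one of many valid choices.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    k = math.floor(length)      # number of pieces, so that 1 <= length / k < 2
    piece = length / k
    return [(i * piece, (i + 1) * piece) for i in range(k)]
```

Any other cutting rule that keeps every piece length in [1, 2] would serve equally well; equal pieces are used only because they make the length bound immediate.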


3. Mostly Contracting Systems

The assumptions of Sect. 2 are routine partial hyperbolicity assumptions. The next property guarantees that $f$ is non-uniformly hyperbolic. In order to formulate it we need to recall the definition of canonical densities on $W^u$. We would like to study the SRB-measure for $f$. A priori we do not know if it exists, but if it does then its conditional densities on $W^u$ are given by [26]. Fix a Riemannian structure on $X$. It induces a metric on $W^u$-fibers. Let $V$ be an interval inside $W^u$. For $z_1, z_2 \in V$ let
$$\rho(z_1, z_2) = \prod_{j=0}^{\infty} \frac{(df^{-1}|E_u)(f^{-j} z_1)}{(df^{-1}|E_u)(f^{-j} z_2)}. \tag{5}$$

Fix some $z_0 \in V$ and let $\rho_V(z) = C\rho(z, z_0)$, where $C = \left(\int_V \rho(z, z_0)\,dz\right)^{-1}$. Since $\rho(z, z_0') = \rho(z, z_0)\rho(z_0, z_0')$ this definition actually does not depend on $z_0$. Also if $W = fV$, $y = fz$ then
$$\rho_V(z)\,dz = C_V\rho(z, z_0)\,dz = C_V\rho(z, z_0)(df^{-1}|E_u)(y)\,dy = \tilde C_V\rho(z, z_0)\frac{(df^{-1}|E_u)(y)}{(df^{-1}|E_u)(y_0)}\,dy = C_W\rho(y, y_0)\,dy.$$
Thus if $A$ is continuous then
$$\int_V A(fz)\rho_V(z)\,dz = \int_{fV} A(y)\rho_{fV}(y)\,dy.$$

Our last assumption is the following. There is a positive constant $\alpha_0$ such that for any $V \in \mathcal{V}$,
$$\int_V \rho_V(x)\ln(df|E_c)(x)\,dx \le -\alpha_0 < 0. \tag{6}$$

We call $f$ mostly contracting if some positive power of it satisfies (6). In the proof of Theorem I we assume, as we may, that $f$ itself satisfies (6).

Remark. Condition (6) is $C^2$-open. Indeed by standard theory ([14]) the map $\mathrm{center}(f) = (df|E_c)(x)$ is continuous: $\mathrm{Diff}^2(X) \to C(X)$. Let $V(x, f, t)$ denote an $f$-unstable curve of length $t$ centered at $x$. Let $V(x, f, t)(\tau)$ be its arclength parameterization. Denote $T = \{(t, \tau) : 1 \le t \le 2,\ 0 \le \tau \le t\}$. Then the map $\mathrm{density}(f) = \rho_{V(x,f,t)}(V(x, f, t)(\tau))$ is continuous: $\mathrm{Diff}^2(X) \to C(X \times T)$ since $\rho$ is a uniform limit of continuous functions. The map center is also continuous in the $C^1$-topology, but density is not because convergence in (5) may fail to be uniform in $f$ (cf. [23]). Thus it is unclear if the mostly contracting property is $C^1$-open.

4. Large Deviations

Here we prove

Theorem 4.1. $\exists C_1, s > 0$, $\theta_1 < 1$ such that $\forall V \in \mathcal{V}$ $\forall n > 0$,
$$\int_V \rho_V(x)\,(df^n|E_c)^s(x)\,dx \le C_1\theta_1^n. \tag{7}$$


The proof consists of a number of lemmas.

Lemma 4.2. $\forall n > 0$
$$\int_V \rho_V(x)\ln(df^n|E_c)(x)\,dx \le -n\alpha_0.$$

Proof. We have
$$\int_V \rho_V(x)\ln(df^n|E_c)(x)\,dx = \int_V \rho_V(x)\ln(df|E_c)(x)\,dx + \int_{fV}\rho_{fV}(y)\ln(df^{n-1}|E_c)(y)\,dy.$$
Let $fV = \bigcup_j V_j$ be an almost Markov decomposition. Then the second term equals
$$\sum_j c_j\int_{V_j}\rho_{V_j}(y)\ln(df^{n-1}|E_c)(y)\,dy,$$
where $c_j = \int_{f^{-1}V_j}\rho_V(x)\,dx$. By induction
$$\int_{V_j}\rho_{V_j}(y)\ln(df^{n-1}|E_c)(y)\,dy \le -(n-1)\alpha_0.$$
Summation over $j$ proves the lemma.

The following distortion bound is standard (see, for example [4], Lemma 3.3).

Proposition 4.3. There is a constant $C$ so that $\forall n > 0$ $\forall V \in \mathcal{V}$ $\forall x_1, x_2 \in f^{-n}V$,
$$\left|\ln(df^n|E_c)(x_1) - \ln(df^n|E_c)(x_2)\right| \le C.$$

If $A$ is a continuous function and $U$ is a piece of unstable manifold we write $\|A\|_U = \max_{x\in U}|A(x)|$.

Corollary 4.4. There exists $\alpha_1$ such that if $n$ is large enough then for any $V \in \mathcal{V}$ and any Markov decomposition $f^nV = \bigcup_j V_j$ the following holds. Let $U_j = f^{-n}V_j$, $c_j = \int_{U_j}\rho_V(x)\,dx$. Then
$$\sum_j c_j\ln\|(df^n|E_c)\|_{U_j} \le -\alpha_1.$$

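Changing if necessary $f \to f^n$ we can assume that this is true for $n = 1$. Under this assumption we have

Lemma 4.5. If $s$ is small enough there is a constant $\theta_2(\gamma) < 1$ such that under the conditions of the previous corollary
$$\sum_j c_j\|df|E_c\|^s_{U_j} \le \theta_2.$$

Proof. Regard the LHS as a function $r(s)$. Then $r(0) = 1$, $\frac{dr}{ds}(0) \le -\alpha_1$, $\left|\frac{d^2r}{ds^2}(0)\right| \le \mathrm{Const}$. Since the same bound on the second derivative holds for all small $s$, Taylor's formula gives $r(s) \le 1 - \alpha_1 s + \mathrm{Const}\,s^2/2 < 1$ for small $s > 0$.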
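The Taylor argument behind Lemma 4.5 can be checked numerically on toy data. In the sketch below the weights and norms are hypothetical numbers chosen only so that the weights sum to 1 and the weighted mean of the logarithms is negative, as in Corollary 4.4; they are not taken from any actual diffeomorphism.

```python
import math


def r(s, weights, norms):
    """r(s) = sum_j c_j * a_j**s for weights c_j and norms a_j > 0."""
    return sum(c * a ** s for c, a in zip(weights, norms))


# Hypothetical data: weights c_j summing to 1 and norms a_j whose
# weighted mean logarithm is negative (the analogue of -alpha_1).
weights = [0.5, 0.3, 0.2]
norms = [0.5, 0.8, 1.5]
mean_log = sum(c * math.log(a) for c, a in zip(weights, norms))

# r(0) = 1 and r'(0) = mean_log < 0, so r(s) < 1 for small s > 0.
value_at_small_s = r(0.1, weights, norms)
```

Note that one of the norms exceeds 1: individual pieces may expand, and only the weighted average of the logarithms has to be negative.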


Repeating the argument of Lemma 4.2 we obtain

Corollary 4.6. For any $n > 0$ there is a Markov decomposition $f^nV = \bigcup_j V_j$ such that if $U_j = f^{-n}V_j$, $c_j = \int_{U_j}\rho_V(x)\,dx$ then
$$\sum_j c_j\|df^n|E_c\|^s_{U_j} \le \theta_2^n. \tag{8}$$

Proof. By induction. Suppose that (8) is valid for all $n \le n_0 - 1$. Let $fV = \bigcup_j V_j$ be a Markov decomposition. By inductive assumption $\forall j$ there is a Markov decomposition $f^{n_0-1}V_j = \bigcup_k V_{jk}$ satisfying (8). Let $U_j = f^{-1}V_j$, $U_{jk} = f^{-n_0}V_{jk}$, $b_j = \int_{U_j}\rho_V(x)\,dx$, $c_{jk} = \int_{U_{jk}}\rho_{U_j}(x)\,dx$. Then $f^{n_0}V = \bigcup_{j,k}V_{jk}$ is Markov and
$$\sum_{j,k}\int_{f^{-n_0}V_{jk}}\rho_V(x)\|(df^{n_0}|E_c)\|^s_{U_{jk}}\,dx = \sum_{j,k}b_jc_{jk}\|(df^{n_0}|E_c)\|^s_{U_{jk}}$$
$$\le \sum_j b_j\|(df|E_c)\|^s_{U_j}\sum_k c_{jk}\left(\|df^{n_0-1}|E_c\|_{fU_{jk}}\right)^s$$
$$\le \sum_j b_j\|(df|E_c)\|^s_{U_j}\,\theta_2^{n_0-1} \le \theta_2^{n_0}.$$

This Corollary proves Theorem 4.1 since for any Markov decomposition $f^nV = \bigcup_j V_j$ the LHS of (8) majorates the LHS of (7).

5. Transfer Operator

Now we recall the general method of the construction of SRB-measures for partially hyperbolic systems ([26]). SRB measures are obtained as forward iteration limits of suitable measures. Here we describe the set of the initial measures. Fix some $R$. Let $E_1(R)$ be the set of measures of the form
$$l(A) = \int_V A(z)e^{G(z)}\rho_V(z)\,dz,$$
where $V \in \mathcal{V}$, $l(1) = 1$ and $|G(z_1) - G(z_2)| \le R\,d^\gamma(z_1, z_2)$. (Those three conditions also guarantee that $G$ is uniformly bounded.) Let $E_2(R)$ be the convex hull of $E_1(R)$ and $E(R)$ be the closure of $E_2(R)$. The family $\{E(R)\}$ is continuous from above in the sense that $E(R_0) = \bigcap_{R>R_0}E(R)$. (This follows from the fact that $E_1(R_0) = \bigcap_{R>R_0}E_1(R)$.) Let $T(l)(A) = l(A\circ f)$.

Proposition 5.1. $T : E(R) \to E(Re^{-\lambda_5\gamma})$.

Proof. Let $l \in E_1(R)$. Then
$$(Tl)(A) = \int_V e^{G(z)}A(fz)\rho_V(z)\,dz = \int_{fV}e^{(G\circ f^{-1})(y)}A(y)\rho_{fV}(y)\,dy.$$

Let $fV = \bigcup_j V_j$ be a Markov decomposition; then $Tl = \sum_j c_jl_j$, where
$$l_j(A) = \int_{V_j}e^{(G\circ f^{-1})(y)}A(y)\rho_{V_j}(y)\,dy.$$
Also $|(G\circ f^{-1})(y_1) - (G\circ f^{-1})(y_2)| \le Re^{-\lambda_5\gamma}d^\gamma(y_1, y_2)$. Thus $T : E_1(R) \to E_2(Re^{-\lambda_5\gamma})$.

Since $E(R)$ is a convex compact set, Proposition 5.1 implies that there is an $f$-invariant measure in $E(R)$ which moreover belongs to $\bigcap_{R>0}E(R) = E(0)$ (this is also proven in [26]).

Proposition 5.2. Any $f$-invariant measure $\nu \in E(0)$ has two negative Lyapunov exponents.

Proof. By (6) for any $l \in E(0)$ we have $l(\ln(df|E_c)) \le -\alpha_0$.

Proposition 5.2 and Lemma 13 of [26] guarantee that ν satisfies the conditions of Theorem 3 of [27] and so it is a SRB measure. (Another proof of this fact is given in Sect. 10.)

6. Coupling Argument

We now pass to the uniqueness of $\nu$. It is established via the coupling argument of [39]. We want to show that for large $n$ for any $l_1, l_2 \in E(R)$, $T^n(l_1)$ is close to $T^n(l_2)$. First we consider the case when $l_1$ and $l_2$ are in $E_1(0)$, say $l_j(A) = \int_{V_j}A(x)\rho_{V_j}(x)\,dx$. The idea is to divide $f^nV_j$ into small pieces and pair the pieces of $f^nV_1$ to those of $f^nV_2$ so that the members of each pair are very close to each other. However since $f^n$ gives different weights to different pieces of $f^nV_j$ it is more convenient to regard unstable curves as 1-chains so that the heavier ones can be split into several pieces, each one being paired to a different partner. Let us now give a formal statement. Denote $Y_j = V_j\times[0, 1]$. Equip $Y_j$ with the measure $dm_j(x, t) = \rho_{V_j}(x)\,dx\,dt$. The heart of the coupling method is the following result, the proof of which occupies the next three sections.

Lemma 6.1. There is a measure preserving map $\tau : Y_1 \to Y_2$, a function $R : Y_1 \to \mathbb{N}$ and constants $C_1, C_2 > 0$, $\rho_1 < 1$, $\rho_2 < 1$ such that
(A) If $(x_2, t_2) = \tau(x_1, t_1)$ then for $n \ge R(x_1, t_1)$
$$d(f^nx_1, f^nx_2) \le C_1\rho_1^{n-R}; \tag{9}$$
(B) $m_1(R > N) \le C_2\rho_2^N$.

Let $\|l\|_\gamma$ denote the norm of $l$ as an element of $(C^\gamma(X))^*$.

Corollary 6.2. $\exists C_3 > 0$, $\rho_3 < 1$ such that $\forall n > 0$ $\forall l_1, l_2 \in E(0)$, $\|T^n(l_1 - l_2)\|_\gamma \le C_3\rho_3^n$.


Proof. It suffices to verify this for $l_j \in E_1(0)$. We have
$$(T^nl_j)(A) = \int_{Y_j}A(f^nx_j)\,dm_j(x_j, t_j).$$
Let $(x_2, t_2) = \tau(x_1, t_1)$. Then
$$|T^n(l_1 - l_2)(A)| \le \int_{Y_1}|A(f^nx_1) - A(f^nx_2)|\,dm_1(x_1, t_1).$$
Let $Z(n) = \{y : R(y) < \frac{n}{2}\}$, then
$$|T^n(l_1 - l_2)(A)| \le \int_{Z(n)}|A(f^nx_1) - A(f^nx_2)|\,dm_1(x_1, t_1) + 2\|A\|_0\,m_1(Y_1\setminus Z(n))$$
$$\le \|A\|_\gamma\left((C_1\rho_1^{n/2})^\gamma + 2C_2\rho_2^{n/2}\right).$$
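The final estimate in this proof combines a Hölder term for the coupled points with a tail term for the uncoupled ones. The sketch below simply evaluates that right-hand side and exhibits one admissible rate; the function names and the numerical constants used with them are illustrative assumptions, not values from the paper.

```python
def coupling_bound(n, gamma, C1, rho1, C2, rho2, holder_norm=1.0):
    """Right-hand side of the estimate in the proof of Corollary 6.2."""
    close_part = (C1 * rho1 ** (n / 2)) ** gamma   # coupled points, R < n/2
    tail_part = 2 * C2 * rho2 ** (n / 2)           # uncoupled points, R >= n/2
    return holder_norm * (close_part + tail_part)


def rate(gamma, rho1, rho2):
    """A rate rho3 < 1 dominating both terms of the bound."""
    return max(rho1 ** (gamma / 2), rho2 ** 0.5)
```

With these definitions the bound is at most `(C1**gamma + 2*C2) * rate(gamma, rho1, rho2)**n`, which is one way to read off constants playing the role of $C_3$ and $\rho_3$.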

Let $\nu$ be some $f$-invariant measure in $E(0)$. Substituting $l_2 = \nu$ in Corollary 6.2 we obtain

Corollary 6.3. $\nu$ is the only $f$-invariant measure in $E(0)$. Moreover $\forall l \in E(0)$ $\forall A \in C^\gamma(X)$, $\forall n > 0$,
$$\left|\int A(f^nx)\,dl(x) - \nu(A)\right| \le C_3\rho_3^n\|A\|_\gamma.$$

7. Coupling Algorithm

Here we define $\tau$ and $R$ described in Lemma 6.1. Let $\mathcal{Y}$ be the set of rectangles $Y = V\times I$, $V \in \mathcal{V}$, $I \subset [0, 1]$. Let $D > 2$ be a constant defined below (see (13)). Let $\tilde{\mathcal{Y}}$ be the set of rectangles $Y = V\times I$, where the length of $V$ is between $\frac{1}{D}$ and $D$ and $I \subset [0, 1]$. We write $f(x, t) = (fx, t)$. For $Y_1 \in \mathcal{Y}$, $Y_2 \in \tilde{\mathcal{Y}}$ such that $\mathrm{mes}(Y_1) = \mathrm{mes}(Y_2)$, we give an algorithm defining $\tau$ and $R$. This algorithm will depend on three positive parameters $K$, $\lambda$ and $\varepsilon$. We require $\lambda$ to be so close to 0 that $\lambda < -\lambda_2$ (recall (1)) and $e^{-\lambda s} > \theta_1$, where $s$ and $\theta_1$ are the constants from Theorem 4.1. By this theorem and Proposition 4.3 if $K$ is large enough then
$$q_1 = \max_{V\in\mathcal{V}}\mathrm{mes}(U(V)) < 1, \tag{10}$$
where $U(V) = \{x \in V : \exists n > 0,\ y \in V : d(f^nx, f^ny) \le 2 \text{ and } (df^n|E_c)(y) \ge Ke^{-\lambda n}\}$. Take $K$ so large that (10) is satisfied. Write $E_{cs} = E_s\oplus E_c$. By partial hyperbolicity there is a constant $K'$ such that $\forall j > 0$ $\forall x$
$$\|(df^j|E_{cs})(x)\| \le K'|(df^j|E_c)(x)|. \tag{11}$$
Set $\tilde K = \max(KK', 1)$. Since $(df|E_c)(x)$ is Hölder continuous there exists $\delta > 0$ such that if $d(x, y) < \delta$, then $|(df|E_c)(y)| \le e^{\lambda/2}|(df|E_c)(x)|$. Let
$$\varepsilon \le \frac{\delta}{2\tilde K}. \tag{12}$$


Now $D$ is defined by the requirement that if $V_1, V_2$ are unstable curves, $V_1 \in \mathcal{V}$, $V_2$ is the image of $V_1$ under the center-stable projection $p$ and $\forall x \in V_1$ $d(x, px) < \delta$ then
$$\frac{1}{D} \le \mathrm{length}(V_2) < D. \tag{13}$$

Our algorithm will work recursively. During the first run we define the map between subsets $P_j^\infty$ of $Y_j$. For points where $\tau$ is not defined we define a stopping time $s(y)$ such that the set $P_j^n = \{y \in Y_j : s(y) = n\}$ will be of the form $f^{-n}(\bigcup_k Y_{jnk})$, $Y_{1nk} \in \mathcal{Y}$, $Y_{2nk} \in \tilde{\mathcal{Y}}$ and $\mathrm{mes}(P_1^n) = \mathrm{mes}(P_2^n)$. Then we can use our algorithm again to couple $P_1^n$ to $P_2^n$. More precisely, since $\mathrm{mes}(P_1^n) = \mathrm{mes}(P_2^n)$ we can chop each $Y_{jnk}$ into several pieces so that the resulting collections $\{\bar Y_{jnl}\}$ satisfy $\bigcup_k Y_{jnk} = \bigcup_l\bar Y_{jnl}$ and $\mathrm{mes}(\bar Y_{1nl}) = \mathrm{mes}(\bar Y_{2nl})$. Let $f^{-n}\bar Y_{jnl} = U_{jnl}\times I_{jnl}$. Denote $c_{jnl} = \int_{U_{jnl}}\rho_{V_j}(x)\,dx$. Let $\Phi_{jnl}$ be the map $\Phi_{jnl}(x, t) = (f^nx, r_{jnl}(t))$ where $r_{jnl}$ is the affine isomorphism between $I_{jnl}$ and $[0, c_{jnl}|I_{jnl}|]$. (This rescaling is necessary to make the $\Phi$'s measure preserving.) We now call our algorithm recursively to produce maps $\tau_{nl} : \Phi(f^{-n}\bar Y_{1nl}) \to \Phi(f^{-n}\bar Y_{2nl})$ and $R_{nl} : \Phi(f^{-n}\bar Y_{1nl}) \to \mathbb{N}$ satisfying the conditions of Lemma 6.1. We then set
$$\tau(x, t) = \begin{cases}\tau_{\text{first run}}(x, t) & \text{if } (x, t) \in P_1^\infty,\\ \Phi_{2nl}^{-1}\circ\tau_{nl}\circ\Phi_{1nl}(x, t) & \text{if } (x, t) \in f^{-n}\bar Y_{1nl},\end{cases}$$
$$R(x, t) = \begin{cases}R_{\text{first run}}(x, t) & \text{if } (x, t) \in P_1^\infty,\\ n + R_{nl}(\Phi_{1nl}(x, t)) & \text{if } (x, t) \in f^{-n}\bar Y_{1nl}.\end{cases}$$

Let us now describe the first run of our algorithm. By rescaling we may suppose that $Y_j = V_j\times[0, 1]$. By u-convergence there is $n_0$ and curves $\bar V_j \subset f^{n_0}V_j$ at distance at least 1 from $\partial(f^{n_0}V_j)$ such that $\bar V_1 \in \mathcal{V}$, $\bar V_2 = p\bar V_1$ and $\forall x \in \bar V_1$ $d(x, px) \le \varepsilon$. (Here $p$ means center-stable holonomy.) Let $\hat c_j = \int_{f^{-n_0}\bar V_j}\rho_{V_j}(x)\,dx$. Let $(\bar t_1, \bar t_2) = \left(\frac{\hat c_2}{\hat c_1}, 1\right)$ if $\hat c_2 \le \hat c_1$ and $(\bar t_1, \bar t_2) = \left(1, \frac{\hat c_1}{\hat c_2}\right)$ if $\hat c_1 \le \hat c_2$, so that $\hat c_1\bar t_1 = \hat c_2\bar t_2$. Define $\bar Y_j = \bar V_j\times[0, \bar t_j]$. Let $s(y) = n_0$ for points of $Y_j\setminus f^{-n_0}\bar Y_j$.

We now proceed to define $P_j^n$ inductively for $n > n_0$. Let $Q_j^{n-1} = Y_j\setminus\bigcup_{m=n_0}^{n-1}P_j^m$. We assume by induction that $f^{n-1}Q_j^{n-1} = \bigcup_k Z_{jk(n-1)}$, where $Z_{jk(n-1)} = V_{jk(n-1)}\times[0, t_{jk(n-1)}]$,
$$\mathrm{mes}(f^{-(n-1)}Z_{1k(n-1)}) = \mathrm{mes}(f^{-(n-1)}Z_{2k(n-1)}), \tag{14}$$
$V_{1k(n-1)} \in \mathcal{V}$, $V_{2k(n-1)} = p(V_{1k(n-1)})$ and $d(x, px) \le r_{n-1}$, where
$$r_n = \tilde K\varepsilon e^{-\lambda n/2}. \tag{15}$$
Take a Markov decomposition $fV_{1k(n-1)} = \bigcup_l V_{1lkn}$ and let $V_{2lkn} = p(V_{1lkn})$. We note that the fact that $V_{1lkn} \in \mathcal{V}$, (12), (13) and (15) guarantee that then $V_{2lkn} \in \tilde{\mathcal{Y}}$. Let $\beta_{lkn} = \max_{x\in f^{-n}V_{1lkn}}(df^n|E_c)(x)$. If $\beta_{lkn} > Ke^{-\lambda n}$, let $s(y) = n$ on $f^{-n}V_{jlkn}\times[0, t_{jk(n-1)}]$. Otherwise let $\tilde Z_{jlkn} = V_{jlkn}\times[0, t_{jk(n-1)}]$. In general $\mathrm{mes}(f^{-n}\tilde Z_{1lkn}) \ne \mathrm{mes}(f^{-n}\tilde Z_{2lkn})$. So cut off the top of the larger rectangle so that the adjusted ones satisfy $\mathrm{mes}(f^{-n}Z_{1lkn}) = \mathrm{mes}(f^{-n}Z_{2lkn})$ and let $s(y) = n$ on $\tilde Z_{jlkn}\setminus Z_{jlkn}$. Now $P_j^\infty =$


$Y_j\setminus\bigcup_n P_j^n$ is a union of vertical intervals of the form $\{(x, [0, t(x)])\}$, where $x$ varies over some positive measure Cantor set $R_j$. For points of $P_1^\infty$ let $\tau(x, t) = \left(px, \frac{t(px)}{t(x)}t\right)$, $R(x, t) = n_0$. Four things have to be proven:
– $\tau$ is defined on a set of whole measure in $Y_1$;
– $\tau$ is measure preserving;
– $\tau$ satisfies condition (A) of Lemma 6.1;
– $\tau$ satisfies condition (B) of Lemma 6.1.

The second and the third properties of τ are verified in Sect. 8. The first and the fourth properties are verified in Sect. 9.
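The step of the algorithm that trims the heavier of two paired rectangles, so that both carry the same mass, can be sketched as follows. The function name `matched_heights` is hypothetical; `c1` and `c2` stand for the measures of the two bases, and the sketch assumes (as a reading of the construction) that heights are chosen in $(0, 1]$ with $c_1t_1 = c_2t_2$.

```python
def matched_heights(c1, c2):
    """Heights (t1, t2) in (0, 1] with c1 * t1 == c2 * t2.

    c1, c2 > 0 are the measures of the two bases; the rectangle over the
    heavier base is cut so that both rectangles carry the same mass.
    """
    if c1 <= 0 or c2 <= 0:
        raise ValueError("base measures must be positive")
    if c2 <= c1:
        return c2 / c1, 1.0
    return 1.0, c1 / c2
```

Only the rectangle over the heavier base is shortened; the part that is cut off is what the algorithm hands to a later run.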

8. Convergence of Images

Here we prove Property (A) of Lemma 6.1. Let $W_r^*(x)$ denote the ball centered at $x$ of radius $r$ inside $W^*$ with the induced Riemannian metric. Let $(x, t) \in P_1^\infty$. Let $x_0 = f^{n_0}x$. Then $\forall j \ge 0$ $(df^j|E_c)(x_0) \le Ke^{-\lambda j}$, so it suffices to show the following.

Lemma 8.1 (Cf. [2], Lemma 2.7). If $x_0 \in X$ and $n > 0$ are such that $\forall\, 0 \le j \le n$ $(df^j|E_c)(x_0) \le Ke^{-\lambda j}$, then $\forall\, 0 \le j \le n$,
$$f^jW_\varepsilon^{cs}(x_0) \subset W_{r_j}^{cs}(f^jx_0), \tag{16}$$
where $r_j$ is given by (15).

Proof. We proceed by induction. For $j = 0$ (16) is true by (12). Suppose that (16) holds for $0 \le j < j_0$. Then $\forall y \in W_\varepsilon^{cs}(x_0)$ $\forall\, 0 \le j < j_0$ $d(f^jy, f^jx_0) \le \delta$. Hence
$$|(df^{j_0}|E_c)(y)| = \prod_{j=0}^{j_0-1}|(df|E_c)(f^jy)| \le \prod_{j=0}^{j_0-1}e^{\lambda/2}|(df|E_c)(f^jx_0)| \le Ke^{-j_0\lambda/2}.$$
By (11) $\|(df^{j_0}|E_{cs})(y)\| \le \tilde Ke^{-j_0\lambda/2}$, and so $f^{j_0}W_\varepsilon^{cs}(x_0) \subset W_{r_{j_0}}^{cs}(f^{j_0}x_0)$ as claimed.

Note that the proofs of Lemma 8.1 and Theorem 4.1 do not use u-convergence. Hence we get the following result which will be used in Sect. 10.

Corollary 8.2. Assume that $f$ satisfies all the conditions of Theorem I except possibly u-convergence. There are constants $q_1, \varepsilon > 0$ such that for any pair of unstable curves $V_1, V_2$ such that $V_1 \in \mathcal{V}$, $V_2 = p(V_1)$ and $\forall x \in V_1$ $d(x, px) < \varepsilon$,
$$\mathrm{mes}(\{x \in V_1 : d(f^nx, f^npx) \to 0\}) \ge q_1.$$

We now continue with the proof of Lemma 6.1.

Corollary 8.3. $\tau$ is measure preserving.


Proof. By the recursive structure of our algorithm it suffices to show that $\tau : P_1^\infty \to P_2^\infty$ is measure preserving. Denote by $R_j$ the base of $P_j^\infty$. It follows from Lemma 8.1 by standard Pesin theory (see [24, Sect. 3] or [27, Sect. 4]) that $p : R_1 \to R_2$ is absolutely continuous. We want to compute its Jacobian $J(x)$. Let $t_n(x)$ denote the height of the rectangle containing $(x, 0)$ and $W_n(x)$ denote its base. By absolute continuity for almost all $x \in R_1$, $x$ is a density point of $R_1$ and $px$ is a density point of $R_2$. For such points
$$J(x) = \lim_{n\to\infty}\frac{\int_{pW_n(x)}1_{R_2}(y)\rho_{V_2}(y)\,dy}{\int_{W_n(x)}1_{R_1}(y)\rho_{V_1}(y)\,dy}.$$
Since $R_1, R_2$ have large densities in $W_n(x)$ and $pW_n(x)$ respectively we can drop the indicator functions. So
$$J(x) = \lim_{n\to\infty}\frac{\int_{pW_n(x)}\rho_{V_2}(y)\,dy}{\int_{W_n(x)}\rho_{V_1}(y)\,dy} = \lim_{n\to\infty}\frac{t_n(x)}{t_n(px)} = \frac{t(x)}{t(px)},$$
where the second equality follows by (14). Thus $\tau$ is measure preserving.

9. Coupling Time

Here we prove Part (B) of Lemma 6.1. We begin with some information about one run of our algorithm.

Lemma 9.1. There are constants $q, C_0 > 0$, $\rho_0 < 1$ such that for any pair $Y_1, Y_2$,
(1) $\dfrac{\mathrm{mes}(P_1^\infty)}{\mathrm{mes}(Y_1)} \ge q$;
(2) $\dfrac{\mathrm{mes}(P_1^n)}{\mathrm{mes}(Y_1)} \le C_0\rho_0^n$.

Proof. We begin with (2). $(y, t)$ can belong to $P_1^n$ for two reasons. The first is that $\exists\tilde x$ such that $d(f^ny, f^n\tilde x) \le 2$ and $(df^n|E_c)(\tilde x) > Ke^{-\lambda n}$. By Proposition 4.3 $(df^n|E_c)(y) > K^*e^{-\lambda n}$. So the measure of such points is exponentially small by Theorem 4.1 and our choice of $\lambda$. The second reason is that $f^n(y, t) \in \tilde Z_{1lkn}\setminus Z_{1lkn}$. Let
$$\kappa_n(y) = \frac{\mathrm{mes}(f^{-n}(\tilde Z_{1lkn}\setminus Z_{1lkn}))}{\mathrm{mes}(f^{-n}Z_{1lkn})} = \frac{\mathrm{mes}(\tilde Z_{1lkn}\setminus Z_{1lkn})}{\mathrm{mes}(Z_{1lkn})}.$$

Lemma 9.2. There are constants $C_4 > 0$, $\rho_4 < 1$ such that
$$\kappa_n(y) \le C_4\rho_4^n. \tag{17}$$

Proof. We may suppose that $\mathrm{mes}(\tilde Z_{1lkn}) > \mathrm{mes}(\tilde Z_{2lkn})$, since otherwise $Z_{1lkn} = \tilde Z_{1lkn}$. Then
$$\frac{\mathrm{mes}(\tilde Z_{1lkn}\setminus Z_{1lkn})}{\mathrm{mes}(Z_{1lkn})} = \frac{\mathrm{mes}(V_{1lkn})}{\mathrm{mes}(V_{2lkn})}\cdot\frac{\mathrm{mes}(V_{2k(n-1)})}{\mathrm{mes}(V_{1k(n-1)})} - 1. \tag{18}$$


Now by inductive assumption we have both on $V_{1k(n-1)}$ and on $V_{1lkn}$,
$$d(x, px) \le \tilde K\varepsilon e^{-\frac{\lambda}{2}(n-1)}. \tag{19}$$
In the proof below $C_*$ will denote various constants which depend on $f$ but not on $n$ or $Y_j$. Likewise $\rho_*$ will denote various constants which are less than 1. Let $x_0$ be the center of $V_{1k(n-1)}$ and $\tilde x_0 = px_0$. By Hölder continuity of the unstable foliation $\exists C_5 > 0$, $\rho_5 < 1$ such that
$$\left|\frac{\rho(x_0, x)}{\rho(\tilde x_0, px)} - 1\right| \le C_5\rho_5^n.$$
Divide $V_{1k(n-1)}$ into subintervals $\sigma_m$ of size $e^{-\lambda n/4}$ and let $\tilde\sigma_m = p\sigma_m$. Then
$$\int_{V_{1k(n-1)}}\rho(x_0, x)\,dx = \sum_m\rho(x_0, x_m)d_m + O(\rho_6^n),$$
where $x_m$ is any point on $\sigma_m$ and $d_m$ is the distance between the endpoints of $\sigma_m$. Likewise
$$\int_{V_{2k(n-1)}}\rho(\tilde x_0, x)\,dx = \sum_m\rho(\tilde x_0, px_m)\tilde d_m + O(\rho_6^n),$$
where $\tilde d_m$ is the distance between the endpoints of $\tilde\sigma_m$. Now by (19) and the triangle inequality $\left|\frac{\tilde d_m}{d_m} - 1\right| \le C_6e^{-\tilde\lambda n/2}$. Hence
$$\left|\frac{\mathrm{mes}(V_{1k(n-1)})}{\mathrm{mes}(V_{2k(n-1)})} - 1\right| \le C_7\rho_7^n.$$
Similarly
$$\left|\frac{\mathrm{mes}(V_{1lkn})}{\mathrm{mes}(V_{2lkn})} - 1\right| \le C_7\rho_7^n.$$
The last two inequalities together with (18) prove the lemma.

Assertion (2) of Lemma 9.1 now follows from (17) by summation over $k$ and $l$. Now let $(x, t) \in P_1^\infty$. Let $\kappa_j(x)$ be the relative measure of points cut off the top of the rectangle containing $(x, t)$ on the $j$th step. Thus $t(x) = \prod_{j=n_0}^{\infty}(1 - \kappa_j(x))$. By (17) this product converges uniformly, hence $t(x)$ is uniformly bounded from below. But the measure of the base of $P_1^\infty$ is also uniformly bounded from below (see (10)). This proves assertion (1) of Lemma 9.1.

Now represent $R(y) = \sum_{j=1}^{k(y)}s_j(y)$, where $s_j(y)$ is the stopping time of the $j$th run of our algorithm. Let $T_k$ be the set where $\tau$ is not defined after $k$ runs of our algorithm, $U_k = T_{k-1}\setminus T_k$. Denote $S_k(y) = \sum_{j=1}^{k}s_j(y)$ and consider the generating functions
$$\varphi_k(Y_1, \xi) = \frac{1}{\mathrm{mes}(Y_1)}\int_{T_k}\xi^{S_k(y)}\,dm(y), \qquad \psi_k(Y_1, \xi) = \frac{1}{\mathrm{mes}(Y_1)}\int_{U_k}\xi^{S_k(y)}\,dm(y).$$
Lemma 9.1 says that the radius of convergence of $\varphi_1$ is strictly greater than 1 and $\varphi_1(1) \le 1 - q$. We need the following generalization.

Lemma 9.3. $\exists\delta_0, \bar q, C > 0$ such that if $0 \le \xi \le 1 + \delta_0$ then
(1) $\varphi_{k+1}(\xi) \le (1 - \bar q)\varphi_k(\xi)$;
(2) $\psi_{k+1}(\xi) \le C\varphi_k(\xi)$.


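Proof. (1) Take some $y \in T_k$. Assume that after $k$ runs of our algorithm $f^{S_k(y)}y \in Y_1^k(y)$ and the $(k+1)$st run couples $Y_1^k$ to some $Y_2^k$. Let us compare the contributions of $Y_1^k$ to $\varphi_k$ and $\varphi_{k+1}$. (That is, if $Y_1^k = \Phi^{(k)}\bar Y_1^k$, where $\bar Y_1^k \subset Y_1$, $\Phi^{(k)}(x, t) = (f^{S_k(y)}x, at + b)$, then we compare
$$I_k = \frac{1}{\mathrm{mes}(Y_1)}\int_{\bar Y_1^k}\xi^{S_k(x)}\,dm_1 = \xi^{S_k(y)}\frac{\mathrm{mes}(\bar Y_1^k)}{\mathrm{mes}(Y_1)} \quad\text{and}\quad I_{k+1} = \frac{1}{\mathrm{mes}(Y_1)}\int_{\bar Y_1^k\cap T_{k+1}}\xi^{S_{k+1}(x)}\,dm_1.)$$
Their ratio equals $r(\xi) = \frac{I_{k+1}}{I_k} = \varphi_1(Y_1^k, \xi)$. By Lemma 9.1 $r(\xi)$ is uniformly convergent and $r(1) \le 1 - q$. So there are $\delta_0, \bar q$ such that for $\xi < 1 + \delta_0$, $r(\xi) \le 1 - \bar q$. This proves (1).

(2)
$$\psi_{k+1}(\xi) = \xi^{n_0}\int_{U_{k+1}}\xi^{S_k(y)}\,dm(y) \le \xi^{n_0}\int_{T_k}\xi^{S_k(y)}\,dm(y) \le (1 + \delta_0)^{n_0}\varphi_k(\xi).$$

Now for $\xi \le 1 + \delta_0$,
$$\int_{Y_1}\xi^{R(y)}\,dm(y) = \sum_{k=1}^{\infty}\psi_k(\xi) \le C\sum_{k=1}^{\infty}(1 - \bar q)^{k-1}\varphi_1(\xi) \le \frac{C}{\bar q}\varphi_1(\xi) < \infty.$$
This shows that $m(R > n) \le \frac{\mathrm{Const}}{(1+\delta_0)^n}$ and in particular $m(R = \infty) = 0$. These facts complete the proof of Lemma 6.1.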
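The last step, passing from a finite exponential moment of $R$ to an exponential tail, is the usual Markov (Chebyshev) inequality. The sketch below checks it on a finite sample; the sample values and function names are hypothetical and serve only to illustrate the inequality $m(R > n) \le \xi^{-n}\int\xi^{R}\,dm$ for $\xi > 1$.

```python
def tail_bound(n, xi, gen_value):
    """Markov-type bound m(R > n) <= E[xi**R] / xi**n, valid for xi > 1."""
    if xi <= 1:
        raise ValueError("xi must exceed 1")
    return gen_value / xi ** n


def empirical_check(samples, xi, n):
    """Return the empirical tail of R and the bound computed from the sample."""
    total = len(samples)
    gen_value = sum(xi ** r for r in samples) / total   # empirical E[xi**R]
    tail = sum(1 for r in samples if r > n) / total     # empirical m(R > n)
    return tail, tail_bound(n, xi, gen_value)
```

In the notation of the proof, `xi` plays the role of $1 + \delta_0$ and `gen_value` that of the finite integral $\int_{Y_1}\xi^{R(y)}\,dm(y)$.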

10. Proof of the Main Results

Proof of Theorem I. Consider $l \in E(R)$. By Proposition 5.1 there exists $\tilde l \in E(0)$ such that for all $N > 0$,
$$\|T^{N+\frac{n}{2}}l - T^N\tilde l\| \le \mathrm{Const}\,e^{-\frac{\lambda_5n}{2}}.$$
Hence
$$\|T^nl - \nu\| \le \mathrm{Const}\,e^{-\frac{\lambda_5n}{2}} + \|T^{\frac{n}{2}}\tilde l - \nu\| \le C_8\rho_8^n.$$
It follows from [12] that $\nu$ is a global attractor for $f$. Let $B \in C^\gamma(X)$; then, for large $R$, $B\cdot\mathrm{Lebesgue}$ and $B\cdot\nu$ are in $E(R)$, provided that $\gamma$ is sufficiently close to 1 (see [26, 12]). This together with Proposition 5.2 proves Theorem I for $\gamma$ close to 1. The result for general $\gamma$ is proved by approximation of $A$ and $B$ by smooth functions.

Proof of Theorem II. By the remark at the end of Sect. 3 there exists a $C^2$ neighborhood $O_1(f)$ such that any $g \in O_1(f)$ is mostly contracting. Also, given $\varepsilon$ there is another $C^2$ neighborhood $O_2(f)$ such that $\forall g \in O_2(f)$ $\exists n_0$ such that $\forall V_1, V_2 \in \mathcal{V}$ $\exists U_j \subset V_j$ such that $g^{n_0}U_1 \in \mathcal{V}$, $U_2 = pU_1$ and $\forall x \in U_1$ $d(g^{n_0}x, g^{n_0}px) < \varepsilon$. By Corollary 8.2 $g$ is u-convergent. Constants $C, \zeta$ from Part (b) of Theorem I can be chosen uniformly for $g$ near $f$ since they depend only on Hölder data of invariant foliations and the constant $\alpha_0$ in (6). Thus
$$\nu_g(A) - \nu_f(A) = \int[A(g^nx) - A(f^nx)]\,dx + O(\|A\|_\gamma\zeta^n) = \|A\|_\gamma\left(O(\zeta^n) + O((K^nd(f, g))^\gamma)\right) = \|A\|_\gamma\,O\!\left((K^nd(f, g))^\gamma + \zeta^n\right).$$
Taking $n$ so that $\left(\frac{\zeta}{K^\gamma}\right)^n = d(f, g)^{\frac{\gamma}{2}}$ we obtain the result needed.
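The balancing of the two error terms at the end of the proof of Theorem II can be made explicit. The sketch below reads the choice of $n$ as $(\zeta/K^\gamma)^n = d(f, g)^{\gamma/2}$, which is an interpretation of the displayed condition, and treats $n$ as a real number; the function names and the numerical values used to exercise them are assumptions for illustration.

```python
import math


def balanced_n(dist, K, zeta, gamma):
    """Real n solving (zeta / K**gamma)**n == dist**(gamma / 2)."""
    if not (0 < dist < 1 and 0 < zeta < 1 and K > 1):
        raise ValueError("expected 0 < dist < 1 and 0 < zeta < 1 < K")
    return (gamma / 2) * math.log(dist) / math.log(zeta / K ** gamma)


def holder_exponent(K, zeta, gamma):
    """Exponent beta such that zeta**n == dist**beta at the balanced n."""
    return (gamma / 2) * math.log(zeta) / math.log(zeta / K ** gamma)
```

At the balanced $n$ one has $(K^nd)^\gamma = \zeta^nd^{\gamma/2} \le \zeta^n$, so the whole error is $O(d^\beta)$ with $\beta$ given by `holder_exponent`, which is one way to see the Hölder dependence claimed in Theorem II.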


11. Examples of u-Convergent Diffeomorphisms

Here we give several conditions sufficient for u-convergence.

(a) Suppose that $f$ has the 3-leg accessibility property in the sense that there exists $R$ such that $\forall V_1, V_2 \in \mathcal{V}$ $\exists x_1 \in V_1, x_2 \in V_2$ such that $x_2 \in W^s(x_1)$ and $d_s(x_1, x_2) \le R$, where $d_s$ means the distance along the stable leaf. Then $d_s(f^nx_1, f^nx_2) \le e^{\lambda_2n}R$, so $f$ is u-convergent.

(b) Assume that $W^u$ is minimal. Thus given $\varepsilon$ there exists $R$ such that for any two unstable curves $V_1, V_2$ of length at least $R$ there are $x_j \in V_j$ such that $d(x_1, x_2) \le \varepsilon$. So, $f$ is u-convergent.

(c) To formulate this condition suppose that the fibers of $W^c$ are circles. Suppose that $f$ satisfies the following condition: for any two unstable curves $V_1 \in \mathcal{V}$ and $V_2 \subset W^{cu}(V_1)$ with $V_2 = p_c(V_1)$ (where $p_c$ denotes the center holonomy) the following inequality holds:
$$\int_{V_1}\rho_{V_1}(x)\ln\frac{d(fx, fpx)}{d(x, px)}\,dx \le -\alpha_0 < 0. \tag{20}$$
(Note that letting $V_2$ tend to $V_1$ here we obtain (6), so (20) is a stronger assumption.)

Proposition 11.1. There are sets $U_j \subset V_j$ such that $\mathrm{mes}(V_j\setminus U_j) = 0$, $U_2 = p(U_1)$ and for all $x \in U_1$ $d(f^nx, f^npx) \to 0$.

Proof. Let $U_1 = \{x \in V_1 : d(f^nx, f^npx) \to 0\}$. Repeating the arguments of Corollary 8.2 we get that there is a constant $\tilde q$ (depending only on $f$ but not on $V_1, V_2$) such that
$$\mathrm{mes}(U_1) \ge \tilde q\,\mathrm{mes}(V_1). \tag{21}$$
Now considering Markov decompositions $f^nV_1 = \bigcup_l V_{ln}$ and applying (21) to each $V_{ln}$ we find that $V_1\setminus U_1$ has no density points and so $\mathrm{mes}(V_1\setminus U_1) = 0$. Interchanging $V_1$ and $V_2$ we get $\mathrm{mes}(V_2\setminus U_2) = 0$.

Now since $W^u$ and $W^{cs}$ are transverse foliations there exists $R > 0$ such that $\forall x_1, x_2 \in X$ there is $x_3 \in X$ such that $x_3 \in W^{cs}(x_1)\cap W^u(x_2)$ and $d_{cs}(x_1, x_3) < R$, $d_u(x_2, x_3) < R$. Iterating this construction forward we find that $\forall\delta$ $\exists\tilde R$ such that $\forall x_1, x_2$ $\exists y_1, y_2$ such that $y_1 \in W^s_\delta(x_1)$ and $y_2 \in W^c(y_1)\cap W^u_{\tilde R}(x_2)$. Now if $\delta$ is small enough, then by Corollary 8.2 there exist sets $U_1 \subset W^u_1(x_1)$, $\tilde U_1 \subset W^u_1(y_1)$ such that $\mathrm{mes}(U_1) > 0$, $\mathrm{mes}(\tilde U_1) > 0$, $\tilde U_1 = p_{cs}(U_1)$ and $\forall x \in U_1$ $d(f^nx, f^np_{cs}x) \to 0$. But $y_2 \in W^u_{\tilde R}(x_2)$. By compactness there is $R^* > 0$ such that $U_2 \subset W^u_{R^*}(x_2)$. Thus $\exists z_1 \in W^u_1(x_1)$, $z_2 \in W^u_{R^*}(x_2)$ such that $d(f^nz_1, f^nz_2) \to 0$. Now take $V_1, V_2 \in \mathcal{V}$. There exists $n_0$ such that the length of $f^{n_0}V_j$ is greater than $2R^*$ and so $\exists z_j \in f^{n_0}V_j$ such that $d(f^nz_1, f^nz_2) \to 0$. Therefore $f$ is u-convergent. Hence (20) implies both the mostly contracting property and u-convergence. Similarly we can consider the case when the leaves of $W^c$ are non-compact and require that (20) is satisfied for $d(V_1, V_2) \le R$, where $R = R(f)$ is a large constant.

Remark. If fibers of $W^c$ are circles and (20) is satisfied then one can show using Proposition 11.1 that the $W^c$-holonomy restricted to $W^{cu}$ is absolutely continuous. Let us examine the underlying geometric picture more closely since it will allow the reader to appreciate better the idea behind the proof of Lemma 6.1. Indeed choose an orientation of $W^c$ and let $V_1, V_2$ be two unstable curves in the same center-unstable leaf such that $V_2 = p(V_1)$, $V_2$ is $\varepsilon$-close to $V_1$ and is on the right of $V_1$. Then there are subsets


$U_j \subset V_j$ such that $U_2 = p(U_1)$, $\mathrm{mes}(V_j\setminus U_j) = 0$ and $\forall x \in U_1$ $d(f^nx, f^npx) \to 0$. The geometry of $U_1$ is however quite complicated. In fact there is a Cantor set $U_1^{(1)} \subset U_1$ of measure $1 - O(\varepsilon^C)$ such that for $x \in U_1^{(1)}$, $f^npx$ is always on the right of $f^nx$. Each gap of $U_1^{(1)}$ contains a positive measure Cantor set $U_1^{(2)}$ such that for $x \in U_1^{(2)}$, $f^n(px)$ is on the left of $f^nx$. In turn each gap of $U_1^{(2)}$ contains a positive measure Cantor set $U_1^{(3)}$ such that for $x \in U_1^{(3)}$, $f^n(px)$ is on the right of $f^nx$, and so on. If fibers of $W^c$ are lines then it is conceivable that $W^c$ is not absolutely continuous inside $W^{cu}$. Instead Lemma 6.1 allows us to construct a map $\pi : V_1 \to V_2$ which is absolutely continuous and such that $\pi x \in W^{cs}(x)$. However even if $V_2$ is very close to $V_1$, $\pi x$ will sometimes be different from the naive projection along the $W^c$-fibers. The fact that the fibers of $W^{cs}$ are dense and so for each $x$ there is a countable number of candidates for $\pi x$ is really essential to this proof.

12. Examples of Mostly Contracting Diffeomorphisms

(a) Let $T : \mathbb{T}^2 \to \mathbb{T}^2$ be a linear Anosov diffeomorphism. Take $A : \mathbb{T}^2 \to SL_2(\mathbb{R})$. Assume that the image $A(\mathbb{T}^2)$ generates $SL_2(\mathbb{R})$. Let
$$S_{n_1,n_2}(x) = A(T^{n_1n_2}x)\cdots A(T^{n_2}x)A(x), \qquad M_{n_1,n_2}(x)v = \frac{S_{n_1,n_2}(x)v}{\|S_{n_1,n_2}(x)v\|}.$$
Define $f_{n_1,n_2} : \mathbb{T}^2\times\mathbb{P}^1 \to \mathbb{T}^2\times\mathbb{P}^1$ by $f_{n_1,n_2}(x, v) = (T^{n_1n_2}x, M_{n_1,n_2}(x)v)$. We claim that $\exists\bar n_2$ such that $\forall n_2 > \bar n_2$ $\exists\bar n_1$ such that $\forall n_1 > \bar n_1$ $f_{n_1,n_2}$ satisfies (20). Let $d(v_1, v_2) = \mathrm{Area}(v_1, v_2)$. We have
$$d(f_{n_1,n_2}(x, v_1), f_{n_1,n_2}(x, v_2)) = \mathrm{Area}(M_{n_1,n_2}(x)v_1, M_{n_1,n_2}(x)v_2) = \frac{\mathrm{Area}(S_{n_1,n_2}(x)v_1, S_{n_1,n_2}(x)v_2)}{\|S_{n_1,n_2}(x)v_1\|\,\|S_{n_1,n_2}(x)v_2\|} = \frac{\mathrm{Area}(v_1, v_2)}{\|S_{n_1,n_2}(x)v_1\|\,\|S_{n_1,n_2}(x)v_2\|},$$
since $S_{n_1,n_2}(x) \in SL_2(\mathbb{R})$. The reader will have no difficulty showing that for fixed $n_2$ and large $n_1$, $f_{n_1,n_2}$ is partially hyperbolic, its unstable manifolds are graphs of functions $K_n : W^u(x_0) \to \mathbb{P}^1$ and $\|dK_n\| \to 0$ as $n \to \infty$.
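The area identity used in example (a) relies only on $\det S = 1$. A small numerical sketch, with a hypothetical matrix and hypothetical unit vectors, confirms that the area spanned by the normalized images equals $\mathrm{Area}(v_1, v_2)/(\|Sv_1\|\,\|Sv_2\|)$; the helper names below are introduced here for illustration only.

```python
import math


def area(v, w):
    """Absolute area of the parallelogram spanned by v and w in R^2."""
    return abs(v[0] * w[1] - v[1] * w[0])


def apply(S, v):
    """Apply the 2x2 matrix S (a tuple of rows) to the vector v."""
    return (S[0][0] * v[0] + S[0][1] * v[1],
            S[1][0] * v[0] + S[1][1] * v[1])


def normalize(v):
    """Rescale v to unit Euclidean length."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)


def projective_distance_after(S, v1, v2):
    """Area(M v1, M v2), where M v = S v / ||S v||."""
    return area(normalize(apply(S, v1)), normalize(apply(S, v2)))
```

Since both norms in the denominator grow under a hyperbolic product of matrices, the identity shows directly why the fiber distance contracts on average.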
Now the Riemann structure of $\mathbb{T}^2$ is non-degenerate on the leaves of $W^u$ and with respect to this structure $\rho((x_1, K_n(x_1)), (x_2, K_n(x_2))) = \rho(x_1, x_2)$. Let $m$ be the distribution of $A(x)$ with respect to Lebesgue measure and $m_n$ be the $n$th convolution power of $m$. Take now $x_0 \in \mathbb{T}^2$, $v_1, v_2 \in \mathbb{P}^1$ and let $K_n^j(x)$ be the function defining the unstable manifold through $(x_0, v_j)$. Then
$$\int_V\rho(z)\ln\frac{d(f_{n_1,n_2}(x, K_n^1(x)), f_{n_1,n_2}(x, K_n^2(x)))}{d(v_1, v_2)}\,dz \to -\frac{1}{|V|}\int_V E_{m_{n_2-1}}\left[\ln\|AM(x)K_n^1(x)\| + \ln\|AM(x)K_n^2(x)\|\right]dx \tag{22}$$


as $n_1 \to \infty$. But by [5], Theorem A3.6 for all $v$, $\frac{1}{n}E_{m_n}\ln\|Av\| \to \lambda_+ > 0$, where $\lambda_+$ is the positive exponent of $m$. So for large $n_2$ and $n_1 \gg n_2$ the expression (22) is negative.

(b) This example is similar to (a). Let $f_n : \mathbb{T}^2\times S^1 \to \mathbb{T}^2\times S^1$ be given by $f_n(x, y) = (T^nx, g_n(x)y)$, where the distribution of $g_n$ converges to that of a time 1 map of the stochastic differential equation
$$dy = L(y)\,dw(t), \tag{23}$$
$dw(t)$ being the white noise. Then for large $n$, $f_n$ is mostly contracting. This follows from the fact that $\xi = \frac{\partial y(t)}{\partial y(0)}$ satisfies
$$d\ln\xi = \frac{dL}{dy}\,dw - \frac{1}{2}\left(\frac{dL}{dy}\right)^2dt$$
and so
$$E_P\ln\xi(1) = -\frac{1}{2}\int_0^1E_P\left(\frac{dL}{dy}\right)^2(t)\,dt < 0,$$

where $P$ denotes the stationary distribution of the process defined by (23). (Of course since the distribution of $\xi$ is not compactly supported some restrictions should be imposed on the rate of convergence of $g_n$ to the distribution of the solution of (23). We leave it as an exercise to the reader to write down the explicit estimates.)

Remark. This example shows that many phenomena occurring in stochastic differential equations can also take place in deterministic systems. More research is needed in this direction.

(c) The above examples are essentially dissipative, since both $f$ and $f^{-1}$ are mostly contracting. Here we describe a conservative example which is a slight modification of the one given in [32]. Let $T : \mathbb{T}^2 \to \mathbb{T}^2$ be as before. Consider $f_0 : \mathbb{T}^3 \to \mathbb{T}^3$ given by $f_0(x, \theta) = (T(x), \theta + \tau(x))$, where $\tau$ is such that $f_0$ is Bernoullian (this condition discards an infinite codimension submanifold in the space of skewing functions). Let $h : \mathbb{T}^3 \to \mathbb{T}^3$ be a volume preserving diffeomorphism close to identity. ($df$ maps small cones $K_u$ around $E_u$ and $K_{cu}$ around $E_u\wedge E_c$ into themselves. Likewise $df^{-1}$ preserves cones $K_s$ and $K_{cs}$. We want $dh$ to map $E_u$ into $K_u$ and so on.) Let $f_n = f_0^n\circ h\circ f_0^n$. Choose vector fields $e_u \in E_u$, $e_c \in E_c$ and $e_s \in E_s$ so that $df(e_u) = \lambda e_u$, $df(e_c) = e_c$, $df(e_s) = \frac{1}{\lambda}e_s$. Suppose that in this basis $dh(x) = (A_{ij}(x))$. $f_n$ has an unstable vector of the form $v_u = e_u + \alpha(x)e_c + \beta(x)e_s$. Let $df_n(x)v_u = r_1^{(n)}(x)v_u$. A direct calculation shows that $r_1^{(n)}(x) = \lambda^{2n}A_{11}(f_0^nx)\left(1 + O\!\left(\frac{1}{\lambda^n}\right)\right)$. Similarly choose the central vector $v_0$ so that $v_u\wedge v_0 = e_u\wedge e_c + \dots$ and let $df_n(x)(v_u\wedge v_0) = r_2^{(n)}(x)(v_u\wedge v_0)$. Then $r_2^{(n)}(x) = \lambda^{2n}(A_{11}A_{22} - A_{12}A_{21})(f_0^nx)\left(1 + O\!\left(\frac{1}{\lambda^n}\right)\right)$. Again one can show that unstable manifolds of $f_n$ are close to unstable manifolds of $f_0$ and so they are transversal to $W^{cs}$. Let
$$J(h) = \int_{\mathbb{T}^3}\ln\frac{A_{11}A_{22} - A_{12}A_{21}}{A_{11}}(x)\,dx.$$
We have
$$\int_{V_n}\rho_{V_n}(x)\ln(df_n|E_c)(x)\,dx \approx \int_{V_n}\rho_{V_n}(x)\ln\frac{A_{11}A_{22} - A_{12}A_{21}}{A_{11}}(f_0^nx)\,dx.$$


Since $f_0$ is mixing we obtain (cf. [18] or [15, Sect. 20.6]) that

$$\int_{V_n} \rho_{V_n}(x) \ln\left(\frac{A_{11}A_{22} - A_{12}A_{21}}{A_{11}}\right)(f_0^n x)\,dx \approx J(h) \int_{V_n} \rho_{V_n}(x)\,dx \approx J(h).$$

So, if $J(h) < 0$ then $f_n$ is mostly contracting for large $n$. Similarly, if $J(h) > 0$ then $f_n^{-1}$ is mostly contracting for large $n$. (A more symmetric expression for $J$ is

$$J = \int_{\mathbb{T}^3} \left[\ln A_{33}(h^{-1}) - \ln A_{11}(h)\right] dx.)$$

To show that $J$ is not identically zero one can use the following argument of [32]. Let $\mathrm{Diff}^*(\mathbb{T}^3)$ be the space of volume preserving $C^3$-diffeomorphisms which preserve $W^{cs}(f_0)$. Then $J|_{\mathrm{Diff}^*(\mathbb{T}^3)}$ is a $C^2$-functional and calculating its first two derivatives at the identity one can prove that $J \not\equiv 0$. We refer the reader to [32] for more details.

(d) A similar example can be given with $f_0$ being the time one map of the geodesic flow on the unit tangent bundle over a negatively curved surface.

(e) In [4] several examples are given of systems having the following property:

(*) $\forall V \in \mathcal{V}$ there is a subset $U \subset V$ of positive measure such that for all $x \in U$ the forward Lyapunov exponent of $E_c$ is negative.

The next proposition is essentially proven in [4] even though it is not stated there. For the convenience of the reader we sketch the argument below.

Proposition 12.1. $f$ satisfies (*) $\Leftrightarrow$ it is mostly contracting.

Proof. In view of Theorem 4.1 we only have to show that if $f$ satisfies (*) then it is mostly contracting. Given $x_0 \in X$ choose $V$ containing $x_0$ in its interior. By (*) $\exists K(x_0), \lambda(x_0)$ such that the set $L(x_0) = \{x \in V : (df^n|E_c)(x) \le K e^{-\lambda n}\}$ has positive measure. Given $\delta$ there is $\varepsilon > 0$ and a positive measure subset $\tilde L \subset L$ such that

(a) $\forall x \in \tilde L$, $W^{cs}_\varepsilon(x)$ belongs to the weak stable manifold of $x$;

(b) if $\tilde V \in \mathcal{V}$, $d(\tilde V, V) \le \varepsilon$, the center stable holonomy $p : V \to \tilde V$ is absolutely continuous on $\tilde L$ and its Jacobian $J(x)$ satisfies $|J(x) - 1| \le \delta$.

Let $y$ be a density point of $\tilde L(x_0)$. It follows that if $I$ is a small enough interval about $y$ then $\forall n$

$$\int_I \rho_I(x) \ln(df^n|E_c)(x)\,dx \le -\frac{\lambda n}{2} + K_1.$$

Let $C = [I, W^{cs}_\varepsilon(y)]$ ($[\cdot, \cdot]$ denotes the (u,cs)-local product). Then if $\delta, \varepsilon$ are small enough, for any unstable slice $J$ of $C$,

$$\int_J \rho_J(x) \ln(df^n|E_c)(x)\,dx \le -\frac{\lambda n}{4} + K_2.$$

Call $T = [V(x_0), W^{cs}_{\varepsilon(x_0)}(y(x_0))]$ the trap associated with $x_0$. Call $C$ the core of $T$. By compactness $X$ is covered by a finite number of traps $\{T_j\}$. Now take any $V \in \mathcal{V}$.


Given $m > 0$ let $V_1(m)$ be the set of points which visit some $C_j$ before time $m$ and let $V_2(m) = V \setminus V_1(m)$. We have

$$I_n = \int_V \rho_V(x) \ln(df^n|E_c)(x)\,dx = \int_{V_1(m)} \rho_V(x) \ln(df^n|E_c)(x)\,dx + \int_{V_2(m)} \rho_V(x) \ln(df^n|E_c)(x)\,dx = I + II,$$

$$I \le \left(-\frac{\lambda n}{4} + K(m)\right) \mathrm{mes}(V_1(m)), \qquad II \le \mathrm{Const}\; n\; \mathrm{mes}(V_2(m)),$$

and $\mathrm{mes}(V_2(m)) \le \mathrm{Const}\; \theta^m$ for some $\theta < 1$ (the proof of this last inequality is similar to that of Lemma 9.1). So, for large $n$, $I_n$ is negative.

13. Conclusions

Here we repeat what we have said in the introduction, adding more technical details.

13.1. Here we relate our results to those of [38]. Let $K, \lambda$ be as in Sect. 7. For $V \in \mathcal{V}$ let $x(V)$ be the center of $V$ and $L(V) = \{x \in V : (df^n|E_c)(x) \le K e^{-\lambda n}\}$. Let $\Lambda_\delta(V) = [L, W^{cs}_\delta(x(V))]$. Then we have essentially shown that if $\delta$ is small enough then $\Lambda_\delta(V)$ satisfies the conditions of Theorem 2 of [38]. We did not deduce our result from [38] but rather repeated some of her arguments in Sects. 6–9 in order to show that u-convergence guarantees the absence of the discrete spectrum.

Now suppose that $f$ satisfies all the conditions of Theorem I except u-convergence. Then the conclusion can be false (consider, for example, the double covering of $f$ from 12(a) corresponding to $\pi : S^1 \to \mathbb{P}^1$). However we can still say something. Namely, by [4] there is a finite number $\nu_1, \dots, \nu_k$ of SRB measures and the union of their basins is the whole of $X$. Let $P = \bigcup_j \mathrm{supp}(\nu_j)$. Let $\varepsilon$ be as in Sect. 7. Choose a finite disjoint set $V_1, \dots, V_m$ which is $\varepsilon$-dense in $P$. Take $\delta \ll \varepsilon$. Choose small subintervals $U_j \subset V_j$ such that the $\Lambda_\delta(U_l)$ are disjoint. Then the arguments of Sects. 8 and 9 show that $\Lambda_\delta = \bigcup_{l=1}^m \Lambda_\delta(U_l)$ satisfies Theorem 2 of [38] except that maybe $f^n$ is not ergodic for some $n$. It then follows from the analysis of [38] that $\forall j\ \exists n_j$ so that $\nu_j = \frac{1}{n_j} \sum_{l=0}^{n_j} \nu_{jl}$ and $(f^{n_j}, \nu_{jl})$ is exponentially mixing.

Question. What happens for $g$ close to $f$? Can the maps $g \to \nu_j(g)$ be chosen continuously?

13.2. It seems that the assumption that $f$ is dynamically coherent can be relaxed (it is however satisfied in all the known examples). In fact, we only used it in Sect. 8. Let $(P_1, m_1)$ and $(P_2, m_2)$ be probability spaces. Call the map $\tau : P_1 \to P_2$ $\varepsilon$-measure preserving if $\exists A_j \subset P_j$ such that $m_j(A_j) \le \varepsilon$, $\tau|_{P_1 \setminus A_1}$ is absolutely continuous and the Jacobian satisfies $|J(x) - 1| \le \varepsilon$. For our arguments it suffices to know that if $d(V_1, V_2) \le \varepsilon$ then there is an $\varepsilon^\beta$-measure preserving map $p : V_1 \to V_2$ such that for $x \in V_1 \setminus A_1$, $f^n(x)$ and $f^n(px)$ converge exponentially fast. This, in turn, seems to follow


from the Pesin theory. However, the proof without the dynamical coherence assumption would be much more complicated.

13.3. Question. Let $f$ be as in Theorem I. Is the map $g \to \nu_g$ actually smooth? An easier problem is the following. Assume that $W^c(f)$ is absolutely continuous. Is the map $g \to \nu_g$ differentiable at $f$? (See [31] for additional discussion.)

13.4. Question. How common is (6) among partially hyperbolic diffeomorphisms of three-manifolds? In particular, is the set $\{f : f \text{ or } f^{-1} \text{ is mostly contracting}\}$ dense?

13.5. Let $X$ be a three dimensional manifold. Consider the space $S$ of partially hyperbolic ergodic volume preserving diffeomorphisms with two negative Lyapunov exponents. Question. How often are elements of $S$ mostly contracting? What is the rate of mixing for elements of $S$? According to the general scheme proposed in [38] one has to locate a "bad set" of $f$ and see how long an orbit can stay near it. The analysis of [4] shows that the bad set here is the set of points whose forward orbits never fall into any trap described in Sect. 12(e). So a way to attack this problem is to obtain more information about the geometry of this set. For example, can it have Hausdorff dimension equal to three?

13.6. For $f \in S$ we have the following elegant characterization due to [4]: $f$ is mostly contracting if and only if $W^u(f)$ is minimal. (Minimality implies that $f$ is mostly contracting by [4, Theorem B]. The converse implication is easier. See, for example, [18].) Question. How often is $W^u$ minimal? What can be said if it is not?

13.7. Another natural condition to consider if $\dim W^c = 1$ is

$$\forall V \in \mathcal{V} \quad \int_V \rho_V(x) \ln(df|E_c)(x)\,dx \ge \alpha_0 > 0.$$

In this case results similar to ours were obtained in [1, 2]. In fact, [1, 2, 4] do not assume that $\dim(W^c) = 1$ but only that all its Lyapunov exponents have the same sign. Question. Can a similar theory be developed in case $(f|E_u)$ has both positive and negative Lyapunov exponents?

13.8. In the example 12(a) $f$ is a skew extension over an Anosov base. A similar construction can be made with Axiom A attractors. Question. What can be said for general Axiom A diffeomorphisms? In particular, call $f$ entropy stable if any $g$ near $f$ has a unique measure of maximal entropy $\mu_g$ and $\mu_g \to \mu_f$ as $g \to f$. How large is the set of entropy stable diffeomorphisms? Are the examples of Sect. 12 entropy stable?

13.9. In [7, 8, 16, 20, 28] a number of examples are given of ergodic systems which remain ergodic after a small volume preserving perturbation. Question. What happens if we allow non-volume preserving perturbations?

13.10. Finally, let us remark that the questions we asked are special cases of some general conjectures about statistical properties of generic dynamical systems. See [22].


Acknowledgement. I thank V. Nitica, C. Pugh and M. Ratner for useful discussions. I first learned about mostly contracting systems during the Ergodic Theory and Statistical Mechanics Seminar at Princeton University where a random version of this property was discussed. I thank all the participants of that seminar and especially K. Khanin, A. Mazel and Ya. Sinai for introducing me to this subject. In a previous version of my paper I imposed a strong regularity requirement on the unstable foliation to prove part (A) of Lemma 6.1. I am grateful to C. Bonatti and M. Viana for explaining to me that Pesin theory can be used to verify this property (see Lemma 8.1). This work is supported by the Miller Institute for Basic Research in Science.

References

1. Alves, J.: SRB measures for nonhyperbolic systems with multidimensional expansion. IMPA Thesis, 1997
2. Alves, J., Bonatti, C. & Viana, M.: SRB measures for partially hyperbolic diffeomorphisms: The expanding case. Preprint
3. Benedicks, M. & Young, L.-S.: Sinai–Bowen–Ruelle measures for certain Henon maps. Inv. Math. 112, 541–576 (1993)
4. Bonatti, C. & Viana, M.: SRB measures for partially hyperbolic systems whose central direction is mostly contracting. To appear in Israel Math. J.
5. Bougerol, P. & Lacroix, J.: Products of random matrices with applications to Schrödinger operators. Boston–Basel–Stuttgart: Birkhäuser, 1985
6. Brin, M. & Pesin, Ya. B.: Partially hyperbolic dynamical systems. Math. USSR-Izvestiya 8, 177–218 (1974)
7. Burns, K., Pugh, C. & Wilkinson, A.: Stable ergodicity and Anosov flows. Preprint
8. Burns, K. & Wilkinson, A.: Stable ergodicity of skew products. Preprint
9. Castro, A.: Backwards inducing and decay of correlations for certain partially hyperbolic maps whose central direction is mostly contracting. IMPA Thesis, 1998
10. Chernov, N. I.: Decay of correlations and dispersing billiards. J. Stat. Phys. 94, 513–556 (1999)
11. Chernov, N. I. & Haskell, C.: Nonuniformly hyperbolic K-systems are Bernoulli. Erg. Th. & Dyn. Sys. 16, 19–44 (1996)
12. Dolgopyat, D.: Limit theorems for partially hyperbolic systems. Preprint
13. Grayson, M., Pugh, C. & Shub, M.: Stably ergodic diffeomorphisms. Ann. Math. 140, 295–329 (1994)
14. Hirsch, M., Pugh, C. & Shub, M.: Invariant manifolds. Lect. Notes in Math. 583. Berlin: Springer-Verlag, 1977
15. Katok, A. & Hasselblatt, B.: Introduction to the modern theory of dynamical systems. Encyclopedia Math. Appl. 54, Cambridge: Cambridge University Press, 1998
16. Katok, A. & Kononenko, A.: Cocycle stability for partially hyperbolic systems. Math. Res. Lett. 3, 191–210 (1996)
17. Margulis, G. A.: Certain measures that are connected with U-flows on compact manifolds. Func. Anal. Appl. 4, 62–76 (1970)
18. Margulis, G. A.: PhD Thesis, Moscow State University, 1970
19. Milnor, J.: On the concept of attractor. Commun. Math. Phys. 99, 177–195 (1985) and 102, 517–519 (1985)
20. Nitica, V. & Torok, A.: An open dense set of stably ergodic diffeomorphisms in a neighbourhood of a non-ergodic one. To appear in Topology
21. Ornstein, D. & Weiss, B.: On Bernoulli nature of systems with some hyperbolic structure. Erg. Th. & Dyn. Sys. 18, 441–456 (1998)
22. Palis, J.: Global view on dynamics. To appear in Asterisque
23. Palis, J., Pugh, C. C. & Robinson, R. C.: Nondifferentiability of invariant foliations. Lecture Notes Math. 468, Berlin: Springer, 1975, pp. 234–240
24. Pesin, Ya. B.: Families of invariant manifolds that correspond to nonzero characteristic exponents. Izv. Akad. Nauk SSSR 40, 1332–1379 (1976)
25. Pesin, Ya. B.: Characteristic Lyapunov exponents, and smooth ergodic theory. Uspehi Mat. Nauk 32, 55–112 (1977)
26. Pesin, Ya. B. & Sinai, Ya. G.: Gibbs measures for partially hyperbolic attractors. Erg. Th. & Dyn. Sys. 2, 417–438 (1982)
27. Pugh, C. & Shub, M.: Ergodic attractors. Trans. AMS 312, 1–54 (1989)
28. Pugh, C. & Shub, M.: Stably ergodic dynamical systems and partial hyperbolicity. J. Complexity 13, 125–179 (1997)
29. Pugh, C. & Shub, M.: Stable ergodicity and Julienne quasi-conformality. Preprint
30. Ruelle, D.: A measure associated with Axiom A attractors. Am. J. Math. 98, 619–654 (1976)
31. Ruelle, D.: Differentiation of SRB states. Commun. Math. Phys. 187, 227–241 (1997)
32. Shub, M. & Wilkinson, A.: Pathological foliations and removable zero exponents. Inv. Math. (1999)
33. Sinai, Ya. G.: Classical dynamic systems with countably-multiple Lebesgue spectrum. II. Izv. Akad. Nauk SSSR Ser. Mat. 30, 15–68 (1966)
34. Sinai, Ya. G.: Gibbs measures in ergodic theory. Uspehi Mat. Nauk 27, 21–64 (1972)
35. Sinai, Ya. G.: Stochasticity of dynamical systems. In: Nonlinear waves, edited by A. V. Gaponov–Grekhov. Moscow: Nauka, 1979, pp. 192–211
36. Viana, M.: Multidimensional nonhyperbolic attractors. Publ. IHES 85, 63–96 (1997)
37. Wilkinson, A.: Stable ergodicity of time-one map of a geodesic flow. PhD Thesis, University of California at Berkeley, 1995
38. Young, L.-S.: Statistical properties of dynamical systems with some hyperbolicity including certain billiards. Ann. Math. 147, 585–650 (1998)
39. Young, L.-S.: Recurrence times and rates of mixing. Isr. J. Math. 110, 153–188 (1999)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 213, 203 – 247 (2000)


On the Poisson Limit Theorems of Sinai and Major

Nariyuki Minami

Institute of Mathematics, University of Tsukuba, Tsukuba, Ibaraki 305-8571, Japan. E-mail: [email protected]

Received: 15 June 1999 / Accepted: 11 February 2000

Abstract: Let $f(\phi)$ be a positive continuous function on $0 \le \phi \le \Theta$, where $\Theta \le 2\pi$, and let $\xi$ be the number of two-dimensional lattice points in the domain $\Pi_R(f)$ between the curves $r = (R + c_1/R) f(\phi)$ and $r = (R + c_2/R) f(\phi)$, where $c_1 < c_2$ are fixed. Randomizing the function $f$ according to a probability law $P$, and the parameter $R$ according to the uniform distribution $\mu_L$ on the interval $[a_1 L, a_2 L]$, Sinai showed that the distribution of $\xi$ under $P \times \mu_L$ converges to a mixture of Poisson distributions as $L \to \infty$. Later Major showed that for $P$-almost all $f$, the distribution of $\xi$ under $\mu_L$ converges to a Poisson distribution as $L \to \infty$. In this note, we give shorter and more transparent proofs of these interesting theorems, at the same time extending the class of $P$ and strengthening the statement of Sinai.

1. Introduction

In connection with a question in the theory of quantum chaos, Sinai proposed ([S1]) and solved ([S2]) the following problem: Given a planar curve which is described in polar coordinates as

$$r = f(\phi), \quad 0 \le \phi \le \Theta, \tag{1.1}$$

with $\Theta \in (0, 2\pi]$ and $f(\cdot) > 0$ continuous, consider the domain $\Pi_R(f)$ defined by

$$\Pi_R(f) = \{x \in \mathbb{R}^2 \mid 0 \le \phi(x) \le \Theta,\ (R + c_1/R) f(\phi(x)) < |x| < (R + c_2/R) f(\phi(x))\}. \tag{1.2}$$

Here $c_1 < c_2$ are real constants and $R > 0$ is a parameter which we shall let tend to infinity. Moreover $|x| = r = \sqrt{x_1^2 + x_2^2}$ is the distance of $x$ from the origin and $\phi(x)$ is the angle between the vector $x = (x_1, x_2)$ and the positive real axis. If we set

$$\lambda(f) = (c_2 - c_1) \int_0^\Theta f(\phi)^2\,d\phi, \tag{1.3}$$


then as $R \to \infty$, the area of $\Pi_R(f)$ is asymptotically given by $\lambda(f)(1 + O(R^{-2}))$. Thus the domain $\Pi_R(f)$ becomes thinner and thinner but its area is asymptotically constant. Now consider the quantity

$$\xi = \xi(R; f) = \sharp\{\Pi_R(f) \cap \mathbb{Z}^2\}, \tag{1.4}$$
which is the number of lattice points in $\Pi_R(f)$. Then the problem, which is of physical significance, is the following: Suppose the parameter $R > 0$ is randomly distributed on the interval $I_L = (a_1 L, a_2 L)$, $0 < a_1 < a_2$, according to the uniform distribution $\mu_L$. What is the limiting probability distribution of $\xi(\cdot; f)$ under $\mu_L$ as $L \to \infty$? Numerical studies suggest that for typical smooth curves $f$ which are obtained from Hamiltonians of integrable dynamical systems, the limiting distribution is Poissonian. To give a rigorous proof of this conjecture for a concretely given $f$ remains a difficult open question.

What Sinai did was to randomize the function $f$ according to some probability law $P$ on some function space. Then he showed that under the probability measure $P \times \mu_L$, the distribution of $\xi(\cdot; \cdot)$ converges, as $L \to \infty$, to a mixture of Poisson distributions each of which has the parameter $\lambda(f)$. He also showed that there is a sequence $\bar L_n \to \infty$ such that for $P$-almost all $f$, the law of $\xi(\cdot; f)$ under $\mu_{\bar L_n}$ converges as $n \to \infty$ to the Poisson distribution with parameter $\lambda(f)$. Later Major [M] extended Sinai's idea and succeeded in getting rid of subsequences. Namely, by a more elaborate analysis, he proved that as $L \to \infty$, the distribution of $\xi(\cdot; f)$ under $\mu_L$ converges to the Poisson distribution with parameter $\lambda(f)$, for $P$-almost all $f$.

Under the conditions on $P$ posed by Sinai and Major (and which we shall assume also in this note), the sample functions $f(\phi)$ are not smooth in $\phi$ (see Remark 2 of [M]), so that the direct connection of their results to the problem of quantum chaos is lost. But still the theorems of Sinai and Major are interesting from the viewpoint of elementary probability theory, since they provide us with a new limit theorem which gives the Poisson distribution. The present author is of the opinion that these theorems deserve to be made more popular among probabilists who are not necessarily interested in such topics as quantum chaos.
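For intuition, the lattice-point count in (1.4) and the parameter in (1.3) can be evaluated by brute force for a concrete curve. The Python sketch below is only an illustration and is not part of the argument of this paper. The function names, the argument `theta_max` for the upper end of the angular range, the argument `f_max` (any upper bound for $f$, playing the role of $b_2$), and the midpoint rule used for the integral are all assumptions introduced here.

```python
import math

def lattice_count(f, R, c1, c2, theta_max, f_max):
    """Brute-force count of integer points m with 0 <= phi(m) <= theta_max and
    (R + c1/R) f(phi(m)) < |m| < (R + c2/R) f(phi(m)).
    f_max must be an upper bound for f (hypothetical helper argument)."""
    lo, hi = R + c1 / R, R + c2 / R
    bound = math.ceil(hi * f_max)          # no admissible point lies outside this box
    count = 0
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if x == 0 and y == 0:
                continue                   # the angle is undefined at the origin
            phi = math.atan2(y, x) % (2.0 * math.pi)
            if phi > theta_max:
                continue
            r = math.hypot(x, y)
            if lo * f(phi) < r < hi * f(phi):
                count += 1
    return count

def poisson_parameter(f, c1, c2, theta_max, n=10_000):
    """Midpoint-rule approximation of (c2 - c1) * integral of f(phi)^2 over [0, theta_max]."""
    h = theta_max / n
    return (c2 - c1) * h * sum(f((i + 0.5) * h) ** 2 for i in range(n))
```

Averaging `lattice_count` over many values of `R` drawn uniformly from a long interval gives an empirical distribution that can be compared with the Poisson law of parameter `poisson_parameter(...)`; such an experiment is suggestive only and proves nothing.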
The author also believes that in the future study of quantum chaos too, one will need to prove variants of these theorems under various different conditions, e.g. for higher dimensional lattices. For these purposes, however, the proofs of these theorems should be made as transparent and readable as possible. In fact, Sinai's paper [S2] contains many unnecessary arguments and is obscure at some crucial points. (For example, the interrelation among his conditions I) to VI) is not clear enough.) There is no such ambiguity in Major's paper [M], but some of his arguments are also dispensable.

In this paper, we make a more direct attack on the theorems of Sinai and Major, to improve them in the following way:

(1) The distribution under $P$ of $\xi(R; \cdot)$ converges, as $R \to \infty$, to a mixture of Poisson distributions. This result has a much simpler proof than Sinai's and implies Sinai's original assertion as a trivial corollary.

(2) We give a slightly different formulation of Major's theorem, which turns out to be equivalent to the original one but technically more tractable and more closely related to the problem in quantum chaos raised by Berry and Tabor [BT]. (See also [Mi].)

(3) The theorems are extended to the case where the $f(\cdot)$ are not Lipschitz continuous. Major's proof cannot cover this case because he makes explicit use of the sample Lipschitz continuity. But in fact, we need to use the samplewise Lipschitz continuity only in combination with the upper bound of the probability density, so that these two conditions ((1.7) and (1.8) in (II-2) below) can entirely be replaced by a single Condition (II-1). As we shall see below, the number $\sigma \in (0, 1)$ in (II-1) plays the role of $\tau - 1$ in (II-2), $\tau$ appearing in (1.8), where the number 1 comes from the Lipschitz continuity of $f(\cdot)$.

(4) We can slightly loosen the technical condition on the partial derivative of the probability density as in (III). (Compare this with (2b) of Property A of [M].) Moreover, even if we do not assume this condition, we still have a partial result, Theorem 2.

(5) We prove the limit theorems by directly proving the convergence of factorial moments of all orders as in (2.2) and (3.5). These assertions are actually stronger than those of Sinai and Major because they proved the convergence of moments not for the random variables $\xi(R; f)$ themselves but for their modifications. As discussed by the present author in [Mi], the convergence of $k$-th moments may have applications in a problem of quantum chaos even if it is proven only for finitely many $k$, so that the Poisson limit theorem is not available.

Now let us formulate our conditions and theorems. We suppose that our probability measure $P$ is defined on the function space

$$X = X_{b_1, b_2} = \{f \mid f \text{ is continuous on } [0, \Theta] \text{ and } b_1 \le f(\phi) \le b_2\} \tag{1.5}$$

with $0 < b_1 < b_2$, and we assume that $P$ satisfies the following two conditions (I) and (II).

Condition (I). For each $k \ge 1$ and $(\phi_1, \dots, \phi_k)$ ($\phi_i \ne \phi_j$) in $[0, \Theta]^k$, the distribution of $(f(\phi_1), \dots, f(\phi_k))$ has a density $p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k)$ which is $C^1$ on $[b_1, b_2]^k \times \{(\phi_1, \dots, \phi_k) \in [0, \Theta]^k \mid \phi_i \ne \phi_j\ (i \ne j)\}$. We note that if $\pi$ is a permutation of $\{1, \dots, k\}$, then

$$p_k(y_{\pi(1)}, \dots, y_{\pi(k)} | \phi_{\pi(1)}, \dots, \phi_{\pi(k)}) = p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k).$$

Condition (II). Either of the conditions (II-1) and (II-2) below holds:

(II-1) There exists a $\sigma \in (0, 1)$ such that the inequality

$$p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k) \le B_k \prod_{j=2}^{k} (\phi_j - \phi_{j-1})^{-\sigma} \tag{1.6}$$

holds with some constant $B_k$ for all $k \ge 2$ and $0 \le \phi_1 < \dots < \phi_k \le \Theta$.

(II-2) There is a constant $b_3 > 0$ such that for $P$-almost all $f \in X$,

$$Y(f) \equiv \sup_{0 \le \phi_1 < \phi_2 \le \Theta} \frac{|f(\phi_2) - f(\phi_1)|}{\phi_2 - \phi_1} \le b_3. \tag{1.7}$$

Moreover there exists a $\tau \in (1, 2)$ such that the inequality

$$p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k) \le C_k \prod_{j=2}^{k} (\phi_j - \phi_{j-1})^{-\tau} \tag{1.8}$$

holds with some constant $C_k$ for all $k \ge 2$ and $0 \le \phi_1 < \dots < \phi_k \le \Theta$.

Under these conditions, we can prove Theorems 1 and 2 below:

Theorem 1. As $R \to \infty$, the distribution of $\xi(R; \cdot)$ under the probability measure $P$ converges to the mixture of Poisson distributions each of which has the parameter $\lambda(f)$, where $\lambda(f)$ was defined in (1.3). Namely we have

$$\lim_{R \to \infty} P(\xi(R; \cdot) = k) = E_P\left[e^{-\lambda(f)} \frac{\lambda(f)^k}{k!}\right], \quad k = 0, 1, \dots. \tag{1.9}$$
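The right-hand side of (1.9) is the probability mass function of a mixed Poisson law: a Poisson probability averaged over the distribution of $\lambda(f)$. As a minimal sketch, assuming the law of $\lambda(f)$ is replaced by a finite list of equally weighted sample values (a simplification introduced here, with hypothetical function names), it can be evaluated as follows.

```python
import math

def poisson_pmf(lam, k):
    """P(N = k) for a Poisson variable N with parameter lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def mixed_poisson_pmf(lams, k):
    """Average of Poisson probabilities over equally weighted parameter values.
    `lams` stands in for samples of lambda(f); this is an illustrative assumption."""
    return sum(poisson_pmf(lam, k) for lam in lams) / len(lams)
```

When all sample values coincide, the mixture reduces to an ordinary Poisson law, which is the situation of the almost sure statements for a fixed $f$.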

As an immediate corollary, we obtain Sinai's theorem ([S2]).

Corollary 1. As $L \to \infty$, the distribution of $\xi(\cdot; \cdot)$ under the probability measure $\mu_L \times P$ converges to the mixture of Poisson distributions, each of which has the parameter $\lambda(f)$, namely

$$\lim_{L \to \infty} (\mu_L \times P)(\xi = k) = E_P\left[e^{-\lambda(f)} \frac{\lambda(f)^k}{k!}\right], \quad k = 0, 1, \dots. \tag{1.10}$$

Sinai proved this assertion under conditions which are close to our conditions (I) and (II-2) with $\tau = 1$. Under the same conditions, he also showed the following theorem, which we shall prove, simultaneously with Theorem 3, under the wider conditions (I) and (II).

Theorem 2. There is a sequence $\bar L_n \to \infty$ such that for $P$-almost all $f \in X$, the distribution of $\xi(\cdot; f)$ under $\mu_{\bar L_n}$ converges to the Poisson distribution with parameter $\lambda(f)$, namely we have

$$\lim_{n \to \infty} \mu_{\bar L_n}(\xi(\cdot; f) = k) = e^{-\lambda(f)} \frac{\lambda(f)^k}{k!}, \quad k = 0, 1, \dots, \tag{1.11}$$

for $P$-almost all $f \in X$.

If we assume, in addition to Conditions (I) and (II), the following technical condition, we can get rid of subsequences.

Condition (III). For some constants $A_k > 0$, $\nu_k \in (0, d/2)$, where $k \ge 2$ and $d = 1 - \sigma$ or $d = 2 - \tau$ according to which of (II-1) and (II-2) $P$ satisfies, we have

$$|\partial_i p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k)| \le A_k \exp(\beta^{-\nu_k}) \tag{1.12}$$

whenever $k \ge 2$, $b_1 \le y_j \le b_2$, $|\phi_i - \phi_j| \ge \beta > 0$, $i, j = 1, \dots, k$. Here $\partial_i$ denotes either $\partial/\partial y_i$ or $\partial/\partial \phi_i$.

Theorem 3. If we assume Condition (III) in addition to Conditions (I) and (II), then for $P$-almost all $f \in X$, the distribution of $\xi(\cdot; f)$ under $\mu_L$ converges to the Poisson distribution with parameter $\lambda(f)$ as $L \to \infty$, namely we have

$$\lim_{L \to \infty} \mu_L(\xi(\cdot; f) = k) = e^{-\lambda(f)} \frac{\lambda(f)^k}{k!}, \quad k = 0, 1, \dots, \tag{1.13}$$

for $P$-almost all $f \in X$. This was proved by Major under Conditions (I), (II-2) and a slightly more restrictive condition than (III).

Before closing this introduction, we give an example of $P$ satisfying our conditions (I), (II-1) and (III). (An example which satisfies (II-2) instead of (II-1) was discussed by Sinai [S2] and Major [M].) Let $\{X_t; t \ge 0\}$ be the reflecting Brownian motion on the interval $S = [b_1, b_2]$ with fixed $X_0 = b$, and let $P$ be its probability law. Then if we set $f(\phi) = X_{1+\phi}$, the law of $\{f(\phi); 0 \le \phi \le \Theta\}$ under $P$ satisfies our requirements. In fact, the infinitesimal generator of $\{X_t\}$ is the Neumann Laplacian $\frac{1}{2}\Delta$ on the interval $S$. Let $\dots < -\lambda_n < \dots < -\lambda_1 < -\lambda_0 = 0$ be its eigenvalues and $\psi_n(x)$, $n \ge 0$, be the corresponding normalized eigenfunctions. The transition density $P_t(x, y)$ of $\{X_t\}$ has the eigenfunction expansion

$$P_t(x, y) = \sum_{n \ge 0} e^{-\lambda_n t} \psi_n(x) \psi_n(y), \tag{1.16}$$

and if $0 \le \phi_1 < \dots < \phi_k \le \Theta$, the joint distribution of $(f(\phi_1), \dots, f(\phi_k))$ under $P$ has the density

$$p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k) = P_{1+\phi_1}(b, y_1) P_{\phi_2 - \phi_1}(y_1, y_2) \cdots P_{\phi_k - \phi_{k-1}}(y_{k-1}, y_k). \tag{1.17}$$

Now it is elementary to calculate $\lambda_n$ and $\psi_n(x)$ explicitly:

$$\lambda_n = \frac{\pi^2 n^2}{2(b_2 - b_1)^2}; \qquad \psi_n(x) = \sqrt{\frac{2}{b_2 - b_1}} \cos\frac{n\pi}{b_2 - b_1}(x - b_1), \quad n \ge 0. \tag{1.18}$$
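The expansion (1.16) with the explicit data (1.18) is easy to evaluate numerically. The following Python sketch truncates the series after `n_terms` terms; the function name and the truncation level are assumptions made here for illustration. One detail is handled explicitly in the code: for $n = 0$ the eigenfunction is constant, and normalizing it in $L^2([b_1, b_2])$ gives the factor $1/\sqrt{b_2 - b_1}$, so the $n = 0$ term contributes $1/(b_2 - b_1)$.

```python
import math

def transition_density(t, x, y, b1, b2, n_terms=200):
    """Truncated eigenfunction expansion of the transition density of reflecting
    Brownian motion on [b1, b2] (generator: one half of the Neumann Laplacian)."""
    length = b2 - b1
    total = 1.0 / length                   # n = 0 term: constant normalized eigenfunction
    for n in range(1, n_terms):
        lam_n = (math.pi * n) ** 2 / (2.0 * length ** 2)
        psi_x = math.sqrt(2.0 / length) * math.cos(n * math.pi * (x - b1) / length)
        psi_y = math.sqrt(2.0 / length) * math.cos(n * math.pi * (y - b1) / length)
        total += math.exp(-lam_n * t) * psi_x * psi_y
    return total
```

As a sanity check, integrating the result over $y \in [b_1, b_2]$ should give 1 for every $t > 0$ and every starting point $x$, since only the constant term survives the integration.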

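From this, it is clear that $P_t(x, y)$ is $C^1$ on $[0, \infty) \times [b_1, b_2]^2$ and hence from (1.17), our $p_k(\cdot|\cdot)$ is $C^1$ in all of its variables $y_1, \dots, y_k$ and $\phi_1, \dots, \phi_k$ as required in Condition (I). It is clear from (1.16), (1.17) and (1.18) that there is a constant $B_k$ such that

$$p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k) \le B_k \prod_{j=2}^{k} \left(\sum_{n \ge 0} e^{-\lambda_n (\phi_j - \phi_{j-1})}\right). \tag{1.19}$$

On the other hand, if we let

$$N(\lambda) = \sum_{\lambda_n \le \lambda} 1 = \left[(b_2 - b_1) \frac{\sqrt{2\lambda}}{\pi}\right], \tag{1.20}$$

then

$$\sum_{n \ge 0} e^{-\lambda_n t} = \int_{0-}^{\infty} e^{-\lambda t}\,dN(\lambda), \tag{1.21}$$

and since

$$N(\lambda) \sim \frac{\sqrt{2}(b_2 - b_1)}{\pi} \sqrt{\lambda} \tag{1.22}$$

as $\lambda \to +\infty$, we can use the Abelian theorem to obtain

$$\sum_{n \ge 0} e^{-\lambda_n t} \sim \mathrm{const.}\; t^{-1/2} \tag{1.23}$$

as $t \downarrow 0$, where the constant can be explicitly given. Hence by (1.19), there is another constant $B_k > 0$ such that

$$p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k) \le B_k \prod_{j=2}^{k} (\phi_j - \phi_{j-1})^{-1/2}, \tag{1.24}$$

so that Condition (II-1) is valid with $\sigma = 1/2$. As is obvious from (1.17), in order to verify Condition (III), it suffices to obtain an upper bound of $|(\partial/\partial y) P_t(x, y)|$ and $|(\partial/\partial t) P_t(x, y)|$. Differentiating (1.16) term by term, we obtain

$$\left|\frac{\partial}{\partial y} P_t(x, y)\right| \le \mathrm{const.} \int_{0-}^{\infty} e^{-\lambda t}\,dN_1(\lambda) \tag{1.25}$$

and

$$\left|\frac{\partial}{\partial t} P_t(x, y)\right| \le \mathrm{const.} \int_{0-}^{\infty} e^{-\lambda t}\,dN_2(\lambda), \tag{1.26}$$

where we have set

$$N_1(\lambda) = \sum_{n;\, \lambda_n \le \lambda} n; \qquad N_2(\lambda) = \sum_{n;\, \lambda_n \le \lambda} n^2. \tag{1.27}$$

Since as $\lambda \to \infty$ we have

$$N_1(\lambda) \sim \mathrm{const.}\; \lambda; \qquad N_2(\lambda) \sim \mathrm{const.}\; \lambda^{3/2}, \tag{1.28}$$

we can again use the Abelian theorem to obtain

$$\frac{\partial}{\partial y} P_t(x, y) = O(t^{-1}) \tag{1.29}$$

and

$$\frac{\partial}{\partial t} P_t(x, y) = O(t^{-3/2}) \tag{1.30}$$

as $t \downarrow 0$. Hence we can conclude that for some constant $A_k > 0$,

$$|\partial_i p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k)| \le A_k \beta^{-3/2} \tag{1.31}$$

whenever $b_1 \le y_j \le b_2$, $|\phi_i - \phi_j| \ge \beta > 0$, $i, j = 1, \dots, k$, which is much stronger than (1.12).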
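The $t^{-1/2}$ behaviour in (1.23) can be checked numerically. Combining (1.22) with the Abelian theorem gives the constant $(b_2 - b_1)/\sqrt{2\pi}$; this value is our own evaluation of the constant that the text leaves implicit. The sketch below, with a hypothetical function name and truncation level, sums the series directly and compares it with that prediction.

```python
import math

def heat_trace(t, length, n_terms=5000):
    """Truncated sum over n >= 0 of exp(-lambda_n t) with
    lambda_n = (pi n)^2 / (2 length^2), where length stands for b2 - b1."""
    return sum(
        math.exp(-((math.pi * n) ** 2) * t / (2.0 * length ** 2))
        for n in range(n_terms)
    )
```

For small $t$ the product `heat_trace(t, length) * math.sqrt(t)` approaches `length / math.sqrt(2 * math.pi)`, with a correction of order $\sqrt{t}$.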


2. Proof of Theorem 1

For $k \ge 1$, define

$$\xi_k = \xi_k(R; f) = \binom{\xi}{k} 1_{\{\xi \ge k\}} = \sum_{n=k}^{\infty} \frac{n!}{k!(n-k)!} 1_{\{\xi = n\}}. \tag{2.1}$$
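For a Poisson variable with parameter $\lambda$, the expectation of the quantity in (2.1) equals $\lambda^k / k!$, which is the form of the limit that appears in the moment computations. A short numerical check of this standard identity, with a hypothetical function name and a truncation level `n_max` chosen here for illustration:

```python
import math

def poisson_binomial_moment(lam, k, n_max=150):
    """Truncated sum of C(n, k) * P(N = n) over n for N Poisson with parameter lam.
    The Poisson probabilities are updated iteratively to avoid huge factorials."""
    pmf = math.exp(-lam)                   # P(N = 0)
    total = 0.0
    for n in range(n_max):
        if n >= k:
            total += math.comb(n, k) * pmf
        pmf *= lam / (n + 1)               # now P(N = n + 1)
    return total
```

For a mixture of Poisson laws the same computation, averaged over the mixing distribution, gives the expectation of $\lambda^k / k!$.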

As in [S2], for the proof of Theorem 1, it suffices to prove

$$\lim_{R \to \infty} E_P[\xi_k(R; \cdot)] = \frac{1}{k!} E_P[\lambda(f)^k] \tag{2.2}$$

for each $k \ge 1$. On the other hand, by a simple combinatorial argument, we can write

$$\xi_k = \sum_{\sharp A = k} 1_{\{A \subset \Pi_R(f)\}}, \tag{2.3}$$

where the summation ranges over all $k$-point subsets $A$ of $\mathbb{Z}^2$. If $m_1, m_2 \in \mathbb{Z}^2$ are distinct but $\phi(m_1) = \phi(m_2)$, then we must have $|m_1 - m_2| \ge 1$. Hence by the definition of $\Pi_R(f)$, if $R > 0$ is large, then $m_1, m_2 \in \Pi_R(f)$ with $m_1 \ne m_2$ is possible only if $\phi(m_1) \ne \phi(m_2)$. Noting $b_1 \le f \le b_2$, we can therefore rewrite (2.3) as

$$\xi_k = \sum_{(m_1, \dots, m_k) \in Z_k(R)} 1_{\{m_1, \dots, m_k \in \Pi_R(f)\}} \tag{2.4}$$

when $R > 0$ is large, where we have set

$$Z_k(R) = \{(m_1, \dots, m_k) \in (\mathbb{Z}^2)^k \mid \tfrac{b_1}{2} R \le |m_j| \le 2 b_2 R,\ j = 1, \dots, k, \text{ and } 0 \le \phi(m_1) < \dots < \phi(m_k) \le \Theta\}. \tag{2.5}$$

Now it is easy to see that $m \in \Pi_R(f)$ is equivalent to

$$f(\phi(m)) \in I_m^R \equiv \left(\frac{|m|}{R + c_2/R}, \frac{|m|}{R + c_1/R}\right), \tag{2.6}$$

and hence

$$E_P[\xi_k(R; \cdot)] = \sum_{Z_k(R)} P(f(\phi(m_j)) \in I_{m_j}^R,\ j = 1, \dots, k). \tag{2.7}$$

Take a $\beta > 0$ and define

$$Z_k(\beta; R) = \{(m_1, \dots, m_k) \in Z_k(R) \mid \phi(m_j) - \phi(m_{j-1}) \ge \beta,\ j = 2, \dots, k\}; \tag{2.8}$$

$$Z_k'(\beta; R) = Z_k(R) \setminus Z_k(\beta; R), \tag{2.9}$$

so that

$$\sum_{Z_k(R)} = \sum_{Z_k(\beta; R)} + \sum_{Z_k'(\beta; R)}. \tag{2.10}$$
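The equivalence behind (2.6) is a rearrangement of the two inequalities that define the domain in (1.2). It is valid once $R + c_1/R > 0$, which holds for large $R$. The sketch below computes the interval and the original membership test side by side; the function names are hypothetical and the numerical values in the usage note are arbitrary.

```python
def interval_I(m_norm, R, c1, c2):
    """Endpoints of the interval in (2.6) for a lattice point of norm m_norm.
    Assumes R + c1/R > 0."""
    return m_norm / (R + c2 / R), m_norm / (R + c1 / R)

def in_domain(m_norm, f_value, R, c1, c2):
    """Defining inequalities of (1.2) at a point of norm m_norm where f takes f_value."""
    return (R + c1 / R) * f_value < m_norm < (R + c2 / R) * f_value
```

With `R = 100`, `c1 = -1`, `c2 = 2` and `m_norm = 150`, the interval has length close to $(c_2 - c_1)|m|/R^3$ and midpoint close to $|m|/R$, both up to a relative error of order $R^{-2}$.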


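Let us first estimate the sum over $Z_k(\beta; R)$. By the definition (2.6) of $I_m^R$, we have as $R \to \infty$,

$$\text{the center of } I_m^R = \frac{|m|}{R}(1 + O(R^{-2})) \tag{2.11}$$

and

$$\text{the length of } I_m^R = \frac{(c_2 - c_1)|m|}{R^3}(1 + O(R^{-2})). \tag{2.12}$$

In particular, if $(m_1, \dots, m_k) \in Z_k(R)$, the length of $I_{m_j}^R$ is of $O(R^{-2})$ for $j = 1, \dots, k$. Define for $\beta > 0$,

$$\varepsilon_k(\beta) = \sup_{(y_1, \dots, y_k) \in [b_1, b_2]^k}\ \sup_{\substack{(\phi_1, \dots, \phi_k) \in [0, \Theta]^k \\ \phi_j - \phi_{j-1} \ge \beta,\ j = 2, \dots, k}}\ \max_{1 \le i \le k} \max\left\{\left|\frac{\partial}{\partial y_i} p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k)\right|, \left|\frac{\partial}{\partial \phi_i} p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k)\right|\right\} \tag{2.13}$$

and

$$\psi_k(\beta) = \sup_{(y_1, \dots, y_k) \in [b_1, b_2]^k}\ \sup_{\substack{(\phi_1, \dots, \phi_k) \in [0, \Theta]^k \\ \phi_j - \phi_{j-1} \ge \beta,\ j = 2, \dots, k}} p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k). \tag{2.14}$$

Then by Condition (I), and (2.11), (2.12) and the remark following it, we will have

$$P(f(\phi(m_j)) \in I_{m_j}^R,\ j = 1, \dots, k) = \int_{I_{m_1}^R} \cdots \int_{I_{m_k}^R} dy_1 \cdots dy_k\ p_k(y_1, \dots, y_k | \phi(m_1), \dots, \phi(m_k))$$

$$= p_k\left(\frac{|m_1|}{R}, \dots, \frac{|m_k|}{R} \,\Big|\, \phi(m_1), \dots, \phi(m_k)\right) \prod_{j=1}^{k} \frac{(c_2 - c_1)|m_j|}{R^3} + O((\psi_k(\beta) + \varepsilon_k(\beta)) R^{-2k-2}). \tag{2.15}$$

Hence noting $\sharp Z_k(R) = O(R^{2k})$, we get

$$\sum_{Z_k(\beta; R)} P(f(\phi(m_j)) \in I_{m_j}^R,\ j = 1, \dots, k) = (c_2 - c_1)^k \sum_{Z_k(\beta; R)} p_k\left(\frac{|m_1|}{R}, \dots, \frac{|m_k|}{R} \,\Big|\, \phi(m_1), \dots, \phi(m_k)\right) \left(\prod_{j=1}^{k} \frac{|m_j|}{R}\right) \left(\frac{1}{R^2}\right)^k + O((\psi_k(\beta) + \varepsilon_k(\beta)) R^{-2}). \tag{2.16}$$

Now let us consider a function on $(\mathbb{R}^2)^k$ defined by

$$R(x_1, \dots, x_k) = p_k(|x_1|, \dots, |x_k| \mid \phi(x_1), \dots, \phi(x_k)) \left(\prod_{j=1}^{k} |x_j|\right). \tag{2.17}$$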


Then the sum on the right-hand side of (2.16) is the approximation by the Riemann sum of the integral

$$(c_2 - c_1)^k \int \cdots \int_{D_k(\beta)} dx_1 \cdots dx_k\ R(x_1, \dots, x_k), \tag{2.18}$$

where the domain $D_k(\beta)$ of integration is given by

$$D_k(\beta) = \{(x_1, \dots, x_k) \in (\mathbb{R}^2)^k \mid \tfrac{b_1}{2} \le |x_j| \le 2 b_2,\ j = 1, \dots, k,\ 0 \le \phi(x_j) \le \Theta,\ \phi(x_j) - \phi(x_{j-1}) \ge \beta,\ j = 2, \dots, k\}. \tag{2.19}$$

It is easy to see that

$$\max_{1 \le j \le k} \left|\frac{\partial R}{\partial x_j}\right| \le \mathrm{const.}\,(\psi_k(\beta) + \varepsilon_k(\beta)) \tag{2.20}$$

holds on $D_k(\beta)$. This gives the accuracy of the above Riemann sum approximation, and we obtain, after replacing the domain of integration $D_k(\beta)$ by

$$D_k = \{(x_1, \dots, x_k) \in (\mathbb{R}^2)^k \mid \tfrac{b_1}{2} \le |x_j| \le 2 b_2,\ j = 1, \dots, k,\ 0 \le \phi(x_{j-1}) < \phi(x_j) \le \Theta,\ j = 2, \dots, k\}, \tag{2.21}$$

and transforming the variables into polar coordinates,

$$\sum_{Z_k(\beta; R)} P(f(\phi(m_j)) \in I_{m_j}^R,\ j = 1, \dots, k) = (c_2 - c_1)^k \int \cdots \int_{0 \le \phi_1 < \dots < \phi_k \le \Theta} d\phi_1 \cdots d\phi_k \int \cdots \int_{b_1 \le y_j \le b_2,\ j = 1, \dots, k} dy_1 \cdots dy_k \left(\prod_{j=1}^{k} y_j^2\right) p_k(y_1, \dots, y_k | \phi_1, \dots, \phi_k) + O((\psi_k(\beta) + \varepsilon_k(\beta)) R^{-1} + \beta)$$

$$= \frac{1}{k!} E_P[\lambda(f)^k] + O((\psi_k(\beta) + \varepsilon_k(\beta)) R^{-1} + \beta). \tag{2.22}$$

In order to complete the proof of (2.2), it remains to prove

$$\lim_{\beta \downarrow 0} \limsup_{R \uparrow \infty} \sum_{Z_k'(\beta; R)} P(f(\phi(m_j)) \in I_{m_j}^R,\ j = 1, \dots, k) = 0. \tag{2.23}$$

Suppose first that $P$ satisfies (II-1). By the remark which follows (2.12), we see that

$$P(f(\phi(m_j)) \in I_{m_j}^R,\ j = 1, \dots, k) \le \mathrm{const.}\; R^{-2k} \prod_{j=2}^{k} (\phi(m_j) - \phi(m_{j-1}))^{-\sigma}. \tag{2.24}$$


Hence Lemma 1 below gives

$$\limsup_{R \uparrow \infty} \sum_{Z_k'(\beta; R)} P(f(\phi(m_j)) \in I_{m_j}^R,\ j = 1, \dots, k) = O(\beta^{1-\sigma}) \tag{2.25}$$

for small $\beta > 0$, and (2.23) follows immediately from this.

Lemma 1. For large $R > 0$ and for small $\beta > 0$, we have the estimate

$$\max_{m \in Z_1(R)} \sum_{m' \in Z_1(R);\ 0 < \phi(m') - \phi(m) < \beta} (\phi(m') - \phi(m))^{-\sigma} = O(R^2 (R^{\sigma - 1} + \beta^{1-\sigma})).$$

Proof. We may assume 0 ≤ ϕ(m) ≤ π/4, because Z2 is invariant under the rotation by πj/2, j = 1, 2, 3, and because the √ lattice obtained by rotating Z2 by π/4 + πj/2, j = 1, 2, 3 is contained in the lattice (1/ 2)Z2 . But in this case, we have ρ ≡ tan ϕ(m) ∈ [0, 1]. Writing m = (µ1 , µ2 ), we get (ϕ(m ) − ϕ(m))−σ m ∈Z1 (R); 0<ϕ(m )−ϕ(m)<β

≤

1≤µ1 ≤2b2 R µ2 ;ρ< µ2
tan

−1

−σ

µ2 − ϕ(m) µ1

.

(2.26)

1

Since

µ2 > ρ} = [ρµ1 ] + 1, (2.27) µ1 the inner summation on the right-hand side of (2.26) is further estimated as below: −σ −1 µ2 tan − ϕ(m) µ1 µ2 min{µ2 |

µ2 ;ρ< µ

≤ µ1

tan(ϕ(m)+β) ρ

(tan−1 y − ϕ(m))−σ dy

{ρµ1 } [ρµ1 ] + 1 tan−1 − tan−1 ρ µ1 µ1 ≡ F (µ1 ) + G(µ1 ). +

(2.28) −σ

Here [a] and {a} are respectively the integer and the fractional parts of a real number a. Now we have   ϕ(m)+β dϕ   F (µ1 ) = µ1 (ϕ − ϕ(m))−σ cos2 ϕ (2.29) ϕ(m) 1≤µ1 ≤2b2 R

1≤µ1 ≤2b2 R

= O(R 2 β 1−σ ), where we have used the assumption that 0 ≤ ϕ(m) ≤ π/4 and that β > 0 is small so that cos2 ϕ is bounded away from 0 on the range of integration.


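In order to estimate the sum of $G(\mu_1)$ we set $\rho = p/q$ with p and q mutually prime integers. Since $m \in Z_1(R)$ and $0 \le \rho \le 1$, we must have $0 \le p \le q \le 2b_2R$. Moreover for any integer $\ell$,
\[
\{\{\rho n\} \mid n \in \mathbf{N}\} = \{\{\rho(q\ell + j)\} \mid j = 1,\dots,q\} = \Bigl\{0, \frac{1}{q}, \dots, \frac{q-1}{q}\Bigr\}. \tag{2.30}
\]
Then noting that $0 \le \{\rho\mu_1\} \le 1$ and
\[
\frac{[\rho\mu_1]+1}{\mu_1} = \rho + \frac{1-\{\rho\mu_1\}}{\mu_1} \tag{2.31}
\]
hold, we can compute
\[
\begin{aligned}
\sum_{1\le\mu_1\le 2b_2R} G(\mu_1)
&\le \sum_{1\le j\le q-1} G(j) + \sum_{1\le\ell\le 2b_2R/q}\ \sum_{0\le j\le q-1} G(q\ell+j)\\
&\le \sum_{1\le j\le q-1}\Bigl(\tan^{-1}\Bigl(\rho+\frac{1-j/q}{q}\Bigr)-\varphi(m)\Bigr)^{-\sigma}
+ \sum_{1\le\ell\le 2b_2R/q}\ \sum_{0\le j\le q-1}\frac{1}{q\ell}\Bigl(\tan^{-1}\Bigl(\rho+\frac{1-j/q}{q(\ell+1)}\Bigr)-\varphi(m)\Bigr)^{-\sigma}\\
&\le q\int_0^1\Bigl(\tan^{-1}\Bigl(\rho+\frac{1-x}{q}\Bigr)-\varphi(m)\Bigr)^{-\sigma}dx
+ \sum_{1\le\ell\le 2b_2R/q}\frac{1}{\ell}\int_0^1\Bigl(\tan^{-1}\Bigl(\rho+\frac{1-x}{q(\ell+1)}\Bigr)-\varphi(m)\Bigr)^{-\sigma}dx\\
&= q^{2}\int_{\varphi(m)}^{\tan^{-1}(\rho+1/q)}\bigl(\varphi-\varphi(m)\bigr)^{-\sigma}\frac{d\varphi}{\cos^2\varphi}
+ \sum_{1\le\ell\le 2b_2R/q}\frac{q(\ell+1)}{\ell}\int_{\varphi(m)}^{\tan^{-1}(\rho+1/q(\ell+1))}\bigl(\varphi-\varphi(m)\bigr)^{-\sigma}\frac{d\varphi}{\cos^2\varphi}\\
&= O(q^{\sigma+1}+R^{\sigma}) = O(R^{\sigma+1}),
\end{aligned} \tag{2.32}
\]
completing the proof of Lemma 1.

Next we consider the case where P satisfies (II-2). By (2.11) and (2.12), there is a constant C > 0 such that the conditions
\[
\tfrac12 b_1R \le |m| \le 2b_2R \quad\text{and}\quad f(\varphi(m)) \in I^R_m
\]
imply the condition
\[
\Bigl|f(\varphi(m)) - \frac{|m|}{R}\Bigr| \le \tfrac12 CR^{-2}. \tag{2.33}
\]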

By the Lipschitz continuity of f(·), we have the following implication for large R > 0:
\[
f(\varphi(m_j)) \in I^R_{m_j},\ \ \tfrac12 b_1R \le |m_j| \le 2b_1R,\ \ j=1,\dots,k
\ \Longrightarrow\
\Bigl|\frac{|m_j|}{R}-\frac{|m_{j-1}|}{R}\Bigr| \le \frac{C}{R^{2}} + b_3|\varphi(m_j)-\varphi(m_{j-1})|,\ \ j=2,\dots,k. \tag{2.34}
\]
On the other hand, it is easy to see that for some absolute constant M > 0,
\[
\min_{(m,m')\in Z_2(R)} \bigl(\varphi(m')-\varphi(m)\bigr) \ge MR^{-2} \tag{2.35}
\]
for large R > 0. Hence if we set $K = b_3 + C/M$, the right-hand side of (2.34) can be rewritten as
\[
\bigl||m_j|-|m_{j-1}|\bigr| \le KR\,|\varphi(m_j)-\varphi(m_{j-1})|,\quad j=2,\dots,k. \tag{2.36}
\]
From this consideration and (1.8), we obtain
\[
\sum_{Z_k(\beta;R)} P\bigl(f(\varphi(m_j)) \in I^R_{m_j},\ j=1,\dots,k\bigr) \le \mathrm{const.}\sum_{Z'_k(\beta;R)} R^{-2k}\prod_{j=2}^{k}\bigl(\varphi(m_j)-\varphi(m_{j-1})\bigr)^{-\tau}, \tag{2.37}
\]
where $Z'_k(\beta;R)$ is the totality of $(m_1,\dots,m_k) \in Z_k(\beta;R)$ for which (2.36) holds. Then by Lemma 3 below, we will have
\[
\limsup_{R\to\infty}\sum_{Z'_k(\beta;R)} R^{-2k}\prod_{j=2}^{k}\bigl(\varphi(m_j)-\varphi(m_{j-1})\bigr)^{-\tau} = O(\beta^{2-\tau}),\quad \beta\downarrow 0, \tag{2.38}
\]

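which completes the proof of Theorem 1.

Lemma 2. Let A > 0, S > 0, T > 0 and R ≥ 1, where T ≥ δR and S ≥ δR for some δ > 0. Let further $\beta > (AR)^{-1}$. Then there is a constant C > 0 such that
\[
\sum_{m\in\mathbf{Z}^2;\ (AR)^{-1}<\varphi(m)-\varphi_0<\beta} \mathbf{1}_{\{||m|-S|\le T(\varphi(m)-\varphi_0)\}}\bigl(\varphi(m)-\varphi_0\bigr)^{-\tau} \le C\,ST\,\beta^{2-\tau}
\]
for any $\varphi_0 \in [0, \Theta)$.

Proof. Divide the interval $(\varphi_0+(AR)^{-1}, \varphi_0+\beta]$ into subintervals $\Delta_\ell$ of length $(AR)^{-1}$, where $\ell = 1,\dots,[AR\beta]-1$, and an interval $\Delta_{[AR\beta]}$ of length $\le (AR)^{-1}$. Let $N_\ell$ be the number of lattice points $m \in \mathbf{Z}^2$ such that $\varphi(m) \in \Delta_\ell$ and that $||m|-S| \le T(\varphi(m)-\varphi_0)$. Then $N_\ell \le \mathrm{const.}\,R^{-2}ST\ell$ and we have
\[
\begin{aligned}
\sum_{m\in\mathbf{Z}^2;\ (AR)^{-1}<\varphi(m)-\varphi_0<\beta} \mathbf{1}_{\{||m|-S|\le T(\varphi(m)-\varphi_0)\}}\bigl(\varphi(m)-\varphi_0\bigr)^{-\tau}
&= \sum_{\ell=1}^{[R\beta]}\ \sum_{m;\ \varphi(m)\in\Delta_\ell} \mathbf{1}_{\{||m|-S|\le T(\varphi(m)-\varphi_0)\}}\bigl(\varphi(m)-\varphi_0\bigr)^{-\tau}\\
&\le \sum_{\ell=1}^{[R\beta]}\ \sum_{m;\ \varphi(m)\in\Delta_\ell} \mathbf{1}_{\{||m|-S|\le T(\varphi(m)-\varphi_0)\}}(\ell R^{-1})^{-\tau}\\
&\le \mathrm{const.}\sum_{\ell=1}^{[R\beta]} \frac{ST}{R^{2}}\,\ell^{1-\tau}R^{\tau}\\
&\le \mathrm{const.}\,\frac{ST}{R^{2-\tau}}(R\beta)^{2-\tau} = C\,ST\,\beta^{2-\tau},
\end{aligned} \tag{2.39}
\]
and this estimate is uniform in $\varphi_0$.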
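The last step of (2.39) rests on the elementary bound that the sum of ℓ^{1−τ} over ℓ = 1, ..., N is at most a constant times N^{2−τ}. The sketch below illustrates this numerically under the assumption 1 < τ < 2 (an assumption made here for the illustration, suggested by the exponents 1 − τ and 2 − τ appearing in the text); in that range the summand is decreasing, so the sum is dominated by the integral of x^{1−τ} over [0, N], which equals N^{2−τ}/(2 − τ). The function names are hypothetical.

```python
def power_sum(n: int, tau: float) -> float:
    """Sum of l**(1 - tau) for l = 1..n, as in the third line of (2.39)."""
    return sum(l ** (1.0 - tau) for l in range(1, n + 1))


def integral_bound(n: int, tau: float) -> float:
    """Integral of x**(1 - tau) over [0, n]; assumes 1 < tau < 2."""
    return n ** (2.0 - tau) / (2.0 - tau)
```

With N playing the role of [Rβ], the bound gives exactly the factor (Rβ)^{2−τ} in the final line of (2.39).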

As a corollary of Lemma 2, we get the following

Lemma 3. For large R > 0 and small β > 0, we have
\[
\max_{m'\in Z_1(R)}\ \sum_{m;\ (m',m)\in Z_2(\beta;R)} \bigl(\varphi(m)-\varphi(m')\bigr)^{-\tau} = O(KR^{2}\beta^{2-\tau}).
\]
Proof. By an elementary geometric consideration, we can show the existence of an absolute constant M' > 0 such that
\[
\min_{m\in Z_1(R)}\ \min_{m';\ \varphi(m')>\varphi(m),\ ||m'|-|m||\le R(\varphi(m')-\varphi(m))} \bigl(\varphi(m')-\varphi(m)\bigr) \ge M'R^{-1}. \tag{2.40}
\]

Hence if we take $A^{-1} = M'$, $S = |m'|$, $T = KR$ and $\varphi_0 = \varphi(m')$ in Lemma 2, then the condition $(AR)^{-1} < \varphi(m) - \varphi_0$ under the summation symbol can be dropped and we obtain the desired assertion.

3. Proof of Theorems 2 and 3

In this section, we shall prove simultaneously Theorems 2’ and 3’, which are equivalent variants of Theorems 2 and 3, admitting a technical lemma which will be proved in Sect. 4. After that, we shall sketch how to transform the modified version of our theorems back to the original ones. In order to state the modified version of our theorems (which the author believes are in fact closer to the original problem raised by Berry and Tabor [BT]), let us introduce the planar domain $\tilde\Pi_R(f)$ by
\[
\tilde\Pi_R(f) = \bigl\{x \in \mathbf{R}^2 \bigm| 0 \le \varphi(x) \le \Theta,\ \sqrt{R+2c_1}\,f(\varphi(x)) < |x| < \sqrt{R+2c_2}\,f(\varphi(x))\bigr\}. \tag{3.1}
\]
Then the area of $\tilde\Pi_R(f)$ is not only asymptotic to, but is equal to λ(f) for all R > 0. Given positive numbers $0 < a_1 < a_2$ and L > 0, let $\mu_L^{a_1,a_2}(dR)$ be the uniform probability distribution on the interval $I_L = (a_1L, a_2L)$. Finally we define
\[
\tilde\xi = \tilde\xi(R; f) = \sharp\{\tilde\Pi_R(f) \cap \mathbf{Z}^2\} \tag{3.2}
\]
to be the number of lattice points in the domain $\tilde\Pi_R(f)$. Now we shall prove the following two theorems:

Theorem 2’. Suppose Conditions (I) and (II) hold. Then there is a sequence $\bar L_n \to \infty$ such that for P-almost all f ∈ X, the distribution of $\tilde\xi(\cdot; f)$ under $\mu_{\bar L_n}^{a_1,a_2}$ converges to the Poisson distribution with parameter λ(f).

Theorem 3’. If we assume Condition (III) in addition to Conditions (I) and (II), then for P-almost all f ∈ X, the distribution of $\tilde\xi(\cdot; f)$ under $\mu_L^{a_1,a_2}$ converges to the Poisson distribution with parameter λ(f) as L → ∞.

Let us turn to the proof of Theorem 3’. For k = 1, 2, . . . , set
\[
E_L = E_{L,k}(f) = \int_{I_L} \tilde\xi_k(R; f)\,\mu_L(dR), \tag{3.3}
\]
where $\mu_L = \mu_L^{a_1,a_2}$, and
\[
\tilde\xi_k = \tilde\xi_k(R; f) = \binom{\tilde\xi}{k}\mathbf{1}_{\{\tilde\xi\ge k\}} = \sum_{n=k}^{\infty} \frac{n!}{k!(n-k)!}\,\mathbf{1}_{\{\tilde\xi=n\}}. \tag{3.4}
\]
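The binomial moments in (3.4) are the natural quantities for identifying a Poisson limit, because a Poisson variable with parameter λ has k-th binomial moment exactly λ^k/k!. The sketch below is an illustration rather than part of the proof: it evaluates the series over n ≥ k of C(n, k) e^{−λ} λ^n / n! numerically, truncated at a hypothetical cutoff `n_max`, so that it can be compared with λ^k/k!.

```python
from math import comb, exp, factorial


def poisson_pmf(n: int, lam: float) -> float:
    """P(xi = n) for a Poisson variable with parameter lam."""
    return exp(-lam) * lam ** n / factorial(n)


def binomial_moment(lam: float, k: int, n_max: int = 100) -> float:
    """Truncated series for E[C(xi, k)] when xi is Poisson(lam)."""
    return sum(comb(n, k) * poisson_pmf(n, lam) for n in range(k, n_max + 1))
```

For moderate λ the truncation error at `n_max = 100` is far below floating-point precision, so `binomial_moment(lam, k)` agrees with `lam**k / factorial(k)` to many digits.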

As was done in [M], Theorem 3’ will be proved as soon as we have shown
\[
\lim_{L\to\infty} E_{L,k}(f) = \frac{1}{k!}\lambda(f)^k,\quad k = 1, 2, \dots \tag{3.5}
\]
for P-almost all f ∈ X. Similarly, Theorem 2’ will be proved as soon as we have shown the existence of a sequence $\bar L_j \to \infty$ such that (3.5) holds along this sequence for P-almost all f ∈ X. For a lattice point $m \in \mathbf{Z}^2$, let
\[
D_m = D_m(f) = \{R > 0 \mid \tilde\Pi_R(f) \ni m\}. \tag{3.6}
\]
By Definition (3.1) of $\tilde\Pi_R(f)$, we have
\[
D_m = \Bigl(\frac{|m|^2}{f(\varphi(m))^2} - 2c_2,\ \frac{|m|^2}{f(\varphi(m))^2} - 2c_1\Bigr). \tag{3.7}
\]
In particular, $|D_m| = 2(c_2 - c_1)$. For convenience, we let
\[
\gamma_m = \gamma_m(f) = \frac{|m|^2}{f(\varphi(m))^2}. \tag{3.8}
\]
Now set $L_n = 2^n$, n = 1, 2, . . . . Then if $\tilde\Pi_R(f) \ni m$, $R \in I_L$ and if $L_n \le L < L_{n+1}$, then we will have
\[
b_1\sqrt{a_1L_n} \le |m| \le 2b_2\sqrt{a_2L_{n+1}}. \tag{3.9}
\]
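The passage from the domain (3.1) to the interval (3.7) can be checked directly. The sketch below assumes that (3.1) reads sqrt(R + 2c1) f(φ(x)) < |x| < sqrt(R + 2c2) f(φ(x)), which is the reading under which (3.7) follows by squaring; the angular restriction on φ(m) is ignored, and the function `f` passed in is a hypothetical stand-in for the random boundary function.

```python
import math


def gamma(m, f):
    """gamma_m = |m|^2 / f(phi(m))^2 as in (3.8)."""
    x, y = m
    return (x * x + y * y) / f(math.atan2(y, x)) ** 2


def interval_D(m, f, c1, c2):
    """End points of D_m = (gamma_m - 2*c2, gamma_m - 2*c1) as in (3.7)."""
    g = gamma(m, f)
    return g - 2.0 * c2, g - 2.0 * c1


def in_domain(m, R, f, c1, c2):
    """Radial membership test for the domain (3.1), under the stated reading."""
    x, y = m
    fm = f(math.atan2(y, x))
    return math.sqrt(R + 2.0 * c1) * fm < math.hypot(x, y) < math.sqrt(R + 2.0 * c2) * fm
```

For any lattice point m, the set of R for which `in_domain` holds is then the open interval returned by `interval_D`, whose length is 2(c2 − c1) regardless of m.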


Under the same condition, it is also easy to see that if $m \ne m'$ but $\varphi(m) = \varphi(m')$, then $D_m \cap D_{m'} = \emptyset$. Hence if we define, as in Sect. 2,
\[
Z_{k,n} = \bigl\{(m_1,\dots,m_k) \in (\mathbf{Z}^2)^k \bigm| m_j \text{ satisfies (3.9) for } j = 1,\dots,k \text{ and } 0 \le \varphi(m_1) < \dots < \varphi(m_k) \le \Theta\bigr\}, \tag{3.10}
\]
then for large n and for $L_n \le L < L_{n+1}$, we have
\[
E_{L,k} = \sum_{(m_1,\dots,m_k)\in Z_{k,n}} \mu_L\bigl(m_1,\dots,m_k \in \tilde\Pi_R(f)\bigr)
= \frac{1}{|I_L|}\sum_{(m_1,\dots,m_k)\in Z_{k,n}} |I_L \cap D_{m_1} \cap \dots \cap D_{m_k}|. \tag{3.11}
\]
For k = 1, (3.5) can be proved without any reference to the randomness of f(·). Indeed in this case,
\[
E_{L,1}(f) = \frac{1}{|I_L|}\sum_{m\in Z_{1,n}} |I_L \cap D_m|. \tag{3.12}
\]

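Since
\[
D_m \cap I_L \ne \emptyset \iff a_1L + 2c_1 < \gamma_m < a_2L + 2c_2, \tag{3.13}
\]
and
\[
D_m \subset I_L \iff a_1L + 2c_2 \le \gamma_m \le a_2L + 2c_1, \tag{3.14}
\]
we see
\[
E_{L,1}(f) \le \frac{1}{|I_L|}\sum_{m\in\mathbf{Z}^2} \mathbf{1}_{(a_1L+2c_1,\,a_2L+2c_2)}(\gamma_m)\,|D_m|
= \frac{2(c_2-c_1)}{|I_L|}\sum_{m\in\mathbf{Z}^2} \mathbf{1}_{(a_1L+2c_1,\,a_2L+2c_2)}\Bigl(\frac{|m|^2}{f(\varphi(m))^2}\Bigr), \tag{3.15}
\]
and similarly
\[
E_{L,1}(f) \ge \frac{2(c_2-c_1)}{|I_L|}\sum_{m\in\mathbf{Z}^2} \mathbf{1}_{[a_1L+2c_2,\,a_2L+2c_1]}\Bigl(\frac{|m|^2}{f(\varphi(m))^2}\Bigr). \tag{3.16}
\]
Define $\bar E_{L,1}(f)$ and $\underline{E}_{L,1}(f)$ to be the right-hand sides of (3.15) and (3.16) respectively. For each ε > 0, consider the planar domain
\[
A_\varepsilon = A_\varepsilon(f) = \bigl\{x \in \mathbf{R}^2 \bigm| 0 \le \varphi(x) \le \Theta \text{ and } a_1f(\varphi(x))^2 < |x|^2 < (a_2+\varepsilon)f(\varphi(x))^2\bigr\}. \tag{3.17}
\]
Since f(·) is a continuous function, the boundary of $A_\varepsilon$ has zero Lebesgue measure, and hence the indicator of $A_\varepsilon$ is Riemann integrable. We have therefore
\[
\limsup_{L\to\infty} \bar E_{L,1}(f) \le \limsup_{L\to\infty} \frac{2(c_2-c_1)}{a_2-a_1}\Bigl(\frac{1}{\sqrt L}\Bigr)^{2}\sum_{m\in\mathbf{Z}^2} \mathbf{1}_{A_\varepsilon}\Bigl(\frac{m}{\sqrt L}\Bigr)
= \frac{2(c_2-c_1)}{a_2-a_1}\times\text{the area of } A_\varepsilon
= \Bigl(1+\frac{\varepsilon}{a_2-a_1}\Bigr)\lambda(f). \tag{3.18}
\]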
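The Riemann-sum step in (3.18) can be illustrated numerically: the number of lattice points m with m/√L in the domain (taken here with ε = 0), divided by L, should approach the area (1/2)(a2 − a1) times the integral of f(φ)² over [0, Θ]. The sketch below is only an illustration; the boundary function `f`, the angle `theta` and the scale `L` are all hypothetical inputs, and the bounding box is estimated from sampled values of `f`.

```python
import math


def lattice_count(L, a1, a2, f, theta):
    """Count m in Z^2 with 0 <= phi(m) <= theta and a1*L < |m|^2 / f(phi(m))^2 < a2*L."""
    f_max = max(f(theta * i / 1000.0) for i in range(1001))  # sampled bound on f
    n = int(math.ceil(math.sqrt(a2 * L) * f_max)) + 2        # half-width of search box
    count = 0
    for x in range(-n, n + 1):
        for y in range(-n, n + 1):
            if x == 0 and y == 0:
                continue
            phi = math.atan2(y, x) % (2.0 * math.pi)          # angle in [0, 2*pi)
            if phi > theta:
                continue
            g = (x * x + y * y) / f(phi) ** 2
            if a1 * L < g < a2 * L:
                count += 1
    return count


def sector_area(a1, a2, f, theta, steps=10000):
    """Midpoint rule for (1/2)*(a2 - a1) * integral_0^theta f(phi)^2 dphi."""
    h = theta / steps
    return 0.5 * (a2 - a1) * h * sum(f((i + 0.5) * h) ** 2 for i in range(steps))
```

With, say, f(φ) = 1 + 0.3 cos φ, Θ = π/2, a1 = 1, a2 = 2 and L = 4000, the ratio `lattice_count(...) / L` lies within a few percent of `sector_area(...)`; the discrepancy comes from lattice points near the boundary and shrinks as L grows.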


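Letting ε ↓ 0, we obtain
\[
\limsup_{L\to\infty} \bar E_{L,1}(f) \le \lambda(f). \tag{3.19}
\]
By a similar argument, we can also prove
\[
\liminf_{L\to\infty} \underline{E}_{L,1}(f) \ge \lambda(f), \tag{3.20}
\]
so that
\[
\lim_{L\to\infty} E_{L,1}(f) = \lambda(f) \tag{3.21}
\]
for every f ∈ X.

Next we consider the case k ≥ 2. If $L_n \le L < L_{n+1}$, then $I_L \subset [a_1L_n, a_2L_{n+1})$. We divide the interval $[a_1L_n, a_2L_{n+1})$ into $q_n = n^{\eta}$ subintervals $[z_{i-1}, z_i)$ having equal length, where η > 0 is arbitrary. Obviously
\[
z_i = a_1L_n + \frac{i}{q_n}(a_2L_{n+1} - a_1L_n),\quad i = 1,\dots,q_n. \tag{3.22}
\]
If we let
\[
\bar A_n(L) = \frac{a_1(L-L_n)+2c_1}{a_2L_{n+1}-a_1L_n}\,q_n;\qquad
\bar B_n(L) = 1 + \frac{a_2L-a_1L_n+2c_2}{a_2L_{n+1}-a_1L_n}\,q_n, \tag{3.23}
\]
and
\[
\underline{A}_n(L) = 1 + \frac{a_1(L-L_n)+2c_2}{a_2L_{n+1}-a_1L_n}\,q_n;\qquad
\underline{B}_n(L) = \frac{a_2L-a_1L_n+2c_1}{a_2L_{n+1}-a_1L_n}\,q_n, \tag{3.24}
\]
then we have
\[
\gamma_m \in [z_{i-1}, z_i),\ D_m \cap I_L \ne \emptyset \ \Longrightarrow\ \bar A_n(L) < i < \bar B_n(L) \tag{3.25}
\]
and
\[
\gamma_m \in [z_{i-1}, z_i),\ \underline{A}_n(L) \le i \le \underline{B}_n(L) \ \Longrightarrow\ D_m \subset I_L. \tag{3.26}
\]
Hence for n large, we can write
\[
E_{k,L}(f) = \frac{1}{|I_L|}\sum_{(m_1,\dots,m_k)\in Z_{k,n}} |I_L \cap D_{m_1} \cap \dots \cap D_{m_k}|
\le \frac{1}{|I_L|}\sum_{\bar A_n(L)<i<\bar B_n(L)}\ \sum_{m_1} \mathbf{1}_{\{\gamma_{m_1}\in[z_{i-1},z_i)\}} \sum_{Z_{k,n}(m_1)} |D_{m_1} \cap \dots \cap D_{m_k}|, \tag{3.27}
\]
and
\[
E_{k,L}(f) \ge \frac{1}{|I_L|}\sum_{\underline{A}_n(L)\le i\le\underline{B}_n(L)}\ \sum_{m_1} \mathbf{1}_{\{\gamma_{m_1}\in[z_{i-1},z_i)\}} \sum_{Z_{k,n}(m_1)} |D_{m_1} \cap \dots \cap D_{m_k}|, \tag{3.28}
\]
where
\[
Z_{k,n}(m_1) \equiv \bigl\{(m_2,\dots,m_k) \in (\mathbf{Z}^2)^{k-1} \bigm| (m_1, m_2,\dots,m_k) \in Z_{k,n}\bigr\}. \tag{3.29}
\]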
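The quantity |D_{m1} ∩ ... ∩ D_{mk}| appearing in (3.27) and (3.28) is easy to compute explicitly, since by (3.7) every D_m is an open interval of the same length A = 2(c2 − c1) whose position is determined by γ_m. The sketch below is an illustration only: it compares the direct interval intersection with a closed form written relative to the first point, A − max(0, max_j(γ_j − γ_1)) + min(0, min_j(γ_j − γ_1)), truncated at zero. The function names are hypothetical.

```python
def intersection_length(gammas, c1, c2):
    """Length of the intersection of the intervals (g - 2*c2, g - 2*c1), g in gammas."""
    lo = max(g - 2.0 * c2 for g in gammas)
    hi = min(g - 2.0 * c1 for g in gammas)
    return max(hi - lo, 0.0)


def closed_form(gammas, c1, c2):
    """Same length, expressed through the offsets gamma_j - gamma_1, j >= 2."""
    A = 2.0 * (c2 - c1)
    offsets = [g - gammas[0] for g in gammas[1:]]
    up = max([0.0] + offsets)     # 0 v max_j (gamma_j - gamma_1)
    down = min([0.0] + offsets)   # 0 ^ min_j (gamma_j - gamma_1)
    return max(A - up + down, 0.0)
```

Both functions return 0 as soon as the spread max γ − min γ reaches A, which reflects the fact that only points with nearly equal γ contribute to the sums.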

As before, we define $\bar E^{(1)}_{k,L}(f)$ and $\underline{E}^{(1)}_{k,L}(f)$ to be the right-hand sides of (3.27) and (3.28) respectively. We also define $\bar E^{(2)}_{k,L}(f)$ by
\[
\bar E^{(2)}_{k,L}(f) = \frac{1}{|I_L|}\sum_{\bar A_n(L)<i<\bar B_n(L)}\ \sum_{m_1} \mathbf{1}_{\{\gamma_{m_1}\in[z_{i-1},z_i)\}}
\times 2(c_2-c_1)^k \int_{\varphi(m_1)}^{\Theta} d\varphi_2 \int_{\varphi_2}^{\Theta} d\varphi_3 \cdots \int_{\varphi_{k-1}}^{\Theta} d\varphi_k\, f(\varphi_2)^2 \cdots f(\varphi_k)^2, \tag{3.30}
\]
and $\underline{E}^{(2)}_{k,L}(f)$ by the same formula as (3.30) except that the condition under the summation symbol “$\bar A_n(L) < i < \bar B_n(L)$” is replaced by “$\underline{A}_n(L) \le i \le \underline{B}_n(L)$”. Now it is easy to prove
\[
\lim_{L\to\infty} \bar E^{(2)}_{k,L}(f) = \lim_{L\to\infty} \underline{E}^{(2)}_{k,L}(f) = \frac{\lambda(f)^k}{k!} \tag{3.31}
\]
for every f ∈ X, so that we are left with estimating the differences
\[
\bar G_{k,L}(f) \equiv \bar E^{(1)}_{k,L}(f) - \bar E^{(2)}_{k,L}(f);\qquad
\underline{G}_{k,L}(f) \equiv \underline{E}^{(1)}_{k,L}(f) - \underline{E}^{(2)}_{k,L}(f). \tag{3.32}
\]

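More precisely, writing
\[
\begin{aligned}
\mathcal{E}_n(f) &\equiv \sup_{L_n\le L<L_{n+1}} \max\{|\underline{G}_{k,L}(f)|, |\bar G_{k,L}(f)|\}\\
&\le \frac{1}{|I_{L_n}|}\sum_{i=1}^{q_n+1}\Bigl|\sum_{m_1} \mathbf{1}_{\{\gamma_{m_1}\in[z_{i-1},z_i)\}}\Bigl\{\sum_{(m_2,\dots,m_k)\in Z_{k,n}(m_1)} |D_{m_1} \cap D_{m_2} \cap \dots \cap D_{m_k}|\\
&\qquad - 2(c_2-c_1)^k \int_{\varphi(m_1)}^{\Theta} d\varphi_2 \int_{\varphi_2}^{\Theta} d\varphi_3 \cdots \int_{\varphi_{k-1}}^{\Theta} d\varphi_k\, f(\varphi_2)^2 \cdots f(\varphi_k)^2\Bigr\}\Bigr|,
\end{aligned} \tag{3.33}
\]
we shall prove
\[
\sum_{n=1}^{\infty} E_P[\mathcal{E}_n(\cdot)] = E_P\Bigl[\sum_{n=1}^{\infty} \mathcal{E}_n(\cdot)\Bigr] < \infty, \tag{3.34}
\]
which immediately gives
\[
\limsup_{L\to\infty} \max\{|\underline{G}_{k,L}(f)|, |\bar G_{k,L}(f)|\} \le \lim_{n\to\infty} \mathcal{E}_n(f) = 0 \tag{3.35}
\]
for P-almost all f ∈ X.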


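Let us finish the proof of (3.34), and hence of Theorem 3’, admitting the following technical lemma:

Lemma 4. For n ≥ 1 and $1 \le i \le q_n + 1$, let
\[
U_{i,n} = E_P\Bigl[\Bigl|\sum_{m_1} \mathbf{1}_{\{\gamma_{m_1}\in[z_{i-1},z_i)\}}\Bigl\{\sum_{(m_2,\dots,m_k)\in Z_{k,n}(m_1)} |D_{m_1} \cap D_{m_2} \cap \dots \cap D_{m_k}|
- 2(c_2-c_1)^k \int_{\varphi(m_1)}^{\Theta} d\varphi_2 \int_{\varphi_2}^{\Theta} d\varphi_3 \cdots \int_{\varphi_{k-1}}^{\Theta} d\varphi_k\, f(\varphi_2)^2 \cdots f(\varphi_k)^2\Bigr\}\Bigr|^{2}\Bigr]. \tag{3.36}
\]
Let further $T^{-1} \le a_1 < a_2 \le T$ with T ≥ 1. Then there is a constant $C_T > 0$ which depends only on T > 0 such that for each n ≥ 1 and $\beta > L_n^{-1/2}$,
\[
\max_{1\le i\le q_n+1} U_{i,n} \le C_T\bigl(\Psi_{2k}(\beta) + L_n^{1/2}\beta^{d}\bigr)L_n^{3/2}q_n^{-2}, \tag{3.37}
\]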

where d = 1 − σ or d = 2 − τ according to which of the Conditions (II-1) or (II-2) is satisfied by P.

Take $\beta = \beta_n = An^{-1/\nu_k}$ in the above lemma. Then by (3.33) and (3.36),
\[
\begin{aligned}
E[\mathcal{E}_n(\cdot)] &\le \frac{1}{|I_{L_n}|}\sum_{i=1}^{q_n+1} \sqrt{U_{i,n}}\\
&\le \mathrm{const.}\,L_n^{-1}q_n\bigl(\sqrt{\Psi_{2k}(\beta_n)} + L_n^{1/4}\beta_n^{d/2}\bigr)L_n^{3/4}q_n^{-1}\\
&= \mathrm{const.}\bigl(L_n^{-1/4}\sqrt{\Psi_{2k}(\beta_n)} + \beta_n^{d/2}\bigr)\\
&\le \mathrm{const.}\bigl(L_n^{-1/4}\exp(\tfrac12 A^{-\nu_k}n) + n^{-d/2\nu_k}\bigr).
\end{aligned} \tag{3.38}
\]
If we take A > 0 sufficiently large, then the first term on the right-hand side decays exponentially fast, while $d/2\nu_k > 1$. Hence we have
\[
\sum_{n=1}^{\infty} E[\mathcal{E}_n(\cdot)] < \infty,
\]
which completes the proof of Theorem 3’.

When we do not assume Condition (III), $\Psi_{2k}(\beta)$ can be any positive function monotonically tending to ∞ as β ↓ 0. It is possible to choose a decreasing sequence $\{\beta_n\}$ tending to 0 so slowly that
\[
L_n^{-1/2}\Psi_{2k}(\beta_n) \longrightarrow 0 \tag{3.39}
\]
holds. Then choose a subsequence $\{n'\}$ such that
\[
\sum_{n'}\bigl(L_{n'}^{-1/4}\sqrt{\Psi_{2k}(\beta_{n'})} + \beta_{n'}^{d/2}\bigr) < \infty. \tag{3.40}
\]
Choose a number $\bar L_{n'}$ arbitrarily from the interval $[L_{n'}, L_{n'+1})$ for each $n'$. Then obviously the sequence $\{\bar L_{n'}\}$ meets the requirement of Theorem 2’.
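The requirement that A be "sufficiently large" in (3.38) can be made concrete. Since L_n = 2^n, the first term L_n^{-1/4} exp((1/2) A^{-ν} n) equals exp(n ((1/2) A^{-ν} − (ln 2)/4)), which decays exponentially exactly when A^{ν} > 2/ln 2. The sketch below computes this threshold and the first term; it is an illustration under that reading of (3.38), and the names `min_A` and `first_term` as well as the sample value of ν are hypothetical.

```python
import math


def min_A(nu: float) -> float:
    """Threshold above which 0.5 * A**(-nu) < (ln 2) / 4, for nu > 0."""
    return (2.0 / math.log(2.0)) ** (1.0 / nu)


def first_term(n: int, A: float, nu: float) -> float:
    """L_n**(-1/4) * exp(0.5 * A**(-nu) * n) with L_n = 2**n."""
    return 2.0 ** (-n / 4.0) * math.exp(0.5 * A ** (-nu) * n)
```

For A above `min_A(nu)` the values `first_term(n, A, nu)` form a decreasing geometric sequence in n, hence a summable one; for A below the threshold they grow geometrically.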


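Before closing this section, we sketch how to obtain Theorems 2 and 3 from what we have shown up to now. For this purpose, we first note that for any f ∈ X and any $0 < a_1 < a_2$,
\[
\lim_{L\to\infty} \mu_L^{a_1,a_2}\bigl(\{R > 0 \mid \tilde\xi(R; f) \ne \xi(\sqrt R; f)\}\bigr) = 0. \tag{3.41}
\]
Indeed, $\tilde\xi(R; f) \ne \xi(\sqrt R; f)$ implies
\[
\{\tilde\Pi_R(f)\,\Delta\,\Pi_{\sqrt R}(f)\} \cap \mathbf{Z}^2 \ne \emptyset, \tag{3.42}
\]
and since $a_1L \le R \le a_2L$, $b_1 \le f(\cdot) \le b_2$, it holds that
\[
\begin{aligned}
\mu_L^{a_1,a_2}\bigl(\{R > 0 \mid \tilde\xi(R; f) \ne \xi(\sqrt R; f)\}\bigr)
&\le \mu_L^{a_1,a_2}\bigl(\{R > 0 \mid (\tilde\Pi_R(f)\,\Delta\,\Pi_{\sqrt R}(f)) \cap \mathbf{Z}^2 \ne \emptyset\}\bigr)\\
&\le \sum_{m\in Z_{1,n}} \mu_L^{a_1,a_2}\bigl(\{R > 0 \mid m \in \tilde\Pi_R(f)\,\Delta\,\Pi_{\sqrt R}(f)\}\bigr).
\end{aligned} \tag{3.43}
\]
If we set for each $m \in \mathbf{Z}^2$ with $0 \le \varphi(m) \le \Theta$,
\[
J_m = \{R > 0 \mid m \in \tilde\Pi_R(f)\,\Delta\,\Pi_{\sqrt R}(f)\}, \tag{3.44}
\]
and
\[
J_m^{(i)} = \Bigl\{R > 0 \Bigm| \sqrt{R+2c_i}\,f(\varphi(m)) \le |m| \le \Bigl(\sqrt R + \frac{c_i}{\sqrt R}\Bigr)f(\varphi(m))\Bigr\},\quad i = 1, 2, \tag{3.45}
\]
then $J_m \subset J_m^{(1)} \cup J_m^{(2)}$, and it suffices for our purpose to show
\[
\lim_{L\to\infty} \sum_{m\in Z_{1,n}} \mu_L^{a_1,a_2}(J_m^{(i)}) = 0,\quad i = 1, 2. \tag{3.46}
\]
Now it is easy to see that $R \in J_m^{(i)}$ implies
\[
\frac{|m|^2}{f(\varphi(m))^2} - 2c_i - \frac{c_i^2}{a_1L} \le R \le \frac{|m|^2}{f(\varphi(m))^2} - 2c_i, \tag{3.47}
\]
and hence
\[
\mu_L^{a_1,a_2}(J_m^{(i)}) \le \frac{c_i^2}{(a_2-a_1)a_1}\,\frac{1}{L^2}. \tag{3.48}
\]
Since the number of lattice points in $Z_{1,n}$ is bounded by a constant times L, we finally arrive at
\[
\sum_{m\in Z_{1,n}} \mu_L^{a_1,a_2}(J_m^{(i)}) = O(L^{-1}) \tag{3.49}
\]
as desired.

Now fix $0 < a_1 < a_2$ and apply Theorem 3’ with $a_1^2 < a_2^2$ in use instead of $0 < a_1 < a_2$. Then from the above argument, for P-almost all f ∈ X, one has
\[
\lim_{L\to\infty} \frac{1}{L^2}\int_{a_1^2L^2}^{a_2^2L^2} \mathbf{1}_{\{\xi(\sqrt R; f)=k\}}\,dR = (a_2^2 - a_1^2)\,p_k(f), \tag{3.50}
\]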


where we have set for brevity $p_k(f) = e^{-\lambda(f)}\lambda(f)^k/k!$. On the other hand, the estimate in Lemma 4 is uniform in $a_1$ and $a_2$ such that $T^{-1} \le a_1 < a_2 \le T$, for any T ≥ 1. Hence with the same choice $\beta_n = An^{-1/\nu_k}$ as before, we see
\[
\int_{a_1<\alpha<T} d\alpha \sum_{n=1}^{\infty} E[\mathcal{E}_n(\cdot)] < \infty \tag{3.51}
\]
for any $T > a_1$. This means that for P-almost all f ∈ X and for Lebesgue almost all $\alpha > a_1$, one has
\[
\lim_{L\to\infty} \frac{1}{L^2}\int_{a_1^2L^2}^{\alpha L^2} \mathbf{1}_{\{\tilde\xi(R; f)=k\}}\,dR
= \lim_{L\to\infty} \frac{1}{L^2}\int_{a_1^2L^2}^{\alpha L^2} \mathbf{1}_{\{\xi(\sqrt R; f)=k\}}\,dR
= (\alpha - a_1^2)\,p_k(f),\quad k \ge 1. \tag{3.52}
\]
We now fix f ∈ X for which (3.50) and (3.52) are valid, and letting $g_k(R) = \mathbf{1}_{\{\xi(R; f)=k\}}$ for brevity, we further compute as follows:
\[
\begin{aligned}
\mu_L^{a_1,a_2}\bigl(\{R > 0 \mid \xi(R; f) = k\}\bigr)
&= \frac{1}{(a_2-a_1)L}\int_{a_1L}^{a_2L} g_k(R)\,dR\\
&= \frac{1}{2(a_2-a_1)L}\int_{a_1^2L^2}^{a_2^2L^2} g_k(\sqrt s)\,\frac{ds}{\sqrt s}\\
&= \frac{1}{2(a_2-a_1)a_2L^2}\int_{a_1^2L^2}^{a_2^2L^2} g_k(\sqrt s)\,ds
+ \frac{1}{2(a_2-a_1)}\int_{a_1}^{a_2} \frac{d\tau}{\tau^2}\,\frac{1}{L^2}\int_{a_1^2L^2}^{\tau^2L^2} g_k(\sqrt s)\,ds.
\end{aligned} \tag{3.53}
\]
The first term on the right-hand side converges to $\frac{a_2^2-a_1^2}{2(a_2-a_1)a_2}\,p_k(f)$ as L → ∞ because of (3.50). On the other hand, we have
\[
\frac{1}{L^2}\int_{a_1^2L^2}^{\tau^2L^2} g_k(\sqrt s)\,ds \le a_2^2 - a_1^2 \tag{3.54}
\]
for all $\tau \in [a_1, a_2]$ and
\[
\lim_{L\to\infty} \frac{1}{L^2}\int_{a_1^2L^2}^{\tau^2L^2} g_k(\sqrt s)\,ds = (\tau^2 - a_1^2)\,p_k(f) \tag{3.55}
\]
for Lebesgue almost all $\tau \in [a_1, a_2]$ because of (3.52). Now apply Lebesgue’s dominated convergence theorem to the second term on the right-hand side, to obtain
\[
\lim_{L\to\infty} \mu_L^{a_1,a_2}\bigl(\{R > 0 \mid \xi(R; f) = k\}\bigr)
= \frac{a_2^2-a_1^2}{2(a_2-a_1)a_2}\,p_k(f) + \frac{1}{2(a_2-a_1)}\int_{a_1}^{a_2} \frac{d\tau}{\tau^2}(\tau^2 - a_1^2)\,p_k(f)
= p_k(f), \tag{3.56}
\]
completing the proof of Theorem 3. Finally let $\{\bar L_n\}$ be the sequence taken according to Theorem 2’. We can repeat the above argument to see that the sequence $\{\sqrt{\bar L_n}\}$ meets the requirement of Theorem 2.
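The last equality in (3.56) amounts to the statement that the coefficient of p_k(f), namely (a2² − a1²)/(2(a2 − a1)a2) plus (1/(2(a2 − a1))) times the integral of (τ² − a1²)/τ² over [a1, a2], equals 1. In closed form the two pieces are (a2 + a1)/(2 a2) and (a2 − a1)/(2 a2). The sketch below confirms this numerically with a midpoint rule; the function name and the sample values of a1 and a2 are hypothetical.

```python
def limit_constant(a1: float, a2: float, steps: int = 100000) -> float:
    """Coefficient of p_k(f) on the right-hand side of (3.56), by midpoint quadrature."""
    first = (a2 ** 2 - a1 ** 2) / (2.0 * (a2 - a1) * a2)
    h = (a2 - a1) / steps
    integral = 0.0
    for i in range(steps):
        tau = a1 + (i + 0.5) * h
        integral += (tau * tau - a1 * a1) / (tau * tau)
    return first + integral * h / (2.0 * (a2 - a1))
```

For instance `limit_constant(0.7, 2.3)` agrees with 1 up to quadrature error, independently of the choice 0 < a1 < a2.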


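4. Proof of Lemma 4

Let us fix an $i \in \{1,\dots,1+q_n\}$ and a T > 1 and assume $T^{-1} \le a_1 < a_2 \le T$. We make the following definitions:
\[
Q^i(m_1,\dots,m_k; m'_1,\dots,m'_k) = Q^i(m; m') = E_P\bigl[\mathbf{1}_{\{\gamma_{m_1},\gamma_{m'_1}\in[z_{i-1},z_i)\}}\,|D_{m_1} \cap \dots \cap D_{m_k}|\,|D_{m'_1} \cap \dots \cap D_{m'_k}|\bigr], \tag{4.1}
\]
where we write $(m, m') = (m_1,\dots,m_k; m'_1,\dots,m'_k)$ for brevity;
\[
X_n^i = \sum_{(m;m')\in Z_{k,n}\times Z_{k,n}} Q^i(m; m'); \tag{4.2}
\]
\[
R^i(m_1,\dots,m_k; m'_1) = R^i(m; m'_1) = E_P\Bigl[\mathbf{1}_{\{\gamma_{m_1},\gamma_{m'_1}\in[z_{i-1},z_i)\}}\,|D_{m_1} \cap \dots \cap D_{m_k}|
\times \int_{\varphi(m'_1)}^{\Theta} d\varphi_2 \int_{\varphi_2}^{\Theta} d\varphi_3 \cdots \int_{\varphi_{k-1}}^{\Theta} d\varphi_k\, f(\varphi_2)^2 \cdots f(\varphi_k)^2\Bigr]; \tag{4.3}
\]
\[
Y_n^i = \sum_{(m;m'_1)\in Z_{k,n}\times Z_{1,n}} R^i(m; m'_1); \tag{4.4}
\]
\[
S^i(m_1, m'_1) = E_P\Bigl[\mathbf{1}_{\{\gamma_{m_1},\gamma_{m'_1}\in[z_{i-1},z_i)\}}
\times \int_{\varphi(m_1)}^{\Theta} d\varphi_2 \int_{\varphi_2}^{\Theta} d\varphi_3 \cdots \int_{\varphi_{k-1}}^{\Theta} d\varphi_k\, f(\varphi_2)^2 \cdots f(\varphi_k)^2
\times \int_{\varphi(m'_1)}^{\Theta} d\varphi'_2 \int_{\varphi'_2}^{\Theta} d\varphi'_3 \cdots \int_{\varphi'_{k-1}}^{\Theta} d\varphi'_k\, f(\varphi'_2)^2 \cdots f(\varphi'_k)^2\Bigr]; \tag{4.5}
\]
\[
Z_n^i = \sum_{(m_1,m'_1)\in Z_{1,n}\times Z_{1,n}} S^i(m_1, m'_1). \tag{4.6}
\]
Then
\[
U_{i,n} = X_n^i - 4(c_2-c_1)^k Y_n^i + 4(c_2-c_1)^{2k} Z_n^i. \tag{4.7}
\]
We shall prove that for some positive constant $C_T$ which depends only on T, the following two inequalities hold as far as we have $T^{-1} \le a_1 < a_2 \le T$:
\[
|X_n^i - 4(c_2-c_1)^{2k} Z_n^i| \le C_T\bigl(L_n^{3/2}q_n^{-2}(\Psi_{2k}(\beta) + L_n^{1/2}\beta^{d})\bigr); \tag{4.8}
\]
and
\[
|Y_n^i - 2(c_2-c_1)^{k} Z_n^i| \le C_T\bigl(L_n^{3/2}q_n^{-2}(\Psi_{2k}(\beta) + L_n^{1/2}\beta^{d})\bigr), \tag{4.9}
\]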

from which the assertion of Lemma 4 follows immediately.

At this point, we make the following convention. We denote by C or $C_T$ positive constants, of which we are not interested in exact values, and which may differ from inequality to inequality. We distinguish $C_T$ from C when it is dependent on T, while C does not depend on $a_1$ and $a_2$. Similarly, when something is bounded by C [resp. $C_T$] times something else, we shall say it is of O(·) [resp. $O_T(\cdot)$].

In order to estimate $X_n^i$, let us make the representation (4.1) of $Q^i(m; m')$ more explicit. To begin with, from Definition (3.8) of $\gamma_m$, we see that
\[
\gamma_{m_1} \in [z_{i-1}, z_i) \iff \frac{|m_1|}{\sqrt{z_i}} < f(\varphi(m_1)) \le \frac{|m_1|}{\sqrt{z_{i-1}}}, \tag{4.10}
\]
where
\[
\frac{|m_1|}{\sqrt{z_{i-1}}} - \frac{|m_1|}{\sqrt{z_i}} = \frac{|m_1|}{\sqrt{z_i}\sqrt{z_{i-1}}}\,\frac{z_i - z_{i-1}}{\sqrt{z_i}+\sqrt{z_{i-1}}} = O\bigl((a_2/a_1)^{3/2}q_n^{-1}\bigr) = O_T(q_n^{-1}), \tag{4.11}
\]
if $m_1 \in Z_{1,n}$. Recall (3.9) and $a_1L_n \le z_i \le a_2L_{n+1}$. Here we have simply written $O_T$ because the dependence of the estimate on $a_1$ and $a_2$ need not be made explicit. Next, writing $x_+ = \max\{x, 0\} = x \vee 0$ for $x \in \mathbf{R}$, we obtain from (3.7) and (3.8),
\[
|D_{m_1} \cap \dots \cap D_{m_k}| = \Bigl(A - 0 \vee \max_{2\le j\le k}(\gamma_{m_j} - \gamma_{m_1}) + 0 \wedge \min_{2\le j\le k}(\gamma_{m_j} - \gamma_{m_1})\Bigr)_+, \tag{4.12}
\]
where we have let $A = 2(c_2 - c_1)$ for brevity. Letting $y_s = f(\varphi(m_s))$ and $y'_t = f(\varphi(m'_t))$, we can write, when $\varphi(m_s)$, $\varphi(m'_t)$, s, t = 1, . . . , k are all different,
\[
\begin{aligned}
Q^i(m; m') = {}& \int_{|m_1|/\sqrt{z_i}}^{|m_1|/\sqrt{z_{i-1}}} dy_1 \int_{|m'_1|/\sqrt{z_i}}^{|m'_1|/\sqrt{z_{i-1}}} dy'_1 \int_0^\infty dy_2 \cdots \int_0^\infty dy_k \int_0^\infty dy'_2 \cdots \int_0^\infty dy'_k\\
&\times \Bigl(A - 0 \vee \max_{2\le s\le k}\Bigl(\frac{|m_s|^2}{y_s^2} - \frac{|m_1|^2}{y_1^2}\Bigr) + 0 \wedge \min_{2\le s\le k}\Bigl(\frac{|m_s|^2}{y_s^2} - \frac{|m_1|^2}{y_1^2}\Bigr)\Bigr)_+\\
&\times \Bigl(A - 0 \vee \max_{2\le t\le k}\Bigl(\frac{|m'_t|^2}{y'^2_t} - \frac{|m'_1|^2}{y'^2_1}\Bigr) + 0 \wedge \min_{2\le t\le k}\Bigl(\frac{|m'_t|^2}{y'^2_t} - \frac{|m'_1|^2}{y'^2_1}\Bigr)\Bigr)_+\\
&\times p_{2k}\bigl(y_1,\dots,y_k, y'_1,\dots,y'_k \bigm| \varphi(m_1),\dots,\varphi(m_k), \varphi(m'_1),\dots,\varphi(m'_k)\bigr).
\end{aligned} \tag{4.13}
\]
Note that $p_{2k}$, as a function of y’s, is supported on the set of $(y_1,\dots,y_k, y'_1,\dots,y'_k)$ such that $b_1 \le y_s, y'_t \le b_2$, s, t = 1, . . . , k. Let us make the following change of variables:
\[
\zeta_s = \frac{|m_s|^2}{y_s^2} - \frac{|m_1|^2}{y_1^2};\qquad \zeta'_s = \frac{|m'_s|^2}{y'^2_s} - \frac{|m'_1|^2}{y'^2_1},\quad s = 2,\dots,k. \tag{4.14}
\]
Then we have
\[
y_s = \frac{|m_s|}{\ell_s},\qquad y'_s = \frac{|m'_s|}{\ell'_s},\quad s = 2,\dots,k, \tag{4.15}
\]
and
\[
dy_s = -\frac12\,\frac{|m_s|}{\ell_s^3}\,d\zeta_s,\qquad dy'_s = -\frac12\,\frac{|m'_s|}{\ell'^3_s}\,d\zeta'_s,\quad s = 2,\dots,k, \tag{4.16}
\]
where we have defined
\[
\ell_s = \ell_s(y_1, m_1, \zeta_s) = \frac{|m_1|}{y_1}\sqrt{1 + \frac{y_1^2}{|m_1|^2}\,\zeta_s}, \tag{4.17}
\]
and similarly for $\ell'_s = \ell_s(y'_1, m'_1, \zeta'_s)$. Equation (4.13) can now be written as follows:

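\[
\begin{aligned}
Q^i(m; m') = {}& \int_{|m_1|/\sqrt{z_i}}^{|m_1|/\sqrt{z_{i-1}}} dy_1 \int_{|m'_1|/\sqrt{z_i}}^{|m'_1|/\sqrt{z_{i-1}}} dy'_1 \int_{-A}^{A} d\zeta_2 \cdots \int_{-A}^{A} d\zeta_k \int_{-A}^{A} d\zeta'_2 \cdots \int_{-A}^{A} d\zeta'_k\\
&\times \Bigl(A - (0 \vee \max_{2\le s\le k}\zeta_s) + (0 \wedge \min_{2\le s\le k}\zeta_s)\Bigr)_+ \times \Bigl(A - (0 \vee \max_{2\le t\le k}\zeta'_t) + (0 \wedge \min_{2\le t\le k}\zeta'_t)\Bigr)_+\\
&\times \prod_{s=2}^{k}\Bigl(\frac12\,\frac{|m_s|}{\ell_s}\,\frac{1}{\ell_s^2}\cdot\frac12\,\frac{|m'_s|}{\ell'_s}\,\frac{1}{\ell'^2_s}\Bigr)\\
&\times p_{2k}\Bigl(y_1, \frac{|m_2|}{\ell_2},\dots,\frac{|m_k|}{\ell_k}, y'_1, \frac{|m'_2|}{\ell'_2},\dots,\frac{|m'_k|}{\ell'_k} \Bigm| \varphi(m_1),\dots,\varphi(m_k), \varphi(m'_1),\dots,\varphi(m'_k)\Bigr).
\end{aligned} \tag{4.18}
\]
Here the variables ζ’s range in (−A, A) because the integrand vanishes if at least one of them is outside of this interval.

Given $\beta > L_n^{-1/2}$, define a subset $W_n(\beta)$ of $Z_{k,n} \times Z_{k,n}$ to be the totality of those $(m; m') = (m_1,\dots,m_k; m'_1,\dots,m'_k)$ such that
\[
\varphi(m_s) - \varphi(m_{s-1}) \ge \beta,\qquad \varphi(m'_s) - \varphi(m'_{s-1}) \ge \beta,\quad s = 2,\dots,k \tag{4.19}
\]
and that
\[
|\varphi(m_s) - \varphi(m'_t)| \ge \beta,\quad s, t = 1,\dots,k. \tag{4.20}
\]
Then
\[
X_n^i = \Bigl(\sum_{W_n(\beta)} + \sum_{(Z_{k,n}\times Z_{k,n})\setminus W_n(\beta)}\Bigr) Q^i(m; m'). \tag{4.21}
\]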

Let $X_n^i(\beta)$ be the first sum on the right-hand side of (4.21). If for given $m_1, m'_1 \in Z_{1,n}$ with $|\varphi(m_1) - \varphi(m'_1)| \ge \beta$ one defines
\[
W_n(\beta; m_1, m'_1) = \bigl\{(m_2,\dots,m_k; m'_2,\dots,m'_k);\ (m_1, m_2,\dots,m_k; m'_1, m'_2,\dots,m'_k) \in W_n(\beta)\bigr\}, \tag{4.22}
\]
then
\[
X_n^i(\beta) = \sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|\ge\beta}}\ \sum_{W_n(\beta;m_1,m'_1)} Q^i(m_1,\dots,m_k; m'_1,\dots,m'_k). \tag{4.23}
\]
Fix $m_1, m'_1 \in Z_{1,n}$ such that
\[
|\varphi(m_1) - \varphi(m'_1)| \ge \beta \tag{4.24}
\]
and $y_1, y'_1$ such that
\[
\frac{|m_1|}{\sqrt{z_i}} \le y_1 \le \frac{|m_1|}{\sqrt{z_{i-1}}};\qquad \frac{|m'_1|}{\sqrt{z_i}} \le y'_1 \le \frac{|m'_1|}{\sqrt{z_{i-1}}}. \tag{4.25}
\]
Take also $\zeta_s, \zeta'_s$, s = 2, . . . , k from the interval (−A, A). Then for some positive constant $C_T > 0$, it holds that
\[
C_T^{-1}L_n^{1/2} \le \ell_s,\ \ell'_s \le C_TL_n^{1/2},\quad s = 2,\dots,k. \tag{4.26}
\]

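Now we shall evaluate approximately the sum
\[
\sum_{\substack{(m_2,\dots,m_k;m'_2,\dots,m'_k)\\ \in W_n(\beta;m_1,m'_1)}}\ \prod_{s=2}^{k}\Bigl(\frac12\,\frac{|m_s|}{\ell_s}\,\frac{1}{\ell_s^2}\cdot\frac12\,\frac{|m'_s|}{\ell'_s}\,\frac{1}{\ell'^2_s}\Bigr)
\times p_{2k}\Bigl(y_1, \frac{|m_2|}{\ell_2},\dots,\frac{|m_k|}{\ell_k}, y'_1, \frac{|m'_2|}{\ell'_2},\dots,\frac{|m'_k|}{\ell'_k} \Bigm| \varphi(m_1),\dots,\varphi(m_k), \varphi(m'_1),\dots,\varphi(m'_k)\Bigr), \tag{4.27}
\]
the approximation being uniform in $\zeta_2,\dots,\zeta_k, \zeta'_2,\dots,\zeta'_k$. Note that the summand appears as a part of the integrand in (4.18). For this purpose, we introduce a function
\[
F = F_{m_1,m'_1,y_1,y'_1}(x_2,\dots,x_k; x'_2,\dots,x'_k) \equiv \Bigl(\prod_{s=2}^{k} \frac{|x_s|}{2}\,\frac{|x'_s|}{2}\Bigr)\,p_{2k}\bigl(y_1, |x_2|,\dots,|x_k|, y'_1, |x'_2|,\dots,|x'_k| \bigm| \varphi(m_1), \varphi(x_2),\dots,\varphi(x_k), \varphi(m'_1), \varphi(x'_2),\dots,\varphi(x'_k)\bigr), \tag{4.28}
\]
which is defined on the set
\[
D_{m_1,m'_1} \equiv \bigl\{(x_2,\dots,x_k; x'_2,\dots,x'_k) \in (\mathbf{R}^2)^{2(k-1)} \bigm| b_1 \le |x_s|, |x'_s| \le b_2,\ \varphi(m_1) < \varphi(x_2) < \dots < \varphi(x_k);\ \varphi(m'_1) < \varphi(x'_2) < \dots < \varphi(x'_k)\bigr\}. \tag{4.29}
\]
It is easy to see that for some C > 0,
\[
\Bigl|\frac{\partial F}{\partial x_j}\Bigr|,\ \Bigl|\frac{\partial F}{\partial x'_j}\Bigr| \le C\bigl(\psi_{2k}(\theta) + \Psi_{2k}(\theta)\bigr) \tag{4.30}
\]
holds on $D_{m_1,m'_1}$, where θ is the minimum of $\varphi(x_2) - \varphi(m_1)$, $\varphi(x_3) - \varphi(x_2)$, . . . , etc. In view of Conditions (II) and (III), we may suppose $\psi_{2k}(\theta) < \Psi_{2k}(\theta)$, so that (4.30) is actually
\[
\Bigl|\frac{\partial F}{\partial x_j}\Bigr|,\ \Bigl|\frac{\partial F}{\partial x'_j}\Bigr| \le C\,\Psi_{2k}(\theta). \tag{4.31}
\]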


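The sum (4.27) can then be viewed as a Riemann sum approximation for the integral of the function F, and (4.26), (4.31) can be used to estimate its accuracy. Namely (4.27) is equal to
\[
\begin{aligned}
&\sum_{\substack{(m_2,\dots,m_k;m'_2,\dots,m'_k)\\ \in W_n(\beta;m_1,m'_1)}}\ \Bigl(\prod_{s=2}^{k} \frac{1}{\ell_s^2}\,\frac{1}{\ell'^2_s}\Bigr)\,F_{m_1,m'_1,y_1,y'_1}\Bigl(\frac{m_2}{\ell_2},\dots,\frac{m_k}{\ell_k}, \frac{m'_2}{\ell'_2},\dots,\frac{m'_k}{\ell'_k}\Bigr)\\
&\qquad = \int dx_2 \cdots \int dx_k \int dx'_2 \cdots \int dx'_k\ F_{m_1,m'_1,y_1,y'_1}(x_2,\dots,x_k, x'_2,\dots,x'_k) + O_T\bigl(L_n^{-1/2}\Psi_{2k}(\beta)\bigr),
\end{aligned} \tag{4.32}
\]
where the integration ranges over those $(x_2,\dots,x_k, x'_2,\dots,x'_k) \in D_{m_1,m'_1}$ for which
\[
\begin{aligned}
&\varphi(x_2) - \varphi(m_1) \ge \beta,\quad \varphi(x_s) - \varphi(x_{s-1}) \ge \beta,\quad s = 3,\dots,k;\\
&\varphi(x'_2) - \varphi(m'_1) \ge \beta,\quad \varphi(x'_t) - \varphi(x'_{t-1}) \ge \beta,\quad t = 3,\dots,k;\\
&|\varphi(x_s) - \varphi(x'_t)| \ge \beta,\quad s, t = 2,\dots,k.
\end{aligned} \tag{4.33}
\]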
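Transforming each of $x_s$ and $x'_s$ into the polar coordinates, the above integral becomes
\[
\begin{aligned}
&\int d\varphi_2 \cdots \int d\varphi_k \int d\varphi'_2 \cdots \int d\varphi'_k \int_0^\infty dr_2 \cdots \int_0^\infty dr_k \int_0^\infty dr'_2 \cdots \int_0^\infty dr'_k \ \prod_{s=2}^{k} \frac{r_s^2}{2}\,\frac{r'^2_s}{2}\\
&\qquad \times p_{2k}\bigl(y_1, r_2,\dots,r_k, y'_1, r'_2,\dots,r'_k \bigm| \varphi(m_1), \varphi_2,\dots,\varphi_k, \varphi(m'_1), \varphi'_2,\dots,\varphi'_k\bigr),
\end{aligned} \tag{4.34}
\]
where the integration ranges over those $\varphi_s, \varphi'_s$ such that
\[
\varphi(x_2) - \varphi(m_1) \ge \beta,\ \ \varphi(x_s) - \varphi(x_{s-1}) \ge \beta,\ s = 3,\dots,k;\quad
\varphi(x'_2) - \varphi(m'_1) \ge \beta,\ \ \varphi(x'_t) - \varphi(x'_{t-1}) \ge \beta,\ t = 3,\dots,k; \tag{4.35}
\]
and that
\[
|\varphi_s - \varphi'_t| \ge \beta,\quad s, t = 1,\dots,k. \tag{4.36}
\]
If we integrate out with respect to the variables $r_s, r'_s$, s = 2, . . . , k, we see without difficulty that the difference made by letting β = 0 in the range of integration of (4.34) is less than
\[
C\beta\,p_2\bigl(y_1, y'_1 \bigm| \varphi(m_1), \varphi(m'_1)\bigr). \tag{4.37}
\]
We have thus approximated the sum (4.27) uniformly in the parameters $\zeta_2,\dots,\zeta_k, \zeta'_2,\dots,\zeta'_k$. In order to finish the estimation of $X_n^i(\beta)$ (recall (4.23) for the definition), we need to get an approximation of the sum of $Q^i(m_1,\dots,m_k; m'_1,\dots,m'_k)$ over $W_n(\beta; m_1, m'_1)$. This is done by integrating the sum (4.27) over parameters $y_1, y'_1, \zeta_2,\dots,\zeta_k, \zeta'_2,\dots,\zeta'_k$. By noting the above mentioned uniformity of the integral approximation in $\zeta_s$ and $\zeta'_s$, and by using the formula
\[
\int_{-A}^{A} d\zeta_2 \cdots \int_{-A}^{A} d\zeta_k\ \{A - (0 \vee \zeta_2 \vee \dots \vee \zeta_k) + (0 \wedge \zeta_2 \wedge \dots \wedge \zeta_k)\}_+ = A^k,\quad k \ge 2, \tag{4.38}
\]
which we shall prove as Lemma 5 at the end of this section, we obtain
\[
\begin{aligned}
X_n^i(\beta) = {}& A^{2k} \sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|\ge\beta}} \int_{|m_1|/\sqrt{z_i}}^{|m_1|/\sqrt{z_{i-1}}} dy_1 \int_{|m'_1|/\sqrt{z_i}}^{|m'_1|/\sqrt{z_{i-1}}} dy'_1\\
&\times \Bigl\{\int\!\cdots\!\int_{\varphi(m_1)<\varphi_2<\dots<\varphi_k} d\varphi_2 \cdots d\varphi_k \int\!\cdots\!\int_{\varphi(m'_1)<\varphi'_2<\dots<\varphi'_k} d\varphi'_2 \cdots d\varphi'_k \int_0^\infty dr_2 \cdots \int_0^\infty dr_k \int_0^\infty dr'_2 \cdots \int_0^\infty dr'_k\ \prod_{s=2}^{k} \frac{r_s^2}{2}\,\frac{r'^2_s}{2}\\
&\qquad \times p_{2k}\bigl(y_1, r_2,\dots,r_k, y'_1, r'_2,\dots,r'_k \bigm| \varphi(m_1), \varphi_2,\dots,\varphi_k, \varphi(m'_1), \varphi'_2,\dots,\varphi'_k\bigr)\\
&\qquad + O_T\bigl(L_n^{-1/2}\Psi_{2k}(\beta) + \beta\,p_2(y_1, y'_1 \mid \varphi(m_1), \varphi(m'_1))\bigr)\Bigr\}\\
= {}& 4(c_2-c_1)^{2k} \sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|\ge\beta}} S^i(m_1, m'_1) + O_T\bigl(L_n^{3/2}q_n^{-2}\Psi_{2k}(\beta)\bigr)\\
&+ O\Bigl(\sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|\ge\beta}} \beta \int_{|m_1|/\sqrt{z_i}}^{|m_1|/\sqrt{z_{i-1}}} dy_1 \int_{|m'_1|/\sqrt{z_i}}^{|m'_1|/\sqrt{z_{i-1}}} dy'_1\ p_2\bigl(y_1, y'_1 \bigm| \varphi(m_1), \varphi(m'_1)\bigr)\Bigr),
\end{aligned} \tag{4.39}
\]
where we have used (4.11) and $\sharp(Z_{1,n} \times Z_{1,n}) = O_T(L_n^2)$ to get the bound of the first error term. See also (4.5) for the definition of $S^i(m_1, m'_1)$.

In order to estimate the second error term, let us first consider the case where our probability measure P satisfies Condition (II-1). Then the sum
\[
\sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|\ge\beta}} \beta \int_{|m_1|/\sqrt{z_i}}^{|m_1|/\sqrt{z_{i-1}}} dy_1 \int_{|m'_1|/\sqrt{z_i}}^{|m'_1|/\sqrt{z_{i-1}}} dy'_1\ p_2\bigl(y_1, y'_1 \bigm| \varphi(m_1), \varphi(m'_1)\bigr) \tag{4.40}
\]
is bounded by
\[
C\beta q_n^{-2} \sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|\ge\beta}} |\varphi(m_1) - \varphi(m'_1)|^{-\sigma}
\le C\beta\beta^{-\sigma}q_n^{-2} \sum_{m_1,m'_1\in Z_{1,n}} 1
\le C_TL_n^2q_n^{-2}\beta^{1-\sigma}. \tag{4.41}
\]
Next suppose that P satisfies Condition (II-2). Then we have
\[
p_2\bigl(y_1, y'_1 \bigm| \varphi(m_1), \varphi(m'_1)\bigr) \le C\,\mathbf{1}_{\{|y_1-y'_1|\le b_3|\varphi(m_1)-\varphi(m'_1)|\}}\,|\varphi(m_1) - \varphi(m'_1)|^{-\tau}. \tag{4.42}
\]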


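Then the sum (4.40) is bounded by
\[
C\beta \int_{b_1}^{b_2} dy_1 \int_{b_1}^{b_2} dy'_1 \sum_{m_1,m'_1\in Z_{1,n}} \mathbf{1}_{\{y_1\sqrt{z_{i-1}}\le|m_1|\le y_1\sqrt{z_i}\}}
\times \mathbf{1}_{\{y'_1\sqrt{z_{i-1}}\le|m'_1|\le y'_1\sqrt{z_i}\}}\,\mathbf{1}_{\{|y_1-y'_1|\vee(b_3\beta)\le|\varphi(m_1)-\varphi(m'_1)|\}}
\times |\varphi(m_1) - \varphi(m'_1)|^{-\tau}. \tag{4.43}
\]
For fixed $y_1$, $y'_1$ and $m_1$, we shall estimate
\[
\sum_{m'_1\in Z_{1,n}} \mathbf{1}_{\{y'_1\sqrt{z_{i-1}}\le|m'_1|\le y'_1\sqrt{z_i}\}}\,\mathbf{1}_{\{|y_1-y'_1|\vee(b_3\beta)\le|\varphi(m_1)-\varphi(m'_1)|\}}\,|\varphi(m_1) - \varphi(m'_1)|^{-\tau}. \tag{4.44}
\]
It is sufficient to do this assuming $\varphi(m'_1) > \varphi(m_1)$. To this end, let us divide the interval $[\beta \vee (|y_1 - y'_1|/b_3), \Theta]$ into subintervals $\Delta_j$ of length $L_n^{-1/2}$. Since the number of those $m'_1 \in Z_{1,n}$ for which $y'_1\sqrt{z_{i-1}} \le |m'_1| \le y'_1\sqrt{z_i}$ and $\varphi(m'_1) \in \Delta_j$ hold is bounded by $C_TL_n^{1/2}q_n^{-1}$, the sum (4.44) is less than
\[
C_TL_n^{1/2}q_n^{-1} \sum_{j=1}^{\infty} \Bigl\{jL_n^{-1/2} + \Bigl(\beta \vee \frac{|y_1-y'_1|}{b_3}\Bigr)\Bigr\}^{-\tau}
\le C_TL_nq_n^{-1} \int_{\beta\vee(|y_1-y'_1|/b_3)}^{\infty} t^{-\tau}\,dt
= C_TL_nq_n^{-1}\,(\beta^{1-\tau}) \wedge (|y_1-y'_1|^{1-\tau}). \tag{4.45}
\]
Inserting this estimate into (4.43), we obtain the following new bound of (4.40):
\[
\begin{aligned}
&C_T\beta \int_{b_1}^{b_2} dy_1 \int_{b_1}^{b_2} dy'_1 \Bigl(\sum_{m_1\in Z_{1,n}} \mathbf{1}_{\{y_1\sqrt{z_{i-1}}\le|m_1|\le y_1\sqrt{z_i}\}}\Bigr) L_nq_n^{-1}\{\beta^{1-\tau} \wedge |y_1-y'_1|^{1-\tau}\}\\
&\qquad \le C_T\beta(L_nq_n^{-1})^2 \int_{b_1}^{b_2} dy_1 \int_{b_1}^{b_2} dy'_1\ \{\beta^{1-\tau} \wedge |y_1-y'_1|^{1-\tau}\}
= C_TL_n^2q_n^{-2}\beta^{2-\tau}.
\end{aligned} \tag{4.46}
\]
Returning to (4.39), we have thus proved
\[
X_n^i(\beta) = 4(c_2-c_1)^{2k} \sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|\ge\beta}} S^i(m_1, m'_1) + O_T\bigl(L_n^{3/2}q_n^{-2}\Psi_{2k}(\beta) + L_n^2q_n^{-2}\beta^{d}\bigr), \tag{4.47}
\]
where d = 1 − σ or d = 2 − τ. (Recall $A = 2(c_2 - c_1)$.) To finish the discussion on $X_n^i(\beta)$, we estimate the error
\[
\sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|<\beta}} S^i(m_1, m'_1) \tag{4.48}
\]
which is made by dropping the condition $|\varphi(m_1) - \varphi(m'_1)| \ge \beta$ under the summation symbol in (4.47). For this purpose, we divide the sum (4.48) into two parts, namely we rewrite (4.48) as
\[
\sum_{0<|\varphi(m_1)-\varphi(m'_1)|<\beta} S^i(m_1, m'_1) + \sum_{\varphi(m_1)=\varphi(m'_1)} S^i(m_1, m'_1) \equiv J_1 + J_2 \tag{4.49}
\]
when Condition (II-1) holds, or
\[
\sum_{L_n^{-1/2}\le|\varphi(m_1)-\varphi(m'_1)|<\beta} S^i(m_1, m'_1) + \sum_{|\varphi(m_1)-\varphi(m'_1)|<L_n^{-1/2}} S^i(m_1, m'_1) \equiv K_1 + K_2 \tag{4.50}
\]
when Condition (II-2) holds. Note that if $\varphi(m_1) \ne \varphi(m'_1)$,
\[
S^i(m_1, m'_1) \le C\,P\bigl(\gamma_{m_1}, \gamma_{m'_1} \in [z_{i-1}, z_i)\bigr)
= C \int_{|m_1|/\sqrt{z_i}}^{|m_1|/\sqrt{z_{i-1}}} dy_1 \int_{|m'_1|/\sqrt{z_i}}^{|m'_1|/\sqrt{z_{i-1}}} dy'_1\ p_2\bigl(y_1, y'_1 \bigm| \varphi(m_1), \varphi(m'_1)\bigr). \tag{4.51}
\]
Suppose that Condition (II-1) is satisfied. Then by (4.11),
\[
S^i(m_1, m'_1) \le C_Tq_n^{-2}\,|\varphi(m_1) - \varphi(m'_1)|^{-\sigma} \tag{4.52}
\]
with σ ∈ (0, 1). Hence applying Lemma 1 with $R = L_n^{1/2}$ and $\beta > L_n^{-1/2}$, we see
\[
J_1 \le C_TL_n^2q_n^{-2}\beta^{1-\sigma}. \tag{4.53}
\]
On the other hand, if $\varphi(m_1) = \varphi(m'_1)$, one will have
\[
S^i(m_1, m'_1) \le C \int_{b_1}^{b_2} dy_1\ \mathbf{1}_{\{y_1\sqrt{z_{i-1}}\le|m_1|,|m'_1|\le y_1\sqrt{z_i}\}}. \tag{4.54}
\]
Since $y_1(\sqrt{z_i} - \sqrt{z_{i-1}}) = O_T(L_n^{1/2}q_n^{-1})$, we have
\[
J_2 \le C_TL_n^{3/2}q_n^{-2}. \tag{4.55}
\]
Next suppose that Condition (II-2) holds. Then if $\varphi(m_1) \ne \varphi(m'_1)$,
\[
S^i(m_1, m'_1) \le C \int_{b_1}^{b_2} dy_1 \int_{b_1}^{b_2} dy'_1\ \mathbf{1}_{\{y_1\sqrt{z_{i-1}}\le|m_1|\le y_1\sqrt{z_i}\}}\,\mathbf{1}_{\{y'_1\sqrt{z_{i-1}}\le|m'_1|\le y'_1\sqrt{z_i}\}}
\times \mathbf{1}_{\{|y_1-y'_1|\le b_3|\varphi(m_1)-\varphi(m'_1)|\}}\,|\varphi(m_1) - \varphi(m'_1)|^{-\tau}. \tag{4.56}
\]
Hence we have
\[
\begin{aligned}
K_1 \le {}& C \int_{b_1}^{b_2} dy_1 \int_{b_1}^{b_2} dy'_1\ \mathbf{1}_{\{|y_1-y'_1|\le\beta\}} \sum_{m_1\in Z_{1,n}} \mathbf{1}_{\{y_1\sqrt{z_{i-1}}\le|m_1|\le y_1\sqrt{z_i}\}}\\
&\times \sum_{m'_1\in Z_{1,n}} \mathbf{1}_{\{y'_1\sqrt{z_{i-1}}\le|m'_1|\le y'_1\sqrt{z_i}\}}\,\mathbf{1}_{\{|y_1-y'_1|\vee(b_3L_n^{-1/2})\le|\varphi(m_1)-\varphi(m'_1)|\}}
\times |\varphi(m_1) - \varphi(m'_1)|^{-\tau}.
\end{aligned} \tag{4.57}
\]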


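Let us estimate the summation over $m'_1$. We can assume $\varphi(m'_1) > \varphi(m_1)$. As we did in the estimation of the sum (4.44), we divide the interval $[L_n^{-1/2} \vee (\frac{1}{b_3}|y_1 - y'_1|), \Theta]$ into subintervals of length $L_n^{-1/2}$, and obtain
\[
\begin{aligned}
&\sum_{m'_1\in Z_{1,n}} \mathbf{1}_{\{y'_1\sqrt{z_{i-1}}\le|m'_1|\le y'_1\sqrt{z_i}\}}\,\mathbf{1}_{\{|y_1-y'_1|\vee(b_3L_n^{-1/2})\le|\varphi(m_1)-\varphi(m'_1)|\}}\,|\varphi(m_1) - \varphi(m'_1)|^{-\tau}\\
&\qquad \le C_TL_n^{1/2}q_n^{-1} \sum_{j=0}^{\infty} \Bigl\{\Bigl(L_n^{-1/2} \vee \frac{1}{b_3}|y_1-y'_1|\Bigr) + jL_n^{-1/2}\Bigr\}^{-\tau}
\le C_TL_nq_n^{-1}\Bigl(L_n^{-1/2} \vee \frac{1}{b_3}|y_1-y'_1|\Bigr)^{1-\tau}.
\end{aligned} \tag{4.58}
\]
Inserting this inequality into (4.57), one easily gets the bound
\[
K_1 \le C_TL_n^2q_n^{-2}\beta^{2-\tau}. \tag{4.59}
\]
We now turn to estimating $K_2$. By $\sqrt{z_i} \le C_TL_n^{1/2}$ and $|\varphi(m_1) - \varphi(m'_1)| \le L_n^{-1/2}$, one easily sees that there is a constant $C_T > 0$ such that the condition
\[
f(\varphi(m'_1))\sqrt{z_{i-1}} \le |m'_1| \le f(\varphi(m'_1))\sqrt{z_i} \tag{4.60}
\]
implies
\[
f(\varphi(m_1))\sqrt{z_{i-1}} - C_T \le |m'_1| \le f(\varphi(m_1))\sqrt{z_i} + C_T. \tag{4.61}
\]
For each fixed $m_1 \in Z_{1,n}$, the number of $m'_1$ which satisfies the Condition (4.61) and $|\varphi(m_1) - \varphi(m'_1)| < L_n^{-1/2}$ is again bounded by $C_TL_n^{1/2}q_n^{-1}$. Hence
\[
\sum_{\substack{m_1,m'_1\in Z_{1,n}\\ |\varphi(m_1)-\varphi(m'_1)|<L_n^{-1/2}}} S^i(m_1, m'_1) \le C_TL_n^{1/2}q_n^{-1} \sum_{m_1\in Z_{1,n}} P\bigl(\gamma_{m_1} \in [z_{i-1}, z_i)\bigr) \le C_TL_n^{3/2}q_n^{-2}. \tag{4.62}
\]
We have thus shown that the sum (4.48) is bounded by
\[
C_T\bigl(L_n^{3/2}q_n^{-2} + L_n^2q_n^{-2}\beta^{d}\bigr)
\]
with d = 1 − σ or d = 2 − τ. Since $1 = O(\Psi_{2k}(\beta))$, we finally arrive at
\[
X_n^i(\beta) = 4(c_2-c_1)^{2k} \sum_{m_1,m'_1\in Z_{1,n}} S^i(m_1, m'_1) + O_T\bigl(L_n^{3/2}q_n^{-2}(\Psi_{2k}(\beta) + L_n^{1/2}\beta^{d})\bigr). \tag{4.63}
\]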

We are still left with the task of estimating
\[
X_n^i - X_n^i(\beta) = \sum_{(Z_{k,n}\times Z_{k,n})\setminus W_n(\beta)} Q^i(m; m'). \tag{4.64}
\]
Again we divide this sum into two parts, in slightly different ways according to which of the conditions (II-1) or (II-2) is satisfied. Namely we let
\[
X_n^i - X_n^i(\beta) \equiv X'_n(\beta) + X''_n = \Bigl(\sum_{W'_n(\beta)} + \sum_{W''_n}\Bigr) Q^i(m; m'), \tag{4.65}
\]
where we define, in case Condition (II-1) is satisfied,
\[
W''_n \equiv \{(m; m') \in Z_{k,n} \times Z_{k,n} \mid \varphi(m_s) = \varphi(m'_t) \text{ for some } s, t = 1,\dots,k\}, \tag{4.66}
\]
and define, in case Condition (II-2) is satisfied,
\[
W''_n \equiv \{(m; m') \in Z_{k,n} \times Z_{k,n} \mid |\varphi(m_s) - \varphi(m'_t)| < ML_n^{-1/2} \text{ for some } s, t = 1,\dots,k\}, \tag{4.67}
\]
the constant M > 0 being chosen later. In either case, we let
\[
W'_n(\beta) = (Z_{k,n} \times Z_{k,n}) \setminus (W_n(\beta) \cup W''_n). \tag{4.68}
\]
Recall $W_n(\beta)$ was defined just before (4.21).

Suppose (II-1) is satisfied. We shall estimate $X'_n(\beta)$. By (4.66) and (4.68), $X'_n(\beta)$ is the sum of $Q^i(m; m')$ over those $(m; m') \in Z_{k,n} \times Z_{k,n}$ such that $\varphi(m_s)$ and $\varphi(m'_t)$ are all different but that at least one of the following is less than β:
\[
\varphi(m_s) - \varphi(m_{s-1}),\quad \varphi(m'_s) - \varphi(m'_{s-1}),\quad s = 2,\dots,k;\qquad |\varphi(m_s) - \varphi(m'_t)|,\quad s, t = 1, 2,\dots,k. \tag{4.69}
\]
Now let $\bar m_1,\dots,\bar m_{2k}$ be the rearrangement of $m_1,\dots,m_k, m'_1,\dots,m'_k$ in the increasing order of φ(·), namely
\[
\varphi(\bar m_1) < \dots < \varphi(\bar m_{2k}),\qquad \{\bar m_1,\dots,\bar m_{2k}\} = \{m_1,\dots,m_k, m'_1,\dots,m'_k\}. \tag{4.70}
\]
There are $(2k)!/(k!)^2$ ways of dividing a given $\{\bar m_1,\dots,\bar m_{2k}\}$ into two classes $\{m_1,\dots,m_k\}$ and $\{m'_1,\dots,m'_k\}$, a typical one, let us call it σ, being as follows:
\[
\varphi(m_1) < \dots < \varphi(m_r) < \varphi(m'_1) < \varphi(\bar m_{r+2}) < \dots < \varphi(\bar m_{2k});\qquad \{\bar m_{r+2},\dots,\bar m_{2k}\} = \{m_{r+1},\dots,m_k, m'_2,\dots,m'_k\}. \tag{4.71}
\]
Accordingly $W'_n(\beta)$ is divided into $(2k)!/(k!)^2$ disjoint classes:
\[
W'_n(\beta) = \bigcup_{\sigma} W'_{n,\sigma}(\beta), \tag{4.72}
\]
and obviously it is sufficient to estimate the sum $X'_{n,\sigma}(\beta)$ of $Q^i(m_1,\dots,m_k; m'_1,\dots,m'_k)$ over one of these classes. Recalling Definition (3.10) of $Z_{k,n}$ and Definition (4.17) of $\ell_s$, we see that $|m_s|/\ell_s$ and $|m'_s|/\ell'_s$ are of $O_T(1)$, and hence
\[
\begin{aligned}
X'_n(\beta) \le {}& C_TL_n^{-2(k-1)} \sum_{W'_n(\beta)} \int_{|m_1|/\sqrt{z_i}}^{|m_1|/\sqrt{z_{i-1}}} dy_1 \int_{|m'_1|/\sqrt{z_i}}^{|m'_1|/\sqrt{z_{i-1}}} dy'_1 \int_{-A}^{A} d\zeta_2 \cdots \int_{-A}^{A} d\zeta_k \int_{-A}^{A} d\zeta'_2 \cdots \int_{-A}^{A} d\zeta'_k\\
&\times p_{2k}\Bigl(y_1, \frac{|m_2|}{\ell_2},\dots,\frac{|m_k|}{\ell_k}, y'_1, \frac{|m'_2|}{\ell'_2},\dots,\frac{|m'_k|}{\ell'_k} \Bigm| \varphi(m_1),\dots,\varphi(m_k), \varphi(m'_1),\dots,\varphi(m'_k)\Bigr).
\end{aligned} \tag{4.73}
\]


At this point, we assume that (m1 , . . . , mk ; m1 , . . . , mk ) are arranged like (4.71) and set y¯1 = y1 ,

y¯j =

|ms | ; 9s

ζ¯j = ζs

if m ¯ j = ms for some s = 2, . . . , k

(4.74)

|mt | ; ζ¯j = ζt 9t

if m ¯ j = mt for some t = 2, . . . , k.

(4.75)

and y¯r+1 = y1 ,

y¯j =

Then by Condition (II-1), we get Xn (β)

≤ CT L−2(k−1) n ×

√ (β) |m1 |/ zi Wn,σ

√ |m1 |/ zi−1 √ |m1 |/ zi

√ |m1 |/ zi−1

dy1

A −A

dy1

d ζ¯r+2 · · ·

A −A

A −A

dζ2 · · ·

A −A

dζr

d ζ¯2k

× p2k (y1 , y¯2 , . . . , y¯r , y1 , y¯r+2 , . . . , y¯2k |ϕ(m1 ), . . . ,

(4.76)

. . . ϕ(mr ), ϕ(m1 ), . . . , ϕ(m ¯ r+2 ), . . . , ϕ(m ¯ 2k )) ≤ CT L−2(k−1) n

m ¯ 1 ... ,m ¯ 2k , ¯ j −1 <βj 0<m ¯ j −m

qn−2

2k

(ϕ(m ¯ j ) − ϕ(m ¯ j −1 ))−σ ,

j =2

where βj = β or βj = and at least one of βj is equal to β. We now apply Lemma 1 with 1/2 ¯ j , j = 2k, 2k − 1, . . . , 2, R = Ln , successively to the summation with respect to m to obtain Xn (β) ≤ CT qn−2 L2n

2k j =2

β¯j1−σ ≤ CT L2n qn−2 β 1−σ .

(4.77)

Next let us estimate Xn , still assuming Condition (II-1). By definition, we can write Xn =

k

9=1 1≤s1 <···<s9 ≤k 1≤t1 <···

Qi (m; m ).

(4.78)

(m;m )∈Zk,n ×Zk,n ; ϕ(msj )=ϕ(mtj ),j =1,... ,9

Let us denote the last sum by $X_n(s_1, \dots, s_\ell; t_1, \dots, t_\ell)$. We shall prove

$$X_n(s_1, \dots, s_\ell; t_1, \dots, t_\ell) \le C_T L_n^{3/2} q_n^{-2}, \tag{4.79}$$

which is sufficient for our purpose. We first consider the special case of sj = tj = j , j = 1, . . . , 9. Letting as before ys = f (ϕ(ms )), s = 1, . . . , k and yt = f (ϕ(mt )),

234

N. Minami

t = 9 + 1, . . . , k, we can write, noting f (ϕ(mj )) = f (ϕ(mj )), j = 1, . . . , 9, Qi (m; m ) =

|m | |m | |m | |m | ( √z1 , √z 1 )∩( √z1 , √z 1 ) i i−1 i i−1

dy2 · · ·

dy1

dyk

dy9+1 ···

dyk

|ms |2 |m1 |2 |ms |2 |m1 |2 × A − {0 ∨ max ( 2 − )} + {0 ∧ min ( 2 − )} 2≤s≤k ys 2≤s≤k ys y12 y12 + 2 2 2 2 |m | |m | |m | |mt | × A − {0 ∨ max ( t2 − 12 ) ∨ max ( 2 − 12 )} 2≤t≤9 yt 9
|m1 |2 |m1 |2 |mt |2 |mt |2 + {0 ∧ min ( 2 − ) ∧ min ( 2 − 2 )} 2≤t≤9 yt 9
(4.80)

+

× p2k−9 (y1 , . . . , yk , y9+1 , . . . , yk | ϕ(m1 ), . . . , ϕ(mk ), ϕ(m9+1 ), . . . , ϕ(mk )).

If we make the change of variables ζs =

|ms |2 |m1 |2 − 2 , ys2 y1

s = 2, . . . , k;

ζt =

|mt |2 |m1 |2 − 2 , t = 9+1, . . . , k, (4.81) yt2 y1

and if we note that for t = 2, . . . , 9, |m1 |2 |m1 |2 ζt |m1 |2 |mt |2 2 |m − = + | − , t |mt |2 yt2 y12 y12 |mt |2 y12

(4.82)

we can further rewrite (4.80) as Qi (m; m ) =

|m | |m | |m | |m | ( √z1 , √z 1 )∩( √z1 , √z 1 ) i i−1 i i−1

dy1

A −A

dζ2 . . .

A −A

dζk

A −A

dζ9+1 ...

A −A

dζk

k k 1 ms 1 1 mt 1 × 2 9s 92s 2 9t 92 t s=2

t=9+1

× [A − (0 ∨ max ζs ) + (0 ∧ min ζs )]+ 2≤s≤k 2≤s≤k |m1 |2 ζt |m1 |2 2 |mt | − ∨ max ζt + 2 × A − 0 ∨ max 2≤t≤9 9
(4.83)


In order to estimate the summation over those (m; m ) ∈ Zk,n × Zk,n such that ϕ(mj ) = ϕ(mj ), j = 1, . . . , 9, we make the following observation. First, we have ms 9s ≥ CT Ln , ≤ CT , s = 2, . . . , k (4.84) 9s and 9t

≥ CT

mt ≤ CT , 9

Ln ,

t = 9 + 1, . . . , k.

t

(4.85)

Moreover Qi (m; m ) must vanish unless for t = 2, . . . , 9 and for some value of y1 and ζt , one has ζ |2 |m |m1 |2 t |mt |2 − 12 < A. (4.86) + 2 2 |mt |2 y1 |mt | y1 It is easy to see that for given values of m1 , mt , m1 and y1 , ζt , the number of mt which satisfies this condition and ϕ(mt ) = ϕ(mt ) is at most one. Hence if we make summation over m2 , . . . , m9 first, then Xn (1, . . . , 9; 1, . . . , 9) 2k−9−1 ≤ CT (L−1 n ) ×

A −A

dζ2 · · ·

|m | i

A −A

dζk

A −A

|m |

|m | i−1

|m |

( √z1 , √z 1 )∩( √z1 , √z 1 )

m1 ,... ,mk ;m1 , m9+1 ,... ,mk

dζ9+1 ···

i

A −A

dy1

i−1

dζk

(4.87)

× p2k−9 (y1 , . . . , yk , y9+1 , . . . , yk | ϕ(m1 ), . . . , ϕ(mk ), ϕ(m9+1 ), . . . , ϕ(mk )).

Now we arrange m2 , . . . , mk and m9+1 , . . . , mk in increasing order of ϕ, and call them m ¯ 2, . . . , m ¯ 2k−9 , namely ϕ(m1 ) = ϕ(m1 ) < ϕ(m ¯ 2 ) < · · · < ϕ(m ¯ 2k−9 ).

(4.88)

By Condition (II-1), we then have Xn (1, . . . , 9; 1, . . . , 9) 2k−9−1 ≤ CT (L−1 n )

×

(m1 ,m1 ,m ¯ 2 ,... ,m ¯ 2k−9 )∈Z2k−9,n ϕ(m1 )=ϕ(m1 )

|m | i

|m | i−1

|m |

|m |

( √z1 , √z 1 )∩( √z1 , √z 1 ) i

i−1

dy1 (4.89)

2k−9

(ϕ(m ¯ s ) − ϕ(m ¯ s−1 ))−σ (ϕ(m ¯ 2 ) − ϕ(m1 ))−σ .

s=3

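Applying Lemma 1 successively to the summation over $\bar m_{2k-\ell}, \bar m_{2k-\ell-1}, \dots, \bar m_2$, we arrive at

$$X_n(1, \dots, \ell; 1, \dots, \ell) \le C_T \sum_{\substack{m_1, m'_1 \in Z_{1,n} \\ \varphi(m_1) = \varphi(m'_1)}} \left( \frac{|m_1| \wedge |m'_1|}{\sqrt{z_{i-1}}} - \frac{|m_1| \vee |m'_1|}{\sqrt{z_i}} \right)_+ . \tag{4.90}$$

The summand vanishes unless

$$\frac{\sqrt{z_{i-1}}}{\sqrt{z_i}}\, |m_1| < |m'_1| < \frac{\sqrt{z_i}}{\sqrt{z_{i-1}}}\, |m_1|. \tag{4.91}$$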
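The step from the intersection of the two $y_1$-intervals to condition (4.91) can be checked on concrete numbers. The sketch below is only an illustration: the function names are hypothetical, `a` and `b` stand for $|m_1|$ and $|m'_1|$, and `z_prev < z_cur` stand for $z_{i-1} < z_i$. It computes the length of the intersection, tests the two-sided inequality, and evaluates the elementary bound obtained by replacing both $|m_1| \wedge |m'_1|$ and $|m_1| \vee |m'_1|$ by $|m_1|$.

```python
import math


def overlap_length(a, b, z_prev, z_cur):
    """Length of (a/sqrt(z_cur), a/sqrt(z_prev)) intersected with
    (b/sqrt(z_cur), b/sqrt(z_prev)); zero if the intervals are disjoint."""
    lower = max(a, b) / math.sqrt(z_cur)
    upper = min(a, b) / math.sqrt(z_prev)
    return max(upper - lower, 0.0)


def satisfies_4_91(a, b, z_prev, z_cur):
    """Two-sided condition on b = |m_1'| for a given a = |m_1|."""
    ratio = math.sqrt(z_cur / z_prev)
    return a / ratio < b < a * ratio


def summand_bound(a, z_prev, z_cur):
    """Upper bound a/sqrt(z_prev) - a/sqrt(z_cur) for the overlap length."""
    return a * (1.0 / math.sqrt(z_prev) - 1.0 / math.sqrt(z_cur))


if __name__ == "__main__":
    # Illustrative values only: z_{i-1} = 1, z_i = 4.
    for a, b in [(3.0, 4.0), (3.0, 7.0)]:
        print(a, b, overlap_length(a, b, 1.0, 4.0), satisfies_4_91(a, b, 1.0, 4.0))
```

With these sample values the overlap is positive exactly when the two-sided inequality holds, and it never exceeds `summand_bound`.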

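But for each $m_1 \in Z_{1,n}$, the number of $m'_1$ which satisfy this condition and $\varphi(m'_1) = \varphi(m_1)$ is bounded by $C_T L_n^{1/2} q_n^{-1}$. On the other hand, each summand is bounded by

$$\frac{|m_1|}{\sqrt{z_{i-1}}} - \frac{|m_1|}{\sqrt{z_i}} = \left( \frac{\sqrt{z_i}}{\sqrt{z_{i-1}}} - 1 \right) \frac{|m_1|}{\sqrt{z_i}} \le C_T q_n^{-1}, \tag{4.92}$$

hence we finally obtain

$$X_n(1, \dots, \ell; 1, \dots, \ell) \le C_T \left( L_n^{1/2} q_n^{-1} \right) q_n^{-1} \sum_{m_1 \in Z_{1,n}} 1 \le C_T L_n^{3/2} q_n^{-2}, \tag{4.93}$$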

as desired. The general Xn (s1 , . . . , s9 ; t1 , . . . , t9 ) can be estimated in essentially the same way. Indeed, if γm1 , γm1 ∈ [zi−1 , zi ) and if |Dm1 ∩ · · · ∩ Dmk | > 0, |Dm1 ∩ · · · ∩ Dmk | > 0, then we must have |γms1 − γm1 | < A,

|γmt − γm1 | < A, 1

(4.94)

which means γm1 , γm1 ∈ (zi−1 − A, zi + A) .

(4.95)

Hence Qi (m; m ) ≤ EP [1γm1 ,γm ∈(zi−1 −A,zi +A) |Dm1 ∩ · · · ∩ Dmk ||Dm1 ∩ · · · ∩ Dmk |]. (4.96) 1

The right-hand side can be treated similarly as above, except that the role of (m1 , . . . , m9 ; m1 , . . . , m9 ) is now played by

(ms1 , . . . , ms9 ; mt1 , . . . , mt9 ),

and zi−1 , zi are replaced by zi−1 − A and zi + A respectively. Thus we obtain (4.79) again. Combining (4.63), (4.77) and (4.79), and noting 1 = O(32k (β)) as β $ 0, we now conclude 3/2 1/2 (4.97) Xni = 4(c2 − c1 )2k Zni + O(Ln qn−2 (32k (β) + Ln β d )) under Condition (II-1). Next we assume Condition (II-2) and estimate Xn (β) and Xn again. This time, Xn (β) is the sum of Qi (m; m ) over those (m; m ) ∈ Zk,n × Zk,n such that −1/2

|ϕ(ms ) − ϕ(mt )| ≥ MLn

,

s, t = 1, . . . , k

(4.98)

hold and that at least one of ϕ(ms ) − ϕ(ms−1 ), ϕ(ms ) − ϕ(ms−1 ) or |ϕ(ms ) − ϕ(mt )| is less than β.


Let us take M > 0 sufficiently small. Then as can be seen from the proof of Lemma 3, as far as we have (m; m ) ∈ Zk,n × Zk,n , the condition −1/2

ϕ(ms ) − ϕ(ms−1 ) ≥ M Ln

−1/2

ϕ(ms ) − ϕ(ms−1 ) ≥ M Ln

;

,

s = 2, . . . , k (4.99)

is automatically satisfied. (β) be the subset of W (β) consisting of those (m; m ) which As before, let Wn,σ n can be arranged like (4.71). Let us define, for j = 1, . . . , 2k,   y1 m ¯ j = m1   y m ¯ j = m1 1 . (4.100) y¯j =  |ms |/9s m ¯ j = ms for some s = 2, . . . , k    |mt |/9t m ¯ j = mt for some t = 2, . . . , k Then if we note that

CT−1 Ln ≤ 9s , 9s ≤ CT Ln for some constant CT ≥ 1, we see from (4.18) and Condition (II-2), that Qi (m; m ) ≤ CT L−2(k−1) n ×

A

−A

1/2

√ |m1 |/ zi−1 √ |m1 |/ zi

d ζ¯r+1 · · ·

A −A

dy1

d ζ¯2k

2k

A −A

1/2

dζ2 · · ·

A −A

dζr

√ |m1 |/ zi−1

√ |m1 |/ zi

dy1

(4.101)

(4.102)

1{|y¯j −y¯j −1 |≤b3 (ϕ(m¯ j )−ϕ(m¯ j −1 )} (ϕ(m ¯ j ) − ϕ(m ¯ j −1 ))−τ ,

j =2

where ζ¯j = ζs or ζ¯j = ζt according to m ¯ j = ms or m ¯ j = mt . i (β). Noting (4.99), we apply Let Xn,σ (β) be the sum of Q (m; m ) over Wn,σ Lemma 2 successively to the summation over m ¯ 2k , . . . , m ¯ r+2 , and integrate out over d ζ¯j , to obtain   √ 2k |m1 |/ zi−1 2−τ  −r+1  ¯ Xn,σ (β) ≤ CT Ln dy1 βj √ r

j =r+2

A

m1 ∈Z1,n |m1 |/ zi

1{|y¯s −y¯s−1 |≤b3 (ϕ(m¯ s )−ϕ(m¯ s−1 )} (ϕ(m ¯ s ) − ϕ(m ¯ s−1 )−τ −A ms s=2 b2 × dy1 1{y1 √zi−1 ≤|m1 |≤y1 √zi } 1{|y¯s −y¯s−1 |≤b3 (ϕ(m¯ s )−ϕ(m¯ s−1 )} (ϕ(m ¯ s ) − ϕ(m ¯ s−1 )−τ , b1 m1 ×

dζs

(4.103) where as before β¯j = β or β¯j = . We can apply the same method which we used in obtaining (4.57), namely we see without difficulty that 1{y1 √zi−1 ≤|m1 |≤y1 √zi } 1{|y¯s −y¯s−1 |≤b3 (ϕ(m¯ s )−ϕ(m¯ s−1 ))} × (ϕ(m ¯ s ) − ϕ(m ¯ s−1 ))−τ m1

τ/2

≤ CT qn−1 Ln (Ln ∧ b3−1 |y¯r − y1 |)1−τ . 1/2

(4.104)


Integrating with respect to y1 on the region {|y1 − yr | < β¯r+1 }, we get

dy1

m1

1{y1 √zi−1 ≤|m1 |≤y1 √zi } 1{|y¯s −y¯s−1 ≤b3 (ϕ(m¯ s )−ϕ(m¯ s−1 ))} (ϕ(m ¯ s ) − ϕ(m ¯ s−1 ))−τ τ/2−1

≤ CT qn−1 Ln (Ln

2−τ ≤ CT qn−1 Ln β¯r+1 ,

2−τ + β¯r+1 )

(4.105) −1/2 because of β ≥ Ln . We insert the estimate (4.102), which is uniform in m1 , . . . , mr , y1 and in ζ2 , . . . , ζr , apply Lemma 2 successively to the summation over mr , . . . , m2 , and then integrate out over ζ2 , . . . , ζr , to obtain  Xn,σ (β) ≤ CT L−r+1 Lr−1  n

 ≤ CT L2n qn−2  ≤

2k j =r+2

2k j =r+2

 β¯j2−τ  qn−1 Ln 

m1 ∈Z1,n

√ |m1 |/ zi−1 √ |m1 |/ zi

1 dy1

β¯j2−τ 

CT L2n qn−2 β 2−τ ,

(4.106) because at least one of the β¯j ’s is β. This estimate being similar for all arrangements in (4.71), we can now conclude Xn (β) ≤ CT L2n qn−2 β 2−τ .

(4.107)

Finally, we turn to the estimate of $X_n$ under Condition (II-2), to finish the estimate of $X_n$. Recall that this time, $X_n$ is the sum of $Q^i(m; m')$ over those $(m; m') \in Z_{k,n} \times Z_{k,n}$ for which (4.99) holds but $|\varphi(m_s) - \varphi(m'_t)| < M L_n^{-1/2}$ for some $s, t = 1, \dots, k$. Analogously to (4.79), we let $X_n(s_1, \dots, s_\ell, s_{\ell+1}, \dots, s_r; t_1, \dots, t_\ell, t_{\ell+1}, \dots, t_r)$ be the sum of $Q^i(m; m')$ over those $(m; m') \in Z_{k,n} \times Z_{k,n}$ such that for some $1 \le s_1 < \cdots < s_\ell \le k$, $1 \le s_{\ell+1} < \cdots < s_r \le k$, $1 \le t_1 < \cdots < t_\ell \le k$ and $1 \le t_{\ell+1} < \cdots < t_r \le k$, it holds that

$$\varphi(m_{s_j}) = \varphi(m'_{t_j}), \qquad j = 1, \dots, \ell \tag{4.108}$$

and that

$$0 < |\varphi(m_{s_j}) - \varphi(m'_{t_j})| < M L_n^{-1/2}, \qquad j = \ell + 1, \dots, r. \tag{4.109}$$

(Possibly one of the equalities $\ell = 0$ or $r = \ell$ may hold, but not both.) We shall treat in detail the case of $\ell \ge 1$, $r \ge \ell$ and $s_j = t_j = j$, $j = 1, \dots, r$, and give a brief sketch of the other cases.


Since ϕ(ms ) = ϕ(ms ) for s = 1, . . . , 9, Qi (m; m ) has the representation (4.83) and summing over m2 , . . . , m9 , we obtain Xn (1, . . . , r; 1, . . . , r) ≤ CT |m | |m m1 ,... ,mk ;m1 , m9 ,... ,mk

×

A

dζ2 · · ·

−A

| i−1

|m |

|m |

( √z1 , √z 1 )∩( √z1 , √z 1 ) i

A

dζk

−A

i

dy1

i−1

A k 1 dζ9+1 · · · dζk 2 −A −A s=2 A

k ms 1 1 mt 1 9 92 9 92 2 s t s t

mk m m2 × p2k−9 (y1 , , . . . , , 9+1 , . . . , 92 9k 99+1 m . . . k |ϕ(m1 ), . . . , ϕ(mk ), ϕ(m9+1 ), . . . , ϕ(mk )). 9k

−1/2

By Condition (II-2) and by |ϕ(mj ) − ϕ(mj )| < M Ln 1, . . . , r, the summand vanishes if the condition |m | |m | j j −1/2 − < b3 M Ln , 9j 9j fails to hold for all y1 , ζj and ζj . Since y2 |m1 | 9j = 1 + 1 2 ζj ; y1 |m1 |

t=9+1

(4.110)

which holds for j = 9 +

j = 9 + 1, . . . , r

| |m y2 9j = 1 1 + 1 2 ζj , y1 |m1 |

(4.111)

(4.112)

we see 9j 9j

=

|m1 | + OT (L−1 n ) |m1 |

(4.113)

uniformly in y1 ∈ (b1 , b2 ) and in ζj , ζj ∈ (−A, A). Hence if (4.111) ever holds for some y1 , ζj and ζj , then there is a constant C such that |m | − |m1 | |mj | < C. j |m1 |

(4.114)

Now if we perform the integration with respect to ζj , j = 9 + 1, . . . , r, then by (4.16), , . . . , y , and since p the variables of integration transform back into y9+1 2k−9 (·) is a r


probability density, the summand of (4.110) is bounded by A A A dy1 dζ2 · · · dζk dζr+1 · · · |m | |m | |m | |m | ( √z1 , √z 1 )∩( √z1 , √z 1 ) i

i−1

i

−A

i−1

−A

−A

A −A

k r 1 ms 1 1 mt 1 × 1 |m1 | 2 9s 92s 2 9t 92 t j =9+1 {||mj |− |m1 | |mj ||
dζk (4.115)

For given values of m1 , m1 , mj ∈ Z1,n , the number of mj which satisfy (4.114) and ϕ(mj ) = ϕ(mj ) is bounded by a constant which is independent of m1 , m1 , mj and also of Ln . Hence if we make summation over m9+1 , . . . , mr first, then by (4.110) and (4.115), Xn (1, . . . , r; 1, . . . , r) L−(2k−r−1) ≤ CT n m1 ,... ,mk ;m1 , mr+1 ,... ,mk

×

A −A

dζ2 · · ·

A −A

dζk

A −A

|m | i

|m | i−1

|m |

|m |

( √z1 , √z 1 )∩( √z1 , √z 1 )

dζr+1 ...

i

A −A

dy1

i−1

dζk

(4.116)

mk m m2 × p2k−r y1 , , . . . , , r+1 , . . . 92 9k 9r+1

m , k 9k × |ϕ(m1 ), . . . , ϕ(mk ), ϕ(mr+1 ), . . . , ϕ(mk ) . Again we arrange m2 , . . . , mk ; mr+1 , . . . , mk in increasing order of ϕ, and relabel them as m ¯ 2, . . . , m ¯ 2k−r . Introducing y¯j by (4.100), we then have a bound of the summand of (4.116) similar to (4.102). Applying Lemma 2 successively to the summation over m ¯ 2k−r , . . . , m ¯ 2 , we arrive at |m1 | ∧ |m1 | |m1 | ∨ |m1 | − Xn (1, . . . , r; 1, . . . , r) ≤ CT √ √ zi−1 zi + m1 ,m1 ∈Z1,n , (4.117) ϕ(m1 )=ϕ(m1 )

≤

3/2 CT Ln qn−2 ,

as before. Next suppose 9 ≥ 1 and that for some 1 ≤ s1 < · · · < s9 ≤ k and 1 ≤ t1 < · · · < t9 ≤ k one has ϕ(msj ) = ϕ(mtj ), while for some other 1 ≤ s9+1 < · · · < sr ≤ k and −1/2

1 ≤ t9+1 < · · · < tr ≤ k, one has 0 < |ϕ(msj ) − ϕ(mtj )| < MLn . We can use (4.96) to reduce the analysis of this case to the case of sj = tj = j , j = 1, . . . , r.


In case of $\ell = 0$, $r > 0$, we assume, without loss of generality, that $s_j = t_j = j$, $j = 1, \dots, r$ holds. This time, $\varphi(m_s)$, $\varphi(m'_t)$, $s, t = 1, \dots, k$ are all different, and we have the representation (4.18) for $Q^i(m; m')$. Now in (4.18), $y'_1$ can vary only within the interval $\{|y'_1 - y_1| < b_3 M L_n^{-1/2}\}$, from which we deduce, instead of (4.113),

$$\frac{\ell'_j}{\ell_j} = \frac{|m'_1|}{|m_1|} + O_T(L_n^{-1/2}). \tag{4.118}$$

But this still gives the following condition on $m'_j$:

$$\left|\, |m'_j| - \frac{|m'_1|}{|m_1|}\, |m_j| \,\right| < C \tag{4.119}$$

for the non-vanishing of $Q^i(m_1, \dots, m_k; m'_1, \dots, m'_k)$. For given values of $m_1$, $m'_1$ and $m_j$, the number of $m'_j$ satisfying (4.119) and $|\varphi(m_j) - \varphi(m'_j)| < M L_n^{-1/2}$ is bounded by some constant which is independent of $m_1$, $m'_1$, $m_j$ and $L_n$. Hence the same argument which led to (4.116) yields

$$X_n(1, \dots, r; 1, \dots, r) \le C_T \sum_{\substack{m_1, m'_1 \in Z_{1,n} \\ |\varphi(m_1) - \varphi(m'_1)| < M L_n^{-1/2}}} \int_{|m_1|/\sqrt{z_i}}^{|m_1|/\sqrt{z_{i-1}}} dy_1 \int_{|m'_1|/\sqrt{z_i}}^{|m'_1|/\sqrt{z_{i-1}}} dy'_1 \; p_2(y_1, y'_1 \mid \varphi(m_1), \varphi(m'_1)) \le C_T \sum_{\substack{m_1, m'_1 \in Z_{1,n} \\ |\varphi(m_1) - \varphi(m'_1)| < M L_n^{-1/2}}} S(m_1, m'_1). \tag{4.120}$$

But this sum was already estimated in (4.61). Namely we have

$$X_n(1, \dots, r; 1, \dots, r) \le C_T L_n^{3/2} q_n^{-2} \tag{4.121}$$

once again. Thus we have proved (4.97) under Condition (II-2). Let us finish the proof of Lemma 4 by estimating Yni . Since no new technical difficulty arises, we shall only sketch the outline, omitting the repetition of detailed argument. To begin with, we divide the summation (4.4) into two parts: Yni = Yn + (Yni − Yn ), where we define R i (m; m1 ) (4.122) Yn = ϕ(m1 )=ϕ(ms ) for some s=1,... ,k

in case Condition (II-1) holds, and Yn =

R i (m; m1 ) −1/2

|ϕ(m1 )−ϕ(m1 )|<MLn for some s=1,... ,k

in case Condition (II-2) holds.

(4.123)


Suppose Condition (II-1) holds, and let $\varphi(m'_1) = \varphi(m_s)$. By the same consideration which led us to (4.96), we see

$$R^i(m; m'_1) \le C\, E_P\!\left[ 1_{\{\gamma_{m_s}, \gamma_{m'_1} \in (z_{i-1} - A,\, z_i + A)\}}\, |D_{m_1} \cap \cdots \cap D_{m_k}| \right], \tag{4.124}$$

and it is easy to obtain Yn =

k

s=1

ϕ(m1 )=ϕ(ms )

≤C

k

R i (m; m1 )

s=1 ϕ(m1 )=ϕ(ms )

(4.125)

EP [1{γms ,γm ∈(zi−1 −A,zi +A)} |Dm1 ∩ · · · ∩ Dmk |] 1

≤ CLn qn−2 , 3/2

as before. In case Condition (II-2) holds, we further divide Yn into summation over those (m; m1 ) for which ϕ(m1 ) = ϕ(ms ) for some s = 1, . . . , k and over those for which −1/2 for some s. Recalling what we did in estimating Xn 0 < |ϕ(m1 ) − ϕ(m1 )| < MLn under Condition (II-1), there is no difficulty in obtaining (4.125) again. Now define a subset Vn (β) of Zk,n ×Z1,n to be the totality of those (m1 , . . . , mk , m1 ) such that (4.126) ϕ(ms ) − ϕ(ms−1 ) ≥ β, s = 2, . . . , k and that

|ϕ(m1 ) − ϕ(ms )| ≥ β,

s = 2, . . . , k.

(4.127)

Corresponding to this, we let Yn (β) ≡

R i (m; m1 )

(4.128)

Vn (β)

and

Yn (β) ≡ Yn − Yn − Yn (β).

(4.129)

To estimate Yn (β), we use the bound R i (m; m1 ) ≤ CEP [1{γms ,γm ∈[zi−1 ,zi ))} |Dm1 ∩ · · · ∩ Dmk |] 1 |m1 |/√zi−1 |m |/√zi−1 1 −2(k−1) dy1 dy1 ≤ CLn √ √ |m1 |/ zi

m2 × pk+1 (y1 , , . . . 92

|m1 |/ zi

A −A

dζ2 . . .

A −A

dζk ×

mk , , y1 | ϕ(m1 ), . . . , ϕ(mk ), ϕ(m1 )). 9k (4.130) Now we make summation over those (m1 , . . . , mk ; m1 ) ∈ Zk,n × Z1,n satisfying ϕ(ms ) − ϕ(ms−1 ) < β, or

0 < |ϕ(m1 ) − ϕ(ms )| < β,

for some s = 2, . . . , k for some s = 1, . . . , k

(4.131) (4.132)


in case Condition (II-1) holds, or over those (m1 , . . . , mk ; m1 ) ∈ Zk,n × Z1,n satisfying −1/2

MLn

≤ |ϕ(m1 ) − ϕ(ms )| < β

for some s = 1, . . . , k

in case Condition (II-2) holds. By the same argument as in the estimation of then obtain Yn (β) ≤ CL2n qn−2 β d .

(4.133) Xn (β),

we

(4.134)

Let us proceed to the estimation of Yn (β). We use the following representation of R i (m; m1 ) : |m1 |/√zi−1 |m |/√zi−1 1 i dϕ2 dϕ3 · · · dϕk dy1 dy1 R (m; m1 ) = √ √ ×

ϕ(m1 )

2(c2 −c1 ) −2(c2 −c1 )

dζ2 · · ·

ϕ2

2(c2 −c1 )

−2(c2 −c1 )

dζk

ϕk−1

dy2 · · ·

|m1 |/ zi

|m1 |/ zi

dyk (y22 · · · yk2 )

k 1 ms 1 (4.135) × [2(c2 − c1 ) − (0 ∨ max ζs ) + (0 ∧ min ζs )]+ 2≤s≤k 2≤s≤k 2 9s 92s s=2 mk m2 × p2k (y1 , , . . . , , y1 , y2 , . . . , yk | ϕ(m1 ), . . . , ϕ(mk ), ϕ(m1 ), ϕ2 , . . . , ϕk ). 92 9k Now we decompose the integral with respect to ϕ2 , . . . , ϕk into two parts. Namely we set D(m1 ) = {(ϕ2 , . . . , ϕk ) | ϕ(m1 ) < ϕ2 < · · · < ϕk < } (4.136) and D(m; m1 ; β) = {(ϕ2 , . . . , ϕk ) ∈ D(m1 ) | ϕ2 − ϕ(m1 ) ≥ β, ϕs − ϕs−1 ≥ β,

s = 3, . . . , |ϕt − ϕ(ms )| ≥ β, ∀s, ∀t};

D (m; m1 ; β) = D(m1 ) \ D(m1 , . . . , mk ; m1 ; β).

(4.137) (4.138)

Rβi (m; m1 )

Let and Rβi (m; m1 ) be defined by the right hand side of (4.135) with the integration over ϕ2 , . . . , ϕk ranging in D(m; m1 ; β) and D (m; m1 ; β) respectively. If we integrate with respect to y2 , . . . , yk first, then the resulting integral is bounded with respect to ϕ2 , . . . , ϕk . Hence we obviously have |m1 |/√zi−1 |m |/√zi−1 1 i −(k−1) Rβ (m; m1 ) ≤ CT βLn dy1 dy1 √ √ |m1 |/ zi |m1 |/ zi m2 × pk+1 (y1 , , . . . 92

and hence it is easy to see that Vn (β)

namely Yn (β) =

mk , , y1 | ϕ(m1 ), . . . , ϕ(mk ), ϕ(m1 )), 9k (4.139)

Rβi (m; m1 ) ≤ CβL2n qn−2 ,

Vn (β)

Rβi (m; m1 ) + O(βLn qn−2 ).

(4.140)

(4.141)


At this point, we introduce a function V (x2 , . . . , xk ) = V (x2 , . . . , xk ; m1 , m1 ; y1 , y1 , . . . , yk ; ϕ2 , . . . , ϕk ) # ≡ p2k y1 , |x2 |, . . . , |x2 |, y1 , y2 , . . . , yk | ϕ(m1 ), ϕ(x2 ), . . . , . . . ϕ(xk ), ϕ(m1 ), ϕ2 , . . . , ϕk

(4.142)

k $ 1 ( |xs |) 2 s=2

which is defined on Dm1 ≡ {(x2 , . . . , xk ) ∈ (R2 )k−1 | b1 ≤ |xs | ≤ b2 , s = 2, . . . , k; ϕ(m1 ) < ϕ(x2 ) < · · · < ϕ(xk )}.

(4.143)

Then we can write the first term on the right hand side of (4.141) as follows: Vn (β)

=

Rβi (m; m1 )

···

m1 ,m1

dy2 · · ·

× ×

m2 ,... ,mk

dϕ2 · · · ϕk D(m1 )

√ |m1 |/ zi−1

dy1

√ |m1 |/ zi

√ |m1 |/ zi−1 √ |m1 |/ zi

dy1

A −A

dζ2 . . .

A

dζk

−A

dyk (y22 · · · yk2 )[A − (0 ∨ max ζs ) + (0 ∧ min ζs )]+ 2≤s≤k

m2 1D(m1 ,... ,mk ;m1 ;β) V , . . . 92

k mk 1 , . 9k 92s

2≤s≤k

(4.144)

s=2

As before, the summation over (m2 , . . . , mk ) is nothing but the Riemann sum approximation of the integral of V , namely we have k mk 1 m2 1D(m1 ,... ,mk ;m1 ;β) V , . . . , 92 9k 92 m2 ,... ,mk s=2 s −1/2 = ··· dx2 · · · dxk V (x2 , . . . , xk ) + O(Ln 32k (β)),

Dβ (ϕ2 ,... ,ϕk )

(4.145)

where we have defined the range of integration by Dβ (ϕ2 , . . . , ϕk ) ≡ {(x2 , . . . , xk ) ∈ Dm1 | |ϕ(xs ) − ϕt | ≥ β, ∀s, ∀t}.

(4.146)


Inserting this approximation into (4.144) and using (4.38), we obtain Vn (β)

=

Rβi (m; m1 )

m1 ,m1

···

D(m1 )

dϕ2 · · · ϕK

√ |m1 |/ zi−1 √ |m1 |/ zi

dy1

dy2 · · · dyk (y22 · · · yk2 ) × −(k−1) 2 2 ×2 dy2 · · · dyk (y2 · · · yn ) · · · × Ak

× p2k (y1 , . . . , yk , y1 . . . , yk |ϕ(m1 ), ϕ2 , . . .

√ |m1 |/ zi−1 √ |m1 |/ zi

dy1 × (4.147)

dϕ2 · · · ϕk × D˜ β (ϕ2 ,... ,ϕk ) , ϕk , ϕ(m1 ), ϕ2 , . . . , ϕk )

+ O(Ln qn−2 32k (β)), 3/2

where we have set D˜ β (ϕ2 , . . . , ϕk )

= {(ϕ2 , . . . , ϕk ) | ϕ(m1 ) < ϕ(x2 ) < · · · < ϕ(xk ); |ϕs − ϕt | ≥ β}.

(4.148)

Now (4.147) is further rewritten as Vn (β)

Rβi (m; m1 )

=

m1 ,m1 ; |ϕ(m1 )−ϕ(m1 )|≥β

×

···

×

···

2(c2 − c1 )k EP 1{γms ,γm ∈[zi−1 ,zi ))} 1

dϕ2 · · · ϕk f (ϕ2 )2 · · · f (ϕk )2 D˜ β (ϕ2 ,... ,ϕk )

3/2 dϕ2 · · · ϕk f (ϕ2 )2 · · · f (ϕk )2 + O(Ln qn−2 32k (β))

= 2(c2 − c1 )k

S(m1 , m1 )

m1 ,m1 ; |ϕ(m1 )−ϕ(m1 )|≥β

3/2 + O Ln qn−2 32k (β) +

m1 ,m1 ; |ϕ(m1 )−ϕ(m1 )|≥β

βP(γm1 , γm1 ∈ [zi−1 , zi )) .

Since we already have the estimate m1 ,m1 ; |ϕ(m1 )−ϕ(m1 )|≥β

βP(γm1 , γm1 ∈ [zi−1 , zi )) = O(L2n qn−2 β d )

(4.149)


(see (4.41) and (4.46)), we finally arrive at Yn = Yn + (Yn − Yn )

= Yn + Yn (β) + Yn (β) Rβi (m1 , . . . , mk ; m1 ) + O(βL2n qn−2 ) + Yn (β) + Yn = Vn (β)

= 2(c2 − c1 )k

3/2

m1 ,m1 ; |ϕ(m1 )−ϕ(m1 )|≥β

= 2(c2 − c1 )k

S(m1 , m1 ) + O(Ln qn−2 32k (β) + L2n qn−2 β d )

S(m1 , m1 ) + O(Ln qn−2 (32k (β) + Ln β d )). 3/2

m1 ,m1

1/2

(4.150)

We have thus proved Ui,n = Xni − 4(c2 − c1 )k Yni + 4(c2 − c1 )2k Zni

= 4(c2 − c1 )2k Zni − 8(c2 − c1 )2k Zni 4(c2 − c1 )2k Zni

(4.151)

+ O(Ln qn−2 (32k (β) + Ln β d )) 3/2

1/2

= O(Ln qn−2 (32k (β) + Ln β d )), 3/2

1/2

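completing the proof of Lemma 4. Before closing this section, we give a proof of formula (4.38) as a lemma.

Lemma 5. For $A > 0$ and $k \ge 2$,

$$\int_{-A}^{A} d\zeta_2 \cdots \int_{-A}^{A} d\zeta_k \left[ A - \Bigl\{ 0 \vee \max_{2 \le s \le k} \zeta_s \Bigr\} + \Bigl\{ 0 \wedge \min_{2 \le s \le k} \zeta_s \Bigr\} \right]_+ = A^k.$$

Proof. We denote the left-hand side of the above formula by $I_k$. Moreover, if we set $\bar\tau_k = 0 \vee \max_{2 \le s \le k} \zeta_s$ and $\underline{\tau}_k = 0 \wedge \min_{2 \le s \le k} \zeta_s$ for brevity, then we can compute

$$\begin{aligned}
I_k &= \int \cdots \int_{\{\bar\tau_{k-1} - \underline{\tau}_{k-1} < A\}} d\zeta_2 \cdots d\zeta_{k-1} \Biggl[ \int_{\bar\tau_{k-1} - A}^{\underline{\tau}_{k-1}} \{A - (\bar\tau_{k-1} - \zeta_k)\}\, d\zeta_k \\
&\qquad + \{A - (\bar\tau_{k-1} - \underline{\tau}_{k-1})\}(\bar\tau_{k-1} - \underline{\tau}_{k-1}) + \int_{\bar\tau_{k-1}}^{\underline{\tau}_{k-1} + A} \{A - (\zeta_k - \underline{\tau}_{k-1})\}\, d\zeta_k \Biggr] \\
&= \int \cdots \int_{\{\bar\tau_{k-1} - \underline{\tau}_{k-1} < A\}} d\zeta_2 \cdots d\zeta_{k-1} \Biggl[ \int_0^{A - \bar\tau_{k-1} + \underline{\tau}_{k-1}} s\, ds + \int_0^{A - \bar\tau_{k-1} + \underline{\tau}_{k-1}} s\, ds \\
&\qquad + \{A - (\bar\tau_{k-1} - \underline{\tau}_{k-1})\}(\bar\tau_{k-1} - \underline{\tau}_{k-1}) \Biggr] \\
&= A \int \cdots \int d\zeta_2 \cdots d\zeta_{k-1}\, \{A - (\bar\tau_{k-1} - \underline{\tau}_{k-1})\}_+ \\
&= A\, I_{k-1}.
\end{aligned} \tag{4.152}$$

Thus we obtain

$$I_k = A^{k-2} I_2 = A^{k-2} \int_{-A}^{A} d\zeta_2\, (A - |\zeta_2|)_+ = A^k. \tag{4.153}$$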
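As a sanity check on Lemma 5, the integral $I_k$ can be approximated numerically for small $k$. The following is a minimal sketch, not part of the proof: it uses a plain midpoint rule on a uniform grid over $[-A, A]^{k-1}$, and the function names and the grid size `n` are assumptions chosen for illustration. The cost grows like `n**(k-1)`, so it is only practical for small $k$.

```python
import itertools


def integrand(zetas, A):
    """[A - (0 v max zeta_s) + (0 ^ min zeta_s)]_+ for a tuple of zeta values."""
    upper = max(0.0, max(zetas))
    lower = min(0.0, min(zetas))
    return max(A - upper + lower, 0.0)


def lemma5_integral(A, k, n=60):
    """Midpoint-rule approximation of I_k over the cube [-A, A]^(k-1).

    n is the number of cells per coordinate; keep it even so that the
    kink of the integrand at zeta_s = 0 falls on a cell boundary.
    """
    h = 2.0 * A / n
    nodes = [-A + (i + 0.5) * h for i in range(n)]
    total = 0.0
    for zetas in itertools.product(nodes, repeat=k - 1):
        total += integrand(zetas, A)
    return total * h ** (k - 1)


if __name__ == "__main__":
    A = 1.5
    for k in (2, 3):
        print(k, lemma5_integral(A, k), A ** k)
```

For $k = 2$ the integrand is piecewise linear with its only kink on a cell boundary, so the midpoint rule reproduces $A^2$ up to rounding; for $k = 3$ the agreement with $A^3$ is only approximate and improves as `n` grows.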


Acknowledgement. The author warmly thanks Professors Ya. Sinai and I. Kubo for their interest in this work and for continually encouraging him to publish it. He also thanks Professor T. Funaki, who made available to him a copy of Professor Sinai’s note prior to publication and Professor P. Major, who sent him a reprint of his paper, with both of which he started this work. Finally, the author is very grateful to the referee for reviewing a technical paper like this, and for giving him valuable suggestions.

References

[BT] Berry, M. and Tabor, M.: Level clustering in the regular spectrum. Proc. Roy. Soc. London Ser. A 356, 375–394 (1977)
[M] Major, P.: Poisson law for the number of lattice points in a random strip with finite area. Prob. Th. Rel. Fields 92, 423–464 (1992)
[Mi] Minami, N.: Level statistics for quantum Hamiltonians – Some preliminary ideas toward mathematical justification of the theory of Berry and Tabor. To appear in the Proceedings of the Second Congress ISAAC
[S1] Sinai, Ya.G.: Mathematical problems in the theory of quantum chaos. In: Lindenstrauss, J. and Milman, V.D. (eds.) Geometric aspects of functional analysis, Lect. Notes in Math. 1469, 1991, pp. 41–59 and in: Campbell, D.K. (ed.) Chaos, American Inst. Phys., 1990
[S2] Sinai, Ya.G.: Poisson distribution in a geometric problem. Adv. Sov. Math. 3, 199–214 (1991)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 213, 249 – 266 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

A New Variational Principle for a Nonlinear Dirac Equation on the Schwarzschild Metric

Eric Paturel

CEREMADE, Université Paris IX-Dauphine, 75775 Paris Cedex 16, France. E-mail: [email protected]

Received: 2 August 1999 / Accepted: 14 February 2000

Abstract: In this paper, we prove the existence of infinitely many solutions of a stationary nonlinear Dirac equation on the Schwarzschild metric, outside a massive ball. These solutions are the critical points of a strongly indefinite functional. Thanks to a concavity property, we are able to construct a reduced functional, which is no longer strongly indefinite. We find critical points of this new functional using the Symmetric Mountain Pass Lemma. Note that, as A. Bachelot-Motet conjectured, these solutions vanish as the radius of the massive ball tends to the horizon radius of the metric.

1. Introduction

The goal of this paper is to give a partial answer to a conjecture of A. Bachelot-Motet [2] concerning nonlinear Dirac fields on the Schwarzschild metric. Many results about the nonlinear Dirac equation with mass $m > 0$,

$$i\gamma^\mu \nabla_\mu \psi - m\psi = \lambda(\bar\psi\psi)\psi \tag{1.1}$$

have been obtained in the Minkowski space-time. This equation is related to several models in particle physics (see [11]). In [2], A. Bachelot-Motet looks for a special pattern of solutions, related to the extended fermion model proposed by Soler [12] and Wakano [14], that we call Soler-type solutions:

$$\psi(t, x) = e^{i\omega t} \begin{pmatrix} v(r) \\ 0 \\ i u(r) \cos\theta \\ i u(r) \sin\theta\, e^{i\varphi} \end{pmatrix}, \qquad \omega > 0. \tag{1.2}$$

In the Minkowski space-time, the existence of infinitely many Soler-type solutions (1.2) was established for a wide class of nonlinearities, first by using shooting methods


[7, 4, 3, 10], then by a variational method [8]. The existence of such solutions in the Schwarzschild metric

$$g_{\mu\nu}\, dx^\mu dx^\nu = \left(1 - \frac{2M}{r}\right) dt^2 - \left(1 - \frac{2M}{r}\right)^{-1} dr^2 - r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2) \tag{1.3}$$

was then investigated by A. Bachelot-Motet. She proved [2] that there is no Soler-type solution with finite energy on the Schwarzschild manifold $\mathbb{R}_t \times (2M, \infty)_r \times S^2$ and carried out numerical experiments to exhibit such solutions outside a massive sphere of radius $r_0 > 1$, with MIT-bag boundary conditions. In this case, (1.1) becomes a nonautonomous planar differential system in $(v, u)$,

$$f u' + \frac{u}{r}\{f + f^{1/2}\} = v\left[\lambda(v^2 - u^2) - (f^{1/2} m - \omega)\right], \tag{1.4}$$

$$f v' + \frac{v}{r}\{f - f^{1/2}\} = u\left[\lambda(v^2 - u^2) - (f^{1/2} m + \omega)\right], \tag{1.5}$$

where $f(r) = 1 - \frac{1}{r}$, taking the normalization $2M = 1$. The MIT-bag boundary condition becomes the following initial data condition:

$$u(r_0) = -v(r_0). \tag{1.6}$$
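To make the system (1.4)–(1.6) concrete, here is a minimal numerical sketch of a shooting-type integration. It is not the scheme used in [2], whose details are not given here: the classical fourth-order Runge–Kutta step, the function names, and the parameter values `lam`, `m`, `omega` are all assumptions chosen for illustration. The right-hand side is obtained by solving (1.4) and (1.5) for $u'$ and $v'$, and the initial data are $v(r_0) = x$, $u(r_0) = -x$.

```python
import math


def f(r):
    """Metric factor f(r) = 1 - 1/r (normalization 2M = 1), valid for r > 1."""
    return 1.0 - 1.0 / r


def rhs(r, u, v, lam, m, omega):
    """Return (u', v') obtained by solving (1.4) and (1.5) for the derivatives."""
    fr = f(r)
    sf = math.sqrt(fr)
    nonlin = lam * (v * v - u * u)
    du = (v * (nonlin - (sf * m - omega)) - (u / r) * (fr + sf)) / fr
    dv = (u * (nonlin - (sf * m + omega)) - (v / r) * (fr - sf)) / fr
    return du, dv


def rk4_step(r, u, v, h, lam, m, omega):
    """One classical Runge-Kutta step of size h."""
    k1u, k1v = rhs(r, u, v, lam, m, omega)
    k2u, k2v = rhs(r + h / 2, u + h / 2 * k1u, v + h / 2 * k1v, lam, m, omega)
    k3u, k3v = rhs(r + h / 2, u + h / 2 * k2u, v + h / 2 * k2v, lam, m, omega)
    k4u, k4v = rhs(r + h, u + h * k3u, v + h * k3v, lam, m, omega)
    u_new = u + h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
    v_new = v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return u_new, v_new


def shoot(x, r0, r_end, steps, lam, m, omega):
    """Integrate from r0 > 1 with MIT-bag data v(r0) = x, u(r0) = -x."""
    h = (r_end - r0) / steps
    r, u, v = r0, -x, x
    trajectory = [(r, u, v)]
    for _ in range(steps):
        u, v = rk4_step(r, u, v, h, lam, m, omega)
        r += h
        trajectory.append((r, u, v))
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for r, u, v in shoot(0.1, 2.0, 2.5, 5, lam=1.0, m=1.0, omega=0.5):
        print(f"{r:.2f} {u:+.6f} {v:+.6f}")
```

Counting the sign changes of `u` and `v` along such a trajectory, as a function of the shooting parameter `x`, is the kind of experiment on which a zero-counting conjecture can be based.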

These numerical experiments led A. Bachelot-Motet to the following conjecture:

Conjecture 1.1 ([2]). For every $r_0 > 1$ there is an integer $N(r_0)$ and an increasing sequence $(x_n)_{n \ge N(r_0)}$ of strictly positive numbers such that if we denote by $(v_n, u_n)$ the solution of (1.4–1.6) with $v(r_0) = x_n$, we have for every $n \ge N(r_0)$:
1. the solution $(v_n, u_n)$ is defined on $[r_0, \infty)$;
2. the function $v_n$ (respectively $u_n$) has exactly $n$ (respectively $n + 1$) zeros in $[r_0, \infty)$;
3. $(v_n, u_n)$ converges to $(0, 0)$ when $r$ goes to infinity.
Moreover $N(r_0) = 0$ for $r_0$ large enough, and $N(r_0)$ tends to infinity when $r_0$ goes to 1.

In this paper, we take advantage of the variational structure of the system (1.4–1.5). In [8], Esteban and Séré found infinitely many solutions of a nonlinear Dirac equation by an infinite dimensional linking. Our method is different and simpler. Thanks to a concavity argument, we may replace the initial functional $F_{r_0}$ with a reduced one, called $G_{r_0}$, which is even and has 0 as a trivial critical point. This argument is inspired by the works [1, 5], and [6]. We can use a Mountain–Pass-type method based on the Krasnoselski genus (see [13] for a complete reference) to find infinitely many critical points. The property used here, called in [13] the Symmetric Mountain Pass Lemma, is equivalent to a finite dimensional linking. As $r_0$ goes to 1, the lowest min-max values converge to 0 by pairs, creating negative directions for the Hessian of the functional at 0. More precisely, $s_n := \frac{1}{\sqrt{2}}(u_n + v_n)$ is found as a critical point of the functional $G_{r_0}$, whose precise definition will be given in (2.8). We obtain the following result:

Theorem 1.2. For every $r_0 > 1$, given $\omega \in \left( \frac{f(r_0)^{1/2}}{2 r_0^2},\, m \right)$, there exists an integer $N(r_0)$ and a sequence of non-trivial solutions of (1.4–1.6), denoted by $(v_n, u_n)_{n \ge N(r_0)}$ such that, for every $n \ge N(r_0)$:

(i) $(v_n, u_n)$ is defined on $[r_0, \infty)$ and converges exponentially to $(0, 0)$ when $r$ goes to infinity;


(ii) $s_n = \frac{1}{\sqrt{2}}(u_n + v_n)$ has Morse index at most $n + 1$ relative to the functional $G_{r_0}$ described in (2.8).

Moreover $N(r_0) = 0$ for $r_0$ large enough, and $N(r_0)$ tends to infinity when $r_0$ goes to 1.

Remark 1.3. There should be a relationship between the Morse index of $s_n$ and the number of zeros of $u_n$ and $v_n$: this is an open problem.

To our knowledge, this is the first Mountain–Pass-type characterization of critical points for non-linear Dirac functionals. The limitation $\omega > \frac{f(r_0)^{1/2}}{2 r_0^2}$ is technical; it is useful in an a priori estimate (see Lemma 4.1). Note that in her computations, Bachelot-Motet always considered the case $\omega > \frac{1}{2}$, which is covered by our assumption.

Section 2 is devoted to the introduction of the variational framework, in particular the new functional $G_{r_0}$. In Sect. 3, we solve suitable approximate problems, by means of the Symmetric Mountain Pass Lemma. In Sect. 4, the convergence of the solutions found in Sect. 3 to solutions of the exact problem is proved, using a Pohozaev-type estimate. The proof of some important lemmas is postponed to Sect. 5.

2. A New Variational Principle for the Nonlinear Dirac Equation

We now define the variational framework. In order to insert the initial data condition (1.6) in the functional scheme, we change variables $(v, u)$ symplectically into $(s, d)$ with

$$s = \frac{1}{\sqrt{2}}(u + v), \qquad d = \frac{1}{\sqrt{2}}(u - v).$$

System (1.4–1.5) is now

$$f d' + \left( \frac{f}{r} - m f^{1/2} \right) d + \left( \frac{f^{1/2}}{r} - \omega \right) s = 2\lambda s d^2, \tag{2.1}$$

$$f s' + \left( \frac{f}{r} + m f^{1/2} \right) s + \left( \frac{f^{1/2}}{r} + \omega \right) d = -2\lambda s^2 d, \tag{2.2}$$

and condition (1.6) then becomes

$$s(r_0) = 0. \tag{2.3}$$

We introduce the following functional: $F_{r_0}(s, d) = A_{r_0}(s, d) + B_{r_0}(s, d) + C_{r_0}(s, d)$, with

$$A_{r_0}(s, d) = \int_{r_0}^{\infty} \left[ -s' d - \frac{s d}{r} + \frac{f^{-1/2}}{2r}(s^2 - d^2) \right] r^2\, dr,$$

$$B_{r_0}(s, d) = \int_{r_0}^{\infty} \left[ -m f^{-1/2} s d - \frac{1}{2} \omega f^{-1} (s^2 + d^2) \right] r^2\, dr,$$

$$C_{r_0}(s, d) = \int_{r_0}^{\infty} -\lambda f^{-1} s^2 d^2\, r^2\, dr.$$
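The passage from (1.4–1.5) to (2.1–2.2) is a linear recombination of the two equations, which can be verified numerically at arbitrary sample points. In the sketch below (function names and sample values are hypothetical), each equation is written as a residual "left-hand side minus right-hand side"; with $s = (u+v)/\sqrt{2}$ and $d = (u-v)/\sqrt{2}$, and the same rotation applied to the derivatives, the sum of the residuals of (1.4) and (1.5) should equal $\sqrt{2}$ times the residual of (2.2), and their difference $\sqrt{2}$ times the residual of (2.1).

```python
import math


def f(r):
    """Metric factor f(r) = 1 - 1/r (normalization 2M = 1)."""
    return 1.0 - 1.0 / r


def residuals_uv(u, v, du, dv, r, lam, m, omega):
    """Residuals (lhs - rhs) of (1.4) and (1.5) at a point."""
    fr, sf = f(r), math.sqrt(f(r))
    nonlin = lam * (v * v - u * u)
    r14 = fr * du + (u / r) * (fr + sf) - v * (nonlin - (sf * m - omega))
    r15 = fr * dv + (v / r) * (fr - sf) - u * (nonlin - (sf * m + omega))
    return r14, r15


def residuals_sd(s, d, ds, dd, r, lam, m, omega):
    """Residuals (lhs - rhs) of (2.1) and (2.2) at a point."""
    fr, sf = f(r), math.sqrt(f(r))
    r21 = fr * dd + (fr / r - m * sf) * d + (sf / r - omega) * s - 2 * lam * s * d * d
    r22 = fr * ds + (fr / r + m * sf) * s + (sf / r + omega) * d + 2 * lam * s * s * d
    return r21, r22


def to_sd(u, v):
    """Rotation (u, v) -> (s, d) = ((u + v)/sqrt(2), (u - v)/sqrt(2))."""
    c = 1.0 / math.sqrt(2.0)
    return c * (u + v), c * (u - v)


if __name__ == "__main__":
    # Arbitrary sample point; the values carry no physical meaning.
    u, v, du, dv, r = 0.3, -0.7, 1.1, 0.4, 2.5
    lam, m, omega = 1.3, 0.9, 0.6
    r14, r15 = residuals_uv(u, v, du, dv, r, lam, m, omega)
    s, d = to_sd(u, v)
    ds, dd = to_sd(du, dv)
    r21, r22 = residuals_sd(s, d, ds, dd, r, lam, m, omega)
    print(r22 - (r14 + r15) / math.sqrt(2), r21 - (r14 - r15) / math.sqrt(2))
```

The same rotation sends the MIT-bag data $u(r_0) = -v(r_0)$ to $s(r_0) = 0$, which is why the boundary condition can be built into the function space for $s$.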


Functional $F_{r_0}$ is well defined and continuously differentiable in the Hilbert space $E = H \times L = H_0^1((r_0, +\infty), r^2 dr) \times L^2((r_0, +\infty), r^2 dr)$. We will use the following notations for Sobolev spaces of functions defined on $(r_0, +\infty)$ corresponding to radial functions on $\mathbb{R}^3 \setminus B_{r_0}$: given an integer $s \ge 1$ and $1 \le p \le \infty$, we set

$$W^{s,p}((r_0, +\infty), r^2 dr) \equiv W^{s,p}_{rad},$$

$$W_0^{s,p}((r_0, +\infty), r^2 dr) \equiv W^{s,p}_{0,rad},$$

$$L^p((r_0, +\infty), r^2 dr) \equiv L^p_{rad}.$$

By classical arguments, we have the following compact Sobolev embeddings:

$$\forall p < q, \qquad W^{1,p}_{rad} \hookrightarrow L^q_{rad}.$$

Moreover, by a straightforward computation, we get the following property:

$$\forall s \in W^{1,p}_{rad},\ \exists K \in \mathbb{R} \text{ such that } \forall r \in (r_0, \infty),\quad s(r) \le \frac{K}{r^{2/p}}. \tag{2.4}$$

A simple computation shows that critical points of $F_{r_0}$ give rise to the solutions of (2.1–2.2), satisfying boundary conditions (2.3) and

$$\lim_{r \to +\infty} s(r) = \lim_{r \to +\infty} d(r) = 0. \tag{2.5}$$
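Half of this "simple computation" can be illustrated pointwise. Reading the integrand of $F_{r_0}$ (without the weight $r^2$) as a density depending on $s$, $s'$, $d$ and $r$ but not on $d'$, stationarity in $d$ is a pointwise condition, and it should reproduce (2.2) up to the factor $-1/f$. The sketch below checks this with a central finite difference; the form of the density, with the derivative falling on $s$, is an assumption about how the integrands of $A_{r_0}$, $B_{r_0}$, $C_{r_0}$ are to be read, and all names and sample values are hypothetical. The equation (2.1), obtained by varying $s$, requires an integration by parts and is not covered here.

```python
import math


def f(r):
    """Metric factor f(r) = 1 - 1/r (normalization 2M = 1)."""
    return 1.0 - 1.0 / r


def density(s, ds, d, r, lam, m, omega):
    """Integrand of A + B + C without the weight r^2 (assumed reading)."""
    fr = f(r)
    inv_sf = 1.0 / math.sqrt(fr)
    a_part = -ds * d - s * d / r + inv_sf / (2.0 * r) * (s * s - d * d)
    b_part = -m * inv_sf * s * d - 0.5 * omega / fr * (s * s + d * d)
    c_part = -lam / fr * s * s * d * d
    return a_part + b_part + c_part


def residual_2_2(s, ds, d, r, lam, m, omega):
    """Left-hand side minus right-hand side of (2.2)."""
    fr, sf = f(r), math.sqrt(f(r))
    return fr * ds + (fr / r + m * sf) * s + (sf / r + omega) * d + 2 * lam * s * s * d


def d_density_dd(s, ds, d, r, lam, m, omega, h=1e-4):
    """Central difference in d; exact up to rounding since the density is quadratic in d."""
    plus = density(s, ds, d + h, r, lam, m, omega)
    minus = density(s, ds, d - h, r, lam, m, omega)
    return (plus - minus) / (2.0 * h)


if __name__ == "__main__":
    args = (0.4, -0.9, 0.7, 3.0, 1.2, 0.8, 0.5)  # s, s', d, r, lam, m, omega
    print(d_density_dd(*args), -residual_2_2(*args) / f(args[3]))
```

The two printed numbers agree up to rounding, so the pointwise condition in $d$ vanishes exactly where the residual of (2.2) does.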

In order to find such critical points, we point out that the functional $F_{r_0}$ has a concavity property in the $d$ direction. Indeed, we get

$$F_{r_0}(s, d) = \int_{r_0}^{\infty} \left[ -\left( \frac{f^{-1/2}}{2r} + \frac{\omega f^{-1}}{2} + \lambda f^{-1} s^2 \right) d^2 - \left( s' + \left( \frac{1}{r} + m f^{-1/2} \right) s \right) d + \left( \frac{f^{-1/2}}{2r} - \frac{\omega f^{-1}}{2} \right) s^2 \right] r^2\, dr. \tag{2.6}$$

Thus, given $s \in H$, there exists a unique $d^*(s) \in L$ such that

$$\frac{\partial F_{r_0}}{\partial d}(s, d^*(s)) = 0.$$

This property allows us to introduce the new functional $G_{r_0}$,

$$G_{r_0}(s) = 2 F_{r_0}(s, d^*(s)) \tag{2.7}$$

$$= \int_{r_0}^{\infty} \left[ \frac{\left( s' + \left( \frac{1}{r} + m f^{-1/2} \right) s \right)^2}{\frac{f^{-1/2}}{r} + 2\lambda f^{-1} s^2 + \omega f^{-1}} + \left( \frac{f^{-1/2}}{r} - \omega f^{-1} \right) s^2 \right] r^2\, dr, \tag{2.8}$$

where $G_{r_0}$ is a $C^2$ and even functional. The main interest of $G_{r_0}$ is that this functional is no longer strongly indefinite. The reduction of the functional thanks to concavity properties is inspired by the works [1, 5] and [6]. Now we may use the Symmetric Mountain Pass Lemma:


Proposition 2.1. Let $E$ be an infinite dimensional Banach space and let $I \in C^1(E, \mathbb{R})$ be even, satisfy the Cerami condition, and $I(0) = 0$. If $E = V \oplus X$, where $V$ is finite dimensional, and $I$ satisfies
(i) there are constants $\alpha, \rho > 0$ such that $I|_{\partial B_\rho \cap X} \ge \alpha$, and
(ii) there exists a sequence of subspaces $(X_k)_{k \in \mathbb{N}} \subset X$ with finite dimension $k$, such that $X_k \subset X_{k+1}$, and $R_k \ge 0$ such that, if we denote by $E_k$ the finite dimensional subspace $X_k \oplus V$, we have $I \le 0$ on $E_k \setminus B_{R_k}$,
then $I$ possesses an unbounded sequence of critical values, defined by:

$$c_k = \inf_{h \in \Gamma_k} \max_{u \in E_k} I(h(u)),$$

with the invariant classes $\Gamma_k$ defined by

$$\Gamma_k = \{ h \in C^0(E, E),\ h \text{ odd},\ \forall j \le k,\ u \in E_j,\ \|u\| \ge R_j \Rightarrow h(u) = u \}.$$

Unfortunately, we do not know if $G_{r_0}$ satisfies the Cerami condition. So we will study approximate functionals, as in [8].

3. Solution of Approximate Problems

Given $p > 4$ and $\varepsilon > 0$, we define now the following functionals:

$$F_{r_0,\varepsilon}(s, d) = F_{r_0}(s, d) - \varepsilon \int_{r_0}^{\infty} s^p\, r^2\, dr,$$

$$G_{r_0,\varepsilon}(s) = 2 F_{r_0,\varepsilon}(s, d^*(s)) = G_{r_0}(s) - 2\varepsilon \int_{r_0}^{\infty} s^p\, r^2\, dr.$$

These functionals are $C^2$ respectively in $H \times L$ and $H$. We have to prove a compactness result (i.e. convergence, up to a subsequence, of Cerami sequences) for these approximate functionals; then, we need to verify that functional $G_{r_0,\varepsilon}$ satisfies the assumptions of Proposition 2.1. First we check some properties of the operators involved in the system (2.1–2.2).

Lemma 3.1. With the notations

$$L_1 : d \mapsto d' + \left( \frac{1}{r} - m f^{-1/2}(r) \right) d, \qquad L_2 : s \mapsto s' + \left( \frac{1}{r} + m f^{-1/2}(r) \right) s,$$

we have the following properties:
(i) $L_1$ is an invertible operator from $L$ to $H'$, where $H'$ denotes the dual space of $H$, and its inverse is bounded;
(ii) Given an integer $m \ge 1$ and $1 \le q \le \infty$, $L_1$ is an invertible bounded operator from $W^{m,q}_{rad}$ onto $W^{m-1,q}_{rad}$, and its inverse is bounded;
(iii) Given an integer $m \ge 1$ and $1 \le q \le \infty$, $L_2$ is an invertible bounded operator from $W^{m,q}_{0,rad}$ onto $W^{m-1,q}_{rad}$, and its inverse is bounded.


Proof. It is clear, by classical differential operator theory, that $L_1$ and $L_2$ are bounded operators with respect to the spaces defined in the lemma. To obtain invertibility, we make use of standard tools, including the variation of constants and Fubini's formula. For example, we obtain the following expression for $d$, if $L_1(d) = \mu \in H'$:

\[
d(r) = -\int_r^{\infty} \mu(\tau)\, e^{G(\tau)-G(r)}\,d\tau, \quad \text{with } G'(r) = \frac{1}{r} - m f^{-1/2}(r).
\]

One easily checks that $d$ belongs to $L$, and, using the Fubini and Cauchy-Schwarz formulae, we get a uniform estimate $\|d\|_L \le K\|\mu\|_{H'}$, which implies the boundedness of $L_1^{-1}$ from $H'$ to $L$. Similar computations give the results (ii) and (iii).

Lemma 3.2. Given $r_0 > 1$ and $\varepsilon > 0$, the functional $G_{r_0,\varepsilon}$ satisfies the Cerami condition, i.e. any sequence $(s_n)$ satisfying

\[
\lim_{n\to+\infty} G_{r_0,\varepsilon}(s_n) = l \in \mathbb{R}, \qquad
\lim_{n\to+\infty} (1 + \|s_n\|_H)\, \|G'_{r_0,\varepsilon}(s_n)\|_{H'} = 0
\]

has a convergent subsequence. Moreover, the limit, denoted by $s$, satisfies an exponential decay estimate: given any $q \ge 2$, we have

\[
\|s\|_{W^{1,q}((R,\infty),\, r^2 dr)} \le K e^{-\sigma_q R}, \tag{3.1}
\]

and the same estimate holds for the associated $d = d^*(s) \in W^{1,q}_{rad}$, for $q \ge 2$.

Proof. The proof is postponed to Sect. 5.

To check that the assumptions of Proposition 2.1 are satisfied, we need the following lemma:

Lemma 3.3. Given $0 < \omega < m$, there exists a non-increasing integer-valued function $N : (1,+\infty) \to \mathbb{N}$ which satisfies the following conditions:

• $\lim_{r_0\to 1} N(r_0) = +\infty$ and there exists $\bar r_0 \in (1,+\infty)$ such that $N(\bar r_0) = 0$;
• there exists a finite dimensional subspace $V(r_0) \subset H$, with dimension $N(r_0)$, and $P(r_0)$ an orthogonal complement of $V(r_0)$ in $H$ such that for every $\varepsilon > 0$, $G''_{r_0,\varepsilon}(0)$ is a positive definite quadratic form on $P(r_0)$, and there exists $\alpha(r_0) > 0$ such that

\[
G''_{r_0,\varepsilon}(0).X.X \ge \alpha(r_0)\|X\|^2, \quad \forall X \in P(r_0).
\]

Proof. This result is based on the computation of the second order derivative of $G_{r_0}$ and a classical rescaling argument. For details, see Sect. 5.

Thanks to this result, we see that condition (i) is fulfilled if we put $V = V(r_0)$, $X = P(r_0)$. Condition (ii) is a consequence of the following lemma:

Lemma 3.4. There exist sequences $(X_k)_{k\in\mathbb{N}} \subset X$ with $\dim X_k = k$, such that $X_k \subset X_{k+1}$, and $(R_k)_{k\in\mathbb{N}} \ge 0$ such that, if we denote by $E_k$ the finite dimensional subspace $X_k \oplus V$, we have $G_{r_0} \le 0$ on $E_k \setminus B(0,R_k)$.


Proof. As $V = V(r_0)$ is finite dimensional, the unit ball in $V$ is compact for the strong topology induced from $H$. This leads to the following uniform estimate: given $\eta > 0$ there exists $R > 0$ such that

\[
\forall v \in V, \quad \int_R^{\infty} (v^2 + v'^2)\, r^2\,dr \le \eta \|v\|_H^2.
\]

Let $\varphi_1, \ldots, \varphi_k, \ldots$ be a basis of $W_0^{1,2}((R,\infty), r^2 dr)$. We may suppose that every $\varphi_k$ is of class $C^2$. Extending by zero on the interval $(r_0, R)$, we consider these functions as elements of $H$. Let $\tilde X_k = \mathrm{Span}(\varphi_1, \ldots, \varphi_k)$. We denote by $P_X$ (resp. $P_V$) the projection on $X$ (resp. $V$) with respect to $V$ (resp. $X$) and we get, for $f \in \tilde X_k$,

\[
\|f - P_X f\|_H^2 = (f, P_V f)_H = \int_R^{\infty} \bigl( f\, P_V f + f' (P_V f)' \bigr) r^2\,dr
\le \|f\|_H \Bigl( \int_R^{\infty} \bigl( (P_V f)^2 + ((P_V f)')^2 \bigr) r^2\,dr \Bigr)^{1/2}
\le \sqrt{\eta}\, \|f\|_H^2.
\]

Then, there exists $\eta_0 > 0$ such that, given any $\eta \le \eta_0$, the projector $P_X$ is close enough to the identity to be an isomorphism. We then put $X_k = P_X(\tilde X_k)$. Then $\dim X_k = k$, and, given any $g \in X_k$, with $f = P_X^{-1}(g)$,

\[
\int_{r_0}^{R} (g^2 + g'^2)\, r^2\,dr = \int_{r_0}^{R} \bigl( g (f - P_V f) + g' (f - P_V f)' \bigr) r^2\,dr
= \int_R^{\infty} \bigl( g\, P_V f + g' (P_V f)' \bigr) r^2\,dr
\le 2\eta^{1/2} \|g\|_H^2. \tag{3.2}
\]

Since the elements of $V$ are eigenvectors for the operator $G''_{r_0}(0)$ from $H$ to $H'$, they satisfy a second order differential equation with smooth coefficients (see Sect. 5.2; the coefficient of the second order derivative is $\varphi_2$, which is positive), so they are twice continuously differentiable. Moreover, as we supposed each $\varphi_i$ to be of class $C^2$, their projection on $X$ is also $C^2$, and $X_k$ is a finite dimensional subspace of $C^2$ functions.

We fix $k \in \mathbb{N}$. We prove by contradiction that $E_k$ satisfies the properties given in the lemma and suppose that there exists a sequence $(\mu_n)_{n\in\mathbb{N}}$ going to infinity and a sequence $(s_n)_{n\in\mathbb{N}}$ in the unit sphere of $E_k$, such that, for every $n \in \mathbb{N}$, $G_{r_0}(\mu_n s_n) \ge 0$. According to (2.8), the function $\mu \mapsto \frac{1}{\mu^2} G_{r_0}(\mu s)$ is nonincreasing. So we infer

\[
\forall \mu \le \mu_n, \quad G_{r_0}(\mu s_n) \ge 0. \tag{3.3}
\]

Moreover, as the unit sphere in $E_k$ is finite dimensional, we may extract from $(s_n)$ a converging subsequence, whose limit is a $C^2$ function denoted by $\bar s$. Passing to the limit in (3.3), we get

\[
\forall \mu \ge 1, \quad \frac{1}{\mu^2} G_{r_0}(\mu \bar s) \ge 0.
\]


The equivalence of all norms in $E_k$ implies the equivalence of the $L^2_{rad}$-norm and the $H$-norm. This and (3.2) imply that the new function

\[
H(\mu) = \int_{r_0}^{\infty} \frac{\bigl( \bar s' + \bigl( \frac{1}{r} + m f^{-1/2} \bigr) \bar s \bigr)^2}{\frac{f^{-1/2}}{r} + \omega f^{-1} + 2\lambda f^{-1} \mu^2 \bar s^2}\, r^2\,dr
\]
\[
\ge \int_{r_0}^{\infty} \Bigl( \omega f^{-1} - \frac{f^{-1/2}}{r} \Bigr) \bar s^2 r^2\,dr
\]
\[
\ge \int_{r_0}^{R} \Bigl( \omega f^{-1} - \frac{f^{-1/2}}{r} \Bigr) \bar s^2 r^2\,dr + \int_R^{\infty} \Bigl( \omega f^{-1} - \frac{f^{-1/2}}{r} \Bigr) \bar s^2 r^2\,dr
\]
\[
\ge (-K\eta^{1/2} + K') \|\bar s\|_H
\]

is bounded from below by a positive constant (here $K$ and $K'$ are positive constants), provided $\eta$ is small enough. We claim this is impossible because of Lebesgue's Dominated Convergence Theorem. Indeed, putting $\mu = n$ and denoting by $h_n(r)$ the integrand of the last functional, we prove that $h_n$ converges strongly to 0 in $L^1_{rad}$. We get trivially, for every $r \in (r_0,\infty)$, $h_\mu(r) \le h_1(r)$ and $h_1 \in L^1_{rad}$. Now $h_\mu$ converges almost everywhere to 0 as $\mu$ goes to infinity. Indeed, define $\mathcal{H} = \{ r \in (r_0,\infty) \,/\, \bar s(r) \ne 0 \text{ or } \bar s(r) = \bar s'(r) = 0 \}$. It is clear that, given any $r \in \mathcal{H}$, $h_n(r)$ goes to 0 as $n$ goes to infinity. Moreover, the complement of $\mathcal{H}$ in $(r_0,\infty)$ is the set $\{ r \in (r_0,\infty) \,/\, \bar s(r) = 0 \text{ and } \bar s'(r) \ne 0 \}$, which is a union of isolated points, since $\bar s$ is a $C^2$ function; hence its Lebesgue measure is zero, and the contradiction is obtained.

Since $G_{r_0,\varepsilon}(s) \le G_{r_0}(s)$ for every $\varepsilon \ge 0$ and $s \in H$, we obtain that $G_{r_0,\varepsilon}$ satisfies the property of Lemma 3.4 uniformly in $\varepsilon$. So we may use Proposition 2.1 with the functionals $G_{r_0,\varepsilon}$: this gives rise to a sequence $(s_n^\varepsilon)_{n>N(r_0)}$ of critical points for $G_{r_0,\varepsilon}$, whose corresponding critical values are unbounded. Thanks to the concavity property, we can associate a $d_n^\varepsilon \in L$ to each $s_n^\varepsilon$, such that $(s_n^\varepsilon, d_n^\varepsilon)$ is a critical point for $F_{r_0,\varepsilon}$, for any $n > N(r_0)$. Moreover, since the constants $\alpha$, $\rho$ and $R_k$ do not depend on $\varepsilon$, we get a uniform estimate for the critical levels of the perturbed functionals: there exist two sequences $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ going to $+\infty$ as $n$ goes to infinity, such that, given any $\varepsilon > 0$, we get

\[
a_n \le F_{r_0,\varepsilon}(s_n^\varepsilon, d_n^\varepsilon) \le b_n. \tag{3.4}
\]

The next section is devoted to finding uniform estimates on the corresponding critical points in order to pass to the limit as $\varepsilon$ goes to 0, and to proving that this limiting procedure leads to an infinite sequence of solutions of (1.4–1.5).


4. Limiting Procedure A priori estimates for the sequence (snε , dnε ) are necessary to pass to the limit as ε goes to 0. The following one is the consequence of a Pohozaev-type inequality: 1 2

Lemma 4.1. Given ω ∈ ( f (r2r02) , m), ε ∈ (0, ε¯ ) and n > N (r0 ), we have 0

−1 − 21 r f (r ) 0 0 − Br0 (snε , dnε ) ≥ r0 − 1 2ωr02 1 3

∞ − 21 (r0 )f − 2 r 2 f −2 r ε ε f − 21 2 −Cbn + + f r dr . msn dn − 2 2ωr02 r0

Proof. This inequality is obtained in Sect. 5.

Fix $n > N(r_0)$ and let $(\varepsilon_q)_{q\in\mathbb{N}}$ be a sequence of positive real numbers converging to 0. We denote by $W$ the Hilbert space $(W^{1,2}_{rad})^2$. To show that $(s_n^{\varepsilon_q}, d_n^{\varepsilon_q})$ is bounded in $W$, hence in $E$, we proceed by contradiction, and suppose that, up to a subsequence,

\[
\|(s_n^{\varepsilon_q}, d_n^{\varepsilon_q})\|_W \to \infty.
\]

In order to simplify the notation, we omit the dependence on $n$ and we just write $(s_q, d_q)$ for $(s_n^{\varepsilon_q}, d_n^{\varepsilon_q})$. From Lemma 3.2, $(s_q, d_q)$ is a strong solution, i.e. a solution in any $W^{1,s}_{rad}$, for $s \ge 2$, of the system

\[
L_1(d) = \Bigl( \omega f^{-1} - \frac{f^{-1/2}}{r} \Bigr) s + 2\lambda f^{-1} s d^2 + \varepsilon_q p\, s^{p-1}, \tag{4.1}
\]
\[
L_2(s) = \Bigl( -\omega f^{-1} - \frac{f^{-1/2}}{r} \Bigr) d - 2\lambda f^{-1} s^2 d. \tag{4.2}
\]

We denote by $(\bar s_q, \bar d_q)$ the normalization of this sequence, i.e.:

\[
\bar s_q(r) = \frac{s_q(r)}{\|(s_q,d_q)\|_W} \quad \text{and} \quad \bar d_q(r) = \frac{d_q(r)}{\|(s_q,d_q)\|_W}.
\]

Then $(\bar s_q, \bar d_q)$ is a bounded sequence in $W$, which moreover satisfies the exponential decay estimates (3.1), as $(s_q, d_q)$ satisfies it. Hence this sequence is precompact in every $L^q_{rad}$, for $q \ge 2$. Now,

\[
C_{r_0}(\bar s_q, \bar d_q) = \frac{C_{r_0}(s_q, d_q)}{\|(s_q,d_q)\|_W^4} \to 0 \quad \text{as } q \to +\infty.
\]

Keeping the same notations for a converging subsequence of $(\bar s_q, \bar d_q)$, this implies that the function $\bar s_q \bar d_q$ converges to 0 in $L^\alpha_{rad}$, given any $\alpha \ge 1$. Using Lemma 4.1, we get

\[
\int_{r_0}^{\infty} (s_q^2 + d_q^2)\, r^2\,dr \le K b_n - K' \int_{r_0}^{\infty} \zeta(r)\, s_q d_q\, r^2\,dr,
\]


with $K, K'$ positive constants and $\zeta$ a bounded function independent of $q$. The normalization implies

\[
\int_{r_0}^{\infty} (\bar s_q^2 + \bar d_q^2)\, r^2\,dr \le \frac{K b_n}{\|(s_q,d_q)\|_W^2} - K' \int_{r_0}^{\infty} \zeta(r)\, \bar s_q \bar d_q\, r^2\,dr.
\]

Taking the limit for $q \to +\infty$, we find that $\bar s_q$ and $\bar d_q$ converge strongly to 0 in $L^2_{rad}$. The normalization in $W$ implies that $\bar s_q$ and $\bar d_q$ also converge to 0 in $L^s_{rad}$, for $2 \le s < \infty$, and that the sequence $(\bar s_q, \bar d_q)$ is bounded in $L^\infty$. Hence we may pass to the limit in system (4.1–4.2), and, thanks to the properties of $L_1$ and $L_2$ in Lemma 3.1, $\bar s_q$ and $\bar d_q$ strongly converge to 0 in $W^{1,2}_{rad}$, which is impossible, because of the normalization in $W$.

So we have proved that $(s_n^{\varepsilon_q}, d_n^{\varepsilon_q})$ is bounded in $W$, hence precompact in any $L^s_{rad}$, with $s > 2$. We denote by $(s_n^*, d_n^*)$ its limit, up to a subsequence. We apply now a bootstrap argument: writing $s_n^{\varepsilon_q} = s_n^* + \delta_q^1$ and $d_n^{\varepsilon_q} = d_n^* + \delta_q^2$, and passing to the limit as $q$ goes to infinity in (4.1–4.2), we get $(s_n^*, d_n^*) \in W^{1,s}_{rad}$, for any $s \ge 2$. As the derivatives $((s_n^{\varepsilon_q})', (d_n^{\varepsilon_q})')$ are bounded in any $L^s_{rad}$, with $s \ge 2$, this implies that $(s_n^{\varepsilon_q}, d_n^{\varepsilon_q})$ actually converges to $(s_n^*, d_n^*)$ in $W$. Then $(s_n^*, d_n^*)$ is a critical point for $F_{r_0}$ with critical value denoted by $\frac{1}{2} c_n^* = F_{r_0}(s_n^*, d_n^*) > 0$.

We have to prove that the limiting procedure given here leads to an infinite number of critical points for $F_{r_0}$ and $G_{r_0}$. To this aim, we follow a method used in [8] for a similar problem. If the sequence $c_n^*$ is strictly increasing, the proof is complete. Let us assume therefore that there exists $\bar n$ such that $c_{\bar n}^* = c_{\bar n+1}^* = c$, and let us denote by $K_c$ the set of critical points of $G_{r_0}$ at the level $c$. If $K_c$ is not compact, its cardinality is infinite, and the proof is over. To treat the remaining possibility, we need the following lemma:

Lemma 4.2. Suppose that $c_{\bar n}^* = c_{\bar n+1}^* = c$ and that $K_c$ is compact. Then

(i) Let $\mathcal{U}$ and $\mathcal{V}$ be two bounded open sets in $H$ such that $K_c \subset \mathcal{U} \subset \bar{\mathcal{U}} \subset \mathcal{V}$. Then there exist constants $\delta_1, \alpha, \varepsilon_1 > 0$ such that for all $\varepsilon \in (0, \varepsilon_1]$, for all $s \in (\mathcal{V} \setminus \mathcal{U}) \cap \{ s \,/\, G_{r_0}(s) \in [c - \delta_1, c + \delta_1] \}$, we have $\|G'_{r_0}(s)\|_{H'} \ge \alpha$.
(ii) For all $\varepsilon \in (0, \varepsilon_1]$, if $s \notin \mathcal{U}$ and if $G_{r_0}(s) \in [c - \delta_1, c + \delta_1]$, then $\|G'_{r_0}(s)\|_{H'} \ge \beta(\varepsilon) > 0$.

This lemma has the following consequence (whose proof is standard and will be omitted):

Corollary 4.3. Under the above assumptions, there exists $\delta_2 > 0$ such that for any $0 < \varepsilon \le \varepsilon_1$ there is a time $T(\varepsilon) > 0$ with

\[
G_{r_0}(h_\varepsilon(T(\varepsilon), s)) \le c - \delta_2, \tag{4.3}
\]

for all $s \notin \mathcal{V}$ such that $G_{r_0}(s) \le c + \delta_2$, where we denote by $h_\varepsilon$ the flow induced by $-\nabla G_{r_0}$.

Proof of Lemma 4.2. As $\mathcal{V}$ is bounded in $H$, and $H$ is compactly embedded in every $L^q_{rad}$, for $q > 2$, the restriction of $G_{r_0}$ to the domain $\mathcal{V} \setminus \mathcal{U}$ satisfies the Cerami condition, and (i) follows. From the exponential decay estimate and the same compact embedding as above, for $\varepsilon > 0$ small enough, any critical point $s$ of $G_{r_0,\varepsilon}$ such that $G_{r_0,\varepsilon}(s) \in [c - \delta_1, c + \delta_1]$ must be in $\mathcal{U}$. But we know that $G_{r_0,\varepsilon}$ satisfies the Cerami condition, hence (ii).

Making the assumptions of Lemma 4.2, we take two bounded open neighborhoods $\mathcal{U}$, $\mathcal{V}$ of $K_c$ such that, denoting by $\gamma$ the Krasnoselski genus, $\gamma(K_c) = \gamma(\mathcal{V})$ and $K_c \subset \mathcal{U} \subset \bar{\mathcal{U}} \subset \mathcal{V}$. By Corollary 4.3,

\[
d_\varepsilon = \gamma\bigl( h_\varepsilon\bigl( T(\varepsilon), \{ s \,/\, G_{r_0,\varepsilon}(s) \le c + \delta_2 \} \setminus \mathcal{V} \bigr) \bigr) \le l_\varepsilon = \gamma\bigl( \{ s \,/\, G_{r_0,\varepsilon}(s) \le c - \delta_2 \} \bigr) \le \bar n - 2.
\]

But we also have $d_\varepsilon + \gamma(\mathcal{V}) \ge \bar n$. Therefore, $\gamma(K_c) \ge 2$ and $K_c$ is an infinite set.

It remains to prove the property related to the Morse index of the critical points, i.e., (ii) of Theorem 1.2.

Lemma 4.4. The Morse index of $s_n^*$ as a critical point for the functional $G_{r_0}$ is at most $n + 1$.

Proof. This result is classical, since the critical points are obtained by the Symmetric Mountain Pass Lemma, and the approximation scheme does not perturb the Morse index (for references, see [9]).

5. Estimates and Auxiliary Results

5.1. Proof of Lemma 3.2.

The proof of the Cerami property for $G_{r_0,\varepsilon}$ is a bit more complicated than in the Minkowski case, because of the appearance of $r$-dependent coefficients in the system (2.1–2.2), but the a priori estimates are similar. Let $(s_n)_{n\in\mathbb{N}}$ be a Cerami sequence for $G_{r_0,\varepsilon}$ at the level $l \in \mathbb{R}$. Then, with $d_n = d^*(s_n)$, we obtain a Cerami sequence $(s_n, d_n)$ for $F_{r_0,\varepsilon}$ at the same level. First we find an estimate of the superquadratic terms:

\[
F_{r_0,\varepsilon}(s_n, d_n) - \frac{1}{2} F'_{r_0,\varepsilon}(s_n, d_n).(s_n, d_n) = \lambda \int_{r_0}^{\infty} f^{-1} s_n^2 d_n^2\, r^2\,dr + \varepsilon \Bigl( \frac{p}{2} - 1 \Bigr) \int_{r_0}^{\infty} s_n^p\, r^2\,dr \ge 0, \tag{5.1}
\]

hence we get

\[
\int_{r_0}^{\infty} f^{-1} s_n^2 d_n^2\, r^2\,dr \le M, \tag{5.2}
\]
\[
\varepsilon \int_{r_0}^{\infty} s_n^p\, r^2\,dr \le M. \tag{5.3}
\]

We prove now Lemma 3.2. Since $d_n = d^*(s_n)$, we have the exact equation:

\[
s_n' + \Bigl( \frac{1}{r} + m f^{-1/2} \Bigr) s_n + \Bigl( \frac{f^{-1/2}}{r} + \omega f^{-1} \Bigr) d_n = -2\lambda f^{-1} s_n^2 d_n. \tag{5.4}
\]


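In order to prove precompactness for the Cerami sequence, we split $d_n$ into $\tilde d_n + \varepsilon_n$, where $\varepsilon_n = L_1^{-1}\bigl( \frac{\partial F_{r_0,\varepsilon}}{\partial s}(s_n, d_n) \bigr)$. We obtain of course that $\|\varepsilon_n\|_L$ tends to 0 as $n$ goes to infinity. Moreover, we can replace (2.1) with the following equation concerning $\tilde d_n$:

\[
\tilde d_n' + \Bigl( \frac{1}{r} - m f^{-1/2} \Bigr) \tilde d_n - 2\lambda f^{-1} s_n d_n \tilde d_n = \Bigl( \omega f^{-1} - \frac{f^{-1/2}}{r} \Bigr) s_n + 2\lambda f^{-1} s_n d_n \varepsilon_n + \varepsilon p\, s_n^{p-1}. \tag{5.5}
\]

The left-hand side of (5.5) is related to the derivative of $e^{G(r)+H_n(r)} \tilde d_n$ with

\[
G'(r) = \frac{1}{r} - m f^{-1/2}, \qquad H_n'(r) = -2\lambda f^{-1} s_n(r) d_n(r).
\]

By (5.2), $H_n$ is a positive and uniformly bounded function. Thus we can write

\[
\tilde d_n(r) = -\int_r^{\infty} e^{G(\rho)-G(r)+H_n(\rho)-H_n(r)} \Bigl[ \Bigl( \omega f^{-1}(\rho) - \frac{f^{-1/2}(\rho)}{\rho} \Bigr) s_n + 2\lambda f^{-1}(\rho)\, s_n d_n \varepsilon_n + \varepsilon p\, s_n^{p-1} \Bigr] d\rho,
\]

and we get

\[
|\tilde d_n(r)| \le C \int_r^{\infty} e^{G(\rho)-G(r)} \Bigl[ \Bigl| \omega f^{-1}(\rho) - \frac{f^{-1/2}(\rho)}{\rho} \Bigr|\, |s_n| + \varepsilon p |s_n|^{p-1} + 2\lambda f^{-1}(\rho) |s_n d_n \varepsilon_n| \Bigr] d\rho
\]
\[
\le C\, L_1^{-1} \Bigl( \Bigl| \omega f^{-1}(\rho) - \frac{f^{-1/2}(\rho)}{\rho} \Bigr|\, |s_n| + \varepsilon p |s_n|^{p-1} + 2\lambda f^{-1}(\rho) |s_n d_n \varepsilon_n| \Bigr). \tag{5.6}
\]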

From (5.6) we infer that $\tilde d_n$ is bounded in $L^q_{rad}$ for all $q > p$ and in $L^\infty$. Applying $L_1^{-1}$ to (5.5), we find that $\tilde d_n$ is the sum of functions, which are respectively bounded in $W^{1,p}_{rad}$, $W^{1,2}_{rad}$, $W^{1,1}_{rad}$ and $W^{1,\frac{p}{p-1}}_{rad}$. Hence, by Sobolev embeddings, $\tilde d_n$ is precompact in every $L^q_{rad}$, for $q > p$, and we have the following uniform estimate:

\[
\exists K_1,\ \forall n \in \mathbb{N},\ \forall r \in (r_0,\infty), \quad |\tilde d_n(r)| \le \frac{K_1}{r^{2/p}}.
\]

We denote by $\tilde d$ its limit, up to a subsequence. Applying $L_2^{-1}$ to (5.4), we obtain a similar decomposition of $s_n$, which implies that $s_n$ is also precompact in every $L^q_{rad}$, for $q > p$, with limit $s$, up to a subsequence, and the same uniform estimate:

\[
\exists K_2,\ \forall n \in \mathbb{N},\ \forall r \in (r_0,\infty), \quad |s_n(r)| \le \frac{K_2}{r^{2/p}}.
\]


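We now have to prove that convergence is obtained in the right spaces, that is, respectively $H$ for $s_n$ and $L$ for $d_n$. To this aim, we prove an exponential decay estimate for the couple $(s_n, \tilde d_n)$. We denote by $\mathcal{L}$ the following differential operator:

\[
\mathcal{L} \begin{pmatrix} s \\ d \end{pmatrix} = \begin{pmatrix} s \\ d \end{pmatrix}' + A \begin{pmatrix} s \\ d \end{pmatrix}, \quad \text{with } A = \begin{pmatrix} m & \omega \\ -\omega & -m \end{pmatrix}.
\]

This matrix has eigenvalues $\pm\sqrt{m^2 - \omega^2}$, and if we denote by $G_\pm$ the projectors on the corresponding eigenspaces, $\mathcal{L}$ has a bounded inverse from $L^q((k,\infty), r^2 dr)$ onto $W_0^{1,q}((k,\infty), r^2 dr)$:

\[
(\mathcal{L}^{-1}\Phi)(r) = \int_k^{r} e^{-A(r-\rho)}\, G_- \Phi(\rho)\,d\rho - \int_r^{\infty} e^{-A(r-\rho)}\, G_+ \Phi(\rho)\,d\rho.
\]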
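The spectral claim about the matrix $A$ can be checked by hand, since $A^2 = (m^2 - \omega^2) I$. The following minimal sketch verifies this numerically for sample values; the function names and the values $m = 1$, $\omega = 0.6$ are illustrative assumptions, not taken from the paper, and the projector formula $(I \pm A/\kappa)/2$ is the standard one for a matrix whose square is $\kappa^2 I$.

```python
import math

def spectral_data(m, omega):
    """Return kappa, A and the two spectral projectors of A = [[m, w], [-w, -m]]."""
    if not 0 < omega < m:
        raise ValueError("this sketch assumes 0 < omega < m")
    kappa = math.sqrt(m * m - omega * omega)
    a = [[m, omega], [-omega, -m]]
    ident = [[1.0, 0.0], [0.0, 1.0]]
    # Because A*A = kappa^2 * I, the projectors are (I +/- A/kappa) / 2.
    g_plus = [[(ident[i][j] + a[i][j] / kappa) / 2 for j in range(2)] for i in range(2)]
    g_minus = [[(ident[i][j] - a[i][j] / kappa) / 2 for j in range(2)] for i in range(2)]
    return kappa, a, g_plus, g_minus

def matmul2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(x, y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(x[i][j] - y[i][j]) <= tol for i in range(2) for j in range(2))

# Hypothetical sample values with 0 < omega < m.
kappa, A, G_plus, G_minus = spectral_data(1.0, 0.6)
```

Note that $A$ is not symmetric, so these projectors are oblique rather than orthogonal; only the relations $A G_\pm = \pm\kappa G_\pm$, $G_\pm^2 = G_\pm$ and $G_+ G_- = 0$ are used.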

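Of course, we cannot hope for any exponential decay property for the Cerami sequence. Nevertheless, we prove that this sequence is the sum of two terms: the first one satisfies an exponential decay property while the second one vanishes. Using the preceding notations, we can rewrite system (5.4–5.5) as follows:

\[
\mathcal{L} \begin{pmatrix} s_n \\ \tilde d_n \end{pmatrix} = B(r) \begin{pmatrix} s_n \\ \tilde d_n \end{pmatrix} + \begin{pmatrix} -2\lambda f^{-1} s_n^2 \tilde d_n \\ 2\lambda f^{-1} s_n \tilde d_n^2 + \varepsilon p\, s_n^{p-1} \end{pmatrix} + \begin{pmatrix} -2\lambda f^{-1} s_n^2 \varepsilon_n \\ 2\lambda f^{-1} s_n \varepsilon_n (2\tilde d_n + \varepsilon_n) \end{pmatrix}, \tag{5.7}
\]

where $B(r)$ is a matrix with all its coefficients going to 0 as $r$ goes to infinity. Denoting the couple $(s_n, \tilde d_n)$ by $\varphi_n$, we put

\[
\psi_n = \varphi_n - \mathcal{L}^{-1} \begin{pmatrix} -2\lambda f^{-1} s_n^2 \varepsilon_n \\ 2\lambda f^{-1} s_n \varepsilon_n (2\tilde d_n + \varepsilon_n) \end{pmatrix}. \tag{5.8}
\]

To prove an exponential decay estimate on $\psi_n$, we will use a method inspired by [8]. Define now a smooth cut-off function $\theta_k^\delta$ with $\delta > 0$, $k \in \mathbb{N}$, $k \ge 1$: we put $\theta_k^\delta(r) = \theta\bigl( \frac{r - r_0}{\delta} - k \bigr)$, where $\theta$ is a smooth cut-off function satisfying

\[
\theta(r) = 0 \ \ \forall r < 0, \qquad \theta(r) = 1 \ \ \forall r > 1, \qquad \theta'(r) \in (0,2) \ \ \forall r \in (0,1).
\]

We put $\theta_k^\delta \psi_n = \psi_{n,k}$. From (5.7) and the properties of $\mathcal{L}^{-1}$, we infer that, given any $q > p$, $\lim_{k\to\infty} \|\psi_{n,k}\|_{W^{1,q}_{rad}} = 0$. Then, we write

\[
\mathcal{L}\psi_{n,k} = \theta_k^\delta \mathcal{L}\psi_n + (\theta_k^\delta)' \psi_n
= \bigl( \theta_k^\delta B(r) + (\theta_k^\delta)' \bigr) \psi_n + \theta_k^\delta \begin{pmatrix} -2\lambda f^{-1} s_n^2 \tilde d_n \\ 2\lambda f^{-1} s_n \tilde d_n^2 + \varepsilon p\, s_n^{p-1} \end{pmatrix} + \theta_k^\delta B(r) \mathcal{L}^{-1} \begin{pmatrix} -2\lambda f^{-1} s_n^2 \varepsilon_n \\ 2\lambda f^{-1} s_n \varepsilon_n (2\tilde d_n + \varepsilon_n) \end{pmatrix}.
\]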

q

We have to give a Lrad estimation for the last three terms. For the first one, we get easily, with the properties satisfied by θkδ : δ C 2 δ + (5.9) ψn,k Lq . θk B(r) + (θk ) ψn Lq ≤ 1 1 rad rad δ δ2k2


For the second one, we have δ C −2λf −1 sn2 d˜n δ θ k 2λf −1 s d˜ 2 + εps p−1 q ≤ 1 1 θk ϕn Lqrad . n n n δ2k2 Lrad

(5.10)

The third term may be estimated by the use of Sobolev embeddings: δ −2λf −1 sn2 εn θ B(r)L−1 k 2λf −1 sn εn (2d˜n + εn ) Lq rad −1 s 2 ε C0 −1 0 −2λf n −1 n ≤ 1 1 L + L 2λf −1 sn εn2 W 1,1 4λf −1 sn d˜n εn W 1,2 δ2k2 rad rad 1 ≤ 1 1 C1 2λf −1 sn εn2 1 + C2 2λf −1 sn2 εn 2 + C2 4λf −1 sn d˜n εn 2 L L Lrad 2 2 rad rad δ k C3 ≤ 1 1 εn 2L2 ϕn L∞ . (5.11) rad δ2k2 With the definition of ψn , we then obtain ϕn ≤ ψn +

C3 1 2

δ k

1 2

εn 2L2 ϕn L∞ , rad

q

in the L∞ - or the Lrad -norm. This implies that, for n great enough, we get ϕn L∞ ≤ 2ψn L∞ ≤ 2C4 ψn W 1,q ,

(5.12)

C3 ϕn Lq ≤ 1 + 1 1 εn 2L2 ψn W 1,q ≤ 2ψn W 1,q . rad rad rad rad δ2k2

(5.13)

rad

and

Then, (5.10) and (5.13) lead to an estimate of the second term: C5 −2λf −1 sn2 d˜n δ θk p−1 Lqrad ≤ 1 1 ψn,k W 1,q , −1 2 ˜ rad 2λf sn dn + εpsn δ2k2

(5.14)

and (5.11) with (5.12) give the following estimate for the third term: θkδ B(r)L−1

C6 εn L2 −2λf −1 sn2 εn rad Lq ≤ ψn,k W 1,q . −1 1 1 ˜ rad 2λf sn εn (2dn + εn ) rad 2 2 δ k

(5.15)

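Finally, joining (5.9) with (5.14) and (5.15), we obtain, given $n$, $k$ and $\delta$ large enough,

\[
\|\psi_n\|_{W^{1,q}((r_0+(k+1)\delta,\infty),\, r^2 dr)} \le \|\psi_{n,k}\|_{W^{1,q}_{rad}} \le \frac{1}{2} \|\psi_n\|_{W^{1,q}((r_0+k\delta,\infty),\, r^2 dr)},
\]

which implies an exponential decay estimate for $\|\psi_n\|_{W^{1,q}((r_0+k\delta,\infty),\, r^2 dr)}$:

\[
\forall q > p,\ \exists \sigma_q > 0 \,/\, \|\psi_n\|_{W^{1,q}((r_0+k\delta,\infty),\, r^2 dr)} \le e^{-\sigma_q k}. \tag{5.16}
\]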
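The passage from the halving inequality to the exponential bound is a plain iteration: if $a_{k+1} \le \frac{1}{2} a_k$, then $a_k \le 2^{-k} a_0 = a_0 e^{-k \ln 2}$. A minimal sketch of this bookkeeping follows; the function name, the starting value $a_0 = 1$ and the rate $\sigma = \ln 2$ are illustrative assumptions (the paper's $\sigma_q$ also absorbs constants that are not modelled here).

```python
import math

def decay_bounds(a0, steps, ratio=0.5):
    """Upper bounds obtained by iterating a_{k+1} <= ratio * a_k from a_0 = a0."""
    bounds = [a0]
    for _ in range(steps):
        bounds.append(ratio * bounds[-1])
    return bounds

# For ratio 1/2 the bound after k steps is a0 * exp(-sigma * k) with sigma = ln 2.
sigma = math.log(2.0)
bounds = decay_bounds(1.0, 20)
```

Any contraction ratio strictly below 1 would give the same conclusion with $\sigma = -\ln(\text{ratio})$.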


Using Hölder's inequality, this estimate may be checked for $q > 1$. Now, we can pass to the limit in (5.8), in $L^q_{rad}$: $\psi_n$ is a bounded sequence in $W^{1,q}_{rad}$, for every $q \ge 2$. Its limit, denoted by $\begin{pmatrix} s \\ \tilde d \end{pmatrix}$, satisfies the estimate (5.16). We deduce that $\psi_n$ has a convergent subsequence in $W^{1,q}_{rad}$, for every $q \ge 2$. This implies in particular that $\begin{pmatrix} s \\ \tilde d \end{pmatrix} \in W^{1,q}_{rad}$ for every $q \ge 2$, and moreover these functions satisfy the same exponential decay estimate, which ends the proof.

5.2. Proof of Lemma 3.3.

Since the functional $G_{r_0,\varepsilon}$ is smooth, we compute its second order derivative at 0. Note that adding superquadratic terms in the functional does not change this quadratic form, denoted by $Q_{r_0,\varepsilon}$. After tedious but straightforward computations, we find, for every $h \in H$,

∞

2 Qr0 ,ε (h) = ϕ2 (r)h + ϕ1 (r)h2 dr, (5.17) r0

ϕ2 (r) =

2r 2 f

,

1

f2 r

+ω 1 d 2rf 2 + mr 2 ϕ1 (r) = + 1 1 dr + ωf − 2 r

(5.18) 1

2r 2 f 2 1 r

+ ωf

− 21

1 1 ( + mf − 2 )2 r

(5.19)

1 1 1 − 2r 2 f − 2 (− + ωf − 2 ). r Fix r0 > 1. As ϕ1 and ϕ2 are positive everywhere, except in a compact set in (r0 , ∞), Qr0 ,ε is a compact perturbation of a positive symmetric operator, so we claim that Qr0 ,ε has a finite number of nonpositive directions. Moreover, this number is increasing as r0 goes to 1 because of the following monotonicity: if hr0 satisfies Qr0 ,ε (hr0 ) ≤ 0 then, for all r ∈ (1, r0 ), hr0 extends to a unique h˜ r and

\[
Q_{r,\varepsilon}(\tilde h_r) = Q_{r_0,\varepsilon}(h_{r_0}) \le 0.
\]

If we denote by $N(r_0)$ the number of nonpositive directions of $Q_{r_0,\varepsilon}$, this proves that $N(r_0)$ is nondecreasing as $r_0$ goes to 1 and $N(r_0) = 0$ for large $r_0$. We use now a rescaling argument to show that $N$ is not constant, i.e. there exist negative directions, for $r_0 > 1$ small enough, and $N$ tends to infinity as $r_0$ goes to 1. Let $\theta \in C^\infty(\mathbb{R}, [0,1])$ be a smooth function, whose compact support lies in $[0,1]$. Given $r_0 > 1$, we put

\[
h_{r_0}(r) = \theta\Bigl( \frac{r - r_0}{r_0 - 1} \Bigr).
\]

Now, some properties of $\varphi_1$ and $\varphi_2$ are sufficient to show that, for $r_0$ small enough, $Q_{r_0,\varepsilon}(h_{r_0}) < 0$, and this leads to the results claimed. By computation, there exist $k_1, k_2 > 0$ such that

\[
\varphi_2(r) \le k_2 (r - 1), \tag{5.20}
\]
\[
\varphi_1(r) \le -k_1 (r - 1)^{-3/2}, \tag{5.21}
\]


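for $r > 1$ small enough. Then, under the same condition,

\[
Q_{r_0,\varepsilon}(h_{r_0}) = \int_{r_0}^{2r_0-1} \bigl( \varphi_2 h_{r_0}'^2 + \varphi_1 h_{r_0}^2 \bigr)\,dr
\le k_2 (2r_0 - 2) \int_{r_0}^{2r_0-1} h_{r_0}'^2\,dr - k_1 (2r_0 - 2)^{-3/2} \int_{r_0}^{2r_0-1} h_{r_0}^2\,dr
\]
\[
\le 2k_2 \int_0^1 \theta'^2 - 2^{-3/2} k_1 (r_0 - 1)^{-1/2} \int_0^1 \theta^2,
\]

so there exists $\bar r_0$ such that $Q_{r_0,\varepsilon}(h_{\bar r_0}) < 0$. Since this construction is possible from every $\theta \in C^\infty(\mathbb{R}, [0,1])$, the proof is over.

5.3. Proof of Lemma 4.1.

Since $(s_n^\varepsilon, d_n^\varepsilon)$ is a critical point for $F_{r_0,\varepsilon}$, it satisfies the system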

1 f f2 − mf 2 )dnε + ( − ω)snε = 2λsnε (dnε )2 + εp(snε )p−1 , r r 1 1 f f2 ε ε 2 f (sn ) + ( + mf )sn + ( + ω)dnε = −2λ(snε )2 dnε . r r

f (dnε ) + (

(5.22) (5.23)

It is not hard to see, using Lemma 3.1 and the exponential decay estimate, that ε r(sn ) , r(dnε ) ∈ H × L. Hence we can follow a Pohozaev-type idea and compute the following quantity, which is the equivalent in our case of the derivative of the functional relative to the dilations, taken at the critical point (snε , dnε ): ∂Fr0 ,ε ε ε ∂Fr0 ,ε ε ε (s , d ).(r(dnε ) ) + (sn , dn ).(r(snε ) ) = 0 ∂d n n ∂s ∞ 1 (snε (dnε ) − dnε (snε ) ) + f − 2 (dnε (dnε ) − snε (snε ) ) = r0 + 2λf −1 r (snε )2 dnε dnε + snε (snε ) (dnε )2 1 + mf − 2 r snε (dnε ) + dnε (snε ) + ωf −1 r dnε (dnε ) + snε (snε ) + εpr(snε )p−1 (snε ) r 2 dr. Using integration by parts, we may recognize in this integral some terms appearing in the functional Fr0 ,ε . Indeed, we obtain

∞ (snε )p r 2 dnε r 2Ar0 (snε , dnε ) + 3Br0 (snε , dnε ) + 3Cr0 (snε , dnε ) − 3ε

=

∞ 1

r0

− 23

ωf −2 ε 2 ((sn ) + (dnε )2 )r 4 2 2 r0 1 1 −2 ε 2 ε 2 − λf (sn ) (dn ) r dr + dnε (r0 )2 r02 f − 2 (r0 ) + ωf −1 (r0 )r0 . 2 ((snε )2 − (dnε )2 )f − 2 − m 3

f

snε dnε r −


Then the superquadraticity inequality (5.3) leads to the following lower bound for Br0 :

p−4 ∞ ε p 2 ε ε (sn ) r dr Br0 (sn , dn ) ≥ − bn + ε 2 r0

∞ −3 f 2 f −2 r ε 2 ε 2 ε 2 ε 2 dr (sn ) − (dn ) − ω (sn ) + (dn ) + 4 2 r0

∞ −3 mf 2 r ε ε −2 ε 2 ε 2 − (5.24) sn dn + λf r(sn ) (dv ) dr. 2 r0 Now the uniform bound on the critical values Fr0 ,ε (snε , dnε ) gives an upper bound for p εsnε Lp and λsnε dnε 2L2 . The use of the definition of Br0 allows to get rid of the terms rad

rad

containing (dnε )2 , since we get

∞

∞ 1 ωf −2 r ε 2 1 − mf − 2 snε dnε r 2 dr , (sn ) + (dnε )2 dr ≥ Br0 (snε , dnε ) + 2 r0 − 1 r0 r0 (5.25) and

∞

r0

3 1

∞ f −2 ε 2 f − 2 (r0 ) ε ε − 21 ε ε 2 Br0 (sn , dn ) + − mf sn dn r dr . (dn ) dr ≥ 4 2ωr02 r0

(5.26)

With (5.25) and (5.26), we obtain from (5.24) a lower bound for Br0 , which only depends on n and the function snε dnε ,

1 r0 f − 2 (r0 ) Br0 (snε , dnε ) − r0 − 1 2ωr02 1 3

∞ − 21 (r0 )f − 2 r 2 f −2 r ε ε f − 21 2 ≥ −bn + msn dn − +f r , 2 2ωr02 r0

and the proof is over.

(5.27)

Acknowledgement. The author would like to thank Eric Séré for his patience, great competence and kindness.

References

1. Amann, H.: Saddle points and multiple solutions of differential equations. Math. Z. 169(2), 127–166 (1979)
2. Bachelot-Motet, A.: Nonlinear Dirac fields on the Schwarzschild metric. Classical and Quantum Gravity 15(7), 1815–1825 (1998)
3. Balabane, M., Cazenave, T., Douady, A., Merle, F.: Existence of excited states for a nonlinear Dirac field. Comm. Math. Phys. 119, 153–176 (1988)
4. Balabane, M., Cazenave, T., Vázquez, L.: Existence of standing waves for Dirac fields with singular nonlinearities. Comm. Math. Phys. 133, 53–74 (1990)
5. Buffoni, B., Jeanjean, L., Stuart, C. A.: Existence of a nontrivial solution to a strongly indefinite semilinear equation. Proc. Amer. Math. Soc. 119(1), 179–186 (1993)
6. Castro, A., Lazer, A. C.: Applications of a max-min principle. Rev. Colombiana Mat. 10(4), 141–149 (1976)
7. Cazenave, T., Vázquez, L.: Existence of localized solutions for a classical nonlinear Dirac field. Comm. Math. Phys. 105, 35–47 (1986)
8. Esteban, M.J., Séré, E.: Stationary states of the nonlinear Dirac equation: A variational approach. Comm. Math. Phys. 171(2), 323–350 (1995)
9. Ghoussoub, N.: Duality and perturbation methods in critical point theory. Cambridge: Cambridge University Press, 1993
10. Merle, F.: Existence of stationary states for nonlinear Dirac equations. J. Diff. Eq. 74, 50–68 (1988)
11. Rañada, A.F.: Classical nonlinear Dirac field models of extended particles. In: Quantum Theory, Groups, Fields and Particles, A.O. Barut, ed., Amsterdam: Reidel, 1983
12. Soler, M.: Classical, stable, nonlinear spinor field with positive rest energy. Phys. Rev. D 1, 2766–2769 (1970)
13. Struwe, M.: Variational methods. Applications to nonlinear partial differential equations and Hamiltonian systems. Berlin: Springer-Verlag, 1996
14. Wakano, M.: Intensely localized solutions of the classical Dirac–Maxwell field equations. Prog. Theor. Phys. (Kyoto) 35, 1117–1141 (1966)

Communicated by H. Nicolai

Commun. Math. Phys. 213, 267 – 289 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Modular Invariants from Subfactors: Type I Coupling Matrices and Intermediate Subfactors

Jens Böckenhauer, David E. Evans

School of Mathematics, University of Wales Cardiff, PO Box 926, Senghennydd Road, Cardiff CF24 4YH, Wales, UK

Received: 8 December 1999 / Accepted: 15 February 2000

Abstract: A braided subfactor determines a coupling matrix $Z$ which commutes with the S- and T-matrices arising from the braiding. Such a coupling matrix is not necessarily of “type I”, i.e. in general it does not have a block-diagonal structure which can be reinterpreted as the diagonal coupling matrix with respect to a suitable extension. We show that there are always two intermediate subfactors which correspond to left and right maximal extensions and which determine “parent” coupling matrices $Z^\pm$ of type I. Moreover it is shown that if the intermediate subfactors coincide, so that $Z^+ = Z^-$, then $Z$ is related to $Z^+$ by an automorphism of the extended fusion rules. The intertwining relations of chiral branching coefficients between original and extended S- and T-matrices are also clarified. None of our results depends on non-degeneracy of the braiding, i.e. the S- and T-matrices need not be modular. Examples from SO(n) current algebra models illustrate that the parents can be different, $Z^+ \neq Z^-$, and that $Z$ need not be related to a type I invariant by such an automorphism.

1. Introduction

A prominent problem in rational conformal field theory (RCFT) is the classification of modular invariants. Though it is usually a difficult task and solved only for a few special models (e.g. [8, 19]), its mathematical formulation is simple: For a given unitary, finite-dimensional representation of the modular group $SL(2;\mathbb{Z})$, let $S = (S_{\lambda,\mu})$ and $T = (T_{\lambda,\mu})$ denote the matrices representing the generators $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, respectively. In the representations of interest $T$ is diagonal, $S$ is symmetric, $S^2$ is a permutation, and $S_{\lambda,0} \ge S_{0,0} > 0$. Here “0” is a distinguished label, referring to the “vacuum sector”. A modular invariant is then a “coupling matrix” $Z$ (or “mass matrix”) commuting with $S$ and $T$,

\[
SZ = ZS \quad \text{and} \quad TZ = ZT, \tag{1}
\]


and subject to the constraints

\[
Z_{\lambda,\mu} = 0, 1, 2, \ldots \quad \text{and} \quad Z_{0,0} = 1. \tag{2}
\]
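Conditions (1) and (2) are easy to test numerically once $S$ and $T$ are known. The following is a minimal sketch, assuming the standard Kac-Peterson formulas for SU(2) at level $k$ (not written out in this paper): $S_{\lambda,\mu} = \sqrt{2/(k+2)}\,\sin\bigl(\pi(\lambda+1)(\mu+1)/(k+2)\bigr)$ and $T_{\lambda,\lambda} = \exp\bigl(2\pi i (h_\lambda - c/24)\bigr)$ with $h_\lambda = \lambda(\lambda+2)/(4(k+2))$ and $c = 3k/(k+2)$. The test matrix is the well-known $D_4$ invariant at level 4; all function names are hypothetical.

```python
import cmath
import math

def su2_modular_data(k):
    """Kac-Peterson S and T matrices for SU(2) at level k, labels 0..k (assumed formulas)."""
    n = k + 2
    size = k + 1
    s = [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
          for b in range(size)] for a in range(size)]
    c = 3.0 * k / n  # central charge
    t = [[0j] * size for _ in range(size)]
    for a in range(size):
        h = a * (a + 2) / (4.0 * n)  # conformal weight of label a
        t[a][a] = cmath.exp(2j * math.pi * (h - c / 24.0))
    return s, t

def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def commutes(x, y, tol=1e-12):
    """True if xy and yx agree entrywise up to tol."""
    xy, yx = matmul(x, y), matmul(y, x)
    return all(abs(xy[i][j] - yx[i][j]) <= tol
               for i in range(len(xy)) for j in range(len(xy[0])))

def is_modular_invariant(z, s, t):
    """Check Eq. (1) (commutation with S and T) and Eq. (2) (integrality, Z_00 = 1)."""
    entries_ok = all(isinstance(v, int) and v >= 0 for row in z for v in row)
    return entries_ok and z[0][0] == 1 and commutes(s, z) and commutes(t, z)

S, T = su2_modular_data(4)
Z_diag = [[1 if a == b else 0 for b in range(5)] for a in range(5)]
# D_4 invariant at level 4: |chi_0 + chi_4|^2 + 2 |chi_2|^2.
Z_D4 = [[0] * 5 for _ in range(5)]
for a in (0, 4):
    for b in (0, 4):
        Z_D4[a][b] = 1
Z_D4[2][2] = 2
```

Calling `is_modular_invariant(Z_D4, S, T)` exercises both conditions at once; a matrix such as the identity with one extra off-diagonal entry fails already at the commutation with $T$.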

These constraints reflect the physical background of the problem: The coupling matrix usually describes multiplicities in the decomposition of the Hilbert space $\mathcal{H}_{phys}$ of physical states of a 2D conformal field theory under the action of a “symmetry algebra” $A \otimes A$ which is a tensor product of two copies¹ of a “chiral algebra” $A$,

\[
\mathcal{H}_{phys} = \bigoplus_{\lambda,\mu} Z_{\lambda,\mu}\, \mathcal{H}_\lambda \otimes \mathcal{H}_\mu,
\]

giving rise to a modular invariant partition function

\[
Z = \sum_{\lambda,\mu} Z_{\lambda,\mu}\, \chi_\lambda \chi_\mu^*.
\]

Here the $\chi_\lambda$'s and $\chi_\mu$'s are the conformal characters of the representations $\mathcal{H}_\lambda$ and $\mathcal{H}_\mu$, and the modular group action of $S$ and $T$ comes from resubstitution of their arguments, leaving the sesqui-linear combination $Z$ invariant. The condition $Z_{0,0} = 1$ then expresses the uniqueness of the vacuum state. The simplest example for a coupling matrix is the “diagonal case”, $Z_{\lambda,\mu} = \delta_{\lambda,\mu}$, which always gives a modular invariant partition function. More interesting non-diagonal modular invariants arise whenever the chiral algebra can be extended by some local fields. Of special relevance are the so-called type I invariants [11] for which the entries of the coupling matrices can be written as

\[
Z_{\lambda,\mu} = \sum_\tau b_{\tau,\lambda}\, b_{\tau,\mu}, \tag{3}
\]

and which refer directly to the extension through their “block-diagonal” structure: The label $\tau$ runs over the representations of the extended chiral algebra $A^{ext}$, and the non-negative integers $b_{\tau,\lambda}$ describe the branching of a representation $\tau$ into $\lambda$'s according to the inclusion $A \subset A^{ext}$. The branching coefficients fulfill $b_{\tau,0} = \delta_{\tau,0}$ (by some abuse of notation we denote the vacuum sector of $A^{ext}$ also by “0”), thus guaranteeing the normalization condition $Z_{0,0} = 1$. Rewriting the partition function in terms of the extended characters $\chi_\tau^{ext} = \sum_\lambda b_{\tau,\lambda} \chi_\lambda$, any type I modular invariant can be considered as completely diagonal: $Z^{ext}_{\tau,\tau'} = \delta_{\tau,\tau'}$. It is argued in [12, 29] that after extending the chiral algebras maximally, the coupling matrix of a partition function in RCFT is at most a permutation, $Z^{ext}_{\tau,\tau'} = \delta_{\tau',\omega(\tau)}$, where the permutation $\omega$ is an automorphism of the extended fusion rules, satisfying $\omega(0) = 0$. As a consequence, a maximal extension $A \subset A^{ext}$ in RCFT produces a coupling matrix of some modular invariant partition function which can be written as

\[
Z_{\lambda,\mu} = \sum_\tau b_{\tau,\lambda}\, b_{\omega(\tau),\mu}. \tag{4}
\]
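Formulas (3) and (4) are finite sums and can be assembled directly from a branching matrix. The sketch below is illustrative only: the function name is hypothetical, and the sample branching matrix is the standard one for the conformal inclusion SU(2) at level 4 in SU(3) at level 1 (three extended sectors, branching into $0 \oplus 4$, $2$ and $2$), which is an assumption brought in from the general literature rather than from this paper.

```python
def coupling_matrix(b, omega=None):
    """Z[l][m] = sum over t of b[t][l] * b[omega[t]][m]; omega=None means the identity."""
    n_ext = len(b)
    n = len(b[0])
    if omega is None:
        omega = list(range(n_ext))
    return [[sum(b[t][l] * b[omega[t]][m] for t in range(n_ext))
             for m in range(n)] for l in range(n)]

# Rows: extended sectors; columns: original labels 0..4 (assumed example).
b = [
    [1, 0, 0, 0, 1],  # extended vacuum -> 0 + 4
    [0, 0, 1, 0, 0],  # first nontrivial sector -> 2
    [0, 0, 1, 0, 0],  # second nontrivial sector -> 2
]

Z_type1 = coupling_matrix(b)                  # Eq. (3)
Z_perm = coupling_matrix(b, omega=[0, 2, 1])  # Eq. (4) with the two nontrivial sectors exchanged
```

In this particular example the exchanged sectors have identical branching rows, so the permutation does not change the coupling matrix; in general Eq. (4) gives a different matrix from Eq. (3). In both cases $b_{\tau,0} = \delta_{\tau,0}$ and $\omega(0) = 0$ force $Z_{0,0} = 1$ and $Z_{\lambda,0} = Z_{0,\lambda}$.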

Partition functions of the form Eq. (4) which are not of type I, Eq. (3), are usually referred to as being “type II” [11].

¹ In generic RCFTs one does not necessarily start with a symmetry algebra made of two identical chiral algebras. However, Eq. (1) is designed for such a symmetric situation, whereas in the “heterotic” situation one has to deal with different S- and T-matrices intertwined by $Z$.

Given matrices $S$ and $T$ arising from the modular transformations of a collection of characters $\chi_\lambda$ in a RCFT, the solution of the mathematical problem given in Eqs. (1) and (2) can nevertheless yield coupling matrices which are neither of type I nor of type II. Note that a coupling matrix of the form in Eq. (4) necessarily has “symmetric vacuum coupling”, $Z_{\lambda,0} = Z_{0,\lambda}$. However, even for rather well-behaved models like SO(n) current algebras there are known matrices $Z$ satisfying Eqs. (1) and (2), but which do not have this symmetry, cf. Sect. 7 below. Chiral algebras often admit different extensions, and only then, but much more rarely, modular invariants without symmetric vacuum coupling have been found. Namely, it can happen that two chiral extensions $A \subset A^{ext}_\pm$ of the original chiral algebra $A$ are compatible such that a given coupling matrix has to be interpreted as an “automorphism” invariant with respect to the enhanced “heterotic” symmetry algebra $A^{ext}_+ \otimes A^{ext}_-$. (It seems that this possibility has sometimes been ignored in the literature although the heterotic case was taken into account in [29].) Unfortunately the standard terminology “permutation” and “automorphism” is a bit misleading in the heterotic case because the labels of left and right sectors are generically different. A more precise notion would be “bijection” and “isomorphism of fusion rules”, and the distinction between diagonal and permutation invariant no longer makes sense for a maximally extended heterotic symmetry algebra. Finally, in case that for a fixed theory there are several modular invariant partition functions it may happen that a linear combination of their coupling matrices yields a solution of Eq. (2), which may however fail to have a consistent interpretation as a partition function [37, 39, 18]. Such modular invariants without physical interpretation seem to be extremely rare. The mathematical classification problem of Eqs. (1) and (2) was considered in [6, 7] by means of subfactor theory, using the ideas of α-induction [28, 41, 3–5] and double triangle algebras [30].
The analysis in [6, 7] addressed in particular the problem of understanding the relation between modular invariants, graphs and “nimreps” (non-negative integer valued matrix representations of the Verlinde fusion algebra), a puzzling connection going back to the celebrated A-D-E classification of [8, 26], its general nature noticed in [10, 11] and further studied in [9, 31, 1]. It follows from [32, 16, 15] that a (type III) von Neumann factor N with a system ${}_N\mathcal{X}_N$ of braided endomorphisms gives rise to certain “statistics” matrices S and T, which are modular whenever the braiding is nondegenerate [32]. It was shown in [6] that then an inclusion $N \subset M$ of von Neumann factors which is compatible with the system ${}_N\mathcal{X}_N$ determines a coupling matrix Z by α-induction,

\[ Z_{\lambda,\mu} = \langle \alpha_\lambda^+ , \alpha_\mu^- \rangle , \qquad \lambda,\mu \in {}_N\mathcal{X}_N , \tag{5} \]

solving Eqs. (1) and (2) even if the braiding is degenerate. Here $\alpha_\lambda^+$ and $\alpha_\mu^-$ are the two inductions of $\lambda$ and $\mu$, coming from braiding and opposite braiding, and the bracket $\langle \alpha_\lambda^+ , \alpha_\mu^- \rangle$ denotes the dimension of their relative intertwiner space. From current algebra models (“WZW”) in RCFT one can construct braided subfactors such that the statistics matrices S and T and the Kac-Peterson matrices performing the SL(2; Z) transformations of the affine characters (cf. [25]) coincide. This connection between statistics and conformal character transformations is expected to hold quite generally in RCFT (e.g. it was conjectured in [17]), and the conformal spin-statistics theorem [16, 15, 20] establishes this for the T-matrices. To prove this for the S-matrices requires one to show that the composition of superselection sectors indeed recovers the Verlinde fusion rules, and this has been done for several models, most significantly for SU(n) at all levels in [40]. For local extensions (cf. [36]) the subfactor $N \subset M$ can be thought of as a version of the inclusion $A \subset A^{\mathrm{ext}}$. In terms of (a variation of) the α-induction formula of [28], such subfactors were first investigated for certain conformal inclusions in [41]. This was further analyzed and extended to simple current extensions in [3, 4], and that α-induction indeed recovers the corresponding modular invariants was found (for SU(2) and SU(3) current algebras) in [5].

The two inductions, $\alpha^+$ and $\alpha^-$, produce chiral systems ${}_M\mathcal{X}_M^+$ and ${}_M\mathcal{X}_M^-$, intersecting on the “ambichiral” system ${}_M\mathcal{X}_M^0$. Then Eq. (5) can be written as

\[ Z_{\lambda,\mu} = \sum_{\tau \in {}_M\mathcal{X}_M^0} b^+_{\tau,\lambda}\, b^-_{\tau,\mu} , \tag{6} \]

with chiral branching coefficients $b^\pm_{\tau,\lambda} = \langle \tau , \alpha_\lambda^\pm \rangle$. Now the question arises whether the general subfactor setting of [6, 7], which is also able to produce type II invariants so that in particular $b^+_{\tau,\lambda} \neq b^-_{\tau,\lambda}$ is possible, will be confined to coupling matrices of the form of Eq. (4) or whether it can even produce other solutions of Eqs. (1) and (2), e.g. with heterotic vacuum coupling. This is the issue of the present paper. We will indeed demonstrate that our framework incorporates the general situation, including modular invariants corresponding to heterotic extensions of the symmetry algebra. In fact, we study subfactors $N \subset M$, producing coupling matrices Z, through intermediate subfactors, making essential use of the Galois correspondence elaborated in [23]. We derive in Sect. 4 that there are intermediate subfactors $M_+$ and $M_-$, $N \subset M_\pm \subset M$, naturally associated to the vacuum column $(Z_{\lambda,0})$ respectively the vacuum row $(Z_{0,\lambda})$ of the coupling matrix determined by $N \subset M$. The subfactors $N \subset M_\pm$ in turn determine coupling matrices $Z^\pm$ which are of the form of Eq. (3) and can be interpreted as the “type I parents” of the original coupling matrix Z. In Sect. 5 we show that in the case $M_+ = M_-$, so that in particular there is a unique parent $Z^+ = Z^-$, the coupling matrix is indeed of the form of Eq. (4), recovering a fusion rule automorphism of the ambichiral system. For the general situation we prove a proposition which shows that $M_+$ and $M_-$ should be regarded as the operator algebraic version of maximally extended left and right chiral algebras, using a recent result of Rehren [35]. In Sect. 6 we establish the intertwining relations of the chiral branching coefficients between the original S- and T-matrices and the “extended” ones arising from the ambichiral braiding. It is remarkable that the entire analysis does not need to assume that the braiding is non-degenerate, i.e. our results remain valid even if the matrices S and T are not modular. In Sect.
7 we finally show by examples from SO(n) current algebras that indeed $M_+ \neq M_-$ is possible, that the parents can be different, $Z^+ \neq Z^-$, and that subfactors can produce coupling matrices Z which have heterotic vacuum coupling, so that Eq. (4) cannot be adopted in general.

2. Preliminaries

Let A and B be type III von Neumann factors. A unital ∗-homomorphism $\rho : A \to B$ is called a B-A morphism. The positive number $d_\rho = [B : \rho(A)]^{1/2}$ is called the statistical dimension of $\rho$; here $[B : \rho(A)]$ is the Jones index [24] of the subfactor $\rho(A) \subset B$. If $\rho$ and $\sigma$ are B-A morphisms with finite statistical dimensions, then the vector space of intertwiners

\[ \mathrm{Hom}(\rho,\sigma) = \{ t \in B : t\rho(a) = \sigma(a)t , \ a \in A \} \]

is finite-dimensional, and we denote its dimension by $\langle \rho , \sigma \rangle$. An A-B morphism $\bar\rho$ is a conjugate morphism if there are isometries $r_\rho \in \mathrm{Hom}(\mathrm{id}_A , \bar\rho\rho)$ and $\bar r_\rho \in \mathrm{Hom}(\mathrm{id}_B , \rho\bar\rho)$ such that $\rho(r_\rho)^* \bar r_\rho = d_\rho^{-1} 1_B$ and $\bar\rho(\bar r_\rho)^* r_\rho = d_\rho^{-1} 1_A$. The map $\phi_\rho : B \to A$, $b \mapsto r_\rho^* \bar\rho(b) r_\rho$, is called the (unique) standard left inverse and satisfies

\[ \phi_\rho(\rho(a)\, b\, \rho(a')) = a\, \phi_\rho(b)\, a' , \qquad a, a' \in A , \quad b \in B . \tag{7} \]

If $t \in \mathrm{Hom}(\rho,\sigma)$ then we have

\[ d_\rho\, \phi_\rho(bt) = d_\sigma\, \phi_\sigma(tb) , \qquad b \in B . \tag{8} \]

We work in the setting of [6], i.e. we are working with a type III subfactor and a finite system ${}_N\mathcal{X}_N \subset \mathrm{End}(N)$ of (possibly degenerately) braided morphisms which is compatible with the inclusion $N \subset M$. Then the inclusion is in particular forced to have finite Jones index and also finite depth (see e.g. [14]). More precisely, we make the following

Assumption 2.1. We assume that we have a type III subfactor $N \subset M$ together with a finite system of endomorphisms ${}_N\mathcal{X}_N \subset \mathrm{End}(N)$ in the sense of [6, Def. 2.1] which is braided in the sense of [6, Def. 2.2] and such that $\theta = \bar\iota\iota \in \Sigma({}_N\mathcal{X}_N)$ for the injection M-N morphism $\iota : N \hookrightarrow M$ and a conjugate N-M morphism $\bar\iota$.

With the braiding $\varepsilon$ on ${}_N\mathcal{X}_N$ and its extension to $\Sigma({}_N\mathcal{X}_N)$ (the set of finite sums of morphisms in ${}_N\mathcal{X}_N$) as in [6], one can define the α-induced morphisms $\alpha_\lambda^\pm \in \mathrm{End}(M)$ for $\lambda \in \Sigma({}_N\mathcal{X}_N)$ by the Longo-Rehren formula [28], namely by putting

\[ \alpha_\lambda^\pm = \bar\iota^{\,-1} \circ \mathrm{Ad}(\varepsilon^\pm(\lambda,\theta)) \circ \lambda \circ \bar\iota , \]

where $\bar\iota$ denotes a conjugate morphism of the injection map $\iota : N \hookrightarrow M$. Then $\alpha_\lambda^+$ and $\alpha_\lambda^-$ extend $\lambda$, i.e. $\alpha_\lambda^\pm \circ \iota = \iota \circ \lambda$, which in turn implies $d_{\alpha_\lambda^\pm} = d_\lambda$ by the multiplicativity of the minimal index [27]. Let $\gamma = \iota\bar\iota$ denote Longo’s canonical endomorphism from M into N. Then there is an isometry $v \in \mathrm{Hom}(\mathrm{id},\gamma)$ such that any $m \in M$ is uniquely decomposed as $m = nv$ with $n \in N$. Thus the action of the extensions $\alpha_\lambda^\pm$ is uniquely characterized by the relation $\alpha_\lambda^\pm(v) = \varepsilon^\pm(\lambda,\theta)^* v$ which can be derived from the braiding fusion equations (BFE’s, see e.g. [6, Eq. (5)]). Moreover, we have $\alpha_{\lambda\mu}^\pm = \alpha_\lambda^\pm \alpha_\mu^\pm$ if also $\mu \in \Sigma({}_N\mathcal{X}_N)$, and clearly $\alpha^\pm_{\mathrm{id}_N} = \mathrm{id}_M$. In general one has

\[ \mathrm{Hom}(\lambda,\mu) \subset \mathrm{Hom}(\alpha_\lambda^\pm , \alpha_\mu^\pm) \subset \mathrm{Hom}(\iota\lambda , \iota\mu) , \qquad \lambda,\mu \in \Sigma({}_N\mathcal{X}_N) . \]

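The first inclusion is a consequence of the BFE’s. Namely, $t \in \mathrm{Hom}(\lambda,\mu)$ obeys $t\varepsilon^\pm(\theta,\lambda) = \varepsilon^\pm(\theta,\mu)\theta(t)$, and thus

\[ t\alpha_\lambda^\pm(v) = t\varepsilon^\pm(\lambda,\theta)^* v = \varepsilon^\pm(\mu,\theta)^* \theta(t) v = \varepsilon^\pm(\mu,\theta)^* v t = \alpha_\mu^\pm(v) t . \]

The second follows from the extension property of α-induction. Hence $\alpha^\pm_{\bar\lambda}$ is a conjugate for $\alpha^\pm_\lambda$ as there are $r_\lambda \in \mathrm{Hom}(\mathrm{id},\bar\lambda\lambda) \subset \mathrm{Hom}(\mathrm{id},\alpha^\pm_{\bar\lambda}\alpha^\pm_\lambda)$ and $\bar r_\lambda \in \mathrm{Hom}(\mathrm{id},\lambda\bar\lambda) \subset \mathrm{Hom}(\mathrm{id},\alpha^\pm_\lambda\alpha^\pm_{\bar\lambda})$ such that $\lambda(r_\lambda)^* \bar r_\lambda = \bar\lambda(\bar r_\lambda)^* r_\lambda = d_\lambda^{-1} 1$. We also have some kind of naturality equations for α-induced morphisms,

\[ x\, \varepsilon^\pm(\rho,\lambda) = \varepsilon^\pm(\rho,\mu)\, \alpha_\rho^\pm(x) \]

whenever $x \in \mathrm{Hom}(\iota\lambda,\iota\mu)$ and $\rho \in \Sigma({}_N\mathcal{X}_N)$. Recall that the statistics phase $\omega_\lambda$ for $\lambda \in {}_N\mathcal{X}_N$ is given as $d_\lambda \phi_\lambda(\varepsilon^+(\lambda,\lambda)) = \omega_\lambda 1$. The monodromy matrix Y is defined by

\[ Y_{\lambda,\mu} = \sum_{\rho \in {}_N\mathcal{X}_N} \frac{\omega_\lambda \omega_\mu}{\omega_\rho}\, N_{\lambda,\mu}^{\rho}\, d_\rho , \qquad \lambda,\mu \in {}_N\mathcal{X}_N , \tag{9} \]

with $N_{\lambda,\mu}^{\rho} = \langle \rho , \lambda\mu \rangle$ denoting the fusion coefficients. Then one checks that Y is symmetric, that $Y_{\bar\lambda,\mu} = Y_{\lambda,\mu}^*$ as well as $Y_{\lambda,0} = d_\lambda$ [32, 16, 15]. (As usual, the label “0” refers to the identity morphism $\mathrm{id} \in {}_N\mathcal{X}_N$.) Now let $\Omega$ be the diagonal matrix with entries $\Omega_{\lambda,\mu} = \omega_\lambda \delta_{\lambda,\mu}$. Putting $Z_{\lambda,\mu} = \langle \alpha_\lambda^+ , \alpha_\mu^- \rangle$ defines a matrix subject to the constraints Eq. (2) and commuting with Y and $\Omega$ [6]. The Y- and $\Omega$-matrices obey $\Omega Y \Omega Y \Omega = zY$, where $z = \sum_\lambda d_\lambda^2 \omega_\lambda$ [32, 16, 15], and this actually holds even if the braiding is degenerate (see [6, Sect. 2]). If $z \neq 0$ we put $c = 4\arg(z)/\pi$, which is defined modulo 8, and call it the “central charge”. Moreover, S- and T-matrices are then defined by

\[ S = |z|^{-1}\, Y , \qquad T = e^{-i\pi c/12}\, \Omega \]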

and hence fulfill $TSTST = S$. One has $|z|^2 = w$ with the global index $w = \sum_\lambda d_\lambda^2$ and S is unitary, so that S and T are indeed the standard generators in a unitary representation of the modular group SL(2; Z), if and only if the braiding is non-degenerate [32]. Consequently, Z gives a modular invariant in this case.

Let ${}_M\mathcal{X}_M \subset \mathrm{End}(M)$ denote a system of endomorphisms consisting of a choice of representative endomorphisms of each irreducible subsector of sectors of the form $[\iota\lambda\bar\iota]$, $\lambda \in {}_N\mathcal{X}_N$. We choose $\mathrm{id} \in \mathrm{End}(M)$ representing the trivial sector in ${}_M\mathcal{X}_M$. Then we define similarly the chiral systems ${}_M\mathcal{X}_M^\pm$ and the α-system ${}_M\mathcal{X}_M^\alpha$ to be the subsystems of endomorphisms $\beta \in {}_M\mathcal{X}_M$ such that $[\beta]$ is a subsector of $[\alpha_\lambda^\pm]$ and of $[\alpha_\lambda^+\alpha_\mu^-]$, respectively, for some $\lambda,\mu \in {}_N\mathcal{X}_N$. (Note that any subsector of $[\alpha_\lambda^+\alpha_\mu^-]$ is automatically a subsector of $[\iota\nu\bar\iota]$ for some $\nu \in {}_N\mathcal{X}_N$.) The ambichiral system is defined as the intersection ${}_M\mathcal{X}_M^0 = {}_M\mathcal{X}_M^+ \cap {}_M\mathcal{X}_M^-$, so that ${}_M\mathcal{X}_M^0 \subset {}_M\mathcal{X}_M^\pm \subset {}_M\mathcal{X}_M^\alpha \subset {}_M\mathcal{X}_M$. Their “global indices”, i.e. their sums over the squares of the statistical dimensions, are denoted by $w_0$, $w_\pm$, $w_\alpha$ and $w$, and thus fulfill $1 \le w_0 \le w_\pm \le w_\alpha \le w$.

3. More on Global Indices and Chiral Locality

Recall from [7, Prop. 3.1] that $w_+ = w_-$ and that $w/w_+ = \sum_{\lambda \in {}_N\mathcal{X}_N} d_\lambda Z_{\lambda,0}$. We will now derive a general formula for the α-global index $w_\alpha = \sum_{\beta \in {}_M\mathcal{X}_M^\alpha} d_\beta^2$ and also for $w_0$. We denote by ${}_N\mathcal{X}_N^{\mathrm{deg}} \subset {}_N\mathcal{X}_N$ the subsystem of degenerate morphisms.

Proposition 3.1. The α-global index is given by

\[ w_\alpha = \frac{w}{\sum_{\lambda \in {}_N\mathcal{X}_N^{\mathrm{deg}}} Z_{0,\lambda}\, d_\lambda} . \tag{10} \]

Moreover, the ambichiral global index is given by $w_0 = w_+^2 / w_\alpha$.

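Proof. Let $R_{\lambda,\mu}$, $\lambda,\mu \in {}_N\mathcal{X}_N$, denote matrices with entries $R_{\lambda,\mu;\beta}^{\beta'} = \langle \beta\alpha_\lambda^+\alpha_\mu^- , \beta' \rangle$, $\beta,\beta' \in {}_M\mathcal{X}_M^\alpha$. Further let $\vec d$ denote the column vector with entries $d_\beta$, $\beta \in {}_M\mathcal{X}_M^\alpha$. Then $\vec d$ is a simultaneous eigenvector of the matrices $R_{\lambda,\mu}$ with respective eigenvalues $d_\lambda d_\mu$. We define another vector $\vec v$ by putting

\[ v_\beta = \sum_{\lambda,\mu \in {}_N\mathcal{X}_N} d_\lambda d_\mu \langle \beta , \alpha_\lambda^+\alpha_\mu^- \rangle , \qquad \beta \in {}_M\mathcal{X}_M^\alpha . \]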


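Then we have $R_{\lambda,\mu}\vec v = d_\lambda d_\mu \vec v$, as we can compute

\[
\begin{aligned}
(R_{\lambda,\mu}\vec v)_\beta
&= \sum_{\beta' \in {}_M\mathcal{X}_M^\alpha} \ \sum_{\nu,\rho \in {}_N\mathcal{X}_N} \langle \beta\alpha_\lambda^+\alpha_\mu^- , \beta' \rangle\, d_\nu d_\rho\, \langle \beta' , \alpha_\nu^+\alpha_\rho^- \rangle \\
&= \sum_{\nu,\rho \in {}_N\mathcal{X}_N} d_\nu d_\rho\, \langle \beta\alpha_\lambda^+\alpha_\mu^- , \alpha_\nu^+\alpha_\rho^- \rangle \\
&= \sum_{\nu,\rho,\xi,\eta \in {}_N\mathcal{X}_N} d_\nu d_\rho\, N_{\nu,\bar\lambda}^{\xi}\, N_{\rho,\bar\mu}^{\eta}\, \langle \beta , \alpha_\xi^+\alpha_\eta^- \rangle \\
&= \sum_{\xi,\eta \in {}_N\mathcal{X}_N} d_\lambda d_\mu d_\xi d_\eta\, \langle \beta , \alpha_\xi^+\alpha_\eta^- \rangle = d_\lambda d_\mu v_\beta .
\end{aligned}
\]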

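Because the sum matrix $\sum_{\lambda,\mu} R_{\lambda,\mu}$ is irreducible it follows $\vec v = \zeta \vec d$, $\zeta \in \mathbb{R}$, by the uniqueness of the Perron-Frobenius eigenvector. Note that $d_\lambda d_\mu = \sum_\beta \langle \beta , \alpha_\lambda^+\alpha_\mu^- \rangle d_\beta$, and hence $w^2 = \sum_\beta v_\beta d_\beta = \zeta w_\alpha$. We next notice that $\zeta = v_0$ as $d_0 = 1$. But $v_0$ can be computed as

\[ v_0 = \sum_{\lambda,\mu \in {}_N\mathcal{X}_N} d_\lambda d_\mu \langle \alpha_\lambda^+ , \alpha_\mu^- \rangle = \sum_{\lambda,\mu \in {}_N\mathcal{X}_N} Y_{0,\lambda} Z_{\lambda,\mu} Y_{\mu,0} = \sum_{\lambda,\mu \in {}_N\mathcal{X}_N} Z_{0,\lambda} Y_{\lambda,\mu} Y_{\mu,0} , \]

where we used commutativity of the monodromy matrix Y with the coupling matrix Z [6, Thm. 5.7]. By Rehren’s argument [32] we have

\[ \sum_{\mu \in {}_N\mathcal{X}_N} Y_{\lambda,\mu} Y_{\mu,0} = \begin{cases} w\, d_\lambda & \lambda \in {}_N\mathcal{X}_N^{\mathrm{deg}} \\ 0 & \lambda \notin {}_N\mathcal{X}_N^{\mathrm{deg}} . \end{cases} \]

Hence $\zeta = w \sum_{\lambda \in {}_N\mathcal{X}_N^{\mathrm{deg}}} Z_{0,\lambda} d_\lambda$, establishing Eq. (10). Next we define two vectors $\vec v^{\,\pm}$ with entries $v_\beta^\pm = \sum_\lambda d_\lambda \langle \beta , \alpha_\lambda^\pm \rangle$, $\beta \in {}_M\mathcal{X}_M^\alpha$. From (the proof of) [7, Prop. 3.1] we learn that

\[ v_\beta^\pm = \begin{cases} d_\beta\, w/w_+ & \beta \in {}_M\mathcal{X}_M^\pm \\ 0 & \beta \notin {}_M\mathcal{X}_M^\pm . \end{cases} \]

Consequently $\langle \vec v^{\,+} , \vec v^{\,-} \rangle = w_0 w^2 / w_+^2$. But we can also compute directly

\[ \langle \vec v^{\,+} , \vec v^{\,-} \rangle = \sum_{\lambda,\mu} \ \sum_{\beta \in {}_M\mathcal{X}_M^0} d_\lambda \langle \alpha_\lambda^+ , \beta \rangle \langle \beta , \alpha_\mu^- \rangle d_\mu = \sum_{\lambda,\mu} d_\lambda Z_{\lambda,\mu} d_\mu = \zeta = \frac{w^2}{w_\alpha} , \]

completing the proof.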
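The definitions of Sect. 2 and the index formulas of Proposition 3.1 can be sanity-checked numerically on a small example. The sketch below is not taken from this paper: it assumes the standard SU(2) level 4 data (statistical dimensions, statistics phases and fusion rules given by the usual closed formulas) and a hypothetical branching matrix `branch`, chosen to reproduce the familiar $D_4$ invariant of the conformal inclusion SU(2)$_4$ ⊂ SU(3)$_1$. It builds Y from Eq. (9), forms S and T, checks that the resulting type I matrix Z commutes with both, and compares $w_0 = w_+^2/w_\alpha$ with the value obtained from $d_\tau w/w_+ = \sum_\lambda d_\lambda b_{\tau,\lambda}$.

```python
import cmath
import math

k = 4            # level (illustrative choice, not fixed by the text)
n = k + 1        # sectors are labelled 0, 1, ..., k; label 0 is the vacuum

# Standard SU(2)_k statistical dimensions and statistics phases (assumed input data).
d = [math.sin(math.pi * (a + 1) / (k + 2)) / math.sin(math.pi / (k + 2)) for a in range(n)]
omega = [cmath.exp(2j * math.pi * a * (a + 2) / (4 * (k + 2))) for a in range(n)]

def fusion(a, b, c):
    """Standard SU(2)_k fusion coefficient N_{a,b}^c, equal to 0 or 1."""
    return int(abs(a - b) <= c <= min(a + b, 2 * k - a - b) and (a + b + c) % 2 == 0)

# Monodromy matrix of Eq. (9): Y_{a,b} = sum_c (omega_a omega_b / omega_c) N_{a,b}^c d_c.
Y = [[sum(omega[a] * omega[b] / omega[c] * fusion(a, b, c) * d[c] for c in range(n))
      for b in range(n)] for a in range(n)]

z = sum(d[a] ** 2 * omega[a] for a in range(n))
w = sum(x * x for x in d)                       # global index
central_charge = 4 * cmath.phase(z) / math.pi   # one representative of c modulo 8

S = [[Y[a][b] / abs(z) for b in range(n)] for a in range(n)]
T_diag = [cmath.exp(-1j * math.pi * central_charge / 12) * omega[a] for a in range(n)]

# Hypothetical chiral branching coefficients b_{tau,lambda}; rows are three ambichiral sectors.
branch = [[1, 0, 0, 0, 1],
          [0, 0, 1, 0, 0],
          [0, 0, 1, 0, 0]]
# Type I coupling matrix Z_{a,b} = sum_tau b_{tau,a} b_{tau,b}.
Z = [[sum(row[a] * row[b] for row in branch) for b in range(n)] for a in range(n)]

# Largest entries of the commutators [Z, S] and [Z, T].
comm_S = max(abs(sum(Z[a][c] * S[c][b] - S[a][c] * Z[c][b] for c in range(n)))
             for a in range(n) for b in range(n))
comm_T = max(abs(Z[a][b] * (T_diag[b] - T_diag[a])) for a in range(n) for b in range(n))

# Global indices: w / w_+ = sum_a d_a Z_{a,0}; for a non-degenerate braiding only the
# vacuum is degenerate, so Eq. (10) reduces to w_alpha = w / Z_{0,0}.
w_plus = w / sum(d[a] * Z[a][0] for a in range(n))
w_alpha = w / Z[0][0]
w_zero_formula = w_plus ** 2 / w_alpha

# Independent value: d_tau = (w_+ / w) * sum_a d_a b_{tau,a}, then sum the squares.
d_tau = [w_plus / w * sum(d[a] * row[a] for a in range(n)) for row in branch]
w_zero_direct = sum(x * x for x in d_tau)

print(round(w, 6), round(w_plus, 6), round(w_zero_formula, 6), round(w_zero_direct, 6))
```

In this example the braiding is non-degenerate, so the check of Eq. (10) is the special case $w_\alpha = w$; a degenerate example would need input data that this sketch does not attempt to supply.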

Note that Proposition 3.1 in particular provides a new proof of the “generating property of α-induction”, i.e. ${}_M\mathcal{X}_M^\alpha = {}_M\mathcal{X}_M$ if the braiding is non-degenerate, which was established in [6, Thm. 5.10]: in that case the identity is the only degenerate morphism, so that Eq. (10) gives $w_\alpha = w$. Now recall that the chiral locality condition $\varepsilon^+(\theta,\theta)v^2 = v^2$ expresses local commutativity (“locality”) of the extended net, if $N \subset M$ arises from a net of subfactors [28].


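Proposition 3.2. The following conditions are equivalent:
1. We have $Z_{\lambda,0} = \langle \theta , \lambda \rangle$ for all $\lambda \in {}_N\mathcal{X}_N$.
2. We have $Z_{0,\lambda} = \langle \theta , \lambda \rangle$ for all $\lambda \in {}_N\mathcal{X}_N$.
3. Chiral locality holds: $\varepsilon^+(\theta,\theta)v^2 = v^2$.

Proof. The implications 3 ⇒ 1,2 follow from [3, Thm. 3.9]. We need to show 1,2 ⇒ 3. Recall $\langle \theta , \lambda \rangle = \langle \iota , \iota\lambda \rangle$. Moreover, by the extension property of α-induction we have

\[ \mathrm{Hom}(\mathrm{id} , \alpha_\lambda^\pm) \subset \mathrm{Hom}(\iota , \iota\lambda) , \qquad \lambda \in {}_N\mathcal{X}_N . \]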

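Hence, if $Z_{\lambda,0} = \langle \theta , \lambda \rangle$ (respectively $Z_{0,\lambda} = \langle \theta , \lambda \rangle$) then $\mathrm{Hom}(\mathrm{id} , \alpha_\lambda^\pm) = \mathrm{Hom}(\iota , \iota\lambda)$ for all $\lambda \in {}_N\mathcal{X}_N$. Then consequently $\mathrm{Hom}(\mathrm{id} , \alpha_\theta^\pm) = \mathrm{Hom}(\iota , \iota\theta)$. We clearly have $v \in \mathrm{Hom}(\iota , \iota\theta)$ and hence $v^2 = \alpha_\theta^\pm(v)v = \varepsilon^\pm(\theta,\theta)^* v^2$, i.e. chiral locality holds.

Recall from [7, Prop. 3.4] that the coupling matrix arising from a braided subfactor with satisfied chiral locality condition is automatically of type I. Hence Proposition 3.2 states that chiral locality is equivalent to the canonical endomorphism being “fully visible” in the vacuum row (or column) of the coupling matrix.

4. Intermediate Subfactors

In this section we are searching for certain intermediate subfactors $N \subset \tilde M \subset M$ of our subfactor $N \subset M$. It follows from [23, Sect. 3] that the set of such intermediate subfactors $\tilde M$ is in a bijective correspondence with systems of subspaces $K_\rho \subset H_\rho$, where $H_\rho = \mathrm{Hom}(\iota , \iota\rho)$, $\rho \in {}_N\mathcal{X}_N$, and subject to conditions
(i) $K_\rho^* \subset N K_{\bar\rho}$,
(ii) $K_\rho K_\sigma \subset \sum_{\xi \prec \rho\sigma} N K_\xi$,
where the sum in (ii) runs over all $\xi \in {}_N\mathcal{X}_N$ such that $N_{\rho,\sigma}^{\xi} > 0$. The factor $\tilde M$ is then generated by N and the $K_\rho$’s and is uniquely decomposed as

\[ \tilde M = \bigoplus_\rho N K_\rho . \]

The dual canonical endomorphism $\tilde\theta$ of $N \subset \tilde M$ decomposes as a sector as $[\tilde\theta] = \bigoplus_\rho n_\rho [\rho]$, where $n_\rho = \dim K_\rho$.

We now define the spaces

\[ K_\rho^\pm = \mathrm{Hom}(\mathrm{id} , \alpha_\rho^\pm) , \qquad \rho \in {}_N\mathcal{X}_N . \]

Note that $K_\rho^\pm \subset H_\rho = \mathrm{Hom}(\iota , \iota\rho)$, that $\dim K_\rho^+ = Z_{\rho,0}$ and $\dim K_\rho^- = Z_{0,\rho}$.

Lemma 4.1. We have

\[ K_\rho^\pm = \{ zv : z \in \mathrm{Hom}(\theta,\rho) , \ z\gamma(v) = z\varepsilon^\mp(\theta,\theta)\gamma(v) \} . \tag{11} \]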


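Proof. Let $x \in K_\rho^\pm$. Now x is uniquely decomposed as $x = zv$ with $z \in N$. Clearly $z \in \mathrm{Hom}(\theta,\rho)$. Then $x \in \mathrm{Hom}(\mathrm{id} , \alpha_\rho^\pm)$ reads, using naturality (see e.g. [6, Eq. (8)]),

\[ z\gamma(v)v = zv^2 = \alpha_\rho^\pm(v)zv = \varepsilon^\mp(\theta,\rho)vzv = \varepsilon^\mp(\theta,\rho)\theta(z)\gamma(v)v = z\varepsilon^\mp(\theta,\theta)\gamma(v)v , \]

hence is equivalent to $z\gamma(v) = z\varepsilon^\mp(\theta,\theta)\gamma(v)$.

We choose orthonormal bases of isometries $t\big({}^{\,\xi}_{\rho,\sigma}\big)_i \in \mathrm{Hom}(\xi , \rho\sigma)$, $i = 1, 2, \ldots , N_{\rho,\sigma}^{\xi}$, so that $\sum_\xi \sum_{i=1}^{N_{\rho,\sigma}^{\xi}} t\big({}^{\,\xi}_{\rho,\sigma}\big)_i \, t\big({}^{\,\xi}_{\rho,\sigma}\big)_i^* = 1$.

Lemma 4.2. We have
(i) $(K_\rho^\pm)^* = r_\rho^* K_{\bar\rho}^\pm$,
(ii) $K_\rho^\pm K_\sigma^\pm \subset \sum_\xi \sum_{i=1}^{N_{\rho,\sigma}^{\xi}} t\big({}^{\,\xi}_{\rho,\sigma}\big)_i \, K_\xi^\pm$.

Proof. Right Frobenius reciprocity [22] gives us an isomorphism $\mathrm{Hom}(\mathrm{id} , \alpha_{\bar\rho}^\pm) \to \mathrm{Hom}(\mathrm{id} , \alpha_\rho^\pm)$, $x \mapsto x^* r_\rho$. Thus any element $y \in K_\rho^\pm$ can be written as $x^* r_\rho$ with some $x \in K_{\bar\rho}^\pm$, proving (i). To prove (ii), let $x_\rho = z_\rho v \in K_\rho^\pm$ and $x_\sigma = z_\sigma v \in K_\sigma^\pm$ be the decompositions according to Lemma 4.1. Then

\[ x_\rho x_\sigma = z_\rho v z_\sigma v = \sum_\xi \sum_{i=1}^{N_{\rho,\sigma}^{\xi}} t\big({}^{\,\xi}_{\rho,\sigma}\big)_i \, t\big({}^{\,\xi}_{\rho,\sigma}\big)_i^* \, z_\rho \theta(z_\sigma)\gamma(v)v . \]

We first notice that $t\big({}^{\,\xi}_{\rho,\sigma}\big)_i^* \, z_\rho\theta(z_\sigma)\gamma(v) \in \mathrm{Hom}(\theta,\xi)$. Next we check by use of the BFE and by $v^2 = \gamma(v)v$ that

\[
\begin{aligned}
z_\rho\theta(z_\sigma)\gamma(v) \cdot \varepsilon^\mp(\theta,\theta)\gamma(v)
&= z_\rho\theta(z_\sigma)\,\varepsilon^\mp(\theta,\theta^2)\,\theta(\gamma(v))\gamma(v) \\
&= z_\rho\theta(z_\sigma)\,\theta(\varepsilon^\mp(\theta,\theta))\,\varepsilon^\mp(\theta,\theta)\,\gamma(v)^2 \\
&= \rho(z_\sigma\varepsilon^\mp(\theta,\theta))\, z_\rho\,\varepsilon^\mp(\theta,\theta)\,\gamma(v)^2 \\
&= z_\rho\theta(z_\sigma)\,\theta(\varepsilon^\mp(\theta,\theta))\,\gamma(v)^2 \\
&= z_\rho\,\theta(z_\sigma\varepsilon^\mp(\theta,\theta)\gamma(v))\,\gamma(v) \\
&= z_\rho\theta(z_\sigma)\gamma(v) \cdot \gamma(v) .
\end{aligned}
\]

We conclude

\[ x_\rho x_\sigma = \sum_\xi \sum_{i=1}^{N_{\rho,\sigma}^{\xi}} t\big({}^{\,\xi}_{\rho,\sigma}\big)_i \, x_\xi^i \quad \text{with} \quad x_\xi^i = t\big({}^{\,\xi}_{\rho,\sigma}\big)_i^* \, z_\rho\theta(z_\sigma)\gamma(v)v \in K_\xi^\pm . \]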

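Corollary 4.3. There are two (possibly identical) intermediate subfactors $N \subset M_\pm \subset M$ with $M_\pm = \bigoplus_\rho N K_\rho^\pm$.

Our next aim is to show that the subfactors $N \subset M_\pm$ obey the chiral locality condition. Let $\iota_\pm : N \hookrightarrow M_\pm$ denote the injection maps, so that the (dual) canonical endomorphisms are given by $\gamma_\pm = \iota_\pm\bar\iota_\pm$ and $\theta_\pm = \bar\iota_\pm\iota_\pm$. We now know that $[\theta_+] = \bigoplus_\rho Z_{\rho,0}[\rho]$ and $[\theta_-] = \bigoplus_\rho Z_{0,\rho}[\rho]$. Due to commutativity of Y and Z, which yields in particular $\sum_\rho d_\rho Z_{\rho,0} = \sum_\rho Z_{0,\rho} d_\rho$, we find $d_{\theta_+} = d_{\theta_-}$, i.e. the subfactors $N \subset M_+$ and $N \subset M_-$ have the same Jones index. Moreover, we can apply α-induction, i.e. we define morphisms in $\mathrm{End}(M_\delta)$ by

\[ \tilde\alpha^\pm_{\delta;\lambda} = \bar\iota_\delta^{\,-1} \circ \mathrm{Ad}(\varepsilon^\pm(\lambda,\theta_\delta)) \circ \lambda \circ \bar\iota_\delta , \]

where the index δ is either δ = + or δ = −. This will give rise to “parent” coupling matrices $Z^\delta$. Thanks to Prop. 3.2, it suffices to show $Z^+_{\lambda,0} = Z_{\lambda,0}$ and $Z^-_{0,\lambda} = Z_{0,\lambda}$, $\lambda \in {}_N\mathcal{X}_N$, to prove that the chiral locality condition holds for $N \subset M_+$ and $N \subset M_-$, respectively.

Lemma 4.4. We have $\tilde\alpha^\pm_{\delta;\lambda}(x_\rho) = \varepsilon^\pm(\lambda,\rho)^* x_\rho$ for any $x_\rho \in K_\rho^\delta$ and $\lambda,\rho \in {}_N\mathcal{X}_N$.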

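Proof. Let $x_\rho = z_\rho v$ be the decomposition according to Lemma 4.1. We first notice that $\gamma_\delta(z_\rho v)\theta_\delta(n) = \gamma_\delta(z_\rho\theta(n)v) = \gamma_\delta(\rho(n)z_\rho v) = \theta_\delta\rho(n)\gamma_\delta(z_\rho v)$ for any $n \in N$, i.e. $\gamma_\delta(x_\rho) \in \mathrm{Hom}(\theta_\delta , \theta_\delta\rho)$. Therefore we can compute

\[
\begin{aligned}
\gamma_\delta(\tilde\alpha^\pm_{\delta;\lambda}(x_\rho))
&= \varepsilon^\pm(\lambda,\theta_\delta)\,\lambda\gamma_\delta(x_\rho)\,\varepsilon^\mp(\theta_\delta,\lambda)
 = \varepsilon^\mp(\theta_\delta,\lambda)^*\,\varepsilon^\mp(\theta_\delta\rho,\lambda)\,\gamma_\delta(x_\rho) \\
&= \varepsilon^\mp(\theta_\delta,\lambda)^*\,\varepsilon^\mp(\theta_\delta,\lambda)\,\theta_\delta(\varepsilon^\mp(\rho,\lambda))\,\gamma_\delta(x_\rho)
 = \gamma_\delta(\varepsilon^\pm(\lambda,\rho)^* x_\rho) ,
\end{aligned}
\]

and application of $\gamma_\delta^{-1}$ yields the statement.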

In the same manner we obtain of course also $\alpha_\lambda^\pm(x_\rho) = \varepsilon^\pm(\lambda,\rho)^* x_\rho$ for any $x_\rho \in \mathrm{Hom}(\iota , \iota\rho)$. Therefore we obtain immediately

Corollary 4.5. We have $\alpha_\lambda^\pm|_{M_\delta} = \tilde\alpha^\pm_{\delta;\lambda}$.

Hence $x_\lambda x_\rho = \alpha_\lambda^\pm(x_\rho)x_\lambda = \tilde\alpha^\pm_{\delta;\lambda}(x_\rho)x_\lambda$ whenever $x_\lambda \in K_\lambda^\pm$ and $x_\rho \in K_\rho^\delta$, and thus in particular $K_\lambda^\pm \subset \mathrm{Hom}(\mathrm{id}_{M_\pm} , \tilde\alpha^\pm_{\pm;\lambda})$. (Warning: Note that the super- and subscripts, referring to the ±-induction respectively to the choice of the algebra $M_\pm$, now have to be the same because we have $K_\lambda^+ \subset M_+$ and $K_\lambda^- \subset M_-$, but in general not the other way round. Here and in the following, any formula containing such combined ±-indices has to be read in such a way that we either take all the upper or all the lower signs.) On the other hand we have

\[ \langle \mathrm{id}_{M_\pm} , \tilde\alpha^\pm_{\pm;\lambda} \rangle \le \langle \theta_\pm , \lambda \rangle = \dim K_\lambda^\pm . \]

Therefore we arrive in particular at

Corollary 4.6. We have $\mathrm{Hom}(\mathrm{id}_{M_\pm} , \tilde\alpha^\pm_{\pm;\lambda}) = K_\lambda^\pm$ for any $\lambda \in {}_N\mathcal{X}_N$.

Corollary 4.6 tells us in particular that $Z^+_{\lambda,0} = Z_{\lambda,0}$ and $Z^-_{0,\lambda} = Z_{0,\lambda}$, so that both subfactors $N \subset M_\pm$ must satisfy the chiral locality condition. In turn they must be irreducible by the argument of [3, Cor. 3.6]. We summarize the discussion in the following

Theorem 4.7. There are two (possibly identical) intermediate subfactors $N \subset M_\pm \subset M$. The irreducible subfactors $N \subset M_+$ and $N \subset M_-$ have the same Jones index, they both satisfy the chiral locality condition and consequently give rise to type I coupling matrices $Z^\pm$. The latter are related to the coupling matrix of the full subfactor $N \subset M$ through $Z^+_{\lambda,0} = Z_{\lambda,0}$ and $Z^-_{0,\lambda} = Z_{0,\lambda}$.


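5. Chiral Fusion Rule Automorphisms

We now investigate the relations between the chiral systems ${}_M\mathcal{X}_M^+$, ${}_M\mathcal{X}_M^-$ and ${}_{M_+}\mathcal{X}_{M_+}^+$, ${}_{M_-}\mathcal{X}_{M_-}^-$, respectively. Note that by [6, Prop. 3.1] and Theorem 4.7, they all must have the same chiral global index $w_+$.

Lemma 5.1. We have

\[ \mathrm{Hom}(\alpha_\lambda^\pm , \alpha_\mu^\pm) = \mathrm{Hom}(\tilde\alpha^\pm_{\pm;\lambda} , \tilde\alpha^\pm_{\pm;\mu}) , \tag{12} \]

in particular $\mathrm{Hom}(\alpha_\lambda^\pm , \alpha_\mu^\pm) \subset M_\pm$, for any $\lambda,\mu \in \Sigma({}_N\mathcal{X}_N)$.

Proof. First let $\xi \in \Sigma({}_N\mathcal{X}_N)$. Then there are orthonormal bases of isometries $s_{\nu,i} \in \mathrm{Hom}(\nu,\xi)$, $\nu \in {}_N\mathcal{X}_N$, $i = 1, 2, \ldots , \langle \nu , \xi \rangle$, such that $\sum_{\nu,i} s_{\nu,i} s_{\nu,i}^* = 1$. We may write $x \in \mathrm{Hom}(\mathrm{id} , \alpha_\xi^\pm)$ as $x = \sum_{\nu,i} s_{\nu,i} s_{\nu,i}^* x$, and we notice $s_{\nu,i}^* x \in \mathrm{Hom}(\mathrm{id} , \alpha_\nu^\pm) = \mathrm{Hom}(\mathrm{id} , \tilde\alpha^\pm_{\pm;\nu})$, thanks to Corollary 4.6, so that $x \in \mathrm{Hom}(\mathrm{id} , \tilde\alpha^\pm_{\pm;\xi})$. The same argument works vice versa. Now let $\lambda,\mu \in \Sigma({}_N\mathcal{X}_N)$. Then we have Frobenius isomorphisms

\[ \mathrm{Hom}(\alpha_\lambda^\pm , \alpha_\mu^\pm) \longrightarrow \mathrm{Hom}(\mathrm{id} , \alpha^\pm_{\bar\lambda\mu}) = \mathrm{Hom}(\mathrm{id} , \tilde\alpha^\pm_{\pm;\bar\lambda\mu}) \longrightarrow \mathrm{Hom}(\tilde\alpha^\pm_{\pm;\lambda} , \tilde\alpha^\pm_{\pm;\mu}) \]

which map

\[ t \mapsto s = \sqrt{d_\lambda/d_\mu}\ \alpha^\pm_{\bar\lambda}(t)\, r_\lambda , \qquad s \mapsto t' = \sqrt{d_\lambda d_\mu}\ \bar r_\lambda^{\,*}\, \tilde\alpha^\pm_{\pm;\lambda}(s) . \]

As we have

\[ t' = d_\lambda\, \bar r_\lambda^{\,*}\, \tilde\alpha^\pm_{\pm;\lambda}(\alpha^\pm_{\bar\lambda}(t) r_\lambda) = d_\lambda\, \bar r_\lambda^{\,*}\, \alpha^\pm_{\lambda\bar\lambda}(t)\, \lambda(r_\lambda) = t \]

it follows $\mathrm{Hom}(\alpha_\lambda^\pm , \alpha_\mu^\pm) = \mathrm{Hom}(\tilde\alpha^\pm_{\pm;\lambda} , \tilde\alpha^\pm_{\pm;\mu}) \subset M_\pm$.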

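Lemma 5.2. Each $\beta_\pm \in {}_M\mathcal{X}_M^\pm$ is equivalent to an extension of some $\tilde\beta_\pm \in {}_{M_\pm}\mathcal{X}_{M_\pm}^\pm$. This association gives rise to bijections $\vartheta_\pm : {}_M\mathcal{X}_M^\pm \to {}_{M_\pm}\mathcal{X}_{M_\pm}^\pm$.

Proof. Assume $\beta \equiv \beta_+ \in {}_M\mathcal{X}_M^+$, i.e. there is a $\lambda \in {}_N\mathcal{X}_N$ and an isometry $t \in \mathrm{Hom}(\beta , \alpha_\lambda^+)$. Then $tt^*$ is a minimal projection in $\mathrm{Hom}(\tilde\alpha^+_{+;\lambda} , \tilde\alpha^+_{+;\lambda})$ by Lemma 5.1. Hence there is a $\tilde\beta \in {}_{M_+}\mathcal{X}_{M_+}^+$ and an isometry $\tilde t \in \mathrm{Hom}(\tilde\beta , \tilde\alpha^+_{+;\lambda})$ such that $\tilde t\tilde t^* = tt^*$. Thus putting $\beta'(m) = \tilde t^* \alpha_\lambda^+(m) \tilde t$ for $m \in M$ gives an equivalent endomorphism, as $\beta' = \mathrm{Ad}(u) \circ \beta$ with the unitary $u = \tilde t^* t \in M$, and we clearly have $\beta'|_{M_+} = \tilde\beta$, thanks to Corollary 4.5. It remains to show that, if $\tilde\beta_1, \tilde\beta_2 \in {}_{M_+}\mathcal{X}_{M_+}^+$ correspond this way to different $\beta_1, \beta_2 \in {}_M\mathcal{X}_M^+$, then $\tilde\beta_1$ and $\tilde\beta_2$ are disjoint. Let $t_j \in \mathrm{Hom}(\beta_j , \alpha^+_{\lambda_j})$ and $\tilde t_j \in \mathrm{Hom}(\tilde\beta_j , \tilde\alpha^+_{+;\lambda_j})$ be isometries as above, i.e. $t_j t_j^* = \tilde t_j \tilde t_j^*$, $j = 1, 2$. Assume for contradiction that there is a unitary $q \in \mathrm{Hom}(\tilde\beta_1 , \tilde\beta_2)$. But then $\tilde t_2 q \tilde t_1^* \in \mathrm{Hom}(\tilde\alpha^+_{+;\lambda_1} , \tilde\alpha^+_{+;\lambda_2}) = \mathrm{Hom}(\alpha^+_{\lambda_1} , \alpha^+_{\lambda_2})$, so that $t_2^* \tilde t_2 q \tilde t_1^* t_1$ is a unitary in $\mathrm{Hom}(\beta_1 , \beta_2)$, in contradiction to $\beta_1, \beta_2$ being different elements in ${}_M\mathcal{X}_M^+$. Hence the association $\beta \mapsto \tilde\beta$ defines a bijection $\vartheta_+ : {}_M\mathcal{X}_M^+ \to {}_{M_+}\mathcal{X}_{M_+}^+$. The proof is completed by exchanging “+” by “−” signs.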


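Lemma 5.3. The bijections $\vartheta_\pm : {}_M\mathcal{X}_M^\pm \to {}_{M_\pm}\mathcal{X}_{M_\pm}^\pm$ preserve the chiral branching,

\[ \langle \beta , \alpha_\lambda^\pm \rangle = \langle \vartheta_\pm(\beta) , \tilde\alpha^\pm_{\pm;\lambda} \rangle , \qquad \beta \in {}_M\mathcal{X}_M^\pm , \quad \lambda \in {}_N\mathcal{X}_N , \tag{13} \]

and the chiral fusion rules

\[ \langle \beta_3 , \beta_1\beta_2 \rangle = \langle \vartheta_\pm(\beta_3) , \vartheta_\pm(\beta_1)\vartheta_\pm(\beta_2) \rangle , \qquad \beta_1, \beta_2, \beta_3 \in {}_M\mathcal{X}_M^\pm , \tag{14} \]

and the statistical dimensions.

Proof. We just consider the “+” case, the proof for “−” is analogous. By Lemma 5.2 we may and do assume for simplicity that now all $\beta \in {}_M\mathcal{X}_M^+$ are chosen such that $\beta|_{M_+} = \vartheta_+(\beta)$. This already forces equality of statistical dimensions $d_\beta = d_{\vartheta_+(\beta)}$. Moreover, we just have $\mathrm{Hom}(\beta , \alpha_\lambda^+) = \mathrm{Hom}(\vartheta_+(\beta) , \tilde\alpha^+_{+;\lambda})$, giving Eq. (13). Given isometries $t_j \in \mathrm{Hom}(\beta_j , \alpha^+_{\lambda_j}) = \mathrm{Hom}(\vartheta_+(\beta_j) , \tilde\alpha^+_{+;\lambda_j})$, $j = 1, 2, 3$, and also $y \in \mathrm{Hom}(\beta_3 , \beta_1\beta_2)$, then we find similarly

\[ t_1 \alpha^+_{\lambda_1}(t_2)\, y\, t_3^* \in \mathrm{Hom}(\alpha^+_{\lambda_3} , \alpha^+_{\lambda_1}\alpha^+_{\lambda_2}) = \mathrm{Hom}(\tilde\alpha^+_{+;\lambda_3} , \tilde\alpha^+_{+;\lambda_1}\tilde\alpha^+_{+;\lambda_2}) \]

by Lemma 5.1. Therefore, by using $\alpha^+_{\lambda_1}(t_2) = \tilde\alpha^+_{+;\lambda_1}(t_2)$ we finally find that $y \in \mathrm{Hom}(\vartheta_+(\beta_3) , \vartheta_+(\beta_1)\vartheta_+(\beta_2))$. The same argument works vice versa, so that the intertwiner spaces $\mathrm{Hom}(\beta_3 , \beta_1\beta_2)$ and $\mathrm{Hom}(\vartheta_+(\beta_3) , \vartheta_+(\beta_1)\vartheta_+(\beta_2))$ are equal.

Lemma 5.4. The bijections $\vartheta_\pm$ restrict to bijections ${}_M\mathcal{X}_M^0 \to {}_{M_\pm}\mathcal{X}_{M_\pm}^0$ of the ambichiral subsystems.

Proof. Let $\tau \in {}_M\mathcal{X}_M^0$, i.e. there are isometries $s \in \mathrm{Hom}(\tau , \alpha_\lambda^+)$ and $t \in \mathrm{Hom}(\tau , \alpha_\mu^-)$ for some $\lambda,\mu \in {}_N\mathcal{X}_N$. Put $q = ts^* \in \mathrm{Hom}(\alpha_\lambda^+ , \alpha_\mu^-)$. Then $q \in \mathrm{Hom}(\iota\lambda , \iota\mu)$ and

\[ q\,\varepsilon^+(\lambda,\rho)^* x_\rho = \varepsilon^-(\mu,\rho)^* x_\rho\, q \]

whenever $x_\rho \in \mathrm{Hom}(\iota , \iota\rho)$. Hence, using Eq. (9), we calculate the left-hand side as

\[ q\,\varepsilon^+(\lambda,\rho)^* x_\rho = q\,\varepsilon^-(\rho,\lambda) x_\rho = \varepsilon^-(\rho,\mu)\alpha_\rho^-(q) x_\rho = \varepsilon^+(\mu,\rho)^* \alpha_\rho^-(q) x_\rho . \]

Now let us specialize to the case $x_\rho \in K_\rho^-$. Then $x_\rho q = \alpha_\rho^-(q) x_\rho$, so that

\[ \varepsilon^+(\mu,\rho)^* x_\rho\, q = \varepsilon^-(\mu,\rho)^* x_\rho\, q . \]

It follows $\tilde\alpha^+_{-;\mu}(m)q = \tilde\alpha^-_{-;\mu}(m)q$ for all $m \in M_-$. But note that $qq^* = tt^*$ which lies in $\mathrm{Hom}(\tilde\alpha^-_{-;\mu} , \tilde\alpha^-_{-;\mu})$ by Lemma 5.1. Hence $\tilde\alpha^+_{-;\mu}(m)tt^* = \tilde\alpha^-_{-;\mu}(m)tt^*$ for all $m \in M_-$. We can similarly derive that $\tilde\alpha^+_{+;\lambda}(m)ss^* = \tilde\alpha^-_{+;\lambda}(m)ss^*$ for all $m \in M_+$. There are isometries $\tilde t \in \mathrm{Hom}(\vartheta_-(\tau) , \tilde\alpha^-_{-;\mu})$ and $\tilde s \in \mathrm{Hom}(\vartheta_+(\tau) , \tilde\alpha^+_{+;\lambda})$ such that $\tilde t\tilde t^* = tt^*$ and $\tilde s\tilde s^* = ss^*$. But we now find

\[ \vartheta_-(\tau)(m) = \tilde t^*\, \tilde\alpha^-_{-;\mu}(m)\, \tilde t = \tilde t^*\, \tilde\alpha^+_{-;\mu}(m)\, \tilde t , \qquad m \in M_- , \]

as well as

\[ \vartheta_+(\tau)(m) = \tilde s^*\, \tilde\alpha^+_{+;\lambda}(m)\, \tilde s = \tilde s^*\, \tilde\alpha^-_{+;\lambda}(m)\, \tilde s , \qquad m \in M_+ , \]

i.e. $\vartheta_\pm(\tau) \in {}_{M_\pm}\mathcal{X}_{M_\pm}^0$. Thus $\vartheta_\pm$ map ${}_M\mathcal{X}_M^0$ into ${}_{M_\pm}\mathcal{X}_{M_\pm}^0$. But it follows from Proposition 3.1 and Theorem 4.7 that the systems ${}_M\mathcal{X}_M^0$ and ${}_{M_\pm}\mathcal{X}_{M_\pm}^0$ all have the same ambichiral global index $w_0$. This proves the lemma.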


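We now can state the precise relation between the coupling matrix Z, arising from $N \subset M$ and given as in Eq. (6), and its type I parents $Z^\pm$ arising from $N \subset M_\pm$.

Theorem 5.5. The entries of the type I coupling matrices $Z^\pm$ arising from $N \subset M_\pm$ can be written as

\[ Z^\pm_{\lambda,\mu} = \sum_{\tau \in {}_M\mathcal{X}_M^0} b^\pm_{\tau,\lambda}\, b^\pm_{\tau,\mu} \tag{15} \]

with chiral branching coefficients

\[ b^\pm_{\tau,\lambda} = \langle \tau , \alpha_\lambda^\pm \rangle , \qquad \tau \in {}_M\mathcal{X}_M^0 , \quad \lambda \in {}_N\mathcal{X}_N . \tag{16} \]

If the two intermediate subfactors of Corollary 4.3 are identical, $M_+ = M_-$, (so that the parent coupling matrices coincide, $Z^+ = Z^-$) then the entries of the coupling matrix Z arising from the full $N \subset M$ can be written as

\[ Z_{\lambda,\mu} = \sum_{\tau \in {}_M\mathcal{X}_M^0} b^+_{\tau,\lambda}\, b^+_{\omega(\tau),\mu} . \tag{17} \]

Here the permutation $\omega = \vartheta_+^{-1} \circ \vartheta_-$ of ${}_M\mathcal{X}_M^0$, satisfying $\omega(0) = 0$, realizes an automorphism of the ambichiral fusion rules.

Proof. Since the chiral locality condition holds for $N \subset M_\pm$ we have

\[ \langle \tilde\tau_\pm , \tilde\alpha^+_{\pm;\mu} \rangle = \langle \bar\iota_\pm \tilde\tau_\pm \iota_\pm , \mu \rangle = \langle \tilde\tau_\pm , \tilde\alpha^-_{\pm;\mu} \rangle \]

for $\tilde\tau_\pm \in {}_{M_\pm}\mathcal{X}_{M_\pm}^0$ and $\mu \in {}_N\mathcal{X}_N$, thanks to [5, Prop. 3.3]. Therefore

\[
\begin{aligned}
Z^\pm_{\lambda,\mu}
&= \sum_{\tilde\tau_\pm \in {}_{M_\pm}\mathcal{X}_{M_\pm}^0} \langle \tilde\alpha^+_{\pm;\lambda} , \tilde\tau_\pm \rangle \langle \tilde\tau_\pm , \tilde\alpha^-_{\pm;\mu} \rangle
 = \sum_{\tilde\tau_\pm \in {}_{M_\pm}\mathcal{X}_{M_\pm}^0} \langle \tilde\alpha^\pm_{\pm;\lambda} , \tilde\tau_\pm \rangle \langle \tilde\tau_\pm , \tilde\alpha^\pm_{\pm;\mu} \rangle \\
&= \sum_{\tilde\tau_\pm \in {}_{M_\pm}\mathcal{X}_{M_\pm}^0} \langle \alpha_\lambda^\pm , \vartheta_\pm^{-1}(\tilde\tau_\pm) \rangle \langle \vartheta_\pm^{-1}(\tilde\tau_\pm) , \alpha_\mu^\pm \rangle
 = \sum_{\tau \in {}_M\mathcal{X}_M^0} b^\pm_{\tau,\lambda}\, b^\pm_{\tau,\mu}
\end{aligned}
\]

for $\lambda,\mu \in {}_N\mathcal{X}_N$. Now if $M_+ = M_-$ then

\[
\begin{aligned}
Z_{\lambda,\mu}
&= \sum_{\tau \in {}_M\mathcal{X}_M^0} \langle \alpha_\lambda^+ , \tau \rangle \langle \tau , \alpha_\mu^- \rangle
 = \sum_{\tau \in {}_M\mathcal{X}_M^0} b^+_{\tau,\lambda}\, \langle \vartheta_-(\tau) , \tilde\alpha^-_{-;\mu} \rangle \\
&= \sum_{\tau \in {}_M\mathcal{X}_M^0} b^+_{\tau,\lambda}\, \langle \vartheta_-(\tau) , \tilde\alpha^+_{+;\mu} \rangle
 = \sum_{\tau \in {}_M\mathcal{X}_M^0} b^+_{\tau,\lambda}\, \langle \vartheta_+^{-1} \circ \vartheta_-(\tau) , \alpha_\mu^+ \rangle
\end{aligned}
\]

for $\lambda,\mu \in {}_N\mathcal{X}_N$. As $M_+ = M_-$, putting $\omega = \vartheta_+^{-1} \circ \vartheta_-$ gives a well-defined permutation of ${}_M\mathcal{X}_M^0$ and yields the desired formula $Z_{\lambda,\mu} = \sum_{\tau \in {}_M\mathcal{X}_M^0} b^+_{\tau,\lambda} b^+_{\omega(\tau),\mu}$. Due to Lemma 5.3, $\omega$ preserves the fusion rules and we also have $\omega(0) = 0$ because always $\vartheta_\pm(\mathrm{id}_M) = \mathrm{id}_{M_\pm}$.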
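To see Eqs. (15) and (17) at work, the following sketch builds a type I parent and its twisted descendant from one branching matrix. The example is not from this paper's text: it assumes the standard SU(2) level 16 data, where the $E_7$ invariant is commonly described as the $D_{10}$ invariant twisted by a fusion rule automorphism. The ambichiral labels 0 to 5, the list `branching_sets` and the permutation `omega_perm` are hypothetical names introduced only for this illustration.

```python
import math

k = 16          # level (illustrative choice)
n = k + 1       # sectors 0, 1, ..., k; label 0 is the vacuum

# Kac-Peterson S-matrix of SU(2)_k (standard closed formula, assumed here).
S = [[math.sqrt(2 / (k + 2)) * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
      for b in range(n)] for a in range(n)]

# Hypothetical ambichiral labels 0..5 with the sectors lambda they branch to;
# the last two entries are the two sectors sitting over lambda = 8.
branching_sets = [(0, 16), (2, 14), (4, 12), (6, 10), (8,), (8,)]
b_plus = [[1 if a in lams else 0 for a in range(n)] for lams in branching_sets]

# Candidate fusion rule automorphism with omega(0) = 0, swapping labels 1 and 4.
omega_perm = [0, 4, 2, 3, 1, 5]
identity_perm = list(range(len(b_plus)))

def coupling(b, perm):
    """Z_{a,c} = sum_tau b[tau][a] * b[perm[tau]][c], cf. Eqs. (15) and (17)."""
    return [[sum(b[t][a] * b[perm[t]][c] for t in range(len(b))) for c in range(n)]
            for a in range(n)]

def commutator_size(Z):
    """Largest entry (in absolute value) of ZS - SZ."""
    return max(abs(sum(Z[a][c] * S[c][b] - S[a][c] * Z[c][b] for c in range(n)))
               for a in range(n) for b in range(n))

Z_parent = coupling(b_plus, identity_perm)   # type I parent, Eq. (15)
Z_twisted = coupling(b_plus, omega_perm)     # twisted matrix, Eq. (17)

# Parent and twisted matrix share the vacuum column and row, as in Theorem 4.7.
same_vacuum_column = all(Z_parent[a][0] == Z_twisted[a][0] for a in range(n))
same_vacuum_row = all(Z_parent[0][a] == Z_twisted[0][a] for a in range(n))

print(commutator_size(Z_parent) < 1e-9, commutator_size(Z_twisted) < 1e-9)
print(same_vacuum_column, same_vacuum_row, Z_parent[8][8], Z_twisted[8][8], Z_twisted[2][8])
```

Only commutation with S is tested here; the sketch does not verify that `omega_perm` preserves the ambichiral fusion rules, which in the text is guaranteed by Lemma 5.3.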


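Note that even if $M_+ \neq M_-$, $Z^+ \neq Z^-$, the coupling matrix Z is still governed by an isomorphism of (ambichiral) fusion rule algebras, in perfect agreement with [29]. Namely, if we use the system $\tilde\tau_+ \in {}_{M_+}\mathcal{X}_{M_+}^0$ for the summation in Z, then the general formula Eq. (6) can be written as

\[ Z_{\lambda,\mu} = \sum_{\tilde\tau_+ \in {}_{M_+}\mathcal{X}_{M_+}^0} \langle \tilde\alpha^+_{+;\lambda} , \tilde\tau_+ \rangle \langle \vartheta(\tilde\tau_+) , \tilde\alpha^-_{-;\mu} \rangle = \sum_{\tilde\tau_+ \in {}_{M_+}\mathcal{X}_{M_+}^0} \langle \lambda , \bar\iota_+ \tilde\tau_+ \iota_+ \rangle \langle \bar\iota_- \vartheta(\tilde\tau_+) \iota_- , \mu \rangle \]

for $\lambda,\mu \in {}_N\mathcal{X}_N$, by virtue of Lemma 5.3 and chiral locality, guaranteeing [5, Prop. 3.3]. Here $\vartheta$ is the bijection $\vartheta = \vartheta_- \circ \vartheta_+^{-1} : {}_{M_+}\mathcal{X}_{M_+}^0 \to {}_{M_-}\mathcal{X}_{M_-}^0$, yielding an isomorphism of the ambichiral fusion rules. This corresponds to the “extended” coupling matrix

\[ Z^{\mathrm{ext}}_{\tilde\tau_+,\tilde\tau_-} = \delta_{\tilde\tau_-,\vartheta(\tilde\tau_+)} , \qquad \tilde\tau_+ \in {}_{M_+}\mathcal{X}_{M_+}^0 , \quad \tilde\tau_- \in {}_{M_-}\mathcal{X}_{M_-}^0 , \tag{18} \]

which now has different left and right labels. (The reader may think of left and right extended characters $\chi^{\mathrm{ext};\pm}_{\tilde\tau_\pm} = \sum_\lambda \langle \lambda , \bar\iota_\pm \tilde\tau_\pm \iota_\pm \rangle \chi_\lambda$ here.) This extended coupling matrix also appears in Eq. (20) below. Now only if $M_+ = M_-$, so that ${}_{M_+}\mathcal{X}_{M_+}^0 = {}_{M_-}\mathcal{X}_{M_-}^0$, Eq. (18) can be reduced to $Z^{\mathrm{ext}}_{\tilde\tau_+,\tilde\tau'_+} = \delta_{\tilde\tau'_+,\vartheta(\tilde\tau_+)}$ which is nothing but the permutation $\omega$ up to relabeling by ${}_M\mathcal{X}_M^0$, $\omega = \vartheta_+^{-1} \circ \vartheta \circ \vartheta_+$, which we used in Eq. (17) just for notational convenience.

Recall that there is a relative braiding between the morphisms in ${}_M\mathcal{X}_M^+$ and ${}_M\mathcal{X}_M^-$ which restricts to a proper braiding on ${}_M\mathcal{X}_M^0$, and for $\tau,\tau' \in {}_M\mathcal{X}_M^0$ these braiding operators are given by [5]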

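\[ \varepsilon_r(\tau,\tau') = s^* \alpha_\mu^-(t)^* \varepsilon^+(\lambda,\mu) \alpha_\lambda^+(s) t \]

whenever $t \in \mathrm{Hom}(\tau , \alpha_\lambda^+)$ and $s \in \mathrm{Hom}(\tau' , \alpha_\mu^-)$ are isometries, $\lambda,\mu \in {}_N\mathcal{X}_N$. We can extend this braiding from ${}_M\mathcal{X}_M^0$ to $\Sigma({}_M\mathcal{X}_M^0)$ as explained in [6, Sect. 2]. Now let $N^{\mathrm{opp}}$ denote the opposite algebra of N and let j denote the natural anti-linear isomorphism. For $\lambda \in \mathrm{End}(N)$ we denote $\lambda^{\mathrm{opp}} = j \circ \lambda \circ j^{-1}$. We proceed analogously for $M_-^{\mathrm{opp}}$, the opposite algebra of $M_-$.

Proposition 5.6. There exists a (type III) factor B such that we have irreducible inclusions

\[ N \otimes N^{\mathrm{opp}} \subset M_+ \otimes M_-^{\mathrm{opp}} \subset B \tag{19} \]

with the following properties:

1. The dual canonical endomorphism $\Theta_{\mathrm{ext}}$ of the inclusion $M_+ \otimes M_-^{\mathrm{opp}} \subset B$ decomposes as

\[ [\Theta_{\mathrm{ext}}] = \bigoplus_{\tau \in {}_M\mathcal{X}_M^0} [\vartheta_+(\tau) \otimes \vartheta_-(\tau)^{\mathrm{opp}}] . \tag{20} \]

2. The dual canonical endomorphism $\Theta$ of the inclusion $N \otimes N^{\mathrm{opp}} \subset B$ decomposes as

\[ [\Theta] = \bigoplus_{\lambda,\mu \in {}_N\mathcal{X}_N} Z_{\lambda,\mu} [\lambda \otimes \mu^{\mathrm{opp}}] . \tag{21} \]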


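3. If $\tau_\pm, \sigma_\pm \in \mathrm{End}(M)$ are the extensions of $\tilde\tau_\pm, \tilde\sigma_\pm \in {}_{M_\pm}\mathcal{X}_{M_\pm}^0$, respectively, according to Lemma 5.2, then

\[ y\,\sigma_+(x)\,\varepsilon_r(\tilde\tau_+,\tilde\sigma_+) = \varepsilon_r(\tilde\tau_-,\tilde\sigma_-)\,\tau_-(y)\,x \tag{22} \]

holds whenever $x \in \mathrm{Hom}(\tau_+ , \tau_-)$ and $y \in \mathrm{Hom}(\sigma_+ , \sigma_-)$.

Proof. Lemma 5.2 (together with Lemma 5.4) tells us that $\tilde\tau_\pm \in {}_{M_\pm}\mathcal{X}_{M_\pm}^0$ can be extended to $\tau_\pm \in \mathrm{End}(M)$ such that $\tau_+$ and $\tau_-$ are equivalent to some morphisms in ${}_M\mathcal{X}_M^0$, and we have $[\tau_+] = [\tau_-]$ if and only if $\tilde\tau_\pm = \vartheta_\pm(\tau)$ for a $\tau \in {}_M\mathcal{X}_M^0$. Using these extensions for subfactors $M_\pm \subset M$ with systems ${}_{M_\pm}\mathcal{X}_{M_\pm}^0$, then [35] determines a factor B such that $M_+ \otimes M_-^{\mathrm{opp}} \subset B$ is an irreducible subfactor with its dual canonical endomorphism $\Theta_{\mathrm{ext}}$ decomposing as

\[ [\Theta_{\mathrm{ext}}] = \bigoplus_{\tilde\tau_+ \in {}_{M_+}\mathcal{X}_{M_+}^0} \ \bigoplus_{\tilde\tau_- \in {}_{M_-}\mathcal{X}_{M_-}^0} \langle \tau_+ , \tau_- \rangle\, [\tilde\tau_+ \otimes \tilde\tau_-^{\mathrm{opp}}] = \bigoplus_{\tau \in {}_M\mathcal{X}_M^0} [\vartheta_+(\tau) \otimes \vartheta_-(\tau)^{\mathrm{opp}}] , \]

proving 1. Now note that the injection map for $N \otimes N^{\mathrm{opp}} \hookrightarrow M_+ \otimes M_-^{\mathrm{opp}}$ is given by $\iota_+ \otimes \iota_-^{\mathrm{opp}}$. Therefore the dual canonical endomorphism for $N \otimes N^{\mathrm{opp}} \subset B$ is obtained as $\Theta = (\bar\iota_+ \otimes \bar\iota_-^{\mathrm{opp}}) \circ \Theta_{\mathrm{ext}} \circ (\iota_+ \otimes \iota_-^{\mathrm{opp}})$ so that

\[ [\Theta] = \bigoplus_{\tau \in {}_M\mathcal{X}_M^0} [(\bar\iota_+ \circ \vartheta_+(\tau) \circ \iota_+) \otimes (\bar\iota_- \circ \vartheta_-(\tau) \circ \iota_-)^{\mathrm{opp}}] . \]

Now

\[ [\bar\iota_\pm \circ \vartheta_\pm(\tau) \circ \iota_\pm] = \bigoplus_{\lambda \in {}_N\mathcal{X}_N} \langle \bar\iota_\pm \circ \vartheta_\pm(\tau) \circ \iota_\pm , \lambda \rangle\, [\lambda] , \]

and since the subfactors $N \subset M_\pm$ satisfy chiral locality one has

\[ \langle \bar\iota_\pm \circ \vartheta_\pm(\tau) \circ \iota_\pm , \lambda \rangle = \langle \vartheta_\pm(\tau) , \tilde\alpha^\pm_{\pm;\lambda} \rangle = \langle \tau , \alpha_\lambda^\pm \rangle = b^\pm_{\tau,\lambda} \]

by virtue of [5, Prop. 3.3] and Lemma 5.3. Hence

\[ [\Theta] = \bigoplus_{\lambda,\mu \in {}_N\mathcal{X}_N} \ \bigoplus_{\tau \in {}_M\mathcal{X}_M^0} b^+_{\tau,\lambda}\, b^-_{\tau,\mu}\, [\lambda \otimes \mu^{\mathrm{opp}}] = \bigoplus_{\lambda,\mu \in {}_N\mathcal{X}_N} Z_{\lambda,\mu}\, [\lambda \otimes \mu^{\mathrm{opp}}] , \]

proving 2. Finally, if $\tau_\pm, \sigma_\pm \in \mathrm{End}(M)$ denote the extensions of $\tilde\tau_\pm, \tilde\sigma_\pm \in {}_{M_\pm}\mathcal{X}_{M_\pm}^0$, respectively, as in Lemma 5.2, then there are some $\lambda_+, \lambda_-, \mu_+, \mu_- \in {}_N\mathcal{X}_N$ and isometries $\tilde t_\pm \in \mathrm{Hom}(\tilde\tau_\pm , \tilde\alpha^\pm_{\pm;\lambda_\pm})$ and $\tilde s_\pm \in \mathrm{Hom}(\tilde\sigma_\pm , \tilde\alpha^\pm_{\pm;\mu_\pm})$ so that $\tau_\pm(m) = \tilde t_\pm^* \alpha^\pm_{\lambda_\pm}(m) \tilde t_\pm$ and $\sigma_\pm(m) = \tilde s_\pm^* \alpha^\pm_{\mu_\pm}(m) \tilde s_\pm$ for all $m \in M$. Note that also $\tilde t_\pm \in \mathrm{Hom}(\tilde\tau_\pm , \tilde\alpha^\mp_{\pm;\lambda_\pm})$ and $\tilde s_\pm \in \mathrm{Hom}(\tilde\sigma_\pm , \tilde\alpha^\mp_{\pm;\mu_\pm})$ because chiral locality holds for $N \subset M_\pm$ and then ambichiral morphisms are obtained from $\alpha^+$- and $\alpha^-$-induction by use of the same isometries [5, Sect. 3]. Hence we have

\[
\begin{aligned}
\varepsilon_r(\tilde\tau_\pm,\tilde\sigma_\pm)
&= \tilde s_\pm^*\, \tilde\alpha^-_{\pm;\mu_\pm}(\tilde t_\pm)^*\, \varepsilon^+(\lambda_\pm,\mu_\pm)\, \tilde\alpha^+_{\pm;\lambda_\pm}(\tilde s_\pm)\, \tilde t_\pm \\
&= \tilde s_\pm^*\, \alpha^-_{\mu_\pm}(\tilde t_\pm)^*\, \varepsilon^+(\lambda_\pm,\mu_\pm)\, \alpha^+_{\lambda_\pm}(\tilde s_\pm)\, \tilde t_\pm ,
\end{aligned}
\]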


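where we also used Corollary 4.5. We now can compute

\[
\begin{aligned}
y\,\sigma_+(x)\,\varepsilon_r(\tilde\tau_+,\tilde\sigma_+)
&= y\,\sigma_+(x)\,\tilde s_+^*\,\alpha^-_{\mu_+}(\tilde t_+)^*\,\varepsilon^+(\lambda_+,\mu_+)\,\alpha^+_{\lambda_+}(\tilde s_+)\,\tilde t_+ \\
&= y\,\sigma_+(x\tilde t_+^*)\,\tilde s_+^*\,\varepsilon^+(\lambda_+,\mu_+)\,\alpha^+_{\lambda_+}(\tilde s_+)\,\tilde t_+ \\
&= \sigma_-(x\tilde t_+^*)\,\tilde s_-^*\,\tilde s_-\,y\,\tilde s_+^*\,\varepsilon^+(\lambda_+,\mu_+)\,\alpha^+_{\lambda_+}(\tilde s_+)\,\tilde t_+ \\
&= \tilde s_-^*\,\alpha^-_{\mu_-}(\tilde t_-^*\tilde t_-\,x\,\tilde t_+^*)\,\varepsilon^+(\lambda_+,\mu_-)\,\alpha^+_{\lambda_+}(\tilde s_-\,y)\,\tilde t_+ \\
&= \tilde s_-^*\,\alpha^-_{\mu_-}(\tilde t_-)^*\,\varepsilon^+(\lambda_-,\mu_-)\,\tilde t_-\,x\,\tilde t_+^*\,\alpha^+_{\lambda_+}(\tilde s_-)\,\tilde t_+\,\tau_+(y) \\
&= \tilde s_-^*\,\alpha^-_{\mu_-}(\tilde t_-)^*\,\varepsilon^+(\lambda_-,\mu_-)\,\alpha^-_{\lambda_-}(\tilde s_-)\,\tilde t_-\,x\,\tau_+(y)
 = \varepsilon_r(\tilde\tau_-,\tilde\sigma_-)\,\tau_-(y)\,x ,
\end{aligned}
\]

where we used Eq. (9) twice, proving 3.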

The relevance of Proposition 5.6 is the following. Suppose that our factor $N$ is obtained as a local factor $N = N(I_\circ)$ of a quantum field theoretical net of factors $\{N(I)\}$ indexed by proper intervals $I \subset \mathbb{R}$ on the real line, and that the system ${}_N\mathcal{X}_N$ is obtained as restrictions of DHR-morphisms (cf. [21]) to $N$. This is in fact the case in our RCFT examples arising from current algebras where the net is defined in terms of local loop groups in the vacuum representation. Taking two copies of such a net and placing the real axes on the light cone defines a local net $\{A(O)\}$, indexed by double cones $O$ on two-dimensional Minkowski space (cf. [34] for such constructions). Given a subfactor $N \subset M$, determining in turn two subfactors $N \subset M_\pm$ obeying chiral locality, will provide two local nets of subfactors $\{N(I) \subset M_\pm(I)\}$ due to [28]. Arranging $M_+(I)$ and $M_-(J)$ on the two light cone axes defines a local net of subfactors $\{A(O) \subset A_{\mathrm{ext}}(O)\}$ in Minkowski space. The embedding $M_+ \otimes M_-^{\mathrm{opp}} \subset B$ gives rise to another net of subfactors $\{A_{\mathrm{ext}}(O) \subset B(O)\}$, and Eq. (22) ensures that the net $\{B(O)\}$ satisfies locality, due to Rehren's recent result [35]. As already shown in [35], there exists a local two-dimensional quantum field theory such that the coupling matrix $Z$ describes its restriction to the tensor products of its chiral building blocks $N(I)$, and this is here expressed in Eq. (21). Now Eq. (20) tells us that there are chiral extensions $N(I) \subset M_+(I)$ and $N(I) \subset M_-(I)$ for left and right chiral nets which are indeed maximal as the coupling matrix for $\{A_{\mathrm{ext}}(O) \subset B(O)\}$ is a bijection. This shows that the inclusions $N \subset M_\pm$ should in fact be regarded as the subfactor version of left- and right maximal extensions of the chiral algebra.

6. Extended S- and T-Matrices

Using the braiding arising from the relative braiding one can define the statistics phase $\omega_\tau$ of $\tau \in {}_M\mathcal{X}_M^0$ by $d_\tau \phi_\tau(\varepsilon_r(\tau,\tau)) = \omega_\tau \mathbf{1}$.

Lemma 6.1. Let $\tau \in {}_M\mathcal{X}_M^0$ such that $[\tau]$ is a subsector of $[\alpha_\lambda^+]$ and $[\alpha_\mu^-]$ for some $\lambda, \mu \in {}_N\mathcal{X}_N$. Then we have $\omega_\lambda = \omega_\tau = \omega_\mu$.

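Proof. Let $t \in \mathrm{Hom}(\tau, \alpha_\lambda^+)$ and $s \in \mathrm{Hom}(\tau, \alpha_\mu^-)$ be isometries. Then

\[
\begin{aligned}
\omega_\tau \mathbf{1} = d_\tau \phi_\tau(\varepsilon_r(\tau,\tau))
&= d_\tau \phi_\tau\bigl(s^* \alpha_\mu^-(t)^* \varepsilon^+(\lambda,\mu) \alpha_\lambda^+(s) t\bigr) \\
&= d_\tau \phi_\tau\bigl(\tau(t)^* s^* \varepsilon^+(\lambda,\mu) t \tau(s)\bigr)
 = d_\tau t^* \phi_\tau\bigl(s^* \varepsilon^+(\lambda,\mu) t\bigr) s \\
&= d_\lambda t^* \phi_{\alpha_\lambda^+}\bigl(t s^* \varepsilon^+(\lambda,\mu)\bigr) s
 = d_\lambda t^* \phi_{\alpha_\lambda^+}\bigl(\varepsilon^+(\lambda,\lambda) \alpha_\lambda^+(t s^*)\bigr) s \\
&= d_\lambda t^* \phi_{\alpha_\lambda^+}\bigl(\varepsilon^+(\lambda,\lambda)\bigr) t s^* s,
\end{aligned}
\]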


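where we used Eq. (7), Eq. (8), and since $t s^* \in \mathrm{Hom}(\iota\mu, \iota\lambda)$ we could also apply Eq. (9). Note that $\phi_{\alpha_\lambda^+}$ can be given as $\phi_{\alpha_\lambda^+}(m) = r_\lambda^* \alpha_\lambda^+(m) r_\lambda$ for all $m \in M$, so that in particular $\phi_{\alpha_\lambda^+}(n) = \phi_\lambda(n)$ for $n \in N$. Hence $\omega_\tau = \omega_\lambda$. We can compute similarly

\[
d_\tau t^* \phi_\tau\bigl(s^* \varepsilon^+(\lambda,\mu) t\bigr) s
= d_\mu t^* \phi_{\alpha_\mu^-}\bigl(\varepsilon^+(\lambda,\mu) t s^*\bigr) s
= d_\mu t^* \phi_{\alpha_\mu^-}\bigl(\alpha_\mu^-(t s^*) \varepsilon^+(\mu,\mu)\bigr) s,
\]

establishing $\omega_\tau = \omega_\mu$.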

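Note that with the expansion Eq. (6), Lemma 6.1 implies easily $\omega_\lambda Z_{\lambda,\mu} = Z_{\lambda,\mu} \omega_\mu$, i.e. it gives a new and simple proof of the commutativity of the matrices $\Omega$ and $Z$ which was first established in [6, Thm. 5.7].

Lemma 6.2. For $\beta \in {}_M\mathcal{X}_M$ we have

\[
\sum_{\lambda \in {}_N\mathcal{X}_N} \omega_\lambda d_\lambda \langle \beta, \alpha_\lambda^\pm \rangle =
\begin{cases}
\dfrac{w}{w_+}\, \omega_\tau d_\tau & : \ \beta = \tau \in {}_M\mathcal{X}_M^0, \\[1ex]
0 & : \ \text{otherwise.}
\end{cases}
\tag{23}
\]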

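Proof. All we need to show is that the left hand side of Eq. (23) vanishes whenever $\beta \notin {}_M\mathcal{X}_M^0$ because we recall once more from (the proof of) [7, Prop. 3.1] that we have $d_\tau w/w_+ = \sum_\lambda d_\lambda b^\pm_{\tau,\lambda}$ for any $\tau \in {}_M\mathcal{X}_M^0$, and then the claim follows from Lemma 6.1. For this purpose we define vectors $u^\pm$ with entries

\[
u^\pm_\beta = \sum_{\lambda \in {}_N\mathcal{X}_N} \omega_\lambda d_\lambda \langle \beta, \alpha_\lambda^\pm \rangle, \qquad \beta \in {}_M\mathcal{X}_M .
\]

We clearly have

\[
\begin{aligned}
\|u^+\|^2 &= \sum_\beta \sum_{\lambda,\nu} \omega_\lambda \omega_\nu^{-1} d_\lambda d_\nu \langle \alpha_\nu^+, \beta \rangle \langle \beta, \alpha_\lambda^+ \rangle
 = \sum_{\lambda,\nu} \omega_\lambda \omega_\nu^{-1} d_\lambda d_\nu \langle \alpha_\lambda^+ \alpha_\nu^+, \mathrm{id} \rangle \\
&= \sum_{\lambda,\mu,\nu} \omega_\lambda \omega_\nu^{-1} d_\lambda d_\nu N_{\lambda,\nu}^{\mu} \langle \alpha_\mu^+, \mathrm{id} \rangle
 = \sum_{\lambda,\mu,\nu} \omega_\lambda \omega_\nu^{-1} d_\lambda d_\nu N_{\lambda,\mu}^{\nu}\, \omega_\mu Z_{\mu,0} \\
&= \sum_{\lambda,\mu} Y_{0,\lambda} Y_{\lambda,\mu} Z_{\mu,0},
\end{aligned}
\]

where we used that $Z_{\mu,0} = \omega_\mu Z_{\mu,0}$ by Lemma 6.1. Similarly we obtain $\|u^-\|^2 = \sum_{\lambda,\mu} Y_{0,\lambda} Y_{\lambda,\mu} Z_{0,\mu}$. On the other hand, we can compute the inner product

\[
\begin{aligned}
\langle u^+, u^- \rangle &= \sum_{\lambda,\mu} \sum_\beta \omega_\lambda^{-1} d_\lambda d_\mu \omega_\mu \langle \alpha_\lambda^+, \beta \rangle \langle \beta, \alpha_\mu^- \rangle
 = \sum_{\lambda,\mu} \omega_\lambda^{-1} d_\lambda d_\mu \omega_\mu Z_{\lambda,\mu} \\
&= \sum_{\lambda,\mu} d_\lambda Z_{\lambda,\mu} d_\mu
 = \sum_{\lambda,\mu} Y_{0,\lambda} Z_{\lambda,\mu} Y_{\mu,0}
 = \sum_{\lambda,\mu} Y_{0,\lambda} Y_{\lambda,\mu} Z_{\mu,0} = \|u^+\|^2,
\end{aligned}
\]

where we used the commutativity of $\Omega$ and $Y$ with $Z$ of [6, Thm. 5.7]. Commuting $Z$ in the fifth equality to the left rather than to the right gives $\langle u^+, u^- \rangle = \|u^-\|^2$. Thus we conclude $u^+ = u^-$. Since obviously $u^\pm_\beta = 0$ whenever $\beta \notin {}_M\mathcal{X}_M^\pm$, this implies $u^\pm_\beta = 0$ whenever $\beta \notin {}_M\mathcal{X}_M^0$.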


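Let $Y^{\mathrm{ext}}$ and $\Omega^{\mathrm{ext}}$ denote the Y- and Ω-matrices associated to the braided system ${}_M\mathcal{X}_M^0$.

Lemma 6.3. We have

\[
\frac{w}{w_+} \sum_{\tau' \in {}_M\mathcal{X}_M^0} Y^{\mathrm{ext}}_{\tau,\tau'}\, b^\pm_{\tau',\mu} = \sum_{\lambda \in {}_N\mathcal{X}_N} b^\pm_{\tau,\lambda} Y_{\lambda,\mu},
\qquad
\sum_{\tau' \in {}_M\mathcal{X}_M^0} \Omega^{\mathrm{ext}}_{\tau,\tau'}\, b^\pm_{\tau',\mu} = \sum_{\lambda \in {}_N\mathcal{X}_N} b^\pm_{\tau,\lambda} \Omega_{\lambda,\mu}
\tag{24}
\]

for all $\tau \in {}_M\mathcal{X}_M^0$ and $\mu \in {}_N\mathcal{X}_N$.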

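Proof. Note that the second relation in Eq. (24) is nothing but Lemma 6.1, as this is $\omega_\tau b^\pm_{\tau,\lambda} = b^\pm_{\tau,\lambda} \omega_\lambda$. So we just need to verify the first relation in Eq. (24). We now just compute

\[
\begin{aligned}
\sum_\lambda b^\pm_{\tau,\lambda} Y_{\lambda,\mu}
&= \sum_{\lambda,\rho} \frac{\omega_\lambda \omega_\mu}{\omega_\rho} N_{\lambda,\mu}^{\rho} d_\rho\, b^\pm_{\tau,\lambda}
 = \sum_{\lambda,\rho} \frac{\omega_\tau \omega_\mu}{\omega_\rho} N_{\rho,\mu}^{\lambda} d_\rho \langle \alpha_\lambda^\pm, \tau \rangle \\
&= \sum_\rho \frac{\omega_\tau \omega_\mu}{\omega_\rho} d_\rho \langle \alpha_\rho^\pm, \tau \alpha_\mu^\pm \rangle
 = \sum_\rho \sum_\beta \frac{\omega_\tau \omega_\mu}{\omega_\rho} d_\rho \langle \alpha_\rho^\pm, \beta \rangle \langle \beta, \tau \alpha_\mu^\pm \rangle \\
&= \frac{w}{w_+} \sum_{\tau'} \frac{\omega_\tau \omega_\mu}{\omega_{\tau'}} d_{\tau'} \langle \tau', \tau \alpha_\mu^\pm \rangle
 = \frac{w}{w_+} \sum_{\tau',\tau''} \frac{\omega_\tau \omega_\mu}{\omega_{\tau'}} N_{\tau,\tau''}^{\tau'} d_{\tau'} \langle \tau'', \alpha_\mu^\pm \rangle \\
&= \frac{w}{w_+} \sum_{\tau',\tau''} \frac{\omega_\tau \omega_{\tau''}}{\omega_{\tau'}} N_{\tau,\tau''}^{\tau'} d_{\tau'}\, b^\pm_{\tau'',\mu}
 = \frac{w}{w_+} \sum_{\tau''} Y^{\mathrm{ext}}_{\tau,\tau''}\, b^\pm_{\tau'',\mu},
\end{aligned}
\]

where we used (the complex conjugate of) Lemma 6.2 in the fifth equality.

Recall from Sect. 2 the complex number $z = \sum_{\lambda \in {}_N\mathcal{X}_N} d_\lambda^2 \omega_\lambda$. Analogously we put $z_0 = \sum_{\tau \in {}_M\mathcal{X}_M^0} d_\tau^2 \omega_\tau$.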

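Lemma 6.4. We have

\[
z_0 = \frac{w_+}{w}\, z .
\]

Proof. Using Lemma 6.2 we can compute

\[
z_0 = \sum_{\tau \in {}_M\mathcal{X}_M^0} \omega_\tau d_\tau^2
= \frac{w_+}{w} \sum_{\beta \in {}_M\mathcal{X}_M} \sum_{\lambda \in {}_N\mathcal{X}_N} \omega_\lambda d_\lambda \langle \beta, \alpha_\lambda^\pm \rangle d_\beta
= \frac{w_+}{w} \sum_{\lambda \in {}_N\mathcal{X}_N} \omega_\lambda d_\lambda^2
= \frac{w_+}{w}\, z,
\]

where we used $\sum_{\beta \in {}_M\mathcal{X}_M} \langle \beta, \alpha_\lambda^\pm \rangle d_\beta = d_\lambda$.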
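To make the quantities $z$ and $Y$ concrete, the following minimal sketch evaluates them for the four sectors of the SO(16ℓ) level 1 models treated in Sect. 7, using only the statistics phases $\omega_v = -1$, $\omega_s = \omega_c = 1$ and the $\mathbb{Z}_2 \times \mathbb{Z}_2$ fusion rules stated there. The expression for $Y$ is the one used in the proof of Lemma 6.3. The normalisations $S = Y/|z|$ and $c = 4\arg(z)/\pi$ (modulo 8) are assumptions taken over from the conventions of Sect. 2 and are not restated in this section; the helper names (`fuse`, `y_matrix`) are purely illustrative.

```python
import cmath
import math

SECTORS = ["0", "v", "s", "c"]
OMEGA = {"0": 1, "v": -1, "s": 1, "c": 1}  # statistics phases from Sect. 7
DIM = {a: 1 for a in SECTORS}              # Z2 x Z2 fusion rules: all d = 1


def fuse(a, b):
    """Product in Z2 x Z2 = {0, v, s, c}: x*x = 0 and v*s = c (cyclically)."""
    if a == "0":
        return b
    if b == "0":
        return a
    if a == b:
        return "0"
    return next(x for x in "vsc" if x not in (a, b))


def y_matrix():
    """Y_{l,m} = sum_n (w_l w_m / w_n) N_{l,m}^n d_n; only n = l*m contributes."""
    rows = []
    for a in SECTORS:
        row = []
        for b in SECTORS:
            n = fuse(a, b)
            # 1/omega_n equals omega_n here because every phase is +1 or -1.
            row.append(OMEGA[a] * OMEGA[b] * OMEGA[n] * DIM[n])
        rows.append(row)
    return rows


z = sum(DIM[a] ** 2 * OMEGA[a] for a in SECTORS)
Y = y_matrix()
# Assumed normalisations (not restated in this section):
# S = Y / |z| and c = 4 arg(z) / pi, defined modulo 8.
S = [[y / abs(z) for y in row] for row in Y]
central_charge_mod_8 = (4 * cmath.phase(complex(z)) / math.pi) % 8
```

Under these assumptions the sketch gives $z = 2$ and a matrix $S$ that agrees with Eq. (27), while the central charge vanishes modulo 8, in agreement with the prefactor $e^{-2\pi i \ell/3}$ of $T$ in Eq. (27), which corresponds to $c = 8\ell$.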

Assume that $z \neq 0$ so that the central charge can be defined. Since $w/w_+$ is a real number, Lemma 6.4 tells us that the central charges $c$ and $c_{\mathrm{ext}}$ of the braided systems ${}_N\mathcal{X}_N$ and ${}_M\mathcal{X}_M^0$, respectively, which are defined modulo 8, coincide. As a corollary of Lemma 6.3, the intertwining properties of chiral branching coefficients between original and extended S- and T-matrices are therefore clarified in the following

Theorem 6.5. Provided $z \neq 0$ one has

\[
\sum_{\tau' \in {}_M\mathcal{X}_M^0} S^{\mathrm{ext}}_{\tau,\tau'}\, b^\pm_{\tau',\mu} = \sum_{\lambda \in {}_N\mathcal{X}_N} b^\pm_{\tau,\lambda} S_{\lambda,\mu},
\qquad
\sum_{\tau' \in {}_M\mathcal{X}_M^0} T^{\mathrm{ext}}_{\tau,\tau'}\, b^\pm_{\tau',\mu} = \sum_{\lambda \in {}_N\mathcal{X}_N} b^\pm_{\tau,\lambda} T_{\lambda,\mu}
\tag{25}
\]

for all $\tau \in {}_M\mathcal{X}_M^0$ and $\mu \in {}_N\mathcal{X}_N$.

We would like to remind the reader that, if the braiding on ${}_N\mathcal{X}_N$ is non-degenerate, then so is the ambichiral braiding [7, Thm. 4.2]. In other words, whenever the original S- and T-matrices are modular then so are the extended S- and T-matrices. Now let us return to our situation $N \subset M_\pm \subset M$ and apply the above results also to the subfactors $N \subset M_\pm$. Let $Y^{\mathrm{ext};\pm}$ and $\Omega^{\mathrm{ext};\pm}$ denote the Y- and Ω-matrices associated to the braided systems ${}_{M_\pm}\mathcal{X}_{M_\pm}^0$. Recalling now $Z^+_{\lambda,0} = Z_{\lambda,0}$ and $Z^-_{0,\lambda} = Z_{0,\lambda}$, we obtain from Lemmata 5.3, 5.4, 6.1 and 6.4 the following

Theorem 6.6. The matrices $Y^{\mathrm{ext}}$, $\Omega^{\mathrm{ext}}$, and $Y^{\mathrm{ext};\pm}$, $\Omega^{\mathrm{ext};\pm}$ coincide subject to the bijections $\vartheta_\pm$. If $z \neq 0$, then so do the corresponding S- and T-matrices which are then well-defined. In formulae,

\[
S^{\mathrm{ext};\pm}_{\tau,\tau'} = S^{\mathrm{ext}}_{\vartheta_\pm(\tau),\vartheta_\pm(\tau')},
\qquad
T^{\mathrm{ext};\pm}_{\tau,\tau'} = T^{\mathrm{ext}}_{\vartheta_\pm(\tau),\vartheta_\pm(\tau')},
\tag{26}
\]

for all $\tau, \tau' \in {}_M\mathcal{X}_M^0$.

We remark that in the case that $M_+ = M_-$ one obtains by use of the properties of the relative braiding operators [5, Lemma 3.11] and from Corollary 4.5, that the ambichiral braiding operators are the same for ${}_M\mathcal{X}_M^0$ and ${}_{M_\pm}\mathcal{X}_{M_\pm}^0$, subject to the bijections $\vartheta_\pm$, so that Theorem 6.6 is trivial in this case.

7. Heterotic Examples

We consider the SO(n) current algebra models at level 1, where $n$ is a multiple of 16, $n = 16\ell$, $\ell = 1, 2, 3, \ldots$. These theories have four sectors, the basic (0), vector (v), spinor (s) and conjugate spinor (c) module, corresponding to highest weights $0$, $\Lambda_{(1)}$, $\Lambda_{(r-1)}$ and $\Lambda_{(r)}$, respectively; here $r = n/2 = 8\ell$ is the rank of SO(n). The conformal dimensions are given as $h_0 = 0$, $h_v = 1/2$, $h_s = h_c = \ell$, and the sectors obey $\mathbb{Z}_2 \times \mathbb{Z}_2$ fusion rules. The Kac–Peterson matrices are given explicitly as

\[
S = \frac{1}{2}
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & -1 & -1 \\
1 & -1 & 1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix},
\qquad
T = e^{-2\pi i \ell/3}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\tag{27}
\]

It is easy to check that there are exactly six modular invariants, $Z = \mathbf{1}, W, X_s, X_c, Q, {}^tQ$. Here

\[
W =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix},
\qquad
X_s =
\begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix},
\qquad
Q =
\begin{pmatrix}
1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{pmatrix},
\]

and $X_c = W X_s W$. (Note that $Q = X_s W$ and ${}^tQ = W X_s$.) The matrix $Q$ and its transpose ${}^tQ$ are two examples for modular invariants with non-symmetric vacuum


coupling. Such "heterotic" invariants seem to be extremely rare and have not enjoyed particular attention in the literature, perhaps because they were erroneously dismissed as being spurious in the sense that they would not correspond to a physical partition function. Examples for truly spurious modular invariants were given in [37, 39, 18] and found to be "coincidental" linear combinations of proper physical invariants. Note that although there is a linear dependence here, namely $\mathbf{1} - W - X_s - X_c + Q + {}^tQ = 0$, we cannot express $Q$ (or ${}^tQ$) alone as a linear combination of the four symmetric invariants. This may serve as a first indication that $Q$ and ${}^tQ$ are not spurious. We will now demonstrate that they can be realized from subfactors.

The $\mathbb{Z}_2 \times \mathbb{Z}_2$ fusion rules for these models were proven in the DHR framework in [2], and together with the conformal spin and statistics theorem [16, 15, 20] we conclude that there is a net of type III factors on $S^1$ with a system $\{\mathrm{id}, \rho_v, \rho_s, \rho_c\}$ of localized and transportable, hence braided endomorphisms, such that the statistics S- and T-matrices are given by Eq. (27). Because the statistics phases are given as $\omega_v = -1$ and $\omega_s = \omega_c = 1$, we can assume that the morphisms in the system obey the $\mathbb{Z}_2 \times \mathbb{Z}_2$ fusion rules even by individual multiplication,

\[
\rho_v^2 = \rho_s^2 = \rho_c^2 = \mathrm{id}, \qquad \rho_v \rho_s = \rho_s \rho_v = \rho_c,
\]

thanks to [33, Lemma 4.4]. This is enough to proceed with the DHR construction of the field net [13], as already carried out similarly for simple current extensions with cyclic groups [4, 5]. In fact, all we need to do here is to pick a single local factor $N = N(I)$ such that the interval $I \subset S^1$ contains the localization region of the morphisms, and then we construct the cross product subfactor $N \subset N \rtimes (\mathbb{Z}_2 \times \mathbb{Z}_2)$. Then the corresponding dual canonical endomorphism $\theta$ decomposes as a sector as $[\theta] = [\mathrm{id}] \oplus [\rho_v] \oplus [\rho_s] \oplus [\rho_c]$. Checking $\langle \iota\lambda, \iota\mu \rangle = \langle \theta\lambda, \mu \rangle = 1$ for $\lambda, \mu = \mathrm{id}, \rho_v, \rho_s, \rho_c$, we find that there is only a single M-N sector, namely $[\iota]$. By [6, Cor. 6.13] we conclude that the modular invariant coupling matrix $Z$ arising from this subfactor must fulfill $\mathrm{tr} Z = 1$. This leaves only the possibility that $Z$ is $Q$ or ${}^tQ$. We may and do assume that $Z = Q$, otherwise we exchange braiding and opposite braiding. It is easy to determine the intermediate subfactors $N \subset M_\pm \subset M$. Namely, we have $M_+ = N \rtimes_{\rho_s} \mathbb{Z}_2$ and $M_- = N \rtimes_{\rho_c} \mathbb{Z}_2$ with dual canonical endomorphism sectors $[\theta_+] = [\mathrm{id}] \oplus [\rho_s]$ and $[\theta_-] = [\mathrm{id}] \oplus [\rho_c]$, respectively. That both extensions are local can also be checked from $\omega_s = \omega_c = 1$. We therefore find $Z^+ = X_s$ and $Z^- = X_c$. Finally, the permutation invariant $W$ is obtained from the non-local extension $M_v = N \rtimes_{\rho_v} \mathbb{Z}_2$.

8. Conclusions

We studied the structure of coupling matrices $Z$ arising from braided subfactors $N \subset M$ through intermediate subfactors $N \subset M_\pm \subset M$ which in turn determine type I "parent" coupling matrices $Z^\pm$. We demonstrated that the inclusions $N \subset M_+$ and $N \subset M_-$ should be recognized as the subfactor version of left and right maximal chiral algebra extensions. The main application we have in mind is RCFT where the S- and T-matrices arising from the braiding are modular. For current algebra models based on Lie groups SU(n) or others, the coupling matrices from subfactors are then modular invariants for


their Kac–Peterson matrices. Most but not all of the known modular invariant coupling matrices of such models are either type I, Eq. (3), or type II, Eq. (4), and the type II invariants have a unique type I parent. For example, the parents of the SU(2) type II modular invariants $D_{2\ell+1}$ ($\ell = 2, 3, \ldots$) and $E_7$ are $A_{4\ell-1}$ and $D_{10}$, respectively. For such invariants the extended left and right chiral algebras are the same. (In the $D_{\mathrm{odd}}$ examples the extended algebras are the original, identical left and right current algebras.) In fact, the $E_7$ modular invariant has been constructed from a subfactor with $[\theta] = [\lambda_0] \oplus [\lambda_8] \oplus [\lambda_{16}]$ in [7], and here we obtain $M_+ = M_-$ with $[\theta_\pm] = [\lambda_0] \oplus [\lambda_{16}]$ which produces the simple current extension $D_{10}$ invariant. For the cases $D_{\mathrm{even}}$, $E_6$ and $E_8$ treated in [4, 5], where $N \subset M$ is subject to chiral locality from the beginning, we clearly find $M = M_+ = M_-$ which indeed are the local factors of the local chiral extensions considered e.g. in [36]. (In fact all invariants obtained from subfactors obeying chiral locality are clearly their own parents due to Proposition 3.2.) We showed that $Z$ is in fact type II, Eq. (4), whenever the extensions coincide, $M_+ = M_-$.

It is interesting that all our results could be derived without assuming the nondegeneracy of the braiding, i.e. all our statements are true even if the modular group is not around. We similarly derived in [7] without such condition that trivial vacuum coupling, $Z_{\lambda,0} = \delta_{\lambda,0}$, is equivalent to $Z$ being a fusion rule automorphism (and to $Z_{0,\lambda} = \delta_{\lambda,0}$), thus recovering a result previously encountered in RCFT [12, 29]. In this paper we started with a braided subfactor producing some coupling matrix with possibly non-trivial vacuum coupling, and our results show that the "extended" coupling matrix, Eq. (18), is a bijection ${}_{M_+}\mathcal{X}_{M_+}^0 \to {}_{M_-}\mathcal{X}_{M_-}^0$ which yields an isomorphism of the fusion rules of the ambichiral systems.
Moreover, the corresponding extended S- and T-matrices coincide subject to this isomorphism (Theorem 6.6). In the (modular) RCFT case, they are recognized as the S- and T-matrices of the extended left and right chiral algebra, and therefore Theorem 6.6 provides in particular a subfactor version of [29, Eq. (4.5)]. But note that the derivations of the fusion rule automorphism in [12, 29] in turn rely on the Verlinde formula [38] whereas our derivation holds even if the braiding is degenerate, i.e. even if the Verlinde formula does not hold. Our result comes in the same spirit as [34] where the embedding of left and right chiral observables in a 2D conformal quantum field theory is analyzed and the corresponding coupling matrix is shown to describe an automorphism of fusion rules if and only if the chiral observables are maximal. Namely, the result of [34] is derived under very general assumptions in the framework of local quantum physics, and it is in particular entirely independent of the SL(2; Z) machinery heavily exploited in [12, 29].

Note that "almost all" known modular invariants satisfy $Z_{\lambda,0} = Z_{0,\lambda}$. This means $[\theta_+] = [\theta_-]$, and this comes close to $M_+ = M_-$. In particular, if $\langle \theta, \lambda \rangle \le 1$, then this forces $K_\lambda^+ = K_\lambda^-$ so that we necessarily have $M_+ = M_-$. Similarly a totally degenerate braiding gives rise to $M_+ = M_-$. Nevertheless our example in Sect. 7 has shown that $M_+ \neq M_-$ is possible, and that even $Z^+ \neq Z^-$ can occur. The significance of different left and right chiral extensions reflected in the possibility $M_+ \neq M_-$ and even in different parent coupling matrices may come as a little surprise. For example, the related "heterotic" extensions of current algebra models are not particularly well studied objects.
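The finite computations behind the heterotic example of Sect. 7 can be checked mechanically. The sketch below is only an illustration (the function names are ours, and sectors are ordered as 0, v, s, c): it verifies that $\mathbf{1}, W, X_s, X_c, Q, {}^tQ$ commute with $S$ and $T$ of Eq. (27), that these are the only non-negative integer solutions with $Z_{0,0} = 1$, that they satisfy the stated linear dependence, and that only $Q$ and ${}^tQ$ have trace 1. The search uses the observation that $(SZS)_{0,0} = Z_{0,0} = 1$ together with $S_{0,\lambda} = 1/2$ forces the entries of $Z$ to sum to 4.

```python
from itertools import combinations_with_replacement

N = 4  # sectors ordered as 0, v, s, c


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [list(col) for col in zip(*a)]


# 2*S and T without its overall phase exp(-2*pi*i*l/3); both rescalings
# drop out of the commutation relations Z S = S Z and Z T = T Z.
S2 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]
T0 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

ONE = [[int(i == j) for j in range(N)] for i in range(N)]
W = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
XS = [[1, 0, 1, 0], [0, 0, 0, 0], [1, 0, 1, 0], [0, 0, 0, 0]]
XC = matmul(matmul(W, XS), W)
Q = matmul(XS, W)
TQ = matmul(W, XS)
NAMED = {"1": ONE, "W": W, "Xs": XS, "Xc": XC, "Q": Q, "tQ": TQ}


def is_modular_invariant(z):
    return (z[0][0] == 1
            and matmul(S2, z) == matmul(z, S2)
            and matmul(T0, z) == matmul(z, T0))


def enumerate_invariants():
    """All non-negative integer Z with Z_00 = 1 commuting with S and T.

    The entries of Z must sum to 4 (see the lead-in), so we distribute
    3 units over the 15 entries other than Z_00.
    """
    found = []
    for cells in combinations_with_replacement(range(1, N * N), 3):
        z = [[0] * N for _ in range(N)]
        z[0][0] = 1
        for c in cells:
            z[c // N][c % N] += 1
        if is_modular_invariant(z):
            found.append(z)
    return found


found = enumerate_invariants()
traces = {name: sum(z[i][i] for i in range(N)) for name, z in NAMED.items()}
# Entries of 1 - W - Xs - Xc + Q + tQ, which should all vanish.
dependence = [[ONE[i][j] - W[i][j] - XS[i][j] - XC[i][j] + Q[i][j] + TQ[i][j]
               for j in range(N)] for i in range(N)]
```

The dictionary `traces` makes the selection by $\mathrm{tr} Z = 1$ explicit: the symmetric invariants have traces 4, 2, 2, 2, so that only $Q$ and ${}^tQ$ remain.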
One reason may be that the most popular models, those based on SU(n) current algebras, only seem to have modular invariants with identical parents: it is in fact likely that all SU(n) invariants are entirely symmetric.²

² This was pointed out to us by Terry Gannon.

But can it happen that $Z^+ = Z^-$ but $M_+ \neq M_-$? We do not know an example, but neither do we see a reason why this should not be possible. For instance, if there is $\langle \lambda, \theta \rangle \ge 2$ for some $\lambda$ then it may happen that


$K_\lambda^+ \neq K_\lambda^-$ though these spaces may still have the same dimension, $Z_{\lambda,0} = Z_{0,\lambda}$. In other words, it is conceivable that certain modular invariants look like type I or type II though they really come from heterotic extensions.

Let us finally mention that the exotic modular invariants which are argued not to correspond to any RCFT in [37, 39, 18] will not be produced from subfactors by the machinery of [6, 7]. Note that the standard argument showing that a modular invariant $Z$ does not give a partition function of a RCFT is to disprove the existence of an extended S-matrix. However, from braided subfactors there always arises a matrix $S^{\mathrm{ext}}$ with all the required properties. And in fact, Rehren's recent result [35] (and in turn our Proposition 5.6) shows generally that all coupling matrices which arise from an embedding of some local algebra of a chiral RCFT describe the restriction of a 2D RCFT to its chiral building blocks. This implies for example that the heterotic modular invariants $Q$ and ${}^tQ$ discussed in Sect. 7 are in fact coupling matrices of a RCFT whose chiral algebras are different maximal extensions of the $SO(n)_1$ current algebra.

Acknowledgement. We would like to thank especially T. Gannon for drawing our attention to the nonsymmetric $SO(16\ell)_1$ modular invariants as well as K.-H. Rehren for pointing out the benefit of proving Proposition 5.6, and we are indebted to J. Fuchs, T. Gannon, and K.-H. Rehren for helpful e-mail correspondences. We gratefully acknowledge financial support of the EU TMR Network in Non-Commutative Geometry.

References

1. Behrend, R.E., Pearce, P.A., Petkova, V.B., Zuber, J.-B.: Boundary conditions in rational conformal field theories. Nucl. Phys. B 570, 525–589 (2000)
2. Böckenhauer, J.: An algebraic formulation of level one Wess-Zumino-Witten models. Rev. Math. Phys. 8, 925–947 (1996)
3. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. I. Commun. Math. Phys. 197, 361–386 (1998)
4. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. II. Commun. Math. Phys. 200, 57–103 (1999)
5. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. III. Commun. Math. Phys. 205, 183–228 (1999)
6. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral generators and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999)
7. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: Chiral structure of modular invariants for subfactors. Commun. Math. Phys. 210, 733–784 (2000)
8. Cappelli, A., Itzykson, C., Zuber, J.-B.: The A-D-E classification of minimal and $A_1^{(1)}$ conformal invariant theories. Commun. Math. Phys. 113, 1–26 (1987)
9. Di Francesco, P.: Integrable lattice models, graphs and modular invariant conformal field theories. Int. J. Mod. Phys. A 7, 407–500 (1992)
10. Di Francesco, P., Zuber, J.-B.: SU(N) lattice integrable models associated with graphs. Nucl. Phys. B 338, 602–646 (1990)
11. Di Francesco, P., Zuber, J.-B.: SU(N) lattice integrable models and modular invariance. In: Recent Developments in Conformal Field Theories. Trieste 1989, Singapore: World Scientific, 1990, pp. 179–215
12. Dijkgraaf, R., Verlinde, E.: Modular invariance and the fusion algebras. Nucl. Phys. (Proc. Suppl.) 5B, 87–97 (1988)
13. Doplicher, S., Haag, R., Roberts, J.E.: Fields, observables and gauge transformations. II. Commun. Math. Phys. 15, 173–200 (1969)
14. Evans, D.E., Kawahigashi, Y.: Quantum symmetries on operator algebras. Oxford: Oxford University Press, 1998
15. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras. II. Rev. Math. Phys. Special issue, 113–157 (1992)
16. Fröhlich, J., Gabbiani, F.: Braid statistics in local quantum theory. Rev. Math. Phys. 2, 251–353 (1990)
17. Fröhlich, J., Gabbiani, F.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)


18. Fuchs, J., Schellekens, A.N., Schweigert, C.: Galois modular invariants of WZW models. Nucl. Phys. B 437, 667–694 (1995)
19. Gannon, T.: The classification of affine SU(3) modular invariants. Commun. Math. Phys. 161, 233–264 (1994)
20. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
21. Haag, R.: Local Quantum Physics. Berlin: Springer-Verlag, 1992
22. Izumi, M.: Subalgebras of infinite C*-algebras with finite Watatani indices II: Cuntz-Krieger algebras. Duke Math. J. 91, 409–461 (1998)
23. Izumi, M., Longo, R., Popa, S.: A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a generalization to Kac algebras. J. Funct. Anal. 155, 25–63 (1998)
24. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983)
25. Kac, V.G.: Infinite dimensional Lie algebras, 3rd edition. Cambridge: Cambridge University Press, 1990
26. Kato, A.: Classification of modular invariant partition functions in two dimensions. Mod. Phys. Lett. A 2, 585–600 (1987)
27. Longo, R.: Minimal index of braided subfactors. J. Funct. Anal. 109, 98–112 (1991)
28. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
29. Moore, G., Seiberg, N.: Naturality in conformal field theory. Nucl. Phys. B 313, 16–40 (1989)
30. Ocneanu, A.: Paths on Coxeter diagrams: From Platonic solids and singularities to minimal models and subfactors. (Notes recorded by S. Goto.) In: Rajarama Bhat, B.V. et al. (eds.), Lectures on operator theory, The Fields Institute Monographs, Providence, RI: AMS Publications, 2000, pp. 243–323
31. Petkova, V.B., Zuber, J.-B.: From CFT to graphs. Nucl. Phys. B 463, 161–193 (1996)
32. Rehren, K.-H.: Braid group statistics and their superselection rules. In: Kastler, D. (ed.): The algebraic theory of superselection sectors. Palermo 1989, Singapore: World Scientific, 1990, pp. 333–355
33. Rehren, K.-H.: Space-time fields and exchange fields. Commun. Math. Phys. 132, 461–483 (1990)
34. Rehren, K.-H.: Chiral observables and modular invariants. Commun. Math. Phys. 208, 689–712 (2000)
35. Rehren, K.-H.: Canonical tensor product subfactors. Commun. Math. Phys. 211, 395–406 (2000)
36. Rehren, K.-H., Stanev, Y.S., Todorov, I.T.: Characterizing invariants for local extensions of current algebras. Commun. Math. Phys. 174, 605–633 (1996)
37. Schellekens, A.N., Yankielowicz, S.: Field identification fixed points in the coset construction. Nucl. Phys. B 334, 67–102 (1990)
38. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B 300, 360–376 (1988)
39. Verstegen, D.: New exceptional modular invariant partition functions for simple Kac–Moody algebras. Nucl. Phys. B 346, 349–386 (1990)
40. Wassermann, A.: Operator algebras and conformal field theory III: Fusion of positive energy representations of LSU(N) using bounded operators. Invent. Math. 133, 467–538 (1998)
41. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 347–403 (1998)

Communicated by H. Araki

Commun. Math. Phys. 213, 291 – 330 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Stochastic Dissipative PDE's and Gibbs Measures

Sergei Kuksin¹,², Armen Shirikyan¹,³

¹ Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, Scotland, UK. E-mail: [email protected]; [email protected]
² Steklov Institute of Mathematics, 8 Gubkina St., 117966 Moscow, Russia
³ Institute of Mechanics of MSU, 1 Michurinskii Av., 119899 Moscow, Russia

Received: 24 January 2000 / Accepted: 17 February 2000

Abstract: We study a class of dissipative nonlinear PDE's forced by a random force $\eta^\omega(t, x)$, with the space variable $x$ varying in a bounded domain. The class contains the 2D Navier–Stokes equations (under periodic or Dirichlet boundary conditions), and the forces we consider are those common in statistical hydrodynamics: they are random fields smooth in $x$ and stationary, short-correlated in time $t$. In this paper, we confine ourselves to "kick forces" of the form

\[
\eta^\omega(t, x) = \sum_{k=-\infty}^{+\infty} \delta(t - kT)\, \eta_k(x),
\]

where the $\eta_k$'s are smooth bounded identically distributed random fields. The equation in question defines a Markov chain in an appropriately chosen phase space (a subset of a function space) that contains the zero function and is invariant for the (random) flow of the equation. Concerning this Markov chain, we prove the following main result (see Theorem 2.2): The Markov chain has a unique invariant measure. To prove this theorem, we present a construction assigning, to any invariant measure, a Gibbs measure for a 1D system with compact phase space and apply a version of Ruelle–Perron–Frobenius uniqueness theorem to the corresponding Gibbs system. We also discuss ergodic properties of the invariant measure and corresponding properties of the original randomly forced PDE.

Contents

0. Introduction
1. Preliminaries
   1.1 Invariant measures for a class of Markov chains
   1.2 Stationary process corresponding to an invariant measure
   1.3 Gibbs system
2. Invariant Measures for Nonlinear Dissipative Semi-Groups
   2.1 Statement of the main results
   2.2 Scheme of the proof of Theorem 2.2
3. Lyapunov–Schmidt Type Reduction
   3.1 Statement of the result
   3.2 Markov chain in the space $H_N = H_N \times H_N^\perp$
4. A Version of the RPF-Theorem
   4.1 Statement of the result
   4.2 Proof of Theorem 4.1
   4.3 Sufficient conditions for application of Theorem 4.1
5. Proof of Theorem 2.2
   5.1 Reduction to Theorem 4.1
   5.2 Checking Condition (H1)
   5.3 Checking Condition (H2)
6. Ergodic Properties of the Invariant Measure
   6.1 Support of the invariant measure
   6.2 Convergence to the invariant measure and mixing
7. Application to Stochastic Dissipative PDE's
   7.1 Navier-Stokes equations in a bounded domain
   7.2 Navier-Stokes equations on a torus
   7.3 A nonlinear Schrödinger equation on a torus
8. Appendix: Proof of Theorem 3.1

0. Introduction

The paper deals with a class of randomly forced dissipative nonlinear PDE's. The class contains the 2D space-periodic Navier–Stokes equations and in the Introduction we confine ourselves to this important example:

\[
\dot u - \delta \Delta u + (u \cdot \nabla) u + \nabla p = \eta^\omega(t, x), \qquad \operatorname{div} u = 0,
\tag{0.1}
\]

where $x \in \mathbb{T}^2 = \mathbb{R}^2 / 2\pi\mathbb{Z}^2$, $u = u(t, x)$ and $p = p(t, x)$. We assume that

\[
\operatorname{div} \eta^\omega \equiv 0, \qquad \langle \eta^\omega \rangle := \int_{\mathbb{T}^2} \eta^\omega \, dx \equiv 0
\]

and study solutions $u(t, x)$ with zero mean value, i.e., $\langle u \rangle \equiv 0$. In the usual way [BV, CF, G, L], we exclude the pressure $p$ from the equations, applying to (0.1) the projection to the linear space formed by divergence-free vector fields. Accordingly, we view (0.1) as a random dynamical system in a Sobolev phase space $\mathcal{H}^s$, $s \ge 0$, where

\[
\mathcal{H}^s = \{ u \in H^s(\mathbb{T}^2; \mathbb{R}^2) : \operatorname{div} u = 0, \ \langle u \rangle = 0 \}.
\]

Our concern is large-time asymptotic properties of this system. It is traditional for statistical hydrodynamics to assume that the random force $\eta^\omega(t, x)$ in Eq. (0.1) is smooth in $x$ and is stationary short-correlated in time $t$. In mathematical literature, it is common to replace physically correct forces described above by random fields $\eta^\omega(t, x)$ that are smooth in $x$, while as a function of time $t$ they are white noises, see [DZ, FlM, BKL, M, S1]. In this work, we take another mathematical model for the physically correct forces $\eta$, namely, a "kick force model" (see below). This model is


sufficiently popular and suits our techniques the best. We believe that our approach also applies to equations with white noise forces in time¹ and plan to address these equations in a subsequent paper.

¹ For a non-degenerate white noise force $\eta$, the set $\mathcal{A}$ in Theorems 0.1 and 0.2 below coincides with the whole space $\mathcal{H}^s$.

A kick force $\eta$ corresponds to the situation when the system gets smooth random kicks with some time period $T$ and evolves freely between the kicks. It means that the $k$-th kick changes a solution $u(kT, x)$ to $u(kT + 0, x) = u(kT, x) + \eta_k(x)$, while between the kicks, $u(t, x)$ satisfies Eq. (0.1) with $\eta = 0$. This model is described by Eq. (0.1) in which the force $\eta$ is a δ-function of time $t$:

\[
\eta^\omega(t, x) = \sum_{k \in \mathbb{Z}} \delta(t - kT)\, \eta_k(x).
\tag{0.2}
\]

Here the $\eta_k$'s are independent identically distributed smooth random fields. To describe them, we expand $\eta_k$ in the $L^2$-normalised trigonometric basis $\{e_j, j \in \mathbb{N}\}$ of the space $\mathcal{H}^s$:

\[
\eta_k(x) = \sum_{j=1}^{\infty} b_j \xi_{jk} e_j(x).
\tag{0.3}
\]

In (0.3), $\xi_{jk}$ ($j \in \mathbb{N}$, $k \in \mathbb{Z}$) are independent random variables uniformly distributed on the interval $[-1, 1]$ and the real constants $b_j$ satisfy the inequality

\[
|b_j| \le C_r\, j^{-r} \quad \text{for all } j, r \in \mathbb{N}.
\tag{0.4}
\]

(This assumption guarantees that the random fields are smooth.) The restrictions on distributions of $\xi_{jk}$'s and on the decay rate (0.4) can be weakened, see (2.7). Due to the kick nature of the force, a solution $u^\omega(t, x)$ is completely described by its values $u_k(x)$ at the points $t = kT$, $k \in \mathbb{Z}$:

\[
u_k(x) := u^\omega(kT + 0, x), \qquad k \in \mathbb{Z}.
\]

Accordingly, from now on we treat (0.1) as a discrete-time random dynamical system in $\mathcal{H}^s$:

\[
u_k = S(u_{k-1}) + \eta_k,
\tag{0.5}
\]

where the map $S : \mathcal{H}^s \to \mathcal{H}^s$ is a time-one shift along trajectories of Eq. (0.1) with $\eta = 0$. Due to (0.3) and (0.4), the random fields $\eta_k(x)$ are bounded in any $C^l$-norm, uniformly with respect to $k$ and almost all $\omega$. It follows that the solution of (0.5) with zero Cauchy data

\[
u_0(x) = 0
\tag{0.6}
\]

is also bounded in every $\mathcal{H}^l$-norm:

\[
\| u_k \|_{\mathcal{H}^l} \le C_l \quad \text{for } k \ge 0 \text{ and almost all } \omega.
\tag{0.7}
\]

Denoting by $\mathcal{A}_k \subset \mathcal{H}^s$ the support of distribution of the random variable $u_k \in \mathcal{H}^s$, we conclude from (0.7) that the union $\cup_{k \ge 0} \mathcal{A}_k$ is a precompact subset of $\mathcal{H}^s$. Its closure $\mathcal{A}$


is a compact set invariant for the random dynamical system (0.5), so the system defines a family of Markov chains in $\mathcal{A}$. That is, given any Borel measure $\lambda$ on $\mathcal{A}$, Eq. (0.5) admits a unique solution $(u_k(x), k \ge 0)$, which is a Markov chain in $\mathcal{A}$ such that the distribution of $u_0$ is $\lambda$ (see [Re] and the main text). The measure $\lambda$ is said to be invariant for (0.5) if the distributions of all random variables $u_k \in \mathcal{H}^s$ coincide with $\lambda$. The main result of this work is the following theorem:

Theorem 0.1. There exists a finite integer $N \ge 0$ such that if $b_j \neq 0$ for $1 \le j \le N$, then the Markov chain defined by system (0.5) in $\mathcal{A}$ has a unique invariant measure $\lambda$. This measure is concentrated on smooth vector fields, i.e., $\lambda\bigl(\mathcal{H}^s \cap C^\infty(\mathbb{T}^2, \mathbb{R}^2)\bigr) = 1$, and its support in $\mathcal{H}^s$ is equal to $\mathcal{A}$.²

Existence of an invariant measure is an easy consequence of the usual Bogolyubov–Krylov argument (see [DZ]), whereas its uniqueness is a deep result (cf. [G, Sect. 6.1]). The integer $N$ depends on the viscosity $\delta > 0$ and on the $\mathcal{H}^s$-norm of the force $\eta$. In particular, the Markov chain in $\mathcal{A} = \mathcal{A}_\delta$ has a unique invariant measure for any $\delta > 0$ if all coefficients $b_j$ are nonzero (and satisfy (0.4)). The measure $\lambda$ in Theorem 0.1 describes asymptotic behaviour in time for solutions of (0.5). That is, it describes the long-time behaviour of 2D fluid sloshed by random kicks.

Theorem 0.2. Let $b_j \neq 0$ for $1 \le j \le N$, where $N$ is defined in Theorem 0.1, and let $(u_k, k \ge 0)$ be any Markov chain in $\mathcal{A}$ satisfying (0.5). Then the distributions of random variables $u_k \in \mathcal{H}^s$ weakly converge to $\lambda$. In particular, the solution of (0.5), (0.6) converges to $\lambda$ in distribution as $k \to \infty$.

Since the random variables $u_k$ are valued in the compact set $\mathcal{A} \subset \mathcal{H}^s$, for any continuous function $f$ on $\mathcal{H}^s$ we have the following convergence:

\[
\mathbb{E} f(u_k) \to \int_{\mathcal{A}} f(u) \, d\lambda(u) \quad \text{as } k \to \infty.
\]
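The structure of the discrete-time system (0.5) can be illustrated on a finite-dimensional toy model. The sketch below is not the Navier–Stokes flow: as a stand-in for $S$ it uses the linear map $S(u)_j = e^{-\delta j^2 T} u_j$ acting on finitely many Fourier coefficients, and the coefficients $b_j = e^{-j}$, the number of modes and the parameter values are assumptions chosen only so that (0.4) holds and all $b_j$ are nonzero. For such a linear contraction two trajectories driven by the same kicks approach each other deterministically, so the loss of memory of the initial state is elementary; for a nonlinear dissipative $S$ no such pathwise argument is available in general.

```python
import math
import random


def make_kick(rng, b):
    """Coordinates of one kick eta_k: b_j * xi_jk with xi_jk uniform on [-1, 1]."""
    return [bj * rng.uniform(-1.0, 1.0) for bj in b]


def step(u, kick, decay):
    """One step u_k = S(u_{k-1}) + eta_k for the toy map S(u)_j = decay_j * u_j."""
    return [dj * uj + ej for dj, uj, ej in zip(decay, u, kick)]


def simulate(u0, v0, n_steps, n_modes=8, viscosity=0.1, period=1.0, seed=0):
    """Run two chains from u0 and v0 with the same kicks (toy model only)."""
    rng = random.Random(seed)
    # Hypothetical choice b_j = exp(-j): nonzero and faster than any power of j.
    b = [math.exp(-j) for j in range(1, n_modes + 1)]
    decay = [math.exp(-viscosity * j * j * period) for j in range(1, n_modes + 1)]
    u, v = list(u0), list(v0)
    for _ in range(n_steps):
        kick = make_kick(rng, b)
        u = step(u, kick, decay)
        v = step(v, kick, decay)
    return u, v, b, decay


u, v, b, decay = simulate([0.0] * 8, [1.0] * 8, n_steps=200)
gap = max(abs(x - y) for x, y in zip(u, v))
print("largest coordinate gap after 200 kicks:", gap)
```

In this toy setting the chain started from zero stays in the box $|u_j| \le |b_j| / (1 - e^{-\delta j^2 T})$, which plays the role of the bound (0.7) and of the compact invariant set.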

In particular, choosing $s \ge 2$ and taking for $f$ the function $f(u(\cdot)) = u_i(x) u_j(y)$ with some $1 \le i, j \le 2$ and $x, y \in \mathbb{T}^2$, we see that the 2-point correlation tensors $\mathbb{E}\, u_i(x) u_j(y)$ for a statistical solution of the 2D Navier–Stokes equations (0.1) = (0.5) with zero initial condition (0.6) converge to corresponding correlation tensors for the measure $\lambda$. The results of Theorems 0.1 and 0.2 were proved earlier for laminar flows (when the Reynolds number $\sim |\eta| \delta^{-1}$ is small), see [M], and for Eq. (0.1) forced by a random force $\eta$ that is non-smooth in $x$, see [FlM]. The above-mentioned works deal with the Navier–Stokes equations forced by a white noise force of the form

\[
\eta(t, x) = \frac{d}{dt} \sum_{j=1}^{\infty} b_j \beta_j(t) e_j(x).
\tag{0.8}
\]

Here $\{e_j(x)\}$ is the trigonometric basis, $\beta_j$'s are independent standard Brownian motions, and $b_j$'s are real constants. The non-smoothness assumption imposed in [FlM] reads as

\[
c\, j^{-1/2} \le b_j \le C j^{-3/8 - \varepsilon} \quad \text{for all } j \ge 1,
\tag{0.9}
\]

² This means that $\mathcal{A}$ is the minimal closed subset of $\mathcal{H}^s$ which has full $\lambda$-measure.


where $c$, $C$, and $\varepsilon$ are positive constants. Due to the lower bound in (0.9), Eq. (0.1) with right-hand side (0.8) defines, in a suitable low-smoothness Sobolev space $\mathcal{H}^s$, a Markov process that satisfies the strong Feller property. This allows one to get the uniqueness of an invariant measure as a corollary of a version of the Doob uniqueness theorem [DZ]. Another work related to Theorems 0.1 and 0.2 is the paper [S1] devoted to the Burgers equation. In [S1], uniqueness of an invariant measure is established without the smallness or non-smoothness assumption. However, the method applied there substantially employs the Cole–Hopf transformation, which integrates the Burgers equation. We also note that analysis of the resulting formulas uses some techniques developed to study Gibbs measures. For a space-smooth random force $\eta$, the family of Markov processes defined by Eq. (0.1) (or (0.5)) is not strongly Feller, see [DZ] and Remark 5.2. Therefore Doob's arguments do not apply to the problem (0.1)–(0.4). Our proof is based on other ideas sketched in Sect. 2.2. Very loosely, to prove the uniqueness we pass from Eq. (0.5) to an equation for semi-trajectories $(u_k, k \le j)$, $j \in \mathbb{Z}$, and replace the latter by an equivalent system which is a direct sum of a random dynamical system in a space of sequences $(v_k \in \mathbb{R}^N, k \le 0)$ ($N$ is the same as in Theorem 0.1) and of some trivial system. The new dynamical system turns out to be of the same type as those arising in studies of 1D Gibbs measures (see [Ru, S2, Bo]). To prove uniqueness of a 1D Gibbs measure, Ruelle proposed a Perron–Frobenius type theorem (see the same references). In Sect. 5, we use a version of his result to prove the uniqueness of an invariant measure. It is quite possible that some statements related to Ruelle's type theorem applied in this paper were known earlier. Still, since we failed to find an appropriate version in the literature, a complete proof of the result we need is given in Sect. 4.
Our approach for proving the uniqueness applies to a large class of dissipative nonlinear systems described in Sect. 2.1 and does not use specifics of the Euler nonlinearity $(u \cdot \nabla)u$.

To complete the Introduction, we note that, in terms of the measures $\lambda = \lambda_\delta$, the turbulence problem (for space-periodic 2D flow) is the problem of understanding the limiting behaviour of the measure $\lambda_\delta$ as $\delta \to 0$. (Theorems 0.1 and 0.2 apply to Eq. (0.1) with any $\delta > 0$ if in (0.3) all $b_j$'s are positive.) In contrast to what we said above regarding the equations with fixed $\delta > 0$, it is commonly believed that the limiting behaviour of the invariant measure depends essentially on properties of the Euler nonlinearity, cf. [BKL, G]. Corresponding results have no chance to be as general as the main results of this work (i.e., Theorems 2.2, 6.1, and 6.2, whose important particular cases are Theorems 0.1 and 0.2).

Notation. Let $\mathbb{Z} = \mathbb{Z}_\infty$ be the set of all integers and, for $k \in \mathbb{Z}$, let $\mathbb{Z}_k$ be the set of integers that are no greater than $k$. For any set $M$, we denote by

$$\boldsymbol{M} = M^{\mathbb{Z}_0} = \prod_{l=-\infty}^{0} M, \qquad \mathcal{M} = M^{\mathbb{Z}} = \prod_{l=-\infty}^{+\infty} M$$

the spaces of sequences $\boldsymbol{m} = (m_l, l \in \mathbb{Z}_0)$ and $\boldsymbol{n} = (n_l, l \in \mathbb{Z})$, respectively. For any $\boldsymbol{n} \in \mathcal{M}$ and $k \in \mathbb{Z}$, we write $\boldsymbol{n}^k = (n_l, l \in \mathbb{Z}_k)$ and regard the sequence $\boldsymbol{n}^k$ as an element of $\boldsymbol{M}$, identifying it with a shifted sequence that belongs to $\boldsymbol{M}$. As a rule, the superscripts of elements belonging to $\boldsymbol{M}$ or $\mathcal{M}$ will signify the “discrete time” while the subscripts will stand for the number of a component (e.g., $\boldsymbol{n}^k = (\dots, n^k_{-1}, n^k_0)$).


S. Kuksin, A. Shirikyan

A set of sequences $\{\boldsymbol{m}^k \in \boldsymbol{M}, i < k < j\}$, $-\infty \le i < j \le \infty$, is said to be compatible if there exists a sequence $(m_l, l < j)$ such that $\boldsymbol{m}^k = (\dots, m_{k-1}, m_k)$ for $i < k < j$.

Let $X$ be a Polish space, i.e., a separable complete metric space. We shall use the following notation.

$[B]_X$ is the closure in $X$ of its subset $B$.
$B_X(x, r)$ is a ball in $X$ of radius $r$ centered at $x \in X$; we write $B_X(r)$ if $X$ has a selected point $0$ and $x = 0$.
$\mathcal{B}(X)$ is the σ-algebra of Borel subsets of $X$.
$\mathcal{P}(X)$ is the set of probability measures on $(X, \mathcal{B}(X))$.
$C(X)$ is the space of real-valued continuous functions on $X$.
$C_b(X)$ is the space of bounded functions $f \in C(X)$. It is endowed with the norm $\|f\|_\infty := \sup_{x \in X} |f(x)|$.
$L^1(X, \mu)$ is the space of Borel functions on $X$ with finite norm $\|f\|_\mu := \int_X |f(x)|\, d\mu(x)$.
$D(\xi)$ is the distribution of a random variable $\xi$.

The integral of a function $f(x)$ over the space $X$ with respect to a measure $\mu$ will sometimes be denoted by $(\mu, f)$:

$$(\mu, f) = \int_X f(x)\, d\mu(x) = \int_X f\, d\mu.$$

1. Preliminaries

1.1. Invariant measures for a class of Markov chains. Let $V$ be a separable Fréchet space, let $T : V \to V$ be a continuous mapping, and let $\{\eta_k, k \ge 1\}$ be a sequence of independent identically distributed (i.i.d.) $V$-valued random variables defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Consider a family of homogeneous $V$-valued Markov chains (see [Re]) $\Theta_k = \Theta_k(v)$, $v \in V$, defined by the formulas

$$\Theta_0 = v, \tag{1.1}$$
$$\Theta_k = T(\Theta_{k-1}) + \eta_k, \tag{1.2}$$

where $k \ge 1$. Denote by $P(k, v, \Gamma)$ the corresponding transition function:

$$P(k, v, \Gamma) = \mathbb{P}\{\Theta_k(v) \in \Gamma\}, \qquad v \in V, \quad \Gamma \in \mathcal{B}(V).$$

Note that if $\nu$ is the distribution of $\eta_k$, then

$$P(1, v, \Gamma) = \mathbb{P}\{T(v) + \eta_1 \in \Gamma\} = \nu\bigl(\Gamma - T(v)\bigr). \tag{1.3}$$
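For illustration only, the recursion (1.1), (1.2) can be simulated in the simplest finite-dimensional setting. The following is a minimal Python sketch under stated assumptions: the phase space is the real line, the map `contraction` and the uniform noise on [−1, 1] are hypothetical choices made for the example, and none of the names come from the paper.

```python
import random

def simulate_chain(T, sample_noise, v0, steps):
    """Return [Theta_0, ..., Theta_steps] for Theta_k = T(Theta_{k-1}) + eta_k."""
    path = [v0]
    for _ in range(steps):
        path.append(T(path[-1]) + sample_noise())
    return path

def contraction(x):
    # Hypothetical continuous map T on V = R (an assumption for illustration).
    return 0.5 * x

rng = random.Random(0)

def noise():
    # i.i.d. noise eta_k whose distribution nu is uniform on [-1, 1] (assumption).
    return rng.uniform(-1.0, 1.0)

path = simulate_chain(contraction, noise, v0=0.0, steps=1000)
```

With these choices every trajectory started at zero stays in the interval [−2, 2], since |Θ_k| ≤ |Θ_{k−1}|/2 + 1.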


With the transition function (1.3) one associates the Markov operators

$$P_k f(v) = \int_V P(k, v, dz) f(z) : C_b(V) \to C_b(V), \tag{1.4}$$
$$P_k^* \mu(\Gamma) = \int_V P(k, v, \Gamma)\, d\mu(v) : \mathcal{P}(V) \to \mathcal{P}(V), \tag{1.5}$$

where $f \in C_b(V)$ and $\mu \in \mathcal{P}(V)$. We shall write $P$ and $P^*$ instead of $P_1$ and $P_1^*$, respectively. The fact that the image under $P_k$ of a continuous bounded function belongs to $C_b(V)$ is known as the Feller property. It follows from continuity of $T$. Indeed, if $v_n \in V$ converges to $v$, then

$$Pf(v_n) = \int_V P(1, v_n, dz) f(z) = \int_V f\bigl(z + T(v_n)\bigr)\, d\nu(z)$$

and therefore, by the dominated convergence theorem, $Pf(v_n) \to Pf(v)$ as $n \to \infty$. We shall say that the transition function $P(k, v, \Gamma)$ and the Markov operators $P_k$ and $P_k^*$ correspond to Eq. (1.2).

Definition 1.1. A random sequence $\{\Theta_k, k > k_0\}$, $k_0 \ge -\infty$, is called a solution of (1.2) if the following three conditions hold:
• $\{\Theta_k\}$ is a Markov chain;
• $\{\Theta_k\}$ satisfies (1.2) for $k > k_0 + 1$;
• the process is future-independent, i.e., $\Theta_k$ is independent of $\eta_l$, $l > k$.

In particular, any Markov chain $\Theta_k(u)$, $k \ge 0$, is a solution of (1.2). Our goal is to study distributions of solutions for some equations of the form (1.2) that correspond to stochastic PDE’s. Accordingly, our main interest here is not with the random variables $\Theta_k$ themselves, but rather with their distributions in $V$. For this reason, we do not distinguish equations of the form (1.2) with different probability spaces $(\Omega, \mathcal{F}, \mathbb{P})$ as soon as they give rise to the same transition function (1.3) (and hence to the same Markov operators (1.4) and (1.5)).

Denote by $D(\xi)$ the distribution of a random variable $\xi$ and by $\operatorname{supp} \mu$ the support of a measure $\mu$ (i.e., the smallest closed set of full measure). For Eq. (1.2), the set of attainability $A_k$ from zero by the time $k$ is defined as

$$A_0 = \{0\}, \qquad A_k = T(A_{k-1}) + \operatorname{supp} D(\eta_k), \quad k \ge 1, \tag{1.6}$$

and the set of attainability from zero (in infinite time) has the form

$$A := \Bigl[\, \bigcup_{k=0}^{\infty} A_k \Bigr]_V. \tag{1.7}$$
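As a worked one-dimensional instance of the sets of attainability (an illustration under assumptions, not an example from the paper), take the hypothetical linear map T(x) = a·x on the real line with |a| < 1 and noise whose support is the interval [−ρ, ρ]. Then every A_k is a symmetric interval [−r_k, r_k] with r_0 = 0 and r_k = |a|·r_{k−1} + ρ, and the closure of the union is [−ρ/(1−|a|), ρ/(1−|a|)]. A minimal Python sketch:

```python
def attainability_radii(a, noise_radius, steps):
    """Radii r_k of A_k = [-r_k, r_k] for T(x) = a*x and supp D(eta_k) = [-noise_radius, noise_radius].

    Uses r_0 = 0 and r_k = |a| * r_{k-1} + noise_radius, which is the Minkowski sum
    T(A_{k-1}) + supp D(eta_k) written out for symmetric intervals.
    """
    radii = [0.0]
    for _ in range(steps):
        radii.append(abs(a) * radii[-1] + noise_radius)
    return radii

radii = attainability_radii(a=0.5, noise_radius=1.0, steps=50)
limit_radius = 1.0 / (1.0 - 0.5)  # radius of the closed union A in this toy case
```

In this toy case the set A is the compact interval [−2, 2], so the compactness assumption used later for the existence of an invariant measure is satisfied.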

Remark 1.2. The sets $A_k$ and $A$ can be interpreted in terms of the optimal control theory if we view (1.1), (1.2) as a controllable system, where $\eta_k$ is the control chosen at the instant $k$. With this in mind, for given elements $v$ and $\eta_k$, $k \ge 1$, we define the sequence (cf. (1.1), (1.2))

$$\Theta_0(v) = v, \qquad \Theta_k(v; \eta_1, \dots, \eta_k) = T\bigl(\Theta_{k-1}(v; \eta_1, \dots, \eta_{k-1})\bigr) + \eta_k, \tag{1.8}$$

where $k \ge 1$. Obviously, $A_k$ and $A$ are the sets of attainability from zero (by finite or infinite time) for the system $\Theta_k(v; \eta_1, \dots, \eta_k)$ in the usual sense of the optimal control.


Let us denote by $\mathcal{P}(V, A)$ the set of Borel measures $\lambda \in \mathcal{P}(V)$ such that $\operatorname{supp} \lambda \subset A$ and fix an arbitrary $\lambda_0 \in \mathcal{P}(V, A)$. Let $\Theta_k$ be a Markov chain satisfying (1.2) for $k \ge 1$ such that $D(\Theta_0) = \lambda_0$. Note that (1.3) is a transition function for $\Theta_k$. Consider the Krylov–Bogolyubov averages of distributions of $\Theta_k$:

$$\lambda_L = \frac{1}{L} \sum_{k=0}^{L-1} D(\Theta_k) = \frac{1}{L} \sum_{k=0}^{L-1} P_k^* \lambda_0. \tag{1.9}$$
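The averages (1.9) are easy to compute explicitly for a finite-state chain, where P* acts on probability vectors by multiplication with a row-stochastic matrix. The sketch below is a hedged illustration: the 3-state matrix is a hypothetical example, and the finite state space is an assumption that replaces the space V of the text. It uses the identity P*λ_L − λ_L = (P_L*λ_0 − λ_0)/L, which shows that the averages are invariant up to an error of order 1/L.

```python
def apply_dual(P, mu):
    """(P* mu)(j) = sum_i mu(i) * P[i][j] for a row-stochastic matrix P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def krylov_bogolyubov_average(P, lam0, L):
    """lambda_L = (1/L) * sum_{k=0}^{L-1} (P*)^k lam0."""
    total = [0.0] * len(lam0)
    mu = list(lam0)
    for _ in range(L):
        total = [t + m for t, m in zip(total, mu)]
        mu = apply_dual(P, mu)
    return [t / L for t in total]

# Hypothetical 3-state transition matrix (each row sums to 1).
P = [[0.9, 0.1, 0.0],
     [0.2, 0.5, 0.3],
     [0.0, 0.4, 0.6]]
lam0 = [1.0, 0.0, 0.0]
L = 2000
lam_L = krylov_bogolyubov_average(P, lam0, L)

# l^1 distance between P* lam_L and lam_L; it is at most 2/L by the identity above.
defect = sum(abs(a - b) for a, b in zip(apply_dual(P, lam_L), lam_L))
```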

We recall that $\mu \in \mathcal{P}(V)$ is called an invariant measure (for the family of Markov chains or for Eq. (1.2)) if $P^* \mu = \mu$. The following assertion is a simple consequence of the Prokhorov and Krylov–Bogolyubov theorems (for instance, see [IW, Theorem I.2.6] and [DZ, Theorem 3.1.1]).

Proposition 1.3. Assume that the set $A$ is compact in $V$. Then the sequence $\{\lambda_L\}$ is tight in $\mathcal{P}(V)$ and, hence, has at least one limit point. Moreover, any limit point of $\{\lambda_L\}$ is an invariant measure for $P^*$, and its support is contained in $A$.

Remark 1.4. By construction, the set $A$ is invariant for any Markov chain $\Theta_k$ that satisfies (1.2) and whose initial distribution $\lambda_0$ is supported by $A$, i.e., $\Theta_k \in A$ almost surely. (For instance, if $\Theta_k = \Theta_k(v)$ is given by (1.1), (1.2) with $v \in A$, then $A$ is invariant for $\Theta_k$.) Redefining such a process $\eta_k$ on a zero subset of $\Omega$, we can assume that $\Theta_k \in A$ for all $\omega \in \Omega$ and regard $\Theta_k$ as a Markov chain with phase space $A$.

1.2. Stationary process corresponding to an invariant measure. Recall that we denote by $\nu \in \mathcal{P}(V)$ the distribution of $\eta_k$.

Proposition 1.5. Assume that the set of attainability $A$ is compact in $V$. Let $\lambda \in \mathcal{P}(V, A)$ be an invariant measure for $P^*$. Then there is a probability space $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$, a sequence of i.i.d. $V$-valued random variables $\tilde\eta_k$, $k \in \mathbb{Z}$, on $\tilde\Omega$, and a stationary $V$-valued Markov chain $\boldsymbol{z} = (z_k, k \in \mathbb{Z})$ such that

$$z_k = T(z_{k-1}) + \tilde\eta_k \quad \text{for all } \omega \in \tilde\Omega, \quad k \in \mathbb{Z}. \tag{1.10}$$

Moreover, $D(\tilde\eta_k) = \nu$ and $D(z_k) = \lambda$ for any $k \in \mathbb{Z}$. Finally, the families $(z_l, l \le k)$ and $(\tilde\eta_l, l \ge k + 1)$ are independent for all $k \in \mathbb{Z}$.

Proof. We apply standard arguments based on the Prokhorov and Skorokhod theorems (for instance, see [IW, Theorems I.2.6 and I.2.7]). Define the space $\boldsymbol{V} = V^{\mathbb{Z}}$ endowed with the Tikhonov topology³ and denote by $\{\xi_k, \zeta_k, k \in \mathbb{Z}\}$ an arbitrary family of independent $V$-valued random variables such that $D(\xi_k) = \lambda$ and $D(\zeta_k) = \nu$. Consider a sequence of $V$-valued Markov chains $\boldsymbol{x}^l = (x_k^l, k \in \mathbb{Z})$ defined as

$$x_k^l = \begin{cases} 0 & \text{for } k \le -l - 1, \\ \xi_l & \text{for } k = -l, \\ T(x_{k-1}^l) + \zeta_k & \text{for } k \ge -l + 1. \end{cases} \tag{1.11}$$

³ This means that a sequence $\boldsymbol{x}^j = (x_l^j, l \le 0)$ converges to $\boldsymbol{x} = (x_l, l \le 0)$ if and only if $x_l^j \to x_l$ as $j \to \infty$ for all $l \le 0$.


We claim that the sequence $D(\boldsymbol{x}^l)$ is tight in $\mathcal{P}(\boldsymbol{V})$. Indeed, by construction, $D(x_k^l) = \delta_0$ for $k \le -l - 1$, where $\delta_0$ is the Dirac measure concentrated at zero, and $D(x_k^l) = \lambda$ for $k \ge -l$. Since the supports of $D(x_k^l)$ are compact sets in $V$ and an infinite product of compact sets is compact in the Tikhonov topology, we conclude that the measures $\boldsymbol{\lambda}_l = D(\boldsymbol{x}^l)$, $l \ge 1$, form a tight family in $\mathcal{P}(\boldsymbol{V})$.

By the Prokhorov theorem, there is a sequence of integers $l_j \to +\infty$ and a measure $\boldsymbol{\lambda} \in \mathcal{P}(\boldsymbol{V})$ such that $\boldsymbol{\lambda}_{l_j} \to \boldsymbol{\lambda}$ in the weak topology of $\mathcal{P}(\boldsymbol{V})$. In view of the Skorokhod embedding theorem, there is a probability space $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$ and $\boldsymbol{V}$-valued random variables $\boldsymbol{z}^j = (z_k^j, k \in \mathbb{Z})$ and $\boldsymbol{z} = (z_k, k \in \mathbb{Z})$ such that

$$D(\boldsymbol{z}^j) = D(\boldsymbol{x}^{l_j}) = \boldsymbol{\lambda}_{l_j}, \qquad D(\boldsymbol{z}) = \boldsymbol{\lambda}, \tag{1.12}$$
$$\boldsymbol{z}^j \to \boldsymbol{z} \quad \text{as } j \to \infty \quad \tilde{\mathbb{P}}\text{-almost surely}. \tag{1.13}$$

We claim that $\boldsymbol{z}$ is the required Markov chain. Indeed, the fact that $D(z_k) = \lambda$ follows from the construction. Let us define the random variables

$$\tilde\eta_k = z_k - T(z_{k-1}), \qquad \tilde\eta_k^l = z_k^l - T(z_{k-1}^l).$$

The first of these relations implies (1.10), and the second shows that $D(\tilde\eta_k^l) = D(\eta_k) = \nu$ for $k \ge 1 - l$. Moreover, it follows from (1.13) and the continuity of $T$ that, $\tilde{\mathbb{P}}$-almost surely, $\tilde\eta_k^l \to \tilde\eta_k$ as $l \to \infty$ for any $k \in \mathbb{Z}$. Therefore, $D(\tilde\eta_k) = \nu$ and

$$(z_k^l, \tilde\eta_k^l, k \in \mathbb{Z}) \to (z_k, \tilde\eta_k, k \in \mathbb{Z}) \quad \tilde{\mathbb{P}}\text{-almost surely as } l \to \infty,$$

whence we conclude that the corresponding distributions also converge. This implies the required assertions concerning independence, which completes the proof of the proposition. □

Since the underlying probability space is of no importance for the applications we deal with, in what follows, we shall drop the tildes and replace (1.10) by the original equation (1.2).

1.3. Gibbs system. In this section, we specify a class of Markov chains (1.1), (1.2) we are the most interested in. Let X be a finite-dimensional Euclidean space with Lebesgue measure dα, α ∈ X, and let Y be a separable Fréchet space with a Borel measure4 d:Y (ψ), ψ ∈ Y . Let us denote X = XZ0 , Y = Y Z0 . We shall assume that X and Y are endowed with the Tikhonov topology. Let (ϕk , k ≥ 1) and (ψk , k ≥ 1) be two independent sequences of i.i.d. random variables with values in X and Y , respectively, such that, for any k ≥ 0, D(ϕk ) = D(α) dα,

D(ψk ) = d:Y (ψ),

4 In all applications the measure : will have a bounded support. Y


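where $D(\alpha)$ is an integrable function on $X$. Let $T_0 : \boldsymbol{X} \times \boldsymbol{Y} \to X$ be a continuous mapping and let

$$T : \boldsymbol{X} \times \boldsymbol{Y} \to X \times Y, \qquad T : \begin{pmatrix} \boldsymbol{v} \\ \boldsymbol{w} \end{pmatrix} \mapsto \begin{pmatrix} T_0(\boldsymbol{v}, \boldsymbol{w}) \\ 0 \end{pmatrix}.$$

Consider a family of Markov chains

$$\Upsilon^k = \Upsilon^k(U) = \begin{pmatrix} \theta^k \\ \zeta^k \end{pmatrix} = \begin{pmatrix} \theta^k(U) \\ \zeta^k(U) \end{pmatrix}, \qquad U = \begin{pmatrix} \boldsymbol{v} \\ \boldsymbol{w} \end{pmatrix} \in \boldsymbol{X} \times \boldsymbol{Y},$$

with phase space $\boldsymbol{X} \times \boldsymbol{Y}$, defined by the formulas⁵ (cf. (1.1), (1.2))

$$\Upsilon^0 = U, \tag{1.14}$$
$$\Upsilon^k = \bigl(\Upsilon^{k-1},\, T(\Upsilon^{k-1}) + \eta_k\bigr), \tag{1.15}$$

where $\eta_k = \begin{pmatrix} \varphi_k \\ \psi_k \end{pmatrix}$ and (1.15) holds for any $k \ge 1$. Note that if (1.15) is viewed as an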

equation and a Markov chain (Υ k , k ∈ Z) is its solution, then the sequences Υ k are compatible (see Notation) since for any integers k and l ≥ 1 we have k+l Υ k+l = Υ k , Υ1−l , . . . , Υ0k+l . (1.16) For l = 1, relation (1.16) results from (1.15); the general case follows by induction. As in Remark 1.2, with the family of Markov chains (1.14), (1.15) we associate the controllable system k θ (U ; σ1 , . . . , σk ) αk k Υ (U ; σ1 , . . . , σk ) = ∈ X × Y, k ≥ 0, , σk = k βk ζ (U ; σ1 , . . . , σk ) given by the formulas Υ 0 (U ) = U , Υ k (U ; σ1 , . . . , σk ) = Υ k−1 , T (Υ k−1 ) + σk ,

(1.17) (1.18)

where Υ k−1 = Υ k−1 (U ; σ1 , . . . , σk−1 ). Let us calculate the transition function P(k, U , 3) for the family (1.14), (1.15). We begin with the case k = 1. For any set I ⊂ R, we shall use capital Gothic letters with subscript I (for instance, BI ) to denote elements of the σ -algebra B (X × Y )I ∩Z0 . Note that, for any vector T ∈ X × Y of the form T = T00 and any k, we have D(T + ηk ) = D(α − T0 ) dα d:Y . Therefore, if a Borel set 3 ∈ B(X × Y) has the form 3 = B(−∞,−1] × B{0} , then P(1, U , 3) = δU (B(−∞,−1] ) D α − T0 (U ) dα d:(ψ). (1.19) B{0}

5 In accordance with our agreement, we identify the right-hand side in (1.15), which is an element of (X × Y )Z1 , with the corresponding element in X × Y.


We now turn to the case k ≥ 2. Let Υ0k (U ) be the zeroth component of the se quence Υ k (U ), i. e., Υ0k (U ) = T Υ k−1 (U ) + ηk . By induction, the joint distribution of the random variables Υ01 (U ), . . . , Υ0k (U ) in the direct product (X × Y ){1,...,k} = k l=1 X × Y has the form DkU =

k D αl − T0 (U ; σ1 , . . . , σl−1 ) d:(σl ) , l=1

where d: = dα d:Y and σl =

(1.20)

αl βl

. Accordingly, for any set of the form

3 = B(−∞,−k] × B(−k,0]

(1.21)

we have (cf. (1.19))

P(k, U , 3) = δU (B(−∞,−k] )DkU B(−k,0] . (1.22) k Now let Υ = (Υlk , l ∈ Z0 ) be a stationary Markov chain that is a solution of Eq. (1.15) for k ∈ Z (see Definition 1.1). In this case, the sequence {Υl = Υ0l , l ∈ Z} formed of the zeroth components of Υ k is a stationary process in X × Y . Since {Υ k , k ≥ 0} is a compatible family, for any n ∈ Z we have Υ n = (. . . , Υn−1 , Υn ). The conditional distribution of the k-vector (Υn+1 , . . . , Υn+k ) under the condition Υ n = U ∈ X × Y equals DkU (see (1.20)). Setting Hk (U ; σ1 , . . . , σk ) := log

k

D αl − T0 (U ; σ1 , . . . , σl−1 ) ≥ −∞,

l=1

we obtain D(Υn+1 , . . . , Υn+k | Υ n = U ) = eHk (U ;σn+1 ,...,σn+k )

n+k

d:(σj ).

(1.23)

j =n+1

Clearly, the Hamiltonian $H_k$ is stationary: it does not depend on $n$, but only on the vector $(\sigma_{n+1}, \dots, \sigma_{n+k})$ and the “past” $U \in \boldsymbol{X} \times \boldsymbol{Y}$. This means that the random sequence $\{\Upsilon_n\}$ has a Gibbs distribution with the Hamiltonians $\{H_k\}$ (for instance, see [S2, Bo]). Thus, any stationary Markov chain $\Upsilon^m$, $m \in \mathbb{Z}$, that satisfies the above-mentioned conditions (in particular, (1.15) holds) defines a Gibbs measure for which the conditional distributions have the form (1.23). We conclude that the uniqueness of a stationary solution for (1.15) follows from that of the Gibbs measure with conditional distributions (1.23). There are many results that guarantee uniqueness of a Gibbs measure for 1D systems (for instance, see [Do, Bo]). However, they do not apply to Gibbs systems associated with the Markov chains we are interested in because the corresponding Hamiltonians are equal to $-\infty$ on large parts of the phase space $(X \times Y)^{\{1,\dots,k\}} = \prod_{j=1}^{k} X \times Y$ (cf. assumption (2.4) in [Do]).

Remark 1.6. The results of this section remain true if the spaces $V$ and $\boldsymbol{X} \times \boldsymbol{Y}$ are replaced by their closed subsets. In this case, we have to assume in addition that these subsets are invariant for the Markov chains introduced above.


2. Invariant Measures for Nonlinear Dissipative Semi-Groups

2.1. Statement of the main results. Let $H$ be a separable Hilbert space with norm $\|\cdot\|$ and let $S : H \to H$ be a (nonlinear) operator such that $S(0) = 0$ and the following Conditions (A)–(C) hold:

(A) For any $R > r > 0$ there are positive constants $C = C(R)$ and $a = a(R, r) < 1$ and an integer $n_0 = n_0(R, r) \ge 1$ such that

$$\|S(u_1) - S(u_2)\| \le C(R) \|u_1 - u_2\| \quad \text{for all } u_1, u_2 \in B_H(R), \tag{2.1}$$
$$\|S^n(u)\| \le \max\{a \|u\|, r\} \quad \text{for } u \in B_H(R), \ n \ge n_0. \tag{2.2}$$

For any compact set $K \subset H$, define a sequence of sets $A_k(K) \subset H$ by the rule

$$A_0(K) = \{0\}, \qquad A_k(K) = S\bigl(A_{k-1}(K)\bigr) + K, \quad k \ge 1, \tag{2.3}$$

and denote

$$A(K) = \Bigl[\, \bigcup_{k=0}^{\infty} A_k(K) \Bigr]_H. \tag{2.4}$$

(B) The set $A(K)$ is bounded in $H$ for any compact set $K \subset H$.

Let $\{e_j\}$ be an orthonormal basis in $H$. For a given integer $N \ge 1$, denote by $P_N$ and $Q_N$ the orthogonal projections onto the closed subspaces $H_N$ and⁶ $H_N^\perp$ generated by the sets of vectors $\{e_1, \dots, e_N\}$ and $\{e_{N+1}, e_{N+2}, \dots\}$, respectively.

(C) For any $R > 0$ there is a decreasing sequence $\gamma_N(R) > 0$ tending to zero as $N \to \infty$ such that

$$\bigl\|Q_N\bigl(S(u_1) - S(u_2)\bigr)\bigr\| \le \gamma_N(R) \|u_1 - u_2\| \quad \text{for all } u_1, u_2 \in B_H(R). \tag{2.5}$$

Let $\eta_k$, $k \ge 1$, be a sequence of independent $H$-valued random variables of the form

$$\eta_k = \sum_{j=1}^{\infty} b_j \xi_{jk} e_j, \tag{2.6}$$

where $b_j \ge 0$ are some constants such that

$$b := \Bigl( \sum_{j=1}^{\infty} b_j^2 \Bigr)^{1/2} < \infty, \tag{2.7}$$

and $\{\xi_{jk}\}$ is a family of independent real-valued random variables satisfying the following condition:

(D) For any $j$, the random variables $\xi_{jk}$, $k \ge 1$, have the same distribution $\pi_j$ such that $\pi_j(dr) = p_j(r)\, dr$, where the densities $p_j(r)$ are Lipschitz continuous and, moreover, $p_j(0) > 0$ and $\operatorname{supp} p_j \subset [-1, 1]$.

⁶ We denote by $E^\perp$ the orthogonal complement to a subspace $E$.
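To make the structure of the noise (2.6) concrete, one can sample a truncated version of it. The following Python sketch is an illustration under assumptions: the coefficients b_j = j⁻² and the truncation to 50 modes are hypothetical, and the density p_j(r) = 1 − |r| on [−1, 1] (the symmetric triangular law) is just one choice compatible with Condition (D). Since |ξ_jk| ≤ 1, every sample satisfies ‖η_k‖ ≤ b with b as in (2.7).

```python
import math
import random

def sample_noise(b, rng):
    """Coordinates of eta_k = sum_j b_j * xi_jk * e_j in the basis {e_j}, truncated to len(b) modes."""
    # random.triangular(-1, 1, 0) has density p(r) = 1 - |r| on [-1, 1]:
    # Lipschitz continuous, p(0) > 0, support contained in [-1, 1].
    return [bj * rng.triangular(-1.0, 1.0, 0.0) for bj in b]

def norm(x):
    """Euclidean norm of a coordinate vector."""
    return math.sqrt(sum(c * c for c in x))

b = [1.0 / (j * j) for j in range(1, 51)]  # hypothetical coefficients b_j = j^(-2)
b_norm = norm(b)                           # plays the role of b in (2.7)
rng = random.Random(1)
samples = [sample_noise(b, rng) for _ in range(200)]
```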


Remark 2.1. We note straightaway that the densities $p_j(r)$ are allowed to be piecewise Lipschitz functions. In this case, all the results and their proofs remain the same, but some of the calculations become more cumbersome.

Define a family of Markov chains $\Theta^k = \Theta^k(u)$, $u \in H$, by the rule (cf. (1.1), (1.2))

$$\Theta^0 = u, \tag{2.8}$$
$$\Theta^k = S(\Theta^{k-1}) + \eta_k, \tag{2.9}$$

where $k \ge 1$, and denote by $P(k, u, \Gamma)$ the corresponding transition function (see (1.3)). Recall that Markov operators associated with $P(k, u, \Gamma)$ have the form (1.4) and (1.5). Denote by $\nu$ the distribution of the i.i.d. random variables $\eta_k$ and by $A$ the set of attainability from zero for Eq. (2.9) (see (1.7)). We recall that $\mathcal{P}(H, A)$ denotes the set of Borel measures on $H$ whose support is contained in $A$. According to Remark 1.4, any Markov chain $\Theta^k$ that satisfies (2.9) and whose initial distribution is supported by $A$ can be regarded as an $A$-valued Markov chain. The theorem below is the main result of this paper.

Theorem 2.2. Assume that Conditions (A)–(D) hold. There is an integer $N = N(b) \ge 1$ such that if

$$b_j > 0 \quad \text{for } j = 1, \dots, N, \tag{2.10}$$

then $P^*$ has a unique invariant measure $\lambda \in \mathcal{P}(H, A)$.

Remark 2.3. We shall apply Theorem 2.2 to stochastic PDE’s of the form

$$\partial_t u + Lu + B(u) = \sum_{k=1}^{\infty} \delta(t - k)\, \eta_k, \tag{2.11}$$

where $L$ is the generator of a parabolic semi-group, $B(u)$ is a nonlinear term, and $\eta_k$ are i.i.d. random variables (see (2.6)). Denote by $S_t$ the solving semi-group for Eq. (2.11) with zero right-hand side. To apply Theorem 2.2, we set $S = S_1$ and check that Conditions (A)–(C) are satisfied for $S$. Let us clarify informally what they mean. Inequality (2.1) is nothing else but the condition of uniform Lipschitz continuity of $S$ on any ball $B_H(R)$. Inequality (2.2) expresses the property of dissipativity of the semi-group $S_t$. Condition (B) means that the solution of (2.11) starting from zero is bounded in the phase space uniformly with respect to time and realisations of the random right-hand side. As a rule, this property holds for dissipative equations. Finally, in Condition (C), $\{e_j\}$ is the complete set of eigenvectors of $L$, and inequality (2.5) follows from the fact that the corresponding eigenvalues tend to $+\infty$.

Ergodic properties of the invariant measure $\lambda \in \mathcal{P}(H, A)$ constructed in Theorem 2.2 are discussed in Sect. 6. The proof of Theorem 2.2, which is based on the constructions of Sects. 3 and 4, is given in Sect. 5. Here we present the main steps of the proof.


2.2. Scheme of the proof of Theorem 2.2.

2.2.1. Existence of an invariant measure and the corresponding stationary Markov chain. To prove the existence of an invariant measure $\lambda \in \mathcal{P}(H, A)$, note that $\operatorname{supp} \nu$ is a compact set (because it is a closed subset of the Hilbert cube defined by the sequence $b_j$) and therefore, by Condition (B), the set of attainability $A = A(\operatorname{supp} \nu)$ is bounded in $H$. It is easy to see that

$$A = S(A) + \operatorname{supp} \nu. \tag{2.12}$$

Now note that, in view of Condition (C) and inequality (2.1) with $u_2 = 0$, there is a finite ε-net for the image under $S$ of any bounded set in $H$. Hence, $S(A)$ is compact, and it follows from (2.12) that $A$ is also compact in $H$. Proposition 1.3 now implies the required assertion.

It remains to check that the invariant measure is unique. The corresponding proof occupies Sects. 3–5. Here we sketch it and develop some notations needed in the sequel.

Let $\lambda \in \mathcal{P}(H, A)$ be an invariant measure for $P^*$. By Proposition 1.5 (see also the remark after its proof), there is a stationary Markov chain $(u_k, k \in \mathbb{Z})$ and a family of independent random variables $\eta_k$ such that $D(u_k) = \lambda$, $D(\eta_k) = \nu$, and

$$u_k = S(u_{k-1}) + \eta_k, \qquad k \in \mathbb{Z}. \tag{2.13}$$

Introduce the linear space $\boldsymbol{H} = H^{\mathbb{Z}_0}$ endowed with the Tikhonov topology and consider a family of Markov chains $\boldsymbol{\Theta}^k = \boldsymbol{\Theta}^k(\boldsymbol{u})$ in $\boldsymbol{H}$ defined as

$$\boldsymbol{\Theta}^0 = \boldsymbol{u}, \tag{2.14}$$
$$\boldsymbol{\Theta}^k = \boldsymbol{S}(\boldsymbol{\Theta}^{k-1}) + \boldsymbol{\eta}_0^k, \tag{2.15}$$

where $k \ge 0$, $\boldsymbol{u} \in \boldsymbol{H}$, $\boldsymbol{\eta}_0^k = (\dots, 0, 0, \eta_k)$ and

$$\boldsymbol{S}(\boldsymbol{v}) = \bigl(\boldsymbol{v}, S(v_0)\bigr) = \bigl(\dots, v_{-2}, v_{-1}, v_0, S(v_0)\bigr), \qquad \boldsymbol{v} = (v_l, l \le 0) \in \boldsymbol{H}. \tag{2.16}$$

It is clear that the sequences⁷ $\boldsymbol{u}^k := (u_l, l \in \mathbb{Z}_k) \in \boldsymbol{H}$ form a stationary compatible Markov chain satisfying Eq. (2.15). (By compatibility we mean that almost all trajectories of the random process $\{\boldsymbol{u}^k, k \in \mathbb{Z}\}$ form a compatible family of sequences.) Let us denote by $\boldsymbol{P}(k, \boldsymbol{u}, \Gamma)$, $\boldsymbol{P}_k$ and $\boldsymbol{P}_k^*$ the transition function and Markov semi-groups corresponding to (2.15) and by $\boldsymbol{A}$ the set of attainability from zero. What has been said implies that $\{\boldsymbol{u}^k\}$ defines an invariant measure $\boldsymbol{\lambda} \in \mathcal{P}(\boldsymbol{H}, \boldsymbol{A})$ for the semi-group $\boldsymbol{P}_k^*$. Since uniqueness of an invariant measure for $\boldsymbol{P}_k^*$ implies a similar property for the original semi-group $P_k^*$, it suffices to show that the invariant distribution for $\boldsymbol{P}_k^*$ is unique. However, this cannot be done directly because the noise $\boldsymbol{\eta}_0^k$ is effective only for the first $N$ Fourier modes, whereas its projection to the (infinite-dimensional) subspace $H_N^\perp$ of codimension $N$ may vanish. To overcome this difficulty and to prove that the distribution of a stationary solution $\{\boldsymbol{\Theta}^k, k \in \mathbb{Z}\}$ of (2.15) is uniquely defined, we perform an isomorphic transformation of $\{\boldsymbol{\Theta}^k\}$ that replaces a component of $\boldsymbol{\Theta}^k$ of codimension $N$ (namely, its projection to the space $\boldsymbol{H}_N^\perp$) by a random sequence (namely, by an appropriate component of the noise) whose “large-time behaviour” is known. The corresponding arguments are based on a Lyapunov–Schmidt type reduction.

⁷ The sequence $\boldsymbol{u}^k = (u_l, l \in \mathbb{Z}_k)$, $k \in \mathbb{Z}$, is regarded as an element of $\boldsymbol{H} = H^{\mathbb{Z}_0}$.


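2.2.2. Lyapunov–Schmidt type reduction. For any integer $N \ge 1$, we introduce the spaces $\boldsymbol{H}_N = (H_N)^{\mathbb{Z}_0}$ and $\boldsymbol{H}_N^\perp = (H_N^\perp)^{\mathbb{Z}_0}$ (where $H_N = P_N H$ and $H_N^\perp = Q_N H$) endowed with the Tikhonov topology. For the stationary sequences $(u_l, l \in \mathbb{Z})$ and $(\eta_l, l \in \mathbb{Z})$ defined above and for any integer $k$, we set

$$\boldsymbol{v}^k = (v_l, l \in \mathbb{Z}_k), \quad v_l = P_N u_l; \qquad \tilde{\boldsymbol{v}}^k = (\tilde v_l, l \in \mathbb{Z}_k), \quad \tilde v_l = Q_N u_l, \tag{2.17}$$

and

$$\boldsymbol{\varphi}_0^k = (\dots, 0, 0, \varphi_k), \qquad \boldsymbol{\psi}_0^k = (\dots, 0, 0, \psi_k), \qquad \boldsymbol{\psi}^k = (\dots, \psi_{k-1}, \psi_k), \tag{2.18}$$

where $\varphi_k = P_N \eta_k$ and $\psi_k = Q_N \eta_k$. Let us denote by $\boldsymbol{P}_N$ and $\boldsymbol{Q}_N$ the projections

$$\boldsymbol{P}_N = \cdots \times P_N \times P_N : \boldsymbol{H} \to \boldsymbol{H}_N, \qquad (u_l, l \le 0) \mapsto (P_N u_l, l \le 0),$$
$$\boldsymbol{Q}_N = \cdots \times Q_N \times Q_N : \boldsymbol{H} \to \boldsymbol{H}_N^\perp, \qquad (u_l, l \le 0) \mapsto (Q_N u_l, l \le 0).$$

Since the Markov chain $\{\boldsymbol{u}^k, k \in \mathbb{Z}\}$ satisfies Eq. (2.15), applying $\boldsymbol{P}_N$ and $\boldsymbol{Q}_N$ to (2.15), we get the following system of two vector equations:

$$\boldsymbol{v}^j = \boldsymbol{P}_N \boldsymbol{S}(\boldsymbol{v}^{j-1}, \tilde{\boldsymbol{v}}^{j-1}) + \boldsymbol{\varphi}_0^j, \tag{2.19}$$
$$\tilde{\boldsymbol{v}}^j = \boldsymbol{Q}_N \boldsymbol{S}(\boldsymbol{v}^{j-1}, \tilde{\boldsymbol{v}}^{j-1}) + \boldsymbol{\psi}_0^j. \tag{2.20}$$

(We used the fact that $\boldsymbol{P}_N \boldsymbol{S}(\boldsymbol{u}^{j-1})$ does not depend on $\tilde v_l$ with $l \le j - 2$ and similarly with $\boldsymbol{Q}_N \boldsymbol{S}(\boldsymbol{u}^{j-1})$.) The random sequences $\{\boldsymbol{v}^j\}$ and $\{\tilde{\boldsymbol{v}}^j\}$ are stationary and compatible since so is $\{\boldsymbol{u}^j\}$.

We now assume that $\boldsymbol{v}^r = (\dots, v_{r-1}, v_r)$ and $\boldsymbol{\psi}^r = (\dots, \psi_{r-1}, \psi_r)$ are bounded deterministic⁸ sequences in $H_N$ and $H_N^\perp$, respectively. It turns out that there exists a unique bounded sequence $(\dots, \tilde v_{r-1}, \tilde v_r)$ in $H_N^\perp$ such that $\tilde{\boldsymbol{v}}^j = (\tilde v_l, l \le j)$ satisfies Eq. (2.20) for $j \le r$ with $\boldsymbol{v}^j = (v_l, l \le j)$ and $\boldsymbol{\psi}_0^j = (\dots, 0, \psi_j)$. Denote by

$$W_0 : \boldsymbol{H}_N \times \boldsymbol{H}_N^\perp \to H_N^\perp \tag{2.21}$$

the operator defined by the relation

$$\tilde v_r = W_0(\boldsymbol{v}^r, \boldsymbol{\psi}^r). \tag{2.22}$$

The operator $W_0$ does not depend on $r$ because Eq. (2.19) is invariant with respect to shifts. A crucial property of $W_0$ is that it forgets the past exponentially fast:

$$\Bigl\| \frac{\partial W_0}{\partial v_{r-j}} \Bigr\| + \Bigl\| \frac{\partial W_0}{\partial \psi_{r-j}} \Bigr\| \le C e^{-Fj} \quad \text{for any } j \ge 0, \tag{2.23}$$

where $C$ and $F$ are positive constants.

We now return to the random equations (2.19) and (2.20). What has been said in the foregoing paragraph implies that we can solve Eq. (2.20) with $j \le k$ for every $k \in \mathbb{Z}$ and express the random sequence $\tilde{\boldsymbol{v}}^k$ in terms of $\boldsymbol{v}^k$ and $\boldsymbol{\psi}^k$ using the operator $W_0$.

⁸ In this paragraph, $\boldsymbol{v}^r$ and $\boldsymbol{\psi}^r$ stand for deterministic sequences that are not related to the random sequences defined in (2.17) and (2.18).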


Substituting the result into (2.19), we conclude that the random sequence $\{\boldsymbol{v}^k, k \in \mathbb{Z}\}$ satisfies the equation

$$\boldsymbol{v}^k = \bigl(\boldsymbol{v}^{k-1},\, T_0(\boldsymbol{v}^{k-1}, \boldsymbol{\psi}^{k-1})\bigr) + \boldsymbol{\varphi}_0^k, \tag{2.24}$$

where $T_0$ is an operator from $\boldsymbol{H}_N \times \boldsymbol{H}_N^\perp$ to $H_N$ given by the formula

$$T_0(\boldsymbol{v}, \boldsymbol{\psi}) = P_N S\bigl(v_0 + W_0(\boldsymbol{v}, \boldsymbol{\psi})\bigr). \tag{2.25}$$

Consider now the family of Markov chains

$$\Upsilon^k = \Upsilon^k(U) = \begin{pmatrix} \theta^k \\ \zeta^k \end{pmatrix}, \qquad U = \begin{pmatrix} \boldsymbol{v} \\ \boldsymbol{w} \end{pmatrix} \in \boldsymbol{H}_N \times \boldsymbol{H}_N^\perp,$$

with phase space $\boldsymbol{H}_N \times \boldsymbol{H}_N^\perp$, defined by (1.14) and (1.15), where the operator $T_0$ is given in (2.25). Let us denote by $\mathscr{P}(k, U, \Gamma)$, $\mathscr{P}_k$, and $\mathscr{P}_k^*$ the corresponding transition function and Markov semi-groups and by $\mathcal{A}$ the set of attainability from zero (see (1.7)). We shall regard $\Upsilon^k(U)$ as Markov chains in $\mathcal{A}$ (see Remark 1.4). Assume that we can prove existence and uniqueness of an invariant measure $\mu \in \mathcal{P}(\mathcal{A})$ for $\mathscr{P}_k^*$. By construction, the $(\boldsymbol{H}_N \times \boldsymbol{H}_N^\perp)$-valued random sequence

$$\Xi^k = \begin{pmatrix} \boldsymbol{v}^k \\ \boldsymbol{\psi}^k \end{pmatrix} \tag{2.26}$$

(with $\boldsymbol{v}^k$ and $\boldsymbol{\psi}^k$ defined in (2.17) and (2.18), respectively) satisfies (1.15) for every $k \in \mathbb{Z}$. Moreover, it can be shown that $\Xi^k$ is a stationary Markov chain and that its distribution $G = D(\Xi^k)$ is supported by $\mathcal{A}$ (see Sect. 3.2), so that $G \in \mathcal{P}(\mathcal{A})$ coincides with the unique invariant measure $\mu$ for $\mathscr{P}_k^*$. It remains to note that $u_k = v_k + W_0(\boldsymbol{v}^k, \boldsymbol{\psi}^k)$ and therefore $D(u_k) = \lambda$ is also uniquely defined. Thus, the problem of uniqueness of invariant distribution for the original Markov operator $P_k^*$ is reduced to a similar question for $\mathscr{P}_k^*$.

2.2.3. Uniqueness of invariant measure for $\mathscr{P}_k^*$. The proof of the uniqueness is based on a version of the Ruelle–Perron–Frobenius (RPF) theorem presented in Sect. 4. Without going into details, let us explain the main idea. We deal with a family of Markov chains $\Upsilon^k(U)$ in $\mathcal{A}$. Using this fact and the dissipativity of the operator $S$ (see (2.2)), it can be shown that $\Upsilon^k(U)$ is irreducible (i.e., $\mathscr{P}(k, U, O) > 0$ for any $U \in \mathcal{A}$, an arbitrary non-empty open set $O \subset \mathcal{A}$, and sufficiently large $k$). If the transition function were strong Feller, we could apply the Doob theorem (for instance, see [DZ, Theorem 4.2.1]) to prove uniqueness of the invariant measure. However, the strong Feller property is not satisfied in the case under study (see [DZ, Sects. 7.1 and 7.2] for necessary and sufficient conditions for the validity of the strong Feller property for infinite-dimensional systems), and to establish the uniqueness, we use a version of the RPF-theorem. Namely, we show that if the Markov family is “uniformly” irreducible and the operator $\mathscr{P}_k$ possesses a very weak smoothing property, then the invariant measure is unique. Note that the proof of this fact substantially uses the exponential decay (2.23).


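3. Lyapunov–Schmidt Type Reduction

3.1. Statement of the result. As in Sect. 2, we denote by $H$ a separable Hilbert space with norm $\|\cdot\|$ and by $S$ a nonlinear continuous operator in $H$. It is assumed that $S$ satisfies Conditions (B) and (C).

We recall that the spaces $\boldsymbol{H}_N = (H_N)^{\mathbb{Z}_0}$ and $\boldsymbol{H}_N^\perp = (H_N^\perp)^{\mathbb{Z}_0}$ are endowed with the Tikhonov topology. For any $R > 0$, define the bounded subsets

$$\boldsymbol{B}_N(R) = B_{H_N}(R)^{\mathbb{Z}_0} = \bigl\{ \boldsymbol{v} = (v_l, l \le 0) \in \boldsymbol{H}_N : \|v_l\| \le R \bigr\},$$
$$\boldsymbol{B}_N^\perp(R) = B_{H_N^\perp}(R)^{\mathbb{Z}_0} = \bigl\{ \boldsymbol{w} = (w_l, l \le 0) \in \boldsymbol{H}_N^\perp : \|w_l\| \le R \bigr\}.$$

For $F \ge 0$ and $\boldsymbol{u} \in \boldsymbol{H}$, we write

$$\|\boldsymbol{u}\|_\infty = \sup_{l \le 0} \|u_l\|, \qquad M(F)\boldsymbol{u} = (e^{Fl} u_l, l \le 0).$$

It is easy to see that the Tikhonov topology on $\boldsymbol{B}_N(R)$ and $\boldsymbol{B}_N^\perp(R)$ coincides with the topology defined by the metric

$$d_F(\boldsymbol{u}, \boldsymbol{v}) = \|M(F)(\boldsymbol{u} - \boldsymbol{v})\|_\infty = \sup_{l \le 0} e^{Fl} \|u_l - v_l\|, \tag{3.1}$$

where $F$ is an arbitrary positive number. Consider the equation

$$u_k = S(u_{k-1}) + \eta_k, \qquad k \le 0, \tag{3.2}$$

where $u_k, \eta_k \in H$. Application of $Q_N$ to (3.2) results in

$$\tilde v_k = Q_N S(v_{k-1} + \tilde v_{k-1}) + \psi_k, \qquad k \le 0, \tag{3.3}$$

where $v_k = P_N u_k$, $\tilde v_k = Q_N u_k$, and $\psi_k = Q_N \eta_k$. Let us abbreviate

$$B_{H_N}(R) \times B_{H_N^\perp}(b) = B_{R,b}, \qquad \boldsymbol{B}_N(R) \times \boldsymbol{B}_N^\perp(b) = \boldsymbol{B}_{R,b}. \tag{3.4}$$

Theorem 3.1. Assume that Condition (C) holds. Let $R > 0$, $b > 0$, and $\gamma$, $0 < \gamma < 1$, be some constants and let an integer $N \ge 1$ be so large that $\gamma_N(\rho) \le \gamma$, where $\gamma_N$ is the sequence in Condition (C) and

$$\rho := (R^2 + r^2)^{1/2}, \qquad r := \frac{R\gamma + b}{1 - \gamma}. \tag{3.5}$$

Then for any $\boldsymbol{v} \in \boldsymbol{B}_N(R)$ and $\boldsymbol{\psi} \in \boldsymbol{B}_N^\perp(b)$ Eq. (3.3) has a unique solution $\tilde{\boldsymbol{v}} \in \boldsymbol{B}_N^\perp(r)$. Moreover, for any $F$, $0 \le F < -\ln \gamma$, the operator

$$\boldsymbol{W} : \boldsymbol{B}_{R,b} \to \boldsymbol{B}_N^\perp(r), \qquad (\boldsymbol{v}, \boldsymbol{\psi}) \mapsto \tilde{\boldsymbol{v}}, \tag{3.6}$$

satisfies the inequality

$$\bigl\| M(F)\bigl(\boldsymbol{W}(\boldsymbol{v}^1, \boldsymbol{\psi}^1) - \boldsymbol{W}(\boldsymbol{v}^2, \boldsymbol{\psi}^2)\bigr) \bigr\|_\infty \le (1 - e^{F}\gamma)^{-1} \Bigl( e^{F}\gamma\, \|M(F)(\boldsymbol{v}^1 - \boldsymbol{v}^2)\|_\infty + \|M(F)(\boldsymbol{\psi}^1 - \boldsymbol{\psi}^2)\|_\infty \Bigr), \tag{3.7}$$

where $\boldsymbol{v}^i \in \boldsymbol{B}_N(R)$, $\boldsymbol{\psi}^i \in \boldsymbol{B}_N^\perp(b)$, $i = 1, 2$. In particular, the operator $\boldsymbol{W}$ is continuous if the spaces entering (3.6) are endowed with the Tikhonov topology.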
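The mechanism behind Theorem 3.1, namely that the high-mode component is recovered from the past of the low modes and of the noise and that the dependence on the remote past is exponentially weak, can be seen in a scalar toy model. The Python sketch below is only an illustration under assumptions: `high_mode_map` is a hypothetical stand-in for $Q_N S(v + \tilde v)$ with Lipschitz constant γ = 0.5, and the input sequences are arbitrary bounded sequences. Iterating the analogue of (3.3) from time −n with two different starting values gives results at time 0 that differ by at most γⁿ times the initial difference.

```python
import math

GAMMA = 0.5  # Lipschitz constant of the toy map; plays the role of gamma in Theorem 3.1

def high_mode_map(v, w):
    # Hypothetical stand-in for Q_N S(v + w): Lipschitz with constant GAMMA in w.
    return GAMMA * math.sin(v + w)

def solve_from_past(v, psi, w_start):
    """Iterate w_k = high_mode_map(v_{k-1}, w_{k-1}) + psi_k over a finite past.

    v[i] and psi[i] hold the values at time i - n for i = 0..n; w_start is the
    value assigned at time -n. Returns the value of w at time 0.
    """
    w = w_start
    for i in range(1, len(v)):
        w = high_mode_map(v[i - 1], w) + psi[i]
    return w

n = 30
v = [math.cos(0.3 * i) for i in range(n + 1)]          # bounded low-mode sequence
psi = [0.2 * math.sin(0.7 * i) for i in range(n + 1)]  # bounded high-mode noise
w_a = solve_from_past(v, psi, w_start=1.0)
w_b = solve_from_past(v, psi, w_start=-1.0)
gap = abs(w_a - w_b)  # bounded by 2 * GAMMA**n because each step contracts by GAMMA
```

Letting n tend to infinity, the value at time 0 no longer depends on the starting value, which is the toy counterpart of the uniqueness of the bounded solution of (3.3).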


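It follows from (3.3) that the operator $\boldsymbol{W}(\boldsymbol{v}, \boldsymbol{\psi})$ is independent of the last component of $\boldsymbol{v}$. Theorem 3.1 is a variant of the well-known result (originally due to [FP]) according to which the asymptotic dynamics of a nonlinear dissipative PDE is determined by the first $N$ Fourier modes, where $N$ is sufficiently large. Similar results are established by many authors for various purposes (for instance, to study attractors, inertial and integral manifolds, etc.). The proof of Theorem 3.1 is given in the Appendix (see Sect. 8).

We now derive a simple corollary of Theorem 3.1. Let us write $\boldsymbol{W} = (W_l, l \le 0)$, so that $W_0(\boldsymbol{v}, \boldsymbol{\psi})$ is the zeroth (i.e., the last) component of $\boldsymbol{W}$. Till the end of this subsection, we shall abbreviate $(v_l, \psi_l)$, $l \le 0$, to $h_l$. Denote by $\operatorname{Lip}_l W_0$, $l \le 0$, the Lipschitz constant of $W_0$ with respect to $h_l$ that is uniform in all other arguments. In other words, $\operatorname{Lip}_l W_0$ is the supremum over all $(\dots, h_{l-1}, h_{l+1}, \dots, h_0) \in B_{R,b}^{\mathbb{Z}_0 \setminus \{l\}}$ of the Lipschitz constants of the functions

$$B_{R,b} \ni h \mapsto W_0(\dots, h_{l-1}, h, h_{l+1}, \dots, h_0).$$

It follows from (3.7) that

$$\operatorname{Lip}_l W_0 \le (1 - e^{F}\gamma)^{-1} e^{Fl}. \tag{3.8}$$

This estimate shows that dependence of the function $W_0$ on $h_l = (v_l, \psi_l)$ decays exponentially as $l \to -\infty$.

For later use, we make another obvious but helpful observation: if $\boldsymbol{v}^1 = (\boldsymbol{v}, v_1)$, $\boldsymbol{\psi}^1 = (\boldsymbol{\psi}, \psi_1)$, and $\tilde{\boldsymbol{v}} = \boldsymbol{W}(\boldsymbol{v}, \boldsymbol{\psi})$, then

$$\boldsymbol{W}(\boldsymbol{v}^1, \boldsymbol{\psi}^1) = \bigl(\tilde{\boldsymbol{v}},\, Q_N S(v_0 + \tilde v_0) + \psi_1\bigr). \tag{3.9}$$

Indeed, the right-hand side of (3.9) defines a sequence $\tilde{\boldsymbol{v}} = (\tilde v_k, k \le 1)$ that satisfies Eq. (3.2) for $k \le 1$ since $\tilde{\boldsymbol{v}}$ satisfies it for $k \le 0$.

3.2. Markov chain in the space $\mathcal{H}_N = \boldsymbol{H}_N \times \boldsymbol{H}_N^\perp$. We wish to define a family of Markov chains in $\mathcal{H}_N$ by the formulas

$$\Upsilon^0 = U, \qquad U = \begin{pmatrix} \boldsymbol{v} \\ \boldsymbol{w} \end{pmatrix} \in \mathcal{H}_N, \tag{3.10}$$
$$\Upsilon^k = \Bigl( \Upsilon^{k-1},\, T(\Upsilon^{k-1}) + \begin{pmatrix} \varphi_k \\ \psi_k \end{pmatrix} \Bigr), \qquad T \begin{pmatrix} \boldsymbol{v} \\ \boldsymbol{\psi} \end{pmatrix} = \begin{pmatrix} T_0(\boldsymbol{v}, \boldsymbol{\psi}) \\ 0 \end{pmatrix}, \tag{3.11}$$

where $\Upsilon^k = \Upsilon^k(U)$,

$$T_0(\boldsymbol{v}, \boldsymbol{\psi}) = P_N S\bigl(v_0 + W_0(\boldsymbol{v}, \boldsymbol{\psi})\bigr), \tag{3.12}$$

and $v_0$ is the last component of $\boldsymbol{v}$. However, the domain of definition of $T_0$ is only a part of the space $\mathcal{H}_N$, and therefore we have to choose carefully the corresponding phase space. To this end, find a constant $R > 0$ such that the set of attainability $A = A(\operatorname{supp} \nu)$ for the original equation (2.9) is contained in $B_H(R)$ (see (2.4) and Condition (B)). Let an integer $N \ge 1$ be so large that the conditions of Theorem 3.1 are satisfied. Denote by

$$\boldsymbol{W} = (W_l, l \in \mathbb{Z}_0) : \boldsymbol{B}_{R,b} \to \boldsymbol{B}_N^\perp(r) \tag{3.13}$$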


the operator constructed in Theorem 3.1. Obviously, we can define the $k$th element $\Upsilon^k(U)$ if $\Upsilon^{k-1}(U) \in B_{R,b}$. We claim that if $U = 0$, where $0$ is the element all of whose components are zero, then $\Upsilon^k(U) \in B_{R,b}$ for all $k \ge 1$. Indeed, assume that the required inclusion is proved for $k \le n-1$. Since $\|\psi_k\| \le \|\eta_k\| \le b$ by (2.6), (2.7) and Condition (D), we have $\zeta^k(U) \in B_N^\perp(b)$ for any $k \ge 0$. Let $\theta^k(U) = (\theta_l^k(U),\ l \in Z_0)$ be the first component of $\Upsilon^k(U)$. Since, by the induction hypothesis, $(\theta_l^n(U),\ l \le -1) = (\theta_l^{n-1},\ l \le 0) = \theta^{n-1}(U) \in B_N(R)$, it suffices to show that $\theta_0^n(U) \in B_{H_N}(R)$. However, it follows from the definition of $\theta^k(U)$ that $\theta_0^n(U) \in P_N A_n$, where $A_n$ is the set of attainability from zero by time $n$ for Eq. (2.9). It now remains to note that, according to the choice of $R$, we have $A_n \subset A \subset B_H(R)$ and hence $P_N A_n \subset B_{H_N}(R)$.

Thus, if $U = 0$, we can define $\Upsilon^k(U)$ for any integer $k \ge 1$. Denote by A the set of attainability from zero for Eq. (3.11). An obvious argument based on the continuity of the operators entering the definition of $\Upsilon^k(U)$ shows that formula (3.11), in which $T_0$ has the form (3.12), makes sense for any integer $k \ge 1$ and an arbitrary initial state U ∈ A. We have thus obtained a family of Markov chains $\Upsilon^k(U)$ with phase space A which satisfy Eq. (3.11).

An important observation is that Eq. (3.11) with phase space A is equivalent to (2.15) with phase space A in the sense that the sets of solutions for these two equations are in one-to-one correspondence. This result is stated below as Theorem 3.2. Let $\Phi : B_{R,b} \to H$ be a mapping that sends $U = \bigl(U_l = \binom{v_l}{w_l},\ l \le 0\bigr)$ to $\Phi(U) = u = \bigl(u_l = \binom{v_l}{\tilde v_l} \in H_N \times H_N^\perp,\ l \le 0\bigr)$, where $\tilde v = W(U)$, and let $\Psi : H \to H_N$ be a mapping that sends

$$u = \Bigl(u_l = \binom{v_l}{\tilde v_l},\ l \le 0\Bigr) \in H \quad\text{to}\quad \Psi(u) = U = \Bigl(U_l = \binom{v_l}{w_l},\ l \le 0\Bigr), \tag{3.14}$$

where

$$w_l = \tilde v_l - Q_N S(v_{l-1} + \tilde v_l), \quad l \le 0. \tag{3.15}$$

It follows from Theorem 3.1 that Φ is uniformly Lipschitz continuous. Moreover, Ineq. (2.1) implies that the restriction of Ψ to the set $\mathbf{B}_H(R) = B_H(R)^{Z_0}$ is also uniformly Lipschitz continuous for any R > 0.

Theorem 3.2. The operator Φ defines a Lipschitz homeomorphism A → A whose inverse is Ψ. Moreover, a Markov chain $\Upsilon^k \in A$, $k > k_0 \ge -\infty$, is a solution of (3.11) if and only if the chain $u^k = \Phi(\Upsilon^k) \in A$, $k > k_0$, is a solution of (2.15).

Proof. We first show that Φ maps A to A. Denote by $A_k$ and $A_k$ the sets of attainability from zero by the time k for Eqs. (3.11) and (2.15), respectively (see (1.6)). If $U \in A_j$ for some j ≥ 1, then there exists a trajectory $(\Upsilon^k,\ 0 \le k \le j)$ of (3.11), viewed as a controllable system, that is equal to zero for k = 0 and to U for k = j (cf. Remark 1.2). It follows from the definition of W that the operator Φ sends $\Upsilon^k$ to a trajectory of the controllable system (2.15) and that this trajectory is equal to zero for k = 0 and to Φ(U) for k = j. Hence, $\Phi(U) \in A_j$. By continuity, Φ maps A to A.


S. Kuksin, A. Shirikyan

If U ∈ A and $\Phi(U) = u = \binom{v}{\tilde v}$, then $\tilde v$ satisfies (3.3). Hence, (3.15) holds, and U = Ψ(u). Repeating the arguments applied above to Φ, we find that Ψ maps $A_j$ to $A_j$ and, hence, A to A. If $u = \binom{v}{\tilde v} \in A$ and $\Psi(u) = U = \binom{v}{w} \in A$, then w is defined by (3.15). Therefore $\tilde v$ satisfies (3.3), and we have u = Φ(U). Hence, Φ : A → A is a homeomorphism and $\Psi = \Phi^{-1}$.

Due to (3.9) and the definition of T (see (3.11)), the following diagram is commutative for any ω ∈ Ω:

$$\begin{array}{ccc} \Upsilon^{k-1} & \xrightarrow{\;\Phi\;} & u^{k-1} \\ \big\downarrow & & \big\downarrow \\ \Upsilon^{k} & \xrightarrow{\;\Phi\;} & u^{k} \end{array}$$

Here the left- and right-hand vertical arrows stand for the transformations in Eqs. (3.11) and (2.15), respectively. Since Φ is a homeomorphism, it defines a one-to-one correspondence between solutions of these two equations. □

Corollary 3.3. A Markov chain $(\Upsilon^k,\ k \in Z)$ is a stationary solution of (3.11) in A if and only if $(u^k = \Phi(\Upsilon^k) \in A,\ k \in Z)$ is a stationary solution of (2.15).

Corollary 3.4. Eq. (3.11) has a unique invariant measure supported by A if and only if (2.15) possesses a unique invariant measure supported by A.

The second corollary follows from the first since Φ and Ψ transform identically distributed solutions of one equation to identically distributed solutions of the other. By Theorem 3.2, the operator Φ transforms the family of Markov chains (3.10), (3.11) to the family (2.14), (2.15), where u = Φ(U). Therefore the corresponding transition functions satisfy the relation

$$P(k, U, \Gamma) = P\bigl(k, \Phi(U), \Phi(\Gamma)\bigr), \quad U \in A,\ \Gamma \in B(A). \tag{3.16}$$

It follows that

$$(P_k f) \circ \Phi = P_k(f \circ \Phi), \quad f \in C_b(A). \tag{3.17}$$

4. A Version of the RPF-Theorem

In this section, we prove a version of the RPF-theorem. It provides a sufficient condition for the uniqueness of invariant measure for a Markov semi-group whose transition function is uniformly irreducible and possesses a smoothing property. This result will be used in Sect. 5 to prove Theorem 2.2.

4.1. Statement of the result. Let A be a Polish space. A subset R ⊂ Cb (A) is called a determining family for P(A) if for arbitrary measures µ1 , µ2 ∈ P(A) the condition

$$\int_A f(u)\,d\mu_1(u) = \int_A f(u)\,d\mu_2(u) \quad\text{for any } f \in R$$

implies that µ1 = µ2 .


Let $P(k, u, \Gamma)$, $u \in A$, $\Gamma \in B(A)$, be a Feller transition function defined for nonnegative integers⁹ k. Denote by

$$P_k : C_b(A) \to C_b(A), \qquad P_k^* : P(A) \to P(A), \qquad k \ge 0,$$

the Markov semi-groups associated with $P(k, u, \Gamma)$. For any function f(u), denote by $f^+$ and $f^-$ its positive and negative parts, respectively:

$$f^+ = \tfrac12\bigl(f + |f|\bigr), \qquad f^- = \tfrac12\bigl(|f| - f\bigr).$$

We shall assume that the condition below is fulfilled.

(H) There is a determining family R for P(A) such that for any f ∈ R the function f − c belongs to R for all $c \in \mathbb{R}$, and there is a constant $A = A_f > 1$ and an integer $k_0 = k_0(f) \ge 0$ such that the following property holds: if

$$\sup_{u\in A} f_k^+(u) \ge \alpha \quad\text{for all } k \ge 0, \tag{4.1}$$

$$\sup_{u\in A} f_k^-(u) \ge \alpha \quad\text{for all } k \ge 0, \tag{4.2}$$

where $f_k^+ = (P_k f)^+$, $f_k^- = (P_k f)^-$, and α = α(f) > 0 is a constant not depending on k, then for any $k \ge k_0$ there is l = l(k) > 0 such that

$$\sup_{u\in A} P_l f_k^+(u) \le A_f \inf_{u\in A} P_l f_k^+(u), \tag{4.3}$$

$$\sup_{u\in A} P_l f_k^-(u) \le A_f \inf_{u\in A} P_l f_k^-(u). \tag{4.4}$$

Sufficient conditions guaranteeing the validity of (H) are given in Sect. 4.3; see Conditions (H1 ) and (H2 ) there. The following theorem is the main result of this section.

Theorem 4.1. Assume that Condition (H) holds. Then the following assertions hold.

(i) Let µ ∈ P(A) be an invariant measure of $P_k^*$. Then, for any f ∈ R,

$$P_k f \to (\mu, f) \quad\text{as } k \to \infty \text{ in } L^1(A, \mu). \tag{4.5}$$

(ii) The operator $P_k^*$ has at most one invariant measure µ ∈ P(A).

9 Theorem 4.1 established below remains true for continuous time. The statement of the corresponding result and its proof are literally the same as in the discrete case.


4.2. Proof of Theorem 4.1. Proof of (i). Step 1. Let µ ∈ P(A) be an invariant measure and let f ∈ R. Without loss of generality, we can assume that

$$(\mu, f) = \int_A f(u)\,d\mu(u) = 0. \tag{4.6}$$

The general case can be reduced to the former by the change $f \mapsto f - (\mu, f)$. Thus, we must prove that

$$\|P_k f\|_\mu = \int_A |P_k f|\,d\mu \to 0 \quad\text{as } k \to \infty. \tag{4.7}$$

Note that $P_l^*\mu = \mu$ for any l ≥ 0, and therefore

$$\|P_{k+l} f\|_\mu = \int_A |P_l(P_k f)|\,d\mu \le \int_A P_l|P_k f|\,d\mu = \int_A |P_k f|\,d(P_l^*\mu) = \int_A |P_k f|\,d\mu = \|P_k f\|_\mu.$$

This means that $\|P_k f\|_\mu$ is a non-increasing sequence. Hence, the convergence (4.7) will be established if we show that for any ε > 0 there is $k = k_\varepsilon > 0$ such that

$$\|P_{k_\varepsilon} f\|_\mu \le \varepsilon. \tag{4.8}$$

Step 2. We first assume that

$$\sup_{u\in A} f_{k_s}^+(u) \to 0 \quad\text{as } s \to \infty,$$

where $\{k_s\}$ is a sequence of integers tending to +∞. In this case

$$\int_A (P_{k_s} f)^+(u)\,d\mu(u) = \int_A f_{k_s}^+(u)\,d\mu(u) \to 0 \quad\text{as } s \to \infty. \tag{4.9}$$

Moreover, it follows from (4.6) that

$$(\mu, f) = (P_k^*\mu, f) = (\mu, f_k) = 0, \quad k \ge 0,$$

where $f_k = P_k f$, and therefore

$$(\mu, f_k^+) = (\mu, f_k^-) \quad\text{for any } k \ge 0.$$

Combining this with (4.9), we derive

$$(\mu, f_{k_s}^+) = (\mu, f_{k_s}^-) \to 0 \quad\text{as } s \to \infty,$$

whence we conclude that (4.8) holds. A similar argument shows that if

$$\sup_{u\in A} f_{k_s}^-(u) \to 0 \quad\text{as } s \to \infty \tag{4.10}$$

for a sequence $\{k_s\}$, then (4.8) is fulfilled.


Step 3. Thus, we can assume that (4.1) and (4.2) hold with a constant α > 0. By Condition (H), for any $k \ge k_0$ there is l ≥ 0 such that (4.3) and (4.4) are satisfied. We claim that there is a sequence of integers $\{k_s,\ s \ge 1\}$ such that

$$\|P_{k_s} f\|_\mu \le a_f^s\,\|f\|_\mu, \quad s \ge 0, \tag{4.11}$$

where $a_f = 1 - A_f^{-1} < 1$. The proof is by induction on s. Inequality (4.11) is obvious for s = 0. Assuming that (4.11) is established for s ≤ r, we now prove it for s = r + 1. Set $k_{r+1} = k_r + l_r$, where $l_r \ge 0$ is the integer entering inequalities (4.3) and (4.4) with $k = k_r$. Note that, by (4.3) and (4.4), we have¹⁰

$$\int_A f_{k_r}^\pm\,d\mu = \int_A P_{l_r} f_{k_r}^\pm\,d\mu \le \sup_{u\in A} P_{l_r} f_{k_r}^\pm(u) \le A_f \inf_{u\in A} P_{l_r} f_{k_r}^\pm(u),$$

whence

$$P_{l_r} f_{k_r}^\pm(u) - A_f^{-1}\|f_{k_r}^\pm\|_\mu \ge 0 \quad\text{for } u \in A.$$

It follows that

$$\int_A \bigl|P_{l_r} f_{k_r}^\pm - A_f^{-1}\|f_{k_r}^\pm\|_\mu\bigr|\,d\mu = \int_A \bigl(P_{l_r} f_{k_r}^\pm - A_f^{-1}\|f_{k_r}^\pm\|_\mu\bigr)\,d\mu = a_f\|f_{k_r}^\pm\|_\mu.$$

We now estimate the expression $\|P_{k_{r+1}} f\|_\mu = \|P_{l_r} f_{k_r}\|_\mu$. In view of (4.10), we have

$$\|f_k^+\|_\mu = \|f_k^-\|_\mu \quad\text{for any } k \ge 0, \tag{4.12}$$

and therefore

$$\begin{aligned} \int_A |P_{l_r} f_{k_r}|\,d\mu &= \int_A \bigl|P_{l_r}(f_{k_r}^+ - f_{k_r}^-)\bigr|\,d\mu \\ &\le \int_A \bigl|P_{l_r} f_{k_r}^+ - A_f^{-1}\|f_{k_r}^+\|_\mu\bigr|\,d\mu + \int_A \bigl|P_{l_r} f_{k_r}^- - A_f^{-1}\|f_{k_r}^-\|_\mu\bigr|\,d\mu \\ &= a_f\bigl(\|f_{k_r}^+\|_\mu + \|f_{k_r}^-\|_\mu\bigr) = a_f\|f_{k_r}\|_\mu. \end{aligned}$$

Using the induction hypothesis, we derive

$$\int_A |P_{k_{r+1}} f|\,d\mu \le a_f\|P_{k_r} f\|_\mu \le a_f^{r+1}\|f\|_\mu,$$

which completes the proof of (4.11). □

Inequality (4.8) is an obvious consequence of (4.11).

Proof of (ii). We first prove that any two invariant measures supported by A are singular. To this end, we apply a well-known argument (for instance, see [DZ, Proposition 3.2.5]). Let µ1 , µ2 ∈ P(A) be two different invariant measures. Since R is a determining family for P(A), there is f ∈ R such that

$$(\mu_1, f) \ne (\mu_2, f). \tag{4.13}$$

10 Here and henceforth a formula involving the symbol ± is a brief writing for the two formulas corresponding to the upper and lower signs.


By (i),

$$P_k f \to (\mu_i, f) \quad\text{as } k \to \infty \text{ in } L^1(A, \mu_i), \quad i = 1, 2.$$

Therefore, there is a sequence of integers $\{k_s\}$ tending to +∞ such that

$$P_{k_s} f \to (\mu_i, f) \quad\text{as } s \to \infty \quad \mu_i\text{-almost everywhere}, \quad i = 1, 2. \tag{4.14}$$

Denote by $C_i$, i = 1, 2, the set of points u ∈ A for which (4.14) takes place. We have $\mu_1(C_1) = \mu_2(C_2) = 1$ and, in view of (4.13), $C_1 \cap C_2 = \emptyset$. This means that µ1 and µ2 are singular.

We now assume that µ1 , µ2 ∈ P(A) are two different invariant measures for $P_k^*$. As was just proved, they are singular. Consider the measure µ = (µ1 + µ2 )/2. It is clear that µ ∈ P(A) is an invariant measure and that µ and µ1 are not singular. The contradiction obtained completes the proof of Theorem 4.1. □
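The monotonicity established in Step 1 and the convergence (4.5) can be observed numerically in a finite-state setting. The following Python sketch is only an illustration under simplifying assumptions that are not taken from the paper: the state space is a three-point set, the stochastic matrix `P` and the function `f` are hypothetical choices, and the invariant measure is approximated by power iteration.

```python
# Finite-state illustration (hypothetical data): the L^1(mu)-norm of
# P_k f - (mu, f) is non-increasing in k and tends to zero.

def apply_kernel(P, f):
    """Return P f, where (P f)(u) = sum_v P[u][v] * f[v]."""
    return [sum(p * x for p, x in zip(row, f)) for row in P]

def stationary(P, iters=2000):
    """Approximate the invariant measure mu = mu P by power iteration."""
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        mu = [sum(mu[u] * P[u][v] for u in range(n)) for v in range(n)]
    return mu

def l1_norms(P, f, steps):
    """Return the norms ||P_k f - (mu, f)||_{L^1(mu)} for k = 0, ..., steps."""
    mu = stationary(P)
    mean = sum(m * x for m, x in zip(mu, f))
    g = [x - mean for x in f]  # centred function, so that (mu, g) = 0 as in (4.6)
    norms = []
    for _ in range(steps + 1):
        norms.append(sum(m * abs(x) for m, x in zip(mu, g)))
        g = apply_kernel(P, g)
    return norms

# Hypothetical transition matrix with strictly positive entries (rows sum to 1).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.4, 0.5]]
f = [1.0, -2.0, 4.0]
norms = l1_norms(P, f, 30)
```

In this toy example the sequence `norms` plays the role of $\|P_k f\|_\mu$ in (4.7); the positivity of all entries of `P` is a much stronger hypothesis than Condition (H) and is used here only to keep the example short.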

4.3. Sufficient conditions for application of Theorem 4.1. Assume that $P(k, u, \Gamma)$, $u \in A$, $\Gamma \in B(A)$, is a transition function satisfying the following conditions:

(H1 ) There is a determining family R0 for P(A) such that if f ∈ R0 , then the sequence $P_k f$, $k \ge k_0$, is uniformly equicontinuous, where $k_0$ is a nonnegative integer depending on f .

(H2 ) For every r > 0 there are ε > 0 and l ≥ 1 such that

$$P\bigl(l, u, B_A(a, r)\bigr) \ge \varepsilon \quad\text{for any } u, a \in A. \tag{4.15}$$

Condition (H1 ) can be called a “uniform Feller property”. We impose it instead of the strong Feller property, which is common in arguments proving uniqueness of an invariant measure (see [DZ]), but which is not satisfied for the infinite-dimensional system we deal with (see Remark 5.2). Condition (H2 ) is a slowed-down version of the usual assumption that the measures P(l, u, ·) are absolutely continuous with respect to a reference measure on A and the corresponding densities are positive uniformly in u ∈ A and $l \gg 1$. It can also be regarded as a condition of “uniform irreducibility” for the family of Markov chains in question.

Let R be the set of functions f ∈ Cb (A) for which there is a constant $c \in \mathbb{R}$ such that f − c ∈ R0 .

Theorem 4.2. Let Conditions (H1 ) and (H2 ) be satisfied. Then (H) is fulfilled for R, and therefore assertions (i) and (ii) of Theorem 4.1 hold. Moreover, supp µ = A. Finally, if A is a compact space, then the convergence in (4.5) takes place in the space Cb (A).

Proof. Let f (u) ∈ R be an arbitrary function satisfying (4.1) and (4.2), where $f_k^\pm = (P_k f)^\pm$ and α is a positive constant. We must prove that (4.3) and (4.4) hold. We confine ourselves to the case of index +. In view of Condition (H1 ), there is r > 0 and for any $k \ge k_0$ there is $u_k \in A$ such that

$$\inf_{v\in B_A(u_k, r)} f_k^+(v) \ge \sup_{v\in A} f_k^+(v) - \frac{\alpha}{2} \ge \frac{\alpha}{2} \quad\text{for } k \ge k_0. \tag{4.16}$$


Let ε > 0 and l ≥ 1 be the constants entering Condition (H2 ). In view of (4.15) and (4.16), we have

$$\begin{aligned} P_l f_k^+(u) &= \int_A P(l, u, dv)\,f_k^+(v) \ge \int_{B_A(u_k, r)} P(l, u, dv)\,f_k^+(v) \\ &\ge P\bigl(l, u, B_A(u_k, r)\bigr) \inf_{v\in B_A(u_k, r)} f_k^+(v) \ge \frac{\alpha\varepsilon}{2}. \end{aligned} \tag{4.17}$$

On the other hand,

$$P_l f_k^+(u) \le \sup_{u\in A} f_k^+(u) \le \sup_{u\in A} |f_k(u)| \le \sup_{u\in A} |f(u)|. \tag{4.18}$$

Combining (4.17) and (4.18), we arrive at (4.3) with $A_f = 2(\alpha\varepsilon)^{-1}\sup_{u\in A}|f(u)|$.

Inequality (4.4) can be proved in a similar way.

We now assume that µ is an invariant measure for $P_k^*$ and show that supp µ = A. To this end, it suffices to check that

$$\mu\bigl(B_A(a, r)\bigr) > 0 \quad\text{for any } a \in A \text{ and } r > 0. \tag{4.19}$$

In view of the invariance of µ and inequality (4.15), we have

$$\mu\bigl(B_A(a, r)\bigr) = \int_A P\bigl(l, u, B_A(a, r)\bigr)\,d\mu(u) \ge \varepsilon,$$

which implies (4.19).

Finally, let A be a compact space. We wish to show that

$$P_k f(u) \to (\mu, f) \quad\text{as } k \to \infty \text{ uniformly in } u \in A.$$

By Condition (H1 ) and the Arzelà–Ascoli theorem, there is a sequence of integers $k_j \to \infty$ such that $P_{k_j} f(u)$ converges uniformly to a function g(u). In view of assertion (i) of Theorem 2.2, the function g must coincide with (µ, f ) on the support of µ. Since supp µ = A, we have g ≡ (µ, f ), and the whole sequence $P_k f$ converges to (µ, f ). The proof of Theorem 4.2 is complete.

Remark 4.3. If R is dense in Cb (A), then the sequence $P_k f$ uniformly converges to (µ, f ) for any f ∈ Cb (A). Indeed, let a function $f_\varepsilon \in R$ be such that $\|f - f_\varepsilon\|_\infty < \varepsilon$. We have

$$\|P_k f - (\mu, f)\|_\infty \le \|P_k(f - f_\varepsilon)\|_\infty + \|P_k f_\varepsilon - (\mu, f_\varepsilon)\|_\infty + |(\mu, f - f_\varepsilon)|,$$

which implies the required assertion.
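The mechanism behind (4.17) and (4.18) is easiest to see on a finite state space, where a "ball" can be taken to be a single point. The Python sketch below is an illustration under that assumption only: the matrix `P`, the function `g` and the helper names are hypothetical, and the constant obtained is the simpler value $\varepsilon^{-1}$ rather than the constant $A_f$ of the proof.

```python
# Finite-state analogue of (4.17)-(4.18) with l = 1 (hypothetical data).
# If every entry of P is at least eps, then for any g >= 0
#     (P g)(u) >= eps * max(g)   and   (P g)(u) <= max(g),
# so that  sup P g <= (1 / eps) * inf P g.

def apply_kernel(P, g):
    """Return P g, where (P g)(u) = sum_v P[u][v] * g[v]."""
    return [sum(p * x for p, x in zip(row, g)) for row in P]

def irreducibility_constant(P):
    """Return eps = min_{u, v} P[u][v], the finite-state counterpart of (4.15)."""
    return min(min(row) for row in P)

P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.4, 0.5]]
g = [0.0, 3.0, 0.5]          # a nonnegative function, playing the role of f_k^+

eps = irreducibility_constant(P)
Pg = apply_kernel(P, g)
A_const = 1.0 / eps          # illustrative constant, not the A_f of the text
```

With these data the lower bound `eps * max(g)` corresponds to (4.17) and the upper bound `max(g)` to (4.18); on a general Polish space the equicontinuity in (H1 ) is what replaces the single point by a ball of fixed radius.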


5. Proof of Theorem 2.2

5.1. Reduction to Theorem 4.1. Step 1. We recall that the original family of Markov chains 1k = 1k (u) with phase space A is defined as

10 = u, (5.1)

1k = S(1k−1 ) + ηk , (5.2)

where k ≥ 1. As was shown in Sect. 2.2.1, there is an invariant measure λ ∈ P(A) for the semi-group Pk∗ associated with (5.1), (5.2). Hence, we must establish the uniqueness. We fix an arbitrary R > 0 such that A ⊂ BH (R), choose any constant γ , 0 < γ < 1, and denote by N the smallest integer satisfying the condition (cf. (3.5))

$$\gamma_N(\rho) \le \gamma, \qquad \rho = (R^2 + r^2)^{1/2}, \qquad r = \frac{R\gamma + b}{1 - \gamma}, \tag{5.3}$$

where γN is the sequence in Condition (C). We claim that if condition (2.10) holds with the above choice of N , then the invariant measure is unique. Step 2. By Proposition 1.5, an invariant measure λ ∈ P(H, A) for the family (5.1), (5.2) defines a stationary solution (zk , k ∈ Z) of (5.2), which gives rise to a stationary solution and, hence, to an invariant measure λ ∈ P(H) for (2.15). We claim that supp λ ⊂ A. Indeed, it suffices to show that if u ∈ supp λ, then for any ε > 0 and an arbitrary integer L ≥ 0 there is u ∈ A such that ul − ul ≤ ε

for

− L ≤ l ≤ 0.

(5.4)

Fix arbitrary u ∈ supp λ, L ≥ 0, and ε > 0. It follows from the definition of the support of a measure that the event

$$z_l \in B_H(u_l, \varepsilon/2), \quad -L \le l \le 0,$$

has a positive probability. Since supp D(zl ) ⊂ A and zk satisfies Eq. (5.2) for all ω ∈ Ω and k ∈ Z, there are realisations

$$\tilde z_l \in A \cap B_H(u_l, \varepsilon/2), \quad \tilde\eta_l \in \operatorname{supp}\nu, \quad -L \le l \le 0, \tag{5.5}$$

of the random variables zl and ηl such that

$$\tilde z_l = S(\tilde z_{l-1}) + \tilde\eta_l, \quad 1 - L \le l \le 0. \tag{5.6}$$

Furthermore, since z˜ −L ∈ A, for any δ > 0 there is an integer j ≥ 0 and u−L ∈ Aj such that ˜z−L − u−L ≤ δ. We now set ul = S(ul−1 ) + η˜ l ,

1 − L ≤ l ≤ 0.

(5.7)

It follows from (5.6), (5.7) and continuity of S (see (2.1)) that ul − z˜ l ≤ c(δ) for

− L ≤ l ≤ 0,

(5.8)


where c(δ) > 0 goes to zero with δ. Comparing (5.5) and (5.8), we obtain the inequalities ul − ul ≤

ε + c(δ), 2

−L ≤ l ≤ 0,

which imply (5.4) for sufficiently small δ > 0. It now remains to prove that the (L + 1)-tuple (u−L , . . . , u0 ) coincides with the last L + 1 components of an element u ∈ A, i. e., there are ul ∈ H , l ≤ −1 − L, such that (ul , l ∈ Z0 ) ∈ A. However, this assertion follows from the inclusion u−L ∈ Aj and definition of Aj . Thus, supp λ ⊂ A, so that λ ∈ P(H, A). Clearly, different original invariant measures correspond to different invariant measures for (2.14), (2.15) since λ is the projection of λ. Hence, it remains to check that the family of Markov chains k (u) has a unique invariant measure λ.

Step 3. Since (5.3) holds, Theorems 3.1 and 3.2 apply. Therefore, due to Corollary 3.4, it suffices to show that Eq. (3.11) has a unique invariant measure µ supported by A. Then the measure λ is its image under the map Φ and is unique.

Step 4. By Theorem 4.1, to prove the uniqueness of an invariant measure µ ∈ P(A) for (3.11), it is sufficient to check that the transition function $P(k, U, \Gamma)$ satisfies conditions (H1 ) and (H2 ).

5.2. Checking Condition (H1 ). Recall that the space A is endowed with the metric dF (see (3.1)) and that the topology defined on A by dF coincides with the Tikhonov topology for any F > 0.

Proposition 5.1. Assume that Conditions (A)–(D) hold. Then the transition function $P(k, U, \Gamma)$ satisfies Condition (H1 ).

Proof. For a metric space X and an integer k ≥ 1, we denote by $X^k$ the direct product of k copies of X and endow it with the natural direct product metric. Let R be the set of functions f (U ) ∈ Cb (A) for which there is an integer m ≥ 0 and a continuous function

$$F(v_{-m}, w_{-m}, \dots, v_0, w_0) \in C_b\bigl((H_N \times H_N^\perp)^{m+1}\bigr) \tag{5.9}$$

such that

$$f(U) = F(v_{-m}, \dots, w_0) \quad\text{for}\quad U = \binom{v}{w} = \binom{(v_l,\ l \le 0)}{(w_l,\ l \le 0)} \in A. \tag{5.10}$$

Thus, R is the set of continuous functions on A that depend on finitely many “coordinates”. Clearly, R is invariant with respect to addition of a constant function and, moreover, it is a determining family for P(A) because the Borel σ -algebra on A is generated by the σ -algebras Bm (A), m ≥ 0, where, by definition, Bm (A) consists of the sets of the form

$$\bigl\{U \in A : (v_{-m}, w_{-m}, \dots, v_0, w_0) \in \Gamma\bigr\}, \quad \Gamma \in B\bigl((H_N \times H_N^\perp)^{m+1}\bigr).$$

We now prove that, for any f ∈ R, the family {Pk f, k ≥ 0} is uniformly equicontinuous. Since Pk f ∈ Cb (A) for any k ≥ 0 and A is a compact space (in the Tikhonov topology), each of the functions Pk f is uniformly continuous. Therefore it suffices to


show that the family {Pk f, k ≥ m + 1} is uniformly equicontinuous, where m is the integer in (5.9).

Denote by $\nu_N$ and $\nu_N^\perp$ the distributions of the random variables $\varphi_k = P_N \eta_k$ and $\psi_k = Q_N \eta_k$, respectively. It follows from Condition (D) that $\nu_N \in P(H_N)$ is absolutely continuous with respect to the Lebesgue measure on $H_N = \mathbb{R}^N$ and that the corresponding density has the form

$$D(\alpha) = \prod_{j=1}^N b_j^{-1} p_j(b_j^{-1}\alpha_j), \quad \alpha = (\alpha_1, \dots, \alpha_N) \in \mathbb{R}^N,$$

where $p_j(r)$ is the density of $\pi_j$ (see Condition (D)) and $b_j > 0$ are the constants in (2.4). Note that D(α) is Lipschitz continuous. It follows from (1.20) and (1.22) that if f (U ) is given by (5.10), then

$$P_k f(U) = \int_{B^k_{R,b}} D_k(U, \sigma_1, \dots, \sigma_k)\,F(\sigma_{k-m}, \dots, \sigma_k)\,d{:}(\sigma_1)\cdots d{:}(\sigma_k), \tag{5.11}$$

where $B^k_{R,b} := (B_{R,b})^k$, $B_{R,b} := B_{H_N}(R) \times B_{H_N^\perp}(b)$, $d{:}(\sigma)$ is the measure $d\alpha\,d\nu_N^\perp$ on $B_{R,b}$, and

$$D_k(U; \sigma_1, \dots, \sigma_k) = \prod_{l=1}^k D\bigl(\alpha_l - T_0(U, \sigma_1, \dots, \sigma_{l-1})\bigr). \tag{5.12}$$

Now note that the operator $T_0$ is defined on $B_{R,b}$ and therefore formula (5.12) makes sense for any $U \in B_{R,b}$ and $\sigma_j \in B_{R,b}$, 1 ≤ j ≤ k. We claim that $P_k f$ is a Lipschitz continuous function for any k ≥ m + 1 and that the corresponding Lipschitz constants are uniformly bounded. Indeed, for any Lipschitz continuous function G(U ) defined for $U \in B_{R,b}$, denote by Lip G and $\operatorname{Lip}_r G$, r ≤ 0, its Lipschitz constants in U and $h_r = \binom{v_r}{w_r}$, respectively. (See the final paragraph of Sect. 3.1 for more detailed definition of $\operatorname{Lip}_r G$.) It follows from (5.12) that, for any integer r ≤ 0,

$$\operatorname{Lip}_r D_k(U; \sigma_1, \dots, \sigma_k) \le L \sum_{j=1}^k D_{kj}(U; \sigma_1, \dots, \sigma_k)\,\operatorname{Lip}_r T_0(U, \sigma_1, \dots, \sigma_j),$$

where L is the Lipschitz constant for D(α) and

$$D_{kj}(U; \sigma_1, \dots, \sigma_k) = \prod_{\substack{l=1 \\ l \ne j}}^k D\bigl(\alpha_l - T_0(U, \sigma_1, \dots, \sigma_{l-1})\bigr).$$

Taking into account (2.1), (3.8), and (3.12), we conclude that

$$\operatorname{Lip}_r T_0(U, \sigma_1, \dots, \sigma_j) \le C_1 e^{F(r-j)},$$

where $C_1 > 0$ is a constant not depending on k, r, and U . Since $d{:} = d\alpha\,d\nu_N^\perp$ and $D(\alpha)\,d\alpha$ and $d\nu_N^\perp$ are probability measures, we have

$$\int_{B^k_{R,b}} D_{kj}(U; \sigma_1, \dots, \sigma_k)\,d{:}(\sigma_1)\cdots d{:}(\sigma_k) \le C_2.$$

Therefore,

$$\int_{B^k_{R,b}} \operatorname{Lip}_r D_k(U; \sigma_1, \dots, \sigma_k)\,d{:}(\sigma_1)\cdots d{:}(\sigma_k) \le C_3 L\,e^{Fr}, \quad r \le 0. \tag{5.13}$$

Using (5.11) and (5.13), we obtain

$$\operatorname{Lip}_r P_k f \le \int_{B^k_{R,b}} \operatorname{Lip}_r D_k(U, \sigma_1, \dots, \sigma_k)\,\bigl|F(\sigma_{k-m}, \dots, \sigma_k)\bigr|\,d{:}(\sigma_1)\cdots d{:}(\sigma_k) \le C_3 L\,\|F\|_\infty\,e^{Fr} \le C_4 e^{Fr}.$$

Hence,

$$\operatorname{Lip} P_k f \le \sum_{r=-\infty}^0 \operatorname{Lip}_r P_k f \le C_4 \sum_{r=-\infty}^0 e^{Fr} = C_4\,(1 - e^{-F})^{-1}.$$

Thus, the functions $P_k f$, k ≥ m + 1, are uniformly Lipschitz, and (H1 ) follows.

□

Remark 5.2. Let us show that $P(k, U, \Gamma)$ is a continuous function of U ∈ A if the Borel set Γ ⊂ A depends on the last k coordinates, i. e., there is $\widetilde\Gamma \in B\bigl((H_N \times H_N^\perp)^k\bigr)$ such that

$$\Gamma = \bigl\{V = (V_l,\ l \le 0) \in A : (V_{1-k}, \dots, V_0) \in \widetilde\Gamma\bigr\}.$$

Indeed, according to (5.11), we have

$$P(k, U, \Gamma) = \int_{\widetilde\Gamma} D_k(U, \sigma_1, \dots, \sigma_k)\,d{:}(\sigma_1)\cdots d{:}(\sigma_k),$$

and the required assertion follows from the continuity of the integrand as a function of U and the dominated convergence theorem. It is not difficult to see, however, that the transition function $P(k, U, \Gamma)$ does not possess the strong Feller property. More exactly, assume that a Borel set Γ ⊂ A has the form

$$\Gamma = \bigl\{V = (V_l,\ l \le 0) \in A : (V_l,\ l \le -k) \in \widetilde\Gamma\bigr\},$$

where $\widetilde\Gamma \in B\bigl((H_N \times H_N^\perp)^{Z_{-k}}\bigr)$. In this case

$$P(k, U, \Gamma) = P\bigl\{\Upsilon^k(U) \in \Gamma\bigr\} = P\bigl\{\bigl(U, \Upsilon^k_{1-k}(U), \dots, \Upsilon^k_0(U)\bigr) \in \Gamma\bigr\} = \chi_{\widetilde\Gamma}(U),$$

where $\chi_{\widetilde\Gamma}$ is the characteristic function of the set $\widetilde\Gamma$. Hence, the function $P(k, U, \Gamma)$ is not continuous unless Γ = ∅ or Γ = A.

5.3. Checking Condition (H2 ). Recall that the space A is endowed with metric dF , F > 0 (see (3.1)).

Proposition 5.3. Assume that Conditions (A)–(D) hold. Then for any r > 0 there is an integer l > 0 and a constant ε > 0 such that

$$P\bigl(l, U, B_A(a, r)\bigr) \ge \varepsilon \quad\text{for any } U, a \in A. \tag{5.14}$$


Proof. We first outline the main ideas. Since Φ : A → A is a uniformly Lipschitz homeomorphism, it follows from (3.16) that Proposition 5.3 will be proved if we establish a similar assertion for $P(k, u, \Gamma)$: for any r > 0 there is an integer l > 0 and a constant ε > 0 such that

$$P\bigl(l, u, B_A(a, r)\bigr) \ge \varepsilon \quad\text{for any } u, a \in A. \tag{5.15}$$

The proof of (5.14) is based on the two observations below.

• With positive probability, the random variable k (u) belongs to an arbitrarily small ball centered at 0 if k is sufficiently large. More exactly, for any δ > 0 there is ε1 > 0 and an integer L1 > 0 such that

$$P\bigl(L_1, u, B_A(\delta)\bigr) \ge \varepsilon_1 \quad\text{for any } u \in A. \tag{5.16}$$

• With positive probability, the random variable k (u) belongs to an arbitrarily small ball centred at a ∈ A if the initial point u is sufficiently close to zero. More exactly, for any r > 0 there are ε2 > 0 and δ > 0 and an integer L2 > 0 such that

$$P\bigl(L_2, u, B_A(a, r)\bigr) \ge \varepsilon_2 \quad\text{for any } u \in B_A(\delta) \text{ and } a \in A. \tag{5.17}$$

The proof of the first assertion is based on the dissipativity of the operator S (see inequality (2.2)) and the fact that the random variables ηk take small values with positive probability, while the second assertion follows from the definition of the set of attainability and the “continuous dependence” of the Markov chain k (u) on the initial point. If (5.16) and (5.17) are proved, then the required inequality (5.15) with l = L1 + L2 and ε = ε1 ε2 is easily implied by the Chapman–Kolmogorov equation. Indeed,

$$\begin{aligned} P\bigl(L_1 + L_2, u, B_A(a, r)\bigr) &= \int_A P(L_1, u, dv)\,P\bigl(L_2, v, B_A(a, r)\bigr) \\ &\ge \int_{B_A(\delta)} P(L_1, u, dv)\,P\bigl(L_2, v, B_A(a, r)\bigr) \\ &\ge P\bigl(L_1, u, B_A(\delta)\bigr) \inf_{v\in B_A(\delta)} P\bigl(L_2, v, B_A(a, r)\bigr) \\ &\ge \varepsilon_1\varepsilon_2. \end{aligned} \tag{5.18}$$
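The two-stage estimate (5.18) can be reproduced on a finite state space, where the Chapman–Kolmogorov equation is just matrix multiplication. The following Python sketch is an illustration under assumptions of our own: the four-state matrix `P`, the "neighbourhood of zero" `near_zero`, the target set `target` and the step counts `L1`, `L2` are all hypothetical and are not taken from the paper.

```python
# Finite-state illustration of (5.18) (hypothetical data): first reach a
# neighbourhood of "zero" in L1 steps, then reach the target in L2 steps.

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, k):
    """k-step transition matrix P^k (Chapman-Kolmogorov)."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, P)
    return result

def prob(Pk, u, subset):
    """Probability of being in `subset` after the steps encoded in Pk, starting from u."""
    return sum(Pk[u][v] for v in subset)

# From every state the chain returns to state 0 with probability 1/2
# (a crude stand-in for dissipativity); otherwise it moves one step up.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5],
     [0.5, 0.0, 0.0, 0.5]]
near_zero, target = [0], [3]
L1, L2 = 2, 3

eps1 = min(prob(mat_pow(P, L1), u, near_zero) for u in range(len(P)))   # analogue of (5.16)
eps2 = min(prob(mat_pow(P, L2), v, target) for v in near_zero)          # analogue of (5.17)
lower = min(prob(mat_pow(P, L1 + L2), u, target) for u in range(len(P)))
```

In this example `lower` is bounded from below by `eps1 * eps2`, which is the finite-state counterpart of the conclusion of (5.18).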

Let us now turn to the detailed proof.

Step 1. We first check (5.16). To this end, we note that, with probability 1, we have

k (u) = (u, 11 (u), . . . , 1k (u)), (5.19)

where u ∈ H and u = u0 is the zeroth component of u. In view of the definition of the Tikhonov topology, inequality (5.16) will be proved once we show that for any δ1 > 0 and any integer l ≥ 0 there is ε1 > 0 and an integer L1 ≥ l such that

P{ ‖1j (u)‖ ≤ 2δ1 for L1 − l ≤ j ≤ L1 } ≥ ε1 , (5.20)

where u is an arbitrary element of A. Note that S k (u) ∈ A ⊂ BH (R) for any k ≥ 0 if R > 0 is sufficiently large. By Condition (B), for given r = δ1 and R > 0, there are


n0 ≥ 1 and a, 0 < a < 1, such that (2.2) holds. Denote by K ≥ 1 the smallest integer for which $a^K R < r$. Iterating K times inequality (2.2), we obtain

$$\|S^k(u)\| \le r = \delta_1 \quad\text{for } k \ge K n_0. \tag{5.21}$$
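The choice of K can be made concrete with a few lines of Python. The sketch below is a hedged illustration only: inequality (2.2) is not reproduced in this section, so the scalar map u ↦ a·u is a hypothetical stand-in for the dissipative operator $S^{n_0}$, and the function name `smallest_K` is ours.

```python
# Illustration of the choice of K before (5.21) (hypothetical scalar model).

def smallest_K(a, R, r):
    """Smallest integer K >= 1 with a**K * R < r, for 0 < a < 1 and R, r > 0."""
    if not (0.0 < a < 1.0) or R <= 0.0 or r <= 0.0:
        raise ValueError("need 0 < a < 1 and R, r > 0")
    K = 1
    while a ** K * R >= r:
        K += 1
    return K

def iterate_contraction(u, a, steps):
    """Apply the stand-in map u -> a * u the given number of times."""
    for _ in range(steps):
        u = a * u
    return u

a, R, r = 0.5, 10.0, 1.0
K = smallest_K(a, R, r)                 # here a**K * R = 0.625 < r
u_after = iterate_contraction(R, a, K)  # worst case: start on the sphere of radius R
```

In this toy model every initial point of norm at most R is mapped into the ball of radius r after K applications of the stand-in map, which mirrors the statement (5.21) with $n_0 = 1$.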

We claim that (5.20) holds for L1 = Kn0 + l. Indeed, define a controllable system 1k (u; ξ1 , . . . , ξk ) by the formulas (cf. (1.8), (5.1), (5.2))

10 (u) = u,   1k (u; ξ1 , . . . , ξk ) = T(1k−1 (u; ξ1 , . . . , ξk−1 )) + ξk ,

where ξj ∈ supp ν. It follows from (5.21) and the continuity of S that if

‖ξk‖ ≤ δ2 for 1 ≤ k ≤ L1 , (5.22)

where δ2 > 0 is sufficiently small, then

‖1j (u; ξ1 , . . . , ξk )‖ ≤ 2r = 2δ1 for L1 − l ≤ j ≤ L1 . (5.23)

Let

$$\Omega_k = \{\omega \in \Omega : \|\eta_k\| \le \delta_2\}, \qquad \Omega^{L_1} = \bigcap_{k=1}^{L_1} \Omega_k. \tag{5.24}$$

Condition (D) implies that $P(\Omega_k) \ge p_0$ for any k and, therefore, in view of the independence, $P(\Omega^{L_1}) \ge p_0^{L_1} = \varepsilon_1$. (See Lemma 5.4 below for a stronger result.) Now note that 1k (u) = 1k (u; η1 , . . . , ηk ). By virtue of (5.22)–(5.24), the event in braces in (5.20) contains $\Omega^{L_1}$, and hence its probability is no less than ε1 .

Step 2. We now prove inequality (5.17). To this end, we need two auxiliary assertions.

Lemma 5.4. For any ρ > 0 and any integer M ≥ 1 there is p0 = p0 (ρ, M) > 0 such that

$$P\bigl\{\|\eta_j - x_j\| < \rho,\ 1 \le j \le M\bigr\} \ge p_0 \tag{5.25}$$

uniformly in x1 , . . . , xM ∈ supp ν.

Proof. Denote by h(x), $x = (x_1, \dots, x_M) \in H^M = \prod_{j=1}^M H$, the left-hand side of (5.25). It follows from the Fatou lemma that h(x) is a lower semi-continuous function of x, i. e.,

$$\liminf_{x \to x^0} h(x) \ge h(x^0) \quad\text{for any } x^0 \in H^M.$$

Since h(x) is positive on the compact set $K = \prod_{j=1}^M \operatorname{supp}\nu$, it attains its positive minimum on K. Hence, (5.25) holds with a positive p0 . □

Denote by Ak the set of attainability from zero by the time k for Eq. (3.11) (see (1.6)).

Lemma 5.5. For any r > 0 there is an integer k ≥ 0 such that A is contained in the r-neighbourhood of Ak , i. e., for any a ∈ A there exists $a^k \in A_k$ such that $a^k \in B_A(r, a)$.


Proof. Since A is the closure of $\bigcup_{j=0}^\infty A_j$, for any r > 0 we have

$$A \subset \bigcup_{j=0}^\infty O_j,$$

where Oj is the open r-neighbourhood of Aj in H. Thus, we have an open covering of the compact set A. Therefore there exists a finite subcovering. It remains to note that the sets Oj form an increasing sequence, and hence A ⊂ Ok for some k ≥ 1. □

We now fix arbitrary r > 0 and a ∈ A. By Lemma 5.5, there is an integer k ≥ 0 not depending on a and an element $a^k \in A_k$ such that $d_F(a, a^k) \le r/2$. Since $B_A(a^k, r/2) \subset B_A(a, r)$, we can assume from the very beginning that a ∈ Ak . We claim that (5.17) holds with L2 = k. Indeed, we must check that

P{ k (u) ∈ BA (a, r) } ≥ ε2 (5.26)

if a ∈ Ak and u ∈ BA (δ) for a sufficiently small δ > 0. Define a controllable system k (u; ξ1 , . . . , ξk ) by formulas (cf. (1.8), (2.14), (2.15)) 0 (u) = u,

k (u; ξ1 , . . . , ξk ) = S k−1 (u; ξ1 , . . . , ξk−1 ) + ξ k0 ,

where u ∈ A, ξj ∈ supp ν, ξ k0 = (. . . , 0, 0, ξk ) and S is given by (2.16). Since a ∈ Ak , there are ξj0 ∈ supp ν, j = 1, . . . , k, such that k (0; ξ10 , . . . , ξk0 ) = a. It follows from continuity of S that k (u; ξ1 , . . . , ξk ) ∈ BA (a, r) if dF (u, 0) < δ,

ξj − ξj0 < δ,

j = 1, . . . , k,

where δ is sufficiently small. Therefore, P k (u) ∈ BA (a, r) ≥ P ηj − ξj0 < δ, j = 1, . . . , k .

(5.27)

It remains to note that, in view of Lemma 5.4, the right-hand side of (5.27) is bounded from below by a constant not depending on ξj0 ∈ supp ν, j = 1, . . . , k. The proof of Proposition 5.3 is complete. □


6. Ergodic Properties of the Invariant Measure 6.1. Support of the invariant measure. Theorem 6.1. Let the conditions of Theorem 2.2 hold and let λ ∈ P(H, A) be the invariant measure for Pk∗ . Then supp λ = A. Proof. Let λ ∈ P(H, A) and µ ∈ P(A) be the invariant measures for the semi-groups Pk∗ and Pk∗ , respectively (see Subsects. 2.2.2 and 3.2). By Theorem 4.2, we have supp µ = A. According to Step 3 in Sect. 5.1, we have supp λ = Φ(supp µ) = Φ(A) = A. Now note that the projection π0 : H → H,

u = (ul , l ∈ Z0 ) ↦ u0 ,

maps the measure λ to λ. Therefore, π0 (supp λ) = π0 (A) = supp λ. Thus, the theorem will be proved once we show that π0 (A) = A, i. e., for any u ∈ A there is u ∈ A such that π0 (u) = u. This assertion is obvious if u ∈ Ak and can easily be proved with the help of approximation of a given element u ∈ A by a sequence uk ∈ Ak . □

6.2. Convergence to the invariant measure and mixing.

Theorem 6.2. Let the conditions of Theorem 2.2 hold and let f ∈ C(H ). Then

Pk f (u) → (λ, f ) as k → ∞ (6.1)

for any u ∈ A. Moreover, the convergence is uniform in u ∈ A. Proof. First note that the measure P (k, u, ·) is supported by the set of attainability A for any k ≥ 0 and u ∈ A, and therefore we can redefine the function f outside A without changing Pk f (u) for u ∈ A. Since f is uniformly bounded on the compact set A, we can assume that f ∈ Cb (H ). Given f ∈ Cb (H ), we define a function f ∈ Cb (H) by the formula f (u) = f (u0 ),

u ∈ H,

where u0 is the zeroth component of u. Let λ ∈ P(H, A) be an invariant measure for Pk∗ and let λ ∈ P(H, A) be the corresponding invariant measure for Pk∗ (see Step 2 in Sect. 5.1). Since the projection u = (. . . , u−1 , u0 ) ↦ u0 sends the measure λ to λ, we have (λ, f ) = (λ, f ). As was shown in the proof of Theorem 6.1, for any u ∈ A there is u ∈ A such that u0 = u. Under this choice of u ∈ A, we have Pk f (u) = Pk f (u).


Therefore, it suffices to prove convergence (6.1) with Pk , f , and u replaced by Pk , f , and u, respectively. The map Ψ defined in Theorem 3.2 transforms the measure λ to the measure µ = Ψ∗ λ, which is invariant for the family of Markov chains (3.10), (3.11). Using (3.17), we see that it remains to check that Pk g(U ) → (µ, g) as k → ∞ uniformly in U ∈ A, where g(U ) = f (Φ(U )). Since Theorem 4.2 applies to the Markov semi-group Pk corresponding to (3.10), (3.11), the last convergence follows from (4.5) and Remark 4.3. □

Theorem 6.2 has two important corollaries.

Corollary 6.3. Let the conditions of Theorem 2.2 hold. Then the invariant measure λ ∈ P(H, A) is mixing, i. e.,

$$\int_H P_k f(u)\,g(u)\,d\lambda(u) \to \int_H f(u)\,d\lambda(u) \int_H g(u)\,d\lambda(u) \quad\text{as } k \to \infty$$

for any two functions f, g ∈ C(H ). In particular, the measure λ is ergodic in A. Corollary 6.4. Let the conditions of Theorem 2.2 hold and let 1k be an arbitrary Markov chain in H that satisfies (2.9) for k ≥ 1 and whose initial distribution λ0 is supported by A. Then the distribution of 1k weakly converges to λ, i. e., (Pk∗ λ0 , f ) → (λ, f ) as k → ∞ for any f ∈ Cb (H ). 7. Application to Stochastic Dissipative PDE’s 7.1. Navier-Stokes equations in a bounded domain. Let D ⊂ R2 be a bounded domain with boundary ∂D ∈ C 2 . Denote by V the space of vector functions u = (u1 , u2 ), uj ∈ C0∞ (D), such that div u = 0, by H and V the closure of V in11 L2 (D) and H 1 (D), respectively, and by S the orthogonal projection in L2 (D) onto H . On the domain D, let us consider the system of Navier–Stokes (NS) equations (0.1) with random right-hand side. We write it as a functional equation in H (for instance, see [CF, Chap. 8] or [BV, Sect. 1.6]): ∂t u + δLu + B(u, u) = η(t).

(7.1)

Here δ > 0 is a parameter, L is the closure in H of the operator L0 = −S with domain V, B(u, u) = S(u, ∇)u, and η(t) is a random process of the form

$$\eta(t) = \sum_{k=-\infty}^{+\infty} \delta(t - kT)\,\eta_k(x), \tag{7.2}$$

where T > 0, ηk (x) are H -valued i.i.d. random variables, and δ(·) is the Dirac measure concentrated at zero. In what follows, to simplify the notation, we shall assume that T = 1. Let us define what is meant by a solution of Eq. (7.1).

11 We use the same notation for spaces of scalar and vector functions.


Let ‖·‖ and ‖·‖1 be the norms in the spaces H and V , respectively. For an open interval I ⊂ R, denote by L2 (I, V ) the space of Borel functions f (t) : I → V such that

$$\|f\|_{L^2(I,V)} := \Bigl(\int_I \|f(t)\|_1^2\,dt\Bigr)^{1/2} < \infty$$

and by C(I, H ) the space of functions on I with range in H that are extendible to a continuous function f (t) : I¯ → H , where I¯ is the closure of I .

Definition 7.1. Let m and n be some integers such that m + 1 < n. A stochastic process u(t) = u(t, x) defined on the interval [m, n) is called a solution of Eq. (7.1) if the following properties hold with probability 1:

• For any k = m + 1, . . . , n, the restriction of u(t) to Ik := (k − 1, k) belongs to the space L2 (Ik , V ) ∩ C(Ik , H ) and satisfies the homogeneous equation

∂t u + δLu + B(u, u) = 0. (7.3)

• For k = m + 1, . . . , n − 1,

u(k + 0, x) − u(k − 0, x) = ηk (x). (7.4)

• The function u(t) is continuous from the right at the points t = m + 1, . . . , n − 1.

The following proposition is a trivial consequence of Definition 7.1.

Proposition 7.2. Let a stochastic process u(t, x) be a solution of (7.1) on an interval [m, n). Then, with probability 1, u(t, x) satisfies Eq. (7.1) in the sense of distributions.

Consider now the Cauchy problem for Eq. (7.1):

u(0, x) = u0 (x), (7.5)

where u0 (x) is an H -valued random variable. A stochastic process u(t, x) is called a solution of the problem (7.1), (7.5) if it is a solution of Eq. (7.1) and relation (7.5) holds with probability 1. It follows from Definition 7.1 and the classical result on the well-posedness of the initial-boundary value problem for the 2D Navier–Stokes system (for instance, see [L, CF, BV]) that the problem (7.1), (7.5) has a unique solution u(t, x) defined for all t ≥ 0. This solution can be constructed in the following way. Let St be the solving semi-group for the Cauchy problem (7.3), (7.5). Thus, St (u0 ) = v(t), where v(t) = v(t, x) is the solution of (7.3), (7.5). Define a random process u(t) as

u(k) = S(u(k − 1)) + ηk (x), k = 1, 2, . . . , (7.6)

u(k + t) = St (u(k)), 0 ≤ t < 1, k = 0, 1, 2, . . . , (7.7)

where S = S1 . It is easy to see that u(t) is the required solution.

Consider now the sequence uk = u(k) ∈ H , k ≥ 0. Since the random variables u0 , η1 , η2 , . . . are independent, we conclude that {uk } is a Markov chain. Hence, we can define a family of Markov chains by the formulas (cf. (2.8), (2.9))

10 (v) = v,   1k (v) = S(1k−1 (v)) + ηk , k ≥ 1, (7.8)


where v ∈ H . Denote by Pk (v, 3), Pk and Pk∗ the corresponding transition function and the Markov operators (see (1.3)–(1.5)) and by A the set of attainability from zero for (7.8) (see (1.7)). Assume that ηk (x), k ∈ Z, have the form (2.6), i. e., ηk =

∞

bj ξj k ej (x),

(7.9)

j =1

where bj ≥ 0 are some constants satisfying (2.7), {ξj k } is a family of independent random variables for which Condition (D) holds, and {ej = ej (x)} is the complete set of L2 -normalized eigenvectors of the operator L with the corresponding eigenvalues {αj }. Theorem 7.3. Under the above conditions, the Markov semi-group Pk∗ has a unique invariant measure λ ∈ P(H, A) if (2.10) holds with a sufficiently large N ≥ 1. This measure is concentrated on the domain of definition D(L) of the operator L if12 ∞ j =1

αj2 bj2 < ∞.

(7.10)

Furthermore, the measure λ is mixing, and for any initial distribution $\lambda_0 \in P(H, A)$ the sequence Pk∗ $\lambda_0$ weakly converges to λ. In particular, if u(t, x) is the solution of (7.1) starting from any point in A (e.g., from zero), then the measures D(u(k, ·)) tend to λ as k → ∞, k ∈ Z.

Proof. In view of Theorems 2.2, 6.1, 6.2 and Corollaries 6.3, 6.4, to prove the existence, uniqueness, and ergodic properties of an invariant measure, it suffices to check that Conditions (A)–(C) are satisfied for the operator $S = S_1$. The uniform Lipschitz property of S on any ball $B_H(R)$ is well known (for instance, see [CF, Chap. 10], [BV, Th. 1.6.1], or [G, Sect. 3.2]). It is also a classical result that

$$\|S(u)\| \le a\,\|u\| \quad \text{for any } u \in H, \tag{7.11}$$

where $a = e^{-\delta\alpha_1}$. Inequality (7.11) immediately implies (2.2) (with $n_0 = 1$) and Condition (B). Finally, to check (C), we note that (for instance, see [BV, Th. 1.6.2])

$$\|S(u_1) - S(u_2)\|_1 \le C(R)\,\|u_1 - u_2\| \quad \text{for any } u_1, u_2 \in B_H(R). \tag{7.12}$$

We also have

$$\|Q_N v\|^2 = \sum_{j=N+1}^{\infty} |v_j|^2 \le \alpha_{N+1}^{-1} \sum_{j=N+1}^{\infty} \alpha_j |v_j|^2 \le \alpha_{N+1}^{-1} \|v\|_1^2, \tag{7.13}$$

where $v_j = (v, e_j)$ are the Fourier components of v. The required inequality (2.5) with $\gamma_N(R) = C(R)\,\alpha_{N+1}^{-1/2}$ follows from (7.12) and (7.13).

We now show that the invariant measure λ ∈ P(H, A) is concentrated on D(L) if (7.10) holds. It is well known that S(H) ⊂ D(L) and that D(L) is a Borel subset in H. Hence,

$$P\bigl(1, u, D(L)\bigr) = \mathbb{P}\bigl\{S(u) + \eta_1 \in D(L)\bigr\} = 1 \quad \text{for any } u \in H.$$

It follows that

$$\lambda\bigl(D(L)\bigr) = \int_H P\bigl(1, u, D(L)\bigr)\,d\lambda(u) = 1,$$

which completes the proof of Theorem 7.3. □

¹² Clearly, condition (7.10) implies that $\eta_k \in D(L)$ with probability 1.
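As an informal illustration of the construction (7.6), the following sketch iterates a finite-dimensional toy analogue of the chain u(k) = S(u(k − 1)) + η_k. It is not the Navier–Stokes system: the nonlinearity B(u, u) is dropped, so that S becomes the diagonal linear contraction with entries e^{−δα_j}. The values α_j = j², b_j = j^{−2}, the uniform law chosen for the ξ_jk, and all function names are assumptions made only for this example. Two trajectories driven by the same kicks are compared; in this linear setting their distance decays at least like a^k with a = e^{−δα_1}, in the spirit of (7.11).

```python
import math
import random


def kick(b, rng):
    """Sample eta = sum_j b_j * xi_j * e_j with xi_j uniform on [-1, 1] (assumed law)."""
    return [bj * rng.uniform(-1.0, 1.0) for bj in b]


def step(u, eta, delta, alphas):
    """One step u -> S(u) + eta; S is the time-1 flow of the linear equation
    du/dt + delta * L u = 0 written in the eigenbasis of L (toy model only)."""
    return [math.exp(-delta * a) * uj + ej for uj, a, ej in zip(u, alphas, eta)]


def norm(u):
    """Euclidean norm of a coefficient vector."""
    return math.sqrt(sum(x * x for x in u))


def gap_history(u0, v0, n_steps, delta=0.5, seed=0):
    """Distances |u(k) - v(k)| for two chains driven by the same kicks."""
    dim = len(u0)
    alphas = [float(j * j) for j in range(1, dim + 1)]  # toy eigenvalues alpha_j
    b = [1.0 / (j * j) for j in range(1, dim + 1)]      # toy amplitudes b_j
    rng = random.Random(seed)
    u, v = list(u0), list(v0)
    gaps = [norm([x - y for x, y in zip(u, v)])]
    for _ in range(n_steps):
        eta = kick(b, rng)
        u = step(u, eta, delta, alphas)
        v = step(v, eta, delta, alphas)
        gaps.append(norm([x - y for x, y in zip(u, v)]))
    return gaps


if __name__ == "__main__":
    for k, g in enumerate(gap_history([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 10)):
        print(k, g)
```

In this toy model the contraction alone forces the two trajectories together. The theorem above is stronger: it treats the nonlinear operator S and needs the noise to be nondegenerate only in finitely many modes.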

7.2. Navier-Stokes equations on a torus. We now assume that $x \in T^2$, where $T^2$ is a two-dimensional torus, and that

$$\int_{T^2} u(t, x)\,dx = 0, \qquad \int_{T^2} \eta(t, x)\,dx = 0. \tag{7.14}$$

Let $H_s$ be the space of divergence-free vector fields on $T^2$ that belong to the Sobolev space $H^s(T^2, R^2)$ and whose mean value is zero. We fix an arbitrary integer s ≥ 0 and denote by $\{e_j\}$ the complete set of $L^2$-normalised eigenvectors of the operator L. As before, $S_t$ stands for the solving semi-group corresponding to the non-forced NS equations. It is well known that $S_t$ is a continuous operator in $H_s$ for any integer s ≥ 0. Applying standard arguments (see [BV, CF, FT, L]), it is not difficult to show that the operator $S = S_1$ satisfies Conditions (A)–(C). Besides,

$$\|S(u)\|_{s+k} \le C_k\bigl(\|u\|_s\bigr) \quad \text{for any } k \ge 0. \tag{7.15}$$

We assume that the forcing η(t) has the form (7.9) and is smooth, i.e., the coefficients $b_j \ge 0$ satisfy the inequality

$$|b_j| \le C_m\, j^{-m} \quad \text{for any } j, m \ge 1, \tag{7.16}$$

where $C_m > 0$ does not depend on j. If Condition (D) is satisfied, then Theorems 2.2, 6.1, and 6.2 apply to the space-periodic 2D NS equations in the space $H = H_s$ provided that

$$b_j > 0 \quad \text{for } j = 1, \dots, N = N(\delta, s). \tag{7.17}$$

By Theorem 2.2, there is a unique invariant measure λ supported by the set of attainability from zero in the space $H_s$. It follows from (7.15) and (7.16) that the measure λ is concentrated on infinitely smooth functions. Let u(t, x) be a solution of (7.1), (7.2) (with $x \in T^2$) such that u(0, x) = 0. The sequence of distributions of the random variables $u(k) \in H_s$ weakly converges to λ. Hence,

$$E f\bigl(u(k)\bigr) \to \int_{H_s} f(u)\,d\lambda(u) \quad \text{as } k \to \infty$$

for any nonlinear continuous functional f on $H_s$.


7.3. A nonlinear Schrödinger equation on a torus. Consider the Schrödinger equation

$$\dot u = (\Delta - 1)u + i|u|^2 u + \eta(t), \quad x \in T^n, \tag{7.18}$$

where u = u(t, x) is an unknown complex-valued function, $T^n$ is an n-dimensional torus, and η(t) is a random process of the form (7.2) with T = 1. We regard (7.18) as a system of two equations for the real and imaginary parts of u(t, x). Assume that the random variables $\eta_k$ in (7.2) have the form (7.9), where $b_j \ge 0$ are some constants, $\xi_{jk}$ are independent random variables satisfying Condition (D), and $\{e_j\}$ is the complete set of eigenvectors (which are pairs of real-valued functions) of the operator $1 - \Delta$ on the torus $T^n$ with corresponding eigenvalues $\{\alpha_j\}$. It can be proved that if the inequality

$$\sum_{j=1}^{\infty} \alpha_j^s b_j^2 < \infty$$

holds with some integer s > n/2, then the Cauchy problem for Eq. (7.18) is well-posed in the Sobolev space $H^s = H^s(T^n, R^2)$. More precisely, for any random variable $u_0(x)$ with values in $H^s$ the problem (7.18), (7.5) has a unique solution $u(t, x) \in H^s$, t ≥ 0, given by formulas (7.6), (7.7), where $S = S_1$ and $S_t$ is the solving semi-group for the homogeneous equation. We now define a family of Markov chains 1k (v), $v \in H^s$, by formulas (7.8). Let P (k, v, 3), Pk , and Pk∗ be the transition function and the Markov semigroups associated with the family 1k (v) and let $A = A_s \subset H^s$ be the corresponding set of attainability from zero. The proof of the following assertion is similar to that of Theorem 7.3.

Theorem 7.4. Under the above conditions, the Markov semi-group Pk∗ has a unique invariant measure $\lambda \in P(H^s, A)$ if (2.10) holds with a sufficiently large N ≥ 1. The measure λ is mixing, and for any initial distribution $\lambda_0 \in P(H^s, A)$ the sequence Pk∗ $\lambda_0$ weakly converges to λ. In particular, if u(t, x) is the solution of (7.18) starting from any point in A (e.g., from zero), then the measures D(u(k, ·)) tend to λ as k → ∞, k ∈ Z.

8. Appendix: Proof of Theorem 3.1

The solvability of Eq. (3.3) will be proved by the contraction mapping principle. For given $v \in B_N(R)$ and $\psi \in B_N^{\perp}(b)$, consider an operator K that is defined on $B_N^{\perp}(r)$, where r is the constant in (3.5), and maps $\tilde v = (\tilde v_l, l \le 0)$ to $\tilde v' = (\tilde v'_l, l \le 0)$, where

$$\tilde v'_l = Q_N S(v_{l-1} + \tilde v_{l-1}) + \psi_l, \quad l \le 0.$$

It is clear that an element $\tilde v \in B_N^{\perp}(r)$ is a solution of (3.3) if and only if it is a fixed point of K. We claim that for sufficiently large N the operator K maps the set $B_N^{\perp}(r)$ into itself and is a contraction if $B_N^{\perp}(r)$ is endowed with the norm $\|\cdot\|_\infty$. Indeed, in view of inequality (2.5) with $u_2 = 0$, for any $\tilde v \in B_N^{\perp}(r)$, we have

$$\|K(\tilde v)\|_\infty \le \sup_{l \le 0} \|Q_N S(v_{l-1} + \tilde v_{l-1})\| + \sup_{l \le 0} \|\psi_l\| \le \gamma_N(\rho) \sup_{l \le 0} \|v_{l-1} + \tilde v_{l-1}\| + b \le \gamma_N(\rho)(R + r) + b. \tag{8.1}$$


Choose an integer N such that $\gamma_N(\rho) \le \gamma$. By (3.5) and (8.1),

$$\|K(\tilde v)\|_\infty \le \gamma(R + r) + b \le r,$$

which means that K maps the space $B_N^{\perp}(r)$ into itself. To prove that K is a contraction, we take arbitrary $\tilde v^i = (\tilde v^i_l, l \le 0) \in B_N^{\perp}(r)$, i = 1, 2, and note that, in view of (2.5),

$$\|K(\tilde v^1) - K(\tilde v^2)\|_\infty \le \sup_{l \le 0} \bigl\|Q_N\bigl(S(v_{l-1} + \tilde v^1_{l-1}) - S(v_{l-1} + \tilde v^2_{l-1})\bigr)\bigr\| \le \gamma_N(\rho)\,\|\tilde v^1 - \tilde v^2\|_\infty.$$

It follows that if $\gamma_N(\rho) \le \gamma < 1$, then K is a contraction and hence has a unique fixed point in $B_N^{\perp}(r)$.

We now prove (3.7). To this end, choose arbitrary $v^i \in B_N(R)$ and $\psi^i \in B_N^{\perp}(b)$, i = 1, 2, and set $\tilde v^i = W(v^i, \psi^i)$. In view of (3.3) and (2.5), we have

$$\bigl\|M(F)\bigl(W(v^1, \psi^1) - W(v^2, \psi^2)\bigr)\bigr\|_\infty \le \sup_{l \le 0} e^{Fl}\Bigl(\bigl\|Q_N\bigl(S(v^1_{l-1} + \tilde v^1_{l-1}) - S(v^2_{l-1} + \tilde v^2_{l-1})\bigr)\bigr\| + \|\psi^1_l - \psi^2_l\|\Bigr)$$

$$\le e^{F}\gamma \sup_{l \le -1} e^{Fl}\bigl(\|v^1_l - v^2_l\| + \|\tilde v^1_l - \tilde v^2_l\|\bigr) + \sup_{l \le 0} e^{Fl}\|\psi^1_l - \psi^2_l\|$$

$$\le e^{F}\gamma\bigl(\|M(F)(v^1 - v^2)\|_\infty + \|M(F)(\tilde v^1 - \tilde v^2)\|_\infty\bigr) + \|M(F)(\psi^1 - \psi^2)\|_\infty,$$

whence we derive (3.7). The proof of the theorem is complete.

Acknowledgements. We thank I. Gyöngy, K. Khanin, A. Kupiainen, Ya. Sinai, and A.-S. Sznitman for discussions. This research was supported by the EPSRC grant M20624.

References

[BV] Babin, A.V., Vishik, M.I.: Attractors of Evolutionary Equations. Studies in Mathematics and its Applications, Vol. 25, Amsterdam: North-Holland, 1992
[Bo] Bowen, R.: Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms. Lecture Notes in Mathematics, Vol. 470, Berlin–New York: Springer-Verlag, 1975
[BKL] Bricmont, J., Kupiainen, A., Lefevere, R.: Probabilistic estimates for the two-dimensional stochastic Navier–Stokes equations. Preprint
[CF] Constantin, P., Foiaş, C.: Navier-Stokes Equations. Chicago Lectures in Mathematics, Chicago–London: University of Chicago Press, 1988
[DZ] Da Prato, G., Zabczyk, J.: Ergodicity for Infinite-Dimensional Systems. London Mathematical Society Lecture Note Series, Vol. 229, Cambridge: Cambridge University Press, 1996
[Do] Dobrushin, R.L.: Conditions for the absence of phase transitions in one-dimensional classical systems. Mat. Sb. 93 (135), 29–49 (1974)
[FlM] Flandoli, F., Maslowski, B.: Ergodicity of the 2D Navier-Stokes equation under random perturbations. Commun. Math. Phys. 172, no. 1, 119–141 (1995)
[FP] Foiaş, C., Prodi, G.: Sur le comportement global des solutions non-stationnaires des équations de Navier-Stokes en dimension 2. Rend. Sem. Mat. Univ. Padova 39, 1–34 (1967)
[FT] Foiaş, C., Temam, R.: Gevrey class regularity for the solutions of the Navier-Stokes equations. J. Funct. Anal. 87, no. 2, 359–369 (1989)
[G] Gallavotti, G.: Fluid Mechanics. Hypothesis for an Introduction. To appear in Springer.¹³
[IW] Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. North-Holland Mathematical Library, Vol. 24, Amsterdam–New York: North-Holland, 1989
[L] Lions, J.-L.: Quelques Méthodes de Résolution des Problèmes aux Limites Nonlinéaires. Paris: Gauthier-Villars, 1969
[M] Mattingly, J.C.: Ergodicity of 2D Navier-Stokes equations with random forcing and large viscosity. Commun. Math. Phys. 206, no. 2, 273–288 (1999)
[Re] Revuz, D.: Markov chains. Second Edition, North-Holland Mathematical Library, Vol. 11, Amsterdam–New York: North-Holland, 1984
[Ru] Ruelle, D.: Statistical mechanics of a one-dimensional lattice gas. Commun. Math. Phys. 9, 267–278 (1968)
[S1] Sinai, Ya.G.: Two results concerning asymptotic behavior of solutions of the Burgers equation with force. J. Statist. Phys. 64, no. 1–2, 1–12 (1991)
[S2] Sinai, Ya.G.: Gibbs measures in ergodic theory. Uspekhi Mat. Nauk 27, no. 4 (166), 21–64 (1972)

¹³ The ps-file of this book can be freely downloaded from http://ipparco.roma1.infn.it.

Communicated by A. Kupiainen

Commun. Math. Phys. 213, 331 – 379 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

KMS States, Entropy and the Variational Principle in Full C∗-Dynamical Systems

C. Pinzari¹,*, Y. Watatani²,**, K. Yonetani²

¹ Mathematics Department, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. E-mail: [email protected]
² Graduate School of Mathematics, Kyushu University, Fukuoka 810-8560, Japan. E-mail: [email protected]; [email protected]

Received: 1 February 2000 / Accepted: 23 February 2000

Dedicated to Sergio Doplicher and John E. Roberts on the occasion of their sixtieth birthdays

Abstract: To any periodic and full C∗-dynamical system (A, α, R), an invertible operator s acting on the Banach space of trace functionals of the fixed point algebra is canonically associated. KMS states correspond to positive eigenvectors of s. A Perron–Frobenius type theorem asserts the existence of KMS states at inverse temperatures equal to the logarithms of the inner and outer spectral radii of s (extremal KMS states). Examples arising from subshifts in symbolic dynamics, self-similar sets in fractal geometry and noncommutative metric spaces are discussed. Certain subshifts are naturally associated to the system, and criteria for the equality of their topological entropy and inverse temperatures of extremal KMS states are given. Unital completely positive maps σ{xj} implemented by partitions of unity {xj} of grade 1 are considered, resembling the “canonical endomorphism” of the Cuntz algebras. The relationship between the Voiculescu topological entropy of σ{xj} and the topological entropy of the associated subshift is studied. Examples where the equality holds are discussed among Matsumoto algebras associated to non finite type subshifts. In the general case ht(σ{xj}) is bounded by the sum of the entropy of the subshift and a suitable entropic quantity of the homogeneous subalgebra. Both summands are necessary. The measure-theoretic entropy of σ{xj}, in the sense of Connes–Narnhofer–Thirring, is compared to the classical measure-theoretic entropy of the subshift. A noncommutative analogue of the classical variational principle for the entropy is obtained for the “canonical endomorphism” of certain Matsumoto algebras. More generally, a necessary condition is discussed. In the case of Cuntz–Krieger algebras an explicit construction of the state with maximal entropy from the unique KMS state is given.

* On leave of absence from Dipartimento di Matematica, Università di Roma Tor Vergata, 00133 Roma. Supported by the EU and NATO–CNR.
** Supported by the Grant-in-aid for Scientific Research of JSPS.


Introduction

Let A be a unital C∗-algebra endowed with a 2π-periodic automorphic action α of R. In algebraic statistical mechanics elements of A represent kinematic observables of an infinite quantum system, and α is the time evolution of the system. Equilibrium states of the system are states on A which satisfy the KMS condition with respect to α.

We shall assume throughout the paper that α is 2π-periodic, so it factors through an action γ of the circle T, and also that γ is full, in the following sense. Let $A_k$ be the spectral subspace of elements a ∈ A such that $\gamma_z(a) = z^k a$. We assume that for all k ∈ Z, the closed linear span of $\{xy,\ x \in A_k,\ y \in A_{-k}\}$ is the fixed point algebra $A_0$.

An example is given by a crossed product C∗-algebra $A = B \rtimes_\beta Z$ by a single automorphism β, endowed with the dual action $\gamma = \hat\beta$. All KMS states on A are tracial, and are given by T-invariant extensions to A of β-invariant tracial states on B. More generally, if (A, γ) is not a dual C∗-dynamical system, nontracial KMS states arise. Interesting examples are, in increasing generality, the Cuntz–Krieger algebras [CK], the Matsumoto algebras associated with a subshift [M], and the Pimsner algebras associated with a full finite projective Hilbert C∗-bimodule [P], all endowed with the canonical gauge action. While KMS states for the first two classes of C∗-algebras are now well understood (see [EFW, E, MWY]), the third class of C∗-algebras is the main motivation of the present paper. (We shall see that Pimsner C∗-algebras are in fact a typical example, in the sense that any unital, full and periodic C∗-dynamical system is isomorphic to a system constituted by a Pimsner C∗-algebra associated to a full, finite projective Hilbert bimodule, though not unique, and its canonical gauge action.)
We point out that, on the other hand, various authors, regarding the Cuntz–Krieger algebras as examples of noncommutative topological dynamical systems, have computed the Voiculescu topological entropy [V] of the so-called “canonical endomorphism” σ ([Ch], [BG]). However, among these, to the authors’ knowledge, it is only for the case of the Cuntz algebras Od that a close relationship is known between KMS states, Voiculescu’s topological entropy and measure-theoretic entropy in the sense of [CNT], see [Ch]. Choda’s result states that the CNT entropy of σ computed with respect to the unique KMS state equals the topological entropy of σ , which is, in turn, log(d). This result can be regarded as a pivotal example of a noncommutative dynamical system for which a variational principle for the entropy holds. Our ultimate goal is that of investigating the variational principle for the entropy and its relationship with KMS states, in full periodic C ∗ -dynamical systems. KMS states for 2π-periodic actions have already been considered in the literature by several authors. Olesen and Pedersen gave in [OPI] an existence and uniqueness theorem for KMS states of the Cuntz algebras. This was generalized to the case of Cuntz–Krieger algebras by Enomoto, Fuji and the second-named author in [EFW] and by Evans in [E]. In [BEH] Bratteli, Elliott and Herman construct, for any closed subset F of the extended real line, a simple C ∗ -algebra AF endowed with a 2π -periodic one-parameter group, for which F is precisely the set of inverse temperatures of KMS states, and such that for each β ∈ F , AF has a unique KMS state at inverse temperature β. We remark that if F ⊂ R, then the T-action is full, and if F ⊂ (0, +∞), AF is purely infinite (cf. Sect. 2). In a subsequent paper [BEK] Bratteli, Elliott and Kishimoto show that even the set of KMS states with a specified inverse temperature can be fairly arbitrary. 
In a recent paper [MWY] Matsumoto, Yoshida and the second-named author study KMS states for the Matsumoto C∗-algebras associated with a subshift in symbolic dynamics. They develop a Perron–Frobenius type theorem for a suitable positive operator naturally acting on a certain subalgebra, and show that the logarithm of its spectral radius arises as the inverse temperature of some KMS state. Furthermore they show a connection with the topological entropy of the underlying subshift.

Our approach is close to that of [MWY], in that we emphasize the Perron–Frobenius theory. The starting point is that to any periodic and full C∗-dynamical system we associate certain completely positive maps on the underlying C∗-algebra, which we interpret as being Perron–Frobenius type operators. KMS states correspond then to the positively scaled tracial states on the fixed point algebra. We study the problem of existence of KMS states, thus proving a Perron–Frobenius type theorem, and the relationship with the variational principle in ergodic theory.

The paper is organized as follows. In the first section we choose finite subsets $\{y_i\}$ and $\{x_j\}$ of $A_1$ such that $\sum_j y_j^* y_j = I$ and $\sum_i x_i x_i^* = I$. Such multiplets exist because the group action is full. They can be regarded as playing the role of the canonical unitary in $B \rtimes_\beta Z$ implementing β. We then consider two completely positive (cp) maps: $T_{\{y_i\}} : T \mapsto \sum_i y_i T y_i^*$ and $S_{\{x_j\}} : T \mapsto \sum_j x_j^* T x_j$ on $A_0$, and also, by transposition, operators t and s which are inverses of one another on the Banach space of trace functionals on $A_0$. These operators are independent of the choice of the multiplets $\{y_i\}$ and $\{x_j\}$. KMS states for the system at finite inverse temperatures then correspond to tracial states on $A_0$ which are positively scaled by those cp maps, or, equivalently, to tracial state eigenvectors of s. In the next section we show that, under the necessary condition that the fixed point algebra has a tracial state, the inner and outer spectral radii of s correspond to inverse temperatures of “minimal” and “maximal” KMS states (see Theorem 2.5 and Corollary 2.6). This can be regarded as a Perron–Frobenius theorem. The key point in the proof is that one needs to consider the trace functionals of the enveloping von Neumann algebra of $A_0$, endowed with its order structure.
In Sects. 3–5 we discuss some examples. In Sect. 3 we apply our results to the Pimsner C ∗ -algebras generated by finite projective Hilbert bimodules, and we thus deduce a criterion for existence of KMS states which applies, in particular, to the case where the coefficient algebra is simple, unital and has a tracial state. In Sect. 4 we construct Hilbert bimodules, and hence full C ∗ -dynamical systems, via Pimsner’s construction, naturally arising from two different situations: subshifts of symbolic dynamics and self-similar sets in fractal geometry. In both cases the coefficient algebra is commutative, and the corresponding Hilbert bimodules are described by a finite set of endomorphisms. We show that in the former situation Pimsner’s construction yields the Matsumoto C ∗ -algebras, while in the latter one gets a genuine Cuntz algebra. We also discuss a generalization of the latter example to noncommutative metric spaces introduced by Connes [Co]. It is interesting to compare our discussion with the papers by Jørgensen–Pedersen [JP] and Bratteli–Jørgensen [BJ], where the authors consider a relationship between Cuntz algebras and multiresolutions in wavelet and fractal analysis. In Sect. 5 we look more closely at the subclass of the so-called Cuntz–Krieger bimodules (and the corresponding C ∗ -algebras). These are bimodules for which the coefficient algebra is a finite direct sum of unital simple C ∗ -algebras. The leading and simplest example is, of course, that of Cuntz–Krieger algebras, where each summand algebra is a copy of the complex numbers. We show in particular that if each of the summands has a unique trace and the defining {0, 1}-matrix A is irreducible then the associated Pimsner C ∗ -algebra has a unique KMS state at inverse temperature log(r(A)), where r(A) is the spectral radius of A. 
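For the classical Markov subshift defined by a {0, 1}-matrix A, the quantity log(r(A)) appearing above is also the exponential growth rate of the number of admissible words. The following sketch, which is only a commutative illustration and not part of the paper's argument, counts admissible words for the golden mean matrix and compares the growth rate with the logarithm of the golden ratio, the spectral radius of that matrix. The function names and the choice of example are assumptions made for this illustration; convergence of the ratio of successive counts is assumed to hold because the example matrix is primitive.

```python
import math


def count_words(A, n):
    """Number of admissible words of length n >= 1 for the 0-1 matrix A,
    i.e. the sum of the entries of A^(n-1)."""
    size = len(A)
    c = [1] * size
    for _ in range(n - 1):
        c = [sum(A[i][j] * c[j] for j in range(size)) for i in range(size)]
    return sum(c)


def entropy_estimate(A, n):
    """Logarithm of the ratio of successive word counts; for a primitive
    matrix this tends to log r(A) as n grows."""
    return math.log(count_words(A, n + 1) / count_words(A, n))


# Golden mean shift: the word "11" is forbidden.
GOLDEN_MEAN = [[1, 1], [1, 0]]

if __name__ == "__main__":
    print(entropy_estimate(GOLDEN_MEAN, 40))
    print(math.log((1 + math.sqrt(5)) / 2))
```

Exact integer arithmetic is used for the counts, so the only floating-point operations are the final division and logarithm.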
In the next section we associate to each pair ({yi}, {xj}) of finite subsets of $A_1$ as above, a pair of one-sided subshifts, ({xj } , {yi } ), which roughly correspond to the operator s and its inverse t. We show that, under certain conditions, the topological entropies of these subshifts are precisely the minimal and maximal inverse temperatures of KMS states. Furthermore we give a criterion for approximating such extremal temperatures with arbitrary (a priori non KMS) tracial states satisfying suitable conditions.

In Sect. 7 we introduce a ucp map $\sigma_{\{x_j\}} : T \mapsto \sum_j x_j T x_j^*$ implemented by a multiplet {xj} of grade 1 as above, which should be compared with the map $S_{\{x_j\}}$. The main result of this section is the estimate

ht(σ{xj}) ≤ htop({xj } ) + ht({φxi,xj}),

where the l.h.s. is the Brown–Voiculescu topological entropy [B,V] of σ{xj} and the second summand at the r.h.s. is the topological entropy, suitably defined, of the set of contractions $\varphi_{x_i,x_j} : T \mapsto x_i^* T x_j$ of the homogeneous C∗-subalgebra $A_0$. Both summands at the r.h.s. of this inequality are necessary. Indeed when $A = A_0 \rtimes_\alpha Z$ then the associated subshift is trivial so its entropy is zero, and the above inequality, when combined with monotonicity of topological entropy ([B,V]), leads to Brown's result $ht_A(\mathrm{Ad}(u)) = ht_{A_0}(\alpha)$, where u ∈ A is a unitary implementing α [B]. Another extreme case is that of the Cuntz–Krieger algebras $O_A$. Now the second summand vanishes and the previous estimate yields the result by Boca and Goldstein [BG] that ht(σ{xj}) = ht({xj } ) = log(r(A)), see Corollary 7.8. We then focus our attention on those algebras for which ht({φxi,xj}) = 0 and we show that if the xj's have pairwise orthogonal ranges then ht(σ{xj}) = htop({xj } ). We next discuss new examples of this occurrence among Matsumoto algebras [M] associated to a subshift. Our assumption of orthogonality introduces a certain trivialization to the classical situation.
In fact, in this case the algebra of continuous functions on the one-sided subshift {xj } , together with the endomorphism induced by the left shift epimorphism, sits naturally inside the noncommutative dynamical system (A, σ{xj } ). Therefore monotonicity of topological entropy implies that the topological entropy of {xj } is ≤ the topological entropy of the noncommutative subshift σ{xj } , thus leading to the equality, see Theorem 7.7. We discuss new examples of this occurrence among the Matsumoto algebras associated to certain non finite type subshifts. In Sect. 8 we investigate the CNT dynamical entropy of σ{xj } . We show that, under the orthogonality assumption, if φ is a σ{xj } -invariant state of A centralized by C({xj } ) then hφ (σ{xj } ) ≥ hµ ({xj } ), where µ is the shift-invariant probability measure on {xj } obtained restricting φ. We also find a condition on σ{xj } under which any such µ arises as the restriction of some φ. This enables us to obtain a variational principle for certain systems (A, σ{xj } ) for which ht({φxi ,xj }) = 0. More precisely, our variational principle asserts, for those systems, the existence of σ{xj } -invariant states of A with respect to which the CNT dynamical entropy equals the Voiculescu topological entropy of σ{xj } . In the last section we establish a closer relationship between KMS states and states with maximal entropy. The main point is that a KMS state ω is to be understood as a quasiinvariant measure for the noncommutative shift σ{xj } , as ω ◦ σ{xj } and ω are equivalent. In classical ergodic theory, measures with this property are called conformal, and play an important role, as they lead to measures with maximal entropy. We thus show an


explicit general way of constructing σ{xj } -invariant measures from KMS states. We then consider basic examples, which we may think of as being noncommutative Markov shifts: systems (A, γ ) containing some Cuntz–Krieger algebra OA in a way that γ restricts to the canonical gauge action on OA . We show that the σ{xj } -invariant state φ previously derived from a KMS state with maximal entropy restricts, on the algebra of continuous functions on the classical Markov subshift A , to the unique invariant measure µ with maximal entropy. We thus conclude that if ht({φxi ,xj }) = 0, hµ (A ) = hφ (σ{xj } ) = htop (σ{xj } ) = htop (A ) = log(r(A)). This yields a generalization of Choda’s result [Ch] to the Cuntz–Krieger algebras, and Matsumoto algebras associated to certain non-finite type subshifts.
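On the purely classical side, the invariant measure with maximal entropy of a Markov subshift is the Parry measure, built from the Perron eigenvalue and the left and right Perron eigenvectors of the transition matrix, and its entropy is log(r(A)). The sketch below checks this numerically for one small matrix. It illustrates only the commutative statement and none of the noncommutative constructions of the paper. The example matrix, the function names, and the use of plain power iteration (which assumes the matrix is primitive) are choices made for this illustration.

```python
import math


def perron(A, iters=500):
    """Power iteration: returns (r, v) with A v ~ r v, v > 0, max(v) = 1.
    Assumes A is primitive (irreducible and aperiodic)."""
    n = len(A)
    v = [1.0] * n
    r = 1.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        r = max(w)
        v = [x / r for x in w]
    return r, v


def parry_entropy(A):
    """Entropy of the Parry measure of the Markov shift defined by A,
    together with the Perron eigenvalue r(A)."""
    n = len(A)
    r, v = perron(A)                                   # right eigenvector
    At = [[A[j][i] for j in range(n)] for i in range(n)]
    _, u = perron(At)                                  # left eigenvector
    z = sum(u[i] * v[i] for i in range(n))
    pi = [u[i] * v[i] / z for i in range(n)]           # stationary vector
    h = 0.0
    for i in range(n):
        for j in range(n):
            if A[i][j]:
                p = A[i][j] * v[j] / (r * v[i])        # transition probability
                h -= pi[i] * p * math.log(p)
    return h, r


# A primitive, non-symmetric 0-1 matrix with spectral radius 2.
EXAMPLE = [[1, 1, 0], [0, 0, 1], [1, 1, 1]]

if __name__ == "__main__":
    h, r = parry_entropy(EXAMPLE)
    print(h, math.log(r))
```

The characteristic polynomial of the example matrix is −λ²(λ − 2), so r(A) = 2 and the entropy should come out as log 2.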

1. The Scaling Property

Recall that a state ω over a C∗-algebra A endowed with a one-parameter automorphism group α is called a KMS state at inverse temperature β ∈ R if

$$\omega(a\,\alpha_{i\beta}(b)) = \omega(ba), \tag{1.1}$$

for all a, b in a dense α-invariant ∗-subalgebra of $A_\alpha$, the set of entire elements for α (which is in fact a dense ∗-subalgebra). We will only consider 2π-periodic one-parameter groups, i.e. groups for which α comes from an action γ of T by $\alpha_t := \gamma_{e^{it}}$. Furthermore, in view of applications to the algebras generated by Hilbert bimodules, we will assume that A is unital and that the group action is full, in the sense explained in the introduction. Then we note that the spectral subspace $A_k$, for positive k, is in fact the linear span of the product set of k copies of $A_1$. Moreover, by definition of the full C∗-dynamical system, there exist, for all n ∈ N, finite subsets $\{y_i\}$ and $\{x_j\}$ of $A_n$ such that $\sum_i y_i^* y_i = I$ and $\sum_j x_j x_j^* = I$. We define correspondingly, for any tracial state τ on $A_0$,

$$\delta_n(\tau) := \tau\Bigl(\sum_i y_i y_i^*\Bigr), \qquad \Delta_n(\tau) := \tau\Bigl(\sum_j x_j^* x_j\Bigr).$$

We shall usually write δ(τ) and Δ(τ) for $\delta_1(\tau)$ and $\Delta_1(\tau)$ respectively.

Lemma 1.1. Let (A, γ, T) be a full C∗-dynamical system, with A unital, and let τ be a tracial state on $A_0$. Then $\delta_n(\tau)$ and $\Delta_n(\tau)$ do not depend on the finite subsets $\{y_i\}$ and $\{x_j\}$ of $A_n$ satisfying the above relations. If in particular $\{y_i\}, \{x_j\} \subset A_1$, one has, for all n ∈ N,

$$\Bigl\|\sum (x_{j_1} \cdots x_{j_n})^* x_{j_1} \cdots x_{j_n}\Bigr\|^{-1/n} \le \delta_n(\tau)^{1/n} \le \Bigl\|\sum y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^*\Bigr\|^{1/n},$$

$$\Bigl\|\sum y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^*\Bigr\|^{-1/n} \le \Delta_n(\tau)^{1/n} \le \Bigl\|\sum (x_{j_1} \cdots x_{j_n})^* x_{j_1} \cdots x_{j_n}\Bigr\|^{1/n}.$$


Proof. We shall only prove the statements relative to $\{y_i\}$; those relative to $\{x_j\}$ can be proved similarly. Let $\{z_1, \dots, z_q\} \subset A_n$ be another multiplet satisfying $\sum_k z_k^* z_k = I$, and write $z_k = \sum_i a_{k,i} y_i$, where $a_{k,i} := z_k y_i^* \in A_0$. Then

$$\tau\Bigl(\sum_k z_k z_k^*\Bigr) = \tau\Bigl(\sum_{k,i,j} a_{k,i}\, y_i y_j^*\, a_{k,j}^*\Bigr) = \tau\Bigl(\sum_{k,i,j} y_i y_j^*\, a_{k,j}^* a_{k,i}\Bigr) = \tau\Bigl(\sum_{i,j,k} y_i y_j^* y_j z_k^* z_k y_i^*\Bigr) = \tau\Bigl(\sum_i y_i y_i^*\Bigr).$$

Note that $\delta_n(\tau) \le \|\sum_i y_i y_i^*\|$. Furthermore

$$1 = \tau\Bigl(\sum_{i,j} x_j y_i^* y_i x_j^*\Bigr) = \tau\Bigl(\sum_{i,j} y_i x_j^* x_j y_i^*\Bigr) \le \Bigl\|\sum_j x_j^* x_j\Bigr\|\, \tau\Bigl(\sum_i y_i y_i^*\Bigr).$$

The conclusion follows choosing subsets in $A_n$ of the form $\{y_{i_1} \cdots y_{i_n}\}$ and $\{x_{j_1} \cdots x_{j_n}\}$, where $\{y_i\}, \{x_j\} \subset A_1$ and satisfy $\sum_i y_i^* y_i = I$ and $\sum_j x_j x_j^* = I$. □

Let, for λ > 0, $TS_\lambda$ be the set of tracial states τ on $A_0$ for which

$$\lambda\,\tau(x y^*) = \tau(y^* x), \quad x, y \in A_1. \tag{1.2}$$

We shall show that a state of A satisfies the KMS condition w.r.t. α if and only if its restriction to $A_0$ is an element of some $TS_\lambda$. We start with the following characterization of $TS_\lambda$.

Lemma 1.2. Let (A, γ, T) be a full C∗-dynamical system over a unital C∗-algebra. For a tracial state τ on $A_0$ and λ > 0, the following conditions are equivalent:
(1) $\tau \in TS_\lambda$,
(2) $\tau\bigl(\sum_i y_i a y_i^*\bigr) = \lambda^{-1}\tau(a)$, $a \in A_0$,
(3) $\tau\bigl(\sum_j x_j^* a x_j\bigr) = \lambda\,\tau(a)$, $a \in A_0$.
Here $\{y_i\}$ and $\{x_j\}$ are finite subsets of $A_1$ satisfying respectively

$$\sum_i y_i^* y_i = I, \qquad \sum_j x_j x_j^* = I.$$

λ is uniquely determined by τ: $\lambda = \Delta(\tau) = \delta(\tau)^{-1}$.

Proof. (1) → (2) and (1) → (3) are obvious. We show that (2) → (1): for $x, y \in A_1$, $y^* x \in A_0$, so

$$\tau(y^* x) = \lambda\,\tau\Bigl(\sum_i y_i y^* x y_i^*\Bigr) = \lambda\,\tau\Bigl(\sum_i x y_i^* y_i y^*\Bigr) = \lambda\,\tau(x y^*).$$

One similarly proves that (3) → (1). □

The following result characterizes the set of tracial states on $A_0$ which give rise to KMS states for (A, γ). Let $F_0 : A \to A_0$ denote the projection onto the fixed point algebra obtained averaging over the circle group action.

Proposition 1.3. The maps $\omega \mapsto \omega|_{A_0}$, $\tau \mapsto \tau \circ F_0$ set up a bijective correspondence between the set of KMS states ω for (A, γ) at inverse temperature β and the set $TS_{e^\beta}$.


Proof. If ω is a KMS state at inverse temperature β then the KMS condition (1.1) can be formulated, equivalently, for any pair x, y in the dense linear span of the $A_k$'s. Therefore the restriction τ of ω to $A_0$ is a tracial state such that $\omega = \tau \circ F_0$ since ω is γ-invariant. Furthermore, if $x, y \in A_1$,

$$\omega(y^* x) = \omega(x\,\alpha_{i\beta}(y^*)) = e^{\beta}\,\omega(x y^*),$$

therefore $\tau \in TS_{e^\beta}$. Conversely, if this condition is satisfied by some tracial state τ on $A_0$, then $\omega := \tau \circ F_0$ is an extension of τ to a state on A, and it is not difficult to check that $\omega(y^* x) = \omega(x\,\alpha_{i\beta}(y^*))$ for $x, y \in A_1$, and hence inductively for $x, y \in A_1 \cdots A_1 = A_k$. If $x \in A_k$, $y \in A_h$, $h \ne k$, then $\omega(y^* x) = 0 = \omega(x\,\alpha_{i\beta}(y^*))$, and the proof is complete. □

Remark. A simple argument shows that the sequences $a_n := \inf\{\Delta_n(\tau),\ \tau \in TS(A_0)\}$ and $b_n := \sup\{\Delta_n(\tau),\ \tau \in TS(A_0)\}$ are respectively supermultiplicative and submultiplicative, therefore the sequences $a_n^{1/n}$, $b_n^{1/n}$ converge, and $\lim_n a_n^{1/n} = \sup_n a_n^{1/n}$ and $\lim_n b_n^{1/n} = \inf_n b_n^{1/n}$.

Corollary 1.4. If
(1) $\lim_n \inf\{\Delta_n(\tau),\ \tau \in TS(A_0)\}^{1/n} > 1$ (e.g. $\inf\{\Delta(\tau),\ \tau \in TS(A_0)\} > 1$) then every KMS state on (A, γ) has positive inverse temperature,
(2) $\lim_n \sup\{\Delta_n(\tau),\ \tau \in TS(A_0)\}^{1/n} < 1$ (e.g. $\sup\{\Delta(\tau),\ \tau \in TS(A_0)\} < 1$) then every KMS state on (A, γ) has negative inverse temperature,
(3) $\lim_n \inf\{\Delta_n(\tau),\ \tau \in TS(A_0)\}^{1/n} = \lim_n \sup\{\Delta_n(\tau),\ \tau \in TS(A_0)\}^{1/n} = 1$ (e.g. Δ(τ) = 1 for all $\tau \in TS(A_0)$) then every KMS state on (A, γ) is tracial.

We next give a criterion of faithfulness for KMS states.

Proposition 1.5. Let (A, γ) be a full periodic C∗-dynamical system with A unital, and consider the ∗-monomorphism $\alpha : A_0 \to M_p(A_0)$ associated to a set $\{y_i\}_{i=1}^p$ of $A_1$ such that $\sum_{i=1}^p y_i^* y_i = I$ and defined by $\alpha(a) = (y_i a y_j^*)$. If $A_0$ has no proper closed ideal I such that $\alpha(A_0) \cap M_p(I) = \alpha(I)$ (e.g. A is simple), then any KMS state of (A, γ) is faithful.

Proof. Let τ be the restriction of a KMS state ω to $A_0$. Then $I := \{a \in A_0 : \tau(a^* a) = 0\}$ is a closed ideal of $A_0$. Since for $x, y \in A_1$, $a \in A_0$,

$$(x a y^*)^* (x a y^*) \le \|x\|^2\, y a^* a y^*,$$

we have, by (1.2), that $x a y^* \in I$ if $a \in I$. This yields $\alpha(I) \subset M_p(I) \cap \alpha(A_0)$. We show the reverse inclusion. Let $a \in A_0$ be such that $y_i a y_j^* \in I$, i, j = 1, . . . , p. Then

$$\delta(\tau)\,\tau(a^* y_i^* y_i a\, y_j^* y_j) = \tau\bigl((y_i a y_j^*)^*\, y_i a y_j^*\bigr) = 0.$$

Hence, summing up, we see that $a \in I$, and this shows that $\alpha(A_0) \cap M_p(I) \subset \alpha(I)$. It follows from our assumption that I = {0}, as, clearly, $I \ne A_0$. Now the canonical conditional expectation $F_0 : A \to A_0$ is faithful, so ω is faithful. □


C. Pinzari, Y. Watatani, K. Yonetani

We conclude this section by recalling from [GP] a criterion for pure infinity of unital $C^*$-algebras, which we shall need in the sequel. We omit the proof, and only point out that the arguments essentially go back to Cuntz' proof of pure infinity of $O_d$ [C]. The result is also a generalization of Rørdam's result (cf. [R]) about pure infinity of crossed products by proper corner endomorphisms. Following [R], we say that a $C^*$-algebra $B$ has the comparability property if $B$ has at least one tracial state, and furthermore a projection $e \in B$ is equivalent to a subprojection of $f$ if $\tau(e) < \tau(f)$ for all tracial states $\tau$ of $B$. Assume that our $C^*$-algebra $A$ has a nonunitary isometry $S$ in some $A_n$, $n > 0$, and also that the fixed point algebra $A_0$ has the comparability property. Then for all tracial states $\tau$ on $A_0$ one has, by Lemma 1.1,
\[ \delta_n(\tau)\ (= \tau(SS^*)) < 1. \tag{1.3} \]

(Note that, by Corollary 1.4 and the following Proposition 2.2, all KMS states have positive inverse temperatures.) It is then natural to ask under which conditions (1.3) guarantees the existence of a nonunitary isometry in $A_n$, or, better, pure infinity of $A$.

Theorem 1.6. [GP] Let $(A, \gamma, \mathbb{T})$ be a full $C^*$-dynamical system, and assume that $A_0$ is unital, simple, separable, of real rank zero, and that every $M_n(A_0)$ has the comparability property. If, for some $n > 0$, $\sup\{\delta_n(\tau),\ \tau \in \mathcal{TS}(A_0)\} < 1$, then $A_n$ contains a nonunitary isometry. Furthermore, $A$ is simple and purely infinite.

2. A Perron–Frobenius Theorem

We associate to each pair of finite subsets $\{y_i\}, \{x_j\} \subset A_1$ satisfying
\[ \sum_i y_i^* y_i = I, \qquad \sum_j x_j x_j^* = I, \]
a corresponding pair of completely positive maps $T = T_{\{y_i\}}$ and $S = S_{\{x_j\}}$ on the homogeneous subalgebra $A_0$:
\[ T(a) := \sum_i y_i a y_i^*, \quad a \in A_0, \qquad S(a) := \sum_j x_j^* a x_j, \quad a \in A_0. \]
Let $T', S' : A_0^* \to A_0^*$ denote the Banach space adjoints of $T$ and $S$ respectively, and let $\mathcal{T}(A_0) \subset A_0^*$ be the Banach subspace of trace functionals. Then one has the following result.

Proposition 2.1. For any finite subset $\{y_i\}$ (resp. $\{x_j\}$) of $A_1$ satisfying $\sum_i y_i^* y_i = I$ (resp. $\sum_j x_j x_j^* = I$) the associated operator $T'$ (resp. $S'$) leaves $\mathcal{T}(A_0)$ stable. Let $t'$ (resp. $s'$) be the restriction of $T'$ (resp. $S'$) to $\mathcal{T}(A_0)$. Then $t'$ (resp. $s'$) does not depend on the set $\{y_i\}$ (resp. $\{x_j\}$) satisfying $\sum_i y_i^* y_i = I$ (resp. $\sum_j x_j x_j^* = I$). Furthermore $t'$ and $s'$ are inverses of one another.

KMS States, Entropy and the Variational Principle


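Proof. It is easy to check that $T'$ transforms trace functionals into trace functionals. Let $\{y'_k\}$ be another finite subset of $A_1$ such that $\sum_k {y'_k}^{*} y'_k = I$, and write $y'_k = \sum_i a_{k,i} y_i$, with $a_{k,i} = y'_k y_i^* \in A_0$. Then for any $\tau \in \mathcal{T}(A_0)$, $a \in A_0$,
\[
\begin{aligned}
\tau\Big(\sum_k y'_k\, a\, {y'_k}^{*}\Big) &= \sum_{k,i,j} \tau(a_{k,i}\, y_i a y_j^*\, a_{k,j}^*) = \sum_{i,j,k} \tau(y_i a y_j^*\, a_{k,j}^* a_{k,i}) \\
&= \sum_{i,j} \tau(y_i a y_j^*\, y_j y_i^*) = \tau\Big(\sum_i y_i a y_i^*\Big).
\end{aligned}
\]
Finally note that
\[ t' s'(\tau) = s'(\tau) \circ T = \tau \circ ST = \tau, \qquad \tau \in \mathcal{T}(A_0), \]
by the trace property of $\tau$. Likewise,
\[ s' t'(\tau) = \tau, \qquad \tau \in \mathcal{T}(A_0). \]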

Note that KMS states of (A, γ ) correspond precisely to the tracial state eigenvectors for s (or t ). The following result, which has its own interest, explains why δ(τ ) = '(τ )−1 when τ corresponds to a KMS state. Proposition 2.2. The map h : T S(A0 ) → T S(A0 ) taking a tracial state τ to h(τ ) = δ(τ )−1 t (τ ) is a homeomorphism of T S(A0 ) endowed with the weak∗ -topology. KMS states of (A, γ ) correspond, as in Prop. 1.3, to fixed points of h. The inverse of h is the map k(τ ) = '(τ )−1 s (τ ). We have: 'n (hn (τ )) = δn (τ )−1 , δn (k n (τ )) = 'n (τ )−1 , thus if τ ∈ T (A0 ) is the restriction of a KMS state, '(τ ) =

1/δ(τ ).

Proof. Clearly $h(\tau)$ is a tracial state when $\tau$ is. Furthermore $\tau \mapsto \delta(\tau)$ is a positive valued continuous function on a compact set, hence bounded away from zero, and therefore $h$ is continuous. For the same reason $k$ is continuous on $\mathcal{TS}(A_0)$, and, by the trace property, $h$ and $k$ are inverses of one another.

Our next aim is to look more closely at the spectrum $\sigma(s')$ of $s'$. The previous proposition shows that $0 \notin \sigma(s') = \sigma(t')^{-1}$. So we can define the inner and outer spectral radius of $s'$:
\[ r_{\max}(s') := \max\{|\lambda|,\ \lambda \in \sigma(s')\}, \qquad r_{\min}(s') := r_{\max}(t')^{-1} = \min\{|\lambda|,\ \lambda \in \sigma(s')\}. \]
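The relation between the inner radius of an invertible operator and the outer radius of its inverse can be checked in a finite-dimensional toy case. The sketch below is only an illustration: the 2×2 matrix `s`, its inverse `t` and all helper names are hypothetical stand-ins, not objects from the text.

```python
import cmath

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + root) / 2, (trace - root) / 2]

def inverse_2x2(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def outer_radius(m):
    """Largest modulus of an eigenvalue (outer spectral radius)."""
    return max(abs(z) for z in eigenvalues_2x2(m))

def inner_radius(m):
    """Smallest modulus of an eigenvalue (inner spectral radius)."""
    return min(abs(z) for z in eigenvalues_2x2(m))

s = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical invertible stand-in for the operator
t = inverse_2x2(s)             # plays the role of its inverse

# The inner radius of s equals the reciprocal of the outer radius of t.
print(inner_radius(s), 1.0 / outer_radius(t))
```

In this toy case both printed numbers agree up to rounding, which is the finite-dimensional shadow of the definition of the inner radius through the inverse operator.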

We give some estimates for rmin (s ) and rmax (s ).


Proposition 2.3. Let $\{y_i\}, \{x_j\} \subset A_1$ satisfy $\sum_i y_i^* y_i = I$ and $\sum_j x_j x_j^* = I$. Then one has
\[ r_{\min}(s') \ge \lim_n \Big\| \sum_{i_1, \dots, i_n} y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^* \Big\|^{-1/n}, \]
\[ r_{\max}(s') \le \lim_n \Big\| \sum_{j_1, \dots, j_n} (x_{j_1} \cdots x_{j_n})^* x_{j_1} \cdots x_{j_n} \Big\|^{1/n}. \]

Proof. Note that
\[ T(a)^* T(a) \le \Big\| \sum_i y_i y_i^* \Big\|\, T(a^* a), \qquad a \in A_0, \]
which, together with $T(I) = \sum_i y_i y_i^*$, implies $\|T\| = \|\sum_i y_i y_i^*\|$, hence inductively
\[ \|T^n\| = \Big\| \sum_{i_1, \dots, i_n} y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^* \Big\|. \]
Taking the $n$th root and passing to the limit, one gets the spectral radius of $T$:
\[ r(T) = \lim_n \Big\| \sum_{i_1, \dots, i_n} y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^* \Big\|^{1/n}. \]
Similarly, one has
\[ r(S) = \lim_n \Big\| \sum_{j_1, \dots, j_n} (x_{j_1} \cdots x_{j_n})^* x_{j_1} \cdots x_{j_n} \Big\|^{1/n}. \]
The proof is completed by recalling that $s'$ is the restriction of $S'$ to a closed subspace, so $r_{\max}(s') \le r(S') = r(S)$, and similarly $r_{\min}(s') = r_{\max}(t')^{-1} \ge r(T)^{-1}$.
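The spectral radius formula used in this proof, the limit of the $n$th roots of the norms of the powers, can be watched converging in a finite-dimensional toy case. The following is a minimal sketch under stated assumptions: the 0-1 matrix `A` is a hypothetical stand-in for the completely positive map, and the max row-sum norm is chosen only because it is easy to compute exactly.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def row_sum_norm(a):
    """Operator norm induced by the sup norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)

def radius_estimates(a, steps):
    """Return the numbers ||a^n||^(1/n) for n = 1, ..., steps."""
    power, out = a, []
    for n in range(1, steps + 1):
        out.append(math.exp(math.log(row_sum_norm(power)) / n))
        power = mat_mul(power, a)
    return out

A = [[1, 1], [1, 0]]                 # hypothetical example matrix
estimates = radius_estimates(A, 200)
golden = (1 + math.sqrt(5)) / 2      # exact spectral radius of A

print(estimates[0], estimates[-1], golden)
```

The estimates stay above the true spectral radius and approach it slowly, which is why the bounds of the proposition are stated as limits rather than at a fixed $n$.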

We next show that the inner and outer spectral radii of $s'$ correspond to inverse temperatures of KMS states, or, in other words, that they are in the point spectrum of $s'$, with corresponding positive eigenvectors. The fact that the outer spectral radius is in the point spectrum was first proved in [MWY] for the Matsumoto $C^*$-algebras associated with subshifts [M]. The key point in our situation is that one needs to consider the trace functionals of $A_0$ endowed with the order structure which arises when we extend such traces to normal traces on the enveloping von Neumann algebra. We anticipate the following, possibly known, lemma.

Lemma 2.4. Let $\sum_n \varphi_n$ be a series of normal linear functionals on a von Neumann algebra $M$ weakly convergent to $\varphi$. If each of the absolute values $|\varphi_n|$ is tracial then
\[ |\varphi| \le \sum_n |\varphi_n|. \]


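Proof. Let $\tau$ be a positive tracial linear functional, then $|\tau(xy)| \le \|y\|\,\tau(|x|)$ (see for example [T]). Consider the polar decompositions $\varphi_n(x) = |\varphi_n|(x u_n)$, $\varphi(x) = |\varphi|(xu)$. Then, for a positive $x$, we have
\[
\begin{aligned}
|\varphi|(x) = \varphi(xu^*) &= \sum_n \varphi_n(xu^*) = \sum_n |\varphi_n|(x u^* u_n) \\
&\le \sum_n \big|\, |\varphi_n|(x u^* u_n) \big| \le \sum_n |\varphi_n|(x).
\end{aligned}
\]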

We are now in the position of proving our main result of this section.

Theorem 2.5. Let $(A, \gamma, \mathbb{T})$ be a full $C^*$-dynamical system, and assume that $A$ is unital, and that $A_0$ has a tracial state. Then $r_{\min}(s')$ and $r_{\max}(s')$ are eigenvalues of $s'$ with corresponding tracial state eigenvectors.

Proof. We first show that $r_{\max}(s')$ is a spectral value for $s'$ and then that it is in fact an eigenvalue with a tracial state eigenvector. A similar argument will prove that $r(t')$ is an eigenvalue for $t' = (s')^{-1}$ with a tracial state eigenvector. By the uniform boundedness theorem, there exists a sequence $\{z_n\}$ of complex numbers such that $|z_n| \to r_{\max}(s')^+$ and $\|R(z_n)\tau_0\| \to \infty$ for some $\tau_0 \in \mathcal{T}(A_0)$, where $R(z)$ is the resolvent of $s'$ in $z$. Since $\mathcal{T}(A_0)$ is linearly spanned by its tracial states, we may assume that $\tau_0$ is a tracial state. Consider, for $|z| > r_{\max}(s')$, the Neumann series:
\[ R(z) = \sum_{k=0}^{\infty} z^{-(k+1)} (s')^k. \tag{2.1} \]
By the previous lemma, on the enveloping von Neumann algebra of $A_0$,
\[ |R(z)\tau_0| \le \sum_{k=0}^{\infty} |z|^{-(k+1)} (s')^k(\tau_0) = R(|z|)\tau_0, \]
so $\|R(|z_n|)\tau_0\| \to \infty$, and this shows that $r_{\max}(s') \in \sigma(s')$. Equation (2.1) also shows that $R(\lambda)\tau_0$ is a nonzero positive functional for $\lambda > r_{\max}(s')$; hence arguments similar to those of Lemma 3.1 in [MWY] prove that
\[ \tau_n := \frac{1}{\|R(|z_n|)\tau_0\|}\, R(|z_n|)\tau_0 \]
is a sequence of tracial states such that every weak$^*$-limit point of it is a tracial state eigenvector with eigenvalue $r_{\max}(s')$.

The previous theorem can be considered as an analogue of the Perron–Frobenius theorem for matrices with nonnegative entries.
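The mechanism of the proof, normalizing the resolvent applied to a positive vector as the spectral parameter decreases to the spectral radius, can be imitated for a nonnegative matrix. The sketch below is a finite-dimensional analogue only: the matrix `M`, the starting vector and the truncation length are hypothetical choices, and the sum of the entries plays the role of the norm of a positive functional.

```python
import math

def mat_vec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def normalized_resolvent(m, v0, lam, terms=20000):
    """Truncated Neumann series sum_k lam^-(k+1) m^k v0, scaled to total mass 1."""
    term = [x / lam for x in v0]
    acc = [0.0] * len(v0)
    for _ in range(terms):
        acc = [a + t for a, t in zip(acc, term)]
        term = [x / lam for x in mat_vec(m, term)]
    total = sum(acc)
    return [a / total for a in acc]

golden = (1 + math.sqrt(5)) / 2       # spectral radius of M below
M = [[1, 1], [1, 0]]                  # hypothetical nonnegative matrix

# Spectral parameter slightly above the spectral radius.
w = normalized_resolvent(M, [1.0, 0.0], lam=golden * 1.001)
perron = [golden / (golden + 1), 1 / (golden + 1)]   # normalized Perron vector

print(w, perron)
```

As the parameter `lam` is moved closer to the spectral radius (with more terms kept), the normalized vector `w` approaches the positive eigenvector, mirroring the limit-point argument above.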


Corollary 2.6. Let (α, R) be a 2π -periodic one-parameter automorphism group of a unital C ∗ -algebra A, such that the induced T-action γ is full. If s is defined as above, relative to γ , then the set of inverse temperatures of KMS states is a closed subset of the interval [log(rmin (s )), log(rmax (s ))] containing the extreme points. Proof. The subset of T S(A0 ) corresponding to KMS states is weakly∗ -compact by Prop. 2.2; furthermore the map ' : T S(A0 ) → R+ defined at the beginning of Sect. 1 is weakly∗ -continuous. It follows that the set of elements of the form log('(τ )), when τ ranges over all tracial states on A0 corresponding to KMS states, is compact. Now this set is precisely the set of possible inverse temperatures by Prop. 1.3. The rest follows from the previous theorem. A KMS state of (A, α) at inverse temperature βmin := log(rmin (s )) or

βmax := log(rmax (s ))

will be called extremal. Let β be the inverse temperature of a KMS state, and set, as in the previous section, an = inf{'n (τ ), τ ∈ T S(A0 )}, bn = sup{'n (τ ), τ ∈ T S(A0 )}. Then for all n, an 1/n ≤ eβ ≤ bn 1/n , so lim 1/n log(an ) ≤ β ≤ lim 1/n log(bn ). n

n

It is then natural to ask for which tracial states τ , the sequence 1/n log('n (τ )) approximates the maximal or the minimal inverse temperature. In Sect. 5 we shall give a sufficient condition. We conclude this section with the discussion of two examples known in the literature. The first example, arising from ergodic theory, shows that in general, at a fixed inverse temperature, there may be more than one KMS state. Example 2.7. Let (X, T ) be a topological dynamical system: X is a compact metric space endowed with a homeomorphism T . We suppose that X is not a finite set. Then it is well known that the C ∗ -algebra A = C(X)αT Z is simple if and only if T is minimal, i.e. there is no nontrivial closed subset F ⊂ X such that T (F ) = F . Here αT is the automorphism of C(X) defined by αT (f ) = f ◦ T −1 . Tracial states on C(X) αT Z are in one-to-one correspondence with T -invariant probability measures on X, while there is no nontracial KMS state on (A, γ ). The operator s therefore has spectrum contained in the unit circle. However, s is the Banach space adjoint of αT , so its spectrum is the same as that of αT which must be equal to T by simplicity of A [OP]. There is an important example, due to Furstenberg, of a minimal analytic diffeomorphism T of T2 with nonunique invariant measures (see, e.g, [Ma]), which thus leads to an example of nonuniqueness of tracial states on the simple crossed product C ∗ -algebra C(T2 ) αT Z. The next example shows that the set of inverse temperatures can in general be an arbitrary closed subset of R.


Example 2.8. In [BEH] Bratteli, Elliott and Herman construct an example of a simple $C^*$-algebra $B$ endowed with a $\mathbb{T}$-action for which the set of possible inverse temperatures can be an arbitrary closed subset $F$ of $\mathbb{R} \cup \{+\infty, -\infty\}$. For each temperature the corresponding state is unique. In more detail, $B$ is obtained by cutting down the crossed product $A \rtimes_\alpha \mathbb{Z}$ of an AF-algebra by some projection $P$ in $A$. If neither $+\infty$ nor $-\infty$ belongs to $F$, $A$ itself is simple, and this implies that the $\mathbb{T}$-action is full since $P$ is a full projection. If moreover $F \subset (0, +\infty)$, one can choose $\alpha$ so that $\alpha(P) < P$, see [BEH], hence $\alpha(B^0) \subset B^0$. Then $\rho := \alpha|_{B^0}$ is a proper corner endomorphism of $B^0$, and one has $B = B^0 \rtimes_\rho \mathbb{N}$. Now by a result of Rørdam [R], $B$ is purely infinite.

3. KMS States of the Pimsner Algebras

In this section we discuss an application of the results of the previous section to the $C^*$-algebra $O_X$ associated to a Hilbert $C^*$-bimodule $X$ over a $C^*$-algebra $B$. We refer the reader to [P] for the construction of $O_X$. We just recall that $X$ and $B$ embed isometrically in $O_X$, respectively as a Hilbert bimodule and as a $C^*$-subalgebra. We shall always assume that $X$ is finite projective and full, and that $B$ is unital. Therefore any finite basis $\{x_j\}$ of $X$ yields, in $O_X$, the relation
\[ \sum_j x_j x_j^* = I. \]
Furthermore any finite subset $\{y_i\}$ of $X$ such that $\sum_i \langle y_i, y_i \rangle = I$ yields in $O_X$:
\[ \sum_i y_i^* y_i = I. \]

$O_X$ is endowed with a canonical gauge action $\gamma$ such that $\gamma_z(x) = zx$, $z \in \mathbb{T}$, $x \in X$. Therefore we can conclude that $(O_X, \gamma)$ is a full periodic $C^*$-dynamical system. We start by proving that systems of this form are typical examples, in the sense that we can always easily associate to any unital, full, periodic $C^*$-dynamical system $(A, \gamma)$ a finite projective Hilbert $C^*$-bimodule $X$ such that $A = O_X$ and $\gamma$ is the canonical $\mathbb{T}$-action. We should note, however, that the Hilbert bimodule $X$ and its coefficient $C^*$-algebra are, in general, not unique. In fact our construction leads to a maximal Hilbert bimodule. In applications, it may be more convenient to start with smaller Hilbert bimodules. The case of Cuntz–Krieger algebras discussed in [P] is well known, where the coefficient algebra is finite-dimensional. In Sect. 4 we shall discuss the more general situation of Matsumoto algebras, and we will construct natural minimal generating Hilbert bimodules.

Theorem 3.1. Let $(A, \gamma)$ be a full $C^*$-dynamical system over $\mathbb{T}$ and assume that $A$ is unital. Then there exists a full finite projective Hilbert $C^*$-bimodule $X$ over a unital $C^*$-algebra $B$ such that $(A, \gamma)$ can be identified with $O_X$, endowed with its canonical gauge action.

Proof. Choose finite subsets $\{y_i\}$ and $\{x_j\}$ of $A_1$ such that $\sum_i y_i^* y_i = I$ and $\sum_j x_j x_j^* = I$. Let $B$ be the fixed point algebra $A_0$. Set $X = \sum_j x_j B$. Then $X$ is a right Hilbert $B$-module in $A$ with $B$-valued inner product $\langle x, y \rangle_B = x^* y$. The condition $\sum_j x_j x_j^* = I$ shows that $\{x_j\}$ is a finite basis of $X$, and $X$ is full by $\sum_i y_i^* y_i = I$. The left $B$-action is given by $\varphi(b)x = bx$ for $b \in B$ and $x \in X$. For $x = \sum_j x_j b_j$, we have $bx = \sum_j \sum_i x_i x_i^* b x_j b_j \in X$. Thus $\varphi$ is well defined. Let $\hat{x}$ be the image of an element $x \in X$ in $O_X$. By the universality of the Pimsner algebras, there exists a surjective $*$-homomorphism $\phi : O_X \to A$ such that $\phi(\hat{x}_j) = x_j$ [P]. Let $F_0 : A \to A_0$ and $E : O_X \to O_X^0$ be the natural conditional expectations. Since $\phi E = F_0 \phi$ and $E$ is faithful, $\phi$ is injective by a well known argument. Thus $\phi$ is the desired isomorphism.

A KMS state on $(O_X, \gamma)$ at some inverse temperature $\log(\delta) \in \mathbb{R}$ restricts to a tracial state $\tau$ on the coefficient algebra $B$ satisfying, for $a \in B$, $\tau(\sum_i \langle x_i, a x_i \rangle) = \delta\tau(a)$, with $\{x_i\}$ a right basis of $X$. We show that conversely any such trace extends to a KMS state.

Lemma 3.2. Let $\{x_1, \dots, x_d\}$ be a basis of $X$. Any tracial state $\tau$ on the coefficient algebra $B$ satisfying
\[ \tau\Big(\sum_i \langle x_i, a x_i \rangle\Big) = \delta\tau(a), \qquad a \in B, \]

with $\delta > 0$, extends uniquely to a KMS state on $O_X$ at inverse temperature $\log(\delta)$. This extension is faithful if $\tau$ is faithful.

Proof. We first prove uniqueness. A KMS state for $(O_X, \gamma)$ is determined by its restriction to the homogeneous $C^*$-subalgebra, and, by the trace-scaling property of the operator $S_{\{x_j\}}$ on that subalgebra, it is in fact determined by its restriction to the coefficient algebra $B$. Conversely, if one is given a tracial state $\tau_0$ on $B$ as required then it is easy to check that
\[ \tau_n(a) := \frac{1}{\delta}\, \tau_{n-1}\Big(\sum_i x_i^* a x_i\Big), \qquad a \in L(X^n),\ n \ge 1, \]
is a sequence of tracial states (faithful if $\tau_0$ is faithful) such that $\tau_{n+1}|_{L(X^n)} = \tau_n$, which thus gives rise to a tracial state $\tau$ on the homogeneous subalgebra positively scaled by $S_{\{x_j\}}$. Therefore $\tau$ extends to a KMS state on $O_X$ at inverse temperature $\log(\delta)$.

We now apply the results of the previous section to the Pimsner $C^*$-algebras.

Theorem 3.3. Let $B$ be a unital $C^*$-algebra with a tracial state, and let $X$ be a full finite projective Hilbert $C^*$-bimodule over $B$. Assume that for every tracial state $\tau$ on $B$, and any basis $\{x_i\}$ of $X$, $\tau(\sum_i \langle x_i, x_i \rangle) > 0$. Then $O_X$ has a KMS state.
(1) Let $s' : \mathcal{T}(O_X^0) \to \mathcal{T}(O_X^0)$ be the (invertible) operator obtained by restricting to $\mathcal{T}(O_X^0)$ the Banach space adjoint of $S(a) = \sum_i x_i^* a x_i$, where $\{x_i\}$ is a basis of $X$. Then the set of possible inverse temperatures is a closed subset of $[\log(r_{\min}(s')), \log(r_{\max}(s'))]$ containing the extreme points.
(2) If $O_X$ is $\mathbb{T}$-simple, every KMS state is faithful.
(3) If $\tau(\sum_i \langle x_i, x_i \rangle) > 1$ for all $\tau \in \mathcal{TS}(B)$ then every KMS state has positive inverse temperature.
(4) If $\tau(\sum_i \langle x_i, x_i \rangle) = 1$ for all $\tau \in \mathcal{TS}(B)$ then every KMS state is a tracial state.
(5) If $\tau(\sum_i \langle x_i, x_i \rangle) < 1$ for all $\tau \in \mathcal{TS}(B)$ then every KMS state has a negative inverse temperature.
(6) If $B$ has a unique trace then $O_X^0$ has a unique trace, so $O_X$ has a unique KMS state.


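Proof. The function taking a tracial state $\tau$ to the tracial state
\[ a \in B \mapsto \tau\Big(\sum_i \langle x_i, x_i \rangle\Big)^{-1} \tau\Big(\sum_i \langle x_i, a x_i \rangle\Big) \]
is weakly$^*$-continuous, so, by the Schauder–Tychonov fixed point theorem, there is a tracial state $\tau$ on $B$ such that
\[ \tau\Big(\sum_i \langle x_i, a x_i \rangle\Big) = \tau\Big(\sum_i \langle x_i, x_i \rangle\Big)\, \tau(a), \qquad a \in B. \]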

We can now extend such a τ to a KMS state on (OX , γ ), by the previous lemma. 0 has no ideal of the kind (1) follows from Corollary 2.6. Since OX is T-simple, OX described in our faithfulness criterion, Prop. 1.5, hence (2) follows. Parts (3)–(5) follow from Corollary 1.4. Finally, if B has a unique tracial state then so does OX 0 since it is an inductive limit of C ∗ -algebras stably isomorphic to B itself. Therefore OX has a unique KMS state. 4. Examples of Full Dynamical Systems Arising from Subshifts, Self-Similar Sets and Noncommutative Metric Spaces In this section we continue our discussion of examples of full C ∗ -dynamical systems obtained via Pimsner’s construction. We start considering two different examples of Hilbert bimodules both described by families of ∗ -endomorphisms on commutative C ∗ algebras, arising respectively from symbolic dynamics and fractal geometry. We shall also discuss a generalization of the latter example to noncommutative metric spaces. 4.1. Subshifts in symbolic dynamics and Matsumoto algebras. We recall the construction of the Matsumoto algebra O6 associated with a two-sided subshift 6, [M]. Fix a finite discrete set 7 = {1, 2, . . . , d}, and let 7 Z be the infinite product space endowed with the product topology. We denote by σ the shift homeomorphism on 7 Z defined by (σ (x))i = xi+1 . For a shift-invariant closed subset 6 of 7 Z , the topological dynamical system (6, σ 6 ) is called a subshift. We denote by 6+ the set of one-sided sequences x ∈ 7 N such that x appears in 6. For example, 7 N = 7 Z + . We shall still denote by σ the left shift epimorphism of 7 N . The dynamical system (6+ , σ 6+ ) is called the one-sided subshift associated to 6. A finite sequence µ = (µ1 , . . . , µk ) of elements µj ∈ 7 is called a word. We denote by |µ| the length k of µ. For k ∈ N, let 6k = {µ|µ is a word with length k appearing in some x ∈ 6}, 6l = lk=0 6k and k 0 6∗ = ∞ k=0 6 , where 6 denotes the set constituted by the empty word. 
Let {e1 , . . . , ed } be an orthonormal basis of a d-dimensional Hilbert space H = Cd . Let F 0 be the one dimensional space C: spanned by a normalized vector :, called the vacuum vector, and let F k be the Hilbert space spanned by the vectors eµ = eµ1 ⊗ · · · ⊗ k eµk for µ = (µ1 , . . . µk ) ∈ 6k . Consider the subspace F6 = ⊕∞ k=0 F of the full Fock space of H . The creation operator Tν by eν on F6 , for ν ∈ 6∗ , is defined by eν ⊗ eµ , if νµ ∈ 6∗ Tν : = eν and Tν eµ = . 0 otherwise


The unital C ∗ -subalgebra T6 of the algebra of bounded linear operators on F6 generated by {Ti |i = 1, . . . , d} is called the Toeplitz algebra associated with 6, and contains the algebra K(F6 ) of compact operators on F6 . The Matsumoto algebra O6 associated with the subshift 6 is the quotient algebra T6 /K(F6 ). It is generated by the quotient image {Si |i = 1, . . . , d} of {Ti |i = 1, . . . , d}. The unitary representation of T on F6 defining the grading implements an automorphic action of T on T6 leaving K(F6 ) stable. We thus obtain an automorphic T-action γ on O6 such that

\[ \gamma_z(S_i) = z S_i, \qquad z \in \mathbb{T},\ i = 1, \dots, d. \]

As i Si Si∗ = I and i Si∗ Si ≥ I , (O6 , γ ) is a full periodic C ∗ -dynamical system. We set Sµ = Sµ1 . . . Sµk , for µ = (µ1 , . . . , µk ) ∈ 6∗ . For each i = 1, . . . , d, we define a (not necessarily unital) ∗ -endomorphism ρi on ∞ (6+ ) by f (i, x1 , . . . ) if (i, x1 , . . . ) ∈ 6+ (ρi (f ))(x) = 0 otherwise for f ∈ ∞ (6+ ) and x ∈ 6+ . Then we have ∩i kerρi = 0. Consider the functions qµ ∈ ∞ (6+ ), for µ = (µ1 , . . . µk ) ∈ 6k , defined by 1 if µx ∈ 6+ qµ (x) = . 0 otherwise Thus ρi (I ) = qi . (We should note that ρi (I ) = qi is not a continuous function, in general.) Let 6A be the Markov subshift defined by a d × d matrix A = (ai,j ) with entries in {0, 1} and with no zero rows or columns: 6A = {x ∈ 7 Z : axi ,xi+1 = 1, i ∈ Z}. Then each ρi preserves C(6A+ ). For Markov subshifts, Matsumoto’s construction yields the Cuntz–Krieger algebras, see [M],: O6A OA . Proposition 4.1. Let 6A be the Markov subshift defined by a matrix A = (ai,j ) ∈ Md ({0, 1}) and set B = C(6A+ ). Consider the right Hilbert B-module X = ⊕di=1 qi B and the ∗ -homomorphism φ : B → LB (X) given by the diagonal matrix φ(a) = diag(ρi (a))i . Then the Pimsner algebra OX is isomorphic to the Cuntz-Krieger algebra OA . Proof. The commutative C ∗ -algebra DA generated by {Sµ Sµ∗ ; µ ∈ 6∗ } is isomorphic to C(6A+ ) via an isomorphism which identifies Sµ Sµ∗ with the characteristic function pµ = χ[µ] of the cylinder set [µ] = {x ∈ 6+ ; (x1 , . . . , xk ) = (µ1 , . . . , µk )} for µ ∈ 6k . Then the ∗ -endomorphism γi on DA defined by γi (T ) = Si∗ T Si for T ∈ DA corresponds to the ∗ -endomorphism ρi on C(6+ ). We have ρi (pr ) = δi,r pj and ρi (pµ ) = δi,µ1 p(µ2 ,...,µk ) {j :aij =1}


for µ ∈ 6k with k ≥ 2. The corresponding formulae for γi (Sµ Sµ∗ ) hold. Consider the right Hilbert DA -module Y = ⊕di=1 Si DA and the ∗ -homomorphism φ : DA → LDA (Y ) given by the diagonal matrix φ(a) = diag(γi (a))i . Since (Sµ Sµ∗ )Si = Si γi (Sµ Sµ∗ ), the Hilbert bimodules X and Y are isomorphic. Now a standard argument shows that OA ∼ = OY ∼ = OX . If 6 is a general subshift, the endomorphisms ρi , i = 1, . . . , d, do not leave C(6+ ) stable. Thus we should replace C(6+ ) by some unital C ∗ -subalgebra of ∞ (6+ ) which is invariant under ρi , i = 1, . . . d. We shall choose, to this aim, the smallest such C ∗ -algebra, which is related to the Krieger left cover of a sofic subshift and the past equivalence relation considered by Matsumoto. Let A(6+ ) be the unital C ∗ -subalgebra of ∞ (6+ ) generated by {qµ ; µ ∈ 6∗ }. Since ρi (I ) = qi and ρi (qµ ) = qµi , it is clear that A(6+ ) is the smallest unital C ∗ -subalgebra of ∞ (6+ ) which is invariant under ρi , i = 1, . . . d. Theorem 4.2. Let B = A(6+ ) be the commutative C ∗ -algebra associated to a subshift 6 as above. Consider the Hilbert right B-module X = ⊕di=1 qi B and the ∗ homomorphism φ : B → LB (X) given by the diagonal matrix φ(a) = diag(ρi (a))i . Then the Pimsner algebra OX is isomorphic to the Matsumoto algebra O6 . Proof. For l ∈ N, let Al be the C ∗ -subalgebra of O6 generated by {Sµ∗ Sµ , µ ∈ 6l } and A6 be the C ∗ -subalgebra of O6 generated by elements {Sµ∗ Sµ , µ ∈ 6∗ }. Then,again, (Al )l is an increasing sequence of commutative finite-dimensional algebras and A6 = lim Al . Similarly, let Al (6+ ) be the C ∗ -subalgebra of ∞ (6+ ) generated by {qµ , µ ∈ − → 6l }. Then (Al (6+ ))l is an increasing sequence of commutative finite-dimensional algebras and A(6+ ) = lim Al (6+ ). For x ∈ 6+ , let 6l (x) = {µ ∈ 6l ; µx ∈ 6+ }. − → Matsumoto introduced in [M2] the following notion of past equivalence relation. Two points x and y ∈ 6+ are called l-past equivalent, x ∼l y, if 6l (x) = 6l (y). 
The corresponding set of equivalent classes is denoted by :l := 6+ / ∼l . For µ ∈ 6l , if x ∼l y, then qµ (x) = qµ (y). Thus qµ defines a function qˆµ ∈ C(:l ). The set {qˆµ ∈ C(:l ); µ ∈ 6l } separates the points in :l , thus it generates C(:l ). Al (6+ ) is precisely the set of functions in ∞ (6+ ) which have the same value on each l-past equivalent class and we have an isomorphism between Al (6+ ) and C(:l ). We see directly that the commutative C ∗ -algebra A6 is isomorphic to B = A(6+ ) via an isomorphism which identifies Sµ∗ Sµ with qµ , µ ∈ 6∗ . The ∗ -endomorphism γi on A6 defined by γi (T ) = Si∗ T Si for T ∈ A6 corresponds to the ∗ -endomorphism ρi on ∗ S for µ ∈ 6k . Consider the B = A(6+ ). We have ρi (qµ ) = qµi and γi (Sµ∗ Sµ ) = Sµi µi d ∗ right Hilbert A6 -module Y = ⊕i=1 Si A6 and the -homomorphism φ : A6 → LA6 (Y ) given by the diagonal matrix φ(a) = diag(γi (a))i . Since (Sµ∗ Sµ )Si = Si γi (Sµ∗ Sµ ), the Hilbert bimodules X and Y are isomorphic by the identification of B with A6 . The universality of the Matsumoto algebra and the Pimsner algebra immediately shows that O6 ∼ = OY ∼ = OX . 4.2. Contractions of compact metric spaces. We next discuss an example associated with a self-similar set in fractal geometry. Let : be a (separable) complete metric space and let {γ1 , . . . , γd } be a finite family of nonzero proper contractions of : with Lipschitz constants ci = Lip(γi ) < 1. We assume that d ≥ 2. Then there exists a unique nonempty compact set K ⊂ : satisfying the (exact) invariance condition K = ∪i γi (K).


The above invariance condition shows that the compact set $K$ is self-similar in a weak sense. For example the Cantor set, the Koch curve and the Sierpinski gasket are typical examples of self-similar sets. We refer the reader to the book of Hutchinson [H] for more information on fractal geometry. The topological dimension of $K$ is dominated by the Hausdorff dimension of $K$, and the Hausdorff dimension of $K$ is dominated by the similarity dimension of $K$, which is the finite number $D$ satisfying $\sum_i c_i^D = 1$. Thus $K$ has a finite topological dimension. Consider the $C^*$-algebra $B = C(K)$ and the canonical Hilbert right $B$-module $X = B^d$. For each $i$, we define an endomorphism $\varphi_i$ on $B$ by
\[ (\varphi_i(a))(z) = a(\gamma_i(z)), \qquad a \in B,\ z \in K. \]
The left $B$-action $\varphi : B \to L_B(X)$ is defined by the diagonal matrix $\varphi(a) = \mathrm{diag}(\varphi_i(a))_i$. We see that the (exact) invariance condition $K = \cup_i \gamma_i(K)$ is equivalent to the fact that $\varphi$ is injective. The bimodule $X$ generates the Pimsner $C^*$-algebra $O_X$. Let $\{x_1, \dots, x_d\}$ be the canonical basis of $X$. Then the corresponding elements $\{S_1, \dots, S_d\}$ of $O_X$ generate a copy of the Cuntz algebra $O_d \subset O_X$. By construction, the Pimsner $C^*$-algebra $O_X$ is isomorphic to the universal $C^*$-algebra generated by $B = C(K)$ and $O_d$ satisfying the relations $a S_i = S_i \varphi_i(a)$ for $a \in B$ and $i = 1, \dots, d$. In [H] Hutchinson shows that there exists a unique regular Borel probability measure $\mu$ on $K$ satisfying, for any measurable set $F$,
\[ \mu(F) = \frac{1}{d} \sum_{i=1}^{d} \mu(\gamma_i^{-1}(F)). \]
Consider the trace $\tau_0$ on $B$ corresponding to the probability measure $\mu$. Then $\tau_0$ satisfies
\[ \tau_0\Big(\sum_i \langle x_i, a x_i \rangle\Big) = d\,\tau_0(a), \qquad a \in B. \]
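The scaling relation for the trace $\tau_0$ can be tested numerically on the middle-thirds Cantor set, where $d = 2$. The sketch below is a hedged illustration, not part of the original argument: the two maps, the test function `f` and the helper names are assumptions chosen for the example, and the invariant measure is approximated by averaging over the images of a base point under all words of a fixed length.

```python
def cantor_maps():
    """The two contractions t -> t/3 and t -> t/3 + 2/3 of [0, 1]."""
    return [lambda t: t / 3.0, lambda t: t / 3.0 + 2.0 / 3.0]

def integrate(f, maps, depth=14, t0=0.0):
    """Approximate the integral of f against the invariant measure.

    Averages f over the images of t0 under all words of length `depth`.
    """
    points = [t0]
    for _ in range(depth):
        points = [g(t) for g in maps for t in points]
    return sum(f(t) for t in points) / len(points)

maps = cantor_maps()
f = lambda t: t * t                       # hypothetical test function

mean_f = integrate(f, maps)
# Analogue of tau_0(sum_i <x_i, a x_i>): integrate f composed with each map.
lhs = sum(integrate(lambda t, g=g: f(g(t)), maps) for g in maps)
# Analogue of d * tau_0(a).
rhs = len(maps) * mean_f

print(mean_f, lhs, rhs)
```

For this choice of `f` the exact integral against the Cantor measure is 3/8, and the two sides of the scaling relation agree up to the truncation error of the approximation.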

Since $\langle x_i, x_i \rangle = 1$ for $i = 1, \dots, d$, $O_X$ has a KMS state at the inverse temperature $\beta$ if and only if $\beta = \log d$. Moreover the uniqueness of the probability measure implies that the corresponding KMS state is also unique. We shall show that the algebra $O_X$ is in fact the Cuntz algebra $O_d$. Before proving this, we study a more general situation to include standard $d$-times around embeddings.

Let $K$ be a compact metric space. Consider the $C^*$-algebra $B = C(K)$ and the state space $S$ of $B$. Let $\mathrm{Lip}(K)$ be the space of Lipschitz functions, and let $\mathrm{Lip}(f)$ denote the Lipschitz constant of $f \in \mathrm{Lip}(K)$. In [H] Hutchinson considers the following metric $L$ on $S$:
\[ L(\phi_1, \phi_2) = \sup\{|\phi_1(f) - \phi_2(f)|;\ f \in \mathrm{Lip}(K),\ \mathrm{Lip}(f) \le 1\}. \]
Then $(S, L)$ is a complete metric space, and the topology defined by $L$ is precisely the weak$^*$-topology of $S$. Consider the canonical Hilbert right $B$-module $X = B^d$ and any injective unital $*$-homomorphism $\varphi : B \to L_B(X)$. We identify $\varphi(a)$ with the matrix $(\varphi_{ij}(a))_{ij} \in B \otimes M_d(\mathbb{C})$ for $a \in B$. Then the Pimsner $C^*$-algebra $O_X$ is isomorphic to the universal $C^*$-algebra generated by $B = C(K)$ and the Cuntz algebra $O_d$ satisfying the relations $a S_j = \sum_i S_i \varphi_{ij}(a)$ for $a \in B$ and $j = 1, \dots, d$.


Proposition 4.3. In the above situation, let $C$ be the unital positive map on $B = C(K)$ defined by $C(a) = \frac{1}{d} \sum_i \varphi_{ii}(a)$. If the Banach space adjoint $C^*$ induces a proper contraction on $S$ with respect to the metric $L$, then $O_X^0$ has a unique tracial state. Moreover $O_X$ has a KMS state at the inverse temperature $\beta$ if and only if $\beta = \log d$ and the corresponding KMS state is also unique.

Proof. We identify $L(X^n)$ with $B \otimes M_{d^n}(\mathbb{C})$. Using the commutation relation, it is easy to see that the inclusion map $D_n : L(X^n) \to L(X^{n+1})$ is described by matrices $D_n((a_{\alpha,\beta})_{\alpha,\beta}) = (\varphi_{ij}(a_{\alpha,\beta}))_{(\alpha,i),(\beta,j)}$, where $\alpha$ and $\beta$ run over the set $\{1, \dots, d\}^n$ of words with length $n$. Thus $D_n = \varphi \otimes \mathrm{id}$ on $B \otimes M_{d^n}(\mathbb{C})$. A tracial state on $O_X^0 = \varinjlim_n L(X^n)$ is described by a sequence of tracial states $\{\tau_n\}$, $n \ge 0$, on $L(X^n)$ such that $\tau_{n+1}|_{L(X^n)} = \tau_n$. Therefore one needs to assign a sequence of states $\phi_n \in S$ with $\tau_n = \phi_n \otimes \mathrm{tr}$ on $L(X^n) \cong C(K) \otimes M_{d^n}(\mathbb{C})$. The coherence relations require that
\[ C^*(\phi_{n+1}) = \phi_n, \qquad n \ge 0. \]
Since $K$ is compact, the diameter of $(S, L)$ is bounded. Hence, $C^*$ being a proper contraction, $\bigcap_{n=1}^{\infty} C^{*n}(S)$ consists of a single point $\omega_0$. The constant sequence of states $(\phi_n)$ with $\phi_n = \omega_0$ gives a tracial state $\tau$ on $O_X^0$, because $\omega_0$ is the unique fixed point of $C^*$ in $S$. Choose another tracial state. The coherence relations $C^*(\phi_{n+1}) = \phi_n$ show that any $\phi_n$ belongs to $\bigcap_{r=1}^{\infty} C^{*r}(S)$. Therefore $\phi_n = \omega_0$. Thus $O_X$ has a unique KMS state at inverse temperature $\log d$.

We remark that the present situation is similar to that of the Cuntz–Krieger algebras associated to aperiodic matrices. In fact, for any state $\omega \in S$, $C^{*n}(\omega)$ converges to the unique $\omega_0$ in $S$ with respect to $L$. This resembles the Perron–Frobenius Theorem for aperiodic matrices.

Example 4.4. In the fractal case, each proper contraction $\gamma_i$ induces an endomorphism $\varphi_i$ on $B = C(K)$ satisfying $\mathrm{Lip}(\varphi_i(f)) \le c_i\, \mathrm{Lip}(f)$ for $f \in \mathrm{Lip}(K)$. Hence $C^* = \frac{1}{d} \sum_i \varphi_i^*$ is a proper contraction on $S$ with respect to the metric $L$.

Example 4.5. We next study the example of standard $d$-times around embeddings. Let $B = C(\mathbb{T})$ with $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ and let $X = B^d$ be the natural right Hilbert $B$-module. Then a standard $d$-times around embedding is defined by a map $\varphi : C(\mathbb{T}) \to C(\mathbb{T}, M_d(\mathbb{C}))$ of the form
\[ (\varphi(f))(t) = u_t\, \mathrm{diag}\Big(f\Big(\frac{t}{d}\Big), f\Big(\frac{t+1}{d}\Big), \dots, f\Big(\frac{t+d-1}{d}\Big)\Big)\, u_t^*, \]
where $(u_t)_t$ is a continuous path of unitaries in $M_d(\mathbb{C})$ such that $u_0 = I$ and such that $u_1$ is the unitary matrix corresponding to the operator taking vectors $e_1, \dots, e_d$ of the canonical basis of $\mathbb{C}^d$ to $e_d, e_1, \dots, e_{d-1}$ respectively. We regard $\varphi$ as a map $\varphi : B \to L_B(X) = B \otimes M_d(\mathbb{C})$. Then $C^* = \frac{1}{d} \sum_i \varphi_{ii}^*$ induces a proper contraction on $S$ with respect to the metric $L$. In fact
\[ (C(f))(t) = \frac{1}{d}\Big(f\Big(\frac{t}{d}\Big) + f\Big(\frac{t+1}{d}\Big) + \dots + f\Big(\frac{t+d-1}{d}\Big)\Big). \]
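The averaging map $C$ displayed above is easy to experiment with. The sketch below is an illustration under stated assumptions: the test function `f`, the choice $d = 2$ and the helper names are hypothetical, and the code only shows on one example that repeated averaging flattens a continuous function on $\mathbb{R}/\mathbb{Z}$ towards a constant, here its Lebesgue mean.

```python
def average_over_preimages(f, d):
    """Return C(f), where C(f)(t) = (1/d) * sum_k f((t + k) / d)."""
    return lambda t: sum(f((t + k) / d) for k in range(d)) / d

def iterate(f, d, n):
    """Apply the averaging map n times to the function f."""
    for _ in range(n):
        f = average_over_preimages(f, d)
    return f

# Hypothetical test function: continuous on R/Z, with Lebesgue mean 1/6.
f = lambda t: t * (1.0 - t)

g = iterate(f, d=2, n=10)
values = [g(t) for t in (0.0, 0.3, 0.9)]

print(values)
```

For this particular `f` one step of the map with $d = 2$ gives $1/8 + f/4$, so the iterates converge geometrically to the constant $1/6$, consistent with the contraction property claimed for $C^*$ on the state space.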

We assume that $d = 2$ for simplicity of notation. For $f \in \mathrm{Lip}(\mathbb{T})$ and $x, y \in \mathbb{T}$, choosing carefully the nearest pairs between $\{\frac{x}{2}, \frac{x+1}{2}\}$ and $\{\frac{y}{2}, \frac{y+1}{2}\}$, we have $|(C(f))(x) - (C(f))(y)| \le \frac{1}{2} \mathrm{Lip}(f)\, d(x, y)$. Hence $\mathrm{Lip}(C(f)) \le \frac{1}{2} \mathrm{Lip}(f)$. Therefore for $\phi_1, \phi_2 \in S$, we have
\[ |(C^*(\phi_1))(f) - (C^*(\phi_2))(f)| = |\phi_1(C(f)) - \phi_2(C(f))| \le \frac{1}{2} L(\phi_1, \phi_2)\, \mathrm{Lip}(f). \]
Thus $L(C^*(\phi_1), C^*(\phi_2)) \le \frac{1}{2} L(\phi_1, \phi_2)$.

Remark 4.6. One can easily show, using known results, that under suitable circumstances $O_X$ is simple and purely infinite. Indeed, assume that $K$ is totally disconnected or connected and has a finite topological dimension. If $O_X^0 = \varinjlim_n L(X^n)$ is simple, then $O_X^0$ is of real rank zero by [BDR], since $O_X^0$ has a unique trace. By a result of Martin and Pasnicu [MP], $O_X^0$ has the comparability property on every matrix algebra. Thus we can apply Theorem 1.6 and conclude that $O_X$ is simple and purely infinite. For example, in the case of standard $d$-times around embeddings all the assumptions are satisfied. In fact $O_X^0 = \varinjlim_n L(X^n)$ is a Bunce-Deddens algebra. Note that we have naturally embedded an A$\mathbb{T}$ algebra into a purely infinite simple $C^*$-algebra. Again, in the fractal case, it is easy to show that $O_X^0$ is simple. So we can apply the preceding argument. However, it is not difficult to show that in this case $O_X$ is canonically isomorphic to the Cuntz algebra $O_d$ (a fact which will be later generalized to noncommutative metric spaces). We identify $L(X^n)$ with $C(K, M_{d^n}(\mathbb{C}))$. Then the inclusion map $D_n : L(X^n) \to L(X^{n+1})$ is described by block diagonal matrices
\[ (D_n(f))(t) = \mathrm{diag}(f(\gamma_1(t)), \dots, f(\gamma_d(t))). \]
Let $\omega = (\omega_1, \dots, \omega_k) \in \{1, \dots, d\}^k$ be a finite word and $\gamma_\omega = \gamma_{\omega_1} \cdots \gamma_{\omega_k}$. Then the inclusion map $D_{n+k,n}$ of $L(X^n)$ into $L(X^{n+k})$ is given by $(D_{n+k,n}(f))(t) = \mathrm{diag}(f(\gamma_\omega(t)))_\omega$. By the uniform continuity of $f$, $D_{n+k,n}(f)$ is approximated by a constant matrix up to $\varepsilon$ for a sufficiently large $k$. Thus $O_X^0$ is a UHF algebra $M_{d^\infty}$ and $O_X$ is exactly the Cuntz algebra $O_d$ generated by the original operators $\{S_1, \dots, S_d\}$.

4.3. Contractions of noncommutative metric spaces.
The preceding argument suggests a generalization to noncommutative metric spaces introduced by Connes in [Co]. Our setting will be the following. Let $A$ and $B$ be unital C*-algebras. Suppose that $B$ is a Banach bimodule over $A$. Let $\delta : A \supset \operatorname{Dom}(\delta) \to B$ be a densely defined $*$-derivation with $\ker\delta = \mathbb{C}I$. Let $S$ be the state space of $A$. Consider the following metric $L$ on $S$:
\[ L(\varphi_1, \varphi_2) = \sup\{|\varphi_1(a) - \varphi_2(a)| \;;\; a \in \operatorname{Dom}(\delta),\ \|\delta(a)\| \le 1\}. \]
The metric is allowed to take the value $\infty$. In [RiI], Rieffel considers the question of whether the metric topology agrees with the underlying weak* topology on the state space. His setting is, however, more general, as he works with normed vector spaces endowed with seminorms not necessarily arising from $*$-derivations. We assume that
\[ \{a \in \operatorname{Dom}(\delta) \;;\; \|\delta(a)\| \le 1\}/\mathbb{C}I \ \text{ is bounded in } A/\mathbb{C}I. \tag{4.1} \]

KMS States, Entropy and the Variational Principle


By Proposition 1.6 in [RiI] this condition is equivalent to the fact that the metric $L$ on $S$ is bounded. Let $\{\phi_1, \dots, \phi_d\}$ be a finite family of unital $*$-endomorphisms of $A$, with $d \ge 2$. Recall that the crossed product C*-algebra $C^*(A; \phi_1, \dots, \phi_d)$ of $A$ by $\{\phi_1, \dots, \phi_d\}$ is the universal C*-algebra generated by the image of a $*$-homomorphism $\pi : A \to C^*(A; \phi_1, \dots, \phi_d)$ and the Cuntz algebra $O_d$ with generators $S_1, \dots, S_d$ satisfying the relations $\pi(a)S_i = S_i\pi(\phi_i(a))$ for $a \in A$ and $i = 1, \dots, d$. We note that $\pi$ is isometric if and only if $\bigcap_i \ker\phi_i = 0$. In this case the crossed product C*-algebra $C^*(A; \phi_1, \dots, \phi_d)$ is isomorphic to $O_X$, where $X$ is the trivial Hilbert right $A$-module $X = A^d$ endowed with the diagonal left $A$-action $\phi : A \to L_A(X)$, $\phi(a) = \operatorname{diag}(\phi_1(a), \dots, \phi_d(a))$.

Proposition 4.7. In the above setting, assume that the restrictions $\gamma_i$ of the Banach space adjoint $\phi_i^* : A^* \to A^*$ to the state space $S$ of $A$ are proper contractions with respect to $L$. Then the endomorphism crossed product C*-algebra $C^*(A; \phi_1, \dots, \phi_d)$ is canonically isomorphic to the Cuntz algebra $O_d$ and has a unique KMS state at inverse temperature $\log d$.

Proof. Let $c$ be the maximum of the Lipschitz norms $c_i = \operatorname{Lip}(\gamma_i)$, $i = 1, \dots, d$. For any $a \in \operatorname{Dom}(\delta)$ and $\varphi, \psi \in S$, we have $|\varphi(a) - \psi(a)| \le L(\varphi, \psi)\|\delta(a)\|$. For a finite word $\alpha = (\alpha_1, \dots, \alpha_n) \in \{1, \dots, d\}^n$, we use the multi-index notation $\phi_\alpha$ and $\gamma_\alpha$. Then we have that
\[ L(\gamma_\alpha(\varphi), \gamma_\alpha(\psi)) \le c^n L(\varphi, \psi) \le c^n \operatorname{diam}(S, L) \]
for any pair of states $\varphi, \psi \in S$. We shall show that $\pi(A)$ is included in the canonical UHF subalgebra $M_{d^\infty}$ of the Cuntz algebra $O_d$. For any $a \in \operatorname{Dom}(\delta)$ and $\varepsilon > 0$, there exists $n \in \mathbb{N}$ such that $c^n \operatorname{diam}(S, L)\|\delta(a)\| \le \varepsilon$. Fix a state $\omega_0 \in S$. Consider the diagonal matrix $t = \operatorname{diag}(\omega_0(\phi_\alpha(a)))_\alpha \in M_{d^n}(\mathbb{C})$. Then for any state $\omega \in S$, we have
\[ |\omega(t_{\alpha\alpha} - \phi_\alpha(a))| = |\omega_0(\phi_\alpha(a)) - \omega(\phi_\alpha(a))| \le L(\gamma_\alpha(\omega_0), \gamma_\alpha(\omega))\|\delta(a)\| \le \varepsilon. \]
Hence $\|t_{\alpha\alpha} - \phi_\alpha(a)\| \le 2\varepsilon$.
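Since
\[ \pi(a) = \pi(a)\sum_\alpha S_\alpha S_\alpha^* = \sum_\alpha S_\alpha \pi(\phi_\alpha(a)) S_\alpha^*, \]
we have that
\[ \Big\| \sum_\alpha t_{\alpha\alpha} S_\alpha S_\alpha^* - \pi(a) \Big\| = \Big\| \sum_\alpha S_\alpha \pi(t_{\alpha\alpha} - \phi_\alpha(a)) S_\alpha^* \Big\| \le 2\varepsilon. \]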

Thus C ∗ (A; φ1 , . . . , φd ) is precisely the Cuntz algebra Od generated by the original {S1 , . . . , Sd }.


Remark. M. Rieffel has kindly pointed out to us that in the proof of the previous Proposition we never use the fact that $L$ comes from a $*$-derivation, but rather only that it is a seminorm satisfying the boundedness condition (4.1). Seminorms of this kind were studied in [RiII].

Example 4.8. Let $D$ be a noncommutative unital C*-algebra. Let $A = \{a \in C([0, 1], D) \;;\; a(0) \in \mathbb{C}I\}$. Set $Y = \{(x, y) \in [0, 1] \times [0, 1] \;;\; x \ne y\}$. Let $B = C^b(Y, D)$ be the set of $D$-valued bounded continuous functions on $Y$. Then $B$ is a Banach bimodule over $A$ by $(a_1 f a_2)(x, y) = a_1(x) f(x, y) a_2(y)$. Let $\delta : A \supset \operatorname{Dom}(\delta) \to B$ be the densely defined $*$-derivation of De Leeuw, given by
\[ (\delta(a))(x, y) = \frac{a(x) - a(y)}{|x - y|}, \]
where $\operatorname{Dom}(\delta)$ is the set of Lipschitz functions in $A$. Then $\ker\delta = \mathbb{C}I$. Let $\alpha$ be the $*$-endomorphism of $A$ defined by $\alpha(f)(x) = f(x/2)$. Then the restriction $\gamma$ of the Banach space adjoint $\alpha^* : A^* \to A^*$ to the state space $S$ is a proper contraction with respect to $L$. Therefore the endomorphism crossed product C*-algebra $C^*(A; \alpha, \alpha)$ is isomorphic to the Cuntz algebra $O_2$.

5. KMS States of Pimsner C ∗ -Algebras Associated to Cuntz–Krieger Bimodules In this section we illustrate Theorem 2.5 by examples. We shall discuss some situations where there is a unique KMS state, or more generally, where the set of KMS states can be easily characterized. The inspiring example is that of the Cuntz–Krieger algebras that we discuss here below. Some of the following facts are well known in terms of path algebras in subfactor theory. 5.1. KMS states of Cuntz–Krieger algebras. Let OA be the Cuntz–Krieger algebras associated to a matrix A = (aij ) ∈ Md ({0, 1}). A KMS state ω for the canonical circle action restricts to a tracial state τ on the f.d. commutative subalgebra A generated by the ranges P1 , . . . , Pd of the generating partial isometries S1 , . . . , Sd . Let λ = (λ1 , . . . , λd ) n ∈ R+ be defined by λi = ω(Pi ). Since ω is normalized, we have λi = 1. The scaling property s (τ ) = '(τ )τ says, when checked on A, that λ is a nonnegative eigenvector of A, and hence, when A is irreducible, it is the unique normalized Perron eigenvector for A by the Perron–Frobenius Theorem [G]. Let us analyse in more detail the structure of the Banach space T (OA 0 ) and the spectrum of the operator s . Let Lr , r ≥ 1, denote the unital finite-dimensional C ∗ -subalgebra of OA generated by elements of the form Si1 . . . Sir Pk (Sj1 . . . Sjr )∗ . Set L0 = CP1 + · · · + CPd . Then the set of minimal central projections for Lr is {σ r (Pk ), k = 1, . . . , d}, where σ is the canonical endomorphism of L0 ∩ OA defined by σ (T ) = i Si T Si ∗ . So Lr σ r (Pk ) Mdr,k (C), where dr,k = Card{(i1 , . . . , ir ) : Si1 . . . Sir Pk = 0} = Card{(i1 , . . . , ir ) : ai1 ,i2 ai2 ,i3 . . . air ,k = 0}. Therefore dr,k is the sum of the entries in the k th column of Ar : dr,k = ai1 ,i2 ai2 ,i3 . . . air ,k . i1 ,...,ir


The $(j, i)$-entry of the inclusion matrix of $L_r \subset L_{r+1}$ can be computed by looking at the projection $\sigma^r(P_i)\sigma^{r+1}(P_j) = \sigma^r(S_i P_j S_i^*)$, which is 0 when $a_{i,j} = 0$; otherwise it is the sum of $d_{r,i}$ minimal projections of $L_{r+1}\sigma^{r+1}(P_j)$. Thus the inclusion matrix of $L_r \subset L_{r+1}$ is $A^t$. A tracial state on $O_A^0$ is described by a sequence of positive traces $\{\tau_r\}$, $r \ge 0$, on $L_r$ such that $\tau_0$ is normalized and $\tau_{r+1}|_{L_r} = \tau_r$. Therefore one needs to assign a sequence of nonnegative column vectors $(t_r) \in \mathbb{R}_+^d$ which will be the values that $\tau_r$ takes on the minimal projections of $L_r$. The coherence relations require that
\[ A t_{r+1} = t_r, \qquad r \ge 0, \]
while the positivity and normalization properties translate into
\[ t_r(k) \ge 0, \quad r \ge 0,\ k = 1, \dots, d, \qquad \sum_k t_0(k) = 1. \]
The latter implies, as expected, normalization of each $\tau_r$:
\[ \sum_k t_r(k)\, d_{r,k} = \sum_i (A^r t_r)(i) = \sum_i t_0(i) = 1, \qquad r \ge 0. \]
Removing positivity and normalization, but requiring instead a norm bound for the sequence $(t_r)$, one finds that $T(O_A^0)$ is described by
\[ \Big\{ (t_r)_{r \ge 0} : t_r \in \mathbb{C}^d,\ t_r = A t_{r+1},\ r \ge 0,\ \sup_r \sum_i (A^r |t_r|)(i) < \infty \Big\}, \]
with the Banach space norm
\[ \|(t_r)_{r \ge 0}\| = \sup_r \sum_i (A^r |t_r|)(i). \]
The operator $s$ acts as
\[ s((t_r)_{r \ge 0}) = (A t_0, t_0, t_1, \dots), \]
while its inverse is
\[ t((t_r)_{r \ge 0}) = (t_1, t_2, \dots). \]
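The coherence and normalization relations above can be checked numerically on a small case. The following is only a sketch under stated assumptions: the matrix `A` (the golden-mean matrix) and all helper names are chosen here for illustration and do not come from the text. It builds the sequence $t_r = \lambda^{-r} t_0$ from the normalized Perron eigenvector and verifies $A t_{r+1} = t_r$ and $\sum_k t_r(k) d_{r,k} = 1$, where $d_{r,k}$ is the $k$-th column sum of $A^r$.

```python
import math

# Hypothetical example: the 0-1 matrix of the golden-mean shift
# (irreducible and symmetric). Any irreducible 0-1 matrix would do.
A = [[1, 1],
     [1, 0]]

def mat_vec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def column_sums_of_power(M, r):
    """Return (d_{r,k})_k, the column sums of M^r."""
    d = len(M)
    row = [1] * d                      # the row vector (1, ..., 1)
    for _ in range(r):                 # multiply on the right by M, r times
        row = [sum(row[i] * M[i][k] for i in range(d)) for k in range(d)]
    return row

# Perron eigenvalue and normalized Perron eigenvector (closed form for this A).
lam = (1 + math.sqrt(5)) / 2
t0 = [lam / (lam + 1), 1 / (lam + 1)]

def t(r):
    """The coherent sequence t_r = lam^{-r} t_0."""
    return [x / lam**r for x in t0]

def coherence_defect(r):
    """max_k |(A t_{r+1})(k) - t_r(k)|, zero up to rounding."""
    lhs = mat_vec(A, t(r + 1))
    return max(abs(a - b) for a, b in zip(lhs, t(r)))

def total_mass(r):
    """sum_k t_r(k) d_{r,k}, which should equal 1 for every r."""
    d_r = column_sums_of_power(A, r)
    return sum(tk * dk for tk, dk in zip(t(r), d_r))
```

Calling `coherence_defect(r)` and `total_mass(r)` for a few values of `r` shows the defect vanishing up to rounding and the total mass staying equal to 1, as the normalization identity predicts.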

Proposition 5.1. If $A = (a_{i,j}) \in M_d(\{0, 1\})$ is an irreducible symmetric matrix then $T(O_A^0)$ is linearly spanned by elements of the form $(\lambda^{-r} t_0)_{r \ge 0}$, where $t_0$ is an eigenvector of $A$ with eigenvalue $\lambda$ and $|\lambda| = r(A)$. Furthermore $\sigma(s) = \{\lambda \in \sigma(A) : |\lambda| = r(A)\}$.

Proof. Let $t_0$ be an eigenvector for $A$ with eigenvalue $\lambda$ such that $|\lambda| = r(A)$. Set $t_r := \lambda^{-r} t_0$, so that $A t_{r+1} = t_r$. We have
\[ \|A^r |t_r|\|_2 = r(A)^{-r}\|A^r |t_0|\|_2 \to \|E_0 |t_0|\|_2, \]
where $E_0$ is the rank one orthogonal projection onto the span of the Perron eigenvector. It follows that $(t_r) \in T(O_A^0)$. Furthermore $(t_r)$ is also an eigenvector of $s$ with the same eigenvalue. The same argument shows that if $t_0 \in T(O_A^0)$ were an eigenvector with eigenvalue $\lambda$ such that $|\lambda| < r(A)$, and $t_r$ is defined as above, then $\|A^r |t_r|\|_2$ would be unbounded, so that $(t_r)$ does not define an element of $T(O_A^0)$. With similar arguments one sees that $T(O_A^0)$ is linearly spanned by vectors of the form $(\lambda^{-r} t_0)$, where $|\lambda| = r(A)$. We show that if either $\lambda \notin \sigma(A)$ or $\lambda \in \sigma(A)$ but $|\lambda| < r(A)$ then $s - \lambda$ is invertible. We start assuming that $\lambda \notin \sigma(A)$. Then $s - \lambda$ is clearly injective. We show that it is also surjective. Given $(v_r) \in T(O_A^0)$ set $t_r = (A - \lambda)^{-1} v_r$. Then
\[ A t_r = A(A - \lambda)^{-1} v_r = (A - \lambda)^{-1} A v_r = (A - \lambda)^{-1} v_{r-1} = t_{r-1}. \]
For any matrix $B$ with complex entries let $B^+$ stand for the matrix whose entries are the absolute values of the corresponding entries of $B$, and let $M_B$ be the maximum of the absolute values of its entries. Then
\[ A^r |t_r| \le A^r \big((A - \lambda)^{-1}\big)^+ |v_r| \le M_{(A - \lambda)^{-1}} A^r |v_r|, \]
which shows that $(t_r) \in T(O_A^0)$ and $(s - \lambda)(t_r) = v_r$. Assume now that $\lambda \in \sigma(A) - \{0\}$ but $|\lambda| < r(A)$. We have already noted that $s - \lambda$ is injective. Furthermore since for each $(t_r) \in T(O_A^0)$, any $t_r$ belongs to the range of $A - \lambda$, with similar arguments one shows that $s - \lambda$ is surjective.

Remark. If $A$ is aperiodic, then the homogeneous subalgebra of $O_A$ has a unique trace. More generally, if one drops the assumption that $A$ is symmetric, then, with a more extensive use of Perron–Frobenius theory, one can still show that eigenvectors corresponding to nonmaximal eigenvalues do not appear in the point spectrum of $s$, hence our result shows that $\sigma(s) \subset \{\lambda : |\lambda| = r(A)\}$. Note also that if we more generally start with a reducible matrix $A$, then we are in a situation of nonuniqueness of KMS states for $O_A$ (corresponding to the minimal and maximal Perron eigenvalues of $A$).

5.2. KMS states as Markov traces arising from inclusions of finite algebras with finite Jones index. Let $N \subset M$ be an inclusion of II$_1$-factors with finite index or of finite-dimensional C*-algebras such that $Z(N) \cap Z(M) = \mathbb{C}$. Let, in the latter case, $A$ be the inclusion matrix. Let $\tau$ be a faithful tracial state on $M$, and consider the unique $\tau$-preserving conditional expectation $E_\tau : M \to N$. Endow $X = M$ with the C*-bimodule structure over $N$ as follows. The structure of $N$-bimodule is defined by left and right multiplication, while the $N$-valued inner product is $\langle x, y \rangle_N := E_\tau(x^* y)$. Then $X$ is full and finite projective as a right $N$-module ([GHJ]). It is not difficult to check that $L_N(X^r) = M_{2r-1}$, where

M−1 = N ⊂ M0 = M ⊂ M1 ⊂ . . .

is the Jones tower. A KMS state at inverse temperature β for OX corresponds precisely to a Markov trace for the tower, which is unique, and one has β = log([M : N ]).


See [K]. If N ⊂ M are finite factors, each term of the tower is a finite factor, hence its trace space is one dimensional, and spanned by the Markov trace. So dimT (OX 0 ) = 1, and s acts multiplying by [M : N ]. If N ⊂ M are finite-dimensional C ∗ -algebras, the inclusion matrix of LN (X r ) ⊂ LN (X r+1 ) is At A, which is symmetric and irreducible ([GHJ]). It is not difficult to show, with arguments similar to those of the previous example, that T (OX 0 ) is again linearly spanned by traces corresponding to eigenvectors of At A with maximal eigenvalue.

5.3. KMS states of Pimsner algebras associated with Cuntz–Krieger bimodules. After these motivating examples, we consider, more generally, systems of the form (OX , γ ), where X is what we call a Cuntz–Krieger Hilbert C ∗ -bimodule and γ is the canonical gauge action. Such Hilbert bimodules, and simplicity of the corresponding C ∗ algebras OX , have been considered in [KPW]. Consider d ≥ 2 unital simple C ∗ -algebras A1 , . . . Ad and a matrix A = (ai,j ) ∈ Md ({0, 1}) with no row and no column identically zero. Let, for any pair of indices i, j such that ai,j = 1, Xi,j be a full, finite projective Ai -Aj Hilbert bimodule, and let X = ⊕i,j :ai,j =1 Xi,j be endowed with the natural structure of a Hilbert bimodule over A := A1 ⊕ · · · ⊕ Ad . Then since no row of A is zero, left A-action is faithful, and since no column of A is zero, X is full. Clearly X is finite projective as a right module. We assume that there is a system of tracial states τ1 , . . . , τn on A1 , . . . An respectively satisfying, for each pair of indices for which aj,k = 1,

\[ \tau_k\Big( \sum_r \langle x_r^{k,j}, a\, x_r^{k,j} \rangle \Big) = \lambda_{j,k}\, \tau_j(a), \qquad a \in A_j, \]

for some λj,k > 0. Here {xr k,j }r is a basis of Xj,k . We set λj,k = 0 if aj,k = 0. We will call {τi } a coherent set of traces. Note that (λj,k ) is irreducible precisely when A is. If each Aj has a unique tracial state τj , the set {τj } is coherent. This is indeed the case of Cuntz–Krieger algebras. Let Pj be the identity of Aj . Recall from [KPW] that A ∩ OX has a unique unital endomorphism σ such that σ (a)x = xa,

x ∈ X, a ∈ A ∩ OX .

Recall also that if A is irreducible, OX is simple [KPW], and if A is aperiodic, A is separable and has real rank zero and all Mn (Aj ) have the comparability property, then OX is simple and purely infinite (cf. Theorem 1.6). For all r ≥ 1 LA (X r ) is a finite direct sum of unital simple C ∗ -algebras, and its minimal central projections are σ r (P1 ), . . . , σ r (Pd ). One has LA (X r )σ r (Pj ) = LAj (X r Pj ), where X r Pj is regarded as an A-Aj Hilbert bimodule. Let τ1 , . . . , τd be a coherent choice of tracial states on A1 , . . . , Ad respectively. Let {ui (r),j } be a basis of X r Pj . Then the positive functional

\[ a \in L_{A_j}(X^r P_j) \mapsto \tau_j\Big( \sum_i \langle u_i^{(r),j}, a\, u_i^{(r),j} \rangle \Big) \]

is nonzero, tracial and independent of the basis. Let 'r (τj ) denote its norm. After normalization, we get a tracial state Tj (r) on L(X r Pj ). If Aj has a unique tracial state, Tj (r) is the unique tracial state of L(X r Pj ). Consider a tracial state τr on L(X r ) which


restricts to a multiple of $T_j^{(r)}$ on $L(X^r P_j)$, and let $t_j^{(r)} = \tau(\sigma^r(P_j))$, so $t_j^{(r)} \ge 0$ and $\sum_j t_j^{(r)} = 1$. Then $\tau_r$ restricts to $\tau_{r-1}$ if and only if for all $a \in L(X^{r-1} P_j)$, and all $j$,
\[ t_j^{(r-1)} T_j^{(r-1)}(a) = \sum_k t_k^{(r)} T_k^{(r)}(a\, \sigma^r(P_k)). \]

Now if {xj } is a basis of X, {xj1 . . . xjr Pk } is a basis of Xr Pk , so Tk (r) (aσ r (Pk )) = 'r (τk )−1 τk (

j1 ...jr

= λj,k 'r (τk )

−1

Pk xj∗r (xj1 . . . xjr−1 )∗ axj1 . . . xjr−1 xjr Pk )

'r−1 (τj )Tj (r−1) (a).

So τr restricts to τr−1 if and only if

λj,k tk (r) 'r (τk )−1 = tj (r−1) 'r−1 (τj )−1 .

k

Perron– Set vk (r) := 'r (τk )−1 tk (r) . Then a solution is obtained choosing for v 0 the Frobenius eigenvector of the nonnegative matrix (λj,k ) with the normalization vk 0 = 1, and iteratively v r = λ−1 v r−1 , where λ is a positive eigenvalue of (λj,k ). Note that if A is aperiodic, then (λj,k ) is aperiodic as well, so such a v 0 is the only possible solution. One can easily check that the tracial state τ thus obtained on OX 0 satisfies

\[ \tau\Big( \sum_j x_j^* a x_j \Big) = \lambda \tau(a), \qquad a \in O_X^0, \]

and therefore gives rise to a KMS state of OX . Summarizing, we have proved the following result. Theorem 5.2. Let A = (ai,j ) ∈ Md (0, 1) be an irreducible matrix, A1 , . . . , Ad unital simple C ∗ -algebras with a nonzero trace, and let, for any pair of indices for which ai,j = 1, Xi,j be a full, finite projective, Hilbert Ai -Aj bimodule. Consider X = ⊕i,j :ai,j =1 Xi,j as a Hilbert bimodule over A1 ⊕ · · · ⊕ Ad . Then any normalized Perron– Frobenius eigenvector of the irreducible matrix (λi,j ) associated to a coherent system of tracial states on A1 , . . . , Ad defines, as described above, a KMS state of OX at inverse temperature β = log r((λi,j )). In particular, if each Aj has a unique tracial state, then there is a unique system of coherent tracial states, and the corresponding KMS state is the unique KMS state of OX . Note that if each Aj has a unique trace and A is aperiodic, then OX 0 has a unique trace, which corresponds necessarily to the unique KMS state.

6. Inverse Temperatures and Topological Entropy The aim of this section is to establish a relationship between inverse temperatures of extremal KMS states and the topological entropy of certain subshifts naturally associated to (A, γ ).


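Let $\{y_i\}$ and $\{x_j\}$ be finite subsets of $A^1 \setminus \{0\}$ satisfying $\sum_i y_i^* y_i = I$ and $\sum_j x_j x_j^* = I$, and let $T = T_{\{y_i\}}$ and $S = S_{\{x_j\}}$ be the completely positive maps of $A^0$ defined in Sect. 2. We define
\[ h(T_{\{y_i\}}) := \lim_n \frac{1}{n} \log\Big( \Big\| \sum_{i_1, \dots, i_n} y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^* \Big\| \Big), \]
\[ h(S_{\{x_j\}}) := \lim_n \frac{1}{n} \log\Big( \Big\| \sum_{j_1, \dots, j_n} (x_{j_1} \cdots x_{j_n})^* x_{j_1} \cdots x_{j_n} \Big\| \Big). \]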

In the case of the Matsumoto C ∗ -algebras O6 , h(S) is the topological entropy of the shift homeomorphism σ 6 , and it was shown to coincide with the maximal inverse temperature of certain KMS states in [MWY]. Note that since ( yi1 . . . yin 2 )1/n ≥ 1 for all n and yi ≤ 1, for all i, then 0 ≤ h(T{yi } ) ≤ log(Card{yi }), and similarly, 0 ≤ h(S{xj } ) ≤ log(Card{xj }). Note however that if A is a crossed product C ∗ -algebra by a single automorphism α then h(α) = h(α −1 ) = 0. More generally, since the sequences defining h(T ) and h(S) converge to their greatest lower bounds, we see that these are positive if and only if one has respectively 1/n yi1 . . . yin 2 ≥ 1 + ε, 1/n ≥1+ε xj1 . . . xin 2 for all n and some ε > 0. associate to a fixed finite set of nonzero elements {xj } ⊂ A1 such that We now ∗ xj xj = I , a one-sided subshift {xj } , defined as follows. Set 7 := {1, . . . , d}, where d = Card{xj }. Then {xj } = {λ ∈ 7 N : xλ1 . . . xλr = 0, r ∈ N}. Clearly {xj } is a closed subset of 7 N mapped onto itself by the left shift homomorphism: σ (λ)i = λi+1 . Notice that if (A, γ ) does not result from a crossed product by an automorphism, or a proper corner endomorphism, d ≥ 2. Note also that, thanks to the relation i xj xj ∗ = I , any n-tuple (λ1 , . . . , λr ) ∈ 7 r such that xλ1 . . . xλr = 0 extends to an element of {xj } . In particular, {xj } is nonempty. Replacing the T-action γ by the action z ∈ T → γ z := γz−1 , we see that we also have, for any finite subset {yi } ⊂ A1 \{0} such that i yi ∗ yi = I , a one-sided subshift: N

{yj } = {λ ∈ 7 : yλr . . . yλ1 = 0, r ∈ N}, where 7 is the set of the first d := Card{yi } positive integers.


We also introduce the following two-sided subshifts: 6{xj } = {λ ∈ 7 Z : xλr . . . xλr+s = 0, r ∈ Z, s ∈ N}, and

Z

6 {yi } = {λ ∈ 7 : yλr+s . . . yλr = 0, r ∈ Z, s ∈ N}.

Remark. Even though it would seem more convenient to work with two-sided subshifts, we should point out that these may be rather small, in the sense that a finite word $(\lambda_1, \dots, \lambda_r)$ occurring, e.g., in the one-sided subshift of $\{x_j\}$ does not necessarily extend to a word in the two-sided subshift of $\{x_j\}$. The following simple example illustrates the situation. Consider the C*-algebra $A = M_2(\mathbb{C}) \otimes C(\mathbb{T})$, and define the following $2\pi$-periodic automorphic action $\alpha$ of $\mathbb{R}$: $\alpha_t = \operatorname{ad} v_t \otimes \beta_t$, where $v_t = \operatorname{diag}(e^{it}, 1)$ and $\beta_t(f)(e^{i\tau}) = f(e^{i(\tau + t)})$. Let $e_{i,j}$, $i, j = 1, 2$, be a system of matrix units for $M_2(\mathbb{C})$, and define $x_1 = e_{1,2} \otimes I$, $x_2 = e_{2,2} \otimes u$, $y_1 = e_{1,2} \otimes I$, $y_2 = e_{1,1} \otimes u$, where $u(z) = z$, $z \in \mathbb{T}$. Then all the above elements are in $A^1$, and satisfy $x_1 x_1^* + x_2 x_2^* = I$, $y_1^* y_1 + y_2^* y_2 = I$, so $(A, \alpha)$ is a full C*-dynamical system. Note that $x_1^2 = 0$, $x_1 x_2 \ne 0$, $x_2 x_1 = 0$, $x_2^2 \ne 0$, so the two-sided subshift of $\{x_j\}$ is $\{(\dots, 2, 2, 2, \dots)\}$ while the one-sided subshift of $\{x_j\}$ is $\{(1, 2, 2, \dots), (2, 2, 2, \dots)\}$. Thus $(1, 2)$ is a 2-word appearing in the one-sided subshift of $\{x_1, x_2\}$ which cannot be extended to any two-sided sequence of the two-sided subshift of $\{x_1, x_2\}$.
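The relations in this example can be verified by direct matrix computation. The sketch below is an illustration only, under the following assumptions: the elements of $M_2(\mathbb{C}) \otimes C(\mathbb{T})$ are evaluated at a single, arbitrarily chosen point `z` of the circle (which is enough here, since each product of generators is either identically zero or a unimodular multiple of a matrix unit), and all function names are hypothetical.

```python
import cmath
from itertools import product

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def is_zero(a, tol=1e-12):
    return all(abs(a[i][j]) < tol for i in range(2) for j in range(2))

def is_identity(a, tol=1e-12):
    return all(abs(a[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))

z = cmath.exp(0.7j)                                    # arbitrary point of T
x = {1: [[0, 1], [0, 0]], 2: [[0, 0], [0, z]]}         # x1 = e12, x2 = z e22
y = {1: [[0, 1], [0, 0]], 2: [[z, 0], [0, 0]]}         # y1 = e12, y2 = z e11

def word_product(gens, word):
    """gens[w1] gens[w2] ... gens[wr] for a word (w1, ..., wr)."""
    out = [[1, 0], [0, 1]]
    for i in word:
        out = mul(out, gens[i])
    return out

def allowed_words(gens, n):
    """Words of length n whose product is nonzero."""
    return [w for w in product((1, 2), repeat=n)
            if not is_zero(word_product(gens, w))]

def left_extensions(gens, word):
    """Letters k such that the word (k,) + word is still allowed."""
    return [k for k in (1, 2)
            if not is_zero(word_product(gens, (k,) + word))]
```

Here `allowed_words(x, n)` lists the words of length `n` with nonzero product, and `left_extensions(x, (1, 2))` returns the empty list, which is the obstruction to extending the word (1, 2) to a two-sided sequence.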

We give a condition ensuring that {xj } and {yi } are the positive parts of 6{xj } and {yi } respectively.

Proposition 6.1. Let {xj }, be a finite subset of A1 \{0} such that

\[ \sum_j x_j x_j^* = I \]

and let {xj } and 6{xj } be the associated one-sided and two-sided subshifts respectively. If j xj ∗ xj is invertible then 6{xj } = ∅. Furthermore {xj } = 6{xj } + . An analogous statement holds for {yi } and 6 {yi } .


Proof. Since $\sum_j x_j x_j^* = I$, for any $(i_1, \dots, i_r)$ such that $x_{i_1} \cdots x_{i_r} \ne 0$ there is an $i_{r+1}$ such that $x_{i_1} \cdots x_{i_r} x_{i_{r+1}} \ne 0$, and therefore there is a sequence $(i_n)_{n \ge 1}$ such that $x_{i_1} \cdots x_{i_n} \ne 0$ for all $n \in \mathbb{N}$. To complete the proof relative to the set $\{x_j\}$ it suffices to show, for any such $(i_n)$, the existence of $i_0 \in \{1, \dots, d\}$ such that $x_{i_0} x_{i_1} \cdots x_{i_n} \ne 0$ for all $n \ge 0$. If this were not the case, for all $k \in \{1, \dots, d\}$ there would exist $n_k$ such that $x_k x_{i_1} \cdots x_{i_{n_k}} = 0$. Letting $n = \max_k n_k$, we must have $x_k x_{i_1} \cdots x_{i_n} = 0$ for all $k$, and therefore, $\sum_k x_k^* x_k$ being invertible, $x_{i_1} \cdots x_{i_n} = 0$. This is a contradiction. The statement relative to the set $\{y_i\}$ can be proved similarly.

In particular, if $A = (a_{ij})$ is a $\{0, 1\}$-matrix with no zero row or column, then the generating partial isometries $\{S_i\}$ of the Cuntz–Krieger algebra $O_A$ satisfy both
\[ \sum_i S_i S_i^* = I \]
and
\[ S_i^* S_i = \sum_j a_{ij} S_j S_j^*, \]
thus $\sum_i S_i^* S_i$ is invertible. The two-sided and one-sided subshifts associated to $\{S_i\}$ are then the Markov subshift $\Lambda_A$ and its positive part $\Lambda_A^+$. More generally, if $\Lambda$ is a nonempty subshift of $\{1, \dots, d\}^{\mathbb{Z}}$ then the canonical set of generating partial isometries $\{S_i\}$ of the Matsumoto C*-algebra $O_\Lambda$ still satisfies the above conditions (see 4.1), and also in this case the two-sided and one-sided subshifts associated to $\{S_i\}$ are $\Lambda$ and $\Lambda^+$. Another example is provided by the algebras generated by certain Cuntz–Krieger bimodules $X = \oplus_{(i,j) : a_{i,j} = 1} X_{i,j}$ as described in Sect. 5. More precisely, if $X_{i,j}$ is the Hilbert bimodule defined by a unital $*$-isomorphism $\phi_{i,j} : A_i \to A_j$, then one can define $S_{i,j}$ to be the identity of $A_j$ regarded as an element of $X_{i,j}$. So $S_i = \sum_{j : a_{ij} = 1} S_{i,j}$ are partial isometries of $O_X^1$ satisfying the Cuntz–Krieger relations with respect to $A = (a_{ij})$. Therefore the two-sided subshift associated to $\{S_i\}$ is again the two-sided Markov subshift defined by the matrix $A = (a_{i,j})$. The example arising from fractal geometry discussed in Sect. 4 is in the same spirit, in that the elements of the natural basis of the generating module are generators of Cuntz algebras, so the associated one-sided or two-sided subshifts are full. Note that all the examples discussed above have in common the fact that there is a multiplet $\{x_i\} \subset A^1$ such that $\sum_i x_i x_i^* = I$ consisting of elements with pairwise orthogonal ranges (and therefore they are necessarily partial isometries).

We start by establishing general estimates for the extremal inverse temperatures using the topological entropies of the associated subshifts. Recall [DGS] that for a one-sided (or two-sided) subshift the topological entropy can be computed as
\[ h_{top}(\sigma) = \lim_n \frac{1}{n} \log(\theta_n), \]
where $\theta_n$ is the cardinality of the set of distinct words of length $n$ occurring in the subshift.

Proposition 6.2. Let $\{y_i\}, \{x_j\} \subset A^1 \setminus \{0\}$ be finite subsets such that $\sum_i y_i^* y_i = I$ and $\sum_j x_j x_j^* = I$, and consider the corresponding one-sided subshifts, defined as above, with shifts $\sigma_{\{y_i\}}$ and $\sigma_{\{x_j\}}$. Then
\[ \beta_{\min} \ge -h(T_{\{y_i\}}) \ge -h_{top}(\sigma_{\{y_i\}}), \qquad \beta_{\max} \le h(S_{\{x_j\}}) \le h_{top}(\sigma_{\{x_j\}}). \]


Proof. By Prop. 2.3 and the triangle inequality $\beta_{\min} \ge -h(T_{\{y_i\}})$ and $\beta_{\max} \le h(S_{\{x_j\}})$. The rest follows from $\sum_{i_1, \dots, i_n} \|x_{i_1} \cdots x_{i_n}\|^2 \le \theta_n$, where $\theta_n$ is the number of words of length $n$ in the one-sided subshift of $\{x_j\}$, and its analogue for $\{y_i\}$.

We shall see that in the general situation if it is possible to choose the multiplets $\{y_i\}$ and $\{x_j\}$ carefully, then the topological entropies of the corresponding subshifts lead to the extremal inverse temperatures of KMS states. We first present an intermediate result, which gives a sufficient condition for $h(T)$ and $h(S)$ to coincide with the topological entropy of the associated subshifts.

Proposition 6.3. Set
\[ l_n := \min\{\|y_{i_1} \cdots y_{i_n}\|^2 : y_{i_1} \cdots y_{i_n} \ne 0\}, \qquad m_n := \min\{\|x_{j_1} \cdots x_{j_n}\|^2 : x_{j_1} \cdots x_{j_n} \ne 0\}. \]
If
\[ \limsup_n\, l_n^{1/n} = 1, \]
then $h(T_{\{y_i\}}) = h_{top}(\sigma_{\{y_i\}})$. Similarly, if
\[ \limsup_n\, m_n^{1/n} = 1, \]
then $h(S_{\{x_j\}}) = h_{top}(\sigma_{\{x_j\}})$.

Proof. We shall prove only the first statement. Let $\theta_n$ denote the number of words of length $n$ occurring in the one-sided subshift of $\{y_i\}$, i.e. the number of $n$-tuples $(i_1, \dots, i_n) \in \{1, \dots, d\}^n$ such that $y_{i_1} \cdots y_{i_n} \ne 0$. Then given $\varepsilon > 0$, for infinitely many indices $n$,
\[ (1 - \varepsilon)\, \theta_n^{1/n} \le \theta_n^{1/n}\, l_n^{1/n} \le \Big( \sum_{i_1, \dots, i_n} \|y_{i_1} \cdots y_{i_n}\|^2 \Big)^{1/n} \le \theta_n^{1/n}, \]
hence taking the logarithm of the limit over $n$,
\[ h(T) = \lim_n \frac{1}{n} \log \theta_n = h_{top}(\sigma_{\{y_i\}}). \]
The previous result applies whenever one is working with a multiplet consisting of partial isometries with mutually orthogonal ranges. We next show that, strengthening the hypotheses of the previous result, all the inequalities of Proposition 6.2 become equalities. More precisely, if the positive evaluations of a tracial state τ on the iterated basic monomials, e.g. (xi1 . . . xin )∗ xi1 . . . xin , do not get too small when n increases, then the maximal inverse temperature βmax can be approximated iterating the operator s on τ . The proof is inspired by an analogous result in [MWY] for the Matsumoto algebras associated to subshifts.
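For partial isometries with mutually orthogonal ranges satisfying Cuntz–Krieger relations, the quantity $\theta_n$ is just the number of admissible words of a Markov subshift, and the limit $\frac{1}{n}\log\theta_n$ can be observed numerically. The sketch below assumes a hypothetical 0-1 matrix `A` (the golden-mean matrix, for which the spectral radius is the golden ratio) and hypothetical function names; a word $(i_1, \dots, i_n)$ is taken to be admissible when $a_{i_1 i_2} \cdots a_{i_{n-1} i_n} = 1$.

```python
import math
from itertools import product

# Hypothetical transition matrix (golden-mean shift), chosen for illustration.
A = [[1, 1],
     [1, 0]]

def count_words_brute(M, n):
    """theta_n by direct enumeration of all words of length n."""
    d = len(M)
    return sum(
        all(M[w[k]][w[k + 1]] for k in range(n - 1))
        for w in product(range(d), repeat=n)
    )

def count_words(M, n):
    """theta_n as the sum of the entries of M^(n-1), using exact integers."""
    d = len(M)
    ends = [1] * d                     # admissible words of length 1, by last letter
    for _ in range(n - 1):             # append one more admissible letter
        ends = [sum(ends[i] * M[i][j] for i in range(d)) for j in range(d)]
    return sum(ends)

def entropy_estimate(M, n):
    """(1/n) log theta_n, which approximates the topological entropy."""
    return math.log(count_words(M, n)) / n

golden_ratio = (1 + math.sqrt(5)) / 2
```

For this matrix the word counts are Fibonacci numbers, and `entropy_estimate(A, n)` approaches `math.log(golden_ratio)` as `n` grows, in line with the equality between $h_{top}$ and the logarithm of the spectral radius for finite type subshifts.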


Theorem 6.4. Let $\{y_i\}$, $\{x_j\}$ be finite subsets of $A^1$ such that $\sum_i y_i^* y_i = I$ and $\sum_j x_j x_j^* = I$. If $\tau$ is a tracial state on $A^0$, set
\[ \mu_n(\tau) := \min\{\tau(y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^*) : y_{i_1} \cdots y_{i_n} \ne 0\}, \]
\[ \nu_n(\tau) := \min\{\tau((x_{j_1} \cdots x_{j_n})^* x_{j_1} \cdots x_{j_n}) : x_{j_1} \cdots x_{j_n} \ne 0\}. \]
If
\[ \limsup_n\, \mu_n(\tau)^{1/n} = 1, \]
then
\[ \beta_{\min} = -h(T_{\{y_i\}}) = -h_{top}(\sigma_{\{y_i\}}) = -\limsup_n \frac{1}{n} \log(\delta_n(\tau)). \]
In particular, if $\tau$ is the restriction of a KMS state $\omega$, then $\omega$ has minimal inverse temperature. If instead
\[ \limsup_n\, \nu_n(\tau)^{1/n} = 1, \]

then βmax = h(S{xj } ) = htop (σ {xj } ) = lim sup n

1 log('n (τ )). n

If $\tau$ is the restriction of a KMS state $\omega$, then $\omega$ has maximal inverse temperature.

Proof. We shall prove only the first statement. For infinitely many indices $n$,
\[ (1 - \varepsilon)^n \theta_n \le \theta_n \mu_n(\tau) \le \delta_n(\tau) = \tau\Big( \sum_{i_1, \dots, i_n} y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^* \Big) \le \theta_n, \]
hence taking the $n$th root and then the logarithm of the limit over a subsequence, by the arbitrariness of $\varepsilon$, we get
\[ \limsup_n \frac{1}{n} \log(\delta_n(\tau)) = h_{top}(\sigma_{\{y_i\}}) = h(T_{\{y_i\}}). \]
The last equality follows from Prop. 6.3. Now
\[ \delta_n(\tau)^{1/n} = \|t^n(\tau)\|^{1/n} \le \|t^n\|^{1/n} \to r(t), \]
so $h_{top}(\sigma_{\{y_i\}}) \le -\beta_{\min}$, which, together with Proposition 6.2, proves the first statement. If in particular $\tau$ arises from a KMS state $\omega$ at inverse temperature $\beta$, then $\delta_n(\tau) = e^{-n\beta}$, so $\beta = \beta_{\min}$.

The previous result can be regarded as an analogue of the well known fact from Perron–Frobenius theory that for an irreducible nonnegative matrix $A$ the maximal eigenvalue can be approximated by $r(A) = \limsup_n \|A^n(\tau)\|^{1/n}$, where $\tau$ is any vector with positive entries [G].

Corollary 6.5. If there is a finite subset $\{y_i\}$ (resp. $\{x_j\}$) of $A^1$ such that
(1) $\sum_i y_i^* y_i = I$ (resp. $\sum_j x_j x_j^* = I$),
(2) $y_i y_h^* = 0$, $i \ne h$ (resp. $x_j^* x_k = 0$, $j \ne k$),
(3) the algebra $C$ generated by all finite products of the form $y_{i_1} \cdots y_{i_n} (y_{i_1} \cdots y_{i_n})^*$ (resp. $(x_{j_1} \cdots x_{j_n})^* x_{j_1} \cdots x_{j_n}$) is finite-dimensional,
then the conclusions of the previous theorem hold for any faithful tracial state on $A^0$.

Proof. Just note that under our assumptions any of the nonzero basic monomials is a projection, and therefore it majorizes a minimal projection in $C$. It follows that $\lim_n \mu_n(\tau)^{1/n} = 1$ (resp. $\lim_n \nu_n(\tau)^{1/n} = 1$) for any faithful trace $\tau$ on $A^0$, so the previous theorem applies.

In particular, this result applies to all the examples discussed at the beginning of this section.
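The Perron–Frobenius fact recalled before Corollary 6.5, that $r(A)$ is approximated by $\|A^n(\tau)\|^{1/n}$ for a vector $\tau$ with positive entries, can be illustrated as follows. This is a sketch under assumptions made here for the example: the matrix `A` (an irreducible 0-1 matrix with spectral radius 2), the vector `tau`, the choice of the $\ell^1$ norm, and the function names are all hypothetical.

```python
def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def growth_rate(M, v, n):
    """(||M^n v||_1)^(1/n) for a vector v with positive entries."""
    for _ in range(n):
        v = apply(M, v)
    return sum(v) ** (1.0 / n)

# Hypothetical irreducible nonnegative matrix; its eigenvalues are 2, -1, -1.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

tau = [1, 2, 3]                        # any vector with positive entries

rate_50 = growth_rate(A, tau, 50)
rate_200 = growth_rate(A, tau, 200)
```

With these choices `rate_200` is closer to the spectral radius 2 than `rate_50`, which matches the convergence asserted in the text; the rate of convergence depends on the starting vector.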

7. Topological Entropy of Canonical ucp Maps

Let $\{x_j\}$ be a finite subset of a C*-algebra $A$ consisting of elements of grade 1 such that $\sum_j x_j x_j^* = I$. In the previous section we have associated to this set a one-sided subshift of the Bernoulli shift on $\{1, \dots, d\}^{\mathbb{N}}$, where $d = \operatorname{Card}\{j : x_j \ne 0\}$, in a way that, under suitable circumstances, its classical topological entropy equals an extremal inverse temperature of KMS states. One can also associate to the subset $\{x_j\}$ a unital completely positive map defined by
\[ \sigma_{\{x_j\}} : T \in A \mapsto \sum_j x_j T x_j^*. \]

In view of the results of the previous section, we ask whether there is a relationship between the Voiculescu topological entropy of this map and the classical topological entropy of the subshift {xj } . If A = On is the Cuntz algebra with generators S1 , . . . , Sn , Choda shows in [Ch] that the topological entropy of the canonical endomorphism σ{Si } is log(n), i.e. the topological entropy of the associated full shift. In the more general case where A = OA is a Cuntz–Krieger algebra, and {Sj } is the canonical set of generating partial isometries, Boca and Goldstein [BG] have recently computed the Voiculescu entropy [V] of this map, and they have shown that it equals the logarithm of the spectral radius of A, or, in other words, the classical topological entropy of the underlying finite type subshift A [DGS]. However, special cases, although extreme from a certain point of view, of full periodic C ∗ -dynamical systems are the crossed products by an automorphism α. Brown showed in [B] that htA0 Z (Ad(u)) = htA0 (α), where u is a unitary implementing α. It is obvious that the associated subshift is in this case trivial, so its entropy is zero. In Theorem 7.4 we give, for full periodic C ∗ -dynamical systems, an upper bound for ht(σ{xj } ) which allows to recover the above discussed results as special cases. We will then apply this result to find new examples, among the Matsumoto algebras associated to non finite type subshifts, where ht(σ{xj } ) = htop ({xj } ) still holds.


In the beginning of this section the automorphic action of the circle plays no role, therefore we shall not assume that the xj ’s are of grade 1. We define the associated one-sided subshift = {xj } as in the previous section. We now show that the ucp map σ{xj } can be understood as a noncommutative subshift. Let σ be the restriction of σ to and T the ∗ -monomorphism of C() obtained by transposing σ , i.e. T f (x) = f (σ (x)), x ∈ . Also, we will consider a natural basis of neighborhoods for . For each (i1 , . . . , ir ) ∈ r , consider the cylinder set [i1 . . . ir ] = {(xj )j ∈ : x1 = i1 , . . . , xr = ir }. For a fixed r ∈ N, these constitute an open and closed cover of with cardinality θr = Card r . Proposition 7.1. Let {xj } be a finite subset of A such that j xj xj ∗ = I, and let (, σ ) be the associated one-sided subshift. Then there is a unique unital completely positive map D : C() → A taking the characteristic function of [i1 . . . ir ] to xi1 . . . xir (xi1 . . . xir )∗ . One has σ{xj } ◦ D = D ◦ T . Moreover, if the sequence (mn )n defined in Proposition 6.3 does not converge to 0, then D is faithful. Proof. We first notice that, for each r ∈ N, the map Dr : 7 N → 7 r projecting onto the first r coordinates takes onto subset r of 7 r consisting of θr elements. Therefore there is a natural ∗ -monomorphism φr : Cθr → C() taking a θr -tuple assuming value 1 on (i1 , . . . , ir ) and zero elsewhere to the characteristic function of [i1 . . . ir ]. Similarly, there are, for r ≤ s, natural ∗ -monomorphisms φr,s : Cθr → Cθs such that φs φr,s = φr . Since the cylinder sets {[ii . . . ir ], r ∈ N} form a basis of closed and open sets for , we see that the image of all the φr is dense. It follows that C() is the inductive limit of the Cθr ’s under the maps φr,s . We define the ucp map Dr : Cθr → A which takes ∗ the characteristic function of [i i , . . . , ir ] to the element xi1 . . . xir (xi1 . . . xir ) . 
Since ∗ 0 θ r Ds φr,s = Dr s ≥ r, thanks to j xj xj = I , we get a ucp map D : ∪r C → A. Since Dr (f ) ≤ f , 0 D extends to a ucp map on C(), which is the desired D. The relation D ◦ T = σ{xj } ◦ D can be easily checked on the total set of characteristic functions of cylinder sets. We now construct a conditional expectation Er : C() → Cθr . Choose a faithful normalized Borel measure µ on , and associate to a function f ∈ C() the θr -tuple with coordinates 1 Er (f )(i1 , . . . , ir ) = f (x)dµ(x), µ([i1 . . . ir ]) [i1 ,...,ir ] for each (i1 , . . . , ir ) ∈ r . One can easily check that Er (I ) = I and that Er (f a) = Er (f )a, a ∈ Cθr . Clearly (Er )r converges pointwise in norm to the identity. Assume now that f ∈ C() is a positive element such that D(f ) = 0. Then DEr (f ) converges to 0. On the other hand

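\[
\|D E_r(f)\| = \Big\| \sum_{i_1,\dots,i_r} \frac{1}{\mu([i_1 \dots i_r])} \int_{[i_1 \dots i_r]} f(x)\,d\mu(x)\; x_{i_1} \cdots x_{i_r} (x_{i_1} \cdots x_{i_r})^* \Big\| \ge \frac{1}{\mu([i_1 \dots i_r])} \int_{[i_1 \dots i_r]} f(x)\,d\mu(x)\; m_r,
\]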


therefore if $m_r$ does not converge to 0, a subsequence of $E_r(f)$ converges to 0, so $f = 0$.

Note that if $(m_n)_n$ does not converge to 0 then $\limsup_n m_n^{1/n} = 1$, thus we are in the position of applying Proposition 6.3. A particularly important case is when the $x_j$'s have pairwise orthogonal ranges.

Corollary 7.2. If there is a finite subset $\{x_j\} \subset A$ such that
\[
\sum_j x_j x_j^* = I, \qquad x_i^* x_j = 0, \quad i \neq j,
\]
then the ucp map $D : C(\Sigma^+_{\{x_j\}}) \to A$ constructed in the previous proposition is in fact a $*$-monomorphism. The restriction of $\sigma_{\{x_j\}}$ to $C(\Sigma^+_{\{x_j\}})$ corresponds to the one-sided subshift $T$.

For any subshift 6 the Matsumoto C ∗ -algebra O6 satisfies the requirements of the previous result [M]. Let now (A, γ ) be a full C ∗ -dynamical system over T. Our next aim is to compare the 1 topological entropy∗ of the ucp map σ{xj } , when {x1 , . . . , xd } is a finite subset of A \{0} such that j xj xj = I , with entropic properties of the canonical homogeneous C ∗ algebra. We refer the reader respectively to [V] for the notion of topological entropy for nuclear C ∗ -algebras and to [B] for its generalization to exact C ∗ -algebras, and to [BG] for the generalization of the topological entropy ht(P ) of a ucp map P on a unital exact C ∗ -algebra. We start defining an entropic quantity for the homogeneous subalgebra A0 which, in the case where A = A0 α Z, reduces to the topological entropy of α. We shall assume that A0 is an exact C ∗ -algebra. We start fixing a choice of nonzero elements {xi } of A1 such that i xi xi ∗ = I . Let us define, for µ = (i1 , . . . , ir ) ∈ r , xµ := xi1 . . . xir . We set x∅ = I and |∅| = 0. We shall also consider the operators qα,β = xα ∗ xβ ∈ A0 for |α| = |β| ≥ 0. Note that q∅,∅ = I . Let π : A0 → B(H) a faithful ∗ -representation be given, ω ⊂ A0 a finite subset and δ > 0. Set, for n ∈ N,

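\[
\omega^{(n)} = \{ x_\mu^* q_{\delta,\delta'} T q_{\varepsilon,\varepsilon'} x_\nu :\ T \in \omega,\ |\mu| = |\nu| \le n-1,\ |\delta| = |\delta'| \le n-1,\ |\varepsilon| = |\varepsilon'| \le n-1 \}.
\]
Note that $\omega^{(n)}$ depends on the contractions $\varphi_{x_i,x_j} : T \in A^0 \to x_i^* T x_j$ rather than on the elements $\{x_i\}$. We define, for $a > 0$,
\[
ht_a(\pi, \{\varphi_{x_i,x_j}\}, \omega, \delta) = \limsup_n \frac{1}{n} \log rcp\Big(\pi, \omega^{(n)}, \frac{\delta}{\theta_{n-1}^{\,a}}\Big),
\]
\[
ht_a(\pi, \{\varphi_{x_i,x_j}\}, \omega) = \sup_{\delta > 0} ht_a(\pi, \{\varphi_{x_i,x_j}\}, \omega, \delta),
\]
\[
ht_a(\pi, \{\varphi_{x_i,x_j}\}) = \sup_{\omega \subset A^0 \text{ finite}} ht_a(\pi, \{\varphi_{x_i,x_j}\}, \omega).
\]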

We will use the same notation as [B]. Thus in particular for a finite set $\Omega \subset A^0$, $rcp(\pi, \Omega, \delta)$ is computed with respect to factorizations of completely positive contractions, not necessarily unital, from $A^0$ to $B(H)$ via finite dimensional $C^*$-algebras.


Brown proves in [B] that rcp(π, :, δ) is independent of the choice of π. We will regard A faithfully represented on a Hilbert space H, and we take π to be the inclusion ιA0 of A0 in B(H). Moreover we will avoid indicating π in the above definitions. We anticipate, for later use, the following immediate consequence of the definition. Lemma 7.3. (1) If α is an automorphism of a unital C ∗ -algebra A0 , A = A0 α Z, and u is any unitary of A1 implementing α on A0 , hta (φu,u ) = ht(α −1 ). (2) If ω is such that for some finite dimensional C ∗ -subalgebra D ⊂ A0 which is the range of a conditional expectation, ω(n) ⊂ D except for finitely many n, then hta ({φxi ,xj }, ω) = 0. Proof. (1) for n ∈ N, θn = 1 and ω(n) = ω ∪ · · · ∪ α −n+1 (ω). (2) Let E : A0 → D be a conditional expectation, and set φ := E, ψ := ιD , so that for T ∈ ω(n) and infinitely δ many indices n, ψ ◦ φ(T ) = T . This implies that rcp(ω(n) , θn−1 a ) ≤ rank(D), 0 Let now ω ⊂ A be a finite subset and n0 ∈ N. A typical finite subset of A has the form F (ω, n0 ) = {xγ T , T ∈ ω, |γ | ≤ n0 }. Our aim is to show the following result. Theorem 7.4. Let (A, γ , T) be a full C ∗ -dynamical system, with A0 exact. Let σ{xi } and {xi } be the ucp map and the one-sided subshift associated to a set {xi } ⊂ A1 satisfying ∗ 0 i xi xi = I . Then for any finite subset ω ⊂ A and n0 ∈ N0 , ht(σ{xi } , F (ω, n0 )) ≤ htop (σ {xi } ) + ht2 ({φxi ,xj }, ω). We will prove this theorem combining appropriate analogues of arguments of Brown [B] for crossed product C ∗ -algebras and Boca and Goldstein [BG] for Cuntz–Krieger algebras. Motivated by [B], we define certain cp maps. Let F ⊂ N0 be a finite subset. Set SF : T ∈ A → (xα ∗ m|α|−|β| (T )xβ )α,β∈IF ∈ MθF (A0 ). Here IF := ∪r∈F r , θF = r∈F θr , and, for k ∈ Z, mk : A → Ak is the natural projection. Note that SF is contractive and cp. For a contractive cp map φ : A0 → B set φF := ι ⊗ φ ◦ SF : A → MθF (B) which is contractive and cp. 
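Let $f \in \ell^2(\mathbb{N}_0)$ have support in $F$, and define the cp map
\[
\tilde S_{F,f} : T = (T_{\alpha,\beta}) \in M_{\theta_F}(B(H)) \to \sum_{\alpha,\beta \in I_F} f(|\alpha|) f(|\beta|)\, x_\alpha T_{\alpha,\beta} x_\beta^* \in B(H).
\]
Note that $\tilde S_{F,f}(I) = \|f\|_2^2\, I$. Again, for a contractive cp map $\psi : B \to B(H)$, define $\psi_{F,f} := \tilde S_{F,f} \circ \iota \otimes \psi : M_{\theta_F}(B) \to B(H)$. So $\psi_{F,f}$ is cp contractive if $\|f\|_2 \le 1$.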


Finally, for a contractive cp map $\Sigma : A^0 \to B(H)$ set $D_{\Sigma,F,f} := \tilde S_{F,f} \circ \iota \otimes \Sigma \circ S_F : A \to B(H)$. Note that in particular, if $\varphi$ and $\psi$ are as above, $D_{\psi \circ \varphi,F,f} = \psi_{F,f} \circ \varphi_F$, which factors through the algebra $M_{\theta_F}(B)$. One can easily show that for an element of fixed degree $X \in A^k$,
\[
D_{\iota_{A^0},F,f}(X) = \sum_{p \in F \cap (F+k)} f(p) f(p-k)\, X.
\]

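The following lemma is our analogue of Lemma 3.4 in [B].

Lemma 7.5. Let $\omega$ be a finite subset of the unit ball of $A^0$, $n_0 \in \mathbb{N}$ and $\delta > 0$. Consider the set $F(\omega, n_0) := \{T x_\gamma,\ T \in \omega,\ |\gamma| \le n_0\}$. Then there is a finite set $F \subset \mathbb{N}_0$ which depends only on $n_0$ and $\delta$ and not on $\omega$ such that
\[
rcp(F(\omega, n_0), \delta) \le \theta_F\, rcp\Big( \bigcup_{|\gamma| \le n_0}\ \bigcup_{|\alpha|,|\beta| \in F,\ |\alpha| = |\beta| + |\gamma|} x_\alpha^* \omega x_\gamma x_\beta,\ \frac{\delta}{2 \max_{p \in F} \theta_p} \Big).
\]

Proof. We proceed as in the proof of Lemma 3.4 in [B]. Let $f \in \ell^2(\mathbb{Z})$ be a function with finite support $E$, $\|f\|_2 \le 1$, such that $|f * \tilde f(n) - 1| < \delta/2$, $n = 0, 1, \dots, n_0$. Here $\tilde f(p) = f(-p)$. Replacing $f$ with a suitable translate if necessary, we may assume $E \subset \{n \in \mathbb{Z} : n \ge n_0\}$. Set $F = E \cup (E-1) \cup \dots \cup (E - n_0)$, which is a subset of $\mathbb{N}_0$. Note that $F$ depends on $\delta$ and $n_0$ but not on $\omega$. Consider contractive cp maps $\varphi : A^0 \to B$ and $\psi : B \to B(H)$ with $B$ finite dimensional such that
\[
\|\psi \circ \varphi(a) - a\| < \frac{\delta}{2 \max_{p \in F} \theta_p}, \qquad a \in \bigcup_{|\gamma| \le n_0}\ \bigcup_{|\alpha|,|\beta| \in F,\ |\alpha| = |\beta| + |\gamma|} x_\alpha^* \omega x_\gamma x_\beta.
\]
Let us choose $B$ with minimal rank. Then the cp contractive map $D_{\psi \circ \varphi,F,f}$ factors through the finite dimensional algebra $M_{\theta_F}(B)$, which has rank $\theta_F\, \mathrm{rank}(B)$. We are thus left to show that for $a \in F(\omega, n_0)$, $\|D_{\psi \circ \varphi,F,f}(a) - a\| < \delta$. We write $a = T x_\gamma$ with $T \in \omega$, $|\gamma| \le n_0$. Then the l.h.s. is bounded by
\[
\|D_{\psi \circ \varphi,F,f}(T x_\gamma) - D_{\iota_{A^0},F,f}(T x_\gamma)\| + \|D_{\iota_{A^0},F,f}(T x_\gamma) - T x_\gamma\|.
\]
Now the computation of $D_{\iota_{A^0},F,f}$ given before on elements with fixed degree shows that the second summand is bounded by
\[
\Big| \sum_{p \in F \cap (F + |\gamma|)} f(p) f(p - |\gamma|) - 1 \Big|\, \|T x_\gamma\| = |f * \tilde f(|\gamma|) - 1|\, \|T x_\gamma\| < \delta/2.
\]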
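The function $f$ required at the start of this proof can be made concrete. The sketch below uses a normalized indicator function of a block of integers, which is only one possible choice and is not claimed to be the one used in [B]; the helper names `make_f`, `autocorrelation` and `enlarged_support` are hypothetical. For this $f$ one has $f * \tilde f(n) = (N - n)/N$ for $0 \le n \le N$, so the condition $|f * \tilde f(n) - 1| < \delta/2$ holds as soon as $N > 2 n_0/\delta$.

```python
import math

def make_f(n0, delta):
    """Return a dict p -> f(p): normalized indicator of E = {n0, ..., n0 + N - 1}.

    N is an illustrative choice with n0 / N < delta / 2.
    """
    N = math.ceil(2 * n0 / delta) + 1
    return {p: 1.0 / math.sqrt(N) for p in range(n0, n0 + N)}

def autocorrelation(f, n):
    """Compute (f * f~)(n) = sum_p f(p) f(p - n), where f~(p) = f(-p)."""
    return sum(value * f.get(p - n, 0.0) for p, value in f.items())

def enlarged_support(f, n0):
    """Return F = E u (E - 1) u ... u (E - n0), with E the support of f."""
    return sorted({p - k for p in f for k in range(n0 + 1)})

n0, delta = 3, 0.1
f = make_f(n0, delta)
F = enlarged_support(f, n0)
errors = [abs(autocorrelation(f, n) - 1.0) for n in range(n0 + 1)]
```

With these values every entry of `errors` stays below `delta / 2`, and `F` is a finite subset of the nonnegative integers that depends only on `n0` and `delta`.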


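We now evaluate the first summand.
\[
\begin{aligned}
\|D_{\psi \circ \varphi,F,f}(T x_\gamma) - D_{\iota_{A^0},F,f}(T x_\gamma)\|
&\le \|\iota_{M_{\theta_F}} \otimes (\psi \circ \varphi - \iota_{A^0}) \circ S_F(T x_\gamma)\| \\
&= \Big\| \sum_{p \in F \cap (F + |\gamma|)} \Big( \sum_{|\alpha| = p,\ |\beta| = p - |\gamma|} e_{\alpha,\beta} \otimes (\psi \circ \varphi - \iota_{A^0})(x_\alpha^* T x_\gamma x_\beta) \Big) \Big\| \\
&= \max_{p \in F \cap (F + |\gamma|)} \Big\| \sum_{|\alpha| = p,\ |\beta| = p - |\gamma|} e_{\alpha,\beta} \otimes (\psi \circ \varphi - \iota_{A^0})(x_\alpha^* T x_\gamma x_\beta) \Big\| \\
&\le \delta/2.
\end{aligned}
\]
The last inequality follows from our choice of $\psi$ and $\varphi$ and from the fact that if a matrix $A \in M_{h,k}(A^0)$ has entries of norm bounded by $c$ then $\|A\| \le (hk)^{1/2} c$.

Consider the contractive cp map $\rho_r : A \to M_{\theta_r}(A)$ taking $T \in A$ to the matrix $(x_\mu^* T x_\nu)_{|\mu| = |\nu| = r}$. (One can easily check that $\rho_r$ is a unital $*$-monomorphism with image the corner algebra $P_r M_{\theta_r}(A) P_r$, where $P_r = (x_\mu^* x_\nu)$ is an orthogonal projection.) For $n, n_0 \in \mathbb{N}$, $m \ge n + n_0 - 1$, $l = 0, \dots, n-1$, $|\alpha| \le n_0$, $T \in A^0$, we compute
\[
\rho_m(\sigma^l(x_\alpha T)) = \rho_m\Big( \sum_{|\eta| = l} x_{\eta\alpha} T x_\eta^* \Big) = \Big( \sum_{|\eta| = l} x_\mu^* x_{\eta\alpha} T x_\eta^* x_\nu \Big)_{|\mu| = |\nu| = m}.
\]
Writing $\mu = \gamma\mu'$, $|\gamma| = |\eta| + |\alpha| = l + |\alpha|$, and $\nu = \delta\nu'$, $|\delta| = |\eta| = l$, we have:
\[
\rho_m(\sigma^l(x_\alpha T)) = \Big( \sum_{|\eta| = l} x_{\mu'}^* q_{\gamma,\eta\alpha} T q_{\eta,\delta} x_{\nu'} \Big)_{|\mu| = |\nu| = m}.
\]
Setting, again, $\nu' = \varepsilon\nu''$ with $|\varepsilon| = |\mu'| = m - l - |\alpha|$, we obtain that
\[
\rho_m(\sigma^l(x_\alpha T)) = \Big( \sum_{|\eta| = l} (x_{\mu'}^* q_{\gamma,\eta\alpha} T q_{\eta,\delta} x_\varepsilon)\, x_{\nu''} \Big)_{|\mu| = |\nu| = m}.
\]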

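If now $T$ ranges over a finite subset $\omega \subset A^0$, we see that the image of $F(\omega, n_0) = \{x_\alpha T,\ T \in \omega,\ |\alpha| \le n_0\} \subset A$ under the maps $\rho_{n+n_0-1} \circ \sigma^l$, $l = 0, \dots, n-1$, is constituted by matrices of size $\theta_{n+n_0-1}$ with entries sums of at most $\theta_{n-1}$ elements in $F(\omega^{(n+n_0)}, n_0)$.

Proof of Theorem 7.4. We apply the previous lemma to the sets $F(\omega^{(n+n_0)}, n_0)$ for fixed $n_0$ and $\omega$ and all $n \in \mathbb{N}$. Note that the corresponding set $F$ can be chosen independent of $n$. We can thus find for each $n \in \mathbb{N}$ a contractive cp map $\Sigma_n : A \to B(H)$ factoring through a finite dimensional algebra $B$ of rank
\[
\theta_F\, rcp\Big( \bigcup_{|\gamma| \le n_0}\ \bigcup_{|\alpha|,|\beta| \in F,\ |\alpha| = |\beta| + |\gamma|} x_\alpha^* \omega^{(n+n_0)} x_\gamma x_\beta,\ \frac{\delta / \theta_{n+n_0-1}^2}{2 \max_{p \in F} \theta_p} \Big)
\le \theta_F\, rcp\Big( \omega^{(n+n_0+\max F)},\ \frac{\delta}{2\, \theta_{n+n_0-1+\max F}^2 \max_{p \in F} \theta_p} \Big)
\]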


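such that
\[
\|\Sigma_n(T x_\gamma) - \pi(T x_\gamma)\| < \frac{\delta}{\theta_{n+n_0-1}^2}, \qquad T \in \omega^{(n+n_0)}, \quad |\gamma| \le n_0.
\]
Consider the ucp map $C_m : M_{\theta_m}(B(H)) \to B(H)$ taking the matrix $(t_{\mu,\nu})$ to the operator
\[
\sum_{|\mu| = |\nu| = m} \pi(x_\mu)\, t_{\mu,\nu}\, \pi(x_\nu)^*.
\]
Then the map $C_{n+n_0-1} \circ \iota_{M_{\theta_{n+n_0-1}}} \otimes \Sigma_n \circ \rho_{n+n_0-1} : A \to B(H)$ factors through an algebra of rank bounded by
\[
\theta_{n+n_0-1}\, \theta_F\, rcp\Big( \omega^{(n+n_0+\max F)},\ \frac{\delta}{2\, \theta_{n+n_0+\max F-1}^2 \max_{p \in F} \theta_p} \Big).
\]
Thus if we show that
\[
\|C_{n+n_0-1} \circ \iota \otimes \Sigma_n \circ \rho_{n+n_0-1}(\sigma^l(x_\gamma T)) - \sigma^l(x_\gamma T)\| < \delta
\]
for $T \in \omega$, $|\gamma| \le n_0$ and $l = 0, \dots, n-1$, we will deduce that
\[
rcp\big(F(\omega, n_0) \cup \dots \cup \sigma^{n-1}(F(\omega, n_0)), \delta\big) \le \theta_{n+n_0-1}\, \theta_F\, rcp\Big( \omega^{(n+n_0+\max F)},\ \frac{\delta}{2\, \theta_{n+n_0+\max F-1}^2 \max_{p \in F} \theta_p} \Big)
\]
and the conclusion will follow. Now as $C_m \circ \rho_m = \iota_A$, it suffices to show that
\[
\|\iota_{M_{\theta_{n+n_0-1}}} \otimes (\Sigma_n - \iota_A) \circ \rho_{n+n_0-1}(\sigma^l(x_\gamma T))\| < \delta.
\]
This follows from our choice of $\Sigma_n$ and from the fact that entries of $\rho_m(\sigma^l(x_\gamma T))$, $m = n + n_0 - 1$, are sums of at most $\theta_{n-1}$ elements of $F(\omega^{(n+n_0)}, n_0)$.

Corollary 7.6. Consider the same situation as in Theorem 7.4. Let $(\omega_\alpha)_{\alpha \in A}$ be a net of finite subsets of $A^0$ with total union. Then
\[
ht(\sigma_{\{x_j\}}) \le h_{top}(\sigma_{\Sigma^+_{\{x_j\}}}) + \lim_\alpha ht_2(\{\varphi_{x_i,x_j}\}, \omega_\alpha).
\]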

Proof. This is a straightforward consequence of the fact that ∪α,n0 F (ωα , n0 )∪ F (ωα , n0 )∗ is total in A and of the Kolmogorov–Sinai property of the entropy of a ucp map, [V], [B, BG]. Remark. If in particular A = A0 α Z and u ∈ A1 is a unitary implementing α on A0 then u is a single point space, so its entropy is zero. By Lemma 7.3 and the previous corollary, we recover Brown’s result that htA (Ad(u)) ≤ htA0 (α) (and therefore one deduces an equality by monotonicity of topological entropy [B].) The case where we can choose the xj ’s with pairwise orthogonal ranges is of course of special interest, the Cuntz algebras, Cuntz–Krieger algebras and Matsumoto algebras belonging to this class. The next result shows that the estimate of the entropy can be made more precise in this case.


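Theorem 7.7. Let $(A, \gamma, \mathbb{T})$ be a full $C^*$-dynamical system, with $A^0$ exact. Let $\{x_j\} \subset A^1$ be a finite subset such that
\[
\sum_j x_j x_j^* = I, \qquad x_i^* x_j = 0, \quad i \neq j, \qquad \sum_j x_j^* x_j \ \text{is invertible.}
\]
Then
\[
h_{top}(\sigma_{\Sigma^+_{\{x_j\}}}) \le ht(\sigma_{\{x_j\}}) \le h_{top}(\sigma_{\Sigma^+_{\{x_j\}}}) + \lim_\alpha ht_1(\{\varphi_{x_i,x_j}\}, \omega_\alpha),
\]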

where (ωα )α∈A is any net of finite subsets of A0 with total union in A0 . If in particular for some net (ωα )α∈A ht1 ({φxi ,xj }, ωα ) = 0, α ∈ A, then ht(σxj ) = htop (σ {xj } ). Proof. The proof of the second inequality ≤ goes exactly as that of Theorem 7.4 with the only exception that entries of ρm (σ l (xγ T )) are now already elements of F (ω(n+n0 ) , n0 ). We show that ht(σ{xj } ) ≥ htop (σ {xj } ). By monotonicity of topological entropy [B,V] and Corollary 7.2, ht(σ{xj } ) ≥ ht(T ), where, as before, T denotes the ∗ -monomorphism of C() implemented by the one-sided shift. We are thus left to show that ht(T ) ≥ htop (σ {xj } ). The proof is similar to that of [BG], which in turn goes back to [V, Prop. 4.6]. Let µ be a σ -invariant probability Borel measure on the two-sided subshift 6 = 6{xj } defined before Prop. 6.1, and let us restrict it to a σ -invariant probability measure on = 6+ . For any ucp map γ : M → C(), with M finite dimensional, let hµ,T (γ ) be defined as in [CNT], by means of the function Hµ (γ , T γ , . . . , T n−1 γ ). Reasoning as in [V, Prop. 4,6] we see that hµ,T (γ ) ≤ ht(T ). Choosing M = Cθn and γ : Cθn → C() the natural inclusion, then one finds, thanks to [CNT, Remark III.5.2], that the classical measurable entropy Hµ (σ 6 ) is ≤ ht(T ). Taking the supremum over all invariant measures we obtain the claim, by the classical variational principle for topological entropy [DGS, Theorem 18.8]. Remark. It is natural to ask whether the upper bound for ht(σ{xj } ) described in the previous result can be further improved to htop (σ {xj } ) + ht0 ({φxi ,xj }). We will discuss examples in [KP]. We next show that the estimates above obtained are good enough to compute ht(σ{xj } ) in the case of Cuntz–Krieger algebras. This was first done by Boca and Goldstein [BG]. Corollary 7.8. [BG] Let A = OA be a Cuntz–Krieger algebra defined by a {0, 1}matrix A, and let {Si }d1 be the canonical set of generating partial isometries. 
Then ht(σ{Si } ) = htop (σ {Si } ) = log(r(A)).
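The value $\log(r(A))$ in this corollary can be illustrated numerically. The sketch below is not part of the paper: it assumes a primitive $\{0,1\}$-matrix (so that plain power iteration converges), uses the golden mean matrix as an arbitrary example, and compares $\log(r(A))$ with the exponential growth rate of the number of admissible words. The function names are hypothetical.

```python
import math

def spectral_radius(matrix, iterations=200):
    """Estimate the spectral radius of a primitive nonnegative matrix by power iteration."""
    n = len(matrix)
    vec = [1.0] * n
    radius = 0.0
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        radius = max(new)
        vec = [x / radius for x in new]
    return radius

def count_words(matrix, length):
    """Count words i_1 ... i_length with matrix[i_k][i_{k+1}] == 1 for every k."""
    n = len(matrix)
    counts = [1] * n  # counts[j] = number of admissible words ending in symbol j
    for _ in range(length - 1):
        counts = [sum(counts[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return sum(counts)

golden_mean = [[1, 1], [1, 0]]  # the transition 1 -> 1 is forbidden
entropy = math.log(spectral_radius(golden_mean))
growth = math.log(count_words(golden_mean, 400)) / 400
```

For this example `entropy` is the logarithm of the golden ratio, and `growth` approaches it as the word length increases.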


Proof. The second equality is the well known computation of entropy of finite type subshifts of order two. See, for example, Proposition 17.12 in [DGS]. The first equality will follow from the previous theorem provided we show that there is an increasing sequence ωp , p ∈ N of finite subsets of OA 0 with total union such that ht1 ({φSi ,Sj }, ωp ) = 0,

p ∈ N.

Set $P_i := S_i S_i^*$ and define $\omega_p := \{S_\alpha P_i S_\beta^*,\ i = 1, \dots, d,\ |\alpha| = |\beta| \le p\}$. It is clear that $\bigcup_p \omega_p$ is total in the homogeneous subalgebra. One easily checks that, for fixed $p$ and all $n$, $\omega_p^{(n)}$ is contained in the linear span of $\omega_p$, which is a finite dimensional $C^*$-algebra, so by Lemma 7.3 $ht_1(\{\varphi_{S_i,S_j}\}, \omega_p) = 0$.

We next look at the class of Matsumoto algebras $O_\Sigma$ associated to a general subshift $\Sigma \subset \{1, \dots, d\}^{\mathbb{Z}}$ [M]. See also Sect. 4.1. Let $\{S_i\}_1^d$ be the canonical set of generating partial isometries. We recall from [M] a few properties of $O_\Sigma$. First, the relations
\[
S_i^* S_j = 0, \quad i \neq j, \tag{7.1}
\]
\[
\sum_i S_i S_i^* = I, \tag{7.2}
\]
which easily imply that for any pair of words with the same length, $q_{\alpha,\beta} := S_\alpha^* S_\beta = 0$ unless $\alpha = \beta$. We will write $q_\alpha$ for $q_{\alpha,\alpha}$, $\alpha \in \bigcup_r \Sigma_r$. Note that these are projections. Furthermore the following commutation relations hold in $O_\Sigma$: for $\mu, \nu \in \bigcup_r \Sigma_r$,
\[
q_\mu S_\nu = S_\nu q_{\mu\nu}, \tag{7.3}
\]
\[
q_\mu q_\nu = q_\nu q_\mu. \tag{7.4}
\]

By (7.4) the algebra $Q_l$ generated by the projections $\{q_\alpha,\ \alpha \in \bigcup_{k=0}^{l} \Sigma_k\}$ is commutative and, being generated by finitely many projections, finite dimensional. These properties imply that the finite sets
\[
\omega_{k,l} := \{S_\alpha E S_\beta^*,\ |\alpha| = |\beta| \le k,\ E \text{ minimal projection in } Q_l\}
\]
have total union in $O_\Sigma^0$. It follows that $O_\Sigma^0$ is AF [M], so $O_\Sigma$ is a nuclear $C^*$-algebra. Using properties (7.1)-(7.4) one can show with tedious computations that for all $n \in \mathbb{N}$,
\[
\omega_{k,l}^{(n)} \subset \{S_\alpha q S_\beta^*,\ |\alpha| = |\beta| \le k,\ q \text{ projection in } Q_{2n+\max(k,l)}\}.
\]
This computation is aimed to give an estimate for $ht_a(\varphi_{S_i,S_j}, \omega_{k,l})$.

Lemma 7.9. If $O_\Sigma$ is the Matsumoto $C^*$-algebra associated to a subshift $\Sigma$ we have, for $a > 0$ and $k, l \in \mathbb{N}_0$,
\[
ht_a(\varphi_{S_i,S_j}, \omega_{k,l}) \le 2 \limsup_n \frac{1}{n} \log(\dim(Q_n)).
\]


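Proof. Let $\varphi : O_\Sigma^0 \to B$ and $\psi : B \to O_\Sigma^0$ be unital completely positive maps such that $\|\psi\varphi(a) - a\| < \delta/\theta_k$ when $a$ ranges over the projections of $Q_{2n+2\max(k,l)}$, and assume that $B$ has minimal rank. Consider as before the maps $\rho_k : O_\Sigma^0 \to M_{\theta_k}(O_\Sigma^0)$ and $C_k : M_{\theta_k}(O_\Sigma^0) \to O_\Sigma^0$, which satisfy $C_k \circ \rho_k = \iota_{O_\Sigma^0}$. Define $\varphi' := \iota_{M_{\theta_k}} \otimes \varphi \circ \rho_k$ and $\psi' := C_k \circ \iota_{M_{\theta_k}} \otimes \psi$. Then if $a$ is a projection of $Q_{2n+\max(k,l)}$ and $|\alpha| = |\beta| \le k$,
\[
\|\psi' \circ \varphi'(S_\alpha a S_\beta^*) - S_\alpha a S_\beta^*\| \le \Big\| \sum_{|\gamma| = |\gamma'| = k} e_{\gamma,\gamma'} \otimes (\psi \circ \varphi - \iota)(S_\gamma^* S_\alpha a S_\beta^* S_{\gamma'}) \Big\| < \delta
\]
because $S_\gamma^* S_\alpha a S_\beta^* S_{\gamma'}$ is a projection in $Q_{2n+2\max(k,l)}$. Any element of $\omega_{k,l}^{(n)}$ being of the form $S_\alpha a S_\beta^*$, we deduce that for all $n \in \mathbb{N}$ and $\delta > 0$,
\[
rcp(\omega_{k,l}^{(n)}, \delta) \le \theta_k\, rcp(\mathrm{Proj}(Q_{2n+2\max(k,l)}), \delta/\theta_k) \le \dim(Q_{2n+2\max(l,k)}).
\]
The last inequality follows from the existence of a conditional expectation onto $Q_{2n+2\max(k,l)}$. The rest follows choosing $\delta$ of the form $\delta/\theta_{n-1}^{\,a}$, taking the logarithm, dividing by $n$ and passing to the lim sup.

We combine the previous result with Theorem 7.7.

Theorem 7.10. If $\sigma_{\{S_i\}}$ is the ucp map associated to the canonical set of generators of a Matsumoto $C^*$-algebra $O_\Sigma$,
\[
h_{top}(\Sigma) \le ht(\sigma_{\{S_i\}}) \le h_{top}(\Sigma) + 2 \limsup_n \frac{1}{n} \log(\dim(Q_n)).
\]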
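The correction term in Theorem 7.10 vanishes whenever $\dim(Q_n)$ grows subexponentially. As a small arithmetic illustration (a sketch, with the hypothetical helper `correction_term`), take the growth $\dim(Q_n) = n + 1$ that this section quotes from [KMW] for non-sofic β-shifts:

```python
import math

def correction_term(dim_q, n):
    """Return (2 / n) * log(dim(Q_n)) for a given dimension function dim_q."""
    return 2.0 * math.log(dim_q(n)) / n

def beta_shift_dim(n):
    """Dimension growth dim(Q_n) = n + 1, as quoted from [KMW] for non-sofic beta-shifts."""
    return n + 1

values = [correction_term(beta_shift_dim, n) for n in (10, 1000, 100000)]
```

The entries of `values` decrease towards 0, which is why the upper and lower bounds of the theorem coincide in that case.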

We conclude this section discussing two examples of subshifts, beyond Markov subshifts, for which $ht(\sigma_{\{S_i\}}) = h_{top}(\Sigma)$. The first example is that of sofic subshifts, see [DGS] and the references quoted therein. By [M] a subshift is sofic if and only if $\bigcup_n Q_n$ is finite dimensional. Then Theorem 7.10 yields the desired equality. Another example is that of β-shifts associated to the β-expansion of real numbers [Re, Par, Bl]. In this case it is proved in [KMW] that if the β-shift is not sofic, $\dim(Q_n) = n + 1$, and this leads again to the same conclusion.

8. CNT Dynamical Entropy and Variational Principle

In fact, if $A = O_n$ Choda shows in [Ch] not only that $h_{top}(\sigma) = \log(n)$ but also that, if $\varphi$ is the unique KMS state of $O_n$, then $h_\varphi(\sigma) = h_{top}(\sigma) = \log(n)$, where the l.h.s. denotes the Connes–Narnhofer–Thirring dynamical entropy of $\sigma$ [CNT]. This result has its own importance, as it exhibits a fundamental example where a noncommutative variational principle for the entropy holds true. (We refer the reader to [DGS] for a formulation of the variational principle for the entropy in ergodic theory for compact spaces.) It is an open problem to establish for which $C^*$-algebras a noncommutative variational principle for the entropy holds. In this section we give a class of examples which generalize Choda's result. Examples will be the canonical ucp map of the Cuntz–Krieger algebras, or certain Matsumoto algebras associated to non finite type subshifts. In this section we show that the CNT dynamical entropy of the ucp map $\sigma_{\{x_j\}}$ defined in the previous section is $\ge$ the m.t. entropy of the associated subshift $\Sigma_{\{x_j\}}$, as defined, e.g., in [DGS], see Theorems 8.5, 8.6. This inequality looks similar to that of Theorem 7.7


relative to the topological entropy, but it goes in the reverse order. We start establishing the setting of the CNT entropy. Let $A$ be a unital $C^*$-algebra, let $\gamma_i : A_i \to A$, $i = 1, \dots, n$, be ucp maps from finite-dimensional $C^*$-algebras, and let $\varphi$ be a state on $A$. Let us recall from [CNT] that an Abelian model for $(A, \varphi, \gamma_1, \dots, \gamma_n)$ is given by an Abelian finite-dimensional $C^*$-algebra $B$, a state $\mu$ on $B$ and subalgebras $B_1, \dots, B_n$ of $B$ for which there is a ucp map $E : A \to B$ with $\varphi = \mu \circ E$. Consider first the entropy of the Abelian model $(B, \mu, B_1, \dots, B_n)$ as defined in [CNT, III.3] and then the quantity $H_\varphi(\gamma_1, \dots, \gamma_n)$, defined as the supremum of the entropies of all the Abelian models (see [CNT, Def. III.4]). The following result is an obvious consequence of the definition.

Lemma 8.1. If, for $i = 1, \dots, n$, $\gamma_i : A_i \to A$ is a $*$-monomorphism, $\bigvee_{i=1}^n \gamma_i(A_i)$ is finite-dimensional and commutative and if there exists a conditional expectation $E : A \to \bigvee_{i=1}^n \gamma_i(A_i)$ such that $\varphi \circ E = \varphi$, then
\[
H_\varphi(\gamma_1, \dots, \gamma_n) \ge S\big(\varphi|_{\bigvee_{i=1}^n \gamma_i(A_i)}\big),
\]
where the r.h.s. denotes the classical m.t. entropy of the restriction of $\varphi$ to $\bigvee_{i=1}^n \gamma_i(A_i)$.

Proof. Let us define $B = \bigvee_{i=1}^n \gamma_i(A_i)$, $B_i = \gamma_i(A_i)$, $\mu = \varphi|_B$. Then $(B, \mu, B_1, \dots, B_n)$ is an Abelian model for $(A, \varphi, \gamma_1, \dots, \gamma_n)$. Let $E_i : B \to B_i$ denote the canonical conditional expectation associated to $\mu$. Then $E_i \circ E \circ \gamma_i : A_i \to \gamma_i(A_i)$ coincides with $\gamma_i$, which is a $*$-isomorphism, thus its entropy defect is zero (see [CNT], Sect. II). It follows from [CNT, Def. III.4] that $H_\varphi(\gamma_1, \dots, \gamma_n) \ge S\big(\varphi|_{\bigvee_{i=1}^n \gamma_i(A_i)}\big)$.

Let now $\sigma$ be a ucp map of our $C^*$-algebra $A$ such that $\varphi \circ \sigma = \varphi$, and let $\gamma : M \to A$ be a ucp map from a finite-dimensional $C^*$-algebra $M$. Define the m.t. dynamical entropy of $\gamma$ with respect to $\varphi$ to be
\[
h_{\varphi,\sigma}(\gamma) = \lim_n \frac{1}{n} H_\varphi(\gamma, \sigma \circ \gamma, \dots, \sigma^{n-1} \circ \gamma),
\]
and, finally, define the m.t. dynamical entropy of $\sigma$ as $h_\varphi(\sigma) = \sup_\gamma \{h_{\varphi,\sigma}(\gamma)\}$, where the supremum is taken over all possible $\gamma : M \to A$.

Corollary 8.2. Let $A$ be a unital $C^*$-algebra, $\varphi$ a state of $A$, and let $\sigma$ be a ucp map of $A$ such that $\varphi \circ \sigma = \varphi$. Let $\gamma : M \to A$ be a unital $*$-monomorphism from a commutative finite-dimensional $C^*$-algebra $M$. Assume that the smallest $\sigma$-stable $C^*$-subalgebra $C$ of $A$ containing $\gamma(M)$ is commutative and that $\sigma|_C$ is a $*$-monomorphism. If, for $n \in \mathbb{N}$, there exists a conditional expectation $E_n : A \to C_n := \bigvee_{i=0}^{n-1} \sigma^i \circ \gamma(M)$ such that $\varphi \circ E_n = \varphi$, then
\[
h_{\varphi,\sigma}(\gamma) \ge h_{\varphi|_C}(\sigma|_C, \gamma(M)),
\]
where the r.h.s. denotes the classical m.t. dynamical entropy of the partition of the spectrum of $C$ defined by $\gamma(M)$ with respect to $\sigma|_C$ (see, e.g., [DGS, Def. 10.8]). It follows that
\[
h_\varphi(\sigma) \ge h_{\varphi|_C}(\sigma|_C),
\]
where the r.h.s. denotes the classical m.t. entropy of the epimorphism of the spectrum of $C$ defined by the restriction of $\sigma$ ([DGS, Def. 10.10]).


Proof. Just apply the previous lemma to $\gamma_1 = \gamma, \dots, \gamma_n = \sigma^{n-1}\gamma$ and then pass to the limit. The last assertion is a consequence of the classical Kolmogorov–Sinai property of the entropy.

In order to apply the above result, one needs to know under which conditions on the system $(A, \sigma, \gamma)$ as in Cor. 8.2 every invariant measure $\mu$ on $C$ extends to a $\sigma$-invariant state $\varphi$ on $A$ fulfilling all the requirements of the previous corollary. We start giving a well known sufficient condition for the existence of invariant conditional expectations.

Lemma 8.3. Let $M \subset A$ be a unital inclusion of $C^*$-algebras, with $M$ commutative and finite-dimensional, and let $\varphi$ be a state on $A$ faithful on $M$ and such that
\[
\varphi(am) = \varphi(ma), \qquad a \in A,\ m \in M.
\]
Then there is a unique conditional expectation $E : A \to M$ such that $\varphi \circ E = \varphi$.

Proof. Set $E(a) = \sum_e \varphi(e)^{-1} \varphi(ae)\, e$, the sum running over the minimal projections $e$ of $M$, and check that $E$ is the desired conditional expectation. Uniqueness follows easily from faithfulness of $\varphi$ on $M$.

We next give a condition on $(A, \sigma, \gamma)$ so that every invariant measure on the spectrum of $C$ extends to a $\sigma$-invariant state on $A$ containing $C$ in its centralizer. In view of the previous lemma, this would imply the existence of conditional expectations $E_n$ as in Corollary 8.2, satisfying all the necessary requirements.

Proposition 8.4. Let $A$ be a unital $C^*$-algebra endowed with a ucp map $\sigma$, and let $C \subset A$ be a unital $\sigma$-stable, AF, commutative $C^*$-subalgebra. If
\[
\sigma(ca) = \sigma(c)\sigma(a), \qquad c \in C,\ a \in A,
\]
then every state $\mu$ on $C$ satisfying $\mu \circ \sigma = \mu$ extends to a state $\varphi$ on $A$ such that
(1) $\varphi \circ \sigma = \varphi$,
(2) $\varphi(ca) = \varphi(ac)$, $c \in C$, $a \in A$.
In particular, if $\mu$ is faithful, for every finite-dimensional $C^*$-subalgebra $M \subset C$ there exists a unique conditional expectation $E_M : A \to M$ such that $\varphi \circ E_M = \varphi$.
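The formula for the conditional expectation in the proof of Lemma 8.3 can be tried out in the simplest finite-dimensional setting. The sketch below is only an illustration under assumptions that are not in the paper: $A$ is taken to be the $3 \times 3$ real matrices, $M$ the diagonal matrices, and $\varphi(a) = \sum_i w_i a_{ii}$ a state with positive weights, so that $M$ lies in the centralizer of $\varphi$. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def make_state(weights):
    """State phi(a) = sum_i weights[i] * a[i][i]; weights are positive and sum to 1."""
    return lambda a: sum(w * a[i][i] for i, w in enumerate(weights))

def minimal_projections(n):
    """Minimal projections e_i (diagonal matrix units) of the diagonal subalgebra M."""
    return [[[1.0 if r == c == i else 0.0 for c in range(n)] for r in range(n)]
            for i in range(n)]

def conditional_expectation(phi, projections, a):
    """E(a) = sum over minimal projections e of phi(e)^(-1) * phi(a e) * e."""
    n = len(a)
    result = [[0.0] * n for _ in range(n)]
    for e in projections:
        coeff = phi(matmul(a, e)) / phi(e)
        for r in range(n):
            for c in range(n):
                result[r][c] += coeff * e[r][c]
    return result

phi = make_state([0.5, 0.3, 0.2])
projs = minimal_projections(3)
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
Ea = conditional_expectation(phi, projs, a)
```

In this toy case `Ea` is the diagonal part of `a`, the map is idempotent, and `phi(Ea)` agrees with `phi(a)`, as required by $\varphi \circ E = \varphi$.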

Proof. Let $C_1 \subset C_2 \subset \dots$ be an increasing sequence of unital finite-dimensional $C^*$-subalgebras of $C$ with dense union, and, for $n \in \mathbb{N}$, let $F_n$ be the set of minimal projections of $C_n$. Set $K_0 = \{\varphi \in S(A) : \varphi|_C = \mu\}$, which is a nonempty convex and compact subset of the state space $S(A)$ in the weak$^*$ topology. The function $f_0$ taking any state $\varphi$ on $A$ to the state $a \to \sum_{e \in F_1} \varphi(eae)$ is weakly$^*$-continuous and leaves $K_0$ invariant, thus, by the Schauder–Tychonov fixed point theorem, the fixed point set
\[
K_1 = \{\varphi \in K_0 : f_0(\varphi) = \varphi\}
\]


is nonempty. Note that $K_1$ is still compact and convex. Define now the weakly$^*$-continuous function $f_1 : S(A) \to S(A)$ taking $\varphi$ to $a \in A \to \varphi(\sum_{e \in F_2} eae)$ and check again that $K_1$ is invariant under $f_1$, so that
\[
K_2 = \{\varphi \in K_1 : f_1(\varphi) = \varphi\}
\]
is nonempty. We thus find iteratively a decreasing sequence $K_0 \supset K_1 \supset K_2 \supset \dots$ of nonempty compact convex subsets of $S(A)$. Consider the compact convex set $K := \bigcap_{n \in \mathbb{N}} K_n$. A state $\varphi$ is in $K$ if and only if
\[
\varphi|_C = \mu, \qquad \varphi(ae) = \varphi(ea), \quad e \in \bigcup_n F_n,\ a \in A,
\]
and therefore, $\bigcup_n C_n$ being dense in $C$,
\[
\varphi(ca) = \varphi(ac), \qquad c \in C,\ a \in A.
\]

We next define the weakly∗ -continuous function fσ : S(A) → S(A) taking φ to φ ◦ σ and we claim that fσ leaves K invariant. First, any φ ∈ K restricts to µ on C, thus, being C and µ σ -invariant, fσ (φ) restricts to µ on C, as well. We are left to show that for φ ∈ K, C is in the centralizer of fσ (φ). We compute, for c ∈ C, a ∈ A, fσ (φ)(ca) = φ(σ (ca)) = φ(σ (c)σ (a)) = φ(σ (a)σ (c)) = φ(σ (ac)) = fσ (φ)(ac). Thus, applying again the Schauder–Tychonov fixed point theorem, we find a fixed point φ of fσ , which is the desired extension of µ. The last assertion now follows from the previous lemma. We now collect all the results we have obtained, in the form of a theorem. Theorem 8.5. Let A be a unital C ∗ -algebra, endowed with a faithful ucp map σ , and let γ : M → A be a unital ∗ -monomorphism from a commutative finite-dimensional C ∗ -algebra M. Assume that the smallest σ -stable C ∗ -subalgebra C of A containing γ (M) is commutative and that σ (ca) = σ (c)σ (a),

c ∈ C, a ∈ A.

Let $\mu$ be a faithful $\sigma$-invariant state on $C$ extended to a $\sigma$-invariant state $\varphi$ on $A$ centralized by $C$, this being possible by Prop. 8.4. Then
\[
h_\varphi(\sigma) \ge h_\mu(\sigma|_C),
\]
where the r.h.s. denotes the classical m.t. entropy of the epimorphism of the spectrum of $C$ defined by the restriction of $\sigma$ to $C$.

We now go back to the situation where $\sigma = \sigma_{\{x_j\}}$ is the ucp map implemented by a finite subset $\{x_j\}$ of $A$ such that $\sum_j x_j x_j^* = I$.


Theorem 8.6. Let $A$ be a unital $C^*$-algebra, and let $\{x_j\}$ be a finite subset constituted by $d$ nonzero partial isometries satisfying
\[
\sum_j x_j x_j^* = I, \qquad \sum_j x_j^* x_j \ \text{is invertible},
\]
\[
x_i^* x_j = 0, \quad i \neq j,
\]
\[
[x_{i_1} \cdots x_{i_r} (x_{i_1} \cdots x_{i_r})^*,\ x_j^* x_j] = 0, \qquad j, i_1, \dots, i_r = 1, \dots, d,\ r \in \mathbb{N}.
\]

Let $\sigma_{\{x_j\}}$ be the ucp map implemented by the multiplet $\{x_j\}$. Let $D : C(\Sigma^+_{\{x_j\}}) \to A$ be the natural $*$-monomorphism defined in Corollary 7.2. Induce a shift-invariant measure $\mu^+$ on $\Sigma^+_{\{x_j\}}$ from a shift-invariant measure $\mu$ on $\Sigma_{\{x_j\}}$ and then extend $\mu^+$ to a $\sigma$-invariant state $\varphi$ on $A$ centralized by $D(C(\Sigma^+_{\{x_j\}}))$, by Prop. 8.4. Then
\[
h_\varphi(\sigma_{\{x_j\}}) \ge h_\mu(\sigma_{\Sigma_{\{x_j\}}}).
\]

Proof. Consider the finite dimensional commutative $C^*$-algebra $M$ of $C(\Sigma^+_{\{x_j\}})$ generated by the characteristic functions of the cylinder sets $[i]$, $i : x_i \neq 0$. $C(\Sigma^+_{\{x_j\}})$ is naturally embedded in $A$ via the $*$-monomorphism $D$ defined in Cor. 7.2. Clearly $D(C(\Sigma^+_{\{x_j\}}))$ is generated by the ranges of $\sigma_{\{x_j\}}^i \circ D(M)$. Also, $\sigma$ is faithful as $\sum_j x_j^* x_j$ is invertible. The commutation relations between the domain projections and the range projections of the iterated products of the $x_j$'s easily show that $\sigma_{\{x_j\}}(ca) = \sigma_{\{x_j\}}(c)\sigma_{\{x_j\}}(a)$, $c \in D(C(\Sigma^+_{\{x_j\}}))$, $a \in A$. In particular, $\sigma_{\{x_j\}}$ is a $*$-monomorphism on $D(C(\Sigma^+_{\{x_j\}}))$. Thus the previous theorem applies.

If for example $A = O_A$ or, more generally, $A = O_\Sigma$, the assumptions of the previous theorem hold true. Note that, with the notation and assumptions of the previous result, we know that, using also Corollary 7.6,
\[
h_\mu(\Sigma_{\{x_j\}}) \le h_\varphi(\sigma_{\{x_j\}}) \le ht(\sigma_{\{x_j\}}) \le h_{top}(\Sigma_{\{x_j\}}) + \lim_\alpha ht_1(\{\varphi_{x_i,x_j}\}, \omega_\alpha), \tag{8.1}
\]

where ωα is any net of finite subsets of A0 with total union. The middle inequality is due to Voiculescu [V]. In classical ergodic theory, a probability measure m on a dynamical system (X, T ) such that m ◦ T ∗ = m is called an equilibrium measure, or a measure with maximal entropy, if it maximizes the entropy, i.e. if hm (X) = htop (X). It is well known that dynamical systems arising from subshifts admit equilibrium measures, see [DGS]. Applying this fact to our subshift 6{xj } , we see that there exists a shift-invariant measure µ on 6{xj } with hµ (6{xj } ) = htop (6{xj } ). Combining with the previous inequality, we obtain, under the simplifying assumption that the second summand in (8.1) vanishes, an existence theorem of equilibrium states in the noncommutative situation above considered.


Corollary 8.7. Consider the same situation as in Theorem 8.6. Let $\mu$ be a shift-invariant measure on $\Sigma_{\{x_j\}}$ with maximal entropy, and let us extend it to a $\sigma_{\{x_j\}}$-invariant state $\varphi$ on $A$ centralized by $D(C(\Sigma^+_{\{x_j\}}))$. Assume furthermore that for a net $\omega_\alpha$ of finite subsets of $A^0$ with total union, $ht_1(\{\varphi_{x_i,x_j}\}, \omega_\alpha) = 0$. Then
\[
h_\mu(\Sigma_{\{x_j\}}) = h_\varphi(\sigma_{\{x_j\}}) = ht(\sigma_{\{x_j\}}) = h_{top}(\Sigma_{\{x_j\}}).
\]
By virtue of the remark following Theorem 7.10, we obtain the following result.

Corollary 8.8. Let $\Sigma$ be a subshift of one of the following types: (1) Markov shift, (2) sofic subshift, (3) β-shift. Let us extend an invariant measure $\mu$ on $\Sigma$ with maximal entropy to a state $\varphi$ on $O_\Sigma$ centralized by the canonical commutative subalgebra $C(\Sigma^+) \subset O_\Sigma$. Then
\[
h_\mu(\Sigma) = h_\varphi(\sigma_{\{S_i\}}) = ht(\sigma_{\{S_i\}}) = h_{top}(\Sigma),
\]
where $\{S_i\}$ is the canonical set of generating partial isometries of the Matsumoto algebra $O_\Sigma$.

9. From KMS States to Equilibrium States

In this section we make an attempt to show a closer connection between KMS states on full periodic $C^*$-dynamical systems studied in Sects. 1–6 and equilibrium states considered in Sects. 7–8. To motivate the results of this section, we consider the classical situation of a topological dynamical system $(X, T)$ over a compact space $X$. A Borel probability measure $m$ on $X$ is called conformal if $m \circ T^*$ is equivalent to $m$. The study of conformal measures is of particular importance as it leads to equilibrium states of the system [DU]. Now in our noncommutative setting, where we replace $X$ by a unital $C^*$-algebra $A$ endowed with a full action of the circle, and $T$ by the ucp map $\sigma_{\{x_j\}}$, KMS states provide a natural class of states on $A$ which play the role of conformal measures. Indeed we have the following immediate result.

Proposition 9.1. Let $(A, \gamma)$ be a full periodic $C^*$-dynamical system over a unital $C^*$-algebra $A$, and let $\{x_j\}$ be a finite subset of $A^1$ such that $\sum_j x_j x_j^* = I$ and $\sum_j x_j^* x_j$ is invertible. If $\omega$ is a KMS state at inverse temperature $\beta$ then
\[
\omega \circ \sigma_{\{x_j\}} = \omega(a\, \cdot),
\]
where $a = e^{-\beta} \sum_j x_j^* x_j$ is obviously a positive and invertible element of $A^0$.

We show how to produce $\sigma$-invariant states on $A$ from KMS states of the system $(A, \gamma)$. Consider the completely positive map $S_{\{x_j\}} : a \in A \to \sum_j x_j^* a x_j$ already considered in Sect. 2. Let $\omega$ be a faithful KMS state for $(A, \gamma)$ at maximal inverse temperature $\beta_{max} = \log(\lambda_{max})$. Then, for $t > \lambda_{max}$, consider the series
\[
\sum_{k=0}^{+\infty} \frac{S_{\{x_j\}}^k(I)}{t^{k+1}},
\]


which we claim to be Cauchy for every seminorm $p_T$, $T \in A$, where $p_T(a) = |\omega(aT)|$, $a \in A$. We show the claim:
\[
\omega\Big( \sum_{k=n}^{m} \frac{S_{\{x_j\}}^k(I)}{t^{k+1}}\, T \Big) = \lambda_{max}^{-1} \sum_{k=n}^{m} \Big( \frac{\lambda_{max}}{t} \Big)^{k+1} \omega(\sigma_{\{x_j\}}^k(T));
\]
now the r.h.s. is converging to 0 as $m, n \to \infty$, since $\sum_k \sigma^k / \mu^{k+1}$ is norm converging for $\mu > 1$. Being $\omega$ faithful, $b_t := \sum_{k=0}^{+\infty} S_{\{x_j\}}^k(I) / t^{k+1}$ lies in the enveloping von Neumann algebra of $A$. Set
\[
a_t = (t - \lambda_{max})\, b_t, \tag{9.1}
\]
so $\omega(a_t) = 1$, and define $\omega_t := \omega(a_t\, \cdot)$. Then for any $T \in A$,
\[
\omega_t(T) - \omega_t(\sigma_{\{x_j\}}(T)) = \omega(a_t T) - \lambda_{max}^{-1} \omega(S_{\{x_j\}}(a_t) T) = \lambda_{max}^{-1} (t - \lambda_{max})\, \omega\big((\lambda_{max} - S_{\{x_j\}})(b_t)\, T\big) \to 0
\]
for $t \to \lambda_{max}$. So any weak$^*$-limit point $\varphi$ of $\omega_t$ for $t \to \lambda_{max}$ is a $\sigma_{\{x_j\}}$-invariant state on $A$. Note that if every $x_\mu^* x_\mu$ commutes with any $x_\nu x_\nu^*$, for all multiindices $\mu$ and $\nu$, then the Banach space generated by the $x_\nu x_\nu^*$ is in the centralizer of any such $\varphi$.

The next result can be regarded as an example where the construction of equilibrium states out of KMS states is explicit. This is a noncommutative analogue of the well known relationship between the Perron–Frobenius Theorem and equilibrium states for Markov subshifts, see Proposition 17.14 in [DGS].

Theorem 9.2. Let $(A, \gamma, \mathbb{T})$ be a unital, full, periodic $C^*$-dynamical system, and let $O_A \subset A$ be a unital $\mathbb{Z}$-graded inclusion of the Cuntz–Krieger algebra associated to an irreducible matrix $A$, in $A$. If $\omega$ is a faithful KMS state of $A$ at maximal inverse temperature $\log(\lambda_{max})$, then
(1) $\lambda_{max} = r(A)$,
(2) $\omega_t$ is norm convergent, for $t \to \lambda_{max}^+$, to a $\sigma_{\{S_j\}}$-invariant state $\varphi$ centralized by $C(\Sigma_A^+)$, where $\{S_j\}$ is the canonical set of generators of $O_A$,
(3) $\varphi$ restricts on $C(\Sigma_A^+)$ to the unique probability measure $\mu$ for which $h_\mu(\Sigma_A) = h_{top}(\Sigma_A)$.
In particular, if for a net $\omega_\alpha$ of finite subsets of $A^0$ with total union $ht_1(\{\varphi_{x_i,x_j}\}, \omega_\alpha) = 0$, then
\[
h_\mu(\Sigma_A) = h_\varphi(\sigma_{\{x_j\}}) = h_{top}(\sigma_{\{x_j\}}) = h_{top}(\Sigma_A) = \log(r(A)).
\]


C. Pinzari, Y. Watatani, K. Yonetani

Proof. It is known that Markov subshifts defined by irreducible matrices have a unique maximal measure, see Theorem 19.14 in [DGS]. The elements $a_t$, $t \ge \lambda_{\max}$, defined as in (9.1) belong to the finite-dimensional C∗-subalgebra of $C(\Sigma_A^+)$ generated by $S_i S_i^*$, the characteristic functions of the cylinders [i], i = 1, …, d. Since $\omega(a_t) = 1$ and ω is faithful, there exists a norm-limit point a of $a_t$, for $t \to \lambda_{\max}^+$. Inspection shows that a is an eigenvector of $S_{\{S_i\}}$ with eigenvalue $\lambda_{\max}$, and therefore it corresponds to a left eigenvector $(v_i)$ of A, normalized so that ω(a) = 1. In particular, $a_t$ is convergent. Since ω is a KMS state of A, and hence of $\mathcal{O}_A$ w.r.t. the gauge action, evaluating ω on $S_i S_i^*$ gives the unique, up to a scalar, positive right eigenvector $(u_j)$ of A. The normalization ω(a) = 1 yields $\sum_i u_i v_i = 1$. Evaluating φ on $S_{i_1} \ldots S_{i_r}(S_{i_1} \ldots S_{i_r})^*$ gives, with $\lambda = \lambda_{\max}$,

$$\begin{aligned}
\phi\big(S_{i_1} \ldots S_{i_r}(S_{i_1} \ldots S_{i_r})^*\big) &= \omega\big(a\, S_{i_1} \ldots S_{i_r}(S_{i_1} \ldots S_{i_r})^*\big) = v_{i_1}\, \omega\big(S_{i_1} \ldots S_{i_r}(S_{i_1} \ldots S_{i_r})^*\big) \\
&= \frac{v_{i_1}}{\lambda^{r}}\, \omega\big((S_{i_1} \ldots S_{i_r})^* S_{i_1} \ldots S_{i_r}\big) = \frac{v_{i_1}}{\lambda^{r}}\, a_{i_1,i_2} \ldots a_{i_{r-1},i_r} \sum_j \omega\big(a_{i_r,j}\, S_j S_j^*\big) \\
&= \frac{v_{i_1} u_{i_r}}{\lambda^{r-1}}\, a_{i_1,i_2} \ldots a_{i_{r-1},i_r}.
\end{aligned}$$
If we now compare with the formula given in Prop. 17.14 in [DGS], we see that $\mu = \phi|_{C(\Sigma_A^+)}$ is precisely the unique measure on $\Sigma_A^+$ with maximal entropy.

Acknowledgements. Part of this paper was written during a visit of C.P. at the Mathematics Department of the University of Orleans. She wishes to thank C. Anantharaman–Delaroche for the invitation and for drawing attention to Furstenberg’s example, and J. Renault for many fruitful discussions. We are also indebted to H. Matui for pointing out an error in Sect. 7 of a previous version of this paper.

References

[Bl] Blanchard, F.: β-Expansions and symbolic dynamics. Theor. Computer Sci. 65, 131–141 (1989)
[BDR] Blackadar, B., Dadarlat, M., Rørdam, M.: The real rank of inductive limit C∗-algebras. Math. Scand. 69, 211–216 (1991)
[BG] Boca, F., Goldstein, P.: Topological entropy for the canonical endomorphism of the Cuntz–Krieger algebras. Preprint 1999. math.OA/9906210
[BEH] Bratteli, O., Elliott, G., Herman, R.: On the possible temperatures of a dynamical system. Commun. Math. Phys. 74, 281–295 (1980)
[BEK] Bratteli, O., Elliott, E., Kishimoto, A.: The temperature state space of a C∗-dynamical system I. Yokohama Math. J. 28, 125–167 (1980)
[BJ] Bratteli, O., Jørgensen, P.E.T.: Isometries, shifts, Cuntz algebras and multiresolution wavelet analysis of scale N. Integral Equations Operator Theory 28, 382–443 (1997)
[B] Brown, N.P.: Topological entropy in exact C∗-algebras. Math. Ann. 314, 347–367 (1999)
[Ch] Choda, M.: Endomorphisms of shift type (entropy for endomorphisms of Cuntz algebras). In: S. Doplicher, R. Longo, J.E. Roberts, L. Zsido (eds) Operator Algebras and Quantum Field Theory. Proceedings, Rome 1996, Cambridge, MA: International Press, pp. 469–475
[CNT] Connes, A., Narnhofer, H., Thirring, W.: Dynamical entropy of C∗-algebras and von Neumann algebras. Commun. Math. Phys. 112, 691–719 (1987)
[Co] Connes, A.: Compact metric spaces: Fredholm modules, and hyperfiniteness. Ergod. Th. & Dynam. Sys. 9, 207–220 (1989)
[C] Cuntz, J.: Simple C∗-algebras generated by isometries. Commun. Math. Phys. 57, 173–185 (1977)
[CK] Cuntz, J., Krieger, W.: A class of C∗-algebras and topological Markov chains. Invent. Math. 56, 251–268 (1980)
[DGS] Denker, M., Grillenberger, C., Sigmund, K.: Ergodic theory on compact spaces. LNM 527, Berlin–Heidelberg–New York: Springer-Verlag 1976
[DU] Denker, M., Urbański, M.: On the existence of conformal measures. Trans. Am. Math. Soc. 328, 563–587 (1991)
[E] Evans, D.: Gauge actions on $\mathcal{O}_A$. J. Operator Theory 7, 79–100 (1982)
[EFW] Enomoto, M., Fuji, M., Watatani, Y.: KMS states for gauge actions on $\mathcal{O}_A$. Math. Japon. 29, 607–619 (1984)
[G] Gantmacher, F.R.: The theory of matrices, Vol. 2. Chelsea: Chelsea Publishing Company, 1964
[GP] Goldstein, P., Pinzari, C.: Work in progress
[GHJ] Goodman, F.M., de la Harpe, P., Jones, V.F.R.: Coxeter graphs and towers of algebras. New York: Springer-Verlag, 1989
[H] Hutchinson, J.E.: Fractals and self-similarity. Indiana Univ. Math. J. 30, 713–747 (1981)
[JP] Jørgensen, P.E.T., Pedersen, S.: Harmonic analysis of fractal measures. Constr. Approx. 12, 1–30 (1996)
[KPW] Kajiwara, T., Pinzari, C., Watatani, Y.: Ideal structure and simplicity of the C∗-algebras generated by Hilbert bimodules. J. Funct. Anal. 159, 295–322 (1998)
[K] Katayama, Y.: Generalized Cuntz algebras $\mathcal{O}_N^M$. RIMS Kokyuroku 858, 87–90 (1994)
[KMW] Katayama, Y., Matsumoto, K., Watatani, Y.: Simple C∗-algebras arising from β-expansion of real numbers. Ergod. Th. & Dynam. Sys. 18, 937–962 (1998)
[KP] Kerr, D., Pinzari, C.: Noncommutative pressure and the variational principle in Cuntz–Krieger type C∗-algebras. Preprint 2000
[Ma] Mañe, R.: Ergodic theory and differentiable dynamics. Berlin–Heidelberg: Springer-Verlag, 1987
[MP] Martin, M., Pasnicu, C.: Some comparability results in inductive limit C∗-algebras. J. Operator Th. 30, 137–147 (1993)
[M] Matsumoto, K.: On C∗-algebras associated with subshifts. Int. J. Math. 8, 357–374 (1997)
[M2] Matsumoto, K.: Dimension groups for subshifts and simplicity of the corresponding C∗-algebras. To appear in J. Math. Soc. Japan
[MWY] Matsumoto, K., Watatani, Y., Yoshida, M.: KMS states for gauge actions on C∗-algebras associated with subshifts. Math. Z. 3, 489–509 (1998)
[OPI] Olesen, D., Pedersen, G.K.: Some C∗-dynamical systems with a single KMS state. Math. Scand. 42, 111–118 (1978)
[OP] Olesen, D., Pedersen, G.K.: Applications of the Connes spectrum to C∗-dynamical systems III. J. Funct. Anal. 45, 357–390 (1982)
[Par] Parry, W.: On the β-expansions of real numbers. Acta Math. Acad. Sci. Hung. 11, 401–416 (1960)
[P] Pimsner, M.: A class of C∗-algebras generalizing both Cuntz–Krieger algebras and crossed products by Z. In: Voiculescu, D. (ed.) Free probability theory, Providence, RI: AMS, 1997
[Re] Renyi, A.: Representations of real numbers and their ergodic properties. Acta Math. Acad. Sci. Hung. 8, 477–493 (1957)
[R] Rørdam, M.: Classification of certain infinite simple C∗-algebras. J. Funct. Anal. 131, 415–458 (1995)
[RiI] Rieffel, M.: Metrics on states from actions of compact groups. Doc. Math. 3, 215–229 (1998); math.OA/9807084
[RiII] Rieffel, M.: Metrics on state spaces. Doc. Math. 4, 559–600 (1999); math.OA/9906151
[T] Takesaki, M.: Theory of operator algebras I. New York: Springer-Verlag, 1979
[V] Voiculescu, D.: Dynamical approximation entropies and topological entropy in operator algebras. Commun. Math. Phys. 170, 249–281 (1995)
[W] Wassermann, A.: Exact C∗-algebras and related topics. Lecture Notes Series No. 19, GARC, Seoul: Seoul National University, 1994

Communicated by A. Connes

Commun. Math. Phys. 213, 381 – 411 (2000)


The Generalized Milnor–Thurston Conjecture and Equal Topological Entropy Class in Symbolic Dynamics of Order Topological Space of Three Letters

Shou-Li Peng$^{1,2}$, Xu-Sheng Zhang$^{2}$

$^1$ CCAST (World Laboratory), P.O. Box 8730, Beijing 100080, P. R. China
$^2$ Center for Nonlinear Complex Systems, Department of Physics, College of Science, Yunnan University, Kunming, Yunnan 650091, P. R. China. E-mail: [email protected]

Received: 4 February 1998 / Accepted: 1 March 2000

Abstract: This paper presents an answer to an open problem in the dynamical systems of three letters: the generalized Milnor–Thurston conjecture on the existence of infinitely many plateaus of topological entropy in the two-dimensional parameter plane. The concept of equal topological entropy class is introduced by the dual star product, which is a generalization of the Derrida–Gervois–Pomeau star product to the symbolic dynamics of three letters for the endomorphisms on the interval. The algebraic rules established by the dual star products for the doubly superstable kneading sequences are equivalent to the normal factorization of the Milnor–Thurston characteristic polynomials. Moreover, the classification theory of symbolic primitive and compound sequences based on the topological conjugacy in the meaning of equal entropy is completed in the topological space $\Sigma_3$ of three letters.

1. Introduction

About twenty years ago, Milnor and Thurston [1] proposed the kneading theory for the study of piecewise monotone maps on the one-dimensional interval, which has played a key role in establishing the symbolic dynamics of endomorphisms on the interval. At almost the same time Derrida, Gervois and Pomeau [2] presented their star product of endomorphisms on the ground of experimental symbolic description [3], which supplied a symbolic representation of two letters to real numbers. This star product provides an extremely important tool for studying the complete classification and the universalities in unimodal maps [4, 5]. These two works laid the foundation of the symbolic dynamics of one-dimensional endomorphisms, even if they were not thought to have any connection with each other in the initial stage of development of the symbolic dynamics of one-dimensional endomorphisms.
Milnor and Thurston [1] conjectured that topological entropies increase monotonically in the region of non-zero entropy and there exist infinitely many intervals of constant topological entropies in the one-dimensional parameter line of the quadratic maps. This


conjecture includes two statements. The first statement describes the monotonic increase of topological entropies, and this can be explained by combining the one-to-one correspondence between the lexicographical order of the invariant coordinate and the order of real numbers with the one-to-one correspondence between the smallest positive zero of the kneading characteristic polynomial and the topological entropy [6]. The second one, a crucial part of their conjecture, points out the existence of infinitely many plateaus (or steps of the staircase) of topological entropies. We have recently verified that the conjecture holds in the order topological space $\Sigma_2$ of two letters [7-9]. There exist infinitely many steps with the cardinal ℵ1 on the parameter line, and each step is an equal topological entropy class which consists of infinitely many compound sequences with the same cardinal ℵ1. The algebraic rule for constructing an equal topological entropy class is nothing but the Derrida–Gervois–Pomeau (DGP) star product [7-9]. Moreover, it leads to a complete classification of sets of monotone equivalent classes of unimodal maps for all periodic, quasi-periodic and aperiodic orbits. However, when attempting to generalize the Milnor–Thurston conjecture to dynamical systems of three letters, a difficult problem arises: how to seek all equal topological entropy classes and extract the algebraic rules for generating them in $\Sigma_3$? This is still an important problem in the symbolic dynamics of interval maps. The subject of this paper is to establish all the equal topological entropy classes for the symbolic dynamics of three letters of the one-dimensional endomorphism and to elaborate on their elementary algebraic properties.
It is now recognized that the equal topological entropy class conjectured by Milnor and Thurston is one of the most important concepts in symbolic dynamics in the following sense: (1) There is an algebraic rule or transformation in the equal topological entropy class, for instance, the normal (DGP) star product in the symbolic dynamics of unimodal maps [2], or the Farey transformation in the circle maps without non-increasing intervals [5]. (2) The metric universality in Feigenbaum’s bifurcations is confined in the equal topological entropy class [8–10]. (3) The renormalization group operator of equal topological entropy class should be established by the second topological conjugate invariant [11]. Recently we obtained a new method [11] for seeking the equal topological entropy class, which is based on the second topological conjugate invariant. It is worth emphasizing that this new topological method can distinguish all compound words from the set of kneading sequences. In order to select the sequences of the equal topological entropy class, the criterion is the clustering of order and the block diagonalization of Stefan matrix of their sequences in symbolic space. After using the method of symbolic dynamics, all sequences of equal topological entropy class are found. Then the rules of normal star products will be able to be extracted from equal topological entropy class. We think [12,13] that the most important two counterparts of DGP star product are composed of two algebraic operations: (a) multiplication of symbolic sequences and (b) parity inverse operations on the turning points, because it is shown that these two operations will be a good match with the disturbance of kneading theory. By comparing the sequences of equal entropy which have been found with the two algebraic operations, the rule of dual star products of the class will be revealed. 
It is well known that the achievement of the kneading theory lies in its dividing maps into the monotone equivalent classes according to the symbolic dynamics of orbits of turning points. By virtue of the rules of dual star products in $\Sigma_3$ we can further divide the maps in the parameter plane into much finer topological conjugate equivalent classes, namely, the equal topological entropy classes. We will show that these classes are sets of symbolic sequences whose kneading characteristic polynomials can be normally factorized, which further control the formation of equal topological entropy


classes. Clearly, the dual star product [13] is a normal star product because it comes from an equal topological entropy class. It is known that one focuses on exploring some basic rules of the admissible set of sequences and on establishing the frame of symbolic dynamics in the beginning stage. For instance, for symbolic dynamics of three letters [14-17] and four letters [18], in the studies of star products, star transformations and seed rules of admissible genealogy, etc., one does not pay attention to the property of equal topological entropy for the set of sequences. The normal star product which preserves topological entropy is important to the endomorphism on the interval. It not only reveals the complete topological structure and classifications of sequences, but also provides a fundamental frame to the metric universality of dynamical systems when the universality law [19,20] is again extracted from the bifurcation families in the equal topological entropy class [7–9]. Therefore, once the kneading theory is equipped with the algebraic structure of the normal star product, the intrinsic topological and metric features of the symbolic dynamics of three letters can be developed. In particular, a natural solution to the second statement of the Milnor–Thurston conjecture in symbolic systems of three letters is easily deduced. The order of sequence pairs in the two-dimensional kneading plane is complicated; however, we believe that there is a harmonic relationship between the order of sequence pairs and that of topological entropies. Thus, the first statement of the Milnor–Thurston conjecture cannot be generalized in a simple way.

After introducing notation in Sect. 2, we give the definition and the elementary algebraic properties of dual star products in the symbolic space $\Sigma_3$ of three letters in Sect. 3. In Sect. 4 we prove the factorization of the kneading characteristic polynomial of the compound sequence and further illustrate the topological entropy preserved by the dual star products. The proofs of the factorization of the compound sequences present clearly the subtle structure of the dual star products. In Subsection 4.5 we demonstrate the Sarkovskii-like order embedding theorem in the kneading plane. In Sect. 5, we first extend the dual star product onto any admissible sequence pairs and then show the generalization of the Milnor–Thurston conjecture from the one-dimensional step (interval) of topological entropy to the two-dimensional plateaus on the kneading plane of symbolic systems of three letters. Some important applications are sketched at the end of this section.

2. Symbolic Dynamics of Bimodal Map

2.1. Description of symbolic system. In this section we review the symbolic dynamics of the bimodal map, give the admissibility conditions of sequences, and fix some notation. We consider the maps of the endomorphism $f_{\lambda,\mu} : I \to I$ on the interval I = [a, b] (a < b; a, b ∈ ℝ), which depend on the two parameters $(\lambda, \mu) \in \mathbb{R}^2$. The bimodal maps can be separated into two cases according to whether there are one or two decreasing intervals among the three intervals. In this paper, without loss of generality, we focus only on the bimodal maps with one decreasing interval, (+ − +); the case (− + −) is similar but rather complicated. Let $I_1$, $I_2$, and $I_3$ be the laps associated with the map $f_{\lambda,\mu}$, and let $c_1$ and $c_2$ be the turning points of $f_{\lambda,\mu}$, numbered in their natural order, $I_1 \prec c_1 \prec I_2 \prec c_2 \prec I_3$. To each point x ∈ I, one can assign the itinerary $A(f^*(x)) = A_0 A_1 \ldots A_j \ldots$, defined to be the sequence of symbols $A_n \in \{I_1, c_1, I_2, c_2, I_3\}$, $n \in \mathbb{Z}_+$, such that $f^n_{\lambda,\mu}(x) \in A_n$. The kneading sequences $K^i(f) = k_0^i k_1^i \ldots k_j^i \ldots$ are itineraries starting from $f(c_i)$, i = 1, 2, and the kneading invariant of the map $f_{\lambda,\mu}$, the 2-tuple $(K^1(f), K^2(f))$, is preserved under orientation-preserving topological conjugacy.
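The assignment of itineraries and kneading sequences described above can be sketched numerically. The following is only an illustration under stated assumptions: the piecewise-linear map `f` on [0, 1] with turning points 1/3 and 2/3 is a hypothetical (+ − +) example, not a map taken from the paper, and the letters L, C, M, D, R stand for $I_1$, $c_1$, $I_2$, $c_2$, $I_3$. Exact rational arithmetic is used so that hitting a turning point is detected reliably.

```python
from fractions import Fraction

# Hypothetical (+ - +) bimodal map on I = [0, 1]; the turning points
# c1 = 1/3 and c2 = 2/3 are assumptions chosen for illustration only.
C1, C2 = Fraction(1, 3), Fraction(2, 3)


def f(x):
    """Increasing on [0, c1], decreasing on [c1, c2], increasing on [c2, 1]."""
    if x <= C1:
        return 3 * x
    if x <= C2:
        return 2 - 3 * x
    return 3 * x - 2


def symbol(x):
    """Address of x: L = I1, C = c1, M = I2, D = c2, R = I3."""
    if x < C1:
        return "L"
    if x == C1:
        return "C"
    if x < C2:
        return "M"
    if x == C2:
        return "D"
    return "R"


def itinerary(x, n):
    """First n symbols A_0 A_1 ... A_{n-1}, where f^k(x) lies in A_k."""
    out = []
    for _ in range(n):
        out.append(symbol(x))
        x = f(x)
    return "".join(out)


def kneading_pair(n):
    """Truncated kneading invariant: itineraries starting from f(c1), f(c2)."""
    return itinerary(f(C1), n), itinerary(f(C2), n)
```

For this particular toy map both turning points land on fixed endpoints, so the truncated kneading pair consists of constant words; a map depending on two parameters (λ, µ) would produce the richer pairs discussed in the text.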


2.2. Admissibility conditions. Let the symbolic order ≺ be the MSS order [3] or, equivalently, the lexicographical order [1], which is complete [4]. For any two sequences X and Y, if $\varphi^k(X) \preceq X$ and $Y \preceq \varphi^k(Y)$ for all $k \in \mathbb{Z}_+$, then X is called maximal, Y minimal, and the sequence pair S := (X, Y) is an extreme pair of sequences. The shift operator ϕ is defined as $\varphi^k(X) = x_{k+1} x_{k+2} \ldots$ for the sequence $X = x_1 \ldots x_k x_{k+1} \ldots$. A sequence pair S is called compatible if $\varphi^k(Y) \preceq X$ and $Y \preceq \varphi^k(X)$ for all $k \in \mathbb{Z}_+$. If the sequence pair S further satisfies the following conditions:

$$K^2 \preceq \varphi^k(X), \qquad \varphi^{k'}(Y) \preceq K^1, \qquad k, k' \in \mathbb{Z}_+, \tag{2.1}$$

then this sequence pair S is admissible with respect to the kneading sequences $K^1$ and $K^2$, which follow the turning points $c_1$ and $c_2$ respectively. All the admissible sequence pairs obeying the condition (2.1) form an admissible set K, and fill up the whole kneading plane of symbolic systems of three letters. To span the sets {X} and {Y}, we can produce all these sequences by continuously operating the superorder left-handed multiplication $(I_1 \prec I_2 \prec I_3)\otimes$ [21] on the natural order $I_1 \prec c_1 \prec I_2 \prec c_2 \prec I_3$. For instance, $(I_1 \prec I_2 \prec I_3) \otimes (I_1 \prec c_1 \prec I_2 \prec c_2 \prec I_3)$ generates the following ordered sequences,

$$I_1I_1 \prec I_1c_1 \prec I_1I_2 \prec I_1c_2 \prec I_1I_3 \prec I_2I_3 \prec I_2c_2 \prec I_2I_2 \prec I_2c_1 \prec I_2I_1 \prec I_3I_1 \prec I_3c_1 \prec I_3I_2 \prec I_3c_2 \prec I_3I_3.$$

Along with the natural order $I_1 \prec c_1 \prec I_2 \prec c_2 \prec I_3$, we have the sequences

$$I_1I_1 \prec I_1c_1 \prec I_1I_2 \prec I_1c_2 \prec I_1I_3 \prec c_1 \prec I_2I_3 \prec I_2c_2 \prec I_2I_2 \prec I_2c_1 \prec I_2I_1 \prec c_2 \prec I_3I_1 \prec I_3c_1 \prec I_3I_2 \prec I_3c_2 \prec I_3I_3.$$

So the operation $(I_1 \prec I_2 \prec I_3)^{\otimes n} \otimes (I_1 \prec c_1 \prec I_2 \prec c_2 \prec I_3)$ will produce all sequences {X} and {Y} when n → ∞, $I_1^\infty \prec \ldots \prec I_2^\infty \prec \ldots \prec I_3^\infty$. It can be broken into two sections, $I_1^\infty \prec \ldots \prec I_2^\infty$ and $I_2^\infty \prec \ldots \prec I_3^\infty$, which are evidently the sequences of {Y} and {X} respectively. The order topological space $\Sigma_3 := \{\prec, A\}$ is defined as the product of sets {X} and {Y}, where the sequence pairs (X, Y) ∈ K, and the order ≺ is used in the following sense: for any two admissible sequence pairs $A^1 = (X^1, Y^1)$ and $A^2 = (X^2, Y^2)$, $A^1 \prec A^2$ if $X^1 \prec X^2$, or if $X^1 = X^2$ and $Y^2 \prec Y^1$. If $X = K^1$, $Y = K^2$, this pair obviously satisfies the condition (2.1), namely,

$$K^2 \preceq \varphi^k(K^2), \qquad \varphi^{k'}(K^1) \preceq K^1, \qquad k, k' \in \mathbb{Z}_+. \tag{2.2}$$

In particular, if the sequences $K^2$ and $K^1$ are periodic, and they run through the two turning points $c_1 = C$ and $c_2 = D$, i.e., $K^1 = XDYC \equiv K$, $K^2 = YCXD \equiv \tilde K$, condition (2.2) is reduced to

$$YC \prec \varphi^k(XD), \qquad \varphi^{k'}(YC) \prec XD, \qquad k = 1, \ldots, |XD|, \quad k' = 1, \ldots, |YC|. \tag{2.3}$$

The above sequence pair is called the doubly superstable kneading pair. Let $K_0$ be the set of all doubly superstable kneading pairs. It is obvious that $K_0 \subset K$. The doubly superstable kneading pairs are the representatives of all admissible sequence pairs. These


doubly superstable kneading pairs construct all joints of the skeleton in the kneading plane, from which the bones spanning the kneading plane grow [14,15]. Hence studying the properties of the doubly superstable kneading pairs is quite crucial to an analysis of the construction of the entire kneading plane. In this paper we will mainly concentrate on the doubly superstable kneading sequence pairs $(K, \tilde K)$. It is convenient to introduce the permutation operation ∼ between the maximal and minimal representations of a sequence. For any periodic sequence W, let its maximal representation be $W_M$ and minimal $W_m$; then $W_M$ and $W_m$ can exchange with each other under the operation ∼, namely, $\tilde W_M = W_m$, $\tilde W_m = W_M$. It is obvious that any doubly superstable kneading pair can be written as $(K, \tilde K)$, here $K = XDYC$, $\tilde K = YCXD$. When no confusion arises, we will call a doubly superstable kneading sequence pair $(K, \tilde K)$ a superstable kneading sequence K or $\tilde K$, or simply a sequence K hereinafter.

2.3. Kneading matrix [1]. Let V be the three-dimensional vector space over the rational numbers with the basis of the formal symbols $\{I_1, I_2, I_3\}$. Each sequence of a point x, $A(f^*(x)) = A_0, A_1, \ldots, A_n, \ldots$, can be assigned a symbolic sequence $\theta = \theta_0, \theta_1, \ldots, \theta_n, \ldots$ of vectors from V; their invariant coordinates are given as

$$\theta_n = \epsilon_{n-1} A_n; \qquad \epsilon_{n-1} = \prod_{k=0}^{n-1} \varepsilon(A_k),$$

where $\theta_0 = A_0$, $\epsilon_0 = \varepsilon_0$, and $\varepsilon(I_j)$ is the parity of the symbol $I_j$, which denotes the increasing or decreasing of the interval $I_j$, and is fixed as $\varepsilon(I_1) = \varepsilon(I_3) = +1$, $\varepsilon(I_2) = -1$. By an indeterminate t and the vectors $\theta_n$ we can compose a formal power series $\theta(t) = \sum_{k=1}^{\infty} \theta_{k-1} t^{k-1}$ of the point x. When points x approach the turning points $c_1$ or $c_2$ from the opposite directions and $\lim_{x \to c_i^\pm} \theta^x(t) = \theta^{c_i^\pm}(t)$, i = 1, 2, respectively, according to the kneading theory, the kneading-increment $\nu^i(t) = \theta^{C_i^+}(t) - \theta^{C_i^-}(t)$ leads to

$$\nu^i(t) = \sum_{j=1}^{3} N_{ij}(t)\, I_j, \qquad i = 1, 2,$$

where $N(t) = [N_{ij}(t)]$ is a 2 × 3 kneading matrix and

$$N_{ij}(t) = \sum_{k=0}^{\infty} n^k_{ij} \cdot t^k, \qquad i = 1, 2, \quad j = 1, 2, 3. \tag{2.4}$$

Here the 2 × 3 kneading coefficient matrices $n^k_{ij}$ for the kth address in the itinerary $A(f^*(x))$, according to the geometric implication of the local degree of an address [1], can be explicitly written as

$$n^k_{ij} = 2\, \epsilon^i_{k-1}\, \chi^i(A_k \cap I_j), \qquad i = 1, 2, \quad j = 1, 2, 3, \quad k = 1, 2, \ldots,$$

where a symbolic characteristic function $\chi(A_k \cap I_j)$ is introduced as

$$\chi(A_k \cap I_j) = \begin{cases} 1, & \text{if } A_k = I_j, \\ 0, & \text{if } A_k \neq I_j, \end{cases} \tag{2.5}$$


and

$$n^0_{ij} = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}.$$

The superscript i (i = 1, 2) in the quantities $\epsilon^i_{k-1}$ and $\chi^i(A_k \cap I_j)$ corresponds to the sequences $A_1 A_2 A_3 \ldots$ starting from the turning points C and D respectively. The kneading determinant (i.e., kneading characteristic polynomial) D(t) is defined from the invariant ratio

$$D(t) = (-1)^{j+1} D_j / (1 - \varepsilon(I_j)\, t), \tag{2.6}$$

where $D_j = \det \hat N(t)$ and the 2 × 2 matrix $\hat N(t)$ is obtained from $[N_{ij}(t)]$ by deleting its jth column. It is important to assume in the kneading theory that the turning points $c_i$ of the itinerary would be kneaded into such sufficiently small left or right intervals as $c_i^+ \in \{c_i, c_i + \delta\}$, or $c_i^- \in \{c_i - \delta, c_i\}$ by some disturbance δ. In such a disturbance the parity of each turning point encountered in the kneading sequences W would change from $\varepsilon(c_i)$ to $\varepsilon(c_i^{+\tau(G)})$, where $c_i^{+\tau(G)} = I_{i+1}$ if τ(G) = +1; $c_i^{+\tau(G)} = I_i$ if τ(G) = −1, where G is the whole string from the first symbol $A_0^i$ to the symbol before the turning point $c_i$. Here the disturbance δ is global for the sequence W because the whole of the sequence G is contained in the inverse operation τ(G). The parity inverse operator τ(W) for the arbitrary sequence W is defined as

$$\tau(W) = \begin{cases} +1, & \text{if } J(W) = \text{even}, \\ -1, & \text{if } J(W) = \text{odd}, \end{cases}$$

where the M-parity number of a sequence W, J(W), is the number of appearances of the letter M in the sequence W. For the sake of convenience, the symbols of the three intervals are hereinafter specified as $I_1 = L$, $I_2 = M$ and $I_3 = R$.

Lemma 2.1. For any doubly superstable sequence Z = XDYC, |X| = l, |Y| = m, we have

$$\epsilon_{l+1} = +1, \qquad \epsilon_{p = l+m+2} = -1,$$

in other words,

$$J(XD^{+\tau(G)}) = \text{odd}, \qquad J(XD^{+\tau(G)}\, YC^{+\tau(G')}) = \text{even}.$$

Proof. Evidently, the sequence Z = XDYC starting from the turning point C runs over the turning point D and ends at C. For the turning point D, if J(X) = odd and $\epsilon_l = +1$, then $\varepsilon(D^{-\tau(X)}) = \varepsilon(D^+) = \varepsilon(R) = +1$, thus $\epsilon_{l+1} = \epsilon_l\, \varepsilon(D^{-\tau(X)}) = +1$; if J(X) = even and $\epsilon_l = -1$, then $\varepsilon(D^{-\tau(X)}) = \varepsilon(D^-) = \varepsilon(M) = -1$, thus $\epsilon_{l+1} = \epsilon_l\, \varepsilon(D^{-\tau(X)}) = +1$. In brief, we have $J(XD^{-\tau(X)}) = J(XD^{+\tau(G)}) = \text{odd}$. A similar argument leads to the result for the turning point C.
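The parity bookkeeping in the proof can be checked mechanically. Below is a minimal sketch, assuming words are plain strings over L, M, R; the names `J`, `tau`, `knead` and `words` are illustrative helpers, not notation from the paper. It encodes the rule that a kneaded turning point $c_i^{+}$ becomes $I_{i+1}$ and $c_i^{-}$ becomes $I_i$, with C = $c_1$ and D = $c_2$.

```python
from itertools import product


def J(w):
    """M-parity number: how many times the letter M appears in w."""
    return w.count("M")


def tau(w):
    """Parity inverse operator: +1 if J(w) is even, -1 if J(w) is odd."""
    return 1 if J(w) % 2 == 0 else -1


# Kneading a turning point to one side: C = c1 lies between L and M,
# D = c2 lies between M and R, so c_i^+ = I_{i+1} and c_i^- = I_i.
KNEAD = {("C", 1): "M", ("C", -1): "L", ("D", 1): "R", ("D", -1): "M"}


def knead(point, sign):
    """Letter obtained by pushing the turning point to the given side."""
    return KNEAD[(point, sign)]


def words(max_len):
    """All words over {L, M, R} of length 0..max_len (illustrative helper)."""
    for n in range(max_len + 1):
        for w in product("LMR", repeat=n):
            yield "".join(w)
```

With these helpers, `J(X + knead("D", -tau(X)))` comes out odd for every word X tried, which is the statement $J(XD^{-\tau(X)}) = \text{odd}$ used in the proof; the check does not test admissibility of X.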


3. Dual Star Products: Algebraic Rule of Equal Topological Entropy Classes

3.1. Presentation of the dual star products. For the symbolic systems of two letters, although the non-commutative DGP star product PC ∗ Q [2] consists of two operations, multiplication and parity inverse on the turning point (we call the two operations a disturbance), its essence lies in the fact that the turning points (|Q| C's) of |Q| sequences PC are kneaded into appropriate intervals with the regular disturbances that are governed by the parity of the left sequence PC and each letter of the right sequence Q. It is this regular disturbance of the turning points that endows the DGP star product with many invariant properties. For the symbolic systems of three letters, the generalization of the DGP star product should be to seek the regular disturbances of the two turning points C and D, and then the case becomes more complex. In the following, first we give a definition of the star products for the symbolic systems of three letters, and then discuss their algebraic properties.

Definition. Let $Z^r = X^r D Y^r C$, r = 1, 2, be two doubly superstable kneading sequences, where $X^r = x^r_1 \ldots x^r_{l_r}$, $Y^r = y^r_1 \ldots y^r_{m_r}$, $x^r_\xi, y^r_\eta \in \{L, M, R\}$, $\xi = 1, \ldots, l_r$, $\eta = 1, \ldots, m_r$. For the symbolic systems of three letters, there are two kinds of star products, i.e., $Z^1 * Z^2$, $* \in \{\bar{*}, \underline{*}\}$, $\forall\, Z^1, Z^2 \in K_0$. The up-star product $\bar{*}$ is defined as

$$Z^1 \bar{*} Z^2 = (X^1DY^1C) \bar{*} (X^2DY^2C) = (X^1DY^1C)\bar{*}x^2_1 \ldots (X^1DY^1C)\bar{*}x^2_{l_2}\, (X^1DY^1C)\bar{*}D\, (X^1DY^1C)\bar{*}y^2_1 \ldots (X^1DY^1C)\bar{*}y^2_{m_2}\, (X^1DY^1C)\bar{*}C,$$

where the up-star product $\bar{*}$ consists of two operations of up-multiplication and parity inverse,

$$\begin{aligned}
(X^1DY^1C)\bar{*}L &= X^1 D^{-\tau(X^1)}\, Y^1 C^{+\tau(Y^1)}, \\
(X^1DY^1C)\bar{*}C &= X^1 D^{-\tau(X^1)}\, Y^1 C, \\
(X^1DY^1C)\bar{*}M &= X^1 D^{-\tau(X^1)}\, Y^1 C^{-\tau(Y^1)}, \\
(X^1DY^1C)\bar{*}D &= X^1 D\, Y^1 C^{-\tau(Y^1)}, \\
(X^1DY^1C)\bar{*}R &= X^1 D^{+\tau(X^1)}\, Y^1 C^{-\tau(Y^1)},
\end{aligned} \tag{3.1}$$

and the down-star product $\underline{*}$ as

$$\tilde Z^1 \underline{*} \tilde Z^2 = (Y^1CX^1D) \underline{*} (Y^2CX^2D) = (Y^1CX^1D)\underline{*}y^2_1 \ldots (Y^1CX^1D)\underline{*}y^2_{m_2}\, (Y^1CX^1D)\underline{*}C\, (Y^1CX^1D)\underline{*}x^2_1 \ldots (Y^1CX^1D)\underline{*}x^2_{l_2}\, (Y^1CX^1D)\underline{*}D,$$

where the down-star product $\underline{*}$ consists also of two operations of down-multiplication and parity inverse,

$$\begin{aligned}
(Y^1CX^1D)\underline{*}L &= Y^1 C^{-\tau(Y^1)}\, X^1 D^{+\tau(X^1)}, \\
(Y^1CX^1D)\underline{*}C &= Y^1 C\, X^1 D^{+\tau(X^1)}, \\
(Y^1CX^1D)\underline{*}M &= Y^1 C^{+\tau(Y^1)}\, X^1 D^{+\tau(X^1)}, \\
(Y^1CX^1D)\underline{*}D &= Y^1 C^{+\tau(Y^1)}\, X^1 D, \\
(Y^1CX^1D)\underline{*}R &= Y^1 C^{+\tau(Y^1)}\, X^1 D^{-\tau(X^1)}.
\end{aligned} \tag{3.2}$$
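The letter rules (3.1) translate directly into a small string routine. The sketch below implements only the up-star product, under stated assumptions: words are strings over L, M, R with a single D and a final C; `split`, `up_star_letter` and `up_star` are hypothetical helper names; and no admissibility test of the inputs against condition (2.3) is attempted.

```python
# Kneaded turning points: c_i^+ = I_{i+1}, c_i^- = I_i, with C = c1, D = c2.
KNEAD = {("C", 1): "M", ("C", -1): "L", ("D", 1): "R", ("D", -1): "M"}

# For each right-hand letter a: (sign applied to tau(X) at D, sign applied to
# tau(Y) at C) as read off from (3.1); 0 means the turning point is kept.
UP_RULE = {"L": (-1, 1), "C": (-1, 0), "M": (-1, -1), "D": (0, -1), "R": (1, -1)}


def tau(w):
    """Parity inverse operator: +1 for an even number of M's, else -1."""
    return 1 if w.count("M") % 2 == 0 else -1


def split(Z):
    """Split a word of the form X D Y C into the pair (X, Y)."""
    if not Z.endswith("C") or Z.count("D") != 1 or "C" in Z[:-1]:
        raise ValueError("expected a word of the form X D Y C")
    i = Z.index("D")
    return Z[:i], Z[i + 1:-1]


def up_star_letter(X, Y, a):
    """(X D Y C) up-star a, for a single letter a in {L, C, M, D, R}."""
    sd, sc = UP_RULE[a]
    d = "D" if sd == 0 else KNEAD[("D", sd * tau(X))]
    c = "C" if sc == 0 else KNEAD[("C", sc * tau(Y))]
    return X + d + Y + c


def up_star(Z1, Z2):
    """Up-star product of Z1 = X1 D Y1 C with Z2 = X2 D Y2 C, letter by letter."""
    X, Y = split(Z1)
    split(Z2)  # validates the shape of the right factor
    return "".join(up_star_letter(X, Y, a) for a in Z2)
```

As a usage example, `up_star("DC", "DC")` returns `"DLMC"`, a word of length |Z¹|·|Z²| = 4 that again has the shape XDYC; the down-star product (3.2) could be sketched in the same way with the roles of the two turning points exchanged.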


We will prove later that the product sequences $Z^1 \bar{*} Z^2$ and $\tilde Z^1 \underline{*} \tilde Z^2$ are admissible doubly superstable kneading sequences, which are also called compound sequences. The inverse operations $\tau(X^1)$, $\tau(Y^1)$ are acting on the two turning points C, D of the sequence $Y^1CX^1D$, and thus the disturbances of the two turning points C, D, which are governed by the parity of $Y^1$ or $X^1$ and each letter of the right sequence $Z^2$, are local in view of the entire product sequence. We will see immediately that although the rule of disturbances of the two turning points is comparatively simple, the combination of these local disturbances and the previous global disturbances leads to the normal factorization of the kneading polynomial of the compound sequences $Z^1 \bar{*} Z^2$ and $\tilde Z^1 \underline{*} \tilde Z^2$, and endows the up- and down-star products with many important properties similar to those of the DGP star product in the symbolic systems of two letters. Thus we present the generalization of the DGP star product to symbolic systems of three letters. These two star products possess the following algebraic properties, $\forall\, Z^1, Z^2 \in K_0$ and $Z^1 \neq Z^2$:

(1) Non-commutativity

$$Z^1 \bar{*} Z^2 \neq Z^2 \bar{*} Z^1, \qquad \tilde Z^1 \underline{*} \tilde Z^2 \neq \tilde Z^2 \underline{*} \tilde Z^1. \tag{3.3}$$

(2) Associativity

$$Z^1 \bar{*} (Z^2 \bar{*} Z^3) = (Z^1 \bar{*} Z^2) \bar{*} Z^3, \qquad \tilde Z^1 \underline{*} (\tilde Z^2 \underline{*} \tilde Z^3) = (\tilde Z^1 \underline{*} \tilde Z^2) \underline{*} \tilde Z^3. \tag{3.4}$$

(3) Order preserving

$$Z^1 \prec Z^2 \Rightarrow Z \bar{*} Z^1 \prec Z \bar{*} Z^2, \qquad \tilde Z^1 \prec \tilde Z^2 \Rightarrow \tilde Z \underline{*} \tilde Z^1 \prec \tilde Z \underline{*} \tilde Z^2. \tag{3.5}$$

(4) Kneading admissibility preserving

$$Z^1, Z^2 \in K_0 \Longrightarrow Z^1 \bar{*} Z^2,\ \tilde Z^1 \underline{*} \tilde Z^2 \in K_0. \tag{3.6}$$

(5) Duality

$$(Z^1 \bar{*} Z^2)^T = (Z^1)^T \underline{*} (Z^2)^T, \qquad (\tilde Z^1 \underline{*} \tilde Z^2)^T = (\tilde Z^1)^T \bar{*} (\tilde Z^2)^T, \tag{3.7}$$

where the parity preserving transformation T is defined as R ↔ L, D ↔ C, M ↔ M. Therefore these star products possess the dual symmetry under the parity preserving transformation T, and are called dual star products. (1) and (5) are obvious facts. (2) can be directly verified. (3) is proved in Lemma 3.2. (4) is proved in Theorem 3.1. Due to the symmetry of up-star and down-star products, we will only give proofs of the algebraic properties of the up-star product.


3.2. Some preliminary lemmas and theorem. Before going on to prove the main theorems, we discuss some properties of the compound sequences $Z^1 * Z^2$.

Lemma 3.1. Let the sequence $Z^1 = X^1DY^1C \in K_0$; for any letter $a \in \{L, M, R\}$,

$$J\big(X^1 (D\bar{*}a)^{\tau(X^1)}\big) = \text{even} + J(D\bar{*}a), \tag{3.8a}$$

$$J\big(Y^1 (C\bar{*}a)^{\tau(Y^1)}\big) = \text{even} + J(C\bar{*}a), \tag{3.8b}$$

and thus

$$J\big((X^1DY^1C)\bar{*}a\big) = \text{even} + J(a). \tag{3.8c}$$

Proof. We can directly prove the formula

$$J\big(X^1 (D\bar{*}a)^{\tau(X^1)}\big) = J(X^1) + J\big((D\bar{*}a)^{\tau(X^1)}\big) = \begin{cases} \text{even} + J\big((D\bar{*}a)^{+}\big), & \text{if } J(X^1) = \text{even}, \\ \text{odd} + J\big((D\bar{*}a)^{-}\big) = \text{odd} + 1 + J(D\bar{*}a), & \text{if } J(X^1) = \text{odd}, \end{cases}$$

which equals $\text{even} + J(D\bar{*}a)$. The equality (3.8b) can be proved similarly. Considering

$$(X^1DY^1C)\bar{*}a = X^1 (D\bar{*}a)^{\tau(X^1)}\, Y^1 (C\bar{*}a)^{\tau(Y^1)}$$

leads directly to

$$J\big((X^1DY^1C)\bar{*}a\big) = J\big(X^1 (D\bar{*}a)^{\tau(X^1)}\big) + J\big(Y^1 (C\bar{*}a)^{\tau(Y^1)}\big) = \text{even} + J(D\bar{*}a) + J(C\bar{*}a).$$

According to the definition (3.1) of the up-star product, it yields $J(D\bar{*}a) + J(C\bar{*}a) = \text{even} + J(a)$. Consequently, the formula (3.8c) holds.

By virtue of this lemma, for the compound sequence $XDYC = (X^1DY^1C)\bar{*}(X^2DY^2C)$, we can immediately obtain

$$J(X) = \text{even} + J(X^1) + J(X^2), \qquad J(Y) = \text{odd} + J(Y^1) + J(Y^2), \tag{3.9}$$

therefore the dual star products change the total parity.

Lemma 3.2 (Order preserving for the dual star product). For any doubly superstable sequence Z = XDYC, and the order relation $z_{\rho_1} \prec z_{\rho_2}$, $z_{\rho_1}, z_{\rho_2} \in \{L, C, M, D, R\}$, we have

$$Z \bar{*} z_{\rho_1} \prec Z \bar{*} z_{\rho_2}, \qquad \tilde Z \underline{*} z_{\rho_1} \prec \tilde Z \underline{*} z_{\rho_2}. \tag{3.10}$$

Further, for any two sequences $A^1$ and $A^2$, if there exists the order relation $A^1 \prec A^2$, then

$$Z \bar{*} A^1 \prec Z \bar{*} A^2, \qquad \tilde Z \underline{*} A^1 \prec \tilde Z \underline{*} A^2. \tag{3.11}$$

That is, the star product preserves the order relation of sequence pairs whether these sequence pairs are admissible or not.
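The letter-level statement (3.10) can be explored with a short script. This is a sketch under assumptions, not the authors' procedure: the comparison `mss_less` implements the parity-lexicographic (MSS) order by reversing the natural order L ≺ C ≺ M ≺ D ≺ R whenever the common prefix contains an odd number of M's, and `up_star_letter` re-encodes rule (3.1); both names are illustrative, and the check runs over arbitrary short words X, Y without testing admissibility.

```python
from itertools import product

RANK = {"L": 0, "C": 1, "M": 2, "D": 3, "R": 4}
KNEAD = {("C", 1): "M", ("C", -1): "L", ("D", 1): "R", ("D", -1): "M"}
UP_RULE = {"L": (-1, 1), "C": (-1, 0), "M": (-1, -1), "D": (0, -1), "R": (1, -1)}


def tau(w):
    """+1 if w contains an even number of M's, -1 otherwise."""
    return 1 if w.count("M") % 2 == 0 else -1


def mss_less(a, b):
    """True if a precedes b: natural order at the first differing letter,
    reversed when the common prefix has an odd number of M's."""
    sign = 1
    for x, y in zip(a, b):
        if x != y:
            return sign * RANK[x] < sign * RANK[y]
        if x == "M":
            sign = -sign
    raise ValueError("sequences agree on their common prefix")


def up_star_letter(X, Y, a):
    """(X D Y C) up-star a, following rule (3.1)."""
    sd, sc = UP_RULE[a]
    d = "D" if sd == 0 else KNEAD[("D", sd * tau(X))]
    c = "C" if sc == 0 else KNEAD[("C", sc * tau(Y))]
    return X + d + Y + c


def images(X, Y):
    """Images of L, C, M, D, R under Z up-star, for Z = X D Y C."""
    return [up_star_letter(X, Y, a) for a in "LCMDR"]


def words(max_len):
    """All words over {L, M, R} of length 0..max_len (illustrative helper)."""
    for n in range(max_len + 1):
        for w in product("LMR", repeat=n):
            yield "".join(w)
```

For example, `images("", "")` gives `["MM", "MC", "ML", "DL", "RL"]`, which is increasing in the order implemented by `mss_less`; the same comparison also reproduces the ordered list of one- and two-letter words given in Sect. 2.2.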


Proof. With the definition of parity inverse operators $\tau(X)$, $\tau(Y)$, we have immediately

\[ YC^{-\tau(Y)} \prec YC \prec YC^{+\tau(Y)}, \qquad XD^{-\tau(X)} \prec XD \prec XD^{+\tau(X)}, \]

and further

\[ J(YC^{-\tau(Y)}) = \text{even} + J(XD^{+\tau(X)}) = \text{even}, \qquad J(YC^{+\tau(Y)}) = \text{even} + J(XD^{-\tau(X)}) = \text{odd}. \]

It directly yields (3.10) from the definition of dual star products.

In some sense, the up-star and down-star product operators $Z*$ are equivalent to the star transformations or star maps

\[ z_\rho \to \tilde z_\rho = Z * z_\rho \ \ (\text{up-star}), \qquad z_\rho \to \hat z_\rho = Z * z_\rho \ \ (\text{down-star}), \tag{3.12} \]

which preserve the MSS order and M-parity, i.e.

\[ z_{\rho_1} \prec z_{\rho_2} \ \Rightarrow\ \tilde z_{\rho_1} \prec \tilde z_{\rho_2}, \quad \hat z_{\rho_1} \prec \hat z_{\rho_2}, \tag{3.10a} \]

and, from Lemma 3.1,

\[ J(\tilde z_\rho) = \text{even} + J(z_\rho) = J(\hat z_\rho), \qquad \text{for } z_\rho \in \{L, M, R\}. \tag{3.13} \]

Combining (3.12), (3.10a) and (3.13) leads to (3.11). That is, the star transformation preserves the order of letters, thus it can also preserve the order of sequences.

Theorem 3.1 (Kneading admissibility preserving for the dual star product). For any two doubly superstable kneading sequences $Z^r = X^r D Y^r C$, $r = 1, 2$, i.e., if $Z^1, Z^2 \in K_0$, then the corresponding compound sequence $Z \equiv XDYC = Z^1 * Z^2$ is also an admissible doubly superstable kneading sequence, $Z \in K_0$.

The proof of this theorem is given in [13].

4. Milnor–Thurston Equal Topological Entropy Classes

Due to the nonlinearity of the maps $f_{\lambda,\mu}(x)$, the corresponding kneading plane contains very complex structures, such as self-similar and self-embedded ones. The upper bound property of topological entropy is an important frame in analyzing these structures, and the plateaus of topological entropy themselves are also basic constructions in the kneading plane, which directly generalize the Milnor–Thurston concept from a one-dimensional line to a two-dimensional parameter plane. In this section we demonstrate some preliminary properties of invariant coordinates and symbolic characteristic functions of the compound doubly superstable sequences, and state two theorems concerning the star products on the doubly superstable sequences, on which the plateaus of topological entropy will be built.

Theorem 4.1. For any $Z^1, Z^2 \in K_0$, the kneading determinant of the compound sequence $Z^1 * Z^2$ can be normally factorized as

\[ D_{Z^1 * Z^2}(t) = D_{Z^1}(t) \cdot D_{Z^2}\big(t^{|Z^1|}\big). \tag{4.1} \]
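The shape of the factorization (4.1) can be illustrated at the level of polynomial coefficients. The following is only a minimal sketch under stated assumptions: `d1` and `d2` are hypothetical coefficient lists of lengths $p_1$ and $p_2$ (they are not kneading data taken from the paper), and the helper names are introduced here for illustration. For any such lists, the product $P_1(t)\,P_2(t^{p_1})$ has coefficient `d2[k2-1] * d1[k1-1]` at index $(k_2-1)p_1 + k_1 - 1$.

```python
def compose_coefficients(d1, d2):
    """Coefficients of P1(t) * P2(t**p1), where p1 = len(d1).

    d1, d2 are hypothetical coefficient lists (lowest degree first).
    Because deg P1 < p1, the blocks of length p1 never overlap, so the
    coefficient at index (k2-1)*p1 + k1 - 1 is simply d2[k2-1] * d1[k1-1].
    """
    p1, p2 = len(d1), len(d2)
    out = [0] * (p1 * p2)
    for k2 in range(1, p2 + 1):
        for k1 in range(1, p1 + 1):
            out[(k2 - 1) * p1 + k1 - 1] = d2[k2 - 1] * d1[k1 - 1]
    return out


def poly_eval(coeffs, t):
    """Evaluate sum_k coeffs[k] * t**k."""
    return sum(c * t**k for k, c in enumerate(coeffs))


# Hypothetical example data, chosen only to exercise the index pattern.
d1 = [1, -1, 1]   # P1(t) = 1 - t + t^2, so p1 = 3
d2 = [1, -2]      # P2(s) = 1 - 2s,      so p2 = 2
d = compose_coefficients(d1, d2)

# Numerical check of P(t) = P1(t) * P2(t**p1) at an arbitrary point.
t = 0.37
assert abs(poly_eval(d, t) - poly_eval(d1, t) * poly_eval(d2, t ** len(d1))) < 1e-12
```

The check is pure polynomial algebra; it says nothing about which sequences admit such a factorization, which is the content of the theorem.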


We say that the factorization is normal if the power of the argument in the factorized kneading determinant of the second sequence $Z^2$ is the period of the first sequence $Z^1$. Although the kneading determinants of sequences have different forms of factorization, the formation of normal factorization is one of the important criteria in distinguishing the topological entropy classes by the following theorem. The topological entropy of the compound sequence $Z^1 * Z^2$ has the following important relationship with the quantities of admissible doubly superstable sequences $Z^1$ and $Z^2$, namely

Theorem 4.2. For any $Z^1, Z^2 \in K_0$, $n \in Z^+$,

\[ h(Z^1 * Z^2) = \begin{cases} h(Z^1), & \text{if } Z^1 \ne (DC)^{*n}, \qquad (4.2a) \\ h(Z^2)/2^n, & \text{if } Z^1 = (DC)^{*n}. \qquad (4.2b) \end{cases} \]

It shows that the star map $Z^1 *$ preserves the topological entropy $h$ if $Z^1 \ne (DC)^{*n}$, or compresses the topological entropy to $\frac{1}{2^n}$ of $h(Z^2)$ if $Z^1 = (DC)^{*n}$. Here $*$ stands for either the up-star or the down-star product.

4.1. The kneading characteristic polynomial. To prove normal factorization, we now need to seek an explicit expression of the coefficients $d_k$ of the characteristic polynomial. For any admissible doubly superstable sequence $Z = XDYC$ with the period $p = l + m + 2$, where $l = |X|$, $m = |Y|$, it is easy to observe from Lemma 2.1 that there exist the following relations for invariant coordinates $\theta^i_k$ and characteristic functions $\chi^i$:

\[ \theta^2_{\eta-1} = \theta^1_{l+1+\eta-1}, \qquad \chi^2(z_\eta \cap I_j) = \chi^1(z_{l+1+\eta} \cap I_j), \qquad \eta = 1, \ldots, m+1, \]

\[ \theta^1_{\xi-1} = \theta^2_{m+1+\xi-1}, \qquad \chi^1(z_\xi \cap I_j) = \chi^2(z_{m+1+\xi} \cap I_j), \qquad \xi = 1, \ldots, l+1, \]

and for the elements of the kneading matrix

\[ n^{l+1+\eta}_{1j} = n^{\eta}_{2j}, \quad \eta = 1, 2, \ldots, m+1, \qquad n^{m+1+\xi}_{2j} = n^{\xi}_{1j}, \quad \xi = 1, 2, \ldots, l+1. \]

Considering the periodicity of the kneading matrices for the sequence of finite length, $n^{k+p}_{ij} = n^{k}_{ij}$, the kneading polynomials (2.4) for doubly superstable sequences can be written as

\[ N_{ij}(t) = n^0_{ij} + \frac{1}{1 - t^p} \sum_{k=1}^{p} n^k_{ij}\, t^k, \qquad i = 1, 2, \quad j = 1, 2, 3, \]

and (2.6) reads:

\[ \tilde D(t) = D(t) \cdot (1 - t^p) = \sum_{k=1}^{p} d_{k-1}\, t^{k-1}; \tag{4.3a} \]


we give the formula of the coefficients of the characteristic polynomial after many operations,

\[ d_{k-1} = -\theta_{k-1} + 2 \sum_{\substack{\xi + \eta = k \\ 1 \le \xi \le l+1,\ 1 \le \eta \le m+1}} \theta_{\xi-1}\, \theta_{l+1+\eta-1} \big[ \chi(z_\xi \cap R) - \chi(z_{l+1+\eta} \cap R) \big]. \tag{4.3b} \]

The explicit form of the formulas can be used to analyze the topological entropies of all sequence pairs, including the fine- and coarse-grained chaos [5,9,10]. Hereafter the superscript $i$ in the quantities $\theta^i_k$ and $\chi^i(z_k \cap R)$ is omitted, and it refers to the case $i = 1$, i.e., all quantities in the following correspond to the sequences starting from the turning point $C$.

4.2. The algebraic properties of compound doubly superstable sequences. The core of star product is the algebraic features of turning points in the compound sequence and many invariant properties. To prove Theorem 4.1, these algebraic properties need to be explained in this subsection. We first consider the invariant coordinates, symbolic characteristic function and their algebraic features on the turning points. For two types of turning points $C$, $D$ of the compound sequence $Z = (X^1 D Y^1 C) * (X^2 D Y^2 C)$, according to Lemma 3.1 and the definition of local coordinate $\varepsilon$, we immediately have

\[ \varepsilon\big((D * z_i^2)^{\tau(X^1)}\big) = \varepsilon_0\, \varepsilon(D * z_i^2)\, \theta^{Z^1}_{l_1}, \tag{4.4a} \]

\[ \varepsilon\big((C * z_i^2)^{\tau(Y^1)}\big) = \varepsilon(C * z_i^2)\, \theta^{Z^1}_{p_1 - 1}, \tag{4.4b} \]

\[ \varepsilon(x_1^1) \cdots \varepsilon(x_{l_1}^1)\, \varepsilon\big((D * z_i^2)^{\tau(X^1)}\big)\, \varepsilon(y_1^1) \cdots \varepsilon(y_{m_1}^1)\, \varepsilon\big((C * z_i^2)^{\tau(Y^1)}\big) = \varepsilon(z_i^2), \tag{4.4c} \]

for $i = 1, \ldots, l_2,\ l_2 + 2, \ldots, p_2 - 1$. For the turning points $D$ and $C$ of the compound sequence $Z^1 * Z^2$, their local coordinates are given as

\[ \varepsilon(D^{-\tau(G)}) = \varepsilon_0\, \theta^{Z^1}_{l_1}\, \theta^{Z^2}_{l_2}, \qquad \varepsilon(C^{-\tau(G)}) = \varepsilon_0\, \theta^{Z^1}_{p_1 - 1}\, \theta^{Z^2}_{p_2 - 1}, \tag{4.5} \]

which can be shown by a simple argument. Such quantities of the compound sequence $Z^1 * Z^2$ as invariant coordinates and symbolic characteristic functions are closely related with those of the sequences $Z^1$ and $Z^2$. The following lemma presents the reduction of the characteristic function $\chi(z_k \cap R)$ of the compound sequence $Z = Z^1 * Z^2$.

Lemma 4.1.

\[ \chi(z_{(i-1)p_1 + j} \cap R) = \begin{cases} \chi(z_j^1 \cap R), & \text{if } j \ne l_1 + 1, \qquad (4.6a) \\ \chi(z^1_{l_1+1} \cap R) - \theta^{Z^1}_{l_1}\, \chi(z_i^2 \cap R), & \text{if } j = l_1 + 1, \qquad (4.6b) \end{cases} \]

where $j = 1, \ldots, p_1$, $i = 1, \ldots, p_2$, $k = (i-1)p_1 + j = 1, \ldots, p_1 p_2$.
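The index bookkeeping $k = (i-1)p_1 + j$ and the two-branch reduction of Lemma 4.1 can be sketched in a few lines. This is a minimal sketch, assuming the characteristic values of $Z^1$ and $Z^2$ are supplied as hypothetical 0/1 lists `chi1` and `chi2` and the sign $\theta^{Z^1}_{l_1}$ as `theta_l1`; the function names and the sample data are illustrative and not taken from the paper.

```python
def split_index(k, p1):
    """Map a 1-based position k in the compound sequence to (i, j),
    with k = (i - 1) * p1 + j, 1 <= j <= p1."""
    i, j = divmod(k - 1, p1)
    return i + 1, j + 1


def join_index(i, j, p1):
    """Inverse of split_index."""
    return (i - 1) * p1 + j


def chi_compound(k, chi1, chi2, l1, theta_l1):
    """Reduction (4.6a)/(4.6b): characteristic value at position k.

    chi1[j-1] stands for chi(z^1_j ∩ R), chi2[i-1] for chi(z^2_i ∩ R);
    both are hypothetical inputs. theta_l1 is the sign (+1 or -1)
    playing the role of the invariant coordinate of Z^1 at index l1.
    """
    p1 = len(chi1)
    i, j = split_index(k, p1)
    if j != l1 + 1:                                 # branch (4.6a)
        return chi1[j - 1]
    return chi1[l1] - theta_l1 * chi2[i - 1]        # branch (4.6b)


# Hypothetical data: p1 = 4, l1 = 1, p2 = 2.
chi1 = [1, 0, 0, 0]
chi2 = [1, 0]
values = [chi_compound(k, chi1, chi2, l1=1, theta_l1=-1) for k in range(1, 9)]
```

Only the positions with $j = l_1 + 1$ see the letters of $Z^2$; every other position copies the value from $Z^1$.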


Proof. For the case $j \ne l_1 + 1$, $z_{(i-1)p_1 + j} = z_j^1$ follows directly from the definition of star products, hence formula (4.6a) holds. For the case $j = l_1 + 1$, if $i \ne l_2 + 1$, then we have $z_{(i-1)p_1 + l_1 + 1} = (D * z_i^2)^{\tau(X^1)}$. When $\theta^{Z^1}_{l_1} = -1$ (i.e. $J(X^1) = \text{even}$), $z^1_{l_1+1} = M$, then

\[ z_{(i-1)p_1 + l_1 + 1} = \begin{cases} R, & \text{if } z_i^2 = R, \\ M, & \text{if } z_i^2 \ne R; \end{cases} \]

when $\theta^{Z^1}_{l_1} = +1$ (i.e. $J(X^1) = \text{odd}$), $z^1_{l_1+1} = R$, then

\[ z_{(i-1)p_1 + l_1 + 1} = \begin{cases} M, & \text{if } z_i^2 = R, \\ R, & \text{if } z_i^2 \ne R. \end{cases} \]

If $i = l_2 + 1$, according to the global disturbance and (3.9), we have $z_{(i-1)p_1 + l_1 + 1} = D^{\tau(C^+ X)} = D^{-\tau(X^1 + X^2)}$. When $\theta^{Z^1}_{l_1} = -1$ (i.e. $J(X^1) = \text{even}$), it yields $z_{(i-1)p_1 + l_1 + 1} = D^{-\tau(X^2)} = D^{\tau(C^+ X^2)} = z^2_{l_2+1}$; when $\theta^{Z^1}_{l_1} = +1$ (i.e. $J(X^1) = \text{odd}$), $z_{(i-1)p_1 + l_1 + 1} = \bar z^2_{l_2+1}$, i.e. $z^2_{l_2+1}$ with $R$ and $M$ interchanged. Summing up all these relations yields

\[ \chi(z_{(i-1)p_1 + j} \cap R) = \chi(z^1_{l_1+1} \cap R) - \theta^{Z^1}_{l_1}\, \chi(z_i^2 \cap R). \]

This completes the proof of Lemma 4.1.

By the way, the formula (4.6b) clearly shows the fact that the local disturbances of the turning point $D$ result from the actions of the parity of $X^1$ (i.e., $\theta^{Z^1}_{l_1}$) and the letter $z_i^2$ of the sequence $Z^2$. Following this lemma, there is an important corollary concerning invariant coordinates and symbolic characteristic functions at the turning points $D$ of the compound sequence.

Corollary.

\[ \theta^{Z^1}_{l_1} \big[ \chi(z_{(k_2'-1)p_1 + l_1 + 1} \cap R) - \chi(z_{(l_2 + k_2 - k_2')p_1 + l_1 + 1} \cap R) \big] = -\big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2 + 1 + k_2 - k_2'} \cap R) \big], \tag{4.7} \]

where $k_2' = 1, 2, \ldots, k_2$.

Similarly, we have the reduction of the invariant coordinate $\theta^Z_{k-1}$ for the compound sequence $Z = Z^1 * Z^2$ as follows.

Lemma 4.2.

\[ \theta^{Z}_{(i-1)p_1 + j - 1} = \begin{cases} \varepsilon_0\, \theta^{Z^2}_{i-1}\, \theta^{Z^1}_{j-1}, & \text{if } j = 1, \ldots, l_1 + 1, \qquad (4.8a) \\ \varepsilon(D * z_i^2)\, \theta^{Z^2}_{i-1}\, \theta^{Z^1}_{j-1}, & \text{if } j = l_1 + 2, \ldots, p_1, \qquad (4.8b) \end{cases} \]

here $i = 1, \ldots, p_2$.

Proof. For any admissible doubly superstable sequence $Z = XDYC$, from Lemma 2.1, there exist the following identities:

\[ \theta^Z_0 = \varepsilon_0 = -1, \qquad \theta^Z_{l+1} = +1, \qquad \theta^Z_{l+m+2} = -1. \]


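In order to compute the invariant coordinates $\theta^Z_{(i-1)p_1 + j - 1}$ of the compound sequence $Z = Z^1 * Z^2$, the sequence can also be explicitly written as

\[ \begin{aligned} C^+\ & x_1^1 \cdots x_{l_1}^1 (D * z_1^2)^{\tau(X^1)}\, y_1^1 \cdots y_{m_1}^1 (C * z_1^2)^{\tau(Y^1)} \ \cdots\ x_1^1 \cdots x_{l_1}^1 (D * z_{l_2}^2)^{\tau(X^1)}\, y_1^1 \cdots y_{m_1}^1 (C * z_{l_2}^2)^{\tau(Y^1)} \\ & x_1^1 \cdots x_{l_1}^1\, D^{\tau(G)}\, y_1^1 \cdots y_{m_1}^1 (C * D)^{\tau(Y^1)} \\ & x_1^1 \cdots x_{l_1}^1 (D * z_{l_2+2}^2)^{\tau(X^1)}\, y_1^1 \cdots y_{m_1}^1 (C * z_{l_2+2}^2)^{\tau(Y^1)} \ \cdots\ x_1^1 \cdots x_{l_1}^1 (D * z_{p_2-1}^2)^{\tau(X^1)}\, y_1^1 \cdots y_{m_1}^1 (C * z_{p_2-1}^2)^{\tau(Y^1)} \\ & x_1^1 \cdots x_{l_1}^1 (D * C)^{\tau(X^1)}\, y_1^1 \cdots y_{m_1}^1\, C^{\tau(G)}. \end{aligned} \tag{4.9} \]

It is obvious to see that $\theta^Z_{j-1} = \theta^{Z^1}_{j-1} = \varepsilon_0\, \theta^{Z^2}_{i-1}\, \theta^{Z^1}_{j-1}$ for $i = 1$ and $j = 1, \ldots, l_1 + 1$. For the turning point $(D * z_1^2)^{\tau(X^1)}$ of the compound sequence ($i = 1$, $j = l_1 + 2$), noting $\theta^{Z^2}_0 = \varepsilon_0$, $\theta^{Z^1}_{l_1+1} = +1$, and (4.4a), it follows

\[ \theta^Z_{l_1+1} = \theta^{Z^1}_{l_1}\, \varepsilon\big((D * z_1^2)^{\tau(X^1)}\big) = \varepsilon(D * z_1^2)\, \theta^{Z^2}_0\, \theta^{Z^1}_{l_1+1}. \]

Further, for $i = 1$ and $j = l_1 + 3, \ldots, p_1 - 1$,

\[ \theta^Z_{j-1} = \theta^Z_{l_1+1} \cdot \varepsilon(y_1^1) \cdots \varepsilon(y^1_{j - l_1 - 1}) = \varepsilon(D * z_i^2)\, \theta^{Z^2}_0\, \theta^{Z^1}_{j-1}. \]

For the turning point $(C * z_1^2)^{\tau(Y^1)}$ ($i = 2$, $j = 1$), noting $\theta^{Z^1}_0 = \varepsilon_0$ and (4.4c), it follows $\theta^Z_{p_1} = \varepsilon_0\, \varepsilon(z_1^2) = \varepsilon_0\, \theta^{Z^2}_1\, \theta^{Z^1}_0$. And further, for $i = 2$ and $j = 2, \ldots, l_1 + 1$, it follows

\[ \theta^Z_{p_1 + j - 1} = \theta^Z_{p_1} \cdot \varepsilon(x_1^1) \cdots \varepsilon(x_{j-1}^1) = \varepsilon_0\, \theta^{Z^2}_1\, \theta^{Z^1}_{j-1}. \]

We can prove the lemma by repeating the above calculation. The turning point $D^{-\tau(G)}$ (the global disturbance of the compound sequence), i.e., $i = l_2 + 1$, $j = l_1 + 2$, needs to be dealt with specially. Due to (4.5),

\[ \theta^Z_{l_2 p_1 + l_1 + 1} = \theta^Z_{l_2 p_1 + l_1} \cdot \varepsilon(D^{-\tau(G)}) = \varepsilon_0\, \theta^{Z^2}_{l_2}\, \theta^{Z^1}_{l_1} \cdot \varepsilon_0\, \theta^{Z^1}_{l_1}\, \theta^{Z^2}_{l_2} = +1; \]

for brevity, the formula can also be written as

\[ \theta^Z_{l_2 p_1 + l_1 + 1} = \varepsilon(D * z^2_{l_2+1})\, \theta^{Z^2}_{l_2}\, \theta^{Z^1}_{l_1+1}. \]

Here let

\[ \varepsilon(D * z^2_{l_2+1})\, \theta^{Z^2}_{l_2} \equiv 1. \tag{4.10} \]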

After considering this special case, we can prove the formulas (4.8a) and (4.8b) step by step.

Finally, the following lemma describes the relationship between symbolic characteristic functions and local coordinates.

Lemma 4.3.

\[ \varepsilon(D * z_i^2) = 2\chi(z_i^2 \cap R) - 1, \qquad i = 1, \ldots, p_2. \tag{4.11} \]

Proof. From the definition of the star products, it is easy to see that, for $i \ne l_2 + 1$,

\[ \varepsilon(D * z_i^2) = \begin{cases} +1, & \text{if } z_i^2 = R, \\ -1, & \text{otherwise}, \end{cases} \]

and by virtue of Definition (2.5) of the symbolic characteristic function, (4.11) holds for $i \ne l_2 + 1$. If $i = l_2 + 1$, according to Definition (4.10),

\[ \varepsilon(D * z^2_{l_2+1}) = \theta^{Z^2}_{l_2} = \begin{cases} +1, & \text{if } \theta^{Z^2}_{l_2} = 1, \\ -1, & \text{if } \theta^{Z^2}_{l_2} = -1, \end{cases} \]


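and by the global disturbance of the turning point $D$,

\[ \chi(z^2_{l_2+1} \cap R) = \chi(D^{-\tau(G)} \cap R) = \begin{cases} 1, & \text{if } J(z_1^2 \cdots z_{l_2}^2) = \text{odd} \ \ (\text{i.e., } \theta^{Z^2}_{l_2} = 1), \\ 0, & \text{if } J(z_1^2 \cdots z_{l_2}^2) = \text{even} \ \ (\text{i.e., } \theta^{Z^2}_{l_2} = -1). \end{cases} \]

Comparing these relations yields $\varepsilon(D * z_i^2) = 2\chi(z_i^2 \cap R) - 1$ for $i = l_2 + 1$. Consequently, formula (4.11) holds.

The following formula follows immediately from Lemma 4.3:

\[ \varepsilon(D * z^2_{k_2'}) - \varepsilon(D * z^2_{l_2 + k_2 + 1 - k_2'}) = 2\big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2 + k_2 + 1 - k_2'} \cap R) \big], \tag{4.12} \]

while combining Lemmas 4.3 and 4.1 leads to the formula

\[ \varepsilon(D * z^2_{\xi_2})\, \chi(z^2_{l_2 + 1 - \eta_2} \cap R) - \varepsilon(D * z^2_{l_2 + 1 - \eta_2})\, \chi(z^2_{\xi_2} \cap R) = \chi(z^2_{\xi_2} \cap R) - \chi(z^2_{l_2 + 1 - \eta_2} \cap R). \tag{4.13} \]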
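Since the characteristic function only takes the values 0 and 1, identities of the type (4.12) and (4.13) can be checked exhaustively once (4.11) is granted. The sketch below does exactly that; `eps` and `check_identities` are illustrative names introduced here, and `a`, `b` stand for two arbitrary characteristic values.

```python
from itertools import product


def eps(chi):
    """Local coordinate from the characteristic value, as in (4.11)."""
    return 2 * chi - 1


def check_identities():
    """Exhaustively verify the algebraic shape of (4.12) and (4.13)
    for all characteristic values a, b in {0, 1}."""
    for a, b in product((0, 1), repeat=2):
        if eps(a) - eps(b) != 2 * (a - b):            # shape of (4.12)
            return False
        if eps(a) * b - eps(b) * a != a - b:          # shape of (4.13)
            return False
    return True


assert check_identities()
```

Both identities are consequences of (4.11) alone and hold for any pair of positions in $Z^2$.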

All these formulas illustrate the subtle structure of the local disturbances of the turning points in the dual star products, and they will play a crucial role in the following proof of the factorizations of kneading polynomials of the compound admissible sequence pairs.

4.3. The proof of normal factorization theorem (Theorem 4.1). With the preparation of Subsects. 4.1 and 4.2, we now start to prove the factorization of the compound sequence. For all compound doubly superstable sequences $Z = Z^1 * Z^2 = XDYC$, in general, after considering the lengths of strings $XD$ and $YC$, they can be divided into two cases: (I) $l + 1 \le m + 2$; (II) $l + 1 > m + 2$. By considering the lengths of strings $X^2 D$, $Y^2 C$ and $X^1 D$, $Y^1 C$, and noting the relations (3.16), the case (I) can be further divided into the following two subcases:

(I.A) $l_2 + 1 \le m_2 + 2$ and $l_1 + 1 \le m_1 + 2$,
(I.B) $l_2 + 1 < m_2 + 2$ and $l_1 + 1 > m_1 + 2$,

and the case (II) can be further divided into the following two subcases:

(II.A) $l_2 + 1 > m_2 + 2$ and $l_1 + 1 \le m_1 + 2$,
(II.B) $l_2 + 1 \ge m_2 + 2$ and $l_1 + 1 > m_1 + 2$.

For the case (I) $l + 1 \le m + 2$, i.e., the string $XD$ is at most one letter longer than the string $YC$, the kneading coefficients $d_{k-1}$ of the characteristic polynomial $\tilde D(t)$ can be expressed as

\[ d_{k-1} = -\theta^1_{k-1} + 2 \sum_{k' \in \Omega_i} \theta^1_{k'-1}\, \theta^1_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big], \tag{4.3c} \]

where the index $k$ would be divided into the following three ranges,

(I.1) $k = 2, \ldots, l + 1$, $\Omega_1 = \{1, \ldots, k - 1\}$,
(I.2) $k = l + 2, \ldots, m + 2$, $\Omega_2 = \{1, \ldots, l + 1\}$,
(I.3) $k = m + 3, \ldots, p$, $\Omega_3 = \{k - m - 1, \ldots, l + 1\}$.

Under the subcase (I.A), each range can be decomposed into three conditions again. For (I.1), it can be decomposed into

(I.1a): $1 \le k_2 \le l_2 + 1$, $1 < k_1 \le l_1 + 1$,
(I.1b): $1 \le k_2 < l_2 + 1$, $l_1 + 1 < k_1 \le m_1 + 2$,
(I.1c): $1 \le k_2 < l_2 + 1$, $m_1 + 2 < k_1 \le p_1$,

for (I.2),

(I.2a): $l_2 + 1 < k_2 \le m_2 + 2$, $1 < k_1 \le l_1 + 1$,
(I.2b): $l_2 + 1 \le k_2 \le m_2 + 2$, $l_1 + 1 < k_1 \le m_1 + 2$,
(I.2c): $l_2 + 1 \le k_2 < m_2 + 2$, $m_1 + 2 < k_1 \le p_1$,

for (I.3),

(I.3a): $m_2 + 2 < k_2 \le p_2$, $1 < k_1 \le l_1 + 1$,
(I.3b): $m_2 + 2 < k_2 \le p_2$, $l_1 + 1 < k_1 \le m_1 + 2$,
(I.3c): $m_2 + 2 \le k_2 \le p_2$, $m_1 + 2 < k_1 \le p_1$.

Under the subcase (I.B), similarly, there exist nine other conditions. For case (II) $l + 1 > m + 2$, i.e., the string $XD$ is more than one letter longer than the string $YC$, the index $k$ in formula (4.3c) can be expanded into the following three ranges:

(II.1) $k = 2, \ldots, m + 1$, $\Omega_4 = \{1, \ldots, k - 1\}$,
(II.2) $k = m + 2, \ldots, l + 2$, $\Omega_5 = \{k - m - 1, \ldots, k - 1\}$,
(II.3) $k = l + 2, \ldots, p$, $\Omega_6 = \{k - m - 1, \ldots, l + 1\}$.

For each subcase of Case (II), these three ranges can also be decomposed into nine conditions. Hence, in order to prove Theorem 4.1, we must consider $4 \times 9$ conditions. Because there are some similarities among subcases (I.A), (I.B) and (II.A), (II.B), in the following we focus on subcase (I.A) in the proof of the factorization of the kneading characteristic polynomial of compound sequences. The proof is lengthy, so the details of the necessary techniques are kept at a sketched level. The proofs of the other three subcases are analogous and omitted.

The normal factorization equation (4.1) is equivalent to

\[ d^{Z^1 * Z^2}_{(k_2 - 1)p_1 + k_1 - 1} = d^{Z^2}_{k_2 - 1} \cdot d^{Z^1}_{k_1 - 1}, \tag{4.1a} \]

where $k_2 = 1, \ldots, p_2$, $k_1 = 1, \ldots, p_1$ and $k = (k_2 - 1)p_1 + k_1 = 1, \ldots, p = p_1 p_2 = l + m + 2$. Here $p$ is the period of the compound sequence $Z = Z^1 * Z^2$. For the special case $k = 1$ (i.e., $k_1 = 1$, $k_2 = 1$), due to $\theta_0 = -1$, it leads directly to $d_0^{Z^1 * Z^2} = d_0^{Z^2} \cdot d_0^{Z^1}$. For the general case, $k > 1$, we analyze the various conditions as follows:

Case I. $l + 1 \le m + 2$, subcase (I.A) $l_2 + 1 \le m_2 + 2$ and $l_1 + 1 \le m_1 + 2$.


(I.1). Let $2 \le k \le l + 1 = l_2 p_1 + l_1 + 1$. The kneading coefficients of the compound sequence $Z = Z^1 * Z^2$ can be expressed as

\[ \begin{aligned} d^{Z^1 * Z^2}_{k-1} &= -\theta^{Z^1 * Z^2}_{k-1} + 2 \sum_{k'=1}^{k-1} \theta^{Z^1 * Z^2}_{k'-1}\, \theta^{Z^1 * Z^2}_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big] \\ &= \sum_{k_2'=1}^{k_2-1}\ \sum_{k_1'=(k_2'-1)p_1+1}^{k_2' p_1} 2\, \theta^{Z^1 * Z^2}_{k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'} \big[ \chi(z_{k_1'} \cap R) - \chi(z_{l+1+k-k_1'} \cap R) \big] \\ &\quad - \theta^{Z^1 * Z^2}_{(k_2-1)p_1+k_1-1} + 2 \sum_{k'=(k_2-1)p_1+1}^{(k_2-1)p_1+k_1-1} \theta^{Z^1 * Z^2}_{k'-1}\, \theta^{Z^1 * Z^2}_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big]; \end{aligned} \tag{4.14} \]

this range can be decomposed into three conditions. We prove all three conditions as follows.

Under Condition (I.1a), i.e., $1 \le k_2 < l_2 + 1$, $1 < k_1 \le l_1 + 1$, due to the restriction of subcase (I.A), the kneading coefficients of the sequences $Z^r$, $r = 1, 2$, are given as

\[ d^{Z^r}_{k_r - 1} = -\theta^{Z^r}_{k_r - 1} + 2 \sum_{k_r'=1}^{k_r - 1} \theta^{Z^r}_{k_r'-1}\, \theta^{Z^r}_{l_r + k_r - k_r'} \big[ \chi(z^r_{k_r'} \cap R) - \chi(z^r_{l_r + 1 + k_r - k_r'} \cap R) \big]. \tag{4.15} \]

At first, we observe the summation of the first item in (4.14),

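\[ \begin{aligned} & \sum_{k_1'=(k_2'-1)p_1+1}^{k_2' p_1} 2\, \theta^{Z^1 * Z^2}_{k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'} \big[ \chi(z_{k_1'} \cap R) - \chi(z_{l+1+k-k_1'} \cap R) \big] \\ &\quad = \Bigg( \sum_{k_1'=1}^{k_1-1} + \sum_{k_1'=k_1}^{l_1+1} + \sum_{k_1'=l_1+2}^{l_1+k_1} + \sum_{k_1'=l_1+k_1+1}^{p_1} \Bigg)\, 2\, \theta^{Z^1 * Z^2}_{(k_2'-1)p_1+k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'-(k_2'-1)p_1} \big[ \chi(z_{(k_2'-1)p_1+k_1'} \cap R) - \chi(z_{l+1+k-k_1'-(k_2'-1)p_1} \cap R) \big]. \end{aligned} \]

Now we use the formulas (4.6) and (4.8) to reduce the quantities of $Z^1 * Z^2$ in the above expression (4.14) to those of $Z^1$ and $Z^2$, and note that

• $l + k - (k_2' - 1)p_1 = (l_2 + k_2 - k_2')p_1 + l_1 + k_1$,

• $\displaystyle \sum_{k_1'=k_1+1}^{l_1} 2\, \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] = 0$,

• with the relation (4.7),

\[ \begin{aligned} & 2\, \theta^{Z^1 * Z^2}_{(k_2'-1)p_1+k_1-1}\, \theta^{Z^1 * Z^2}_{(l_2+k_2-k_2')p_1+l_1} \big[ \chi(z_{(k_2'-1)p_1+k_1} \cap R) - \chi(z_{(l_2+k_2-k_2')p_1+l_1+1} \cap R) \big] \\ &\quad + 2\, \theta^{Z^1 * Z^2}_{(k_2'-1)p_1+l_1}\, \theta^{Z^1 * Z^2}_{(l_2+k_2-k_2')p_1+k_1-1} \big[ \chi(z_{(k_2'-1)p_1+l_1+1} \cap R) - \chi(z_{(l_2+k_2-k_2')p_1+k_1} \cap R) \big] \\ &\quad = \big( -\theta^{Z^1}_{k_1-1} \big) \cdot 2\, \theta^{Z^2}_{k_2'-1}\, \theta^{Z^2}_{l_2+k_2-k_2'} \big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2+1+k_2-k_2'} \cap R) \big], \end{aligned} \]

• $\displaystyle \sum_{k_1'=l_1+k_1+1}^{p_1} 2\, \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{p_1+l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{p_1+l_1+1+k_1-k_1'} \cap R) \big] = 0$,

• replacing $l_1 + 1 + k_1 - k_1'$ by $k_1'$, then $\displaystyle \sum_{k_1'=l_1+2}^{l_1+k_1} \Rightarrow \sum_{k_1'=1}^{k_1-1}$.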
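The two vanishing sums in the list above rest on an antisymmetry: under the substitution $k_1' \to l_1 + 1 + k_1 - k_1'$ the product of the two coordinates is unchanged while the bracket changes sign. A minimal numerical sketch of the first vanishing sum follows, using randomly generated hypothetical signs `theta` (values ±1, indexed from 0) and characteristic values `chi` (values 0/1, indexed from 1); none of this data comes from the paper.

```python
import random


def antisymmetric_sum(theta, chi, k1, l1):
    """Sum over k' = k1+1 .. l1 of
    theta[k'-1] * theta[l1+k1-k'] * (chi[k'] - chi[l1+1+k1-k']).

    theta is indexed from 0, chi from 1 (chi[0] is an unused placeholder).
    The substitution k' -> l1 + 1 + k1 - k' maps the range onto itself,
    keeps the theta product and flips the sign of the bracket.
    """
    return sum(
        theta[kp - 1] * theta[l1 + k1 - kp] * (chi[kp] - chi[l1 + 1 + k1 - kp])
        for kp in range(k1 + 1, l1 + 1)
    )


rng = random.Random(0)
l1 = 9
theta = [rng.choice((-1, 1)) for _ in range(l1 + 1)]        # hypothetical signs
chi = [None] + [rng.choice((0, 1)) for _ in range(l1 + 1)]  # hypothetical 0/1 values

results = [antisymmetric_sum(theta, chi, k1, l1) for k1 in range(1, l1)]
assert all(r == 0 for r in results)
```

The cancellation is termwise, so it does not depend on the particular sequence; only the index range matters.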

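By using the corollary (4.7) and relation (4.12), we have

\[ \begin{aligned} & \sum_{k_1'=(k_2'-1)p_1+1}^{k_2' p_1} 2\, \theta^{Z^1 * Z^2}_{k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'} \big[ \chi(z_{k_1'} \cap R) - \chi(z_{l+1+k-k_1'} \cap R) \big] \\ &\quad = 2\, \theta^{Z^2}_{k_2'-1}\, \theta^{Z^2}_{l_2+k_2-k_2'} \Bigg\{ \big( -\theta^{Z^1}_{k_1-1} \big) \big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2+1+k_2-k_2'} \cap R) \big] \\ &\qquad\quad + \big[ \varepsilon(D * z^2_{k_2'}) - \varepsilon(D * z^2_{l_2+1+k_2-k_2'}) \big] \sum_{k_1'=1}^{k_1-1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\} \\ &\quad = 2\, \theta^{Z^2}_{k_2'-1}\, \theta^{Z^2}_{l_2+k_2-k_2'} \big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2+1+k_2-k_2'} \cap R) \big] \cdot \Bigg\{ -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=1}^{k_1-1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\}. \end{aligned} \tag{4.16} \]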

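In addition, the other two items in (4.14) can be reduced as

\[ \begin{aligned} & -\theta^{Z^1 * Z^2}_{(k_2-1)p_1+k_1-1} + 2 \sum_{k'=(k_2-1)p_1+1}^{(k_2-1)p_1+k_1-1} \theta^{Z^1 * Z^2}_{k'-1}\, \theta^{Z^1 * Z^2}_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big] \\ &\quad = -\varepsilon_0\, \theta^{Z^2}_{k_2-1}\, \theta^{Z^1}_{k_1-1} - \varepsilon(D * z^2_{l_2+1})\, \theta^{Z^2}_{l_2}\, \theta^{Z^2}_{k_2-1} \cdot 2 \sum_{k_1'=1}^{k_1-1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \\ &\quad = \big( -\theta^{Z^2}_{k_2-1} \big) \Bigg\{ -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=1}^{k_1-1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\}. \end{aligned} \tag{4.17} \]

Here we have used Definition (4.10). Consequently, summing up (4.16) and (4.17) yields

\[ \begin{aligned} d^{Z^1 * Z^2}_{(k_2-1)p_1+k_1-1} &= \big( -\theta^{Z^2}_{k_2-1} \big) \Bigg\{ -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=1}^{k_1-1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\} \\ &\quad + \sum_{k_2'=1}^{k_2-1} 2\, \theta^{Z^2}_{k_2'-1}\, \theta^{Z^2}_{l_2+k_2-k_2'} \big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2+1+k_2-k_2'} \cap R) \big] \cdot \Bigg\{ -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=1}^{k_1-1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\} \\ &= \Bigg\{ -\theta^{Z^2}_{k_2-1} + 2 \sum_{k_2'=1}^{k_2-1} \theta^{Z^2}_{k_2'-1}\, \theta^{Z^2}_{l_2+k_2-k_2'} \big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2+1+k_2-k_2'} \cap R) \big] \Bigg\} \cdot \Bigg\{ -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=1}^{k_1-1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\} \\ &= d^{Z^2}_{k_2-1} \cdot d^{Z^1}_{k_1-1}. \end{aligned} \]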

So formula (4.1a) holds under Condition (I.1a).

Under Condition (I.1b): $1 \le k_2 < l_2 + 1$, $l_1 + 1 < k_1 \le m_1 + 2$, and (I.1c): $1 \le k_2 < l_2 + 1$, $m_1 + 2 < k_1 \le p_1$, the kneading coefficients of the sequence $Z^2$ and of the compound sequence $Z^1 * Z^2$ are the same as under Condition (I.1a), whereas for the sequence $Z^1$, under Condition (I.1b), the kneading coefficients are given as

\[ d^{Z^1}_{k_1-1} = -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=1}^{l_1+1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \]

and the summation can be transformed as

\[ \sum_{k_1'=1}^{p_1} \Rightarrow \sum_{k_1'=1}^{l_1+1} + \sum_{k_1'=l_1+2}^{k_1-1} + \sum_{k_1'=k_1}^{l_1+k_1} + \sum_{k_1'=l_1+k_1+1}^{p_1}. \tag{4.18} \]

Under Condition (I.1c), the kneading coefficients of the characteristic polynomial of the sequence $Z^1$ are

\[ d^{Z^1}_{k_1-1} = -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=k_1-m_1-1}^{l_1+1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \]

and

\[ \sum_{k_1'=1}^{p_1} \Rightarrow \sum_{k_1'=1}^{k_1-m_1-2} + \sum_{k_1'=k_1-m_1-1}^{l_1+1} + \sum_{k_1'=l_1+2}^{k_1-1} + \sum_{k_1'=k_1}^{p_1}. \tag{4.19} \]

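Similarly, by using formulas (4.6) and (4.8) to reduce the quantities of $Z^1 * Z^2$ in the expression (4.14) to those of $Z^1$ and $Z^2$, and the corollaries, we can easily prove the formula (4.1a) under Conditions (I.1b) and (I.1c).

(I.2). Let $l + 1 < k \le m + 2$. The coefficients of the characteristic polynomial of the compound sequence $Z = Z^1 * Z^2$ read as

\[ \begin{aligned} d^{Z^1 * Z^2}_{k-1} &= -\theta^{Z^1 * Z^2}_{k-1} + 2 \sum_{k'=1}^{l+1} \theta^{Z^1 * Z^2}_{k'-1}\, \theta^{Z^1 * Z^2}_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big] \\ &= \sum_{k_2'=1}^{l_2}\ \sum_{k_1'=(k_2'-1)p_1+1}^{k_2' p_1} 2\, \theta^{Z^1 * Z^2}_{k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'} \big[ \chi(z_{k_1'} \cap R) - \chi(z_{l+1+k-k_1'} \cap R) \big] \\ &\quad - \theta^{Z^1 * Z^2}_{k-1} + 2 \sum_{k'=l_2 p_1+1}^{l_2 p_1+l_1+1} \theta^{Z^1 * Z^2}_{k'-1}\, \theta^{Z^1 * Z^2}_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big]. \end{aligned} \tag{4.20} \]

Under Condition (I.2b), the kneading coefficients of the characteristic polynomial of the sequences $Z^r$, $r = 1, 2$, are

\[ d^{Z^r}_{k_r-1} = -\theta^{Z^r}_{k_r-1} + 2 \sum_{k_r'=1}^{l_r+1} \theta^{Z^r}_{k_r'-1}\, \theta^{Z^r}_{l_r+k_r-k_r'} \big[ \chi(z^r_{k_r'} \cap R) - \chi(z^r_{l_r+1+k_r-k_r'} \cap R) \big], \tag{4.21} \]

the summation $\sum_{k_1'=1}^{p_1}$ in the expression (4.20) can be transformed as (4.18), and by the analogous calculations it leads to

\[ \begin{aligned} & \sum_{k_2'=1}^{l_2}\ \sum_{k_1'=(k_2'-1)p_1+1}^{k_2' p_1} 2\, \theta^{Z^1 * Z^2}_{k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'} \big[ \chi(z_{k_1'} \cap R) - \chi(z_{l+1+k-k_1'} \cap R) \big] \\ &\quad = \sum_{k_2'=1}^{l_2} 2\, \theta^{Z^2}_{k_2'-1}\, \theta^{Z^2}_{l_2+k_2-k_2'} \big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2+1+k_2-k_2'} \cap R) \big] \cdot \Bigg\{ -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=1}^{l_1+1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\}. \end{aligned} \tag{4.22} \]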


Now we discuss the other two items,

\[ \begin{aligned} & -\theta^{Z^1 * Z^2}_{k-1} + 2 \sum_{k'=l_2 p_1+1}^{l_2 p_1+l_1+1} \theta^{Z^1 * Z^2}_{k'-1}\, \theta^{Z^1 * Z^2}_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big] \\ &\quad = -\theta^{Z^1 * Z^2}_{(k_2-1)p_1+k_1-1} + 2 \sum_{k_1'=1}^{l_1+1} \theta^{Z^1 * Z^2}_{l_2 p_1+k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'-l_2 p_1} \big[ \chi(z_{l_2 p_1+k_1'} \cap R) - \chi(z_{l+1+k-k_1'-l_2 p_1} \cap R) \big] \\ &\quad = -\varepsilon(D * z^2_{k_2})\, \theta^{Z^2}_{k_2-1}\, \theta^{Z^1}_{k_1-1} + 2\varepsilon_0\, \theta^{Z^2}_{l_2}\, \varepsilon(D * z^2_{k_2})\, \theta^{Z^2}_{k_2-1} \sum_{k_1'=1}^{l_1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \\ &\qquad + 2\varepsilon_0\, \theta^{Z^2}_{l_2}\, \theta^{Z^1}_{l_1}\, \varepsilon(D * z^2_{k_2})\, \theta^{Z^2}_{k_2-1}\, \theta^{Z^1}_{k_1-1} \big[ \chi(z_{l_2 p_1+l_1+1} \cap R) - \chi(z^1_{k_1} \cap R) \big]. \end{aligned} \]

Noting Lemma 4.3, Definition (4.10), and the corollary of Lemmas 4.1 and 4.3, i.e., (4.13), it leads to

\[ = \Big\{ -\theta^{Z^2}_{k_2-1} + 2\, \theta^{Z^2}_{l_2}\, \theta^{Z^2}_{k_2-1} \big[ \chi(z^2_{l_2+1} \cap R) - \chi(z^2_{k_2} \cap R) \big] \Big\} \cdot \Bigg\{ -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=1}^{l_1+1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\}. \tag{4.23} \]


Summing up (4.22) and (4.23), it is easy to see that the formula (4.1a) holds under Condition (I.2b). Under Conditions (I.2a) and (I.2c), the proofs are similar and omitted here.

(I.3). Let $m + 2 < k \le p = (l_2 + m_2 + 2)(l_1 + m_1 + 2)$. The coefficients of the characteristic polynomial of the compound sequence $Z = Z^1 * Z^2$ read as

\[ \begin{aligned} d^{Z^1 * Z^2}_{k-1} &= -\theta^{Z^1 * Z^2}_{k-1} + 2 \sum_{k'=k-m-1}^{l+1} \theta^{Z^1 * Z^2}_{k'-1}\, \theta^{Z^1 * Z^2}_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big] \\ &= \sum_{k_2'=k_2-m_2-1}^{l_2}\ \sum_{k_1'=(k_2'-1)p_1+k_1-m_1-1}^{k_2' p_1+k_1-m_1-2} 2\, \theta^{Z^1 * Z^2}_{k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'} \big[ \chi(z_{k_1'} \cap R) - \chi(z_{l+1+k-k_1'} \cap R) \big] \\ &\quad - \theta^{Z^1 * Z^2}_{k-1} + 2 \sum_{k'=l_2 p_1+k_1-m_1-1}^{l_2 p_1+l_1+1} \theta^{Z^1 * Z^2}_{k'-1}\, \theta^{Z^1 * Z^2}_{l+k-k'} \big[ \chi(z_{k'} \cap R) - \chi(z_{l+1+k-k'} \cap R) \big]. \end{aligned} \tag{4.24} \]

Under Condition (I.3c), the coefficients of the characteristic polynomial of the sequences $Z^r$, $r = 1, 2$, are

\[ d^{Z^r}_{k_r-1} = -\theta^{Z^r}_{k_r-1} + 2 \sum_{k_r'=k_r-m_r-1}^{l_r+1} \theta^{Z^r}_{k_r'-1}\, \theta^{Z^r}_{l_r+k_r-k_r'} \big[ \chi(z^r_{k_r'} \cap R) - \chi(z^r_{l_r+1+k_r-k_r'} \cap R) \big]. \tag{4.25} \]

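By transforming the summation as

\[ \sum_{k_1'=k_1-m_1-1}^{p_1+k_1-m_1-2} \Rightarrow \sum_{k_1'=k_1-m_1-1}^{l_1+1} + \sum_{k_1'=l_1+2}^{k_1-1} + \sum_{k_1'=k_1}^{p_1} + \sum_{k_1'=p_1+1}^{p_1+k_1-m_1-2}, \]

which looks like the transformation (4.19), and after the analogous operations, we have

\[ \begin{aligned} & \sum_{k_2'=k_2-m_2-1}^{l_2}\ \sum_{k_1'=(k_2'-1)p_1+k_1-m_1-1}^{k_2' p_1+k_1-m_1-2} 2\, \theta^{Z^1 * Z^2}_{k_1'-1}\, \theta^{Z^1 * Z^2}_{l+k-k_1'} \big[ \chi(z_{k_1'} \cap R) - \chi(z_{l+1+k-k_1'} \cap R) \big] \\ &\quad = \sum_{k_2'=k_2-m_2-1}^{l_2} 2\, \theta^{Z^2}_{k_2'-1}\, \theta^{Z^2}_{l_2+k_2-k_2'} \big[ \chi(z^2_{k_2'} \cap R) - \chi(z^2_{l_2+1+k_2-k_2'} \cap R) \big] \cdot \Bigg\{ -\theta^{Z^1}_{k_1-1} + 2 \sum_{k_1'=k_1-m_1-1}^{l_1+1} \theta^{Z^1}_{k_1'-1}\, \theta^{Z^1}_{l_1+k_1-k_1'} \big[ \chi(z^1_{k_1'} \cap R) - \chi(z^1_{l_1+1+k_1-k_1'} \cap R) \big] \Bigg\}. \end{aligned} \tag{4.26} \]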


Using the definition (4.10) and the reduction formulas (4.6) and (4.8), the other two items can be reduced to Z 1 ∗Z 2 k−1

−

l2 p1 +l1 +1

+

Z 1 ∗Z 2 k −1

2

k =l2 p1 +k1 −m1 −1 Z 1 ∗Z 2 (k2 −1)p1 +k1 −1

=−

=−

Z 1 ∗Z 2 l2 p1 +k1 −1

2 Z1 k1 −1

· −

χ 1 (zk ∩ R) − χ 1 (zl+1+k−k ∩ R)

l 1 +1

+

k1 =k1 −m1 −1

Z 1 ∗Z 2 l+k−k

Z2 k2 −1

Z 1 ∗Z 2 l+k−k1 −l2 p1 Z2 l2

+2

χ (zl

Z2 p2 −1

∩ R) 1 2 p1 Z1 2 2 l1 χ (zl2 p1 +l1 +1 ∩ R) − χ (zp ∩ R) 2 p1 +k1

∩ R) − χ (zl+1+k−k −l

+ (ε(D∗zl22 +1 ) − ε(D∗zk22 )) l 1 +1

Z1 k1 −1

2

k =k1 −m1 −1

− ε(D∗zl22 +1 )

Z1 l1 +k1 −k1

l 1 +1

χ (z1 ∩ R) − χ (z1

l1 +1+k1 −k1

k1

Z1 k1 −1

2

k1 =k1 −m1 −1

Z1 l1 +k1 −k1

χ (z1 ∩ R) − χ (z1

= −   −

Z2 k2 −1 Z1 k1 −1

+2 +

Z2 l2

Z2 p2 −1

l 1 +1 k1 =k1 −m1 −1

l1 +1+k1 −k1

k1

χ (zl22 +1 ∩ R) − χ (zp2 2 ∩ R)

2

∩ R)

Z1 k1 −1

Z1 l1 +k1 −k1

∩ R)

Z2 l2

Z2 k2 −1

·

χ (z1 ∩ R) − χ (z1 k1

l1 +1+k1 −k1

  ∩ R)  . (4.27)

Summing up (4.26) and (4.27), formula (4.1a) obviously holds under Condition (I.3c). Under Conditions (I.3a) and (I.3b), the proofs are similar and also omitted here.

4.4. The proof of the topological entropy preserving theorem (Theorem 4.2).

(a) First of all, we consider a special set of superstable sequences with zero topological entropy. The topological entropy can be calculated from the smallest positive zero of kneading characteristic polynomials. A trivial calculation shows that the topological entropy of the doubly superstable kneading sequence $(DC)$ of period 2 is equal to zero. From formula (4.1) we clearly see that the topological entropies of the compound doubly superstable kneading sequences $(DC)*(DC)$ and $(CD)*(CD)$ are also equal to zero. In general, the compound sequences $(DC)^{*n} * (DC)$, with $*$ being either the up-star or the down-star product, possess zero topological entropy for all $n \in Z^+$. Let $H_0^n = \bigcup (DC)^{*n} * (DC)$ represent the set of period $2^{n+1}$, where the union means that it includes all $2^n$ different combinations of the operations $(DC)*$ and $(CD)*$. Then the zero topological entropy class


$H_0 = \{Z \mid h(Z) = 0,\ Z \in K_0\} = \bigcup_{n \in Z^+} H_0^n$. By this observation we can now classify the set $K_0$ of doubly superstable sequences into two classes: a zero topological entropy class $H_0$ and a non-zero topological entropy class $H_+ = \{Z \mid h(Z) > 0,\ Z \in K_0\}$. Evidently, $K_0 = H_0 \cup H_+$.

(b). In view of Theorem 4.1, the topological entropy of the compound sequence $Z^1 * Z^2$ can be determined by the smallest positive zero $t_{\min}$ of $D_{Z^1}(t) = 0$ and $D_{Z^2}(t^{|Z^1|}) = 0$, i.e.,

\[ h(Z^1 * Z^2) = -\ln t_{\min}, \qquad t_{\min} = \min_{Z^1, Z^2 \in K_0} \big\{ t \mid D_{Z^1}(t) = 0,\ D_{Z^2}(t^{|Z^1|}) = 0 \big\}. \]

If $Z^1 \in H_0$, $Z^2 \in H_0 \cup H_+$, then the smallest positive zero $t_{\min}$ of the analytic function $D_{Z^1}(t)$ is outside the margin of the open convergent unit disk and thus is larger than or equal to that of $D_{Z^2}(t^{|Z^1|})$, and so the smallest positive zero of $D_{Z^1 * Z^2}(t)$ is determined by $D_{Z^2}(t^{|Z^1|}) = 0$, and (4.2b) holds. If $Z^1 \in H_+$, $Z^2 \in H_0$, then the smallest positive zero of $D_{Z^1}(t)$ satisfies $t_{\min} < 1$, while $t_{\min}^{|Z^1|} = 1$ for $D_{Z^2}(t^{|Z^1|}) = 0$, hence it follows $h(Z^1 * Z^2) = h(Z^1)$.
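The rule $h = -\ln t_{\min}$ and the effect of the substitution $t \to t^{|Z^1|}$ can be checked numerically. The following is a minimal sketch, assuming a hypothetical polynomial $1 - 3s$ as a stand-in for a kneading determinant (chosen only because its zero $s = 1/3$ gives entropy $\ln 3$; it is not claimed to be the determinant of any particular sequence). The root finder is a plain grid scan followed by bisection, not a method from the paper.

```python
import math


def smallest_positive_zero(f, hi=1.0, steps=2000):
    """Scan (0, hi] on a uniform grid for the first sign change of f
    and refine it by bisection. Returns None if no sign change is found."""
    prev_t, prev_v = 0.0, f(0.0)
    for n in range(1, steps + 1):
        t = hi * n / steps
        v = f(t)
        if v == 0.0:
            return t
        if prev_v * v < 0:
            a, b = prev_t, t
            for _ in range(200):
                mid = 0.5 * (a + b)
                if f(a) * f(mid) <= 0:
                    b = mid
                else:
                    a = mid
            return 0.5 * (a + b)
        prev_t, prev_v = t, v
    return None


def entropy(f):
    """h = -ln(t_min) for the smallest positive zero t_min of f."""
    return -math.log(smallest_positive_zero(f))


def D2(s):
    """Hypothetical stand-in polynomial with zero at s = 1/3."""
    return 1.0 - 3.0 * s


p1 = 4                                   # plays the role of |Z^1| = 2**n with n = 2
h2 = entropy(D2)                         # entropy read off from D2(t)
h_comp = entropy(lambda t: D2(t ** p1))  # entropy read off from D2(t**p1)

assert abs(h2 - math.log(3)) < 1e-9
assert abs(h_comp - h2 / p1) < 1e-9
```

A zero $s_0$ of the stand-in becomes the zero $s_0^{1/p_1}$ after the substitution, which divides $-\ln$ of the zero by $p_1$; with $p_1 = 2^n$ this is the compression factor appearing in (4.2b).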

(c). For the case $Z^1, Z^2 \in H_+$, we discuss it in detail as follows. From the symbolic order of kneading theory we know that the primitive sequence $Q_{\max}$ of the maximal topological entropy $\ln 3$ corresponds to the surjective map, $(CR^\infty, DL^\infty)$. According to the Sarkovskii-like order embedding theorem stated later, on the one hand, the primitive sequence $Q_{\min}$ of the minimal topological entropy is $Q_{\min} = (DC)*(CR^\infty, DL^\infty)$ or $(CD)*(CL^\infty, DR^\infty)$, whose topological entropy is $\frac{1}{2}\ln 3$. Hence we have the entropic bounds for primitive sequences,

\[ \tfrac{1}{2} \ln 3 = \min_{Z \in H_\pi} h(Z) \le h(Z) \le \max_{Z \in H_\pi} h(Z) = \ln 3, \qquad \text{for } Z \in H_\pi; \]

on the other hand, noting that $\max_{Z^1, Z^2 \in H_\pi} \{ -\ln t \mid D_{Z^2}(t^{|Z^1|}) = 0 \} = \frac{1}{3} \ln 3 \le \min_{Q \in H_\pi} h(Q)$, it yields $h(Z^1 * Z^2) = h(Z^1)$. The formulas (4.2) hold.

We can also rewrite the above Theorem 4.2 in the form of the star map. For the star map $Z^1 * : Z^2 \mapsto Z^1 * Z^2$, $Z^2 \in K_0$, if $Z^1 \in H_+$, it is topological entropy preserving from the set $K_0$ to the equal entropy class with the value $h(Z^1)$; if $Z^1 \in H_0^n$, it is such a map that compresses the topological entropies $h(Z^2)$ of the preimages by the factor $1/2^n$.

We come here to discuss the non-zero topological entropy class. The primitive sequences are defined as such sequences that can not be factorized in the sense of dual star products. Let $H_\pi$ be the set of all the primitive doubly superstable sequences: it has the upper limit $(CR^\infty, DL^\infty)$. With the operators $(DC)*$ and $(CD)*$ acting on the entire set $H_\pi$, we obtain a set $H_\pi^1 = (DC)*H_\pi \cup (CD)*H_\pi$. Similarly we can make the sets $H_\pi^n = \bigcup (DC)^{*n} * H_\pi$, the union running over all combinations of up-star and down-star products. These sets $H_\pi, H_\pi^1, H_\pi^2, \ldots$ construct the non-zero topological entropy class $H_+ = \bigcup_{n \in Z^+} H_\pi^n$ and are closely adjacent one by one in the kneading plane. For instance, $H_\pi$ and $H_\pi^1$ are adjacent at two points: $(DC)*(CR^\infty, DL^\infty)$ and $(CD)*(CR^\infty, DL^\infty)$. In general, $H_\pi^n$ and $H_\pi^{n+1}$ are adjacent at the $2^{n+1}$ points: $(DC)^{*n} * (CR^\infty, DL^\infty)$. We note that the generalization of $Z*$ on the general admissible sequence pairs is given in the next section.


Specially, we can get the lower boundary of the non-zero topological entropy class $H_+$: $H_\pi^\infty = \lim_{n \to \infty} H_\pi^n$ with the topological entropies $\lim_{n \to \infty} \frac{1}{2^n} \ln 3 = 0$. With the analogous operations, we can fix the upper boundary of the zero topological entropy class: $H_0^\infty = \lim_{n \to \infty} H_0^n$. In fact, these two boundaries are the same one, $H_0^\infty = H_\pi^\infty$, which is a fractal curve [15,19] composed of accumulation points of all the combinations of operations of up-star products and down-star products. So we can also get the boundary of topological chaos from the non-zero topological entropy region by the dual star products.

4.5. Sarkovskii-like order embedding theorem. The well known Sarkovskii order [22] is a system of existence order for periodic sequences in one-dimensional maps. By MSS order we may substitute Sarkovskii order and set up a finer order embedding theorem. By selecting infinitely many primitive doubly superstable sequences from the set $H_\pi$, we can construct a countable infinite subset: $K_n \in H_\pi$, $n = 1, 2, \ldots$, where we always take the maximal representations of these sequences (that is, the sequences starting from turning point $C$). Since the MSS order $\prec$ is complete, this subset can be ordered as

\[ H_\pi(\prec) : K_1 \prec K_2 \prec \ldots \prec K_n \prec \ldots . \]

Here we may choose such sequences whose periods $|K_i|$ are odd numbers like Sarkovskii's, or any other primitive periodic sequences. The upper limit of this ordering is $(CR^\infty, DL^\infty)$, and this ordering set of sequences is denoted by $H_\pi(\prec)$. By virtue of the order preserving of the operator $(DC)*$ in Lemma 3.2, the operator $(DC)*$ acting on the order relation $H_\pi(\prec)$ yields the ordering set $H_\pi^1(\prec) := (DC)*H_\pi(\prec)$, namely

\[ H_\pi^1(\prec) : (DC)*K_1 \prec (DC)*K_2 \prec \ldots \prec (DC)*K_n \prec \ldots . \]

In general, we can construct the following series of order relations by the operation $H_\pi^{n+1}(\prec) := (DC)*H_\pi^n(\prec)$:

\[ \begin{aligned} H_\pi^1(\prec) &: (DC)*K_1 \prec (DC)*K_2 \prec \ldots \prec (DC)*K_n \prec \ldots , \\ &\ \ \vdots \\ H_\pi^n(\prec) &: (DC)^{*n}*K_1 \prec (DC)^{*n}*K_2 \prec \ldots \prec (DC)^{*n}*K_n \prec \ldots , \\ &\ \ \vdots \end{aligned} \]

and the adjacent points between $H_\pi^n(\prec)$ and $H_\pi^{n+1}(\prec)$ are $(DC)^{*n}*(CR^\infty, DL^\infty)$, $n = 1, 2, \ldots$. Considering the ordering set of zero topological entropy sequences

\[ H_0(\prec) : (D = C) \prec (DC) \prec (DC)^{*2} \prec \ldots \prec (DC)^{*n} \prec \ldots , \]

we consequently get the order relation series

\[ SL(\prec) := H_0(\prec) \prec \ldots \prec H_\pi^n(\prec) \prec H_\pi^{n-1}(\prec) \prec \ldots \prec H_\pi^1(\prec) \prec H_\pi(\prec) \tag{I} \]

and call it Sarkovskii-like order sequences. Again the order preserving operator $K_i*$, $K_i \in H_\pi$, acting on the order set $SL(\prec)$ will produce the order relation

\[ K_i * SL(\prec) := K_i * H_0(\prec) \prec \ldots \prec K_i * H_\pi^n(\prec) \prec \ldots \prec K_i * H_\pi^1(\prec) \prec K_i * H_\pi(\prec), \tag{II} \]


according to the formula (4.2a), all the sequences in this ordering set preserve topological entropy and form an equal topological entropy class, in other words, they are all embedded into a plateau of topological entropy which is labelled by the primitive sequence Ki . Thus the Sarkovskii-like order sequences can be embedded into infinitely many plateaus Ki ∗SL(≺), (Ki ∈ K0 ), of topological entropy and the infinity is decided by the cardinal of primitive sequence pair set. Finally, we have the Sarkovskii-like order embedding theorem. Theorem 4.3. If H0 (≺) and Hπ (≺) are well-ordered, then SL(≺) is well-ordered and embedded in any plateau Ki ∗SL(≺) of equal topological entropy h(Ki ). By analogy with the up-star product, the case for the down-star product (CD)∗n ∗ can also be obtained. In general, the case with the occurrence of the mixed products of the up-star and the down-star is very complex, and the relevant order relations will span the whole kneading plane. We will consider it elsewhere. 5. Generalized Milnor–Thurston Conjecture We now come to set up the equal topological entropy classes (or plateaus of topological entropies).

5.1. Extension of dual star products. In the previous section we display the structure relation of topological order in the kneading plane by the Sarkovskii-like order embedding theorem which is induced by the dual star products. Here we give the geometric structure of plateaus of topological entropies in the kneading plane. To understand the whole kneading plane we need to extend the right element of the star product from doubly superstable admissible sequence pairs to general admissible sequence pairs. If the left element Z 1 in the previous definition of the dual star products Z 1 ∗ Z 2 is replaced by a singly superstable sequence pair (XC, Y D), it becomes clear after some trivial operations that some fuzziness arises in formulas (3.1) and (3.2), and the resultant sequence pairs by the manipulation (XC, Y D) ∗ Z 2 can not generally satisfy the admissibility condition (2.1), therefore this fact shows that the replacement is invalid and the position of the left element for the star products can only be occupied by the doubly superstable kneading sequence pairs. However, the right element is not irreplaceable and it can be extended to any admissible sequence pairs. In the following we give this extension of dual star products.

Theorem 5.1. For any doubly superstable kneading sequence pair $A = (X, Y) \in K$, there exist $Z \bar{*} A = (Z \bar{*} X, Z \bar{*} Y) \in K$ and $Z \tilde{*} A = (Z \tilde{*} X, Z \tilde{*} Y) \in K$.

Remark. Owing to the fact that the sequence pair A satisfies the admissibility condition (2.1), in view of the order-preserving property of the dual star products, and using techniques similar to those in the proof of Theorem 3.1, we can prove the following inequalities for the compound sequence pair Z ∗ A:

$$ Z \bar{*} K_D \preceq \varphi^k(Z \bar{*} X),\ \varphi^k(Z \bar{*} Y) \preceq Z \bar{*} K_C, \qquad k \in \mathbb{Z}_+. \tag{5.1} $$

The details of the proof are omitted here.


Theorem 5.2. For the compound sequence pair Z ∗ A, there exists the following normal factorization of the characteristic polynomial:

$$ D_{Z*A}(t) = D_Z(t) \cdot D_A(t^{|Z|}). \tag{5.2} $$

Therefore we have the relations on the topological entropy:

$$ h(Z * A) = \begin{cases} h(Z) & \text{if } Z \in H_+, \qquad \text{(5.3a)} \\ h(A)/2^n & \text{if } Z \in H_0^n. \qquad \text{(5.3b)} \end{cases} $$
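As an informal numerical illustration of the factorization (5.2) and of the entropy relation (5.3b), the following Python sketch composes two toy characteristic polynomials. It assumes, as is standard in Milnor–Thurston kneading theory, that the topological entropy equals $-\ln t^*$, where $t^*$ is the smallest zero of the characteristic polynomial in (0, 1). The polynomials $D_A(t) = 1 - 3t$ (entropy ln 3) and $D_Z(t) = 1 - t$ (no zero in (0, 1), with |Z| = 2) are hypothetical stand-ins chosen for simplicity; they are not derived from sequence pairs in the text.

```python
import math

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def substitute_power(a, p):
    """Return the coefficients of a(t**p)."""
    out = [0.0] * ((len(a) - 1) * p + 1)
    for k, ak in enumerate(a):
        out[k * p] = ak
    return out

def poly_eval(a, t):
    """Evaluate the polynomial at t by Horner's rule."""
    value = 0.0
    for coeff in reversed(a):
        value = value * t + coeff
    return value

def smallest_zero_in_unit_interval(a, steps=10000):
    """Locate the first sign change of a(t) on (0, 1) by scanning, then bisect."""
    for i in range(steps):
        lo, hi = i / steps, (i + 1) / steps
        if poly_eval(a, lo) * poly_eval(a, hi) < 0:
            for _ in range(100):
                mid = 0.5 * (lo + hi)
                if poly_eval(a, lo) * poly_eval(a, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
    return None  # no sign change found: treated here as zero entropy

def entropy(a):
    """Entropy as -ln(t*), under the assumption stated in the lead-in."""
    t_star = smallest_zero_in_unit_interval(a)
    return 0.0 if t_star is None else -math.log(t_star)

# Hypothetical toy polynomials (assumptions, not taken from the paper).
D_A = [1.0, -3.0]   # 1 - 3t, zero at t = 1/3
D_Z = [1.0, -1.0]   # 1 - t, no zero inside (0, 1)
period_Z = 2        # plays the role of |Z| = 2^n with n = 1

# Factorization (5.2): D_{Z*A}(t) = D_Z(t) * D_A(t^{|Z|}).
D_ZA = poly_mul(D_Z, substitute_power(D_A, period_Z))

h_A = entropy(D_A)
h_ZA = entropy(D_ZA)
print(h_A, h_ZA, h_A / period_Z)
```

In this toy case the smallest zero of $D_{Z*A}$ is the square root of that of $D_A$, so the computed entropy is halved, in line with (5.3b) for n = 1.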

Remark. For any admissible sequence pair A = (X , Y ), X = x1 . . . xξ . . . and Y =

y1 . . . yη . . . , we can write down the coefficients dk−1 of the characteristic polynomial of A as dk−1 = −

C k−1

+

ξ +η=k 1≤ξ, η<∞

2

C ξ −1

D η−1

χ (xξ ∩ R) − χ (yη ∩ R)

(5.4)

By virtue of the properties of characteristic functions and invariant coordinates, we have the following formula:

$$ d^{Z*A}_{(k_2-1)p+k_1-1} = d^{Z}_{k_1-1} \cdot d^{A}_{k_2-1}, \tag{5.5} $$

where k1 = 1, . . . , p (= |Z|), k2 = 1, 2, . . . . The two theorems can be proved by techniques similar to those used for Theorems 4.1 and 4.2; for brevity, we omit the proofs here.

5.2. Plateaus of topological entropies in the kneading plane – The generalized Milnor–Thurston conjecture. Here we sketch the geometric description of the kneading plane. In the symbolic space Σ3 of three letters, there exists a special sequence pair (D = C) corresponding to a fixed point of the bimodal map, which is a pair of two singly superstable kneading sequences and also a doubly superstable sequence pair. It satisfies the relation Z ∗ (D = C) = Z, Z ∈ K0, so we call it the unit sequence pair. By the algorithm of the lifting sequence pair, in analogy with lifting symbolic sequences of two letters [5], we can locate one point or an area in the parameter plane for every admissible sequence pair (X, Y). For instance, a doubly superstable sequence pair corresponds to only one point in the parameter plane, and the window sequence pairs (XC^{±τ(X)}, YD^{±τ(Y)}) of a singly superstable periodic sequence pair (XC, YD) correspond to an extended area in the parameter plane. All admissible sequence pairs make up a bounded region, which has four marginal points listed clockwise as (R∞, L∞), (R∞, C), (D = C), (D, L∞). Further, we can define the coordinate system of the kneading plane as follows:

1. The origin is located at the surjective map point (R∞, L∞);
2. The line starting from the point (D, L∞), through the points (X, L∞), and finally to (R∞, L∞), is defined as the X-axis (DL∞) (horizontal axis);
3. The line starting from the point (R∞, C), through the points (R∞, Y), and finally to (R∞, L∞), is defined as the Y-axis (CR∞) (vertical axis).


The up and down window boundaries of the lines (C, Y) and (X, D), i.e., (L∞, Y) and (X, R∞) respectively, along with the other two lines (X, L∞) and (R∞, Y), enclose a bounded region in the kneading plane, which is denoted by B. For this bounded region B we have the following theorem.

Theorem 5.3 (Plateau theorem). (a) For any doubly superstable sequence pair Z = (XDYC) ∈ K0, the star maps Z∗ are compressed in the sense of the correspondence between the symbolic order and the parametric order for the whole kneading bound region B, that is,

$$ Z * B \subset B, \qquad * \in \{\bar{*}, \tilde{*}\}. \tag{5.6} $$

(b) If Z ∈ H+, then the compressed regions defined by Z ∗ B are plateaus of equal topological entropy with the value h(Z).

(c) Z ∗ B and $Z^T$ ∗ B are symmetric with respect to the central line which starts from (R∞, L∞) and ends at the unit sequence pair (D = C).

Proof. (a) The region Z ∗ B has the following four marginal points, listed clockwise:

$$ \big(C(XD^{\tau(X)}YC^{-\tau(Y)})^\infty,\ DYC^{-\tau(Y)}(XD^{-\tau(X)}YC^{\tau(Y)})^\infty\big), $$
$$ \big(C(XD^{\tau(X)}YC^{-\tau(Y)})^\infty,\ DYC^{-\tau(Y)}XD^{-\tau(X)}YC\big), $$
$$ \big(CXD,\ DYC^{-\tau(Y)}(XD^{-\tau(X)}YC^{\tau(Y)})^\infty\big), $$
$$ (XDYC,\ YCXD) = Z, $$

and its boundary lines are $DYC^{-\tau(Y)}(XD^{-\tau(X)}YC^{\tau(Y)})^\infty$, $C(XD^{\tau(X)}YC^{-\tau(Y)})^\infty$, $YC^{-\tau(Y)}XD$ and $XD^{-\tau(X)}YC$. It is obvious that

$$ DL^\infty \prec DYC^{-\tau(Y)}(XD^{-\tau(X)}YC^{\tau(Y)})^\infty, \qquad C(XD^{\tau(X)}YC^{-\tau(Y)})^\infty \prec CR^\infty, $$
$$ YC^{-\tau(Y)}XD \prec R^\infty, \qquad L^\infty \prec XD^{-\tau(X)}YC. $$

Consequently Z ∗ B ⊂ B; that is, the star maps $Z \bar{*}$ are compressed in the sense of topological order and also in the sense of the parameter metric, owing to the equivalence between sequence pairs in the kneading plane and points in the parameter plane, and they map all admissible sequence pairs of B onto $Z \bar{*} B$.

(b) If Z ∈ H+, then Theorem 5.2 immediately gives $h(Z \bar{*} A) = h(Z) = h(Z^T \tilde{*} A) = h(Z^T)$; it follows directly that the regions $Z \bar{*} B \subset B$ are plateaus of equal topological entropy h(Z).

(c) From the duality of the up-star and down-star products, $(Z \bar{*} A)^T = (Z)^T \tilde{*} (A)^T$, the two contracting bounded regions Z ∗ B and $Z^T$ ∗ B ⊂ B are exchanged with each other under the parity preserving transformation T. Thus Z ∗ B and $Z^T$ ∗ B ⊂ B are symmetric with respect to the central line [21] running from (R∞, L∞) to the unit sequence pair (D = C). Incidentally, the sequence pairs located on this central line are anti-symmetric, that is, they remain invariant under the parity preserving transformation T.

Theorem 5.3 reveals that the equal topological entropy class is a contraction formed by the star map Z∗, which compresses all phenomena occurring in the whole kneading plane. For instance, the boundary $H_0^\infty = \lim_{n\to\infty} H_0^n$ of the topological chaos is compressed onto the plateau of topological entropy h(Z) as $Z * H_0^\infty$, which now is not a


global boundary topological chaos, but is still a local fractal curve. Surely, the star maps Z ∗¯ are one-to-one correspondence, and A ∈ B, Card{B} ∼ ℵ1 , therefore there exist infinitely many sequence pairs with cardinality ℵ1 in every plateau. Noting Z ∈ K0 and Card{K0 } ∼ limp→∞ (3p−1 − 1)/p ∼ 3ℵ0 ∼ ℵ1 under the continuum hypothesis, so there exist infinitely many plateaus with cardinality ℵ1 in the entire kneading plane. An infinite number of equal topological entropy plateaus are infinitely embedded into the entire kneading plane and construct a very rich picture: multifractal with a positive measure (Sierpinski carpet). This is the complete and deep implication of the Milnor–Thurston conjecture which has been expounded above. So far for the continuous interval maps the generalized Milnor–Thurston conjecture has been answered and equal topological entropy class of the bimodal maps has been established. Although a number of important problems of mathematical physics connected closely with the bimodal map will be the continuous interval maps with one or more than one discontinuous point, it is expected that Milnor–Thurston equal topological entropy class will also be able to provide the theoretical frame to the symbolic systems of three letters with periodic [23] and chaotic orbits occurring on first return maps of, for instance, the well-known Lorenz equation [24-26]. The circle maps [27] related to the Hamiltonian systems belong also to the subsystem of the interval maps of three letters. The application of the equal topological entropy class to these problems is important in physics. Feigenbaum’s metric universality [19,20] is confined within a step of equal topological entropy [7–9] in dynamical systems of two letters. 
For the dynamical systems of three letters, a plateau of equal topological entropy contains two types of star products, and the duality of star products may lead to the establishment of dual equation systems for Feigenbaum’s renormalization group [28,29], hence to the duality or vectorization of the appropriate universal scaling laws. Since all the equal topological entropy classes can be separated from the set of all kneading sequences, the representative of each equal topological entropy class can be labelled by primitive symbolic sequences. The relation among primitive sequences would be addition law and for basic primitive sequences and their addition sequences, they must be different in the topological entropy. The non-equal topological entropy sequences would bring a new light to dealing with the well-known Riemann-Zeta function of maps [30,31] and cycle expansion of periodic orbits [32]. For kneading theory, this work will provide a clue by which the equal topological entropy classes can be generalized to the systems more complex than systems of three letters, in particular, to the systems of four, five and N letters in one-dimensional maps. Of course, the generalization will be a complicated but accessible problem in future studies. Note added in proof. The first statement of the Milnor–Thurston conjecture, the monotonicity of topological entropy, has been generalized to and proved for real cubic maps by Milnor and Tresser [Milnor, J., Tresser, C.: On entropy and monotonicity for real cubic maps, Commun. Math. Phys. 209, 123–178 (2000)], which is very important mathematical progress. They use the method of homotopy to study monotonicity and connectivity of entropy, and reveal that the isentrope (i.e., the topological entropy level set) is a deformation retract. Note that the dual star map Z∗ (∗ ∈ {∗, ∗}) is the 1 − 1 correspondence, namely, Z ∗ B and B are homeomorphic, so the concept of ETEC is parallel to that of isentrope. 
Since complicated fractals are contained in the ETEC or isentrope, both methods of homeomorphism and homotopy would be necessary [Cao,


K.-F., Zhang, X.-S., Zhou, Z., Peng, S.-L.: Devil’s carpet of topological entropy with positive measure, Preprint YNU-CNLCS-2000-06-01]. Acknowledgements. This work was supported in part by the Nonlinear Science Project of the National Key Projects Program of Basic Research of China (the Climbing Program), the National Natural Science Foundation of China, and the Applied and Basic Research Foundation of Yunnan Province.

References 1. Milnor, J., Thurston, W.: On iterated maps of the interval. I, II. Preprint Institute for Advanced Studies, Princeton (1977); Published in: Alexander, J.C. (ed.) Dynamical systems, Lect. Notes in Math. Vol. 1342, Berlin: Springer-Verlag, 1988 pp. 465–563 2. Derrida, B., Gervois, A., Pomeau, Y.: Iteration of endomorphisms on the real axis and representation of numbers. Ann. Inst. Henri Poincaré A 29, 305–356 (1978) 3. Metropolis, N., Stein, M.L., Stein, P.R.: On finite limit sets for transformations on the unit interval. J. Comb. Theory A 15, 25–44 (1973) 4. Collet, P., Eckmann, J.-P.: Iterated maps on the interval as dynamical systems. Boston: Birkhäuser, 1980 5. Hao, B.-L.: Elementary symbolic dynamics and chaos in dissipative systems. Singapore: World Scientific, 1989 6. Collet, P., Crutchfield, J.P., Eckmann, J.-P.: Computing the topological entropy of maps. Commun. Math. Phys. 88, 257–262 (1983) 7. Peng, S.-L., Cao, K.-F., Chen, Z-X: Devil’s staircase of topological entropy and global regularity. Phys. Lett. A 193, 437–443 (1994); 196, 378 (1995) 8. Cao, K.-F., Chen, Z.-X., Peng, S.-L.: Global metric regularity of the devil’s staircase of topological entropy. Phys. Rev. E 51, 1989–1995 (1995) 9. Chen, Z.-X., Cao, K.-F., Peng, S.-L.: Symbolic dynamics analysis of topological entropy and its multifractal structure. Phys. Rev. E 51, 1983–1988 (1995) 10. Peng, S.-L., Cao, K.-F.: Global scaling behaviors and chaotic measure characterized by the convergent rates of period-p-tupling bifurcations. Phys. Rev. E 51, 3211–3220 (1996) 11. Peng, S.-L., Zhang, X.-S.: The second topological conjugate transformation in symbolic dynamics. Phys. Rev. E 57, 5311–5324 (1998) 12. Peng, S.-L., Luo, L.-S.: The ordering of critical periodic points in coordinate space by symbolic dynamics. Phys. Lett. A 153, 345–352 (1991) 13. Peng, S.-L., Zhang, X.-S.: Dual star products in the symbolic dynamics of three letters for endomorphisms on the interval. 
Preprint YNU-CNLCS-1997-05 (1997) 14. MacKay, R.S., Tresser, C.: Some flesh on skeleton: The bifurcation structure of bimodal maps. Physica D 27, 412–422 (1987) 15. MacKay, R.S., Tresser, C.: Boundary of topological chaos for bimodal maps of the interval. J. London Math. Soc. (2) 37, 164–181 (1988) 16. Llibre, J., Mumbrú, P.: Extending the ∗-product operator. In: Mira, Ch., Neter, N., Simó, C., Targonsky, Gy. (eds.) Proc. of the European Conference on Iteration Theory – 1989, Singapore: World Scientific, 1991 pp. 199–214 17. Ringland, J., Tresser, C.: A genealogy for finite kneading sequences of bimodal maps on the interval. Trans. Am. Math. Soc. 347, 4599–4624 (1995) 18. Liu, H.-Z., Zhou, Z., Peng, S.-L.: Star transformations and their genealogical varieties in symbolic dynamics of four letters. J. Phys. A: Math. Gen. 31, 8431–8450 (1998) 19. Feigenbaum, M.J.: Quantitative universality for a class of nonlinear transformations. J. Stat. Phys. 19, 25–52 (1978) 20. Feigenbaum, M.J.: The universal metric properties of nonlinear transformations. J. Stat. Phys. 21, 669–706 (1979) 21. Peng, S.-L.: Inverse itinerary sequences and Hausdorff dimensions of disconnected Julia set. Sci. Sin. A 31, 938–949 (1988) 22. Sarkovskii, A.N.: Coexistence of cycles of a continuous map of a line into itself. Ukrainian Math. J. 16, 61–71 (1964) 23. Ding, M.-Z., Hao, B.-L.: Systematics of the periodic windows in the Lorenz model and its relation with the antisymmetric cubic map. Commun. Theor. Phys. 9, 375 (1988) 24. Lorenz, E.N.: Deterministic nonperiodic flow. J. Atmos. Sci. 20, 130–141 (1963) 25. Hubbard, J.H., Sparrow, C.T.: The classification of topologically expanding Lorenz maps. New York: Cambridge 1988


26. Fang, H.-P., Hao, B.-L.: Symbolic dynamics of the Lorenz equations. Chaos, Solitons & Fractals 7, 217–246 (1996) 27. Bernhardt, C.: Rotation intervals of a class of endomorphisms of the circle. Proc. London Math. Soc. (3) 45, 258–280 (1982) 28. Campanino, M., Epstein, H.: On the existence of Feigenbaum’s fixed point. Commun. Math. Phys. 79, 261–302 (1981) 29. Campanino, M., Epstein, H., Ruelle, D.: On Feigenbaum’s functional equation. Topology 21, 125–129 (1981) 30. Ruelle, D.: Dynamical zeta functions for piecewise monotone maps of the interval. Providence, RI: American Mathematical Society, 1994 31. Ruelle, D.: Thermodynamic Formalism. In: Rota, G.-C. (ed.) Encyclopedia of Mathematics and Its Applications Vol. 5. Reading, MA: Addison-Wesley, 1978 32. Cvitanović, P.: Invariant measurement of strange sets in terms of cycles. Phys. Rev. Lett. 61, 2729–2732 (1988)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 213, 413 – 432 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Semiclassical Resolvent Estimates for Trapping Perturbations

Vincent Bruneau, Vesselin Petkov

Département de Mathématiques Appliquées, Université Bordeaux I, 351, Cours de la Libération, 33405 Talence, France

Received: 15 November 1999 / Accepted: 1 March 2000

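Abstract: We study the semiclassical estimates of the resolvent $R(\lambda + i\tau)$, $\lambda \in J \subset\subset \mathbb{R}^+$, $\tau \in ]0, 1]$, of a self-adjoint operator $L(h)$ in the space of bounded operators $\mathcal{L}(\mathcal{H}^{0,s}, \mathcal{H}^{0,-s})$, $s > 1/2$. In the general case of long-range trapping “black-box” perturbations we prove that the estimate of the cut-off resolvent $\|\chi(x) R(\lambda + i0) \chi(x)\|_{\mathcal{H} \to \mathcal{H}} \le C \exp(C h^{-p})$, $\chi(x) \in C_0^\infty(\mathbb{R}^n)$, $p \ge 1$, implies the estimate $\|R(\lambda + i\tau)\|_{s,-s} \le C_1 \exp(C_1 h^{-p})$.

1. Introduction

The purpose of this paper is to obtain semiclassical estimates for the resolvent $R(z) = (L(h) - z)^{-1}$, $z \in \mathbb{C} \setminus \mathbb{R}$, of a self-adjoint operator $L = L(h)$ depending on $h \in ]0, h_0]$ for $z \in B_\pm = \{z = \lambda + i\tau \in \mathbb{C} : \lambda \in J,\ \pm\tau \in ]0, 1]\}$, $J = ]\mu_0, \mu_1[ \subset\subset \mathbb{R}^+$. The operator L is defined in a domain $\mathcal{D} \subset \mathcal{H}$ of a complex Hilbert space $\mathcal{H}$ with an orthogonal decomposition

$$ \mathcal{H} = \mathcal{H}_{R_0} \oplus L^2(\mathbb{R}^n \setminus B(0, R_0)), \qquad B(0, R_0) = \{x \in \mathbb{R}^n : |x| \le R_0\}, \qquad n \ge 2, $$

and L satisfies the long-range “black box” assumptions (1.7)-(1.15) described below. Introduce the spaces $\mathcal{H}^{0,s} = \mathcal{H}_{R_0} \oplus L^2(\mathbb{R}^n \setminus B(0, R_0), \langle x \rangle^s dx)$, $\langle x \rangle = (1 + |x|^2)^{1/2}$. First we show that for $s > 1/2$ and fixed $h > 0$ the limits

$$ R(\lambda \pm i0) = \lim_{\epsilon \to 0,\ \epsilon > 0} R(\lambda \pm i\epsilon), \qquad \lambda \in J, \tag{1.1} $$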


exist in the space of bounded operators $\mathcal{L}(\mathcal{H}^{0,s}, \mathcal{H}^{0,-s})$, where here and below we omit in R(z) the dependence on h. Next we obtain semiclassical resolvent estimates

$$ \|R(\lambda \pm i\tau)\|_{s,-s} \le C r(h), \qquad \lambda \in J, \quad \tau \in ]0, 1], \quad h \in ]0, h_0], \tag{1.2} $$

where $\|\cdot\|_{s,-s}$ denotes the norm in $\mathcal{L}(\mathcal{H}^{0,s}, \mathcal{H}^{0,-s})$. Such estimates have been obtained by many authors in great generality in the case when every $\lambda \in J$ is a non-trapping energy level for the principal symbol $l_0(x, \xi)$ of a differential operator L (see [19, 6, 20, 9, 26, 16, 17, 25] and the survey article [18] for other references). The proofs of these estimates are essentially based on the Mourre method [13, 4, 14] and the main point is to construct a conjugate operator A(h) for which the Mourre inequality (5.2) holds (see Appendix 5.1). Roughly speaking, we call these operators non-trapping perturbations and in this case we have $r(h) = h^{-1}$ in (1.2). Moreover, for such perturbations there are no resonances converging sufficiently fast to the real axis as $h \to 0$.

The case of trapping energy levels is more complicated. First, without the non-trapping assumption, in general, a Mourre type inequality is not true uniformly with respect to $h \in ]0, h_0]$. Secondly, resonances converging exponentially fast to the real axis could exist and, consequently, we must have $r(h) = e^{Ch^{-p}}$ with some $p \ge 1$. In the special case of a Schrödinger operator $-h^2\Delta + V(x)$, with potential V(x) having the form of a “well in an island”, the results in [7, 8] imply (1.2) with $r(h) = e^{Ch^{-1}}$. In our principal result we assume the following estimate:

$$ \|\chi(x) R(\lambda + i0) \chi(x)\|_{\mathcal{H} \to \mathcal{H}} \le C \exp(C h^{-p}), \qquad \lambda \in J, \quad 0 < h \le h_0, \tag{1.3} $$

with $p \ge 1$ and some constants $C > 0$, $h_0 > 0$, provided $\chi(x) \in C_0^\infty(\{x : |x| \le \rho_1\})$ is such that $\chi(x) = 1$ for $|x| \le \rho_0$, $R_0 < \rho_0 < \rho_1$. Here the constant $\rho_0 = \rho_0(J)$ depends on the interval J and the choice of $\rho_0$ is related to the properties of Hamiltonian trajectories examined in Sect. 2. Recently, for long-range trapping perturbations in the exterior of a connected obstacle Burq [3] established the estimate

$$ \|(L_\theta(h) - \lambda)^{-1}\|_{L^2(\mathcal{H}_\theta)} \le C e^{C_1 h^{-p}}, \qquad \lambda \in J, \tag{1.4} $$

with p = 1 for the complex scaling operator Lθ (h) defined in a dense domain in Hθ = HR0 ⊕ L2 (#θ \ B(0, R0 )), where #θ is obtained by modifying Rn outside the ball {x : |x| ≥ ρ1 } containing the support of χ (see [23, 21] for a precise definition and Sect. 3). On the other hand, the estimate (1.4) implies immediately (1.3) since we have χ (x)R(λ + i0)χ (x) = χ (x)(Lθ (h) − λ)−1 χ (x) (see Lemma 3.5 in [23]). To treat another case when (1.3) holds, denote by Res L(h) the set of resonances of L(h) and assume that there exist > 0, c > 0 and q ≥ 1 so that dist {Res L(h), [µ0 , µ1 ]} ≥ exp(−ch−q ), 0 < h ≤ h0 . Then an application of Lemma 1 in [24] with g(h) = exp(− 2c h−q ) yields (1.4) with p ≥ q. Thus we have several sufficient conditions leading to (1.3). In this work we show that for a fixed interval J ⊂⊂ R+ , taking χ ∈ C0∞ ({x : |x| ≤ ρ1 }) as above, the estimate (1.3) of the cut-off resolvent χ (x)R(λ+i0)χ (x)H→H , λ ∈ J , implies an estimate of R(λ+iτ )s,−s with the same order p in the bound exp(Ch−p ). Thus, for long-range perturbations in Rn , studied in [3], we prove the conjecture of Robert [17] who claimed that (1.2) with r(h) = C exp(Ch−p ) holds for general trapping perturbations. Under this assumption, Robert [17] obtained a Weyl type asymptotic for the scattering phase related to two h-admissible pseudodifferential operators L1 (h), L2 (h). Our main result is the following.


Theorem 1. Let L(h) be a self-adjoint operator satisfying the “black box” assumptions (1.7)-(1.15). Then for fixed $s > 1/2$ and for fixed sufficiently small h the limits

$$ R(\lambda \pm i0) = \lim_{\epsilon \to 0,\ \epsilon > 0} R(\lambda \pm i\epsilon) $$

exist in $\mathcal{L}(\mathcal{H}^{0,s}, \mathcal{H}^{0,-s})$ uniformly with respect to $\lambda \in J = ]\mu_0, \mu_1[$. Let $\rho_0 > R_0$ be a constant, depending on J, such that all bounded trajectories of the Hamilton field $H_{l_0}$ are included in $\{x : |x| \le \rho_0/2\}$. Assume that the cut-off resolvent $\chi(x) R(\lambda \pm i0) \chi(x)$ with $\chi(x) \in C_0^\infty(\{x : |x| \le \rho_1\})$, $\chi = 1$ for $|x| \le \rho_0 < \rho_1$, satisfies the estimate

$$ \|\chi(x) R(\lambda \pm i0) \chi(x)\|_{\mathcal{H} \to \mathcal{H}} \le C \exp(C h^{-p}), \qquad \lambda \in ]\mu_0 - \delta, \mu_1 + \delta[, \quad 0 < h \le h_0, \tag{1.5} $$

with $\delta > 0$ and $p \ge 1$. Then for each $s > \frac{1}{2}$ there exist $C_s > 0$, $h_1 > 0$ such that for $(\lambda, \tau) \in J \times ]0, 1]$ we have

$$ \|R(\lambda \pm i\tau)\|_{s,-s} \le C_s \exp(C_s h^{-p}), \qquad 0 < h \le h_1. \tag{1.6} $$

Remark 1. It is important to note that under our assumptions for every fixed interval $J \subset\subset \mathbb{R}^+$ there exists a constant $\rho_0 > R_0$ with the properties mentioned in Theorem 1 (see Sect. 2). The precise choice of $\rho_0$ and $\rho_1$ will be given in Proposition 3.

The idea of our proof is to construct a self-adjoint operator $\tilde{L} = \tilde{L}(h)$ on $L^2(\mathbb{R}^n)$ such that each $\lambda \in J$ is a non-trapping energy level for $\tilde{L}$ and $\tilde{L}(h)\psi = L(h)\psi$ for any $\psi \in C_0^\infty(\mathbb{R}^n)$ supported away from a ball $B(0, \rho_1)$. For this purpose we show that the bounded trajectories related to the symbol of L(h) and to the energy levels $\lambda \in J$ are included in a compact set $K(J) \subset \mathbb{R}^{2n}$ depending on J. Next we represent R(z) by a sum of terms involving the resolvent $\tilde{R}(z)$ of $\tilde{L}$ (see Sect. 3), and an application of the semiclassical estimates for non-trapping energy levels reduces the problem to the estimation of $\|\chi(x) R(z) \chi(x)\|_{\mathcal{H} \to \mathcal{H}}$ (see Proposition 3).

In Sect. 2 our approach is motivated by the proof of Lemma 1.2 in [8] concerning the operator $L(h) = -h^2\Delta + V(x)$, where the potential V(x) involves a “well in an island”. Having an “island” around the trapped trajectories, Gérard, Martinez and Robert have introduced a perturbation $W_0(x) \in C_0^\infty(\mathbb{R}^n)$ such that the perturbed symbol $|\xi|^2 + V(x) + W_0(x)$ becomes a non-trapping Hamiltonian for energy levels $\lambda \in J$. In the general case we prove a similar result by constructing a perturbation $\tilde{l}(x, \xi)$ of $l_0(x, \xi)$ (see Proposition 2) so that all $\lambda \in J$ are non-trapping energy levels for $\tilde{l}(x, \xi)$. This implies immediately the existence of an operator $\tilde{L}(h)$ with the properties mentioned above. Finally, let us mention that the estimate (1.6) combined with the argument of Robert [17] leads to a Weyl asymptotic of the scattering phase for general long-range perturbations in $\mathbb{R}^n$ when $\mathcal{H} = L^2(\mathbb{R}^n)$.

Now we will recall the abstract “black box” assumptions given in [23, 21] and [22]. Suppose that $\mathcal{D}$ satisfies

$$ \mathbf{1}_{\mathbb{R}^n \setminus B(0, R_0)} \mathcal{D} = H^2(\mathbb{R}^n \setminus B(0, R_0)), \tag{1.7} $$

uniformly with respect to h in the sense of [21]. More precisely, equip $H^2(\mathbb{R}^n \setminus B(0, R_0))$ with the norm $\|\langle hD \rangle^2 u\|_{L^2}$, $\langle hD \rangle^2 = 1 + (hD)^2$, and equip $\mathcal{D}$ with the norm $\|(L + i)u\|_{\mathcal{H}}$. Then we require that $\mathbf{1}_{\mathbb{R}^n \setminus B(0, R_0)} : \mathcal{D} \to H^2(\mathbb{R}^n \setminus B(0, R_0))$ is uniformly bounded with respect to h and that this map has a uniformly bounded right inverse.


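Assume that

$$ \mathbf{1}_{B(0, R_0)} (L + i)^{-1} \ \text{is compact} \tag{1.8} $$

and

$$ (Lu)|_{\mathbb{R}^n \setminus B(0, R_0)} = Q\big(u|_{\mathbb{R}^n \setminus B(0, R_0)}\big), \tag{1.9} $$

where Q is a formally self-adjoint differential operator

$$ Qu = \sum_{|\alpha| \le 2} a_\alpha(x; h) (hD_x)^\alpha u, \tag{1.10} $$

with $a_\alpha(x; h) = a_\alpha(x)$ independent of h for $|\alpha| = 2$ and $a_\alpha(x; h) \in C_b^\infty(\mathbb{R}^n)$ uniformly bounded with respect to h. Next we assume the ellipticity and a long-range decay of the symbol $\sum_{|\alpha| \le 2} a_\alpha(x; h) \xi^\alpha$. There exists $C > 0$ such that

$$ l_0(x, \xi) = \sum_{|\alpha| = 2} a_\alpha(x) \xi^\alpha \ge C |\xi|^2, \qquad \langle \xi \rangle = (1 + |\xi|^2)^{1/2}. \tag{1.11} $$

There exists $\gamma > 0$ such that for every $(\alpha, \beta) \in \mathbb{N}^n \times \mathbb{N}^n$, we have

$$ \Big| \partial_x^\alpha \partial_\xi^\beta \Big( \sum_{|\nu| \le 2} a_\nu(x; h) \xi^\nu - |\xi|^2 \Big) \Big| \le C_{\alpha,\beta} \langle x \rangle^{-\gamma - |\alpha|} \langle \xi \rangle^2 \tag{1.12} $$

uniformly with respect to h. To study the analytic properties of the cut-off resolvent related to the resonances near J, we need some technical assumptions which we will use only in Proposition 3 and Lemma 3. There exist $\theta_0 \in ]0, \pi[$, $\epsilon > 0$ and $R_1 > R_0$ so that the coefficients $a_\alpha(x; h)$ of Q can be extended holomorphically in x to

$$ \{ r\omega;\ \omega \in \mathbb{C}^n,\ \mathrm{dist}(\omega, S^{n-1}) < \epsilon,\ r \in \mathbb{C},\ |r| > R_1,\ \arg r \in [-\epsilon, \theta_0 + \epsilon) \} \tag{1.13} $$

and (1.12) extends to this larger set. Let $R > R_0$, $T = (\mathbb{R} / \tilde{R}\mathbb{Z})^n$, $\tilde{R} > 2R$. Set

$$ \mathcal{H}^\# = \mathcal{H}_{R_0} \oplus L^2(T \setminus B(0, R_0)) $$

and consider a differential operator

$$ Q^\# u = \sum_{|\alpha| \le 2} a_\alpha^\#(x; h) (hD)^\alpha u $$

on T with $a_\alpha^\#(x; h) = a_\alpha(x; h)$ for $|x| < R$ satisfying (1.9), (1.10), (1.11) with $\mathbb{R}^n$ replaced by T. Consider a self-adjoint operator

$$ L^\# u = L(\varphi u) + Q^\#((1 - \varphi)u), \qquad u \in \mathcal{D}^\#, $$

with domain

$$ \mathcal{D}^\# = \{ u \in \mathcal{H}^\# : \varphi u \in \mathcal{D},\ (1 - \varphi)u \in H^2 \}, $$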


where $\varphi \in C_0^\infty(B(0, R); [0, 1])$ is equal to 1 near $B(0, R_0)$. Denote by $N(L^\#, [-\lambda, \lambda])$ the number of eigenvalues of $L^\#$ in the interval $[-\lambda, \lambda]$ and assume that

$$ N(L^\#, [-\lambda, \lambda]) = O\Big( \Big( \frac{\lambda}{h^2} \Big)^{n^\#/2} \Big), \qquad n^\# \ge n, \quad \lambda \ge 1. \tag{1.14} $$

Finally, we suppose that

$$ \sigma_{pp}(L(h)) \cap [\mu_0 - \delta, \mu_1 + \delta] = \emptyset, \qquad h \in ]0, h_0], \tag{1.15} $$

where $\delta > 0$ is the constant in (1.5). A typical example for an operator satisfying the conditions (1.7)-(1.15) is an elliptic self-adjoint Schrödinger operator

$$ P = \sum_{|\alpha| \le 2} a_\alpha(x; h) (hD_x)^\alpha $$

with $C^\infty$ coefficients in the exterior $\Omega = \mathbb{R}^n \setminus O$ of a bounded obstacle O with $C^\infty$ boundary $\partial\Omega$ with Dirichlet condition on $\partial\Omega$ (see [3]). In this case $\mathcal{D} = H^2(\mathbb{R}^n \setminus O) \cap H_0^1(\mathbb{R}^n \setminus O)$ and P is a positive operator.

Under the above assumptions the resonances close to the real axis can be defined by the method of complex scaling [23, 21] and they coincide with the poles of the meromorphic continuation of the resolvent $(L(h) - z)^{-1} : \mathcal{H}_{comp} \to \mathcal{D}_{loc}$ from $\mathrm{Im}\, z > 0$ to a conic neighborhood of the positive real axis in the lower half plane. The set of resonances will be denoted by Res L(h). Notice that the assumption (1.13) is necessary for the introduction of the complex scaling operator $L_\theta(h)$, while the condition (1.14) is exploited for the estimation of the function counting the resonances in a compact set (see [21, 23]). These assumptions play an important role in the proof of Lemma 3 in Sect. 3. Finally, (1.15) guarantees that we have no resonances lying in the interval $]\mu_0 - \delta, \mu_1 + \delta[$.

The paper is organized as follows. Section 2 is devoted to the analysis of the non-trapping energy levels and to the construction of the symbol $\tilde{l}(x, \xi)$. In Sect. 3 we prove Theorem 1 introducing the operator $\tilde{L}(h)$ and exploiting the semiclassical estimates for non-trapping energy levels. In Sect. 4 we obtain some microlocal estimates closely related to the bounds of $\|R(z)\|_{s,-s}$. In another work [2] we apply the estimates given in Proposition 5 for the analysis of the semiclassical asymptotic of the spectral shift function related to trapping “black box” perturbations.

2. Classical Trajectories and Non-Trapping Energy Levels

Let $l(x, \xi) \in C^\infty(\mathbb{R}^n \times \mathbb{R}^n; \mathbb{R})$ be a classical Hamiltonian and let $H_l = (\partial_\xi l, -\partial_x l)$ be the associated Hamilton vector field

$$ H_l(f)(x, \xi) = \{l, f\}(x, \xi) = (\partial_\xi l \cdot \partial_x f - \partial_x l \cdot \partial_\xi f)(x, \xi), \qquad f \in C^\infty(\mathbb{R}^n \times \mathbb{R}^n). $$

Denote by $\Phi_l^t$ the Hamilton flow of l given by $\Phi_l^t = \exp(tH_l) : (x_0, \xi_0) \mapsto (x(t, x_0, \xi_0), \xi(t, x_0, \xi_0))$. For each $\lambda \in \mathbb{R}$, the energy surface $A_\lambda = l^{-1}(\lambda)$ is stable under $\Phi_l^t$. Throughout the exposition below we set $J = ]\mu_0, \mu_1[ \subset\subset \mathbb{R}^+$. We will say that $(x, \xi)$ is a non-critical point for $l(x, \xi)$ if $\nabla_{x,\xi} l(x, \xi) \ne 0$, and that λ is a non-critical energy level for $l(x, \xi)$ if for all $(x, \xi) \in A_\lambda = l^{-1}(\lambda)$ we have $\nabla_{x,\xi} l(x, \xi) \ne 0$. We start with an analysis of the trapped trajectories.
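The definition of the Hamilton field $H_l$ can be checked numerically in a simple case. The sketch below approximates the Poisson bracket $\{l, f\} = \partial_\xi l \cdot \partial_x f - \partial_x l \cdot \partial_\xi f$ by central differences and compares $\{l, |x|^2\}$ with its closed form $2x \cdot \partial_\xi l = 4x \cdot \xi$. The Hamiltonian $l(x, \xi) = |\xi|^2 + V(x)$ with $V(x) = 1/(1 + |x|^2)$ in dimension n = 2, the step size and the test point are illustrative assumptions, not objects taken from the paper.

```python
def V(x):
    """Hypothetical decaying potential V(x) = 1 / (1 + |x|^2)."""
    return 1.0 / (1.0 + sum(c * c for c in x))

def l(x, xi):
    """Hypothetical Hamiltonian l(x, xi) = |xi|^2 + V(x)."""
    return sum(c * c for c in xi) + V(x)

def gradient(func, point, eps=1e-5):
    """Central-difference gradient of func at the given point (a list)."""
    grad = []
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += eps
        minus[j] -= eps
        grad.append((func(plus) - func(minus)) / (2.0 * eps))
    return grad

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def poisson_bracket(l_func, f_func, x, xi):
    """Approximate {l, f} = d_xi l . d_x f - d_x l . d_xi f at (x, xi)."""
    dl_dx = gradient(lambda y: l_func(y, xi), x)
    dl_dxi = gradient(lambda eta: l_func(x, eta), xi)
    df_dx = gradient(lambda y: f_func(y, xi), x)
    df_dxi = gradient(lambda eta: f_func(x, eta), xi)
    return dot(dl_dxi, df_dx) - dot(dl_dx, df_dxi)

def x_squared(x, xi):
    """The observable f(x, xi) = |x|^2 (independent of xi)."""
    return sum(c * c for c in x)

x0 = [3.0, 0.5]     # illustrative test point
xi0 = [1.0, -0.25]

bracket = poisson_bracket(l, x_squared, x0, xi0)
closed_form = 4.0 * dot(x0, xi0)   # 2 x . d_xi l with d_xi l = 2 xi
print(bracket, closed_form)
```

Finite differences are used here only to keep the sketch free of external dependencies; a symbolic computation would give the same identity exactly.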


Lemma 1. Let $\mu_1 > \mu_0 > 0$ be fixed. Assume there exist $\mu \in ]0, \mu_0[$, $R > 0$ such that

$$ |x| \ge R,\ (x, \xi) \in l^{-1}(]\mu_0, \mu_1[) \ \Rightarrow\ \Big| \frac{1}{4} \{l, x \cdot \partial_\xi l\}(x, \xi) - l(x, \xi) \Big| \le \mu. \tag{2.1} $$

Then for each $\lambda \in J$ and each $(x_0, \xi_0) \in A_\lambda$ with $|x_0| > R$ we have $\sup_{t \in \mathbb{R}} |x(t, x_0, \xi_0)| = +\infty$.

Proof. To simplify the notation we will omit $(x_0, \xi_0)$ in $x(t, x_0, \xi_0)$, $\xi(t, x_0, \xi_0)$ if the dependence on $(x_0, \xi_0)$ is not important. It is clear that the function $|x(t)|^2$ satisfies the equations

$$ (|x|^2)' = \{l, |x|^2\}(x, \xi) = 2x \cdot \partial_\xi l(x, \xi), \qquad (x \cdot \partial_\xi l(x, \xi))' = \{l, x \cdot \partial_\xi l\}(x, \xi). \tag{2.2} $$

Obviously, for $(x_0, \xi_0) \in A_\lambda$ we have $(x(t), \xi(t)) \in A_\lambda$ and this yields

$$ (x \cdot \partial_\xi l(x, \xi))' = 4\lambda + \{l, x \cdot \partial_\xi l\}(x, \xi) - 4l(x, \xi). \tag{2.3} $$

Choose $|x_0| > R$ with $(x_0, \xi_0) \in l^{-1}(]\mu_0, \mu_1[)$ and denote by $[T_m, T_M] \subset \mathbb{R}$ the maximal interval containing $t_0 = 0$ so that

$$ \big| \big( \{l, x \cdot \partial_\xi l\} - 4l \big)(x(t), \xi(t)) \big| \le 4\mu, \qquad t \in [T_m, T_M]. \tag{2.4} $$

The continuity of the function x(t) implies the existence of an interval $[T_m, T_M] \ne \{0\}$. Using the relation (2.3), we deduce

$$ (x \cdot \partial_\xi l(x, \xi))'(t) \ge 4(\lambda - \mu) \ge 4(\mu_0 - \mu), \qquad \forall t \in [T_m, T_M], $$

that is

$$ (x \cdot \partial_\xi l(x, \xi))(t) \ge x_0 \cdot \partial_\xi l(x_0, \xi_0) + 4(\mu_0 - \mu)t, \qquad \forall t \in [0, T_M], $$
$$ (x \cdot \partial_\xi l(x, \xi))(t) \le x_0 \cdot \partial_\xi l(x_0, \xi_0) + 4(\mu_0 - \mu)t, \qquad \forall t \in [T_m, 0]. $$

Then applying (2.2), we get

$$ |x(t)|^2 \ge |x_0|^2 + 2x_0 \cdot \partial_\xi l(x_0, \xi_0)\, t + 4(\mu_0 - \mu)t^2, \qquad \forall t \in [T_m, T_M]. $$

Consequently, if $\pm x_0 \cdot \partial_\xi l(x_0, \xi_0) \ge 0$ for $\pm t \ge 0$, we have

$$ |x(t)|^2 \ge |x_0|^2 + 4(\mu_0 - \mu)t^2, \tag{2.5} $$

which implies immediately $T_M = +\infty$ (if $x_0 \cdot \partial_\xi l(x_0, \xi_0) \ge 0$) or $T_m = -\infty$ (if $x_0 \cdot \partial_\xi l(x_0, \xi_0) \le 0$) and (2.5) holds for any $t \ge 0$ (any $t \le 0$). Thus, for $|x_0| > R$ and $l(x_0, \xi_0) \in ]\mu_0, \mu_1[$ we obtain $\sup_{t \in \mathbb{R}} |x(t, x_0, \xi_0)| = +\infty$ and the proof is complete. □

The argument of the proof above shows that the trajectories of $\Phi_l^t$ passing through a point $(x_0, \xi_0) \in A_\lambda$ with $|x_0| > R$ go to infinity if the product $x_0 \cdot \partial_\xi l(x_0, \xi_0)$ has a suitable sign. This phenomenon is well known for symbols $|\xi|^2 + V(x)$, when V(x) is a short-range perturbation (see for instance [12]). The assumption (2.1) is usually known as the virial condition. In the particular case when $l(x, \xi) = |\xi|^2 + V(x)$, this condition means that $|x \cdot \nabla V(x) + 2V(x)| \le 2\mu$. We will say that a trajectory $(x(t, x_0, \xi_0), \xi(t, x_0, \xi_0))$ of the flow $\Phi_l^t$ is bounded if there exists $A > 0$ such that $|x(t, x_0, \xi_0)| \le A$, $\forall t \in \mathbb{R}$. In the next proposition we show that under the long-range assumption all bounded trajectories on $A_\lambda$, $\lambda \in J$, as well as all critical points of $l(x, \xi)$ are included in a fixed compact set $K(J) \subset \mathbb{R}^{2n}$ depending on the interval J.
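The escape estimate (2.5) can be observed numerically. The sketch below integrates Hamilton's equations $x' = 2\xi$, $\xi' = -\nabla V(x)$ for the hypothetical Hamiltonian $l = |\xi|^2 + V(x)$ with $V(x) = 1/(1 + |x|^2)$ in dimension 2, using a leapfrog scheme. For this potential $x \cdot \nabla V + 2V = 2/(1 + |x|^2)^2$, so the virial condition holds with $\mu = 0.04$ for $|x| \ge R = 2$. The interval $J = ]1, 2[$, the initial data, the time step and the integrator are all illustrative assumptions rather than choices made in the paper.

```python
def grad_V(x):
    """Gradient of the hypothetical potential V(x) = 1 / (1 + |x|^2)."""
    r2 = sum(c * c for c in x)
    return [-2.0 * c / (1.0 + r2) ** 2 for c in x]

def flow(x, xi, dt, steps):
    """Leapfrog integration of x' = 2*xi, xi' = -grad V(x); yields (t, x, xi)."""
    x, xi = list(x), list(xi)
    for n in range(1, steps + 1):
        g = grad_V(x)
        xi = [p - 0.5 * dt * gj for p, gj in zip(xi, g)]   # half kick
        x = [q + dt * 2.0 * p for q, p in zip(x, xi)]      # drift
        g = grad_V(x)
        xi = [p - 0.5 * dt * gj for p, gj in zip(xi, g)]   # half kick
        yield n * dt, list(x), list(xi)

mu0, mu = 1.0, 0.04          # J = ]1, 2[ and the virial constant (assumed)
x0, xi0 = [3.0, 0.0], [1.0, 0.5]

# Energy |xi0|^2 + V(x0) = 1.25 + 0.1 = 1.35 lies in J, and x0 . d_xi l = 6 >= 0.
energy = sum(c * c for c in xi0) + 1.0 / (1.0 + sum(c * c for c in x0))
x0_sq = sum(c * c for c in x0)

# Check the lower bound (2.5) along the forward trajectory.
bound_holds = all(
    sum(c * c for c in x) >= x0_sq + 4.0 * (mu0 - mu) * t * t
    for t, x, _ in flow(x0, xi0, dt=1e-3, steps=5000)
)
print(energy, bound_holds)
```

Because the initial data are outgoing and the potential is repulsive, the trajectory never re-enters the ball $|x| < R$, which is the situation covered by the lemma.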

Proposition 1. Let $\mu_1 > \mu_0 > 0$. Assume there exists $\gamma > 0$ such that for every $(\alpha,\beta) \in \mathbb{N}^n \times \mathbb{N}^n$, $|\alpha| \le 1$, $|\beta| \le 2$, we have
$$\big|\partial_x^\alpha \partial_\xi^\beta \big(l(x,\xi) - |\xi|^2\big)\big| \le C_{\alpha,\beta}\, \langle x\rangle^{-\gamma-|\alpha|} \langle\xi\rangle^{2}.$$
Then there exists a compact set $K(J) \subset \mathbb{R}^{2n}$ such that for each $\lambda \in J$, the set $K(J) \cap A_\lambda$ contains all bounded trajectories of the Hamilton flow $\Phi_l^t$ on $A_\lambda$ and all critical points of $l(x,\xi)$ in $A_\lambda$.

Remark 2. In a more restrictive setup, this result was stated and exploited by the first author [1] in the analysis of the scattering phase for short-range trapping perturbations of the Laplacian.

Proof. First, observe that for $(x,\xi) \in A_\lambda$, $\lambda \in J$, we have $\langle\xi\rangle^2 \le C_{0,0}\langle x\rangle^{-\gamma}\langle\xi\rangle^2 + \mu_1 + 1$. Thus for $R_1$ large enough and $|x| > R_1$ we get
$$\lambda \in J,\ (x,\xi) \in A_\lambda,\ |x| > R_1 \;\Longrightarrow\; |\xi| \le M. \tag{2.6}$$
Next we will show that outside a fixed compact set $K(J)$ the $x$-component $x(t,x_0,\xi_0)$ of the trajectories of $\Phi_l^t$ goes to infinity in the sense that if $x(t,x_0,\xi_0)$ passes through a point $y \notin K(J)$, then $\sup_{t\in\mathbb{R}} |x(t,x_0,\xi_0)| = \infty$. As we have seen, applying Lemma 1, it is sufficient to establish the existence of $R > 0$ such that (2.1) holds with $0 < \mu < \mu_0$. We write
$$\{l, x\cdot\partial_\xi l\} = \partial_\xi l\cdot\partial_x(x\cdot\partial_\xi l) - \partial_x l\cdot\partial_\xi(x\cdot\partial_\xi l) = |\partial_\xi l|^2 + \sum_{j=1}^n \partial_{\xi_j} l\,\big(x\cdot\partial_\xi\partial_{x_j} l\big) - \partial_x l\cdot\partial_\xi(x\cdot\partial_\xi l). \tag{2.7}$$
Setting $r(x,\xi) = l(x,\xi) - |\xi|^2$, we observe that
$$|\partial_\xi l|^2 = 4|\xi|^2 + 4\xi\cdot\partial_\xi r + |\partial_\xi r|^2,$$
which implies
$$\tfrac{1}{4}|\partial_\xi l|^2 - l(x,\xi) = \xi\cdot\partial_\xi r + \tfrac{1}{4}|\partial_\xi r|^2 - r(x,\xi). \tag{2.8}$$
Moreover, since $\partial_{x_j}\partial_\xi l = \partial_{x_j}\partial_\xi r$, we deduce the following representation:
$$\tfrac{1}{4}\{l, x\cdot\partial_\xi l\} - l(x,\xi) = \xi\cdot\partial_\xi r + \tfrac{1}{4}|\partial_\xi r|^2 - r(x,\xi) + \tfrac{1}{4}\Big(\sum_{j=1}^n \partial_{\xi_j} l\,\big(x\cdot\partial_\xi\partial_{x_j} r\big) - \partial_x r\cdot\partial_\xi(x\cdot\partial_\xi l)\Big).$$
Combining this with the assumption on the derivatives of $l$, we obtain
$$\Big|\tfrac{1}{4}\{l, x\cdot\partial_\xi l\}(x,\xi) - l(x,\xi)\Big| \le C\langle x\rangle^{-\gamma}\langle\xi\rangle^{4}.$$
From this and (2.6) it is clear that we can find $R = R(\mu_0,\mu_1,\gamma) \ge R_1$ such that (2.1) holds for $|x| > R$. On the other hand, taking $R$ so that the modulus of the right-hand side of (2.8) is bounded by $\mu$, $0 < \mu < \mu_0$, we obtain $|\partial_\xi l| \ne 0$ on $l^{-1}(J) \cap \{(x,\xi) : |x| > R\}$. Consequently, each $(x,\xi) \in l^{-1}(J) \cap \{(x,\xi) : |x| > R\}$ is non-critical for $l$, and for each $\lambda \in J$ every trajectory on $A_\lambda$ with initial data $|x_0| > R$ is unbounded in the sense mentioned above. This completes the proof. □

Now we pass to the analysis of the non-trapping energy levels in the sense given in [19]. Recall that $\lambda \in \mathbb{R}$ is a non-trapping energy level for $l(x,\xi)$ if for every $R > 0$ there exists $T_R > 0$ such that for $(x_0,\xi_0) \in A_\lambda$, $|x_0| < R$, we have
$$|x(t,x_0,\xi_0)| > R, \quad \forall |t| > T_R.$$

Lemma 2. Let $\mu_1 > \mu_0 > 0$ and let $l(x,\xi) \ge c_0|\xi|^2 + c_1$, $c_0 > 0$, $\forall (x,\xi) \in \mathbb{R}^{2n}$. If there exists $\mu \in\, ]0,\mu_0[$ such that
$$(x,\xi) \in l^{-1}(]\mu_0,\mu_1[) \;\Longrightarrow\; \Big|\tfrac{1}{4}\{l, x\cdot\partial_\xi l\}(x,\xi) - l(x,\xi)\Big| \le \mu, \tag{2.9}$$
then each $\lambda \in J$ is a non-trapping energy level for $l(x,\xi)$.

Proof. Our purpose is to prove that for any $(x_0,\xi_0) \in A_\lambda$ we have $|x(t,x_0,\xi_0)| \to +\infty$ in both directions $t \to \pm\infty$. Following the proof of Lemma 1, we have the estimate
$$\big|\big(\{l, x\cdot\partial_\xi l\} - 4l\big)(x(t),\xi(t))\big| \le 4\mu, \quad \forall t \in \mathbb{R}. \tag{2.10}$$
Notice that here there are no restrictions on the sign of $t$. Then it follows easily that
$$|x(t)|^2 \ge |x_0|^2 + 2x_0\cdot\partial_\xi l(x_0,\xi_0)\,t + 4(\mu_0-\mu)t^2, \quad \forall t \in \mathbb{R}.$$
On the other hand, the lower bound on $l$ shows that the set $\{\xi_0 : (x_0,\xi_0) \in A_\lambda,\ \mu_0 \le \lambda \le \mu_1\}$ is bounded. Consequently, for $R > 0$ and $|x_0| < R$, we conclude that $x_0\cdot\partial_\xi l(x_0,\xi_0)$ remains bounded, and there exists $C_R > 0$ such that
$$|x(t)|^2 \ge 4|t|\big((\mu_0-\mu)|t| - C_R\big), \quad \forall t \in \mathbb{R}.$$
Thus, for all $|t| > 2C_R + 2\sqrt{C_R^2 + (\mu_0-\mu)R^2}$ we have $|x(t)| > R$, hence $\lambda$ is a non-trapping energy level. □

Introduce a function $\chi \in C_0^\infty(\{x \in \mathbb{R}^n : |x| \le 2\})$, $0 \le \chi \le 1$, such that $\chi(x) = 1$ for $|x| \le 1$, and set $\chi_R(x) = \chi(x/R)$. The next proposition will play a crucial role in our analysis in Sect. 3.

Proposition 2. Let $\mu_1 > \mu_0 > 0$. Assume there exists $\gamma > 0$ such that for every $(\alpha,\beta) \in \mathbb{N}^n \times \mathbb{N}^n$, $|\alpha| \le 1$, $|\beta| \le 2$, we have
$$\big|\partial_x^\alpha \partial_\xi^\beta \big(l(x,\xi) - |\xi|^2\big)\big| \le C_{\alpha,\beta}\, \langle x\rangle^{-\gamma-|\alpha|} \langle\xi\rangle^{2}.$$
Then there exists $\rho > 0$ such that for $R \ge \rho$, each $\lambda \in J$ is a non-trapping and non-critical energy level for the Hamiltonian
$$\tilde l(x,\xi) = l(x,\xi) - \chi_R(x)\big(l(x,\xi) - |\xi|^2\big).$$

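Proof. We will prove that it is possible to choose $R > 0$ large enough in order to satisfy the conditions of Lemma 2 for the symbol $\tilde l(x,\xi)$. We first observe that
$$\tilde l(x,\xi) = |\xi|^2 + \big(1 - \chi_R(x)\big)\big(l(x,\xi) - |\xi|^2\big),$$
combined with the assumptions on $l$ and the fact that $(1-\chi_R)$ has support in $\{x : |x| \ge R\}$, implies easily the estimate
$$\tilde l(x,\xi) \ge \langle\xi\rangle^2\big(1 - C_{0,0}\langle R\rangle^{-\gamma}\big) - 1.$$
Hence there exists $R_1$ such that if we take $R \ge R_1$ in the definition of $\tilde l(x,\xi)$ we have
$$\tilde l(x,\xi) \ge c_0|\xi|^2 + c_1, \quad c_0 > 0, \quad \forall (x,\xi) \in \mathbb{R}^{2n}.$$
Now, as in the proof of Proposition 1, we have
$$\{\tilde l, x\cdot\partial_\xi \tilde l\} = |\partial_\xi \tilde l|^2 + \sum_{j=1}^n \partial_{\xi_j}\tilde l\,\big(x\cdot\partial_\xi\partial_{x_j}\tilde l\big) - \partial_x \tilde l\cdot\partial_\xi(x\cdot\partial_\xi \tilde l). \tag{2.11}$$
Next, setting $r(x,\xi) = l(x,\xi) - |\xi|^2$, the derivative of $\tilde l = |\xi|^2 + (1-\chi_R)r$ satisfies
$$|\partial_\xi \tilde l|^2 = 4|\xi|^2 + 4(1-\chi_R)\,\xi\cdot\partial_\xi r + (1-\chi_R)^2|\partial_\xi r|^2,$$
and from this we see that
$$\tfrac{1}{4}|\partial_\xi \tilde l|^2 - \tilde l(x,\xi) = (1-\chi_R)\,\xi\cdot\partial_\xi r + \tfrac{1}{4}(1-\chi_R)^2|\partial_\xi r|^2 - (1-\chi_R)\,r(x,\xi). \tag{2.12}$$
On the other hand, it is clear that
$$\partial_{x_j}\tilde l = \partial_{x_j}\big((1-\chi_R)r\big) = (1-\chi_R)\,\partial_{x_j} r - \frac{1}{R}(\partial_{x_j}\chi)\Big(\frac{x}{R}\Big)\, r$$
and, consequently,
$$\sum_{j=1}^n \partial_{\xi_j}\tilde l\,\big(x\cdot\partial_\xi\partial_{x_j}\tilde l\big) - \partial_x\tilde l\cdot\partial_\xi(x\cdot\partial_\xi\tilde l) = (1-\chi_R)\Big[\sum_{j=1}^n \partial_{\xi_j}\tilde l\,\big(x\cdot\partial_\xi\partial_{x_j} r\big) - \partial_x r\cdot\partial_\xi(x\cdot\partial_\xi\tilde l)\Big] - \frac{1}{R}\sum_{j=1}^n (\partial_{x_j}\chi)\Big(\frac{x}{R}\Big)\Big[\partial_{\xi_j}\tilde l\,\big(x\cdot\partial_\xi r\big) - r\,\partial_{\xi_j}(x\cdot\partial_\xi\tilde l)\Big]. \tag{2.13}$$
The function $(1-\chi_R)$ (resp. $\partial_x\chi_R$) is supported in $\{|x| \ge R\}$ (resp. $\{2R \ge |x| \ge R\}$), and the assumptions on $r(x,\xi)$ combined with the relations (2.11), (2.12), (2.13) lead to
$$\Big|\tfrac{1}{4}\{\tilde l, x\cdot\partial_\xi\tilde l\}(x,\xi) - \tilde l(x,\xi)\Big| \le C\langle R\rangle^{-\gamma}\langle\xi\rangle^{4}.$$
Now recall that $|\xi|$ remains bounded on the energy level $A_\lambda$. Then we can find $\rho = \rho(\mu_0,\mu_1,\gamma) \ge R_1$ such that for any $R \ge \rho$ the function $\tilde l(x,\xi)$ satisfies (2.9) and each $\lambda \in J$ is a non-trapping energy level for $\tilde l(x,\xi)$. Moreover, taking $\rho$ so that the right-hand side of (2.12) has a modulus bounded by $\mu$, $0 < \mu < \mu_0$, we obtain $|\partial_\xi\tilde l| \ne 0$ on $\tilde l^{-1}(J)$ and this completes the proof. □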

Remark 3. For a more complete analysis of the qualitative aspects of classical trajectories, we refer to the recent work of Knauf [12]. The constant $\rho$ determined in Proposition 2 corresponds to the so-called virial radius. For the Hamiltonian $l(x,\xi) = \tfrac{1}{2}|\xi|^2 + V(x)$, Knauf obtained a topological condition for the existence of trapped trajectories.

3. Resolvent Estimates

We start with the following proposition reducing the proof of the resolvent estimates to that for the cut-off resolvent.

Proposition 3. Let $L(h)$ satisfy the "black box" assumptions (1.7)–(1.12) and let $J = \,]\mu_0,\mu_1[\, \subset\subset \mathbb{R}^+$. Let $\rho_0 > 2\rho > R_0$, where $\rho$ is the constant of Proposition 2. Then there exists a self-adjoint differential operator $\tilde L = \tilde L(h)$ on $L^2(\mathbb{R}^n)$, satisfying the assumptions (1.7)–(1.12), such that each $\lambda \in J$ is a non-trapping and non-critical energy level for $\tilde L(h)$ and $\tilde L(h)\psi = L(h)\psi$ for any $\psi \in C^\infty(\mathbb{R}^n)$ supported away from $B(0,\rho_0)$. Let $\chi(x) \in C_0^\infty(\{x : |x| \le \rho_1\})$, $\chi(x) = 1$ for $|x| \le \rho_0 < \rho_1$. Then for every $s > 1/2$ there exist $C > 0$, $h_0 > 0$ such that we have
$$\|R(z)\|_{s,-s} \le Ch^{-2}\big(1 + \|\chi R(z)\chi\|_{\mathcal{H}\to\mathcal{H}}\big),$$
uniformly with respect to $z \in B_\pm = \{z \in \mathbb{C} : (\mathrm{Re}\, z, \pm\mathrm{Im}\, z) \in J \times ]0,1]\}$ and $h \in\, ]0,h_0]$.

Proof. Choose a function $\Phi \in C_0^\infty(\{x \in \mathbb{R}^n : |x| \le 1\}; \mathbb{R})$, $0 \le \Phi \le 1$, such that $\Phi(x) = 1$ for $|x| \le 1/2$, and set $\Phi_R(x) = \Phi(x/R)$ for $R > R_0$, where $R_0$ is the constant in the "black box" assumptions. Introduce the self-adjoint operator on $L^2(\mathbb{R}^n)$
$$\tilde L = (1-\Phi_R)L\Phi_R + \Phi_R L(1-\Phi_R) + (1-\Phi_R)L(1-\Phi_R) + \Phi_R(-h^2\Delta)\Phi_R \tag{3.1}$$
$$\phantom{\tilde L} = Q + \Phi_R(-h^2\Delta - Q)\Phi_R, \tag{3.2}$$
where $Q$ is the formally self-adjoint differential operator (1.10). According to Proposition 2, taking $\rho_0 > 2\rho > R_0$, we know that every $\lambda \in J$ is a non-trapping and non-critical energy level in the sense mentioned in Sect. 2 for the principal symbol $\tilde l_0(x,\xi)$ of $\tilde L(h)$ having the form
$$\tilde l_0(x,\xi) = l_0(x,\xi) - \Phi_{2\rho_0}(x)\big(l_0(x,\xi) - |\xi|^2\big) \quad\text{with}\quad l_0(x,\xi) = \sum_{|\alpha|=2} a_\alpha(x)\xi^\alpha.$$
Clearly, if $\psi = 0$ on $B(0,\rho_0)$, we have $\tilde L\psi = Q\psi = L\psi$. Let $\chi_1 \in C_0^\infty(\mathbb{R}^n)$ be equal to 1 on $B(0,\rho_0)$ and let $\chi_2 \in C_0^\infty(\mathbb{R}^n)$ be equal to 1 on the support of $\chi_1$. The above property yields $\tilde L(1-\chi_1) = L(1-\chi_1)$ and we obtain
$$R(z) = R(z)\chi_2 + (1-\chi_1)\tilde R(z)(1-\chi_2) + R(z)[\tilde L,\chi_1]\tilde R(z)(1-\chi_2), \tag{3.3}$$
where $\tilde R(z) = (\tilde L(h) - z)^{-1}$, $z \in \mathbb{C}$, $\mathrm{Im}\, z \ne 0$.

This implies immediately the estimate
$$\|R(z)\|_{s,-s} \le C\Big(\|R(z)\chi_2\|_{s,-s} + \|\tilde R(z)\|_{s,-s} + \|R(z)\chi_2\|_{s,-s}\,\|[\tilde L,\chi_1]\tilde R(z)\|_{s,s}\Big), \tag{3.4}$$
where here and below we denote by $C > 0$ different constants independent of $h$ and $z$ which may change from line to line. Next, $[\tilde L,\chi_1]$ is a first order $h$-admissible differential operator with compactly supported coefficients, so there exists $C > 0$ such that
$$\|[\tilde L,\chi_1]\tilde R(z)\|_{s,s} \le C\,\|[\tilde L,\chi_1]\tilde R(z)\|_{s,-s}.$$
Since the interval $J$ contains non-trapping energy levels for $\tilde L(h)$, we can estimate $\|\tilde R(z)\|_{s,-s}$ and $\|[\tilde L,\chi_1]\tilde R(z)\|_{s,-s}$ by Lemma 4 and Corollary 1 (see Appendix 5.1). Consequently, there exist $C > 0$ and $h_0 > 0$ such that for any $h \in\, ]0,h_0]$ we have
$$\|R(z)\|_{s,-s} \le Ch^{-1}\big(1 + \|R(z)\chi_2\|_{s,-s}\big). \tag{3.5}$$
On the other hand, using the equality $\|R(z)\chi_2\|_{s,-s} = \|\chi_2 R(z)\|_{s,-s}$ and introducing $\chi_2$ in the estimates (3.4), (3.5), we obtain
$$\|R(z)\chi_2\|_{s,-s} \le Ch^{-1}\big(1 + \|\chi_2 R(z)\chi_2\|_{\mathcal{H}\to\mathcal{H}}\big). \tag{3.6}$$
Finally, (3.5) and (3.6) yield
$$\|R(z)\|_{s,-s} \le Ch^{-2}\big(1 + \|\chi_2 R(z)\chi_2\|_{\mathcal{H}\to\mathcal{H}}\big)$$
and the proof is complete. □

In the following, up to the end of the paper, $\rho_1 > \rho_0 > R_0$ will denote the constants introduced in Proposition 3 and $\chi(x)$ is a cut-off function with the properties given in Proposition 3.

Proposition 4. Assume that $L(h)$ satisfies the assumptions (1.7)–(1.15). Then for small $h_0 > 0$, fixed $s > 1/2$ and $h \in\, ]0,h_0]$ the limits
$$R(\lambda \pm i0) = \lim_{\varepsilon\to 0,\ \pm\varepsilon > 0} R(\lambda \pm i\varepsilon)$$
exist in $\mathcal{L}(\mathcal{H}^{0,s}, \mathcal{H}^{0,-s})$ uniformly with respect to $\lambda \in J$ and the function $J \ni \lambda \mapsto R(\lambda \pm i0) \in \mathcal{L}(\mathcal{H}^{0,s}, \mathcal{H}^{0,-s})$ is continuous.

Proof. As in the proof of Proposition 3, introduce an operator $\tilde L = \tilde L(h)$ with non-trapping energy levels in $J$. Let $\chi_1 \in C_0^\infty(\mathbb{R}^n)$ be equal to 1 on $B(0,\rho_0)$ and let $\chi_2 \in C_0^\infty(\mathbb{R}^n)$ be equal to 1 on the support of $\chi_1$. Set $R(z,z') = R(z) - R(z')$ and $\tilde R(z,z') = \tilde R(z) - \tilde R(z')$. According to (3.3), we have
$$R(z,z') = R(z,z')\chi_2 + (1-\chi_1)\tilde R(z,z')(1-\chi_2) + R(z,z')[\tilde L,\chi_1]\tilde R(z)(1-\chi_2) + R(z')[\tilde L,\chi_1]\tilde R(z,z')(1-\chi_2)$$
and we deduce
$$\|R(z,z')\|_{s,-s} \le C\|R(z,z')\chi_2\|_{s,-s}\big(1 + \|[\tilde L,\chi_1]\tilde R(z)\|_{s,s}\big) + C\|\tilde R(z,z')\|_{s,-s} + C\|R(z')\chi_2\|_{s,-s}\,\|[\tilde L,\chi_1]\tilde R(z,z')\|_{s,s}.$$
For fixed $h \in\, ]0,h_0]$, Lemma 6 in Appendix 5.2 implies that the cut-off resolvent $\chi_2(x)R(z)\chi_2(x)$ is analytic in $B_\pm$ and the norm $\|\chi_2 R(z)\chi_2\|_{\mathcal{H}\to\mathcal{H}}$ is uniformly bounded with respect to $z \in B_\pm$. Applying Proposition 3 for $s > 1/2$, we deduce that $\|R(z')\chi_2\|_{s,-s}$ is also uniformly bounded with respect to $z' \in B_\pm$. On the other hand, since $J$ is non-trapping for $\tilde L$, using Corollary 1 we see that for any $s > 1/2$ there exists $C_s$ such that
$$\|[\tilde L,\chi_1]\tilde R(z)\|_{s,s} \le C_s h^{-1}$$
uniformly with respect to $z \in B_\pm$. An application of Lemma 4 in Appendix 5.1 shows that for $s$ and $h$ fixed, there exists $C_{h,s}$ so that
$$\|\tilde R(z,z')\|_{s,-s} + \|[\tilde L,\chi_1]\tilde R(z,z')\|_{s,-s} \le C_{h,s}\big(|z-z'| + |z-z'|^{\delta_1}\big),$$
with $\delta_1 = (s-1/2)/(2s-1/2)$, uniformly with respect to $z, z' \in B_\pm$. Thus with another constant $C_{h,s}$ we have
$$\|R(z,z')\|_{s,-s} \le C_{h,s}\big(|z-z'|^{\delta_1} + |z-z'| + \|R(z,z')\chi_2\|_{s,-s}\big).$$
To deal with the term $\|R(z,z')\chi_2\|_{s,-s}$, we use the equality $\|R(z,z')\chi_2\|_{s,-s} = \|\chi_2 R(z,z')\|_{s,-s}$, and we introduce the cut-off function $\chi_2$ in the above estimates in order to obtain
$$\|R(z,z')\chi_2\|_{s,-s} \le C_{h,s}\big(|z-z'|^{\delta_1} + |z-z'| + \|\chi_2 R(z,z')\chi_2\|_{s,-s}\big).$$
This implies immediately
$$\|R(z,z')\|_{s,-s} \le C_{h,s}\big(|z-z'|^{\delta_1} + |z-z'| + \|\chi_2 R(z,z')\chi_2\|_{s,-s}\big).$$
Applying Lemma 6 once more, we conclude that for fixed $h$ and $s > 1/2$ the limits
$$R(\lambda \pm i0) = \lim_{\varepsilon\to 0,\ \pm\varepsilon > 0} R(\lambda \pm i\varepsilon), \quad \lambda \in J,$$
exist in $\mathcal{L}(\mathcal{H}^{0,s}, \mathcal{H}^{0,-s})$ and $J \ni \lambda \mapsto R(\lambda \pm i0) \in \mathcal{L}(\mathcal{H}^{0,s}, \mathcal{H}^{0,-s})$ is continuous. □

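For the proof of our principal result we need the following.

Lemma 3. Let $L(h)$ satisfy the assumptions (1.7)–(1.15) and assume the estimate (1.5) is fulfilled. Let $\chi(x)$ be as in Proposition 3. Then there exists $C > 0$ such that for $(\lambda,\tau) \in J \times ]0,1]$ we have
$$\|\chi(x)R(\lambda \pm i\tau)\chi(x)\|_{\mathcal{H}\to\mathcal{H}} \le Ce^{Ch^{-p}}, \quad 0 < h \le h_0. \tag{3.7}$$

Proof. Set $S_\theta = \{z \in \mathbb{C} : -\theta/2 \le \arg z \le \pi/4\}$, where $0 < \theta \le \theta_0$ is fixed and $\theta_0 \in\, ]0,\pi[$ was introduced in (1.13), making possible a complex scaling of $L(h)$. More precisely, following [23, 21], we construct a smooth function $[0,\pi] \times [0,\infty[\, \ni (\theta,t) \mapsto f_\theta(t) \in \mathbb{C}$ and, by using the map
$$\kappa_\theta : \mathbb{R}^n \ni x = t\omega \mapsto f_\theta(t)\omega \in \mathbb{C}^n, \quad t = |x|, \quad f_\theta(t) = t \ \text{for}\ 0 \le t \le \rho_1,$$
we define the manifold $\Gamma_\theta$ which coincides with $\mathbb{R}^n$ along $B(0,\rho_1)$. Here $\rho_1 > \rho_0$ is the constant given in Proposition 3. The complex scaling operator $L_\theta = L_\theta(h)$ is defined in a domain $\mathcal{D}_\theta$ of the space $\mathcal{H}_\theta = \mathcal{H}_{R_0} \oplus L^2(\Gamma_\theta \setminus B(0,R_0))$. We refer to [23, 21] for more details.

Choose $0 < \delta < \mu_0/2$ so that (1.5) holds for $\lambda \in [\mu_0-\delta, \mu_1+\delta]$ and consider the compact set
$$\Omega_\delta = \{z \in S_\theta : \mu_0 - \delta \le \mathrm{Re}\, z \le \mu_1 + \delta\}.$$
The construction of $L_\theta$ and an application of Lemma 1 in [24] show that for every $m \in \mathbb{N}$ we have
$$\|\chi(L(h)-z)^{-1}\chi\| \le C(\Omega_\delta,m)\exp\big(C(\Omega_\delta,m)h^{-k}\big), \quad 0 < h \le h_0(\Omega_\delta,m), \tag{3.8}$$
for
$$z \in \Omega_\delta \setminus \bigcup_{z_j \in \mathrm{Res}\, L(h)\cap\Omega_\delta} D(z_j : h^m)$$
with $k > n$ and $D(z_j : h^m) = \{z \in \mathbb{C} : |z - z_j| \le h^m\}$. For the number of the scattering resonances lying in $\Omega_\delta$ we have the estimate (see [23, 21])
$$\#\{z_j : z_j \in \mathrm{Res}\, L(h)\cap\Omega_\delta\} \le C_1(\Omega_\delta)h^{-n^\#}$$
with $n^\# \ge n$. Next we fix an integer $m$ satisfying $m \ge n^\# + 1$ and take $0 < h_1(\Omega_\delta,m) \le h_0(\Omega_\delta,m)$ small enough to arrange
$$4C_1(\Omega_\delta)h^{m-n^\#} < \delta \quad\text{for}\quad 0 < h \le h_1(\Omega_\delta,m).$$
This implies immediately the existence of $\gamma_1(h)$, $\gamma_2(h)$ so that
$$\mu_0/2 \le \mu_0 - \delta < \gamma_1(h) < \mu_0 - \frac{\delta}{2}, \qquad \mu_1 + \frac{\delta}{2} < \gamma_2(h) < \mu_1 + \delta,$$
and
$$\gamma_j(h) + i\tau \notin \bigcup_{z_j \in \mathrm{Res}\, L(h)\cap\Omega_\delta} D(z_j : h^m), \quad j = 1,2, \quad \tau \ge 0.$$
Thus for fixed $k > n$ the estimate (3.8) holds for $z = \gamma_j(h) + i\tau$, $j = 1,2$, $\tau \ge 0$. Set $\varepsilon = \delta/10$ and introduce the domain
$$I_\delta = \{z = \lambda + i\tau : \mu_0 - \delta \le \lambda \le \mu_1 + \delta,\ 0 \le \tau \le h^M\} \subset \Omega_\delta,$$
where $2M > k$. By a modification of the argument in Lemma 2 in [24], we may find a function $f(z,h)$ holomorphic in $I_\delta$ with the properties:

(i) $|f(z,h)| \le e$ in $I_\delta$,
(ii) $|f(z,h)| \ge 1/2$ in $[\mu_0-\varepsilon, \mu_1+\varepsilon] + i[0,h^M]$,
(iii) for $z \in I_\delta \cap \{z : \mathrm{Re}\, z \le \mu_0 - 4\varepsilon \ \text{or}\ \mathrm{Re}\, z \ge \mu_1 + 4\varepsilon\}$ we have
$$|f(z,h)| \le C\exp\Big(-\frac{\varepsilon^2}{2h^{2M}}\Big), \quad 0 \le h \le h(\varepsilon).$$

To construct $f(z,h)$, take a function $\psi(x) \in C_0^\infty(\mathbb{R})$, $0 \le \psi(x) \le 1$, such that $\psi(x) = 0$ for $x \notin [\mu_0-3\varepsilon, \mu_1+3\varepsilon]$ and $\psi(x) = 1$ for $x \in [\mu_0-2\varepsilon, \mu_1+2\varepsilon]$, and put
$$f(z,h) = (\pi\alpha^2)^{-1/2}\int \exp\Big(-\frac{(x-z)^2}{\alpha^2}\Big)\psi(x)\,dx, \quad \alpha = h^M.$$
The check of (i) is trivial. To prove (ii), write $z = u + iv$ and take $u \in [\mu_0-\varepsilon, \mu_1+\varepsilon]$, $0 \le v \le \alpha$. Then
$$|f(z,h) - 1| \le (\pi\alpha^2)^{-1/2}\exp\Big(\frac{v^2}{\alpha^2}\Big)\int \exp\Big(-\frac{(x-u)^2}{\alpha^2}\Big)|\psi(x)-1|\,dx \le \pi^{-1/2}e\int e^{-y^2}|\psi(\alpha y+u)-1|\,dy \le 2\pi^{-1/2}e\int_{\mathbb{R}\setminus B(0,\varepsilon/\alpha)} e^{-y^2}\,dy \le \frac{1}{2}$$
for $0 < h \le h(\varepsilon)$, since $\varepsilon/\alpha \to \infty$ as $h \to 0$. Finally, to deal with (iii), assume for example that $u \le \mu_0 - 4\varepsilon$. We have
$$|f(z,h)| \le (\pi\alpha^2)^{-1/2}\exp\Big(\frac{v^2}{\alpha^2}\Big)\int \exp\Big(-\frac{(x-u)^2}{\alpha^2}\Big)\psi(x)\,dx \le A\pi^{-1/2}h^{-M}\exp\Big(-\frac{\varepsilon^2}{\alpha^2}\Big) \le C\exp\Big(-\frac{\varepsilon^2}{2\alpha^2}\Big), \quad 0 \le h \le h(\varepsilon).$$
Our assumptions imply that the cut-off resolvent
$$F(z,h) = \chi(x)R(z)\chi(x)$$
has no poles on $]\mu_0-\delta, \mu_1+\delta[$ (see Lemma 6 in Appendix 5.2) and the function $G(z,h) = f(z,h)F(z,h)$ is holomorphic in $I_\delta$. For $\mathrm{Im}\, z = 0$ we have
$$\|G(z,h)\| \le Ce\,e^{C/h^p},$$
while for $\mathrm{Im}\, z = h^M$ we get
$$\|G(z,h)\| \le \frac{Ce}{h^M} \le e^{C_1/h}.$$
For $z \in I_\delta$, $\mathrm{Re}\, z = \gamma_j(h)$, $j = 1,2$, combining the estimates for $|f(z,h)|$ and $\|F(z,h)\|$, we obtain
$$\|G(z,h)\| \le CC'\,e^{C/h^k}\,e^{-\varepsilon^2/(2h^{2M})} \le C_2, \quad 0 < h \le h(\varepsilon).$$
As in [24], by applying the maximum principle for $G(z,h)$ in $I_\delta$, we deduce the estimate (3.7) for $\lambda \in J$, $0 < \tau \le h^M$. The result for $h^M \le \tau \le 1$ follows trivially from the inequality
$$\|\chi(x)R(\lambda+i\tau)\chi(x)\|_{\mathcal{H}\to\mathcal{H}} \le \frac{C}{\tau}$$
and the proof is complete. □

Combining Propositions 3, 4 and Lemma 3, we obtain Theorem 1.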
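The properties (i)–(iii) of the auxiliary function $f(z,h)$ from the proof of Lemma 3 can be observed numerically. The Python sketch below is only an illustration: it replaces the smooth cutoff $\psi$ by a piecewise-linear profile, fixes the hypothetical values $\mu_0 = 1$, $\mu_1 = 2$ and a margin `eps = 0.1`, treats $\alpha$ as a free parameter, and approximates the integral by the trapezoidal rule. None of these choices comes from the paper.

```python
import cmath
import math


def psi(x, mu0, mu1, eps):
    """Piecewise-linear stand-in for the cutoff (assumption): equal to 1 on
    [mu0 - 2 eps, mu1 + 2 eps] and to 0 outside [mu0 - 3 eps, mu1 + 3 eps]."""
    if x < mu0 - 3 * eps or x > mu1 + 3 * eps:
        return 0.0
    if x < mu0 - 2 * eps:
        return (x - (mu0 - 3 * eps)) / eps
    if x > mu1 + 2 * eps:
        return ((mu1 + 3 * eps) - x) / eps
    return 1.0


def f(z, alpha, mu0=1.0, mu1=2.0, eps=0.1, pts_per_alpha=20):
    """Trapezoidal approximation of
    (pi alpha^2)^(-1/2) * integral of exp(-(x - z)^2 / alpha^2) psi(x) dx."""
    a, b = mu0 - 3 * eps, mu1 + 3 * eps
    n = int(math.ceil((b - a) * pts_per_alpha / alpha))
    dx = (b - a) / n
    total = 0j
    for k in range(n + 1):
        x = a + k * dx
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(-((x - z) ** 2) / alpha ** 2) * psi(x, mu0, mu1, eps)
    return total * dx / (math.sqrt(math.pi) * alpha)
```

With `alpha = 0.02`, a point such as `1.5 + 0.02j` lies in the strip of property (ii) and `abs(f(1.5 + 0.02j, 0.02))` stays between 1/2 and $e$, while `abs(f(0.6 + 0.01j, 0.02))`, evaluated at real part $\mu_0 - 4\cdot$`eps`, is negligibly small, as property (iii) suggests.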

4. Microlocal Resolvent Estimates

In this section we obtain microlocal resolvent estimates in the trapping case. They are inspired by the microlocal estimates in Lemma 2.3 of [20] corresponding to the non-trapping case. In [20] (see also [17] for more general Hamiltonians), Robert and Tamura constructed an approximation of the propagator $\exp(ith^{-1}L)$ by Fourier integral operators which is uniform with respect to the time $t$ and $h$. Following the argument in [20], it is convenient to localize in the outgoing or incoming regions of the phase space
$$\Gamma_\pm(R,d,\sigma_\pm) = \Big\{(x,\xi) \in \mathbb{R}^{2n} : |x| > R,\ d^{-1} < |\xi| < d,\ \frac{x\cdot\xi}{|x||\xi|} \gtrless \sigma_\pm\Big\}, \tag{4.1}$$
where $d > 1$, $-1 < \sigma_\pm < 1$ and $R \gg 1$ is large enough. Consider a symbol $\omega_\pm(x,\xi) \in C^\infty(\mathbb{R}^{2n})$ such that $\mathrm{supp}\, \omega_\pm \subset \Gamma_\pm(R,d,\sigma_\pm)$ with $R > \rho_1 > \rho_0$, $\rho_0$, $\rho_1$ being the constants introduced in Proposition 3. Assume that $\omega_\pm(x,\xi)$ satisfy the estimates
$$|\partial_x^\alpha\partial_\xi^\beta \omega_\pm(x,\xi)| \le C_{\alpha,\beta,L}\langle x\rangle^{-|\alpha|}\langle\xi\rangle^{-L}, \quad \forall\alpha,\ \forall\beta, \tag{4.2}$$
and any $L \gg 1$. Then we have the following.

Proposition 5. Let $J \subset\subset \mathbb{R}^+$, let $\psi \in C^\infty(\mathbb{R}^n)$ be supported sufficiently away from $B(0,\rho_1)$, $\psi(x) = 1$ for $|x| \gg 1$, and let $\chi(x)$ be the function of Proposition 3. Then for any $(\lambda,\tau) \in J \times [0,1]$ and $h \in\, ]0,h_0]$ the following assertions hold:

i) For any $s > 1/2$, $\delta > 1$ and $N \in \mathbb{N}$ there exist $C_N > 0$ and $h_0 > 0$ such that for $h \in\, ]0,h_0]$ we have
$$\|\psi R(\lambda \pm i\tau)\psi\,\omega_\pm(x,hD_x)\|_{-s+\delta,-s} \le C_N h^{-1}\big(1 + h^N\|\chi R(\lambda \pm i\tau)\chi\|_{\mathcal{H}\to\mathcal{H}}\big). \tag{4.3}$$

ii) If $\sigma_+ > \sigma_-$, then for any $s \gg 1$ and $N \in \mathbb{N}$, there exist $C_N > 0$ and $h_0 > 0$ such that for $h \in\, ]0,h_0]$ we have
$$\|\omega_\mp(x,hD_x)\psi R(\lambda \pm i\tau)\psi\,\omega_\pm(x,hD_x)\|_{-s,s} \le C_N h^N\big(1 + \|\chi R(\lambda \pm i\tau)\chi\|_{\mathcal{H}\to\mathcal{H}}\big). \tag{4.4}$$

iii) Let $\varphi \in C_0^\infty(B(0,\rho_0))$, $\varphi = 1$ on $B(0,R_0)$. Then for any $s \gg 1$ and $N \in \mathbb{N}$, there exist $C_N > 0$ and $h_0 > 0$ such that for $h \in\, ]0,h_0]$ we have
$$\|\varphi R(\lambda \pm i\tau)\psi\,\omega_\pm(x,hD_x)\|_{-s,0} \le C_N h^N\big(1 + \|\chi R(\lambda \pm i\tau)\chi\|_{\mathcal{H}\to\mathcal{H}}\big). \tag{4.5}$$

Proof. Introduce a function $\chi_1(x) \in C_0^\infty(B(0,\rho_1))$ such that $\chi_1(x) = 1$ on $B(0,\rho_0)$ and $\chi\chi_1 = \chi_1$. As in the proof of Proposition 3, there exists a self-adjoint differential operator $\tilde L = \tilde L(h)$ on $L^2(\mathbb{R}^n)$ such that each $\lambda \in J$ is a non-trapping energy level for $\tilde L$, and the equality (3.3) implies
$$R(z)\psi\,\omega_\pm(x,hD_x) = (1-\chi_1)\tilde R(z)\psi\,\omega_\pm(x,hD_x) + R(z)[\tilde L,\chi_1]\tilde R(z)\psi\,\omega_\pm(x,hD_x). \tag{4.6}$$

Then i) is a consequence of Proposition 3 and Lemma 5, iii) (see Appendix 5.1). To treat ii), observe that
$$\omega_\mp(x,hD_x)\psi R(z)\psi\,\omega_\pm(x,hD_x) = \omega_\mp(x,hD_x)\psi\tilde R(z)\psi\,\omega_\pm(x,hD_x) + \omega_\mp(x,hD_x)\psi R(z)[\tilde L,\chi_1]\tilde R(z)\psi\,\omega_\pm(x,hD_x).$$
Exploiting the adjoint operators, we get
$$\big(\omega_\mp(x,hD_x)\psi R(z)\big)^* = R(z)\psi\,\omega_\mp(x,hD_x)^*.$$
Thus using i) and Lemma 5 once more, we obtain ii). The analysis of iii) follows a similar argument. □

5. Appendix

5.1. Resolvent estimates in the non-trapping case. In this section we recall some resolvent estimates in a non-trapping energy interval for a self-adjoint differential operator $\tilde L = \tilde L(h)$ defined in a dense domain in $L^2(\mathbb{R}^n)$. We say that $J \subset\subset \mathbb{R}^+$ is non-trapping for $\tilde L(h)$ if every $\lambda \in J$ is a non-trapping energy level for the principal symbol of $\tilde L(h)$ in the sense of Sect. 2. Denote by $\|\cdot\|_{s,s'}$ the norm of $\mathcal{L}(L^{2,s}, L^{2,s'})$, where $L^{2,s}$ is the weighted space $L^2(\mathbb{R}^n, \langle x\rangle^{s}dx)$.

Lemma 4. Let $\tilde L = \tilde L(h)$ be a self-adjoint differential operator having the form (1.10) and satisfying (1.11)–(1.12). Let $J \subset\subset \mathbb{R}^+$ be an open non-trapping interval for $\tilde L(h)$ and let $s > 1/2$. Then

i) for $h \in\, ]0,h_0]$ fixed, there exists $C_{h,s} > 0$ such that for any $z, z' \in B_\pm$ we have
$$\|(\tilde L - z)^{-1} - (\tilde L - z')^{-1}\|_{s,-s} \le C_{h,s}|z-z'|^{\delta_1}, \quad \delta_1 = (s-1/2)/(2s-1/2);$$

ii) for $\lambda \in J$, the limits
$$(\tilde L - \lambda \mp i0)^{-1} := \lim_{\varepsilon\to 0,\ \pm\varepsilon > 0} (\tilde L - \lambda \mp i\varepsilon)^{-1}$$
exist in $\mathcal{L}(L^{2,s}, L^{2,-s})$, and there exist $C > 0$ and $h_0 > 0$ such that for $z \in B_\pm$ and $h \in\, ]0,h_0]$ we have
$$\|(\tilde L - z)^{-1}\|_{s,-s} \le Ch^{-1}. \tag{5.1}$$

Proof. This result can be obtained following the Mourre theory. First, there exists a conjugate operator $A = A(h)$ satisfying the following Mourre inequality: $\forall\Phi \in C_0^\infty(J)$, $\exists\gamma_0 > 0$ such that for all $h \in\, ]0,h_0]$ we have
$$\Phi(\tilde L)\,i^{-1}[\tilde L, A]\,\Phi(\tilde L) \ge \gamma_0 h\,\Phi(\tilde L)^2. \tag{5.2}$$
For the construction of $A(h)$ we refer to [17], where a more general setup has been considered. The operator $A(h)$ has the form of an $h$-pseudodifferential operator with symbol
$$F(x,\xi) = K(x,\xi) + \frac{2x\cdot\xi}{\langle\xi\rangle^2},$$

where $K \in C^\infty(\mathbb{R}^n \times \mathbb{R}^n)$ is compactly supported with respect to $x$ and $K$ is uniformly bounded with respect to $(x,\xi)$. According to Mourre type results, we conclude that $\tilde L$ has no eigenvalues in $J$ for $h$ small enough (see [13, 4]) and for $s > 1/2$ the following assertions hold (see [14, 11] and [9, 16, 17] for operators depending on $h$):

i') for $h \in\, ]0,h_0]$ fixed, there exists $C_h > 0$ such that for $z, z' \in B_\pm = \{z \in \mathbb{C};\ (\mathrm{Re}\, z, \pm\mathrm{Im}\, z) \in J \times ]0,1]\}$ we have
$$\big\|\langle A\rangle^{-s}\big((\tilde L - z)^{-1} - (\tilde L - z')^{-1}\big)\langle A\rangle^{-s}\big\|_{L^2\to L^2} \le C_h|z-z'|^{\delta_1},$$

ii') there exist $C > 0$ and $h_0 > 0$ such that for $z \in B_\pm$ and $h \in\, ]0,h_0]$ we have
$$\big\|\langle A\rangle^{-s}(\tilde L - z)^{-1}\langle A\rangle^{-s}\big\|_{L^2\to L^2} \le Ch^{-1}, \quad \langle A\rangle = (1+A^2)^{1/2}. \tag{5.3}$$

Let us note that the operators $\langle x\rangle^{-s}\langle A\rangle^{s}$, $\langle A\rangle^{s}\langle x\rangle^{-s}$, $s \ge 0$, are uniformly bounded with respect to $h$ in $\mathcal{L}(L^2)$. This follows from the functional calculus developed by Dimassi-Sjöstrand (see Sects. 7 and 8 in [5]) or by using a complex interpolation as in Lemma 8.2 of [14]. Thus, writing
$$\langle x\rangle^{-s}(\tilde L - z)^{-1}\langle x\rangle^{-s} = \langle x\rangle^{-s}\langle A\rangle^{s}\,\langle A\rangle^{-s}(\tilde L - z)^{-1}\langle A\rangle^{-s}\,\langle A\rangle^{s}\langle x\rangle^{-s},$$
we complete the proof. □

Corollary 1. Let $P$ be a $p$-order $h$-admissible differential operator, $p \le 2$, and let $s > 1/2$. Then, under the assumptions of Lemma 4, the following assertions hold:

i'') for $h \in\, ]0,h_0]$ fixed, there exists $C_{h,s} > 0$ such that for $z, z' \in B_\pm$ we have
$$\big\|P\big((\tilde L - z)^{-1} - (\tilde L - z')^{-1}\big)\big\|_{s,-s} \le C_{h,s}\big(|z-z'| + |z-z'|^{\delta_1}\big),$$

ii'') there exist $C > 0$ and $h_0 > 0$ such that for $z \in B_\pm$ and $h \in\, ]0,h_0]$ we have
$$\|P(\tilde L - z)^{-1}\|_{s,-s} \le Ch^{-1}. \tag{5.4}$$

Proof. The ellipticity of $\tilde L$ and the calculus of Dimassi-Sjöstrand (see Sects. 7 and 8 in [5]) imply that for $s \ge 0$ the operator $\langle x\rangle^{-s}P(\tilde L - i)^{-1}\langle x\rangle^{s}$ is uniformly bounded in $\mathcal{L}(L^2)$ with respect to $h$. Then ii'') is a direct consequence of ii) and the resolvent equation
$$(\tilde L - z)^{-1} = (\tilde L - i)^{-1} + (z-i)(\tilde L - i)^{-1}(\tilde L - z)^{-1}.$$
In the same way i'') follows from i) and the relation
$$(\tilde L - z)^{-1} - (\tilde L - z')^{-1} = (z-z')(\tilde L - i)^{-1}(\tilde L - z)^{-1} + (z'-i)(\tilde L - i)^{-1}\big((\tilde L - z)^{-1} - (\tilde L - z')^{-1}\big).$$
□

Finally, we formulate some microlocal resolvent estimates which generalize Lemma 2.3 of [20].

Lemma 5. Let $\tilde L = \tilde L(h)$ be a self-adjoint differential operator having the form (1.10) and satisfying (1.11)–(1.12). Let $J \subset\subset \mathbb{R}^+$ be an open non-trapping interval for $\tilde L(h)$ and let $\omega_\pm(x,\xi)$ be a symbol supported in an outgoing (incoming) region as in Sect. 4. Then for any $(\lambda,\tau) \in J \times ]0,1]$ the following assertions hold:

i) For any $s > 1/2$, $\delta > 1$ there exist $C > 0$ and $h_0 > 0$ such that for $h \in\, ]0,h_0]$ we have
$$\|(\tilde L - \lambda \mp i\tau)^{-1}\omega_\pm(x,hD_x)\|_{-s+\delta,-s} \le Ch^{-1}. \tag{5.5}$$

ii) If $\sigma_+ > \sigma_-$, then for any $s \gg 1$ and $N \in \mathbb{N}$ there exist $C_N > 0$ and $h_0 > 0$ such that for $h \in\, ]0,h_0]$ we have
$$\|\omega_\mp(x,hD_x)(\tilde L - \lambda \mp i\tau)^{-1}\omega_\pm(x,hD_x)\|_{-s,s} \le C_N h^N. \tag{5.6}$$

iii) Let $\rho_1 < R$ and let $\varphi \in C_0^\infty(B(0,\rho_1))$, $\varphi = 1$ on $B(0,R_0)$. Let $P$ be a $p$-order $h$-admissible differential operator, $p \le 2$. Then for any $s \gg 1$ and $N \in \mathbb{N}$, there exist $C_N > 0$ and $h_0 > 0$ such that for $h \in\, ]0,h_0]$ we have
$$\|\varphi P(\tilde L - \lambda \mp i\tau)^{-1}\omega_\pm(x,hD_x)\|_{-s,0} \le C_N h^N. \tag{5.7}$$

The proof of this lemma is analogous to that of Lemma 2.3 of [20], exploiting the constructions of long time approximations in Sect. 4 of [17] for a large class of long-range perturbations of the Laplacian. The essential point is the construction of the phases $\varphi_\pm$ (see Proposition 4.1 of [17]). To establish the estimates when we have a differential operator $P$, we apply the argument of the proof of Corollary 1 based on the resolvent equation and the fact that the operator $\langle x\rangle^{-s}P(\tilde L - i)^{-1}\langle x\rangle^{s}$ is uniformly bounded in $\mathcal{L}(L^2)$. We leave the details to the reader.

5.2. Boundary values of the cut-off resolvent $\chi R(z)\chi$. Our purpose is to show that for fixed $h \in\, ]0,h_0]$ the limits
$$\chi(x)R(\lambda \pm i0)\chi(x) = \lim_{\varepsilon\to 0,\ \pm\varepsilon > 0} \chi(x)R(\lambda \pm i\varepsilon)\chi(x), \quad \lambda \in J,$$
exist in $\mathcal{L}(\mathcal{H},\mathcal{H})$, provided $\chi \in C_0^\infty(\mathbb{R}^n)$ is equal to 1 on $B(0,\rho_0)$ and $\chi(x) = 0$ for $|x| \ge \rho_0 + a_0 = \rho_1$. To do this, it is sufficient to show that there are no resonances $z_j \in \mathrm{Res}\, L(h)$ in the interval $]\mu_0-\delta, \mu_1+\delta[$.

Lemma 6. Assume (1.7)–(1.15) fulfilled. Then for $0 < h \le h_0$ we have
$$\mathrm{Res}\, L(h) \cap\, ]\mu_0-\delta, \mu_1+\delta[\, = \emptyset.$$

Proof. The proof can be obtained from the results in [23, 21]. Here we will give only indications of how to extract it from there. Assume $h$ fixed and let $z_0 \in\, ]\mu_0-\delta, \mu_1+\delta[$ be a resonance for $L(h)$. Denote by $\mathcal{H}_{\rho_0+a_0}$ the space of elements of $\mathcal{H}$ which vanish outside $B(0,\rho_0+a_0)$. Then by using a complex scaling argument and Lemma 3.5 in [23], we conclude that for $v \in \mathcal{H}_{\rho_0+a_0}$ and $z$ in a neighborhood of $z_0$ we have
$$(L - z)^{-1}v = \sum_{j=1}^N (z_0 - z)^{-j}v_{-j} + G(z) \tag{5.8}$$

with $G(z)$ holomorphic in a neighborhood of $z_0$ and $N$ independent of $v$. Consider the spectral projection
$$\pi_{\theta,z_0} = \frac{1}{2\pi i}\int_{\gamma(z_0)} (z - L_\theta)^{-1}\,dz,$$
$L_\theta = L_\theta(h)$ being the complex scaling operator defined in the proof of Lemma 3, while $\gamma(z_0) : [0,2\pi] \ni s \mapsto z_0 + \varepsilon_1 e^{is}$ with $\varepsilon_1 > 0$ small enough. Let $F_{\theta,z_0}$ be the image $\pi_{\theta,z_0}(\mathcal{H}_\theta)$ and let
$$\pi_{0,z_0} = \frac{1}{2\pi i}\int_{\gamma(z_0)} (z - L)^{-1}\,dz : \mathcal{H}_{\mathrm{comp}} \to \mathcal{D}_{\mathrm{loc}}.$$
As in Proposition 3.6 in [23], we conclude that there exists a bijection between the spaces $F_{\theta,z_0}$ and $\pi_{0,z_0}(\mathcal{H}_{\rho_0+a_0})$. Since $z_0$ is a resonance, the space $F_{\theta,z_0}$ is not trivial and we can find $v \in \mathcal{H}_{\rho_0+a_0}$ so that (5.8) holds with $\sum_{j=1}^N (z_0 - z)^{-j}v_{-j} \ne 0$. For simplicity assume $v_{-j} = 0$ for $j = 2, \ldots, N$ and let $v_{-1} \ne 0$. Take a cut-off function $\Phi(x) \in C_0^\infty(\mathbb{R}^n)$ equal to 1 on $B(0,\rho_0)$ so that $(\Phi, v_{-1})_{\mathcal{H}} \ne 0$. Therefore,
$$\lim_{\varepsilon\to 0}\big(\Phi,\ i\varepsilon(L - z_0 + i\varepsilon)^{-1}v_{-1}\big)_{\mathcal{H}} = (\Phi, P_{\{z_0\}}v_{-1})_{\mathcal{H}} \ne 0,$$
$P_{\{z_0\}}$ being the spectral projector of $L$ related to $z_0$, and we get $z_0 \in \sigma_{pp}(L)$, which contradicts (1.15). The case of a multiple pole can be treated in the same way.

Acknowledgements. The authors are grateful to Didier Robert for helpful discussions and valuable comments concerning the non-trapping perturbations. We would like to thank Maciej Zworski for helpful comments and discussions as well as for his suggestion to study the long-range "black box" perturbations. We are also grateful to the referee for his remarks on our paper.

References

1. Bruneau, V.: Semi-classical behavior of the scattering phase for trapping perturbations of the Laplacian. Comm. in P.D.E. 24, 5 & 6, 1095–1125 (1999)
2. Bruneau, V. and Petkov, V.: Representation of the spectral shift function and spectral asymptotics for trapping perturbations. Preprint, 2000
3. Burq, N.: Absence de résonances près du réel pour l'opérateur de Schrödinger. Exposé XVII, Séminaire EDP, Ecole Polytechnique, 1997/1998
4. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger operators with application to quantum mechanics and global geometry. Texts and Monographs in Physics, Berlin–Heidelberg: Springer-Verlag, 1987
5. Dimassi, M., Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. London Mathematical Society, Lecture Notes Series 268, Cambridge: Cambridge University Press, 1999
6. Gérard, C., Martinez, A.: Principe d'absorption limite pour des opérateurs de Schrödinger à longue portée. C. R. Acad. Sci. Paris 306, 121–123 (1988)
7. Gérard, C., Martinez, A.: Semi-classical asymptotics of the spectral function of long range Schrödinger operators. J. Funct. Anal. 84, 226–254 (1989)
8. Gérard, C., Martinez, A. and Robert, D.: Breit–Wigner formulas for the scattering poles and total scattering cross-section in the semi-classical limit. Commun. Math. Phys. 121, 323–336 (1989)
9. Hislop, P., Nakamura, S.: Semiclassical resolvents estimates. Ann. Inst. H. Poincaré (Physique théorique) 51, 187–198 (1989)
10. Isozaki, H., Kitada, H.: Modified wave operators with time independent modifiers. J. Fac. Sc. Univ. Tokyo Sect IA. 32, 77–104 (1985)
11. Jensen, A., Mourre, E., Perry, P.: Multiple Commutator Estimates and Resolvent Smoothness in Quantum Scattering Theory. Ann. Inst. H. Poincaré (Physique théorique) 41, 207–225 (1984)
12. Knauf, A.: Qualitative Aspects of Classical Potential Scattering. Regul. Chaotic Dyn. 4, no. 1, 3–22 (1999)
13. Mourre, E.: Absence of singular continuous spectrum for certain selfadjoint operators. Commun. Math. Phys. 78, 391–408 (1981)
14. Perry, P., Sigal, I.M., Simon, B.: Spectral analysis of N-body Schrödinger operators. Ann. Math. 114, 519–567 (1981)
15. Robert, D.: Autour de l'approximation semi-classique. PM, 68, Basel: Birkhäuser, 1987
16. Robert, D.: Asymptotique de la phase de diffusion à haute énergie pour des perturbations du second ordre du Laplacien. Ann. Sci. Ecole Norm. Sup. 25, 107–134 (1992)
17. Robert, D.: Relative time-delay for perturbations of elliptic operators and semiclassical asymptotics. J. Funct. Anal. 126, 36–82 (1994)
18. Robert, D.: Semiclassical approximation in quantum mechanics. A survey of old and recent mathematical results. Helv. Phys. Acta 71, 44–116 (1998)
19. Robert, D. and Tamura, H.: Semiclassical estimates for resolvents and asymptotics for total scattering cross-sections. Ann. Inst. H. Poincaré (Physique théorique) 46, 415–442 (1987)
20. Robert, D. and Tamura, H.: Asymptotic behavior of the scattering amplitudes in semi-classical and low energy limits. Ann. Inst. Fourier 39, 155–192 (1989)
21. Sjöstrand, J.: A trace formula and review of some estimates for resonances. In: Microlocal analysis and spectral theory (Lucca, 1996), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci. 490, Dordrecht: Kluwer Acad. Publ., 1997, pp. 377–437
22. Sjöstrand, J.: Resonances for bottles and trace formulae. Preprint, Ecole Polytechnique, 1998
23. Sjöstrand, J. and Zworski, M.: Complex scaling and the distribution of scattering poles. J. Amer. Math. Soc. 4, 729–769 (1991)
24. Tang, S. and Zworski, M.: From quasimodes to resonances. Math. Res. Lett. 5, 261–272 (1998)
25. Vasy, A. and Zworski, M.: Semiclassical estimates in asymptotically Euclidean scattering. Commun. Math. Phys. 212, 205–217 (2000)
26. Wang, X.P.: Time-Decay of Scattering Solutions and Resolvents Estimates for Semi-classical Schrödinger Operators. J. Differential Equations 71, 348–395 (1988)

Communicated by B. Simon

Commun. Math. Phys. 213, 433 – 470 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Distribution of Lattice Points Visible from the Origin

Florin P. Boca1,2, Cristian Cobeli2, Alexandru Zaharescu2,3

1 School of Mathematics, Cardiff University, Senghennydd Road, Cardiff CF2 4YH, UK
2 Institute of Mathematics of the Romanian Academy, P.O. Box 1-764, Bucharest 70700, Romania
3 Institute for Advanced Study, School of Mathematics, Olden Lane, Princeton, NJ 08540, USA

Received: 2 November 1999 / Accepted: 2 March 2000

Abstract: Let $\Omega$ be a region in the plane which contains the origin, is star-shaped with respect to the origin and has a piecewise $C^1$ boundary. For each integer $Q \ge 1$, we consider the integer lattice points from $\Omega_Q = \{(Qx,Qy);\ (x,y) \in \Omega\}$ which are visible from the origin and prove that the 1st consecutive spacing distribution of the angles formed with the origin exists. This is a probability measure supported on an interval $[m_\Omega, \infty)$, with $m_\Omega > 0$. Its repartition function is explicitly expressed as the convolution between the square of the distance from origin function and a certain kernel.

Let $\Omega$ be a region in the plane which contains the origin, is star-shaped with respect to the origin and is bounded by the curve $C$, parametrized by $x = r_\Omega(\alpha)\cos\alpha$, $y = r_\Omega(\alpha)\sin\alpha$, where $r_\Omega$ is continuous and piecewise $C^1$ on $[0,2\pi]$. For each (large) integer $Q$, we consider the "blown up" region $\Omega_Q = \{(Qx,Qy);\ (x,y) \in \Omega\}$, which has area $Q^2\,\mathrm{Area}(\Omega)$ and is bounded by the curve $C_Q = \{(Qx,Qy);\ (x,y) \in C\}$. We are interested in the set of lattice points in $\Omega_Q$ which are visible from the origin, i.e.
$$\mathcal{F}(\Omega,Q) = \{(a,q) \in \Omega_Q;\ a, q \in \mathbb{Z},\ \gcd(a,q) = 1\}.$$
For the number of elements of $\mathcal{F}(\Omega,Q)$ we have the well-known estimate
$$N = N(\Omega,Q) = \#\mathcal{F}(\Omega,Q) \sim \frac{1}{\zeta(2)}\,\mathrm{Area}(\Omega_Q) = \frac{\mathrm{Area}(\Omega)\,Q^2}{\zeta(2)},$$
as $Q$ goes to $\infty$. A sharp form of this asymptotic result was proved by Huxley and Nowak (see [18] and references therein).

Work of first author supported by an EPSRC Advanced Fellowship

In this paper we study a different aspect of the distribution of points in $F(\Omega, Q)$, namely we look at the distribution of the angles of straight lines from the origin through the points from $F(\Omega, Q)$. This is the set of lines through all the lattice points in $\Omega_Q$, taken without multiplicities. We consider the so-called nearest neighbour distribution of these angles as $\Omega$ is fixed and $Q \to \infty$. Our goal is to prove the existence of a limiting distribution and to provide explicit formulae for it.

Let $0 = \theta_0 < \theta_1 < \dots < \theta_N = 2\pi$ be the angles that correspond to elements from $F(\Omega, Q)$. We normalize them to

\[ 0 = \tilde\theta_0 = \frac{N\theta_0}{2\pi} < \tilde\theta_1 = \frac{N\theta_1}{2\pi} < \dots < \tilde\theta_N = \frac{N\theta_N}{2\pi} = N. \]

These are now $N = N(Q)$ numbers in an interval of length $N$, thus the difference between consecutive elements is about 1 on average. We shall be interested in the proportion $\widetilde G_\Omega(t, Q)$ of such differences which are $\ge t$. If the above angles were randomly distributed in $[0, 2\pi)$, then one would expect the proportion of differences between consecutive angles which are larger than the average to be about $1/e$. Moreover, one would expect $\widetilde G_\Omega(t, Q) \to e^{-t}$ as $Q \to \infty$. We will see that this is not the case.

For each $Q \ge 1$, $\widetilde G_\Omega(t, Q)$ denotes the repartition function of the 1st consecutive spacing probability measure

\[ \mu_{\Omega, Q} = \frac{1}{N(\Omega, Q)} \sum_{j=1}^{N(\Omega, Q)} \delta_{\tilde\theta_j - \tilde\theta_{j-1}}. \]

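We decompose the region $\Omega$ into eight regions $\Omega_0, \Omega_1, \dots, \Omega_7$, where $\Omega_k$ contains the points from $\Omega$ with argument between $k\pi/4$ and $(k+1)\pi/4$. For each $k \in \{0, 1, \dots, 7\}$, we consider the constants

\[ \lambda(\Omega_k) = \frac{4\,\mathrm{Area}(\Omega_k)}{\pi\,\zeta(2)}, \qquad \lambda(\Omega_k, Q) = \frac{4 N(\Omega_k, Q)}{\pi} \sim \lambda(\Omega_k)\, Q^2 \]

and the transformation $T_k$ defined by

\[ T_k(\alpha) = \begin{cases} \dfrac{\pi(k+2)}{4} - \alpha, & \text{for } k \text{ even}, \\[2mm] \alpha - \dfrac{\pi(k-1)}{4}, & \text{for } k \text{ odd}. \end{cases} \]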

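They extend to rotations or mirror symmetries. Each $T_k$ transforms $\Omega_k$ into a subset $T_k(\Omega_k)$ of $\mathbb{R}^2_{++} = \{(x, y) \in \mathbb{R}^2 ; \ 0 \le x \le y\}$. For any (bounded, star-shaped with respect to the origin) region $\omega \subset \mathbb{R}^2_{++}$, with boundary parametrized by $x = r_\omega(\alpha)\cos\alpha$, $y = r_\omega(\alpha)\sin\alpha$, $\alpha \in [\pi/4, \pi/2]$, we define the function

\[ \eta_\omega(t, \alpha) = \begin{cases} \dfrac12, & \text{for } 0 < t \le \dfrac{\lambda(\omega)}{r_\omega(\alpha)^2}, \\[3mm] \dfrac{\lambda(\omega)}{r_\omega(\alpha)^2 t} - \dfrac12 - \dfrac{\lambda(\omega)}{r_\omega(\alpha)^2 t} \cdot \log\dfrac{\lambda(\omega)}{r_\omega(\alpha)^2 t}, & \text{for } \dfrac{\lambda(\omega)}{r_\omega(\alpha)^2} < t \le \dfrac{4\lambda(\omega)}{r_\omega(\alpha)^2}, \\[3mm] \dfrac{\lambda(\omega)}{r_\omega(\alpha)^2 t} - \dfrac12 + \dfrac12\sqrt{1 - \dfrac{4\lambda(\omega)}{r_\omega(\alpha)^2 t}} + \dfrac{2\lambda(\omega)}{r_\omega(\alpha)^2 t} \cdot \log\dfrac{2}{1 + \sqrt{1 - \dfrac{4\lambda(\omega)}{r_\omega(\alpha)^2 t}}}, & \text{for } t > \dfrac{4\lambda(\omega)}{r_\omega(\alpha)^2}. \end{cases} \]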

The main result of this paper is that the limiting distribution exists and can be expressed as a convolution with a kernel of the type described above. More precisely, we prove

Theorem 0.1. The sequence of probability measures $(\mu_{\Omega, Q})_{Q \ge 1}$ converges weakly to the probability measure $\mu_\Omega$ with repartition function given by

\[ \widetilde G_\Omega(t) = \int_t^\infty d\mu_\Omega(x) = \frac18 \sum_{k=0}^{7} \frac{1}{\mathrm{Area}(\Omega_k)} \int_{k\pi/4}^{(k+1)\pi/4} r_{T_k(\Omega_k)}\big(T_k(\alpha)\big)^2 \, \eta_{T_k(\Omega_k)}\big(t, T_k(\alpha)\big) \, d\alpha. \]

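This can be immediately deduced from the case where $\Omega$ ($= \Omega_1$) is included in $\mathbb{R}^2_{++}$, by using the properties of the symmetries with respect to $y = \pm x$, $y = 0$ and $x = 0$. To establish the result for such $\Omega$, we will prove convergence for the corresponding repartition functions $\widetilde G_\Omega(t, Q)$ as $Q \to \infty$, for all $t > 0$. Associating to each point $(a, q) \in \mathbb{Z}^2$, $0 \le a \le q$, $\gcd(a, q) = 1$, the rational number $\gamma = a/q$, we identify the set $F(\Omega, Q)$ with a subset of $\mathbb{Q} \cap [0, 1]$ with elements $0 = \gamma_1 < \gamma_2 < \dots < \gamma_N = 1$. Then, the corresponding angles $\theta(\gamma_j)$ satisfy

\[ 2N = \frac{4N}{\pi}\theta(\gamma_1) > \frac{4N}{\pi}\theta(\gamma_2) > \dots > \frac{4N}{\pi}\theta(\gamma_N) = N. \]

Using polar coordinates, we may write $2\,\mathrm{Area}(\Omega) = \int_{\pi/4}^{\pi/2} r_\Omega(\alpha)^2 \, d\alpha$. We let

\[ \lambda(\Omega) = \frac{4\,\mathrm{Area}(\Omega)}{\pi\,\zeta(2)}, \qquad \lambda(\Omega, Q) = \frac{4N(\Omega, Q)}{\pi} \sim \lambda(\Omega) Q^2, \]

\[ G_\Omega(t, Q) = \# \Big\{ \gamma_j \in F(\Omega, Q) ; \ \theta(\gamma_j) - \theta(\gamma_{j+1}) \ge \frac{t}{\lambda(\Omega, Q)} \Big\}, \]

\[ \widetilde G_\Omega(t, Q) = \frac{G_\Omega(t, Q)}{N(\Omega, Q)} = \int_t^\infty d\mu_{\Omega, Q}(x). \]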
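The spacing statistic just defined can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes the region to be the triangle $0 \le x \le y \le 1$, so that the visible points correspond to Farey fractions of order $Q$, and it normalizes the gaps so that their mean is exactly 1. The helper names `farey` and `spacing_proportion` are hypothetical.

```python
from math import atan2, pi

def farey(Q):
    """Farey fractions of order Q in [0, 1], as (a, q) pairs in increasing order."""
    seq = [(0, 1)]
    a, b, c, d = 0, 1, 1, Q
    while c <= Q:
        # Standard next-term recurrence for Farey sequences.
        k = (Q + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        seq.append((a, b))
    return seq

def spacing_proportion(Q, t):
    """Proportion of normalized consecutive angle gaps that are >= t."""
    points = farey(Q)
    # Angle of the lattice point (x, y) = (a, q); it decreases from pi/2 to pi/4.
    angles = [atan2(q, a) for a, q in points]
    gaps = [angles[j] - angles[j + 1] for j in range(len(angles) - 1)]
    scale = 4 * len(gaps) / pi  # plays the role of lambda(Omega, Q) = 4N/pi
    return sum(1 for g in gaps if scale * g >= t) / len(gaps)

for t in (0.15, 0.5, 1.0, 2.0):
    print(t, round(spacing_proportion(100, t), 4))
```

For a Poisson-like sequence the printed values would be close to $e^{-t}$; the point of the experiment is to see how far the angles of visible points deviate from that.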

Most of this paper is devoted to the proof of the following result.


Theorem 0.2. For all $t > 0$, the limit $\widetilde G_\Omega(t)$ of $\widetilde G_\Omega(t, Q)$ as $Q \to \infty$ exists and

\[ \widetilde G_\Omega(t) = \frac{1}{\mathrm{Area}(\Omega)} \int_{\pi/4}^{\pi/2} r_\Omega(\alpha)^2 \, \eta_\Omega(t, \alpha) \, d\alpha. \]

In particular, if $M(\Omega) = \sup_\alpha r_\Omega(\alpha)$, then the support of the probability measure $\mu_\Omega$ is included in $\big[ 24\,\mathrm{Area}(\Omega) / (\pi^3 M(\Omega)^2), \infty \big)$.

The situation where $\Omega$ is the triangle $T = \{(x, y) ; \ 0 \le x \le y \le 1\}$ is particularly interesting. In this case the set $F(T, Q)$ of integer points visible from the origin in the $Q$-dilation of $T$ identifies with the set

\[ F_Q = \Big\{ \frac{a}{q} ; \ 1 \le a \le q \le Q, \ \gcd(a, q) = 1 \Big\} \tag{0.1} \]

of Farey points of order $Q$. The existence of a gap beyond zero in the support of the measure $\mu_T$ can be seen by elementary means in this fundamental case, as the difference between consecutive Farey points is

\[ \cot(\theta') - \cot(\theta) = \frac{a'}{q'} - \frac{a}{q} = \frac{1}{qq'} \]

and normalizing by $\mathrm{Area}(QT)/\zeta(2) = 3Q^2/\pi^2$ gives at least $3/\pi^2$, implying a corresponding repulsion between the angles.

Farey sequences have been studied for a long time, mainly because of their rôle in problems related to diophantine approximation. There is also a connection with the Riemann zeta function which motivated their study. To be precise, there is a statement in terms of Farey sequences which is equivalent to the Riemann Hypothesis (see Franel and Landau [7]). Other aspects of the distribution of Farey sequences have been investigated by Franel [6], Hall and Tenenbaum [12, 13], Kanemitsu, Sita Rama Chandra Rao and Siva Rama Sarma [19], Hall [9–11], Huxley [17], Kargaev and Zhigljavsky [20, 21].

The support of the probability measure $\mu_T$ is $[6/\pi^3, \infty)$. The repartition function $\widetilde G_T$ is obtained from a particular case of Proposition 2.1. More precisely, we prove

Theorem 0.3. For $t > 0$ we set $a = 6/(\pi^3 t)$. Then, the repartition function $\widetilde G_T$ of the probability measure $\mu_T$ is given by

\[ \widetilde G_T(t) = 1, \qquad \text{for } a \ge 1, \]

\[ \widetilde G_T(t) = 2\sqrt{2a - 1} - 1 + \pi a \log\frac{e}{a} - 4a \arctan\sqrt{2a - 1} - 4a \int_1^{1/a} \frac1x \arctan\sqrt{\frac{2 - x}{x}} \, dx, \qquad \text{for } \frac12 \le a < 1, \]

\[ \widetilde G_T(t) = -1 + \pi a \log\frac{e}{a} - 4a \int_1^{2} \frac1x \arctan\sqrt{\frac{2 - x}{x}} \, dx, \qquad \text{for } \frac14 \le a < \frac12, \]

\[ \begin{aligned} \widetilde G_T(t) = {} & -1 + \sqrt{1 - 4a} + 2\pi a \log\frac{2\sqrt e}{1 + \sqrt{1 - 4a}} - 4a \int_1^2 \frac1x \arctan\sqrt{\frac{2 - x}{x}} \, dx \\ & - 2 \int_{(1 - \sqrt{1 - 4a})/2}^{(1 + \sqrt{1 - 4a})/2} \sqrt{\frac{(1 - x)(2a - x + x^2)}{x}} \, dx \\ & + 4a \int_{(1 - \sqrt{1 - 4a})/2}^{(1 + \sqrt{1 - 4a})/2} \frac1x \arctan\sqrt{\frac{2a - x + x^2}{x(1 - x)}} \, dx, \qquad \text{for } \frac18 \le a < \frac14, \end{aligned} \]

\[ \begin{aligned} \widetilde G_T(t) = {} & -1 + \sqrt{1 - 4a} + 2\pi a \log\frac{2\sqrt e}{1 + \sqrt{1 - 4a}} - 4a \int_1^2 \frac1x \arctan\sqrt{\frac{2 - x}{x}} \, dx \\ & - 2 \int_{(1 - \sqrt{1 - 4a})/2}^{(1 - \sqrt{1 - 8a})/2} \sqrt{\frac{(1 - x)(2a - x + x^2)}{x}} \, dx - 2 \int_{(1 + \sqrt{1 - 8a})/2}^{(1 + \sqrt{1 - 4a})/2} \sqrt{\frac{(1 - x)(2a - x + x^2)}{x}} \, dx \\ & + 4a \int_{(1 - \sqrt{1 - 4a})/2}^{(1 - \sqrt{1 - 8a})/2} \frac1x \arctan\sqrt{\frac{2a - x + x^2}{x(1 - x)}} \, dx \\ & + 4a \int_{(1 + \sqrt{1 - 8a})/2}^{(1 + \sqrt{1 - 4a})/2} \frac1x \arctan\sqrt{\frac{2a - x + x^2}{x(1 - x)}} \, dx, \qquad \text{for } 0 < a < \frac18. \end{aligned} \]

In particular, the proportion of differences between consecutive angles which are larger than the average is $\widetilde G_T(1) = 0.326856\ldots$, which is smaller than $1/e = 0.36787\ldots$. The graphs of $\widetilde G_T(t)$, compared with $\widetilde G_T(t, 30)$, and of the density $g_T$ of $\mu_T$ are plotted in Figs. 1 and 2. The unit disk $D$ provides yet another interesting example. In this case the kernel $\eta_D$ is independent of $\alpha$. More precisely, we have


Fig. 1. The graph of the repartition functions $\widetilde G_T(t)$ and $\widetilde G_T(t, 30)$

Fig. 2. The graph of the density function $g_T$

Corollary 0.4. For $\Omega = D$, the kernel $\eta_D$ is given by

\[ \eta_D(t, \alpha) = \eta_D(t) = \begin{cases} \dfrac12, & \text{for } 0 < t < \dfrac{3}{\pi^2}, \\[3mm] \dfrac{3}{\pi^2 t}\Big(1 - \log\dfrac{3}{\pi^2 t}\Big) - \dfrac12, & \text{for } \dfrac{3}{\pi^2} < t < \dfrac{12}{\pi^2}, \\[3mm] \dfrac{3}{\pi^2 t} - \dfrac12 + \dfrac12\sqrt{1 - \dfrac{12}{\pi^2 t}} + \dfrac{6}{\pi^2 t} \cdot \log\dfrac{2}{1 + \sqrt{1 - \dfrac{12}{\pi^2 t}}}, & \text{for } t > \dfrac{12}{\pi^2}. \end{cases} \]

Moreover, the repartition function is $\widetilde G_D(t) = 2\eta_D(t)$, the density $g_D$ of $\mu_D$ is

\[ g_D(t) = \begin{cases} 0, & \text{for } 0 < t < \dfrac{3}{\pi^2}, \\[3mm] \dfrac{6}{\pi^2 t^2} \cdot \log\dfrac{\pi^2 t}{3}, & \text{for } \dfrac{3}{\pi^2} < t < \dfrac{12}{\pi^2}, \\[3mm] \dfrac{12}{\pi^2 t^2} \cdot \log\dfrac{2}{1 + \sqrt{1 - \dfrac{12}{\pi^2 t}}}, & \text{for } t > \dfrac{12}{\pi^2}, \end{cases} \]

and the proportion of differences (between consecutive angles) which are larger than the average is

\[ \widetilde G_D(1) = \frac{6}{\pi^2}\Big(1 - \log\frac{3}{\pi^2}\Big) - 1 = 0.331876\ldots. \]

For the graphs of $\widetilde G_D(t)$ and $g_D$ see Figs. 3 and 4.

Fig. 3. The graph of the repartition functions $\widetilde G_D(t)$ and $\widetilde G_D(t, 30)$
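The closed form for the disk can be checked directly. The sketch below encodes the kernel of Corollary 0.4 as we read it, with $u = 3/(\pi^2 t)$; the function names `eta_disk` and `G_disk` are hypothetical, and the third branch is a reading of the printed formula rather than an independent derivation.

```python
from math import log, sqrt, pi

C = 3 / pi ** 2  # lambda / r^2 for the unit disk, i.e. the start of the support

def eta_disk(t: float) -> float:
    """Kernel eta_D(t), written in terms of u = 3 / (pi^2 t)."""
    u = C / t
    if t <= C:
        return 0.5
    if t <= 4 * C:
        return u * (1 - log(u)) - 0.5
    s = sqrt(1 - 4 * u)
    return u - 0.5 + 0.5 * s + 2 * u * log(2 / (1 + s))

def G_disk(t: float) -> float:
    """Repartition function 2 * eta_D(t): limiting proportion of gaps >= t."""
    return 2 * eta_disk(t)

print(G_disk(1.0))  # proportion of gaps larger than the average
```

Evaluating `G_disk` just below and just above $t = 12/\pi^2$ is a quick way to confirm that the two branches match at the breakpoint.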

The above theorems should be compared with other results from the literature on the distribution of various numerical sequences. There are many sequences of interest in number theory which are believed to have a poissonian distribution. In this case, the above function $G$ is exponential: $G(t) = e^{-t}$. Among the very few results of this type we mention the work of Hooley (see [16]) on the distribution of residue classes coprime with a large modulus $q$ and the conditional result of Gallagher [8] on the distribution of prime numbers. Lately, the distribution of primitive roots (mod $p$) was proved in [4] and the distribution of squares modulo highly composite numbers was established by Kurlberg and Rudnick in [23]. The spacings between the energy levels of a two-dimensional harmonic oscillator (see Pandey, Bohigas and Giannoni [26] and Bleher [2, 3]) are essentially those between


Fig. 4. The graph of the density function $g_D$

the numbers $\alpha n$ (mod 1). In this case the consecutive spacings take at most three values, as follows from the work of Sós [29] and Swierczkovski [30]. The distribution of energy levels of a boxed oscillator, however, reduces to the distribution of $\alpha n^2$ (mod 1) (see Berry and Tabor [1]), which is conjectured to be poissonian. In [28] it is proved that this is so for any $\alpha$ satisfying certain diophantine conditions.

Other sequences intensively studied in analytic number theory are the nontrivial zeros of the Riemann zeta function or more general $L$-functions. The distribution of the imaginary parts of zeros of primitive $L$-functions, known not to be poissonian, is believed to be the same as the GUE distribution studied in Random Matrix Theory. Significant work in this area was done by Montgomery [25], Rudnick and Sarnak [27] and Katz and Sarnak [22]. One striking difference between the GUE and the poissonian distribution is that the density function $g$ vanishes at the origin in the GUE case whereas $g(0) = 1$ in the poissonian case. For this reason it is said that in the poissonian case we have "level clustering" while in the GUE case we have "level repulsion". In the problem concerning the distribution of visible points we have an even stronger repulsion: the visible points are "well spaced", and the density function $g_\Omega$ vanishes on a whole interval beyond the origin. From our results it follows that the strongest repulsion, in the sense of the largest gap beyond 0 for $g_\Omega$, occurs when $\Omega$ is the (unit) disk.

1. The Case of the Angular Sector

1.1. Generalities on Farey fractions. Let us record first, for the convenience of the reader, some basic properties of Farey series. For each integer $Q \ge 1$, let $F_Q$ denote the set, defined in (0.1), of irreducible fractions from $(0, 1]$ whose denominators do not exceed $Q$. The number of terms in the Farey series of order $Q$ is

\[ \# F_Q = \varphi(1) + \dots + \varphi(Q) = \frac{Q^2}{2\zeta(2)} + O(Q \log Q). \]
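As a small sanity check of the count above and of the identity $a'/q' - a/q = 1/(qq')$ for consecutive Farey points recalled in the introduction, one can enumerate $F_Q$ directly. This is a minimal sketch; the helper names `totient`, `farey_size` and `farey` are hypothetical.

```python
from math import gcd, pi

def totient(n: int) -> int:
    """Euler's phi function by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def farey_size(Q: int) -> int:
    """#F_Q = phi(1) + ... + phi(Q), the number of fractions in (0, 1]."""
    return sum(totient(q) for q in range(1, Q + 1))

def farey(Q: int):
    """Fractions of F_Q in (0, 1] as (a, q) pairs, in increasing order."""
    seq = []
    a, b, c, d = 0, 1, 1, Q
    while c <= Q:
        k = (Q + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        seq.append((a, b))
    return seq

Q = 50
fractions = farey(Q)
main_term = 3 * Q * Q / pi ** 2  # Q^2 / (2 zeta(2))
# a'q - aq' = 1 for neighbours is the same as a'/q' - a/q = 1/(qq').
neighbours_ok = all(a2 * q1 - a1 * q2 == 1
                    for (a1, q1), (a2, q2) in zip(fractions, fractions[1:]))
print(len(fractions), farey_size(Q), round(main_term, 1), neighbours_ok)
```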

Let us recall now some basic features of successive terms in FQ , collected in the following result from [14, Theorems 28, 30] (see also [9, Lemma 1], [24, Theorem 9.3]).


Theorem 1.1. (i) If $a/q$ and $a'/q'$ are two successive terms in $F_Q$, then $a'q - aq' = 1$ and $q + q' > Q$.
(ii) If $1 \le \max(q, q') \le Q < q + q'$ and $a'q - aq' = 1$ with $0 \le a' < q'$, $0 \le a < q$, then $a/q$ and $a'/q'$ are successive terms in $F_Q$.

We note the following consequence of Theorem 1.1.

Corollary 1.2. If $c \in (0, 1]$, then

\[ \# \{ \gamma \in F_Q ; \ \gamma \le c \} = \# \left\{ (q_1, q_2) ; \ \begin{array}{l} 1 \le q_1, q_2 \le Q, \ q_1 + q_2 > Q, \ \gcd(q_1, q_2) = 1, \\ \exists\, a_1 \in \{0, 1, \dots, q_1 - 1\}, \ \exists\, a_2 \in \{0, 1, \dots, q_2 - 1\} \ \text{s.t.} \\ a_2 q_1 - a_1 q_2 = 1, \ a_2/q_2 \le c \end{array} \right\}. \]

1.2. A counting formula. Throughout this paper we shall denote by $T_{\alpha, \beta, c}$ the region bounded by the three lines $y = x\tan\alpha$, $y = x\tan\beta$ and $y = c$, with $\pi/4 \le \alpha < \beta \le \pi/2$ and $c > 0$. In the first two sections we will take $\Omega = T_{\alpha, \beta, 1}$. The elements of $F(\Omega, Q)$, identified with rational numbers as explained in the introduction, are ordered as $\cot\beta \le \gamma_1 < \gamma_2 < \dots < \gamma_N \le \cot\alpha$, where

\[ N = N(T_{\alpha, \beta, 1}, Q) = \# F(T_{\alpha, \beta, 1}, Q) \sim \frac{Q^2 (\cot\alpha - \cot\beta)}{2\zeta(2)}. \]

The corresponding angles $\theta(\gamma_j) = \operatorname{Arccot} \gamma_j$ satisfy $\beta \ge \theta(\gamma_1) > \theta(\gamma_2) > \dots > \theta(\gamma_N) \ge \alpha$.

Throughout Sects. 2 and 3, we will consider a fixed $\lambda > 0$ and a fixed sequence $(\lambda(Q))_Q$ such that $\lambda(Q) \sim \lambda Q^2$. We also consider a fixed $t > 0$ and set

\[ \gamma(Q) = \gamma(t, Q) = \cot\frac{t}{\lambda(Q)} \sim \frac{\lambda Q^2}{t} = 2aQ^2, \qquad \text{where } a = \frac{\lambda}{2t}. \tag{1.1} \]

We will estimate

\[ G(\lambda, \alpha, \beta, t, Q) = \# \Big\{ \gamma_j \in F(T_{\alpha, \beta, 1}, Q) ; \ \theta(\gamma_j) - \theta(\gamma_{j+1}) \ge \frac{t}{\lambda(Q)} \Big\} \]

with the aim of proving that the limit

\[ G(\lambda, \alpha, \beta, t) = \lim_Q \frac{1}{Q^2} G(\lambda, \alpha, \beta, t, Q) \]

exists and computing it. The first thing is to observe that, if we let $G(\lambda, \alpha, t, Q) = G(\lambda, \alpha, \pi/2, t, Q)$,


then this problem reduces to proving that

\[ G(\lambda, \alpha, t) = \lim_Q \frac{1}{Q^2} G(\lambda, \alpha, t, Q) \]

exists. This is immediate once we note the trivial inequality

\[ \big| G(\lambda, \alpha, \beta, t, Q) + G(\lambda, \beta, t, Q) - G(\lambda, \alpha, t, Q) \big| \le 2. \]

Moreover, we also have $G(\lambda, \alpha, \beta, t) = G(\lambda, \alpha, t) - G(\lambda, \beta, t)$. We now turn to $G(\lambda, \alpha, t, Q)$. Using the characterization of successive Farey fractions from the previous subsection we prove

Proposition 1.3. $G(\lambda, \alpha, t, Q) = \sum_{q=1}^{Q} M(\lambda, \alpha, t, Q, q)$, where $M(\lambda, \alpha, t, Q, q)$ denotes the number of elements of the set of integers $x$ for which $\gcd(x, q) = 1$, $Q - q < x \le Q$ and there exists $y \in \{0, 1, \dots, q - 1\}$ such that

\[ xy = 1 \pmod q, \qquad qx + \frac{y(yx - 1)}{q} \le \gamma(Q), \qquad q \ge y\tan\alpha. \]

Proof. As mentioned in the introduction, we associate to each point $(a, q) \in F(\Omega, Q)$ the order $Q$ Farey fraction $\gamma = a/q$. The cotangent of the angle $\theta(\gamma)$ formed by the ray through $(a, q)$ and the $x$-axis is given by $\cot\theta(\gamma) = a/q = \gamma$. It is now clear that the angles corresponding to elements from $F(\Omega, Q)$ are in one-to-one correspondence with the Farey fractions $a/q$ of order $Q$ that satisfy $\cot\alpha \ge a/q$, and that $\gamma < \gamma'$ if and only if $\theta(\gamma) > \theta(\gamma')$. The formula

\[ \tan(x - y) = \frac{\cot y - \cot x}{1 + \cot x \cot y} \]

and the characterization of consecutive Farey fractions of order $Q$ show that if $\theta(\gamma_j) > \theta(\gamma_{j+1})$ are consecutive angles corresponding to the elements $(a_j, q_j)$ and $(a_{j+1}, q_{j+1})$ from $F(\Omega, Q)$, then the inequality

\[ \theta(\gamma_j) - \theta(\gamma_{j+1}) \ge \frac{t}{\lambda(Q)} \]

is equivalent to

\[ \gamma(Q) = \cot\frac{t}{\lambda(Q)} \ge \frac{1 + \gamma_j \gamma_{j+1}}{\gamma_{j+1} - \gamma_j} = \frac{1 + \dfrac{a_j a_{j+1}}{q_j q_{j+1}}}{\dfrac{a_{j+1}}{q_{j+1}} - \dfrac{a_j}{q_j}} = q_j q_{j+1} + a_j a_{j+1}. \tag{1.2} \]
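The last equality in (1.2) says that the cotangent of the angle between two consecutive visible points equals $q_j q_{j+1} + a_j a_{j+1}$. A short numerical sketch, assuming the triangle case where the points are Farey fractions of order $Q$ (the helper `farey` is a hypothetical name), confirms this:

```python
from math import atan2, tan

def farey(Q: int):
    """Farey fractions of order Q in [0, 1] as (a, q) pairs, increasing."""
    seq = [(0, 1)]
    a, b, c, d = 0, 1, 1, Q
    while c <= Q:
        k = (Q + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        seq.append((a, b))
    return seq

def max_relative_error(Q: int) -> float:
    """Largest relative gap between cot(theta_j - theta_{j+1}) and q q' + a a'."""
    worst = 0.0
    pts = farey(Q)
    for (a1, q1), (a2, q2) in zip(pts, pts[1:]):
        theta1 = atan2(q1, a1)  # angle of the lattice point (a1, q1)
        theta2 = atan2(q2, a2)
        lhs = 1.0 / tan(theta1 - theta2)
        rhs = q1 * q2 + a1 * a2
        worst = max(worst, abs(lhs - rhs) / rhs)
    return worst

err = max_relative_error(20)
print(err)
```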


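As a result, we have to count the number of consecutive Farey fractions $\gamma_j = a_j/q_j < a_{j+1}/q_{j+1} = \gamma_{j+1}$ of order $Q$ that satisfy $\cot\alpha \ge a_j/q_j$ and (1.2). Taking into account Corollary 1.2 we may now write

\[ \begin{aligned} G(\lambda, \alpha, t, Q) &= \# \left\{ (q_1, q_2) \in \mathbb{Z}^2 ; \ \begin{array}{l} 1 \le q_1, q_2 \le Q, \ q_1 + q_2 > Q, \ \gcd(q_1, q_2) = 1, \\ \exists\, a_1 \in \{0, 1, \dots, q_1 - 1\}, \ \exists\, a_2 \in \{0, 1, \dots, q_2 - 1\} \ \text{s.t.} \\ \cot\alpha \ge a_2/q_2, \ a_2 q_1 - a_1 q_2 = 1, \ q_1 q_2 + a_1 a_2 \le \gamma(Q) \end{array} \right\} \\ &= \sum_{q=1}^{Q} \# \left\{ x \in \mathbb{Z} ; \ \begin{array}{l} Q - q < x \le Q, \ \gcd(x, q) = 1, \\ \exists\, a_1 \in \{0, 1, \dots, x - 1\}, \ \exists\, a_2 \in \{0, 1, \dots, q - 1\} \ \text{s.t.} \\ \cot\alpha \ge a_2/q, \ x a_2 - q a_1 = 1, \ qx + a_1 a_2 \le \gamma(Q) \end{array} \right\} \\ &= \sum_{q=1}^{Q} M(\lambda, \alpha, t, Q, q). \end{aligned} \]

In the sequel, we shall often refer to the functions

\[ g(y) = \frac{y + q\gamma(Q)}{y^2 + q^2}, \qquad f(y) = \min\{Q, g(y)\}, \qquad h(y) = \max\{f(y) - Q + q, 0\}. \tag{1.3} \]

The derivative

\[ g'(y) = \frac{-y^2 - 2q\gamma(Q) y + q^2}{(q^2 + y^2)^2} \]

has zeroes $y_{1,2} = q\big( -\gamma(Q) \pm \sqrt{\gamma(Q)^2 + 1} \big)$, hence $y_1 < 0 < y_2$. Moreover, (1.1) shows that for $Q$ large

\[ y_2 = \frac{q}{\gamma(Q) + \sqrt{\gamma(Q)^2 + 1}} < \frac{q}{2\gamma(Q)} \le \frac{Q}{2\gamma(Q)} \sim \frac{1}{4aQ}, \tag{1.4} \]

so $g$ is monotonically decreasing on $[1, \infty)$. We will consider, for any interval $I$, the incomplete Kloosterman sum

\[ S_I(m, n, q) = \sum_{\substack{x \in I \\ \gcd(x, q) = 1}} e\Big( \frac{mx + n\bar x}{q} \Big), \tag{1.5} \]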

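where $\bar x$ denotes the inverse of $x$ (mod $q$). We also let

\[ \begin{aligned} S_2(Q, q) = S_2(\lambda, \alpha, t, Q, q) &= \frac1q \sum_{0 < y \le q\cot\alpha} \ \sum_{\substack{Q - q < x \le f(y) \\ \gcd(x, q) = 1}} \ \sum_{k=1}^{q-1} e\Big( \frac{k(y - \bar x)}{q} \Big) \\ &= \frac1q \sum_{k=1}^{q-1} \ \sum_{0 < y \le q\cot\alpha} e\Big( \frac{ky}{q} \Big) S_{(Q - q, f(y)]}(0, -k, q), \end{aligned} \tag{1.6} \]

\[ \begin{aligned} S_1(Q, q) = S_1(\lambda, \alpha, t, Q, q) &= \frac1q \, \# \big\{ (x, y) \in \mathbb{Z}^2 ; \ 0 < y \le q\cot\alpha, \ Q - q < x \le f(y), \ \gcd(x, q) = 1 \big\} \\ &= \frac1q \sum_{0 < y \le q\cot\alpha} \ \sum_{\substack{Q - q < x \le f(y) \\ \gcd(x, q) = 1}} 1. \end{aligned} \tag{1.7} \]

With this notation at hand we prove

Lemma 1.4. For all $1 \le q \le Q$ we have $M(\lambda, \alpha, t, Q, q) = S_1(Q, q) + S_2(Q, q)$.

Proof. For each $x, y \in \mathbb{Z}$, we set

\[ \delta(x, y) = \begin{cases} 1 & \text{if } xy = 1 \pmod q, \\ 0 & \text{if } xy \ne 1 \pmod q. \end{cases} \]

If $\gcd(x, q) = 1$, then

\[ \delta(x, y) = \frac1q \sum_{k=1}^{q} e\Big( \frac{k(y - \bar x)}{q} \Big). \]

As a result, we obtain

\[ \begin{aligned} M(\lambda, \alpha, t, Q, q) &= \sum_{0 < y \le q\cot\alpha} \ \sum_{\substack{Q - q < x \le f(y) \\ \gcd(x, q) = 1}} \delta(x, y) = \frac1q \sum_{0 < y \le q\cot\alpha} \ \sum_{\substack{Q - q < x \le f(y) \\ \gcd(x, q) = 1}} \ \sum_{k=1}^{q} e\Big( \frac{k(y - \bar x)}{q} \Big) \\ &= S_1(Q, q) + S_2(Q, q). \end{aligned} \]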
1.3. Incomplete Kloosterman sums and a certain counting function. To show that the contribution of $S_2(Q, q)$ to $G(\lambda, \alpha, t, Q)$ is negligible, we need an upper bound for incomplete Kloosterman sums. This will be derived in Lemma 1.6 from the following estimate for complete Kloosterman sums, proved by Estermann (see [5]), using Weil's estimates [31] for prime moduli. We denote by $S(m, n, q)$ the complete Kloosterman sum

\[ S(m, n, q) = \sum_{\substack{1 \le x \le q \\ \gcd(x, q) = 1}} e\Big( \frac{mx + n\bar x}{q} \Big). \]

Lemma 1.5 ([5]). Let $q, m, n$ be integers, with $q \ge 2$. Then

\[ |S(m, n, q)| \ll \sigma_0(q) \gcd(m, n, q)^{1/2} q^{1/2}. \]

Here and in the following we use the notation $\sigma_r(q) = \sum_{d \mid q} d^r$. For incomplete Kloosterman sums we have the following result.


Lemma 1.6. Let $q, m, n$ be integers, with $q \ge 2$, and let $I$ be a subinterval of $[1, q]$. Then

\[ |S_I(m, n, q)| \ll \sigma_0(q) \gcd(n, q)^{1/2} q^{1/2} (2 + \log q). \]

The proof uses standard arguments and will be relegated to the Appendix. For $I_1$ and $I_2$ intervals we will consider the following counting function

\[ N_q(I_1, I_2) = \# \big\{ (x, y) \in I_1 \times I_2 ; \ xy = 1 \pmod q \big\}. \]

The proof of the next result relies on Lemma 1.6. Actually the corresponding estimate for $S(0, -k, q)$ in Lemma 1.5, obtained by Hooley ([15]), will suffice. We are grateful to G. Greaves for bringing to our attention reference [15].

Lemma 1.7. Suppose that $I_1$ and $I_2$ are subintervals of $[1, q]$. Then

\[ N_q(I_1, I_2) = \frac{\varphi(q)}{q^2} |I_1| \, |I_2| + O\big( \sigma_0(q) \sigma_{-1/2}(q) q^{1/2} \log^2 q \big). \]

The proof of this lemma will also be presented in the Appendix.
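To get a feeling for the size of the sums in Lemmas 1.5 and 1.6, one can evaluate complete Kloosterman sums by brute force. The sketch below assumes $e(z) = \exp(2\pi i z)$ and, instead of the general bound with its implied constant, checks only Weil's classical bound $2\sqrt{p}$ for prime moduli $p$ not dividing $mn$; the name `kloosterman` is a hypothetical helper.

```python
import cmath
from math import gcd, pi, sqrt

def kloosterman(m: int, n: int, q: int) -> complex:
    """Complete Kloosterman sum S(m, n, q) by direct summation."""
    total = 0j
    for x in range(1, q + 1):
        if gcd(x, q) == 1:
            xbar = pow(x, -1, q)  # inverse of x modulo q (Python 3.8+)
            total += cmath.exp(2j * pi * (m * x + n * xbar) / q)
    return total

# Largest value of |S(m, n, p)| / (2 sqrt(p)) over small primes p and p not dividing mn.
worst_ratio = 0.0
for p in (3, 5, 7, 11, 13):
    for m in range(1, p):
        for n in range(1, p):
            worst_ratio = max(worst_ratio, abs(kloosterman(m, n, p)) / (2 * sqrt(p)))
print(round(worst_ratio, 4))

# S(0, -k, q) is a Ramanujan sum; for q = 7 and k = 1 it equals -1.
print(kloosterman(0, -1, 7))
```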

1.4. An estimate for $\sum_{q=1}^{Q} S_2(Q, q)$. We prove next that the contribution of $S_2$ to $M(\lambda, \alpha, t, Q, q)$ is negligible.

Lemma 1.8. $\sum_{q=1}^{Q} S_2(Q, q) = o(Q^2)$.

Proof. We can replace $f(y)$ by $\max\{Q - q, f(y)\}$ in (1.6), thus assume $Q \ge f(y) \ge Q - q$, $y \in (0, q\cot\alpha]$. We fix $0 < \varepsilon < 1/4$, denote $l(q) = q^\varepsilon$ and divide $(0, q\cot\alpha]$ into the union of $l(q)$ disjoint intervals $I_1, \dots, I_{l(q)}$ with $|I_1| = \dots = |I_{l(q)}| = q\cot\alpha / l(q)$. Then, it is clear that $\sum_{q=1}^{Q} q / l(q) = o(Q^2)$. We also let

\[ m_j = \inf_{I_j} f(y), \qquad M_j = \sup_{I_j} f(y), \]

\[ A(Q, q) = \frac1q \sum_{j=1}^{l(q)} \ \sum_{y \in I_j} \ \sum_{\substack{Q - q < x \le m_j \\ \gcd(x, q) = 1}} \ \sum_{k=1}^{q-1} e\Big( \frac{k(y - \bar x)}{q} \Big). \]

With the estimate for $E$ from the proof of Lemma 1.7 we get

\[ \sum_{q=1}^{Q} A(Q, q) \ll \sum_{q=1}^{Q} l(q) \sigma_0(q) \sigma_{-1/2}(q) q^{1/2} \log^2 q \ll \sum_{q=1}^{Q} l(q) q^{1 - 2\varepsilon} \log^2 q \ll \sum_{q=1}^{Q} q^{1 - \varepsilon} \log^2 q = o(Q^2). \tag{1.8} \]


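It is clear that

\[ \big| S_2(Q, q) - A(Q, q) \big| \le \frac1q \sum_{j=1}^{l(q)} \ \sum_{\substack{x \in [m_j, M_j], \ y \in I_j \\ \gcd(x, q) = 1}} \Big| \sum_{k=1}^{q-1} e\Big( \frac{k(y - \bar x)}{q} \Big) \Big| \le S_I(q) + S_{II}(q), \tag{1.9} \]

where

\[ S_I(q) = \sum_{j=1}^{l(q)} N_q\big( [m_j, M_j], I_j \big), \qquad S_{II}(q) = \frac1q \sum_{j=1}^{l(q)} \ \sum_{\substack{x \in [m_j, M_j], \ y \in I_j \\ \gcd(x, q) = 1}} 1. \]

We clearly have

\[ \sum_{q=1}^{Q} S_{II}(q) \le \sum_{q=1}^{Q} \frac1q \sum_{j=1}^{l(q)} |I_j| (M_j - m_j) \le \sum_{q=1}^{Q} \frac1q \cdot \frac{q\cot\alpha}{l(q)} \cdot q = o(Q^2). \tag{1.10} \]

On the other hand, Lemma 1.7 provides

\[ S_I(q) \ll \sum_{j=1}^{l(q)} \frac1q (M_j - m_j) |I_j| + q^{1 - \varepsilon} = O\Big( \frac{q}{l(q)} + q^{1 - \varepsilon} \Big) = O(q^{1 - \varepsilon}), \]

and so

\[ \sum_{q=1}^{Q} S_I(q) = O\Big( \sum_{q=1}^{Q} q^{1 - \varepsilon} \Big) = o(Q^2). \tag{1.11} \]

The statement follows from (1.9), (1.8), (1.10) and (1.11).

2. Estimates for $\sum_{q=1}^{Q} S_1(Q, q)$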

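In this section we shall estimate $Q^{-2} \sum_{q=1}^{Q} S_1(Q, q)$, which gives the main term in $Q^{-2} G(\lambda, \alpha, t, Q)$. Before stating the main result of this section, let us define the following ten functions:

\[ I_0(\alpha, a) = a\sin 2\alpha - \frac{\cot\alpha}{2} - (\pi - 2\alpha)\, a \log(2a\sin^2\alpha), \]

\[ I(\alpha, a) = a\sin 2\alpha - \frac{\cot\alpha}{2} + \frac{\cot\alpha}{2}\sqrt{1 - 8a\sin^2\alpha} + 2a(\pi - 2\alpha) \log\frac{2}{1 + \sqrt{1 - 8a\sin^2\alpha}}, \]

\[ I_1(\alpha, a) = a \int_{2\sin^2\alpha}^{1/a} \sqrt{\frac{2 - x}{x}} \, dx = \sqrt{2a - 1} - a\sin 2\alpha - 2a\arctan\sqrt{2a - 1} + a(\pi - 2\alpha), \]

\[ I_2(\alpha, a) = -2a \int_{2\sin^2\alpha}^{1/a} \frac1x \arctan\sqrt{\frac{2 - x}{x}} \, dx, \]

\[ I_3(\alpha, a) = a \int_{2\sin^2\alpha}^{2} \sqrt{\frac{2 - x}{x}} \, dx = -a\sin 2\alpha + a(\pi - 2\alpha), \]

\[ I_4(\alpha, a) = -2a \int_{2\sin^2\alpha}^{2} \frac1x \arctan\sqrt{\frac{2 - x}{x}} \, dx, \]

\[ I_5(\alpha, a) = - \int_{(1 - \sqrt{1 - 8a\sin^2\alpha})/2}^{(1 + \sqrt{1 - 8a\sin^2\alpha})/2} \sqrt{\frac{(1 - x)(2a - x + x^2)}{x}} \, dx, \]

\[ I_6(\alpha, a) = 2a \int_{(1 - \sqrt{1 - 8a\sin^2\alpha})/2}^{(1 + \sqrt{1 - 8a\sin^2\alpha})/2} \frac1x \arctan\sqrt{\frac{2a - x + x^2}{x(1 - x)}} \, dx, \]

\[ I_7(\alpha, a) = - \int_{(1 - \sqrt{1 - 8a\sin^2\alpha})/2}^{(1 - \sqrt{1 - 8a})/2} \sqrt{\frac{(1 - x)(2a - x + x^2)}{x}} \, dx - \int_{(1 + \sqrt{1 - 8a})/2}^{(1 + \sqrt{1 - 8a\sin^2\alpha})/2} \sqrt{\frac{(1 - x)(2a - x + x^2)}{x}} \, dx, \]

\[ I_8(\alpha, a) = 2a \int_{(1 - \sqrt{1 - 8a\sin^2\alpha})/2}^{(1 - \sqrt{1 - 8a})/2} \frac1x \arctan\sqrt{\frac{2a - x + x^2}{x(1 - x)}} \, dx + 2a \int_{(1 + \sqrt{1 - 8a})/2}^{(1 + \sqrt{1 - 8a\sin^2\alpha})/2} \frac1x \arctan\sqrt{\frac{2a - x + x^2}{x(1 - x)}} \, dx. \]

We will prove the following result: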


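Proposition 2.1. Suppose that $\lambda > 0$, $\lambda(Q) \sim \lambda Q^2$, $t > 0$, $\alpha \in [\pi/4, \pi/2)$ and denote as before

\[ G(\lambda, \alpha, t, Q) = \# \Big\{ \gamma_j \in F(T_{\alpha, \frac{\pi}{2}, 1}, Q) ; \ \theta(\gamma_j) - \theta(\gamma_{j+1}) \ge \frac{t}{\lambda(Q)} \Big\}. \]

Then, as $Q \to \infty$,

\[ G(\lambda, \alpha, t, Q) \sim Q^2 G(\lambda, \alpha, t) = Q^2 \, \widetilde G\Big( \alpha, \frac{\lambda}{2t} \Big), \]

where

\[ \widetilde G(\alpha, a) = \frac{1}{\zeta(2)} \begin{cases} \dfrac{\cot\alpha}{2}, & \text{if } a \ge \dfrac{1}{2\sin^2\alpha}, \\[3mm] I_0(\alpha, a) + I_1(\alpha, a) + I_2(\alpha, a), & \text{if } \dfrac{1}{2\sin^2\alpha} > a \ge \dfrac12, \\[3mm] I_0(\alpha, a) + I_3(\alpha, a) + I_4(\alpha, a), & \text{if } \dfrac12 > a \ge \dfrac{1}{8\sin^2\alpha}, \\[3mm] I(\alpha, a) + I_3(\alpha, a) + I_4(\alpha, a) + I_5(\alpha, a) + I_6(\alpha, a), & \text{if } \dfrac{1}{8\sin^2\alpha} > a \ge \dfrac18, \\[3mm] I(\alpha, a) + I_3(\alpha, a) + I_4(\alpha, a) + I_7(\alpha, a) + I_8(\alpha, a), & \text{if } \dfrac18 > a > 0. \end{cases} \]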
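The pieces of the case formula in Proposition 2.1 should agree at the breakpoints. Using only the closed-form functions, as we read them from the definitions at the start of this section, the sketch below checks three such matchings for one sample angle. The function names and the sample value $\alpha = 1$ are assumptions made for the illustration; the integrals $I_2$, $I_4$, $I_5$, $I_6$ are not evaluated, since at the breakpoints used here they are either shared by both sides or taken over an interval of length zero.

```python
from math import sin, cos, log, sqrt, atan, pi

def cot(x: float) -> float:
    return cos(x) / sin(x)

def I0(alpha: float, a: float) -> float:
    return (a * sin(2 * alpha) - cot(alpha) / 2
            - (pi - 2 * alpha) * a * log(2 * a * sin(alpha) ** 2))

def I1(alpha: float, a: float) -> float:
    r = sqrt(max(2 * a - 1, 0.0))  # closed form, meaningful for a >= 1/2
    return r - a * sin(2 * alpha) - 2 * a * atan(r) + a * (pi - 2 * alpha)

def I3(alpha: float, a: float) -> float:
    return -a * sin(2 * alpha) + a * (pi - 2 * alpha)

def I_main(alpha: float, a: float) -> float:
    s = sqrt(max(1 - 8 * a * sin(alpha) ** 2, 0.0))
    return (a * sin(2 * alpha) - cot(alpha) / 2 + cot(alpha) / 2 * s
            + 2 * a * (pi - 2 * alpha) * log(2 / (1 + s)))

alpha = 1.0                           # sample angle in [pi/4, pi/2)
a_top = 1 / (2 * sin(alpha) ** 2)     # first breakpoint: I1 and I2 vanish here
a_low = 1 / (8 * sin(alpha) ** 2)     # third breakpoint: I5 and I6 vanish here

gap_top = I0(alpha, a_top) + I1(alpha, a_top) - cot(alpha) / 2
gap_mid = I1(alpha, 0.5) - I3(alpha, 0.5)      # at a = 1/2 the upper limit 1/a equals 2
gap_low = I_main(alpha, a_low) - I0(alpha, a_low)
print(gap_top, gap_mid, gap_low)
```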

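2.1. Two elementary summation formulae. In the process of estimating $\sum_{q=1}^{Q} S_1(Q, q)$, the following two standard lemmas will be used.

Lemma 2.2. Suppose that $0 < a < b$ are real numbers, $q \in \mathbb{N}^*$ and $f$ is piecewise $C^1$ on $[a, b]$. Then

\[ \sum_{\substack{a < k \le b \\ \gcd(k, q) = 1}} f(k) = \frac{\varphi(q)}{q} \int_a^b f + O\Big( \sigma_0(q) \Big( \|f\|_\infty + \int_a^b |f'| \Big) \Big). \]

Proof. It is clear that it suffices to assume $a$, $b$ integers and prove a similar estimate for

\[ S_N = \sum_{\substack{k = 1 \\ \gcd(k, q) = 1}}^{N} f(k) = \sum_{k=1}^{N} \ \sum_{\substack{d \mid k \\ d \mid q}} \mu(d) f_d\Big( \frac{k}{d} \Big), \]

where $f_d(x) = f(dx)$ and $\mu$ denotes the Möbius function. The estimate is well-known for $q = 1$, so we may consider $q > 1$ and write

\[ S_N = \sum_{d \mid q} \ \sum_{l=0}^{[N/d]} \mu(d) f_d(l). \]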


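From the Euler–MacLaurin summation formula

\[ \sum_{l=0}^{[N/d]} f_d(l) = \int_0^{N/d} f_d - \int_{[N/d]}^{N/d} f_d + \frac12 \Big( f_d(0) + f_d\Big( \Big[ \frac{N}{d} \Big] \Big) \Big) + \int_0^{[N/d]} f_d' \, \eta, \]

where $\eta(x) = x - [x] - \frac12$. The first integral is equal to $d^{-1} \int_0^N f$. The second one is dominated by $\|f\|_\infty$ and so is the third term. We also have

\[ \Big| \int_0^{[N/d]} f_d' \, \eta \Big| = \Big| \int_0^{d[N/d]} f'(x) \, \eta\Big( \frac{x}{d} \Big) \, dx \Big| \le \int_0^N |f'(x)| \, dx. \]

Therefore, we end up with

\[ S_N = \sum_{d \mid q} \frac{\mu(d)}{d} \int_0^N f + O\Big( \sigma_0(q) \Big( \|f\|_\infty + \int_0^N |f'| \Big) \Big). \]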

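Lemma 2.3. Suppose that $0 < a < b$ are real numbers, $q \in \mathbb{N}^*$ and $f$ is piecewise $C^1$ on $[a, b]$. Then

\[ \sum_{a < k \le b} \frac{\varphi(k)}{k} f(k) = \frac{1}{\zeta(2)} \int_a^b f + O\Big( \log b \, \Big( \|f\|_\infty + \int_a^b |f'| \Big) \Big). \]

Proof. It suffices to prove this assertion for $a = 0$, $b = N$, in which case we write

\[ \sum_{k=1}^{N} \frac{\varphi(k)}{k} f(k) = \sum_{k=1}^{N} f(k) \sum_{d \mid k} \frac{\mu(d)}{d} = \sum_{d=1}^{N} \frac{\mu(d)}{d} \sum_{l=1}^{[N/d]} f_d(l). \]

This equals further (using again Euler–MacLaurin's formula)

\[ \sum_{d=1}^{N} \frac{\mu(d)}{d^2} \int_0^N f + O\Big( \Big( \|f\|_\infty + \int_0^N |f'| \Big) \log N \Big) = \frac{1}{\zeta(2)} \int_0^N f + O\Big( \Big( \|f\|_\infty + \int_0^N |f'| \Big) \log N \Big). \]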
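Lemma 2.3 can be tested numerically on simple weights. The sketch below compares both sides for $f(x) = 1$ and $f(x) = x$ on $(0, N]$; the sieve helper `totients_upto` and the choice $N = 2000$ are assumptions made only for this illustration.

```python
from math import pi

def totients_upto(N: int):
    """Euler phi(k) for 0 <= k <= N, computed with a simple sieve."""
    phi = list(range(N + 1))
    for p in range(2, N + 1):
        if phi[p] == p:  # p is prime: no smaller prime has touched it
            for k in range(p, N + 1, p):
                phi[k] -= phi[k] // p
    return phi

N = 2000
phi = totients_upto(N)
inv_zeta2 = 6 / pi ** 2

# f(x) = 1: the integral over [0, N] is N.
lhs_const = sum(phi[k] / k for k in range(1, N + 1))
rhs_const = inv_zeta2 * N

# f(x) = x: the integral over [0, N] is N^2 / 2.
lhs_linear = sum(phi[k] for k in range(1, N + 1))
rhs_linear = inv_zeta2 * N * N / 2

print(lhs_const / rhs_const, lhs_linear / rhs_linear)
```

Both ratios should be close to 1, with a deviation compatible with the $O(\log N)$ and $O(N \log N)$ error terms respectively.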


2.2. Approximating $Q^{-2} G(\lambda, \alpha, t, Q)$ by an integral. As we have seen earlier, $S_2$ brings no contribution to the main term of $Q^{-2} G(\lambda, \alpha, t, Q)$. As a result, we only have to analyse the contribution of

\[ S_1(Q, q) = \frac1q \sum_{0 < y \le q\cot\alpha} \ \sum_{\substack{Q - q < x \le f(y) \\ \gcd(x, q) = 1}} 1 \]

to $Q^{-2} G(\lambda, \alpha, t, Q)$. For this purpose, we will approximate the above sums by corresponding integrals. The errors will be controlled by Lemmas 2.2 and 2.3. We shall be interested in the roots $y_0(q)$ and $y_1(q)$ of the equations $g(y) = Q$ and respectively $g(y) = Q - q$ (if they exist) for $y \in [1, q\cot\alpha]$. Since $g$ is decreasing, one has

(a) If $g(q\cot\alpha) \ge Q$, then $f(y) = Q$ and $h(y) = q$ for all $y \in [1, q\cot\alpha]$.

(b) If $g(1) \ge Q \ge g(q\cot\alpha) \ge Q - q$, then $1 \le y_0(q) \le q\cot\alpha$, thus $f(y) \ge Q - q$ and
\[ h(y) = \begin{cases} q, & \text{for } y \in [1, y_0(q)], \\ g(y) - Q + q, & \text{for } y \in [y_0(q), q\cot\alpha]. \end{cases} \]

(c) If $g(1) \ge Q$ and $Q - q > g(q\cot\alpha)$, then $1 \le y_0(q) < y_1(q) < q\cot\alpha$ and
\[ h(y) = \begin{cases} q, & \text{for } y \in [1, y_0(q)], \\ g(y) - Q + q, & \text{for } y \in [y_0(q), y_1(q)], \\ 0, & \text{for } y \in [y_1(q), q\cot\alpha]. \end{cases} \]

(d) If $Q > g(1)$ and $g(q\cot\alpha) > Q - q$, then $Q > g(y) > Q - q$ for all $y \in [1, q\cot\alpha]$. So $h(y) = g(y) - Q + q$ for all $y \in [1, q\cot\alpha]$.

(e) If $Q > g(1) > Q - q > g(q\cot\alpha)$, then $f(y) = g(y)$ and
\[ h(y) = \begin{cases} g(y) - Q + q, & \text{for } y \in [1, y_1(q)], \\ 0, & \text{for } y \in [y_1(q), q]. \end{cases} \]

(f) If $Q - q > g(1)$, then $f(y) = g(y) < Q - q$ and $h(y) = 0$ for all $y \in [1, q\cot\alpha]$.


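The equality

\[ g(q\cot\alpha) = \frac{\cot\alpha + \gamma(Q)}{q(1 + \cot^2\alpha)} \]

shows that

\[ g(q\cot\alpha) \ge Q \iff q \le a(Q), \]

where we let

\[ a(Q) = \frac{\cot\alpha + \gamma(Q)}{Q(1 + \cot^2\alpha)} \sim \frac{2aQ}{1 + \cot^2\alpha} = 2aQ\sin^2\alpha = a_0 Q, \qquad a_0 = 2a\sin^2\alpha, \quad a = \frac{\lambda}{2t}. \]

We also have

\[ g(q\cot\alpha) \ge Q - q \iff q^2 - Qq + \frac{\cot\alpha + \gamma(Q)}{1 + \cot^2\alpha} \ge 0. \]

If $a \ge 1/(8\sin^2\alpha)$, then

\[ Q^2 \le \frac{4(\cot\alpha + \gamma(Q))}{1 + \cot^2\alpha} \sim 8aQ^2\sin^2\alpha. \]

Hence in this case we have, for all (large) $Q$,

\[ g(q\cot\alpha) \ge Q - q \qquad \text{for all } q \in [1, Q]. \]

If $a < 1/(8\sin^2\alpha)$, then

\[ g(q\cot\alpha) \ge Q - q \iff q \in \big( -\infty, c_-(Q) \big] \cup \big[ c_+(Q), \infty \big), \]

where we let

\[ c_\pm(Q) = \frac12 \Big( Q \pm \sqrt{Q^2 - \frac{4(\cot\alpha + \gamma(Q))}{1 + \cot^2\alpha}} \Big) \sim c_\pm Q, \qquad c_\pm = \frac12 \Big( 1 \pm \sqrt{1 - 8a\sin^2\alpha} \Big). \]

It is clear that

\[ g(1) > Q \iff Qq^2 - \gamma(Q) q + Q - 1 < 0. \]

Since for large $Q$ the quadratic on the left-hand side of the last inequality has positive discriminant and roots

\[ q_{1,2} = \frac{\gamma(Q) \pm \sqrt{\gamma(Q)^2 - 4Q(Q - 1)}}{2Q}, \qquad \text{with } 0 < q_1 < 1 < q_2, \]

we get

\[ g(1) > Q, \ q \ge 1 \iff q \in \big[ 1, b(Q) \big), \]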


where we let $b = 2a > a_0$ and

\[ b(Q) = \frac{\gamma(Q) + \sqrt{\gamma(Q)^2 - 4Q(Q - 1)}}{2Q} \sim bQ. \]

Since $2 \ge 2\sin^2\alpha \ge 1$, we have (for $8a\sin^2\alpha \le 1$) $a_0 \le c_-$, $b \le c_+ < 1$ and $c_- \le b \iff 2a \le \cos^2\alpha$. We see that

\[ g(1) > Q - q \iff \theta(q) = q^3 - Qq^2 + \big( 1 + \gamma(Q) \big) q + 1 - Q > 0. \]

Since $\theta(q) < 0$ for all $q < 0$, $\theta(q) > 0$ for all $q > Q$ and $\theta(1) > 0$, the equation $\theta(q) = 0$ has three roots $0 < q_0 < 1 < d_-(Q) < d_+(Q) < Q$. With the substitution $q = Qx$, $\theta(q) > 0$ is equivalent to

\[ x^3 - x^2 + \frac{\gamma(Q)}{Q^2} x + \frac{x}{Q^2} + \frac{1 - Q}{Q^3} > 0, \]

providing $d_\pm(Q) \sim d_\pm Q$, with

\[ d_\pm = \frac12 \Big( 1 \pm \sqrt{1 - 8a} \Big). \]

For $8a > 1$, we have $g(1) > Q - q$ for large $Q$ and all $q \in [1, Q]$. For $8a \le 1$, we have $b = 2a < d_- < d_+ < c_+ < 1$ and $c_- < d_-$. We may write $g(y) = g_1(y) + q\gamma(Q) g_2(y)$, where

\[ g_1(y) = \frac{y}{y^2 + q^2}, \qquad g_2(y) = \frac{1}{y^2 + q^2}. \]

The derivatives

\[ g_1'(y) = \frac{q^2 - y^2}{(y^2 + q^2)^2}, \qquad g_2'(y) = -\frac{2y}{(y^2 + q^2)^2} \]

have constant sign on $[1, q]$. As a result, the variation of $g_1$ and $g_2$ is plainly estimated by

\[ \int_1^{q\cot\alpha} |g_1'(y)| \, dy = g_1(q\cot\alpha) - g_1(1) \le g_1(q\cot\alpha) \le \frac1q, \]

\[ \int_1^{q\cot\alpha} |g_2'(y)| \, dy = g_2(1) - g_2(q\cot\alpha) \le g_2(1) < \frac{1}{q^2}, \]


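and therefore

\[ \int_1^{q\cot\alpha} |g'(y)| \, dy = O\Big( \frac{\gamma(Q)}{q} \Big). \]

From the definitions of $h$ and $a(Q)$, we see that $h'(y) = 0$ for all $q \in [1, a(Q)]$. Hence, with $\eta(y) = y - [y] - \frac12$, and using $|h'| \le |g'|$ together with $\varphi(q) \le q$, we get

\[ \frac{1}{Q^2} \sum_{q=1}^{Q} \frac{\varphi(q)}{q^2} \Big| \int_1^{q\cot\alpha} h'(y) \eta(y) \, dy \Big| \le \frac{1}{2Q^2} \sum_{a(Q) < q \le Q} \frac{\varphi(q)}{q^2} \int_1^{q\cot\alpha} |h'(y)| \, dy \ll \frac{\gamma(Q)}{Q^2} \sum_{q > a(Q)} \frac{1}{q^2} = o(1). \tag{2.1} \]

By the very definition of $h$, we see that $0 \le h(y) \le q$. So, for any $y \in [1, q\cot\alpha)$,

\[ \frac{1}{Q^2} \sum_{q=1}^{Q} \frac{\varphi(q)}{q^2} h(y) \le \frac{1}{Q^2} \sum_{q=1}^{Q} \frac{\varphi(q)}{q} < \frac{Q}{Q^2} = \frac1Q. \tag{2.2} \]

Lemma 2.2 provides

\[ \sum_{\substack{Q - q < x \le f(y) \\ \gcd(x, q) = 1}} 1 = \frac{\varphi(q)}{q} h(y) + O\big( \sigma_0(q) \big), \]

which we compare with (1.7) to get

\[ \frac{1}{Q^2} \sum_{q=1}^{Q} S_1(Q, q) = \frac{1}{Q^2} \sum_{q=1}^{Q} \frac{\varphi(q)}{q^2} \sum_{1 \le y \le q\cot\alpha} h(y) + o(1). \tag{2.3} \]

We apply Euler–MacLaurin to the inner sum, use (2.1), (2.2), (2.3), and gather

\[ \frac{1}{Q^2} \sum_{q=1}^{Q} S_1(Q, q) = \frac{1}{Q^2} \sum_{q=1}^{Q} \frac{\varphi(q)}{q^2} \int_1^{q\cot\alpha} h(y) \, dy + o(1) = \frac{1}{Q^2} \sum_{1 \le q \le Q} \frac{\varphi(q)}{q} V(q) + o(1), \tag{2.4} \]

where we put

\[ V(q) = \frac1q \int_1^{q\cot\alpha} h(y) \, dy. \]

We aim to apply Lemma 2.3 to the right-hand side of (2.4). For this, one has first to estimate the variation of $V$ on $[1, Q]$. To this end, we write $V'(q) = V_1(q) + V_2(q)$,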


where

\[ V_1(q) = \frac{h(q\cot\alpha)\cot\alpha}{q}, \qquad V_2(q) = -\frac{1}{q^2} \int_1^{q\cot\alpha} h(y) \, dy. \]

Using again $0 \le h(y) \le q$, we get $|V(q)| \le q$ and $|V'(q)| \le 2$. Hence

\[ \int_1^Q |V'(q)| \, dq \le 2Q. \]

Therefore

\[ \frac{1}{Q^2} \sup_{1 \le q \le Q} |V(q)| = O(Q^{-1}) = o(1) \]

and

\[ \frac{1}{Q^2} \int_1^Q |V'(q)| \, dq \cdot \log Q = O\Big( \frac{\log Q}{Q} \Big) = o(1). \]

As a result, Lemma 2.3 applies to the right-hand side of (2.4). Since $\lim_Q Q^{-2} \sum_{q=1}^{Q} S_2(Q, q) = 0$, we get

\[ \frac{1}{Q^2} G(\lambda, \alpha, t, Q) \sim \frac{1}{\zeta(2) Q^2} \int_1^Q V(q) \, dq. \tag{2.5} \]

Note also for further use that

\[ \frac{1}{Q^2} \int_1^Q \frac1q \int_1^{q\cot\alpha} \frac{y}{y^2 + q^2} \, dy \, dq = O\Big( \frac{\log^2 Q}{Q^2} \Big). \tag{2.6} \]

This shows that we may replace $g(y)$ by $q\gamma(Q) g_2(y)$ in the subsequent computations.

2.3. Estimating $G(\lambda, \alpha, t, Q)$ for $0 < t < 4\lambda\sin^2\alpha$. We proceed now to estimate $G(\lambda, \alpha, t, Q)$ as $Q \to \infty$, using (2.5), (2.6) and the discussion from the beginning of the previous subsection concerning $y_0(q)$ and $y_1(q)$. The discussion is split into several cases as follows. We will let $a = \lambda/(2t)$.

1) $\dfrac{1}{2\sin^2\alpha} < a$, or equivalently $0 < t < \lambda\sin^2\alpha$. It is easily seen that

\[ V(q) = \frac1q \int_1^{q\cot\alpha} q \, dy = q\cot\alpha - 1 \]

and

\[ \frac{\zeta(2)}{Q^2} G(\lambda, \alpha, t, Q) \sim \frac{\cot\alpha}{2}. \]

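2) $\dfrac12 < a < \dfrac{1}{2\sin^2\alpha}$, or equivalently $\lambda\sin^2\alpha < t < \lambda$. Then

for $1 \le q \le a(Q)$,
\[ V(q) = \frac1q \int_1^{q\cot\alpha} q \, dy = q\cot\alpha - 1, \]

for $a(Q) < q \le Q$,
\[ V(q) = \frac1q \Big( \int_1^{y_0(q)} q \, dy + \int_{y_0(q)}^{q\cot\alpha} \big( g(y) - Q + q \big) \, dy \Big) = q\cot\alpha - 1 - Q\cot\alpha + \frac{Q y_0(q)}{q} + \frac1q \int_{y_0(q)}^{q\cot\alpha} g(y) \, dy. \]

As a result, using also (2.5) and (2.6),

\[ \begin{aligned} \frac{\zeta(2)}{Q^2} G(\lambda, \alpha, t, Q) &\sim \frac{1}{Q^2} \Big( \int_1^Q q\cot\alpha \, dq - Q\big( Q - a(Q) \big)\cot\alpha + Q \int_{a(Q)}^{Q} \frac{y_0(q)}{q} \, dq + \int_{a(Q)}^{Q} \frac1q \int_{y_0(q)}^{q\cot\alpha} g(y) \, dy \, dq \Big) \\ &\sim -\frac{\cot\alpha}{2} + 2a\sin^2\alpha\cot\alpha + \frac1Q \int_{a(Q)}^{Q} \frac{y_0(q)}{q} \, dq + \frac{\gamma(Q)}{Q^2} \int_{a(Q)}^{Q} \frac1q \Big( \frac\pi2 - \alpha - \arctan\frac{y_0(q)}{q} \Big) \, dq \\ &= a\sin 2\alpha - \frac{\cot\alpha}{2} - (\pi - 2\alpha)\, a\log(2a\sin^2\alpha) + I_1(Q) + I_2(Q), \end{aligned} \]

where we let

\[ I_1(Q) = \frac1Q \int_{a(Q)}^{Q} \frac{y_0(q)}{q} \, dq \qquad \text{and} \qquad I_2(Q) = -\frac{\gamma(Q)}{Q^2} \int_{a(Q)}^{Q} \frac1q \arctan\frac{y_0(q)}{q} \, dq. \]

We use the formula

\[ y_0(q) = \frac{1 + \sqrt{1 - 4Q^2 q^2 + 4Q\gamma(Q) q}}{2Q} \qquad \text{for } q \in \big[ 1, b(Q) \big], \tag{2.7} \]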


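the substitutions $q = Qy$ and $y = ax$ and Lebesgue's dominated convergence theorem to get

\[ \lim_Q I_1(Q) = a \int_{2\sin^2\alpha}^{1/a} \sqrt{\frac{2 - x}{x}} \, dx = I_1(\alpha, a), \]

\[ \lim_Q I_2(Q) = -2a \int_{2\sin^2\alpha}^{1/a} \frac1x \arctan\sqrt{\frac{2 - x}{x}} \, dx = I_2(\alpha, a), \]

so that

\[ \zeta(2) G(\lambda, \alpha, t) = a\sin 2\alpha - \frac{\cot\alpha}{2} - (\pi - 2\alpha)\, a\log(2a\sin^2\alpha) + I_1(\alpha, a) + I_2(\alpha, a). \]

3) $\dfrac{1}{8\sin^2\alpha} < a < \dfrac12$, or equivalently $\lambda < t < 4\lambda\sin^2\alpha$. Then

for $1 \le q \le a(Q)$,
\[ V(q) = \frac1q \int_1^{q\cot\alpha} q \, dy = q\cot\alpha - 1, \]

for $a(Q) \le q \le b(Q)$,
\[ V(q) = q\cot\alpha - 1 - Q\cot\alpha + \frac{Q y_0(q)}{q} + \frac1q \int_{y_0(q)}^{q\cot\alpha} g(y) \, dy, \]

for $b(Q) \le q \le Q$,
\[ V(q) = \frac1q \int_1^{q\cot\alpha} \big( g(y) - Q + q \big) \, dy = q\cot\alpha - 1 - Q \, \frac{q\cot\alpha - 1}{q} + \frac1q \int_1^{q\cot\alpha} g(y) \, dy. \]

Therefore, we get as in Case 2)

\[ \begin{aligned} \frac{\zeta(2)}{Q^2} G(\lambda, \alpha, t, Q) &\sim \frac{1}{Q^2} \Big( \int_1^Q q\cot\alpha \, dq - Q\cot\alpha \big( Q - a(Q) \big) + Q \int_{a(Q)}^{b(Q)} \frac{y_0(q)}{q} \, dq \\ &\qquad\quad + \int_{a(Q)}^{b(Q)} \frac1q \int_{y_0(q)}^{q\cot\alpha} g(y) \, dy \, dq + \int_{b(Q)}^{Q} \frac1q \int_1^{q\cot\alpha} g(y) \, dy \, dq \Big) \\ &\sim a\sin 2\alpha - \frac{\cot\alpha}{2} - (\pi - 2\alpha)\, a\log(2a\sin^2\alpha) + I_3(Q) + I_4(Q), \end{aligned} \]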


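where

\[ I_3(Q) = \frac1Q \int_{a(Q)}^{b(Q)} \frac{y_0(q)}{q} \, dq \qquad \text{and} \qquad I_4(Q) = -2a \int_{a(Q)}^{b(Q)} \frac1q \arctan\frac{y_0(q)}{q} \, dq. \]

This provides

\[ \zeta(2) G(\lambda, \alpha, t) = a\sin 2\alpha - \frac{\cot\alpha}{2} - (\pi - 2\alpha)\, a\log(2a\sin^2\alpha) + I_3(\alpha, a) + I_4(\alpha, a), \]

where $I_3(\alpha, a)$ and $I_4(\alpha, a)$ have been defined at the beginning of this section.

2.4. Estimating $G(\lambda, \alpha, t, Q)$ for $t > 4\lambda\sin^2\alpha$. To sort out the case $0 < 8a\sin^2\alpha < 1$, we split the discussion into two cases, according to whether $\cos^2\alpha \ge 1/4$, i.e. $\alpha \in [\pi/4, \pi/3]$, or $\cos^2\alpha \le 1/4$, i.e. $\alpha \in [\pi/3, \pi/2)$. In both cases we have $1/(8\sin^2\alpha) \ge (\cos^2\alpha)/2$.

I. For $\dfrac\pi4 \le \alpha \le \dfrac\pi3$ we analyse the remaining subcases:

I.4) $\dfrac{\cos^2\alpha}{2} < a < \dfrac{1}{8\sin^2\alpha}$, or equivalently $4\lambda\sin^2\alpha < t < \dfrac{\lambda}{\cos^2\alpha}$. Then

for $1 \le q \le a(Q)$,
\[ V(q) = q\cot\alpha - 1, \]

for $a(Q) \le q \le b(Q)$,
\[ V(q) = \frac1q \Big( \int_1^{y_0(q)} q \, dy + \int_{y_0(q)}^{q\cot\alpha} \big( g(y) - Q + q \big) \, dy \Big) = q\cot\alpha - 1 - Q\cot\alpha + \frac{Q y_0(q)}{q} + \frac1q \int_{y_0(q)}^{q\cot\alpha} g(y) \, dy, \]

for $b(Q) \le q \le c_-(Q)$ or $c_+(Q) \le q \le Q$,
\[ V(q) = \frac1q \int_1^{q\cot\alpha} \big( g(y) - Q + q \big) \, dy = q\cot\alpha - 1 - Q\cot\alpha + \frac{Q}{q} + \frac1q \int_1^{q\cot\alpha} g(y) \, dy, \]

for $c_-(Q) \le q \le c_+(Q)$,
\[ V(q) = \frac1q \int_1^{y_1(q)} \big( g(y) - Q + q \big) \, dy = y_1(q) - 1 - \frac{Q y_1(q)}{q} + \frac{Q}{q} + \frac1q \int_1^{y_1(q)} g(y) \, dy. \]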


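Therefore
\[
\frac{\zeta(2)}{Q^2}\,G(\lambda,\alpha,t,Q) \sim \frac{1}{Q^2}\Bigg(
\int_{[1,c_-(Q)]\cup[c_+(Q),Q]} q\cot\alpha\,dq
- Q\big(c_-(Q)-a(Q)+Q-c_+(Q)\big)\cot\alpha
+ \int_{c_-(Q)}^{c_+(Q)} y_1(q)\,dq
\]
\[
\qquad + Q\int_{a(Q)}^{b(Q)} \frac{y_0(q)}{q}\,dq
- Q\int_{c_-(Q)}^{c_+(Q)} \frac{y_1(q)}{q}\,dq
+ \int_{a(Q)}^{b(Q)} \frac1q\int_{y_0(q)}^{q\cot\alpha} g(y)\,dy\,dq
\]
\[
\qquad + \int_{[b(Q),c_-(Q)]\cup[c_+(Q),Q]} \frac1q\int_1^{q\cot\alpha} g(y)\,dy\,dq
+ \int_{c_-(Q)}^{c_+(Q)} \frac1q\int_1^{y_1(q)} g(y)\,dy\,dq
\Bigg)
\]
\[
\sim \frac{(c_-^2+1-c_+^2)\cot\alpha}{2} - (c_- - a_0 + 1 - c_+)\cot\alpha
+ 2a\Big(\frac{\pi}{2}-\alpha\Big)\log\frac{c_-}{c_+a_0} + \sum_{j=5}^{8} I_j(Q)
= I(\alpha,a) + \sum_{j=5}^{8} I_j(Q),
\]
where
\[
I(\alpha,a) = \frac{(c_-^2+1-c_+^2)\cot\alpha}{2} - (c_- - a_0 + 1 - c_+)\cot\alpha
+ 2a\Big(\frac{\pi}{2}-\alpha\Big)\log\frac{c_-}{c_+a_0}
\]
\[
\qquad = a\sin 2\alpha - \frac{\cot\alpha}{2} + \frac{\cot\alpha}{2}\sqrt{1-8a\sin^2\alpha}
+ 2a(\pi-2\alpha)\log\frac{2}{1+\sqrt{1-8a\sin^2\alpha}},
\]
\[
I_5(Q) = \frac1Q\int_{a(Q)}^{b(Q)} \frac{y_0(q)}{q}\,dq,
\qquad
I_6(Q) = -2a\int_{a(Q)}^{b(Q)} \frac1q\arctan\frac{y_0(q)}{q}\,dq,
\]
\[
I_7(Q) = -\frac{1}{Q^2}\int_{c_-(Q)}^{c_+(Q)} \frac{Q-q}{q}\,y_1(q)\,dq,
\qquad
I_8(Q) = 2a\int_{c_-(Q)}^{c_+(Q)} \frac1q\arctan\frac{y_1(q)}{q}\,dq.
\]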

We see as in the previous cases that $\lim_Q I_5(Q) = I_3(\alpha,a)$ and $\lim_Q I_6(Q) = I_4(\alpha,a)$ exist and are given by the formulae from the beginning of this section. The limits of $I_7(Q)$ and $I_8(Q)$ as $Q\to\infty$ are computed by means of the formula
\[
y_1(q) = \frac{1+\sqrt{1+4(Q-q)\gamma(Q)q-4(Q-q)^2q^2}}{2(Q-q)} \quad\text{for } q\in[1,Q],
\]
the substitution $q = Qx$ and Lebesgue's dominated convergence theorem. In this way we find that $I_7(Q)$ tends to $I_5(\alpha,a)$ and $I_8(Q)$ tends to $I_6(\alpha,a)$.

I.5) $\frac18 < a < \frac{\cos^2\alpha}{2}$, or equivalently $\frac{\lambda}{\cos^2\alpha} < t < 4\lambda$. In this case we get, for $1\le q\le a(Q)$,
\[
V(q) = q\cot\alpha - 1,
\]
for $a(Q)\le q\le c_-(Q)$,
\[
V(q) = \frac1q\Bigg(\int_1^{y_0(q)} q\,dy + \int_{y_0(q)}^{q\cot\alpha}\big(g(y)-Q+q\big)\,dy\Bigg)
= q\cot\alpha - 1 - Q\cot\alpha + \frac{Qy_0(q)}{q} + \frac1q\int_{y_0(q)}^{q\cot\alpha} g(y)\,dy,
\]
for $c_-(Q)\le q\le b(Q)$,
\[
V(q) = \frac1q\Bigg(\int_1^{y_0(q)} q\,dy + \int_{y_0(q)}^{y_1(q)}\big(g(y)-Q+q\big)\,dy\Bigg)
= y_1(q) - 1 - \frac{Qy_1(q)}{q} + \frac{Qy_0(q)}{q} + \frac1q\int_{y_0(q)}^{y_1(q)} g(y)\,dy,
\]
for $b(Q)\le q\le c_+(Q)$,
\[
V(q) = \frac1q\int_1^{y_1(q)}\big(g(y)-Q+q\big)\,dy
= y_1(q) - 1 - \frac{Qy_1(q)}{q} + \frac{Q}{q} + \frac1q\int_1^{y_1(q)} g(y)\,dy,
\]
for $c_+(Q)\le q\le Q$,
\[
V(q) = \frac1q\int_1^{q\cot\alpha}\big(g(y)-Q+q\big)\,dy
= q\cot\alpha - 1 - Q\cot\alpha + \frac{Q}{q} + \frac1q\int_1^{q\cot\alpha} g(y)\,dy.
\]
Therefore
\[
\frac{\zeta(2)}{Q^2}\,G(\lambda,\alpha,t,Q) \sim \frac{1}{Q^2}\Bigg(
\int_{[1,c_-(Q)]\cup[c_+(Q),Q]} q\cot\alpha\,dq
- Q\big(c_-(Q)-a(Q)+Q-c_+(Q)\big)\cot\alpha
+ Q\int_{a(Q)}^{b(Q)} \frac{y_0(q)}{q}\,dq
\]
\[
\qquad - \int_{c_-(Q)}^{c_+(Q)} \frac{(Q-q)\,y_1(q)}{q}\,dq
+ \gamma(Q)\int_{a(Q)}^{c_-(Q)} \frac1q\Big(\frac{\pi}{2}-\alpha-\arctan\frac{y_0(q)}{q}\Big)\,dq
\]
\[
\qquad + \gamma(Q)\int_{c_-(Q)}^{b(Q)} \frac1q\Big(\arctan\frac{y_1(q)}{q}-\arctan\frac{y_0(q)}{q}\Big)\,dq
+ \gamma(Q)\int_{b(Q)}^{c_+(Q)} \frac1q\arctan\frac{y_1(q)}{q}\,dq
+ \gamma(Q)\int_{c_+(Q)}^{Q} \frac1q\Big(\frac{\pi}{2}-\alpha\Big)\,dq
\Bigg).
\]

As in I.4), we get in this case as well
\[
\zeta(2)G(\lambda,\alpha,t) = I(\alpha,a) + I_3(\alpha,a) + I_4(\alpha,a) + I_5(\alpha,a) + I_6(\alpha,a).
\]

I.6) $0 < a < \frac18$, or equivalently $t > 4\lambda$. Then $0 < a_0 < c_- < b < d_- < d_+ < c_+ < 1$ and we get, for $1\le q\le a(Q)$,
\[
V(q) = q\cot\alpha - 1,
\]
for $a(Q)\le q\le c_-(Q)$,
\[
V(q) = q\cot\alpha - 1 - Q\cot\alpha + \frac{Qy_0(q)}{q} + \frac1q\int_{y_0(q)}^{q\cot\alpha} g(y)\,dy,
\]
for $c_-(Q)\le q\le b(Q)$,
\[
V(q) = y_1(q) - 1 - \frac{Qy_1(q)}{q} + \frac{Qy_0(q)}{q} + \frac1q\int_{y_0(q)}^{y_1(q)} g(y)\,dy,
\]
for $b(Q)\le q\le d_-(Q)$,
\[
V(q) = y_1(q) - 1 - \frac{Qy_1(q)}{q} + \frac{Q}{q} + \frac1q\int_1^{y_1(q)} g(y)\,dy,
\]
for $d_-(Q)\le q\le d_+(Q)$,
\[
V(q) = 0,
\]
for $d_+(Q)\le q\le c_+(Q)$,
\[
V(q) = y_1(q) - 1 - \frac{Qy_1(q)}{q} + \frac{Q}{q} + \frac1q\int_1^{y_1(q)} g(y)\,dy,
\]
for $c_+(Q)\le q\le Q$,
\[
V(q) = q\cot\alpha - 1 - Q\cot\alpha + \frac{Q}{q} + \frac1q\int_1^{q\cot\alpha} g(y)\,dy.
\]
Therefore
\[
\frac{\zeta(2)}{Q^2}\,G(\lambda,\alpha,t,Q) \sim \frac{1}{Q^2}\Bigg(
\int_1^{c_-(Q)} q\cot\alpha\,dq + \int_{c_+(Q)}^{Q} q\cot\alpha\,dq
- Q\big(c_-(Q)-a(Q)+Q-c_+(Q)\big)\cot\alpha
\]
\[
\qquad + Q\int_{a(Q)}^{b(Q)} \frac{y_0(q)}{q}\,dq
- Q\int_{c_-(Q)}^{d_-(Q)} \frac{y_1(q)}{q}\,dq
- Q\int_{d_+(Q)}^{c_+(Q)} \frac{y_1(q)}{q}\,dq
+ \int_{c_-(Q)}^{d_-(Q)} y_1(q)\,dq + \int_{d_+(Q)}^{c_+(Q)} y_1(q)\,dq
\]
\[
\qquad + \gamma(Q)\int_{a(Q)}^{c_-(Q)}\int_{y_0(q)}^{q\cot\alpha} \frac{dy\,dq}{y^2+q^2}
+ \gamma(Q)\int_{c_-(Q)}^{b(Q)}\int_{y_0(q)}^{y_1(q)} \frac{dy\,dq}{y^2+q^2}
+ \gamma(Q)\int_{b(Q)}^{d_-(Q)}\int_{1}^{y_1(q)} \frac{dy\,dq}{y^2+q^2}
\]
\[
\qquad + \gamma(Q)\int_{d_+(Q)}^{c_+(Q)}\int_{1}^{y_1(q)} \frac{dy\,dq}{y^2+q^2}
+ \gamma(Q)\int_{c_+(Q)}^{Q}\int_{1}^{q\cot\alpha} \frac{dy\,dq}{y^2+q^2}
\Bigg)
\sim I(\alpha,a) + \sum_{j=9}^{14} I_j(Q),
\]
where
\[
I_9(Q) + I_{10}(Q) = -\frac{1}{Q^2}\int_{c_-(Q)}^{d_-(Q)} \frac{Q-q}{q}\,y_1(q)\,dq
- \frac{1}{Q^2}\int_{d_+(Q)}^{c_+(Q)} \frac{Q-q}{q}\,y_1(q)\,dq \longrightarrow I_7(\alpha,a),
\]
\[
I_{11}(Q) = \frac1Q\int_{a(Q)}^{b(Q)} \frac{y_0(q)}{q}\,dq \longrightarrow I_3(\alpha,a),
\]
\[
I_{12}(Q) = -2a\int_{a(Q)}^{b(Q)} \frac1q\arctan\frac{y_0(q)}{q}\,dq \longrightarrow I_4(\alpha,a),
\]
\[
I_{13}(Q) + I_{14}(Q) = 2a\int_{c_-(Q)}^{d_-(Q)} \frac1q\arctan\frac{y_1(q)}{q}\,dq
+ 2a\int_{d_+(Q)}^{c_+(Q)} \frac1q\arctan\frac{y_1(q)}{q}\,dq \longrightarrow I_8(\alpha,a),
\]
as $Q\to\infty$.


II. For $\pi/3 \le \alpha < \pi/2$, we only have to look at the following remaining subcases.

II.4) $\frac18 < a < \frac{1}{8\sin^2\alpha}$, or equivalently $4\lambda\sin^2\alpha < t < 4\lambda$. In this case $a_0 < b < \cdots$ and
\[
\zeta(2)G(\lambda,\alpha,t) = \lim_Q \frac{\zeta(2)}{Q^2}\,G(\lambda,\alpha,t,Q)
= I(\alpha,a) + I_3(\alpha,a) + I_4(\alpha,a) + I_5(\alpha,a) + I_6(\alpha,a).
\]

II.5) $\frac{\cos^2\alpha}{2} < a < \frac18$, or equivalently $4\lambda < t < \frac{\lambda}{\cos^2\alpha}$. In this case $a_0 < b < c_- < d_- < d_+ < c_+ < 1$ and we get, for $1\le q\le a(Q)$,
\[
V(q) = q\cot\alpha - 1,
\]
for $a(Q)\le q\le b(Q)$,
\[
V(q) = \frac1q\Bigg(\int_1^{y_0(q)} q\,dy + \int_{y_0(q)}^{q\cot\alpha}\big(g(y)-Q+q\big)\,dy\Bigg)
= q\cot\alpha - 1 - Q\cot\alpha + \frac{Qy_0(q)}{q} + \frac1q\int_{y_0(q)}^{q\cot\alpha} g(y)\,dy,
\]
for $b(Q)\le q\le c_-(Q)$ or $c_+(Q)\le q\le Q$,
\[
V(q) = \frac1q\int_1^{q\cot\alpha}\big(g(y)-Q+q\big)\,dy
= q\cot\alpha - 1 - Q\cot\alpha + \frac{Q}{q} + \frac1q\int_1^{q\cot\alpha} g(y)\,dy,
\]
for $c_-(Q)\le q\le d_-(Q)$ or $d_+(Q)\le q\le c_+(Q)$,
\[
V(q) = \frac1q\int_1^{y_1(q)}\big(g(y)-Q+q\big)\,dy
= y_1(q) - 1 - \frac{Qy_1(q)}{q} + \frac{Q}{q} + \frac1q\int_1^{y_1(q)} g(y)\,dy,
\]
for $d_-(Q)\le q\le d_+(Q)$,
\[
V(q) = 0.
\]


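Therefore
\[
\frac{\zeta(2)}{Q^2}\,G(\lambda,\alpha,t,Q) \sim \frac{1}{Q^2}\Bigg(
\int_{[1,c_-(Q)]\cup[c_+(Q),Q]} q\cot\alpha\,dq
- Q\big(c_-(Q)-a(Q)+Q-c_+(Q)\big)\cot\alpha
+ Q\int_{a(Q)}^{b(Q)} \frac{y_0(q)}{q}\,dq
\]
\[
\qquad - \int_{[c_-(Q),d_-(Q)]\cup[d_+(Q),c_+(Q)]} \frac{(Q-q)\,y_1(q)}{q}\,dq
+ \gamma(Q)\int_{a(Q)}^{b(Q)} \frac1q\Big(\frac{\pi}{2}-\alpha-\arctan\frac{y_0(q)}{q}\Big)\,dq
\]
\[
\qquad + \gamma(Q)\int_{[b(Q),c_-(Q)]\cup[c_+(Q),Q]} \frac1q\Big(\frac{\pi}{2}-\alpha\Big)\,dq
+ \gamma(Q)\int_{[c_-(Q),d_-(Q)]\cup[d_+(Q),c_+(Q)]} \frac1q\arctan\frac{y_1(q)}{q}\,dq
\Bigg).
\]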

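This yields, as in Case I.6),
\[
\zeta(2)G(\lambda,\alpha,t) = I(\alpha,a) + I_3(\alpha,a) + I_4(\alpha,a) + I_7(\alpha,a) + I_8(\alpha,a).
\]

II.6) $a < \frac{\cos^2\alpha}{2}$, or equivalently $t > \frac{\lambda}{\cos^2\alpha}$. In this case $0 < a_0 < c_- < b < d_- < d_+ < c_+ < 1$ and we get, as in I.6),
\[
\zeta(2)G(\lambda,\alpha,t) = I(\alpha,a) + I_3(\alpha,a) + I_4(\alpha,a) + I_7(\alpha,a) + I_8(\alpha,a).
\]
The proof of Proposition 2.1 is now complete.

2.5. The computation of $\partial G/\partial\alpha$. We first note that $G(\lambda,\alpha,\cdot)$ is continuous at $t = \lambda\sin^2\alpha$, $t = \lambda$, $t = 4\lambda\sin^2\alpha$ and $t = 4\lambda$, so $G(\lambda,\alpha,\cdot)$ is continuous on $\mathbb{R}_+$. We also have $G(\lambda,\alpha,\infty) = 0$ and $G(\lambda,\alpha,0+) = \cot\alpha/(2\zeta(2))$. Theorem 0.3 follows now taking in Proposition 2.1 $\alpha = \pi/4$ and $\lambda = 2/(\pi\zeta(2)) = 12/\pi^3$. From the formulae at the beginning of this section we derive
\[
\frac{\partial I_0}{\partial\alpha} = 4a\cos^2\alpha - 2a + \frac{1}{2\sin^2\alpha} + 2a\log(2a\sin^2\alpha) - 2a(\pi-2\alpha)\cot\alpha,
\]
\[
\frac{\partial I_3}{\partial\alpha} = \frac{\partial I_1}{\partial\alpha} = -4a\cos^2\alpha,
\qquad
\frac{\partial I_2}{\partial\alpha} = \frac{\partial I_4}{\partial\alpha} = 2a(\pi-2\alpha)\cot\alpha,
\]
\[
\frac{\partial I}{\partial\alpha} = 4a\cos^2\alpha - 2a + \frac{1-\sqrt{1-8a\sin^2\alpha}}{2\sin^2\alpha}
- \frac{4a\cos^2\alpha}{\sqrt{1-8a\sin^2\alpha}}
+ \frac{16a^2(\pi-2\alpha)\sin\alpha\cos\alpha}{\sqrt{1-8a\sin^2\alpha}\,\big(1+\sqrt{1-8a\sin^2\alpha}\big)}
- 4a\log\frac{2}{1+\sqrt{1-8a\sin^2\alpha}},
\]
\[
\frac{\partial I_5}{\partial\alpha} = \frac{\partial I_7}{\partial\alpha} = \frac{4a\cos^2\alpha}{\sqrt{1-8a\sin^2\alpha}},
\qquad
\frac{\partial I_8}{\partial\alpha} = \frac{\partial I_6}{\partial\alpha} = -\frac{2a(\pi-2\alpha)\cot\alpha}{\sqrt{1-8a\sin^2\alpha}}.
\]
This provides finally
\[
\frac{\partial\widetilde G}{\partial\alpha}(\alpha,a) = \frac{1}{\zeta(2)}
\begin{cases}
-\dfrac{1}{2\sin^2\alpha}, & \text{for } 1 \ge \sin^2\alpha > \dfrac{1}{2a},\\[2mm]
-2a + \dfrac{1}{2\sin^2\alpha} + 2a\log(2a\sin^2\alpha), & \text{for } \dfrac{1}{2a} > \sin^2\alpha > \dfrac{1}{8a},\\[2mm]
-2a + \dfrac{1-\sqrt{1-8a\sin^2\alpha}}{2\sin^2\alpha} - 4a\log\dfrac{2}{1+\sqrt{1-8a\sin^2\alpha}}, & \text{for } \dfrac{1}{8a} > \sin^2\alpha \ge \dfrac12.
\end{cases}
\]
We also note that the previous equality shows that $\partial\widetilde G/\partial\alpha$ is continuous on $[\pi/4,\pi/2]\times(0,\infty)$.

3. Proof of Theorem 0.2

In this section we will take
\[
\lambda = \lambda(\Omega) = \frac{4\,\mathrm{Area}(\Omega)}{\pi\zeta(2)},
\]
fix $t > 0$, and consider
\[
\lambda(Q) = \lambda(\Omega,Q) = \frac{4N(\Omega,Q)}{\pi} \sim \lambda Q^2 .
\]
We denote
\[
a = \frac{\lambda}{2t} \quad\text{and}\quad \widetilde G(\alpha,a) = G(\lambda,\alpha,t).
\]
For $\pi/4 \le \alpha < \beta < \pi/2$ and $m > 0$, we consider the triangle $T_{\alpha,\beta,mQ}$, bounded by $y = x\tan\alpha$, $y = x\tan\beta$ and $y = mQ$. The set $\mathcal F(T_{\alpha,\beta,m},Q)$ of lattice points visible from the origin in $T_{\alpha,\beta,mQ}$ has cardinality
\[
N = N(m) = N_{\alpha,\beta,mQ} \sim \frac{1}{\zeta(2)}\,\mathrm{Area}\big(T_{\alpha,\beta,mQ}\big).
\]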
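As a quick numerical sanity check of the asymptotic $N \sim \mathrm{Area}/\zeta(2)$, the following Python sketch counts the visible lattice points in a triangle of this shape by brute force and compares the count with the predicted value. The function names `count_visible` and `predicted`, and the choice to parametrise the triangle by $\cot\alpha$ and $\cot\beta$ rather than by the angles, are illustrative assumptions and not notation from the paper.

```python
import math

def count_visible(cot_alpha, cot_beta, height):
    """Count lattice points (x, y) with gcd(x, y) = 1 in the triangle
    y*cot_beta <= x <= y*cot_alpha, 1 <= y <= height (cot_beta < cot_alpha)."""
    total = 0
    for y in range(1, height + 1):
        x_min = math.ceil(y * cot_beta)
        x_max = math.floor(y * cot_alpha)
        for x in range(x_min, x_max + 1):
            if math.gcd(x, y) == 1:   # (x, y) is visible from the origin
                total += 1
    return total

def predicted(cot_alpha, cot_beta, height):
    """Area of the triangle divided by zeta(2) = pi^2 / 6.
    Uses 2 * Area = height^2 * (cot_alpha - cot_beta)."""
    area = 0.5 * height ** 2 * (cot_alpha - cot_beta)
    return area / (math.pi ** 2 / 6)
```

For instance, `count_visible(1.0, 0.5, 200)` (that is, $\alpha = \pi/4$, $\cot\beta = 1/2$, height 200) can be compared with `predicted(1.0, 0.5, 200)`; the two values should agree to within a few percent, with the relative error shrinking as the height grows.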


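These points, identified with the corresponding Farey fractions of order $Q$, are ordered as $\cot\beta \le \gamma_1 < \gamma_2 < \cdots < \gamma_N \le \cot\alpha$. We consider
\[
G(\lambda,\alpha,\beta,t,mQ) = \#\Big\{\gamma_k \in \mathcal F\big(T_{\alpha,\beta,1},[mQ]\big)\,;\ \theta(\gamma_k)-\theta(\gamma_{k+1}) \ge \frac{t}{\lambda(Q)}\Big\}.
\]
By Proposition 2.1 and $\lambda([mQ]) \sim m^2\lambda(Q)$ we gather
\[
\lim_Q \frac{1}{Q^2}\,G(\lambda,\alpha,\beta,t,mQ) = \lim_Q \frac{m^2}{[mQ]^2}\,G(\lambda,\alpha,\beta,t,mQ) = m^2 G(\lambda,\alpha,\beta,m^2t)
= m^2\big(G(\lambda,\alpha,m^2t) - G(\lambda,\beta,m^2t)\big). \tag{3.1}
\]
For $0 < m \le M$ we consider $E_{\alpha,\beta,m,M,Q} = T_{\alpha,\beta,MQ}\setminus T_{\alpha,\beta,mQ}$, the region bounded by the lines $y = x\tan\alpha$, $y = x\tan\beta$, $y = mQ$ and $y = MQ$. It is clear that
\[
2\,\mathrm{Area}\,T_{\alpha,\beta,mQ} = 2Q^2\,\mathrm{Area}\,T_{\alpha,\beta,m} = Q^2m^2(\cot\alpha-\cot\beta),
\]
\[
2\,\mathrm{Area}\,T_{\alpha,\beta,MQ} = 2Q^2\,\mathrm{Area}\,T_{\alpha,\beta,M} = Q^2M^2(\cot\alpha-\cot\beta),
\]
\[
2\,\mathrm{Area}(E_{\alpha,\beta,m,M,Q}) = Q^2(M^2-m^2)(\cot\alpha-\cot\beta).
\]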

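Let $\Delta = \{\pi/4 = \alpha_0 < \alpha_1 < \cdots < \alpha_{n+1} = \pi/2\}$ be a division of $[\pi/4,\pi/2]$ with norm $\|\Delta\| = \max_{0\le k\le n}(\alpha_{k+1}-\alpha_k)$ and let $\xi_k,\eta_k \in [\alpha_k,\alpha_{k+1}]$, $0\le k\le n$. Then
\[
A(\Omega,\Delta) = 2\sum_{k=0}^{n} \mathrm{Area}\,E_{\alpha_k,\alpha_{k+1},y_\Omega(\xi_k),y_\Omega(\eta_k),1}
= \sum_{k=0}^{n} (\cot\alpha_k-\cot\alpha_{k+1})\,\big|y_\Omega(\xi_k)^2 - y_\Omega(\eta_k)^2\big|
\]
\[
\le \sum_{k=0}^{n} (\cot\alpha_k-\cot\alpha_{k+1})\Big(\big|y_\Omega(\xi_k)^2 - y_\Omega(\alpha_k)^2\big| + \big|y_\Omega(\alpha_k)^2 - y_\Omega(\eta_k)^2\big|\Big)
\le 4\,\|y_\Omega\|_\infty\,\|y_\Omega'\|_\infty\,\frac{\pi}{2}\,\|\Delta\|,
\]
providing
\[
\lim_{\|\Delta\|\to 0} A(\Omega,\Delta) = 0. \tag{3.2}
\]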

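Assume now that $\xi_k$ and $\eta_k$ are chosen such that
\[
y_\Omega(\xi_k) = m_k = \min_{\alpha\in[\alpha_k,\alpha_{k+1}]} y_\Omega(\alpha)
\quad\text{and}\quad
y_\Omega(\eta_k) = M_k = \max_{\alpha\in[\alpha_k,\alpha_{k+1}]} y_\Omega(\alpha).
\]
For the time being we keep $\Delta$, hence also $n = n(\Delta)$, fixed. Since $\Omega$ is star-shaped with respect to the origin, the slice $\Omega_k$ of $\Omega$ contained within the lines $y = x\tan\alpha_k$ and $y = x\tan\alpha_{k+1}$ contains $T_{\alpha_k,\alpha_{k+1},m_k}$ and is contained within $T_{\alpha_k,\alpha_{k+1},M_k}$. Therefore
\[
G_\Omega(t,Q) \le \sum_{k=0}^{n} G(\lambda,\alpha_k,\alpha_{k+1},t,m_kQ) + \sum_{k=0}^{n} \#\big(E_{\alpha_k,\alpha_{k+1},m_k,M_k,Q}\cap\mathbb Z^2\big) + 2n,
\]
\[
G_\Omega(t,Q) \ge \sum_{k=0}^{n} G(\lambda,\alpha_k,\alpha_{k+1},t,M_kQ) - \sum_{k=0}^{n} \#\big(E_{\alpha_k,\alpha_{k+1},m_k,M_k,Q}\cap\mathbb Z^2\big) - 2n.
\]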

But lim Q

n n

1 2 E = ∩ Z Area Eαk ,αk+1 ,mk ,Mk ,1 = A(, !), αk ,αk+1 ,mk ,Mk ,Q 2 Q k=0

k=0

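hence the previous inequalities provide, for all $\Delta$,
\[
\liminf_Q \frac{1}{Q^2}\sum_{k=0}^{n} G(\lambda,\alpha_k,\alpha_{k+1},t,M_kQ) - A(\Omega,\Delta)
\le \liminf_Q \frac{1}{Q^2}\,G_\Omega(t,Q)
\]
\[
\le \limsup_Q \frac{1}{Q^2}\,G_\Omega(t,Q)
\le \limsup_Q \frac{1}{Q^2}\sum_{k=0}^{n} G(\lambda,\alpha_k,\alpha_{k+1},t,m_kQ) + A(\Omega,\Delta). \tag{3.3}
\]
By (3.1) we gather, for all $0 \le k < n$,
\[
\lim_Q \frac{1}{Q^2}\,G(\lambda,\alpha_k,\alpha_{k+1},t,m_kQ) = m_k^2\big(G(\lambda,\alpha_k,m_k^2t) - G(\lambda,\alpha_{k+1},m_k^2t)\big),
\]
\[
\lim_Q \frac{1}{Q^2}\,G(\lambda,\alpha_k,\alpha_{k+1},t,M_kQ) = M_k^2\big(G(\lambda,\alpha_k,M_k^2t) - G(\lambda,\alpha_{k+1},M_k^2t)\big).
\]
Summing up over $k$ and employing (3.3) we derive
\[
\sum_{k=0}^{n} y_\Omega(\eta_k)^2\big(G(\lambda,\alpha_k,y_\Omega(\eta_k)^2t) - G(\lambda,\alpha_{k+1},y_\Omega(\eta_k)^2t)\big) - A(\Omega,\Delta)
\le \liminf_Q \frac{1}{Q^2}\,G_\Omega(t,Q) \le \limsup_Q \frac{1}{Q^2}\,G_\Omega(t,Q)
\]
\[
\le A(\Omega,\Delta) + \sum_{k=0}^{n} y_\Omega(\xi_k)^2\big(G(\lambda,\alpha_k,y_\Omega(\xi_k)^2t) - G(\lambda,\alpha_{k+1},y_\Omega(\xi_k)^2t)\big) \tag{3.4}
\]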


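for any division $\Delta$. The sum in the left-hand side of (3.4) can also be written as
\[
\sum_{k=0}^{n} (\alpha_k-\alpha_{k+1})\,y_\Omega(\xi_k)^2\,\frac{\partial G}{\partial\alpha}\big(\lambda,\theta_k,y_\Omega(\xi_k)^2t\big)
= \sum_{k=0}^{n} (\alpha_k-\alpha_{k+1})\,y_\Omega(\xi_k)^2\,\frac{\partial\widetilde G}{\partial\alpha}\Big(\theta_k,\frac{\lambda}{2y_\Omega(\xi_k)^2t}\Big)
\]
for some $\theta_k\in[\alpha_k,\alpha_{k+1}]$. Since $\partial\widetilde G/\partial\alpha$ is continuous, it follows that the sum above tends to
\[
G_\Omega(t) = -\int_{\pi/4}^{\pi/2} y_\Omega(\alpha)^2\,\frac{\partial\widetilde G}{\partial\alpha}\Big(\alpha,\frac{\lambda(\Omega)}{2y_\Omega(\alpha)^2t}\Big)\,d\alpha
= -\int_{\pi/4}^{\pi/2} r_\Omega(\alpha)^2\sin^2\alpha\,\frac{\partial\widetilde G}{\partial\alpha}\Big(\alpha,\frac{\lambda(\Omega)}{2r_\Omega(\alpha)^2t\sin^2\alpha}\Big)\,d\alpha,
\]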

as ! → 0. Taking also into account (3.2) we conclude that the left-hand side of (3.4) tends to G (t) as ! → 0. By the same reasoning, the right-hand side of (3.4) tends to G (t) as ! → 0 and (3.3) implies now that, for all t > 0, Q−2 G (t, Q) tends to G (t) as Q → ∞. After an appropriate normalization, the repartition function (t, Q) of the probability measure µ,Q will tend, as Q → ∞, to G (t) = − ζ (2) G (t) = − ζ (2) G A() A()

π/2 λ() ∂G dα. α, r (α)2 sin2 α ∂α 2r (α)2 t sin2 α

π/4

Appendix

Proof of Lemma 1.6. First, we write the incomplete Kloosterman sums using the complete ones as follows:
\[
S_I(m,n,q) = \sum_{\substack{1\le x\le q\\ \gcd(x,q)=1}} e\Big(\frac{mx+n\bar x}{q}\Big)\sum_{y\in I}\frac1q\sum_{k=1}^{q} e\Big(\frac{k(y-x)}{q}\Big)
= \frac1q\sum_{k=1}^{q}\sum_{y\in I} e\Big(\frac{ky}{q}\Big)\,S(m-k,n,q).
\]
The sum over $y$ is a geometric progression, whence
\[
\Big|\sum_{y\in I} e\Big(\frac{ky}{q}\Big)\Big| \le \min\Big\{|I|,\frac{2}{|e(k/q)-1|}\Big\} \le \min\Big\{|I|,\frac{1}{|\sin(\pi k/q)|}\Big\} \le \min\Big\{|I|,\frac{1}{2\|k/q\|}\Big\},
\]
where $\|\cdot\|$ is the distance to the nearest integer. Combining this with Lemma 1.5, we deduce
\[
|S_I(m,n,q)| \le \frac{|I|}{q}\,|S(m,n,q)| + \frac1q\sum_{k=1}^{q-1}\frac{1}{2\|k/q\|}\,|S(m-k,n,q)|
\le \Big(\frac{|I|}{q} + \frac{1}{2q}\sum_{k=1}^{q-1}\frac{1}{\|k/q\|}\Big)\,\sigma_0(q)\gcd(n,q)^{1/2}q^{1/2}
\]
\[
\le \sigma_0(q)\gcd(n,q)^{1/2}q^{1/2}\Big(\frac1q\sum_{k=1}^{q/2}\frac{q}{k} + 1\Big)
\le \sigma_0(q)\gcd(n,q)^{1/2}q^{1/2}(2+\log q),
\]
which concludes the proof.
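The bound can be probed numerically on small moduli. The sketch below computes complete and incomplete Kloosterman sums directly from the definition and evaluates the right-hand side $\sigma_0(q)\gcd(n,q)^{1/2}q^{1/2}(2+\log q)$ as read off from the last line of the proof above. The helper names (`kloosterman`, `lemma_bound`, `num_divisors`) are hypothetical, the interval $I$ is assumed to lie in $[1,q]$ with $q\ge 2$, and the code is only a brute-force check, not the method of the paper.

```python
import cmath
import math

def e(z):
    """e(z) = exp(2*pi*i*z)."""
    return cmath.exp(2j * math.pi * z)

def kloosterman(m, n, q, interval=None):
    """Sum of e((m*x + n*xbar)/q) over x coprime to q, where xbar is the
    inverse of x modulo q. Without `interval` the sum runs over 1 <= x <= q
    (complete sum); with interval = (lo, hi) inside [1, q] it runs over
    lo <= x <= hi (incomplete sum). Assumes q >= 2."""
    lo, hi = interval if interval is not None else (1, q)
    total = 0j
    for x in range(lo, hi + 1):
        if math.gcd(x, q) == 1:
            xbar = pow(x, -1, q)  # modular inverse, Python 3.8+
            total += e((m * x + n * xbar) / q)
    return total

def num_divisors(q):
    """sigma_0(q), the number of positive divisors of q."""
    return sum(1 for d in range(1, q + 1) if q % d == 0)

def lemma_bound(n, q):
    """sigma_0(q) * gcd(n, q)^(1/2) * q^(1/2) * (2 + log q)."""
    return num_divisors(q) * math.sqrt(math.gcd(n, q) * q) * (2 + math.log(q))
```

As a usage example, `abs(kloosterman(1, 1, 101))` stays below the classical Weil bound `2 * math.sqrt(101)`, and `abs(kloosterman(2, 3, 30, interval=(4, 17)))` can be compared with `lemma_bound(3, 30)`.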

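Proof of Lemma 1.7. We first write
\[
N_q(I_1,I_2) = \sum_{\substack{x\in I_1,\ y\in I_2\\ \gcd(x,q)=1}} \delta(x,y)
= \frac1q\sum_{\substack{x\in I_1,\ y\in I_2\\ \gcd(x,q)=1}}\ \sum_{k=1}^{q} e\Big(\frac{k(y-\bar x)}{q}\Big). \tag{A1}
\]
Here, the main contribution is given by the terms with $k = q$. This is equal to $|I_2|/q$ times the number of elements of $I_1$ that are relatively prime to $q$. The latter can be written using the Möbius function as follows:
\[
\sum_{\substack{x\in I_1\\ \gcd(x,q)=1}} 1 = \sum_{x\in I_1}\sum_{d\mid(x,q)}\mu(d) = \sum_{d\mid q}\mu(d)\sum_{\substack{x\in I_1\\ d\mid x}} 1 = \sum_{d\mid q}\mu(d)\Big(\frac{|I_1|}{d}+\theta_d\Big)
\]
for some real numbers $\theta_d$, with $|\theta_d|\le 1$. This equals further
\[
|I_1|\sum_{d\mid q}\frac{\mu(d)}{d} + O\big(\sigma_0(q)\big) = |I_1|\prod_{p\mid q}\Big(1-\frac1p\Big) + O\big(\sigma_0(q)\big).
\]
Since $\prod_{p\mid q}(1-p^{-1}) = \varphi(q)/q$, we conclude that the contribution of terms with $k = q$ in (A1) equals
\[
\frac{|I_2|}{q}\cdot|I_1|\cdot\frac{\varphi(q)}{q} + O\big(\sigma_0(q)\big). \tag{A2}
\]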
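The Möbius inversion step is easy to verify by machine. The following Python sketch counts the integers of an interval that are coprime to $q$ in two ways (directly, and through the sum over divisors $d \mid q$) and lets one compare the result with the main term $|I_1|\varphi(q)/q$. All function names are illustrative, and the interval is assumed to be $[lo, hi]$ with $lo \ge 1$.

```python
import math

def mobius(d):
    """Moebius function mu(d), computed by trial division."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:      # p^2 divides the original d
                return 0
            result = -result
        p += 1
    if d > 1:                   # one remaining prime factor
        result = -result
    return result

def divisors(q):
    return [d for d in range(1, q + 1) if q % d == 0]

def coprime_count_direct(lo, hi, q):
    """Number of x in [lo, hi] with gcd(x, q) = 1, by enumeration."""
    return sum(1 for x in range(lo, hi + 1) if math.gcd(x, q) == 1)

def coprime_count_mobius(lo, hi, q):
    """Same count via sum over d | q of mu(d) * #{x in [lo, hi] : d | x}."""
    return sum(mobius(d) * (hi // d - (lo - 1) // d) for d in divisors(q))

def euler_phi(q):
    return sum(1 for x in range(1, q + 1) if math.gcd(x, q) == 1)
```

For example, with $q = 30$ and the interval $[5, 60]$ both counts agree, and they differ from $56\,\varphi(30)/30$ by at most $\sigma_0(30) = 8$, in line with the $O(\sigma_0(q))$ error term.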


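After changing the order of summation, we find that the sum of the remaining terms in (A1) is
\[
E = E_q(I_1,I_2) = \frac1q\sum_{\substack{x\in I_1,\ y\in I_2\\ \gcd(x,q)=1}}\ \sum_{k=1}^{q-1} e\Big(\frac{k(y-\bar x)}{q}\Big)
= \frac1q\sum_{k=1}^{q-1}\sum_{y\in I_2} e\Big(\frac{ky}{q}\Big)\,S_{I_1}(0,-k,q).
\]
The sum over $y$ is a geometric progression, hence
\[
\Big|\sum_{y\in I_2} e\Big(\frac{ky}{q}\Big)\Big| \le \min\Big\{|I_2|,\frac{1}{2\|k/q\|}\Big\}. \tag{A3}
\]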
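The geometric-progression estimate (A3) rests on $|e(\theta)-1| = 2|\sin\pi\theta| \ge 4\|\theta\|$. A small Python check, with hypothetical helper names and the interval taken as the integers $lo \le y \le hi$, is sketched below.

```python
import cmath
import math

def exp_sum(k, q, lo, hi):
    """Sum of e(k*y/q) = exp(2*pi*i*k*y/q) over integers lo <= y <= hi."""
    return sum(cmath.exp(2j * math.pi * k * y / q) for y in range(lo, hi + 1))

def dist_to_nearest_int(x):
    """||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))

def a3_bound(k, q, length):
    """min(|I|, 1 / (2 * ||k/q||)), valid when q does not divide k."""
    return min(length, 1.0 / (2.0 * dist_to_nearest_int(k / q)))
```

For instance, `abs(exp_sum(3, 7, 1, 5))` does not exceed `a3_bound(3, 7, 5)`; the case $k/q = 1/2$ with an odd number of terms shows that the bound is attained.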

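From (A3) and Lemma 1.6 we derive
\[
|E| \ll \frac1q\sum_{k=1}^{q-1}\min\Big\{|I_2|,\frac{1}{2\|k/q\|}\Big\}\,\sigma_0(q)\gcd(k,q)^{1/2}q^{1/2}\log q
\le \frac{\sigma_0(q)\log q}{\sqrt q}\sum_{d\mid q} d^{1/2}\sum_{\substack{k=1\\ \gcd(k,q)=d}}^{q-1}\frac{1}{2\|k/q\|}
\]
\[
\le \frac{\sigma_0(q)\log q}{\sqrt q}\sum_{d\mid q} d^{1/2}\sum_{m=1}^{\frac{q-1}{2d}}\frac{q}{dm}
\le \sigma_0(q)\,q^{1/2}\log q\sum_{d\mid q} d^{-1/2}\sum_{m=1}^{\frac{q-1}{2d}}\frac1m
\le \sigma_0(q)\,\sigma_{-1/2}(q)\,q^{1/2}\log^2 q.
\]
The lemma follows now from (A1), (A2) and the previous estimate.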

Acknowledgements. We are grateful to the referee for thorough advice concerning the exposition of this paper.

References 1. Berry, M.V., Tabor, V.: Level clustering in the regular spectrum. Proc. Royal Soc. London A356, 375–394 (1977) 2. Bleher, P.M.: The energy level spacing for two harmonic oscillators with golden mean ratio of frequencies. J. Stat. Phys. 61, 869–876 (1990) 3. Bleher, P.M.: The energy level spacing for two harmonic oscillators with generic ratio of frequencies. J. Stat. Phys. 63, 261–283 (1991) 4. Cobeli, C., Zaharescu, A.: On the distribution of primitive roots (mod p). Acta Arith. 83, 143–153 (1998) 5. Esterman, T.: On Kloosterman’s sums. Mathematika 8, 83–86 (1961) 6. Franel, J.: Les suites de Farey et le problème des nombres premiers. Göttinger Nachr. 191–201 (1924) 7. Franel, J., Landau, E.: Les suites de Farey et le problème des nombres premières. Göttinger Nachr. 202–206 (1924) 8. Gallagher, P.X.: On the distribution of primes in short intervals. Mathematika 23, 4–9 (1976) 9. Hall, R.R.: A note on Farey series. J. London Math. Soc. 2, 139–148 (1970) 10. Hall, R.R.: On consecutive Farey arcs II. Acta Arith. 66, 1–9 (1994) 11. Hall, R.R.: Linear diophantine equations and Franel integrals. J. Reine Angew. Mathematik 496, 93–105 (1998)


12. Hall, R.R., Tenenbaum, G.: On consecutive Farey arcs. Acta Arith. 44, 397–405 (1984) 13. Hall, R.R., Tenenbaum, G.: The set of multiples of a short interval. In: Number Theory, New York Seminar 1989–1990, D. V. Chudnovsky, G. V. Chudnovsky, H. Cohn and M. B. Nathanson (eds.), Berlin– Heidelberg–New York: Springer, 1991, pp. 119–128 14. Hardy, G.H., Wright, E.M.: An Introduction to the Theory of Numbers, Oxford: Clarendon Press, 1938 (fourth edition 1960) 15. Hooley, C.: An asymptotic formula in the theory of numbers. Proc. London Math. Soc. 7, 396–413 (1957) 16. Hooley, C.: On the intervals between consecutive terms of sequences. Proc. Symp. Pure Math. 24, 129–140 (1973) 17. Huxley, M.N.: The distribution of Farey points I. Acta Arith. 18, 281–287 (1971) 18. Huxley, M.N., Nowak, W.G.: Primitive lattice points in convex planar domains. Acta Arith. 76, 271–283 (1996) 19. Kanemitsu, S., Sita Rama Chandra Rao, R., Siva Rama Sarma, A.: Some sums involving Farey fractions. J. Math. Soc. Japan 34, 125–142 (1982) 20. Kargaev, P., Zhigljavsky, A.: Approximation of real numbers by rationals: Some metric theorems. J. Number Theory 61, 209–225 (1996) 21. Kargaev, P., Zhigljavsky, A.: Asymptotic distribution of the distance function to the Farey points. J. Number Theory 65, 130–149 (1997) 22. Katz, N., Sarnak, P.: Zeroes of zeta functions and symmetry. Bull. Am. Math. Soc. 36, 1–26 (1999) 23. Kurlberg, P., Rudnick, Z.: The distribution of spacings between quadratic residues. Duke Math. J. 100, 211–242 (1999) 24. LeVeque, W.L.: Fundamentals of number theory. Reading, MA: Addison-Wesley Publishing Co., 1977 25. Montgomery, H.L.: The pair correlation of zeros of the zeta function. Proc. Symp. Pure Math. 24, 181–193 (1973) 26. Pandey, A., Bohigas O., Giannoni, M.J.: Level repulsion in the spectrum of two-dimensional harmonic oscillators. J. Phys. A 22, 4083–4088 (1989) 27. Rudnick, Z., Sarnak, P.: Zeros of principal L-functions and random matrix theory. Duke Math. J. 
81, 269–322 (1996) 28. Rudnick, Z., Sarnak, P., Zaharescu, A.: The distribution of spacings between the fractional parts of n2 α. Preprint 1999 29. Sós, V.: On the distribution mod 1 of the sequence nα. Ann. Univ. Sci. Budapest. Eötvös Sect. Math. 1, 127–134 (1958) 30. Swierczkowski, S.: On successive settings of an arc on the circumference of a circle. Fundam. Math. 46, 187–189 (1958) 31. Weil, A.: On some exponential sums. Proc. Nat. Acad. Sci. USA 34, 204–207 (1948) Communicated by P. Sarnak

Commun. Math. Phys. 213, 471 – 489 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Lattice Dislocations in a 1-Dimensional Model Evgeni Korotyaev Math. Dept. 2, ETU, 5 Prof. Popov Str., St. Petersburg, 197376 Russia. E-mail: [email protected] Received: 5 April 1999 / Accepted: 3 March 2000

Abstract: The spectral properties of the Schrödinger operator $T(t) = -d^2/dx^2 + q(x,t)$ in $L^2(\mathbb R)$ are studied, where the potential $q$ is defined by $q = p(x+t)$, $x > 0$, and $q = p(x)$, $x < 0$; $p$ is a 1-periodic potential and $t\in\mathbb R$ is the dislocation parameter. For each $t$ the absolutely continuous spectrum $\sigma_{ac}(T(t)) = \sigma_{ac}(T(0))$ consists of intervals, which are separated by the gaps $\gamma_n(T(t)) = \gamma_n(T(0)) = (\alpha_n^-,\alpha_n^+)$, $n\ge 1$. We prove: in each gap $\gamma_n \ne \emptyset$, $n\ge 1$, there exist two unique "states" (an eigenvalue and a resonance) $\lambda_n^\pm(t)$ of the dislocation operator, such that $\lambda_n^\pm(0) = \alpha_n^\pm$ and the point $\lambda_n^\pm(t)$ runs clockwise around the gap $\gamma_n$, changing the energy sheet whenever it hits $\alpha_n^\pm$, making $n/2$ complete revolutions in unit time. On the first sheet $\lambda_n^\pm(t)$ is an eigenvalue and on the second sheet $\lambda_n^\pm(t)$ is a resonance. In general, these motions are not monotonic. There exists a unique state $\lambda_0(t)$ in the basic gap $\gamma_0(T(t)) = \gamma_0(T(0)) = (-\infty,\alpha_0^+)$. The asymptotics of $\lambda_n^\pm(t)$ as $n\to\infty$ is determined.

Partially supported by Russian Fund of Fundamental Research, INTAS and SFB 288.
Present address: Universität Potsdam, Institut für Mathematik, PF. 60 15 53, 14415 Potsdam, Germany. E-mail: [email protected]

1. Introduction and Main Results

We consider the Schrödinger operator $T(t) = -d^2/dx^2 + q(x,t)$ acting in $L^2(\mathbb R)$, where the real potential $q(\cdot,t)$ has the form
\[
q(x,t) = \begin{cases} p(x) & \text{if } x < 0,\\ p(x+t) & \text{if } x > 0,\end{cases} \tag{1.1}
\]
$p\in L^1(0,1)$ is a 1-periodic potential and $t\in\mathbb R$ is the dislocation parameter. For $t = 0$ we obtain the Hill operator $H = -d^2/dx^2 + p(x)$ in $L^2(\mathbb R)$. It is well known (see [T]) that the spectrum of $H$ is absolutely continuous and consists of intervals $\sigma_n = [\alpha_{n-1}^+,\alpha_n^-]$, where $\alpha_{n-1}^+ < \alpha_n^- \le \alpha_n^+$, $n\ge 1$. These intervals are separated by gaps $\gamma_n(H) = (\alpha_n^-,\alpha_n^+)$, $n\ge 1$, and we define $\gamma_0(H) = (-\infty,\alpha_0^+)$. If a gap degenerates, i.e., $\gamma_n = \emptyset$, then the corresponding segments $\sigma_n$, $\sigma_{n+1}$ merge. Without loss of generality we assume that $\alpha_0^+ = 0$. The sequence $0 = \alpha_0^+ < \alpha_1^- \le \alpha_1^+ < \dots$ is the spectrum of the equation $-y''$

+ py = λy with periodic boundary conditions of period 2, that is y(x + 2) = y(x), x ∈ R. Let (x, αn± ) be the corresponding real "normed" eigenfunctions, i.e., 1 ± 2 − + − 0 (x, αn ) dx = 1. Here the equality αn = αn means that αn is a double eigenvalue. + The lowest eigenvalue α0 is simple and the corresponding eigenfunction (x, α0+ ) has period 1. The eigenfunctions corresponding to αn± have period 1 when n is even and they are antiperiodic, (x + 1, αn± ) = −(x, αn± ), x ∈ R, when n is odd. We need some results on the dislocation operator from [K1]. The absolutely continuous spectrum σac (T (t)) consists of intervals σn (T (t)) = σn (H ), n 1. These intervals are separated by the gaps γn (T (t)) = γn (H ), n 1, and γ0 (T (t)) = γ0 (H ). Let #(T , ω) be the number of eigenvalues of the operator T on the interval ω. In general, there exist eigenvalues in the gap γn (H ) and for each t ∈ [0, 1] the following relations are fulfilled:

σ (T (t)) = σac (T (t)) ∪ σd (T (t)), σac (T (t)) = σ (H ), σd (T (t)) ⊂ ∪γn (H ), (1.2) #(T (t), γn (T (t))) 2, for each n 0. (1.3) Introduce the operator T0 (t) = −d 2 /dx 2 + χ− (x)p(x + t) + χ+ (x)p(x) acting in L2 (R), where χ± (x) = 1, ±x > 0, and χ± (x) = 0, ±x < 0. There exists an analytic continuation of (T (t) − λ)−1 on a two sheeted Riemann surface. Note that the Riemann surface of T (t) coincides with the Riemann surface of the Hill operator T (0). We define the resonance as a pole of (T (t) − λ)−1 on the second sheet. It is important that in this situation the resonances are real, more precisely they belong to the gaps on the second sheet. Moreover, λ1 ∈ γn (T (t) is the eigenvalue (the resonance ) of T (t) for some n 0 and t iff the same number λ1 , but on another sheet of the energy surface is the resonance (the eigenvalue ) of T0 (t). We call λ a state if λ is an eigenvalue or a resonance on some gap or λ is an end of some gap (a virtual state). Changing t we get different potentials q(·, t). If t = 0 then q(x, 0) = p(x) is periodic and T (0) is the Hill operator and eigenvalues are absent, we have the two virtual states at the end of the gap γn , n 1, and one virtual state at the end of the basic gap γ0 . But if t is small and t = 0 then there + ± exist two states λ± n (t) → αn and one state λ0 (t) → α0 as t → 0. Let p be analytic. Rewriting the potential q(x, t) in the form q(x, t) = p(x) + χ+ (x)(p(x + t) − p(x)), we deduce that q(·, t) is an analytic function of t. Therefore, the operator T (t) is real analytic in t. Slitting the nth gap γn = ∅ we obtain a cut γn0 with two sides: γn+ = (αn− , αn+ ) is the upper side and γn− = (αn− , αn+ ) is the lower one. Introduce the domain = C \ ∪γn0 . From a physical point of view, + = \ C− is the upper half-plane of the first sheet and − = \ C+ is the lower half-plane of the second sheet. Hence, γn+ is the gap on the first sheet and γn− is the gap on the second sheet. 
This domain is more convenient than the Riemann surface with two sheets. Introduce the arc (α, β) ⊂ γn0 with the points α, β on the cut γn0 by the rule: 1) let α ∈ γn+ , β ∈ γn+ , then the arc (α, β) = (α, β) ⊂ γn+ , 2) let α ∈ γn+ , β ∈ γn− , then the arc (α, β) = (α, αn+ ] ∪ (αn+ , β) where (α, αn+ ) ⊂ γn+ , (αn+ , β) ⊂ γn− , and so on. The spectral properties of the Schrödinger operator with a junction of two periodic potentials, i.e., the operator T12 (t) = −d 2 /dx 2 + p1 (x)χ− (x) + χ+ (x)p2 (x + t) acting in L2 (R), where p1 , p2 are periodic potentials, were studied in the various papers [A1-2, DS, GNP, Ka, K1] (see the literature in [Ka]). Such operators arise in various problems,


for example, non-linear equations [BS], the inverse spectral problem [A1], and so on. If p1 = p2 = p then T12 (t) = T (t) and the spectrum of T (t) has the form (1.2-3). Tamm [Ta] was the first, who considered Schrödinger operators with these potentials (the Kronig-Penny model with periodic p2 and p1 =const) and proved the existence of eigenvalues (the famous surface states). In [K1] these eigenvalues were studied, but the dynamics of eigenvalues when t runs through the unit interval [0, 1] was not considered. The basic goal of this paper is to study the motion of eigenvalues in the gaps, when the dislocation parameter t runs through the unit interval [0, 1]. These results are relevant in studying the 3-dimensional case, the KdV equation, the inverse spectral problem, and the spectral properties of T (t) + V (x), where V (x) → 0, as x → ±∞ or V = εx (Stark effect), and so on. Let us describe the main results of the present paper: i) For each n 1 there exist two unique states λ± n (t) of the dislocation operator T (t) ± (0) = α ± , and the point λ± (t) ∈ γ runs in the th gap, such that λ± (·) ∈ C(R), λ n n n n n clockwise around the gap γn , changing sheets when it hits αn± , making n/2 complete ± ± ∓ ± revolutions in unit time; λ± 2n (1) = α2n and λ2n+1 (1) = α2n+1 and the motion of λn (t) is 2-periodic. i) For each t ∈ [0, 1] there exists a unique state λ0 (t) in the basic gap γ0 . iii) The asymptotics of λ± n (t) as n → ∞ are determined. Let ϕ(x, λ, t), ϑ(x, λ, t) be the solutions of the equation −y

+ p(x + t)y = λy,

λ ∈ C,

t ∈ R,

(1.4)

satisfying ϕx (0, λ, t) = ϑ(0, λ, t) = 1, and ϕ(0, λ, t) = ϑx (0, λ, t) = 0. Introduce the positive function S(λ, t) =

ϕλ (1, λ, t) < 1, ϕλ (1, λ, t) + ϕλ (1, λ, 0)

λ ∈ γn ,

n 0.

We need some results concerning the operator H+ (t)f = −f

+ p(x + t)f in L2 (R+ ) with the Dirichlet boundary condition f (0) = 0. Zheludev [Z] proved that σ (H+ (t)) = σac (H+ (t))∪σd (H+ (t)), where σac (H+ (t)) = σ (H ) and in each gap γn = ∅ there exist one eigenvalue or one resonance. The dynamics of this state was studied in the papers devoted to the KdV equation and the inverse problem (see [D, Tr]). Let µn (t) be an eigenvalue or a resonance of H+ (t)), i.e., the “eigenvalue” of Eq. (1.4) with the condition y(0) = 0, x 0, and fixed parameter t ∈ R. Then µn (t) runs monotonically clockwise around the gap γn , changing sheets when it hits αn± , making n complete revolutions in unit time. The number µn (t) lies on the upper side γn+ when µ˙ n (t) > 0 (the eigenvalue ) or on the lower side γn− when µ˙ n (t) < 0 (a resonance). Here and below (˙) = ∂/∂t. The numbers µn (t), n 1, coincide with the Dirichlet spectrum of p(x + t) on the interval [0, 1], i.e., the spectrum of (1.4) with boundary conditions y(0) = y(1) = 0. Let νn (t), n 0, be the Neumann spectrum of p(x + t), that is the spectrum of (1.4) with boundary conditions y (0) = y (1) = 0. Introduce the set T = R/Z. For the point λ ∈ γn0 we define a new point λ∗ , which has the same projection onto C but lies on the other side of γn0 . There exist points 0 τn,1 < τn,2 <, . . . , τn,n < 1, such that µn (τn,m ) = µn (0)∗ , m = 1, . . . , n. Note, if µn (0) = αn± , then µn (0)∗ = αn± . Theorem 1.1. Let p ∈ L1 (0, 1). Then for each gap γn = ∅, n 1, and any t ∈ R, th gap. The point λ± (t) there exist two unique states λ± n (t) of the operator T (t) in the n ± runs clockwise around the gap γn changing sheets when it hits αn , making n/2 complete revolutions in unit time. The function λ± n (·) belongs to C(2T) and has the properties:


∗ + i) λ− n (t) belongs to the arc (µn (t), µn (0) ) for some t ∈ R iff λn (t) belongs to the ∗ − arc (µn (0) , µn (t)). Conversely, λn (t) belongs to the arc (µn (0)∗ , µn (t)) for some ∗ ± 0 t ∈ R iff λ+ n (t) belongs to the arc (µn (t), µn (0) ). The point λn (t0 ) = µn (t0 ) ∈ γn , ± ∗ ± ± ± for some t0 ∈ T iff λn (t0 ) = µn (0) . If µn (0) ∈ γn ∪ αn , then λn (τn,m ) = µn (0)∗ ∗ for odd m and λ∓ n (τn,m ) = µn (0) for even m. Moreover, if τ ≡ t − τn,m → 0 for some m = 1, . . . , n, then the following asymptotic estimates hold: ∗ ˙ n,m )S(µn (0), τn,m )τ + O(τ 2 ), µn (0) = αn± , (1.5) λ± n (t) = µn (0) + µ(τ 1 ∗ ¨ n,m )S(µn (0), τn,m )2 τ 2 + O(τ 3 ), µn (0) = αn± . (1.6) λ± n (t) = µn (0) + µ(τ 2

ii) The following identities are fulfilled: ± λ± n (0) = αn ,

± λ± 2n (t + 1) = λ2n (t),

∓ λ± 2n+1 (t + 1) = λ2n+1 (t),

t ∈ R. (1.7)

iii) If p is real analytic then λ± n (·) is real analytic on R. th gap, that is, the Remark. Below λ± n (t) is the state of the dislocation operator in the eigenvalue, or the resonance or the virtual state (the end of the gap). In other words, ± one has the following result: Slit the th gap γn = ∅ and let λ± n (0) = αn . Then the state ± 0 λn (t) runs clockwise around the gap γn , changing sheets when it hits αn± , making n/2 complete revolutions in unit time. The motion of λ± n (t) is not monotonic (see [K1]). The ± state λ± n (t) on the upper side is an eigenvalue, and λn (t) on the lower side is a resonance, and on the end of the gap is a virtual state. The eigenvalue µn (t) moves faster than λ± n (t) + (t)) at the point µ (0)∗ . At this instant µ (t) and λ− (t) and overtakes λ− (t) (or λ n n n n n (or λ+ n (t)) move in the same direction. This local dynamics is important for the global motion of the states. The point µn (0)∗ is the unique "meeting place" for λ± n (t), µn (t).

˙ αn± )/(t, αn± ). We now consider the ground state in the basic gap Let ζ (t, αn± ) = (t, γ0 . Theorem 1.2. Suppose p ∈ L1 (0, 1). Then for each t ∈ R there exists a unique state λ0 (t) of the dislocation operator T (t) in the basic gap γ0 , such that the function λ0 (·) : R → γ0 belongs to C(T), λ0 (0) = α0+ and has the following properties: i) if p is real analytic, then λ0 (·) is real analytic, ii) if ζ (t, α0+ ) > ζ (0, α0+ ) for some t ∈ T, then λ0 (t) is an eigenvalue, if ζ (t, α0+ ) < ζ (0, α0+ ) for some t ∈ T, then λ0 (t) is a resonance, if ζ (t, α0+ ) = ζ (0, α0+ ) for some t ∈ T, then λ0 (t) = 0 is a virtual state, iii) for each t ∈ [0, 1], the state λ0 (t) lies on the interval [min{ν0 (t), ν0 (0)}, α0+ ]. Remark. Using Theorem 1.2, we obtain: i) if ζ (0, α0+ ) = max{ζ (τ, α0+ )} then each state λ0 (t), t ∈ [0, 1] is a resonance or a virtual state only, ii) if ζ (0, α0+ ) = min{ζ (τ, α0+ )}, then each state λ0 (t), t ∈ [0, 1] is an eigenvalue or a virtual state only. We construct a potential q(x, t) such that the state λ0 (t) = 0 for each t ∈ (α, β). Example. There exists a potential p such that for the dislocation potential q(x, t) the state λ0 (t) does not move and λ0 (t) = 0 for each t ∈ B = (α, β) ⊂ [−1/2, 1/2], where 0 ∈ B.


Proof. We take the function (x, 0) = c(1 + g(x)), where g is a 1-periodic positive smooth function and c > 0 is a constant. Define the potential p(x) = g(x)

/(1 + g(x)). Assume that g(x) = const if x ∈ B. Then since ζ (t, α0+ ) = g(t) /(1 + g(t)) we get ν0 (0) = 0, ν0 (t) = 0, t ∈ B. Hence by iii) of Theorem 1.2, λ0 (t) = 0, t ∈ B. Therefore, we have constructed a potential p such that the state λ0 (t) = 0, t ∈ B, and hence it does not move for t ∈ B. Introduce the real functions zn (t), u± n (t) by the formulas 1 |γn |eizn (t) = (αn − µn (t)) + ign (µn (t)), 2 1 ± ± ± |γn |eiun (t) = (αn − λ± n (t)) + ign (λn (t))), 2 αn = 21 (αn+ + αn− ), u± n (0) = 0. We formulate the results concerning states at high energy. Theorem 1.3. Let p ∈ L1 (0, 1). Then the following asymptotic estimates are fulfilled: u± n (t) =

zn (t) − zn (0) (1.8) 2 1 p(x + t)(2x − 1) 1 ± cos[zn (0) + π n(2x + t)] dx + O(n−2 ) − 4πn 0

as n → ∞, uniformly in t ∈ [0, 2]. Remark. The substitution of (2.15) into (1.8) yields the asymtotics of u± n in terms of p. Let us shortly describe the proof. Each eigenvalue or resonance of T (t) is the zero of the corresponding Wronskian. Using results of [K1] and techniques from inverse spectral theory [L, Tr, K2] we study how the zero depends on t. In order to show (1.8) we need the asymptotic estimates from [K2]. 2. Preliminaries For the Hill operator H (t) = −d 2 /dx 2 + p(x + t) in L2 (R) we introduce the Lyapunov function by the formula 1 5(λ, t) = ϕx (1, λ, t) + ϑ(1, λ, t) . 2 Note that 5(λ, t) = 5(λ, 0) ≡ 5(λ) for all t ∈ R, λ ∈ C (see [IM]) and 5(αn± ) = (−1)n , n 0. We define the quasimomentum by the formula k(λ) = arccos 5(λ), λ ∈ (see [F, MO]). The function k(λ) is a conformal mapping from onto a quasimomentum domain K = {Rek 0} \ ∪cn , where cn = [π n − hn , π n + ihn ] is an excised slit with the height hn 0, n 1. The function k maps γn on the slit cn . With each end point of a non-degenerate gap γn , we associate an effective mass Mn± = 1/λ

(k(αn± )), and let M0+ = 1/λ

(0), where λ(k) is the inverse function for k(λ). It is well known that if |γn | = 0, then ±Mn± > 0,

and

λ(k) = αn± +

(k − π n)2 (1 + o(1)) as 2Mn±

λ → αn± .
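The invariance of the Lyapunov function under the shift $t$ and its values $(-1)^n$ at the band edges can be illustrated numerically. The sketch below integrates $-y'' + p(x+t)y = \lambda y$ over one period with a hand-written classical Runge-Kutta scheme and returns $\frac12(\varphi_x(1,\lambda,t) + \vartheta(1,\lambda,t))$. The function name `lyapunov`, the step count, and the sample potential used in the usage note are assumptions made for illustration only; this is not the method of the paper.

```python
import math

def lyapunov(p, lam, t=0.0, steps=2000):
    """Approximate (phi_x(1) + theta(1)) / 2 for -y'' + p(x + t) y = lam * y,
    where phi(0) = 0, phi_x(0) = 1 and theta(0) = 1, theta_x(0) = 0."""
    h = 1.0 / steps

    def rhs(x, y, dy):
        # first-order system: y' = dy, dy' = (p(x + t) - lam) * y
        return dy, (p(x + t) - lam) * y

    def propagate(y, dy):
        # classical fourth-order Runge-Kutta from x = 0 to x = 1
        x = 0.0
        for _ in range(steps):
            k1y, k1d = rhs(x, y, dy)
            k2y, k2d = rhs(x + h / 2, y + h / 2 * k1y, dy + h / 2 * k1d)
            k3y, k3d = rhs(x + h / 2, y + h / 2 * k2y, dy + h / 2 * k2d)
            k4y, k4d = rhs(x + h, y + h * k3y, dy + h * k3d)
            y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
            dy += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
            x += h
        return y, dy

    _, phi_x = propagate(0.0, 1.0)   # phi(1), phi_x(1): keep the derivative
    theta, _ = propagate(1.0, 0.0)   # theta(1), theta_x(1): keep the value
    return 0.5 * (phi_x + theta)
```

For the zero potential the result reduces to $\cos\sqrt\lambda$, so `lyapunov(lambda x: 0.0, math.pi ** 2)` is close to $-1$, the value at the (degenerate) first gap. For a sample 1-periodic potential such as `p = lambda x: 3 * math.cos(2 * math.pi * x)`, the values `lyapunov(p, 5.0, 0.3)` and `lyapunov(p, 5.0, 0.0)` agree up to discretisation error, in line with the independence of $t$.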


Recall that $\mu_n(t)$, $n \ge 1$, is the Dirichlet spectrum of $p(x+t)$, i.e., the spectrum of (1.4) with the boundary condition $y(0) = y(1) = 0$. We need some results on Eq. (1.4) from [L, PT, T]. It is well known that $\phi(1,\lambda,t)$ is an entire function of $\lambda$ of order $1/2$ at fixed $t$. The zeros of $\phi(1,\lambda,t)$ coincide with the Dirichlet eigenvalues $\mu_n(t)$, $n \ge 1$, and the following asymptotic estimates are fulfilled:
$$\mu_n(t) = (\pi n)^2 + \int_0^1 p(x)\,dx - \int_0^1 p(x+t)\cos 2\pi n x\,dx + O(1/n) \quad\text{as } n \to \infty, \tag{2.1}$$
uniformly on bounded subsets of $[0,1] \times L^1(0,1)$. It is well known that $\vartheta_x(1,\lambda,t)$ is an entire function of $\lambda$ of order $1/2$ at fixed $t$. The zeros of $\vartheta_x(1,\lambda,t)$ coincide with the Neumann eigenvalues $\nu_n(t)$, $n \ge 0$. The functions $\mu_n(t)$, $\nu_n(t)$ are 1-periodic. If the parameter $t$ passes the interval $[0,1]$ then $\mu_n(t)$, $\nu_n(t)$, $n \ge 1$, run through the gap $\gamma_n = (\alpha_n^-, \alpha_n^+)$. If the gap $\gamma_n = \emptyset$ then $\mu_n(t)$, $\nu_n(t)$ do not move and $\mu_n(t) = \nu_n(t) = \alpha_n^\pm$. The eigenvalue $\nu_0(t) \le \alpha_0^+$ for each $t \in \mathbb{R}$. In the book [PT] there are asymptotics of the solutions $\phi(x,\lambda,0)$, $\vartheta(x,\lambda,0)$ as $|\lambda| \to \infty$. Repeating the argument for Eq. (1.4) we obtain the asymptotics for $\phi(x,\lambda,t)$, $\vartheta(x,\lambda,t)$ as $|\lambda| \to \infty$. For example,
$$\phi(x,\lambda,t) = \frac{\sin\sqrt{\lambda}\,x}{\sqrt{\lambda}} + \frac{1}{2\lambda}\int_0^x p(y+t)\bigl[\cos\sqrt{\lambda}(2y-x) - \cos\sqrt{\lambda}\,x\bigr]dy + O\!\left(\frac{\exp(|\sqrt{\lambda}|\,x)}{\lambda^{3/2}}\right), \tag{2.2}$$
as $x, t \in [0,1]$, $|\lambda| \to \infty$. These asymptotics can be differentiated with respect to $x$ and/or $\lambda$ and are uniform on bounded subsets of $[0,1] \times [0,1] \times L^1(0,1)$. We have the identities
$$\phi(1,\lambda,t) = \prod_{n\ge1}\frac{\mu_n(t)-\lambda}{(\pi n)^2},\qquad \lambda \in \mathbb{C},\ t \in \mathbb{R}, \tag{2.3}$$

ϕ(1, αn± , t)

= (−1)n 2Mn± (t, αn± )2 ,

t ∈ R,

n 0,

(2.4)

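see [Tr]. Define a function $\varphi_n(\lambda,t) \equiv (-1)^n\phi(1,\lambda,t)/(\lambda-\mu_n(t))$. Then we have
$$\varphi_n(\lambda,t) > 0,\qquad \lambda \in \gamma_n,\ n \ge 1. \tag{2.5}$$
Introduce the function $b(\lambda) = -i\sin k(\lambda)$. The definition of the quasimomentum yields $b(\lambda) = -i\sin(\pi n + iv) = (-1)^n\sinh v$ for $k = \pi n + iv(\lambda)$, and then
$$b(\lambda) = (-1)^n\sinh v(\lambda) = (-1)^n\sqrt{\Delta^2(\lambda)-1},\qquad k = \pi n + iv(\lambda),\ \lambda \in \gamma_n,\ n \ge 0, \tag{2.6}$$
where $\sqrt{\Delta(\lambda)^2-1} > 0$ as $\lambda \in \gamma_n^+$ and $\sqrt{\Delta(\lambda)^2-1} < 0$ as $\lambda \in \gamma_n^-$. Below we need the identity (see [KK2])
$$M_n^\pm = -\Delta(\alpha_n^\pm)\,\Delta'(\alpha_n^\pm),\qquad n \ge 0. \tag{2.7}$$
Identities (2.6-7) yield
$$b(\lambda) = (-1)^n z\bigl(\sqrt{|2M_n^\pm|} + O(z^2)\bigr) \quad\text{as } \lambda \in \gamma_n \ne \emptyset,\ \lambda = \alpha_n^\pm \mp z^2,\ z \to 0. \tag{2.8}$$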
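The first equality in (2.6) is elementary, and a quick numerical sanity check makes the sign convention $(-1)^n$ explicit. The following is a minimal sketch, not part of the original argument; the function names are hypothetical and chosen only for illustration.

```python
import cmath
import math


def b_from_quasimomentum(n: int, v: float) -> complex:
    """Evaluate b = -i sin(k) at the point k = pi*n + i*v of the slit c_n."""
    k = complex(math.pi * n, v)
    return -1j * cmath.sin(k)


def b_closed_form(n: int, v: float) -> float:
    """Closed form used in (2.6): (-1)^n sinh v."""
    return (-1) ** n * math.sinh(v)


# Compare both expressions on a small grid of gap indices and heights.
for n in range(4):
    for v in (0.0, 0.3, 1.7):
        assert abs(b_from_quasimomentum(n, v) - b_closed_form(n, v)) < 1e-12
```

The check only covers the trigonometric identity; it says nothing about the asymptotics (2.8), which depend on the effective masses.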

Lattice Dislocations in a 1-Dimensional Model


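We write the Weyl function $m^\pm$ for (1.4) in the form
$$m^\pm(\lambda,t) = \frac{a(\lambda,t) \pm i\sin k(\lambda)}{\phi(1,\lambda,t)} = \frac{a(\lambda,t) \mp b(\lambda)}{\phi(1,\lambda,t)},\qquad a(\lambda,t) = \frac{\phi_x(1,\lambda,t) - \vartheta(1,\lambda,t)}{2}, \tag{2.9}$$
and we need the useful identity
$$a^2(\lambda,t) + 1 - \Delta^2(\lambda) = a^2(\lambda,t) - b^2(\lambda) = -\phi(1,\lambda,t)\,\vartheta_x(1,\lambda,t). \tag{2.10}$$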
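Identity (2.10) is a restatement of the Wronskian relation $\vartheta\phi_x - \vartheta_x\phi = 1$ at $x = 1$. The sketch below checks it numerically for a hypothetical two-step potential (equal to `c1` on $[0,1/2)$ and `c2` on $[1/2,1)$, with $t = 0$), for which the fundamental solutions are available in closed form. The potential and all function names are assumptions made for illustration; they do not appear in the paper.

```python
import cmath


def step_matrix(lam, c, h):
    """Transfer matrix of -y'' + c*y = lam*y over a step of length h, acting on (y, y')."""
    w = cmath.sqrt(lam - c)
    if abs(w) < 1e-12:
        # Degenerate case lam == c: solutions are affine functions.
        return [[1.0, h], [0.0, 1.0]]
    return [[cmath.cos(w * h), cmath.sin(w * h) / w],
            [-w * cmath.sin(w * h), cmath.cos(w * h)]]


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def monodromy(lam, c1, c2):
    """Monodromy matrix at x = 1; columns hold (theta, theta') and (phi, phi')."""
    return matmul(step_matrix(lam, c2, 0.5), step_matrix(lam, c1, 0.5))


def identity_2_10_residual(lam, c1, c2):
    """Return a^2 + 1 - Delta^2 + phi * theta_x, which (2.10) says is zero."""
    M = monodromy(lam, c1, c2)
    theta, phi = M[0][0], M[0][1]
    theta_x, phi_x = M[1][0], M[1][1]
    delta = (phi_x + theta) / 2  # Lyapunov function
    a = (phi_x - theta) / 2      # the function a of (2.9)
    return a * a + 1 - delta * delta + phi * theta_x


for lam in (7.3, -3.0, 2.0 + 1.5j):
    assert abs(identity_2_10_residual(lam, 1.0, -2.0)) < 1e-9
```

The residual vanishes for real and complex $\lambda$ alike, since it equals $1 - \det M$ for the monodromy matrix $M$.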

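Below we need the following equations from [IM]:
$$\dot\vartheta_x(1,\lambda,t) = (\lambda-p(t))\,\dot\phi(1,\lambda,t),\qquad \dot\phi(1,\lambda,t) = 2a(\lambda,t),\qquad \dot a(\lambda,t) = -\vartheta_x(1,\lambda,t) - (\lambda-p(t))\,\phi(1,\lambda,t), \tag{2.11}$$
and the formulas
$$\phi(x,\lambda,t) = \vartheta(t,\lambda)\phi(x+t,\lambda) - \vartheta(x+t,\lambda)\phi(t,\lambda), \tag{2.12}$$
$$\vartheta(x,\lambda,t) = \vartheta(x+t,\lambda)\phi_x(t,\lambda) - \vartheta_x(t,\lambda)\phi(x+t,\lambda), \tag{2.13}$$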

for $x, t \in [0,1]$, $\lambda \in \mathbb{C}$. Introduce the functions $g_n(\lambda) = \sqrt{(\alpha_n^+-\lambda)(\lambda-\alpha_n^-)}$, λ ∈ , where $\pm g_n(\lambda) > 0$, $\lambda \in \gamma_n^\pm$, and
$$f_n(\lambda) = \frac{\sqrt{\Delta^2(\lambda)-1}}{g_n(\lambda)} > 0,\qquad r_n(\lambda,t) = \frac{2f_n(\lambda)}{\varphi_n(\lambda,t)} > 0,\qquad \lambda \in \gamma_n,\ n \ge 1.$$
The functions $f_n$, $\varphi_n$, $r_n$ are analytic in the disk $U_n = \{\lambda : |\lambda-\alpha_n| < \rho_n\}$, for some $\rho_n > |\gamma_n|/2$. We need the following result on the Dirichlet spectrum from [K2]. Let $C^m$, $m \ge 0$, be the space of $m$ times continuously differentiable real-valued functions of period 1.

Lemma 2.1. Let $p \in L^1(0,1)$ be a 1-periodic potential. Then there exists a real function $z_n(t)$, $n \ge 1$, such that $\mu_n(t) = \tfrac12[\alpha_n^- + \alpha_n^+ - |\gamma_n|\cos z_n(t)]$ and $z_n \in C^2([0,1])$, $d^3z_n/dt^3 \in L^1(0,1)$. Its derivative is given by the formula
$$\dot z_n(t) = r_n(\mu_n(t),t) > 0,\qquad t \in \mathbb{R}, \tag{2.14}$$
and the following asymptotic estimates are fulfilled:
$$z_n(t) = z_n(0) + 2\pi n t - \frac{1}{\pi n}\int_0^t p(x)\,dx + o\!\left(\frac1n\right), \tag{2.15}$$
$$\dot z_n(t) = 2\pi n + \int_0^1 p(x+t)(2x-1)\sin 2\pi n x\,dx + o\!\left(\frac1n\right), \tag{2.16}$$
as $n \to \infty$ uniformly on bounded subsets of $[0,1] \times L^1(0,1)$. Moreover, suppose $\mu_n(t_0) = \alpha_n^\pm$ for some $t_0 \in [0,1]$ and $n \ge 1$. Then
$$\mu_n(t_0+t) = \mu_n(t_0) + \frac{t^2}{2}\,\ddot\mu_n(t_0) + o(t^2),\qquad \ddot\mu_n(t_0) = -\frac{4M_n^\pm}{\phi_\lambda(1,\mu_n(t_0),t_0)^2},\quad\text{as } t \to 0. \tag{2.17}$$


The Bloch functions ψ ± (·, λ, t) ∈ L2 (R± ), λ ∈ C+ , for the operator H (t), t ∈ [0, 1], are defined by the following formulas: ψ ± (x, λ, t) = ϑ(x, λ, t) + m± (λ, t)ϕ(x, λ, t),

λ ∈ ,

(2.18)

the function exp(∓ik(λ)x)ψ ± (x, λ, t) is 1−periodic in x for any fixed λ, t. We introduce the functions ± (x, λ, t), which are the solutions of the equation −y

+ q(x, t)y = λy, belong to L2 (R± ) for any λ ∈ C+ , t ∈ R and have the form (see [K1]) if x > 0, ψ + (x, λ, t) + (x, λ, t) = w1 (λ,0)−w(λ,t) − (2.19) w(λ,t) + ψ (x, λ, 0) + ψ (x, λ, 0) if x < 0, w1 (λ,0) w1 (λ,0) λ ∈ , and similarly we have

w1 (λ,t)−w(λ,t) + ψ (x, λ, t) + ww(λ,t) ψ − (x, λ, t) w1 (λ,t) 1 (λ,t) ψ − (x, λ, 0)

if x > 0, if x < 0,

(2.20)

w(λ, t) = { − (x, λ, t), + (x, λ, t)} = m+ (λ, t) − m− (λ, 0), λ ∈ ,

(2.21)

− (x, λ, t) =

λ ∈ , where the Wronskians have the form −

+

+

−

w1 (λ, t) = {ψ (x, λ, t), ψ (x, λ, t)} = m (λ, t) − m (λ, t),

λ ∈ .

(2.22)

The functions w, w1 , ± have analytic continuations across the set σ (H ) into the "second sheet" − = C− . Recall that for the point λ ∈ γn0 we define a new point λ∗ , which has the same projection onto C but lies on the other side of γn0 . For the point λ ∈ γn+ , n 0 the following identities are fulfilled: w1 (λ∗ , t) = −w1 (λ, t), ±

∗

∓

m± (λ∗ , t) = m∓ (λ, t), ∗

−

(2.23) +

ψ (·, λ , t) = ψ (·, λ, t), w(λ , t) = m (λ, t) − m (λ, 0),

(2.24)

and +

∗

−

∗

(x, λ , t) = (x, λ , t) =

ψ − (x, λ, t) + (x, λ, t) + − (x, λ, t) + ψ + (x, λ)

w1 (λ,t) − + w1 (λ,0) (ψ (x, λ, 0) − ψ (x, λ, 0)) w1 (λ,0) + − w1 (λ,t) (ψ (x, λ, t) − ψ (x, λ, t))

if x > 0, if x < 0, (2.25) if x > 0, ifx < 0. (2.26)

The kernel R(x, x , λ, t) of the operator (T (t) − λ)−1 , λ ∈ + has the form 1 + (x, λ, t) − (x , λ, t) if x > x ,

R(x, x , λ, t) = w(λ, t) − (x, λ, t) + (x , λ, t) if x < x .

(2.27)

For each x, x , t ∈ R the function R(x, x , λ, t) has an analytic (meromorphic) continuation on the “second” sheet − . We prove now the main result of this section.


Lemma 2.2. Let p ∈ L1 (0, 1) be a 1-periodic real potential. i) Assume w(αn± , t1 ) = 0 for some t1 ∈ R and n 0; let µn (t1 ) = αn± = µn (0) if n 1. Then for some εn > 0 there exists a unique real function zn± (t1 + ·) ∈ ± ± 2 C(−εn , εn ) such that zn± (t1 ) = 0. If zn± (t) > 0 then the state λ± n (t) ≡ αn ∓ zn (t) is an eigenvalue of T (t). If zn± (t) < 0 then the state λ± (t) is a resonance of T (t). n Moreover, the following asymptotic estimates are fulfilled: ± zn (t) = ∓ ±2Mn± r(t1 , αn± ) t (2.28) ˙ 1 , α ± )2 − (p(t) − α ± )(t1 , α ± )2 ]dt + O(t − t1 ), [(t t1

as t → t1 ,

n

n

n

where r(t, αn± )−1 = 1 + ((t, αn± )2 /(0, αn± )2 ).

In addition, suppose p is real analytic, then zn± (t1 + ·) is real analytic on (−εn , εn ). ii) Let µn (0) = αn± = µn (t1 ) for some γn = ∅, n 1, t1 ∈ R. Then for some εn > 0 there exists a unique real function zn± (t1 + ·) ∈ C(−εn , εn ) such that zn± (t1 ) = 0. If ± ± 2 0 < ∓(t − t1 ) < εn then the state λ± n (t) = αn ∓ zn (t) is an eigenvalue of T (t). ± If −εn < ∓(t − t1 ) < 0 then the state λn (t) is a resonance of T (t). Moreover, the following asymptotic estimates hold: zn± (t1 + τ ) = ∓τ |µ¨ n (t1 )/2|S(µn (t1 ), t1 ) + O(τ 2 ), τ = t − t1 → 0. (2.29) Assume that p is real analytic, then zn± (t1 + ·) is real analytic on (−εn , εn ). iii) For each t ∈ R there exists an analytic continuation of (T (t)−λ)−1 , λ ∈ C+ , across the set σ (H ) into the lower half plane C− (the second sheet). Moreover, there exist the resonances, which lie only on the gaps γn− , n 0, (on the second sheet). The number λ1 is an eigenvalue (a resonance) of T (t) for some n 0 and some t iff λ1 is the resonance (the eigenvalue) of T0 (t). Proof. The statements of i), iii) were proved in [K1]. ii) We consider the case µn (0) = αn− and let n be even. The proof for µn (0) = αn+ or odd n is the same. Let a = a(λ, t), ϑx = ϑx (1, λ, t), a 0 = a(λ, 0), ϑx0 = ϑx (1, λ, 0) and so on. Assume that w(λ, t) = 0, i.e., a0 + b a−b . = ϕ ϕ0 Multiplying the last identity by (a + b)(a 0 − b) and using (2.10) we obtain −ϑx (a 0 − b) + ϑx0 (a + b) = 0. Therefore, we have the equation for the states: C(z, t) ≡ b(ϑx + ϑx0 ) + (aϑx0 − a 0 ϑx ) = 0,

λ = αn− + z2 .

(2.30)

Define the function $h(t) = \vartheta_x(1,\alpha_n^-,t)$ and note $h(0)h(t_1) > 0$. Asymptotics (2.8) yield $C_z(0,t_1) = \beta_n(h(t_1)+h(0)) \ne 0$, $\beta_n = \sqrt{|2M_n^-|} > 0$. The functions $C(z,t)$, $C_z(z,t)$ are analytic in $z$ and continuous in $t$, and (2.11) yields
$$\dot C(z,t) = (b-a^0)\,\dot\vartheta_x + \dot a\,\vartheta_x^0 = -\vartheta_x\vartheta_x^0 + (\lambda-p)\bigl[(b-a^0)\dot\phi - \phi\,\vartheta_x^0\bigr]. \tag{2.31}$$
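For the reader's convenience, the computation behind (2.31) can be spelled out. This is a sketch that uses only the definition of $C$ in (2.30), the evolution equations (2.11), and the fact that $b$, $a^0$ and $\vartheta_x^0$ do not depend on $t$:

```latex
\begin{aligned}
\dot C &= b\,\dot\vartheta_x + \dot a\,\vartheta_x^0 - a^0\,\dot\vartheta_x
        = (b-a^0)\,\dot\vartheta_x + \dot a\,\vartheta_x^0 \\
       &= (b-a^0)(\lambda-p)\,\dot\phi
          - \bigl[\vartheta_x + (\lambda-p)\,\phi\bigr]\vartheta_x^0 \\
       &= -\vartheta_x\vartheta_x^0
          + (\lambda-p)\bigl[(b-a^0)\,\dot\phi - \phi\,\vartheta_x^0\bigr].
\end{aligned}
```

The second line substitutes $\dot\vartheta_x = (\lambda-p)\dot\phi$ and $\dot a = -\vartheta_x - (\lambda-p)\phi$ from (2.11); the third line only regroups the terms.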


Then by the implicit function theorem, for some $\varepsilon > 0$ there exists the function $z(\cdot) \in C(-\varepsilon,\varepsilon)$ such that $C(z(t),t) = 0$ and $z(0) = 0$. We will prove (2.29) for even $n$; the proof for odd $n$ is the same. Recall that $\phi(1,\lambda,t) = (\lambda-\mu_n(t))\varphi_n(\lambda,t)$, and using (2.17), (2.11) we have
$$\dot\phi = -\dot\mu_n\varphi_n + (\lambda-\mu_n)\dot\varphi_n,\qquad \ddot\phi = -\ddot\mu_n\varphi_n - 2\dot\mu_n\dot\varphi_n + (\lambda-\mu_n)\ddot\varphi_n, \tag{2.32}$$
$$\phi(1,\lambda,t) = O(|z|^2+\tau^2),\qquad \dot\phi(1,\lambda,t) = O(|z|^2+|\tau|), \tag{2.33}$$
$$h(t)-h(t_1) = \int_{t_1}^t \dot\vartheta_x(1,\alpha_n^-,t)\,dt = \int_{t_1}^t (\alpha_n^- - p(t))\,a(\alpha_n^-,t)\,dt = O(\tau), \tag{2.34}$$
as $\tau \to 0$, and (2.32–34) imply
$$C_z(z,t) = \beta_n(h(0)+h(t_1)) + O(|z|+|\tau|), \tag{2.35}$$
$$\dot C(z,t) = -h(0)h(t_1) + p(t)\,O(\tau^2+z^2) + O(|\tau|+|z|) \tag{2.36}$$

as τ → 0. Recall h(0)h(t1 ) > 0, and that the identity (2.10) yields −(b2 )λ = −ϑx (1, λ, t)ϕλ (1, λ, t), at λ = αn− , t = 0, t1 . Then by (2.7), (2.17), ϑx (1, αn− , t) = −2Mn− /ϕλ (1, αn− , t) = µ¨ n (t)ϕλ (1, αn− , t)/2 > 0, t = 0, t1 . (2.37) Using (2.35-36) we obtain t ˙ C(zn (t), t) z(t) = − dt t1 Cz (zn (t), t) t h(0)h(t1 ) =τ [|p(t)|O(τ 2 + z2 (t)) + O(|τ | + |z(t)|)]dt + βn (h(0) + h(t1 )) t1 and since p ∈ L1 (0, 1), (2.37) implies t z(t) = τ µn (t1 )/2S + (1 + |p(t)|(zm (τ )2 + τ 2 )dtO(1), t1

(2.38)

zm (τ ) = max |z(t1 + s)|. 0<s<τ

Therefore, z(t) = O(τ ), and zm (τ ) = O(|τ |) and substituting this into (2.38) we have (2.29). Note that λn (t) = αn− + z(t)2 and z(·) has asymptotics (2.29); the function a 0 − b has the root αn− in the neighborhood of zero and the function a + b has the zero µn (t) in the neighborhood of zero. It is important that λn (t) and µn (t) have different asymptotics, see (2.29) and (2.17). Assume that p is a real analytic function. Hence by the implicit function theorem, there exists a unique real analytic function zn− (t) which solves the equation w(t, z) = 0 for all t ∈ (−ε, ε) for some small ε > 0. Then z(t) = rt m (1 + O(t)), for some r = 0, m 1. Remark. i) If µn (0) = αn− , in (2.29), then zn− (t) > 0 for t > 0 (we get the eigenvalue); if µn (0) = αn+ , then zn− (t) < 0 for t > 0 (we get the resonance). ii) Let p be a smooth potential and n 1. Then we deduce that zn− (t) > 0, as t > 0 (we get the eigenvalue) and zn+ (t) < 0 (we get the resonance). Hence for fixed n 1 and any small t > 0 we have an eigenvalue near αn− and a resonance near αn+ on the second sheet. iii) We obtain the same picture for a small potential and any n 1.


3. States In order to prove Theorems 1.1–2 we need the following results. Lemma 3.1. Let p ∈ L1 (0, 1). Then for a nondegenerate gap we have i) For each τn,m , n 1, m = 1, . . . , n the value λ = µn (τn,m ) = µn (0)∗ is a state. ii) A state (a discrete eigenvalue or a resonance) of T (t) depends continuously on t ∈ R. More precisely, if p is real analytic, then the state of T (t) is given by a real analytic function. iii) The value λ = µn (t1 ) for some t1 ∈ R is a state iff λ = µn (0)∗ is a state. iv) Let a state λn (t1 ) = µn (t1 ) = µn (0)∗ ∈ γn0 , for some t1 ∈ R, n 1 and λn (t1 ) = αn± . Then the following asymptotic estimates are fulfilled: ˙ 1 )S(µn (t1 ), t1 )(t − t1 ) + O((t − t1 )2 ), λn (t) = λn (t1 ) + µ(t

as t → t1 . (3.1)

Proof. i) The case $\mu_n(\tau_{n,m}) = \mu_n(0)^* = \alpha_n^\pm$ was proved in Lemma 2.2. We need the following results. Consider the operator $H_\pm(t)f = -f'' + p(x+t)f$ in $L^2(\mathbb{R}_\pm)$ with the Dirichlet boundary condition $f(0) = 0$. Zheludev [Z] proved: $\lambda$ is an eigenvalue of $H_\pm(t)$ iff $\lambda$ is a resonance of $H_\mp(t)$. Using this result we deduce that the eigenfunction $T(t)f = \lambda f$, $f(0) = 0$, $\lambda \in \gamma_n$, has the form
$$f = \begin{cases}\phi(x,\lambda,0) & \text{if } x < 0,\ \lambda = \mu_n(0)^* \in \gamma_n^+,\\ C\phi(x,\lambda,t) & \text{if } x > 0,\ \lambda = \mu_n(t) \in \gamma_n^+,\end{cases}\qquad C \ne 0. \tag{3.2}$$
Due to Lemma 2.2 it is enough to consider only the case of an eigenvalue, i.e., a state on the side $\gamma_n^+$. Then by (3.2), $\lambda = \mu_n(\tau_{n,m}) = \mu_n(0)^* \in \gamma_n^+$ is an eigenvalue.

ii) Using Lemma 2.2 we deduce that it is enough to consider only the case of an eigenvalue, i.e., the state on the side $\gamma_n^+$. The mapping $t \to q(\cdot,t)$ is continuous as a mapping from $\mathbb{R}$ to $L_{1,loc,unif}(\mathbb{R})$ equipped with the usual norm
$$\|f\|_{1,loc,unif} = \sup_s\int_s^{s+1}|f(x)|\,dx.$$
It is well known that the space $L_{1,loc,unif}(\mathbb{R})$ corresponds precisely to the Kato-class $K_1$ in $\mathbb{R}$ (cf., e.g., [CFKS]). In particular, if a sequence of potentials $p_n$ satisfies $\|p_n\|_{1,loc,unif} \to 0$, then we obtain inequalities in the sense of quadratic forms,
$$\int|p_n||u|^2 \le A_n\|u'\|^2 + B_n\|u\|^2,\qquad u, u' \in L^2(\mathbb{R}),$$
where the constants $A_n$, $B_n$ tend to zero as $n \to \infty$. Clearly, $t \to t_0$ implies that $\|p(\cdot+t) - p(\cdot+t_0)\|_{1,loc,unif} \to 0$. Let $\lambda_n(t_0)$ lie inside the gap. It follows by standard perturbation theory that the resolvent difference
$$\Bigl(-\frac{d^2}{dx^2} + q(\cdot,t_0) + i\Bigr)^{-1} - \Bigl(-\frac{d^2}{dx^2} + q(\cdot,t) + i\Bigr)^{-1}$$
converges to zero in norm, implying spectral convergence; in particular, discrete eigenvalues depend continuously on the parameter $t$ inside the gap and $\lambda_n(t)$ is piecewise


continuous on the parameter t. We have to prove the continuity of the state λn (t) at the end of gap, but before that we prove the next point iii). iii) First, we consider the state inside the gap. Assume λ = µn (t) ∈ γn+ . Then (3.2) implies λ = µn (0)∗ . Suppose now λ = µn (t) ∈ γn− . Hence by Lemma 2.2, λ∗ is an eigenvalue of the dislocation operator T0 (t) in the th gap, the corresponding eigenfunction f = ϕ(x, λ∗ , 0), x > 0, and f = Cϕ(x, λ∗ , t), x < 0, C = 0. Then (3.2) yields λ∗ = µn (0). Conversely, let λ = µn (0)∗ ∈ γn+ (the proof is the same). Then by (3.2), λ = µn (t). Now assume λ = µn (0)∗ ∈ γn− . Hence by Lemma 2.2, λn (t)∗ is the eigenvalue of the dislocation operator T0 (t) in the th gap. Then the corresponding eigenfunction f = ϕ(x, λ∗ , 0), x > 0, and f = Cϕ(x, λ∗ , t), x < 0, C = 0, and (3.2) implies λ = µn (t). Secondly, we now consider the state λn (t) in the the case λn (t1 ) = αn− ( the proof for λn (t1 ) = αn+ is the same). Let µn (t1 ) = λn (t1 ) = αn− and µn (0) = αn− for some t1 ∈ T. Then there exists λn (t) = µn (t) for t = t1 , t → t1 and we have 0 = w(λ, t) = m+ (λ, t) − m− (λ, 0) a(λ, t) − b(λ) a(λ, 0) + b(λ) − , = ϕ(1, λ, t) ϕ(1, λ, 0)

λ = λn (t),

(3.3)

and if we multiply (3.3) by (a(λ, t) + b(λ)) and use the identities (2.10), we obtain −ϑx (1, λ, t) − (a(λ, t) + b(λ))(a(λ, 0) + b(λ))/ϕ(1, λ, 0) = 0. Let t → t1 , then the left-hand side of the last identity goes to −ϑx (1, αn− , t1 ), but this is impossible since −ϑx (1, αn− , t1 ) = 0. Therefore, for t near t1 there is no a state in some neighborhood of αn− . The proof of the case µ(0) = λ(t1 ) = αn− and µ(t1 ) = αn− for some t1 ∈ T is the same. We now prove the continuity of the state λn (t) at the point αn± . Let tm → t0 and λm = λn (tm ) → αn− , as m → ∞. First, let µn (0) = αn− , µn (t0 ) = αn− . Then w(λm , tm ) = 0 and hence w(αn− , t0 ) = 0. Then by Lemma 2.2, there exists the “continuous” state λn (t), t ∈ [t0 − ε, t0 + ε] for some ε > 0. Secondly, let µn (0) = αn− , µn (t0 ) = αn− . Then by iii), µn (tm ) = λn (tm ) and we have w(λ, tm ) =

a(λ, tm ) − b(λ) a(λ, 0) + b(λ) − = 0, ϕ(1, λ, tm ) ϕ(1, λ, 0)

λ = λn (tm ),

(3.4)

and if we multiply (3.4) by (a(λ, tm ) + b(λ)) and use the identities (2.10), we obtain −ϑx (1, λ, tm ) − (a(λ, tm ) + b(λ))(a(λ, tm ) + b(λ))/ϕ(1, λ, 0) = 0.

(3.5)

Let tm → t0 , then the left-hand side of the last identity goes to −ϑx (1, αn− , t0 ), but this is impossible since −ϑx (1, αn− , t0 ) = 0. Therefore, for t near t1 there is no a state in some neighborhood of αn− . The proof of the case µn (0) = αn− and µ(t0 ) = αn− is the same. Therefore, λn (·) is a continuous function. Let p be analytic. Rewriting the potential q(x, t) in the form q(x, t) = p(x) + χ+ (p(x + t) − p(x)), we deduce that q(·, t) is an analytic function of t. Therefore, the operator T (t) is an analytic family, and it follows by standard perturbation theory that λ(t) is a real analytic function inside the gap. Then by Lemma 2.2, λ(t) is real analytic. iv) We prove (3.1) for even n, the proof for odd n is the same. Introduce the functions ξ± (λ, t) ≡ a(λ, t) ± b(λ), ξ±0 = ξ± (λ, 0), ϕ = ϕ(1, λ, t), ϕ 0 = ϕ(1, λ, 0). We have

F (λ, t) = ϕ 0 ξ− − ϕξ+0 , Fλ (λ, t1 ) = (ϕλ0 ξ− − ϕλ ξ+0 ) + ϕ 0 (ξ− )λ − ϕ(ξ+0 )λ ,


and using the identities ϕ = (λ − µ(t))φ(λ, t), µ = µn (t), µ0 = µn (0), and the parameters z = λ − µ0 , we deduce that Fλ (λ, t1 ) = −2b(µ∗0 )(h(0) + h(t1 )) + O(|z| + |τ |), τ ≡ t − t1 → 0, h(t) ≡ ϕ(1, µ0 , t).

(3.6)

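The substitution of (2.11) into $\dot F = \dot a\,\phi^0 - \dot\phi\,\xi_+^0$ yields
$$\dot F = -[\vartheta_x + (\lambda-p(t))\phi]\,\phi^0 - 2a\xi_+^0 = -2a\xi_+^0 - \phi^0\vartheta_x - (\lambda-p(t))(\lambda-\mu(t))(\lambda-\mu(0))\,\varphi\varphi^0,$$
where $\varphi^0 = \varphi(\lambda,0)$. Then
$$\dot F = -2a\xi_+^0 - z\varphi^0\vartheta_x - (\lambda-p(t))(z-O(\tau))\,z\,\varphi\varphi^0 = 4b(\mu_0^*)^2 + O(|z|+|\tau|) - z(z-O(|\tau|))\,p(t)\,\varphi\varphi^0.$$
Let $z(t) = \lambda(t)-\lambda(t_1)$. Using (3.6) and the last result, we obtain
$$\begin{aligned} z(t) &= -\int_{t_1}^t \frac{\dot F(\lambda(t),t)}{F_\lambda(\lambda(t),t)}\,dt = -\int_{t_1}^t \frac{4b(\mu_0^*)^2 + O(|z|+|\tau|) - z(z-O(|\tau|))\,p(t)\,\varphi\varphi^0}{-2b(\mu_0^*)(h(0)+h(t_1)) + O(|z|+|\tau|)}\,dt \\ &= \frac{2b(\mu_0^*)\,\tau}{h(0)+h(t_1)} + \int_{t_1}^t \bigl[O(|z|+|\tau|) - z(z-O(\tau))\,p(t)\,\varphi\varphi^0\bigr]dt. \end{aligned}$$
Assume $\tau > 0$, $z_m = \max|z(t_1+s)|$, $0 \le s \le \tau$. Then
$$z(t) = \frac{2b(\mu_0^*)\,\tau}{h(0)+h(t_1)} + O(z_m+\tau)\,\tau - z_m(z_m+\tau)\,\|p\|\,o(1). \tag{3.7}$$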

Hence z = O(τ ) and then zm = O(τ ). Therefore, z = O(τ ) and we get (2.6), since µ˙ n (t) = 2b(µn (t))/ϕλ (1, µn (t), t). The proof for τ < 0 is the same. We now prove the main Theorem 1.1 concerning the motion of the states on the slit gap. Proof of Theorem 1.1. Assume for definiteness µn (0) ∈ γn− and µ∗n (0) ∈ γ + and n even. The proof of the other cases, µn (0) ∈ γn+ or µn (0) = αn± , is the same (see Lemma 2.2). First, let t = 0. Then by Lemma 2.2, there exist two unique continuous functions ± λ± n (·) : (−ε, ε) → γn , for some ε > 0, such that λn (t) is an eigenvalue or a resonance of ± 0 T (t). Then in some neighbourhood of αn there exist the states λ± n (t) ∈ γn , t ∈ (−ε, ε), and by Lemma 3.1, other states are absent. Therefore, for t ∈ (−ε, ε) there exist only two states λ± n (t). If we increase t then we have not got new states. Indeed, by Lemma 3.1, the new states are not created inside the gap. Assume that the new state appears from the end of the gap at t = t0 . By Lemma 2.2, there exists the same state for t < t0 . We have a contradiction. Thus there exist only two states λ± n (t) for each t ∈ R and by Lemma 3.1, the function λ± n (·) ∈ C(R). ± i) We have only two states λ± n (0) = αn and recall that the eigenvalue µn (0) lies on the ± second sheet. Increasing t the points λn (t), µn (t) move on the gap γn0 , and µn (t) moves on the gap γn0 monotonically. By Lemma 3.1, the point λ± n (t) passes through the value


∗ µn (0)∗ simultaneously with µn (t), hence the point λ− n (t) lies on the arc (µn (t), µn (0) ) + ∗ and λn (t) belongs to the arc (µn (0) , µn (t)) for t ∈ (0, τn,1 ). By Lemma 3.1, the point 0 ± ∗ λ± n (t0 ) = µn (t0 ) ∈ γn , for some t0 ∈ T iff λn (t0 ) = µn (0) . ∗ Let t ↑ τn,1 and then µn (t) ↑ µn (0) . Moreover, since λ± n (·) ∈ C(R) the point − + (τ ) = µ (0)∗ . Hence, λ− (t ) = µ (τ ) = λn (t) ↑ µn (0)∗ and λ+ (t) → λ n n n,1 n n n,1 n n,1 0 µn (0)∗ . Increasing t the points λ− n (t), µn (t) move on the gap γn in the same direction, but by (3.1), (2.31), µn (t) moves faster at t = τn,1 ! Therefore, the point λ− n (t) lies on the arc ∗ ) for t ∈ (τ , τ ). Later (µn (0)∗ , µn (t)) and λ+ (t) belongs to the arc (µ (t), µ (0) n n n,1 n,2 n + on, the picture is repeated, but now with λ+ n (t). The point µn (t) overtakes λn (t) at the (τ ), and so on. Equations (1.5– “passing” point t = τn,2 when µ∗n (0) = µn (τn,2 ) = λ+ n n,2 6) were proved in Lemma 3.1, 2.2. Assume that n = 1. Since µ1 (t) runs through the gap γ1 making one complete ∓ revolution in unit time, see Lemma 2.1, we have λ± 1 (1) = α1 . Assume that n = 2. Then µ2 (t) runs through the “circular gap” γ2 making two complete revolutions in unit time , see Lemma 2.1, and µ2 (t) passes λ− 2 (t) at the point t = τ2,1 and µ2 (t) overtakes λ+ (t) at the point t = τ , see Lemma 3.1. Hence 2,2 2 ± ± λ2 (1) = α2 , and so on. ± For the eigenvalue λ± n (t) and the eigenfunction ψn (·, t) we have the identity ± (·, t), which yields H (t)ψn± (·, t) = λ± (t)ψ n n ± H (t)ψn± (·, t + 1) = H (t + 1)ψn± (·, t + 1) = λ± n (t + 1)ψn (·, t + 1), ± ± ∓ since H (t) = H (t + 1). Hence, λ± n (t + 1) = λn (t) or λn (t + 1) = λn (t). Then since ± ± ± ∓ λ2n (1) = α2n , λ2n+1 (1) = α2n+1 , we have (1.7). iii) By Lemma 3.1, if p is real analytic, then λ± n (t) is real analytic.

We now consider the ground state for a dislocation of the potential p. √ Proof of Theorem 1.2. Introduce the new variable z = −λ. For each t ∈ R the function w(−z2 , t) is real analytic in z. Let t = 0, then the function w(−z2 , 0) has the simple zero z = 0. By Lemma 2.2, there exists a state λ(t) = z2 (t) such that the function z = z(t) belongs to C(−ε, ε) for some ε > 0 and z(0) = 0. If t changes then λ0 (t) = z2 (t) moves on the gap γ0 . On the first sheet γ0+ the state λ0 (t) is an eigenvalue and on the second one λ0 (t) is a resonance. Therefore, we have exactly one state λ0 (t) for t ∈ (−ε, ε). For fixed t the function w(λ, t) is analytic in λ = 0 and has only simple zero. Then by the implicit function theorem for any t0 ∈ T there exists a function λ0 (·) ∈ C(t0 − ε, t0 + ε) for some ε > 0. Then λ0 (·) ∈ C(T, γ0 ). Assume that we have a new eigenvalue (or a resonance) at some t1 , which moves from the zero. But by Lemma 2.2, there exists a resonance (or an eigenvalue) at t < t1 , which moves from the zero. We get a contradiction and we have λ0 (t) only. i) If p is real analytic then w(−z2 , t) is real analytic. By the Implicit Function Theorem, λ0 (·) is real analytic on T. ii) In [K1] there is the following result. If w(α0+ , t) > 0 then in the basic gap there exists an eigenvalue. But w(α0+ , t) = ζ (t, α0+ ) − ζ (0, α0+ ). By i), we have only one eigenvalue and a resonance is absent. The proof for ζ (t, α0+ ) < ζ (0, α0+ ) is the same. Then in the basic gap an eigenvalue is absent and there exists a unique resonance. iii) This result was proved in [K1].


4. Asymptotics

Recall $r_n = 2f_n/\varphi_n$, $f_n = \sqrt{\Delta^2-1}/g_n$. The function $\phi(1,\lambda,t)$ is represented by its Taylor series in a neighborhood of the point $\mu_n(t)$ at fixed $t$ in the form
$$\phi(1,\lambda,t) = (-1)^n s_n\varphi_n(\lambda,t),\qquad \varphi_n(\lambda,t) \equiv (-1)^n\Bigl[\phi_\lambda(1,\mu_n(t),t) + \frac{s_n}{2}\,\phi_{\lambda\lambda}(1,\mu_n(t),t) + \dots\Bigr], \tag{4.1}$$
where $s_n = \lambda-\mu_n(t)$. We introduce the function $\zeta_n(\lambda,t) = \dot\varphi_n(\lambda,t)/\varphi_n(\lambda,t)$ and
$$p_{1sn}(t) = \int_0^1 p(x+t)(2x-1)\sin 2\pi n x\,dx,\qquad p_{1cn}(t) = \int_0^1 p(x+t)(2x-1)\cos 2\pi n x\,dx,\qquad n \ge 1.$$

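In order to prove Theorem 1.3 we need the following result.

Lemma 4.1. Let $p \in L^1(0,1)$. Then the following asymptotic estimates are fulfilled:
$$f_n(\lambda) = (2\pi n)^{-1} + O(n^{-3}),\qquad f_n'(\lambda) = O(n^{-3}), \tag{4.2}$$
$$\varphi_n(\lambda,t) = \frac{1}{2(\pi n)^2}\Bigl(1 - \frac{p_{1sn}(t)}{2\pi n} + O\Bigl(\frac{1}{n^2}\Bigr)\Bigr), \tag{4.3}$$
$$\zeta_n(\lambda,t) = p_{1cn}(t) + O(n^{-1}), \tag{4.4}$$
$$r_n(\lambda,t) = 2\pi n + p_{1sn}(t) + O(n^{-1}),\qquad r_n'(\lambda,t) = O(n^{-1}), \tag{4.5}$$
as $\lambda \in \gamma_n$, $n \to \infty$ uniformly on bounded subsets of $[0,1] \times L^1(0,1)$.

Proof. Recall $v = \operatorname{Im} k$. Define the functions $R_n(\lambda) = |(a_n^+ + \sqrt{\lambda})(a_n^- + \sqrt{\lambda})|^{1/2}$ and
$$J_n(\lambda) = 1 + \frac{1}{\pi}\int_{\mathbb{R}\setminus(a_n^-,a_n^+)}\frac{v(y^2)\,dy}{|y-\sqrt{\lambda}|\,|(a_n^+-y)(a_n^--y)|^{1/2}},\qquad \lambda \in \gamma_n,$$
where $\sqrt{\lambda} > 0$, $a_n^\pm \equiv \sqrt{\alpha_n^\pm} > 0$, $n \ge 1$. We need the identity from [KK1]
$$v(\lambda) = \frac{g_n(\lambda)}{R_n(\lambda)}\,J_n(\lambda),\qquad \text{where } \lambda \in \gamma_n^0,\ n \ge 1. \tag{4.6}$$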

The functions $J_n(\lambda)$, $R_n(\lambda)$ are analytic in $U_n$ and the following asymptotics are fulfilled:
$$J_n(\lambda) = 1 + O(n^{-2}),\qquad J_n'(\lambda) = O(n^{-3}), \tag{4.7}$$
$$R_n(\lambda) = 2\pi n\,(1+O(1/n^2)),\qquad R_n'(\lambda) = (2\pi n)^{-1}(1+O(1/n^2)), \tag{4.8}$$
$$v(\lambda) = O(1/n), \tag{4.9}$$
$$a_n^\mp = \pi n + O(1/n), \tag{4.10}$$
as $\lambda \in \gamma_n$, $n \to \infty$ uniformly on bounded subsets of $L^1(0,1)$ (see [KK1]). By (4.6-7), the function $\varphi = \sinh v(\lambda)/v(\lambda)$ is analytic in $U_n$ and (4.7-10) yield
$$\varphi(\lambda) = 1 + O(1/n^2),\qquad \varphi'(\lambda) = O(1/n^2),\qquad n \to \infty,\ \lambda \in \gamma_n. \tag{4.11}$$


The function $F_n(\lambda) = v(\lambda)/g_n(\lambda)$ is analytic in $U_n$ and (4.6), (4.10-11) imply
$$F_n(\lambda) = (2\pi n)^{-1}(1+O(1/n^2)),\qquad F_n'(\lambda) = O(1/n^3),\qquad n \to \infty,\ \lambda \in \gamma_n. \tag{4.12}$$
Therefore, the identity $f_n = \varphi F_n$ and (4.11-12) yield (4.2). Substituting the well known asymptotics of $\phi$ (see (2.2)) into (4.1) we have (4.3). Differentiating the asymptotics (4.1) with respect to $t$, using Lemma 2.1 and (2.3) we obtain
$$\dot\varphi_n(\lambda,t) = (-1)^n\dot\phi_\lambda(1,\mu_n(t),t) + O(n^{-3}),\qquad n \to \infty,\ \lambda \in \gamma_n. \tag{4.13}$$
Using (2.11) and the asymptotics of $\phi$, $\vartheta$, we have
$$\dot\phi(1,\lambda,t) = 2a(\lambda,t) = \phi_x(1,\lambda,t) - \vartheta(1,\lambda,t) = \frac{1}{\sqrt{\lambda}}\int_0^1 p(x+t)\sin\sqrt{\lambda}(2x-1)\,dx + O(1/\lambda).$$
Then differentiating this asymptotics we deduce that
$$\dot\varphi_n(\lambda,t) = (-1)^n\dot\phi_\lambda(1,\mu_n(t),t) + O(1/n^3) = (2\lambda)^{-1}p_{1cn}(t) + O(1/n^3),$$
and substituting this into $\zeta_n = \dot\varphi_n/\varphi_n$ we have (4.4). Equations (4.3–4) and $r_n = 2f_n/\varphi_n$ yield (4.5).

We find the asymptotics of the states as $n \to \infty$.

Proof of Theorem 1.3. The definitions of $b$, $\varphi_n$, $s_n$, $g_n$ imply
$$\frac{b(\lambda)}{\phi(1,\lambda,t)} = r_n(\lambda,t)\,\frac{g_n(\lambda)}{2s_n(\lambda,t)},\qquad \lambda \in \gamma_n^0.$$

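Using relations (2.1–4) and the formula $\dot\phi(1,\lambda,t) = \dot s_n(\lambda,t)\varphi_n(\lambda,t) + s_n(\lambda,t)\dot\varphi_n(\lambda,t)$ we obtain
$$w(\lambda,t) = m^+(\lambda,t) - m^-(\lambda,0) = \frac{-\dot\mu + s\zeta - rg}{2s} - \frac{-\dot\mu_0 + s_0\zeta_0 + r_0g}{2s_0},$$
where
$$\mu = \mu_n(t),\quad \mu_0 = \mu(0),\quad s = s_n(\lambda,t),\quad r = r_n(\lambda,t),\quad \zeta = \zeta_n(\lambda,t),\quad g = g_n,\quad s_0 = \lambda-\mu_n(0),\quad \zeta_0 = \zeta(\lambda,0),\quad r_0 = r(\lambda,0).$$
The identity $w(\lambda,t) = 0$ yields
$$(-\dot\mu + s\zeta - rg)\,s_0 = (-\dot\mu_0 + s_0\zeta_0 + r_0g)\,s. \tag{4.14}$$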

Define the functions $z = z_n(t)$, $z_0 = z_n(0)$, $u = u_n(t)$, and
$$y = \cos z - \cos u,\qquad y_0 = \cos z_0 - \cos u,\qquad x = \sin z + \sin u,\qquad x_0 = \sin z_0 - \sin u,$$


and substituting the identities
$$\dot\mu = \frac{|\gamma_n|}{2}(\sin z)\,\dot z,\qquad s_n(\lambda_n(t),t) = \frac{|\gamma_n|}{2}\bigl(\cos z(t) - \cos u(t)\bigr)$$
into (4.14) we obtain
$$[-\dot z\sin z + y\zeta - r\sin u]\,y_0 = [-\dot z_0\sin z_0 + y_0\zeta_0 + r_0\sin u]\,y,$$
and hence
$$-[\dot z\sin z + r\sin u]\,y_0 + [\dot z_0\sin z_0 - r_0\sin u]\,y = yy_0(\zeta_0-\zeta). \tag{4.15}$$
The definitions $z_1 \equiv z - 2\pi n t$, $r_1 \equiv r - 2\pi n$, $\beta \equiv x_0y - xy_0$ imply
$$2\pi n\beta = [\dot z_1\sin z + r_1\sin u]\,y_0 - [\dot z_{10}\sin z_0 - r_{10}\sin u]\,y + yy_0(\zeta_0-\zeta),$$
which yields
$$2\pi n\beta = r_1y_0x - r_{10}x_0y - (r_1-\dot z_1)\,y_0\sin z + (r_{10}-\dot z_{10})\,y\sin z_0 + yy_0(\zeta_0-\zeta). \tag{4.16}$$
Using the identities
$$r - \dot z = r_n(\lambda,t) - r_n(\mu_n(t),t) = r's = \frac{r'|\gamma_n|y}{2},\qquad r_{10} - \dot z_{10} = \frac{r_0'|\gamma_n|y_0}{2},$$
where $r' = (r_n)'_\lambda(\mu_n(t)+\tau s,t)$ for some $\tau$, $|\tau| \le 1$, we get
$$2\pi n\beta = r_1y_0x - r_{10}x_0y + yy_0P_n,\qquad P_n \equiv (\zeta-\zeta_0) - \frac{r_0'|\gamma_n|}{2}\sin z_0 + \frac{r'|\gamma_n|}{2}\sin z. \tag{4.17}$$

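Substituting the relations
$$y = -2\sin\frac{z+u}{2}\sin\frac{z-u}{2},\qquad y_0 = -2\sin\frac{z_0+u}{2}\sin\frac{z_0-u}{2},$$
$$x = 2\sin\frac{z+u}{2}\cos\frac{z-u}{2},\qquad x_0 = 2\cos\frac{z_0+u}{2}\sin\frac{z_0-u}{2},$$
and
$$\beta = 4\omega\sin\sigma,\qquad \sigma \equiv \frac{z-z_0-2u}{2},\qquad \omega \equiv \sin\frac{u-z_0}{2}\sin\frac{z+u}{2},\qquad \omega_1 \equiv \sin\frac{z_0+u}{2}\sin\frac{z-u}{2},$$
into (4.17) we obtain
$$\sin\sigma = G_n,\qquad G_n = \frac{1}{2\pi n}\Bigl(r_1\sin\frac{z_0+u}{2}\cos\frac{z-u}{2} - r_{10}\cos\frac{z_0+u}{2}\sin\frac{z-u}{2} + \omega_1P_n\Bigr). \tag{4.18}$$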
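The factorisation $\beta = x_0y - xy_0 = 4\omega\sin\sigma$ used to pass from (4.17) to (4.18) is a purely trigonometric identity. The following minimal sketch checks it at random real arguments; the function names are hypothetical and serve only this illustration.

```python
import math
import random


def beta_direct(z: float, z0: float, u: float) -> float:
    """beta = x0*y - x*y0 computed from the definitions of x, x0, y, y0."""
    y = math.cos(z) - math.cos(u)
    y0 = math.cos(z0) - math.cos(u)
    x = math.sin(z) + math.sin(u)
    x0 = math.sin(z0) - math.sin(u)
    return x0 * y - x * y0


def beta_factored(z: float, z0: float, u: float) -> float:
    """beta = 4*omega*sin(sigma) with omega and sigma as defined before (4.18)."""
    omega = math.sin((u - z0) / 2) * math.sin((z + u) / 2)
    sigma = (z - z0 - 2 * u) / 2
    return 4 * omega * math.sin(sigma)


rng = random.Random(0)  # fixed seed so the check is reproducible
for _ in range(1000):
    z, z0, u = (rng.uniform(-10.0, 10.0) for _ in range(3))
    assert abs(beta_direct(z, z0, u) - beta_factored(z, z0, u)) < 1e-12
```

Since $\omega$ factors out of every term on the right-hand side of (4.17), dividing by $8\pi n\,\omega$ is what produces the compact equation $\sin\sigma = G_n$.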


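We rewrite $G_n$ in the form
$$G_n = \frac{1}{4\pi n}\Bigl[r_1(-\sin\sigma + \sin\alpha) - r_{10}(\sin\sigma + \sin\alpha) + (\cos\sigma - \cos\alpha)P_n\Bigr],\qquad \alpha = \frac{z+z_0}{2}. \tag{4.19}$$
Introduce the functions
$$\eta_{cn} = \frac{1}{4\pi n}\bigl(p_{1cn}(t) - p_{1cn}(0)\bigr),\qquad \eta_{sn} = \frac{1}{4\pi n}\bigl(p_{1sn}(t) - p_{1sn}(0)\bigr),\qquad G_n^1 = \eta_{sn}\sin\alpha - \eta_{cn}\cos\alpha.$$
Using (4.4-5) we get
$$G_n = -\eta_n\sin\sigma - \eta_{cn}\cos\sigma + G_n^1 + O\Bigl(\frac{1}{n^2}\Bigr),\qquad \text{where } \eta_n = \frac{1}{4\pi n}\bigl(p_{1sn}(t) + p_{1sn}(0)\bigr).$$
Then
$$(1+\eta_n)\sin\sigma + \eta_{cn}\cos\sigma = G_n^1 + O(n^{-2}). \tag{4.20}$$
Define the small value $K_n$ by $\sin K_n = \eta_{cn}/\sqrt{(1+\eta_n)^2 + \eta_{cn}^2} = \eta_{cn} + O(n^{-2})$. We apply trigonometric formulas to (4.20) and we get $\sin(\sigma + K_n) = G_n^1 + O(n^{-2})$. This equation has two solutions
$$\sigma^- + K_n = G_n^1 + O(n^{-2}),\qquad \sigma^+ + K_n = \pi - G_n^1 + O(n^{-2}).$$
Hence $u_n^\pm = \tfrac12(z-z_0) - \eta_{cn} \pm G_n^1 + O(n^{-2})$, and using (2.15) we get (1.8).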
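The step from (4.20) to $\sin(\sigma+K_n)$ is the usual amplitude-phase rewriting of a linear combination of $\sin\sigma$ and $\cos\sigma$. The sketch below, with hypothetical function names and sample values of $\eta_n$, $\eta_{cn}$ chosen only for illustration, checks that rewriting and shows that $K_n$ differs from $\eta_{cn}$ only at second order in the small parameters. It assumes $1+\eta_n > 0$, which holds for large $n$.

```python
import math


def phase_and_amplitude(eta: float, eta_c: float):
    """Return (K, R) with R = sqrt((1+eta)^2 + eta_c^2) and sin K = eta_c / R."""
    R = math.hypot(1 + eta, eta_c)
    K = math.asin(eta_c / R)  # valid as a phase because 1 + eta > 0 is assumed
    return K, R


def combination(sigma: float, eta: float, eta_c: float) -> float:
    """Left-hand side of (4.20): (1 + eta) sin(sigma) + eta_c cos(sigma)."""
    return (1 + eta) * math.sin(sigma) + eta_c * math.cos(sigma)


def shifted_sine(sigma: float, eta: float, eta_c: float) -> float:
    """Amplitude-phase form R sin(sigma + K) of the same expression."""
    K, R = phase_and_amplitude(eta, eta_c)
    return R * math.sin(sigma + K)


eta, eta_c = 0.01, 0.02  # sample small values, playing the role of O(1/n) terms
for sigma in (-2.0, -0.4, 0.0, 0.9, 3.0):
    assert abs(combination(sigma, eta, eta_c) - shifted_sine(sigma, eta, eta_c)) < 1e-12

K, R = phase_and_amplitude(eta, eta_c)
assert abs(K - eta_c) < 1e-3  # K = eta_c up to second-order terms
```

Dividing (4.20) by the amplitude $R = 1 + O(1/n)$ changes the right-hand side $G_n^1 = O(1/n)$ only by $O(n^{-2})$, which is why the error term is preserved.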

Acknowledgements. The various parts of this paper were written at Humboldt Univ., Berlin (March, April 1998) and at ESI, Vienna (May, June 1998). The author is grateful to J. Brüning (Humboldt Univ.), and to T. Hoffmann-Ostenhof (ESI) for their hospitality. The author would like also to thank R. Hempel for useful discussions.

References

[A1] Anoshchenko, O.: The inverse problem of scattering for Schrödinger operator with potential having periodic asymptotics. Theory functions, functional analysis and applications, Kharkov Univ. 47, 59–67 (1987) (in Russian)
[A2] Anoshchenko, O.: Expansion in the eigenfunctions of the Schrödinger equation with a potential having a periodic asymptotic behavior. J. Sov. Math. 49, 1237–1241 (1990)
[BS] Bikbaev, R., Sharipov, R.: Asymptotics as t → ∞ of the Cauchy problem for the Korteweg–de Vries equation in the class of potentials with finite-gap behavior at x → ±∞. Theoret. and Math. Phys. 78, 244–252 (1989)
[CFKS] Cycon, H., Froese, R., Kirsch, W., Simon, B.: Schrödinger Operators with Applications to Quantum Mechanics and Global Geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[DS] Davies, E., Simon, B.: Scattering theory for systems with different spatial asymptotics on the left and right. Commun. Math. Phys. 63, 277–301 (1978)
[D] Dubrovin, B.: Periodic problems for the Korteweg–de Vries equation in the class of finite-gap potentials. Funct. Anal. Appl. 9, 215–223 (1975)
[F] Firsova, N.: Riemann surface of quasimomentum and scattering theory for the perturbed Hill operator. J. Soviet Math. 11, 487–497 (1979)
[GNP] Gesztesy, F., Nowell, R., Pötz, W.: One-dimensional scattering theory for quantum systems with nontrivial spatial asymptotics. Diff. Integral Eq. 10, 521–546 (1997)
[IM] Its, A., Matveev, V.: Schrödinger operators with the finite-gap spectrum and the N-soliton solutions of the Korteweg–de Vries equation. Teoret. Math. Phys. 23, 51–68 (1975) (Russian)
[KK1] Kargaev, P., Korotyaev, E.: Effective masses and conformal mappings. Commun. Math. Phys. 169, 597–625 (1995)
[KK2] Kargaev, P., Korotyaev, E.: The inverse problem for the Hill operator, the direct approach. Invent. Math. 127, 567–593 (1997)
[Ka] Karpeshina, Y.: Perturbation Theory for the Schrödinger Operator with Periodic Potential. Lecture Notes in Math. 1663, Berlin–Heidelberg–New York: Springer, 1997
[K1] Korotyaev, E.: Schrödinger operators with a junction of two 1-dimensional periodic potentials. Preprint FIM ETH, Zurich, 1997
[K2] Korotyaev, E.: Inverse Problem and the trace formula for the Hill Operator II. Math. Z. 231, 345–368 (1999)
[L] Levitan, B. M.: Inverse Sturm–Liouville Problems. Utrecht: VNU Science Press, 1987
[MO] Marchenko, V., Ostrovski, I.: A characterization of the spectrum of the Hill operator. Math. USSR Sbornik 26, 493–554 (1975)
[PT] Pöschel, J., Trubowitz, E.: Inverse Spectral Theory. Boston: Academic Press, 1987
[Ta] Tamm, I.: Phys. Z. Sowjet. 1, 733 (1932)
[T] Titchmarsh, E.: Eigenfunction Expansions Associated with Second-order Differential Equations 2. Oxford: Clarendon Press, 1958
[Tr] Trubowitz, E.: The inverse problem for periodic potentials. Commun. Pure Appl. Math. 30, 321–337 (1977)
[Z] Zheludev, V.: On the spectrum of Schrödinger operator with periodic potentials on the halfline. Trudy Kaf. Mat. Anal. Kaliningrad Univ. 1969

Communicated by B. Simon

Commun. Math. Phys. 213, 491 – 521 (2000)


Quantum Geometry of Algebra Factorisations and Coalgebra Bundles

Tomasz Brzeziński 1,2,*, Shahn Majid 3,**

1 Department of Mathematics, University of Wales Swansea, Singleton Park, Swansea SA2 8PP, UK
2 Department of Theoretical Physics, University of Łódź, Pomorska 149/153, 90–236 Łódź, Poland
3 School of Mathematical Sciences, Queen Mary and Westfield College, Mile End Road, London E1 4NS, UK

* EPSRC Advanced Research Fellow at the University of Wales Swansea.
** Reader and Royal Society University Research Fellow at QMW and Senior Research Fellow at the Department of Applied Mathematics and Theoretical Physics, University of Cambridge, UK.

Received: 25 September 1998 / Accepted: 23 February 2000

Abstract: We develop the noncommutative geometry (bundles, connections etc.) associated to algebras that factorise into two subalgebras. An example is the factorisation of matrices $M_2(\mathbb{C}) = \mathbb{C}\mathbb{Z}_2 \cdot \mathbb{C}\mathbb{Z}_2$. We also further extend the coalgebra version of the theory introduced previously, to include frame resolutions and corresponding covariant derivatives and torsions. As an example, we construct q-monopoles on all the Podleś quantum spheres $S^2_{q,s}$.

1. Introduction

In [8] it was shown that one can generalise the notion of principal bundles in noncommutative geometry [7] to a very general setting in which the role of “coordinate functions” on the base is played by a general (possibly noncommutative) algebra and the role of the “structure group” (fibre) of the principal bundle is a coalgebra. In particular, it need not be a quantum group, which would be too restrictive for many interesting examples. In [5] the theory of modules or “associated bundles” is extended to this case along the lines of the quantum group case in [7]. We apply this now to extend the recently introduced notion of a frame resolution [23], thereby bringing the coalgebra version of the gauge theory in line with the more restrictive quantum group gauge theory case.

The paper begins, however, in Sect. 2 with a useful reformulation of coalgebra bundles entirely in terms of algebras. This is a theory where the role of “gauge group” or fibre in the principal bundle is played by any algebra A subject to a certain nondegeneracy “Galois” condition for its action on the algebra P of the total space of the bundle. The algebra A plays the role of a classical (or quantum) enveloping algebra of a Lie algebra in usual (or quantum group) gauge theory, but now without any kind of Hopf algebra structure. Without the latter one cannot make general tensor products of representations

so that it is indeed remarkable that the formulation of geometric notions is possible. This is what we outline, namely a gauge theory that has connections, principal bundles, associated bundles, etc. using only algebras and in particular not requiring anything from the theory of quantum groups. As such, the material in Sect. 2 should be rather more widely accessible than the coalgebra bundle version of the theory. In particular, it can be viewed as a critical first step towards a C ∗ -algebra or von Neumann algebra treatment. While beyond our scope to actually consider operator theory and topological completions here (we work algebraically), it offers the possibility to link up with and extend other approaches to noncommutative geometry based on C ∗ -algebras, etc. We recall that in the C ∗ -algebra approach to noncommutative geometry, see [11], one traditionally works directly with vector bundles (as projective modules) and not principal bundles – one would expect that the latter would require some kind of group-like object such as a Hopf algebra, but we see that this need not be the case. Also, although we do not develop a precise connection with the theory of subfactors at the present time, we note that our final data in terms of algebras is not unlike a subfactor inclusion. In that context one considers inclusions of von Neumann algebras with the larger one being viewed as some kind of “cross product” of the smaller one by some kind of “paragroup” [27]. Similarly we show that if A is an algebra acting on another algebra P subject to a certain nondegeneracy condition then one can form a generalised “cross product” (which we call the “Galois product”) of P by A. In the subfactor case it is known that a special case corresponds to some kind of (weak) Hopf algebra [1], while similarly a special case of an algebra bundle corresponds to A a Hopf algebra. 
The development of such an analogy on the one hand could provide a gauge theory of subfactors (as well as a coalgebraic version of some aspects of their theory) and on the other hand suggest the existence of a whole “Jones tower” of bundles. It would also connect with gauge theory from the point of view of algebraic quantum field theory as in [14, 15] and many other works. Linking up with $C^*$-algebra or operator algebra results is a long-term motivation for the section.

From a mathematical point of view, however, it should be stressed that our present results are strictly equivalent to a subset of the coalgebra bundle case. Part of the reformulation was already hinted at in [8], where part of the data was expressed as an algebra factorising into subalgebras $A$, $P$. The crucial exactness or “Galois” condition in this form is what we provide now. It turns out to involve traces over the vector space of $A$, which essentially forces us to finite-dimensional $A$. From this it is clear that the theory can be developed in two ways to cover the infinite-dimensional case: either one needs to introduce operator completions, which is the $C^*$-algebra or von Neumann algebra direction mentioned above, or one needs to replace $A$ by its dual, a coalgebra, which then works for infinite-dimensional coalgebras. The latter is the approach taken in [8] and in the remaining sections of the present paper.

In Sects. 3 and 4 we continue with new results in the coalgebraic setting. We provide the necessary formulation of associated bundles by exploiting the recent work [5]. A small generalisation of coalgebra bundles has been made in [6] and we will in fact use this formulation. Also, the notion of a connection which we use here requires less structure than the one introduced in [8]. In Sect. 5 we study frame resolutions at this level. Finally, in Sect. 6, we show that the coalgebra theory allows one to include the crucial example of the monopole on the full 2-parameter family of Podleś quantum spheres [28].
Recall that Podleś classified all reasonable “quantum spheres” covariant under the quantum group $SU_q(2)$, and until now the q-monopole has been constructed [7] only for a diagonal subfamily (the so-called standard quantum spheres). The general case requires the more general coalgebra bundle theory. The bundle itself for all the quantum 2-spheres is in [4] and we now provide the required connection on it. Similarly it is clear from their construction that all of the q-deformed symmetric spaces in the classification of [26] should be constructable as coalgebra bundles, which includes the coalgebra bundle from which one would expect to project out a q-instanton. This is a second direction for further work.

We work algebraically over a general field $k$. We use the usual notation $\Delta c = c_{(1)}\otimes c_{(2)}$ for the coproduct of a coalgebra $C$ (summation understood). We also write $C^+ = \ker\varepsilon$, where $\varepsilon$ is the counit. We write ${}_V\Delta(v) = v_{(1)}\otimes v_{(\infty)}$ for a left coaction on a vector space $V$, and $\Delta_V(v) = v_{(0)}\otimes v_{(1)}$ for a right coaction. We also denote by ${\rm Hom}_A(V,W)$ the linear maps commuting with a right action of an algebra $A$ and by ${}_A{\rm Hom}(V,W)$ those commuting with a left action. Similarly, we write ${\rm Hom}^C(V,W)$ for maps commuting with a right coaction of a coalgebra $C$ and ${}^C{\rm Hom}(V,W)$ for a left coaction. In general when we need to refer to the components of other elements $\chi^\#$, $\Psi(a\otimes u)$, etc. of tensor product spaces we will use the upper bracket notation $\chi^\# = \chi^{\#(1)}\otimes\chi^{\#(2)}$, etc., again with summation understood.

Also, we recall that for any algebra $P$, the universal 1-forms on $P$ are $\Omega^1P = \ker\cdot_P\subseteq P\otimes P$. The exterior derivative $d : P\to\Omega^1P$ is $du = 1\otimes u - u\otimes 1$ for all $u\in P$. This extends to higher forms (see [18] e.g.) $\Omega^nP\subseteq P^{\otimes (n+1)}$ characterised by the requirement that the products of all adjacent factors vanish, and $d : \Omega^{n-1}P\to\Omega^nP$,

$$d(u_1\otimes\cdots\otimes u_n) = \sum_{k=1}^{n+1}(-1)^{k-1}\,u_1\otimes\cdots\otimes u_{k-1}\otimes 1\otimes u_k\otimes\cdots\otimes u_n. \tag{1}$$

With these definitions $\Omega P = \bigoplus_{n=0}^{\infty}\Omega^nP$ is a graded differential algebra with product given by juxtaposition and multiplication in $P$.

2. Galois Actions and Algebra Factorisations

Although we will continue mainly in an algebra-coalgebra setting in later sections, we start with a more accessible version which depends only on algebras and which should be useful for the operator-algebraic version. We consider unital algebras and unital algebra maps. An algebra factorisation means an algebra $X$ and subalgebras $P$, $A$ such that the linear map $P\otimes A\to X$ given by the product in $X$ is an isomorphism.

Proposition 2.1 (cf. [32, 20, 10]). Algebra factorisations are in 1-1 correspondence with algebras $P$, $A$ and $\Psi : A\otimes P\to P\otimes A$ such that

$$\Psi(\cdot_A\otimes{\rm id}) = ({\rm id}\otimes\cdot_A)\Psi_{12}\Psi_{23},\qquad \Psi(1\otimes u) = u\otimes 1,$$
$$\Psi({\rm id}\otimes\cdot_P) = (\cdot_P\otimes{\rm id})\Psi_{23}\Psi_{12},\qquad \Psi(a\otimes 1) = 1\otimes a,\qquad \forall u\in P,\ \forall a\in A.$$

In this case, given a character $e : A\to k$, there is a left action

$$\triangleright : A\otimes P\to P,\qquad a\triangleright u = ({\rm id}\otimes e)\Psi(a\otimes u),\qquad \forall a\in A,\ u\in P.$$

The subspace $M = P_e = \{m\in P\ |\ a\triangleright m = e(a)m\ \ \forall a\in A\}$ forms a subalgebra.
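The conditions of Proposition 2.1 can be checked by hand on a small example. The following is a minimal sketch, not part of the original text, assuming $A = \mathbb{C}[h]/(h^N=1)$, $P = \mathbb{C}[g]/(g^N=1)$ and the map $\Psi(h^m\otimes g^k) = q^{mk}g^k\otimes h^m$ that appears in Example 2.7 below. The value $N = 3$ and all function names are illustrative assumptions. Since this $\Psi$ sends basis elements to multiples of basis elements, each tensor is stored as a coefficient together with exponents.

```python
import cmath
from itertools import product

N = 3                                   # assumed order of the cyclic groups
Q = cmath.exp(2j * cmath.pi / N)        # primitive N-th root of unity
TOL = 1e-9

def psi(m, k):
    """Psi(h^m (x) g^k) = Q^(m k) g^k (x) h^m, stored as (coefficient, k, m)."""
    return Q ** (m * k), k % N, m % N

def same(x, y):
    """Compare two (coefficient, exponent, exponent) triples numerically."""
    return abs(x[0] - y[0]) < TOL and x[1:] == y[1:]

def satisfies_factorisation_conditions():
    for m, m2, k in product(range(N), repeat=3):
        # Psi(.A (x) id) versus (id (x) .A) Psi_12 Psi_23 on h^m (x) h^m2 (x) g^k
        c23, k1, b = psi(m2, k)
        c12, k2, a = psi(m, k1)
        if not same(psi((m + m2) % N, k), (c23 * c12, k2, (a + b) % N)):
            return False
    for m, k, k2 in product(range(N), repeat=3):
        # Psi(id (x) .P) versus (.P (x) id) Psi_23 Psi_12 on h^m (x) g^k (x) g^k2
        c12, u, a = psi(m, k)
        c23, v, a2 = psi(a, k2)
        if not same(psi(m, (k + k2) % N), (c12 * c23, (u + v) % N, a2)):
            return False
    # unit conditions Psi(1 (x) u) = u (x) 1 and Psi(a (x) 1) = 1 (x) a
    units = all(same(psi(0, k), (1, k, 0)) for k in range(N))
    return units and all(same(psi(m, 0), (1, 0, m)) for m in range(N))

def invariant_exponents():
    """Exponents k with h^m |> g^k = e(h^m) g^k for all m, for the character e(h) = Q."""
    # a |> u = (id (x) e) Psi(a (x) u), so h^m |> g^k = Q^(m k) Q^m g^k
    return [k for k in range(N)
            if all(abs(Q ** (m * (k + 1)) - Q ** m) < TOL for m in range(N))]
```

Under these assumptions `satisfies_factorisation_conditions()` confirms the four conditions on basis elements, and `invariant_exponents()` lists the powers of $g$ spanning the subalgebra $M$ for the character $e(h) = q$.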


Proof. Details of the stated equivalence are in [20, pp. 299–300]. Given $\Psi$ we define $X = P\otimes A$ with product $(u\otimes a)(v\otimes b) = u\Psi(a\otimes v)b$ for $u,v\in P$ and $a,b\in A$. Given $X$ we define $\Psi$ by $au = \cdot_X\Psi(a\otimes u)$. The action $\triangleright$ is also part of the proof in [20] (where $e = \varepsilon$, the counit of a bialgebra). There is a similar right action of $P$ on $A$ when $P$ is equipped with a character, which we do not use. From the point of view of $X$, $e$ on $A$ extends to a left $P$-module map $e : X\to P$ obeying $e(au) = a\triangleright u$ for all $a\in A$ and $u\in P$. Hence $M = \{u\in P\ |\ e(au) = ue(a)\ \ \forall a\in A\}$, from which it is clear that $M$ is a subalgebra. One may also see this from the equations for $\Psi$. □

Such factorisations are quite common. For example, they come up naturally as part of Hopf algebra factorisations [21, 20]. Another example is the braided tensor product $A\underline{\otimes}B$ of two algebras, see [20]. In our geometrical picture, $P$ plays the role of the algebra of functions of the “total space” of a principal bundle, and $A$ plays the role of the group algebra of the structure group. The subalgebra $M$ plays the role of the functions on the base. The algebra $X$ is not usually considered but plays the role of the cross product $C^*$-algebra of the functions on the total space by the action of the structure group.

Proposition 2.2. In the setting above, the map $\tilde\chi : A\otimes P\otimes P\to P$ defined by $\tilde\chi(a\otimes u\otimes v) = (a\triangleright u)v$ descends to $\chi : A\otimes P\otimes_M P\to P$. We say that the factorisation is Galois if there exists $\chi^\# : P\to P\otimes_M P\otimes A$ such that

$${\rm Tr}_A(\chi^\#\circ\chi) = {\rm id}_{P\otimes_M P},\qquad (\chi\otimes{\rm id}_A)({\rm id}_A\otimes\chi^\#) = \tau : A\otimes P\to P\otimes A,$$

where $\tau$ is the usual flip or transposition map. We call $P(M,A,\Psi,e)$ a copointed algebra bundle.

Proof. We have $a\triangleright(um) = e(aum) = e(u_ia^im) = u_ie(a^im) = u_ime(a^i) = (a\triangleright u)m$ for all $a\in A$, $u\in P$ and $m\in M$, as required. Here $\Psi(a\otimes u) = u_i\otimes a^i$ is a notation (sum over $i$). The rest is a definition. This can also be obtained from the equations. □

This is the analogue of the Galois condition in [8], which in turn is motivated from the theory of quantum principal bundles and, independently, the theory of Hopf-Galois extensions in the Hopf algebra case. In geometrical terms the map $\chi$ plays the role of the action of the Lie algebra $g$ of the structure group of a principal bundle on its algebra of functions: if $\xi\in g$ one has a vertical vector field $\tilde\xi$ given by differentiating the action corresponding to $\triangleright$. The element $\chi^\# = \chi^\#(1)\in P\otimes_M P\otimes A$ is particularly important and plays the role of the “translation map” of the principal bundle.

Notice, however, that a factorisation can be Galois only if $A$ is finite-dimensional. This should not unduly worry us since our formulation is mainly intended as a precursor to an operator-theoretic treatment where infinite-dimensional $A$ would be allowed subject to topological completions and trace class conditions. To avoid all that in the infinite-dimensional case one should of course use the coalgebra formulation as in later sections. Meanwhile, let us note that even finite-dimensional $A$ are not uninteresting: the algebra $P$ and the factorising algebra can in principle both be infinite-dimensional. A similar situation pertains with the theory of subfactors where the two von Neumann algebras are typically infinite-dimensional but the case where their “ratio” is in some sense finite is still very interesting.

There is an obvious notion of a $\Psi$-module associated to an algebra factorisation, namely a left $P$-module and $A$-module $V$ such that

$$a\triangleright(u\triangleright v) = \triangleright\circ(\Psi(a\otimes u)\triangleright v),\qquad \forall a\in A,\ u\in P,\ v\in V. \tag{2}$$


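Explicitly we require $a\triangleright(u\triangleright v) = u_i\triangleright(a^i\triangleright v)$, where $\Psi(a\otimes u) = u_i\otimes a^i$. This is what corresponds to a left $X$-module. This point of view suggests a natural slight generalisation of the above, replacing $e$ by the requirement of a map $\tilde e : A\to P$.

Proposition 2.3. Let $(P,A,\Psi)$ be an algebra factorisation datum as in Proposition 2.1. Linear maps $\tilde e : A\to P$ such that

$$\tilde e(ab) = \Psi(a\otimes\tilde e(b))^{(1)}\,\tilde e\big(\Psi(a\otimes\tilde e(b))^{(2)}\big),\qquad \tilde e(1) = 1,\qquad \forall a,b\in A,$$

are in 1-1 correspondence with extensions of the left regular action of $P$ on itself to a $\Psi$-module structure on $P$. Given $\tilde e$, we define

$$a\triangleright u = \Psi(a\otimes u)^{(1)}\,\tilde e\big(\Psi(a\otimes u)^{(2)}\big),\qquad \forall a\in A,\ u\in P,$$

and conversely, given such an extension, we set $\tilde e(a) = a\triangleright 1$. In this situation the space $M = M_{\tilde e} = \{m\in P\ |\ a\triangleright m = \tilde e(a)m,\ \forall a\in A\}$ is a subalgebra of $P$ and $\tilde\chi$ as in Proposition 2.2 descends to a map $\chi$.

Proof. We define the linear map $\triangleright : A\otimes P\to P$ as stated and verify first Eq. (2) as

$$\Psi(a\otimes uv)^{(1)}\,\tilde e\big(\Psi(a\otimes uv)^{(2)}\big) = u_i\,\Psi(a^i\otimes v)^{(1)}\,\tilde e\big(\Psi(a^i\otimes v)^{(2)}\big) = \Psi(a\otimes u)^{(1)}\big(\Psi(a\otimes u)^{(2)}\triangleright v\big),$$

where we used the second of the factorisation properties in Proposition 2.1. We also used the shorthand $\Psi(a\otimes u) = u_i\otimes a^i$ as before. Next, we check that $\triangleright$ is indeed an action. Writing $\Psi(b\otimes u) = u_i\otimes b^i$,

$$(ab)\triangleright u = \Psi(ab\otimes u)^{(1)}\,\tilde e\big(\Psi(ab\otimes u)^{(2)}\big) = \Psi(a\otimes u_i)^{(1)}\,\tilde e\big(\Psi(a\otimes u_i)^{(2)}b^i\big)$$
$$= \Psi(a\otimes u_i)^{(1)}\big(\Psi(a\otimes u_i)^{(2)}\triangleright\tilde e(b^i)\big) = a\triangleright(u_i\tilde e(b^i)) = a\triangleright(b\triangleright u)$$

as required. We used the first of the factorisation properties of $\Psi$ and the assumed condition on $\tilde e$, which can be written as $\tilde e(ab) = a\triangleright\tilde e(b)$ in terms of $\triangleright$. We then used (2), already proven. Finally, $1\triangleright u = u\tilde e(1) = u$, so $\triangleright$ is indeed an action. Conversely, given an action $\triangleright$ making $P$ a $\Psi$-module we define $\tilde e(a) = a\triangleright 1$. Then $\tilde e(ab) = a\triangleright(b\triangleright 1) = a\triangleright\tilde e(b)$ and $a\triangleright u = a\triangleright(u\cdot 1) = u_i(a^i\triangleright 1) = u_i\tilde e(a^i)$ (using (2)), as required. The remaining facts follow easily from the definition of $M$. □

The subalgebra $M$ can also be characterised equivalently as

$$M = \{m\in P\ |\ a\triangleright(um) = (a\triangleright u)m,\ \forall u\in P,\ a\in A\} \tag{3}$$

in view of (2) and the definition of $\triangleright$.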

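We note that

Lemma 2.4. In the setting of Proposition 2.3, for any $\Psi$-module $V$ there is a natural notion of “invariant” subspace

$$V_0 = \{v\in V\ |\ a\triangleright v = \tilde e(a)\triangleright v,\ \forall a\in A\}\subseteq V \tag{4}$$

which is a left $M$-module by restriction of the action of $P$.

Proof. For all $a\in A$, $m\in M$ and $v\in V_0$, we have $a\triangleright(m\triangleright v) = \triangleright\circ(\Psi(a\otimes m)\triangleright v) = m_i\triangleright(\tilde e(a^i)\triangleright v) = (m_i\tilde e(a^i))\triangleright v = (a\triangleright m)\triangleright v = (\tilde e(a)m)\triangleright v = \tilde e(a)\triangleright(m\triangleright v)$, so $m\triangleright v\in V_0$ as well. □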


The subalgebra $M$ itself is a case of such an invariant subspace. When there is a corresponding $\chi^\#$, we call $P(M,A,\Psi,\tilde e)$ an algebra bundle. The copointed case is $\tilde e(a) = e(a)1$. The construction has a natural converse.

Lemma 2.5. In an algebra bundle, $\chi^\# = \chi^\#(1)$ obeys
(a) $\chi^\#a = a\triangleright\chi^\#$ for all $a\in A$, where the action on $P\otimes_M P\otimes A$ is on its first factor.
(b) $\chi^\#(uv) = \chi^\#(u)^{(1)}v\otimes\chi^\#(u)^{(2)}$ for all $u,v\in P$.
(c) $\chi^{\#(1)}(\chi^{\#(2)}\triangleright u) = u\otimes_M 1$ for all $u\in P$.
Here $\chi^{\#(1)}\in P\otimes_M P$ and $\chi^{\#(2)}\in A$.

Proof. From its definition, it is evident that

$$\chi(ab\otimes u\otimes v) = (ab\triangleright u)v = \chi(a\otimes b\triangleright u\otimes v),\qquad \chi(a\otimes u\otimes vw) = \chi(a\otimes u\otimes v)w$$

for all $u,v,w\in P$ and $a,b\in A$. Parts (a) and (b) are just the corresponding properties for $\chi^\#$. Thus,

$$a\triangleright\chi^\# = ({\rm Tr}_A\,\chi^\#\circ\chi\otimes{\rm id})(a\triangleright\chi^\#) = ({\rm id}\otimes f^a)\chi^\#\big(\chi(e_a\otimes a\triangleright\chi^{\#(1)})\big)\otimes\chi^{\#(2)}$$
$$= ({\rm id}\otimes f^a)\chi^\#\circ\chi(e_aa\otimes\chi^{\#(1)})\otimes\chi^{\#(2)} = ({\rm id}\otimes f^a)\chi^\#(1)\otimes e_aa = \chi^\#a,$$

where $\{e_a\}$ is a basis of $A$ and $\{f^a\}$ a dual basis. Similarly,

$$\chi^\#(u)^{(1)}v\otimes\chi^\#(u)^{(2)} = \big({\rm Tr}_A\,\chi^\#\circ\chi(\chi^\#(u)^{(1)}v)\big)\otimes\chi^\#(u)^{(2)} = ({\rm id}\otimes f^a)\chi^\#\big(\chi(e_a\otimes\chi^\#(u)^{(1)})v\big)\otimes\chi^\#(u)^{(2)} = \chi^\#(uv).$$

We then deduce part (c) from part (b) as

$$\chi^\#(1)^{(1)}\big(\chi^\#(1)^{(2)}\triangleright u\big) = \chi^\#(1)^{(1)}(e_a\triangleright u)\,\langle f^a,\chi^\#(1)^{(2)}\rangle = ({\rm id}\otimes f^a)\chi^\#(1\cdot(e_a\triangleright u)) = ({\rm id}\otimes f^a)\chi^\#\circ\chi(e_a\otimes u) = u\otimes_M 1.\ \square$$

These correspond to important properties of the translation map in differential geometry derived in the Hopf algebraic setting in [3, 30].

Theorem 2.6. Let $P$, $A$ be algebras and $P$ a left $A$-module under an action $\triangleright$. We define $M$ by (3) and say that the action is Galois if $\chi$ defined as in Proposition 2.2 has a corresponding $\chi^\#$. In this case there exists a unique algebra factorisation $X = P\otimes A$ such that $P$, $A$ form an algebra bundle and $P$ is a $\Psi$-module (cf. Eq. (2)) via the product in $P$ and the action $\triangleright$. Explicitly,

$$\Psi(a\otimes u) = \chi(a\otimes u\chi^{\#(1)})\otimes\chi^{\#(2)},\qquad \tilde e(a) = a\triangleright 1,\qquad \forall a\in A,\ u\in P.$$

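We call the corresponding algebra factorisation $X = P\otimes A$ the Galois product associated to a Galois action of an algebra $A$ on an algebra $P$.

Proof. Here we define $M$ and $\tilde\chi$ directly from the action $\triangleright$; it is easy to see that $M$ is a subalgebra and that $\tilde\chi$ descends to a map $\chi$. We assume the existence of a corresponding $\chi^\#$ obeying the conditions in Proposition 2.2. For the purposes of this proof, we now write $\chi^\# = \chi^{\#(1)}\otimes_M\chi^{\#(2)}\otimes\chi^{\#(3)}$ (a more explicit notation than the one before) and we let $\chi'^{\#}$ be a second copy of $\chi^\#$. Then the map $\Psi$ explicitly reads

$$\Psi(a\otimes u) = (a\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}\otimes\chi^{\#(3)}$$

and we have

$$({\rm id}\otimes\cdot_A)\Psi_{12}\Psi_{23}(a\otimes b\otimes u) = ({\rm id}\otimes\cdot_A)\Psi_{12}\big(a\otimes(b\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}\big)\otimes\chi^{\#(3)}$$
$$= \Big(a\triangleright\big((b\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}\chi'^{\#(1)}\big)\Big)\chi'^{\#(2)}\otimes\chi'^{\#(3)}\chi^{\#(3)}$$
$$= \Big(a\triangleright\big((b\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}(\chi^{\#(3)}\triangleright\chi'^{\#(1)})\big)\Big)\chi'^{\#(2)}\otimes\chi'^{\#(3)}$$
$$= \big(a\triangleright(b\triangleright(u\chi'^{\#(1)}))\big)\chi'^{\#(2)}\otimes\chi'^{\#(3)} = \Psi(ab\otimes u)$$

using parts (a) and then (c) of the lemma and that $\triangleright$ is an action. On the other side, we have

$$(\cdot_P\otimes{\rm id})\Psi_{23}\Psi_{12}(a\otimes u\otimes v) = (\cdot_P\otimes{\rm id})\Psi_{23}\big((a\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}\otimes\chi^{\#(3)}\otimes v\big)$$
$$= (a\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}\big(\chi^{\#(3)}\triangleright(v\chi'^{\#(1)})\big)\chi'^{\#(2)}\otimes\chi'^{\#(3)}$$
$$= (a\triangleright(uv\chi'^{\#(1)}))\chi'^{\#(2)}\otimes\chi'^{\#(3)} = \Psi(a\otimes uv)$$

using part (c) of the lemma. The computations for $\Psi(a\otimes 1)$ and $\Psi(1\otimes u)$ are more trivial and left to the reader. We need $\chi^{\#(1)}\chi^{\#(2)}\otimes\chi^{\#(3)} = \chi(1\otimes\chi^{\#(1)}\otimes\chi^{\#(2)})\otimes\chi^{\#(3)} = 1\otimes 1$ for the latter case. Hence we have a factorisation datum and by Proposition 2.1 we have an algebra $X$ built on $P\otimes A$ with the cross relations $(1\otimes a)(u\otimes 1) = \Psi(a\otimes u)$. We now define $\tilde e(a) = a\triangleright 1$ and check easily that $\tilde e(ab) = a\triangleright\tilde e(b)$ as required, and that

$$\Psi(a\otimes u)^{(1)}\,\tilde e\big(\Psi(a\otimes u)^{(2)}\big) = (a\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}(\chi^{\#(3)}\triangleright 1) = a\triangleright u$$

by part (c) of the lemma. Hence we have the converse to the preceding proposition. To prove that $P$ is a $\Psi$-module, we take any $a\in A$, $u,v\in P$ and use the explicit form of $\Psi$ above and part (c) of the lemma to compute

$$\cdot\circ(\Psi(a\otimes u)\triangleright v) = (a\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}(\chi^{\#(3)}\triangleright v) = a\triangleright(uv).$$

Finally, suppose there is another factorisation $\tilde\Psi$ such that $P$ is a $\tilde\Psi$-module, and let $\tilde\Psi(a\otimes u) = u_i\otimes a^i$ for all $a\in A$, $u\in P$. Then

$$\Psi(a\otimes u) = (a\triangleright(u\chi^{\#(1)}))\chi^{\#(2)}\otimes\chi^{\#(3)} = u_i(a^i\triangleright\chi^{\#(1)})\chi^{\#(2)}\otimes\chi^{\#(3)} = u_i\otimes a^i = \tilde\Psi(a\otimes u),$$

where we used the definition of $\chi^\#$. This proves the uniqueness of $\Psi$. □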
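The explicit formula of Theorem 2.6 can be tried out numerically. The sketch below is an illustration only, assuming the cyclic data of Example 2.7 below: the diagonal action $h^m\triangleright g^j = q^{m(j+1)}g^j$ and $\chi^\#(1) = N^{-1}\sum_{a,b}q^{-ab}g^{b-1}\otimes g^{1-b}\otimes h^a$. The value $N = 3$ and the function names are hypothetical. It rebuilds $\Psi(h^m\otimes g^k)$ from the action and $\chi^\#$ alone and compares it with $q^{mk}g^k\otimes h^m$.

```python
import cmath
from itertools import product

N = 3                                   # assumed order of the cyclic groups
Q = cmath.exp(2j * cmath.pi / N)        # primitive N-th root of unity

def act(m, j):
    """Coefficient c in h^m |> g^j = c g^j (assumed diagonal action)."""
    return Q ** (m * (j + 1))

def chi_sharp_of_one():
    """chi^#(1) = (1/N) sum_{a,b} Q^(-ab) g^(b-1) (x) g^(1-b) (x) h^a."""
    return [(Q ** (-a * b) / N, (b - 1) % N, (1 - b) % N, a)
            for a, b in product(range(N), repeat=2)]

def psi_from_action(m, k):
    """Psi(h^m (x) g^k) = (h^m |> (g^k x1)) x2 (x) x3, where chi^#(1) = x1 (x) x2 (x) x3."""
    out = {}
    for w, x1, x2, a in chi_sharp_of_one():
        j = (k + x1) % N                 # g^k x1 = g^j
        key = ((j + x2) % N, a)          # basis element g^(j + x2) (x) h^a
        out[key] = out.get(key, 0) + w * act(m, j)
    # drop terms that cancel up to rounding error
    return {key: c for key, c in out.items() if abs(c) > 1e-9}

def recovers_braiding():
    """Check that the rebuilt map equals Q^(m k) g^k (x) h^m on every basis element."""
    for m, k in product(range(N), repeat=2):
        result = psi_from_action(m, k)
        if set(result) != {(k, m)} or abs(result[(k, m)] - Q ** (m * k)) > 1e-9:
            return False
    return True
```

In this example the sum over $b$ collapses to a delta function in $a$, which is the numerical counterpart of applying Lemma 2.5(c) in the proof above.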

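Example 2.7. Let $q$ be a primitive $n$th root of 1. The $n\times n$ matrices factorise as $M_n(\mathbb{C}) = \mathbb{C}\mathbb{Z}_n\cdot\mathbb{C}\mathbb{Z}_n$, where the two copies of $\mathbb{Z}_n$ are generated by

$$g = \begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & q & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & \cdots & 0 & q^{n-1}\end{pmatrix},\qquad h = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & & & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & 1\\ 1 & 0 & 0 & \cdots & 0\end{pmatrix}$$

obeying $hg = qgh$, so

$$\Psi(h\otimes g) = qg\otimes h,\qquad \Psi(1\otimes g) = g\otimes 1,\qquad \Psi(h\otimes 1) = 1\otimes h,\qquad \Psi(1\otimes 1) = 1\otimes 1.$$

The nontrivial character $e(h) = q$ gives $h\triangleright g^m = q^{m+1}g^m$ and hence $M = \mathbb{C}1$. The result is Galois, with

$$\chi(h^m\otimes g^k\otimes g^l) = q^{m(k+1)}g^{k+l},\qquad \chi^\#(g^m) = n^{-1}\sum_{a,b}q^{-ab}\,g^{b-1}\otimes g^{m-b+1}\otimes h^a.$$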
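The claims of Example 2.7 lend themselves to a direct numerical check. The following sketch is illustrative and not part of the original text; the function names are hypothetical and plain Python lists stand in for matrices. It verifies $hg = qgh$, checks that the $n^2$ products $g^kh^m$ are orthogonal for the inner product ${\rm Tr}(x^\dagger y)$ (so that they form a basis of $M_n(\mathbb{C})$), and tests the two Galois conditions of Proposition 2.2 on basis elements using the stated formulas for $\chi$ and $\chi^\#$.

```python
import cmath
from itertools import product

def clock_shift(n):
    """Return q, the diagonal matrix g and the cyclic shift matrix h."""
    q = cmath.exp(2j * cmath.pi / n)
    g = [[q ** i if i == j else 0j for j in range(n)] for i in range(n)]
    h = [[1 + 0j if j == (i + 1) % n else 0j for j in range(n)] for i in range(n)]
    return q, g, h

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(a, p):
    n = len(a)
    out = [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    for _ in range(p):
        out = matmul(out, a)
    return out

def is_factorisation(n, tol=1e-9):
    """Check hg = q gh and that the products g^k h^m are mutually orthogonal."""
    q, g, h = clock_shift(n)
    hg, gh = matmul(h, g), matmul(g, h)
    if any(abs(hg[i][j] - q * gh[i][j]) > tol for i in range(n) for j in range(n)):
        return False
    basis = [matmul(matpow(g, k), matpow(h, m)) for k, m in product(range(n), repeat=2)]

    def inner(x, y):
        # Hilbert-Schmidt inner product Tr(x^dagger y)
        return sum(x[i][j].conjugate() * y[i][j] for i in range(n) for j in range(n))

    return all(abs(inner(x, y) - (n if s == t else 0)) < tol
               for s, x in enumerate(basis) for t, y in enumerate(basis))

def galois_conditions_hold(n, tol=1e-9):
    q = cmath.exp(2j * cmath.pi / n)

    def chi_sharp(m):
        # chi^#(g^m) as a list of (coefficient, exponent, exponent, power of h)
        return [(q ** (-a * b) / n, (b - 1) % n, (m - b + 1) % n, a)
                for a, b in product(range(n), repeat=2)]

    for k, l in product(range(n), repeat=2):
        out = {}
        for w, x, y, a in chi_sharp((k + l) % n):
            # chi(h^a (x) g^k (x) g^l) = q^(a(k+1)) g^(k+l); the trace pairs it with h^a
            out[(x, y)] = out.get((x, y), 0) + q ** (a * (k + 1)) * w
        if any(abs(c - (1 if key == (k, l) else 0)) > tol for key, c in out.items()):
            return False
    for m, k in product(range(n), repeat=2):
        out = {}
        for w, x, y, a in chi_sharp(k):
            # chi(h^m (x) g^x (x) g^y) = q^(m(x+1)) g^(x+y); result should be g^k (x) h^m
            key = ((x + y) % n, a)
            out[key] = out.get(key, 0) + q ** (m * (x + 1)) * w
        if any(abs(c - (1 if key == (k, m) else 0)) > tol for key, c in out.items()):
            return False
    return True
```

Orthogonality is used here only as a convenient sufficient test for linear independence; any rank computation would serve equally well.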

Proof. We identify $A = \mathbb{C}\mathbb{Z}_n = \mathbb{C}[h]/(h^n = 1)$ and $P = \mathbb{C}\mathbb{Z}_n = \mathbb{C}[g]/(g^n = 1)$ as the two subalgebras. The relations $hg = (1\otimes h)(g\otimes 1) = \Psi(h\otimes g) = q(g\otimes 1)(1\otimes h)$ give the form of $\Psi$. This extends uniquely to a solution of the factorisation equations in Proposition 2.1 as $\Psi(h^m\otimes g^k) = q^{mk}g^k\otimes h^m$. Actually, this is an example of a braided tensor product $M_n(\mathbb{C}) = \mathbb{C}\mathbb{Z}_n\underline{\otimes}\mathbb{C}\mathbb{Z}_n$ in the braided category of anyonic or $\mathbb{Z}_n$-graded spaces. The character $e$ then gives the action shown as $h\triangleright g^m = q^mg^me(h)$. Hence $\sum_ma_mg^m\in M$ iff $a_m(q^{m+1}-q) = 0$ for all $m$, which means $M = \mathbb{C}1$. We also obtain $\chi$ as shown and one may verify that $\chi^\#$ as stated fulfills the requirements in Proposition 2.2. □

In this example $A$ is actually a Hopf algebra and $\tilde e(h) = q1$ as here yields a bundle which is equivalent (in the coalgebra bundle version) to a Hopf algebra bundle as in [7]. On the other hand, other choices of $\tilde e$ yield algebra bundles which are not equivalent to Hopf algebra bundles, i.e. strict examples of our more general theory. We examine the $n = 2$ case in detail:

Example 2.8. The factorisation $M_2(\mathbb{C}) = \mathbb{C}\mathbb{Z}_2\cdot\mathbb{C}\mathbb{Z}_2$ as above (with $q = -1$) admits a family of algebra bundle structures parametrized by $\theta\in[0,2\pi)$, with $\tilde e(h) = \cos(\theta)+\imath g\sin(\theta)$. The associated Galois action is $h\triangleright g^k = (-1)^kg^k(\cos(\theta)+\imath g\sin(\theta))$.

Proof. We have $A = \mathbb{C}[h]/(h^2 = 1)$ and $P = \mathbb{C}[g]/(g^2 = 1)$. We require $\tilde e$ of the form

$$\tilde e(1) = 1,\qquad \tilde e(h) = \alpha+\imath\beta g$$

(say) obeying the condition in Proposition 2.3. The non-empty case is

$$1 = \tilde e(1) = \tilde e(h.h) = \Psi(h\otimes\tilde e(h))^{(1)}\,\tilde e\big[\Psi(h\otimes\tilde e(h))^{(2)}\big] = \alpha\tilde e(h)-\imath\beta g\tilde e(h) = (\alpha-\imath\beta g)(\alpha+\imath\beta g) = \alpha^2+\beta^2.$$

This admits many solutions over $\mathbb{C}$, a natural family being those where $\alpha$, $\beta$ are real, i.e. on a circle parametrized by $\theta$. On the other hand, in the case equivalent to a Hopf algebra bundle, $P$ would be an $A$-module algebra. This happens when $(h\triangleright g)^2 = \cos^2(\theta)-\sin^2(\theta)+2\imath\sin(\theta)\cos(\theta)g = 1$, which is exactly when $\theta = 0,\pi$. The first case is trivial and the second is the $n = 2$ case of the preceding Example 2.7. Next, we consider $m = a+bg$ such that $h\triangleright m = \tilde e(h)m$. It is easy to see that this happens iff $b = 0$, provided $\sin(\theta)\neq 0$ or $\cos(\theta)\neq 0$ (one of which is always the case). Hence $M = \mathbb{C}1$. Finally, we have

$$\chi(h^m\otimes g^k\otimes g^l) = g^{k+l}\big((-1)^k(\cos(\theta)+\imath\sin(\theta)g)\big)^m$$


which we can write as a map $P\otimes P\to A^*\otimes P$. Identifying $A^* = \mathbb{C}\mathbb{Z}_2$ with generator $c$, say, the map is

$$g^k\otimes g^l\mapsto\big(c_++c_-(-1)^k\cos(\theta)\big)\otimes g^{k+l}+c_-\,\imath(-1)^k\sin(\theta)\otimes g^{k+l+1},$$

where $c_\pm = (1\pm c)/2$. (This is the map $\chi$ in the corresponding coalgebra bundle version.) Invertibility of this map is equivalent to the existence of $\chi^\#$ in the present setting; in fact the map has determinant 1 in the obvious bases $\{g^k\otimes g^l\}$ and $\{c^k\otimes g^l\}$ and is therefore invertible. □

The corresponding factorisation over $\mathbb{R}$ is the quaternion algebra and provides a counterexample to the existence of $\tilde e$:

Example 2.9. Over $\mathbb{R}$, the quaternion algebra $\mathbb{H} = {\rm span}\{1,i,j,k\}$ obeying $i^2 = j^2 = k^2 = -1$ and $ij = k$ etc., is a factorisation $\mathbb{H} = \mathbb{R}[i]\,\mathbb{R}[j]$, where $\mathbb{R}[i] = \mathbb{C}$ as a 2-dimensional algebra over $\mathbb{R}$. One has

$$\Psi(j\otimes i) = -i\otimes j,\qquad \Psi(1\otimes i) = i\otimes 1,\qquad \Psi(j\otimes 1) = 1\otimes j,\qquad \Psi(1\otimes 1) = 1\otimes 1.$$

This factorisation admits no map $\tilde e$.

Proof. The factorisation is evident, with $P = \mathbb{R}[i]$ and $A = \mathbb{R}[j]$ (the quotients of polynomials by the relations $i^2 = -1$ and $j^2 = -1$ respectively). Now suppose a linear map $\tilde e : \mathbb{R}[j]\to\mathbb{R}[i]$ of the form

$$\tilde e(1) = 1,\qquad \tilde e(j) = \alpha+i\beta.$$

Then a similar computation to the one above yields this time $-1 = \tilde e(-1) = \tilde e(j.j) = \alpha^2+\beta^2$, which has no solutions over $\mathbb{R}$. □

Returning to the general theory,

Proposition 2.10. An algebra bundle is trivial or “cleft” if there is an invertible element $\Phi = \Phi^{(1)}\otimes\Phi^{(2)}\in P\otimes A^{\rm op}$ such that

$$\Phi a = a\triangleright\Phi,\qquad \forall a\in A,$$

where the product from the right is in $A$. In this case, $P\cong{\rm Hom}_k(A,M)$ as a left $A$-module and right $M$-module. Moreover, $(P,A,\Psi,\tilde e)$ in Proposition 2.3 is a trivial (cleft) algebra bundle if there exists an invertible $\Phi\in P\otimes A^{\rm op}$ obeying the above condition, with

$$\chi^\#(u) = \Phi^{(1)}\otimes_M\Phi^{-(1)}u\otimes\Phi^{-(2)}\Phi^{(2)},\qquad \forall u\in P.$$

Proof. The isomorphism $\Theta : {\rm Hom}_k(A,M)\to P$ is

$$\Theta(f) = \Phi^{(1)}f(\Phi^{(2)}),\qquad \Theta^{-1}(u)(a) = \Phi^{-(1)}\big((\Phi^{-(2)}a)\triangleright u\big).$$

Here $\Theta$ is a left $A$-module map since the image of $f$ is in $M$ and $\Phi$ obeys the condition above. Next, the latter condition is equivalent to the condition

$$\Psi(a\otimes\Phi^{-(1)})\Phi^{-(2)} = \tilde e(a)\Phi^{-1} \tag{5}$$


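for $\Phi^{-1}$ (just compute $\tilde e(a)\otimes 1 = a\triangleright 1\otimes 1 = a\triangleright(\Phi^{-(1)}\Phi^{(1)})\otimes\Phi^{(2)}\Phi^{-(2)}$ using (2)). Hence

$$a\triangleright\big(\Phi^{-(1)}(\Phi^{-(2)}\triangleright u)\big) = \cdot\circ\Psi(a\otimes\Phi^{-(1)})\Phi^{-(2)}\triangleright u = \tilde e(a)\Phi^{-(1)}(\Phi^{-(2)}\triangleright u),$$

i.e. $\Phi^{-(1)}(\Phi^{-(2)}\triangleright u)\in M$ for all $u\in P$. In particular, this implies that $\Theta^{-1}(u) : A\to M$ as required. This then provides the required inverse since $\Theta\circ\Theta^{-1}(u) = \Phi^{(1)}(\Theta^{-1}(u))(\Phi^{(2)}) = \Phi^{(1)}\Phi^{-(1)}((\Phi^{-(2)}\Phi^{(2)})\triangleright u) = u$ from the definitions, and $(\Theta^{-1}\circ\Theta(f))(a) = \Phi^{-(1)}(\Phi^{-(2)}a\triangleright\Theta(f)) = \Phi^{-(1)}\Theta(\Phi^{-(2)}a\triangleright f) = \Phi^{-(1)}\Phi^{(1)}f(\Phi^{(2)}\Phi^{-(2)}a) = f(a)$ by the left $A$-module property of $\Theta$.

For the second part, given a factorisation datum and $\tilde e$, we define $\chi^\#$ as shown. Then

$$\chi(a\otimes\chi^\#(u)^{(1)})\otimes\chi^\#(u)^{(2)} = (a\triangleright\Phi^{(1)})\Phi^{-(1)}u\otimes\Phi^{-(2)}\Phi^{(2)} = \Phi^{(1)}\Phi^{-(1)}u\otimes\Phi^{-(2)}\Phi^{(2)}a = u\otimes a$$

by the property of $\Phi$. On the other side,

$${\rm Tr}_A\,\chi^\#\circ\chi(u\otimes_Mv) = \Phi^{(1)}\otimes_M\Phi^{-(1)}(\Phi^{-(2)}\Phi^{(2)}\triangleright u)v = \Phi^{(1)}\Phi^{-(1)}(\Phi^{-(2)}\Phi^{(2)}\triangleright u)\otimes_Mv = u\otimes_Mv,$$

since $\Phi^{-(1)}(\Phi^{-(2)}\triangleright(\Phi^{(2)}\triangleright u))\in M$ by the same proof as above. Hence the action of $A$ on $P$ is Galois in this case. □

Proposition 2.11. A bundle automorphism is an invertible linear map $P\to P$ which is a right $M$-module map and a left $A$-module map. The group of bundle automorphisms is in correspondence with invertible $f\in P\otimes A^{\rm op}$ such that

$$\Psi(a\otimes f^{(1)})f^{(2)} = f^{(1)}\otimes f^{(2)}a,\qquad \forall a\in A.$$

When the bundle is trivial, such $f$ correspond to invertible elements $\gamma\in M\otimes A^{\rm op}$ by $f = \Phi\gamma\Phi^{-1}$ multiplied in $P\otimes A^{\rm op}$.

Proof. Note first of all that the set of such elements in $P\otimes A^{\rm op}$ forms a group. Thus, writing $\Psi(a\otimes f^{(1)}) = f^{(1)}{}_i\otimes a^i$,

$$\Psi(a\otimes f^{(1)}g^{(1)})g^{(2)}f^{(2)} = f^{(1)}{}_i\Psi(a^i\otimes g^{(1)})g^{(2)}f^{(2)} = f^{(1)}{}_ig^{(1)}\otimes g^{(2)}a^if^{(2)} = f^{(1)}g^{(1)}\otimes g^{(2)}f^{(2)}a$$

when $f$, $g$ obey this condition. The relation between such $f$ and automorphisms $F$ is

$$F(u) = f^{(1)}(f^{(2)}\triangleright u),\qquad f = F(\chi^{\#(1)})\chi^{\#(2)}\otimes\chi^{\#(3)}.$$

Thus, given $f$ it is evident that $a\triangleright F(u) = a\triangleright(f^{(1)}(f^{(2)}\triangleright u)) = \cdot\circ\Psi(a\otimes f^{(1)})f^{(2)}\triangleright u = f^{(1)}(f^{(2)}a\triangleright u) = F(a\triangleright u)$ by (2) and the property of $f$, so $F$ is a left $A$-module map (it is clearly a right $M$-module map as well). Also from this, it is immediate that the product in $P\otimes A^{\rm op}$ maps over to the composition of bundle transformations. Finally, the inverse of the construction is as shown using the properties of $\chi^\#$. Thus, $F(\chi^{\#(1)})\chi^{\#(2)}(\chi^{\#(3)}\triangleright u) = F(u)$ by Lemma 2.5(c), and when $F$ is defined by $f$, the inversion formula yields $(f^{(1)}(f^{(2)}\triangleright\chi^{\#(1)}))\chi^{\#(2)}\otimes\chi^{\#(3)} = f^{(1)}\chi(f^{(2)}\otimes\chi^{\#(1)}\otimes\chi^{\#(2)})\otimes\chi^{\#(3)} = f$ from the definition of $\chi^\#$.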


In the case of a trivial bundle, we define $f$ as shown and verify

$$\Psi(a\otimes f^{(1)})f^{(2)} = \Psi(a\otimes\Phi^{(1)}\gamma^{(1)}\Phi^{-(1)})\Phi^{-(2)}\gamma^{(2)}\Phi^{(2)} = (\Phi^{(1)}\gamma^{(1)})_i\,\Psi(a^i\otimes\Phi^{-(1)})\Phi^{-(2)}\gamma^{(2)}\Phi^{(2)}$$
$$= (a\triangleright(\Phi^{(1)}\gamma^{(1)}))\Phi^{-(1)}\otimes\Phi^{-(2)}\gamma^{(2)}\Phi^{(2)} = f^{(1)}\otimes f^{(2)}a,$$

using the properties of $\Psi$, (5) and that $\gamma^{(1)}\in M$. Conversely, given $f$ we define $\gamma = \Phi^{-1}f\Phi$ (product in $P\otimes A^{\rm op}$) and verify using (5) that $a\triangleright\gamma = \tilde e(a)\gamma$ so that $\gamma\in M\otimes A^{\rm op}$. □

Next, even though $A$ is only an algebra, its action on $P$ extends naturally to tensor powers and hence to the universal exterior differentials $\Omega^nP\subseteq P^{\otimes(n+1)}$.

Proposition 2.12. In the setting of Proposition 2.3, $\Omega^nP$ is a $\Psi^\bullet$-module with $P$ acting by left multiplication and

$$a\triangleright(u_0\otimes\cdots\otimes u_n) = \Psi^{n+1}(a\otimes u_0\otimes\cdots\otimes u_n)^{(1)}\,\tilde e\big[\Psi^{n+1}(a\otimes u_0\otimes\cdots\otimes u_n)^{(2)}\big],$$

where $\Psi^\bullet|_{\Omega^nP} = \Psi^{n+1} = \Psi_{n,n+1}\cdots\Psi_{12}$ defines another factorisation datum $(\Omega P,\Psi^\bullet,A)$. It is such that $\Psi^\bullet\circ({\rm id}\otimes d) = (d\otimes{\rm id})\circ\Psi^\bullet$.

Proof. That $(\Omega P,\Psi^\bullet,A)$ is another factorisation datum is an elementary proof by induction repeatedly using the factorisation properties in Proposition 2.1 and the product in $\Omega P$ (which is just inherited from the product in $P$); it is left to the reader. Applying Proposition 2.3 to this new factorisation, with $\tilde e : A\to P\subseteq\Omega P$, then gives a $\Psi^\bullet$-module. We then restrict the action to ones of $A$, $P$. □

Armed with this, we can define a connection as an equivariant splitting of $\Omega^1P\supseteq P(\Omega^1M)P$ as in [7, 8]. More precisely, we require that $\Pi$ has kernel $P(\Omega^1M)P$, is a right $P$-module map and $\Pi\circ d$ is a left $A$-module map. Such projections turn out to be in 1-1 correspondence with $\omega\in\Omega^1P\otimes A$ such that

i) $\omega^{(1)}\tilde e(\omega^{(2)}) = 0$,
ii) $\tilde\chi(a\otimes\omega^{(1)})\otimes\omega^{(2)} = 1\otimes a-\tilde e(a)\otimes 1$,
iii) $\omega a = \Psi^\bullet(a\otimes\omega^{(1)})\omega^{(2)}$.

Here the correspondence is $\Pi(u\otimes v) = \omega^{(1)}\tilde\chi(\omega^{(2)}\otimes u\otimes v)$ (using $\chi^\#$ in the reverse direction). We will provide this in more detail in the next section in the coalgebra setting.

There is also a theory of associated bundles. In fact, one has and needs two kinds of associated bundles; given an algebra bundle and a right $A$-module $V_R$ we have

$$E = \Big\{\sum_kv_k\otimes u_k\in V_R\otimes P\ \Big|\ \sum_kv_k\triangleleft a\otimes u_k = \sum_kv_k\otimes a\triangleright u_k,\ \forall a\in A\Big\}\subseteq V_R\otimes P$$

as a natural right $M$-module by right multiplication in $P$. And given a left $A$-module $V_L$ we have

$$\bar E = \Big\{\sum_ku_k\otimes v_k\in P\otimes V_L\ \Big|\ \sum_k\Psi(a\otimes u_k)\triangleright v_k = \tilde e(a)\sum_ku_k\otimes v_k,\ \forall a\in A\Big\}\subseteq P\otimes V_L$$


as a natural left $M$-module. Here $\bar E$ is the natural “invariant” subspace from Lemma 2.4 for the $\Psi$-module structure of $P\otimes V_L$ provided by the following lemma.

Lemma 2.13. If $V$ is a left $A$-module then $P\otimes V$ is a $\Psi$-module where $P$ acts by multiplication from the left and $A$ acts by $a\triangleright(u\otimes v) = \Psi(a\otimes u)\triangleright v$.

Proof. We check first that $A$ acts as shown. Thus, $(ab)\triangleright(u\otimes v) = \Psi(ab\otimes u)\triangleright v = \Psi(a\otimes u_i)\triangleright(b^i\triangleright v) = \Psi(a\otimes(b\triangleright(u\otimes v))^{(1)})\triangleright(b\triangleright(u\otimes v))^{(2)} = a\triangleright(b\triangleright(u\otimes v))$ using the definitions and the factorisation property of $\Psi$. Here $b\triangleright(u\otimes v) = (b\triangleright(u\otimes v))^{(1)}\otimes(b\triangleright(u\otimes v))^{(2)}$ is a notation. This then forms a $\Psi$-module since $a\triangleright(uu'\otimes v) = \Psi(a\otimes uu')\triangleright v = u_i\Psi(a^i\otimes u')\triangleright v = u_i(a^i\triangleright(u'\otimes v)) = \triangleright\circ(\Psi(a\otimes u)\triangleright(u'\otimes v))$ as required. □

Sections of these bundles are $M$-valued $M$-module maps from $E$, $\bar E$ respectively. When $P$ is flat over $M$ and $\Psi$ has a certain adjoint $\Psi^\#$, one can show that

$${\rm Hom}_A(V_L,P)\cong{}_M{\rm Hom}(\bar E,M),\qquad {\rm Hom}(V_R,P)_0\cong{\rm Hom}_M(E,M)$$

as right $M$-modules, left $M$-modules respectively. In the first case, if $\varphi\in{\rm Hom}_A(V_L,P)$ then the corresponding section of $\bar E$ is $\bar s_\varphi(u\otimes v) = u\varphi(v)$. In the second case, ${\rm Hom}(V_R,P)$ is a left $\Psi$-module in a similar manner to Lemma 2.13 (coinciding with it in the finite-dimensional case), namely $(a\triangleright\varphi)(v) = {\rm Tr}_A\Psi(a\otimes\varphi(v\triangleleft(\ )))$. If $\varphi\in{\rm Hom}(V_R,P)_0$ then the corresponding section of $E$ is $s_\varphi(v\otimes u) = \varphi(v)u$. The proof of these assertions will be given in Sect. 4 in the coalgebra setting with $\chi^{-1}$ and $\psi^{-1}$ in the roles of $\chi^\#$ and $\Psi^\#$. When $V_L$ and $V_R$ are finite-dimensional then

$$E = {\rm Hom}_A(V_R^*,P),\qquad \bar E = {\rm Hom}(V_L^*,P)_0,$$

so that each bundle can be viewed as the space of sections of the other. Moreover, the constructions generalise directly to form-valued sections by using $\Psi^\bullet$ in place of $\Psi$.

One may then proceed to frame bundles, etc. Thus, one has a covariant derivative

$$\nabla : E\to E\otimes_M\Omega^1M,\qquad \bar\nabla : \bar E\to\Omega^1M\otimes_M\bar E$$

associated to a suitable (strong) connection in the pointed case. By definition a frame resolution is an associated bundle equipped with a canonical form such that $E\cong\Omega^1M$, and in this case $\nabla$ plays the role of Levi–Civita connection, etc., along the lines in [23]. This and the rest of the theory will be provided in Sect. 4, in our preferred coalgebra bundle setting.

Finally, we give the situation in the case of trivial (cleft) algebra bundles. In this case sections correspond to “matter fields” on the base $M$,

$${\rm Hom}_A(V_L,P)\cong{\rm Hom}_k(V_L,M),\qquad {\rm Hom}(V_R,P)_0 = {\rm Hom}_k(V_R,M).$$

The first isomorphism sends $\bar f\in{\rm Hom}_k(V_L,M)$ to the map $\varphi^{\bar f}(v) = \Phi^{(1)}\bar f(\Phi^{(2)}\triangleright v)$. The second isomorphism sends $f\in{\rm Hom}_k(V_R,M)$ to $\varphi^f(v) = f(v\triangleleft\Phi^{-(2)})\Phi^{-(1)}$.


Similarly, (strong) connections $\omega$ are determined by “gauge fields” $\alpha\in\Omega^1M\otimes A$ such that $\alpha^{(1)}\tilde e(\alpha^{(2)}) = 0$, according to

$$\omega = \Phi^{(1)}\alpha^{(1)}\Phi^{-(1)}\otimes\Phi^{-(2)}\alpha^{(2)}\Phi^{(2)}+\Phi^{(1)}d\Phi^{-(1)}\otimes\Phi^{-(2)}\Phi^{(2)}.$$

Proofs will again be given in the following sections, in the coalgebra setting. The covariant derivative on these matter fields and their gauge transformation by $\gamma\in M\otimes A^{\rm op}$ then take on the familiar form for algebraic gauge theory on trivial bundles (see [22, Sect. 3]).

3. Coalgebra Bundles and Connections

We switch now to the coalgebra version of the theory, where $A$ is replaced by a coalgebra $C$. This is the original theory of coalgebra bundles [8], which we extend further. The coalgebra version involves less familiar notations but has advantages in a purely algebraic treatment.

Definition 3.1. [8] A coalgebra $C$ and algebra $P$ are entwined by $\psi : C\otimes P\to P\otimes C$ if

$$\psi\circ({\rm id}\otimes\cdot) = (\cdot\otimes{\rm id})\circ\psi_{23}\circ\psi_{12},\qquad \psi(c\otimes 1) = 1\otimes c,\quad\forall c\in C, \tag{6}$$
$$({\rm id}\otimes\Delta)\circ\psi = \psi_{12}\circ\psi_{23}\circ(\Delta\otimes{\rm id}),\qquad ({\rm id}\otimes\varepsilon)\circ\psi = \varepsilon\otimes{\rm id}. \tag{7}$$

The triple $(P,C,\psi)$ is called an entwining structure. We will often use the notation $\psi(c\otimes u) = u_\alpha\otimes c^\alpha$ (summation over $\alpha$ is understood). In this notation conditions (6) and (7) take a very simple explicit form

$$(uv)_\alpha\otimes c^\alpha = u_\alpha v_\beta\otimes c^{\alpha\beta},\qquad 1_\alpha\otimes c^\alpha = 1\otimes c,$$
$$u_\alpha\otimes c^\alpha{}_{(1)}\otimes c^\alpha{}_{(2)} = u_{\beta\alpha}\otimes c_{(1)}{}^\alpha\otimes c_{(2)}{}^\beta,\qquad u_\alpha\varepsilon(c^\alpha) = \varepsilon(c)u,$$

for any $u,v\in P$ and $c\in C$.

The entwining structure corresponds to an algebra factorisation in the case $C$ finite-dimensional, built on $A = C^{*\rm op}$ and $P$, as explained in [8]. Similarly, if $e\in C$ is grouplike, there is a right coaction $\Delta_P : P\to P\otimes C$ defined by $\Delta_P(u) = \psi(e\otimes u)$ and $M = M_e = \{u\in P\ |\ \Delta_P(u) = u\otimes e\}$ is a subalgebra. The map $\tilde\chi : P\otimes P\to P\otimes C$ defined by $\tilde\chi(u\otimes v) = u\Delta_P(v)$ descends to $\chi : P\otimes_MP\to P\otimes C$ and we have a copointed coalgebra bundle $P(M,C,\psi,e)$ when $\chi$ is invertible and P . This is the setting studied in [8].

We also note that for any entwining structure we have a natural category $\mathcal{M}_P^C(\psi)$ of entwined modules. The objects are right $P$-modules and right $C$-comodules $V$ such that for all $v\in V$, $u\in P$,

$$\Delta_V(v\triangleleft u) = v_{(0)}\triangleleft\psi(v_{(1)}\otimes u) := v_{(0)}\triangleleft u_\alpha\otimes v_{(1)}{}^\alpha. \tag{8}$$

The morphisms are right $P$-module and right $C$-comodule maps. The category $\mathcal{M}_P^C(\psi)$ generalises the category of unifying or Doi-Koppinen modules [12, 19] which unifies various categories studied intensively in the Hopf algebra theory (e.g. Drinfeld–Radford–Yetter (or crossed) modules, Hopf modules, relative Hopf modules, Long modules, etc.).


The algebra $P$ is an object in $\mathbf{M}_P^C(\psi)$, with the right regular action of $P$ (by multiplication), if and only if there exists an element $\tilde e \in P \otimes C$ such that
\[ \tilde e^{(1)} \psi(\tilde e^{(2)} \otimes \tilde e'^{(1)}) \otimes \tilde e'^{(2)} = (\mathrm{id} \otimes \Delta)\tilde e, \qquad (\mathrm{id} \otimes \varepsilon)\tilde e = 1 \]
(where $\tilde e'$ is another copy of $\tilde e$ and we use the notation $\tilde e = \tilde e^{(1)} \otimes \tilde e^{(2)}$, etc.). In this case the coaction is $\Delta_P(u) = \tilde e^{(1)} \psi(\tilde e^{(2)} \otimes u)$, $\forall u \in P$. Notice that $\tilde e = \Delta_P(1)$. We then define
\[ M = \{m \in P \mid \Delta_P(mv) = m\,\Delta_P v \ \ \forall v \in P\} = \{m \in P \mid \Delta_P(m) = m\tilde e\}, \]
which is a subalgebra of $P$, and proceed as above, requiring $\chi$ to be bijective. We will call this a general coalgebra bundle $P(M, C, \psi)$. The copointed case corresponds to the choice $\tilde e = 1 \otimes e$. There is also a converse: if $P$ is an algebra and a right $C$-comodule, we say that the coaction is Galois if $M$ defined as above is such that $\chi$ is bijective. In this case there is an entwining structure [6]
\[ \psi(c \otimes u) = \chi(c^{(1)} \otimes_M c^{(2)} u), \qquad \chi^{-1}(1 \otimes c) = c^{(1)} \otimes_M c^{(2)}, \qquad \tilde e = \Delta_P(1) \]

and we have a coalgebra bundle. Because of these natural properties, we will work now with these slightly more general coalgebra bundles (or $C$-Galois extensions). Our preliminary goal in the present section is to make the evident generalisations of the copointed theory in [8] to this case.

Next, a coalgebra bundle is trivial, cf. [8] (or one says that the $C$-Galois extension is cleft), if there is a convolution invertible map $\Phi : C \to P$ (the trivialisation or cleaving map) such that
\[ \Delta_P \circ \Phi = (\Phi \otimes \mathrm{id}) \circ \Delta. \tag{9} \]
By considering the equality $1_{(0)} \varepsilon(c) \otimes 1_{(1)} = 1_{(0)} \psi(1_{(1)} \otimes \Phi(c_{(1)}) \Phi^{-1}(c_{(2)}))$ one finds that
\[ \psi(c_{(1)} \otimes \Phi^{-1}(c_{(2)})) = \Phi^{-1}(c)\, \Delta_P(1), \tag{10} \]

which allows one to use the argument of the proof of [8, Prop. 2.9] to show that $P \cong M \otimes C$ as a left $M$-module and right $C$-comodule.

We turn now to the theory of connections, based on the theory for the copointed case in [8]. As shown in [8, Prop. 2.2], given an entwining structure $(P, C, \psi)$ there is an entwining structure $(\Omega P, C, \psi^\bullet)$, where $\psi^\bullet|_{C \otimes \Omega^{n-1} P} = \psi^n \equiv \psi_{n,n+1} \psi_{n-1,n} \cdots \psi_{12} : C \otimes P^{\otimes n} \to P^{\otimes n} \otimes C$ is the iterated entwining. Moreover,
\[ \psi^\bullet \circ (\mathrm{id} \otimes d) = (d \otimes \mathrm{id}) \circ \psi^\bullet. \tag{11} \]
Therefore, given $\tilde e \in P \otimes C$ satisfying the above conditions we have $\Omega^n P \in \mathbf{M}_P^C(\psi)$ with the action right multiplication by $P$ and the coaction
\[ \Delta_{\Omega^n P} = (\cdot_P \otimes \mathrm{id})(\mathrm{id} \otimes \psi^{n+1})(\tilde e \otimes \mathrm{id}). \]


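Definition 3.2. A connection on $P(M, C, \psi)$ is a left $P$-module projection $\Pi : \Omega^1 P \to \Omega^1 P$ such that (i) $\ker \Pi = P(\Omega^1 M)P$, (ii) the map $\Pi \circ d : P \to \Omega^1 P$ commutes with the right coaction.

Proposition 3.3. Connections $\Pi$ are in 1-1 correspondence with $\omega : C \to \Omega^1 P$ such that (i) $\tilde e^{(1)} \omega(\tilde e^{(2)}) = 0$, (ii) $\tilde\chi \circ \omega(c) = 1 \otimes c - \varepsilon(c)\tilde e$, (iii) $\psi^2(c_{(1)} \otimes \omega(c_{(2)})) = \omega(c_{(1)}) \otimes c_{(2)}$. The correspondence is via $\Pi(u\,dv) = u v_{(0)} \omega(v_{(1)})$ for all $u, v \in P$.

Proof. Assume first that there is $\omega$ satisfying (i)–(iii). Then the map $\Pi$ is well-defined since for all $u \in P$, $\Pi(u\,d1) = u \tilde e^{(1)} \omega(\tilde e^{(2)}) = 0$, by (i). Next for any $u, v \in P$, $x \in M$ we have
\[ \Pi(u(dx)v) = \Pi(u\,d(xv)) - \Pi(ux\,dv) = u (xv)_{(0)}\, \omega((xv)_{(1)}) - u x v_{(0)}\, \omega(v_{(1)}) = 0, \]
since $\Delta_P$ is left $M$-linear. On the other hand, if $\sum_i u^i dv^i \in \ker \Pi$, then using (ii) we have
\[ 0 = \sum_i \tilde\chi\bigl(u^i v^i{}_{(0)}\, \omega(v^i{}_{(1)})\bigr) = \sum_i \bigl(u^i v^i{}_{(0)} \otimes v^i{}_{(1)} - u^i v^i \tilde e\bigr) = \sum_i \tilde\chi(u^i dv^i). \]
Since $\ker \tilde\chi = P(\Omega^1 M)P$, we have $\ker \Pi \subseteq P(\Omega^1 M)P$, i.e., $\ker \Pi = P(\Omega^1 M)P$. Finally notice that for all $u \in P$, $\Pi(du) = u_{(0)}\, \omega(u_{(1)})$. Therefore
\[ \Delta_{\Omega^1 P}(\Pi(du)) = u_{(0)}\, \psi^2(u_{(1)} \otimes \omega(u_{(2)})) = u_{(0)}\, \omega(u_{(1)}) \otimes u_{(2)} = \Pi(du_{(0)}) \otimes u_{(1)}, \]
where the second equality holds by (iii).

Conversely, assume there is a connection $\Pi$ in $P(M, C, \psi)$. This is equivalent to the existence of a map $\sigma : P \otimes C^+ \to \Omega^1 P$, where $C^+ = \ker \varepsilon$, such that $\tilde\chi \circ \sigma = \mathrm{id}$ and $\Pi = \sigma \circ \tilde\chi$. Define $\omega(c) = \sigma(1 \otimes c - \varepsilon(c)\tilde e)$. Clearly, (ii) holds. An immediate calculation verifies (i). The definition of $\omega$ implies that $\Pi(u\,dv) = u v_{(0)}\, \omega(v_{(1)})$, for all $u, v \in P$. Since $\Pi \circ d$ commutes with the coaction we have for all $u \in P$, $u_{(0)}\, \psi^2(u_{(1)} \otimes \omega(u_{(2)})) = u_{(0)}\, \omega(u_{(1)}) \otimes u_{(2)}$. Since $\chi$ is bijective, for any $c \in C$ there is $c^{(1)} \otimes c^{(2)} \in P \otimes_M P$ such that $c^{(1)} c^{(2)}{}_{(0)} \otimes c^{(2)}{}_{(1)} = 1 \otimes c$. Thus we have
\[ \psi^2(c_{(1)} \otimes \omega(c_{(2)})) = c^{(1)} c^{(2)}{}_{(0)}\, \psi^2\bigl(c^{(2)}{}_{(1)} \otimes \omega(c^{(2)}{}_{(2)})\bigr) = c^{(1)} c^{(2)}{}_{(0)}\, \omega(c^{(2)}{}_{(1)}) \otimes c^{(2)}{}_{(2)} = \omega(c_{(1)}) \otimes c_{(2)}. \]
Therefore $\omega$ satisfies (iii) and the proof of the proposition is completed.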
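The proof above checks the kernel and the covariance of the connection but leaves the projection property implicit. A short sketch of that step, under the assumptions of Proposition 3.3 and writing $\Pi$ for the connection and $m_\omega(u \otimes c) = u\,\omega(c)$ for a helper map introduced only for this note:

```latex
\begin{align*}
\Pi(u\,dv) &= u v_{(0)}\,\omega(v_{(1)}) - uv\,\tilde e^{(1)}\omega(\tilde e^{(2)})
   = m_\omega\bigl(\tilde\chi(u\,dv)\bigr)
   && \text{by (i), as } \tilde\chi(u\,dv) = u v_{(0)} \otimes v_{(1)} - uv\,\tilde e, \\
\Pi(\omega(c)) &= m_\omega\bigl(\tilde\chi(\omega(c))\bigr)
   = m_\omega\bigl(1 \otimes c - \varepsilon(c)\tilde e\bigr)
   = \omega(c) - \varepsilon(c)\,\tilde e^{(1)}\omega(\tilde e^{(2)}) = \omega(c)
   && \text{by (ii) and (i)}, \\
\Pi\bigl(\Pi(u\,dv)\bigr) &= u v_{(0)}\,\Pi(\omega(v_{(1)})) = u v_{(0)}\,\omega(v_{(1)}) = \Pi(u\,dv)
   && \text{by left $P$-linearity.}
\end{align*}
```

Since elements of the form $u\,dv$ span the universal one-forms, this gives $\Pi^2 = \Pi$.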

Every connection $\Pi$ induces a covariant derivative $D = d - \Pi \circ d : P \to \Omega^1 P$. In the copointed case $D$ commutes with the right coaction, since $d$ itself commutes with the right coaction.
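As a quick consistency check (a sketch using only Proposition 3.3 and the left $M$-linearity of the coaction $\Delta_P$), the covariant derivative restricts to $d$ on the base and obeys a Leibniz rule over $M$. For $m \in M$ and $u \in P$:

```latex
\begin{align*}
D(m)  &= dm - m\,\tilde e^{(1)}\omega(\tilde e^{(2)}) = dm
   && \text{since } \Delta_P(m) = m\tilde e \text{ and by (i)}, \\
D(mu) &= d(mu) - (mu)_{(0)}\,\omega\bigl((mu)_{(1)}\bigr)
       = (dm)\,u + m\,du - m\,u_{(0)}\,\omega(u_{(1)}) \\
      &= (dm)\,u + m\,D(u).
\end{align*}
```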


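Proposition 3.4. If $P(M, C, \psi, e)$ is a copointed trivial coalgebra bundle with trivialisation $\Phi$ such that $\Phi(e) = 1$, and $\alpha : C \to \Omega^1 M$ obeys $\alpha(e) = 0$, then
\[ \omega(c) = \Phi^{-1}(c_{(1)})\,\alpha(c_{(2)})\,\Phi(c_{(3)}) + \Phi^{-1}(c_{(1)})\,d\Phi(c_{(2)}) \]
is a connection.

Proof. We verify directly that $\omega$ satisfies conditions (i)–(iii) of Proposition 3.3 with $\tilde e = 1 \otimes e$. We have $\omega(e) = \Phi^{-1}(e)\alpha(e)\Phi(e) + \Phi^{-1}(e)\,d\Phi(e) = d1 = 0$. Next, take any $c \in C$ and compute
\[ \tilde\chi \circ \omega(c) = \tilde\chi\bigl(\Phi^{-1}(c_{(1)}) \otimes \Phi(c_{(2)}) - \varepsilon(c)\,1 \otimes 1\bigr) = \Phi^{-1}(c_{(1)})\Phi(c_{(2)}) \otimes c_{(3)} - \varepsilon(c)\,1 \otimes e = 1 \otimes c - \varepsilon(c)\tilde e, \]
where we used that the first summand in $\omega$ is in $P(\Omega^1 M)P$. Finally we have
\[ \begin{aligned} \psi^2(c_{(1)} \otimes \omega(c_{(2)})) &= \psi^2\bigl(c_{(1)} \otimes \Phi^{-1}(c_{(2)})\alpha(c_{(3)})\Phi(c_{(4)})\bigr) + \psi^2\bigl(c_{(1)} \otimes \Phi^{-1}(c_{(2)})\,d\Phi(c_{(3)})\bigr) \\ &= \Phi^{-1}(c_{(1)})\,\psi^2\bigl(e \otimes \alpha(c_{(2)})\Phi(c_{(3)})\bigr) + \Phi^{-1}(c_{(1)})\,\psi^2\bigl(e \otimes d\Phi(c_{(2)})\bigr) \\ &= \Phi^{-1}(c_{(1)})\,\alpha(c_{(2)})\,\Delta_P(\Phi(c_{(3)})) + \Phi^{-1}(c_{(1)})\,d\Phi(c_{(2)}) \otimes c_{(3)} \\ &= \omega(c_{(1)}) \otimes c_{(2)}, \end{aligned} \]
where we used that $\Omega^1 P \in \mathbf{M}_{\Omega P}^C(\psi^\bullet)$ and (10) to derive the second equality, and that $\alpha(c) \in \Omega^1 M$, $\Phi$ is an intertwiner and (11) to derive the third one.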

For another class of examples one has coalgebra homogeneous spaces associated to coalgebra surjections $\pi : P \to C$. Thus, let $P$ be a Hopf algebra and $M$ a subalgebra of $P$ such that $\Delta(M) \subseteq P \otimes M$ (an embeddable $P$-homogeneous quantum space). Define the quotient coalgebra $C = P/(M^+ P)$, where $M^+ = \ker \varepsilon \cap M$ is the augmentation ideal. There is a natural right coaction of $C$ on $P$ given as $\Delta_P = (\mathrm{id} \otimes \pi) \circ \Delta$, where $\pi : P \to C$ is the canonical surjection. It is clear that $M \subseteq \{u \in P \mid \Delta_P u = u \otimes e\}$, with $e = \pi(1)$, and we assume that this is an equality (this is known to hold for example if [31] $P$ is faithfully flat as a left $M$-module). Then $P(M, C, \pi(1))$ is a coalgebra bundle. Since $\tilde e = 1 \otimes \pi(1)$ we have $e = \pi(1)$, i.e. a copointed coalgebra bundle as in [8]. In this case we know that if $i : C \to P$ is a linear splitting of $\pi$ such that
\[ i(e) = 1, \qquad \varepsilon \circ i = \varepsilon, \qquad i(c_{(2)})_{(2)} \otimes c_{(1)} \triangleleft \bigl(S i(c_{(2)})_{(1)}\, i(c_{(2)})_{(3)}\bigr) = i(c_{(1)}) \otimes c_{(2)} \]
then $\omega(c) = S i(c)_{(1)}\, d\, i(c)_{(2)}$ is a left-invariant connection and every left-invariant connection on the bundle is of this form (cf. [33]). The left-invariance here means that the left coaction of $P$ on $\Omega^1 P$ sends $\omega(c)$ to $1 \otimes \omega(c)$ for all $c \in C$. We use here the right action of $P$ on $C$ given by $c \triangleleft u = \pi(vu)$ for any $v \in \pi^{-1}(c)$.

The theory of connections can be developed also for nonuniversal calculi $\Omega^1(P) = \Omega^1 P/N$, where $N \subseteq \Omega^1 P$ is a sub-bimodule, although the situation is slightly more


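complicated. We say that $\Omega^1(P)$ is a differential calculus on $P(M, C, \psi)$ iff it is covariant in the sense $\psi^2(C \otimes N) \subseteq N \otimes C$, so that the coaction $\Delta_{P \otimes P}$ descends to $\Omega^1(P)$. This is obtained from $\psi_N^2$ defined by
\[ \psi_N^2 \circ (\mathrm{id} \otimes \pi_N) = (\pi_N \otimes \mathrm{id}) \circ \psi^2, \]
where $\pi_N : \Omega^1 P \to \Omega^1(P)$ is the canonical surjection. We have $\Delta_{\Omega^1(P)} = \tilde e^{(1)} \psi_N^2(\tilde e^{(2)} \otimes (\ ))$.

Let $\mathcal{M} = (P \otimes C^+)/\tilde\chi(N)$ (and denote by $\pi_{\mathcal{M}}$ the canonical surjection). This is a left $P$-module (since $\tilde\chi$ is a left $P$-module map) by $u \cdot m = \sum_i \pi_{\mathcal{M}}(u v_i \otimes c_i)$ for any $\sum_i v_i \otimes c_i \in \pi_{\mathcal{M}}^{-1}(m)$. We can then define $\Lambda = \{\lambda \in \mathcal{M} \mid \exists c \in C \text{ s.t. } \lambda = \pi_{\mathcal{M}}(1 \otimes c - \varepsilon(c)\tilde e)\}$. The action provides a surjection $P \otimes \Lambda \to \mathcal{M}$.

Definition 3.5. A connection with a nonuniversal calculus is a left $P$-module projection $\Pi : \Omega^1(P) \to \Omega^1(P)$ such that $\ker \Pi = \Omega^1(P)_{\mathrm{hor}}$ and $\Pi \circ d$ commutes with the right coaction. Here $\Omega^1(P)_{\mathrm{hor}} = P(dM)P$.

As usual, we define $\chi_N \circ \pi_N = \pi_{\mathcal{M}} \circ \tilde\chi$. It is a left $P$-module map and the sequence
\[ 0 \to \Omega^1(P)_{\mathrm{hor}} \to \Omega^1(P) \stackrel{\chi_N}{\longrightarrow} \mathcal{M} \to 0 \]
is exact.

Proposition 3.6. Suppose $P \otimes \Lambda \cong \mathcal{M}$ by the surjection above. Then connections $\Pi$ on $\Omega^1(P)$ are in 1-1 correspondence with $\omega : \Lambda \to \Omega^1(P)$ such that (i) $\chi_N \circ \omega = 1 \otimes \mathrm{id}$, (ii) $\psi_N^2(c_{(1)} \otimes \omega(\pi_\Lambda(c_{(2)}))) = \omega(\pi_\Lambda(c_{(1)})) \otimes c_{(2)}$, where $\pi_\Lambda(c) = \pi_{\mathcal{M}}(1 \otimes c - \varepsilon(c)\tilde e)$. The correspondence is via $\Pi(u\,dv) = u \sum_i v_i\, \omega(\lambda_i)$ for all $u, v \in P$ and $\sum_i v_i \otimes \lambda_i \in P \otimes \Lambda$ such that $\sum_i v_i \cdot \lambda_i = \chi_N(dv)$.

Proof. The proof is analogous to the proof of Proposition 3.3.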

In the case of a homogeneous bundle where $P$ is a Hopf algebra and $e = \pi(1)$, a natural type of calculus $\Omega^1(P)$ is a left-covariant one defined by an ideal $Q$ in $\ker \varepsilon \subseteq P$.

Example 3.7. For a homogeneous bundle with left-covariant calculus, $\Lambda = C^+/\pi(Q)$ and $P \otimes \Lambda \cong \mathcal{M}$, where $\mathcal{M} = (P \otimes C^+)/\tilde\chi(N)$. Moreover if for all $q \in Q$, $u \in P$, $q_{(2)} \otimes \pi(u (S q_{(1)}) q_{(3)}) \in Q \otimes C$, then $\Omega^1(P)$ is a calculus on $P(M, C, \psi, \pi(1))$. In particular, if $\Omega^1(P)$ is a bicovariant calculus on $P$ then it is a calculus on $P(M, C, \psi, \pi(1))$.


Proof. Recall that any element $n \in N$ is of the form $n = \sum_i u^i S q^i{}_{(1)} \otimes q^i{}_{(2)}$ for some $u^i \in P$, $q^i \in Q$. For any $u \in P$, $q \in Q$ we have $u \otimes \pi(q) = \tilde\chi(u S q_{(1)} \otimes q_{(2)}) \in \tilde\chi(N)$. On the other hand $\tilde\chi(\sum_i u^i S q^i{}_{(1)} \otimes q^i{}_{(2)}) = \sum_i u^i \otimes \pi(q^i) \in P \otimes \pi(Q)$. This proves that $\tilde\chi(N) = P \otimes \pi(Q)$. Therefore $\mathcal{M} = P \otimes C^+/\tilde\chi(N) = P \otimes (C^+/\pi(Q))$, and $\Lambda = C^+/\pi(Q)$. Finally, take any $c \in C$ and let $v \in \pi^{-1}(c)$. We have:
\[ \begin{aligned} \psi^2\Bigl(c \otimes \sum_i u^i S q^i{}_{(1)} \otimes q^i{}_{(2)}\Bigr) &= \sum_i u^i{}_{(1)} S q^i{}_{(2)} \otimes \psi\bigl(\pi(v u^i{}_{(2)} S q^i{}_{(1)}) \otimes q^i{}_{(3)}\bigr) \\ &= \sum_i u^i{}_{(1)} S q^i{}_{(2)} \otimes q^i{}_{(3)} \otimes \pi\bigl(v u^i{}_{(2)} (S q^i{}_{(1)}) q^i{}_{(4)}\bigr). \end{aligned} \]
By the assumption on $Q$ the last expression is in $N \otimes C$, so that the resulting calculus $\Omega^1(P)$ is a calculus on $P(M, C, \psi, \pi(1))$. If $Q$ defines a bicovariant calculus then $Q$ is Ad-stable, so that the required condition is immediately satisfied.

4. Bijectivity of ψ and Strong Connections

In this section we return to some technical considerations. For simplicity here and in most of what follows, we will concentrate on the universal differential calculus. First of all, we consider the question of when $\psi$ is bijective. It plays the role in the Hopf algebra case of having a bijective antipode, and allows us to relate left and right handed versions of the theory.

Lemma 4.1. If $\psi$ is bijective then $P$ is a left $C$-comodule by
\[ {}_P\Delta(u) = \psi^{-1}(u\tilde e). \]
Moreover, $M = \{u \in P \mid {}_P\Delta u = \psi^{-1}(\tilde e)u\}$.

Proof. This lemma is part of [5, Lemma 6.5].

In the copointed case, it is easy to see that if $\psi$ is bijective then $P^{\otimes(n+1)}$ is a left $C$-comodule by ${}_{P^{\otimes(n+1)}}\Delta = \psi^{-(n+1)}((\ ) \otimes e)$. This coaction restricts to $P \otimes M^{\otimes n}$ and $\Omega^n P$.

Proposition 4.2. In the copointed case, let $\omega$ be a connection on $\Omega^1 P$ with $\psi$ bijective. Then $\bar\Pi : \Omega^1 P \to \Omega^1 P$ defined by
\[ \bar\Pi((du)v) = \omega(u_{(1)})\, u_{(\infty)} v \]
is a right-connection in the sense
(i) $\bar D = (\mathrm{id} - \bar\Pi) \circ d$ is a left $C$-comodule map.
(ii) $\bar\Pi$ is a right $P$-module projection and $\ker \bar\Pi = P(\Omega^1 M)P$.

Proof. (i) We introduce the notation $\psi^{-1}(u \otimes c) = c_\alpha \otimes u^\alpha$, for all $c \in C$, $u \in P$. One easily finds that
\[ c_{\alpha(1)} \otimes c_{\alpha(2)} \otimes u^\alpha = c_{(1)\alpha} \otimes c_{(2)\beta} \otimes u^{\alpha\beta} \tag{12} \]


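and ${}_P\Delta(u) = e_\alpha \otimes u^\alpha$. We have
\[ \begin{aligned} \psi^2(u_{(1)} \otimes \omega(u_{(2)})\, u_{(\infty)}) &= \omega(u_{(1)})\, \psi(u_{(2)} \otimes u_{(\infty)}) = \omega(e_{\alpha(1)})\, \psi(e_{\alpha(2)} \otimes u^\alpha) \\ &= \omega(e_\alpha)\, \psi(e_\beta \otimes u^{\alpha\beta}) = \omega(e_\alpha)\, u^\alpha \otimes e = \omega(u_{(1)})\, u_{(\infty)} \otimes e, \end{aligned} \]
where we used that $e$ is group-like and (12) to derive the third equality. This implies that
\[ \psi^{-2}(\omega(u_{(1)})\, u_{(\infty)} \otimes e) = u_{(1)} \otimes \omega(u_{(2)})\, u_{(\infty)}, \]
which is precisely the left $C$-covariance of $\bar\Pi \circ d$ and, consequently, implies the left-covariance of $\bar D$.

(ii) It is clear that $\bar\Pi$ is a right $P$-module map. The following diagram commutes:
\[ \begin{array}{ccccccccc} 0 & \longrightarrow & P(\Omega^1 M)P & \longrightarrow & \Omega^1 P & \stackrel{\tilde\chi}{\longrightarrow} & P \otimes C^+ & \longrightarrow & 0 \\ & & \| & & \| & & \uparrow \psi & & \\ 0 & \longrightarrow & P(\Omega^1 M)P & \longrightarrow & \Omega^1 P & \stackrel{\tilde\chi_L}{\longrightarrow} & C^+ \otimes P & \longrightarrow & 0 \end{array} \]
where $\tilde\chi_L = \psi^{-1} \circ \tilde\chi$ (explicitly, $\tilde\chi_L(u \otimes v) = u_{(1)} \otimes u_{(\infty)} v$). Since $P$ is a coalgebra principal bundle the top sequence is exact. Furthermore $\psi$ is bijective and ${}_P\Delta$ is right $M$-linear, thus the bottom sequence is also exact. It is split by the map $\sigma : C^+ \otimes P \to \Omega^1 P$, $\sigma(c \otimes u) = \omega(c)u$. Indeed,
\[ \tilde\chi_L \circ \sigma(c \otimes u) = \tilde\chi_L(\omega(c))u = \psi^{-1}(1 \otimes c)u = c \otimes u, \]
where we used that $\tilde\chi_L$ is a right $P$-module map and that $\omega$ is a connection one-form (Proposition 3.3(ii)). Now notice that $\bar\Pi = \sigma \circ \tilde\chi_L$, and the fact that $\sigma$ is a splitting (i.e. $\tilde\chi_L \circ \sigma = \mathrm{id}$) of the above sequence implies both that $\bar\Pi$ is a projection and has the kernel as stated.

Finally, a connection is strong if $(\mathrm{id} - \Pi) \circ d$ has its image in $(\Omega^1 M)P$ [16, Definition 2.1]. These are the connections most closely associated to the base and used in the theory of associated bundles, etc. Recently, a simple condition for strongness was given in the Hopf algebra case, in [23]. This can be generalised to the coalgebra case.

Proposition 4.3. A connection on a copointed coalgebra bundle $P(M, C, \psi, e)$ is strong iff
\[ (\mathrm{id} \otimes \Delta_P)\,\omega(c) = 1 \otimes 1 \otimes c - \varepsilon(c)\,1 \otimes 1 \otimes e + \omega(c_{(1)}) \otimes c_{(2)}. \tag{13} \]
Furthermore, if $\psi$ is bijective then a connection is strong iff
\[ ({}_P\Delta \otimes \mathrm{id})\,\omega(c) = c \otimes 1 \otimes 1 - e \otimes 1 \otimes 1\,\varepsilon(c) + c_{(1)} \otimes \omega(c_{(2)}). \]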


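Proof. Assume that $\omega$ is strong. This is equivalent to the statement that
\[ (\mathrm{id} \otimes \Delta_P) \circ D(u) = \Delta_{\Omega^1 P} \circ D(u), \qquad \forall u \in P. \tag{14} \]
Using the explicit definition of $d$ and $D$, Proposition 3.3 (iii), as well as the fact that $\Omega^1 P \in \mathbf{M}_{\Omega P}^C(\psi^\bullet)$ one finds that (14) implies that
\[ (\mathrm{id} \otimes \Delta_P)(u_{(0)}\, \omega(u_{(1)})) = u_{(0)} \otimes 1 \otimes u_{(1)} - u \otimes 1 \otimes e + u_{(0)}\, \omega(u_{(1)}) \otimes u_{(2)}. \]
Next for all $c$, let $c^{(1)} \otimes c^{(2)} \in P \otimes_M P$ be the translation map, i.e. $c^{(1)} \otimes c^{(2)} = \chi^{-1}(1 \otimes c)$. It means that $c^{(1)} c^{(2)}{}_{(0)} \otimes c^{(2)}{}_{(1)} = 1 \otimes c$. Using the above equality and the fact that $c^{(1)} c^{(2)} = \varepsilon(c)$, we have
\[ \begin{aligned} (\mathrm{id} \otimes \Delta_P) \circ \omega(c) &= (\mathrm{id} \otimes \Delta_P)\bigl(c^{(1)} c^{(2)}{}_{(0)}\, \omega(c^{(2)}{}_{(1)})\bigr) = c^{(1)} (\mathrm{id} \otimes \Delta_P)\bigl(c^{(2)}{}_{(0)}\, \omega(c^{(2)}{}_{(1)})\bigr) \\ &= c^{(1)} c^{(2)}{}_{(0)} \otimes 1 \otimes c^{(2)}{}_{(1)} - c^{(1)} c^{(2)} \otimes 1 \otimes e + c^{(1)} c^{(2)}{}_{(0)}\, \omega(c^{(2)}{}_{(1)}) \otimes c^{(2)}{}_{(2)} \\ &= 1 \otimes 1 \otimes c - \varepsilon(c)\,1 \otimes 1 \otimes e + \omega(c_{(1)}) \otimes c_{(2)}, \end{aligned} \]
i.e. (13) holds. Conversely, an easy calculation reveals that (13) implies (14), i.e., the connection is strong as required. The second assertion is obtained by applying $\psi^{-2}$ to (13).

As in [23], the significance of the second assertion is that this is manifestly a “strongness” condition for the left-handed theory with $\bar\Pi$. In studying the coalgebra frame resolutions we will need both the left and the right handed theories simultaneously, and we see that if one holds so does the other for a given $\omega$. A situation where $\psi$ is bijective is a homogeneous bundle $\pi : P \to C$ with $P$ having bijective antipode.

Proposition 4.4. For a homogeneous coalgebra bundle with bijective antipode, strong left-invariant connections are in 1-1 correspondence with splittings $i : C \to P$ of $\pi$ which are covariant with respect to $(\mathrm{id} \otimes \pi) \circ \Delta$ and $(\pi \otimes \mathrm{id}) \circ \Delta$, and such that $i(\pi(1)) = 1$ and $\varepsilon \circ i = \varepsilon$. In this case $\omega(c) = S i(c)_{(1)}\, d\, i(c)_{(2)}$.

Proof. Given such a splitting $i : C \to P$ of $\pi$, consider $\omega(c) = S i(c)_{(1)}\, d\, i(c)_{(2)}$ as stated. The normalisation conditions imply that $\omega(\pi(1)) = 0$ and $\tilde\chi \circ \omega(c) = 1 \otimes c - \varepsilon(c)\,1 \otimes \pi(1)$. Also
\[ \begin{aligned} \psi^2(c_{(1)} \otimes \omega(c_{(2)})) &= S i(c_{(2)})_{(2)}\, d\, i(c_{(2)})_{(3)} \otimes \pi\bigl(i(c_{(1)})\, S i(c_{(2)})_{(1)}\, i(c_{(2)})_{(4)}\bigr) \\ &= S i(c)_{(3)}\, d\, i(c)_{(4)} \otimes \pi\bigl(i(c)_{(1)}\, S i(c)_{(2)}\, i(c)_{(5)}\bigr) && (i \text{ is left-covariant}) \\ &= S i(c)_{(1)}\, d\, i(c)_{(2)} \otimes \pi(i(c)_{(3)}) \\ &= S i(c_{(1)})_{(1)}\, d\, i(c_{(1)})_{(2)} \otimes \pi(i(c_{(2)})) && (i \text{ is right-covariant}) \\ &= \omega(c_{(1)}) \otimes c_{(2)} && (\pi \text{ is split by } i). \end{aligned} \]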


Proposition 3.3 implies that $\omega$ is a connection one-form. Finally, compute
\[ \begin{aligned} (\mathrm{id} \otimes \Delta_P)(\omega(c)) &= S i(c)_{(1)} \otimes i(c)_{(2)} \otimes \pi(i(c)_{(3)}) - \varepsilon(c)\,1 \otimes 1 \otimes \pi(1) \\ &= S i(c_{(1)})_{(1)} \otimes i(c_{(1)})_{(2)} \otimes c_{(2)} - \varepsilon(c)\,1 \otimes 1 \otimes \pi(1) \\ &= \omega(c_{(1)}) \otimes c_{(2)} + 1 \otimes 1 \otimes c - \varepsilon(c)\,1 \otimes 1 \otimes \pi(1), \end{aligned} \]
where the use of the fact that $i$ is a right covariant splitting was made in the derivation of the second equality. Proposition 4.3 now implies that the connection corresponding to $\omega$ is strong.

Conversely, assume that there is a strong connection with the left-invariant connection form $\omega$. Then the left-invariance of $\omega$ implies that there exists a splitting $i : C \to P$ of $\pi$ such that $\varepsilon \circ i = \varepsilon$ and $\omega(c) = S i(c)_{(1)}\, d\, i(c)_{(2)}$ (cf. [9, Proposition 3.5]). The fact that $\omega(\pi(1)) = 0$ implies that $i(\pi(1)) = 1$. Applying $(\mathrm{id} \otimes \Delta_P)$ to this $\omega$ and using Proposition 4.3 one deduces that $i$ is right-covariant. Bijectivity of $S$ implies that $\psi$ is bijective (cf. [5]). The left coaction induced by $\psi^{-1}$ is ${}_P\Delta(u) = \pi(S^{-1} u_{(2)}) \otimes u_{(1)}$. By Proposition 4.3,
\[ ({}_P\Delta \otimes \mathrm{id})\,\omega(c) = \pi(i(c)_{(1)}) \otimes S i(c)_{(2)} \otimes i(c)_{(3)} - \varepsilon(c)\,\pi(1) \otimes 1 \otimes 1 \]
must be equal to $c_{(1)} \otimes S i(c_{(2)})_{(1)} \otimes i(c_{(2)})_{(2)} - \varepsilon(c)\,\pi(1) \otimes 1 \otimes 1$. Applying $\mathrm{id} \otimes S^{-1} \otimes \varepsilon$ to this equality one deduces that $i$ must be left-covariant. This completes the proof.

This is the analogue for coalgebra bundles of the bicovariant formulation of strong canonical connections in the Hopf algebra case in [17].

5. Frame Resolutions, Covariant Derivatives and Torsion

In this section we define frame resolutions in the coalgebra setting, following the theory introduced recently in [23] in the Hopf algebra case. The theory depends heavily on the notion of associated bundles, so we recall these briefly. In the coalgebra case there are two kinds of associated bundles (which are equivalent in the Hopf algebra case), as studied recently in [5].

Definition 5.1. Let $P(M, C)$ be a coalgebra bundle.
(i) The left associated bundle (or module) to a left $C$-comodule $V$ is $E = P\, \square_C V$.
(ii) The right associated bundle (or module) to a right $C$-comodule $V$ is $\bar E = (V \otimes P)_0$, the fixed subobject, where $V \otimes P$ is an object of $\mathbf{M}_P^C(\psi)$ by multiplication from the right and $\Delta_{V \otimes P}(v \otimes u) = v_{(0)} \otimes \psi(v_{(1)} \otimes u)$.

The cotensor product $W\, \square_C V$ here, between a left comodule $V$ and right comodule $W$, is defined by the exact sequence [24]
\[ 0 \longrightarrow W\, \square_C V \hookrightarrow W \otimes V \stackrel{\Delta_W \otimes \mathrm{id} - \mathrm{id} \otimes {}_V\Delta}{\longrightarrow} W \otimes C \otimes V. \]
This is just the arrow reversal of the usual tensor product. Less conventional is the fixed subobject
\[ (V \otimes P)_0 = \Bigl\{ \sum_i v_i \otimes u_i \in V \otimes P \ \Big|\ \sum_i v_{i(0)} \otimes \psi(v_{i(1)} \otimes u_i) = \sum_i v_i \otimes u_i \tilde e^{(1)} \otimes \tilde e^{(2)} \Bigr\}. \]

This is the natural analogue for coalgebra bundles of the associated bundles in the quantum group gauge theory of [7].

Lemma 5.2. For a copointed coalgebra bundle $P(M, C, \psi, e)$, let $(P \otimes M^{\otimes n})_0 = \{w \in P \otimes M^{\otimes n} \mid \psi^{n+1}(e \otimes w) = w \otimes e\}$ be the invariant subset of $P \otimes M^{\otimes n}$. If $\psi$ is bijective then $(P \otimes M^{\otimes n})_0 = M^{\otimes(n+1)}$.

Proof. Clearly $M^{\otimes(n+1)} \subseteq (P \otimes M^{\otimes n})_0$. If $w \in (P \otimes M^{\otimes n})_0$ then $\psi^{n+1}(e \otimes w) = w \otimes e$. Applying $\psi^{-(n+1)}$ one deduces that $\psi^{-(n+1)}(w \otimes e) = e \otimes w$. Let $w = \sum_i u^i \otimes m^i_1 \otimes \cdots \otimes m^i_n$. Since for all $m \in M$, $\psi^{-1}(m \otimes e) = e \otimes m$ one immediately finds that $e \otimes \sum_i u^i \otimes m^i_1 \otimes \cdots \otimes m^i_n = \sum_i \psi^{-1}(u^i \otimes e) \otimes m^i_1 \otimes \cdots \otimes m^i_n$. This in turn implies that for all $i$, $u^i \in M$.

Now we can extend the notion of a strongly horizontal form from [7].

Definition 5.3. Let $E$ be a left bundle associated to a copointed coalgebra bundle $P(M, C, \psi, e)$ and a left $C$-comodule $V$. A right strongly tensorial $n$-form on $E$ is a linear map $\phi : V \to P(\Omega^n M)$ such that
\[ \psi^{n+1} \circ (\mathrm{id} \otimes \phi) \circ {}_V\Delta = \phi \otimes e. \tag{15} \]
By the extension of the notation above, the space of right strongly tensorial $n$-forms will be denoted by $\mathrm{Hom}_0(V, P(\Omega^n M))$ (in [5] right strongly 0-forms $\mathrm{Hom}_0(V, P)$ are denoted by $\mathrm{Hom}_\psi(V, P)$). $\mathrm{Hom}_0(V, P(\Omega^n M))$ has a right $M$-module structure defined by $(\phi \cdot m)(v) = \phi(v)m$.

Proposition 5.4. Let $P(M, C, \psi, e)$ be a copointed coalgebra bundle with $\psi$ bijective and $P$ flat as a right $M$-module (or $V$ coflat as a left $C$-comodule). Then right strongly tensorial forms $\mathrm{Hom}_0(V, P(\Omega^n M))$ and ${}_M\mathrm{Hom}(E, \Omega^n M)$ are isomorphic as right $M$-modules.

Proof. The proof of this proposition is analogous to the proof of [5, Theorem 4.3]. We include it here for completeness. The flatness (coflatness) assumption implies that $(P \otimes_M P)\, \square_C V \cong P \otimes_M (P\, \square_C V)$, canonically (cf. [29, p. 172]). Thus there is a left $P$-module isomorphism $\rho : P \otimes_M E \to P \otimes V$, obtained as a composition of $\chi \otimes \mathrm{id}$ with the canonical isomorphism $P \otimes C\, \square_C V \stackrel{\sim}{\to} P \otimes V$, i.e., $\rho = \cdot \otimes \mathrm{id}$, $\rho^{-1} = (\chi^{-1} \otimes \mathrm{id}) \circ (\mathrm{id} \otimes {}_V\Delta)$. Following [13], apply ${}_P\mathrm{Hom}(-, P(\Omega^n M))$ to $\rho$ to deduce the right $M$-module isomorphism $\mathrm{Hom}(V, P(\Omega^n M)) \stackrel{\sim}{\to} {}_M\mathrm{Hom}(E, P(\Omega^n M))$, given by $\phi \mapsto s_\phi$, $s_\phi(\sum_i u^i \otimes v^i) = \sum_i u^i \phi(v^i)$. For any $\phi \in \mathrm{Hom}(V, P(\Omega^n M))$, $x = \sum_i u^i \otimes v^i \in E$ we have
\[ \Delta_{\Omega^n P}(s_\phi(x)) = \sum_i \Delta_{\Omega^n P}(u^i \phi(v^i)) = \sum_i u^i{}_{(0)}\, \psi^{n+1}(u^i{}_{(1)} \otimes \phi(v^i)) = \sum_i u^i\, \psi^{n+1}(v^i{}_{(1)} \otimes \phi(v^i{}_{(\infty)})), \]
since $\sum_i u^i{}_{(0)} \otimes u^i{}_{(1)} \otimes v^i = \sum_i u^i \otimes v^i{}_{(1)} \otimes v^i{}_{(\infty)}$ by the definition of $E = P\, \square_C V$. By Lemma 5.2, $\Omega^n M = (P(\Omega^n M))_0$, therefore $s_\phi(x) \in \Omega^n M$ iff
\[ \sum_i u^i\, \psi^{n+1}(v^i{}_{(1)} \otimes \phi(v^i{}_{(\infty)})) = \sum_i u^i \phi(v^i) \otimes e. \tag{16} \]


Clearly, (15) implies (16). Applying (16) to $\rho^{-1}(1 \otimes v)$ one easily finds that (16) implies (15). Therefore the right $M$-module isomorphism $\phi \mapsto s_\phi$ restricts to the isomorphism
\[ \mathrm{Hom}_0(V, P(\Omega^n M)) \stackrel{\sim}{\to} {}_M\mathrm{Hom}(E, \Omega^n M) \]
as required.

Proposition 5.4 is the coalgebra bundle version of [23, Lemma 3.1], and allows us to define similarly,

Definition 5.5 (cf. [23, Definition 3.2]). A coalgebra frame resolution of an algebra $M$ is a left bundle $E$ associated to a copointed coalgebra bundle $P(M, C, \psi, e)$ with bijective $\psi$, and $V$, together with a right strongly tensorial one-form $\theta : V \to P(\Omega^1 M)$ such that $s_\theta : E \to \Omega^1 M$ corresponding under Proposition 5.4 is an isomorphism of left $M$-modules.

As in [23], we can now proceed to deduce the left $M$-module isomorphism
\[ \mathrm{id} \otimes s_\theta : (\Omega^1 M)P\, \square_C V \cong \Omega^1 M \otimes_M \Omega^1 M = \Omega^2 M. \tag{17} \]
Here, the cotensor product is defined with respect to the right coaction $\Delta_{(\Omega^1 M)P} : (\Omega^1 M)P \to (\Omega^1 M)P \otimes C$ given by $\Delta_{(\Omega^1 M)P}(w) = \psi^2(e \otimes w)$ (it is an easy exercise which uses (6) to verify that $(\Omega^1 M)P$ is closed under this coaction). Furthermore, given a frame resolution, we can now define a covariant derivative $\nabla : \Omega^1 M \to \Omega^2 M$ corresponding to a strong connection $\Pi$ in $P(M, C, \psi, e)$ by [23, Prop. 3.3]
\[ \nabla = (\mathrm{id} \otimes s_\theta) \circ (D\, \square_C\, \mathrm{id}) \circ s_\theta^{-1} : \Omega^1 M \to \Omega^2 M. \tag{18} \]
The map $\nabla$ is well-defined since $D$ is an intertwiner so that the expression $D\, \square_C\, \mathrm{id}$ makes sense. Furthermore, by the strongness assumption $D(P) \subseteq (\Omega^1 M)P$ so the isomorphism (17) implies that the output of $\nabla$ is in $\Omega^2 M$. Finally, it can be easily verified (cf. [23, Prop. 3.3]) that $\nabla(m \cdot w) = m \cdot \nabla w + dm \otimes_M w$, for any $m \in M$ and $w \in \Omega^1 M$, so that $\nabla$ is a connection on $\Omega^1 M$ as a left $M$-module. Next, cf. [23, Prop. 3.5], we define the torsion of a connection $\nabla$ by $T = d - \nabla : \Omega^1 M \to \Omega^2 M$. By Proposition 5.4 this $T$ can be also viewed as a map $T : V \to P(\Omega^2 M)$ provided $P$ is $M$-flat.

Proposition 5.6. If $\omega$ is a strong connection on $P(M, C, \psi, e)$ and $\psi$ is bijective then there is a covariant derivative $\bar D : \mathrm{Hom}_0(V, P\Omega^n M) \to \mathrm{Hom}_0(V, P\Omega^{n+1} M)$ given by
\[ \bar D\phi(v) = d\phi(v) + \omega(v_{(1)})\, \phi(v_{(\infty)}). \]
In particular, $T = \bar D\theta$.


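Proof. We first show that the map $\bar D$ is well-defined. We will use the following notation for the connection one-form $\omega(c) = \omega(c)^{(1)} \otimes \omega(c)^{(2)}$ (summation understood), for all $c \in C$. Take any $\phi \in \mathrm{Hom}_0(V, P\Omega^n M)$, $v \in V$ and compute
\[ \begin{aligned} (\mathrm{id} \otimes \Delta_{P^{\otimes(n+1)}})\bar D\phi(v) &= 1 \otimes \phi(v)_{(0)} \otimes \phi(v)_{(1)} + d\phi(v) \otimes e - 1 \otimes \phi(v) \otimes e \\ &\quad + \omega(v_{(1)})^{(1)} \otimes \Delta_{P^{\otimes(n+1)}}\bigl(\omega(v_{(1)})^{(2)} \phi(v_{(\infty)})\bigr) \\ &= 1 \otimes \phi(v)_{(0)} \otimes \phi(v)_{(1)} + d\phi(v) \otimes e - 1 \otimes \phi(v) \otimes e \\ &\quad + \omega(v_{(1)})^{(1)} \otimes \omega(v_{(1)})^{(2)}{}_{(0)}\, \psi^{n+1}\bigl(\omega(v_{(1)})^{(2)}{}_{(1)} \otimes \phi(v_{(\infty)})\bigr) \\ &= 1 \otimes \phi(v)_{(0)} \otimes \phi(v)_{(1)} + d\phi(v) \otimes e - 1 \otimes \phi(v) \otimes e \\ &\quad + 1 \otimes \psi^{n+1}(v_{(1)} \otimes \phi(v_{(\infty)})) - 1 \otimes \psi^{n+1}(e \otimes \phi(v)) + \omega(v_{(1)})\, \psi^{n+1}(v_{(2)} \otimes \phi(v_{(\infty)})) \\ &= 1 \otimes \phi(v)_{(0)} \otimes \phi(v)_{(1)} + d\phi(v) \otimes e - 1 \otimes \phi(v) \otimes e \\ &\quad + 1 \otimes \phi(v) \otimes e - 1 \otimes \phi(v)_{(0)} \otimes \phi(v)_{(1)} + \omega(v_{(1)})\, \phi(v_{(\infty)}) \otimes e \\ &= \bar D\phi(v) \otimes e, \end{aligned} \]
where we used that $\Omega P \in \mathbf{M}_{\Omega P}^C(\psi^\bullet)$ to derive the second equality, then Proposition 4.3 to derive the third one and the fact that $\phi \in \mathrm{Hom}_0(V, P\Omega^n M)$ to obtain the fourth equality. This shows that $\bar D\phi(v) \in P(\Omega^{n+1} M)$.

Next we need to show that $\bar D\phi$ satisfies (15). We have
\[ \begin{aligned} \psi^{n+2}(v_{(1)} \otimes \bar D\phi(v_{(\infty)})) &= \psi^{n+2}(v_{(1)} \otimes d\phi(v_{(\infty)})) + \psi^{n+2}(v_{(1)} \otimes \omega(v_{(2)})\, \phi(v_{(\infty)})) \\ &= (d \otimes \mathrm{id})\bigl(\psi^{n+1}(v_{(1)} \otimes \phi(v_{(\infty)}))\bigr) + \omega(v_{(1)})\, \psi^{n+1}(v_{(2)} \otimes \phi(v_{(\infty)})) \\ &= d\phi(v) \otimes e + \omega(v_{(1)})\, \phi(v_{(\infty)}) \otimes e = \bar D\phi(v) \otimes e, \end{aligned} \]
where we used the covariance of $d$ with respect to $\psi^\bullet$, the fact that $\Omega P \in \mathbf{M}_{\Omega P}^C(\psi^\bullet)$, and the covariance property of the connection one-form to derive the second equality. It is an easy exercise to verify that $T = \bar D\theta$.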
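One immediate consequence of the Leibniz rule for $\nabla$ stated before Proposition 5.6, spelled out here as a sketch: the torsion $T = d - \nabla$ is a left $M$-module map. For $m \in M$ and a one-form $w$ on $M$, using that $d$ is a graded derivation and that the product of one-forms in the universal calculus is $\otimes_M$:

```latex
\begin{align*}
T(m w) &= d(mw) - \nabla(mw) \\
       &= dm \otimes_M w + m\,dw - m\,\nabla w - dm \otimes_M w \\
       &= m\,(dw - \nabla w) = m\,T(w).
\end{align*}
```

This tensorial behaviour is what allows $T$ to be transported along the frame resolution and viewed as a map on $V$.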

Here $\bar D$ extends $\bar D$ in Sect. 4 to higher forms. Next, again following [23], we introduce left strongly tensorial forms and a quantum metric. Thus, let $V$ be a right $C$-comodule. A left strongly tensorial $n$-form is a map $\phi : V \to (\Omega^n M)P$ commuting with the right coaction of $C$, where $C$ coacts on $(\Omega^n M)P$ by $\psi^{n+1}(e \otimes w)$.

Proposition 5.7. Left strongly tensorial forms $\mathrm{Hom}^C(V, (\Omega^n M)P)$ and $\mathrm{Hom}_M(\bar E, \Omega^n M)$ are isomorphic as left $M$-modules if $P$ is faithfully flat as a left $M$-module (cf. [2] for a comprehensive review of the concept of faithful flatness).

Proof. This can be shown as [5, Theorem 5.4]. Given $\phi \in \mathrm{Hom}^C(V, (\Omega^n M)P)$ the corresponding $s_\phi \in \mathrm{Hom}_M(\bar E, \Omega^n M)$ is given by $s_\phi(\sum_i v^i \otimes u^i) = \sum_i \phi(v^i)u^i$, $\sum_i v^i \otimes u^i \in \bar E$. Conversely given $s \in \mathrm{Hom}_M(\bar E, \Omega^n M)$, the corresponding tensorial form is given by $\phi_s(v) = s(v \otimes 1)$.

On the other hand, for $V$ a right $C$-comodule we have the covariant derivative $D$ extending the $D$ in Sect. 3 to higher forms.


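Proposition 5.8. If $\omega$ is a strong connection on $P(M, C, e)$ and $\psi$ is bijective then there is a covariant derivative $D : \mathrm{Hom}^C(V, (\Omega^n M)P) \to \mathrm{Hom}^C(V, (\Omega^{n+1} M)P)$ given by
\[ D\phi(v) = d\phi(v) + (-1)^{n+1} \phi(v_{(0)})\, \omega(v_{(1)}). \]

Proof. This proposition is a coalgebra bundle version of a similar statement in [16] for quantum group principal bundles. The proof is similar to the proof of Proposition 5.6. Take any right $C$-covariant $\phi : V \to (\Omega^n M)P$ and $v \in V$ and compute
\[ \begin{aligned} ({}_{P^{\otimes(n+1)}}\Delta \otimes \mathrm{id}) D\phi(v) &= (-1)^{n+1} \phi(v)_{(1)} \otimes \phi(v)_{(\infty)} \otimes 1 + e \otimes d\phi(v) + (-1)^n e \otimes \phi(v) \otimes 1 \\ &\quad + (-1)^{n+1}\, {}_{P^{\otimes(n+1)}}\Delta\bigl(\phi(v_{(0)})\, \omega(v_{(1)})^{(1)}\bigr) \otimes \omega(v_{(1)})^{(2)} \\ &= (-1)^{n+1} \phi(v)_{(1)} \otimes \phi(v)_{(\infty)} \otimes 1 + e \otimes d\phi(v) + (-1)^n e \otimes \phi(v) \otimes 1 \\ &\quad + (-1)^{n+1} \psi^{-n-1}\bigl(\phi(v_{(0)}) \otimes \omega(v_{(1)})^{(1)}{}_{(1)}\bigr)\, \omega(v_{(1)})^{(1)}{}_{(\infty)} \otimes \omega(v_{(1)})^{(2)} \\ &= (-1)^{n+1} \phi(v)_{(1)} \otimes \phi(v)_{(\infty)} \otimes 1 + e \otimes d\phi(v) + (-1)^n e \otimes \phi(v) \otimes 1 \\ &\quad + (-1)^{n+1} \bigl(\psi^{-n-1}(\phi(v_{(0)}) \otimes v_{(1)}) \otimes 1 - \psi^{-n-1}(\phi(v) \otimes e) \otimes 1\bigr) \\ &\quad + (-1)^{n+1} \psi^{-n-1}(\phi(v_{(0)}) \otimes v_{(1)})\, \omega(v_{(2)}) \\ &= (-1)^{n+1} \phi(v)_{(1)} \otimes \phi(v)_{(\infty)} \otimes 1 + e \otimes d\phi(v) + (-1)^n e \otimes \phi(v) \otimes 1 \\ &\quad - (-1)^{n+1} \phi(v)_{(1)} \otimes \phi(v)_{(\infty)} \otimes 1 - (-1)^n e \otimes \phi(v) \otimes 1 + (-1)^{n+1} e \otimes \phi(v_{(0)})\, \omega(v_{(1)}) \\ &= e \otimes D\phi(v). \end{aligned} \]
The third equality follows from Proposition 4.3. Thus we deduce that $D\phi(v) \in (\Omega^{n+1} M)P$. The proof of the covariance of $D\phi$ is analogous to the corresponding part of the proof of Proposition 5.6.

Finally, when $V$ is a finite-dimensional left $C$-comodule we can identify $\bar E$ with $\mathrm{Hom}(V, P)_0$ and $\mathrm{Hom}^C(V^*, (\Omega^n M)P)$ with $(\Omega^n M)P\, \square_C V$ and hence obtain $(\Omega^n M)P\, \square_C V \cong \mathrm{Hom}_M(\mathrm{Hom}_0(V, P), \Omega^n M)$. We can then define, cf. [23], a metric on $M$ as an element $\gamma \in (\Omega^1 M)P\, \square_C V$ such that the corresponding map $\mathrm{Hom}_0(V, P) \to \Omega^1 M$ is an isomorphism. In the infinite dimensional case we do not have a bijection between these spaces, but we still obtain a map $\mathrm{Hom}_0(V, P) \to \Omega^1 M$ from $\gamma$ and can require it to be suitably nondegenerate. If $P(M, C, \psi, e)$ and $V$ is a frame resolution of $M$ then we can identify $(\Omega^1 M)P\, \square_C V$ with $\Omega^2 M$, so that $\gamma$ is a 2-form on $M$. Following [23], we can also define the cotorsion of the metric as the element $(\mathrm{id} \otimes s_\theta)(D\, \square_C\, \mathrm{id})(\gamma)$ of $\Omega^3 M$.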


Here, since $\gamma$ is left strongly tensorial (and if $D$ corresponds to a strong connection) then $D\gamma$ is also left-strongly tensorial when viewed as a map on $V^*$. Hence $(D\, \square_C\, \mathrm{id})\gamma \in (\Omega^2 M)P\, \square_C V$ as required here. In this context one has the following version of $D$ that does not go through $V^*$.

Proposition 5.9. If $\omega$ is a strong connection on $P(M, C, \psi, e)$ and $\psi$ is bijective then there is a covariant derivative $D : (\Omega^n M)P\, \square_C V \to (\Omega^{n+1} M)P\, \square_C V$ given by
\[ D(w \otimes v) = dw \otimes v + (-1)^{n+1} w\, \omega(v_{(1)}) \otimes v_{(\infty)}. \]

Proof. Dual to the proof of Proposition 5.8.

Also provided in [23] is a general construction for frame resolutions on quantum group homogeneous bundles $\pi : P \to H$. We extend this now in the coalgebra setting $\pi : P \to C$, to embeddable homogeneous spaces. This more general setting is definitely needed since it includes, for example, the full family of quantum 2-spheres [28] considered in the next section. The following proposition generalises [23, Prop. 4.3] to include this case.

Proposition 5.10. A quantum embeddable homogeneous space $M$ of $P$ corresponding to $\pi : P \to C$ has a coalgebra frame resolution with $V = M^+$, ${}_V\Delta = (\pi \otimes \mathrm{id}) \circ \Delta$ and $\theta : V \to P(\Omega^1 M)$, $\theta : v \mapsto S v_{(1)} \otimes v_{(2)}$.

Proof. The canonical entwining structure is $\psi(c \otimes h) = h_{(1)} \otimes \pi(g h_{(2)})$, where $g \in \pi^{-1}(c)$ (cf. [8, Example 2.5]). Since $\theta(v) \in P \otimes M$, as $M$ is a left $P$-comodule algebra, we find
\[ \psi^2(\pi(v_{(1)}) \otimes \theta(v_{(2)})) = S v_{(3)} \otimes \psi(\pi(v_{(1)} S v_{(2)}) \otimes v_{(4)}) = S v_{(1)} \otimes \psi(\pi(1) \otimes v_{(2)}) = S v_{(1)} \otimes v_{(2)} \otimes \pi(1) = \theta(v) \otimes \pi(1). \]
Since $\tilde\chi(\theta(v)) = (S v_{(1)}) v_{(2)} \otimes \pi(v_{(3)}) = 1 \otimes \pi(v) = 0$ it follows that $\theta(v) \in P(\Omega^1 M)$. From the above calculation we conclude that $\theta \in \mathrm{Hom}_0(V, P(\Omega^1 M))$. Now, consider the map $r : \Omega^1 M \to P \otimes M$, $r(\sum_i m_i \otimes \tilde m_i) = \sum_i m_i \tilde m_{i(1)} \otimes \tilde m_{i(2)}$. Applying $\mathrm{id} \otimes \varepsilon$ to $r$ one immediately finds that $\mathrm{Im}\, r \subseteq P \otimes V$. Similarly, applying the coaction equalising map for the cotensor product to $r$ one finds that $\mathrm{Im}\, r \subseteq P\, \square_C V$. Finally using the same argument as in [23, Prop. 4.3] one proves that $r$ is the inverse of $s_\theta : P\, \square_C V \to \Omega^1 M$, $s_\theta : \sum_i u^i \otimes v^i \mapsto \sum_i u^i S v^i{}_{(1)} \otimes v^i{}_{(2)}$ as required.

6. Monopole on All Quantum 2-Spheres

Let $SU_q(2)$ be the standard matrix quantum group over the field $k = \mathbb{C}$, with generators $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ and relations $\alpha\beta = q\beta\alpha$, $\alpha\gamma = q\gamma\alpha$, $\alpha\delta = \delta\alpha + (q - q^{-1})\beta\gamma$, $\beta\gamma = \gamma\beta$, $\beta\delta = q\delta\beta$, $\gamma\delta = q\delta\gamma$, $\alpha\delta - q\beta\gamma = 1$. Let
\[ \xi = s(\alpha^2 - q^{-1}\beta^2) + (s^2 - 1)q^{-1}\alpha\beta, \qquad \eta = s(q\gamma^2 - \delta^2) + (s^2 - 1)\gamma\delta, \]
\[ \zeta = s(q\alpha\gamma - \beta\delta) + (s^2 - 1)q\beta\gamma, \]


where $s \in [0, 1]$. We define $C = SU_q(2)/J$, where $J = \{\xi - s, \eta + s, \zeta\}SU_q(2)$ is a coideal. We denote by $\pi$ the canonical projection $SU_q(2) \to C$. As shown in [4], the fixed point subalgebra under the coaction of $C$ on $SU_q(2)$ is generated by $\{1, \xi, \eta, \zeta\}$, and can be identified with $S^2_{q,s}$, the 2-parameter quantum sphere in [28]. The standard quantum sphere discussed in [7] corresponds to $s = 0$. It has been recently proved [25] that the coalgebra $C$ is spanned by group-like elements. We begin by finding such a basis of $C$ explicitly.

Proposition 6.1. Let
\[ g_n^+ = \pi\Bigl(\prod_{k=0}^{n-1} (\alpha + q^k s\beta)\Bigr), \qquad g_n^- = \pi\Bigl(\prod_{k=0}^{n-1} (\delta - q^{-k} s\gamma)\Bigr), \qquad n = 1, 2, \ldots \]
(all products increase from left to right). Then $g_n^\pm$ are group-like elements of $C$, and $\{e = \pi(1), g_n^\pm \mid n \in \mathbb{N}\}$ is a basis of $C$.

To prove Proposition 6.1 we will need the following

Lemma 6.2. Let $\triangleleft$ denote the right action of $SU_q(2)$ on $C$, induced by $\pi$. Then:
\[ s g_{n+1}^+ = g_n^+ \triangleleft (s\delta + q^{-n}\gamma) = g_n^+ \triangleleft (s\delta + q^{-n}\beta), \tag{19} \]
and
\[ s g_{n+1}^- = g_n^- \triangleleft (s\alpha - q^n\gamma) = g_n^- \triangleleft (s\alpha - q^n\beta). \tag{20} \]

Proof. Using the commutation rules in $SU_q(2)$ one easily verifies that for all $s \in \mathbb{C}$, and $n \in \mathbb{N}$,
\[ (\alpha + q^{n-1} s\beta)(s\delta + q^{-n}\gamma) = (s\delta + q^{-n+1}\gamma)(\alpha + q^n s\beta). \tag{21} \]
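Identity (21) can be checked by expanding both sides. The sketch below uses the relations $\alpha\gamma = q\gamma\alpha$, $\beta\gamma = \gamma\beta$ and $\alpha\delta = \delta\alpha + (q - q^{-1})\beta\gamma$ quoted at the start of this section, together with the standard $SU_q(2)$ relation $\beta\delta = q\delta\beta$ (assumed here in the same convention):

```latex
\begin{align*}
\text{LHS} &= s\,\alpha\delta + q^{-n}\alpha\gamma + q^{n-1}s^2\,\beta\delta + q^{-1}s\,\beta\gamma \\
           &= s\,\delta\alpha + s(q - q^{-1})\beta\gamma + q^{-n+1}\gamma\alpha
              + q^{n}s^2\,\delta\beta + q^{-1}s\,\beta\gamma \\
           &= s\,\delta\alpha + q^{n}s^2\,\delta\beta + q^{-n+1}\gamma\alpha + q s\,\gamma\beta
            = \text{RHS}.
\end{align*}
```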

Note that the form of $J = \ker \pi$ implies that for all $x \in SU_q(2)$,
\[ \pi((s\delta + \gamma)x) = s\pi((\alpha + s\beta)x), \qquad s\pi((\delta - s\gamma)x) = \pi((s\alpha - \beta)x). \tag{22} \]
This, together with the identity (21) immediately implies that (19) holds for $n = 1$. Now, assume that (19) is true for $n - 1 > 1$. Then, using the definition of $g_n^+$ as well as (21) we have:
\[ \begin{aligned} g_n^+ \triangleleft (s\delta + q^{-n}\gamma) &= g_{n-1}^+ \triangleleft (\alpha + q^{n-1} s\beta)(s\delta + q^{-n}\gamma) \\ &= g_{n-1}^+ \triangleleft (s\delta + q^{-n+1}\gamma)(\alpha + q^n s\beta) \\ &= s g_n^+ \triangleleft (\alpha + q^n s\beta) = s g_{n+1}^+. \end{aligned} \]
Therefore the first of equalities (19) holds for any $n \in \mathbb{N}$. Since
\[ \pi(\beta x) = \pi(\gamma x), \qquad \forall x \in SU_q(2), \tag{23} \]
also the second of equalities (19) holds. Equalities (20) are proven in an analogous way, by using the following identity
\[ (s\alpha - q^{n-1}\beta)(\delta - s q^{-n}\gamma) = (\delta - s q^{-n+1}\gamma)(s\alpha - q^n\beta). \]
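The base case of Proposition 6.1 can be made explicit. The sketch below assumes the matrix coproduct $\Delta\alpha = \alpha \otimes \alpha + \beta \otimes \gamma$, $\Delta\beta = \alpha \otimes \beta + \beta \otimes \delta$ (the convention consistent with the inductive step in the proof of Proposition 6.1) and uses (22) with $x = 1$, i.e. $\pi(s\delta + \gamma) = s\,\pi(\alpha + s\beta) = s\,g_1^+$:

```latex
\begin{align*}
\Delta(\alpha + s\beta) &= \alpha \otimes (\alpha + s\beta) + \beta \otimes (\gamma + s\delta), \\
\Delta g_1^+ &= \pi(\alpha) \otimes \pi(\alpha + s\beta) + \pi(\beta) \otimes \pi(s\delta + \gamma) \\
             &= \pi(\alpha) \otimes g_1^+ + s\,\pi(\beta) \otimes g_1^+
              = g_1^+ \otimes g_1^+, \\
\varepsilon(g_1^+) &= \varepsilon(\alpha) + s\,\varepsilon(\beta) = 1 .
\end{align*}
```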


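Proof of Proposition 6.1. An easy calculation which uses (22) verifies that $g_1^+$ is group-like. Assume that $g_n^+$ is group-like for an $n > 1$. Using the definition of $g_{n+1}^+$ and this inductive assumption we have
\[ \begin{aligned} \Delta g_{n+1}^+ &= g_n^+ \triangleleft \alpha \otimes g_n^+ \triangleleft \alpha + g_n^+ \triangleleft \beta \otimes g_n^+ \triangleleft \gamma + q^n s\, g_n^+ \triangleleft \alpha \otimes g_n^+ \triangleleft \beta + q^n s\, g_n^+ \triangleleft \beta \otimes g_n^+ \triangleleft \delta \\ &= g_n^+ \triangleleft \alpha \otimes g_n^+ \triangleleft (\alpha + q^n s\beta) + q^n g_n^+ \triangleleft \beta \otimes g_n^+ \triangleleft (s\delta + q^{-n}\gamma) \\ &= g_n^+ \triangleleft \alpha \otimes g_{n+1}^+ + q^n s\, g_n^+ \triangleleft \beta \otimes g_{n+1}^+ && \text{(Lemma 6.2)} \\ &= g_{n+1}^+ \otimes g_{n+1}^+. \end{aligned} \]
Thus we conclude that $g_n^+$ is group-like for any $n$. Similarly one proves that all the $g_n^-$ are group-like. The proof that $\pi(1)$, $g_n^\pm$ span $C$ is analogous to the proof of [4, Prop. 6.1].

Proposition 6.1 gives an explicit description of the coalgebra bundle. We now construct a bicovariant splitting of $\pi$ and hence a strong connection on it.

Proposition 6.3. The map $i : C \to SU_q(2)$ given by
\[ i(g_n^+) = \prod_{k=0}^{n-1} \frac{\alpha + q^k s(\beta + \gamma) + q^{2k} s^2 \delta}{1 + q^{2k} s^2}, \qquad i(g_n^-) = \prod_{k=0}^{n-1} \frac{\delta - q^{-k} s(\beta + \gamma) + q^{-2k} s^2 \alpha}{1 + q^{-2k} s^2} \]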
is bicovariant and splits π. Proof. An easy direct calculation which uses (22), (23), verifies that SUq (2) (i(g1+ )) = i(g1+ ) ⊗ g1+ and SUq (2) (i(g1+ )) = g1+ ⊗ i(g1+ ). Now assume that there is n > 1 such that i(gn+ ) is bicovariant. Then we have 1 SUq (2) (i(gn+ )(α + sq n (β + γ ) + s 2 q 2n δ)) 1 + q 2n s 2 1 = i(g + )(α ⊗ gn+ !(α + q n sβ) + q n β ⊗ gn+ !(sδ + q −n γ ) 1 + q 2n s 2 n

+ SUq (2) (i(gn−1 )) =

+ q n sγ ⊗ gn+ !(α + q n sβ) + q 2n sδ ⊗ gn+ !(sδ + q −n γ )) 1 + = i(g + )(α + q n s(β + γ ) + q 2n s 2 δ) ⊗ gn+1 (Lemma 6.2) 1 + q 2n s 2 n + + = i(gn+1 ) ⊗ gn+1 . For the left coaction we have + SUq (2) (i(gn+1 ))

1 + n 2 2n SU (2) (i(gn )(α + sq (β + γ ) + s q δ)) 1 + q 2n s 2 q 1 = (g + !(α + q n sγ ) ⊗ i(gn+ )α + q n gn+ !(sδ + q −n β) ⊗ i(gn+ )γ 1 + q 2n s 2 n =

+q n sgn+ !(α + q n sγ ) ⊗ i(gn+ )β + q 2n sgn+ !(sδ + q −n β) ⊗ i(gn+ )δ) 1 = g + ⊗ i(gn+ )(α + q n s(β + γ ) + q 2n s 2 δ) (Lemma 6.2) 1 + q 2n s 2 n+1 + + = gn+1 ⊗ i(gn+1 ).

Quantum Geometry of Algebra Factorisations and Coalgebra Bundles

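Thus we conclude that $i(g_n^+)$ is bicovariant for all $n \in \mathbb{N}$. Similarly one shows that $i(g_n^-)$ is bicovariant. The fact that $i$ splits $\pi$ can be proven inductively too, and in the proof one uses Lemma 6.2 and (22), (23).

Consequently, we have a strong connection on $S^2_{q,s}$ defined via the elements $\omega_n^\pm = S\,i(g_n^\pm)_{(1)} \otimes i(g_n^\pm)_{(2)}$.

Lemma 6.4. The elements $\omega_n^\pm$ may be computed iteratively from

$$\begin{aligned}
(1 + q^{2n}s^2)\,\omega_{n+1}^+ &= (\delta - q^{n+1}s\gamma)\,\omega_n^+\,(\alpha + q^n s\beta) + (\alpha q^n s - q^{-1}\beta)\,\omega_n^+\,(q^n s\delta + \gamma), \\
(1 + q^{-2n}s^2)\,\omega_{n+1}^- &= (q\gamma + q^{-n}s\delta)\,\omega_n^-\,(-\beta + q^{-n}s\alpha) + (\alpha + q^{-n-1}s\beta)\,\omega_n^-\,(\delta - q^{-n}s\gamma)
\end{aligned}$$

and $\omega_0^\pm = 1 \otimes 1$.

Proof. From $i(g_{n+1}^+) = i(g_n^+)(\alpha + sq^n(\beta + \gamma) + q^{2n}s^2\delta)/(1 + q^{2n}s^2)$ as in the proof of Proposition 6.3 and the coproduct and antipode $S$ of $SU_q(2)$ one has

$$\begin{aligned}
(1 + q^{2n}s^2)\,\omega_{n+1}^+ &= S\alpha\,\omega_n^+\alpha + S\beta\,\omega_n^+\gamma + q^n s\,(S\beta\,\omega_n^+\delta + S\alpha\,\omega_n^+\beta + S\gamma\,\omega_n^+\alpha + S\delta\,\omega_n^+\gamma) \\
&\qquad + q^{2n}s^2\,(S\delta\,\omega_n^+\delta + S\gamma\,\omega_n^+\beta) \\
&= \delta\omega_n^+\alpha - q^{-1}\beta\omega_n^+\gamma - q^{n-1}s\,\beta\omega_n^+\delta + q^n s\,\delta\omega_n^+\beta \\
&\qquad - q^{n+1}s\,\gamma\omega_n^+\alpha + q^n s\,\alpha\omega_n^+\gamma + q^{2n}s^2\,\alpha\omega_n^+\delta - q^{2n+1}s^2\,\gamma\omega_n^+\beta ,
\end{aligned}$$

which we then factorise as shown. The computation for $\omega_{n+1}^-$ is similar.

Actually, when $s \neq 0$ we may collect the two cases together as

$$ (1 + q^{2n}s^{\pm 2})\,\omega_{n+1}^\pm = (\delta \mp q^{n+1}s^{\pm 1}\gamma)\,\omega_n^\pm\,(\alpha \pm q^n s^{\pm 1}\beta) + (\alpha q^n s^{\pm 1} \mp q^{-1}\beta)\,\omega_n^\pm\,(q^n s^{\pm 1}\delta \pm \gamma). $$

For example,

$$ \omega(g_1^+) = \frac{1}{1 + s^2}\big((\delta - qs\gamma)\,d(\alpha + s\beta) + (\alpha s - q^{-1}\beta)\,d(\gamma + s\delta)\big) = \omega_1^+ - 1 \otimes 1. $$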
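As a consistency check (a sketch only, not part of the original argument), the $n = 0$ step of Lemma 6.4 with $\omega_0^+ = 1 \otimes 1$ gives $\omega_1^+$ explicitly. Assuming the universal differential $dx = 1 \otimes x - x \otimes 1$, which is the convention implicit in (24), and using relation (25) stated below, this reproduces the example for $\omega(g_1^+)$:

```latex
\begin{align*}
(1+s^2)\,\omega_1^+
  &= (\delta - qs\gamma)\otimes(\alpha + s\beta)
   + (s\alpha - q^{-1}\beta)\otimes(\gamma + s\delta),\\[4pt]
(\delta - qs\gamma)\,d(\alpha+s\beta) + (s\alpha - q^{-1}\beta)\,d(\gamma+s\delta)
  &= (1+s^2)\,\omega_1^+
   - \big[(\delta - qs\gamma)(\alpha+s\beta)
   + (s\alpha - q^{-1}\beta)(\gamma+s\delta)\big]\otimes 1\\
  &= (1+s^2)\,\big(\omega_1^+ - 1\otimes 1\big).
\end{align*}
```

The second step uses $x\,dy = x \otimes y - xy \otimes 1$, and the last step is exactly relation (25).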

A closed expression for $\omega$ on all $g_n^\pm$ is possible for nonuniversal differential calculi where commutation relations exist between differential forms and elements of $S^2_{q,s}$, along the lines of [7] for the standard q-monopole.

Finally, as an example of an associated bundle, let $V = \mathbb{C}$ with the right $C$-comodule structure $\Delta_V(1) = 1 \otimes g_1^+$. Here and in what follows we identify linear maps from $\mathbb{C}$ with their values at $1 \in \mathbb{C}$. Then the space of strongly tensorial zero-forms in Proposition 5.8 can be computed as

$$ \mathrm{Hom}^C(V, P) = \{u \in P \mid \Delta_R u = u \otimes g_1^+\} = \{x(\alpha + s\beta) + y(\gamma + s\delta) \mid x, y \in S^2_{q,s}\}. $$

The covariant derivative $D : \mathrm{Hom}^C(V, P) \to \mathrm{Hom}^C(V, (\Omega^1 M)P)$ can be computed as

$$ Du = du - u\,\omega(g_1^+) = 1 \otimes u - u\,\omega_1^+ = 1 \otimes u - \frac{u}{1 + s^2}\,(\delta - qs\gamma,\ \alpha s - q^{-1}\beta) \otimes \begin{pmatrix} \alpha + s\beta \\ \gamma + s\delta \end{pmatrix} \tag{24} $$

from the form of $\omega_1^+$. Here a matrix product (or vector-covector contraction) notation is used.

These Hom-spaces correspond to sections of a bundle $E$. From another point of view, we may consider $V_L = \mathbb{C}$ with the left coaction ${}_V\Delta(1) = g_1^+ \otimes 1$ and identify the associated bundle $E = P\,\square_C\,V_L = \mathrm{Hom}^C(V, P)$ as the same space as above. Similarly, we identify $\Omega^1 M \otimes_M E = \mathrm{Hom}^C(V, (\Omega^1 M)P)$. From this point of view we can consider the above covariant derivative as a map $\nabla : E \to \Omega^1 M \otimes_M E$.

Finally, from the form of $E$ given above it is clear that $E$ is a rank 2 projective module over $S^2_{q,s}$, along the same lines as the recent result over the standard q-sphere in [17]. We use the relation

$$ (\delta - qs\gamma)(\alpha + s\beta) + (s\alpha - q^{-1}\beta)(\gamma + s\delta) = 1 + s^2, \tag{25} $$

holding in $SU_q(2)$, to verify that

$$ p = \frac{1}{1 + s^2} \begin{pmatrix} \alpha + s\beta \\ \gamma + s\delta \end{pmatrix} (\delta - qs\gamma,\ s\alpha - q^{-1}\beta) = \frac{1}{1 + s^2} \begin{pmatrix} 1 - \zeta & \xi \\ -\eta & s^2 + q^{-2}\zeta \end{pmatrix} $$

obeys $p^2 = p$ as an $S^2_{q,s}$-valued $2 \times 2$ matrix, and that $(S^2_{q,s})^2 p = E$ by the identification of $(x, y)p$ with $u = x(\alpha + s\beta) + y(\gamma + s\delta)$. In terms of this, (24) becomes

$$ \nabla((x, y)p) = 1 \otimes (x, y)p - (x, y)p \otimes p = (d(x, y))p + (x, y)(dp)p = (d((x, y)p))p , $$

so that $\nabla$ is the Grassmannian connection associated to the projective module. Further details of the projector computation will be presented elsewhere. A similar result holds for general $n$ along the lines for the standard q-monopole in [17].

Acknowledgements. Research was originally supported by the EPSRC grant GR/K02244 and in the case of TB by a Lloyd's of London Tercentenary Foundation Fellowship in the later stages.

References

1. Böhm, G., Nill, F. and Szlachányi, K.: Weak Hopf algebras I. Integral theory and C*-structure. J. Algebra 221, 385–438 (1999); Weak Hopf algebras II. Representation theory, dimensions and the Markov trace. Preprint math.QA/9906045
2. Bourbaki, N.: Commutative Algebra. Reading, MA: Addison-Wesley, 1972
3. Brzeziński, T.: Translation map in quantum principal bundles. J. Geom. Phys. 20, 349–370 (1996)
4. Brzeziński, T.: Quantum homogeneous spaces as quantum quotient spaces. J. Math. Phys. 37, 2388–2399 (1996)
5. Brzeziński, T.: On modules associated to coalgebra Galois extensions. J. Algebra 215, 290–317 (1999)
6. Brzeziński, T. and Hajac, P.M.: Coalgebra extensions and algebra coextensions of Galois type. Commun. Algebra 27, 1347–1367 (1999)
7. Brzeziński, T. and Majid, S.: Quantum group gauge theory on quantum spaces. Commun. Math. Phys. 157, 591–638 (1993); Erratum 167, 235 (1995)
8. Brzeziński, T. and Majid, S.: Coalgebra bundles. Commun. Math. Phys. 191, 467–492 (1998)
9. Brzeziński, T. and Majid, S.: Quantum differentials and the q-monopole revisited. Acta Appl. Math. 54, 185–232 (1998)
10. Čap, A., Schichl, H. and Vanžura, J.: On twisted tensor product of algebras. Commun. Algebra 23, 4701–4735 (1995)
11. Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
12. Doi, Y.: Unifying Hopf modules. J. Algebra 153, 373–385 (1992)
13. Doi, Y. and Takeuchi, M.: Hopf-Galois extensions of algebras, the Miyashita-Ulbrich action, and Azumaya algebras. J. Algebra 121, 488–516 (1989)
14. Doplicher, S. and Roberts, J.E.: Why is there a Field Algebra with a Compact Gauge Group Describing the Superselection Structure in Particle Physics. Commun. Math. Phys. 131, 51–107 (1990)
15. Fredenhagen, K., Rehren, K.H. and Schroer, B.: Superselection Sectors with Braid Statistics and Exchange Algebras. Commun. Math. Phys. 125, 201–226 (1989)
16. Hajac, P.M.: Strong connections on quantum principal bundles. Commun. Math. Phys. 182, 579–617 (1996)
17. Hajac, P.M. and Majid, S.: Projective module description of the q-monopole. Commun. Math. Phys. 206, 247–264 (1999)
18. Karoubi, M.: Homologie cyclique et K-théorie. Astérisque 149 (1987)
19. Koppinen, M.: Variations on the smash product with applications to group-graded rings. J. Pure Appl. Alg. 104, 61–80 (1994)
20. Majid, S.: Foundation of Quantum Group Theory. Cambridge: Cambridge University Press, 1995
21. Majid, S.: Physics for algebraists: Non-commutative and non-cocommutative Hopf algebras by a bicrossproduct construction. J. Algebra 130, 17–64 (1990)
22. Majid, S.: Some remarks on quantum and braided group gauge theory. Banach Center Publ. 40, 335–349 (1997)
23. Majid, S.: Quantum and braided group Riemannian geometry. J. Geom. Phys. 30, 113–146 (1999)
24. Milnor, J.W. and Moore, J.C.: On the structure of Hopf algebras. Ann. Math. 81, 211–264 (1965)
25. Müller, E.F. and Schneider, H.-J.: Quantum homogeneous spaces with faithfully flat module structures. Israel J. Math. 111, 157–190 (1999)
26. Noumi, M. and Sugitani, T.: Quantum symmetric spaces and related q-orthogonal polynomials. In: Group theoretical methods in physics (Toyonaka, 1994), River Edge, NJ: World Sci. Publishing, 1995, pp. 28–40
27. Ocneanu, A.: Quantized Groups, String Algebras and Galois Theory for Algebras. In: D.E. Evans and M. Takesaki, eds., Operator Algebras and Applications: Proc. Seminar, Lon. Math. Soc. Lec. Notes Series Number 136, CUP, 1989
28. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 193–202 (1987)
29. Schneider, H.-J.: Principal homogeneous spaces for arbitrary Hopf algebras. Isr. J. Math. 72, 167–195 (1990)
30. Schneider, H.-J.: Representation theory of Hopf-Galois extensions. Isr. J. Math. 72, 196–231 (1990)
31. Schneider, H.-J.: Normal basis and transitivity of crossed products for Hopf algebras. J. Algebra 152, 289–312 (1992)
32. Tambara, D.: The coendomorphism bialgebra of an algebra. J. Fac. Sci. Univ. Tokyo Sect. IA, Math. 37, 425–456 (1990)
33. Woronowicz, S.L.: Differential calculus on compact matrix pseudogroups (quantum groups). Commun. Math. Phys. 122, 125–170 (1989)

Communicated by A. Connes

Commun. Math. Phys. 213, 523 – 538 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Conformal Maps and Integrable Hierarchies

P. B. Wiegmann¹, A. Zabrodin²

¹ James Franck Institute and Enrico Fermi Institute of the University of Chicago, 5640 S. Ellis Avenue, Chicago, IL 60637, USA, and Landau Institute for Theoretical Physics
² Joint Institute of Chemical Physics, Kosygina str. 4, 117334 Moscow, Russia, and ITEP, 117259 Moscow, Russia

Received: 20 December 1999 / Accepted: 2 March 2000

Abstract: We show that conformal maps of simply connected domains with an analytic boundary to a unit disk have an intimate relation to the dispersionless 2D Toda integrable hierarchy. The maps are determined by a particular solution to the hierarchy singled out by the conditions known as "string equations". The same hierarchy locally solves the 2D inverse potential problem, i.e., reconstruction of the domain out of a set of its harmonic moments. This is the same solution which is known to describe 2D gravity coupled to c = 1 matter. We also introduce a concept of the τ-function for analytic curves.

1. Introduction

By the Riemann mapping theorem, any simply connected domain on the complex plane with a boundary consisting of at least two points can be conformally mapped onto the unit disk. However, the theorem does not say how to construct the map.

We argue that the map is explicitly given by a solution of a dispersionless integrable hierarchy which is a multi-dimensional extension of the hierarchies of hydrodynamic type [1,2]. In this paper, we only consider simply connected domains bounded by a simple analytic curve. We show that in this case conformal maps are given by a particular solution of the dispersionless 2D Toda hierarchy (see [2,3] and references therein). Surprisingly, this solution obeys (and is selected by) the string equation familiar in topological gravity and matrix models of 2D gravity [4, 5].

One may characterize a closed analytic curve by a set of harmonic moments of, say, the exterior of the domain surrounded by the curve, together with its area:

$$ C_k = -\int_{\text{exterior}} z^{-k}\,dx\,dy, \quad k \ge 1; \qquad C_0 = \int_{\text{interior}} dx\,dy. \tag{1.1} $$

Here $z = x + iy$ and it is implied that the point $z = 0$ is inside the domain, whereas the point $z = \infty$ is outside. The integrals at $k = 1, 2$ are assumed to be properly regularized. We assume this (infinite) set to be given and address two problems:

(1) To find the harmonic moments of the interior,

$$ C_{-k} = \int_{\text{interior}} z^{k}\,dx\,dy, \quad k \ge 1, \tag{1.2} $$

and to reconstruct the form of the domain. This is the 2D inverse potential problem (see e.g. [6]). It is known that the solution, if it exists, is unique [7] if the domain is star-like. It is not known whether this is true for arbitrary domains with a smooth boundary. However, if the boundary is smooth, then a small change of moments uniquely determines a new domain (also with a smooth boundary). The solution to this problem is given through the τ-function of an integrable hierarchy, thus suggesting a new concept: the τ-function of the curve.

(2) To construct an invertible conformal map of the exterior of the unit circle onto the exterior of the domain.

Explicitly, the area and the set of moments of the exterior (and their complex conjugates) are identified with the times of the dispersionless 2D Toda hierarchy as follows:

$$ t \equiv t_0 = \frac{C_0}{\pi}, \qquad t_k = \frac{C_k}{\pi k}, \qquad \bar t_k = \frac{\bar C_k}{\pi k}, \qquad k \ge 1. \tag{1.3} $$

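These parameters should be treated as independent variables. We prove that the derivatives of the harmonic moments of the interior $C_{-k}$ with respect to the $t_j$, $\bar t_j$ enjoy the symmetry

$$ \frac{\partial C_{-k}}{\partial t_j} = \frac{\partial C_{-j}}{\partial t_k}, \qquad \frac{\partial C_{-k}}{\partial \bar t_j} = \frac{\partial \bar C_{-j}}{\partial t_k}. $$

Therefore, the moments of the interior can be expressed as derivatives of a single real function which appears to be the logarithm of the τ-function $\tau(t, \{t_k\}, \{\bar t_k\})$ of the dispersionless 2D Toda hierarchy:

$$ v_k \equiv \frac{C_{-k}}{\pi} = \frac{\partial \log\tau}{\partial t_k}, \qquad \bar v_k \equiv \frac{\bar C_{-k}}{\pi} = \frac{\partial \log\tau}{\partial \bar t_k}, \qquad k \ge 1. \tag{1.4} $$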

The equation which determines the τ-function (the dispersionless Hirota equation), and therefore the moments of the interior, is given in Sect. 5. Supplemented by a proper initial condition provided by the string equation, it solves the inverse moments problem for small deformations of a simply connected domain with analytic boundary. The τ-function at $t_k = 0$, $k \ge 3$, is given in the Appendix.

In its turn, the conformal map is determined by the dispersionless limit of the Lax–Sato equations of the Toda hierarchy. Let $z(w)$ be a univalent function that provides an invertible conformal map of the exterior of the unit circle $|w| > 1$ to the exterior of the domain. Let us represent it by a Laurent series

$$ z(w) = rw + \sum_{j=0}^{\infty} u_j w^{-j}. \tag{1.5} $$

The conditions that w = ∞ is mapped to z = ∞ and r is real fix the map, so the potentials r, uj are uniquely determined by the domain. The potentials of the conformal map obey an infinite set of differential equations with respect to the times (harmonic moments). They are evolution Lax–Sato equations of the hierarchy. The series z(w) is identified with the Lax function. The Lax–Sato equations are derived in Sect. 4.
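As an illustration of how the times (1.3) are tied to the potentials of the map (1.5), the sketch below evaluates the moments numerically for an ellipse. It assumes the contour form of the moments, $t_k = \frac{1}{2\pi i k}\oint z^{-k}\bar z\,dz$ and $t = \frac{1}{2\pi i}\oint \bar z\,dz$, which follows from (1.1), (1.3) by Green's formula. The function names, the sample values $r = 1$, $u_1 = 0.3$ and the number of quadrature nodes are illustrative choices, not taken from the paper.

```python
import cmath
import math


def conformal_map(r, u, w):
    """z(w) = r*w + sum_j u[j] * w**(-j): the series (1.5) truncated to len(u) terms."""
    return r * w + sum(uj * w ** (-j) for j, uj in enumerate(u))


def conformal_map_dw(r, u, w):
    """Derivative dz/dw of the truncated series (1.5)."""
    return r - sum(j * uj * w ** (-j - 1) for j, uj in enumerate(u))


def exterior_times(r, u, kmax, n=1024):
    """Return (t, [t_1, ..., t_kmax]) from contour integrals over |w| = 1.

    Uses t   = (1/(2*pi*i))   * integral of conj(z) dz,
         t_k = (1/(2*pi*i*k)) * integral of z**(-k) * conj(z) dz,
    discretised with the uniform (trapezoidal) rule in the angle theta.
    """
    totals = [0j] * (kmax + 1)
    dtheta = 2 * math.pi / n
    for m in range(n):
        w = cmath.exp(1j * m * dtheta)
        z = conformal_map(r, u, w)
        dz = conformal_map_dw(r, u, w) * 1j * w * dtheta  # dz = z'(w) * i*w * dtheta
        for k in range(kmax + 1):
            totals[k] += z ** (-k) * z.conjugate() * dz
    t = (totals[0] / (2j * math.pi)).real
    tk = [totals[k] / (2j * math.pi * k) for k in range(1, kmax + 1)]
    return t, tk


# Ellipse z(w) = r*w + u1/w with semi-axes r + u1 and r - u1 (hypothetical sample values).
r, u1 = 1.0, 0.3
t, tk = exterior_times(r, [0.0, u1], kmax=4)
```

For this ellipse the only non-vanishing times should be the area parameter $t = r^2 - u_1^2$ (the area $\pi(r+u_1)(r-u_1)$ divided by $\pi$) and $t_2 = u_1/(2r)$, so three numbers determine the whole map.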

The coefficients of the conformal map have an equivalent description in terms of the τ-function: the inverse map $w(z)$ is explicitly given by the formula

$$ \log w = \log z - \left( \frac{1}{2}\frac{\partial^2}{\partial t^2} + \sum_{k \ge 1} \frac{z^{-k}}{k}\frac{\partial^2}{\partial t\,\partial t_k} \right) \log\tau. \tag{1.6} $$

Two comments are in order. Many equations of Sects. 3–6 can be found in the literature on dispersionless hierarchies [2, 3, 8, 9], 2D gravity [4, 5] and topological field theories [10, 11], where they appeared without reference to conformal maps and the inverse potential problem; we adapt them and their proofs to conformal maps. This work is a development of our previous joint work [12] with Mark Mineev–Weinstein, where some of the results described below were applied to the problems of 2D interface dynamics.

2. Analytic Curves and the Schwarz Function

Let a closed analytic curve be the boundary of a simply connected domain on the complex plane with coordinates $z = x + iy$, $\bar z = x - iy$. The equation of the curve can be written in the form

$$ \bar z = S(z). \tag{2.1} $$

Thus the curve determines the function $S(z)$, which is defined also outside the curve. Moreover, it is an analytic function within some domain containing the curve. We call the function $S(z)$ the Schwarz function of the curve (see e.g. [13]). Its singularities encode the information about the curve. The Schwarz function may not be an arbitrary function of $z$. It obeys the unitarity condition: as is seen from (2.1), its complex conjugate function¹ is equal to the inverse function, i.e.,

$$ \bar S(S(z)) = z. \tag{2.2} $$

Being an analytic function in a strip containing the curve, the Schwarz function can be decomposed into a sum of two functions, $S(z) = S^{(+)}(z) + S^{(-)}(z)$: one, $S^{(+)}$, is holomorphic in the interior of the domain; the second, $S^{(-)}$, is holomorphic in the exterior. Their expansions around $z = 0$ and $z = \infty$ determine the harmonic moments of the exterior (1.1) and of the interior (1.2), respectively:

$$ S^{(+)}(z) = \frac{1}{\pi}\sum_{k=1}^{\infty} C_k z^{k-1}, \qquad S^{(-)}(z) = \frac{1}{\pi}\sum_{k=0}^{\infty} C_{-k} z^{-k-1}. \tag{2.3} $$

This follows from the contour integral representation of the moments

$$ C_k = \frac{1}{2i}\oint_{\text{curve}} z^{-k} S(z)\,dz, \qquad -\infty < k < \infty, \tag{2.4} $$

obtained from (1.1, 1.2) with the help of the Green formula.

¹ A comment on the notation: given a Laurent series $f(z) = \sum_j f_j z^j$, we set $\bar f(z) = \sum_j \bar f_j z^j$.

The Schwarz function is closely related to conformal maps. Consider the conformal map (1.5) of the exterior of the unit circle to the exterior of the domain. The inverse map sends a point $z$ to a point $w(z)$. Let us invert this point with respect to the circle, $w \to (\bar w)^{-1}$, and map it back. (It is known that the conformal map can be extended into some domain through the analytic curve. We assume that the inverted point belongs to it.) This operation is carried out by the conjugate Schwarz function, $\bar S(\bar z) = z((\bar w)^{-1})$ or $S(z) = \bar z(w^{-1})$. Obviously, if $z$ belongs to the curve, then $S(z) = \bar z$. Therefore, we can write $S(z(w)) = \bar z(w^{-1})$.

To summarize, we have two formal Laurent series (see (1.3, 1.4) for the notation):

$$ S(z) = \sum_{k=1}^{\infty} k t_k z^{k-1} + \frac{t}{z} + \sum_{k=1}^{\infty} v_k z^{-k-1}, \tag{2.5} $$

$$ \bar S(z) = \sum_{k=1}^{\infty} k \bar t_k z^{k-1} + \frac{t}{z} + \sum_{k=1}^{\infty} \bar v_k z^{-k-1}. \tag{2.6} $$

They are connected by the unitarity condition (2.2). The latter is resolved by the conformal maps

$$ z(w) = \bar S(\bar z(w^{-1})), \tag{2.7} $$

$$ \bar z(w^{-1}) = S(z(w)). \tag{2.8} $$

In their turn, the conformal maps are given by the series²

$$ z(w) = rw + \sum_{j=0}^{\infty} u_j w^{-j}, \tag{2.9} $$

$$ \bar z(w^{-1}) = rw^{-1} + \sum_{j=0}^{\infty} \bar u_j w^{j}. \tag{2.10} $$

These series and the unitarity condition (2.7) establish relations between the harmonic moments of the exterior $t_k$, the harmonic moments of the interior $v_k$ and the coefficients of the conformal map $u_j$. Below we find an infinite set of differential equations which determine the evolution of the potentials and the moments of the interior while varying the moments of the exterior.

3. Symplectic Structure of Conformal Maps and Generating Function

Deformations of the domain and, therefore, of the conformal map reveal the symplectic structure. In this section we show that the pairs $\log w,\ t$ and $z(w, t),\ \bar z(w^{-1}, t)$ are canonical (see also [12]):

$$ \{z(w, t),\ \bar z(w^{-1}, t)\} = 1, \tag{3.1} $$

where $\{\,,\}$ is the Poisson bracket with respect to $w$ and the area $t$ (all moments of the exterior $t_k$, $\bar t_k$ are kept fixed). For any two functions $f(w, t)$, $g(w, t)$ the Poisson bracket is defined by

$$ \{f, g\} = w\frac{\partial f}{\partial w}\frac{\partial g}{\partial t} - w\frac{\partial g}{\partial w}\frac{\partial f}{\partial t}. \tag{3.2} $$

² To avoid confusion, let us stress that the functions $z(w)$ and $\bar z(w^{-1})$ are complex conjugate only on the curve, i.e., at $|w| = 1$.
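The bracket (3.2) and the relation (3.1) can be probed numerically on the simplest non-trivial family. The sketch below assumes an ellipse with $t_1 = 0$ and $t_k = 0$ for $k \ge 3$, for which matching the series (2.5)–(2.10) gives $r^2 = t/(1 - 4|t_2|^2)$ and $u_1 = 2r\bar t_2$; this parametrisation is worked out for the example and is not quoted from the text. Derivatives are taken by central differences, and all names and sample values are hypothetical.

```python
import math


def ellipse_maps(w, t, t2):
    """Return (z(w, t), zbar(1/w, t)) for the ellipse with t_1 = 0 and t_k = 0, k >= 3.

    Assumed parametrisation (derived for this example):
    r**2 = t / (1 - 4*|t2|**2), u_1 = 2*r*conj(t2).
    """
    r = math.sqrt(t / (1.0 - 4.0 * abs(t2) ** 2))
    u1 = 2.0 * r * complex(t2).conjugate()
    z = r * w + u1 / w
    zbar = r / w + u1.conjugate() * w
    return z, zbar


def poisson_bracket(w, t, t2, h=1e-5):
    """{z, zbar} = w z_w zbar_t - w zbar_w z_t, Eq. (3.2), via central differences."""
    z_wp, zb_wp = ellipse_maps(w + h, t, t2)
    z_wm, zb_wm = ellipse_maps(w - h, t, t2)
    z_tp, zb_tp = ellipse_maps(w, t + h, t2)
    z_tm, zb_tm = ellipse_maps(w, t - h, t2)
    dz_dw, dzb_dw = (z_wp - z_wm) / (2 * h), (zb_wp - zb_wm) / (2 * h)
    dz_dt, dzb_dt = (z_tp - z_tm) / (2 * h), (zb_tp - zb_tm) / (2 * h)
    return w * dz_dw * dzb_dt - w * dzb_dw * dz_dt


# Sample point off the unit circle: the bracket should not depend on w.
w0 = 1.3 * complex(math.cos(0.7), math.sin(0.7))
bracket = poisson_bracket(w0, t=1.0, t2=0.10 + 0.05j)
```

Up to finite-difference error the result is 1 for any $w$, which is the content of (3.1) for this family: the $t$-dependence of $r$ is precisely what makes the bracket constant.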

We refer to (3.1) as the string equation³. This equation suggests that deformations of conformal maps form a group with respect to composition. To prove it, let us rewrite the l.h.s. in two equivalent forms with the help of (2.7), (2.8). First, let us compute the derivatives of $\bar z(w^{-1}, t)$ in (3.1), treating it as the composition $S(z(w, t), t)$ as in (2.8):

$$ \frac{\partial \bar z(w^{-1}, t)}{\partial t} = \frac{\partial S(z, t)}{\partial t} + \frac{\partial S(z, t)}{\partial z}\frac{\partial z(w, t)}{\partial t}, \qquad \frac{\partial \bar z(w^{-1}, t)}{\partial w} = \frac{\partial S(z, t)}{\partial z}\frac{\partial z(w, t)}{\partial w}. \tag{3.3} $$

As a result, we obtain:

$$ \{z(w, t),\ \bar z(w^{-1}, t)\} = w\frac{\partial z(w, t)}{\partial w}\frac{\partial S(z, t)}{\partial t}. \tag{3.4} $$

Similarly, treating $z(w, t)$ in (3.1) as the composition $\bar S(\bar z(w^{-1}, t), t)$, we get

$$ \{z(w, t),\ \bar z(w^{-1}, t)\} = -w\frac{\partial \bar z(w^{-1}, t)}{\partial w}\frac{\partial \bar S(\bar z, t)}{\partial t}. \tag{3.5} $$

In the r.h.s. of these equations the derivatives in $t$ are taken at fixed $z$ or $\bar z$ and then understood as functions of $w$. Now, using the series (2.5) and (2.9), we conclude that the r.h.s. of (3.4) is 1 plus negative powers of $w$, since $\partial_t S = 1/z + O(z^{-2})$ and $w\,\partial_w z = rw + O(1)$. However, the series (2.6) and (2.10) tell that the r.h.s. of (3.5) is 1 plus positive powers of $w$. The two expressions are equal, so both reduce to 1. This leads us to (3.1).

The rest follows from the symplectic structure by treating deformations of the conformal map (2.9) along the lines of the multi-time Hamilton–Jacobi formalism [2, 14]. Let us introduce the generating function $\Omega(z, t)$ of the canonical transformation $(\log w, t) \to (z, S)$. Its differential, $d\Omega = S\,dz + \log w\,dt$, implies that

$$ S = \frac{\partial \Omega(z, t)}{\partial z}, \qquad \log w = \frac{\partial \Omega(z, t)}{\partial t}. \tag{3.6} $$

Using the Laurent series for the Schwarz function (2.5), we get

$$ \Omega(z, t) = \sum_{k=1}^{\infty} t_k z^k + t\log z - \frac{1}{2}v_0(t) - \sum_{k=1}^{\infty} \frac{v_k(t)}{k} z^{-k}, \tag{3.7} $$

where $v_0$ obeys

$$ \partial_t v_0 = 2\log r. \tag{3.8} $$

The integration constant $v_0$ can be interpreted as a logarithmic moment (see (3.15) below). Like the Schwarz function, the generating function is the sum of functions whose derivatives, $S^{(\pm)}(z) = \partial_z \Omega^{(\pm)}(z)$, are analytic in the interior and the exterior of the domain, respectively: $\Omega(z) = \Omega^{(+)}(z) + \Omega^{(-)}(z) - v_0/2$. They have a simple electrostatic interpretation. Say, $\Omega^{(+)}(z)$ is the (complex) 2D Coulomb potential in the interior of the domain created by a homogeneously distributed charge in the exterior. In its turn, $\Omega^{(-)}(z)$ is the complex Coulomb potential in the exterior created by a homogeneous charge in the interior:

$$ \Omega^{(+)}(z) = \frac{1}{\pi}\int_{\text{exterior}} \log\left(1 - \frac{z}{z'}\right) dx'\,dy' = \frac{1}{\pi}\sum_{k=1}^{\infty} \frac{C_k}{k} z^k, \tag{3.9} $$

$$ \Omega^{(-)}(z) = \frac{1}{\pi}\int_{\text{interior}} \log(z - z')\,dx'\,dy' = \frac{C_0}{\pi}\log z - \frac{1}{\pi}\sum_{k=1}^{\infty} \frac{C_{-k}}{k} z^{-k}. \tag{3.10} $$

³ See Sect. 7 for the history of this equation and references.
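A minimal worked example may help to fix the normalisations in (3.6)–(3.10). It assumes the simplest domain, a disk of radius $R$ centred at the origin, which is not treated explicitly in the text. In that case $C_k = C_{-k} = 0$ for $k \ge 1$, $t = R^2$, and the conformal map is $z = rw$ with $r = R = \sqrt{t}$:

```latex
\begin{align*}
S(z) &= \frac{t}{z}, \qquad
\Omega^{(+)}(z) = 0, \qquad
\Omega^{(-)}(z) = t\log z,\\[2pt]
\partial_t v_0 &= 2\log r = \log t
\;\Longrightarrow\; v_0 = t\log t - t + \mathrm{const},\\[2pt]
\Omega(z,t) &= t\log z - \tfrac12\,(t\log t - t)
\qquad (\mathrm{const} = 0),\\[2pt]
\partial_z\Omega &= \frac{t}{z} = S(z), \qquad
\partial_t\Omega = \log z - \tfrac12\log t = \log\frac{z}{\sqrt t} = \log w .
\end{align*}
```

The last line reproduces both relations in (3.6). The choice $\mathrm{const} = 0$ agrees with the logarithmic moment (3.15), since $\frac{1}{\pi}\int_{|z|<R}\log|z|^2\,dx\,dy = t\log t - t$. On the circle $z = \sqrt{t}\,e^{i\varphi}$ one finds $\Omega = \frac12 t + it\varphi$, that is, half the squared radius plus $2i$ times the sector area $\frac12 R^2\varphi$.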

The real and imaginary parts of the generating function at a point $z$ on the curve have a clear geometric interpretation. The real part is half of the squared distance from the origin to the point $z$ on the curve, while $\frac{1}{2}\,\mathrm{Im}\,(\Omega(z, t) - \Omega(z_1, t))$ is the area $A(z)$ (counted modulo $\pi t$) of the interior domain bounded by the ray $\varphi = \arg z$ and a reference ray $\varphi = \arg z_1$. So, if $z$ is on the curve, we can write

$$ \Omega(z) = \frac{1}{2}|z|^2 + 2iA(z), \tag{3.11} $$

where the right-hand side is defined up to a purely imaginary constant. Indeed, integrating by parts, $\Omega(z) = \int^z S(\zeta)\,d\zeta = \frac{1}{2}zS(z) + \frac{1}{2}\int^z (S(\zeta)\,d\zeta - \zeta\,dS(\zeta))$, and taking into account that $dA(z) = \frac{1}{4i}(\bar z\,dz - z\,d\bar z)$ and $S(z) = \bar z$ on the curve, we obtain (3.11). It follows from (3.11) that if $z$ belongs to the curve, then the variation of $\Omega(z)$ with respect to the real parameters,

$$ \delta\Omega = \partial_t\Omega\,\delta t + \sum_{k=1}^{\infty} \Big[ (\partial_{t_k}\Omega + \partial_{\bar t_k}\Omega)\,\mathrm{Re}\,\delta t_k + i(\partial_{t_k}\Omega - \partial_{\bar t_k}\Omega)\,\mathrm{Im}\,\delta t_k \Big], \tag{3.12} $$

is purely imaginary. The proof is simple. Let us analytically continue the equality $\Omega(z) + \bar\Omega(\bar z) = |z|^2$ (the real part of (3.11)) away from the curve,

$$ \Omega(z) + \bar\Omega(S(z)) = zS(z), \tag{3.13} $$

and take the partial derivative with respect to $t_j$ at a fixed $z$. Note that the $t_j$-dependence of the second term comes from both $\bar\Omega$ and $S$. We get:

$$ \frac{\partial \Omega(z)}{\partial t_j} + \bar S(S(z))\frac{\partial S(z)}{\partial t_j} + \left.\frac{\partial \bar\Omega(\zeta)}{\partial t_j}\right|_{\zeta = S(z)} = z\frac{\partial S(z)}{\partial t_j}. $$

Applying the unitarity condition (2.2), we observe that the second term in the left-hand side is equal to the term in the right-hand side, so they cancel. Restricting the equality to the curve again, we conclude that

$$ \frac{\partial \Omega(z)}{\partial t_j} + \frac{\partial \bar\Omega(\bar z)}{\partial t_j} = 0, \tag{3.14} $$

whence all terms of the infinite sum in the right-hand side of (3.12) are imaginary. Similarly, for the derivative with respect to $t$, (3.14) reads $\partial_t\Omega(z) + \partial_t\bar\Omega(\bar z) = 0$ (since $t$ is real), so the first term in (3.12) is also imaginary.

As is mentioned above, $\Omega$ is defined up to a purely imaginary $z$-independent term (which may depend on the $t_j$). However, its variation with respect to real parameters, as in (3.12), is also purely imaginary. Taking into account (3.8), we fix it by the condition that $v_0$ is real. From (3.9), (3.10) and (3.11) it easily follows that

$$ v_0 = \frac{1}{\pi}\int_{\text{interior}} \log|z|^2\,dx\,dy \tag{3.15} $$

is the logarithmic moment.

4. Lax–Sato Equations for Conformal Maps

Let us now vary higher harmonic moments $t_k$. The differential of the generating function changes accordingly [12]:

$$ d\Omega = S\,dz + \log w\,dt + \sum_{k=1}^{\infty} (H_k\,dt_k - \bar H_k\,d\bar t_k), \tag{4.1} $$

where

$$ H_j = \left(\frac{\partial \Omega}{\partial t_j}\right)_z, \qquad \bar H_j = -\left(\frac{\partial \Omega}{\partial \bar t_j}\right)_z \tag{4.2} $$

are Hamiltonians generating the higher flows in $t_k$, $\bar t_k$. Here the derivatives in $t_k$ are taken at a fixed $z$ that belongs to the curve. The second equality follows from (3.14). The Hamiltonians, being treated as functions of $z$ and $t$, obey the integrability condition [2]

$$ \left(\frac{\partial H_i}{\partial t_j}\right)_z = \left(\frac{\partial H_j}{\partial t_i}\right)_z. \tag{4.3} $$

Computing the derivative in (4.2), we obtain:

$$ H_j = z^j - \frac{1}{2}\frac{\partial v_0}{\partial t_j} - \sum_{k=1}^{\infty} \frac{\partial v_k}{\partial t_j}\frac{z^{-k}}{k} \tag{4.4} $$

and similarly for $\bar H_k$. The Hamiltonians $H_j$ written in terms of the canonical variables $w$, $t$ determine the evolution of the conformal map $z(w)$ and $\bar z(w^{-1})$ with respect to the hierarchical times $t_k$ (harmonic moments). These are the Lax–Sato equations:

$$ \frac{\partial z(w)}{\partial t_j} = \{H_j,\ z(w)\}, \tag{4.5} $$

$$ \frac{\partial \bar z(w^{-1})}{\partial t_j} = \{H_j,\ \bar z(w^{-1})\}. \tag{4.6} $$

Note that these formulas can be extended to $j = 0$, where $H_0 = \log w$ generates the flow $t_0 = t$. Consistency of these equations acquires the form of the "zero curvature" condition

$$ \partial_{t_j}H_i - \partial_{t_i}H_j + \{H_i, H_j\} = 0, $$

where derivatives are taken at fixed $w$. It is equivalent to (4.3).

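Now we are ready to prove that the Hamiltonians $H_j$ and $\bar H_j$ have the form

$$ H_j(w) = \big(z^j(w)\big)_+ + \frac{1}{2}\big(z^j(w)\big)_0, \tag{4.7} $$

$$ \bar H_j(w) = \big(\bar z^j(w^{-1})\big)_- + \frac{1}{2}\big(\bar z^j(w^{-1})\big)_0. \tag{4.8} $$

The symbols $(f(w))_\pm$ mean a truncated Laurent series, where only terms with positive (negative) powers of $w$ are kept; $(f(w))_0$ is the constant part ($w^0$) of the series. To this end, let us differentiate $H_j$ by $w$ and $t$, and express the result in terms of derivatives of $\bar z$ and $\bar S$. We begin with the formula

$$ \frac{\partial H_j}{\partial w} = \frac{\partial z(w)}{\partial w}\frac{\partial \bar z(w^{-1})}{\partial t_j} - \frac{\partial z(w)}{\partial t_j}\frac{\partial \bar z(w^{-1})}{\partial w}, $$

which is a simple consequence of the definition (4.2). Replacing here $z(w)$ by $\bar S(\bar z)$ as before, we get:

$$ \frac{\partial H_j}{\partial w} = -\frac{\partial \bar z}{\partial w}\frac{\partial \bar S(\bar z)}{\partial t_j}. \tag{4.9} $$

Using (2.6), we find that $\partial_{t_j}\bar S(\bar z) = \sum_{k=1}^{\infty} \partial_{t_j}\bar v_k\,\bar z^{-k-1}$ is a regular function of $w$ at $w = 0$ and its Taylor expansion starts from $w^2$. Thus $H_j$ is also a regular function of $w$. Moreover, from (4.4) we find that $H_j$ is a polynomial in $w$ of degree $j$. Hence $H_j = (z^j)_+ + (z^j)_0 - \frac{1}{2}\partial_{t_j}v_0$. To complete the proof, let us find the $w^0$-term in the Laurent series

$$ \frac{\partial H_j}{\partial t} = \frac{\partial \bar z(w)}{\partial t_j}\frac{\partial \bar S(\bar z)}{\partial t} - \frac{\partial \bar z(w)}{\partial t}\frac{\partial \bar S(\bar z)}{\partial t_j}. \tag{4.10} $$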

It comes from the first term of the r.h.s. of this expression and, together with (3.8), gives the desired result: $2(\partial_t H_j)_0 = 2\partial_{t_j}\log r = \partial_t\partial_{t_j}v_0 = \partial_t (z^j)_0$.

The dynamics of the conformal map with respect to $\bar t_k$ can be obtained from (4.5, 4.6) by complex conjugation. Note that the Poisson bracket changes sign as $w \to \bar w = w^{-1}$, which is consistent with the minus sign in (4.2). Hence $\bar H_j = \bar H_j(w^{-1})$ as in (4.8).

The Lax–Sato equations (4.5, 4.6) with the Hamiltonians (4.7, 4.8) imply that the coefficients of the conformal map obey an infinite set of non-linear differential equations known as the dispersionless 2D Toda hierarchy. The first and the most familiar equation of the hierarchy is the long wave limit of the Toda lattice equation: $\partial^2 (r^2)/\partial t^2 = \partial^2 \log(r^2)/\partial t_1\partial \bar t_1$.

We also mention other relations between the conformal map and the Hamiltonians:

$$ z(w) = H_1 + \frac{1}{2}\bar t_1 + \sum_{k=2}^{\infty} k\bar t_k \bar H_{k-1}, \tag{4.11} $$

$$ z(w)\bar z(w^{-1}) = t + 2\,\mathrm{Re}\sum_{k=1}^{\infty} k t_k H_k. \tag{4.12} $$
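The projections in (4.7), (4.8) are easy to carry out mechanically when the map is a Laurent polynomial. The sketch below stores a Laurent polynomial in $w$ as a dictionary from powers to coefficients, builds $H_1$, $H_2$ and $\bar H_1$ for an ellipse $z(w) = rw + u_1/w$, and checks (4.11) and (4.12). It assumes the ellipse values $t = r^2 - |u_1|^2$, $t_1 = 0$, $t_2 = \bar u_1/(2r)$, which come from expanding the Schwarz function of the ellipse and are not quoted from the text; the helper names and sample numbers are illustrative.

```python
import cmath


def lmul(a, b):
    """Product of two Laurent polynomials given as {power: coefficient} dicts."""
    out = {}
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] = out.get(i + j, 0) + x * y
    return out


def lpow(a, n):
    """n-th power (n >= 0) of a Laurent polynomial."""
    out = {0: 1}
    for _ in range(n):
        out = lmul(out, a)
    return out


def hamiltonian(z, j):
    """H_j = (z^j)_+ + (1/2)(z^j)_0, Eq. (4.7)."""
    zj = lpow(z, j)
    h = {p: c for p, c in zj.items() if p > 0}
    h[0] = 0.5 * zj.get(0, 0)
    return h


def hamiltonian_bar(zbar, j):
    """Hbar_j = (zbar^j)_- + (1/2)(zbar^j)_0, Eq. (4.8)."""
    zj = lpow(zbar, j)
    h = {p: c for p, c in zj.items() if p < 0}
    h[0] = 0.5 * zj.get(0, 0)
    return h


def evaluate(a, w):
    """Value of a Laurent polynomial at the point w."""
    return sum(c * w ** p for p, c in a.items())


# Ellipse z(w) = r*w + u1/w and its partner zbar(1/w) = r/w + conj(u1)*w.
r, u1 = 1.0, 0.3 + 0.1j
z = {1: r, -1: u1}
zbar = {-1: r, 1: u1.conjugate()}
t = r ** 2 - abs(u1) ** 2        # assumed ellipse value of the area time
t2 = u1.conjugate() / (2 * r)    # assumed ellipse value of t_2 (t_1 = 0)

H1, H2 = hamiltonian(z, 1), hamiltonian(z, 2)
Hbar1 = hamiltonian_bar(zbar, 1)

# (4.11) holds as an identity in w; only the k = 2 term survives in the sum.
w_any = 1.5 * cmath.exp(0.4j)
lhs_411 = evaluate(z, w_any)
rhs_411 = evaluate(H1, w_any) + 2 * t2.conjugate() * evaluate(Hbar1, w_any)

# (4.12) is checked on the curve |w| = 1, where zbar(1/w) = conj(z(w)).
w_on = cmath.exp(0.4j)
lhs_412 = abs(evaluate(z, w_on)) ** 2
rhs_412 = t + 2 * (2 * t2 * evaluate(H2, w_on)).real
```

In this example $H_2 = r^2w^2 + ru_1$, so (4.12) reduces to $|z|^2 = r^2 + |u_1|^2 + 2\,\mathrm{Re}(r\bar u_1 w^2)$ on the unit circle.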

Relations (4.11) and (4.12) can be immediately obtained from the unitarity condition (2.7, 2.8) by comparing the positive and constant parts of the Laurent series in $w$ and using (4.7, 4.8). These formulas prove a known property of conformal maps: if all moments $t_k = 0$ for $k > N$, then the Laurent series of the conformal map (1.5) is truncated, $u_j = 0$ for $j \ge N$. This, in particular, gives another proof of Sakai's theorem [15]: if all but the first three moments $t, t_1, t_2$ of the complement of a simply connected domain on the complex plane are zero, then the domain is an ellipse.

The Lax–Sato equations (4.5, 4.6) plus the string equation (3.1) give the complete set of differential equations for the potentials $u_j$ of the conformal map as functions of the moments $t, t_k, \bar t_k$. They uniquely determine small deformations of the map and the curve.

5. Symmetry of Moments and the τ-Function of Conformal Maps and Curves

The integrable hierarchy suggests a concept of τ-function for curves and conformal maps. The τ-function solves the problem of moments, i.e., a restoration of the moments of the interior $v_k$ given the moments of the exterior $\{t_k\}$ of the domain and its area $t$. We define the τ-function $\tau(t; \{t_i\}, \{\bar t_i\})$ as a real function which determines the moments of the interior by the formulas

$$ v_k = \frac{\partial \log\tau}{\partial t_k}, \qquad \bar v_k = \frac{\partial \log\tau}{\partial \bar t_k}, \qquad v_0 = \frac{\partial \log\tau}{\partial t}. \tag{5.1} $$
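A single function $\log\tau$ satisfying (5.1) can exist only if mixed derivatives agree, for instance $\partial v_0/\partial t_2 = \partial v_2/\partial t$. The sketch below tests this numerically on the ellipse family with $t_1 = 0$ and $t_k = 0$ for $k \ge 3$. It assumes the parametrisation $r^2 = t/(1 - 4|t_2|^2)$, $u_1 = 2r\bar t_2$, and the contour forms $v_k = \frac{1}{2\pi i}\oint z^k\bar z\,dz$ and $v_0 = \frac{1}{2\pi i}\oint \bar z(\log|z|^2 - 1)\,dz$, which follow from (1.2), (1.4), (3.15) by Green's formula; none of these are stated in this form in the text, and the names and sample values are hypothetical.

```python
import cmath
import math


def interior_moment(k, t, t2, n=512):
    """v_k for the ellipse labelled by (t, t2), from a contour integral over |w| = 1.

    k >= 1: v_k = (1/(2*pi*i)) * integral of z**k * conj(z) dz
    k == 0: v_0 = (1/(2*pi*i)) * integral of conj(z) * (log|z|^2 - 1) dz
    Assumed parametrisation: r**2 = t/(1 - 4|t2|^2), u_1 = 2*r*conj(t2).
    """
    r = math.sqrt(t / (1.0 - 4.0 * abs(t2) ** 2))
    u1 = 2.0 * r * complex(t2).conjugate()
    dtheta = 2 * math.pi / n
    total = 0j
    for m in range(n):
        w = cmath.exp(1j * m * dtheta)
        z = r * w + u1 / w
        dz = (r - u1 / w ** 2) * 1j * w * dtheta
        zb = z.conjugate()
        f = zb * (math.log(abs(z) ** 2) - 1.0) if k == 0 else z ** k * zb
        total += f * dz
    return total / (2j * math.pi)


t, t2, h = 1.0, 0.10 + 0.05j, 1e-4

# d v_2 / d t at fixed t_2 (central difference).
dv2_dt = (interior_moment(2, t + h, t2) - interior_moment(2, t - h, t2)) / (2 * h)

# d v_0 / d t_2 at fixed conj(t_2): Wirtinger derivative (d/dx - i d/dy)/2.
dx = (interior_moment(0, t, t2 + h) - interior_moment(0, t, t2 - h)) / (2 * h)
dy = (interior_moment(0, t, t2 + 1j * h) - interior_moment(0, t, t2 - 1j * h)) / (2 * h)
dv0_dt2 = 0.5 * (dx - 1j * dy)
```

The two derivatives agree to within the finite-difference error, as the third relation in (5.2) requires. A short calculation for this family suggests the common value $4t\bar t_2/(1 - 4|t_2|^2)$.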

The very existence of the τ-function is due to the fundamental symmetry of harmonic moments,

$$ \frac{\partial v_j}{\partial t_k} = \frac{\partial v_k}{\partial t_j}, \qquad \frac{\partial v_j}{\partial \bar t_k} = \frac{\partial \bar v_k}{\partial t_j}, \qquad \frac{\partial v_0}{\partial t_k} = \frac{\partial v_k}{\partial t}, \tag{5.2} $$

which follows [2] from the Lax–Sato equations. We prove them below following the lines of Ref. [3]. To prove the first symmetry relation, we notice from (4.4) that $\partial_{t_j}v_k$ is the constant ($z^0$) part of the Laurent series $z^{k+1}\partial_z H_j$ in $z$, i.e., the residue $\mathrm{res}(z^k\,dH_j)$. Then, using the well known property of residues, we find that this is equal to the constant part ($w^0$) of the Laurent series $z^k w\partial_w H_j$ in $w$. Then, using Eq. (4.7), we get:

$$\begin{aligned}
\frac{\partial v_k}{\partial t_j} &= \Big(z^k w\frac{\partial}{\partial w}H_j\Big)_0 = \Big(z^k w\frac{\partial}{\partial w}(z^j)_+\Big)_0 = \Big((z^k)_- w\frac{\partial}{\partial w}(z^j)_+\Big)_0 \\
&= -\Big((z^j)_+ w\frac{\partial}{\partial w}(z^k)_-\Big)_0 = \Big((z^j)_- w\frac{\partial}{\partial w}(z^k)_+\Big)_0 = \frac{\partial v_j}{\partial t_k}.
\end{aligned}$$

This chain of equalities is due to the identity $(f_- w\partial_w g_+)_0 = -(g_+ w\partial_w f_-)_0 = (g_- w\partial_w f_+)_0$ for residues of Laurent series. The proof of the other two symmetry relations is similar.

In fact, the symmetry relations follow from the unitarity condition (2.2) for the Schwarz function. To show this, let us outline another proof which does not rely on the existence of the conformal map. The proof uses only the fact that the variation of $\Omega(z)$ with respect to the harmonic moments $t_j$, if $z$ belongs to the curve, is purely imaginary (see the end of Sect. 3). Using the expansion of $\Omega$ in the series (3.7) and the second equality in (4.2), one concludes that

$$ \oint_{\text{curve}} H_j(z)\,dH_k(z) = 0 \tag{5.3} $$

for all $j, k > 0$. Therefore, we have:

$$\begin{aligned}
\partial_{t_j}v_k &= \frac{1}{2\pi i}\oint_{\text{curve}} z^k\,dH_j(z) \\
&= \frac{1}{2\pi i}\oint_{\text{curve}} H_k(z)\,dH_j(z) + \frac{1}{2\pi i}\oint_{\text{curve}} \Big(\frac{1}{2}\partial_{t_k}v_0 + \sum_{l \ge 1} \partial_{t_k}v_l\,\frac{z^{-l}}{l}\Big)\,dH_j(z) \\
&= \frac{1}{2\pi i}\oint_{\text{curve}} \Big(\frac{1}{2}\partial_{t_k}v_0 + \sum_{l \ge 1} \partial_{t_k}v_l\,\frac{z^{-l}}{l}\Big)\,dz^j = \partial_{t_k}v_j
\end{aligned}$$

and similarly for the other derivatives.

The τ-function determines the Hamiltonians as functions of $z$. Equation (4.4) reads

$$ H_j = z^j - \left( \frac{1}{2}\frac{\partial^2}{\partial t\,\partial t_j} + \sum_{k \ge 1} \frac{z^{-k}}{k}\frac{\partial^2}{\partial t_j\,\partial t_k} \right)\log\tau. $$

The second equation in (3.6) gives the formula for the inverse conformal map $w(z)$ through the τ-function (see (1.6) in the Introduction). Formulas for the coefficients $u_j$ of the direct map $z(w)$ through $\log\tau$ are also available but they have a more complicated structure. The first two are simple: $r = \exp\big(\tfrac{1}{2}\partial_t^2\log\tau\big)$, which is (3.8) combined with (5.1), and $u_0 = \partial^2\log\tau/\partial t\,\partial t_1$.

The τ-function itself obeys the dispersionless limit of the Hirota equation (a leading term of the differential Fay identity [8, 3, 9]):

$$ (z - \zeta)\exp\Big(\sum_{n,m \ge 1} \frac{v_{nm}}{nm}\,z^{-n}\zeta^{-m}\Big) = z\exp\Big(-\sum_{k \ge 1} \frac{v_{0k}}{k}\,z^{-k}\Big) - \zeta\exp\Big(-\sum_{k \ge 1} \frac{v_{0k}}{k}\,\zeta^{-k}\Big). \tag{5.4} $$

This is an infinite set of relations between the second derivatives $v_{nm} = \partial_{t_n}\partial_{t_m}\log\tau$, $v_{0m} = \partial_t\partial_{t_m}\log\tau$ of the τ-function. The equations appear while expanding both sides of (5.4) in powers of $z$ and $\zeta$. In particular, the leading terms as $\zeta \to \infty$ yield

$$ z - v_{01} - \sum_{k \ge 1} \frac{v_{1k}}{k}\,z^{-k} = z\exp\Big(-\sum_{k \ge 1} \frac{v_{0k}}{k}\,z^{-k}\Big). \tag{5.5} $$

This is in fact the relation $H_1 = re^{H_0} + u_0/2$ between the Hamiltonians $H_1$ and $H_0 = \log w$ which follows from definition (4.7). Substituting (5.5) back into (5.4), one can eliminate $v_{0k}$. The resulting relations between $v_{mn}$ with $m, n \ge 1$ are exactly the dispersionless limit of the Hirota equations for the KP hierarchy from [8, 3, 9]. Supplemented by proper initial data satisfying the unitarity condition, Eqs. (5.4) formally solve the local problem of moments (or the 2D potential problem), i.e., reconstruction of the Laurent expansion of $S^{(-)}$.

The proof of (5.4) is similar to the one known in the context of the KP hierarchy [8, 9]. Let

$$ rw = z + \sum_{j=0}^{\infty} p_j z^{-j} \tag{5.6} $$

be the Laurent expansion of the inverse conformal map $w(z)$, where $p_j$ are some coefficients. Multiply both sides by $\zeta^{-k}z^{k-1}(w)$, extract the polynomial part in $w$ and sum over $k \ge 1$. After some rearrangement and using (5.6) for $w(\zeta)$ one gets the relation

$$ r\big(w(\zeta) - w(z)\big)\sum_{k=1}^{\infty} \zeta^{-k-1}\big(z^k(w)\big)_+ = rw(z)\sum_{k=0}^{\infty} \zeta^{-k-1}\big(z^k(w)\big)_0. $$

Using (4.7) and the identity $\partial\log w(z)/\partial z = \sum_{k=0}^{\infty} z^{-k-1}\big(z^k(w)\big)_0$, we rewrite it in the form

$$ 1 + 2\sum_{n=1}^{\infty} \zeta^{-n}H_n(w(z)) = \zeta\,\frac{\partial\log w(\zeta)}{\partial\zeta}\,\frac{w(\zeta) + w(z)}{w(\zeta) - w(z)}. $$

Now, plugging in the equivalent definition (4.4) of $H_n$ and integrating this equality with respect to $\zeta$, one finally arrives at the following important relation:

$$ \sum_{k,n \ge 1} \frac{z^{-k}\zeta^{-n}}{kn}\,v_{kn} = \log\frac{w(z) - w(\zeta)}{z - \zeta} + \log r. \tag{5.7} $$

Because of (1.6) this is the same as (5.4). Let us note, in passing, that the Schwarzian derivative of the inverse map $w(z)$,

$$T(z) = \frac{w'''(z)}{w'(z)} - \frac32\left(\frac{w''(z)}{w'(z)}\right)^2 \tag{5.8}$$

(the classical stress-energy tensor) admits a remarkably simple representation linear in $\log\tau$. Indeed, differentiating both sides of (5.7) with respect to $z$ and $\zeta$, we get

$$\sum_{k,n\ge 1} z^{-k-1}\zeta^{-n-1}\,v_{kn} = \frac{w'(z)\,w'(\zeta)}{(w(z) - w(\zeta))^2} - \frac{1}{(z-\zeta)^2} .$$

A simple calculation shows that the limit of the right-hand side as $\zeta \to z$ is equal to $\tfrac16$ of the Schwarzian derivative (5.8). Therefore, we obtain:

$$T(z) = 6\,z^{-2}\sum_{k,n\ge 1} z^{-k-n}\,\frac{\partial^2\log\tau}{\partial t_k\,\partial t_n} . \tag{5.9}$$
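The limit used to pass from the two-point expression to (5.9) can be tested numerically. The sketch below is only an illustration: the test function $w(z) = z + c/z$, the sample point and the helper names are our own assumptions, not objects from the text. It evaluates the two-point expression symmetrically around $z$ and compares it with one sixth of the Schwarzian derivative (5.8).

```python
def w(z, c):
    """Illustrative test map w(z) = z + c/z (an assumption, chosen for its simple derivatives)."""
    return z + c / z

def w_prime(z, c):
    return 1 - c / z**2

def schwarzian(z, c):
    """T(z) = w'''/w' - (3/2) (w''/w')^2 for w(z) = z + c/z, as in (5.8)."""
    d1 = 1 - c / z**2
    d2 = 2 * c / z**3
    d3 = -6 * c / z**4
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def two_point(z, zeta, c):
    """w'(z) w'(zeta) / (w(z) - w(zeta))^2 - 1 / (z - zeta)^2."""
    return (w_prime(z, c) * w_prime(zeta, c) / (w(z, c) - w(zeta, c)) ** 2
            - 1 / (z - zeta) ** 2)

# Hypothetical sample values; h is the separation between the two points.
z0, c, h = 2.0 + 0.5j, 0.3, 1e-3
approx = two_point(z0 + h / 2, z0 - h / 2, c)
exact = schwarzian(z0, c) / 6
print(abs(approx - exact))
```

For this particular test map both sides can also be computed in closed form, and they equal $-c/(z^2 - c)^2$, so the printed difference only measures the finite separation $h$ and rounding.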

The Hirota equation itself is thus a "split" ($\zeta \neq z$) analogue of (5.9).

6. Conformal Maps as a Reduction of the Dispersionless Toda Lattice Hierarchy

The reader familiar with integrable hierarchies of non-linear differential equations will be able to identify the dynamical system for conformal maps (4.5), (4.6) with the dispersionless limit of the Toda lattice hierarchy [2, 3]. The latter is related to the Whitham hierarchy, the theory of solitons with distinct fast and slow variables. The Whitham hierarchy appears after averaging over fast variables (see [2, 11] and references therein). The dispersionless limit emerges as the genus-zero Whitham hierarchy. Formally, it is a semiclassical limit $\hbar \to 0$ of pseudo-differential (or difference) operators. For the Lax operator of the 2D Toda lattice,

$$L = r(t)\,e^{\hbar\,\partial/\partial t} + \sum_{j=0}^{\infty} u_j(t)\,e^{-j\hbar\,\partial/\partial t}, \tag{6.1}$$

one should replace the difference operator $e^{\hbar\,\partial/\partial t}$ by the canonical variable $w$ with the Poisson bracket $\{\log w, t\} = 1$. The Lax operator then becomes the Lax function $L(w)$ given by a formal Laurent series in $w$. The Lax function is identified with the conformal map $z(w)$ (1.5). The derivatives of $z(w)$ with respect to the times $t_k$ are given by (4.5), which is nothing else than the dispersionless limit of the Lax–Sato equation $\hbar\,\partial L/\partial t_j = [H_j, L]$, where $H_j = (L^j)_+ + \tfrac12 (L^j)_0$. (The coefficient 1/2 is due to a particular choice of gauge in (6.1), where the coefficient in front of the first term is not fixed to be 1.) The mathematical theory of the dispersionless hierarchies constrained by the string equations has been developed in Refs. [2, 14] and extended to the Toda hierarchy in Ref. [3]. For a comprehensive review see, e.g., [11].

In the Toda theory, there are two sets of independent times, $t_i$ and $\tilde t_i$, and two sets of potentials, $u_j$ and $\tilde u_j$. The identification with conformal maps requires them to be complex conjugate, $\tilde t_i = \bar t_i$, $\tilde u_j = \bar u_j$, and the function $r$ to be real. Under the reality conditions the semiclassical limit of the second Lax operator,

$$\bar L = e^{-\hbar\,\partial/\partial t}\,r(t) + \sum_{j=0}^{\infty} e^{j\hbar\,\partial/\partial t}\,\bar u_j(t), \tag{6.2}$$

is identified with $\bar z(w^{-1})$. The reality condition is consistent with the 2D Toda hierarchy and selects a class of solutions relevant to conformal maps. This class is reduced to a unique solution by imposing the string equation $\{z, \bar z\} = 1$, which in the dispersionful case would be $[L, \bar L] = \hbar$. To clarify the origin of the string equation, one needs two more operators. These are the Orlov–Shulman operators [16]

$$M = \sum_{k=1}^{\infty} k\,t_k L^k + t + \sum_{k=1}^{\infty} v_k L^{-k}, \tag{6.3}$$

$$\tilde M = \sum_{k=1}^{\infty} k\,\tilde t_k \tilde L^k + t + \sum_{k=1}^{\infty} \tilde v_k \tilde L^{-k}, \tag{6.4}$$

obeying the conditions $[L, M] = \hbar L$, $[\tilde L, \tilde M] = -\hbar\tilde L$. Then the string equation follows from the relations [4] $\tilde L = L^{-1}M$, $\tilde M = M$. From Sects. 2, 3 it follows that the dispersionless limit of the operator $L^{-1}M$ enjoys a simple geometric interpretation: it is the Schwarz function $S(z)$ of an analytic curve.

7. More Connections and Equivalences

Analytic curves and dispersionless hierarchies. Thus one particular solution of the dispersionless 2D Toda hierarchy describes the evolution of the univalent conformal map of a domain bounded by a simple analytic curve to the exterior of the unit circle. The set of times of the hierarchy appears to be equivalent to the set of harmonic moments of the domain, whereas the conformal map itself is the dispersionless limit of the Lax operator.


This proposition can be reversed: univalent conformal maps of the exterior of the unit circle generate a solution of the dispersionless Toda hierarchy. The solution is selected by the string equation (3.1). Other solutions of the dispersionless Toda hierarchy are selected by more general string equations. The latter are characterized by any two functions $f$ and $g$ forming a canonical pair: $\{f, g\} = f$. Then the general string equations consistent with the hierarchy are $\bar L = f^{-1}(L, M)$, $\bar M = g(L, M)$ (see [3]). We expect that some of them are also relevant to the conformal maps of simply connected domains. It is likely that other types of string equations describe mappings to or from domains other than the exterior of the unit circle, and also non-univalent maps. Moreover, we expect that other integrable hierarchies of hydrodynamic type, for instance the KP hierarchy and its reductions, also describe certain classes of conformal maps and curves other than analytic ones. We plan to address this question elsewhere.

The τ-function for curves. All this suggests introducing a general notion of the τ-function for curves. For simple analytic curves, this is a universal function which determines the curve by means of the Laurent series

$$x^2 + y^2 - t = \mathrm{Re}\sum_{k=1}^{\infty}\left( k\,t_k (x+iy)^k + (x+iy)^{-k}\,\frac{\partial\log\tau}{\partial t_k}\right). \tag{7.1}$$

The dispersionless Hirota equation (5.4) provides a set of differential equations for the τ-function. Again, it has many solutions. The proper solution is selected by the unitarity condition (2.7) and initial data. For instance, if the curve can be reached by small deformations of a circle, one sets $\partial_{t_k}\log\tau = 0$ at all $t_k = 0$.

Moving curves and the c = 1 string theory. There is another intriguing relation (in fact, an equivalence) between deformations of analytic curves and the genus-0 topological sector of 2D gravity coupled to $c = 1$ matter [4]. This follows from the known equivalence between the latter and the dispersionless Toda hierarchy restricted by the so-called $W_{1+\infty}$-constraints [4, 2, 11]. The interpretation of the objects of the genus-0 string theory in terms of analytic curves is straightforward. The positive (negative) momentum tachyon one-point functions $\langle T_n\rangle$ are the moments of the interior of the domain $v_n$ ($\bar v_n$), the partition function of the genus-0 string is the τ-function of analytic curves, the Schwarz function $S(z)$ is the superpotential and, finally, the $W_{1+\infty}$-constraints are essentially equivalent to the string equation (3.1) or the relation (4.11). In fact, they are nothing else than the unitarity condition (2.7). In turn, the $c = 1$ genus-0 string theory, as well as the relevant solution to the dispersionless hierarchy, is known to be equivalent to the planar limit of the two-matrix model (see [5] and references therein). This gives a representation of the τ-function of analytic curves as the $N \to \infty$ limit of the $N \times N$ two-matrix integral $\tau = \int DM\,D\bar M\,\exp\,\mathrm{tr}\,W$, with the potential $W = \sum_k (t_k M^k + \bar t_k \bar M^k) + M\bar M$. We plan to discuss this subject elsewhere. The interpretation of the genus-0 string theory as a simple-minded classical theory of potential may turn out to be fruitful for both subjects.
It is likely that the 2D gravity coupled to c < 1 matter, being described by various reductions of the dispersionless KP hierarchy, also enjoys a geometric interpretation.


Laplacian growth problem. This paper has been stimulated by our interest in the Laplacian growth problem (see [17] for a review). This problem, a source of interesting mathematics and of many important applications, may give further insight into the matters discussed in this paper, so a short introduction is in order. It is the problem of a moving interface between two incompressible liquids with different viscosities on the plane. Suppose the exterior of a simply connected domain is occupied by a viscous fluid (oil), while a less viscous liquid (water) occupies the interior. Water is supplied by a source at $z = 0$, while oil is removed through a sink at infinity, so the interface moves. Experiments and numerical simulations suggest that any smooth interface, regardless of its initial shape, develops a finger-like pattern with universal fractal characteristics. The hydrodynamics of an ideal interface (with zero surface tension) is described by the Darcy law

$$V_n = -\frac{\partial p}{\partial n}, \tag{7.2}$$

where $V_n$ is the velocity of the interface and $\partial p/\partial n$ is the normal derivative of the pressure on the interface. The pressure is constant in the water domain and on the interface, while in the oil domain it obeys the Laplace equation $\nabla^2 p = 0$ with the asymptotic behavior $p \to -\tfrac12\log|z|$ at infinity. The latter indicates a sink at infinity. The Darcy law implies [18] that all harmonic moments $t_k$ of the oil domain remain unchanged while the interface moves, but the area of the water domain grows linearly in time and thus can be identified with the time. (For connections with the inverse potential problem see [19].) The problem then becomes: find the evolution of the domain as a function of the area $t$ at fixed moments $t_k$. This evolution is described by the string equation (3.1). This equation has a long history. It appeared in 1945 in Ref. [20], or even earlier, in the mathematical theory of oil hydrodynamics. In our approach, this equation is the basis of the symplectic structure of the conformal maps. Applications of integrable hierarchies to the Laplacian growth problem are addressed in Ref. [12].

At the time of completion of the manuscript E. Ferapontov informed us that J. Gibbons and S. Tsarev discussed a relation between Benney equations and conformal maps of slit domains [21]. We would like to thank J. Gibbons and S. Tsarev for informing us about their recent paper [22].

Acknowledgements. This work has been inspired by discussions of the Laplacian growth problem with M. Mineev-Weinstein, L. Kadanoff, L. Levitov and B. Shraiman. We acknowledge useful discussions with H. Awata, B. Dubrovin, M. Fukuma, J. Gibbons, V. Kazakov, I. Krichever, S. P. Novikov, A. Orlov and T. Takebe. Some results of this paper were obtained in collaboration with M. Mineev-Weinstein [12]. We also thank I. Krichever and T. Takebe for teaching us dispersionless hierarchies. P. W. would like to thank M. Ninomiya for his hospitality at the Yukawa Institute for Theoretical Physics in Spring 1999, where this work was started. We also thank K. Ishikawa for his hospitality at Hokkaido University. This work was partially supported by the Grant-in-aid for International Science Research (Joint Research 07044048) from the Ministry of Education, Science, Sports and Culture, Japan. The work was completed during the workshop "Applications of Integrability" at the Erwin Schrödinger Institute in Vienna in September 1999. We have been supported by grants NSF DMR 9971332 and MRSEC NSF DMR 9808595. A.Z. was partially supported by the Russian Foundation of Basic Research, grant 98-01-00344.

Appendix: Ellipse Growing From a Circle

Here we demonstrate how the Lax–Sato equations describe an ellipse growing from a circle. It is the simplest but nevertheless instructive example, which in particular allows one to compute the τ-function at all $t_k = 0$ except $t, t_1, \bar t_1, t_2, \bar t_2$. Consider an ellipse with half-axes $a, b$ centered at $z_0 = x_0 + iy_0$ and rotated by the angle $\alpha$:

$$\frac{\big(\cos\alpha\,(x - x_0) - \sin\alpha\,(y - y_0)\big)^2}{a^2} + \frac{\big(\sin\alpha\,(x - x_0) + \cos\alpha\,(y - y_0)\big)^2}{b^2} = 1 .$$

The Schwarz function of the ellipse is

$$S(z) = \bar z_0 + e^{2i\alpha}\,\frac{a^2 + b^2}{a^2 - b^2}\,(z - z_0) - \frac{2ab\,e^{2i\alpha}}{a^2 - b^2}\sqrt{(z - z_0)^2 - e^{-2i\alpha}(a^2 - b^2)} .$$

The Laurent series of the Schwarz function (2.5),

$$S(z) = 2t_2 z + t_1 + \frac{t}{z} + \frac{v_1}{z^2} + \frac{v_2}{z^3} + \dots,$$

gives the moments of the exterior and the interior. The only nonzero moments of the exterior are $t_1$, $t_2$ and their complex conjugates:

$$2t_2 = e^{2i\alpha}\,\frac{a - b}{a + b}, \qquad t_1 = \bar z_0 - 2t_2 z_0, \qquad t = ab .$$

In contrast, none of the moments of the interior vanishes. The first two are

$$v_1 = \frac{t\,(\bar t_1 + 2t_1\bar t_2)}{1 - 4t_2\bar t_2}, \qquad v_2 = \frac{t\,(\bar t_1 + 2t_1\bar t_2)^2}{(1 - 4t_2\bar t_2)^2} + \frac{2t^2\,\bar t_2}{1 - 4t_2\bar t_2} .$$

Adding $\partial_t v_0 = 2\log r$, one may check the symmetry relations (5.2) and find the τ-function for the ellipse:

$$\log\tau = \frac12\,t^2\log t - \frac34\,t^2 - \frac12\,t^2\log(1 - 4t_2\bar t_2) + t\,\frac{t_1\bar t_1 + t_1^2\,\bar t_2 + \bar t_1^2\,t_2}{1 - 4t_2\bar t_2} .$$
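The relations between $\log\tau$ and the moments of the ellipse can be checked by finite differences. The following is a rough sketch under our own assumptions: the conjugate times are treated as independent complex variables (named `t1b`, `t2b` here), the sample values are arbitrary, and the helper names are hypothetical. It tests $\partial_{t_1}\log\tau = v_1$, $\partial_{t_2}\log\tau = v_2$ and $\partial_t^2\log\tau = 2\log r = \log\big(t/(1 - 4t_2\bar t_2)\big)$.

```python
import cmath

def log_tau(t, t1, t1b, t2, t2b):
    """log tau of the ellipse; t1b and t2b stand for the conjugate times."""
    d = 1 - 4 * t2 * t2b
    return (0.5 * t**2 * cmath.log(t) - 0.75 * t**2 - 0.5 * t**2 * cmath.log(d)
            + t * (t1 * t1b + t1**2 * t2b + t1b**2 * t2) / d)

def first_diff(f, args, i, h=1e-6):
    """Central first difference of f with respect to argument number i."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def second_diff_t(f, args, h=1e-3):
    """Central second difference of f with respect to its first argument t."""
    up, dn = list(args), list(args)
    up[0] += h
    dn[0] -= h
    return (f(*up) - 2 * f(*args) + f(*dn)) / h**2

# Arbitrary sample point with |t2| < 1/2, so that 1 - 4 t2 conj(t2) > 0.
t2 = 0.15 + 0.1j
t1 = 0.3 - 0.2j
args = [1.7, t1, t1.conjugate(), t2, t2.conjugate()]
t, _, t1b, _, t2b = args
d = 1 - 4 * t2 * t2b

v1 = t * (t1b + 2 * t1 * t2b) / d
v2 = t * (t1b + 2 * t1 * t2b) ** 2 / d**2 + 2 * t**2 * t2b / d
err_v1 = abs(first_diff(log_tau, args, 1) - v1)
err_v2 = abs(first_diff(log_tau, args, 3) - v2)
err_r = abs(second_diff_t(log_tau, args) - cmath.log(t / d))
print(err_v1, err_v2, err_r)
```

The three printed numbers are discretization errors only; they should shrink further if the step sizes are tuned.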

Let us note that this function (for $t_1 = 0$ and $t_2$ real) was obtained in [23] as the free energy of a classical Coulomb gas in an external field. The Laurent series for the conformal map from the exterior of the unit circle to the exterior of the ellipse is truncated: $z(w) = rw + u_0 + u_1 w^{-1}$. The coefficients of the conformal map are:

$$r^2 = \frac14 (a + b)^2 = \frac{t}{1 - 4t_2\bar t_2}, \qquad u_1^2 = \frac14\,e^{-4i\alpha}(a - b)^2 = \frac{4t\,\bar t_2^2}{1 - 4t_2\bar t_2}, \qquad u_0 = z_0 = \frac{\bar t_1 + 2t_1\bar t_2}{1 - 4t_2\bar t_2} .$$

The first two Hamiltonians are: $H_1 = rw + \tfrac12 u_0$, $H_2 = r^2w^2 + 2ru_0 w + ru_1 + \tfrac12 u_0^2$. Higher flows deform the ellipse. The Lax–Sato equations plus the string equation (3.1) and their conjugates describe how the ellipse grows from the circle.
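The ellipse data above can be cross-checked numerically. The sketch below rests on assumptions of ours that are not spelled out in this Appendix: the interior moments are normalized as $v_k = \frac{1}{2\pi i}\oint z^k\,\bar z\,dz$ (consistent with $t = ab$), the root $u_1 = 2r\bar t_2$ is chosen for $u_1^2$, and all function names and sample values are hypothetical. It maps the unit circle by $z(w) = rw + u_0 + u_1/w$, tests the ellipse equation, and compares $v_1$, $v_2$ with the formulas of the Appendix.

```python
import cmath
import math

def exterior_moments(a, b, alpha, z0):
    """t, t1, t2 of the ellipse, from the formulas of the Appendix."""
    t2 = 0.5 * cmath.exp(2j * alpha) * (a - b) / (a + b)
    t1 = z0.conjugate() - 2 * t2 * z0
    return a * b, t1, t2

def map_coefficients(t, t1, t2):
    """r, u0, u1 of the truncated map z(w) = r w + u0 + u1 / w."""
    d = 1 - 4 * abs(t2) ** 2
    r = math.sqrt(t / d)
    u0 = (t1.conjugate() + 2 * t1 * t2.conjugate()) / d
    u1 = 2 * r * t2.conjugate()  # one root of u1^2 = 4 t conj(t2)^2 / d (sign is our choice)
    return r, u0, u1

def interior_moment(k, r, u0, u1, n=256):
    """v_k = (1 / 2 pi i) * contour integral of z^k conj(z) dz (trapezoid rule in the angle)."""
    total = 0j
    for m in range(n):
        w = cmath.exp(2j * math.pi * m / n)
        z = r * w + u0 + u1 / w
        dz_dw = r - u1 / w**2
        total += z**k * z.conjugate() * dz_dw * w
    return total / n

a, b, alpha, z0 = 2.0, 1.0, 0.3, 0.4 - 0.2j   # arbitrary sample ellipse
t, t1, t2 = exterior_moments(a, b, alpha, z0)
r, u0, u1 = map_coefficients(t, t1, t2)
d = 1 - 4 * abs(t2) ** 2

# Largest deviation of the image of the unit circle from the ellipse equation.
ellipse_defect = 0.0
for m in range(16):
    w = cmath.exp(2j * math.pi * m / 16)
    e = cmath.exp(1j * alpha) * (r * w + u0 + u1 / w - z0)
    ellipse_defect = max(ellipse_defect, abs(e.real**2 / a**2 + e.imag**2 / b**2 - 1))

v1_formula = t * (t1.conjugate() + 2 * t1 * t2.conjugate()) / d
v2_formula = t * (t1.conjugate() + 2 * t1 * t2.conjugate()) ** 2 / d**2 + 2 * t**2 * t2.conjugate() / d
err_v1 = abs(interior_moment(1, r, u0, u1) - v1_formula)
err_v2 = abs(interior_moment(2, r, u0, u1) - v2_formula)
print(ellipse_defect, err_v1, err_v2)
```

Because the integrand is a trigonometric polynomial on the unit circle, the trapezoid rule is exact up to rounding here, so all three printed numbers should be at the level of machine precision.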


References

1. Dubrovin, B.A. and Novikov, S.P.: Soviet Math. Dokl. 27, 665–669 (1977); Tsarev, S.P.: Soviet Math. Dokl. 31, 488–491 (1985)
2. Krichever, I.M.: Funct. Anal. Appl. 22, 200–213 (1989); Krichever, I.M.: Commun. Math. Phys. 143, 415–429 (1992); Krichever, I.M.: Comm. Pure Appl. Math. 47, 437–476 (1992)
3. Takasaki, K. and Takebe, T.: Rev. Math. Phys. 7, 743–808 (1995)
4. Dijkgraaf, R., Moore, G. and Plesser, R.: Nucl. Phys. B 394, 356–382 (1993); Hanany, A., Oz, Y. and Plesser, R.: Nucl. Phys. B 425, 150–172 (1994); Takasaki, K.: Commun. Math. Phys. 170, 101–116 (1995); Eguchi, T. and Kanno, H.: Phys. Lett. B 331, 330 (1994)
5. Daul, J.M., Kazakov, V.A. and Kostov, I.K.: Nucl. Phys. B 409, 311–338 (1993); Bonora, L. and Xiong, C.S.: Phys. Lett. B 347, 41–48 (1995)
6. Strakhov, V. and Brodsky, M.: SIAM J. Appl. Math. 46, 324–344 (1986)
7. Novikov, P.S.: Soviet Math. Dokl. 18, 165–168 (1938); Sakai, M.: Proc. Amer. Math. Soc. 70, 35–38 (1978)
8. Gibbons, J. and Kodama, Y.: Phys. Lett. A 135, 167–170 (1989); Gibbons, J. and Kodama, Y.: Proceedings of NATO ASI, "Singular Limits of Dispersive Waves", ed. N. Ercolani, London–New York: Plenum, 1994
9. Carroll, R. and Kodama, Y.: Int. J. Mod. Phys. A28, 6373–6388 (1995)
10. Dijkgraaf, R. and Witten, E.: Nucl. Phys. B 342, 486–522 (1990); Losev, A. and Polyubin, I.: Int. J. Mod. Phys. A 10, 4161–4178 (1995); Aoyama, S. and Kodama, Y.: Commun. Math. Phys. 182, 185–220 (1996)
11. Dubrovin, B.: In: Montecatini Terme 1993, Integrable systems and quantum groups, 120–348, e-Print Archive: hep-th/9407018
12. Mineev-Weinstein, M., Wiegmann, P.B. and Zabrodin, A.: LAUR-99-0703, submitted to Phys. Rev. Lett.
13. Davis, P.J.: The Schwarz function and its applications. The Carus Mathematical Monographs, No. 17, Buffalo, N.Y.: The Math. Association of America, 1974
14. Dubrovin, B.A.: Commun. Math. Phys. 145, 195–207 (1992)
15. Sakai, M.: J. Analyse Math. 40, 144–154 (1980)
16. Orlov, A. and Shulman, E.: Lett. Math. Phys. 12, 171–179 (1986)
17. Bensimon, D., Kadanoff, L.P., Liang, S., Shraiman, B.I. and Tang, C.: Rev. Mod. Phys. 58, 977 (1986)
18. Richardson, S.: J. Fluid Mech. 56, 609–618 (1972)
19. Etingof, P. and Varchenko, A.: Why does the boundary of a round drop becomes a curve of order four. University Lecture Series, 3, Providence, RI: American Mathematical Society, 1992
20. Galin, L.A.: Dokl. Akad. Nauk SSSR 47, 250–253 (1945); Polubarinova-Kochina, P.Ya.: Dokl. Akad. Nauk SSSR 47, 254–257 (1945); Kufarev, P.P.: Dokl. Akad. Nauk SSSR 57, 335–348 (1947)
21. Gibbons, J. and Tsarev, S.P.: Phys. Lett. A 211, 19–24 (1996)
22. Gibbons, J. and Tsarev, S.P.: Phys. Lett. A 258, 263–271 (1999)
23. Di Francesco, P., Gaudin, M., Itzykson, C. and Lesage, F.: Int. J. Mod. Phys. A9, 4257–4351 (1994)

Communicated by T. Miwa

Commun. Math. Phys. 213, 539 – 574 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Spin Chain Models with Spectral Curves from M Theory

I. Krichever 1,2, D. H. Phong 1

1 Department of Mathematics, Columbia University, New York, NY 10027, USA.

E-mail: [email protected]; [email protected]

2 Landau Institute of Theoretical Physics, Kosygina str. 2, 117940 Moscow, Russia

Received: 22 December 1999 / Accepted: 3 March 2000

Abstract: We construct the integrable model corresponding to the N = 2 supersymmetric SU(N) gauge theory with matter in the antisymmetric representation, using the spectral curve found by Landsteiner and Lopez through M Theory. The model turns out to be the Hamiltonian reduction of an (N + 2)-periodic spin chain model, which is Hamiltonian with respect to the universal symplectic form we had constructed earlier for general soliton equations in the Lax or Zakharov–Shabat representation.

Supported in part by the National Science Foundation under grant DMS-98-02577

Supported in part by the National Science Foundation under grant DMS-98-00783

1. Introduction

The main goal of this paper is to construct the integrable model which corresponds to the N = 2 SUSY SU(N) Yang–Mills theory with a hypermultiplet in the antisymmetric representation. The 1994 work of Seiberg and Witten [1] showed that the Wilson effective action of N = 2 SUSY Yang–Mills theory is determined by a fibration of spectral curves equipped with a meromorphic one-form $d\lambda$, now known as the Seiberg–Witten differential. It was recognized soon afterwards [2–4] that this set-up is indicative of an underlying integrable model, with the vacuum moduli of the Yang–Mills theory corresponding to the action variables of the integrable model. In fact, in the special case of hyperelliptic curves, a similar set-up for the construction of action variables as periods of a meromorphic differential had been introduced in [5]. This unexpected relation between N = 2 Yang–Mills theories on the one hand and integrable models on the other has proven to be very beneficial for both sides. The Seiberg–Witten differential has led to a universal symplectic form for soliton equations in the Lax or Zakharov–Shabat representation [6, 7]. The connection with integrable models has helped solve the SU(N) Yang–Mills theory with a hypermultiplet in the adjoint representation [4, 8], as well as pure Yang–Mills theories with arbitrary simple gauge groups G [3]. Conversely, the connection with Yang–Mills theories has led to new integrable models, such as the twisted Calogero–Moser systems associated with Yang–Mills theories with non-simply laced gauge group and matter in the adjoint representation [9], and the elliptic analog of the Toda lattice [10].¹

Despite all these successes, we still do not know at this moment how to identify or construct the correct integrable model corresponding to a given Yang–Mills theory. This is a serious drawback, since the integrable model can be instrumental in investigating key physical issues such as duality, the renormalization group, or instanton corrections [13–15]. At the same time, the list of spectral curves continues to grow, thanks in particular to methods from M theory [16, 17] and geometric engineering [18]. It seems urgent to develop methods which can identify the correct integrable model from a given spectral curve and Seiberg–Witten differential.

In the case of interest in this paper, namely the SU(N) gauge theory with antisymmetric matter, the Seiberg–Witten differential and spectral curve had been found by Landsteiner and Lopez [17] using branes and M theory. The Seiberg–Witten differential $d\lambda$ is given by

$$d\lambda = x\,\frac{dy}{y} . \tag{1.1}$$

The spectral curve is of the form

$$y^3 - \Big(3\Lambda^{N+2} + x^2\sum_{i=0}^{N} u_i x^i\Big)\,y^2 + \Big(3\Lambda^{N+2} + x^2\sum_{i=0}^{N} (-)^i u_i x^i\Big)\,\Lambda^{N+2}\,y - \Lambda^{3(N+2)} = 0, \tag{1.2}$$

where $\Lambda$ is a renormalization scale. For the SU(N) gauge theories, one restricts to $u_N = 1$, $u_{N-1} = 0$, so that the moduli dimension is $N - 1$, which is the rank of the gauge group SU(N). The Landsteiner–Lopez curve (1.2) and differential (1.1) have been studied extensively by Ennes, Naculich, Rhedin, and Schnitzer [20]. In particular, they have verified that the curve and differential do reproduce the correct perturbative behavior of the prepotential predicted by asymptotic freedom.

The problem which we wish to address here is that of finding a dynamical system which is integrable in the sense that it admits a Lax pair, and which corresponds to the Landsteiner–Lopez curve and Seiberg–Witten differential (1.1) in the sense that its spectral curve $\Gamma$ is of the form (1.2), and its action variables are the periods of $d\lambda$ along $N - 1$ suitable cycles on $\Gamma$. We have succeeded in constructing two integrable spin chain models whose spectral curves are given exactly by the Landsteiner–Lopez curves. However, the action variables of the desired integrable model must be given by $d\lambda = x\,\frac{dy}{y}$, and here the two models differ significantly. For one model, referred to as the odd divisor spin model, the 2-form resulting from $d\lambda$ vanishes identically. For the other, referred to as the even divisor spin model, the Hamiltonian reduction of the 2-form resulting from $d\lambda$ to the moduli space of vacua $\{u_N = 1, u_{N-1} = 0\}$ is non-degenerate, and the reduced system is indeed Hamiltonian with respect to this symplectic form, with Hamiltonian $H = u_{N-2}$. Thus the latter model is the integrable system we are looking for.

¹ We refer to [11, 12] for more complete lists of references.


Our main result is as follows². Let $q_n$, $p_n$ be 3-dimensional vectors which are $(N+2)$-periodic, i.e. $p_{n+N+2} = p_n$, $q_{n+N+2} = q_n$, and satisfy the constraints

$$p_n^T q_n = 0, \tag{1.3}$$

$$p_n = g_0\,p_{-n-1}, \qquad q_n = g_0\,q_{-n-1}, \tag{1.4}$$

where $g_0$ is the diagonal matrix

$$g_0 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.5}$$

Consider the dynamical system

$$\dot p_n = \frac{p_{n+1}}{p_{n+1}^T q_n} + \frac{p_{n-1}}{p_{n-1}^T q_n} + \mu_n p_n, \qquad \dot q_n = -\frac{q_{n+1}}{p_n^T q_{n+1}} - \frac{q_{n-1}}{p_n^T q_{n-1}} - \mu_n q_n \tag{1.6}$$

for some scalar functions $\mu_n(t)$. The system is invariant under the gauge group $G$ generated by the following gauge transformations:

$$p_n \to \lambda_n p_n, \qquad q_n \to \lambda_n^{-1} q_n, \tag{1.7}$$

$$p_n \to W^T p_n, \qquad q_n \to W^{-1} q_n . \tag{1.8}$$

Here $W$ is a $3\times 3$ matrix which commutes with $g_0$, $W g_0 = g_0 W$. Define the $3\times 3$ matrices $L(x)$ and $M(x)$ by

$$L(x) = \prod_{n=0}^{N+1}\big(1 + x\,q_n p_n^T\big), \qquad M(x) = x\left(\frac{q_{N+1}\,p_0^T}{p_0^T q_{N+1}} - \frac{q_0\,p_{N+1}^T}{p_{N+1}^T q_0}\right). \tag{1.9}$$

Main Theorem.

• The dynamical system (1.6) is equivalent to the Lax equation

$$\dot L(x) = [M(x), L(x)]; \tag{1.10}$$

• The spectral curves $\Gamma = \{R(x, y) \equiv \det\big(yI - L(x)\big) = 0\}$ are invariant under the flow (1.6), and are exactly the curves of the Landsteiner–Lopez form (1.2) (with $\Lambda^{N+2}$ normalized to 1);

• There is a natural map $(q_n, p_n) \to (\Gamma, D)$ from the space of all spin chains satisfying the constraints (1.3), (1.4) to the space of pairs $(\Gamma, D)$, where $\Gamma$ is a Landsteiner–Lopez curve, and $D = \{z_1, \dots, z_{2N+1}\}$ is a divisor whose class $[D] = [D^\sigma]$ is symmetric under the involution

$$\sigma: \quad (x, y) = z \;\to\; z^\sigma = (-x, y^{-1}). \tag{1.11}$$

For a given $(q_n, p_n)$, $D$ is the set of poles of the Bloch function $\psi_0$, $L(x)\psi_0 = y\,\psi_0(x)$;

• Let $M_0$ be the space of pairs $\{\Gamma, [D]\}$, where $\Gamma$ is a Landsteiner–Lopez curve with $u_N = 1$, $u_{N-1} = 0$, and $[D]$ is a divisor class which is symmetric under the involution $\sigma$. Then the space $M_0$ has dimension $2(N-1)$. The map $(q_n, p_n) \to (\Gamma, D)$ descends to a map between the two spaces

$$\{(q_n, p_n)\}/G \;\leftrightarrow\; M_0, \tag{1.12}$$

where on the left-hand side we have factored out the gauge group $G$ from the space of periodic spin chains satisfying the constraints (1.3), (1.4). At a generic curve $\Gamma$ and a divisor $[D]$ in general position, the map (1.12) is a local isomorphism;

• Let the action variables $a_i$ and the angle variables $\phi_i$ be defined on the space $M_0$ by

$$a_i = \oint_{A_i} d\lambda, \qquad \phi_i = \sum_{j=1}^{2N+1}\int^{z_j} d\omega_i, \tag{1.13}$$

where $\{A_i\}_{1\le i\le N-1}$ and $\{d\omega_i\}_{1\le i\le N-1}$ are respectively a basis for the even cycles and a basis for the even holomorphic differentials on $\Gamma$. Then

$$\omega = \sum_{i=1}^{N-1}\delta a_i \wedge \delta\phi_i \tag{1.14}$$

defines a symplectic form on the $2(N-1)$-dimensional space $M_0$;

• The dynamical system (1.6) is Hamiltonian with respect to the symplectic form (1.14). The Hamiltonian is $H = u_{N-2}$. In terms of the $(q_n, p_n)$ dynamical variables, the Hamiltonian can be expressed in the form

$$H = \frac{u_{N-2}}{u_N} - \frac{u_{N-1}^2}{2u_N^2} = \sum_{n=0}^{N+1}\left[\frac{p_n^T q_{n-3}}{(p_n^T q_{n-1})(p_{n-1}^T q_{n-2})(p_{n-2}^T q_{n-3})} - \frac{(p_n^T q_{n-2})^2}{2\,(p_n^T q_{n-1})^2 (p_{n-1}^T q_{n-2})^2}\right], \tag{1.15}$$

where we have used the constraint $u_N = 1$, $u_{N-1} = 0$ to write $H$ as $H = \frac{u_{N-2}}{u_N} - \frac{u_{N-1}^2}{2u_N^2}$.

² The notation is explained in greater detail in Sects. 3 and 5.

We would like to note the similarity of the Lax matrix $L$ in (1.9) to the $2\times 2$ Lax matrix used in [21] for the integration of a quasi-classical approximation to a system of reggeons in QCD. A key tool in our analysis is the construction of [6, 7], which shows that symplectic forms constructed in terms of Seiberg–Witten differentials can also be constructed directly in terms of the Lax representation of integrable models. The latter are given by the following universal formula [6, 7]:

$$\omega = \frac12\sum_{\alpha}\mathrm{Res}_{P_\alpha}\;\psi^*_{n+1}\,\delta L_n(x)\wedge\delta\psi_n\,dx, \tag{1.16}$$

where $\psi_n$ and $\psi^*_{n+1}$ are the Bloch and dual Bloch functions of the system, and $P_\alpha$ are marked punctures on the spectral curve $\Gamma$. In the present case, $P_\alpha$ are the 3 points on $\Gamma$ above $x = \infty$.


Finally, we note that the odd divisor spin model (which we describe in Sects. 3.1 and 6) may be of independent interest. Although the symplectic form associated to the Seiberg–Witten differential $x\,\frac{dy}{y}$ is degenerate in this case, the model does admit a Hamiltonian structure with non-degenerate symplectic form, but one which is associated rather with the form $d\lambda^{(1)} = \ln y\,\frac{dx}{x}$. As suggested in [19], the form $\ln y\,\frac{dx}{x}$ is also indicative of supersymmetric Yang–Mills theories, but in 5 or 6 dimensions with N = 1 supersymmetry.

2. Geometry of the Landsteiner–Lopez Curve

We begin by identifying the geometric features of the generic Landsteiner–Lopez curve which will play an important role in the sequel. Fixing the normalization $\Lambda^{N+2} = 1$, we can write

$$\Gamma: \quad R(x, y) \equiv y^3 - f(x)\,y^2 + f(-x)\,y - 1 = 0, \tag{2.1}$$

where $f(x)$ is a polynomial of the form

$$f(x) = 3 + x^2 P_N(x), \qquad P_N(x) = \sum_{i=0}^{N} u_i x^i . \tag{2.2}$$

The parameters $u_0, \dots, u_N$ are the moduli of the Landsteiner–Lopez curve.

• The Landsteiner–Lopez curve is a three-fold covering of the complex plane in the $x$ variable. It is invariant under the involution $\sigma$ defined in (1.11). The important points on $\Gamma$ are the singular points, the points above $x = \infty$, and the branch points. We now discuss all these points in turn.

• The singular points are the points where

$$\partial_x R(x, y) = \partial_y R(x, y) = 0 . \tag{2.3}$$

The generic Landsteiner–Lopez curve has exactly one singular point, namely $(x, y) = (0, 1)$. At this point, Eq. (2.1) has a triple root, and all three sheets of the curve intersect. For generic values of the moduli $u_i$, all three solutions $y$ of $R(x, y) = 0$ can be expressed as power series in $x$ in a neighborhood of $x = 0$,

$$y(x) = 1 + \sum_{i=1}^{\infty} y_i x^i . \tag{2.4}$$

In fact, we can substitute (2.4) into (2.1) to find recursively all coefficients $y_i$, with the first coefficient $y_1$ a solution of

$$y_1^3 - u_0 y_1 + 2u_1 = 0 . \tag{2.5}$$

For generic $u_0$, $u_1$, this equation does admit three distinct solutions for $y_1$, which lead in turn to the three distinct solutions. These three distinct solutions provide effectively a smooth resolution of the curve $\Gamma$, where the crossing point $y = 1$ above $x = 0$ has been separated into 3 distinct points $Q_\alpha$, $1 \le \alpha \le 3$. Under the involution $\sigma$, the leading terms in the three solutions (2.4) transform as

$$(x,\; 1 + y_1 x + \cdots) \;\to\; \big(-x,\; (1 - y_1 x + \cdots)^{-1}\big) = (-x,\; 1 + y_1 x + \cdots). \tag{2.6}$$


Since the three solutions $y_1$ of Eq. (2.5) are distinct for generic values of the moduli $u_i$, we see that each of the three points $Q_\alpha$ above $x = 0$ is fixed under the involution $\sigma$.

• For generic values of the moduli $u_i$, there are also three distinct branches of $y(x)$ near $x = \infty$. A first branch $y(x) = O(x^{N+2})$ with a pole of order $N + 2$ can be readily found:

$$y(x) = x^{N+2}\big(u_N + u_{N-1} x^{-1} + u_{N-2} x^{-2} + \cdots\big). \tag{2.7}$$

(The first three coefficients in $y(x)$ turn out to be exactly the first three coefficients $u_N$, $u_{N-1}$ and $u_{N-2}$ in the polynomial $P_N(x)$ of (2.2).) We denote by $P_1$ the corresponding point above $x = \infty$. In view of the involution $\sigma$, a second branch $y(x) = O(x^{-(N+2)})$ with a zero of order $N + 2$ exists, which is the image of the first branch under $\sigma$:

$$y(x) = (-x)^{-(N+2)}\,\frac{1}{u_N}\left(1 + \frac{u_{N-1}}{u_N}\,x^{-1} + \frac{u_{N-1}^2 - u_N u_{N-2}}{u_N^2}\,x^{-2} + \cdots\right). \tag{2.8}$$

The corresponding point above $x = \infty$ is denoted $P_3$. Finally, the involution $\sigma$ implies that the third branch $y(x)$ is regular and fixed under $\sigma$:

$$y(x) = (-)^{N+2}\left(1 + O\Big(\frac1x\Big)\right). \tag{2.9}$$

Denoting the corresponding point above $x = \infty$ by $P_2$, we have

$$\sigma: P_1 \leftrightarrow P_3, \qquad \sigma: P_2 \leftrightarrow P_2 . \tag{2.10}$$

• The branch points of $\Gamma$ over the $x$-plane are just the zeroes on $\Gamma$ of the function $\partial_y R(x, y)$ which are different from the singular points $Q_\alpha$. This function has a pole of order $2(N + 2)$ at $P_1$ and a pole of order $(N + 2)$ at each of the points $P_2$ and $P_3$. Therefore, it has $4N + 8$ zeros. At each of the points $Q_\alpha$ the function $\partial_y R(x, y)$ has a zero of order 2. Hence

$$\#\{\text{Branch Points}\} = 4N + 2 . \tag{2.11}$$

Note that for generic moduli $u_i$, neither 0 nor $\infty$ is a branch point, in view of our previous discussion. Also for generic $u_i$, we can assume that the ramification index at all branch points is 2. Thus the total branching number is just the number of branch points. Since the number of sheets is 3, the Riemann–Hurwitz formula can be written as $g(\Gamma) = -3 + \tfrac12(4N + 2) + 1$ in this case. Thus the genus $g(\Gamma)$ of the curve $\Gamma$ is

$$g(\Gamma) = 2N - 1 . \tag{2.12}$$

• For generic moduli $u_i$, the involution $\sigma: \Gamma \to \Gamma$ has exactly four fixed points, namely the three points $Q_\alpha$ above $x = 0$ and the point $P_2$ above $x = \infty$. This implies that the factor-curve $\Gamma/\sigma$ has genus

$$g(\Gamma/\sigma) = N - 1 . \tag{2.13}$$

The involution $\sigma$ induces an involution of the Jacobian variety $J(\Gamma)$ of $\Gamma$. The odd part $J^{\mathrm{Pr}}(\Gamma)$ of $J(\Gamma)$ is the Prym variety, and the even part is isogenic to the Jacobian $J(\Gamma/\sigma)$ of the factor-curve $\Gamma/\sigma$. The dimension of the space of divisors $[D]$ which are even under $\sigma$ is equal to $\dim J(\Gamma/\sigma) = N - 1$.
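The genus of the factor-curve can be obtained in one line from the Riemann–Hurwitz formula. As a short worked step, using only the counts stated above (the two-sheeted covering $\Gamma \to \Gamma/\sigma$ is branched exactly at the four fixed points of $\sigma$, and $g(\Gamma) = 2N - 1$):

$$2g(\Gamma) - 2 = 2\,\big(2g(\Gamma/\sigma) - 2\big) + 4 \quad\Longrightarrow\quad 4N - 4 = 4\,g(\Gamma/\sigma) \quad\Longrightarrow\quad g(\Gamma/\sigma) = N - 1 .$$

Correspondingly, the odd (Prym) part has dimension $g(\Gamma) - g(\Gamma/\sigma) = N$.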


3. The Spin Models We introduce two systems with the same family of spectral curves (2.1). One system has non-trivial dynamics along the even while the other system has non-trivial dynamics along the odd (Prym) directions of the Jacobian. The system corresponding to the SU (N ) Yang–Mills theory with a hypermultiplet in the anti-symmetric representation is the even system. We sketch here the outline of the construction of both models, leaving the full discussion to Sects. 4-5. Both models are periodic spin chain models, with a 3-dimensional complex vector at each site. We view three-dimensional vectors s as column vectors, with components sα , 1 ≤ α ≤ 3. We denote by s T the transpose of s, which is then a three-dimensional row vector, with components s α . In particular, s T s is a scalar, while ss T is a 3 × 3 matrix. Since the odd divisor spin model is simpler, we begin with it. 3.1. The odd divisor spin model. The odd divisor spin model is a (N +2)-periodic chain of complex three-dimensional vectors sn = sN+n+2 , sn = (sn,α ), α = 1, 2, 3, subject to the constraint snT sn =

3

snα sn,α = 0,

(3.1)

sn+1 sn−1 − T . T sn+1 sn sn−1 sn

(3.2)

α=1

and the following equations of motion: s˙n =

The constraint (3.1) and the equations of motion are invariant under transformation of the spin chain by a matrix $V$ satisfying the condition $V^T V = I$,

$$s_n \to V s_n. \tag{3.3}$$

The odd divisor spin model is integrable in the sense that the equations of motion are equivalent to a Lax pair. To see this, we define the $3 \times 3$ matrices $L_n(x)$ and $M_n(x)$ by

$$L_n(x) = 1 + x\, s_n s_n^T, \tag{3.4}$$

$$M_n(x) = \frac{x}{s_n^T s_{n-1}}\,\big(s_{n-1} s_n^T + s_n s_{n-1}^T\big). \tag{3.5}$$

Then the compatibility condition for the system of equations

$$\psi_{n+1} = L_n(x)\psi_n, \tag{3.6}$$

$$\dot\psi_n = M_n(x)\psi_n \tag{3.7}$$

is given by

$$\dot L_n(x) = M_{n+1}(x)L_n(x) - L_n(x)M_n(x). \tag{3.8}$$


I. Krichever, D. H. Phong

A direct calculation shows that for $L_n(x)$ and $M_n(x)$ defined as in (3.4) and (3.5), this equation is equivalent to the equations of motion (3.2) for the spin model. Define now the monodromy matrix $L(x)$ by

$$L(x) = L_{N+1}(x) \cdots L_0(x) = \prod_{n=0}^{N+1} L_n(x), \tag{3.9}$$

where the ordering in the product on the right-hand side starts by convention with the lowest indices on the right. Then $L(x)$ and $M(x) = M_0(x)$ themselves form a Lax pair

$$\dot L(x) = [M(x), L(x)]. \tag{3.10}$$

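This is easily verified using (3.8), since

$$\dot L(x) = \sum_{k=0}^{N+1} \Big(\prod_{n=k+1}^{N+1} L_n(x)\Big)\, \dot L_k\, \Big(\prod_{n=0}^{k-1} L_n(x)\Big) \tag{3.11}$$

$$= \sum_{k=0}^{N+1} \Big(\prod_{n=k+1}^{N+1} L_n(x)\Big)\,(M_{k+1} L_k - L_k M_k)\,\Big(\prod_{n=0}^{k-1} L_n(x)\Big) \tag{3.12}$$

$$= \sum_{k=0}^{N+1} \Big(\prod_{n=k+1}^{N+1} L_n(x)\Big)\, M_{k+1}\, \Big(\prod_{n=0}^{k} L_n(x)\Big) - \sum_{k=0}^{N+1} \Big(\prod_{n=k}^{N+1} L_n(x)\Big)\, M_k\, \Big(\prod_{n=0}^{k-1} L_n(x)\Big) \tag{3.13}$$

$$= M_{N+2}\, L(x) - L(x)\, M_0(x). \tag{3.14}$$

The two sums in (3.13) telescope, leaving only the $k = N+1$ term of the first sum and the $k = 0$ term of the second. By periodicity $M_{N+2} = M_0$, which gives (3.10).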
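The "direct calculation" behind the local Lax equation (3.8) can be spot-checked numerically. The sketch below is not from the paper: it assumes the parametrization $s(t) = (1 - t^2,\ i(1 + t^2),\ 2t)$ of null vectors (which satisfies $s^T s = 0$ and $s(t)^T s(u) = -2(t-u)^2$), picks arbitrary sample values of $t$, and compares $\dot L_n$ computed from the equations of motion (3.2) with $M_{n+1}L_n - L_n M_n$.

```python
def null_vector(t):
    # Assumed parametrization: (1 - t^2, i(1 + t^2), 2t) satisfies s^T s = 0.
    return [1 - t * t, 1j * (1 + t * t), 2 * t]

def dot(a, b):
    # Bilinear pairing a^T b (no complex conjugation).
    return sum(u * v for u, v in zip(a, b))

def outer(a, b):
    # The 3x3 matrix a b^T.
    return [[u * v for v in b] for u in a]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(A, B, c=1):
    # Entrywise A + c * B.
    return [[A[i][j] + c * B[i][j] for j in range(3)] for i in range(3)]

IDENTITY = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

def lax_residual(ts, x):
    """Largest entry of dL_n/dt - (M_{n+1} L_n - L_n M_n) over one period."""
    period = len(ts)
    spins = [null_vector(t) for t in ts]
    S = lambda n: spins[n % period]          # periodic chain

    def L(n):                                # (3.4)
        return add(IDENTITY, outer(S(n), S(n)), x)

    def M(n):                                # (3.5)
        sym = add(outer(S(n - 1), S(n)), outer(S(n), S(n - 1)))
        c = x / dot(S(n), S(n - 1))
        return [[c * e for e in row] for row in sym]

    def s_dot(n):                            # equations of motion (3.2)
        a = dot(S(n + 1), S(n))
        b = dot(S(n - 1), S(n))
        return [u / a - v / b for u, v in zip(S(n + 1), S(n - 1))]

    worst = 0.0
    for n in range(period):
        L_dot = add(outer(s_dot(n), S(n)), outer(S(n), s_dot(n)))
        L_dot = [[x * e for e in row] for row in L_dot]
        rhs = add(matmul(M(n + 1), L(n)), matmul(L(n), M(n)), -1)
        diff = add(L_dot, rhs, -1)
        worst = max(worst, max(abs(e) for row in diff for e in row))
    return worst
```

A call such as `lax_residual([0.3, -1.1 + 0.4j, 2.0, 0.7j, -0.5], 0.37 + 0.2j)` uses a chain of period 5 (that is, N = 3) and should return a value at the level of floating-point rounding.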

In particular, the characteristic equation of $L(x)$ is time-independent and defines a time-independent spectral curve

$$\Gamma = \{(x, y);\ 0 = R(x, y) \equiv \det(yI - L(x))\}. \tag{3.15}$$

We assert that these spectral curves are Landsteiner–Lopez curves (2.1). In fact, it follows immediately from the expression (3.4) that $\det L_n(x) = 1$, $L_n(x) = L_n(x)^T$, and $L_n(x)^{-1} = L_n(-x)$. Since inversion reverses the order of the factors in (3.9) and each factor is symmetric, we obtain

$$\det L(x) = 1, \qquad L(x)^{-1} = L(-x)^T. \tag{3.16}$$

These two equations imply that $\det(yI - L(x))$ is of the form (2.1) for some polynomial $f(x)$. To obtain the expression (2.2) for $f(x)$, it suffices to observe that

$$f(x) = \mathrm{Tr}\, L(x) = \mathrm{Tr}\Big(1 + x \sum_{n=0}^{N+1} s_n s_n^T\Big) + O(x^2) = 3 + O(x^2). \tag{3.17}$$
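The two structural properties of the monodromy matrix used above can also be checked numerically. This is a sketch under the same assumption as before (null vectors parametrized as $s(t) = (1 - t^2,\ i(1 + t^2),\ 2t)$, with arbitrary sample values). Because the factors $L_n$ are symmetric and inversion reverses their order, the inverse of $L(x)$ is compared with the transpose of $L(-x)$, which has the same characteristic polynomial as $L(-x)$.

```python
def null_vector(t):
    # Assumed parametrization of vectors with s^T s = 0.
    return [1 - t * t, 1j * (1 + t * t), 2 * t]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def monodromy(ts, x):
    """L(x) = L_{N+1}(x) ... L_0(x), lowest index on the right."""
    L = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for t in ts:                              # n = 0, 1, ..., N+1
        s = null_vector(t)
        Ln = [[(1 if i == j else 0) + x * s[i] * s[j] for j in range(3)]
              for i in range(3)]
        L = matmul(Ln, L)                     # each new factor multiplies on the left
    return L

def check_monodromy(ts, x):
    """Return (|det L(x) - 1|, max entry of L(x) L(-x)^T - I)."""
    L_plus, L_minus = monodromy(ts, x), monodromy(ts, -x)
    det_error = abs(det3(L_plus) - 1)
    product = matmul(L_plus, transpose(L_minus))
    inv_error = max(abs(product[i][j] - (1 if i == j else 0))
                    for i in range(3) for j in range(3))
    return det_error, inv_error
```

Both returned errors should be at rounding level for generic sample values, for instance `check_monodromy([0.2, -0.5 + 0.3j, 0.9, 0.4j, -0.7], 0.1 + 0.05j)`.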

Define the moduli $u_i$ of the curve $R(x, y) = 0$ as in (2.1) by $f(x) = 3 + x^2 \sum_{i=0}^{N} u_i x^i$. Then the correspondence between the dynamical variables $s_n$, $0 \le n \le N+1$, and the moduli $u_i$ is given by

$$u_i = \sum_{I_i} s_{n_1}^T s_{n_2}\, s_{n_2}^T s_{n_3} \cdots s_{n_{i-1}}^T s_{n_i}, \tag{3.18}$$

where the summation runs over the set $I_i$ of all ordered $i$-th multi-indices $n_1 < n_2 < \cdots < n_i$.


To obtain the phase space of the model, we consider the space of all $(N+2)$-periodic spin chains $s_n$, subject to the constraint (3.1), and modulo the equivalence $s_n \sim V s_n$, where $V$ is a matrix satisfying $V^T V = I$. The dimension of this space is

$$\dim\, \{s_n\}/\{s_n \sim V s_n\} = 2N + 1. \tag{3.19}$$

Indeed, the $(N+2)$-periodic spin chains $s_n$ have $3(N+2)$ degrees of freedom. The constraint (3.1) removes $N + 2$ degrees of freedom, and the equivalence $s_n \sim V s_n$ removes 3 others, since the space of matrices $V$ with $V^T V = I$ has dimension 3. This leaves $3(N+2) - (N+2) - 3 = 2N + 1$. A $2N$-dimensional symplectic manifold $L_{\rm odd}$ is obtained by setting

$$L_{\rm odd} = \{s_n;\ u_N = \text{constant}\}/\{s_n \sim V s_n\}. \tag{3.20}$$

On the space $L_{\rm odd}$, the system is Hamiltonian with respect to the symplectic form defined by the differential $d\lambda_{(1)} = (\ln x)\,\frac{dy}{y}$, with Hamiltonian

$$H_{(1)} = \frac{u_{N-1}}{u_N} = \sum_{n=0}^{N+1} \frac{s_{n+1}^T s_{n-1}}{(s_{n+1}^T s_n)(s_n^T s_{n-1})}. \tag{3.21}$$

The action-variables are the periods of the differential $d\lambda_{(1)} = -(\ln x)\,\frac{dy}{y}$ over a basis of $N$ cycles for the curve $\Gamma$, which are odd under the involution $\sigma$. If the curve $\Gamma$ is viewed as a two-sheeted cover of $\Gamma/\sigma$, these $N$ odd curves can be realized as the $N$ cuts along which the sheets are to be glued.

3.2. The even divisor spin model. The even divisor spin model is the Hamiltonian reduction of a periodic spin chain model which incorporates a natural gauge invariance. The starting point is a $(N+2)$-periodic chain of pairs of three-dimensional complex vectors $p_n = (p_{n,\alpha})$, $q_n = (q_{n,\alpha})$, $1 \le \alpha \le 3$, satisfying the constraints (1.3). We impose the equations of motion (1.6). As noted before, the constraints and the equations of motion are invariant under the gauge transformations (1.7, 1.8). In particular, a gauge fixed version of the equations of motion (1.6) is

$$\dot p_n = \frac{p_{n+1}}{p_{n+1}^T q_n} + \frac{p_{n-1}}{p_{n-1}^T q_n}, \qquad \dot q_n = -\frac{q_{n+1}}{p_n^T q_{n+1}} - \frac{q_{n-1}}{p_n^T q_{n-1}}. \tag{3.22}$$

This version follows from the other one by the gauge transformation

$$p_n \to \lambda_n(t)\, p_n, \qquad q_n \to \lambda_n^{-1}(t)\, q_n, \qquad \lambda_n(t) = \exp\Big(-\int^t \mu_n(t')\, dt'\Big). \tag{3.23}$$

We shall see in the next section that the system (1.6) admits a Lax representation. A reduced system is defined as follows. We impose the additional constraints (1.4). With these constraints, the spectral curves of the system are the Landsteiner–Lopez curves (2.1). The dimension of the phase space $M$ of all $(q_n, p_n)$ subjected to the previous constraints and divided by the gauge group $G$ of (1.7, 1.8), is

$$\dim M \equiv \dim\, \{(q_n, p_n)\}/G = 2N. \tag{3.24}$$

To see this, assume that $N$ is even (the counting for $N$ odd is similar). Then the constraint (1.4) reduces the number of degrees of freedom of the $(N+2)$-periodic spin chain $(q_n, p_n)$ to the number $3(N+2)$ of a $(N+2)$-periodic spin chain. The constraint (1.3) and the gauge transformation (1.7) each eliminates $\frac{N}{2} + 1$ degrees of freedom. Now the dimension of the space of matrices $W$ satisfying $W g_0 = g_0 W$ is 5. However, in the gauge transformation (1.8), the matrices $W$ which are diagonal have already been accounted for in the gauge transformation (1.7). Altogether, we arrive at the count which we announced earlier.

The phase space $\{(q_n, p_n)\}/G$ itself can be reduced further, to a lower-dimensional phase space defined by suitable constraints on the moduli space $(u_0, \cdots, u_N)$. It turns out that there are 2 possible natural further reductions, each related to its own choice of differential $d\lambda$ and corresponding Hamiltonian structure:

• On the $(2N - 2)$-dimensional phase space defined by the constraints

$$M_0 = \{(q_n, p_n);\ u_N = 1,\ u_{N-1} = 0\}/G \tag{3.25}$$

the system is Hamiltonian with respect to the symplectic form defined by the differential $d\lambda = x\,\frac{dy}{y}$. Here we have used the same notation for the space just introduced and the space $M_0$ described in the Main Theorem, in anticipation of their isomorphism which will be established later in Sect. 4. The Hamiltonian is given by $H = u_{N-2}$ or equivalently by (1.15). The action-variables are periods of $d\lambda$ along a basis of $N - 1$ cycles $A_i$ of $\Gamma$ which are even under the involution $\sigma$. (Equivalently, the $A_i$ correspond to a basis of cycles for the factor curve $\Gamma/\sigma$.) This is the desired integrable Hamiltonian system, corresponding to the $\mathcal{N} = 2$ supersymmetric $SU(N)$ Yang–Mills theory with a hypermultiplet in the anti-symmetric representation.

• On the $(2N - 2)$-dimensional phase space $M_2$ defined by the constraints

$$M_2 = \{(q_n, p_n);\ u_0 = \text{constant},\ u_1 = \text{constant}\}/G \tag{3.26}$$

the system is Hamiltonian with respect to the symplectic form defined by the differential $d\lambda_{(2)} = -\frac{1}{x}\,\frac{dy}{y}$. This symplectic form coincides with the natural form

$$\omega = \sum_n dp_n^T \wedge dq_n \tag{3.27}$$

with respect to which the system (1.6) is manifestly Hamiltonian, with Hamiltonian

$$H(p, q) = \ln u_N = \frac{1}{2} \sum_{n=0}^{N+1} \ln\big[(p_n^{+} q_{n-1})(p_{n-1}^{+} q_n)\big]. \tag{3.28}$$

The action-variables are the periods of the differential $d\lambda_{(2)} = -\frac{dy}{xy}$ over again the even cycles $A_i$ of the earlier case.

4. The Direct and Inverse Spectral Transforms

We concentrate now on the even divisor spin model. The main goal of this section is to describe the map stated in the main theorem, which associates to the spin chain $(q_n, p_n)$ a geometric data $(\Gamma, [D])$,

$$(q_n, p_n) \to (\Gamma, [D]). \tag{4.1}$$


The curve $\Gamma$ is obtained by showing that the dynamical system (1.6) for $(p_n, q_n)$ admits a Lax representation $\dot L(x) = [M(x), L(x)]$, in which case $\Gamma$ is the spectral curve $\{\det(yI - L(x)) = 0\}$. The Lax operator $L(x)$ also gives rise to the Bloch function, which is essentially its eigenvector. The divisor $D$ is obtained by taking the divisor of poles of the Bloch function. A characteristic feature of the even divisor spin model is that the equivalence class $[D]$ of this divisor is even under the involution $\sigma$. The map (4.1) descends to a map from the space of equivalence classes of $(q_n, p_n)$ under the gauge group $G$ to the space of geometric data $(\Gamma, [D])$. These two spaces are of the same dimension $2N$: we saw this in (3.24) for the first space, while for the second, the number $2N$ of parameters is due to $N + 1$ parameters for the Landsteiner–Lopez curves (including $u_N$ and $u_{N-1}$), and $N - 1$ parameters for the even divisors $[D]$. It is a fundamental fact in the theory that the map (4.1) then becomes a bijective correspondence of generic points

$$\{(q_n, p_n)\}/G \leftrightarrow \{(\Gamma, [D])\}. \tag{4.2}$$

We shall refer to the construction $\to$ described above as the direct problem. The reverse construction $\leftarrow$, which recaptures the dynamical variables $(p_n, q_n)$ from the geometric data $(\Gamma, [D])$, will be referred to as the inverse problem. As usual in the geometric theory of solitons [22], it will be based on the construction of a Baker–Akhiezer function. We now provide the details.

4.1. The Lax representation. We exhibit first the Lax representation for the system (1.6). The desired formulas can be obtained from a slight modification of the easier odd spin model treated in Sect. 3.1. Let $p_n$, $q_n$ be $(N+2)$-periodic, three-dimensional vectors satisfying $p_n^T q_n = 0$, and define matrix-valued functions $L_n(x)$ and $M_n(x)$ by

$$L_n(x) = 1 + x\, q_n p_n^T, \qquad M_n(x) = x \left( \frac{q_{n-1} p_n^T}{p_n^T q_{n-1}} - \frac{q_n p_{n-1}^T}{p_{n-1}^T q_n} \right). \tag{4.3}$$

Then a direct calculation shows that the matrix functions $L_n(x)$ and $M_n(x)$ satisfy the Lax equation

$$\partial_t L_n = M_{n+1} L_n - L_n M_n \tag{4.4}$$

if and only if the vectors $p_n$ and $q_n$ satisfy the equations of motion (1.6). As before, Eq. (4.4) is a compatibility condition for the linear system $\psi_{n+1} = L_n(x)\psi_n$, $\dot\psi_n = M_n(x)\psi_n$. To obtain the spectral curve $\Gamma$, we observe that the same arguments as in the case of the odd spin model show that the matrix $M(x) = M_0(x)$ and the monodromy matrix $L(x)$ defined by $L(x) = \prod_{n=0}^{N+1} L_n(x)$ form again a Lax pair

$$\dot L(x) = [M(x), L(x)]. \tag{4.5}$$

Thus the spectral curve $\Gamma = \{(x, y);\ R(x, y) \equiv \det(yI - L(x)) = 0\}$ is time-independent and well-defined. We have used here the same notation $R(x, y)$ as for (2.1), since the equation $\det(yI - L(x)) = 0$ is indeed of the Landsteiner–Lopez form. To see this, we note that $\det L_n(x) = 1$ and $L_n(-x) = L_n(x)^{-1}$. Together with the constraint (1.4), this implies

$$\det L(x) = 1, \qquad L(-x) = g_0 L^{-1}(x) g_0. \tag{4.6}$$
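The local Lax equation (4.4) can be checked in exact rational arithmetic. The sketch below is illustrative only: the period-4 sample chain is hypothetical (chosen so that $p_n^T q_n = 0$ and all denominators are non-zero; it is not required to satisfy (1.4)), the velocities are those of the gauge-fixed equations (3.22), and $M_n$ is taken with each rank-one term divided by its own trace, $M_n = x\,\big(q_{n-1}p_n^T/(p_n^T q_{n-1}) - q_n p_{n-1}^T/(p_{n-1}^T q_n)\big)$.

```python
from fractions import Fraction

# Hypothetical sample chain of period 4 (N = 2); each pair satisfies p_n^T q_n = 0.
P = [(0, 1, 1), (1, 0, 1), (1, 1, 0), (1, -2, 1)]
Q = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
PERIOD = len(P)
IDENTITY = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

def p(n):
    return [Fraction(v) for v in P[n % PERIOD]]

def q(n):
    return [Fraction(v) for v in Q[n % PERIOD]]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def outer(a, b):
    return [[u * v for v in b] for u in a]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def combine(A, B, c):
    # Entrywise A + c * B.
    return [[A[i][j] + c * B[i][j] for j in range(3)] for i in range(3)]

def L(n, x):
    return combine(IDENTITY, outer(q(n), p(n)), x)

def M(n, x):
    first, second = outer(q(n - 1), p(n)), outer(q(n), p(n - 1))
    a = x / dot(p(n), q(n - 1))
    b = x / dot(p(n - 1), q(n))
    return [[a * first[i][j] - b * second[i][j] for j in range(3)]
            for i in range(3)]

def p_dot(n):                                 # gauge-fixed equations (3.22)
    a, b = dot(p(n + 1), q(n)), dot(p(n - 1), q(n))
    return [u / a + v / b for u, v in zip(p(n + 1), p(n - 1))]

def q_dot(n):
    a, b = dot(p(n), q(n + 1)), dot(p(n), q(n - 1))
    return [-u / a - v / b for u, v in zip(q(n + 1), q(n - 1))]

def lax_equation_holds(n, x):
    L_dot = combine(outer(q_dot(n), p(n)), outer(q(n), p_dot(n)), 1)
    L_dot = [[x * e for e in row] for row in L_dot]
    rhs = combine(matmul(M(n + 1, x), L(n, x)), matmul(L(n, x), M(n, x)), -1)
    return L_dot == rhs

def inverse_holds(n, x):
    # L_n(x) L_n(-x) = I, which relies on p_n^T q_n = 0.
    return matmul(L(n, x), L(n, -x)) == IDENTITY
```

Because the arithmetic is exact, `lax_equation_holds(n, Fraction(2, 3))` is a genuine identity check at each site rather than a floating-point approximation.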


But we also have near $x = 0$

$$\mathrm{Tr}\, L(x) = \mathrm{Tr}\Big(1 + x \sum_{n=0}^{N+1} q_n p_n^T\Big) + O(x^2) = 3 + O(x^2), \tag{4.7}$$

so that $\det(yI - L(x))$ is of the form (2.1). We observe that the expression $R(x, y) = \det(yI - L(x))$ is invariant with respect to the gauge transformations (1.7) and (1.8). Therefore, if we write $R(x, y)$ in the Landsteiner–Lopez form (2.1) with moduli $u_i$, the moduli $u_i$ are well-defined functions on the factor-space $M$. In analogy with the odd spin case, the moduli can be written in terms of the dynamical variables $(p_n, q_n)$ as

$$u_k = \sum_{I_k} (p_{i_1}^{+} q_{i_2})(p_{i_2}^{+} q_{i_3}) \cdots (p_{i_k}^{+} q_{i_1}). \tag{4.8}$$

Here the summation is again over sets $I_k$ of multi-indices $I = (i_1 < i_2 < \ldots < i_k)$.

4.2. General properties of Bloch functions. The points $Q = (x, y)$ of the spectral curve $\Gamma = \{(x, y);\ \det(yI - L(x)) = 0\}$ parametrize the Bloch functions $\{\psi_n(Q)\}_{0 \le n \le N+1}$ of the spin model. We begin by recalling the definition of Bloch functions, and by describing their main properties in the case of our model.

• We fix a generic choice of moduli parameters $u_i$. Then the matrix $L(x)$ has 3 distinct eigenvalues $y$, except possibly at a finite number of points $x$. Let $Q = (x, y)$. The Bloch solution $\psi_n(Q)$ for the spin model $\{L_n(x)\}_{0 \le n \le N+1}$ is the function $\psi_n(Q)$ with the following properties:

$$\psi_{n+1}(Q) = L_n(x)\psi_n(Q), \qquad \psi_{N+n+2}(Q) = y\,\psi_n(Q). \tag{4.9}$$

These equations determine $\psi_n(Q)$ only up to a multiplicative constant. To normalize $\psi_n(Q)$, we observe that for generic moduli parameters $u_i$, there are only finitely many points $Q$ where the eigenvector $\psi_0(Q)$ of the matrix $L(x)$ satisfies the linear constraint $\sum_{\alpha=1}^{3} \psi_0^{\alpha}(Q) = 0$. Outside of these points, we can fix $\psi_n(Q)$ by the following normalization condition:

$$\sum_{\alpha=1}^{3} \psi_0^{\alpha} = 1. \tag{4.10}$$

The Bloch function $\psi_n(Q)$ is then determined on the spectral curve $\Gamma$ outside of a finite number of points, and hence uniquely on $\Gamma$. Furthermore, the components of $\psi_n(Q)$ are meromorphic functions on $\Gamma$. This follows from the constraint (4.10) and the equation $L(x)\psi_0(Q) = y\psi_0(Q)$. They imply that $\psi_0(Q)$ is a rational expression in $y$ and in the entries of the matrix $(L_{\alpha\beta}(x) - L_{\alpha 3}(x))_{1 \le \alpha \le 3,\ 1 \le \beta \le 2}$, in view of Cramer's rule for solving inhomogeneous systems of linear equations. Since $x$, $y$ and $L_{\alpha\beta}(x)$ are all meromorphic functions on $\Gamma$, our assertion follows.

• The exceptional points excluded in the preceding construction of Bloch functions are the points where $L(x)$ has multiple eigenvalues, and the points where the eigenvector $\psi_0(Q)$ lies in the linear subspace of equation $\sum_{\alpha=1}^{3} \psi_0^{\alpha}(Q) = 0$. By restricting ourselves to generic values of the moduli $u_i$, we can make the convenient assumption that these two sets of points are disjoint. In this case, it is evident that at points where $\sum_{\alpha=1}^{3} \psi_0^{\alpha}(Q) = 0$, the function $\psi_0(Q)$ develops a pole.

Consider now a point $x_0 \neq 0$, where the matrix $L(x)$ has a multiple eigenvalue. Let $(x - x_0)^{1/b}$ be the local holomorphic coordinate centered at the points $Q$ lying above $x_0$, where the branching index $b$ can be either 1 or 2. (We can exclude the possibility $b = 3$ by a genericity assumption on the moduli $u_i$.) The holomorphic function $y$ on the surface $\Gamma$ can be expanded as

$$y = y_0 + \epsilon\, y_1 (x - x_0)^{1/b} + O(x - x_0), \tag{4.11}$$

where $\epsilon^b = 1$ is a root of unity. If $b = 1$, it follows that

$$\partial_x R(x_0, y_0) = \partial_y R(x_0, y_0) = 0, \tag{4.12}$$

which means that the curve $\Gamma$ is singular at $(x_0, y_0)$. By a genericity assumption on the moduli $u_i$, the only singular point on $\Gamma$ is at $x_0 = 0$, and this possibility has been excluded. Thus $b = 2$, and the curve $\Gamma$ has a branch point at $x_0$ if and only if $L(x_0)$ has multiple eigenvalues. The matrix $L(x_0)$ can now be shown to be a Jordan cell, i.e., $L(x_0)$ is of the form

$$L(x_0) = \begin{pmatrix} \lambda_1 & \mu & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \tag{4.13}$$

in a suitable basis, for some $\mu \neq 0$ and $\lambda_1 = \lambda_2 \neq \lambda_3$. In fact, $L(x_0)$ has only one double eigenvalue by genericity assumptions on $u_i$. The three branches of the function $y$ consist then of one branch which is of the form $\lambda_3 + y_1 (x - x_0) + \cdots$ and is holomorphic in the variable $x - x_0$. The other two branches are of the form

$$y = \lambda_1 \pm y_1 (x - x_0)^{1/2} + \cdots. \tag{4.14}$$

We must have $y_1 \neq 0$, for otherwise $y = \lambda_1 + O(x - x_0)$, and the same argument which ruled out the branching index $b = 1$ would imply that $\Gamma$ is singular at $x_0$. Now for $x$ near but distinct from $x_0$, the Bloch function $\psi_0(Q)$ also has 3 distinct branches. Let $\psi_\pm$ be the branches corresponding to the eigenvalues in (4.14), and expand them as

$$\psi_\pm = \psi_\pm^{(0)} + (x - x_0)^{1/2}\, \psi_\pm^{(1)} + O(x - x_0). \tag{4.15}$$

Up to $O(x - x_0)$, the eigenvector condition can be expressed as

$$L(x)\big(\psi_\pm^{(0)} + (x - x_0)^{1/2} \psi_\pm^{(1)}\big) = \big(\lambda_1 \pm y_1 (x - x_0)^{1/2}\big)\big(\psi_\pm^{(0)} + (x - x_0)^{1/2} \psi_\pm^{(1)}\big). \tag{4.16}$$

This is equivalent to

$$L(x_0)\psi_\pm^{(0)} = \lambda_1 \psi_\pm^{(0)}, \qquad (L(x_0) - \lambda_1)\psi_\pm^{(1)} = \pm y_1 \psi_\pm^{(0)}. \tag{4.17}$$

Clearly, this equation admits no solution if $L(x_0)$ is diagonal. Thus $L(x_0)$ is of the form (4.13) with $\mu \neq 0$. We can now identify the coefficients $\psi_\pm^{(1)}$ and $\psi_\pm^{(0)}$ in the Puiseux expansion (4.15). The eigenspace of $L(x_0)$ corresponding to the eigenvalue $\lambda_1$ is one-dimensional and generated by a single vector $\phi_1$, which we can take to satisfy the normalization condition (4.10). Evidently, $\psi_\pm^{(0)} = \phi_1$. Let $\phi_2$ be the second basis vector in the basis with respect to which $L(x_0)$ takes the Jordan form (4.13), i.e., $L(x_0)\phi_2 = \lambda_1 \phi_2 + \mu \phi_1$. Then the second equation above is solved by

$$\psi_\pm^{(1)} = \pm\Big(\frac{y_1}{\mu}\,\phi_2 + \nu\,\phi_1\Big), \tag{4.18}$$

where the constant $\nu$ is chosen so that $\sum_{\alpha=1}^{3} \psi_\pm^{(1),\alpha} = 0$.

• Outside a finite number of points $x$, the matrix $L(x)$ has 3 distinct eigenvalues $y(a)$ and three distinct eigenfunctions $\psi_0(a)$, $1 \le a \le 3$, normalized uniquely by the condition (4.10). The function

$$\det{}^2\big(\psi_0(1)\ \psi_0(2)\ \psi_0(3)\big) \tag{4.19}$$

is independent of the ordering of both $\psi_0(a)$ and the corresponding eigenvalues $y(a)$. By the preceding observations, it can be expressed as a rational function of $x$ and $y(a)$, which is also symmetric under permutations of $y(a)$. Thus it is actually an unambiguous and rational function of $x$. We observe that the function $\det^2(\psi_0(1)\ \psi_0(2)\ \psi_0(3))$ vanishes at exactly those values of $x$ which are branch points for the spectral curve $\det(yI - L(x)) = 0$. Indeed, we saw earlier that the branch points $x_0$ are exactly the points where $L(x_0)$ has multiple eigenvalues. Outside points $x_0$ where $L(x_0)$ has multiple eigenvalues, the determinant (4.19) is readily seen to be $\neq 0$ (it may be infinite, because of the normalization (4.10)). Conversely, assume that $x_0$ is a branch point. Then our preceding discussion shows that for $x$ near $x_0$

$$\det{}^2\big(\psi_0(1)\ \psi_0(2)\ \psi_0(3)\big)(x) = (x - x_0)\, \det{}^2\big(\phi_1\ \phi_2\ \psi_0(3)\big) + O\big((x - x_0)^{3/2}\big). \tag{4.20}$$

This shows that $\det^2(\psi_0(1)\ \psi_0(2)\ \psi_0(3))(x_0) = \lim_{x \to x_0} \det^2(\psi_0(1)\ \psi_0(2)\ \psi_0(3))(x) = 0$, establishing the observation. Furthermore, since the vectors $\phi_1$, $\phi_2$, and $\psi_0(3)$ are linearly independent by construction, we obtain the important fact that the order of vanishing of the square of the determinant in (4.19) at a branch point is exactly 1. (More generally, for an arbitrary branching index $b$, the order of vanishing of the square of the determinant is equal to $b - 1$, although we do not need this more general version here, thanks to our genericity assumption on the moduli $u_i$.)

• We can now determine the number of poles of the Bloch function $\psi_0(Q)$ outside of the points $P_a$ above $x = \infty$. Clearly, this number is half of the number of poles of the expression (4.19) outside of $x = \infty$.
Now at $x = \infty$, we saw that the operator $L(x)$ has 3 eigenvalues, so that (4.19) does not vanish there. Furthermore, we shall show later that $\psi_0(Q)$ is finite at all three points above $x = \infty$. Thus (4.19) has neither a zero nor a pole at $x = \infty$. In view of the preceding discussion, the number of zeroes of (4.19) is equal to the number of branch points of $\Gamma$. We showed earlier, using the Riemann-Hurwitz formula, that the number of branch points of $\Gamma$ is $4N + 2$. It follows that the number of poles, and hence of zeroes, of $\psi_0(Q)$ on $\Gamma$ is $2N + 1$.

• The poles of $\psi_n(Q)$ outside of the points $P_a$ lying above $x = \infty$ are independent of $n$. To see this, we let $S_0(x)$ be the $3 \times 3$ identity matrix $I$, and set $S_n(x) = L_{n-1}(x) S_{n-1}(x) = \prod_{0 \le k \le n-1} L_k(x)$. Then $\psi_n(Q)$ can be expressed as

$$\psi_n(Q) = L_{n-1}(x)\psi_{n-1}(Q) = \Big(\prod_{0 \le k \le n-1} L_k(x)\Big)\psi_0(Q) = S_n(x)\psi_0(Q). \tag{4.21}$$

This shows that the poles of $\psi_n(Q)$ outside of $P_a$ can only occur at the poles of $\psi_0(Q)$. For generic values of the moduli $u_i$, we can assume that all the poles of $\psi_n(Q)$, $0 \le n \le N + 1$, are exactly of the same order 1 when they occur outside of the points $P_a$ above $x = \infty$.

• Let $D = \{z_1, \cdots, z_{2N+1}\}$ be the divisor of poles of the Bloch function $\psi_n(Q)$. Then a fundamental property of the even divisor spin chain model is the invariance of the equivalence divisor class $[D]$ of $D$ under the involution $\sigma$,

$$[D] = [D^\sigma]. \tag{4.22}$$

In other words, there exists a meromorphic function on $\Gamma$ with poles at $z_j$ and zeroes at $z_j^\sigma$. This is a consequence of how $L(x)$ transforms under the involution $x \to -x$, $y \to y^{-1}$,

$$L(-x) = g_0 L(x)^{-1} g_0. \tag{4.23}$$

This transformation rule implies that $g_0 \psi_0(Q)$ is a Bloch function at $(-x, y^{-1})$ if $\psi_0(Q)$ is a Bloch function at $(x, y)$. Thus $g_0 \psi_0(Q)$ must coincide with $\psi_0(Q^\sigma)$ up to normalization

$$g_0 \psi_0(Q) = f(Q)\,\psi_0(Q^\sigma). \tag{4.24}$$

Since both $\psi_0(Q)$ and $\psi_0(Q^\sigma)$ are meromorphic functions, the function $f(Q)$ is meromorphic. This proves (4.22). We summarize the discussion in the following lemma:

Lemma 4.1. The vector-function $\psi_n(Q)$ is a meromorphic vector-function on $\Gamma$. Outside the punctures $P_a$ (which are the points of $\Gamma$ situated over $x = \infty$) it has $g + 2 = 2N + 1$ poles $\{z_1, \ldots, z_{2N+1}\}$, which are $n$-independent. The divisor class $[D]$ of $D$ is invariant with respect to the involution $\sigma$, i.e. there exists a function $f(Q)$ on $\Gamma$ with poles at $z_j$ and zeroes at $z_j^\sigma$.

4.3. The direct problem. In the previous discussion, we made use only of the fact that the curve $R(x, y) = 0$ is the spectral curve of a matrix $L(x)$ which satisfies the involution condition $L(-x) = g_0 L(x)^{-1} g_0$ of (4.23). In particular, the discussion applies for generic values of the moduli $u_i$ parametrizing the curves. We consider now the direct problem for the system (1.6), where the matrix $L(x)$ arises more specifically in terms of the dynamical variables $(q_n, p_n)$ as $L(x) = \prod_{n=0}^{N+1} L_n(x) = \prod_{n=0}^{N+1} (1 + x\, q_n p_n^T)$. The discussion in the previous section has provided a precise description of the right-hand side of the map (4.1). It is also evident that the map descends to the equivalence classes of $(q_n, p_n)$ under the gauge group $G$. It is convenient to exploit the gauge transformation (1.8) to normalize the Bloch functions at $x = 0$. First, we observe that $L_n(0) = I$ for all $n$, so that $\psi_n(Q_a)$ is independent of $n$. Furthermore, the Lax operator $L(x)$ can be written near $x = 0$ as

$$L(x) = I + xT + O(x^2), \tag{4.25}$$

where the matrix $T$ is given by

$$T = \sum_{n=0}^{N+1} q_n p_n^T. \tag{4.26}$$


In particular, $T$ satisfies the condition

$$T = g_0 T g_0 \tag{4.27}$$

in view of the constraint $p_n = g_0 p_{-n-1}$, $q_n = g_0 q_{-n-1}$. Next, recall from our discussion of the Landsteiner–Lopez curve in Sect. 2 that $T$ has 3 distinct eigenvalues $y_a = y_1(Q_a)$, and that $y$ can be expanded as $y = 1 + y_1(Q_a)x + O(x^2)$ near $Q_a$. Expanding $\psi_0(Q)$ near $Q_a$ as $\psi_0(Q) = \psi_0(Q_a) + x\,\psi_0'(Q_a) + O(x^2)$, where $\psi_0'(Q_a)$ denotes the first-order coefficient, and using the preceding expansion for $L(x)$, the condition $L(x)\psi_0(Q) = y\psi_0(Q)$ for Bloch functions can be rewritten as

$$(I + xT)\big(\psi_0(Q_a) + x\,\psi_0'(Q_a)\big) = (1 + y_a x)\big(\psi_0(Q_a) + x\,\psi_0'(Q_a)\big) + O(x^2). \tag{4.28}$$

This implies

$$T\psi_0(Q_a) = y_a \psi_0(Q_a), \tag{4.29}$$

i.e., $\psi_0(Q_a)$ are precisely the three eigenvectors of $T$, corresponding to the eigenvalues $y_a$. If we let $\Psi_0(0)$ be the $3 \times 3$ matrix whose columns are the vectors $\psi_0(Q_a)$, then the transformation law (4.27) implies that $\Psi_0(0)$ satisfies the condition

$$\Psi_0(0) = g_0 \Psi_0(0) g_0. \tag{4.30}$$

Now the transformation (1.8) on $(q_n, p_n)$ does not change the curve $\Gamma$ and the divisor $D$, but changes the matrix $\Psi_0(0)$ into $W\Psi_0(0)$. But $\Psi_0(0)$ commutes with the matrix $g_0$, and hence so does its inverse. This means that the inverse qualifies as one of the gauge transformations $W$ allowed in (1.8). Under such a gauge transformation $W$, the Bloch function $\Psi_0(0)$ gets transformed to the identity

$$\Psi_0(0) = I. \tag{4.31}$$

Henceforth we assume this normalization, so that $p_n$, $q_n$ satisfy the condition

$$T_{\alpha\beta} = \sum_{n=0}^{N+1} q_{n,\alpha}\, p_{n\beta} = y_{1\alpha}\, \delta_{\alpha\beta}. \tag{4.32}$$

Our main task is to establish that the map (4.2) is generically locally invertible. This is the goal of the next section on the inverse spectral problem, but in order to motivate the constructions given there, we identify here the basic behavior of the Bloch function $\psi_n(x, y)$ near the points $P_\alpha$ above $x = \infty$. For $(x, y)$ near $P_\alpha$, set

$$\psi_n(x, y) = x^{p_{n\alpha}} \sum_{k=0}^{\infty} \psi_{n,k}(P_\alpha)\, x^{-k}. \tag{4.33}$$

Here $p_{n\alpha}$ is the order of the pole (or zero when $p_{n\alpha} < 0$) of $\psi_n(x, y)$ near $P_\alpha$, which may vary with both $n$ and $\alpha$. The following lemma identifies the coefficients $\psi_{n,k}(P_\alpha)$ up to normalization:


Lemma 4.2.

• In the neighborhood of the puncture $P_1$ (where $y = O(x^{N+2})$), the vector-function $\psi_n$ has a pole of order $n$ and the leading coefficient $\psi_{n,0}(P_1)$ of its expansion is equal to

$$\psi_{n,0}(P_1) = \alpha_n q_{n-1}, \tag{4.34}$$

where the scalars $\alpha_n$ satisfy the recurrence relation

$$\alpha_{n+1} = (p_n^T q_{n-1})\,\alpha_n. \tag{4.35}$$

The next coefficient $\psi_{n,1}(P_1)$ satisfies

$$\psi_{n+1,1} = \psi_{n,0} + q_n (p_n^T \psi_{n,1}). \tag{4.36}$$

• In the neighborhood of the puncture $P_3$ (where $y = O(x^{-N-2})$) the vector-function $\psi_n$ has a zero of order $n$ and the leading coefficient $\psi_{n,0}(P_3)$ of its expansion is equal to

$$\psi_{n,0}(P_3) = \beta_n q_n, \tag{4.37}$$

where the scalars $\beta_n$ satisfy the recurrence relation

$$\beta_{n+1} = -\frac{1}{(p_n^T q_{n+1})}\,\beta_n. \tag{4.38}$$

• In the neighborhood of the puncture $P_2$ (where $y = 1$) the vector-function $\psi_n$ is regular and its evaluation $\psi_{n,0}(P_2)$ at $P_2$ is orthogonal to both $p_n$ and $p_{n-1}$, i.e.,

$$p_n^T \psi_{n,0}(P_2) = p_{n-1}^T \psi_{n,0}(P_2) = 0. \tag{4.39}$$

Proof. First, we show that for generic moduli, the Bloch function $\psi_0(x, y)$ is regular near each $P_\alpha$. Observe that $\psi_{N+2}(x, y) = L(x)\psi_0(x, y) = y\,\psi_0(x, y)$. Now the relation $\psi_{n+1} = L_n(x)\psi_n$ can be inverted to produce

$$\psi_n(x, y) = L_n(x)^{-1}\psi_{n+1}(x, y) = (1 - x\, q_n p_n^T)\,\psi_{n+1}(x, y). \tag{4.40}$$

Applying this relation $N + 2$ times, we may write

$$\psi_0(x, y) = y\,(1 - x\, q_0 p_0^T) \cdots (1 - x\, q_{N+1} p_{N+1}^T)\,\psi_0(x, y). \tag{4.41}$$

Consider first the neighborhood of the point $P_3$, where $y$ is of order $x^{-(N+2)}$. If $\psi_0(x, y)$ admits the expansion (4.33) near $P_3$ with $\psi_{0,0}(P_3) \neq 0$, then we must have

$$\psi_{0,0} = (-)^{N+2}\, q_0\, (p_0^T q_1) \cdots (p_N^T q_{N+1})(p_{N+1}^T \psi_{0,0}). \tag{4.42}$$

This shows that $\psi_{0,0}(P_3)$ is proportional to the vector $q_0$, say $\psi_{0,0} = \beta q_0$. Now recall that the Bloch function $\psi_0(x, y)$ satisfies the normalization condition (4.10) throughout. This implies that $\sum_{\alpha=1}^{3} \psi_{0,0}^{\alpha}(P_3) = 0$ if the order $n_0(P_3)$ of the pole of $\psi_0(x, y)$ at $P_3$ is positive. For generic values of the moduli of the curve $\Gamma$, we may assume that $\sum_{\alpha=1}^{3} q_{0\alpha} \neq 0$. It follows that $\beta = 0$ and hence $\psi_{0,0}(P_3) = 0$, which contradicts the definition of $\psi_{0,0}(P_3)$. This shows that $n_0(P_3) = 0$, and the Bloch function $\psi_0(x, y)$ is regular at $P_3$. The argument near $P_1$ is similar and even more direct, just using the equation $y\,\psi_0(x, y) = \psi_{N+2}(x, y) = \prod_{n=0}^{N+1} (1 + x\, q_n p_n^T)\,\psi_0(x, y)$. It shows, incidentally, that the leading coefficient $\psi_{0,0}(P_1)$ is proportional to $q_{N+1}$. At $P_2$, the regularity of $\psi_0(x, y)$ follows from the regularity of $\psi_0(x, y)$ at the other two points $P_1$ and $P_3$, and from the fact that for generic moduli, the determinant (4.19) is regular.

It is now easy to see that the functions $\psi_n(x, y)$ have the zeroes and poles spelled out in Lemma 4.2. The recurrence relations stated there can also be read off the defining relations $\psi_{n+1} = L_n(x)\psi_n$. For example, near $P_1$, we find

$$x^{n+1}\Big(\psi_{n+1,0} + \frac{1}{x}\psi_{n+1,1} + \cdots\Big) = x^n (1 + x\, q_n p_n^T)\Big(\psi_{n,0} + \frac{1}{x}\psi_{n,1} + \cdots\Big). \tag{4.43}$$

This implies

$$\psi_{n+1,0} = q_n (p_n^T \psi_{n,0}), \tag{4.44}$$

$$\psi_{n+1,1} = \psi_{n,0} + q_n (p_n^T \psi_{n,1}). \tag{4.45}$$

The relations (4.35, 4.36) follow. Near $P_3$, we write instead

$$x^{-n}\Big(\psi_{n,0} + \frac{1}{x}\psi_{n,1} + \cdots\Big) = x^{-n-1}(1 - x\, q_n p_n^T)\Big(\psi_{n+1,0} + \frac{1}{x}\psi_{n+1,1} + \cdots\Big). \tag{4.46}$$

This implies

$$\psi_{n,0} = -q_n (p_n^T \psi_{n+1,0}), \tag{4.47}$$

$$\psi_{n,1} = \psi_{n+1,0} - q_n (p_n^T \psi_{n+1,1}), \tag{4.48}$$

which gives (4.37, 4.38). Finally near $P_2$, we get

$$\psi_{n+1,0} + \frac{1}{x}\psi_{n+1,1} + \cdots = (1 + x\, q_n p_n^T)\Big(\psi_{n,0} + \frac{1}{x}\psi_{n,1} + \cdots\Big). \tag{4.49}$$

This implies that $p_n^T \psi_{n,0} = 0$. Furthermore, $\psi_{n+1,0} = \psi_{n,0} + q_n (p_n^T \psi_{n,1})$. Multiplying on the left by $p_n^T$ and using $p_n^T q_n = 0$, we conclude that $p_n^T \psi_{n+1,0} = 0$. This establishes (4.39), and Lemma 4.2 is proved.
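Iterating the recurrence (4.35) over one full period multiplies $\alpha_n$ by $\prod_n (p_n^T q_{n-1})$, and since $\psi_{N+2} = y\,\psi_0$ this product should be the leading coefficient of $y$ at $P_1$. The sketch below checks the linear-algebra fact behind this on a hypothetical period-4 sample chain (with $p_n^T q_n = 0$; the vectors are illustrative and not taken from the paper): the top coefficient of $L(x)$, namely $q_{N+1}p_{N+1}^T \cdots q_0 p_0^T$, has $q_{N+1}$ as an eigenvector with exactly that eigenvalue.

```python
# Hypothetical sample chain of period 4 (N = 2); each pair satisfies p_n^T q_n = 0.
P = [(0, 1, 1), (1, 0, 1), (1, 1, 0), (1, -2, 1)]
Q = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def outer(a, b):
    return [[u * v for v in b] for u in a]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [dot(row, v) for row in A]

def leading_matrix():
    """Coefficient of x^(N+2) in L(x): q_{N+1} p_{N+1}^T ... q_0 p_0^T."""
    A = outer(Q[0], P[0])
    for n in range(1, len(P)):
        A = matmul(outer(Q[n], P[n]), A)      # higher indices multiply on the left
    return A

def leading_eigenvalue():
    """Product over one period of p_n^T q_{n-1}, as obtained by iterating (4.35)."""
    c = 1
    for n in range(len(P)):
        c *= dot(P[n], Q[n - 1])              # Q[-1] is q_{N+1} by periodicity
    return c
```

For this sample chain the product is 2, and applying `leading_matrix()` to `Q[-1]` returns twice that vector, in agreement with the statement that $\psi_{0,0}(P_1)$ is proportional to $q_{N+1}$.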

4.4. The inverse spectral problem. It is now a standard procedure in the geometric theory of soliton equations to solve the inverse problem using the concept of the Baker–Akhiezer function originally proposed in [22]. The main properties of the Baker–Akhiezer function in our model are the following.

• Let $\Gamma$ be a Landsteiner–Lopez curve defined by Eq. (2.1). Then for a divisor $D$ of degree $g + 2 = 2N + 1$ in general position, there exists a unique vector-function $\phi_n(t, Q)$ such that:

(a) $\phi_n(t, Q)$ is meromorphic on $\Gamma$ outside the punctures $P_1$, $P_3$. It has at most simple poles at the points $z_i$ of the divisor $D$ (if all of them are distinct);

(b) In the neighborhood of the punctures $P_1$ and $P_3$, it has respectively the form

$$\phi_n = x^n e^{xt} \Big(\sum_{k=0}^{\infty} \phi_{n,k}(P_1)\, x^{-k}\Big), \qquad Q \to P_1, \tag{4.50}$$

$$\phi_n = x^{-n} e^{-xt} \Big(\sum_{k=0}^{\infty} \phi_{n,k}(P_3)\, x^{-k}\Big), \qquad Q \to P_3. \tag{4.51}$$

(c) At the points $Q_a$, $\phi_n(Q)$ is regular, and $\phi_n(Q_a)$ is equal to

$$\phi_{n,\alpha}(t, Q_\beta) = \delta_{\alpha,\beta}. \tag{4.52}$$

The arguments establishing the existence of the Baker–Akhiezer function $\phi_n$ are well-known, so we shall be brief. First, we recall that as shown in [22], for any algebraic curve $\Gamma$ with two punctures, any fixed local coordinate in the respective neighborhoods of the punctures, and for any divisor $D$ of degree $g$, there exists a unique (up to a constant factor) function with the analytic properties stated above. Now let $(P_1, P_3)$ be the punctures, and let $x^{-1}$ be the local coordinate near either one of the punctures. We can easily show that if $D$ has degree $g + 2$, the dimension of the space of such functions is equal to 3. We form the 3-dimensional vector whose components are just the three independent functions from this space. This 3-dimensional vector is unique up to multiplication by a constant matrix. We fix this matrix by the normalization condition (4.52). This establishes our claim.

The function $\phi_n(t, Q)$ can be written explicitly in terms of the Riemann $\theta$-function associated with $\Gamma$. The $\theta$-function is an entire function of $g = 2N - 1$ complex variables $z = (z_1, \ldots, z_g)$, and is defined by its Fourier expansion

$$\theta(z_1, \ldots, z_g) = \sum_{m \in \mathbb{Z}^g} e^{2\pi i \langle m, z\rangle + \pi i \langle \tau m, m\rangle},$$

where $\tau = \tau_{ij}$ is the period matrix of $\Gamma$. The $\theta$-function has the following monodromy properties with respect to the lattice $\mathbb{Z}^g + \tau\mathbb{Z}^g$:

$$\theta(z + l) = \theta(z), \qquad \theta(z + \tau l) = \exp[-i\pi \langle \tau l, l\rangle - 2i\pi \langle l, z\rangle]\, \theta(z),$$

where $l$ is an integer vector, $l \in \mathbb{Z}^g$. The complex torus $J(\Gamma) = \mathbb{C}^g/(\mathbb{Z}^g + \tau\mathbb{Z}^g)$ is the Jacobian variety of the curve $\Gamma$. The Abel transform

$$Q \to A_k(Q) = \int_{Q_0}^{Q} d\omega_k$$

imbeds the curve $\Gamma$ into its Jacobian variety. Here $d\omega_k$ is a basis of $g$ holomorphic differentials, normalized as dual to the $A$-cycles of a symplectic homology basis for $\Gamma$. According to the Riemann–Roch theorem, for each divisor $D = z_1 + \ldots + z_{g+2}$ in the general position, there exists a unique meromorphic function $r_\alpha(Q)$ with $r_\alpha(Q_\beta) = \delta_{\alpha\beta}$ and $D$ as the divisor of its poles. It can be written explicitly as (see details in [23]):

$$r_\alpha(Q) = \frac{f_\alpha(Q)}{f_\alpha(Q_\alpha)}, \qquad f_\alpha(Q) = \theta(A(Q) + Z_\alpha)\, \frac{\prod_{\beta \neq \alpha} \theta(A(Q) + F_\beta)}{\prod_{m=1}^{l} \theta(A(Q) + S_m)},$$

where

$$F_\beta = -K - A(Q_\beta) - \sum_{j=1}^{g-1} A(z_j), \qquad S_m = -K - A(z_{g-1+m}) - \sum_{j=1}^{g-1} A(z_j),$$

$$Z_\alpha = Z_0 - A(R_\alpha), \qquad Z_0 = -K - \sum_{j=1}^{g+2} A(z_j) + \sum_{\alpha=1}^{3} A(Q_\alpha),$$

where $K$ is the vector of Riemann constants. Let $dG_0$ and $dG_1$ be the unique normalized meromorphic differentials on $\Gamma$, which are holomorphic outside $P_1$ and $P_3$, and with the property that $dG_0$ has simple poles at the punctures with residues $\mp 1$, $dG_1$ is regular at $P_3$, and has the form $dG_1 = dx(1 + O(x^{-1}))$ at $P_1$. The normalization means that the differentials have zero periods around $A$-cycles

$$\oint_A dG_0 = \oint_A dG_1 = 0.$$

We observe that the differential $dG_1^\sigma(Q) = dG_1(Q^\sigma)$ has a pole only at $P_3$, and is there of the form $-dx(1 + O(x^{-1}))$. Let $V$ and $U$ be the vectors whose components are the $B$-periods of the differentials $dG_0$ and $dG_1$ respectively

$$V = \frac{1}{2\pi i} \oint_B dG_0, \qquad U = \frac{1}{2\pi i} \oint_B dG_1.$$

The Baker–Akhiezer function $\phi_n(t, Q)$ is given by

$$\phi_{n,\alpha}(t, Q) = r_\alpha(Q)\, \frac{\theta(A(Q) + tU^{+} + nV + Z_\alpha)\, \theta(Z_0)}{\theta(A(Q) + Z_\alpha)\, \theta(tU^{+} + nV + Z_0)}\, \exp\Big(\int_{Q_\alpha}^{Q} n\, dG_0 + t\, dG^{+}\Big), \tag{4.53}$$

where $dG^{+} = dG_1 + dG_1^\sigma$ and $U^{+} = U + U^\sigma$.

• The Baker–Akhiezer function $\phi_n$ is a Bloch function, in the sense that

$$\phi_{N+2+n}(t, Q) = y\, \phi_n(t, Q). \tag{4.54}$$

This is just a consequence of the fact that both sides of the equation satisfy the criteria for the Baker–Akhiezer function, and that the Baker–Akhiezer function is unique. Similarly, the uniqueness of the Baker–Akhiezer function implies that, if the divisor D is equivalent to D σ , then the function φn satisfies φn (t, Q) = g0 φ−n (t, Qσ )f (Q),

(4.55)

where f (Q) is a function with poles at γs and zeros at γsσ . Without loss of generality, we may assume that f (Q1 ) = f (Q3 ) = −f (Q2 ) = 1. • Let φn (t, Q) be the Baker–Akhiezer function corresponding to and the divisor D of degree 2g + 2. Let pn (t) be a vector orthogonal to φn,0 (P3 , t) (the leading term in the expansion (4.50)), and to φn (t, P2 ), i.e. pnT φn,0 (P3 ) = pnT φn (t, P2 ) = 0,

(4.56)

Spin Chain Models with Spectral Curves from M Theory

and $q_n$ be the vector

$$q_n = \frac{\phi_{n,0}(P_1)}{p_n^T \phi_{n-1,0}(P_1)}. \tag{4.57}$$

The vector functions $p_n$, $q_n$ are then $(N+2)$-periodic and mutually orthogonal, and they satisfy the constraint (4.31). As can be expected from the gauge invariance (1.7) in the direct problem, the functions $p_n(t)$ and $q_n(t)$ which we obtain this way are defined only up to a multiplier $\mu_n(t)$. However, the operators $L_n$ and $M_n(x)$ are uniquely defined by the expression (4.3). Furthermore, again by uniqueness, the Baker–Akhiezer function $\phi_n(t, Q)$ satisfies

$$\phi_{n+1}(t, Q) = L_n(x)\phi_n(t, Q), \qquad (\partial_t - M_n(x))\phi_n(t, Q) = 0. \tag{4.58}$$

Thus the vector function $(q_n(t), p_n(t))$ is a solution of the dynamical system (1.6). If the equivalence class of the divisor $D$ is invariant with respect to $\sigma$, then $(p_n, q_n)$ satisfies in addition the relation (1.4).

• The Baker–Akhiezer function $\phi_n(t, Q)$ satisfies the same defining Bloch property (4.54) as the Bloch function $\psi_n(Q)$, except for the different normalizations, which are (4.52) in the case of $\phi_n(t, Q)$ and (4.10) in the case of $\psi_n(Q)$. It follows that

$$\psi_n(t, Q) = r^{-1}(t, Q)\phi_n(t, Q), \qquad r(t, Q) = \sum_{\alpha=1}^{3} \phi_{0,\alpha}(Q) \tag{4.59}$$

is a Bloch solution of (4.9) normalized by the condition (4.10). This leads to the following description of the dynamical system (1.6). Let $p_n(t)$, $q_n(t)$ be vector functions (subject to the constraints (1.3), (1.4), (4.31)) which satisfy Eqs. (1.6). Then the $t$-dependence of the divisor $D$ under the map (4.1)

$$(p_n(t), q_n(t)) \longrightarrow \Big\{\Gamma,\ D(t) = \sum_{j=1}^{2N+1} z_j(t)\Big\} \tag{4.60}$$

coincides with the dynamics of the zeroes of the function $r(t, Q)$ given by (4.59). The dynamics of the Bloch eigenfunction of (4.9) (i.e. normalized by (4.10)) are described by

$$(\partial_t - M_n(t, x))\psi_n(t, Q) = \mu(t, Q)\psi_n(t, Q), \qquad \mu = -\partial_t \ln r(t, Q). \tag{4.61}$$
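The relation (4.61) can be checked in one line. The following is only a sketch, assuming that (4.58) is read as an equation for the Baker–Akhiezer function $\phi_n$, i.e. $\partial_t \phi_n = M_n \phi_n$, and that $r(t, Q)$ in (4.59) is a scalar factor independent of $n$:

```latex
\begin{align*}
\partial_t \psi_n
  &= \partial_t\!\left(r^{-1}\phi_n\right)
   = -\frac{\partial_t r}{r}\,r^{-1}\phi_n + r^{-1}\,\partial_t\phi_n \\
  &= -(\partial_t \ln r)\,\psi_n + r^{-1} M_n \phi_n
   = M_n\psi_n + \mu\,\psi_n ,
  \qquad \mu = -\partial_t \ln r .
\end{align*}
```

Since $r$ does not depend on $n$, the same rescaling leaves the discrete equation $\psi_{n+1} = L_n \psi_n$ untouched.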

We observe that the linearization of the equations of motion on the Jacobian of the curve $\Gamma$ is a direct corollary of the linear dependence on $t$ of the exponential factor in the expansion of $\psi_n(t, Q)$ near the punctures.

• As we saw earlier, the normalization (4.31) can be achieved by the action (1.8) of a subgroup of matrices $W$ which commutes with $g_0$. In order to get $\mathcal{M}$, we have to consider in addition the action of diagonal matrices $W$. The basic observation is the following: Let the Baker–Akhiezer functions $\psi_n(t, Q)$ and $\psi'_n(t, Q)$ correspond to equivalent divisors $D$ and $D'$, respectively. Then

$$\psi_n(t, Q) = W \psi'_n(t, Q) h(Q), \tag{4.62}$$

where $h(Q)$ is a function with poles at $D$ and zeros at $D'$, and $W$ is a diagonal matrix $W = h^{-1}(Q_\alpha)\delta_{\alpha,\beta}$. To establish (4.62), it suffices to check that both sides of the equation have the same analytical properties. Equation (4.62) implies that the vectors $p_n$, $q_n$ defined by equivalent divisors are related by a transformation (1.8) with a diagonal matrix $W$. Altogether, we have established the following part of the Main Theorem of Sect. 1:

Theorem 1. The map (4.1) identifies the reduced phase space $\mathcal{M}$ with a bundle over the space of algebraic curves $\Gamma$ defined by (2.1) with $f_N(x)$ of the form (2.2). At generic data, the map has bijective differential. The fiber of the bundle is the Jacobian $J(\Gamma_0)$ of the factor-curve $\Gamma_0 = \Gamma/\sigma$,

$$\mathcal{M} = \{\Gamma,\ [D] \in J(\Gamma_0)\}. \tag{4.63}$$

5. Hamiltonian Theory and Seiberg–Witten Differential: The Even Divisor Model

We come now to the crucial issue of how to determine the symplectic forms with respect to which the system (1.6) is Hamiltonian. For this, we rely on the Hamiltonian approach proposed in [6] and [7] for general soliton equations expressible in terms of Lax or Zakharov–Shabat equations. This approach was effective in the study of gauge theories with matter in the fundamental representation. Further applications were given in [10] and [24]. We review its main features.

5.1. The symplectic forms in terms of the Lax operator. In order to find the Hamiltonian structure of the equations starting with the Lax operator, we need to identify a two-form on the phase space $\mathcal{M}$ of vectors $(q_n, p_n)$, written in terms of the Lax operator $L(x)$. Candidates for such two-forms are

$$\omega_{(m)} = \frac{1}{2} \sum_{\alpha=1}^{3} \mathrm{Res}_{P_\alpha} \big\langle \psi^*_{n+1}(Q)\,\delta L_n(x) \wedge \delta\psi_n(Q) \big\rangle \frac{dx}{x^m}. \tag{5.1}$$

The various expressions in this equation are defined as follows. The notation $\langle f_n \rangle$ stands for a sum over one period of the periodic function $f_n$:

$$\langle f_n \rangle = \sum_{n=0}^{N+1} f_n. \tag{5.2}$$

The expression $\psi_n^*(Q)$ is the dual Baker–Akhiezer function, which is the row-vector solution of the equation

$$\psi^*_{n+1}(Q) L_n(x) = \psi^*_n(Q), \qquad \psi^*_{N+2}(Q) = y^{-1}\psi^*_0(Q), \tag{5.3}$$

normalized by the condition

$$\psi^*_0(Q)\psi_0(Q) = 1. \tag{5.4}$$

Note that (4.9) and (5.3) imply that

$$\psi^*_{n+1}\psi_{n+1} = \psi^*_{n+1}(L_n(x)\psi_n) = (\psi^*_{n+1}L_n(x))\psi_n = \psi^*_n\psi_n$$

does not depend on $n$. We would also like to emphasize that, unlike the Bloch function $\psi_n(Q)$, which does not have $n$-independent zeroes, the normalization (5.4) allows the dual Bloch function $\psi_n^*(Q)$ to have such zeroes. In fact, they occur at the poles of $\psi_n(Q)$.

In (5.1), the differential $\delta$ denotes the exterior differential with respect to the moduli parameters of $\mathcal{M}$. (This is in order to distinguish $\delta$ from the differential $d$, which is the exterior differential on the surface $\Gamma$.) Thus the exterior differential $\delta L_n(x)$ can be viewed as a one-form on $\mathcal{M}$, valued in the space of operator-valued meromorphic functions on $\Gamma$. Similarly the Bloch function $\psi_n(Q)$ and the dual Bloch function $\psi_n^*(Q)$ are functions on $\mathcal{M}$, valued respectively in the space of column-vector-valued and the space of row-vector-valued meromorphic functions on $\Gamma$. It follows that $\delta\psi_n(Q)$ is a one-form on $\mathcal{M}$, valued in the space of column-vector-valued meromorphic functions on $\Gamma$. The expression $\psi^*_{n+1}\,\delta L_n(x) \wedge \delta\psi_n(x)$ is then a two-form on $\mathcal{M}$, valued in the space of meromorphic functions on $\Gamma$, and for each integer $m$, the expression

$$G_{(m)} = \big\langle \psi^*_{n+1}\,\delta L_n(x) \wedge \delta\psi_n(x) \big\rangle \frac{dx}{x^m} \tag{5.5}$$

is a meromorphic 1-form on $\Gamma$. This justifies (5.1) as a two-form on $\mathcal{M}$. In (5.1), we have allowed for a later choice of an integer $m$. We shall see shortly that holomorphicity requirements restrict to $0 \le m \le 2$, and that the symplectic form of the N = 2 SUSY Yang–Mills theory with a hypermultiplet in the antisymmetric representation is obtained by setting $m = 0$. Sometimes it is useful to think of the symplectic form $\omega$ as

$$\omega_{(m)} = \frac{1}{2}\,\mathrm{Res}_{x=\infty}\,\mathrm{Tr}\,\big\langle \Psi^{-1}_{n+1}(x)\,\delta L_n(x) \wedge \delta\Psi_n(x) \big\rangle \frac{dx}{x^m}, \tag{5.6}$$

where $\Psi_n(x)$ is a matrix with the columns $\psi_n(Q_j(x))$, $Q_j(x) = (x, y_j)$, corresponding to the different sheets of $\Gamma$. The matrix $\Psi_n(x)$ is of course not defined globally. Note that $\psi_n^*(Q)$ are the rows of the matrix $\Psi_n^{-1}(x)$. That implies that $\psi_n^*(Q)$, as a function on the spectral curve, is meromorphic outside the punctures, has poles at the branching points of the spectral curve, and zeroes at the poles $z_j$ of $\psi_n(Q)$. These analytical properties will be crucial in the sequel.

5.2. The symplectic forms in terms of x and y. A remarkable property of the symplectic form defined by (5.1) in terms of the Lax operator $L(x)$ is that it can, under quite general circumstances, be rewritten in terms of the meromorphic functions $x$ and $y$ on the spectral curve $\Gamma$. More precisely, we have

$$\omega_{(m)} = -\sum_{i=1}^{2N+1} \delta \ln y(z_i) \wedge \frac{\delta x}{x^m}(z_i). \tag{5.7}$$

The meaning of the right-hand side of this formula is as follows. The spectral curve $\Gamma$ is equipped by definition with the meromorphic functions $y(Q)$ and $x(Q)$. Their evaluations $x(z_i)$, $y(z_i)$ at the points $z_i$ define functions on the space $\mathcal{M}$, and the wedge product of their exterior differentials is a two-form on $\mathcal{M}$.

The proof of the formula (5.7) is very general and does not rely on any specific form of $L_n$. For the sake of completeness we present it here in detail, although it is very close to the proof of Lemma 5.1 in [24].

Recall that the expression $G_{(m)}$ defined in (5.5) is a meromorphic differential on the spectral curve $\Gamma$. Therefore, the sum of its residues at the punctures $P_\alpha$ is equal to the opposite of the sum of the other residues on $\Gamma$. For $m \le 2$, the differential $G_{(m)}$ is regular at the points situated over $x = 0$, thanks to the normalization (4.31), which ensures that $\delta\psi_n(Q) = O(x)$. Otherwise, it has poles at the poles $z_i$ of $\psi_n(Q)$ and at the branch points $s_i$, where we have seen that $\psi^*_{n+1}(Q)$ has poles. We analyze in turn the residues at each of these two types of poles.

First, we consider the poles $z_i$ of $\psi_n(Q)$. By genericity, these poles are all distinct and of first order, and we may write

$$\psi_n \equiv \psi_{n,0}(z_i)\,\frac{1}{x - x(z_i)} + \cdots. \tag{5.8}$$

It follows that $\delta\psi_n$ has a pole of second order at $z_i$,

$$\delta\psi_n = \psi_{n,0}(z_i)\,\frac{\delta x(z_i)}{(x - x(z_i))^2} + \cdots. \tag{5.9}$$

In view of the fact that $\psi^*_{n+1}$ has a simple zero at $z_i$ and hence can be expressed as

$$\psi^*_{n+1} \equiv \psi^*_{n+1,0}\,(x - x(z_i)) + \cdots, \tag{5.10}$$

we obtain

$$\mathrm{Res}_{z_i} G_{(m)} = \big\langle \psi^*_{n+1,0}\,\delta L_n\,\psi_{n,0} \big\rangle \wedge \frac{\delta x}{x^m}(z_i) = \big\langle \psi^*_{n+1}\,\delta L_n\,\psi_n \big\rangle \wedge \frac{\delta x}{x^m}(z_i). \tag{5.11}$$

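The key observation now is that the right-hand side can be rewritten in terms of the monodromy matrix $L(x)$. In fact, the recursive relations (4.9) and (5.3) imply that (with the products ordered so that larger indices stand to the left)

\begin{align}
\big\langle \psi^*_{n+1}\,\delta L_n\,\psi_n \big\rangle
 &= \Big\langle \psi^*_{N+2} \Big(\prod_{m=n+1}^{N+1} L_m\Big)\,\delta L_n\,\Big(\prod_{m=0}^{n-1} L_m\Big) \psi_0 \Big\rangle \tag{5.12}\\
 &= \psi^*_{N+2} \Big[\sum_{n=0}^{N+1} \Big(\prod_{m=n+1}^{N+1} L_m\Big)\,\delta L_n\,\Big(\prod_{m=0}^{n-1} L_m\Big)\Big] \psi_0 \tag{5.13}\\
 &= \psi^*_{N+2}\,\delta L\,\psi_0 = \psi^*_0\,(\delta \ln y)\,\psi_0. \tag{5.14}
\end{align}

In the last equality, we have used the standard formula for the variation of the eigenvalue of an operator, $\psi_0^*\,\delta L\,\psi_0 = \psi_0^*\,\delta y\,\psi_0$. Altogether, we have found that

$$\mathrm{Res}_{z_i} G_{(m)} = \delta \ln y(z_i) \wedge \frac{\delta x}{x^m}(z_i). \tag{5.15}$$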
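For completeness, here is a sketch of the standard argument behind $\psi_0^*\,\delta L\,\psi_0 = \delta y$. It assumes that $L$ denotes the monodromy operator over one period, so that $L\psi_0 = y\psi_0$ and $\psi_0^* L = y\psi_0^*$ (the latter being our reading of (5.3)), together with the normalization (5.4):

```latex
\begin{align*}
L\psi_0 = y\psi_0
  &\;\Longrightarrow\;
  \delta L\,\psi_0 + L\,\delta\psi_0 = \delta y\,\psi_0 + y\,\delta\psi_0 ,\\
\psi_0^{*}\,\delta L\,\psi_0
  &= \delta y\,\psi_0^{*}\psi_0
     + \underbrace{\psi_0^{*}(y - L)}_{=\,0}\,\delta\psi_0
   = \delta y ,\\
\psi_{N+2}^{*}\,\delta L\,\psi_0
  &= y^{-1}\,\psi_0^{*}\,\delta L\,\psi_0
   = y^{-1}\,\delta y = \delta \ln y .
\end{align*}
```

The last line is the form in which the formula enters (5.14).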

The second set of poles of $G_{(m)}$ is the set of branching points $s_i$ of the cover. The pole of $\psi_n^*$ at $s_i$ cancels with the zero of the differential $dx$, $dx(s_i) = 0$, considered as a differential on $\Gamma$. The vector-function $\psi_n$ is holomorphic at $s_i$. However, $\delta\psi_n$ can develop a pole, as we see below. If we take an expansion of $\psi_n$ in the local coordinate $(x - x(s_i))^{1/2}$ (in general position, when the branching point is simple) and consider its variation, we get

\begin{align}
\psi_n &= \psi_{n,0} + \psi_{n,\pm}\,(x - x(s_i))^{1/2} + \cdots, \tag{5.16}\\
\delta\psi_n &= -\frac{1}{2}\,\psi_{n,\pm}\,\frac{\delta x(s_i)}{(x - x(s_i))^{1/2}} + \cdots. \tag{5.17}
\end{align}

Comparing with $\frac{d\psi_n}{dx} = \frac{1}{2}\,\frac{\psi_{n,\pm}}{(x - x(s_i))^{1/2}} + \cdots$, we may write

$$\delta\psi_n = -\frac{d\psi_n}{dx}\,\delta x(s_i) + O(1). \tag{5.18}$$

This shows that $\delta\psi_n$ has a simple pole at $s_i$. Similarly, we may write

$$\delta y = -\frac{dy}{dx}\,\delta x(s_i) + O(1). \tag{5.19}$$

The identities (5.18) and (5.19) imply that

$$\mathrm{Res}_{s_i} G_{(m)} = \mathrm{Res}_{s_i} \big\langle \psi^*_{n+1}\,\delta L_n\,d\psi_n \big\rangle \wedge \frac{\delta y\,dx}{x^m\,dy}. \tag{5.20}$$

Arguing as for (5.12), this can be rewritten as

$$\mathrm{Res}_{s_i} G_{(m)} = \mathrm{Res}_{s_i} \big( \psi^*_{N+2}\,\delta L\,d\psi_0 \big) \wedge \frac{\delta y\,dx}{x^m\,dy}. \tag{5.21}$$

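Due to the antisymmetry of the wedge product, we may replace $\delta L$ in (5.21) by $(\delta L - \delta y)$. Then using the identities

\begin{align}
\psi^*_{N+2}\,(\delta L - \delta y) &= \delta\psi^*_{N+2}\,(y - L), \tag{5.22}\\
(y - L)\,d\psi_0 &= (dL - dy)\,\psi_0, \tag{5.23}
\end{align}

which result from $\psi^*_{N+2}(L - y) = (L - y)\psi_0 = 0$, we obtain

$$\mathrm{Res}_{s_i} G_{(m)} = \mathrm{Res}_{s_i} \big( \delta\psi^*_{N+2}\,(dL - dy)\,\psi_0 \big) \wedge \frac{\delta y\,dx}{x^m\,dy}. \tag{5.24}$$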
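The identities (5.22) and (5.23) are obtained by differentiating the two eigenvalue equations, once with respect to the moduli ($\delta$) and once along the curve ($d$). A short sketch, using nothing beyond $\psi^*_{N+2}(L - y) = 0$ and $(L - y)\psi_0 = 0$:

```latex
\begin{align*}
0 = \delta\big[\psi^*_{N+2}(L - y)\big]
  &= \delta\psi^*_{N+2}\,(L - y) + \psi^*_{N+2}\,(\delta L - \delta y)
  &&\Longrightarrow\quad
  \psi^*_{N+2}(\delta L - \delta y) = \delta\psi^*_{N+2}\,(y - L),\\
0 = d\big[(L - y)\psi_0\big]
  &= (dL - dy)\,\psi_0 + (L - y)\,d\psi_0
  &&\Longrightarrow\quad
  (y - L)\,d\psi_0 = (dL - dy)\,\psi_0 .
\end{align*}
```

Inserting the first identity into (5.21) and then the second one gives (5.24).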

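Now the differential $dL$ does not contribute to the residue, since $dL(s_i) = 0$. Furthermore, $\psi^*_{N+2}\psi_0 = y^{-1}\psi_0^*\psi_0 = y^{-1}$. Thus $\delta\psi^*_{N+2}\,\psi_0 = -\psi^*_{N+2}\,\delta\psi_0 - y^{-2}\delta y$. Exploiting again the antisymmetry of the wedge product, we arrive at

$$\mathrm{Res}_{s_i} G_{(m)} = \mathrm{Res}_{s_i} \big( \psi^*_{N+2}\,\delta\psi_0 \big) \wedge \delta y\,\frac{dx}{x^m}. \tag{5.25}$$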

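Recall that we have normalized the Bloch function $\psi_0(Q)$ at $x = 0$ by (4.31), and that near $x = 0$, the function $y$ is of the form (2.4). Thus $\delta\psi_0 = O(x)$ and $\delta y = O(x)$ near $x = 0$, and the differential form

$$\big( \psi^*_{N+2}\,\delta\psi_0 \big) \wedge \delta y\,\frac{dx}{x^m} \tag{5.26}$$

is holomorphic at $x = 0$ for $0 \le m \le 2$. It is manifestly holomorphic at all the other points of $\Gamma$, except at the branching points $s_i$ and the poles $z_1, \cdots, z_{2N+1}$. Therefore

$$\sum_{s_i} \mathrm{Res}_{s_i} \big( \psi^*_{N+2}\,\delta\psi_0 \big) \wedge \delta y\,\frac{dx}{x^m} = -\sum_{i=1}^{2N+1} \mathrm{Res}_{z_i} \big( \psi^*_{N+2}\,\delta\psi_0 \big) \wedge \delta y\,\frac{dx}{x^m}. \tag{5.27}$$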

Using again the expressions (5.16, 5.18) for $\psi_0$ and $\delta\psi_0$, and the fact that $\psi^*_{N+2} = y^{-1}\psi_0^*$, the right-hand side of (5.27) can be recognized as

$$\sum_{i=1}^{2N+1} \delta \ln y(z_i) \wedge \frac{\delta x(z_i)}{x^m(z_i)}. \tag{5.28}$$

The sum of (5.15) and (5.28) gives (5.7), since

$$2\omega_{(m)} = -\sum_{i=1}^{2N+1} \mathrm{Res}_{z_i} G_{(m)} - \sum_{s_i} \mathrm{Res}_{s_i} G_{(m)}. \tag{5.29}$$
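The bookkeeping behind (5.29) can be summarized as follows. This is a sketch that only recombines (5.1), (5.15), (5.27) and (5.28); it writes $G_{(m)}$ for the differential (5.5) and uses that the residues of a meromorphic differential on the compact curve sum to zero:

```latex
\begin{align*}
2\,\omega_{(m)}
  &= \sum_{\alpha=1}^{3}\operatorname{Res}_{P_\alpha} G_{(m)}
   = -\sum_{i}\operatorname{Res}_{z_i} G_{(m)}
     -\sum_{s_i}\operatorname{Res}_{s_i} G_{(m)} \\
  &= -\sum_{i}\delta\ln y(z_i)\wedge\frac{\delta x(z_i)}{x^{m}(z_i)}
     -\sum_{i}\delta\ln y(z_i)\wedge\frac{\delta x(z_i)}{x^{m}(z_i)}
   = -2\sum_{i}\delta\ln y(z_i)\wedge\frac{\delta x(z_i)}{x^{m}(z_i)} .
\end{align*}
```

Here the index $i$ runs over the poles $z_i$ of $\psi_n$; the first sum comes from (5.15) and the second from (5.25), (5.27) and (5.28).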

The identity (5.7) is proved.

5.3. Action-angle variables and Seiberg–Witten differential. The expression (5.7) for the symplectic form $\omega_{(m)}$ suggests its close relation with the following one-form on $\Gamma$:

$$d\lambda_{(m)} = \ln y\,\frac{dx}{x^m}. \tag{5.30}$$

Strictly speaking, the form $d\lambda_{(m)}$ is not a meromorphic differential in the usual sense, because of the multiple-valuedness of $\ln y$. However, the ambiguities in $\ln y$ are fixed multiples of $2\pi i$, which disappear upon differentiation. Thus, the form $d\lambda_{(m)}$ is no different from the usual meromorphic differentials, as far as the construction of symplectic forms is concerned. Also, the form $d\lambda_{(m)}$ and the form $\frac{1}{m-1}\,x^{-m+1}\,\frac{dy}{y}$ (for $m \neq 1$; for $m = 1$, $-(\ln x)\,\frac{dy}{y}$) differ by an exact differential, and we shall not distinguish between them. From this point of view, the Seiberg–Witten form (1.1) can be identified with the form $-d\lambda_{(0)}$.

Our spin chain model has led so far to a $2N$-dimensional phase space $\mathcal{M}$, equipped with several candidate symplectic forms $\omega_{(m)}$, $0 \le m \le 2$. We still have to reduce $\mathcal{M}$ to a $(2N-2)$-dimensional phase space, and to identify the correct symplectic form. Remarkably, both selections are tied to a key physical requirement for the one-form which corresponds to the Seiberg–Witten differential of an N = 2 SUSY gauge theory, namely the holomorphicity of its variations under moduli deformations. It is an important feature of N = 2 Yang–Mills theories that the masses of the theory are not renormalized. Since the masses of the theory correspond to the poles of the Seiberg–Witten differential $d\lambda$, it follows that $\delta\,d\lambda$ must be holomorphic. Thus we need to examine the poles of $\delta\,d\lambda = \delta \ln y\,\frac{dx}{x^m}$, and identify the subvarieties of $\mathcal{M}$ along which $\delta\,d\lambda$ is holomorphic. There are 3 such subvarieties, corresponding to the choices of $m$:

• On the variety $\mathcal{M}_2 = \mathcal{M} \cap \{u_0 = c_0,\ u_1 = c_1\}$, the differential $\delta\,d\lambda_{(2)} = (\delta \ln y)\,\frac{dx}{x^2}$ has no pole at $Q_\alpha$, since $y = 1 + O(x^2)$ near $x = 0$. On the other hand, the differential $\frac{dx}{x^2}$ vanishes at $x = \infty$, so $\delta \ln y\,\frac{dx}{x^2}$ is also holomorphic there, and $\delta\,d\lambda_{(2)}$ is holomorphic.

• On the variety $\mathcal{M}_0 = \mathcal{M} \cap \{u_N = 1,\ u_{N-1} = 0\}$, the differential $\delta\,d\lambda_{(0)} = (\delta \ln y)\,dx$ is automatically holomorphic at $x = 0$. Near $\infty$, in view of the expansion () for $y$, we have $\delta \ln y = O(x^{-2})$ if we vary only the moduli within $\mathcal{M}_0$. Thus $\delta\,d\lambda_{(0)}$ is holomorphic.

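• On the variety $\mathcal{M}_1 = \mathcal{M} \cap \{u_N = 1\}$, the differential $\delta\,d\lambda_{(1)} = (\delta \ln y)\,\frac{dx}{x}$ is still holomorphic at $x = 0$, because $\delta \ln y = O(x)$. Near $x = \infty$, the sole constraint $\{u_N = 1\}$ suffices to guarantee that $\delta \ln y = O(\frac{1}{x})$. Thus $\delta\,d\lambda_{(1)}$ is holomorphic.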
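The statement made before the list, that $d\lambda_{(m)}$ and $\frac{1}{m-1}x^{-m+1}\frac{dy}{y}$ differ by an exact differential, can be verified by a direct computation. A sketch, treating $\ln y$ (and $\ln x$ for $m = 1$) locally as single-valued functions:

```latex
\begin{align*}
m \neq 1:\quad
  d\!\left(\frac{x^{1-m}}{1-m}\,\ln y\right)
  &= \ln y\,\frac{dx}{x^{m}} + \frac{x^{1-m}}{1-m}\,\frac{dy}{y}
  &&\Longrightarrow\quad
  \ln y\,\frac{dx}{x^{m}}
  = \frac{1}{m-1}\,x^{-m+1}\,\frac{dy}{y} + d(\cdots),\\
m = 1:\quad
  d\big(\ln x\,\ln y\big)
  &= \ln y\,\frac{dx}{x} + \ln x\,\frac{dy}{y}
  &&\Longrightarrow\quad
  \ln y\,\frac{dx}{x}
  = -(\ln x)\,\frac{dy}{y} + d(\cdots).
\end{align*}
```

For $m = 0$ this reduces to $\ln y\,dx = -x\,\frac{dy}{y} + d(x \ln y)$.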

When $m$, and hence $d\lambda_{(m)}$, is even under the involution $\sigma$, action-angle variables can be introduced as follows. Restricted to $\mathcal{M}_{(m)}$, $\delta\,d\lambda_{(m)}$ is holomorphic, and hence can be expressed for suitable coefficients $\delta a_i$ as

$$\delta\,d\lambda_{(m)} = \sum_{i=1}^{2N-1} (\delta a_i)\,d\omega_i, \tag{5.31}$$

where $d\omega_i$ is a basis of $2N - 1$ holomorphic one-forms on $\Gamma$. Since $d\lambda_{(m)}$ is even, only holomorphic one-forms $d\omega_i$ which are even can occur on the right-hand side. We identify such forms with forms on $\Gamma/\sigma$. We choose a symplectic homology basis $A_i$, $B_i$ and a dual basis of holomorphic forms $d\omega_i$, $1 \le i \le N - 1$, for the factor curve $\Gamma/\sigma$. The variables $a_i$ and $a_{Di}$ can then be defined by

$$a_i = \oint_{A_i} d\lambda_{(m)}, \qquad a_{Di} = \oint_{B_i} d\lambda_{(m)}. \tag{5.32}$$

The interpretation of the variables $a_i$ is as action variables from the viewpoint of the spin model and as vacuum moduli from the viewpoint of the N = 2 SUSY gauge theory. Evidently, their variations coincide with the $\delta a_i$ of Eq. (5.31). Next, the angle variables $\phi_i$, $1 \le i \le N - 1$, are defined by

$$D = \{z_1, \cdots, z_{2N+1}\} \longrightarrow \phi_i = \sum_{j=1}^{2N+1} \int^{z_j} d\omega_i. \tag{5.33}$$

We claim now that, for $m$ even, the form $\omega_{(m)}$ is a genuine symplectic form when restricted to $\mathcal{M}_{(m)}$, and that $a_i$ and $\phi_i$ as defined above are action-angle coordinates for $\omega_{(m)}$:

$$\omega_{(m)} = \sum_{i=1}^{N-1} \delta a_i \wedge \delta\phi_i \quad \text{on } \mathcal{M}_{(m)}. \tag{5.34}$$

To see this, we evaluate the two-form $\delta\Big(\sum_{j=1}^{2N+1} \int_{Q_0}^{z_j} \delta\,d\lambda\Big)$ in two different ways. Substituting in (5.31), we find that it is equal to

$$\delta\Big(\sum_{i=1}^{N-1} \delta a_i\,\phi_i\Big) = \sum_{i=1}^{N-1} \delta\phi_i \wedge \delta a_i. \tag{5.35}$$

On the other hand, we can also write

$$\delta\Big(\sum_{j=1}^{2N+1} \int_{Q_0}^{z_j} \delta\,d\lambda\Big) = \delta\Big(\sum_{j=1}^{2N+1} \int_{Q_0}^{z_j} (\delta \ln y)\,\frac{dx}{x^m}\Big) = \sum_{j=1}^{2N+1} \frac{\delta x(z_j)}{x^m(z_j)} \wedge (\delta \ln y)(z_j). \tag{5.36}$$

Comparing the two formulas, and making use of (5.7), we obtain the desired equation (5.34). We observe that for the present even divisor spin model, the space $\mathcal{M}_1$ and the form $d\lambda_{(1)}$ are not applicable. In fact, there are difficulties with both the dimension of $\mathcal{M}_1$

which is odd, and the angle variables $\phi_i$ defined by (5.33), which would vanish identically because the class of the divisor $D$ is even. For the N = 2 SUSY Yang–Mills theory with a hypermultiplet in the antisymmetric representation, the spectral curves are given by $\mathcal{M}_0$. The symplectic form is then $\omega_{(0)}$, which provides an independent check of the choice of Seiberg–Witten form found by Landsteiner and Lopez.

5.4. The Hamiltonian of the Flow. We show now that the even divisor spin model is a Hamiltonian system. More precisely, restricted to each of the phase spaces $\mathcal{M}_{(0)}$ or $\mathcal{M}_{(2)}$, the system is Hamiltonian with respect to the corresponding symplectic form, with a corresponding Hamiltonian. We would like to stress that, once again, the arguments to these ends are quite general, and use only the expression for $\omega_{(m)}$ in terms of the Lax operator.

Lemma 5.1. Let $m$ be either 0 or 2. Then Eqs. (1.6) restricted to $\mathcal{M}_{(m)}$ are Hamiltonian with respect to the symplectic form $\omega_{(m)}$ given by (5.1). The Hamiltonians $H_{(m)}$ are given by

\begin{align}
H_{(0)} &= u_{N-2}, \tag{5.37}\\
H_{(2)} &= \ln u_N = \sum_{n=0}^{N+1} \ln(p_n^+ q_{n-1}) = \frac{1}{2} \sum_{n=0}^{N+1} \ln\big[(p_n^+ q_{n-1})(p_{n-1}^+ q_n)\big]. \tag{5.38}
\end{align}

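Proof. By definition, a vector field $\partial_t$ on a symplectic manifold is Hamiltonian if its contraction $i_{\partial_t}\omega(X) = \omega(X, \partial_t)$ with the symplectic form is an exact one-form $\delta H(X)$. The function $H$ is the Hamiltonian corresponding to the vector field $\partial_t$. Thus

$$i_{\partial_t}\omega_{(m)} = \frac{1}{2} \sum_\alpha \mathrm{Res}_{P_\alpha} \Big( \big\langle \psi^*_{n+1}\,\delta L_n\,\dot\psi_n \big\rangle - \big\langle \psi^*_{n+1}\,\dot L_n\,\delta\psi_n \big\rangle \Big) \frac{dx}{x^m}. \tag{5.39}$$

Now under the flow (1.6), the Lax operators $L_n(x)$ flow according to the Lax equation (4.4), while the Bloch functions $\psi_n$ flow according to (4.61). Consequently,

$$i_{\partial_t}\omega_{(m)} = \frac{1}{2} \sum_\alpha \mathrm{Res}_{P_\alpha} \Big( \big\langle \psi^*_{n+1}\,\delta L_n\,(M_n + \mu)\psi_n \big\rangle - \big\langle \psi^*_{n+1}\,(M_{n+1}L_n - L_n M_n)\,\delta\psi_n \big\rangle \Big) \frac{dx}{x^m}. \tag{5.40}$$

Since $L_n\psi_n = \psi_{n+1}$, it follows that

$$\psi^*_{n+1} M_{n+1} L_n\,\delta\psi_n = \psi^*_{n+1} M_{n+1}\,\delta\psi_{n+1} - \psi^*_{n+1} M_{n+1}\,\delta L_n\,\psi_n.$$

Upon averaging in $n$, we obtain

$$\big\langle \psi^*_{n+1}\,(M_{n+1}L_n - L_n M_n)\,\delta\psi_n \big\rangle = -\big\langle \psi^*_{n+1} M_{n+1}\,\delta L_n\,\psi_n \big\rangle. \tag{5.41}$$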
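The step leading to (5.41) can be spelled out as follows. This is only a sketch, under the assumption that the quantity $\psi_n^* M_n\,\delta\psi_n$ has the same sum over a period as its shift by one step in $n$ (which is how we read "averaging in $n$"); it also uses $\psi^*_{n+1}L_n = \psi^*_n$ from (5.3):

```latex
\begin{align*}
\delta\psi_{n+1} = \delta(L_n\psi_n)
  &\;\Longrightarrow\;
  L_n\,\delta\psi_n = \delta\psi_{n+1} - \delta L_n\,\psi_n ,\\
\big\langle \psi^*_{n+1}(M_{n+1}L_n - L_nM_n)\,\delta\psi_n\big\rangle
  &= \big\langle \psi^*_{n+1}M_{n+1}\,\delta\psi_{n+1}\big\rangle
     - \big\langle \psi^*_{n+1}M_{n+1}\,\delta L_n\,\psi_n\big\rangle
     - \big\langle \psi^*_{n}M_{n}\,\delta\psi_{n}\big\rangle \\
  &= -\big\langle \psi^*_{n+1}M_{n+1}\,\delta L_n\,\psi_n\big\rangle .
\end{align*}
```

The first and third averages cancel under the stated periodicity assumption, leaving the single term on the right-hand side of (5.41).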

For all $n$, both $\delta L_n(x)$ and $M_n(x)$ vanish at $x = 0$. The differential form

$$\big\langle \psi^*_{n+1}\,(\delta L_n M_n + M_{n+1}\,\delta L_n)\,\psi_n \big\rangle \frac{dx}{x^m} \tag{5.42}$$

is thus holomorphic at $x = 0$, in both cases $m = 0$ and $m = 2$. As we have seen, outside of $x = \infty$, the poles of $\psi^*_{n+1}$ are at the branch points and are cancelled by the zeroes of $dx$ there, while the poles of $\psi_n$ are cancelled by the zeroes of $\psi^*_{n+1}$. Thus the above differential form is holomorphic outside of $x = \infty$. The sum of its residues at $P_\alpha$ must be zero:

$$\sum_\alpha \mathrm{Res}_{P_\alpha} \big\langle \psi^*_{n+1}\,(\delta L_n M_n + M_{n+1}\,\delta L_n)\,\psi_n \big\rangle \frac{dx}{x^m} = 0. \tag{5.43}$$

The expression (5.40) for $i_{\partial_t}\omega_{(m)}$ reduces to

$$i_{\partial_t}\omega_{(m)} = \frac{1}{2} \sum_\alpha \mathrm{Res}_{P_\alpha} \big\langle \psi^*_{n+1}\,\delta L_n\,\psi_n \big\rangle\,\mu(Q, t)\,\frac{dx}{x^m}. \tag{5.44}$$

Applying the arguments leading to (5.12), we obtain

$$i_{\partial_t}\omega_{(m)} = \frac{1}{2} \sum_\alpha \mathrm{Res}_{P_\alpha}\,\delta(\ln y)\,\mu(t, Q)\,\frac{dx}{x^m}. \tag{5.45}$$

As follows from (4.50), (4.51), and (4.61), the function $\mu(t, Q)$ is holomorphic at $P_2$, while it has the following expansion at the punctures $P_1$, $P_3$:

$$\mu(t, Q) = -x + O(1), \quad Q \to P_1; \qquad \mu(t, Q) = x + O(1), \quad Q \to P_3. \tag{5.46}$$

We consider now the cases $m = 2$ and $m = 0$ separately. When $m = 2$, the form $\mu\,\frac{dx}{x^2}$ is regular at $P_2$, and has simple poles with opposite residues at $P_1$ and $P_3$. Since $\delta \ln y = \delta \ln u_N + O(\frac{1}{x})$ near $P_1$, it follows immediately that

$$i_{\partial_t}\omega_{(2)} = \delta(\ln u_N). \tag{5.47}$$

When $m = 0$, we observe that the form $(\delta \ln y)\,dx$ is regular at $x = \infty$. Indeed, the constraints $u_N = 1$, $u_{N-1} = 0$ defining the phase space $\mathcal{M}_0$ in this case imply that $\delta \ln y = O(\frac{1}{x^2})$ near all three points $P_1$, $P_2$, and $P_3$. For $P_1$ and $P_3$, this statement is a direct consequence of (2.7) and (2.8). For $P_2$, this follows from the fact that the three roots $y_\alpha$ of the Landsteiner–Lopez curve (2.1) must satisfy $\prod_{\alpha=1}^{3} y_\alpha = 1$. Returning to the residues in (5.45), we see that the point $P_2$ does not contribute. As for the points $P_1$ and $P_3$, they contribute exactly the coefficient $u_{N-2}$ in the expansions (2.7) and (2.8) for $y$,

$$i_{\partial_t}\omega_{(0)} = \delta u_{N-2}. \tag{5.48}$$

The lemma is proved.

5.5. The symplectic form in terms of $(p_n, q_n)$. The expression (5.1) for the symplectic forms $\omega_{(m)}$ in terms of the Lax operator also provides a straightforward way of writing $\omega_{(m)}$ in terms of the dynamical variables $(q_n, p_n)$. Such an expression for the form $\omega_{(0)}$ appears complicated. But it is quite simple for the form $\omega_{(2)}$, and we derive it here. We have $\delta L_n = x\,\delta(q_n p_n^T)$, and the contributions of the three points $P_a$ above $x = \infty$ can be evaluated as follows.

At the point $P_1$, $y = O(x^{N+2})$, $\psi_n = O(x^n)$, $\psi^*_{n+1} = O(x^{-(n+1)})$, and thus the differential $\langle \psi^*_{n+1}\,\delta L_n \wedge \delta\psi_n \rangle \frac{dx}{x^2}$ is regular. The residue at $P_1$ vanishes.

At the point $P_2$, $\psi_n$ and $\psi^*_{n+1}$ are regular. Using the same notation as in (4.33), we write

\begin{align}
\psi_n &= \psi_{n,0} + \psi_{n,1}\,x^{-1} + \cdots, \tag{5.49}\\
\psi^*_{n+1} &= \psi^*_{n+1,0} + \psi^*_{n+1,1}\,x^{-1} + \cdots. \tag{5.50}
\end{align}

In analogy with (ref), from the equation

$$\psi^*_{n+1} = \psi^*_n L_n(x)^{-1} = \psi^*_n (1 - x\,q_n p_n^T), \tag{5.51}$$

it follows that

$$\psi^*_{n,0}\,q_n = \psi^*_{n+1,0}\,q_n = 0. \tag{5.52}$$

The residue at $P_2$ is then readily identified:

\begin{align}
\mathrm{Res}_{P_2} \big\langle \psi^*_{n+1}\,\delta L_n \wedge \delta\psi_n \big\rangle \frac{dx}{x^2}
 &= \mathrm{Res}_{P_2} \big\langle \psi^*_{n+1,0}\,\delta(q_n p_n^T) \wedge \delta\psi_{n,0} \big\rangle \frac{dx}{x} \tag{5.53}\\
 &= -\big\langle \psi^*_{n+1,0}\,\delta q_n \wedge (\delta p_n^T)\,\psi_{n,1} \big\rangle \tag{5.54}\\
 &\equiv \mathrm{I}. \tag{5.55}
\end{align}
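The inverse used in (5.51) can be checked directly. The sketch below assumes that $L_n(x) = 1 + x\,q_n p_n^T$, which is consistent with $\delta L_n = x\,\delta(q_n p_n^T)$ but is not restated in this section, and it uses the orthogonality $p_n^T q_n = 0$:

```latex
\begin{align*}
(1 + x\,q_np_n^T)(1 - x\,q_np_n^T)
  &= 1 + x\,q_np_n^T - x\,q_np_n^T - x^{2}\,q_n\,(p_n^Tq_n)\,p_n^T
   = 1 .
\end{align*}
```

So under this assumption $L_n(x)^{-1} = 1 - x\,q_n p_n^T$ holds exactly, not just to first order in $x$.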

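At the point $P_3$, $y = O(x^{-N-2})$, and

\begin{align}
\psi_n &= \psi_{n,0}\,x^{-n} + \psi_{n,1}\,x^{-n-1} + \cdots, \tag{5.56}\\
\psi^*_{n+1} &= \psi^*_{n+1,0}\,x^{n+1} + \psi^*_{n+1,1}\,x^{n} + \cdots. \tag{5.57}
\end{align}

It follows that the residue is given by

$$\mathrm{Res}_{P_3} \big\langle \psi^*_{n+1}\,\delta L_n \wedge \delta\psi_n \big\rangle \frac{dx}{x^2} = \big\langle \psi^*_{n+1,0}\,\delta(q_n p_n^T) \wedge \delta\psi_{n,1} + \psi^*_{n+1,1}\,\delta(q_n p_n^T) \wedge \delta\psi_{n,0} \big\rangle. \tag{5.58}$$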

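We now make use of Eq. (5.51) to derive recursion relations between the coefficients of $\psi_n^*$,

$$\psi^*_{n+1,0} = -\psi^*_{n,0}\,q_n p_n^T, \qquad \psi^*_{n+1,1} = \psi^*_{n,0} - \psi^*_{n,1}\,q_n p_n^T. \tag{5.59}$$

They imply that

$$\psi^*_{n+1,0}\,q_n = 0, \qquad \psi^*_{n+1,1}\,q_n = \psi^*_{n,0}\,q_n. \tag{5.60}$$

As a consequence, the first term on the right-hand side of () simplifies to

$$\psi^*_{n+1,0}\,\delta(q_n p_n^T) \wedge \delta\psi_{n,1} = \psi^*_{n+1,0}\,\delta q_n \wedge p_n^T\,\delta\psi_{n,1}. \tag{5.61}$$

Now recall that we introduced the coefficient $\beta_n$ by $\psi_{n,0} = \beta_n q_n$. Comparing with the equation (), we obtain

$$\beta_n = -p_n^T \psi_{n,1}, \tag{5.62}$$

and the preceding term becomes

$$\psi^*_{n+1,0}\,\delta(q_n p_n^T) \wedge \delta\psi_{n,1} = -\psi^*_{n+1,0}\,\delta q_n \wedge \delta\beta_n - \psi^*_{n+1,0}\,(\delta q_n \wedge \delta p_n^T)\,\psi_{n,1}. \tag{5.63}$$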
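The recursion relations (5.59) and their consequences (5.60) follow by comparing powers of $x$ in (5.51). A sketch, assuming that the expansion (5.57) also holds with $n + 1$ replaced by $n$, i.e. $\psi^*_n = \psi^*_{n,0}\,x^{n} + \psi^*_{n,1}\,x^{n-1} + \cdots$ near $P_3$, and that $p_n^T q_n = 0$:

```latex
\begin{align*}
\psi^*_{n+1} = \psi^*_n\,(1 - x\,q_np_n^T)
  &= -\psi^*_{n,0}\,q_np_n^T\,x^{n+1}
     + \big(\psi^*_{n,0} - \psi^*_{n,1}\,q_np_n^T\big)\,x^{n} + \cdots ,\\
\psi^*_{n+1,0}\,q_n
  &= -\psi^*_{n,0}\,q_n\,(p_n^Tq_n) = 0 ,\\
\psi^*_{n+1,1}\,q_n
  &= \psi^*_{n,0}\,q_n - \psi^*_{n,1}\,q_n\,(p_n^Tq_n) = \psi^*_{n,0}\,q_n .
\end{align*}
```

The same orthogonality gives the step from (5.61) to (5.63): $p_n^T\,\delta\psi_{n,1} = \delta(p_n^T\psi_{n,1}) - \delta p_n^T\,\psi_{n,1} = -\delta\beta_n - \delta p_n^T\,\psi_{n,1}$.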

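On the other hand, $p_n^T\psi_{n,0} = 0$, and the second term on the right-hand side of () can be rewritten as

$$\psi^*_{n+1,1}\,\delta(q_n p_n^T) \wedge \delta\psi_{n,0} = \psi^*_{n+1,1}\,q_n\,\delta p_n^T \wedge \delta\psi_{n,0} - \psi^*_{n+1,1}\,\delta q_n \wedge (\delta p_n^T)\,\psi_{n,0}. \tag{5.64}$$

Altogether, we obtain the following expression for the residue at $P_3$:

$$\mathrm{Res}_{P_3} \big\langle \psi^*_{n+1}\,\delta L_n \wedge \delta\psi_n \big\rangle \frac{dx}{x^2} = \mathrm{II} + \mathrm{III}, \tag{5.65}$$

where the terms II and III are defined by

\begin{align}
\mathrm{II} &= -\big[\psi^*_{n+1,0}\,(\delta q_n \wedge \delta p_n^T)\,\psi_{n,1} + \psi^*_{n+1,1}\,(\delta q_n \wedge \delta p_n^T)\,\psi_{n,0}\big], \tag{5.66}\\
\mathrm{III} &= -\big(\psi^*_{n+1,0}\,\delta q_n \wedge \delta\beta_n - \psi^*_{n+1,1}\,q_n\,\delta p_n^T \wedge \delta\psi_{n,0}\big). \tag{5.67}
\end{align}

We claim that the term III can be simplified to

$$\mathrm{III} = -\delta p_n^T \wedge \delta q_n. \tag{5.68}$$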

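In fact, in view of the recursion relations (5.59) and the fact that $\psi_{n,0} = \beta_n q_n$, it can be rewritten as

$$\mathrm{III} = -\big(\psi^*_{n,0}\,q_n\big)\big(p_n^T\,\delta q_n \wedge \delta\beta_n + \delta p_n^T \wedge (\delta\beta_n)\,q_n + \delta p_n^T \wedge \beta_n\,\delta q_n\big). \tag{5.69}$$

The first two terms on the right-hand side cancel, since $p_n^T q_n = 0$. As for the remaining term, we note that the normalization $\psi_n^*\psi_n = 1$ implies near $P_3$

$$1 = \big(\psi^*_{n,0}\,x^{-n} + O(x^{-n-1})\big)\big(\beta_n q_n\,x^{n} + O(x^{n-1})\big) = \psi^*_{n,0}\,\beta_n q_n + O(x^{-1}), \tag{5.70}$$

from which it follows that $\psi^*_{n,0}\,\beta_n q_n = 1$. The identity (5.68) is established.

Finally, it is readily seen that the remaining terms I and II combine into

$$\mathrm{I} + \mathrm{II} = -\sum_{a=1}^{3} \mathrm{Res}_{P_a} \big\langle \psi^*_{n+1}\,\delta q_n \wedge \delta p_n^T\,\psi_n \big\rangle \frac{dx}{x}. \tag{5.71}$$

But the 1-form $\langle \psi^*_{n+1}\,\delta q_n \wedge \delta p_n^T\,\psi_n \rangle \frac{dx}{x}$ is meromorphic on $\Gamma$, with poles only at the points $P_a$ above $x = \infty$ and $Q_a$ above $x = 0$. We can then deform the contours and rewrite I + II as residues at $Q_a$,

$$\mathrm{I} + \mathrm{II} = \sum_{a=1}^{3} \mathrm{Res}_{Q_a} \big\langle \psi^*_{n+1}\,\delta q_n \wedge \delta p_n^T\,\psi_n \big\rangle \frac{dx}{x}. \tag{5.72}$$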

At $x = 0$, we have $\psi^*_{n+1} = \psi^*_n$, and this expression is determined by the normalization condition (4.31) on the matrix $W$. In terms of $\psi_n$, the normalization (4.31) can be restated as the normalization condition $\psi_n^*(0)\psi_n^T = I$ as an identity between $3 \times 3$ matrices. Thus $\mathrm{I} + \mathrm{II} = 3\,\langle \delta q_n \wedge \delta p_n \rangle$, and we obtain the final formula for the symplectic form $\omega$ in terms of $p_n$ and $q_n$,

$$\omega = 2 \sum_{n=0}^{N+1} \delta q_n^T \wedge \delta p_n. \tag{5.73}$$
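The coefficient 2 in (5.73) can be traced through the three residues. A sketch, assuming the $m = 2$ case of (5.1), that (5.68) is understood as summed over one period, and that for vector-valued one-forms $\delta p_n^T \wedge \delta q_n = -\delta q_n^T \wedge \delta p_n$:

```latex
\begin{align*}
\omega_{(2)}
  &= \tfrac12\big(\operatorname{Res}_{P_1}
     + \operatorname{Res}_{P_2} + \operatorname{Res}_{P_3}\big)
   = \tfrac12\big(0 + \mathrm{I} + \mathrm{II} + \mathrm{III}\big) \\
  &= \tfrac12\big(3\,\langle\delta q_n^T\wedge\delta p_n\rangle
     + \langle\delta q_n^T\wedge\delta p_n\rangle\big)
   = 2\sum_{n=0}^{N+1}\delta q_n^T\wedge\delta p_n .
\end{align*}
```

Here the residue at $P_1$ vanishes, the residue at $P_2$ is I, and the residue at $P_3$ is II + III, as found in the text.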

6. Hamiltonian Theory and Seiberg–Witten Differential: The Odd Divisor Model

The main difference between the even and the odd divisor spin models is in the parity of the divisor $D$ of poles of the Bloch function $\psi_n(Q)$. For the odd divisor spin model, $D$ is essentially odd under the involution $\sigma : (x, y) \to (-x, y^{-1})$ in the following sense:

$$[D] + [D^\sigma] = K + 2 \sum_{\alpha=1}^{3} P_\alpha. \tag{6.1}$$

Here $K$ is the canonical class, which is the divisor class of any meromorphic 1-form on $\Gamma$. As in the case of the even divisor spin model, the relation (6.1) is a consequence of the transformation of $L(x)$ under $\sigma$, which is in this case $L(-x) = (L(x)^{-1})^T$. This implies that $\psi_0(Q^\sigma)$ and $\psi_0(Q)^*$ are both dual Bloch functions for $L(x)$, and thus

$$\psi_0^*(Q) = \psi_0(Q^\sigma) f(Q), \tag{6.2}$$

where $f(Q)$ is a meromorphic function on $\Gamma$. But the zeroes of the dual Bloch function $\psi_0^*$ are exactly the poles of $\psi_0(Q)$, while its poles are exactly the branch points of the surface $\Gamma$. Thus the preceding equation implies the following equation for divisor classes:

$$[\text{branch points}] - [D] = [D^\sigma]. \tag{6.3}$$

To determine the divisor of the branch points of $\Gamma$, we consider the differential $dx$, viewed as a meromorphic form on $\Gamma$. Since $dx$ has a pole of order 2 at each $P_a$, and a zero at each branch point, we have $[\text{branch points}] - 2\sum_{a=1}^{3} P_a = K$, and the desired relation (6.1) follows.

• We discuss briefly the direct and the inverse problems for the odd divisor spin system. Once the difference in parity of the divisor of poles of the Bloch functions is taken into account, the direct problem is treated in exactly the same way as before. As for the inverse problem, we need only a few minor modifications in the expansions near the punctures $P_1$, $P_3$ (cf. (4.50), (4.51)), which we now give:

\begin{align}
\phi_n &= x^{n} e^{xt} \Big( \sum_{k=0}^{\infty} \phi_{n,k}(P_1)\,x^{-k} \Big), \quad Q \to P_1, \tag{6.4}\\
\phi_n &= x^{-n} e^{xt} \Big( \sum_{k=0}^{\infty} \phi_{n,k}(P_3)\,x^{-k} \Big), \quad Q \to P_3. \tag{6.5}
\end{align}

They lead to minor modifications in the exact formulas for the Baker–Akhiezer function $\phi_n(t, Q)$ (cf. (4.53)):

$$\phi_{n,\alpha}(t, Q) = r_\alpha(Q)\,\frac{\theta(A(Q) + tU^- + nV + Z_\alpha)\,\theta(Z_0)}{\theta(A(Q) + Z_\alpha)\,\theta(tU^- + nV + Z_0)}\,\exp\left(\int_{Q_\alpha}^{Q} n\,dG_0 + t\,dG^-\right), \tag{6.6}$$

where $dG^- = dG_1 - dG_1^\sigma$ and $U^- = U - U^\sigma$. We show that if the divisor $D$ satisfies (6.1), then the corresponding Baker–Akhiezer function satisfies the relation

$$\phi_n^*(t, Q) = \phi_n^T(t, Q^\sigma) f(Q), \tag{6.7}$$

where as before $\phi_n^*$ are the rows of the matrix inverse to the matrix

$$M_n^{\beta,\alpha}(x) = \phi_{n,\alpha}(P_\beta). \tag{6.8}$$

Here the points $P_\alpha(x)$ are the three preimages of $x$ on $\Gamma$ on different sheets. Of course, the matrix $M_n(x)$ does depend on the ordering of sheets, but one can check that if for $P_\alpha(x)$ we define $\phi_n^*(P_\alpha)$ as the corresponding row of the inverse matrix, then $\phi_n^*$ is well-defined. As before $\phi_n^*$ has poles at all the branching points and zeroes at the points of the divisor $D$. To establish (6.7), we show that

$$\sum_{\gamma} \phi_{n,\alpha}(t, P_\gamma(x))\,\phi_{n,\beta}(t, P_\gamma^\sigma)\,f(P_\gamma) = \delta_{\alpha,\beta}. \tag{6.9}$$

Indeed, from (6.4) and (6.5), it follows that the function $\phi_{n,\alpha}(t, Q)\phi_{n,\beta}(Q^\sigma)f(Q)$ is holomorphic everywhere except at the branching points (the poles and the essential singularities at the punctures $P_\alpha$ over $x = \infty$ cancel each other; there are no poles at $D$ and $D^\sigma$ because $f(Q)$ has zeros at these points). Therefore, the left-hand side of the above equation is a holomorphic function of $x$ (the poles at the branching points cancel upon the summation). Hence it is a constant, which can be found by taking $x = 0$. The uniqueness of $\phi_n$ and the relation (6.7) imply as before that it satisfies the equations

$$\phi_{n+1} = L_n(x)\phi_n, \qquad \partial_t \phi_n = M_n(x)\phi_n, \tag{6.10}$$

where $L_n$ and $M_n$ have the form (3.4), (3.5).

• We come now to the Hamiltonian structure of the odd divisor spin model. Recall that we had introduced the space $\mathcal{M}^{odd}$ of spin chains. Solving the direct and inverse spectral problem as in the case of the even divisor spin model, we can identify $\mathcal{M}^{odd}$ with the space of geometric data

$$\mathcal{M}^{odd} \leftrightarrow \Big\{\Gamma,\ D;\ [D] + [D^\sigma] = K + 2 \sum_{\alpha=1}^{3} P_\alpha\Big\}. \tag{6.11}$$

We can verify that the space on the right-hand side is $(2N + 1)$-dimensional, as it should be: there are $N + 1$ moduli parameters for the curve $\Gamma$, and $N$ parameters for the antisymmetric divisor $[D]$. The same discussion as in Sects. 5.3 and 5.4 for the even divisor spin model shows that, in the present case, the only candidate for the symplectic form is the form $\omega_{(1)}$, restricted to the $2N$-dimensional phase space $\mathcal{M}_1^{odd}$ defined by

$$\mathcal{M}_1^{odd} = \mathcal{M}^{odd} \cap \{u_N = 1\}. \tag{6.12}$$

The corresponding action and angle variables are now given by

$$a_i = \oint_{A_i^{odd}} d\lambda_{(1)}, \qquad \phi_i = \sum_{j=1}^{2N+1} \int^{z_j} d\omega_i^{odd}, \qquad 1 \le i \le N, \tag{6.13}$$

where $d\omega_i^{odd}$ and $A_i^{odd}$ are respectively a basis of odd holomorphic differentials and a basis of odd A-cycles. We have then as before

$$\omega_{(1)} = \sum_{j=1}^{N} \delta a_j \wedge \delta\phi_j. \tag{6.14}$$

References

1. Seiberg, N. and Witten, E.: Electro-magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. Nucl. Phys. B 426, 19–53 (1994), hep-th/9407087
   Seiberg, N. and Witten, E.: Monopoles, duality, and chiral symmetry breaking in N = 2 supersymmetric QCD. Nucl. Phys. B 431, 494 (1994), hep-th/9410167
2. Gorskii, A., Krichever, I.M., Marshakov, A., Mironov, A. and Morozov, A.: Integrability and Seiberg–Witten exact solution. Phys. Lett. B 355, 466 (1995), hep-th/9505035
   Martinec, E.: Integrable structures in supersymmetric gauge and string theory. hep-th/9510204
3. Martinec, E. and Warner, N.: Integrable systems and supersymmetric gauge theories. Nucl. Phys. B 459, 97–112 (1996), hep-th/9509161
4. Donagi, R. and Witten, E.: Supersymmetric Yang–Mills and integrable systems. Nucl. Phys. B 460, 288–334 (1996), hep-th/9510101
5. Novikov, S.P. and Veselov, A.: On Poisson brackets compatible with algebraic geometry and Korteweg–deVries dynamics on the space of finite-zone potentials. Soviet Math. Doklady 26, 357–362 (1982)
6. Krichever, I. and Phong, D.H.: On the integrable geometry of N = 2 supersymmetric gauge theories and soliton equations. J. Differ. Geom. 45, 445–485 (1997), hep-th/9604199
7. Krichever, I. and Phong, D.H.: Symplectic forms in the theory of solitons. In: Surveys in Differential Geometry IV, edited by C.L. Terng and K. Uhlenbeck, International Press, 1998, pp. 239–313, hep-th/9708170
8. D’Hoker, E. and Phong, D.H.: Calogero–Moser systems in SU(N) Seiberg–Witten theory. Nucl. Phys. B 513, 405–444 (1998), hep-th/9709053
   D’Hoker, E. and Phong, D.H.: Order parameters, free fermions, and conservation laws for Calogero–Moser systems. Asian J. Math. 2, 655–666 (1998), hep-th/9808156
9. D’Hoker, E. and Phong, D.H.: Calogero–Moser Lax pairs with spectral parameter for general Lie algebras. Nucl. Phys. B 530, 537–610 (1998), hep-th/9804124
   D’Hoker, E. and Phong, D.H.: Calogero–Moser and Toda systems for twisted and untwisted affine Lie algebras. Nucl. Phys. B 530, 611–640 (1998), hep-th/9804125
   D’Hoker, E. and Phong, D.H.: Spectral curves for super Yang–Mills with adjoint hypermultiplet for general Lie algebras. Nucl. Phys. B 534, 697–719 (1998), hep-th/9804126
   D’Hoker, E. and Phong, D.H.: Lax pairs and spectral curves for Calogero–Moser and Spin Calogero–Moser systems. Regular and Chaotic Dynamics (1999), hep-th/9903002
10. Krichever, I.: Elliptic analog of the Toda lattice. hep-th/9909224
11. Argyres, P. and Faraggi, A.: The vacuum structure and spectrum of N = 2 supersymmetric SU(N) gauge theory. Phys. Rev. Lett. 73, 3931 (1995), hep-th/9411057
   Klemm, A., Lerche, W., Theisen, S. and Yankielowicz, S.: Simple singularities and N = 2 supersymmetric gauge theories. Phys. Lett. B 344, 169 (1995), hep-th/9411058
   Argyres, P. and Shapere, A.: The vacuum structure of N = 2 QCD with classical gauge groups. hep-th/9509175
   Danielsson, U.H. and Sundborg, B.: The moduli space and monodromies of N = 2 supersymmetric SO(2r + 1) gauge theories. hep-th/9504102
   Brandhuber, A. and Landsteiner, K.: On the monodromies of N = 2 supersymmetric Yang–Mills theories with gauge group SO(2n). hep-th/9507008
   Argyres, P., Plesser, M.R. and Shapere, A.: The Coulomb phase of N = 2 supersymmetric QCD. Phys. Rev. Lett. 75, 1699 (1995), hep-th/9505100
   Abolhasani, M.R., Alishahiha, M. and Ghezelbash, A.M.: The moduli space and monodromies of the N = 2 supersymmetric Yang–Mills theory with any Lie gauge group. Nucl. Phys. B 480, 279–295 (1996), hep-th/9606043
   Alishahiha, M., Ardalan, F. and Mansouri, F.: The moduli space of the supersymmetric G(2) Yang–Mills theory. Phys. Lett. B 381, 446 (1996), hep-th/9512005
   Hanany, A. and Oz, Y.: On the quantum moduli space of vacua of the N = 2 supersymmetric SU(N) Yang–Mills theories. Nucl. Phys. B 452, 73 (1995), hep-th/9505075
   Hanany, A.: On the quantum moduli space of vacua of N = 2 supersymmetric gauge theories. Nucl. Phys. B 466, 85 (1996), hep-th/9509176
12. Lerche, W.: Introduction to Seiberg–Witten theory and its stringy origins. In: Proceedings of the Spring School and Workshop on String Theory, ICTP, Trieste (1996), hep-th/9611190, Nucl. Phys. Proc. Suppl. B 55, 83 (1997)
   Marshakov, A.: On integrable systems and supersymmetric gauge theories. Theor. Math. Phys. 112, 791–826 (1997), hep-th/9702083
   Klemm, A.: On the geometry behind N = 2 supersymmetric effective actions in four dimensions. Trieste 1996, High Energy Physics and Cosmology, 120–242, hep-th/9705131
   Marshakov, A. and Mironov, A.: Seiberg–Witten systems and Whitham hierarchies: A short review. hep-th/9809196


13.

14.

15.

16.

17. 18.


D’Hoker, E. and Phong, D.H.: Seiberg–Witten theory and Integrable Systems. hep-th/9903068 Lozano, C.: Duality in topological quantum field theories. hep-th/9907123 D’Hoker E. and Phong, D.H.: Lectures on Supersymmetric Yang–Mills Theory and Integrable Models. In: Notes from lecture series at Banff and Champaign-Urbana, hep-th/9912xxx D’Hoker, E., Krichever, I.M. and Phong, D.H.: The effective prepotential for N = 2 supersymmetric SU (Nc ) gauge theories. Nucl. Phys. B 489, 179 (1997), hep-th/9609041 D’Hoker, E., Krichever, I.M. and Phong, D.H.: The effective prepotential of N = 2 supersymmetric SO(Nc ) and Sp(Nc ) gauge theories. Nucl. Phys. B 489, 211 (1997), hep-th/9609145 D’Hoker, E., Krichever, I.M., and Phong, D.H.: The renormalization group equation for N = 2 supersymmetric gauge theories. Nucl. Phys. 494, 89–104 (1997), hep-th/9610156 D’Hoker, E. and Phong, D.H.: Strong coupling expansions of SU(N) Seiberg–Witten theory. Phys. Lett. B 397, 94 (1997), hep-th/9701055 Chan, G. and D’Hoker, E.: Instanton Recursion Relations for the Effective Prepotential in N = 2 Super Yang–Mills. hep-th/9906193 Krichever, I.M.: The tau function of the universal Whitham hierarchy, matrix models, and topological field theories. Comm. Pure Appl. Math. 47, 437–475 (1994) Krichever, I.M.: The dispersionless Lax equations and topological minimal models. Commun. Math. Phys. 143, 415–429 (1992) Dubrovin, B.A.: Hamiltonian formalism for Whitham hierarchies and topological Landau–Ginzburg models. Commun. Math. Phys. 145, 195–207 (1992) Dubrovin, B.A.: Integrable systems in topological field theory. Nucl. Phys. B 379 (1992) 627-689 Dubrovin, B.A.: Geometry of 2D topological field theories. Trieste Lecture Notes, 1995 Matone, M.: Instanton recursion relations in N = 2 SUSY gauge theories. Phys. Lett. B357, 342 (1996), hep-th/9506102 Nakatsu, T. and Takasaki, K.: Whitham-Toda hierarchy and N = 2 supersymmetric Yang–Mills theory. Mod. Phys. Lett. 
A 11, 157 (1995), hep-th/9509162 Sonnenschein, J., Theisen, S. and Yankielowicz, S.: Phys. Lett. B 367, 145 (1996), hep-th/9510129 Eguchi, and Yang, S.K.: Prepotentials of N = 2 SUSY gauge theories and soliton equations. Mod. Phys. Lett. A 11, 131 (1996), hep-th/9510183 Ahn, C. and Nam, S.: hep-th/9603028 Edelstein, J.D. and Mas, J.: Strong coupling expansion and Seiberg–Witten Whitham equations. Phys. Lett. B 452, 69 (1999), hep-th/9901006 Edelstein, J.D., Gomez-Reina, M. and Mas, J.: Instanton corrections in N = 2 supersymmetric theories with clasical gauge groups and fundamental matter hypermultiplets. hep-th/9904087 Marino, M.: The uses of Whitham hierarchies. hep-th/9905053 Marshakov, A., Mironov, A. and Morozov, A.: WDVV-like equations in N = 2 SUSY Yang–Mills theory. Phys. Lett. B 389, 43, (1996), hep-th/9607109 Marshakov, A., Mironov, A. and Morozov, A.: More evidence for the WDVV equations in N = 2 SUSY Yang–Mills theory. hep-th/9701123 Isidro, J.M.: On the WDVV equation and M theory. Nucl. Phys. B 539, 379–402 (1999), hep-th/9805051 Witten, E.: Solutions of four-dimensional field theories via M-Theory. Nucl. Phys. B 500, 3 (1997), hepth/9703166 Brandhuber, A., Sonnenschein, J., Theisen, S. andYankielowicz, S.: M Theory and Seiberg–Witten curves: Orthogonal and symplectic groups. Nucl. Phys. B 504, 175 (1997), hep-th/9705232 Landsteiner, K., Lopez, E. and Lowe, D.A.: N=2 supersymmetric gauge theories, branes and orientifolds. Nucl. Phys. B507, 197 (1997), hep-th/9705199 Gorsky, A.: Branes and Integrability in the N=2 susy YM theory. Int. J. Mod. Phys. A 12, 1243 (1997), hep-th/9612238 Gorsky, A., Gukov, S. and Mironov, A.: Susy field theories, integrable systems and their stringy brane origin. hep-th/9710239 Cherkis, A. and Kapustin, A.: Singular Monopoles and supersymmetric gauge theories in three dimensions. hep-th/9711145 Uranga, A.M.: Towards mass deformed N = 4 SO(N ) and Sp(K) gauge theories from brane configurations. Nucl. Phys. 
B 526, 241–277 (1998), hep-th/9803054 Yokono, T.: Orientifold four plane in brane configurations and N = 4 U Sp(2N ) and SO(2N ) theory. Nucl. Phys. B 532, 210–226 (1998), hep-th/9803123 Landsteiner, K., Lopez, E. and Lowe, D.: Supersymmetric gauge theories from Branes and orientifold planes. hep-th/9805158 Landsteiner, K. and Lopez, E.: New curves from branes. hep-th/9708118 Katz, S., Mayr, P. and Vafa, C.: Mirror symmetry and exact solutions of 4D N = 2 gauge theories. Adv. Theor. Math. Phys. 1, 53 (1998), hep-th/9706110


19.

20.

21. 22.

23. 24.


Katz, S., Klemm, A. and Vafa, C.: Geometric engineering of quantum field theories. Nucl. Phys. B 497, 173 (1997), hep-th/9609239 Bershadsky, M., Intriligator, K., Kachru, S., Morrison, D.R., Sadov, V. and Vafa, C.: Geometric singularities and enhanced gauge symmetries. Nucl. Phys. B 481, 215 (1996), hep-th/9605200 Kachru, S. and Vafa, C.: Exact results for N=2 compactifications of heterotic strings. Nucl. Phys. B 450, 69 (1995), hep-th/9505105 Marshakov, A. and Mironov, A.: 5d and 6d supersymmetric gauge theories: Prepotentials from integrable systems. Nucl. Phys. B 518, 59–91 (1998), hep-th/9711156 Braden, H., Marshakov, A., Mironov, A. and Morozov, A.: The Ruijsenaars–Schneider model in the context of Seiberg–Witten theory. hep-th/9902205 Ohta, Y.: Instanton correction of prepotential in Ruijsenaars model associated with N=2 SU (2) Seiberg– Witten. hep-th/9909196 Braden, H.W., Marshakov, A., Mironov, A. and Morozov, A.: Seiberg–Witten theory for a nontrivial compactification from five-dimensions to four dimensions. Phys. Lett. B 448, 195–202 (1999), hepth/9812078 Takasaki, K.: Elliptic Calogero–Moser systems and isomonodromic deformations. math.qa/9905101 Ennes, I.P., Naculich, S.G., Rhedin, H. and Schnitzer, H.J.: One instanton predictions of a Seiberg–Witten curve from M-theory: The symmetric case. Int. J. Mod. Phys. A 14, 301 (1999), hep-th/9804151 Naculich, S.G., Rhedin, H. and Schnitzer, H.J.: One instanton test of a Seiberg–Witten curve from Mtheory: The antisymmetric representation. Nucl. Phys. B 533, 275 (1988), hep-th/9804105 Ennes, I.P., Naculich, S.G., Rhedin, H. and Schnitzer, H.J.: One instanton predictions for non-hyperelliptic curves derived from M-theory. Nucl. Phys. B 536, 245 (1988), hep-th/9806144 Ennes, I.P., Naculich, S.G., Rhedin, H. and Schnitzer, H.J.: One instanton predictions of Seiberg–Witten curves for product groups. 
hep-th/9901124 Two antisymmetric hypermultiplets in N=2 SU(N) gauge theory: Seiberg–Witten curve and M theory interpretation. hep-th/9904078 Ennes, I.P., Naculich, S.G., Rhedin, H. and Schnitzer, H.J.: One instanton predictions of Seiberg–Witten curves for product groups. hep-th/9901124 Krichever, I. and Korchemsky, G.: Solitons in high-energy QCD. Nucl. Physics B 505, 387–414 (1997), hep-th/9704079 Krichever, I.M.: The algebraic-geometric construction of Zakharov–Shabat equations and their solutions. Doklady Akad. Nauka USSR 227, 291–294 (1976) Krichever, I.M.: Methods of algebraic geometry in the theory of non-linear equations. Russian Math Surveys 32 185–213 (1977) Krichever, I.,Babelon, O., Billey, E. and Talon, M.: Spin generalization of the Calogero–Moser system and the Matrix KP equation. Am. Math. Transl. 170 2, 83–119 (1995) Krichever, I.M.: Elliptic solutions to difference non-linear equations and nested Bethe ansatz. solvint/9804016

Communicated by T. Miwa

Commun. Math. Phys. 213, 575 – 597 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Localization in One Dimensional Random Media: A Scattering Theoretic Approach

Robert Sims, Günter Stolz

Department of Mathematics, University of Alabama at Birmingham, Birmingham, AL 35294-1170, USA

Received: 23 September 1999 / Accepted: 13 March 2000

Abstract: We use scattering theoretic methods to prove exponential localization for random displacement models in one dimension. The operators we consider model both quantum and classical wave propagation. Our main tools are the reflection and transmission coefficients for compactly supported single site perturbations. We show that randomly displaced, non-reflectionless single sites lead to localization.

Both authors partially supported by NSF grant DMS-9706076.

1. Introduction

Electron localization in disordered media is generally understood as a multiscattering phenomenon in physics: If scattering at each single site is non-trivial and a global potential is constructed by dispersing these single sites in a sufficiently random manner (to avoid resonance effects as in the periodic case), then states should be localized. Our main goal here is to provide a proof of localization for certain one dimensional random operators which makes this idea rigorous. Specifically, we will work with random displacement models, which for the case of Schrödinger operators are given as follows: Let

\[ v_\omega(x) = \sum_{n \in \mathbb{Z}} g(x - n - d_n(\omega)), \tag{1.1} \]

where $g \in L^2$ is compactly supported and $\operatorname{supp}(g) \subset [-s, s]$, $s > 0$. The displacements $\{d_n(\omega)\}$ are independent, identically distributed random variables which take values in $[-d_{\max}, d_{\max}]$. We assume that their distribution has a non-trivial absolutely continuous component, and that $d_{\max} + s < 1/2$ to avoid overlap between adjacent sites in (1.1). The random operator

\[ H^S_\omega = -\frac{d^2}{dx^2} + v_\omega \tag{1.2} \]


(“S” for Schrödinger) is essentially self-adjoint on $C_0^\infty(\mathbb{R})$ for every $\omega$ and the following result was recently proven by Buschmann and Stolz:

Theorem 1.1 ([4]). If $g \neq 0$, then $H^S_\omega$ almost surely has dense pure point spectrum with exponentially decaying eigenfunctions.

The main technical difficulty in proving Theorem 1.1 is that $H^S_\omega$ does not depend monotonically (in form sense) on the random parameters $d_n$. This problem was overcome in [4] by combining results from inverse spectral theory with the method of two parameter spectral averaging. Here, while also using two parameter averaging, we provide an alternate proof of Theorem 1.1 wherein the inverse spectral results used in [4] are replaced by scattering theoretic ideas which are more directly motivated by physics.

It is possible to apply our strategy of proof to other models. As an example, we prove the following new result: Let

\[ \frac{1}{a_\omega(x)} = 1 + \sum_{n \in \mathbb{Z}} f(x - n - d_n(\omega)), \tag{1.3} \]

where $\operatorname{supp}(f) \subset [-s, s]$, $f \in L^1$, $1 + f > 0$, and $\{d_n(\omega)\}$ are i.i.d. random variables as above, in particular having an a.c. component and $d_{\max} + s < 1/2$. Sturm–Liouville theory shows that the operator

\[ H^W_\omega = -\frac{d}{dx}\, a_\omega \frac{d}{dx} \tag{1.4} \]

(“W” for wave equation) with

\[ D(H^W_\omega) = \left\{ u \in L^2(\mathbb{R}) : u,\ a_\omega u' \text{ are absolutely continuous with } -(a_\omega u')' \in L^2(\mathbb{R}) \right\} \]

is self-adjoint. Alternatively, $H^W_\omega$ may be defined as the self-adjoint operator corresponding to the non-negative form $\int a_\omega u' \overline{v'}\,dx$, $u, v \in C_0^\infty(\mathbb{R})$. We obtain a result similar to Theorem 1.1:

Theorem 1.2. If $f \neq 0$, then $H^W_\omega$ almost surely has dense pure point spectrum with exponentially decaying eigenfunctions.

We prove Theorems 1.1 and 1.2 by showing:

If the single site equation $-u'' + gu = k^2 u$ is not reflectionless, then $H^S_\omega$ almost surely has dense pure point spectrum with exponentially decaying eigenfunctions, (1.5)

and

If the single site equation $-\left(\frac{u'}{1+f}\right)' = k^2 u$ is not reflectionless, then $H^W_\omega$ almost surely has dense pure point spectrum with exponentially decaying eigenfunctions. (1.5')

Here we call

\[ -u'' + gu = k^2 u, \tag{1.6} \]

respectively

\[ -\left(\frac{u'}{1+f}\right)' = k^2 u, \tag{1.7} \]

not reflectionless if its reflection coefficient does not vanish simultaneously for all $k > 0$, for details see Sect. 2. By showing (1.5) and (1.5'), we give a rigorous justification for the physical heuristics described above, where non-trivial scattering at a single site is interpreted as non-vanishing of the reflection coefficient. Theorem 1.1 follows immediately from (1.5) and the well known fact that (1.6) is never reflectionless for compactly supported, non-trivial $g$, see Sect. 3, in particular Corollary 3.2. Theorem 1.2, however, requires more work. We employ the same method to prove Theorem 1.2, but in this case we show that there are no non-trivial, compactly supported $f$'s for which (1.7) has vanishing reflection coefficient, which to the best of our knowledge is a new result:

Theorem 1.3. If $f \in L^1$ is compactly supported with $1 + f > 0$, then $-\left(\frac{u'}{1+f}\right)' = k^2 u$ is reflectionless if and only if $f = 0$.

The key observation which enables us to prove Theorem 1.3 is an asymptotic formula for the Weyl–Titchmarsh m-function of (1.7), see Proposition 4.2 in Sect. 4. Initiated by Everitt’s work [13], the m-function asymptotics of differential expressions has been intensely studied. While (1.7) can be transformed to Schrödinger-type, i.e. (1.6), if $f$ is smooth, we need a new method of proof here to cover non-smooth $f$. Asymptotic formulas as provided in Proposition 4.2 and their applications to inverse spectral theory will be further studied in [18]. Combining the asymptotic formula with Herglotz function formalism, we are able to determine that a reflectionless equation must have a smooth coefficient $f$, see Sect. 4. As mentioned above, for smooth coefficients one can reduce an equation of type (1.7) to one of form (1.6) via a Liouville–Green transformation, see Theorem 3.3. An application of Corollary 3.2 proves Theorem 1.3; hence Theorem 1.2 follows as in the Schrödinger case.
Our interest in the model HωW is motivated by a number of recent works on localization phenomena involving classical waves, e.g. [14, 15, 9, 34], or the recent review [28]. HωW is a one-dimensional version of the random operators which arise in the study of acoustic or electro-magnetic waves in disordered media. The coefficient f in (1.7) will be a step function in many applications, modeling for example the varying density and speed of sound in a two-component acoustic medium. Examples of this nature illustrate why it is not sufficient to only prove results for “smooth” cases.

We point out that “mixed” equations $-\left(\frac{u'}{1+f}\right)' + gu = k^2 u$, where $f$ and $g$ are both compactly supported and non-trivial, may very well be reflectionless, see the example in Sect. 3.

An interesting feature of the displacement models $H^S_\omega$ and $H^W_\omega$ is that even in the case of a non-reflectionless single site, they may exhibit a discrete set of energies where the (analytic) reflection coefficient vanishes, without vanishing identically. In the case of simple square well potentials, calculations leading to resonance energies of this kind are found in most elementary quantum mechanics texts. At such energies, the unreflected waves yield “extended states”. This fact does not hinder the proof of localization though, since a discrete exceptional set cannot support continuous spectrum. Discrete exceptional energies with zero reflection also appear in the so-called random dimer model, where localization, in fact dynamical localization away from exceptional energies, has recently been shown by de Bièvre and Germinet [2].

We now outline the basic ideas in our proof of (1.5) and (1.5') which, after some preparations in Sect. 5, is completed in Sect. 6. Compact support of the single site allows the use of Kotani’s theory [24, 25], respectively its extensions by Minami [29, 30], to conclude positivity of the Lyapunov exponent for almost every energy. Given this, the other main ingredient in the proof of localization is spectral averaging, e.g. [32, 27, 37, 4]. To understand the relation between non-trivial scattering and spectral averaging, we for simplicity concentrate on the Schrödinger case (1.2) and start with the “one bump” problem: For any $a \in [-d_{\max}, d_{\max}]$ define $g_a(x) = g(x - a)$, the displaced bump with support in $[-s + a, s + a] \subset [-\frac{1}{2}, \frac{1}{2}]$. For $k > 0$, we consider the solution $u$ of $-u'' + g_a u = k^2 u$ with

\[ u(x) = \begin{cases} \sin(kx + \vartheta), & \text{for } x \le -\frac{1}{2}, \\ R \sin(kx + \vartheta + \delta), & \text{for } x \ge \frac{1}{2}. \end{cases} \]

Scattering at a single site is described by the amplitude

\[ R = R(\vartheta, k, a) = \left[ \left( \frac{u'(1/2)}{k} \right)^2 + u(1/2)^2 \right]^{1/2} \]

and the phase shift $\delta = \delta(\vartheta, k, a)$. Two basic facts (Lemma 3.5 and Lemma 5.4 below) relate trivial reflection at the single site $g$ with stationarity of the phase shift under translation:

(1) Fix $a$ and $k$, then the reflection coefficient at $k$ is zero if and only if $R = 1$ for all $\vartheta$.
(2) Fix $k$ and $\vartheta$, then $\partial_a \delta = k \left( \frac{1}{R^2} - 1 \right)$.

Thus, at an energy $E = k^2$ with non-zero reflection coefficient, we have $R \neq 1$ and therefore $\partial_a \delta \neq 0$ for at least one $\vartheta$. Analyticity implies $\partial_a \delta \neq 0$ for all but a discrete set $N$ of exceptional $\vartheta$'s (Lemma 5.5). To handle the $\vartheta \in N$ for which the phase shift over one bump is stationary, we introduce a second, adjacent bump, $g_{1+b}(x) = g(x - 1 - b)$ parametrized by $b \in [-d_{\max}, d_{\max}]$, for which $\partial_b \delta \neq 0$ for all but a discrete set of $b$'s, where now $\delta$ is the total phase shift for both bumps (Lemma 5.6). Once nonstationarity of the phase shift in the random variables $a$, $b$ has been established, one can use the Carmona formalism [5, 6] to show that averaging of spectral measures over $a$, $b$ yields an absolutely continuous measure. We outline this in Sect. 6, referring for details to [4]. Spectral averaging for the negative energy spectrum, which only exists in the Schrödinger case, uses analytic continuation of reflection and transmission coefficients to the imaginary axis.

Results on localization for non-monotonic random operators of type (1.2) or (1.4) are not restricted to the displacement model studied here. Anderson models with indefinite single site potentials as well as Poisson models were studied in [38], respectively [4]. A study of more general models of type (1.4) will be contained in [33].
Finally, we mention that reflection and transmission coefficients have been used successfully in the solution of other spectral problems involving multiscattering in one dimension: Kirsch, Kotani, and Simon [22] use non-vanishing of the reflection coefficient to prove absence of absolutely continuous spectrum for random Schrödinger operators with non-compactly supported single site potentials. Molchanov [31] used them in the


spectral analysis of sparse potentials. A scattering theory approach to one-dimensional Anderson models is provided in [23], where the key ingredient is the Lifshitz-Krein shift function, and transmission and reflection coefficients appear via the S-matrix.

2. Transmission and Reflection Coefficients

The most basic objects in one-dimensional scattering theory are the transmission and reflection coefficients. We start by defining them in the context of a general, “mixed” single site equation

\[ -(pu')' + qu = k^2 u, \tag{2.1} \]

where $p : \mathbb{R} \to (0, \infty)$ and $q : \mathbb{R} \to \mathbb{R}$. For notational simplicity, we take $\operatorname{supp}(p - 1)$ and $\operatorname{supp}(q) \subset [0, D]$. (Results obtained for (2.1) apply to the single site equations (1.6) and (1.7) by translation.) Assuming $\frac{1}{p} \in L^1_{loc}(\mathbb{R})$ and $q \in L^1(\mathbb{R})$, the differential expression

\[ \tau = -\frac{d}{dx}\, p\, \frac{d}{dx} + q \]

is a compactly supported perturbation of the Laplacian $-d^2/dx^2$. $\tau$ defines a self-adjoint operator $L$ on $L^2(\mathbb{R})$, see [8, 39], with

\[ D(L) = \left\{ f \in L^2(\mathbb{R}) : f \text{ and } pf' \text{ are absolutely continuous and } \tau f \in L^2(\mathbb{R}) \right\}. \]

As $p$ is not necessarily smooth, e.g. step functions are allowed, we note that smoothness of $pf'$ is not equivalent to smoothness of $f'$.

The Jost solution $u$ of (2.1) for $k \in \mathbb{C} \setminus \{0\}$ is the solution satisfying

\[ u(x, k) = \begin{cases} e^{ikx} & \text{for } x \ge D, \\ a(k) e^{ikx} + b(k) e^{-ikx} & \text{for } x \le 0. \end{cases} \tag{2.2} \]

By the linear independence of $e^{ikx}$ and $e^{-ikx}$, the coefficients $a(k)$ and $b(k)$ are well defined. The Wronskian $W[u, v](x) := (pu')(x) v(x) - u(x) (pv')(x)$ for any two solutions $u$ and $v$ of (2.1) is constant. Applying this to $u(x, k)$ and $u(x, -k)$, we find

\[ a(k) a(-k) - b(k) b(-k) = 1, \quad \text{for any } k \in \mathbb{C} \setminus \{0\}. \tag{2.3} \]
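As an illustration of (2.2) and (2.3), the following minimal sketch computes $a(k)$ and $b(k)$ for the simplest case $p = 1$ and a square barrier $q = V$ on $[0, D]$. The example potential, the function names and the parameter values are assumptions made here for illustration and are not taken from the paper; the Jost solution is propagated exactly from $x = D$ back to $x = 0$ and then decomposed into $e^{\pm ikx}$.

```python
import cmath

def jost_coefficients(k, V, D):
    """Return (a(k), b(k)) for -u'' + q u = k^2 u with q = V on [0, D], q = 0 elsewhere.

    Illustrative assumption: p = 1 and a square barrier; k and sqrt(k^2 - V) must be nonzero.
    """
    kappa = cmath.sqrt(k * k - V)        # wave number inside the barrier
    uD = cmath.exp(1j * k * D)           # Jost normalisation (2.2): u = e^{ikx} for x >= D
    duD = 1j * k * uD
    c, s = cmath.cos(kappa * D), cmath.sin(kappa * D)
    u0 = uD * c - duD * s / kappa        # exact propagation from x = D back to x = 0
    du0 = uD * kappa * s + duD * c
    a = 0.5 * (u0 + du0 / (1j * k))      # u = a e^{ikx} + b e^{-ikx} for x <= 0
    b = 0.5 * (u0 - du0 / (1j * k))
    return a, b

def wronskian_identity(k, V, D):
    """Left-hand side of (2.3): a(k) a(-k) - b(k) b(-k)."""
    a_plus, b_plus = jost_coefficients(k, V, D)
    a_minus, b_minus = jost_coefficients(-k, V, D)
    return a_plus * a_minus - b_plus * b_minus

if __name__ == "__main__":
    print(wronskian_identity(1.3, 2.0, 1.0))          # close to 1 for real k
    print(wronskian_identity(0.7 + 0.4j, -1.5, 2.0))  # close to 1 for complex k as well
```

For $V = 0$ the sketch returns $a = 1$ and $b = 0$, as it should for the free equation.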

For real $k$, one has $u(x, -k) = \overline{u(x, k)}$, and thus $a(-k) = \overline{a(k)}$, $b(-k) = \overline{b(k)}$, and $|a(k)|^2 = 1 + |b(k)|^2$. Hence, in this case, $a(k) \neq 0$, and we may define $r(k) := \frac{b(k)}{a(k)}$ and $t(k) := \frac{1}{a(k)}$, the reflection and transmission coefficients. While the latter are closer to physics, describing the reflected and transmitted parts of an incoming plane wave from $-\infty$, we will work mainly with the mathematically more convenient $a(k)$ and $b(k)$. If $k$ is purely imaginary, then $u(x, k)$ is a real solution of the real equation (2.1), in particular $a(k)$ and $b(k)$ are real. Using (2.2) with $k = i\alpha$, $\alpha > 0$, we also see that $L$ has the negative eigenvalue $-\alpha^2$ if and only if $a(i\alpha) = 0$.

One may represent $a(k)$ and $b(k)$ as solutions of integral equations. This is quite standard in the Schrödinger case. We provide some details to include the less well known case $p \neq 1$. Variation of parameters (or direct verification) shows that $u(x, k) = a(k, x) e^{ikx} + b(k, x) e^{-ikx}$, where

\[ a(k, x) = 1 - \frac{1}{2ik} \int_x^D e^{-iky} \left[ q(y) u(y, k) - ik (p(y) - 1) u'(y, k) \right] dy, \tag{2.4} \]

\[ b(k, x) = \frac{1}{2ik} \int_x^D e^{iky} \left[ q(y) u(y, k) + ik (p(y) - 1) u'(y, k) \right] dy. \tag{2.5} \]

Due to the supports of $q$ and $p - 1$, the above are trivial for $x \ge D$ and constant for $x \le 0$, specifically $a(k) = a(k, 0)$ and $b(k) = b(k, 0)$. By differentiation, via (2.4) and (2.5), we arrive at $pu'(x, k) = ik \left[ a(k, x) e^{ikx} - b(k, x) e^{-ikx} \right]$. We can now rewrite (2.4) and (2.5) as a closed system in $a$ and $b$:

\[ a(k, x) = 1 - \frac{1}{2ik} \int_x^D \left\{ q(y) \left[ a(k, y) + e^{-2iky} b(k, y) \right] + k^2 \left( 1 - \frac{1}{p(y)} \right) \left[ a(k, y) - e^{-2iky} b(k, y) \right] \right\} dy, \]

and

\[ b(k, x) = -\frac{1}{2ik} \int_x^D \left\{ q(y) \left[ -e^{2iky} a(k, y) - b(k, y) \right] + k^2 \left( 1 - \frac{1}{p(y)} \right) \left[ e^{2iky} a(k, y) - b(k, y) \right] \right\} dy. \]

More concisely, we have

\[ \gamma(k, x) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} - \frac{1}{2ik} \int_x^D \left[ q(y) A(k, y) + k^2 \left( 1 - \frac{1}{p(y)} \right) \tilde{A}(k, y) \right] \gamma(k, y)\, dy, \tag{2.6} \]

where

\[ A(k, y) = \begin{pmatrix} 1 & e^{-2iky} \\ -e^{2iky} & -1 \end{pmatrix}, \quad \tilde{A}(k, y) = \begin{pmatrix} 1 & -e^{-2iky} \\ e^{2iky} & -1 \end{pmatrix}, \quad \text{and} \quad \gamma(k, x) = \begin{pmatrix} a(k, x) \\ b(k, x) \end{pmatrix}. \]

For fixed $x$, iteration of the Volterra-type integral equation (2.6) yields a series representation for $\gamma(k, x)$ which converges locally uniformly for $k \in \mathbb{C} \setminus \{0\}$. This shows that $a(k)$ and $b(k)$ are analytic in $\mathbb{C} \setminus \{0\}$. In the case of $q = 0$, the singularity of (2.6) at $k = 0$ vanishes, thus both $a(k)$ and $b(k)$ are entire.

Finally, we note that information about the transmission and reflection coefficients of (2.1) can be used to understand the periodic equation

\[ -(\tilde{p} u')' + \tilde{q} u = E u, \tag{2.7} \]

where $\tilde{p}$ and $\tilde{q}$ are the $D$-periodic extensions of $p|_{[0,D]}$ and $q|_{[0,D]}$ to $\mathbb{R}$. The periodic self-adjoint operator defined by (2.7) on $L^2(\mathbb{R})$ will be denoted by $L_{per}$. Its spectral properties are determined by the discriminant, $d(E)$, the trace of the transfer matrix of (2.7) from $0$ to $D$, see [11]. In particular, the spectral bands of $L_{per}$ are found from $|d(E)| \le 2$.


As the transfer matrix has determinant 1, its inverse, i.e. the transfer matrix from $D$ to $0$, has the same trace. This observation simplifies the following calculation: Consider solutions $u_1$ and $u_2$ of (2.1) with $u_1(D) = (pu_2')(D) = 1$ and $u_2(D) = (pu_1')(D) = 0$. We may write these in terms of the Jost solutions,

\[ u_1(x, k) = \frac{1}{2} e^{-ikD} u(x, k) + \frac{1}{2} e^{ikD} u(x, -k) \]

and

\[ u_2(x, k) = \frac{1}{2ik} e^{-ikD} u(x, k) - \frac{1}{2ik} e^{ikD} u(x, -k). \]

For any $k \in \mathbb{C} \setminus \{0\}$, the transfer matrix of (2.7) at $E = k^2$ is given by

\[ \begin{pmatrix} u_1(0) & u_2(0) \\ (pu_1')(0) & (pu_2')(0) \end{pmatrix}. \]

Thus the discriminant at $k^2$ is

\[ d(k^2) = u_1(0) + (pu_2')(0) = a(k) e^{-ikD} + a(-k) e^{ikD}, \tag{2.8} \]

a relation which has been used frequently (e.g. [21, 22]) in the case $p = 1$. Note that the use of purely imaginary $k$ in (2.8) allows one to determine the negative spectral bands of $L_{per}$.

3. Equations with Zero Reflection

In this section, we prove a variety of results characterizing equations for which the reflection coefficient is either identically zero or zero at some fixed energy. We say that $-(pu')' + qu = k^2 u$ is reflectionless if $r(k) = r(k, p, q)$ vanishes for all real $k > 0$. This is equivalent to requiring that $b(k) = 0$ for all $k > 0$ and thus, by analyticity, for all $k \in \mathbb{C} \setminus \{0\}$. We begin by noting that “reflectionless” implies “no gaps” for the spectrum of the corresponding periodic operator, a fact well known in the case of $p = 1$. This observation leads to another well-known result for Schrödinger equations: the only reflectionless, compactly supported $q$'s are trivial. We then demonstrate that equations with smooth $p$'s can be transformed into Schrödinger type equations, in which case a similar “reflectionless implies trivial” result is true. Lastly, we relate zero reflection at a fixed energy to the amplitude of solutions.

Lemma 3.1. If $-(pu')' + qu = k^2 u$ is reflectionless, then $\sigma(L_{per}) = [0, \infty)$.

Proof. To prove this we show that

\[ |d(k^2)| \le 2 \quad \text{for all } k > 0, \text{ and} \tag{3.1} \]

\[ |d(-\alpha^2)| \ge 2 \quad \text{for all } \alpha > 0. \tag{3.2} \]

Equation (3.1) implies $[0, \infty) \subset \sigma(L_{per})$ and (3.2) implies $|d(-\alpha^2)| > 2$ for all $\alpha > 0$ (bands do not degenerate to points, see [11]), and thus $\sigma(L_{per}) \subset [0, \infty)$. Both (3.1) and (3.2) follow readily from (2.8): In the case that (2.1) is reflectionless, (2.3) becomes $a(k) a(-k) = 1$. If $k > 0$, then $a(-k) = \overline{a(k)}$,

\[ d(k^2) = 2 \operatorname{Re}\left[ a(k) e^{-ikD} \right], \tag{3.3} \]

and thus $|d(k^2)| \le 2$. On the other hand, if $k = i\alpha$, $\alpha > 0$, we have that $a(i\alpha)$ is real,

\[ d(-\alpha^2) = a(i\alpha) e^{\alpha D} + \frac{1}{a(i\alpha) e^{\alpha D}}, \]

and so clearly $|d(-\alpha^2)| \ge 2$.
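The discriminant formula (2.8), on which the proof above rests, can be checked numerically in a simple case. The sketch below is an illustration under assumptions made here, not part of the paper: it takes $p = 1$, a barrier $q = V$ on $[0, L]$ with $L \le D$, and hypothetical function names. It compares the right-hand side of (2.8) with the trace of the transfer matrix over one period, obtained by multiplying the two segment matrices directly.

```python
import cmath

def segment_matrix(kappa, length):
    """Transfer matrix for (u, u') across a segment on which -u'' = kappa^2 u."""
    c, s = cmath.cos(kappa * length), cmath.sin(kappa * length)
    return ((c, s / kappa), (-kappa * s, c))

def a_coefficient(k, V, L):
    """a(k) for p = 1 and q = V on [0, L], q = 0 elsewhere (illustrative choice)."""
    kappa = cmath.sqrt(k * k - V)
    back = segment_matrix(kappa, -L)        # carries (u, u') from x = L back to x = 0
    uL = cmath.exp(1j * k * L)              # Jost solution: u = e^{ikx} for x >= L
    duL = 1j * k * uL
    u0 = back[0][0] * uL + back[0][1] * duL
    du0 = back[1][0] * uL + back[1][1] * duL
    return 0.5 * (u0 + du0 / (1j * k))

def discriminant_via_a(k, V, L, D):
    """Right-hand side of (2.8)."""
    return (a_coefficient(k, V, L) * cmath.exp(-1j * k * D)
            + a_coefficient(-k, V, L) * cmath.exp(1j * k * D))

def discriminant_via_trace(k, V, L, D):
    """Trace of the transfer matrix over one period [0, D], computed directly."""
    kappa = cmath.sqrt(k * k - V)
    m1 = segment_matrix(kappa, L)           # across the barrier [0, L]
    m2 = segment_matrix(k, D - L)           # across the free part [L, D]
    return (m2[0][0] * m1[0][0] + m2[0][1] * m1[1][0]
            + m2[1][0] * m1[0][1] + m2[1][1] * m1[1][1])

if __name__ == "__main__":
    for k in (1.1, 0.8j):                   # a positive energy and a negative energy
        print(k, discriminant_via_a(k, -2.0, 0.6, 1.0),
              discriminant_via_trace(k, -2.0, 0.6, 1.0))
```

Purely imaginary $k$ is included in the usage example because, as noted after (2.8), this is how negative spectral bands of $L_{per}$ are detected.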

Let us now consider the two special cases of (2.1), where either (i) $p = 1$ or (ii) $q = 0$. In case (i), one has

Corollary 3.2. If $-u'' + qu = k^2 u$ is reflectionless and $q$ is compactly supported, then $q = 0$.

This is well known: there are no compactly supported solitons. It follows for example from Lemma 3.1 and the result of Borg [3], extended by Hochstadt [20] to the case $q \in L^1_{loc}$, that if the periodic operator $-\frac{d^2}{dx^2} + \tilde{q}$ has spectrum $[0, \infty)$, then $\tilde{q} = 0$.

In case (ii), if $p$ is sufficiently smooth, one can use a Liouville–Green transform to reduce $-(pu')' = k^2 u$ to a Schrödinger equation and prove that reflectionless implies $p = 1$. This will follow from a more general observation:

Theorem 3.3. Let $p > 0$ and $q$ be real-valued functions such that $\operatorname{supp}(p - 1)$ and $\operatorname{supp}(q) \subset [0, D]$ with $\frac{1}{p} \in L^1_{loc}(\mathbb{R})$ and $q \in L^1(\mathbb{R})$. Suppose in addition that both $p$ and $p'$ are absolutely continuous. Then $-(pu')' + qu = k^2 u$ is reflectionless if and only if

\[ q = \frac{1}{16} \frac{(p')^2}{p} - \frac{1}{4} p''. \]

Proof. Let $k > 0$ and $u$ be any solution of (2.1). This equation has a Liouville–Green transform, see [12], which is defined by setting $t(x) = \int_0^x \left( \frac{1}{p(s)} \right)^{1/2} ds$, $x(t)$ the inverse of $t(x)$, and

\[ z(t) := p(x(t))^{1/4}\, u(x(t)). \tag{3.4} \]

A short calculation shows that $-z'' + Qz = k^2 z$, where $Q(t) = Q_1(t) + Q_2(t)$ with

\[ Q_1(t) = p(x(t))^{-1/4} \frac{d^2}{dt^2}\, p(x(t))^{1/4} \quad \text{and} \quad Q_2(t) = q(x(t)). \]

Under the above transformation we have $t = x$ for $x \le 0$ and $t = x + P - D$ for $x \ge D$, where $P := \int_0^D \left( \frac{1}{p(s)} \right)^{1/2} ds$. Choosing the particular solution $u(x) = u(x, k)$ from (2.2), it follows that

\[ z(t) = \begin{cases} u(x(t), k) = a(k) e^{ikt} + b(k) e^{-ikt} & \text{for } t \le 0, \\ u(x(t), k) = e^{ik(t + D - P)} & \text{for } t \ge P. \end{cases} \]

Clearly then, $e^{-ik(D - P)} z(t)$ is the Jost solution of $-z'' + Qz = k^2 z$, and the corresponding reflection coefficient $r^*(k)$ is given by

\[ r^*(k) = \frac{e^{-ik(D - P)}\, b(k)}{e^{-ik(D - P)}\, a(k)} = \frac{b(k)}{a(k)} = r(k). \]

We have shown that (2.1) is reflectionless if and only if $-z'' + Qz = k^2 z$ is reflectionless as well. Since $Q$ is compactly supported, we conclude from Corollary 3.2 that $Q = 0$, i.e. almost everywhere

\[ q(x) = -p(x)^{-1/4} \frac{d^2}{dt^2}\, p(x)^{1/4} = \frac{1}{16} \frac{p'(x)^2}{p(x)} - \frac{1}{4} p''(x). \tag{3.5} \]

The following corollary, again with the choice $p(t) = 1/(1 + f(t - s))$, is a reformulation of Theorem 1.3 in the smooth case.

Corollary 3.4. If $-(pu')' = k^2 u$ is reflectionless, $p - 1$ is compactly supported, and $p$ and $p'$ are absolutely continuous, then $p = 1$.

Proof. The right-hand side of Eq. (3.5) may be rewritten

\[ \frac{1}{16} \frac{p'(x)^2}{p(x)} - \frac{1}{4} p''(x) = -\frac{1}{4}\, p(x)^{1/4} \left( p^{-1/4}(x)\, p'(x) \right)'. \]

If $q = 0$, then integration, using the compact support of $p - 1$, yields $p = 1$.

Example. We have shown that there are no compactly supported, non-trivial $q$'s for which $-u'' + qu = k^2 u$ is reflectionless. Accordingly, in Sect. 4 we will improve on Corollary 3.4, i.e. remove smoothness assumptions, and show that there are no non-trivial $p$'s for which $p - 1$ is compactly supported and $-(pu')' = k^2 u$ is reflectionless. There are, however, compactly supported, non-trivial pairs $(p, q)$ for which (2.1) is reflectionless. Theorem 3.3, in fact, illustrates that a given smooth $p$ determines a $q$ for which (2.1) is reflectionless. For example, if $p(x) := 2 - \cos(x)$ on $[0, 2\pi]$, then, since $p'(x) = \sin(x)$ and $p''(x) = \cos(x)$,

\[ q(x) = \frac{1}{16} \cdot \frac{\sin^2(x)}{2 - \cos(x)} - \frac{1}{4} \cos(x) \quad \text{on } [0, 2\pi] \]

defines a non-trivial, reflectionless equation $-(pu')' + qu = k^2 u$. The corresponding $2\pi$-periodic operator has spectrum $[0, \infty)$, which either follows from Lemma 3.1 or, more directly, from the fact that it is equivalent to the negative Laplacian under the unitary Liouville–Green transform (3.4).

Finally, we prove a local result on zero reflection, i.e. a characterization of $b(k) = 0$ for fixed $k$. For any solution $v$ of

\[ -(pv')' + qv = Ev, \tag{3.6} \]

where $E > 0$, we consider the square of the modified Prüfer amplitude,

\[ F_v(x, E) := \frac{1}{E} (pv')^2(x) + v^2(x), \tag{3.7} \]

for more on Prüfer variables see Sect. 5. One easily calculates that

\[ \partial_x F_v = 2 v\, pv' \left[ \frac{q}{E} - \left( 1 - \frac{1}{p} \right) \right], \]

which implies that $F$ remains constant (for all solutions of (3.6)) wherever $q = 0$ and $p = 1$. The lemma below shows that zero reflection at a fixed $k = \sqrt{E}$ is equivalent to the magnitude of the Prüfer amplitude being unchanged over $[0, D]$.
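The example $p(x) = 2 - \cos(x)$ from this section can be checked numerically. The sketch below is an illustration written for this purpose, not the authors' computation: it assumes $q$ is given by the formula of Theorem 3.3, rewrites (2.1) as a first order system in $(u, w)$ with $w = pu'$, and integrates the Jost solution backwards from $x = 2\pi$ to $x = 0$ with a classical Runge–Kutta scheme. The function names and the step count are arbitrary choices.

```python
import cmath
import math

def p(x):
    return 2.0 - math.cos(x)

def q(x):
    # q = (p')^2 / (16 p) - p'' / 4 with p' = sin(x), p'' = cos(x), as in Theorem 3.3
    return math.sin(x) ** 2 / (16.0 * p(x)) - math.cos(x) / 4.0

def jost_coefficients(k, steps=4000):
    """Approximate (a(k), b(k)) for -(p u')' + q u = k^2 u on [0, 2*pi] by RK4."""
    D = 2.0 * math.pi
    h = -D / steps                       # negative step: integrate from x = D down to x = 0

    def rhs(x, u, w):                    # first order system with w = p u'
        return w / p(x), (q(x) - k * k) * u

    x = D
    u = cmath.exp(1j * k * D)            # Jost normalisation (2.2) at x = D
    w = 1j * k * u                       # p(D) = 1, so w = u' there
    for _ in range(steps):
        k1u, k1w = rhs(x, u, w)
        k2u, k2w = rhs(x + h / 2, u + h / 2 * k1u, w + h / 2 * k1w)
        k3u, k3w = rhs(x + h / 2, u + h / 2 * k2u, w + h / 2 * k2w)
        k4u, k4w = rhs(x + h, u + h * k3u, w + h * k3w)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        w += h / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        x += h
    a = 0.5 * (u + w / (1j * k))         # at x = 0: u = a + b and p u' = ik (a - b)
    b = 0.5 * (u - w / (1j * k))
    return a, b

if __name__ == "__main__":
    for k in (0.5, 1.5, 3.0):
        a, b = jost_coefficients(k)
        print(k, abs(a), abs(b))
```

Up to discretisation error one expects $|b(k)|$ to be negligible and $|a(k)|$ to equal 1 for every $k > 0$, in line with the statement that this pair $(p, q)$ is reflectionless.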


Lemma 3.5. $b(k) = 0$ for $k = \sqrt{E} > 0$ holds if and only if $F_v(0, E) = F_v(D, E)$ for all real solutions $v$ of (3.6).

Proof. A given real valued solution of (3.6), normalized such that $F_v(0, E) = 1$, may be parametrized with $\theta \in [0, \pi)$ by taking $v(0) = \sin(\theta)$ and $pv'(0) = k \cos(\theta)$. Using the Jost solutions we may write $v = A u(\cdot, k) + B u(\cdot, -k)$. Applying the boundary conditions at zero, we find that

\[ B = \overline{A} = \frac{i}{2} \left[ a(k) e^{-i\theta} + b(k) e^{i\theta} \right]. \tag{3.8} \]

A straightforward calculation shows that $F_v(0, E) = F_v(D, E)$ for all real solutions of (3.6) if and only if $1 = 4|B|^2$, for all $B = B(\theta)$ from (3.8). Now suppose $b(k) = 0$, then $|a(k)|^2 = 1$ and

\[ 4|B|^2 = 4 \cdot \frac{i}{2} a(k) e^{-i\theta} \cdot \frac{-i}{2} \overline{a(k)} e^{i\theta} = 1, \quad \text{for all } B(\theta). \]

If, on the other hand, $1 = 4|B|^2$ for all $B(\theta)$, then for $\theta = 0$, (3.8) shows that $B = \frac{i}{2} [a(k) + b(k)]$. Direct substitution shows

\[ 1 = |a(k)|^2 + a(k) \overline{b(k)} + b(k) \overline{a(k)} + |b(k)|^2. \]

A similar argument for $\theta = \frac{\pi}{2}$ gives

\[ 1 = |a(k)|^2 - a(k) \overline{b(k)} - b(k) \overline{a(k)} + |b(k)|^2. \]

Addition yields $|b(k)|^2 = 0$, where $|a(k)|^2 = 1 + |b(k)|^2$ was used.
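Lemma 3.5 can be illustrated on the square barrier of elementary quantum mechanics mentioned in the Introduction. The following sketch is an assumption-laden illustration, not taken from the paper: it takes $p = 1$, $q = V$ on $[0, D]$ and $E > V$, propagates a real solution with $F_v(0, E) = 1$ across the barrier, and returns $F_v(D, E)$. At the resonance energies $E = V + (n\pi/D)^2$ the amplitude is unchanged for every initial angle $\theta$, while at other energies it changes for some $\theta$.

```python
import math

def amplitude_after_barrier(E, V, D, theta):
    """F_v(D, E) for the real solution with v(0) = sin(theta), v'(0) = k cos(theta).

    Illustrative assumption: p = 1, q = V on [0, D], and E > max(V, 0).
    """
    k = math.sqrt(E)
    kappa = math.sqrt(E - V)             # wave number inside the barrier
    v0, dv0 = math.sin(theta), k * math.cos(theta)
    c, s = math.cos(kappa * D), math.sin(kappa * D)
    vD = v0 * c + dv0 * s / kappa        # exact propagation from x = 0 to x = D
    dvD = -v0 * kappa * s + dv0 * c
    return dvD ** 2 / E + vD ** 2        # modified Pruefer amplitude squared at x = D

if __name__ == "__main__":
    V, D = 2.0, 1.0
    E_res = V + (math.pi / D) ** 2       # sin(kappa * D) = 0: zero reflection
    for theta in (0.0, 0.7, 1.9):
        print(theta,
              amplitude_after_barrier(E_res, V, D, theta),
              amplitude_after_barrier(3.0, V, D, theta))
```

The first printed column of amplitudes stays at 1 for all $\theta$, the second does not, which is the dichotomy expressed by the lemma.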

4. Proof of Theorem 1.3

Let $p > 0$, $\frac{1}{p} \in L^1_{loc}(\mathbb{R})$, and $\operatorname{supp}(p - 1) \subset [0, D]$. Then Theorem 1.3 is equivalent to

Proposition 4.1. $-(pu')' = k^2 u$ is reflectionless if and only if $p = 1$.

Our proof of Proposition 4.1 uses Weyl–Titchmarsh m-functions (see e.g. [8]). For $z \in \mathbb{C} \setminus [0, \infty)$ choose $z^{1/2}$ with $\operatorname{Im}[z^{1/2}] > 0$. Let $f_\pm(x, z)$ be the solutions of $-(pu')' = zu$ which are equal to $e^{\pm i z^{1/2} x}$ near $\pm\infty$. Then for each $x \in \mathbb{R}$ the m-functions of $-(pu')' = zu$ on $(x, \infty)$ and $(-\infty, x)$ can be defined as

\[ m_\pm(z, x) = \frac{(p f_\pm')(x, z)}{f_\pm(x, z)}. \tag{4.1} \]

For fixed $x$, $m_\pm(z, x)$ is analytic in $\mathbb{C} \setminus [0, \infty)$, where the negative half-axis is included due to positivity of the operator $-\frac{d}{dx}\, p\, \frac{d}{dx}$.

Proposition 4.2. If $\frac{1}{p}$ is Lebesgue-continuous at $x$, then

\[ m_+(-\alpha^2, x) = -\alpha\, p(x)^{1/2} + o(\alpha) \quad \text{for real } \alpha \to +\infty, \tag{4.2} \]

and

\[ m_-(-\alpha^2, x) = \alpha\, p(x)^{1/2} + o(\alpha) \quad \text{for real } \alpha \to +\infty. \tag{4.3} \]
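To see (4.2) at work in a non-smooth case, consider a step coefficient: $p = p_0$ on $[0, D]$ and $p = 1$ elsewhere. This is an illustrative assumption made here, not an example from the paper. For a point $x \in (0, D)$ at distance $\ell = D - x$ from the right edge, matching $f_+$ and $pf_+'$ at $D$ gives the closed form used in the sketch below, namely $m_+(-\alpha^2, x) = -\alpha \sqrt{p_0}\, \frac{T + 1/\sqrt{p_0}}{1 + T/\sqrt{p_0}}$ with $T = \tanh(\alpha \ell / \sqrt{p_0})$. This formula is derived for this note and should be read as a sanity check only.

```python
import math

def m_plus_step(alpha, p0, ell):
    """m_+(-alpha^2, x) for p = p0 on [0, D], p = 1 elsewhere, at x in (0, D) with D - x = ell.

    Closed form from matching f_+ and p f_+' at x = D (an illustrative derivation,
    not a formula quoted from the paper).
    """
    root = math.sqrt(p0)
    t = math.tanh(alpha * ell / root)
    return -alpha * root * (t + 1.0 / root) / (1.0 + t / root)

if __name__ == "__main__":
    p0, ell = 4.0, 0.5
    for alpha in (0.1, 1.0, 10.0, 1000.0):
        # the ratio should approach -sqrt(p0) = -2 as alpha grows
        print(alpha, m_plus_step(alpha, p0, ell) / alpha)
```

In this example the ratio $m_+(-\alpha^2, x)/\alpha$ approaches $-p(x)^{1/2} = -\sqrt{p_0}$ exponentially fast, which is consistent with, and much stronger than, the $o(\alpha)$ error term in (4.2).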

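Remarks. (i) Recall that $\frac{1}{p}$ is Lebesgue-continuous at $x$ if

\[ \lim_{\delta \downarrow 0} \frac{1}{\delta} \int_{x - \delta}^{x + \delta} \left| \frac{1}{p(t)} - \frac{1}{p(x)} \right| dt = 0 \]

and that $L^1_{loc}$-functions are a.e. Lebesgue-continuous.

(ii) For smooth $p$ (i.e. $p$ and $p'$ absolutely continuous) and $\alpha \to \infty$ in the cone $-\frac{\pi}{2} + \varepsilon < \arg(\alpha) < -\varepsilon$ (i.e. $-\alpha^2$ in the upper half-plane), (4.2) and (4.3) have been established in [13] by using a Liouville–Green transformation to reduce the problem to a result on m-function asymptotics for Schrödinger operators. In fact, due to smoothness, [13] gets $O(1)$ instead of $o(\alpha)$.

Proof of Proposition 4.2. We only prove (4.2), the proof of (4.3) being similar. To further simplify notation, we only treat the point $x = 0$, assuming that $\frac{1}{p}$ is Lebesgue-continuous at $0$, i.e. in particular,

\[ \varphi(\delta) := \frac{1}{\delta} \int_0^\delta \left| \frac{1}{p(t)} - \frac{1}{p(0)} \right| dt \to 0 \quad \text{as } \delta \downarrow 0. \tag{4.4} \]

By (4.1) we need to show that

\[ \lim_{\alpha \to \infty} \frac{1}{\alpha} \frac{(p f_+')(0, -\alpha^2)}{f_+(0, -\alpha^2)} = -p(0)^{1/2}. \tag{4.5} \]

Choose a function $\mu : [0, \infty) \to [0, \infty)$ satisfying:

\[ \mu(\alpha) \to \infty \quad \text{as } \alpha \to \infty, \tag{4.6} \]

and

\[ \frac{\mu(\alpha)}{\alpha} \to 0 \quad \text{as } \alpha \to \infty. \tag{4.7} \]

Additional assumptions on $\mu$ will be posed later, see Lemma 4.5 below. Let $p_\alpha(y) := p(\frac{y}{\alpha})$ and

\[ \tilde{p}_\alpha(y) := \begin{cases} p_\alpha(y), & \text{if } y \ge \mu(\alpha), \\ p(0), & \text{if } 0 \le y < \mu(\alpha). \end{cases} \]

Also, let $w_\alpha$ be the solution of $-(p_\alpha w_\alpha')' = -w_\alpha$ with $w_\alpha(\alpha) = 1$, $(p_\alpha w_\alpha')(\alpha) = -1$ and $\tilde{w}_\alpha$ the solution of the same initial value problem with $p_\alpha$ replaced by $\tilde{p}_\alpha$. In particular, one has $w_\alpha(y) = f_+(\frac{y}{\alpha}, -\alpha^2)$ and therefore the proof of (4.5) is equivalent to showing that

\[ \lim_{\alpha \to \infty} \frac{(p_\alpha w_\alpha')(0)}{w_\alpha(0)} = -p(0)^{1/2}. \tag{4.8} \]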

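We introduce the Prüfer phases
\[ \theta_\alpha(y) := \mathrm{arccot}\, \frac{(p_\alpha w_\alpha')(y)}{w_\alpha(y)} \quad \text{and} \quad \tilde\theta_\alpha(y) := \mathrm{arccot}\, \frac{(\tilde p_\alpha \tilde w_\alpha')(y)}{\tilde w_\alpha(y)} , \]
where the branch of the arccot is determined by requiring $\theta_\alpha(\alpha) = \tilde\theta_\alpha(\alpha) = 3\pi/4$ and continuity of $\theta_\alpha$ and $\tilde\theta_\alpha$. They satisfy the first order differential equations
\[ \theta_\alpha' = \frac{1}{p_\alpha} \cos^2\theta_\alpha - \sin^2\theta_\alpha \tag{4.9} \]
and
\[ \tilde\theta_\alpha' = \frac{1}{\tilde p_\alpha} \cos^2\tilde\theta_\alpha - \sin^2\tilde\theta_\alpha . \tag{4.10} \]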
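The behaviour of the Prüfer phase on the region where the coefficient is frozen at the constant $p(0)$ can be explored numerically. The following is only a minimal sketch under stated assumptions: the function names, the classical Runge–Kutta integrator, the starting value $3\pi/4$ at $y = \mu$, and the choice $p(0) = 4$ are illustrative and not taken from the paper. It integrates $\theta' = \frac{1}{p_0}\cos^2\theta - \sin^2\theta$ from $y = \mu$ down to $y = 0$ and compares $\theta(0)$ with $\mathrm{arccot}(-p_0^{1/2})$, the value that appears in (4.8) once it is expressed through the phase.

```python
import math

def prufer_rhs(theta, p0):
    # Right-hand side of the phase equation with the coefficient frozen at the constant p0.
    return math.cos(theta) ** 2 / p0 - math.sin(theta) ** 2

def theta_at_zero(p0, mu, steps=4000):
    """Integrate theta' = cos^2(theta)/p0 - sin^2(theta) from y = mu down to y = 0 (classical RK4)."""
    h = -mu / steps                 # negative step: we move from y = mu towards y = 0
    theta = 3.0 * math.pi / 4.0     # illustrative starting value in (pi/2, pi)
    for _ in range(steps):
        k1 = prufer_rhs(theta, p0)
        k2 = prufer_rhs(theta + 0.5 * h * k1, p0)
        k3 = prufer_rhs(theta + 0.5 * h * k2, p0)
        k4 = prufer_rhs(theta + h * k3, p0)
        theta += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return theta

def arccot_branch(x):
    # Branch of arccot with values in (0, pi).
    return math.atan2(1.0, x)

p0 = 4.0                            # hypothetical value of p(0)
target = arccot_branch(-math.sqrt(p0))
for mu in (1.0, 5.0, 20.0):
    # The gap to arccot(-p0^{1/2}) should shrink as mu grows.
    print(mu, abs(theta_at_zero(p0, mu) - target))
```

In this constant-coefficient setting the phase stays in $(\pi/2, \pi)$ and approaches $\mathrm{arccot}(-p_0^{1/2})$ as the length $\mu$ of the frozen region grows, which is the mechanism exploited in the proof.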

Lemma 4.3. $\theta_\alpha(y) \in (\frac{\pi}{2}, \pi)$ for all $y \ge 0$.

Proof. The result is obvious for $y \ge \alpha$. Suppose that it does not hold for all $y < \alpha$ and let $y_0$ be maximal such that either $\theta_\alpha(y_0) = \pi/2$ or $\theta_\alpha(y_0) = \pi$. In the first case we will show that $\theta_\alpha$ is decreasing near $y_0$, which yields a contradiction to $\theta_\alpha(y) > \pi/2$ for $y_0 < y \le \alpha$. Note that for continuous $p$ this would immediately follow from (4.9). For general $1/p \in L^1_{loc}$ we follow an argument in [39, Proof of Theorem 13.2]: Write (4.9) as $\theta_\alpha' = (\frac{1}{p_\alpha} + 1) \cos^2\theta_\alpha - 1$ and let
\[ h(y) := \left| \int_{y_0}^{y} \left( \frac{1}{p_\alpha(t)} + 1 \right) dt \right| . \]
Then
\[ \theta_\alpha(y) - \theta_\alpha(y_0) = y_0 - y + \int_{y_0}^{y} \left( \frac{1}{p_\alpha(t)} + 1 \right) \cos^2\theta_\alpha(t)\, dt \tag{4.11} \]
and, since $\cos^2\theta_\alpha(t) = \cos^2\theta_\alpha(t) - \cos^2\theta_\alpha(y_0) \le 2|\cos\theta_\alpha(t) - \cos\theta_\alpha(y_0)| \le 2|\theta_\alpha(t) - \theta_\alpha(y_0)|$,
\[ |\theta_\alpha(y) - \theta_\alpha(y_0)| \le |y_0 - y| + 2M(y)h(y), \]
where $M(y) := \max\{|\theta_\alpha(t) - \theta_\alpha(y_0)| : t \in [y_0, y] \text{ or } [y, y_0], \text{ resp.}\}$. By monotonicity
\[ M(y) \le |y_0 - y| + 2M(y)h(y). \tag{4.12} \]
Choose $\eta > 0$ so small that $h(y) \le 1/6$ for $|y - y_0| \le \eta$. For those $y$ we get from (4.12) that $M(y) \le \frac{3}{2}|y_0 - y|$ and thus from (4.11),
\[ |\theta_\alpha(y) - \theta_\alpha(y_0) - (y_0 - y)| \le 2M(y)h(y) \le 3|y_0 - y|h(y) \le \frac{1}{2}|y_0 - y|. \]
This implies $\theta_\alpha(y) - \theta_\alpha(y_0) \le -\frac{1}{2}(y - y_0) < 0$ if $y > y_0$ and $\theta_\alpha(y) - \theta_\alpha(y_0) \ge \frac{1}{2}(y_0 - y) > 0$ if $y < y_0$, i.e. $\theta_\alpha$ is decreasing near $y_0$.

For the case $\theta_\alpha(y_0) = \pi$ a similar argument, based on writing (4.9) as $\theta_\alpha' = \frac{1}{p_\alpha} - (\frac{1}{p_\alpha} + 1) \sin^2\theta_\alpha$, shows that $\theta_\alpha$ is increasing near $y_0$, a contradiction to $\theta_\alpha(y) < \pi$ for $y_0 < y \le \alpha$. We skip the details, which in fact are a special case of what is shown in [39].

We may now rephrase (4.8) as
\[ \lim_{\alpha \to \infty} \theta_\alpha(0) = \mathrm{arccot}(-p(0)^{1/2}). \]

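Thus, Proposition 4.2 is implied by the following results:

Lemma 4.4. (a) $\lim_{\alpha \to \infty} \tilde\theta_\alpha(0) = \mathrm{arccot}(-p(0)^{1/2})$, (b) $\lim_{\alpha \to \infty} (\theta_\alpha(0) - \tilde\theta_\alpha(0)) = 0$.

Proof of (a). For $y \in [0, \mu(\alpha))$ one has $\tilde w_\alpha''(y) = p(0)^{-1} \tilde w_\alpha(y)$ and thus
\[ \tilde w_\alpha(y) = A_\alpha e^{p(0)^{-1/2} y} + B_\alpha e^{-p(0)^{-1/2} y} \]
for suitable constants $A_\alpha$ and $B_\alpha$. By Lemma 4.3, $\tilde\theta_\alpha(\mu(\alpha)) = \theta_\alpha(\mu(\alpha)) \in (\pi/2, \pi)$ (note that $\tilde p_\alpha(y) = p_\alpha(y)$ for $y \ge \mu(\alpha)$) and therefore
\[ 0 > \frac{\tilde w_\alpha'(\mu(\alpha)-)}{\tilde w_\alpha(\mu(\alpha))} = p(0)^{-1/2}\, \frac{C_\alpha e^{2p(0)^{-1/2}\mu(\alpha)} - 1}{C_\alpha e^{2p(0)^{-1/2}\mu(\alpha)} + 1} , \]
where $C_\alpha := A_\alpha / B_\alpha$. This implies $|C_\alpha| < e^{-2p(0)^{-1/2}\mu(\alpha)} \to 0$ as $\alpha \to \infty$, using (4.6). One concludes
\[ \frac{(\tilde p_\alpha \tilde w_\alpha')(0)}{\tilde w_\alpha(0)} = p(0)^{1/2}\, \frac{C_\alpha - 1}{C_\alpha + 1} \to -p(0)^{1/2} \quad \text{as } \alpha \to \infty, \]
proving (a).

Proof of (b). From (4.9) and (4.10) it follows that
\[ \theta_\alpha' - \tilde\theta_\alpha' = \left( \frac{1}{p_\alpha} - \frac{1}{\tilde p_\alpha} \right) \cos^2\theta_\alpha - \left( \frac{1}{\tilde p_\alpha} + 1 \right) (\sin\theta_\alpha - \sin\tilde\theta_\alpha)(\sin\theta_\alpha + \sin\tilde\theta_\alpha). \]
Using $\theta_\alpha(\mu(\alpha)) = \tilde\theta_\alpha(\mu(\alpha))$, $\tilde p_\alpha = p(0)$ in $[0, \mu(\alpha))$ and $|\sin\theta_\alpha - \sin\tilde\theta_\alpha| \le |\theta_\alpha - \tilde\theta_\alpha|$ in the above we see that for $0 \le x < \mu(\alpha)$,
\[ |\theta_\alpha(x) - \tilde\theta_\alpha(x)| \le \int_x^{\mu(\alpha)} \left| \frac{1}{p_\alpha(y)} - \frac{1}{p(0)} \right| dy + 2\left( \frac{1}{p(0)} + 1 \right) \int_x^{\mu(\alpha)} |\theta_\alpha(y) - \tilde\theta_\alpha(y)|\, dy. \]
This allows an application of Gronwall's lemma and hence
\[ |\theta_\alpha(x) - \tilde\theta_\alpha(x)| \le \left( \int_x^{\mu(\alpha)} \left| \frac{1}{p_\alpha(y)} - \frac{1}{p(0)} \right| dy \right) e^{2(\mu(\alpha) - x)\left(\frac{1}{p(0)} + 1\right)} . \tag{4.13} \]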

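Next, we note that by (4.4)
\[ \int_0^{\mu(\alpha)} \left| \frac{1}{p_\alpha(y)} - \frac{1}{p(0)} \right| dy = \alpha \int_0^{\mu(\alpha)/\alpha} \left| \frac{1}{p(t)} - \frac{1}{p(0)} \right| dt = \alpha \cdot \frac{\mu(\alpha)}{\alpha} \cdot \varphi\!\left( \frac{\mu(\alpha)}{\alpha} \right) = \mu(\alpha)\, \varphi\!\left( \frac{\mu(\alpha)}{\alpha} \right). \tag{4.14} \]
Inserting (4.14) into (4.13) at $x = 0$ gives
\[ |\theta_\alpha(0) - \tilde\theta_\alpha(0)| \le \mu(\alpha)\, e^{2\mu(\alpha)\left(\frac{1}{p(0)} + 1\right)} \varphi\!\left( \frac{\mu(\alpha)}{\alpha} \right). \tag{4.15} \]
Now part (b) of Lemma 4.4 follows from (4.15) under a suitable choice of $\mu$, which is possible by

Lemma 4.5. Let $\varphi : (0, \infty) \to (0, \infty)$ be such that $\lim_{\delta \downarrow 0} \varphi(\delta) = 0$ and let $C > 0$. Then there exists a function $\mu : (0, \infty) \to (0, \infty)$ with the properties (4.6), (4.7) and
\[ \mu(\alpha)\, e^{C\mu(\alpha)} \varphi\!\left( \frac{\mu(\alpha)}{\alpha} \right) \to 0 \quad \text{as } \alpha \to \infty. \tag{4.16} \]

Proof of Lemma 4.5. We establish the lemma in several elementary reduction steps:

(i) We may assume that $\varphi$ is monotone increasing and strictly positive on $(0, \infty)$ (otherwise replace $\varphi$ with $\tilde\varphi(\delta) := \max\{\sup_{0 < s \le \delta} \varphi(s), \delta\}$).

(ii) It is sufficient to prove the existence of $\mu_1$ with properties (4.6), (4.7) and
\[ e^{C\mu_1(\alpha)} \varphi\!\left( \frac{\mu_1(\alpha)}{\alpha} \right) \to 0 \quad \text{as } \alpha \to \infty \]
(replace $C$ by $2C$).

(iii) It is sufficient to prove the existence of $\mu_2$ with properties (4.6), (4.7) and
\[ \mu_2(\alpha)\, \varphi\!\left( \frac{\mu_2(\alpha)}{\alpha} \right) \to 0 \quad \text{as } \alpha \to \infty \tag{4.17} \]
(choose $\mu_1(\alpha) := \frac{1}{C} \log \mu_2(\alpha)$ for $\alpha$ sufficiently large).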

A function with properties (4.6), (4.7) and (4.17) is given by

1 µ2 (α) := min , α 1/2 , ϕ(α −1/2 ) as is easily verified.

This proves Lemma 4.5 and thereby concludes the proof of Proposition 4.2.

Continuing the proof of Proposition 4.1, we note that for $z \in \mathbb{C} \setminus [0, \infty)$ the diagonal Green's function of the operator $-\frac{d}{dx} p \frac{d}{dx}$ is given by
\[ G(z, x, x) = \frac{f_+(x, z) f_-(x, z)}{W[f_+, f_-]} , \tag{4.18} \]
where the Wronskian, $W[f_+, f_-] = f_+\, p f_-' - p f_+'\, f_-$, does not depend on $x$. It follows that
\[ \frac{1}{G(z, x, x)} = m_-(z, x) - m_+(z, x), \]
in particular, $G(z, x, x)$ is analytic in $\mathbb{C} \setminus [0, \infty)$ for fixed $x$. Also, $m_+(\cdot, x)$ and $-m_-(\cdot, x)$ are Herglotz functions, i.e. they map the open upper half-plane analytically into itself. Thus, $G(z, x, x)$ is Herglotz. Taking the standard branch of the logarithm, $\ln[G(z, x, x)]$ is also Herglotz with $\mathrm{Im}[\ln[G(z, x, x)]] = \arg(G(z, x, x)) \in (0, \pi)$ for $z$ in the open upper half-plane. Thus
\[ \xi(\lambda, x) := \frac{1}{\pi} \lim_{\varepsilon \downarrow 0} \mathrm{Im}[\ln[G(\lambda + i\varepsilon, x, x)]] \tag{4.19} \]
exists for a.e. $\lambda$ with $0 \le \xi(\lambda, x) \le 1$ and one has the logarithmic Herglotz representation (e.g. [1])
\[ \ln[G(z, x, x)] = c(x) + \int_0^\infty \left( \frac{1}{\lambda - z} - \frac{\lambda}{1 + \lambda^2} \right) \xi(\lambda, x)\, d\lambda. \tag{4.20} \]
Here it was also used that, for $\lambda < 0$, $\xi(\lambda, x) = \frac{1}{\pi} \mathrm{Im}[\ln[G(\lambda, x, x)]] = 0$ by (4.18). Note that
\[ c(x) = \mathrm{Re}[\ln[G(i, x, x)]]. \tag{4.21} \]

We note that (4.19) and (4.20) are the key ingredients in proving trace formulas for Schrödinger operators in [17]. We include a proof of the following fact, which is well-known, at least for Schrödinger operators (see e.g. [10, p. 391] for the observation that reflectionless implies (4.24), and also [16] and [26]):

Lemma 4.6. If $-(pu')' = k^2 u$ is reflectionless, then $\xi(\lambda, x) = 1/2$ for all $\lambda > 0$ and $x \in \mathbb{R}$.

Proof. For $\mathrm{Im}[z] > 0$, we have $f_+(x, z) = u(x, z^{1/2})$, where $u$ is the Jost solution introduced in Definition 2.2. Thus
\[ m_+(z, 0) = \frac{(pu')(0, z^{1/2})}{u(0, z^{1/2})} = \frac{i z^{1/2} (a(z^{1/2}) - b(z^{1/2}))}{a(z^{1/2}) + b(z^{1/2})} . \]
If $-(pu')' = k^2 u$ is reflectionless, i.e. $b(k) = 0$ for all $k \in \mathbb{C} \setminus \{0\}$, then
\[ m_+(z, 0) = i z^{1/2} . \tag{4.22} \]
Obviously, we also have that
\[ m_-(z, 0) = -i z^{1/2} . \tag{4.23} \]
We want to prove that
\[ \mathrm{Re}[G(\lambda + i0, x, x)] = 0 \quad \text{for all } \lambda > 0 \text{ and } x \in \mathbb{R}. \tag{4.24} \]
To this end, we use that
\[ G(\lambda + i\varepsilon, x, x) = \frac{\psi_+(x, \lambda + i\varepsilon)\, \psi_-(x, \lambda + i\varepsilon)}{W[\psi_+(\lambda + i\varepsilon), \psi_-(\lambda + i\varepsilon)]} , \tag{4.25} \]
where
\[ \psi_\pm(x, z) = C(x, z) + m_\pm(z, 0) S(x, z) = C(x, z) \pm i z^{1/2} S(x, z), \tag{4.26} \]
and $C(x, z)$, $S(x, z)$ are solutions of $-(pu')' = zu$ with $C(0, z) = (pS')(0, z) = 1$ and $(pC')(0, z) = S(0, z) = 0$. Equations (4.22), (4.23), and a calculation show that
\[ W[\psi_+(\lambda + i\varepsilon), \psi_-(\lambda + i\varepsilon)] = m_-(\lambda + i\varepsilon, 0) - m_+(\lambda + i\varepsilon, 0) = -2i(\lambda + i\varepsilon)^{1/2} \to -2i\lambda^{1/2} \]
as $\varepsilon \downarrow 0$. By (4.26)
\[ \lim_{\varepsilon \downarrow 0} \psi_+(x, \lambda + i\varepsilon)\, \psi_-(x, \lambda + i\varepsilon) = C^2(x, \lambda) + \lambda S^2(x, \lambda), \]

in particular, this limit is real. Inserting into (4.25) yields (4.24). This and (4.19) imply that $\xi(\lambda, x) = 1/2$.

By Lemma 4.6, (4.20) simplifies to
\[ \ln[G(z, x, x)] = c(x) + \frac{1}{2} \int_0^\infty \left( \frac{1}{\lambda - z} - \frac{\lambda}{1 + \lambda^2} \right) d\lambda = c(x) + \ln\!\left[ \frac{i}{\sqrt{z}} \right], \]
and thus
\[ G(z, x, x) = e^{c(x)}\, \frac{i}{\sqrt{z}} , \tag{4.27} \]
which extends to $z \in (-\infty, 0)$ by analyticity. On the other hand, we have by Proposition 4.2 that in Lebesgue points $x$ of $1/p$ and for $z \in (-\infty, 0)$,
\[ \frac{1}{G(z, x, x)} = m_-(z, x) - m_+(z, x) = -2i\, p^{1/2}(x) \sqrt{z} + o(\sqrt{z}). \tag{4.28} \]
Combining (4.21), (4.27), and (4.28) we find that
\[ e^{-\mathrm{Re}[\ln[G(i, x, x)]]} = 2 p^{1/2}(x). \tag{4.29} \]
This implies that $p$ is absolutely continuous since $\ln[G(i, x, x)]$ is absolutely continuous by (4.18). Also,
\[ \frac{d}{dx} \ln[G(i, x, x)] = \frac{1}{p(x)} \left( \frac{(p f_+')(x, i)}{f_+(x, i)} + \frac{(p f_-')(x, i)}{f_-(x, i)} \right), \]
whose right-hand side is absolutely continuous. By differentiating (4.29), we finally observe that $p'$ is absolutely continuous. We can now use Corollary 3.4 to conclude that $p = 1$.

In this section we only included those results on m-function asymptotics and inverse spectral and scattering theory for $-(pu')' = zu$ which are used in our applications to random operators. Many other natural questions arise in this context, e.g. relaxing the assumption of compact support for $p - 1$ or extending Proposition 4.1 to cover a larger range for $-\alpha^2$. These and related questions will be discussed in [18].
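Two steps of the argument above lend themselves to a quick numerical sanity check. The sketch below is an illustration under stated assumptions, not part of the proof: the function names, the midpoint quadrature after the substitution $\lambda = t/(1 - t)$, and the sample points $z$ are all hypothetical choices. It compares $\frac{1}{2}\int_0^\infty (\frac{1}{\lambda - z} - \frac{\lambda}{1 + \lambda^2})\, d\lambda$ with $\ln[i/\sqrt{z}]$ for $\mathrm{Im}\, z > 0$, and evaluates $\xi$ in the free case $p = 1$, where $m_\pm(z) = \pm i z^{1/2}$.

```python
import cmath
import math

def log_green_integral(z, n=20000):
    """Midpoint rule for (1/2) * int_0^inf [1/(lam - z) - lam/(1 + lam^2)] dlam, Im z > 0."""
    h = 1.0 / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * h               # substitution lam = t / (1 - t) maps (0, 1) onto (0, inf)
        lam = t / (1.0 - t)
        jac = 1.0 / (1.0 - t) ** 2      # d(lam)/dt
        total += (1.0 / (lam - z) - lam / (1.0 + lam * lam)) * jac
    return 0.5 * h * total

def closed_form(z):
    # ln[i / sqrt(z)] with principal branches; used here only for Im z > 0.
    return cmath.log(1j / cmath.sqrt(z))

def xi_free(lam, eps=1e-9):
    # Free case p = 1: m_+ = i sqrt(z), m_- = -i sqrt(z), hence G = 1 / (m_- - m_+).
    z = complex(lam, eps)
    g = 1.0 / (-2j * cmath.sqrt(z))
    return cmath.phase(g) / math.pi

for z in (1j, -1.0 + 2.0j):
    # Differences between quadrature and closed form; they should be small.
    print(z, abs(log_green_integral(z) - closed_form(z)))
print(xi_free(3.0))                      # should be close to 1/2
```

The free case is the simplest reflectionless example, so a value of $\xi$ near $1/2$ there is consistent with Lemma 4.6.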

5. Modified Prüfer Amplitudes and Phases

The key result of this section is Lemma 5.4 below, which provides the relation between the phase shift $\delta$ and the amplitude $R$ stated in the introduction. Mathematically, $\delta$ and $R$ arise as modified Prüfer variables: For any real $c$, $\vartheta$, and $k > 0$, take a solution $u \neq 0$ of $-(pu')' + qu = k^2 u$ with $u(c) = \sin(\vartheta)$ and $pu'(c) = k\cos(\vartheta)$. For such a $u$, we define $\phi_c(x, \vartheta, k)$ and $R_c(x, \vartheta, k)$, the modified Prüfer phase and amplitude, by setting $u = R_c \sin\phi_c$ and $pu' = kR_c \cos\phi_c$. We fix a unique value of $\phi_c$ by requiring that $\phi_c(c, \vartheta, k) = \vartheta$ and continuity in $x$. As a consequence, we have that $\phi_c(x, \vartheta + \pi, k) = \phi_c(x, \vartheta, k) + \pi$. In the next three lemmas we collect facts which are well known in the Schrödinger case, i.e. for $p = 1$.

Lemma 5.1. We have for fixed $c$, $\vartheta$, and $k > 0$ that
\[ k\,\partial_x R_c(x) = \left[ q(x) - k^2\left( 1 - \frac{1}{p(x)} \right) \right] R_c(x) \sin\phi_c(x) \cos\phi_c(x), \tag{5.1} \]
\[ k\,\partial_x \phi_c(x) = k^2 - k^2\left( 1 - \frac{1}{p(x)} \right) \cos^2\phi_c(x) - q(x) \sin^2\phi_c(x). \tag{5.2} \]

Proof. We calculate
\[ \frac{1}{p}\, kR_c \cos\phi_c = u' = (\partial_x R_c) \sin\phi_c + R_c \cos\phi_c\, (\partial_x \phi_c), \]
\[ (q - k^2) R_c \sin\phi_c = (pu')' = k(\partial_x R_c) \cos\phi_c - kR_c \sin\phi_c\, (\partial_x \phi_c). \]
Solving the above system for the left-hand sides of (5.1) and (5.2) yields the desired equations.

We note that (5.1) may be rewritten as
\[ k\,\partial_x \ln[R_c^2(x)] = \left[ q(x) - k^2\left( 1 - \frac{1}{p(x)} \right) \right] \sin[2\phi_c(x)]. \tag{5.3} \]

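Lemma 5.2. For every $x$ and $\vartheta$, we have
\[ (\partial_\vartheta \phi_c)(x, \vartheta) = R_c^{-2}(x, \vartheta). \tag{5.4} \]

Proof. A naive calculation shows that
\[ \partial_x \partial_\vartheta \phi_c(x, \vartheta) = \partial_\vartheta \partial_x \phi_c(x, \vartheta) = \partial_\vartheta\, \frac{1}{k} \left[ k^2 - k^2\left( 1 - \frac{1}{p} \right) \cos^2\phi_c - q \sin^2\phi_c \right] = -\frac{2}{k}\, (\partial_\vartheta \phi_c) \left[ q - k^2\left( 1 - \frac{1}{p} \right) \right] \sin\phi_c \cos\phi_c , \]
where the first equality above is certainly true for the distributional derivatives. In dimension one, however, the distributional derivative of an absolutely continuous function coincides with its pointwise derivative in $L^1_{loc}$; hence (5.3) implies that $\partial_x (\ln[R_c^2\, \partial_\vartheta \phi_c]) = 0$ for almost every $(x, \vartheta)$. Since $R_c^2(c, \vartheta)(\partial_\vartheta \phi_c)(c, \vartheta) = 1$, we conclude that $\ln[R_c^2(x, \vartheta)(\partial_\vartheta \phi_c)(x, \vartheta)] = 0$ for a.e. $\vartheta$. This yields (5.4), first for almost every $\vartheta$, and then by continuous extension the result holds for every $\vartheta$ (Prüfer variables are analytic in $\vartheta$).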

Corollary 5.3. For any $c$, $x$, $k$, and $\vartheta$,
\[ \frac{1}{\pi} \int_{\vartheta}^{\vartheta + \pi} R_c^{-2}(x, \beta, k)\, d\beta = 1. \]

Proof. This follows from integrating the above result and using that $\phi_c(x, \vartheta + \pi, k) - \phi_c(x, \vartheta, k) = \pi$.

We are now ready to state how the Prüfer phase changes under translation of the coefficients $p$ and $q$. For $p$ and $q$ with $\mathrm{supp}(q), \mathrm{supp}(p - 1) \subset [0, D]$, $D < 1$, and $0 < a < 1 - D$, we consider the following differential expression:
\[ \tau_a := -\frac{d}{dx}\, p(\cdot - a)\, \frac{d}{dx} + q(\cdot - a). \tag{5.5} \]

Let $k > 0$, $u$ be such that $\tau_a u = k^2 u$, and take $\phi(\ldots, \tau_a)$ and $R(\ldots, \tau_a)$ as defined above for $u$.

Lemma 5.4. For fixed $k$ and $\vartheta$, we have
\[ \frac{1}{k} \frac{d}{da} \phi_0(1, \vartheta, k, \tau_a) = R_0^{-2}(1, \vartheta, k, \tau_a) - 1. \]

Proof. As $k$ is fixed, we suppress this dependency. One may write
\[ \phi_0(1, \vartheta, \tau_a) = \phi_{a+D}\!\left( 1, \phi_0\!\left( D, \phi_0\!\left( a, \vartheta, -\frac{d^2}{dx^2} \right), \tau_0 \right), -\frac{d^2}{dx^2} \right). \tag{5.6} \]
When $q = 0$ and $p = 1$, (5.2) can be used to explicitly calculate $\phi$. We may thereby rewrite the above as
\[ \phi_0(1, \vartheta, \tau_a) = \phi_0(D, \vartheta + ka, \tau_0) + k(1 - a - D). \tag{5.7} \]
Using this, and the lemmas above we find that
\[ \frac{1}{k} \frac{d}{da} \phi_0(1, \vartheta, \tau_a) = (\partial_\vartheta \phi_0)(D, \vartheta + ka, \tau_0) - 1 = \frac{1}{R_0^2(D, \vartheta + ka, \tau_0)} - 1 = \frac{1}{R_0^2(1, \vartheta, \tau_a)} - 1. \]

We note that $R_0^2(x, \vartheta, k, \tau_0) = F_{u_0}(x, k^2)$, where $F_{u_0}$ is as defined by (3.7) and $u_0$ is the solution of (2.1) normalized at zero by $u_0(0) = \sin(\vartheta)$ and $pu_0'(0) = k\cos(\vartheta)$. This observation allows us to relate non-zero reflection, i.e. $b(k) \neq 0$ for the single site equation $-(pu')' + qu = k^2 u$, to non-stationarity of the Prüfer phase in $a$. More precisely, we have:

Lemma 5.5. If $k > 0$, $b(k) \neq 0$, and $a_0 \in (0, 1 - D)$, then $\frac{d}{da} \phi_0(1, \vartheta, \tau_a)\big|_{a = a_0} \neq 0$ for all $\vartheta \in \mathbb{R} \setminus N(a_0)$, where $N(a_0)$ is a discrete, $\pi$-periodic subset of $\mathbb{R}$.

Proof. For any fixed $k$ satisfying $b(k) \neq 0$, Lemma 3.5 shows that $1 = R_0^2(0, \vartheta, \tau_0) \neq R_0^2(D, \vartheta, \tau_0)$ for at least one $\vartheta$. Since $R_0^2$ is analytic and $\pi$-periodic in $\vartheta$, we have that
\[ N := \{ \alpha \in \mathbb{R} : R_0^2(D, \alpha, \tau_0) = 1 \} \]
is a discrete, $\pi$-periodic subset of $\mathbb{R}$. By Lemma 5.4,
\[ \frac{d}{da} \phi_0(1, \vartheta, \tau_a)\Big|_{a = a_0} = 0 \quad \text{if and only if} \quad 1 = R_0^2(1, \vartheta, \tau_{a_0}) = R_0^2(D, \vartheta + ka_0, \tau_0). \]
Clearly then, we can take $N(a_0) := \{ \vartheta \in \mathbb{R} : \vartheta + ka_0 \in N \}$ and the lemma is proven. Note that $N(a_0) = N(a_0, k)$.
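The averaging identity of Corollary 5.3 and the role of $R_0^{-2} - 1$ in Lemma 5.4 can be checked numerically for a concrete single site coefficient pair. The following is a rough sketch under stated assumptions: the function names, the Runge–Kutta integrator, and the sample choice $p = 1$, $q = 5$ on $[0, D]$ with $k = 1$, $D = 0.5$ are all hypothetical and serve only as illustration. It works with the pair $(u, v)$, $v = pu'/k$, so that $R^2 = u^2 + v^2$.

```python
import math

def transfer_matrix(p, q, k, D, steps=2000):
    """RK4 for u' = k*v/p(x), v' = (q(x) - k^2)*u/k on [0, D], where v = p*u'/k.
    Returns (m11, m12, m21, m22) mapping (u(0), v(0)) to (u(D), v(D))."""
    def rhs(x, u, v):
        return k * v / p(x), (q(x) - k * k) * u / k
    h = D / steps
    columns = []
    for u, v in ((1.0, 0.0), (0.0, 1.0)):       # two basis solutions
        x = 0.0
        for _ in range(steps):
            a1, b1 = rhs(x, u, v)
            a2, b2 = rhs(x + 0.5 * h, u + 0.5 * h * a1, v + 0.5 * h * b1)
            a3, b3 = rhs(x + 0.5 * h, u + 0.5 * h * a2, v + 0.5 * h * b2)
            a4, b4 = rhs(x + h, u + h * a3, v + h * b3)
            u += h * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
            v += h * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
            x += h
        columns.append((u, v))
    (m11, m21), (m12, m22) = columns
    return m11, m12, m21, m22

def inv_R_squared(M, theta):
    # R_0^{-2}(D, theta) for initial data u(0) = sin(theta), p*u'(0)/k = cos(theta).
    m11, m12, m21, m22 = M
    u = m11 * math.sin(theta) + m12 * math.cos(theta)
    v = m21 * math.sin(theta) + m22 * math.cos(theta)
    return 1.0 / (u * u + v * v)

def average_inv_R_squared(M, n=2000):
    # Midpoint rule for (1/pi) * int_0^pi R_0^{-2}(D, beta) d(beta).
    return sum(inv_R_squared(M, (j + 0.5) * math.pi / n) for j in range(n)) / n

# Hypothetical single site coefficients: p = 1 and q = 5 on [0, D], with k = 1, D = 0.5.
M = transfer_matrix(lambda x: 1.0, lambda x: 5.0, 1.0, 0.5)
print(average_inv_R_squared(M))          # should be close to 1
print(inv_R_squared(M, math.pi / 2))     # differs from 1 for this choice
```

Since the average of $R_0^{-2}(D, \cdot)$ over a period equals 1 while the function itself is not constant in this example, $R_0^{-2} - 1$ takes both signs and vanishes only at isolated phases, in line with the discrete set $N$ above.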

We now introduce the two parameter model. Let $0 < D < 1$, $1/p$ and $q \in L^1(0, D)$ be real valued and such that $-(pu')' + qu = k^2 u$ is not reflectionless. Take $a, b \in (0, 1 - D)$ and define
\[ p_{a,b}(x) := \begin{cases} p(x - a) & \text{for } a \le x \le a + D, \\ p(x - 1 - b) & \text{for } 1 + b \le x \le 1 + b + D, \\ 1 & \text{otherwise in } [0, 2], \end{cases} \tag{5.8a} \]
\[ q_{a,b}(x) := \begin{cases} q(x - a) & \text{for } a \le x \le a + D, \\ q(x - 1 - b) & \text{for } 1 + b \le x \le 1 + b + D, \\ 0 & \text{otherwise in } [0, 2], \end{cases} \tag{5.8b} \]
and
\[ \tau_{a,b} := -\frac{d}{dx}\, p_{a,b}\, \frac{d}{dx} + q_{a,b} . \tag{5.9} \]

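Lemma 5.6. Suppose $b(k) \neq 0$ for some fixed $k > 0$. Then for every $a_0 \in (0, 1 - D)$ there exists a finite subset $M(a_0) \subset (0, 1 - D)$ with the following property: If $b_0 \in (0, 1 - D) \setminus M(a_0)$, then there exist closed sets $T_1, T_2 \subset \mathbb{R}$ such that

(i) $T_1 \cup T_2 = \mathbb{R}$,
(ii) $\frac{d}{da} \phi_0(1, \vartheta, \tau_a)\big|_{a = a_0} \neq 0$ for all $\vartheta \in T_1$,
(iii) $\frac{d}{db} \phi_0(2, \vartheta, \tau_{a_0, b})\big|_{b = b_0} \neq 0$ for all $\vartheta \in T_2$.

Proof. In keeping with the notation of Lemma 5.5, we have
\[ \frac{d}{da} \phi_0(1, \vartheta, \tau_a)\Big|_{a = a_0} \neq 0 \quad \text{for all } \vartheta \in \mathbb{R} \setminus N(a_0). \]
Let
\[ M(a_0) := \left\{ b \in (0, 1 - D) : \frac{d}{db} \phi_0(2, \vartheta, \tau_{a_0, b}) = 0 \text{ for some } \vartheta \in N(a_0) \right\}. \]
Similar to before, we may write
\[ \phi_0(2, \vartheta, \tau_{a_0, b}) = \phi_0(D, \phi_0(1, \vartheta, \tau_{a_0}) + kb, \tau_0) + k(1 - b - D), \]
and so
\[ \frac{1}{k} \frac{d}{db} \phi_0(2, \vartheta, \tau_{a_0, b})\Big|_{b = b_0} = R_0^{-2}(D, \phi_0(1, \vartheta, \tau_{a_0}) + kb_0, \tau_0) - 1. \]
Hence for fixed $\vartheta$,
\[ \{ b \in \mathbb{R} : \phi_0(1, \vartheta, \tau_{a_0}) + kb \in N \} \]
is discrete and $\pi$-periodic, and thus $M(a_0)$ is finite.

For $b_0 \in (0, 1 - D) \setminus M(a_0)$ one has that $\frac{d}{db} \phi_0(2, \vartheta, \tau_{a_0, b})\big|_{b = b_0} \neq 0$ for all $\vartheta \in N(a_0)$. Continuity and periodicity of $\frac{d}{db} \phi_0(2, \vartheta, \tau_{a_0, b})$ in $\vartheta$ imply the existence of an $\varepsilon > 0$ such that $\frac{d}{db} \phi_0(2, \vartheta, \tau_{a_0, b})\big|_{b = b_0} \neq 0$ for all $\vartheta \in T_2 := \{ \vartheta : \mathrm{dist}(\vartheta, N(a_0)) \le \varepsilon \}$. By construction, we have that $\frac{d}{da} \phi_0(1, \vartheta, \tau_a)\big|_{a = a_0} \neq 0$ for $\vartheta \in T_1 := \{ \vartheta : \mathrm{dist}(\vartheta, N(a_0)) \ge \varepsilon \}$. This completes the proof since $T_1$ and $T_2$ are closed with $T_1 \cup T_2 = \mathbb{R}$.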

6. Proof of (1.5) and (1.5')

The results established in the previous sections, in particular Lemma 5.6, will now enable us to prove a result on two parameter spectral averaging (Theorem 6.2 below), which applies to both models $H_\omega^S$ and $H_\omega^W$, and thus leads to proofs of (1.5) and (1.5'). The details are very close to arguments already used in [4], to which we refer frequently.

We work with the two parameter differential expressions $\tau_{a,b}$ from Sect. 5, defined by (5.8a), (5.8b), and (5.9), but think of them as extended to expressions of the form $-\frac{d}{dx} p \frac{d}{dx} + q$ on the whole line, such that $\tau_{a,b}$ is limit point at $+\infty$ and $-\infty$ in the sense of Sturm–Liouville theory. Thus each $\tau_{a,b}$ defines a unique self-adjoint operator $H_{a,b}$ on $L^2(\mathbb{R})$.

Lemma 6.1. Let $k_0 > 0$ with $b(k_0) \neq 0$. Choose $a_0$ and $b_0$ as in Lemma 5.6. Then there exist $\delta > 0$, $\varepsilon > 0$, and $C > 0$ such that
\[ \int_{a_0 - \delta}^{a_0 + \delta} \int_{b_0 - \delta}^{b_0 + \delta} R_0(N, \vartheta, k, \tau_{a,b})^{-2}\, db\, da \le C \tag{6.1} \]
uniformly in $N > 0$, $\vartheta \in [0, \pi)$, and $k \in [k_0 - \varepsilon, k_0 + \varepsilon]$.

To prove this one can closely follow the proof of [4, Lemma 5.3]. Lemma 5.6 replaces the use of [4, Lemma 5.2]. Also, Corollary 5.3 enters, replacing the analogous averaging formula for "ordinary" Prüfer amplitudes used in [4]. We omit the details.

The Weyl–Titchmarsh spectral measures of the operators $H_{a,b}$ (e.g. [8]) are denoted by $\rho_{a,b}$. For these measures we get our main result on two parameter spectral averaging:

Theorem 6.2. Let $E_0 = k_0^2$ for $k_0 > 0$ with $b(k_0) \neq 0$. For $a_0$, $b_0$, $\varepsilon$, and $\delta$ as in Lemma 6.1 and Borel sets $B$ define
\[ \rho(B) := \int_{a_0 - \delta}^{a_0 + \delta} \int_{b_0 - \delta}^{b_0 + \delta} \rho_{a,b}(B)\, db\, da. \tag{6.2} \]
Then the Borel measure $\rho$ is absolutely continuous in $((k_0 - \varepsilon)^2, (k_0 + \varepsilon)^2)$.

Based on Lemma 6.1, the proof of Theorem 6.2 may again be taken from [4] with few changes. The key is to use "Carmona's formula"
\[ d\rho_{a,b}(E) = \text{w-}\!\lim_{N \to \infty} \frac{1}{\pi} \int_0^{\pi} r_0(N, \theta, E, \tau_{a,b})^{-2}\, d\rho_{-\infty}^{0,\theta}\, d\theta, \tag{6.3} \]
expressing $\rho_{a,b}$ in terms of ordinary Prüfer variables $r$ and $\theta$, and the spectral measure $\rho_{-\infty}^{0,\theta}$ of the restriction of the operator $H_{a,b}$ to $(-\infty, 0)$ with boundary condition at 0 determined by $\theta$. Use of (6.3) reduces the proof of absolute continuity of (6.2) to a uniform estimate of
\[ \int_{a_0 - \delta}^{a_0 + \delta} \int_{b_0 - \delta}^{b_0 + \delta} r_0(N, \theta, E, \tau_{a,b})^{-2}\, db\, da \]
in $N$, $\theta$, and $E \in [(k_0 - \varepsilon)^2, (k_0 + \varepsilon)^2]$, compare with the proof of [4, Theorem 4.1]. But this follows from Lemma 6.1, since on the compact subinterval $[(k_0 - \varepsilon)^2, (k_0 + \varepsilon)^2]$ of $(0, \infty)$ the ordinary and modified Prüfer amplitudes $r$ and $R$ can be uniformly estimated by each other.

The proofs of Carmona's formula (6.3), which can be found in the literature for the Schrödinger case (e.g. [5, 6]), extend with few changes to the more general differential expressions considered here. Details will be contained in [33].

Theorem 6.2 covers positive energies away from the discrete set where the reflection coefficient for a non-reflectionless single site Eq. (2.1) may vanish. This will suffice to prove (1.5') since the operators $H_\omega^W$ are non-negative. However, the Schrödinger operators from (1.5) may have some negative spectrum. To cover this, an analogue of Theorem 6.2 which works for negative energies will be needed in the Schrödinger case. But in this case, we may directly refer to [4, Theorem 5.4], which shows that the conclusion of Theorem 6.2 extends to energies $E_0 \in (-\infty, 0) \setminus M_0$, where (with $q \in L^1(0, D)$ as above)
\[ M_0 := \left\{ E \in (-\infty, 0) : \text{there exists a non-trivial solution } u \text{ of } -u'' + qu = Eu \text{ with } \frac{u'(0)}{u(0)} \in \{\pm\sqrt{|E|}\} \text{ and } \frac{u'(D)}{u(D)} \in \{\pm\sqrt{|E|}\} \right\}. \]
That $M_0$ is discrete in $(-\infty, 0)$ for $q \neq 0$ was shown in [4, Theorem 3.1] by means of Floquet theory. This may be reproven by using the properties of reflection and transmission coefficients: By observing (2.2), for the case $k = i\alpha$, $\alpha > 0$, we see that
\[ M_0 = \{ -\alpha^2 : \alpha > 0 \text{ and } b(i\alpha) b(-i\alpha) a(i\alpha) a(-i\alpha) = 0 \}. \]
We have that $a(\cdot)$ is non-zero on the positive real line and analytic in $\mathbb{C} \setminus \{0\}$. Thus it has discrete zeros on the imaginary axis. The same holds for $b(\cdot)$, since it is analytic in $\mathbb{C} \setminus \{0\}$ and does not vanish identically on the real line by Corollary 3.2. We conclude that $M_0$ is discrete in $(-\infty, 0)$.

This completes our discussion of results on spectral averaging. As mentioned in the introduction, to finish the proofs of (1.5) and (1.5'), we need to know positivity of the Lyapunov exponents for the models $H_\omega^S$ and $H_\omega^W$. In the Schrödinger case, this follows from Kotani's work (e.g. [24, 25], extended in [22] to cover $L^2$-potentials as used here): $H_\omega^S$ is non-deterministic in Kotani's sense, since $g$ is compactly supported. This implies positivity of the Lyapunov exponent for almost every energy, which is sufficient for our purposes.

To treat $H_\omega^W$ we refer to Minami's extension of Kotani's theory to more general random Sturm–Liouville operators [29, 30]. More precisely, it follows as a special case of a remark in [30] that ergodic differential expressions of the type $-\frac{d}{dx} a_\omega \frac{d}{dx}$ can be transformed into ergodic expressions of the type $-r_\omega \frac{d^2}{dx^2}$. For the latter type, Kotani's theory is extended in [29], in particular the statement that non-deterministic expressions have almost everywhere positive Lyapunov exponent. This applies to $H_\omega^W$ since $f$ is compactly supported and thus $H_\omega^W$ is non-deterministic.

At this point, we skip the remaining details of the proofs of (1.5) and (1.5') and instead refer to Sect. 2 of [4], where it is shown how to deduce exponential localization from almost everywhere positivity of the Lyapunov exponent and two parameter spectral averaging. The argument in [4] uses Gilbert and Pearson's subordinacy theory. We refer to [19] for the extension of the latter to general Sturm–Liouville operators.

We close by noting that we expect the following more general result to be true: If the single site equation $-\left( \frac{u'}{1 + f} \right)' + gu = k^2 u$ is not reflectionless ($f$ and $g$ as in Sect. 1), then $H_\omega = -\frac{d}{dx} a_\omega \frac{d}{dx} + v_\omega$ almost surely has pure point spectrum ($v_\omega$ and $a_\omega$ defined by (1.1) resp. (1.3)). The above proof of (1.5) and (1.5') does not carry over since, as remarked in [29], it is not understood how to extend Kotani's theory to operators with this kind of randomness. A problem arises from the existence of reflectionless "mixed" equations as provided by the example in Sect. 3. However, in the non-reflectionless case it should be possible to prove positivity of the Lyapunov exponent for sufficiently many energies by other means.

Acknowledgement. The authors of this paper are greatly indebted to Fritz Gesztesy for indicating that the asymptotic formula given in Proposition 4.2 should exist. In particular, he identified the relevance of such a formula to Theorem 1.2 by linking the scattering data to m-function asymptotics through Herglotz representations.

References 1. Aronszajn, N. and Donoghue, W.: On exponential representations of analytic functions in the upper half-plane with positive imaginary part. J. Anal. Math. 5, 321–388 (1957) 2. De Bièvre, S. and Germinet, F.: Dynamical Localization for the Random Dimer Schrödinger Operator. Preprint, available from mp_arc 99-250 3. Borg, G.: Eine Umkehrung der Sturm–Liouvilleschen Eigenwertaufgabe. Acta Math. 78, 1–96 (1946) 4. Buschmann, D. and Stolz, G.: Two-Parameter Spectral Averaging and Localization for Non-Monotonous Random Schrödinger Operators. To appear in Trans. Am. Math. Soc. Preprint available from mp_arc 98-617 5. Carmona, R.: Exponential localization in one-dimensional disordered systems. Duke Math. J. 49, 191–213 (1982) 6. Carmona, R. and Lacroix, J.: Spectral theory of random Schrödinger operators. Basel–Berlin: Birkhäuser, 1990


7. Clark, S. and Hinton, D.: Strong nonsubordinacy and absolutely continuous spectra for Sturm–Liouville equations. Diff. and Int. Eq. 6, 573–586 (1993) 8. Coddington, E.A. and Levinson, N.: Theory of ordinary differential equations. New York: McGraw-Hill, 1955 9. Combes, J.M., Hislop, P.D. and Tip, A.: Band edge localization and the density of states for acoustic and electromagnetic waves in random media. Ann. Inst. Henri Poincaré, Physique Théorique 70, 381–428 (1999) 10. Craig, W.: The trace formula for Schrödinger operators on the line. Commun. Math. Phys. 126, 379–407 (1989) 11. Eastham, M.S.P.: The Spectral Theory of Periodic Differential Equations. Edinburgh–London: Scottish Academic Press, 1973 12. Eastham, M.S.P.: The asymptotic solution of linear differential systems. Oxford: Clarendon Press, 1989 13. Everitt, W.: On a property of the m-coefficient of a second-order linear differential equation. J. London Math. Soc. (2) 4, 443–457 (1972) 14. Figotin, A. and Klein, A.: Localization of classical waves I: Acoustic waves. Commun. Math. Phys. 180, 439–482 (1996) 15. Figotin, A. and Klein, A.: Localization of classical waves II: Electromagnetic waves. Commun. Math. Phys. 184, 411–441 (1997) 16. Gesztesy, F., Holden, H. and Simon, B.: Absolute summability of the trace relation for certain Schrödinger operators. Commun. Math. Phys. 168, 137–161 (1995) 17. Gesztesy, F. and Simon, B.: The ξ function. Acta Math. 176, 49–71 (1996) 18. Gesztesy, F., Sims, R. and Stolz, G.: In preparation 19. Gilbert, D.J.: On subordinacy and spectral multiplicity for a class of singular differential operators. Proc. Roy. Soc. Edinburgh A 128, 549–584 (1998) 20. Hochstadt, H.: On the Determination of a Hill’s Equation from its Spectrum. Arch. Rational Mech. Anal. 19, 353–362 (1965) 21. Keller, J.: Discriminant, transmission coefficients and stability bands of Hill’s equation. J. Math. Phys. 25, 2903–2904 (1984) 22. Kirsch, W., Kotani, S. 
and Simon, B.: Absence of absolutely continuous spectrum for some one dimensional random but deterministic potentials. Ann. Inst. Henri Poincaré 42, 383–406 (1985) 23. Kostrykin, V. and Schrader, R.: Scattering Theory Approach to Random Schrödinger Operators in One Dimension. Rev. Math. Phys. 11, 187–242 (1999) 24. Kotani, S.: Lyapunov indices determine absolutely continuous spectra of stationary one dimensional Schrödinger operators. In: Stochastic Analysis, ed. K. Ito, Amsterdam: North Holland, 1984, pp. 225–247 25. Kotani, S.: One-Dimensional Random Schrödinger Operators and Herglotz Functions. In: Probabilistic Methods in Mathematical Physics, eds. K. Ito and N. Ikeda, Boston: Academic Press, 1987, pp. 219–250 26. Kotani, S. and Krishna, M.: Almost periodicity of some random potentials. J. Funct. Anal. 78, 390–405 (1988) 27. Kotani, S. and Simon, B.: Localization in General One-Dimensional Random Systems. Commun. Math. Phys. 112, 103–119 (1987) 28. Kuchment, P.: The Mathematics of Photonic Crystals. In: Mathematical Modeling in Optical Science, eds. G. Bao, L. Cowsar, and W. Masters, SIAM, to appear 29. Minami, N.: An Extension of Kotani’s Theorem to Random Generalized Sturm–Liouville Operators. Commun. Math. Phys. 103, 387–402 (1986) 30. Minami, N.: An extension of Kotani’s theorem to random generalized Sturm–Liouville operators II. In: Stochastic processes in classical and quantum systems, Lecture Notes in Physics 262, Berlin–Heidelberg–New York: Springer 1986, pp. 411–419 31. Molchanov, S.: Multiscattering on sparse bumps. Contemp. Math. 217, 157–181 (1998) 32. Simon, B. and Wolff, T.: Singular continuous spectrum under rank one perturbations and localization for random Hamiltonians. Commun. Pure Appl. Math. 39, 75–90 (1986) 33. Sims, R.: PhD-Thesis, in preparation 34. Stollmann, P.: Localization for random perturbations of anisotropic periodic media. Israel J. Math. 107, 125–139 (1998) 35. 
Stolz, G.: Bounded Solutions and Absolute Continuity of Sturm–Liouville Operators. J. Math. Anal. Appl. 169, 210–228 (1992) 36. Stolz, G.: Spectral theory of Schrödinger operators with potentials of infinite barriers type. Habilitationsschrift, Frankfurt 1994 37. Stolz, G.: Localization for random Schrödinger operators with Poisson potential. Ann. Inst. Henri Poincaré 63, 297–314 (1995) 38. Stolz, G.: Non-Monotonic Random Schrödinger Operators: The Anderson Model. Preprint, available from mp_arc 99-259 39. Weidmann, J.: Spectral Theory of Ordinary Differential Operators. New York: Springer-Verlag, 1987 Communicated by B. Simon

Commun. Math. Phys. 213, 599 – 639 (2000)


Vortex Condensates for the SU(3) Chern–Simons Theory

Margherita Nolasco1, Gabriella Tarantello2

1 Dipartimento di Matematica, Università di L’Aquila, Via Vetoio, Coppito, 67010 L’Aquila, Italy. E-mail: [email protected]
2 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]

Received: 23 October 1999 / Accepted: 14 March 2000

Abstract: We investigate SU (3)-periodic vortices in the self-dual Chern–Simons theory proposed by Dunne in [13, 15]. At the first admissible non-zero energy level E = 2π , and for each (broken and unbroken) vacuum state φ(0) of the system, we find a family of periodic vortices asymptotically gauge equivalent to φ(0) , as the Chern–Simons coupling parameter k → 0. At higher energy levels, we show the existence of multiple gauge distinct periodic vortices with at least one of them asymptotically gauge equivalent to the (broken) principal embedding vacuum, when k → 0.

1. Introduction

In recent years several Chern–Simons field theories [14] have been proposed, largely motivated by their possible applications to the physics of high critical temperature superconductivity. In fact, the corresponding Chern–Simons vortex theory has revealed a much richer structure compared with that described by the “classical” Yang–Mills framework [21, 26]. The study of Chern–Simons vortices has been particularly successful in the abelian situation, see [31, 30, 4, 34, 29, 37, 8, 9, 6] and [7], where the existence and multiplicity of vortices of different nature (e.g. topological, nontopological, periodically constrained, etc.) have been established for the models proposed in [18, 20] and [23]. Only recently some progress has been made towards the existence of nonabelian CS-vortices, concerning the self-dual Chern–Simons theory proposed by Dunne in [13, 15], see also [22, 25] and [24].

Dunne’s model is defined in the (2 + 1)-Minkowski space $\mathbb{R}^{1,2}$, with metric tensor $g_{\mu\nu} = \mathrm{diag}(-1, 1, 1)$. The gauge group is given by a compact Lie group $G$ equipped with a semi-simple Lie algebra $(\mathcal{G}, [\, , ])$. The relative Chern–Simons Lagrangian density is given by
\[ L = -\mathrm{Tr}\{(D_\mu \phi)^\dagger D^\mu \phi\} - k\, \epsilon^{\mu\nu\alpha}\, \mathrm{Tr}\left\{ \partial_\mu A_\nu A_\alpha + \frac{2}{3} A_\mu A_\nu A_\alpha \right\} - V(\phi, \phi^\dagger), \tag{1.1} \]

Supported by M.U.R.S.T. 40% “Metodi Variazionali ed Equazioni Differenziali Non Lineari”.

where $A = (A_0, A_1, A_2)$ is the gauge connection of the principal bundle over $\mathbb{R}^{2,1}$ with structure group $G$ that, together with the Higgs field $\phi$, takes values in $\mathcal{G}$ through the adjoint representation of $G$. The gauge-covariant derivative $D_\mu = \partial_\mu + [A_\mu, \cdot\,]$ is used to weakly couple the Higgs field $\phi$ with the gauge potential $A = (A_0, A_1, A_2)$. Furthermore, the Levi–Civita antisymmetric tensor $\epsilon^{\mu\nu\alpha}$ is chosen with $\epsilon^{012} = 1$, $k > 0$ is the Chern–Simons coupling parameter and “$\mathrm{Tr}\{\ldots\}$” refers to the trace in the matrix representation of $\mathcal{G}$. The gauge-invariant scalar potential $V(\phi, \phi^\dagger)$ is defined by
\[ V(\phi, \phi^\dagger) = \frac{1}{4k^2}\, \mathrm{Tr}\{([[\phi, \phi^\dagger], \phi] - v^2\phi)^\dagger ([[\phi, \phi^\dagger], \phi] - v^2\phi)\}, \]
where the constant $v^2$ plays the role of a mass parameter. In the following we set $v^2 = 1$.

Vortices for $L$ correspond to static solutions (with “finite” energy) for the Euler–Lagrange equations corresponding to (1.1). Over $\mathbb{R}^2$, topological and nontopological vortices (see below) have been established in [38] and [36] respectively. Here we look for periodic vortices or condensates, namely for those static solutions which satisfy appropriate periodic boundary conditions to be specified according to the gauge invariance of $L$.

We note immediately that the Euler–Lagrange equations corresponding to $L$ (see (5.31a-b) in [14]) are very difficult to handle directly, even when we restrict our attention to “Bogomol’nyi” type vortices, which are obtained by solving the reduced (first order) “relativistic self-dual Chern–Simons equations”:
\[ \begin{cases} D_-\phi = 0, \\ F_{+-} = \frac{1}{k^2}\, [\phi - [[\phi, \phi^\dagger], \phi], \phi^\dagger], \end{cases} \tag{1.2} \]
where $D_- = D_1 - iD_2$ and $F_{+-} = \partial_+ A_- - \partial_- A_+ + [A_+, A_-]$, with $A_\pm = A_1 \pm iA_2$ and $\partial_\pm = \partial_1 \pm i\partial_2$. Solutions to (1.2) define a sort of “energy minimizers” for the system as they saturate the lower bound:
\[ E = \frac{1}{2k^2}\, \mathrm{tr}(\phi^\dagger(\phi - [[\phi, \phi^\dagger], \phi])) \tag{1.3} \]
(modulo some negligible surface terms) for the energy density $E$ corresponding to $L$ (see [14, 15]). From (1.2) it is easy to determine the zero-energy vortices (vacua states), where the gauge field vanishes (modulo gauge transformations) while the Higgs field $\phi_{(0)}$ corresponds to a zero of the potential $V$, namely, it is gauge equivalent to a solution of the algebraic equation (see [14, 15]):
\[ [[\phi, \phi^\dagger], \phi] = \phi. \tag{1.4} \]
To be able to determine other non-zero energy solutions of (1.2), Dunne in [13] has proposed a simplified form of the self-dual system (1.2) in which the fields are algebraically restricted as follows.

Vortex Condensates for SU(3) Chern–Simons Theory


Let $r$ be the rank of the Lie algebra $\mathcal{G}$, $\{H_a\}$ the generators of the Cartan subalgebra and $\{E_{\pm a}\}$ the family of the simple root step operators (with $E_{-a} = E_a^\dagger$), normalized according to a Chevalley basis [5, 19]. Hence, they satisfy the following commutation relations:

$$[H_a, H_b] = 0, \qquad [H_a, E_{\pm b}] = \pm K_{ab} E_{\pm b}, \qquad [E_a, E_{-b}] = \delta_{ab} H_a, \qquad a, b = 1, \dots, r,$$

and are subject to the normalization conditions:

$$\mathrm{Tr}\{H_a H_b\} = K_{ab}, \qquad \mathrm{Tr}\{E_a E_{-b}\} = \delta_{ab}, \qquad \mathrm{Tr}\{H_a E_{\pm b}\} = 0, \qquad a, b = 1, \dots, r,$$

where $K = (K_{ab})$ is the Cartan matrix. We assume that the fields take the form:

$$A_\mu = -i \sum_{a=1}^r A^a_\mu H_a, \tag{1.5}$$

$$\phi = \sum_{a=1}^r \phi^a E_a \tag{1.6}$$

with $A^a_\mu$ ($a = 1, \dots, r$; $\mu = 0, 1, 2$) real-valued functions and $\phi^a$ ($a = 1, \dots, r$) complex-valued functions. In view of (1.5) and (1.6), the gauge invariance of $\mathcal{L}$ can be expressed in terms of the gauge group $H = \mathrm{span}\{e^{i\sum_{a=1}^r w_a(x) H_a}\}$ (where $w_a$, $a = 1, \dots, r$, are real-valued smooth functions) generated by the Cartan subalgebra generators $\{H_a\}$. In other words, the gauge transformation laws for the components of the gauge potential and the Higgs field take the following simplified form:

$$A^a_\mu \longrightarrow A^a_\mu + \partial_\mu w_a, \qquad \phi^a \longrightarrow e^{i\sum_{b=1}^r K_{ba} w_b}\,\phi^a, \tag{1.7}$$

$a = 1, \dots, r$. With the algebraic restriction (1.5) and (1.6) on the fields, the Lagrangian density $\mathcal{L}$ and the potential $V$ are simplified as well. The Chern–Simons term decomposes into $r$ copies of an abelian Chern–Simons term, and we have

$$\mathcal{L}_{\mathrm{restricted}} = -\sum_{a=1}^r \Big|\partial_\mu \phi^a - i\Big(\sum_{b=1}^r K_{ba} A^b_\mu\Big)\phi^a\Big|^2 - k \sum_{a=1}^r \epsilon^{\mu\nu\rho}\,\partial_\mu A^a_\nu\, A^a_\rho - V_{\mathrm{restricted}}, \tag{1.8}$$


M. Nolasco, G. Tarantello

where the restricted potential becomes

$$V_{\mathrm{restricted}} = \frac{1}{4k^2}\sum_{a=1}^r |\phi^a|^2 - \frac{1}{2k^2}\sum_{a,b=1}^r |\phi^a|^2 K_{ab} |\phi^b|^2 + \frac{1}{4k^2}\sum_{a,b,c=1}^r |\phi^a|^2 K_{ab} |\phi^b|^2 K_{bc} |\phi^c|^2. \tag{1.9}$$

Furthermore, the “relativistic self-dual Chern–Simons equations” (1.2) (away from the zeroes of $\phi^a$) combine into the single set of coupled equations:

$$\partial_+\partial_- \ln|\phi^a|^2 = -\frac{1}{k^2}\sum_{b=1}^r K_{ab}|\phi^b|^2 + \frac{1}{k^2}\sum_{b,c=1}^r |\phi^b|^2 K_{bc} |\phi^c|^2 K_{ac}, \qquad a = 1, \dots, r. \tag{1.10}$$

To recover the $A_{\mu=0}$ component of the gauge potential, we must supplement (1.2) with the Gauss Law constraint of the system, which componentwise reads as follows:

$$k F^a_{12} = J^a_0, \qquad a = 1, \dots, r, \tag{1.11}$$

where $F^a_{12} = \partial_1 A^a_2 - \partial_2 A^a_1$ and $J^a_0$ define (after multiplication by $-i$), respectively, the components (in the Cartan subalgebra) of the gauge curvature and of the current density $J_0 = -i\big([\phi^\dagger, D^0\phi] - [(D^0\phi)^\dagger, \phi]\big)$. The self-dual equations (1.2) imply that $D_0\phi = \frac{i}{2k}\big([[\phi, \phi^\dagger], \phi] - \phi\big)$ and, by direct calculation, for the energy density (1.3) we find the expression

$$\mathcal{E} = \sum_{a=1}^r F^a_{12} \tag{1.12}$$

(modulo negligible surface terms). Therefore, under the decomposition (1.5) and (1.6), the system may be described in terms of $r$ (abelian) Chern–Simons fields $A^a_\mu$ coupled to $r$ complex scalar Higgs fields $\phi^a$, with the couplings determined by the Cartan matrix $K = (K_{ab})$.

Note that, by means of (1.5) and (1.6), the algebraic equation (1.4) may be solved explicitly in terms of the components $\phi^a_{(0)}$ of $\phi_{(0)}$ in the Cartan subalgebra. When $\phi^a_{(0)} \neq 0$ for every $a$, we find that

$$|\phi^a_{(0)}|^2 = \sum_{b=1}^r (K^{-1})_{ab}, \qquad a = 1, \dots, r, \tag{1.13}$$

where $K^{-1}$ is the inverse of the Cartan matrix $K$. Notice that, on the basis of (1.3), vortex solutions of (1.2) in $\mathbb{R}^2$ (with sufficiently fast decay as $|x| \to +\infty$) satisfy the energy relation:

$$E = \int_{\mathbb{R}^2} \mathcal{E} = \frac{1}{2k^2}\int_{\mathbb{R}^2} \mathrm{tr}\big(\phi^\dagger(\phi - [[\phi, \phi^\dagger], \phi])\big). \tag{1.14}$$

Therefore, it is to be expected that non-zero energy vortex solutions in $\mathbb{R}^2$ become gauge equivalent to the vacua states as $|x| \to +\infty$.
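As an illustrative check (not part of the original argument), the following Python sketch evaluates the vacuum components of (1.13) and the restricted potential (1.9) exactly, in the variables $x_a = |\phi^a|^2$ and with $k = 1$. The helper names `solve_exact`, `vacuum` and `restricted_potential` are hypothetical and introduced only for this example; the matrices are the standard Cartan matrices of SU(2) and SU(3).

```python
from fractions import Fraction

def solve_exact(K, rhs):
    """Solve K x = rhs exactly by Gauss-Jordan elimination (K assumed invertible)."""
    n = len(K)
    M = [[Fraction(v) for v in row] + [Fraction(rhs[i])] for i, row in enumerate(K)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                M[r] = [vr - M[r][col] * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def vacuum(K):
    """x_a = sum_b (K^{-1})_{ab}, i.e. the solution of K x = (1, ..., 1), cf. (1.13)."""
    return solve_exact(K, [1] * len(K))

def restricted_potential(K, x, k=1):
    """V_restricted of (1.9) written in terms of x_a = |phi^a|^2."""
    n = len(K)
    Kx = [sum(K[a][b] * x[b] for b in range(n)) for a in range(n)]  # (K x)_a
    xK = [sum(x[a] * K[a][b] for a in range(n)) for b in range(n)]  # (x K)_b
    linear = sum(x)
    quadratic = sum(x[a] * Kx[a] for a in range(n))
    cubic = sum(xK[b] * x[b] * Kx[b] for b in range(n))
    value = Fraction(1, 4) * linear - Fraction(1, 2) * quadratic + Fraction(1, 4) * cubic
    return value / k**2

SU2 = [[2]]
SU3 = [[2, -1], [-1, 2]]
```

Under these assumptions, `vacuum(SU3)` returns the pair $(1, 1)$ and `restricted_potential` vanishes there, which is one way to see that (1.13) describes a zero of the potential.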


Results in this direction have been obtained by Yang [38] and Wang–Zhang [36] when the gauge group is $G = SU(N)$. Yang in [38] shows that there exist solutions of (1.2) satisfying the ansatz (1.5) and (1.6), for every prescribed configuration of zeros for $\phi^a$, and such that $|\phi^a|$ converges uniformly to $|\phi^a_{(0)}|$ in (1.13) as $|x| \to +\infty$, $a = 1, \dots, r$. Thus, Yang’s vortices are asymptotically equivalent (as $|x| \to +\infty$) to the so-called principal embedding vacuum. Usually, one refers to those as the topological solutions of (1.2) in $\mathbb{R}^2$. More difficult to derive are the nontopological solutions, namely those asymptotically gauge-equivalent to the other vacua states, as $|x| \to +\infty$. A class of nontopological solutions has been derived recently in [36]. The solutions in [36] are shown to be asymptotically gauge-equivalent to the unbroken vacuum $\phi_{(0)} = 0$, as $|x| \to +\infty$.

We observe that, when $N = 2$, the $SU(N)$-gauge theory corresponding to $\mathcal{L}$ in (1.1) reduces to the abelian Chern–Simons–Higgs theory introduced by Hong–Kim–Pac [18] and Jackiw–Weinberg [20]. Thus, the above mentioned results [38] and [36] extend the work of [31, 30] and [6] on topological and nontopological abelian Chern–Simons vortices. In the abelian context we also mention the work of [7] on nontopological Maxwell–Chern–Simons vortices for the Lee–Lee–Min model [23].

However, to establish condensate-type vortices whose features more closely resemble those of the mixed states predicted by Abrikosov in superconductivity [1], it is necessary to derive solutions of (1.2) subject to gauge invariant periodic boundary conditions. For this purpose note that, in the stationary case, the functions $w_a$ ($a = 1, \dots, r$) in (1.7) expressing the gauge invariance of (1.1) depend only on the space variables $x = (x_1, x_2)$, and the gauge transformation laws reduce to:

$$A^a_0 \to A^a_0, \qquad A^a_j \to A^a_j + \partial_j w_a, \quad j = 1, 2, \qquad \phi^a \longrightarrow e^{i\sum_{b=1}^r K_{ba} w_b}\,\phi^a, \qquad a = 1, \dots, r. \tag{1.15}$$

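Therefore, following 't Hooft [33], for each of the $r$ components of the fields we require appropriate periodic boundary conditions to hold in the periodic cell domain

$$\Omega = \Big\{x = (x_1, x_2) \in \mathbb{R}^2 \ \Big|\ -\frac{a}{2} \le x_1 \le \frac{a}{2},\ -\frac{b}{2} \le x_2 \le \frac{b}{2}\Big\},$$

as follows. Let $e_1 = (a, 0)$ and $e_2 = (0, b)$ and decompose the boundary of $\Omega$ by setting $\partial\Omega = \Gamma^1 \cup \Gamma^2 \cup \{e_1 + \Gamma^2\} \cup \{e_2 + \Gamma^1\} \cup \{0, e_1, e_2, e_1 + e_2\}$, with

$$\Gamma^1 = \Big\{x \in \mathbb{R}^2 \ \Big|\ x = \tfrac{1}{2}(s e_1 - e_2),\ |s| < 1\Big\}, \qquad \Gamma^2 = \Big\{x \in \mathbb{R}^2 \ \Big|\ x = \tfrac{1}{2}(s e_2 - e_1),\ |s| < 1\Big\}.$$

We require that each component $A^a_\mu$ and $\phi^a$ ($a = 1, \dots, r$) of the vortex condensate $(A, \phi)$ satisfies:

$$\begin{cases} e^{i\sum_{b=1}^r K_{ba}\xi^b_k(x + e_k)}\,\phi^a(x + e_k) = e^{i\sum_{b=1}^r K_{ba}\xi^b_k(x)}\,\phi^a(x), & \text{(a)}\\[2pt] A^a_0(x + e_k) = A^a_0(x), & \text{(b)}\\[2pt] (A^a_j + \partial_j\xi^a_k)(x + e_k) = (A^a_j + \partial_j\xi^a_k)(x), \quad j = 1, 2, & \text{(c)}\\[2pt] x \in \Gamma^1 \cup \Gamma^2 \setminus \Gamma^k, \quad k = 1, 2, \quad a = 1, \dots, r, \end{cases} \tag{1.16}$$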


where $\xi^a_1, \xi^a_2$ ($a = 1, \dots, r$) are smooth functions defined in a neighborhood of $\Gamma^2 \cup \{e_1 + \Gamma^2\}$ and $\Gamma^1 \cup \{e_2 + \Gamma^1\}$, respectively.

Notice that, in analogy to the abelian case, the set of boundary conditions (1.16) produces a “quantization” effect on the “charges” (see [4] and [35]). In fact, in view of (1.16)-(a), to any vortex condensate we can associate $r$ integers $N_a \in \mathbb{Z}$ ($a = 1, \dots, r$) (vortex numbers), corresponding to the phase shift of $\phi^a$ around $\partial\Omega$. More precisely, for $a = 1, \dots, r$ and $k = 1, 2$, set

$$\hat\xi^a_k(s^1, s^2) = \sum_{b=1}^r K_{ba}\,\xi^b_k(s^1 e_1 + s^2 e_2), \qquad s^j \in (0, 1),\ j = 1, 2;$$

we have:

$$\hat\xi^a_1(1, 0^+) - \hat\xi^a_1(0, 0^+) + \hat\xi^a_2(0^+, 0) - \hat\xi^a_2(0^+, 1) + \hat\xi^a_1(0, 1^-) - \hat\xi^a_1(1, 1^-) + \hat\xi^a_2(1^-, 1) - \hat\xi^a_2(1^-, 0) = 2\pi N_a.$$

Consequently, by means of (1.16)-(c) and (1.11), for the “magnetic flux”-component $\Phi_a = \int_\Omega F^a_{12}$ and the “electric charge”-component $Q_a = \int_\Omega J^a_0$ we obtain the relations

$$\sum_{b=1}^r K_{ba}\Phi_b = 2\pi N_a, \qquad Q_a = k\,\Phi_a.$$

Hence, they obey the following “quantization” rules:

$$\Phi_a = 2\pi\sum_{b=1}^r (K^{-1})_{ba} N_b, \qquad Q_a = 2\pi k\sum_{b=1}^r (K^{-1})_{ba} N_b. \tag{1.17}$$

Accordingly, for the energy

$$E = \int_\Omega \mathcal{E} = \frac{1}{2k^2}\int_\Omega \mathrm{tr}\big(\phi^\dagger(\phi - [[\phi, \phi^\dagger], \phi])\big), \tag{1.18}$$

we may use (1.14) to derive

$$E = \sum_{a=1}^r \int_\Omega F^a_{12} = 2\pi\sum_{a,b=1}^r (K^{-1})_{ba} N_b = 2\pi\sum_{b=1}^r |\phi^b_{(0)}|^2 N_b, \tag{1.19}$$

where $\phi^b_{(0)}$ expresses the components (in the Cartan subalgebra) of the principal embedding vacuum as given in (1.13). Since $D_-\phi = 0$, or equivalently $\partial_- \ln\phi^a = i\sum_{b=1}^r A^b_- K_{ba}$ ($a = 1, \dots, r$), as in [4] and [34], each vortex number $N_a$ has the (topological) interpretation of counting the number of zeroes (according to their multiplicity) of the Higgs scalar component $\phi^a$ in $\Omega$.

By virtue of (1.18), we now expect doubly periodic vortex solutions to become asymptotically gauge-equivalent to the vacua states when $k \to 0^+$. Thus, in the same spirit as the results [38] and [36] mentioned above, we are going to establish periodic vortex condensates for $\mathcal{L}$, where the asymptotic behavior of the Higgs field, as $k \to 0^+$, is prescribed according to any fixed zero of the potential $V$.
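To make the quantization rules (1.17) and the energy formula (1.19) concrete, here is a small Python sketch for a rank-two Cartan matrix. It is only an illustration: the function names `inverse_2x2`, `flux_over_2pi` and `energy_over_2pi` are hypothetical, and the SU(3) Cartan matrix is supplied as sample input. The charges follow from $Q_a = k\,\Phi_a$.

```python
from fractions import Fraction

def inverse_2x2(K):
    """Exact inverse of a 2x2 integer matrix with non-zero determinant."""
    (a, b), (c, d) = K
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def flux_over_2pi(K, N):
    """Phi_a / (2 pi) = sum_b (K^{-1})_{ba} N_b, as in (1.17)."""
    Kinv = inverse_2x2(K)
    return [sum(Kinv[b][a] * N[b] for b in range(2)) for a in range(2)]

def energy_over_2pi(K, N):
    """E / (2 pi) = sum_a Phi_a / (2 pi), as in (1.19)."""
    return sum(flux_over_2pi(K, N))

SU3 = [[2, -1], [-1, 2]]
```

With vortex numbers $(N_1, N_2) = (0, 1)$ this gives fluxes $2\pi/3$ and $4\pi/3$ and energy $2\pi$; for general $(N_1, N_2)$ the energy is $2\pi(N_1 + N_2)$.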


For this purpose, we shall focus our attention on the simplest non-abelian case of physical relevance and take the gauge group $G = SU(3)$. Our results should be compared with those obtained in [4, 34] for the abelian case (i.e. $G = SU(2)$) and in [29] for the Maxwell–Chern–Simons–Higgs model of Lee–Lee–Min [23].

Note that for the Lie group $SU(3)$ the Cartan matrix is $K = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$. Thus, the restricted potential takes the form:

$$\begin{aligned} V_{\mathrm{restricted}}\big(|\phi^1|, |\phi^2|\big) &= \frac{1}{4k^2}\Big\{4|\phi^1|^6 + 4|\phi^2|^6 - 3|\phi^1|^4|\phi^2|^2 - 3|\phi^1|^2|\phi^2|^4 \\ &\qquad\quad - 4|\phi^1|^4 - 4|\phi^2|^4 + 4|\phi^1|^2|\phi^2|^2 + |\phi^1|^2 + |\phi^2|^2\Big\} \\ &= \frac{1}{4k^2}\Big\{|\phi^1|^2\big(2|\phi^1|^2 - |\phi^2|^2 - 1\big)^2 + |\phi^2|^2\big(2|\phi^2|^2 - |\phi^1|^2 - 1\big)^2\Big\}, \end{aligned} \tag{1.20}$$

whose zeroes coincide with the pairs $(|\phi^1|^2, |\phi^2|^2) = (0, 0)$ (unbroken vacuum), $(|\phi^1|^2, |\phi^2|^2) = (1, 1)$ (principal embedding vacuum), $(|\phi^1|^2, |\phi^2|^2) = (0, \frac{1}{2})$ and $(\frac{1}{2}, 0)$. Thus, for $k > 0$ small, we will be interested in deriving $SU(3)$-vortex solutions for (1.2)–(1.16), under the ansatz (1.5) and (1.6), such that the components $\phi^a$ ($a = 1, 2$) of the Higgs field satisfy one of the following:

• $|\phi^1| \to 1$ and $|\phi^2| \to 1$ (type I)
• $|\phi^1| \to 0$ and $|\phi^2| \to 0$ (type II)
• (a) $|\phi^1| \to \frac{1}{\sqrt{2}}$ and $|\phi^2| \to 0$, or (b) $|\phi^1| \to 0$ and $|\phi^2| \to \frac{1}{\sqrt{2}}$ (type III)

in some suitable norm as $k \to 0^+$.

In this direction, we prove that there exist two gauge distinct families of solutions for any prescribed pair of “vortex numbers” $N_a$ and relative set of vortex points $Z_a = \{p^a_1, \dots, p^a_{N_a}\} \subset \Omega$ corresponding to the zeroes of the component $\phi^a$ of the Higgs field ($a = 1, 2$), provided we take $k > 0$ sufficiently small. We can characterize only one of these two families of solutions as having the prescribed asymptotic behavior of type I, as $k \to 0$. The existence of solutions of type II and III is proved only when $N_1 + N_2 = 1$, that is, when a single vortex point is prescribed. In particular, we may conclude that at the energy level $E = 2\pi$ (see (1.19)) there exist $SU(3)$-periodic vortices for each of the prescribed types I, II and III. More precisely, we obtain the following results.

Theorem 1.1. Let $N_a$ be a nonnegative integer and $Z_a = \{p^a_1, \dots, p^a_{N_a}\} \subset \Omega$ be an assigned set of $N_a$ points (not necessarily distinct) in $\Omega$, $a = 1, 2$. For $0 < k < \sqrt{\dfrac{3|\Omega|}{8\pi\max\{2N_1 + N_2,\ 2N_2 + N_1\}}}$ sufficiently small, there exist two gauge distinct $SU(3)$-periodic vortex solutions of (1.2)–(1.16) satisfying the ansatz (1.5)–(1.6) and such that:

(i) The component $\phi^a$ of the Higgs field satisfies $|\phi^a| < 1$ in $\Omega$, and $\phi^a$ vanishes exactly at each $p^a_j \in Z_a$ with the multiplicity given by the repetition of $p^a_j$ in $Z_a$, $a = 1, 2$.


(ii) The induced “magnetic flux”-component $\Phi_a$ and “electric charge”-component $Q_a$ satisfy:

$$\Phi_a = \frac{1}{k} Q_a = \frac{2\pi}{3}(2N_a + N_b), \qquad a \neq b = 1, 2.$$

(iii) The energy $E$ satisfies: $E = 2\pi(N_1 + N_2)$.

Furthermore, one of these two solutions is always of type I, in the sense that the components $\phi^a_{(1)}$ ($a = 1, 2$) of the Higgs field satisfy:

$$|\phi^a_{(1)}| \to 1, \qquad \text{as } k \to 0^+, \tag{1.21}$$

pointwise a.e. in $\Omega$ and strongly in $L^p(\Omega)$, $\forall\, p \ge 1$. When $N_1 + N_2 = 1$, the other solution is of type II, and the corresponding components $\phi^a_{(2)}$ ($a = 1, 2$) of the Higgs field satisfy:

$$|\phi^a_{(2)}| \to 0, \qquad \text{as } k \to 0^+, \text{ uniformly in } \Omega. \tag{1.22}$$

For $k > \sqrt{\dfrac{3|\Omega|}{8\pi\max\{2N_1 + N_2,\ 2N_2 + N_1\}}}$ it is not possible to have $SU(3)$-periodic vortex solutions of (1.2)–(1.16) satisfying the ansatz (1.5)–(1.6) together with the properties (ii) and (iii).

Remark 1.2. In the statement above, it is understood that in case $N_a = 0$ for some $a = 1, 2$, the corresponding set of vortex points $Z_a$ is taken to be the empty set. Furthermore, a bootstrap argument shows that, in fact, the convergence in (1.22) holds $C^m(\Omega)$-uniformly, for every $m \in \mathbb{N}$, and in any other relevant norm.

Theorem 1.1 may be considered as the complete analogue of the results on abelian Chern–Simons periodic vortices (corresponding to $G = SU(2)$) obtained by Caffarelli–Yang [4] and Tarantello [34]. In fact, if $G = SU(2)$ then $r = 1$, and the restricted potential takes the form $V(|\phi|) = \frac{1}{4k^2}|\phi|^2(1 - 2|\phi|^2)^2$. So, in this case, only type I and type II vortices are allowed. To ensure the existence of type II vortices, the given restriction on the vortex numbers not to exceed the value 1 appears as a technical condition, which is required also in the abelian case [34].

Indeed, in the spirit of [34], we derive our results by considering different “constrained” variational principles for each type I, II and III vortex. This point of view was inspired by a “constrained” variational approach introduced by Caffarelli–Yang [4] to treat 1-periodic vortices in the abelian situation. The restriction $N = 1$ on the vortex number was needed in [4] in order to derive the existence of a minimum for the relative variational problem, as a direct consequence of the Moser–Trudinger inequality (see [3, 16]). On the other hand, our variational problems take a system form for which the Moser–Trudinger inequality no longer suffices to yield directly the existence of a minimizer, even under the given restriction on the vortex numbers.
Instead, we show (see also [27]) that, on the constrained set, an “improved” form of the Moser–Trudinger inequality holds, which enables us to obtain a minimum for all the variational problems under examination regardless of the values of the vortex numbers.


However, for type II (and even more so for type III) vortices we need to restrict the sum of the vortex numbers, as above, in order to ensure that these minima actually lie in the “interior” of the constrained set, and thus yield the desired vortex solution. At the moment, it is not clear how to remove such a restriction even for the simpler abelian situation, where only recently some progress has been made in this direction, see [28, 12] and [10].

Concerning the type III vortices, specific to the $SU(3)$-theory, we have the following:

Theorem 1.3. For each fixed point $p \in \Omega$ and $0 < k < \frac{1}{4}\sqrt{\frac{3|\Omega|}{\pi}}$ sufficiently small, there exists an $SU(3)$-periodic vortex solution of (1.2)–(1.16) satisfying the ansatz (1.5)–(1.6) such that for the components $\phi^a$ ($a = 1, 2$) of the Higgs field the following holds:

(i) (first component): $|\phi^1| < 1$; $\phi^1$ never vanishes in $\Omega$ and

$$|\phi^1| \to \frac{1}{\sqrt{2}}, \qquad \text{as } k \to 0^+, \tag{1.23}$$

pointwise a.e. in $\Omega$ and strongly in $\mathcal{H}^1(\Omega)$. The corresponding induced first component of the “magnetic field” $\Phi_1$ and “electric charge” $Q_1$ satisfy: $\Phi_1 = \frac{1}{k}Q_1 = \frac{2\pi}{3}$;

(ii) (second component): $|\phi^2| < 1$; $\phi^2$ admits a simple zero at $p \in \Omega$ and

$$|\phi^2| \to 0, \qquad \text{as } k \to 0^+, \tag{1.24}$$

pointwise a.e. in $\Omega$ and strongly in $\mathcal{H}^1(\Omega)$. The corresponding induced second component of the “magnetic field” $\Phi_2$ and “electric charge” $Q_2$ satisfy: $\Phi_2 = \frac{1}{k}Q_2 = \frac{4\pi}{3}$.

(iii) The energy $E = 2\pi$.

Remark 1.4. In words, Theorem 1.3 states the existence of a type III (a) vortex with first vortex number $N_1 = 0$ and second vortex number $N_2 = 1$. Due to the complete symmetry of (1.10) with respect to the indices $a = 1, 2$, we can also claim the existence of an $SU(3)$-periodic vortex of type III (b) simply by exchanging the roles of the indices. Thus, Theorem 1.3 may be completed with the existence of another $SU(3)$-periodic vortex whose components $\phi^a$ of the Higgs field satisfy:

(i) (first component): $|\phi^1| < 1$; $\phi^1$ admits a simple zero at $p \in \Omega$ and

$$|\phi^1| \to 0, \qquad \text{as } k \to 0^+, \tag{1.25}$$

pointwise a.e. in $\Omega$ and strongly in $\mathcal{H}^1(\Omega)$. The corresponding induced first component of the “magnetic field” is $\Phi_1 = \frac{4\pi}{3}$ and of the “electric charge” is $Q_1 = \frac{4\pi}{3}k$;

(ii) (second component): $|\phi^2| < 1$; $\phi^2$ never vanishes in $\Omega$ and

$$|\phi^2| \to \frac{1}{\sqrt{2}}, \qquad \text{as } k \to 0^+, \tag{1.26}$$

pointwise a.e. in $\Omega$ and strongly in $\mathcal{H}^1(\Omega)$.


The corresponding induced second component of the “magnetic field” $\Phi_2$ and “electric charge” $Q_2$ satisfy $\Phi_2 = \frac{1}{k}Q_2 = \frac{2\pi}{3}$.

(iii) The energy $E = 2\pi$.

For such a solution the corresponding vortex numbers are given by $N_1 = 1$ and $N_2 = 0$. Thus, in case $N_1 + N_2 = 1$, we can combine the results above and conclude:

Corollary 1.5. For given $p^1, p^2 \in \Omega$ and $0 < k < \frac{1}{4}\sqrt{\frac{3|\Omega|}{\pi}}$ sufficiently small, at the energy level $E = 2\pi$ there exists an $SU(3)$-periodic vortex, satisfying (1.2)–(1.16) and the ansatz (1.5)–(1.6), for each of the asymptotic behaviors prescribed by the types I, II and III (a), (b), as $k \to 0^+$. Furthermore, either the first component $\phi^1$ of the Higgs field admits a simple zero at $p^1$ and the second component $\phi^2$ never vanishes; or $\phi^1$ never vanishes and $\phi^2$ admits a simple zero at $p^2$.

Note that, at the moment, no existence result is available concerning vortices in $\mathbb{R}^2$ with the asymptotic behavior of type III, as $|x| \to +\infty$.

To establish the results above, we take advantage of the equation $D_-\phi = 0$, which we may write componentwise as follows:

$$\partial_- \ln\phi^a = i\sum_{b=1}^r A^b_- K_{ba}, \qquad a = 1, \dots, r. \tag{1.27}$$

In fact, by virtue of (1.27), we can follow an approach introduced by Taubes ([35]) for the study of self-dual Ginzburg–Landau vortices, and derive from (1.10) a system of nonlinear elliptic equations for the real-valued functions $u_a = \ln|\phi^a|^2$ ($a = 1, \dots, r$) of the following form (see also [38]):

$$\begin{cases} \Delta u_a = -\dfrac{1}{k^2}\Big(\displaystyle\sum_{b=1}^r K_{ab}\,e^{u_b} - \sum_{b,c=1}^r e^{u_b} K_{bc}\,e^{u_c} K_{ac}\Big) + 4\pi\displaystyle\sum_{j=1}^{N_a}\delta_{p^a_j} & \text{on } \Omega,\\[4pt] u_a \ \text{doubly periodic on } \partial\Omega, \qquad a = 1, \dots, r, \end{cases} \tag{1.28}$$

where the points $p^a_1, \dots, p^a_{N_a} \in \Omega$ are the prescribed zeroes of the scalar fields $\phi^a$ ($a = 1, \dots, r$), repeated according to their multiplicity. In fact, from each solution $u_a$ ($a = 1, \dots, r$) of (1.28) we may recover, under the ansatz (1.5) and (1.6), the whole vortex solution of (1.2), by setting:

$$\begin{aligned} \phi^a(x) &= \exp\Big(\tfrac{1}{2}u_a(x) + i\sum_{j=1}^{N_a}\mathrm{Arg}\,(x - p^a_j)\Big),\\ A^a_1 - iA^a_2 &= -i\sum_{b=1}^r \partial_-\ln\phi^b\,(K^{-1})_{ba},\\ A^a_0 &= -\frac{1}{2k}\Big(|\phi^a|^2 - \sum_{b=1}^r (K^{-1})_{ba}\Big), \end{aligned} \tag{1.29}$$

where $K^{-1}$ is the inverse of the Cartan matrix $K$. Clearly, from (1.29) we have that $\phi^a$ vanishes exactly at each $p^a_j$ with the multiplicity corresponding to the repetition of $p^a_j$ in $Z_a$. We shall devote the following sections to the analysis of the elliptic system (1.28) in case $G = SU(3)$.
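The specialization of (1.28) to $SU(3)$ can be checked mechanically. The sketch below is an illustration only, with hypothetical function names: it treats $e^{u_1}, e^{u_2}$ as exact rational placeholders and compares the general bracket of (1.28) with the explicit polynomial that multiplies $\lambda = 1/k^2$ in the $SU(3)$ system (displayed as (2.1) in the next section). Because of the prefactor $-1/k^2$, the two expressions should differ exactly by a sign.

```python
from fractions import Fraction
from itertools import product

K = [[2, -1], [-1, 2]]  # Cartan matrix of SU(3)

def bracket_general(a, e, K):
    """sum_b K_ab e_b - sum_{b,c} e_b K_bc e_c K_ac, with e_b standing for exp(u_b)."""
    r = len(K)
    linear = sum(K[a][b] * e[b] for b in range(r))
    quadratic = sum(e[b] * K[b][c] * e[c] * K[a][c]
                    for b in range(r) for c in range(r))
    return linear - quadratic

def bracket_su3(a, e):
    """Expression multiplying lambda in the SU(3) system, component a (0-based)."""
    x, y = e[a], e[1 - a]
    return 4 * x * x - 2 * y * y - 2 * x + y - x * y

# Sample values of (exp(u_1), exp(u_2)); any positive rationals would do.
SAMPLES = [(Fraction(p, 7), Fraction(q, 5))
           for p, q in product(range(1, 6), repeat=2)]
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.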


2. Variational Formulation and Preliminary Results

We study the system (1.28) when the gauge group considered is $SU(3)$. Recalling that the Cartan matrix for $SU(3)$ is given by $K = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$, the system (1.28) takes the following form:

$$\begin{cases} \Delta u_1 = \lambda\big(4e^{2u_1} - 2e^{2u_2} - 2e^{u_1} + e^{u_2} - e^{u_1 + u_2}\big) + 4\pi\displaystyle\sum_{j=1}^{N_1}\delta_{p^1_j} & \text{on } \Omega,\\[4pt] \Delta u_2 = \lambda\big(4e^{2u_2} - 2e^{2u_1} - 2e^{u_2} + e^{u_1} - e^{u_1 + u_2}\big) + 4\pi\displaystyle\sum_{j=1}^{N_2}\delta_{p^2_j} & \text{on } \Omega,\\[4pt] u_1, u_2 \ \text{doubly periodic on } \partial\Omega, \end{cases} \tag{2.1}$$

where we have set

$$\lambda = \frac{1}{k^2} > 0. \tag{2.2}$$

Concerning problem (2.1) we shall prove the following results:

Theorem 2.1. (a) For $0 < \lambda < \dfrac{8\pi}{3|\Omega|}\max\{2N_1 + N_2,\ 2N_2 + N_1\}$, problem (2.1) admits no solutions.

(b) Every solution $(u_1, u_2)$ of (2.1) satisfies:

$$e^{u_a} \le 1 \quad \text{in } \Omega \quad (a = 1, 2). \tag{2.3}$$

(c) There exists $\lambda_0 > 0$ sufficiently large such that, $\forall\lambda > \lambda_0$, problem (2.1) admits at least two distinct solutions, one of which always satisfies:

$$e^{u_a} \to 1, \qquad \text{as } \lambda \to +\infty, \tag{2.4}$$

pointwise a.e. in $\Omega$ and strongly in $L^p(\Omega)$, $\forall\, p \ge 1$.

We point out that, contrary to part (a) of Theorem 2.1, where the estimate on the non-existence range of $\lambda$'s is given independently of the position of the vortex points, the range $(\lambda_0, +\infty)$ of existence established in part (c) depends on the position of such points, as can be seen already from the rough estimate given in (2.21). Clearly, (2.4) ensures the existence of a periodic vortex solution of type I.

Concerning the existence of type II and III vortices, we have to limit our attention to the case where the vortex numbers $(N_1, N_2)$ satisfy $N_1 + N_2 = 1$. Thus, we consider problem (2.1) in the simpler form:

$$\begin{cases} \Delta u_1 = \lambda\big(4e^{2u_1} - 2e^{2u_2} - 2e^{u_1} + e^{u_2} - e^{u_1 + u_2}\big) + 4\pi N_1\delta_{p^1} & \text{on } \Omega,\\[2pt] \Delta u_2 = \lambda\big(4e^{2u_2} - 2e^{2u_1} - 2e^{u_2} + e^{u_1} - e^{u_1 + u_2}\big) + 4\pi N_2\delta_{p^2} & \text{on } \Omega,\\[2pt] u_1, u_2 \ \text{doubly periodic on } \partial\Omega, \end{cases} \tag{2.5}$$

with assigned points $p^a \in \Omega$, $a = 1, 2$. We prove:

Theorem 2.2. For $N_1 + N_2 = 1$ and $\lambda > \dfrac{16\pi}{3|\Omega|}$ sufficiently large, problem (2.10) admits a (weak) solution $(u^-_1, u^-_2)$ satisfying:

$$e^{u^-_a} \to 0 \quad \text{as } \lambda \to +\infty, \text{ uniformly in } \Omega \ (a = 1, 2) \tag{2.6}$$

(and in any other relevant norm). Furthermore, there always exists a second solution $(u^*_1, u^*_2)$ such that,


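(i) if $N_1 = 0$ and $N_2 = 1$, then

$$e^{u^*_1} \to \frac{1}{2}, \qquad e^{u^*_2} \to 0 \qquad \text{as } \lambda \to +\infty \tag{2.7}$$

pointwise a.e. in $\Omega$ and strongly in $\mathcal{H}^1(\Omega)$;

(ii) if $N_1 = 1$ and $N_2 = 0$, then

$$e^{u^*_1} \to 0, \qquad e^{u^*_2} \to \frac{1}{2} \qquad \text{as } \lambda \to +\infty \tag{2.8}$$

pointwise a.e. in $\Omega$ and strongly in $\mathcal{H}^1(\Omega)$.

It is clear that, by means of the transformations in (1.29) and (2.2), we obtain Theorems 1.1 and 1.3 as well as Corollary 1.5 as immediate consequences of Theorems 2.1 and 2.2.

To establish Theorems 2.1 and 2.2, it will be convenient to distinguish between the singular and the regular part of the solutions of (2.1). For this purpose denote by $u^a_0$ ($a = 1, 2$) the unique solution of the problem (see [3])

$$\begin{cases} \Delta u^a_0 = -\dfrac{4\pi N_a}{|\Omega|} + 4\pi\displaystyle\sum_{j=1}^{N_a}\delta_{p^a_j} & \text{on } \Omega,\\[4pt] \displaystyle\int_\Omega u^a_0 = 0, \qquad u^a_0 \ \text{doubly periodic on } \partial\Omega, \qquad a = 1, 2. \end{cases} \tag{2.9}$$

As is well known, $u^a_0 \in C^\infty(\Omega \setminus \{p^a_1, \dots, p^a_{N_a}\})$ and, if $n^a_j$ is the multiplicity of $p^a_j$, then $u^a_0$ behaves like $\ln|x - p^a_j|^{2n^a_j}$ as $x \to p^a_j$.

Setting $u_a = u^a_0 + v_a$ and $h_a = e^{u^a_0}$, we have that $(u_1, u_2)$ is a solution of (2.1) if and only if $(v_1, v_2)$ is a smooth solution of the following system:

$$\begin{cases} \Delta v_1 = \lambda\big(4h_1^2e^{2v_1} - 2h_2^2e^{2v_2} - 2h_1e^{v_1} + h_2e^{v_2} - h_1h_2e^{v_1 + v_2}\big) + \dfrac{4\pi N_1}{|\Omega|} & \text{on } \Omega,\\[4pt] \Delta v_2 = \lambda\big(4h_2^2e^{2v_2} - 2h_1^2e^{2v_1} - 2h_2e^{v_2} + h_1e^{v_1} - h_1h_2e^{v_1 + v_2}\big) + \dfrac{4\pi N_2}{|\Omega|} & \text{on } \Omega,\\[4pt] v_1, v_2 \ \text{doubly periodic on } \partial\Omega. \end{cases} \tag{2.10}$$

As a preliminary result, we start by deriving part (b) of Theorem 2.1.

Proof of (2.3). First of all notice that, in view of (2.9) and (2.10), $\lim_{x \to p^a_j} u_a = -\infty$ for $a = 1, 2$ and $j = 1, \dots, N_a$. Therefore $u_a$ attains its maximum value at some point $\bar x_a \in \Omega \setminus \{p^a_1, \dots, p^a_{N_a}\}$. Set $\bar u_a = \max_\Omega u_a = u_a(\bar x_a)$ ($a = 1, 2$). By symmetry, we can assume, without loss of generality, that $\bar u_1 \ge \bar u_2$. Since $(u_1, u_2)$ is a solution of (2.1) and $\Delta u_1(\bar x_1) \le 0$, we derive

$$\begin{aligned} 0 &\ge 4e^{2\bar u_1} - 2e^{2u_2(\bar x_1)} - 2e^{\bar u_1} + e^{u_2(\bar x_1)} - e^{\bar u_1 + u_2(\bar x_1)}\\ &\ge 2e^{2\bar u_1} - 2e^{\bar u_1} + e^{u_2(\bar x_1)} - e^{\bar u_1 + u_2(\bar x_1)}\\ &= 2e^{\bar u_1}\big(e^{\bar u_1} - 1\big) - e^{u_2(\bar x_1)}\big(e^{\bar u_1} - 1\big) = \big(2e^{\bar u_1} - e^{u_2(\bar x_1)}\big)\big(e^{\bar u_1} - 1\big). \end{aligned} \tag{2.11}$$

Since $2e^{\bar u_1} - e^{u_2(\bar x_1)} \ge e^{\bar u_1} > 0$, it follows that $e^{\bar u_1} - 1 \le 0$ and consequently

$$e^{u_a(x)} \le 1 \qquad \text{for any } x \in \Omega, \text{ and } a = 1, 2. \tag{2.12}$$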
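The algebra behind (2.11) can be verified on a grid of exact rationals. In the following Python sketch (an illustration with hypothetical names), `X` stands for $e^{\bar u_1}$ and `Y` for $e^{u_2(\bar x_1)}$, so that $0 < Y \le X$ by the choice $\bar u_1 \ge \bar u_2$.

```python
from fractions import Fraction
from itertools import product

def full(X, Y):
    """First line of (2.11): 4X^2 - 2Y^2 - 2X + Y - XY."""
    return 4 * X * X - 2 * Y * Y - 2 * X + Y - X * Y

def reduced(X, Y):
    """Second line of (2.11), obtained by using Y <= X."""
    return 2 * X * X - 2 * X + Y - X * Y

def factored(X, Y):
    """Factored form (2X - Y)(X - 1)."""
    return (2 * X - Y) * (X - 1)

GRID = [Fraction(n, 8) for n in range(1, 25)]           # values in (0, 3]
PAIRS = [(X, Y) for X, Y in product(GRID, GRID) if Y <= X]
```

On every admissible pair the reduced expression equals the factored one and is bounded above by the full expression, so the full expression is strictly positive whenever $X > 1$; this is incompatible with the inequality $0 \ge \dots$ in (2.11) and forces $X \le 1$.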


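We notice that (2.10) admits a variational formulation on $\mathcal{H}^1(\Omega) \times \mathcal{H}^1(\Omega)$. Here $\mathcal{H}^1(\Omega)$ denotes the space of doubly periodic functions $v \in H^1_{\mathrm{loc}}(\mathbb{R}^2)$ with periodic cell domain $\Omega$. It defines a Hilbert space equipped with the standard $H^1(\Omega)$-scalar product. We shall denote by $\|\cdot\|$ the usual norm on $\mathcal{H}^1(\Omega)$ as given by $\|v\|^2 = \|\nabla v\|_2^2 + \|v\|_2^2 = \int_\Omega |\nabla v|^2 + \int_\Omega |v|^2$. It is easy to check that (weak) solutions of (2.10) correspond to critical points in $\mathcal{H}^1(\Omega) \times \mathcal{H}^1(\Omega)$ of the (unbounded) functional

$$\begin{aligned} I_\lambda(v_1, v_2) &= \frac{1}{3}\Big(\|\nabla v_1\|_2^2 + \|\nabla v_2\|_2^2 + \int_\Omega \nabla v_1\cdot\nabla v_2\Big) + \lambda\int_\Omega W(v_1, v_2)\\ &\quad + \frac{4\pi}{3}(2N_1 + N_2)\int_\Omega v_1 + \frac{4\pi}{3}(2N_2 + N_1)\int_\Omega v_2, \qquad v_1, v_2 \in \mathcal{H}^1(\Omega), \end{aligned} \tag{2.13}$$

with $W(v_1, v_2)$ given by

$$\begin{aligned} W(v_1, v_2) &= h_1^2e^{2v_1} + h_2^2e^{2v_2} - h_1e^{v_1} - h_2e^{v_2} - h_1h_2e^{v_1 + v_2} + 1\\ &= \frac{1}{4}\big(2h_1e^{v_1} - h_2e^{v_2} - 1\big)^2 + \frac{3}{4}\big(h_2e^{v_2} - 1\big)^2\\ &= \frac{1}{4}\big(2h_2e^{v_2} - h_1e^{v_1} - 1\big)^2 + \frac{3}{4}\big(h_1e^{v_1} - 1\big)^2 \ge 0. \end{aligned} \tag{2.14}$$

To simplify notations, from now on we shall assume, without loss of generality, that $|\Omega| = 1$. Integrating (2.10) over $\Omega$, we find that any solution $(v_1, v_2)$ of (2.10) satisfies the following constraint conditions:

$$\begin{cases} 4\displaystyle\int_\Omega h_1^2e^{2v_1} - 2\int_\Omega h_2^2e^{2v_2} - 2\int_\Omega h_1e^{v_1} + \int_\Omega h_2e^{v_2} - \int_\Omega h_1h_2e^{v_1 + v_2} + \frac{4\pi N_1}{\lambda} = 0,\\[6pt] 4\displaystyle\int_\Omega h_2^2e^{2v_2} - 2\int_\Omega h_1^2e^{2v_1} - 2\int_\Omega h_2e^{v_2} + \int_\Omega h_1e^{v_1} - \int_\Omega h_1h_2e^{v_1 + v_2} + \frac{4\pi N_2}{\lambda} = 0. \end{cases} \tag{2.15}$$

Conditions (2.15) may be more clearly interpreted if we set $v_i = c_i + w_i$ with $\int_\Omega w_i = 0$ and $c_i = \int_\Omega v_i$ ($i = 1, 2$). Indeed, after some simple algebraic manipulation, from (2.15) we get a quadratic system for the variables $e^{c_i}$ ($i = 1, 2$) as follows:

$$\begin{cases} 2e^{2c_1}\displaystyle\int_\Omega h_1^2e^{2w_1} - e^{c_1}\Big(\int_\Omega h_1e^{w_1} + e^{c_2}\int_\Omega h_1h_2e^{w_1 + w_2}\Big) + \frac{4\pi}{3\lambda}(2N_1 + N_2) = 0,\\[6pt] 2e^{2c_2}\displaystyle\int_\Omega h_2^2e^{2w_2} - e^{c_2}\Big(\int_\Omega h_2e^{w_2} + e^{c_1}\int_\Omega h_1h_2e^{w_1 + w_2}\Big) + \frac{4\pi}{3\lambda}(2N_2 + N_1) = 0. \end{cases} \tag{2.16}$$

Consequently, a solution $u_i = u^i_0 + c_i + w_i$ ($i = 1, 2$) of (2.1) must satisfy:

$$\frac{\Big(\displaystyle\int_\Omega h_ie^{w_i} + e^{c_j}\int_\Omega h_1h_2e^{w_1 + w_2}\Big)^2}{\displaystyle\int_\Omega h_i^2e^{2w_i}} \ge \frac{32\pi}{3\lambda}(2N_i + N_j), \qquad i \neq j = 1, 2. \tag{2.17}$$

Thus, taking into account (2.12), by (2.17) and the Hölder inequality we obtain

$$\frac{32\pi}{3\lambda}(2N_i + N_j) \le \frac{4\Big(\displaystyle\int_\Omega h_ie^{w_i}\Big)^2}{\displaystyle\int_\Omega h_i^2e^{2w_i}} \le 4|\Omega|, \qquad i \neq j = 1, 2. \tag{2.18}$$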


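Hence, the condition

$$\lambda \ge \frac{8\pi}{3|\Omega|}\max\{2N_1 + N_2;\ 2N_2 + N_1\} \tag{2.19}$$

is necessary for the solvability of (2.1), and part (a) of Theorem 2.1 follows.

Set

$$E = \Big\{w \in \mathcal{H}^1(\Omega) : \int_\Omega w = 0\Big\}.$$

We shall see that, for each fixed pair $(w_1, w_2) \in E \times E$ satisfying

$$\Big(\int_\Omega h_ie^{w_i}\Big)^2 \ge \frac{32\pi}{3\lambda}(2N_i + N_j)\int_\Omega h_i^2e^{2w_i}, \qquad i \neq j, \quad i, j = 1, 2, \tag{2.20}$$

the system (2.16) admits four distinct solutions $(c_1, c_2)$. For this purpose, from now on, we take

$$\lambda > \frac{32\pi}{3}\max\Bigg\{(2N_1 + N_2)\frac{\int_\Omega h_1^2}{\big(\int_\Omega h_1\big)^2};\ (2N_2 + N_1)\frac{\int_\Omega h_2^2}{\big(\int_\Omega h_2\big)^2}\Bigg\} \tag{2.21}$$

and define the set

$$A_\lambda = \{(w_1, w_2) \in E \times E : w_i \text{ satisfies (2.20)},\ i = 1, 2\}. \tag{2.22}$$

Note that $(0, 0) \in A_\lambda$. For given $(w_1, w_2) \in A_\lambda$, we introduce the smooth functions $g_i^\pm : [0, +\infty) \to \mathbb{R}$ ($i = 1, 2$) defined as follows:

$$\begin{aligned} g_1^\pm(X) &\equiv \frac{\int_\Omega h_1e^{w_1} + X\int_\Omega h_1h_2e^{w_1 + w_2}}{4\int_\Omega h_1^2e^{2w_1}} \pm \frac{\sqrt{\Big(\int_\Omega h_1e^{w_1} + X\int_\Omega h_1h_2e^{w_1 + w_2}\Big)^2 - \frac{32\pi}{3\lambda}(2N_1 + N_2)\int_\Omega h_1^2e^{2w_1}}}{4\int_\Omega h_1^2e^{2w_1}},\\[6pt] g_2^\pm(X) &\equiv \frac{\int_\Omega h_2e^{w_2} + X\int_\Omega h_1h_2e^{w_1 + w_2}}{4\int_\Omega h_2^2e^{2w_2}} \pm \frac{\sqrt{\Big(\int_\Omega h_2e^{w_2} + X\int_\Omega h_1h_2e^{w_1 + w_2}\Big)^2 - \frac{32\pi}{3\lambda}(2N_2 + N_1)\int_\Omega h_2^2e^{2w_2}}}{4\int_\Omega h_2^2e^{2w_2}}, \end{aligned} \tag{2.23}$$

and set

$$\begin{aligned} F^+(X) &\equiv X - g_1^+(g_2^+(X)), & F^-(X) &\equiv X - g_1^-(g_2^-(X)),\\ F^\pm(X) &\equiv X - g_1^+(g_2^-(X)), & F^\mp(X) &\equiv X - g_1^-(g_2^+(X)). \end{aligned}$$

It is easy to check that solutions of (2.16) correspond to the zeroes of the smooth functions $F^* : [0, +\infty) \to \mathbb{R}$, with $* = +, -, \pm, \mp$.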
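The following numerical sketch illustrates how the four branches arise. It is not the authors' method, only a bisection on each $F^*$, and every number in it is a hypothetical sample: the integrals $\int_\Omega h_i^2e^{2w_i}$, $\int_\Omega h_ie^{w_i}$ and $\int_\Omega h_1h_2e^{w_1 + w_2}$ are replaced by the constants `A1, A2`, `B1, B2` and `M`, chosen so that (2.20) holds for $\lambda = 200$ and $(N_1, N_2) = (1, 0)$, with $|\Omega| = 1$.

```python
import math

def make_g(A, b, M, C):
    """Return g(X, sign): the two roots Y of 2*A*Y**2 - (b + X*M)*Y + C = 0."""
    def g(X, sign):
        B = b + X * M
        return (B + sign * math.sqrt(B * B - 8.0 * A * C)) / (4.0 * A)
    return g

def solve_branch(g1, g2, s1, s2, iterations=200):
    """Bisection for the zero of F(X) = X - g1(g2(X, s2), s1) on (0, +inf)."""
    def F(X):
        return X - g1(g2(X, s2), s1)
    lo, hi = 0.0, 1.0
    while F(hi) <= 0.0:          # F(0) < 0 and F(X) -> +inf, so this stops
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if F(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    X1 = 0.5 * (lo + hi)         # plays the role of exp(c_1)
    return X1, g2(X1, s2)        # (exp(c_1), exp(c_2))

# Hypothetical sample data (assumptions, not values from the paper).
LAM, N1, N2 = 200.0, 1, 0
A1, A2, B1, B2, M = 1.2, 1.2, 1.0, 1.0, 1.0
C1 = 4.0 * math.pi * (2 * N1 + N2) / (3.0 * LAM)
C2 = 4.0 * math.pi * (2 * N2 + N1) / (3.0 * LAM)

g1 = make_g(A1, B1, M, C1)
g2 = make_g(A2, B2, M, C2)
BRANCHES = {"+": (1, 1), "-": (-1, -1), "+-": (1, -1), "-+": (-1, 1)}
solutions = {name: solve_branch(g1, g2, s1, s2)
             for name, (s1, s2) in BRANCHES.items()}

def residuals(X1, X2):
    """Left-hand sides of the two equations of the quadratic system (2.16)."""
    return (2 * A1 * X1**2 - X1 * (B1 + X2 * M) + C1,
            2 * A2 * X2**2 - X2 * (B2 + X1 * M) + C2)
```

Each entry of `solutions` is a pair $(e^{c_1}, e^{c_2})$; the keys `"+-"` and `"-+"` stand for the branches $\pm$ and $\mp$. For this sample data the four pairs are distinct and each one annihilates both equations of (2.16) up to rounding.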


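Notice that $g_i^\pm(X) > 0$ for any $X \ge 0$ ($i = 1, 2$) and so $F^*(0) < 0$ ($* = +, -, \pm, \mp$). Moreover,

$$\lim_{X \to +\infty} g_i^-(X) = 0 \qquad (i = 1, 2), \tag{2.24}$$

while

$$\frac{g_i^+(X)}{X} = \frac{\int_\Omega h_1h_2e^{w_1 + w_2}}{2\int_\Omega h_i^2e^{2w_i}} + o(1), \qquad \text{as } X \to +\infty \ (i = 1, 2). \tag{2.25}$$

Therefore,

$$\frac{F^+(X)}{X} = 1 - \frac{\Big(\int_\Omega h_1h_2e^{w_1 + w_2}\Big)^2}{4\int_\Omega h_1^2e^{2w_1}\int_\Omega h_2^2e^{2w_2}} + o(1), \qquad \text{as } X \to +\infty,$$

where the limit coefficient is at least $\frac{3}{4}$ by the Cauchy–Schwarz inequality. For the remaining $F^*$ ($* = -, \pm, \mp$) we have

$$\frac{F^*(X)}{X} = 1 + o(1), \qquad \text{as } X \to +\infty.$$

Therefore, for any choice $* = +, -, \pm, \mp$, it follows that

$$\lim_{X \to +\infty} F^*(X) = +\infty,$$

and hence, by continuity, $F^*(X^*) = 0$ for some $X^* > 0$ and $* = +, -, \pm, \mp$. Furthermore, for $i, j = 1, 2$, $i \neq j$,

$$\frac{dg_i^\pm}{dX}(X) = \pm\, g_i^\pm(X)\,\frac{\int_\Omega h_1h_2e^{w_1 + w_2}}{\sqrt{\Big(\int_\Omega h_ie^{w_i} + X\int_\Omega h_1h_2e^{w_1 + w_2}\Big)^2 - \frac{32\pi}{3\lambda}(2N_i + N_j)\int_\Omega h_i^2e^{2w_i}}}, \tag{2.26}$$

and so $F^\pm$ and $F^\mp$ are strictly increasing. Moreover, for $(w_1, w_2) \in A_\lambda$ and $X > 0$, we have

$$\begin{aligned} \frac{dF^+}{dX}(X) &= 1 - \frac{g_1^+(g_2^+(X))\int_\Omega h_1h_2e^{w_1 + w_2}}{\sqrt{\Big(\int_\Omega h_1e^{w_1} + g_2^+(X)\int_\Omega h_1h_2e^{w_1 + w_2}\Big)^2 - \frac{32\pi}{3\lambda}(2N_1 + N_2)\int_\Omega h_1^2e^{2w_1}}}\\ &\qquad\cdot \frac{g_2^+(X)\int_\Omega h_1h_2e^{w_1 + w_2}}{\sqrt{\Big(\int_\Omega h_2e^{w_2} + X\int_\Omega h_1h_2e^{w_1 + w_2}\Big)^2 - \frac{32\pi}{3\lambda}(2N_2 + N_1)\int_\Omega h_2^2e^{2w_2}}}\\ &> 1 - \frac{g_1^+(g_2^+(X))}{X} = \frac{F^+(X)}{X}, \end{aligned}$$

and, analogously,

$$\frac{dF^-}{dX}(X) > 1 - \frac{g_1^-(g_2^-(X))}{X} = \frac{F^-(X)}{X}.$$

So, $\dfrac{F^*(X)}{X}$ is strictly increasing, for $* = +, -$ and $X > 0$.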


In conclusion, for any $(w_1, w_2) \in A_\lambda$ and $* = +, -, \pm, \mp$ there exists a unique $X^* > 0$ such that $F^*(X^*) = 0$. Set $e^{c_1^*(w_1, w_2)} = X^*$, and observe that, by the strict monotonicity of $g_i^+$ and $g_i^-$, $i = 1, 2$ (see (2.26)), there exists a unique $c_i^* = c_i^*(w_1, w_2)$ ($i = 1, 2$; $* = +, -, \pm, \mp$) satisfying

$$\begin{aligned} e^{c_1^+} &= g_1^+(e^{c_2^+}), & e^{c_2^+} &= g_2^+(e^{c_1^+});\\ e^{c_1^\pm} &= g_1^+(e^{c_2^\pm}), & e^{c_2^\pm} &= g_2^-(e^{c_1^\pm});\\ e^{c_1^\mp} &= g_1^-(e^{c_2^\mp}), & e^{c_2^\mp} &= g_2^+(e^{c_1^\mp});\\ e^{c_1^-} &= g_1^-(e^{c_2^-}), & e^{c_2^-} &= g_2^-(e^{c_1^-}). \end{aligned} \tag{2.27}$$

Consequently, for given $(w_1, w_2) \in A_\lambda$, setting $v_i^* = w_i + c_i^*(w_1, w_2)$, $i = 1, 2$, $* = +, -, \pm, \mp$, we have that the pair $(v_1^*, v_2^*)$ satisfies (2.15).

Our goal will be to derive solutions $(v_1, v_2)$ of (2.10) which decompose as $v_i = w_i + c_i$ with $\int_\Omega w_i = 0$, $c_i = \int_\Omega v_i$ and $c_i = c_i^*(w_1, w_2)$ ($i = 1, 2$), with prescribed $*$ coinciding with either $+$, $-$, $\pm$ or $\mp$. This will yield solutions of (2.10) with specific asymptotic behavior as $\lambda \to +\infty$. Note that, by the complete symmetry of the problem, the cases $* = \pm$ and $* = \mp$ are similar in nature. So, we shall limit our attention to the case $* = \pm$ with the understanding that, by exchanging the roles of the indices, analogous considerations hold also when $* = \mp$. We start with the following:

Lemma 2.3. For every $(w_1, w_2) \in A_\lambda$ we have

(i) $e^{c_i^+}\displaystyle\int_\Omega h_ie^{w_i} \le 1$ for $i = 1, 2$;

(ii) $e^{c_i^-}\displaystyle\int_\Omega h_ie^{w_i} \le \frac{8\pi}{\lambda}(2N_i + N_j)$ for $i = 1, 2$, $i \neq j$;

(iii) $e^{c_1^\pm}\displaystyle\int_\Omega h_1e^{w_1} \le \frac{1}{2} + \frac{16\pi}{3\lambda}(2N_2 + N_1)$ and $e^{c_2^\pm}\displaystyle\int_\Omega h_2e^{w_2} \le \frac{8\pi}{\lambda}(2N_2 + N_1)$.

Proof. (i) In view of (2.23), from (2.27) for $i, j = 1, 2$, $i \neq j$, we have

$$e^{c_i^+} \le \frac{\int_\Omega h_ie^{w_i} + e^{c_j^+}\int_\Omega h_1h_2e^{w_1 + w_2}}{2\int_\Omega h_i^2e^{2w_i}}.$$

Iterating such an inequality, by means of the Hölder inequality, we get

$$\begin{aligned} e^{c_1^+} &\le \frac{\int_\Omega h_1e^{w_1}}{2\int_\Omega h_1^2e^{2w_1}} + \frac{\int_\Omega h_1h_2e^{w_1 + w_2}}{4\int_\Omega h_1^2e^{2w_1}\int_\Omega h_2^2e^{2w_2}}\Big(\int_\Omega h_2e^{w_2} + e^{c_1^+}\int_\Omega h_1h_2e^{w_1 + w_2}\Big)\\ &\le \frac{\int_\Omega h_1e^{w_1}}{2\int_\Omega h_1^2e^{2w_1}} + \frac{\int_\Omega h_1h_2e^{w_1 + w_2}\int_\Omega h_2e^{w_2}}{4\int_\Omega h_1^2e^{2w_1}\int_\Omega h_2^2e^{2w_2}} + \frac{1}{4}e^{c_1^+}. \end{aligned}$$

By symmetry, an analogous estimate holds for $e^{c_2^+}$. Hence,

$$e^{c_i^+}\int_\Omega h_ie^{w_i} \le \frac{4}{3}\Bigg(\frac{\big(\int_\Omega h_ie^{w_i}\big)^2}{2\int_\Omega h_i^2e^{2w_i}} + \frac{\int_\Omega h_1h_2e^{w_1 + w_2}\int_\Omega h_1e^{w_1}\int_\Omega h_2e^{w_2}}{4\int_\Omega h_1^2e^{2w_1}\int_\Omega h_2^2e^{2w_2}}\Bigg), \qquad i = 1, 2, \tag{2.28}$$

and, using the Hölder inequality, we derive the desired estimate.


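To obtain (ii), we use again (2.27) together with (2.23). Thus, for $i = 1, 2$, $i \neq j$, we have

$$e^{c_i^-} \le \frac{8\pi(2N_i + N_j)}{3\lambda}\,\frac{1}{\int_\Omega h_ie^{w_i} + e^{c_j^-}\int_\Omega h_1h_2e^{w_1 + w_2}} \tag{2.29}$$

and (ii) easily follows.

As above, to obtain (iii), note that from (2.27) and (2.23) we have

$$e^{c_2^\pm} \le \frac{8\pi(2N_2 + N_1)}{3\lambda}\,\frac{1}{\int_\Omega h_2e^{w_2} + e^{c_1^\pm}\int_\Omega h_1h_2e^{w_1 + w_2}}, \tag{2.30}$$

and so,

$$e^{c_2^\pm} \le \frac{8\pi(2N_2 + N_1)}{\lambda\int_\Omega h_2e^{w_2}}. \tag{2.31}$$

On the other hand,

$$e^{c_1^\pm} \le \frac{\int_\Omega h_1e^{w_1} + e^{c_2^\pm}\int_\Omega h_1h_2e^{w_1 + w_2}}{2\int_\Omega h_1^2e^{2w_1}}, \tag{2.32}$$

while from (2.30) we also have

$$e^{c_2^\pm} \le \frac{8\pi(2N_2 + N_1)}{3\lambda}\,\frac{1}{e^{c_1^\pm}\int_\Omega h_1h_2e^{w_1 + w_2}}. \tag{2.33}$$

Furthermore, from (2.27) and (2.23) it is easy to check that

$$e^{c_1^\pm} \ge \frac{\int_\Omega h_1e^{w_1}}{4\int_\Omega h_1^2e^{2w_1}}. \tag{2.34}$$

Combining (2.33) with (2.34), from (2.32) we get

$$\begin{aligned} e^{c_1^\pm}\int_\Omega h_1e^{w_1} &\le \frac{\big(\int_\Omega h_1e^{w_1}\big)^2}{2\int_\Omega h_1^2e^{2w_1}} + \frac{e^{c_2^\pm}\int_\Omega h_1h_2e^{w_1 + w_2}\int_\Omega h_1e^{w_1}}{2\int_\Omega h_1^2e^{2w_1}}\\ &\le \frac{\big(\int_\Omega h_1e^{w_1}\big)^2}{2\int_\Omega h_1^2e^{2w_1}} + \frac{4\pi(2N_2 + N_1)\int_\Omega h_1e^{w_1}}{3\lambda\,e^{c_1^\pm}\int_\Omega h_1^2e^{2w_1}}\\ &\le \frac{\big(\int_\Omega h_1e^{w_1}\big)^2}{2\int_\Omega h_1^2e^{2w_1}} + \frac{16\pi(2N_2 + N_1)}{3\lambda}, \end{aligned} \tag{2.35}$$

and the desired estimate follows by the Hölder inequality.

Remark 2.4. Note that, by Jensen’s inequality, $\int_\Omega h_ie^{w_i} \ge 1$ ($i = 1, 2$), which combined with the estimates in Lemma 2.3 gives in particular that

$$e^{c_i^+} \le 1, \quad e^{c_i^-} \le O\Big(\frac{1}{\lambda}\Big), \quad i = 1, 2; \qquad e^{c_1^\pm} \le \frac{1}{2} + O\Big(\frac{1}{\lambda}\Big) \quad \text{and} \quad e^{c_2^\pm} \le O\Big(\frac{1}{\lambda}\Big). \tag{2.36}$$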


This suggests that solutions of (2.10) with the prescribed asymptotic behavior of type I, II and III (see the Introduction) should correspond to those with mean values given by $c_i^+$, $c_i^-$ and $c_i^\pm$, respectively. With this aim, we consider the functionals $J_\lambda^+$, $J_\lambda^-$, $J_\lambda^\pm$ and $J_\lambda^\mp$ defined on $A_\lambda$ and obtained by inserting the constraint (2.27) into $I_\lambda$. More precisely, for $(w_1, w_2) \in A_\lambda$, define

$$J_\lambda^*(w_1, w_2) = I_\lambda\big(w_1 + c_1^*(w_1, w_2),\ w_2 + c_2^*(w_1, w_2)\big), \qquad \text{with } * = +, -, \pm, \mp.$$

Using (2.16) we find:

$$\begin{aligned} J_\lambda^*(w_1, w_2) &= \frac{1}{3}\Big(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2 + \int_\Omega \nabla w_1\cdot\nabla w_2\Big)\\ &\quad + \frac{\lambda}{2}\int_\Omega\big(1 - e^{c_1^*}h_1e^{w_1}\big) + \frac{\lambda}{2}\int_\Omega\big(1 - e^{c_2^*}h_2e^{w_2}\big) - 2\pi(N_1 + N_2)\\ &\quad + \frac{4\pi}{3}(2N_1 + N_2)\,c_1^* + \frac{4\pi}{3}(2N_2 + N_1)\,c_2^*, \end{aligned} \tag{2.37}$$

with $* = +, -, \pm$ and $\mp$.

Remark 2.5. It is easy to check that $J_\lambda^*$ is Fréchet differentiable in the interior of $A_\lambda$. Moreover, if $(w_1, w_2)$ (in the interior of $A_\lambda$) is a critical point of $J_\lambda^*$, then $(w_1 + c_1^*, w_2 + c_2^*)$ defines a critical point of $I_\lambda$.

Concerning the existence of critical points for $J_\lambda^*$, we will show that such a functional is bounded below and attains its infimum on $\mathcal{A}_\lambda$. However, only when $* = +$ can we prove that the corresponding minimum point belongs to the interior of $\mathcal{A}_\lambda$ for $\lambda > 0$ large (see Proposition 3.4 below). In the other cases $* = -, \pm$ and $\mp$, we can prove that this occurs only when $N_1 + N_2 = 1$ (see Sect. 5). It is an interesting open question to know what happens in the remaining cases.

3. Multivortex Solutions of the I Type

In this section we analyze the functional $J_\lambda^+$ and the corresponding minimization problem in $\mathcal{A}_\lambda$. We start with a preliminary lemma already derived in [27]:

Lemma 3.1. If $(w_1, w_2) \in \mathcal{A}_\lambda$, then $\forall \tau \in (0, 1]$ we have
\[
\int_\Omega h_i e^{w_i} \le \left(\frac{3\lambda}{32\pi (2N_i + N_j)}\right)^{\frac{1-\tau}{\tau}} \left(\int_\Omega h_i^\tau e^{\tau w_i}\right)^{\frac{1}{\tau}}, \qquad i, j = 1, 2,\ i \ne j. \tag{3.1}
\]

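Proof. Let $\tau \in (0, 1)$ and let $a = \frac{1}{2 - \tau}$, so that $\tau a + 2(1 - a) = 1$. By the interpolation inequality we have
\[
\int_\Omega h_i e^{w_i} \le \left(\int_\Omega h_i^\tau e^{\tau w_i}\right)^{a} \left(\int_\Omega h_i^2 e^{2 w_i}\right)^{1-a}
\le \left(\int_\Omega h_i^\tau e^{\tau w_i}\right)^{a} \left(\frac{3\lambda}{32\pi (2N_i + N_j)}\right)^{1-a} \left(\int_\Omega h_i e^{w_i}\right)^{2(1-a)}, \qquad i \ne j.
\]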

Vortex Condensates for SU(3) Chern–Simons Theory


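Consequently, for $i \ne j$, $i, j = 1, 2$,
\[
\left(\int_\Omega h_i e^{w_i}\right)^{2a-1} \le \left(\frac{3\lambda}{32\pi (2N_i + N_j)}\right)^{1-a} \left(\int_\Omega h_i^\tau e^{\tau w_i}\right)^{a},
\]
that is,
\[
\int_\Omega h_i e^{w_i} \le \left(\frac{3\lambda}{32\pi (2N_i + N_j)}\right)^{\frac{1-\tau}{\tau}} \left(\int_\Omega h_i^\tau e^{\tau w_i}\right)^{\frac{1}{\tau}}.
\]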
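The two ingredients of the proof of Lemma 3.1 can be sanity-checked numerically. The following is only a minimal sketch: it replaces the integral over $\Omega$ (normalized so that $|\Omega| = 1$) by an average over a finite sample, and the function names `interpolation_gap` and `exponents` are illustrative, not taken from the paper.

```python
import math
import random


def interpolation_gap(values, tau):
    """Return rhs - lhs for the discrete analogue of the interpolation inequality
    mean(f) <= mean(f**tau)**a * mean(f**2)**(1 - a), with a = 1 / (2 - tau)."""
    a = 1.0 / (2.0 - tau)
    n = len(values)
    mean_f = sum(values) / n
    mean_f_tau = sum(v ** tau for v in values) / n
    mean_f_sq = sum(v * v for v in values) / n
    return mean_f_tau ** a * mean_f_sq ** (1.0 - a) - mean_f


def exponents(tau):
    """Exponents obtained after dividing by the (2a - 1)-th power, as in the proof."""
    a = 1.0 / (2.0 - tau)
    return (1.0 - a) / (2.0 * a - 1.0), a / (2.0 * a - 1.0)


random.seed(0)
# Positive sample values standing in for h_i * exp(w_i) on a grid (an assumption).
samples = [math.exp(random.gauss(0.0, 1.0)) for _ in range(1000)]
for tau in (0.1, 0.5, 0.9):
    assert interpolation_gap(samples, tau) >= -1e-12
    p, q = exponents(tau)
    # The exponents reduce to (1 - tau) / tau and 1 / tau, as stated in (3.1).
    assert abs(p - (1.0 - tau) / tau) < 1e-9
    assert abs(q - 1.0 / tau) < 1e-9
```

The check uses Hölder's inequality with conjugate exponents $1/a$ and $1/(1-a)$ applied to $f = f^{\tau a} f^{2(1-a)}$, which is the step the proof calls the interpolation inequality.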

In view of Lemma 3.1 we derive the coerciveness of Jλ+ as follows:

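Proposition 3.2. For $\lambda > 0$ sufficiently large, there exist constants $\alpha, C > 0$ (independent of $\lambda$) such that
\[
J_\lambda^+(w_1, w_2) \ge \alpha\bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2\bigr) - C(\ln \lambda + 1) \tag{3.2}
\]
for all $(w_1, w_2) \in \mathcal{A}_\lambda$. Moreover, $J_\lambda^+$ attains its infimum on $\mathcal{A}_\lambda$.

Proof. From (2.23) and (2.27) it follows immediately that
\[
e^{c_i^+} \ge \frac{\int_\Omega h_i e^{w_i}}{4 \int_\Omega h_i^2 e^{2 w_i}}, \qquad i = 1, 2.
\]
Hence, from (2.20) we find a suitable constant $C > 0$, independent of $\lambda$, such that for every $(w_1, w_2) \in \mathcal{A}_\lambda$ we have
\[
c_i^+(w_1, w_2) \ge -\ln \lambda - \ln \int_\Omega h_i e^{w_i} - C, \qquad i = 1, 2. \tag{3.3}
\]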

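Whence, using Lemma 3.1, for fixed $\tau \in (0, 1)$ and any $(w_1, w_2) \in \mathcal{A}_\lambda$, we obtain:
\[
\begin{aligned}
J_\lambda^+(w_1, w_2) &\ge \frac16 \sum_{i=1}^{2} \|\nabla w_i\|_2^2 - \frac{4\pi}{3} \sum_{i \ne j = 1,2} (2N_i + N_j) \ln \int_\Omega h_i e^{w_i} - \frac{4\pi}{3} \sum_{i \ne j = 1,2} (2N_i + N_j) \ln \lambda - C \\
&\ge \frac16 \sum_{i=1}^{2} \|\nabla w_i\|_2^2 - \frac{4\pi}{3} \sum_{i \ne j = 1,2} (2N_i + N_j) \ln\!\left[\left(\frac{3\lambda}{32\pi (2N_i + N_j)}\right)^{\frac{1-\tau}{\tau}} \left(\int_\Omega e^{\tau (w_i + u_0^i)}\right)^{\frac{1}{\tau}}\right] \\
&\qquad - \frac{4\pi}{3} \sum_{i \ne j = 1,2} (2N_i + N_j) \ln \lambda - C \\
&\ge \frac16 \sum_{i=1}^{2} \|\nabla w_i\|_2^2 - \frac{4\pi}{3\tau} \sum_{i \ne j = 1,2} (2N_i + N_j) \Bigl(\ln \int_\Omega e^{\tau w_i} + \max_\Omega u_0^i\Bigr) - \frac{4\pi}{3}\,\frac{1}{\tau} \sum_{i \ne j = 1,2} (2N_i + N_j) \ln \lambda - C,
\end{aligned}
\]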


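for some constant $C > 0$ independent of $\lambda$. Recall that, by the Moser–Trudinger inequality [3] (see [16, 11] and [27] for alternative proofs), we have
\[
\int_\Omega e^{w} \le C \exp\!\left(\frac{1}{16\pi} \|\nabla w\|_2^2\right), \qquad \forall w \in E, \tag{3.4}
\]
with $C$ a positive constant depending only on $\Omega$. Thus, for any $(w_1, w_2) \in \mathcal{A}_\lambda$, we obtain
\[
J_\lambda^+(w_1, w_2) \ge \frac16 \sum_{i \ne j = 1,2} \left(1 - \frac{\tau (2N_i + N_j)}{2}\right) \|\nabla w_i\|_2^2 - \frac{4\pi}{3} \sum_{i \ne j = 1,2} \frac{2N_i + N_j}{\tau} \ln \lambda - C_\tau, \tag{3.5}
\]
with $C_\tau > 0$ a suitable constant independent of $\lambda$. Hence, it suffices to take $0 < \tau < \dfrac{2}{\max_{i \ne j = 1,2} (2N_i + N_j)}$ in (3.5) to derive (3.2) and conclude that $J_\lambda^+$ is coercive on $\mathcal{A}_\lambda$. Since $J_\lambda^+$ is weakly lower semicontinuous on the weakly closed set $\mathcal{A}_\lambda$, we immediately conclude that the infimum of $J_\lambda^+$ is attained on $\mathcal{A}_\lambda$.

Our next goal is to prove that, for $\lambda$ sufficiently large, such a minimum point lies in the interior of $\mathcal{A}_\lambda$. To this purpose we will estimate the functional $J_\lambda^+$ on the boundary $\partial \mathcal{A}_\lambda$ of $\mathcal{A}_\lambda$.

Lemma 3.3. For $\lambda > 0$ sufficiently large,
\[
\inf_{(w_1, w_2) \in \partial \mathcal{A}_\lambda} J_\lambda^+(w_1, w_2) \ge \frac{\lambda}{2} - C\bigl(\sqrt{\lambda} + \ln \lambda + 1\bigr), \tag{3.6}
\]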

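with $C > 0$ a suitable constant independent of $\lambda$.

Proof. For $(w_1, w_2) \in \partial \mathcal{A}_\lambda$, we have that the identity
\[
\left(\int_\Omega h_i e^{w_i}\right)^{2} = \frac{32\pi (2N_i + N_j)}{3\lambda} \int_\Omega h_i^2 e^{2 w_i}, \qquad i \ne j, \tag{3.7}
\]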

necessarily holds for i = 1 or 2. Without loss of generality assume that (3.7) holds for i = 2. Using (3.7) into (2.28), by means of H¨older’s inequality, we find w2 )2 w1 w2 ( h e h e h e + 4 2 1 2 ( ( ec2 h2 ew2 ≤ + ( 1 1 3 2 ( h22 e2w2 ( 4( ( h21 e2w1 ) 2 ( ( h22 e2w2 ) 2 (3.8) 1 1 ≤ C( + √ ), λ λ for some constant C > 0 independent of λ.


Hence, for $\lambda > 0$ sufficiently large, we have
\[
e^{c_2^+} \int_\Omega h_2 e^{w_2} \le \frac{2C}{\sqrt{\lambda}}. \tag{3.9}
\]

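Using (2.37) with $* = +$, the arguments of Proposition 3.2 with $\tau = \dfrac{2}{\max_{i \ne j = 1,2} (2N_i + N_j)}$, Lemma 2.3-(i) and (3.9), we get
\[
\begin{aligned}
J_\lambda^+(w_1, w_2) &\ge \frac{\lambda}{2} \int_\Omega \bigl(1 - e^{c_1^+} h_1 e^{w_1}\bigr) + \frac{\lambda}{2} \int_\Omega \bigl(1 - e^{c_2^+} h_2 e^{w_2}\bigr) - \frac{2\pi}{3} \Bigl(\max_{i, j = 1,2;\ i \ne j} (2N_i + N_j)\Bigr)^{2} \ln \lambda - C \\
&\ge \frac{\lambda}{2} - C\bigl(\sqrt{\lambda} + \ln \lambda + 1\bigr),
\end{aligned}
\]
for any $(w_1, w_2) \in \partial \mathcal{A}_\lambda$, with $C > 0$ a suitable constant independent of $\lambda$.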

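In order to find suitable test functions in the interior of $\mathcal{A}_\lambda$, where the reverse estimate in (3.6) holds, we recall here some results obtained in [34] concerning the Abelian Chern–Simons–Higgs equation. In [34] (Proposition 3.1) it is proved that for $\mu > 0$ sufficiently large there exist $\bar v_\mu^i = \bar c_\mu^i + \bar w_\mu^i$, with $\bar c_\mu^i = \int_\Omega \bar v_\mu^i$ and $\int_\Omega \bar w_\mu^i = 0$ ($i = 1, 2$), solutions of
\[
\Delta v = \mu e^{v + u_0^i}\bigl(e^{v + u_0^i} - 1\bigr) + 4\pi N_i, \qquad v \in H^1(\Omega), \tag{3.10}
\]
such that $e^{u_0^i + \bar v_\mu^i} < 1$ in $\Omega$, $\bar c_\mu^i \to 0$ and $\bar w_\mu^i \to -u_0^i$ pointwise a.e., as $\mu \to +\infty$ ($i = 1, 2$). Since $h_i = e^{u_0^i} \in L^\infty(\Omega)$, by dominated convergence we have that $h_i e^{\bar w_\mu^i} \to 1$ strongly in $L^p(\Omega)$ for any $p \ge 1$. In particular,
\[
\int_\Omega h_1 h_2 e^{\bar w_\mu^1 + \bar w_\mu^2} \to 1, \qquad \text{as } \mu \to +\infty. \tag{3.11}
\]
Hence, for fixed $\lambda_0 > 0$ large and $\varepsilon \in (0, 1)$, we can find $\mu_\varepsilon > 0$ sufficiently large to ensure that, setting $\bar w_{i,\varepsilon} = \bar w_{\mu_\varepsilon}^i$ ($i = 1, 2$), we have $(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon}) \in \mathcal{A}_\lambda$ for every $\lambda \ge \lambda_0$ and
\[
\max_{j = 1,2} \frac{2 \int_\Omega h_j^2 e^{2 \bar w_{j,\varepsilon}} + 1}{4 \int_\Omega h_1^2 e^{2 \bar w_{1,\varepsilon}} \int_\Omega h_2^2 e^{2 \bar w_{2,\varepsilon}} - 1} > 1 - \varepsilon. \tag{3.12}
\]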


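Recalling that, by Jensen's inequality, $\int_\Omega h_1 h_2 e^{\bar w_{1,\varepsilon} + \bar w_{2,\varepsilon}} \ge 1$ and $\int_\Omega h_j^2 e^{2 \bar w_{j,\varepsilon}} \ge \bigl(\int_\Omega h_j e^{\bar w_{j,\varepsilon}}\bigr)^2 \ge 1$, $j = 1, 2$, by means of (2.27) and (2.23) we obtain
\[
\begin{aligned}
e^{c_i^+(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon})} &\ge \frac{\int_\Omega h_i e^{\bar w_{i,\varepsilon}} + e^{c_j^+(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon})} \int_\Omega h_1 h_2 e^{\bar w_{1,\varepsilon} + \bar w_{2,\varepsilon}}}{4 \int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}}
\left(1 + \sqrt{1 - \frac{32\pi}{3\lambda} (2N_i + N_j) \frac{\int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}}{\bigl(\int_\Omega h_i e^{\bar w_{i,\varepsilon}}\bigr)^2}}\,\right) \\
&\ge \frac{1 + e^{c_j^+(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon})}}{2 \int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}} + \frac{\int_\Omega h_i e^{\bar w_{i,\varepsilon}}}{4 \int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}}
\left(\sqrt{1 - \frac{32\pi}{3\lambda} (2N_i + N_j) \frac{\int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}}{\bigl(\int_\Omega h_i e^{\bar w_{i,\varepsilon}}\bigr)^2}} - 1\right) \\
&\ge \frac{1 + e^{c_j^+(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon})}}{2 \int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}} - \frac{8\pi (2N_i + N_j)}{3\lambda},
\end{aligned} \tag{3.13}
\]
for $i, j = 1, 2$ and $i \ne j$.

Thus, setting $c_{i,\varepsilon}^+ = c_i^+(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon})$, $i = 1, 2$, an iteration of the estimate above yields
\[
\begin{aligned}
e^{c_{i,\varepsilon}^+} &\ge \frac{1}{2 \int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}} + \frac{1 + e^{c_{i,\varepsilon}^+}}{4 \int_\Omega h_1^2 e^{2 \bar w_{1,\varepsilon}} \int_\Omega h_2^2 e^{2 \bar w_{2,\varepsilon}}}
- \frac{1}{2 \int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}}\,\frac{8\pi (2N_j + N_i)}{3\lambda} - \frac{8\pi (2N_i + N_j)}{3\lambda} \\
&\ge \frac{2 \int_\Omega h_j^2 e^{2 \bar w_{j,\varepsilon}} + 1}{4 \int_\Omega h_1^2 e^{2 \bar w_{1,\varepsilon}} \int_\Omega h_2^2 e^{2 \bar w_{2,\varepsilon}}}
+ \frac{e^{c_{i,\varepsilon}^+}}{4 \int_\Omega h_1^2 e^{2 \bar w_{1,\varepsilon}} \int_\Omega h_2^2 e^{2 \bar w_{2,\varepsilon}}} - \frac{4\pi (4N_j + 5N_i)}{3\lambda},
\end{aligned}
\]
for $i \ne j = 1, 2$. Consequently, for $i \ne j = 1, 2$,
\[
e^{c_{i,\varepsilon}^+} \ge \frac{2 \int_\Omega h_j^2 e^{2 \bar w_{j,\varepsilon}} + 1}{4 \int_\Omega h_1^2 e^{2 \bar w_{1,\varepsilon}} \int_\Omega h_2^2 e^{2 \bar w_{2,\varepsilon}} - 1} - \frac{4\pi (4N_j + 5N_i)}{9\lambda}.
\]
In view of (3.12), we conclude that
\[
e^{c_{i,\varepsilon}^+} \ge 1 - \varepsilon - \frac{4\pi}{\lambda} \max_{j = 1,2} N_j, \qquad i = 1, 2,
\]
which gives
\[
\int_\Omega \bigl(1 - e^{c_{i,\varepsilon}^+} h_i e^{\bar w_{i,\varepsilon}}\bigr) \le \varepsilon + \frac{4\pi}{\lambda} \max_{j = 1,2} N_j, \qquad i = 1, 2, \tag{3.14}
\]
for all $\lambda \ge \lambda_0$.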
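The iteration of (3.13) amounts to solving a pair of coupled linear inequalities. The following worked sketch uses shorthand that does not appear in the original: $x_i = e^{c_{i,\varepsilon}^+}$, $a_i = \bigl(2 \int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}}\bigr)^{-1}$ and $b_i = 8\pi (2N_i + N_j)/(3\lambda)$, so that the last line of (3.13) reads $x_i \ge a_i (1 + x_j) - b_i$.

```latex
\begin{align*}
x_i &\ge a_i (1 + x_j) - b_i \\
    &\ge a_i \bigl(1 + a_j (1 + x_i) - b_j\bigr) - b_i \\
    &= a_i + a_i a_j (1 + x_i) - a_i b_j - b_i .
\end{align*}
% Since \int_\Omega h_i^2 e^{2 \bar w_{i,\varepsilon}} \ge 1, we have a_i \le 1/2, hence
\[
a_i b_j + b_i \le \tfrac12 b_j + b_i = \frac{4\pi (4 N_j + 5 N_i)}{3\lambda}.
\]
% Moving the x_i term to the left (a_i a_j \le 1/4 < 1) and dividing by 1 - a_i a_j:
\[
x_i \ge \frac{a_i + a_i a_j}{1 - a_i a_j} - \frac{a_i b_j + b_i}{1 - a_i a_j},
\qquad
\frac{a_i + a_i a_j}{1 - a_i a_j}
= \frac{2 \int_\Omega h_j^2 e^{2 \bar w_{j,\varepsilon}} + 1}
       {4 \int_\Omega h_1^2 e^{2 \bar w_{1,\varepsilon}} \int_\Omega h_2^2 e^{2 \bar w_{2,\varepsilon}} - 1}.
\]
```

The leading fraction is exactly the quantity controlled by (3.12), while the remaining term is of order $1/\lambda$.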


Now we are ready to prove:

Proposition 3.4. For $\lambda > 0$ sufficiently large,
\[
\inf_{(w_1, w_2) \in \partial \mathcal{A}_\lambda} J_\lambda^+(w_1, w_2) > \inf_{(w_1, w_2) \in \mathcal{A}_\lambda} J_\lambda^+(w_1, w_2). \tag{3.15}
\]

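Proof. Fix $\varepsilon \in (0, \tfrac12)$ and consider $(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon})$ satisfying (3.14) for $\lambda \ge \lambda_0$. Since $c_i^+ \le 0$, $i = 1, 2$ (see Remark 2.4), by (2.37) (with $* = +$) and (3.14), we have:
\[
J_\lambda^+(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon}) \le \|\nabla \bar w_{1,\varepsilon}\|_2^2 + \|\nabla \bar w_{2,\varepsilon}\|_2^2 + \varepsilon \lambda \le C_\varepsilon + \varepsilon \lambda, \tag{3.16}
\]
with $C_\varepsilon > 0$ a suitable constant depending on $\varepsilon$ only. Comparing with Lemma 3.3, it follows that
\[
\begin{aligned}
\inf_{(w_1, w_2) \in \partial \mathcal{A}_\lambda} J_\lambda^+(w_1, w_2) - \inf_{(w_1, w_2) \in \mathcal{A}_\lambda} J_\lambda^+(w_1, w_2)
&\ge \inf_{(w_1, w_2) \in \partial \mathcal{A}_\lambda} J_\lambda^+(w_1, w_2) - J_\lambda^+(\bar w_{1,\varepsilon}, \bar w_{2,\varepsilon}) \\
&\ge \lambda \bigl(\tfrac12 - \varepsilon\bigr) - C\bigl(\sqrt{\lambda} + \ln \lambda\bigr) - C_\varepsilon \to +\infty,
\end{aligned}
\]
as $\lambda \to +\infty$, and the proposition is proved.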

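From Propositions 3.2 and 3.4 we may conclude that, for $\lambda > 0$ sufficiently large and $(N_1, N_2) \in \mathbb{N} \times \mathbb{N}$, there exists $(w_{1,\lambda}, w_{2,\lambda})$ in the interior of $\mathcal{A}_\lambda$ where $J_\lambda^+$ attains its infimum. Consequently, $(J_\lambda^+)'(w_{1,\lambda}, w_{2,\lambda}) = 0$, and
\[
\begin{aligned}
v_{1,\lambda}^+ &= w_{1,\lambda} + c_1^+(w_{1,\lambda}, w_{2,\lambda}), \\
v_{2,\lambda}^+ &= w_{2,\lambda} + c_2^+(w_{1,\lambda}, w_{2,\lambda})
\end{aligned} \tag{3.17}
\]
defines a critical point for $I_\lambda$, namely a (weak) solution of (2.10).

Next we prove that, as in the Abelian Chern–Simons–Higgs theory (see [34]), the solution characterized by the choice of the "plus" sign in (2.23), namely $(v_{1,\lambda}^+, v_{2,\lambda}^+)$, gives rise to a periodic vortex of type I. More precisely, we prove

Proposition 3.5. For $\lambda$ sufficiently large, let $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ be the solution of (2.10) given by (3.17). Then,
• $e^{u_0^i + v_{i,\lambda}^+} \to 1$ as $\lambda \to +\infty$, pointwise a.e. in $\Omega$ and in $L^p(\Omega)$, $\forall p \ge 1$ ($i = 1, 2$).

In order to prove Proposition 3.5 we start with the following lemma:

Lemma 3.6. Let $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ be given by (3.17). Then,
\[
\int_\Omega W(v_{1,\lambda}^+, v_{2,\lambda}^+) \to 0, \qquad \text{as } \lambda \to +\infty. \tag{3.18}
\]

Proof. In view of (3.16) we may conclude that $\forall \varepsilon > 0$ $\exists \lambda_\varepsilon > 0$ and $C_\varepsilon > 0$ such that $\forall \lambda \ge \lambda_\varepsilon$ we have
\[
\inf_{\mathcal{A}_\lambda} J_\lambda^+ \le \varepsilon \lambda + C_\varepsilon. \tag{3.19}
\]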


On the other hand, following the argument of Proposition 3.2 (with $\tau = 2 \bigl(\max_{i \ne j = 1,2} (2N_j + N_i)\bigr)^{-1}$) we obtain
\[
\inf_{\mathcal{A}_\lambda} J_\lambda^+ = J_\lambda^+(w_{1,\lambda}, w_{2,\lambda}) \ge \lambda \int_\Omega W(v_{1,\lambda}^+, v_{2,\lambda}^+) - \frac{4\pi}{3} \Bigl(\max_{i \ne j} (2N_i + N_j)\Bigr)^{2} \ln \lambda - C, \tag{3.20}
\]
for $C > 0$ a suitable constant independent of $\lambda$. So putting together (3.19) and (3.20), we obtain that
\[
\limsup_{\lambda \to +\infty} \int_\Omega W(v_{1,\lambda}^+, v_{2,\lambda}^+) \le \varepsilon, \qquad \forall \varepsilon > 0,
\]
and the conclusion follows.

Proof of Proposition 3.5. Recalling (2.14), by Lemma 3.6 we have $h_i e^{v_{i,\lambda}^+} \to 1$ in $L^2(\Omega)$ as $\lambda \to +\infty$, $i = 1, 2$. Since $(v_{1,\lambda}^+ + u_0^1, v_{2,\lambda}^+ + u_0^2)$ is a solution of (2.1), by (2.3) we have that
\[
e^{v_{i,\lambda}^+ + u_0^i} \le 1 \quad \text{in } \Omega, \qquad i = 1, 2.
\]
Hence, $e^{v_{i,\lambda}^+ + u_0^i} \to 1$ pointwise a.e., and, by dominated convergence, strongly in $L^p(\Omega)$, $\forall p \ge 1$, as $\lambda \to +\infty$.

We conclude this section by observing that the type I multivortex solution $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ can be characterized variationally as follows:

Lemma 3.7. Let $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ be given by (3.17). Then $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ is a local minimum for the functional $I_\lambda$.

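Proof. For any $(w_1, w_2) \in \mathcal{A}_\lambda$ observe that
\[
\partial_{c_i} I_\lambda\bigl(w_1 + c_1^+(w_1, w_2),\; w_2 + c_2^+(w_1, w_2)\bigr) = 0, \qquad i = 1, 2. \tag{3.21}
\]
Moreover, for $i = 1, 2$, $i \ne j$, we have
\[
\begin{aligned}
\partial^2_{c_i} I_\lambda(w_1 + c_1, w_2 + c_2) &= \lambda \int_\Omega \bigl(4 h_i^2 e^{2(w_i + c_i)} - h_i e^{w_i + c_i} - h_1 h_2 e^{c_1 + c_2} e^{w_1 + w_2}\bigr), \\
\partial^2_{c_1 c_2} I_\lambda(w_1 + c_1, w_2 + c_2) &= -\lambda \int_\Omega h_1 h_2 e^{c_1 + c_2} e^{w_1 + w_2}.
\end{aligned}
\]
In view of (2.23) we have
\[
\partial^2_{c_i} I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+) = \lambda \left(\Bigl(\int_\Omega h_i e^{v_{i,\lambda}^+} + \int_\Omega h_1 h_2 e^{v_{1,\lambda}^+ + v_{2,\lambda}^+}\Bigr)^{2} - \frac{32\pi}{3\lambda} (2N_i + N_j) \int_\Omega h_i^2 e^{2 v_{i,\lambda}^+}\right)^{\frac12}, \tag{3.22}
\]
and since $(w_{1,\lambda}, w_{2,\lambda})$ lies in the interior of $\mathcal{A}_\lambda$,
\[
\partial^2_{c_i} I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+) > \lambda \int_\Omega h_1 h_2 e^{v_{1,\lambda}^+ + v_{2,\lambda}^+}, \qquad i = 1, 2.
\]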


Therefore, at the point $(v_{1,\lambda}^+, v_{2,\lambda}^+)$, the Hessian matrix of $I_\lambda(w_1 + c_1, w_2 + c_2)$ with respect to the variables $(c_1, c_2)$ is strictly positive definite. Let $v_i = w_i + c_i$ ($i = 1, 2$); by continuity, there exists $\delta > 0$ such that whenever $\sum_{i=1,2} \|v_i - v_{i,\lambda}^+\| \le \delta$, we have $(w_1, w_2) \in \mathcal{A}_\lambda$ and
\[
I_\lambda(v_1, v_2) \ge I_\lambda\bigl(w_1 + c_1^+(w_1, w_2),\; w_2 + c_2^+(w_1, w_2)\bigr) = J_\lambda^+(w_1, w_2). \tag{3.23}
\]
Therefore,
\[
I_\lambda(v_1, v_2) \ge J_\lambda^+(w_1, w_2) \ge \inf_{(w_1, w_2) \in \mathcal{A}_\lambda} J_\lambda^+(w_1, w_2) = I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+), \tag{3.24}
\]
and so $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ defines a local minimum for $I_\lambda$.
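The positivity of the Hessian in the proof of Lemma 3.7 follows from a two-line computation. In the sketch below, the symbols $p_1$, $p_2$, $q$ are shorthand introduced only for this illustration: $p_i = \partial^2_{c_i} I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+)$ and $q = \lambda \int_\Omega h_1 h_2 e^{v_{1,\lambda}^+ + v_{2,\lambda}^+} > 0$, so that the mixed derivative equals $-q$ and the strict inequality displayed after (3.22) reads $p_i > q$.

```latex
\[
H = \begin{pmatrix} p_1 & -q \\ -q & p_2 \end{pmatrix},
\qquad p_1 > q > 0, \quad p_2 > q > 0 .
\]
% Sylvester's criterion for a symmetric 2x2 matrix: positive leading entry and positive determinant.
\[
p_1 > 0, \qquad \det H = p_1 p_2 - q^2 > q \cdot q - q^2 = 0 ,
\]
% hence H is positive definite.
```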

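4. The Mountain Pass Solution

In this section we prove the existence of a second solution of "mountain pass" type and obtain the proof of Theorem 2.1. We start by showing that the functional $I_\lambda$ satisfies the Palais–Smale compactness condition.

Lemma 4.1. Let $\{(v_{1,n}, v_{2,n})\}$ be a sequence in $H^1(\Omega) \times H^1(\Omega)$ satisfying:
(1) $I_\lambda(v_{1,n}, v_{2,n}) \to \alpha$ as $n \to +\infty$,
(2) $\|I_\lambda'(v_{1,n}, v_{2,n})\| \to 0$ as $n \to +\infty$.
Then $\{(v_{1,n}, v_{2,n})\}$ admits a convergent subsequence in $H^1(\Omega) \times H^1(\Omega)$.

Proof. Set $v_{i,n} = w_{i,n} + c_{i,n}$ ($i = 1, 2$), where $\int_\Omega w_{i,n} = 0$ and $c_{i,n} = \int_\Omega v_{i,n}$. For any $(\psi_1, \psi_2) \in H^1(\Omega) \times H^1(\Omega)$,
\[
\begin{aligned}
I_\lambda'(v_{1,n}, v_{2,n})[\psi_1, \psi_2] ={}& \frac23 \int_\Omega \bigl(\nabla w_{1,n} \cdot \nabla \psi_1 + \nabla w_{2,n} \cdot \nabla \psi_2\bigr) + \frac13 \int_\Omega \bigl(\nabla w_{1,n} \cdot \nabla \psi_2 + \nabla w_{2,n} \cdot \nabla \psi_1\bigr) \\
&+ \lambda \int_\Omega \bigl(2 h_1^2 e^{2 v_{1,n}} - h_1 e^{v_{1,n}} - h_1 h_2 e^{v_{1,n} + v_{2,n}}\bigr) \psi_1 + \frac{4\pi}{3} (2N_1 + N_2) \int_\Omega \psi_1 \\
&+ \lambda \int_\Omega \bigl(2 h_2^2 e^{2 v_{2,n}} - h_2 e^{v_{2,n}} - h_1 h_2 e^{v_{1,n} + v_{2,n}}\bigr) \psi_2 + \frac{4\pi}{3} (2N_2 + N_1) \int_\Omega \psi_2.
\end{aligned} \tag{4.1}
\]
Choosing $\psi_i = 1$ and $\psi_j = 0$ ($i, j = 1, 2$; $i \ne j$) in (4.1) we get, as $n \to +\infty$,
\[
\lambda \int_\Omega \bigl(2 h_i^2 e^{2 v_{i,n}} - h_i e^{v_{i,n}} - h_1 h_2 e^{v_{1,n} + v_{2,n}}\bigr) + \frac{4\pi}{3} (2N_i + N_j) \le o(1). \tag{4.2}
\]
By (2.14), this implies:
\[
2\lambda \int_\Omega W(v_{1,n}, v_{2,n}) + \lambda \int_\Omega h_1 e^{v_{1,n}} + \lambda \int_\Omega h_2 e^{v_{2,n}} \le 2\lambda + o(1).
\]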
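The passage from (4.2) to the bound on $\int_\Omega W$ can be spelled out as follows. This is only a sketch under an assumption: (2.14) is stated earlier in the paper and is not reproduced here, so the expression for $W$ below is the one suggested by the derivative formula (4.1), with the shorthand $a = h_1 e^{v_1}$, $b = h_2 e^{v_2}$ and $|\Omega| = 1$.

```latex
% Assumed form of the potential (read off from (4.1); see (2.14) for the authors' statement):
\[
W = a^2 + b^2 - a - b - ab + 1
  = \tfrac12 (a - 1)^2 + \tfrac12 (b - 1)^2 + \tfrac12 (a - b)^2 \;\ge\; 0 .
\]
% Pointwise identity relating W to the two integrands in (4.2):
\[
2W + a + b = (2a^2 - a - ab) + (2b^2 - b - ab) + 2 .
\]
% Summing (4.2) over i = 1, 2 and integrating the identity over \Omega:
\[
2\lambda \int_\Omega W + \lambda \int_\Omega a + \lambda \int_\Omega b
\le 2\lambda - 4\pi (N_1 + N_2) + o(1) \le 2\lambda + o(1) .
\]
```

Since every term on the left is nonnegative, each can be bounded separately by the right-hand side.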


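Hence, as $n \to +\infty$, we have
\[
\int_\Omega W(v_{1,n}, v_{2,n}) \le 1 + o(1); \qquad \int_\Omega h_i e^{v_{i,n}} \le 2 + o(1), \quad i = 1, 2. \tag{4.3}
\]
From (4.3) and Jensen's inequality, we also get
\[
e^{c_{i,n}} \le 2 + o(1), \qquad \text{as } n \to +\infty. \tag{4.4}
\]
Furthermore, using (2.14) together with (4.3) we derive, as $n \to +\infty$, that
\[
\int_\Omega \bigl(h_i e^{v_{i,n}} - 1\bigr)^2 \le 2 + o(1), \qquad i = 1, 2, \tag{4.5}
\]
\[
\int_\Omega h_i^2 e^{2 v_{i,n}} \le C, \qquad i = 1, 2, \tag{4.6}
\]
with $C > 0$ a suitable constant. Set $(w_{1,n} + w_{2,n})^+ = \max\{w_{1,n} + w_{2,n}, 0\}$ and take (4.1) with $\psi_1 = \psi_2 = (w_{1,n} + w_{2,n})^+$; then
\[
\begin{aligned}
I_\lambda'(v_{1,n}, v_{2,n})[\psi_1, \psi_2] ={}& \|\nabla (w_{1,n} + w_{2,n})^+\|_2^2 + \lambda \int_\Omega \bigl(2 h_1^2 e^{2 v_{1,n}} + 2 h_2^2 e^{2 v_{2,n}} - 4 h_1 h_2 e^{v_{1,n} + v_{2,n}}\bigr) (w_{1,n} + w_{2,n})^+ \\
&+ 2\lambda \int_\Omega h_1 h_2 e^{v_{1,n} + v_{2,n}} (w_{1,n} + w_{2,n})^+ - \lambda \sum_{i=1,2} \int_\Omega h_i e^{v_{i,n}} (w_{1,n} + w_{2,n})^+ \\
&+ 4\pi (N_1 + N_2) \int_\Omega (w_{1,n} + w_{2,n})^+ \\
\ge{}& 2\lambda \int_\Omega h_1 h_2 e^{v_{1,n} + v_{2,n}} (w_{1,n} + w_{2,n})^+ - \lambda \|(w_{1,n} + w_{2,n})^+\|_2 \sum_{i=1,2} \Bigl(\int_\Omega h_i^2 e^{2 v_{i,n}}\Bigr)^{\frac12}.
\end{aligned} \tag{4.7}
\]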

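Therefore, by (4.6) and assumption (2) we get
\[
\int_\Omega h_1 h_2 e^{v_{1,n} + v_{2,n}} (w_{1,n} + w_{2,n})^+ \le C \|(w_{1,n} + w_{2,n})^+\|_2 + \varepsilon_n \|(w_{1,n} + w_{2,n})^+\| \le C\bigl(\|\nabla w_{1,n}\|_2 + \|\nabla w_{2,n}\|_2\bigr), \tag{4.8}
\]
with $\varepsilon_n \to 0$ as $n \to +\infty$ and $C > 0$ a suitable constant independent of $n \in \mathbb{N}$. Note that in (4.8) we have used the well-known estimate
\[
\|(w_{1,n} + w_{2,n})^+\|_2 \le \|(w_{1,n} + w_{2,n})^+\| \le \|w_{1,n} + w_{2,n}\| \le C_0 \bigl(\|\nabla w_{1,n}\|_2 + \|\nabla w_{2,n}\|_2\bigr),
\]
with a suitable constant $C_0 > 0$ (independent of $n \in \mathbb{N}$).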


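Now, take $\psi_1 = w_{1,n}$, $\psi_2 = w_{2,n}$ in (4.1) and use (4.6), (4.8), and the Poincaré inequality to derive
\[
\begin{aligned}
I_\lambda'(v_{1,n}, v_{2,n})[w_{1,n}, w_{2,n}] \ge{}& \frac13 \bigl(\|\nabla w_{1,n}\|_2^2 + \|\nabla w_{2,n}\|_2^2\bigr) + \lambda \sum_{i=1}^{2} \int_\Omega \bigl(2 h_i^2 e^{2 v_{i,n}} - h_i e^{v_{i,n}}\bigr) w_{i,n} - \lambda \int_\Omega h_1 h_2 e^{v_{1,n} + v_{2,n}} (w_{1,n} + w_{2,n})^+ \\
\ge{}& \frac13 \bigl(\|\nabla w_{1,n}\|_2^2 + \|\nabla w_{2,n}\|_2^2\bigr) + 2\lambda \sum_{i=1}^{2} \int_\Omega h_i^2 e^{2 c_{i,n}} \bigl(e^{2 w_{i,n}} - 1\bigr) w_{i,n} + 2\lambda \sum_{i=1}^{2} \int_\Omega h_i^2 e^{2 c_{i,n}} w_{i,n} \\
&- \lambda \sum_{i=1}^{2} \Bigl(\int_\Omega h_i^2 e^{2 v_{i,n}}\Bigr)^{\frac12} \|w_{i,n}\|_2 - C\bigl(\|\nabla w_{1,n}\|_2 + \|\nabla w_{2,n}\|_2\bigr) \\
\ge{}& \frac13 \sum_{i=1,2} \|\nabla w_{i,n}\|_2^2 - C \sum_{i=1,2} \|\nabla w_{i,n}\|_2,
\end{aligned}
\]
where we used that $(e^{2 w_{i,n}} - 1) w_{i,n} \ge 0$, $i = 1, 2$, a.e. in $\Omega$. Hence, by assumption (2), we conclude that $\|\nabla w_{1,n}\|_2$ and $\|\nabla w_{2,n}\|_2$ are bounded sequences. Moreover, from assumption (1), $I_\lambda(v_{1,n}, v_{2,n})$ is bounded below uniformly in $n \in \mathbb{N}$ and we get a constant $C > 0$, independent of $n \in \mathbb{N}$, such that
\[
4\pi (N_1 + N_2) \min\{c_{1,n}, c_{2,n}\} \ge -\frac16 \bigl(\|\nabla w_{1,n}\|_2^2 + \|\nabla w_{2,n}\|_2^2\bigr) - \lambda \int_\Omega W(v_{1,n}, v_{2,n}) + I_\lambda(v_{1,n}, v_{2,n}) \ge -C, \tag{4.9}
\]
that is, the sequence $c_{i,n}$ ($i = 1, 2$) is also bounded from below. Therefore, after passing to a subsequence, we get
\[
v_{i,n} \rightharpoonup \bar v_i \quad (i = 1, 2) \qquad \text{as } n \to +\infty, \tag{4.10}
\]
weakly in $H^1(\Omega)$, strongly in $L^p(\Omega)$, $p \ge 1$, and pointwise a.e. in $\Omega$. Moreover, $e^{v_{i,n}} \to e^{\bar v_i}$ strongly in $L^p(\Omega)$, $p \ge 1$, and $c_{i,n} \to \int_\Omega \bar v_i = \bar c_i$. Consequently, for any $(\psi_1, \psi_2) \in H^1(\Omega) \times H^1(\Omega)$ we derive
\[
I_\lambda'(v_{1,n}, v_{2,n})[\psi_1, \psi_2] \to I_\lambda'(\bar v_1, \bar v_2)[\psi_1, \psi_2] = 0, \tag{4.11}
\]
namely $(\bar v_1, \bar v_2)$ defines a critical point for $I_\lambda$. In order to obtain strong convergence in $H^1(\Omega) \times H^1(\Omega)$ we choose $\psi_1 = v_{1,n} - \bar v_1$ and $\psi_2 = v_{2,n} - \bar v_2$ in (4.1). By assumption (2) and (4.11), we obtain
\[
\bigl|\bigl(I_\lambda'(v_{1,n}, v_{2,n}) - I_\lambda'(\bar v_1, \bar v_2)\bigr)[v_{1,n} - \bar v_1, v_{2,n} - \bar v_2]\bigr| \le \varepsilon_n \bigl(\|v_{1,n} - \bar v_1\| + \|v_{2,n} - \bar v_2\|\bigr) = o(1), \qquad \text{as } n \to +\infty. \tag{4.12}
\]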


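Consequently,
\[
\begin{aligned}
\frac13 \sum_{i=1}^{2} \|\nabla (w_{i,n} - \bar w_i)\|_2^2 \le{}& -2\lambda \sum_{i=1}^{2} \int_\Omega h_i^2 \bigl(e^{2 v_{i,n}} - e^{2 \bar v_i}\bigr) (v_{i,n} - \bar v_i) + \lambda \sum_{i=1}^{2} \int_\Omega h_i \bigl(e^{v_{i,n}} - e^{\bar v_i}\bigr) (v_{i,n} - \bar v_i) \\
&+ \lambda \int_\Omega h_1 h_2 \bigl(e^{v_{1,n} + v_{2,n}} - e^{\bar v_1 + \bar v_2}\bigr) \bigl(v_{1,n} + v_{2,n} - (\bar v_1 + \bar v_2)\bigr) + o(1) = o(1),
\end{aligned}
\]
as $n \to +\infty$, and the desired conclusion follows.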

Proof of Theorem 2.1(c). In Sect. 3 we proved, for $\lambda > 0$ sufficiently large, the existence of a solution of problem (2.10) with the desired asymptotic behavior (2.4) as $\lambda \to +\infty$ (see Proposition 3.5). Moreover, in Lemma 3.7, we have shown that such a solution $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ defines a local minimum for $I_\lambda$, namely
\[
\exists\, \delta_0 > 0 : \quad I_\lambda(v_1, v_2) \ge I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+), \quad \text{provided } \sum_{i=1,2} \|v_i - v_{i,\lambda}^+\| \le \delta_0. \tag{4.13}
\]
In order to find a second solution for (2.10), we observe that $I_\lambda$ admits a "mountain pass" structure. In fact, there exists a constant $C_\lambda > 0$ (depending only on $\lambda$) such that for $c \in \mathbb{R}$ we have:
\[
I_\lambda(v_{1,\lambda}^+, w_{2,\lambda} + c) - I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+) \le C_\lambda + \frac{4\pi}{3} (2N_2 + N_1)\, c. \tag{4.14}
\]

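We distinguish two cases.

(i) If $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ is not a strict local minimum for $I_\lambda$, namely
\[
\forall\, 0 < \delta < \delta_0 \qquad \inf_{\sum_{i=1,2} \|v_i - v_{i,\lambda}^+\| = \delta} I_\lambda = I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+), \tag{4.15}
\]
then, by an application of Ekeland's lemma (see [17], Corollary 1.6), we obtain a local minimum $(v_{1,\lambda}^\delta, v_{2,\lambda}^\delta)$ for $I_\lambda$ such that $\sum_{i=1,2} \|v_{i,\lambda}^\delta - v_{i,\lambda}^+\| = \delta$, for every $\delta \in (0, \delta_0)$. Therefore, in this case, we find a one-parameter family of (weak) solutions for (2.10). Otherwise,

(ii) if $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ is a strict local minimum for $I_\lambda$, then
\[
\exists\, \delta_1 \in (0, \delta_0) : \qquad \inf_{\sum_{i=1,2} \|v_i - v_{i,\lambda}^+\| = \delta_1} I_\lambda(v_1, v_2) > I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+). \tag{4.16}
\]
Moreover, in view of (4.14), there exists $\bar c < 0$ such that $|\bar c - c_2^+(w_{1,\lambda}, w_{2,\lambda})| > \delta_1$ and
\[
I_\lambda(v_{1,\lambda}^+, w_{2,\lambda} + \bar c) < I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+). \tag{4.17}
\]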
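The choice of $\bar c$ in case (ii) can be made explicit. The following short computation is a sketch, assuming only the linear bound (4.14) and $2N_2 + N_1 > 0$ (which holds as soon as at least one vortex number is positive); the threshold $c_*$ is notation introduced here for illustration.

```latex
% The right-hand side of (4.14) is negative exactly when c lies below a threshold c_*:
\[
C_\lambda + \frac{4\pi}{3} (2 N_2 + N_1)\, c < 0
\quad\Longleftrightarrow\quad
c < c_* := - \frac{3\, C_\lambda}{4\pi (2 N_2 + N_1)} .
\]
% Any \bar c < \min\{ c_*,\; c_2^+(w_{1,\lambda}, w_{2,\lambda}) - \delta_1 \} therefore satisfies
% both requirements: |\bar c - c_2^+| > \delta_1 and the strict inequality (4.17).
```

In other words, moving the mean value of the second component far enough towards $-\infty$ produces a point outside the $\delta_1$-sphere whose energy lies below that of the local minimum.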

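We introduce the class of paths
\[
\Gamma_\lambda = \bigl\{\gamma \in C([0, 1], H^1(\Omega) \times H^1(\Omega)) : \gamma(0) = (v_{1,\lambda}^+, v_{2,\lambda}^+);\ \gamma(1) = (v_{1,\lambda}^+, w_{2,\lambda} + \bar c)\bigr\}
\]
and define
\[
\alpha_\lambda = \inf_{\gamma \in \Gamma_\lambda} \max_{t \in [0, 1]} I_\lambda(\gamma(t)) > I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+).
\]
In view of Lemma 4.1, (4.16) and (4.17), we can apply the Mountain Pass Theorem of Ambrosetti–Rabinowitz [2] to conclude that $\alpha_\lambda$ defines a critical level for $I_\lambda$. Namely, there exists $(\bar v_{1,\lambda}, \bar v_{2,\lambda}) \in H^1(\Omega) \times H^1(\Omega)$ such that
\[
I_\lambda'(\bar v_{1,\lambda}, \bar v_{2,\lambda}) = 0 \quad \text{and} \quad I_\lambda(\bar v_{1,\lambda}, \bar v_{2,\lambda}) = \alpha_\lambda > I_\lambda(v_{1,\lambda}^+, v_{2,\lambda}^+).
\]
Hence $(\bar v_{1,\lambda}, \bar v_{2,\lambda})$ defines a (weak) solution of (2.10) distinct from the local minimum $(v_{1,\lambda}^+, v_{2,\lambda}^+)$.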

5. Vortex Solutions of the II and III-Type

In this section we establish Theorem 2.2 by proving the existence of one-vortex solutions from minima on $\mathcal{A}_\lambda$ of the functionals $J_\lambda^-$, $J_\lambda^\pm$ and $J_\lambda^\mp$ (see Remark 2.5). We start by discussing the minimization problem for $J_\lambda^-$. To this purpose we prove a preliminary lemma:

Lemma 5.1. There exists a constant $C > 0$, independent of $\lambda$, such that for any $(w_1, w_2) \in \mathcal{A}_\lambda$,
\[
e^{c_i^-} \ge \frac{C}{\lambda \int_\Omega h_i e^{w_i}}, \qquad i = 1, 2. \tag{5.1}
\]

Proof. By symmetry, it suffices to show (5.1) for $i = 1$. From (2.27) and (2.23), we have
\[
e^{c_1^-} \ge \frac{4\pi (2N_1 + N_2)}{3\lambda \bigl(\int_\Omega h_1 e^{w_1} + e^{c_2^-} \int_\Omega h_1 h_2 e^{w_1 + w_2}\bigr)}. \tag{5.2}
\]
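One way to see where a bound of the form (5.2) comes from is sketched below. It rests on an assumption: that, as the identity (3.21) and the expression (3.22) suggest, $e^{c_1^-}$ is the smaller root of the quadratic constraint $\partial_{c_1} I_\lambda = 0$ (the precise statements are (2.23) and (2.27), which are not reproduced here). The shorthand $t$, $X$, $D$, $\kappa$ is introduced only for this illustration.

```latex
% Shorthand (illustrative): t = e^{c_1}, D = \int_\Omega h_1^2 e^{2 w_1},
% X = \int_\Omega h_1 e^{w_1} + e^{c_2} \int_\Omega h_1 h_2 e^{w_1 + w_2},
% \kappa = 32\pi (2 N_1 + N_2) / (3\lambda).
\[
2 D\, t^2 - X\, t + \frac{\kappa}{8} = 0
\quad\Longrightarrow\quad
t_\pm = \frac{X \pm \sqrt{X^2 - \kappa D}}{4 D} .
\]
% Rationalizing the smaller root and using \sqrt{X^2 - \kappa D} \le X:
\[
t_- = \frac{\kappa D}{4 D \bigl(X + \sqrt{X^2 - \kappa D}\bigr)}
    = \frac{\kappa}{4 \bigl(X + \sqrt{X^2 - \kappa D}\bigr)}
    \ge \frac{\kappa}{8 X}
    = \frac{4\pi (2 N_1 + N_2)}{3 \lambda X} .
\]
```

With $X$ written out, the last expression coincides with the right-hand side of (5.2).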

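On the other hand, by the Hölder inequality, (2.20) and Lemma 2.3 (ii) we have
\[
\begin{aligned}
e^{c_2^-} \int_\Omega h_1 h_2 e^{w_1 + w_2} &\le e^{c_2^-} \Bigl(\int_\Omega h_1^2 e^{2 w_1}\Bigr)^{\frac12} \Bigl(\int_\Omega h_2^2 e^{2 w_2}\Bigr)^{\frac12} \\
&\le e^{c_2^-}\, \frac{3\lambda}{32\pi \sqrt{(2N_1 + N_2)(2N_2 + N_1)}} \int_\Omega h_1 e^{w_1} \int_\Omega h_2 e^{w_2} \\
&\le \frac34 \sqrt{\frac{2N_2 + N_1}{2N_1 + N_2}} \int_\Omega h_1 e^{w_1}.
\end{aligned} \tag{5.3}
\]
Combining (5.2) and (5.3), we obtain
\[
e^{c_1^-} \ge \frac{4\pi (2N_1 + N_2)}{3 \Bigl(1 + \frac34 \sqrt{\frac{2N_2 + N_1}{2N_1 + N_2}}\Bigr)}\, \frac{1}{\lambda \int_\Omega h_1 e^{w_1}},
\]
and the desired estimate is established.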


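Lemma 5.1 allows us to obtain the following:

Proposition 5.2. If $N_1 + N_2 = 1$, then there exists a constant $C > 0$, independent of $\lambda$, such that for all $(w_1, w_2) \in \mathcal{A}_\lambda$ we have:
\[
J_\lambda^-(w_1, w_2) \ge \frac{1}{30} \bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2\bigr) + \lambda - 4\pi \ln \lambda - C. \tag{5.4}
\]
Moreover, $J_\lambda^-$ attains its infimum on $\mathcal{A}_\lambda$.

Proof. Recalling (2.37), from Lemma 5.1 we get
\[
\begin{aligned}
J_\lambda^-(w_1, w_2) \ge{}& \frac13 \Bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2 + \int_\Omega \nabla w_1 \cdot \nabla w_2\Bigr) + \lambda \\
&+ \frac{4\pi}{3} (2N_1 + N_2) \ln \frac{1}{\lambda \int_\Omega h_1 e^{w_1}} + \frac{4\pi}{3} (2N_2 + N_1) \ln \frac{1}{\lambda \int_\Omega h_2 e^{w_2}} - C,
\end{aligned} \tag{5.5}
\]
for any $(w_1, w_2) \in \mathcal{A}_\lambda$ and for some constant $C > 0$ independent of $\lambda$. Using the estimate $\int_\Omega |\nabla w_1| |\nabla w_2| \le \frac{\varepsilon}{2} \|\nabla w_1\|_2^2 + \frac{1}{2\varepsilon} \|\nabla w_2\|_2^2$, valid for any $\varepsilon > 0$, and the Moser–Trudinger inequality (3.4), for some constant $C > 0$ (independent of $\lambda$), we obtain
\[
\begin{aligned}
J_\lambda^-(w_1, w_2) \ge{}& \frac13 \Bigl(1 - \frac{\varepsilon}{2} - \frac{2N_1 + N_2}{4}\Bigr) \|\nabla w_1\|_2^2 + \frac13 \Bigl(1 - \frac{1}{2\varepsilon} - \frac{2N_2 + N_1}{4}\Bigr) \|\nabla w_2\|_2^2 + \lambda - 4\pi (N_1 + N_2) \ln \lambda - C \\
\ge{}& \frac{1}{12} (3 - 2\varepsilon - N_1) \|\nabla w_1\|_2^2 + \frac{1}{12} \Bigl(3 - \frac{2}{\varepsilon} - N_2\Bigr) \|\nabla w_2\|_2^2 + \lambda - 4\pi \ln \lambda - C,
\end{aligned}
\]
provided $N_1 + N_2 = 1$. Thus (5.4) follows by choosing $\varepsilon = \frac45 (1 - N_2) + \frac54 (1 - N_1)$. Therefore $J_\lambda^-$ is coercive on $\mathcal{A}_\lambda$. By its weak lower semicontinuity on the weakly closed set $\mathcal{A}_\lambda$, we immediately conclude that $J_\lambda^-$ attains its infimum on $\mathcal{A}_\lambda$.

Let $(w_{1,\lambda}, w_{2,\lambda}) \in \mathcal{A}_\lambda$ satisfy $J_\lambda^-(w_{1,\lambda}, w_{2,\lambda}) = \inf_{\mathcal{A}_\lambda} J_\lambda^-$. In order to prove that
\[
\begin{aligned}
v_{1,\lambda}^- &= w_{1,\lambda} + c_1^-(w_{1,\lambda}, w_{2,\lambda}), \\
v_{2,\lambda}^- &= w_{2,\lambda} + c_2^-(w_{1,\lambda}, w_{2,\lambda})
\end{aligned} \tag{5.6}
\]
defines a (weak) solution of (2.10), it suffices to show that $(w_{1,\lambda}, w_{2,\lambda})$ lies in the interior of $\mathcal{A}_\lambda$. Indeed, we have

Proposition 5.3. For $N_1 + N_2 = 1$ and $\lambda > 0$ sufficiently large, we have
\[
\inf_{(w_1, w_2) \in \partial \mathcal{A}_\lambda} J_\lambda^-(w_1, w_2) > \inf_{(w_1, w_2) \in \mathcal{A}_\lambda} J_\lambda^-(w_1, w_2). \tag{5.7}
\]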


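Proof. For $(w_1, w_2) \in \partial \mathcal{A}_\lambda$ the identity
\[
\left(\int_\Omega h_i e^{w_i}\right)^{2} = \frac{32\pi (2N_i + N_j)}{3\lambda} \int_\Omega h_i^2 e^{2 w_i}, \qquad i \ne j, \tag{5.8}
\]
holds for $i = 1$ or $2$ (see (3.7)). Now let $(w_{1,\lambda}, w_{2,\lambda})$ satisfy $J_\lambda^-(w_{1,\lambda}, w_{2,\lambda}) = \inf_{\mathcal{A}_\lambda} J_\lambda^-$ and, arguing by contradiction, assume that $(w_{1,\lambda}, w_{2,\lambda}) \in \partial \mathcal{A}_\lambda$. Without loss of generality we may suppose that (5.8) holds for $i = 2$. By Jensen's inequality it follows that
\[
\int_\Omega h_2^2 e^{2 w_{2,\lambda}} \to +\infty, \qquad \text{as } \lambda \to +\infty,
\]
and, by the Moser–Trudinger inequality (3.4), necessarily
\[
\|\nabla w_{2,\lambda}\|_2 \to +\infty, \qquad \text{as } \lambda \to +\infty. \tag{5.9}
\]
Furthermore, by Proposition 5.2 we get
\[
J_\lambda^-(w_{1,\lambda}, w_{2,\lambda}) \ge \frac{1}{30} \bigl(\|\nabla w_{1,\lambda}\|_2^2 + \|\nabla w_{2,\lambda}\|_2^2\bigr) + \lambda - 4\pi \ln \lambda - C, \tag{5.10}
\]
with $C > 0$ a suitable constant independent of $\lambda$. On the other hand, by Lemma 2.3 (ii) and Lemma 5.1 there exist constants $C_1, C_2 > 0$, independent of $\lambda$, such that
\[
\begin{aligned}
J_\lambda^-(0, 0) &\le \frac{\lambda}{2} \Bigl(1 - \frac{C_1}{\lambda}\Bigr) + \frac{\lambda}{2} \Bigl(1 - \frac{C_1}{\lambda}\Bigr) + \frac{4\pi}{3} (2N_1 + N_2) \ln \frac{C_2}{\lambda} + \frac{4\pi}{3} (2N_2 + N_1) \ln \frac{C_2}{\lambda} \\
&\le \lambda - 4\pi (N_1 + N_2) \ln \lambda + C,
\end{aligned} \tag{5.11}
\]
with $C > 0$ a suitable constant independent of $\lambda$. Thus, if $N_1 + N_2 = 1$, from (5.10) and (5.11) we obtain a contradiction, since
\[
0 \ge J_\lambda^-(w_{1,\lambda}, w_{2,\lambda}) - J_\lambda^-(0, 0) \ge \frac{1}{30} \|\nabla w_{2,\lambda}\|_2^2 - C \to +\infty, \qquad \text{as } \lambda \to +\infty.
\]
Hence, $(w_{1,\lambda}, w_{2,\lambda})$ must belong to the interior of $\mathcal{A}_\lambda$.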
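The constant $\frac{1}{30}$ in (5.4), which is used again in (5.10), comes from the choice of $\varepsilon$ made in the proof of Proposition 5.2. The following sketch checks that choice with exact rational arithmetic for the two admissible pairs with $N_1 + N_2 = 1$; the function names are illustrative and not part of the paper.

```python
from fractions import Fraction


def epsilon_choice(n1, n2):
    """Value of epsilon chosen in the proof of Proposition 5.2."""
    return Fraction(4, 5) * (1 - n2) + Fraction(5, 4) * (1 - n1)


def gradient_coefficients(n1, n2):
    """Coefficients of the squared gradient norms of w_1 and w_2
    in the last lower bound of that proof (valid when n1 + n2 == 1)."""
    eps = epsilon_choice(n1, n2)
    coeff_w1 = Fraction(1, 12) * (3 - 2 * eps - n1)
    coeff_w2 = Fraction(1, 12) * (3 - 2 / eps - n2)
    return coeff_w1, coeff_w2


for n1, n2 in ((1, 0), (0, 1)):
    c1, c2 = gradient_coefficients(n1, n2)
    # Both coefficients must be at least 1/30 for (5.4) to follow.
    assert min(c1, c2) >= Fraction(1, 30)
```

In both cases one coefficient equals exactly $\frac{1}{30}$ and the other equals $\frac{1}{24}$, so the stated choice of $\varepsilon$ is consistent with the constant in (5.4).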

Remark 5.4. Note that, if we use estimate (3.1) in (5.5), we can derive, as in Proposition 3.2, that the functional $J_\lambda^-$ is bounded below and attains its infimum on $\mathcal{A}_\lambda$ for all integers $N_1$ and $N_2$. However, in the general situation, the estimate (5.4) gets worse with respect to $\lambda$ and becomes:
\[
J_\lambda^-(w_1, w_2) \ge \alpha \bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2\bigr) + \lambda - 4\pi c_\alpha \ln \lambda - C, \tag{5.12}
\]
for some $\alpha > 0$ and some constant $c_\alpha = c_\alpha(N_1, N_2) \ge 0$, depending on $N_1$ and $N_2$ in such a way that $\dfrac{c_\alpha}{N_1 + N_2} \to +\infty$ as $N_1 + N_2 \to +\infty$. Thus, the estimate (5.12) is no longer sufficient for the arguments in the proof of Proposition 5.3 to yield (5.7).


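So we have established that (5.6) defines a solution for (2.10) provided $\lambda > 0$ is sufficiently large. Next, we show that $(v_{1,\lambda}^-, v_{2,\lambda}^-)$ exhibits a different asymptotic behavior as $\lambda \to +\infty$ with respect to the family $(v_{1,\lambda}^+, v_{2,\lambda}^+)$ obtained in Proposition 3.5 above. Indeed, we have

Proposition 5.5. Let $N_1 + N_2 = 1$ and $(v_{1,\lambda}^-, v_{2,\lambda}^-)$ be given by (5.6). Then,
(i) $c_i^-(w_{1,\lambda}, w_{2,\lambda}) \to -\infty$, as $\lambda \to +\infty$ ($i = 1, 2$).
(ii) There exists a constant $C > 0$ (independent of $\lambda$) such that $\|\nabla w_{i,\lambda}\|_2 \le C$ ($i = 1, 2$). Furthermore, any sequence $\lambda_n \to +\infty$ admits a subsequence (still denoted by $\lambda_n$) such that for $w_{i,n} = w_{i,\lambda_n}$ ($i = 1, 2$) we have $w_{i,n} \to \bar w_i$ strongly in $H^1(\Omega)$, and $(\bar w_1, \bar w_2)$ satisfies:
\[
\begin{cases}
-\Delta w_1 = 4\pi N_1 \Bigl(\dfrac43 \dfrac{h_1 e^{w_1}}{\int_\Omega h_1 e^{w_1}} - \dfrac13 \dfrac{h_2 e^{w_2}}{\int_\Omega h_2 e^{w_2}} - 1\Bigr) + \dfrac{8\pi}{3} N_2 \Bigl(\dfrac{h_1 e^{w_1}}{\int_\Omega h_1 e^{w_1}} - \dfrac{h_2 e^{w_2}}{\int_\Omega h_2 e^{w_2}}\Bigr), \\[2ex]
-\Delta w_2 = 4\pi N_2 \Bigl(\dfrac43 \dfrac{h_2 e^{w_2}}{\int_\Omega h_2 e^{w_2}} - \dfrac13 \dfrac{h_1 e^{w_1}}{\int_\Omega h_1 e^{w_1}} - 1\Bigr) + \dfrac{8\pi}{3} N_1 \Bigl(\dfrac{h_2 e^{w_2}}{\int_\Omega h_2 e^{w_2}} - \dfrac{h_1 e^{w_1}}{\int_\Omega h_1 e^{w_1}}\Bigr), \\[2ex]
\int_\Omega w_i = 0, \quad w_i \in H^1(\Omega), \quad i = 1, 2.
\end{cases} \tag{5.13}
\]

Proof. By Remark 2.4 and Lemma 5.1 we have
\[
e^{c_i^-(w_{1,\lambda}, w_{2,\lambda})} = O\!\left(\frac{1}{\lambda}\right), \qquad i = 1, 2,
\]
and (i) immediately follows.

(ii) Recalling (5.10) and (5.11) we have
\[
0 \ge J_\lambda^-(w_{1,\lambda}, w_{2,\lambda}) - J_\lambda^-(0, 0) \ge \frac{1}{30} \bigl(\|\nabla w_{1,\lambda}\|_2^2 + \|\nabla w_{2,\lambda}\|_2^2\bigr) - C,
\]
hence we immediately derive $\|\nabla w_{i,\lambda}\|_2 \le C$ ($i = 1, 2$) for some suitable constant $C > 0$ independent of $\lambda$. Therefore, passing to subsequences if necessary, we derive $w_{i,n} = w_{i,\lambda_n} \to \bar w_i$ weakly in $H^1(\Omega)$, strongly in $L^p(\Omega)$ $\forall p \ge 1$ and pointwise a.e. in $\Omega$. Furthermore, by dominated convergence, we also have
\[
\int_\Omega e^{\alpha (u_0^i + w_{i,n})} \to \int_\Omega e^{\alpha (u_0^i + \bar w_i)}, \qquad i = 1, 2, \ \alpha > 0. \tag{5.14}
\]
Consequently, taking into account (i) and the definition of $c_{i,n}^- = c_i^-(w_{1,n}, w_{2,n})$ in (2.27), as $n \to +\infty$ we get
\[
\lambda_n e^{c_{i,n}^-} \to \frac{4\pi}{3}\, \frac{2N_i + N_j}{\int_\Omega h_i e^{\bar w_i}}, \qquad i, j = 1, 2, \ i \ne j. \tag{5.15}
\]
The weak convergence in $H^1(\Omega)$, together with (5.14) and (5.15), yields the conclusion that $(\bar w_1, \bar w_2)$ is a weak solution of (5.13).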


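Finally, to prove that $w_{i,n} \to \bar w_i$ ($i = 1, 2$) strongly in $H^1(\Omega)$, notice that
\[
\int_\Omega |\nabla w_{i,n} - \nabla \bar w_i|^2 = -\int_\Omega \Delta (w_{i,n} - \bar w_i)\, (w_{i,n} - \bar w_i) = \int_\Omega h_{i,n} (w_{i,n} - \bar w_i),
\]
where
\[
\begin{aligned}
h_{1,n} ={}& \lambda_n e^{c_{1,n}^-} \bigl(2 h_1 e^{w_{1,n}} - 2 h_1 e^{\bar w_1}\bigr) + 2 h_1 e^{\bar w_1} \Bigl(\lambda_n e^{c_{1,n}^-} - \frac{4\pi}{3}\, \frac{2N_1 + N_2}{\int_\Omega h_1 e^{\bar w_1}}\Bigr) \\
&- \lambda_n e^{c_{2,n}^-} \bigl(h_2 e^{w_{2,n}} - h_2 e^{\bar w_2}\bigr) - h_2 e^{\bar w_2} \Bigl(\lambda_n e^{c_{2,n}^-} - \frac{4\pi}{3}\, \frac{2N_2 + N_1}{\int_\Omega h_2 e^{\bar w_2}}\Bigr) \\
&- \lambda_n \bigl(4 e^{2 c_{1,n}^-} h_1^2 e^{2 w_{1,n}} - 2 e^{2 c_{2,n}^-} h_2^2 e^{2 w_{2,n}} - e^{c_{1,n}^-} e^{c_{2,n}^-} h_1 e^{w_{1,n}} h_2 e^{w_{2,n}}\bigr),
\end{aligned} \tag{5.16}
\]
and $h_{2,n}$ is given by the symmetric expression. Hence, $\|h_{i,n}\|_p$ is uniformly bounded for any $p \ge 1$ and $\int_\Omega |\nabla w_{i,n} - \nabla \bar w_i|^2 \le \|h_{i,n}\|_2 \|w_{i,n} - \bar w_i\|_2 \to 0$.

Corollary 5.6. Let $(v_{1,\lambda}^-, v_{2,\lambda}^-)$ be given by (5.6). Then as $\lambda \to +\infty$,
• $e^{v_{i,\lambda}^-} \to 0$ uniformly in $C^k(\Omega)$, $\forall k > 0$, $i = 1, 2$.

Proof. In view of (i) in Proposition 5.5 it is enough to prove that any sequence $\lambda_n \to +\infty$ admits a subsequence (which we still denote by $\lambda_n$) such that $w_{i,n} = w_{i,\lambda_n} \to \bar w_i$ in $C^k(\Omega)$ for any $k \ge 0$. This is readily established since $-\Delta (w_{i,n} - \bar w_i) = h_{i,n}$, with $h_{i,n}$ given in (5.16), and $\|h_{i,n}\|_p \to 0$ as $n \to +\infty$ for any $p \ge 1$. Consequently, $\|w_{i,n} - \bar w_i\|_{C^{1,\alpha}} \to 0$ as $n \to +\infty$, for $\alpha \in (0, 1)$. A bootstrap argument then gives $\|w_{i,n} - \bar w_i\|_{C^k} \to 0$, for any $k \in \mathbb{N}$.

To conclude the proof of Theorem 2.2, we consider the analogous minimization problem for $J_\lambda^\pm$ and $J_\lambda^\mp$ on $\mathcal{A}_\lambda$. We start with the following:

Proposition 5.7. Let $N_1 + N_2 = 1$ and $* = \pm$ or $\mp$. There exists a constant $C > 0$, independent of $\lambda$, such that
\[
J_\lambda^*(w_1, w_2) \ge \frac{1}{30} \bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2\bigr) + \frac34 \lambda - 4\pi \ln \lambda - C, \tag{5.17}
\]
for all $(w_1, w_2) \in \mathcal{A}_\lambda$. Moreover, $J_\lambda^*$ attains its infimum on $\mathcal{A}_\lambda$.

Proof. It suffices to prove (5.17) with $* = \pm$; the other case $* = \mp$ follows analogously by exchanging the roles of the indices. In view of (2.23), from (2.27) it follows immediately that
\[
e^{c_1^\pm} \ge \frac{\int_\Omega h_1 e^{w_1}}{4 \int_\Omega h_1^2 e^{2 w_1}} \ge \frac{8\pi (2N_1 + N_2)}{3\lambda}\, \frac{1}{\int_\Omega h_1 e^{w_1}}; \tag{5.18}
\]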


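while,
\[
e^{c_2^\pm} \ge \frac{8\pi}{3\lambda \bigl(\int_\Omega h_2 e^{w_2} + e^{c_1^\pm} \int_\Omega h_1 h_2 e^{w_1 + w_2}\bigr)} \ge \frac{4\pi}{3\lambda \max\bigl\{\int_\Omega h_2 e^{w_2},\ \int_\Omega h_1 h_2 e^{w_1 + w_2}\bigr\}}, \tag{5.19}
\]
where we have used that $e^{c_1^\pm} \le 1$ (see Remark 2.4). In case $\int_\Omega h_2 e^{w_2} \ge \int_\Omega h_1 h_2 e^{w_1 + w_2}$, by (2.37) we can use Lemma 2.3 (iii), (5.18), (5.19) and the Moser–Trudinger inequality (3.4) to derive
\[
\begin{aligned}
J_\lambda^\pm(w_1, w_2) \ge{}& \frac13 \Bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2 + \int_\Omega \nabla w_1 \cdot \nabla w_2\Bigr) + \frac34 \lambda \\
&- \frac{4\pi}{3} (2N_1 + N_2) \ln \int_\Omega h_1 e^{w_1} - \frac{4\pi}{3} (2N_2 + N_1) \ln \int_\Omega h_2 e^{w_2} - 4\pi (N_1 + N_2) \ln \lambda - C \\
\ge{}& \frac13 \Bigl(1 - \frac{\varepsilon}{2} - \frac{2N_1 + N_2}{4}\Bigr) \|\nabla w_1\|_2^2 + \frac13 \Bigl(1 - \frac{1}{2\varepsilon} - \frac{2N_2 + N_1}{4}\Bigr) \|\nabla w_2\|_2^2 \\
&+ \frac34 \lambda - 4\pi (N_1 + N_2) \ln \lambda - C
\end{aligned} \tag{5.20}
\]
for every $\varepsilon > 0$ and $C > 0$ independent of $\lambda$. Since $N_1 + N_2 = 1$, as in Proposition 5.2, we can certainly make a choice of $\varepsilon > 0$ in (5.20) in order to ensure (5.17). Now suppose that $\int_\Omega h_2 e^{w_2} < \int_\Omega h_1 h_2 e^{w_1 + w_2}$. Proceeding as above, in this case we get
\[
\begin{aligned}
J_\lambda^\pm(w_1, w_2) \ge{}& \frac13 \Bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2 + \int_\Omega \nabla w_1 \cdot \nabla w_2\Bigr) + \frac34 \lambda - 4\pi (N_1 + N_2) \ln \lambda \\
&- \frac{4\pi}{3} (2N_1 + N_2) \ln \int_\Omega h_1 e^{w_1} - \frac{4\pi}{3} (2N_2 + N_1) \ln \int_\Omega h_1 h_2 e^{w_1 + w_2} - C \\
\ge{}& \frac16 \bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2 + \|\nabla w_1 + \nabla w_2\|_2^2\bigr) - \frac{2N_1 + N_2}{12} \|\nabla w_1\|_2^2 \\
&- \frac{2N_2 + N_1}{12} \|\nabla w_1 + \nabla w_2\|_2^2 + \frac34 \lambda - 4\pi (N_1 + N_2) \ln \lambda - C \\
\ge{}& \frac{1}{24} \bigl(\|\nabla w_1\|_2^2 + \|\nabla w_2\|_2^2\bigr) + \frac34 \lambda - 4\pi \ln \lambda - C,
\end{aligned} \tag{5.21}
\]
provided $N_1 + N_2 = 1$. In any case we get the desired estimate (5.17). Thus $J_\lambda^\pm$ is coercive on $\mathcal{A}_\lambda$. Since it is weakly lower semicontinuous on the weakly closed set $\mathcal{A}_\lambda$, we immediately conclude that $J_\lambda^\pm$ attains its infimum on $\mathcal{A}_\lambda$.

Remark 5.8. By considerations similar to those of Remark 5.4, we can assert that, in fact, the functional $J_\lambda^*$, $* = \pm$ or $\mp$, is bounded below and attains its infimum on $\mathcal{A}_\lambda$ for all integers $N_1$ and $N_2$. However, we need to restrict to the case $N_1 + N_2 = 1$ in order to ensure a sharp form of estimate (5.17) (see (5.36) below), which is crucial for the existence of a minimizer in the interior of $\mathcal{A}_\lambda$.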


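Let $* = \pm$ or $\mp$ and denote by $(w_{1,\lambda}^*, w_{2,\lambda}^*) \in \mathcal{A}_\lambda$ a minimum of $J_\lambda^*$ in $\mathcal{A}_\lambda$, namely $J_\lambda^*(w_{1,\lambda}^*, w_{2,\lambda}^*) = \inf_{\mathcal{A}_\lambda} J_\lambda^*$. Define
\[
\begin{aligned}
v_{1,\lambda}^* &= w_{1,\lambda}^* + c_1^*(w_{1,\lambda}^*, w_{2,\lambda}^*), \\
v_{2,\lambda}^* &= w_{2,\lambda}^* + c_2^*(w_{1,\lambda}^*, w_{2,\lambda}^*),
\end{aligned} \qquad * = \pm \text{ or } \mp. \tag{5.22}
\]
To show that $(w_{1,\lambda}^*, w_{2,\lambda}^*)$ lies in the interior of $\mathcal{A}_\lambda$, and hence that $(v_{1,\lambda}^*, v_{2,\lambda}^*)$ defines a (weak) solution of (2.10), we prove the following preliminary result, which holds for any choice of $N_1, N_2 \in \mathbb{N}$.

Lemma 5.9. (i) Let $(v_{1,\lambda}^\pm, v_{2,\lambda}^\pm)$ be given by (5.22). Then
\[
\int_\Omega h_1 e^{v_{1,\lambda}^\pm} \to \frac12.
\]
(ii) Let $(v_{1,\lambda}^\mp, v_{2,\lambda}^\mp)$ be given by (5.22). Then
\[
\int_\Omega h_2 e^{v_{2,\lambda}^\mp} \to \frac12.
\]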

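Proof. By symmetry, we only need to establish (i). As in the proof of Lemma 3.3, let $\bar v_\mu^1 = \bar c_\mu^1 + \bar w_\mu^1$, with $\bar c_\mu^1 = \int_\Omega \bar v_\mu^1$ and $\int_\Omega \bar w_\mu^1 = 0$, be the solution of
\[
\Delta v = \mu e^{v + u_0^1}\bigl(e^{v + u_0^1} - 1\bigr) + 4\pi N_1, \qquad v \in H^1(\Omega),
\]
satisfying $\bar w_\mu^1 \to -u_0^1$ pointwise a.e. in $\Omega$ and
\[
h_1 e^{\bar w_\mu^1} \to 1 \quad \text{in } L^p(\Omega), \ \forall p \ge 1, \tag{5.23}
\]
as $\mu \to +\infty$ (see [34, Proposition 3.1]). Observe that, from (2.27) and (2.23), we have
\[
\begin{aligned}
e^{c_1^\pm} \int_\Omega h_1 e^{w_1} &\ge \frac{\bigl(\int_\Omega h_1 e^{w_1}\bigr)^2}{4 \int_\Omega h_1^2 e^{2 w_1}} \left(1 + \sqrt{1 - \frac{32\pi (2N_1 + N_2)}{3\lambda}\, \frac{\int_\Omega h_1^2 e^{2 w_1}}{\bigl(\int_\Omega h_1 e^{w_1}\bigr)^2}}\,\right) \\
&\ge \frac12\, \frac{\bigl(\int_\Omega h_1 e^{w_1}\bigr)^2}{\int_\Omega h_1^2 e^{2 w_1}} - \frac{8\pi}{3\lambda} (2N_1 + N_2).
\end{aligned} \tag{5.24}
\]
Recalling (5.23) and Lemma 2.3 (iii), we find $\lambda_0 > 0$ sufficiently large and $c_0 > 0$ such that for any $\varepsilon > 0$ there exists $\mu_\varepsilon > 0$ with the property that $(\bar w_{\mu_\varepsilon}^1, 0) \in \mathcal{A}_\lambda$ and
\[
\begin{aligned}
e^{c_1^\pm(\bar w_{\mu_\varepsilon}^1, 0)} \int_\Omega h_1 e^{\bar w_{\mu_\varepsilon}^1} &\ge \frac12 - \varepsilon - \frac{8\pi}{3\lambda} (2N_1 + N_2), \\
e^{c_2^\pm(\bar w_{\mu_\varepsilon}^1, 0)} \int_\Omega h_2 &\ge \frac{c_0}{\lambda},
\end{aligned} \tag{5.25}
\]
for every $\lambda \ge \lambda_0$.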


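Consequently,
\[
J_\lambda^\pm(\bar w_{\mu_\varepsilon}^1, 0) \le \frac13 \|\nabla \bar w_{\mu_\varepsilon}^1\|_2^2 + \lambda \Bigl(1 - \frac14 + \frac{\varepsilon}{2}\Bigr) + O(\ln \lambda), \qquad \text{as } \lambda \to +\infty. \tag{5.26}
\]
On the other hand, we have
\[
J_\lambda^\pm(w_{1,\lambda}, w_{2,\lambda}) \ge \frac{\lambda}{2} \Bigl(1 - e^{c_1^\pm} \int_\Omega h_1 e^{w_{1,\lambda}}\Bigr) + \frac{\lambda}{2} + O(\ln \lambda), \qquad \text{as } \lambda \to +\infty. \tag{5.27}
\]
Therefore,
\[
0 \ge J_\lambda^\pm(w_{1,\lambda}, w_{2,\lambda}) - J_\lambda^\pm(\bar w_{\mu_\varepsilon}^1, 0) \ge \frac{\lambda}{2} \Bigl(1 - e^{c_1^\pm} \int_\Omega h_1 e^{w_{1,\lambda}}\Bigr) + \frac{\lambda}{2} - \frac13 \|\nabla \bar w_{\mu_\varepsilon}^1\|_2^2 - \lambda \Bigl(1 - \frac14 + \frac{\varepsilon}{2}\Bigr) + O(\ln \lambda),
\]
from which we derive
\[
\limsup_{\lambda \to +\infty} \Bigl(\frac12 - e^{c_1^\pm} \int_\Omega h_1 e^{w_{1,\lambda}}\Bigr) \le \varepsilon, \qquad \forall \varepsilon > 0.
\]
At this point, taking into account Lemma 2.3 (iii), we conclude
\[
\int_\Omega h_1 e^{v_{1,\lambda}^\pm} \to \frac12, \qquad \text{as } \lambda \to +\infty.
\]

Remark 5.10. Putting together (2.35) and (5.24), we have that necessarily
\[
\frac{\bigl(\int_\Omega h_1 e^{w_{1,\lambda}}\bigr)^2}{\int_\Omega h_1^2 e^{2 w_{1,\lambda}}} \to 1, \qquad \text{as } \lambda \to +\infty. \tag{5.28}
\]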

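Using Lemma 5.9, we derive:

Proposition 5.11. (i) If $N_1 = 0$ and $N_2 = 1$ then, for $\lambda > 0$ sufficiently large,
\[
\inf_{(w_1, w_2) \in \partial \mathcal{A}_\lambda} J_\lambda^\pm(w_1, w_2) > \inf_{(w_1, w_2) \in \mathcal{A}_\lambda} J_\lambda^\pm(w_1, w_2). \tag{5.29}
\]
(ii) If $N_1 = 1$ and $N_2 = 0$ then, for $\lambda > 0$ sufficiently large,
\[
\inf_{(w_1, w_2) \in \partial \mathcal{A}_\lambda} J_\lambda^\mp(w_1, w_2) > \inf_{(w_1, w_2) \in \mathcal{A}_\lambda} J_\lambda^\mp(w_1, w_2). \tag{5.30}
\]

Proof. Again by symmetry we only need to establish (i). Let us suppose that $(w_{1,\lambda}, w_{2,\lambda})$ satisfies $J_\lambda^\pm(w_{1,\lambda}, w_{2,\lambda}) = \inf_{\mathcal{A}_\lambda} J_\lambda^\pm$ and $(w_{1,\lambda}, w_{2,\lambda}) \in \partial \mathcal{A}_\lambda$. In view of Remark 5.10, necessarily
\[
\left(\int_\Omega h_2 e^{w_{2,\lambda}}\right)^{2} = \frac{32\pi (2N_2 + N_1)}{3\lambda} \int_\Omega h_2^2 e^{2 w_{2,\lambda}}. \tag{5.31}
\]
As a consequence of (5.31) we get
\[
\int_\Omega h_2^2 e^{2 w_{2,\lambda}} \to +\infty \qquad \text{as } \lambda \to +\infty
\]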


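and, by the Moser–Trudinger inequality (3.4), necessarily
\[
\|\nabla w_{2,\lambda}\|_2 \to +\infty \qquad \text{as } \lambda \to +\infty. \tag{5.32}
\]
Now, note that if $N_1 = 0$ then $u_0^1 = 0$, and in particular $h_1 = e^{u_0^1} = 1$. By explicit calculation we see that
\[
e^{c_1^\pm(0, 0)} = \frac12 + O\!\left(\frac{1}{\lambda}\right), \qquad e^{c_2^\pm(0, 0)} = O\!\left(\frac{1}{\lambda}\right), \qquad \text{as } \lambda \to +\infty. \tag{5.33}
\]
Therefore, in view of (2.14) and (5.33) we get
\[
\begin{aligned}
J_\lambda^\pm(0, 0) &\le \frac{\lambda}{2} \bigl(1 - e^{c_1^\pm(0, 0)}\bigr) + \frac{\lambda}{2} \Bigl(1 - e^{c_2^\pm(0, 0)} \int_\Omega h_2\Bigr) - \frac{8\pi}{3} \ln \lambda + C \\
&\le \frac34 \lambda - \frac{8\pi}{3} \ln \lambda + C, \qquad \text{as } \lambda \to +\infty,
\end{aligned} \tag{5.34}
\]
for a suitable constant $C > 0$, independent of $\lambda$. On the other hand, by Lemma 5.9, for $\lambda$ sufficiently large, we can ensure that
\[
c_1^\pm(w_{1,\lambda}, w_{2,\lambda}) \ge -\ln \int_\Omega e^{w_{1,\lambda}} - \ln 4. \tag{5.35}
\]
Thus, using the same arguments as in Proposition 5.7, by Lemma 2.3 (iii) we find constants $\alpha, C > 0$ (independent of $\lambda$) such that
\[
\begin{aligned}
J_\lambda^\pm(w_{1,\lambda}, w_{2,\lambda}) \ge{}& \alpha \bigl(\|\nabla w_{1,\lambda}\|_2^2 + \|\nabla w_{2,\lambda}\|_2^2\bigr) + \frac{\lambda}{2} \Bigl(1 - e^{c_1^\pm} \int_\Omega e^{w_{1,\lambda}}\Bigr) \\
&+ \frac{\lambda}{2} \Bigl(1 - e^{c_2^\pm} \int_\Omega h_2 e^{w_{2,\lambda}}\Bigr) - \frac{8\pi}{3} \ln \lambda - C \\
\ge{}& \alpha \bigl(\|\nabla w_{1,\lambda}\|_2^2 + \|\nabla w_{2,\lambda}\|_2^2\bigr) + \frac34 \lambda - \frac{8\pi}{3} \ln \lambda - C.
\end{aligned} \tag{5.36}
\]
Combining (5.34) and (5.36) we conclude that
\[
0 \ge J_\lambda^\pm(w_{1,\lambda}, w_{2,\lambda}) - J_\lambda^\pm(0, 0) \ge \alpha \bigl(\|\nabla w_{1,\lambda}\|_2^2 + \|\nabla w_{2,\lambda}\|_2^2\bigr) - C, \tag{5.37}
\]
and, in view of (5.32), we reach a contradiction.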

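To conclude we determine the asymptotic behavior, as $\lambda \to +\infty$, of the family of solutions given by (5.22).

Proposition 5.12. Let $(v_{1,\lambda}^*, v_{2,\lambda}^*)$ be given by (5.22) with $* = \pm$ or $\mp$. We have

• (case $* = \pm$): $c_1^\pm(w_{1,\lambda}^\pm, w_{2,\lambda}^\pm) \to \ln \frac12$, $c_2^\pm(w_{1,\lambda}^\pm, w_{2,\lambda}^\pm) \to -\infty$, $w_{1,\lambda}^\pm \to 0$ strongly in $H^1(\Omega)$ as $\lambda \to +\infty$; along any sequence $\lambda_n \to +\infty$, there exists a subsequence (still denoted by $\lambda_n$) such that $w_{2,\lambda_n}^\pm \to w_\pm$ strongly in $H^1(\Omega)$ with $w_\pm$ satisfying:
\[
\begin{cases}
-\Delta w = 4\pi \Bigl(\dfrac{h_2 e^{w}}{\int_\Omega h_2 e^{w}} - \dfrac{1}{|\Omega|}\Bigr) & \text{in } \Omega, \\[1.5ex]
\int_\Omega w = 0.
\end{cases} \tag{5.38}
\]
In particular,
\[
e^{v_{1,\lambda}^\pm} \to \frac12 \quad \text{and} \quad e^{v_{2,\lambda}^\pm} \to 0, \quad \text{strongly in } H^1(\Omega), \text{ as } \lambda \to +\infty. \tag{5.39}
\]

• (case $* = \mp$): $c_1^\mp(w_{1,\lambda}^\mp, w_{2,\lambda}^\mp) \to -\infty$, $c_2^\mp(w_{1,\lambda}^\mp, w_{2,\lambda}^\mp) \to \ln \frac12$, $w_{2,\lambda}^\mp \to 0$ strongly in $H^1(\Omega)$ as $\lambda \to +\infty$; along any sequence $\lambda_n \to +\infty$, there exists a subsequence (still denoted by $\lambda_n$) such that $w_{1,\lambda_n}^\mp \to w_\mp$ strongly in $H^1(\Omega)$ with $w_\mp$ satisfying:
\[
\begin{cases}
-\Delta w = 4\pi \Bigl(\dfrac{h_1 e^{w}}{\int_\Omega h_1 e^{w}} - \dfrac{1}{|\Omega|}\Bigr) & \text{in } \Omega, \\[1.5ex]
\int_\Omega w = 0.
\end{cases} \tag{5.40}
\]
In particular,
\[
e^{v_{1,\lambda}^\mp} \to 0 \quad \text{and} \quad e^{v_{2,\lambda}^\mp} \to \frac12, \quad \text{strongly in } H^1(\Omega), \text{ as } \lambda \to +\infty. \tag{5.41}
\]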

Proof. As usual we only need to prove the result in case ∗ = ±, the other case follows by exchanging the role between the indices. Recall that in this case N1 = 0, and hence h1 = 1, N2 = 1. By (5.37) we have ± 2 ∇wi,λ 2 ≤ C,

for suitable C > 0 (independent of λ) and i = 1, 2. Consequently, ± hi ewi,λ ≤ C (i = 1, 2) 1≤ (

(5.42)

(5.43)

with C > 0 independent of λ. ± ± ± = ci± (w1,λ , w2,λ ), we have: Thus, by setting ci,λ ± = c2,λ

and ± c1,λ

8π (2N2 + N1 ) 1 + o ± 9λ h2 ew2,λ λ (

w± 1 ( e 1,λ 4π 1 = + + N )(N − 9N ) + o (N 1 2 2 1 ± 2 9λ λ e2w1,λ

(5.44)

(5.45)

(

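as $\lambda \to +\infty$. From (5.43) we derive immediately that $c^\pm_{2,\lambda} \to -\infty$, as $\lambda \to +\infty$. Furthermore, in view of (5.28) and (5.42), along any sequence $\lambda_n \to +\infty$, we find a subsequence (still denoted by $\lambda_n$) such that

$$w_{1,n} := w^\pm_{1,\lambda_n} \to 0 \quad \text{weakly in } H^1(\Omega), \text{ strongly in } L^p(\Omega),\ p \ge 1, \text{ and pointwise a.e. in } \Omega.$$

Analogously, for suitable $w_\pm \in H^1(\Omega)$, we may claim that

$$w_{2,n} := w^\pm_{2,\lambda_n} \to w_\pm \quad \text{weakly in } H^1(\Omega), \text{ strongly in } L^p(\Omega),\ p \ge 1, \text{ and pointwise a.e. in } \Omega.$$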

Vortex Condensates for SU(3) Chern–Simons Theory


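Note in particular that $e^{w_{1,n}} \to 1$, $e^{w_{2,n}} \to e^{w_\pm}$ in $L^p(\Omega)$, $\forall\, p \ge 1$. Thus, using (2.10) (with $|\Omega| = 1$) together with (5.44)–(5.45), we find:

$$\begin{aligned} -\Delta(w_{1,n} + 2w_{2,n}) &= 3\lambda_n h_2 e^{c^\pm_{2,n}} e^{w_{2,n}}\left(1 + e^{c^\pm_{1,n}} e^{w_{1,n}} - 2e^{c^\pm_{2,n}} e^{w_{2,n}}\right) - 4\pi(2N_2 + N_1) \\ &= 4\pi(2N_2 + N_1)\left(\frac{h_2 e^{w_{2,n}}}{\int_\Omega h_2 e^{w_{2,n}}} - \frac{1}{|\Omega|}\right) + \varphi_n \quad \text{in } \Omega \end{aligned}$$

with $c^\pm_{i,n} := c^\pm_{i,\lambda_n}$ ($i = 1, 2$) and $\varphi_n \to 0$ strongly in $L^p(\Omega)$, $\forall\, p \ge 1$. Consequently, by elliptic regularity theory, we obtain (after taking a subsequence if necessary)

$$\tfrac{1}{2} w_{1,n} + w_{2,n} \to w_\pm \quad \text{strongly in } C^{1,\alpha}(\Omega),\ \alpha \in (0, 1) \tag{5.46}$$

and $w_\pm$ satisfies:

$$\begin{cases} -\Delta w = 2\pi(2N_2 + N_1)\left(\dfrac{h_2 e^{w}}{\int_\Omega h_2 e^{w}} - \dfrac{1}{|\Omega|}\right) & \text{in } \Omega, \\[2mm] \int_\Omega w = 0, \end{cases} \tag{5.47}$$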

with N2 = 1 and N1 = 0, namely (5.40). On the other hand, if we insert (5.44)-(5.45) into the first equation in (2.10) (with N1 = 0 and N2 = 1) we get: 2w1,n ( ( ew1,n )2 8π h2 ew2,n e ew1,n 2w − w 5w1,n = λn 2w + (2 − ew1,n ) w2,n 1,n 1,n 1,n 9 e h e e e ( ( 2 ( ( 8π w1,n w1,n + e (2e − 1) + ψn in ( 9 with ψn → 0 strongly in Lp ((), ∀p ≥ 1. Therefore, 2w1,n ( ( ew1,n )2 e ew1,n 2 2w − w ∇w1,n 2 = −λn 2w w1,n − fn w1,n , (5.48) 1,n 1,n 1,n ( ( (e (e (e w± 8π h2 e 9 (1 + ( h2 ew± )

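where $f_n$ converges to the limit displayed above in $L^p(\Omega)$, $\forall\, p \ge 1$. Note that the function

$$h(t) := \frac{\int_\Omega e^{t w_{1,n}} w_{1,n}}{\int_\Omega e^{t w_{1,n}}}$$

is increasing in $t \in \mathbb{R}$, since

$$h'(t) = \frac{\int_\Omega e^{t w_{1,n}} w_{1,n}^2}{\int_\Omega e^{t w_{1,n}}} - \left(\frac{\int_\Omega e^{t w_{1,n}} w_{1,n}}{\int_\Omega e^{t w_{1,n}}}\right)^2 = \int_\Omega \frac{e^{t w_{1,n}}}{\int_\Omega e^{t w_{1,n}}}\left(w_{1,n} - \frac{\int_\Omega e^{t w_{1,n}} w_{1,n}}{\int_\Omega e^{t w_{1,n}}}\right)^2 \ge 0 \quad \forall\, t \in \mathbb{R}.$$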
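The monotonicity of $h$ can be illustrated numerically. The following is only a sketch under stated assumptions: the domain is replaced by four cells of equal measure, and the sample values of $w_{1,n}$ (the names `SAMPLES`, `CELLS`, `tilted_mean`, `tilted_variance`) are hypothetical, chosen for illustration and not taken from the paper. It checks that $h'(t)$ equals a variance with respect to the normalised weight $e^{tw}/\int e^{tw}$, and hence is non-negative.

```python
import math

def tilted_mean(t, values, weights):
    """h(t): mean of `values` under the measure weights * exp(t * values)."""
    tilted = [q * math.exp(t * v) for v, q in zip(values, weights)]
    total = sum(tilted)
    return sum(m * v for m, v in zip(tilted, values)) / total

def tilted_variance(t, values, weights):
    """h'(t): variance of `values` under the same tilted measure."""
    tilted = [q * math.exp(t * v) for v, q in zip(values, weights)]
    total = sum(tilted)
    mean = sum(m * v for m, v in zip(tilted, values)) / total
    return sum(m * (v - mean) ** 2 for m, v in zip(tilted, values)) / total

# Hypothetical samples of w on four cells of equal measure (total measure 1).
SAMPLES = [-1.0, 0.0, 0.5, 2.0]
CELLS = [0.25, 0.25, 0.25, 0.25]

def central_difference(t, eps=1e-5):
    """Finite-difference approximation of h'(t), used only as a cross-check."""
    upper = tilted_mean(t + eps, SAMPLES, CELLS)
    lower = tilted_mean(t - eps, SAMPLES, CELLS)
    return (upper - lower) / (2 * eps)
```

Since the variance is non-negative for every $t$, the difference `tilted_mean(2, ...) - tilted_mean(1, ...)` is non-negative, which is the discrete counterpart of $h(2) - h(1) \ge 0$.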

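Thus,

$$\int_\Omega \left(\frac{e^{2w_{1,n}}}{\int_\Omega e^{2w_{1,n}}} - \frac{e^{w_{1,n}}}{\int_\Omega e^{w_{1,n}}}\right) w_{1,n} = h(2) - h(1) \ge 0,$$

and from (5.48) we derive

$$\|\nabla w_{1,n}\|_2 \to 0 \quad \text{as } n \to +\infty; \tag{5.49}$$

$$\lambda_n \int_\Omega \left(\frac{e^{2w_{1,n}}}{\int_\Omega e^{2w_{1,n}}} - \frac{e^{w_{1,n}}}{\int_\Omega e^{w_{1,n}}}\right) w_{1,n} \to 0 \quad \text{as } n \to +\infty. \tag{5.50}$$

Taking into account (5.46), we can also assert that $w_{2,n} \to w_\pm$ strongly in $H^1(\Omega)$.

Since (5.49) holds along any sequence $\lambda_n \to +\infty$, we may conclude that $w^\pm_{1,\lambda} \to 0$ strongly in $H^1(\Omega)$ as $\lambda \to +\infty$. Finally, from (5.45) we get $c^\pm_{1,\lambda} \to \ln\frac{1}{2}$, as $\lambda \to +\infty$. This concludes the proof.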

Clearly, Theorem 2.2 is now an immediate consequence of Corollary 5.6 and Proposition 5.12.

Final remarks. It is an interesting open problem to know whether Theorem 2.2 remains valid without the restriction $N_1 + N_2 = 1$. To test whether our approach could be generalized to a more general choice of $N_1, N_2 \in \mathbb{N}$, we could start by investigating the existence question for problems (5.13) and (5.47). While problem (5.47) has already appeared in the abelian theory, see [34], and has been studied in [32] and [11], the elliptic system (5.13) is a novelty of the SU(3)-theory and is certainly worth investigating.

Acknowledgements. The authors wish to express their gratitude to G. Dunne for useful comments.

References 1. Abrikosov, A.: On the magnetic properties of superconductors of the second group. Soviet Phys. JETP 5, 1174–1182 (1957) 2. Ambrosetti, A. and Rabinowitz, P.H.: Dual variational methods in critical points theory and applications. J. Funct. Anal. 14, 349–381 (1973) 3. Aubin, T.: Nonlinear analysis on manifolds. Monge-Ampere equations. Berlin: Springer-Verlag, 1982 4. Caffarelli, L. andYang,Y.: Vortex condensation in the Chern–Simons–Higgs model: An existence theorem. Commun. Math. Phys. 168, 321–366 (1995) 5. Carter, R.: Simple groups of Lie type. New York: Wiley, 1972 6. Chae, D. and Imanuvilov, O.: The existence of non-topological multivortex solutions in the relativistic selfdual Chern–Simons theory. Preprint, 1997 7. Chae, D. and Imanuvilov, O.: Non-topological multivortex solutions of the selfdual Maxwell Chern– Simons–Higgs theory. Preprint, 1998 8. Chae, D. and Kim, N.: Topological multivortex solutions of the selfdual Maxwell Chern–Simons–Higgs system. J. Diff. Eq. 134, 154–182 (1997) 9. Chae, D. and Kim, N.: Vortex condensates in the Relativistic selfdual Maxwell Chern–Simons–Higgs system, Preprint, 1998 10. Ding, W., Jost, J., Li, J., Peng, X. and Wang, G.: Self duality equations for Ginzburg–Landau and Seiberg– Witten type functionals with 6th order potential. Preprint, 1999 11. Ding, W., Jost, J., Li, J. and Wang, G.: The differential equation &u = 8π −8π heu on a compact Riemann surface. Asian J. Math. 1, 230–248 (1997) 12. Ding, W., Jost, J., Li, J. and Wang, G.: An analysis of the two-vortex case in the Chern–Simons–Higgs model, Calc. Var. and PDE 7, 87–97 (1998) 13. Dunne, G.: Mass degeneracies in self-dual models. Phys. Lett. B 345, 452–457 (1995) 14. Dunne, G.: Selfdual Chern–Simons theories. Lectures Notes in Physics, vol. 36, Berlin, New York: Springer-Verlag, 1995


15. Dunne, G.: Vacuum mass spectra for SU(N) self-dual Chern–Simons–Higgs. Nucl. Phys. B 433, 333–348 (1995) 16. Fontana, L.: Sharp borderline Sobolev inequalities on compact Riemannian manifolds. Comment. Math. Helv. 68, 415–454 (1993) 17. Ghoussoub, N.: Duality and perturbation methods in critical point theory. Cambridge: Cambridge University Press, 1993 18. Hong, J., Kim, Y. and Pac, P.: Multivortex solutions of the Abelian Chern–Simons theory. Phys. Rev. Lett. 64, 2230–2233 (1990) 19. Humphreys, J.: Introduction to Lie Algebras and Representation Theory,. Berlin–Heidelberg–New York: Springer-Verlag, 1990 20. Jackiw, R. and Weinberg, E.: Selfdual Chern–Simons vortices. Phys. Rev. Lett. 64, 2234–2237 (1990) 21. Julia, B. and Zee, A.: Poles with both magnetic and electric charges in non-Abelian gauge theory. Phys. Rev. D 11, 2227–1232 (1975) 22. Kao, H. and Lee, K.: Selfdual SU(3) Chern–Simons-Higgs systems. Phys. Rev. D 50, 6626–6635 (1994) 23. Lee, C., Lee, K. and Min, H.: Self-Dual Maxwell ChernSimons solitons. Phys. Lett. B 252, 79–83 (1990) 24. Lee, K.: Relativistic nonabelian Chern–Simons systems. Phys. Lett. B 225, 381–384 (1991) 25. Lee, K.: Selfdual nonabelian Chern–Simons solitons. Phys. Rev. Lett. 66, 553–555 (1991) 26. Nielsen, H. and Olesen, P.: Vortex-line models for dual strings. Nucl. Phys. B 61, 45–61 (1973) 27. Nolasco, M. and Tarantello, G.: On a sharp Sobolev type inequality on two dimensional compact manifolds. Arch. Rational Mech. Anal. 145, 161–195 (1998) 28. Nolasco, M. and Tarantello, G.: Double vortex condensates in the Chern–Simons–Higgs theory. Calc. Var. and PDE 9, 31–94 (1999) 29. Ricciardi, T. and Tarantello, G.: Self-dual vortices in the Maxwell–Chern–Simons–Higgs theory. Comm. Pure Appl. Math. (in press) 30. Spruck, J. and Yang, Y.: The existence of non-topological solutions in the self-dual Chern–Simons theory. Commun. Math. Phys. 149, 361–376 (1992) 31. Spruck, J. 
and Yang, Y.: Topological solutions in the self-dual Chern–Simons theory: Existence and approximation. Ann. I.H.P. Analyse Nonlin. 12, 75–97 (1995) 32. Struwe, M. and Tarantello, G.: On multivortex solutions in Chern–Simons gauge theory. Boll. Un. Mat. Ital. Sez. B Artic. Ric. Mat. 1, 109–121 (1998) 33. ’t Hooft, G.: A property of electric and magnetic flux in nonabelian gauge theories. Nucl. Phys. B 153, 141–160 (1979) 34. Tarantello, G.: Multiple condensates solutions for the Chern–Simons–Higgs theory. J. Math. Phys. 37, 3769–3796 (1996) 35. Taubes, C.: Arbitrary n-vortex solutions to the first order Ginzburg–Landau equations. Commun. Math. Phys. 72, 277–292 (1980) 36. Wang, G. and Zhang, L.: Non-Topological solutions of relativistic SU(3) Chern–Simons Higgs model. Commun. Math. Phys. 202, 501–515 (1999) 37. Wang, R. The existence of Chern–Simons vortices. Commun. Math. Phys. 137, 587–597 (1991) 38. Yang, Y.: The Relativistic Non-Abelian Chern–Simons Equations. Commun. Math. Phys. 186, 199–218 (1997) Communicated by T. Miwa

Commun. Math. Phys. 213, 641 – 672 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Hyper-Kähler Hierarchies and Their Twistor Theory

Maciej Dunajski, Lionel J. Mason

The Mathematical Institute, 24-29 St Giles, Oxford OX1 3LB, UK. E-mail: [email protected]

Received: 27 January 2000 / Accepted: 20 March 2000

Abstract: A twistor construction of the hierarchy associated with the hyper-Kähler equations on a metric (the anti-self-dual Einstein vacuum equations, ASDVE, in four dimensions) is given. The recursion operator $R$ is constructed and used to build an infinite-dimensional symmetry algebra and in particular higher flows for the hyper-Kähler equations. It is shown that $R$ acts on the twistor data by multiplication with a rational function. The structures are illustrated by the example of the Sparling–Tod (Eguchi–Hansen) solution. An extended space-time $\mathcal{N}$ is constructed whose extra dimensions correspond to higher flows of the hierarchy. It is shown that $\mathcal{N}$ is a moduli space of rational curves with normal bundle $\mathcal{O}(n) \oplus \mathcal{O}(n)$ in twistor space and is canonically equipped with a Lax distribution for ASDVE hierarchies. The space $\mathcal{N}$ is shown to be foliated by four dimensional hyper-Kähler slices. The Lagrangian, Hamiltonian and bi-Hamiltonian formulations of the ASDVE in the form of the heavenly equations are given. The symplectic form on the moduli space of solutions to heavenly equations is derived, and is shown to be compatible with the recursion operator.

1. Introduction

Roger Penrose's twistor theory gives rise to correspondences between solutions to differential equations on the one hand and unconstrained holomorphic geometry on the other. The two most prominent systems of nonlinear equations which admit such correspondences are the anti-self-dual vacuum Einstein equations (ASDVE) [23], which in Euclidean signature determine hyper-Kähler metrics, and the anti-self-dual Yang–Mills equations (ASDYM) [31]. Richard Ward [32] observed that many lower-dimensional integrable systems are symmetry reductions of ASDYM. This has led to an overview of the theory of integrable systems [20], which provides a classification of those lower-dimensional integrable systems that arise as reductions of the ASDYM equations and a


unification of the theory of such integrable equations as symmetry reduced versions of the corresponding theory of the ASDYM equations.

In [20], Lagrangian and Hamiltonian frameworks for ASDYM were described together with a recursion operator. This leads to the corresponding structures for symmetry reductions of the ASDYM equations. In this paper we investigate these structures for the second important system of equations, the ASDVE or hyper-Kähler equation (this system also admits known integrable systems as symmetry reductions [11]). We shall give a twistor-geometric construction of the hierarchies associated to the ASDVE in the "heavenly" forms due to Plebański [25].

In this context it is more natural to work with complex (holomorphic) metrics on complexified space-times and so we use the term ASDVE equations rather than hyper-Kähler equations. Our considerations will generally be local in space-time, which will be understood to be a region in $\mathbb{C}^4$.

In Sect. 2 we summarise the twistor correspondences for flat and curved spaces. We establish a spinor notation (which will not be essential for the subsequent sections) and recall basic facts about the ASD conformal condition and the geometry of the spin bundle. In Sect. 3 the recursion operator $R$ for the ASDVE is constructed as an integro-differential operator mapping solutions to the linearised heavenly equations to other solutions. We then use this to give an alternate development of the twistor correspondence by using $R$ to build a family of foliations by twistor surfaces. We show that $R$ corresponds to multiplication of the twistor data by a given twistor function. We then analyse the hidden symmetry algebra of the ASDVE, and use the recursion operator to construct Killing spinors. We illustrate the ideas using the example of the Sparling–Tod solution and show how $R$ can be used to construct rational curves with normal bundle $\mathcal{O}(1) \oplus \mathcal{O}(1)$ in the associated twistor space. In Sect. 4 we give the twistor construction for the ASDVE hierarchies. The higher commuting flows can be thought of as coordinates on an extended space-time. This extended space-time has a twistor correspondence: it is the moduli space of rational curves with normal bundle $\mathcal{O}(n) \oplus \mathcal{O}(n)$ in a twistor space. This moduli space is canonically equipped with the Lax distribution for ASDVE hierarchies, and conversely truncated hierarchies admit a Lax distribution that gives rise to such a twistor space. The Lax distribution can be interpreted as a connecting map in a long exact sequence of sheaves. In Sect. 5 we investigate the Lagrangian and Hamiltonian formulations of heavenly equations. The symplectic form on the moduli space of solutions to heavenly equations will be derived, and is shown to be compatible with the recursion operator.

We end this introduction with some bibliographical remarks. Significant progress towards understanding the symmetry structure of the heavenly equations was achieved by Boyer and Plebański [3, 4], who obtained an infinite number of conservation laws for the ASDVE equations and established some connections with the nonlinear graviton construction. Their results were later extended in papers of Strachan [26] and Takasaki [28, 29]. The present work is an extended version of [8–10].

2. Preliminaries

2.1. Spinor notation. We work in the holomorphic category with complexified space-times: thus space-time $M$ is a complex four-manifold equipped with a holomorphic metric $g$ and compatible volume form $\nu$. In four complex dimensions orthogonal transformations decompose into products of ASD and SD rotations

$$SO(4, \mathbb{C}) = (SL(2, \mathbb{C}) \times \widetilde{SL}(2, \mathbb{C}))/\mathbb{Z}_2. \tag{2.1}$$


The spinor calculus in four dimensions is based on this isomorphism. We use the conventions of Penrose and Rindler [24]. Indices will generally be assumed to be numerical if we work in any of the heavenly frames and otherwise abstract: $a, b, \ldots$, $a = 0, 1, \ldots, 3$ are four-dimensional space-time indices and $A, B, \ldots, A', B', \ldots$, $A = 0, 1$, etc. are two-dimensional spinor indices. The tangent space at each point of $M$ is isomorphic to a tensor product of the two spin spaces

$$T^a M = S^A \otimes S^{A'}. \tag{2.2}$$

The complex Lorentz transformation $V^a \to \Lambda^a{}_b V^b$, $\Lambda^a{}_b \Lambda^c{}_d\, g_{ac} = g_{bd}$, is equivalent to the composition of the SD and the ASD rotation

$$V^{AA'} \to \lambda^A{}_B V^{BB'} \lambda^{A'}{}_{B'},$$

where $\lambda^A{}_B$ and $\lambda^{A'}{}_{B'}$ are elements of $SL(2, \mathbb{C})$ and $\widetilde{SL}(2, \mathbb{C})$. Spin dyads $(o^A, \iota^A)$ and $(o^{A'}, \iota^{A'})$ span $S^A$ and $S^{A'}$ respectively. The spin spaces $S^A$ and $S^{A'}$ are equipped with symplectic forms $\varepsilon_{AB}$ and $\varepsilon_{A'B'}$ such that $\varepsilon_{01} = \varepsilon_{0'1'} = 1$. These anti-symmetric objects are used to raise and lower the spinor indices. We shall use normalised spin frames so that

$$o^B \iota^C - \iota^B o^C = \varepsilon^{BC}, \qquad o^{B'} \iota^{C'} - \iota^{B'} o^{C'} = \varepsilon^{B'C'}.$$

Let $e^{AA'}$ be a null tetrad of 1-forms on $M$ and let $\nabla_{AA'}$ be the frame of dual vector fields. The orientation is given by fixing the volume form

$$\nu = e^{01'} \wedge e^{10'} \wedge e^{11'} \wedge e^{00'}.$$

Apart from orientability, $M$ must satisfy some other topological restrictions for the global spinor fields to exist. We shall not take them into account as we work locally in $M$. The local basis $\Sigma^{AB}$ and $\Sigma^{A'B'}$ of spaces of ASD and SD two-forms are defined by

$$e^{AA'} \wedge e^{BB'} = \varepsilon^{AB} \Sigma^{A'B'} + \varepsilon^{A'B'} \Sigma^{AB}. \tag{2.3}$$

The first Cartan structure equations are

$$de^{AA'} = e^{BA'} \wedge \Gamma^A{}_B + e^{AB'} \wedge \Gamma^{A'}{}_{B'},$$

where $\Gamma_{AB}$ and $\Gamma_{A'B'}$ are the $SL(2, \mathbb{C})$ and $\widetilde{SL}(2, \mathbb{C})$ spin connection one-forms. They are symmetric in their indices, and

$$\Gamma_{AB} = \Gamma_{CC'AB}\, e^{CC'}, \qquad \Gamma_{A'B'} = \Gamma_{CC'A'B'}\, e^{CC'}, \qquad \Gamma_{CC'A'B'} = o_{A'} \nabla_{CC'} \iota_{B'} - \iota_{A'} \nabla_{CC'} o_{B'}.$$

The curvature of the spin connection $R^A{}_B = d\Gamma^A{}_B + \Gamma^A{}_C \wedge \Gamma^C{}_B$ decomposes as

$$R^A{}_B = C^A{}_{BCD} \Sigma^{CD} + (1/12) R\, \Sigma^A{}_B + \Phi^A{}_{BC'D'} \Sigma^{C'D'},$$

and similarly for $R^{A'}{}_{B'}$. Here $R$ is the Ricci scalar, $\Phi_{ABA'B'}$ is the trace-free part of the Ricci tensor $R_{ab}$, and $C_{ABCD}$ is the ASD part of the Weyl tensor

$$C_{abcd} = \varepsilon_{A'B'} \varepsilon_{C'D'} C_{ABCD} + \varepsilon_{AB} \varepsilon_{CD} C_{A'B'C'D'}.$$


2.2. The flat twistor correspondence. The flat twistor correspondence is a correspondence between points in complexified Minkowski space, $\mathbb{C}^4$ (or its conformal compactification), and holomorphic lines in $\mathbb{CP}^3$. The flat twistor correspondence has an invariant formulation in terms of spinors. A point in $\mathbb{C}^4$ has position vector with coordinates $(w, z, x, y)$. The isomorphism (2.2) is realised by

$$x^{AA'} := \begin{pmatrix} y & w \\ -x & z \end{pmatrix}, \quad \text{so that} \quad g = \varepsilon_{AB} \varepsilon_{A'B'}\, dx^{AA'} dx^{BB'}.$$

A two-plane in $\mathbb{C}^4$ is null if $g(X, Y) = 0$ for every pair $(X, Y)$ of vectors tangent to it. The null planes can be self-dual (SD) or anti-self-dual (ASD), depending on whether the tangent bi-vector $X \wedge Y$ is SD or ASD. The SD null planes are called α-planes. The α-planes passing through a point in $\mathbb{C}^4$ are parametrised by $\lambda = \pi_{0'}/\pi_{1'} \in \mathbb{CP}^1$. Tangents to α-planes are spanned by two vectors

$$L_A = \pi^{A'} \frac{\partial}{\partial x^{AA'}} \tag{2.4}$$

which form the kernel of $\pi_{A'} \pi_{B'} \Sigma^{A'B'}$. The set of all α-planes is called a projective twistor space and denoted $PT$. For $\mathbb{C}^4$ it is a three-dimensional complex manifold biholomorphic to $\mathbb{CP}^3 - \mathbb{CP}^1$.

The five complex dimensional correspondence space $F := \mathbb{C}^4 \times \mathbb{CP}^1$ fibres over $\mathbb{C}^4$ by $(x^{AA'}, \lambda) \to x^{AA'}$ and over $PT$ with fibres spanned by $L_A$. Twistor functions (functions on $PT$) pull back to functions on $F$ which are constant on α-planes, or equivalently satisfy $L_A f = 0$.

Twistor space can be covered by two coordinate patches $U$ and $\tilde{U}$, where $U$ is a complement of $\lambda = \infty$ and $\tilde{U}$ is a complement of $\lambda = 0$. If $(\mu^0, \mu^1, \lambda)$ are coordinates on $U$ and $(\tilde\mu^0, \tilde\mu^1, \tilde\lambda)$ are coordinates on $\tilde{U}$ then on the overlap

$$\tilde\mu^0 = \mu^0/\lambda, \qquad \tilde\mu^1 = \mu^1/\lambda, \qquad \tilde\lambda = 1/\lambda.$$

The local coordinates $(\mu^0, \mu^1, \lambda)$ on $PT$ pulled back to $F$ are

$$\mu^0 = w + \lambda y, \qquad \mu^1 = z - \lambda x, \qquad \lambda. \tag{2.5}$$

We can introduce homogeneous coordinates on the twistor space

$$(\omega^A, \pi_{A'}) = (\omega^0, \omega^1, \pi_{0'}, \pi_{1'}) := (\mu^0 \pi_{1'}, \mu^1 \pi_{1'}, \lambda \pi_{1'}, \pi_{1'}).$$

The point $x^{AA'} \in \mathbb{C}^4$ lies on the α-plane corresponding to the twistor $(\omega^A, \pi_{A'}) \in PT$ iff

$$\omega^A = x^{AA'} \pi_{A'}. \tag{2.6}$$

For $\pi_{A'} \neq 0$ and $(\omega^A, \pi_{A'})$ fixed, the solution to (2.6) is a complex two-plane with tangent vectors of the form $\pi^{A'} \nu^A$ for all $\nu^A$. Alternatively, if we fix $x^{AA'}$, then (2.6) defines a rational curve, $\mathbb{CP}^1$, in $PT$ with normal bundle $\mathcal{O}(1) \oplus \mathcal{O}(1)$.$^1$ Kodaira theory guarantees that the family of such rational curves in $PT$ is four complex dimensional. There is a canonical (quadratic) conformal structure $ds^2$ on $\mathbb{C}^4$: the points $p$ and $q$ are null separated with respect to $ds^2$ in $\mathbb{C}^4$ iff the corresponding rational curves $l_p$ and $l_q$ intersect in $PT$ at one point.

$^1$ Here $\mathcal{O}(n)$ denotes the line bundle over $\mathbb{CP}^1$ with transition functions $\lambda^{-n}$ from the set $\lambda \neq \infty$ to $\lambda \neq 0$ (i.e. Chern class $n$).
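The statement that null separation corresponds to intersecting twistor lines can be checked directly in the coordinates (2.5). The sketch below is illustrative only: the function names and the sample points are assumptions made here, and only the patch $\lambda \neq \infty$ is examined. It uses the fact that, for $x^{AA'}$ as above, the metric evaluated on a separation vector is $2(\Delta y\, \Delta z + \Delta w\, \Delta x)$.

```python
from fractions import Fraction

def twistor_point(point, lam):
    """Image (mu0, mu1, lam) of the space-time point (w, z, x, y) under (2.5)."""
    w, z, x, y = point
    return (w + lam * y, z - lam * x, lam)

def interval(p, q):
    """dy*dz + dw*dx for the separation of p and q (half of the flat metric g)."""
    dw, dz, dx, dy = (a - b for a, b in zip(p, q))
    return dy * dz + dw * dx

def common_lambda(p, q):
    """Finite lam at which the twistor lines of p and q meet, or None.

    Only the patch lam != infinity is examined; a meeting point at
    lam = infinity (dx = dy = 0) is not detected by this sketch.
    """
    dw, dz, dx, dy = (a - b for a, b in zip(p, q))
    # The lines meet where dw + lam*dy = 0 and dz - lam*dx = 0 hold together.
    if dy != 0:
        lam = Fraction(-dw) / dy
    elif dx != 0:
        lam = Fraction(dz) / dx
    else:
        return None
    meets = dw + lam * dy == 0 and dz - lam * dx == 0
    return lam if meets else None
```

For example, the hypothetical points $p = (1, 2, 3, 4)$ and $q = (-1, 5, 0, 2)$ have vanishing interval, and `common_lambda(p, q)` returns a single value of $\lambda$ at which `twistor_point` gives the same twistor for both; for a pair with non-zero interval the function returns `None`.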


2.3. Curved twistor spaces and the geometry of the primed spin bundle. Given a complex four-dimensional manifold $M$ with curved metric $g$, a twistor in $M$ is an α-surface, i.e. a null two-dimensional surface whose tangent space at each point is an α-plane. There are Frobenius integrability conditions for the existence of such α-surfaces through each α-plane element at each point and these are equivalent, after some calculation, to the vanishing of the self-dual part of the Weyl curvature, $C_{A'B'C'D'}$. Thus, given $C_{A'B'C'D'} = 0$, we can define a twistor space $PT$ to be the three complex dimensional manifold of α-surfaces in $M$. If $g$ is also Ricci flat then $PT$ has further structures which are listed in the Nonlinear Graviton Theorem:

Theorem 2.1 (Penrose [23]). There is a 1-1 correspondence between complex ASD vacuum metrics on complex four-manifolds and three dimensional complex manifolds $PT$ such that

• There exists a holomorphic projection $\mu : PT \to \mathbb{CP}^1$.
• $PT$ is equipped with a four complex parameter family of sections of $\mu$, each with a normal bundle $\mathcal{O}(1) \oplus \mathcal{O}(1)$ (this will follow from the existence of one such curve by Kodaira theory).
• Each fibre of $\mu$ has a symplectic structure $\Sigma_\lambda \in \Gamma(\Lambda^2(\mu^{-1}(\lambda)) \otimes \mathcal{O}(2))$, where $\lambda \in \mathbb{CP}^1$.

To obtain real metrics on a real 4-manifold, we can require further that the twistor space admit an anti-holomorphic involution.

The correspondence space $F = M \times \mathbb{CP}^1$ is coordinatized by $(x, \lambda)$, where $x$ denotes the coordinates on $M$ and $\lambda$ is the coordinate on $\mathbb{CP}^1$ that parametrises the α-surfaces through $x$ in $M$. We represent $F$ as the quotient of the primed-spin bundle $S^{A'}$ with fibre coordinates $\pi_{A'}$ by the Euler vector field $\Upsilon = \pi^{A'} \partial/\partial\pi^{A'}$. We relate the fibre coordinates to $\lambda$ by $\lambda = \pi_{0'}/\pi_{1'}$. A form with values in the line bundle $\mathcal{O}(n)$ on $F$ can be represented by a homogeneous form $\alpha$ on the non-projective spin bundle satisfying

$$\Upsilon \mathbin{\lrcorner} \alpha = 0, \qquad \mathcal{L}_\Upsilon \alpha = n\alpha.$$

The space $F$ possesses a natural two dimensional distribution called the twistor distribution, or Lax pair, to emphasise the analogy with integrable systems. The Lax pair on $F$ arises as the image under the projection $TS^{A'} \to TF$ of the distribution spanned by

$$\pi^{A'} \partial_{AA'} + \Gamma_{AA'B'C'}\, \pi^{A'} \pi^{B'} \frac{\partial}{\partial \pi_{C'}}$$

on $TS^{A'}$, where the $\partial_{AA'}$ are a null tetrad for the metric on $M$, and $\Gamma_{AA'B'C'}$ are the components of the spin connection in the associated spin frame ($\partial_{AA'} + \Gamma_{AA'B'C'}\, \pi^{B'} \frac{\partial}{\partial \pi_{C'}}$ is the horizontal distribution on $S^{A'}$). We can also represent the Lax pair on the projective spin bundle by$^2$

$$L_A = (\pi_{1'}^{-1})(\pi^{A'} \partial_{AA'} + f_A \partial_\lambda), \qquad \text{where } f_A = (\pi_{1'}^{-2})\, \Gamma_{AA'B'C'}\, \pi^{A'} \pi^{B'} \pi^{C'}. \tag{2.8}$$

The integrability of the twistor distribution is equivalent to $C_{A'B'C'D'} = 0$, the vanishing of the self-dual Weyl spinor. When the Ricci tensor vanishes also, a covariantly constant primed spin frame can be found so that $\Gamma_{AA'B'C'} = 0$. We assume this from now on.

The projective twistor space $PT$ arises as a quotient of $F$ by the twistor distribution. With the Ricci flat condition, the coordinate $\lambda$ descends to twistor space and $\pi_{A'}$ descends to the non-projective twistor space. It can be covered by two sets, $U = \{|\lambda| < 1 + \epsilon\}$ and $\tilde{U} = \{|\lambda| > 1 - \epsilon\}$. On the non-projective space we can introduce extra coordinates $\omega^A$ of homogeneity degree one so that ($(\omega^A, \pi_{A'})$, $\pi_{A'} \neq \iota_{A'}$) are homogeneous coordinates on $U$ and similarly ($(\tilde\omega^A, \pi_{A'})$, $\pi_{A'} \neq o_{A'}$) on $\tilde{U}$. The twistor space $PT$ is then determined by the transition function $\tilde\omega^B = \tilde\omega^B(\omega^A, \pi_{A'})$ on $U \cap \tilde{U}$.

The correspondence space has the alternate definition

$$F = PT \times M|_{Z \in l_x} = M \times \mathbb{CP}^1,$$

where $l_x$ is the line in $PT$ that corresponds to $x \in M$ and $Z \in PT$ lies on $l_x$. This leads to a double fibration

$$M \xleftarrow{\ p\ } F \xrightarrow{\ q\ } PT. \tag{2.9}$$

The existence of $L_A$ can also be deduced directly from the correspondence. From [23], points in $M$ correspond to rational curves in $PT$ with normal bundle $\mathcal{O}^A(1) := \mathcal{O}(1) \oplus \mathcal{O}(1)$. The normal bundle to $l_x$ consists of vectors tangent to $x$ (horizontally lifted to $T_{(x,\lambda)}F$) modulo the twistor distribution. Therefore we have a sequence of sheaves over $\mathbb{CP}^1$

$$0 \to D \to \mathbb{C}^4 \to \mathcal{O}^A(1) \to 0.$$

The map $\mathbb{C}^4 \to \mathcal{O}^A(1)$ is given by $V^{AA'} \to V^{AA'} \pi_{A'}$. Its kernel consists of vectors of the form $\pi^{A'} \lambda^A$ with $\lambda^A$ varying. The twistor distribution is therefore $D = \mathcal{O}(-1) \otimes S^A$ and so there is a canonical $L_A \in \Gamma(D \otimes \mathcal{O}(1) \otimes S_A)$, as given in (2.8).

$^2$ Various powers of $\pi_{1'}$ in formulae like (2.8) guarantee the correct homogeneity. We usually shall omit them when working on the projective spin bundle. In a projection $S^{A'} \to F$ we shall use the replacement formula

$$\frac{\partial}{\partial \pi^{A'}} \to \frac{\pi_{A'}}{\pi_{1'}^2}\, \partial_\lambda. \tag{2.7}$$

This is because (on functions of $\lambda$)

$$\frac{\partial}{\partial \pi^{A'}}\, \frac{\pi_{0'}}{\pi_{1'}} = \frac{\pi_{1'} o_{A'} - \pi_{0'} \iota_{A'}}{\pi_{1'}^2} = \frac{\pi_{A'}}{\pi_{1'}^2}.$$


2.4. Some formulations of the ASD vacuum condition. The ASD vacuum conditions $C_{A'B'C'D'} = 0$, $\Phi_{ABA'B'} = 0 = R$ imply the existence of a normalised, covariantly constant frame $(o^{A'}, \iota^{A'})$ of $S^{A'}$, so that $\Gamma_{AA'B'C'} = 0$. One can further choose an unprimed spin frame so that the Lax pair (2.8) consists of volume-preserving vector fields on $M$:

Proposition 2.2 (Mason and Newman [18]). Let $\tilde\nabla_{AA'} = (\tilde\nabla_{00'}, \tilde\nabla_{01'}, \tilde\nabla_{10'}, \tilde\nabla_{11'})$ be four independent holomorphic vector fields on a four-dimensional complex manifold $M$ and let $\nu$ be a nonzero holomorphic four-form. Put

$$L_0 = \tilde\nabla_{00'} - \lambda \tilde\nabla_{01'}, \qquad L_1 = \tilde\nabla_{10'} - \lambda \tilde\nabla_{11'}. \tag{2.10}$$

Suppose that for every $\lambda \in \mathbb{CP}^1$,

$$[L_0, L_1] = 0, \qquad \mathcal{L}_{L_A} \nu = 0. \tag{2.11}$$

Here $\mathcal{L}_V$ denotes the Lie derivative. Then

$$\partial_{AA'} = f^{-1} \tilde\nabla_{AA'}, \qquad \text{where } f^2 := \nu(\tilde\nabla_{00'}, \tilde\nabla_{01'}, \tilde\nabla_{10'}, \tilde\nabla_{11'}),$$

is a null-tetrad for an ASD vacuum metric. Every such metric locally arises in this way.

In [7] the last proposition is generalised to the hyper-Hermitian case. A choice of unprimed spin frame with $f^2 = 1$ is always possible and we shall assume this from here on, so that $\nabla_{AA'} = \tilde\nabla_{AA'}$. For easy reference we rewrite the field equations (2.11) in full:

$$[\nabla_{A0'}, \nabla_{B0'}] = 0, \tag{2.12}$$
$$[\nabla_{A0'}, \nabla_{B1'}] + [\nabla_{A1'}, \nabla_{B0'}] = 0, \tag{2.13}$$
$$[\nabla_{A1'}, \nabla_{B1'}] = 0. \tag{2.14}$$

Let $\Sigma^{A'B'}$ be the usual basis of SD two-forms. On the correspondence space, define

$$\Sigma(\lambda) := \Sigma^{A'B'} \pi_{A'} \pi_{B'}. \tag{2.15}$$

The formulation of the ASDVE condition dual to (2.11) is:

Proposition 2.3 (Plebański [25], Gindikin [12]). If a two-form of the form $\Sigma(\lambda) := \Sigma^{A'B'} \pi_{A'} \pi_{B'}$ on the correspondence space satisfies

$$d_h \Sigma(\lambda) = 0, \qquad \Sigma(\lambda) \wedge \Sigma(\lambda) = 0, \tag{2.16}$$

where $d_h$ is the exterior derivative holding $\pi_{A'}$ constant, then there exist one-forms $e^{AA'}$ related to $\Sigma^{A'B'}$ by Eq. (2.3) which give an ASD vacuum tetrad.


Note that the simplicity condition in (2.16) arises from the condition that $\Sigma^{A'B'}$ comes from a tetrad. To construct Gindikin's two-form starting from the twistor space, one can pull back the fibrewise complex symplectic structure on $PT \to \mathbb{CP}^1$ to the projective spin bundle and fix the ambiguity by requiring that it annihilates vectors tangent to the fibres. The resulting two-form is $\mathcal{O}(2)$ valued. (To obtain Gindikin's two-form one should divide it by a constant section of $\mathcal{O}(2)$.)

Put $\Sigma^{0'0'} = -\tilde\alpha$, $\Sigma^{0'1'} = \omega$, $\Sigma^{1'1'} = \alpha$. The second equation in (2.16) becomes

$$\omega \wedge \omega = 2\alpha \wedge \tilde\alpha := -2\nu, \qquad \alpha \wedge \omega = \tilde\alpha \wedge \omega = \alpha \wedge \alpha = \tilde\alpha \wedge \tilde\alpha = 0.$$

Equations (2.16) can be seen to arise from (2.11) by observing that $\Sigma(\lambda)$ can be defined by $\varepsilon_{AB} \Sigma(\lambda) = \nu(L_A, L_B, \ldots, \ldots)$. Note also that $L_A$ spans a two-dimensional distribution annihilating $\Sigma(\lambda)$.

The two one-forms $e^A := \pi_{A'} e^{AA'}$ by definition annihilate the twistor distribution. Define (1, 1) tensors $\partial^{B'}_{A'} := e^{AB'} \otimes \nabla_{AA'}$ so that

$$e^A \otimes L_A = \pi_{B'} \pi^{A'} \partial^{B'}_{A'} = \partial_0 + \lambda(\partial - \tilde\partial) - \lambda^2 \partial_2,$$

where $(\partial^{0'}_{0'}, \partial^{1'}_{0'}, \partial^{0'}_{1'}, \partial^{1'}_{1'}) = (\partial, \partial_0, \partial_2, \tilde\partial)$. If the field equations are satisfied then the Euclidean slice of $M$ is equipped with three integrable complex structures given by $J_i := \{i(\partial_2 - \partial_0), (\partial - \tilde\partial), (\partial_2 + \partial_0)\}$ and three symplectic structures $\omega_i = \{i(\alpha - \tilde\alpha), i\omega, (\alpha + \tilde\alpha)\}$ compatible with the $J_i$. It is therefore a hyper-Kähler manifold.

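2.5. The ASD condition and heavenly equations. Part of the residual gauge freedom in (2.11) is fixed by selecting one of Plebański's null coordinate systems.

1. Equations (2.13) and (2.14) imply the existence of a coordinate system $(w, z, \tilde w, \tilde z) =: (w^A, \tilde w^A)$ and a complex-valued function $\Omega$ such that

$$\partial_{AA'} = \begin{pmatrix} \Omega_{w\tilde w} \partial_{\tilde z} - \Omega_{w\tilde z} \partial_{\tilde w} & \partial_w \\ \Omega_{z\tilde w} \partial_{\tilde z} - \Omega_{z\tilde z} \partial_{\tilde w} & \partial_z \end{pmatrix} = \begin{pmatrix} \dfrac{\partial^2 \Omega}{\partial w^A \partial \tilde w^B} \dfrac{\partial}{\partial \tilde w_B} & \dfrac{\partial}{\partial w^A} \end{pmatrix}. \tag{2.17}$$

Equation (2.12) yields the first heavenly equation

$$\Omega_{w\tilde z} \Omega_{z\tilde w} - \Omega_{w\tilde w} \Omega_{z\tilde z} = 1 \quad \text{or} \quad \frac{1}{2} \frac{\partial^2 \Omega}{\partial w_A \partial \tilde w_B} \frac{\partial^2 \Omega}{\partial w^A \partial \tilde w^B} = 1. \tag{2.18}$$

The dual tetrad is

$$e^{A1'} = dw^A, \qquad e^{A0'} = \frac{\partial^2 \Omega}{\partial w_A \partial \tilde w^B}\, d\tilde w^B \tag{2.19}$$

with the flat solution $\Omega = w^A \tilde w_A$. The only nontrivial part of $\Sigma^{A'B'}$ is $\Sigma^{0'1'} = \partial \tilde\partial \Omega$ so that $\Omega$ is a Kähler scalar. The Lax pair for the first heavenly equation is

$$L_0 := \Omega_{w\tilde w} \partial_{\tilde z} - \Omega_{w\tilde z} \partial_{\tilde w} - \lambda \partial_w, \qquad L_1 := \Omega_{z\tilde w} \partial_{\tilde z} - \Omega_{z\tilde z} \partial_{\tilde w} - \lambda \partial_z. \tag{2.20}$$

Equations $L_0 \Psi = L_1 \Psi = 0$ have solutions provided that $\Omega$ satisfies the first heavenly equation (2.18). Here $\Psi$ is a function on $F$.

2. Alternatively Eqs. (2.12) and (2.13) imply the existence of a complex-valued function $\Theta$ and coordinate system $(w, z, x, y) =: (w^A, x_A)$, $w^A$ as above, such that

$$\partial_{AA'} = \begin{pmatrix} \partial_y & \partial_w + \Theta_{yy} \partial_x - \Theta_{xy} \partial_y \\ -\partial_x & \partial_z - \Theta_{xy} \partial_x + \Theta_{xx} \partial_y \end{pmatrix} = \begin{pmatrix} \dfrac{\partial}{\partial x^A} & \dfrac{\partial}{\partial w^A} + \dfrac{\partial^2 \Theta}{\partial x^A \partial x^B} \dfrac{\partial}{\partial x_B} \end{pmatrix}. \tag{2.21}$$

As a consequence of (2.14) $\Theta$ satisfies the second heavenly equation

$$\Theta_{xw} + \Theta_{yz} + \Theta_{xx} \Theta_{yy} - \Theta_{xy}^2 = 0 \quad \text{or} \quad \frac{\partial^2 \Theta}{\partial w^A \partial x_A} + \frac{1}{2} \frac{\partial^2 \Theta}{\partial x^B \partial x^A} \frac{\partial^2 \Theta}{\partial x_B \partial x_A} = 0. \tag{2.22}$$

The dual frame is given by

$$e^{A0'} = dx^A + \frac{\partial^2 \Theta}{\partial x^B \partial x_A}\, dw^B, \qquad e^{A1'} = dw^A \tag{2.23}$$

with $\Theta = 0$ defining the flat metric. The Lax pair corresponding to (2.22) is

$$L_0 = \partial_y - \lambda(\partial_w - \Theta_{xy} \partial_y + \Theta_{yy} \partial_x), \qquad L_1 = \partial_x + \lambda(\partial_z + \Theta_{xx} \partial_y - \Theta_{xy} \partial_x). \tag{2.24}$$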
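As a concrete check of the coordinate form of (2.22), the sketch below evaluates the left-hand side $\Theta_{xw} + \Theta_{yz} + \Theta_{xx}\Theta_{yy} - \Theta_{xy}^2$ exactly for polynomial $\Theta$. The small polynomial helpers and the trial function $\Theta = x y^2 + \frac{4}{3} z y^3$ are assumptions introduced here for illustration; they are not taken from the paper, and no claim is made about the geometry of the resulting metric.

```python
from fractions import Fraction

VARS = ("w", "z", "x", "y")  # order of exponents in each monomial key

def poly(terms):
    """Polynomial as {exponent tuple: coefficient}, with zero terms dropped."""
    out = {}
    for exps, coeff in terms:
        out[exps] = out.get(exps, 0) + Fraction(coeff)
    return {e: c for e, c in out.items() if c != 0}

def add(p, q, sign=1):
    """Return p + sign * q."""
    return poly(list(p.items()) + [(e, sign * c) for e, c in q.items()])

def mul(p, q):
    """Return the product p * q."""
    return poly([(tuple(a + b for a, b in zip(e1, e2)), c1 * c2)
                 for e1, c1 in p.items() for e2, c2 in q.items()])

def diff(p, var):
    """Partial derivative of p with respect to one of VARS."""
    i = VARS.index(var)
    return poly([(e[:i] + (e[i] - 1,) + e[i + 1:], c * e[i])
                 for e, c in p.items() if e[i] > 0])

def second_heavenly_residual(theta):
    """Left-hand side of (2.22) in coordinates; {} means the equation holds."""
    t_xw = diff(diff(theta, "x"), "w")
    t_yz = diff(diff(theta, "y"), "z")
    t_xx = diff(diff(theta, "x"), "x")
    t_yy = diff(diff(theta, "y"), "y")
    t_xy = diff(diff(theta, "x"), "y")
    linear = add(t_xw, t_yz)
    determinant = add(mul(t_xx, t_yy), mul(t_xy, t_xy), -1)
    return add(linear, determinant)

# Hypothetical trial function: Theta = x*y**2 + (4/3)*z*y**3.
TRIAL = poly([((0, 0, 1, 2), 1), ((0, 1, 0, 3), Fraction(4, 3))])
```

Here $\Theta_{yz} = 4y^2$ cancels $-\Theta_{xy}^2 = -4y^2$ while $\Theta_{xx} = 0$, so `second_heavenly_residual(TRIAL)` is the empty polynomial; by contrast $\Theta = x^2 y^2$ leaves the non-zero residual $-12x^2y^2$.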

Both heavenly equations were originally derived by Plebański [25] from the formulation (2.16). The closure condition is used, via Darboux's theorem, to introduce $\omega^A$, canonical coordinates on the spin bundle, holomorphic around $\lambda = 0$ such that the two-form (2.15) is $\Sigma(\lambda) = d_h \omega^A \wedge d_h \omega_A$. The various forms of the heavenly equations can be obtained by adapting different coordinates and gauges to these forms.

3. The Recursion Operator

In Subsect. 3.1 the recursion operator $R$ for the anti-self-dual Einstein vacuum equations is constructed. In Subsect. 3.2 we then show that the generating function for $R^i \phi$ is automatically a twistor function, and is in fact a Čech representative for $\phi$. It is shown that $R$ acts on such a twistor function by multiplication. A similar application to the coordinates used in the heavenly equations yields the coordinate description of the twistor space starting. In Subsect. 3.3 we show that the action of the recursion operator on space-time corresponds to multiplication of the corresponding twistor functions by $\lambda$. In Subsect. 3.4 the algebra of hidden symmetries of the second heavenly equation is constructed by applying the recursion operator to the explicit symmetries. In Subsect. 3.5, $R$ is used to build a higher valence Killing spinor corresponding to hidden symmetries. In the last subsections examples of the use of the recursion operator are given.


3.1. The recursion relations. The recursion operator $R$ is a map from the space of linearised perturbations of the ASDVE equations to itself. This can be used to construct the ASDVE hierarchy whose higher flows are generated by acting on one of the coordinate flows with the recursion operator $R$. We will identify the space of linearised perturbations to the ASDVE equations with solutions to the background coupled wave equations in two ways as follows.

Lemma 3.1. Let $\Box_\Omega$ and $\Box_\Theta$ denote wave operators on the ASD background determined by $\Omega$ and $\Theta$ respectively. Linearised solutions to (2.18) and (2.22) satisfy

$$\Box_\Omega\, \delta\Omega = 0, \qquad \Box_\Theta\, \delta\Theta = 0. \tag{3.1}$$
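For the second heavenly equation the linearisation can be written out explicitly. The following worked step is a sketch obtained here by substituting $\Theta + \epsilon\, \delta\Theta$ into the coordinate form of (2.22) and keeping the terms linear in $\epsilon$; the grouping into the vector fields $\nabla_{A1'}$ uses the tetrad (2.21) and holds up to the overall sign fixed by the index-raising convention.

```latex
\begin{align*}
0 &= \delta\Theta_{xw} + \delta\Theta_{yz}
   + \Theta_{yy}\,\delta\Theta_{xx} + \Theta_{xx}\,\delta\Theta_{yy}
   - 2\,\Theta_{xy}\,\delta\Theta_{xy} \\
  &= \bigl(\partial_w + \Theta_{yy}\partial_x - \Theta_{xy}\partial_y\bigr)\,\partial_x\,\delta\Theta
   + \bigl(\partial_z - \Theta_{xy}\partial_x + \Theta_{xx}\partial_y\bigr)\,\partial_y\,\delta\Theta \\
  &= \bigl(\nabla_{01'}\,\partial_x + \nabla_{11'}\,\partial_y\bigr)\,\delta\Theta .
\end{align*}
```

On the flat background $\Theta = 0$ this reduces to $\delta\Theta_{xw} + \delta\Theta_{yz} = 0$.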

Proof. In both cases $\Box_g = \nabla_{A1'} \nabla^A{}_{0'}$ since

$$\Box_g = \frac{1}{\sqrt{g}}\, \partial_a (g^{ab} \sqrt{g}\, \partial_b) = g^{ab} \partial_a \partial_b + (\partial_a g^{ab}) \partial_b,$$

but $\partial_a g^{ab} = 0$ for both heavenly coordinate systems. For the first equation $(\partial \tilde\partial(\Omega + \delta\Omega))^2 = \nu$ implies

$$0 = (\partial \tilde\partial \Omega \wedge \partial \tilde\partial)\, \delta\Omega = d(\partial \tilde\partial \Omega \wedge (\partial - \tilde\partial)\, \delta\Omega) = d * d\, \delta\Omega.$$

Here $*$ is the Hodge star operator corresponding to $g$. For the second equation we make use of the tetrad (2.21) and perform coordinate calculations.

From now on we identify tangent spaces to the spaces of solutions to (2.18) and (2.22) with the space of solutions to the curved background wave equation, $W_g$. We will define the recursion operator on the space $W_g$. The above lemma shows that we can consider a linearised perturbation as an element of $W_g$ in two ways. These two will be related by the square of the recursion operator. The linearised vacuum metrics corresponding to $\delta\Omega$ and $\delta\Theta$ are

$$h^I_{AA'BB'} = \iota_{(A'} o_{B')} \nabla_{(A1'} \nabla_{B)0'}\, \delta\Omega, \qquad h^{II}_{AA'BB'} = o_{A'} o_{B'} \nabla_{A0'} \nabla_{B0'}\, \delta\Theta,$$

where $o^{A'} = (1, 0)$ and $\iota^{A'} = (0, 1)$ are the constant spin frame associated to the null tetrads given above. Given $\phi \in W_g$ we use the first of these equations to find $h^I$. If we put the perturbation obtained in this way on the LHS of the second equation and add an appropriate gauge term we obtain $\phi'$, the new element of $W_g$ that provides the $\delta\Theta$ which gives rise to

$$h^{II}_{ab} = h^I_{ab} + \nabla_{(a} V_{b)}. \tag{3.2}$$

To extract the recursion relations we must find $V$ such that $h^I_{AA'BB'} - \nabla_{(AA'} V_{BB')} = o_{A'} o_{B'} \chi_{AB}$. Take $V_{BB'} = o_{B'} \nabla_{B1'}\, \delta\Omega$, which gives

$$\nabla_{(AA'} V_{BB')} = -\iota_{(A'} o_{B')} \nabla_{(A0'} \nabla_{B)1'}\, \delta\Omega + o_{A'} o_{B'} \nabla_{A1'} \nabla_{B1'}\, \delta\Omega.$$

This reduces (3.2) to

$$\nabla_{A1'} \nabla_{B1'} \phi = \nabla_{A0'} \nabla_{B0'} \phi'. \tag{3.3}$$


Definition 3.2. Define the recursion operator $R : W_g \longrightarrow W_g$ by

$$\iota^{A'}\nabla_{AA'}\phi = o^{A'}\nabla_{AA'}R\phi, \tag{3.4}$$

so formally $R = (\nabla_{A0'})^{-1} \circ \nabla_{A1'}$ (no summation over the index $A$).

Remarks.
• From (3.4) and from (2.11) it follows that if $\phi$ belongs to $W_g$ then so does $R\phi$.
• If $R^2\delta\Omega = \delta\Theta$ then $\delta\Omega$ and $\delta\Theta$ correspond to the same variation in the metric up to gauge.
• The operator $\phi \to \nabla_{A0'}\phi$ is over-determined, and its consistency follows from the wave equation on $\phi$.
• This definition is formal in that in order to invert the operator $\phi \to \nabla_{A0'}\phi$ we need to specify boundary conditions.

To summarize:

Proposition 3.3. Let $W_g$ be the space of solutions of the wave equation on the curved ASD background given by $g$.
(i) Elements of $W_g$ can be identified with linearised perturbations of the heavenly equations.
(ii) There exists a (formal) map $R : W_g \longrightarrow W_g$ given by (3.4).

The recursion operator can be generalised to act on solutions to the higher helicity Zero Rest-Mass equations on the ASD vacuum backgrounds [10] by using Hertz potentials. We restrict ourselves to the gauge invariant case of a left-handed neutrino field $\psi_A$ on a heavenly background. First note that any solution of

$$\nabla^{AA'}\psi_A = 0$$

must be of the form $\nabla_{A0'}\phi$, where $\phi \in W_g$. Define the recursion relations

$$R\psi_A := \nabla_{A0'}R\phi. \tag{3.5}$$
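The remark that the consistency of (3.4) follows from the wave equation can be made explicit in coordinates. The following sketch assumes the component form of (3.4) in the second heavenly coordinates $(w, z, x, y)$ that is quoted in Sect. 3.7, with subscripts on $\Theta$ and $\phi$ denoting partial derivatives; it is an illustrative check rather than part of the original argument.

```latex
% Component form of (3.4) in second heavenly coordinates (as quoted in Sect. 3.7):
\begin{align*}
\partial_y (R\phi)  &= (\partial_w - \Theta_{xy}\partial_y + \Theta_{yy}\partial_x)\phi, \\
-\partial_x (R\phi) &= (\partial_z + \Theta_{xx}\partial_y - \Theta_{xy}\partial_x)\phi .
\end{align*}
% Integrability of this pair for R\phi: \partial_x of the first line plus
% \partial_y of the second line must vanish.
\begin{align*}
0 &= \partial_x\big[(\partial_w - \Theta_{xy}\partial_y + \Theta_{yy}\partial_x)\phi\big]
   + \partial_y\big[(\partial_z + \Theta_{xx}\partial_y - \Theta_{xy}\partial_x)\phi\big] \\
  &= \phi_{xw} + \phi_{yz} + \Theta_{yy}\phi_{xx} + \Theta_{xx}\phi_{yy} - 2\Theta_{xy}\phi_{xy}.
\end{align*}
% The third-derivative terms \pm\Theta_{xxy}\phi_y and \pm\Theta_{xyy}\phi_x cancel in pairs.
```

The last line is the background wave equation on $\phi$ (it coincides, up to the overall factor 2, with (3.22) when $\Theta$ is the Sparling–Tod potential), so the over-determined system for $R\phi$ is consistent exactly when $\phi \in W_g$.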

It is easy to see that $R$ maps solutions into solutions, although again the definition is formal in that boundary conditions are required to eliminate the ambiguities. A conjugate recursion operator R will play a role in the Hamiltonian formulation in Sect. 5.

3.2. The recursion operator and twistor functions. A twistor function $f$ can be pulled back to the correspondence space $F$. A function $f$ on $F$ descends to twistor space iff $L_A f = 0$. Given $\phi \in W_g$, define, for $i \in \mathbb{Z}$, a hierarchy of linear fields, $\phi_i \equiv R^i\phi_0$. Put $\Psi := \sum_{i=-\infty}^{\infty}\phi_i\lambda^i$ and observe that the recursion equations are equivalent to $L_A\Psi = 0$. Thus $\Psi$ is a function on the twistor space PT. Conversely, every solution of $L_A\Psi = 0$ defined on a neighbourhood of $|\lambda| = 1$ can be expanded in a Laurent series in $\lambda$ with the coefficients forming a series of elements of $W_g$ related by the recursion operator.

The function $\Psi$, when multiplied by $1/(\pi_{0'}\pi_{1'})$, is a Čech representative of the element of $H^1(PT, O(-2))$ that corresponds to the solution of the wave equation $\phi$ under the Penrose transform (i.e. by integration around $|\lambda| = 1$). The ambiguity in the inversion of $\nabla_{A0'}$ means that there are many such functions $\Psi$ that can be obtained from a given $\phi$. However, they are all equivalent as cohomology classes.

It is clear that a series corresponding to $R\phi$ is the function $\lambda^{-1}\Psi$. As noted before, $R$ is not completely well defined when acting on $W_g$ because of the ambiguity in the inversion of $\nabla_{A0'}$. However, the definition $R\Psi = \Psi/\lambda$ is well defined as a twistor function on PT, but the problem resurfaces when one attempts to treat $\Psi(\lambda)$ as a representative of a cohomology class, since pure gauge elements of the first sheaf cohomology group $H^1(PT, O(-2))$ are mapped to functions defining a non-trivial element of the cohomology. Note, however, that with the definition $R\Psi = \Psi/\lambda$, the action of $R$ is well defined on twistor functions and can be iterated without ambiguity.

We can in this way build coordinate charts on twistor space from those on space-time arising from the choices in the Plebański reductions. Put $\omega^A_0 = w^A = (w, z)$; the surfaces of constant $\omega^A_0$ are twistor surfaces. We have that $\nabla^{A}{}_{0'}\omega^B_0 = 0$, so that in particular $\nabla_{A1'}\nabla^{A}{}_{0'}\omega^B_0 = 0$, and if we define $\omega^A_i = R^i\omega^A_0$, then we can choose $\omega^A_i = 0$ for negative $i$. We define

$$\omega^A = \sum_{i=0}^{\infty}\omega^A_i\lambda^i. \tag{3.6}$$
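The equivalence between the recursion relations and $L_A\Psi = 0$ used in this subsection can be seen coefficient by coefficient. The sketch below assumes that the Lax operators of (2.11) take the form $L_A = \nabla_{A0'} - \lambda\nabla_{A1'}$, which is the $n = 1$ case of (4.3); that form is an assumption here, since (2.11) is not restated in this section.

```latex
% Assumed Lax operators: L_A = \nabla_{A0'} - \lambda \nabla_{A1'}  (cf. (4.3) with n = 1).
\begin{align*}
L_A \Psi
 &= \sum_{i=-\infty}^{\infty} \big(\nabla_{A0'}\phi_i\big)\lambda^{i}
  - \sum_{i=-\infty}^{\infty} \big(\nabla_{A1'}\phi_i\big)\lambda^{i+1} \\
 &= \sum_{i=-\infty}^{\infty} \big(\nabla_{A0'}\phi_i - \nabla_{A1'}\phi_{i-1}\big)\lambda^{i}.
\end{align*}
% Every coefficient vanishes iff \nabla_{A1'}\phi_{i-1} = \nabla_{A0'}\phi_i,
% i.e. iff \phi_i = R\phi_{i-1} in the sense of (3.4).
% Shifting the summation index gives the series attached to R\phi:
\[
\sum_{i} (R\phi_i)\lambda^{i} = \sum_{i} \phi_{i+1}\lambda^{i} = \lambda^{-1}\Psi .
\]
```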

We can similarly define $\tilde\omega^A$ by $\tilde\omega^A_0 = \tilde w^A$ and choose $\tilde\omega^A_i = 0$ for $i > 0$. Note that $\omega^A$ and $\tilde\omega^A$ are solutions of $L_A$ holomorphic around $\lambda = 0$ and $\lambda = \infty$ respectively, and they can be chosen so that they extend to a neighbourhood of the unit disc and a neighbourhood of the complement of the unit disc; they can therefore be used to provide a patching description of the twistor space.

3.3. The Penrose transform of linearised deformations and the recursion operator. The recursion operator acts on linearised perturbations of the ASDVE equations. Under the twistor correspondence, these correspond to linearised holomorphic deformations of (part of) PT.

Cover PT by two sets, $U$ and $\tilde U$, with $|\lambda| < 1 + \epsilon$ on $U$ and $|\lambda| > 1 - \epsilon$ on $\tilde U$, with $(\omega^A, \lambda)$ coordinates on $U$ and $(\tilde\omega^A, \lambda^{-1})$ on $\tilde U$. The twistor space PT is then determined by the transition function $\tilde\omega^B = \tilde\omega^B(\omega^A, \pi_{A'})$ on $U \cap \tilde U$ which preserves the fibrewise 2-form, $d\omega^A \wedge d\omega_A|_{\lambda=\mathrm{const}} = d\tilde\omega^A \wedge d\tilde\omega_A|_{\lambda=\mathrm{const}}$.

Infinitesimal deformations are given by elements of $H^1(PT, \Theta)$, where $\Theta$ denotes a sheaf of germs of holomorphic vector fields. Let

$$Y = f^A(\omega^B, \pi_{B'})\,\frac{\partial}{\partial\omega^A}$$

be defined on the overlap $U \cap \tilde U$ and define a class in $H^1(PT, \Theta)$ that preserves the fibration $PT \to \mathbb{CP}^1$. The corresponding infinitesimal deformation is given by

$$\tilde\omega^A(\omega^A, \pi_{A'}, t) = (1 + tY)(\tilde\omega^A) + O(t^2). \tag{3.7}$$

From the globality of $\Sigma(\lambda) = d\omega^A \wedge d\omega_A$ it follows that $Y$ is a Hamiltonian vector field with a Hamiltonian $f \in H^1(PT, O(2))$ with respect to the symplectic structure $\Sigma$. A finite deformation is given by integrating

$$\frac{d\tilde\omega^B}{dt} = \varepsilon^{BA}\frac{\partial f}{\partial\tilde\omega^A}$$

from $t = 0$ to $1$. Infinitesimally we can put

$$\delta\tilde\omega^A = \frac{\partial\,\delta f}{\partial\tilde\omega_A}. \tag{3.8}$$

If the ASD metric is determined by $\Theta$, then $\varepsilon^{BA}\partial\delta f/\partial\omega^B$ (or more simply $\delta f$) is a linearised deformation corresponding to $\delta\Theta \in W_g$. The recursion operator acts on linearised deformations as follows.

Proposition 3.4. Let $R$ be the recursion operator defined by (3.4). Its twistor counterpart is the multiplication operator

$$R\,\delta f = \frac{\pi_{1'}}{\pi_{0'}}\,\delta f = \lambda^{-1}\delta f. \tag{3.9}$$

[Note that $R$ acts on $\delta f$ without ambiguity; the ambiguity in boundary condition for the definition of $R$ on space-time is absorbed into the choice of explicit representative for the cohomology class determined by $\delta f$.]

Proof. Pull back $\delta f$ to the primed spin bundle, on which it is a coboundary, so that

$$\delta f(\pi_{A'}, x^a) = h(\pi_{A'}, x^a) - \tilde h(\pi_{A'}, x^a), \tag{3.10}$$

where $h$ and $\tilde h$ are holomorphic on $U$ and $\tilde U$ respectively (here we abuse notation and denote by $U$ and $\tilde U$ the open sets on the spin bundle that are the preimage of $U$ and $\tilde U$ on twistor space). A choice for the splitting (3.10) is given by

$$h = \frac{1}{2\pi i}\oint_{\Gamma}\frac{(\pi^{A'}o_{A'})^3}{(\rho^{C'}\pi_{C'})(\rho^{B'}o_{B'})^3}\,\delta f(\rho_{E'})\,\rho_{D'}d\rho^{D'}, \qquad
\tilde h = \frac{1}{2\pi i}\oint_{\tilde\Gamma}\frac{(\pi^{A'}o_{A'})^3}{(\rho^{C'}\pi_{C'})(\rho^{B'}o_{B'})^3}\,\delta f(\rho_{E'})\,\rho_{D'}d\rho^{D'}. \tag{3.11}$$

Here $\rho_{A'}$ are homogeneous coordinates of $\mathbb{CP}^1$ pulled back to the spin bundle. The contours $\Gamma$ and $\tilde\Gamma$ are homologous to the equator of $\mathbb{CP}^1$ in $U \cap \tilde U$ and are such that $\Gamma - \tilde\Gamma$ surrounds the point $\rho_{A'} = \pi_{A'}$. The functions $h$ and $\tilde h$ are homogeneous of degree 1 in $\pi_{A'}$ and do not descend to PT, whereas their difference does, so that

$$\pi^{A'}\nabla_{AA'}h = \pi^{A'}\nabla_{AA'}\tilde h = \pi^{A'}\pi^{B'}\pi^{C'}\Sigma_{AA'B'C'}, \tag{3.12}$$

where the first equality shows that the LHS is global with homogeneity degree 2 and implies the second equality for some $\Sigma_{AA'B'C'}$, which will be the third potential for a linearised ASD Weyl spinor. $\Sigma_{AA'B'C'}$ is in general defined modulo terms of the form $\nabla_{A(A'}\gamma_{B'C')}$, but this gauge freedom is partially fixed by choosing the integral representation above; $h$ vanishes to third order at $\pi_{A'} = o_{A'}$, and direct differentiation, using $\nabla_{AA'}\delta f = \rho_{A'}\delta f_A$ for some $\delta f_A$, gives $\Sigma_{AA'B'C'} = o_{A'}o_{B'}o_{C'}\nabla_{A0'}\delta\Theta$, where

$$\delta\Theta = \frac{1}{2\pi i}\oint_{\Gamma}\frac{\delta f}{(\rho^{B'}o_{B'})^4}\,\rho_{D'}d\rho^{D'}. \tag{3.13}$$

This is consistent with the Plebański gauge choices (there is also a gauge freedom in $\delta\Theta$ arising from cohomology freedom in $\delta f$ which we shall describe in the next subsection).


The condition $\nabla_{A(D'}\Sigma^{A}{}_{A'B'C')} = 0$ follows from Eq. (3.12) which, with the Plebański gauge choice, implies $\delta\Theta \in W_g$. Thus we obtain a twistor integral formula for the linearisation of the second heavenly equation. Now recall formula (3.4) defining $R$. Let $R\delta f$ be the twistor function corresponding to $R\delta\Theta$ by (3.13). The recursion relations yield

$$\oint_{\Gamma}\frac{R\delta f_A}{(\rho^{B'}o_{B'})^3}\,\rho_{D'}d\rho^{D'} = \oint_{\Gamma}\frac{\delta f_A}{(\rho^{B'}o_{B'})^2(\rho^{B'}\iota_{B'})}\,\rho_{D'}d\rho^{D'},$$

so $R\delta f = \lambda^{-1}\delta f$.

Let $\delta\Omega$ be the linearisation of the first heavenly potential. From $R^2\delta\Omega = \delta\Theta$ it follows that

$$\delta\Omega = \frac{1}{2\pi i}\oint_{\Gamma}\frac{\delta f}{(\rho_{A'}o^{A'})^2(\rho_{B'}\iota^{B'})^2}\,\rho_{C'}d\rho^{C'}.$$

3.4. Hidden symmetry algebra. The ASDVE equations in the Plebański forms have a residual coordinate symmetry. This consists of area preserving diffeomorphisms in the $w^A$ coordinates together with some extra transformations that depend on whether one is reducing to the first or second form. By regarding the infinitesimal forms of these transformations as linearised perturbations and acting on them using the recursion operator, the coordinate (passive) symmetries can be extended to give “hidden” (active) symmetries of the heavenly equations. Formulae (3.13) and (3.9) can be used to recover the known relations (see for example [28]) of the hidden symmetry algebra of the heavenly equations. We deal with the second equation, as the case of the first equation was investigated by other methods [21].

Let $M$ be a volume preserving vector field on $\mathcal{M}$. Define $\delta^0_M\nabla_{AA'} := [M, \nabla_{AA'}]$. This is a pure gauge transformation corresponding to addition of $\mathcal{L}_M g$ to the space-time metric and preserves the field equations. Note that

$$[\delta^0_M, \delta^0_N]\nabla_{AA'} := \delta^0_{[M,N]}\nabla_{AA'}.$$

Once a Plebański coordinate system and reduced equations have been obtained, the reduced equation will not be invariant under all the SDiff($\mathcal{M}$) transformations. The second form will be preserved if we restrict ourselves to transformations which preserve the SD two-forms $\Sigma^{1'1'} = dw_A \wedge dw^A$ and $\Sigma^{0'1'} = dx_A \wedge dw^A$. The conditions $\mathcal{L}_M\Sigma^{0'0'} = \mathcal{L}_M\Sigma^{0'1'} = 0$ imply that $M$ is given by

$$M = \frac{\partial h}{\partial w_A}\frac{\partial}{\partial w^A} + \Big(\frac{\partial g}{\partial w_A} - x^B\frac{\partial^2 h}{\partial w_A\partial w^B}\Big)\frac{\partial}{\partial x^A},$$

where $h = h(w^A)$ and $g = g(w^A)$. The space-time is now viewed as a cotangent bundle $\mathcal{M} = T^*N^2$ with $w^A$ being coordinates on a two-dimensional complex manifold $N^2$. The full SDiff($\mathcal{M}$) symmetry breaks down to the semi-direct product of SDiff($N^2$), which acts on $\mathcal{M}$ by a Lie lift, with $\Gamma(N^2, O)$, which acts on $\mathcal{M}$ by translations of the zero section by the exterior derivatives of functions on $N^2$. Let $\delta^0_M\Theta$ correspond to $\delta^0_M\nabla_{AA'}$ by

$$\delta^0_M\nabla_{A1'} = \frac{\partial^2\delta^0_M\Theta}{\partial x^A\partial x^B}\frac{\partial}{\partial x_B}.$$


The “pure gauge” elements are

$$\delta^0_M\Theta = F + x^AG_A + x^Ax^B\frac{\partial^2 g}{\partial w^A\partial w^B} + x^Ax^Bx^C\frac{\partial^3 h}{\partial w^A\partial w^B\partial w^C} + \frac{\partial g}{\partial w_A}\frac{\partial\Theta}{\partial x^A} + \frac{\partial h}{\partial w_A}\frac{\partial\Theta}{\partial w^A} - x^B\frac{\partial^2 h}{\partial w_A\partial w^B}\frac{\partial\Theta}{\partial x^A}, \tag{3.14}$$

where $F$, $G_A$, $g$, $h$ are functions of $w^B$ only.

The above symmetries can be seen to arise from symmetries on twistor space as follows. Since we have the symplectic form $\Sigma = d\omega^A \wedge d\omega_A$ on the fibres of $\mu : PT \longrightarrow \mathbb{CP}^1$, a symmetry is a holomorphic diffeomorphism of the set $U$ that restricts to a canonical transformation on each fibre. Let $H = H(x^a, \lambda) = \sum_{i=0}^{\infty}h_i\lambda^i$ be the Hamiltonian for an infinitesimal such transformation pulled back to the projective spin bundle. The functions $h_i$ depend on space-time coordinates only. In particular $h_0$ and $h_1$ give $h$ and $g$ from the previous construction (3.14). This can be seen by calculating how $\Theta$ transforms if $\omega^A = w^A + \lambda x^A + \lambda^2\partial\Theta/\partial x_A + \cdots \longrightarrow \hat\omega^A$. Now $\Theta$ is treated as an object on the first jet bundle of a fixed fibre of PT and it determines the structure of the second jet.

These symmetries take a solution to an equivalent solution. The recursion operator can be used to define an algebra of “hidden symmetries” that take one solution to a different one as follows. Let $\delta^0_M\Theta$ be an expression of the form (3.14) which also satisfies $\Box_g\delta^0_M\Theta = 0$. We set

$$\delta^i_M\Theta := R^i\delta_M\Theta \in W_g.$$

Proposition 3.5. Generators of the hidden symmetry algebra of the second heavenly equation satisfy the relation

$$[\delta^i_M, \delta^j_N] = \delta^{i+j}_{[M,N]}. \tag{3.15}$$

Proof. This can be proved directly by showing that the ambiguities in $R$ can be chosen so that $R \circ \delta_M = \delta_M \circ R$. It is perhaps more informative to prove it by its action on twistor functions. Let $\delta^i_Mf$ be the twistor function corresponding to $\delta^i_M\Theta$ (by (3.13)), treated as an element of $\Gamma(U \cap \tilde U, O(2))$ rather than $H^1(PT, O(2))$. Define $[\delta^i_M, \delta^j_N]$ by

$$[\delta^i_M, \delta^j_N]\Theta := \frac{1}{2\pi i}\oint\frac{\{\delta^i_Mf, \delta^j_Nf\}}{(\pi_{0'})^4}\,\pi_{A'}d\pi^{A'},$$

where the Poisson bracket is calculated with respect to a canonical Poisson structure on PT. From Proposition 3.4 (Eq. (3.9)) it follows that

$$[\delta^i_M, \delta^j_N]\Theta = \frac{1}{2\pi i}\oint\lambda^{-i-j}\,\frac{\{\delta_Mf, \delta_Nf\}}{(\pi_{0'})^4}\,\pi_{A'}d\pi^{A'} = R^{i+j}\delta_{[M,N]}\Theta$$

as required.


3.5. Recursion procedure for Killing spinors. Let $(\mathcal{M}, g)$ be an ASD vacuum space. We say that $L_{A'_1\ldots A'_n}$ is a Killing spinor of type $(0, n)$ if

$$\nabla_{A(A'}L_{B'_1\ldots B'_n)} = 0. \tag{3.16}$$

Killing spinors of type $(0, n)$ give rise to Killing spinors of type $(1, n-1)$ by

$$\nabla^{A}{}_{A'}L_{B'_1\ldots B'_n} = \varepsilon_{A'(B'_1}K^{A}{}_{B'_2\ldots B'_n)}.$$

In an ASD vacuum, $K_{BB'_2\ldots B'_n}$ is also a Killing spinor:

$$\nabla^{(A}{}_{(A'}K^{B)}{}_{B'_2\ldots B'_n)} = 0.$$

Put (for $i = 0, \ldots, n$)

$$L_i := \iota^{B'_1}\cdots\iota^{B'_i}o^{B'_{i+1}}\cdots o^{B'_n}L_{B'_1\ldots B'_n},$$

and contract (3.16) with $\iota^{B'_1}\cdots\iota^{B'_i}o^{B'_{i+1}}\cdots o^{B'_{n+1}}$ to obtain

$$i\,\nabla_{A1'}L_{i-1} = -(n - i + 1)\nabla_{A0'}L_i, \qquad i = 0, \ldots, n-1.$$

We make use of the recursion relations (3.4):

$$\frac{-i}{n+1-i}\,R(L_{i-1}) = L_i.$$

This leads to a general formula for Killing spinors (with $\nabla_{A0'}L_0 = 0$)

$$L_i = (-1)^i\binom{n}{i}^{-1}R^i(L_0), \qquad L_{B'_1B'_2\ldots B'_n} = \sum_{i=0}^{n}o_{(B'_1}\cdots o_{B'_i}\iota_{B'_{i+1}}\cdots\iota_{B'_n)}L_i \tag{3.17}$$
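The binomial coefficient in the formula for $L_i$ comes from iterating the one-step relation $L_i = -\frac{i}{n+1-i}R(L_{i-1})$. A short worked version, offered as a reading aid:

```latex
% Iterate L_i = -\frac{i}{n+1-i} R(L_{i-1}) starting from i = 1:
\begin{align*}
L_i &= \prod_{j=1}^{i}\Big(-\frac{j}{n+1-j}\Big)\, R^{i}(L_0)
     = (-1)^i\,\frac{i!}{n(n-1)\cdots(n-i+1)}\,R^{i}(L_0) \\
    &= (-1)^i\,\frac{i!\,(n-i)!}{n!}\,R^{i}(L_0)
     = (-1)^i\binom{n}{i}^{-1} R^{i}(L_0).
\end{align*}
% Example, n = 2:  L_1 = -\tfrac12 R(L_0),  L_2 = -\tfrac{2}{1} R(L_1) = R^2(L_0).
```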

Equation (3.16) is then satisfied iff $R^{-1}L_0 = RL_n = 0$.

3.6. Example 1. Let us demonstrate how to use the recursion procedure to find metrics with hidden symmetries. Let $\partial_{t_n}\Omega := \phi_n$ be a linearisation of the first heavenly equation. We have $R : z \longrightarrow \Omega_w = \partial_{t_1}\Omega$. Look for solutions to (2.18) with an additional constraint $\partial_{t_2}\Omega = 0$. The recursion relations (3.4) imply $\Omega_{wz} = \Omega_{ww} = 0$, therefore $\Omega(w, z, \tilde w, \tilde z) = w\,q(\tilde w, \tilde z) + P(z, \tilde w, \tilde z)$. The heavenly equation yields $dq \wedge dP \wedge dz = d\tilde z \wedge d\tilde w \wedge dz$. With the definition $\partial_zP = p$ the metric is $ds^2 = 2\,dw\,dq + 2\,dz\,dp + f\,dz^2$, where $f = -2P_{zz}$. We adopt $(w, z, q, p)$ as a new coordinate system. The heavenly equations imply that $f = f(q, z)$ is an arbitrary function of two variables. These are the null ASD plane wave solutions.


3.7. Example 2. Now we shall illustrate Propositions 3.3 and 3.4 with the example of the Sparling–Tod solution [27]. The coordinate formulae for the pull back of twistor functions are

$$\mu^0 = w + \lambda y - \lambda^2\Theta_x + \lambda^3\Theta_z + \ldots, \qquad \mu^1 = z - \lambda x - \lambda^2\Theta_y - \lambda^3\Theta_w + \ldots. \tag{3.18}$$

Consider

$$\Theta = \frac{\sigma}{wx + zy}, \tag{3.19}$$

where $\sigma = \mathrm{const}$. It satisfies both the linear and the nonlinear part of (2.22).

The flat case. First we shall treat (3.19), with $\sigma = 1$, as a solution $\phi_0$ to the wave equation on the flat background. The recursion relations are

$$(R\phi_0)_x = \frac{y}{(wx + zy)^2}, \qquad (R\phi_0)_y = \frac{-x}{(wx + zy)^2}.$$

They have a solution $\phi_1 := R\phi_0 = (-y/w)\phi_0$. More generally we find that

$$\phi_n := R^n\phi_0 = \Big(-\frac{y}{w}\Big)^{n}\frac{1}{wx + zy}. \tag{3.20}$$
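Both claims of this example can be checked by hand. The sketch below uses the shorthand $u = wx + zy$ and $q = -y/w$ (introduced here only for brevity), assumes that (2.22) has the form $\Theta_{xw} + \Theta_{yz} + \Theta_{xx}\Theta_{yy} - \Theta_{xy}^2 = 0$, and uses the flat limit $\partial_y(R\phi) = \partial_w\phi$, $-\partial_x(R\phi) = \partial_z\phi$ of the recursion relations quoted in the curved case.

```latex
% Shorthand (not in the original): u = wx + zy, q = -y/w, so that \phi_n = q^n / u.
% Linear and nonlinear parts of the second heavenly equation for \Theta = \sigma/u:
\begin{align*}
\Theta_{xw}+\Theta_{yz} &= -\frac{2\sigma}{u^{2}} + \frac{2\sigma(wx+zy)}{u^{3}} = 0, &
\Theta_{xx}\Theta_{yy}-\Theta_{xy}^{2} &= \frac{4\sigma^{2}w^{2}z^{2}}{u^{6}} - \frac{4\sigma^{2}w^{2}z^{2}}{u^{6}} = 0 .
\end{align*}
% Flat recursion relations applied to \phi_n and \phi_{n+1}:
\begin{align*}
\partial_w\phi_n &= -\frac{q^{n}}{w\,u^{2}}\,(nu + wx), &
\partial_y\phi_{n+1} &= -\frac{q^{n}}{w\,u^{2}}\,\big((n+1)u - zy\big) = -\frac{q^{n}}{w\,u^{2}}\,(nu+wx), \\
\partial_z\phi_n &= -\frac{y\,q^{n}}{u^{2}}, &
-\partial_x\phi_{n+1} &= \frac{w\,q^{n+1}}{u^{2}} = -\frac{y\,q^{n}}{u^{2}} .
\end{align*}
```

Hence $\partial_y\phi_{n+1} = \partial_w\phi_n$ and $-\partial_x\phi_{n+1} = \partial_z\phi_n$ for every $n$, which is the statement $\phi_{n+1} = R\phi_n$ up to the usual ambiguity in inverting $\nabla_{A0'}$.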

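The last formula can also be found using twistor methods. The twistor function corresponding to $\phi_0$ is $1/(\mu^0\mu^1)$, where $\mu^0 = w + \lambda y$ and $\mu^1 = z - \lambda x$. By Proposition 3.4 the twistor function corresponding to $\phi_n$ is $\lambda^{-n}/(\mu^0\mu^1)$. This can be seen by applying the formula (3.13) and computing the residue at the pole $\lambda = -w/y$. It is interesting to ask whether any $\phi_n$ (apart from $\phi_0$) is a solution to the heavenly equation. Inserting $\Theta = \phi_n$ into (2.22) yields $n = 0$ or $n = 2$. We parenthetically mention that $\phi_2$ yields (by formula (2.23)) a metric of type D which is conformal to the Eguchi–Hanson solution.

The curved case. Now let $\Theta$ given by (3.19) determine the curved metric

$$ds^2 = 2\,dw\,dx + 2\,dz\,dy + 4\sigma(wx + zy)^{-3}(w\,dz - z\,dw)^2. \tag{3.21}$$

The recursion relations

$$\partial_y(R\phi) = (\partial_w - \Theta_{xy}\partial_y + \Theta_{yy}\partial_x)\phi, \qquad -\partial_x(R\phi) = (\partial_z + \Theta_{xx}\partial_y - \Theta_{xy}\partial_x)\phi$$

are, using $\Theta_{xx} = 2\sigma w^2(wx+zy)^{-3}$, $\Theta_{xy} = 2\sigma wz(wx+zy)^{-3}$ and $\Theta_{yy} = 2\sigma z^2(wx+zy)^{-3}$,

$$-\partial_x(R\psi) = \big(\partial_z - 2\sigma w(wx + zy)^{-3}(z\partial_x - w\partial_y)\big)\psi, \qquad \partial_y(R\psi) = \big(\partial_w + 2\sigma z(wx + zy)^{-3}(z\partial_x - w\partial_y)\big)\psi,$$

where $\psi$ satisfies

$$\Box_\Theta\psi = 2\big(\partial_x\partial_w + \partial_y\partial_z + 2\sigma(wx + zy)^{-3}(z^2\partial_x^2 + w^2\partial_y^2 - 2wz\,\partial_x\partial_y)\big)\psi = 0. \tag{3.22}$$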


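One solution to the last equation is $\psi_1 = (wx + zy)^{-1}$. We apply the recursion relations to find the sequence of linearised solutions

$$\psi_2 = \Big(-\frac{y}{w}\Big)\frac{1}{wx + zy}, \qquad \psi_3 = -\frac{2}{3}\frac{\sigma}{(wx + zy)^3} + \Big(-\frac{y}{w}\Big)^{2}\frac{1}{wx + zy}, \quad \ldots, \quad \psi_n = \sum_{k=0}^{n}A^k_{(n)}\Big(-\frac{y}{w}\Big)^{k}(wx + zy)^{k-n}.$$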
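As a consistency check on the first members of this sequence, one can verify directly that $(wx+zy)^{-1}$ solves (3.22) and read off the coefficients of $\psi_2$ and $\psi_3$ from the recursive formula (3.23). The sketch uses the shorthand $u = wx + zy$ (introduced here) and assumes that coefficients $A^k_{(n)}$ outside the range $0 \le k \le n$ are zero.

```latex
% Shorthand (not in the original): u = wx + zy.
% \psi_1 = u^{-1} solves (3.22) for every \sigma:
\begin{align*}
(\partial_x\partial_w + \partial_y\partial_z)\,u^{-1} &= -\frac{2}{u^{2}} + \frac{2(wx+zy)}{u^{3}} = 0, \\
(z^{2}\partial_x^{2} + w^{2}\partial_y^{2} - 2wz\,\partial_x\partial_y)\,u^{-1}
  &= \frac{2w^{2}z^{2} + 2w^{2}z^{2} - 4w^{2}z^{2}}{u^{3}} = 0 .
\end{align*}
% Coefficients from (3.23), starting from A^0_{(1)} = 1, A^1_{(1)} = 0:
\begin{align*}
n=1:&\quad A^{0}_{(2)} = 0,\qquad A^{1}_{(2)} = A^{0}_{(1)} = 1, \\
n=2:&\quad A^{0}_{(3)} = -2\sigma\cdot\tfrac{1}{3}\,A^{1}_{(2)} = -\tfrac{2\sigma}{3},\qquad
      A^{1}_{(3)} = 0,\qquad A^{2}_{(3)} = A^{1}_{(2)} = 1 .
\end{align*}
% These reproduce \psi_2 = (-y/w)\,u^{-1} and \psi_3 = -\tfrac{2\sigma}{3}\,u^{-3} + (-y/w)^2 u^{-1}.
```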

To find Ak(n) note that the recursion relations imply R

−

y k (wx + zy)j w y y −1 k y k = − −σ − − (wx + zy)−2 (wx + zy)j . w w j +2 w

This yields a recursive formula

$$A^k_{(n+1)} = A^{k-1}_{(n)} - 2\sigma\,\frac{k+1}{n-k+1}\,A^{k+1}_{(n)}, \qquad A^{-1}_{(n)} = 0, \quad A^0_{(1)} = 1, \quad A^1_{(1)} = 0, \qquad k = 0 \ldots n, \tag{3.23}$$

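which determines the algebraic (as opposed to the differential) recursion relations between $\psi_n$ and $\psi_{n+1}$. It can be checked that the functions $\psi_n$ indeed satisfy (3.22). Notice that if $\sigma = 0$ (flat background) then we recover (3.20). We can also find the inhomogeneous twistor coordinates pulled back to $F$:

$$\mu^0 = w + \lambda y + \sum_{n=0}^{\infty}\sigma\lambda^{n+2}\sum_{k=0}^{n}B^k_{(n)}\,w\Big(-\frac{y}{w}\Big)^{k}(wx + zy)^{k-n-1}, \qquad
\mu^1 = z - \lambda x + \sum_{n=0}^{\infty}\sigma\lambda^{n+2}\sum_{k=0}^{n}B^k_{(n)}\,z\Big(\frac{x}{z}\Big)^{k}(wx + zy)^{k-n-1},$$

where

$$B^k_{(n+1)} = B^{k-1}_{(n)} - 2\sigma\,\frac{k+1}{n-k+2}\,B^{k+1}_{(n)}, \qquad B^{-1}_{(n)} = 0, \quad B^0_{(1)} = 1, \quad B^1_{(1)} = 0, \qquad k = 0 \ldots n.$$

The polynomials $\mu^A$ solve $L_A(\mu^B) = 0$, where now

$$L_0 = -\lambda\partial_w - 2\lambda\sigma z^2(wx + zy)^{-3}\partial_x + \big(1 + 2\lambda\sigma wz(wx + zy)^{-3}\big)\partial_y, \qquad
L_1 = \lambda\partial_z + \big(1 - 2\lambda\sigma wz(wx + zy)^{-3}\big)\partial_x + 2\lambda\sigma w^2(wx + zy)^{-3}\partial_y.$$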


4. Hierarchies for the ASD Vacuum Equations

The hidden symmetries corresponding to higher flows associated to translations along the coordinate vector fields give “higher flows” of a hierarchy. This yields a hierarchy of flows of the anti-self-dual Einstein vacuum equations. We first give this for the equations in their second heavenly form, and then give the equations in the form of consistency conditions for a Lax system of vector fields generalizing Eqs. (2.11). The nonlinear graviton construction generalizes to give a construction for the corresponding system of equations and is presented in Subsect. 4.2. In Subsect. 4.3 the geometric structure of solutions to the truncated hierarchy is explored in further detail. Finally, in Subsect. 4.4 infinitesimal deformations are studied.

4.1. Hierarchies for the heavenly equations. The generators of higher flows are first obtained by applying powers of the recursion operator to the linearised perturbations corresponding to the evolution along coordinate vector fields. This embeds the second heavenly equation into an infinite system of over-determined, but consistent, PDEs (which we will truncate at some arbitrary but finite level). These equations in turn can be naturally embedded into a system of equations that are the consistency conditions for an associated linear system that extends (2.11).

We shall discuss here the hierarchy for the second Plebański form; that for the first arises from a different coordinate and gauge choice. Introduce the coordinates $x^{Ai}$, where for $i = 0, 1$, $x^{Ai} = x^{AA'}$ are the original coordinates on $\mathcal{M}$, and for $1 < i \le n$, $x^{Ai}$ are the parameters for the new flows (with $2n - 2$ dimensional parameter space $X$). The propagation of $\Theta$ along these parameters is determined by the recursion relations

$$\partial_y(\partial_{Bi+1}\Theta) = (\partial_w - \Theta_{xy}\partial_y + \Theta_{yy}\partial_x)\partial_{Bi}\Theta, \qquad -\partial_x(\partial_{Bi+1}\Theta) = (\partial_z + \Theta_{xx}\partial_y - \Theta_{xy}\partial_x)\partial_{Bi}\Theta,$$

or

$$\partial_{A0}(\partial_{Bi+1}\Theta) = \big(\partial_{A1} + (\partial_{C0}\partial_{A0}\Theta)\,\partial^{C}{}_{0}\big)\partial_{Bi}\Theta, \tag{4.1}$$

where $\partial_{Bi+1} := \partial/\partial x^{B(i+1)}$, etc. However, we will take the hierarchy to be the system (containing the above when $j = 1$)

$$\partial_{Ai}\partial_{Bj-1}\Theta - \partial_{Bj}\partial_{Ai-1}\Theta + \{\partial_{Ai-1}\Theta, \partial_{Bj-1}\Theta\}_{yx} = 0, \qquad i, j = 1 \ldots n. \tag{4.2}$$
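It may help to see that the lowest member of (4.2) is the second heavenly equation itself. The sketch below rests on three assumptions that are read off from, but not stated in, the text: the identification $\partial_{00} = \partial_y$, $\partial_{10} = -\partial_x$, $\partial_{01} = \partial_w$, $\partial_{11} = \partial_z$ suggested by comparing the two forms of (4.1); the sign convention $\{f, g\}_{yx} = f_yg_x - f_xg_y$; and the form $\Theta_{xw} + \Theta_{yz} + \Theta_{xx}\Theta_{yy} - \Theta_{xy}^2 = 0$ for (2.22).

```latex
% Assumptions (see lead-in): \partial_{00}=\partial_y, \partial_{10}=-\partial_x,
% \partial_{01}=\partial_w, \partial_{11}=\partial_z, and \{f,g\}_{yx} = f_y g_x - f_x g_y.
% Take i = j = 1, A = 0, B = 1 in (4.2):
\begin{align*}
0 &= \partial_{01}\partial_{10}\Theta - \partial_{11}\partial_{00}\Theta
     + \{\partial_{00}\Theta,\ \partial_{10}\Theta\}_{yx} \\
  &= -\Theta_{xw} - \Theta_{yz} - \{\Theta_y,\ \Theta_x\}_{yx} \\
  &= -\big(\Theta_{xw} + \Theta_{yz} + \Theta_{xx}\Theta_{yy} - \Theta_{xy}^{2}\big).
\end{align*}
% For A = B the first two terms cancel and the bracket \{f, f\}_{yx} vanishes identically.
```

With the opposite sign convention for the bracket the nonlinear term would change sign, so the convention above is the one compatible with the recursion relations (4.1).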

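Here $\{\ldots, \ldots\}_{yx}$ is the Poisson bracket with respect to the Poisson structure $\partial/\partial x^A \wedge \partial/\partial x_A = 2\,\partial_x \wedge \partial_y$.

Lemma 4.1. The linear system for Eqs. (4.2) is

$$L_{Ai}\,s = (-\lambda D_{Ai+1} + \delta_{Ai})\,s = 0, \qquad i = 0, \ldots, n-1, \tag{4.3}$$

where
1. $s := s(x^{Ai}, \lambda)$ is a function on a spin bundle (a $\mathbb{CP}^1$-bundle) over $\mathcal{N} = \mathcal{M} \times X$,
2. $D_{Ai+1} := \partial_{Ai+1} + [\partial_{Ai}, V]$ (with $V = \varepsilon^{AB}\partial_{A0}\Theta\,\partial_{B0}$) and $\delta_{Ai} := \partial_{Ai}$ are $4n$ vector fields on $\mathcal{N}$.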


Proof. This follows by direct calculation. The compatibility conditions for (4.3) are:

$$[D_{Ai+1}, D_{Bj+1}] = 0, \tag{4.4}$$
$$[\delta_{Ai}, \delta_{Bj}] = 0, \tag{4.5}$$
$$[D_{Ai+1}, \delta_{Bj}] - [D_{Bj+1}, \delta_{Ai}] = 0. \tag{4.6}$$
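The three conditions are the coefficients of the powers of $\lambda$ in the commutator of two operators of the form (4.3), treating $\lambda$ as a constant parameter. A brief expansion, included as a reading aid:

```latex
% Expand the commutator of L_{Ai} = -\lambda D_{Ai+1} + \delta_{Ai} with L_{Bj}:
\begin{align*}
[L_{Ai}, L_{Bj}]
 &= \lambda^{2}\,[D_{Ai+1}, D_{Bj+1}]
  - \lambda\,\big([D_{Ai+1}, \delta_{Bj}] + [\delta_{Ai}, D_{Bj+1}]\big)
  + [\delta_{Ai}, \delta_{Bj}] \\
 &= \lambda^{2}\,[D_{Ai+1}, D_{Bj+1}]
  - \lambda\,\big([D_{Ai+1}, \delta_{Bj}] - [D_{Bj+1}, \delta_{Ai}]\big)
  + [\delta_{Ai}, \delta_{Bj}] .
\end{align*}
% Requiring this to vanish for all \lambda gives (4.4) at order \lambda^2,
% (4.6) at order \lambda, and (4.5) at order \lambda^0.
```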

It is straightforward to see that Eqs. (4.5) and (4.6) hold identically with the above definitions and (4.4) is equivalent to (4.2).

As a converse to this lemma, we will see in Subsect. 4.2, using the twistor correspondence, that given the Lax system above, in which $D_{Ai}$ and $\delta_{Aj}$ are volume preserving vector fields, coordinate and gauge choices can be made so that the Lax system takes on the above form.

4.1.1. Spinor notation. The above can also be represented in a spinorial formulation that will be useful later. We introduce the spinor indexed coordinates $x^{AA'_1\ldots A'_n} = x^{A(A'_1\ldots A'_n)}$ on $\mathcal{N}$ which correspond to the $x^{Ai}$ by

$$x^{Ai} = \binom{n}{i}\,x^{AA'_1A'_2\ldots A'_n}\,o_{A'_1}\cdots o_{A'_i}\,\iota_{A'_{i+1}}\cdots\iota_{A'_n}\,(-1)^{n-i}.$$

The vector fields $D_{Ai+1}$ and $\delta_{Ai}$ are then represented by the $4n$ vector fields on $\mathcal{N}$, $D_{AA'_1(A'_2\ldots A'_n)}$, where

$$D_{AA'_1 i} = \iota^{A'_2}\cdots\iota^{A'_i}\,o^{A'_{i+1}}\cdots o^{A'_n}\,D_{AA'_1(A'_2\ldots A'_n)}, \qquad D_{A1'i} = D_{Ai+1}, \qquad D_{A0'i} = \delta_{Ai},$$

and $L_{A(A'_2\ldots A'_n)} = \pi^{A'_1}D_{AA'_1(A'_2\ldots A'_n)}$, $L_{Ai} = \pi^{A'_1}D_{AA'_1 i}$. In the adopted gauge

$$D_{A0'A'_2\ldots A'_n} = \partial_{A0'A'_2\ldots A'_n}, \qquad D_{A1'A'_2\ldots A'_n} = \partial_{A1'A'_2\ldots A'_n} + [\partial_{A0'A'_2\ldots A'_n}, V].$$

In what follows we will often be interested in $\nabla_{A(A'_1A'_2\ldots A'_n)}$, the symmetric part of $D_{AA'_1A'_2\ldots A'_n}$,

$$\nabla_{Ai} = D_{A(A'_1A'_2\ldots A'_n)}\,\iota^{A'_1}\cdots\iota^{A'_i}\,o^{A'_{i+1}}\cdots o^{A'_n} = \frac{1}{n}\big(i\,D_{A1'i-1} + (n - i)\,D_{A0'i}\big) = \partial_{Ai} + \frac{i}{n}[\partial_{Ai-1}, V].$$

Put DA0 ...0 = ∂A . The 2n + 2 vector fields ∇AA1 ...An = {∂A , ∇A0 1 A2 ...An−1 , DAn } span T ∗ N .

(4.7) (4.8)


4.2. The twistor space for the hierarchy. The twistor space PT for a solution to the hierarchy associated to the Lax system on $\mathcal{N}$ as above is obtained by factoring the spin bundle $\mathcal{N} \times \mathbb{CP}^1$ by the twistor distribution (Lax system) $L_{Ai}$. This clearly has a projection $q : \mathcal{N} \times \mathbb{CP}^1 \to PT$ and we have a double fibration

$$\mathcal{N} \;\xleftarrow{\;\;p\;\;}\; \mathcal{N} \times \mathbb{CP}^1 \;\xrightarrow{\;\;q\;\;}\; PT.$$

Since the twistor distribution is tangent to the fibres of $\mathcal{N} \times \mathbb{CP}^1 \to \mathbb{CP}^1$, twistor space inherits the projection $\mu : PT \to \mathbb{CP}^1$. The twistor space for the hierarchy is three-dimensional, as for the ordinary hyper-Kähler equations, but has a different topology. We have

Lemma 4.2. The holomorphic curves $q(\mathbb{CP}^1_x)$, where $\mathbb{CP}^1_x = p^{-1}x$, $x \in \mathcal{N}$, have normal bundle $N = O(n) \oplus O(n)$.

Proof. To see this, note that $N$ can be identified with the quotient $p^*(T_x\mathcal{N})/\{\mathrm{span}\,L_{Ai}\}$, $i = 1, \ldots, n$. In their homogeneous form the operators $L_{Ai}$ have weight 1, so the distribution spanned by them is isomorphic to the bundle $\mathbb{C}^{2n} \otimes O(-1)$. The definition of the normal bundle as a quotient gives

$$0 \to \mathbb{C}^{2n} \otimes O(-1) \to \mathbb{C}^{2n+2} \to N \to 0$$

and we see, by taking determinants, that the image is $O(n + a) \oplus O(n - a)$ for some $a$. We see that $a = 0$, as the last map, in the spinor notation introduced at the end of the last section, is given explicitly by $V^{AA'_1\ldots A'_n} \to V^{AA'_1\ldots A'_n}\pi_{A'_1}\cdots\pi_{A'_n}$, clearly projecting onto $O(n) \oplus O(n)$.

A final structure that PT possesses is a skew form $\Sigma$ taking values in $O(2n)$ on the fibres of the projection $\mu$. This arises from the fact that the vector fields of the distribution preserve the coordinate volume form $\nu$ on $\mathcal{N}$ in the given coordinate system. Furthermore, the Lax system commutes exactly, $[L_{Ai}, L_{Bj}] = 0$, so that $\Sigma = \nu(\cdot, \cdot, L_{01}, \ldots, L_{0n}, L_{11}, \ldots, L_{1n})$ descends to the fibres of $PT \to \mathbb{CP}^1$ and clearly has weight $2n$, as each of the $L_{Ai}$ has weight one.

Thus we see that, given a solution to the hyper-Kähler hierarchy in the form of a commuting Lax system, we can produce a twistor space with the above structures. Now we shall prove the main result of this section and demonstrate that, given PT with the above structures, we can construct $\mathcal{N}$ (as a moduli space of rational curves in PT), which is naturally equipped with a function $\Theta$ satisfying (4.2) and with the Lax distribution (4.3).

Proposition 4.3. Let PT be a 3 dimensional complex manifold with the following structures:
1) a projection $\mu : PT \longrightarrow \mathbb{CP}^1$,
2) a section $s : \mathbb{CP}^1 \to PT$ of $\mu$ with normal bundle $O(n) \oplus O(n)$,
3) a non-degenerate 2-form $\Sigma$ on the fibres of $\mu$, with values in the pullback from $\mathbb{CP}^1$ of $O(2n)$.

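Fig. 1. Double fibration (correspondence space $F$ with $\dim F = 2n + 3$, twistor space with $\dim PT = 3$, moduli space with $\dim \mathcal{N} = 2n + 2$; projections $p$, $q$ and $r$, the last onto $\mathbb{CP}^1$; a point $p \in \mathcal{N}$ corresponds to a line $L_p$ in PT)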

Let $\mathcal{N}$ be the moduli space of sections that are deformations of the section $s$ given in (2). Then $\mathcal{N}$ is $2n + 2$ dimensional and

a) There exist coordinates $x^{Ai}$, $A = 0, 1$, $i = 0, \ldots, n$, and a function $\Theta : \mathcal{N} \longrightarrow \mathbb{C}$ on $\mathcal{N}$ such that Eq. (4.2) is satisfied.

b) The moduli space $\mathcal{N}$ of sections is equipped with
– a factorisation of the tangent bundle $T\mathcal{N} = S^A \otimes \odot^n S^{A'}$,
– a $2n$-dimensional distribution on the ‘spin bundle’, $D \subset T(\mathcal{N} \times \mathbb{CP}^1)$, that is tangent to the fibres of $r$ over $\mathbb{CP}^1$ and, as a bundle on $\mathcal{N} \times \mathbb{CP}^1$, has an identification with $O(-1) \otimes S_{AA'_1\ldots A'_{n-1}}$ so that the linear system can be written as in Eq. (4.3).

This correspondence is stable under small perturbations of the complex structure on PT preserving (1) and (3).

Proof. The first claim, that $\mathcal{N}$ has dimension $2n + 2$, follows from Kodaira theory, as $\dim H^0(\mathbb{CP}^1, N) = 2n + 2$ and $\dim H^1(\mathbb{CP}^1, N) = \dim H^1(\mathbb{CP}^1, \mathrm{End}\,N) = 0$.

Proof of (a). We first start by defining homogeneous coordinates on PT. These are coordinates on $T$, the total space of the pullback from $\mathbb{CP}^1$ of the tautological line bundle $O(-1)$. Let $\pi_{A'}$ be homogeneous coordinates on $\mathbb{CP}^1$ pulled back to $T$ and let $\omega^A$ be local coordinates on $T$, chosen on a neighbourhood of $\mu^{-1}\{\pi_{0'} = 0\}$, that are homogeneous of degree $n$ and canonical so that $\Sigma = \varepsilon_{AB}\,d\omega^A \wedge d\omega^B$. We also use $\lambda = \pi_{0'}/\pi_{1'}$ as an affine coordinate on $\mathbb{CP}^1$.

Let $L_p$ be the line in PT that corresponds to $p \in \mathcal{N}$ and let $Z \in PT$ lie on $L_p$. We denote by $F$ the correspondence space $PT \times \mathcal{N}|_{Z \in L_p} = \mathcal{N} \times \mathbb{CP}^1$. (See Fig. 1 for the double fibration picture.) Pull back the twistor coordinates to $F$ and define $2(n + 1)$ coordinates on $\mathcal{N}$ by

$$x^{AA'_1A'_2\ldots A'_n} = x^{A(A'_1A'_2\ldots A'_n)} := \frac{\partial^n\omega^A}{\partial\pi_{A'_1}\partial\pi_{A'_2}\cdots\partial\pi_{A'_n}}\bigg|_{\pi_{A'} = o_{A'}},$$

where the derivative is along the fibres of $F$ over $\mathcal{N}$. This can alternatively be expressed in affine coordinates on $\mathbb{CP}^1$ by expanding the coordinates $\omega^A$ pulled back to $F$ in powers of $\lambda = \pi_{0'}/\pi_{1'}$:

$$\omega^A = (\pi_{1'})^n\Big(\sum_{i=0}^{n}x^{Ai}\lambda^{n-i} + \lambda^{n+1}\sum_{i=0}^{\infty}s^A_i\lambda^i\Big), \tag{4.9}$$

where the $s^A_i$ are functions of $x^{AA'_1\ldots A'_n}$ and will be useful later. The symplectic 2-form on the fibres of $\mu$, when pulled back to the spin bundle, has an expansion in powers of $\lambda$ that truncates at order $2n + 1$ by globality and homogeneity, so that

$$\Sigma = d_h\omega^A \wedge d_h\omega_A = \pi_{A'_1}\cdots\pi_{A'_n}\pi_{B'_1}\cdots\pi_{B'_n}\,\Sigma^{A'_1\ldots A'_nB'_1\ldots B'_n}$$

for some symmetric spinor indexed 2-form $\Sigma^{A'_1\ldots A'_nB'_1\ldots B'_n}$. We have

$$\Sigma(\lambda) \wedge \Sigma(\lambda) = 0, \qquad d_h\Sigma(\lambda) = 0, \tag{4.10}$$

where in the exterior derivative $d_h$, $\lambda$ is understood to be held constant. If we express the forms in terms of the $x^{Ai}$ and the $s^A_i$, the closure condition is satisfied identically, whereas the truncation condition will give rise to equations on the $s^A_i$ allowing one to express them in terms of a function $\Theta(x^{AA'_1\ldots A'_n})$ and to field equations on $\Theta$ as follows.

To deduce the existence of $\Theta(x^{AA'_1\ldots A'_n})$, observe that the vanishing of the coefficient of $\lambda^{2n+1}$ in $d\omega^A \wedge d\omega_A$ gives

$$\sum_{i=0}^{n}ds_{Ai} \wedge dx^{Ai} = d\Big(\sum_{i=0}^{n}s_{Ai}\,dx^{Ai}\Big) = 0 \quad\Longrightarrow\quad s_{Ai} = \frac{\partial\Theta}{\partial x^{Ai}}.$$

The equations of the hierarchy arise from the vanishing of the coefficient of $\lambda^{2n+2}$,

$$\sum_{i=0}^{n}dx^{Ai} \wedge ds_{Ai+1} + ds^A_0 \wedge ds_{A0} = 0.$$

This leads to Eqs. (4.2) on $\Theta$ for $i, j \le n - 1$,

$$\frac{\partial^2\Theta}{\partial x^{Ai+1}\partial x^{Bj}} - \frac{\partial^2\Theta}{\partial x^{Ai}\partial x^{Bj+1}} + \varepsilon^{CD}\frac{\partial^2\Theta}{\partial x^{C0}\partial x^{Ai}}\frac{\partial^2\Theta}{\partial x^{D0}\partial x^{Bj}} = 0,$$

and further equations that determine $s_{An+1}$.

Proof of (b). The isomorphism $T\mathcal{N} = S^A \otimes \odot^n S^{A'}$ follows simply from the structure of the normal bundle. From Kodaira theory, since the appropriate obstruction groups vanish, we have

$$T_x\mathcal{N} = \Gamma(\mathbb{CP}^1_x, N_x) = S^A_x \otimes \odot^n S^{A'}, \tag{4.11}$$

where $N_x$ is the normal bundle to the rational curve $\mathbb{CP}^1_x$ in PT corresponding to the point $x \in \mathcal{N}$. The bundle $S^A$ on space-time is the Ward transform of $O(-n) \otimes T_VPT$, where the subscript $V$ denotes the sub-bundle of the tangent bundle consisting of vectors up the fibres of $\mu$, the projection to $\mathbb{CP}^1$, so that $S^A_x = \Gamma(\mathbb{CP}^1_x, O(-n) \otimes T_VPT)$. The bundle $S^{A'} = \Gamma(\mathbb{CP}^1, O(1))$ is canonically trivial.


Let $\nabla_{AA'_1\cdots A'_n} = \nabla_{A(A'_1\cdots A'_n)}$ be the indexed vector field that establishes the isomorphism (4.11) and let $e^{AA'_1\cdots A'_n} = e^{A(A'_1\cdots A'_n)} \in \Omega^1 \otimes S^A \otimes \odot^n S^{A'}$ be the dual (inverse) map.

We now wish to derive the form of the linear system, Eqs. (4.3). For each fixed $\pi_{A'} = (\lambda, 1) \in \mathbb{CP}^1$ we have a copy of a space-time $\mathcal{N}_\lambda$. The horizontal (i.e. holding $\lambda$ constant) subspace of $T_{(x,\lambda)}(\mathcal{N} \times \mathbb{CP}^1)$ is spanned by $\nabla_{A(A'_1\ldots A'_n)}$. An element of the normal bundle to the corresponding line $\mathbb{CP}^1_x$ consists of a horizontal tangent vector at $(x, \lambda)$ modulo the twistor distribution. Therefore we have the sequence of sheaves over $\mathbb{CP}^1$

$$0 \longrightarrow D_x \longrightarrow T_x\mathcal{N} \xrightarrow{\;e^A\;} S^A \otimes O(n) \longrightarrow 0,$$

where $D_x$ is the twistor distribution at $x$ and the map $T_x\mathcal{N} \longrightarrow S^A \otimes O(n)$ is given by the contraction of elements of $T_x\mathcal{N}$ with $e^A := e^{AA'_1\ldots A'_n}\pi_{A'_1}\cdots\pi_{A'_n}$, since $e^A$ annihilates all $L_{Bi}$s in $D$. Consider the dual sequence tensored with $O(-1)$ to obtain

$$0 \longrightarrow O_A(-n-1) \longrightarrow T^*_x\mathcal{N}(-1) \longrightarrow D^*_x(-1) \longrightarrow 0. \tag{4.12}$$

From here we would like to extract the Lax distribution

$$L_{AA'_2\ldots A'_n} = \pi^{A'_1}D_{AA'_1A'_2\ldots A'_n} \in S_{AA'_2\ldots A'_n} \otimes O(1) \otimes D.$$

This can be achieved by globalising (4.12) in $\pi^{A'}$. The corresponding long exact sequence of cohomology groups yields

$$0 \longrightarrow \Gamma(O_A(-n-1)) \longrightarrow \Gamma(T^*\mathcal{N}(-1)) \longrightarrow \Gamma(D^*(-1)) \xrightarrow{\;\delta\;} H^1(O_A(-n-1)) \longrightarrow H^1(T^*\mathcal{N}(-1)) \longrightarrow \ldots,$$

which (because $T^*\mathcal{N}$ is a trivial bundle, so that $O(-1) \otimes T^*\mathcal{N}$ has no sections or cohomology) reduces to

$$0 \longrightarrow \Gamma(D^*(-1)) \xrightarrow{\;\delta\;} H^1(O_A(-n-1)) \longrightarrow 0.$$

From Serre duality we conclude, since $D$ has rank $2n$, that the connecting map $\delta$ is an isomorphism $\delta : \Gamma(D^*(-1)) \longrightarrow S_{AA'_2\ldots A'_n}$. Therefore

$$\delta \in \Gamma(D \otimes O(1) \otimes S_{AA'_2\ldots A'_n}) \tag{4.13}$$

is a canonically defined object annihilating $\omega^A$ given by (4.9). In index notation we can put

$$\delta = L_{AA'_2\ldots A'_n} = \pi^{A'_1}D_{AA'_1A'_2\ldots A'_n},$$

where $L_{AA'_2\ldots A'_n} = L_{A(A'_2\ldots A'_n)}$, the second identity follows from the globality of $L_{AA'_2\ldots A'_n}$, and the $D_{AA'_1A'_2\ldots A'_n}$ are vector fields on $\mathcal{N}$ lifted to $\mathcal{N} \times \mathbb{CP}^1$ using the product structure. It follows from $L_{AA'_2\ldots A'_n}\omega^B = 0$ that if $\pi^{A'} = o^{A'}$ then $D_{A0'A'_2\ldots A'_n}x^{Bn} = 0$, so

$$D_{A0'A'_2\ldots A'_n} = A^{BB'_2\ldots B'_n}_{AA'_2\ldots A'_n}\,\frac{\partial}{\partial x^{B0'B'_2\ldots B'_n}},$$

for some matrix $A^{BB'_2\ldots B'_n}_{AA'_2\ldots A'_n}$. This matrix must be invertible by dimension counting. By multiplying $L_{AA'_2\ldots A'_n}$ by the inverse of this matrix, we find we can put

$$A^{BB'_2\ldots B'_n}_{AA'_2\ldots A'_n} = \varepsilon_A{}^{B}\,\varepsilon_{A'_2}{}^{B'_2}\cdots\varepsilon_{A'_n}{}^{B'_n}.$$

Therefore we can take $L_{AA'_2\ldots A'_n} = \partial_{A0'A'_2\ldots A'_n} - \lambda D_{A1'A'_2\ldots A'_n}$. Equating the $(n - i + 1)$th and $(n + 1)$th powers of $\lambda$ in $L_{Ai}\omega^B = 0$ to zero yields

$$D_{A1'A'_2\ldots A'_n} = \partial_{A1'A'_2\ldots A'_n} + [\partial_{A0'A'_2\ldots A'_n}, V],$$

where $V = \varepsilon^{AB}\,\partial\Theta/\partial x^{A0}\,\partial/\partial x^{B0}$. So finally $L_{AA'_2\ldots A'_n}$ is of the form $L_{Ai} = \partial_{Ai} - \lambda(\partial_{Ai+1} + [\partial_{Ai}, V])$.

4.3. Geometric structures. If one considers $\mathcal{N} = \mathcal{M} \times X$ as being foliated by four dimensional slices $t^{Ai} = \mathrm{const}$, then structures (1)–(3) on PT can be used to define anti-self-dual vacuum metrics on the leaves of the foliation. Consider $\Theta(x^{AA'}, t)$, where $t = \{t^{Ai}, i = 2 \ldots n\}$. For each fixed $t$ the function $\Theta$ satisfies the second heavenly equation. The ASD metric on a corresponding four-dimensional slice $\mathcal{N}_{t=t_0}$ is given by

$$ds^2 = 2\varepsilon_{AB}\,dx^{A1'}dx^{B0'} + 2\frac{\partial^2\Theta}{\partial x^{A0'}\partial x^{B0'}}\,dx^{A1'}dx^{B1'}.$$

This metric can be determined from the structure of the $O(n) \oplus O(n)$ twistor space as follows. Fix the first $2n - 2$ parameters in the expansion (4.9), so the normal vector $W = W^A\partial/\partial\omega^A$ is given by

$$W^A = \delta\omega^A = \lambda^{n-1}W^{A1'} + \lambda^nW^{A0'} + \lambda^{n+1}\frac{\partial\,\delta\Theta}{\partial x_{A0'}} + \ldots,$$

where $\delta\Theta = W^{AA'}\partial\Theta/\partial x^{AA'}$. The metric is

$$g(U, W) = \varepsilon_{AB}\,\varepsilon_{A'B'}\,U^{AA'}W^{BB'}, \tag{4.14}$$

where $\varepsilon_{A'B'}$ is a fixed element of $\Lambda^2S^{A'}$ and $\varepsilon_{AB} \in \Lambda^2S^A$ is determined by $\Sigma$: recall that $S^A_x = \Gamma(L_x, O(-n) \otimes T_VPT)$. Thus if $u^A, v^A \in S^A_x$, then define $\varepsilon_{AB}u^Av^B = \Sigma(u, v)$, where $u$, $v$ are the corresponding weighted vertical vector fields on PT.

For $n$ odd, $T\mathcal{N}$ is equipped with a metric with holonomy $SL(2, \mathbb{C})$. For $n$ even, $T\mathcal{N}$ is endowed with a skew form. They are both given by

$$G(U, W) = \varepsilon_{AB}\,\varepsilon_{A'_1B'_1}\cdots\varepsilon_{A'_nB'_n}\,U^{AA'_1\ldots A'_n}W^{BB'_1\ldots B'_n}. \tag{4.15}$$

These are special examples of the paraconformal structures considered by Bailey and Eastwood [2].


4.4. Holomorphic deformations and O(2n) twistor functions. We wish to consider holomorphic deformations of PT that preserve conditions (1–3) of Proposition 4.3, which will therefore correspond to perturbations of the hierarchy. Let $\tilde\omega^A = G^A(\omega^B, \pi_{A'}, t)$ be the standard patching relation for PT and let $f^A \in S^A \otimes H^1(PT, O(n))$ give the infinitesimal deformation

$$\tilde\omega^A = G^A + tf^A + O(t^2).$$

The globality of the symplectic structure, $d\tilde\omega^A \wedge d\tilde\omega_A = d\omega^A \wedge d\omega_A$, implies $f^A = \varepsilon^{AB}\partial f/\partial\omega^B$, where $f \in H^1(PT, O(2n))$.

Example. If we deform from the flat model using $f = (\pi_{0'})^{4n}/\omega^0\omega^1$, then the deformation equations

$$\tilde\omega^0 = \omega^0 + t\,\frac{(\pi_{0'})^{4n}}{\omega^0(\omega^1)^2} + O(t^2), \qquad \tilde\omega^1 = \omega^1 - t\,\frac{(\pi_{0'})^{4n}}{(\omega^0)^2\omega^1} + O(t^2),$$

imply that $Q = \omega^0\omega^1 = \tilde\omega^0\tilde\omega^1$ is a global twistor function (up to $O(t^2)$), which persists to all orders as $\varepsilon^{AB}\,\partial Q/\partial\omega^A\,\partial f/\partial\omega^B = 0$. The corresponding deformed paraconformal structure admits a symmetry corresponding to the global vector field $\varepsilon^{AB}\,\partial Q/\partial\omega^A\,\partial/\partial\omega^B$ on PT.

To see how such “Hamiltonians” $f$ correspond to variations in the paraconformal structure (or more simply $\Theta$), we form an indexed element of $H^1(PT, O(-1))$ and pull it back to $\mathcal{N} \times \mathbb{CP}^1$, where it can be split uniquely:

$$\pi_{A'_2}\cdots\pi_{A'_n}\frac{\partial^3f^{2n}}{\partial\omega^A\partial\omega^B\partial\omega^C} = f_{ABCA'_2\ldots A'_n} = \tilde F_{ABCA'_2\ldots A'_n} - F_{ABCA'_2\ldots A'_n},$$

where

$$F_{ABCA'_2\ldots A'_n} = \frac{1}{2\pi i}\oint\frac{f_{ABCA'_2\ldots A'_n}}{\rho_{A'}\pi^{A'}}\,\rho\cdot d\rho.$$

This gives rise to a global field that is symmetric over its indices: $C_{ABCDA'_2\ldots A'_nD'_2\ldots D'_n} = L_{DD'_2\ldots D'_n}F_{ABCA'_2\ldots A'_n}$, which is given also directly by the integral

$$C_{ABCDA'_2\ldots A'_nD'_2\ldots D'_n} = \frac{1}{2\pi i}\oint\frac{\partial^4f^{2n}}{\partial\omega^A\partial\omega^B\partial\omega^C\partial\omega^D}\,\rho_{A'_2}\cdots\rho_{A'_n}\rho_{D'_2}\cdots\rho_{D'_n}\,\rho\cdot d\rho.$$

To see how this corresponds to a variation of $\Theta$, we introduce a chain of potentials. Use the non-unique splitting $f^{2n} = F^{2n} - \tilde F^{2n}$ and define a global object of degree $2n + 1$ by

$$L_{AA'_2\ldots A'_n}F^{2n} = \Sigma_{AA'_2\ldots A'_nB'_1\ldots B'_nC'_1\ldots C'_nD'_1}\,\pi^{B'_1}\cdots\pi^{B'_n}\pi^{D'_1}\pi^{C'_1}\cdots\pi^{C'_n}.$$

It is easy to see that

$$\nabla^{AE'_1\ldots E'_n}\Sigma_{AA'_2\ldots A'_nB'_1\ldots B'_nC'_1\ldots C'_nD'_1} = 0,$$


and AA2 ...An B1 ...Bn C1 ...Cn D1 is a potential potentials, related to the field by D

C ...Cn

1 1 CABCDA2 ...An D2 ...Dn = ∇DD ...D ∇C 2

n

B ...Bn

∇B 1

AA2 ...An B1 ...Bn C1 ...Cn D1 .

The chain of potentials is δ;A1 B1 ...Bn C1 ...Cn D1 = oA1 oB1 . . . oBn oC1 . . . oCn oD1 δ;, AA2 ...An B1 ...Bn C1 ...Cn D1 = oB1 . . . oBn oC1 . . . oCn oD1 ∇A0 A2 ...An δ;, HABA2 ...An B1 ...Bn D1 = oB1 . . . oBn oD1 ∇B0 ∇A0 A2 ...An δ;, ABCA2 ...An D1 = oD1 ∇C0 ∇B0 ∇A0 A2 ...An δ;, CABCDA2 ...An D2 ...Dn = ∇C0 ∇B0 ∇A0 A2 ...An ∇D0 D2 ...Dn δ;. This can be compared with the corresponding chain for n = 1 [14]. 5. Hamiltonian and Lagrangian Formalisms In this section we shall investigate the Lagrangian and Hamiltonian formulations of the hyper-Kähler equations in their “heavenly” forms. The symplectic form on the space of solutions to heavenly equations will be derived, and proven to be compatible with a recursion operator. Both the first and second heavenly equations admit Lagrangian formulations, and these can be used to derive symplectic structures on the solution spaces, which we denote by S. Here, rather than consider the equations as a real system of elliptic or ultrahyperbolic equations, we complexify and consider the equations locally as evolving initial data from a 3-dimensional hyper-surface and it is this space of initial data that leads to local solutions on a neighbourhood of such a hyper-surface that is denoted by S and is endowed with a (conserved) symplectic form. For the first equation we have the Lagrangian density 1 ˜ 2 1 L9 = 9 ν − (∂ ∂9) = 9 − 9{9z˜ , 9w˜ }wz ν, (5.1) 3 3 and for the second equation 2 1 0 ;(∂2 ;)2 − (∂;) ∧ (∂2 ;) ∧ eA0 ∧ eA L; = 3 2 1 1 = ;{;x , ;y }xy − (;x ;w + ;y ;z ) ν. 3 2

(5.2)

0 can be replaced by dx ∧dy in the second Lagrangian as it is multiplied Note that eA0 ∧eA by dw ∧ dz. If the field equations are assumed, the variation of these Lagrangians will yield only a boundary term. Starting with the first equation, this defines a potential one-form P on the solution space S and hence a symplectic structure 9 = dP on S. Starting with the second we find a symplectic structure with the same expression on perturbations δ; as we had for δ9. However, since their relation to perturbations of the hyper-Kähler structure are different, they define different symplectic structures on S. These are related by the recursion operator since we have R 2 δ9 = δ; from above. In order to see that


these structures yield the usual bi-Hamiltonian framework, we will need to show that these symplectic structures are compatible with the recursion operator in the sense that 9(Rφ, φ ) = 9(φ, Rφ ). We shall demonstrate this using the first heavenly formulation which is easier as one can use identities from Kähler geometry. (The derivation of the symplectic structure from the second Lagrangian will be done in coordinates, since the useful relation between the Hodge star and the Kähler structure is missing in this case.) Proposition 5.1. The symplectic form on the space of solutions S derived from the boundary term in the variational principle for the first Lagrangian is 2 9(δ1 9, δ2 9) = δ1 9 ∗ d(δ2 9) − δ2 9 ∗ d(δ1 9). (5.3) 3 δM Proof. Varying (5.1) we obtain 1 ˜ 2 2 2 ˜ ˜ ∧ ∂ ∂δ9 ˜ ˜ − 9∂ ∂δ9). ˜ δL = δ9(ν − (∂ ∂9) ) − 9∂ ∂9 = ∂ ∂9 ∧ (δ9∂ ∂9 3 3 3 ˜ = 2∂∂, ˜ We use the identities d(∂ − ∂) equation to obtain

˜ ∧ (∂ − ∂) ˜ = ∗d and the field ω ∧ J1 d = ∂ ∂9

1 ˜ ˜ − 9d(∂ − ∂)δ9) ˜ δL = − ∂ ∂9 ∧ (δ9d(∂ − ∂)9 3 1 1 ˜ ˜ ˜ ˜ + ∗∂ ∂9(∂ ˜ ˜ ˜ = dA(δ9)− ∂ ∂9(−∗ ∂ ∂9(∂ − ∂)δ9(∂ − ∂)9 − ∂)9(∂ − ∂)δ9) 3 3 1 = dA(δ9) where A(δ9) = 9 ∗ dδ9 − δ9 ∗ d9. 3 Define the one form on S, P =

δM

A(δ9).

The symplectic structure 9 is the (functional) exterior derivative of P 9(δ1 9, δ2 9) = δ1 (P (δ2 9)) − δ2 (P (δ1 9)) − P ([δ1 9, δ2 9]) 2 δ1 9 ∗ d(δ2 9) − δ2 9 ∗ d(δ1 9). = 3 δM Thus 9 coincides with the symplectic form on the solution space to the wave equation on the ASD vacuum background. The existence of the recursion operator allows the construction of an infinite sequence of symplectic structures. The key property we need is the following Proposition 5.2. Let φ, φ ∈ Wg and let 9 be given by (5.3). Then 9(Rφ, φ ) = 9(φ, Rφ ). We first prove a technical lemma:

(5.4)


Lemma 5.3. The following identities hold: ˜ ω ∧ ∂φ = −α ∧ ∂Rφ, ω ∧ ∂2 φ = α˜ ∧ ∂2 Rφ, ˜ = α˜ ∧ ∂φ. ω ∧ ∂2 Rφ = −α ∧ ∂0 φ, ω ∧ ∂Rφ

(5.5)

Proof. From the definitions of A B and ∂AB it follows that

C] C A [B A B ∧ ∂D ∧ ∂D =

(5.6)

(recall that ∂AB = eAB ⊗ ∂AA ) which yields ω ∧ ∂˜ = α˜ ∧ ∂2 ,

ω ∧ ∂ = −α ∧ ∂0 , ω ∧ ∂2 = −α ∧ ∂2 , α ∧ ∂ = α˜ ∧ ∂˜ = 0.

ω ∧ ∂0 = α˜ ∧ ∂,

Multiplying (3.4) by combinations of spin co-frame we get an equivalent definition of the recursion operator

∂1A φ = ∂0A Rφ

(5.7)

˜ which is equivalent to ∂φ = ∂2 Rφ or φ = ∂Rφ. These formulae give the desired result. Proof of Proposition 5.2. The proof uses a (formal) application of Stokes’ theorem: 9(φ, φ ) = φ ∗ dφ − φ ∗ dφ δM ˜ ˜ − φ ∂φ + φ ∂φ) = ω ∧ (φ∂φ − φ ∂φ δM (5.8) ˜ − 2φ ∂φ) = ω ∧ (φdφ + φ dφ − 2φ ∂φ δM ˜ + φ∂φ ) ˜ + φ ∂φ) = 2 = −2 ω ∧ (φ ∂φ ω ∧ (φ ∂φ δM

δM

From (5.8) and from (5.5) we have ˜ + Rφ ∂φ) = − 9(φ, Rφ ) = − ω ∧ (φ ∂Rφ δM

δM

φ∂φ ∧ α˜ +

δM

˜ Rφ ∂Rφ ∧α

and analogously 9(Rφ, φ ) =

δM

φ ∂φ ∧ α˜ −

δM

˜ ∧ α. Rφ ∂Rφ

Equality (5.4) is achieved by subtracting the integral of d(φφ ) ∧ α˜ − d(RφRφ ) ∧ α and applying Stokes’ theorem. This property guarantees that the bilinear forms 9k (φ, φ ) ≡ 9(R k φ, φ )

(5.9)


are skew. Furthermore they are symplectic and lead to the bi-Hamiltonian formulation. In this context formula (5.4) and the closure condition for $\Omega_k$ are an algebraic consequence of the fact that $R$ comes from two Poisson structures. Using the theory of bi-Hamiltonian systems one can now go on to prove that the flows constructed by application of $R$ to some standard flow commute.

To develop the bi-Hamiltonian theory, we would like to write the heavenly equations in Hamiltonian form. However the Legendre transform becomes singular for the coordinate flows associated to the coordinates we have chosen since they are, at least in the Minkowski space limit, null coordinates. One possibility is to develop a Hamiltonian formalism based on such null hyper-surfaces. We shall adopt a different approach and reformulate the second heavenly equation as a first order system. Define $\phi := -\Theta_x$ and formally rewrite the second heavenly equation (2.22) as

$$\partial_w\phi = R(\partial_y\phi), \quad\text{where}\quad R = (\partial_z + \{\phi, \ldots\}_{yx})\circ\partial_x^{-1} = \nabla_{11}\circ\nabla_{10}^{-1}. \tag{5.10}$$

It is therefore a conjugated operator $R$ (defined by (3.5)), acting on solutions to the zero-rest-mass equations, and plays the role of the recursion operator. Flows of the sub-hierarchy $[L_1, L_{0j}] = 0$ are $\partial_{t_j}\phi = R^j\partial_y\phi$ and the Hamiltonian for the first nontrivial flow is

$$H_1 = \int \frac{\phi^2}{2}\,dx\wedge dy\wedge dz.$$

Higher Hamiltonians $H_n$ can in principle be constructed using the operator $R$. However, we have not developed explicit formulae for these $H_n$.

5.1. A local bi-Hamiltonian form for the hierarchy. To end this section, we express the equations of the second heavenly hierarchy (4.2) in a compact form, and then write it as a (formal) bi-Hamiltonian system on the spin bundle. This will be a rather different framework from that given above in that the Hamiltonian structure will in effect be local to the $x^{A0}$ plane as opposed to a field theoretic formulation – it is the gravitational analogue of that given for the Bogomolny equations in [19] except that no symmetries are required here (in effect because ASD gravity can be expressed as ASD Yang–Mills with two symmetries but with gauge group the group of area preserving diffeomorphisms). This formulation is therefore presented merely as a curiosity. Define the $j$th truncation of $\omega^A$ to be

$$\omega^A_j = -x^{A0} + \sum_{m=1}^{j}\lambda^m\,\partial^{A\,m-1}\Theta,$$

where $\partial^{Ai} = \varepsilon^{AB}\,\partial/\partial x^{Bi}$. (Note that this is truncated at both ends, although the truncation at the lower end and multiplication by a power of $\lambda$ is inessential.)

Lemma 5.4. The truncated heavenly hierarchy is equivalent to

$$\partial^{Bj}\omega^A(\lambda) = \{\omega^A(\lambda),\ \lambda^{-j}\omega^B_j(\lambda)\}_{yx}. \tag{5.11}$$


Proof. First observe that one can sum the Lax system to obtain −

j −1

λi LAi = λj ∂Aj +

i=0

j −1

λi+1 ε CD ∂C0 ∂Ai ;∂D0 − ∂A0

i=0 j

= λ ∂Aj + {ωAj , ·}yx , where {f, ·}yx = εCD ∂C0 f ∂D0 . Thus, since LAi ωA = 0, we have ∂Bj ωA = −λ−j {ωjB , ωA } which yields the desired answer. For the remainder of this section, we shall fix the values of the spinor indices to be A = 0 and B = 1. Set ∂j := ∂1j , : := ω0 , and ψj := ω1j . Equation (5.11) takes the form ∂j :(λ) = {:(λ), λ−j ψj (λ)}yx which we rewrite as ∂j : = D

δhj . δ:

(5.12)

m Here D := {:(λ), . . . }yx = ∞ i=0 Dm λ is a λ-dependent Poisson structure, D0 = ∂x and Dm = [∂m−1 , V ] = D0m − ∂0m for m > 0. The Hamiltonians are hj (λ) = λ−j ψj (λ):(λ). 6. Outlook – Examples with Higher Symmetries This section motivates the study of solutions to heavenly equations which are invariant under some hidden symmetries, e.g. along the higher flows. More generally, one can consider solutions to the hyper-Kähler equations without symmetries, but whose hierarchies do admit symmetries. In a subsequent paper we shall give a general construction of such metrics based on a generalisation of [30]. We consider the case in which the twistor spaces have a globally defined twistor function homogeneous of degree n+1. This implies that the metric admits a Killing spinor (some solutions with this property are given by [7]). Global sections Q ∈ H 0 (CP1 , O(n + 1)) on non-deformed twistor space µ : PT −→ CP1 will be classified and Q-preserving deformations of the complex structure of a neighbourhood of an O(1) ⊕ O(1) section of µ will be studied. The cohomology classes determining the deformation will depend on the fibre coordinates of µ only via Q. The canonical forms of patching functions can be derived to give explicit solutions to anti-self-dual ASD vacuum Einstein equation. There are also further details of the bi-Hamiltonian structure that could usefully be clarified.


Acknowledgements. We are grateful to Roger Penrose, George Sparling, Paul Tod, Nick Woodhouse, and others for some helpful discussions. Some parts of this work were finished during the workshop Spaces of geodesics and complex methods in general relativity and geometry held in the summer of 1999 at the Erwin Schrödinger Institute in Vienna. We wish to thank ESI for the hospitality and for financial assistance. LJM was supported by NATO grant CRG 950300.

References

1. Ablowitz, M.J., Clarkson, P.A.: Solitons, Nonlinear evolution equations and inverse scattering. L.M.S. Lecture note series, 149. Cambridge: CUP, 1992
2. Bailey, T.N., Eastwood, M.G.: Complex paraconformal manifolds – their differential geometry and twistor theory. Forum Math. 3 (1), 61–103 (1991)
3. Boyer, C.P., Plebański, J.F.: Heavens and their integral manifolds. J. Math. Phys. 18, 1022–1031 (1977)
4. Boyer, C.P., Plebański, J.F.: An infinite hierarchy of conservation laws and nonlinear superposition principles for self-dual Einstein spaces. J. Math. Phys. 26, 229–234 (1985)
5. Boyer, C., Winternitz, P.: Symmetries of the self-dual Einstein equations. I. The infinite-dimensional symmetry group and its low-dimensional subgroups. J. Math. Phys. 30, 1081–1094 (1989)
6. Dunajski, M.: The Nonlinear Graviton Construction as an Integrable System. DPhil thesis, Oxford University, 1998
7. Dunajski, M.: The Twisted Photon Associated to Hyper-hermitian Four Manifolds. J. Geom. Phys. 30, 266–281 (1999)
8. Dunajski, M., Mason, L.J.: Heavenly Hierarchies and Curved Twistor Spaces. Twistor Newsletter 41 (1996)
9. Dunajski, M., Mason, L.J.: Integrable flows on moduli of rational curves with normal bundle OA (n). Twistor Newsletter 42 (1997)
10. Dunajski, M., Mason, L.J.: A Recursion Operator for ASD Vacuums and Z.R.M Fields on ASD Backgrounds. Twistor Newsletter 43 (1997)
11. Dunajski, M., Mason, L.J., Woodhouse, N.M.J.: From 2D Integrable Systems to Self-Dual Gravity. J. Phys. A: Math. Gen. 31, 6019 (1998)
12. Gindikin, S.: On one construction of hyperKähler metrics. Funct. Anal. Appl. 20, 82–132 (1986)
13. Grant, J.D.E.: On Self-Dual Gravity. Phys. Rev. D48, 2606–2612 (1993)
14. Ko, B., Ludvigsen, M., Newman, E.T., Tod, K.P.: The theory of H space. Phys. Rep. 71, 51–139 (1981)
15. Kodaira, K.: On stability of compact submanifolds of complex manifolds. Am. J. Math. 85, 79–94 (1963)
16. Magri, F.: A simple model of the integrable Hamiltonian equation. J. Math. Phys. 19, 1156–1162 (1978)
17. Mason, L.J.: H-space, a universal integrable system? Twistor Newsletter 30 (1990)
18. Mason, L.J., Newman, E.T.: A connection between the Einstein and Yang–Mills equations. Commun. Math. Phys. 121, 659–668 (1989)
19. Mason, L.J., Sparling, G.A.J.: Twistor correspondences for the soliton hierarchies. J. Geom. Phys. 8, 243–271 (1992)
20. Mason, L.J., Woodhouse, N.M.J.: Integrability, Self-Duality, and Twistor Theory. L.M.S. Monographs New Series, 15. Oxford: OUP, 1996
21. Park, Q.H.: Self-Dual Gravity as a Large-N Limit of the 2D Non-Linear Sigma Model. Phys. Lett. 238A, 287–290 (1990)
22. Penrose, R.: Zero rest-mass fields including gravitation: Asymptotic behaviour. Proc. Roy. Soc. London A284, 159–203 (1965)
23. Penrose, R.: Nonlinear gravitons and curved twistor theory. Gen. Rel. Grav. 7, 31–52 (1976)
24. Penrose, R., Rindler, W.: Spinors and Space-Time, Vol 1, 2. Cambridge: CUP, 1986
25. Plebański, J.F.: Some solutions of complex Einstein Equations. J. Math. Phys. 16, 2395–2402 (1975)
26. Strachan, I.A.B.: The Symmetry Structure of the Anti-Self-Dual Einstein Hierarchy. J. Math. Phys. 36, 3566–3573 (1995)
27. Sparling, G.A., Tod, K.P.: An example of an H-space. J. Math. Phys. 22, 331–332 (1981)
28. Takasaki, K.: An infinite number of hidden variables in hyper-Kähler metrics. J. Math. Phys. 30, 1515–1521 (1989)
29. Takasaki, K.: Symmetries of hyper-Kähler (or Poisson gauge field) hierarchy. J. Math. Phys. 31, 1877–1888 (1990)
30. Tod, K.P., Ward, R.S.: Self-dual metrics with self-dual Killing vectors. Proc. R. Soc. A 368, 411–427 (1979)
31. Ward, R.S.: On self-dual gauge fields. Phys. Lett. A 61, 81–82 (1977)
32. Ward, R.S.: Integrable and solvable systems and relations among them. Phil. Trans. R. Soc. A 315, 451–457 (1985)

Communicated by H. Nicolai

Commun. Math. Phys. 213, 673 – 683 (2000)


Renormalization of the Regularized Relativistic Electron-Positron Field

Elliott H. Lieb¹, Heinz Siedentop²

¹ Departments of Mathematics and Physics, Jadwin Hall, Princeton University, P.O.B. 708, Princeton, NJ 08544-0708, USA. E-mail: [email protected]

² Mathematik, Universität Regensburg, 93040 Regensburg, Germany

Received: 8 March 2000 / Accepted: 7 July 2000

Abstract: We consider the relativistic electron-positron field interacting with itself via the Coulomb potential defined with the physically motivated, positive, density-density quartic interaction. The more usual normal-ordered Hamiltonian differs from the bare Hamiltonian by a quadratic term and, by choosing the normal ordering in a suitable, self-consistent manner, the quadratic term can be seen to be equivalent to a renormalization of the Dirac operator. Formally, this amounts to a Bogolubov-Valatin transformation, but in reality it is non-perturbative, for it leads to an inequivalent, fine-structure dependent representation of the canonical anticommutation relations. This non-perturbative redefinition of the electron/positron states can be interpreted as a mass, wave-function and charge renormalization, among other possibilities, but the main point is that a non-perturbative definition of normal ordering might be a useful starting point for developing a consistent quantum electrodynamics.

1. Introduction

In relativistic quantum electrodynamics (QED) the quantized electron-positron field $\Psi(x)$, which is an operator-valued spinor, is written formally as

$$\Psi(x) := a(x) + b^*(x), \tag{1}$$

where $a(x)$ annihilates an electron at $x$ and $b^*(x)$ creates a positron at $x$. (We use the notation that $x$ denotes a space-spin point, namely $x = (\mathbf{x}, \sigma) \in \Gamma = \mathbb{R}^3 \times \{1, 2, 3, 4\}$, and $\int d^3x$ denotes integration over $\mathbb{R}^3$ and a summation over the spin index.) More precisely, we take the Hilbert space $\mathcal{H}$ of $L^2(\mathbb{R}^3)$ spinors, i.e., $\mathcal{H} = L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$. To specify the one-electron space we choose a subspace $\mathcal{H}_+$ of $\mathcal{H}$ and denote the orthogonal projector onto this subspace by $P_+$. The one-positron space is the (anti-unitary) charge conjugate, $C\mathcal{H}_-$, of the orthogonal complement $\mathcal{H}_-$, with projector $P_-$. In position space, $(C(\psi))(x) := i\beta\alpha_2\overline{\psi(x)}$, whereas in momentum space $(C(\psi))^{\wedge}(p) := i\beta\alpha_2\overline{\hat\psi(-p)}$. For later purposes we note explicitly that the Hilbert space $\mathcal{H}$ can be written as the orthogonal sum

$$\mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-. \tag{2}$$

© 1999 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.

Current address (H. Siedentop): Mathematik, Universität München, Theresienstraße 39, 80333 München, Germany. E-mail: [email protected]

If $f_\nu$ is an orthonormal basis for $\mathcal{H}_+$ with $\nu \geq 0$ and an orthonormal basis for $\mathcal{H}_-$ with $\nu < 0$, then $a(x) := \sum_{\nu=0}^{\infty} a(f_\nu) f_\nu(x)$ and $b(x) := \sum_{\nu=-\infty}^{-1} b(f_\nu) f_\nu(x)$, where $a^*(f)$ creates an electron in the state $P_+ f$, etc. (For further details about the notation see Thaller [7], and also Helffer and Siedentop [4] and Bach et al. [2, 1].) For free electrons and positrons, $\mathcal{H}_+$ and $\mathcal{H}_-$ are, respectively, the positive and negative energy solutions of the free Dirac operator

$$D_0 = \boldsymbol{\alpha}\cdot\mathbf{p} + m_0\beta, \tag{3}$$

in which $\boldsymbol{\alpha}$ and $\beta$ denote the four 4 × 4 Dirac matrices and $\mathbf{p} = -i\nabla$. The number $m_0$ is the bare mass of the electron/positron. Perturbation theory is defined in terms of this splitting of $\mathcal{H}$ into $\mathcal{H}_+$ and $\mathcal{H}_-$. The electron/positron lines in Feynman diagrams are the resolvents of $D_0$ split in this way. As we shall see, this splitting may not be the best choice, ultimately, and another choice (which will require a fine-structure dependent, inequivalent representation of the canonical anticommutation relations) might be more useful. The Hamiltonian for the free electron-positron field is

$$H_0 = \int d^3x\, :\Psi(x)^* D_0 \Psi(x):$$

with this choice of $\mathcal{H}_+$. The symbol : : denotes normal ordering, i.e., anti-commuting all $a^*$, $b^*$ operators to the left (but ignoring the anti-commutators).

In this paper we investigate the effect of the Coulomb interaction among the particles, which is a quartic form in the $a^\#$, $b^\#$ operators. ($\#$ denotes either a star or no star.) The normal ordering will give rise to extra constants and certain quadratic terms. Our main point will be that by making an appropriate choice of $\mathcal{H}_+$ that is different from the usual one mentioned above, we can absorb the additional quadratic terms into a mass, wave function, and charge renormalization. This choice of $\mathcal{H}_+$ has to be made self-consistently and, as we show, can be successfully carried out if some combination of the fine structure constant $\alpha$ (= 1/137 in nature) and the ultraviolet cutoff $\Lambda$ is small enough (but “small” is actually “huge” since the condition is, roughly, $\alpha \log \Lambda < 1$).

The physical significance of our construction is not immediate, but it does show that certain annoying terms, when treated non-perturbatively, can be incorporated into the renormalization program. It is not at all clear that what we have called wave function renormalization, for example, really corresponds to a correct interpretation of that renormalization, or even what the true meaning of wave function renormalization is. It could also be interpreted as a renormalization of Planck’s constant. Likewise, the charge renormalization we find is not mandatory. But our main point is that the effect is mathematically real and is best dealt with by a change of the meaning of the one-particle electron/positron states. It should be taken into account in the non-perturbative QED that is yet to be born.

Of course, there are other renormalization effects in QED that we do not consider. In particular, the magnetic field has not been included and so we do not have a Ward identity to help us. Our result here is only a small part of the bigger picture of a proper, non-perturbative QED. Some other QED renormalizations involving the magnetic field are discussed in [6]. The last section of this paper contains a brief discussion of some possible interpretations of our findings and the interested reader is urged to look at that section.

Our starting point is the unrenormalized (“bare”) Hamiltonian for a quantized electron/positron field interacting with itself via Coulomb’s law, namely

$$H^{\mathrm{bare}} := \int d^3x\, :\Psi^*(x) D_0 \Psi(x): \;+\; \frac{\alpha}{2}\int d^3x \int d^3y\, W(x, y)\, :\Psi^*(x)\Psi(x):\,:\Psi^*(y)\Psi(y):, \tag{4}$$

where $W$ is a symmetric ($W(x, y) = W(y, x)$) interaction potential. The case of interest here is the Coulomb potential $W(x, y) = \delta_{\sigma,\tau}|\mathbf{x} - \mathbf{y}|^{-1}$ and regularized versions thereof. We will refer to the first term on the right side of (4) as the kinetic energy $T$ and to the second term as the interaction energy $W^{\mathrm{bare}}$.

Our choice of the product of the two normal ordered factors in the interaction, namely $:\rho(x):\,:\rho(y):$, is taken from the book of Bjorken and Drell [3] (Eq. (15.28) without magnetic field). It is quite natural, we agree, to start with the closest possible analog to the classical energy of a field. Of course, this term is only partly normal ordered in the usual sense and therefore it does not have zero expectation value in the unperturbed vacuum or one-particle states. It is positive, however, as a Coulomb potential ought to be. The fully normal ordered interaction (which is not positive) is defined to be

$$W^{\mathrm{ren}}_\alpha = \frac{\alpha}{2}\int d^3x \int d^3y\, W(x, y)\, :\Psi^*(x)\Psi^*(y)\Psi(y)\Psi(x):. \tag{5}$$

Bear in mind that this definition entails a determination of the splitting (2) of $\mathcal{H}$. We introduce a normal ordered (“dressed”) Hamiltonian

$$H^{\mathrm{ren}}_\alpha := \int d^3x\, :\Psi^*(x) D_{Z,m} \Psi(x): \;+\; W^{\mathrm{ren}}_\alpha. \tag{6}$$

(Again, this depends on the splitting of $\mathcal{H}$.) The operator $D_{Z,m}$ differs from $D_0$ in two respects:

$$D_{Z,m} = Z^{-1}(\boldsymbol{\alpha}\cdot\mathbf{p} + m\beta), \tag{7}$$

where Z < 1 and m > m0 . We call m0 the bare mass and m the “physical” mass – up to further renormalizations, which we do not address in this paper. The factor Z is called the “wave function renormalization” because the “wave function renormalization” in the standard theory is defined by the condition that the electron-positron propagator equals Z times a non-interacting propagator, which is the inverse of a Dirac operator. To the extent that Wren α can be neglected, the propagator with just the first term on the right side of (6) would, indeed, be Z times a Dirac operator with mass m. One could also interpret Z in other ways, e.g., as a renormalization of the speed of light, but such a choice would be considerably more radical. Some other interpretations are discussed in the last section.


Our goal is to show that by a suitable choice of $Z$, $m$, and the electron space $\mathcal{H}_+$, we have – up to physically unimportant infinite additive constants – asymptotic equality between $H^{\mathrm{bare}}$ and $H^{\mathrm{ren}}_\alpha$ for momenta that are much smaller than the cutoff $\Lambda$:

$$H^{\mathrm{bare}} = H^{\mathrm{ren}}_\alpha \quad \text{(for small momenta)}. \tag{8}$$

The additional normal ordering in (6) will introduce some quadratic terms and it is these terms to which we turn our attention and which we would like to identify as renormalization terms. Whether our picture is really significant physically, or whether it is just a mathematical convenience remains to be seen. What it does do is indicate a need for starting with a non-conventional, non-perturbative choice of the free particle states, namely the choice of $\mathcal{H}_+$.

2. Calculation of the New Quadratic Terms

The canonical anti-commutation relations are

$$\{a(f), a(g)\} = \{a^*(f), a^*(g)\} = \{a(f), b(g)\} = \{a^*(f), b^*(g)\} = \{a^*(f), b(g)\} = \{a(f), b^*(g)\} = 0, \tag{9}$$

and

$$\{a(f), a^*(g)\} = (f, P_+ g), \qquad \{b^*(f), b(g)\} = (f, P_- g). \tag{10}$$

Formally, this is equivalent to

$$\{a(x), a(y)\} = \{a^*(x), a^*(y)\} = \{a(x), b(y)\} = \{a^*(x), b^*(y)\} = \{a^*(x), b(y)\} = \{a(x), b^*(y)\} = 0, \tag{11}$$

and

$$\{a(x), a^*(y)\} = P_+(x, y), \qquad \{b^*(x), b(y)\} = P_-(x, y), \tag{12}$$

where $P_\pm(x, y)$ is the integral kernel of the projector $P_\pm$.

In order to compare the bare Hamiltonian $H^{\mathrm{bare}}$ with the renormalized one $H^{\mathrm{ren}}_\alpha$ we must face a small problem; the momentum cutoff $\Lambda$, which is necessary in order to get finite results, will spoil the theory at momenta comparable to this cutoff. Thus, our identification of the difference as a wave function and mass renormalization will be exact only for momenta that are small compared to $\Lambda$. We will also require (although it is not clear if this is truly necessary or whether it is an artifact of our method) that $\alpha \ln(\Lambda/m_0)$ is not too large.

The potential energy $W^{\mathrm{bare}}$ clearly has positive singularities in it if the unregularized Coulomb interaction $W(x, y) = |\mathbf{x} - \mathbf{y}|^{-1}\delta_{\sigma,\tau}$ is taken. We therefore cut off the field operators for high momenta above $\Lambda$. (Another choice would be to cut off the Fourier transform of $|\mathbf{x} - \mathbf{y}|^{-1}$ at $\Lambda$ and the result would be essentially the same; our choice is motivated partly by computational convenience.) Our cutoff procedure is to require that

$$\Psi(x) := (2\pi)^{-3/2}\int_{\{\mathbf{p}\in\mathbb{R}^3 \,|\, |\mathbf{p}|<\Lambda\}} \hat\Psi(\mathbf{p}, \sigma)\,e^{-i\mathbf{p}\cdot\mathbf{x}}\,d\mathbf{p} \tag{13}$$

holds, with $\Lambda > 0$. We could also take a smooth cutoff without any significant change. This regularization keeps the positivity of $W^{\mathrm{bare}}$. To make all terms finite we also need a volume cutoff for one of the difference terms. However, the volume singularity only occurs as an additive constant, which we drop. The renormalized Hamiltonian is free of infrared divergent energies and so does not need a volume cutoff.

Next, we calculate the difference

$$\begin{aligned} R := W^{\mathrm{bare}} - W^{\mathrm{ren}}_\alpha = \frac{\alpha}{2}\int d^3x \int d^3y\, W(x, y)\,\big[\, & a^*(x)a(y)P_+(x, y) + a^*(x)b^*(y)P_+(x, y) + b(x)a(y)P_+(x, y) \\ & - b^*(y)b(x)P_+(x, y) - a^*(y)a(x)P_-(x, y) + P_-(x, y)P_+(x, y) \\ & + a(x)b(y)P_-(x, y) + b^*(x)a^*(y)P_-(x, y) + b(x)^*b(y)P_-(x, y)\,\big]. \end{aligned} \tag{14}$$

Thus, (4) can be rewritten as

$$\begin{aligned} H^{\mathrm{bare}} = {} & \int d^3x\, :\Psi^*(x) D_0 \Psi(x): \;+\; \frac{\alpha}{2}\int d^3x \int d^3y\, \frac{P_+(x, y) - P_-(x, y)}{|\mathbf{x} - \mathbf{y}|}\, :\Psi^*(x)\Psi(y): \;+\; W^{\mathrm{ren}}_\alpha \\ & + \frac{\alpha}{2}\int d^3x \int d^3y\, \frac{P_+(x, y)P_-(x, y)}{|\mathbf{x} - \mathbf{y}|}. \end{aligned} \tag{15}$$

The first two terms on the right side are one-particle operators (i.e., quadratic in the field operators). Let us call their sum $A$, which we write (formally, since $A$ is a differential operator) as

$$A = \int d^3x \int d^3y\, A(x, y)\, :\Psi^*(x)\Psi(y):. \tag{16}$$

The last term is a cutoff dependent constant, which happens to be infinite. (One can show that this term is positive and, if we put in a momentum cutoff $\Lambda$ and volume (infrared) cutoff $V$, this integral is proportional to $V\Lambda^4$.)

We can make $A$ positive by choosing the normal ordering appropriately. That is, we choose $P_+$ to be the projector onto the positive spectral subspace of the operator $A$. Thus, we are led to the nonlinear equation for the unknown $A$,

$$A = D_0 + \frac{\alpha}{2}R, \tag{17}$$

where the operator $R$ has the integral kernel

$$R(x, y) := \frac{P_+(x, y) - P_-(x, y)}{|\mathbf{x} - \mathbf{y}|} = \frac{(\operatorname{sgn} A)(x, y)}{|\mathbf{x} - \mathbf{y}|}, \tag{18}$$

with $\operatorname{sgn} t$ being the sign of $t$.

For physical reasons and to simplify matters we restrict the search for a solution to (17) to translationally invariant operators, i.e., 4 × 4 matrix-valued Fourier multipliers. Moreover, we make the ansatz

$$A(\mathbf{p}) := \boldsymbol{\alpha}\cdot\omega_{\mathbf{p}}\, g_1(|\mathbf{p}|) + \beta g_0(|\mathbf{p}|) \tag{19}$$

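with real functions $g_1$ and $g_0$, where $\omega_{\mathbf{p}}$ is the unit vector in the $\mathbf{p}$ direction. In other words, we try to make $A$ look as much as possible like a Dirac operator. With this ansatz we have $\operatorname{sgn} A(\mathbf{p}) = A(\mathbf{p})/(g_1(|\mathbf{p}|)^2 + g_0(|\mathbf{p}|)^2)^{1/2}$. Thus, recalling (13), (17) can be fulfilled, if

$$g_0(|\mathbf{p}|) := m_0 + \frac{\alpha}{4\pi^2}\int_{|\mathbf{q}|<\Lambda} d\mathbf{q}\, \frac{1}{|\mathbf{p} - \mathbf{q}|^2}\, \frac{g_0(|\mathbf{q}|)}{(g_1(|\mathbf{q}|)^2 + g_0(|\mathbf{q}|)^2)^{1/2}}, \tag{20}$$

$$g_1(|\mathbf{p}|) := |\mathbf{p}| + \frac{\alpha}{4\pi^2}\int_{|\mathbf{q}|<\Lambda} d\mathbf{q}\, \frac{\omega_{\mathbf{p}}\cdot\omega_{\mathbf{q}}}{|\mathbf{p} - \mathbf{q}|^2}\, \frac{g_1(|\mathbf{q}|)}{(g_1(|\mathbf{q}|)^2 + g_0(|\mathbf{q}|)^2)^{1/2}} \tag{21}$$

is solvable. Note that the bare mass $m_0$ appears in (20).

3. Determination of the Dressed Electron

We will find a solution of the system (20), (21) by a fixed point argument. To this end we first integrate out the angle on the right-hand side. Setting $u := |\mathbf{p}|$ and $v := |\mathbf{q}|$ we get

$$g_0(u) = m_0 + \frac{\alpha}{2\pi}\int_0^\Lambda dv\, \frac{v}{u}\, Q_0\!\left(\frac12\left(\frac{u}{v} + \frac{v}{u}\right)\right) \frac{g_0(v)}{(g_1(v)^2 + g_0(v)^2)^{1/2}}, \tag{22}$$

$$g_1(u) = u + \frac{\alpha}{2\pi}\int_0^\Lambda dv\, \frac{v}{u}\, Q_1\!\left(\frac12\left(\frac{u}{v} + \frac{v}{u}\right)\right) \frac{g_1(v)}{(g_1(v)^2 + g_0(v)^2)^{1/2}}, \tag{23}$$

where, for $z > 1$,

$$Q_0(z) = \frac12\log\frac{z + 1}{z - 1} \tag{24}$$

and

$$Q_1(z) = \frac{z}{2}\log\frac{z + 1}{z - 1} - 1. \tag{25}$$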
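The passage from (20), (21) to (22), (23) rests on two elementary angular integrals: with $t = \omega_{\mathbf{p}}\cdot\omega_{\mathbf{q}}$ and $z = \frac12(u/v + v/u)$ one expects $\int_{-1}^{1} dt/(u^2+v^2-2uvt) = Q_0(z)/(uv)$ and $\int_{-1}^{1} t\,dt/(u^2+v^2-2uvt) = Q_1(z)/(uv)$. The sketch below checks this numerically with a plain Simpson rule; the function names are illustrative and are not part of the paper.

```python
import math


def Q0(z):
    """Q_0 from (24), valid for z > 1."""
    return 0.5 * math.log((z + 1.0) / (z - 1.0))


def Q1(z):
    """Q_1 from (25), valid for z > 1."""
    return 0.5 * z * math.log((z + 1.0) / (z - 1.0)) - 1.0


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def angular_integrals(u, v):
    """Integrals over t = cos(angle) of 1/|p-q|^2 and t/|p-q|^2 for |p| = u, |q| = v, u != v."""
    def denom(t):
        return u * u + v * v - 2.0 * u * v * t
    i0 = simpson(lambda t: 1.0 / denom(t), -1.0, 1.0)
    i1 = simpson(lambda t: t / denom(t), -1.0, 1.0)
    return i0, i1


if __name__ == "__main__":
    u, v = 1.0, 2.0
    z = 0.5 * (u / v + v / u)
    print(angular_integrals(u, v), (Q0(z) / (u * v), Q1(z) / (u * v)))
```

Multiplying by the radial measure $2\pi v^2\,dv$ and by the prefactor $\alpha/4\pi^2$ then gives the kernel $(\alpha/2\pi)(v/u)\,Q_{0,1}(z)$ appearing in (22) and (23).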

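The case of zero bare mass is particularly easy. We can choose $g_0 = 0$, in which case $g_1$ is obtained by integration. We get

$$\begin{aligned} g_1(u) &= u + \frac{\alpha}{2\pi}\int_0^\Lambda dv\, \frac{v}{u}\, Q_1\!\left(\frac12\left(\frac{u}{v} + \frac{v}{u}\right)\right) \\ &= u + \frac{\alpha}{2\pi}\, u \int_0^{\Lambda/u} dv\, v\, Q_1\big((1/v + v)/2\big) \\ &= u + \frac{\alpha}{2\pi}\, u \left[\frac23\log\big|(\Lambda/u)^2 - 1\big| - \frac13(\Lambda/u)^2 + \frac{\Lambda}{6u}\big(3 + (\Lambda/u)^2\big)\log\left|\frac{(\Lambda/u) + 1}{(\Lambda/u) - 1}\right|\right]. \end{aligned} \tag{26}$$

The function $g_1$ behaves asymptotically for small $u$ or large $\Lambda$ as $g_1(u) = \frac{\alpha}{2\pi}\,\frac43\, u \log(\Lambda/u)$. Unfortunately, because of the $\log u$, the operator $A$ can never look like the renormalized Dirac operator in (7). This asymptotic expansion can be seen either from (26) or else by noting that the integrand is a continuous function, except for $v = 1$, that decays at infinity as $4/(3v)$. This follows from the large $v$ expansion

$$v\,Q_1\big((1/v + v)/2\big) = \frac43 v^{-1} + \frac{8}{15} v^{-3} + O(v^{-5}). \tag{27}$$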
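Both the closed form in (26) and the expansion (27) are easy to sanity-check numerically. The sketch below, under the assumption that the bracket in (26) is the antiderivative $F(X)$ of $v\,Q_1((1/v+v)/2)$ with $F(0)=0$ and $X = \Lambda/u$, compares a finite-difference derivative of $F$ with the integrand and compares the integrand with the two-term series. The names `closed_form`, `integrand` and `large_v_series` are illustrative only.

```python
import math


def Q1(z):
    """Q_1 from (25), valid for z > 1."""
    return 0.5 * z * math.log((z + 1.0) / (z - 1.0)) - 1.0


def integrand(v):
    """v * Q_1((1/v + v)/2), the integrand of (26); v must differ from 1."""
    return v * Q1(0.5 * (v + 1.0 / v))


def closed_form(X):
    """Bracket of (26) as a function of X = cutoff / u."""
    log_ratio = math.log(abs((X + 1.0) / (X - 1.0)))
    return (2.0 / 3.0) * math.log(abs(X * X - 1.0)) - X * X / 3.0 \
        + X * (3.0 + X * X) / 6.0 * log_ratio


def large_v_series(v):
    """First two terms of the large-v expansion (27)."""
    return 4.0 / (3.0 * v) + 8.0 / (15.0 * v ** 3)


def g1_massless(u, alpha, cutoff):
    """g_1 for zero bare mass, evaluated from the closed form (26)."""
    return u + alpha / (2.0 * math.pi) * u * closed_form(cutoff / u)


if __name__ == "__main__":
    h = 1e-5
    for X in (0.5, 3.0, 10.0):
        slope = (closed_form(X + h) - closed_form(X - h)) / (2.0 * h)
        print(X, slope, integrand(X))
    print(integrand(20.0), large_v_series(20.0))
```

For large $X$ the bracket grows like $\frac43\log X$ plus a constant, which is the logarithm responsible for the asymptotic behaviour quoted above.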

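For positive bare masses we solve the system (22) and (23) by a fixed point argument. To this end we define the following set of pairs of functions, for $\varepsilon, \delta > 0$,

$$S_{\varepsilon,\delta} := \{g = (g_0, g_1) \;|\; \forall u \in [0, \Lambda]:\ g(u) \in [m_0, (1 + \delta)m_0] \times [u, (1 + \varepsilon)u]\}. \tag{28}$$

Note that with the metric generated by the sup norm $S_{\varepsilon,\delta}$ is a complete metric space. Next define $T : S_{\varepsilon,\delta} \to S_{\varepsilon,\delta}$ by the right-hand side of (22) and (23), i.e.,

$$T_0(g)(u) = m_0 + \frac{\alpha}{2\pi}\int_0^\Lambda dv\, \frac{v}{u}\, Q_0\!\left(\frac12\left(\frac{u}{v} + \frac{v}{u}\right)\right) \frac{g_0(v)}{(g_1(v)^2 + g_0(v)^2)^{1/2}}, \tag{29}$$

$$T_1(g)(u) = u + \frac{\alpha}{2\pi}\int_0^\Lambda dv\, \frac{v}{u}\, Q_1\!\left(\frac12\left(\frac{u}{v} + \frac{v}{u}\right)\right) \frac{g_1(v)}{(g_1(v)^2 + g_0(v)^2)^{1/2}}. \tag{30}$$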
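The map $T$ of (29), (30) can be iterated numerically. What follows is a rough sketch, not the authors' procedure: it assumes a uniform grid on $(0, \Lambda]$, a midpoint quadrature whose nodes never coincide with the output nodes (so the logarithmic singularity at $v = u$ is never evaluated), and linear interpolation between the two grids. The function names and the parameter values ($\Lambda = 20\,m_0$, 200 grid points) are hypothetical choices made only for illustration.

```python
import math


def Q0(z):
    return 0.5 * math.log((z + 1.0) / (z - 1.0))


def Q1(z):
    return 0.5 * z * math.log((z + 1.0) / (z - 1.0)) - 1.0


def interp(x, xs, ys):
    """Linear interpolation on a uniform increasing grid xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    h = xs[1] - xs[0]
    k = min(int((x - xs[0]) / h), len(xs) - 2)
    w = (x - xs[k]) / h
    return (1.0 - w) * ys[k] + w * ys[k + 1]


def solve_dressed(alpha, m0, cutoff, n=200, tol=1e-10, max_iter=100):
    """Iterate g -> T(g) from g = (m0, u); returns (u, g0, g1, iterations)."""
    h = cutoff / n
    u = [h * (i + 1) for i in range(n)]      # nodes where g0, g1 are stored
    v = [h * (j + 0.5) for j in range(n)]    # midpoint quadrature nodes, v != u
    K0 = [[(vj / ui) * Q0(0.5 * (ui / vj + vj / ui)) for vj in v] for ui in u]
    K1 = [[(vj / ui) * Q1(0.5 * (ui / vj + vj / ui)) for vj in v] for ui in u]
    c = alpha / (2.0 * math.pi) * h
    g0, g1 = [m0] * n, list(u)
    for iteration in range(1, max_iter + 1):
        g0v = [interp(x, u, g0) for x in v]
        g1v = [interp(x, u, g1) for x in v]
        r0 = [a / math.hypot(a, b) for a, b in zip(g0v, g1v)]
        r1 = [b / math.hypot(a, b) for a, b in zip(g0v, g1v)]
        new0 = [m0 + c * sum(k * r for k, r in zip(row, r0)) for row in K0]
        new1 = [ui + c * sum(k * r for k, r in zip(row, r1)) for ui, row in zip(u, K1)]
        diff = max(max(abs(a - b) for a, b in zip(new0, g0)),
                   max(abs(a - b) for a, b in zip(new1, g1)))
        g0, g1 = new0, new1
        if diff < tol:
            return u, g0, g1, iteration
    raise RuntimeError("fixed point iteration did not converge")


if __name__ == "__main__":
    u, g0, g1, its = solve_dressed(1.0 / 137.0, 1.0, 20.0)
    print(its, g0[0], g1[0] / u[0])
```

Because the kernels are positive, every iterate satisfies $g_0 \ge m_0$ and $g_1(u) \ge u$, mirroring the lower bounds built into $S_{\varepsilon,\delta}$; the upper bounds hold here only up to discretization error.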

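Lemma 1. If $Y$, defined by

$$Y := \alpha\,\operatorname{arsinh}(\Lambda/m_0)/\pi, \tag{31}$$

satisfies $Y < 9/50$, if $\varepsilon \geq 50Y/(9 - 50Y)$, and if $\delta \geq Y/(1 - Y)$, then $T$ maps $S_{\varepsilon,\delta}$ into $S_{\varepsilon,\delta}$.

Note that $\operatorname{arsinh}(x) = \log(x + \sqrt{x^2 + 1})$, i.e., $Y$ grows logarithmically in $\Lambda/m_0$.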
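The thresholds in Lemma 1 are exactly the values at which the upper bounds $1 + (1+\delta)Y \le 1 + \delta$ and $1 + \frac{50}{9}(1+\varepsilon)Y \le 1 + \varepsilon$ become equalities. A small helper (hypothetical names, written only to make the arithmetic concrete) evaluates $Y$, the minimal $\varepsilon$ and $\delta$, and the constant $(2 + \varepsilon + \delta)Y$ that appears in the contraction condition of Theorem 1 below.

```python
import math


def minimal_parameters(Y):
    """Smallest eps and delta allowed by Lemma 1, and the constant (2 + eps + delta) * Y."""
    if not 0.0 < Y < 9.0 / 50.0:
        raise ValueError("Lemma 1 requires 0 < Y < 9/50")
    eps = 50.0 * Y / (9.0 - 50.0 * Y)
    delta = Y / (1.0 - Y)
    return eps, delta, (2.0 + eps + delta) * Y


def lemma1_parameters(alpha, cutoff_over_m0):
    """Y from (31) together with the minimal eps, delta and the contraction constant."""
    Y = alpha * math.asinh(cutoff_over_m0) / math.pi
    eps, delta, contraction = minimal_parameters(Y)
    return Y, eps, delta, contraction


if __name__ == "__main__":
    # alpha = 1/137 and a cutoff ratio of e^18 are the values quoted in the text.
    print(lemma1_parameters(1.0 / 137.0, math.exp(18.0)))
    print(minimal_parameters(1.0 / 7.0))
```

Evaluating `minimal_parameters` at $Y = 1/7$ shows that the contraction constant is still below one there, which is the arithmetic behind Corollary 1.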

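Proof. Obviously, $T_0(g)(u) \geq m_0$ and $T_1(g)(u) \geq u$. To bound $g_0$ we return to (20) and note the bound (which holds in $S_{\varepsilon,\delta}$)

$$\begin{aligned} g_0(|\mathbf{p}|) &\leq m_0 + \frac{\alpha}{4\pi^2}\int_{|\mathbf{q}|<\Lambda} d\mathbf{q}\, \frac{1}{|\mathbf{p} - \mathbf{q}|^2}\, \frac{(1 + \delta)m_0}{(|\mathbf{q}|^2 + m_0^2)^{1/2}} \\ &\leq m_0 + \frac{\alpha}{4\pi^2}\int_{|\mathbf{q}|<\Lambda} d\mathbf{q}\, \frac{1}{\mathbf{q}^2}\, \frac{(1 + \delta)m_0}{(|\mathbf{q}|^2 + m_0^2)^{1/2}} \\ &= m_0\left[1 + \frac{\alpha}{\pi}(1 + \delta)\operatorname{arsinh}(\Lambda/m_0)\right]. \end{aligned} \tag{32}$$

The second inequality holds because the convolution of two symmetric decreasing functions is symmetric decreasing.

Next we turn to $g_1$, which is a bit more complicated. We split the integration region in (21) into $A = \{\mathbf{q} \;|\; |\mathbf{q}| < 2|\mathbf{p}| \text{ and } |\mathbf{q}| < \Lambda\}$ and $B = \{\mathbf{q} \;|\; |\mathbf{q}| \geq 2|\mathbf{p}| \text{ and } |\mathbf{q}| < \Lambda\}$. In region $A$ we use $\omega_{\mathbf{p}}\cdot\omega_{\mathbf{q}} \leq 1$ and hence the contribution to $g_1$ from this region is bounded above by

$$\frac{\alpha}{4\pi^2}\int_{|\mathbf{q}|<\Lambda} d\mathbf{q}\, \frac{1}{|\mathbf{p} - \mathbf{q}|^2}\, \frac{(1 + \varepsilon)\,2|\mathbf{p}|}{(|\mathbf{q}|^2 + m_0^2)^{1/2}} \leq 2|\mathbf{p}|\,\frac{\alpha}{\pi}(1 + \varepsilon)\operatorname{arsinh}(\Lambda/m_0), \tag{33}$$

for the same reason as in (32).

In region $B$ we use that $(|\mathbf{p}|^2 - |\mathbf{q}|^2)^2 \geq 9|\mathbf{q}|^4/16$. We also note, for the integration over $B$, that we can take the mean of the integrand for $\mathbf{q}$ and $-\mathbf{q}$. In other words, we can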


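bound this term as follows:

$$\frac{\alpha}{4\pi^2}\int_B dq\,\frac{\omega_p\cdot\omega_q}{|p-q|^2}\,\frac{g_1(|q|)}{(g_1(|q|)^2+g_0(|q|)^2)^{1/2}} \le \frac{\alpha}{8\pi^2}\int_B dq\,\left|\frac{\omega_p\cdot\omega_q}{|p-q|^2} - \frac{\omega_p\cdot\omega_q}{|p+q|^2}\right|\frac{g_1(|q|)}{(g_1(|q|)^2+g_0(|q|)^2)^{1/2}}$$

$$\le \frac{\alpha}{8\pi^2}\int_B dq\,\frac{4|p||q|}{(|p|^2-|q|^2)^2}\,\frac{(1+\epsilon)|q|}{(|q|^2+m_0^2)^{1/2}} \qquad (34)$$

$$\le \frac{8\alpha(1+\epsilon)}{9\pi^2}\,|p|\int_B dq\,\frac{1}{|q|^2(|q|^2+m_0^2)^{1/2}} \le \frac{32\alpha(1+\epsilon)}{9\pi}\,|p|\,\mathrm{arsinh}(\Lambda/m_0).$$

Thus, we obtain the bound

$$g_1(|p|) \le |p|\Big\{1 + \frac{50\alpha}{9\pi}(1+\epsilon)\,\mathrm{arsinh}(\Lambda/m_0)\Big\}. \qquad (35)$$

Theorem 1. If $Y := \alpha\,\mathrm{arsinh}(\Lambda/m_0)/\pi < 9/50$, if $(2 + \epsilon + \delta)Y < 1$ and if $\epsilon, \delta$ satisfy the conditions of Lemma 1 then $T$ is a contraction.

Proof. Thanks to Lemma 1 we only need to establish the contraction property. We first note that for positive real numbers $x, y, \tilde x, \tilde y$,

$$\left|\frac{x}{(x^2+y^2)^{1/2}} - \frac{\tilde x}{(\tilde x^2+\tilde y^2)^{1/2}}\right| \le \frac{|\eta|}{\xi^2+\eta^2}\,\big((x-\tilde x)^2 + (y-\tilde y)^2\big)^{1/2}, \qquad (36)$$

where $(\xi, \eta)$ is some point on the line between $(x, y)$ and $(\tilde x, \tilde y)$. Thus we get, writing $z(u,v) := \frac12\left(\frac{u}{v}+\frac{v}{u}\right)$ for the argument of $Q_0$ and $Q_1$,

$$|[T_0(g) - T_0(\tilde g)](u)| + |[T_1(g) - T_1(\tilde g)](u)|$$

$$\le \frac{\alpha}{2\pi}\int_0^\Lambda dv\,\frac{v}{u}\left\{Q_0(z)\left|\frac{g_0(v)}{\sqrt{g_0(v)^2+g_1(v)^2}} - \frac{\tilde g_0(v)}{\sqrt{\tilde g_0(v)^2+\tilde g_1(v)^2}}\right| + Q_1(z)\left|\frac{g_1(v)}{\sqrt{g_0(v)^2+g_1(v)^2}} - \frac{\tilde g_1(v)}{\sqrt{\tilde g_0(v)^2+\tilde g_1(v)^2}}\right|\right\}$$

$$\le \frac{\alpha}{2\pi}\int_0^\Lambda dv\,\frac{v}{u}\left\{Q_0(z)\frac{(1+\epsilon)v}{v^2+m_0^2} + Q_1(z)\frac{(1+\delta)m_0}{v^2+m_0^2}\right\}\big((g_0(v)-\tilde g_0(v))^2 + (g_1(v)-\tilde g_1(v))^2\big)^{1/2}$$

$$\le \frac{\alpha}{2\pi}\int_0^\Lambda dv\,\frac{v}{u}\left\{Q_0(z)\frac{(1+\epsilon)v}{v^2+m_0^2} + Q_1(z)\frac{(1+\delta)m_0}{v^2+m_0^2}\right\}\|g-\tilde g\|$$

$$\le \frac{\alpha}{2\pi}\int_0^\Lambda dv\,\frac{v}{u}\,Q_0(z)\,\frac{(1+\epsilon)v + (1+\delta)m_0}{v^2+m_0^2}\,\|g-\tilde g\| \le (2+\epsilon+\delta)\,Y\,\|g-\tilde g\|, \qquad (37)$$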


where $Y = \alpha\,\mathrm{arsinh}(\Lambda/m_0)/\pi$ as in Lemma 1. In the last line we have simply noted that the integrals are smaller than the corresponding integral in (32).

Corollary 1. The map $T$ has a unique fixed point, if $Y < 1/7$.

Proof. Take $\epsilon = 50Y/(9 - 50Y)$ and $\delta = Y/(1 - Y)$ so that $T$ maps $S_{\epsilon,\delta}$ into itself. Then, if $Y < 1/7$, the contraction condition will be satisfied.

For $\alpha = 1/137$, i.e., the physical value of the fine structure constant, the condition $Y < 1/7$ is fulfilled for $\Lambda/m \le e^{18}$.

4. Properties of the Renormalized Hamiltonian

What we have shown up to now is that the bare Hamiltonian in (4) is equivalent, apart from some additive constants, to a renormalized Hamiltonian. This renormalized Hamiltonian has the form (6), but the quadratic term is only approximately the one given in (7). What takes the place of $D_{Z,m}$ is given in (17) and (19).

We see immediately from (22), (23) that the solution is a continuous pair of functions. We also see that as long as $m_0$ is not zero the functions $g_0$ and $g_1$ are positive and behave properly for small $|p|$, i.e., $g_0$ is constant and $g_1$ is proportional to $|p|$. To relate this to $D_{Z,m}$ we first factor out the constant $\lim_{u\to 0+}(g_1(u)/u)$ and call this $Z^{-1}$. As is evident from (21), this constant $1/Z$ is larger than one, as it should be. The next thing to verify is that $Zg_0(0)$ is bigger than $m_0$, since this is the renormalized mass $m$ that appears in (7). In other words, we have to verify that

$$Z^{-1}\frac{m}{m_0} = \frac{g_0(0)}{m_0} > \lim_{p\to 0}\frac{g_1(|p|)}{|p|} = Z^{-1}. \qquad (38)$$

We shall, in fact, prove more than this:

Theorem 2. Assuming that $Y < 1/7$, the unique solution to the Eqs. (20) and (21) mentioned has the property

$$\frac{g_0(|p|)}{m_0} > \frac{g_1(|p|)}{|p|} \qquad (39)$$

for all $p \ne 0$.

Proof. Define $\tilde S_{\epsilon,\delta}$ to be the subset of $S_{\epsilon,\delta}$ on which

$$\frac{g_0(|p|)}{m_0} \ge \frac{g_1(|p|)}{|p|} \qquad (40)$$

for all $p \ne 0$. If we can show that $T$ also leaves $\tilde S_{\epsilon,\delta}$ invariant, then we have shown the wanted inequality (39) on the solution in $S_{\epsilon,\delta}$ (by the uniqueness of the solution on $S_{\epsilon,\delta}$) except for the possibility that equality also can occur. This is so, since we can apply the fixed point argument not only to $S_{\epsilon,\delta}$ but also to $\tilde S_{\epsilon,\delta}$. Showing that (40) holds is, according to (22) and (23), equivalent to showing that

$$\int_0^\Lambda dv\,\frac{v}{u}\,Q_0\!\left(\frac12\left(\frac{u}{v}+\frac{v}{u}\right)\right)\frac{g_0(v)/m_0}{(g_1(v)^2+g_0(v)^2)^{1/2}} > \int_0^\Lambda dv\,\frac{v}{u}\,Q_1\!\left(\frac12\left(\frac{u}{v}+\frac{v}{u}\right)\right)\frac{g_1(v)/u}{(g_1(v)^2+g_0(v)^2)^{1/2}} \qquad (41)$$


holds. Now, using the Inequality (40) and the fact that the factors with the roots are monotone functions of $g_0$ and $g_1$ respectively, it is enough to show that

$$\int_0^\Lambda dv\,\frac{v}{u}\,Q_0\!\left(\frac12\left(\frac{u}{v}+\frac{v}{u}\right)\right)\frac{1}{(m_0^2+v^2)^{1/2}} > \int_0^\Lambda dv\,\frac{v}{u}\,Q_1\!\left(\frac12\left(\frac{u}{v}+\frac{v}{u}\right)\right)\frac{v/u}{(m_0^2+v^2)^{1/2}} \qquad (42)$$

holds. To proceed, we now compare the integrands pointwise. Using the explicit expressions (24) and (25) for $Q_0$ and $Q_1$, and with $t = v/u$, we will have shown that the integrand of the left-hand side pointwise majorizes the one on the right-hand side of (42) if

$$\log\left|\frac{t+1}{t-1}\right| < \frac{2t}{|t^2-1|} \qquad (43)$$

holds for all $t \ne 1$. By symmetry, we only have to consider (43) for $t > 1$. To this end we exponentiate (43),

$$\frac{t+1}{t-1} < \exp\left(\frac{2t}{t^2-1}\right),$$

and expand the exponential up to second order. (Note that this gives a lower bound on the exponential because the argument of the exponential function is positive.) I.e., it suffices to show that

$$\frac{t+1}{t-1} < 1 + \frac{2t}{t^2-1} + 2\,\frac{t^2}{(t^2-1)^2}, \qquad (44)$$

which follows by direct computation. Having established (40) we see that (44) indeed gives strict inequality in (40) for the unique fixed point in $\tilde S_{\epsilon,\delta}$.

5. Interpretations of Our Results

What we have done is start with the "bare" Hamiltonian $H^{\mathrm{bare}}$ in (4), in which the interaction is the closest analog to the classical electrostatic energy of a field. After some analysis, we found in Sect. 3 that $Z$ and $m$ could be uniquely chosen so that for momenta much less than the ultraviolet cutoff $\Lambda$, $H^{\mathrm{bare}} = H^{\mathrm{ren}}_\alpha$ plus a well defined infinite constant. Of course the field $\Psi$ in the two cases is different. Formally, they differ by a Bogolubov transformation, but in fact an inequivalent, $\alpha$ dependent, representation of the CAR is needed. In Sect. 4 we showed that not only is $Z^{-1}m > m_0$ but more is true, namely $m > m_0$, and this is comforting physically.

We emphasize that we have not "integrated out" any field variables. Our new Hamiltonian $H^{\mathrm{ren}}_\alpha$ is the same as the original one, on a purely formal level. It is suggestive, nevertheless, that a good deal of the electrostatic energy has now been incorporated into the leading term $\int d^3x\, :\Psi^*(x) D_{Z,m} \Psi(x):$ and that the remaining interaction is somehow less important than the original one. It is, after all, normal ordered, which means it vanishes on one-electron states, if we define such states by $\Psi^*|0\rangle$, where $|0\rangle$ is the vacuum of the new $\Psi$. While this makes sense perturbatively, it is, however, misleading


because the new vacuum (the state of lowest energy) is surely not the obvious choice $|0\rangle$.

If we drop the new interaction term $W^{\mathrm{ren}}_\alpha$ we are left with the Hamiltonian $\int d^3x\, :\Psi^*(x) D_{Z,m} \Psi(x):$. Unfortunately this is not the Hamiltonian of a Dirac operator (even at low momenta) because of the factor $Z^{-1}$. We have called this a wave function renormalization, but that is not really in the spirit of renormalizing the one-electron states (which is what is usually done) and is, instead, a renormalization of an operator. One school (Källén [5]) thinks it is proper to speak of a renormalized $\Psi$ by incorporating a factor $Z^{-1/2}$ into $\Psi$, but this changes the anti-commutation relations! Note that this formulation requires renormalizing the bare mass $m_0$ to $g_0(0)/Z$ and renormalizing the bare charge $e$ to $e/Z$.

Another point of view is to regard $Z^{-1}m$ as the physical mass and to change $Z^{-1}p$ into $p$ by a unitary transformation (which is nothing other than a change of length scale). This has the disadvantage of changing the speed of light or Planck's constant. It also would mean a different scale change for particles of different mass, e.g., muons. As opposed to Källén's procedure, this requires renormalizing the mass from $m_0$ to $g_0(0)$ only, but there is no need for charge renormalization.

Another possibility is to bring out the factor $Z^{-1}$ as a multiplier of the whole Hamiltonian, which would mean changing the fine structure constant to $\alpha Z$, i.e., a charge renormalization but now from $e$ to $e/Z^{1/2}$ only. The obvious problem here is that since the Hamiltonian is the generator of time translation, this means a change of the time scale (which, again, depends on the particle in question).

Doubtless, different people will have different opinions about these matters. We do not wish to commit ourselves to any point of view. But it is our opinion that the construction of $H^{\mathrm{ren}}_\alpha$ is a significant piece of the puzzle of constructing a nonperturbative QED.

Acknowledgement. The authors thank Dirk Hundertmark for a useful discussion. Financial support of the European Union, TMR grant FMRX-CT 96-0001, the U.S. National Science Foundation, grant PHY98-20650, and NATO, grant CRG96011 is acknowledged.

References

1. Bach, V., Barbaroux, J.-M., Helffer, B. and Siedentop, H.: Stability of matter for the Hartree-Fock functional of the relativistic electron-positron field. Doc. Math. 3, 353–364 (electronic) (1998)
2. Bach, V., Barbaroux, J.-M., Helffer, B. and Siedentop, H.: On the stability of the relativistic electron-positron field. Commun. Math. Phys. 201, 445–460 (1999)
3. Bjorken, J.D. and Drell, S.D.: Relativistic Quantum Fields. International Series in Pure and Applied Physics. New York: McGraw-Hill, 1st edition, 1965
4. Helffer, B. and Siedentop, H.: Form perturbations of the second quantized Dirac field. Math. Phys. Electron. J. 4, Paper 4, 16 pp. (electronic) (1998)
5. Källén, G.: Quantenelektrodynamik. In: S. Flügge, editor, Prinzipien der Quantentheorie I, volume V/1 of Handbuch der Physik, Berlin: Springer-Verlag, 1st edition, 1958, pp. 169–364
6. Lieb, E.H. and Loss, M.: Self-energy of electrons in non-perturbation QED. In: R. Weikard and G. Weinstein (eds.): Differential equations and mathematical physics, (Birmingham, AL, 1999). Amer. Math Soc./International Press (2000), pp. 255–269
7. Thaller, B.: The Dirac Equation. Texts and Monographs in Physics. Berlin: Springer-Verlag, 1st edition, 1992

Communicated by A. Jaffe

Commun. Math. Phys. 213, 685 – 696 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

The Number of Ramified Covering of a Riemann Surface by Riemann Surface

An-Min Li, Guosong Zhao, Quan Zheng

Department of Mathematics, Sichuan University, 610064, Chengdu, Sichuan, P.R. China. E-mail: [email protected]; [email protected]

Received: 10 June 1999 / Accepted: 7 July 2000

Partially supported by a NSFC, RFDP and a QiuShi grant.

Partially supported by the Postdoctor Foundation of Dalian University of Technology and the Youth Foundation of Sichuan University.

Abstract: Interpreting the number of ramified coverings of a Riemann surface by Riemann surfaces as the relative Gromov–Witten invariants and applying a gluing formula, we derive a recursive formula for the number of ramified coverings of a Riemann surface by a Riemann surface with elementary branch points and prescribed ramification type over a special point.

1. Introduction

Let $\Sigma^g$ be a compact connected Riemann surface of genus $g$ and $\Sigma^h$ a compact connected Riemann surface of genus $h$ ($g \ge h \ge 0$). A ramified covering of $\Sigma^h$ of degree $k$ by $\Sigma^g$ is a non-constant holomorphic map $f : \Sigma^g \to \Sigma^h$ such that $|f^{-1}(q)| = k$ for all but a finite number of points $q \in \Sigma^h$, which are called branch points. Two ramified coverings $f_1$ and $f_2$ are said to be equivalent if there is a homeomorphism $\pi : \Sigma^g \to \Sigma^g$ such that $f_1 = f_2 \circ \pi$. A ramified covering $f$ is called almost simple if $|f^{-1}(q)| = k - 1$ for each branch point but one, which is denoted by $\infty$. If $\alpha_1, \ldots, \alpha_m$ are the orders of the preimages of $\infty$, then the ordered $m$-tuple $(\alpha_1, \cdots, \alpha_m) = \alpha$ is a partition of $k$, denoted by $\alpha \vdash k$, and is called the ramification type of $f$ (at $\infty$). We call $m$ the length of $\alpha$, denoted by $l(\alpha) = m$. Let $\mu^{g,k}_{h,m}(\alpha)$ be the number of equivalent almost simple coverings of $\Sigma^h$ by $\Sigma^g$ with ramification type $\alpha$. How to determine $\mu^{g,k}_{h,m}(\alpha)$ is known as the Hurwitz Enumeration Problem. It was Hurwitz who first gave an explicit expression for $\mu^{0,k}_{0,m}(\alpha)$, see [H]. $\mu^{g,k}_{h,m}(\alpha)$ is called the Hurwitz number. Many mathematicians have contributed to this problem for the case $h = 0$. J. Dénes [D] gave a formula for $g = 0$, $l(\alpha) = 1$, and V. I. Arnol'd [A] for $g = 0$, $l(\alpha) = 2$. By a combinatorial method, I. P. Goulden


and D. M. Jackson gave an explicit formula for $g = 0, 1, 2$ (see [GJ1, 2, 3], [GJV]). Physicists M. Crescimanno and W. Taylor [CT] solved the case when $g = 0$ and $\alpha$ is the identity. R. Vakil [V] derived explicit expressions for $g = 0, 1$, using a deformation theory of algebraic geometry.

In this paper we interpret the Hurwitz number $\mu^{g,k}_{h,m}(\alpha)$ as the relative Gromov–Witten invariants defined by Li and Ruan [LR]; then, applying a gluing formula, we derive a recursive formula for $\mu^{g,k}_{h,m}(\alpha)$ with $g \ge h$.

Suppose $\alpha = (\alpha_1, \cdots, \alpha_m)$, put $\tilde J(\alpha) = \{(\alpha_i, \alpha_j) \mid 1 \le i < j \le m\}/\sim$, where $(\alpha_i, \alpha_j) \sim (\alpha_s, \alpha_t)$ iff $(\alpha_i, \alpha_j) = (\alpha_s, \alpha_t)$, or $(\alpha_i, \alpha_j) = (\alpha_t, \alpha_s)$. For every equivalence class $[(\alpha_i, \alpha_j)]$ of $\tilde J(\alpha)$ we choose a representative element, say $(\alpha_i, \alpha_j)$, and associate with it an ordered $(m-1)$-tuple $\theta = (\alpha_1, \cdots, \hat\alpha_i, \cdots, \hat\alpha_j, \cdots, \alpha_m, \alpha_i + \alpha_j)$, where the caret means that the term is omitted. Then we obtain a set $J(\alpha)$ of ordered $(m-1)$-tuples. For $\theta = (\alpha_1, \cdots, \hat\alpha_i, \cdots, \hat\alpha_j, \cdots, \alpha_m, \alpha_i + \alpha_j) \in J(\alpha)$, we define an integer

$$I_1(\theta) = \begin{cases} \frac12(\alpha_i + \alpha_j)\cdot\#\{\lambda \in \theta \mid \lambda = \alpha_i + \alpha_j\} & \text{if } \alpha_i = \alpha_j, \\ (\alpha_i + \alpha_j)\cdot\#\{\lambda \in \theta \mid \lambda = \alpha_i + \alpha_j\} & \text{if } \alpha_i \ne \alpha_j. \end{cases} \qquad (1.1)$$

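For every $\alpha_i$ in $\alpha$, we construct a set $C_{\alpha_i}(\alpha) = \{\omega = (\alpha_1, \cdots, \hat\alpha_i, \cdots, \alpha_m, \rho, \alpha_i - \rho) \mid 1 \le \rho \le [\frac{\alpha_i}{2}]\}$ of ordered $(m+1)$-tuples. Let $\alpha_1, \alpha_2, \cdots, \alpha_l$ be all elements in $\alpha$ with distinct values. Put $C(\alpha) = C_{\alpha_1}(\alpha) \cup C_{\alpha_2}(\alpha) \cup \cdots \cup C_{\alpha_l}(\alpha)$. For every $\omega = (\alpha_1, \cdots, \hat\alpha_i, \cdots, \alpha_m, \rho, \alpha_i - \rho) \in C(\alpha)$, we associate with it a number

$$I_2(\omega) = \begin{cases} \frac12\rho(\alpha_i - \rho)\cdot\#\{\lambda \in \omega \mid \lambda = \rho\}\cdot(\#\{\mu \in \omega \mid \mu = \alpha_i - \rho\} - 1) & \text{if } \rho = \alpha_i - \rho, \\ \rho(\alpha_i - \rho)\cdot\#\{\lambda \in \omega \mid \lambda = \rho\}\cdot\#\{\mu \in \omega \mid \mu = \alpha_i - \rho\} & \text{if } \rho \ne \alpha_i - \rho. \end{cases} \qquad (1.2)$$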
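The sets $J(\alpha)$ and $C(\alpha)$ and the weights $I_1$, $I_2$ can be enumerated mechanically. The sketch below is one possible reading of the definitions, not the authors' Maple code: it assumes that the factor $\frac12$ in (1.1) and (1.2) belongs to the cases $\alpha_i = \alpha_j$ and $\rho = \alpha_i - \rho$ respectively, and the function names `merges` and `splits` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def merges(alpha):
    """List (theta, I1(theta)) for theta in J(alpha), reading (1.1) as above."""
    seen, result = set(), []
    for i, j in combinations(range(len(alpha)), 2):
        ai, aj = alpha[i], alpha[j]
        key = tuple(sorted((ai, aj)))  # (a_i, a_j) ~ (a_j, a_i)
        if key in seen:
            continue  # keep one representative per equivalence class
        seen.add(key)
        rest = tuple(a for n, a in enumerate(alpha) if n not in (i, j))
        theta = rest + (ai + aj,)
        weight = Fraction(ai + aj) * theta.count(ai + aj)
        if ai == aj:
            weight /= 2
        result.append((theta, weight))
    return result


def splits(alpha):
    """List (omega, I2(omega)) for omega in C(alpha), reading (1.2) as above."""
    seen, result = set(), []
    for i, ai in enumerate(alpha):
        if ai in seen:
            continue  # C(alpha) is a union over distinct values only
        seen.add(ai)
        rest = tuple(a for n, a in enumerate(alpha) if n != i)
        for rho in range(1, ai // 2 + 1):
            omega = rest + (rho, ai - rho)
            n_rho = omega.count(rho)
            n_other = omega.count(ai - rho)
            if rho == ai - rho:
                weight = Fraction(rho * (ai - rho) * n_rho * (n_other - 1), 2)
            else:
                weight = Fraction(rho * (ai - rho) * n_rho * n_other)
            result.append((omega, weight))
    return result


if __name__ == "__main__":
    print(merges((1, 1, 2)))
    print(splits((1, 3)))
```

For example, for $\alpha = (2)$ the only element of $C(\alpha)$ is $\omega = (1, 1)$, and for $\alpha = (1, 1)$ the only element of $J(\alpha)$ is $\theta = (2)$.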

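Dividing $\{1, 2, \cdots, \hat i, \cdots, m\}$ into two parts $\pi_1, \pi_2$, correspondingly, we divide $\omega = (\alpha_1, \cdots, \hat\alpha_i, \cdots, \alpha_m, \rho, \alpha_i - \rho)$ into two parts of the form $\omega_{\pi_1} = (\alpha_{\pi_1}, \rho)$, $\omega_{\pi_2} = (\alpha_{\pi_2}, \alpha_i - \rho)$. For example if $\pi_1 = \{1\}$, $\pi_2 = \{2, \cdots, \hat i, \cdots, m\}$, then $\omega_{\pi_1} = (\alpha_1, \rho)$, $\omega_{\pi_2} = (\alpha_2, \cdots, \hat\alpha_i, \cdots, \alpha_m, \alpha_i - \rho)$. Note that $\pi_1, \pi_2$ may be empty. Write $P_\omega = \{\pi = (\omega_{\pi_1}, \omega_{\pi_2})\}/\sim$, where $\pi = (\omega_{\pi_1}, \omega_{\pi_2}) = (\cdots, \rho, \cdots, \alpha_i - \rho) \sim \tilde\pi = (\tilde\omega_{\pi_1}, \tilde\omega_{\pi_2}) = (\cdots, \tilde\rho, \cdots, \alpha_i - \tilde\rho) \in P_\omega$ iff $\rho = \tilde\rho$ and $\omega_{\pi_1}$ and $\tilde\omega_{\pi_1}$ are the same up to a permutation. We also use $\pi = (\omega_{\pi_1}, \omega_{\pi_2})$ to denote its equivalence class when there is no confusion. For every equivalence class $\pi = (\omega_{\pi_1}, \omega_{\pi_2}) \in P_\omega$, we associate with it a number

$$I_3(\pi) = \begin{cases} \frac12\rho(\alpha_i - \rho)\cdot\#\{\lambda \in \omega_{\pi_1} \mid \lambda = \rho\}\cdot\#\{\mu \in \omega_{\pi_2} \mid \mu = \alpha_i - \rho\} & \text{if } \rho = \alpha_i - \rho, \\ \rho(\alpha_i - \rho)\cdot\#\{\lambda \in \omega_{\pi_1} \mid \lambda = \rho\}\cdot\#\{\mu \in \omega_{\pi_2} \mid \mu = \alpha_i - \rho\} & \text{if } \rho \ne \alpha_i - \rho. \end{cases} \qquad (1.3)$$

In this paper, we will prove the following theorem, see Sect. 3: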


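Theorem A. All $\mu^{g,k}_{h,m}(\alpha)$ can be determined by a recursive formula:

$$\mu^{g,k}_{h,m}(\alpha) = \sum_{\theta \in J(\alpha)} \mu^{g,k}_{h,m-1}(\theta)\cdot I_1(\theta) + \sum_{\omega \in C(\alpha)} \mu^{g-1,k}_{h,m+1}(\omega)\cdot I_2(\omega)$$

$$+ \sum_{\omega \in C(\alpha)}\ \sum_{\substack{g_1 + g_2 = g,\ g_1, g_2 \ge 0 \\ k_1 + k_2 = k,\ k_1, k_2 \ge 1 \\ m_1 + m_2 = m + 1,\ m_1, m_2 \ge 1 \\ \pi \in P_\omega}} \binom{k + m - 2kh - 3 + 2g}{k_1 + m_1 - 2k_1h - 2 + 2g_1}\,\mu^{g_1,k_1}_{h,m_1}(\omega_{\pi_1})\cdot\mu^{g_2,k_2}_{h,m_2}(\omega_{\pi_2})\cdot I_3(\pi), \qquad (1.4)$$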

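where $m_i = l(\omega_{\pi_i})$, $i = 1, 2$.

Let $p = (p_1, p_2, p_3, \cdots)$ be indeterminates, and $p_\alpha = p_{\alpha_1}\cdots p_{\alpha_m}$ for $\alpha = (\alpha_1, \cdots, \alpha_m)$. Introduce a generating function for $\mu^{g,k}_{h,m}(\alpha)$,

$$\Phi_h(u, x, z, p) = \sum_{\substack{k, m \ge 1 \\ g \ge 0}}\ \sum_{\substack{\alpha \vdash k \\ l(\alpha) = m}} \mu^{g,k}_{h,m}(\alpha)\cdot\frac{u^{k+m-2kh-2+2g}}{(k+m-2kh-2+2g)!}\cdot\frac{x^k}{k!}\cdot z^g\,p_\alpha. \qquad (1.5)$$

Then after symmetrizing $\alpha_i$ and $\alpha_j$ in the variable $\theta = (\alpha_1, \cdots, \hat\alpha_i, \cdots, \hat\alpha_j, \cdots, \alpha_m, \alpha_i + \alpha_j) \in J(\alpha)$, and $\rho$ and $\alpha_i - \rho$ in the variable $\omega, \gamma = (\alpha_1, \cdots, \hat\alpha_i, \cdots, \alpha_m, \rho, \alpha_i - \rho)$ in (1.4), we have the following

Theorem B. The recursive formula (1.4) is equivalent to the following partial differential equation:

$$\frac{\partial\Phi_h}{\partial u} = \frac12\sum_{i,j \ge 1}\left(ij\,p_{i+j}\,z\,\frac{\partial^2\Phi_h}{\partial p_i\,\partial p_j} + ij\,p_{i+j}\,\frac{\partial\Phi_h}{\partial p_i}\frac{\partial\Phi_h}{\partial p_j} + (i+j)\,p_i p_j\,\frac{\partial\Phi_h}{\partial p_{i+j}}\right). \qquad (1.6)$$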

Remark. By a combinatorial method, I. P. Goulden and D. M. Jackson have proven the above equation for the case $h = 0$, and gave an explicit formula of $\mu^{g,k}_{h,m}(\alpha)$ for $h = 0$ and $g = 0, 1, 2$ (see [GJ1, 2, 3], [GJV]):

$$\mu^{0,k}_{0,m}(\alpha_1, \cdots, \alpha_m) = \frac{|c_\alpha|}{k!}\,(k+m-2)!\,k^{m-3}\prod_{i=1}^m\frac{\alpha_i^{\alpha_i}}{(\alpha_i - 1)!}, \qquad (1.7)$$

$$\mu^{1,k}_{0,m}(\alpha_1, \cdots, \alpha_m) = \frac{|c_\alpha|}{24\,k!}\,(k+m)!\prod_{i=1}^m\frac{\alpha_i^{\alpha_i}}{(\alpha_i - 1)!}\cdot\Big(k^m - k^{m-1} - \sum_{i=2}^m (i-2)!\,e_i\,k^{m-i}\Big), \qquad (1.8)$$

where $c_\alpha$ is the conjugacy class of the symmetric group $S_k$ on $k$ symbols indexed by the partition type $\alpha$ of $k$, and $e_i$ is the $i$th elementary symmetric function in $\alpha_1, \cdots, \alpha_m$.


For $h = 0$, using the recursive formula (1.4) with initial value

$$\mu^{g,1}_{0,1}(1) = \begin{cases} 1 & \text{if } g = 0, \\ 0 & \text{if } g \ge 1, \end{cases}$$

we calculate some values of $\mu^{g,k}_{0,m}(\alpha)$ with the aid of Maple for $g = 0, 1, \cdots, 5$, $k = 3, 4, 5$, $m = 1, 2, \cdots, 5$ (the column headed $g$ lists $\mu^{g,k}_{0,m}(\alpha)$):

α            g = 0   g = 1     g = 2       g = 3         g = 4           g = 5
(3)          1       9         81          729           6561            59049
(1,2)        4       40        364         3280          29524           265720
(1,1,1)      4       40        364         3280          29524           265720
(4)          4       160       5824        209920        7558144         272097280
(1,3)        27      1215      45927       1673055       60407127        2176250895
(2,2)        12      480       17472       629760        22674432        816291840
(1,1,2)      120     5460      206640      7528620       271831560       9793126980
(1,1,1,1)    120     5460      206640      7528620       271831560       9793126980
(5)          25      3125      328125      33203125      3330078125      333251953125
(1,4)        256     35840     3956736     409108480     41394569216     4156871147520
(2,3)        216     26460     2748816     277118820     27762350616     2777408868780
(1,1,3)      1620    234360    26184060    2719617120    275661886500    27700994510280
(1,2,2)      1440    188160    20160000    2059960320    207505858560    20803767828480
(1,1,1,2)    8400    1189440   131670000   13626893280   1379375197200   138543794363520
(1,1,1,1,1)  8400    1189440   131670000   13626893280   1379375197200   138543794363520

which coincide with the formulas (1.7) and (1.8) for $g = 0, 1$, respectively.

To do a similar calculation for $h > 0$, we have to calculate some special Hurwitz numbers $\mu^{g,k}_{h,m}(\alpha)$ for the case when $k + m - 2 + 2g - 2kh = 0$, which we will discuss in another paper.
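The agreement between the tabulated values and the closed formulas (1.7) and (1.8) can be reproduced with a few lines of exact arithmetic. The sketch below is an independent check, not the Maple computation mentioned above; it uses the standard fact $|c_\alpha|/k! = 1/(\prod_i \alpha_i \cdot \prod_j n_j!)$, where $n_j$ is the number of parts of $\alpha$ equal to $j$, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import factorial, prod


def class_fraction(alpha):
    """|c_alpha| / k! for the conjugacy class of cycle type alpha in S_k."""
    z = prod(alpha)
    for multiplicity in Counter(alpha).values():
        z *= factorial(multiplicity)
    return Fraction(1, z)


def part_factor(alpha):
    """Product over the parts of alpha_i**alpha_i / (alpha_i - 1)!."""
    return prod(Fraction(a**a, factorial(a - 1)) for a in alpha)


def elementary_symmetric(alpha, i):
    """i-th elementary symmetric function of the parts of alpha."""
    return sum(prod(c) for c in combinations(alpha, i))


def hurwitz_genus0(alpha):
    """Right-hand side of (1.7)."""
    k, m = sum(alpha), len(alpha)
    return (class_fraction(alpha) * factorial(k + m - 2)
            * Fraction(k) ** (m - 3) * part_factor(alpha))


def hurwitz_genus1(alpha):
    """Right-hand side of (1.8)."""
    k, m = sum(alpha), len(alpha)
    bracket = k**m - k ** (m - 1) - sum(
        factorial(i - 2) * elementary_symmetric(alpha, i) * k ** (m - i)
        for i in range(2, m + 1))
    return (class_fraction(alpha) / 24 * factorial(k + m)
            * part_factor(alpha) * bracket)


if __name__ == "__main__":
    for alpha in [(3,), (1, 2), (1, 3), (2, 3), (1, 1, 1, 2)]:
        print(alpha, hurwitz_genus0(alpha), hurwitz_genus1(alpha))
```

The printed pairs can be compared with the $g = 0$ and $g = 1$ columns of the table.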

2. Relative GW-Invariant

Let $(M, \omega)$ be a real $2n$-dimensional compact symplectic manifold with symplectic form $\omega$, and $Z^0, \ldots, Z^p$ symplectic submanifolds of $M$ with real codimension 2. Denote $Z = (Z^0, \cdots, Z^p)$. Let $\Sigma^g$ be a compact connected Riemann surface of genus $g \ge 0$. Suppose $A \in H_2(M, \mathbb{Z})$, $k^i = \{k^i_1, \cdots, k^i_{l_i}\}$ a set of positive integers, $i = 0, \ldots, p$, denoted by $K = \{k^0, \ldots, k^p\}$. Consider the moduli space $\mathcal{M} = \mathcal{M}^{M,Z}_{A,l}(g, K)$ of pseudo-holomorphic maps $f : \Sigma^g \to M$ with marked points $x_1, \cdots, x_l;\ y^0_1, \cdots, y^0_{l_0};\ \cdots;\ y^p_1, \cdots, y^p_{l_p} \in \Sigma^g$ such that $[f(\Sigma^g)] = A$, and $f$ is tangent to $Z^i$ at $y^i_1, \cdots, y^i_{l_i}$ with order $k^i_1, \cdots, k^i_{l_i}$, $i = 0, \ldots, p$. Denote $x = (x_1, \cdots, x_l)$, $y^i = (y^i_1, \cdots, y^i_{l_i})$, $y = (y^0, \cdots, y^p)$. Note that the intersection numbers $\#(A \cdot Z^i)$ are topological invariants, and $\sum_{j=1}^{l_i} k^i_j = \#(A \cdot Z^i)$. Moreover, since $Z^i$ is a symplectic submanifold, if $A$ can be expressed by the image of a nontrivial pseudo-holomorphic map $f : \Sigma^g \to M$, the intersection number $\#(Z^i \cdot A) \ge 0$. Similarly to the Gromov–Uhlenbeck compactification for the pseudo-holomorphic maps, we compactify $\mathcal{M}$ by $\overline{\mathcal{M}} = \overline{\mathcal{M}}^{M,Z}_{A,l}(g, K)$, the space of relative stable maps (for details see [LR]). We have two natural maps:

$$\Xi_{g,l} : \overline{\mathcal{M}} \to M^l, \qquad (f, \Sigma^g, x, y, K) \longmapsto (f(x_1), \cdots, f(x_l)), \qquad (2.1)$$

and

$$P : \overline{\mathcal{M}} \to Z^0 \times \cdots \times Z^p, \qquad (f, \Sigma^g, x, y, K) \longmapsto \big((f(y^0_1), \cdots, f(y^0_{l_0})), \cdots, (f(y^p_1), \cdots, f(y^p_{l_p}))\big). \qquad (2.2)$$

Roughly, the relative GW-invariants are defined as

$$\psi^{M,Z}_{A,g,l}(\delta|\beta, K) = \int_{\overline{\mathcal{M}}} \Xi^*_{g,l}\prod_i \delta_i \wedge P^*\prod_j \beta^j,$$

where $\delta = (\delta_1, \cdots, \delta_l)$, $\delta_i \in H^*(M, \mathbb{R})$, $\beta = (\beta^0, \cdots, \beta^p)$, $\beta^j = (\beta^j_1, \cdots, \beta^j_{l_j})$, $\beta^j_i \in H^*(Z^j, \mathbb{R})$ for any $i$. For the precise definition, see [LR].

If $\Sigma^g$ is not connected, suppose there exist $c$ connected components $\Sigma^{g_1}, \ldots, \Sigma^{g_c}$; then the genus $g$ is defined to be its algebraic genus, i.e., $g = \sum_{i=1}^c g_i - c + 1$. Let $P_x$ be the set of all ordered partitions of $\{x_1, \ldots, x_l\}$ into $c$ parts. Each $\pi = (\pi_1, \cdots, \pi_c) \in P_x$ records which marked points $x = (x_1, \cdots, x_l)$ go on each of the components $\Sigma^{g_1}, \cdots, \Sigma^{g_c}$. Similarly, we can define $\sigma^i = (\sigma^i_1, \cdots, \sigma^i_c) \in P_{y^i}$. Corresponding to the partition of $y$, we define the partition of $K$, i.e., $\sigma^i = (\sigma^i_1, \cdots, \sigma^i_c) \in P_{K^i}$, and write $\sigma_i = (\sigma^0_i, \cdots, \sigma^p_i)$, $\sigma = (\sigma_1, \cdots, \sigma_c)$. $\pi$, $\sigma$ induce a partition of $\delta$, $\beta$, respectively. Denote the parameters over the component $\Sigma^{g_i}$ by $x_{\pi_i}, y_{\sigma_i}, \delta_{\pi_i}, \beta_{\sigma_i}$. Suppose that $f_i : \Sigma^{g_i} \to M$ is a relative stable pseudo-holomorphic map such that $[f_i(\Sigma^{g_i})] = A_i$, and $\sum_{i=1}^c A_i = A$. Then for given $(A_1, \cdots, A_c)$, $\pi$ and $\sigma$, the relative GW-invariant $\psi^{M,Z,c}_{A,g,l}(\delta|\beta; K)$ is defined by

$$\psi^{M,Z,c}_{A,g,l}(\delta|\beta; K)(\pi, \sigma) = \prod_{i=1}^c \psi^{M,Z_{\sigma_i}}_{A_i,g_i,l_i}(\delta_{\pi_i}|\beta_{\sigma_i}; K_{\sigma_i}).$$

Consider the linearization of the $\bar\partial$-operator $D_f = D\bar\partial_J(f) : C^\infty(\Sigma, f^*TM) \to \Omega^{0,1}(f^*TM)$. If we choose a proper weighted Sobolev norm over $C^\infty(\Sigma, f^*TM)$ and $\Omega^{0,1}(f^*TM)$, we have the following, see [LR].

Lemma 2.1. $D_f$ is a Fredholm operator with index

$$\mathrm{Ind}(D_f) = 2C_1(M)A + (2n-6)(1-g) + 2\sum_{i=0}^p l_i - 2\sum_{i=0}^p\sum_{j=1}^{l_i} k^i_j + 2l \qquad (2.3)$$

and the relative GW-invariant $\psi^{M,Z}_{g,l}(\delta|\beta; K)$ is defined to be zero unless

$$\sum_{i=1}^l \deg\delta_i + \sum_{i=0}^p\sum_{j=1}^{l_i} \deg\beta^i_j = \mathrm{Ind}(D_f). \qquad (2.4)$$


Suppose that $H : M \to \mathbb{R}$ is a proper periodic Hamiltonian function such that the Hamiltonian vector field $X_H$ generates a circle action. By adding a constant, we can assume that zero is a regular value. Then $H^{-1}(0)$ is a smooth submanifold preserved by the circle action. The quotient $B = H^{-1}(0)/S^1$ is the famous symplectic reduction. Namely, $B$ has an induced symplectic structure, so we can regard $B$ as a symplectic submanifold of $M$ with real codimension 2. We cut $M$ along $H^{-1}(0)$. Suppose that we obtain two disjoint components $M^\pm$ which have boundary $H^{-1}(0)$. We can collapse the $S^1$-action on $H^{-1}(0)$ to obtain two closed symplectic manifolds $\overline{M}^\pm$. This procedure is called symplectic cutting, see [L, LR]. Without loss of generality, suppose $\overline{M}^+$ contains $Z^+ = (Z^0, \cdots, Z^q)$ as submanifolds and $\overline{M}^-$ contains $Z^- = (Z^{q+1}, \cdots, Z^p)$, $q \le p$. There is a map

$$\pi : M \to \overline{M}^+ \cup_B \overline{M}^-.$$

It induces a homomorphism

$$\pi^* : H^*(\overline{M}^+ \cup_B \overline{M}^-, \mathbb{R}) \to H^*(M, \mathbb{R}).$$

It was shown by Lerman [L] that the restriction of the symplectic structure $\omega$ on $M^\pm$ such that $\omega^+|_B = \omega^-|_B$ is the induced symplectic form from symplectic reduction. By the Mayer–Vietoris sequence, a pair of cohomology classes $(\delta^+, \delta^-) \in H^*(\overline{M}^+, \mathbb{R}) \otimes H^*(\overline{M}^-, \mathbb{R})$ with $\delta^+|_B = \delta^-|_B$ defines a cohomology class of $\overline{M}^+ \cup_B \overline{M}^-$, denoted by $\delta^+ \cup_B \delta^-$.

Consider the moduli space $\mathcal{M}^+ = \mathcal{M}^{\overline{M}^+, Z^+, B, c^+}_{A^+, l^+}(g^+, K^+, \alpha^+)$ which consists of the tuples $(\Sigma^{g^+}, x^+, y^+, e^+, K^+, \alpha^+, f^+)$ with properties:

• $\Sigma^{g^+}$ has $c^+$ connected components;
• $f^+ : \Sigma^{g^+} \to \overline{M}^+$ is a pseudo-holomorphic map;
• $[f^+(\Sigma^{g^+})] = A^+$;
• $f^+$ is tangent to $Z^+ = (Z^0, \cdots, Z^q)$ at $y^+ = (y^0, \cdots, y^q)$ with order $K^+ = (k^0, \cdots, k^q)$;
• $f^+$ is tangent to $B$ at $e^+ = (e^+_1, \cdots, e^+_v)$ with order $\alpha^+ = (\alpha^+_1, \cdots, \alpha^+_v)$.

Similarly, we can define $\mathcal{M}^- = \mathcal{M}^{\overline{M}^-, Z^-, B, c^-}_{A^-, l^-}(g^-, K^-, \alpha^-)$ which consists of the tuples $(\Sigma^{g^-}, x^-, y^-, e^-, K^-, \alpha^-, f^-)$. According to [LR], we can glue $f^+$ and $f^-$ to obtain a pseudo-holomorphic map $f : \Sigma^g \to M$. A little more precisely, we glue $\overline{M}^+$ and $\overline{M}^-$ as above. If $f^+$ and $f^-$ have the same periodic orbits at each end, i.e., they have the same orders of tangency to the symplectic submanifold $B$, we can glue the maps $f^+$ and $f^-$ as $f^+\#f^-$ after gluing the domains $\Sigma^{g^+}$ and $\Sigma^{g^-}$ into their connected sum. Then, perturbing the map $f^+\#f^-$, we can get a unique pseudo-holomorphic map $f : \Sigma^g \to M$. In our paper, we always require that $\Sigma^{g^+}\#\Sigma^{g^-}$ is a connected Riemann surface. The following index addition formula is useful to our paper:

Lemma 2.2 ([LR]). $\mathrm{Ind}(D_{f^+}) + \mathrm{Ind}(D_{f^-}) = (2n-2)v + \mathrm{Ind}\,D_f$.


We also need a well-known fact about the genus of the connected sum of Riemann surfaces:

Lemma 2.3. The following equality is satisfied:

$$g = g^+ + g^- + v - 1, \qquad (2.5)$$

where $g$ is the genus of $\Sigma^g$, $g^\pm$ is the algebraic genus of $\Sigma^{g^\pm}$, and $v$ is the number of ends, i.e., the number of points where we glue $\Sigma^{g^+}$ and $\Sigma^{g^-}$.

According to Theorem 5.8 of [LR], the relative GW-invariant $\psi^{M,Z}_{A,l}(\delta|\beta; K)$ can be expressed by the relative GW-invariants over each connected component. Precisely, using the notations of [LR], suppose that $C^{J,A}_{g,l,K}$ is the set of indices:

(1) The combinatorial type of $(\Sigma^\pm, f^\pm)$: $\{A^\pm_i, g^\pm_i, l^\pm_i, K^\pm_i, (\alpha^\pm_1, \cdots, \alpha^\pm_v)\}$, $i = 1, \cdots, v$, $\sum_{i=1}^v \alpha^\pm_i = \#(A \cdot B)$;

(2) A map $\rho : \{e^+_1, \cdots, e^+_v\} \to \{e^-_1, \cdots, e^-_v\}$, where $(e^\pm_1, \cdots, e^\pm_v)$ denote the puncture points of $\Sigma^\pm$, satisfying (i) the map $\rho$ is one-to-one; (ii) if we identify $e^+_i$ and $\rho(e^+_i)$, then $\Sigma^+ \cup \Sigma^-$ forms a connected closed Riemann surface of genus $g$; (iii) $f^+(e^+_i) = f^-(\rho(e^+_i))$ and they have the same order of tangency; (iv) $((\Sigma^+, f^+), (\Sigma^-, f^-), \rho)$ represents the homology class $A$.

For given $C \in C^{J,A}_{g,l,K}$ suppose that $\pi^\pm_C, \sigma^\pm_C$ are partitions of $x^\pm, y^\pm, e^\pm, \delta^\pm, \beta^\pm, \alpha^\pm$ induced by $C$. Then we have the following ([LR], Lemma 5.4 and Theorem 5.8).

Lemma 2.4. $C^{J,A}_{g,l,K}$ is a finite set, and

$$\psi^{M,Z}_{A,g}(\delta|\beta; K) = \sum_{C \in C^{J,A}_{g,l,K}} \psi_C(\delta|\beta; K), \qquad (2.6)$$

where

$$\psi_C(\delta|\beta; K) = \|\alpha\|\sum_{I,J}\delta^{I,J}\,\psi^{\overline{M}^+, Z^+, B, c^+}_{A^+, g^+, l^+}(\delta^+|\beta^+; \rho_I; K^+, \alpha)(\pi^+_C, \sigma^+_C)\cdot\psi^{\overline{M}^-, Z^-, B, c^-}_{A^-, g^-, l^-}(\delta^-|\beta^-; \rho_J; K^-, \alpha)(\pi^-_C, \sigma^-_C), \qquad (2.7)$$

where $\|\alpha\| = \alpha_1\cdots\alpha_v$; $\delta^{IJ} = \delta^{I_1J_1}\cdots\delta^{I_vJ_v}$, $\delta^{I_iJ_i}$ being the Kronecker symbol; and $\{\rho_1, \cdots, \rho_s\}$ is an orthonormal basis of $H^*(B, \mathbb{R})$, $\rho_I = \{\rho_{I_1}, \cdots, \rho_{I_v}\} \subset \{\rho_1, \cdots, \rho_s\}$, $\rho_J = \{\rho_{J_1}, \cdots, \rho_{J_v}\} \subset \{\rho_1, \cdots, \rho_s\}$.

For convenience in application, we will rewrite Lemma 2.4 in the following steps:

Step 1. We divide $A$ into $A^+$ and $A^-$ such that $A = A^+ \cup_B A^-$.

Step 2. Suppose $\Sigma^{g^\pm}$ have $a_i \ge 0$ end points with order $i \in \{1, \cdots, \#(A \cdot B)\}$ such that $\sum_i i \cdot a_i = \#(A \cdot B)$, and $g = g^+ + g^- + \sum_i a_i - 1$. Denote $a = (a_1, a_2, \cdots)$.

Step 3. Suppose that $\tau^\pm = (\pi^\pm, \sigma^\pm) \in P_{x^\pm} \times P_{y^\pm, e^\pm}$ record which marked points in $\{x^\pm, y^\pm, e^\pm\}$ go on each component $\Sigma^{g^\pm_1}, \cdots, \Sigma^{g^\pm_{c^\pm}}$, satisfying:


(1) $g^\pm = \sum_{i=1}^{c^\pm} g^\pm_i - c^\pm + 1$, $g^\pm_i \ge 0$, $i = 1, \cdots, c^\pm$,

(2) $f^\pm_i : \Sigma^{g^\pm_i} \to \overline{M}^\pm$ are relative stable holomorphic maps, and $[f^\pm_i(\Sigma^{g^\pm_i})] = A^\pm_i$, $i = 1, \cdots, c^\pm$, with $\sum_{i=1}^{c^\pm} A^\pm_i = A^\pm$.

Denote $\tau = (\tau^+, \tau^-)$. Note that $\tau^\pm$ induce a partition of $\delta^\pm$, $\beta^\pm$, $a$ and $Z^\pm$.

Step 4. For given $a$ and $\tau$, we glue $\Sigma^{g^+}$ and $\Sigma^{g^-}$ in the above manner such that $\Sigma^{g^+}\#\Sigma^{g^-}$ is a connected Riemann surface of genus $g$. However, for given such $a$ and $\tau$, we can glue $\Sigma^{g^+}$ and $\Sigma^{g^-}$ in many different ways such that $\Sigma^{g^+}\#\Sigma^{g^-}$ is a connected Riemann surface of genus $g$. Denote the number of different ways by $\kappa(a, \tau)$. Then we have the following gluing formula for the relative GW-invariants:

Lemma 2.4′.

$$\psi^{M,Z}_{A,g}(\delta|\beta; K) = \sum\sum_\tau \|a\|\cdot\delta^{IJ}\cdot\kappa(a, \tau)\,\psi^{\overline{M}^+, Z^+, B, c^+}_{A^+, g^+, l^+}(\delta^+|\beta^+; \rho_I; K^+, a)(\pi^+, \sigma^+)\cdot\psi^{\overline{M}^-, Z^-, B, c^-}_{A^-, g^-, l^-}(\delta^-|\beta^-; \rho_J; K^-, a)(\pi^-, \sigma^-), \qquad (2.8)$$

where $\|a\| = 1^{a_1}\cdot 2^{a_2}\cdots$, and the first $\sum$ denotes that we sum over all possibilities for

$$A = A^+ \cup_B A^-, \qquad g = g^+ + g^- + v - 1, \qquad \#(A \cdot B) = \sum_i i \cdot a_i, \qquad \rho_I, \rho_J \subset \{\rho_1, \cdots, \rho_s\}. \qquad (2.9)$$

3. Relative GW-Invariant Over $\Sigma^h$

In our case, $M$ is the Riemann surface $\Sigma^h$ with real dimension two, thus $Z$ consists of points, which are the divisors of $M$. Since $H_2(\Sigma^h, \mathbb{Z}) \cong \mathbb{Z}$, denote the generator by $H$; then the first Chern class is $C_1(\Sigma^h) = (2 - 2h)H$. Let $A = kH$. When we say $f \in \mathcal{M}^{\Sigma^h, Z}_{A,l}(g, K)$, we mean that $f : \Sigma^g \to \Sigma^h$ is a pseudo-holomorphic map such that $[f(\Sigma^g)] = kH$ and there exist marked points $x, y \in \Sigma^g$, $f$ is tangent to $Z$ at $y$ with order $K$. Note that $\sum_{j=1}^{l_i} k^i_j = \#(A \cdot Z_i) = \deg(f) = k$, so $k^i$ is a partition of $k$, $i = 0, \cdots, p$. Moreover the relative GW-invariant $\psi^{\Sigma^h, Z}_{A,g,0}(|\beta; K) = 0$ unless

$$\sum_{i=0}^p\sum_{j=1}^{l_i} \deg\beta^i_j = 2C_1(\Sigma^h)A + 4(g-1) + 2\sum_{i=0}^p l_i - 2\sum_{i=0}^p\sum_{j=1}^{l_i} k^i_j, \qquad (3.1)$$

where $\beta^i_j \in H^*(Z^i, \mathbb{R})$, $j = 1, \cdots, l_i$, $i = 0, \cdots, p$. However, since $Z^i$ is a point, $\deg\beta^i_j = 0$. Thus we have

$$(2 - 2h)k + 2(g - 1) + \sum_{i=0}^p l_i - \sum_{i=0}^p\sum_{j=1}^{l_i} k^i_j = 0. \qquad (3.2)$$
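For an almost simple covering, where every branch point except $\infty$ has ramification profile $(2, 1, \cdots, 1)$ and $\infty$ has profile $\alpha$, the relation (3.2) fixes the number of simple branch points as $k + m - 2kh - 2 + 2g$. The sketch below checks this bookkeeping for a few sample inputs; the function names are hypothetical and the samples are chosen only for illustration.

```python
def riemann_hurwitz_defect(k, h, g, profiles):
    """Left-hand side of (3.2); `profiles` holds one ramification profile per point."""
    return (2 - 2 * h) * k + 2 * (g - 1) + sum(len(p) - sum(p) for p in profiles)


def simple_branch_points(alpha, h, g):
    """Number of simple branch points forced by (3.2) for ramification type alpha."""
    k, m = sum(alpha), len(alpha)
    return k + m - 2 * k * h - 2 + 2 * g


def almost_simple_profiles(alpha, h, g):
    """Profiles of an almost simple covering: simple points first, then alpha."""
    k = sum(alpha)
    count = simple_branch_points(alpha, h, g)
    if count < 0 or (count > 0 and k < 2):
        raise ValueError("no almost simple covering with these data")
    simple = (2,) + (1,) * (k - 2)
    return [simple] * count + [tuple(alpha)]


if __name__ == "__main__":
    for alpha, h, g in [((1, 2), 0, 0), ((1, 1, 1, 2), 0, 1), ((3,), 1, 2)]:
        profiles = almost_simple_profiles(alpha, h, g)
        print(alpha, h, g, len(profiles) - 1,
              riemann_hurwitz_defect(sum(alpha), h, g, profiles))
```

For $h = g = 0$ the count reduces to $k + m - 2$, the argument of the factorial in (1.7).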


Remark 3.1. The equality (3.2) is exactly the Riemann–Hurwitz formula.

Suppose $\Sigma^g$ is connected, choose $l = 0$, $K = (2, 1, \cdots, 1;\ \cdots;\ 2, 1, \cdots, 1;\ \alpha_1, \cdots, \alpha_m)$. Then by definition $\mu^{g,k}_{h,m}(\alpha)$ is just the relative GW-invariant $\psi^{\Sigma^h, Z}_{A,g,0}(|\beta; K)$. From (3.1), we derive $p = k + m - 2kh - 2 + 2g$, i.e., we have $k + m - 2kh - 2 + 2g$ double branch points over $\Sigma^h$; otherwise, $\mu^{g,k}_{h,m}(\alpha) = 0$.

Now, we can prove Theorem A by symplectic cutting and the gluing formula (2.8). We perform the symplectic cutting over $\Sigma^h$ at $\infty$ in a small neighborhood such that there is only one other double branch point G in this neighborhood. We have

$$\overline{M}^+ = S^2, \qquad \overline{M}^- = \Sigma^h.$$

It's easy to observe that $A^+ = kH$, $A^- = kH$, where for $S^2$ the class $H$ is the generator of $H_2(S^2, \mathbb{Z}) \cong \mathbb{Z}$. We may consider the dimension condition equations:

$$\begin{cases} (k - m) + (2 - 1) + (k - v) = 2k - 2 + 2g^+, \\ g = g^+ + g^- + v - 1. \end{cases} \qquad (3.3)$$

We first consider $\overline{M}^+ = S^2$. The map $f^+ : \Sigma^{g^+} \to \overline{M}^+$ branches at only three points: infinity, the fixed double branch point G and the symplectic reduction point $B$. Suppose $\Sigma^{g^+} = \cup_{i=1}^{c^+}\Sigma^{g^+_i}$, i.e., $\Sigma^{g^+}$ has $c^+$ connected components. Suppose the holomorphic map $u^+_i : \Sigma^{g^+_i} \to \overline{M}^+$ has degree $k^+_i$. It is obvious that $k^+_i \le k$. If $\Sigma^{g^+_i}$ contains a double ramification point, we have from the Riemann–Hurwitz formula over this component that

$$k^+_i - v^+_i + 2 - 1 + k^+_i - m^+_i = 2k^+_i - 2 + 2g^+_i, \qquad (3.4)$$

where $v^+_i \ge 1$, $m^+_i \ge 1$ are the numbers of ramification points at the symplectic reduction point $B$ and infinity, respectively. Note that the geometric genus satisfies $0 \le g^+_i \le g$, so we have two cases: $(v^+_i, m^+_i, g^+_i) = (1, 2, 0)$, or $(2, 1, 0)$. By the same reason, if the component $\Sigma^{g^+_i}$ doesn't contain any double ramification point, we have one case $(v^+_i, m^+_i, g^+_i) = (1, 1, 0)$. Note that $\sum_{i=1}^{c^+} m^+_i = m$; we have $v = \sum_{i=1}^{c^+} v^+_i = m - 1$, or $m + 1$, and correspondingly, $c^+ = m - 1$, or $m$. In sum, we have proven

Lemma 3.2. For $\overline{M}^+ = S^2$, the holomorphic map $f_i : \Sigma^{g^+_i} \to \overline{M}^+$ has one of the following branch types:

(1) $(\alpha_i;\ 1, 1, \cdots, 1;\ \alpha_i)$, $1 \le i \le m$,
(2) $(\alpha_k, \alpha_l;\ 2, 1, \cdots, 1;\ \alpha_k + \alpha_l)$, $1 \le k < l \le m$,
(3) $(\alpha_i;\ 2, 1, \cdots, 1;\ \rho, \alpha_i - \rho)$, $1 \le i \le m$, $1 \le \rho \le [\frac{\alpha_i}{2}]$,

at infinity, the fixed branch point G and the symplectic reduction point $B$ respectively.

If $v = m - 1$, then $c^+ = m - 1$, $g^+ = \sum_{i=1}^{c^+} g^+_i - c^+ + 1 = 2 - m$. Substituting into $g = g^+ + g^- + v - 1$, we have $g^- = g$. Note that $g^- = \sum_{i=1}^{c^-} g^-_i - c^- + 1$, $0 \le g^-_i \le g$, $c^- \ge 1$; we have only one case: $c^- = 1$, $g^- = g$. If $v = m + 1$, then $c^+ = m$. By the same reason, we have two cases: $c^- = 1$, $g^- = g - 1$ and $c^- = 2$, $g^- = g^-_1 + g^-_2 - 2 + 1 = g - 1$, $g^-_1 \ge 0$, $g^-_2 \ge 0$. In sum, we have


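Lemma 3.3. The genus $g^-$ and the number $c^-$ of connected components of $\Sigma^-$ are one of the following cases:
(i) $c^- = 1$, $g^- = g$;
(ii) $c^- = 1$, $g^- = g - 1$;
(iii) $c^- = 2$, $g^- = g_1^- + g_2^- - 1 = g - 1$, $g_1^- \ge 0$, $g_2^- \ge 0$.

Regarding the symplectic reduction point $B \in \overline{M}^-$ as infinity, we get many new almost simple ramified covering maps $f_i^- : \Sigma_{g_i^-} \to \overline{M}^-$. However, in any of the above cases, the holomorphic map $f_i^- : \Sigma_{g_i^-} \to \overline{M}^-$ has either a strictly smaller number of ramification points at infinity, or strictly smaller degree, or strictly smaller genus than the holomorphic map $f : \Sigma_g \to M = \Sigma_h$. Thus if we know the relative GW-invariants in $\overline{M}^+$, then by Lemma 2.4 we can get a recursive formula for $\mu^{g,k}_{h,m}(\alpha)$. So we need the following lemma.

Lemma 3.4. Let $\theta = (\alpha_1, \dots, \hat\alpha_i, \dots, \hat\alpha_j, \dots, \alpha_m, \alpha_i + \alpha_j) \in J(\alpha)$, $\omega = (\alpha_1, \dots, \hat\alpha_i, \dots, \alpha_m, \rho, \alpha_i - \rho) \in C(\alpha)$ as in the introduction. Then for $\overline{M}^+$, the product $\psi_J(\alpha, \theta)$ of the relative GW-invariants of $(m - 1)$ connected components is
$$\psi_J(\alpha, \theta) = \begin{cases} \dfrac{1}{\alpha_1} \cdots \widehat{\dfrac{1}{\alpha_i}} \cdots \widehat{\dfrac{1}{\alpha_j}} \cdots \dfrac{1}{\alpha_m} & \text{if } \alpha_i \ne \alpha_j, \\[2mm] \dfrac{1}{\alpha_1} \cdots \widehat{\dfrac{1}{\alpha_i}} \cdots \widehat{\dfrac{1}{\alpha_j}} \cdots \dfrac{1}{\alpha_m} \cdot \dfrac{1}{2} & \text{if } \alpha_i = \alpha_j, \end{cases} \tag{3.5}$$
and the product $\psi_C(\alpha, \omega)$ of the relative GW-invariants of $m$ connected components is
$$\psi_C(\alpha, \omega) = \begin{cases} \dfrac{1}{\alpha_1} \cdots \widehat{\dfrac{1}{\alpha_i}} \cdots \dfrac{1}{\alpha_m} & \text{if } \rho \ne \alpha_i - \rho, \\[2mm] \dfrac{1}{\alpha_1} \cdots \widehat{\dfrac{1}{\alpha_i}} \cdots \dfrac{1}{\alpha_m} \cdot \dfrac{1}{2} & \text{if } \rho = \alpha_i - \rho. \end{cases} \tag{3.6}$$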

Proof. By the definition of the relative GW-invariant and Lemma 3.2, we only need to calculate the connected relative GW-invariants of two types:
$$\begin{aligned} Q_1 &= \psi^{S^2,\mathrm{pt},\mathrm{pt},\mathrm{pt}}_{kH,0,0}(\,|\,\mathrm{pt}, \mathrm{pt}, \mathrm{pt};\ k;\ 1, 1, \dots, 1;\ k), \quad k \ge 1, \\ Q_2 &= \psi^{S^2,\mathrm{pt},\mathrm{pt},\mathrm{pt}}_{kH,0,0}(\,|\,\mathrm{pt}, \mathrm{pt}, \mathrm{pt};\ k;\ 2, 1, \dots, 1;\ \rho, k - \rho), \quad \rho \ge 1, \end{aligned} \tag{3.7}$$
where the “pt” in the bracket records the point homology Poincaré dual to the generator $E \in H^0(\mathrm{pt}, \mathbb{R})$, and the others correspond to $\mathbb{Z}$.

Regarding a holomorphic map $f : S^2 \to S^2$ as a meromorphic function over the Riemann plane, we write $f \in \mathcal{M}^{S^2,\mathrm{pt},\mathrm{pt},\mathrm{pt}}_{kH,0}(0; K; 1, 1, \dots, 1; k)$ in the form $F_1 : \mathbb{C} \to \mathbb{C}$,
$$F_1(x) = \frac{\alpha_0 (x - y^1)^k}{(x - y^2)^k}, \quad x \in \mathbb{C},$$
where $y^1 \ne y^2 \in \mathbb{C}$ are $k$-ramification points. Without loss of generality, we choose zero and infinity as $k$-ramification points, and send 1 to 1; thus there exists a unique solution $F_1(x) = x^k$. However, we have conformal transformations $\pi_i : \mathbb{C} \to \mathbb{C}$, $\pi_i(x) = e^{2\pi \sqrt{-1}\, i/k} x$, $i = 0, \dots, k - 1$, such that $F_1(x) = F_1 \circ \pi_i(x)$, i.e., there exists a finite group $\mathbb{Z}_k$ that acts on $\mathcal{M}^{S^2,\mathrm{pt},\mathrm{pt},\mathrm{pt}}_{kH,0}(0; K; 1, 1, \dots, 1; k)$, thus
$$Q_1 = \frac{1}{k}. \tag{3.8}$$


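By the same reasoning, we write $f \in \mathcal{M}^{S^2,\mathrm{pt},\mathrm{pt},\mathrm{pt}}_{kH,0}(0; K; 2, 1, \dots, 1; \rho, k - \rho)$ in the form $F_2 : \mathbb{C} \to \mathbb{C}$,
$$F_2(x) = \frac{\alpha_0 (x - y_1^1)^{\rho} (x - y_2^1)^{k - \rho}}{(x - y^2)^k}, \quad \alpha_0 \ne 0, \ x \in \mathbb{C},$$
where $y_1^1 \ne y_2^1$, $y^2 \in \mathbb{C}$ are $\rho$-, $(k - \rho)$-, $k$-ramification points, respectively. Suppose 1, 2 and zero are the $\rho$-, $(k - \rho)$-, $k$-ramification points, respectively. Then
$$F_2(x) = \frac{\alpha_0 (x - 1)^{\rho} (x - 2)^{k - \rho}}{x^k}.$$
Since $F_2$ has a double ramification point $\bar{x}$ over a given point, for instance over 1, we have $\bar{x} \ne 0, 1, 2$ and the following equations:
$$F_2(\bar{x}) = 1, \qquad F_2'(\bar{x}) = 0. \tag{3.9}$$
Solving (3.9), we have a unique solution
$$F_2(x) = \frac{\alpha_0 (x - 1)^{\rho} (x - 2)^{k - \rho}}{x^k}, \quad \text{where } \alpha_0 = \frac{(2k)^k}{\rho^{\rho} (2\rho - 2k)^{k - \rho}}.$$
However, if $\rho = k - \rho$, we have a conformal transformation $\pi : \mathbb{C} \to \mathbb{C}$, $\pi(x) = \frac{2x}{3x - 2}$, such that $F_2(x) = F_2 \circ \pi(x)$. Since $\pi \circ \pi = 1$, there exists a finite group $\mathbb{Z}_2$ that acts on $\mathcal{M}^{S^2,\mathrm{pt},\mathrm{pt},\mathrm{pt}}_{kH,0}(0; K; 2, 1, \dots, 1; \rho, k - \rho)$, thus
$$Q_2 = \begin{cases} \dfrac{1}{2} & \text{if } \rho = k - \rho, \\ 1 & \text{if } \rho \ne k - \rho. \end{cases} \tag{3.10}$$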
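The closed forms used in this computation can be checked with exact rational arithmetic. The following Python sketch is only an illustration (the function names are ours and do not come from the paper): it evaluates $F_2$ with the stated $\alpha_0$, confirms that the root $\bar{x} = 2k/(2k - \rho)$ of the logarithmic derivative satisfies (3.9), and confirms that $\pi(x) = 2x/(3x - 2)$ is an involution preserving $F_2$ when $k = 2\rho$.

```python
from fractions import Fraction


def alpha0(k, rho):
    """Constant alpha_0 = (2k)^k / (rho^rho (2 rho - 2k)^(k - rho))."""
    return Fraction((2 * k) ** k, rho ** rho * (2 * rho - 2 * k) ** (k - rho))


def F2(x, k, rho):
    """F_2(x) = alpha_0 (x - 1)^rho (x - 2)^(k - rho) / x^k, evaluated exactly."""
    x = Fraction(x)
    return alpha0(k, rho) * (x - 1) ** rho * (x - 2) ** (k - rho) / x ** k


def log_derivative(x, k, rho):
    """F_2'(x) / F_2(x) = rho/(x - 1) + (k - rho)/(x - 2) - k/x."""
    x = Fraction(x)
    return rho / (x - 1) + (k - rho) / (x - 2) - k / x


def double_point(k, rho):
    """Root of the logarithmic derivative: x = 2k / (2k - rho)."""
    return Fraction(2 * k, 2 * k - rho)


def mobius(x):
    """The conformal transformation pi(x) = 2x / (3x - 2)."""
    x = Fraction(x)
    return 2 * x / (3 * x - 2)


if __name__ == "__main__":
    for k, rho in [(3, 1), (5, 2), (4, 2)]:
        xb = double_point(k, rho)
        # Both equations of (3.9): value 1 and vanishing derivative at xb.
        print(k, rho, xb, F2(xb, k, rho), log_derivative(xb, k, rho))
```

Because the arithmetic is exact, equality tests such as `F2(double_point(k, rho), k, rho) == 1` are meaningful and do not need a tolerance.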

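This completes the proof of Lemma 3.4.

Now, we prove Theorem A. For a positive integer $b$ and an ordered positive integer tuple $\beta = (\lambda_1, \dots, \lambda_t)$, we define an integer $\varphi(\beta, b) = \#\{\lambda \in \beta \mid \lambda = b\}$. According to Lemma 2.4 and Lemma 3.3, we have
$$\begin{aligned} \mu^{g,k}_{h,m}(\alpha) ={} & \sum_{\theta \in J(\alpha)} \mu^{g,k}_{h,m-1}(\theta) \cdot \psi_J(\alpha, \theta) \cdot \|\theta\| \cdot \varphi(\theta, \alpha_i + \alpha_j) \\ & + \sum_{\omega \in C(\alpha)} \mu^{g-1,k}_{h,m+1}(\omega) \cdot \psi_C(\alpha, \omega) \cdot \|\omega\| \cdot \varphi(\omega, \rho) \cdot \big(\varphi(\omega, \alpha_i - \rho) - \delta_{\rho, \alpha_i - \rho}\big) \\ & + \sum_{\omega \in C(\alpha)} \ \sum_{\substack{g_1 + g_2 = g \\ g_1, g_2 \ge 0}} \ \sum_{\substack{k_1 + k_2 = k \\ k_1, k_2 \ge 1}} \ \sum_{\substack{m_1 + m_2 = m + 1 \\ m_1, m_2 \ge 1}} \ \sum_{\pi \in P_\omega} \binom{k + m - 2kh - 3 + 2g}{k_1 + m_1 - 2k_1 h - 2 + 2g_1} \\ & \qquad \cdot \mu^{g_1,k_1}_{h,m_1}(\omega_{\pi_1}) \cdot \varphi(\omega_{\pi_1}, \rho) \cdot \mu^{g_2,k_2}_{h,m_2}(\omega_{\pi_2}) \cdot \varphi(\omega_{\pi_2}, \alpha_i - \rho) \cdot \psi_C(\alpha, \omega) \cdot \|\omega\|, \end{aligned} \tag{3.11}$$
where $\|\theta\| = \alpha_1 \cdots \hat\alpha_i \cdots \hat\alpha_j \cdots \alpha_m (\alpha_i + \alpha_j)$; $\|\omega\| = \alpha_1 \cdots \hat\alpha_i \cdots \alpha_m \rho (\alpha_i - \rho)$; $\delta_{\rho, \alpha_i - \rho}$ is the Kronecker symbol; $m_i = l(\omega_{\pi_i})$, $i = 1, 2$; the factor $\binom{k + m - 2kh - 3 + 2g}{k_1 + m_1 - 2k_1 h - 2 + 2g_1}$ comes from the fact that we can choose $k_1 + m_1 - 2k_1 h - 2 + 2g_1$ double ramification points over the component $\Sigma_{g_1}$ from $k + m - 2kh - 3 + 2g$ double ramification points. Substituting (3.5), (3.6) into (3.11), we get (1.4).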
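The combinatorial ingredients of (3.11) are simple enough to enumerate by machine. The sketch below is a hedged illustration, not the authors' code: it assumes that $J(\alpha)$ is indexed by pairs $i < j$ and $C(\alpha)$ by an index $i$ together with $1 \le \rho \le \alpha_i/2$ (the precise definitions are in the introduction, which is not reproduced here), and it implements $\|\cdot\|$, $\varphi$, and the products (3.5) and (3.6). All function names are hypothetical.

```python
from fractions import Fraction
from math import prod


def J(alpha):
    """Yield (i, j, theta): drop alpha_i and alpha_j, append alpha_i + alpha_j."""
    m = len(alpha)
    for i in range(m):
        for j in range(i + 1, m):
            rest = tuple(a for l, a in enumerate(alpha) if l not in (i, j))
            yield i, j, rest + (alpha[i] + alpha[j],)


def C(alpha):
    """Yield (i, rho, omega): drop alpha_i, append rho and alpha_i - rho."""
    for i, a_i in enumerate(alpha):
        rest = tuple(a for l, a in enumerate(alpha) if l != i)
        for rho in range(1, a_i // 2 + 1):
            yield i, rho, rest + (rho, a_i - rho)


def norm(beta):
    """||beta||: the product of the entries of the tuple."""
    return prod(beta)


def phi(beta, b):
    """phi(beta, b): the number of entries of beta equal to b."""
    return sum(1 for lam in beta if lam == b)


def psi_J(alpha, i, j):
    """Formula (3.5): product of 1/alpha_l over l != i, j, halved if alpha_i == alpha_j."""
    value = Fraction(1, prod(a for l, a in enumerate(alpha) if l not in (i, j)))
    return value / 2 if alpha[i] == alpha[j] else value


def psi_C(alpha, i, rho):
    """Formula (3.6): product of 1/alpha_l over l != i, halved if rho == alpha_i - rho."""
    value = Fraction(1, prod(a for l, a in enumerate(alpha) if l != i))
    return value / 2 if 2 * rho == alpha[i] else value


if __name__ == "__main__":
    alpha = (2, 2, 3)
    for i, j, theta in J(alpha):
        print("J:", theta, psi_J(alpha, i, j), norm(theta), phi(theta, alpha[i] + alpha[j]))
    for i, rho, omega in C(alpha):
        print("C:", omega, psi_C(alpha, i, rho), norm(omega), phi(omega, rho))
```

The Hurwitz numbers $\mu^{g,k}_{h,m}$ themselves are not computed here; the sketch only tabulates the weights that multiply them in the recursion.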


Remark. For $h = 0$, the initial value of our recursive formula is
$$\mu^{g,1}_{0,1}(1) = \begin{cases} 1 & \text{if } g = 0, \\ 0 & \text{if } g \ge 1. \end{cases}$$
For $h > 0$, the initial value is not only
$$\mu^{g,1}_{h,1}(1) = \begin{cases} 1 & \text{if } g = h, \\ 0 & \text{if } g \ge h + 1, \end{cases}$$
but also some special Hurwitz numbers $\mu^{g,k}_{h,m}(\alpha)$ for the case when $k + m - 2kh - 2 + 2g = 0$, which we will discuss in another paper.

Acknowledgements. The first author would like to thank Prof. Kefeng Liu, Prof. Zhengbo Qin and Prof. Yongbin Ruan for inviting him to visit their Universities and for valuable discussions. The third author would like to thank Prof. Yongbin Ruan and Prof. Renhong Wang for their honest help.

References

[A] Arnol’d, V.I.: Topological classification of trigonometric polynomial and combinatorics of graphs with an equal number of vertices and edges. Funct. Anal. and its Appl. 30 1, 1–17 (1996)
[GJ1] Goulden, I.P., Jackson, D.M.: A proof of a conjecture for the number of ramified covering of the sphere by the torus. Preprint
[GJ2] Goulden, I.P., Jackson, D.M.: The number of ramified covering of the sphere by the double torus, and a general form for higher genera. Preprint
[GJ3] Goulden, I.P., Jackson, D.M.: Transitive factorisations into transpositions and holomorphic mapping on the sphere. Proceedings of the AMS 125 1, 51–60 (1997)
[GJV] Goulden, I.P., Jackson, D.M., Vainshtain, A.: The number of ramified covering of the sphere by the torus and surface of higher genera. AG/9902125
[CT] Crescimanno, M., Taylor, W.: Large N phases of chiral QCD2. Nuclear Phys. B 437 1, 3–24 (1995)
[D] Dénes, J.: The representation of a permutation as the product of a minimal number of transpositions and its connection with the theory of graphs. Publ. Math. Inst. Hungar. Acad. Sci. 4, 63–70 (1959)
[H] Hurwitz, A.: Über Riemann’sche Flächen mit gegebenen Verzweigungspunkten. Math. Ann. 39, 1–60 (1891)
[IP] Ionel, E., Parker, T.: Gromov–Witten invariants of symplectic sums. Preprint. Math.sg/9806013
[L] Lerman, E.: Symplectic cuts. Math. Research Lett. 2, 247–258 (1995)
[LR] Li, An-Min, Ruan, Yongbin: Symplectic surgery and Gromov–Witten invariants of Calabi–Yau 3-folds I. Preprint. Math.alg-geom/9803036
[LLY] Lian, B., Liu, K., Yau, S.-T.: Mirror principle I. Asian J. Math. 1, 729–763 (1997)
[V] Vakil, R.: Recursions, formulas, and graph-theoretic interpretation of ramified coverings of the sphere by surface of genus 0 and 1. CO/9812105

Communicated by A. Jaffe

Commun. Math. Phys. 213, 697 – 731 (2000)


Incompressible Flows of an Ideal Fluid with Unbounded Vorticity

Misha Vishik

Department of Mathematics, University of Texas at Austin, Austin, TX 78712 USA. E-mail: [email protected]

Received: 31 March 1999 / Accepted: 10 July 2000

Abstract: In this paper we study solutions to the Euler equations of an ideal incompressible fluid in $\mathbb{R}^n$ singular at the origin with a finite symmetry group. For an “admissible” class of finite groups we prove a local existence and uniqueness theorem. In even dimensions this theorem covers some symmetric flows with essentially unbounded vorticity. In arbitrary dimension (including $n = 3$) we construct local in time solutions with vorticity that behaves, e.g., like a function of homogeneous degree zero near the origin. The symmetry condition provides necessary additional cancellations and is preserved by the evolution due to uniqueness.

Supported in part by the NSF grants DMS-9531769 and DMS-9876947 and by the TARP grant 003658-071.

Introduction

In this paper we study solutions of the incompressible Euler equations of an ideal fluid with singularities. In [V] a uniqueness theorem was proved that covers solutions to the Euler equations with vorticity bounded, e.g., in $\mathrm{bmo} \cap L^p$, $p \in (1, n)$, $n$ being the dimension. Existence was proved there only in dimension 2 using the idea of wavelet decomposition (see, e.g., [M2]). Function spaces used to construct such a solution are variants of a borderline Besov space based on $L^\infty$. These function spaces contain essentially unbounded functions. In [Y] V.I. Yudovich asked whether it is possible to prove (even locally) an existence and uniqueness theorem for the Euler equations in dimension $n > 2$ for any class of flows with essentially unbounded vorticity. As follows from the present paper, such existence and uniqueness classes can be constructed for $n = 4$ (and for any even $n$).

We use two-microlocal spaces $C^{s,r}$ introduced by J.-M. Bony [B2] to study propagation of singularities for nonlinear hyperbolic equations. Recently their role in analysis was thoroughly investigated by S. Jaffard and Y. Meyer [JM] and Y. Meyer [M3]. These important works contain among other results a complete wavelet description of two-microlocal spaces.

The idea of this paper is to combine a local description of singularities of vorticity in a two-microlocal space $C^{0,r}$, $r \in (0, n)$, with an appropriate symmetry condition preserved by the evolution. It turns out that in dimension 4 (and any even dimension) there are “admissible” finite symmetry groups such that the tensor of deformation rates is essentially bounded while the vorticity in general is not. This leads to the local existence theorem. Uniqueness follows from our previous results [V] since $C^{0,r} \hookrightarrow \mathrm{bmo}$ for $r > 0$. We in fact use weighted variants of $C^{0,r}$. The typical singularity of vorticity covered by the results of the present paper is $\log_2 \log_2 |x|^{-1}$.

In Appendix A we prove a (local) existence theorem for flows with vorticity in $C^{s,r}$, $s > 0$, $r \in (0, n)$, which is used in our approximation scheme.

In case both the vorticity and the tensor of deformation rates are bounded due to an “admissible” symmetry condition, the results of the present paper imply a local existence and uniqueness theorem for such a flow. This works in any dimension, in particular $n = 3$. It is interesting to compare this result with the paper of P. Gamblin and X. Saint Raymond [GS].

The method of Littlewood–Paley decompositions is used systematically. This method (paradifferential calculus) was introduced by J.-M. Bony [B1] and Y. Meyer [M1]. We use the approach of H. Bahouri and J.-Y. Chemin [BC] to estimate the terms in Littlewood–Paley decomposition. In addition we have to take care of the decay at infinity of each term in such a decomposition.

Constructing existence and uniqueness classes for the Euler equations with essentially unbounded vorticity in dimension 3 is still an open problem.

1.
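Littlewood–Paley Analysis

We choose radial $\hat\Phi \in C_0^\infty(\mathbb{R}^n)$ and radial $\hat\varphi \in C_0^\infty(\mathbb{R}^n \setminus 0)$ such that the following properties are satisfied:
• $\operatorname{supp} \hat\Phi \subset \{|\xi| \le \frac{5}{6}\}$;
• $\operatorname{supp} \hat\varphi \subset \{\frac{3}{5} \le |\xi| \le \frac{5}{3}\}$;
• $|\hat\varphi(\xi)| \ge C > 0$ for $\frac{2}{3} \le |\xi| \le \frac{3}{2}$;
• $|\hat\Phi(\xi)| \ge C > 0$ for $|\xi| \le \frac{3}{4}$;
• $\hat\Phi(\xi) + \sum_{j=0}^{\infty} \hat\varphi_j(\xi) = 1$, $\xi \in \mathbb{R}^n$, (1.1)
where $\hat\varphi_j(\xi) = \hat\varphi(2^{-j} \xi)$.

Remark 1.1. For $|j - j'| \ge 2$, $\operatorname{supp} \hat\varphi_j \cap \operatorname{supp} \hat\varphi_{j'} = \emptyset$. Likewise, for $j \ge 1$, $\operatorname{supp} \hat\varphi_j \cap \operatorname{supp} \hat\Phi = \emptyset$.

Definition 1.2. Let $f \in \mathcal{S}'(\mathbb{R}^n)$. Then
$$\Delta_{-1} f = \hat\Phi(D) f = \Phi * f.$$
For $j \ge 0$, $\Delta_j f = \hat\varphi_j(D) f = \varphi_j * f$. For $j \le -2$, $\Delta_j f = 0$. For $k \in \mathbb{Z}$, $S_k f = \sum_{j \le k} \Delta_j f$. In particular, $S_{-1} f = \Delta_{-1} f$.

It is well known and is easy to check that the choices above can be made. Because of (1.1), for any distribution $f \in \mathcal{S}'$,
$$f = \sum_{j=-1}^{\infty} \Delta_j f, \tag{1.2}$$
and the series in the right side of (1.2) is convergent in $\mathcal{S}'$. For our purposes the condition that $\varphi$ and $\Phi$ are $O(n)$ invariant is crucial.

2. Two-Microlocal Hölder Spaces

Definition 2.1 ([B, M3]). For $s, r \in \mathbb{R}$, $C^{s,r}$ is a Banach space of all tempered distributions $f$ on $\mathbb{R}^n$ such that the following inequalities hold:
$$|S_{-1} f(x)| \le C\, (1 + |x|)^{-r}, \quad x \in \mathbb{R}^n; \tag{2.1}$$
$$|\Delta_j f(x)| \le C\, 2^{-js} (1 + 2^j |x|)^{-r}, \quad j \ge 0, \ x \in \mathbb{R}^n. \tag{2.2}$$

Classical pseudodifferential operators with symbols in $S^m_{1,0}$ map continuously $C^{s,r} \to C^{s-m,r}$ (see [M3]). We need the following variation of this fact.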

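Proposition 2.2. Let $k \in C^\infty(\mathbb{R}^n \setminus 0)$ be homogeneous of degree 0. Assume $r \in (0, n)$. Then
$$(Kf)(x) = (2\pi)^{-n} \iint e^{i\xi(x - y)}\, k(\xi)\, f(y)\, dy\, d\xi$$
acts continuously in $C^{s,r}$.

Proof. For $j \ge 0$, $(\Delta_j K f)^\wedge(\xi) = \hat\varphi(2^{-j}\xi)\, k(\xi)\, \hat f(\xi)$. Choose a function $\eta \in C_0^\infty(\mathbb{R}^n \setminus 0)$ such that $\eta = 1$ on $\operatorname{supp} \hat\varphi$. Then for $\zeta = \eta k$,
$$(\Delta_j K f)^\wedge(\xi) = \eta(2^{-j}\xi)\, k(2^{-j}\xi)\, (\Delta_j f)^\wedge(\xi) = \zeta(2^{-j}\xi)\, (\Delta_j f)^\wedge(\xi).$$
Therefore,
$$(\Delta_j K f)(x) = 2^{nj} \big(\check\zeta(2^j \cdot) * \Delta_j f\big)(x).$$
Using (2.2),
$$|(\Delta_j K f)(2^{-j} x)| \le C \big(|\check\zeta| * 2^{-js} (1 + |\cdot|)^{-r}\big)(x) \le C\, 2^{-js} (1 + |x|)^{-r}.$$
This implies (2.2) for $Kf$ instead of $f$ with $j \ge 0$. We have to treat the $j = -1$ case. It is enough to check the following: Assume $g(x)$ satisfies the inequality
$$|g(x)| \le C\, (1 + |x|)^{-r}, \quad x \in \mathbb{R}^n \tag{2.3}$$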


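and the Fourier spectrum of $g$ is supported in the ball of radius 1/2 centered at the origin. Then
$$|K g(x)| \le C\, (1 + |x|)^{-r}. \tag{2.4}$$
For $|x| \le 1$ this inequality follows from Bernstein's inequality and boundedness of $K$ on $L^p$, where we choose $p > \frac{n}{r}$. We have
$$K g = \sum_{j \le 0} K \Delta_j g = \sum_{j \le 0} 2^{nj} \sigma(2^j \cdot) * g,$$
where $\sigma \in \mathcal{S}(\mathbb{R}^n)$ is defined by $\hat\sigma = k \hat\varphi$. We assume $|x| > 1$ and estimate the integral ($N$ being arbitrary)
$$\begin{aligned} |K g(x)| &= \Big| \sum_{j \le 0} 2^{jn} \int \sigma(2^j (x - y))\, g(y)\, dy \Big| \\ &\le C \int \sum_{j \le 0} 2^{jn} (1 + 2^j |x - y|)^{-N} (1 + |y|)^{-r}\, dy \\ &\le C \Big( \int_{|x - y| \le \frac{|x|}{2}} + \int_{\frac{|x|}{2} < |x - y| < 2|x|} + \int_{2|x| \le |x - y|} \Big). \end{aligned}$$
We estimate the first integral:
$$\begin{aligned} \int_{|x - y| \le \frac{|x|}{2}} \sum_{j \le 0} 2^{jn} (1 + 2^j |x - y|)^{-N} (1 + |y|)^{-r}\, dy &\le C\, (1 + |x|)^{-r} \sum_{j \le 0} 2^{jn} \int_{|z| \le \frac{|x|}{2}} (1 + 2^j |z|)^{-N}\, dz \\ &= C\, (1 + |x|)^{-r} \sum_{j \le 0} \int_{|z| \le 2^{j-1} |x|} (1 + |z|)^{-N}\, dz \\ &\le C\, (1 + |x|)^{-r} \sum_{j \le 0} \int_0^{2^j |x|} t^{n-1} (1 + t)^{-N}\, dt \\ &\le C\, (1 + |x|)^{-r} \sum_{j \le 0} (2^j |x|)^n (1 + 2^j |x|)^{-N} \le C\, (1 + |x|)^{-r}. \end{aligned}$$
For the second integral,
$$\begin{aligned} \int_{\frac{|x|}{2} < |x - y| \le 2|x|} \sum_{j \le 0} 2^{jn} (1 + 2^j |x - y|)^{-N} (1 + |y|)^{-r}\, dy &\le C \sum_{j \le 0} 2^{jn} (1 + 2^j |x|)^{-N} \int_{|y| \le 3|x|} (1 + |y|)^{-r}\, dy \\ &\le C\, (1 + |x|)^{-n} \int_0^{3|x|} t^{n-1} (1 + t)^{-r}\, dt \\ &\le C\, (1 + |x|)^{-n} (1 + |x|)^{n - r} \le C\, (1 + |x|)^{-r}. \end{aligned}$$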


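Here we have used the condition $r < n$. We now turn to the third integral. Writing $p'$ for the exponent conjugate to $p$,
$$\begin{aligned} \int_{|x - y| \ge 2|x|} \sum_{j \le 0} 2^{jn} (1 + 2^j |x - y|)^{-N} (1 + |y|)^{-r}\, dy &\le C \sum_{j \le 0} 2^{jn} \int_{|z| \ge 2|x|} (1 + 2^j |z|)^{-N} (1 + |z|)^{-r}\, dz \\ &\le C \sum_{j \le 0} 2^{jn} \Big( \int_{|z| \ge 2|x|} (1 + |z|)^{-rp}\, dz \Big)^{\frac{1}{p}} \Big( \int_{|z| \ge 2|x|} (1 + 2^j |z|)^{-Np'}\, dz \Big)^{\frac{1}{p'}} \\ &\le C \sum_{j \le 0} 2^{jn} \Big( \int_{2|x|}^{\infty} t^{n-1} (1 + t)^{-rp}\, dt \Big)^{\frac{1}{p}}\, 2^{-\frac{jn}{p'}} \Big( \int_{2^{j+1} |x|}^{\infty} t^{n-1} (1 + t)^{-Np'}\, dt \Big)^{\frac{1}{p'}} \\ &\le C \sum_{j \le 0} 2^{\frac{jn}{p}} (1 + |x|)^{\frac{n}{p} - r} (1 + 2^j |x|)^{\frac{n}{p'} - N} \\ &\le C\, (1 + |x|)^{-\frac{n}{p}} (1 + |x|)^{\frac{n}{p} - r} \le C\, (1 + |x|)^{-r}. \end{aligned}$$
Here we choose a fixed $p \in \big(\frac{n}{r}, \infty\big)$ and after this choice is made choose $N$ sufficiently large. This concludes the proof.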

Theorem 2.3. Let $f \in C^{0,r}$, $r > 0$. Then for every $j \ge -1$, $x \in \mathbb{R}^n$ there exists an integer $N = N(x, j) \in [-1, j]$ such that
$$|S_j f(x) - S_{N(x,j)} f(0)| \le C\, \|f\|_{C^{0,r}}, \quad j \ge -1, \ x \in \mathbb{R}^n. \tag{2.5}$$

Proof. We give the proof for $r > 1$. For $r \in (0, 1)$ and $r = 1$ the proof is analogous. We have for any $N \in [-1, j]$,
$$|S_j f(x) - S_N f(0)| \le |S_{-1} f(x) - S_{-1} f(0)| + \sum_{\ell=0}^{N} |\Delta_\ell f(x) - \Delta_\ell f(0)| + \sum_{\ell=N+1}^{j} |\Delta_\ell f(x)|. \tag{2.6}$$
From Taylor's formula
$$\Delta_\ell f(x) - \Delta_\ell f(0) = \sum_k \int_0^1 \Delta_\ell \partial_k f(\tau x) \cdot x_k\, d\tau.$$
Using that $\partial_k f \in C^{-1,r}$ we conclude
$$|\Delta_\ell f(x) - \Delta_\ell f(0)| \le C\, \|f\|_{C^{0,r}}\, |x|\, 2^\ell \int_0^1 (1 + 2^\ell \tau |x|)^{-r}\, d\tau = C\, \|f\|_{C^{0,r}} \big(1 - (1 + 2^\ell |x|)^{1-r}\big). \tag{2.7}$$
For $x = 0$ the inequality (2.5) is obviously satisfied (say, $N = j$). Let $2^{-m-1} < |x| \le 2^{-m}$, $m \in \mathbb{Z}$.


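Case I. $m \ge j$. We set $N = j$ and apply (2.1) to the first term in the right side of (2.6) and (2.7) to the rest:
$$\begin{aligned} |S_j f(x) - S_j f(0)| &\le C\, \|f\|_{C^{0,r}} + C\, \|f\|_{C^{0,r}} \sum_{\ell=0}^{j} \big\{1 - (1 + 2^\ell |x|)^{1-r}\big\} \\ &\le C\, \|f\|_{C^{0,r}} + C\, \|f\|_{C^{0,r}} \sum_{\ell=0}^{j} 2^\ell |x| \\ &\le (C + C\, 2^j |x|)\, \|f\|_{C^{0,r}} \le C\, \|f\|_{C^{0,r}}. \end{aligned}$$
Here we used the assumption $2^j |x| \le 2^m |x| \le 1$.

Case II. $m < 0$. We set $N = -1$,
$$\begin{aligned} |S_j f(x) - S_{-1} f(0)| &\le C\, \|f\|_{C^{0,r}} + \sum_{\ell=0}^{j} |\Delta_\ell f(x)| \\ &\le C\, \|f\|_{C^{0,r}} + C\, \|f\|_{C^{0,r}} \sum_{\ell=0}^{j} (1 + 2^\ell |x|)^{-r} \\ &\le \Big( C + C \sum_{\ell=0}^{j} 2^{-\ell r} \Big) \|f\|_{C^{0,r}} \le C\, \|f\|_{C^{0,r}}. \end{aligned}$$
Here we used (2.2) for the sum in the right side of (2.6).

Case III. $0 \le m \le j - 1$. We set $N = m$,
$$\begin{aligned} |S_j f(x) - S_N f(0)| &\le \Big( C + C \sum_{\ell=0}^{m} \big(1 - (1 + 2^\ell |x|)^{1-r}\big) + C \sum_{\ell=m+1}^{j} (1 + 2^\ell |x|)^{-r} \Big) \|f\|_{C^{0,r}} \\ &\le \Big( C + C \sum_{\ell=0}^{m} 2^\ell |x| + C \sum_{\ell=m+1}^{j} 2^{-\ell r} |x|^{-r} \Big) \|f\|_{C^{0,r}} \\ &\le (C + C\, 2^m |x| + C\, 2^{-mr} |x|^{-r})\, \|f\|_{C^{0,r}} \le C\, \|f\|_{C^{0,r}}. \end{aligned}$$
The theorem is proved.

Remark 2.4. The constant for $r \ne 1$ obtained from this proof is $O((r|r - 1|)^{-1})$. There is a separate constant for $r = 1$.

Corollary 2.5. Let $f \in C^{0,r}$, $r > 0$. Then the following two statements are equivalent:
(i) $f \in L^\infty$;
(ii) The sequence $\{S_j f(0)\}$, $j \ge -1$, is a bounded sequence.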


Proof. (i) obviously implies (ii) (even without $f \in C^{0,r}$). The other way follows directly from the theorem.

We now discuss a few examples of functions in $C^{0,r}$. We will say that a function (distribution) $f$ on $\mathbb{R}^n$ “locally belongs to” $C^{s,r}$ if there exists a $\chi \in C_0^\infty(\mathbb{R}^n)$, $\chi = 1$ on a neighbourhood of 0, such that $\chi f \in C^{s,r}$.

Proposition 2.6.
1. Let $f \in C^\infty(\mathbb{R}^n \setminus 0)$ be homogeneous of order 0. Then $f$ locally belongs to $C^{0,r}$ for any $r$.
2. $f(x) = \log_2 |x|$ locally belongs to $C^{0,r}$ for any $r$.

Proof. Replacing $f$ by $\chi f$, where $\chi \in C_0^\infty(\mathbb{R}^n)$, we may assume that $g(x) = f(x) - f(2x) \in C_0^\infty(\mathbb{R}^n)$. Thus
$$(\Delta_j g)(x) = (\Delta_j f)(x) - (\Delta_{j-1} f)(2x).$$
We have obviously
$$|S_{-1} f(x)| \le C (1 + |x|)^{-r}; \qquad |\Delta_0 f(x)| \le C (1 + |x|)^{-r}.$$
Using the recurrence relation above,
$$(\Delta_j f)(x) = (\Delta_0 f)(2^j x) + \sum_{k=1}^{j} (\Delta_k g)(2^{j-k} x), \quad j \ge 1.$$
Since $g \in C^{1,r}$, we have from the above
$$|\Delta_j f(x)| \le C\, (1 + 2^j |x|)^{-r} + C \sum_{k=1}^{j} (1 + 2^j |x|)^{-r} \cdot 2^{-k} \le C\, (1 + 2^j |x|)^{-r}, \quad j \ge 1.$$
This concludes the proof.

Proposition 2.7. For any $r > 0$ we have the following embedding: $C^{0,r} \hookrightarrow \mathrm{bmo}$.

Proof. Since $C^{0,r} \hookrightarrow L^p$ for $r > 0$, $rp > n$, we have to check only the following condition. Let $I_\ell$ be a cube with side $2^{-\ell}$, $\ell \ge 0$. We have to verify that for $f \in C^{0,r}$,
$$\sup_{\ell, I_\ell} \Big\{ \frac{1}{|I_\ell|} \int_{I_\ell} \sum_{j \ge \ell} |\Delta_j f|^2\, dx \Big\}^{\frac{1}{2}} \le C\, \|f\|_{C^{0,r}}.$$
We have
$$\frac{1}{|I_\ell|} \int_{I_\ell} \sum_{j \ge \ell} |\Delta_j f|^2\, dx \le \|f\|^2_{C^{0,r}}\, 2^{n\ell} \int_{I_\ell} \sum_{j \ge \ell} (1 + 2^j |x|)^{-2r}\, dx = \|f\|^2_{C^{0,r}} \int_{I_0} \sum_{j \ge \ell} (1 + 2^{j-\ell} |x|)^{-2r}\, dx.$$


Taking the “worst” location of $I_0$, namely the one centered at 0, we estimate the sum as follows:
$$\begin{aligned} \sum_{j \ge \ell} \Big( \int_{I_{j-\ell}} + \int_{I_0 \setminus I_{j-\ell}} \Big) &\le C \sum_{j \ge \ell} \Big( 2^{-n(j-\ell)} + \sum_{k=0}^{j-\ell-1} 2^{-kn}\, 2^{-2r(j-\ell-k)} \Big) \\ &\le C \Big( 2 + \sum_{j \ge \ell} (j - \ell) \big( 2^{-2r(j-\ell)} + 2^{-n(j-\ell-1) - 2r} \big) \Big) \le C. \end{aligned}$$
This completes the proof.

3. Definition of Admissibility

3.1. Consider a finite subgroup $G \subset O(n)$. We say it is admissible if the following two conditions are satisfied:
(i) $(-1) \in G$;
(ii) Consider the following representation $\pi$ of $O(n)$ in the space of traceless symmetric matrices:
$$\pi(h) : A \mapsto h A h^{-1}, \quad h \in O(n), \qquad A = {}^t\! A; \quad \operatorname{tr} A = 0.$$
Then there are no invariant elements for this representation restricted to $G \subset O(n)$:
$$\{\pi(g) A = A \quad \forall\, g \in G \ \Rightarrow\ A = 0\}.$$

We will often consider the following situation below. Assume $u$ is a solenoidal vector field in $\mathbb{R}^n$: $u(x) = (u_1(x), \dots, u_n(x))$, $\operatorname{div} u = 0$, $x \in \mathbb{R}^n$. We say $u$ is symmetric with respect to a finite admissible subgroup $G \subset O(n)$ provided the following natural condition holds:
$$g_* u\, (g^{-1} x) = u(x), \quad g \in G, \ x \in \mathbb{R}^n.$$
We give here one very simple example.

Example 3.2. $n = 3$;
$$\begin{cases} u_1(x) = x_1 (x_2^2 - x_3^2)\, |x|^{-2}, \\ u_2(x) = x_2 (x_3^2 - x_1^2)\, |x|^{-2}, \\ u_3(x) = x_3 (x_1^2 - x_2^2)\, |x|^{-2}. \end{cases}$$
This field can be modified to belong to $C^{0,r}$ by altering it at infinity. For example, we can use the following procedure:
$$u_a(x) = \operatorname{curl}\big(\chi(x)\, (K * 1_{\{|\cdot| \le 1\}} u)(x)\big).$$
Here $1_{\{|\cdot| \le 1\}}$ is a characteristic function of a unit ball; $\chi \in C_0^\infty$ is a radial cutoff function with support inside this ball, $\chi = 1$ on a neighbourhood of the origin. Obviously $u_a$ constructed this way is symmetric with respect to $G$ generated by


(i) mirror symmetries with respect to coordinate planes;
(ii) cyclic permutation $(x_1, x_2, x_3)$.

It is easy to see that $G$ is admissible. Indeed, assume the contrary. This means there is a nonzero traceless symmetric real $3 \times 3$ matrix $A$ such that $g A g^{-1} = A$ for all $g \in G$. Taking $g$ to be mirror symmetries we conclude that coordinate vectors are eigenvectors for $A$. Invariance under the permutation $(x_1, x_2, x_3)$ implies the eigenvalues are equal, which implies $A = 0$ since $\operatorname{tr} A = 0$. An example of a different nature is discussed in Sect. 7 below.

Remark 3.3. The local existence and uniqueness theorem below provides a unique local solution to the Euler equations in dimension 3 with initial data $u_a(x)$ or any flow with the same singularity at the origin and the same group of symmetries. These flows could be described as three-dimensional “vortex patches, collapsed to a point”. This should be compared with the results of [GS] where the solution for three-dimensional vortex patches was constructed under the condition of conormal regularity.

4. Formulation of the Main Result

Let $r \in \mathbb{R}$. Suppose two positive real sequences are given, $\delta(j)$ and $\gamma(j)$ for $j \ge -1$. Assume the following:
• $\delta(j) < 1$ is nonincreasing; $\gamma(j)$ is nondecreasing; (4.1)
• $C^{-1} \delta(j) \le \delta(j + 1) \le C\, \delta(j)$; $C^{-1} \gamma(j) \le \gamma(j + 1) \le C\, \gamma(j)$, $j \ge -1$. (4.2)
The first assumption implies
• $\sum_{\ell \ge j} \delta(\ell)\, 2^{j - \ell} \le 2\, \delta(j)$, $j \ge -1$. (4.3)

Definition 4.1 (of the Banach spaces $B_1$, $B_2$, $B$).
$$B_1 = \{f \in \mathcal{S}'(\mathbb{R}^n) \mid |\Delta_j f(x)| \le C\, \delta(j) (1 + 2^j |x|)^{-r};\ j \ge -1,\ x \in \mathbb{R}^n\}; \tag{4.4}$$
$$B_2 = \{f \in \mathcal{S}'(\mathbb{R}^n) \mid |S_j f(x)| \le C\, \gamma(j);\ j \ge -1,\ x \in \mathbb{R}^n\}; \tag{4.5}$$
$$B = B_1 \cap B_2. \tag{4.6}$$
Note that $B_{1,2}$, $B$ depend upon the choice of $r$, $\delta$, $\gamma$ (and $n$) but we will suppress this fact in our notation. The norm in $B_{1,2}$ is the infimum of all $C > 0$ such that (4.4) or (4.5) holds.

Theorem 4.2. Let $n \ge 2$ be arbitrary, $r \in (0, n)$, $p \in (1, n)$, $\omega_0 \in B \cap L^p$, $u_0 = K * \omega_0$. Assume that $u_0$ is symmetric with respect to a finite admissible subgroup $G \subset O(n)$. Then there exists a positive time $T > 0$ and a unique solution $u(t)$ of the Euler equations
$$\partial_t \omega = L_u \omega, \qquad u = K * \omega, \qquad u(0) = u_0$$
with the following properties:
$$\{t \mapsto \|\omega(t)\|_{B(t)}\} \in L^\infty([0, T]); \tag{4.7}$$
$$\omega(\cdot) \in L^\infty([0, T]; L^p). \tag{4.8}$$


Here the Banach space $B(t)$ corresponds to the following choice of sequences:
$$\delta(j)^{1 - \lambda t}; \qquad \gamma(j) = -\log \delta(j); \tag{4.9}$$
for $B = B(0)$ therefore the choice is $\delta(j)$ and $\gamma(j) = -\log \delta(j)$; $\lambda > 0$ is an appropriate constant. For a.e. $t \in [0, T]$, $u(t)$ is symmetric with respect to $G$.

Remark 4.3. $L_u$ denotes the Lie derivative. We give another equivalent form of the equation below. $K$ stands for the $n$-dimensional Biot–Savart kernel.

Remark 4.4. In Appendix B below we show that a function with singularity of the type $\log_2 \log_2 |x|^{-1}$ at the origin and with compact support belongs to $B = B_1 \cap B_2$, where $\delta(j) = (j + 3)^{-1}$, $\gamma(j) = \log(j + 3)$. The first inclusion follows from Proposition B.1, the second is obvious (and follows from (B.4)).

5. Paraproduct Estimates

In what follows $\partial_x u$ means the matrix $(\partial_j u_i)_{1 \le i, j \le n}$. As before the vorticity $\omega = \partial_x u - {}^t(\partial_x u)$ is thought of as a skew-symmetric matrix. The Euler equations have the following form:
$$\partial_t \omega = -(u \cdot \partial_x)\, \omega - D \omega - \omega D, \tag{5.1}$$
$$u = K * \omega. \tag{5.2}$$
Here $D$ stands for the following symmetric matrix:
$$D = \frac{1}{2} \big(\partial_x u + {}^t(\partial_x u)\big). \tag{5.3}$$
We need the following estimate.

Theorem 5.1. Let $\omega \in C^{0,r}$, $r \in (0, n)$, $u = K * \omega$. For any $j \ge -1$ let
$$R_j(u, \omega) = -(S_{j-2} u \cdot \partial_x)\, \Delta_j \omega + \Delta_j (u \cdot \partial_x)\, \omega. \tag{5.4}$$
Then,
$$\sup_{x \in \mathbb{R}^n} |R_j(u, \omega)(x)|\, (1 + 2^j |x|)^r \le C\, \delta(j) \big( (\|\omega\|_{B_1} + \|\omega\|_{L^p}) \|\omega\|_{C^{0,r}} + \|D\|_{L^\infty} \|\omega\|_{B_1} \big). \tag{5.5}$$
Throughout this proof the index $k$ appearing twice means summation with respect to $k$.

Proof. We follow closely the arguments in [BC]. According to Bony's celebrated paraproduct formula [B1] for two functions, $a$ and $b$,
$$ab = T_a b + T_b a + R(a, b). \tag{5.6}$$
The paraproduct operators have the following meaning:
$$T_a b = \sum_j S_{j-2} a\, \Delta_j b, \tag{5.7}$$
$$T_b a = \sum_j \Delta_j a\, S_{j-2} b, \tag{5.8}$$
$$R(a, b) = \sum_{|j - j'| \le 1} \Delta_j a\, \Delta_{j'} b. \tag{5.9}$$
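The index bookkeeping behind (5.6)–(5.9) can be illustrated with a toy model. In the Python sketch below (an assumption-laden illustration, not part of the paper) each dyadic block $\Delta_j a$, $\Delta_j b$ is replaced by a single number, for instance its value at one fixed point, and the lists `a`, `b` hold the blocks for $j = -1, 0, 1, \dots$ in order. The three sums then split the double sum $\sum_{j, j'} \Delta_j a\, \Delta_{j'} b$ into the index regions $j \le j' - 2$, $j' \le j - 2$ and $|j - j'| \le 1$, so they must add up to the product of the totals.

```python
def paraproduct(a, b):
    """Toy T_a b = sum_j S_{j-2} a * Delta_j b, with blocks given as numbers.

    a[i] and b[i] stand for the i-th dyadic blocks (list index 0 is j = -1).
    S_{j-2} a is the sum of the blocks of a with index at most j - 2.
    """
    return sum(sum(a[: max(j - 1, 0)]) * b[j] for j in range(len(b)))


def remainder(a, b):
    """Toy R(a, b): sum over block pairs whose indices differ by at most 1."""
    return sum(
        a[i] * b[j]
        for i in range(len(a))
        for j in range(len(b))
        if abs(i - j) <= 1
    )


def bony_split(a, b):
    """Return (T_a b, T_b a, R(a, b)) for two equally long lists of blocks."""
    if len(a) != len(b):
        raise ValueError("both block lists must have the same length")
    return paraproduct(a, b), paraproduct(b, a), remainder(a, b)


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [5, 6, 7, 8]
    t_ab, t_ba, r_ab = bony_split(a, b)
    # The three pieces recover the full product of the two sums.
    print(t_ab, t_ba, r_ab, t_ab + t_ba + r_ab, sum(a) * sum(b))
```

This checks only the partition of index pairs; the analytic content of the paraproduct estimates (frequency localisation of each piece) is not modelled.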

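We will often apply these formulae in case $a$ or $b$, or both, are matrix valued functions (distributions). We have
$$R_j(u, \omega) = \sum_{\ell=1}^{4} R_j^{(\ell)}(u, \omega), \tag{5.10}$$
where
$$R_j^{(1)}(u, \omega) = \Delta_j \partial_k T_\omega u_k, \tag{5.11}$$
$$R_j^{(2)}(u, \omega) = -[\partial_k T_{u_k} \cdot, \Delta_j]\, \omega, \tag{5.12}$$
$$R_j^{(3)}(u, \omega) = \partial_k T_{u_k - S_{j-2} u_k} \Delta_j \omega, \tag{5.13}$$
$$R_j^{(4)}(u, \omega) = \Delta_j \partial_k R(u_k, \omega) - \partial_k R(S_{j-2} u_k, \Delta_j \omega). \tag{5.14}$$
Indeed, using (5.6)–(5.9),
$$\begin{aligned} \Delta_j \partial_k (u_k \omega) &= \Delta_j \partial_k \big(T_{u_k} \omega + T_\omega u_k + R(u_k, \omega)\big) \\ &= \sum_{\ell=1}^{2} R_j^{(\ell)}(u, \omega) + \partial_k T_{u_k} \Delta_j \omega + \Delta_j \partial_k R(u_k, \omega) \\ &= \sum_{\ell=1}^{3} R_j^{(\ell)}(u, \omega) + \partial_k T_{S_{j-2} u_k} \Delta_j \omega + \Delta_j \partial_k R(u_k, \omega) \\ &= \sum_{\ell=1}^{4} R_j^{(\ell)}(u, \omega) + \partial_k T_{S_{j-2} u_k} \Delta_j \omega + \partial_k T_{\Delta_j \omega} S_{j-2} u_k + \partial_k R(S_{j-2} u_k, \Delta_j \omega) - \partial_k T_{\Delta_j \omega} S_{j-2} u_k \\ &= \sum_{\ell=1}^{4} R_j^{(\ell)}(u, \omega) + (S_{j-2} u \cdot \partial_x)\, \Delta_j \omega - \partial_k T_{\Delta_j \omega} S_{j-2} u_k. \end{aligned}$$
We assert that the last term vanishes. Indeed, from (5.8),
$$T_{\Delta_j \omega} S_{j-2} u_k = \sum_{j'} \Delta_{j'} S_{j-2} u_k\, S_{j'-2} \Delta_j \omega.$$
But $S_{j'-2} \Delta_j = 0$ for $j' \le j$ while $\Delta_{j'} S_{j-2} = 0$ for $j' \ge j$. This proves (5.10). We estimate all the terms in the right side of (5.10) starting with $R_j^{(1)}(u, \omega)$,
$$R_j^{(1)}(u, \omega) = \sum_{j'} \Delta_j \partial_k (\Delta_{j'} u_k\, S_{j'-2} \omega) = \sum_{j'} \sum_{\ell=-1}^{j'-2} \Delta_j (\Delta_{j'} u_k\, \partial_k \Delta_\ell \omega). \tag{5.15}$$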


All the terms in the right side with |j − j | ≥ M0 vanish. We may also assume j ≥ 1 since otherwise Sj −2 ω = 0. Then |j uk (x)| ≤ C 2−j δ(j ) ωB1 (1 + 2j |x|)−r ; |∂k , ω (x)| ≤ C 2, ωC 0,r .

(5.16) (5.17)

After rescaling x → 2−j x the operator j acts as a convolution with a function from S. Such an operator preserves decay at infinity as (1 + |x|)−r . Therefore, (5.15)–(5.17) imply (1)

|Rj (u, ω)(x)| ≤ C δ (j ) ωC 0,r ωB1 (1 + 2j |x|)−r .

(5.18)

(2)

We turn now to Rj (u, ω): (2)

Rj (u, ω) = −

j

= −∂k

[∂k Sj −2 uk j ·, j ] ω

[Sj −2 uk ·, j ] j ω.

(5.19)

|j −j |≤M0

We use the explicit integral representation to estimate the sum in the right side of (5.19), ∂k [Sj −2 uk ·, j ] j ω (x) j (n+1) ∂k ϕ (2j (x − y))(Sj −2 uk (x) − Sj −2 u(y)) j ω(y) dy = 2 = 2j (n+1)

∂k ϕ (2j (x − y))

m=1

1 0

=

n

(5.20)

Sj −2 ∂m uk (x − τ (x − y))(xm − ym ) j ω(y) dy dτ

n

1

m=1 0

∂k ϕ(z) zm Sj −2 ∂m uk (x − 2−j τ z)

j ω(x − 2−j z) dz dτ. dϕ zk zm Notice that ∂k ϕ(z) zm = d|z| |z| so we can symmetrize the last expression with respect to m, k. The right side of (5.20) equals n 1 1 ∂k ϕ(z) zm Sj −2 (∂m uk + ∂k um )(x − 2−j τ z)j ω(x − 2−j z) dz dτ. 2 0 m=1

(5.21) We have from (5.19)–(5.21), (2) |Rj (u, ω)(x)|

≤ C DL∞

|∂z ϕ(z)| |z| |j ω(x − 2−j z)| dz. (5.22)

|j −j |≤M0

Since |∂z ϕ(z)| |z| is bounded by a radial decreasing function from L1 we can use the following lemma.


Lemma 5.2. Assume F (x) is radial, nonnegative, radially decreasing. Then

F (y) dy · M f (x) , x ∈ Rn . |(f ∗ F )(x)| ≤ Rn

Proof. See [S], p. 57.

Using Lemma 5.2 and (5.22) we obtain (2)

|Rj (u, ω)(x)| ≤ C δ(j ) DL∞ M{x → (1 + 2j |x|)−r } ωB1 ≤ C δ(j ) DL∞ ωB1 (1 + 2j |x|)−r .

(5.23)

Indeed, a simple computation shows M {x → (1 + |x|)−r } ≤ C (1 + |x|)−r . (3)

Next we estimate Rj (u, ω),

(3)

Rj (u, ω) = ∂k

Sj −2 (uk − Sj −2 uk ) j j ω

|j −j |≤1

= ∂k

 Sj −2 

|j −j |≤1

j



(5.24)

q uk  j j ω.

q=j −1

We use for q ≥ 0 the inequality |q uk (x)| ≤ C 2−q δ(q) ωB1 (1 + 2q |x|)−r ,

(5.25)

and for q = −1 the inequality |−1 uk (x)| ≤ C ωLp .

(5.26)

The last estimate (5.26) follows from the Hardy–Littlewood–Sobolev inequality applied to a convolution with K, and Bernstein’s inequality. In addition, since ω ∈ C 0,r , |j j ω(x)| ≤ C ωC 0,r (1 + 2j |x|)−r .

(5.27)

Combining (5.24)–(5.27) with (4.2) yields (3)

|Rj (u, ω)(x)| ≤ C δ(j ) ωC 0,r (ωB1 + ωLp ) (1 + 2j |x|)−r .

(5.28)

(4)

We now estimate Rj (u, ω): (4)

(4,1)

Rj (u, ω) = Rj (4,1) Rj (u, ω) (4,2) Rj (u, ω)

(4,2)

(u, ω) + Rj

(u, ω),

= j ∂k R(uk − Sj −2 uk , ω), = j ∂k R(Sj −2 uk , ω) − ∂k R(Sj −2 uk , j ω).

(5.29)

(4,1)

To estimate Rj

(4,1)

Rj

we proceed as follows:

(u, ω) = j ∂k

|j −j |≤1

= ∂k j

j (uk − Sj −2 uk ) j ω

j , uk

,≥j −1 |j −,|≤1

j ω.

(5.30)

|j −j |≤1

We have for , ≥ 0, |, uk (x)| ≤ C 2−, δ(,) ωB1 (1 + 2, |x|)−r ,

(5.31)

|−1 uk (x)| ≤ C ωLp .

(5.32)

and for , = −1,

In addition taking into account the restrictions on ,, j , j we find |j ω(x)| ≤ C ωC 0,r (1 + 2, |x|)−r .

(5.33)

Collecting together (5.30)–(5.33) and (4.3) and using Bernstein’s inequality yields (4,1)

|Rj

(u, ω)(x)| ≤ C δ(j ) ωC 0,r (ωB1 + ωLp ) (1 + 2j |x|)−r .

(5.34)

As for the last term (4,2)

Rj

(u, ω) = ∂k

{j ((j Sj −2 uk ) j ω)

|j −j |≤1

− (j Sj −2 uk ) · (j j ω)} [j , j Sj −2 uk ·] j ω. = ∂k

(5.35)

j −M0 ≤j ≤j −1 |j −j |≤1

We use the explicit integral representation ∂k [j , j Sj −2 uk ·] j ω(x) j (n+1) ∂k ϕ(2j (x − y))(j Sj −2 uk (y) = 2 − j Sj −2 uk (x)) j ω(y) dy j (n+1) ∂k ϕ(2j (x − y)) = −2 n

1

m=1 0 n

= −

m=1 1

0

j Sj −2 ∂m uk (x − τ (x − y))(xm − ym ) j ω(y) dy dτ

∂k ϕ(z) zm

j Sj −2 ∂m uk (x − τ 2−j z) j ω(x − 2−j z) dz dτ.

(5.36)


Using the restriction on j , j , |j Sj −2 ∂m uk (x)| ≤ C ωC 0,r ,

(5.37)

|j ω(x)| ≤ C δ(j ) ωB1 (1 + 2j |x|)−r .

(5.38)

From Lemma 5.2 and (5.35)–(5.38), (4,2)

|Rj

(u, ω)(x)| ≤ C δ(j ) ωC 0,r ωB1 M {x → (1 + 2j |x|)−r } ≤ C δ(j ) ωC 0,r ωB1 (1 + 2j |x|)−r .

(5.39)

Now (5.10) combined with (5.18), (5.23), (5.28), (5.34), (5.39) implies (5.5). This concludes the proof of the theorem. Theorem 5.3. Let ω ∈ C 0,r ∩ Lp , r ∈ (0, n), u = K ∗ ω. Then sup |j (Dω + ωD)(x)| (1 + 2j |x|)r

x∈Rn

(5.40)

≤ C δ(j ) DL∞ ωB1 + C δ(j ) γ (j ) ωB1 ωB2 + C δ(j ) (ωC 0,r + ωLp ) ωB1 .

(5.40)

Proof. Using (5.6)–(5.9), Dω =

TD ω + Tω D + R(D, ω).

(5.41)

j

We first discuss the term j TD ω =

j

=

j (Sj −2 D j ω)

j (Sj −2 D j ω).

(5.42)

|j −j |≤M0

We have |Sj −2 D(x)| ≤ C DL∞ ,

(5.43)

|j ω(x)| ≤ C δ(j ) ωB1 (1 + 2j |x|)−r .

(5.44)

|j TD ω(x)| ≤ C δ(j ) DL∞ ωB1 (1 + 2j |x|)−r .

(5.45)

Therefore,

The second term to estimate is j Tω D =

j

=

j (j D Sj −2 ω)

|j −j |≤M0

j (j D Sj −2 ω).

(5.46)


The symmetric matrix D is an image of ω under a translation invariant Calderón– Zygmund singular integral operator. Using Proposition 2.2 to handle the low frequencies (j = −1), we obtain |j D(x)| ≤ C δ(j ) ωB1 (1 + 2j |x|)−r .

(5.47)

|Sj −2 ω(x)| ≤ C γ (j ) ωB2 .

(5.48)

Also, trivially

Combining (5.46)–(5.48) yields the estimate |j Tω D(x)| ≤ C δ(j ) γ (j ) ωB1 ωB2 (1 + 2j |x|)−r .

(5.49)

Obviously transposed terms to those in (5.45) and (5.49) satisfy the same inequalities. We treat now the following term: j (R(D, ω) + R(ω, D)). Let A = ∂x ω,

R(D, ω) + R(ω, D) = =

1 2

{j (A + t A) j (A − t A) + j (A − t A) j (A + t A)}

|j −j |≤1

=

{j D j ω + j ω j D}

|j −j |≤1

j A j A − j t A j t A

|j −j |≤1

1 1 j t A j A − j A j t A 2 2 1 1 t + j A j A − j t A j A. 2 2 +

Therefore, j (R(D, ω) + R(ω, D)) = j

{j A j A − j t A j t A}.

(5.50)

j ≥j −M0 |j −j |≤1

The (i, k)-matrix element of this product is j

n

j ≥j −M0

|j −j |≤1

m=1

= j = j

{j ∂m ui j ∂k um − j ∂i um j ∂m uk }

n

m=1

j ≥j −M0

|j −j |≤1

n m=1

∂m

{∂m (j ui j ∂k um ) − ∂m (j ∂i um j uk )}

j ≥j −M0 |j −j |≤1

{j ui j ∂k um − j ∂i um j uk }.


We have the following estimates (using Proposition 2.2 to handle the case j = −1):

|j ui (x)| ≤ C 2−j (1 + 2j |x|)−r ωC 0,r , j ≥ 0,

(5.51)

|−1 ui (x)| ≤ C ωLp ,

(5.52)

|j ∂k um (x)| ≤ C (1 + 2j |x|)−r δ(j ) ωB1 . Combining (5.51)–(5.53) and summing up with respect to |j (R(D, ω) + R(ω, D)(x)| ≤ C ωB1 (ωC 0,r + ωLp ) 2j

j

(5.53)

yields

2−j δ(j ) (1 + 2j |x|)−r

j ≥j −M0

(5.54)

$\le C\, \delta(j)\, \|\omega\|_{B_1} (\|\omega\|_{C^{0,r}} + \|\omega\|_{L^p})\, (1 + 2^j |x|)^{-r}$ because of (4.1)–(4.3). Using (5.45), (5.49), (5.54) leads to (5.40). This concludes the proof.

6. Estimates for $\|\omega\|_{B_2}$

In this section $n \ge 2$ is arbitrary.

Proposition 6.1. Let $p \in (1, \infty)$, $s > 1$, $u_0 \in C^s$ (“classical” Hölder space) being a divergence free vector field in $\mathbb{R}^n$. Assume in addition that $\partial_x u_0 \in L^p$. Let $T_*$ be a maximal time of existence of a unique solution $u(t) = K * \omega(t)$ to the Euler equations with initial condition $u_0$ such that $u(\cdot) \in L^\infty_{loc}([0, T_*), C^s)$, $\partial_x u(\cdot) \in L^\infty_{loc}([0, T_*), L^p)$. Then,
$$\|\omega(t)\|_{L^q} \le \exp\Big( 2 \int_0^t \|D(\tau)\|_{L^\infty}\, d\tau \Big) \|\omega(0)\|_{L^q}, \quad q \in [p, \infty] \tag{6.1}$$
for any $t \in [0, T_*)$.

Proof. We use the vorticity equation and estimate $\|\omega(t)\|_{L^q}$, $q \in [p, \infty]$. Let $X(x, t)$ denote the corresponding flow:
$$\frac{d}{dt} X(x, t) = u(X(x, t), t); \qquad X(x, 0) = x; \qquad t \in [0, T_*).$$
Then, from the vorticity equation
$$\frac{d}{dt}\, \omega(X(x, t), t) = -(D \omega + \omega D)(X(x, t), t). \tag{6.2}$$
Multiplying by ${}^t \omega$ and taking trace yields
$$\frac{d}{dt}\, |\omega|^2_{HS}(X(x, t), t) = 4 \operatorname{tr}(D \omega^2), \quad t \in [0, T_*). \tag{6.3}$$
Here $|\omega|^2_{HS} = \operatorname{tr} \omega\, {}^t \omega$. From (6.3),
$$|\omega(X(x, t), t)|_{HS} \le \exp\Big( 2 \int_0^t |D(X(x, \tau), \tau)|\, d\tau \Big) |\omega_0(x)|_{HS}. \tag{6.4}$$
In (6.4) $|D|$ is computed with respect to standard Euclidean structure on $\mathbb{R}^n$. Taking the $L^q$ norm of the left side and changing variables in the integral implies (6.1).
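The algebraic step from (6.2) to (6.3) can be checked mechanically. The Python sketch below is an illustration under our own naming (none of these helpers appear in the paper): for a random symmetric integer matrix $D$ and a random skew-symmetric integer matrix $\omega$ it compares $2 \operatorname{tr}(\dot\omega\, {}^t\omega)$, with $\dot\omega = -(D\omega + \omega D)$ as in (6.2), against $4 \operatorname{tr}(D \omega^2)$. Integer entries keep the comparison exact.

```python
import random


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(a):
    """Transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def random_symmetric(n, rng):
    """Random symmetric integer matrix (stands in for D)."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = rng.randint(-5, 5)
    return m


def random_skew(n, rng):
    """Random skew-symmetric integer matrix (stands in for the vorticity)."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = rng.randint(-5, 5)
            m[j][i] = -m[i][j]
    return m


def hs_rate(D, w):
    """d/dt tr(w w^T) = 2 tr(dw w^T) with dw = -(D w + w D), as in (6.2)."""
    n = len(D)
    Dw, wD = matmul(D, w), matmul(w, D)
    dw = [[-(Dw[i][j] + wD[i][j]) for j in range(n)] for i in range(n)]
    return 2 * trace(matmul(dw, transpose(w)))


def four_trace(D, w):
    """Right side of (6.3): 4 tr(D w^2)."""
    return 4 * trace(matmul(D, matmul(w, w)))


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (2, 3, 4):
        D, w = random_symmetric(n, rng), random_skew(n, rng)
        print(n, hs_rate(D, w) == four_trace(D, w))
```

The identity relies on ${}^t\omega = -\omega$ and on the cyclicity of the trace; the exponential bound (6.4) then follows from $|4 \operatorname{tr}(D\omega^2)| \le 4 |D|\, |\omega|^2_{HS}$ and Gronwall's inequality.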


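7. Example

Let $n = 4$ and consider the following solenoidal covector field on $\mathbb{R}^4$:

\[ u = g(|x|)\, \{-x_3\, dx_1 - x_4\, dx_2 + x_1\, dx_3 + x_2\, dx_4\}. \]

Let $G \subset O(4)$ be generated by the following matrices:

\[
g_1 = \mathrm{diag}(-1, 1, -1, 1); \quad
g_2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}; \quad
g_3 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.
\]

Proposition 7.1. $u \in A^1(\mathbb{R}^4)^*$ is $G$-invariant.

Proof. Direct computation. Since $g(|x|)$ obviously is $G$-invariant we will assume $g \equiv 1$.

1. Let $x_1 = -y_1$, $x_2 = y_2$, $x_3 = -y_3$, $x_4 = y_4$. Then $u = -y_3\, dy_1 - y_4\, dy_2 + y_1\, dy_3 + y_2\, dy_4$.
2. Let $x_1 = y_3$, $x_2 = y_4$, $x_3 = -y_1$, $x_4 = -y_2$. Then $u = y_1\, dy_3 + y_2\, dy_4 - y_3\, dy_1 - y_4\, dy_2$.
3. Let $x_1 = y_2$, $x_2 = y_1$, $x_3 = y_4$, $x_4 = y_3$. Then $u = -y_4\, dy_2 - y_3\, dy_1 + y_2\, dy_4 + y_1\, dy_3$.

This completes the proof.

Proposition 7.2. Let $A$ be a symmetric traceless $4 \times 4$ matrix such that $g A g^{-1} = A$ for all $g \in G$. Then $A = 0$.

Proof. Let $(a_{ij})_{1 \le i,j \le 4}$ be matrix elements of $A$. Then $(g_1 A g_1^{-1})_{ij} = (-1)^{i+j} a_{ij}$. Therefore,

\[
A = \begin{pmatrix} \alpha & 0 & \beta & 0 \\ 0 & \gamma & 0 & \delta \\ \beta & 0 & \varepsilon & 0 \\ 0 & \delta & 0 & \tau \end{pmatrix}
= \begin{pmatrix} \mathcal{A} & \mathcal{B} \\ \mathcal{C} & \mathcal{D} \end{pmatrix}.
\]

We have:

\[
g_2 A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} \mathcal{A} & \mathcal{B} \\ \mathcal{C} & \mathcal{D} \end{pmatrix} = \begin{pmatrix} \mathcal{C} & \mathcal{D} \\ -\mathcal{A} & -\mathcal{B} \end{pmatrix}; \quad
A g_2 = \begin{pmatrix} \mathcal{A} & \mathcal{B} \\ \mathcal{C} & \mathcal{D} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -\mathcal{B} & \mathcal{A} \\ -\mathcal{D} & \mathcal{C} \end{pmatrix}.
\]

Hence, $\mathcal{C} = -\mathcal{B}$; $\mathcal{A} = \mathcal{D}$, which means $\alpha = \varepsilon$, $\gamma = \tau$, $\beta = 0$, $\delta = 0$. Using this information,

\[
g_3 A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha & 0 & 0 & 0 \\ 0 & \gamma & 0 & 0 \\ 0 & 0 & \alpha & 0 \\ 0 & 0 & 0 & \gamma \end{pmatrix} = \begin{pmatrix} 0 & \gamma & 0 & 0 \\ \alpha & 0 & 0 & 0 \\ 0 & 0 & 0 & \gamma \\ 0 & 0 & \alpha & 0 \end{pmatrix};
\]

\[
A g_3 = \begin{pmatrix} \alpha & 0 & 0 & 0 \\ 0 & \gamma & 0 & 0 \\ 0 & 0 & \alpha & 0 \\ 0 & 0 & 0 & \gamma \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & \alpha & 0 & 0 \\ \gamma & 0 & 0 & 0 \\ 0 & 0 & 0 & \alpha \\ 0 & 0 & \gamma & 0 \end{pmatrix}.
\]

Hence, $\alpha = \gamma$. Since $\mathrm{tr}\, A = 0$ this implies the proposition.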

Remark 7.3. As follows from Appendix B, choosing $g(|x|)$ to behave near the origin like e.g. $\log_2 \log_2 |x|^{-1}$ and having compact support gives an example where all the conditions of Theorem 4.2 are satisfied. Here $\delta(j) = (j + 3)^{-1}$, $\gamma(j) = \log(j + 3)$. One could add to this vector field e.g. an arbitrary solenoidal vector field in $\mathcal{S}$ with the same finite symmetry group.

8. Construction of the Flow

We are in a position now to start proving Theorem 4.2. Uniqueness follows from the results of [V] and the following facts:

• $\|\omega(\cdot)\|_{C^{0,r}} \in L^\infty_{loc}([0, T_*))$;
• $\omega(\cdot) \in L^\infty_{loc}([0, T_*); L^p)$.

From Proposition 2.7, for any positive $r$, $C^{0,r} \subset bmo \subset C^0_*$, so the results of [V] are applicable.

The first remark is of a purely technical nature.

Lemma 8.1. Assume $\delta(j)$ in the definition of $B_1$ is replaced by $\delta(j)^{\varepsilon}$, $\varepsilon \in [0, 1]$. Let $\gamma(j) = -\log \delta(j)$. Then the constants in (4.2) can be chosen to be uniform in $\varepsilon$.

Proof.
1. $C^{-\varepsilon}\, \delta(j)^{\varepsilon} \le \delta(j + 1)^{\varepsilon} \le C^{\varepsilon}\, \delta(j)^{\varepsilon}$.
2. $-C^{-1} \log \delta(j) \le -\log \delta(j + 1) \le -C \log \delta(j)$.
This concludes the proof.

Let $\omega_0 \in B = B_1(0) \cap B_2$. We construct a sequence of solutions to the Euler equations $u_m(t) = K * \omega_m(t)$, defined as follows:

\[ \omega_m(0) = S_m \omega_0, \quad m = 1, 2, 3, \ldots. \tag{8.1} \]

Evidently, $\omega_m(0) \in C^{s,r} \cap L^p$ for any $s \in (0, \infty)$. Therefore, there exists a maximal time $T_*^m > 0$ and a unique solution to the Euler equations $u_m(t) = K * \omega_m(t)$ such that

\[ \|\omega_m(\cdot)\|_{C^{s,r}} \in L^\infty_{loc}([0, T_*^m)) \tag{8.2} \]

and

\[ \omega_m(\cdot) \in L^\infty_{loc}([0, T_*^m); L^p) \tag{8.3} \]

(see Appendix A).

Lemma 8.2. There is a positive lower bound $T_*^m > C > 0$, where $C$ depends on $\omega_0$ but not on $m$. The sequence $\|\omega_m(t)\|_{B(t)}$ is bounded in $L^\infty([0, C])$. Also $\{\omega_m\}$ is bounded in $L^\infty([0, C], L^p)$.

Proof. According to Theorem A.1 in Appendix A, $T_*^m < \infty$ implies

\[ \int_0^{T_*^m} \|D_m(\tau)\|_{L^\infty}\, d\tau = \infty. \tag{8.4} \]

Here $D_m(\tau)$ is the tensor of deformation rates corresponding to $u_m(\tau)$. Obviously, $C^{s,r} \subset C^{0,r}$ since $s \in (0, \infty)$. It is now time to use the symmetry condition

\[ g_* u_m(g^{-1} x, \tau) = u_m(x, \tau), \quad g \in G \subset O(n). \]

Indeed, this follows from the symmetry of the initial condition, the uniqueness theorem and the fact that the Littlewood–Paley analysis is $O(n)$-invariant by our choice. Since the finite group $G \subset O(n)$ is admissible, it follows that

\[ S_j D_m(\tau)(0) = 0, \quad j \ge -1. \]

Indeed, $S_j D_m(\tau)$ inherits the symmetries of $D_m(\tau)$ for the same reason as just mentioned. Therefore, according to Corollary 2.5,

\[ \|D_m(\tau)\|_{L^\infty} \le C\, \|D_m(\tau)\|_{C^{0,r}} \le C\, \|\omega_m(\tau)\|_{C^{0,r}}, \tag{8.5} \]

because the operator $\omega_m \mapsto D_m$ is a translation-invariant Calderón–Zygmund operator and $r \in (0, n)$ (see Proposition 2.2). Using (8.4), (8.5) we arrive at

\[ \int_0^{T_*^m} \|\omega_m(\tau)\|_{C^{0,r}}\, d\tau = \infty. \tag{8.6} \]

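Now it is time to use the estimates obtained in § 5. We have for any $j \ge -1$,

\[ \partial_t \Delta_j \omega_m = -(S_{j-2} u_m \cdot \partial_x)\, \Delta_j \omega_m - R_j(u_m, \omega_m) - \Delta_j (D_m \omega_m + \omega_m D_m). \tag{8.7} \]

We use the following notation below: $\lambda_j(x) = (1 + 2^{2j} |x|^2)^{1/2}$. Hence,

\[
\begin{aligned}
\partial_t \{\Delta_j \omega_m(x, t)\, \lambda_j(x)^r\} ={}& -(S_{j-2} u_m \cdot \partial_x) \{\Delta_j \omega_m\, \lambda_j^r\} - R_j(u_m, \omega_m)\, \lambda_j^r - \Delta_j (D_m \omega_m + \omega_m D_m)\, \lambda_j^r \\
& + r\, \Delta_j \omega_m(x, t)\, \lambda_j(x)^r\, 2^{2j}\, (S_{j-2} u_m(x) \cdot x)\, \lambda_j(x)^{-2}.
\end{aligned}
\tag{8.8}
\]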


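Solving Eq. (8.8) along the characteristics and using Theorems 5.1, 5.3 we obtain:

\[
\begin{aligned}
\|\Delta_j \omega_m(t)\, \lambda_j^r\|_{L^\infty} \le{}& \|\Delta_j \omega_m(0)\, \lambda_j^r\|_{L^\infty} \\
& + C \int_0^t \delta(j)^{1 - \lambda\tau} \big( (\|\omega_m(\tau)\|_{B_1(\tau)} + \|\omega_m(\tau)\|_{L^p})\, \|\omega_m(\tau)\|_{C^{0,r}} + \|D_m(\tau)\|_{L^\infty}\, \|\omega_m(\tau)\|_{B_1(\tau)} \big)\, d\tau \\
& + C \int_0^t \delta(j)^{1 - \lambda\tau}\, \gamma(j)\, \|\omega_m(\tau)\|_{B_1(\tau)}\, \|\omega_m(\tau)\|_{B_2}\, d\tau \\
& + r \int_0^t \|\Delta_j \omega_m(\tau)\, \lambda_j^r\|_{L^\infty}\, \|D_m(\tau)\|_{L^\infty}\, d\tau.
\end{aligned}
\tag{8.9}
\]

To treat the last term we have used the following identity ($u_m(x)$ is an “even” vector field, $\iota^* u_m = u_m$, $\iota : \mathbb{R}^n \to \mathbb{R}^n$ being the central symmetry; hence $u_m(0) = 0$):

\[ (S_{j-2} u_m(x) \cdot x) = \sum_{k,\ell=1}^{n} \int_0^1 S_{j-2}\, \partial_\ell u_{mk}(\rho x)\, x_k x_\ell\, d\rho. \tag{8.10} \]

It follows from (8.10), by symmetrizing the right side in $k, \ell$, that

\[ |(S_{j-2} u_m(x, \tau) \cdot x)| \le C\, \|D_m(\tau)\|_{L^\infty}\, |x|^2. \]

Because of (8.5) we may replace $\|D_m(\tau)\|_{L^\infty}$ everywhere in the right side of (8.9) by $\|\omega_m(\tau)\|_{C^{0,r}}$. Let for any $t < T_*^m$,

\[
F(t) = \sup_{0 \le \tau \le t} \|\omega_m(\tau)\|_{B_1(\tau)}, \quad
G(t) = \sup_{0 \le \tau \le t} \|\omega_m(\tau)\|_{B_2}, \quad
H(t) = \sup_{0 \le \tau \le t} \|\omega_m(\tau)\|_{C^{0,r}}, \quad
I(t) = \sup_{0 \le \tau \le t} \|\omega_m(\tau)\|_{L^p}.
\]

Obviously these 4 functions are nonnegative and nondecreasing. Since $\delta(j) < 1$ according to (4.1), it follows that $B_1(\tau) \hookrightarrow C^{0,r}$ for all $\tau \in [0, T_*^m)$ with norm $\le 1$. Hence $H(t) \le F(t)$, $t \in [0, T_*^m)$. We have from (8.9),

\[
\begin{aligned}
\|\Delta_j \omega_m(t)\, \lambda_j^r\|_{L^\infty} \le{}& C\, \delta(j)\, \|\omega_m(0)\|_{B_1(0)} + C\, \delta(j)^{1 - \lambda t} \int_0^t \big( F(\tau)^2 + F(\tau)\, I(\tau) \big)\, d\tau \\
& + C\, \frac{\delta(j)^{1 - \lambda t}}{(-\log \delta(j))\, \lambda}\, \gamma(j)\, F(t)\, G(t) + C\, \delta(j)^{1 - \lambda t} \int_0^t F(\tau)\, H(\tau)\, d\tau.
\end{aligned}
\tag{8.11}
\]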


In the third term in the right side of (8.9) we integrated by parts and kept only one term corresponding to the upper bound $t$. The other two terms are negative. Because of our choice of $\gamma(j)$ in Theorem 4.2, (8.11) implies

\[
\|\Delta_j \omega_m(t)\, \lambda_j^r\|_{L^\infty} \le C\, \delta(j)\, \|\omega_m(0)\|_{B_1(0)} + C\, \delta(j)^{1 - \lambda t} \int_0^t \big( F(\tau)^2 + F(\tau)\, I(\tau) \big)\, d\tau + C\, \lambda^{-1}\, \delta(j)^{1 - \lambda t}\, F(t)\, G(t).
\]

Therefore, dividing by $\delta(j)^{1 - \lambda t}$,

\[ F(t) \le C\, F(0) + C \int_0^t \big( F(\tau)^2 + F(\tau)\, I(\tau) \big)\, d\tau + C\, \lambda^{-1}\, F(t)\, G(t). \tag{8.12} \]
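The factor $\lambda^{-1}$ in (8.12) comes from the time integral of the weight $\delta(j)^{1 - \lambda\tau}$. A minimal sketch, assuming the example weights $\delta(j) = (j+3)^{-1}$, $\gamma(j) = \log(j+3)$ of Remark 7.3 (the function names and the sample values of $\lambda$ and $t$ are hypothetical), evaluates this integral in closed form, compares it with a midpoint quadrature, and checks that it is dominated by the boundary term $\delta(j)^{1-\lambda t} / (\lambda(-\log \delta(j)))$ kept in (8.11).

```python
import math


def delta(j):
    """Example weight from Remark 7.3 (an assumption for this sketch)."""
    return 1.0 / (j + 3)


def gamma(j):
    return math.log(j + 3)


def weight_integral(j, lam, t):
    """Closed form of the integral of delta(j)**(1 - lam*tau) over [0, t]."""
    log_inv = -math.log(delta(j))
    return delta(j) * math.expm1(lam * t * log_inv) / (lam * log_inv)


def boundary_term(j, lam, t):
    """delta(j)**(1 - lam*t) / (lam * (-log delta(j))), the term kept in (8.11)."""
    log_inv = -math.log(delta(j))
    return delta(j) ** (1.0 - lam * t) / (lam * log_inv)


def midpoint_quadrature(j, lam, t, steps=20000):
    """Independent numerical value of the same integral."""
    h = t / steps
    return h * sum(delta(j) ** (1.0 - lam * (k + 0.5) * h) for k in range(steps))
```

With $\gamma(j) = -\log \delta(j)$ the product $\gamma(j)$ times the boundary term equals $\lambda^{-1} \delta(j)^{1-\lambda t}$, which is why $\lambda$ can later be taken large to absorb the term $C \lambda^{-1} F(t) G(t)$.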

In addition to this, using (6.1) with $q = j + 2 + p$ and Bernstein's inequality,

\[ G(t) \le C\, G(0)\, \exp\Big\{ C \int_0^t F(\tau)\, d\tau \Big\}, \quad t \in [0, T_*^m). \tag{8.13} \]

Using the same argument with $q = p$ yields

\[ I(t) \le C\, I(0)\, \exp\Big\{ C \int_0^t F(\tau)\, d\tau \Big\}, \quad t \in [0, T_*^m). \tag{8.14} \]

Taking $\lambda$ sufficiently large, moving the last term in the right side of (8.12) to the left, using (8.13), (8.14) and a Gronwall type of argument implies that $F(t)$ is uniformly bounded above on an interval which is independent of $m$. Comparing this with (8.6) proves the statement.

We now recall the definition of a particular Besov space that is crucial for our proof,

\[ B^0_{\infty,1} = \Big\{ f \in \mathcal{S}'(\mathbb{R}^n) \ \Big|\ \|f\|_{B^0_{\infty,1}} = \sum_{j=-1}^{\infty} \|\Delta_j f\|_{L^\infty} < \infty \Big\}. \]

Lemma 8.3. The sequence $\{u_m\}$ is Cauchy in $L^\infty([0, T]; B^0_{\infty,1})$, where $T > 0$ is less than the constant $C$ from Lemma 8.2.

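Proof. We have for $m > \ell \ge 1$,

\[ u_m = K * \omega_m, \quad u_\ell = K * \omega_\ell, \]

and we set

\[ w = u_m - u_\ell = K * (\omega_m - \omega_\ell) = K * \omega. \]

Then,

\[ \partial_t w(x, t) = -(u_m, \nabla)\, w - (w, \nabla)\, u_\ell - \nabla p, \quad t \in [0, T]; \tag{8.15} \]

\[ \mathrm{div}\, w = 0; \tag{8.16} \]

\[ w(x, 0) = ((S_m - S_\ell)\, u_0)(x). \tag{8.17} \]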


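Here $p$ is the difference of two pressures that appear in the equations for $u_m$ and $u_\ell$. It follows from (8.15) that

\[ \partial_t \Delta_j w = -(S_{j-2} u_m, \nabla)\, \Delta_j w - R_j(u_m, w) - (S_{j-2} w, \nabla)\, \Delta_j u_\ell - R_j(w, u_\ell) - \Delta_j \nabla p. \tag{8.18} \]

We use the following estimate from [V] which was obtained in that paper following the lines of [BC].

Theorem 8.4. There is an absolute constant $M_0$ so that the following inequality is valid:

\[
\begin{aligned}
\|R_j(u, w)\|_{L^\infty} \le{}& C \sum_{|j - j'| \le M_0} \big\{ \|S_{j'-2} \nabla w\|_{L^\infty}\, \|\Delta_{j'} u\|_{L^\infty} + \|S_{j'-2} \nabla u\|_{L^\infty}\, \|\Delta_{j'} w\|_{L^\infty} \big\} \\
& + C\, 2^{j} \sum_{j' \ge j - M_0}\ \sum_{|j' - j''| \le 1} 2^{-j'}\, \|\Delta_{j'} \nabla u\|_{L^\infty}\, \|\Delta_{j''} w\|_{L^\infty}.
\end{aligned}
\tag{8.19}
\]

When $j' = -1$ the term $\|\Delta_{-1} \nabla u\|_{L^\infty}$ ought to be replaced by $\|\Delta_{-1} u\|_{L^\infty}$.

We now estimate all the terms in the right side of (8.18) except the first term. Using (8.19),

\[
\begin{aligned}
\|R_j(u_m, w)\|_{L^\infty} \le{}& C \sum_{|j - j'| \le M_0} \big\{ \|S_{j'-2} \nabla w\|_{L^\infty}\, \|\Delta_{j'} u_m\|_{L^\infty} + \|S_{j'-2} \nabla u_m\|_{L^\infty}\, \|\Delta_{j'} w\|_{L^\infty} \big\} \\
& + C\, 2^{j} \sum_{j' \ge j - M_0}\ \sum_{|j' - j''| \le 1} 2^{-j'}\, \|\Delta_{j'} \nabla u_m\|_{L^\infty}\, \|\Delta_{j''} w\|_{L^\infty},
\end{aligned}
\tag{8.20}
\]

where the term with $j' = -1$ has the same meaning as in (8.19). Likewise,

\[
\begin{aligned}
\|R_j(w, u_\ell)\|_{L^\infty} \le{}& C \sum_{|j - j'| \le M_0} \big\{ \|S_{j'-2} \nabla w\|_{L^\infty}\, \|\Delta_{j'} u_\ell\|_{L^\infty} + \|S_{j'-2} \nabla u_\ell\|_{L^\infty}\, \|\Delta_{j'} w\|_{L^\infty} \big\} \\
& + C\, 2^{j} \sum_{j' \ge j - M_0}\ \sum_{|j' - j''| \le 1} 2^{-j'}\, \|\Delta_{j'} \nabla w\|_{L^\infty}\, \|\Delta_{j''} u_\ell\|_{L^\infty}.
\end{aligned}
\tag{8.21}
\]

We have to estimate $\Delta_j \nabla p$. We consider two cases. For $j \ge 0$, using (8.18), (8.16) we obtain

\[
\begin{aligned}
\|\Delta_j \nabla p\|_{L^\infty} \le{}& C\, \|R_j(u_m, w)\|_{L^\infty} + C\, \|R_j(w, u_\ell)\|_{L^\infty} \\
& + C\, 2^{-j}\, \|S_{j-2} \nabla u_m\|_{L^\infty}\, \|\Delta_j \nabla w\|_{L^\infty} + C\, 2^{-j}\, \|S_{j-2} \nabla w\|_{L^\infty}\, \|\Delta_j \nabla u_\ell\|_{L^\infty} \\
\le{}& C\, \|R_j(u_m, w)\|_{L^\infty} + C\, \|R_j(w, u_\ell)\|_{L^\infty} \\
& + C\, \|S_{j-2} \nabla u_m\|_{L^\infty}\, \|\Delta_j w\|_{L^\infty} + C\, \|S_{j-2} \nabla w\|_{L^\infty}\, \|\Delta_j u_\ell\|_{L^\infty}.
\end{aligned}
\tag{8.22}
\]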

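For the case $j = -1$, taking $p_1 = \frac{np}{n-p}$, and using the Hardy–Littlewood–Sobolev inequality and Bernstein's inequality,

\[
\begin{aligned}
\|\Delta_{-1} \nabla p\|_{L^\infty} &\le C\, \|\Delta_{-1} \nabla p\|_{L^{p_1}} \le C\, \Big\| -\sum_{k=1}^{n} \partial_k \Delta_{-1} \{ (u_m)_k\, w + w_k\, u_\ell \} \Big\|_{L^{p_1}} \\
&\le C \sum_{|j' - j''| \le M_0} \big( \|\Delta_{j'} u_m\|_{L^{p_1}} + \|\Delta_{j'} u_\ell\|_{L^{p_1}} \big)\, \|\Delta_{j''} w\|_{L^\infty} \\
&\le C \sum_{|j' - j''| \le M_0} \big( \|\Delta_{j'} \omega_m\|_{L^p} + \|\Delta_{j'} \omega_\ell\|_{L^p} \big)\, \|\Delta_{j''} w\|_{L^\infty}.
\end{aligned}
\tag{8.23}
\]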

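We fix $N \ge -1$ and estimate the sum

\[
\sum_{j=-1}^{N} \|R_j(u_m, w)\|_{L^\infty} + \sum_{j=-1}^{N} \|R_j(w, u_\ell)\|_{L^\infty} + \sum_{j=-1}^{N} \|\Delta_j \nabla p\|_{L^\infty} + \sum_{j=-1}^{N} \|(S_{j-2} w, \nabla)\, \Delta_j u_\ell\|_{L^\infty} = Q_1 + Q_2 + Q_3 + Q_4.
\tag{8.24}
\]

The contribution to $Q_1$ of the first sum in the inequality (8.20), according to Lemma 8.2, is

\[
\begin{aligned}
&\le C \sum_{j=-1}^{N}\ \sum_{|j - j'| \le M_0} \big\{ \|S_{j'-2} w\|_{L^\infty}\, \|\Delta_{j'} \nabla u_m\|_{L^\infty} + \|S_{j'-2} \nabla u_m\|_{L^\infty}\, \|\Delta_{j'} w\|_{L^\infty} \big\} \\
&\le C \sup_{-1 \le j \le N + M_0} \|S_{j-2} w\|_{L^\infty} \sum_{j=-1}^{N + M_0} \|\Delta_j \nabla u_m\|_{L^\infty} + C \sup_{-1 \le j \le N + M_0} \|S_{j-2} \nabla u_m\|_{L^\infty} \sum_{j=-1}^{N + M_0} \|\Delta_j w\|_{L^\infty} \\
&\le C\, (N + 2)\, \|\nabla u_m\|_{C^{0,r}} \sum_{j=-1}^{N + M_0} \|\Delta_j w\|_{L^\infty} \\
&\le C\, (N + 2)\, \|\omega_m\|_{C^{0,r}} \sum_{j=-1}^{N + M_0} \|\Delta_j w\|_{L^\infty} \\
&\le C\, (N + 2) \sum_{j=-1}^{N + M_0} \|\Delta_j w\|_{L^\infty}.
\end{aligned}
\tag{8.25}
\]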


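We now examine the contribution of the second sum in the right side of (8.20),

\[
\begin{aligned}
\sum_{j=-1}^{N} 2^{j} \sum_{j' \ge j - M_0}\ \sum_{|j' - j''| \le 1} 2^{-j'}\, \|\Delta_{j'} \nabla u_m\|_{L^\infty}\, \|\Delta_{j''} w\|_{L^\infty}
&\le C \sum_{j''=-1}^{\infty} \|\Delta_{j''} w\|_{L^\infty} \Big( \sum_{j=-1}^{\min(N,\, j'' + M_0 + 1)} 2^{j - j''} \Big) \sum_{|j' - j''| \le 1} \|\Delta_{j'} \nabla u_m\|_{L^\infty} \\
&\le C \sum_{j''=-1}^{\infty} \|\Delta_{j''} w\|_{L^\infty}\, 2^{\min(N,\, j'') - j''} \sum_{|j' - j''| \le 1} \|\Delta_{j'} \nabla u_m\|_{L^\infty} \\
&\le C\, \|\omega_m\|_{C^{0,r}} \sum_{j=-1}^{\infty} \|\Delta_j w\|_{L^\infty}.
\end{aligned}
\tag{8.26}
\]

It follows from (8.25), (8.26) that

\[ Q_1 \le C (N + 2) \sum_{j=-1}^{\infty} \|\Delta_j w\|_{L^\infty}. \tag{8.27} \]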

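We now turn to $Q_2$. From (8.21),

\[
\begin{aligned}
Q_2 \le{}& C \sum_{j=-1}^{N}\ \sum_{|j - j'| \le M_0} \big\{ \|S_{j'-2} w\|_{L^\infty}\, \|\Delta_{j'} \nabla u_\ell\|_{L^\infty} + \|S_{j'-2} \nabla u_\ell\|_{L^\infty}\, \|\Delta_{j'} w\|_{L^\infty} \big\} \\
& + C \sum_{j=-1}^{N} 2^{j} \sum_{j' \ge j - M_0}\ \sum_{|j' - j''| \le 1} 2^{-j'}\, \|\Delta_{j'} \nabla w\|_{L^\infty}\, \|\Delta_{j''} u_\ell\|_{L^\infty}.
\end{aligned}
\tag{8.28}
\]

The first sum in the right side of (8.28) is

\[
\begin{aligned}
&\le C \sup_{-1 \le j \le N + M_0} \|S_{j-2} w\|_{L^\infty} \sum_{j=-1}^{N + M_0} \|\Delta_j \nabla u_\ell\|_{L^\infty} + C \sup_{-1 \le j \le N + M_0} \|S_{j-2} \nabla u_\ell\|_{L^\infty} \sum_{j=-1}^{N + M_0} \|\Delta_j w\|_{L^\infty} \\
&\le C (N + 2) \sum_{j=-1}^{N + M_0} \|\Delta_j w\|_{L^\infty}.
\end{aligned}
\tag{8.29}
\]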


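To estimate the second term in the right side of (8.28) we notice that

\[
\begin{aligned}
\sum_{j=-1}^{N} 2^{j} \sum_{j' \ge j - M_0}\ \sum_{|j' - j''| \le 1} 2^{-j'}\, \|\Delta_{j'} \nabla w\|_{L^\infty}\, \|\Delta_{j''} u_\ell\|_{L^\infty}
&\le C \sum_{j=-1}^{N} 2^{j} \sum_{j' \ge j - M_0}\ \sum_{|j' - j''| \le 1} \|\Delta_{j'} w\|_{L^\infty}\, \|\Delta_{j''} u_\ell\|_{L^\infty} \\
&\le C \sum_{j'=-1}^{\infty} \|\Delta_{j'} w\|_{L^\infty} \Big( \sum_{j=-1}^{\min(N,\, j' + M_0)} 2^{j} \Big) \sum_{|j' - j''| \le 1} \|\Delta_{j''} u_\ell\|_{L^\infty} \\
&\le C \sum_{j'=-1}^{\infty} \|\Delta_{j'} w\|_{L^\infty}\, 2^{\min(N,\, j')} \sum_{|j' - j''| \le 1} \|\Delta_{j''} u_\ell\|_{L^\infty} \\
&\le C \sum_{j=-1}^{\infty} \|\Delta_j w\|_{L^\infty}.
\end{aligned}
\tag{8.30}
\]

We combine (8.28), (8.29), (8.30) to conclude

\[ Q_2 \le C (N + 2) \sum_{j=-1}^{\infty} \|\Delta_j w\|_{L^\infty}. \tag{8.31} \]

We now examine $Q_3$ in the right side of (8.24). The first two terms in the right side of (8.22) are already present in $Q_1$, $Q_2$. Hence, using (8.23) and Lemma 8.2,

\[
Q_3 \le C (N + 2) \sum_{j=-1}^{\infty} \|\Delta_j w\|_{L^\infty} + C\, \|\Delta_{-1} \nabla p\|_{L^\infty} \le C (N + 2) \sum_{j=-1}^{\infty} \|\Delta_j w\|_{L^\infty}.
\tag{8.32}
\]

We now have to take care of $Q_4$,

\[
Q_4 \le \sum_{j=-1}^{N} \|S_{j-2} w\|_{L^\infty}\, \|\Delta_j \nabla u_\ell\|_{L^\infty} \le C (N + 2) \sum_{j=-1}^{\infty} \|\Delta_j w\|_{L^\infty}.
\tag{8.33}
\]

We now combine the inequalities (8.27), (8.31), (8.32), (8.33) to conclude

\[ Q_1 + Q_2 + Q_3 + Q_4 \le C (N + 2) \sum_{j=-1}^{\infty} \|\Delta_j w\|_{L^\infty}, \quad N \ge -1. \tag{8.34} \]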


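Solving Eq. (8.18) along the characteristics we obtain

\[
\begin{aligned}
\|\Delta_j w(t)\|_{L^\infty} \le \|\Delta_j w(0)\|_{L^\infty} + \int_0^t \big\{ & \|R_j(u_m(\tau), w(\tau))\|_{L^\infty} + \|R_j(w(\tau), u_\ell(\tau))\|_{L^\infty} \\
& + \|\Delta_j \nabla p(\tau)\|_{L^\infty} + \|(S_{j-2} w(\tau), \nabla)\, \Delta_j u_\ell(\tau)\|_{L^\infty} \big\}\, d\tau.
\end{aligned}
\tag{8.35}
\]

For arbitrary $N \ge -1$, $t \in [0, T]$,

\[
\begin{aligned}
\sum_{j=-1}^{\infty} \|\Delta_j w(t)\|_{L^\infty} &= \sum_{j=-1}^{N} \|\Delta_j w(t)\|_{L^\infty} + \sum_{j=N+1}^{\infty} \|\Delta_j w(t)\|_{L^\infty} \\
&\le \sum_{j=-1}^{N} \|\Delta_j w(t)\|_{L^\infty} + C \sum_{j=N+1}^{\infty} 2^{-j}\, \|\Delta_j \nabla w(t)\|_{L^\infty}.
\end{aligned}
\tag{8.36}
\]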

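We estimate the second term in the right side of (8.36). Let

\[ d_m = \sum_{k=-1}^{m} \|\Delta_k \nabla w(t)\|_{L^\infty}. \]

Then, using Lemma 8.2,

\[
\begin{aligned}
\sum_{j=N+1}^{\infty} 2^{-j}\, \|\Delta_j \nabla w(t)\|_{L^\infty} &= \sum_{j=N+1}^{\infty} 2^{-j} (d_j - d_{j-1}) \le \sum_{j=N+1}^{\infty} 2^{-j} (d_j + d_{j+1}) \\
&\le C \sum_{j=N+1}^{\infty} 2^{-j} (j + 1) \le C (N + 2)\, 2^{-N}.
\end{aligned}
\tag{8.37}
\]

From (8.34), (8.35), (8.36), (8.37),

\[
\begin{aligned}
\sum_{j=-1}^{\infty} \|\Delta_j w(t)\|_{L^\infty} &\le \sum_{j=-1}^{N} \|\Delta_j w(0)\|_{L^\infty} + C (N + 2) \int_0^t \sum_{j=-1}^{\infty} \|\Delta_j w(\tau)\|_{L^\infty}\, d\tau + C (N + 2)\, 2^{-N} \\
&\le \sum_{j=-1}^{\infty} \|\Delta_j w(0)\|_{L^\infty} + C (N + 2) \int_0^t \sum_{j=-1}^{\infty} \|\Delta_j w(\tau)\|_{L^\infty}\, d\tau + C (N + 2)\, 2^{-N}, \quad N \ge -1.
\end{aligned}
\tag{8.38}
\]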
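The last step of (8.37) rests on the elementary tail sum $\sum_{j \ge N+1} (j+1)\, 2^{-j} = (N+3)\, 2^{-N}$, which is at most $2 (N+2)\, 2^{-N}$ for $N \ge -1$. The following sketch (hypothetical function names, exact rational arithmetic) compares a long partial sum with this closed form; it only illustrates the arithmetic and does not reproduce the constants of the paper.

```python
from fractions import Fraction


def tail_partial_sum(N, terms=200):
    """Exact partial sum of (j + 1) * 2**(-j) for j = N+1, ..., N+terms (N >= -1)."""
    return sum(Fraction(j + 1, 2 ** j) for j in range(N + 1, N + 1 + terms))


def tail_closed_form(N):
    """Exact value (N + 3) * 2**(-N) of the full series."""
    return Fraction(N + 3) * Fraction(2) ** (-N)


def tail_upper_bound(N):
    """The bound 2 * (N + 2) * 2**(-N) used, up to a constant, in (8.37)."""
    return 2 * Fraction(N + 2) * Fraction(2) ** (-N)
```

The partial sums increase to the closed form, and the closed form stays below the bound for every $N \ge -1$, with equality at $N = -1$.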


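We introduce the following function on $[0, T]$:

\[ \zeta(t) = \int_0^t \sum_{j=-1}^{\infty} \|\Delta_j w(\tau)\|_{L^\infty}\, d\tau. \]

We now choose for a fixed $t \in [0, T]$, $N = [-\log_2 \zeta(t)]$, where the square bracket means the integer part of a real number. Then we have from (8.38),

\[ \dot\zeta(t) \le \kappa_{m,\ell} + C\, \zeta(t)\, (-\log_2 \zeta(t) + 2), \quad \zeta(0) = 0, \tag{8.39} \]

\[ \kappa_{m,\ell} = \sum_{j=-1}^{\infty} \|\Delta_j w(0)\|_{L^\infty}. \]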
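To see why the choice $N = [-\log_2 \zeta(t)]$ turns (8.38) into (8.39), note that $N + 2 \le -\log_2 \zeta + 2$ and $2^{-N} \le 2 \zeta$, so both $(N+2)\,\zeta$ and $(N+2)\, 2^{-N}$ are controlled by $\zeta\, (-\log_2 \zeta + 2)$. A small sketch of this bookkeeping, with hypothetical names and with $\zeta \in (0, 1]$ assumed so that the integer part is the floor:

```python
import math


def choose_N(zeta):
    """N = [-log2(zeta)], the integer part chosen before (8.39)."""
    return math.floor(-math.log2(zeta))


def right_side_factors(zeta):
    """Return ((N + 2) * zeta, (N + 2) * 2**(-N)) for the chosen N."""
    N = choose_N(zeta)
    return (N + 2) * zeta, (N + 2) * 2.0 ** (-N)


def osgood_majorant(zeta):
    """zeta * (-log2(zeta) + 2), the nonlinearity on the right side of (8.39)."""
    return zeta * (-math.log2(zeta) + 2.0)
```

The first factor is bounded by the majorant and the second by twice the majorant, which accounts for the form of the right side of (8.39) up to the value of $C$.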

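In addition to this, $\zeta(t)$ is nondecreasing and absolutely continuous since

\[
\Big\| \sum_{j=-1}^{\infty} \|\Delta_j w(\cdot)\|_{L^\infty} \Big\|_{L^\infty([0,T])} \le C\, \big\| \|\omega(\cdot)\|_{C^{0,r}} \big\|_{L^\infty([0,T])} + C\, \|\omega\|_{L^\infty([0,T]; L^p)}.
\]

Now we observe that

\[
\begin{aligned}
\kappa_{m,\ell} &= \sum_{j=-1}^{\infty} \|\Delta_j w(0)\|_{L^\infty} = \sum_{j=-1}^{\infty} \|\Delta_j K * (S_m - S_\ell)\, \omega_0\|_{L^\infty} = \sum_{j=-1}^{\infty} \Big\| \Delta_j K * \sum_{q=\ell+1}^{m} \Delta_q \omega_0 \Big\|_{L^\infty} \\
&\le \sum_{q=\ell+1}^{m}\ \sum_{|j - q| \le 1} \|K * \Delta_j \Delta_q \omega_0\|_{L^\infty} \le C \sum_{q=\ell+1}^{\infty} \|K * \Delta_q \omega_0\|_{L^\infty} \\
&\le C \sum_{q=\ell+1}^{\infty} 2^{-q}\, \|\Delta_q \omega_0\|_{L^\infty} \le C \sum_{q=\ell+1}^{\infty} 2^{-q} \le C\, 2^{-\ell}.
\end{aligned}
\tag{8.40}
\]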
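Inequality (8.39) is of Osgood type, and (8.40) makes its forcing term small. As a minimal sketch of why small data stay small on a fixed time interval, consider the model problem $\dot\eta = C \eta\, (-\log_2 \eta + 2)$ with $\eta(0) = \eta_0$. Substituting $y = 2 - \log_2 \eta$ gives $\dot y = -C y / \ln 2$, hence $\eta(t) = 2^{\,2 - y_0 e^{-C t / \ln 2}}$ with $y_0 = 2 - \log_2 \eta_0$. The code below assumes $C = 1$ and $\eta_0 = 2^{-\ell}$ purely for illustration, uses hypothetical function names, and cross-checks the closed form against a Runge-Kutta integration.

```python
import math


def eta_closed_form(t, eta0, C=1.0):
    """Solution of eta' = C * eta * (2 - log2(eta)), eta(0) = eta0, for 0 < eta0 < 4."""
    y0 = 2.0 - math.log2(eta0)
    return 2.0 ** (2.0 - y0 * math.exp(-C * t / math.log(2.0)))


def eta_rk4(t_end, eta0, C=1.0, steps=2000):
    """Classical fourth-order Runge-Kutta integration of the same equation."""
    def rhs(e):
        return C * e * (2.0 - math.log2(e))

    h = t_end / steps
    e = eta0
    for _ in range(steps):
        k1 = rhs(e)
        k2 = rhs(e + 0.5 * h * k1)
        k3 = rhs(e + 0.5 * h * k2)
        k4 = rhs(e + h * k3)
        e += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return e
```

For fixed $t$ the exponent $2 - y_0 e^{-C t / \ln 2}$ tends to $-\infty$ as $\eta_0 \to 0$, so $\eta(t) \to 0$ uniformly on any bounded time interval, although the decay in $\eta_0$ is only of Hölder type with an exponent that deteriorates in time.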


Integrating both sides of (8.39) yields

\[ \zeta(t) \le C\, 2^{-\ell} + C \int_0^t \zeta(\tau)\, (-\log_2 \zeta(\tau) + 2)\, d\tau. \]

Since $\zeta \mapsto \zeta (-\log \zeta + 2)$ is monotonic for small positive $\zeta$, a standard argument implies

\[ \zeta(t) \le \eta(t, C\, 2^{-\ell}), \quad t \in [0, T], \]

where $\eta$ is a solution to the Cauchy problem

\[ \dot\eta = C \eta\, (-\log_2 \eta + 2), \quad \eta(0) = C\, 2^{-\ell}, \]

and $\ell$ is sufficiently large. This concludes the proof since $\eta(t)$ converges uniformly to 0 on $[0, T]$ as $\ell \to \infty$.

We now continue the proof of Theorem 4.2. According to Lemma 8.3 there exists a strong limit

\[ u = \lim_{m \to \infty} u_m \quad \text{in} \quad L^\infty([0, T]; B^0_{\infty,1}) \cap C([0, T]; B^0_{\infty,1}). \tag{8.41} \]

Indeed, the “classical” solutions we used in our approximation scheme are continuous with values in $B^0_{\infty,1}$ (see the proof of Lemma 8.3). Using Lemma 8.2, by possibly choosing a subsequence, we may assume

\[ \omega_m \xrightarrow{\ \text{weak}\, *\ } \omega \quad \text{in} \quad L^\infty([0, T]; L^p). \tag{8.42} \]

We claim now that for a.e. $t \in [0, T]$,

\[ u = K * \omega. \tag{8.43} \]

Indeed, since $u_m = K * \omega_m$, it follows from Lemma 8.2 and the Hardy–Littlewood–Sobolev inequality that (possibly after choosing a subsequence)

\[ u_m \xrightarrow{\ \text{weak}\, *\ } u \quad \text{in} \quad L^\infty([0, T]; L^{\frac{np}{n-p}}). \tag{8.44} \]

Therefore, $(u - K * \omega)(x, t)$ is a harmonic vector field in $x$ that belongs to $L^\infty([0, T]; L^{\frac{np}{n-p}})$. This proves (8.43).

Lemma 8.5. The vorticity $\omega(t)$, $t \in [0, T]$, satisfies (4.7).

Proof. Let $\delta(j, t) = \delta(j)^{1 - \lambda t}$; $\gamma(j) = -\log \delta(j)$. Fix arbitrary $j \ge -1$. We have, using Lemma 8.2,

\[ \|\Delta_j \omega_m(x, t) \cdot \lambda_j(x)^r\, \delta(j, t)^{-1}\|_{L^\infty(\mathbb{R}^n \times [0, T])} \le C < \infty, \tag{8.45} \]

\[ \|S_j \omega_m(x, t)\, \gamma(j)^{-1}\|_{L^\infty(\mathbb{R}^n \times [0, T])} \le C < \infty. \tag{8.46} \]

The constant $C$ in the right side of (8.45), (8.46) does not depend on $j \ge -1$ and $m \ge 1$. Fix an arbitrary ball $B \subset \mathbb{R}^n$. Then

\[ \|\Delta_j (\omega_m(x, t) - \omega(x, t))\, \lambda_j(x)^r\, \delta(j, t)^{-1}\|_{L^\infty(B \times [0, T])} \to 0 \quad \text{as} \quad m \to \infty. \tag{8.47} \]


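Indeed this follows immediately from (8.41) since $B^0_{\infty,1} \hookrightarrow L^\infty$. From (8.45), (8.47),

\[ \|\Delta_j \omega(x, t)\, \lambda_j(x)^r\, \delta(j, t)^{-1}\|_{L^\infty(B \times [0, T])} \le C < \infty \]

with the same constant $C$ as in (8.45). When the radius of $B$ tends to infinity we recover the estimate

\[ \|\Delta_j \omega(x, t)\, \lambda_j(x)^r\, \delta(j, t)^{-1}\|_{L^\infty(\mathbb{R}^n \times [0, T])} \le C < \infty. \tag{8.48} \]

Likewise,

\[
\begin{aligned}
\|S_j (\omega_m(x, t) - \omega(x, t))\, \gamma(j)^{-1}\|_{L^\infty(\mathbb{R}^n \times [0, T])} &\le C\, \gamma(j)^{-1}\, \|\omega_m(x, t) - \omega(x, t)\|_{L^\infty(\mathbb{R}^n \times [0, T])} \\
&\le C\, \gamma(j)^{-1}\, \|\omega_m(\cdot) - \omega(\cdot)\|_{L^\infty([0, T]; B^0_{\infty,1})} \xrightarrow[m \to \infty]{} 0,
\end{aligned}
\]

again by using (8.41). Therefore,

\[ \|S_j \omega(x, t)\, \gamma(j)^{-1}\|_{L^\infty(\mathbb{R}^n \times [0, T])} \le C < \infty \tag{8.49} \]

with the same constant $C$ as in (8.46). The statement of the lemma follows from (8.48), (8.49).

We have verified (4.7); (4.8) follows from (8.42). We claim that the pair $(u, \omega)$ that we have constructed satisfies the Euler equation. Let $\rho \in \mathcal{S}$, $\mathrm{div}\, \rho = 0$, be a test function. Also let $\theta \in \mathcal{D}([0, T))$. By construction,

\[
\int_0^T \langle u_m(\tau), \rho \rangle\, \dot\theta(\tau)\, d\tau + \langle u_m(0), \rho \rangle\, \theta(0) + \int_0^T \langle u_m(\tau), (u_m(\tau), \nabla)\, \rho \rangle\, \theta(\tau)\, d\tau = 0.
\]

We have $\langle u_m(0), \rho \rangle \to \langle u_0, \rho \rangle$ as $m \to \infty$ because of (8.1), (8.41). Also from (8.41),

\[
\int_0^T \langle u_m(\tau), \rho \rangle\, \dot\theta(\tau)\, d\tau \xrightarrow[m \to \infty]{} \int_0^T \langle u(\tau), \rho \rangle\, \dot\theta(\tau)\, d\tau;
\]

\[
\int_0^T \langle u_m(\tau), (u_m(\tau), \nabla)\, \rho \rangle\, \theta(\tau)\, d\tau \xrightarrow[m \to \infty]{} \int_0^T \langle u(\tau), (u(\tau), \nabla)\, \rho \rangle\, \theta(\tau)\, d\tau.
\]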

Therefore, the limit $u$ satisfies the Euler equations and the initial condition $u(0) = u_0$. The vector field $u(t)$ is symmetric with respect to $G \subset O(n)$ because of the uniqueness theorem [V] as above. This concludes the proof of Theorem 4.2.

Appendix A

Here we prove the existence theorem for the Euler equations with vorticity in $C^{s,r} \cap L^p$, $s \in (0, \infty)$, $r \in (0, n)$, $p \in (1, n)$. For simplicity we consider only “even” vector fields in $\mathbb{R}^n$, i.e. satisfying $\iota^* u = u$, $\iota : \mathbb{R}^n \to \mathbb{R}^n$ being the central symmetry.

Theorem A.1. Let $u_0$ be an even solenoidal vector field in $\mathbb{R}^n$ such that

\[ \omega_0 = \mathrm{curl}\, u_0 \in C^{s,r}(\mathbb{R}^n) \cap L^p, \tag{A.1} \]

$s \in (0, \infty)$, $r \in (0, n)$, $p \in (1, n)$, $u_0 = K * \omega_0$. Then there exists a maximal time $T_* > 0$ ($T_*$ may be equal to $\infty$) and a unique $\omega(\cdot)$,

\[ \|\omega(\cdot)\|_{C^{s,r}} \in L^\infty_{loc}([0, T_*)), \tag{A.2} \]

such that:

• $u(t) = K * \omega(t)$ satisfies the Euler equation and is even a.e. on $[0, T]$;
• $u(0) = u_0$;
• $\omega(\cdot) \in L^\infty_{loc}([0, T_*), L^p)$;
• $T_* \ge C > 0$ where $C$ depends only on $n$, $\|\omega_0\|_{C^{s,r}}$.

In addition $T_* < \infty$ implies

\[ \int_0^{T_*} \|D(\tau)\|_{L^\infty}\, d\tau = \infty. \tag{A.3} \]

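Proof. From (A.1) $\omega_0 \in L^p$. In addition $\omega_0 \in C^s$, where $s \in (0, \infty)$. Therefore, according to classical results (see [C]) there exists a maximal time $T_* > 0$ and a solution to the Euler equation with

\[ \|\omega(\cdot)\|_{C^s} \in L^\infty_{loc}([0, T_*)), \quad \omega(\cdot) \in L^\infty_{loc}([0, T_*); L^p). \tag{A.4} \]

We will show that in fact

\[ \|\omega(\cdot)\|_{C^{s,r}} \in L^\infty_{loc}([0, T_*)). \]

We keep the same notation as in Sect. 5. We need two lemmas.

Lemma A.2. Let as above

\[ R_j(u, \omega) = -(S_{j-2} u \cdot \partial_x)\, \Delta_j \omega + \Delta_j (u \cdot \partial_x)\, \omega, \quad j \ge -1. \]

Then,

\[ |R_j(u, \omega)(x)| \le C\, 2^{-js}\, (\|\omega\|_{C^s} + \|\omega\|_{L^p})\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r}, \]

where $p \in (1, n)$.

Proof. We estimate the term $R_j^{(1)}(u, \omega)$. Using (5.15) and the estimates (it is sufficient here to have $j \ge 1$)

\[ |\Delta_j u_k(x)| \le C\, 2^{-j(1+s)}\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r}; \quad \|S_{j-2}\, \omega\|_{L^\infty} \le C\, \|\omega\|_{C^s} \]

we get

\[ |R_j^{(1)}(u, \omega)(x)| \le C\, 2^{-js}\, \|\omega\|_{C^s}\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r}. \tag{A.5} \]

For the second term we use (5.22) and Lemma 5.2 to obtain

\[ |R_j^{(2)}(u, \omega)(x)| \le C\, \|D\|_{L^\infty}\, 2^{-js}\, (1 + 2^j |x|)^{-r}\, \|\omega\|_{C^{s,r}}. \tag{A.6} \]

For the third term (5.24) we have

\[ |\Delta_q u_k(x)| \le C\, 2^{-j(1+s)}\, \|\omega\|_{C^s}, \quad q \ge 0; \quad |\Delta_{-1} u_k(x)| \le C\, \|\omega\|_{L^p}; \]

\[ |\Delta_j \Delta_{j'} \omega(x)| \le C\, \|\omega\|_{C^{s,r}}\, 2^{-js}\, (1 + 2^j |x|)^{-r}. \]

Hence

\[ |R_j^{(3)}(u, \omega)(x)| \le C\, 2^{-2js}\, (\|\omega\|_{C^s} + \|\omega\|_{L^p})\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r}. \tag{A.7} \]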

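We now estimate $R_j^{(4)}(u, \omega) = R_j^{(4,1)}(u, \omega) + R_j^{(4,2)}(u, \omega)$. We have (see (5.30))

\[ |\Delta_\ell u_k(x)| \le C\, 2^{-\ell(1+s)}\, \|\omega\|_{C^s}, \quad \ell \ge 0; \quad |\Delta_{-1} u_k(x)| \le C\, \|\omega\|_{L^p}; \]

\[ |\Delta_{j'} \omega(x)| \le C\, 2^{-\ell s}\, \|\omega\|_{C^{s,r}}\, (1 + 2^{j'} |x|)^{-r}. \]

Hence,

\[ |R_j^{(4,1)}(u, \omega)(x)| \le C\, 2^{-2js}\, (\|\omega\|_{C^s} + \|\omega\|_{L^p})\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r}. \tag{A.8} \]

For the term $R_j^{(4,2)}(u, \omega)$, using (5.35), (5.36) and Lemma 5.2,

\[ |R_j^{(4,2)}(u, \omega)(x)| \le C\, \|D\|_{L^\infty}\, 2^{-js}\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r}. \tag{A.9} \]

In addition to this, $\|D\|_{L^\infty} \le \|D\|_{C^s} \le C (\|\omega\|_{C^s} + \|\omega\|_{L^p})$. Combining (A.5)–(A.9) yields the statement of the lemma.

Lemma A.3.

\[ |\Delta_j (\omega D + D \omega)(x)| \le C\, 2^{-js}\, (\|\omega\|_{C^s} + \|\omega\|_{L^p})\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r}, \quad j \ge -1, \quad p \in (1, n). \]

Proof.

\[
\begin{aligned}
\Delta_j (\omega D + D \omega) ={}& \Delta_j \sum_{|j - j'| \le M_0} \big( S_{j'-2}\, \omega\, \Delta_{j'} D + \Delta_{j'} \omega\, S_{j'-2} D + S_{j'-2} D\, \Delta_{j'} \omega + \Delta_{j'} D\, S_{j'-2}\, \omega \big) \\
& + \Delta_j \sum_{j' \ge j - M_0}\ \sum_{|j' - j''| \le 1} \big( \Delta_{j'} \omega\, \Delta_{j''} D + \Delta_{j'} D\, \Delta_{j''} \omega \big).
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
|\Delta_j (\omega D + D \omega)(x)| \le{}& C\, 2^{-js}\, (\|\omega\|_{C^s} + \|\omega\|_{L^p})\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r} \\
& + C \sum_{j' \ge j - M_0} 2^{-2j's}\, (\|\omega\|_{C^s} + \|\omega\|_{L^p})\, \|\omega\|_{C^{s,r}}\, (1 + 2^j |x|)^{-r}.
\end{aligned}
\]

This implies the statement.

We now continue the proof of Theorem A.1. We have

\[ \partial_t \Delta_j \omega = -(S_{j-2} u \cdot \partial_x)\, \Delta_j \omega - R_j(u, \omega) - \Delta_j (\omega D + D \omega). \tag{A.10} \]

Therefore, using the same notation as above, $\lambda_j(x) = (1 + 2^{2j} |x|^2)^{1/2}$, we obtain:

\[
\begin{aligned}
\partial_t (\Delta_j \omega\, \lambda_j^r) ={}& -(S_{j-2} u \cdot \partial_x)\, (\Delta_j \omega\, \lambda_j^r) - R_j(u, \omega)\, \lambda_j^r - \Delta_j (D \omega + \omega D)\, \lambda_j^r \\
& + r\, \Delta_j \omega(x)\, \lambda_j(x)^r\, 2^{2j}\, (S_{j-2} u(x) \cdot x)\, \lambda_j(x)^{-2}.
\end{aligned}
\tag{A.11}
\]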


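We use Taylor's formula and the fact that $u$ is even (which follows from the uniqueness theorem for a solution satisfying (A.4)):

\[ S_{j-2} u(x) \cdot x = \sum_{m,k=1}^{n} \int_0^1 \partial_m S_{j-2} u_k(\tau x)\, x_m x_k\, d\tau = \int_0^1 S_{j-2} D(\tau x)\, x \cdot x\, d\tau. \]

Hence,

\[ |S_{j-2} u(x) \cdot x| \le C\, \|D\|_{L^\infty}\, |x|^2. \tag{A.12} \]

Using Lemma A.2, Lemma A.3, (A.11), (A.12) and integrating along characteristics we obtain

\[
\begin{aligned}
\|\Delta_j \omega(t)\, \lambda_j^r\|_{L^\infty} \le{}& \|\Delta_j \omega(0)\, \lambda_j^r\|_{L^\infty} + C\, 2^{-js} \int_0^t (\|\omega(\tau)\|_{C^s} + \|\omega(\tau)\|_{L^p})\, \|\omega(\tau)\|_{C^{s,r}}\, d\tau \\
& + C\, 2^{-js} \int_0^t \|D(\tau)\|_{L^\infty}\, \|\omega(\tau)\|_{C^{s,r}}\, d\tau.
\end{aligned}
\tag{A.13}
\]

Since $\|D(\tau)\|_{L^\infty} \le C (\|\omega(\tau)\|_{C^s} + \|\omega(\tau)\|_{L^p})$, the third term in the right side of (A.13) can be incorporated into the second term. Besides, from Proposition 6.1,

\[ \|\omega(\tau)\|_{L^p} \le \exp\Big\{ 2 \int_0^t \|D(\rho)\|_{L^\infty}\, d\rho \Big\}\, \|\omega(0)\|_{L^p}. \tag{A.14} \]

Hence, from (A.13), (A.14),

\[ \|\omega(t)\|_{C^{s,r}} \le \|\omega(0)\|_{C^{s,r}} + C \int_0^t (\|\omega(\tau)\|_{C^s} + \|\omega(\tau)\|_{L^p})\, \|\omega(\tau)\|_{C^{s,r}}\, d\tau, \quad t \in [0, T_*), \tag{A.15} \]

and the expression in the round brackets belongs to $L^1_{loc}([0, T_*))$. Therefore, the solution in fact stays in $C^{s,r}$ as long as it exists. The statement $\{ T_* < \infty \Rightarrow \int_0^{T_*} \|D(\tau)\|_{L^\infty}\, d\tau = \infty \}$ immediately follows from the statement $\{ T_* < \infty \Rightarrow \int_0^{T_*} \|\omega(\tau)\|_{L^\infty}\, d\tau = \infty \}$, which is proved in [BKM, C], and from Proposition 6.1. This completes the proof.

Appendix B

Proposition B.1. Let $n \ge 2$, $f(x) = \chi(x) \log_2 \log_2 |x|^{-1}$, where $\chi(\cdot) \in C_0^\infty(\mathbb{R}^n)$, $\chi(x) = 1$ for $|x| \le 1/4$, $\mathrm{supp}\, \chi \subset \{ |x| \le 1/2 \}$. Let $r \in (0, n - 1)$. Then $f \in B_1$ with $\delta(j) = 1/(j + 3)$, i.e.,

\[ |\Delta_j f(x)| \le C (j + 3)^{-1} (1 + 2^j |x|)^{-r}, \quad j \ge -1. \tag{B.1} \]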


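Proof. It is well known (see [T]) that a function such as $f$ with a $\log_2 \log_2 |x|^{-1}$ singularity belongs to a weighted $bmo_\rho$, where $\rho(t) = \big( \log_2 \tfrac{1}{t} \big)^{-1}$, $t \in \big( 0, \tfrac{1}{2} \big]$. This means the following inequalities are satisfied:

\[ \sup_{B,\ \ell(B) \le \frac{1}{2}} \frac{1}{|B|\, \rho(\ell(B))} \int_B |f - f_B|\, dx \le C; \tag{B.2} \]

\[ \sup_{\ell(B) = \frac{1}{2}} \int_B |f|\, dx \le C. \tag{B.3} \]

In (B.2) $B$ runs over the set of cubes with side $\ell(B) \le 1/2$. This implies via characterization of bmo,

\[ |\Delta_j f(x)| \le C (j + 3)^{-1}, \quad j \ge -1, \quad x \in \mathbb{R}^n. \tag{B.4} \]

We have to prove a stronger inequality (B.1). Since (B.1) is obvious for $j = -1$ we will assume $j \ge 0$ in the sequel. We have, because of the cancellation properties of $\varphi$:

\[
\begin{aligned}
\Delta_j f(x) &= 2^{nj} \int \varphi(2^j y) \Big\{ f(x - y) - \sum_{\ell=0}^{n-2}\ \sum_{|\alpha| = \ell} (\alpha!)^{-1}\, \partial_x^\alpha f(x)\, (-y)^\alpha \Big\}\, dy \\
&= 2^{nj} \int \varphi(2^j y) \int_0^1 ((n - 2)!)^{-1}\, (1 - \tau)^{n-2}\, \Big( \frac{d}{d\tau} \Big)^{n-1} f(x - \tau y)\, d\tau\, dy \\
&= 2^{nj} \int \varphi(2^j y)\, (n - 1) \sum_{|\beta| = n-1} (\beta!)^{-1} \int_0^1 (1 - \tau)^{n-2}\, (-y)^\beta\, \partial^\beta f(x - \tau y)\, d\tau\, dy.
\end{aligned}
\]

Changing variables $y \to 2^{-j} y$ and using Lemma 5.2, we obtain:

\[ |\Delta_j f(x)| \le C\, 2^{-(n-1)j}\, M |\nabla^{n-1} f|(x). \tag{B.5} \]

An easy computation shows

\[
M |\nabla^{n-1} f|(x) \le
\begin{cases}
C\, |x|^{-(n-1)}\, (\log_2 |x|^{-1})^{-1}, & |x| \le \tfrac{1}{2}, \\
C\, (1 + |x|)^{-n}, & |x| > \tfrac{1}{2}.
\end{cases}
\tag{B.6}
\]

We now consider 3 cases.

Case I. $|x| \le 2^{-j}$. Then (B.4) implies (B.1).

Case II. $|x| > 1/2$. Then (B.5), (B.6) yield

\[ |\Delta_j f(x)| \le C\, 2^{-(n-1)j} (1 + |x|)^{-n} \le C\, 2^{-j(n-1-r)} (1 + 2^j |x|)^{-r} \le C (j + 3)^{-1} (1 + 2^j |x|)^{-r}. \]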


Case III. $2^{-\ell} < |x| \le 2^{-\ell+1}$, $2 \le \ell \le j$. Using (B.5) and (B.6) we obtain

\[
\begin{aligned}
|\Delta_j f(x)| &\le C\, 2^{-(n-1)j}\, |x|^{-(n-1)}\, \ell^{-1} \le C\, 2^{-(n-1)(j - \ell)}\, \ell^{-1} \\
&\le C\, 2^{-(n-1-r)(j - \ell)}\, \ell^{-1}\, (1 + 2^j |x|)^{-r} \le C (j + 3)^{-1} (1 + 2^j |x|)^{-r}.
\end{aligned}
\]

This concludes the proof.

Acknowledgement. The author would like to thank Jean-Yves Chemin, Yves Meyer and Victor Yudovich for extremely useful discussions and encouragement. Partial support from the NSF and TARP is gratefully acknowledged. Part of this paper was written while the author was visiting IHES and ENS-Cachan. He is grateful to both institutions. The author thanks Cécile Gourgues for her excellent typing of the manuscript.

References

[BC] Bahouri, H., Chemin, J.-Y.: Equations de transport relatives à des champs de vecteurs non-lipschitziens et mécanique des fluides. Arch. Rational Mech. Anal. 127, 159–181 (1994)
[BKM] Beale, T., Kato, T., Majda, A.: Remarks on the breakdown of smooth solution for the 3-D Euler equation. Commun. Math. Phys. 94, 61–66 (1984)
[B1] Bony, J.-M.: Calcul symbolique et propagation des singularités pour les équations aux dérivées partielles non linéaires. Ann. de l’Ecole Norm. Sup. 14, 209–246 (1981)
[B2] Bony, J.-M.: Second microlocalization and propagation of singularities for semi-linear hyperbolic equations. In: Proceedings Taniguchi Symposium, Katata and Kyoto, 1984, New York–London: Academic Press, 1986, pp. 11–49
[C] Chemin, J.-Y.: Fluides Parfaits Incompressibles. Astérisque 230 (1995)
[GS] Gamblin, P., Saint Raymond, X.: On three-dimensional vortex patches. Bull. Soc. Math. France 123, 375–424 (1995)
[JM] Jaffard, S., Meyer, Y.: Wavelet methods for pointwise regularity and local oscillations of functions. AMS, Memoirs, 587, 1996
[M1] Meyer, Y.: Remarques sur un théorème de J.-M. Bony. Suppl. Rend. Circ. Mat. Palermo 1, 1–20 (1981)
[M2] Meyer, Y.: Wavelets and operators. Cambridge: Cambridge University Press, 1992
[M3] Meyer, Y.: Wavelets, vibrations and scalings. AMS, CRM monograph series, 9, 1998
[S] Stein, E.: Harmonic analysis: Real-variable methods, orthogonality, and oscillatory integrals. Princeton, 1993
[T] Torchinsky, A.: Real-variable methods in harmonic analysis. New York–London: Academic Press, 1986
[Y] Yudovich, V.: Uniqueness theorem for the basic nonstationary problem in the dynamics of an ideal incompressible fluid. Mathematical Research Letters 2, 27–38 (1995)
[V] Vishik, M.: Incompressible flows of an ideal fluid with vorticity in borderline spaces of Besov type. Ann. de l’Ecole Norm. Sup. 32, 769–812 (1999)

Communicated by A. Jaffe